SIAM J. MATRIX ANAL. APPL. Vol. 25, No. 4. c 2004 Society for Industrial and Applied Mathematics

CONVERGENCE OF RESTARTED KRYLOV SUBSPACES TO INVARIANT SUBSPACES

CHRISTOPHER BEATTIE, MARK EMBREE, AND JOHN ROSSI

Abstract. The performance of Krylov subspace eigenvalue algorithms for large matrices can be measured by the angle between a desired invariant subspace and the Krylov subspace. We develop general bounds for this convergence that include the effects of polynomial restarting and impose no restrictions concerning the diagonalizability of the matrix or its degree of nonnormality. Associated with a desired set of eigenvalues is a maximum reachable invariant subspace that can be developed from the given starting vector. Convergence for this distinguished subspace is bounded in terms involving a polynomial approximation problem. Elementary results from potential theory lead to convergence rate estimates and suggest restarting strategies based on optimal approximation points (e.g., Leja or Chebyshev points); exact shifts are evaluated within this framework. Computational examples illustrate the utility of these results. Origins of superlinear effects are also described.

Key words. Krylov subspace methods, Arnoldi algorithm, Lanczos algorithm, polynomial restarts, invariant subspaces, eigenvalues, pseudospectra, perturbation theory, potential theory, Zolotarev-type polynomial approximation problems

AMS subject classifications. 15A18, 15A42, 31A15, 41A25, 65F15

1. Setting. Let $A$ be an $n \times n$ complex matrix with $N \le n$ distinct eigenvalues $\{\lambda_j\}_{j=1}^N$ with corresponding eigenvectors $\{u_j\}_{j=1}^N$. (We do not label multiple eigenvalues separately and make no assertion regarding the uniqueness of the $u_j$.) Each distinct eigenvalue $\lambda_j$ has geometric multiplicity $n_j$ and algebraic multiplicity $m_j$ (so that $1 \le n_j \le m_j$ and $\sum_{j=1}^N m_j = n$). We aim to compute an invariant subspace associated with $L$ of these eigenvalues, which for brevity we call the good eigenvalues, labeled $\{\lambda_1, \lambda_2, \ldots, \lambda_L\}$. We intend to use a Krylov subspace algorithm to approximate this invariant subspace, possibly with the aid of restarts as described below. The remaining $N - L$ eigenvalues, the bad eigenvalues, are not of interest, and we wish to avoid the excessive expense involved in inadvertently calculating the subspaces associated with them.

The class of algorithms considered here draws eigenvector approximations from Krylov subspaces generated by the starting vector $v_1 \in \mathbb{C}^n$,

$K_l(A, v_1) = \mathrm{span}\{v_1, Av_1, \ldots, A^{l-1} v_1\}.$

Such algorithms, including the Arnoldi and biorthogonal Lanczos methods reviewed in section 1.1, differ in their mechanisms for generating a basis for $K_l(A, v_1)$ and selecting approximate eigenvectors from this Krylov subspace. Though these approximate eigenvectors are obvious objects of study, their convergence can be greatly complicated by eigenvalue multiplicity and defectiveness; see [21]. The bounds developed in

Received by the editors November 21, 2001; accepted for publication (in revised form) by Z. Strakoš June 9, 2003; published electronically July 14, 2004.

Department of Mathematics, Virginia Polytechnic Institute and State University, Blacksburg, VA (beattie@math.vt.edu, rossi@math.vt.edu).

Oxford University Computing Laboratory, Wolfson Building, Parks Road, Oxford OX1 3QD, UK. Current address: Department of Computational and Applied Mathematics, Rice University, 6100 Main Street MS 134, Houston, TX (embree@caam.rice.edu).
The research of this author was supported in part by UK Engineering and Physical Sciences Research Council Grant GR/M

the following sections avoid these difficulties by instead studying convergence of the Krylov subspace to an invariant subspace associated with the good eigenvalues as the dimension of the Krylov subspace is increased.

Given two subspaces $W$ and $V$ of $\mathbb{C}^n$, the extent to which $V$ approximates $W$ is measured (asymmetrically) by the containment gap (or just gap), defined as

$\delta(W, V) = \sup_{x \in W} \inf_{y \in V} \frac{\|y - x\|}{\|x\|} = \sin(\vartheta_{\max}).$

Here $\vartheta_{\max}$ is the largest canonical angle between $W$ and a closest subspace $\widehat{V}$ of $V$ having dimension equal to $\dim W$. (Throughout, $\|\cdot\|$ denotes the vector 2-norm and the matrix norm it induces.) Notice that if $\dim V < \dim W$, then $\delta(W, V) = 1$, while $\delta(W, V) = 0$ if and only if $W \subseteq V$. The gap can be expressed directly as the norm of a composition of projections: if $\Pi_W$ and $\Pi_V$ denote orthogonal projections onto $W$ and $V$, respectively, then $\delta(W, V) = \|(I - \Pi_V)\Pi_W\|$ (see, e.g., Chatelin [7, sect. 1.4]).

The objective of this paper, then, is to measure the gap between Krylov subspaces and an $m$-dimensional invariant subspace $U$ of $A$ associated with the good eigenvalues. We explore how quickly $\delta(U, K_l(A, v_1))$ can be driven to zero as $l$ is increased, reflecting the speed of convergence, and how this behavior is influenced by the distribution of eigenvalues and the nonnormality of $A$. Note that $\delta(U, K_l(A, v_1)) = 1$ when $l < m$. For $l \ge m$, our bounds ultimately take the form

(1.1) $\delta(U, K_l(A, v_1)) \ \le\ C_0\, C_1\, C_2 \min_{\phi \in P_{l-m}} \frac{\max\{|\phi(z)| : z \in \Omega_{bad}\}}{\min\{|\phi(z)| : z \in \Omega_{good}\}},$

where $P_k$ denotes the set of polynomials of degree $k$ or less, and $\Omega_{good}$ and $\Omega_{bad}$ are disjoint compact subsets of $\mathbb{C}$ containing the good and bad eigenvalues, respectively. The constant $C_0$ reflects nonnormal coupling between the good and bad invariant subspaces, while $C_2$ reflects nonnormality within those two subspaces. The constant $C_1$ principally describes the effect of starting vector bias, though it, too, is influenced by nonnormality.
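The containment gap defined above is straightforward to evaluate numerically from the identity $\delta(W, V) = \|(I - \Pi_V)\Pi_W\|$. The following sketch is our own illustration (not code from the paper; NumPy is assumed) and computes the gap from bases of the two subspaces:

```python
import numpy as np

def containment_gap(W, V):
    """Containment gap delta(W, V) = ||(I - Pi_V) Pi_W|| in the 2-norm.

    W, V: matrices whose (linearly independent) columns span the subspaces.
    When dim V < dim W the formula automatically returns 1.
    """
    QW, _ = np.linalg.qr(W)                    # orthonormal basis for W
    QV, _ = np.linalg.qr(V)                    # orthonormal basis for V
    R = QW - QV @ (QV.conj().T @ QW)           # (I - Pi_V) applied to the basis of W
    return np.linalg.norm(R, 2)                # largest singular value = sin(theta_max)
```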

In section 2 we identify the subspace $U$, which in common situations will be the entire invariant subspace of $A$ associated with the good eigenvalues, but will be smaller when $A$ is derogatory or the starting vector $v_1$ is deficient. The basic bound (1.1) is derived in section 3. Section 4 addresses the polynomial approximation problem embedded in (1.1), describing those factors that determine linear convergence rates or that lead to superlinear effects. Section 5 analyzes the constants $C_1$ and $C_2$, and section 6 provides computational examples illustrating the bounds.

Since it becomes prohibitively expensive to construct and store a good basis for $K_l(A, v_1)$ when the dimension of $A$ is large, practical algorithms typically limit the maximum dimension of the Krylov subspace to some $p \ll n$. If satisfactory estimates cannot be extracted from $K_p(A, v_1)$, then the algorithm is restarted by replacing $v_1$ with some new $v \in K_p(A, v_1)$ that is, one hopes, enriched in the component lying in the subspace $U$. Since this $v$ is chosen from the Krylov subspace, we can write $v = \psi(A) v_1$ for some polynomial $\psi$ with $\deg(\psi) < p$. Our bounds also apply to this situation, and ideas from potential theory, outlined in section 4, motivate particular choices for the polynomial $\psi$.

The results presented here complement and extend earlier convergence theory, beginning with Saad's bound on the gap between a single eigenvector and the Krylov subspace for a matrix with simple eigenvalues [32]. Jia generalized this result to invariant subspaces associated with a single eigenvalue of a defective matrix, but these bounds involve the Jordan form of $A$ and derivatives of approximating polynomials [20]. Simoncini uses pseudospectra to describe block-Arnoldi convergence for defective matrices [37]. Interpreting restarted algorithms in terms of subspace iteration, Lehoucq developed an invariant subspace convergence theory incorporating results from Watkins and Elsner [25]. Calvetti, Reichel, and Sorensen studied single eigenvector convergence for Hermitian matrices using elements of potential theory [6]. A key feature of our approach is its applicability to general invariant subspaces, which may be better conditioned than individual eigenvectors (see, e.g., [39, Chap. V]). Notably, we estimate convergence rates for defective matrices without introducing any special choice of basis and without requiring knowledge of the Jordan form or any related similarity transformation.

Finally, we note that other measures of convergence may be more appealing in certain situations. Alternatives include Ritz values [20, 24], although convergence behavior can be obscure for matrices that are defective (or nearly so). The subspace residual is computationally attractive because it does not require a priori knowledge of the good invariant subspace. This measure can be related to gap convergence [17, 38].

1.1. Algorithmic context. Suppose $V$ is an $n \times n$ unitary matrix that reduces $A$ to upper Hessenberg form; i.e., $V^* A V = H$ for some upper Hessenberg matrix $H$. For any index $1 \le l \le n$, let $H_l$ denote the $l$th principal submatrix of $H$:

$H_l = \begin{pmatrix} h_{11} & h_{12} & \cdots & h_{1l} \\ \beta_2 & h_{22} & \cdots & h_{2l} \\ & \ddots & \ddots & \vdots \\ & & \beta_l & h_{ll} \end{pmatrix} \in \mathbb{C}^{l \times l}.$

The Arnoldi method [2, 32] builds up the matrices $H$ and $V$ one column at a time starting with the unit vector $v_1 \in \mathbb{C}^n$, although the process is typically stopped well before completion, with $l \ll n$. The algorithm only accesses $A$ through matrix-vector products, making this approach attractive when $A$ is large and sparse. Different choices for $v_1$ produce distinct outcomes for $H_l$. The defining recurrence may be derived from the fundamental relation

$A V_l = V_l H_l + \beta_{l+1} v_{l+1} e_l^*,$

where $e_l$ is the $l$th column of the $l \times l$ identity matrix. The $l$th column of $H_l$ is determined so as to force $v_{l+1}$ to be orthogonal to the columns of $V_l$, and $\beta_{l+1}$ then is determined so that $\|v_{l+1}\| = 1$. Provided $H_l$ is unreduced, the columns of $V_l$ constitute an orthonormal basis for the order-$l$ Krylov subspace $K_l(A, v_1) = \mathrm{span}\{v_1, Av_1, \ldots, A^{l-1}v_1\}$. Since $V_l^* A V_l = H_l$, the matrix $H_l$ is a Ritz-Galerkin approximation of $A$ on this subspace, as described by Saad [33]. The eigenvalues of $H_l$ are called Ritz values and will, in many circumstances, be reasonable approximations to some of the eigenvalues of $A$. An eigenvector of $H_l$ associated with a given Ritz value $\theta_j$ can be used to construct an eigenvector approximation for $A$. Indeed, if $H_l y_j = \theta_j y_j$, then the Ritz vector $\hat{u}_j = V_l y_j$ yields the residual

$\|A \hat{u}_j - \theta_j \hat{u}_j\| = |\beta_{l+1}|\, |e_l^* y_j|.$

When $|\beta_{l+1}| \ll 1$, the columns of $V_l$ nearly span an invariant subspace of $A$. Small residuals more often arise from negligible trailing entries of the vector $y_j$, indicating that the most recent Krylov direction contributed negligibly to the Ritz vector $\hat{u}_j$.
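For concreteness, here is a minimal Arnoldi sketch matching the recurrence $A V_l = V_l H_l + \beta_{l+1} v_{l+1} e_l^*$ above; it is our own illustration, without the reorthogonalization or breakdown handling a production code would need:

```python
import numpy as np

def arnoldi(A, v1, l):
    """l steps of the Arnoldi recurrence A V_l = V_l H_l + beta_{l+1} v_{l+1} e_l^*.

    Returns V (n x (l+1)) with orthonormal columns spanning K_{l+1}(A, v1) and
    H ((l+1) x l) upper Hessenberg; H[:l, :l] is the Ritz-Galerkin matrix H_l.
    """
    n = len(v1)
    V = np.zeros((n, l + 1), dtype=complex)
    H = np.zeros((l + 1, l), dtype=complex)
    V[:, 0] = v1 / np.linalg.norm(v1)
    for k in range(l):
        w = A @ V[:, k]                          # one matrix-vector product
        for j in range(k + 1):                   # modified Gram-Schmidt
            H[j, k] = V[:, j].conj() @ w
            w = w - H[j, k] * V[:, j]
        H[k + 1, k] = np.linalg.norm(w)          # subdiagonal entry (a beta in the paper)
        V[:, k + 1] = w / H[k + 1, k]
    return V, H

# Ritz values and a Ritz vector residual, as in the text:
# theta, Y = np.linalg.eig(H[:l, :l]); residual_j = abs(H[l, l-1]) * abs(Y[-1, j])
```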

Biorthogonal Lanczos methods have similar characteristics despite important differences both in conception and implementation; see, e.g., [4]. In particular, different bases for $K_l(A, v_1)$ are generated, and the associated Ritz values can differ considerably from those produced by the Arnoldi algorithm, even though the projection subspace $K_l(A, v_1)$ remains the same. Our focus here avoids the complications of Ritz value convergence and remains fixed on how well a good invariant subspace $U$ is captured by $K_l(A, v_1)$, without regard to how a basis for $K_l(A, v_1)$ has been generated.

1.2. Polynomial restarts. The first $p$ steps of the Arnoldi or biorthogonal Lanczos recurrence require $p$ matrix-vector products of the form $A v_k$, plus $O(np^2)$ floating point operations for (bi)orthogonalization. For very large $n$ and very sparse $A$ (say, with a maximum number of nonzero entries per row very much smaller than $n$), the cost of orthogonalization will rapidly dominate as $p$ grows. Polynomial restarting is one general approach to alleviating this prohibitive expense. At the end of $p+1$ steps of the recurrence, one selects some best vector $v_1^+ \in K_{p+1}(A, v_1)$ and restarts the recurrence from the beginning using $v_1^+$. Different restart strategies differ essentially in how they attempt to condense the progress made in the last $p+1$ steps into the vector $v_1^+$. Since any vector in $K_{p+1}(A, v_1)$ can be represented as $\psi_p(A) v_1$ for some polynomial $\psi_p$ of degree $p$ or less, a restart of this type can be expressed as

(1.2) $v_1^+ \leftarrow \psi_p(A)\, v_1.$

If subsequent restarts occur (relabeling $v_1^+$ as $v_1^{(1)}$), then

$v_1^{(1)} \leftarrow \psi_p^{[1]}(A)\, v_1$ (first restart),
$v_1^{(2)} \leftarrow \psi_p^{[2]}(A)\, v_1^{(1)}$ (second restart),
$\quad\vdots$
$v_1^{(\nu)} \leftarrow \psi_p^{[\nu]}(A)\, v_1^{(\nu-1)}$ ($\nu$th restart).

We collect the effect of the restarts into a single aggregate polynomial of degree $\nu p$:

(1.3) $v_1^{(\nu)} \leftarrow \Psi_{\nu p}(A)\, v_1$, where $\Psi_{\nu p}(\lambda) = \prod_{k=1}^{\nu} \psi_p^{[k]}(\lambda)$

is called the filter polynomial. Evidently, the restart vectors should retain and amplify components of the good invariant subspace while damping and eventually purging components of the bad invariant subspace. One obvious way of encouraging such a trend is to choose the polynomial $\Psi_{\nu p}(\lambda)$ to be as large as possible when evaluated on the good eigenvalues while being as small as possible on the bad eigenvalues. If the bad eigenvalues are situated within a known compact set $\Omega_{bad}$ (not containing any good eigenvalues), Chebyshev polynomials associated with $\Omega_{bad}$ are often a reasonable choice. When integrated with the Arnoldi algorithm, this results in the Arnoldi-Chebyshev method [34] (cf. [18]). This Chebyshev strategy requires either a priori or adaptively generated knowledge of $\Omega_{bad}$, a drawback.
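A restart of the form (1.2) can be applied explicitly by accumulating the linear factors of $\psi_p$. The sketch below is our own illustration; the choice of shifts is left to the caller (for instance Chebyshev or Leja points for $\Omega_{bad}$, or the exact shifts discussed next), and practical implementations apply such shifts implicitly rather than as written here:

```python
import numpy as np

def polynomial_restart(A, v1, shifts):
    """Apply the restart filter psi_p(A) v1 with psi_p(z) = prod_k (z - mu_k).

    A minimal explicit sketch of (1.2); the shifts mu_k are supplied by the caller.
    """
    v = np.array(v1, dtype=complex)
    for mu in shifts:
        v = A @ v - mu * v           # one factor (A - mu I) per shift
        v /= np.linalg.norm(v)       # rescale to avoid over/underflow
    return v

# nu successive restarts accumulate into the filter Psi_{nu p}(A) v1 of (1.3):
# for _ in range(nu): v1 = polynomial_restart(A, v1, shifts_for_this_restart)
```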

Sorensen identified an alternative approach, called exact shifts, that has proved extremely successful in practice. The filter polynomial $\Psi_{\nu p}$ is automatically constructed using Ritz eigenvalue estimates. Before each new restart of the Arnoldi method, one computes the eigenvalues of $H_l$ and sorts the resulting $l = k + p$ Ritz values into two disjoint sets $S_{good}$ and $S_{bad}$. The $p$ Ritz values in the set $S_{bad}$ are used to define the restart polynomial $\psi_p(\lambda) = \prod_{j=k+1}^{k+p} (\lambda - \theta_j)$. Morgan discovered a remarkable consequence of this restart strategy: the updated Krylov subspace $K_l(A, v_1^+)$, generated by the new starting vector $v_1^+$ in (1.2) using exact shifts, satisfies

$K_l(A, v_1^+) = \mathrm{span}\{\hat{u}_1, \hat{u}_2, \ldots, \hat{u}_k, A\hat{u}_j, A^2\hat{u}_j, \ldots, A^p\hat{u}_j\}$

for each index $j = 1, 2, \ldots, k$ [27]. Thus, Sorensen's exact shifts will provide, in the stage following a restart, a subspace containing every possible Krylov subspace of dimension $p$ that could be obtained with a starting vector that was a linear combination of the good Ritz vectors (cf. [32]). Furthermore, Sorensen showed how to apply shifts implicitly, regenerating the Krylov subspace $K_l(A, v_1^+)$ with only $p$ matrix-vector products in a numerically stable way. Analogous features can be verified for the restarted biorthogonal Lanczos method using exact shifts to build polynomial filters. Such a strategy has been explored in [16, 9].

Assume now that an Arnoldi or biorthogonal Lanczos process has proceeded $l$ steps past the last of $\nu$ restarts, each of which (for the sake of simplicity) has the same order $p$. In the $j$th restart ($1 \le j \le \nu$), we use shifts $\{\mu_{jk}\}_{k=1}^p$. Define

$\Psi_{\nu p}(\lambda) = \prod_{j=1}^{\nu} \prod_{k=1}^{p} (\lambda - \mu_{jk})$

to be the aggregate restart polynomial after $\nu$ restarts. An iteration without restarts will have $p = \nu = 0$ and $\Psi_{\nu p}(\lambda) = 1$. Let $K_\tau(A, v_1^{(\nu)})$ denote the Krylov subspace of order $\tau$ generated by the starting vector $v_1^{(\nu)}$ that is obtained after $\nu$ restarts. The following basic result follows immediately from the observation that $v_1^{(\nu)} = \Psi_{\nu p}(A) v_1$.

Lemma 1.1. For all $\tau \ge 0$, $K_\tau(A, v_1^{(\nu)}) = \Psi_{\nu p}(A)\, K_\tau(A, v_1)$.

2. Reachable invariant subspaces. If the good eigenvalues are all simple, then the associated invariant subspace is uniquely determined as the span of the good eigenvectors. However, if some of these eigenvalues are multiple, there could be a variety of associated invariant subspaces. Nonetheless, single-vector Krylov eigenvalue algorithms with polynomial restarts are capable of revealing only one of the many possible invariant subspaces for any given initial vector. Before developing convergence bounds, we first characterize this distinguished invariant subspace precisely.

Let $M$ be the cyclic subspace generated by the initial starting vector $v_1$,

$M = \mathrm{span}\{v_1, Av_1, A^2 v_1, \ldots\}.$

$M$ is evidently an invariant subspace of $A$, and $s \equiv \dim(M) \le n$. Since any invariant subspace of $A$ that contains $v_1$ must also contain $A^\tau v_1$, $M$ is the smallest invariant subspace of $A$ that contains $v_1$. The $s$ vectors of the Krylov sequence $\{v_1, Av_1, \ldots, A^{s-1}v_1\}$ are linearly independent, and thus constitute a basis for $M$.

Recall that a linear transformation is nonderogatory if each eigenvalue has geometric multiplicity equal to 1; i.e., each distinct eigenvalue has precisely one eigenvector associated with it, determined up to scaling. Define $A_M$ to be the restriction of $A$ to $M$. The following result is well known; see, e.g., [1], [13, Chap. VII].

Lemma 2.1. $A_M$ is nonderogatory, and $K_\tau(A, v_1^{(\nu)}) = K_\tau(A_M, v_1^{(\nu)}) \subseteq M$.

Define $\alpha_j$ to be the ascent (or index) of the eigenvalue $\lambda_j$, i.e., the minimum positive integer $\alpha$ such that $\mathrm{Ker}(A - \lambda_j)^{\alpha} = \mathrm{Ker}(A - \lambda_j)^{\alpha+1}$. This $\alpha_j$ is the maximum dimension of the $n_j$ different Jordan blocks associated with $\lambda_j$, and $\mathrm{Ker}(A - \lambda_j)^{\alpha_j}$ then is the span of all generalized eigenvectors associated with $\lambda_j$.

The spectral projection onto each subspace $\mathrm{Ker}(A - \lambda_j)^{\alpha_j}$ can be constructed in the following coordinate-free manner; see, e.g., [23, sect. I.5.3]. For each eigenvalue $\lambda_j$, $1 \le j \le N$, let $\Gamma_j$ be some positively oriented Jordan curve in $\mathbb{C}$ containing $\lambda_j$ in its interior and all other eigenvalues in its exterior. The spectral projection is defined as

$P_j \equiv \frac{1}{2\pi i} \int_{\Gamma_j} (z - A)^{-1}\, dz.$

$P_j$ is a projection onto the span of all generalized eigenvectors associated with $\lambda_j$. In particular, $P_j v_1$ will be a generalized eigenvector associated with $\lambda_j$ and will generate a cyclic subspace $K_{\alpha_j}(A, P_j v_1) \subseteq \mathrm{Ker}(A - \lambda_j)^{\alpha_j}$.

Let $\hat{\alpha}_j$ be the minimum index $\alpha$ so that $K_{\alpha}(A, P_j v_1) = K_{\alpha+1}(A, P_j v_1)$. This $\hat{\alpha}_j$ is called the ascent with respect to $v_1$ of the eigenvalue $\lambda_j$. Notice that $1 \le \hat{\alpha}_j \le \alpha_j$ and $K_{\hat{\alpha}_j}(A, P_j v_1)$ is the smallest invariant subspace of $A$ that contains $P_j v_1$. Furthermore, $P_j v_1$ is a generalized eigenvector of grade $\hat{\alpha}_j$ associated with $\lambda_j$, and $\hat{\alpha}_j < \alpha_j$ only if $v_1$ is deficient in all generalized eigenvectors of maximal grade $\alpha_j$ associated with $\lambda_j$.

Define spectral projections $P_{good}$ and $P_{bad}$, having ranges that are the maximal invariant subspaces associated with the good and bad eigenvalues, respectively, as

$P_{good} = \sum_{j=1}^{L} P_j$ and $P_{bad} = \sum_{j=L+1}^{N} P_j.$

Note that $P_{good} + P_{bad} = I$. The following result, Lemma 2.2, characterizes $M$. The first statement, included for comparison, is well known; the second is also understood, though we are unaware of its explicit appearance in the literature. Related issues are discussed in [1], [13, Chap. VII].

Lemma 2.2. $\mathbb{C}^n = \bigoplus_{j=1}^{N} \mathrm{Ker}(A - \lambda_j)^{\alpha_j}$ with $\sum_{j=1}^{N} \alpha_j \le n$, and $M = \bigoplus_{j=1}^{N} K_{\hat{\alpha}_j}(A, P_j v_1)$ with $\sum_{j=1}^{N} \hat{\alpha}_j = \dim M$.

Proof. Since $\sum_{j=1}^{N} P_j = I$, any $x \in \mathbb{C}^n$ can be written as $x = Ix = \sum_{j=1}^{N} P_j x$, which shows that $\mathbb{C}^n \subseteq \sum_{j=1}^{N} \mathrm{Ker}(A - \lambda_j)^{\alpha_j}$. The reverse inclusion is trivial. For the second statement, use $\sum_{j=1}^{N} P_j = I$ to get, for any integer $\tau > 0$,

$v_1 = \sum_{j=1}^{N} P_j v_1, \quad A v_1 = \sum_{j=1}^{N} A P_j v_1, \quad \ldots, \quad A^{\tau} v_1 = \sum_{j=1}^{N} A^{\tau} P_j v_1.$

Thus, for each integer $\tau > 0$, $K_{\tau}(A, v_1) \subseteq \sum_{j=1}^{N} K_{\hat{\alpha}_j}(A, P_j v_1)$, and, in particular, for $\tau$ sufficiently large this yields $M \subseteq \sum_{j=1}^{N} K_{\hat{\alpha}_j}(A, P_j v_1)$. To show the reverse inclusion, note that for every $j = 1, \ldots, N$ there is a polynomial $p_j$ such that $p_j(A) = P_j$. (This polynomial interpolates at eigenvalues: $p_j(\lambda_j) = 1$, $p_j$ has $\alpha_j - 1$ zero derivatives at $\lambda_j$, and $p_j(\lambda_k) = 0$ for $\lambda_k \ne \lambda_j$; see, e.g., [19, sect. 6.1].) Thus for any $x \in \sum_{j=1}^{N} K_{\hat{\alpha}_j}(A, P_j v_1)$, one can write

$x = \sum_{j=1}^{N} g_j(A) P_j v_1 = \sum_{j=1}^{N} g_j(A) p_j(A) v_1 \in M$

for polynomials $g_j$ with degree not exceeding $\hat{\alpha}_j - 1$. Thus $\sum_{j=1}^{N} K_{\hat{\alpha}_j}(A, P_j v_1) \subseteq M$, and so $M = \sum_{j=1}^{N} K_{\hat{\alpha}_j}(A, P_j v_1)$.
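The spectral projector $P_j$ defined above can be realized numerically by discretizing the contour integral. In the sketch below, which is our own illustration, the contour $\Gamma_j$ is assumed to be a circle enclosing only $\lambda_j$, and the trapezoidal rule is our choice of quadrature:

```python
import numpy as np

def spectral_projector(A, center, radius, nodes=200):
    """P_j = (1 / 2 pi i) * integral over Gamma_j of (z - A)^{-1} dz.

    Gamma_j is the circle |z - center| = radius, assumed to enclose lambda_j and no
    other eigenvalue; the periodic trapezoidal rule converges geometrically here.
    """
    n = A.shape[0]
    I = np.eye(n)
    P = np.zeros((n, n), dtype=complex)
    for t in 2 * np.pi * (np.arange(nodes) + 0.5) / nodes:
        z = center + radius * np.exp(1j * t)
        dz = 1j * radius * np.exp(1j * t)             # dz/dt along the circle
        P += np.linalg.solve(z * I - A, I) * dz
    return P / (1j * nodes)                            # (2 pi / nodes) / (2 pi i) factor
```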

Let $X_{good}$ and $X_{bad}$ be the invariant subspaces of $A$ associated with the good and bad eigenvalues, respectively. Then define

$U_{good} \equiv M \cap X_{good}$ and $U_{bad} \equiv M \cap X_{bad}.$

The following lemma develops a representation for $U_{good}$ and $U_{bad}$; it shows that $U_{good}$ is the maximum reachable invariant subspace associated with the good eigenvalues that can be obtained from a Krylov subspace algorithm started with $v_1$. "Maximum reachable invariant subspace" means that any invariant subspace $U$ associated with the good eigenvalues and strictly larger than $U_{good}$ is unreachable: the angle between $U$ and any computable subspace generated from $v_1$ is bounded away from zero independent of $l$, $p$, $\nu$, and the choice of filter shifts $\{\mu_{jk}\}$.

Lemma 2.3. $U_{good} = \bigoplus_{j=1}^{L} K_{\hat{\alpha}_j}(A, P_j v_1)$, with $\dim U_{good} = \sum_{j=1}^{L} \hat{\alpha}_j \equiv m$, and $U_{bad} = \bigoplus_{j=L+1}^{N} K_{\hat{\alpha}_j}(A, P_j v_1)$, with $\dim U_{bad} = \sum_{j=L+1}^{N} \hat{\alpha}_j = s - m$. Furthermore, for any subspace $U$ of $X_{good}$ that properly contains $U_{good}$, i.e., $U_{good} \subsetneq U \subseteq X_{good}$, convergence in gap cannot occur: for all integers $l \ge 1$,

$\delta(U, K_l(A, v_1^{(\nu)})) \ \ge\ \frac{1}{\|P_{good}\|} \ >\ 0.$

Proof. Since $K_{\hat{\alpha}_j}(A, P_j v_1) \subseteq \mathrm{Ker}(A - \lambda_j)^{\alpha_j}$, Lemma 2.2 leads to $M \cap X_{good} = \bigoplus_{j=1}^{L} K_{\hat{\alpha}_j}(A, P_j v_1)$. Furthermore, $\dim K_{\hat{\alpha}_j}(A, P_j v_1) = \hat{\alpha}_j$ implies that $\dim U_{good} = m$ as defined above. The analogous results for $U_{bad}$ follow similarly.

Note that $X_{bad} = \bigoplus_{j=L+1}^{N} \mathrm{Ker}(A - \lambda_j)^{\alpha_j}$, so, for all $l \ge 0$, $K_l(A, v_1^{(\nu)}) \subseteq M \subseteq U_{good} \oplus X_{bad}$. Thus any $v \in K_l(A, v_1^{(\nu)})$ can be decomposed as $v = w_1 + w_2$ for some $w_1 \in U_{good}$ and $w_2 \in X_{bad}$. When $U_{good}$ is a proper subspace of $U$, there exists an $x \in U$ so that $x \perp U_{good}$ and $\|x\| = 1$. Note that $\|x - w_1\| \ge \|x\| = 1$, and that $x - w_1 \in X_{good}$. Now,

$\min_{v \in K_l(A, v_1^{(\nu)})} \|v - x\| \ \ge\ \min_{w_1 \in U_{good},\, w_2 \in X_{bad}} \|w_1 + w_2 - x\| \ =\ \min_{w_1 \in U_{good},\, w_2 \in X_{bad}} \|w_2 - (x - w_1)\|$

$\ \ge\ \min_{y \in X_{good},\, \|y\| \ge 1,\ w_2 \in X_{bad}} \|w_2 - y\| \ \ge\ \min_{y \in X_{good},\, \|y\| \ge 1,\ w_2 \in X_{bad}} \frac{\|P_{good}(w_2 - y)\|}{\|P_{good}\|} \ =\ \min_{\|y\| \ge 1} \frac{\|y\|}{\|P_{good}\|} \ \ge\ \frac{1}{\|P_{good}\|}.$

Thus,

$\delta(U, K_l(A, v_1^{(\nu)})) = \max_{x \in U} \min_{v \in K_l(A, v_1^{(\nu)})} \frac{\|v - x\|}{\|x\|} \ \ge\ \frac{1}{\|P_{good}\|}.$

This means that we have no hope of capturing any invariant subspace that contains a (generalized) eigenspace associated with multiple Jordan blocks, at least when using

a single-vector iteration in exact arithmetic. On the other hand, convergence can occur to the good invariant subspace $U_{good}$, with a rate that depends on properties of $A$, $v_1$, and the choice of filter shifts $\{\mu_{jk}\}$, as we shall see.

Almost every vector in an invariant subspace is a generalized eigenvector of maximal grade, and so almost every starting vector will capture maximally defective Jordan blocks. While easily acknowledged, this fact can have perplexing consequences for the casual Arnoldi or biorthogonal Lanczos user, since eigenvectors of other Jordan blocks may be unexpectedly washed out. Suppose $A$ is defined as

$A = \begin{pmatrix} 1 & & & & \\ 1 & 1 & & & \\ & & 1 & & \\ & & 1 & 1 & \\ & & & 1 & 1 \end{pmatrix}.$

$A$ is in Jordan canonical form with the single eigenvalue $\lambda = 1$. Let $e_j$ denote the $j$th column of the $5 \times 5$ identity matrix. Then $e_2$ and $e_5$ are eigenvectors of $A$, $e_1$ and $e_4$ are generalized eigenvectors of grade 1 associated with the $2 \times 2$ and $3 \times 3$ Jordan blocks, and $e_3$ is a generalized eigenvector of grade 2 associated with the $3 \times 3$ block. For arbitrary $\beta \in \mathbb{C}$, the vector $v_1 = [\,1\ \ \beta\ \ 1\ \ 1\ \ 1\,]^T$ generates a cyclic subspace spanned by the first three vectors in the Krylov sequence: $v_1$, $Av_1$, and $A^2 v_1$. By choosing $\beta$ to be large, we can give the starting vector $v_1$ an arbitrarily large component in the direction of $e_2$, the eigenvector associated with the $2 \times 2$ Jordan block. Defining

$M = [\,v_1,\ A v_1,\ A^2 v_1\,]$ and $\widehat{H} = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & -3 \\ 0 & 1 & 3 \end{pmatrix},$

a simple calculation reveals $AM = M\widehat{H}$. The Jordan form of $\widehat{H}$ is easy to calculate as follows:

(2.1) $R^{-1} \widehat{H} R = \begin{pmatrix} 1 & & \\ 1 & 1 & \\ & 1 & 1 \end{pmatrix}$, where one may take $R = \begin{pmatrix} 1 & -1 & 1 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{pmatrix}.$

The cyclic subspace generated by the single vector $v_1$ has captured a three-dimensional invariant subspace, associated with the maximally defective $3 \times 3$ Jordan block. But this subspace is not the expected $\mathrm{span}\{e_3, e_4, e_5\}$. Using the change of basis defined by $R$ in (2.1), one may calculate $A(MR) = (MR)(R^{-1}\widehat{H}R)$, which is

$A \begin{pmatrix} 1 & 0 & 0 \\ \beta & 1 & 0 \\ 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ \beta & 1 & 0 \\ 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & & \\ 1 & 1 & \\ & 1 & 1 \end{pmatrix}.$

Note that $e_5$ alone is revealed as the eigenvector associated with the eigenvalue 1; $e_2$ has been washed out in spite of $v_1$ having an arbitrarily large component in that direction. Indeed, the eigenvector $e_2$ (and so any subspace containing it) is unreachable from any starting vector $v_1$ for which $e_3^* v_1 \ne 0$. In this example, $v_1$ itself emerges as a generalized eigenvector of grade 2. Note that every vector $v$ in $\mathbb{C}^5$ with $e_3^* v \ne 0$ is a generalized eigenvector of grade 2 associated with the eigenvalue 1.
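A short numerical confirmation of this example (our own check; $\beta = 100$ is an arbitrary illustrative value, and containment_gap refers to the sketch in section 1):

```python
import numpy as np

# Numerical check of the 5 x 5 example above.
A = np.array([[1., 0, 0, 0, 0],
              [1., 1, 0, 0, 0],
              [0., 0, 1, 0, 0],
              [0., 0, 1, 1, 0],
              [0., 0, 0, 1, 1]])
beta = 100.0
v1 = np.array([1.0, beta, 1, 1, 1])

K = np.column_stack([np.linalg.matrix_power(A, j) @ v1 for j in range(5)])
print(np.linalg.matrix_rank(K))          # 3: the cyclic subspace M is 3-dimensional

# e5 lies in M (gap ~ 0) while e2 does not (gap > 0), despite the dominant
# e2 component of v1.
e2, e5 = np.eye(5)[:, [1]], np.eye(5)[:, [4]]
M = K[:, :3]
print(containment_gap(e5, M), containment_gap(e2, M))
```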

We close this section with a computational example that both confirms the gap stagnation lower bound for derogatory matrices given in Lemma 2.3 and illustrates other convergence properties explored in future sections. Consider two matrices $A_1$ and $A_2$, each of dimension $n = 150$ with eigenvalues spaced uniformly in the interval $[0, 1]$. In both cases, all the eigenvalues are simple except for the single good eigenvalue $\lambda = 1$, which has algebraic multiplicity 5. In the first case, the geometric multiplicity also equals 5, so the matrix is diagonalizable but derogatory. In the second case, there is only one eigenvector associated with $\lambda = 1$, so it is defective but not derogatory. Both matrices are constructed so that $\|P_{good}\|$ takes the same (large) value.

[Fig. 2.1. Plot of $\delta(X_{good}, K_l(A, v_1))$ against the Krylov subspace dimension $l$ for the diagonalizable-but-derogatory and defective-but-not-derogatory cases, with the stagnation level $1/\|P_{good}\|$ marked. The Krylov subspace can never capture $X_{good}$ when this subspace is associated with a derogatory eigenvalue; convergence is possible, however, when the associated eigenvalues are defective but not derogatory, as described by Lemma 2.3.]

Figure 2.1 illustrates the gap convergence of the Krylov subspace to the invariant subspace $X_{good}$ associated with $\lambda = 1$. The starting vector $v_1$ has $1/\sqrt{n}$ in each component; no restarting is used here. Convergence cannot begin until the fifth iteration, when the Krylov subspace dimension matches the dimension of $X_{good}$. This initial period of stagnation is followed by a sublinear phase of convergence leading to a second stagnation period. This is the end of the story for the derogatory case, but for the nonderogatory case the second stagnation period is transient and the convergence rate eventually settles toward a nearly linear rate. In fact, this rate improves slightly over the final iterations shown here, yielding so-called superlinear convergence, the subject of section 4.3. These convergence phases resemble those observed for the GMRES iteration, as described by Nevanlinna [28].
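The construction of $A_1$ and $A_2$ is not fully specified above, so the following sketch only aims to reproduce the qualitative behavior of Figure 2.1; the block-triangular form and the coupling block $B$ that inflates $\|P_{good}\|$ are our own assumptions, and arnoldi and containment_gap refer to the earlier sketches:

```python
import numpy as np

n, mult = 150, 5
bad = np.linspace(0, 1, n - mult + 1)[:-1]     # simple bad eigenvalues in [0, 1)
B = 10.0 * np.ones((mult, n - mult))           # ad hoc coupling -> large ||P_good||

def build(jordan):
    G = np.eye(mult)                            # good block: lambda = 1, multiplicity 5
    if jordan:                                  # defective, nonderogatory case
        G += np.diag(np.ones(mult - 1), -1)
    A = np.zeros((n, n))
    A[:mult, :mult] = G
    A[:mult, mult:] = B
    A[mult:, mult:] = np.diag(bad)
    return A

v1 = np.ones(n) / np.sqrt(n)
Xgood = np.eye(n)[:, :mult]                     # invariant subspace for lambda = 1
for jordan in (False, True):
    A = build(jordan)
    V, _ = arnoldi(A, v1, 30)
    gaps = [containment_gap(Xgood, V[:, :l]) for l in range(1, 31)]
    print("defective" if jordan else "derogatory", np.round(gaps, 3))
```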

3. Basic estimates. Since all reachable subspaces are contained in $M$ and $A_M$ is nonderogatory, henceforth we assume without loss of generality that $A$ itself is nonderogatory, so that $n = \dim M$, and $v_1$ is not deficient in any generalized eigenvector of maximal grade. To summarize the current situation: $A$ is an $n \times n$ matrix with $N \le n$ distinct eigenvalues $\{\lambda_j\}_{j=1}^N$, each having geometric multiplicity 1 and algebraic multiplicity $m_j$, so that $\sum_{j=1}^N m_j = n$. We seek $L$ ($1 \le L < N$) of these eigenvalues $\{\lambda_1, \lambda_2, \ldots, \lambda_L\}$ (the good eigenvalues) together with the corresponding (maximal) invariant subspace $U_{good}$ of dimension $m = \sum_{j=1}^L m_j$, which is now the net algebraic multiplicity of the good eigenvalues since $A$ is nonderogatory.

We begin by establishing two lemmas that are used to develop a bound for the gap in terms of a polynomial approximation problem in the subsequent theorems.

Lemma 3.1. Given subspaces $U, V \subseteq \mathbb{C}^n$, suppose $\hat{u} \in U$ (with $\|\hat{u}\| = 1$) and $\hat{v} \in V$ satisfy

$\delta(U, V) = \max_{u \in U} \min_{v \in V} \frac{\|u - v\|}{\|u\|} = \|\hat{u} - \hat{v}\|.$

Then $\hat{u} - \hat{v} \perp V$ and $\hat{u} - \hat{v} - \delta(U, V)^2\, \hat{u} \perp U$.

Proof. The first assertion is a fundamental property of least squares approximation. To show the second, consider an arbitrary unit vector $u \in U$ and take $\epsilon > 0$. Letting $\Pi_V$ denote the orthogonal projection onto $V$, the optimality of $\hat{u}$ and $\hat{v}$ implies

$\|\hat{u} - \hat{v}\|^2 \ \ge\ \frac{\|(I - \Pi_V)(\hat{u} + \epsilon u)\|^2}{\|\hat{u} + \epsilon u\|^2}.$

Expanding this inequality, noting $\hat{v} = \Pi_V \hat{u}$, and using the first assertion gives

$\delta(U, V)^2 \left(1 + 2\epsilon\, \mathrm{Re}(\hat{u}^* u) + \epsilon^2\right) \ \ge\ \delta(U, V)^2 + 2\epsilon\, \mathrm{Re}\big((\hat{u} - \hat{v})^* u\big) + \epsilon^2 \|(I - \Pi_V) u\|^2.$

Collecting terms quadratic in $\epsilon$ on the left-hand side,

$\epsilon^2 \left(\delta(U, V)^2 - \|(I - \Pi_V) u\|^2\right) \ \ge\ 2\epsilon\, \mathrm{Re}\big((\hat{u} - \hat{v} - \delta(U, V)^2 \hat{u})^* u\big).$

Note that the left-hand side must be nonnegative. Repeating the above argument with $u$ multiplied by a complex scalar of unit modulus, we can replace the right-hand side with $2\epsilon\, |(\hat{u} - \hat{v} - \delta(U, V)^2 \hat{u})^* u|$. Thus for any unit vector $u \in U$,

$\epsilon \left(\delta(U, V)^2 - \|(I - \Pi_V) u\|^2\right) \ \ge\ 2\, |(\hat{u} - \hat{v} - \delta(U, V)^2 \hat{u})^* u| \ \ge\ 0.$

Taking $\epsilon \to 0$, we conclude that $\hat{u} - \hat{v} - \delta(U, V)^2 \hat{u}$ is orthogonal to every $u \in U$.

As the gap between subspaces closes ($\delta(U, V) \to 0$), $\hat{u} - \hat{v}$ becomes almost orthogonal to $U$ in the sense that the projection of $\hat{u} - \hat{v}$ onto $U$ has norm $\delta(U, V)^2$.
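A quick numerical check of Lemma 3.1 (our own illustration with random subspaces; it uses the fact, not spelled out in the text, that the optimal pair $(\hat{u}, \hat{v})$ can be read off from the leading singular triplet of $(I - \Pi_V)Q_U$, where $Q_U$ is an orthonormal basis for $U$):

```python
import numpy as np

rng = np.random.default_rng(0)
n, dimU, dimV = 12, 3, 5
QU, _ = np.linalg.qr(rng.standard_normal((n, dimU)))
QV, _ = np.linalg.qr(rng.standard_normal((n, dimV)))

B = QU - QV @ (QV.T @ QU)             # (I - Pi_V) restricted to a basis of U
_, s, Yt = np.linalg.svd(B)
delta = s[0]                           # containment gap delta(U, V)
u_hat = QU @ Yt[0]                     # maximizing unit vector in U
v_hat = QV @ (QV.T @ u_hat)            # its best approximation from V

# u_hat - v_hat is orthogonal to V, and its projection onto U is delta^2 * u_hat:
print(np.linalg.norm(QV.T @ (u_hat - v_hat)))                             # ~ 0
print(np.linalg.norm(QU @ (QU.T @ (u_hat - v_hat)) - delta**2 * u_hat))   # ~ 0
```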

Lemma 3.2. Let $P_{m-1}$ denote the space of polynomials of degree $m-1$ or less. The mapping $\imath: P_{m-1} \to U_{good}$ defined by

(3.1) $\imath(\psi) = \psi(A) P_{good} v_1$

is an isomorphism between $P_{m-1}$ and $U_{good}$. Furthermore, there exist positive constants $c_1$ and $c_2$ so that

(3.2) $c_1 \|\psi\|_{P_{m-1}} \ \le\ \|\psi(A) P_{good} v_1\| \ \le\ c_2 \|\psi\|_{P_{m-1}},$

uniformly for all $\psi \in P_{m-1}$, for any fixed norm $\|\cdot\|_{P_{m-1}}$ defined on the space $P_{m-1}$.

Proof. $\imath$ is clearly linear. To see that $\imath$ maps $P_{m-1}$ onto $U_{good}$, observe that for any given $y \in U_{good}$, there exist polynomials $\{g_j(\lambda)\}_{j=1}^L$ with $\deg(g_j) \le m_j - 1$ such that

$y = \sum_{j=1}^{L} g_j(A) P_j v_1.$

The $L$ polynomials $\{g_j\}_{j=1}^L$ provide $L$ separate slices of a single polynomial that can be recovered by (generalized) Hermite interpolation. Let $\psi$ be a polynomial that interpolates $g_j$ and its derivatives at $\lambda_j$:

$\psi^{(k)}(\lambda_j) = g_j^{(k)}(\lambda_j)$

for $k = 0, 1, \ldots, m_j - 1$ and $j = 1, 2, \ldots, L$. Theorem VIII.3.16 of [11] leads us first to observe that $\psi(A) P_j = g_j(A) P_j$ for each $j = 1, \ldots, L$. Then, since $\deg(\psi) \le \sum_{j=1}^{L} m_j - 1 = m - 1$, we have from (3.1) that

$y = \sum_{j=1}^{L} \psi(A) P_j v_1 = \psi(A) P_{good} v_1 = \imath(\psi).$

Since $\dim(P_{m-1}) = \dim(U_{good})$, $\mathrm{nullity}(\imath) = 0$ and $\imath$ is bijective from $P_{m-1}$ to $U_{good}$. The last statement is an immediate consequence of the fact that linear bijections between finite-dimensional spaces are bounded linear transformations with bounded inverses.

Theorem 3.3. Suppose that $A$ and $v_1$ satisfy the assumptions of this section, and that none of the filter shifts $\{\mu_{jk}\}$ coincides with any of the good eigenvalues $\{\lambda_j\}_{j=1}^L$. For all indices $l \ge m$, the gap between the good invariant subspace $U_{good}$ and the Krylov subspace of order $l$, $K_l(A, v_1^{(\nu)})$, generated from the $\nu$-fold restarted vector $v_1^{(\nu)}$, satisfies

$\delta(U_{good}, K_l(A, v_1^{(\nu)})) \ \le\ C_0 \max_{\psi \in P_{m-1}} \min_{\phi \in P_{l-m}} \frac{\|\phi(A)\psi(A)\Psi_{\nu p}(A) P_{bad} v_1\|}{\|\phi(A)\psi(A)\Psi_{\nu p}(A) P_{good} v_1\|},$

where $C_0 \equiv 1$ if $U_{good} \perp U_{bad}$ and $C_0 \equiv \sqrt{2}$ otherwise.

Proof. First, suppose $U_{good} \perp U_{bad}$. This implies that $P_{good}$ and $P_{bad}$ are orthogonal projections, that $U_{good}$ is an invariant subspace for both $\Psi_{\nu p}(A)$ and $[\Psi_{\nu p}(A)]^*$, and, as we will see, that $\delta(U_{good}, K_l(A, v_1^{(\nu)})) < 1$. Indeed, suppose instead that $\delta(U_{good}, K_l(A, v_1^{(\nu)})) = 1$. Then there is a vector $\hat{u} \in U_{good}$ with $\|\hat{u}\| = 1$ such that $\hat{u} \perp K_l(A, v_1^{(\nu)})$. Define $\hat{y} \equiv [\Psi_{\nu p}(A)]^* \hat{u} \in U_{good}$, and note that by Lemma 3.2 there exists a polynomial $\hat{\psi} \in P_{m-1}$ such that $\hat{y} = \hat{\psi}(A) P_{good} v_1$. Now, for each $j = 1, 2, \ldots, l$, we have

$0 = \langle \hat{u}, A^{j-1} v_1^{(\nu)} \rangle = \langle \hat{u}, A^{j-1} \Psi_{\nu p}(A) v_1 \rangle = \langle \hat{y}, A^{j-1} P_{good} v_1 \rangle = \langle \hat{\psi}(A) P_{good} v_1,\ A^{j-1} P_{good} v_1 \rangle.$

Since $l \ge m$, this implies first that $\hat{\psi}(A) P_{good} v_1 = 0$ and then $\hat{u} = 0$. (Recall that $[\Psi_{\nu p}(A)]^*$ is bijective on $U_{good}$ since $\Psi_{\nu p}$ has no roots in common with the good eigenvalues.) But $\hat{u}$ was given to be a unit vector, so it must be that $\delta(U_{good}, K_l(A, v_1^{(\nu)})) < 1$.

There are optimal vectors $\hat{v} \in K_l(A, v_1^{(\nu)})$ and $\hat{x} \in U_{good}$ with $\|\hat{x}\| = 1$ so that

(3.3) $\delta(U_{good}, K_l(A, v_1^{(\nu)})) = \max_{x \in U_{good}} \min_{v \in K_l(A, v_1^{(\nu)})} \frac{\|v - x\|}{\|x\|} = \|\hat{v} - \hat{x}\|.$

Since $\delta(U_{good}, K_l(A, v_1^{(\nu)})) < 1$, it must be that $\hat{v} \ne 0$. Furthermore, optimality of $\hat{v}$ means $\hat{v} - \hat{x} \perp K_l(A, v_1^{(\nu)})$ (viz., Lemma 3.1) and, in particular, $\hat{v}^*(\hat{v} - \hat{x}) = 0$. So $\hat{v} \ne 0$ implies $\hat{v} \notin U_{bad}$. There is a polynomial $\pi_{l-1} \in P_{l-1}$ such that $\hat{v} = \pi_{l-1}(A) v_1^{(\nu)} = \pi_{l-1}(A) \Psi_{\nu p}(A) v_1$. Define $Q \equiv U_{good} \cap \mathrm{Ker}(\pi_{l-1}(A))$ and let $\tilde{q}$ be the minimum (monic) annihilating polynomial for $Q$.^1 Evidently, $\pi_{l-1}$ must contain $\tilde{q}$ as a factor.

^1 That is, $\tilde{q}$ is the minimum degree monic polynomial such that $\tilde{q}(A) r = 0$ for all $r \in Q$.

Since $\hat{v} \notin U_{bad}$, $\pi_{l-1}$ cannot be an annihilating polynomial for $U_{good}$, so $Q \ne U_{good}$ and $\deg(\tilde{q}) \le m - 1$. One may factor $\pi_{l-1}$ as the product of a polynomial $\phi$ of degree $l - m$ and a polynomial $q$ of degree $m - 1$ containing $\tilde{q}$ as a factor,

$\pi_{l-1}(\lambda) = \phi(\lambda)\, q(\lambda).$

Observing that $U_{good}$ is invariant for both $\phi(A)$ and $\phi(A)^*$, we may decompose $\hat{x}$ as $\hat{x} = \phi(A)\hat{y} + \hat{n}$ for some $\hat{y} \in U_{good}$ and some $\hat{n} \in \mathrm{Ker}(\phi(A)^*) \cap U_{good}$. Notice that

$\hat{v}^* \phi(A)\hat{y} = \hat{v}^* \hat{x} = \hat{v}^* \hat{v} > 0,$

so $\phi(A)\hat{y} \ne 0$. However, we will see that it must happen that $\hat{n} = 0$. Indeed, Lemma 3.1 shows that if $z \in U_{good}$ is orthogonal to $\hat{x}$, $\hat{x}^* z = 0$, then $\hat{v}^* z = 0$ as well. In particular, for

$z = \|\hat{n}\|^2\, \phi(A)\hat{y} - \|\phi(A)\hat{y}\|^2\, \hat{n}$

we have $\hat{x}^* z = 0$. Since $\mathrm{Ker}(\phi(A)^*) = \mathrm{Ran}(\phi(A))^{\perp}$ implies $\hat{v}^* \hat{n} = 0$, we have

$0 = \hat{v}^* z = \|\hat{n}\|^2\, \hat{v}^* \phi(A)\hat{y}.$

We have already seen that $\hat{v}^* \phi(A)\hat{y} > 0$, and so $\hat{n} = 0$. Thus we can safely exclude from the maximization in (3.3) all $x \in U_{good}$ except for those vectors having the special form $x = \phi(A) y$ for $y \in U_{good}$ and $\phi$ as defined above.

We can now begin our process of bounding the gap. Note that

(3.4) $\delta(U_{good}, K_l(A, v_1^{(\nu)})) = \max_{x \in U_{good}} \min_{v \in K_l(A, v_1^{(\nu)})} \frac{\|v - x\|}{\|x\|} = \max_{x \in U_{good}} \min_{\phi \in P_{l-m},\ q \in P_{m-1}} \frac{\|\Psi_{\nu p}(A)\phi(A)q(A)v_1 - x\|}{\|x\|} = \max_{y \in U_{good}} \min_{\phi \in P_{l-m},\ q \in P_{m-1}} \frac{\|\Psi_{\nu p}(A)\phi(A)[q(A)v_1 - y]\|}{\|\Psi_{\nu p}(A)\phi(A)y\|},$

where we are able to justify the substitution $x = \Psi_{\nu p}(A)\phi(A)y$ since $\Psi_{\nu p}(A)$ is an invertible map of $U_{good}$ to itself. Now, by Lemma 3.2, $y \in U_{good}$ can be represented as $y = \psi(A) P_{good} v_1$ for some $\psi \in P_{m-1}$. Since $I = P_{bad} + P_{good}$, one finds $\psi(A) v_1 - y = \psi(A) P_{bad} v_1$. Continuing with (3.4), assign $q \equiv \psi \in P_{m-1}$. Then

$\delta(U_{good}, K_l(A, v_1^{(\nu)})) \ \le\ \max_{\substack{y \in U_{good} \\ (y = \psi(A)P_{good}v_1)}} \min_{\phi \in P_{l-m}} \frac{\|\Psi_{\nu p}(A)\phi(A)[\psi(A)v_1 - y]\|}{\|\Psi_{\nu p}(A)\phi(A)y\|} = \max_{\psi \in P_{m-1}} \min_{\phi \in P_{l-m}} \frac{\|\Psi_{\nu p}(A)\phi(A)\psi(A)P_{bad}v_1\|}{\|\Psi_{\nu p}(A)\phi(A)\psi(A)P_{good}v_1\|},$

as required, concluding the proof when $U_{good} \perp U_{bad}$.

In case $U_{good}$ and $U_{bad}$ are not orthogonal subspaces, we introduce a new inner product on $\mathbb{C}^n$ with respect to which they are orthogonal. For any $u, v \in \mathbb{C}^n$, define

$\langle u, v \rangle_* \ \equiv\ \langle P_{good} u, P_{good} v \rangle + \langle P_{bad} u, P_{bad} v \rangle,$

and define the gap with respect to the new norm $\|\cdot\|_* = \sqrt{\langle \cdot, \cdot \rangle_*}$ to be

$\delta_*(W, V) = \sup_{x \in W} \inf_{y \in V} \frac{\|y - x\|_*}{\|x\|_*}.$

Notice that for any vector $w \in \mathbb{C}^n$,

$\|w\|^2 = \|P_{good} w + P_{bad} w\|^2 \ \le\ 2\left(\|P_{good} w\|^2 + \|P_{bad} w\|^2\right) = 2\|w\|_*^2,$

$\|P_{good} w\|_* = \|P_{good} w\|$, and $\|P_{bad} w\|_* = \|P_{bad} w\|$. In particular, for any $x \in U_{good}$ and $y \in \mathbb{C}^n$ these relationships directly imply

$\frac{\|y - x\|}{\|x\|} \ \le\ \sqrt{2}\, \frac{\|y - x\|_*}{\|x\|_*},$

and so $\delta(U_{good}, K_l(A, v_1^{(\nu)})) \le \sqrt{2}\, \delta_*(U_{good}, K_l(A, v_1^{(\nu)}))$. Since $U_{good}$ and $U_{bad}$ are orthogonal in this new inner product, we can apply the previous argument to conclude^2

$\delta(U_{good}, K_l(A, v_1^{(\nu)})) \ \le\ \sqrt{2} \max_{\psi \in P_{m-1}} \min_{\phi \in P_{l-m}} \frac{\|\phi(A)\psi(A)\Psi_{\nu p}(A) P_{bad} v_1\|_*}{\|\phi(A)\psi(A)\Psi_{\nu p}(A) P_{good} v_1\|_*} = \sqrt{2} \max_{\psi \in P_{m-1}} \min_{\phi \in P_{l-m}} \frac{\|\phi(A)\psi(A)\Psi_{\nu p}(A) P_{bad} v_1\|}{\|\phi(A)\psi(A)\Psi_{\nu p}(A) P_{good} v_1\|}.$

^2 A more precise value of $C_0$, lying between 1 and $\sqrt{2}$ and depending on $\|I - 2P_{good}\|$, can be found; however, the marginal improvement in the final bound would not appear to merit the substantial complexity added.

If $N$ is a square matrix with an invariant subspace $V$, define

$\|N\|_V \ \equiv\ \max_{v \in V} \frac{\|N v\|}{\|v\|} \ =\ \|N \Pi_V\|,$

where $\Pi_V$ here denotes the orthogonal projection onto $V$.

Theorem 3.4. Suppose $A$, $v_1$, and the shifts $\{\mu_{jk}\}$ satisfy the conditions of Theorem 3.3. Then for $l \ge m$,

$\delta(U_{good}, K_l(A, v_1^{(\nu)})) \ \le\ C_0\, C_1 \min_{\phi \in P_{l-m}} \left\| [\phi(A)\Psi_{\nu p}(A)]^{-1} \right\|_{U_{good}} \left\| \phi(A)\Psi_{\nu p}(A) \right\|_{U_{bad}},$

where $C_0$ is as defined in Theorem 3.3 and

(3.5) $C_1 \ \equiv\ \max_{\psi \in P_{m-1}} \frac{\|\psi(A) P_{bad} v_1\|}{\|\psi(A) P_{good} v_1\|}$

is a constant independent of $l$, $\nu$, $p$, or the filter shifts $\{\mu_{jk}\}$.

Proof. Let $\Pi_{good}$ and $\Pi_{bad}$ denote the orthogonal projections onto $U_{good}$ and $U_{bad}$, respectively. Then

$\|\Psi_{\nu p}(A)\phi(A) P_{bad}\psi(A) v_1\| = \|\Psi_{\nu p}(A)\phi(A)\Pi_{bad} P_{bad}\psi(A) v_1\| \ \le\ \|\Psi_{\nu p}(A)\phi(A)\Pi_{bad}\|\, \|P_{bad}\psi(A) v_1\|,$

and, assuming for the moment that $\phi(A)$ is invertible,

$\|P_{good}\psi(A) v_1\| = \left\| [\Psi_{\nu p}(A)\phi(A)]^{-1}\Pi_{good}\, P_{good}\Psi_{\nu p}(A)\phi(A)\psi(A) v_1 \right\| \ \le\ \left\| [\Psi_{\nu p}(A)\phi(A)]^{-1}\Pi_{good} \right\| \|P_{good}\Psi_{\nu p}(A)\phi(A)\psi(A) v_1\|.$

Hence,

$\frac{\|\Psi_{\nu p}(A)\phi(A) P_{bad}\psi(A) v_1\|}{\|\Psi_{\nu p}(A)\phi(A) P_{good}\psi(A) v_1\|} \ \le\ \left\| [\Psi_{\nu p}(A)\phi(A)]^{-1} \right\|_{U_{good}} \left\| \Psi_{\nu p}(A)\phi(A) \right\|_{U_{bad}} \frac{\|\psi(A) P_{bad} v_1\|}{\|\psi(A) P_{good} v_1\|}.$

Minimizing with respect to $\phi$ and maximizing with respect to $\psi$ yields the conclusion, provided the expression for $C_1$ is finite. This is assured since, as an immediate consequence of (3.2), $\|\psi(A) P_{good} v_1\| = 0$ can occur only when $\psi = 0$.

It is instructive to consider the situation where we seek only a single good eigenvalue, $\lambda_1$, which is simple. In this case $m = \dim U_{good} = 1$; the conclusion of Theorem 3.3 may be stated as

$\delta(U_{good}, K_l(A, v_1^{(\nu)})) \ \le\ C_0\, C_1 \min_{\phi \in P_{l-1}} \frac{\|\phi(A)\Psi_{\nu p}(A) w\|}{|\phi(\lambda_1)\Psi_{\nu p}(\lambda_1)|},$

where $w = P_{bad} v_1 / \|P_{bad} v_1\|$ and $C_1 = \|P_{bad} v_1\| / \|P_{good} v_1\|$. Elementary geometric considerations yield an alternate expression for $C_1$ in terms of $\|P_{good}\|$ and the quantities $\Theta(U_{good}, v_1)$ and $\Theta(U_{bad}, v_1)$, the smallest angles between $v_1$ and the subspaces $U_{good}$ and $U_{bad}$, respectively. This special case is stated as Proposition 2.1 of [18];^3 see also Saad's single eigenvalue convergence theory [32].

Our next step is to reduce the conclusion of Theorem 3.4 to an approximation problem in the complex plane. Let $U$ be an invariant subspace of $A$ associated with a compact subset $\Omega \subset \mathbb{C}$ (that is, $\Omega$ contains only those eigenvalues of $A$ associated with $U$ and no others). Define $\kappa(\Omega)$ as the smallest constant for which the inequality

(3.6) $\|f(A)\|_U \ \le\ \kappa(\Omega) \max_{z \in \Omega} |f(z)|$

holds uniformly over all $f \in H(\Omega)$, where $H(\Omega)$ denotes the functions analytic on $\Omega$.^4 Evidently, the value of the constant $\kappa(\Omega)$ depends on the particular choice of $\Omega$ (a set containing, in any case, those eigenvalues of $A$ associated with $U$). The following properties of $\kappa(\Omega)$ are shared by the generalized Kreiss constant $K(\Omega)$ of Toh and Trefethen [41] (defined for $U = \mathbb{C}^n$).

- $\kappa(\Omega)$ is monotone decreasing with respect to set inclusion on $\Omega$. Indeed, if $\Omega_1 \subseteq \Omega_2$, then for each function $f$ analytic on $\Omega_2$,

$\frac{\|f(A)\|_U}{\max\{|f(z)| : z \in \Omega_1\}} \ \ge\ \frac{\|f(A)\|_U}{\max\{|f(z)| : z \in \Omega_2\}}.$

Thus, $\Omega_1 \subseteq \Omega_2$ implies $\kappa(\Omega_1) \ge \kappa(\Omega_2)$.

- Since the constant functions are always among the available analytic functions on $\Omega$, $\kappa(\Omega) \ge 1$.

- If $A$ is normal, $\kappa(\Omega) = 1$. Indeed, if $A$ is normal and $\Sigma$ denotes the set of eigenvalues of $A$ associated with the invariant subspace $U$, then

$1 \ \le\ \kappa(\Omega) = \sup_{f \in H(\Omega)} \frac{\|f(A)\|_U}{\max\{|f(z)| : z \in \Omega\}} = \sup_{f \in H(\Omega)} \frac{\max\{|f(\lambda)| : \lambda \in \Sigma\}}{\max\{|f(z)| : z \in \Omega\}} \ \le\ 1.$

^3 [18] contains an error amounting to the tacit assumption that $P_{good}$ is an orthogonal projection, which is true only if $U_{good} \perp U_{bad}$. Thus the results coincide only in this special case (note $C_0 = 1$).

^4 For given $k \ge 1$, the sets $\Omega$ that (i) contain all eigenvalues of $A$ and (ii) satisfy $\kappa(\Omega) \le k$ are called $k$-spectral sets and figure prominently in the dilation theory of operators [29].

- If any eigenvalue associated with the invariant subspace $U$ is defective, then some choices of $\Omega$ will not yield a finite value for $\kappa(\Omega)$. For example, let

$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$

and take $U = \mathbb{C}^2$ as an invariant subspace associated with the defective eigenvalue $\lambda = 0$. If $\Omega$ consists of the single point $\{0\}$ and $f(z) = z$, then evidently $\|f(A)\|_U = 1$ but $\max_{z \in \Omega} |f(z)| = 0$. So no finite value of $\kappa(\Omega)$ is possible (see [31, p. 440]). More generally, if $\Omega$ is the spectrum of a defective matrix $A$, then the monic polynomial consisting of a single linear factor for each distinct eigenvalue of $A$ is zero on $\Omega$ but cannot annihilate $A$, as it has lower degree than the minimum polynomial of $A$.

We now use $\kappa$ to adapt Theorem 3.4 into a more approachable approximation problem. In particular, if $\Omega_{good}$ is a compact subset of $\mathbb{C}$ containing all the good eigenvalues of $A$ but none of the bad, then

$\left\| [\phi(A)\Psi_{\nu p}(A)]^{-1} \right\|_{U_{good}} \ \le\ \kappa(\Omega_{good}) \max\{\, |[\phi(z)\Psi_{\nu p}(z)]^{-1}| : z \in \Omega_{good} \,\} = \frac{\kappa(\Omega_{good})}{\min\{\, |\phi(z)\Psi_{\nu p}(z)| : z \in \Omega_{good} \,\}}.$

Applying a similar bound to $\|\phi(A)\Psi_{\nu p}(A)\|_{U_{bad}}$, we obtain the following result, the centerpiece of our development.

Theorem 3.5. Suppose $A$ and $v_1$ satisfy the conditions of Theorem 3.3. Let $\Omega_{good}$ and $\Omega_{bad}$ be disjoint compact subsets of $\mathbb{C}$ that contain, respectively, the good and bad eigenvalues of $A$, and suppose that none of the filter shifts $\{\mu_{jk}\}$ lies in $\Omega_{good}$. Then, for $l \ge m$,

$\delta(U_{good}, K_l(A, v_1^{(\nu)})) \ \le\ C_0\, C_1\, C_2 \min_{\phi \in P_{l-m}} \frac{\max\{|\Psi_{\nu p}(z)\phi(z)| : z \in \Omega_{bad}\}}{\min\{|\Psi_{\nu p}(z)\phi(z)| : z \in \Omega_{good}\}},$

where $C_0$ and $C_1$ are the constants introduced in Theorems 3.3 and 3.4, respectively, and $C_2 \equiv \kappa(\Omega_{good})\,\kappa(\Omega_{bad})$.

Evidently, Theorem 3.5 can be implemented with a variety of choices for $\Omega_{good}$ and $\Omega_{bad}$, which affect both the polynomial approximation problem and the constant $C_2$ (considered in section 5.3). The polynomial approximation problem, classified as Zolotarev-type, is discussed in detail in the next section. Similar problems arise in calculating optimal ADI parameters [26].

4. The polynomial approximation problem. Theorem 3.5 suggests the gap between a Krylov subspace and an invariant subspace will converge to zero at a rate determined by how small polynomials of increasing degree can become on $\Omega_{bad}$ while maintaining a minimal uniform magnitude on $\Omega_{good}$. How can this manifest as a linear convergence rate? Consider the ansatz

$\min_{\phi \in P_l} \frac{\max\{|\phi(w)| : w \in \Omega_{bad}\}}{\min\{|\phi(z)| : z \in \Omega_{good}\}} = r^l$

for some $0 < r \le 1$. Pick a fixed $\phi \in P_l$, say, with exact degree $l$. Then

(4.1) $\log\left( \frac{\max\{|\phi(w)| : w \in \Omega_{bad}\}}{\min\{|\phi(z)| : z \in \Omega_{good}\}} \right) \ \le\ l \log(r).$

Introducing

$U_\phi(z, \Omega_{bad}) \ \equiv\ \frac{1}{l} \log\left( \frac{|\phi(z)|}{\max\{|\phi(w)| : w \in \Omega_{bad}\}} \right),$

(4.1) is equivalent to

$\min_{z \in \Omega_{good}} U_\phi(z, \Omega_{bad}) \ \ge\ -\log(r).$

Evidently, the size of $r$ will be related to how large $U_\phi(z, \Omega_{bad})$ can be made uniformly throughout $\Omega_{good}$; larger $U_\phi$ values allow smaller $r$ (faster rates). $U_\phi(z, \Omega_{bad})$ has the following properties: $U_\phi(z, \Omega_{bad})$ is harmonic at $z$ where $\phi(z) \ne 0$; $U_\phi(z, \Omega_{bad}) = \log|z| + c + o(1)$ for a finite constant $c$ as $z \to \infty$; and $U_\phi(z, \Omega_{bad}) \le 0$ for all $z \in \Omega_{bad}$.

Potential theory provides a natural setting for studying such approximation problems. It is central to the analysis of iterative methods for solving linear systems (see, e.g., [26] for ADI methods and [10, 28] for Krylov subspace methods), and has been used by Calvetti, Reichel, and Sorensen to analyze the Hermitian Lanczos algorithm with restarts [6]. We apply similar techniques here to study $U_\phi(z, \Omega_{bad})$.

4.1. Potential theory background. Let $D \subset \mathbb{C}$ be a compact set whose complement, $\mathbb{C} \setminus D$, is a connected Dirichlet region.^5 The Green's function of $\mathbb{C} \setminus D$ with pole at infinity is defined as that function $g[z, D]$ that satisfies the following properties: (i) $g$ is harmonic in $\mathbb{C} \setminus D$; (ii) $g[z, D] = \log|z| + $ finite constant $+\, o(1)$ as $z \to \infty$; (iii) $\lim_{z \to \hat{z}} g[z, D] = 0$ for all $\hat{z} \in \partial D$; (iv) $g[z, D] > 0$ for all $z \in \mathbb{C} \setminus D$. Note that property (iv) can be deduced from (i), (ii), the fact that (ii) implies that $g > 0$ for all sufficiently large $|z|$, and the maximum principle for harmonic functions. The maximum principle also shows that $g[z, D]$ is the only function satisfying (i)-(iv).

^5 See [8, sect. X.4]. For our purposes here, this can be taken to mean a set having a piecewise smooth boundary with no isolated points; the effect of isolated points is addressed in section 4.3.

Example 4.1. If $\mathbb{C} \setminus D$ is simply connected, one is assured (from the Riemann mapping theorem; see, e.g., [8, sect. VII.4]) of the existence of a function $F(z)$ that maps $\mathbb{C} \setminus D$ conformally onto the exterior of the closed unit disk, $\mathbb{C} \setminus B_1 = \{z : |z| > 1\}$, such that $F(\infty) = \infty$. Such an $F$ must behave asymptotically as $\alpha z + O(1)$ as $z \to \infty$ for some constant $\alpha$, since it must remain one-to-one in any neighborhood of $\infty$. Since $\log|z|$ is harmonic for any $z \ne 0$, one may check that $u(z) = \log|F(z)|$ is also harmonic in $z$ wherever $F(z) \ne 0$, that $u(\infty) = \infty$, and that $u(z) \to 0$ as $z$ approaches $\partial D$ from $\mathbb{C} \setminus D$ (where $|F(z)| \to 1$). Thus, $\log|F(z)|$ is the Green's function with pole at infinity for $\mathbb{C} \setminus D$. Evidently, $\lim_{z \to \infty} (u(z) - \log|z|) = \log|\alpha|$. Notice that $\log|z|$ itself is the Green's function with pole at infinity for $\mathbb{C} \setminus B_1$.

Even for more complicated compact sets $D$, the condition that $g[z, D]$ is harmonic everywhere outside $D$ with a pole at $\infty$ restricts the rate of growth of $g[z, D]$ near $\infty$. Loosely speaking, as $|z|$ becomes very large, the compact set $D$ becomes less and less distinguishable from a disk centered at 0 (say, with radius $\gamma$), and so $g[z, D]$ becomes less and less distinguishable from $g[z, B_\gamma] = \log|z/\gamma| = \log|z| - \log\gamma$, which is the Green's function with pole at infinity for the complement of $B_\gamma = \{z : |z| \le \gamma\}$. Indeed, from property (ii), $g[z, D]$ has growth at infinity satisfying

(4.2) $\lim_{z \to \infty} \left( g[z, D] - \log|z| \right) = -\log\gamma$

for some constant $\gamma > 0$ known as the logarithmic capacity of the set $D$. This $\gamma$ can be thought of as the effective radius of $D$ in the sense we have just described.

Example 4.2. Suppose $\Phi_l(z)$ is a monic polynomial of degree $l$ and let

$D_\epsilon(\Phi_l) = \{ z \in \mathbb{C} : |\Phi_l(z)| \le \epsilon \}$

be a family of regions whose boundaries are the $\epsilon$-lemniscates of $\Phi_l(z)$. $D_\epsilon(\Phi_l)$ is compact for each $\epsilon > 0$, though it need not be a connected region. With an easy calculation one may verify that $D_\epsilon(\Phi_l)$ has the Green's function (cf. [36, p. 164])

$g[z, D_\epsilon(\Phi_l)] = \frac{1}{l} \log\left( \frac{|\Phi_l(z)|}{\epsilon} \right).$

Equipped with the Green's function $g[z, D]$, we return to the analysis of the function $U_\phi(z, D)$ describing the error in our approximation problem. The following result is a simplified version of the Bernstein-Walsh lemma (see [36, sect. III.2]).

Proposition 4.3. Let $D$ be a compact set with piecewise smooth boundary $\partial D$. Suppose $u$ is harmonic outside $D$ and that $u(z) \le 0$ for $z \in \partial D$. If $u(z) = \log|z| + c + o(1)$ for some constant $c$ as $z \to \infty$, then $u(z) \le g[z, D]$. In particular, if $\phi(z)$ is any polynomial of degree $l$, then for each $z \in \mathbb{C} \setminus D$,

(4.3) $U_\phi(z, D) = \frac{1}{l} \log\left( \frac{|\phi(z)|}{\max\{|\phi(w)| : w \in D\}} \right) \ \le\ g[z, D].$

For certain special choices of $D = \Omega_{bad}$, the polynomial approximation problem of Theorem 3.5 can be solved exactly.

Theorem 4.4. Suppose $\Phi_l(z)$ is a monic polynomial of degree $l$. Let $\Omega_{bad} = D_\epsilon(\Phi_l)$ be an associated $\epsilon$-lemniscatic set as defined in Example 4.2, and suppose $\Omega_{good}$ is a compact subset of $\mathbb{C}$ such that $\Omega_{good} \cap D_\epsilon(\Phi_l) = \emptyset$. Then

$\min_{\phi \in P_l} \frac{\max\{|\phi(w)| : w \in \Omega_{bad}\}}{\min\{|\phi(z)| : z \in \Omega_{good}\}} = \frac{\epsilon}{\min\{|\Phi_l(z)| : z \in \Omega_{good}\}}.$

Proof. Using the Green's function for $D_\epsilon(\Phi_l)$ described in Example 4.2, we can rearrange (4.3) to show that for any $\phi \in P_l$,

$\frac{|\phi(z)|}{\max\{|\phi(w)| : w \in D_\epsilon(\Phi_l)\}} \ \le\ \frac{|\Phi_l(z)|}{\epsilon}$

holds for all $z \in \Omega_{good}$. Equality is attained for every $z \in \mathbb{C}$ whenever $\phi = \Phi_l$. Minimizing over $z \in \Omega_{good}$ and then maximizing over $\phi \in P_l$ yields

(4.4) $\max_{\phi \in P_l} \frac{\min\{|\phi(z)| : z \in \Omega_{good}\}}{\max\{|\phi(w)| : w \in D_\epsilon(\Phi_l)\}} \ \le\ \frac{\min\{|\Phi_l(z)| : z \in \Omega_{good}\}}{\epsilon}.$

In fact, equality must hold in (4.4), since $\phi = \Phi_l$ is included in the class of functions over which the maximization occurs. The conclusion then follows by taking the reciprocal of both sides.
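Example 4.1 identifies $g[z, D]$ with $\log|F(z)|$ when a conformal map $F$ is available. For the special case of $D$ a real interval, $F$ is an inverse Joukowski map, and the quantity $\min\{g[z, \Omega_{bad}] : z \in \Omega_{good}\}$, which governs the asymptotic rate in the next theorem, can be evaluated in closed form. The sketch below is our own illustration; the interval $[0, 0.99]$ and the good point $z = 1$ are arbitrary choices, loosely modeled on the example of Figure 2.1:

```python
import numpy as np

def green_interval(z, a, b):
    """Green's function g[z, [a,b]] with pole at infinity for the complement of the
    real interval [a, b]: g = log|F(z)| with F an inverse Joukowski map."""
    w = (2 * np.asarray(z, dtype=complex) - (a + b)) / (b - a)
    F = w + np.sqrt(w - 1) * np.sqrt(w + 1)     # branch with |F| >= 1 off the cut
    F = np.where(np.abs(F) >= 1, F, 1 / F)
    return np.log(np.abs(F))

# Predicted asymptotic linear rate for Omega_bad = [0, 0.99], good eigenvalue z = 1:
rate = np.exp(-green_interval(1.0, 0.0, 0.99))
print(rate)     # gap ~ C * rate**l, ignoring superlinear effects
```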

More general choices for $D = \Omega_{bad}$ will not typically yield exactly solvable polynomial approximation problems, at least for fixed (finite) polynomial degree. However, the following asymptotic result holds as the polynomial degree increases.

Theorem 4.5. Let $\Omega_{bad}$ and $\Omega_{good}$ be two disjoint compact sets in the complex plane such that $\mathbb{C} \setminus \Omega_{bad}$ is a Dirichlet region. Then

(4.5) $\lim_{l \to \infty} \left( \min_{\phi \in P_l} \frac{\max\{|\phi(w)| : w \in \Omega_{bad}\}}{\min\{|\phi(z)| : z \in \Omega_{good}\}} \right)^{1/l} = e^{-\min\{g[z, \Omega_{bad}] : z \in \Omega_{good}\}},$

where $g[z, \Omega_{bad}]$ is the Green's function of $\mathbb{C} \setminus \Omega_{bad}$ with pole at infinity.

Proof. The theorem is proved in [26, p. 236], where the left-hand side of (4.5) is referred to as the $(l, 0)$ Zolotarev number. We give here a brief indication of the proof to support later discussion. Inequality (4.3) can be manipulated to yield

$\left( \frac{|\phi_l(z)|}{\max\{|\phi_l(w)| : w \in \Omega_{bad}\}} \right)^{1/l} \ \le\ e^{g[z, \Omega_{bad}]},$

which in turn implies

$\left( \frac{\max\{|\phi_l(w)| : w \in \Omega_{bad}\}}{\min\{|\phi_l(z)| : z \in \Omega_{good}\}} \right)^{1/l} \ \ge\ e^{-\min\{g[z, \Omega_{bad}] : z \in \Omega_{good}\}}.$

Furthermore, one may construct polynomials $L_k$ that have as their zeros points distributed on the boundary $\partial\Omega_{bad}$, the Leja points $\{\mu_1, \mu_2, \ldots, \mu_k\}$, defined recursively so that

$\mu_{k+1} = \arg\max \left\{ \prod_{j=1}^{k} |z - \mu_j| : z \in \Omega_{bad} \right\};$

see [36, sect. V.1]. This sequence of Leja polynomials satisfies asymptotic optimality,

(4.6) $\lim_{k \to \infty} \left( \frac{|L_k(z)|}{\max\{|L_k(w)| : w \in \Omega_{bad}\}} \right)^{1/k} = e^{g[z, \Omega_{bad}]}$

for each $z \in \mathbb{C} \setminus \Omega_{bad}$. Convergence is uniform on compact subsets of $\mathbb{C} \setminus \Omega_{bad}$. Thus we can reverse the order of the limit with respect to polynomial degree and minimization with respect to $z \in \Omega_{good}$, then take reciprocals to find

(4.7) $\lim_{k \to \infty} \left( \frac{\max\{|L_k(w)| : w \in \Omega_{bad}\}}{\min\{|L_k(z)| : z \in \Omega_{good}\}} \right)^{1/k} = e^{-\min\{g[z, \Omega_{bad}] : z \in \Omega_{good}\}}.$

Since

$\left( \frac{\max\{|L_l(w)| : w \in \Omega_{bad}\}}{\min\{|L_l(z)| : z \in \Omega_{good}\}} \right)^{1/l} \ \ge\ \min_{\phi \in P_l} \left( \frac{\max\{|\phi(w)| : w \in \Omega_{bad}\}}{\min\{|\phi(z)| : z \in \Omega_{good}\}} \right)^{1/l} \ \ge\ e^{-\min\{g[z, \Omega_{bad}] : z \in \Omega_{good}\}},$

equality must hold throughout in the limit, and thus (4.5) holds.

In the context of Example 4.1, where $F(z)$ was a conformal map taking the exterior of $\Omega_{bad}$ to the exterior of the closed unit disk with $F(\infty) = \infty$, Theorem 4.5 reduces to (cf. [10, Thm. 2])

$\lim_{l \to \infty} \min_{\phi \in P_l} \left( \frac{\max\{|\phi(w)| : w \in \Omega_{bad}\}}{\min\{|\phi(z)| : z \in \Omega_{good}\}} \right)^{1/l} = \max_{z \in \Omega_{good}} \frac{1}{|F(z)|}.$
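The Leja points used in the proof admit a simple greedy computation once $\partial\Omega_{bad}$ is discretized. The sketch below is our own illustration; the discretization, the choice of first point, and the use of logarithms to avoid overflow are our assumptions:

```python
import numpy as np

def leja_points(boundary, k):
    """First k Leja points on a discretized boundary of Omega_bad.

    boundary: 1-D complex array of candidate boundary points.
    Greedy recursion: mu_{j+1} maximizes prod_i |z - mu_i| over the candidates.
    """
    pts = [boundary[np.argmax(np.abs(boundary))]]        # a convenient first point
    logprod = np.log(np.abs(boundary - pts[0]) + 1e-300)
    for _ in range(k - 1):
        nxt = boundary[np.argmax(logprod)]
        pts.append(nxt)
        logprod += np.log(np.abs(boundary - nxt) + 1e-300)
    return np.array(pts)

# Example: Leja shifts for a bad region enclosed by the circle |z - 0.5| = 0.5,
# usable as restart shifts mu_jk in the filter polynomial Psi_{nu p}.
circle = 0.5 + 0.5 * np.exp(2j * np.pi * np.linspace(0, 1, 500, endpoint=False))
print(leja_points(circle, 8))
```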

4.2. Effective restart strategies. The usual goal in constructing a restart strategy is to limit the size of the Krylov subspace (restricting the maximum degree of the polynomial $\phi$) without degrading the asymptotic convergence rate. Demonstrating equality in (4.5) pivoted on the construction of an optimal family of polynomials, in this case Leja polynomials. There are other possibilities, however. Fekete polynomials are the usual choice for the construction in Theorem 4.5; see [36, sect. III.1]. Chebyshev polynomials and Faber polynomials offer familiar alternatives. (For Hermitian matrices, a practical Leja shift strategy has been developed by Baglama, Calvetti, and Reichel [3] and Calvetti, Reichel, and Sorensen [6]. Heuveline and Sadkane advocate numerical conformal mapping to determine Faber polynomials for restarting non-Hermitian iterations [18].) Once some optimal family of polynomials is known that solves (4.5), effective restart strategies become evident.

Theorem 4.6. Let $\Omega_{good}$ and $\Omega_{bad}$ be two disjoint compact sets in the complex plane containing, respectively, the good and bad eigenvalues of $A$, and such that $\mathbb{C} \setminus \Omega_{bad}$ is a Dirichlet region. Suppose that $\Psi_{\nu p}(z)$ is the aggregate restart polynomial representing $\nu$ restarts, each of order $p$.

(a) If polynomial restarts are performed using roots of optimal polynomials for $\Omega_{bad}$ (i.e., the $\Psi_{\nu p}(z)$ are optimal polynomials of degree $\nu p$), then

(4.8) $\lim_{\nu \to \infty} \left( \min_{\phi \in P_l} \frac{\max\{|\Psi_{\nu p}(w)\phi(w)| : w \in \Omega_{bad}\}}{\min\{|\Psi_{\nu p}(z)\phi(z)| : z \in \Omega_{good}\}} \right)^{\frac{1}{\nu p + l}} = e^{-\min\{g[z, \Omega_{bad}] : z \in \Omega_{good}\}},$

where $g[z, \Omega_{bad}]$ is the Green's function of $\mathbb{C} \setminus \Omega_{bad}$ with pole at infinity.

(b) If the boundary of $\Omega_{bad}$ is a lemniscate of $\Psi_{\nu p}\Phi_l$,

$\Omega_{bad} = D_\epsilon(\Psi_{\nu p}\Phi_l) = \{ z \in \mathbb{C} : |\Psi_{\nu p}(z)\Phi_l(z)| \le \epsilon \},$

for some degree-$l$ monic polynomial $\Phi_l$ and some $\epsilon > 0$, then

$\min_{\phi \in P_l} \frac{\max\{|\Psi_{\nu p}(w)\phi(w)| : w \in \Omega_{bad}\}}{\min\{|\Psi_{\nu p}(z)\phi(z)| : z \in \Omega_{good}\}} = \frac{\epsilon}{\min\{|\Psi_{\nu p}(z)\Phi_l(z)| : z \in \Omega_{good}\}}.$

Proof. Part (b) follows immediately from Theorem 4.4. Part (a) can be seen by observing that since $\Psi_{\nu p}(z)$ is an asymptotically optimal family for $\Omega_{bad}$,

$\frac{\max\{|\Psi_{\nu p}(w)| : w \in \Omega_{bad}\}}{\min\{|\Psi_{\nu p}(z)| : z \in \Omega_{good}\}} \ \ge\ \min_{\phi \in P_l} \left( \frac{\max\{|\Psi_{\nu p}(w)\phi(w)| : w \in \Omega_{bad}\}}{\min\{|\Psi_{\nu p}(z)\phi(z)| : z \in \Omega_{good}\}} \right) \ \ge\ \left( e^{-\min\{g[z, \Omega_{bad}] : z \in \Omega_{good}\}} \right)^{\nu p + l}.$

Now, fixing $p$ and $l$, the conclusion follows from (4.7) by following the subsequence generated by $\nu = 1, 2, \ldots$.

Recall that the desired effect of the restart polynomial is to retain the rapid convergence rate of the full (unrestarted) Krylov subspace without requiring the dimension $l$ to grow without bound. We have seen here that restarting with optimal polynomials for $\Omega_{bad}$ recovers the expected linear convergence rate for $\Omega_{bad}$ (presuming one can identify this set, not a trivial matter in practice). Still, the unrestarted process may take advantage of the discrete nature of the spectrum, accelerating convergence beyond the expected linear rate. Designing a restart strategy that yields similar behavior is more elaborate.

4.3. Superlinear effects from assimilation of bad eigenvalues. In a variety of situations, the gap appears to converge superlinearly. True superlinear convergence is an asymptotic phenomenon that has a nontrivial meaning only for nonterminating iterations. Thus one must be cautious about describing superlinear effects relating to (unrestarted) Krylov subspaces, since $U_{good}$ is eventually completely captured by the Krylov subspace, as discussed in section 2. Here our point of view follows that of [46, 48], showing the estimated gap may be bounded by a family of linearly converging


More information

Eigenvalue Problems. Eigenvalue problems occur in many areas of science and engineering, such as structural analysis

Eigenvalue Problems. Eigenvalue problems occur in many areas of science and engineering, such as structural analysis Eigenvalue Problems Eigenvalue problems occur in many areas of science and engineering, such as structural analysis Eigenvalues also important in analyzing numerical methods Theory and algorithms apply

More information

The Lanczos and conjugate gradient algorithms

The Lanczos and conjugate gradient algorithms The Lanczos and conjugate gradient algorithms Gérard MEURANT October, 2008 1 The Lanczos algorithm 2 The Lanczos algorithm in finite precision 3 The nonsymmetric Lanczos algorithm 4 The Golub Kahan bidiagonalization

More information

Scientific Computing: An Introductory Survey

Scientific Computing: An Introductory Survey Scientific Computing: An Introductory Survey Chapter 4 Eigenvalue Problems Prof. Michael T. Heath Department of Computer Science University of Illinois at Urbana-Champaign Copyright c 2002. Reproduction

More information

PERTURBED ARNOLDI FOR COMPUTING MULTIPLE EIGENVALUES

PERTURBED ARNOLDI FOR COMPUTING MULTIPLE EIGENVALUES 1 PERTURBED ARNOLDI FOR COMPUTING MULTIPLE EIGENVALUES MARK EMBREE, THOMAS H. GIBSON, KEVIN MENDOZA, AND RONALD B. MORGAN Abstract. fill in abstract Key words. eigenvalues, multiple eigenvalues, Arnoldi,

More information

The following definition is fundamental.

The following definition is fundamental. 1. Some Basics from Linear Algebra With these notes, I will try and clarify certain topics that I only quickly mention in class. First and foremost, I will assume that you are familiar with many basic

More information

Charles University Faculty of Mathematics and Physics DOCTORAL THESIS. Krylov subspace approximations in linear algebraic problems

Charles University Faculty of Mathematics and Physics DOCTORAL THESIS. Krylov subspace approximations in linear algebraic problems Charles University Faculty of Mathematics and Physics DOCTORAL THESIS Iveta Hnětynková Krylov subspace approximations in linear algebraic problems Department of Numerical Mathematics Supervisor: Doc. RNDr.

More information

AMS526: Numerical Analysis I (Numerical Linear Algebra)

AMS526: Numerical Analysis I (Numerical Linear Algebra) AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 19: More on Arnoldi Iteration; Lanczos Iteration Xiangmin Jiao Stony Brook University Xiangmin Jiao Numerical Analysis I 1 / 17 Outline 1

More information

NORMS ON SPACE OF MATRICES

NORMS ON SPACE OF MATRICES NORMS ON SPACE OF MATRICES. Operator Norms on Space of linear maps Let A be an n n real matrix and x 0 be a vector in R n. We would like to use the Picard iteration method to solve for the following system

More information

Computation of eigenvalues and singular values Recall that your solutions to these questions will not be collected or evaluated.

Computation of eigenvalues and singular values Recall that your solutions to these questions will not be collected or evaluated. Math 504, Homework 5 Computation of eigenvalues and singular values Recall that your solutions to these questions will not be collected or evaluated 1 Find the eigenvalues and the associated eigenspaces

More information

LARGE SPARSE EIGENVALUE PROBLEMS. General Tools for Solving Large Eigen-Problems

LARGE SPARSE EIGENVALUE PROBLEMS. General Tools for Solving Large Eigen-Problems LARGE SPARSE EIGENVALUE PROBLEMS Projection methods The subspace iteration Krylov subspace methods: Arnoldi and Lanczos Golub-Kahan-Lanczos bidiagonalization General Tools for Solving Large Eigen-Problems

More information

Solution of eigenvalue problems. Subspace iteration, The symmetric Lanczos algorithm. Harmonic Ritz values, Jacobi-Davidson s method

Solution of eigenvalue problems. Subspace iteration, The symmetric Lanczos algorithm. Harmonic Ritz values, Jacobi-Davidson s method Solution of eigenvalue problems Introduction motivation Projection methods for eigenvalue problems Subspace iteration, The symmetric Lanczos algorithm Nonsymmetric Lanczos procedure; Implicit restarts

More information

NONCOMMUTATIVE POLYNOMIAL EQUATIONS. Edward S. Letzter. Introduction

NONCOMMUTATIVE POLYNOMIAL EQUATIONS. Edward S. Letzter. Introduction NONCOMMUTATIVE POLYNOMIAL EQUATIONS Edward S Letzter Introduction My aim in these notes is twofold: First, to briefly review some linear algebra Second, to provide you with some new tools and techniques

More information

ALGEBRA QUALIFYING EXAM PROBLEMS LINEAR ALGEBRA

ALGEBRA QUALIFYING EXAM PROBLEMS LINEAR ALGEBRA ALGEBRA QUALIFYING EXAM PROBLEMS LINEAR ALGEBRA Kent State University Department of Mathematical Sciences Compiled and Maintained by Donald L. White Version: August 29, 2017 CONTENTS LINEAR ALGEBRA AND

More information

LARGE SPARSE EIGENVALUE PROBLEMS

LARGE SPARSE EIGENVALUE PROBLEMS LARGE SPARSE EIGENVALUE PROBLEMS Projection methods The subspace iteration Krylov subspace methods: Arnoldi and Lanczos Golub-Kahan-Lanczos bidiagonalization 14-1 General Tools for Solving Large Eigen-Problems

More information

Characterization of half-radial matrices

Characterization of half-radial matrices Characterization of half-radial matrices Iveta Hnětynková, Petr Tichý Faculty of Mathematics and Physics, Charles University, Sokolovská 83, Prague 8, Czech Republic Abstract Numerical radius r(a) is the

More information

ANY FINITE CONVERGENCE CURVE IS POSSIBLE IN THE INITIAL ITERATIONS OF RESTARTED FOM

ANY FINITE CONVERGENCE CURVE IS POSSIBLE IN THE INITIAL ITERATIONS OF RESTARTED FOM Electronic Transactions on Numerical Analysis. Volume 45, pp. 133 145, 2016. Copyright c 2016,. ISSN 1068 9613. ETNA ANY FINITE CONVERGENCE CURVE IS POSSIBLE IN THE INITIAL ITERATIONS OF RESTARTED FOM

More information

Linear Algebra Massoud Malek

Linear Algebra Massoud Malek CSUEB Linear Algebra Massoud Malek Inner Product and Normed Space In all that follows, the n n identity matrix is denoted by I n, the n n zero matrix by Z n, and the zero vector by θ n An inner product

More information

Topics in linear algebra

Topics in linear algebra Chapter 6 Topics in linear algebra 6.1 Change of basis I want to remind you of one of the basic ideas in linear algebra: change of basis. Let F be a field, V and W be finite dimensional vector spaces over

More information

Math 504 (Fall 2011) 1. (*) Consider the matrices

Math 504 (Fall 2011) 1. (*) Consider the matrices Math 504 (Fall 2011) Instructor: Emre Mengi Study Guide for Weeks 11-14 This homework concerns the following topics. Basic definitions and facts about eigenvalues and eigenvectors (Trefethen&Bau, Lecture

More information

Approximating the matrix exponential of an advection-diffusion operator using the incomplete orthogonalization method

Approximating the matrix exponential of an advection-diffusion operator using the incomplete orthogonalization method Approximating the matrix exponential of an advection-diffusion operator using the incomplete orthogonalization method Antti Koskela KTH Royal Institute of Technology, Lindstedtvägen 25, 10044 Stockholm,

More information

Matrices, Moments and Quadrature, cont d

Matrices, Moments and Quadrature, cont d Jim Lambers CME 335 Spring Quarter 2010-11 Lecture 4 Notes Matrices, Moments and Quadrature, cont d Estimation of the Regularization Parameter Consider the least squares problem of finding x such that

More information

THE MINIMAL POLYNOMIAL AND SOME APPLICATIONS

THE MINIMAL POLYNOMIAL AND SOME APPLICATIONS THE MINIMAL POLYNOMIAL AND SOME APPLICATIONS KEITH CONRAD. Introduction The easiest matrices to compute with are the diagonal ones. The sum and product of diagonal matrices can be computed componentwise

More information

On Solving Large Algebraic. Riccati Matrix Equations

On Solving Large Algebraic. Riccati Matrix Equations International Mathematical Forum, 5, 2010, no. 33, 1637-1644 On Solving Large Algebraic Riccati Matrix Equations Amer Kaabi Department of Basic Science Khoramshahr Marine Science and Technology University

More information

Applied Mathematics 205. Unit V: Eigenvalue Problems. Lecturer: Dr. David Knezevic

Applied Mathematics 205. Unit V: Eigenvalue Problems. Lecturer: Dr. David Knezevic Applied Mathematics 205 Unit V: Eigenvalue Problems Lecturer: Dr. David Knezevic Unit V: Eigenvalue Problems Chapter V.4: Krylov Subspace Methods 2 / 51 Krylov Subspace Methods In this chapter we give

More information

Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012

Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012 Instructions Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012 The exam consists of four problems, each having multiple parts. You should attempt to solve all four problems. 1.

More information

LINEAR ALGEBRA SUMMARY SHEET.

LINEAR ALGEBRA SUMMARY SHEET. LINEAR ALGEBRA SUMMARY SHEET RADON ROSBOROUGH https://intuitiveexplanationscom/linear-algebra-summary-sheet/ This document is a concise collection of many of the important theorems of linear algebra, organized

More information

MATH Linear Algebra

MATH Linear Algebra MATH 304 - Linear Algebra In the previous note we learned an important algorithm to produce orthogonal sequences of vectors called the Gramm-Schmidt orthogonalization process. Gramm-Schmidt orthogonalization

More information

Matrix functions and their approximation. Krylov subspaces

Matrix functions and their approximation. Krylov subspaces [ 1 / 31 ] University of Cyprus Matrix functions and their approximation using Krylov subspaces Matrixfunktionen und ihre Approximation in Krylov-Unterräumen Stefan Güttel stefan@guettel.com Nicosia, 24th

More information

MAT2342 : Introduction to Applied Linear Algebra Mike Newman, fall Projections. introduction

MAT2342 : Introduction to Applied Linear Algebra Mike Newman, fall Projections. introduction MAT4 : Introduction to Applied Linear Algebra Mike Newman fall 7 9. Projections introduction One reason to consider projections is to understand approximate solutions to linear systems. A common example

More information

Preconditioned inverse iteration and shift-invert Arnoldi method

Preconditioned inverse iteration and shift-invert Arnoldi method Preconditioned inverse iteration and shift-invert Arnoldi method Melina Freitag Department of Mathematical Sciences University of Bath CSC Seminar Max-Planck-Institute for Dynamics of Complex Technical

More information

Part III. 10 Topological Space Basics. Topological Spaces

Part III. 10 Topological Space Basics. Topological Spaces Part III 10 Topological Space Basics Topological Spaces Using the metric space results above as motivation we will axiomatize the notion of being an open set to more general settings. Definition 10.1.

More information

The Hilbert Space of Random Variables

The Hilbert Space of Random Variables The Hilbert Space of Random Variables Electrical Engineering 126 (UC Berkeley) Spring 2018 1 Outline Fix a probability space and consider the set H := {X : X is a real-valued random variable with E[X 2

More information

Elementary linear algebra

Elementary linear algebra Chapter 1 Elementary linear algebra 1.1 Vector spaces Vector spaces owe their importance to the fact that so many models arising in the solutions of specific problems turn out to be vector spaces. The

More information

Krylov Space Methods. Nonstationary sounds good. Radu Trîmbiţaş ( Babeş-Bolyai University) Krylov Space Methods 1 / 17

Krylov Space Methods. Nonstationary sounds good. Radu Trîmbiţaş ( Babeş-Bolyai University) Krylov Space Methods 1 / 17 Krylov Space Methods Nonstationary sounds good Radu Trîmbiţaş Babeş-Bolyai University Radu Trîmbiţaş ( Babeş-Bolyai University) Krylov Space Methods 1 / 17 Introduction These methods are used both to solve

More information

1. What is the determinant of the following matrix? a 1 a 2 4a 3 2a 2 b 1 b 2 4b 3 2b c 1. = 4, then det

1. What is the determinant of the following matrix? a 1 a 2 4a 3 2a 2 b 1 b 2 4b 3 2b c 1. = 4, then det What is the determinant of the following matrix? 3 4 3 4 3 4 4 3 A 0 B 8 C 55 D 0 E 60 If det a a a 3 b b b 3 c c c 3 = 4, then det a a 4a 3 a b b 4b 3 b c c c 3 c = A 8 B 6 C 4 D E 3 Let A be an n n matrix

More information

Bare-bones outline of eigenvalue theory and the Jordan canonical form

Bare-bones outline of eigenvalue theory and the Jordan canonical form Bare-bones outline of eigenvalue theory and the Jordan canonical form April 3, 2007 N.B.: You should also consult the text/class notes for worked examples. Let F be a field, let V be a finite-dimensional

More information

LINEAR ALGEBRA BOOT CAMP WEEK 2: LINEAR OPERATORS

LINEAR ALGEBRA BOOT CAMP WEEK 2: LINEAR OPERATORS LINEAR ALGEBRA BOOT CAMP WEEK 2: LINEAR OPERATORS Unless otherwise stated, all vector spaces in this worksheet are finite dimensional and the scalar field F has characteristic zero. The following are facts

More information

In English, this means that if we travel on a straight line between any two points in C, then we never leave C.

In English, this means that if we travel on a straight line between any two points in C, then we never leave C. Convex sets In this section, we will be introduced to some of the mathematical fundamentals of convex sets. In order to motivate some of the definitions, we will look at the closest point problem from

More information

Lecture 9: Krylov Subspace Methods. 2 Derivation of the Conjugate Gradient Algorithm

Lecture 9: Krylov Subspace Methods. 2 Derivation of the Conjugate Gradient Algorithm CS 622 Data-Sparse Matrix Computations September 19, 217 Lecture 9: Krylov Subspace Methods Lecturer: Anil Damle Scribes: David Eriksson, Marc Aurele Gilles, Ariah Klages-Mundt, Sophia Novitzky 1 Introduction

More information

GRE Subject test preparation Spring 2016 Topic: Abstract Algebra, Linear Algebra, Number Theory.

GRE Subject test preparation Spring 2016 Topic: Abstract Algebra, Linear Algebra, Number Theory. GRE Subject test preparation Spring 2016 Topic: Abstract Algebra, Linear Algebra, Number Theory. Linear Algebra Standard matrix manipulation to compute the kernel, intersection of subspaces, column spaces,

More information

Chapter 7. Canonical Forms. 7.1 Eigenvalues and Eigenvectors

Chapter 7. Canonical Forms. 7.1 Eigenvalues and Eigenvectors Chapter 7 Canonical Forms 7.1 Eigenvalues and Eigenvectors Definition 7.1.1. Let V be a vector space over the field F and let T be a linear operator on V. An eigenvalue of T is a scalar λ F such that there

More information

Alternative correction equations in the Jacobi-Davidson method

Alternative correction equations in the Jacobi-Davidson method Chapter 2 Alternative correction equations in the Jacobi-Davidson method Menno Genseberger and Gerard Sleijpen Abstract The correction equation in the Jacobi-Davidson method is effective in a subspace

More information

Balanced Truncation 1

Balanced Truncation 1 Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.242, Fall 2004: MODEL REDUCTION Balanced Truncation This lecture introduces balanced truncation for LTI

More information

Math 350 Fall 2011 Notes about inner product spaces. In this notes we state and prove some important properties of inner product spaces.

Math 350 Fall 2011 Notes about inner product spaces. In this notes we state and prove some important properties of inner product spaces. Math 350 Fall 2011 Notes about inner product spaces In this notes we state and prove some important properties of inner product spaces. First, recall the dot product on R n : if x, y R n, say x = (x 1,...,

More information

1. General Vector Spaces

1. General Vector Spaces 1.1. Vector space axioms. 1. General Vector Spaces Definition 1.1. Let V be a nonempty set of objects on which the operations of addition and scalar multiplication are defined. By addition we mean a rule

More information

MATH SOLUTIONS TO PRACTICE MIDTERM LECTURE 1, SUMMER Given vector spaces V and W, V W is the vector space given by

MATH SOLUTIONS TO PRACTICE MIDTERM LECTURE 1, SUMMER Given vector spaces V and W, V W is the vector space given by MATH 110 - SOLUTIONS TO PRACTICE MIDTERM LECTURE 1, SUMMER 2009 GSI: SANTIAGO CAÑEZ 1. Given vector spaces V and W, V W is the vector space given by V W = {(v, w) v V and w W }, with addition and scalar

More information

A PRIMER ON SESQUILINEAR FORMS

A PRIMER ON SESQUILINEAR FORMS A PRIMER ON SESQUILINEAR FORMS BRIAN OSSERMAN This is an alternative presentation of most of the material from 8., 8.2, 8.3, 8.4, 8.5 and 8.8 of Artin s book. Any terminology (such as sesquilinear form

More information

PCA with random noise. Van Ha Vu. Department of Mathematics Yale University

PCA with random noise. Van Ha Vu. Department of Mathematics Yale University PCA with random noise Van Ha Vu Department of Mathematics Yale University An important problem that appears in various areas of applied mathematics (in particular statistics, computer science and numerical

More information

Key words. conjugate gradients, normwise backward error, incremental norm estimation.

Key words. conjugate gradients, normwise backward error, incremental norm estimation. Proceedings of ALGORITMY 2016 pp. 323 332 ON ERROR ESTIMATION IN THE CONJUGATE GRADIENT METHOD: NORMWISE BACKWARD ERROR PETR TICHÝ Abstract. Using an idea of Duff and Vömel [BIT, 42 (2002), pp. 300 322

More information

5 Measure theory II. (or. lim. Prove the proposition. 5. For fixed F A and φ M define the restriction of φ on F by writing.

5 Measure theory II. (or. lim. Prove the proposition. 5. For fixed F A and φ M define the restriction of φ on F by writing. 5 Measure theory II 1. Charges (signed measures). Let (Ω, A) be a σ -algebra. A map φ: A R is called a charge, (or signed measure or σ -additive set function) if φ = φ(a j ) (5.1) A j for any disjoint

More information

MATHEMATICS 217 NOTES

MATHEMATICS 217 NOTES MATHEMATICS 27 NOTES PART I THE JORDAN CANONICAL FORM The characteristic polynomial of an n n matrix A is the polynomial χ A (λ) = det(λi A), a monic polynomial of degree n; a monic polynomial in the variable

More information

The Jordan canonical form

The Jordan canonical form The Jordan canonical form Francisco Javier Sayas University of Delaware November 22, 213 The contents of these notes have been translated and slightly modified from a previous version in Spanish. Part

More information

AN ELEMENTARY PROOF OF THE SPECTRAL RADIUS FORMULA FOR MATRICES

AN ELEMENTARY PROOF OF THE SPECTRAL RADIUS FORMULA FOR MATRICES AN ELEMENTARY PROOF OF THE SPECTRAL RADIUS FORMULA FOR MATRICES JOEL A. TROPP Abstract. We present an elementary proof that the spectral radius of a matrix A may be obtained using the formula ρ(a) lim

More information

1 Conjugate gradients

1 Conjugate gradients Notes for 2016-11-18 1 Conjugate gradients We now turn to the method of conjugate gradients (CG), perhaps the best known of the Krylov subspace solvers. The CG iteration can be characterized as the iteration

More information

Linear Algebra 1. M.T.Nair Department of Mathematics, IIT Madras. and in that case x is called an eigenvector of T corresponding to the eigenvalue λ.

Linear Algebra 1. M.T.Nair Department of Mathematics, IIT Madras. and in that case x is called an eigenvector of T corresponding to the eigenvalue λ. Linear Algebra 1 M.T.Nair Department of Mathematics, IIT Madras 1 Eigenvalues and Eigenvectors 1.1 Definition and Examples Definition 1.1. Let V be a vector space (over a field F) and T : V V be a linear

More information

MULTICENTRIC CALCULUS AND THE RIESZ PROJECTION

MULTICENTRIC CALCULUS AND THE RIESZ PROJECTION JOURNAL OF NUMERICAL ANALYSIS AND APPROXIMATION THEORY J. Numer. Anal. Approx. Theory, vol. 44 (2015) no. 2, pp. 127 145 ictp.acad.ro/jnaat MULTICENTRIC CALCULUS AND THE RIESZ PROJECTION DIANA APETREI

More information

A linear algebra proof of the fundamental theorem of algebra

A linear algebra proof of the fundamental theorem of algebra A linear algebra proof of the fundamental theorem of algebra Andrés E. Caicedo May 18, 2010 Abstract We present a recent proof due to Harm Derksen, that any linear operator in a complex finite dimensional

More information

The Cyclic Decomposition of a Nilpotent Operator

The Cyclic Decomposition of a Nilpotent Operator The Cyclic Decomposition of a Nilpotent Operator 1 Introduction. J.H. Shapiro Suppose T is a linear transformation on a vector space V. Recall Exercise #3 of Chapter 8 of our text, which we restate here

More information

A linear algebra proof of the fundamental theorem of algebra

A linear algebra proof of the fundamental theorem of algebra A linear algebra proof of the fundamental theorem of algebra Andrés E. Caicedo May 18, 2010 Abstract We present a recent proof due to Harm Derksen, that any linear operator in a complex finite dimensional

More information

What is A + B? What is A B? What is AB? What is BA? What is A 2? and B = QUESTION 2. What is the reduced row echelon matrix of A =

What is A + B? What is A B? What is AB? What is BA? What is A 2? and B = QUESTION 2. What is the reduced row echelon matrix of A = STUDENT S COMPANIONS IN BASIC MATH: THE ELEVENTH Matrix Reloaded by Block Buster Presumably you know the first part of matrix story, including its basic operations (addition and multiplication) and row

More information

A HARMONIC RESTARTED ARNOLDI ALGORITHM FOR CALCULATING EIGENVALUES AND DETERMINING MULTIPLICITY

A HARMONIC RESTARTED ARNOLDI ALGORITHM FOR CALCULATING EIGENVALUES AND DETERMINING MULTIPLICITY A HARMONIC RESTARTED ARNOLDI ALGORITHM FOR CALCULATING EIGENVALUES AND DETERMINING MULTIPLICITY RONALD B. MORGAN AND MIN ZENG Abstract. A restarted Arnoldi algorithm is given that computes eigenvalues

More information

Estimates for probabilities of independent events and infinite series

Estimates for probabilities of independent events and infinite series Estimates for probabilities of independent events and infinite series Jürgen Grahl and Shahar evo September 9, 06 arxiv:609.0894v [math.pr] 8 Sep 06 Abstract This paper deals with finite or infinite sequences

More information

The quadratic eigenvalue problem (QEP) is to find scalars λ and nonzero vectors u satisfying

The quadratic eigenvalue problem (QEP) is to find scalars λ and nonzero vectors u satisfying I.2 Quadratic Eigenvalue Problems 1 Introduction The quadratic eigenvalue problem QEP is to find scalars λ and nonzero vectors u satisfying where Qλx = 0, 1.1 Qλ = λ 2 M + λd + K, M, D and K are given

More information

4.8 Arnoldi Iteration, Krylov Subspaces and GMRES

4.8 Arnoldi Iteration, Krylov Subspaces and GMRES 48 Arnoldi Iteration, Krylov Subspaces and GMRES We start with the problem of using a similarity transformation to convert an n n matrix A to upper Hessenberg form H, ie, A = QHQ, (30) with an appropriate

More information

Numerical Methods I Eigenvalue Problems

Numerical Methods I Eigenvalue Problems Numerical Methods I Eigenvalue Problems Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 MATH-GA 2011.003 / CSCI-GA 2945.003, Fall 2014 October 2nd, 2014 A. Donev (Courant Institute) Lecture

More information

EIGENVALUE PROBLEMS. EIGENVALUE PROBLEMS p. 1/4

EIGENVALUE PROBLEMS. EIGENVALUE PROBLEMS p. 1/4 EIGENVALUE PROBLEMS EIGENVALUE PROBLEMS p. 1/4 EIGENVALUE PROBLEMS p. 2/4 Eigenvalues and eigenvectors Let A C n n. Suppose Ax = λx, x 0, then x is a (right) eigenvector of A, corresponding to the eigenvalue

More information

IMPORTANT DEFINITIONS AND THEOREMS REFERENCE SHEET

IMPORTANT DEFINITIONS AND THEOREMS REFERENCE SHEET IMPORTANT DEFINITIONS AND THEOREMS REFERENCE SHEET This is a (not quite comprehensive) list of definitions and theorems given in Math 1553. Pay particular attention to the ones in red. Study Tip For each

More information

Algorithms that use the Arnoldi Basis

Algorithms that use the Arnoldi Basis AMSC 600 /CMSC 760 Advanced Linear Numerical Analysis Fall 2007 Arnoldi Methods Dianne P. O Leary c 2006, 2007 Algorithms that use the Arnoldi Basis Reference: Chapter 6 of Saad The Arnoldi Basis How to

More information

Frame Diagonalization of Matrices

Frame Diagonalization of Matrices Frame Diagonalization of Matrices Fumiko Futamura Mathematics and Computer Science Department Southwestern University 00 E University Ave Georgetown, Texas 78626 U.S.A. Phone: + (52) 863-98 Fax: + (52)

More information

AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 23: GMRES and Other Krylov Subspace Methods; Preconditioning

AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 23: GMRES and Other Krylov Subspace Methods; Preconditioning AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 23: GMRES and Other Krylov Subspace Methods; Preconditioning Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao Numerical Analysis I 1 / 18 Outline

More information

Chapter 2 Spectra of Finite Graphs

Chapter 2 Spectra of Finite Graphs Chapter 2 Spectra of Finite Graphs 2.1 Characteristic Polynomials Let G = (V, E) be a finite graph on n = V vertices. Numbering the vertices, we write down its adjacency matrix in an explicit form of n

More information

Linear Algebra, Summer 2011, pt. 2

Linear Algebra, Summer 2011, pt. 2 Linear Algebra, Summer 2, pt. 2 June 8, 2 Contents Inverses. 2 Vector Spaces. 3 2. Examples of vector spaces..................... 3 2.2 The column space......................... 6 2.3 The null space...........................

More information

Math 102, Winter Final Exam Review. Chapter 1. Matrices and Gaussian Elimination

Math 102, Winter Final Exam Review. Chapter 1. Matrices and Gaussian Elimination Math 0, Winter 07 Final Exam Review Chapter. Matrices and Gaussian Elimination { x + x =,. Different forms of a system of linear equations. Example: The x + 4x = 4. [ ] [ ] [ ] vector form (or the column

More information

Study Guide for Linear Algebra Exam 2

Study Guide for Linear Algebra Exam 2 Study Guide for Linear Algebra Exam 2 Term Vector Space Definition A Vector Space is a nonempty set V of objects, on which are defined two operations, called addition and multiplication by scalars (real

More information

LINEAR ALGEBRA BOOT CAMP WEEK 4: THE SPECTRAL THEOREM

LINEAR ALGEBRA BOOT CAMP WEEK 4: THE SPECTRAL THEOREM LINEAR ALGEBRA BOOT CAMP WEEK 4: THE SPECTRAL THEOREM Unless otherwise stated, all vector spaces in this worksheet are finite dimensional and the scalar field F is R or C. Definition 1. A linear operator

More information

Generalized eigenspaces

Generalized eigenspaces Generalized eigenspaces November 30, 2012 Contents 1 Introduction 1 2 Polynomials 2 3 Calculating the characteristic polynomial 5 4 Projections 7 5 Generalized eigenvalues 10 6 Eigenpolynomials 15 1 Introduction

More information