A short course on: Preconditioned Krylov subspace methods. Yousef Saad University of Minnesota Dept. of Computer Science and Engineering


1 A short course on: Preconditioned Krylov subspace methods Yousef Saad University of Minnesota Dept. of Computer Science and Engineering Universite du Littoral, Jan 19-30, 2005

2 Outline Part 1 Introd., discretization of PDEs Sparse matrices and sparsity Basic iterative methods (Relaxation..) Part 2 Projection methods Krylov subspace methods Part 3 Preconditioned iterations Preconditioning techniques Parallel implementations Part 4 Eigenvalue problems Applications Calais February 7,

3 INTRODUCTION - MOTIVATION

4 Origins of Eigenvalue Problems Structural Engineering [Ku = λMu] Electronic structure calculations [Schrödinger equation..] Stability analysis [e.g., electrical networks, mechanical systems,..] Bifurcation analysis [e.g., in fluid flow] Large sparse eigenvalue problems are among the most demanding calculations (in terms of CPU time) in scientific computing.

5 New application in information technology Search engines (Google) rank web sites in order to improve searches The Google toolbar on some browsers gives a measure of relevance of a page. The problem can be formulated as a Markov chain Seek the dominant eigenvector Algorithm used: power method

6 The Problem We consider the eigenvalue problem Ax = λx or Ax = λBx Typically: B is symmetric (semi) positive definite, A is symmetric or nonsymmetric Requirements vary: Compute a few λ_i's with smallest or largest real parts; Compute all λ_i's in a certain region of C; Compute a few of the dominant eigenvalues; Compute all λ_i's.

7 Types of problems * Standard Hermitian (or symmetric real) Ax = λx, A^H = A * Standard non-Hermitian Ax = λx, A^H ≠ A * Generalized Ax = λBx Several distinct sub-cases (B SPD, B SSPD, B singular with large null space, both A and B singular, etc..) * Quadratic (A + λB + λ^2 C)x = 0 * Nonlinear A(λ)x = 0

8 EIGENVALUE PROBLEMS BASICS DENSE MATRIX CASE Background on eigenvalues/ eigenvectors/ Jordan form The Schur form Perturbation analysis, condition numbers.. Power method, subspace iteration algorithms The QR algorithm Practical QR algorithms: use of Hessenberg form and shifts The symmetric eigenvalue problem.

9 Basic definitions and properties A complex scalar λ is called an eigenvalue of a square matrix A if there exists a nonzero vector u in C^n such that Au = λu. The vector u is called an eigenvector of A associated with λ. The set of all eigenvalues of A is the spectrum of A. Notation: Λ(A). λ ∈ Λ(A) iff the columns of A − λI are linearly dependent... equivalent to saying that its rows are linearly dependent. So: there is a nonzero vector w such that w^H (A − λI) = 0; w^H is called a left eigenvector of A (u is a right eigenvector). λ ∈ Λ(A) iff det(A − λI) = 0

10 Basic definitions and properties (cont.) An eigenvalue is a root of the characteristic polynomial: p_A(λ) = det(A − λI) So there are n eigenvalues (counted with their multiplicities). The multiplicities of these eigenvalues as roots of p_A are called algebraic multiplicities. The geometric multiplicity of an eigenvalue λ_i is the number of linearly independent eigenvectors associated with λ_i.

11 Geometric multiplicity is ≤ algebraic multiplicity. An eigenvalue is simple if its (algebraic) multiplicity is one. It is semi-simple if its geometric and algebraic multiplicities are equal. Example: Consider A = What are the eigenvalues of A? their algebraic multiplicities? their geometric multiplicities? Is one a semi-simple eigenvalue? Same questions if a_33 is replaced by one.

12 Two matrices A and B are similar if there exists a nonsingular matrix X such that B = XAX^{-1} Definition: A is diagonalizable if it is similar to a diagonal matrix THEOREM: A matrix is diagonalizable iff it has n linearly independent eigenvectors THEOREM (Schur form): Any matrix is unitarily similar to a triangular matrix, i.e., for any A there exist a unitary matrix Q and an upper triangular matrix R such that A = QRQ^H Any Hermitian matrix is unitarily similar to a real diagonal matrix (i.e. its Schur form is real diagonal).

13 Special case: symmetric / Hermitian matrices Consider the Schur form of a real symmetric matrix A: A = QRQ^H Since A^H = A, then R = R^H, so the eigenvalues of A are real In addition, Q can be taken to be real when A is real: (A − λI)(u + iv) = 0 gives (A − λI)u = 0 and (A − λI)v = 0, so one can select eigenvectors to be real. There is an orthonormal basis of eigenvectors of A

14 The min-max theorem Label eigenvalues decreasingly: λ_1 ≥ λ_2 ≥ ··· ≥ λ_n The eigenvalues of a Hermitian matrix A are characterized by the relation λ_k = max_{S, dim(S)=k} min_{x∈S, x≠0} (Ax, x)/(x, x) Consequence: λ_1 = max_{x≠0} (Ax, x)/(x, x) and λ_n = min_{x≠0} (Ax, x)/(x, x)

15 The Law of Inertia A matrix A with m negative, z zero, and p positive eigenvalues has inertia [m, z, p]. Sylvester's Law of Inertia: If X is an n × n nonsingular matrix, then A and X^T AX have the same inertia. Example: Suppose that A = LDL^T where L is unit lower triangular and D diagonal. How many negative eigenvalues does A have? Example: Assume that A is tridiagonal. How many operations are required to determine the number of negative eigenvalues of A? Example: Devise an algorithm based on the inertia theorem to compute the i-th eigenvalue of a tridiagonal matrix.
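The tridiagonal examples above can be answered with Sylvester's law: the pivots of the LDL^T factorization of T − σI have the same inertia as T − σI, so counting negative pivots counts the eigenvalues below σ in O(n) operations, and bisection on σ then locates the i-th eigenvalue. A minimal Python/NumPy sketch (the helper name negcount is ours, not from the slides):

    import numpy as np

    def negcount(d, e, sigma=0.0):
        # Count eigenvalues of the symmetric tridiagonal T (diagonal d, off-diagonal e)
        # that are smaller than sigma, by counting negative pivots of the LDL^T
        # factorization of T - sigma*I (Sylvester's law of inertia).  O(n) work.
        count = 0
        q = d[0] - sigma
        if q < 0:
            count += 1
        for i in range(1, len(d)):
            # next pivot; if a pivot is exactly zero, perturb sigma slightly in practice
            q = (d[i] - sigma) - e[i - 1] ** 2 / q
            if q < 0:
                count += 1
        return count

Bisection on σ using this count is the standard way to compute the i-th eigenvalue of a tridiagonal matrix.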

16 Perturbation analysis General questions: If A is perturbed, how does an eigenvalue change? How about an eigenvector? Also: sensitivity of an eigenvalue to perturbations THEOREM [Gerschgorin] For every λ ∈ Λ(A) there exists i such that |λ − a_ii| ≤ Σ_{j≠i} |a_ij|. In words: An eigenvalue λ of A is located in one of the closed discs D(a_ii, ρ_i) with ρ_i = Σ_{j≠i} |a_ij|.

17 Gerschgorin's theorem - example Find a region of the complex plane where the eigenvalues of the following matrix are located: A = Refinement: if the disks are all disjoint then each of them contains exactly one eigenvalue Refinement: can combine the row and column versions of the theorem (the column version is obtained by applying the theorem to A^H).
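As a quick illustration of the theorem (not part of the original slides), a short Python/NumPy sketch that returns the row discs; applying it to A.conj().T gives the column version mentioned above:

    import numpy as np

    def gershgorin_discs(A):
        # Return (center, radius) of the Gershgorin row discs of a square matrix A.
        A = np.asarray(A)
        radii = np.abs(A).sum(axis=1) - np.abs(np.diag(A))   # rho_i = sum_{j != i} |a_ij|
        return list(zip(np.diag(A), radii))

    # Every eigenvalue of A lies in the union of these discs.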

18 Bauer-Fike theorem THEOREM [Bauer-Fike] Let λ̃, ũ be an approximate eigenpair with ‖ũ‖_2 = 1, and let r = Aũ − λ̃ũ ("residual vector"). Assume A is diagonalizable: A = XDX^{-1}, with D diagonal. Then there exists λ ∈ Λ(A) such that |λ − λ̃| ≤ cond_2(X) ‖r‖_2. Very restrictive result - also not too sharp in general. Alternative formulation: If E is a perturbation to A, then for any eigenvalue λ̃ of A + E there is an eigenvalue λ of A such that |λ − λ̃| ≤ cond_2(X) ‖E‖_2. Prove this result from the previous one.

19 Conditioning of Eigenvalues Assume that λ is a simple eigenvalue with right and left eigenvectors u and w^H respectively. Consider the matrices A(t) = A + tE, with eigenvalue λ(t) and eigenvector u(t). The conditioning of λ of A relative to E is the derivative dλ(t)/dt at t = 0. Write A(t)u(t) = λ(t)u(t), then multiply both sides to the left by w^H: w^H(A + tE)u(t) = λ(t) w^H u(t), so λ(t) w^H u(t) = w^H A u(t) + t w^H E u(t) = λ w^H u(t) + t w^H E u(t).

20 Hence, [(λ(t) − λ)/t] w^H u(t) = w^H E u(t). Take the limit as t → 0: λ'(0) = (w^H E u)/(w^H u) Note: the left and right eigenvectors associated with a simple eigenvalue cannot be orthogonal to each other. The actual conditioning of an eigenvalue, given a perturbation in the direction of E, is the modulus of the above quantity. In practice, one only has an estimate of ‖E‖ for some norm: |λ'(0)| ≤ ‖Eu‖_2 ‖w‖_2 / |(u, w)| ≤ ‖E‖_2 ‖u‖_2 ‖w‖_2 / |(u, w)|

21 Definition. The condition number of a simple eigenvalue λ of an arbitrary matrix A is defined by cond(λ) = 1 / cos θ(u, w) in which u and w^H are the right and left eigenvectors, respectively, associated with λ. Example: Consider the matrix A = Λ(A) = {1, 2, 3}. Right and left eigenvectors associated with λ_1 = 1: u = and w =

22 So: cond(λ_1) Perturbing a_11 to yields the spectrum: {0.2287, , }, as expected.. For Hermitian (and, more generally, normal) matrices every simple eigenvalue is well-conditioned, since cond(λ) = 1.

23 The power method The basic idea is to generate the sequence of vectors A^k v_0, where v_0 ≠ 0, then normalize. Most commonly used normalization: ensure that the largest component of the approximation is equal to one. ALGORITHM : 1 The Power Method 1. Choose a nonzero initial vector v^(0). 2. For k = 1, 2,..., until convergence, Do: 3. v^(k) = (1/α_k) Av^(k−1), where 4. α_k = argmax_{i=1,...,n} (Av^(k−1))_i 5. EndDo Here argmax_{i=1,..,n} x_i denotes the component x_i of largest modulus.
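A minimal NumPy sketch of Algorithm 1 (the stopping test on α is our own choice, not specified on the slide):

    import numpy as np

    def power_method(A, v0, tol=1e-10, maxit=500):
        # Power method with normalization by the component of largest modulus.
        v = v0.copy()
        alpha_old = 0.0
        for k in range(maxit):
            w = A @ v
            alpha = w[np.argmax(np.abs(w))]      # component of largest modulus
            v = w / alpha                        # largest entry of v is now 1
            if abs(alpha - alpha_old) <= tol * abs(alpha):
                break
            alpha_old = alpha
        return alpha, v                          # alpha -> dominant eigenvalue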

24 Convergence of the power method THEOREM Assume that there is one and only one eigenvalue λ_1 of A of largest modulus and that λ_1 is semi-simple. Then either the initial vector v_0 has no component in the invariant subspace associated with λ_1, or the sequence of vectors generated by the algorithm converges to an eigenvector associated with λ_1 and α_k converges to λ_1. Proof in the diagonalizable case. v_k is the vector A^k v_0 normalized by a certain scalar α̂_k in such a way that its largest component is 1. Decompose the initial vector v_0 as v_0 = Σ_{i=1}^n γ_i u_i where the u_i's are the eigenvectors associated with the λ_i's, i = 1,..., n.

25 Note that A^k u_i = λ_i^k u_i, so v_k = (1/scaling) Σ_{i=1}^n λ_i^k γ_i u_i = (1/scaling) [ λ_1^k γ_1 u_1 + Σ_{i=2}^n λ_i^k γ_i u_i ] = (1/scaling') [ u_1 + Σ_{i=2}^n (λ_i/λ_1)^k (γ_i/γ_1) u_i ] The second term inside the bracket converges to zero. QED The proof suggests that the convergence factor is given by ρ_D = |λ_2|/|λ_1|, where λ_2 is the second largest eigenvalue in modulus. Example: Consider a Markov chain matrix of size n = 55. The dominant eigenvalues are λ = 1 and λ = −1, so the power method applied directly to A fails. (Why?)

26 We can consider instead the matrix I + A The eigenvalue λ = 1 is then transformed into the (only) dominant eigenvalue λ = 2 [Table: Iteration | Norm of diff. | Res. norm | Eigenvalue]

27 The Shifted Power Method In the previous example we shifted A into B = A + I before applying the power method. We could also iterate with B(σ) = A + σI for any positive σ Example: With σ = 0.1 we get the following improvement. [Table: Iteration | Norm of diff. | Res. Norm | Eigenvalue]

28 Question: What is the best shift-of-origin σ to use? When all eigenvalues are real and such that λ_1 > λ_2 ≥ ··· ≥ λ_n, the value of σ which yields the best convergence factor is: σ_opt = (λ_2 + λ_n) / 2

29 Inverse Iteration Observation: The eigenvectors of A and A^{-1} are identical. Idea: use the power method on A^{-1}. This will compute the eigenvalues closest to zero. Shift-and-invert: use the power method on (A − σI)^{-1}. This will compute the eigenvalues closest to σ. Advantages: fast convergence in general. Drawbacks: need to factor A (or A − σI) into LU..
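A small sketch of shift-and-invert iteration (Python with SciPy's LU routines, real case; the Rayleigh-quotient convergence test is an assumption of ours):

    import numpy as np
    from scipy.linalg import lu_factor, lu_solve

    def shift_invert_iteration(A, sigma, v0, maxit=100, tol=1e-10):
        # Power method applied to (A - sigma*I)^{-1}: converges to the eigenvalue
        # of A closest to sigma.  A - sigma*I is factored once; each step costs
        # a pair of triangular solves.
        n = A.shape[0]
        lu = lu_factor(A - sigma * np.eye(n))
        v = v0 / np.linalg.norm(v0)
        lam = sigma
        for _ in range(maxit):
            w = lu_solve(lu, v)
            v = w / np.linalg.norm(w)
            lam = v @ (A @ v)                    # Rayleigh quotient estimate
            if np.linalg.norm(A @ v - lam * v) <= tol * abs(lam):
                break
        return lam, v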

30 Subspace iteration Generalizes the power method ALGORITHM : 2 Orthogonal iteration 1. Start: Q 0 = [q 1,..., q m ] 2. Iterate: Until convergence do, 3. X := AQ k 1 4. X = Q k R (QR factorization) 5. EndDo Normalization in step 4 is similar to the scaling used in the power method. Improvement: normalize only once in a while. Calais February 7,
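A compact NumPy sketch of Algorithm 2 (with a final Rayleigh–Ritz step added to extract eigenvalue estimates; that extra step is ours, not part of the algorithm as stated):

    import numpy as np

    def orthogonal_iteration(A, m, maxit=200, seed=0):
        # Orthogonal iteration: block power method with QR re-normalization.
        n = A.shape[0]
        rng = np.random.default_rng(seed)
        Q, _ = np.linalg.qr(rng.standard_normal((n, m)))
        for _ in range(maxit):
            X = A @ Q
            Q, _ = np.linalg.qr(X)                       # step 4: normalize the block
        ritz_values = np.linalg.eigvals(Q.T @ A @ Q)     # Rayleigh-Ritz on span(Q)
        return Q, ritz_values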

31 ALGORITHM : 3 Subspace Iteration with Projection Start: Choose Q_0 = [q_1,..., q_m] Iterate: For k = 1,..., until convergence do: Compute Ẑ = AQ_{k−1}. Ẑ = Z R_Z (QR factorization) B = Z^H AZ Compute the Schur factorization B = Y RY^H Q_k = ZY EndDo Again: no need to orthogonalize + project at each step. Assume |λ_1| ≥ |λ_2| ≥ ··· ≥ |λ_m| > |λ_{m+1}| ≥ ··· ≥ |λ_n|; then the convergence rate for λ_1 is (generally) |λ_{m+1}/λ_1|

32 The QR algorithm The most common method for solving small (dense) eigenvalue problems. The basic algorithm: ALGORITHM : 4 QR without shifts 1. Until Convergence Do: 2. Compute the QR factorization A = QR 3. Set A := RQ 4. EndDo Until Convergence means Until A becomes close enough to an upper triangular matrix Calais February 7,

33 Note: A_new = RQ = Q^H(QR)Q = Q^H A Q, so A_new is similar to A throughout the algorithm. The above basic algorithm is never used in practice. Two variations: (1) use shifts of origin and (2) transform A into Hessenberg form..
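A tiny NumPy sketch of the unshifted QR iteration of Algorithm 4 (the stopping test is our own; for real matrices with complex eigenvalues the limit is quasi-triangular, not triangular):

    import numpy as np

    def qr_iteration(A, maxit=500, tol=1e-12):
        # Unshifted QR iteration: A := R Q = Q^T A Q at each step (real case).
        Ak = np.array(A, dtype=float)
        for _ in range(maxit):
            Q, R = np.linalg.qr(Ak)
            Ak = R @ Q                                  # similar to A
            if np.linalg.norm(np.tril(Ak, -1)) <= tol * np.linalg.norm(Ak):
                break                                   # (nearly) upper triangular
        return np.diag(Ak), Ak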

34 Practical QR: Shifts of origin Observation (from theory): the last row converges fastest. Convergence is dictated by |λ_n / λ_{n−1}| We will now consider only the real symmetric case. Eigenvalues are real. A^(k) remains symmetric throughout the process. As k goes to infinity, the last column and row (except a^(k)_nn) converge to zero quickly, and a^(k)_nn converges to the lowest eigenvalue.

35 [Picture of A^(k): the trailing row and column are nearly zero except for the (n, n) entry a^(k)_nn] Idea: Apply the QR algorithm to A^(k) − µI with µ = a^(k)_nn. Note: the eigenvalues of A^(k) − µI are shifted by µ, and the eigenvectors are the same.

36 ALGORITHM : 5 QR with shifts 1. Until the row a_{n,i}, 1 ≤ i < n, converges to zero Do: 2. Obtain next shift (e.g. µ = a_nn) 3. A − µI = QR 4. Set A := RQ + µI 5. EndDo Convergence is cubic at the limit! [for the symmetric case]

37 Result of the algorithm: the last row of A^(k) converges to (0,..., 0, λ_n). Next step: deflate, i.e., apply the above algorithm to the leading (n−1) × (n−1) submatrix.

38 Practical QR: Use of the Hessenberg Form Recall: an upper Hessenberg matrix is such that a_ij = 0 for j < i − 1 Observation: The QR algorithm preserves Hessenberg form (tridiagonal form in the symmetric case). Results in substantial savings. 1st step: reduce A to Hessenberg form. Then (2nd step) apply the QR algorithm to the resulting matrix. It is easy to adapt the Householder factorization to reduce a matrix into Hessenberg form [similarity transformation] Consider the first step only on a 6 × 6 matrix.

39 We want H_1 A H_1^T = H_1 A H_1 to have zeros below the subdiagonal in its first column. Choose a w in H_1 = I − 2ww^T so that (H_1 A)[3 : n, 1] = 0 Apply to the left: B = H_1 A. Then apply to the right: A_1 = BH_1. Observation: the Householder matrix H_1, which transforms the entries A(2 : n, 1) into a multiple of e_1, works only on rows 2 to n. When applying H_1^T to the right of B = H_1 A, only columns 2 to n are altered, so the 1st column retains the same pattern (zeros below row 2)
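A short NumPy sketch of this reduction (a straightforward Householder-based implementation written by us, not the slides' own code):

    import numpy as np

    def hessenberg(A):
        # Reduce A to upper Hessenberg form by Householder similarity transformations.
        H = np.array(A, dtype=float)
        n = H.shape[0]
        for k in range(n - 2):
            x = H[k + 1:, k].copy()
            if np.linalg.norm(x[1:]) == 0:
                continue                           # column already reduced
            v = x.copy()
            v[0] += np.sign(x[0] if x[0] != 0 else 1.0) * np.linalg.norm(x)
            v /= np.linalg.norm(v)
            # apply P = I - 2 v v^T to rows k+1:n (left) and columns k+1:n (right)
            H[k + 1:, k:] -= 2.0 * np.outer(v, v @ H[k + 1:, k:])
            H[:, k + 1:] -= 2.0 * np.outer(H[:, k + 1:] @ v, v)
        return H                                   # similar to A; H[i, j] ~ 0 for i > j + 1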

40 QR for Hessenberg matrices Need the implicit Q theorem: Suppose that Q^T AQ is an unreduced upper Hessenberg matrix. Then columns 2 to n of Q are determined uniquely (up to signs) by the first column of Q. Implication: In order to compute A_{i+1} = Q_i^T A Q_i we can: Compute the first column of Q_i [easy: = scalar × A(:, 1)] Choose the other columns so that Q_i is unitary and A_{i+1} is Hessenberg.

41 Example: With n = 6 : A = Choose G 1 = G(1, 2, θ 1 ) so that (G 1 A) 21 = 0 A 1 = G T 1 AG 1 =

42 2. Choose G 2 = G(2, 3, θ 2 ) so that (G 2 A 1 ) 31 = 0 A 2 = G T 2 A 1G 2 = Choose G 3 = G(3, 4, θ 3 ) so that (G 3 A 2 ) 42 = 0 A 3 = G T 3 A 2G 3 =

43 4. Choose G 4 = G(4, 5, θ 4 ) so that (G 4 A 3 ) 53 = 0 A 4 = G T 4 A 3G 4 = Process known as Bulge chasing Similar idea for the symmetric (tridiagonal) case Calais February 7,

44 The QR algorithm for symmetric matrices Most important method used: reduce to tridiagonal form and apply the QR algorithm with shifts. The Householder transformation to Hessenberg form yields a tridiagonal matrix, because A_1 = HAH^T is symmetric and also of Hessenberg form, hence tridiagonal symmetric. The tridiagonal form is preserved by the QR similarity transformation

45 Practical method How to implement the QR algorithm with shifts? It is best to use Givens rotations can do a shifted QR step without explicitly shifting the matrix.. Two most popular shifts: s = a nn and s = smallest e.v. of A(n 1 : n, n 1 : n) Calais February 7,

46 THE SINGULAR VALUE DECOMPOSITION The SVD existence - properties. Pseudo-inverses and the SVD Use of SVD for least-squares problems Applications of the SVD

47 The Singular Value Decomposition (SVD) For any real n × m matrix A there exist orthogonal matrices U ∈ R^{n×n} and V ∈ R^{m×m} such that A = UΣV^T where Σ is a diagonal matrix with nonnegative diagonal entries σ_11 ≥ σ_22 ≥ ··· ≥ σ_pp ≥ 0, with p = min(m, n). The σ_ii are called singular values of A. Denoted simply by σ_i. Proof: [one among many!] Let σ_1 = ‖A‖_2 = max_{‖x‖_2=1} ‖Ax‖_2. There exists a pair of unit vectors v_1, u_1 such that Av_1 = σ_1 u_1

48 Complete v_1 into an orthonormal basis of R^m: V ≡ [v_1, V_2] is m × m unitary. Complete u_1 into an orthonormal basis of R^n: U ≡ [u_1, U_2] is n × n unitary. Then it is easy to show that U^T AV = [ σ_1 w^T ; 0 B ] ≡ A_1 Observe that ‖A_1 (σ_1; w)‖_2 ≥ σ_1^2 + ‖w‖_2^2 = (σ_1^2 + ‖w‖_2^2)^{1/2} ‖(σ_1; w)‖_2, so ‖A_1‖_2 ≥ (σ_1^2 + ‖w‖_2^2)^{1/2}. This shows that w must be zero [why?] Complete the proof by an induction argument.

49 [Diagram: the two shapes of the factorization A = UΣV^T — Case 1: n > m (tall A), Case 2: n < m (wide A)]

50 The thin SVD Consider Case 1. It can be rewritten as A = [U_1 U_2] [ Σ_1 ; 0 ] V^T which gives: A = U_1 Σ_1 V^T where U_1 is n × m (same shape as A), and Σ_1 and V are m × m. This is referred to as the thin SVD. Important in practice. Show how to obtain the thin SVD from the QR factorization of A and the SVD of an m × m matrix
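One possible answer to that exercise, as a short NumPy sketch (assuming A is tall, n ≥ m):

    import numpy as np

    def thin_svd_via_qr(A):
        # Thin SVD of a tall n x m matrix from a QR factorization of A
        # followed by the SVD of the small m x m factor R:
        #   A = Q R,  R = Ur S V^T  ==>  A = (Q Ur) S V^T
        Q, R = np.linalg.qr(A)               # Q: n x m,  R: m x m
        Ur, S, Vt = np.linalg.svd(R)         # small SVD
        return Q @ Ur, S, Vt                 # U_1 (n x m), singular values, V^T

    # Check: U1, S, Vt = thin_svd_via_qr(A);  (U1 * S) @ Vt reproduces A.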

51 Some properties. Assume that σ_1 ≥ σ_2 ≥ ··· ≥ σ_r > 0 and σ_{r+1} = ··· = σ_p = 0. Then: rank(A) = r = number of nonzero singular values. Ran(A) = span{u_1, u_2,..., u_r} Null(A) = span{v_{r+1}, v_{r+2},..., v_m} The matrix A admits the SVD expansion: A = Σ_{i=1}^r σ_i u_i v_i^T

52 Properties of the SVD (continued) ‖A‖_2 = σ_1 = largest singular value ‖A‖_F = (Σ_{i=1}^r σ_i^2)^{1/2} When A is an n × n nonsingular matrix, then ‖A^{-1}‖_2 = 1/σ_n = inverse of the smallest singular value Let k < r and A_k = Σ_{i=1}^k σ_i u_i v_i^T; then min_{rank(B)=k} ‖A − B‖_2 = ‖A − A_k‖_2 = σ_{k+1}
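The last property is the basis of low-rank approximation; a two-line NumPy sketch of the optimal rank-k approximant A_k:

    import numpy as np

    def best_rank_k(A, k):
        # Truncated SVD: A_k = sum_{i<=k} sigma_i u_i v_i^T, the closest rank-k
        # matrix to A in the 2-norm, with error sigma_{k+1}.
        U, S, Vt = np.linalg.svd(A, full_matrices=False)
        return (U[:, :k] * S[:k]) @ Vt[:k, :]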

53 Define the r × r matrix Σ_1 = diag(σ_1,..., σ_r) Let A ∈ R^{n×m} and consider now A^T A (which is of size m × m): A^T A = V (Σ^T Σ) V^T, an m × m matrix. This gives the spectral decomposition of A^T A. Similarly, U gives the eigenvectors of AA^T: AA^T = U (Σ Σ^T) U^T, an n × n matrix. Important: A^T A = V D_1 V^T and AA^T = U D_2 U^T give the SVD factors V and U up to signs!

54 Compute the singular value decomposition of the matrix: A = Find the matrix B of rank 1 which is closest to the above matrix in the 2-norm sense. What is the pseudo-inverse of A? What is the pseudo-inverse of B? Find the vector x of smallest norm which minimizes ‖b − Ax‖_2 with b = (1, 1)^T Find the vector x of smallest norm which minimizes ‖b − Bx‖_2 with b = (1, 1)^T

55 Pseudo-inverse of an arbitrary matrix The pseudo-inverse of A is given by A^+ = V Σ^+ U^T Moore-Penrose conditions: The pseudo-inverse of a matrix is uniquely determined by these four conditions: (1) AXA = A (2) XAX = X (3) (AX)^H = AX (4) (XA)^H = XA In the full-rank overdetermined case, A^+ = (A^T A)^{-1} A^T
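A small NumPy sketch of A^+ = V Σ^+ U^T (the relative threshold rtol used to decide which singular values count as nonzero is an assumption of ours):

    import numpy as np

    def pinv_svd(A, rtol=1e-12):
        # Pseudo-inverse via the SVD: invert the singular values above the
        # threshold, zero out the rest, and reassemble V Sigma^+ U^T.
        U, S, Vt = np.linalg.svd(A, full_matrices=False)
        Sinv = np.array([1.0 / s if s > rtol * S[0] else 0.0 for s in S])
        return (Vt.T * Sinv) @ U.T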

56 Least-squares problems and the SVD The SVD can give much information about solving overdetermined and underdetermined linear systems. Let A be an n × m matrix and A = UΣV^T its SVD, with r = rank(A), V = [v_1,..., v_m], U = [u_1,..., u_n]. Then x_LS = Σ_{i=1}^r (u_i^T b / σ_i) v_i minimizes ‖b − Ax‖_2 and has the smallest 2-norm among all possible minimizers. In addition, ρ_LS ≡ ‖b − A x_LS‖_2 = ‖z‖_2 with z = [u_{r+1},..., u_n]^T b

57 Least-squares problems and pseudo-inverses A restatement of the first part of the previous result: Consider the general linear least-squares problem min_{x∈S} ‖x‖_2, S = {x ∈ R^m | ‖b − Ax‖_2 min}. This problem always has a unique solution, given by x = A^+ b

58 Ill-conditioned systems and the SVD Let A be n × n (a square matrix) and A = UΣV^T its SVD. The solution of Ax = b is x = A^{-1}b = Σ_{i=1}^n (u_i^T b / σ_i) v_i When A is very ill-conditioned, it may have many small singular values. The division by these small σ_i's will amplify any noise in the data. Result: the solution may be meaningless. Remedy: use regularization, i.e., truncate the SVD by only keeping the σ_i's that are larger than a threshold τ. This gives the truncated SVD solution (SVD regularization): x_TSVD = Σ_{σ_i ≥ τ} (u_i^T b / σ_i) v_i Many applications [e.g., image processing,..]
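A short NumPy sketch of the truncated-SVD solution defined above:

    import numpy as np

    def tsvd_solve(A, b, tau):
        # Regularized solution: keep only the terms with sigma_i >= tau.
        U, S, Vt = np.linalg.svd(A, full_matrices=False)
        keep = S >= tau
        coeffs = (U[:, keep].T @ b) / S[keep]     # u_i^T b / sigma_i
        return Vt[keep, :].T @ coeffs             # sum of coeffs * v_i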

59 Numerical rank and the SVD Assume that the original matrix A is exactly of rank k. The computed SVD of A will be the SVD of a nearby matrix A + E. It is easy to show that |σ̂_i − σ_i| ≤ α σ_1 eps Result: zero singular values will yield small computed singular values Determining the numerical rank: treat singular values below a certain threshold δ as zero. Practical problem: need to set δ.

60 LARGE SPARSE EIGENVALUE PROBLEMS

61 General Tools for Solving Large Eigen-Problems Projection techniques: Arnoldi, Lanczos, Subspace Iteration; Preconditionings: shift-and-invert, polynomials,... Deflation and restarting techniques Good computational codes combine these three ingredients

62 A few popular solution methods Subspace Iteration [now less popular, sometimes used for validation] Arnoldi's method (or Lanczos) with polynomial acceleration [Stiefel 58, Rutishauser 62, YS 84, 85, Sorensen 89,...] Shift-and-invert and other preconditioners. [Use Arnoldi or Lanczos for (A − σI)^{-1}.] Davidson's method and variants, Generalized Davidson's method [Morgan and Scott, 89], Jacobi-Davidson Emerging method: Automatic Multilevel Substructuring (AMLS).

63 Projection Methods for Eigenvalue Problems General formulation: Projection method onto K orthogonal to L Given: Two subspaces K and L of the same dimension. Find: λ̃, ũ such that λ̃ ∈ C, ũ ∈ K; (λ̃I − A)ũ ⊥ L Two types of methods: Orthogonal projection methods: situation when L = K. Oblique projection methods: when L ≠ K.

64 Rayleigh-Ritz projection Given: a subspace X known to contain good approximations to eigenvectors of A. Question: How to extract good approximations to eigenvalues/eigenvectors from this subspace? Answer: the Rayleigh-Ritz process. Let Q = [q_1,..., q_m] be an orthonormal basis of X. Then write an approximation in the form ũ = Qy and obtain y by writing Q^H(A − λ̃I)ũ = 0, i.e., Q^H A Q y = λ̃ y

65 Procedure: 1. Obtain an orthonormal basis of X 2. Compute C = Q^H AQ (an m × m matrix) 3. Obtain the Schur factorization of C: C = Y RY^H 4. Compute Ũ = QY Property: if X is (exactly) invariant, then the procedure yields exact eigenvalues and eigenvectors. Proof: Since X is invariant, (A − λ̃I)ũ = Qz for a certain z. The condition Q^H(A − λ̃I)ũ = 0 gives Q^H Qz = 0, which implies z = 0 and therefore (A − λ̃I)ũ = 0. Can use this procedure in conjunction with the subspace obtained from the subspace iteration algorithm

66 Subspace Iteration Original idea: projection technique onto a subspace of the form Y = A^k X In practice: Replace A^k by a suitable polynomial [Chebyshev] Advantages: Easy to implement (in the symmetric case); Easy to analyze; Disadvantage: Slow. Often used with polynomial acceleration: A^k X replaced by C_k(A)X. Typically C_k = Chebyshev polynomial.

67 Algorithm: Subspace Iteration with Projection 1. Start: Choose an initial system of vectors X = [x_0,..., x_m] and an initial polynomial C_k. 2. Iterate: Until convergence do: (a) Compute Ẑ = C_k(A)X_old. (b) Orthonormalize Ẑ into Z. (c) Compute B = Z^H AZ and use the QR algorithm to compute the Schur vectors Y = [y_1,..., y_m] of B. (d) Compute X_new = ZY. (e) Test for convergence. If satisfied, stop. Else select a new polynomial C_k and continue.

68 THEOREM: Let S_0 = span{x_1, x_2,..., x_m} and assume that S_0 is such that the vectors {Px_i}_{i=1,...,m} are linearly independent, where P is the spectral projector associated with λ_1,..., λ_m. Let P_k be the orthogonal projector onto the subspace S_k = span{X_k}. Then for each eigenvector u_i of A, i = 1,..., m, there exists a unique vector s_i in the subspace S_0 such that Ps_i = u_i. Moreover, the following inequality is satisfied: ‖(I − P_k)u_i‖_2 ≤ ‖u_i − s_i‖_2 [ |λ_{m+1}/λ_i| + ε_k ]^k, (1) where ε_k tends to zero as k tends to infinity.

69 KRYLOV SUBSPACE METHODS

70 KRYLOV SUBSPACE METHODS Principle: Projection methods on Krylov subspaces, i.e., on K_m(A, v_1) = span{v_1, Av_1,..., A^{m−1} v_1} probably the most important class of projection methods [for linear systems and for eigenvalue problems] many variants exist depending on the subspace L. Properties of K_m. Let µ = deg. of the minimal polynomial of v_1. Then: K_m = {p(A)v_1 | p = polynomial of degree ≤ m − 1} K_m = K_µ for all m ≥ µ. Moreover, K_µ is invariant under A. dim(K_m) = m iff µ ≥ m.

71 ARNOLDI'S ALGORITHM Goal: to compute an orthogonal basis of K_m. Input: Initial vector v_1, with ‖v_1‖_2 = 1, and m.
ALGORITHM : 6 Arnoldi's procedure
For j = 1,..., m do
  Compute w := Av_j
  For i = 1,..., j, do
    h_{i,j} := (w, v_i)
    w := w − h_{i,j} v_i
  End
  h_{j+1,j} = ‖w‖_2 ; v_{j+1} = w / h_{j+1,j}
End

72 Result of Arnoldi's algorithm Let H̄_m denote the (m+1) × m upper Hessenberg matrix of the coefficients h_{i,j}, and H_m the m × m matrix obtained from H̄_m by deleting its last row. Then: 1. V_m = [v_1, v_2,..., v_m] is an orthonormal basis of K_m. 2. AV_m = V_{m+1} H̄_m = V_m H_m + h_{m+1,m} v_{m+1} e_m^T 3. V_m^T A V_m = H_m = H̄_m minus its last row.
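A compact NumPy sketch of Algorithm 6 (modified Gram–Schmidt form, real case) that returns V_{m+1} and H̄_m satisfying the relations above:

    import numpy as np

    def arnoldi(A, v1, m):
        # Build an orthonormal basis of K_m(A, v1) and the (m+1) x m Hessenberg
        # matrix Hbar such that A V_m = V_{m+1} Hbar.
        n = len(v1)
        V = np.zeros((n, m + 1))
        H = np.zeros((m + 1, m))
        V[:, 0] = v1 / np.linalg.norm(v1)
        for j in range(m):
            w = A @ V[:, j]
            for i in range(j + 1):
                H[i, j] = w @ V[:, i]
                w = w - H[i, j] * V[:, i]
            H[j + 1, j] = np.linalg.norm(w)
            if H[j + 1, j] == 0.0:                 # invariant subspace found
                return V[:, :j + 1], H[:j + 1, :j]
            V[:, j + 1] = w / H[j + 1, j]
        return V, H

The eigenvalues of H_m = H[:m, :m] are the Ritz values used on the next slide.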

73 Application to eigenvalue problems Write the approximate eigenvector as ũ = V_m y. Galerkin condition: (A − λ̃I)V_m y ⊥ K_m, i.e., V_m^H (A − λ̃I)V_m y = 0 The approximate eigenvalues are the eigenvalues of H_m: H_m y_j = λ̃_j y_j The associated approximate eigenvectors are ũ_j = V_m y_j Typically a few of the outermost eigenvalues will converge first.

74 Restarted Arnoldi In practice: the memory requirement of the algorithm implies that restarting is necessary ALGORITHM : 7 Restarted Arnoldi (computes rightmost eigenpair) 1. Start: Choose an initial vector v_1 and a dimension m. 2. Iterate: Perform m steps of Arnoldi's algorithm. 3. Restart: Compute the approximate eigenvector u_1^(m) associated with the rightmost eigenvalue λ_1^(m). 4. If satisfied stop, else set v_1 := u_1^(m) and go to 2.

75 Example: Small Markov Chain matrix [Mark(10), dimension = 55]. Restarted Arnoldi procedure for computing the eigenvector associated with the eigenvalue with algebraically largest real part. We use m = 10. [Table: m | Re(λ) | Im(λ) | Res. Norm]

76 Restarted Arnoldi (cont.) Can be generalized to more than *one* eigenvector: v_1^(new) = Σ_{i=1}^p ρ_i u_i^(m) However: often does not work well (hard to find good coefficients ρ_i) Alternative: compute eigenvectors (actually Schur vectors) one at a time. Implicit deflation.

77 Hermitian case: The Lanczos Algorithm The Hessenberg matrix becomes tridiagonal: A = A^H and V_m^H A V_m = H_m imply H_m = H_m^H. We can write H_m as the tridiagonal matrix with diagonal entries α_1,..., α_m and off-diagonal entries β_2,..., β_m. Consequence: the three-term recurrence β_{j+1} v_{j+1} = Av_j − α_j v_j − β_j v_{j−1} (2)

78 ALGORITHM : 8 Lanczos 1. Choose an initial vector v_1 of unit norm. Set β_1 ≡ 0, v_0 ≡ 0 2. For j = 1, 2,..., m Do: 3. w_j := Av_j − β_j v_{j−1} 4. α_j := (w_j, v_j) 5. w_j := w_j − α_j v_j 6. β_{j+1} := ‖w_j‖_2. If β_{j+1} = 0 then Stop 7. v_{j+1} := w_j / β_{j+1} 8. EndDo Hermitian matrix + Arnoldi = Hermitian Lanczos In theory the v_i's defined by the 3-term recurrence are orthogonal. However: in practice there is severe loss of orthogonality;
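A minimal NumPy sketch of Algorithm 8 for real symmetric A, with no reorthogonalization (so it inherits the loss-of-orthogonality issue discussed next); it returns the entries of the tridiagonal T_m:

    import numpy as np

    def lanczos(A, v1, m):
        # Symmetric Lanczos: returns the diagonal (alphas) and off-diagonal (betas)
        # of the tridiagonal matrix T_m = V_m^T A V_m.
        alphas, betas = [], []
        v_prev = np.zeros(len(v1))
        v = v1 / np.linalg.norm(v1)
        beta = 0.0
        for j in range(m):
            w = A @ v - beta * v_prev        # w_j := A v_j - beta_j v_{j-1}
            a = w @ v                        # alpha_j
            w = w - a * v
            alphas.append(a)
            beta = np.linalg.norm(w)         # beta_{j+1}
            if beta == 0.0:                  # invariant subspace: stop early
                break
            if j < m - 1:
                betas.append(beta)
            v_prev, v = v, w / beta
        return np.array(alphas), np.array(betas)

The extreme eigenvalues of T_m (e.g. via scipy.linalg.eigh_tridiagonal) approximate the extreme eigenvalues of A.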

79 Observation [Paige, 1981]: Loss of orthogonality starts suddenly, when the first eigenpair has converged. It is a sign of loss of linear independence of the computed eigenvectors. When orthogonality is lost, several copies of the same eigenvalue start appearing.

80 Reorthogonalization Full reorthogonalization reorthogonalize v j+1 against all previous v i s every time. Partial reorthogonalization reorthogonalize v j+1 against all previous v i s only when needed [Parlett & Simon] Selective reorthogonalization reorthogonalize v j+1 against computed eigenvectors [Parlett & Scott] No reorthogonalization Do not reorthogonalize - but take measures to deal with spurious eigenvalues. [Cullum & Willoughby] Calais February 7,

81 LANCZOS BIORTHOGONALIZATION

82 The Lanczos biorthogonalization (A^H ≠ A) ALGORITHM : 9 The Lanczos Bi-Orthogonalization Procedure 1. Choose v_1, w_1 such that (v_1, w_1) = 1. Set β_1 = δ_1 ≡ 0, w_0 = v_0 ≡ 0 2. For j = 1, 2,..., m Do: 3. α_j = (Av_j, w_j) [ = (Av_j − β_j v_{j−1}, w_j) ] 4. v̂_{j+1} = Av_j − α_j v_j − β_j v_{j−1} [ = (Av_j − β_j v_{j−1}) − α_j v_j ] 5. ŵ_{j+1} = A^H w_j − ᾱ_j w_j − δ_j w_{j−1} [ = (A^H w_j − δ_j w_{j−1}) − ᾱ_j w_j ] 6. δ_{j+1} = |(v̂_{j+1}, ŵ_{j+1})|^{1/2}. If δ_{j+1} = 0 Stop 7. β_{j+1} = (v̂_{j+1}, ŵ_{j+1}) / δ_{j+1} 8. w_{j+1} = ŵ_{j+1} / β̄_{j+1} 9. v_{j+1} = v̂_{j+1} / δ_{j+1} 10. EndDo

83 Builds a pair of biorthogonal bases for the two subspaces K_m(A, v_1) and K_m(A^H, w_1) Many choices for δ_{j+1}, β_{j+1} in lines 7 and 8. Only constraint: δ_{j+1} β_{j+1} = (v̂_{j+1}, ŵ_{j+1}) Let T_m be the tridiagonal matrix with diagonal entries α_1,..., α_m, superdiagonal entries β_2,..., β_m and subdiagonal entries δ_2,..., δ_m. Then v_i ∈ K_m(A, v_1) and w_j ∈ K_m(A^H, w_1).

84 If the algorithm does not break down before step m, then the vectors v_i, i = 1,..., m, and w_j, j = 1,..., m, are biorthogonal, i.e., (v_j, w_i) = δ_ij for 1 ≤ i, j ≤ m. Moreover, {v_i}_{i=1,2,...,m} is a basis of K_m(A, v_1), {w_i}_{i=1,2,...,m} is a basis of K_m(A^H, w_1), and AV_m = V_m T_m + δ_{m+1} v_{m+1} e_m^H, A^H W_m = W_m T_m^H + β̄_{m+1} w_{m+1} e_m^H, W_m^H A V_m = T_m.

85 If θ_j is an eigenvalue of T_m, with associated right and left eigenvectors y_j and z_j respectively, then the corresponding approximations for A are: Ritz value θ_j, right Ritz vector V_m y_j, left Ritz vector W_m z_j [Note: the terminology is abused slightly - Ritz values and vectors normally refer to the Hermitian case.]

86 Advantages and disadvantages Advantages: Nice three-term recurrence - requires little storage in theory. Computes left and right eigenvectors at the same time Disadvantages: Algorithm can break down or nearly break down. Convergence not too well understood. Erratic behavior. Not easy to take advantage of the tridiagonal form of T_m.

87 Look-ahead Lanczos The algorithm breaks down when (v̂_{j+1}, ŵ_{j+1}) = 0. Three distinct situations: "lucky breakdown" when either v̂_{j+1} or ŵ_{j+1} is zero. In this case, the eigenvalues of T_m are eigenvalues of A. (v̂_{j+1}, ŵ_{j+1}) = 0 but v̂_{j+1} ≠ 0, ŵ_{j+1} ≠ 0: "serious breakdown". Often possible to bypass the step (+ a few more) and continue the algorithm. If this is not possible then we get an "incurable breakdown". [very rare]

88 Look-ahead Lanczos algorithms deal with the second case. See Parlett 80, Freund and Nachtigal Main idea: when a breakdown occurs, skip the computation of v_{j+1}, w_{j+1} and define v_{j+2}, w_{j+2} from v_j, w_j. For example by orthogonalizing A^2 v_j... Can define v_{j+1} somewhat arbitrarily as v_{j+1} = Av_j. Similarly for w_{j+1}. Drawbacks: (1) the projected problem is no longer tridiagonal (2) it is difficult to know what constitutes a near-breakdown.

89 DEFLATION

90 Deflation Very useful in practice. Different forms: locking (subspace iteration), selective orthogonalization (Lanczos), Schur deflation,...

91 A little background Consider the Schur canonical form A = URU^H where U is unitary and R is a (complex) upper triangular matrix. The column vectors u_1,..., u_n are called Schur vectors. Note: Schur vectors depend on each other, and on the order of the eigenvalues

92 Wielandt Deflation: Assume we have computed a right eigenpair λ_1, u_1. Wielandt deflation considers the eigenvalues of A_1 = A − σ u_1 v^H Note: Λ(A_1) = {λ_1 − σ, λ_2,..., λ_n} Wielandt deflation preserves u_1 as an eigenvector, as well as all the left eigenvectors not associated with λ_1. An interesting choice for v is to take simply v = u_1. In this case Wielandt deflation preserves Schur vectors as well.

93 It is possible to apply this procedure successively: ALGORITHM : 10 Explicit Deflation 1. A_0 = A 2. For j = 0... µ − 1 Do: 3. Compute a dominant eigenvector u_j of A_j 4. Define A_{j+1} = A_j − σ_j u_j u_j^H 5. End The computed u_1, u_2,... form a set of Schur vectors for A. Alternative: implicit deflation (within a procedure such as Arnoldi).

94 Deflated Arnoldi: When the first eigenvector converges, we freeze it as the first vector of V_m = [v_1, v_2,..., v_m]. Arnoldi starts working at column v_2. Orthogonalization is still done against v_1,..., v_j at step j. Each new converged eigenvector will be added to the "locked" set of eigenvectors.

95 For k = 1,..., NEV do: /* Eigenvalue loop */ 1. For j = k, k + 1,..., m do: /* Arnoldi loop */ Compute w := Av_j. Orthonormalize w against v_1, v_2,..., v_j → v_{j+1} 2. Compute the next approximate eigenpair λ̃, ũ. 3. Orthonormalize ũ against v_1,..., v_j. Result: s = approximate Schur vector. 4. Define v_k := s. 5. If the approximation is not satisfactory, go to 1. 6. Else define h_{i,k} = (Av_k, v_i), i = 1,.., k.

96 Thus, for k = 2: V_m = [ v_1, v_2 | v_3,..., v_m ], with v_1, v_2 locked and v_3,..., v_m active; H_m has the corresponding structure. Similar techniques in Subspace Iteration [G. Stewart's SRRIT]

97 Example: Matrix Mark(10) - small Markov chain matrix (N = 55). First eigenpair by iterative Arnoldi with m = 10. [Table: m | Re(λ) | Im(λ) | Res. Norm]

98 Computing the next 2 eigenvalues of Mark(10). [Table: Eig. | Mat-Vec's | Re(λ) | Im(λ) | Res. Norm]

99 PRECONDITIONING - DAVIDSON S METHOD

100 Preconditioning eigenvalue problems Goal: To extract good approximations to add to a subspace in a projection process. Result: faster convergence. Best known technique: shift-and-invert; work with B = (A − σI)^{-1} Some success with polynomial preconditioning [Chebyshev iteration / least-squares polynomials]. Work with B = p(A) The above preconditioners preserve eigenvectors. Other methods (Davidson) use a more general preconditioner M.

101 Shift-and-invert preconditioning Main idea: use Arnoldi, or Lanczos, or subspace iteration for the matrix B = (A − σI)^{-1}. The matrix B need not be computed explicitly. Each time we need to apply B to a vector, we solve a system with A − σI. Factor A − σI = LU. Then each application of B, i.e., each solve (A − σI)x = y, requires solving Lz = y and Ux = z.

102 How to deal with complex shifts? If A is complex, we need to work in complex arithmetic. If A is real, it is desirable that the Arnoldi/Lanczos algorithms work with a real matrix. Idea: Instead of using B = (A − σI)^{-1}, use B_+ = Re[(A − σI)^{-1}] = (1/2)[ (A − σI)^{-1} + (A − σ̄I)^{-1} ] or B_− = Im[(A − σI)^{-1}] = (1/2i)[ (A − σI)^{-1} − (A − σ̄I)^{-1} ] Little difference between the two. Result: B_− = θ (A − σI)^{-1}(A − σ̄I)^{-1} with θ = Im(σ).

103 Preconditioning by polynomials Main idea: Iterate with p(a) instead of A in Arnoldi or Lanczos,.. Used very early on in subspace iteration [Rutishauser, 1959.] Usually not as reliable as Shift-and-invert techniques but less demanding in terms of storage. Calais February 7,

104 Question: How to find a good polynomial (dynamically)? Approaches: 1 Use of Chebyshev polynomials over ellipses 2 Use polynomials based on Leja points 3 Least-squares polynomials over polygons 4 Polynomials from previous Arnoldi decompositions

105 Polynomial filters and implicit restart Goal: to apply a polynomial filter of the form p(t) = (t − θ_1)(t − θ_2)···(t − θ_q) by exploiting the Arnoldi procedure. Assume AV_m = V_m H_m + β_m v_{m+1} e_m^T and consider the first factor (t − θ_1): (A − θ_1 I)V_m = V_m(H_m − θ_1 I) + β_m v_{m+1} e_m^T Let H_m − θ_1 I = Q_1 R_1. Then: (A − θ_1 I)V_m = V_m Q_1 R_1 + β_m v_{m+1} e_m^T (A − θ_1 I)(V_m Q_1) = (V_m Q_1)R_1 Q_1 + β_m v_{m+1} e_m^T Q_1 A(V_m Q_1) = (V_m Q_1)(R_1 Q_1 + θ_1 I) + β_m v_{m+1} e_m^T Q_1


107 Notation: R_1 Q_1 + θ_1 I ≡ H_m^(1); V_m Q_1 ≡ V_m^(1); (b_{m+1}^(1))^T ≡ e_m^T Q_1. Then: A V_m^(1) = V_m^(1) H_m^(1) + v_{m+1} (b_{m+1}^(1))^T Note that H_m^(1) is upper Hessenberg. Similar to an Arnoldi decomposition. Observe: R_1 Q_1 + θ_1 I is the matrix resulting from one step of the QR algorithm with shift θ_1 applied to H_m. The first column of V_m^(1) is a multiple of (A − θ_1 I)v_1. The columns of V_m^(1) are orthonormal.

108 Can now apply the second shift in the same way: (A − θ_2 I)V_m^(1) = V_m^(1)(H_m^(1) − θ_2 I) + v_{m+1} (b_{m+1}^(1))^T Similar process: factor H_m^(1) − θ_2 I = Q_2 R_2, then multiply by Q_2 to the right: (A − θ_2 I)V_m^(1) Q_2 = (V_m^(1) Q_2)(R_2 Q_2) + v_{m+1} (b_{m+1}^(1))^T Q_2 Now: A V_m^(2) = V_m^(2) H_m^(2) + v_{m+1}(b_{m+1}^(2))^T The first column of V_m^(2) = scalar × (A − θ_2 I)v_1^(1) = scalar × (A − θ_2 I)(A − θ_1 I)v_1

109 Note that (b_{m+1}^(2))^T = e_m^T Q_1 Q_2 = [0, 0,···, 0, q_1, q_2, q_3] Let V̂_{m−2} = [v̂_1,..., v̂_{m−2}] consist of the first m − 2 columns of V_m^(2) and Ĥ_{m−2} = the leading principal submatrix of H_m^(2). Then A V̂_{m−2} = V̂_{m−2} Ĥ_{m−2} + β̂_{m−1} v̂_{m−1} e_{m−2}^T with β̂_{m−1} v̂_{m−1} ≡ q_1 v_{m+1} + h_{m−1,m−2}^(2) v_{m−1}^(2), ‖v̂_{m−1}‖_2 = 1 Result: An Arnoldi process of m − 2 steps with the initial vector p(A)v_1. In other words: We know how to apply polynomial filtering via a form of the Arnoldi process, combined with the QR algorithm.

110 The Davidson approach Goal: to use a more general preconditioner to introduce good new components to the subspace. The ideal new vector would be the eigenvector itself! Next best thing: an approximation to (A − µI)^{-1} r where r = (A − µI)z is the current residual. The approximation is written in the form M^{-1} r. Note that M can vary at every step if needed.

111 ALGORITHM : 11 Davidson's method (Real symmetric case) 1. Choose an initial unit vector v_1. Set V_1 = [v_1]. 2. Until convergence Do: 3. For j = 1,..., m Do: 4. w := Av_j. 5. Update H_j ≡ V_j^T A V_j 6. Compute the smallest eigenpair µ, y of H_j. 7. z := V_j y, r := Az − µz 8. Test for convergence. If satisfied, Return 9. If j < m compute t := M_j^{-1} r 10. compute V_{j+1} := ORTHN([V_j, t]) 11. EndIf 12. EndDo 13. Set v_1 := z and go to 3 14. EndDo
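A rough NumPy sketch of Algorithm 11, using the traditional diagonal preconditioner M_j = D − µ_j I mentioned on the next slide (dense matrices, smallest eigenvalue; the restart loop and the small safeguard added to the preconditioner are our own simplifications):

    import numpy as np

    def davidson(A, v1, m, tol=1e-8, restarts=20):
        # Davidson's method (real symmetric case) with diagonal preconditioning.
        d = np.diag(A)
        v = v1 / np.linalg.norm(v1)
        for _ in range(restarts):
            V = v.reshape(-1, 1)
            for j in range(m):
                H = V.T @ A @ V                        # projected matrix V_j^T A V_j
                mu, Y = np.linalg.eigh(H)
                z = V @ Y[:, 0]                        # Ritz vector, smallest Ritz value
                r = A @ z - mu[0] * z
                if np.linalg.norm(r) < tol:
                    return mu[0], z
                t = r / (d - mu[0] + 1e-12)            # t = M_j^{-1} r (diagonal preconditioner)
                t -= V @ (V.T @ t)                     # orthogonalize against V_j
                nt = np.linalg.norm(t)
                if nt < 1e-14:
                    break
                V = np.hstack([V, (t / nt).reshape(-1, 1)])
            v = z                                      # restart with the current Ritz vector
        return mu[0], z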

112 Note: Traditional Davidson uses diagonal preconditioning: M j = D σ j I. Will work only for some matrices Other options: Shift-and-invert using ILU [negatives: expensive + hard to parallelize.] Filtering (by averaging) Filtering by using smoothers (multigrid style) Iterative solves [See Jacobi-Davidson] Calais February 7,

113 CONVERGENCE THEORY

114 The Lanczos Algorithm in the Hermitian Case Assume the eigenvalues are sorted increasingly: λ_1 ≤ λ_2 ≤ ··· ≤ λ_n Orthogonal projection method onto K_m. To derive error bounds, use the Courant characterization: λ̃_1 = min_{u∈K, u≠0} (Au, u)/(u, u) = (Aũ_1, ũ_1)/(ũ_1, ũ_1) λ̃_j = min_{u∈K, u≠0, u⊥ũ_1,...,ũ_{j−1}} (Au, u)/(u, u) = (Aũ_j, ũ_j)/(ũ_j, ũ_j)

115 Bounds for λ_1 are easy to find - similar to linear systems. The Ritz values approximate the eigenvalues of A "inside out": λ_1 ≤ λ̃_1 ≤ λ̃_2 ≤ ··· ≤ λ̃_{n−1} ≤ λ̃_n ≤ λ_n

116 A-priori error bounds Theorem [Kaniel, 1966]: 0 ≤ λ_1^(m) − λ_1 ≤ (λ_N − λ_1) [ tan ∠(v_1, u_1) / T_{m−1}(1 + 2γ_1) ]^2 where γ_1 = (λ_2 − λ_1)/(λ_N − λ_2), and ∠(v_1, u_1) = acute angle between v_1 and u_1. + results for other eigenvalues. [Kaniel, Paige, YS] Theorem [YS, 1980]: 0 ≤ λ_i^(m) − λ_i ≤ (λ_N − λ_1) [ κ_i^(m) tan ∠(v_i, u_i) / T_{m−i}(1 + 2γ_i) ]^2 where γ_i = (λ_{i+1} − λ_i)/(λ_N − λ_{i+1}), κ_i^(m) = Π_{j<i} (λ_j^(m) − λ_N)/(λ_j^(m) − λ_i)

117 Theory for the nonhermitian case More difficult. No convincing results on global convergence. Can get a general a-priori / a-posteriori error bound. Let P be the orthogonal projector onto K and Q be the (oblique) projector onto K and orthogonally to L: Px ∈ K, x − Px ⊥ K; Qx ∈ K, x − Qx ⊥ L. [Figure illustrating the orthogonal projector P and the oblique projector Q]

118 Analysis The approximate problem amounts to solving Q(Ax − λ̃x) = 0, x ∈ K, or in operator form QAPx = λ̃x. Set A_m ≡ QAP. THEOREM. Let γ = ‖Q(A − λI)(I − P)‖_2. Then the residual norms of the pairs (λ, Pu) and (λ, u) for the linear operator A_m satisfy, respectively ‖(A_m − λI)Pu‖_2 ≤ γ ‖(I − P)u‖_2 and ‖(A_m − λI)u‖_2 ≤ (λ^2 + γ^2)^{1/2} ‖(I − P)u‖_2.

119 How to estimate ‖(I − P)u_i‖_2? Assume that A is diagonalizable and expand v_1 in the eigen-basis: v_1 = Σ_{j=1}^N α_j u_j Assume α_i ≠ 0 and ‖u_j‖_2 = 1 for all j. Then: ‖(I − P)u_i‖_2 ≤ ξ_i ε_i^(m) where ξ_i = Σ_{j≠i} |α_j| / |α_i| and ε_i^(m) = min_{p∈P_{m−1}, p(λ_i)=1} max_{j≠i} |p(λ_j)|

120 Particular case i = 1 Assume: Λ(A)\{λ_1} is enclosed in an ellipse E(c, e, a) with center c, focal distance e, and major semi-axis a. [Figure: ellipse in the complex plane, crossing the real axis at c − a, c − e, c, c + e, c + a] Then ε_1^(m) ≤ C_{m−1}(a/e) / |C_{m−1}((λ_1 − c)/e)| where C_{m−1} is the Chebyshev polynomial of degree m − 1 of the first kind.


122 HARMONIC RITZ VALUES

123 Harmonic Ritz values: Literature Morgan 91 [Hermitian case] Freund 93 [Non-Hermitian case, starting point: GMRES] Morgan 93 [Nonhermitian case] Paige, Parlett, Van der Vorst 95. Chapman & Y.S. 95. [use in Deflated GMRES] Many publications in the 40s and 50s (intermediate eigenvalue problems, Lehmann intervals, etc..)

124 Harmonic Ritz values (continued) Main idea: take L = AK in the projection process. In the context of Arnoldi's method, write ũ = V_m y; then, using AV_m = V_{m+1} H̄_m, the condition (A − λ̃I)V_m y ⊥ {AV_m} becomes H̄_m^H V_{m+1}^H [ V_{m+1} H̄_m y − λ̃ V_m y ] = 0 Notation: H_m = H̄_m minus its last row. Then H̄_m^H H̄_m y = λ̃ H_m^H y

125 or (H_m^H H_m + h_{m+1,m}^2 e_m e_m^H) y = λ̃ H_m^H y Remark: Assume H_m is nonsingular and multiply both sides by H_m^{−H}. Then the problem is equivalent to (H_m + z_m e_m^H) y = λ̃ y with z_m = h_{m+1,m}^2 H_m^{−H} e_m. Modified from H_m only in the last column.

126 Implementation within the Davidson framework Slight variation on standard Davidson: introduce z_i = M_i^{-1} r_i into the subspace. Proceed as in FGMRES: v_{j+1} = Orthn(Az_j, V_j). From the Gram-Schmidt process: Az_j = Σ_{i=1}^{j+1} h_{ij} v_i Hence the relation AZ_m = V_{m+1} H̄_m Approximation: λ̃, ũ = Z_m y Galerkin condition: r ⊥ AZ_m gives the generalized problem H̄_m^H H̄_m y = λ̃ H̄_m^H V_{m+1}^H Z_m y

127 Davidson's algorithm and two variants DAVIDSON'S ALGORITHM 1: Start: select v_1. For j = 1,..., m Do: Update V_j^H A V_j. Compute the Ritz pair ũ, λ̃. Compute r = Aũ − λ̃ũ. z = M^{-1} r. v_{j+1} = ORTHN(z, V_j). EndDo DAVIDSON'S ALGORITHM 2: Start: select r. For j = 1,..., m Do: z = M^{-1} r. v_j = ORTHN(z, V_{j−1}). Compute w = Av_j and update V_j^H A V_j. Compute the Ritz pair ũ, λ̃. Compute r = Aũ − λ̃ũ. EndDo Difference: start with a preconditioning operation instead of a matvec. In general minor differences.


129 HARMONIC DAVIDSON Start: select r. Set v_1 = r/‖r‖_2. For j = 1,..., m Do: z_j = M^{-1} r Compute w = Az_j and v_{j+1} = ORTHN(w, V_j, h_{:,j}); Update G = H̄_j^H H̄_j and S = H̄_j^H V_{j+1}^H Z_j; Compute the Ritz pair ũ, λ̃: Gy = λ̃Sy, ũ = Z_j y Compute r = Aũ − λ̃ũ EndDo The Arnoldi part is identical with that of FGMRES.

130 Relation with GMRES (Freund 91) The Harmonic Ritz values are the roots of the GMRES polynomial: ψ_m = arg min_{ψ∈P_m, ψ(0)=1} ‖ψ(A)r_0‖_2 Proof. The GMRES condition is: βv_1 − AV_m y ⊥ {AV_m}, i.e., ψ_m(A)v_1 ⊥ {AV_m}, i.e., (A − λ̃_i I)V_m y_i ⊥ {AV_m} Same condition as that of the Harmonic Ritz projection. λ̃_i = harmonic Ritz value, ũ_i = V_m y_i = harmonic Ritz vector.

131 Harmonic values and interior eigenvalues Let z = Aũ and rewrite the condition (A − λ̃I)ũ ⊥ AK as: [ λ̃^{-1} I − A^{-1} ] z ⊥ L, z ∈ L This is an orthogonal projection method for A^{-1}. Note: This is NOT shift-and-invert in disguise. The space of approximants is the same as for the standard projection. Interesting consequence for the Hermitian case.

132 Harmonic Ritz projection in the Hermitian Case Order the eigenvalues increasingly: λ_1 ≤ λ_2 ≤ ··· ≤ λ_n Recall: Ritz values approximate the eigenvalues of A "inside out": λ_1 ≤ λ̃_1 ≤ λ̃_2 ≤ ··· ≤ λ̃_{n−1} ≤ λ̃_n ≤ λ_n Define: K^(i) = {x ∈ K | x ⊥ ũ_1, ũ_2,... ũ_{i−1}} (K^(1) ≡ K). Then λ̃_i = min_{x∈K^(i)} (Ax, x)/(x, x)

133 Apply the principle to the Harmonic Ritz values (a projection method for A^{-1}): λ̃_1^{-1} ≤ λ_1^{-1}; λ̃_n^{-1} ≥ λ_n^{-1}, hence λ̃_1 ≥ λ_1; λ̃_n ≤ λ_n Careful: treat positive and negative eigenvalues separately. Result: [Paige, Parlett, Van der Vorst 95] (schematically) λ̃_2^- ≤ λ_2^-, λ̃_1^- ≤ λ_1^- < 0 < λ_1^+ ≤ λ̃_1^+, λ_2^+ ≤ λ̃_2^+ Assume for simplicity that A is SPD. Define: K^(i) = {x ∈ K | Ax ⊥ Aũ_1, Aũ_2,... Aũ_{i−1}} (K^(1) ≡ K). Then λ̃_i^{-1} = max_{x∈K^(i)} (Ax, x)/(Ax, Ax)

134 Alternative Projections Eigenvalue problems are really nonlinear systems of equations.. Idea: find µ such that (A − µI)V is nearly rank-deficient Leads to det[V^H (A − µI)^H (A − µI)V] = 0 Assume µ is real. Using AV_m = V_{m+1} H̄_m, this leads to the quadratic problem (H̄_m^T H̄_m − µ(H_m + H_m^T) + µ^2 I) y = 0

135 Alternative formulations: det(V^H (A − µI)V) = 0 — orthogonal projection det((AV)^H (A − µI)V) = 0 — harmonic projection σ_min((A − µI)V) = 0 — SVD projection

136 [Plot: log10 of residual norm versus number of iterations for Restarted Arnoldi, Restarted Harmonic Krylov, and Restarted Krylov SVD; Markov chain problem, N = 1024, m = 15]

137 [Plot: log10 of residual norm versus number of iterations for Restarted Arnoldi, Restarted Harmonic Krylov, and Restarted Krylov SVD; CONVD(32,32), 0.05, .1), m = 10]

138 [Plot: log10 of residual norm versus number of iterations for Restarted Arnoldi, Restarted Harmonic Krylov, and Restarted Krylov SVD; CONVD(32,32), 0.05, 0), m = 10]

139 JACOBI DAVIDSON

140 Introduction via Newton's method Assumptions: M = A + E and Az ≈ µz Goal: to find an improved eigenpair (µ + η, z + v). Write A(z + v) = (µ + η)(z + v), neglect second order terms and rearrange: (M − µI)v − ηz = −r with r ≡ (A − µI)z Unknowns: η and v. Underdetermined system. Need one constraint. Add the condition: w^H v = 0 for some vector w.

141 In matrix form: [ M − µI −z ; w^H 0 ] [ v ; η ] = [ −r ; 0 ] Eliminate v from the second equation: [ M − µI −z ; 0 w^H(M − µI)^{-1} z ] [ v ; η ] = [ −r ; w^H(M − µI)^{-1} r ] Solution: [Olsen's method] η = w^H(M − µI)^{-1} r / w^H(M − µI)^{-1} z, v = (M − µI)^{-1} (ηz − r) When M = A, this corresponds to Newton's method for solving (A − λI)u = 0, w^H u = constant

142 Note: Another way to characterize the solution is: v = −(M − µI)^{-1} r + η(M − µI)^{-1} z, with η such that w^H v = 0 This involves the inverse of (M − µI). Jacobi-Davidson rewrites the solution using projectors. Let P_z be a projector in the direction of z which leaves r invariant. It is of the form P_z = I − z s^H / (s^H z) where s ⊥ r. Similarly let P_w be any projector which leaves v unchanged. Then Olsen's solution can be written as [P_z (M − µI) P_w] v = −r, w^H v = 0 The two solutions are mathematically equivalent.

143 The Jacobi-Davidson approach In orthogonal projection methods (e.g. Arnoldi) we have r ⊥ z. Also it is natural to take w = z. Assume ‖z‖_2 = 1. With the above assumptions, Olsen's correction equation is mathematically equivalent to finding v such that: (I − zz^H)(M − µI)(I − zz^H)v = −r, v ⊥ z Main attraction: one can use an iterative method for the solution of the correction equation (M-solves are not explicitly required).

144 Automatic Multi-Level Substructuring Origin: extension of substructuring for eigenvalue problems. Background: domain decomposition. [Figure: domain Ω split into Ω_1, Ω_2 by an interface Γ] Let A ∈ C^{n×n}, Hermitian, partitioned as A = [ B E ; E^H C ] Note: B is block-diagonal, B ∈ C^{(n−p)×(n−p)}; B represents the local (subdomain) matrices, E represents the coupling, and C operates on the interface variables.

145 The problem Au = λu can be written as: [ B E ; E^H C ] [ u_B ; u_S ] = λ [ u_B ; u_S ] Basic idea of the method for two levels. First step: eliminate the blocks E, E^H. Let U = [ I −B^{-1}E ; 0 I ]; then U^H A U = [ B 0 ; 0 S ], with S = C − E^H B^{-1} E. The original problem is equivalent to U^H A U u = λ U^H U u, i.e., [ B 0 ; 0 S ] u = λ [ I −B^{-1}E ; −E^H B^{-1} M_S ] u, with M_S = I + E^H B^{-2} E

146 Second step: neglect the coupling in the right-hand side matrix: [ B 0 ; 0 S ] u = λ [ I 0 ; 0 M_S ] u, which decouples into Bv = µ v and Sw = η M_S w Compute a few of the smallest eigenvalues of the above problems. Third step: build a good subspace to approximate the eigenfunctions of the original problem. For the projection, use a basis of the form v̂_i = [ v_i ; 0 ], i = 1,..., m_B; ŵ_j = [ 0 ; w_j ], j = 1,..., m_S, where m_B < (n − p) and m_S < p.

147 Then use this subspace for a Rayleigh-Ritz projection applied to [ B 0 ; 0 S ] [ u_B ; u_S ] = λ [ I −B^{-1}E ; −E^H B^{-1} M_S ] [ u_B ; u_S ] (Note: not the original problem.) Final step: exploit recursion. NOTE: the algorithm does only one shot of descent - ascent (no iterative improvement).

148 References: [1] J. K. BENNIGHOF AND R. B. LEHOUCQ, An automated multilevel substructuring method for eigenspace computation in linear elastodynamics, to appear in SIAM J. Sci. Comput. (2003). [2] R. R. CRAIG, JR. AND M. C. C. BAMPTON, Coupling of substructures for dynamic analysis, AIAA Journal, 6 (1968). [3] W. C. HURTY, Vibrations of structural systems by component-mode synthesis, Journal of the Engineering Mechanics Division, ASCE, 86 (1960). [4] K. BEKAS AND Y. SAAD, Computation of Smallest Eigenvalues using Spectral Schur Complements, MSI technical report, Jan. 2004, to appear.

149 Spectral Schur complements One can interpret AMLS in terms of Schur complements. Start with [ B E ; E^H C ] [ u_B ; u_S ] = λ [ u_B ; u_S ] For λ ∉ Λ(B) define S(λ) = C − E^H (B − λI)^{-1} E When λ ∉ Λ(B), then λ ∈ Λ(A) iff λ ∈ Λ(S(λ)), i.e., iff S(λ)u_S = λu_S Observation: The Schur complement problem solved by AMLS can be viewed as the problem resulting from a first order approximation of S(λ) around λ = 0.

150 The standard expansion of the resolvent (B − λI)^{-1} = B^{-1} Σ_{k=0}^∞ (λB^{-1})^k = Σ_{k=0}^∞ λ^k B^{-k-1}, around λ = 0, leads to the series S(λ) = C − E^H (B^{-1} + λB^{-2} + λ^2 B^{-3} + ···) E = S − Σ_{k=1}^∞ λ^k E^H B^{-k-1} E Zeroth order approximation [shift-and-invert with zero shift]: S u_S = λ u_S First order approximation [AMLS]: S u_S = λ(I + E^H B^{-2} E) u_S Second order approximation [See Bekas and YS 04]: S u_S = λ(I + E^H B^{-2} E + λ E^H B^{-3} E) u_S


More information

Matrix Algorithms. Volume II: Eigensystems. G. W. Stewart H1HJ1L. University of Maryland College Park, Maryland

Matrix Algorithms. Volume II: Eigensystems. G. W. Stewart H1HJ1L. University of Maryland College Park, Maryland Matrix Algorithms Volume II: Eigensystems G. W. Stewart University of Maryland College Park, Maryland H1HJ1L Society for Industrial and Applied Mathematics Philadelphia CONTENTS Algorithms Preface xv xvii

More information

Math 102, Winter Final Exam Review. Chapter 1. Matrices and Gaussian Elimination

Math 102, Winter Final Exam Review. Chapter 1. Matrices and Gaussian Elimination Math 0, Winter 07 Final Exam Review Chapter. Matrices and Gaussian Elimination { x + x =,. Different forms of a system of linear equations. Example: The x + 4x = 4. [ ] [ ] [ ] vector form (or the column

More information

FEM and sparse linear system solving

FEM and sparse linear system solving FEM & sparse linear system solving, Lecture 9, Nov 19, 2017 1/36 Lecture 9, Nov 17, 2017: Krylov space methods http://people.inf.ethz.ch/arbenz/fem17 Peter Arbenz Computer Science Department, ETH Zürich

More information

AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 23: GMRES and Other Krylov Subspace Methods; Preconditioning

AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 23: GMRES and Other Krylov Subspace Methods; Preconditioning AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 23: GMRES and Other Krylov Subspace Methods; Preconditioning Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao Numerical Analysis I 1 / 18 Outline

More information

AMS526: Numerical Analysis I (Numerical Linear Algebra)

AMS526: Numerical Analysis I (Numerical Linear Algebra) AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 19: More on Arnoldi Iteration; Lanczos Iteration Xiangmin Jiao Stony Brook University Xiangmin Jiao Numerical Analysis I 1 / 17 Outline 1

More information

6.4 Krylov Subspaces and Conjugate Gradients

6.4 Krylov Subspaces and Conjugate Gradients 6.4 Krylov Subspaces and Conjugate Gradients Our original equation is Ax = b. The preconditioned equation is P Ax = P b. When we write P, we never intend that an inverse will be explicitly computed. P

More information

ITERATIVE PROJECTION METHODS FOR SPARSE LINEAR SYSTEMS AND EIGENPROBLEMS CHAPTER 11 : JACOBI DAVIDSON METHOD

ITERATIVE PROJECTION METHODS FOR SPARSE LINEAR SYSTEMS AND EIGENPROBLEMS CHAPTER 11 : JACOBI DAVIDSON METHOD ITERATIVE PROJECTION METHODS FOR SPARSE LINEAR SYSTEMS AND EIGENPROBLEMS CHAPTER 11 : JACOBI DAVIDSON METHOD Heinrich Voss voss@tu-harburg.de Hamburg University of Technology Institute of Numerical Simulation

More information

Solution of Linear Equations

Solution of Linear Equations Solution of Linear Equations (Com S 477/577 Notes) Yan-Bin Jia Sep 7, 07 We have discussed general methods for solving arbitrary equations, and looked at the special class of polynomial equations A subclass

More information

Eigenvalues and Eigenvectors

Eigenvalues and Eigenvectors Chapter 1 Eigenvalues and Eigenvectors Among problems in numerical linear algebra, the determination of the eigenvalues and eigenvectors of matrices is second in importance only to the solution of linear

More information

4.8 Arnoldi Iteration, Krylov Subspaces and GMRES

4.8 Arnoldi Iteration, Krylov Subspaces and GMRES 48 Arnoldi Iteration, Krylov Subspaces and GMRES We start with the problem of using a similarity transformation to convert an n n matrix A to upper Hessenberg form H, ie, A = QHQ, (30) with an appropriate

More information

Charles University Faculty of Mathematics and Physics DOCTORAL THESIS. Krylov subspace approximations in linear algebraic problems

Charles University Faculty of Mathematics and Physics DOCTORAL THESIS. Krylov subspace approximations in linear algebraic problems Charles University Faculty of Mathematics and Physics DOCTORAL THESIS Iveta Hnětynková Krylov subspace approximations in linear algebraic problems Department of Numerical Mathematics Supervisor: Doc. RNDr.

More information

The Eigenvalue Problem: Perturbation Theory

The Eigenvalue Problem: Perturbation Theory Jim Lambers MAT 610 Summer Session 2009-10 Lecture 13 Notes These notes correspond to Sections 7.2 and 8.1 in the text. The Eigenvalue Problem: Perturbation Theory The Unsymmetric Eigenvalue Problem Just

More information

Math 504 (Fall 2011) 1. (*) Consider the matrices

Math 504 (Fall 2011) 1. (*) Consider the matrices Math 504 (Fall 2011) Instructor: Emre Mengi Study Guide for Weeks 11-14 This homework concerns the following topics. Basic definitions and facts about eigenvalues and eigenvectors (Trefethen&Bau, Lecture

More information

Course Notes: Week 1

Course Notes: Week 1 Course Notes: Week 1 Math 270C: Applied Numerical Linear Algebra 1 Lecture 1: Introduction (3/28/11) We will focus on iterative methods for solving linear systems of equations (and some discussion of eigenvalues

More information

Polynomial Jacobi Davidson Method for Large/Sparse Eigenvalue Problems

Polynomial Jacobi Davidson Method for Large/Sparse Eigenvalue Problems Polynomial Jacobi Davidson Method for Large/Sparse Eigenvalue Problems Tsung-Ming Huang Department of Mathematics National Taiwan Normal University, Taiwan April 28, 2011 T.M. Huang (Taiwan Normal Univ.)

More information

On correction equations and domain decomposition for computing invariant subspaces

On correction equations and domain decomposition for computing invariant subspaces On correction equations and domain decomposition for computing invariant subspaces Bernard Philippe Yousef Saad February 1, 26 Abstract By considering the eigenvalue problem as a system of nonlinear equations,

More information

A HARMONIC RESTARTED ARNOLDI ALGORITHM FOR CALCULATING EIGENVALUES AND DETERMINING MULTIPLICITY

A HARMONIC RESTARTED ARNOLDI ALGORITHM FOR CALCULATING EIGENVALUES AND DETERMINING MULTIPLICITY A HARMONIC RESTARTED ARNOLDI ALGORITHM FOR CALCULATING EIGENVALUES AND DETERMINING MULTIPLICITY RONALD B. MORGAN AND MIN ZENG Abstract. A restarted Arnoldi algorithm is given that computes eigenvalues

More information

Davidson Method CHAPTER 3 : JACOBI DAVIDSON METHOD

Davidson Method CHAPTER 3 : JACOBI DAVIDSON METHOD Davidson Method CHAPTER 3 : JACOBI DAVIDSON METHOD Heinrich Voss voss@tu-harburg.de Hamburg University of Technology The Davidson method is a popular technique to compute a few of the smallest (or largest)

More information

Matrices, Moments and Quadrature, cont d

Matrices, Moments and Quadrature, cont d Jim Lambers CME 335 Spring Quarter 2010-11 Lecture 4 Notes Matrices, Moments and Quadrature, cont d Estimation of the Regularization Parameter Consider the least squares problem of finding x such that

More information

Alternative correction equations in the Jacobi-Davidson method

Alternative correction equations in the Jacobi-Davidson method Chapter 2 Alternative correction equations in the Jacobi-Davidson method Menno Genseberger and Gerard Sleijpen Abstract The correction equation in the Jacobi-Davidson method is effective in a subspace

More information

MATH36001 Generalized Inverses and the SVD 2015

MATH36001 Generalized Inverses and the SVD 2015 MATH36001 Generalized Inverses and the SVD 201 1 Generalized Inverses of Matrices A matrix has an inverse only if it is square and nonsingular. However there are theoretical and practical applications

More information

On the influence of eigenvalues on Bi-CG residual norms

On the influence of eigenvalues on Bi-CG residual norms On the influence of eigenvalues on Bi-CG residual norms Jurjen Duintjer Tebbens Institute of Computer Science Academy of Sciences of the Czech Republic duintjertebbens@cs.cas.cz Gérard Meurant 30, rue

More information

Lecture 4 Eigenvalue problems

Lecture 4 Eigenvalue problems Lecture 4 Eigenvalue problems Weinan E 1,2 and Tiejun Li 2 1 Department of Mathematics, Princeton University, weinan@princeton.edu 2 School of Mathematical Sciences, Peking University, tieli@pku.edu.cn

More information

Computational Methods. Eigenvalues and Singular Values

Computational Methods. Eigenvalues and Singular Values Computational Methods Eigenvalues and Singular Values Manfred Huber 2010 1 Eigenvalues and Singular Values Eigenvalues and singular values describe important aspects of transformations and of data relations

More information

The quadratic eigenvalue problem (QEP) is to find scalars λ and nonzero vectors u satisfying

The quadratic eigenvalue problem (QEP) is to find scalars λ and nonzero vectors u satisfying I.2 Quadratic Eigenvalue Problems 1 Introduction The quadratic eigenvalue problem QEP is to find scalars λ and nonzero vectors u satisfying where Qλx = 0, 1.1 Qλ = λ 2 M + λd + K, M, D and K are given

More information

The German word eigen is cognate with the Old English word āgen, which became owen in Middle English and own in modern English.

The German word eigen is cognate with the Old English word āgen, which became owen in Middle English and own in modern English. Chapter 4 EIGENVALUE PROBLEM The German word eigen is cognate with the Old English word āgen, which became owen in Middle English and own in modern English. 4.1 Mathematics 4.2 Reduction to Upper Hessenberg

More information

Numerical Methods I Non-Square and Sparse Linear Systems

Numerical Methods I Non-Square and Sparse Linear Systems Numerical Methods I Non-Square and Sparse Linear Systems Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 MATH-GA 2011.003 / CSCI-GA 2945.003, Fall 2014 September 25th, 2014 A. Donev (Courant

More information

Krylov Subspace Methods that Are Based on the Minimization of the Residual

Krylov Subspace Methods that Are Based on the Minimization of the Residual Chapter 5 Krylov Subspace Methods that Are Based on the Minimization of the Residual Remark 51 Goal he goal of these methods consists in determining x k x 0 +K k r 0,A such that the corresponding Euclidean

More information

Cheat Sheet for MATH461

Cheat Sheet for MATH461 Cheat Sheet for MATH46 Here is the stuff you really need to remember for the exams Linear systems Ax = b Problem: We consider a linear system of m equations for n unknowns x,,x n : For a given matrix A

More information

Preconditioned inverse iteration and shift-invert Arnoldi method

Preconditioned inverse iteration and shift-invert Arnoldi method Preconditioned inverse iteration and shift-invert Arnoldi method Melina Freitag Department of Mathematical Sciences University of Bath CSC Seminar Max-Planck-Institute for Dynamics of Complex Technical

More information

Iterative Methods for Sparse Linear Systems

Iterative Methods for Sparse Linear Systems Iterative Methods for Sparse Linear Systems Luca Bergamaschi e-mail: berga@dmsa.unipd.it - http://www.dmsa.unipd.it/ berga Department of Mathematical Methods and Models for Scientific Applications University

More information

33AH, WINTER 2018: STUDY GUIDE FOR FINAL EXAM

33AH, WINTER 2018: STUDY GUIDE FOR FINAL EXAM 33AH, WINTER 2018: STUDY GUIDE FOR FINAL EXAM (UPDATED MARCH 17, 2018) The final exam will be cumulative, with a bit more weight on more recent material. This outline covers the what we ve done since the

More information

QR-decomposition. The QR-decomposition of an n k matrix A, k n, is an n n unitary matrix Q and an n k upper triangular matrix R for which A = QR

QR-decomposition. The QR-decomposition of an n k matrix A, k n, is an n n unitary matrix Q and an n k upper triangular matrix R for which A = QR QR-decomposition The QR-decomposition of an n k matrix A, k n, is an n n unitary matrix Q and an n k upper triangular matrix R for which In Matlab A = QR [Q,R]=qr(A); Note. The QR-decomposition is unique

More information

Math 405: Numerical Methods for Differential Equations 2016 W1 Topics 10: Matrix Eigenvalues and the Symmetric QR Algorithm

Math 405: Numerical Methods for Differential Equations 2016 W1 Topics 10: Matrix Eigenvalues and the Symmetric QR Algorithm Math 405: Numerical Methods for Differential Equations 2016 W1 Topics 10: Matrix Eigenvalues and the Symmetric QR Algorithm References: Trefethen & Bau textbook Eigenvalue problem: given a matrix A, find

More information

Lecture notes: Applied linear algebra Part 1. Version 2

Lecture notes: Applied linear algebra Part 1. Version 2 Lecture notes: Applied linear algebra Part 1. Version 2 Michael Karow Berlin University of Technology karow@math.tu-berlin.de October 2, 2008 1 Notation, basic notions and facts 1.1 Subspaces, range and

More information

Linear Algebra in Actuarial Science: Slides to the lecture

Linear Algebra in Actuarial Science: Slides to the lecture Linear Algebra in Actuarial Science: Slides to the lecture Fall Semester 2010/2011 Linear Algebra is a Tool-Box Linear Equation Systems Discretization of differential equations: solving linear equations

More information

Krylov Subspace Methods for Large/Sparse Eigenvalue Problems

Krylov Subspace Methods for Large/Sparse Eigenvalue Problems Krylov Subspace Methods for Large/Sparse Eigenvalue Problems Tsung-Ming Huang Department of Mathematics National Taiwan Normal University, Taiwan April 17, 2012 T.-M. Huang (Taiwan Normal University) Krylov

More information

Maths for Signals and Systems Linear Algebra in Engineering

Maths for Signals and Systems Linear Algebra in Engineering Maths for Signals and Systems Linear Algebra in Engineering Lectures 13 15, Tuesday 8 th and Friday 11 th November 016 DR TANIA STATHAKI READER (ASSOCIATE PROFFESOR) IN SIGNAL PROCESSING IMPERIAL COLLEGE

More information

ON ORTHOGONAL REDUCTION TO HESSENBERG FORM WITH SMALL BANDWIDTH

ON ORTHOGONAL REDUCTION TO HESSENBERG FORM WITH SMALL BANDWIDTH ON ORTHOGONAL REDUCTION TO HESSENBERG FORM WITH SMALL BANDWIDTH V. FABER, J. LIESEN, AND P. TICHÝ Abstract. Numerous algorithms in numerical linear algebra are based on the reduction of a given matrix

More information

5 Selected Topics in Numerical Linear Algebra

5 Selected Topics in Numerical Linear Algebra 5 Selected Topics in Numerical Linear Algebra In this chapter we will be concerned mostly with orthogonal factorizations of rectangular m n matrices A The section numbers in the notes do not align with

More information

The Singular Value Decomposition

The Singular Value Decomposition The Singular Value Decomposition An Important topic in NLA Radu Tiberiu Trîmbiţaş Babeş-Bolyai University February 23, 2009 Radu Tiberiu Trîmbiţaş ( Babeş-Bolyai University)The Singular Value Decomposition

More information

Stabilization and Acceleration of Algebraic Multigrid Method

Stabilization and Acceleration of Algebraic Multigrid Method Stabilization and Acceleration of Algebraic Multigrid Method Recursive Projection Algorithm A. Jemcov J.P. Maruszewski Fluent Inc. October 24, 2006 Outline 1 Need for Algorithm Stabilization and Acceleration

More information

AMS526: Numerical Analysis I (Numerical Linear Algebra)

AMS526: Numerical Analysis I (Numerical Linear Algebra) AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 16: Eigenvalue Problems; Similarity Transformations Xiangmin Jiao Stony Brook University Xiangmin Jiao Numerical Analysis I 1 / 18 Eigenvalue

More information

LinGloss. A glossary of linear algebra

LinGloss. A glossary of linear algebra LinGloss A glossary of linear algebra Contents: Decompositions Types of Matrices Theorems Other objects? Quasi-triangular A matrix A is quasi-triangular iff it is a triangular matrix except its diagonal

More information

Efficient Methods For Nonlinear Eigenvalue Problems. Diploma Thesis

Efficient Methods For Nonlinear Eigenvalue Problems. Diploma Thesis Efficient Methods For Nonlinear Eigenvalue Problems Diploma Thesis Timo Betcke Technical University of Hamburg-Harburg Department of Mathematics (Prof. Dr. H. Voß) August 2002 Abstract During the last

More information

Algorithms that use the Arnoldi Basis

Algorithms that use the Arnoldi Basis AMSC 600 /CMSC 760 Advanced Linear Numerical Analysis Fall 2007 Arnoldi Methods Dianne P. O Leary c 2006, 2007 Algorithms that use the Arnoldi Basis Reference: Chapter 6 of Saad The Arnoldi Basis How to

More information

Eigenvalues and eigenvectors

Eigenvalues and eigenvectors Chapter 6 Eigenvalues and eigenvectors An eigenvalue of a square matrix represents the linear operator as a scaling of the associated eigenvector, and the action of certain matrices on general vectors

More information

EIGENVALE PROBLEMS AND THE SVD. [5.1 TO 5.3 & 7.4]

EIGENVALE PROBLEMS AND THE SVD. [5.1 TO 5.3 & 7.4] EIGENVALE PROBLEMS AND THE SVD. [5.1 TO 5.3 & 7.4] Eigenvalue Problems. Introduction Let A an n n real nonsymmetric matrix. The eigenvalue problem: Au = λu λ C : eigenvalue u C n : eigenvector Example:

More information

Chapter 7. Canonical Forms. 7.1 Eigenvalues and Eigenvectors

Chapter 7. Canonical Forms. 7.1 Eigenvalues and Eigenvectors Chapter 7 Canonical Forms 7.1 Eigenvalues and Eigenvectors Definition 7.1.1. Let V be a vector space over the field F and let T be a linear operator on V. An eigenvalue of T is a scalar λ F such that there

More information

Orthogonal iteration to QR

Orthogonal iteration to QR Notes for 2016-03-09 Orthogonal iteration to QR The QR iteration is the workhorse for solving the nonsymmetric eigenvalue problem. Unfortunately, while the iteration itself is simple to write, the derivation

More information

Eigenvalues and Eigenvectors

Eigenvalues and Eigenvectors /88 Chia-Ping Chen Department of Computer Science and Engineering National Sun Yat-sen University Linear Algebra Eigenvalue Problem /88 Eigenvalue Equation By definition, the eigenvalue equation for matrix

More information

Large-scale eigenvalue problems

Large-scale eigenvalue problems ELE 538B: Mathematics of High-Dimensional Data Large-scale eigenvalue problems Yuxin Chen Princeton University, Fall 208 Outline Power method Lanczos algorithm Eigenvalue problems 4-2 Eigendecomposition

More information

On prescribing Ritz values and GMRES residual norms generated by Arnoldi processes

On prescribing Ritz values and GMRES residual norms generated by Arnoldi processes On prescribing Ritz values and GMRES residual norms generated by Arnoldi processes Jurjen Duintjer Tebbens Institute of Computer Science Academy of Sciences of the Czech Republic joint work with Gérard

More information

Solving large scale eigenvalue problems

Solving large scale eigenvalue problems arge scale eigenvalue problems, Lecture 4, March 14, 2018 1/41 Lecture 4, March 14, 2018: The QR algorithm http://people.inf.ethz.ch/arbenz/ewp/ Peter Arbenz Computer Science Department, ETH Zürich E-mail:

More information

The QR Decomposition

The QR Decomposition The QR Decomposition We have seen one major decomposition of a matrix which is A = LU (and its variants) or more generally PA = LU for a permutation matrix P. This was valid for a square matrix and aided

More information

Review of some mathematical tools

Review of some mathematical tools MATHEMATICAL FOUNDATIONS OF SIGNAL PROCESSING Fall 2016 Benjamín Béjar Haro, Mihailo Kolundžija, Reza Parhizkar, Adam Scholefield Teaching assistants: Golnoosh Elhami, Hanjie Pan Review of some mathematical

More information

Chapter 0 Miscellaneous Preliminaries

Chapter 0 Miscellaneous Preliminaries EE 520: Topics Compressed Sensing Linear Algebra Review Notes scribed by Kevin Palmowski, Spring 2013, for Namrata Vaswani s course Notes on matrix spark courtesy of Brian Lois More notes added by Namrata

More information

On the Modification of an Eigenvalue Problem that Preserves an Eigenspace

On the Modification of an Eigenvalue Problem that Preserves an Eigenspace Purdue University Purdue e-pubs Department of Computer Science Technical Reports Department of Computer Science 2009 On the Modification of an Eigenvalue Problem that Preserves an Eigenspace Maxim Maumov

More information

Rational Krylov methods for linear and nonlinear eigenvalue problems

Rational Krylov methods for linear and nonlinear eigenvalue problems Rational Krylov methods for linear and nonlinear eigenvalue problems Mele Giampaolo mele@mail.dm.unipi.it University of Pisa 7 March 2014 Outline Arnoldi (and its variants) for linear eigenproblems Rational

More information

MAA507, Power method, QR-method and sparse matrix representation.

MAA507, Power method, QR-method and sparse matrix representation. ,, and representation. February 11, 2014 Lecture 7: Overview, Today we will look at:.. If time: A look at representation and fill in. Why do we need numerical s? I think everyone have seen how time consuming

More information

Iterative methods for Linear System

Iterative methods for Linear System Iterative methods for Linear System JASS 2009 Student: Rishi Patil Advisor: Prof. Thomas Huckle Outline Basics: Matrices and their properties Eigenvalues, Condition Number Iterative Methods Direct and

More information

Krylov Subspace Methods to Calculate PageRank

Krylov Subspace Methods to Calculate PageRank Krylov Subspace Methods to Calculate PageRank B. Vadala-Roth REU Final Presentation August 1st, 2013 How does Google Rank Web Pages? The Web The Graph (A) Ranks of Web pages v = v 1... Dominant Eigenvector

More information

Preface to the Second Edition. Preface to the First Edition

Preface to the Second Edition. Preface to the First Edition n page v Preface to the Second Edition Preface to the First Edition xiii xvii 1 Background in Linear Algebra 1 1.1 Matrices................................. 1 1.2 Square Matrices and Eigenvalues....................

More information

Math 102 Final Exam - Dec 14 - PCYNH pm Fall Name Student No. Section A0

Math 102 Final Exam - Dec 14 - PCYNH pm Fall Name Student No. Section A0 Math 12 Final Exam - Dec 14 - PCYNH 122-6pm Fall 212 Name Student No. Section A No aids allowed. Answer all questions on test paper. Total Marks: 4 8 questions (plus a 9th bonus question), 5 points per

More information

ECS130 Scientific Computing Handout E February 13, 2017

ECS130 Scientific Computing Handout E February 13, 2017 ECS130 Scientific Computing Handout E February 13, 2017 1. The Power Method (a) Pseudocode: Power Iteration Given an initial vector u 0, t i+1 = Au i u i+1 = t i+1 / t i+1 2 (approximate eigenvector) θ

More information

IDR(s) as a projection method

IDR(s) as a projection method Delft University of Technology Faculty of Electrical Engineering, Mathematics and Computer Science Delft Institute of Applied Mathematics IDR(s) as a projection method A thesis submitted to the Delft Institute

More information

STAT 309: MATHEMATICAL COMPUTATIONS I FALL 2017 LECTURE 5

STAT 309: MATHEMATICAL COMPUTATIONS I FALL 2017 LECTURE 5 STAT 39: MATHEMATICAL COMPUTATIONS I FALL 17 LECTURE 5 1 existence of svd Theorem 1 (Existence of SVD) Every matrix has a singular value decomposition (condensed version) Proof Let A C m n and for simplicity

More information

Direct methods for symmetric eigenvalue problems

Direct methods for symmetric eigenvalue problems Direct methods for symmetric eigenvalue problems, PhD McMaster University School of Computational Engineering and Science February 4, 2008 1 Theoretical background Posing the question Perturbation theory

More information

Computation of Smallest Eigenvalues using Spectral Schur Complements

Computation of Smallest Eigenvalues using Spectral Schur Complements Computation of Smallest Eigenvalues using Spectral Schur Complements Constantine Bekas Yousef Saad January 23, 2004 Abstract The Automated Multilevel Substructing method (AMLS ) was recently presented

More information