LARGE SPARSE EIGENVALUE PROBLEMS
- Projection methods
- The subspace iteration
- Krylov subspace methods: Arnoldi and Lanczos
- Golub-Kahan-Lanczos bidiagonalization
General Tools for Solving Large Eigen-Problems
- Projection techniques: Arnoldi, Lanczos, subspace iteration
- Preconditionings: shift-and-invert, polynomials, ...
- Deflation and restarting techniques
Computational codes often combine these three ingredients.
[Reading: TB 36; AB 4.6.1, 4.6.7-8, 4.5.4, 4.6.2; GvL4 10.1, 10.5.1; Eigen3]
A Few Popular Solution Methods
- Subspace iteration [now less popular; sometimes used for validation]
- Arnoldi's method (or Lanczos) with polynomial acceleration
- Shift-and-invert and other preconditioners [use Arnoldi or Lanczos for $(A - \sigma I)^{-1}$]
- Davidson's method and variants, Jacobi-Davidson
- Specialized method: Automatic Multilevel Substructuring (AMLS)
Projection Methods for Eigenvalue Problems
Projection method onto $K$, orthogonally to $L$:
Given two subspaces $K$ and $L$ of the same dimension, approximate eigenpairs $\tilde\lambda, \tilde u$ are obtained by solving:
Find $\tilde\lambda \in \mathbb{C}$, $\tilde u \in K$ such that $(\tilde\lambda I - A)\tilde u \perp L$.
Two types of methods:
- Orthogonal projection methods: the situation when $L = K$.
- Oblique projection methods: when $L \ne K$.
The first situation leads to the Rayleigh-Ritz procedure.
Rayleigh-Ritz projection
Given: a subspace $X$ known to contain good approximations to eigenvectors of $A$.
Question: how to extract the best approximations to eigenvalues/eigenvectors from this subspace?
Answer: an orthogonal projection method.
Let $Q = [q_1, \dots, q_m]$ be an orthonormal basis of $X$. The orthogonal projection method onto $X$ yields:
\[
Q^H (A - \tilde\lambda I)\tilde u = 0 \quad\Longleftrightarrow\quad Q^H A Q\, y = \tilde\lambda y, \quad \text{where } \tilde u = Q y.
\]
Known as the Rayleigh-Ritz process.
Procedure:
1. Obtain an orthonormal basis $Q$ of $X$
2. Compute $C = Q^H A Q$ (an $m \times m$ matrix)
3. Obtain the Schur factorization of $C$: $C = Y R Y^H$
4. Compute $\tilde U = Q Y$
Property: if $X$ is (exactly) invariant, the procedure yields exact eigenvalues and eigenvectors.
Proof: since $X$ is invariant, $(A - \tilde\lambda I)\tilde u = Q z$ for a certain $z$. The Galerkin condition gives $Q^H Q z = 0$, and since $Q^H Q = I$ this implies $z = 0$; therefore $(A - \tilde\lambda I)\tilde u = 0$.
This procedure can be used in conjunction with the subspace obtained from the subspace iteration algorithm (a sketch in NumPy follows).
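A minimal NumPy/SciPy sketch of the four steps above; the function name rayleigh_ritz and the economy-QR/complex-Schur choices are illustrative assumptions, not part of the slides:

```python
import numpy as np
from scipy.linalg import qr, schur

def rayleigh_ritz(A, X):
    """Extract approximate eigenpairs of A from the subspace spanned by X."""
    Q, _ = qr(X, mode='economic')       # 1. orthonormal basis of span(X)
    C = Q.conj().T @ (A @ Q)            # 2. projected m x m matrix C = Q^H A Q
    R, Y = schur(C, output='complex')   # 3. Schur factorization C = Y R Y^H
    U = Q @ Y                           # 4. approximate Schur vectors of A
    return np.diag(R), U                # Ritz values and Ritz (Schur) basis
```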
Subspace Iteration
Original idea: a projection technique onto a subspace of the form $Y = A^k X$.
Practically: $A^k$ is replaced by a suitable polynomial.
- Advantages: easy to implement (in the symmetric case); easy to analyze.
- Disadvantage: slow.
Often used with polynomial acceleration: $A^k X$ is replaced by $C_k(A) X$, where typically $C_k$ is a Chebyshev polynomial.
Algorithm: Subspace Iteration with Projection
1. Start: choose an initial system of vectors $X = [x_1, \dots, x_m]$ and an initial polynomial $C_k$.
2. Iterate: until convergence, do:
(a) Compute $\hat Z = C_k(A) X$. [Simplest case: $\hat Z = A X$.]
(b) Orthonormalize $\hat Z$: $[Z, R_Z] = \mathrm{qr}(\hat Z, 0)$
(c) Compute $B = Z^H A Z$
(d) Compute the Schur factorization $B = Y R_B Y^H$ of $B$
(e) Compute $X := Z Y$
(f) Test for convergence. If satisfied, stop. Else select a new polynomial $C_k$ and continue.
A NumPy sketch of the simplest case is given below.
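A minimal sketch of the loop above for the simplest case $C_k(A) = A$; the residual-based stopping test and the default parameters are illustrative assumptions:

```python
import numpy as np
from scipy.linalg import qr, schur

def subspace_iteration(A, X, maxit=200, tol=1e-8):
    """Subspace iteration with Rayleigh-Ritz projection, C_k(A) = A."""
    for _ in range(maxit):
        Z = A @ X                           # (a) simplest case: Z = A X
        Z, _ = qr(Z, mode='economic')       # (b) orthonormalize
        B = Z.conj().T @ (A @ Z)            # (c) projected matrix B = Z^H A Z
        R, Y = schur(B, output='complex')   # (d) Schur factorization B = Y R_B Y^H
        X = Z @ Y                           # (e) approximate Schur vectors
        res = np.linalg.norm(A @ X - X @ R) # (f) crude convergence test
        if res <= tol * np.linalg.norm(R):
            break
    return np.diag(R), X
```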
THEOREM: Let $S_0 = \mathrm{span}\{x_1, x_2, \dots, x_m\}$ and assume that $S_0$ is such that the vectors $\{P x_i\}_{i=1,\dots,m}$ are linearly independent, where $P$ is the spectral projector associated with $\lambda_1, \dots, \lambda_m$. Let $P_k$ be the orthogonal projector onto the subspace $S_k = \mathrm{span}\{X_k\}$. Then for each eigenvector $u_i$ of $A$, $i = 1, \dots, m$, there exists a unique vector $s_i$ in the subspace $S_0$ such that $P s_i = u_i$. Moreover, the following inequality is satisfied:
\[
\|(I - P_k) u_i\|_2 \le \|u_i - s_i\|_2 \left( \left|\frac{\lambda_{m+1}}{\lambda_i}\right| + \epsilon_k \right)^{k}, \tag{1}
\]
where $\epsilon_k$ tends to zero as $k$ tends to infinity.
KRYLOV SUBSPACE METHODS
Krylov subspace methods
Principle: projection methods on Krylov subspaces:
\[
K_m(A, v_1) = \mathrm{span}\{v_1, A v_1, \dots, A^{m-1} v_1\}
\]
- The most important class of projection methods [for linear systems and for eigenvalue problems].
- Variants depend on the subspace $L$.
Let $\mu$ = degree of the minimal polynomial of $v_1$. Then:
- $K_m = \{p(A) v_1 \mid p = \text{polynomial of degree} \le m-1\}$
- $K_m = K_\mu$ for all $m \ge \mu$. Moreover, $K_\mu$ is invariant under $A$.
- $\dim(K_m) = m$ iff $\mu \ge m$.
Arnoldi's algorithm
Goal: to compute an orthogonal basis of $K_m$.
Input: initial vector $v_1$ with $\|v_1\|_2 = 1$, and $m$.
ALGORITHM 1: Arnoldi's procedure
For $j = 1, \dots, m$ do:
   Compute $w := A v_j$
   For $i = 1, \dots, j$ do:
      $h_{i,j} := (w, v_i)$
      $w := w - h_{i,j} v_i$
   End
   $h_{j+1,j} = \|w\|_2$; $v_{j+1} = w / h_{j+1,j}$
End
Based on the Gram-Schmidt procedure (a NumPy sketch follows).
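A minimal sketch of the procedure in NumPy; the early return on breakdown ($h_{j+1,j} = 0$) is an illustrative choice:

```python
import numpy as np

def arnoldi(A, v1, m):
    """Arnoldi: orthonormal basis V of K_m(A, v1) and Hessenberg matrix H_bar.

    Returns V (n x (m+1)) and H_bar ((m+1) x m) with A @ V[:, :m] = V @ H_bar.
    """
    n = v1.shape[0]
    V = np.zeros((n, m + 1), dtype=complex)
    H = np.zeros((m + 1, m), dtype=complex)
    V[:, 0] = v1 / np.linalg.norm(v1)
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):              # Gram-Schmidt loop
            H[i, j] = np.vdot(V[:, i], w)   # h_{i,j} = (w, v_i)
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] == 0:                # breakdown: K_j is invariant
            return V[:, :j + 1], H[:j + 1, :j]
        V[:, j + 1] = w / H[j + 1, j]
    return V, H
```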
Result of Arnoldi's algorithm
Let $\bar H_m$ denote the $(m+1) \times m$ Hessenberg matrix (shown for $m = 5$) and $H_m$ the $m \times m$ matrix obtained by deleting its last row:
\[
\bar H_m = \begin{pmatrix}
x & x & x & x & x\\
x & x & x & x & x\\
  & x & x & x & x\\
  &   & x & x & x\\
  &   &   & x & x\\
  &   &   &   & x
\end{pmatrix},
\qquad
H_m = \begin{pmatrix}
x & x & x & x & x\\
x & x & x & x & x\\
  & x & x & x & x\\
  &   & x & x & x\\
  &   &   & x & x
\end{pmatrix}
\]
Results:
1. $V_m = [v_1, v_2, \dots, v_m]$ is an orthonormal basis of $K_m$.
2. $A V_m = V_{m+1} \bar H_m = V_m H_m + h_{m+1,m} v_{m+1} e_m^T$
3. $V_m^T A V_m = H_m$ ($= \bar H_m$ minus its last row).
Application to eigenvalue problems
Write the approximate eigenvector as $\tilde u = V_m y$. The Galerkin condition
\[
(A - \tilde\lambda I) V_m y \perp K_m \quad\Longleftrightarrow\quad V_m^H (A - \tilde\lambda I) V_m y = 0
\]
shows that the approximate eigenvalues are the eigenvalues of $H_m$:
\[
H_m y_j = \tilde\lambda_j y_j,
\]
with associated approximate eigenvectors $\tilde u_j = V_m y_j$.
Typically a few of the outermost eigenvalues will converge first. (A sketch of this Ritz-pair extraction follows.)
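A sketch of extracting Ritz pairs, reusing the arnoldi() sketch above; the function name is an illustrative assumption, and the residual formula $\|A\tilde u_j - \tilde\lambda_j \tilde u_j\|_2 = |h_{m+1,m}|\,|e_m^T y_j|$ holds in exact arithmetic:

```python
import numpy as np

def ritz_pairs(A, v1, m):
    """Ritz approximations to the outermost eigenpairs of A via m Arnoldi steps."""
    V, H_bar = arnoldi(A, v1, m)            # sketch defined earlier
    H = H_bar[:m, :]                        # square Hessenberg H_m
    lam, Y = np.linalg.eig(H)               # H_m y_j = lam_j y_j
    U = V[:, :m] @ Y                        # Ritz vectors u_j = V_m y_j
    res = np.abs(H_bar[m, m - 1]) * np.abs(Y[m - 1, :])  # residual norms
    return lam, U, res
```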
Hermitian case: The Lanczos Algorithm
The Hessenberg matrix becomes tridiagonal:
\[
A = A^H \text{ and } V_m^H A V_m = H_m \implies H_m = H_m^H.
\]
Denote $H_m$ by $T_m$ and $\bar H_m$ by $\bar T_m$. We can write
\[
T_m = \begin{pmatrix}
\alpha_1 & \beta_2 & & & \\
\beta_2 & \alpha_2 & \beta_3 & & \\
 & \beta_3 & \alpha_3 & \beta_4 & \\
 & & \ddots & \ddots & \ddots \\
 & & & \beta_m & \alpha_m
\end{pmatrix}
\]
Relation: $A V_m = V_{m+1} \bar T_m$.
Consequence: three-term recurrence
\[
\beta_{j+1} v_{j+1} = A v_j - \alpha_j v_j - \beta_j v_{j-1}
\]
ALGORITHM 2: Lanczos
1. Choose an initial $v_1$ with $\|v_1\|_2 = 1$; set $\beta_1 \equiv 0$, $v_0 \equiv 0$
2. For $j = 1, 2, \dots, m$ do:
3.    $w_j := A v_j - \beta_j v_{j-1}$
4.    $\alpha_j := (w_j, v_j)$
5.    $w_j := w_j - \alpha_j v_j$
6.    $\beta_{j+1} := \|w_j\|_2$. If $\beta_{j+1} = 0$ then stop
7.    $v_{j+1} := w_j / \beta_{j+1}$
8. EndDo
Hermitian matrix + Arnoldi $\Rightarrow$ Hermitian Lanczos (a NumPy sketch follows).
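A minimal sketch of the recurrence for a real symmetric $A$; the returned layout of alpha and beta is an illustrative choice:

```python
import numpy as np

def lanczos(A, v1, m):
    """Hermitian (here: real symmetric) Lanczos via the three-term recurrence.

    Returns V (n x (m+1)), alpha (diagonal of T_m) and beta (subdiagonal
    entries, with beta[m-1] playing the role of h_{m+1,m}).
    """
    n = v1.shape[0]
    V = np.zeros((n, m + 1))
    alpha, beta = np.zeros(m), np.zeros(m)
    V[:, 0] = v1 / np.linalg.norm(v1)
    v_prev, b = np.zeros(n), 0.0
    for j in range(m):
        w = A @ V[:, j] - b * v_prev        # w_j := A v_j - beta_j v_{j-1}
        alpha[j] = np.dot(w, V[:, j])       # alpha_j := (w_j, v_j)
        w = w - alpha[j] * V[:, j]
        b = np.linalg.norm(w)               # beta_{j+1}
        beta[j] = b
        if b == 0.0:                        # invariant subspace: stop
            return V[:, :j + 1], alpha[:j + 1], beta[:j + 1]
        v_prev = V[:, j]
        V[:, j + 1] = w / b
    return V, alpha, beta
```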
In theory the $v_i$'s defined by the three-term recurrence are orthogonal. However, in practice there is severe loss of orthogonality.
Observation [Paige, 1981]: loss of orthogonality starts suddenly, when the first eigenpair has converged. It is a sign of loss of linear independence of the computed eigenvectors. When orthogonality is lost, several copies of the same eigenvalue start appearing. (The small experiment below illustrates this.)
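A small numerical experiment, reusing the lanczos() sketch above, that tracks $\|V^T V - I\|$ as the basis grows; the test matrix and sizes are hypothetical, chosen so that the dominant eigenpair converges early:

```python
import numpy as np

np.random.seed(0)
n = 500
d = np.arange(1.0, n + 1)
d[-1] = 2.0 * n                     # well-separated largest eigenvalue
A = np.diag(d)                      # symmetric test matrix
v1 = np.random.randn(n)

for m in (10, 20, 40, 80):
    V, alpha, beta = lanczos(A, v1, m)
    loss = np.linalg.norm(V.T @ V - np.eye(V.shape[1]))
    print(m, loss)                  # typically jumps once the first pair converges
```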
Reorthogonalization
- Full reorthogonalization: reorthogonalize $v_{j+1}$ against all previous $v_i$'s every time.
- Partial reorthogonalization: reorthogonalize $v_{j+1}$ against all previous $v_i$'s only when needed [Parlett & Simon].
- Selective reorthogonalization: reorthogonalize $v_{j+1}$ against computed eigenvectors [Parlett & Scott].
- No reorthogonalization: do not reorthogonalize, but take measures to deal with spurious eigenvalues [Cullum & Willoughby].
(A sketch of the full-reorthogonalization variant follows.)
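A minimal sketch of the first option, full reorthogonalization; performing the Gram-Schmidt sweep twice is a common safeguard, not something prescribed by the slides:

```python
import numpy as np

def lanczos_full_reorth(A, v1, m):
    """Lanczos where each new w is explicitly reorthogonalized against
    all previously computed basis vectors."""
    n = v1.shape[0]
    V = np.zeros((n, m + 1))
    alpha, beta = np.zeros(m), np.zeros(m)
    V[:, 0] = v1 / np.linalg.norm(v1)
    for j in range(m):
        w = A @ V[:, j] - (beta[j - 1] * V[:, j - 1] if j > 0 else 0.0)
        alpha[j] = np.dot(w, V[:, j])
        w = w - alpha[j] * V[:, j]
        for _ in range(2):                  # full reorthogonalization sweep (twice)
            w = w - V[:, :j + 1] @ (V[:, :j + 1].T @ w)
        beta[j] = np.linalg.norm(w)
        if beta[j] == 0.0:
            return V[:, :j + 1], alpha[:j + 1], beta[:j + 1]
        V[:, j + 1] = w / beta[j]
    return V, alpha, beta
```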
Lanczos Bidiagonalization
We now deal with rectangular matrices. Let $A \in \mathbb{R}^{m \times n}$.
ALGORITHM 3: Golub-Kahan-Lanczos
1. Choose an initial $v_1$ with $\|v_1\|_2 = 1$; set $\beta_0 \equiv 0$, $u_0 \equiv 0$
2. For $k = 1, \dots, p$ do:
3.    $\hat u := A v_k - \beta_{k-1} u_{k-1}$
4.    $\alpha_k = \|\hat u\|_2$; $u_k = \hat u / \alpha_k$
5.    $\hat v = A^T u_k - \alpha_k v_k$
6.    $\beta_k = \|\hat v\|_2$; $v_{k+1} := \hat v / \beta_k$
7. EndDo
Let:
$V_{p+1} = [v_1, v_2, \dots, v_{p+1}] \in \mathbb{R}^{n \times (p+1)}$
$U_p = [u_1, u_2, \dots, u_p] \in \mathbb{R}^{m \times p}$
Let:
\[
B_p = \begin{pmatrix}
\alpha_1 & \beta_1 & & & \\
 & \alpha_2 & \beta_2 & & \\
 & & \ddots & \ddots & \\
 & & & \alpha_p & \beta_p
\end{pmatrix} \in \mathbb{R}^{p \times (p+1)};
\qquad \hat B_p = B_p(:, 1{:}p);
\qquad V_p = [v_1, v_2, \dots, v_p] \in \mathbb{R}^{n \times p}
\]
Result:
- $V_{p+1}^T V_{p+1} = I$
- $U_p^T U_p = I$
- $A V_p = U_p \hat B_p$
- $A^T U_p = V_{p+1} B_p^T$
(A NumPy sketch of the procedure follows.)
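A minimal sketch of the algorithm for a real matrix, without reorthogonalization or breakdown handling; storing the bidiagonal $B_p$ as a dense $p \times (p+1)$ array is an illustrative choice:

```python
import numpy as np

def gkl_bidiag(A, v1, p):
    """Golub-Kahan-Lanczos bidiagonalization.

    Returns U (m x p), V (n x (p+1)), B (p x (p+1), upper bidiagonal) with
    A @ V[:, :p] = U @ B[:, :p] and A.T @ U = V @ B.T in exact arithmetic.
    """
    m, n = A.shape
    U, V = np.zeros((m, p)), np.zeros((n, p + 1))
    B = np.zeros((p, p + 1))
    V[:, 0] = v1 / np.linalg.norm(v1)
    beta, u_prev = 0.0, np.zeros(m)
    for k in range(p):
        u_hat = A @ V[:, k] - beta * u_prev       # A v_k - beta_{k-1} u_{k-1}
        alpha = np.linalg.norm(u_hat)
        U[:, k] = u_hat / alpha
        v_hat = A.T @ U[:, k] - alpha * V[:, k]   # A^T u_k - alpha_k v_k
        beta = np.linalg.norm(v_hat)
        V[:, k + 1] = v_hat / beta
        B[k, k], B[k, k + 1] = alpha, beta
        u_prev = U[:, k]
    return U, V, B
```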
Observe that:
\[
A^T (A V_p) = A^T (U_p \hat B_p) = V_{p+1} B_p^T \hat B_p
\]
$B_p^T \hat B_p$ is a tridiagonal matrix of size $(p+1) \times p$ whose first $p$ rows form a symmetric tridiagonal matrix. Call this matrix $T_p$. Then:
\[
(A^T A) V_p = V_{p+1} T_p
\]
The standard Lanczos relation! The algorithm is equivalent to standard Lanczos applied to $A^T A$. A similar result holds for the $u_i$'s [it involves $A A^T$].
Work out the details: what are the entries of $T_p$ relative to those of $B_p$? (A numerical check of the relation is sketched below.)
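A quick numerical check of the Lanczos relation above, reusing the gkl_bidiag() sketch; the sizes and random test data are hypothetical:

```python
import numpy as np

np.random.seed(1)
m, n, p = 60, 40, 10
A = np.random.randn(m, n)
v1 = np.random.randn(n)

U, V, B = gkl_bidiag(A, v1, p)
B_hat = B[:, :p]                   # B_p(:, 1:p)
T = B.T @ B_hat                    # T_p = B_p^T B_hat_p, size (p+1) x p

# Standard Lanczos relation for A^T A; the norm should be at roundoff level.
print(np.linalg.norm(A.T @ (A @ V[:, :p]) - V @ T))
```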