Solution of eigenvalue problems. Subspace iteration, the symmetric Lanczos algorithm. Harmonic Ritz values, Jacobi-Davidson's method


Solution of eigenvalue problems
Introduction, motivation
Projection methods for eigenvalue problems
Subspace iteration, the symmetric Lanczos algorithm
Nonsymmetric Lanczos procedure; implicit restarts
Harmonic Ritz values, Jacobi-Davidson's method

Origins of Eigenvalue Problems
Structural engineering [Ku = λMu]
Electronic structure calculations [Schrödinger equation, ...]
Stability analysis [e.g., electrical networks, mechanical systems, ...]
Bifurcation analysis [e.g., in fluid flow]
Large sparse eigenvalue problems are among the most demanding calculations (in terms of CPU time) in scientific computing.

New application in information technology
Search engines (Google) rank web sites in order to improve searches.
The Google toolbar on some browsers (http://toolbar.google.com) gives a measure of relevance of a page.
The problem can be formulated as a Markov chain: seek the dominant eigenvector.
Algorithm used: the power method.
For details see: http://www.iprcom.com/papers/pagerank/index.html
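As a concrete illustration of the last point, here is a minimal sketch of the power method on a small column-stochastic matrix (the Markov-chain setting above). The helper name power_method and the 3x3 matrix P are illustrative, not part of the original notes.

import numpy as np

def power_method(A, tol=1e-10, maxit=1000):
    # Dominant eigenpair of A by the power method (one mat-vec per step).
    v = np.ones(A.shape[0]) / np.sqrt(A.shape[0])
    lam = 0.0
    for _ in range(maxit):
        w = A @ v
        v = w / np.linalg.norm(w)
        lam = v @ (A @ v)                       # Rayleigh quotient estimate
        if np.linalg.norm(A @ v - lam * v) < tol:
            break
    return lam, v

# Column-stochastic transition matrix: the dominant eigenvalue is 1 and the
# dominant eigenvector gives the stationary (PageRank-like) distribution.
P = np.array([[0.1, 0.5, 0.4],
              [0.3, 0.2, 0.3],
              [0.6, 0.3, 0.3]])
lam, v = power_method(P)
print(lam, v / v.sum())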

The Problem
We consider the eigenvalue problem Ax = λx or Ax = λBx.
Typically: B is symmetric (semi) positive definite, A is symmetric or nonsymmetric.
Requirements vary:
Compute a few λ_i's with smallest or largest real parts;
Compute all λ_i's in a certain region of C;
Compute a few of the dominant eigenvalues;
Compute all λ_i's.

Types of problems
* Standard Hermitian (or symmetric real): Ax = λx, A^H = A
* Standard non-Hermitian: Ax = λx, A^H ≠ A
* Generalized: Ax = λBx. Several distinct sub-cases (B SPD, B SSPD, B singular with large null space, both A and B singular, etc.)
* Quadratic: (A + λB + λ^2 C)x = 0
* Nonlinear: A(λ)x = 0

General Tools for Solving Large Eigen-Problems
Projection techniques: Arnoldi, Lanczos, subspace iteration;
Preconditionings: shift-and-invert, polynomials, ...
Deflation and restarting techniques.
Good computational codes combine these 3 ingredients.

A few popular solution methods
Subspace iteration [now less popular; sometimes used for validation]
Arnoldi's method (or Lanczos) with polynomial acceleration [Stiefel '58, Rutishauser '62, YS '84, '85, Sorensen '89, ...]
Shift-and-invert and other preconditioners [use Arnoldi or Lanczos for (A − σI)^{-1}]
Davidson's method and variants, generalized Davidson's method [Morgan and Scott '89], Jacobi-Davidson

Projection Methods for Eigenvalue Problems
General formulation: projection method onto K orthogonal to L.
Given: two subspaces K and L of the same dimension.
Find: λ̃, ũ such that λ̃ ∈ C, ũ ∈ K and (λ̃I − A)ũ ⊥ L.
Two types of methods:
Orthogonal projection methods: the situation when L = K.
Oblique projection methods: when L ≠ K.

Rayleigh-Ritz projection
Given: a subspace X known to contain good approximations to eigenvectors of A.
Question: how to extract good approximations to eigenvalues/eigenvectors from this subspace?
Answer: the Rayleigh-Ritz process. Let Q = [q_1, ..., q_m] be an orthonormal basis of X. Write an approximation in the form ũ = Qy and obtain y from
Q^H (A − λ̃I) ũ = 0  ⟹  Q^H A Q y = λ̃ y

Procedure:
1. Obtain an orthonormal basis Q of X
2. Compute C = Q^H A Q (an m × m matrix)
3. Obtain the Schur factorization of C, C = Y R Y^H
4. Compute Ũ = Q Y
Property: if X is (exactly) invariant, then the procedure yields exact eigenvalues and eigenvectors.
Proof: Since X is invariant, (A − λ̃I)ũ = Qz for a certain z. The Galerkin condition gives Q^H Qz = 0, which implies z = 0 and therefore (A − λ̃I)ũ = 0.
Can use this procedure in conjunction with the subspace obtained from the subspace iteration algorithm.
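A minimal sketch of this procedure for a symmetric A, so the Schur factorization of C reduces to a symmetric eigendecomposition; the function name rayleigh_ritz and the random test matrix are illustrative assumptions.

import numpy as np

def rayleigh_ritz(A, X):
    # Steps 1-4 above: orthonormal basis, projected matrix, its eigen/Schur
    # decomposition, and the Ritz vectors U = Q Y.
    Q, _ = np.linalg.qr(X)
    C = Q.T @ A @ Q
    theta, Y = np.linalg.eigh(C)        # symmetric case: Schur form is diagonal
    return theta, Q @ Y

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50)); A = (A + A.T) / 2
theta, U = rayleigh_ritz(A, rng.standard_normal((50, 5)))
print(np.linalg.norm(A @ U - U @ np.diag(theta)))   # Ritz residuals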

Subspace Iteration
Original idea: projection technique onto a subspace of the form Y = A^k X.
In practice: replace A^k by a suitable polynomial [Chebyshev].
Advantages: easy to implement (in the symmetric case); easy to analyze.
Disadvantage: slow.
Often used with polynomial acceleration: A^k X replaced by C_k(A) X. Typically C_k = Chebyshev polynomial.

Algorithm: Subspace Iteration with Projection
1. Start: Choose an initial system of vectors X = [x_0, ..., x_m] and an initial polynomial C_k.
2. Iterate: Until convergence do:
(a) Compute Ẑ = C_k(A) X_old.
(b) Orthonormalize Ẑ into Z.
(c) Compute B = Z^H A Z and use the QR algorithm to compute the Schur vectors Y = [y_1, ..., y_m] of B.
(d) Compute X_new = Z Y.
(e) Test for convergence. If satisfied, stop. Else select a new polynomial C_k and continue.
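A minimal sketch of the algorithm above for a symmetric matrix, with C_k(A) replaced by the plain power A^k (a Chebyshev filter would be substituted in step (a)); all names, sizes and parameters here are illustrative.

import numpy as np

def subspace_iteration(A, m=4, k=5, iters=100):
    n = A.shape[0]
    X, _ = np.linalg.qr(np.random.default_rng(0).standard_normal((n, m)))
    for _ in range(iters):
        Z = X
        for _ in range(k):                 # (a) Z = A^k X (stand-in for C_k(A) X)
            Z = A @ Z
        Z, _ = np.linalg.qr(Z)             # (b) orthonormalize
        B = Z.T @ A @ Z                    # (c) projected matrix
        theta, Y = np.linalg.eigh(B)       #     symmetric case: eigenvectors = Schur vectors
        X = Z @ Y                          # (d) new basis
    return theta, X

A = np.diag(np.arange(1.0, 101.0))         # toy matrix, eigenvalues 1..100
theta, X = subspace_iteration(A)
print(theta)                               # approximations to the 4 largest eigenvalues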

THEOREM: Let S_0 = span{x_1, x_2, ..., x_m} and assume that S_0 is such that the vectors {P x_i}_{i=1,...,m} are linearly independent, where P is the spectral projector associated with λ_1, ..., λ_m. Let P_k be the orthogonal projector onto the subspace S_k = span{X_k}. Then for each eigenvector u_i of A, i = 1, ..., m, there exists a unique vector s_i in the subspace S_0 such that P s_i = u_i. Moreover, the following inequality is satisfied:
‖(I − P_k) u_i‖_2 ≤ ‖u_i − s_i‖_2 [ (|λ_{m+1}| / |λ_i|)^k + ε_k ],   (1)
where ε_k tends to zero as k tends to infinity.

Krylov subspace methods
Principle: projection methods on Krylov subspaces, i.e., on
K_m(A, v_1) = span{v_1, A v_1, ..., A^{m−1} v_1}
Probably the most important class of projection methods [for linear systems and for eigenvalue problems].
Many variants exist depending on the subspace L.
Properties of K_m. Let μ = degree of the minimal polynomial of v_1. Then:
K_m = {p(A) v_1 | p = polynomial of degree ≤ m − 1}
K_m = K_μ for all m ≥ μ. Moreover, K_μ is invariant under A.
dim(K_m) = m iff μ ≥ m.

Arnoldi's Algorithm
Goal: to compute an orthogonal basis of K_m.
Input: initial vector v_1 with ‖v_1‖_2 = 1, and m.
ALGORITHM 1: Arnoldi's procedure
For j = 1, ..., m do
  Compute w := A v_j
  For i = 1, ..., j do
    h_{i,j} := (w, v_i)
    w := w − h_{i,j} v_i
  EndFor
  h_{j+1,j} := ‖w‖_2;  v_{j+1} := w / h_{j+1,j}
End
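The same procedure as a NumPy sketch (modified Gram-Schmidt, no breakdown handling); the function name arnoldi is an assumption and is reused in later sketches.

import numpy as np

def arnoldi(A, v1, m):
    # Returns V (n x (m+1), orthonormal columns) and Hbar ((m+1) x m, upper
    # Hessenberg) such that A V[:, :m] = V @ Hbar.
    n = A.shape[0]
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = v1 / np.linalg.norm(v1)
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):
            H[i, j] = w @ V[:, i]          # h_{i,j} = (w, v_i)
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)    # h_{j+1,j} = ||w||_2
        V[:, j + 1] = w / H[j + 1, j]
    return V, H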

Result of Arnoldi's algorithm
Let H̄_m denote the (m+1) × m upper Hessenberg matrix with entries h_{i,j}, and let H_m = H̄_m(1:m, 1:m). Then:
1. V_m = [v_1, v_2, ..., v_m] is an orthonormal basis of K_m.
2. A V_m = V_{m+1} H̄_m = V_m H_m + h_{m+1,m} v_{m+1} e_m^T
3. V_m^T A V_m = H_m = H̄_m minus its last row.

Application to eigenvalue problems
Write the approximate eigenvector as ũ = V_m y, with the Galerkin condition
(A − λ̃I) V_m y ⊥ K_m  ⟹  V_m^H (A − λ̃I) V_m y = 0
Approximate eigenvalues are eigenvalues of H_m:  H_m y_j = λ̃_j y_j.
Associated approximate eigenvectors are ũ_j = V_m y_j.
Typically a few of the outermost eigenvalues will converge first.
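Extracting Ritz pairs from an m-step Arnoldi factorization, reusing the arnoldi sketch above; the test matrix and dimensions are arbitrary.

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((200, 200))
m = 40
V, Hbar = arnoldi(A, rng.standard_normal(200), m)
Hm = Hbar[:m, :m]                          # square Hessenberg matrix H_m
mu, Y = np.linalg.eig(Hm)                  # H_m y_j = mu_j y_j
U = V[:, :m] @ Y                           # Ritz vectors u_j = V_m y_j
j = np.argmax(mu.real)                     # rightmost (outermost) Ritz pair
print(mu[j], np.linalg.norm(A @ U[:, j] - mu[j] * U[:, j]))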

Restarted Arnoldi
In practice: the memory requirement of the algorithm implies that restarting is necessary.
Restarted Arnoldi for computing the rightmost eigenpair:
ALGORITHM 2: Restarted Arnoldi
1. Start: Choose an initial vector v_1 and a dimension m.
2. Iterate: Perform m steps of Arnoldi's algorithm.
3. Restart: Compute the approximate eigenvector u_1^{(m)} associated with the rightmost eigenvalue λ_1^{(m)}.
4. If satisfied stop, else set v_1 := u_1^{(m)} and go to 2.
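A sketch of Algorithm 2 built on the arnoldi helper above, assuming the rightmost eigenvalue is real (as in the Markov-chain example that follows); names and tolerances are illustrative.

import numpy as np

def restarted_arnoldi(A, m=10, restarts=50, tol=1e-8):
    v1 = np.random.default_rng(0).standard_normal(A.shape[0])
    for _ in range(restarts):
        V, Hbar = arnoldi(A, v1, m)              # step 2: m Arnoldi steps
        mu, Y = np.linalg.eig(Hbar[:m, :m])
        j = np.argmax(mu.real)                   # step 3: rightmost Ritz value
        lam = mu[j].real
        u = (V[:, :m] @ Y[:, j]).real            # assumes the target eigenpair is real
        u /= np.linalg.norm(u)
        if np.linalg.norm(A @ u - lam * u) < tol:
            break                                # step 4: satisfied, stop
        v1 = u                                   # else restart from the Ritz vector
    return lam, u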

Example: Small Markov chain matrix [Mark(10), dimension = 55]. Restarted Arnoldi procedure for computing the eigenvector associated with the eigenvalue with algebraically largest real part. We use m = 10.

m    Re(λ)              Im(λ)   Res. Norm
10   0.9987435899D+00   0.0     0.246D-01
20   0.9999523324D+00   0.0     0.144D-02
30   0.1000000368D+01   0.0     0.221D-04
40   0.1000000025D+01   0.0     0.508D-06
50   0.9999999996D+00   0.0     0.138D-07

Restarted Arnoldi (cont.)
Can be generalized to more than *one* eigenvector:
v_1^{(new)} = Σ_{i=1}^{p} ρ_i u_i^{(m)}
However: often does not work well (hard to find good coefficients ρ_i).
Alternative: compute eigenvectors (actually Schur vectors) one at a time. Implicit deflation.

Deflation
Very useful in practice. Different forms: locking (subspace iteration), selective orthogonalization (Lanczos), Schur deflation, ...
A little background. Consider the Schur canonical form A = U R U^H, where U is unitary and R is (complex) upper triangular. The columns u_1, ..., u_n of U are called Schur vectors.
Note: Schur vectors depend on each other, and on the order of the eigenvalues.

Wielandt Deflation: Assume we have computed a right eigenpair λ_1, u_1. Wielandt deflation considers the eigenvalues of
A_1 = A − σ u_1 v^H
Note: Λ(A_1) = {λ_1 − σ, λ_2, ..., λ_n}
Wielandt deflation preserves u_1 as an eigenvector as well as all the left eigenvectors not associated with λ_1.
An interesting choice for v is to take simply v = u_1. In this case Wielandt deflation preserves Schur vectors as well.
Can apply the above procedure successively.

ALGORITHM 3: Explicit Deflation
1. A_0 = A
2. For j = 0, ..., μ − 1 Do:
3.   Compute a dominant eigenvector u_j of A_j
4.   Define A_{j+1} = A_j − σ_j u_j u_j^H
5. End
The computed u_1, u_2, ... form a set of Schur vectors for A.
Alternative: implicit deflation (within a procedure such as Arnoldi).
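A sketch of explicit deflation for a symmetric A, with v = u_j (so the Schur vectors, here just eigenvectors, are preserved) and σ_j taken as the computed eigenvalue; it reuses the hypothetical power_method helper from the PageRank example.

import numpy as np

def explicit_deflation(A, nev=3):
    # Compute nev dominant eigenpairs one at a time by Wielandt deflation.
    Aj = A.copy()
    lams, U = [], []
    for _ in range(nev):
        lam, u = power_method(Aj)            # dominant eigenpair of A_j
        lams.append(lam); U.append(u)
        Aj = Aj - lam * np.outer(u, u)       # A_{j+1} = A_j - sigma_j u_j u_j^H, sigma_j = lam
    return np.array(lams), np.column_stack(U)

A = np.diag([10.0, 7.0, 4.0, 1.0]) + 0.01 * np.ones((4, 4))   # symmetric test matrix
print(explicit_deflation(A)[0])              # the three dominant eigenvalues, one at a time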

Deflated Arnoldi
When the first eigenvector converges, put it in the 1st column of V_m = [v_1, v_2, ..., v_m]. Arnoldi now starts at column 2, still orthogonalizing against v_1, ..., v_j at step j. Accumulate each new converged eigenvector in columns 2, 3, ... [the "locked" set of eigenvectors].
Thus, for k = 2: V_m = [v_1 | v_2, v_3, ..., v_m], with v_1 locked and v_2, ..., v_m active; H_m has the corresponding block structure, with its locked part no longer modified.

Similar techniques in subspace iteration [G. Stewart's SRRIT].
Example: matrix Mark(10), a small Markov chain matrix (N = 55). First eigenpair by iterative Arnoldi with m = 10.

m    Re(λ)              Im(λ)   Res. Norm
10   0.9987435899D+00   0.0     0.246D-01
20   0.9999523324D+00   0.0     0.144D-02
30   0.1000000368D+01   0.0     0.221D-04
40   0.1000000025D+01   0.0     0.508D-06
50   0.9999999996D+00   0.0     0.138D-07

Computing the next 2 eigenvalues of Mark(10).

Eig.  Mat-Vec's  Re(λ)          Im(λ)  Res. Norm
2     60         0.9370509474   0.0    0.870D-03
      69         0.9371549617   0.0    0.175D-04
      78         0.9371501442   0.0    0.313D-06
      87         0.9371501564   0.0    0.490D-08
3     96         0.8112247133   0.0    0.210D-02
      104        0.8097553450   0.0    0.538D-03
      112        0.8096419483   0.0    0.874D-04
      ...        ...            ...    ...
      152        0.8095717167   0.0    0.444D-07

Hermitian case: The Lanczos Algorithm
The Hessenberg matrix becomes tridiagonal: A = A^H and V_m^H A V_m = H_m with H_m = H_m^H. We can write
H_m = the symmetric tridiagonal matrix with diagonal entries α_1, α_2, ..., α_m and off-diagonal (sub- and super-diagonal) entries β_2, β_3, ..., β_m.   (2)
Consequence: the three-term recurrence
β_{j+1} v_{j+1} = A v_j − α_j v_j − β_j v_{j−1}

ALGORITHM 4: Lanczos
1. Choose v_1 of unit norm. Set β_1 ≡ 0, v_0 ≡ 0
2. For j = 1, 2, ..., m Do:
3.   w_j := A v_j − β_j v_{j−1}
4.   α_j := (w_j, v_j)
5.   w_j := w_j − α_j v_j
6.   β_{j+1} := ‖w_j‖_2. If β_{j+1} = 0 then Stop
7.   v_{j+1} := w_j / β_{j+1}
8. EndDo
Hermitian matrix + Arnoldi ⟹ Hermitian Lanczos.
In theory the v_i's defined by the 3-term recurrence are orthogonal. However, in practice there is severe loss of orthogonality.

Lanczos with reorthogonalization
Observation [Paige, 1981]: Loss of orthogonality starts suddenly, when the first eigenpair converges. It indicates loss of linear independence of the v_i's. When orthogonality is lost, several copies of the same eigenvalue start appearing.
Full reorthogonalization: reorthogonalize v_{j+1} against all previous v_i's every time.
Partial reorthogonalization: reorthogonalize v_{j+1} against all previous v_i's only when needed [Parlett & Simon].
Selective reorthogonalization: reorthogonalize v_{j+1} against computed eigenvectors [Parlett & Scott].
No reorthogonalization: do not reorthogonalize, but take measures to deal with spurious eigenvalues [Cullum & Willoughby].
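A sketch of Algorithm 4 with optional full reorthogonalization; the reorth flag and function name are assumptions, and the flag simply shows where the extra orthogonalization step goes.

import numpy as np

def lanczos(A, v1, m, reorth=True):
    # Hermitian Lanczos: returns V_m and the tridiagonal entries alpha (diagonal)
    # and beta (off-diagonal), so that T_m = tridiag(beta, alpha, beta).
    n = A.shape[0]
    V = np.zeros((n, m + 1))
    alpha, beta = np.zeros(m), np.zeros(m + 1)
    V[:, 0] = v1 / np.linalg.norm(v1)
    for j in range(m):
        w = A @ V[:, j]
        if j > 0:
            w = w - beta[j] * V[:, j - 1]
        alpha[j] = w @ V[:, j]
        w = w - alpha[j] * V[:, j]
        if reorth:                             # full reorthogonalization against all v_i
            w = w - V[:, :j + 1] @ (V[:, :j + 1].T @ w)
        beta[j + 1] = np.linalg.norm(w)
        if beta[j + 1] == 0.0:
            break
        V[:, j + 1] = w / beta[j + 1]
    return V[:, :m], alpha, beta[1:m]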

Partial reorthogonalization
Partial reorthogonalization: reorthogonalize only when deemed necessary. The main question is: when?
Uses an inexpensive recurrence relation.
Work done in the 1980s [Parlett, Simon, and co-workers] + more recent work [Larsen, '98].
Package: PROPACK [Larsen]. V 1: 2001, most recent: V 2.1 (Apr. 05).
Often, the need for reorthogonalization is not too strong.

The Lanczos Algorithm in the Hermitian Case
Assume the eigenvalues are sorted increasingly: λ_1 ≤ λ_2 ≤ ... ≤ λ_n.
Orthogonal projection method onto K_m.
To derive error bounds, use the Courant characterization:
λ̃_1 = min_{u ∈ K, u ≠ 0} (Au, u)/(u, u) = (Aũ_1, ũ_1)/(ũ_1, ũ_1)
λ̃_j = min_{u ∈ K, u ≠ 0, u ⊥ ũ_1, ..., ũ_{j−1}} (Au, u)/(u, u) = (Aũ_j, ũ_j)/(ũ_j, ũ_j)

Bounds for λ_1 are easy to find, similar to the linear systems case.
The Ritz values approximate the eigenvalues of A "inside out":
λ_1 ≤ λ̃_1 ≤ λ̃_2 ≤ ... ≤ λ̃_{m−1} ≤ λ̃_m ≤ λ_n

A-priori error bounds
Theorem [Kaniel, 1966]:
0 ≤ λ_1^{(m)} − λ_1 ≤ (λ_N − λ_1) [ tan ∠(v_1, u_1) / T_{m−1}(1 + 2γ_1) ]^2
where γ_1 = (λ_2 − λ_1)/(λ_N − λ_2) and ∠(v_1, u_1) = angle between v_1 and u_1.
+ results for other eigenvalues [Kaniel, Paige, YS]:
Theorem:
0 ≤ λ_i^{(m)} − λ_i ≤ (λ_N − λ_1) [ κ_i^{(m)} tan ∠(v_i, u_i) / T_{m−i}(1 + 2γ_i) ]^2
where γ_i = (λ_{i+1} − λ_i)/(λ_N − λ_{i+1}) and κ_i^{(m)} = Π_{j<i} (λ_j^{(m)} − λ_N)/(λ_j^{(m)} − λ_i).
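A hedged numeric check of the first bound on a toy diagonal matrix, reusing the lanczos sketch above; the spectrum, m, and starting vector are arbitrary choices.

import numpy as np
from numpy.polynomial.chebyshev import Chebyshev

lam = np.linspace(0.0, 1.0, 100)             # eigenvalues lambda_1 = 0 < ... < lambda_N = 1
A = np.diag(lam)
m = 30
rng = np.random.default_rng(0)
v1 = rng.standard_normal(lam.size); v1 /= np.linalg.norm(v1)
V, alpha, beta = lanczos(A, v1, m)
T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
lam1_m = np.linalg.eigvalsh(T)[0]            # smallest Ritz value lambda_1^{(m)}
gamma1 = (lam[1] - lam[0]) / (lam[-1] - lam[1])
cos_a = abs(v1[0])                           # u_1 = e_1 for this diagonal A
tan_a = np.sqrt(1.0 - cos_a**2) / cos_a
bound = (lam[-1] - lam[0]) * (tan_a / Chebyshev.basis(m - 1)(1.0 + 2.0 * gamma1))**2
print(lam1_m - lam[0], "<=", bound)          # Kaniel bound should dominate the actual error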

The Lanczos biorthogonalization (A^H ≠ A)
ALGORITHM 5: Lanczos bi-orthogonalization
1. Choose two vectors v_1, w_1 such that (v_1, w_1) = 1.
2. Set β_1 = δ_1 ≡ 0, w_0 = v_0 ≡ 0
3. For j = 1, 2, ..., m Do:
4.   α_j = (A v_j, w_j)
5.   v̂_{j+1} = A v_j − α_j v_j − β_j v_{j−1}
6.   ŵ_{j+1} = A^T w_j − α_j w_j − δ_j w_{j−1}
7.   δ_{j+1} = |(v̂_{j+1}, ŵ_{j+1})|^{1/2}. If δ_{j+1} = 0 Stop
8.   β_{j+1} = (v̂_{j+1}, ŵ_{j+1}) / δ_{j+1}
9.   w_{j+1} = ŵ_{j+1} / β_{j+1}
10.  v_{j+1} = v̂_{j+1} / δ_{j+1}
11. EndDo

Builds a pair of biorthogonal bases for the two subspaces K_m(A, v_1) and K_m(A^H, w_1).
Many choices for δ_{j+1}, β_{j+1} in lines 7 and 8. Only constraint: δ_{j+1} β_{j+1} = (v̂_{j+1}, ŵ_{j+1}).
Let T_m be the tridiagonal matrix with diagonal α_1, ..., α_m, superdiagonal β_2, ..., β_m, and subdiagonal δ_2, ..., δ_m.
Then v_i ∈ K_m(A, v_1) and w_j ∈ K_m(A^T, w_1).

If the algorithm does not break down before step m, then the vectors v_i, i = 1, ..., m, and w_j, j = 1, ..., m, are biorthogonal, i.e.,
(v_j, w_i) = δ_ij,  1 ≤ i, j ≤ m.
Moreover, {v_i}_{i=1,...,m} is a basis of K_m(A, v_1), {w_i}_{i=1,...,m} is a basis of K_m(A^H, w_1), and
A V_m = V_m T_m + δ_{m+1} v_{m+1} e_m^H,
A^H W_m = W_m T_m^H + β_{m+1} w_{m+1} e_m^H,
W_m^H A V_m = T_m.
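A sketch of Algorithm 5 for a real matrix, with no look-ahead (it simply stops on breakdown); the scaling of w_1 and all names are assumptions. On exit, W^T A V should reproduce the tridiagonal T_m up to rounding.

import numpy as np

def lanczos_biortho(A, v1, w1, m):
    n = A.shape[0]
    V = np.zeros((n, m)); W = np.zeros((n, m))
    alpha = np.zeros(m); beta = np.zeros(m); delta = np.zeros(m)
    V[:, 0] = v1
    W[:, 0] = w1 / (w1 @ v1)                   # enforce (v_1, w_1) = 1
    for j in range(m):
        Av = A @ V[:, j]
        alpha[j] = Av @ W[:, j]
        vhat = Av - alpha[j] * V[:, j]
        what = A.T @ W[:, j] - alpha[j] * W[:, j]
        if j > 0:
            vhat -= beta[j - 1] * V[:, j - 1]
            what -= delta[j - 1] * W[:, j - 1]
        s = vhat @ what
        if j == m - 1 or abs(s) < 1e-14:       # last step, or (lucky/serious) breakdown
            break
        delta[j] = np.sqrt(abs(s))             # any split with delta*beta = s would do
        beta[j] = s / delta[j]
        V[:, j + 1] = vhat / delta[j]
        W[:, j + 1] = what / beta[j]
    T = np.diag(alpha) + np.diag(beta[:m - 1], 1) + np.diag(delta[:m - 1], -1)
    return V, W, T

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 100))
V, W, T = lanczos_biortho(A, rng.standard_normal(100), rng.standard_normal(100), 10)
print(np.linalg.norm(W.T @ A @ V - T))         # small if no (near-)breakdown occurred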

If θ_j is an eigenvalue of T_m with associated right and left eigenvectors y_j and z_j respectively, then the corresponding approximations for A are:
Ritz value: θ_j
Right Ritz vector: V_m y_j
Left Ritz vector: W_m z_j
[Note: the terminology is abused slightly; Ritz values and vectors normally refer to the Hermitian case.]

Advantages and disadvantages
Advantages:
Nice three-term recurrence; requires little storage in theory.
Computes left and right eigenvectors at the same time.
Disadvantages:
The algorithm can break down or nearly break down.
Convergence is not too well understood; erratic behavior.
Not easy to take advantage of the tridiagonal form of T_m.

Look-ahead Lanczos
The algorithm breaks down when (v̂_{j+1}, ŵ_{j+1}) = 0. Three distinct situations:
"Lucky" breakdown: either v̂_{j+1} or ŵ_{j+1} is zero. In this case, the eigenvalues of T_m are eigenvalues of A.
(v̂_{j+1}, ŵ_{j+1}) = 0 but v̂_{j+1} ≠ 0, ŵ_{j+1} ≠ 0: serious breakdown. Often possible to bypass the step (+ a few more) and continue the algorithm. If this is not possible, then we get an...
... incurable breakdown [very rare].

Look-ahead Lanczos algorithms deal with the second case. See Parlett '80, Freund and Nachtigal '90, ...
Main idea: when a breakdown occurs, skip the computation of v_{j+1}, w_{j+1} and define v_{j+2}, w_{j+2} from v_j, w_j, for example by orthogonalizing A^2 v_j... One can define v_{j+1} somewhat arbitrarily as v_{j+1} = A v_j, and similarly for w_{j+1}.
Drawbacks: (1) the projected problem is no longer tridiagonal; (2) it is difficult to know what constitutes a near-breakdown.

Preconditioning eigenvalue problems
Goal: to extract good approximations to add to a subspace in a projection process. Result: faster convergence.
Best-known technique: shift-and-invert; work with B = (A − σI)^{−1}.
Some success with polynomial preconditioning [Chebyshev iteration / least-squares polynomials]; work with B = p(A).
The above preconditioners preserve eigenvectors. Other methods (Davidson) use a more general preconditioner M.

Shift-and-invert preconditioning
Main idea: use Arnoldi, or Lanczos, or subspace iteration for the matrix B = (A − σI)^{−1}. The matrix B need not be computed explicitly: each time B must be applied to a vector, we solve a system with A − σI.
Factor A − σI = LU. Then each solution Bx = y requires solving Lz = y and Ux = z.
How to deal with complex shifts? If A is complex, work in complex arithmetic. If A is real, then instead of (A − σI)^{−1} use
Re[(A − σI)^{−1}] = (1/2) [ (A − σI)^{−1} + (A − σ̄I)^{−1} ]
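A sketch of the mechanism with SciPy: factor A − σI once with a sparse LU, wrap the solve as a LinearOperator, and run a Krylov eigensolver on it (here ARPACK's Lanczos via eigsh, since the toy A is symmetric); the matrix, shift and dimensions are arbitrary.

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n, sigma = 1000, 0.9
A = sp.diags(np.linspace(0.0, 2.0, n), format="csc")                 # toy spectrum in [0, 2]
lu = spla.splu((A - sigma * sp.identity(n, format="csc")).tocsc())   # factor A - sigma I = LU once
B = spla.LinearOperator((n, n), matvec=lu.solve, dtype=np.float64)   # x -> (A - sigma I)^{-1} x
theta, U = spla.eigsh(B, k=3, which="LM")                            # largest eigenvalues of B
print(np.sort(sigma + 1.0 / theta))                                  # eigenvalues of A closest to sigma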

Preconditioning by polynomials
Main idea: iterate with p(A) instead of A in Arnoldi or Lanczos, ...
Used very early on in subspace iteration [Rutishauser, 1959].
Usually not as reliable as shift-and-invert techniques, but less demanding in terms of storage.

Question: How to find a good polynomial (dynamically)?
Approaches:
1. Use of Chebyshev polynomials over ellipses
2. Use polynomials based on Leja points
3. Least-squares polynomials over polygons
4. Polynomials from previous Arnoldi decompositions
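For the symmetric case, approach 1 reduces to Chebyshev polynomials over an interval [a, b] containing the unwanted eigenvalues. A minimal sketch of applying such a filter C_k(A) to a block of vectors (usable as step (a) of the subspace iteration sketch earlier); the function name and parameters are assumptions.

import numpy as np

def cheb_filter(A, X, k, a, b):
    # Apply C_k(A) X, where C_k is the Chebyshev polynomial of degree k (k >= 1)
    # on [a, b] mapped to [-1, 1]; eigencomponents outside [a, b] are amplified.
    c, h = (a + b) / 2.0, (b - a) / 2.0
    Y_prev, Y = X, (A @ X - c * X) / h                        # degrees 0 and 1
    for _ in range(2, k + 1):
        Y_prev, Y = Y, 2.0 * ((A @ Y - c * Y) / h) - Y_prev   # three-term recurrence
    return Y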

Polynomial filters and implicit restart
Goal: exploit the Arnoldi procedure to apply a polynomial filter of the form
p(t) = (t − θ_1)(t − θ_2) ... (t − θ_q)
Assume A V_m = V_m H_m + v̂_{m+1} e_m^T and consider the first factor, (t − θ_1):
(A − θ_1 I) V_m = V_m (H_m − θ_1 I) + v̂_{m+1} e_m^T
Let H_m − θ_1 I = Q_1 R_1. Then,
(A − θ_1 I) V_m = V_m Q_1 R_1 + v̂_{m+1} e_m^T
(A − θ_1 I)(V_m Q_1) = (V_m Q_1) R_1 Q_1 + v̂_{m+1} e_m^T Q_1
A (V_m Q_1) = (V_m Q_1)(R_1 Q_1 + θ_1 I) + v̂_{m+1} e_m^T Q_1

Notation: R_1 Q_1 + θ_1 I ≡ H_m^{(1)};  V_m Q_1 ≡ V_m^{(1)};  (b_{m+1}^{(1)})^T ≡ e_m^T Q_1. Then
A V_m^{(1)} = V_m^{(1)} H_m^{(1)} + v̂_{m+1} (b_{m+1}^{(1)})^T
Note that H_m^{(1)} is upper Hessenberg: this is similar to an Arnoldi decomposition.
Observe:
R_1 Q_1 + θ_1 I is the matrix resulting from one step of the QR algorithm with shift θ_1 applied to H_m.
The first column of V_m^{(1)} is a multiple of (A − θ_1 I) v_1.
The columns of V_m^{(1)} are orthonormal.

Can now apply the second shift in the same way:
(A − θ_2 I) V_m^{(1)} = V_m^{(1)} (H_m^{(1)} − θ_2 I) + v̂_{m+1} (b_{m+1}^{(1)})^T
Similar process: factor H_m^{(1)} − θ_2 I = Q_2 R_2, then apply Q_2 to the right:
(A − θ_2 I) V_m^{(1)} Q_2 = (V_m^{(1)} Q_2)(R_2 Q_2) + v̂_{m+1} (b_{m+1}^{(1)})^T Q_2
A V_m^{(2)} = V_m^{(2)} H_m^{(2)} + v̂_{m+1} (b_{m+1}^{(2)})^T
Now: the 1st column of V_m^{(2)} = scalar × (A − θ_2 I) v_1^{(1)} = scalar × (A − θ_2 I)(A − θ_1 I) v_1

Note that
(b_{m+1}^{(2)})^T = e_m^T Q_1 Q_2 = [0, 0, ..., 0, η_1, η_2, η_3]
Let V̂_{m−2} = [v̂_1, ..., v̂_{m−2}] consist of the first m − 2 columns of V_m^{(2)}, and let Ĥ_{m−2} = H_m^{(2)}(1 : m−2, 1 : m−2). Then
A V̂_{m−2} = V̂_{m−2} Ĥ_{m−2} + β̂_{m−1} v̂_{m−1} e_{m−2}^T
with β̂_{m−1} v̂_{m−1} ≡ η_1 v̂_{m+1} + h_{m−1,m−2}^{(2)} v_{m−1}^{(2)},  ‖v̂_{m−1}‖_2 = 1.
Result: an Arnoldi process of m − 2 steps with the initial vector p(A) v_1.
In other words: we know how to apply polynomial filtering via a form of the Arnoldi process, combined with the QR algorithm.
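A hedged numeric illustration of one filtering step (the θ_1 part of the algebra above), reusing the arnoldi sketch from earlier; the matrix, m, and the choice of shift are arbitrary, and a production code would use implicit double shifts for complex conjugate pairs.

import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((300, 300))
m = 20
V, Hbar = arnoldi(A, rng.standard_normal(300), m)
Hm, vres = Hbar[:m, :m], Hbar[m, m - 1] * V[:, m]     # vres = h_{m+1,m} v_{m+1}
theta1 = np.sort_complex(np.linalg.eigvals(Hm))[0]    # an unwanted (leftmost) Ritz value as shift
Q1, R1 = np.linalg.qr(Hm - theta1 * np.eye(m))        # H_m - theta_1 I = Q_1 R_1
H1 = R1 @ Q1 + theta1 * np.eye(m)                     # one shifted-QR step applied to H_m
V1 = V[:, :m] @ Q1                                    # V_m^{(1)} = V_m Q_1 (orthonormal columns)
b1 = Q1[m - 1, :]                                     # (b_{m+1}^{(1)})^T = e_m^T Q_1
print(np.linalg.norm(A @ V1 - V1 @ H1 - np.outer(vres, b1)))   # filtered Arnoldi-like relation
# The first column of V1 is parallel to (A - theta_1 I) v_1: the filter has been applied.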