Algorithms that use the Arnoldi Basis


AMSC 600 / CMSC 760 Advanced Linear Numerical Analysis, Fall 2007
Arnoldi Methods
Dianne P. O'Leary, © 2006, 2007

Reference: Chapter 6 of Saad

Outline:
- The Arnoldi Basis
- How to use the Arnoldi Basis
- FOM (usually called the Arnoldi iteration)
- GMRES
- Givens Rotations
- Relation between FOM and GMRES iterates
- Practicalities
- Convergence Results for GMRES

The Arnoldi Basis

Recall: The Arnoldi basis $v_1, \ldots, v_n$ is obtained column by column from the relation $AV = VH$, where $H$ is upper Hessenberg and $V^T V = I$. In particular, $h_{ij} = (A v_j, v_i)$. The vectors $v_j$ can be formed by a (modified) Gram-Schmidt process. For three algorithms for computing the recurrence, see Sections 6.3.1 and 6.3.2 of Saad.
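For concreteness, here is a minimal NumPy sketch of the modified Gram-Schmidt version of the recurrence; the function name, interface, and breakdown tolerance are illustrative, not from Saad or these notes.

```python
import numpy as np

def arnoldi(A, v1, m):
    """Return V (n x (m+1)) and Hbar ((m+1) x m) with A V[:, :m] = V @ Hbar.

    Minimal modified Gram-Schmidt sketch: no reorthogonalization; on breakdown
    (h_{j+1,j} = 0) it returns the square H and the invariant basis found so far.
    """
    n = len(v1)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = v1 / np.linalg.norm(v1)
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):              # orthogonalize against v_1, ..., v_{j+1}
            H[i, j] = np.dot(V[:, i], w)    # h_{ij} = (A v_j, v_i)
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-14:             # crude breakdown test
            return V[:, :j + 1], H[:j + 1, :j + 1]
        V[:, j + 1] = w / H[j + 1, j]
    return V, H
```

With this convention, the columns of V[:, :m] span $K_m(A, v_1)$ and Hbar[:m, :m] is the square $H_m$ used below.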

Facts about the Arnoldi Basis

Let's convince ourselves of the following facts:

- The first $m$ columns of $V$ form an orthogonal basis for $K_m(A, v_1)$.
- Let $V_m$ denote the first $m$ columns of $V$, and $H_m$ denote the leading $m \times m$ block of $H$. Then
  $$A V_m = V_m H_m + h_{m+1,m} v_{m+1} e_m^T.$$
- The algorithm breaks down if $h_{m+1,m} = 0$. In this case, we cannot get a new vector $v_{m+1}$. This breakdown means that $K_{m+1}(A, v_1) = K_m(A, v_1)$; in other words, $K_m(A, v_1)$ is an invariant subspace of $A$ of dimension $m$.

How to use the Arnoldi Basis

We want to solve $Ax = b$. We choose $x_0 = 0$ and $v_1 = b/\beta$, where $\beta = \|b\|$. Let's choose $x_m \in K_m(A, v_1)$. Two choices:

- Projection: make the residual orthogonal to $K_m(A, v_1)$. (Section 6.4.1) This is the Full Orthogonalization Method (FOM).
- Minimization: minimize the norm of the residual over all choices of $x$ in $K_m(A, v_1)$. (Section 6.5) This is the Generalized Minimum Residual Method (GMRES).

FOM (usually called the Arnoldi iteration)

The Arnoldi relation: $A V_m = V_m H_m + h_{m+1,m} v_{m+1} e_m^T$.

If $x_m \in K_m(A, v_1)$, then we can express it as $x_m = V_m y_m$ for some vector $y_m$. The residual is
$$r_m = b - A x_m = \beta v_1 - A V_m y_m = \beta v_1 - (V_m H_m + h_{m+1,m} v_{m+1} e_m^T) y_m.$$

To enforce orthogonality, we want
$$0 = V_m^T r_m = \beta V_m^T v_1 - (V_m^T V_m H_m + h_{m+1,m} V_m^T v_{m+1} e_m^T) y_m = \beta e_1 - H_m y_m.$$

Therefore, given the Arnoldi relation at step $m$, we solve the linear system
$$H_m y_m = \beta e_1$$
and set $x_m = V_m y_m$. Then
$$r_m = b - A x_m = b - A V_m y_m = \beta V_m e_1 - (V_m H_m + h_{m+1,m} v_{m+1} e_m^T) y_m = V_m(\beta e_1 - H_m y_m) - h_{m+1,m} v_{m+1} e_m^T y_m = -h_{m+1,m} (y_m)_m v_{m+1}.$$
Therefore, the norm of the residual is just $|h_{m+1,m} (y_m)_m|$, which can be computed without forming $x_m$.

GMRES

The Arnoldi relation: $A V_m = V_m H_m + h_{m+1,m} v_{m+1} e_m^T$.

The iterate: $x_m = V_m y_m$ for some vector $y_m$.

Define $\bar H_m$ to be the leading $(m+1) \times m$ block of $H$.

The residual:
$$r_m = \beta v_1 - (V_m H_m + h_{m+1,m} v_{m+1} e_m^T) y_m = \beta v_1 - V_{m+1} \bar H_m y_m = V_{m+1}(\beta e_1 - \bar H_m y_m).$$

The norm of the residual:
$$\|r_m\| = \|V_{m+1}(\beta e_1 - \bar H_m y_m)\| = \|\beta e_1 - \bar H_m y_m\|.$$

We minimize this quantity with respect to $y_m$ to get the GMRES iterate.
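As a concrete (and hedged) illustration, the following sketch forms both iterates from the output of the arnoldi sketch above: FOM solves the square system $H_m y_m = \beta e_1$, and GMRES solves the small least squares problem with $\bar H_m$, here via np.linalg.lstsq as a stand-in for the Givens-based solve described next. It assumes no breakdown occurred.

```python
import numpy as np

def fom_and_gmres_iterates(A, b, m):
    """Compute the m-th FOM and GMRES iterates (sketch; assumes no breakdown)."""
    beta = np.linalg.norm(b)
    V, Hbar = arnoldi(A, b / beta, m)        # V: n x (m+1), Hbar: (m+1) x m
    Hm = Hbar[:m, :m]                        # leading m x m block
    rhs = np.zeros(m + 1)
    rhs[0] = beta                            # beta * e_1

    y_fom = np.linalg.solve(Hm, rhs[:m])     # FOM: H_m y = beta e_1
    x_fom = V[:, :m] @ y_fom
    fom_res = abs(Hbar[m, m - 1] * y_fom[-1])          # |h_{m+1,m} (y_m)_m|, no x_m needed

    y_gm, *_ = np.linalg.lstsq(Hbar, rhs, rcond=None)  # GMRES: min ||beta e_1 - Hbar y||
    x_gm = V[:, :m] @ y_gm
    gmres_res = np.linalg.norm(rhs - Hbar @ y_gm)
    return x_fom, fom_res, x_gm, gmres_res
```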

Givens Rotations

The linear system involving $H_m$ is solved using $m$ Givens rotations. The least squares problem involving $\bar H_m$ is solved using 1 additional Givens rotation. (A sketch of this solve appears after the next subsection.)

Relation between FOM and GMRES iterates

FOM: $H_m y_m^F = \beta e_1$, or $H_m^T H_m y_m^F = \beta H_m^T e_1$.

GMRES: minimize $\|\beta e_1 - \bar H_m y_m^G\|$ by solving the normal equations $\bar H_m^T \bar H_m y_m^G = \beta \bar H_m^T e_1$.

Note that
$$\bar H_m = \begin{bmatrix} H_m \\ h_{m+1,m} e_m^T \end{bmatrix},$$
so $H_m^T H_m$ and $\bar H_m^T \bar H_m$ differ only by the rank-one term $h_{m+1,m}^2\, e_m e_m^T$. Therefore (Sections 6.5.7 and 6.5.8):

- If we compute the FOM iterate, we can easily get the GMRES iterate, and vice versa.
- It can be shown that the smallest residual computed by FOM in iterations $1, \ldots, m$ is within a factor of $m$ of the smallest for GMRES.
- There is a recursion for computing the GMRES iterates given the FOM iterates:
  $$x_m^G = s_m^2 x_{m-1}^G + c_m^2 x_m^F,$$
  where $s_m^2 + c_m^2 = 1$ and $c_m$ is the cosine used in the last step of the reduction of $\bar H_m$ to upper triangular form.
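The progressive QR factorization mentioned under Givens Rotations above can be sketched as follows (real case; names are illustrative). Applying the same rotations to $\beta e_1$ makes the GMRES residual norm available as the magnitude of the last entry of the rotated right-hand side.

```python
import numpy as np

def solve_hessenberg_lsq(Hbar, beta):
    """Minimize ||beta e_1 - Hbar y|| for (m+1) x m real upper Hessenberg Hbar."""
    m = Hbar.shape[1]
    R = Hbar.copy().astype(float)
    g = np.zeros(m + 1)
    g[0] = beta
    for j in range(m):                       # one rotation per column
        r = np.hypot(R[j, j], R[j + 1, j])
        c, s = R[j, j] / r, R[j + 1, j] / r
        for k in range(j, m):                # apply rotation to rows j, j+1
            Rjk, Rj1k = R[j, k], R[j + 1, k]
            R[j, k] = c * Rjk + s * Rj1k
            R[j + 1, k] = -s * Rjk + c * Rj1k
        g[j], g[j + 1] = c * g[j] + s * g[j + 1], -s * g[j] + c * g[j + 1]
    y = np.linalg.solve(np.triu(R[:m, :m]), g[:m])
    residual_norm = abs(g[m])                # GMRES residual norm, obtained for free
    return y, residual_norm
```

In practice the rotations from step $m-1$ are reused and only one new rotation is added at step $m$; the sketch recomputes all of them for clarity.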

Practicalities

Practicality 1: GMRES and FOM can stagnate.

It is possible (but unlikely) that GMRES and FOM make no progress at all until the last iteration.

Example: Let
$$A = \begin{bmatrix} e^T & 1 \\ I & 0 \end{bmatrix}, \quad x_{\rm true} = e_n, \quad b = e_1,$$
where $e$ is the vector with every entry equal to 1. Then $K_m(A, b) = {\rm span}\{e_1, \ldots, e_m\}$, so the solution vector is orthogonal to $K_m(A, b)$ for $m < n$.

This is called complete stagnation. Partial stagnation can also occur. The usual remedy is preconditioning (discussed later).

Practicality 2: $m$ must be kept relatively small.

As $m$ gets big:

- The work per iteration becomes overwhelming, since the upper Hessenberg part of $H_m$ is generally dense. In fact, the work for the orthogonalization grows as $O(m^2 n)$.
- The orthogonality relations are increasingly contaminated by round-off error, and this can cause the computation to produce bad approximations to the true GMRES/FOM iterates.

Two remedies:

- restarting (Sections 6.4.1 and 6.5.5)
- truncating the orthogonalization (Sections 6.4.2 and 6.5.6)

Restarting

In the restarting algorithm, we run the Arnoldi iteration for a fixed number of steps, perhaps $m = 100$. If the resulting solution is not good enough, we repeat, setting $v_1$ to be the normalized residual. This limits the work per iteration, but removes the finite-termination property. In fact, due to stagnation, it is possible (but unlikely) that the iteration never makes any progress at all!
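A hedged sketch of the restart loop, GMRES(m), reusing the arnoldi and solve_hessenberg_lsq sketches above; the name, tolerance, and restart limit are illustrative, and Arnoldi breakdown is ignored (breakdown would mean the exact solution has been found).

```python
import numpy as np

def gmres_restarted(A, b, m=100, tol=1e-8, max_restarts=50):
    """GMRES(m): run m Arnoldi steps, update x, restart from the new residual.

    Sketch only: assumes the arnoldi / solve_hessenberg_lsq sketches above
    and ignores Arnoldi breakdown within a cycle.
    """
    x = np.zeros_like(b, dtype=float)          # x_0 = 0
    for _ in range(max_restarts):
        r = b - A @ x
        beta = np.linalg.norm(r)
        if beta <= tol:
            break
        V, Hbar = arnoldi(A, r / beta, m)      # restart from the normalized residual
        y, _ = solve_hessenberg_lsq(Hbar, beta)
        x = x + V[:, :m] @ y                   # correction from the current cycle
    return x
```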

Truncating the orthogonalization

In this algorithm, we modify the Arnoldi iteration so that we only orthogonalize against the latest $k$ vectors, rather than orthogonalizing against all of the old vectors. The resulting matrix $H_m^{\rm approx}$ then has only $k - 1$ nonzeros above the main diagonal in each column, and the work for orthogonalization is reduced from $O(m^2 n)$ to $O(kmn)$. (A sketch of the modified loop appears below, after Practicality 3.)

We are still working with the same Krylov subspace, but:

- The basis vectors $v_1, \ldots, v_m$ are not exactly orthogonal.
- We are approximating the matrix $H_m$, so we don't minimize the residual (exactly) or make the residual (exactly) orthogonal to the Krylov subspace.

All is not lost, though. In Section 6.5.6, Saad works out the relation between the GMRES residual norm and the one produced by the truncated method (quasi-GMRES).

Note: Quasi-GMRES is not often used. Restarted GMRES is the standard algorithm.

Practicality 3: Breakdown of the Arnoldi iteration is a good thing.

If the Arnoldi iteration breaks down, with $h_{m+1,m} = 0$, so that there is no new vector $v_{m+1}$, then the last row of $\bar H_m$ is zero, so both GMRES and FOM compute the same iterate. More importantly, Arnoldi breaks down if and only if $x_m^G = x_m^F = x_{\rm true}$. This follows from the fact that the norm of the FOM residual is $|h_{m+1,m} (y_m)_m|$.
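As promised above, here is a sketch of the truncated (incomplete) orthogonalization. The only change from the arnoldi sketch is the lower limit of the inner loop; the basis it produces is not exactly orthogonal, and breakdown handling is omitted (names are illustrative).

```python
import numpy as np

def arnoldi_truncated(A, v1, m, k):
    """Incomplete orthogonalization: orthogonalize only against the latest k vectors."""
    n = len(v1)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = v1 / np.linalg.norm(v1)
    for j in range(m):
        w = A @ V[:, j]
        for i in range(max(0, j - k + 1), j + 1):   # only the latest k basis vectors
            H[i, j] = np.dot(V[:, i], w)
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)             # breakdown handling omitted
        V[:, j + 1] = w / H[j + 1, j]
    return V, H                                     # H is banded: k-1 entries above the diagonal
```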

Practicality 4: These algorithms also work in complex arithmetic.

If $A$ or $b$ is complex rather than real, the algorithms still work, as long as we remember to:

- use the conjugate transpose instead of the transpose;
- use complex Givens rotations (Section 6.5.9):
  $$\begin{bmatrix} \bar c & \bar s \\ -s & c \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \gamma \\ 0 \end{bmatrix} \quad \text{if} \quad s = \frac{y}{\sqrt{|x|^2 + |y|^2}}, \quad c = \frac{x}{\sqrt{|x|^2 + |y|^2}}.$$

Convergence Results for GMRES (Section 6.11.4)

Two definitions:

- We say that a matrix $A$ is positive definite if $x^T A x > 0$ for all $x \neq 0$. If $A$ is symmetric, this means that all of its eigenvalues are positive. For a nonsymmetric matrix, it means that the symmetric part of $A$, which is $\frac{1}{2}(A + A^T)$, is positive definite.
- The algorithm that restarts GMRES every $m$ iterations is called GMRES($m$).

GMRES($m$) converges for positive definite matrices.

Proof: The iteration can't stagnate, because it makes progress at the first iteration. In particular, if we seek $x = \alpha v_1$ to minimize
$$\|r\|^2 = \|b - \alpha A v_1\|^2 = b^T b - 2\alpha\, b^T A v_1 + \alpha^2 v_1^T A^T A v_1,$$
where $b = \beta v_1$ and $\beta > 0$, then the minimizer is
$$\alpha = \frac{b^T A v_1}{v_1^T A^T A v_1} > 0,$$
so the residual norm is reduced. By working a little harder, we can show that it is reduced enough to guarantee convergence. □
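A small numerical check of the first-step argument (illustrative, not from the notes): for a positive definite but nonsymmetric $A$, the optimal step along $v_1 = b/\|b\|$ strictly reduces the residual.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
M = rng.standard_normal((n, n))
# Positive definite, nonsymmetric: PD symmetric part plus a skew-symmetric part.
A = M @ M.T + n * np.eye(n) + (M - M.T)          # x^T A x = x^T (M M^T + n I) x > 0
b = rng.standard_normal(n)

v1 = b / np.linalg.norm(b)
alpha = (b @ (A @ v1)) / ((A @ v1) @ (A @ v1))   # minimizer of ||b - alpha A v1||
print(np.linalg.norm(b - alpha * (A @ v1)) < np.linalg.norm(b))   # True: progress at step 1
```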

GMRES($m$) is an optimal polynomial method.

The iterates take the form $x_m = x_0 + q_{m-1}(A) r_0$, where $q_{m-1}$ is a polynomial of degree at most $m - 1$. Saying that $x$ is in $K_m(A, r_0)$ is equivalent to saying $x = q_{m-1}(A) r_0$ for some polynomial of degree at most $m - 1$. The resulting residual is
$$r_m = r_0 - q_{m-1}(A) A r_0 = (I - q_{m-1}(A) A) r_0 = p_m(A) r_0,$$
where $p_m$ is a polynomial of degree $m$ satisfying $p_m(0) = 1$. Since GMRES($m$) minimizes the norm of the residual, it must pick the optimal polynomial $p_m$.

Let $A = X \Lambda X^{-1}$, where $\Lambda$ is diagonal and contains the eigenvalues of $A$. (In other words, we are assuming that $A$ has a complete set of linearly independent eigenvectors, the columns of $X$.) Then
$$\|r_m\| = \min_{p_m(0)=1} \|p_m(A) r_0\| = \min_{p_m(0)=1} \|X p_m(\Lambda) X^{-1} r_0\| \le \|X\|\,\|X^{-1}\| \min_{p_m(0)=1} \|p_m(\Lambda)\|\,\|r_0\| \le \|X\|\,\|X^{-1}\| \min_{p_m(0)=1} \max_{j=1,\ldots,n} |p_m(\lambda_j)|\,\|r_0\|.$$

Therefore, any polynomial that is small on the eigenvalues can give us a bound on the GMRES($m$) residual norm, since GMRES($m$) picks the optimal polynomial. By choosing a Chebyshev polynomial, Saad derives the bound
$$\|r_m\| \le \left( \frac{a + \sqrt{a^2 - d^2}}{c + \sqrt{c^2 - d^2}} \right)^{m} \|X\|\,\|X^{-1}\|\,\|r_0\|$$
whenever the ellipse centered at $c$ with focal distance $d$ and semi-major axis $a$ contains all of the eigenvalues and excludes the origin (see the picture on p. 207 of Saad).
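As a final illustrative sketch (not from the notes), the convergence factor in that bound can be evaluated for a given eigenvalue-enclosing ellipse; choosing $c$, $d$, $a$ so that the ellipse contains the spectrum and excludes the origin is the user's responsibility.

```python
import numpy as np

def gmres_ellipse_bound(c, d, a, cond_X, m):
    """Bound factor ((a + sqrt(a^2 - d^2)) / (c + sqrt(c^2 - d^2)))^m * cond(X).

    c: ellipse center, d: focal distance, a: semi-major axis (real, a >= d),
    chosen so the ellipse contains the spectrum and excludes the origin.
    """
    rho = (a + np.sqrt(a**2 - d**2)) / (c + np.sqrt(c**2 - d**2))
    return cond_X * rho**m

# Sanity check: eigenvalues in [1, 9] (degenerate ellipse: c=5, d=a=4) gives
# rho = 0.5, matching the familiar (sqrt(kappa)-1)/(sqrt(kappa)+1) with kappa = 9.
print(gmres_ellipse_bound(c=5.0, d=4.0, a=4.0, cond_X=1.0, m=10))
```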