AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 23: GMRES and Other Krylov Subspace Methods; Preconditioning

AMS526: Numerical Analysis I (Numerical Linear Algebra)
Lecture 23: GMRES and Other Krylov Subspace Methods; Preconditioning
Xiangmin Jiao, SUNY Stony Brook

Outline
1. Krylov Subspace Methods for Nonsymmetric Systems
2. Preconditioned Conjugate Gradient

Minimizing the Residual

CG works only for SPD matrices: it minimizes ||x - x_*||_A (where x_* = A^{-1} b), or equivalently φ(x) = (1/2) x^T A x - x^T b.
There have been many proposed extensions to nonsymmetric matrices: GMRES, BiCG, etc. GMRES (Generalized Minimal RESiduals) is one of the best known.
The basic idea of GMRES is to find x_n in K_n that minimizes ||r_n|| = ||b - A x_n||.
This can be viewed as a least squares problem: find a vector c such that ||A K_n c - b|| is minimized, where K_n is the m × n Krylov matrix whose columns span the Krylov subspace K_n,
    K_n = [ b  Ab  A^2 b  ...  A^{n-1} b ].
In practice, an orthonormal basis produced by the Arnoldi iteration is used instead of the columns of K_n.
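
For concreteness, here is a small NumPy sketch of this naive least squares formulation. It is not how GMRES is actually implemented (the columns of K_n become nearly linearly dependent as n grows, which is exactly why an orthonormal basis is preferred); the helper name krylov_lstsq and the random test system are illustrative assumptions.

```python
import numpy as np

def krylov_lstsq(A, b, n):
    """Naive minimal-residual iterate: minimize ||A K_n c - b|| over c,
    where K_n = [b, Ab, ..., A^{n-1} b], and return x_n = K_n c."""
    m = len(b)
    K = np.empty((m, n))
    K[:, 0] = b
    for j in range(1, n):
        K[:, j] = A @ K[:, j - 1]              # next Krylov vector A^j b
    c, *_ = np.linalg.lstsq(A @ K, b, rcond=None)
    return K @ c

# small demo on a random nonsymmetric system
rng = np.random.default_rng(0)
A = np.eye(50) + 0.1 * rng.standard_normal((50, 50))
b = rng.standard_normal(50)
print(np.linalg.norm(b - A @ krylov_lstsq(A, b, 5)))   # residual norm after 5 steps
```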

Review: Arnoldi Iteration

Let Q_n = [q_1 q_2 ... q_n] be the m × n matrix formed by the first n columns of Q, and let H_n be the (n+1) × n upper-left section of H.
Start by picking a random q_1, then determine q_2 and H_1, and so on.
The nth column of A Q_n = Q_{n+1} H_n can be written as
    A q_n = h_{1n} q_1 + ... + h_{nn} q_n + h_{n+1,n} q_{n+1}.

Algorithm: Arnoldi Iteration
    given random nonzero b, let q_1 = b / ||b||
    for n = 1, 2, 3, ...
        v = A q_n
        for j = 1 to n
            h_{jn} = q_j^* v
            v = v - h_{jn} q_j
        h_{n+1,n} = ||v||
        q_{n+1} = v / h_{n+1,n}
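
A minimal NumPy sketch of the Arnoldi iteration above, for real matrices; the function name arnoldi and the explicit breakdown check are illustrative choices rather than part of the lecture.

```python
import numpy as np

def arnoldi(A, b, n):
    """Build orthonormal Q (m x (n+1)) and Hessenberg H ((n+1) x n)
    such that A @ Q[:, :n] = Q @ H in exact arithmetic."""
    m = len(b)
    Q = np.zeros((m, n + 1))
    H = np.zeros((n + 1, n))
    Q[:, 0] = b / np.linalg.norm(b)
    for k in range(n):
        v = A @ Q[:, k]
        for j in range(k + 1):                 # orthogonalize against q_1, ..., q_{k+1}
            H[j, k] = Q[:, j] @ v              # real case; use np.conj(...) for complex A
            v = v - H[j, k] * Q[:, j]
        H[k + 1, k] = np.linalg.norm(v)
        if H[k + 1, k] == 0.0:                 # breakdown: Krylov subspace is A-invariant
            return Q[:, :k + 1], H[:k + 1, :k]
        Q[:, k + 1] = v / H[k + 1, k]
    return Q, H
```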

Minimal Residual with Orthogonal Basis

Let Q_n be the matrix whose orthonormal columns q_1, q_2, ... span the successive Krylov subspaces.
Instead of finding x_n = K_n c, find x_n = Q_n y that minimizes ||A Q_n y - b||.
From the Arnoldi iteration, A Q_n = Q_{n+1} H_n, so we require
    ||Q_{n+1} H_n y - b|| = minimum.
Left multiplication by Q_{n+1}^* does not change the norm here, so
    ||H_n y - Q_{n+1}^* b|| = minimum.
Finally, by construction, Q_{n+1}^* b = ||b|| e_1, so
    ||H_n y - ||b|| e_1|| = minimum.

The GMRES Algorithm

Algorithm: GMRES
    q_1 = b / ||b||
    for n = 1, 2, 3, ...
        carry out step n of the Arnoldi iteration
        find y to minimize ||H_n y - ||b|| e_1||  (= ||r_n||)
        x_n = Q_n y

The residual norm ||r_n|| does not need to be computed explicitly from x_n.
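
A minimal, unoptimized NumPy sketch of GMRES following the algorithm above: it performs one Arnoldi step per iteration and re-solves the small Hessenberg least squares problem with a dense solver (a practical code updates a QR factorization instead, as discussed on the next slide). The name gmres_basic and the stopping test are illustrative.

```python
import numpy as np

def gmres_basic(A, b, nmax, tol=1e-10):
    """Unrestarted GMRES sketch: one Arnoldi step per iteration, then solve the
    small Hessenberg least squares problem from scratch with a dense solver."""
    m = len(b)
    beta = np.linalg.norm(b)
    Q = np.zeros((m, nmax + 1))
    H = np.zeros((nmax + 1, nmax))
    Q[:, 0] = b / beta
    for n in range(1, nmax + 1):
        v = A @ Q[:, n - 1]                    # step n of the Arnoldi iteration
        for j in range(n):
            H[j, n - 1] = Q[:, j] @ v
            v -= H[j, n - 1] * Q[:, j]
        H[n, n - 1] = np.linalg.norm(v)
        if H[n, n - 1] > 0.0:
            Q[:, n] = v / H[n, n - 1]
        rhs = np.zeros(n + 1)
        rhs[0] = beta                          # ||b|| e_1
        y, *_ = np.linalg.lstsq(H[:n + 1, :n], rhs, rcond=None)
        rnorm = np.linalg.norm(H[:n + 1, :n] @ y - rhs)   # ||r_n|| without forming x_n
        if rnorm <= tol * beta:
            break
    return Q[:, :n] @ y, rnorm
```

For real use, scipy.sparse.linalg.gmres provides a production implementation, including the restarting discussed next.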

The GMRES Algorithm

The least squares problem has Hessenberg structure and is solved with a QR factorization of H_n.
If the QR factorization of H_n is constructed from scratch, it costs O(n^2) flops, thanks to the Hessenberg structure.
Better, the QR factorization of H_n can be updated from that of H_{n-1} using one Givens rotation, within O(n) work.
However, memory and cost grow with n. In practice, the algorithm is restarted by clearing the accumulated data, which may cause the method to stagnate.

GMRES and Polynomial Approximation

GMRES can be interpreted as finding, for n = 1, 2, 3, ..., a polynomial p_n in P_n, where
    P_n = {polynomials p of degree ≤ n with p(0) = 1},
such that ||p_n(A) b|| is minimized.
Note that r = b - A K_n c, where
    A K_n c = (c_1 A + c_2 A^2 + ... + c_n A^n) b   and   r = (1 - c_1 A - c_2 A^2 - ... - c_n A^n) b.
In other words, p_n(z) = 1 - z (c_1 + c_2 z + ... + c_n z^{n-1}).

Invariance of GMRES
    Scale invariance: if we change A -> σA and b -> σb, then r_n -> σ r_n.
    Invariance under unitary similarity transformations: if we change A -> U A U^* for some unitary matrix U and b -> U b, then r_n -> U r_n.

Convergence of GMRES

GMRES converges monotonically, and in exact arithmetic it converges after at most m steps.
Based on the polynomial interpretation, for diagonalizable A = V Λ V^{-1} the residual satisfies
    ||r_n|| / ||b|| ≤ κ(V) inf_{p_n in P_n} sup_{λ_i in Λ(A)} |p_n(λ_i)|.
In other words, if A is not far from normal (i.e., its eigenvectors are nearly orthogonal), and if properly normalized degree-n polynomials can be found whose size on the spectrum Λ(A) decreases quickly with n, then GMRES converges quickly.

Other Krylov Subspace Methods

CG on the Normal Equations (CGN)
    Solve A^* A x = A^* b using conjugate gradients.
    Convergence is often poor due to the squared condition number (κ(A^* A) = κ(A)^2).
    One advantage is that it applies to least squares problems without modification.
BiConjugate Gradients (BCG/BiCG)
    Makes the residuals orthogonal to another Krylov subspace, based on A^*.
    It can be implemented with three-term recurrences, so the memory requirement is smaller.
    Convergence is sometimes comparable to GMRES, but unpredictable.
Conjugate Gradients Squared (CGS)
    Avoids multiplication by A^* in BiCG; convergence is sometimes twice as fast as BiCG.
Quasi-Minimal Residuals (QMR) and stabilized BiCG (Bi-CGSTAB)
    Variants of BiCG with more regular convergence.
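
All of these methods, along with GMRES, are available in scipy.sparse.linalg. A rough usage sketch on one arbitrary sparse, diagonally dominant, nonsymmetric test system (the matrix is an assumption for illustration, not from the lecture):

```python
import numpy as np
from scipy.sparse import identity, random as sprandom
from scipy.sparse.linalg import gmres, bicg, cgs, qmr, bicgstab

# arbitrary sparse, nonsymmetric, diagonally dominant test system
rng = np.random.default_rng(1)
n = 500
A = (4.0 * identity(n) + sprandom(n, n, density=0.01, random_state=1)).tocsr()
b = rng.standard_normal(n)

for solver in (gmres, bicg, cgs, qmr, bicgstab):
    x, info = solver(A, b)                     # info == 0 means the solver converged
    print(solver.__name__, info, np.linalg.norm(b - A @ x))
```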

MINRES: Minimum Residual

Solves Ax = b or (A - sI) x = b. The matrix A - sI must be symmetric, but it may be definite, indefinite, or singular.
The scalar s is a shift parameter; it may be any number. The method is based on the Lanczos tridiagonalization.
MINRES solves one of the least squares problems
    minimize ||Ax - b||   or   minimize ||(A - sI) x - b||.
Reference: C. C. Paige and M. A. Saunders (1975), Solution of sparse indefinite systems of linear equations, SIAM J. Numer. Anal., Vol. 12, pp. 617-629.
http://www.stanford.edu/group/sol/software/minres.html
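
A usage sketch with SciPy's implementation, scipy.sparse.linalg.minres, on an arbitrary symmetric indefinite tridiagonal test matrix; its shift keyword corresponds to the scalar s above:

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import minres

# symmetric *indefinite* tridiagonal test matrix: diagonal entries of both signs
n = 200
d = np.concatenate([np.linspace(-4.0, -1.0, n // 2), np.linspace(1.0, 4.0, n - n // 2)])
A = diags([0.3 * np.ones(n - 1), d, 0.3 * np.ones(n - 1)], [-1, 0, 1]).tocsr()
b = np.ones(n)

x, info = minres(A, b, shift=0.0)              # shift=s solves (A - s*I) x = b
print(info, np.linalg.norm(b - A @ x))
```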

LSQR

Solves Ax = b, or minimizes ||Ax - b||_2, or minimizes ||Ax - b||_2^2 + λ^2 ||x||_2^2.
A may be square or rectangular (over-determined or under-determined), and may have any rank.
The method is based on the Golub-Kahan bidiagonalization process. It is algebraically equivalent to applying CG to the normal equations (A^T A + λ^2 I) x = A^T b, but has better numerical properties, especially if A is ill-conditioned.
LSQR reduces ||r|| monotonically (where r = b - Ax if λ = 0).
Reference: C. C. Paige and M. A. Saunders, LSQR: An algorithm for sparse linear equations and sparse least squares, ACM Trans. Math. Softw., Vol. 8, 1982, pp. 43-71.
http://www.stanford.edu/group/sol/software/lsqr.html
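
A usage sketch with scipy.sparse.linalg.lsqr on an arbitrary over-determined sparse least squares problem; the damp argument plays the role of λ above:

```python
import numpy as np
from scipy.sparse import random as sprandom
from scipy.sparse.linalg import lsqr

# arbitrary over-determined sparse least squares problem
rng = np.random.default_rng(2)
A = sprandom(1000, 200, density=0.01, random_state=2).tocsr()
b = rng.standard_normal(1000)

result = lsqr(A, b, damp=1e-3)                 # damp plays the role of lambda above
x, istop, itn, r1norm = result[0], result[1], result[2], result[3]
print(istop, itn, r1norm)                      # r1norm = ||b - A x|| at the returned x
```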

Outline
1. Krylov Subspace Methods for Nonsymmetric Systems
2. Preconditioned Conjugate Gradient

Preconditioning

Motivation: the convergence of iterative methods depends heavily on the eigenvalues or singular values of the matrix.
The main idea of preconditioning is to introduce a nonsingular matrix M such that M^{-1} A has better properties than A, and then solve M^{-1} A x = M^{-1} b, which has the same solution as Ax = b.
Criteria for M:
    a good approximation of A, in a sense that depends on the iterative solver;
    easy to invert (i.e., systems with M are cheap to solve).
Typically, a preconditioner M is good if M^{-1} A is not too far from normal and its eigenvalues are clustered.

Left, Right, and Hermitian Preconditioners

Left preconditioner: left-multiply by M^{-1} and solve M^{-1} A x = M^{-1} b.
Right preconditioner: right-multiply by M^{-1} and solve A M^{-1} y = b, with x = M^{-1} y.
However, if A is Hermitian, M^{-1} A or A M^{-1} breaks the symmetry. How do we resolve this problem?
Suppose M is Hermitian positive definite, with M = C C^* for some C. Then Ax = b is equivalent to
    [C^{-1} A C^{-*}] (C^* x) = C^{-1} b,
where C^{-1} A C^{-*} is Hermitian positive definite; it is similar to C^{-*} C^{-1} A = M^{-1} A and has the same eigenvalues as M^{-1} A.
An example of M = C C^* is the Cholesky factorization M = R^* R, where R is upper triangular.
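
A quick NumPy check of this claim: with M = C C^T (here simply the diagonal of A, chosen only for illustration), the matrix C^{-1} A C^{-T} is symmetric positive definite and has the same eigenvalues as M^{-1} A.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 8
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)                    # SPD test matrix
M = np.diag(np.diag(A))                        # SPD preconditioner (diagonal of A)

C = np.linalg.cholesky(M)                      # M = C C^T (NumPy returns the lower factor)
Y = np.linalg.solve(C, A)                      # C^{-1} A
Ahat = np.linalg.solve(C, Y.T)                 # C^{-1} A C^{-T}: symmetric positive definite

# Ahat is similar to M^{-1} A, so the two share eigenvalues (up to roundoff)
print(np.sort(np.linalg.eigvalsh(Ahat)))
print(np.sort(np.linalg.eigvals(np.linalg.solve(M, A)).real))
```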

Preconditioned Conjugate Gradient

When preconditioning a symmetric matrix, use an SPD matrix M, with M = R R^T.
In practice, the algorithm can be organized so that only M^{-1} (instead of R^{-1}) appears.

Algorithm: Preconditioned Conjugate Gradient Method
    x_0 = 0, r_0 = b, p_0 = M^{-1} r_0, z_0 = p_0
    for n = 1, 2, 3, ...
        α_n = (r_{n-1}^T z_{n-1}) / (p_{n-1}^T A p_{n-1})     [step length]
        x_n = x_{n-1} + α_n p_{n-1}                           [approximate solution]
        r_n = r_{n-1} - α_n A p_{n-1}                         [residual]
        z_n = M^{-1} r_n                                      [preconditioning]
        β_n = (r_n^T z_n) / (r_{n-1}^T z_{n-1})               [improvement this step]
        p_n = z_n + β_n p_{n-1}                               [search direction]
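
A minimal NumPy sketch of the preconditioned CG iteration above. The preconditioner is passed as a function apply_Minv returning M^{-1} r; the function names and the Jacobi preconditioner in the demo are illustrative assumptions:

```python
import numpy as np

def pcg(A, b, apply_Minv, tol=1e-10, maxiter=None):
    """Preconditioned CG following the algorithm above; apply_Minv(r)
    must return M^{-1} r for an SPD preconditioner M."""
    x = np.zeros_like(b)
    r = b.copy()
    z = apply_Minv(r)
    p = z.copy()
    rz = r @ z
    for _ in range(maxiter or len(b)):
        Ap = A @ p
        alpha = rz / (p @ Ap)                  # step length
        x = x + alpha * p                      # approximate solution
        r = r - alpha * Ap                     # residual
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = apply_Minv(r)                      # preconditioning
        rz_new = r @ z
        beta = rz_new / rz                     # improvement this step
        p = z + beta * p                       # search direction
        rz = rz_new
    return x

# demo: Jacobi (diagonal) preconditioner on an SPD matrix
rng = np.random.default_rng(4)
B = rng.standard_normal((300, 300))
A = B @ B.T + 300 * np.eye(300)
b = rng.standard_normal(300)
x = pcg(A, b, lambda r: r / np.diag(A))
print(np.linalg.norm(b - A @ x))
```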

Effective Preconditioners for CG

SSOR preconditioner
    Simpler form: use the matrix splitting A = L + D + L^T and take M = (D + L) D^{-1} (D + L)^T.
    More generally, introduce the SSOR relaxation parameter ω and take
        M = 1/(2 - ω) ((1/ω) D + L) ((1/ω) D)^{-1} ((1/ω) D + L)^T.
    With the optimal ω, cond(M^{-1} A) = O(√(cond(A))), but determining the optimal ω is impractical.
Incomplete factorization
    If the exact factorization A = L L^T were used as the preconditioner, then cond(M^{-1} A) = 1, but this is impractical.
    Instead, compute an approximate factorization A ≈ L̃ L̃^T that omits all fill (or omits small fill), and use M = L̃ L̃^T as the preconditioner.
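
SciPy does not ship an incomplete Cholesky factorization, but its incomplete LU routine scipy.sparse.linalg.spilu plays the same role for a quick illustration: fill below a drop tolerance is discarded and the resulting factors define M. A sketch pairing it with CG through a LinearOperator, on an arbitrary 1-D Poisson matrix:

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cg, spilu, LinearOperator

# 1-D Poisson matrix (SPD, tridiagonal); spilu requires CSC format
n = 1000
A = diags([-np.ones(n - 1), 2.0 * np.ones(n), -np.ones(n - 1)], [-1, 0, 1]).tocsc()
b = np.ones(n)

ilu = spilu(A, drop_tol=1e-4, fill_factor=10)  # incomplete LU: small fill is dropped
M = LinearOperator((n, n), matvec=ilu.solve)   # action of M^{-1} on a vector

x_plain, info_plain = cg(A, b)
x_prec, info_prec = cg(A, b, M=M)
print(info_plain, info_prec, np.linalg.norm(b - A @ x_prec))
```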

Other Commonly Used Preconditioners

Jacobi preconditioning: M = diag(A). Very simple and cheap; it may help for certain problems but is usually insufficient.
Block-Jacobi preconditioning: let M be block diagonal instead of diagonal.
Multigrid (coarse-grid approximation): for a PDE discretized on a grid, a preconditioner can be formed by transferring the problem to a coarser grid, solving the smaller problem there, and then transferring the result back. This is sometimes the most efficient approach, when it is applicable.
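
As a small illustration of Jacobi preconditioning, the sketch below applies it as a left preconditioner inside SciPy's GMRES via a LinearOperator; the badly scaled test matrix is an arbitrary assumption meant only to show a case where diagonal scaling helps:

```python
import numpy as np
from scipy.sparse import diags, random as sprandom
from scipy.sparse.linalg import gmres, LinearOperator

# arbitrary nonsymmetric system with a badly scaled diagonal
rng = np.random.default_rng(5)
n = 800
A = (diags(np.logspace(0, 4, n)) + sprandom(n, n, density=0.005, random_state=5)).tocsr()
b = rng.standard_normal(n)

d = A.diagonal()
M = LinearOperator((n, n), matvec=lambda r: r / d)   # Jacobi: M^{-1} r = r / diag(A)

x_plain, info_plain = gmres(A, b)
x_prec, info_prec = gmres(A, b, M=M)
print(info_plain, info_prec, np.linalg.norm(b - A @ x_prec))
```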