AMS526: Numerical Analysis I (Numerical Linear Algebra)
Lecture 23: GMRES and Other Krylov Subspace Methods
Xiangmin Jiao
SUNY Stony Brook
Minimizing Residual

CG works only for SPD matrices. It minimizes the error $\|x - x_*\|_A$, or equivalently $\varphi(x) = \frac{1}{2}x^T A x - x^T b$.
Many extensions to nonsymmetric matrices have been proposed, such as GMRES and BiCG. GMRES (Generalized Minimal RESiduals) is one of the best known.
The basic idea of GMRES is to find $x_n \in \mathcal{K}_n$ that minimizes $\|r_n\| = \|b - Ax_n\|$.
This can be viewed as a least squares problem: find a vector $c$ such that $\|AK_n c - b\|$ is minimized, where $K_n$ is the $m \times n$ Krylov matrix composed of basis vectors of $\mathcal{K}_n$, so that $AK_n = [Ab \;\; A^2b \;\; \cdots \;\; A^n b]$.
In practice an orthogonal basis is used instead, produced by the Arnoldi iteration.
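To make the least squares formulation concrete, here is a minimal NumPy sketch that solves $\min_c \|AK_n c - b\|$ directly with a dense solver. The test matrix, vector, and all names are illustrative; the power basis $b, Ab, \ldots, A^{n-1}b$ is severely ill-conditioned in floating point, which is exactly why the orthogonal Arnoldi basis is preferred.

```python
# Naive GMRES-as-least-squares sketch (illustrative only): the Krylov
# matrix built from the power basis is badly ill-conditioned, motivating
# the orthogonal basis from the Arnoldi iteration on the next slide.
import numpy as np

m, n = 50, 10
rng = np.random.default_rng(0)
A = np.eye(m) + 0.1 * rng.standard_normal((m, m))   # illustrative test matrix
b = rng.standard_normal(m)

# Krylov matrix K_n = [b  Ab  ...  A^{n-1} b], so A K_n = [Ab  ...  A^n b]
K = np.column_stack([np.linalg.matrix_power(A, j) @ b for j in range(n)])
c, *_ = np.linalg.lstsq(A @ K, b, rcond=None)       # min_c ||A K_n c - b||
x_n = K @ c
print(np.linalg.norm(b - A @ x_n))                  # residual norm ||r_n||
```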
Review: Arnoldi Iteration

Let $Q_n = [q_1 \; q_2 \; \cdots \; q_n]$ be the $m \times n$ matrix formed by the first $n$ columns of $Q$, and let $\tilde{H}_n$ be the $(n+1) \times n$ upper-left section of $H$.
Start by picking a random $q_1$; each step then determines the next vector $q_{n+1}$ and the $n$th column of $\tilde{H}_n$.
The $n$th column of $AQ_n = Q_{n+1}\tilde{H}_n$ can be written as $Aq_n = h_{1n}q_1 + \cdots + h_{nn}q_n + h_{n+1,n}q_{n+1}$.

Algorithm: Arnoldi Iteration
given random nonzero $b$, let $q_1 = b/\|b\|$
for $n = 1, 2, 3, \ldots$
    $v = Aq_n$
    for $j = 1$ to $n$
        $h_{jn} = q_j^* v$
        $v = v - h_{jn} q_j$
    $h_{n+1,n} = \|v\|$
    $q_{n+1} = v/h_{n+1,n}$
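A minimal NumPy sketch of this iteration, assuming a dense matrix $A$ and a fixed number of steps $n$; it returns $Q_{n+1}$ and $\tilde{H}_n$, with no handling of breakdown ($h_{n+1,n} = 0$):

```python
import numpy as np

def arnoldi(A, b, n):
    """Run n steps of the Arnoldi iteration; return Q (m x (n+1))
    and the (n+1) x n upper Hessenberg matrix H."""
    m = len(b)
    Q = np.zeros((m, n + 1))
    H = np.zeros((n + 1, n))
    Q[:, 0] = b / np.linalg.norm(b)        # q_1 = b / ||b||
    for k in range(n):                     # k is the 0-based step index
        v = A @ Q[:, k]
        for j in range(k + 1):             # orthogonalize against q_1, ..., q_n
            H[j, k] = Q[:, j] @ v          # h_{jn} = q_j^* v
            v = v - H[j, k] * Q[:, j]
        H[k + 1, k] = np.linalg.norm(v)    # h_{n+1,n} = ||v||
        Q[:, k + 1] = v / H[k + 1, k]      # breaks down if this norm is zero
    return Q, H
```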
Minimal Residual with Orthogonal Basis

Let $Q_n$ be the matrix from the Arnoldi iteration, whose columns $q_1, q_2, \ldots$ span the successive Krylov subspaces.
Instead of finding $x_n = K_n c$, find $x_n = Q_n y$ that minimizes $\|AQ_n y - b\|$.
For the Arnoldi iteration, we showed that $AQ_n = Q_{n+1}\tilde{H}_n$, so we require $\|Q_{n+1}\tilde{H}_n y - b\| = \text{minimum}$.
Left multiplication by $Q_{n+1}^*$ does not change the norm, so $\|\tilde{H}_n y - Q_{n+1}^* b\| = \text{minimum}$.
Finally, by construction, $Q_{n+1}^* b = \|b\| e_1$, so $\|\tilde{H}_n y - \|b\| e_1\| = \text{minimum}$.
The GMRES Algorithm

Algorithm: GMRES
$q_1 = b/\|b\|$
for $n = 1, 2, 3, \ldots$
    perform step $n$ of the Arnoldi iteration
    find $y$ that minimizes $\|\tilde{H}_n y - \|b\| e_1\|$ $(= \|r_n\|)$
    $x_n = Q_n y$

The residual norm $\|r_n\|$ does not need to be computed explicitly from $x_n$; it is available as the minimum of the small least squares problem.
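Combining the arnoldi() sketch from the earlier slide with a small dense least squares solve gives a minimal, fixed-$n$, non-restarted GMRES sketch; the function name and structure are illustrative, not production code:

```python
import numpy as np

def gmres_fixed(A, b, n):
    """Minimal GMRES sketch: n Arnoldi steps, then one least squares
    solve of min_y ||H y - ||b|| e_1||. Uses arnoldi() from the earlier
    sketch; no restarts, breakdown handling, or convergence test."""
    Q, H = arnoldi(A, b, n)                # Q: m x (n+1), H: (n+1) x n
    rhs = np.zeros(n + 1)
    rhs[0] = np.linalg.norm(b)             # right-hand side ||b|| e_1
    y, *_ = np.linalg.lstsq(H, rhs, rcond=None)
    return Q[:, :n] @ y                    # x_n = Q_n y
```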
The GMRES Algorithm

The least squares problem has Hessenberg structure and is solved with a QR factorization of $\tilde{H}_n$.
If the QR factorization of $\tilde{H}_n$ is constructed from scratch, it costs $O(n^2)$ flops, thanks to the Hessenberg structure.
However, the QR factorization of $\tilde{H}_n$ can be updated from that of $\tilde{H}_{n-1}$ using a Givens rotation, within $O(n)$ work (see the sketch below).
Even so, memory and cost grow with $n$. In practice, one restarts the algorithm periodically by clearing the accumulated data, which may cause the method to stagnate.
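A sketch of the Givens-based QR step, shown here as one full pass over a fixed $\tilde{H}_n$ for clarity; in the incremental version, the stored rotations are reapplied to each new column and only one new rotation is computed per Arnoldi step. The function name is hypothetical.

```python
import numpy as np

def hessenberg_qr(H):
    """Triangularize an (n+1) x n upper Hessenberg matrix with Givens
    rotations; returns R and the rotations (c, s), which an incremental
    GMRES would store and reuse at the next step."""
    R = H.astype(float).copy()
    giv = []
    for k in range(R.shape[1]):
        a, b = R[k, k], R[k + 1, k]
        r = np.hypot(a, b)                 # assumes no exact breakdown (r != 0)
        c, s = a / r, b / r
        G = np.array([[c, s], [-s, c]])    # rotation zeroing out R[k+1, k]
        R[k:k + 2, k:] = G @ R[k:k + 2, k:]
        giv.append((c, s))
    return R, giv
```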
GMRES and Polynomial Approximation

GMRES can be interpreted as finding, for $n = 1, 2, 3, \ldots$, a polynomial $p_n \in P_n$, where $P_n = \{\text{polynomials } p \text{ of degree} \le n \text{ with } p(0) = 1\}$, such that $\|p_n(A)b\|$ is minimized.
Note that $r = b - AK_n c$, where $AK_n c = (c_1 A + c_2 A^2 + \cdots + c_n A^n) b$, so $r = (I - c_1 A - c_2 A^2 - \cdots - c_n A^n) b$. In other words, $p_n(z) = 1 - z(c_1 + c_2 z + \cdots + c_n z^{n-1})$.

Invariance of GMRES
Scale invariance: if we change $A \to \sigma A$ and $b \to \sigma b$, then $r_n \to \sigma r_n$.
Invariance under unitary similarity transformations: if we change $A \to UAU^*$ for some unitary matrix $U$ and $b \to Ub$, then $r_n \to U r_n$.
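To make the identity concrete, here is a small check that $b - AK_n c = p_n(A)b$, evaluating $p_n(A)b$ by Horner's rule; it assumes $A$, $b$, $K$, and the coefficient vector $c$ from the least squares sketch on the earlier slide, and the function name is hypothetical:

```python
import numpy as np

def pn_of_A_times_b(A, b, c):
    """Evaluate p_n(A) b = b - (c_1 A + ... + c_n A^n) b via Horner's rule."""
    v = np.zeros_like(b)
    for ck in reversed(c):
        v = A @ v + ck * b                 # builds (c_1 I + c_2 A + ... + c_n A^{n-1}) b
    return b - A @ v                       # p_n(A) b

# with A, b, K, c from the earlier sketch, this should hold:
# np.allclose(pn_of_A_times_b(A, b, c), b - A @ (K @ c))
```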
Convergence of GMRES

The GMRES residual norm decreases monotonically, and the method converges after at most $m$ steps (in exact arithmetic).
Based on the polynomial interpretation, for diagonalizable $A = V\Lambda V^{-1}$ we have $r_n = p_n(A)b = V p_n(\Lambda) V^{-1} b$, which yields the bound
$$\frac{\|r_n\|}{\|b\|} \le \kappa(V) \inf_{p_n \in P_n} \sup_{\lambda \in \Lambda(A)} |p_n(\lambda)|.$$
In other words, if $A$ is not far from normal (i.e., its eigenvectors are nearly orthogonal, so $\kappa(V)$ is modest), and if properly normalized degree-$n$ polynomials can be found whose size on the spectrum $\Lambda(A)$ decreases quickly with $n$, then GMRES converges quickly.
Other Krylov Subspace Methods

CG on the Normal Equations (CGN)
Solves $A^*Ax = A^*b$ using conjugate gradients.
Convergence can be poor due to the squared condition number, since $\kappa(A^*A) = \kappa(A)^2$.
One advantage is that it applies to least squares problems without modification (see the sketch below).

BiConjugate Gradients (BCG/BiCG)
Makes the residuals orthogonal to another Krylov subspace, based on $A^*$.
It can be implemented with three-term recurrences, so its memory requirements are smaller.
Convergence is sometimes comparable to GMRES, but unpredictable.

Conjugate Gradients Squared (CGS)
Avoids multiplication by $A^*$ in BCG; sometimes converges up to twice as fast as BCG.

Quasi-Minimal Residuals (QMR) and Stabilized BiCG (Bi-CGSTAB)
Variants of BiCG with more regular convergence behavior.
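For concreteness, a minimal CGN sketch (real case): plain conjugate gradients applied to $A^T A x = A^T b$, forming products as $A^T(Av)$ so that $A^T A$ is never built explicitly. No preconditioning, and the names and stopping criterion are illustrative.

```python
import numpy as np

def cgn(A, b, tol=1e-10, maxiter=500):
    """CG on the normal equations A^T A x = A^T b; works unchanged
    when A is rectangular, i.e., for least squares problems."""
    x = np.zeros(A.shape[1])
    r = A.T @ b                            # initial normal-equations residual
    p = r.copy()
    rs = r @ r
    for _ in range(maxiter):
        Ap = A.T @ (A @ p)                 # apply A^T A without forming it
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:          # converged in the A^T A residual
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x
```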