Nested Krylov methods for shifted linear systems M. Baumann, and M. B. van Gizen Email: M.M.Baumann@tudelft.nl Delft Institute of Applied Mathematics Delft University of Technology Delft, The Netherlands University of Wuppertal - September 25, 2014 M. Baumann () Nested Krylov methods for shifted systems 1 / 27
Motivation Full-waveform inversion PDE-constrained optimization: min u sim u meas, ρ(x),c p(x),c s(x) where in our application: u sim is the (numerical) solution of the elastic wave equation, u meas is obtained from measurements, ρ(x) is the density of the earth layers we are interested in. The modelling is done in frequency-domain... M. Baumann () Nested Krylov methods for shifted systems 2 / 27
Motivation Modelling in frequency-domain Frequency-domain approach: The time-harmonic elastic wave equation For many (angular) frequencies ω k, we solve ωk 2 ρ(x)û σ(û, c p, c s ) = ŝ, x Ω R 2,3, together with absorbing or reflecting boundary conditions. Inverse (discrete) Fourier transform: u(x, t) = k û(x, ω k )e iω kt M. Baumann () Nested Krylov methods for shifted systems 3 / 27
Motivation Shifted linear systems The discretized time-harmonic elastic wave equation is quadratic in ω k : (K + iω k C ωk 2 M)û = ŝ, which can be re-arranged as, [( im 1 C M 1 ) ( )] ( ) ( K I 0 ωk û M 1ŝ ) ω I 0 k =. 0 I û 0 The latter is of the form: (A ω k I )x k = b, k = 1,..., N. movie M. Baumann () Nested Krylov methods for shifted systems 4 / 27
Motivation Shifted linear systems The discretized time-harmonic elastic wave equation is quadratic in ω k : (K + iω k C ωk 2 M)û = ŝ, which can be re-arranged as, [( im 1 C M 1 ) ( )] ( ) ( K I 0 ωk û M 1ŝ ) ω I 0 k =. 0 I û 0 The latter is of the form: (A ω k I )x k = b, k = 1,..., N. movie M. Baumann () Nested Krylov methods for shifted systems 4 / 27
Outline 1 Introduction 2 Multi-shift Krylov methods 3 Polynomial preconditioners 4 Nested multi-shift Krylov methods 5 Numerical results 6 Summary M. Baumann () Nested Krylov methods for shifted systems 5 / 27
What s a shifted linear system? Definition Shifted linear systems are of the form where ω C is the shift. (A ωi )x (ω) = b, For the simultaneous solution, Krylov methods are well-suited because of the shift-invariance property: K m (A, b) span{b, Ab,..., A m 1 b} = K m (A ωi, b). Proof (shift-invariance) For m = 2: K 2 (A, b) = span{b, Ab} K 2 (A ωi, b) = span{b, Ab ωb} = span{b, Ab} M. Baumann () Nested Krylov methods for shifted systems 6 / 27
What s a shifted linear system? Definition Shifted linear systems are of the form where ω C is the shift. (A ωi )x (ω) = b, For the simultaneous solution, Krylov methods are well-suited because of the shift-invariance property: K m (A, b) span{b, Ab,..., A m 1 b} = K m (A ωi, b). Proof (shift-invariance) For m = 2: K 2 (A, b) = span{b, Ab} K 2 (A ωi, b) = span{b, Ab ωb} = span{b, Ab} M. Baumann () Nested Krylov methods for shifted systems 6 / 27
What s a shifted linear system? Definition Shifted linear systems are of the form where ω C is the shift. (A ωi )x (ω) = b, For the simultaneous solution, Krylov methods are well-suited because of the shift-invariance property: K m (A, b) span{b, Ab,..., A m 1 b} = K m (A ωi, b). Proof (shift-invariance) For m = 2: K 2 (A, b) = span{b, Ab} K 2 (A ωi, b) = span{b, Ab ωb} = span{b, Ab} M. Baumann () Nested Krylov methods for shifted systems 6 / 27
For example, multi-shift GMRES After m steps of Arnoldi, we have, AV m = V m+1 H m, and the approximate solution yields: x m V m y m, where y m = argmin y C m H m y b e 1. For shifted systems, we get and, therefore, x (ω) m (A ωi )V m = V m+1 (H m ωi m ), V m y m (ω), where y m (ω) = argmin y C m H (ω) m y b e 1. M. Baumann () Nested Krylov methods for shifted systems 7 / 27
For example, multi-shift GMRES After m steps of Arnoldi, we have, AV m = V m+1 H m, and the approximate solution yields: x m V m y m, where y m = argmin y C m H m y b e 1. For shifted systems, we get and, therefore, x (ω) m (A ωi )V m = V m+1 (H m ωi m ), V m y m (ω), where y m (ω) = argmin y C m H (ω) m y b e 1. M. Baumann () Nested Krylov methods for shifted systems 7 / 27
For example, multi-shift GMRES After m steps of Arnoldi, we have, AV m = V m+1 H m, and the approximate solution yields: x m V m y m, where y m = argmin y C m H m y b e 1. For shifted systems, we get and, therefore, x (ω) m (A ωi )V m = V m+1 (H m ωi m ), V m y m (ω), where y m (ω) = argmin y C m H (ω) m y b e 1. M. Baumann () Nested Krylov methods for shifted systems 7 / 27
Preconditioning is a problem Main disadvantage: Preconditioners are in general not easy to apply. For it does not hold: (A ωi )P 1 ω y (ω) = b, P ω x (ω) = y (ω) K m (AP 1, b) K m (AP 1 ω ωpω 1, b). However, there are ways... Reference B. Jegerlehner, Krylov space solvers for shifted linear systems. Published online arxiv:hep-lat/9612014, 1996. M. Baumann () Nested Krylov methods for shifted systems 8 / 27
Preconditioning is a problem Main disadvantage: Preconditioners are in general not easy to apply. For it does not hold: (A ωi )P 1 ω y (ω) = b, P ω x (ω) = y (ω) K m (AP 1, b) K m (AP 1 ω ωpω 1, b). However, there are ways... Reference B. Jegerlehner, Krylov space solvers for shifted linear systems. Published online arxiv:hep-lat/9612014, 1996. M. Baumann () Nested Krylov methods for shifted systems 8 / 27
Preconditioning is a problem... or has been a problem? Short historical overview: 2002 Shift-and-invert preconditioner: P = (A τi ), τ {ω 1,..., ω N } 2007 Many shift-and-invert preconditioners: 2013 Polynomial preconditioners: P = (A τ I ) p n (A) A 1, p ω n (A) (A ωi ) 1 2014 Nested Krylov methods M. Baumann () Nested Krylov methods for shifted systems 9 / 27
Shift-and-invert preconditioner Choose P (A τi ). Then, (A ωi )P 1 ω = AP 1 η ω I = A(A τi ) 1 η ω I = [A η ω (A τi )] (A τi ) 1 [ = A + η ωτ 1 η ω I From the fit ω = ηωτ 1 η ω, we conclude ] (1 η ω )(A τi ) 1. η ω = ω ω τ, P ω = 1 P = τ ω P. 1 η ω τ M. Baumann () Nested Krylov methods for shifted systems 10 / 27
Polynomial preconditioners - Theory Suppose we have found, p n (A) n α i A i A 1 i=0 Question: Can we find p ω n (A) n i=0 αω i A i such that (A ωi )p ω n (A) = Ap n (A) η ω I? Reference M. I. Ahmad, D. B. Szyld, and M. B. van Gizen, Preconditioned multishift BiCG for H 2-optimal model reduction. Technical report, 2013. M. Baumann () Nested Krylov methods for shifted systems 11 / 27
Polynomial preconditioners - Theory Suppose we have found, p n (A) n α i A i A 1 i=0 Question: Can we find p ω n (A) n i=0 αω i A i such that (A ωi )p ω n (A) = Ap n (A) η ω I? Reference M. I. Ahmad, D. B. Szyld, and M. B. van Gizen, Preconditioned multishift BiCG for H 2-optimal model reduction. Technical report, 2013. M. Baumann () Nested Krylov methods for shifted systems 11 / 27
Polynomial preconditioners - Theory From (A ωi )pn ω (A) = Ap n (A) η ω I we get n n n αi ω A i+1 ωαi ω A i α i A i+1 + η ω I = 0 i=0 i=0 i=0 The latter can be solved to: α ω n = α n α ω i 1 = α i 1 + ωα ω i, for i = n,..., 1 η ω = ωα ω 0 M. Baumann () Nested Krylov methods for shifted systems 12 / 27
Polynomial preconditioners - In practice What goes wrong in practice? Use Chebyshev polynomials for p n (A) A 1. Based on ellipse that surrounds the spectrum of A, Does not work for indefinite matrix A. Instead, approximate the shift-and-invert preconditioner p n (A) (A τi ) 1, i.e. shift the spectrum. For Helmholtz, this resembles an approximate shifted Laplace preconditioner. M. Baumann () Nested Krylov methods for shifted systems 13 / 27
Polynomial preconditioners - Example (1/2) Acoustic wave propagation (Helmholtz equation): 0 Velocity profile c(x) p ( ) 2πfk p = s c(x) s = δ(x 1 300, x 2 ) depth [m] 100 200 300 400 500 600 700 800 2000 m/s 1500 m/s 3000 m/s 900 1000 0 300 600 x axis [m] We prescribe absorbing boundary conditions (Sommerfeld conditions). M. Baumann () Nested Krylov methods for shifted systems 14 / 27
Polynomial preconditioners - Example (2/2) We consider 5 different meshes: Grid h[m] gridpoints f k [Hz] No prec. Exact prec. PP n = 3 1 100 77 1 179 77 87 2 100/2 273 1,2 457 156 194 3 100/4 1025 1,2,4 1160 319 457 4 100/8 3969 1,2,4,8 4087 788 1115 5 100/16 15617 1,2,4,8,16 9994 1830 2429 MS-QMRIDR(8), Seed: τ = (1 8i)2πf max Note: we have to use a shift with a large imaginary part to obtain a converging Chebyshev polynomial. [Ongoing Research] M. Baumann () Nested Krylov methods for shifted systems 15 / 27
Nested multi-shift Krylov methods Methodology: We have learned: Polynomial preconditioners exist Question: Can we use a Krylov polynomial? Nested multi-shift Krylov methods: Use an inner multi-shift Krylov method as preconditioner. For inner method, require collinear residuals [r (ω) = γr ]. This is the case for: multi-shift GMRES [1998] multi-shift FOM [2003] multi-shift BiCG [2003] multi-shift IDR(s) [new!] Using γ, we can preserve the shift-invariance in the outer Krylov iteration. M. Baumann () Nested Krylov methods for shifted systems 16 / 27
Nested multi-shift Krylov methods Methodology: We have learned: Polynomial preconditioners exist Question: Can we use a Krylov polynomial? Nested multi-shift Krylov methods: Use an inner multi-shift Krylov method as preconditioner. For inner method, require collinear residuals [r (ω) = γr ]. This is the case for: multi-shift GMRES [1998] multi-shift FOM [2003] multi-shift BiCG [2003] multi-shift IDR(s) [new!] Using γ, we can preserve the shift-invariance in the outer Krylov iteration. M. Baumann () Nested Krylov methods for shifted systems 16 / 27
Nested multi-shift Krylov methods Overview of one possible combination: M. Baumann () Nested Krylov methods for shifted systems 17 / 27
Multi-shift FOM as inner method Classical result: In FOM, the residuals are r = b Ax =... = h +1, e T y v +1. Thus, for the shifted residuals it holds: r (ω) = b (A ωi )x (ω) which gives γ = y (ω) /y. Reference =... = h (ω) +1, et y (ω) v +1, V. Simoncini, Restarted full orthogonalization method for shifted linear systems. BIT Numerical Mathematics, 43 (2003). M. Baumann () Nested Krylov methods for shifted systems 18 / 27
Flexible multi-shift GMRES as outer method Use flexible GMRES in the outer loop, where one column yields (A ωi ) P(ω) 1 v }{{} inner loop (A ωi ) V m = V m+1 H (ω) m, = V m+1 h (ω), 1 m. The inner loop is the truncated solution of (A ωi ) with right-hand side v using msfom. M. Baumann () Nested Krylov methods for shifted systems 19 / 27
Flexible multi-shift GMRES as outer method Use flexible GMRES in the outer loop, where one column yields (A ωi ) P(ω) 1 v }{{} inner loop (A ωi ) V m = V m+1 H (ω) m, = V m+1 h (ω), 1 m. The inner loop is the truncated solution of (A ωi ) with right-hand side v using msfom. M. Baumann () Nested Krylov methods for shifted systems 19 / 27
Flexible multi-shift GMRES The inner residuals are: Imposing r (ω) r (ω) = v (A ωi )P(ω) 1 v, r = v AP 1 v, = γr yields: (A ωi )P(ω) 1 v = γap 1 v (γ 1)v ( ) Note that the right-hand side in ( ) is a preconditioned shifted system! M. Baumann () Nested Krylov methods for shifted systems 20 / 27
Flexible multi-shift GMRES The inner residuals are: Imposing r (ω) r (ω) = v (A ωi )P(ω) 1 v, r = v AP 1 v, = γr yields: (A ωi )P(ω) 1 v = γap 1 v (γ 1)v ( ) Note that the right-hand side in ( ) is a preconditioned shifted system! M. Baumann () Nested Krylov methods for shifted systems 20 / 27
Flexible multi-shift GMRES Altogether, which yields: (A ωi )P(ω) 1 v = V m+1 h (ω) γap 1 v (γ 1)v = V m+1 h (ω) γv m+1 h V m+1 (γ 1) e = V m+1 h (ω) ( ) V m+1 γh (γ 1) e = Vm+1 h (ω) with Γ m := diag(γ 1,..., γ m ). H (ω) m = (H m I m ) Γ m + I m, M. Baumann () Nested Krylov methods for shifted systems 21 / 27
Flexible multi-shift GMRES Altogether, which yields: (A ωi )P(ω) 1 v = V m+1 h (ω) γap 1 v (γ 1)v = V m+1 h (ω) γv m+1 h V m+1 (γ 1) e = V m+1 h (ω) ( ) V m+1 γh (γ 1) e = Vm+1 h (ω) with Γ m := diag(γ 1,..., γ m ). H (ω) m = (H m I m ) Γ m + I m, M. Baumann () Nested Krylov methods for shifted systems 21 / 27
Flexible multi-shift GMRES Altogether, which yields: (A ωi )P(ω) 1 v = V m+1 h (ω) γap 1 v (γ 1)v = V m+1 h (ω) γv m+1 h V m+1 (γ 1) e = V m+1 h (ω) ( ) V m+1 γh (γ 1) e = Vm+1 h (ω) with Γ m := diag(γ 1,..., γ m ). H (ω) m = (H m I m ) Γ m + I m, M. Baumann () Nested Krylov methods for shifted systems 21 / 27
Flexible multi-shift GMRES Altogether, which yields: (A ωi )P(ω) 1 v = V m+1 h (ω) γap 1 v (γ 1)v = V m+1 h (ω) γv m+1 h V m+1 (γ 1) e = V m+1 h (ω) ( ) V m+1 γh (γ 1) e = Vm+1 h (ω) with Γ m := diag(γ 1,..., γ m ). H (ω) m = (H m I m ) Γ m + I m, M. Baumann () Nested Krylov methods for shifted systems 21 / 27
Flexible multi-shift GMRES Altogether, which yields: (A ωi )P(ω) 1 v = V m+1 h (ω) γap 1 v (γ 1)v = V m+1 h (ω) γv m+1 h V m+1 (γ 1) e = V m+1 h (ω) ( ) V m+1 γh (γ 1) e = Vm+1 h (ω) with Γ m := diag(γ 1,..., γ m ). H (ω) m = (H m I m ) Γ m + I m, M. Baumann () Nested Krylov methods for shifted systems 21 / 27
A first example - The setting Test case from literature: Ω = [0, 1] [0, 1] h = 0.01 implying n = 10.201 grid points system size: 4n = 40.804 N = 6 frequencies point source at center Reference T. Airaksinen, A. Pennanen, and J. Toivanen, A damping preconditioner for time-harmonic wave equations in fluid and elastic material. Journal of Computational Physics, 2009. M. Baumann () Nested Krylov methods for shifted systems 22 / 27
A first example - Convergence behavior (1/2) Preconditioned multi-shift GMRES: Relative residual norm 0 1 2 3 4 5 6 7 multi shift GMRES convergence f = 5,000 f = 10,000 f = 15,000 f = 20,000 f = 25,000 f = 30,000 We observe: simultaneous solve CPU time: 17.71s 8 9 0 10 20 30 40 50 60 70 80 90 100 110 # iterations M. Baumann () Nested Krylov methods for shifted systems 23 / 27
A first example - Convergence behavior (2/2) Preconditioned nested FOM-FGMRES: Relative residual norm Relative residual norm 1 0.5 0 0.5 1 1.5 2 Inner msfom convergence f = 5,000 f = 10,000 f = 15,000 f = 20,000 f = 25,000 f = 30,000 2.5 0 5 10 15 20 25 30 # iterations 0 1 2 3 4 5 6 7 Outer msgmres convergence f = 5,000 f = 10,000 f = 15,000 f = 20,000 f = 25,000 f = 30,000 We observe: 30 inner iterations truncate when inner residual norm 0.1 very few outer iterations CPU time: 9.62s 8 9 10 0 1 2 3 4 5 6 7 # iterations M. Baumann () Nested Krylov methods for shifted systems 24 / 27
A first example - More nested methods We were running the same setting with different (nested) multi-shift Krylov methods: multi-shift Krylov methods msgmres rest msgmres QMRIDR(4) msidr(4) # inner iterations - 20 - - # outer iterations 103 7 136 134 seed shift τ 0.7-0.7i 0.7-0.7i 0.7-0.7i 0.7-0.7i CPU time 17.71s 6.13s 22.35s 22.58s nested multi-shift Krylov methods FOM-FGMRES IDR(4)-FGMRES FOM-FQMRIDR(4) IDR(4)-FQMRIDR(4) # inner iterations 30 25 30 30 # outer iterations 7 9 5 15 seed shift τ 0.7-0.7i 0.7-0.7i 0.7-0.7i 0.7-0.7i CPU time 9.62s 32.99s 8.14s 58.36s M. Baumann () Nested Krylov methods for shifted systems 25 / 27
Summary Nested Krylov methods for Ax = b are widely used extension to shifted linear systems is possible Multiple combinations of inner-outer methods possible, e.g. FOM-FGMRES, IDR-FQMRIDR,... The shift-and-invert preconditioner (or the polynomial preconditioner) can be applied on top Future work: recylcing, deflation,... M. Baumann () Nested Krylov methods for shifted systems 26 / 27
Thank you for your attention! Further reading: M. Baumann and M. B. van Gizen. Nested Krylov methods for shifted linear systems. DIAM technical report 14-01, 2014. Further coding: https://bitbucket.org/manuelmbaumann/nestedkrylov M. Baumann () Nested Krylov methods for shifted systems 27 / 27
IDR(s) in a nutshell IDR(s) is a Krylov subspace method that enforces the residuals r n to be, G +1 r n+1 = (I µ +1 A)v n, with v n G P, where µ +1 C \ {0} and P = [p 1,..., p s ] are chosen freely. The IDR theorem states: 1 G +1 G for all 0, 2 G = {0} for some N. Reference P. Sonneveld, M. B. van Gizen, IDR(s): A family of simple and fast algorithms for solving large nonsymmetric systems of linear equations. SIAM J. Sci. Comput., 2008. M. Baumann () Nested Krylov methods for shifted systems 27 / 27
IDR(s) in a nutshell IDR(s) is a Krylov subspace method that enforces the residuals r n to be, G +1 r n+1 = (I µ +1 A)v n, with v n G P, where µ +1 C \ {0} and P = [p 1,..., p s ] are chosen freely. The IDR theorem states: 1 G +1 G for all 0, 2 G = {0} for some N. Reference P. Sonneveld, M. B. van Gizen, IDR(s): A family of simple and fast algorithms for solving large nonsymmetric systems of linear equations. SIAM J. Sci. Comput., 2008. M. Baumann () Nested Krylov methods for shifted systems 27 / 27
IDR(s) with collinear residuals Notation: Let s call  (A ωi ) and ˆr n, ˆµ +1, ˆv n,... Assuming, we have ˆv n = α n v n (not trivial!), then we want: γ n+1 r n+1 = ˆr n+1 ( ) γ n+1 (I µ +1 A) v n = I ˆµ +1  ˆv n γ n+1 (I µ +1 A) v n = (I ˆµ +1 (A ωi )) α n v n γ n+1 v n γ n+1 µ +1 Av n = (α n + α n ˆµ +1 ω)v n α n ˆµ +1 Av n Choose ˆµ +1 and γ n+1 such that the two terms match: ˆµ +1 = µ +1 1 ωµ +1, γ n+1 = α n 1 ωµ +1. M. Baumann () Nested Krylov methods for shifted systems 27 / 27
IDR(s) with collinear residuals Notation: Let s call  (A ωi ) and ˆr n, ˆµ +1, ˆv n,... Assuming, we have ˆv n = α n v n (not trivial!), then we want: γ n+1 r n+1 = ˆr n+1 ( ) γ n+1 (I µ +1 A) v n = I ˆµ +1  ˆv n γ n+1 (I µ +1 A) v n = (I ˆµ +1 (A ωi )) α n v n γ n+1 v n γ n+1 µ +1 Av n = (α n + α n ˆµ +1 ω)v n α n ˆµ +1 Av n Choose ˆµ +1 and γ n+1 such that the two terms match: ˆµ +1 = µ +1 1 ωµ +1, γ n+1 = α n 1 ωµ +1. M. Baumann () Nested Krylov methods for shifted systems 27 / 27