A Jacobi Davidson Method with a Multigrid Solver for the Hermitian Wilson-Dirac Operator

A Jacobi Davidson Method with a Multigrid Solver for the Hermitian Wilson-Dirac Operator Artur Strebel Bergische Universität Wuppertal August 3, 2016

Joint Work This project is joint work with: Gunnar Bali, Sarah Collins, Andreas Frommer, Karsten Kahl, Matthias Rottmann, Benjamin Müller +, Jakob Simeth : Universität Regensburg : Bergische Universität Wuppertal + : Johannes Gutenberg-Universität Mainz A. Strebel et al., Multigrid Solver for the Jacobi Davidson Method 1/21

Outline AMG for Lattice QCD Jacobi-Davidson + AMG Numerical Results Summary & Outlook A. Strebel et al., Multigrid Solver for the Jacobi Davidson Method 2/21

Lattice QCD: The Wilson Discretization FD Discretization: N 3 s N t regular lattice L (parallelized by distributing sub-blocks) nearest neighbor coupling 12 variables per site and four 12 12 coupling matrices periodic boundary conditions γ 5 -symmetry: γ 5 D = (γ 5 D) (γ 5 := γ 1 γ 2 γ 3 γ 4, γ 2 5 = I) imaginary axis 2 1.5 1 0.5 0 0.5 1 1.5 2 0 1 2 3 4 5 6 7 8 real axis spec(d) spec(d) is in the right half plane and symmetric to the real axis A. Strebel et al., Multigrid Solver for the Jacobi Davidson Method 3/21

Lattice QCD: The Wilson Discretization FD Discretization: N 3 s N t regular lattice L (parallelized by distributing sub-blocks) nearest neighbor coupling 12 variables per site and four 12 12 coupling matrices periodic boundary conditions γ 5 -symmetry: γ 5 D = (γ 5 D) (γ 5 := γ 1 γ 2 γ 3 γ 4, γ 2 5 = I) eigenvalue λi 8 6 4 2 0 2 4 6 8 0 500 1000 1500 2000 2500 3000 index i spec(γ 5 D) spec(γ 5 D) is real and maximally indefinite, we denote Q := γ 5 D A. Strebel et al., Multigrid Solver for the Jacobi Davidson Method 3/21

Lattice QCD: Tasks Application: calculating physical observables oftentimes requires tr(q 1 ) tr(q 1 ) = λ 1 i compute small eigenvectors exactly treat remainder of spectrum with stochastic algorithm Challenges: todays matrix sizes: 10 7 10 7 and larger (shifted) linear systems with either D or Q to be solved small EVs of Q (= singular vectors of D) highly desired Approaches: multigrid approaches for inverting D and Q Jacobi-Davidson method + multigrid approach for Q cheap eigensolver A. Strebel et al., Multigrid Solver for the Jacobi Davidson Method 4/21

Adaptive Algebraic Multigrid Approach for D (non-hermitian) Two-grid error propagator for ν steps of post-smoothing E (ν) 2g = (I MD)ν } {{ } smoother (I P Dc 1 P D), D }{{} c := P DP }{{} coarse grid correction coarse operator low accuracy for Dc 1 and M is sufficient introduce recursive construction for D c multigrid To Do: Define interpolation P and smoother M DD-αAMG [ArXiv:1303.1377,1307.6101] M: Schwarz Alternating Procedure (SAP) [Lüscher 2003] P : Aggregation based Interpolation [Brannick, Clark et al. 2010] A. Strebel et al., Multigrid Solver for the Jacobi Davidson Method 5/21

Aggregation Based Interpolation Construction define aggregates: domain decomposition A 1,..., A s A 1 A 3 A 2 A 4 P R = P calculate test vectors w 1,..., w N decompose test vectors over aggregates A 1,..., A s A 1 A 1 (w 1,..., w N ) = = A 2 P = A 2 A s A s A. Strebel et al., Multigrid Solver for the Jacobi Davidson Method 6/21

Adaptive Algebraic Multigrid Approach for Q = γ 5 D Construct P such that P γ 5 = γ 5 P Replacing D Q = γ 5 D we obtain the two-grid error propagator Ẽ (ν) 2g = (I MQ)ν }{{} requires new smoother (I P Q 1 c P Q), Q }{{} c := P QP }{{} new coarse operator =I P Dc 1 P D P is valid for D and Q P preserves Q c = Q c recursive application possible SAP smoother does not work anymore Choice: Replace M by GMRES A. Strebel et al., Multigrid Solver for the Jacobi Davidson Method 7/21

γ 5 -Preconditioning AMG for Q in practice a factor of 2 slower than DD-αAMG can be remedied by γ 5 -preconditioning : (Q σi)ϕ = ψ γ 5 (Q σi)ϕ = γ 5 ψ (D σγ 5 )ϕ = γ 5 ψ error propagator of smoother now reads: Ẽ (ν) S = (I M(D σγ 5)) ν enables to compute (Q σ) 1 with AMG efficiently and thus to compute small eigenvectors of Q A. Strebel et al., Multigrid Solver for the Jacobi Davidson Method 8/21

AMG for D in Practice: Scaling with the Bare Quark Mass Configuration: 64 64 3, 128 cores time to solution (in seconds) 10000 1000 400 200 100 50 30 m crit m u m ud m d 0.05 m 0 two-level DD-αAMG three-level DD-αAMG four-level DD-αAMG mp oe BiCGStab 0.04 0.03 0.02 lighter quark mass m 0 more ill-conditioned system AMG less sensitive to condition number than Krylov subspace methods A. Strebel et al., Multigrid Solver for the Jacobi Davidson Method 9/21

AMG for Lattice QCD Jacobi-Davidson + AMG Numerical Results Summary & Outlook A. Strebel et al., Multigrid Solver for the Jacobi Davidson Method 10/21

An Introduction to Eigensolver Methods Applying a matrix A to an arbitrary vector v multiple times converges to largest eigenvector Power method If λ is an eigenvalue of A, then (λ σ) 1 is an eigenvalue of (A σi) 1 Inverse iteration Now we can compute any eigenpair we want Given a normalized vector v the Rayleigh Quotient is defined as r(x) = v Av r(x) is the best-approximation to an eigenvalue of v Rayleigh quotient iteration (cubic convergence for hermitian matrices) A. Strebel et al., Multigrid Solver for the Jacobi Davidson Method 11/21

An Introduction to the Jacobi-Davidson Method The Jacobi-Davidson method combines two ideas Davidson: Impose Galerkin Condition AV s θs V on eigenvalue problem to some subspace V = {v 1, v 2,..., v n }, which leads to V AV s θs = 0 If (θ i, s i ) is a solution of this equation then (θ i, u i = V s i ) is called a Ritz pair w.r.t. the subspace V (θ i, u i ) approximates eigenpairs How to construct V? Jacobi: Given an eigenvector approximation u i, choose an orthogonal correction t u i such that A(u i + t) = λ(u i + t) We find that t fulfills the equation (I u i u i )(A λi)(i u iu i )t = (A θ ii)u i := r λ unknown replace by θ i A. Strebel et al., Multigrid Solver for the Jacobi Davidson Method 12/21

An Introduction to the Jacobi-Davidson Method The Jacobi-Davidson method combines two ideas Davidson: Impose Galerkin Condition AV s θs V on eigenvalue problem to some subspace V = {v 1, v 2,..., v n }, which leads to V AV s θs = 0 If (θ i, s i ) is a solution of this equation then (θ i, u i = V s i ) is called a Ritz pair w.r.t. the subspace V (θ i, u i ) approximates eigenpairs How to construct V? Jacobi: Given an eigenvector approximation u i, choose an orthogonal correction t u i such that A(u i + t) = λ(u i + t) We find that t fulfills the equation (I u i u i )(A θ ii)(i u i u i )t = (A θ ii)u i =: r Solve inexactly for t and add result to V A. Strebel et al., Multigrid Solver for the Jacobi Davidson Method 12/21

The Jacobi-Davidson Method Algorithm 1: Jacobi-Davidson (basic) input: initial guess t, desired accuracy ε output: eigenpair (λ, x) 1 for all m = 1, 2,... do 2 t = t V V t, v m = t/ t 2 3 M = V AV 4 compute smallest eigenpair (θ, s) of M 5 u = V s 6 r = Au θu 7 if r 2 ε then 8 λ = θ, x = u 9 solve (approximately) for t u in (I uu )(A θi)(i uu )t = r A. Strebel et al., Multigrid Solver for the Jacobi Davidson Method 13/21

The Jacobi-Davidson Method with AMG Algorithm 2: Jacobi-Davidson + AMG input: no. of eigenvalues n, subspace size l, initial guess [v 0, v 1,..., v k, t], desired accuracy ε output: set of n eigenpairs (Λ, X) 1 for all m = k + 1, k + 2,... do 2 t = t V V t, v m = t/ t 2 3 compute all (θ i, s i ) with (AV ) (AV )s i = θ i (AV ) V s i 4 find smallest θ i / Λ 5 u = V s i, r = Au θ i u 6 if r 2 ε then 7 Λ = [Λ, θ i ], X = [X, u] 8 if m l + n conv then 9 jd restart 10 find smallest θ i / Λ 11 rebuild interpolation 12 solve (approximately) (A θ i I)t = r A. Strebel et al., Multigrid Solver for the Jacobi Davidson Method 14/21

Restarting Goal: Limit the amount of work (and memory) per iteration control dimension of search space V Algorithm 3: jd restart input: look ahead m, number of converged eigenpairs n conv, search space V output: new search space V and Ritz pairs (θ i, s i ) 1 [, π] = sort( Θ, ascending) 2 for all i = 1,... m + n conv do 3 V i = V s π(i) 4 V = V (1 : m + n conv ) 5 orthogonalize V 6 compute all (θ i, s i ) with (AV ) (AV )s i = (AV ) V s i look ahead m preserves search space information in spirit of thick restarting subspace size is bound by n + l A. Strebel et al., Multigrid Solver for the Jacobi Davidson Method 15/21

Interaction between Jacobi-Davidson and AMG Rebuilding the interpolation JD computed all eigenvectors up to some point σ AMG needs small eigenvectors of (A σi) for interpolation P Idea: Use converged vectors to improve P dynamically: current EV (to be computed) k fix smallest EVs k flex EVs Build interpolation P with: k fix smallest eigenvectors current eigenvector approximation u k flex converged eigenvectors closest to u A. Strebel et al., Multigrid Solver for the Jacobi Davidson Method 16/21

Interaction between Jacobi-Davidson and AMG Rebuilding the interpolation JD computed all eigenvectors up to some point σ AMG needs small eigenvectors of (A σi) for interpolation P Idea: Use converged vectors to improve P dynamically: current EV (to be computed) k flex EVs k fix smallest EVs Build interpolation P with: k fix smallest eigenvectors current eigenvector approximation u k flex converged eigenvectors closest to u A. Strebel et al., Multigrid Solver for the Jacobi Davidson Method 16/21

AMG for Lattice QCD Jacobi-Davidson + AMG Numerical Results Summary & Outlook A. Strebel et al., Multigrid Solver for the Jacobi Davidson Method 17/21

Comparison with PARPACK: Scaling with the Number of Eigenvectors N 10 5 PARPACK JD + AMG core-h 10 4 10 3 10 2 50 100 150 200 250 300 350 400 number of eigenvectors N 48 24 3 lattice ( 8 10 6 unknowns) 648 cores on JURECA PARPACK: Krylov subspace dimension 4N JD: k fix = 8, k flex = 16, search space dimension 150 orthogonalization cost n conv + 150 A. Strebel et al., Multigrid Solver for the Jacobi Davidson Method 18/21

Effectiveness of AMG Solves solve time per EV (in seconds) 10 2 10 1 10 0 FGMRES + AMG FGMRESR (after 50 EVs) 10 1 0 100 200 300 400 500 eigenvector number coarse grid correction switched off after 50 converged eigenvectors clear loss in solve time for first 400 eigenvectors afterwards coarse grid effectiveness deteriorates better interpolation strategy? hybrid approach? different preconditioner at some point? A. Strebel et al., Multigrid Solver for the Jacobi Davidson Method 19/21

AMG for Lattice QCD Jacobi-Davidson + AMG Numerical Results Summary & Outlook A. Strebel et al., Multigrid Solver for the Jacobi Davidson Method 20/21

Summary & Outlook Summary Jacobi-Davidson type algorithm with multigrid support good scaling with number of eigenvalues scalability mostly depends on solver performance speed up can be orders of magnitudes Improve Eigensolver explore possible benefits of AMG with more levels study orthogonal projections for solves find optimal restarting strategy improve interaction between JD and interpolation explore scaling behavior with lattice size (former approach scaled better than PARPACK) compare with other eigensolver approaches (e.g. FEAST) A. Strebel et al., Multigrid Solver for the Jacobi Davidson Method 21/21

Acknowledgments All results computed on JUROPA at Jülich Supercomputing Centre (JSC) Funded by Deutsche Forschungsgemeinschaft (DFG), Transregional Collaborative Research Centre 55 (SFB TR 55) All configurations provided by BMW-c, QCDSF & CLS