Polynomial and rational filtering for eigenvalue problems and the EVSL project
Yousef Saad, Department of Computer Science and Engineering, University of Minnesota
ILAS 2017, July 25, 2017
Large eigenvalue problems in applications
Challenge in eigenvalue problems: extract a large number of eigenvalues & vectors of very large matrices (quantum physics/chemistry, ...), often in the middle of the spectrum.
Example: excited states involve transitions, leading to much more complex computations than for DFT (ground states).
Large matrices, *many* eigenpairs to compute.
Illustration: Hamiltonian of size n ~ 10^6; get 10% of bands.
Solving large interior eigenvalue problems
Three broad approaches:
1. Shift-invert (real shifts)
2. Polynomial filtering
3. Rational filtering (Cauchy, + others)
Issues with shift-and-invert (and related approaches):
Issue 1: the factorization may be too expensive. Can we use iterative methods instead?
Issue 2: iterative techniques often fail. Reason: highly indefinite problems.
First alternative: spectrum slicing with polynomial filtering.
Spectrum Slicing
Situation: a very large number of eigenvalues is to be computed.
Goal: compute the spectrum by slices by applying filtering.
Apply Lanczos or subspace iteration to the problem: φ(A)u = μu, where φ(t) is a polynomial or rational function that enhances the wanted eigenvalues.
[Figure: polynomial of degree 32 approximating a Dirac delta centered at 0.5 on [-1, 1].]
Rationale. Eigenvectors on both ends of the wanted spectrum need not be orthogonalized against each other.
Idea: get the spectrum by slices or windows [e.g., a few hundred or a few thousand pairs at a time].
Can use polynomial or rational filters.
Hypothetical scenario: large A, *many* wanted eigenpairs
Assume A has size 10M ... and you want to compute 50,000 eigenvalues/vectors (huge for numerical analysts, not for physicists) ... in the lower part of the spectrum, or the middle.
By (any) standard method you will need to orthogonalize at least 50K vectors of size 10M. Then:
Space needed: 4 x 10^12 bytes = 4 TB *just for the basis*.
Orthogonalization cost: 5 x 10^16 = 50 PetaOPS.
At step k, each orthogonalization step costs about 4kn; this is 200,000 n for k close to 50,000.
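The arithmetic in this scenario can be checked in a few lines (a sketch of the back-of-the-envelope count, assuming double precision and the usual ~4jn flops for the j-th Gram-Schmidt step):

```python
# Back-of-the-envelope costs for orthogonalizing k = 50,000 vectors of
# size n = 10,000,000 (the hypothetical scenario above).
n = 10_000_000          # matrix size
k = 50_000              # number of computed eigenvectors
bytes_per_entry = 8     # double precision

basis_storage = k * n * bytes_per_entry   # bytes needed for the basis alone
# Orthogonalizing vector j against the previous ones costs about 4*j*n flops;
# summing over j = 1..k gives roughly 2*k^2*n flops in total.
orth_flops = 2 * k**2 * n

print(basis_storage)    # 4 x 10^12 bytes = 4 TB
print(orth_flops)       # 5 x 10^16 flops
```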
Illustration: all eigenvalues in [0, 1] of a 49^3 Laplacian
[Figure: matvec, orthogonalization, and total time for computing all 1,971 eigenvalues in [0, 1], as a function of the number of slices (1 to 6).]
Note: this is a small problem in a scalar environment. The effect is likely much more pronounced in a fully parallel setting.
How do I slice my spectrum?
Answer: use the DOS (density of states).
[Figure: spectrum sliced into 8 sub-intervals using the DOS.]
We must have: int_{t_i}^{t_{i+1}} φ(t) dt = (1/n_slices) int_a^b φ(t) dt.
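The slicing condition above can be sketched by inverting the cumulative DOS (a toy illustration with an assumed smooth DOS, not EVSL's routine):

```python
import numpy as np

def slice_spectrum(phi, a, b, nslices, npts=20001):
    # Choose t_0 = a < t_1 < ... < t_nslices = b so that each slice carries
    # an equal share of the integral of the DOS phi (the condition above).
    t = np.linspace(a, b, npts)
    cdf = np.cumsum(phi(t))
    cdf = cdf / cdf[-1]                       # normalized cumulative DOS
    targets = np.arange(1, nslices) / nslices
    cuts = np.interp(targets, cdf, t)         # invert the cumulative DOS
    return np.concatenate(([a], cuts, [b]))

# Toy smoothed DOS: two "bands" of unequal width and height.
phi = lambda t: np.exp(-t**2) + 0.5 * np.exp(-(t - 3.0)**2)
bounds = slice_spectrum(phi, -2.0, 5.0, 4)
```

Each resulting sub-interval then holds roughly the same number of eigenvalues, so the per-slice work is balanced.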
Polynomial filtering
Apply Lanczos or subspace iteration to M = ρ(A), where ρ(t) is a polynomial.
Each matvec y = Av is replaced by y = ρ(A)v.
Eigenvalues in the high part of the filter will be computed first.
Old (forgotten) idea. But the new context is *very* favorable.
What polynomials?
LS approximations to δ-Dirac functions: obtain the least-squares approximation to the Dirac δ function centered at some point (TBD) inside the interval.
[Figure: polynomial of degree 32 approximating δ(0.5) on [-1, 1].]
We will express everything in the interval [-1, 1].
Theory
The Chebyshev expansion of δ_γ is
ρ_k(t) = sum_{j=0}^{k} μ_j T_j(t), with μ_j = 1 for j = 0 and μ_j = 2 cos(j cos^{-1}(γ)) for j > 0.
Recall: the Dirac delta is not a function, so we can't properly approximate it in the least-squares sense. However:
Proposition. Let ρ̂_k(t) be the polynomial that minimizes ||r(t)||_w over all polynomials r of degree ≤ k such that r(γ) = 1, where ||.||_w denotes the Chebyshev L2-norm. Then ρ̂_k(t) = ρ_k(t)/ρ_k(γ).
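The expansion and its normalization can be sketched directly from the formulas above (using T_j(t) = cos(j arccos t) on [-1, 1]; the mapping from a general interval [a, b] and the damping factors used later are omitted here):

```python
import numpy as np

def cheb_delta(t, gamma, k):
    # rho_k(t) = sum_{j=0}^k mu_j T_j(t), with mu_0 = 1 and
    # mu_j = 2 cos(j arccos(gamma)) for j > 0; T_j(t) = cos(j arccos t).
    t = np.asarray(t, dtype=float)
    s = np.ones_like(t)
    for j in range(1, k + 1):
        s = s + 2.0 * np.cos(j * np.arccos(gamma)) * np.cos(j * np.arccos(t))
    return s

def filter_poly(t, gamma, k):
    # Normalized filter rho_hat_k(t) = rho_k(t) / rho_k(gamma) from the
    # proposition, so that rho_hat_k(gamma) = 1.
    return cheb_delta(t, gamma, k) / cheb_delta(gamma, gamma, k)

vals = filter_poly(np.linspace(-1, 1, 5), 0.5, 32)
```

By construction the normalized filter equals 1 at γ and decays away from it, which is exactly the property the spectral projector approximation needs.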
A few technical details.
Issue #1: balance the filter. To facilitate the selection of wanted eigenvalues [select the λ's such that ρ(λ) > bar] we need to find γ so that ρ(ξ) == ρ(η).
[Figure: unbalanced vs. balanced filters on [-1, 1], with the values at ξ and η marked.]
Procedure: solve the equation ρ_γ(ξ) - ρ_γ(η) = 0 with respect to γ, accurately. Use Newton's method or an eigenvalue formulation.
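A minimal sketch of the balancing step, using bisection on γ in place of the Newton/eigenvalue formulations mentioned above (the damped delta-filter and the interval [ξ, η] = [0.3, 0.6] are illustrative assumptions):

```python
import numpy as np

def rho(t, gamma, k):
    # Degree-k Chebyshev approximation of delta_gamma with sigma (Lanczos)
    # damping, normalized so that rho(gamma, gamma, k) = 1.
    def raw(x):
        s = 1.0
        for j in range(1, k + 1):
            sig = np.sin(j * np.pi / (k + 1)) / (j * np.pi / (k + 1))
            s += sig * 2.0 * np.cos(j * np.arccos(gamma)) * np.cos(j * np.arccos(x))
        return s
    return raw(t) / raw(gamma)

def balance_gamma(xi, eta, k, tol=1e-10):
    # g(gamma) = rho(xi) - rho(eta) is positive at gamma = xi and negative
    # at gamma = eta, so a balancing gamma lies in between; bisect on it.
    g = lambda gam: rho(xi, gam, k) - rho(eta, gam, k)
    lo, hi = xi, eta
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

gamma = balance_gamma(0.3, 0.6, 30)
```

With the balanced γ, a single threshold ("bar") separates wanted from unwanted eigenvalues on both sides of the interval.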
Issue #2: determine the degree & polynomial (automatically). Start low, then increase the degree until the value(s) at the boundary (or boundaries) become small enough. Example for [0.833, 0.97...]:
[Figure: six filters of degrees 3, 4, 7, 13, 20, and 23, all with sigma damping.]
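One plausible reading of this stopping test, sketched below: raise the degree until the damped filter is uniformly small on the unwanted region (a grid over [-1, ξ] stands in for the boundary check; the center γ = 0.9, cutoff ξ = 0.8, and threshold are illustrative assumptions, not EVSL's parameters):

```python
import numpy as np

def rho(t, gamma, k):
    # Damped, normalized delta-filter (sigma damping), as in the figures above.
    def raw(x):
        s = 1.0
        for j in range(1, k + 1):
            sig = np.sin(j * np.pi / (k + 1)) / (j * np.pi / (k + 1))
            s += sig * 2.0 * np.cos(j * np.arccos(gamma)) * np.cos(j * np.arccos(x))
        return s
    return raw(t) / raw(gamma)

def choose_degree(gamma, xi, tau=0.1, kmax=200):
    # Start with a low degree and increase it until the filter stays below
    # tau on the unwanted region [-1, xi].
    grid = np.linspace(-1.0, xi, 200)
    for k in range(2, kmax + 1):
        if np.max(np.abs(rho(grid, gamma, k))) < tau:
            return k
    return kmax

k = choose_degree(0.9, 0.8)
```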
Polynomial filtered Lanczos: no-restart version
Use Lanczos with full reorthogonalization on ρ(A). The eigenvalues of ρ(A) are the ρ(λ_i):
Accept if ρ(λ_i) ≥ bar; ignore if ρ(λ_i) < bar.
[Figure: degree-23 filter with sigma damping; eigenvalues with ρ(λ) above the bar are wanted, the rest are unwanted.]
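The accept/reject rule can be illustrated on a toy spectrum (a dense grid of eigenvalues stands in for the Ritz values that Lanczos on ρ(A) would produce; γ, degree, and bar are illustrative assumptions):

```python
import numpy as np

def rho(t, gamma, k):
    # Damped, normalized delta-filter (sigma damping), as in the slides.
    def raw(x):
        s = 1.0
        for j in range(1, k + 1):
            sig = np.sin(j * np.pi / (k + 1)) / (j * np.pi / (k + 1))
            s += sig * 2.0 * np.cos(j * np.arccos(gamma)) * np.cos(j * np.arccos(x))
        return s
    return raw(t) / raw(gamma)

# Toy spectrum on [-1, 1]; in the real algorithm the lambda_i would be
# Ritz values produced by Lanczos applied to rho(A).
lam = np.linspace(-1.0, 1.0, 201)
gamma, k, bar = 0.25, 40, 0.5
vals = rho(lam, gamma, k)       # eigenvalues of rho(A) are rho(lambda_i)
accepted = lam[vals >= bar]     # accept if rho(lambda_i) >= bar, else ignore
```

The accepted λ's form a small window around γ, which is exactly the slice the filter was built for.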
Polynomial filtered Lanczos: thick-restart version
PolFilt Thick-Restart Lanczos in a picture:
[Figure: degree-32 delta filter at 0.5 on [-1, 1]; candidates in the wanted area are kept, the rest rejected.]
If accurate, then lock; else add to the thick-restart set.
Due to locking, no more candidates will show up in the wanted area after some point → stop.
TR Lanczos: the 3 types of basis vectors
[Figure: the basis vectors and their matrix representation, partitioned into locked vectors, thick-restart vectors, and Lanczos vectors.]
Experiments: Hamiltonian matrices from PARSEC

Matrix        n        nnz     [a, b]           [ξ, η]            ν_[ξ,η]
Ge87H76       112,985  7.9M    [-1.21, 32.76]   [-0.64, -0.0053]  212
Ge99H100      112,985  8.5M    [-1.22, 32.70]   [-0.65, -0.0096]  250
Si41Ge41H72   185,639  15.0M   [-1.12, 49.82]   [-0.64, -0.0028]  218
Si87H76       240,369  10.6M   [-1.19, 43.07]   [-0.66, -0.0300]  213
Ga41As41H72   268,096  18.5M   [-1.25, 1301]    [-0.64, 0.0]      201
Results (no-restart Lanczos):

Matrix        deg  iter   matvec   matvec(s)  orth(s)  total(s)  max residual
Ge87H76       26   1,020  26,784   48.58      18.67    74.45     1.2e-12
Ge99H100      26   1,090  28,642   60.11      20.44    86.52     7.2e-12
Si41Ge41H72   32   950    30,682   105.05     28.25    144.19    1.2e-10
Si87H76       29   1,010  29,561   76.45      39.16    128.95    4.3e-12
Ga41As41H72   174  910    158,889  693.5      34.16    759.99    3.7e-12

Demo with Si10H16 [n = 17,077, nnz(A) = 446,500]
RATIONAL FILTERS
Why use rational filters?
Consider a spectrum with large outliers:
[Figure: spectrum spread over a huge range.]
Polynomial filtering is utterly ineffective in this case.
Second issue: situations where matrix-vector products are expensive, e.g., generalized eigenvalue problems.
The alternative is to use rational filters:
φ(z) = sum_j α_j/(z - σ_j),  so  φ(A) = sum_j α_j (A - σ_j I)^{-1}
We now need to solve linear systems.
Tool: Cauchy integral representations of spectral projectors:
P = (1/(2iπ)) int_Γ (A - sI)^{-1} ds
Numerical integration → P̃ ≈ P. Use Krylov methods or subspace iteration on P̃.
Sakurai-Sugiura approach [Krylov]; FEAST [subspace iteration] (E. Polizzi).
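A minimal sketch of the quadrature idea: discretizing the Cauchy integral on the unit circle with the trapezoid rule (contour oriented so the projector comes out positive) gives poles σ_j and weights α_j, and the resulting filter is ≈ 1 at eigenvalues inside the contour and ≈ 0 outside. For a diagonal A each term corresponds to one solve with (A - σ_j I); here the filter is evaluated scalar-wise on a toy spectrum:

```python
import numpy as np

N = 16                                              # quadrature points (poles)
s = np.exp(2j * np.pi * (np.arange(N) + 0.5) / N)   # poles on the unit circle
lam = np.array([-2.0, -0.5, 0.3, 1.8])              # toy spectrum of A

# Trapezoid rule for (1/2i*pi) * contour integral of (sI - A)^{-1} ds reduces,
# eigenvalue by eigenvalue, to (1/N) * sum_j s_j / (s_j - lam).
f = np.real(np.mean(s[:, None] / (s[:, None] - lam[None, :]), axis=0))
# f is close to 1 for eigenvalues inside the circle, close to 0 outside.
```

The half-point shift in the nodes keeps all poles off the real axis, which is what makes the linear systems (A - σ_j I) well posed for real spectra.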
What makes a good filter?
[Figure: filters built from a single pole at σ = 1.1i, the same pole repeated (power 3), an optimized repeated pole, and a 2-pole Gauss (Cauchy) filter; full view and zoom near the interval boundary.]
Assume subspace iteration is used with the above filters. Which filter will give better convergence?
The simplest and best indicator of the performance of a filter is the magnitude of its derivative at -1 (or 1).
The Gauss viewpoint: least-squares rational filters
Given: poles σ_1, σ_2, ..., σ_p, with related basis functions φ_j(z) = 1/(z - σ_j).
Find φ(z) = sum_{j=1}^{p} α_j φ_j(z) that minimizes int w(t) |h(t) - φ(t)|^2 dt, where h(t) is the step function χ_[-1,1] and w(t) is a weight function. For example, with a = 10 and β = 0.01:
w(t) = 0 if |t| > a; β if |t| ≤ 1; 1 otherwise.
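Once the poles are fixed, the α_j solve a small weighted least-squares problem; a discretized sketch follows (the pole locations are illustrative, chosen in conjugate pairs so the filter is real on the real axis, and are not an EVSL pole selection):

```python
import numpy as np

poles = np.array([1.1j, -1.1j, 0.6 + 0.8j, 0.6 - 0.8j, -0.6 + 0.8j, -0.6 - 0.8j])
a, beta = 10.0, 0.01
t = np.linspace(-a, a, 4001)                  # quadrature grid on [-a, a]
h = (np.abs(t) <= 1.0).astype(float)          # target step function chi_[-1,1]
w = np.where(np.abs(t) <= 1.0, beta, 1.0)     # weight (0 outside [-a, a])

B = 1.0 / (t[:, None] - poles[None, :])       # basis functions phi_j(t)
sw = np.sqrt(w)
# Weighted LS: minimize sum_i w_i |h(t_i) - sum_j alpha_j phi_j(t_i)|^2.
alpha, *_ = np.linalg.lstsq(B * sw[:, None], (h * sw).astype(complex), rcond=None)

phi = lambda x: np.real((1.0 / (x - poles)) @ alpha)   # the LS rational filter
```

Note that with the small interior weight β the filter need not be close to 1 inside [-1, 1]; what matters for subspace iteration is the separation between values inside and outside the interval.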
Advantages:
Can select poles far away from the real axis → faster iterative solvers.
Very flexible: can be adapted to many situations.
Can repeat poles (!)
Implemented in EVSL. [Interfaced to SuiteSparse as a solver]
Spectrum slicing and the EVSL project
The newly released EVSL uses polynomial and rational filters; each can be appealing in different situations.
Spectrum slicing: cut the overall interval containing the spectrum into small sub-intervals and compute the eigenpairs in each sub-interval independently.
[Figure: polynomial filters covering adjacent slices of [-1, 1].]
Levels of parallelism
[Figure: macro-tasks across slices (slices 1-3), and domain decomposition (domains 1-4) within each slice.]
The two main levels of parallelism in EVSL.
EVSL main contributors (version 1.1.0) + support
Ruipeng Li (LLNL), Yuanzhe Xi (post-doc, UMN), Luke Erlandson (UG intern, UMN).
Work supported by DOE [ending this summer] ... and by NSF [going forward].
EVSL: current status & plans
Version 1.0, released in Sept. 2016:
Matrices in CSR format (only)
Standard Hermitian problems (no generalized)
Spectrum slicing with KPM (kernel polynomial method)
Trivial parallelism across slices with OpenMP
Methods: non-restart Lanczos, thick-restart Lanczos, and subspace iteration, each with polynomial & rational filters
Version 1.1.x: V1.1.0 due for release at the end of July:
General matvec [passed as a function pointer]
Ax = λBx
Fortran (03) interface
Spectrum slicing by Lanczos and KPM
Efficient spectrum slicing for Ax = λBx (no solves with B)
Version 1.2.x: V1.2.0, early 2018 (?):
Fully parallel version [MPI + OpenMP]
Challenge application in earth sciences [in progress]
Conclusion
Polynomial filtering is appealing when the number of eigenpairs to be computed is large and matvecs are not too expensive. It is somewhat costly for generalized eigenvalue problems and will not work well for spectra with large outliers.
Alternative: rational filtering.
Both approaches are implemented in EVSL. Current focus: provide as many interfaces as possible.
EVSL code available here: www.cs.umn.edu/~saad/software/evsl
EVSL also on github (development)