Domain Decomposition-based contour integration eigenvalue solvers
1 Domain Decomposition-based contour integration eigenvalue solvers
Vassilis Kalantzis, joint work with Yousef Saad
Computer Science and Engineering Department, University of Minnesota - Twin Cities, USA
SIAM ALA 2015
VK, YS (U of M) DD-CINT techniques / 31
2 Acknowledgments
Joint work with Eric Polizzi and James Kestyn (UMass Amherst).
Special thanks to the Minnesota Supercomputing Institute for allowing access to its supercomputers.
Work supported by NSF and DOE (DE-SC ).
3 Contents
1 Introduction
2 The Domain Decomposition framework
3 Domain Decomposition-based contour integration
4 Implementation in HPC architectures
5 Experiments
6 Discussion
4 Introduction
The sparse symmetric eigenvalue problem
\[ Ax = \lambda x, \]
where $A \in \mathbb{R}^{n \times n}$, $x \in \mathbb{R}^{n}$ and $\lambda \in \mathbb{R}$. A pair $(\lambda, x)$ is called an eigenpair of $A$.
Focus in this talk: find all $(\lambda, x)$ pairs inside the interval $[\alpha, \beta]$, where $\alpha, \beta \in \mathbb{R}$ and $\lambda_1 \le \alpha$, $\beta \le \lambda_n$.
Typical approach: project $A$ on a low-dimensional subspace by
\[ V^T A V y = \theta V^T V y, \qquad x = Vy. \]
Choices of $V$: Krylov, (Generalized-Jacobi)-Davidson, contour integration, ...
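The projection step above can be sketched with NumPy as follows. This is only an illustration under stated assumptions: the 1D Laplacian test matrix, the helper name `rayleigh_ritz`, and the mixing matrix `mix` are hypothetical choices, not part of the talk. The basis is orthonormalized first, so that $V^T V = I$ and the projected problem becomes a standard one.

```python
import numpy as np

def rayleigh_ritz(A, V):
    """Solve V^T A V y = theta V^T V y and return (theta, X) with X = V y."""
    Q, _ = np.linalg.qr(V)                  # orthonormal basis, so Q^T Q = I
    theta, Y = np.linalg.eigh(Q.T @ A @ Q)  # small dense projected problem
    return theta, Q @ Y

n = 50
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # 1D Laplacian (assumed test matrix)
lam, X = np.linalg.eigh(A)

# Basis of an exact invariant subspace, mixed by an invertible 3x3 matrix.
mix = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]])
V = X[:, 10:13] @ mix

theta, X_ritz = rayleigh_ritz(A, V)
residual = np.linalg.norm(A @ X_ritz - X_ritz * theta)
print(theta, residual)
```

When $V$ spans an exact invariant subspace, as here, the Ritz pairs coincide with eigenpairs of $A$ up to roundoff; with an approximate subspace they are only approximations.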
5 Contour integration (CINT)
\[ V := P\hat{V} = \frac{1}{2i\pi} \int_{\Gamma} (\zeta I - A)^{-1}\, d\zeta\; \hat{V} \equiv X X^T \hat{V}, \]
with $\Gamma$ a complex contour with endpoints $[\alpha, \beta]$. $V$ is an exact invariant subspace.
Numerical approximation:
\[ P\hat{V} \approx \tilde{P}\hat{V} = \sum_{j=1}^{n_c} \omega_j (\zeta_j I - A)^{-1} \hat{V}, \qquad \rho(z) = \sum_{j=1}^{n_c} \frac{\omega_j}{\zeta_j - z}, \]
with (weight, pole) pairs $(\omega_j, \zeta_j)$, $j = 1, \ldots, n_c$.
- Trapezoidal, Midpoint, Gauss-Legendre, ...
- Zolotarev, Least-Squares, ...
- FEAST (Polizzi), Sakurai-Sugiura (SS), SS-CIRR, ...
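A minimal sketch of the quadrature-based filter, assuming a midpoint rule on the full circle through $\alpha$ and $\beta$ (the talk lists several rules; this particular choice, the helper names `circle_rule` and `filtered`, and the 1D Laplacian test matrix are assumptions made for illustration). For this rule the filter has the closed form $\rho(z) = 1 / (1 + w^{N})$ with $w = (z - c)/r$, so it is close to 1 inside the circle and close to 0 outside.

```python
import numpy as np

def circle_rule(alpha, beta, n_nodes):
    """Midpoint rule on the circle through alpha and beta: weights and poles."""
    c, r = 0.5 * (alpha + beta), 0.5 * (beta - alpha)
    t = 2.0 * np.pi * (np.arange(n_nodes) + 0.5) / n_nodes
    zeta = c + r * np.exp(1j * t)
    omega = r * np.exp(1j * t) / n_nodes    # discretized (1 / (2 i pi)) d zeta
    return omega, zeta

def filtered(A, V_hat, omega, zeta):
    """Return sum_j omega_j (zeta_j I - A)^{-1} V_hat (real part)."""
    I = np.eye(A.shape[0])
    Z = np.zeros(V_hat.shape, dtype=complex)
    for w, z in zip(omega, zeta):
        Z += w * np.linalg.solve(z * I - A, V_hat)
    return Z.real

n = 50
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # 1D Laplacian (assumed test matrix)
alpha, beta = 1.75, 2.25
omega, zeta = circle_rule(alpha, beta, 64)

P_approx = filtered(A, np.eye(n), omega, zeta)          # approximate projector

lam, X = np.linalg.eigh(A)
inside = (lam > alpha) & (lam < beta)
P_exact = X[:, inside] @ X[:, inside].T                  # exact spectral projector
err = np.linalg.norm(P_approx - P_exact, 2)
print(int(inside.sum()), err)
```

Each pole requires one linear solve with multiple right-hand sides, and the solves for different poles are independent of each other.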
6 Main characteristics of CINT
- Can be seen as a (rational) filtering technique
- Different levels of parallelism
- Eigenvalue problem -> linear systems with multiple right-hand sides
In this talk
- We study contour integration from a Domain Decomposition (DD) point of view
- Two ideas: use DD to derive CINT schemes; use DD to accelerate FEAST or other CINT-based methods
- We target parallel architectures
8 Partitioning of the domain (Metis, Scotch, ...)
Figure: An edge-separator (vertex-based partitioning)
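9 The local viewpoint (assume $M$ partitions)
Stack the interior variables $u_1, u_2, \ldots, u_M$ into $u$, followed by the interface variables $y$:
\[
\begin{pmatrix}
B_1 & & & & E_1 \\
& B_2 & & & E_2 \\
& & \ddots & & \vdots \\
& & & B_M & E_M \\
E_1^T & E_2^T & \cdots & E_M^T & C
\end{pmatrix}
\begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_M \\ y \end{pmatrix}
= \lambda
\begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_M \\ y \end{pmatrix}.
\]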
10 Pictorially:
Write as
\[ A = \begin{pmatrix} B & E \\ E^T & C \end{pmatrix}. \]
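The reordering into the block form $A = \begin{pmatrix} B & E \\ E^T & C \end{pmatrix}$ can be sketched as follows. The helper `dd_reorder`, the 12-node chain, and the hand-written two-way partition are hypothetical stand-ins for what a partitioner such as Metis or Scotch would provide; a vertex is treated as an interface vertex when it is coupled to a vertex of another subdomain.

```python
import numpy as np

def dd_reorder(A, part):
    """Permutation listing interior vertices (subdomain by subdomain), then interface vertices."""
    is_interface = np.zeros(A.shape[0], dtype=bool)
    rows, cols = np.nonzero(A)
    cross = part[rows] != part[cols]        # couplings that cross subdomains
    is_interface[rows[cross]] = True
    domains = np.unique(part)
    interior = [i for d in domains for i in np.flatnonzero((part == d) & ~is_interface)]
    interface = [i for d in domains for i in np.flatnonzero((part == d) & is_interface)]
    return np.array(interior + interface), len(interior)

n = 12
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # 1D chain (assumed test matrix)
part = np.repeat([0, 1], n // 2)                          # two subdomains of 6 vertices

perm, n_int = dd_reorder(A, part)
Ap = A[np.ix_(perm, perm)]
B, E, C = Ap[:n_int, :n_int], Ap[:n_int, n_int:], Ap[n_int:, n_int:]
print(perm, n_int)
```

After the permutation, `B` is block diagonal (one block per subdomain), `E` couples interior to interface unknowns, and `C` holds the interface-to-interface couplings.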
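13 Expressing $(A - \zeta I)^{-1}$ in DD
Let $\zeta \in \mathbb{C}$ and recall that
\[ A = \begin{pmatrix} B & E \\ E^T & C \end{pmatrix}. \]
Then
\[
(A - \zeta I)^{-1} =
\begin{pmatrix}
(B - \zeta I)^{-1} + F(\zeta) S(\zeta)^{-1} F(\zeta)^T & -F(\zeta) S(\zeta)^{-1} \\
-S(\zeta)^{-1} F(\zeta)^T & S(\zeta)^{-1}
\end{pmatrix},
\]
where
\[ F(\zeta) = (B - \zeta I)^{-1} E, \qquad S(\zeta) = C - \zeta I - E^T (B - \zeta I)^{-1} E. \]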
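The block expression for $(A - \zeta I)^{-1}$ can be checked numerically. The sketch below assumes a small dense random symmetric matrix whose last three unknowns play the role of the interface; the helper name `dd_resolvent` and the shift value are illustrative. Note that the transpose is a plain transpose, not a conjugate transpose, because $B - \zeta I$ is complex symmetric.

```python
import numpy as np

def dd_resolvent(B, E, C, zeta):
    """Assemble (A - zeta I)^{-1} from F(zeta) and the Schur complement S(zeta)."""
    nb, ns = B.shape[0], C.shape[0]
    Bz = B - zeta * np.eye(nb)
    F = np.linalg.solve(Bz, E)                  # F = (B - zeta I)^{-1} E
    S = C - zeta * np.eye(ns) - E.T @ F         # S = C - zeta I - E^T (B - zeta I)^{-1} E
    S_inv = np.linalg.inv(S)
    top_left = np.linalg.inv(Bz) + F @ S_inv @ F.T
    return np.block([[top_left, -F @ S_inv], [-S_inv @ F.T, S_inv]])

rng = np.random.default_rng(0)
M = rng.standard_normal((8, 8))
A = 0.5 * (M + M.T)                             # dense symmetric stand-in
B, E, C = A[:5, :5], A[:5, 5:], A[5:, 5:]
zeta = 0.3 + 0.5j                               # non-real shift, so all inverses exist

R_dd = dd_resolvent(B, E, C, zeta)
R_ref = np.linalg.inv(A - zeta * np.eye(8))
err = np.linalg.norm(R_dd - R_ref)
print(err)
```

Explicit inverses are used only because the example is tiny; a practical code would apply factorizations of $B - \zeta I$ and $S(\zeta)$ instead.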
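15 Spectral projectors and DD
As previously,
\[
(A - \zeta I)^{-1} =
\begin{pmatrix}
(B - \zeta I)^{-1} + F(\zeta) S(\zeta)^{-1} F(\zeta)^T & -F(\zeta) S(\zeta)^{-1} \\
-S(\zeta)^{-1} F(\zeta)^T & S(\zeta)^{-1}
\end{pmatrix}.
\]
Then,
\[
P_{DD} = \frac{-1}{2i\pi} \int_{\Gamma} (A - \zeta I)^{-1}\, d\zeta \equiv
\begin{pmatrix} H & -W \\ -W^T & G \end{pmatrix},
\]
with
\[ H = \frac{-1}{2i\pi} \int_{\Gamma} \left[ (B - \zeta I)^{-1} + F(\zeta) S(\zeta)^{-1} F(\zeta)^T \right] d\zeta, \]
\[ G = \frac{-1}{2i\pi} \int_{\Gamma} S(\zeta)^{-1}\, d\zeta, \qquad W = \frac{-1}{2i\pi} \int_{\Gamma} F(\zeta) S(\zeta)^{-1}\, d\zeta. \]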
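17 Extracting approximate eigenspaces
Let $\hat{V}$ be a set of multiple right-hand sides (mrhs) to multiply $P$:
\[
P_{DD} \begin{pmatrix} \hat{V}_u \\ \hat{V}_s \end{pmatrix}
= \begin{pmatrix} H \hat{V}_u - W \hat{V}_s \\ -W^T \hat{V}_u + G \hat{V}_s \end{pmatrix}
\equiv \begin{pmatrix} Z_u \\ Z_s \end{pmatrix},
\]
with
\[ Z_u = \frac{-1}{2i\pi} \int_{\Gamma} (B - \zeta I)^{-1} \hat{V}_u\, d\zeta + \frac{1}{2i\pi} \int_{\Gamma} F(\zeta) S(\zeta)^{-1} \left[ \hat{V}_s - F(\zeta)^T \hat{V}_u \right] d\zeta, \]
\[ Z_s = \frac{-1}{2i\pi} \int_{\Gamma} S(\zeta)^{-1} \left[ \hat{V}_s - F(\zeta)^T \hat{V}_u \right] d\zeta. \]
In practice:
\[ Z_u = \sum_{j=1}^{n_c} \omega_j (B - \zeta_j I)^{-1} \hat{V}_u - \sum_{j=1}^{n_c} \omega_j F(\zeta_j) S(\zeta_j)^{-1} \left[ \hat{V}_s - F(\zeta_j)^T \hat{V}_u \right], \]
\[ Z_s = \sum_{j=1}^{n_c} \omega_j S(\zeta_j)^{-1} \left[ \hat{V}_s - F(\zeta_j)^T \hat{V}_u \right]. \]
Here the weights $\omega_j$ absorb the factor $-1/(2i\pi)$ of the integrals above.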
18 Pseudocode - Full projector (DD-FP)
for $j = 1$ to $n_c$ do
    $W_u := (B - \zeta_j I)^{-1} \hat{V}_u$ (local)
    $W_s := \hat{V}_s - E^T W_u$ (local)
    $W_s := S(\zeta_j)^{-1} W_s$; $Z_s := Z_s + \omega_j W_s$ (distributed)
    $W_u := W_u - (B - \zeta_j I)^{-1} E W_s$; $Z_u := Z_u + \omega_j W_u$ (local)
end for
Practical considerations
- For each $\zeta_j$, $j = 1, \ldots, n_c$: two solves with $B - \zeta_j I$ plus one solve with $S(\zeta_j)$
- The procedure can be repeated with an updated $\hat{V}$
- Equivalent to FEAST tied with a DD solver
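A dense, serial sketch of the DD-FP loop, meant only to show the data flow of the pseudocode. The function name `dd_full_projector`, the random test matrix, and the arbitrary poles and weights are assumptions; in the talk the solves with $B - \zeta_j I$ are local to each subdomain and the solve with $S(\zeta_j)$ is distributed, whereas here everything is a plain NumPy solve and $S(\zeta_j)$ is formed explicitly. The result is compared with $\sum_j \omega_j (A - \zeta_j I)^{-1} \hat{V}$.

```python
import numpy as np

def dd_full_projector(B, E, C, V_u, V_s, omega, zeta):
    """Accumulate Z_u, Z_s following the DD-FP loop (dense, serial sketch)."""
    nb, ns = B.shape[0], C.shape[0]
    Z_u = np.zeros(V_u.shape, dtype=complex)
    Z_s = np.zeros(V_s.shape, dtype=complex)
    for w, z in zip(omega, zeta):
        Bz = B - z * np.eye(nb)
        W_u = np.linalg.solve(Bz, V_u)                 # first solve with B - zeta_j I
        W_s = V_s - E.T @ W_u
        S = C - z * np.eye(ns) - E.T @ np.linalg.solve(Bz, E)  # formed explicitly (small case only)
        W_s = np.linalg.solve(S, W_s)                  # solve with S(zeta_j)
        Z_s += w * W_s
        W_u = W_u - np.linalg.solve(Bz, E @ W_s)       # second solve with B - zeta_j I
        Z_u += w * W_u
    return Z_u, Z_s

rng = np.random.default_rng(0)
M = rng.standard_normal((8, 8))
A = 0.5 * (M + M.T)                                    # dense symmetric stand-in
nb = 5
B, E, C = A[:nb, :nb], A[:nb, nb:], A[nb:, nb:]
V_hat = rng.standard_normal((8, 2))
zeta = np.array([0.2 + 0.4j, -0.1 + 0.7j, 0.5 + 0.3j])   # arbitrary poles
omega = np.array([0.3 - 0.1j, 0.2 + 0.2j, -0.4 + 0.1j])  # arbitrary weights

Z_u, Z_s = dd_full_projector(B, E, C, V_hat[:nb], V_hat[nb:], omega, zeta)
Z_ref = sum(w * np.linalg.solve(A - z * np.eye(8), V_hat) for w, z in zip(omega, zeta))
err_u = np.linalg.norm(Z_u - Z_ref[:nb])
err_s = np.linalg.norm(Z_s - Z_ref[nb:])
print(err_u, err_s)
```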
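20 An alternative scheme
CINT along the interface unknowns:
\[
P_{DD} = \frac{-1}{2i\pi} \int_{\Gamma} (A - \zeta I)^{-1}\, d\zeta = [P_1, P_2], \qquad
P_2 = \begin{pmatrix} -W \\ G \end{pmatrix},
\]
\[ G = \frac{-1}{2i\pi} \int_{\Gamma} S(\zeta)^{-1}\, d\zeta, \qquad W = \frac{-1}{2i\pi} \int_{\Gamma} (B - \zeta I)^{-1} E S(\zeta)^{-1}\, d\zeta. \]
Advantage: does not involve the inverse of the whole matrix.
\[
P_{DD} = X X^T, \quad X = \begin{pmatrix} U \\ Y \end{pmatrix}
\quad \Rightarrow \quad
P_2 = \begin{pmatrix} U Y^T \\ Y Y^T \end{pmatrix} = X Y^T.
\]
- Just capture the range of $P_2 = X Y^T$: $P_2 \times$ randn()
- Also: Lanczos on $P_2 P_2^T$ (sequential, doubles the work)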
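The range-capture idea ($P_2 \times$ randn()) can be illustrated as below. This sketch assumes the exact projector is available from a dense eigendecomposition, which would not be the case in practice, and uses a dense random symmetric matrix whose last `ns` unknowns stand in for the interface; all names and sizes are illustrative. It also assumes $Y$ has full column rank, which requires at least as many interface unknowns as sought eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(1)
n, ns, k = 30, 8, 3
M = rng.standard_normal((n, n))
A = 0.5 * (M + M.T)                           # dense symmetric stand-in
lam, X = np.linalg.eigh(A)

Xk = X[:, :k]                                 # eigenvectors of the k smallest eigenvalues
P = Xk @ Xk.T                                 # exact spectral projector X X^T
P2 = P[:, n - ns:]                            # last ns columns: P2 = X Y^T
Z = P2 @ rng.standard_normal((ns, k + 2))     # P2 times a random block

U, s, _ = np.linalg.svd(Z, full_matrices=False)
Q = U[:, s > 1e-8 * s[0]]                     # orthonormal basis of the numerical range
theta = np.linalg.eigvalsh(Q.T @ A @ Q)       # Rayleigh-Ritz values on that range
print(Q.shape[1], theta)
```

The numerical rank of `Z` recovers the number of eigenvalues inside the contour, and the Ritz values on its range reproduce them.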
21 Pseudocode - Partial projector (DD-PP)
for $j = 1$ to $n_c$ do
    $W_s := S(\zeta_j)^{-1} \hat{V}_s$; $Z_s := Z_s + \omega_j W_s$ (distributed)
    $W_u := -(B - \zeta_j I)^{-1} E W_s$; $Z_u := Z_u + \omega_j W_u$ (local)
end for
Compared with DD-FP, the initial solve with $B - \zeta_j I$ and the update of $\hat{V}_s$ are skipped, since only $P_2$ is applied.
Practical considerations
- For each $\zeta_j$, $j = 1, \ldots, n_c$: one solve with $B - \zeta_j I$ plus one solve with $S(\zeta_j)$
- More like a one-shot method
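A dense, serial sketch of the partial projector applied to interface right-hand sides only, that is $Z_s = \sum_j \omega_j S(\zeta_j)^{-1} \hat{V}_s$ and $Z_u = -\sum_j \omega_j (B - \zeta_j I)^{-1} E S(\zeta_j)^{-1} \hat{V}_s$, as the expressions for $G$ and $W$ suggest. The function name, the test matrix, and the arbitrary poles and weights are assumptions, and $S(\zeta_j)$ is formed explicitly only because the example is tiny.

```python
import numpy as np

def dd_partial_projector(B, E, C, V_s, omega, zeta):
    """Accumulate Z_u, Z_s for the last block column of the projector (dense sketch)."""
    nb, ns = B.shape[0], C.shape[0]
    Z_u = np.zeros((nb, V_s.shape[1]), dtype=complex)
    Z_s = np.zeros((ns, V_s.shape[1]), dtype=complex)
    for w, z in zip(omega, zeta):
        Bz = B - z * np.eye(nb)
        # Explicit S is only sensible here; a practical code would factorize or approximate it.
        S = C - z * np.eye(ns) - E.T @ np.linalg.solve(Bz, E)
        W_s = np.linalg.solve(S, V_s)              # one solve with S(zeta_j)
        Z_s += w * W_s
        W_u = -np.linalg.solve(Bz, E @ W_s)        # one solve with B - zeta_j I
        Z_u += w * W_u
    return Z_u, Z_s

rng = np.random.default_rng(0)
M = rng.standard_normal((8, 8))
A = 0.5 * (M + M.T)                                # dense symmetric stand-in
nb = 5
B, E, C = A[:nb, :nb], A[:nb, nb:], A[nb:, nb:]
V_s = rng.standard_normal((3, 2))
zeta = np.array([0.2 + 0.4j, -0.1 + 0.7j, 0.5 + 0.3j])   # arbitrary poles
omega = np.array([0.3 - 0.1j, 0.2 + 0.2j, -0.4 + 0.1j])  # arbitrary weights

Z_u, Z_s = dd_partial_projector(B, E, C, V_s, omega, zeta)
R = sum(w * np.linalg.inv(A - z * np.eye(8)) for w, z in zip(omega, zeta))
Z_ref = R[:, nb:] @ V_s                            # last block column applied to V_s
err_u = np.linalg.norm(Z_u - Z_ref[:nb])
err_s = np.linalg.norm(Z_s - Z_ref[nb:])
print(err_u, err_s)
```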
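22 A more detailed analysis of DD-PP
Spectral Schur complement:
\[ \lambda \in \sigma(A) \iff \det[S(\lambda)] = 0 \qquad (\lambda \notin \sigma(B)). \]
The eigenvector satisfies
\[ x = \begin{pmatrix} -(B - \lambda I)^{-1} E y \\ y \end{pmatrix}, \quad \text{with } y \text{ such that } S(\lambda) y = 0. \]
If $(B - \zeta I)^{-1}$ is analytic in $[\alpha, \beta]$:
\[ W = \frac{-1}{2i\pi} \int_{\Gamma} (B - \zeta I)^{-1} E S(\zeta)^{-1}\, d\zeta \equiv \{ (B - \lambda I)^{-1} E y \}_{\Gamma}, \]
\[ G = \frac{-1}{2i\pi} \int_{\Gamma} S(\zeta)^{-1}\, d\zeta \equiv \{ y \}_{\Gamma}. \]
Another idea: solve $S(\lambda) y = 0$ directly ([VK,RLi,YS], [VanB,Kra], [Sak])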
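The relation between eigenpairs of $A$ and the spectral Schur complement can be checked on a small case. In this sketch the matrix is a 12-node 1D Laplacian and the last two unknowns simply play the role of the interface (an assumption made to keep the example short); the smallest eigenvalue of $A$ is used because it is known not to be an eigenvalue of $B$ here.

```python
import numpy as np

def lap(m):
    """1D Laplacian of size m (assumed test matrix)."""
    return 2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)

nb, ns = 10, 2
A = lap(nb + ns)
B, E, C = A[:nb, :nb], A[:nb, nb:], A[nb:, nb:]

lam = np.linalg.eigvalsh(A)[0]                   # smallest eigenvalue of A
F = np.linalg.solve(B - lam * np.eye(nb), E)     # (B - lam I)^{-1} E
S = C - lam * np.eye(ns) - E.T @ F               # S(lam), expected to be singular

_, sigma, Vt = np.linalg.svd(S)
y = Vt[-1]                                       # null vector of S(lam)
x = np.concatenate([-F @ y, y])                  # x = (-(B - lam I)^{-1} E y; y)
residual = np.linalg.norm(A @ x - lam * x)
print(sigma, residual)
```

The smallest singular value of $S(\lambda)$ vanishes up to roundoff, and the vector built from its null vector is an eigenvector of $A$.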
24 A closer look at the Schur complement
So far: eigenvalue problem -> linear systems with mrhs -> Schur complement
From the DD framework we have
\[
S(\zeta) =
\begin{pmatrix}
S_1(\zeta) & E_{12} & \cdots & E_{1M} \\
E_{21} & S_2(\zeta) & \cdots & E_{2M} \\
\vdots & & \ddots & \vdots \\
E_{M1} & E_{M2} & \cdots & S_M(\zeta)
\end{pmatrix},
\]
where
\[ S_i(\zeta) = C_i - \zeta I - E_i^T (B_i - \zeta I)^{-1} E_i, \qquad i = 1, \ldots, M, \]
is the local Schur complement (complex symmetric).
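The block structure of $S(\zeta)$ can be reproduced on a two-subdomain chain: each local Schur complement $S_i(\zeta)$ is computed from local data only, and the off-diagonal blocks are the interface couplings $E_{ij}$. The 12-node chain, the helper names, and the shift are assumptions chosen for illustration; the assembled matrix is compared with the global formula $C - \zeta I - E^T (B - \zeta I)^{-1} E$.

```python
import numpy as np

def lap(m):
    """1D Laplacian of size m (assumed test matrix)."""
    return 2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)

def local_schur(B_i, E_i, C_i, zeta):
    """S_i(zeta) = C_i - zeta I - E_i^T (B_i - zeta I)^{-1} E_i."""
    Ib, Ic = np.eye(B_i.shape[0]), np.eye(C_i.shape[0])
    return C_i - zeta * Ic - E_i.T @ np.linalg.solve(B_i - zeta * Ib, E_i)

# Two subdomains of a 12-node chain: 5 interior nodes and 1 interface node each.
B1, B2 = lap(5), lap(5)
E1 = np.zeros((5, 1))
E1[-1, 0] = -1.0                      # last interior node of subdomain 1 touches its interface node
E2 = np.zeros((5, 1))
E2[0, 0] = -1.0                       # first interior node of subdomain 2 touches its interface node
C1, C2 = np.array([[2.0]]), np.array([[2.0]])
E12 = np.array([[-1.0]])              # coupling between the two interface nodes
zeta = 1.0 + 0.2j

S_assembled = np.block([[local_schur(B1, E1, C1, zeta), E12],
                        [E12.T, local_schur(B2, E2, C2, zeta)]])

Z55, Z51 = np.zeros((5, 5)), np.zeros((5, 1))
B = np.block([[B1, Z55], [Z55, B2]])
E = np.block([[E1, Z51], [Z51, E2]])
C = np.block([[C1, E12], [E12.T, C2]])
S_global = C - zeta * np.eye(2) - E.T @ np.linalg.solve(B - zeta * np.eye(10), E)
err = np.linalg.norm(S_assembled - S_global)
print(err)
```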
25 Solving linear systems with the Schur complement
Straightforward approach
- Form and factorize $S(\zeta)$
- Extremely robust but impractical for 3D problems
Alternative
- Use an approximation of $S(\zeta)$
- Lots of ideas (pARMS, LORASC, ...)
Typical preconditioners implemented:
- Block Jacobi: use $C_i$, $C_i - \zeta I$ or $S_i(\zeta)$, $i = 1, \ldots, M$
- Global approximation: use $C$, $C - \zeta I$ or $S(\zeta)$
- Memory vs. robustness
Important: magnitude of the imaginary part of a pole
27 Implementation and computing environment
Hardware
- ITASCA HP Linux cluster at Minnesota Supercomputing Inst.
- 1,091 HP ProLiant BL280c G6 blade servers, each with two-socket, quad-core 2.8 GHz Intel Xeon X5560 "Nehalem EP" processors (24 GB per node)
- 40-gigabit QDR InfiniBand (IB) interconnect
Software
- The software was written in C++ on top of PETSc (MPI)
- Linked to AMD, METIS, UMFPACK, MUMPS, MKL-BLAS, MKL-LAPACK
- Compiled with mpiicpc (-O3)
28 Parallel implementation
[Figure: 4 x 4 grid of MPI processes labeled (0,0) to (3,3), with the communicators COMM_DD and COMM_XX]
- Different levels of parallelism (1-D, 2-D, 3-D grids)
- COMM_DD is the communicator used for Domain Decomposition
- Different choices for XX:
  XX = mrhs: parallelize the multiple right-hand sides
  XX = poles: parallelize the number of poles
- Selection depends on the linear solver (direct or iterative)
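The 2-D process grid can be sketched in plain Python by mapping each rank to grid coordinates and grouping ranks by row and by column. The orientation is an assumption (the transcript does not say which grid direction carries COMM_DD): here ranks in the same row are taken to share one COMM_DD and ranks in the same column one COMM_XX. In an MPI code the row and column indices would typically serve as the colors passed to a communicator split; no MPI calls are made in this sketch.

```python
def grid_coords(rank, n_dd):
    """Place a rank on a 2-D grid with n_dd columns (row-major ordering)."""
    row, col = divmod(rank, n_dd)
    return row, col

n_xx, n_dd = 4, 4                       # 4 x 4 grid, as in the figure
coords = [grid_coords(r, n_dd) for r in range(n_xx * n_dd)]

# Assumed orientation: same row -> same COMM_DD, same column -> same COMM_XX.
comm_dd_groups = {row: [r for r, (i, _) in enumerate(coords) if i == row]
                  for row in range(n_xx)}
comm_xx_groups = {col: [r for r, (_, j) in enumerate(coords) if j == col]
                  for col in range(n_dd)}

print(coords[6], comm_dd_groups[1], comm_xx_groups[2])
```

With XX = poles, each COMM_DD group would handle one pole (or a subset of poles); with XX = mrhs, each would handle a slice of the right-hand sides.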
29 Experimental framework
CINT + Subspace Iteration
- CINT-SI: standard FEAST approach
- Direct (MUMPS) or iterative (preconditioned) solver
CINT + DD
- DD-FP: implements the full projector
- DD-PP: implements the partial projector
- Schur complement: exact or approximate
Details
- # MPI processes ≡ # cores
- Quadrature rule: Gauss-Legendre
- Eig/vle tolerance: 1e-8
30 Numerical illustration
A comparison of DD-FP and DD-PP I
[Figure: average relative residual versus iterations for a Laplacean test problem, $[\alpha, \beta] = [1.6, 1.7]$; curves for DD-FP with $n_c = 4$ and DD-PP with $n_c = 4, 8, 12$]
31 Numerical illustration (cont. from previous)
A comparison of DD-FP and DD-PP II
32 Test on a 2D Laplacian
Table: Time is listed in seconds. A 2-D grid of processors was used. $S(\zeta)$ factorized explicitly. Number of poles: $n_c = 4$.
Columns: $[\alpha, \beta]$ | Its | MPI 16 × 4: CINT-SI, DD-FP | MPI 32 × 4: CINT-SI, DD-FP | MPI 64 × 4: CINT-SI, DD-FP
Rows (exterior eigvls): $[\lambda_1, \lambda_{20}]$, $[\lambda_1, \lambda_{50}]$, $[\lambda_1, \lambda_{100}]$
Rows (interior eigvls): $[\lambda_{201}, \lambda_{220}]$, $[\lambda_{201}, \lambda_{250}]$, $[\lambda_{201}, \lambda_{300}]$
(The numerical entries of the table are not preserved in this transcript.)
Number of right-hand sides used: number of eigvls + 20 (exterior); 2 × number of eigvls (interior).
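33 Efficiency of DD-FP
[Figure: efficiency of DD-FP (from previous table); ratio versus number of MPI processes, one curve per interval: $[\lambda_1, \lambda_{50}]$, $[\lambda_{201}, \lambda_{250}]$, $[\lambda_1, \lambda_{100}]$, and a fourth interval whose indices are not legible]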
35 Conclusion
In this talk
- We presented a DD-based form of contour integration
- DD can be used to: solve the linear systems in numerical integration; derive DD-based contour integration methods
- The Schur complement holds a central role
Considerations
- Higher moments of the Schur complement integral
- Block Krylov subspaces
- Preconditioning of indefinite linear systems
EIGIFP: A MATLAB Program for Solving Large Symmetric Generalized Eigenvalue Problems JAMES H. MONEY and QIANG YE UNIVERSITY OF KENTUCKY eigifp is a MATLAB program for computing a few extreme eigenvalues
More informationMultilevel spectral coarse space methods in FreeFem++ on parallel architectures
Multilevel spectral coarse space methods in FreeFem++ on parallel architectures Pierre Jolivet Laboratoire Jacques-Louis Lions Laboratoire Jean Kuntzmann DD 21, Rennes. June 29, 2012 In collaboration with
More informationParallel sparse linear solvers and applications in CFD
Parallel sparse linear solvers and applications in CFD Jocelyne Erhel Joint work with Désiré Nuentsa Wakam () and Baptiste Poirriez () SAGE team, Inria Rennes, France journée Calcul Intensif Distribué
More informationThe Nullspace free eigenvalue problem and the inexact Shift and invert Lanczos method. V. Simoncini. Dipartimento di Matematica, Università di Bologna
The Nullspace free eigenvalue problem and the inexact Shift and invert Lanczos method V. Simoncini Dipartimento di Matematica, Università di Bologna and CIRSA, Ravenna, Italy valeria@dm.unibo.it 1 The
More informationMatrices, Moments and Quadrature, cont d
Jim Lambers CME 335 Spring Quarter 2010-11 Lecture 4 Notes Matrices, Moments and Quadrature, cont d Estimation of the Regularization Parameter Consider the least squares problem of finding x such that
More informationSome thoughts about energy efficient application execution on NEC LX Series compute clusters
Some thoughts about energy efficient application execution on NEC LX Series compute clusters G. Wellein, G. Hager, J. Treibig, M. Wittmann Erlangen Regional Computing Center & Department of Computer Science
More informationMultipréconditionnement adaptatif pour les méthodes de décomposition de domaine. Nicole Spillane (CNRS, CMAP, École Polytechnique)
Multipréconditionnement adaptatif pour les méthodes de décomposition de domaine Nicole Spillane (CNRS, CMAP, École Polytechnique) C. Bovet (ONERA), P. Gosselet (ENS Cachan), A. Parret Fréaud (SafranTech),
More informationSpectrum Slicing for Sparse Hermitian Definite Matrices Based on Zolotarev s Functions
Spectrum Slicing for Sparse Hermitian Definite Matrices Based on Zolotarev s Functions Yingzhou Li, Haizhao Yang, ICME, Stanford University Department of Mathematics, Duke University January 30, 017 Abstract
More informationStatic-scheduling and hybrid-programming in SuperLU DIST on multicore cluster systems
Static-scheduling and hybrid-programming in SuperLU DIST on multicore cluster systems Ichitaro Yamazaki University of Tennessee, Knoxville Xiaoye Sherry Li Lawrence Berkeley National Laboratory MS49: Sparse
More informationAn Efficient FETI Implementation on Distributed Shared Memory Machines with Independent Numbers of Subdomains and Processors
Contemporary Mathematics Volume 218, 1998 B 0-8218-0988-1-03024-7 An Efficient FETI Implementation on Distributed Shared Memory Machines with Independent Numbers of Subdomains and Processors Michel Lesoinne
More informationThe Deflation Accelerated Schwarz Method for CFD
The Deflation Accelerated Schwarz Method for CFD J. Verkaik 1, C. Vuik 2,, B.D. Paarhuis 1, and A. Twerda 1 1 TNO Science and Industry, Stieltjesweg 1, P.O. Box 155, 2600 AD Delft, The Netherlands 2 Delft
More informationInexact inverse iteration with preconditioning
Department of Mathematical Sciences Computational Methods with Applications Harrachov, Czech Republic 24th August 2007 (joint work with M. Robbé and M. Sadkane (Brest)) 1 Introduction 2 Preconditioned
More informationBoundary Value Problems - Solving 3-D Finite-Difference problems Jacob White
Introduction to Simulation - Lecture 2 Boundary Value Problems - Solving 3-D Finite-Difference problems Jacob White Thanks to Deepak Ramaswamy, Michal Rewienski, and Karen Veroy Outline Reminder about
More informationDirect Self-Consistent Field Computations on GPU Clusters
Direct Self-Consistent Field Computations on GPU Clusters Guochun Shi, Volodymyr Kindratenko National Center for Supercomputing Applications University of Illinois at UrbanaChampaign Ivan Ufimtsev, Todd
More informationMatrix balancing and robust Monte Carlo algorithm for evaluating dominant eigenpair
Computer Science Journal of Moldova, vol.18, no.3(54), 2010 Matrix balancing and robust Monte Carlo algorithm for evaluating dominant eigenpair Behrouz Fathi Vajargah Farshid Mehrdoust Abstract Matrix
More informationA parallel exponential integrator for large-scale discretizations of advection-diffusion models
A parallel exponential integrator for large-scale discretizations of advection-diffusion models L. Bergamaschi 1, M. Caliari 2, A. Martínez 3, and M. Vianello 3 1 Department of Mathematical Methods and
More informationWeather Research and Forecasting (WRF) Performance Benchmark and Profiling. July 2012
Weather Research and Forecasting (WRF) Performance Benchmark and Profiling July 2012 Note The following research was performed under the HPC Advisory Council activities Participating vendors: Intel, Dell,
More informationECS231 Handout Subspace projection methods for Solving Large-Scale Eigenvalue Problems. Part I: Review of basic theory of eigenvalue problems
ECS231 Handout Subspace projection methods for Solving Large-Scale Eigenvalue Problems Part I: Review of basic theory of eigenvalue problems 1. Let A C n n. (a) A scalar λ is an eigenvalue of an n n A
More informationPolynomial filtering for interior eigenvalue problems Yousef Saad Department of Computer Science and Engineering University of Minnesota
Polynomial filtering for interior eigenvalue problems Yousef Saad Department of Computer Science and Engineering University of Minnesota SIAM CSE Boston - March 1, 2013 First: Joint work with: Haw-ren
More informationIterative Methods for Solving A x = b
Iterative Methods for Solving A x = b A good (free) online source for iterative methods for solving A x = b is given in the description of a set of iterative solvers called templates found at netlib: http
More informationA Parallel Additive Schwarz Preconditioned Jacobi-Davidson Algorithm for Polynomial Eigenvalue Problems in Quantum Dot Simulation
A Parallel Additive Schwarz Preconditioned Jacobi-Davidson Algorithm for Polynomial Eigenvalue Problems in Quantum Dot Simulation Feng-Nan Hwang a, Zih-Hao Wei a, Tsung-Ming Huang b, Weichung Wang c, a
More informationA High-Performance Parallel Hybrid Method for Large Sparse Linear Systems
Outline A High-Performance Parallel Hybrid Method for Large Sparse Linear Systems Azzam Haidar CERFACS, Toulouse joint work with Luc Giraud (N7-IRIT, France) and Layne Watson (Virginia Polytechnic Institute,
More information