Robust solution of Poisson-like problems with aggregation-based AMG


Robust solution of Poisson-like problems with aggregation-based AMG. Yvan Notay, Université Libre de Bruxelles, Service de Métrologie Nucléaire. Paris, January 26, 2015. Supported by the Belgian FNRS. http://homepages.ulb.ac.be/~ynotay

Outline (p. 2) 1. Introduction 2. Why multigrid 3. AMG & Aggregation 4. AGMG (robustness study, comparative study) 5. Parallelization of AGMG 6. Conclusions

1. Introduction (1) (p. 3) In many cases, the design of an appropriate iterative linear solver is much easier if one has at hand a library able to efficiently solve linear (sub)systems Au = b, where A corresponds to the discretization of −div(d grad(u)) + v·grad(u) + c u = f (+ B.C.), or something closely related. Thus we need a good solver for discrete Poisson-like problems. Efficiently means: robustly (stable performance), and in linear time: elapsed time × #proc / n roughly constant.

1. Introduction (2) (p. 4) Discrete Poisson-like problems. For not too large 2D problems, direct methods are OK (they solve the system in nearly linear time) and were the de facto standard for a long time. Large 3D problems are intractable for direct methods (which, in addition, scale poorly in parallel), so iterative solvers are needed. Multigrid methods are good candidates (see below). But to substitute for a direct solver we need a black box solver (you provide the matrix & rhs, I return the solution) that is robust (convergence not affected by changes in the BC, PDE coefficients, geometry & discretization grid).

2. Why multigrid (1) (p. 5) Standard iterative methods are typically very slow. Consider the error at different scales: small scale, strong local variations; large scale, smooth variations. Iterative methods mainly act locally, i.e. they damp error modes seen at the small scale, but not much the smooth modes. More steps are needed to propagate information about smooth modes as the grid is refined (linear time out of reach). Multigrid: solving the problem on a coarser grid (fewer unknowns, hence easier) yields an approximate solution which is essentially correct from the large-scale viewpoint.

2. Why multigrid (2) (p. 6) A multigrid method alternates: smoothing iterations, which improve the current solution u_k using a basic iterative method, giving ũ_k; and coarse grid correction: project the residual equation A(u − ũ_k) = b − Aũ_k ≡ r on the coarse grid: r_c = R r (R: n_c × n); solve the coarse problem: v_c = A_c^{-1} r_c; prolongate (interpolate) the coarse correction to the fine grid: u_{k+1} = ũ_k + P v_c (P: n × n_c).
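To make the alternation concrete, here is a minimal two-grid iteration in Python/NumPy. This is a sketch under simplifying assumptions, not AGMG's method: a 1D Poisson matrix, damped Jacobi as the basic iterative method, and geometric linear interpolation for P (AGMG instead builds P algebraically, by aggregation); all names are illustrative.

```python
# Minimal two-grid iteration: smoothing + coarse grid correction.
# Assumptions: 1D Poisson, damped Jacobi smoother, linear interpolation.
import numpy as np

n = 63                                    # fine grid: n interior points
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # 1D Poisson (times h^2)
b = np.ones(n) / (n + 1) ** 2

nc = (n - 1) // 2                         # coarse grid: every other point
P = np.zeros((n, nc))                     # prolongation (linear interpolation)
for j in range(nc):
    i = 2 * j + 1
    P[i - 1, j], P[i, j], P[i + 1, j] = 0.5, 1.0, 0.5
R = 0.5 * P.T                             # restriction R: nc x n
Ac = R @ A @ P                            # coarse grid matrix

def smooth(u, nu=1, omega=2.0 / 3.0):
    """nu damped-Jacobi sweeps on A u = b."""
    for _ in range(nu):
        u = u + omega * (b - A @ u) / np.diag(A)
    return u

u = np.zeros(n)
for k in range(10):
    u = smooth(u)                         # smoothing: u_k -> u~_k
    rc = R @ (b - A @ u)                  # restrict residual: r_c = R r
    vc = np.linalg.solve(Ac, rc)          # coarse solve: v_c = A_c^{-1} r_c
    u = smooth(u + P @ vc)                # prolongate: u_{k+1} = u~_k + P v_c
    print(k, np.linalg.norm(b - A @ u) / np.linalg.norm(b))
```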

2. Why multigrid (3) (p. 7) Example: −Δu = 2 exp(−10((x − 0.5)² + (y − 0.5)²)) in Ω = (0,1) × (0,1), u = 0 on ∂Ω. Uniform grid with mesh size h, five-point finite difference. [Plots: solution with h^-1 = 50 and with h^-1 = 250.]
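For reference, a short SciPy sketch of the model problem's matrix: the 5-point finite difference discretization of the Laplacian on a uniform grid of the unit square with homogeneous Dirichlet BC. The factor 10 in the right-hand-side exponent is an assumption (that number is garbled in this transcription).

```python
# 5-point finite difference Poisson matrix on the unit square.
import numpy as np
import scipy.sparse as sp

m = 50                                        # h^-1; (m-1)^2 interior unknowns
h = 1.0 / m
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(m - 1, m - 1))
I = sp.identity(m - 1)
A = (sp.kron(I, T) + sp.kron(T, I)) / h**2    # 2D 5-point Laplacian

x = np.arange(1, m) * h                       # interior grid points
X, Y = np.meshgrid(x, x, indexing='ij')
b = (2.0 * np.exp(-10.0 * ((X - 0.5)**2 + (Y - 0.5)**2))).ravel()
```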

2. Why multigrid (4) (p. 8) Simple iterative methods are not efficient. Example: symmetric Gauss-Seidel iterations (1 forward Gauss-Seidel sweep + 1 backward Gauss-Seidel sweep). [Plot: ||r||/||b|| vs. number of iterations.] ||r||/||b|| = 0.72 after 40 steps.

2. Why multigrid (5) (p. 9) [Plots: initial residual (r.h.s.); residual after solve on the coarse grid: ||r||/||b|| = 0.71; + 1 sym. Gauss-Seidel step: ||r||/||b|| = 0.39; + 8 sym. Gauss-Seidel steps: ||r||/||b|| = 0.12.]

2. Why multigrid (6) (p. 10) [Plots: initial residual; after 1 iteration of (SGS, coarse solve, SGS): ||r||/||b|| = 3.9 × 10^-3; after 2 iterations: ||r||/||b|| = 7.7 × 10^-5; after 4 iterations: ||r||/||b|| = 1.7 × 10^-7.]

3. AMG & Aggregation (1) (p. 11) Geometric multigrid is not black box, and often also not robust (e.g., influenced by the BC). Classical algebraic multigrid (AMG): attempts to imitate geometric MG in black box mode, with robustness enhancements; issues still open after 30 years of research; new issues came with massive parallelism. Aggregation-based AMG: overlooked for a long time, revival since 2007; solves issues of classical AMG in a natural way; faster and more robust (controversial).

3. AMG & Aggregation (2) (p. 12) Aggregation-based AMG. Coarse unknowns: obtained by mere aggregation. Coarse grid matrix: obtained by a simple summation, (A_c)_{ij} = Σ_{k∈G_i} Σ_{l∈G_j} a_{kl}. [Figure: a grid partitioned into aggregates G_1, G_2, G_3, G_4.] In parallel: aggregates are formed only with unknowns assigned to a same process, giving a natural parallelization.
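The summation formula is exactly a Galerkin product A_c = P^T A P with a Boolean prolongation P having one column per aggregate. A small sketch (the 2 × 2 box aggregates on a 4 × 4 grid are hypothetical, chosen to mimic the G_1, ..., G_4 picture; AGMG forms aggregates algebraically):

```python
# Unsmoothed aggregation: Boolean P, coarse matrix by summation.
import numpy as np
import scipy.sparse as sp

m = 4                                             # 4 x 4 fine grid
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(m, m))
A = sp.kron(sp.identity(m), T) + sp.kron(T, sp.identity(m))   # 2D Laplacian

# aggregate index of each fine unknown: 2 x 2 boxes
agg = np.array([(i // 2) * 2 + (j // 2) for i in range(m) for j in range(m)])
nc = agg.max() + 1
P = sp.csr_matrix((np.ones(m * m), (np.arange(m * m), agg)), shape=(m * m, nc))

Ac = (P.T @ A @ P).toarray()          # (A_c)_ij = sum over G_i, G_j of a_kl
print(Ac)
```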

3. AMG & Aggregation (3) (p. 13) How to solve the coarse problem? By recursion: apply the same two-grid scheme at the coarse level, doing only a few iterations (cost); go to a further coarse level: recursion again; and so on, until the problem size is small enough. Geometric MG & classical AMG use the V-cycle: 1 iteration at each level. Aggregation-based AMG uses the K-cycle: 2 iterations at each level with Krylov (CG) acceleration; a downside of the simplicity, but not an issue (see the sketch below).
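A runnable sketch of this recursion, contrasting the V-cycle (one recursive call per level) with the K-cycle (two calls per level). Assumptions: 1D Poisson, pairwise aggregation, damped Jacobi smoothing, and a plain double iteration standing in for the flexible Krylov (CG) acceleration that AGMG applies to the two inner steps.

```python
# V-cycle vs. K-cycle recursion on an aggregation hierarchy (sketch).
import numpy as np

def hierarchy(n, levels):
    """Aggregation hierarchy for the 1D Poisson matrix (pairwise aggregates)."""
    A = [2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)]
    P = []
    for _ in range(levels - 1):
        m = A[-1].shape[0]
        Pl = np.zeros((m, m // 2))
        for j in range(m // 2):               # aggregate {2j, 2j+1}
            Pl[2 * j, j] = Pl[2 * j + 1, j] = 1.0
        P.append(Pl)
        A.append(Pl.T @ A[-1] @ Pl)           # coarse matrix by summation
    return A, P

def cycle(A, P, l, r, inner):
    """Approximately solve A[l] e = r; inner=1: V-cycle, inner=2: K-cycle."""
    if l == len(A) - 1:
        return np.linalg.solve(A[l], r)       # bottom level solver
    e = 0.7 * r / np.diag(A[l])               # pre-smoothing (damped Jacobi from 0)
    rc = P[l].T @ (r - A[l] @ e)              # restrict residual
    ec = np.zeros(rc.size)
    for _ in range(inner):                    # 'inner' iterations at the coarse level
        ec += cycle(A, P, l + 1, rc - A[l + 1] @ ec, inner)
    e += P[l] @ ec                            # prolongate correction
    return e + 0.7 * (r - A[l] @ e) / np.diag(A[l])   # post-smoothing

A, P = hierarchy(64, 4)
b = np.ones(64)
u = np.zeros(64)
for k in range(20):                           # stationary iteration with the K-cycle
    u += cycle(A, P, 0, b - A[0] @ u, inner=2)
print(np.linalg.norm(b - A[0] @ u) / np.linalg.norm(b))
```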

3. AMG & Aggregation (4) (p. 14) Example: recursive quality-aware aggregation for the discrete Poisson linear finite element matrix associated with an unstructured mesh. [Figure: mesh and zoom.]

3. AMG & Aggregation (5) (p. 15) [Figure: aggregation, zoom at level 1; aggregates at levels 1, 2, and 3.]

3. AMG & Aggregation (6) (p. 16) Aggregation also works for higher order FE matrices. Example: 3rd order (P3), nnz(A) ≈ 16n. [Figure: fine grid and aggregates at levels 1, 2, and 3.]

3. AMG & Aggregation (7) (p. 17) Solve phase: workflow for 1 iteration using the K-cycle (4 levels). [Diagram: at level 1, pre-smoothing S^(1)_1, restriction R_1, two K-cycle iterations at level 2, prolongation P_1, post-smoothing S^(2)_1; the same pattern repeats at levels 2 and 3, so the bottom level solver B_LS is invoked 4 times per iteration.]

4. AGMG (p. 18) Iterative solution with AGgregation-based algebraic MultiGrid: a linear system solver software package. Black box. FORTRAN 90 (easy interface with C & C++). Matlab interface: >> x=agmg(a,y); >> x=agmg(a,y,1); % SPD case. Free academic license.
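For readers without access to the Matlab interface, an analogous black-box call can be illustrated with PyAMG, a different, openly available package implementing smoothed aggregation (a relative of AGMG's plain quality-aware aggregation); this is not AGMG itself:

```python
# Black-box AMG usage, illustrated with PyAMG (assumption: pip install pyamg).
import numpy as np
import pyamg

A = pyamg.gallery.poisson((500, 500), format='csr')   # 2D 5-point Poisson matrix
b = np.ones(A.shape[0])
ml = pyamg.smoothed_aggregation_solver(A)             # setup: build the hierarchy
x = ml.solve(b, tol=1e-6, accel='cg')                 # solve with CG acceleration
print(np.linalg.norm(b - A @ x) / np.linalg.norm(b))
```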

4. AGMG: Test suite (1) (p. 19) MODEL2D: 5-point & 9-point discretizations of −Δu = 1 on Ω = [0,1] × [0,1], u = 0 on ∂Ω, with Δ = ∂²/∂x² + ∂²/∂y². ANI2D: 5-point discretization of −div(D grad u) = 1 on Ω = [0,1] × [0,1], u = 0 for x = 1, 0 ≤ y ≤ 1, ∂u/∂n = 0 elsewhere on ∂Ω, with D = diag(ε, 1). Tested: ε = 10^-2, 10^-4. [Figure: 5-point stencil with off-diagonal entries −ε and −1.]

4. AGMG: Test suite (2) (p. 20) Non M-matrices. ANI2DBIFE: bilinear FE discretization of −div(D grad u) = 1 on Ω = [0,1] × [0,1], u = 0 for x = 1, 0 ≤ y ≤ 1, ∂u/∂n = 0 elsewhere on ∂Ω, with D = diag(ε, 1). Tested: ε = 10^-2, 10^2, 10^3 (i.e. some strong positive off-diagonal elements). [Figure: bilinear FE stencil.]

4. AGMG: Test suite (3) (p. 21) MODEL3D: 7-point discretization of −Δu = 1 on Ω = [0,1] × [0,1] × [0,1], u = 0 on ∂Ω, with Δ = ∂²/∂x² + ∂²/∂y² + ∂²/∂z². ANI3D: 7-point discretization of −div(D grad u) = 1 on Ω = [0,1] × [0,1] × [0,1], u = 0 for x = 1, 0 ≤ y, z ≤ 1, ∂u/∂n = 0 elsewhere on ∂Ω, with D = diag(ε_x, ε_y, 1). Tested: (0.7, 1, 1), (0.7, 0.25, 1), (0.7, 0.7, 1), (0.5, 1, 1), (0.5, 0.7, 1), (0.5, 0.5, 1).

4. AGMG: Test suite (4) (p. 22) Problems with jumps (FD). JUMP2D: 5-point discretization of −div(D grad u) = f on Ω = [0,1] × [0,1], u = 0 for x = 1, 0 ≤ y ≤ 1, ∂u/∂n = 0 elsewhere on ∂Ω, with D = diag(D_x, D_y). [Figure: square subdivided into regions with D = (1,1) except one region with (D_x, D_y).] JUMP3D: 7-point discretization of −div(D grad u) = f on Ω = [0,1] × [0,1] × [0,1], same BC; [Figure: embedded subdomains with D = 10^3, D = 1 elsewhere.]

4. AGMG: Test suite (5) (p. 23) Sphere in a cube, unstructured 3D meshes. Finite element discretization of −div(D grad u) = 0 on Ω = [0,1] × [0,1] × [0,1], u = 0 for x = 0, 1, 0 ≤ y, z ≤ 1, u = 1 elsewhere on ∂Ω. D = d inside the sphere, D = 1 outside; d = 10^-3, 1, 10^3. SPHUNF: quasi-uniform mesh. SPHRF: mesh 10 times finer on the sphere.

4. AGMG: Test suite (6) (p. 24) Re-entrant corner, local refinement. Finite element discretization of −Δu = 0 on an L-shaped domain Ω, u = r^{2/3} sin(2θ/3) on ∂Ω, with Δ = ∂²/∂x² + ∂²/∂y². LUNFST: uniform structured grid. LRFUST: unstructured grid with mesh 10^r finer near the re-entrant corner (r = 5).

4. AGMG: Test suite (7) (p. 25) Challenging convection-diffusion problems. Upwind FD approximation of −ν Δu + v·grad(u) = f in Ω, u = g on ∂Ω. In all cases: tests for ν = 1, 10^-2, 10^-4, 10^-6; ν ≪ 1 gives highly nonsymmetric matrices. [Figure: example of flow field (magnitude).]

4. AGMG: Robustness study (1) (p. 26) Iterations stopped when ||r_k||/||r_0|| < 10^-6. Times reported are total elapsed times in seconds (including setup) per 10^6 unknowns. FD on regular grids, 3 sizes: 2D: h^-1 = 600, 1600, 5000; 3D: h^-1 = 80, 160, 320. FE on (un)structured meshes (with different levels of local refinement), 2 sizes per problem: n ≈ 0.15 × 10^6 and n ≈ 7.1 × 10^6.

4. AGMG: Robustness study (2) (p. 27) [Bar chart: elapsed time per 10^6 unknowns for the 2D symmetric problems, regular and unstructured grids.]

4. AGMG: Robustness study (3) (p. 28) [Bar chart: elapsed time per 10^6 unknowns for the 3D symmetric problems, regular and unstructured grids.]

4. AGMG: Robustness study (4) (p. 29) [Bar chart: elapsed time per 10^6 unknowns for the 2D nonsymmetric problems.]

4. AGMG: Robustness study (5) (p. 30) [Bar chart: elapsed time per 10^6 unknowns for the 3D nonsymmetric problems.]

4. AGMG: comparative study (1) (p. 31) Comparison with some other software. AMG(Hyp): a classical AMG method as implemented in the hypre library (BoomerAMG). AMG(HSL): a classical AMG method as implemented in the HSL library. ILUPACK: efficient threshold-based ILU preconditioner. Matlab \: Matlab sparse direct solver (UMFPACK). All methods but the last with Krylov subspace acceleration (iterations stopped when ||r_k||/||r_0|| < 10^-6). Quantity reported: total elapsed time in seconds (including setup) per 10^6 unknowns, as a function of the number of unknowns (more unknowns obtained by grid refinement).

4. AGMG: comparative study (2) (p. 32) [Plots: time per 10^6 unknowns vs. number of unknowns (10^4 to 10^8) for AGMG, AMG(Hyp), AMG(HSL), ILUPACK, and Matlab \. Left: POISSON 2D, FD. Right: LAPLACE 2D, FE(P3), where 33% of the nonzero off-diagonal entries are > 0.]

4. AGMG: comparative study (3) (p. 33) [Plots: left: Poisson 2D, L-shaped, FE, unstructured with local refinement; AMG(HSL): coarsening failure in all cases. Right: convection-diffusion 2D, FD, ν = 10^-6.]

4. AGMG: comparative study (4) (p. 34) [Plots: left: POISSON 3D, FD. Right: LAPLACE 3D, FE(P3), where 51% of the nonzero off-diagonal entries are > 0.]

4. AGMG: comparative study (5) (p. 35) [Plots: left: Poisson 3D, FE, unstructured with local refinement. Right: convection-diffusion 3D, FD, ν = 10^-6.]

5. Parallelization of AGMG (1) (p. 36) Perspectives. It is good to start from the best sequential method, and any scalability curve should be put in perspective: how much do we lose with respect to the best state-of-the-art method on 1 core? The faster the method, the more challenging its parallelization: less computation means less opportunity to overlap communications with computation.

5. Parallelization of AGMG (2) (p. 37) General parallelization strategy. Partitioning of the unknowns induces a partitioning of the matrix rows. Aggregation algorithm: unchanged, except that aggregates are only formed with unknowns in a same partition; hence inherently parallel (see the sketch below). Solve phase: the parallelization raises no particular difficulty, except regarding the bottom level solver. In sequential AGMG this is a sparse direct solver; a parallel direct solver becomes a bottleneck for many processes.
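A simplified sketch of partition-restricted pairwise aggregation: greedy matching along the strongest negative coupling, never pairing unknowns across partitions, so each process can aggregate its own rows independently. This is an illustration only; AGMG's actual pairwise algorithm adds quality criteria and a second pass.

```python
# Greedy pairwise aggregation that respects a partitioning of the unknowns.
import numpy as np
import scipy.sparse as sp

def pairwise_aggregate(A, part):
    """Return (aggregate index per unknown, number of aggregates)."""
    A = sp.csr_matrix(A)
    n = A.shape[0]
    agg = -np.ones(n, dtype=int)
    na = 0
    for i in range(n):
        if agg[i] >= 0:
            continue
        best, best_a = -1, 0.0
        for j, a in zip(A.indices[A.indptr[i]:A.indptr[i + 1]],
                        A.data[A.indptr[i]:A.indptr[i + 1]]):
            # strongest unaggregated neighbor within the same partition
            if j != i and agg[j] < 0 and part[j] == part[i] and a < best_a:
                best, best_a = j, a
        agg[i] = na
        if best >= 0:
            agg[best] = na                # pair i with its best neighbor
        na += 1
    return agg, na

# toy example: 1D Laplacian, unknowns split between two "processes"
n = 8
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
part = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(pairwise_aggregate(A, part))
```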

5. Parallelization of AGMG (2) (p. 38) Algorithm redesign. Only four levels, whatever the problem size. Then the bottom level system has about 500 times fewer unknowns than the initial fine grid system, i.e. it is still very large. Thus: an iterative bottom level solver. With 500 times fewer unknowns, it need not be as fast per unknown as AGMG, but it has to scale very well in parallel (despite the smaller problem size).

5. Parallelization of AGMG (3) (p. 39) Iterative bottom level solver: an aggregation-based two-grid method (one further level: the very coarse grid). All unknowns on a same process form 1 aggregate (very coarse grid size = number of processes (cores)). Better smoother: apply sequential AGMG to the local part of the matrix. The very coarse grid system, if still too large, is solved in parallel within subgroups of processes; the solver is AGMG again (either sequential or parallel).

5. Parallelization of AGMG (4) (p. 40) [Diagram: at the bottom level, B_LS expands into smoothing steps S_b around a very coarse solve B_b, with restriction R_b and prolongation P_b. S_b: sequential AGMG applied to the local part of the matrix. B_b: sequential AGMG (512 cores or less) or parallel AGMG in subgroups (more than 512 cores).]

5. Parallelization of AGMG (5) (p. 41) Results: the magic works. Weak scalability on CURIE (Intel farm) for 3D Poisson. [Plots: elapsed time in seconds (setup, solve, total) vs. number of unknowns (up to ~10^11), from 16 up to 32768 cores. Left: finite difference. Right: P3 finite elements.]

5. Parallelization of AGMG (6) (p. 42) 3D Poisson (finite difference) on HERMIT (Cray XE6). [Plots: weak scalability: time vs. number of unknowns for several per-core problem sizes, from 64 up to 32768 cores; strong scalability: total time vs. number of cores (1 to 1024) against ideal speed-up.]

5. Parallelization of AGMG (7) (p. 43) Weak scalability on JUQUEEN (IBM BG/Q) for 3D Poisson (finite difference). [Plot: elapsed time (setup, solve, total) vs. number of unknowns (10^8 to 10^12), from 16 up to 373248 cores.]

6. Conclusions (p. 44) Robust method for discrete Poisson-like problems. Can be used (and is used!) as a black box (does not require tuning or adaptation). Faster than other state-of-the-art solvers. Fairly small setup time: especially well suited when only modest accuracy is needed. Efficient parallelization: a linear system with more than 10^12 unknowns was solved in less than 2 minutes using 373,248 cores on IBM BG/Q, that is, in about 0.1 nanoseconds per unknown. Professional code available, free academic license.

Some references (p. 45) !! Thank you !!
Two-grid convergence theory: Algebraic analysis of two-grid methods: the nonsymmetric case, NLAA (2010); Algebraic theory of two-grid methods, NTMA (to appear, review paper).
The K-cycle: Recursive Krylov-based multigrid cycles (with P. S. Vassilevski), NLAA (2008).
Two-grid analysis of aggregation methods: Analysis of aggregation-based multigrid (with A. C. Muresan), SISC (2008); Algebraic analysis of aggregation-based multigrid (with A. Napov), NLAA (2011).
AGMG and quality-aware aggregation: An aggregation-based algebraic multigrid method, ETNA (2010); An algebraic multigrid method with guaranteed convergence rate (with A. Napov), SISC (2012); Aggregation-based algebraic multigrid for convection-diffusion equations, SISC (2012); Algebraic multigrid for moderate order finite elements (with A. Napov), SISC (2014).
Parallelization: A massively parallel solver for discrete Poisson-like problems, Tech. Rep. (2014).