An efficient multigrid solver based on aggregation Yvan Notay Université Libre de Bruxelles Service de Métrologie Nucléaire Graz, July 4, 2012 Co-worker: Artem Napov Supported by the Belgian FNRS http://homepages.ulb.ac.be/ ynotay An efficient multigrid solver based on aggregation p.1
Outline 1. Introduction 2. Convergence analysis 3. Aggregation procedure Repeated pairwise aggregation 4. Parallelization 5. Numerical results 6. Conclusions An efficient multigrid solver based on aggregation p.2
1. Introduction Developing a biomedical model: complex system of PDEs An efficient multigrid solver based on aggregation p.3
1. Introduction Developing a biomedical model: complex system of PDEs (Discretization) Huge (nonlinear, time dependent) algebraic problem An efficient multigrid solver based on aggregation p.3
1. Introduction Developing a biomedical model: complex system of PDEs (Discretization) Huge (nonlinear, time dependent) algebraic problem (Linearization, (semi-)implicit time discretization) (many successive) Large linear systems An efficient multigrid solver based on aggregation p.3
1. Introduction Developing a biomedical model: complex system of PDEs (Discretization) Huge (nonlinear, time dependent) algebraic problem (Linearization, (semi-)implicit time discretization) (many successive) Large linear systems Hope of the developer model: use standard library for this last step: Au = b u = A 1 b An efficient multigrid solver based on aggregation p.3
1. Introduction Developing a biomedical model: complex system of PDEs (Discretization) Huge (nonlinear, time dependent) algebraic problem (Linearization, (semi-)implicit time discretization) (many successive) Large linear systems Hope of the developer model: use standard library for this last step: Au = b u = A 1 b Problems: standard library for 3D problems with n > 10 6 : Direct methods no more feasible Iterative methods are application dependent Developing the linear solver: requires specific skills An efficient multigrid solver based on aggregation p.3
1. Introduction In many cases, the design of an appropriate iterative linear solver is much easier if one has at hand a library able to efficiently solve linear (sub)systems Au = b where A corresponds to the discretization of div(d grad(u))+v grad(u)+cu = f (+B.C.) (or closely related). Example: Pennacchio & Simoncini (SISC, 2011) about the bidomain model in electrocardiology Efficiently: robustly (stable performances) elapsed in linear time: n #proc roughly constant An efficient multigrid solver based on aggregation p.4
1. Introduction Why multigrid? Standard iterative methods typically very slow An efficient multigrid solver based on aggregation p.5
1. Introduction Why multigrid? Standard iterative methods typically very slow Consider the error from different scales: small scale strong local variations large scale smooth variations Iterative methods mainly act locally, i.e. damp error modes seen from small scale, but not much smooth modes More steps needed to propagate information about smooth modes as the grid is refined (linear time out of reach) An efficient multigrid solver based on aggregation p.5
1. Introduction Why multigrid? Standard iterative methods typically very slow Consider the error from different scales: small scale strong local variations large scale smooth variations Iterative methods mainly act locally, i.e. damp error modes seen from small scale, but not much smooth modes More steps needed to propagate information about smooth modes as the grid is refined (linear time out of reach) Multigrid: solving the problem on a coarser grid (less unknowns easier) yields an approximate solution which is essentially correct from the large scale viewpoint An efficient multigrid solver based on aggregation p.5
1. Introduction A multigrid method alternates: smoothing iterations: improve current solution u k using a basic iterative method ũ k coarse grid correction: project the residual equation A(u ũ k ) = b Aũ k r on the coarse grid: r c = Rr (R : n c n) solve the coarse problem: v c = A 1 c r c prolongate (interpolate) coarse correction on the fine grid: u k+1 = ũ k +P v c (P : n n c ) An efficient multigrid solver based on aggregation p.6
1. Introduction A multigrid method alternates: smoothing iterations: improve current solution u k using a basic iterative method ũ k coarse grid correction: project the residual equation A(u ũ k ) = b Aũ k r on the coarse grid: r c = Rr (R : n c n) solve the coarse problem: v c = A 1 c r c prolongate (interpolate) coarse correction on the fine grid: u k+1 = ũ k +P v c (P : n n c ) This principle is used recursively: A 1 c r c is replaced by the approximation from 1 or 2 steps of a multigrid method to solve the coarse problem An efficient multigrid solver based on aggregation p.6 coarser grid, and so on
1. Introduction Why algebraic multigrid (AMG)? Classical (geometric) multigrid needs an hierarchy of grids and close interaction between the iterative solver and the discretization process AMG attempts to obtain the same effect using only the information present in the system matrix A Coarse grid correction step: v = P A 1 c Rr P : prolongation (n n c ) R : restriction ( n c n) A c : coarse grid matrix An efficient multigrid solver based on aggregation p.7
1. Introduction Why algebraic multigrid (AMG)? Classical (geometric) multigrid needs an hierarchy of grids and close interaction between the iterative solver and the discretization process AMG attempts to obtain the same effect using only the information present in the system matrix A Coarse grid correction step: v = P A 1 c Rr P : prolongation (n n c ) R : restriction ( n c n) R = P T A c : coarse grid matrix An efficient multigrid solver based on aggregation p.7
1. Introduction Why algebraic multigrid (AMG)? Classical (geometric) multigrid needs an hierarchy of grids and close interaction between the iterative solver and the discretization process AMG attempts to obtain the same effect using only the information present in the system matrix A Coarse grid correction step: v = P A 1 c Rr P : prolongation (n n c ) R : restriction ( n c n) R = P T A c : coarse grid matrix A c = RAP = P T AP An efficient multigrid solver based on aggregation p.7
1. Introduction Why algebraic multigrid (AMG)? Classical (geometric) multigrid needs an hierarchy of grids and close interaction between the iterative solver and the discretization process AMG attempts to obtain the same effect using only the information present in the system matrix A Coarse grid correction step: v = P A 1 c Rr P : prolongation (n n c ) R : restriction ( n c n) R = P T A c : coarse grid matrix A c = RAP = P T AP Nothing else needed besides P Find a way to obtain P from A An efficient multigrid solver based on aggregation p.7
1. Intro: Aggregation-based AMG Group nodes into aggregates G i (partitioning of [1, n]) Each set corresponds to 1 coarse variable (and vice-versa) G 3 G 4 G 1 G 2 An efficient multigrid solver based on aggregation p.8
1. Intro: Aggregation-based AMG Prolongation P : P ij = Example { 1 if i G j 0 otherwise 3 u c = 1 2 3 G 1 4 1 3 3 3 1 2 2 2 4 G 3 G 4 4 4 4 4 1 2 G 2 An efficient multigrid solver based on aggregation p.9
1. Intro: Aggregation-based AMG Coarse grid matrix: A c = P T AP given by (A c ) ij = k G i l G j a kl G 3 G 4 G 1 G 2 Tends to reproduce the stencil from the fine grid An efficient multigrid solver based on aggregation p.10
1. Intro: Aggregation-based AMG Coarse grid matrix: A c = P T AP given by (A c ) ij = k G i l G j a kl G 3 G 4 G 1 G 2 Tends to reproduce the stencil from the fine grid Recursive use raises no difficulties Low setup cost & memory requirements An efficient multigrid solver based on aggregation p.10
1. Intro: Aggregation-based AMG Does not mimic any classical multigrid method Not efficient if the piecewise constant P just substitutes the classical prolongation in a standard multigrid scheme has been overlooked for a long time An efficient multigrid solver based on aggregation p.11
1. Intro: Aggregation-based AMG Does not mimic any classical multigrid method Not efficient if the piecewise constant P just substitutes the classical prolongation in a standard multigrid scheme has been overlooked for a long time Recent revival: Proper convergence theory (mimicry not essential for a good interplay with the smoother) Efficient when combined with specific components: cheap smoother & K-cycle (solution of coarse problems accelerated with Krylov subspace methods) Theory and efficient solver developed hand in hand An efficient multigrid solver based on aggregation p.11
1. Intro: Aggregation-based AMG The algebraic convergence theory: yields meaningful bounds (actual convergence as proved often OK) An efficient multigrid solver based on aggregation p.12
1. Intro: Aggregation-based AMG The algebraic convergence theory: yields meaningful bounds (actual convergence as proved often OK) is compatible with irregular geometries, unstructured grids, jumps in coefficients, etc (guarantees essentially the same bound) An efficient multigrid solver based on aggregation p.12
1. Intro: Aggregation-based AMG The algebraic convergence theory: yields meaningful bounds (actual convergence as proved often OK) is compatible with irregular geometries, unstructured grids, jumps in coefficients, etc (guarantees essentially the same bound) requires M-matrix, but has natural heuristic extensions An efficient multigrid solver based on aggregation p.12
1. Intro: Aggregation-based AMG The algebraic convergence theory: yields meaningful bounds (actual convergence as proved often OK) is compatible with irregular geometries, unstructured grids, jumps in coefficients, etc (guarantees essentially the same bound) requires M-matrix, but has natural heuristic extensions whenever applicable, holds at every level of the hierarchy An efficient multigrid solver based on aggregation p.12
1. Intro: Aggregation-based AMG The algebraic convergence theory: yields meaningful bounds (actual convergence as proved often OK) is compatible with irregular geometries, unstructured grids, jumps in coefficients, etc (guarantees essentially the same bound) requires M-matrix, but has natural heuristic extensions whenever applicable, holds at every level of the hierarchy covers symmetric and nonsymmetric problems in a uniform fashion An efficient multigrid solver based on aggregation p.12
1. Intro: Aggregation-based AMG The algebraic convergence theory: yields meaningful bounds (actual convergence as proved often OK) is compatible with irregular geometries, unstructured grids, jumps in coefficients, etc (guarantees essentially the same bound) requires M-matrix, but has natural heuristic extensions whenever applicable, holds at every level of the hierarchy covers symmetric and nonsymmetric problems in a uniform fashion The aggregation algorithm we use is entirely based on the theory and its heuristic extensions An efficient multigrid solver based on aggregation p.12
1. Introduction 2. Convergence analysis 3. Aggregation procedure Repeated pairwise aggregation 4. Parallelization 5. Numerical results 6. Conclusions An efficient multigrid solver based on aggregation p.13
2. Convergence analysis Method used as a preconditioner for CG or GCR Fast convergence if the eigenvalues λ i of the preconditioned matrix are: bounded substantially away from 0 Using a standard smoother (e.g., Gauss-Seidel), the eigenvalues are bounded independently of P If P = 0 the eigenvalues associated with smooth modes are in general very small Main difficulty: λ i substantially away from 0 Role of the coarse grid correction: move the small eigenvalues enough to the right (Guideline for the choice of P ) An efficient multigrid solver based on aggregation p.14
2. Convergence analysis: λ i away from 0 SPD case Main identity [Falgout, Vassilevski & Zikatanov (2005)]: 1 λ min = κ(a, P) with κ(a,p) = ω 1 v T D ( I P(P T DP) 1 P T D ) v sup v 0 v T Av An efficient multigrid solver based on aggregation p.15
2. Convergence analysis: λ i away from 0 SPD case Main identity [Falgout, Vassilevski & Zikatanov (2005)]: 1 λ min = κ(a, P) with κ(a,p) = ω 1 v T D ( I P(P T DP) 1 P T D ) v sup v 0 v T Av General case [YN (2010)] For any λ i : Re(λ i ) with A S = 1 2 (A+AT ) 1 κ(a S, P) The analysis of the SPD case can be sufficient An efficient multigrid solver based on aggregation p.15
2. Convergence analysis: λ i away from 0 κ(a S,P) = ω 1 v T D ( I P(P T DP) 1 P T D ) v sup v 0 v T A S v Aggregation-based methods P = 1 n (1)..., D = diag(a) = D 1... 1 n (nc) D ( I P(P T DP) 1 P T D ) D nc = blockdiag ( D i ( I 1n (i)(1 T n (i) D i 1 n (i)) 1 1 T n (i) D i )) An efficient multigrid solver based on aggregation p.16
2. Convergence analysis: λ i away from 0 κ(a S,P) = ω 1 v T D ( I P(P T DP) 1 P T D ) v sup v 0 v T A S v Aggregation-based methods P = 1 n (1)..., D = diag(a) = D 1... 1 n (nc) D ( I P(P T DP) 1 P T D ) D nc = blockdiag ( D i ( I 1n (i)(1 T n (i) D i 1 n (i)) 1 1 T n (i) D i )) find A b, A r nonnegative definite s.t. A S = A b +A r with A b = A (S) G 1... A G (S) nc An efficient multigrid solver based on aggregation p.16
2. Convergence analysis: λ i away from 0 κ(a S,P) ω 1 v T D ( I P(P T DP) 1 P T D ) v sup v 0 v T A b v Aggregation-based methods P = 1 n (1)..., D = diag(a) = D 1... 1 n (nc) D ( I P(P T DP) 1 P T D ) D nc = blockdiag ( D i ( I 1n (i)(1 T n (i) D i 1 n (i)) 1 1 T n (i) D i )) find A b, A r nonnegative definite s.t. A S = A b +A r with A b = A (S) G 1... A G (S) nc An efficient multigrid solver based on aggregation p.16
2. Convergence analysis: λ i away from 0 Aggregate Quality µ G = ω 1 sup vt D G (I 1 G (1T G D G1 G ) 1 1 T G D G)v v/ N(A (S) G ) v T A (S) G v, Then: κ(a S, P) max i µ Gi Controlling µ Gi ensures that eigenvalues are away from 0 An efficient multigrid solver based on aggregation p.17
2. Convergence analysis: λ i away from 0 Aggregate Quality µ G = ω 1 sup vt D G (I 1 G (1T G D G1 G ) 1 1 T G D G)v v/ N(A (S) G ) v T A (S) G v, Then: κ(a S, P) max i µ Gi Controlling µ Gi ensures that eigenvalues are away from 0 A (S) G : Computed from A S = A b +A r with A r 1 = 0 Rigorous for M-matrices s.t. A S 1 0 (then A b, A r guaranteed nonnegative definite) Heuristic in other cases (A r could have negative eigenvalue(s)) An efficient multigrid solver based on aggregation p.17
1. Introduction 2. Convergence analysis 3. Aggregation procedure Repeated pairwise aggregation 4. Parallelization 5. Numerical results 6. Conclusions An efficient multigrid solver based on aggregation p.18
3. Aggregation procedure κ(a S, P) max i µ Gi A posteriori control of given aggregation scheme: limited utility (often a few aggregates with large µ G ) An efficient multigrid solver based on aggregation p.19
3. Aggregation procedure κ(a S, P) max i µ Gi A posteriori control of given aggregation scheme: limited utility (often a few aggregates with large µ G ) Aggregation algorithm based on the control of µ Gi An efficient multigrid solver based on aggregation p.19
3. Aggregation procedure κ(a S, P) max i µ Gi A posteriori control of given aggregation scheme: limited utility (often a few aggregates with large µ G ) Aggregation algorithm based on the control of µ Gi Problem: repeated assessment of µ G is costly An efficient multigrid solver based on aggregation p.19
3. Aggregation procedure κ(a S, P) max i µ Gi A posteriori control of given aggregation scheme: limited utility (often a few aggregates with large µ G ) Aggregation algorithm based on the control of µ Gi Problem: repeated assessment of µ G is costly For a pair {i,j}, µ {i,j} is a simple function of the local entries & the row and column sum An efficient multigrid solver based on aggregation p.19
3. Aggregation procedure κ(a S, P) max i µ Gi A posteriori control of given aggregation scheme: limited utility (often a few aggregates with large µ G ) Aggregation algorithm based on the control of µ Gi Problem: repeated assessment of µ G is costly For a pair {i,j}, µ {i,j} is a simple function of the local entries & the row and column sum ( 1 T G D G 1 G ) 11 T G D G )z µ G = ω 1 sup zt D G (I 1 G (S) z/ N(A G ) z T A (S) G z It is always cheap to check that µ G < κ TG holds: Z G = κ TG A (S) G ω 1 D G (I 1 G ( 1 T G D G 1 G ) 1 1 T G D G) is nonnegative definite if no negative pivot occurs while performing an LDL T factorization An efficient multigrid solver based on aggregation p.19
3. Pairwise aggregation Input: threshold κ TG Output: n c and aggregates G i, i = 1...,n c Initialization: U = [1, n]\g 0, n c = 0 Algorithm: While U do 1. Select i U ; n c = n c +1 2. Select j U such that µ {i,j} is minimal 3. If µ {i,j} < κ TG then G nc = {i,j} else G nc = {i} 4. U = U\G nc An efficient multigrid solver based on aggregation p.20
3. Pairwise aggregation Input: threshold κ TG Output: n c and aggregates G i, i = 1...,n c Initialization: U = [1, n]\g 0, n c = 0 Algorithm: While U do 1. Select i U ; n c = n c +1 2. Select j U such that µ {i,j} is minimal 3. If µ {i,j} < κ TG then G nc = {i,j} else G nc = {i} 4. U = U\G nc An efficient multigrid solver based on aggregation p.20
3. Pairwise aggregation Input: threshold κ TG Output: n c and aggregates G i, i = 1...,n c Initialization: U = [1, n]\g 0, n c = 0 Algorithm: While U do 1. Select i U ; n c = n c +1 2. Select j U such that µ {i,j} is minimal 3. If µ {i,j} < κ TG then G nc = {i,j} else G nc = {i} 4. U = U\G nc An efficient multigrid solver based on aggregation p.20
3. Pairwise aggregation Input: threshold κ TG Output: n c and aggregates G i, i = 1...,n c Initialization: U = [1, n]\g 0, n c = 0 Algorithm: While U do 1. Select i U ; n c = n c +1 2. Select j U such that µ {i,j} is minimal 3. If µ {i,j} < κ TG then G nc = {i,j} else G nc = {i} 4. U = U\G nc An efficient multigrid solver based on aggregation p.20
3. Pairwise aggregation Input: threshold κ TG Output: n c and aggregates G i, i = 1...,n c Initialization: U = [1, n]\g 0, n c = 0 Algorithm: While U do 1. Select i U ; n c = n c +1 2. Select j U such that µ {i,j} is minimal 3. If µ {i,j} < κ TG then G nc = {i,j} else G nc = {i} 4. U = U\G nc An efficient multigrid solver based on aggregation p.20
3. Pairwise aggregation Input: threshold κ TG Output: n c and aggregates G i, i = 1...,n c Initialization: U = [1, n]\g 0, n c = 0 Algorithm: While U do 1. Select i U ; n c = n c +1 2. Select j U such that µ {i,j} is minimal 3. If µ {i,j} < κ TG then G nc = {i,j} else G nc = {i} 4. U = U\G nc An efficient multigrid solver based on aggregation p.20
3. Pairwise aggregation Input: threshold κ TG Output: n c and aggregates G i, i = 1...,n c Initialization: U = [1, n]\g 0, n c = 0 Algorithm: While U do 1. Select i U ; n c = n c +1 2. Select j U such that µ {i,j} is minimal 3. If µ {i,j} < κ TG then G nc = {i,j} else G nc = {i} 4. U = U\G nc An efficient multigrid solver based on aggregation p.20
3. Repeated Pairwise aggregation s = 1 ; A (s) = A Apply pairwise aggregation to A (s) Form aggregated matrix A (s+1) s s+1 no nnz(a (s+1) ) < nnz(a) τ or s==n pass? yes A c = A (s+1) An efficient multigrid solver based on aggregation p.20
3. Repeated Pairwise aggregation s = 1 ; A (s) = A Apply pairwise aggregation to A (s) Form aggregated matrix A (s+1) s s+1 no nnz(a (s+1) ) < nnz(a) τ or s==n pass? yes A c = A (s+1) An efficient multigrid solver based on aggregation p.20
3. Repeated Pairwise aggregation s = 1 ; A (s) = A Apply pairwise aggregation to A (s) Form aggregated matrix A (s+1) s s+1 no nnz(a (s+1) ) < nnz(a) τ or s==n pass? yes A c = A (s+1) An efficient multigrid solver based on aggregation p.20
3. Repeated Pairwise aggregation s = 1 ; A (s) = A Apply pairwise aggregation to A (s) Check µ G < κ TG in A Form aggregated matrix A (s+1) s s+1 no nnz(a (s+1) ) < nnz(a) τ or s==n pass? yes A c = A (s+1) An efficient multigrid solver based on aggregation p.20
3. Repeated Pairwise aggregation s = 1 ; A (s) = A Apply pairwise aggregation to A (s) Check µ G < κ TG in A Form aggregated matrix A (s+1) s s+1 no nnz(a (s+1) ) < nnz(a) τ or s==n pass? yes A c = A (s+1) An efficient multigrid solver based on aggregation p.20
3. Aggregation procedure: Illustration Upwind FD approximation of ν u + v grad(u) = f in Ω = unit square with u = g on Ω, v(x, y) = x(1 x)(2y 1) : (2x 1)y(1 y) 0.25 0.2 0.15 0.1 0.05 0 1 0.8 0.6 0.4 0.2 0 0 0.2 0.4 0.6 0.8 1 Direction of the flow Magnitude An efficient multigrid solver based on aggregation p.21
3. Aggregation procedure: Illustration ν = 1 : diffusion dominating (near symmetric) Aggregation Spectrum 0.5 0 0.5 0 0.2 0.4 0.6 0.8 1 + : σ(i T) : theory : σ(ωd 1 A) (convex hull) An efficient multigrid solver based on aggregation p.22
3. Aggregation procedure: Illustration ν = 10 3 : convection dominating (strongly nonsymmetric) Aggregation Spectrum 0.5 0 0.5 0 0.2 0.4 0.6 0.8 1 + : σ(i T) : theory : σ(ωd 1 A) (convex hull) An efficient multigrid solver based on aggregation p.23
1. Introduction 2. Convergence analysis 3. Aggregation procedure Repeated pairwise aggregation 4. Parallelization 5. Numerical results 6. Conclusions An efficient multigrid solver based on aggregation p.24
4. Parallelization Partitioning of the unknowns partitioning of matrix rows An efficient multigrid solver based on aggregation p.25
4. Parallelization Partitioning of the unknowns partitioning of matrix rows We apply exactly the same aggregation algorithm except that aggregates can only contain unknowns in a same partition. Hence, one needs only to know the local matrix rows (no communication except upon forming the next coarse grid matrix) An efficient multigrid solver based on aggregation p.25
4. Parallelization Partitioning of the unknowns partitioning of matrix rows We apply exactly the same aggregation algorithm except that aggregates can only contain unknowns in a same partition. Hence, one needs only to know the local matrix rows (no communication except upon forming the next coarse grid matrix) The prolongations & restrictions are then purely local An efficient multigrid solver based on aggregation p.25
4. Parallelization Partitioning of the unknowns partitioning of matrix rows We apply exactly the same aggregation algorithm except that aggregates can only contain unknowns in a same partition. Hence, one needs only to know the local matrix rows (no communication except upon forming the next coarse grid matrix) The prolongations & restrictions are then purely local Smoother: Gauss-Seidel, ignoring connections between different partitions inherently parallel An efficient multigrid solver based on aggregation p.25
4. Parallelization Partitioning of the unknowns partitioning of matrix rows We apply exactly the same aggregation algorithm except that aggregates can only contain unknowns in a same partition. Hence, one needs only to know the local matrix rows (no communication except upon forming the next coarse grid matrix) The prolongations & restrictions are then purely local Smoother: Gauss-Seidel, ignoring connections between different partitions inherently parallel During iterations: communications only for matvec and inner product computation An efficient multigrid solver based on aggregation p.25
1. Introduction 2. Convergence analysis 3. Aggregation procedure Repeated pairwise aggregation 4. Parallelization 5. Numerical results 6. Conclusions An efficient multigrid solver based on aggregation p.26
5. Numerical results Classical AMG talk on application Description of the application (beautiful pictures) Description of the AMG strategy and needed tuning Numerical results, often not fully informative: no robustness study on a comprehensive test suite; no comparison with state of the art competitors. An efficient multigrid solver based on aggregation p.27
5. Numerical results This talk Most applications ran by people downloading the code. Some of those I am aware of: CFD, electrocardiology (in general, I don t have the beautiful pictures at hand). An efficient multigrid solver based on aggregation p.28
5. Numerical results This talk Most applications ran by people downloading the code. Some of those I am aware of: CFD, electrocardiology (in general, I don t have the beautiful pictures at hand). The code is used black box (adaptation neither sought nor needed) An efficient multigrid solver based on aggregation p.28
5. Numerical results This talk Most applications ran by people downloading the code. Some of those I am aware of: CFD, electrocardiology (in general, I don t have the beautiful pictures at hand). The code is used black box (adaptation neither sought nor needed) I think the most important is the robustness on a comprehensive test suite An efficient multigrid solver based on aggregation p.28
5. Numerical results This talk Most applications ran by people downloading the code. Some of those I am aware of: CFD, electrocardiology (in general, I don t have the beautiful pictures at hand). The code is used black box (adaptation neither sought nor needed) I think the most important is the robustness on a comprehensive test suite I like comparison with state of the art competitors An efficient multigrid solver based on aggregation p.28
5. Numerical results Iterations stopped when r k r 0 < 10 6 Times reported are total elapsed times in seconds (including set up) per 10 6 unknowns Test suite: discrete scalar elliptic PDEs SPD problems with jumps and all kind of anisotropy in the coefficients (some with reentering corner) convection-diffusion problems with viscosity from 1 10 6 and highly varying recirculating flow FD on regular grids; 3 sizes: 2D: h 1 = 600, 1600, 5000 3D: h 1 = 80, 160, 320 FE on (un)structured meshes (with different levels of local refinement); 2 sizes: n = 0.15e6 n = 7.1e6 An efficient multigrid solver based on aggregation p.29
5. Numerical results 12 2D symmetric problems 10 8 6 4 2 0 Regular Unstructured An efficient multigrid solver based on aggregation p.30
5. Numerical results 12 3D symmetric problems 10 8 6 4 2 0 Regular Unstructured An efficient multigrid solver based on aggregation p.31
5. Numerical results 12 2D nonsymmetric problems 10 8 6 4 2 0 0 5 10 15 20 25 30 An efficient multigrid solver based on aggregation p.32
5. Numerical results 12 3D nonsymmetric problems 10 8 6 4 2 0 0 5 10 15 20 25 30 An efficient multigrid solver based on aggregation p.33
5. Numerical results Comparison with other methods AMG(Hyp): classical AMG method as implemented in the Hypre library (Boomer AMG) AMG(HSL): the classical AMG method as implemented in the HSL library ILUPACK: efficient threshold-based ILU preconditioner Matlab \: Matlab sparse direct solver (UMFPACK) All methods but the last with Krylov subspace acceleration An efficient multigrid solver based on aggregation p.34
5. Numerical results 400 AGMG AMG(Hyp) 200 AMG(HSL) ILUPACK Matlab \ 100 50 POISSON 2D, FD 400 AGMG AMG(Hyp) 200 AMG(HSL) ILUPACK Matlab \ 100 50 LAPLACE 2D, FE(P3) 20 10 5 3 10 4 10 5 10 6 10 7 10 8 20 10 5 3 10 4 10 5 10 6 10 7 10 8 33% of nonzero offdiag > 0 An efficient multigrid solver based on aggregation p.35
5. Numerical results Poisson 2D, L-shaped, FE Unstructured, Local refin. Convection-Diffusion 2D, FD ν = 10 6 400 AGMG AMG(Hyp) 200 AMG(HSL): coarsening failure in all cases ILUPACK Matlab \ 100 50 20 400 200 100 50 20 AGMG AMG(Hyp) AMG(HSL) ILUPACK Matlab \ 10 10 5 3 5 3 10 5 10 6 10 7 10 4 10 5 10 6 10 7 10 8 An efficient multigrid solver based on aggregation p.36
5. Numerical results POISSON 3D, FD LAPLACE 3D, FE(P3) 400 AGMG AMG(Hyp) 200 AMG(HSL) ILUPACK Matlab \ 100 50 400 200 100 50 20 10 5 3 10 4 10 5 10 6 10 7 10 8 20 10 5 3 AGMG AMG(Hyp) AMG(HSL) ILUPACK Matlab \ 10 4 10 5 10 6 10 7 51% of nonzero offdiag > 0 An efficient multigrid solver based on aggregation p.37
5. Numerical results 50 Poisson 3D, FE Unstructured, Local refin. 400 AGMG AMG(Hyp) 200 AMG(HSL) ILUPACK Matlab \ 100 Convection-Diffusion 3D, FD ν = 10 6 400 AGMG AMG(Hyp) 200 AMG(HSL) ILUPACK Matlab \ 100 50 20 10 5 3 10 4 10 5 10 6 10 7 20 10 5 3 10 4 10 5 10 6 10 7 10 8 An efficient multigrid solver based on aggregation p.38
5. Numerical results Parallel run: with direct coarsest grid solver Cray, 32 cores/node with 1GB/node Poisson, 3D trilinear hexahedral FE #Nodes #Cores n/10 6 #Iter. Setup Time Solve Time 1 32 31 17 5.9 40.4 2 64 63 17 6.2 40.8 4 128 125 17 6.9 41.5 8 256 251 17 9.5 41.8 16 512 501 17 14.8 42.5 32 1024 1003 17 27.3 44.0 64 2048 2007 17 69.0 48.2 128 4096 4014 17 383.0 59.4 (By courtesy of Mark Walkley, Univ. of Leeds) An efficient multigrid solver based on aggregation p.39
5. Numerical results Parallel run: with (new) iterative coarsest grid solver Intel(R) Xeon(R) CPU E5649 @ 2.53GHz 3D problem with jumps, FD #Nodes #Cores n/10 6 #Iter. Setup Time Solve Time 1 8 64 12 14.9 89. 16 128 1026 16 17.4 191. 48 384 3065 14 18.0 165. 96 768 6155 13 17.4 170. An efficient multigrid solver based on aggregation p.40
6. Conclusions Robust method for scalar elliptic PDEs An efficient multigrid solver based on aggregation p.41
6. Conclusions Robust method for scalar elliptic PDEs Purely algebraic convergence theory: do not depend on FE spaces, regularity assumption; applies also to the nonsymmetric case. An efficient multigrid solver based on aggregation p.41
6. Conclusions Robust method for scalar elliptic PDEs Purely algebraic convergence theory: do not depend on FE spaces, regularity assumption; applies also to the nonsymmetric case. Can be used (and is used!) black box (does not require tuning or adaptation) An efficient multigrid solver based on aggregation p.41
6. Conclusions Robust method for scalar elliptic PDEs Purely algebraic convergence theory: do not depend on FE spaces, regularity assumption; applies also to the nonsymmetric case. Can be used (and is used!) black box (does not require tuning or adaptation) Faster than some solvers based on classical AMG An efficient multigrid solver based on aggregation p.41
6. Conclusions Robust method for scalar elliptic PDEs Purely algebraic convergence theory: do not depend on FE spaces, regularity assumption; applies also to the nonsymmetric case. Can be used (and is used!) black box (does not require tuning or adaptation) Faster than some solvers based on classical AMG Fairly small setup time: especially well suited when only a modest accuracy is needed (e.g., linear solve within Newton steps) An efficient multigrid solver based on aggregation p.41
6. Conclusions Robust method for scalar elliptic PDEs Purely algebraic convergence theory: do not depend on FE spaces, regularity assumption; applies also to the nonsymmetric case. Can be used (and is used!) black box (does not require tuning or adaptation) Faster than some solvers based on classical AMG Fairly small setup time: especially well suited when only a modest accuracy is needed (e.g., linear solve within Newton steps) Efficient parallelization An efficient multigrid solver based on aggregation p.41
6. Conclusions Robust method for scalar elliptic PDEs Purely algebraic convergence theory: do not depend on FE spaces, regularity assumption; applies also to the nonsymmetric case. Can be used (and is used!) black box (does not require tuning or adaptation) Faster than some solvers based on classical AMG Fairly small setup time: especially well suited when only a modest accuracy is needed (e.g., linear solve within Newton steps) Efficient parallelization Professional code available, free academic license An efficient multigrid solver based on aggregation p.41
References Analysis of aggregation based multigrid (with A. C. Muresan), SISC (2008). Recursive Krylov-based multigrid cycles (with P. S. Vassilevski), NLAA (2008). An aggregation-based algebraic multigrid method, ETNA (2010). Algebraic analysis of two-grid methods: the nonsymmetric case, NLAA (2010). Algebraic analysis of aggregation-based multigrid, (with A. Napov) NLAA (2011). An algebraic multigrid method with guaranteed convergence rate (with A. Napov), SISC (2012). Aggregation-based algebraic multigrid for convection-diffusion equations, SISC (2012, to appear). AGMG software: Google AGMG (http://homepages.ulb.ac.be/~ynotay/agmg) Thank you for your attention! An efficient multigrid solver based on aggregation p.42
1. Intro: Why aggregation-based AMG? Classical AMG Heuristic algorithms to mimic geometric multigrid (Connectivity set of coarse nodes; Matrix entries interpolation rules) Need to be used recursively: A c = P T AP A cc = P T c A c P c, etc Several variants and parameters; relevant choices depend on applications Main difficulty: Find a good tradeoff between accuracy and the mastering of complexity (i.e., the control of the sparsity in successive coarse grid matrices) Is a good algorithm for A also good for A c? An efficient multigrid solver based on aggregation p.43
Test suite: 2D Model Problems (FD) MODEL2D : 5-point & 9 point discretizations of { u = 1 on Ω = [0,1] [0,1] u = 0 on Ω with = 2 x 2 + 2 y 2 ANI2D : 5-point discretization of D u = 1 on Ω = [0,1] [0,1] u = 0 for x = 1, 0 y 1 = 0 elsewhere on Ω u n with = 1 x x +1 y y, D = diag(ε, 1) Tested: ε = 10 2, 10 4-1 -1-1 -1 -εσ -εσ -εσ -εσ -ε -1-1 -1-1 -εσ -εσ -εσ -εσ -ε -1-1 -1-1 -εσ -εσ -εσ -εσ -ε -1-1 -1-1 -εσ -εσ -εσ -εσ -ε -1-1 -1-1 Σ = 2(1+ε) An efficient multigrid solver based on aggregation p.44 h
Test suite: Non M-matrices ANI2DBIFE : bilinear FE element discretization of D u = 1 on Ω = [0,1] [0,1] u = 0 for x = 1, 0 y 1 u n = 0 elsewhere on Ω with = 1 x x +1 y y, D = diag(ε, 1) Tested: ε = 10 2, 10 2, 10 3 (i.e. some strong positive offdiagonal elements) -1-1 -1-1 -εσ -εσ -εσ -εσ -ε -1-1 -1-1 -εσ -εσ -εσ -εσ -ε -1-1 -1-1 -εσ -εσ -εσ -εσ -ε -1-1 -1-1 -εσ -εσ -εσ -εσ -ε -1-1 -1-1 Σ = 2(1+ε) h An efficient multigrid solver based on aggregation p.45
Test suite: 3D Model Problems (FD) MODEL3D : 7-point discretization of { u = 1 on Ω = [0,1] [0,1] [0,1] u = 0 on Ω with = 2 x 2 + 2 y 2 + 2 z 2 ANI3D : 7-point discretization of D u = 1 on Ω = [0,1] [0,1] [0,1] u = 0 for x = 1, 0 y, z 1 = 0 elsewhere on Ω u n with = 1 x x +1 y y +1 z z, D = diag(ε x, ε y,1) Tested: (0.07,1,1),(0.07,0.25,1), (0.07,0.07,1), (0.005,1,1), An efficient multigrid solver based on aggregation p.46 (0.005,0.07,1), (0.005,0.005,1)
Test suite: Problems with Jumps (FD) JUMP2D : 5-point discretization of D u = f on Ω = [0,1] [0,1] u = 0 for x = 1, 0 y 1 = 0 elsewhere on Ω u n with = 1 x x +1 y y,d = diag(d x, D y ) 100,100 1,1 100,1 1,100 D x, D y JUMP3D : 7-point discretization of D u = f on Ω = [0,1] [0,1] [0,1] u = 0 for x = 1, 0 y, z 1 = 0 elsewhere on Ω u n with = 1 x x +1 y y +1 z z D = 10 3 D = 1 An efficient multigrid solver based on aggregation p.47
Test suite: Sphere in a cube (FE) Finite element discretization of D u = 0 on Ω = [0,1] [0,1] [0,1] u = 0 for x = 0, 1, 0 y, z 1 u = 1 elsewhere on Ω with = 1 x x +1 y y +1 z z SPHUNF : quasi uniform mesh D = d D = 1 d = 10 3, 1, 10 3 SPHRF : mesh 10 finer on the sphere An efficient multigrid solver based on aggregation p.48
Test suite: reenter. corner, loc. ref. (FE) Finite { element discretization of u = 0 on Ω = [0,1] [0,1] u = r 2 3 sin( 2θ 3 ) on Ω with = 2 x 2 + 2 y 2 LUNFST : uniform structured grid LRFUST : unstructured grid with mesh 10 r finer near reentering corner r = 0 5 An efficient multigrid solver based on aggregation p.49
Test suite: Nonsymmetric problems Challenging convection-diffusion problems Upwind FD approximation of { ν u + v grad(u) = f in Ω u = g on Ω In all cases: tests for ν = 1, 10 2, 10 4, 10 6 An efficient multigrid solver based on aggregation p.50