Toward black-box adaptive domain decomposition methods


Toward black-box adaptive domain decomposition methods
Frédéric Nataf, Laboratory J.L. Lions (LJLL), CNRS, Alpines Inria and Univ. Paris VI
Joint work with: Victorita Dolean (Univ. Nice Sophia-Antipolis), Patrice Hauret (Michelin, Clermont-Ferrand), Frédéric Hecht (LJLL), Pierre Jolivet (LJLL), Clemens Pechstein (Johannes Kepler Univ., Linz), Robert Scheichl (Univ. Bath), Nicole Spillane (LJLL)
JLPC, Lyon, 2013

Outline
1. Some Applications
2. An abstract 2-level Schwarz: the GenEO algorithm
3. Conclusion
F. Nataf et al., BB DDM


Motivation
Large sparse discretized systems AU = F with strongly heterogeneous coefficients (high contrast, nonlinear, multiscale), e.g. the Darcy pressure equation with P1 finite elements. The condition number behaves like
    cond(A) ≲ (α_max / α_min) h^{-2}.
Applications: flow in heterogeneous / stochastic / layered media, structural mechanics, time-dependent waves, etc.
Goal: iterative solvers that are robust both in problem size and in the heterogeneities.

Problem
    −div(σ(u)) = f,
where
    σ_ij(u) = 2µ ε_ij(u) + λ δ_ij div(u),
    ε_ij(u) = (1/2) (∂u_i/∂x_j + ∂u_j/∂x_i),
    f = (0, g)^T = (0, 10)^T,
    µ = E / (2(1 + ν)),    λ = E ν / ((1 + ν)(1 − 2ν)).
After a finite element method (FEM) discretization: AU = F, where A is a large, sparse, highly heterogeneous matrix.

Black-box solvers (solve(mat, rhs, sol))
                Direct solvers              Iterative solvers
    Pros        robustness                  naturally parallel
    Cons        difficult to parallelize    lack of robustness
Domain decomposition methods (DDM): a hybrid solver should be both naturally parallel and robust.
General form: Au = f, solved with PCG for a preconditioner M^{-1}.
What is classical: robustness with respect to problem size (scalability).
What is new here: provable robustness in the SPD case with respect to coefficient jumps and to systems of PDEs.


The first domain decomposition method
The original Schwarz method (H. A. Schwarz, 1870):
    −∆u = f in Ω,    u = 0 on ∂Ω,
with Ω covered by two overlapping subdomains Ω_1 and Ω_2. The Schwarz method maps (u_1^n, u_2^n) to (u_1^{n+1}, u_2^{n+1}) via
    −∆u_1^{n+1} = f in Ω_1,    u_1^{n+1} = 0 on ∂Ω_1 ∩ ∂Ω,    u_1^{n+1} = u_2^n on ∂Ω_1 ∩ Ω_2;
    −∆u_2^{n+1} = f in Ω_2,    u_2^{n+1} = 0 on ∂Ω_2 ∩ ∂Ω,    u_2^{n+1} = u_1^{n+1} on ∂Ω_2 ∩ Ω_1.
It converges, but very slowly, and only with overlapping subdomains. The parallel version (using u_1^n in the second solve) is called the Jacobi-Schwarz method (JSM).

Jacobi and Schwarz (I): algebraic point of view
The set of indices is partitioned into two sets N_1 and N_2:
    [ A_11  A_12 ] [ x_1 ]   [ b_1 ]
    [ A_21  A_22 ] [ x_2 ] = [ b_2 ]
The block-Jacobi algorithm reads
    A_11 x_1^{k+1} = b_1 − A_12 x_2^k,
    A_22 x_2^{k+1} = b_2 − A_21 x_1^k.
It corresponds to solving a Dirichlet boundary value problem in each subdomain, with Dirichlet data taken from the other subdomain at the previous step: a Schwarz method with minimal overlap.
Figure: domain decomposition with minimal overlap.
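A minimal numerical sketch of this block-Jacobi iteration; the 1D Laplacian, the two index blocks, and the iteration count are illustrative choices, not from the slides.

```python
import numpy as np

n = 40
A = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1))      # 1D Laplacian model matrix
b = np.ones(n)
N1 = np.arange(0, n // 2)                # indices of subdomain 1
N2 = np.arange(n // 2, n)                # indices of subdomain 2

D = np.zeros_like(A)                     # block-diagonal part diag(A11, A22)
D[np.ix_(N1, N1)] = A[np.ix_(N1, N1)]
D[np.ix_(N2, N2)] = A[np.ix_(N2, N2)]
R = D - A                                # minus the off-diagonal coupling blocks

# Block Jacobi: diag(A11, A22) x^{k+1} = -offdiag(A) x^k + b
x = np.zeros(n)
for k in range(500):
    x = np.linalg.solve(D, R @ x + b)

print(np.max(np.abs(A @ x - b)))         # residual shrinks slowly toward 0
```

The slow decay of the residual matches the slide's remark: with minimal overlap the exchange of information between the two blocks is a single interface entry per block.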

Jacobi and Schwarz (II): larger overlap
Let δ be a non-negative integer measuring the width of the overlap.
Figures: domain decomposition with overlap (interface indices n_s and n_s + δ) and the corresponding matrix decomposition, without and with overlap.

Strong and weak scalability
How do we evaluate the efficiency of a domain decomposition method?
Strong scalability (Amdahl): how the solution time varies with the number of processors for a fixed total problem size.
Weak scalability (Gustafson): how the solution time varies with the number of processors for a fixed problem size per processor.
Weak scalability is not achieved with the one-level method:
    Number of subdomains    8    16    32    64
    ASM iterations         18    35    66   128
The iteration count increases linearly with the number of subdomains in one direction.
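The growth in the table above can be reproduced on a toy problem. A sketch under assumptions not in the slides (1D Laplacian, minimal-overlap one-level additive Schwarz, plain PCG with made-up sizes and tolerance):

```python
import numpy as np

def asm_pcg_iterations(n, N, tol=1e-8):
    """CG preconditioned by one-level additive Schwarz with N subdomains."""
    A = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1))
    size = n // N
    subsets = [np.arange(max(0, i * size - 1), min(n, (i + 1) * size + 1))
               for i in range(N)]            # minimal overlap
    # Assemble M^{-1} = sum_i R_i^T A_i^{-1} R_i densely, for simplicity.
    M = np.zeros((n, n))
    for s in subsets:
        E = np.zeros((len(s), n))
        E[np.arange(len(s)), s] = 1.0        # Boolean restriction R_i
        M += E.T @ np.linalg.inv(A[np.ix_(s, s)]) @ E
    b = np.ones(n)
    x = np.zeros(n); r = b.copy(); z = M @ r; p = z.copy()
    rz = r @ z
    for k in range(1, 10 * n):               # standard PCG loop
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p; r -= alpha * Ap
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            return k
        z = M @ r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return 10 * n

counts = [asm_pcg_iterations(256, N) for N in (2, 4, 8, 16)]
print(counts)   # iteration counts grow with the number of subdomains
```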

How to achieve scalability
Stagnation corresponds to a few very small eigenvalues in the spectrum of the preconditioned problem. They are due to the lack of a global exchange of information in the preconditioner. For
    −∆u = f in Ω,    u = 0 on ∂Ω,
the mean value of the solution in subdomain i depends on the value of f in all subdomains. A classical remedy is to introduce a coarse problem that couples all subdomains. This is closely related to the deflation techniques that are classical in linear algebra (see the papers of Nabben and Vuik in SIAM J. Sci. Comput., 200X) and to multigrid techniques.

Adding a coarse space
We add a coarse space correction (aka a second level). Let V_H be the coarse space and Z a basis of it, V_H = span(Z); writing R_0 = Z^T, we define the two-level preconditioner as
    M^{-1}_{ASM,2} := R_0^T (R_0 A R_0^T)^{-1} R_0 + Σ_{i=1}^N R_i^T A_i^{-1} R_i.
The Nicolaides approach uses the kernel of the operator as the coarse space, i.e. the constant vectors. In local form this reads
    Z := (R_i^T D_i R_i 1)_{1 ≤ i ≤ N},
where the D_i are chosen so that we have a partition of unity:
    Σ_{i=1}^N R_i^T D_i R_i = Id.
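A minimal numerical sketch of this two-level preconditioner with the Nicolaides coarse space, under assumptions not in the slides (a 1D Laplacian, hand-chosen overlapping index sets, and simple multiplicity-based partition-of-unity weights D_i):

```python
import numpy as np

n, N = 120, 4
A = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1))           # SPD model matrix

size = n // N
subsets = [np.arange(max(0, i * size - 2), min(n, (i + 1) * size + 2))
           for i in range(N)]                 # overlapping index sets
count = np.zeros(n)
for s in subsets:
    count[s] += 1                             # multiplicity of each index

def restriction(s):                           # Boolean restriction matrix R_i
    R = np.zeros((len(s), n))
    R[np.arange(len(s)), s] = 1.0
    return R

# One-level additive Schwarz: sum_i R_i^T A_i^{-1} R_i
M1 = sum(restriction(s).T @ np.linalg.inv(A[np.ix_(s, s)]) @ restriction(s)
         for s in subsets)

# Nicolaides coarse space: Z_i = R_i^T D_i R_i 1 with D_i = diag(1/multiplicity)
Z = np.column_stack([restriction(s).T @ (1.0 / count[s]) for s in subsets])
R0 = Z.T
M2 = M1 + R0.T @ np.linalg.inv(R0 @ A @ R0.T) @ R0

def spectral_cond(M):                         # kappa(M A): eigenvalues real > 0
    ev = np.sort(np.real(np.linalg.eigvals(M @ A)))
    return ev[-1] / ev[0]

cond1, cond2 = spectral_cond(M1), spectral_cond(M2)
print(cond1, cond2)                           # the coarse space lowers kappa
```

The added coarse term is a rank-N correction, yet it removes the small eigenvalues responsible for stagnation.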

Theoretical convergence result
Theorem (Widlund, Dryja). Let M^{-1}_{ASM,2} be the two-level additive Schwarz preconditioner; then
    κ(M^{-1}_{ASM,2} A) ≤ C (1 + H/δ),
where δ is the size of the overlap between the subdomains and H the subdomain size.
This does indeed work very well:
    Number of subdomains     8    16    32    64
    ASM                     18    35    66   128
    ASM + Nicolaides        20    27    28    27

Failure for the Darcy equation with heterogeneities
    −div(α(x, y) ∇u) = 0 in Ω ⊂ R²,
    u = 0 on ∂Ω_D,    ∂u/∂n = 0 on ∂Ω_N.
Figures: the domain decomposition and the coefficient field α(x, y).
    Jump                 1    10    10²    10³    10⁴
    ASM                 39    45     60     72     73
    ASM + Nicolaides    30    36     50     61     65
Our approach: fix the problem by an optimal and proven choice of the coarse space Z.

Outline
1. Some Applications
2. An abstract 2-level Schwarz: the GenEO algorithm (choice of the coarse space; parallel implementation)
3. Conclusion

Objectives
Strategy: define an appropriate coarse space V_{H2} = span(Z_2) and use the framework introduced above; writing R_0 = Z_2^T, the two-level preconditioner is
    P^{-1}_{ASM,2} := R_0^T (R_0 A R_0^T)^{-1} R_0 + Σ_{i=1}^N R_i^T A_i^{-1} R_i.
The coarse space must be:
    local (calculated on each subdomain, in parallel);
    adaptive (calculated automatically);
    easy and cheap to compute;
    robust (it must lead to an algorithm whose convergence is proven to depend neither on the partition nor on the jumps in the coefficients).

Abstract eigenvalue problem
Gen.EVP per subdomain: find p_{j,k} ∈ V_h(Ω_j) and λ_{j,k} ≥ 0 such that
    a_{Ω_j}(p_{j,k}, v) = λ_{j,k} a_{Ω_j^∘}(Ξ_j p_{j,k}, Ξ_j v)    ∀ v ∈ V_h(Ω_j),
or, in matrix form, A_j p_{j,k} = λ_{j,k} X_j A_j X_j p_{j,k} with X_j diagonal; a_D denotes the restriction of a to D.
In the two-level ASM, choose the first m_j eigenvectors per subdomain:
    V_0 = span{ Ξ_j p_{j,k} : j = 1, …, N; k = 1, …, m_j }.
This automatically includes the zero-energy modes.
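A toy sketch of such a generalized eigenproblem on a single "subdomain", with everything (the subdomain matrix A_j, the diagonal matrix X_j, and the threshold standing in for δ_j/H_j) made up for illustration; SciPy's `eigh` handles the generalized symmetric problem, with a tiny shift because the right-hand-side matrix is typically singular:

```python
import numpy as np
from scipy.linalg import eigh

m = 31
Aj = (np.diag(2.0 * np.ones(m)) - np.diag(np.ones(m - 1), 1)
      - np.diag(np.ones(m - 1), -1))              # SPD "subdomain" matrix
xi = 1.0 - np.abs(np.linspace(-1.0, 1.0, m))      # hat-shaped, 0 at both ends
Xj = np.diag(xi)                                  # diagonal partition-of-unity

B = Xj @ Aj @ Xj                                  # typically singular ...
B += 1e-10 * np.eye(m)                            # ... so shift it slightly
lam, P = eigh(Aj, B)                              # A_j p = lambda B p, ascending

tau = 2.0                                         # stand-in for delta_j / H_j
mj = int(np.sum(lam < tau))                       # eigenvalues below threshold
coarse = Xj @ P[:, :mj]                           # contributions Xi_j p_{j,k}
print(mj, lam[:5])
```

The near-singularity of B is what produces the huge eigenvalues at the top of the spectrum, consistent with λ_{j,k} ∈ [0, ∞] on the comparison slide.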


Comparison with existing works
Galvis & Efendiev (SIAM 2010):
    ∫_{Ω_j} κ ∇p_{j,k} · ∇v dx = λ_{j,k} ∫_{Ω_j} κ p_{j,k} v dx    ∀ v ∈ V_h(Ω_j).
Efendiev, Galvis, Lazarov & Willems (submitted):
    a_{Ω_j}(p_{j,k}, v) = λ_{j,k} Σ_{i ∈ neighb(j)} a_{Ω_j}(ξ_j ξ_i p_{j,k}, ξ_j ξ_i v)    ∀ v ∈ V(Ω_j),
with ξ_j a partition of unity, calculated adaptively (MS).
Our gen.EVP:
    a_{Ω_j}(p_{j,k}, v) = λ_{j,k} a_{Ω_j^∘}(Ξ_j p_{j,k}, Ξ_j v)    ∀ v ∈ V_h(Ω_j);
both matrices are typically singular, hence λ_{j,k} ∈ [0, ∞].

Theory
Under two technical assumptions:
Theorem (Spillane, Dolean, Hauret, N., Pechstein, Scheichl). If 0 < λ_{j,m_j+1} < ∞ for all j, then
    κ(M^{-1}_{ASM,2} A) ≤ (1 + k_0) [ 2 + k_0 (2 k_0 + 1) max_{j=1,…,N} ( 1 + 1/λ_{j,m_j+1} ) ].
Possible criterion for picking m_j (used in our numerics): put into the coarse space the eigenvectors with
    λ_{j,k} < δ_j / H_j,
where H_j is the subdomain diameter and δ_j the overlap.

Eigenvalues and eigenvectors
Figure: the spectrum on a logarithmic scale. The first three eigenvalues are numerically zero (about 10^{-15}: the zero-energy modes), the next six are small (about 10^{-5} to 4·10^{-4}), and from eigenvector 10 onward the eigenvalues jump to about 0.169 and above.

Convergence
Figure: error (from 10^1 down to 10^{-8}, logarithmic scale) versus iteration count (0 to 700) for AS, P_BNN: AS + Z_Nico, and P_BNN: AS + Z_D2N (also with GMRES).

Numerical results via a Domain-Specific Language
FreeFem++ (http://www.freefem.org/ff++), with: Metis (Karypis and Kumar, 1998), SCOTCH (Chevalier and Pellegrini, 2008), UMFPACK (Davis, 2004), ARPACK (Lehoucq et al., 1998), MPI (Snir et al., 1995), Intel MKL PARDISO (Schenk et al., 2004), MUMPS (Amestoy et al., 1998), PaStiX (Hénon et al., 2005), PETSc.
Why use a DS(E)L instead of C/C++/Fortran/...? Performance is close to a low-level language implementation, and it is hard to beat something as simple as:
    varf a(u, v) = int3d(mesh)([dx(u), dy(u), dz(u)]' * [dx(v), dy(v), dz(v)])
                 + int3d(mesh)(f * v)
                 + on(boundary)(u = 0);
Much more in F. Hecht's talk tomorrow.

Strong scalability in two dimensions: heterogeneous elasticity (P. Jolivet, with FreeFem++)
Elasticity problem with heterogeneous coefficients.
Figure: timing relative to 1,024 processes and iteration counts, for 1,024 to 8,192 processes, against linear speedup.
Speed-up for a 1.2-billion-unknown 2D problem, with direct solvers in the subdomains. Peak-performance wall-clock time: 26 s.

Strong scalability in three dimensions: heterogeneous elasticity
Elasticity problem with heterogeneous coefficients.
Figure: timing relative to 1,024 processes and iteration counts, for 1,024 to 6,144 processes, against linear speedup.
Speed-up for a 160-million-unknown 3D problem, with direct solvers in the subdomains. Peak-performance wall-clock time: 36 s.

Weak scalability in two dimensions
Darcy problems with heterogeneous coefficients.
Figure: weak efficiency relative to 1,024 processes, for 1,024 to 12,288 processes and problem sizes from 422 million d.o.f. upward.
Efficiency for a 2D problem, with direct solvers in the subdomains. Final size: 22 billion unknowns. Wall-clock time: 200 s.

Weak scalability in three dimensions
Darcy problems with heterogeneous coefficients.
Figure: weak efficiency relative to 1,024 processes, for 1,024 to 8,192 processes.
Efficiency for a 3D problem, with direct solvers in the subdomains. Final size: 2 billion unknowns. Wall-clock time: 200 s.


Conclusion
Summary: using generalized eigenvalue problems and projection preconditioning, we are able to achieve a targeted convergence rate. This works for ASM, BNN and FETI methods, and the process can be implemented in a black-box algorithm.
Future work: build the coarse space on the fly; multigrid-like three- (or more) level methods.
THANK YOU FOR YOUR ATTENTION!
