Open-source finite element solver for domain decomposition problems


1/29 Open-source finite element solver for domain decomposition problems
C. Geuzaine (1), X. Antoine (2,3), D. Colignon (1), M. El Bouajaji (3,2) and B. Thierry (4)
(1) University of Liège, Belgium; (2) University of Lorraine, France; (3) INRIA Nancy Grand-Est, France; (4) OPERA, Applied Geophysical Research Group, Pau, France
DD22, September 2013

2/29 Outline
- GetDP
- Results
- Conclusions and perspectives

3/29 Outline (current part: GetDP)
- GetDP
- Results
- Conclusions and perspectives

4/29 Reference problem
Obstacle K with boundary Γ; propagation domain Ω := R^d \ K, truncated by the artificial boundary Γ^∞. The scattered field u is the solution of
$$(\Delta + k^2)u = 0 \ \text{in } \Omega, \qquad u = -u^{\text{inc}} \ \text{on } \Gamma, \qquad \partial_n u - ik u = 0 \ \text{on } \Gamma^{\infty}.$$

5/29 Reference problem
Non-overlapping DDM, iteration n+1. For j = 0, ..., N_dom − 1 (with Ω = ∪_j Ω_j):
1. Compute the fields $u_j^{n+1} := u^{n+1}|_{\Omega_j}$:
$$(\Delta + k^2)u_j^{n+1} = 0 \ \text{in } \Omega_j, \quad u_j^{n+1} = -u^{\text{inc}} \ \text{on } \Gamma_j, \quad \partial_n u_j^{n+1} - ik\, u_j^{n+1} = 0 \ \text{on } \Gamma_j^{\infty}, \quad \partial_n u_j^{n+1} + S u_j^{n+1} = g_{ij}^{n} \ \text{on } \Sigma_{ij} \ (i \neq j).$$
2. Update the data $g_{ji}^{n+1}$:
$$g_{ji}^{n+1} = -g_{ij}^{n} + 2 S u_j^{n+1} \quad \text{on } \Sigma_{ij},$$
where Σ_ij := ∂Ω_i ∩ ∂Ω_j, Γ_j = Γ ∩ ∂Ω_j, Γ_j^∞ = Γ^∞ ∩ ∂Ω_j, and S is the transmission operator (Sommerfeld, OO2, GIBC, ...).

6/29 Reference problem
Recast the DDM into the linear system
$$(I - A)g = b, \tag{1.1}$$
where I is the identity operator and:
- $g = (g_{ij})_{j \neq i}$, with $(Ag)_{ji} = -g_{ij} + 2 S w_j|_{\Sigma_{ij}}$, where $w_j$ solves $(\Delta + k^2)w_j = 0$ in $\Omega_j$, $w_j = 0$ on $\Gamma_j$, $\partial_n w_j - ik w_j = 0$ on $\Gamma_j^{\infty}$, $\partial_n w_j + S w_j = g_{ij}$ on $\Sigma_{ij}$;
- $b = (b_{ij})_{j \neq i}$, with $b_{ij} = 2 S v_j|_{\Sigma_{ij}}$, where $v_j$ solves $(\Delta + k^2)v_j = 0$ in $\Omega_j$, $v_j = -u^{\text{inc}}$ on $\Gamma_j$, $\partial_n v_j - ik v_j = 0$ on $\Gamma_j^{\infty}$, $\partial_n v_j + S v_j = 0$ on $\Sigma_{ij}$.
System (1.1) can be solved using a Krylov subspace solver (GMRES, ...).
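This structure is what makes the approach matrix-free: A is never assembled, and each application of A amounts to N_dom independent subdomain solves. A minimal sketch of this pattern in Python/SciPy (not part of the talk; the random matrix standing in for A and all sizes are purely illustrative):

    # Solve (I - A) g = b by GMRES, applying A only through a function call.
    import numpy as np
    from scipy.sparse.linalg import LinearOperator, gmres

    n = 200                                              # number of interface unknowns g_ij
    rng = np.random.default_rng(0)
    M = 0.5 * rng.standard_normal((n, n)) / np.sqrt(n)   # illustrative stand-in for A

    def apply_A(g):
        # In the DDM, this step would solve a (Helmholtz + surface PDE) problem
        # on each subdomain with interface data g, then return the updated traces.
        return M @ g

    op = LinearOperator((n, n), matvec=lambda g: g - apply_A(g))
    b = rng.standard_normal(n)                           # stand-in for the right-hand side b
    g, info = gmres(op, b)
    print("GMRES info:", info)                           # 0 means converged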

7/29 Reference problem
Main steps:
1. Compute b: solve N_dom (Helmholtz + surface PDE) problems
2. Iterative linear solver for (I − A)g = b: each iteration solves N_dom (Helmholtz + surface PDE) problems to produce $(g_{ij}^{n+1})_{j \neq i}$
3. Compute u: solve N_dom Helmholtz problems
[Figure: example decomposition into Ω_0, Ω_1, Ω_2.]

8/29 Reference problem
Main steps in parallel:
1. Compute b: solve N_dom (Helmholtz + surface PDE) problems in parallel
2. Iterative linear solver for (I − A)g = b: exchange data between subdomains/MPI processes, then solve N_dom (Helmholtz + surface PDE) problems in parallel to produce $(g_{ij}^{n+1})_{j \neq i}$
3. Exchange data between subdomains/MPI processes
4. Compute u: solve N_dom Helmholtz problems in parallel
[Figure: subdomains Ω_0, Ω_1, Ω_2 distributed over Proc. 0, Proc. 1, Proc. 2.]
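The exchanges in steps 2 and 3 only involve interface data between neighboring subdomains. A minimal sketch of such an exchange with mpi4py (not part of the talk, and not GetDP's actual implementation — GetDP automates this internally); it assumes one subdomain per MPI process in a 1D slice decomposition:

    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    j, size = comm.Get_rank(), comm.Get_size()    # process j owns subdomain Omega_j

    # Placeholder interface traces produced by the local subdomain solve.
    g_left = np.zeros(100, dtype=complex)         # data for interface Sigma_{j-1,j}
    g_right = np.zeros(100, dtype=complex)        # data for interface Sigma_{j,j+1}
    recv_left, recv_right = np.empty_like(g_left), np.empty_like(g_right)

    # Neighbors in the 1D chain; PROC_NULL turns boundary exchanges into no-ops.
    left = j - 1 if j > 0 else MPI.PROC_NULL
    right = j + 1 if j < size - 1 else MPI.PROC_NULL
    comm.Sendrecv(g_left, dest=left, recvbuf=recv_left, source=left)
    comm.Sendrecv(g_right, dest=right, recvbuf=recv_right, source=right)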

9/29 GetDP
GetDP = General Environment for the Treatment of Discrete Problems
- Authors: C. Geuzaine and P. Dular
- History: started at the end of 1996 (first binary release mid-1998); open-source under the GNU GPL since 2004
- Design: small, fast, no GUI; aimed at end-users, not yet another library; external dependencies limited to a minimum
- How? A clear mathematical structure
- Languages: C/C++, with PETSc

10/29 GetDP: DD add-on
Motivation of the add-on: solve a linear system AX = b with a matrix-free matrix-vector product.
Goals of the add-on:
- DD made simple: passing from mono-domain to multi-domain must not be painful
- Easy parallelization, sparing the user from writing MPI code
- General method: not restricted to a particular PDE or Krylov solver
- Fast and efficient
- Robust and scalable
- Open-source

11/29 GetDP: DD add-on
When the user has a code solving a mono-domain PDE with GetDP...
What the user must do:
- All the mathematics of the DDM
- Build/decompose the mesh, e.g. with Gmsh (some plugins exist)
- Define the matrix-free matrix-vector product
- Define the parameters: solver, tolerance, ...
- If running under MPI: choose a distribution of the subdomains/unknowns over the MPI processes; optionally, specify the subdomain neighbors (speed-up)
What GetDP will do:
- Manage the iterative linear solver (PETSc)
- If running under MPI: automatically exchange data between processes (MPI)

12/29 GetDP: DD add-on
Remark: the surface unknowns g can be interpolated, which allows:
- Mixed formulations
- Non-conforming meshes at the interfaces
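A toy illustration in Python (not from the talk) of why interpolable interface unknowns permit non-conforming meshes: each side can sample its neighbor's trace at its own interface nodes, with 1D linear interpolation here standing in for GetDP's interpolation of g:

    import numpy as np

    # Trace g known on subdomain i's interface mesh (coarse)...
    x_i = np.linspace(0.0, 1.0, 11)
    g_i = np.sin(2 * np.pi * x_i)
    # ...evaluated on subdomain j's non-matching interface mesh (finer).
    x_j = np.linspace(0.0, 1.0, 17)
    g_j = np.interp(x_j, x_i, g_i)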

13/29 GetDP: code
Main steps in parallel:
1. Compute b: solve N_dom Helmholtz + N_dom surface PDE problems in parallel
2. Iterative linear solver for (I − A)g = b: exchange data between subdomains/MPI processes, then solve N_dom Helmholtz + N_dom surface PDE problems in parallel to produce $(g_{ij}^{n+1})_{j \neq i}$
3. Exchange data between subdomains/MPI processes
4. Compute u: solve N_dom Helmholtz problems

14/29 GetDP: code
Weak formulations (S := −ik (Sommerfeld) and B := −ik (Sommerfeld)). For j = 0, ..., N_dom − 1 (v_j := test functions):
$$\underbrace{\int_{\Omega_j} \nabla u_j \cdot \nabla v_j - k^2 \int_{\Omega_j} u_j v_j}_{\text{Helmholtz equation}} \ \underbrace{- \int_{\Gamma_j^{\infty}} ik\, u_j v_j}_{\text{Sommerfeld ABC}} \ \underbrace{- \int_{\Sigma_j} g_j^{\text{in}} v_j - \int_{\Sigma_j} ik\, u_j v_j}_{\text{Transmission condition}} = 0.$$

For j In {0:N_DOM-1}
  [...]
  Galerkin { [ Dof{Grad u~{j}}, {Grad u~{j}} ];   In Omega~{j};    Jacobian JVol; Integration I1; }
  Galerkin { [ -k^2 * Dof{u~{j}}, {u~{j}} ];      In Omega~{j};    Jacobian JVol; Integration I1; }
  Galerkin { [ -I[] * k * Dof{u~{j}}, {u~{j}} ];  In GammaInf~{j}; Jacobian JSur; Integration I1; }
  Galerkin { [ -g_in~{j}[], {u~{j}} ];            In Sigma~{j};    Jacobian JSur; Integration I1; }
  Galerkin { [ -I[] * k * Dof{u~{j}}, {u~{j}} ];  In Sigma~{j};    Jacobian JSur; Integration I1; }
  [...]
EndFor
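To make the correspondence between the Galerkin terms and the integrals concrete, here is a minimal 1D analogue in Python/NumPy (not part of the talk): P1 stiffness minus k² mass in the volume, impedance terms of the form −ik u v at the boundary nodes, and the interface data g entering the right-hand side:

    import numpy as np

    k, n = 2 * np.pi, 200
    h = 1.0 / n
    # P1 stiffness and mass matrices on a uniform mesh of [0, 1]
    K = (np.diag(np.full(n + 1, 2.0)) + np.diag(np.full(n, -1.0), 1)
         + np.diag(np.full(n, -1.0), -1)) / h
    K[0, 0] = K[n, n] = 1.0 / h
    M = h * (np.diag(np.full(n + 1, 2.0 / 3)) + np.diag(np.full(n, 1.0 / 6), 1)
         + np.diag(np.full(n, 1.0 / 6), -1))
    M[0, 0] = M[n, n] = h / 3

    A = (K - k**2 * M).astype(complex)   # "Helmholtz equation" terms
    A[0, 0] -= 1j * k                    # impedance term at x = 0 (role of Sigma_j)
    A[n, n] -= 1j * k                    # impedance term at x = 1 (role of GammaInf_j)
    b = np.zeros(n + 1, dtype=complex)
    b[0] = 1.0                           # incoming interface data g_in
    u = np.linalg.solve(A, b)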

15/29 GetDP: code
More complicated transmission condition (OO2: Gander, Magoulès, Nataf, 2002):
$$a u_j - b\, \Delta_{\Sigma_j} u_j - g_j^{\text{in}} = 0 \quad \text{on } \Sigma_j.$$
Weak formulation (v_j := test function), the surface operator Δ_{Σ_j} being integrated by parts:
$$\int_{\Sigma_j} 2a\, u_j v_j - \int_{\Sigma_j} b\, \nabla_{\Sigma_j} u_j \cdot \nabla_{\Sigma_j} v_j + \int_{\Sigma_j} g_j^{\text{in}} v_j = 0.$$

Galerkin { [ a[] * Dof{u~{j}}, {u~{j}} ];       In Sigma~{j}; Jacobian JSur; Integration I1; }
Galerkin { [ -b[] * Dof{d u~{j}}, {d u~{j}} ];  In Sigma~{j}; Jacobian JSur; Integration I1; }
Galerkin { [ -g_in~{j}[], {u~{j}} ];            In Sigma~{j}; Jacobian JSur; Integration I1; }

16/29 GetDP: code
Main steps in parallel:
1. Compute b: solve N_dom (Helmholtz + surface PDE) problems in parallel
2. Iterative linear solver for (I − A)g = b: exchange data between subdomains/MPI processes, then solve N_dom (Helmholtz + surface PDE) problems in parallel to produce $(g_{ij}^{n+1})_{j \neq i}$
3. Exchange data between subdomains/MPI processes
4. Compute u: solve N_dom Helmholtz problems

17/29 GetDP: code
Acoustics:

IterativeLinearSolver[ I-A, solver, tol, maxit, ... ]
{
  SetCommSelf;   // sequential mode
  For ii In {0: #ListOfDom()-1}
    j = ListOfDom(ii);
    // Solve Helmholtz on Omega_j
    GenerateRHSGroup[Helmholtz~{j}, Sigma~{j}];
    SolveAgain[Helmholtz~{j}];
    // Compute g_ji
    GenerateRHSGroup[ComputeG~{j}, Sigma~{j}];
    SolveAgain[ComputeG~{j}];
  EndFor
  // Update g_ji in memory
  For ii In {0: #ListOfDom()-1}
    j = ListOfDom(ii);
    PostOperation[g_out~{j}];
  EndFor
  SetCommWorld;  // parallel mode
}

18/29 GetDP: code
And for Maxwell's equations...

IterativeLinearSolver[ I-A, solver, tol, maxit, ... ]
{
  SetCommSelf;   // sequential mode
  For ii In {0: #ListOfDom()-1}
    j = ListOfDom(ii);
    // Solve Maxwell on Omega_j
    GenerateRHSGroup[Maxwell~{j}, Sigma~{j}];
    SolveAgain[Maxwell~{j}];
    // Compute g_ji
    GenerateRHSGroup[ComputeG~{j}, Sigma~{j}];
    SolveAgain[ComputeG~{j}];
  EndFor
  // Update g_ji in memory
  For ii In {0: #ListOfDom()-1}
    j = ListOfDom(ii);
    PostOperation[g_out~{j}];
  EndFor
  SetCommWorld;  // parallel mode
}

19/29 Outline (current part: Results)
- GetDP
- Results
- Conclusions and perspectives

20/29 Acoustic: 3D submarine
Mesh: cut into slices and split into different files.

21/29 Acoustic: 3D submarine
Submarine problem with 16 subdomains: iso-surfaces of the real part of the scattered field for k = 41π.

22/29 Acoustic: 3D submarine
Vega cluster (ULB, Belgium): 42 nodes, each with 256 GB of RAM and 4 AMD Opteron 6272 processors with 16 cores at 2.1 GHz (MPI/OpenBLAS). k = 10π and L_sub = 1.

Nb. domains | Nb. nodes | Nb. MPI/node | Nb. threads/MPI | Nb. vertices (M) | Nb. iterations | CPU time (min) | Max RAM (GB) | Size volume PDE | Size surface PDE | Size GMRES
16 | 16 | 1 | 16 | 40  | 127 | 302  | 137 | 27×10^5  | 10×10^4  | 9×10^6
32 | 16 | 2 | 8  | 2.5 | 113 | 9    | 3   | 1.2×10^5 | 1.6×10^4 | 2.9×10^6
32 | 16 | 2 | 8  | 5   | 121 | 21   | 7   | 2.1×10^5 | 2.5×10^4 | 4.5×10^6
32 | 16 | 2 | 8  | 10  | 129 | 42   | 13  | 4.1×10^5 | 4×10^4   | 7.2×10^6
32 | 16 | 2 | 8  | 20  | 158 | 90   | ?   | 8×10^5   | 6.7×10^4 | 12×10^6
32 | 16 | 2 | 8  | 40  | 171 | 205  | 60  | 15×10^5  | 10×10^4  | 18×10^6
32 | 16 | 2 | 8  | 40  | 171 | 182  | 63  | 15×10^5  | 10×10^4  | 18×10^6
32 | 16 | 2 | 8  | 80  | 184 | 495  | 141 | 29×10^5  | 16×10^4  | 29×10^6
32 | 16 | 2 | 8  | 120 | 192 | 795  | 220 | 42×10^5  | 21×10^4  | 38×10^6
64 | 32 | 2 | 16 | 40  | 240 | 250  | 40  | 8×10^5   | 10×10^4  | 38×10^6
64 | 32 | 2 | 16 | 80  | 259 | 817  | 85  | 16×10^5  | 16×10^4  | 60×10^6
64 | 32 | 2 | 16 | 120 | 270 | 1072 | 118 | 23×10^5  | 21×10^4  | 78×10^6

23/29 Acoustic: 3D submarine
NIC4 cluster (ULg, Belgium): 120 nodes, each with 64 GB of RAM and 2 Intel Xeon E5-2650 processors with 8 cores at 2 GHz (MPI/MKL). k = 10π and L_sub = 1.

Nb. domains | Nb. nodes | Nb. MPI/node | Nb. threads/MPI | Nb. vertices (M) | Nb. iterations | CPU time (min) | Max RAM (GB) | Size volume PDE | Size surface PDE | Size GMRES
32  | 32  | 1 | 16 | 40 | 171 | 100 | 59.5 | 14.9×10^5 | 10^5     | 1.8×10^7
64  | 64  | 1 | 16 | 40 | 240 | 160 | 39.4 | 8.8×10^5  | 10^5     | 3.8×10^7
96  | 96  | 1 | 16 | 40 | 303 | 195 | 36.3 | 6.7×10^5  | 10^5     | 5.7×10^7
120 | 120 | 1 | 16 | 80 | 371 | 399 | 63.0 | 10.8×10^5 | 1.6×10^5 | 11.4×10^7

24/29 Acoustic: scalability (2D)
[Plot: CPU time (s), 0 to 700, vs. number of subdomains, 2 to 16, for 1 MPI process and for N_dom MPI processes.]
Circle-pie decomposition (2D). CPU time vs. N_dom with 1 or N_dom MPI processes.

25/29 Acoustic: non-conforming meshes (simple case)
k = 2π, N_dom = 5, tolerance: 10^-6.

26/29 Electromagnetism (3D)
[Plot: relative residual (log10) vs. iteration number, 0 to 50, for k = 4π and k = 8π with N_dom = 5 and N_dom = 10.]
Perfectly conducting sphere; two concentric spheres; N_dom = 5 or 10.

27/29 Outline (current part: Conclusions and perspectives)
- GetDP
- Results
- Conclusions and perspectives

28/29 Conclusions
Our software...
- Makes it quite easy to pass from mono-domain to multi-domain
- Is fast and efficient
- Has been successfully tested on clusters
- Is adapted to scalar/vector/tensor unknowns
What can be done with it...
- 1D/2D/3D without change (GetDP is natively 3D)
- Automatic partitioning ("waveguide-style")
- With or without overlap
- Mixed formulations
- Non-conforming meshes at interfaces (interpolation)
- Right (or left) preconditioning: $(I - A)F^{-1}\tilde{g} = b$ with $\tilde{g} = Fg$ (see the sketch below)
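As an illustration of the last point, a minimal right-preconditioned solve in Python/SciPy (not part of the talk; the operators are random stand-ins, and F would in practice be some cheap approximation of I − A):

    import numpy as np
    from scipy.sparse.linalg import LinearOperator, gmres

    n = 200
    rng = np.random.default_rng(1)
    A = 0.5 * rng.standard_normal((n, n)) / np.sqrt(n)   # stand-in for the DDM operator A
    b = rng.standard_normal(n)

    # F approximates I - A (here: its diagonal); GMRES then sees (I - A) F^{-1}.
    Finv = np.diag(1.0 / np.diag(np.eye(n) - A))

    def matvec(y):
        z = Finv @ y          # z = F^{-1} y
        return z - A @ z      # (I - A) z

    op = LinearOperator((n, n), matvec=matvec)
    y, info = gmres(op, b)
    g = Finv @ y              # recover the unpreconditioned unknown g
    print("residual norm:", np.linalg.norm(g - A @ g - b))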

29/29 Perspectives
- Publish the DDM codes online (acoustics/electromagnetism)
- Documentation
- Computational optimizations (storage of unknowns, ...)
- Optimize and simplify mesh decomposition (topological tools)
- Fully automatic partitioning algorithms (METIS/Chaco)
- We hope some of you will use it :-)
Thank you very much!