Multipole-Based Preconditioners for Sparse Linear Systems.

Similar documents
Analyzing the Error Bounds of Multipole-Based

A Solenoidal Basis Method For Efficient Inductance Extraction Λ

Fast Multipole BEM for Structural Acoustics Simulation

Effective matrix-free preconditioning for the augmented immersed interface method

Fast Multipole Methods

A Robust Preconditioned Iterative Method for the Navier-Stokes Equations with High Reynolds Numbers

Contents. Preface... xi. Introduction...

An Efficient Low Memory Implicit DG Algorithm for Time Dependent Problems

Fast Multipole Methods for Incompressible Flow Simulation

Multilevel low-rank approximation preconditioners Yousef Saad Department of Computer Science and Engineering University of Minnesota

Fast multipole boundary element method for the analysis of plates with many holes

Fast Multipole Methods: Fundamentals & Applications. Ramani Duraiswami Nail A. Gumerov

arxiv: v2 [math.na] 1 Oct 2016

LEAST-SQUARES FINITE ELEMENT MODELS

Indefinite and physics-based preconditioning

Solving Large Nonlinear Sparse Systems

2 CAI, KEYES AND MARCINKOWSKI proportional to the relative nonlinearity of the function; i.e., as the relative nonlinearity increases the domain of co

Efficient Augmented Lagrangian-type Preconditioning for the Oseen Problem using Grad-Div Stabilization

Iterative Solvers in the Finite Element Solution of Transient Heat Conduction

Parallel Algorithms for Solution of Large Sparse Linear Systems with Applications

Linear Solvers. Andrew Hazel

A Domain Decomposition Based Jacobi-Davidson Algorithm for Quantum Dot Simulation

Fast Space Charge Calculations with a Multigrid Poisson Solver & Applications

SPARSE SOLVERS POISSON EQUATION. Margreet Nool. November 9, 2015 FOR THE. CWI, Multiscale Dynamics

J.I. Aliaga 1 M. Bollhöfer 2 A.F. Martín 1 E.S. Quintana-Ortí 1. March, 2009

A Fast, Parallel Potential Flow Solver

Fast solvers for steady incompressible flow

1420 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 24, NO. 9, SEPTEMBER 2005

A DG-based incompressible Navier-Stokes solver

An adaptive fast multipole boundary element method for the Helmholtz equation

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 HIGH FREQUENCY ACOUSTIC SIMULATIONS VIA FMM ACCELERATED BEM

Preface to the Second Edition. Preface to the First Edition

Review: From problem to parallel algorithm

A HIERARCHICAL 3-D DIRECT HELMHOLTZ SOLVER BY DOMAIN DECOMPOSITION AND MODIFIED FOURIER METHOD

A High-Order Galerkin Solver for the Poisson Problem on the Surface of the Cubed Sphere

Lecture 18 Classical Iterative Methods

Transactions on Modelling and Simulation vol 18, 1997 WIT Press, ISSN X

OUTLINE ffl CFD: elliptic pde's! Ax = b ffl Basic iterative methods ffl Krylov subspace methods ffl Preconditioning techniques: Iterative methods ILU

Lecture 2: The Fast Multipole Method

An Embedded Boundary Integral Solver for the Stokes Equations 1

Review for the Midterm Exam

Spatial discretization scheme for incompressible viscous flows

7.4 The Saddle Point Stokes Problem

PDE Solvers for Fluid Flow

Partial Differential Equations

Fast Iterative Solution of Saddle Point Problems

Partial Differential Equations. Examples of PDEs

Implicit Solution of Viscous Aerodynamic Flows using the Discontinuous Galerkin Method

IMPLEMENTATION OF A PARALLEL AMG SOLVER

AMS526: Numerical Analysis I (Numerical Linear Algebra for Computational and Data Sciences)

AN INDEPENDENT LOOPS SEARCH ALGORITHM FOR SOLVING INDUCTIVE PEEC LARGE PROBLEMS

- Part 4 - Multicore and Manycore Technology: Chances and Challenges. Vincent Heuveline

Implementation of 3D Incompressible N-S Equations. Mikhail Sekachev

Overlapping Schwarz preconditioners for Fekete spectral elements

Dual Reciprocity Method for studying thermal flows related to Magma Oceans

Leveraging Task-Parallelism in Energy-Efficient ILU Preconditioners

PALADINS: Scalable Time-Adaptive Algebraic Splitting and Preconditioners for the Navier-Stokes Equations

Multilevel Preconditioning of Graph-Laplacians: Polynomial Approximation of the Pivot Blocks Inverses

9.1 Preconditioned Krylov Subspace Methods

Some Geometric and Algebraic Aspects of Domain Decomposition Methods

1. Fast Iterative Solvers of SLE

APPLICATION OF THE SPARSE CARDINAL SINE DECOMPOSITION TO 3D STOKES FLOWS

ITERATIVE METHODS FOR SPARSE LINEAR SYSTEMS

Robust Preconditioned Conjugate Gradient for the GPU and Parallel Implementations

Fast Structured Spectral Methods

The Embedded Boundary Integral Method (EBI) for the Incompressible Navier-Stokes equations

Uncertainty analysis of large-scale systems using domain decomposition

A parameter tuning technique of a weighted Jacobi-type preconditioner and its application to supernova simulations

Convergence Behavior of a Two-Level Optimized Schwarz Preconditioner

Parallel Fast Multipole Boundary Element Method for Crustal Dynamics

DIRECT NUMERICAL SIMULATION IN A LID-DRIVEN CAVITY AT HIGH REYNOLDS NUMBER

FEM-FEM and FEM-BEM Coupling within the Dune Computational Software Environment

Solving PDEs: the Poisson problem TMA4280 Introduction to Supercomputing

Case Study: Quantum Chromodynamics

Scalable Non-blocking Preconditioned Conjugate Gradient Methods

Efficient Solvers for the Navier Stokes Equations in Rotation Form

Lecture 8: Boundary Integral Equations

TR A Comparison of the Performance of SaP::GPU and Intel s Math Kernel Library (MKL) for Solving Dense Banded Linear Systems

Open boundary conditions in numerical simulations of unsteady incompressible flow

Domain decomposition on different levels of the Jacobi-Davidson method

Parallelization of Multilevel Preconditioners Constructed from Inverse-Based ILUs on Shared-Memory Multiprocessors

Termination criteria for inexact fixed point methods

Probabilistic Model Checking Michaelmas Term Dr. Dave Parker. Department of Computer Science University of Oxford

The Conjugate Gradient Method

20. A Dual-Primal FETI Method for solving Stokes/Navier-Stokes Equations

Fine-grained Parallel Incomplete LU Factorization

Performance Analysis of Parallel Alternating Directions Algorithm for Time Dependent Problems

Fast numerical methods for solving linear PDEs

Course Requirements. Course Mechanics. Projects & Exams. Homework. Week 1. Introduction. Fast Multipole Methods: Fundamentals & Applications

Enhancing Scalability of Sparse Direct Methods

Master Thesis Literature Review: Efficiency Improvement for Panel Codes. TU Delft MARIN

A High-Performance Parallel Hybrid Method for Large Sparse Linear Systems

A fast adaptive numerical solver for nonseparable elliptic partial differential equations

Hierarchical Parallel Solution of Stochastic Systems

Multigrid Methods and their application in CFD

Two-Dimensional Unsteady Flow in a Lid Driven Cavity with Constant Density and Viscosity ME 412 Project 5

Lab 1: Iterative Methods for Solving Linear Systems

Inexact inverse iteration with preconditioning

Parallel Iterative Methods for Sparse Linear Systems. H. Martin Bücker Lehrstuhl für Hochleistungsrechnen

Fast Sparse Spectral Methods for Higher Dimensional PDEs

Transcription:

Multipole-Based Preconditioners for Sparse Linear Systems. Ananth Grama Purdue University. Supported by the National Science Foundation.

Overview Summary of Contributions Generalized Stokes Problem Solenoidal Basis Methods and Preconditioning Multipole Methods as Preconditioners Performance of Multipole-Based Preconditioners Parallelization of Solver/Preconditioner Parallel Performance Concluding Remarks

Summary of Contributions Problem Formulation Excellent Convergence Properties of Multipole- Based Preconditioners Parallelism in Multipole-based Sparse Solvers Highly Scalable and Efficient Parallel Formulations

Generalized Stokes Problem Navier-Stokes equation u t 1 u u p Δu R u 0 in Ω u g on Ω in Ω Incompressibility Condition u 0

Generalized Stokes Problem Linear system B T is the discrete divergence operator and A collects all the velocity terms. Navier-Stokes: Generalized Stokes Problem: 0 p u 0 f B B A T L R C M t A 1 1 L R M t A 1 1

Solenoidal Basis Methods Class of projection based methods Basis for divergence-free velocity Solenoidal basis matrix P: Divergence free velocity: Modified system P 0 B T x u P 0 x u P B B T T 0 p u 0 f B B A T f B AP x p =>

Solenoidal Basis Method Reduced system: B T P T P B 0 APx Bp f ; P T APx P T f Reduced system solved by suitable iterative methods such as CG or GMRES. Velocity obtained by : u Px Pressure obtained by: Bp f APx

Observations Preconditioning y P T u ξ u u Pw u ψ y P T Pw ξ ψ Vorticity-Velocity function formulation: ξ t 1 ξ, R ξ Δψ

Preconditioning Reduced system approximation: P AP Preconditioner: T G 1 Δt 1 Δt M M P Low accuracy preconditioning is sufficient 1 R 1 R P T L s L s P T P

Preconditioning Preconditioning can be affected by an approximate Laplace solve followed by a Helmholtz solve. The Laplace solve can be performed effectively using a Multipole method.

Preconditioning Laplace/Poisson Problems Given a domain with known internal charges and boundary potential (Dirichlet conditions), estimate potential across the domain. Compute boundary potential due to internal charges (single multipole application) Compute residual boundary potential due to charges on the boundary (vector operation). Compute boundary charges corresponding to residual boundary potential (multipole-based GMRES solve). Compute potential through entire domain due to boundary and internal charges (single multipole).

Overview of Multipole Methods Consider the problem of computing the trajectory of particles in space. Multipole methods use hierarchical approximations to reduce computational cost.

Overview of Multipole Methods Accurate formulation requires O(n 2 ) force computations (mat-vec with a coefficient matrix of Green s functions) Hierarchical methods reduce this complexity to O(n) or O(n log n) Since particle distributions can be very irregular, the tree can be highly unbalanced Commonly used hierarchical methods include FMM and the Barnes-Hut methods.

Overview of Multipole Methods Interactions (direct force computations) are replaced by Gaussian quadratures. Far-field interactions are replaced by multipole series representing remote Gauss points. Each matrix-vector product now becomes a single tree traversal taking O(n) or O(n log n) time. This mat-vec can be encapsulated in iterative solvers for dense systems.

Overview of Multipole Methods A number of results relating to the complexity of hierarchical methods [Rokhlin, Greengard], error analysis and control [Grama, Sarin, Sameh], and their use in dense solvers [Grama, Kumar, Sameh] have been shown. Their applications in applications such as inductance extraction [White, Sarin, Grama] have also been demonstrated.

Numerical Experiments 3D driven cavity problem in a cubic domain. Marker-and-cell scheme in which domain is discretized uniformly. Pressure unknowns are defined at node points and velocities at mid points of edges. x, y, and z components of velocities are defined on edges along respective directions.

Sample Problem Sizes Mesh Pressure Velocity Solenoidal Functions 8x8x8 512 1,344 1,176 16x16x16 4,096 11,520 10,800 32x32x32 32,768 95,232 92,256

Preconditioning Performance

Preconditioning Performance

Preconditioning Performance Poisson Problem

Preconditioning Performance Poisson Problem

Preconditioning Performance Poisson Problem

Parallel Formulation - Outer Solve Partition domain across processors Computation and storage of solenoidal basis matrix P T Matrix-vector products: y P APx Local operations on the grid Vector operations Preconditioning Scalable parallel implementations have been developed for all these operations Matrix free approach

Parallel Formulation (Multipole Solve) Each node evaluation can potentially be executed as a thread. Since there may be a large number of nodes, this may lead to too many threads. In general, we build some granularity into the thread by gathering k nodes into a single thread. The Barnes-Hut tree is a read-only data structure. Therefore, good performance can be achieved if we can build spatial and temporal locality (and the working set size does not exceed local memory).

Parallel Formulation(Multipole Solve) Spatially proximate particles are likely to interact with nearly identical parts of the tree. Therefore particles must be traversed in a spatial-proximity preserving order.

Parallel Performance

Parallel Performance

Concluding Remarks Multipole methods provide highly effective preconditioning techniques that yield excellent parallel speedups. The accuracy parameters of the multipole solve (degree, multipole acceptance criteria) significantly impact time and convergence rate. Rely on closed form Green s function, but can be adapted to other scenarios.