Numerical Solution of the Generalised Poisson Equation on Parallel Computers


Diplomanden- und Dissertantenkolloquium, Universität Wien
Hannes Grimm-Strele, Projektgruppe ACORE, Fakultät für Mathematik, Universität Wien
April 15, 2010

Introduction. Subject of this talk

Generalised Poisson Equation (GPE):

$$-\nabla \cdot (\kappa \nabla u) + c\,u = f \quad \text{on } \Omega, \qquad (1)$$
$$\kappa(x) \geq \kappa_0 > 0, \qquad c(x) \geq 0 \quad \forall x \in \Omega. \qquad (2)$$

Main questions considered: the origin and practical relevance of the GPE; the importance of parallel computing; the approach to its numerical solution on parallel computers.

Physical Background. Heat conduction

Assumption: particles move according to Brownian motion.

Fourier's law: the energy flux goes from regions of high temperature to regions of low temperature,

$$J = -\kappa \nabla T, \qquad (3)$$

where $T$ is the temperature, $\kappa > 0$ the heat conductivity of the material, $e$ the energy and $J$ the energy flux.

Conservation of energy, $e_t = -\nabla \cdot J$, together with stationarity, $e_t = 0$, gives

$$\nabla \cdot J = -\nabla \cdot (\kappa \nabla T) = 0. \qquad (4)$$
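
The link to the subject of the talk is left implicit on the slide; the following gloss is my addition. If a heat source density $f$ is present, energy conservation becomes $e_t = -\nabla \cdot J + f$, and the stationary equation reads

$$-\nabla \cdot (\kappa \nabla T) = f,$$

which is exactly the GPE (1) with $u = T$ and $c = 0$.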

Physical Background. Euler Equations 1/2

The Euler Equations govern the dynamics of flow without friction. Mass and momentum equations in two dimensions:

$$\rho_t + \nabla \cdot (\rho \mathbf{u}) = 0, \qquad (5)$$
$$\partial_t \begin{pmatrix} \rho u \\ \rho v \end{pmatrix} + \partial_x \begin{pmatrix} \rho u^2 + p \\ \rho u v \end{pmatrix} + \partial_y \begin{pmatrix} \rho u v \\ \rho v^2 + p \end{pmatrix} = 0, \qquad (6)$$

where $\rho$ is the mass density, $\mathbf{u} = (u, v)$ the velocity field and $p$ the pressure.

Incompressibility assumption: the mass density does not change along a trajectory. The flow is incompressible if and only if, by (5), $\nabla \cdot \mathbf{u} = 0$.
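
A one-line expansion of (5) (my addition) showing why these two statements are equivalent:

$$0 = \rho_t + \nabla \cdot (\rho \mathbf{u}) = \underbrace{\rho_t + \mathbf{u} \cdot \nabla \rho}_{\mathrm{D}\rho/\mathrm{D}t} + \rho\, \nabla \cdot \mathbf{u},$$

so the density is constant along trajectories, $\mathrm{D}\rho/\mathrm{D}t = 0$, exactly when $\nabla \cdot \mathbf{u} = 0$ (since $\rho > 0$).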

Physical Background. Euler Equations 2/2

Set the pressure-free momentum update

$$(\rho \mathbf{u})^* = (\rho \mathbf{u})^n - \Delta t \left[ \partial_x \begin{pmatrix} \rho u^2 \\ \rho u v \end{pmatrix} + \partial_y \begin{pmatrix} \rho u v \\ \rho v^2 \end{pmatrix} \right].$$

Euler forward step:

$$\frac{(\rho \mathbf{u})^{n+1} - (\rho \mathbf{u})^*}{\Delta t} = -\nabla p.$$

Divide by $\rho^{n+1}$:

$$\frac{\mathbf{u}^{n+1} - \tilde{\mathbf{u}}}{\Delta t} = -\frac{\nabla p}{\rho^{n+1}}, \qquad \tilde{\mathbf{u}} = \frac{(\rho \mathbf{u})^*}{\rho^{n+1}}.$$

Take the divergence:

$$\nabla \cdot \mathbf{u}^{n+1} = \nabla \cdot \tilde{\mathbf{u}} - \Delta t\, \nabla \cdot \left( \frac{\nabla p}{\rho^{n+1}} \right).$$

Incompressible flow ($\nabla \cdot \mathbf{u}^{n+1} = 0$):

$$\nabla \cdot \left( \frac{1}{\rho^{n+1}} \nabla p \right) = \frac{\nabla \cdot \tilde{\mathbf{u}}}{\Delta t},$$

i.e. a GPE (1) for the pressure $p$ with $\kappa = 1/\rho^{n+1}$ and $c = 0$.
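
A minimal, self-contained sketch (my own illustration, not code from the talk) of one such projection step on a small collocated grid. Constant density is assumed for brevity, so the pressure equation reduces to the GPE with constant $\kappa = 1/\rho$ and $c = 0$, and homogeneous Dirichlet data for $p$ are used purely for simplicity.

```python
# One projection step of an incompressible Euler solver: solve the pressure
# Poisson equation (a GPE with kappa = 1/rho, c = 0), then correct the velocity.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n, dt, rho = 32, 1e-3, 1.0
h = 1.0 / (n + 1)

# Provisional velocity u~ after the explicit, pressure-free Euler step
# (random data stands in for the convective update here).
rng = np.random.default_rng(0)
u_tilde = rng.standard_normal((2, n, n))

def div(u):
    """Centred divergence; derivative values on the boundary are set to zero."""
    ux = np.zeros((n, n)); uy = np.zeros((n, n))
    ux[:, 1:-1] = (u[0][:, 2:] - u[0][:, :-2]) / (2 * h)
    uy[1:-1, :] = (u[1][2:, :] - u[1][:-2, :]) / (2 * h)
    return ux + uy

def grad(p):
    """Centred gradient; derivative values on the boundary are set to zero."""
    px = np.zeros((n, n)); py = np.zeros((n, n))
    px[:, 1:-1] = (p[:, 2:] - p[:, :-2]) / (2 * h)
    py[1:-1, :] = (p[2:, :] - p[:-2, :]) / (2 * h)
    return np.stack([px, py])

# -div((1/rho) grad p) = -div(u~)/dt: 5-point Laplacian with Dirichlet p = 0.
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n)) / h**2
A = (sp.kronsum(T, T) / rho).tocsc()          # symmetric positive definite
b = -div(u_tilde).ravel() / dt
p = spla.spsolve(A, b).reshape(n, n)

# Velocity correction: u^{n+1} = u~ - dt * grad(p) / rho. On a staggered (MAC)
# grid with matching discrete operators the result would be discretely
# divergence-free; on this simple collocated sketch it is only approximately so.
u_new = u_tilde - dt * grad(p) / rho
print("||div u~|| =", np.linalg.norm(div(u_tilde)),
      " ||div u^{n+1}|| =", np.linalg.norm(div(u_new)))
```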

Parallel Computing. Why parallel computing is important

Simulating 3 s of the solar granulation with ANTARES requires approx. 5000 CPUh and 1.1 TB of memory.

Parallel Computing. Example: Deutsches Klimarechenzentrum

From http://www.dkrz.de/dkrz/science/why, SC1: "The global environment, and the climate in particular, are extremely complex systems whose dynamics and future development can only be captured with extensive investigations and elaborate model computations."

Four reasons why high-performance computers are indispensable: the complexity of the Earth system, model resolution, ensemble computations, integrations over many centuries.

HPC "blizzard": 8448 cores, 158 TFlops, 20 TB memory. For comparison, the VSC: 3488 cores, 35.5 TFlops, 11.2 TB memory.

Parallel Computing. Example: Sauber Motorsport

From http://www.sauber-motorsport.com/index.cfm?pageid=70: "CFD (Computational Fluid Dynamics) plays an important role, in particular in the development of the front and rear wings as well as in engine and brake cooling. Albert3 has been in operation since spring 2008. Its latest expansion stage comprises a total of 4224 processor cores. [...] The main memory grew to 8448 GByte and the peak computing performance to 50.7 TFlops."

Parallel Computing. Why not just wait?

Moore's law: the number of transistors on a commercially available processor doubles every eighteen months. Source: Der Spiegel 11/2005, pp. 174-184.

Parallel Computing. The concept of parallel programming

Decomposition and distribution of data and work, e.g. by domain decomposition (a minimal sketch is given below).
Synchronisation and communication.
Criteria for good parallelisation: load balancing and optimal speedup (Amdahl's law).
Programming models: Message Passing Interface (MPI) and OpenMP.
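
The following is a minimal sketch (my illustration; the talk only names MPI as a programming model) of one-dimensional domain decomposition with ghost-cell exchange using mpi4py, e.g. as a building block of a distributed relaxation sweep.

```python
# Run with e.g.: mpirun -np 4 python halo_sketch.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n_local = 100                          # interior points owned by this rank
u = np.zeros(n_local + 2)              # +2 ghost cells for neighbour data
u[1:-1] = rank                         # dummy local data

left  = rank - 1 if rank > 0 else MPI.PROC_NULL
right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

# Synchronisation and communication: exchange boundary values with the
# neighbouring subdomains.
comm.Sendrecv(sendbuf=u[1:2],   dest=left,  recvbuf=u[-1:], source=right)
comm.Sendrecv(sendbuf=u[-2:-1], dest=right, recvbuf=u[:1],  source=left)

# After the exchange every rank can update its interior points independently,
# e.g. one Jacobi relaxation step for a 1D Laplace problem.
u_new = 0.5 * (u[:-2] + u[2:])
print(f"rank {rank}: ghost cells = {u[0]}, {u[-1]}")
```

Load balancing in this setting simply means giving every rank (roughly) the same number of interior points, so no process waits idly at the communication step.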

Numerical Methods. Finite Element Method (FEM)

Using the FEM, the GPE is transformed into the discrete system $A u_h = b$, where $u_h$ is an approximate solution for $u$.

The approximation error can be controlled by the choice of the finite-dimensional ansatz space. Most of the time: the space of linear splines on $\Omega$.

Figure: basis function for the space of linear splines (2D).

$A$ is always symmetric and positive definite.
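
The slides use linear splines in 2D; the following 1D sketch (my illustration, not code from the thesis) shows the structure of the resulting system $A u_h = b$ for $-(\kappa u')' + c u = f$ on $(0,1)$ with $u(0) = u(1) = 0$, using hat functions and one-point quadrature per element.

```python
# Assemble and solve a 1D linear-FEM discretisation of the GPE.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

kappa = lambda x: 1.0 + x      # kappa(x) >= kappa_0 > 0
c     = lambda x: 1.0          # c(x) >= 0
f     = lambda x: 1.0          # right-hand side

n_el = 64
nodes = np.linspace(0.0, 1.0, n_el + 1)
h = nodes[1] - nodes[0]
A = sp.lil_matrix((n_el + 1, n_el + 1))
b = np.zeros(n_el + 1)

for e in range(n_el):
    xm = 0.5 * (nodes[e] + nodes[e + 1])                  # element midpoint
    K = kappa(xm) / h * np.array([[1.0, -1.0], [-1.0, 1.0]])   # stiffness part
    M = c(xm) * h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])   # reaction (c u) part
    idx = [e, e + 1]
    for a in range(2):
        b[idx[a]] += f(xm) * h / 2.0                      # load vector
        for j in range(2):
            A[idx[a], idx[j]] += K[a, j] + M[a, j]

# Homogeneous Dirichlet conditions: keep interior nodes only.
A = A.tocsr()[1:-1, 1:-1]
b = b[1:-1]

# A is symmetric and positive definite by construction.
print("symmetric:", abs(A - A.T).max() < 1e-12)
print("smallest eigenvalue > 0:", spla.eigsh(A, k=1, which="SA")[0][0] > 0)

u_h = spla.spsolve(A.tocsc(), b)       # nodal values of the FEM solution
```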

Numerical Methods. Preconditioned Conjugate Gradient Algorithm (PCG)

An iterative algorithm for solving linear systems $A u_h = b$.

It converges if $A$ is symmetric and positive definite.

The convergence speed depends on the condition number $\kappa(A)$.

$\kappa(A)$ can be lowered by preconditioning (e.g. an incomplete Cholesky decomposition).
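
A compact PCG sketch (my illustration). For brevity the preconditioner here is plain Jacobi (diagonal scaling) rather than the incomplete Cholesky factorisation mentioned above; any symmetric positive definite preconditioner can be plugged in through `M_inv`.

```python
import numpy as np
import scipy.sparse as sp

def pcg(A, b, M_inv, tol=1e-10, max_iter=1000):
    """Preconditioned conjugate gradients; M_inv applies the preconditioner."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = M_inv(r)
    p = z.copy()
    rz = r @ z
    for k in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            return x, k + 1
        z = M_inv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

# Example: 1D Poisson matrix (SPD), Jacobi preconditioner M = diag(A).
n = 200
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)
diag = A.diagonal()
x, iters = pcg(A, b, M_inv=lambda r: r / diag)
print(iters, np.linalg.norm(A @ x - b))
```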

Numerical Methods. Schur Complement Method 1/2

If the matrix $A$ has the form

$$A = \begin{pmatrix} A_{1,1} & A_{1,2} \\ A_{2,1} & A_{2,2} \end{pmatrix}$$

with submatrices $A_{1,1}$, $A_{1,2}$, $A_{2,1}$ and $A_{2,2}$ such that $A_{1,1}$ is invertible, one can define the Schur complement $S_A$ by

$$S_A = A_{2,2} - A_{2,1} A_{1,1}^{-1} A_{1,2}. \qquad (7)$$

Lemma. If $A$ is symmetric and positive definite, then $S_A$ defined by (7) is also symmetric and positive definite.
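
A one-line argument for the lemma (my addition; the slide states it without proof): symmetry of $S_A$ follows from $A_{2,1} = A_{1,2}^T$, and for $x_2 \neq 0$ the choice $x_1 = -A_{1,1}^{-1} A_{1,2} x_2$ gives

$$0 < \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}^T A \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = x_2^T \left( A_{2,2} - A_{2,1} A_{1,1}^{-1} A_{1,2} \right) x_2 = x_2^T S_A x_2,$$

so $S_A$ is positive definite.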

Numerical Methods. Schur Complement Method 2/2

The system $A u = b$ is equivalent to

$$\begin{pmatrix} I & 0 \\ A_{2,1} A_{1,1}^{-1} & I \end{pmatrix} \begin{pmatrix} A_{1,1} & A_{1,2} \\ 0 & S_A \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}.$$

With the PCG algorithm, these systems can be solved one after the other.

This can be done in parallel if $A_{1,1}$ has block-diagonal structure (which can be achieved by reordering and renumbering the nodes):

$$A_{1,1} = \begin{pmatrix} D_1 & & 0 \\ & \ddots & \\ 0 & & D_p \end{pmatrix}.$$

Then $A_{1,2} = (B_1, \ldots, B_p)^T$, $A_{2,1} = (C_1, \ldots, C_p)$, $A_{2,2} = \sum_i E_i$, and

$$S_A = \sum_{i=1}^p \left( E_i - C_i D_i^{-1} B_i \right).$$
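
A small dense-matrix sketch (my illustration) of this block elimination; in a parallel code $A_{1,1}$ would be block diagonal, the partial complements $E_i - C_i D_i^{-1} B_i$ would be formed on separate processes, and the two solves below would use PCG instead of a direct solver.

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n2 = 6, 3
M = rng.standard_normal((n1 + n2, n1 + n2))
A = M @ M.T + (n1 + n2) * np.eye(n1 + n2)   # symmetric positive definite test matrix
b = rng.standard_normal(n1 + n2)

A11, A12 = A[:n1, :n1], A[:n1, n1:]
A21, A22 = A[n1:, :n1], A[n1:, n1:]
b1, b2 = b[:n1], b[n1:]

# Schur complement S_A = A22 - A21 A11^{-1} A12, cf. (7); it is SPD.
S = A22 - A21 @ np.linalg.solve(A11, A12)

# Solve for u2 first, then back-substitute for u1.
u2 = np.linalg.solve(S, b2 - A21 @ np.linalg.solve(A11, b1))
u1 = np.linalg.solve(A11, b1 - A12 @ u2)

u = np.concatenate([u1, u2])
print("residual:", np.linalg.norm(A @ u - b))
```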