A Fast, Parallel Potential Flow Solver

Similar documents
Implicit Solution of Viscous Aerodynamic Flows using the Discontinuous Galerkin Method

An Efficient Low Memory Implicit DG Algorithm for Time Dependent Problems

Fast Multipole Methods for Incompressible Flow Simulation

Multipole-Based Preconditioners for Sparse Linear Systems.

FEniCS Course. Lecture 6: Incompressible Navier Stokes. Contributors Anders Logg André Massing

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 HIGH FREQUENCY ACOUSTIC SIMULATIONS VIA FMM ACCELERATED BEM

Length Learning Objectives Learning Objectives Assessment

arxiv: v2 [math.na] 1 Oct 2016

Solving PDEs: the Poisson problem TMA4280 Introduction to Supercomputing

Chapter 2. General concepts. 2.1 The Navier-Stokes equations

Master Thesis Literature Review: Efficiency Improvement for Panel Codes. TU Delft MARIN

Lecture 11: CMSC 878R/AMSC698R. Iterative Methods An introduction. Outline. Inverse, LU decomposition, Cholesky, SVD, etc.

Jacobi-Davidson Eigensolver in Cusolver Library. Lung-Sheng Chien, NVIDIA

ρ Du i Dt = p x i together with the continuity equation = 0, x i

Lecture 8: Fast Linear Solvers (Part 7)

Solving Large Nonlinear Sparse Systems

An adaptive fast multipole boundary element method for the Helmholtz equation

fluid mechanics as a prominent discipline of application for numerical

Parallel programming practices for the solution of Sparse Linear Systems (motivated by computational physics and graphics)

A Fast Solver for the Stokes Equations. Tejaswin Parthasarathy CS 598APK, Fall 2017

SPARSE SOLVERS POISSON EQUATION. Margreet Nool. November 9, 2015 FOR THE. CWI, Multiscale Dynamics

Large scale simulations of gravity induced sedimentation of slender fibers

Leveraging Task-Parallelism in Energy-Efficient ILU Preconditioners

Dual Reciprocity Boundary Element Method for Magma Ocean Simulations

Fast Multipole Methods: Fundamentals & Applications. Ramani Duraiswami Nail A. Gumerov

Linear Solvers. Andrew Hazel

Boundary Value Problems - Solving 3-D Finite-Difference problems Jacob White

arxiv: v1 [math.na] 25 Jan 2017

Efficient calculation for evaluating vast amounts of quadrupole sources in BEM using fast multipole method

On the use of multipole methods for domain integration in the BEM

Fast Multipole BEM for Structural Acoustics Simulation

The hybridized DG methods for WS1, WS2, and CS2 test cases

Fast Structured Spectral Methods

Parallelization of Molecular Dynamics (with focus on Gromacs) SeSE 2014 p.1/29

1 POTENTIAL FLOW THEORY Formulation of the seakeeping problem

Towards a highly-parallel PDE-Solver using Adaptive Sparse Grids on Compute Clusters

Fast Methods for Integral Equations. Jacob White. Thanks to Deepak Ramaswamy, Michal Rewienski, and Karen Veroy

Dual Reciprocity Method for studying thermal flows related to Magma Oceans

An Introduction to the Discontinuous Galerkin Method

Accelerating incompressible fluid flow simulations on hybrid CPU/GPU systems

AN INDEPENDENT LOOPS SEARCH ALGORITHM FOR SOLVING INDUCTIVE PEEC LARGE PROBLEMS

PiCAP: A Parallel and Incremental Capacitance Extraction Considering Stochastic Process Variation

4. The Green Kubo Relations

Laplace s Equation FEM Methods. Jacob White. Thanks to Deepak Ramaswamy, Michal Rewienski, and Karen Veroy, Jaime Peraire and Tony Patera

Partial Differential Equations

APPLICATION OF THE SPARSE CARDINAL SINE DECOMPOSITION TO 3D STOKES FLOWS

A Robust Preconditioned Iterative Method for the Navier-Stokes Equations with High Reynolds Numbers

Compressible Potential Flow: The Full Potential Equation. Copyright 2009 Narayanan Komerath

Dense Arithmetic over Finite Fields with CUMODP

Towards Reduced Order Modeling (ROM) for Gust Simulations

FLUID MECHANICS. Atmosphere, Ocean. Aerodynamics. Energy conversion. Transport of heat/other. Numerous industrial processes

Chapter 6: Incompressible Inviscid Flow

Efficient Augmented Lagrangian-type Preconditioning for the Oseen Problem using Grad-Div Stabilization

nek5000 massively parallel spectral element simulations

Sensitivity analysis by adjoint Automatic Differentiation and Application

Sakurai-Sugiura algorithm based eigenvalue solver for Siesta. Georg Huhs

Partial Differential Equations

Parallel Adaptive Fast Multipole Method: application, design, and more...

An alternative approach to integral equation method based on Treftz solution for inviscid incompressible flow

Performance Analysis of Parallel Alternating Directions Algorithm for Time Dependent Problems

Green s Functions, Boundary Integral Equations and Rotational Symmetry

CFD as a Tool for Thermal Comfort Assessment

Investigation potential flow about swept back wing using panel method

Solution to Laplace Equation using Preconditioned Conjugate Gradient Method with Compressed Row Storage using MPI

2.29 Numerical Fluid Mechanics Fall 2011 Lecture 7

A Linear Multigrid Preconditioner for the solution of the Navier-Stokes Equations using a Discontinuous Galerkin Discretization. Laslo Tibor Diosady

Parallel Asynchronous Hybrid Krylov Methods for Minimization of Energy Consumption. Langshi CHEN 1,2,3 Supervised by Serge PETITON 2

A sparse multifrontal solver using hierarchically semi-separable frontal matrices

There are no simple turbulent flows

Sparse BLAS-3 Reduction

High Order Discontinuous Galerkin Methods for Aerodynamics

Partial Differential Equations. Examples of PDEs

Transient flow and heat equations - the Rayleigh-Benard instability

AS IC designs are approaching processes below 45 nm,

Yanlin Shao 1 Odd M. Faltinsen 2

Parallelization Strategies for Density Matrix Renormalization Group algorithms on Shared-Memory Systems

12.1 Viscous potential flow (VPF)

AMS 529: Finite Element Methods: Fundamentals, Applications, and New Trends

On the choice of abstract projection vectors for second level preconditioners

The Blue Gene/P at Jülich Case Study & Optimization. W.Frings, Forschungszentrum Jülich,

Using AD to derive adjoints for large, legacy CFD solvers

Review: From problem to parallel algorithm

Course Requirements. Course Mechanics. Projects & Exams. Homework. Week 1. Introduction. Fast Multipole Methods: Fundamentals & Applications

IMPLEMENTATION OF A PARALLEL AMG SOLVER

Fast Multipole Methods for The Laplace Equation. Outline

Open boundary conditions in numerical simulations of unsteady incompressible flow

Solving PDEs with freefem++

Basic Aspects of Discretization

FLUID MECHANICS. ! Atmosphere, Ocean. ! Aerodynamics. ! Energy conversion. ! Transport of heat/other. ! Numerous industrial processes

Implementation of Implicit Solution Techniques for Non-equilibrium Hypersonic Flows

Time-dependent variational forms

An introduction to PDE-constrained optimization

FOR many computational fluid dynamics (CFD) applications, engineers require information about multiple solution

Contents. Preface... xi. Introduction...

Fast solvers for steady incompressible flow

A Study on Numerical Solution to the Incompressible Navier-Stokes Equation

PDE Solvers for Fluid Flow

Numerical methods for the Navier- Stokes equations

A COUPLED-ADJOINT METHOD FOR HIGH-FIDELITY AERO-STRUCTURAL OPTIMIZATION

Robust Preconditioned Conjugate Gradient for the GPU and Parallel Implementations

Transcription:

Advisor: Jaime Peraire December 16, 2012

Outline 1 Introduction to Potential FLow 2 The Boundary Element Method 3 The Fast Multipole Method 4 Discretization 5 Implementation 6 Results 7 Conclusions

Why Potential Flow? It s easy and potentially fast Potential Flow: 2 φ = 0 vs: ( ) V Navier-Stokes: ρ t + V V = p + T + f Linear system of equations Full-blown fluid simulation (Navier-Stokes) is expensive Many times, we are just interested in time-averaged forces, moments, and pressure distribution.

Examples Start movie 1 1 P.O. Persson Discontinuous Galerkin CFD. Runtime time: > 1 week. 100s of CPUs Figure 1: Potential Flow Solution. Runtime time: 2 minutes on 4 CPUs

Potential Flow: Can be Fast, But... Cannot model everything (highly turbulent flow, etc). Accuracy issues due linearisation assumptions... Potential Flow Assumptions Flow is incompressible Viscosity is neglected (can be a major cause of drag) Flow is irrotational ( V = 0) But, it turns out to predict aerodynamic flows pretty well for many cases (examples: Flows about ships and aircraft)

Potential Flow: Governing Equation Governed by Laplace s equation 2 φ = 0 Potential in domain written as: φ( r) = φ s + φ d + V r Enforce that there is no flow in surface-normal direction... Force perturbation potential to vanish just inside the body: φ( r) = φ s + φ d = 0 Basically forces the aerodynamic body to be a streamsurface

Potential Flow: Discretization Can be discretized using the Boundary Element Method (BEM) BEM summary 1 Divide boundary into N elements 2 Analytically integrate Green s function over each of the N elements 3 Compute the potential due to singularity density at each element on all other elements 4 Solve for the surface singularity strengths The BEM requires that either a Neumann or Dirichlet boundary condition be applied wherever we want a solution.

Boundary Element Method: Green s Function There are several Green s functions that satisfy Laplace s equation: Singe-Layer potential: G s (σ j, r i r j ) = 1 4π Double-Layer potential: G d (µ j, r i r j ) = 1 4π σ j r i r j ˆn j φ( r i ) = S j (G d (µ j, r i r j ) + G s (σ j, r i r j )) = 0 µ j r i r j These Green s functions can be analytically integrated to arbitrary precision over planar surfaces Analytic integral can be very expensive...

Boundary Element Method: Collocation vs. Galerkin Collocation: Enforce boundary condition at N explicit points. Galerkin: Enforce Boundary condition in an integrated sense over the surface 1 Write unknown singularity distribution µ as a linear combination of N basis functions a i 2 Substitute into governing equations, and write a residual vector R 3 Multiply by test residual by test function. 4 Choose test function to be basis function residual will be orthogonal to the set of basis functions. R i = S i a i φ i ( r i )ds i = [ NE ( S i a 1 1 i j=1 4π S j µ j ˆn Sj r i r j ds j + 1 1 4π S j σ j r i r j j) ] ds dsi = 0 (1)

Boundary Element Method: Computational Considerations Produces a system that is dense, and may be very large For example, the aircraft shown earlier would have resulted in a 180000x180000 dense matrix Would require 259 GB of memory just to store the system of equations! So parallelizing the matrix assembly routine won t help (yet) This would be a deal-breaker for large problems, but there is a solution...

Hybrid Fast Multipole Method (FMM)/ Boundary Element Method (BEM) What is FMM? A method to compute a fast matrix-vector product (MVP) Allows MVP to be done in O(p 4 N) operations by sacrificing accuracy, where p is the multipole expansion order. We would think that a MVP for a dense matrix scales as O(N 2 ) Theoretically highly parallelizable More on this later... FMM can be applied to the BEM The FMM is easily applied to the Green s function of Laplace s Equation Can think of elements as being composed of many source particles Maintains same embarrassing parallelism as canonical FMM

FMM Step 1: Octree decomposition Create root box encompassing entire surface Recursively divide box until there are no more than N max elements in a box. Easily Parallizable Figure 2: All level 8 boxes in octree about an aircraft

FMM Steps 2 and 3: Upward and Downward Pass Basic Idea: Separate near-field and far-field interactions M L Downard Pass L L M M Upward Pass

Parallelization Four Options: 1 Distributed Memory (MPI) 2 Shared Memory (OpenMP) 3 GPU 4 Julia Originally, I wrote the FMM code in MATLAB, but was VERY slow Switched to C++, code sped up 4 orders of magnitude Now, runtimes are at most several minutes Weary of scripting due to MATLAB implementation... MPI would be overkill Ended up computing matrix-vector product in C++ using OpenMP System solved in MATLAB using gmres

Implementation First, had to get serial code to work (6,500 lines of code) Once serial code available, easy to parallize with OpenMP Simply add preprocessor directives and specify # of cores Example: 1 // M u l t i p o l e to l o c a l 2 f o r ( i n t l e v = 2 ; l e v < N l e v e l ; l e v ++){ 3 #pragma omp p a r a l l e l f o r 4 f o r ( i n t i =0; i <l e v e l i n d e x [ l e v ]. i d x. s i z e ( ) ; i ++){ 5 i n t bx = l e v e l i n d e x [ l e v ]. i d x [ i ] ; 6 i f ( boxes [ bx ]. i s e v a l b o x ) { 7 f o r ( i n t j =0; j <boxes [ bx ]. i l i s t. s i z e ( ) ; j ++){ 8 i f ( boxes [ boxes [ bx ]. i l i s t [ j ] ]. i s s o u r c e b o x ) { 9 m u l t i p o l e t o l o c a l ( boxes [ boxes [ bx ]. i l i s t [ j ] ], boxes [ bx ], Bnm, Anm, i l i s t c o n s t s [ bx ] [ j ], f i r s t t i m e, c o m p u t e s i n g l e, compute double ) ; 10 } 11 } 12 } 13 } 14 }

Solving the System Ax=b solved with GMRES Matrix is reasonably well-conditioned, but can we do better? But we never compute the A matrix, so how do we create a preconditioner? Assemble sparse matrix containing only near-field interactions. Then perform ILU on the near-field influence matrix to create preconditioner Figure 3: Near-field influence Matrix

Test Case Falcon Buisness Jet 5234 Elements, 2619 Nodes Linear Basis Functions Requires > 5 minutes to compute solution without FMM Figure 4: Falcon business jet

Results Table 1: Speedup compared to 1CPU p 1 CPU (s) 2 CPUs 3 CPUs 4 CPUs 1 5.2414 1.24 1.36 1.37 2 8.6618 1.39 1.59 1.70 3 23.8976 1.65 2.04 2.34 4 52.4548 1.73 2.26 2.59 5 105.9322 1.76 2.38 2.79

Conclusions 1 C++ is so much faster than MATLAB for BEMs 2 Only really makes sense to use shared memory parallelism (like OpenMP) for this application 3 Speedups of 2.8X possible on 4 CPUs in some cases 4 Implementation can be improved: this was my first attempt at parallel programming

Thank you for your time! Questions/Comments?

Table 2: Percent Speedup p 1 CPU 2 CPUs 3 CPUs 4 CPUs 1 5.2414 4.2134 3.8450 3.8305 2 8.6618 6.2183 5.4455 5.1049 3 23.8976 14.5198 11.7185 10.2322 4 52.4548 30.3419 23.2016 20.2512 5 105.9322 60.3612 44.4813 37.9687