Accelerating incompressible fluid flow simulations on hybrid CPU/GPU systems

Yushan Wang (1), Marc Baboulin (1,2), Karl Rupp (3,4), Yann Fraigneau (1,5), Olivier Le Maître (1,5)

(1) Université Paris-Sud, France
(2) INRIA, France
(3) Argonne National Laboratory, USA
(4) Vienna University of Technology, Austria
(5) LIMSI-CNRS, France

26 November 2013

Laboratoire de Recherche en Informatique (LRI)

- The main research areas addressed by ParSys include high-performance computing, distributed algorithms, compilation and code optimization.
- Founded more than 30 years ago, LRI now has over 260 members, including 105 faculty and staff and 90 Ph.D. students.

Outline

- Introduction to Navier-Stokes equations
- Helmholtz-like equations
- Poisson equation
- Hybrid model and performance on a multicore + GPU architecture
- Conclusion and future work

Navier-Stokes equations

- The incompressible Navier-Stokes (NS) equations are the foundation of many CFD problems.
- "A million dollars in cash awaits anyone who can develop a rigorous mathematical model for how fluids flow." -- Clay Mathematics Institute

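Incompressible Navier-Stokes equations

$$\frac{\partial \mathbf{V}}{\partial t} + \nabla \cdot (\mathbf{V}\mathbf{V}^T) = -\nabla P + \frac{1}{Re}\,\Delta \mathbf{V}, \qquad \nabla \cdot \mathbf{V} = 0$$

- Density does not appear because the problem is assumed to have constant coefficients.
- The Reynolds number $Re$ characterizes the flow regime. A larger $Re$ demands a finer mesh.
- $\nabla \cdot (\mathbf{V}\mathbf{V}^T)$ is the non-linear convection term. For incompressible flow it can be simplified to $(\mathbf{V} \cdot \nabla)\mathbf{V}$.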
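The simplification of the convection term follows from the product rule. The short derivation below is a sketch in index notation (summation over the repeated index $j$ is assumed), not taken from the slides:

```latex
\begin{aligned}
\left[\nabla \cdot (\mathbf{V}\mathbf{V}^T)\right]_i
  &= \partial_j (V_i V_j)
   = V_j\,\partial_j V_i + V_i\,\partial_j V_j \\
  &= \left[(\mathbf{V}\cdot\nabla)\mathbf{V}\right]_i + V_i\,(\nabla\cdot\mathbf{V})
   = \left[(\mathbf{V}\cdot\nabla)\mathbf{V}\right]_i
   \quad \text{since } \nabla\cdot\mathbf{V} = 0 .
\end{aligned}
```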

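Solving NS equations with a prediction-projection method

Each time step advances $(\mathbf{V}^n, P^n) \rightarrow \mathbf{V}^* \rightarrow \varphi \rightarrow (\mathbf{V}^{n+1}, P^{n+1})$.

- Hodge-Helmholtz decomposition: $\mathbf{V}^* = \mathbf{V}_{\mathrm{div}=0} + \nabla\varphi$
- Helmholtz-like equation (prediction), written for the time increments $\mathbf{V}^* - \mathbf{V}^n$:
  $$\left(I - \frac{2\Delta t}{3Re}\,\Delta\right)(\mathbf{V}^* - \mathbf{V}^n) = S$$
- Poisson equation (projection):
  $$\Delta\varphi = \frac{3}{2\Delta t}\,\nabla \cdot \mathbf{V}^*$$
- Update:
  $$\mathbf{V}^{n+1} = \mathbf{V}^* - \frac{2\Delta t}{3}\,\nabla\varphi, \qquad P^{n+1} = P^n + \varphi - \frac{1}{Re}\,\nabla \cdot \mathbf{V}^*$$

Y. Wang, M. Baboulin, J. Dongarra, J. Falcou, Y. Fraigneau, O. Le Maître. A Parallel Solver for Incompressible Fluid Flows. ICCS 2013.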
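The Poisson equation for $\varphi$ is what makes the updated velocity divergence-free. As a sketch of that reasoning (a reconstruction from the update formulas on this slide, not a quotation from it), take the divergence of the velocity update and impose $\nabla\cdot\mathbf{V}^{n+1} = 0$:

```latex
\begin{aligned}
\nabla\cdot\mathbf{V}^{n+1}
  &= \nabla\cdot\mathbf{V}^* - \frac{2\Delta t}{3}\,\nabla\cdot\nabla\varphi
   = \nabla\cdot\mathbf{V}^* - \frac{2\Delta t}{3}\,\Delta\varphi = 0 \\
  &\Longrightarrow\quad \Delta\varphi = \frac{3}{2\Delta t}\,\nabla\cdot\mathbf{V}^* .
\end{aligned}
```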

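Solving Helmholtz-like equation

System of 3 Helmholtz-like equations, one per velocity component:

$$\left(I - \frac{2\Delta t}{3Re}\,\Delta\right)(V^*_i - V^n_i) = S_i, \qquad i \in \{x, y, z\}$$

Alternating Direction Implicit (ADI) method:

$$(I - \alpha\Delta) = (I - \alpha\Delta_x)(I - \alpha\Delta_y)(I - \alpha\Delta_z) + O(\alpha^2)$$

This gives three successive one-dimensional solves:

$$\begin{cases}
\left(I - \dfrac{2\Delta t}{3Re}\,\Delta_x\right) T' = S_i \\
\left(I - \dfrac{2\Delta t}{3Re}\,\Delta_y\right) T'' = T' \\
\left(I - \dfrac{2\Delta t}{3Re}\,\Delta_z\right)(V^*_i - V^n_i) = T''
\end{cases}$$

- Each solve is a set of tridiagonal systems, handled with the Thomas algorithm.
- Between directions, the data is reordered (matrix transpose).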
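The $O(\alpha^2)$ splitting error of the ADI factorization can be checked numerically on a small grid. The sketch below assumes a standard second-order finite-difference Laplacian with homogeneous Dirichlet boundaries on a uniform grid; the function names `second_difference` and `adi_splitting_error` are illustrative and do not come from the slides or from SUNFLUIDH.

```python
import numpy as np


def second_difference(n, h):
    """1D second-difference matrix (assumed discretization, Dirichlet boundaries)."""
    main = -2.0 * np.ones(n)
    off = np.ones(n - 1)
    return (np.diag(main) + np.diag(off, 1) + np.diag(off, -1)) / h**2


def adi_splitting_error(n, alpha):
    """2-norm of (I - a*Dx)(I - a*Dy)(I - a*Dz) - (I - a*(Dx + Dy + Dz)) on an n^3 grid."""
    h = 1.0 / (n + 1)
    d1 = second_difference(n, h)
    eye = np.eye(n)
    # Directional operators built as Kronecker products of the 1D operator.
    dx = np.kron(np.kron(d1, eye), eye)
    dy = np.kron(np.kron(eye, d1), eye)
    dz = np.kron(np.kron(eye, eye), d1)
    big_eye = np.eye(n**3)
    exact = big_eye - alpha * (dx + dy + dz)
    factored = (big_eye - alpha * dx) @ (big_eye - alpha * dy) @ (big_eye - alpha * dz)
    return np.linalg.norm(factored - exact, 2)


# Halving alpha should divide the splitting error by roughly 4 (second order in alpha).
e1 = adi_splitting_error(4, 1e-4)
e2 = adi_splitting_error(4, 5e-5)
ratio = e1 / e2
print(ratio)
```

The dense Kronecker products are only practical for tiny grids; they are used here to make the factorization explicit, not as a way to solve the system.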

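Solving Helmholtz-like equation

$Bx = f$, where $B$ is a block tridiagonal matrix. In certain cases, the problem can be treated as a smaller tridiagonal system $B = [a_i, b_i, c_i]_{i=1,\dots,m}$ with multiple right-hand sides.

Two methods are available:

- Thomas algorithm: Gaussian elimination without pivoting.
- Explicit inverse of $B$: $x = B^{-1} f$, with

$$B^{-1}_{ij} = \begin{cases}
(-1)^{i+j}\, c_i c_{i+1} \cdots c_{j-1}\, \theta_{i-1}\varphi_{j+1} / \theta_m, & i < j, \\
\theta_{i-1}\varphi_{i+1} / \theta_m, & i = j, \\
(-1)^{i+j}\, a_{j+1} a_{j+2} \cdots a_i\, \theta_{j-1}\varphi_{i+1} / \theta_m, & i > j,
\end{cases}$$

where

$$\theta_0 = 1, \quad \theta_1 = b_1, \quad \theta_i = b_i\theta_{i-1} - c_{i-1}a_i\theta_{i-2}, \quad i = 2, \dots, m,$$

$$\varphi_{m+1} = 1, \quad \varphi_m = b_m, \quad \varphi_i = b_i\varphi_{i+1} - c_i a_{i+1}\varphi_{i+2}, \quad i = m-1, \dots, 1.$$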
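Both methods can be sketched in a few lines. The code below is a minimal illustration, assuming 0-based arrays in which `a[i]` is the subdiagonal entry of row `i` (`a[0]` unused) and `c[i]` the superdiagonal entry of row `i` (`c[-1]` unused); the function names and the test matrix are hypothetical, and the recurrences for `theta` and `phi` follow the formulas on this slide. No claim is made that this matches the authors' implementation.

```python
import numpy as np


def thomas_solve(a, b, c, f):
    """Tridiagonal solve by Gaussian elimination without pivoting.

    f may hold several right-hand sides as columns.
    """
    m = len(b)
    cp = np.zeros(m)
    x = np.array(f, dtype=float)  # working copy of the right-hand side(s)
    cp[0] = c[0] / b[0]
    x[0] = x[0] / b[0]
    for i in range(1, m):  # forward elimination
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        x[i] = (x[i] - a[i] * x[i - 1]) / denom
    for i in range(m - 2, -1, -1):  # back substitution
        x[i] = x[i] - cp[i] * x[i + 1]
    return x


def tridiagonal_inverse(a, b, c):
    """Explicit inverse from the theta/phi recurrences (1-based indices i, j)."""
    a, b, c = (np.asarray(v, dtype=float) for v in (a, b, c))
    m = len(b)
    theta = np.zeros(m + 1)
    theta[0], theta[1] = 1.0, b[0]
    for i in range(2, m + 1):
        theta[i] = b[i - 1] * theta[i - 1] - c[i - 2] * a[i - 1] * theta[i - 2]
    phi = np.zeros(m + 2)
    phi[m + 1], phi[m] = 1.0, b[m - 1]
    for i in range(m - 1, 0, -1):
        phi[i] = b[i - 1] * phi[i + 1] - c[i - 1] * a[i] * phi[i + 2]
    inv = np.zeros((m, m))
    for i in range(1, m + 1):
        for j in range(1, m + 1):
            if i == j:
                val = theta[i - 1] * phi[i + 1]
            elif i < j:  # product c_i ... c_{j-1}
                val = (-1) ** (i + j) * np.prod(c[i - 1:j - 1]) * theta[i - 1] * phi[j + 1]
            else:  # product a_{j+1} ... a_i
                val = (-1) ** (i + j) * np.prod(a[j:i]) * theta[j - 1] * phi[i + 1]
            inv[i - 1, j - 1] = val / theta[m]
    return inv


# Illustrative system of the form I - r * (second difference), two right-hand sides.
m, r = 6, 0.3
a = np.full(m, -r)
b = np.full(m, 1.0 + 2.0 * r)
c = np.full(m, -r)
B = np.diag(b) + np.diag(a[1:], -1) + np.diag(c[:-1], 1)
rhs = np.arange(1.0, 2 * m + 1).reshape(m, 2)
x_thomas = thomas_solve(a, b, c, rhs)
x_inverse = tridiagonal_inverse(a, b, c) @ rhs
```

Once the inverse is formed, every additional right-hand side costs one matrix product, which is the kind of operation that maps well onto a GPU. For large `m` the `theta` and `phi` sequences can grow or shrink quickly, so a production code would likely need scaling; that is outside this sketch.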

Solving Helmholtz-like equation

Test problem:

$$\begin{cases}
u(x) - \alpha\Delta u(x) = f(x), & x \in \Omega = (0,1)^3, \\
u(x) = 0, & x \in \partial\Omega,
\end{cases}$$

with $f(x) = (1 + 3\alpha\pi^2)\,u(x)$, $\alpha = 10^{-7}$, and exact solution $u(x) = \sin(\pi x_1)\sin(\pi x_2)\sin(\pi x_3)$.

- Using the explicit inverse of $B$ is about 4 times faster than the Thomas algorithm, with the same accuracy.
- However, the explicit inverse has limited applicability: it applies only to problems with no immersed body, with the same boundary conditions, etc.
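The right-hand side $f$ is chosen so that the stated $u$ is the exact solution. A short check, written out here as a worked step rather than taken from the slide:

```latex
\begin{aligned}
\partial^2_{x_k} u &= -\pi^2 u \quad (k = 1, 2, 3)
  \;\Longrightarrow\; \Delta u = -3\pi^2 u, \\
u - \alpha\Delta u &= (1 + 3\alpha\pi^2)\,u = f, \\
u|_{\partial\Omega} &= 0 \quad \text{because } \sin(0) = \sin(\pi) = 0 .
\end{aligned}
```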

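Solving Poisson equation

$$\Delta\varphi = \frac{3}{2\Delta t}\,\nabla \cdot \mathbf{V}^* = S$$

Partial diagonalization, with $\Delta = \Delta_x + \Delta_y + \Delta_z$:

$$\Delta_x = Q_x \Lambda_x Q_x^{-1}, \qquad \Delta_y = Q_y \Lambda_y Q_y^{-1},$$

$$S' = Q_x^{-1} Q_y^{-1} S, \qquad \varphi' = Q_x^{-1} Q_y^{-1} \varphi,$$

$$(\Lambda_x + \Lambda_y + \Delta_z)\,\varphi' = S'.$$

- The transformed problem is a set of tridiagonal systems, solved with the Thomas algorithm.
- Data is reordered between steps (matrix transpose).
- The most time-consuming part is the matrix-matrix multiplication, which is accelerated on the GPU.
- MAGMA: Matrix Algebra on GPU and Multicore Architectures.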
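The partial diagonalization can be sketched on a small grid as follows. This is an illustration under stated assumptions: a second-order finite-difference Laplacian with homogeneous Dirichlet boundaries (so the 1D operators are symmetric and $Q^{-1} = Q^T$), and a dense `np.linalg.solve` standing in for the Thomas algorithm in the $z$ direction. The names `second_difference` and `solve_poisson_partial_diag` are hypothetical, and nothing here uses MAGMA or a GPU.

```python
import numpy as np


def second_difference(n, h):
    """1D second-difference matrix (assumed discretization, Dirichlet boundaries)."""
    main = -2.0 * np.ones(n)
    off = np.ones(n - 1)
    return (np.diag(main) + np.diag(off, 1) + np.diag(off, -1)) / h**2


def solve_poisson_partial_diag(S, h):
    """Solve (Dx + Dy + Dz) phi = S by diagonalizing Dx and Dy."""
    nx, ny, nz = S.shape
    lam_x, Qx = np.linalg.eigh(second_difference(nx, h))
    lam_y, Qy = np.linalg.eigh(second_difference(ny, h))
    Dz = second_difference(nz, h)
    # S' = Qx^{-1} Qy^{-1} S; the eigenvector matrices are orthogonal here.
    S_hat = np.einsum('ai,bj,abk->ijk', Qx, Qy, S)
    phi_hat = np.empty_like(S_hat)
    for i in range(nx):
        for j in range(ny):
            # One tridiagonal system per (i, j) pair; dense solve used for brevity.
            A = Dz + (lam_x[i] + lam_y[j]) * np.eye(nz)
            phi_hat[i, j, :] = np.linalg.solve(A, S_hat[i, j, :])
    # Back-transform: phi = Qx Qy phi'.
    return np.einsum('ia,jb,abk->ijk', Qx, Qy, phi_hat)


# Manufactured discrete problem: pick phi, build S = Laplacian(phi), then recover phi.
h = 0.1
rng = np.random.default_rng(0)
phi_true = rng.standard_normal((5, 6, 7))
Dx, Dy, Dz = (second_difference(n, h) for n in phi_true.shape)
S = (np.einsum('ia,ajk->ijk', Dx, phi_true)
     + np.einsum('jb,ibk->ijk', Dy, phi_true)
     + np.einsum('kc,ijc->ijk', Dz, phi_true))
phi_num = solve_poisson_partial_diag(S, h)
```

The two `einsum` transforms are the matrix-matrix multiplications that the slide identifies as the dominant cost.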

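CPU vs. GPU on Helmholtz and Poisson problems

[Table: timings for the Helmholtz solver (with $B^{-1}$) and the Poisson solver. The problem size and the individual timing values are missing from this transcription.]

- Steps timed: transfer CPU to GPU; matrix multiplication; solution reordering; tridiagonal system solve; total GPU solver; total CPU solver (12 MPI processes).
- Acceleration: x4.3 (Helmholtz), x4.2 (Poisson).
- Hardware: 2 Intel Xeon E5645 (12 cores in total) and 2 NVIDIA Tesla C2075.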

SUNFLUIDH

- Navier-Stokes solver developed at LIMSI (Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur).
- 3D simulation of unsteady incompressible flow or low Mach number flow:
  - Forced convection flow
  - Thermal convection flow
  - Multispecies flow
  - Reactive flow
- The base framework of our current work.
- More information on

Hybrid model of our NS solver

- The domain is divided equally into subdomains.
- One subdomain corresponds to one MPI process.
- Each process is associated with one GPU for acceleration.
- Multi-threading (OpenMP) is applied within each subdomain.

[Diagram: MPI processes across subdomains; within each, OpenMP threads on a multicore processor + GPU.]

Performance results

- 2 MPI processes, each running on a hexa-core processor.
- Up to 6 threads per process (12 threads in total).
- About 50% of the computational work is done by the GPU.
- Multithreading is not yet fully developed.

[Plot not reproduced; the problem size value is missing from this transcription.]

Conclusion and future work

- A hybrid multicore + GPU Navier-Stokes solver, which includes the solution of the Helmholtz-like and Poisson equations.
- Significant acceleration by taking advantage of GPU devices.
- More computational work to be transferred to the GPU:
  - Construction of the tridiagonal systems.
  - Computation of convection flux, diffusion flux, etc.
- Multi-threading implementation to be improved.
- Using a PETSc iterative solver when a direct solver is not available.
- Larger scale simulations.


A parallel solver for incompressible fluid flows

A parallel solver for incompressible fluid flows Available online at www.sciencedirect.com Procedia Computer Science 00 (2013) 000 000 International Conference on Computational Science, ICCS 2013 A parallel solver for incompressible fluid flows Yushan

More information

Accelerating linear algebra computations with hybrid GPU-multicore systems.

Accelerating linear algebra computations with hybrid GPU-multicore systems. Accelerating linear algebra computations with hybrid GPU-multicore systems. Marc Baboulin INRIA/Université Paris-Sud joint work with Jack Dongarra (University of Tennessee and Oak Ridge National Laboratory)

More information

Computing least squares condition numbers on hybrid multicore/gpu systems

Computing least squares condition numbers on hybrid multicore/gpu systems Computing least squares condition numbers on hybrid multicore/gpu systems M. Baboulin and J. Dongarra and R. Lacroix Abstract This paper presents an efficient computation for least squares conditioning

More information

The Poisson equa-on in projec-on methods for incompressible flows. Bérengère Podvin, Yann Fraigneau LIMSI- CNRS

The Poisson equa-on in projec-on methods for incompressible flows. Bérengère Podvin, Yann Fraigneau LIMSI- CNRS The Poisson equa-on in projec-on methods for incompressible flows Bérengère Podvin, Yann Fraigneau LIMSI- CNRS 1 Overview Origin of the Poisson equa-on for pressure Resolu-on methods for the Poisson equa-on

More information

SPARSE SOLVERS POISSON EQUATION. Margreet Nool. November 9, 2015 FOR THE. CWI, Multiscale Dynamics

SPARSE SOLVERS POISSON EQUATION. Margreet Nool. November 9, 2015 FOR THE. CWI, Multiscale Dynamics SPARSE SOLVERS FOR THE POISSON EQUATION Margreet Nool CWI, Multiscale Dynamics November 9, 2015 OUTLINE OF THIS TALK 1 FISHPACK, LAPACK, PARDISO 2 SYSTEM OVERVIEW OF CARTESIUS 3 POISSON EQUATION 4 SOLVERS

More information

Computing least squares condition numbers on hybrid multicore/gpu systems

Computing least squares condition numbers on hybrid multicore/gpu systems Computing least squares condition numbers on hybrid multicore/gpu systems Marc Baboulin, Jack Dongarra, Rémi Lacroix To cite this version: Marc Baboulin, Jack Dongarra, Rémi Lacroix. Computing least squares

More information

Using AmgX to accelerate a PETSc-based immersed-boundary method code

Using AmgX to accelerate a PETSc-based immersed-boundary method code 29th International Conference on Parallel Computational Fluid Dynamics May 15-17, 2017; Glasgow, Scotland Using AmgX to accelerate a PETSc-based immersed-boundary method code Olivier Mesnard, Pi-Yueh Chuang,

More information

Open boundary conditions in numerical simulations of unsteady incompressible flow

Open boundary conditions in numerical simulations of unsteady incompressible flow Open boundary conditions in numerical simulations of unsteady incompressible flow M. P. Kirkpatrick S. W. Armfield Abstract In numerical simulations of unsteady incompressible flow, mass conservation can

More information

TR A Comparison of the Performance of SaP::GPU and Intel s Math Kernel Library (MKL) for Solving Dense Banded Linear Systems

TR A Comparison of the Performance of SaP::GPU and Intel s Math Kernel Library (MKL) for Solving Dense Banded Linear Systems TR-0-07 A Comparison of the Performance of ::GPU and Intel s Math Kernel Library (MKL) for Solving Dense Banded Linear Systems Ang Li, Omkar Deshmukh, Radu Serban, Dan Negrut May, 0 Abstract ::GPU is a

More information

Vector and scalar penalty-projection methods

Vector and scalar penalty-projection methods Numerical Flow Models for Controlled Fusion - April 2007 Vector and scalar penalty-projection methods for incompressible and variable density flows Philippe Angot Université de Provence, LATP - Marseille

More information

A High-Order Discontinuous Galerkin Method for the Unsteady Incompressible Navier-Stokes Equations

A High-Order Discontinuous Galerkin Method for the Unsteady Incompressible Navier-Stokes Equations A High-Order Discontinuous Galerkin Method for the Unsteady Incompressible Navier-Stokes Equations Khosro Shahbazi 1, Paul F. Fischer 2 and C. Ross Ethier 1 1 University of Toronto and 2 Argonne National

More information

The behaviour of high Reynolds flows in a driven cavity

The behaviour of high Reynolds flows in a driven cavity The behaviour of high Reynolds flows in a driven cavity Charles-Henri BRUNEAU and Mazen SAAD Mathématiques Appliquées de Bordeaux, Université Bordeaux 1 CNRS UMR 5466, INRIA team MC 351 cours de la Libération,

More information

ANR Project DEDALES Algebraic and Geometric Domain Decomposition for Subsurface Flow

ANR Project DEDALES Algebraic and Geometric Domain Decomposition for Subsurface Flow ANR Project DEDALES Algebraic and Geometric Domain Decomposition for Subsurface Flow Michel Kern Inria Paris Rocquencourt Maison de la Simulation C2S@Exa Days, Inria Paris Centre, Novembre 2016 M. Kern

More information

TAU Solver Improvement [Implicit methods]

TAU Solver Improvement [Implicit methods] TAU Solver Improvement [Implicit methods] Richard Dwight Megadesign 23-24 May 2007 Folie 1 > Vortrag > Autor Outline Motivation (convergence acceleration to steady state, fast unsteady) Implicit methods

More information

Poisson Equation in 2D

Poisson Equation in 2D A Parallel Strategy Department of Mathematics and Statistics McMaster University March 31, 2010 Outline Introduction 1 Introduction Motivation Discretization Iterative Methods 2 Additive Schwarz Method

More information

J.I. Aliaga 1 M. Bollhöfer 2 A.F. Martín 1 E.S. Quintana-Ortí 1. March, 2009

J.I. Aliaga 1 M. Bollhöfer 2 A.F. Martín 1 E.S. Quintana-Ortí 1. March, 2009 Parallel Preconditioning of Linear Systems based on ILUPACK for Multithreaded Architectures J.I. Aliaga M. Bollhöfer 2 A.F. Martín E.S. Quintana-Ortí Deparment of Computer Science and Engineering, Univ.

More information

Reduced Vlasov-Maxwell modeling

Reduced Vlasov-Maxwell modeling Reduced Vlasov-Maxwell modeling Philippe Helluy, Michel Massaro, Laurent Navoret, Nhung Pham, Thomas Strub To cite this version: Philippe Helluy, Michel Massaro, Laurent Navoret, Nhung Pham, Thomas Strub.

More information

Lattice Boltzmann simulations on heterogeneous CPU-GPU clusters

Lattice Boltzmann simulations on heterogeneous CPU-GPU clusters Lattice Boltzmann simulations on heterogeneous CPU-GPU clusters H. Köstler 2nd International Symposium Computer Simulations on GPU Freudenstadt, 29.05.2013 1 Contents Motivation walberla software concepts

More information

Faster Kinetics: Accelerate Your Finite-Rate Combustion Simulation with GPUs

Faster Kinetics: Accelerate Your Finite-Rate Combustion Simulation with GPUs Faster Kinetics: Accelerate Your Finite-Rate Combustion Simulation with GPUs Christopher P. Stone, Ph.D. Computational Science and Engineering, LLC Kyle Niemeyer, Ph.D. Oregon State University 2 Outline

More information

Information Sciences Institute 22 June 2012 Bob Lucas, Gene Wagenbreth, Dan Davis, Roger Grimes and

Information Sciences Institute 22 June 2012 Bob Lucas, Gene Wagenbreth, Dan Davis, Roger Grimes and Accelerating the Multifrontal Method Information Sciences Institute 22 June 2012 Bob Lucas, Gene Wagenbreth, Dan Davis, Roger Grimes {rflucas,genew,ddavis}@isi.edu and grimes@lstc.com 3D Finite Element

More information

Cranfield University, Cranfield, Bedfordshire, MK43 0AL, United Kingdom. Cranfield University, Cranfield, Bedfordshire, MK43 0AL, United Kingdom

Cranfield University, Cranfield, Bedfordshire, MK43 0AL, United Kingdom. Cranfield University, Cranfield, Bedfordshire, MK43 0AL, United Kingdom MultiScience - XXX. microcad International Multidisciplinary Scientific Conference University of Miskolc, Hungary, 21-22 April 2016, ISBN 978-963-358-113-1 NUMERICAL INVESTIGATION OF AN INCOMPRESSIBLE

More information

Solving PDEs with CUDA Jonathan Cohen

Solving PDEs with CUDA Jonathan Cohen Solving PDEs with CUDA Jonathan Cohen jocohen@nvidia.com NVIDIA Research PDEs (Partial Differential Equations) Big topic Some common strategies Focus on one type of PDE in this talk Poisson Equation Linear

More information

Accelerating Linear Algebra on Heterogeneous Architectures of Multicore and GPUs using MAGMA and DPLASMA and StarPU Schedulers

Accelerating Linear Algebra on Heterogeneous Architectures of Multicore and GPUs using MAGMA and DPLASMA and StarPU Schedulers UT College of Engineering Tutorial Accelerating Linear Algebra on Heterogeneous Architectures of Multicore and GPUs using MAGMA and DPLASMA and StarPU Schedulers Stan Tomov 1, George Bosilca 1, and Cédric

More information

Structures in the turbulent wake of an Ahmed body: LES vs Experiments using POD and some ideas about control

Structures in the turbulent wake of an Ahmed body: LES vs Experiments using POD and some ideas about control LES vs Experiments using POD and some ideas about control Stéphanie PELLERIN and Bérengère PODVIN Laboratoire d Informatique pour la Mécanique et les Sciences de l Ingénieur Context Université Paris Sud,

More information

On Verification and Validation of Spring Fabric Model

On Verification and Validation of Spring Fabric Model On Verification and Validation of Spring Fabric Model Zheng Gao, Qiangqiang Shi, Yiyang Yang, Bernard Moore, and Xiaolin Li Department of Applied Mathematics and Statistics Stony Brook University Stony

More information

Direct numerical simulation of interfacial instability in gas-liquid flows

Direct numerical simulation of interfacial instability in gas-liquid flows Direct numerical simulation of interfacial instability in gas-liquid flows Iain Bethune 1, Lennon Ó Náraigh 2, David Scott 1, Peter Spelt 3,4, Prashant Valluri 5, Zlatko Solomenko 3 1 Edinburgh Parallel

More information

R. Glenn Brook, Bilel Hadri*, Vincent C. Betro, Ryan C. Hulguin, and Ryan Braby Cray Users Group 2012 Stuttgart, Germany April 29 May 3, 2012

R. Glenn Brook, Bilel Hadri*, Vincent C. Betro, Ryan C. Hulguin, and Ryan Braby Cray Users Group 2012 Stuttgart, Germany April 29 May 3, 2012 R. Glenn Brook, Bilel Hadri*, Vincent C. Betro, Ryan C. Hulguin, and Ryan Braby Cray Users Group 2012 Stuttgart, Germany April 29 May 3, 2012 * presenting author Contents Overview on AACE Overview on MIC

More information

Solving Large Nonlinear Sparse Systems

Solving Large Nonlinear Sparse Systems Solving Large Nonlinear Sparse Systems Fred W. Wubs and Jonas Thies Computational Mechanics & Numerical Mathematics University of Groningen, the Netherlands f.w.wubs@rug.nl Centre for Interdisciplinary

More information

A model leading to self-consistent iteration computation with need for HP LA (e.g, diagonalization and orthogonalization)

A model leading to self-consistent iteration computation with need for HP LA (e.g, diagonalization and orthogonalization) A model leading to self-consistent iteration computation with need for HP LA (e.g, diagonalization and orthogonalization) Schodinger equation: Hψ = Eψ Choose a basis set of wave functions Two cases: Orthonormal

More information

From Direct to Iterative Substructuring: some Parallel Experiences in 2 and 3D

From Direct to Iterative Substructuring: some Parallel Experiences in 2 and 3D From Direct to Iterative Substructuring: some Parallel Experiences in 2 and 3D Luc Giraud N7-IRIT, Toulouse MUMPS Day October 24, 2006, ENS-INRIA, Lyon, France Outline 1 General Framework 2 The direct

More information

Numerical modelling of phase change processes in clouds. Challenges and Approaches. Martin Reitzle Bernard Weigand

Numerical modelling of phase change processes in clouds. Challenges and Approaches. Martin Reitzle Bernard Weigand Institute of Aerospace Thermodynamics Numerical modelling of phase change processes in clouds Challenges and Approaches Martin Reitzle Bernard Weigand Introduction Institute of Aerospace Thermodynamics

More information

Department of Mathematics California State University, Los Angeles Master s Degree Comprehensive Examination in. NUMERICAL ANALYSIS Spring 2015

Department of Mathematics California State University, Los Angeles Master s Degree Comprehensive Examination in. NUMERICAL ANALYSIS Spring 2015 Department of Mathematics California State University, Los Angeles Master s Degree Comprehensive Examination in NUMERICAL ANALYSIS Spring 2015 Instructions: Do exactly two problems from Part A AND two

More information

PALADINS: Scalable Time-Adaptive Algebraic Splitting and Preconditioners for the Navier-Stokes Equations

PALADINS: Scalable Time-Adaptive Algebraic Splitting and Preconditioners for the Navier-Stokes Equations 2013 SIAM Conference On Computational Science and Engineering Boston, 27 th February 2013 PALADINS: Scalable Time-Adaptive Algebraic Splitting and Preconditioners for the Navier-Stokes Equations U. Villa,

More information

A Scalable, Parallel Implementation of Weighted, Non-Linear Compact Schemes

A Scalable, Parallel Implementation of Weighted, Non-Linear Compact Schemes A Scalable, Parallel Implementation of Weighted, Non-Linear Compact Schemes Debojyoti Ghosh Emil M. Constantinescu Jed Brown Mathematics Computer Science Argonne National Laboratory SIAM Annual Meeting

More information

Numerical Modelling in Fortran: day 10. Paul Tackley, 2016

Numerical Modelling in Fortran: day 10. Paul Tackley, 2016 Numerical Modelling in Fortran: day 10 Paul Tackley, 2016 Today s Goals 1. Useful libraries and other software 2. Implicit time stepping 3. Projects: Agree on topic (by final lecture) (No lecture next

More information

Solving PDEs: the Poisson problem TMA4280 Introduction to Supercomputing

Solving PDEs: the Poisson problem TMA4280 Introduction to Supercomputing Solving PDEs: the Poisson problem TMA4280 Introduction to Supercomputing Based on 2016v slides by Eivind Fonn NTNU, IMF February 27. 2017 1 The Poisson problem The Poisson equation is an elliptic partial

More information

Investigation of an implicit solver for the simulation of bubble oscillations using Basilisk

Investigation of an implicit solver for the simulation of bubble oscillations using Basilisk Investigation of an implicit solver for the simulation of bubble oscillations using Basilisk D. Fuster, and S. Popinet Sorbonne Universités, UPMC Univ Paris 6, CNRS, UMR 79 Institut Jean Le Rond d Alembert,

More information

Two-Dimensional Unsteady Flow in a Lid Driven Cavity with Constant Density and Viscosity ME 412 Project 5

Two-Dimensional Unsteady Flow in a Lid Driven Cavity with Constant Density and Viscosity ME 412 Project 5 Two-Dimensional Unsteady Flow in a Lid Driven Cavity with Constant Density and Viscosity ME 412 Project 5 Jingwei Zhu May 14, 2014 Instructor: Surya Pratap Vanka 1 Project Description The objective of

More information

Variational Assimilation of Discrete Navier-Stokes Equations

Variational Assimilation of Discrete Navier-Stokes Equations Variational Assimilation of Discrete Navier-Stokes Equations Souleymane.Kadri-Harouna FLUMINANCE, INRIA Rennes-Bretagne Atlantique Campus universitaire de Beaulieu, 35042 Rennes, France Outline Discretization

More information

IMPLEMENTATION OF A PARALLEL AMG SOLVER

IMPLEMENTATION OF A PARALLEL AMG SOLVER IMPLEMENTATION OF A PARALLEL AMG SOLVER Tony Saad May 2005 http://tsaad.utsi.edu - tsaad@utsi.edu PLAN INTRODUCTION 2 min. MULTIGRID METHODS.. 3 min. PARALLEL IMPLEMENTATION PARTITIONING. 1 min. RENUMBERING...

More information

LU factorization with Panel Rank Revealing Pivoting and its Communication Avoiding version

LU factorization with Panel Rank Revealing Pivoting and its Communication Avoiding version 1 LU factorization with Panel Rank Revealing Pivoting and its Communication Avoiding version Amal Khabou Advisor: Laura Grigori Université Paris Sud 11, INRIA Saclay France SIAMPP12 February 17, 2012 2

More information

Spatial discretization scheme for incompressible viscous flows

Spatial discretization scheme for incompressible viscous flows Spatial discretization scheme for incompressible viscous flows N. Kumar Supervisors: J.H.M. ten Thije Boonkkamp and B. Koren CASA-day 2015 1/29 Challenges in CFD Accuracy a primary concern with all CFD

More information

Scalable Non-Linear Compact Schemes

Scalable Non-Linear Compact Schemes Scalable Non-Linear Compact Schemes Debojyoti Ghosh Emil M. Constantinescu Jed Brown Mathematics Computer Science Argonne National Laboratory International Conference on Spectral and High Order Methods

More information

IMPLEMENTATION OF PRESSURE BASED SOLVER FOR SU2. 3rd SU2 Developers Meet Akshay.K.R, Huseyin Ozdemir, Edwin van der Weide

IMPLEMENTATION OF PRESSURE BASED SOLVER FOR SU2. 3rd SU2 Developers Meet Akshay.K.R, Huseyin Ozdemir, Edwin van der Weide IMPLEMENTATION OF PRESSURE BASED SOLVER FOR SU2 3rd SU2 Developers Meet Akshay.K.R, Huseyin Ozdemir, Edwin van der Weide Content ECN part of TNO SU2 applications at ECN Incompressible flow solver Pressure-based

More information

DIRECT NUMERICAL SIMULATION IN A LID-DRIVEN CAVITY AT HIGH REYNOLDS NUMBER

DIRECT NUMERICAL SIMULATION IN A LID-DRIVEN CAVITY AT HIGH REYNOLDS NUMBER Conference on Turbulence and Interactions TI26, May 29 - June 2, 26, Porquerolles, France DIRECT NUMERICAL SIMULATION IN A LID-DRIVEN CAVITY AT HIGH REYNOLDS NUMBER E. Leriche, Laboratoire d Ingénierie

More information

Robust Preconditioned Conjugate Gradient for the GPU and Parallel Implementations

Robust Preconditioned Conjugate Gradient for the GPU and Parallel Implementations Robust Preconditioned Conjugate Gradient for the GPU and Parallel Implementations Rohit Gupta, Martin van Gijzen, Kees Vuik GPU Technology Conference 2012, San Jose CA. GPU Technology Conference 2012,

More information

Towards a Numerical Benchmark for 3D Low Mach Number Mixed Flows in a Rectangular Channel Heated from Below

Towards a Numerical Benchmark for 3D Low Mach Number Mixed Flows in a Rectangular Channel Heated from Below Copyright 2008 Tech Science Press FDMP, vol.4, no.4, pp.263-269, 2008 Towards a Numerical Benchmark for 3D Low Mach Number Mixed Flows in a Rectangular Channel Heated from Below G. Accary 1, S. Meradji

More information

Preconditioners for the incompressible Navier Stokes equations

Preconditioners for the incompressible Navier Stokes equations Preconditioners for the incompressible Navier Stokes equations C. Vuik M. ur Rehman A. Segal Delft Institute of Applied Mathematics, TU Delft, The Netherlands SIAM Conference on Computational Science and

More information

Perm State University Research-Education Center Parallel and Distributed Computing

Perm State University Research-Education Center Parallel and Distributed Computing Perm State University Research-Education Center Parallel and Distributed Computing A 25-minute Talk (S4493) at the GPU Technology Conference (GTC) 2014 MARCH 24-27, 2014 SAN JOSE, CA GPU-accelerated modeling

More information

Parallelism of MRT Lattice Boltzmann Method based on Multi-GPUs

Parallelism of MRT Lattice Boltzmann Method based on Multi-GPUs Parallelism of MRT Lattice Boltzmann Method based on Multi-GPUs 1 School of Information Engineering, China University of Geosciences (Beijing) Beijing, 100083, China E-mail: Yaolk1119@icloud.com Ailan

More information

A Study on Numerical Solution to the Incompressible Navier-Stokes Equation

A Study on Numerical Solution to the Incompressible Navier-Stokes Equation A Study on Numerical Solution to the Incompressible Navier-Stokes Equation Zipeng Zhao May 2014 1 Introduction 1.1 Motivation One of the most important applications of finite differences lies in the field

More information

Multipole-Based Preconditioners for Sparse Linear Systems.

Multipole-Based Preconditioners for Sparse Linear Systems. Multipole-Based Preconditioners for Sparse Linear Systems. Ananth Grama Purdue University. Supported by the National Science Foundation. Overview Summary of Contributions Generalized Stokes Problem Solenoidal

More information

On the design of parallel linear solvers for large scale problems

On the design of parallel linear solvers for large scale problems On the design of parallel linear solvers for large scale problems ICIAM - August 2015 - Mini-Symposium on Recent advances in matrix computations for extreme-scale computers M. Faverge, X. Lacoste, G. Pichon,

More information

Iterative Solvers in the Finite Element Solution of Transient Heat Conduction

Iterative Solvers in the Finite Element Solution of Transient Heat Conduction Iterative Solvers in the Finite Element Solution of Transient Heat Conduction Mile R. Vuji~i} PhD student Steve G.R. Brown Senior Lecturer Materials Research Centre School of Engineering University of

More information

Numerical Solutions of the Burgers System in Two Dimensions under Varied Initial and Boundary Conditions

Numerical Solutions of the Burgers System in Two Dimensions under Varied Initial and Boundary Conditions Applied Mathematical Sciences, Vol. 6, 22, no. 3, 563-565 Numerical Solutions of the Burgers System in Two Dimensions under Varied Initial and Boundary Conditions M. C. Kweyu, W. A. Manyonge 2, A. Koross

More information

Boundary Value Problems - Solving 3-D Finite-Difference problems Jacob White

Boundary Value Problems - Solving 3-D Finite-Difference problems Jacob White Introduction to Simulation - Lecture 2 Boundary Value Problems - Solving 3-D Finite-Difference problems Jacob White Thanks to Deepak Ramaswamy, Michal Rewienski, and Karen Veroy Outline Reminder about

More information

Multigrid Methods and their application in CFD

Multigrid Methods and their application in CFD Multigrid Methods and their application in CFD Michael Wurst TU München 16.06.2009 1 Multigrid Methods Definition Multigrid (MG) methods in numerical analysis are a group of algorithms for solving differential

More information

Space-time Discontinuous Galerkin Methods for Compressible Flows

Space-time Discontinuous Galerkin Methods for Compressible Flows Space-time Discontinuous Galerkin Methods for Compressible Flows Jaap van der Vegt Numerical Analysis and Computational Mechanics Group Department of Applied Mathematics University of Twente Joint Work

More information

Stability of the Parareal Algorithm

Stability of the Parareal Algorithm Stability of the Parareal Algorithm Gunnar Andreas Staff and Einar M. Rønquist Norwegian University of Science and Technology Department of Mathematical Sciences Summary. We discuss the stability of the

More information

A Fast, Parallel Potential Flow Solver

A Fast, Parallel Potential Flow Solver Advisor: Jaime Peraire December 16, 2012 Outline 1 Introduction to Potential FLow 2 The Boundary Element Method 3 The Fast Multipole Method 4 Discretization 5 Implementation 6 Results 7 Conclusions Why

More information

Vorticity and Dynamics

Vorticity and Dynamics Vorticity and Dynamics In Navier-Stokes equation Nonlinear term ω u the Lamb vector is related to the nonlinear term u 2 (u ) u = + ω u 2 Sort of Coriolis force in a rotation frame Viscous term ν u = ν

More information

Using Random Butterfly Transformations to Avoid Pivoting in Sparse Direct Methods

Using Random Butterfly Transformations to Avoid Pivoting in Sparse Direct Methods Using Random Butterfly Transformations to Avoid Pivoting in Sparse Direct Methods Marc Baboulin 1, Xiaoye S. Li 2 and François-Henry Rouet 2 1 University of Paris-Sud, Inria Saclay, France 2 Lawrence Berkeley

More information

FREE BOUNDARY PROBLEMS IN FLUID MECHANICS

FREE BOUNDARY PROBLEMS IN FLUID MECHANICS FREE BOUNDARY PROBLEMS IN FLUID MECHANICS ANA MARIA SOANE AND ROUBEN ROSTAMIAN We consider a class of free boundary problems governed by the incompressible Navier-Stokes equations. Our objective is to

More information

OUTLINE ffl CFD: elliptic pde's! Ax = b ffl Basic iterative methods ffl Krylov subspace methods ffl Preconditioning techniques: Iterative methods ILU

OUTLINE ffl CFD: elliptic pde's! Ax = b ffl Basic iterative methods ffl Krylov subspace methods ffl Preconditioning techniques: Iterative methods ILU Preconditioning Techniques for Solving Large Sparse Linear Systems Arnold Reusken Institut für Geometrie und Praktische Mathematik RWTH-Aachen OUTLINE ffl CFD: elliptic pde's! Ax = b ffl Basic iterative

More information

SOLUTION of linear systems of equations of the form:
