Using AmgX to accelerate a PETSc-based immersed-boundary method code
1 Using AmgX to accelerate a PETSc-based immersed-boundary method code
Olivier Mesnard, Pi-Yueh Chuang, & Lorena A. Barba
Mechanical and Aerospace Engineering, The George Washington University, United States
29th International Conference on Parallel Computational Fluid Dynamics, May 15-17, 2017; Glasgow, Scotland
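2 PetIBM
- 2D & 3D incompressible Navier-Stokes equations
- Immersed-boundary method
- Projection method: a block-LU decomposition (Perot, 1993)
- PETSc (Portable, Extensible Toolkit for Scientific Computation)

$$
\begin{cases}
\dfrac{\partial u}{\partial t} + u \cdot \nabla u = -\nabla p + \dfrac{1}{Re} \nabla^2 u + \displaystyle\int_s f(\xi(s,t))\, \delta(\xi - x)\, ds \\[2mm]
\nabla \cdot u = 0 \\[2mm]
u(\xi(s,t)) = \displaystyle\int_x u(x)\, \delta(x - \xi)\, dx
\end{cases}
$$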
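The last equation above interpolates the Eulerian velocity onto a Lagrangian boundary point through a delta function. The slide does not say which regularized (discrete) delta function is used, so the minimal sketch below assumes the three-point kernel of Roma et al. (1999) on a uniform 1D grid. The names `roma_delta` and `interpolate` are illustrative and are not taken from PetIBM.

```python
import math

def roma_delta(r):
    """Three-point regularized delta kernel (assumed: Roma et al., 1999).

    r is the distance to the grid point, measured in grid spacings.
    """
    r = abs(r)
    if r <= 0.5:
        return (1.0 + math.sqrt(1.0 - 3.0 * r * r)) / 3.0
    if r <= 1.5:
        return (5.0 - 3.0 * r - math.sqrt(1.0 - 3.0 * (1.0 - r) ** 2)) / 6.0
    return 0.0

def interpolate(u, x, xi, h):
    """Discrete form of u(xi) = integral of u(x) delta(x - xi) dx in 1D.

    delta_h(x) = roma_delta(x / h) / h and the quadrature weight is h,
    so the factor h cancels in the sum.
    """
    return sum(uj * roma_delta((xj - xi) / h) for uj, xj in zip(u, x))

h = 0.1
x = [j * h for j in range(21)]        # uniform grid on [0, 2]
u = [2.0 * xj + 1.0 for xj in x]      # a linear velocity field
xi = 0.973                            # Lagrangian point between grid nodes
u_xi = interpolate(u, x, xi, h)       # velocity felt by the boundary point
weights = sum(roma_delta((xj - xi) / h) for xj in x)
```

With this kernel the weights sum to one and a linear field is reproduced exactly at the Lagrangian point, which is why it is a common choice for immersed-boundary interpolation and spreading.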
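3 Immersed-boundary projection method (IBPM)
- Taira and Colonius (2007)
- Pressure and boundary forces gathered together
- Modified-Poisson system: $Q^T B^N Q$

$$
\begin{bmatrix} A & G & E^T \\ G^T & 0 & 0 \\ E & 0 & 0 \end{bmatrix}
\begin{pmatrix} q^{n+1} \\ \phi \\ \tilde{f} \end{pmatrix}
=
\begin{pmatrix} r^n \\ 0 \\ u_B^{n+1} \end{pmatrix}
+
\begin{pmatrix} bc_1 \\ -bc_2 \\ 0 \end{pmatrix}
\quad \text{with} \quad
Q \equiv \left[ G, E^T \right], \quad
\lambda \equiv \begin{pmatrix} \phi \\ \tilde{f} \end{pmatrix}
$$

Block-LU decomposition:

$$
\begin{bmatrix} A & 0 \\ Q^T & -Q^T B^N Q \end{bmatrix}
\begin{bmatrix} I & B^N Q \\ 0 & I \end{bmatrix}
\begin{pmatrix} q^{n+1} \\ \lambda \end{pmatrix}
=
\begin{pmatrix} r_1 \\ r_2 \end{pmatrix}
$$

$$
\begin{aligned}
A q^* &= r_1 && \text{(velocity system)} \\
Q^T B^N Q \lambda &= Q^T q^* - r_2 && \text{(modified-Poisson system)} \\
q^{n+1} &= q^* - B^N Q \lambda && \text{(projection step)}
\end{aligned}
$$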
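The three steps of the block-LU decomposition (velocity system, modified-Poisson system, projection step) can be checked on a tiny dense system. The sketch below is only an illustration: the matrices and right-hand sides are made up, and it assumes $B^N = A^{-1}$ exactly, whereas in the projection method $B^N$ is a truncated approximation of $A^{-1}$. Under that assumption the three steps reproduce the solution of the coupled system.

```python
def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting (small, dense)."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(aug[i][k]))
        aug[k], aug[p] = aug[p], aug[k]
        for i in range(k + 1, n):
            f = aug[i][k] / aug[k][k]
            for j in range(k, n + 1):
                aug[i][j] -= f * aug[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

# Hypothetical data: A is symmetric positive definite, Q has two columns.
A = [[4.0, -1.0, 0.0, 0.0],
     [-1.0, 4.0, -1.0, 0.0],
     [0.0, -1.0, 4.0, -1.0],
     [0.0, 0.0, -1.0, 4.0]]
Q_cols = [[1.0, -1.0, 0.0, 0.0],
          [0.0, 1.0, -1.0, 0.0]]
r1 = [1.0, 2.0, 3.0, 4.0]
r2 = [0.5, -0.25]

# Velocity system: A q* = r1
q_star = solve(A, r1)

# Modified-Poisson system: (Q^T B^N Q) lambda = Q^T q* - r2, with B^N = A^{-1} here
BN_Q = [solve(A, c) for c in Q_cols]                  # columns of B^N Q
S = [[dot(ci, w) for w in BN_Q] for ci in Q_cols]     # Q^T B^N Q
lam = solve(S, [dot(c, q_star) - t for c, t in zip(Q_cols, r2)])

# Projection step: q^{n+1} = q* - B^N Q lambda
q_new = [qk - sum(l * w[k] for l, w in zip(lam, BN_Q))
         for k, qk in enumerate(q_star)]

# Residuals of the coupled system: A q + Q lambda = r1 and Q^T q = r2
res_momentum = [dot(A[i], q_new) + sum(l * c[i] for l, c in zip(lam, Q_cols)) - r1[i]
                for i in range(4)]
res_constraint = [dot(c, q_new) - t for c, t in zip(Q_cols, r2)]
```

Both residual vectors vanish to round-off in this setting. The Schur complement `S` plays the role of the modified-Poisson operator, which is the system the rest of the talk tries to solve faster.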
4 Application to 2D gliding snake
- Enhanced lift at Re 2000 at AoA 35 deg
- AoA matches previous experimental data
- Can we observe lift enhancement with more realistic 3D simulations?
Figures from Krishnan, A., Socha, J. J., Vlachos, P. P., & Barba, L. A. (2014). Lift and wakes of flying snakes. Physics of Fluids, 26(3).
5 Modified-Poisson system ($Q^T B^N Q$) expensive to solve
- Conjugate-gradient
- Algebraic multigrid preconditioner (aggregation)
- 1 CPU node (16 cores - Dual 8-Core 2.6GHz Intel Xeon E5-2670)
- 80 time units (200,000 time steps)
- Solving this system takes about 90% of the simulation runtime!
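For readers unfamiliar with the solver named on this slide, here is a minimal sketch of a preconditioned conjugate-gradient iteration. It is not the PETSc implementation. The test matrix is a 1D Poisson operator rather than the modified-Poisson operator, and a Jacobi (diagonal) preconditioner stands in for the algebraic multigrid preconditioner, which is far more involved. Only the place where the preconditioner is applied carries over.

```python
def poisson_matvec(x):
    """y = A x for the 1D Poisson matrix tridiag(-1, 2, -1)."""
    n = len(x)
    y = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        y.append(2.0 * x[i] - left - right)
    return y

def jacobi(r):
    """Stand-in preconditioner: divide by the diagonal of A (all entries are 2)."""
    return [ri / 2.0 for ri in r]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def pcg(matvec, precond, b, tol=1e-10, max_iter=1000):
    """Preconditioned conjugate gradient for a symmetric positive definite system."""
    x = [0.0] * len(b)
    r = b[:]                      # residual for the zero initial guess
    z = precond(r)
    p = z[:]
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5
    for it in range(1, max_iter + 1):
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, it
        z = precond(r)            # an AMG V-cycle would be applied here instead
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

b = [1.0] * 50
x, iters = pcg(poisson_matvec, jacobi, b)
```

The quality of the preconditioner drives the iteration count, and therefore the cost per time step, which is why the talk focuses on the multigrid-preconditioned solve.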
6 Nvidia AmgX library
- Various Krylov solvers
- Algebraic multigrid algorithms (classical and aggregation)
- Solve systems on multiple CUDA-capable GPU devices
- Available with a free license for non-commercial use for Accelerated Computing Developers
Objective: use AmgX within PetIBM to reduce the time-to-solution of the Poisson system
Problem: PETSc and AmgX have their own data structures
7 AmgXWrapper
- Interface between PETSc and AmgX
- Not specific to PetIBM
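The slides do not describe how the wrapper bridges the two libraries' data structures. As background only, the sketch below shows compressed sparse row (CSR) storage, a layout that sparse-matrix libraries commonly exchange as three flat arrays. Whether AmgXWrapper converts data exactly this way is an assumption, and `to_csr` is a hypothetical helper, not part of PETSc, AmgX, or AmgXWrapper.

```python
def to_csr(n_rows, triplets):
    """Convert (row, col, value) triplets into CSR arrays.

    Returns row_ptr (length n_rows + 1), col_idx and values (length nnz).
    Entries of row i live in col_idx[row_ptr[i]:row_ptr[i + 1]].
    """
    rows = [[] for _ in range(n_rows)]
    for i, j, v in triplets:
        rows[i].append((j, v))
    row_ptr, col_idx, values = [0], [], []
    for entries in rows:
        for j, v in sorted(entries):      # sort each row by column index
            col_idx.append(j)
            values.append(v)
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx, values

# 3x3 Poisson-like matrix [[2, -1, 0], [-1, 2, -1], [0, -1, 2]], given unordered
triplets = [(1, 2, -1.0), (0, 0, 2.0), (2, 2, 2.0), (0, 1, -1.0),
            (1, 0, -1.0), (2, 1, -1.0), (1, 1, 2.0)]
row_ptr, col_idx, values = to_csr(3, triplets)
```

In a distributed setting each process would hold such arrays for its own rows only, which is where an interface layer has to reconcile the row distribution of one library with the device layout of the other.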
8 AmgXWrapper - Poisson system
9 PetIBM + AmgXWrapper
- Benchmark: 2D snake (Re=2000, AoA=35deg), 2.9M mesh-grid
- 1 CPU node: 12 CPU cores (2 Intel E5-2620)
- 1 GPU node: 1 CPU node (i.e., 12 CPU cores) + 2 K20 GPUs
- workstation: 6 CPU cores (1 Intel i7-5930k) + 2 K40c GPUs
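10 Decoupled immersed-boundary projection method
- Li and co-workers (2016)
- Decouple pressure field from Lagrangian forces
- 2-step block-LU decomposition

$$
\begin{bmatrix} A & G & E^T \\ G^T & 0 & 0 \\ E & 0 & 0 \end{bmatrix}
\begin{pmatrix} q^{n+1} \\ \phi \\ \tilde{f} \end{pmatrix}
=
\begin{pmatrix} r^n \\ 0 \\ u_B^{n+1} \end{pmatrix}
+
\begin{pmatrix} bc_1 \\ -bc_2 \\ 0 \end{pmatrix}
\;\Longleftrightarrow\;
\begin{bmatrix} \bar{A} & \bar{E}^T \\ \bar{E} & 0 \end{bmatrix}
\begin{pmatrix} \gamma^{n+1} \\ \tilde{f} \end{pmatrix}
=
\begin{pmatrix} r_1 \\ r_2 \end{pmatrix}
$$

$$
\text{with} \quad
\bar{A} \equiv \begin{bmatrix} A & G \\ G^T & 0 \end{bmatrix}; \quad
\bar{E} \equiv \begin{bmatrix} E & 0 \end{bmatrix}; \quad
\gamma^{n+1} \equiv \begin{pmatrix} q^{n+1} \\ \phi \end{pmatrix}
$$

First block-LU decomposition:

$$
\begin{bmatrix} \bar{A} & 0 \\ \bar{E} & -\bar{E} \bar{A}^{-1} \bar{E}^T \end{bmatrix}
\begin{bmatrix} I & \bar{A}^{-1} \bar{E}^T \\ 0 & I \end{bmatrix}
\begin{pmatrix} \gamma^{n+1} \\ \tilde{f} \end{pmatrix}
=
\begin{pmatrix} r_1 \\ r_2 \end{pmatrix}
$$

Second block-LU decomposition:

$$
\begin{bmatrix} A & 0 \\ G^T & -G^T A^{-1} G \end{bmatrix}
\begin{bmatrix} I & A^{-1} G \\ 0 & I \end{bmatrix}
\begin{pmatrix} q^{**} \\ \phi \end{pmatrix}
=
\begin{pmatrix} r_1 \\ r_2 \end{pmatrix}
$$

$$
\begin{aligned}
A q^* &= r_1 \\
G^T A^{-1} G \phi &= G^T q^* + bc_2 \\
q^{**} &= q^* - A^{-1} G \phi \\
E A^{-1} E^T \tilde{f} &= E q^{**} - u_B^{n+1} \\
q^{n+1} &= q^{**} - A^{-1} E^T \tilde{f}
\end{aligned}
$$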
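The five solves listed on this slide can be traced on a small dense example. The sketch below uses made-up matrices and applies $A^{-1}$ exactly, so it only illustrates the algebra of the sequence and not the production scheme. The pressure and force steps have the same structure, so a single hypothetical helper, `project`, handles both.

```python
def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting (small, dense)."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(aug[i][k]))
        aug[k], aug[p] = aug[p], aug[k]
        for i in range(k + 1, n):
            f = aug[i][k] / aug[k][k]
            for j in range(k, n + 1):
                aug[i][j] -= f * aug[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def project(A, C, q, target):
    """Find mu so that q_new = q - A^{-1} C^T mu satisfies C q_new = target.

    C is a list of constraint vectors: the columns of G for the pressure
    step, the rows of E for the boundary-force step.
    """
    Ainv_C = [solve(A, c) for c in C]
    S = [[dot(ci, w) for w in Ainv_C] for ci in C]        # C A^{-1} C^T
    mu = solve(S, [dot(ci, q) - t for ci, t in zip(C, target)])
    q_new = [qk - sum(m * w[k] for m, w in zip(mu, Ainv_C))
             for k, qk in enumerate(q)]
    return q_new, mu

# Hypothetical data
A = [[4.0, -1.0, 0.0, 0.0],
     [-1.0, 4.0, -1.0, 0.0],
     [0.0, -1.0, 4.0, -1.0],
     [0.0, 0.0, -1.0, 4.0]]
G_cols = [[1.0, -1.0, 0.0, 0.0], [0.0, 1.0, -1.0, 0.0]]   # columns of G
E_rows = [[0.0, 0.5, 0.5, 0.0]]                            # rows of E
r1, bc2, u_B = [1.0, 2.0, 3.0, 4.0], [0.1, -0.2], [0.3]

q_star = solve(A, r1)                                      # A q* = r1
# G^T A^{-1} G phi = G^T q* + bc2, then q** = q* - A^{-1} G phi
q_2star, phi = project(A, G_cols, q_star, [-b for b in bc2])
# E A^{-1} E^T f = E q** - u_B, then q^{n+1} = q** - A^{-1} E^T f
q_new, f = project(A, E_rows, q_2star, u_B)

no_slip_error = [dot(e, q_new) - ub for e, ub in zip(E_rows, u_B)]
div_error_2star = [dot(g, q_2star) + b for g, b in zip(G_cols, bc2)]
```

In this sketch the intermediate velocity `q_2star` satisfies the divergence constraint and the final velocity `q_new` satisfies the no-slip constraint, both to round-off. After the force projection the divergence constraint is in general no longer met exactly, which is the price of decoupling the pressure from the Lagrangian forces in this form.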
11 IBPM vs. decoupled method
- Benchmark: 2D snake (Re=2000, AoA=30deg)
- 2.9M meshgrid
- 200,000 time steps (80 time units)
Figure panels: Cl, Cd
12 IBPM vs. decoupled method
- Benchmark: 2D snake
- 2.9M meshgrid
- 200,000 time steps (80 time units)
- 1 CPU node (16 CPU cores)
- 1 GPU node (12 CPU cores & 2 K20 devices)
13 Flying snakes to the cloud
CPU node on Colonial One:
- Dual 8-Core 2.6GHz Intel Xeon E CPUs
- InfiniBand
Microsoft Azure A9:
- Dual 8-Core 2.6GHz Intel Xeon E CPUs
- InfiniBand
Ohio State University micro-benchmarks (figure)
14 Flying snakes to the cloud
- Benchmark: Poisson system with 46M unknowns
- Hypre BoomerAMG classical preconditioner
- PETSc CG
- 16 CPU cores per node
15 Flying snakes to the cloud

A-series
Instance  Cores  RAM    Disk sizes  Price
A8        8      56GB   382GB       $0.975/hr
A…        …      …GB    382GB       $1.95/hr
A…        …      …GB    382GB       $0.78/hr
A…        …      …GB    382GB       $1.56/hr

NC-series
Instance  Cores  RAM    Disk sizes  GPU      Price
NC6       6      56GB   340GB       1 x K80  $0.90/hr
NC…       …      …GB    680GB       2 x K80  $1.80/hr
NC…       …      …GB    1,440GB     4 x K80  $3.60/hr
NC24r     …      …GB    1,440GB     4 x K80  $3.96/hr
16 Flying snakes to the cloud
- Benchmark: 2D snake (Re=2000, AoA=35deg)
- 2.9M meshgrid
- 10,000 time steps
- Costs: $12.5, $…
17 Flying snakes to the cloud
- Benchmark: 3D snake (Re=2000, AoA=deg)
- 46M meshgrid
- 1,000 time steps
- Costs: $15.5, $7.9
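The dollar figures on the two benchmark slides are run costs. The slides do not show how they were computed, so the sketch below is one plausible reading: hourly price times number of instances times wall-clock hours. The two prices come from the instance table (A8 and NC24r). The instance counts and runtimes are hypothetical placeholders, not measurements from the talk, and storage or data-transfer charges are ignored.

```python
def run_cost(price_per_hour, n_instances, wall_hours):
    """Cost of a run billed per instance-hour (no storage or transfer fees)."""
    return price_per_hour * n_instances * wall_hours

# Prices from the instance table; counts and hours are made-up inputs.
cpu_cost = run_cost(price_per_hour=0.975, n_instances=4, wall_hours=2.0)
gpu_cost = run_cost(price_per_hour=3.96, n_instances=1, wall_hours=1.5)

# A GPU instance costs more per hour, so it only pays off if it
# shortens the run enough (or replaces enough CPU instances).
gpu_is_cheaper = gpu_cost < cpu_cost
```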
18 Conclusions
- Use AmgX in a PETSc-based code: AmgXWrapper
- Fast decoupled immersed-boundary projection method
- PetIBM + AmgX to reduce cloud computing expenses
- Microsoft Azure Sponsorship