Parallel Simulations of Self-propelled Microorganisms

Parallel Simulations of Self-propelled Microorganisms
K. Pickl (a,b), M. Hofmann (c), T. Preclik (a), H. Köstler (a), A.-S. Smith (b,d), U. Rüde (a,b)
ParCo 2013, Munich
(a) Lehrstuhl für Informatik 10 (Systemsimulation), FAU Erlangen-Nürnberg
(b) Cluster of Excellence: Engineering of Advanced Materials, FAU Erlangen-Nürnberg
(c) Fakultät für Mathematik, Lehrstuhl für Numerische Mathematik, TU München
(d) Institut für Theoretische Physik I, FAU Erlangen-Nürnberg

Flow Regimes
[Figure: example flows spanning Reynolds numbers from about Re = 10^-4 to Re = 10^9; all images taken from www.wikipedia.com]
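These regimes are ordered by the Reynolds number, the ratio of inertial to viscous forces. As a standard definition (rho is the fluid density, U and L are characteristic velocity and length scales, and mu = rho*nu is the dynamic viscosity):

```latex
\mathrm{Re} = \frac{\rho U L}{\mu} = \frac{U L}{\nu}
```

Swimming microorganisms sit at the far left of this range, at Re << 1.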

Flow at Low Reynolds Number: Purcell's Scallop Theorem
[Figure: reciprocal stroke of a scallop in Stokes flow, configurations x1, x2 at times t1, t2]
- domination of viscous forces, small momentum
- always laminar
- time reversible, no coasting
=> we need asymmetric, non-time-reversible motion to achieve any net movement
E.M. Purcell. Life at low Reynolds number. American Journal of Physics, 45:3-11 (1977)
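The theorem follows from the form of the governing equations in this regime: with inertia negligible, the incompressible Stokes equations contain no time derivative (u is the velocity, p the pressure, mu the dynamic viscosity):

```latex
\nabla p = \mu \nabla^2 \mathbf{u}, \qquad \nabla \cdot \mathbf{u} = 0
```

Because these equations are linear and quasi-static, retracing a stroke retraces the flow exactly, so a reciprocal (time-reversible) stroke yields zero net displacement.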

Physical Model of a Swimmer
We choose the simplest possible design: Golestanian's swimmer.
- connections between the objects: spring-damper systems, used in previous studies
- overlapping hydrodynamic interactions cause bending; to prevent it (preserve the 180° axis of the swimmer), we introduce angular springs
A. Najafi and R. Golestanian. Simple swimmer at low Reynolds number: Three linked spheres. Phys. Rev. E, 69(6):062901 (2004)
K. Pickl et al. All good things come in threes: three beads learn to swim with lattice Boltzmann and a rigid body solver. JoCS, 3(5):374-387 (2012)
M. Hofmann. Parallelisation of Swimmer Models for Swarms of Bacteria in the Physics Engine pe. Master's thesis, LSS, FAU Erlangen-Nürnberg (2013)
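A minimal sketch of the connection forces, assuming a linear spring-damper along the swimmer axis; the type name and the parameters k, d, restLength are illustrative, not the pe API:

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;

// Hypothetical linear spring-damper between two beads (not the actual pe API).
struct SpringDamper {
    double k;          // spring stiffness
    double d;          // damping coefficient
    double restLength; // equilibrium bead separation

    // Force on bead 1; bead 2 receives the negative (actio = reactio).
    Vec3 force(const Vec3& x1, const Vec3& x2,
               const Vec3& v1, const Vec3& v2) const {
        Vec3 dx{x2[0] - x1[0], x2[1] - x1[1], x2[2] - x1[2]};
        const double len = std::sqrt(dx[0]*dx[0] + dx[1]*dx[1] + dx[2]*dx[2]);
        const Vec3 n{dx[0]/len, dx[1]/len, dx[2]/len};      // unit axis bead 1 -> bead 2
        const double vrel = (v2[0]-v1[0])*n[0] + (v2[1]-v1[1])*n[1]
                          + (v2[2]-v1[2])*n[2];             // separation speed along axis
        const double f = k * (len - restLength) + d * vrel; // Hooke term + damping
        return {f*n[0], f*n[1], f*n[2]};
    }
};
```

The angular springs act analogously on the angle enclosed by the two arms, penalizing deviations from 180°.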

Non-time Reversible Cycling Strategy
[Figure: x-component of the applied force on bodies 1-3 over 6,000 time steps; three phase-shifted profiles ranging between -0.3 and 0.3]
- the total applied force vanishes over one cycle (the displacement of the swimmer over one cycle is zero in the absence of fluid)
- applied along the specified main axis of the swimmer (in this case: the x-direction), on the center of mass of each body
- the net driving force acting on the system at each instant of time is zero
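A minimal sketch of such a non-time-reversible cycle; the function name, the four-phase split, and the constant magnitude F are assumptions for illustration (the force profiles on the slide are smooth, not piecewise constant):

```cpp
#include <array>

// Hypothetical driving protocol for the three-bead swimmer. In each quarter of
// the cycle one pair of adjacent beads is pulled together or pushed apart with
// equal and opposite forces, so the instantaneous net force is zero by
// construction, and every phase is undone later in the cycle, so each body's
// force also integrates to zero over one full cycle.
std::array<double, 3> drivingForcesX(long step, long cycleLength, double F) {
    const long phase = (step % cycleLength) * 4 / cycleLength; // 0..3
    switch (phase) {
        case 0:  return {+F, -F, 0.0}; // contract left arm
        case 1:  return {0.0, +F, -F}; // contract right arm
        case 2:  return {-F, +F, 0.0}; // expand left arm
        default: return {0.0, -F, +F}; // expand right arm
    }
}
```

The phase ordering (contract left, contract right, expand left, expand right) is what breaks time reversibility and lets the swimmer advance despite the vanishing net force.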

Software
Fluid Simulation: WALBERLA (widely applicable lattice Boltzmann solver from Erlangen)
- suited for various flow applications
- different fluid models (SRT, TRT, MRT)
- suitable for homo- and heterogeneous architectures
- large-scale, MPI-based parallelization
Rigid Body Simulation: pe
- based on Newtonian mechanics
- fully resolved objects (sphere, box, ...)
- connections between objects can be soft or hard constraints
- accurate handling of friction during collision
- large-scale, MPI-based parallelization
I. Ginzburg et al. Two-relaxation-time lattice Boltzmann scheme: About parametrization, .... Comm. in Computational Physics, 3(2):427-478 (2008)
P. A. Cundall and O. D. L. Strack. A discrete numerical model for granular assemblies. Géotechnique, 29:47-65 (1979)

Coupling both Frameworks: Four-Way Coupling
1. Object Mapping
2. LBM Communication
3. Stream Collide
4. Hydrodynamic Forces
5. Lubrication Correction
6. Physics Engine
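A schematic of one coupled time step; all function names are placeholders for the corresponding waLBerla/pe sweeps, not their actual interfaces:

```cpp
// Placeholder declarations for the six coupling stages of one time step.
void mapObjectsToLattice();        // 1. flag lattice cells covered by rigid bodies
void communicateLBMGhostLayers();  // 2. exchange PDFs at subdomain boundaries (MPI)
void streamAndCollide();           // 3. LBM update of the fluid
void computeHydrodynamicForces();  // 4. momentum exchange from fluid to bodies
void applyLubricationCorrection(); // 5. correct under-resolved near-contact forces
void physicsEngineStep();          // 6. pe: spring forces, collisions, integration

void coupledTimeStep() {
    mapObjectsToLattice();
    communicateLBMGhostLayers();
    streamAndCollide();
    computeHydrodynamicForces();
    applyLubricationCorrection();
    physicsEngineStep();
}
```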

So Far: Sequential Computations
Get Ready for Parallel Simulations of Many Swimmers
- introduction of angular springs to prevent bending
- handling of pair-wise spring-like interactions, extending not only over neighboring but also over multiple process domains (the job of the pe)

Parallel Discrete Element Method (DEM)
First MPI communication: send and receive forces and torques
1: Find and resolve all contacts inside each local domain
2: // First MPI communication
3: for all remote objects b_rem do
4:     sendForcesAndTorquesToOwner(b_rem)
5: end for
6: Receive forces and torques on local objects and perform a time integration

Parallel Discrete Element Method (DEM)
Second MPI communication: update remote objects and migrate objects
7:  for all local objects b_loc do
8:      for all neighboring processes p_loc do
9:          if b_loc and p_loc intersect and there is no remote object of b_loc on p_loc then
10:             send b_loc to p_loc
11:         end if
12:     end for
13:     for all processes p_s holding remote objects of b_loc do
14:         send update of b_loc to p_s
15:     end for
16:     if b_loc's center of mass has moved to neighboring process p_n then
17:         migrate b_loc to p_n
18:         mark springs attached to b_loc to be moved to p_n
19:     end if
20: end for
21: Receive updates and new objects
M. Hofmann. Parallelisation of Swimmer Models for Swarms of Bacteria in the Physics Engine pe. Master's thesis, LSS, FAU Erlangen-Nürnberg (2013)

Parallel Discrete Element Method (DEM)
New! Third MPI communication: send springs and attached objects
22: for all springs s_send marked to be sent do
23:     for all objects b_att attached to s_send do
24:         send remote object of b_att to the stored process p_n
25:     end for
26:     send spring s_send to the stored process p_n
27: end for
28: Receive remote objects and instantiate a distant process, if necessary
29: Receive springs and attach them
30: Delete remote objects, springs, and distant processes no longer needed
- keeps communication partners updated
- all information regarding spring-like pair-wise interactions is sent
- for long-range interactions: only the associated processes communicate
M. Hofmann. Parallelisation of Swimmer Models for Swarms of Bacteria in the Physics Engine pe. Master's thesis, LSS, FAU Erlangen-Nürnberg (2013)
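A minimal sketch of what the third communication ships, assuming plain-struct serialization over MPI; the types BodyCopy and SpringMsg, the field layout, and the tags are illustrative, not the pe wire format:

```cpp
#include <mpi.h>
#include <vector>

// Hypothetical message payloads: for every spring that must follow a migrated
// body, the spring parameters and copies of the attached bodies are shipped to
// the stored target process p_n.
struct BodyCopy  { long id; double x[3], v[3]; };   // minimal remote-object state
struct SpringMsg { long bodyA, bodyB; double k, d, restLength; };

void sendMarkedSprings(const std::vector<BodyCopy>& attachedBodies,
                       const std::vector<SpringMsg>& springs,
                       int targetRank, MPI_Comm comm) {
    // Bodies first, so the receiver can instantiate remote objects (and, if
    // necessary, a new distant-process record) before attaching the springs.
    MPI_Send(attachedBodies.data(),
             static_cast<int>(attachedBodies.size() * sizeof(BodyCopy)),
             MPI_BYTE, targetRank, /*tag=*/1, comm);
    MPI_Send(springs.data(),
             static_cast<int>(springs.size() * sizeof(SpringMsg)),
             MPI_BYTE, targetRank, /*tag=*/2, comm);
}
```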

So Far: Sequential Computations
Now We Are Ready for Parallel Simulations of Many Swimmers
- introduction of angular springs to prevent bending
- handling of pair-wise spring-like interactions, extending not only over neighboring but also over multiple process domains

Weak Scaling Setup
Does the newly introduced communication influence the scaling behavior?
- 100³ lattice cells and 2x8x8 swimmers per core
- r_sphere = 4 lattice cells, d_c.o.m. = 16 lattice cells
- 4,000 time steps
- smallest scenario: 4x4x4 cores (8,192 swimmers)
- successively doubling the cores in the Cartesian directions
- entire domain: periodic in all directions

System Configurations of the Used Supercomputers

                             SUPERMUC                JUQUEEN
# Cores                      147,456                 458,752
# Nodes                      9,216                   28,672
# Processors per Node        2                       1
# Cores per Processor        8                       16
Peak Performance [PFlop/s]   3.185                   5.9
Memory per Core [GByte]      2                       1
Processor Type               Intel Xeon E5-2680 8C   IBM PowerPC A2
                             (Sandy Bridge-EP)
Clock Speed                  2.7 GHz                 1.6 GHz
Interconnect                 Infiniband FDR10        IBM-specific
Interconnect Topology        intra-island:           5D torus
                             non-blocking tree;
                             inter-island:
                             pruned tree 4:1

Weak Scaling Results on SUPERMUC
[Figure: time to solution [s] (0 to 4,000) and parallel efficiency [%] vs. number of nodes (4 to 512); stacked fractions for Physics Engine, LBM Communication, Stream Collide, Hydrodynamic Forces, and Object Mapping]
- using the Intel C++ compiler version 12.1, the IBM MPI implementation, and a clock speed of 2.7 GHz
- not displayed: Setup, Swimmer Setup, and Lubrication Correction
- individual fractions measured using the average over all cores

Weak Scaling Setup on JUQUEEN
- cores are able to perform four-way multithreading
- analyze our smallest setup: 4x4x4 cores (corresponds to 4 nodes)
- throughput in MLUP/s (million lattice-cell updates per second):

# Threads   MLUP/s
1           23.66
2           40.03
4           48.80

Weak Scaling Results on JUQUEEN
[Figure: time to solution [s] (0 to 8,000) and parallel efficiency [%] vs. number of nodes (4 to 8,192); stacked fractions for Physics Engine, LBM Communication, Stream Collide, Hydrodynamic Forces, and Object Mapping]
- largest simulated setup on 8,192 nodes: 16,777,216 swimmers
- using the GNU C++ compiler version 4.4.6
- not displayed: Setup, Swimmer Setup, and Lubrication Correction
- individual fractions measured using the average over all cores

Conclusions and Future Work
Conclusions
- successful integration of handling pair-wise interactions extending over process domains
- weak scaling on two supercomputers currently ranked in the top ten of the TOP500 list
- demonstrated scalability on up to 262,144 processes
Future Work
- analyze the collective behavior of swimmers systematically; reaching a steady state requires longer simulation runs
- improvement of parallel I/O and associated data analysis
- strong scaling characteristics

Thank you for your attention!

Extract from the References
K. Pickl et al. All good things come in threes: three beads learn to swim with lattice Boltzmann and a rigid body solver. Journal of Computational Science, 3(5):374-387, 2012.
M. Hofmann. Parallelisation of Swimmer Models for Swarms of Bacteria in the Physics Engine pe. Master's thesis, Lehrstuhl für Informatik 10 (Systemsimulation), FAU Erlangen-Nürnberg, 2013.
C. Feichtinger et al. WaLBerla: HPC software design for computational engineering simulations. Journal of Computational Science, 2(2):105-112, 2011.
A. Najafi and R. Golestanian. Simple swimmer at low Reynolds number: Three linked spheres. Phys. Rev. E, 69(6):062901, 2004.
C. M. Pooley et al. Hydrodynamic interaction between two swimmers at low Reynolds number. Phys. Rev. Lett., 99:228103, 2007.

Acknowledgments