

Parallel Multi-Zone Methods for Large-Scale Multidisciplinary Computational Physics Simulations
Ding Li, Guoping Xia and Charles L. Merkle
Purdue University
The 6th International Conference on Linux Clusters: The HPC Revolution 2005
Chapel Hill, NC, April 25-28, 2005

Presentation Outline
- Multidisciplinary Numerical Analysis System
- GEMS code
- Generalized Equations of Motion
- Linux Cluster and Benchmarks
- Parallel Implementation
- Representative Applications

Multidisciplinary Computational Physics
- Multi-physics: structures, plasma dynamics, fluid dynamics, electromagnetics, radiative energy transfer and neutron transport
- Different approaches:
  - loosely coupled and individual codes
  - closely coupled and solved simultaneously
- Unified framework: general conservation law

Numerical Analysis System
[Diagram: CAD, Grid Generator, PGRID, Property Modules, GEMS, Data Repository, Data Mining/Visualization]

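GEMS code
- Preconditioned, multiple-time algorithms
- Structured-unstructured grids
- Fluid-solid model
- Cluster computing
- Multiple physical zones
- Electromagnetics

General equation:
$$\Gamma \frac{\partial Q_p}{\partial t} V + \sum_{i}^{N} F^{Inv}_{n,i} A_i - \sum_{i}^{N} F^{Vis}_{n,i} A_i = S V$$

Fluid-solid model:
$$\frac{\partial (\rho \vec{v})}{\partial t} + \nabla \cdot (\rho \vec{v}\vec{v}) = -\nabla p + \nabla \cdot \tau + \nabla \cdot \sigma$$
$$\sigma = 2\mu_{SM}\left(e - \tfrac{1}{3}\,\delta\,\mathrm{trace}(e)\right)$$

Purdue University - School of Mechanical Engineering

Generalized Equations
Generic set of partial differential equations:
$$\Gamma \frac{\partial Q_p}{\partial \tau} + \frac{\partial Q}{\partial t} + \nabla \cdot F_D + \nabla \times F_C + \nabla \Phi = 0$$
Integrated over a control volume $\Omega$:
$$\int_\Omega \Gamma \frac{\partial Q_p}{\partial \tau}\,d\Omega + \int_\Omega \frac{\partial Q}{\partial t}\,d\Omega + \int_\Omega \nabla \cdot F_D\,d\Omega + \int_\Omega \nabla \times F_C\,d\Omega + \int_\Omega \nabla \Phi\,d\Omega = 0$$
With the flux terms written as integrals over the bounding surface $\Sigma$ with unit normal $n$:
$$\int_\Omega \Gamma \frac{\partial Q_p}{\partial \tau}\,d\Omega + \int_\Omega \frac{\partial Q}{\partial t}\,d\Omega + \oint_\Sigma n \cdot F_D\,d\Sigma + \oint_\Sigma n \times F_C\,d\Sigma + \oint_\Sigma n\,\Phi\,d\Sigma = 0$$
The three surface integrals are the normal flux, the tangential flux and the scalar term, respectively.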

Number of Partial Differential Equations in Various Fields (table)

Multi-Physics Zone Method
- Distinct physics zones: different media, different equations
- Parallel processing: each zone divided into sub-clusters
- Load balancing: number of equations, size of grids
[Diagram: zones mapped to Cluster 1, Cluster 2 and Cluster 3]
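The load-balancing criteria above (number of equations, size of grids) can be illustrated with a small sketch. It assumes that the work in a zone scales as equations times grid cells and that processors are shared out in proportion to that work. This rule, the function name `allocate_processors` and the zone sizes are illustrative assumptions, not taken from the GEMS code.

```python
def allocate_processors(zones, total_procs):
    """Split total_procs among zones in proportion to estimated work.

    zones: dict mapping zone name -> (n_equations, n_cells).
    Work per zone is assumed to be n_equations * n_cells (hypothetical model).
    """
    if total_procs < len(zones):
        raise ValueError("need at least one processor per zone")
    work = {name: n_eq * n_cells for name, (n_eq, n_cells) in zones.items()}
    total_work = sum(work.values())

    # Every zone gets one processor; the spare ones are shared by work.
    spare = total_procs - len(zones)
    ideal = {name: spare * w / total_work for name, w in work.items()}
    alloc = {name: 1 + int(ideal[name]) for name in zones}

    # Hand out what is left to the zones with the largest fractional parts.
    leftover = total_procs - sum(alloc.values())
    by_remainder = sorted(zones, key=lambda n: ideal[n] - int(ideal[n]),
                          reverse=True)
    for name in by_remainder[:leftover]:
        alloc[name] += 1
    return alloc


# Hypothetical zones: (number of equations, number of grid cells).
example_zones = {
    "fluid": (5, 300_000),
    "solid": (3, 100_000),
    "electromagnetic": (6, 50_000),
}
example_alloc = allocate_processors(example_zones, 16)
print(example_alloc)
```

Each zone's share would then form the sub-cluster on which that zone is partitioned further.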

Linux Clusters
Simba (2001)
- 51 nodes: single P4 1.8 GHz CPU, 1 GB RAM, 10/100 Ethernet
- Red Hat 9.0, Lahey Fortran compiler, MPICH 1.2.4, PBS
Macbeth (2005)
- 98 nodes: dual AMD Opteron 1.8 GHz CPUs, 4 GB RAM, Infiniband interconnect (4x Infiniband network fabric, 10 Gbps)
- Red Hat Enterprise; Intel, PGI and Pathscale Fortran compilers; MPICH 1.2.6; PBSPro

Simba vs. Macbeth
[Figure: 2D turbulent flow with 0.5 million grid cells. Wall time (s) and wall time/cells/iterations versus number of processors, for Simba (Lahey) and Macbeth (Intel).]
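The benchmark plot normalizes wall time by grid size and iteration count so that runs of different sizes can be compared. A minimal sketch of that arithmetic follows, together with the standard parallel-efficiency ratio. The function names and the timings are made up for illustration; they are not measurements from Simba or Macbeth.

```python
def time_per_cell_iteration(wall_time_s, n_cells, n_iterations):
    """Wall time divided by cells and iterations (seconds)."""
    return wall_time_s / (n_cells * n_iterations)


def parallel_efficiency(time_one_proc_s, time_p_procs_s, n_procs):
    """Speedup divided by processor count; 1.0 means ideal scaling."""
    return time_one_proc_s / (n_procs * time_p_procs_s)


# Hypothetical run: 0.5 million cells, 1000 iterations.
t1 = 100.0   # seconds on 1 processor (invented value)
t4 = 30.0    # seconds on 4 processors (invented value)
metric = time_per_cell_iteration(t1, 500_000, 1000)
efficiency = parallel_efficiency(t1, t4, 4)
print(f"{metric:.2e} s per cell per iteration, efficiency {efficiency:.2f}")
```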

Intel, Pathscale vs. PGI
[Figure: 3D flow with 0.75 million grid cells. Wall time/cells/iterations/processors (s) versus number of processors, for the Intel, Pathscale and PGI compilers.]

Parallel Computing & Partitioning
[Figure: grid divided into partitions]

Parallel Data Structure
Definitions:
- Interface: the face adjoined by two different partitions
- Sending data: the cells of the current partition adjoined to the interface
- Receiving data: the cells of all partitions except the current partition adjoined to the interface
[Figure: interface, current partition, receiving data, sending data]

Exchanging matrix
- Each element is the number of data sent by the column partition to the row partition
- Zero diagonal: no data exchange inside a partition
- The sum of a row is the total number of data received by the partition of that row
- The sum of a column is the total number of data sent by the partition of that column
[Figure: exchanging matrix, with rows labeled by receiving partition (processor) and columns by sending partition (processor)]
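The definitions above can be made concrete with a small sketch that builds the send lists and the exchanging matrix from a partitioned grid. It assumes the grid is given as a partition id per cell plus a list of faces, each face joining two cells. The function name `build_exchange`, the 0-based numbering and the five-cell grid are illustrative assumptions, not the GEMS data structures.

```python
def build_exchange(cell_partition, faces):
    """Return (matrix, send_cells) for a partitioned grid.

    cell_partition: list giving the partition id (0-based) of each cell.
    faces: iterable of (cell_a, cell_b) pairs, one per face between two cells.
    matrix[row][col] = number of cells sent by partition col to partition row.
    send_cells[(src, dst)] = sorted cells of src adjoining an interface with dst.
    """
    n_parts = max(cell_partition) + 1
    send_sets = {}
    for a, b in faces:
        pa, pb = cell_partition[a], cell_partition[b]
        if pa == pb:
            continue  # interior face: no exchange inside a partition
        # Each cell on an interface is sent to the partition on the other side.
        send_sets.setdefault((pa, pb), set()).add(a)
        send_sets.setdefault((pb, pa), set()).add(b)

    matrix = [[0] * n_parts for _ in range(n_parts)]
    for (src, dst), cells in send_sets.items():
        matrix[dst][src] = len(cells)  # column sends, row receives
    send_cells = {key: sorted(cells) for key, cells in send_sets.items()}
    return matrix, send_cells


# Hypothetical grid: cells 0-2 in partition 0, cells 3-4 in partition 1.
partition_of_cell = [0, 0, 0, 1, 1]
grid_faces = [(0, 1), (1, 2), (3, 4), (0, 3), (1, 3), (2, 4)]
matrix, send_cells = build_exchange(partition_of_cell, grid_faces)

received = [sum(row) for row in matrix]        # row sums
sent = [sum(col) for col in zip(*matrix)]      # column sums
print(matrix, received, sent)
```

The receive list of a partition is the send list of its neighbor, so only the send lists need to be stored. In an MPI code the matrix entries would size the message buffers.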

Cell list for sending and receiving
[Figure: the cell list for sending in one partition and the matching cell lists for receiving in partitions 2 and 5]

Representative Applications
Constant Volume Combustion Turbine System
- Calculation in only one sector is necessary, since the flow in the other sectors experiences the same conditions at different times
- The boundary condition at the sector interface is provided by the solution in the same sector at an earlier time step, which is determined by the firing order
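The time-lagged boundary condition for the single-sector calculation can be sketched with a fixed-length history buffer. The sketch assumes the lag is a known whole number of time steps, which in practice would follow from the firing order. The class name `LaggedInterface` and the integer stand-in for the interface state are hypothetical.

```python
from collections import deque


class LaggedInterface:
    """Serve the interface state stored lag_steps time steps earlier."""

    def __init__(self, lag_steps, initial_state):
        if lag_steps < 1:
            raise ValueError("lag_steps must be at least 1")
        # Until enough history exists, the initial state is used.
        self.history = deque([initial_state] * lag_steps, maxlen=lag_steps)

    def boundary_state(self):
        # Oldest entry: the state recorded lag_steps steps ago.
        return self.history[0]

    def record(self, state):
        # Appending to a full deque discards the oldest entry.
        self.history.append(state)


# Toy time loop: the "state" is just the step number.
interface = LaggedInterface(lag_steps=3, initial_state=None)
applied = []
for step in range(6):
    applied.append(interface.boundary_state())  # boundary condition this step
    interface.record(step)                      # store this step's solution
print(applied)
```

In a real solver the recorded state would be the array of flow variables on the sector interface rather than an integer.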

Pulse Detonation Engine and Turbine Interaction Research
[Figure: temperature and pressure contours]

Summary
- Unified parallel framework for dealing with multiphysics problems
- Parallel computational implementation
- Generalized form with divergence, curl and gradient
- Potential to apply in the fast-growing field of grid computing
- A variety of interesting physical phenomena and the efficacy of the computational implementation

Thanks. Any questions?