Parallel Multi-Zone Methods for Large-Scale Multidisciplinary Computational Physics Simulations

Ding Li, Guoping Xia and Charles L. Merkle
Purdue University

The 6th International Conference on Linux Clusters: The HPC Revolution 2005
Chapel Hill, NC, April 25-28, 2005

Presentation Outline
- Multidisciplinary Numerical Analysis System
- GEMS code
- Generalized Equations of Motion
- Linux Clusters and Benchmarks
- Parallel Implementation
- Representative Applications
Multidisciplinary Computational Physics
- Multi-physics: structures, plasma dynamics, fluid dynamics, electromagnetics, radiative energy transfer and neutron transport
- Different approaches:
  - loosely coupled, individual codes
  - closely coupled, solved simultaneously
- Unified framework: a general conservation law

Numerical Analysis System
[Diagram: CAD, grid generator (PGRID), property modules and data mining/visualization connected through a data repository to the GEMS solver]
GEMS code
- Preconditioned, multiple-time algorithms; general equation in finite-volume form:

    \Gamma V \frac{\partial Q_p}{\partial \tau} + \sum_{i=1}^{N} F^{\mathrm{inv}}_{n,i} A_i - \sum_{i=1}^{N} F^{\mathrm{vis}}_{n,i} A_i = S V

- Structured-unstructured grids
- Fluid-solid model:

    \frac{\partial (\rho \vec{v})}{\partial t} + \nabla \cdot (\rho \vec{v} \vec{v}) = -\nabla p + \nabla \cdot (\bar{\tau} + \bar{\sigma}), \qquad \bar{\sigma} = 2 \mu_{SM} \left( \bar{e} - \tfrac{1}{3} \bar{\delta} \, \mathrm{trace}(\bar{e}) \right)

- Cluster computing
- Multiple physical zones (e.g., electromagnetics)

Generalized Equations
Generic set of partial differential equations, in differential form:

    \Gamma \frac{\partial Q_p}{\partial \tau} + \frac{\partial Q}{\partial t} + \nabla \cdot F_D + \nabla \times F_C + \nabla \Phi = S

integrated over a control volume:

    \Gamma \frac{\partial}{\partial \tau} \int Q_p \, dV + \frac{\partial}{\partial t} \int Q \, dV + \int \nabla \cdot F_D \, dV + \int \nabla \times F_C \, dV + \int \nabla \Phi \, dV = \int S \, dV

and converted to surface integrals:

    \Gamma \frac{\partial}{\partial \tau} \int Q_p \, dV + \frac{\partial}{\partial t} \int Q \, dV + \oint \hat{n} \cdot F_D \, d\sigma + \oint \hat{n} \times F_C \, d\sigma + \oint \hat{n} \, \Phi \, d\sigma = \int S \, dV

The divergence term carries the normal flux, the curl term the tangential flux, and the gradient term the scalar flux.
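To make the generic form concrete, here is a worked mapping of two of the physics listed earlier onto the operators Q, F_D, F_C and Phi (our own illustration, not taken from the slides):

    % Fluid momentum (divergence + gradient fluxes): the momentum equation
    % above is recovered from the generic form with
    Q = \rho \vec{v}, \quad F_D = \rho \vec{v}\vec{v} - \bar{\tau} - \bar{\sigma}, \quad F_C = 0, \quad \Phi = p, \quad S = 0.

    % Faraday's law from electromagnetics (a pure curl flux):
    \frac{\partial \vec{B}}{\partial t} + \nabla \times \vec{E} = 0
    \quad\Longleftrightarrow\quad
    Q = \vec{B}, \quad F_C = \vec{E}, \quad F_D = 0, \quad \Phi = 0, \quad S = 0.

This is why a single framework with divergence, curl and gradient fluxes can host fluid, solid and electromagnetic zones in one solver.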
Number of Partial Differential Equations in Various Fields
[Table: number of governing PDEs solved in each physical field]

Multi-Physics Zone Method
- Distinct physics zones: different media, different equations
- Parallel processing: each zone divided into sub-clusters
- Load balancing by the number of equations and the size of the grids (a toy version of this rule is sketched below)
[Figure: two physics zones mapped onto Cluster1 and Cluster2]
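A minimal sketch of the load-balancing rule described above, weighting each zone by (number of PDEs) x (number of grid cells) and apportioning processors proportionally. The zone names, equation counts and cell counts are invented for illustration, not taken from the slides:

```c
#include <stdio.h>

int main(void) {
    const char *zone[] = {"fluid", "solid", "electromagnetics"};
    int  n_eqs[]   = {5, 1, 6};                /* PDEs per zone (assumed)   */
    long n_cells[] = {500000, 200000, 300000}; /* grid cells (assumed)      */
    int  n_procs   = 32;                       /* total processors available */

    /* Total work: sum over zones of equations x cells. */
    double total = 0.0;
    for (int z = 0; z < 3; z++) total += (double)n_eqs[z] * n_cells[z];

    /* Assign processors in proportion to each zone's share of the work. */
    for (int z = 0; z < 3; z++) {
        double w = (double)n_eqs[z] * n_cells[z];
        int p = (int)(n_procs * w / total + 0.5);  /* round to nearest */
        printf("%-16s weight %.2e -> %d processors\n", zone[z], w, p);
    }
    return 0;
}
```

Rounded counts may not sum exactly to the processor total; a real partitioner would repair the remainder and also account for per-equation cost differences.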
Linux Clusters
Simba (2001)
- 51 nodes: single P4 1.8 GHz CPU, 1 GB RAM, 10/100 Ethernet
- RedHat 9.0, Lahey Fortran compiler, MPICH 1.2.4, PBS

Macbeth (2005)
- 98 nodes: dual AMD Opteron 1.8 GHz CPUs, 4 GB RAM, InfiniBand interconnect: 4x InfiniBand network fabric (10 Gbps)
- RedHat Enterprise; Intel, PGI and Pathscale Fortran compilers; MPICH 1.2.6; PBSPro

Simba vs. Macbeth
[Plot: 2D turbulent flow with 0.5 million grid cells; wall time and wall time/cells/iterations vs. number of processors, for Simba (Lahey) and Macbeth (Intel)]
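For reference, the plotted quantity appears to be wall-clock time normalized by problem size and iteration count; written out explicitly (our reading of the axis label):

    t_{\mathrm{norm}} = \frac{T_{\mathrm{wall}}}{N_{\mathrm{cells}} \times N_{\mathrm{iterations}}}

which makes runs with different grid sizes and iteration counts directly comparable across the two clusters.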
Intel, Pathscale vs. PGI
[Plot: 3D flow with 0.75 million grid cells; wall time/cells/iterations/processors vs. number of processors, for the Intel, Pathscale and PGI compilers]

Parallel Computing & Partitioning
[Figure: a grid divided into N partitions, one per processor]
Parallel Data Structure
Definitions:
- Interface: a face shared by two different partitions
- Sending data: the cells of the current partition adjacent to the interface
- Receiving data: the cells of all other partitions adjacent to the interface
[Figure: current partition with its interface, sending cells and receiving cells marked]

Exchange matrix (rows index the receiving partition, columns the sending partition):
- Element (i, j) is the number of data items partition j sends to partition i
- The diagonal is zero: no data exchange inside a partition
- The sum of row i is the total number of data items received by partition i
- The sum of column j is the total number of data items sent by partition j
[Figure: example 5x5 exchange matrix for partitions 1-5]
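A minimal sketch of how such an exchange matrix could drive the interface communication, using MPI point-to-point messages (the slides report MPICH but show no code; the function name, the toy 2-partition matrix and the double-precision payload are all our own assumptions):

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

/* Halo exchange driven by an exchange matrix X, where X[i*nparts+j] is
 * the number of values partition j sends to partition i: rows index the
 * receiver, columns the sender, and the diagonal is zero, as on the slide. */
static void exchange_interface_data(const int *X, int nparts, int rank,
                                    double **sendbuf, double **recvbuf,
                                    MPI_Comm comm)
{
    MPI_Request *reqs = malloc(2 * nparts * sizeof *reqs);
    int nreq = 0;
    for (int p = 0; p < nparts; p++) {
        int nrecv = X[rank * nparts + p]; /* my row: data received from p */
        int nsend = X[p * nparts + rank]; /* my column: data sent to p    */
        if (nrecv > 0)
            MPI_Irecv(recvbuf[p], nrecv, MPI_DOUBLE, p, 0, comm, &reqs[nreq++]);
        if (nsend > 0)
            MPI_Isend(sendbuf[p], nsend, MPI_DOUBLE, p, 0, comm, &reqs[nreq++]);
    }
    MPI_Waitall(nreq, reqs, MPI_STATUSES_IGNORE); /* complete all transfers */
    free(reqs);
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nparts;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nparts);

    /* Toy exchange matrix for 2 partitions: each sends 3 values to the other. */
    int X[4] = {0, 3,
                3, 0};
    if (nparts != 2) { MPI_Finalize(); return 0; }

    double sdata[3] = {1.0 + rank, 2.0 + rank, 3.0 + rank}, rdata[3];
    double *sendbuf[2], *recvbuf[2];
    sendbuf[0] = sendbuf[1] = sdata;
    recvbuf[0] = recvbuf[1] = rdata;

    exchange_interface_data(X, nparts, rank, sendbuf, recvbuf, MPI_COMM_WORLD);
    printf("rank %d received %g %g %g\n", rank, rdata[0], rdata[1], rdata[2]);
    MPI_Finalize();
    return 0;
}
```

Posting all receives and sends before a single MPI_Waitall lets every pair of partitions exchange interface data concurrently, which is the point of precomputing the matrix and the cell lists described next.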
Cell Lists for Sending and Receiving
[Figure: the sending cell list of the current partition and the matching receiving cell lists in partitions 2 and 5, with local cell numberings shown for partitions 1 and 4]

Representative Applications
Constant Volume Combustion Turbine System
- Calculation in only one sector is necessary, since the flow in the other sectors experiences the same conditions at different times
- The boundary condition at the sector interface is provided by the solution in the same sector at an earlier time step, determined by the firing order (written out below)
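One way to write this sector boundary condition explicitly (our notation; the slides give only the verbal description):

    \phi(\theta_s, t) = \phi(\theta_s, \, t - \Delta t_{\mathrm{fire}})

where phi is any transported variable at the sector interface theta_s and Delta t_fire is the time offset between adjacent sectors implied by the firing order, so a single sector's history supplies the boundary data for the whole annulus.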
Pulse Detonation Engine and Turbine Interaction Research
[Figures: temperature and pressure contours]
Summary
- A unified parallel framework for multi-physics problems
- Parallel computational implementation
- Generalized equation form with divergence, curl and gradient fluxes
- Potential application to the fast-growing field of grid computing
- A variety of interesting physical phenomena demonstrating the efficacy of the computational implementation

Thanks. Any questions?