Parallel spacetime approach to turbulence. Peter V. Coveney (1), Bruce M. Boghosian (2), Luis Fazendeiro (1), Derek Groen (1). DEISA PRACE Symposium 2011, April 13-14 2011, Helsinki, Finland. p.v.coveney@ucl.ac.uk. (1) Centre for Computational Science, UCL, London, UK. (2) Tufts University, Department of Mathematics, Medford, USA.
Contents 1) Navier-Stokes Equations. 2) Unstable Periodic Orbits. 3) Dynamical Zeta Function. 4) HYPO4D (HYdrodynamic Periodic Orbits in 4D). 5) Computational aspects. 6) JUGENE optimization work and results. 7) Prospects.
1 - Navier-Stokes Equations Claude-Louis Navier (1785-1836) George Gabriel Stokes (1819-1903) Nonlinearity arises directly from the kinematic aspects of the problem. Typical control parameter, Reynolds number: Re = UL/ν (U, L: characteristic velocity and length; ν: kinematic viscosity). Re ~ 10^6 for many common situations; algorithmic complexity scales with Re^3. After the onset of turbulence, systems exhibit large fluctuations on many length and time scales. Solution to the 3D NSE (proof of existence and smoothness) still an open problem (Temam, 1984).
1 - Navier-Stokes Equations Kolmogorov picture of turbulence: Kolmogorov length: η = (ν^3/ε)^(1/4) (ε: energy dissipation rate per unit mass). "Big whorls have little whorls / Which feed on their velocity / And little whorls have lesser whorls / And so on to viscosity" - Lewis F. Richardson. Dynamical systems approach: strange attractor is closure of the set of all the UPOs (in the neighbourhood of which the system will spend most of the time). For 2D NSE in the general case: finite-dimensional attracting set (scaling as a power of Re). For 3D NSE: proven with certain assumptions (Constantin, Foias, Temam, 1985).
Example of a strange attractor: Lorenz system. σ = Prandtl number, ρ = Rayleigh number. ρ = 28, σ = 10, β = 8/3. E. N. Lorenz, Deterministic nonperiodic flow, J. Atmos. Sci. 20, 130 (1963). Picture: http://en.wikipedia.org/wiki/lorenz_attractor
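The Lorenz equations behind this slide can be integrated in a few lines. The sketch below uses the parameter values quoted above with a classical fourth-order Runge-Kutta step; the time step, the initial condition and all function names are illustrative assumptions, not taken from the talk.

```python
# Lorenz (1963) system integrated with classical fourth-order Runge-Kutta.
# Parameter values are the ones quoted on the slide.
SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0


def lorenz(state):
    """Right-hand side of the Lorenz equations for state = (x, y, z)."""
    x, y, z = state
    return (SIGMA * (y - x), x * (RHO - z) - y, x * y - BETA * z)


def rk4_step(state, dt):
    """Advance the state by one RK4 step of size dt."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    k1 = lorenz(state)
    k2 = lorenz(shifted(k1, dt / 2))
    k3 = lorenz(shifted(k2, dt / 2))
    k4 = lorenz(shifted(k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def trajectory(state, dt=0.01, steps=5000):
    """Return the list of states visited, including the initial one."""
    points = [state]
    for _ in range(steps):
        state = rk4_step(state, dt)
        points.append(state)
    return points
```

Calling trajectory((1.0, 1.0, 1.0)) gives a chaotic trajectory that, after a short transient, wanders over the two lobes of the attractor.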
2 - Unstable Periodic Orbits Role of UPOs has been acknowledged since the work of Poincaré (founder of modern dynamical systems theory). Typical trajectory will wander incessantly in a sequence of close approaches to the UPOs. Analogy from statistical mechanics: the set of UPOs can be viewed as the microstates from which a macroscopic description of the system can be calculated. Accuracy of predictions is limited by the (non-composite) UPO of smallest period which we fail to include. http://www.chaosbook.org/ by Cvitanovic et al.
2 - Unstable Periodic Orbits Abstract dynamical landscape with peaks representing the UPOs. Chaotic trajectory can be visualized as the motion of a ball rolling on this abstract dynamical landscape. Motion of the ball will be strongly affected by the sharpness of the peak (stability of the UPOs). Relative height of the peaks is related to the natural measure of the indicated UPOs (straight lines are just visual aids in identifying the corresponding cyclic pieces of the period-2 cycles in the diagram). from: http://www.scholarpedia.org/article/unstable_periodic_orbits Curator: Dr. Paul So, George Mason University, Fairfax, VA
2 - Unstable Periodic Orbits In traditional fluid turbulence, observables are computed as averages over many time frames: Alternative approach is using a generating function: Then the moments of the PDF of an observable A are:
3 - Dynamical Zeta Function If a sufficient number of UPOs is known, the DZF can be calculated; DZF expression: Generating function: Computation of expectation value:
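The formulas for this slide and the previous one did not survive in the text. For orientation, the block below gives the standard forms from the cycle-expansion literature (the ChaosBook conventions); the notation on the original slides may differ. Here p runs over prime UPOs, T_p is the period, A_p the observable integrated once around the orbit, Λ_p the product of the expanding stability multipliers, and the average inside the logarithm is taken over initial conditions.

```latex
% Time average of an observable a along a trajectory x(t)
\langle a \rangle = \lim_{t\to\infty} \frac{1}{t} A^t , \qquad
A^t = \int_0^t a\bigl(x(\tau)\bigr)\, d\tau

% Generating function; its first derivative at beta = 0 is the mean
s(\beta) = \lim_{t\to\infty} \frac{1}{t} \ln \bigl\langle e^{\beta A^t} \bigr\rangle , \qquad
\langle a \rangle = \left. \frac{\partial s}{\partial \beta} \right|_{\beta=0}

% Dynamical zeta function as a product over prime UPOs p
\frac{1}{\zeta(\beta, s)} = \prod_p \bigl( 1 - t_p \bigr) , \qquad
t_p = \frac{e^{\beta A_p - s T_p}}{|\Lambda_p|}

% Expectation value from derivatives of 1/zeta, evaluated at beta = 0, s = s(0)
\langle a \rangle = \frac{\langle A \rangle_\zeta}{\langle T \rangle_\zeta} , \qquad
\langle A \rangle_\zeta = - \frac{\partial}{\partial \beta} \frac{1}{\zeta} , \qquad
\langle T \rangle_\zeta = \frac{\partial}{\partial s} \frac{1}{\zeta}
```

The last line follows from implicit differentiation of 1/ζ(β, s(β)) = 0, where s(β) is the leading zero of 1/ζ.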
3 - Dynamical Zeta Function Comparison between the two approaches for the Lorenz equations: exact polynomial expansion (left) versus noisy time integration (right). B. M. Boghosian et al., Unstable Periodic Orbits in the Lorenz attractor, Phil. Trans. R. Soc. A, 2011 (in press).
3 - Dynamical Zeta Function Main advantages of this approach: Accuracy is high and converges quickly with the number of known low-period UPOs. No need to redo the initial value problem every time we wish to compute the average of some quantity. Averages are no longer stochastic in nature: we have an exact expansion to compute them.
4 - HYPO4D Kawahara & Kida (2001) found two periodic solutions in plane Couette flow, using spectral methods. Novel efficient variational algorithm to locate UPOs [1]. Tested on Lorenz model and other low-dimensional systems. Good convergence to the UPOs, including 2D fluid. Plot Δ(t,T): find initial guess (minimum) for whole trajectory, and value of T. Numerically relax minima (in 4D) towards finding UPO (and T). [1] B. M. Boghosian et al., New Variational Principles for Locating Periodic Orbits of Differential Equations, accepted for publication, Phil. Trans. R. Soc. A, 2011.
4 - HYPO4D (lattice Boltzmann) Fluid flow (nearly incompressible NSE) is simulated using the Lattice Boltzmann Method [1,2]. [1] S. Chen, G. D. Doolen, Annu. Rev. Fluid Mech., 30, 329-364, 1998. [2] S. Succi, The LBE for Fluid Dynamics and Beyond, Oxford University Press, Oxford, 2001.
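The search for initial guesses can be illustrated on a toy trajectory. The sketch below assumes that Δ(t,T) measures the squared distance between the states at times t and t+T (the exact definition did not survive on the slide); the function names and the circular test trajectory are hypothetical.

```python
import math


def delta(traj, t, T):
    """Squared distance between the states at steps t and t + T."""
    return sum((a - b) ** 2 for a, b in zip(traj[t + T], traj[t]))


def best_period(traj, t, T_min, T_max):
    """Return the T in [T_min, T_max] that minimizes delta at fixed t."""
    return min(range(T_min, T_max + 1), key=lambda T: delta(traj, t, T))


# Toy trajectory: motion on a circle that repeats exactly every 50 steps.
PERIOD = 50
traj = [
    (math.cos(2 * math.pi * n / PERIOD), math.sin(2 * math.pi * n / PERIOD))
    for n in range(300)
]
```

For this trajectory best_period(traj, 10, 20, 80) picks out the period of 50 steps. In a turbulent flow the minima of Δ are shallow rather than exactly zero, which is why a subsequent relaxation step is needed.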
4 - HYPO4D (lattice Boltzmann) Communication pattern between processes: only the halo values need to be sent to nearest neighbours. HYPO4D: Linear scaling up to 16K cores on Ranger (TACC), TG08 scalability award. Main advantages of LBM: Local collision operator + linear streaming operator (nearly embarrassingly parallel). Minimal set of velocities. No need for an extra equation for the pressure term.
4 - HYPO4D In the halo-exchange step, all MPI communications are non-blocking, in order to prevent deadlock. No aggressive optimization pursued, so that code can be (seamlessly) deployed on a great number of platforms. Perfectly linear scaling on the Cray XT5 machine Kraken (NICS, Oak Ridge Lab.), up to 65K cores.
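The locality of the method can be seen in a minimal sketch. The code below is not HYPO4D: it is a one-dimensional, three-velocity (D1Q3) BGK model in pure Python, with an assumed relaxation time, and the "halo exchange" is simulated in-process rather than with MPI. It shows that collision uses only on-site data and that streaming on a subdomain needs just one boundary site from each neighbour.

```python
import math

W = (2 / 3, 1 / 6, 1 / 6)  # D1Q3 lattice weights
C = (0, 1, -1)             # rest, right-moving, left-moving velocities
TAU = 0.8                  # hypothetical BGK relaxation time


def equilibrium(rho, u):
    """Second-order equilibrium populations for density rho, velocity u."""
    return [w * rho * (1 + 3 * c * u + 4.5 * (c * u) ** 2 - 1.5 * u * u)
            for w, c in zip(W, C)]


def collide(site):
    """Local BGK collision: uses only the populations at one site."""
    rho = sum(site)
    u = (site[1] - site[2]) / rho
    return [f - (f - fe) / TAU for f, fe in zip(site, equilibrium(rho, u))]


def stream(cells):
    """Periodic streaming on the whole domain: population i moves by C[i]."""
    n = len(cells)
    return [[cells[(x - c) % n][i] for i, c in enumerate(C)] for x in range(n)]


def stream_with_halo(interior, left_halo, right_halo):
    """Streaming on one subdomain, given one halo site on each side.
    (Whole sites are passed for brevity; only one population per side is used.)"""
    padded = [left_halo] + interior + [right_halo]
    return [[padded[x + 1 - c][i] for i, c in enumerate(C)]
            for x in range(len(interior))]


def step_global(cells):
    """One collide-and-stream step on the undivided periodic domain."""
    return stream([collide(s) for s in cells])


def step_decomposed(parts):
    """One step on a ring of subdomains; only edge sites cross boundaries."""
    post = [[collide(s) for s in part] for part in parts]
    k = len(post)
    return [stream_with_halo(post[p], post[(p - 1) % k][-1], post[(p + 1) % k][0])
            for p in range(k)]


def initial_state(n):
    """Fluid at rest with a small sinusoidal density perturbation."""
    return [equilibrium(1.0 + 0.1 * math.sin(2 * math.pi * x / n), 0.0)
            for x in range(n)]
```

Running step_global on 16 sites and step_decomposed on the same sites split into two halves gives the same populations, which is the property that makes the nearest-neighbour communication pattern sufficient.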
4 - HYPO4D Current approach: fluid starts from rest, periodic boundary conditions in all directions (isotropic, homogeneous turbulence), ABC (Arnold-Beltrami-Childress) type force [1] applied. Numerical stability tests applied at all sites in the lattice at every time step, convergence tests and other quantities measured at regular intervals. After a certain threshold of the Reynolds number, inertia overcomes acceleration and (steady) turbulent behaviour sets in. [1] L. Fazendeiro et al., Unstable Periodic Orbits in Weak Turbulence, JOCS, 1, 13-23, 2010.
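The force expression is missing from the slide text. The sketch below uses the standard ABC field as usually written in the literature; the amplitudes A, B, C, the wavenumber and the function names are assumptions, since the values used in HYPO4D are not given here.

```python
import math


def abc_force(x, y, z, A=1.0, B=1.0, C=1.0, k=1.0):
    """Standard Arnold-Beltrami-Childress field at the point (x, y, z).
    A, B, C and k are illustrative defaults, not the values used in HYPO4D."""
    return (
        A * math.sin(k * z) + C * math.cos(k * y),
        B * math.sin(k * x) + A * math.cos(k * z),
        C * math.sin(k * y) + B * math.cos(k * x),
    )


def abc_force_on_lattice(i, j, l, n=64, **amplitudes):
    """Same field on an n^3 periodic lattice, with one wavelength per box side."""
    return abc_force(i, j, l, k=2 * math.pi / n, **amplitudes)
```

Each component is independent of its own coordinate, so the field is divergence-free, and choosing k as a multiple of 2π/n keeps it compatible with the periodic boundary conditions.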
4 - HYPO4D: convergence test Cubic lattice, 64^3, varying LB relaxation time. Maximum Re ~ 500, taking the velocity averaged over many time steps; this is DNS, with no modelling included. Convergence test for the velocity field, taken between consecutive time steps. Notice sharp increase after inertia of the fluid becomes predominant (vertical axis scale is logarithmic). (L. Fazendeiro et al., Unstable Periodic Orbits in Weak Turbulence, JOCS, 1, 13-23, 2010).
4 - HYPO4D: Locating candidate orbits System is 64^3, with τ = 0.53 (LB relaxation time), Re = 371. Vertical axis is T, horizontal axis is t, color code gives:
4 - HYPO4D: Locating candidate orbits Detail of previous plot, showing several (purplish) minima/areas of interest. Sampling rate = 500 (.xdr format for I/O).
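The link between relaxation time and Reynolds number can be sketched as follows, assuming that the value 0.53 quoted above is the LB relaxation time and that the standard BGK relation between relaxation time and viscosity applies (lattice units, sound speed squared 1/3). The characteristic velocity and length behind Re = 371 are not stated, so the reynolds helper is generic and its arguments are hypothetical.

```python
def lb_viscosity(tau, cs2=1.0 / 3.0):
    """Kinematic viscosity of a BGK lattice Boltzmann fluid, in lattice units.
    Assumes the standard relation nu = cs^2 * (tau - 1/2)."""
    return cs2 * (tau - 0.5)


def reynolds(u, length, nu):
    """Reynolds number from a characteristic velocity, length and viscosity."""
    return u * length / nu
```

Under these assumptions a relaxation time of 0.53 gives a viscosity of about 0.01 in lattice units, so lowering the relaxation time towards 0.5 is what raises Re on a fixed lattice.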
4 - HYPO4D: Locating candidate orbits (now at fixed time t) Same as before, but this time fixed t, in order to determine optimal T, sampling rate = 1.
5 - Computational aspects: spacetime relaxation Memory is the critical resource for the full 4D relaxation procedure. For T = 24.5K, we have to keep in memory at least 64^3 × 19 (number of LB velocities) × 2.45 × 10^4 variables × 8 bytes (double precision) ~ 1 TB; steepest descent (SD) then needs 5 copies and conjugate gradient (CG) 8! Minimization algorithm: define functional, compute gradient.
5 - Computational aspects: spacetime relaxation Left: t=177k, T=26864 orbit, RMS value of F per lattice site = 1.7504e-05, after 340 SD iterations. Right: t=637k, T=24594 orbit, RMS value of F per lattice site = 1.4076e-05 after 300 SD iterations.
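The idea of relaxing a whole closed orbit at once can be shown on a one-dimensional map. The sketch below is a stand-in, not the HYPO4D functional: it assumes a functional of the form F = 1/2 Σ_t |x(t+1) - g(x(t))|^2 with the orbit closed in time, uses the logistic map in place of one lattice Boltzmann time step, and applies plain steepest descent with a hypothetical fixed step size.

```python
def g(x):
    """Logistic map at r = 4, standing in for one time step of the dynamics."""
    return 4.0 * x * (1.0 - x)


def dg(x):
    """Derivative of the map."""
    return 4.0 * (1.0 - 2.0 * x)


def residuals(orbit):
    """Mismatch between each state's image and the next state (periodic in t)."""
    T = len(orbit)
    return [orbit[(t + 1) % T] - g(orbit[t]) for t in range(T)]


def functional(orbit):
    """F = 1/2 * sum of squared residuals; zero only on a true periodic orbit."""
    return 0.5 * sum(r * r for r in residuals(orbit))


def gradient(orbit):
    """dF/dx_t = r_{t-1} - g'(x_t) * r_t, with indices taken modulo T."""
    r = residuals(orbit)
    return [r[t - 1] - dg(orbit[t]) * r[t] for t in range(len(orbit))]


def relax(orbit, step=0.05, iterations=2000):
    """Steepest descent on all orbit points simultaneously."""
    orbit = list(orbit)
    for _ in range(iterations):
        orbit = [x - step * gx for x, gx in zip(orbit, gradient(orbit))]
    return orbit
```

Starting from the rough guess [0.3, 0.95], relax converges to the unstable period-2 orbit of the map, ((5 - √5)/8, (5 + √5)/8). Forward iteration would never stay on this orbit because it is unstable, whereas the relaxation treats every point of the orbit as an unknown, which is also why the whole spacetime field has to be held in memory.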
5 - Computational aspects: vorticity field Magnitude of the vorticity field, thresholded, for the t=177k, T=26864 UPO, displaying very structured patterns (perhaps smaller UPOs may be escaping detection...).
5 - Computational aspects Memory is the critical resource for the full 4D relaxation procedure. For the minima shown previously, assuming T = 3 × 10^4 time steps, we have to hold in memory 64^3 × 19 (number of velocities) × 3 × 10^4 variables × 8 bytes (double precision) ~ 1.2 TB. A single minimization run can take ~1M SUs or more (~24h wall clock time). Thus, a very large number of computing cores is required (order of T).
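The memory estimate is simple arithmetic and can be checked directly. The function name and defaults below are illustrative; the copy counts (5 for steepest descent, 8 for conjugate gradient) are the ones quoted earlier in this section.

```python
def spacetime_bytes(n_side=64, n_velocities=19, bytes_per_value=8, n_steps=30_000):
    """Bytes needed to hold one copy of the full spacetime LB field."""
    return n_side ** 3 * n_velocities * bytes_per_value * n_steps


one_copy_tb = spacetime_bytes() / 1e12                         # T = 3 x 10^4 steps
short_orbit_tb = spacetime_bytes(n_steps=24_500) / 1e12        # T = 24.5K steps
sd_tb = 5 * one_copy_tb                                        # steepest descent
cg_tb = 8 * one_copy_tb                                        # conjugate gradient
```

One copy comes to about 1.2 TB for T = 3 × 10^4 and just under 1 TB for T = 24.5K, so the working set of the minimizer is several terabytes.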
6 - Optimizing HYPO4D for JUGENE We optimized HYPO4D on several occasions: PRACE early access call, project PRA020, August-November 2010 allocation. PRACE Extreme Scaling Workshop 2011 in Juelich. Code optimizations include: Improved halo_exchange: 3x fewer synchronization points, ~3.8x lower bandwidth use. Optimized collective operations: previous implementation led to performance degradation and crashes at higher core counts.
6 - Optimizing HYPO4D for JUGENE Linear scaling up to 256K cores on JUGENE (IBM BlueGene/P), after the recent optimizations in the collectives and halo_exchange.
6 - PRACE Early Access results On JUGENE: 17M core hours, with 7 new UPOs (the largest so far). Post-processing work still ongoing. About 15 TB of data, shipped back to UCL (and US storage facilities) via globus-url-copy. We now possess several UPOs, which may go a long way towards an alternative description (via DZF formalism) of turbulent flow.
7 - Prospects Further characterization of the turbulent state (energy dissipation rate, vorticity, enstrophy, etc). Relaxation procedure in order to identify prime (smaller period) UPOs in 3D NSE (and make sure we identify all the smallest ones). Implementation of more sophisticated minimization algorithms, using Newton-Krylov solvers and GMRES (maintaining spacetime variational principle).
7 - Prospects Ongoing work: new GMRES algorithm, coupled with Poincaré sections (in order to cut down on huge number of variables - LB distribution functions). Goal is to drive the value of the functional even further down (required to compute stability eigenvalues). Poincaré section (extension of HYPO4D) part coupled with PETSc (http://www.mcs.anl.gov/petsc/petsc-as/) package (uses MPI, deployed on very many large resources).
7 - Prospects Long-term goal: Compare averages from several UPOs (via DZF) with time forward averaging. Classification of the (prime) UPOs identified + creation and maintenance of a digital library of such orbits. Compute dynamical zeta function for turbulent hydrodynamics!
Acknowledgements Special thanks go to: PRACE; Jülich Supercomputing Centre (JSC); Wolfgang Frings (JSC); Jutta Docter (JSC).