The hybridized DG methods for WS1, WS2, and CS2 test cases P. Fernandez, N.C. Nguyen and J. Peraire Aerospace Computational Design Laboratory Department of Aeronautics and Astronautics, MIT 5th High-Order Workshop Sunday, January 7 th, 218
WS1: Taylor-Green Vortex
Case description Flow conditions: Re = U L / ν = 1,6 and Solver: DIGASO M = U / a =.1 Discretization scheme: - Space: 3rd-order IEDG - Time: 3rd-order DIRK(3,3) Quadrature rule: Gauss-Legendre with exact integration of polynomials up to 5th-order Number of elements: 64 x 64 x 64 hexes Baseline CFL number:.375 Studies presented: - Riemann solver - SGS model: ILES, Smagorinsky, WALE, Vreman - Time-step size 3 Source: G. Giangaspero et al., HiOCFD3, 215
Effect of Riemann solver Re = 16 Time evolution of kinetic energy dissipation rate ( 1/ρ U 3 L 2) -dek/dt.14.12.1.8.6.4.2 σ =(A n + A n )/2 σ = A n σ = λ max DNS 5 1 15 tu /L! 1=; U 3 L 2" & i Ek Numerical dissipation of kinetic energy #1-4 18 16 14 12 1 8 6 4 2 < = (A n + ja n j)=2 < = ja n j < = 6 max Subgrid scales appear in the flow -2 5 1 15 t U =L #1-3 16 #1-3 16 14 12 < = (A n + ja n j)=2 < = ja n j < = 6 max 14 12 < = (A n + ja n j)=2 < = ja n j < = 6 max Re =! 1=; U 3 L 2" -dek=dt 1 8 6 4! 1=; U 3 L 2" & i Ek 1 8 6 4 2-2 5 1 15 t U =L 2 Subgrid scales appear in the flow -2 5 1 15 t U =L Laminar regime Turbulent regime 4 Laminar regime Turbulent regime
Effect of Riemann solver Time evolution of kinetic energy dissipation rate Numerical dissipation of kinetic energy Re = 16 ( 1/ρ U 3 L 2) -dek/dt.14.12.1.8.6.4.2 σ =(A n + A n )/2 σ = A n σ = λ max DNS 5 1 15 tu /L! 1=; U 3 L 2" & i Ek Numerical dissipation 2 orders of magnitude smaller than total dissipation #1-4 18 16 14 12 1 8 6 4 2 < = (A n + ja n j)=2 < = ja n j < = 6 max Subgrid scales appear in the flow -2 5 1 15 t U =L #1-3 16 #1-3 16 14 12 < = (A n + ja n j)=2 < = ja n j < = 6 max 14 12 < = (A n + ja n j)=2 < = ja n j < = 6 max Re =! 1=; U 3 L 2" -dek=dt 1 8 6 4! 1=; U 3 L 2" & i Ek 1 8 6 4 2-2 5 1 15 t U =L 2 Subgrid scales appear in the flow -2 5 1 15 t U =L Laminar regime Turbulent regime 4 Laminar regime Turbulent regime
Effect of SGS model ( 1/ρ U 3 L 2) -dek/dt.14.12.1.8.6.4 ILES Smagorinsky WALE Vreman DNS Unphysical dissipation in laminar regime Re = 16! 1=; U 3 L 2" -dek=dt #1-3 16 14 12 1 8 ILES 6 4 2 ILES Smagorinsky WALE Vreman Unphysical dissipation in laminar regime Re =.2 5 1 15 tu /L -2 5 1 15 t U =L Laminar regime Turbulent regime Laminar regime Turbulent regime All SGS models produce unphysical dissipation in the laminar regime Built-in ILES capability in DG more accurate than explicit SGS models to detect SGS scales and add numerical dissipation only under those conditions 5
Kinetic energy spectra at t = 9 L / U ILES with different Riemann solvers Explicit vs. implicit LES 1 3 1 3 1 2 1 2 Re = 16 (1=U 2 L) E1D(k) 1 1 1 1-1 1-2 1-3 1-4 < = (A n + ja n j)=2 < = ja n j < = 6 max \! 5=3" slope 1 1 1 1 2 k L (1=U 2 L) E1D(k) 1 1 1 1-1 1-2 1-3 1-4 ILES Smagorinsky Vreman \! 5=3" slope 1 1 1 1 2 k L 1 2 1 3 < = (A n + ja n j)=2 < = ja n j < = 6 max \! 5=3" slope 1 2 1 3 ILES Smagorinsky Vreman \! 5=3" slope Re = (1=U 2 L) E1D(k) 1 1 1 1-1 1-2 Energy bump due to over-upwinding (1=U 2 L) E1D(k) 1 1 1 1-1 1-2 More premature end of inertial range with explicit models 1-3 1 1 1 1 2 k L 6 1-3 1 1 1 1 2 k L
Effect of time-step size ( 1/ρ U 3 L 2) -dek/dt Time evolution of kinetic energy dissipation rate.14.12.1.8.6.4 CFL =1.5 CFL =.75 CFL =.375 CFL =.1875 DNS ) Π S ( 1/ρ UL 2 cv 12 1 8 6 4 2 1-6 Numerical generation of entropy CFL =1.5 CFL =.75 CFL =.375 CFL =.1875 (Just acoustic waves).2 5 1 15 tu /L -2 5 1 15 tu /L Laminar regime Turbulent regime Laminar regime Turbulent regime Vorticity and entropy modes captured with local CFL ~ 1. Acoustic modes require CFL << 1 (recall M =.1) Optimal LES implementations likely with local CFL > 1: Global CFL >>> 1 for wallbounded flows 7
WS2: Turbulent Channel Flow
Case description Flow conditions: Reτ =182 and 544 M =.1 Channel size: 4πδ x 2δ x 2πδ Solver: DIGASO Discretization scheme: - Space: 3rd- and 5th- order IEDG - Time: 3rd-order DIRK(3,3) Quadrature rule: Gauss-Legendre with exact integration of polynomials up to 5th- and 9th-order, respectively. Number of elements: 48 x 32 x 4 Studies presented: - SGS model Riemann solver study is presented in (Fernandez et al., 217b) Source: P. Blonigan et al., APS DFD Meeting 216 Mesh resolution in wall units Reτ=182 Reτ=544 Δx + 23.8 71.2 Δy + avg 5.69 17. Δy + wall.438 1.31 Δz + 14.3 42.7 Δt +.29.74 9
Effect of SGS model 2 18 16 14 12 ILES Smagorinsky WALE Vreman DNS data (Lee and Moser, 215) Reτ = 182 Reτ = 544 25 2 15 ILES Smagorinsky WALE Vreman DNS data (Lee and Moser, 215) U + 1 U + 8 6 4 2 1 5 1-1 1 1 1 1 2 y + 1-1 1 1 1 1 2 1 3 Viscous layer Buffer layer Log layer Viscous layer Buffer layer Log layer y + ILES more accurate than explicit LES at both Reynolds numbers Viscous sublayer: Accurately resolved in all cases (except Smagorinsky-LES) Log layer: Accurately resolved in all cases 1
Effect of SGS model 2 18 16 14 12 ILES Smagorinsky WALE Vreman DNS data (Lee and Moser, 215) Reτ = 182 Reτ = 544 25 2 15 ILES Smagorinsky WALE Vreman DNS data (Lee and Moser, 215) U + 1 U + 8 6 4 2 1 5 1-1 1 1 1 1 2 y + 1-1 1 1 1 1 2 1 3 Viscous layer Buffer layer Log layer Viscous layer Buffer layer Log layer Buffer layer: Not accurately resolved. All simulations introduce too much numerical transport of momentum (not the same as numerical dissipation!) Ratio momentum transport-to-dissipation due to discretization errors and explicit SGS models is larger than the true SGS value y + 11
Effect of accuracy order 3 rd - and 5 th -order accurate LES have same number of DOFs and time-step size 25 2 Reτ = 544 Third-order ILES Fifth-order ILES DNS data (Lee and Moser, 215) LES predictions improve by increasing accuracy order U + 15 Accurately resolving the large turbulent scales by using a highorder method plays a more important role than that of the subgrid scales 1 5 1-1 1 1 1 1 2 1 3 y + Viscous layer Buffer layer Log layer 12
CS2: T16C LTP cascade
Case description Solver: DIGASO Discretization scheme: - Space: 3rd-order IEDG - Time: 3rd-order DIRK(3,3) Number of elements: 118,68 isoparametric hexes (baseline mesh) Time-step size: Δt = 5 1-3 c / v (CFLglobal = 6.6) LES model: No model (ILES) Riemann solver: Lax-Friedrichs-type in (Fernandez et al., 217b) Quadrature rule: Gauss-Legendre with exact integration of polynomials up to 5th-order Mach number Spanwise vorticity 14
Flow fields on the periodic plane Mach number Spanwise vorticity Time average Instantaneous 15
Non-dimensional grid size 16
Pressure and skin-friction coefficients Two types of inflow/outflow boundary conditions are considered Small differences in Cp and Cf observed 17
Analysis of boundary layer instabilities Amplitude of instabilities - Suction side Amplitude of instabilities - Pressure side Natural transition through 2D unstable modes The nomenclature and details of the post-processing strategy are described in (Fernandez et al., 217a) 18
Suction side boundary layer Displacement and momentum thickness Shape parameter The nomenclature and details of the post-processing strategy are described in (Fernandez et al., 217a) 19
Summary Implicit LES outperformed explicit SGS models for the transition prediction, wallfree turbulence and wall-bounded turbulence cases considered. Minor differences between Riemann solvers: DG methods have an autocorrection mechanism to compensate for overshoots in the Riemann solver. DG methods have a built-in implicit LES capability and add numerical dissipation in under-resolved turbulence simulations. Built-in implicit SGS model in DG methods is more accurate than explicit models since they add numerical dissipation only when SGS s are present in the flow Higher order are important to capture detailed physics in turbulent flows at higher Reynolds number. Optimal LES implementations are likely achieved with local CFLlocal > 1 (CFLglobal >> 1 for wall-bounded flows). 2
Summary IEDG is the hybridized DG method of choice: It inherits computational efficiency from EDG and BC robustness from HDG. Below ~5 th order accuracy, IEDG allows for more efficient implementations than other DG methods. Beyond ~5 th order accuracy, memory requirements and flop count become prohibitive, and IEDG does not provide any advantages. ILU() + RAS(1) type preconditioners are extremely efficient for most LES cases considered. Accuracy and stability of ILU deteriorates for low cell Peclet numbers (time-step size needs to be reduced more than linearly w.r.t. mesh size): Issues for wall-resolved LES at Re > 5,. Efficiency and scalable nonlinear solvers are required for these methods to be adopted in industrial applications. 21
Questions? For additional details: P. Fernandez, N.C. Nguyen, J. Peraire, The hybridized Discontinuous Galerkin method for Implicit Large- Eddy Simulation of transitional turbulent flows, J. Comput. Phys. 336 (1) (217) 38 329. P. Fernandez, N.C. Nguyen, J. Peraire, Subgrid-scale modeling and implicit numerical dissipation in DGbased Large-Eddy Simulation, In: 23rd AIAA Computational Fluid Dynamics Conference, Denver, USA, 217. P. Fernandez, N.C. Nguyen, J. Peraire, A physics-based shock capturing method for unsteady laminar and turbulent flows, In: 56th AIAA Aerospace Sciences Meeting, Gaylord Palms, USA, 218. SciTech talk: Title: A physics-based shock capturing method for unsteady laminar and turbulent flows Session: FD-3, CFD for Capturing Flow Discontinuities Setting: Monday, January 8, 9:3 AM, Room Sun 5 pablof@mit.edu cuongng@mit.edu peraire@mit.edu
Hybridized DG methods
Hybridized DG schemes Consider the unsteady compressible Navier-Stokes equations: q ru =, in (,T], @u @t + r F NS(u, q) =, in (,T]. u = @ v j E 1 A The hybridized DG approach (Nguyen et al., 215) introduces additional variables on the element faces. Then: @K (q h, u h ) K = f(bu h @K ), 8K 2 T h This yields a global problem: @K K bu h bu h (q h, u h ) r h (u h )= Standard DG r h (bu h )= Hybridized DG Global DOFs in standard DG methods Global DOFs in hybridized DG methods
Hybridized DG schemes Hybridized DG schemes (Nguyen et al., 215) are a family of numerical schemes Different choices of the space for bu h lead to different schemes. Three examples: Hybridizable DG (HDG) Embedded DG (EDG) Interior Embedded DG (IEDG) Method Nature of bu h space (Peraire et al., 21) Discontinuous (Peraire et al., 211) Continuous (Fernandez et al., 216) Interior across faces across faces faces: Continuous Boundary faces: Discontinuous HDG IEDG EDG
Hybridized DG. Efficiency and accuracy Efficiency Non-zero entries in Jacobian of global system: NNZ = N p N 2 c NNZ NNZ Number of non-zeros N p Number of mesh vertices N c Number of components of the PDE Values of NNZ (tetrahedra) Accuracy order 2nd 3rd 4th 5th Standard DG 48 3, 12, 36,75 HDG 756 3,24 8,4 18,9 EDG 15 23 1,311 4,41 IEDG <15 <23 <1,311 <4,41 p +1 u h u h Optimal accuracy ( ) in. p +2 Accuracy Superconvergence ( ) in can be inexpensively achieved (HDG only) p +1 q h Optimal accuracy ( ) in (HDG only) Standard DG HDG EDG IEDG Accuracy order u h q h p +1 p p +1 p +1 p +1 p p +1 p
Parallel implementation and iterative solvers
Parallel iterative solver The hybridized DG discretization yields a nonlinear system of equations at every time-step r h (bu h )= An efficient and scalable solution procedure is required for these methods to be adopted in industrial applications We discuss next on parallel iterative solvers
Parallel iterative solver r h (bu h )= Linearization K h bu h = r h Nonlinear system: Newton or quasi-newton method bu n, Initial guess h at time-step computed with a reducedbasis minimum-residual algorithm: bu n, h = sx j=1 j bu n ( 1,..., s )=arg min ( 1,..., s )2R s r h sx n h j j=1 j bu n h 2 j
Parallel iterative solver r h (bu h )= Linearization K h bu h = r h Nonlinear system: Newton or quasi-newton method bu n, Initial guess h at time-step computed with a reducedbasis minimum-residual algorithm Linear system: Restarted GMRES method Parallel preconditioner M 1 : Traced node -based. overlapping restrictive additive Schwarz (Cai et al., 1999) : NX M 1 := Ri (K h ) 1 i R R i i i=1 (K h ) i = R i K h R i Restriction operator onto the overlap subdomain Number of subdomains (i.e. processors) Subdomain preconditioner: Block incomplete LU factorization with zero fill-in (BILU) and MDF reorder (Persson et al., 28) n N (K h ) 1 i U e 1 i el 1 i
Parallel iterative solver r h (bu h )= Nonlinear system: Newton or quasi-newton method bu n, Initial guess h at time-step computed with a reducedbasis minimum-residual algorithm n Linearization K h bu h = r h Linear system: Restarted GMRES method Parallel preconditioner M 1 : Traced node -based. overlapping restrictive additive Schwarz Subdomain preconditioner: Block incomplete LU factorization with zero fill-in (BILU) and MDF reorder (Persson et al., 28) (Cai et al., 1999)
Parallel iterative solver r h (bu h )= Nonlinear system: Newton or quasi-newton method bu n, Initial guess h at time-step computed with a reducedbasis minimum-residual algorithm n Linearization K h bu h = r h Linear system: Restarted GMRES method Parallel preconditioner M 1 : Traced node -based. overlapping restrictive additive Schwarz Subdomain preconditioner: Block incomplete LU factorization with zero fill-in (BILU) and MDF reorder (Persson et al., 28) (Cai et al., 1999) Others: Mixed-precision approach for Newton-GMRES algorithm Adaptive quadrature rules Stabilized ILU factorization (silu) Minimum Interaction Domain Decomposition (MIDD)
Parallel scalability Weak scaling for LES of Ecole Centrale de Lyon compressor cascade Low-speed compressor cascade with Reynolds numbers ranging from 2, to 4, Numerical discretization: - Space: 3rd-order IEDG - Time: 3rd-order DIRK(3,3) Additional details: - Time-step size is kept content in all runs - Computing platform: Titan (OLCF) - One MPI rank per physical core - GPU computing and hybrid OpenMP/MPI parallelization disabled Time to solution 12 1 8 6 4 2 496 8192 16384 32768 65536 Number of processors Actual Ideal
References 1.X.C. Cai, M. Sarkis, A restricted additive Schwarz preconditioner for general sparse linear systems, SIAM J. Sci. Comput. 21 (2) (1999) 792 797. 2.P. Fernandez, N.C. Nguyen, X. Roca, J. Peraire, Implicit large-eddy simulation of compressible flows using the Interior Embedded Discontinuous Galerkin method, In: 54th AIAA Aerospace Sciences Meeting, San Diego, USA, 216. 3.P. Fernandez, N.C. Nguyen, J. Peraire, The hybridized Discontinuous Galerkin method for Implicit Large- Eddy Simulation of transitional turbulent flows, J. Comput. Phys. 336 (1) (217) 38 329. 4.P. Fernandez, N.C. Nguyen, J. Peraire, Subgrid-scale modeling and implicit numerical dissipation in DGbased Large-Eddy Simulation, In: 23rd AIAA Computational Fluid Dynamics Conference, Denver, USA, 217. 5.P. Fernandez, N.C. Nguyen, J. Peraire, A physics-based shock capturing method for unsteady laminar and turbulent flows, In: 56th AIAA Aerospace Sciences Meeting, Gaylord Palms, USA, 218. 6.N.C. Nguyen, J. Peraire, B. Cockburn, A class of embedded discontinuous Galerkin methods for computational fluid dynamics, J. Comput. Phys. 32 (1) (215) 674 692. 7.J. Peraire, N.C. Nguyen, B. Cockburn, A Hybridizable Discontinuous Galerkin Method for the Compressible Euler and Navier-Stokes Equations, In: 48th AIAA Aerospace Sciences Meeting Including the New Horizons Forum and Aerospace Exposition, Orlando, USA, 21. 8.J. Peraire, N.C. Nguyen, B. Cockburn, An Embedded Discontinuous Galerkin Method for the Compressible Euler and Navier-Stokes Equations, In: 2th AIAA Computational Fluid Dynamics Conference, Honolulu, USA, 211. 9.P.-O. Persson, J. Peraire, Newton-GMRES preconditioning for Discontinuous Galerkin Discretizations of the Navier-Stokes Equations, SIAM J. Sci. Comput. 3 (6) (28) 279 2733.