Design of optimal Runge-Kutta methods

David I. Ketcheson
King Abdullah University of Science & Technology (KAUST)
Acknowledgments

Some parts of this are joint work with: Aron Ahmadia, Matteo Parsani
Outline

1 High order Runge-Kutta methods
2 Linear properties of Runge-Kutta methods
3 Nonlinear properties of Runge-Kutta methods
4 Putting it all together: some optimal methods and applications
Solution of hyperbolic PDEs

The fundamental algorithmic barrier is the CFL condition:
$$\Delta t \lesssim \frac{\Delta x}{a}$$
- Implicit methods don't usually help (due to reduced accuracy)
- Strong scaling limits the effectiveness of spatial parallelism alone
- Strategy: keep $\Delta x$ as large as possible by using high order methods
- But: high order methods cost more and require more memory

Can we develop high order methods that are as efficient as lower order methods?
Time Integration

Using a better time integrator is usually simple but can be highly beneficial. Using a different time integrator can:
- Reduce the number of RHS evaluations required
- Alleviate timestep restrictions due to
  - Linear stability
  - Nonlinear stability
- Improve accuracy (truncation error, dispersion, dissipation)
- Reduce storage requirements
Runge-Kutta Methods

To solve the initial value problem
$$u'(t) = F(u(t)), \qquad u(0) = u_0,$$
a Runge-Kutta method computes approximations $u^n \approx u(n\Delta t)$:
$$y_i = u^n + \Delta t \sum_{j=1}^{i-1} a_{ij} F(y_j), \qquad u^{n+1} = u^n + \Delta t \sum_{j=1}^{s} b_j F(y_j).$$
The accuracy and stability of the method depend on the coefficient matrix $A$ and vector $b$.
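For reference, one step of a generic explicit method is only a few lines of code. The sketch below is a minimal illustration, not tied to any particular optimized scheme; it assumes $A$ is stored as a strictly lower triangular array.

```python
import numpy as np

def explicit_rk_step(F, u, dt, A, b):
    """Advance u by one step of an explicit Runge-Kutta method with
    Butcher coefficients (A, b); A must be strictly lower triangular."""
    s = len(b)
    K = []  # stage derivatives F(y_i)
    for i in range(s):
        # y_i = u^n + dt * sum_{j<i} a_ij F(y_j)
        y_i = u + dt * sum(A[i][j] * K[j] for j in range(i))
        K.append(F(y_i))
    # u^{n+1} = u^n + dt * sum_j b_j F(y_j)
    return u + dt * sum(b[j] * K[j] for j in range(s))
```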
Runge-Kutta Methods: a philosophical aside

An RK method builds up information about the solution derivatives through the computation of intermediate stages. At the end of a step, all of this information is thrown away! Using more stages means keeping that information around longer.
Outline (Part 2): Linear properties of Runge-Kutta methods
The Stability Function

For the linear equation $u' = \lambda u$, a Runge-Kutta method yields a solution $u^{n+1} = \phi(\lambda \Delta t)\, u^n$, where $\phi$ is called the stability function of the method:
$$\phi(z) = \frac{\det\left(I - z(A - e b^T)\right)}{\det(I - zA)}$$

Example: Euler's method
$$u^{n+1} = u^n + \Delta t\, F(u^n); \qquad \phi(z) = 1 + z.$$

For explicit methods of order $p$:
$$\phi(z) = \sum_{j=0}^{p} \frac{z^j}{j!} + \sum_{j=p+1}^{s} \alpha_j z^j.$$
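The determinant formula is straightforward to evaluate numerically. A small sketch, with the classical RK4 tableau used as a sanity check (the tableau, test point, and tolerance are illustrative choices):

```python
import math
import numpy as np

def stability_function(z, A, b):
    """phi(z) = det(I - z(A - e b^T)) / det(I - z A)."""
    s = len(b)
    I, e = np.eye(s), np.ones(s)
    num = np.linalg.det(I - z * (A - np.outer(e, b)))
    den = np.linalg.det(I - z * A)
    return num / den

# Sanity check: for classical RK4 (order 4, 4 stages), phi(z) must
# equal the degree-4 Taylor polynomial 1 + z + z^2/2 + z^3/6 + z^4/24.
A = np.array([[0.0, 0.0, 0.0, 0.0],
              [0.5, 0.0, 0.0, 0.0],
              [0.0, 0.5, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
b = np.array([1, 2, 2, 1]) / 6.0
z = 0.3 + 0.2j
taylor = sum(z**j / math.factorial(j) for j in range(5))
assert abs(stability_function(z, A, b) - taylor) < 1e-10
```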
Absolute Stability

For the linear equation $u'(t) = Lu$, we say the solution is absolutely stable if
$$|\phi(\lambda \Delta t)| \le 1 \quad \text{for all } \lambda \in \sigma(L).$$

Example: Euler's method
$$u^{n+1} = u^n + \Delta t\, F(u^n); \qquad \phi(z) = 1 + z,$$
so the absolute stability region $|1 + z| \le 1$ is the unit disk centered at $z = -1$.
Stability optimization

This leads naturally to the following problem.

Stability optimization: given $L$, $p$, $s$,
$$\text{maximize } \Delta t \quad \text{subject to } |\phi(\Delta t\, \lambda)| - 1 \le 0, \quad \lambda \in \sigma(L),$$
where
$$\phi(z) = \sum_{j=0}^{p} \frac{z^j}{j!} + \sum_{j=p+1}^{s} \alpha_j z^j.$$
Here the decision variables are $\Delta t$ and the coefficients $\alpha_j$, $j = p+1, \dots, s$.

This problem is quite difficult; we approximate its solution by solving a sequence of convex problems (DK & A. Ahmadia, arXiv preprint).
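A minimal sketch of this bisection-plus-convex-subproblem strategy, using NumPy and CVXPY. This illustrates the idea rather than reproducing the authors' implementation; the function names, spectrum sampling, and tolerances are all assumptions. The key observation is that for a fixed step size $h$, $\phi$ is affine in $\alpha$, so minimizing $\max_k |\phi(h\lambda_k)|$ is convex.

```python
import math
import numpy as np
import cvxpy as cp

def min_max_modulus(z, p, s):
    """For a fixed scaled spectrum z = h*lambda, minimize
    max_k |phi(z_k)| over the free coefficients alpha_{p+1..s}.
    phi is affine in alpha, so this is a convex (SOCP) problem."""
    base = sum(z**j / math.factorial(j) for j in range(p + 1))
    powers = np.column_stack([z**j for j in range(p + 1, s + 1)])
    alpha = cp.Variable(s - p)
    phi_re = base.real + powers.real @ alpha
    phi_im = base.imag + powers.imag @ alpha
    t = cp.Variable()
    cons = [cp.norm(cp.hstack([phi_re[k], phi_im[k]])) <= t
            for k in range(len(z))]
    prob = cp.Problem(cp.Minimize(t), cons)
    prob.solve()
    return prob.value

def max_stable_step(spectrum, p, s, hmax=100.0, tol=1e-3):
    """Bisect on the step size h: h is feasible iff the inner convex
    problem can push max |phi| down to 1 (up to solver tolerance)."""
    lo, hi = 0.0, hmax
    while hi - lo > tol:
        h = 0.5 * (lo + hi)
        if min_max_modulus(h * spectrum, p, s) <= 1.0 + 1e-7:
            lo = h
        else:
            hi = h
    return lo
```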
Accuracy optimization

We could instead optimize accuracy over some region in $\mathbb{C}$:

Accuracy optimization: given $L$, $p$, $s$,
$$\text{maximize } \Delta t \quad \text{subject to } |\phi(\Delta t\, \lambda) - \exp(\Delta t\, \lambda)| \le \epsilon, \quad \lambda \in \sigma(L),$$
where
$$\phi(z) = \sum_{j=0}^{p} \frac{z^j}{j!} + \sum_{j=p+1}^{s} \alpha_j z^j.$$

In the PDE case, we can replace $\exp(\Delta t\, \lambda)$ with the exact dispersion relation for each Fourier mode.
Stability Optimization: a toy example

As an example, consider the advection equation $u_t + u_x = 0$, discretized in space by first-order upwind differencing with unit spatial mesh size:
$$U_i'(t) = -(U_i(t) - U_{i-1}(t)),$$
with periodic boundary condition $U_0(t) = U_N(t)$.

(Figure: spectrum of the upwind operator in the complex plane.)
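For this discretization the spectrum is known in closed form, so the optimization sketched earlier can be driven directly from it (a usage sketch; the 128-point sampling is an arbitrary choice):

```python
import numpy as np

# Fourier symbol of first-order upwind differencing with periodic BCs:
# lambda(theta) = -(1 - exp(-i*theta)), a circle of radius 1 through
# the origin, centered at z = -1.
theta = np.linspace(0, 2 * np.pi, 128, endpoint=False)
spectrum = -(1.0 - np.exp(-1j * theta))

# With the routines sketched above, a 10-stage polynomial with 4th-order
# linear accuracy admits a much larger stable step than RK(4,4):
# h10 = max_stable_step(spectrum, p=4, s=10)
```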
Stability Optimization: a toy example

(Figure: stability regions and spectrum; (a) RK(4,4); (b) optimized 10-stage method.)
Stability Optimization: a toy example

What is the relative efficiency (stable step size divided by cost per step)?
RK(4,4): 1.4 / 4 = 0.35
RK(10,4): 6 / 10 = 0.6
By allowing even more stages, one can asymptotically approach the efficiency of Euler's method.
Stability Optimization: a more interesting example

Second order discontinuous Galerkin discretization of advection (figure).
Stability Optimization: one more example

(Figure: spectrum and optimized stability region, $s = 20$.)
Outline (Part 3): Nonlinear properties of Runge-Kutta methods
Nonlinear accuracy

Besides the conditions on the stability polynomial coefficients, high order Runge-Kutta methods must satisfy additional nonlinear order conditions:
$$p = 1: \quad \sum_i b_i = 1$$
$$p = 2: \quad \sum_{i,j} b_i a_{ij} = 1/2$$
$$p = 3: \quad \sum_{i,j,k} b_i a_{ij} a_{jk} = 1/6, \qquad \sum_{i,j,k} b_i a_{ij} a_{ik} = 1/3$$
The number of conditions grows factorially (719 conditions for order 10).
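For a given tableau these conditions are mechanical to verify; a small sketch using the classical RK4 coefficients:

```python
import numpy as np

# Classical RK4 tableau
A = np.array([[0.0, 0.0, 0.0, 0.0],
              [0.5, 0.0, 0.0, 0.0],
              [0.0, 0.5, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
b = np.array([1, 2, 2, 1]) / 6.0
c = A.sum(axis=1)  # abscissae: c_i = sum_j a_ij

assert np.isclose(b.sum(), 1.0)     # p = 1: sum_i b_i = 1
assert np.isclose(b @ c, 0.5)       # p = 2: sum_{i,j} b_i a_ij = 1/2
assert np.isclose(b @ A @ c, 1/6)   # p = 3: sum b_i a_ij a_jk = 1/6
assert np.isclose(b @ c**2, 1/3)    # p = 3: sum b_i a_ij a_ik = 1/3
```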
Beyond linear stability

Classical stability theory and its extensions focus on weak bounds:
$$\|u^n\| \le C(t) \qquad \text{(linear problems; inner product norms)}$$
For hyperbolic PDEs, we are often interested in strict bounds:
$$\|u^n\| \le C \qquad \text{(nonlinear problems; } L^1, L^\infty, \text{TV, or positivity)}$$
We refer to bounds of the latter type as strong stability properties. For example:
$$\|u^n\|_{TV} \le \|u^{n-1}\|_{TV}$$
Strong stability preservation

Designing fully-discrete schemes with strong stability properties is notoriously difficult! Instead, one often takes a method-of-lines approach and assumes explicit Euler time integration. But in practice, we need to use higher order methods, for reasons of both accuracy and linear stability.

Strong stability preserving (SSP) methods provide higher order accuracy while maintaining any convex functional bound satisfied by Euler timestepping.
The Forward Euler condition

Recall our ODE system (typically from a PDE)
$$u_t = F(u),$$
where the spatial discretization $F(u)$ is carefully chosen$^1$ so that the solution from the forward Euler method
$$u^{n+1} = u^n + \Delta t\, F(u^n)$$
satisfies the monotonicity requirement
$$\|u^{n+1}\| \le \|u^n\|$$
in some norm, semi-norm, or convex functional, for a suitably restricted timestep $\Delta t \le \Delta t_{FE}$.

$^1$ e.g. TVD, TVB
Runge-Kutta methods as a convex combination of Euler steps

Consider the two-stage method
$$y_1 = u^n + \Delta t\, F(u^n), \qquad u^{n+1} = u^n + \tfrac{1}{2}\Delta t \left(F(u^n) + F(y_1)\right).$$
Is $\|u^{n+1}\| \le \|u^n\|$? Rewrite the final stage as a convex combination of Euler steps:
$$u^{n+1} = \tfrac{1}{2} u^n + \tfrac{1}{2}\left(y_1 + \Delta t\, F(y_1)\right).$$
Take $\Delta t \le \Delta t_{FE}$. Then $\|y_1\| \le \|u^n\|$, so
$$\|u^{n+1}\| \le \tfrac{1}{2}\|u^n\| + \tfrac{1}{2}\|y_1 + \Delta t\, F(y_1)\| \le \|u^n\|.$$
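The convex-combination argument can be checked numerically: with a TVD forward Euler building block, the total variation should not grow. A toy sketch for upwind advection on a periodic grid with unit mesh spacing (the grid size, step size, and step count are arbitrary choices):

```python
import numpy as np

def euler_step(u, dt):
    # Forward Euler + first-order upwind, periodic BCs, unit mesh:
    # TVD for dt <= 1 (which plays the role of dt_FE here).
    return u - dt * (u - np.roll(u, 1))

def ssprk22_step(u, dt):
    # The two-stage method above, written as a convex combination
    # of forward Euler steps.
    y1 = euler_step(u, dt)
    return 0.5 * u + 0.5 * euler_step(y1, dt)

def tv(u):
    # Total variation with periodic wrap-around.
    return np.abs(np.diff(np.append(u, u[0]))).sum()

u = (np.linspace(0, 1, 100) > 0.5).astype(float)  # step profile
for _ in range(20):
    u_new = ssprk22_step(u, dt=0.9)
    assert tv(u_new) <= tv(u) + 1e-12  # TV does not increase
    u = u_new
```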
Optimized SSP methods

In general, an SSP method preserves strong stability properties satisfied by Euler's method, under a modified step size restriction:
$$\Delta t \le \mathcal{C}\, \Delta t_{FE}.$$
A fair metric for comparison is the effective SSP coefficient:
$$\mathcal{C}_{\text{eff}} = \frac{\mathcal{C}}{\text{number of stages}}.$$
For example, SSP(10,4) has $\mathcal{C} = 6$ with 10 stages, so $\mathcal{C}_{\text{eff}} = 0.6$. By designing high order methods with many stages, we can achieve $\mathcal{C}_{\text{eff}}$ close to 1.
Example: A highly oscillatory flow field

$$u_t + \left(\cos^2(20x + 45t)\, u\right)_x = 0, \qquad u(0, t) = 0$$

Method      C_eff   Monotone effective timestep
NSSP(3,2)   0       0.037
SSP(50,2)   0.980   0.980
NSSP(3,3)   0       0.004
NSSP(5,3)   0       0.017
SSP(64,3)   0.875   0.875
RK(4,4)     0       0.287
SSP(5,4)    0.302   0.416
SSP(10,4)   0.600   0.602
Low storage methods

A straightforward implementation of an s-stage RK method requires s + 1 memory locations per unknown. Special low-storage methods are designed so that each stage depends only on the one or two most recently computed stages; older stages can then be discarded as new ones are computed.

It is often desirable to keep the previous solution around (to be able to restart a step) and to compute an error estimate. This requires a minimum of three storage locations per unknown.
Low storage methods: the 3S algorithm

S1 := u^n                                                      (y_1)
S2 := 0
S3 := u^n
for i = 2 : m+1 do
    S2 := S2 + δ_{i-1} S1
    S1 := γ_{i,1} S1 + γ_{i,2} S2 + γ_{i,3} S3 + β_{i,i-1} Δt F(S1)    (y_i)
end
u^{n+1} := S1
S2 := (1 / Σ_{j=1}^{m+2} δ_j) (S2 + δ_{m+1} S1 + δ_{m+2} S3)   (û^{n+1})
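A minimal Python sketch of one 3S step, omitting the embedded error estimate shown above; the 0-based coefficient array layout is an assumption made for illustration, not the authors' actual data format:

```python
import numpy as np

def step_3s(F, u, dt, gamma, beta, delta):
    """One step of an m-stage 3S low-storage method.
    gamma: (m, 3), beta: (m,), delta: (m,) coefficient arrays;
    row k of each corresponds to loop index i = k + 2 on the slide."""
    S1 = u.copy()           # holds the current stage value y_i
    S2 = np.zeros_like(u)   # accumulates a running combination of stages
    S3 = u.copy()           # retains u^n for the whole step
    for k in range(len(beta)):
        S2 = S2 + delta[k] * S1
        S1 = (gamma[k, 0] * S1 + gamma[k, 1] * S2
              + gamma[k, 2] * S3 + beta[k] * dt * F(S1))
    return S1               # u^{n+1}; only three arrays were ever live
```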
Outline (Part 4): Putting it all together: some optimal methods and applications
Two-step optimization process

Our optimization approach proceeds in two steps:
1 Optimize the linear stability or accuracy of the scheme by choosing the stability polynomial coefficients $\alpha_j$.
2 Optimize the nonlinear stability/accuracy and storage requirements by choosing the Butcher coefficients $a_{ij}$, $b_j$.

Each of these steps is a complex numerical problem in itself, involving nonconvex optimization in dozens to hundreds of variables, with nonlinear equality and inequality constraints.
Optimizing for the SD spectrum

On regular grids, the spectral difference (SD) discretization leads to a block-Toeplitz operator. We perform a von Neumann-like analysis using a generating pattern:
$$\frac{dW_{i,j}}{dt} + \frac{a}{g}\left(T_{0,0} W_{i,j} + T_{-1,0} W_{i-1,j} + T_{0,-1} W_{i,j-1} + T_{+1,0} W_{i+1,j} + T_{0,+1} W_{i,j+1}\right) = 0$$

(Figure: 3×3 generating pattern of cells $(i \pm 1, j \pm 1)$ with spacings g1, g2.)
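A hedged sketch of how such a generating-pattern spectrum can be sampled: evaluate the block symbol on a grid of wavenumbers and collect its eigenvalues. The $T$ blocks and the constants $a$, $g$ would come from the SD discretization; all names here are illustrative assumptions.

```python
import numpy as np

def generating_spectrum(T00, Tm0, T0m, Tp0, T0p, a, g, n=32):
    """Eigenvalues of the von Neumann symbol of the block-Toeplitz
    operator, sampled over wavenumbers (theta1, theta2)."""
    thetas = np.linspace(0, 2 * np.pi, n, endpoint=False)
    eigs = []
    for t1 in thetas:
        for t2 in thetas:
            # Symbol of the stencil: each shift (k, l) contributes
            # T_{k,l} * exp(i*(k*theta1 + l*theta2)).
            symbol = (T00
                      + Tm0 * np.exp(-1j * t1) + Tp0 * np.exp(1j * t1)
                      + T0m * np.exp(-1j * t2) + T0p * np.exp(1j * t2))
            # dW/dt = -(a/g) * symbol * W, so these are the lambdas
            # to feed into the stability optimization.
            eigs.extend(np.linalg.eigvals(-(a / g) * symbol))
    return np.array(eigs)
```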
Optimizing for the SD spectrum

(Figure. Blue: eigenvalues; red: RK stability boundary.)
The convex hull of the generated spectrum is used as a proxy to accelerate the optimization process.
Optimizing for the SD spectrum

Primarily optimized for stable step size; secondary optimization for nonlinear accuracy and low storage (3 memory locations per unknown).
Application: flow past a wedge

(Figure: fully unstructured mesh.)
Application: flow past a wedge

(Figure: density at t = 100.)
62% speedup using the optimized method.
Conclusions

- Numerical optimization allows for flexible, targeted design of time integrators.
- Stability optimization based on spectra from a model (linear) problem on a uniform grid seems to work well, even for nonlinear problems on fully unstructured grids.
- Significant speedup can be achieved in practice (greater for higher order methods).