Structure preserving preconditioner for the incompressible Navier-Stokes equations

Structure preserving preconditioner for the incompressible Navier-Stokes equations

Fred W. Wubs, Computational Mechanics & Numerical Mathematics, University of Groningen, the Netherlands (f.w.wubs@rug.nl)
Jonas Thies, Centre for Interdisciplinary Mathematics, Uppsala University, Sweden (jonas@math.uu.se)

Workshop: Recent developments in the solution of indefinite systems, Eindhoven, April 17, 2012

Outline
1 The Challenge: Workshops
2 Current Results: A cartoon; Computational experiments
3 They are everywhere; Solution; Example: eigenvalue problem
4 Summary

The Challenge: Workshops

Past and future
1995: Laplace Symphony. "How fast the Laplace equation was solved in 1995", E.F.F. Botta, K. Dekker, Y. Notay, A. van der Ploeg, C. Vuik, F.W. Wubs and P.M. de Zeeuw, Applied Numerical Mathematics, 1997, pp. 439-455.
2013: Minisymposium on "Comparison of solvers in Large Scale Bifurcation Analysis".
  Where: BIFD conference, July 8-13, Haifa, Israel
  Organizers: Henk Dijkstra and Fred Wubs
  2 problems: 3D Lid Driven Cavity, Boussinesq on the globe

Benchmark problems
1 3D Lid Driven Cavity. Problem description and results in "Oscillatory instability of a three-dimensional lid-driven flow in a cube" by Yuri Feldman and Alexander Yu. Gelfgat, Phys. Fluids 22, 093602 (2010). They used FVM on $128^3$ to $200^3$ grids. The aim is to study the transition from a steady state to a periodic solution.
2 Boussinesq on the Globe. Domain from 60 degrees N Lat. to 60 degrees S Lat. Continent modelled by one line going from the north pole to 50 degrees S Lat. Depth 4000 m.
Numerical ingredients: continuation of steady states and periodic solutions, nonlinear equations, eigenvalue problems.
Key challenge: efficient solution of large sparse linear systems.

Current Results: A cartoon; Computational experiments

Stokes on HPC computers
Jacobian, right-hand side, and construction and application of the ILU factorization are all done in parallel.
Solver is expressed in the Epetra package from Trilinos.
Both cases: 2 block eliminations, 116 iterations for 8 digits.
Opteron cluster, $64^3$ grid (1 million unknowns): one solve takes 10 minutes on 32 cores from 3 nodes.
IBM Power6 cluster (Huygens), $128^3$ grid: one solve takes 5 minutes on 256 cores from 8 nodes.
Based on previous experience: one solve with the Navier-Stokes Jacobian takes about the same amount of time.

Fully implicit approach
Incompressible Navier-Stokes equations:
\[
\frac{\partial u}{\partial t} + N(u, u) + L u + \nabla p = 0, \qquad \nabla \cdot u = 0
\]
Discretize (here second order symmetry-preserving finite differences on a C-grid)
Linearize by Newton's method
Structure of the resulting linear systems (saddle-point matrix):
\[
\begin{pmatrix} L + N & \mathrm{Grad} \\ \mathrm{Div} & 0 \end{pmatrix}
\begin{pmatrix} u \\ p \end{pmatrix}
=
\begin{pmatrix} f_u \\ f_p \end{pmatrix}
\tag{1}
\]

Ingredients for effective and robust incomplete factorization
(Start from primitive equations)
Fill-reducing ordering
Fourier-like transformation: improves diagonal dominance, to get rid of unwanted couplings
Drop by retaining principal submatrices: these submatrices will be positive definite if the matrix is positive definite
For the incompressible Navier-Stokes equations, do not drop in the divergence and gradient part
  There is no increase of fill in this part (not even in a direct method) on a C-grid

A cartoon of the new algorithm
Stokes on a structured C-grid

A cartoon of the new algorithm, step 1
Domain decomposition

A cartoon of the new algorithm, step 2
Identify separators

A cartoon of the new algorithm, step 3
Elimination yields a 'geometric' Schur complement
NB: All horizontal (vertical) velocities on a vertical (horizontal) separator are coupled to the pressure inside
Perform transformation and drop.

A cartoon of the new algorithm, step 4
Flux representation ('coarse grid')
The transformation is such that the amount of mass flowing through the separator remains the same.

Dropping
Use simple drop-by-position:
  Drop all couplings between separator groups...
  and all couplings between $V_\Sigma$-nodes and regular V-nodes.
Principal submatrices of an SPD matrix are SPD
⇒ Block diagonal preconditioner with a 'reduced matrix' $S_2$ in the lower right.
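The drop-by-position rule can be illustrated with a small NumPy sketch. It is not the authors' implementation: the test matrix, the index groups, and the helper name `drop_by_position` are assumptions made up for illustration. The sketch keeps only the diagonal blocks of an SPD matrix and checks that the result is still SPD, which is the property the slide relies on.

```python
import numpy as np


def drop_by_position(S, groups):
    """Return a copy of S that keeps only the diagonal blocks given by `groups`."""
    P = np.zeros_like(S)
    for g in groups:
        block = np.ix_(g, g)
        P[block] = S[block]   # retain the principal submatrix of this group
    return P


rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
S = M @ M.T + 6.0 * np.eye(6)        # random SPD test matrix (assumption)
groups = [[0, 1], [2, 3, 4], [5]]    # hypothetical separator groups
P = drop_by_position(S, groups)

# Smallest eigenvalues: positive for both the original and the dropped matrix.
min_eig_S = np.linalg.eigvalsh(S).min()
min_eig_P = np.linalg.eigvalsh(P).min()
print("S is SPD:", min_eig_S > 0, "| P is SPD:", min_eig_P > 0)
```

Because every diagonal block is a principal submatrix of S, its eigenvalues lie between the smallest and largest eigenvalue of S, so this kind of dropping cannot produce a singular or indefinite block.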

Example 1: 3D Laplace Equation
Convergence criterion: $\|r\| / \|r_0\| < 10^{-8}$.

Stokes equations: number of iterations

2D lid-driven cavity
Incompressible Navier-Stokes;
Stretched structured grid (ratio 5);
Newton's method;
First Hopf bifurcation at Re ≈ 8375 (Tiesinga & Wubs 2002).

Convergence behavior
Convergence criterion: $\|r\| / \|r_0\| < 10^{-6}$

Robust at high Reynolds numbers
Can compute highly unstable steady states;
Moderate increase in number of iterations;
Convergence tolerance $10^{-8}$ here.

They are everywhere Solution Example: eigenvalue problem

They are everywhere
\[
\begin{pmatrix} A & V \\ W^T & C \end{pmatrix}
\begin{pmatrix} x \\ s \end{pmatrix}
=
\begin{pmatrix} f_x \\ f_c \end{pmatrix}
\]
where $A$ is a big sparse matrix and $V$ and $W$ contain a number of vectors.
Such systems occur in:
  Continuation (Jacobian $A$ singular near a turning point)
  Eigenvalue computation in the Jacobi-Davidson method
  DO method for stochastic PDEs using implicit methods
In the last two methods one has to compute a correction in a space perpendicular to the current space.

Standard solution
Standard approach: make a block LU factorization
\[
\begin{pmatrix} A & V \\ W^T & C \end{pmatrix}
=
\begin{pmatrix} A & 0 \\ W^T & I \end{pmatrix}
\begin{pmatrix} I & A^{-1} V \\ 0 & C - W^T A^{-1} V \end{pmatrix}
\]
What if $A$ becomes singular?
ARPACK: targets 0 and 0.1
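The block LU approach amounts to block elimination of the bordered system. The following NumPy sketch is an illustration under assumed random test data (the function name `solve_bordered` and all matrices are hypothetical); it solves the bordered system through the small Schur complement $C - W^T A^{-1} V$ and compares the result with a direct solve of the full matrix.

```python
import numpy as np


def solve_bordered(A, V, W, C, fx, fc):
    """Solve [[A, V], [W^T, C]] [x; s] = [fx; fc] by block elimination."""
    AinvV = np.linalg.solve(A, V)            # A^{-1} V
    Ainvf = np.linalg.solve(A, fx)           # A^{-1} f_x
    schur = C - W.T @ AinvV                  # C - W^T A^{-1} V
    s = np.linalg.solve(schur, fc - W.T @ Ainvf)
    x = Ainvf - AinvV @ s
    return x, s


rng = np.random.default_rng(0)
n, k = 8, 2
A = rng.standard_normal((n, n)) + n * np.eye(n)   # nonsingular test matrix (assumption)
V = rng.standard_normal((n, k))
W = rng.standard_normal((n, k))
C = rng.standard_normal((k, k))
fx = rng.standard_normal(n)
fc = rng.standard_normal(k)

x, s = solve_bordered(A, V, W, C, fx, fc)
K = np.block([[A, V], [W.T, C]])
reference = np.linalg.solve(K, np.concatenate([fx, fc]))
print(np.allclose(np.concatenate([x, s]), reference))
```

Every step applies $A^{-1}$, so the approach degrades exactly in the situation the slide asks about: when $A$ is singular or nearly so, even though the bordered matrix itself may be perfectly well conditioned.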

Incorporation in multilevel approach
Multilevel ILU comes in very handy. Example in the two-level case:
\[
\begin{pmatrix} A_{11} & A_{12} & V_1 \\ A_{21} & A_{22} & V_2 \\ W_1^T & W_2^T & C \end{pmatrix}
=
\begin{pmatrix} A_{11} & 0 & 0 \\ A_{21} & I & 0 \\ W_1^T & 0 & I \end{pmatrix}
\begin{pmatrix}
I & A_{11}^{-1} A_{12} & A_{11}^{-1} V_1 \\
0 & A_{22} - A_{21} A_{11}^{-1} A_{12} & V_2 - A_{21} A_{11}^{-1} V_1 \\
0 & W_2^T - W_1^T A_{11}^{-1} A_{12} & C - W_1^T A_{11}^{-1} V_1
\end{pmatrix}
\]
On the last level we can apply a direct method with pivoting, which precludes instability.
Possible indefiniteness is likely to occur in the low-frequency part of the solution. This is pushed downwards to the coarsest grid.
If the preconditioning matrix on the last block is indefinite, then this signals a bifurcation in the original problem.
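The two-level factorization can be checked numerically. This NumPy sketch uses small random blocks (all sizes and data are assumptions for illustration), builds the two factors exactly as written in the formula, and verifies that their product reproduces the bordered matrix. The last-level block is then solved with `np.linalg.solve`, which uses an LU factorization with partial pivoting, standing in for the "direct method with pivoting" on the coarsest level.

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n2, k = 5, 3, 2                                   # hypothetical block sizes
A11 = rng.standard_normal((n1, n1)) + n1 * np.eye(n1)
A12 = rng.standard_normal((n1, n2))
A21 = rng.standard_normal((n2, n1))
A22 = rng.standard_normal((n2, n2))
V1, V2 = rng.standard_normal((n1, k)), rng.standard_normal((n2, k))
W1, W2 = rng.standard_normal((n1, k)), rng.standard_normal((n2, k))
C = rng.standard_normal((k, k))

K = np.block([[A11, A12, V1], [A21, A22, V2], [W1.T, W2.T, C]])

X12 = np.linalg.solve(A11, A12)                       # A11^{-1} A12
X1V = np.linalg.solve(A11, V1)                        # A11^{-1} V1

L = np.block([
    [A11, np.zeros((n1, n2)), np.zeros((n1, k))],
    [A21, np.eye(n2), np.zeros((n2, k))],
    [W1.T, np.zeros((k, n2)), np.eye(k)],
])
U = np.block([
    [np.eye(n1), X12, X1V],
    [np.zeros((n2, n1)), A22 - A21 @ X12, V2 - A21 @ X1V],
    [np.zeros((k, n1)), W2.T - W1.T @ X12, C - W1.T @ X1V],
])
factorization_ok = np.allclose(L @ U, K)
print("L @ U reproduces K:", factorization_ok)

# Last level: the bordered Schur complement, solved by LU with partial pivoting.
last_level = U[n1:, n1:]
rhs = rng.standard_normal(n2 + k)
y = np.linalg.solve(last_level, rhs)
```

Note that only $A_{11}$ is inverted without pivoting across blocks; the border vectors travel along to the last level, where pivoting can handle a (nearly) singular reduced matrix.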

Eigenvalue problems
Jacobi-Davidson QZ method: accelerated inexact Newton method.
Dimension of the search space ranges from 20 to 40.
Use the two-level factorization as approximate Jacobian.
Target 0, find the first 6 eigenvalues.
First vector from LUx = rand.
Problem: Jacobian matrices from the lid-driven cavity at Re = 100 and Re = 1000 (vertical).
Grid refinement from 32x32 to 64x64 (horizontal).
Plots show the residual against the iteration number (MVs). Mind the scale of the x-axis (200 resp. 400).

Eigenvalue problems
[Figure: four JDQZ convergence plots (jmin=20, jmax=40, residual tolerance 1e-08). Vertical axis: log10 of the residual norm; horizontal axis: iteration step, 0 to 200 in the first two panels and 0 to 400 in the last two. Correction equation solved with the augmented preconditioner.]

Summary

Summary
Bifurcation analysis requires fast and robust linear algebra.
We developed a solver that combines
  Ease of use: only one parameter;
  Robustness: factorization doesn't break down;
  Can be used as approximate Jacobian;
  Parallelism: exposed on every level;
  Grid-independent convergence for ILU;
  Extendable to multi-physics problems;
  Low global communication (one per iteration).
Next steps
  Improvements on accuracy and performance.
  Do some nice (multiphysics) CFD problems (see Challenge).
  Look for generalizations.

References
A.C. de Niet and F.W. Wubs. Numerically stable $LDL^T$ factorization of F-type saddle point matrices. IMA Journal of Numerical Analysis, vol. 29, no. 1, pp. 208-234.
F.W. Wubs and J. Thies. A robust two-level incomplete factorization for (Navier-)Stokes saddle point matrices. SIAM J. Matrix Anal. Appl., 32:1475-1499, 2011.
J. Thies and F. Wubs. Design of a Parallel Hybrid Direct/Iterative Solver for CFD Problems. Proceedings of the 2011 Seventh IEEE International Conference on eScience, 5-8 December 2011, Stockholm, Sweden, pages 387-394, 2011.
G.L.G. Sleijpen and F.W. Wubs. Exploiting Multilevel Preconditioning Techniques in Eigenvalue Computations. SIAM Journal on Scientific Computing, 25(4):1249-1272, 2003.

Direct vs. Iterative Linear Solvers
Sparse direct:
  robust and easy to use
  computational complexity $O(N^2)$ in 3D ($N$: number of unknowns)
  substantial fill-in, $O(N^{4/3})$
Preconditioned iterative:
  usually not robust, depend on many parameters
  can have optimal complexity $O(N)$
  save memory and CPU time by avoiding fill-in
Can we combine the best of both? ILU close to LU and preserving properties.

F-matrices
A saddle point matrix has the following structure:
\[
K = \begin{pmatrix} A & B \\ B^T & 0 \end{pmatrix}. \tag{2}
\]
Definition 1. A gradient-type matrix has at most two nonzero entries per row and its row sum is zero.
Definition 2. A saddle point matrix (2) is called an F-matrix if $A$ is positive definite and $B$ is a gradient-type matrix.
The Jacobian of the Stokes equations (the zero-Reynolds-number limit) on a C-grid is an F-matrix.
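The two definitions translate directly into a check. The NumPy sketch below is illustrative only: the helper names are hypothetical, positive definiteness is tested through the symmetric part of $A$ (an assumption about how the definition is meant for nonsymmetric $A$), and the test problem is a made-up 1D staggered-grid analogue with a tridiagonal $A$ and a difference matrix $B$, not the C-grid Stokes Jacobian from the talk.

```python
import numpy as np


def is_gradient_type(B, tol=1e-12):
    """Definition 1: at most two nonzeros per row and zero row sums."""
    nonzeros_per_row = (np.abs(B) > tol).sum(axis=1)
    row_sums = B.sum(axis=1)
    return bool(np.all(nonzeros_per_row <= 2) and np.all(np.abs(row_sums) <= tol))


def is_f_matrix(A, B, tol=1e-12):
    """Definition 2: A positive definite (symmetric part) and B gradient-type."""
    sym_part = 0.5 * (A + A.T)
    positive_definite = np.linalg.eigvalsh(sym_part).min() > tol
    return bool(positive_definite) and is_gradient_type(B, tol)


# 1D toy problem: m velocities on the faces between m + 1 pressure cells.
m = 4
A = 2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)   # SPD tridiagonal
B = np.zeros((m, m + 1))
for i in range(m):
    B[i, i], B[i, i + 1] = -1.0, 1.0                     # each velocity sees 2 pressures
K = np.block([[A, B], [B.T, np.zeros((m + 1, m + 1))]])

B_bad = B.copy()
B_bad[0, 2] = 1.0                                        # third entry breaks Definition 1
print(is_f_matrix(A, B), is_f_matrix(A, B_bad))
```

Each row of $B$ belongs to one velocity unknown and couples it to the two neighbouring pressures with opposite signs, which is where the "at most two entries, zero row sum" structure comes from.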

Computing an LU decomposition of an F-matrix
\[
\begin{pmatrix} A & B \\ B^T & 0 \end{pmatrix}
\begin{pmatrix} x_v \\ x_p \end{pmatrix}
=
\begin{pmatrix} f_v \\ f_p \end{pmatrix}
\]
The first block row belongs to the V-nodes, the second to the P-nodes.
Algorithm: LU decomposition of an F-matrix.
  Compute a fill-reducing ordering for the graph $F(A) \cup F(BB^T)$;
  during Gaussian elimination, insert the P-nodes to form $2 \times 2$ pivots whenever a coupling between a V-node and a P-node is encountered.
Theorem 1. In every step of the above algorithm, the resulting Schur complement is an F-matrix.

How is fill generated in the direct approach?
\[
\begin{pmatrix}
\alpha & \beta & a^T & b^T \\
\beta & 0 & \hat{b}^T & 0 \\
a & \hat{b} & \hat{A} & \hat{B} \\
b & 0 & \hat{B}^T & O
\end{pmatrix}. \tag{3}
\]
Elimination step:
  A multiple of $\hat{b}\hat{b}^T$ is added to $\hat{A}$;
  $\hat{b}$ becomes denser as P-nodes are eliminated;
  So dropping in $\hat{A}$ doesn't make sense.
Main problem: for ILU we have to get rid of the couplings of velocities to the inside pressure.
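The elimination step can be worked out explicitly. The following derivation is a sketch based only on the block structure in (3), with the $2 \times 2$ pivot taken as the leading block; it is not copied from the talk. The pivot and its inverse are
\[
P = \begin{pmatrix} \alpha & \beta \\ \beta & 0 \end{pmatrix},
\qquad
P^{-1} = \frac{1}{\beta^2} \begin{pmatrix} 0 & \beta \\ \beta & -\alpha \end{pmatrix},
\]
and with $E = \begin{pmatrix} a & \hat{b} \\ b & 0 \end{pmatrix}$ the Schur complement is
\[
\begin{pmatrix} \hat{A} & \hat{B} \\ \hat{B}^T & O \end{pmatrix} - E P^{-1} E^T
=
\begin{pmatrix}
\hat{A} - \frac{1}{\beta}\left(a \hat{b}^T + \hat{b} a^T\right) + \frac{\alpha}{\beta^2}\, \hat{b}\hat{b}^T
& \hat{B} - \frac{1}{\beta}\, \hat{b} b^T \\
\hat{B}^T - \frac{1}{\beta}\, b \hat{b}^T & O
\end{pmatrix}.
\]
The zero block is untouched, so the result is again a saddle point matrix. The velocity block receives the rank-one term $\frac{\alpha}{\beta^2}\hat{b}\hat{b}^T$, which couples all V-nodes that appear in $\hat{b}$ with each other, and the update $\hat{B} - \frac{1}{\beta}\hat{b} b^T$ passes the pattern of $\hat{b}$ on to the remaining pressure couplings.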

Domain decomposition
This ordering exposes parallelism in the matrix:
\[
K = \begin{pmatrix} K_{11} & K_{12} \\ K_{21} & K_{22} \end{pmatrix},
\]
where $K_{11}$ is block-diagonal.
Subdomains and 'separator groups';
Retain one pressure per subdomain.

The Schur complement
LU decomposition of the matrices on the subdomains, $K_{11} = L_{11} U_{11}$;
Schur complement: $S = K_{22} - K_{21} K_{11}^{-1} K_{12}$;
$S$ retains structural and numerical properties of $K$;
$S$ has only a few rather dense 'B' columns (with at most two entries per row);
Solve the system with $S$ by a preconditioned Krylov subspace method.
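Because $K_{11}$ is block-diagonal, the Schur complement can be accumulated one subdomain at a time, which is where the parallelism comes from. The NumPy sketch below illustrates this with dense random test data (block sizes, matrices, and the function name are assumptions); each `np.linalg.solve` call plays the role of the subdomain factorization $K_{11} = L_{11} U_{11}$, and the result is compared with the formula applied to the assembled $K_{11}$.

```python
import numpy as np


def schur_complement(K11_blocks, K12, K21, K22):
    """S = K22 - K21 K11^{-1} K12, accumulated per diagonal block of K11."""
    S = np.array(K22, dtype=float)            # work on a copy
    start = 0
    for block in K11_blocks:
        stop = start + block.shape[0]
        # Contribution of one subdomain; independent of all other subdomains.
        S -= K21[:, start:stop] @ np.linalg.solve(block, K12[start:stop, :])
        start = stop
    return S


rng = np.random.default_rng(2)
sizes, ns = [3, 4], 2                          # hypothetical subdomain / separator sizes
blocks = []
for n in sizes:
    M = rng.standard_normal((n, n))
    blocks.append(M @ M.T + n * np.eye(n))     # SPD subdomain matrices (assumption)
n1 = sum(sizes)
K12 = rng.standard_normal((n1, ns))
K21 = K12.T.copy()
K22 = 10.0 * np.eye(ns)

S = schur_complement(blocks, K12, K21, K22)

# Reference: assemble the block-diagonal K11 and apply the formula directly.
K11 = np.zeros((n1, n1))
K11[:3, :3] = blocks[0]
K11[3:, 3:] = blocks[1]
S_ref = K22 - K21 @ np.linalg.solve(K11, K12)
print(np.allclose(S, S_ref))
```

In this symmetric toy case $S$ stays symmetric, a small instance of the statement that $S$ retains properties of $K$.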

How can we maintain sparsity?
Still an F-matrix;
All V-nodes on a separator are now connected to the same 2 P-nodes;
Use an orthogonal similarity transformation to disconnect them (harmless: SPD remains SPD)
⇒ Only one V-node per separator remains connected to P-nodes ($V_\Sigma$-nodes)
We did it!
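One way to picture such a transformation is a Householder reflection. The slides do not say which orthogonal transformation is used, so the choice below is an assumption, and the separator data is made up. The sketch takes four V-nodes that are all coupled to the same two P-nodes, applies the reflection, and checks that only the first transformed V-node (the $V_\Sigma$-node) stays coupled to the pressures, that the velocity block keeps its eigenvalues, and that the first transformed unknown is proportional to the total flux through the separator.

```python
import numpy as np


def householder_to_e1(b):
    """Orthogonal, symmetric H with H @ b equal to a multiple of the first unit vector."""
    b = np.asarray(b, dtype=float)
    v = b.copy()
    v[0] += np.copysign(np.linalg.norm(b), b[0])
    return np.eye(b.size) - 2.0 * np.outer(v, v) / (v @ v)


n = 4                                          # V-nodes on one separator (assumption)
B_sep = np.tile([1.0, -1.0], (n, 1))           # each V-node coupled to the same 2 P-nodes
H = householder_to_e1(B_sep[:, 0])
B_new = H @ B_sep                              # only row 0 keeps V-P couplings

rng = np.random.default_rng(3)
M = rng.standard_normal((n, n))
A_sep = M @ M.T + n * np.eye(n)                # SPD velocity block (assumption)
A_new = H @ A_sep @ H.T                        # similarity transformation: still SPD

u = rng.standard_normal(n)                     # velocities on the separator
flux_unknown = (H @ u)[0]                      # equals -sum(u) / 2 for this H
print(np.round(B_new, 12))
```

The transformed coupling row is again of gradient type (two entries, zero row sum), in line with the "still an F-matrix" claim, and the remaining V-nodes can be dropped against without touching the 'Grad' and 'Div' parts.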

Why it works
Orthogonal transformations:
  Eliminate most V-P couplings to avoid fill;
  'Transfer operators' defining the coarse problem $S_2$;
  Improves diagonal dominance ⇒ grid-independent convergence;
Coarse problem $S_2$:
  solve for the flux $V_\Sigma$ through each separator;
  Still an F-matrix in case of the Stokes equations;
Constraint preconditioning:
  no approximations in the 'Grad' or 'Div' part;
  mass is conserved exactly throughout.
Drop-by-position:
  original properties preserved (symmetry, positiveness);
  singular subsystems cannot occur ⇒ robust approach.
No segregation of variables
Grid-independent convergence:
  Block size determines the rate of convergence
  In the two-level variant the amount of work is not independent of problem size.

Example 1: 3D Laplace Equation (2)

Stokes equations: relative fill

Achieving high accuracy
Driven cavity, $512 \times 512$ grid;
Subdomain size: $8 \times 8$;
Convergence tolerance $10^{-10}$;
Preconditioned GMRES;
⇒ Some modes are not captured using this subdomain size.

Generalizations
Different coordinate systems:
  spherical coordinates common in geophysics
  Flux-formulation ⇒ F-matrix
Different discretizations: B-grid
Different physics:
  can solve Poisson, Convection-Diffusion, Stokes
  adding heat transfer is easy
  Coriolis force is skew-symmetric, so it works
  good scaling is essential
  rotate v by 45 degrees ⇒ F-matrix

Possible improvements
Multi-level extension:
  Reduced problem has the same structure as the original matrix;
  Recursive application leads to linear computational complexity.
Deflation to avoid 'plateaus' in GMRES
Adaptivity:
  Any domain decomposition can be used;
  Inhomogeneous problems: short separators in regions of weak coupling.
Unstructured grids:
  Structure-preserving direct method?