Solving Large Nonlinear Sparse Systems


Fred W. Wubs (Computational Mechanics & Numerical Mathematics, University of Groningen, the Netherlands, f.w.wubs@rug.nl) and Jonas Thies (Centre for Interdisciplinary Mathematics, Uppsala University, Sweden, jonas@math.uu.se)
Workshop: Tipping Points in Complex Flows, Leiden, November 2, 2011

Outline
1. Two-level ILU
2. They are everywhere; Solution; Example: eigenvalue problem
3. Improvements and generalizations; Summary

Two-level ILU

Objective
- 3D CFD problems, geophysical applications
- Compute branches of steady states
- Identify bifurcation points
- Investigate stability
Key challenge: large sparse linear systems with the Jacobian

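Fully implicit approach
Incompressible Navier-Stokes equations:
$$u_t + N(u, u) + L u + \nabla p = 0, \qquad \nabla \cdot u = 0$$
- Discretize (here: second-order symmetry-preserving finite differences on a C-grid)
- Linearize by Newton's method
Structure of the resulting linear systems (saddle-point matrix):
$$\begin{pmatrix} L + N & \mathrm{Grad} \\ \mathrm{Div} & 0 \end{pmatrix} \begin{pmatrix} u \\ p \end{pmatrix} = \begin{pmatrix} f_u \\ f_p \end{pmatrix} \qquad (1)$$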

Direct vs. Iterative Linear Solvers
Sparse direct:
- robust and easy to use
- computational complexity $O(N^2)$ in 3D (N: number of unknowns)
- substantial fill-in, $O(N^{4/3})$
Preconditioned iterative:
- usually not robust, depends on many parameters
- can have optimal complexity $O(N)$
- saves memory and CPU time by avoiding fill-in
Can we combine the best of both? Aim: an ILU that stays close to LU and preserves the properties of the matrix.

Ingredients for effective and robust incomplete factorization
- Fill-reducing ordering
- Fourier-like transformation: improves diagonal dominance and gets rid of unwanted couplings
- Drop by retaining principal submatrices: these submatrices will be positive definite if the matrix is positive definite
- For the incompressible Navier-Stokes equations, do not drop in the divergence and gradient part: on a C-grid there is no increase of fill in this part (not even in a direct method)

The new algorithm (Stokes on a structured C-grid)
- Step 1: Domain decomposition
- Step 2: Identify separators
- Step 3: Elimination yields a 'geometric' Schur complement
- Step 4: Flux representation ('coarse grid')

F-matrices
A saddle point matrix has the following structure:
$$K = \begin{bmatrix} A & B \\ B^T & 0 \end{bmatrix}. \qquad (2)$$
Definition 1: A gradient-type matrix has at most two nonzero entries per row and its row sum is zero.
Definition 2: A saddle point matrix (2) is called an F-matrix if A is positive definite and B is a gradient-type matrix.
The Jacobian of the Stokes equations (Re = 0) on a C-grid is an F-matrix.
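The two definitions can be checked mechanically on small examples. The following is a minimal sketch, assuming dense NumPy arrays and testing positive definiteness through the symmetric part of A (an assumption, not something the slides specify); the function names `is_gradient_type` and `is_f_matrix` and the small 1D test matrices are hypothetical.

```python
import numpy as np

def is_gradient_type(B, tol=1e-12):
    """Definition 1: at most two nonzeros per row and every row sums to zero."""
    B = np.asarray(B, dtype=float)
    nnz_per_row = (np.abs(B) > tol).sum(axis=1)
    row_sums = np.abs(B.sum(axis=1))
    return bool(np.all(nnz_per_row <= 2) and np.all(row_sums <= tol))

def is_f_matrix(A, B, tol=1e-12):
    """Definition 2: A positive definite and B gradient-type.

    Assumption: positive definiteness is tested on the symmetric part of A.
    """
    A = np.asarray(A, dtype=float)
    sym_part = 0.5 * (A + A.T)
    positive_definite = bool(np.all(np.linalg.eigvalsh(sym_part) > tol))
    return positive_definite and is_gradient_type(B, tol)

# Hypothetical 1D example: 3 V-nodes, each coupled to its two neighbouring P-nodes.
A = 2.0 * np.eye(3) - np.eye(3, k=1) - np.eye(3, k=-1)
B = np.array([[-1.0, 1.0, 0.0, 0.0],
              [0.0, -1.0, 1.0, 0.0],
              [0.0, 0.0, -1.0, 1.0]])

# Breaking the zero row sum destroys the gradient-type property.
B_bad = B.copy()
B_bad[0, 1] = 2.0

print(is_f_matrix(A, B), is_gradient_type(B_bad))
```

With these inputs the first check passes and the perturbed matrix is rejected because its first row no longer sums to zero.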

Computing an LU decomposition of an F-matrix
$$\begin{bmatrix} A & B \\ B^T & 0 \end{bmatrix} \begin{bmatrix} x_v \\ x_p \end{bmatrix} = \begin{bmatrix} f_v \\ f_p \end{bmatrix}$$
The first block row holds the V-nodes, the second the P-nodes.
Algorithm (LU decomposition of an F-matrix): Compute a fill-reducing ordering for the graph $F(A) \cup F(BB^T)$. During Gaussian elimination, insert the P-nodes to form $2 \times 2$ pivots whenever a coupling between a V-node and a P-node is encountered.
Theorem 1: In every step of the above algorithm, the resulting Schur complement is an F-matrix.
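Theorem 1 can be illustrated numerically on a tiny case. The sketch below is not the authors' implementation: it assumes a hypothetical 1D F-matrix with 3 V-nodes and 4 P-nodes, eliminates one V-node together with a coupled P-node as a 2 x 2 pivot, and inspects the blocks of the resulting Schur complement. The fill-reducing ordering is skipped; the pivot pair is chosen by hand.

```python
import numpy as np

# Hypothetical 1D F-matrix: A is SPD, B is gradient-type.
nv = 3
A = 2.0 * np.eye(nv) - np.eye(nv, k=1) - np.eye(nv, k=-1)
B = np.array([[-1.0, 1.0, 0.0, 0.0],
              [0.0, -1.0, 1.0, 0.0],
              [0.0, 0.0, -1.0, 1.0]])
K = np.block([[A, B], [B.T, np.zeros((4, 4))]])

# 2x2 pivot: V-node 1 together with P-node 1 (index nv + 1 in K).
piv = [1, nv + 1]
# Remaining unknowns: V-nodes 0 and 2, then P-nodes 0, 2 and 3.
rest = [0, 2, nv + 0, nv + 2, nv + 3]

pivot_block = K[np.ix_(piv, piv)]
S = K[np.ix_(rest, rest)] - K[np.ix_(rest, piv)] @ np.linalg.solve(
    pivot_block, K[np.ix_(piv, rest)])

# Blocks of the Schur complement in saddle-point form.
A_new = S[:2, :2]      # velocity block
B_new = S[:2, 2:]      # gradient block
zero_block = S[2:, 2:]  # pressure-pressure block

print(A_new)
print(B_new)
```

In this example the pressure-pressure block stays zero, the new gradient block again has two entries per row with zero row sum, and the velocity block stays positive definite, which is what the theorem states.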

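How is fill generated in the direct approach?
$$\begin{bmatrix} \alpha & \beta & a^T & b^T \\ \beta & 0 & \hat{b}^T & 0 \\ a & \hat{b} & \hat{A} & \hat{B} \\ b & 0 & \hat{B}^T & O \end{bmatrix} \qquad (3)$$
Elimination step:
- A multiple of $\hat{b}\hat{b}^T$ is added to $\hat{A}$;
- $\hat{b}$ becomes denser as P-nodes are eliminated;
- So dropping in $\hat{A}$ doesn't make sense.
Main problem: for ILU we have to get rid of the couplings of velocities to interior pressures.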
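The elimination step can be written out explicitly. The following derivation is a sketch for the symmetric case shown in (3), using only the symbols of that matrix; it is not taken from the slides. The $2 \times 2$ pivot and its inverse are

$$P = \begin{bmatrix} \alpha & \beta \\ \beta & 0 \end{bmatrix}, \qquad P^{-1} = \begin{bmatrix} 0 & 1/\beta \\ 1/\beta & -\alpha/\beta^2 \end{bmatrix},$$

so the Schur complement after eliminating the pivot is

$$\begin{bmatrix} \hat{A} & \hat{B} \\ \hat{B}^T & O \end{bmatrix} - \begin{bmatrix} a & \hat{b} \\ b & 0 \end{bmatrix} P^{-1} \begin{bmatrix} a^T & b^T \\ \hat{b}^T & 0 \end{bmatrix} = \begin{bmatrix} \hat{A} - \frac{1}{\beta}\left(a\hat{b}^T + \hat{b}a^T\right) + \frac{\alpha}{\beta^2}\hat{b}\hat{b}^T & \hat{B} - \frac{1}{\beta}\hat{b}b^T \\ \hat{B}^T - \frac{1}{\beta}b\hat{b}^T & O \end{bmatrix}.$$

The zero block is untouched, the gradient block receives a rank-one update, and the velocity block receives the multiple of $\hat{b}\hat{b}^T$ mentioned above. Every nonzero of $\hat{b}$ therefore produces fill in $\hat{A}$, which is why a denser $\hat{b}$ is the source of the problem.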

Domain decomposition
This ordering exposes parallelism in the matrix:
$$K = \begin{pmatrix} K_{11} & K_{12} \\ K_{21} & K_{22} \end{pmatrix},$$
where $K_{11}$ is block-diagonal.
- Subdomains and 'separator groups';
- Retain one pressure per subdomain.

The Schur complement
- LU decomposition of the matrices on the subdomains, $K_{11} = L_{11} U_{11}$;
- Schur complement: $S = K_{22} - K_{21} K_{11}^{-1} K_{12}$;
- S retains structural and numerical properties of K;
- S has only a few rather dense 'B' columns (with at most two entries per row);
- Solve the system with S by a preconditioned Krylov subspace method.
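The reduction to the Schur complement can be sketched in a few lines. This is a minimal dense illustration, assuming a small random SPD matrix as a stand-in for K and direct solves in place of both the subdomain LU factors and the Krylov method; the function name `schur_solve` is hypothetical.

```python
import numpy as np

def schur_solve(K11, K12, K21, K22, f1, f2):
    """Solve [[K11, K12], [K21, K22]] [x1; x2] = [f1; f2] via the Schur complement."""
    Y = np.linalg.solve(K11, K12)          # K11^{-1} K12 (subdomain solves)
    z = np.linalg.solve(K11, f1)           # K11^{-1} f1
    S = K22 - K21 @ Y                      # Schur complement on the separators
    x2 = np.linalg.solve(S, f2 - K21 @ z)  # separator unknowns
    x1 = z - Y @ x2                        # back-substitution into the subdomains
    return x1, x2, S

# Hypothetical test problem: 4 interior unknowns, 2 separator unknowns.
rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
K = M @ M.T + 6.0 * np.eye(6)   # SPD stand-in for K
f = rng.standard_normal(6)
n1 = 4

x1, x2, S = schur_solve(K[:n1, :n1], K[:n1, n1:], K[n1:, :n1], K[n1:, n1:],
                        f[:n1], f[n1:])
x_ref = np.linalg.solve(K, f)
print(np.linalg.norm(np.concatenate([x1, x2]) - x_ref))
```

For this SPD stand-in the Schur complement is again SPD, a small instance of S retaining the numerical properties of K.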

How can we maintain sparsity?
- The Schur complement is still an F-matrix;
- All V-nodes on a separator are now connected to the same 2 P-nodes;
- Use an orthogonal similarity transformation to disconnect them (harmless: SPD remains SPD);
- Result: only one V-node per separator remains connected to the P-nodes (the $V_\Sigma$-nodes).
We did it!
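One way to realize such an orthogonal transformation is a Householder reflection, which is assumed here for illustration. The sketch takes a hypothetical separator with 4 V-nodes that all couple to the same two P-nodes with entries +1 and -1, and shows that after the transformation only the first V-node is still coupled to the pressures while the spectrum of the velocity block is unchanged. The helper name `householder` is hypothetical.

```python
import numpy as np

def householder(b):
    """Symmetric orthogonal H with H @ b equal to a multiple of the first unit vector."""
    b = np.asarray(b, dtype=float)
    v = b.copy()
    v[0] += np.copysign(np.linalg.norm(b), b[0])
    return np.eye(len(b)) - 2.0 * np.outer(v, v) / (v @ v)

# Hypothetical separator: 4 V-nodes, each coupled to the same 2 P-nodes.
m = 4
B_sep = np.outer(np.ones(m), [1.0, -1.0])              # gradient-type rows (+1, -1)
A_sep = 2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)  # SPD velocity block

H = householder(np.ones(m))
B_new = H @ B_sep          # V-P couplings after the transformation
A_new = H @ A_sep @ H.T    # similarity transformation of the velocity block

print(np.round(B_new, 12))
```

Rows 1 to 3 of `B_new` vanish, so only one transformed V-node remains coupled to the P-nodes, and its row still sums to zero. Because H is orthogonal, `A_new` has the same eigenvalues as `A_sep` and stays SPD.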

Dropping
Use simple drop-by-position:
- Drop all couplings between separator groups...
- ...and all couplings between $V_\Sigma$-nodes and regular V-nodes.
- Principal submatrices of an SPD matrix are SPD.
Result: a block-diagonal preconditioner with a 'reduced matrix' $S_2$ in the lower right.
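The reason drop-by-position is safe can be seen on a small example. The sketch below assumes a random SPD matrix as a stand-in for the transformed Schur complement and a hypothetical grouping of its unknowns; `drop_by_position` is an illustrative name, not a routine from the talk.

```python
import numpy as np

def drop_by_position(S, groups):
    """Keep only the couplings inside each index group (block-diagonal part of S)."""
    P = np.zeros_like(S)
    for g in groups:
        idx = np.ix_(g, g)
        P[idx] = S[idx]
    return P

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
S = M @ M.T + 6.0 * np.eye(6)        # SPD stand-in for the Schur complement

groups = [[0, 1], [2, 3, 4], [5]]    # hypothetical separator groups
P = drop_by_position(S, groups)

# Cholesky succeeds only for SPD matrices, so this doubles as a check.
L = np.linalg.cholesky(P)
print(np.linalg.eigvalsh(P).min() > 0)
```

Each diagonal block of `P` is a principal submatrix of `S`, so the preconditioner inherits positive definiteness no matter how the groups are chosen. This is why no singular subsystems can arise from the dropping.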

Why it works
Orthogonal transformations:
- eliminate most V-P couplings to avoid fill;
- act as 'transfer operators' defining the coarse problem $S_2$;
- improve diagonal dominance, which gives grid-independent convergence.
Coarse problem $S_2$:
- solve for the flux $V_\Sigma$ through each separator;
- still an F-matrix in the case of the Stokes equations.
Constraint preconditioning:
- no approximations in the 'Grad' or 'Div' part;
- mass is conserved exactly throughout.
Drop-by-position:
- original properties are preserved (symmetry, positiveness);
- singular subsystems cannot occur, so the approach is robust.
No segregation of variables.
Grid-independent convergence: the block size determines the rate of convergence.
In the two-level variant the amount of work is not independent of the problem size.

Example 1: 3D Laplace Equation
Convergence criterion: $\|r\| / \|r_0\| < 10^{-8}$.
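The relative-residual stopping test can be sketched as follows. This is only an illustration of the criterion: it assumes a small dense 3D Laplacian built from Kronecker products, preconditioned CG as the Krylov method, and a simple Jacobi preconditioner standing in for the two-level ILU. The names `laplace_3d` and `pcg` are hypothetical.

```python
import numpy as np

def laplace_3d(n):
    """Dense 7-point Laplacian on an n x n x n grid (Dirichlet boundaries)."""
    T = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    I = np.eye(n)
    return (np.kron(np.kron(T, I), I)
            + np.kron(np.kron(I, T), I)
            + np.kron(np.kron(I, I), T))

def pcg(A, b, apply_prec, tol=1e-8, maxit=500):
    """Preconditioned CG, stopped when ||r|| / ||r_0|| < tol."""
    x = np.zeros_like(b)
    r = b - A @ x
    r0_norm = np.linalg.norm(r)
    z = apply_prec(r)
    p = z.copy()
    rz = r @ z
    for it in range(1, maxit + 1):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) / r0_norm < tol:   # relative residual criterion
            return x, it
        z = apply_prec(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, maxit

A = laplace_3d(6)                    # 216 unknowns
b = np.ones(A.shape[0])
diag = np.diag(A).copy()
x, iters = pcg(A, b, lambda r: r / diag)   # Jacobi stand-in preconditioner
rel_res = np.linalg.norm(b - A @ x) / np.linalg.norm(b)
print(iters, rel_res)
```

A stronger preconditioner only changes `apply_prec`; the stopping test stays the same, which makes iteration counts comparable across grid sizes.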

Example 1: 3D Laplace Equation (2)

Stokes equations: relative fill

Stokes equations: number of iterations

2D lid-driven cavity
- Incompressible Navier-Stokes;
- Stretched structured grid (ratio 5);
- Newton's method;
- First Hopf bifurcation at Re ≈ 8375 (Tiesinga & Wubs 2002).

Convergence behavior
Convergence criterion: $\|r\| / \|r_0\| < 10^{-6}$

Achieving high accuracy
- Driven cavity, 512 × 512 grid;
- Subdomain size: 8 × 8;
- Convergence tolerance $10^{-10}$;
- Preconditioned GMRES.
Observation: some modes are not captured using this subdomain size.

Robust at high Reynolds numbers
- Can compute highly unstable steady states;
- Moderate increase in number of iterations;
- Convergence tolerance $10^{-8}$ here.

Performance, 3D Flow in a Driven Cavity
Trilinos implementation, MUMPS for coarse solve; IBM P6, 32 cores/node (Huygens); subdomain size: $8^3$.
Speedups are listed as achieved/ideal, relative to the smallest core count for each grid.

grid size   cores   setup time (s)   setup speedup   solve time (s)   solve speedup
32^3        1       757              -               59.8             -
32^3        8       113              6.7/8           9.30             6.4/8
32^3        16      64.1             12/16           5.24             11/16
64^3        8       1050             -               91.3             -
64^3        16      555              1.9/2           50.0             1.8/2
64^3        32      290              3.6/4           30.2             3.0/4
64^3        64      176              6.6/8           89.3             0.9/8

They are everywhere
$$\begin{bmatrix} A & V \\ W^T & C \end{bmatrix} \begin{bmatrix} x \\ s \end{bmatrix} = \begin{bmatrix} f_x \\ f_c \end{bmatrix}$$
where A is a big sparse matrix and V and W contain a number of vectors. Such systems occur in:
- Continuation (Jacobian A singular near a turning point)
- Eigenvalue computation in the Jacobi-Davidson method
- DO method for stochastic PDEs using implicit methods
In the last two methods one has to compute a correction on a space perpendicular to the current space.

Standard solution
Standard approach: make a block LU factorization
$$\begin{bmatrix} A & V \\ W^T & C \end{bmatrix} = \begin{bmatrix} A & 0 \\ W^T & I \end{bmatrix} \begin{bmatrix} I & A^{-1} V \\ 0 & C - W^T A^{-1} V \end{bmatrix}$$
What if A becomes singular?
ARPACK: targets 0 and 0.1
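The standard block elimination, and the way it breaks down, can be sketched on small dense matrices. The example assumes random test data and hypothetical names (`bordered_solve`, `A_sing`, `K_sing`); it first solves a well-posed bordered system through the block LU factors and then builds a case in which A is singular while the full bordered matrix is not.

```python
import numpy as np

def bordered_solve(A, V, W, C, fx, fc):
    """Solve [[A, V], [W^T, C]] [x; s] = [fx; fc] by block elimination with A."""
    AinvV = np.linalg.solve(A, V)
    Ainvf = np.linalg.solve(A, fx)
    small = C - W.T @ AinvV                  # C - W^T A^{-1} V
    s = np.linalg.solve(small, fc - W.T @ Ainvf)
    x = Ainvf - AinvV @ s
    return x, s

rng = np.random.default_rng(0)
n, k = 5, 2
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)                  # well-conditioned stand-in for A
V = rng.standard_normal((n, k))
W = rng.standard_normal((n, k))
C = 5.0 * np.eye(k)
fx, fc = rng.standard_normal(n), rng.standard_normal(k)

x, s = bordered_solve(A, V, W, C, fx, fc)
K = np.block([[A, V], [W.T, C]])
ref = np.linalg.solve(K, np.concatenate([fx, fc]))

# Singular A, but the border restores regularity of the full matrix.
A_sing = np.diag([2.0, 1.0, 0.0])
e3 = np.array([[0.0], [0.0], [1.0]])
K_sing = np.block([[A_sing, e3], [e3.T, np.zeros((1, 1))]])
print(np.linalg.matrix_rank(A_sing), np.linalg.det(K_sing))
```

In the second case the elimination with A cannot be carried out, even though the bordered system itself is perfectly solvable. This is the situation near a turning point in continuation.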

Incorporation in multilevel approach
Multilevel ILU comes in very handy. Example in the two-level case:
$$\begin{bmatrix} A_{11} & A_{12} & V_1 \\ A_{21} & A_{22} & V_2 \\ W_1^T & W_2^T & C \end{bmatrix} = \begin{bmatrix} A_{11} & 0 & 0 \\ A_{21} & I & 0 \\ W_1^T & 0 & I \end{bmatrix} \begin{bmatrix} I & A_{11}^{-1} A_{12} & A_{11}^{-1} V_1 \\ 0 & A_{22} - A_{21} A_{11}^{-1} A_{12} & V_2 - A_{21} A_{11}^{-1} V_1 \\ 0 & W_2^T - W_1^T A_{11}^{-1} A_{12} & C - W_1^T A_{11}^{-1} V_1 \end{bmatrix}$$
- On the last level we can apply a direct method with pivoting, which precludes instability.
- Possible indefiniteness is likely to occur in the low-frequency part of the solution. This is pushed downwards to the coarsest grid.
- If the last block of the preconditioning matrix is indefinite, then the original matrix is signaling a bifurcation.
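The two-level factorization of the bordered matrix can be verified numerically. The check below is a sketch with random dense blocks of hypothetical sizes; only $A_{11}$ is inverted, so the border is carried along to the last level exactly as in the formula.

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n2, k = 4, 3, 2   # hypothetical block sizes

A11 = rng.standard_normal((n1, n1)) + 2 * n1 * np.eye(n1)   # shifted to be safely invertible
A12 = rng.standard_normal((n1, n2))
A21 = rng.standard_normal((n2, n1))
A22 = rng.standard_normal((n2, n2))
V1, V2 = rng.standard_normal((n1, k)), rng.standard_normal((n2, k))
W1, W2 = rng.standard_normal((n1, k)), rng.standard_normal((n2, k))
C = rng.standard_normal((k, k))

K = np.block([[A11, A12, V1], [A21, A22, V2], [W1.T, W2.T, C]])

X12 = np.linalg.solve(A11, A12)   # A11^{-1} A12
X1V = np.linalg.solve(A11, V1)    # A11^{-1} V1

lower = np.block([
    [A11, np.zeros((n1, n2)), np.zeros((n1, k))],
    [A21, np.eye(n2), np.zeros((n2, k))],
    [W1.T, np.zeros((k, n2)), np.eye(k)],
])
upper = np.block([
    [np.eye(n1), X12, X1V],
    [np.zeros((n2, n1)), A22 - A21 @ X12, V2 - A21 @ X1V],
    [np.zeros((k, n1)), W2.T - W1.T @ X12, C - W1.T @ X1V],
])

print(np.linalg.norm(lower @ upper - K))
```

The lower-right 2 x 2 block of `upper` is again a bordered matrix of the same form, so the same step can be repeated on it. A direct method with pivoting is needed only on the last level.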

Eigenvalue problems
Jacobi-Davidson QZ method: accelerated inexact Newton method.
- Search space dimension ranges from 20 to 40.
- Use the two-level factorization as approximate Jacobian.
- Target 0, find the first 6 eigenvalues.
- First vector from LUx = rand.
Problem:
- Jacobian matrices from the lid-driven cavity at Re = 100 and 1000 (vertical).
- Grid refinement from 32x32 to 64x64 (horizontal).
- Plots show residual against iteration number (MVs). Mind the scale of the x-axis (200 resp. 400).

Eigenvalue problems
[Figure: four convergence plots, each titled "JDQZ with jmin=20, jmax=40, residual tolerance 1e-08". Vertical axis: log10 of the residual norm, from 1 down to -9; horizontal axis: iteration step, up to 200 in the first two panels and up to 400 in the last two. Panel annotations: "Test subspace computed as Av tau*dv" and "Correction equation solved with Augm. Prec.".]

Generalizations
- Different coordinate systems: spherical coordinates are common in geophysics; the flux formulation gives an F-matrix
- Different discretizations: B-grid
- Different physics: can solve Poisson, convection-diffusion, Stokes; adding heat transfer is easy; the Coriolis force is skew-symmetric, so it works
- Good scaling is essential
- Rotating v by 45° gives an F-matrix

Possible improvements
- Multi-level extension: the reduced problem has the same structure as the original matrix; recursive application leads to linear computational complexity.
- Deflation to avoid 'plateaus' in GMRES.
- Adaptivity: any domain decomposition can be used; for inhomogeneous problems, short separators in regions of weak coupling.
- Unstructured grids: structure-preserving direct method?

Summary
- Bifurcation analysis requires fast and robust linear algebra.
- We developed a solver that combines:
  - Ease of use: only one parameter;
  - Robustness: the factorization doesn't break down, and it can be used as approximate Jacobian;
  - Parallelism: exposed on every level;
  - Grid-independent convergence for ILU.
- Extendable to multi-physics problems.
Next steps
- Recursive solver with $O(N)$ complexity
- Implement deflation
- Do some nice CFD problems

References
- A.C. de Niet and F.W. Wubs. Numerically stable $LDL^T$ factorization of F-type saddle point matrices. IMA Journal of Numerical Analysis, 29(1):208-234.
- F.W. Wubs and J. Thies. A robust two-level incomplete factorization for (Navier-)Stokes saddle point matrices. To appear in SIAM J. Matrix Anal. Appl., 2011. Preprint available on arXiv:1006.1874v1.
- G.L.G. Sleijpen and F.W. Wubs. Exploiting Multilevel Preconditioning Techniques in Eigenvalue Computations. SIAM Journal on Scientific Computing, 25(4):1249-1272, 2003.