
Preconditioning Techniques for Large Linear Systems
Part III: General-Purpose Algebraic Preconditioners

Michele Benzi
Department of Mathematics and Computer Science, Emory University, Atlanta, Georgia, USA

Scuola di Dottorato di Ricerca in Scienze Matematiche, Dipartimento di Matematica, Università degli Studi di Padova

Outline
1 Introduction
2 Generalities about preconditioning
3 Basic concepts of algebraic preconditioning
4 Incomplete factorizations
5 Sparse approximate inverses
6 IF via approximate inverses
7 Balanced Incomplete Factorization (BIF)
8 Conclusions


Preconditioned iterative methods

Solving large linear systems $Ax = b$ by Krylov-type methods.

Preconditioning may be viewed as a transformation: $M^{-1}Ax = M^{-1}b$, or $AM^{-1}y = b$, $x = M^{-1}y$.

Examples: matrix splittings (block Jacobi, Gauss-Seidel, SSOR); incomplete factorizations; sparse approximate inverses; AMG...

The preconditioner M (or $M^{-1}$) should be cheap and fast to compute, and should result in rapid convergence of the preconditioned iterative method; but it should also be sufficiently robust and sparse (i.e., have low storage requirements).

A further consideration is the case of sequences of linear systems $A^{(k)} x^{(k)} = b^{(k)}$, $k = 0, 1, 2, \dots$
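As a concrete illustration, here is a minimal SciPy sketch of this setup (the Poisson-type test matrix and the spilu parameters are placeholders, not taken from the lecture). Note that $M^{-1}$ is never formed explicitly; only its action on a vector is supplied to the Krylov solver.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Placeholder test problem: 2-D Poisson-type matrix.
n = 50
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
A = (sp.kron(sp.eye(n), T) + sp.kron(T, sp.eye(n))).tocsc()
b = np.ones(A.shape[0])

# Incomplete LU factorization M = L*U ~ A; ilu.solve(v) applies M^{-1} v.
ilu = spla.spilu(A, drop_tol=1e-4, fill_factor=10)
M = spla.LinearOperator(A.shape, matvec=ilu.solve)

# Preconditioned GMRES: the solver only needs mat-vecs with A and M^{-1}.
x, info = spla.gmres(A, b, M=M)  # info == 0 signals convergence
```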

Preconditioned iterative methods

Structure of this lecture:
1 Brief discussion of algebraic vs. problem-specific preconditioning
2 Description of the guiding principles behind algebraic preconditioning (IF and SAI); robustness problems of standard techniques
3 Some recent approaches which exploit information on the matrix inverse
4 An approach based on a novel decomposition of the input matrix
5 Other recent developments: hybrid and multilevel methods (briefly)


A quote

"In ending this book with the subject of preconditioners, we find ourselves at the philosophical center of the scientific computing of the future... Nothing will be more central to computational science in the next century than the art of transforming a problem that appears intractable into another whose solution can be approximated rapidly. For Krylov subspace matrix iterations, this is preconditioning."

From L. N. Trefethen and D. Bau, III, Numerical Linear Algebra, SIAM, 1997.

Algebraic vs. Problem-Specific Preconditioning

Algebraic preconditioners only use information extracted from the input matrix A, usually supplemented by some user-provided tuning parameters, such as drop tolerances or limits on the amount of fill-in allowed.

Main examples include:
- preconditioners based on classical (block) splittings $A = M - N$
- incomplete factorizations: $M = \bar{L}\bar{U} \approx A$
- approximate inverse preconditioners: $G = M^{-1} \approx A^{-1}$
- Algebraic Multi-Grid (AMG)
- hybrids obtained by combining some of the above

Such preconditioners are good candidates for inclusion in general-purpose software packages. Although they are seldom optimal for any given problem, they are widely applicable and have proven to be reasonably robust in countless applications. Also, they are being continually improved.

Algebraic vs. Problem-Specific Preconditioning

Discretization of a continuous problem (a system of PDEs, an integral equation, etc.) leads to a sequence of linear systems $A_n x_n = b_n$ where $A_n$ is $n \times n$ and $n \to \infty$ as the discretization is refined (that is, as $h \to 0$).

Definition: A preconditioner is optimal if it results in a rate of convergence of the preconditioned iteration that is asymptotically constant as the problem size increases, and if the cost of each preconditioned iteration scales linearly with the size of the problem. (For integral equations, the scaling of each iteration may instead be $O(n \log n)$ or similar.)

For example, in the SPD case, if $\kappa_2(M_n^{-1}A_n) \le C$ where C is a constant independent of n, then $M_n$ is an optimal preconditioner provided the action of $M_n^{-1}A_n$ on a vector can be computed in $O(n)$ work.
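For the SPD case, the link between this condition number bound and an n-independent iteration count is the standard conjugate gradient error estimate:

$$\|x - x_k\|_A \;\le\; 2\left(\frac{\sqrt{\kappa}-1}{\sqrt{\kappa}+1}\right)^{k} \|x - x_0\|_A, \qquad \kappa = \kappa_2(M_n^{-1}A_n) \le C,$$

so the number of iterations needed to reduce the A-norm of the error by a fixed factor is bounded independently of n.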

Algebraic vs. Problem-Specific Preconditioning

In contrast, problem-specific preconditioners, which are designed for a narrow class of problems, are often optimal. These methods make extensive use of the developer's knowledge of the application at hand, including information about the physics, the geometry, and the particular discretization technique used.

These preconditioners are usually not suitable for other types of problems, so their range of applicability is limited.

Many PDE-based (or physics-based) preconditioners belong to this class. An example is Diffusion Synthetic Acceleration (DSA) in radiation transport. Other examples of problem-specific preconditioners, especially for incompressible flow problems, will be discussed later in these lectures.

Algebraic vs. Problem-Specific Preconditioning

The two approaches, algebraic and problem-specific, are not necessarily mutually exclusive (much as with direct vs. iterative methods).

Most problem-specific preconditioners use algebraic ones as building blocks, e.g., to solve or to approximate subproblems arising within the overall preconditioning strategy. Some algebraic preconditioners are flexible enough that they can be tailored to specific applications.

Conversely, there has been a trend in recent years to build algebraic preconditioners that mimic the properties of specialized preconditioners; for instance, algebraic multilevel methods.


Implicit vs. explicit preconditioners

An implicit, or direct, preconditioner is an approximation of the input matrix: $M \approx A$.

An explicit, or inverse, preconditioner is an approximation of the inverse of the input matrix: $G = M^{-1} \approx A^{-1}$. This is motivated by the observation that even though $A^{-1}$ is a dense matrix, many of its entries are negligibly small.

Examples of implicit preconditioners include classical splittings, incomplete factorizations, and their block and multilevel variants. Examples of explicit preconditioners include polynomial preconditioners, sparse approximate inverses, and data-sparse approximate inverses. Both factored and non-factored forms are in use.

Implicit vs. explicit preconditioners

Application of an implicit preconditioner within a Krylov method (like CG or GMRES) requires solving one or more linear systems, often with triangular or block triangular matrices. In contrast, application of an explicit preconditioner requires one or more matrix-vector products.

Explicit preconditioners are easier to parallelize. Generally speaking, however, the construction of an explicit preconditioner tends to be more costly than that of an implicit one. This is to be expected, since A (or its action) is known but $A^{-1}$ is not.

Also, convergence rates are usually better with implicit preconditioners than with explicit ones. But there are exceptions!


Incomplete Factorization (IF) methods

When a sparse matrix is factored by Gaussian elimination, fill-in usually takes place: the triangular factors L and U of the coefficient matrix A are considerably less sparse than A.

Even though sparsity-preserving reordering techniques can be used to reduce fill-in, sparse direct methods are not considered viable for solving very large linear systems, such as those arising from the discretization of three-dimensional boundary value problems, due to time and space constraints.

However, by discarding part of the fill-in in the course of the factorization process, simple but powerful preconditioners can be obtained in the form $M = \bar{L}\bar{U}$, where $\bar{L}$ and $\bar{U}$ are the incomplete (approximate) LU factors.

Incomplete Factorization (IF) methods

Incomplete factorization algorithms differ in the rules that govern the dropping of fill-in in the incomplete factors. Fill-in can be discarded based on several different criteria, such as position, value, or a combination of the two.

Letting $\mathbf{n} = \{1, 2, \ldots, n\}$, one can fix a subset $S \subseteq \mathbf{n} \times \mathbf{n}$ of positions in the matrix, usually including the main diagonal and all $(i,j)$ such that $a_{ij} \ne 0$, and allow fill-in in the LU factors only in positions which are in S.

Formally, an incomplete factorization step can be described as
$$a_{ij} \leftarrow \begin{cases} a_{ij} - a_{ik}\, a_{kk}^{-1}\, a_{kj} & \text{if } (i,j) \in S,\\ a_{ij} & \text{otherwise,} \end{cases}$$
for each k and for $i, j > k$.
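A minimal dense-storage sketch of this update (illustrative only: it assumes S contains the diagonal, does no pivoting, and a real implementation would work on sparse data structures):

```python
import numpy as np

def ilu_pattern(A, S):
    """Incomplete LU restricted to a set S of (i, j) positions.

    A : (n, n) ndarray (copied); S : set of index pairs allowed to
    fill in (must contain the diagonal). Returns the factors compactly:
    the strict lower triangle holds L (unit diagonal implied), the
    upper triangle holds U.
    """
    A = A.astype(float).copy()
    n = A.shape[0]
    for k in range(n):
        for i in range(k + 1, n):
            if (i, k) not in S:
                continue            # multiplier l_ik not kept
            A[i, k] /= A[k, k]      # multiplier l_ik
            for j in range(k + 1, n):
                if (i, j) in S:
                    A[i, j] -= A[i, k] * A[k, j]  # update only on S
    return A
```

With S equal to the nonzero pattern of A, this is exactly the no-fill ILU(0) discussed below.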

Incomplete Factorization (IF) methods

Can very simple patterns give cheap, cache-efficient preconditioners? Example: a banded pattern. Matrix BCSSTK38, n = 8032, nnz = 181,746; SPD (small structural analysis problem from Boeing).

bandwidth (full)   PCG its
  1                426
  3                821
  5                648
  9                1638
 15                792
101                105
131                56
151                nc
311                35
411                18

(nc = no convergence)

Incomplete Factorization (IF) methods

Notice that the incomplete factorization may fail due to division by zero or near-zero (usually referred to as pivot breakdown), even if A admits an LU factorization without pivoting. Partial pivoting can help, but it is costly and does not always suffice in the incomplete case.

If S coincides with the set of positions which are nonzero in A, we obtain the no-fill ILU factorization, or ILU(0). For SPD matrices the same concept applies to the Cholesky factorization $A = LL^T$, resulting in the no-fill IC factorization, or IC(0). When used with the conjugate gradient algorithm, this preconditioner leads to the ICCG method (Meijerink & van der Vorst, 1977).

Incomplete Factorization (IF) methods

The no-fill ILU and IC preconditioners are very simple to implement, inexpensive to compute, and reasonably effective for significant classes of problems, such as low-order discretizations of scalar elliptic PDEs leading to M-matrices or to diagonally dominant matrices. No pivot breakdown can occur in these cases (Meijerink & van der Vorst, 1977; Manteuffel, 1980).

However, for more difficult and realistic problems the no-fill factorizations yield too crude an approximation of A, and more sophisticated preconditioners, which allow some fill-in in the incomplete factors, are needed. This is the case, for instance, for highly nonsymmetric and indefinite matrices such as those arising in many CFD applications.

Incomplete Factorization (IF) methods

A hierarchy of ILU preconditioners may be obtained based on the concept of levels of fill: a level of fill is attributed to each matrix entry that occurs in the incomplete factorization process, and fill-ins are dropped based on the value of the level of fill. The formal definition is as follows.

The initial level of fill of a matrix entry $a_{ij}$ is defined to be
$$\mathrm{lev}_{ij} = \begin{cases} 0 & \text{if } a_{ij} \ne 0 \text{ or } i = j,\\ \infty & \text{otherwise.} \end{cases}$$
Each time this entry is modified by the ILU process, its level of fill must be updated according to
$$\mathrm{lev}_{ij} = \min\{\mathrm{lev}_{ij},\ \mathrm{lev}_{ik} + \mathrm{lev}_{kj} + 1\}.$$

Let l be a nonnegative integer. With ILU(l), all fill-ins whose level of fill is greater than l are dropped. Note that for l = 0 we recover the no-fill ILU(0) preconditioner.
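The symbolic phase that this defines can be sketched in a few lines (dense level matrix for readability; efficient implementations work on the graph of A, cf. the Hysom & Pothen reference cited below):

```python
import numpy as np

def ilu_levels(A, l):
    """Compute the ILU(l) sparsity pattern via levels of fill.

    A : (n, n) ndarray. Returns a boolean mask, True where the entry
    is kept. Level inf marks positions not (yet) in the pattern.
    """
    n = A.shape[0]
    lev = np.where((A != 0) | np.eye(n, dtype=bool), 0.0, np.inf)
    for k in range(n):
        for i in range(k + 1, n):
            if lev[i, k] > l:
                continue  # multiplier l_ik itself is dropped
            for j in range(k + 1, n):
                lev[i, j] = min(lev[i, j], lev[i, k] + lev[k, j] + 1)
    return lev <= l
```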

Example: level-based incomplete LU factorizations ILU(l)

- Motivated by decay in the factors of diagonally dominant matrices
- Structure of the incomplete factors can be predicted using the matrix graph

[Spy plots for a small test matrix (n = 50), showing how the pattern of the incomplete factors fills in with the level: nz = 217 for ILU(0), then 289, 349, 457, 541, 601, 637, and 649 for ILU(1) through ILU(7).]

Numerical Example

The symbolic construction is fast (Hysom & Pothen, SISC 2001), but ILU(l) is typically expensive to apply even for a modest number of levels.

Example: matrix ENGINE, n = 143,571, nnz = 2,424,822; SPD.

levels   size of prec.   PCG its
0        2,424,822       523
1        4,458,588       300
2        7,595,466       199
3        12,128,289      115
4        18,078,603      87
5        25,474,380      54
6        34,153,746      45
7        43,861,328      46
8        54,276,063      36

Preprocessing incomplete factorizations

Preprocessing originally designed for direct solvers is often very useful for improving the robustness of ILU preconditioners:

- Symmetric reorderings (RCM, MD, ND, etc.)
- "Static pivoting": nonsymmetric permutations and scalings aimed at increasing diagonal dominance (Duff & Koster, SIMAX 1999, 2001; B., Haws & Tůma, SISC 2000; Saad, SISC 2005; Mayer, SISC 2008); see the sketch after this list
- Extensions to symmetric indefinite problems (Duff & Pralet, SIMAX 2005; Hagemann & Schenk, SISC 2006)
- Block variants (many authors)

But for very tough problems this is still not enough to guarantee convergence of the preconditioned iteration.
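To convey the flavor of static pivoting, here is a structural (unweighted) sketch that permutes rows so the diagonal becomes zero-free. It is only an analogue: MC64-style codes compute maximum weighted matchings and also scale the matrix, which SciPy's bipartite matching does not do.

```python
import scipy.sparse as sp
from scipy.sparse.csgraph import maximum_bipartite_matching

def permute_nonzero_diagonal(A):
    """Row-permute a sparse matrix so its diagonal is structurally nonzero.

    Unweighted analogue of MC64-style static pivoting: entry magnitudes
    are ignored, so this only guarantees a zero-free diagonal (when a
    perfect matching exists), not increased diagonal dominance.
    """
    A = sp.csr_matrix(A)
    # perm[j] = index of the row matched to column j
    perm = maximum_bipartite_matching(A, perm_type='row')
    if (perm < 0).any():
        raise ValueError("matrix is structurally singular")
    return A[perm, :]
```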

Example (cont.)

Preprocessing: the matrix is reordered with Multiple Minimum Degree (MMD), a fill-reducing ordering.

Matrix ENGINE, n = 143,571, nnz = 2,424,822, original vs. MMD ordering:

levels   size         its    size (MMD)   its (MMD)
0        2,424,822    523    2,424,822    439
1        4,458,588    300    4,394,040    214
2        7,595,466    199    6,509,826    159
3        12,128,289   115    8,859,522    96
4        18,078,603   87     11,292,927   66
5        25,474,380   54     13,664,157   49
6        34,153,746   45     15,891,321   34
7        43,861,328   46                  nc
8        54,276,063   36     19,590,303   18

Some improvement is observed, but the approach is not entirely robust.

The use of drop tolerances

In many cases, an efficient preconditioner can be obtained from an incomplete factorization where new fill-ins are accepted or discarded on the basis of their size. In this way, only fill-ins that contribute significantly to the quality of the preconditioner are stored and used.

A drop tolerance is a positive number $\tau$ used in a dropping criterion. An absolute dropping strategy can be used, whereby new fill-ins are accepted only if greater than $\tau$ in absolute value. This criterion may work poorly if the matrix is badly scaled, in which case it is better to use a relative drop tolerance. For example, when eliminating row i, a new fill-in is accepted only if it is greater in absolute value than $\tau \|a_i\|_2$, where $a_i$ denotes the ith row of A. Other criteria are also in use.

The use of drop tolerances

A drawback of this approach is that it is difficult to choose a good value of the drop tolerance: usually this is done by trial and error on a few sample matrices from a given application, until a satisfactory value of $\tau$ is found. In many cases good results are obtained for values of $\tau$ in the range $10^{-4}$ to $10^{-2}$, but the optimal value is strongly problem-dependent.

Another difficulty is that it is impossible to predict the amount of storage needed for the incomplete LU factors. An efficient, predictable algorithm is obtained by also limiting the number of nonzeros allowed in each row of the triangular factors. Saad (1994) proposed the following dual threshold strategy: fix a drop tolerance $\tau$ and a number p of fill-ins to be allowed in each row of the incomplete L/U factors; at each step of the elimination process, drop all fill-ins that are smaller than $\tau$ times the 2-norm of the current row; of all the remaining ones, keep (at most) the p largest in magnitude.
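The selection rule itself is compact in code. Here is a sketch of the per-row dual-threshold filter only (in actual ILUT the rule is applied separately to the L and U parts of the row, and the diagonal entry is always kept):

```python
import math

def dual_threshold_drop(row, tau, p):
    """ILUT-style dropping for one computed row of the incomplete factors.

    row : dict mapping column index -> value.
    Step 1: drop entries smaller than tau * ||row||_2.
    Step 2: of the survivors, keep at most the p largest in magnitude.
    """
    norm = math.sqrt(sum(v * v for v in row.values()))
    kept = {j: v for j, v in row.items() if abs(v) >= tau * norm}
    if len(kept) > p:
        largest = sorted(kept, key=lambda j: abs(kept[j]), reverse=True)[:p]
        kept = {j: kept[j] for j in largest}
    return kept
```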

The use of drop tolerances

A variant of this approach allows, in each row of the incomplete factors, p nonzeros in addition to the positions that were already nonzero in the original matrix A. This makes sense for irregular problems in which the nonzeros of A are not distributed uniformly.

The resulting preconditioner, denoted ILUT($\tau$, p), is quite powerful. If it fails on a problem for a given choice of the parameters $\tau$ and p, it will often succeed with a smaller value of $\tau$ and/or a larger value of p. The corresponding incomplete Cholesky preconditioner for SPD matrices, denoted ICT, can also be defined.

ILUT($\tau$, p) and its variant with partial pivoting, ILUTP($\tau$, p), are quite effective and widely used in many industrial applications. However, failures can still occur.

Example

IC(0)/ICT may fail while simple diagonal scaling works!

Matrix LDOOR (structural analysis of a car door), n = 952,203, nnz = 23,737,339.

preconditioner   size         PCG its
Jacobi           952,203      810
IC(0)            23,737,339   > 1000
ICT              23,838,704   > 1000
ICT              24,614,381   > 1000
ICT              26,167,321   > 1000
ICT              30,047,027   > 1000
ICT              37,809,756   > 1000
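The diagonal (Jacobi) scaling that wins here costs almost nothing to build or apply. A minimal SciPy sketch (the tridiagonal test matrix is a placeholder, since LDOOR itself is not reproduced here):

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def jacobi_preconditioner(A):
    """Return M^{-1} = diag(A)^{-1} as a LinearOperator for use in CG."""
    d = A.diagonal()
    return spla.LinearOperator(A.shape, matvec=lambda v: v / d)

# Usage on a small SPD placeholder matrix:
A = sp.diags([-1.0, 4.0, -1.0], [-1, 0, 1], shape=(1000, 1000)).tocsr()
b = np.ones(1000)
x, info = spla.cg(A, b, M=jacobi_preconditioner(A))  # info == 0: converged
```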

Stability considerations

ILU preconditioners attempt to make the residual matrix $R := A - M$ small in some norm. However, this does not always result in good preconditioners.

As observed by several authors (Elman, Saad, ...), a more meaningful approximation measure is based on the size of the error matrix $E := I - AM^{-1}$.

Approximate inverse preconditioners attempt to make E small, but this may require a huge number of nonzeros in the preconditioner (unless the entries of $A^{-1}$ exhibit fast off-diagonal decay).

Note that $E = RM^{-1}$, so $\|E\| \le \|R\|\,\|M^{-1}\|$. Hence, if M is very ill-conditioned ($\|M^{-1}\|$ is very large), then a very large error matrix may occur even if $A - M$ is small. This often results in failure to converge.
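For small test matrices this instability is easy to probe directly. A sketch (dense probe of the error matrix, so it is meant for modest n only; the drop tolerance is a placeholder):

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def ilu_error_norm(A, drop_tol=1e-2):
    """Frobenius norm of E = I - A M^{-1} for an ILU preconditioner M of A.

    Builds M^{-1} explicitly, one column per triangular solve, so the
    cost is n solves: a diagnostic for small matrices, not production code.
    """
    n = A.shape[0]
    ilu = spla.spilu(sp.csc_matrix(A), drop_tol=drop_tol)
    Minv = np.column_stack([ilu.solve(e) for e in np.eye(n)])
    return np.linalg.norm(np.eye(n) - A @ Minv, 'fro')
```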

Stability considerations

Example (B., Szyld & van Duin, SISC 1999): the system Ax = b is a discretization of a convection-dominated convection-diffusion equation. Solver: Bi-CGSTAB. Orderings: lexicographic and MMD.

Let $N_1 := \|A - \bar{L}\bar{U}\|_F$ and $N_2 := \|I - A(\bar{L}\bar{U})^{-1}\|_F$.

ILU(0)         Lexicogr.      MMD
N1             4.06 × 10^1    4.53 × 10^0
N2             3.26 × 10^6    2.00 × 10^2
Its            nc             59

ILUT(0.01,5)   Lexicogr.      MMD
N1             1.78 × 10^1    7.39 × 10^1
N2             2.79 × 10^1    5.81 × 10^6
Its            11             nc

Permuting large entries of A to the main diagonal

[Two spy plots, nz = 25,407 each: a Jacobian from the Navier-Stokes equations, original and permuted with MC64 + RCM.]

After preprocessing, ILUT with Bi-CGSTAB converges in 24 iterations. No convergence on the original system.


Sparse approximate inverses

Idea: directly approximate the inverse with a sparse matrix $G \approx A^{-1}$; the preconditioner is then applied with matrix-vector products involving G.

Mostly motivated by parallel processing; also less prone to instabilities than ILU, and easy to update when solving a sequence of linear systems. Also useful for constructing robust smoothers for multigrid, and for other purposes such as approximating Schur complements.

By now a large body of literature exists (hundreds of papers since the 1990s). Sparse approximate inverses have been used successfully in numerous applications, including:
- solution of dense linear systems from BEM in electromagnetics, acoustics, and elastodynamics problems
- solution of sparse linear systems from photon and neutron transport, CFD, Markov chains, eigenproblems, etc.
- quantum chemistry applications
- image processing (restoration, deblurring, inpainting)

Sparse approximate inverses

Main approaches: sparse approximate inverses (SAIs) can be factored or unfactored.

Factored forms are of the type $G = ZW$ where, for instance, $Z \approx U^{-1}$ and $W \approx L^{-1}$.

Factored forms are especially useful if A is SPD. In this case $W = Z^T$ and the approximate inverse $G = ZZ^T$ is guaranteed to be SPD, which allows the use of the conjugate gradient (CG) method.
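For the unfactored case, the classical construction minimizes $\|AG - I\|_F$, which decouples into n independent small least-squares problems, one per column of G (this is what makes the approach attractive in parallel). A minimal sketch with a fixed, user-supplied sparsity pattern (dense algebra for clarity; real SPAI codes restrict each problem to the touched rows and may grow the pattern adaptively):

```python
import numpy as np

def spai_fixed_pattern(A, pattern):
    """Frobenius-norm minimal sparse approximate inverse G of A.

    pattern[j] lists the row indices allowed to be nonzero in column j
    of G; each column solves min ||A[:, J] g - e_j||_2 independently.
    """
    n = A.shape[0]
    G = np.zeros((n, n))
    for j in range(n):
        J = pattern[j]
        e = np.zeros(n)
        e[j] = 1.0
        g, *_ = np.linalg.lstsq(A[:, J], e, rcond=None)
        G[J, j] = g
    return G

# A common pattern choice is that of A itself:
# pattern = [np.nonzero(A[:, j])[0] for j in range(A.shape[0])]
```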