Incomplete Cholesky preconditioners that exploit the low-rank property


Incomplete Cholesky preconditioners that exploit the low-rank property (theory and practice)

Artem Napov, Service de Métrologie Nucléaire, Université Libre de Bruxelles
anapov@ulb.ac.be ; http://homepages.ulb.ac.be/~anapov/
PRECONDITIONING (TU Eindhoven), June 17, 2015

Motivation

The solution of a symmetric positive definite (SPD) system $Au = b$ may be obtained with
- a direct method (usually a Cholesky factorization $A = R^T R$), or
- a robust iterative method (usually preconditioned conjugate gradient), which is efficient if a good SPD preconditioner $B \approx A$ is available, that is, one that is
  1. cheap (to construct, store, invert, parallelize), and
  2. a good approximation of $A$.

These are contradictory goals, hence a tradeoff. A possible black-box design: take an exact factorization (here, Cholesky) and add approximation (to enforce 1.), yielding an incomplete factorization preconditioner $B = R^T R \approx A$, where $R$ is now only approximately the Cholesky factor of $A$.

Motivation

Incomplete factorizations $B = R^T R \approx A$ may perform approximations by
- dropping individual entries (mainstream), or
- using low-rank approximations (emerging).

Incomplete factorizations:
+ are (almost) black box (only an approximation threshold is required);
+ are (relatively) robust: they may break down, but breakdown-free variants exist;
- have no guarantee of fast convergence; convergence is fast if $B$ approximates $A$ well, that is, if
$$\kappa(R^{-T} A R^{-1}) = \frac{\lambda_{\max}(R^{-T} A R^{-1})}{\lambda_{\min}(R^{-T} A R^{-1})}$$
is small.

Here we present incomplete factorizations that are breakdown-free and have a controllable condition number.
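To make the role of $B = R^T R$ concrete, here is a minimal PCG sketch in which the preconditioner is applied through two triangular solves. This is illustrative NumPy/SciPy code, not the talk's implementation; the incomplete upper-triangular factor `Rt` is assumed to come from some incomplete Cholesky routine.

```python
# Minimal PCG with an incomplete-Cholesky preconditioner B = Rt.T @ Rt.
# Sketch only: `Rt` is any upper-triangular (incomplete) factor; applying
# B^{-1} costs two triangular solves.
import numpy as np
from scipy.linalg import solve_triangular

def pcg(A, b, Rt, rtol=1e-6, maxit=500):
    x = np.zeros_like(b, dtype=float)
    r = b - A @ x

    def apply_Binv(v):                       # z = B^{-1} v = Rt^{-1} Rt^{-T} v
        y = solve_triangular(Rt, v, trans='T', lower=False)
        return solve_triangular(Rt, y, lower=False)

    z = apply_Binv(r)
    p = z.copy()
    rz = r @ z
    nb = np.linalg.norm(b)
    for it in range(maxit):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= rtol * nb:   # relative residual stopping test
            break
        z = apply_Binv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, it
```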

Outline

Theory...
- One-level variant: basic approach and underlying analysis
- Multilevel variants: extensions of the analysis to the multilevel setting

... and practice
- Sparse solver: motivation and design choices
- Numerical experiments and comparison with other solvers


One-level variant: basic ideas

Incomplete (block left-looking) Cholesky: for each block row $k = 1, \dots, l$:
1. factorize
2. solve
3. approximate
4. update (compute new rows)

For $k = 1$, with $A$ partitioned into blocks $A_{11}$, $A_{12}$, $A_{22}$:
1. factorize: $A_{11} = R_{11}^T R_{11}$;
2. solve: $R_{11}^T R_{12} = A_{12}$ for $R_{12}$;
3. approximate: replace $R_{12}$ by a low-rank approximation $\widetilde{R}_{12}$.

The dropping is orthogonal if $\widetilde{R}_{12}^T (R_{12} - \widetilde{R}_{12}) = O$. This implies monotonicity for the Schur complement (for any SPD $A$!):
$$v^T (A_{22} - R_{12}^T R_{12})\, v \;\le\; v^T (A_{22} - \widetilde{R}_{12}^T \widetilde{R}_{12})\, v .$$

Example of orthogonal dropping: low-rank approximation via truncated SVD (with absolute threshold $\mathrm{tol}_a$):
$$R_{12} = U_1 (\Sigma_1 V_1^T) + U_2 (\Sigma_2 V_2^T).$$
If $\|\Sigma_2\| < \mathrm{tol}_a$, then $\widetilde{R}_{12} = U_1 (\Sigma_1 V_1^T)$ and $\|R_{12} - \widetilde{R}_{12}\| < \mathrm{tol}_a$.

4. update: the next step works with the Schur complement $S_1 = A_{22} - \widetilde{R}_{12}^T \widetilde{R}_{12}$; the factorization computed so far defines $B_1$, and the process repeats for $k = 2, \dots, l$.
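The following numerical sketch runs one such step ($k = 1$) on a random SPD matrix and checks both the orthogonality of the truncated-SVD dropping and the resulting Schur-complement monotonicity. Sizes, threshold value, and variable names are illustrative, not from the talk.

```python
# One step (k = 1) of the incomplete block Cholesky with orthogonal
# dropping by truncated SVD -- a numerical sketch, not the talk's solver.
import numpy as np
from scipy.linalg import cholesky

rng = np.random.default_rng(0)
n1, n2 = 10, 40
M = rng.standard_normal((n1 + n2, n1 + n2))
A = M @ M.T + (n1 + n2) * np.eye(n1 + n2)      # a generic SPD test matrix
A11, A12, A22 = A[:n1, :n1], A[:n1, n1:], A[n1:, n1:]

R11 = cholesky(A11)                            # 1. factorize: A11 = R11^T R11
R12 = np.linalg.solve(R11.T, A12)              # 2. solve:     R11^T R12 = A12

U, s, Vt = np.linalg.svd(R12, full_matrices=False)
tol_a = 0.5 * s[0]                             # absolute threshold (demo value)
r = int(np.sum(s >= tol_a))
R12t = U[:, :r] * s[:r] @ Vt[:r]               # 3. approximate (rank r)

# Dropping is orthogonal: R12t^T (R12 - R12t) = O up to round-off ...
print(np.max(np.abs(R12t.T @ (R12 - R12t))))
# ... hence the approximate Schur complement dominates the exact one:
S_exact = A22 - R12.T @ R12                    # 4. update
S_approx = A22 - R12t.T @ R12t
print(np.min(np.linalg.eigvalsh(S_approx - S_exact)))  # >= 0 (monotonicity)
```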

One-level variant: successive approximations

Incomplete (block left-looking) Cholesky: for each block row $k = 1, \dots, l$: factorize, solve, approximate, update; only the newly computed rows are approximated.

[Diagram: the one-level (model) variant at steps $k = 1, 2, 3, 4$, sweeping through the block rows.]

Accuracy of individual dropping

Orthogonal dropping is assumed, namely $\widetilde{R}_{12}^T (R_{12} - \widetilde{R}_{12}) = O$. The accuracy of the individual dropping at step $k$ is measured by
$$\gamma_k = \big\| (R_{12} - \widetilde{R}_{12})\, S_B^{-1/2} \big\| < 1, \qquad \text{with } S_B = A_{22} - \widetilde{R}_{12}^T \widetilde{R}_{12}.$$
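Continuing the sketch above, $\gamma_k$ can be measured directly (here via a symmetric eigendecomposition of $S_B$); it comes out below 1, consistent with the exact Schur complement being SPD.

```python
# gamma_k = || (R12 - R12t) S_B^{-1/2} ||_2 with S_B = A22 - R12t^T R12t;
# reuses R12, R12t, A22 from the previous sketch.
w, V = np.linalg.eigh(A22 - R12t.T @ R12t)   # S_B = V diag(w) V^T, w > 0
S_B_inv_half = (V / np.sqrt(w)) @ V.T        # S_B^{-1/2}
gamma = np.linalg.norm((R12 - R12t) @ S_B_inv_half, 2)
print(gamma)                                 # < 1
```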

Model problem

$$-\Delta u + 10^{-4} u = f \ \text{ in } \Omega = (0,1)^2, \qquad \frac{\partial u}{\partial n} = 0 \ \text{ on } \partial\Omega,$$
with the domain split into $\Omega_I$ and $\Omega_{II}$ by an interface $\Gamma$.

Full system matrix:
$$A_\Omega = \begin{pmatrix} A_I & & A_{I,\Gamma} \\ & A_{II} & A_{II,\Gamma} \\ A_{I,\Gamma}^T & A_{II,\Gamma}^T & A_\Gamma \end{pmatrix}.$$

Interface system matrix:
$$A = A_\Gamma - A_{I,\Gamma}^T A_I^{-1} A_{I,\Gamma} - A_{II,\Gamma}^T A_{II}^{-1} A_{II,\Gamma}.$$

This corresponds to:
- the last Schur complement in sparse Cholesky with nested dissection (ND) ordering;
- the Schur complement system in iterative substructuring.

Accuracy of individual dropping

Orthogonal dropping is assumed, namely $\widetilde{R}_{12}^T (R_{12} - \widetilde{R}_{12}) = O$; the accuracy of the dropping at step $k$ is $\gamma_k = \|(R_{12} - \widetilde{R}_{12})\, S_B^{-1/2}\| < 1$, with $S_B = A_{22} - \widetilde{R}_{12}^T \widetilde{R}_{12}$.

Model problem ($k = 1$, block size = 10, $10^3 \times 10^3$ grid, truncated SVD dropping):

rank $r$ :      0         1      $3$                    $5$
$\gamma_1$ :    0.99998   0.44   $5.6 \cdot 10^{-3}$    $6.2 \cdot 10^{-6}$

One-level variant: conditioning analysis

Assume $A$ is SPD and that the dropping is orthogonal. Let $B_k$ be the preconditioner at the end of step $k$, with $A = B_0$ and $B_l = R^T R$ the final preconditioner; only the newly computed rows are modified at each step. Let
$$\lambda_{\max}^{(k)} = \lambda_{\max}(B_l^{-1} B_k), \qquad \lambda_{\min}^{(k)} = \lambda_{\min}(B_l^{-1} B_k).$$
Then
$$\lambda_{\max}^{(k)} \le \lambda_{\max}^{(k-1)} \le \lambda_{\max}^{(k)} + g(\lambda_{\max}^{(k)}, \gamma_k), \qquad \lambda_{\min}^{(k)} \ge \lambda_{\min}^{(k-1)} \ge \lambda_{\min}^{(k)} - g(\lambda_{\min}^{(k)}, \gamma_k),$$
where
$$g(\lambda, \gamma) = \max_{\beta > 0}\ \frac{2\gamma\beta\,\lambda^{-1} - \beta^2}{\beta^2 + \lambda^{-1}}.$$

One-level variant: conditioning analysis (sharpness)

One-level bound (for $\lambda_{\max}$):
$$\lambda_{\max}^{(k)} \le \lambda_{\max}^{(k-1)} \le \lambda_{\max}^{(k)} + g(\lambda_{\max}^{(k)}, \gamma_k),$$
where $\lambda_{\max}^{(k)} = \lambda_{\max}(B_l^{-1} B_k)$ and $\lambda_{\max}^{(k-1)} = \lambda_{\max}(B_l^{-1} B_{k-1})$.

The estimate is sharp: for any $\gamma_k < 1$, $k = 1, \dots, l$, and any matrix $A$, there exists a sequence of approximations $B_k$ (with $B_0 = A$) such that the bounds for $\lambda_{\max}^{(0)}$ and $\lambda_{\min}^{(0)}$ are simultaneously attained.

Model problem: numerical experiments

Model problem and interface $N \times N$ system matrix $A$ as above.

Algorithmic details:
- block size = 10;
- we consider $N$ from 20 to 650 ($n$ from 400 to $4.3 \cdot 10^5$, $l$ from 1 to 64);
- the maximal ranks remain below 4.
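For reference, the model problem and its interface matrix can be assembled in a few lines. This is a sketch under stated assumptions: the standard 5-point finite-difference Neumann Laplacian (1D stencil tridiag(-1, 2, -1) with unit corner entries) and the grid's middle column taken as the interface $\Gamma$; these discretization details are not spelled out in the talk.

```python
# Assemble the model problem and its interface Schur complement (sketch).
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

def interface_matrix(N):
    """N x N grid, Gamma = middle grid column; returns the N x N interface
    Schur complement A = A_G - sum over Omega_I, Omega_II of
    A_{dom,G}^T A_dom^{-1} A_{dom,G}."""
    T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(N, N), format='lil')
    T[0, 0] = T[N - 1, N - 1] = 1.0              # Neumann boundary rows
    I = sp.eye(N)
    A_full = (sp.kron(I, T) + sp.kron(T, I) + 1e-4 * sp.eye(N * N)).tocsc()

    col = np.arange(N * N) % N                   # grid column of each unknown
    g = np.flatnonzero(col == N // 2)            # interface Gamma
    S = A_full[np.ix_(g, g)].toarray()
    for dom in (col < N // 2, col > N // 2):     # Omega_I, Omega_II
        d = np.flatnonzero(dom)
        lu = splu(A_full[np.ix_(d, d)])          # exact solve on the subdomain
        Adg = A_full[np.ix_(d, g)].toarray()
        S -= Adg.T @ lu.solve(Adg)
    return S

A = interface_matrix(41)                         # small instance of the model
print(A.shape, np.linalg.cond(A))
```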

One-level variant: numerical experiments ($\mathrm{tol}_r = 10^{-3}$)

[Plot: condition number $\kappa(R^{-T} A R^{-1})$ versus problem size $n$ ($10^3$ to $10^5$); curves: the one-level preconditioner and the one-level upper bound; both remain between $10^0$ and $10^2$.]

Outline (next): Multilevel variants, extensions of the analysis to the multilevel setting.

Multilevel variants: motivation

One-level (model) variant ($k = 1, 2, 3, 4$ sweeping the block rows): $O(N^3)$ cost, $O(N^2)$ memory for the $N \times N$ interface matrix (although both cost and memory are improved!).

Multilevel variants: motivation

General / Sequentially Semiseparable (SSS) variant [Gu, Li, Vassilevski '10] ($k = 1, 2, 3, 4$, with hierarchical block structure): $O(rN^2)$ cost, $O(rN)$ memory, where $r$ is the maximal approximation rank. Since $r \ll N$, this is an improvement by a factor $N/r$!

Multilevel variants: general estimate

Assuming that the dropping is orthogonal, one has
$$\lambda_{\max}(B_k^{-1} B_{k-1}) = 1 + \gamma_k, \qquad \lambda_{\min}(B_k^{-1} B_{k-1}) = 1 - \gamma_k,$$
and hence
$$\kappa(R^{-T} A R^{-1}) \;\le\; \prod_{k=1}^{l} \frac{1 + \gamma_k}{1 - \gamma_k}.$$
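The resulting bound is straightforward to evaluate. As an illustration (not a figure from the talk), assume a uniform dropping accuracy $\gamma_k = 10^{-3}$ over $l = 64$ steps, matching the largest $l$ used in the experiments below:

```python
# The general multilevel estimate as a one-liner.
import numpy as np

def kappa_bound(gammas):
    """kappa(R^{-T} A R^{-1}) <= prod_k (1 + gamma_k) / (1 - gamma_k)."""
    g = np.asarray(gammas, dtype=float)
    return float(np.prod((1 + g) / (1 - g)))

print(kappa_bound([1e-3] * 64))   # 64 drops of accuracy 1e-3 -> about 1.14
```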

Model problem

(Same model problem, interface $N \times N$ matrix $A$, and algorithmic details as above: block size = 10, $N$ from 20 to 650, maximal ranks below 4.)

Multilevel variants: numerical experiments

[Plot: condition number versus $n$ ($10^3$ to $10^5$), on a scale from $10^0$ to $10^8$; curves: $\kappa(A)$, the SSS preconditioner ($\mathrm{tol}_r = 10^{-3}$), and the general upper bound ($10^{-3}$).]

Multilevel variants: nested subspaces

Sequentially Semiseparable (SSS) variant [Gu, Li, Vassilevski '10], steps $k = 1, 2, 3, 4$.

Nested subspaces assumption: the row spaces of the successive low-rank approximations are nested,
$$\mathrm{span}(\widetilde{R}_{12}^{(1)}) \subseteq \mathrm{span}(\widetilde{R}_{12}^{(2)}) \subseteq \mathrm{span}(\widetilde{R}_{12}^{(3)}) \subseteq \cdots,$$
where $\widetilde{R}_{12}^{(k)}$ denotes the approximation performed at step $k$ (preconditioners $B_1, B_2, B_3, B_4$).

Under this assumption, the same bounds as in the one-level case hold:
$$\lambda_{\max}^{(k)} \le \lambda_{\max}^{(k-1)} \le \lambda_{\max}^{(k)} + g(\lambda_{\max}^{(k)}, \gamma_k), \qquad \lambda_{\min}^{(k)} \ge \lambda_{\min}^{(k-1)} \ge \lambda_{\min}^{(k)} - g(\lambda_{\min}^{(k)}, \gamma_k),$$
with $\lambda_{\max}^{(k)} = \lambda_{\max}(B_l^{-1} B_k)$ and $\lambda_{\min}^{(k)} = \lambda_{\min}(B_l^{-1} B_k)$.

Multilevel variants: numerical experiments ($\mathrm{tol}_r = 10^{-3}$)

[Plot: condition number $\kappa(R^{-T} A R^{-1})$ versus $n$ ($10^3$ to $10^5$), between $10^0$ and $10^2$; curves: the SSS preconditioner, the one-level preconditioner, and the one-level upper bound.]

Outline (next): ... and practice. Sparse solver, motivation and design choices.

Sparse solver: factorization

Main features:
- symmetric left-looking block factorization; updates are accumulated before each block row computation;
- nested dissection (ND) ordering: induces the block structure, enforces sparsity, and reduces the operation count;
- orthogonal low-rank approximations by truncated SVD or rank-revealing QR.

[Illustration: a domain split by ND into $\Omega_1, \Omega_2, \Omega_3, \Omega_4$ with separators $\Gamma_1, \Gamma_2, \Gamma_3$, and the induced block structure of the matrix with blocks ordered $\Omega_1, \Omega_2, \Gamma_1, \Omega_3, \Omega_4, \Gamma_2, \Gamma_3$.]

Sparse solver: factorization

Special features:
- rank-revealing reordering inside ND blocks: ensures the algebraic character of the solver;
- symbolic compression: ensures that memory usage decreases;
- adaptive block size: faster than a fixed block size.

Sparse solver: revealing low-rank

Model problem analysis [Chandrasekaran et al., '10]: the rank of an off-diagonal block is bounded proportionally to the number of connections between the corresponding subset of separator nodes and the remaining nodes of the separator.

Implemented: recursive edge bisection of the separator (via METIS).

[Illustrations: the resulting hierarchical subdivision for 2D model separators, for 2D separators produced by Scotch (connections of length 2), and for 3D model separators.]
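A sketch of the recursive bisection in its simplest setting: when the separator is a straight grid line (the 2D model case), edge bisection reduces to recursive halving of the node list. The actual solver bisects the separator's connectivity graph with METIS; the function and names below are illustrative.

```python
# Recursive bisection of a separator -- sketch for a path-like separator,
# where edge bisection amounts to recursive halving of the node list.
def bisect(nodes, min_block):
    """Return the list of blocks produced by recursive bisection."""
    if len(nodes) <= min_block:
        return [nodes]
    mid = len(nodes) // 2
    return bisect(nodes[:mid], min_block) + bisect(nodes[mid:], min_block)

separator = list(range(40))          # a 2D model separator of 40 nodes
blocks = bisect(separator, min_block=10)
print([len(b) for b in blocks])      # [10, 10, 10, 10]
```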

Sparse solver: symbolic compression

If, for a low-rank approximation $R_{12} \approx \widetilde{R}_{12}\, Q$ in factored form, one has
$$\mathrm{nnz}(R_{12}) \le \mathrm{nnz}(\widetilde{R}_{12}) + \mathrm{nnz}(Q),$$
then the low-rank approximation is replaced by $R_{12} = I \cdot R_{12}$. This low-rank approximation is (trivially) orthogonal, and there is no need to store $I$, so it does not increase memory use.
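The acceptance test is a pure nonzero count. A sketch, with the factored pair (W, Q) standing in for the solver's internal representation (these names are illustrative):

```python
# Symbolic compression -- accept the factored form (W, Q) of a low-rank
# approximation R12 ~ W @ Q only if it actually stores fewer nonzeros.
import numpy as np

def symbolic_compress(R12, W, Q):
    """Return (None, R12) for the trivial, exactly orthogonal approximation
    I * R12 (I is never stored), or (W, Q) when compression pays off."""
    if np.count_nonzero(R12) <= np.count_nonzero(W) + np.count_nonzero(Q):
        return None, R12     # keep R12 as is: memory does not increase
    return W, Q              # factored form stores fewer nonzeros
```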

Sparse solver: adaptive block size

Via symbolic compression (s.c.), the block subdivision can evolve. [Diagram: successive block subdivisions, with s.c. merging neighboring blocks.]

We use this evolution:
- by automatically applying s.c. after a few occurrences at the same level (the more successive s.c. occurrences there are, the (exponentially) more s.c. tests are skipped, to save computations);
- to adjust the minimal block size for the subsequent separators (this also saves computations).

Outline (next): numerical experiments and comparison with other solvers.

Numerical experiments: setting

Solver parameters:
- the preconditioner, denoted SIC, is used with approximation based on rank-revealing QR with column pivoting and an absolute approximation threshold;
- adaptive block size (the initial block size is set to 16);
- PCG as the outer iteration.

Test problems: all SPD matrices from the University of Florida Sparse Matrix Collection (http://www.cise.ufl.edu/research/sparse/matrices/) with $n > 10^5$, and a random right-hand side (excluded: thermomech_tk and bmw7st_1, both with $\kappa(A) \approx 10^{14}$ to $10^{15}$).

Experimental setting:
- the time reported is for the best threshold value (chosen among $\{10^{-10}, 10^{-9}, \dots, 10^{10}\}$);
- stopping criterion: $10^{-6}$ relative residual decrease;
- hardware: Intel Xeon L5420, 2.5 GHz, 16 GB RAM.
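The timing protocol can be summarized as a small driver. This is a sketch of the setup described above, not the talk's harness: `build_sic` is a hypothetical stand-in for the SIC preconditioner construction, and `pcg` may be the sketch from earlier.

```python
# Sketch of the reported protocol: for each matrix, run construction + PCG
# for every threshold in the grid and report the best total time.
# `build_sic` is hypothetical, not an actual API from the talk.
import time
import numpy as np

thresholds = [10.0 ** k for k in range(-10, 11)]   # 1e-10, ..., 1e10

def best_time(A, b, build_sic, pcg):
    best = np.inf
    for tol_a in thresholds:
        t0 = time.perf_counter()
        Rt = build_sic(A, tol_a)           # construct the preconditioner
        x, it = pcg(A, b, Rt, rtol=1e-6)   # 1e-6 relative residual decrease
        best = min(best, time.perf_counter() - t0)
    return best
```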

Comparison with (exact) Cholesky

[Plot: time (sec.) per million nonzeros, from 0 to 100, for chol and SIC, on problems ordered by nnz ($10^6$ to $10^8$).]

Comparison with (unpreconditioned) CG

[Plot: time (sec.) per million nonzeros, from 0 to 100, for CG and SIC, on problems ordered by nnz ($10^6$ to $10^8$).]

Comparison with ILUPACK

[Plot: time (sec.) per million nonzeros, from 0 to 50, for ILUPACK and SIC, on problems ordered by nnz ($10^6$ to $10^8$).]

Conclusions

In theory...
- We have presented a conditioning analysis for incomplete Cholesky factorization preconditioners based on orthogonal dropping.
- The only requirement on $A$ is that it be SPD.
- The analysis relates the condition number of the preconditioned system to the accuracies of the individual droppings (namely, to the $\gamma_k$'s).
- The analysis is sharp in the one-level case.
- The one-level bound extends to the multilevel setting if an additional nested subspaces assumption holds; this assumption naturally holds for the presented preconditioner.

Conclusions

... and in practice
- We have presented a preliminary implementation of an incomplete Cholesky factorization preconditioner based on orthogonal dropping.
- The solver targets sparse matrices; it uses the sparsity structure to reduce the block ranks during the factorization.
- Preliminary numerical experiments demonstrate that the solver is competitive.

Further details

Theory:
- A. Napov, Conditioning analysis of incomplete Cholesky factorizations with orthogonal dropping, SIAM J. Matrix Anal. Appl.
- A. Napov, Conditioning analysis of incomplete Cholesky factorizations with orthogonal dropping II: nested subspaces (in preparation).

Related approaches (for dense matrices):
- M. Gu, X. S. Li, P. Vassilevski, Direction-preserving and Schur-monotonic semiseparable approximations of symmetric positive definite matrices, SIAM J. Matrix Anal. Appl.
- J. Xia, M. Gu, Robust approximate Cholesky factorization of rank-structured symmetric positive definite matrices, SIAM J. Matrix Anal. Appl.

Rank revealing based on sparsity:
- A. Napov, X. S. Li, An algebraic multifrontal preconditioner that exploits the low-rank property, Numer. Linear Algebra Appl. (to appear).