Lecture 8: Fast Linear Solvers (Part 7)

Similar documents
Numerical Methods in Matrix Computations

Solving Ax = b, an overview. Program

Preliminary Results of GRAPES Helmholtz solver using GCR and PETSc tools

APPLIED NUMERICAL LINEAR ALGEBRA

Stabilization and Acceleration of Algebraic Multigrid Method

Lecture 18 Classical Iterative Methods

Lab 1: Iterative Methods for Solving Linear Systems

9.1 Preconditioned Krylov Subspace Methods

OUTLINE ffl CFD: elliptic pde's! Ax = b ffl Basic iterative methods ffl Krylov subspace methods ffl Preconditioning techniques: Iterative methods ILU

6.4 Krylov Subspaces and Conjugate Gradients

SOLVING SPARSE LINEAR SYSTEMS OF EQUATIONS. Chao Yang Computational Research Division Lawrence Berkeley National Laboratory Berkeley, CA, USA

Iterative Methods and Multigrid

An Efficient Low Memory Implicit DG Algorithm for Time Dependent Problems

4.8 Arnoldi Iteration, Krylov Subspaces and GMRES

Lecture 11: CMSC 878R/AMSC698R. Iterative Methods An introduction. Outline. Inverse, LU decomposition, Cholesky, SVD, etc.

FEM and sparse linear system solving

Parallel Numerics, WT 2016/ Iterative Methods for Sparse Linear Systems of Equations. page 1 of 1

Linear Solvers. Andrew Hazel

Numerical Methods I Non-Square and Sparse Linear Systems

CME 302: NUMERICAL LINEAR ALGEBRA FALL 2005/06 LECTURE 0

The Conjugate Gradient Method

Boundary Value Problems - Solving 3-D Finite-Difference problems Jacob White

Lecture 17: Iterative Methods and Sparse Linear Algebra

Contents. Preface... xi. Introduction...

AMS526: Numerical Analysis I (Numerical Linear Algebra for Computational and Data Sciences)

CME342 Parallel Methods in Numerical Analysis. Matrix Computation: Iterative Methods II. Sparse Matrix-vector Multiplication.

Course Notes: Week 1

ITERATIVE METHODS FOR SPARSE LINEAR SYSTEMS

Preface to the Second Edition. Preface to the First Edition

Iterative Methods for Sparse Linear Systems

Iterative methods for Linear System of Equations. Joint Advanced Student School (JASS-2009)

M.A. Botchev. September 5, 2014

Iterative methods for Linear System

Simple iteration procedure

AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 23: GMRES and Other Krylov Subspace Methods; Preconditioning

Iterative Methods for Incompressible Flow

Numerical Linear Algebra

Scientific Computing with Case Studies SIAM Press, Lecture Notes for Unit VII Sparse Matrix

A Linear Multigrid Preconditioner for the solution of the Navier-Stokes Equations using a Discontinuous Galerkin Discretization. Laslo Tibor Diosady

Iterative Methods for Solving A x = b

A Numerical Study of Some Parallel Algebraic Preconditioners

Scientific Computing WS 2018/2019. Lecture 9. Jürgen Fuhrmann Lecture 9 Slide 1

Algebraic Multigrid as Solvers and as Preconditioner

Parallel Discontinuous Galerkin Method

Linear Algebra. Brigitte Bidégaray-Fesquet. MSIAM, September Univ. Grenoble Alpes, Laboratoire Jean Kuntzmann, Grenoble.

Iterative Methods and Multigrid

Algebra C Numerical Linear Algebra Sample Exam Problems

Discretization of PDEs and Tools for the Parallel Solution of the Resulting Systems

Chapter 7. Iterative methods for large sparse linear systems. 7.1 Sparse matrix algebra. Large sparse matrices

A short course on: Preconditioned Krylov subspace methods. Yousef Saad University of Minnesota Dept. of Computer Science and Engineering

DELFT UNIVERSITY OF TECHNOLOGY

Implicit Solution of Viscous Aerodynamic Flows using the Discontinuous Galerkin Method

Roundoff Error. Monday, August 29, 11

Lecture 11. Linear systems: Cholesky method. Eigensystems: Terminology. Jacobi transformations QR transformation

Solving Sparse Linear Systems: Iterative methods

Solving Sparse Linear Systems: Iterative methods

Eigenvalue Problems CHAPTER 1 : PRELIMINARIES

Inexactness and flexibility in linear Krylov solvers

Parallel sparse linear solvers and applications in CFD

A DISSERTATION. Extensions of the Conjugate Residual Method. by Tomohiro Sogabe. Presented to

Preconditioning Techniques Analysis for CG Method

Numerical linear algebra

Algorithms that use the Arnoldi Basis

Gram-Schmidt Orthogonalization: 100 Years and More

Implementation of the Deflated Pre-conditioned Conjugate Gradient Method applied to the Poisson equation related to bubbly flow on the GPU

Computational Linear Algebra

9. Iterative Methods for Large Linear Systems

A short course on: Preconditioned Krylov subspace methods. Yousef Saad University of Minnesota Dept. of Computer Science and Engineering

Lecture 13 Stability of LU Factorization; Cholesky Factorization. Songting Luo. Department of Mathematics Iowa State University

Linear Algebra. PHY 604: Computational Methods in Physics and Astrophysics II

Sparse Matrices and Iterative Methods

Numerical Analysis: Solutions of System of. Linear Equation. Natasha S. Sharma, PhD

Solution of Linear Equations

Topics. The CG Algorithm Algorithmic Options CG s Two Main Convergence Theorems

Arnoldi Methods in SLEPc

Sparse Linear Systems. Iterative Methods for Sparse Linear Systems. Motivation for Studying Sparse Linear Systems. Partial Differential Equations

Conjugate gradient method. Descent method. Conjugate search direction. Conjugate Gradient Algorithm (294)

AMS526: Numerical Analysis I (Numerical Linear Algebra)

Lecture 9 Approximations of Laplace s Equation, Finite Element Method. Mathématiques appliquées (MATH0504-1) B. Dewals, C.

The amount of work to construct each new guess from the previous one should be a small multiple of the number of nonzeros in A.

Chapter 7 Iterative Techniques in Matrix Algebra

From Stationary Methods to Krylov Subspaces

Numerical Mathematics

1. Fast Iterative Solvers of SLE

Chapter 9 Implicit integration, incompressible flows

Robust solution of Poisson-like problems with aggregation-based AMG

MIQR: A MULTILEVEL INCOMPLETE QR PRECONDITIONER FOR LARGE SPARSE LEAST-SQUARES PROBLEMS

Charles University Faculty of Mathematics and Physics DOCTORAL THESIS. Krylov subspace approximations in linear algebraic problems

Background. Background. C. T. Kelley NC State University tim C. T. Kelley Background NCSU, Spring / 58

LARGE SPARSE EIGENVALUE PROBLEMS. General Tools for Solving Large Eigen-Problems

Numerical Methods - Numerical Linear Algebra

FEM and Sparse Linear System Solving

An advanced ILU preconditioner for the incompressible Navier-Stokes equations

Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012

Atomistic effect of delta doping layer in a 50 nm InP HEMT

Review of matrices. Let m, n IN. A rectangle of numbers written like A =

Applied Mathematics 205. Unit V: Eigenvalue Problems. Lecturer: Dr. David Knezevic

LARGE SPARSE EIGENVALUE PROBLEMS

A hybrid reordered Arnoldi method to accelerate PageRank computations

ENGG5781 Matrix Analysis and Computations Lecture 8: QR Decomposition

Transcription:

Lecture 8: Fast Linear Solvers (Part 7) 1

Modified Gram-Schmidt Process with Reorthogonalization Test Reorthogonalization If Av k 2 + δ v k+1 2 = Av k 2 to working precision. δ = 10 3 2

Householder Arnoldi In Arnoldi algorithm, the column vectors of a matrix to be orthonormalized are not available ahead of time. In stead, the next vector is Av j, where v j is current basis vector. In the Householder algorithm, an orthogonal column v i is obtained as H 1 H i e i. H.F. Walker: Implementation of the GMRES method using Householder transformation. SIAM J. on Sci. Comput. 9:152-163, 1988 3

Givens Rotations minimize y R k βe 1 H m y k 2 involves QR factorization. Do QR factorizations of H k by Givens Rotations. A 2 2 Givens rotation is a matrix of the form G = c s where c = cos (θ), s = sin (θ) for s c θ [ π, π]. The orthogonal matrix G rotates the vector (c, s) T, which makes an angle of θ with the x-axis, through an angle θ so that it overlaps the x-axis. GA c s = 1 0 4

An N N Givens rotation G j (c, s) replaces a 2 2 block on the diagonal of the N N identity matrix with a 2 2 Givens rotations. G j (c, s) is with a 2 2 Givens rotations in rows and columns j and j + 1. Givens rotations can be used in reducing Hessenberg matrices to triangular form. This can be done in O N floating-point operations. Let H be an N M(N M) upper Hessenberg matrix with rank M. We reduce H to triangular form by first multiplying the matrix by a Givens rotations that zeros h 21 (values of h 11 and subsequent columns are changed) 5

Step 1: Define G 1 (c 1, s 1 ) by c 1 = h 11 / h 2 2 11 + h 21 and s 1 = h 21 / h 2 11 + h 2 21. Replace H by G 1 H. Step 2: Define G 2 (c 2, s 2 ) by c 2 = h 22 / h 2 2 22 + h 32 and s 2 = h 32 / h 2 22 + h 2 32. Replace H by G 2 H. Step j: Define G j (c j, s j ) by c j = h jj / h 2 2 jj + h j+1,j and s j = h j+1,j / h 2 2 jj + h j+1,j. Replace H by G j H. Setting Q = G N G 1. R = QH is upper triangular. 6

Let H m = QR by Givens rotations matrices. minimize y R k βe 1 H m y k 2 = minimize y R k Q(βe 1 H m y k ) 2 = minimize y R k βqe 1 Ry k 2 7

8

Preconditioning Basic idea: using GMRES on a modified system such as M 1 Ax = M 1 b. The matrix M 1 A need not to be formed explicitly. However, Mw = v need to be solved whenever needed. Left preconditioning M 1 Ax = M 1 b Right preconditioning AM 1 u = b with x = M 1 u Split preconditioning: M is factored as M = M L M R M L 1 AM R 1 u = M L 1 b with x = M R 1 u 9

GMRES with Left Preconditioning The Arnoldi process constructs an orthogonal basis for Span{r 0, M 1 Ar 0, (M 1 A) 2 r 0, (M 1 A) k 1 r 0 }. Sadd. Iterative Methods for Sparse Linear Systems 10

GMRES with Right Preconditioning Right preconditioned GMRES is based on solving AM 1 u = b with x = M 1 u. The initial residual is: b AM 1 u 0 = b Ax 0. This means all subsequent vectors of the Krylov subspace can be obtained without any references to the u. At the end of right preconditioned GMRES: u m = u 0 + m i=1 v i η i x m = x 0 + M 1 with u 0 = Mx 0 m i=1 v i η i 11

GMRES with Right Preconditioning The Arnoldi process constructs an orthogonal basis for Span{r 0, AM 1 r 0, (AM 1 ) 2 r 0, (AM 1 ) k 1 r 0 }. Sadd. Iterative Methods for Sparse Linear Systems. 12

Split Preconditioning M can be a factorization of the form M = LU. Then L 1 AU 1 u = L 1 b, with x = U 1 u. Need to operate on the initial residual by L 1 (b Ax 0 ) Need to operate on the linear combination U 1 (V m y m ) in forming the approximate solution 13

Comparison of Left and Right Preconditioning Spectra of M 1 A, AM 1 and L 1 AU 1 are identical. In principle, one should expect convergence to be similar. When M is ill-conditioned, the difference could be substantial. 14

Jacobi Preconditioner Iterative method for solving Ax = b takes the form: x k+1 = M 1 Nx k + M 1 b where M and N split A into A = M N. Define G = M 1 N = M 1 M A = I M 1 A and f = M 1 b. Iterative method is to solve I G x = f, which can be written as M 1 Ax = M 1 b. Jacobi iterative method: x k+1 = G JA x k + f where G JA = (I D 1 A) and f = D 1 b M = D for Jacobi method. 15

SOR/SSOR Preconditioner Define: A = D E F Gauss-Seidel: G GS = I (D E) 1 A M SOR = 1 (D we) w A symmetric SOR (SSOR) consists of: D we x 1 = wf + 1 w D x k+ k + wb 2 D wf x k+1 = we + 1 w D x 1 + wb k+ 2 This gives x k+1 = G SSOR x k + f Where G SSOR = D wf 1 (we + 1 w D) D we 1 (wf + 1 w D) M SSOR = D we D 1 (D wf); M SGS = D E D 1 (D F); Note: SSOR usually is used when A is symmetric 16

Take symmetric GS for example: M SGS = D E D 1 D F Define: L = D E D 1 = I ED 1 and U = D F. L is a lower triangular matrix and U is a upper triangular matrix. To solve M SGS w = x for w, a forward solve and a backward solve are used: Solve I ED 1 z = x for z Solve D F w = z for w 17

Incomplete LU(0) Factorization Define: NZ X = {(i, j) X i,j 0} Incomplete LU (ILU(0)): A = LU + R with NZ L NZ U = NZ(A) r ij = 0 for (i, j) NZ(A) I.e. L and U have no fill-ins at the entries a ij = 0. for i = 1 to n for k = 1 to i 1 and if (i, k) NZ(A) a ik = a ik /a kj for j = k + 1 to n and if (i, k) NZ(A) a ij = a ij a ik a kj end; end; end; 18

ILU(0) Sadd. Iterative Methods for Sparse Linear Systems. 19

Parallel GMRES J. Erhel. A parallel GMRES version for general sparse matrices. Electronic Transactions on Numerical Analyis. 3:160-176, 1995. Implementation in PETSc (Portable, Extensible Toolkit for Scientific Computation) http://www.mcs.anl.gov/petsc/ 20

Parallel Libraries ScaLAPACK http://www.netlib.org/scalapack/ Based on LAPACK (Linear Algebra PACKage) and BLAS (Basic Linear Algebra Subroutines) Parallelized by divide and conquer or block distribution Written in Fortran 90 Successor of LINPACK, which was originally written for vector supercomputers in the 1970s Implemented on top of MPI using MIMD, SPMD, and used explicit message passing 21

PETSc (Portable, Extensible Toolkit for Scientific Computation) http://www.mcs.anl.gov/petsc/ Suite of data structures (core: distributed vectors and matrices) and routines for linea and non-linear solvers User (almost) never has to call MPI himself when using PETSc Uses two MPI communicators: PETSC_COMM_SELF for the library-internal communication and PETSC_COMM_WORLD for user processes Written in C, callable from Fortran Has been used to solve systems with over 500 millions unknowns Has been shown to scale up to over 6000 processors 22

PETSc Structure 23

PETSc Numerical Solvers 24

Parallel Random Number Generator SPRNG (The Scalable parallel random number generators library) http://sprng.cs.fsu.edu/ Random number sequence does not depend on the number of processors used, but only on the seed a reproducible Monte Carlo simulations in parallel SPRNG implements parallel-safe, high-quality random number generators C++/Fortran (used to be C/Fortran in previous versions) 25

Parallel PDE Solver POOMA (Parallel Object-Oriented Methods and Applications) http://acts.nersc.gov/formertools/pooma/index.html Collection of templated C++ classes for writing parallel PDE solvers Provides high-level data types (abstractions) for fields and particles using data-parallel arrays Supports finite-difference simulations on structured, unstructured, and adaptive grids. Also supports particle simulations, hybrid particle-mesh simulations, and Monte Carlo Uses mixed message-passing/thread parallelism 26

Many more Aztec (iterative solvers for sparse linear systems) SuperLU (LU decomposition) Umfpack (unsymmetric multifrontal LU) EISPACK (eigen-solvers) Fishpack (cyclic reduction for 2nd & 4th order FD) PARTI (Parallel run-time system) Bisect (recursive orthogonal bisection) ROMIO (parallel distributed file I/O) KINSol (solves the nonlinear algebraic systems) https://computation.llnl.gov/casc/sundials/main.html SciPy (Scientific Tools for Phython) http://www.scipy.org/ 27

References: C.T. Kelley. Iterative Methods for Linear and Nonlinear Equations. Yousef Sadd. Iterative methods for Sparse Linear Systems G. Karypis and V. Kumar. Parallel Threshold-based ILU Factorization. Technical Report #96-061. U. of Minnesota, Dept. of Computer Science, 1998. P.-O. Persson and J. Peraire. Newton-GMRES Preconditioning for Discontinuous Galerkin Discretizations of the Navier-Stokes Equations. SIAM J. on Sci. Comput. 30(6), 2008. 28