Lecture # 20 The Preconditioned Conjugate Gradient Method

We wish to solve
$$Ax = b \qquad (1)$$
where $A \in \mathbb{R}^{n \times n}$ is symmetric and positive definite (SPD). We think of $n$ as being VERY LARGE, say, $n = 10^6$ or $n = 10^7$. Usually the matrix is also sparse (mostly zeros), and Cholesky factorization is not feasible.

When $A$ is SPD, solving (1) is equivalent to finding $x$ such that
$$x = \arg\min_{x \in \mathbb{R}^n} \tfrac{1}{2} x^T A x - x^T b.$$
The conjugate gradient algorithm is one way to solve this problem.

Algorithm 1 (The Conjugate Gradient Algorithm)

    x_0 = initial guess (usually 0);
    p_1 = r_0 = b - A x_0;
    w = A p_1;
    α_1 = r_0^T r_0 / (p_1^T w);
    x_1 = x_0 + α_1 p_1;
    r_1 = r_0 - α_1 w;
    k = 1;
    while ||r_k||_2 > ε (some tolerance)
        β_k = r_k^T r_k / (r_{k-1}^T r_{k-1});
        p_{k+1} = r_k + β_k p_k;
        w = A p_{k+1};
        α_{k+1} = r_k^T r_k / (p_{k+1}^T w);
        x_{k+1} = x_k + α_{k+1} p_{k+1};
        r_{k+1} = r_k - α_{k+1} w;
        k = k + 1;
    end
    x = x_k;
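The following is a minimal NumPy sketch of Algorithm 1. The function name, tolerance, and iteration cap are illustrative choices, and the loop body is written in the usual "α first, then β" order, which is algebraically equivalent to the ordering above.

```python
import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-8, max_iter=1000):
    """Algorithm 1: plain conjugate gradient for SPD A (dense sketch)."""
    x = np.zeros_like(b, dtype=float) if x0 is None else x0.astype(float).copy()
    r = b - A @ x                      # r_0
    p = r.copy()                       # p_1
    rs_old = r @ r                     # r_k^T r_k
    for _ in range(max_iter):
        if np.sqrt(rs_old) <= tol:     # stop when ||r_k||_2 <= tol
            break
        w = A @ p
        alpha = rs_old / (p @ w)       # alpha_{k+1} = r_k^T r_k / (p_{k+1}^T A p_{k+1})
        x = x + alpha * p              # x_{k+1} = x_k + alpha_{k+1} p_{k+1}
        r = r - alpha * w              # r_{k+1} = r_k - alpha_{k+1} w
        rs_new = r @ r
        beta = rs_new / rs_old         # beta_k = r_k^T r_k / (r_{k-1}^T r_{k-1})
        p = r + beta * p               # p_{k+1} = r_k + beta_k p_k
        rs_old = rs_new
    return x
```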

The conjugate gradient algorithm is essentially optimal in the following sense. We note that
$$x_k = x_0 + \sum_{j=1}^{k} \alpha_j p_j = x_0 + P_k a, \qquad P_k = (p_1, \ldots, p_k), \quad a = (\alpha_1, \ldots, \alpha_k)^T.$$
One can show that
$$x_k = \arg\min_{x = x_0 + P_k c,\ c \in \mathbb{R}^k} \tfrac{1}{2} x^T A x - x^T b.$$
If we let the $A$-norm be given by
$$\|w\|_A = \sqrt{w^T A w},$$
then one can show that
$$\|x_k - x\|_A \le 2 \left( \frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1} \right)^k \|x_0 - x\|_A, \qquad \kappa = \kappa_2(A) = \|A^{-1}\|_2 \|A\|_2.$$
Thus if $A$ is well conditioned, the conjugate gradient method converges quickly. This bound is actually very pessimistic! However, if $A$ is badly conditioned, conjugate gradient can converge slowly, so we look to transform (1) into a well conditioned problem.

We write $A$ in the form of a splitting
$$A = M - N,$$
where $M$ is SPD. Since $M$ is SPD, it has a Cholesky factorization $M = R^T R$. If we let
$$\hat{A} = R^{-T} A R^{-1}, \qquad \hat{b} = R^{-T} b,$$

then solving (1) is equivalent to solving
$$\hat{A} y = \hat{b}, \qquad y = Rx.$$
We hope that $\kappa_2(\hat{A}) \ll \kappa_2(A)$, and we solve (1) by performing conjugate gradient on $\hat{A}$. In that formulation, we would have
$$\hat{y}_k = R x_k$$
and
$$\hat{r}_k = \hat{b} - \hat{A}\hat{y}_k = R^{-T}(b - A x_k) = R^{-T} r_k.$$
A conjugate gradient step would be
$$\hat{p}_{k+1} = \hat{r}_k + \beta_k \hat{p}_k,$$
so
$$\hat{p}_{k+1} = R^{-T} r_k + \beta_k \hat{p}_k.$$
Since
$$y_{k+1} = y_k + \alpha_{k+1} \hat{p}_{k+1},$$
we prefer to update $x_k = R^{-1} y_k$. Thus
$$x_{k+1} = R^{-1} y_{k+1} = R^{-1} y_k + \alpha_{k+1} R^{-1} \hat{p}_{k+1},$$
that is,
$$x_{k+1} = x_k + \alpha_{k+1} p_{k+1}, \qquad p_{k+1} = R^{-1} R^{-T} r_k + \beta_k p_k.$$
We have that if we let
$$M^{-1} = R^{-1} R^{-T}, \qquad z_k = M^{-1} r_k,$$
then
$$p_{k+1} = z_k + \beta_k p_k$$
is the update to the descent direction. The formulas for $\beta_k$ and $\alpha_{k+1}$ update to
$$\beta_k = (r_k^T z_k)/(r_{k-1}^T z_{k-1}), \qquad \alpha_{k+1} = (r_k^T z_k)/(p_{k+1}^T A p_{k+1}).$$
These modifications lead to the preconditioned conjugate gradient method.
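As a quick numerical sanity check of this change of variables (a sketch under assumed test data; the matrix, its scaling, and the choice of $M$ below are illustrations, not from the lecture), one can verify that solving $\hat{A}y = \hat{b}$ and setting $x = R^{-1}y$ reproduces the solution of $Ax = b$, that $M^{-1} = R^{-1}R^{-T}$, and that a good $M$ gives $\kappa_2(\hat{A}) \ll \kappa_2(A)$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
s = np.logspace(0, 4, n)                        # widely varying scales (illustration only)
C = rng.standard_normal((n, n))
A = (C @ C.T + n * np.eye(n)) * np.outer(s, s)  # SPD but badly conditioned test matrix
b = rng.standard_normal(n)

M = np.diag(np.diag(A))                         # some SPD splitting matrix M
R = np.linalg.cholesky(M).T                     # M = R^T R, R upper triangular

Y = np.linalg.solve(R.T, A)                     # R^{-T} A
Ahat = np.linalg.solve(R.T, Y.T)                # Ahat = R^{-T} A R^{-1} (A is symmetric)
bhat = np.linalg.solve(R.T, b)                  # bhat = R^{-T} b

y = np.linalg.solve(Ahat, bhat)                 # solve the transformed system
x = np.linalg.solve(R, y)                       # recover x = R^{-1} y

print(np.allclose(x, np.linalg.solve(A, b)))    # same solution as Ax = b
print(np.linalg.cond(A), np.linalg.cond(Ahat))  # kappa_2(Ahat) << kappa_2(A) here
Minv = np.linalg.solve(R, np.linalg.solve(R.T, np.eye(n)))
print(np.allclose(Minv, np.linalg.inv(M)))      # M^{-1} = R^{-1} R^{-T}
```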

Algorithm 2 (The Preconditioned Conjugate Gradient Algorithm)

    x_0 = initial guess (usually 0);
    r_0 = b - A x_0;
    Solve M z_0 = r_0;
    p_1 = z_0;
    w = A p_1;
    α_1 = r_0^T z_0 / (p_1^T w);
    x_1 = x_0 + α_1 p_1;
    r_1 = r_0 - α_1 w;
    k = 1;
    while ||r_k||_2 > ε (some tolerance)
        Solve M z_k = r_k;
        β_k = r_k^T z_k / (r_{k-1}^T z_{k-1});
        p_{k+1} = z_k + β_k p_k;
        w = A p_{k+1};
        α_{k+1} = r_k^T z_k / (p_{k+1}^T w);
        x_{k+1} = x_k + α_{k+1} p_{k+1};
        r_{k+1} = r_k - α_{k+1} w;
        k = k + 1;
    end
    x = x_k;

So far, I have ignored the question of how to choose $M$. It is a very important question. Although this version of the conjugate gradient method dates from Concus, Golub, and O'Leary (1976), people still write papers on how to choose preconditioners (including your instructor!). Here are some straightforward choices.

The Jacobi preconditioner. Choose $M = D = \mathrm{diag}(a_{11}, a_{22}, \ldots, a_{nn})$. The resulting $\hat{A}$ is
$$\hat{A} = D^{-1/2} A D^{-1/2},$$
which is the same as diagonally scaling $A$ to produce an $\hat{A}$ with ones on the diagonal.
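Below is a NumPy sketch of Algorithm 2. The solve with $M$ is passed in as a callable so that any of the preconditioners discussed here can be plugged in; the function name, defaults, and the "α then β" loop order are illustrative choices (the order is algebraically equivalent to the listing above).

```python
import numpy as np

def preconditioned_cg(A, b, solve_M, x0=None, tol=1e-8, max_iter=1000):
    """Algorithm 2: PCG for SPD A; solve_M(r) returns z solving M z = r."""
    x = np.zeros_like(b, dtype=float) if x0 is None else x0.astype(float).copy()
    r = b - A @ x
    z = solve_M(r)                     # solve M z_0 = r_0
    p = z.copy()                       # p_1 = z_0
    rz_old = r @ z                     # r_k^T z_k
    for _ in range(max_iter):
        if np.linalg.norm(r) <= tol:   # stop when ||r_k||_2 <= tol
            break
        w = A @ p
        alpha = rz_old / (p @ w)       # alpha_{k+1} = r_k^T z_k / (p_{k+1}^T A p_{k+1})
        x = x + alpha * p
        r = r - alpha * w
        z = solve_M(r)                 # solve M z_k = r_k
        rz_new = r @ z
        beta = rz_new / rz_old         # beta_k = r_k^T z_k / (r_{k-1}^T z_{k-1})
        p = z + beta * p
        rz_old = rz_new
    return x

# Jacobi preconditioner, M = diag(a_11, ..., a_nn):
#   d = np.diag(A); x = preconditioned_cg(A, b, lambda r: r / d)
```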

Block versions of the Jacobi preconditioner are also possible. That is, if we write $A$ in the form
$$A = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1s} \\ A_{21} & A_{22} & \cdots & A_{2s} \\ \vdots & & \ddots & \vdots \\ A_{s1} & A_{s2} & \cdots & A_{ss} \end{pmatrix},$$
where each $A_{ij}$ is a $q \times q$ matrix, a block Jacobi preconditioner is the matrix
$$M = D = \begin{pmatrix} A_{11} & 0 & \cdots & 0 \\ 0 & A_{22} & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & A_{ss} \end{pmatrix}.$$
This matrix $M$ is symmetric and positive definite. The solution
$$z = \begin{pmatrix} z_1 \\ z_2 \\ \vdots \\ z_s \end{pmatrix}, \qquad r = \begin{pmatrix} r_1 \\ r_2 \\ \vdots \\ r_s \end{pmatrix}$$
of $Mz = r$ solves
$$A_{kk} z_k = r_k, \qquad k = 1, \ldots, s.$$

The SSOR preconditioner. For $\omega \in (0, 2)$ choose
$$M = [\omega(2 - \omega)]^{-1} (D + \omega L) D^{-1} (D + \omega L^T),$$
where $D$ is the diagonal of $A$ and $L$ is its strictly lower triangular part (so $A = L + D + L^T$). The value of $\omega$ matters little, so we usually choose $\omega = 1$, which yields
$$M = (D + L) D^{-1} (D + L^T).$$
In this case,
$$N = L D^{-1} L^T.$$
The work here is just a back and forward solve.
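As a sketch of how cheap the SSOR solve is, applying $M = (D + L)D^{-1}(D + L^T)$ with $\omega = 1$ inside PCG takes one forward solve, one diagonal scaling, and one back solve. Dense triangular solves are used here purely for illustration (in practice the factors would be stored sparsely), and the helper name and the use of the `preconditioned_cg` sketch above are assumptions, not part of the lecture.

```python
import numpy as np
from scipy.linalg import solve_triangular

def make_ssor_solve(A):
    """Return solve_M(r) for M = (D + L) D^{-1} (D + L^T) with omega = 1."""
    d = np.diag(A)                     # diagonal of A
    DL = np.tril(A)                    # D + L: lower triangle of A, including diagonal
    DLT = np.triu(A)                   # D + L^T: upper triangle of A, including diagonal

    def solve_M(r):
        u = solve_triangular(DL, r, lower=True)       # forward solve: (D + L) u = r
        v = d * u                                     # undo the D^{-1} factor: v = D u
        return solve_triangular(DLT, v, lower=False)  # back solve: (D + L^T) z = v
    return solve_M

# Usage (assumed): x = preconditioned_cg(A, b, make_ssor_solve(A))
```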

Incomplete Cholesky preconditioner. Do Cholesky, but ignore fill elements. If $A$ is large and sparse, then in the Cholesky factorization
$$A = R^T R \qquad (2)$$
the matrix $R$ will often have many more nonzeros than $A$. This is one of the reasons that conjugate gradient is cheaper than Cholesky in some instances. First, let us write a componentwise version of the Cholesky algorithm to compute (2).

    for k = 1:n-1
        r_kk = sqrt(a_kk);
        for j = k+1:n
            r_kj = a_kj / r_kk;
        end;
        for i = k+1:n
            for j = i:n        % Upper triangle only!
                a_ij = a_ij - r_ki r_kj;
            end;
        end;
    end;
    r_nn = sqrt(a_nn);

Incomplete Cholesky ignores the fill elements. We get $M = \hat{R}^T \hat{R}$, with $\hat{R} = (\hat{r}_{kj}) \approx R$, from the following algorithm. Here $\hat{r}_{kj} = 0$ if $a_{kj} = 0$.

Algorithm 3 (Incomplete Cholesky Factorization)

    for k = 1:n-1
        r̂_kk = sqrt(a_kk);
        for j = k+1:n
            r̂_kj = a_kj / r̂_kk;
        end;
        for i = k+1:n
            for j = i:n        % Upper triangle only!
                if a_ij ≠ 0
                    a_ij = a_ij - r̂_ki r̂_kj;
                end;
            end;
        end;
    end;
    r̂_nn = sqrt(a_nn);

One downside to this algorithm is that even if $A$ is SPD, it is possible that $a_{kk}$ could be negative or zero when it is time for $\hat{r}_{kk}$ to be evaluated at the beginning of the main loop. Thus, unlike the Jacobi and SSOR preconditioners, the incomplete Cholesky preconditioner is not defined for all SPD matrices! However, if, in addition, $A$ is irreducibly diagonally dominant, that is,
$$a_{jj} \ge \sum_{i \ne j} |a_{ij}|, \qquad j = 1, \ldots, n,$$
with at least one inequality strict, and there is no decoupling, that is, no permutation such that
$$P A P^T = \begin{pmatrix} A_1 & 0 \\ 0 & A_2 \end{pmatrix},$$
then incomplete Cholesky is guaranteed to exist.

Incomplete Cholesky is a popular preconditioner in many environments. Much has been written about it, and much continues to be written about it. Its many variants consist of algorithms for specifying which fill elements to accept and which to reject. They can become quite sophisticated.
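A dense NumPy sketch of Algorithm 3 follows (a minimal zero-fill version; a practical implementation would use a sparse storage format). The function name is illustrative, and the breakdown check reflects the caveat above that a pivot can fail to be positive.

```python
import numpy as np

def incomplete_cholesky(A):
    """Algorithm 3: incomplete Cholesky, ignoring fill. Returns upper triangular
    Rhat with M = Rhat^T Rhat; raises if a pivot is not positive (breakdown)."""
    A = A.astype(float).copy()         # overwritten, as in the componentwise algorithm
    n = A.shape[0]
    Rhat = np.zeros((n, n))
    for k in range(n - 1):
        if A[k, k] <= 0:
            raise ValueError("incomplete Cholesky breakdown: nonpositive pivot")
        Rhat[k, k] = np.sqrt(A[k, k])
        for j in range(k + 1, n):
            Rhat[k, j] = A[k, j] / Rhat[k, k]     # zero whenever a_kj is zero
        for i in range(k + 1, n):
            for j in range(i, n):                 # upper triangle only
                if A[i, j] != 0:                  # skip fill elements
                    A[i, j] -= Rhat[k, i] * Rhat[k, j]
    if A[n - 1, n - 1] <= 0:
        raise ValueError("incomplete Cholesky breakdown: nonpositive pivot")
    Rhat[n - 1, n - 1] = np.sqrt(A[n - 1, n - 1])
    return Rhat

# Usage (assumed): with Rhat = incomplete_cholesky(A), solve M z = r by a forward
# solve with Rhat^T followed by a back solve with Rhat (e.g. scipy's solve_triangular).
```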