Tsung-Ming Huang. Matrix Computation, 2016, NTNU


Plan: Gradient method, Conjugate gradient method, Preconditioner

Gradient method

Definition. $A$ is symmetric positive definite (s.p.d.) if $A^\top = A$ and $x^\top A x > 0$ for all $x \neq 0$.
Inner product: $\langle x, y \rangle = x^\top y$ for any $x, y \in \mathbb{R}^n$.
Define $g(x) = \langle x, Ax \rangle - 2\langle x, b \rangle = x^\top A x - 2 x^\top b$.
Theorem. Let $A$ be s.p.d. Then $x^*$ is the solution of $Ax = b$ if and only if $g(x^*) = \min_{x \in \mathbb{R}^n} g(x)$.
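As a quick numerical illustration of this theorem, here is a minimal sketch (not from the slides; the $2 \times 2$ s.p.d. matrix and the use of NumPy are my own choices):

```python
import numpy as np

# An arbitrary small s.p.d. example (illustration only)
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

def g(x):
    # g(x) = <x, Ax> - 2<x, b>
    return x @ A @ x - 2.0 * (x @ b)

x_star = np.linalg.solve(A, b)          # exact solution of Ax = b
rng = np.random.default_rng(0)
perturbed = [x_star + rng.standard_normal(2) for _ in range(1000)]
print(all(g(x_star) <= g(x) for x in perturbed))   # True: x* minimizes g
```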

Proof ($\Rightarrow$). Assume $x^*$ is the solution of $Ax = b$, i.e. $Ax^* = b$. For any $x$,
$g(x) = \langle x, Ax \rangle - 2\langle x, b \rangle$
$\;= \langle x - x^*, A(x - x^*) \rangle + \langle x, Ax^* \rangle + \langle x^*, Ax \rangle - \langle x^*, Ax^* \rangle - 2\langle x, b \rangle$
$\;= \langle x - x^*, A(x - x^*) \rangle - \langle x^*, Ax^* \rangle + 2\langle x, Ax^* \rangle - 2\langle x, b \rangle$  (using $A^\top = A$)
$\;= \langle x - x^*, A(x - x^*) \rangle - \langle x^*, Ax^* \rangle + 2\langle x, Ax^* - b \rangle$
$\;= \langle x - x^*, A(x - x^*) \rangle - \langle x^*, Ax^* \rangle.$
Since $\langle x - x^*, A(x - x^*) \rangle \geq 0$, with equality only at $x = x^*$, it follows that $g(x^*) = \min_{x \in \mathbb{R}^n} g(x)$.

Proof ($\Leftarrow$). Assume $g(x^*) = \min_{x \in \mathbb{R}^n} g(x)$. Fix vectors $x$ and $v$; for any $\alpha \in \mathbb{R}$ define
$f(\alpha) \equiv g(x + \alpha v) = \langle x + \alpha v, Ax + \alpha Av \rangle - 2\langle x + \alpha v, b \rangle$
$\;= \langle x, Ax \rangle + \alpha\langle v, Ax \rangle + \alpha\langle x, Av \rangle + \alpha^2\langle v, Av \rangle - 2\langle x, b \rangle - 2\alpha\langle v, b \rangle$
$\;= \langle x, Ax \rangle - 2\langle x, b \rangle + 2\alpha\langle v, Ax \rangle - 2\alpha\langle v, b \rangle + \alpha^2\langle v, Av \rangle$
$\;= g(x) + 2\alpha\langle v, Ax - b \rangle + \alpha^2\langle v, Av \rangle.$

Proof (continued). $f(\alpha) = g(x) + 2\alpha\langle v, Ax - b \rangle + \alpha^2\langle v, Av \rangle$ is a quadratic function of $\alpha$, and since $A$ is s.p.d. we have $\langle v, Av \rangle > 0$, so $f$ attains its minimum where $f'(\alpha) = 0$:
$f'(\hat\alpha) = 2\langle v, Ax - b \rangle + 2\hat\alpha\langle v, Av \rangle = 0 \;\Rightarrow\; \hat\alpha = -\frac{\langle v, Ax - b \rangle}{\langle v, Av \rangle} = \frac{\langle v, b - Ax \rangle}{\langle v, Av \rangle}.$
Then
$g(x + \hat\alpha v) = f(\hat\alpha) = g(x) - 2\frac{\langle v, b - Ax \rangle}{\langle v, Av \rangle}\langle v, b - Ax \rangle + \frac{\langle v, b - Ax \rangle^2}{\langle v, Av \rangle^2}\langle v, Av \rangle = g(x) - \frac{\langle v, b - Ax \rangle^2}{\langle v, Av \rangle}.$

Proof (continued). For $v \neq 0$,
$g(x + \hat\alpha v) = g(x) - \frac{\langle v, b - Ax \rangle^2}{\langle v, Av \rangle},$
so $g(x + \hat\alpha v) < g(x)$ if $\langle v, b - Ax \rangle \neq 0$, and $g(x + \hat\alpha v) = g(x)$ if $\langle v, b - Ax \rangle = 0$. Now suppose $g(x^*) = \min_{x \in \mathbb{R}^n} g(x)$. Then $g(x^* + \hat\alpha v) \geq g(x^*)$ for every $v$, which forces $\langle v, b - Ax^* \rangle = 0$ for all $v$, hence $Ax^* = b$.

Summary: with $r \equiv b - Ax$, the optimal step along $v$ is
$\alpha = \frac{\langle v, b - Ax \rangle}{\langle v, Av \rangle} = \frac{\langle v, r \rangle}{\langle v, Av \rangle}.$
If $r \neq 0$ and $\langle v, r \rangle \neq 0$, then
$g(x + \alpha v) = g(x) - \frac{\langle v, r \rangle^2}{\langle v, Av \rangle} < g(x),$
i.e. $x + \alpha v$ is closer to $x^*$ than $x$ is. This suggests the generic iteration: given $x^{(0)}$ and $v^{(1)} \neq 0$, for $k = 1, 2, 3, \ldots$
$\alpha_k = \frac{\langle v^{(k)}, b - Ax^{(k-1)} \rangle}{\langle v^{(k)}, Av^{(k)} \rangle}, \qquad x^{(k)} = x^{(k-1)} + \alpha_k v^{(k)},$
then choose a new search direction $v^{(k+1)}$.
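A single update of this kind can be written as a short helper. This is a minimal sketch; the function and variable names are mine, not from the slides:

```python
import numpy as np

def descent_step(A, b, x, v):
    """One update x <- x + alpha*v with alpha = <v, b - Ax> / <v, Av>."""
    r = b - A @ x                       # current residual
    alpha = (v @ r) / (v @ (A @ v))     # optimal step along v
    return x + alpha * v
```

Repeatedly calling such a step with different choices of $v$ gives the generic descent iteration; the next slides specialize $v$ to the residual (steepest descent) and then to A-conjugate directions.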

Steepest descent. Question: how should we choose $v^{(k)}$ so that $\{x^{(k)}\} \to x^*$ rapidly? Let $\Phi: \mathbb{R}^n \to \mathbb{R}$ be differentiable at $x$. Then
$\frac{\Phi(x + \varepsilon p) - \Phi(x)}{\varepsilon} = \nabla\Phi(x)^\top p + O(\varepsilon).$
Neglecting the $O(\varepsilon)$ term, the right-hand side is minimized over all $p$ with $\|p\| = 1$ at $p = -\nabla\Phi(x)/\|\nabla\Phi(x)\|$, i.e. the direction of largest descent.

Steepest descent direction of $g$. Denote $x = [x_1, x_2, \ldots, x_n]^\top$. Then
$g(x) = \langle x, Ax \rangle - 2\langle x, b \rangle = \sum_{i=1}^n \sum_{j=1}^n a_{ij} x_i x_j - 2\sum_{i=1}^n x_i b_i,$
so
$\frac{\partial g}{\partial x_k}(x) = 2\sum_{i=1}^n a_{ki} x_i - 2 b_k = 2\big(A(k,:)\,x - b_k\big),$
and therefore
$\nabla g(x) = \Big[\frac{\partial g}{\partial x_1}(x), \ldots, \frac{\partial g}{\partial x_n}(x)\Big]^\top = 2(Ax - b) = -2r.$

Steepest descent method (gradient method)
Given $x^{(0)}$
For $k = 1, 2, 3, \ldots$
    $r_{k-1} = b - Ax^{(k-1)}$
    If $r_{k-1} = 0$, then Stop
    Else
        $\alpha_k = \dfrac{\langle r_{k-1}, r_{k-1} \rangle}{\langle r_{k-1}, Ar_{k-1} \rangle}$,  $x^{(k)} = x^{(k-1)} + \alpha_k r_{k-1}$
    End
End

Convergence Theorem. Let $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n > 0$ be the eigenvalues of $A$, let $x^{(k)}, x^{(k-1)}$ be the approximate solutions and $x^*$ the exact solution. Then
$\|x^{(k)} - x^*\|_A \leq \frac{\lambda_1 - \lambda_n}{\lambda_1 + \lambda_n}\,\|x^{(k-1)} - x^*\|_A, \qquad \text{where } \|x\|_A = \sqrt{x^\top A x}.$
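A minimal NumPy sketch of the steepest descent loop above; the exact test $r = 0$ is replaced by a small tolerance, and all names are my own:

```python
import numpy as np

def steepest_descent(A, b, x0, tol=1e-10, max_iter=10_000):
    """Gradient (steepest descent) method for s.p.d. A."""
    x = np.array(x0, dtype=float)
    for _ in range(max_iter):
        r = b - A @ x                      # r_{k-1} = b - A x^{(k-1)}
        if np.linalg.norm(r) < tol:        # practical replacement for r = 0
            break
        alpha = (r @ r) / (r @ (A @ r))    # alpha_k = <r, r> / <r, Ar>
        x = x + alpha * r                  # x^{(k)} = x^{(k-1)} + alpha_k r_{k-1}
    return x
```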

Conjugate gradient method

A-orthogonality. If $\kappa(A) = \lambda_1/\lambda_n$ is large, then $\frac{\lambda_1 - \lambda_n}{\lambda_1 + \lambda_n} \approx 1$ and convergence is very slow, so the gradient method is not recommended. Improvement: choose A-orthogonal search directions.

Definition. $p, q \in \mathbb{R}^n$ are called A-orthogonal (A-conjugate) if $p^\top A q = 0$.

Lemma. If $v_1, \ldots, v_n \neq 0$ are pairwise A-conjugate, then $v_1, \ldots, v_n$ are linearly independent.

Proof. Suppose $0 = \sum_{j=1}^n c_j v_j$. Then for each $k$,
$0 = (v_k)^\top A \Big(\sum_{j=1}^n c_j v_j\Big) = \sum_{j=1}^n c_j (v_k)^\top A v_j = c_k (v_k)^\top A v_k,$
so $c_k = 0$ for $k = 1, \ldots, n$. Hence $v_1, \ldots, v_n$ are linearly independent.

Theorem. Let $A$ be symmetric positive definite, let $v_1, \ldots, v_n \in \mathbb{R}^n \setminus \{0\}$ be pairwise A-conjugate, and let $x_0$ be given. For $k = 1, \ldots, n$, let
$\alpha_k = \frac{\langle v_k, b - Ax_{k-1} \rangle}{\langle v_k, Av_k \rangle}, \qquad x_k = x_{k-1} + \alpha_k v_k.$
Then $Ax_n = b$ and $\langle b - Ax_k, v_j \rangle = 0$ for $j = 1, 2, \ldots, k$.

Proof. From $x_k = x_{k-1} + \alpha_k v_k$,
$Ax_n = Ax_{n-1} + \alpha_n Av_n = (Ax_{n-2} + \alpha_{n-1} Av_{n-1}) + \alpha_n Av_n = \cdots = Ax_0 + \alpha_1 Av_1 + \alpha_2 Av_2 + \cdots + \alpha_n Av_n.$
Hence, using the pairwise A-conjugacy of the $v_j$,
$\langle Ax_n - b, v_k \rangle = \langle Ax_0 - b, v_k \rangle + \alpha_1 \langle Av_1, v_k \rangle + \cdots + \alpha_n \langle Av_n, v_k \rangle$
$\;= \langle Ax_0 - b, v_k \rangle + \alpha_k \langle v_k, Av_k \rangle$
$\;= \langle Ax_0 - b, v_k \rangle + \frac{\langle v_k, b - Ax_{k-1} \rangle}{\langle v_k, Av_k \rangle}\,\langle v_k, Av_k \rangle$
$\;= \langle Ax_0 - b, v_k \rangle + \langle v_k, b - Ax_{k-1} \rangle.$

Proof (continued).
$\langle Ax_n - b, v_k \rangle = \langle Ax_0 - b, v_k \rangle + \langle v_k, b - Ax_{k-1} \rangle$
$\;= \langle Ax_0 - b, v_k \rangle + \langle v_k, b - Ax_0 + Ax_0 - Ax_1 + \cdots + Ax_{k-2} - Ax_{k-1} \rangle$
$\;= \langle Ax_0 - b, v_k \rangle + \langle v_k, b - Ax_0 \rangle + \langle v_k, Ax_0 - Ax_1 \rangle + \cdots + \langle v_k, Ax_{k-2} - Ax_{k-1} \rangle$
$\;= \langle v_k, Ax_0 - Ax_1 \rangle + \cdots + \langle v_k, Ax_{k-2} - Ax_{k-1} \rangle.$
Since $x_i = x_{i-1} + \alpha_i v_i$ implies $Ax_{i-1} - Ax_i = -\alpha_i Av_i$, A-conjugacy gives
$\langle Ax_n - b, v_k \rangle = -\alpha_1 \langle v_k, Av_1 \rangle - \cdots - \alpha_{k-1} \langle v_k, Av_{k-1} \rangle = 0.$
This holds for every $k = 1, \ldots, n$, and the $v_k$ span $\mathbb{R}^n$, so $Ax_n = b$.

Proof (of the second claim, by induction). Assume $\langle b - Ax_{k-1}, v_j \rangle = 0$ for $j = 1, \ldots, k-1$, i.e. $\langle r_{k-1}, v_j \rangle = 0$ for $j = 1, \ldots, k-1$. Since
$r_k = b - Ax_k = b - A(x_{k-1} + \alpha_k v_k) = r_{k-1} - \alpha_k Av_k,$
we get
$\langle r_k, v_k \rangle = \langle r_{k-1}, v_k \rangle - \alpha_k \langle Av_k, v_k \rangle = \langle r_{k-1}, v_k \rangle - \frac{\langle v_k, b - Ax_{k-1} \rangle}{\langle v_k, Av_k \rangle}\,\langle Av_k, v_k \rangle = 0.$
For $j = 1, \ldots, k-1$, the induction assumption and A-conjugacy give
$\langle r_k, v_j \rangle = \langle r_{k-1}, v_j \rangle - \alpha_k \langle Av_k, v_j \rangle = 0,$
which completes the proof by mathematical induction.

Method of conjugate directions
Given $x^{(0)}$ and $v_1, \ldots, v_n \in \mathbb{R}^n \setminus \{0\}$ pairwise A-orthogonal, set $r_0 = b - Ax^{(0)}$.
For $k = 1, \ldots, n$
    $\alpha_k = \dfrac{\langle v_k, r_{k-1} \rangle}{\langle v_k, Av_k \rangle}$,  $x^{(k)} = x^{(k-1)} + \alpha_k v_k$
    $r_k = r_{k-1} - \alpha_k Av_k = b - Ax^{(k)}$
End

Question: how do we find A-orthogonal search directions?
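Before turning to that question, here is a sketch of the conjugate-directions loop itself (my own code, not from the slides). The example uses the fact that eigenvectors of a symmetric matrix are mutually A-orthogonal, so they can serve as the directions $v_1, \ldots, v_n$; by the theorem, the loop then reaches the exact solution after $n$ steps:

```python
import numpy as np

def conjugate_directions(A, b, x0, V):
    """V has pairwise A-orthogonal, nonzero columns v_1, ..., v_n."""
    x = np.array(x0, dtype=float)
    r = b - A @ x                          # r_0
    for k in range(V.shape[1]):
        v = V[:, k]
        alpha = (v @ r) / (v @ (A @ v))    # alpha_k = <v_k, r_{k-1}> / <v_k, A v_k>
        x = x + alpha * v
        r = r - alpha * (A @ v)            # r_k = r_{k-1} - alpha_k A v_k
    return x

# Example: eigenvectors of a symmetric A are A-orthogonal directions
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
_, V = np.linalg.eigh(A)
print(np.allclose(conjugate_directions(A, b, np.zeros(2), V), np.linalg.solve(A, b)))
```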

A-orthogonalization. For ordinary orthogonality, set $\tilde v_2 = v_2 - \alpha v_1$ with $\tilde v_2 \perp v_1$:
$0 = v_1^\top \tilde v_2 = v_1^\top v_2 - \alpha\, v_1^\top v_1 \;\Rightarrow\; \alpha = \frac{v_1^\top v_2}{v_1^\top v_1}.$
For A-orthogonality, set $\tilde v_2 = v_2 - \alpha v_1$ with $\tilde v_2 \perp_A v_1$:
$0 = v_1^\top A \tilde v_2 = v_1^\top A v_2 - \alpha\, v_1^\top A v_1 \;\Rightarrow\; \alpha = \frac{v_1^\top A v_2}{v_1^\top A v_1}.$

A-orthogonalization (continued). Thus
$\tilde v_2 = v_2 - \frac{v_1^\top A v_2}{v_1^\top A v_1}\, v_1,$
and $\{v_1, v_2\} \to \{v_1, \tilde v_2\}$ is A-orthogonal. Similarly $\{v_1, v_2, v_3\} \to \{v_1, \tilde v_2, \tilde v_3\}$ is A-orthogonal with $\tilde v_3 = v_3 - \alpha_1 v_1 - \alpha_2 \tilde v_2$ A-orthogonal to $\{v_1, \tilde v_2\}$:
$0 = v_1^\top A \tilde v_3 = v_1^\top A v_3 - \alpha_1 v_1^\top A v_1 \;\Rightarrow\; \alpha_1 = v_1^\top A v_3 / v_1^\top A v_1,$
$0 = \tilde v_2^\top A \tilde v_3 = \tilde v_2^\top A v_3 - \alpha_2 \tilde v_2^\top A \tilde v_2 \;\Rightarrow\; \alpha_2 = \tilde v_2^\top A v_3 / \tilde v_2^\top A \tilde v_2.$
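This Gram-Schmidt-style process with the A-inner product can be sketched as follows (a sketch only; names are mine), turning any linearly independent $v_1, \ldots, v_m$ into a pairwise A-orthogonal set spanning the same space:

```python
import numpy as np

def a_orthogonalize(A, V):
    """Columns of V: linearly independent vectors; returns A-orthogonal columns."""
    W = np.array(V, dtype=float)
    for k in range(W.shape[1]):
        for j in range(k):
            wj = W[:, j]
            # coefficient alpha_j = w_j^T A v_k / w_j^T A w_j, then subtract alpha_j * w_j
            W[:, k] -= ((wj @ (A @ W[:, k])) / (wj @ (A @ wj))) * wj
    return W
```

In CG itself this full process is never needed: as the next slides show, each new direction only has to be corrected against the previous one.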

Practical implementation. Given $x^{(0)}$, set $r_0 = b - Ax^{(0)}$ and take $v_1 = r_0$ (the steepest descent direction). Then
$\alpha_1 = \frac{\langle v_1, r_0 \rangle}{\langle v_1, Av_1 \rangle}, \qquad x^{(1)} = x^{(0)} + \alpha_1 v_1, \qquad r_1 = r_0 - \alpha_1 Av_1.$
The set $\{v_1, r_1\}$ is NOT A-orthogonal, so construct an A-orthogonal direction from it:
$v_2 = r_1 + \beta_1 v_1, \qquad \beta_1 = -\frac{\langle v_1, Ar_1 \rangle}{\langle v_1, Av_1 \rangle}.$
Then
$\alpha_2 = \frac{\langle v_2, r_1 \rangle}{\langle v_2, Av_2 \rangle}, \qquad x^{(2)} = x^{(1)} + \alpha_2 v_2, \qquad r_2 = r_1 - \alpha_2 Av_2.$

Next, construct an A-orthogonal direction from $\{v_1, v_2, r_2\}$:
$v_3 = r_2 + \beta_{21} v_1 + \beta_{22} v_2, \qquad \beta_{21} = -\frac{v_1^\top A r_2}{v_1^\top A v_1}, \qquad \beta_{22} = -\frac{v_2^\top A r_2}{v_2^\top A v_2}.$
Claim: $\beta_{21} = 0$. From $r_1 = r_0 - \alpha_1 Av_1$,
$v_1^\top A r_2 = r_2^\top A v_1 = \alpha_1^{-1}\big(r_2^\top r_0 - r_2^\top r_1\big).$
Also $v_2^\top r_2 = v_2^\top r_1 - \alpha_2 v_2^\top A v_2 = v_2^\top r_1 - v_2^\top r_1 = 0$, hence
$0 = v_2^\top r_2 = (r_1 + \beta_1 v_1)^\top r_2 = r_1^\top r_2 + \beta_1 v_1^\top r_2$
$\;= r_1^\top r_2 + \beta_1 v_1^\top (r_1 - \alpha_2 Av_2) = r_1^\top r_2 + \beta_1 v_1^\top r_1$
$\;= r_1^\top r_2 + \beta_1 v_1^\top (r_0 - \alpha_1 Av_1) = r_1^\top r_2 + \beta_1\Big(v_1^\top r_0 - \frac{\langle v_1, r_0 \rangle}{\langle v_1, Av_1 \rangle}\, v_1^\top A v_1\Big) = r_1^\top r_2,$
so $r_1^\top r_2 = 0$.

Moreover, from $r_1 = r_0 - \alpha_1 Av_1$ with $\alpha_1 = \frac{\langle v_1, r_0 \rangle}{\langle v_1, Av_1 \rangle}$,
$\langle v_1, r_1 \rangle = \langle v_1, r_0 \rangle - \alpha_1 \langle v_1, Av_1 \rangle = 0,$
and since $v_1 = r_0$,
$\langle r_2, r_0 \rangle = \langle r_2, v_1 \rangle = \langle r_1, v_1 \rangle - \alpha_2 \langle Av_2, v_1 \rangle = 0.$
Therefore
$v_1^\top A r_2 = \alpha_1^{-1}\big(r_2^\top r_0 - r_2^\top r_1\big) = 0 \;\Rightarrow\; \beta_{21} = -\frac{v_1^\top A r_2}{v_1^\top A v_1} = 0,$
so
$v_3 = r_2 + \beta_2 v_2, \qquad \beta_2 = -\frac{v_2^\top A r_2}{v_2^\top A v_2}.$

In the general case,
$v_k = r_{k-1} + \beta_{k-1} v_{k-1} \qquad \text{if } r_{k-1} \neq 0,$
where $\beta_{k-1}$ is chosen so that
$0 = \langle v_{k-1}, Av_k \rangle = \langle v_{k-1}, Ar_{k-1} \rangle + \beta_{k-1} \langle v_{k-1}, Av_{k-1} \rangle \;\Rightarrow\; \beta_{k-1} = -\frac{\langle v_{k-1}, Ar_{k-1} \rangle}{\langle v_{k-1}, Av_{k-1} \rangle}.$

Theorem. (i) $\{r_0, r_1, \ldots, r_{k-1}\}$ is an orthogonal set. (ii) $\{v_1, \ldots, v_k\}$ is an A-orthogonal set.

Reformulation of $\alpha_k$ and $\beta_k$. Since $v_k = r_{k-1} + \beta_{k-1} v_{k-1}$ and $\langle v_{k-1}, r_{k-1} \rangle = 0$,
$\alpha_k = \frac{\langle v_k, r_{k-1} \rangle}{\langle v_k, Av_k \rangle} = \frac{\langle r_{k-1} + \beta_{k-1} v_{k-1}, r_{k-1} \rangle}{\langle v_k, Av_k \rangle} = \frac{\langle r_{k-1}, r_{k-1} \rangle}{\langle v_k, Av_k \rangle} + \beta_{k-1}\frac{\langle v_{k-1}, r_{k-1} \rangle}{\langle v_k, Av_k \rangle} = \frac{\langle r_{k-1}, r_{k-1} \rangle}{\langle v_k, Av_k \rangle},$
so $\langle r_{k-1}, r_{k-1} \rangle = \alpha_k \langle v_k, Av_k \rangle$. From $r_k = r_{k-1} - \alpha_k Av_k$ and $\langle r_k, r_{k-1} \rangle = 0$,
$\langle r_k, r_k \rangle = \langle r_{k-1}, r_k \rangle - \alpha_k \langle Av_k, r_k \rangle = -\alpha_k \langle r_k, Av_k \rangle.$
Therefore
$\beta_k = -\frac{\langle v_k, Ar_k \rangle}{\langle v_k, Av_k \rangle} = -\frac{\langle r_k, Av_k \rangle}{\langle v_k, Av_k \rangle} = \frac{\langle r_k, r_k \rangle}{\langle r_{k-1}, r_{k-1} \rangle}.$

Algorithm (Conjugate Gradient Method)
Given $x^{(0)}$, compute $r_0 = b - Ax^{(0)} = v_0$
For $k = 0, 1, \ldots$
    $\alpha_k = \dfrac{\langle r_k, r_k \rangle}{\langle v_k, Av_k \rangle}$,  $x^{(k+1)} = x^{(k)} + \alpha_k v_k$
    $r_{k+1} = r_k - \alpha_k Av_k$
    If $r_{k+1} = 0$, then Stop
    Else
        $\beta_k = \dfrac{\langle r_{k+1}, r_{k+1} \rangle}{\langle r_k, r_k \rangle}$,  $v_{k+1} = r_{k+1} + \beta_k v_k$
    End
End

Theorem. In exact arithmetic $Ax_n = b$. In practice, if $A$ is well-conditioned then $\|r_n\| <$ tol, while for ill-conditioned $A$ one may only reach $\|r_k\| <$ tol for some $k > n$.
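A minimal NumPy sketch of the CG algorithm above, with a norm-based stopping test in place of the exact check $r_{k+1} = 0$ (all names are mine):

```python
import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-10, max_iter=None):
    """Conjugate gradient method for s.p.d. A."""
    x = np.array(x0, dtype=float)
    r = b - A @ x                          # r_0 = b - A x^{(0)}
    v = r.copy()                           # v_0 = r_0
    rr = r @ r
    if max_iter is None:
        max_iter = 10 * len(b)             # allow more than n steps in floating point
    for _ in range(max_iter):
        if np.sqrt(rr) < tol:
            break
        Av = A @ v
        alpha = rr / (v @ Av)              # alpha_k = <r_k, r_k> / <v_k, A v_k>
        x = x + alpha * v                  # x^{(k+1)} = x^{(k)} + alpha_k v_k
        r = r - alpha * Av                 # r_{k+1} = r_k - alpha_k A v_k
        rr_new = r @ r
        beta = rr_new / rr                 # beta_k = <r_{k+1}, r_{k+1}> / <r_k, r_k>
        v = r + beta * v                   # v_{k+1} = r_{k+1} + beta_k v_k
        rr = rr_new
    return x
```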

Conjugate Gradient Method: Convergence Theorem. Let $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n > 0$ be the eigenvalues of $A$, let $\{x^{(k)}\}$ be produced by the CG method, and let $x^*$ be the exact solution. Then
$\|x^{(k)} - x^*\|_A \leq 2\left(\frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1}\right)^k \|x^{(0)} - x^*\|_A, \qquad \kappa = \frac{\lambda_1}{\lambda_n}.$
CG is much better than the gradient method: for $\{x_G^{(k)}\}$ produced by the gradient method,
$\|x_G^{(k)} - x^*\|_A \leq \left(\frac{\lambda_1 - \lambda_n}{\lambda_1 + \lambda_n}\right)^k \|x_G^{(0)} - x^*\|_A = \left(\frac{\kappa - 1}{\kappa + 1}\right)^k \|x_G^{(0)} - x^*\|_A,$
and $\frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1} \ll \frac{\kappa - 1}{\kappa + 1}$ when $\kappa$ is large.
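A quick arithmetic illustration of the gap between the two reduction factors (the value $\kappa = 10^4$ is an arbitrary example of mine):

```python
import numpy as np

kappa = 1.0e4
grad_factor = (kappa - 1.0) / (kappa + 1.0)                  # gradient method
cg_factor = (np.sqrt(kappa) - 1.0) / (np.sqrt(kappa) + 1.0)  # CG
print(grad_factor, cg_factor)                                # ~0.9998 vs ~0.9802
# Iterations needed (by these bounds) to reduce the A-norm error by 1e-6:
print(np.log(1e-6) / np.log(grad_factor),                    # ~6.9e4 steps
      np.log(1e-6) / np.log(cg_factor))                      # ~6.9e2 steps
```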

Preconditioner

$Ax = b \;\longrightarrow\; \tilde A \tilde x = \tilde b, \qquad \tilde A = C^{-1} A C^{-\top}, \quad \tilde x = C^\top x, \quad \tilde b = C^{-1} b,$
i.e. $\big(C^{-1} A C^{-\top}\big)\big(C^\top x\big) = C^{-1} b$.

Goal: choose $C$ such that $\kappa(C^{-1} A C^{-\top}) < \kappa(A)$.

Apply the CG method to $\tilde A \tilde x = \tilde b$, get $\tilde x$, then solve $C^\top x = \tilde x$, i.e. $x = C^{-\top}\tilde x$.

Question: is there anything new here? Reformulate the CG method applied to $\tilde A \tilde x = \tilde b$ so that it produces $x$ directly.

Algorithm (CG method applied to $\tilde A \tilde x = \tilde b$)
Given $\tilde x^{(0)}$, compute $\tilde r_0 = \tilde b - \tilde A \tilde x^{(0)} = \tilde v_0$
For $k = 0, 1, \ldots$
    $\tilde\alpha_k = \dfrac{\langle \tilde r_k, \tilde r_k \rangle}{\langle \tilde v_k, \tilde A \tilde v_k \rangle}$
    $\tilde x^{(k+1)} = \tilde x^{(k)} + \tilde\alpha_k \tilde v_k$
    $\tilde r_{k+1} = \tilde b - \tilde A \tilde x^{(k+1)} = C^{-1} b - \big(C^{-1} A C^{-\top}\big) C^\top x^{(k+1)} = C^{-1}\big(b - Ax^{(k+1)}\big) = C^{-1} r_{k+1}$
    If $\tilde r_{k+1} = 0$, then Stop
    $\tilde\beta_k = \dfrac{\langle \tilde r_{k+1}, \tilde r_{k+1} \rangle}{\langle \tilde r_k, \tilde r_k \rangle} = \dfrac{\langle C^{-1} r_{k+1}, C^{-1} r_{k+1} \rangle}{\langle C^{-1} r_k, C^{-1} r_k \rangle} = \dfrac{\langle w_{k+1}, w_{k+1} \rangle}{\langle w_k, w_k \rangle}$
    $\tilde v_{k+1} = \tilde r_{k+1} + \tilde\beta_k \tilde v_k$
End
where we let $\tilde v_k = C^\top v_k$ and $w_k = C^{-1} r_k$.

Similarly, for $\tilde\alpha_k$:
$\tilde\alpha_k = \frac{\langle \tilde r_k, \tilde r_k \rangle}{\langle \tilde v_k, \tilde A \tilde v_k \rangle} = \frac{\langle C^{-1} r_k, C^{-1} r_k \rangle}{\langle C^\top v_k, C^{-1} A C^{-\top} C^\top v_k \rangle} = \frac{\langle w_k, w_k \rangle}{\langle C^\top v_k, C^{-1} A v_k \rangle},$
and since
$\langle C^\top v_k, C^{-1} A v_k \rangle = v_k^\top C C^{-1} A v_k = v_k^\top A v_k,$
we obtain
$\tilde\alpha_k = \frac{\langle w_k, w_k \rangle}{\langle v_k, Av_k \rangle}.$

Rewriting the updates in terms of the original variables:
$\tilde x^{(k+1)} = \tilde x^{(k)} + \tilde\alpha_k \tilde v_k \;\Rightarrow\; C^\top x^{(k+1)} = C^\top x^{(k)} + \tilde\alpha_k C^\top v_k \;\Rightarrow\; x^{(k+1)} = x^{(k)} + \tilde\alpha_k v_k,$
$\tilde r_{k+1} = \tilde r_k - \tilde\alpha_k \tilde A \tilde v_k \;\Rightarrow\; C^{-1} r_{k+1} = C^{-1} r_k - \tilde\alpha_k C^{-1} A C^{-\top} C^\top v_k \;\Rightarrow\; r_{k+1} = r_k - \tilde\alpha_k Av_k,$
$\tilde v_{k+1} = \tilde r_{k+1} + \tilde\beta_k \tilde v_k \;\Rightarrow\; C^\top v_{k+1} = C^{-1} r_{k+1} + \tilde\beta_k C^\top v_k \;\Rightarrow\; v_{k+1} = C^{-\top} C^{-1} r_{k+1} + \tilde\beta_k v_k = C^{-\top} w_{k+1} + \tilde\beta_k v_k.$

Collecting these, the CG iteration on $\tilde A \tilde x = \tilde b$ becomes, in the original variables:
Given $x^{(0)}$, compute $r_0 = b - Ax^{(0)}$; we also need $w_0 = C^{-1} r_0 = C^{-1}(b - Ax^{(0)})$ and $v_0 = C^{-\top} w_0$.
For $k = 0, 1, \ldots$
    $\tilde\alpha_k = \dfrac{\langle w_k, w_k \rangle}{\langle v_k, Av_k \rangle}$
    $x^{(k+1)} = x^{(k)} + \tilde\alpha_k v_k$
    $r_{k+1} = r_k - \tilde\alpha_k Av_k$
    If $r_{k+1} = 0$, then Stop
    Solve $C w_{k+1} = r_{k+1}$
    $\tilde\beta_k = \dfrac{\langle w_{k+1}, w_{k+1} \rangle}{\langle w_k, w_k \rangle}$
    $v_{k+1} = C^{-\top} w_{k+1} + \tilde\beta_k v_k$
End

Algorithm (CG Method with preconditioner $C$)
Given $C$ and $x^{(0)}$, compute $r_0 = b - Ax^{(0)}$; solve $C w_0 = r_0$ and $C^\top v_0 = w_0$.
For $k = 0, 1, \ldots$
    $\alpha_k = \langle w_k, w_k \rangle / \langle v_k, Av_k \rangle$
    $x^{(k+1)} = x^{(k)} + \alpha_k v_k$
    $r_{k+1} = r_k - \alpha_k Av_k$
    If $r_{k+1} = 0$, then Stop
    Solve $C w_{k+1} = r_{k+1}$ and $C^\top z_{k+1} = w_{k+1}$
    $\beta_k = \langle w_{k+1}, w_{k+1} \rangle / \langle w_k, w_k \rangle$
    $v_{k+1} = z_{k+1} + \beta_k v_k$
End
With $z_k = C^{-\top} C^{-1} r_k$ and $M \equiv C C^\top$ (so that $r_{k+1} = C C^\top z_{k+1} \equiv M z_{k+1}$), the coefficients can be written without $C$:
$\alpha_k = \frac{\langle C^{-1} r_k, C^{-1} r_k \rangle}{\langle C^\top v_k, C^{-1} A v_k \rangle} = \frac{\langle z_k, r_k \rangle}{\langle v_k, Av_k \rangle}, \qquad \beta_k = \frac{\langle C^{-1} r_{k+1}, C^{-1} r_{k+1} \rangle}{\langle C^{-1} r_k, C^{-1} r_k \rangle} = \frac{\langle z_{k+1}, r_{k+1} \rangle}{\langle z_k, r_k \rangle}.$

Algorithm (CG Method with preconditioner $M$)
Given $M$ and $x^{(0)}$, compute $r_0 = b - Ax^{(0)}$
Solve $M z_0 = r_0$ and set $v_0 = z_0$
For $k = 0, 1, \ldots$
    Compute $\alpha_k = \langle z_k, r_k \rangle / \langle v_k, Av_k \rangle$
    Compute $x^{(k+1)} = x^{(k)} + \alpha_k v_k$
    Compute $r_{k+1} = r_k - \alpha_k Av_k$
    If $r_{k+1} = 0$, then Stop
    Solve $M z_{k+1} = r_{k+1}$
    Compute $\beta_k = \langle z_{k+1}, r_{k+1} \rangle / \langle z_k, r_k \rangle$
    Compute $v_{k+1} = z_{k+1} + \beta_k v_k$
End
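A minimal sketch of this preconditioned CG loop; `solve_M` stands for any routine that solves $Mz = r$, and all names are my own:

```python
import numpy as np

def preconditioned_cg(A, b, solve_M, x0, tol=1e-10, max_iter=1000):
    """CG with preconditioner M; solve_M(r) must return z with M z = r."""
    x = np.array(x0, dtype=float)
    r = b - A @ x                          # r_0
    z = solve_M(r)                         # M z_0 = r_0
    v = z.copy()                           # v_0 = z_0
    rz = z @ r                             # <z_0, r_0>
    for _ in range(max_iter):
        if np.linalg.norm(r) < tol:
            break
        Av = A @ v
        alpha = rz / (v @ Av)              # alpha_k = <z_k, r_k> / <v_k, A v_k>
        x = x + alpha * v
        r = r - alpha * Av                 # r_{k+1} = r_k - alpha_k A v_k
        z = solve_M(r)                     # M z_{k+1} = r_{k+1}
        rz_new = z @ r
        beta = rz_new / rz                 # beta_k = <z_{k+1}, r_{k+1}> / <z_k, r_k>
        v = z + beta * v                   # v_{k+1} = z_{k+1} + beta_k v_k
        rz = rz_new
    return x
```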

Choices of $M$ (criteria):
$\mathrm{cond}(M^{-1/2} A M^{-1/2})$ is close to 1, i.e. $M^{-1/2} A M^{-1/2} \approx I$, i.e. $A \approx M$.
The linear system $Mz = r$ must be easy to solve, e.g. $M = LL^\top$.
$M$ is symmetric positive definite.

Preconditioner $M$
Jacobi method: $A = D + (L + U)$, $M = D$:
$x_{k+1} = -D^{-1}(L + U)\, x_k + D^{-1} b = -D^{-1}(A - D)\, x_k + D^{-1} b = x_k + D^{-1} r_k.$
Gauss-Seidel: $A = (D + L) + U$, $M = D + L$:
$x_{k+1} = -(D + L)^{-1} U x_k + (D + L)^{-1} b = (D + L)^{-1}\big((D + L) - A\big) x_k + (D + L)^{-1} b = x_k + (D + L)^{-1} r_k.$
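A small sketch (assuming SciPy is available; names and the example matrix are mine) of how these two splittings translate into the "solve $Mz = r$" step of the preconditioned CG algorithm:

```python
import numpy as np
from scipy.linalg import solve_triangular

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])            # an arbitrary s.p.d. example
D = np.diag(np.diag(A))                    # diagonal part
L = np.tril(A, k=-1)                       # strictly lower triangular part

# Jacobi preconditioner M = D: z = D^{-1} r
solve_jacobi = lambda r: r / np.diag(A)

# Gauss-Seidel preconditioner M = D + L: solve (D + L) z = r by forward substitution
solve_gauss_seidel = lambda r: solve_triangular(D + L, r, lower=True)
```

Note that $D$ is symmetric positive definite whenever $A$ is, whereas $D + L$ is not symmetric, so only the Jacobi choice satisfies the s.p.d. criterion listed above; the Gauss-Seidel splitting appears here because it generates the classical stationary iteration $x_{k+1} = x_k + (D + L)^{-1} r_k$.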

Preconditioner $M$ (continued)
SOR: $\omega A = (D + \omega L) - \big((1 - \omega) D - \omega U\big) \equiv M - N$:
$x_{k+1} = (D + \omega L)^{-1}\big[(1 - \omega) D - \omega U\big] x_k + \omega (D + \omega L)^{-1} b$
$\qquad = (D + \omega L)^{-1}\big[(D + \omega L) - \omega A\big] x_k + \omega (D + \omega L)^{-1} b$
$\qquad = \big[I - \omega (D + \omega L)^{-1} A\big] x_k + \omega (D + \omega L)^{-1} b$
$\qquad = x_k + \omega (D + \omega L)^{-1} r_k.$
SSOR: $M(\omega) = \dfrac{1}{\omega(2 - \omega)}\,(D + \omega L)\, D^{-1}\, (D + \omega L)^\top.$
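A sketch of forming the SSOR preconditioner solve $M(\omega) z = r$ via its two triangular factors (the function name and the default $\omega = 1.2$ are my own choices, assuming $A$ is symmetric so that $U = L^\top$):

```python
import numpy as np
from scipy.linalg import solve_triangular

def ssor_solver(A, omega=1.2):
    """Return a function r -> z solving M(omega) z = r, with M(omega) as above."""
    d = np.diag(A)
    DL = np.diag(d) + omega * np.tril(A, k=-1)             # D + omega*L
    scale = 1.0 / (omega * (2.0 - omega))
    def solve_M(r):
        # M z = r  <=>  scale * (D + wL) D^{-1} (D + wL)^T z = r
        y = solve_triangular(DL, r / scale, lower=True)    # (D + wL) y = r / scale
        y = d * y                                          # multiply by D
        return solve_triangular(DL.T, y, lower=False)      # (D + wL)^T z = D y
    return solve_M
```

It can be passed as the `solve_M` argument of the preconditioned CG sketch given earlier, e.g. `preconditioned_cg(A, b, ssor_solver(A), x0)`.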