The Conjugate Gradient Method

The Conjugate Gradient Method. Jason E. Hicken, Aerospace Design Lab, Department of Aeronautics & Astronautics, Stanford University, 14 July 2011.

Lecture Objectives: describe when CG can be used to solve $Ax = b$; relate CG to the method of conjugate directions; describe what CG does geometrically; explain each line in the CG algorithm.

We are interested in solving the linear system $Ax = b$, where $x, b \in \mathbb{R}^n$ and $A \in \mathbb{R}^{n \times n}$. The matrix is symmetric positive-definite (SPD): $A^T = A$ (symmetric) and $x^T A x > 0$ for all $x \neq 0$ (positive-definite). Such systems arise in the discretization of elliptic PDEs, the optimization of quadratic functionals, and nonlinear optimization problems.

When $A$ is SPD, solving the linear system is the same as minimizing the quadratic form $f(x) = \frac{1}{2} x^T A x - b^T x$. Why? If $x^*$ is the minimizing point, then $\nabla f(x^*) = A x^* - b = 0$ and, for $x \neq x^*$, $f(x) - f(x^*) > 0$. (homework)
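
One way to see the second claim, as a sketch of the homework argument (using $b = A x^*$ and the symmetry of $A$):
\[
f(x) - f(x^*) = \tfrac{1}{2} x^T A x - x^{*T} A x - \tfrac{1}{2} x^{*T} A x^* + x^{*T} A x^*
= \tfrac{1}{2}\,(x - x^*)^T A\, (x - x^*) > 0 \quad \text{for } x \neq x^*,
\]
where the final inequality is exactly positive-definiteness applied to $x - x^* \neq 0$.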

Definitions. Let $x_i$ be the approximate solution to $Ax = b$ at iteration $i$: error $e_i \equiv x_i - x^*$; residual $r_i \equiv b - A x_i$. The following identities for the residual will be useful later: $r_i = -A e_i$ and $r_i = -\nabla f(x_i)$.

Model problem:
\[
\begin{bmatrix} 5 & -3 \\ -3 & 5 \end{bmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} =
\begin{pmatrix} 4 \\ 4 \end{pmatrix}
\]
[Figure: contours of the quadratic form in the $(x_1, x_2)$ plane, together with the lines $-3 x_1 + 5 x_2 = 4$ and $5 x_1 - 3 x_2 = 4$, which intersect at the solution $x^* = [2\ 2]^T$.]

Review: Steepest Descent Method. Qualitatively, how will steepest descent proceed on our model problem, starting at $x_0 = (\tfrac{1}{3}, 1)^T$? [Figure: the steepest-descent iterates $x_0, x_1, \dots$ zig-zag toward the solution $x^* = [2\ 2]^T$.]
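
For concreteness, a minimal NumPy sketch of steepest descent with exact line search on the model problem (the starting point is read from the slide; the iteration cap and tolerance are my own illustrative choices):

import numpy as np

# Model problem: A is SPD, exact solution x* = [2, 2]
A = np.array([[5.0, -3.0],
              [-3.0, 5.0]])
b = np.array([4.0, 4.0])

x = np.array([1.0 / 3.0, 1.0])        # starting guess (as read from the slide; any x0 works)
for i in range(20):
    r = b - A @ x                     # residual, equal to -grad f(x)
    if np.linalg.norm(r) < 1e-12:
        break
    alpha = (r @ r) / (r @ (A @ r))   # exact line search along the residual direction
    x = x + alpha * r
print(i, x)                           # approaches [2, 2] only geometrically: the iterates zig-zag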

How can we eliminate this zig-zag behaviour? To find the answer, we begin by considering the easier problem
\[
\begin{bmatrix} 2 & 0 \\ 0 & 8 \end{bmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} =
\begin{pmatrix} 4\sqrt{2} \\ 0 \end{pmatrix},
\qquad
f(x) = x_1^2 + 4 x_2^2 - 4\sqrt{2}\, x_1 .
\]
Here the equations are decoupled, so we can minimize in each direction independently. What do the contours of the corresponding quadratic form look like?

Simplified problem:
\[
\begin{bmatrix} 2 & 0 \\ 0 & 8 \end{bmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} =
\begin{pmatrix} 4\sqrt{2} \\ 0 \end{pmatrix}
\]
[Figure: the contours of the quadratic form are axis-aligned ellipses; the initial error $e_0$ decomposes into components along the coordinate axes.]

Method of Orthogonal Directions. Idea: express the error as a sum of $n$ orthogonal search directions,
\[
e \equiv x_0 - x^* = \sum_{i=0}^{n-1} \alpha_i d_i .
\]
At iteration $i+1$, eliminate the component $\alpha_i d_i$: we never need to search along $d_i$ again, and we converge in $n$ iterations! How would we apply the method of orthogonal directions to a non-diagonal matrix?
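
As an illustration, a minimal sketch of the method of orthogonal directions on the decoupled problem above, searching along the coordinate axes in turn (the zero starting guess is my own assumption; the line-search formula is the same one used for steepest descent, restricted to a fixed direction):

import numpy as np

# Decoupled (diagonal) problem from the slides
A = np.array([[2.0, 0.0],
              [0.0, 8.0]])
b = np.array([4.0 * np.sqrt(2.0), 0.0])

x = np.zeros(2)                       # starting guess (assumed)
for d in (np.array([1.0, 0.0]),       # search the coordinate axes one after another
          np.array([0.0, 1.0])):
    r = b - A @ x
    alpha = (d @ r) / (d @ (A @ d))   # minimize f exactly along direction d
    x = x + alpha * d
print(x)                              # exact solution [2*sqrt(2), 0] after n = 2 steps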

Review of Inner Products. The search directions in the method of orthogonal directions are orthogonal with respect to the dot product; the dot product is an example of an inner product. Inner Product: for $x, y, z \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$, an inner product $(\cdot,\cdot) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ satisfies symmetry, $(x,y) = (y,x)$; linearity, $(\alpha x + y, z) = \alpha (x,z) + (y,z)$; and positive-definiteness, $(x,x) > 0$ for all $x \neq 0$.

Fact: $(x,y)_A \equiv x^T A y$ is an inner product. A-orthogonality (conjugacy): we say two vectors $x, y \in \mathbb{R}^n$ are A-orthogonal, or conjugate, if $(x,y)_A = x^T A y = 0$. What happens if we use A-orthogonality rather than standard orthogonality in the method of orthogonal directions?
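
A quick worked check with the model-problem matrix (the particular vectors are my own illustrative choice):
\[
u = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad
v = \begin{pmatrix} 3 \\ 5 \end{pmatrix}, \quad
(u, v)_A = u^T A v =
\begin{pmatrix} 1 & 0 \end{pmatrix}
\begin{bmatrix} 5 & -3 \\ -3 & 5 \end{bmatrix}
\begin{pmatrix} 3 \\ 5 \end{pmatrix} = 15 - 15 = 0,
\]
so $u$ and $v$ are conjugate even though their ordinary dot product $u^T v = 3$ is nonzero.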

Let $\{p_0, p_1, \dots, p_{n-1}\}$ be a set of $n$ linearly independent vectors that are A-orthogonal. If $p_i$ is the $i$th column of $P$, then $P^T A P = \Sigma$, where $\Sigma$ is a diagonal matrix. Substitute $x = P y$ into the quadratic form: $f(Py) = \frac{1}{2} y^T \Sigma y - (P^T b)^T y$. We can apply the method of orthogonal directions in $y$-space.

New problem: how do we get the set $\{p_i\}$ of conjugate vectors? Gram-Schmidt Conjugation: let $\{d_0, d_1, \dots, d_{n-1}\}$ be a set of linearly independent vectors, e.g., the coordinate axes. Set $p_0 = d_0$ and, for $i > 0$,
\[
p_i = d_i - \sum_{j=0}^{i-1} \beta_{ij}\, p_j,
\qquad
\beta_{ij} = (d_i, p_j)_A / (p_j, p_j)_A .
\]
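
A minimal sketch of Gram-Schmidt conjugation, using the model-problem matrix and the coordinate axes as the $d_i$ (the function name is mine):

import numpy as np

def gram_schmidt_conjugate(A, D):
    """A-orthogonalize the columns of D against each other."""
    P = D.astype(float)
    for i in range(P.shape[1]):
        for j in range(i):
            beta = (P[:, i] @ (A @ P[:, j])) / (P[:, j] @ (A @ P[:, j]))
            P[:, i] -= beta * P[:, j]
    return P

A = np.array([[5.0, -3.0],
              [-3.0, 5.0]])
D = np.eye(2)                    # d_i = coordinate axes
P = gram_schmidt_conjugate(A, D)
print(P.T @ A @ P)               # diagonal (up to rounding), since the columns p_i are conjugate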

The Method of Conjugate Directions. Force the error at iteration $i+1$ to be conjugate to the search direction $p_i$:
\[
p_i^T A e_{i+1} = p_i^T A\,(e_i + \alpha_i p_i) = 0
\quad\Longrightarrow\quad
\alpha_i = -\frac{p_i^T A e_i}{p_i^T A p_i} = \frac{p_i^T r_i}{p_i^T A p_i} .
\]
We never need to search along $p_i$ again, and we converge in $n$ iterations!
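
Putting the pieces together, a minimal sketch of the method of conjugate directions on the model problem, using the conjugated coordinate axes as search directions (the starting guess follows the earlier slides; variable names are mine):

import numpy as np

A = np.array([[5.0, -3.0],
              [-3.0, 5.0]])
b = np.array([4.0, 4.0])

# Conjugate directions from Gram-Schmidt conjugation of the coordinate axes
p0 = np.array([1.0, 0.0])
p1 = np.array([0.0, 1.0]) - ((np.array([0.0, 1.0]) @ (A @ p0)) / (p0 @ (A @ p0))) * p0

x = np.array([1.0 / 3.0, 1.0])        # same starting guess as before (assumed)
for p in (p0, p1):
    r = b - A @ x
    alpha = (p @ r) / (p @ (A @ p))   # step length from the conjugacy condition
    x = x + alpha * p
print(x)                              # [2, 2] (up to rounding) after exactly n = 2 iterations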

The Method of Conjugate Directions on the model problem:
\[
\begin{bmatrix} 5 & -3 \\ -3 & 5 \end{bmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} =
\begin{pmatrix} 4 \\ 4 \end{pmatrix}
\]
[Figure: the coordinate directions $d_0, d_1$ are conjugated into search directions $p_0, p_1$; starting from $x_0$, the iterates reach $x^* = [2\ 2]^T$ in two steps.]

The Method of Conjugate Directions is well defined and avoids the zig-zagging of Steepest Descent. What about computational expense? If we choose the $d_i$ in Gram-Schmidt conjugation to be the coordinate axes, the Method of Conjugate Directions is equivalent to Gaussian elimination, and keeping all the $p_i$ is the same as storing a dense matrix! Can we find a smarter choice for $d_i$?

Error Decomposition Using the $p_i$. [Figure: the initial error $e_0$ decomposed into components along the conjugate directions $p_0$ and $p_1$, labelled $\alpha_0 p_0$ and $\alpha_1 p_1$.]

The error at iteration $i$ can be expressed as
\[
e_i = -\sum_{k=i}^{n-1} \alpha_k p_k ,
\]
so the error must be conjugate to $p_j$ for $j < i$:
\[
p_j^T A e_i = 0, \qquad p_j^T r_i = 0 .
\]
But from Gram-Schmidt conjugation we have
\[
p_j = d_j - \sum_{k=0}^{j-1} \beta_{jk}\, p_k ,
\]
so
\[
p_j^T r_i = d_j^T r_i - \sum_{k=0}^{j-1} \beta_{jk}\, p_k^T r_i
\quad\Longrightarrow\quad
0 = d_j^T r_i, \quad j < i .
\]

Thus, the residual at iteration $i$ is orthogonal to the vectors $d_j$ used in the previous iterations: $d_j^T r_i = 0$ for $j < i$. (We showed this is true for any choice of the $d_i$.) Idea: what happens if we choose $d_i = r_i$? Then the residuals become mutually orthogonal; $r_i$ is orthogonal to $p_j$ for $j < i$; and $r_{i+1}$ becomes conjugate to $p_j$ for $j < i$. This last point is not immediately obvious, so we will prove it; the result has significant implications for Gram-Schmidt conjugation.

The solution is updated according to
\[
x_{j+1} = x_j + \alpha_j p_j, \qquad
r_{j+1} = r_j - \alpha_j A p_j
\quad\Longrightarrow\quad
A p_j = \frac{1}{\alpha_j}\,(r_j - r_{j+1}) .
\]
Next, take the dot product of both sides with an arbitrary residual $r_i$:
\[
r_i^T A p_j =
\begin{cases}
\dfrac{r_i^T r_i}{\alpha_i}, & i = j, \\[2ex]
-\dfrac{r_i^T r_i}{\alpha_{i-1}}, & i = j + 1, \\[2ex]
0, & \text{otherwise.}
\end{cases}
\]

We can show that the first case ($i = j$) contains no new information (homework). Divide the remaining cases by $p_j^T A p_j$ and insert the definition of $\alpha_{i-1}$:
\[
\underbrace{\frac{r_i^T A p_j}{p_j^T A p_j}}_{\beta_{ij}} =
\begin{cases}
-\dfrac{r_i^T r_i}{r_{i-1}^T r_{i-1}}, & i = j + 1, \\[2ex]
0, & \text{otherwise.}
\end{cases}
\]
We recognize the left-hand side as the coefficients $\beta_{ij}$ in Gram-Schmidt conjugation (with $d_i = r_i$): only one coefficient is nonzero!

The Conjugate Gradient Method. Set $p_0 = r_0 = b - A x_0$ and $i = 0$, then repeat:
\[
\begin{aligned}
\alpha_i &= (p_i^T r_i)/(p_i^T A p_i) && \text{(step length)} \\
x_{i+1} &= x_i + \alpha_i p_i && \text{(sol. update)} \\
r_{i+1} &= r_i - \alpha_i A p_i && \text{(resid. update)} \\
\beta_{i+1,i} &= -(r_{i+1}^T r_{i+1})/(r_i^T r_i) && \text{(G.S. coeff.)} \\
p_{i+1} &= r_{i+1} - \beta_{i+1,i}\, p_i && \text{(Gram-Schmidt)} \\
i &:= i + 1
\end{aligned}
\]
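
The algorithm translates almost line for line into code. A minimal NumPy sketch (the function name, tolerance, and iteration cap are my own choices), followed by a run on the model problem:

import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-10, max_iter=None):
    """Solve Ax = b for SPD A, mirroring the lines of the algorithm above."""
    x = x0.astype(float)
    r = b - A @ x                          # r_0
    p = r.copy()                           # p_0 = r_0
    if max_iter is None:
        max_iter = len(b)                  # at most n iterations in exact arithmetic
    for _ in range(max_iter):
        if np.linalg.norm(r) < tol:
            break
        Ap = A @ p
        alpha = (p @ r) / (p @ Ap)         # step length
        x = x + alpha * p                  # solution update
        r_new = r - alpha * Ap             # residual update
        beta = -(r_new @ r_new) / (r @ r)  # Gram-Schmidt coefficient (slide's sign convention)
        p = r_new - beta * p               # same as the familiar p = r_new + (r_new.r_new / r.r) * p
        r = r_new
    return x

A = np.array([[5.0, -3.0],
              [-3.0, 5.0]])
b = np.array([4.0, 4.0])
print(conjugate_gradient(A, b, np.array([1.0 / 3.0, 1.0])))   # -> [2, 2] in two iterations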

The Conjugate Gradient Method on the model problem:
\[
\begin{bmatrix} 5 & -3 \\ -3 & 5 \end{bmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} =
\begin{pmatrix} 4 \\ 4 \end{pmatrix}
\]
[Figure: starting from $x_0$, the CG iterates reach $x^* = [2\ 2]^T$ in two steps.]

Lecture Objectives (recap). Describe when CG can be used to solve $Ax = b$: $A$ must be symmetric positive-definite. Relate CG to the method of conjugate directions: CG is a method of conjugate directions with the choice $d_i = r_i$, which simplifies Gram-Schmidt conjugation. Describe what CG does geometrically: it performs the method of orthogonal directions in a transformed space where the contours of the quadratic form are aligned with the coordinate axes. Explain each line in the CG algorithm.

References: Saad, Y., Iterative Methods for Sparse Linear Systems, second edition. Shewchuk, J. R., An Introduction to the Conjugate Gradient Method Without the Agonizing Pain.