Conjugate Gradients I: Setup
CS 205A: Mathematical Methods for Robotics, Vision, and Graphics
Justin Solomon

Time for Gaussian Elimination
$A \in \mathbb{R}^{n \times n} \implies O(n^3)$

Common Case
"Easy to apply, hard to invert."
- Sparse matrices
- Special structure

New Philosophy
Iteratively improve approximation rather than solve in closed form.

For Today
$A\vec x = \vec b$
- Square
- Symmetric
- Positive definite

Variational Viewpoint
$A\vec x = \vec b \iff \min_{\vec x} \left[ \frac{1}{2}\vec x^\top A\vec x - \vec b^\top\vec x + c \right]$

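To see why the two problems coincide, here is a short derivation. It assumes $A$ is symmetric positive definite, as stated on the "For Today" slide, and writes $f$ for the bracketed objective, matching the name used on later slides.

```latex
\begin{align*}
f(\vec x) &= \tfrac12 \vec x^\top A \vec x - \vec b^\top \vec x + c \\
\nabla f(\vec x) &= \tfrac12 (A + A^\top)\vec x - \vec b = A\vec x - \vec b \\
\nabla f(\vec x) = \vec 0 &\iff A\vec x = \vec b
\end{align*}
```

The Hessian of $f$ is $A$, which is positive definite, so this critical point is the unique minimizer.
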
Gradient Descent Strategy
1. Compute search direction $\vec d_k \equiv -\nabla f(\vec x_{k-1}) = \vec b - A\vec x_{k-1}$
2. Do line search to find $\vec x_k \equiv \vec x_{k-1} + \alpha_k \vec d_k$.

Line Search Along $\vec d$ from $\vec x$
$\min_\alpha g(\alpha) \equiv f(\vec x + \alpha\vec d)$
$\alpha = \dfrac{\vec d^\top(\vec b - A\vec x)}{\vec d^\top A\vec d}$

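The closed-form step size follows from setting the derivative of the one-dimensional function $g$ to zero. A sketch of the steps, assuming $f(\vec x) = \frac12\vec x^\top A\vec x - \vec b^\top\vec x + c$ with symmetric $A$:

```latex
\begin{align*}
g(\alpha) &= \tfrac12 (\vec x + \alpha\vec d)^\top A (\vec x + \alpha\vec d)
             - \vec b^\top(\vec x + \alpha\vec d) + c \\
g'(\alpha) &= \vec d^\top A\vec x + \alpha\,\vec d^\top A\vec d - \vec d^\top\vec b \\
g'(\alpha) = 0 &\implies \alpha = \frac{\vec d^\top(\vec b - A\vec x)}{\vec d^\top A\vec d}
\end{align*}
```

Since $g''(\alpha) = \vec d^\top A\vec d > 0$ for $\vec d \neq \vec 0$ when $A$ is positive definite, this stationary point is the minimum along the line.
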
Gradient Descent with Closed-Form Line Search
$\vec d_k = \vec b - A\vec x_{k-1}$
$\alpha_k = \dfrac{\vec d_k^\top \vec d_k}{\vec d_k^\top A\vec d_k}$
$\vec x_k = \vec x_{k-1} + \alpha_k \vec d_k$

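The three update rules on this slide translate almost directly into code. The following is a minimal pure-Python sketch, not the course's reference implementation; the helper names (`matvec`, `dot`, `gradient_descent`), the stopping tolerance, and the small 2x2 test system are all illustrative assumptions.

```python
def matvec(A, x):
    """Multiply a dense matrix (given as a list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    """Standard dot product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def gradient_descent(A, b, x0, tol=1e-10, max_iter=10_000):
    """Minimize f(x) = 1/2 x^T A x - b^T x for symmetric positive definite A.

    Returns the final iterate and the number of line searches performed.
    The tolerance and iteration cap are assumptions made for this sketch.
    """
    x = list(x0)
    for k in range(1, max_iter + 1):
        # d_k = b - A x_{k-1}: the negative gradient, which is also the residual.
        d = [bi - ai for bi, ai in zip(b, matvec(A, x))]
        dd = dot(d, d)
        if dd ** 0.5 < tol:
            return x, k - 1
        # alpha_k = (d_k . d_k) / (d_k . A d_k): closed-form line search.
        alpha = dd / dot(d, matvec(A, d))
        # x_k = x_{k-1} + alpha_k d_k
        x = [xi + alpha * di for xi, di in zip(x, d)]
    return x, max_iter


# Hypothetical test system; its exact solution is x* = (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x, iters = gradient_descent(A, b, [0.0, 0.0])
```

Each iteration costs two matrix-vector products here; the only operation that touches $A$ is `matvec`, which is what makes the approach attractive when $A$ is easy to apply but hard to invert.
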
Convergence
See book.
$\dfrac{f(\vec x_k) - f(\vec x^*)}{f(\vec x_{k-1}) - f(\vec x^*)} \le 1 - \dfrac{1}{\operatorname{cond} A}$
Conclusions:
- Conditioning affects speed and quality
- Unconditional convergence ($\operatorname{cond} A \ge 1$)

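The bound can be checked numerically on a small example. The sketch below assumes a diagonal matrix $A = \operatorname{diag}(1, 10)$, chosen only because its condition number (10) can be read off without an eigenvalue solver; the right-hand side and starting point are likewise hypothetical. It records the per-step ratio of objective gaps and compares each one against $1 - 1/\operatorname{cond} A$.

```python
def energy_gap(diag, x, x_star):
    """f(x) - f(x*) = 1/2 (x - x*)^T A (x - x*) for a diagonal matrix A."""
    return 0.5 * sum(a * (xi - si) ** 2 for a, xi, si in zip(diag, x, x_star))


diag = [1.0, 10.0]   # A = diag(1, 10), so cond A = 10 (illustrative choice)
b = [1.0, 10.0]      # chosen so that the solution is x* = (1, 1)
x_star = [bi / a for bi, a in zip(b, diag)]
bound = 1.0 - 1.0 / (max(diag) / min(diag))   # 1 - 1 / cond A

x = [0.0, 0.0]
ratios = []
for _ in range(5):
    gap_before = energy_gap(diag, x, x_star)
    # Gradient descent step with closed-form line search.
    d = [bi - a * xi for bi, a, xi in zip(b, diag, x)]
    alpha = sum(di * di for di in d) / sum(a * di * di for a, di in zip(diag, d))
    x = [xi + alpha * di for xi, di in zip(x, d)]
    # Ratio of objective gaps after and before this step.
    ratios.append(energy_gap(diag, x, x_star) / gap_before)
```

Every entry of `ratios` should sit below `bound`. Making the diagonal entries more disparate pushes the bound toward 1, which is the "conditioning affects speed" conclusion in concrete form.
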
Visualization
[Figure]

Can We Do Better?
- Can iterate forever: Should stop after $O(n)$ iterations!
- Lots of repeated work when poorly conditioned

Observation
$f(\vec x) = \frac{1}{2}(\vec x - \vec x^*)^\top A(\vec x - \vec x^*) + \text{const.}$
$A = LL^\top \implies f(\vec x) = \frac{1}{2}\|L^\top(\vec x - \vec x^*)\|_2^2 + \text{const.}$

Substitution
$\vec y \equiv L^\top\vec x, \quad \vec y^* \equiv L^\top\vec x^* \implies \bar f(\vec y) = \|\vec y - \vec y^*\|_2^2$
Proposition
Suppose $\{\vec w_1, \ldots, \vec w_n\}$ are orthogonal in $\mathbb{R}^n$. Then, $\bar f$ is minimized in at most $n$ steps by line searching in direction $\vec w_1$, then direction $\vec w_2$, and so on.

Visualization
[Figure]
Any two orthogonal directions suffice!

Undoing Change of Coordinates
Line search on $\bar f$ along $\vec w$ is the same as line search on $f$ along $(L^\top)^{-1}\vec w$.
$0 = \vec w_i \cdot \vec w_j = (L^\top\vec v_i)^\top(L^\top\vec v_j) = \vec v_i^\top(LL^\top)\vec v_j = \vec v_i^\top A\vec v_j$

Conjugate Directions
A-Conjugate Vectors
Two vectors $\vec v$ and $\vec w$ are A-conjugate if $\vec v^\top A\vec w = 0$.
Corollary
Suppose $\{\vec v_1, \ldots, \vec v_n\}$ are A-conjugate. Then, $f$ is minimized in at most $n$ steps by line searching in direction $\vec v_1$, then direction $\vec v_2$, and so on.

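The corollary can be seen at work on a 2x2 system. In the sketch below, the matrix, right-hand side, and the hand-picked pair of directions are illustrative assumptions; the directions were chosen so that $\vec v_1^\top A\vec v_2 = 0$. Two closed-form line searches then land on the solution.

```python
def matvec(A, x):
    """Multiply a dense matrix (given as a list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    """Standard dot product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def line_search_along(A, b, x, directions):
    """Perform one exact line search on f along each direction, in order."""
    for v in directions:
        r = [bi - ai for bi, ai in zip(b, matvec(A, x))]   # b - A x
        alpha = dot(v, r) / dot(v, matvec(A, v))           # closed-form step
        x = [xi + alpha * vi for xi, vi in zip(x, v)]
    return x


# Hypothetical test system; its exact solution is x* = (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]

# Hand-picked A-conjugate pair: v1^T A v2 = 4*1 + 1*(-4) = 0.
v1, v2 = [1.0, 0.0], [1.0, -4.0]
conjugacy = dot(v1, matvec(A, v2))

x = line_search_along(A, b, [0.0, 0.0], [v1, v2])
```

Note that `v1` and `v2` are not orthogonal in the usual sense (their dot product is 1); only A-conjugacy matters for termination in $n$ steps.
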
High-Level Ideas So Far
- Steepest descent may not be fastest descent (surprising!)
- Two inner products: $\vec v \cdot \vec w$ and $\langle \vec v, \vec w \rangle_A \equiv \vec v^\top A\vec w$

New Problem
Find $n$ A-conjugate directions.

Gram-Schmidt?
- Potentially unstable
- Storage increases with each iteration

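To make the storage concern concrete, here is a sketch of classical Gram-Schmidt carried out in the $A$ inner product $\langle \vec v, \vec w \rangle_A = \vec v^\top A\vec w$. The function name `a_conjugate_gram_schmidt`, the 3x3 matrix, and the choice of the standard basis as input are assumptions made for illustration.

```python
def matvec(A, x):
    """Multiply a dense matrix (given as a list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    """Standard dot product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def a_conjugate_gram_schmidt(A, vectors):
    """Turn linearly independent vectors into mutually A-conjugate ones."""
    conj = []
    for u in vectors:
        v = list(u)
        # Every earlier direction is needed here, so all must stay in memory.
        for p in conj:
            coeff = dot(u, matvec(A, p)) / dot(p, matvec(A, p))
            v = [vi - coeff * pi for vi, pi in zip(v, p)]
        conj.append(v)
    return conj


# Hypothetical symmetric positive definite matrix.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
directions = a_conjugate_gram_schmidt(A, basis)

# Largest |v_i^T A v_j| over distinct pairs; close to zero if conjugate.
worst = max(
    abs(dot(directions[i], matvec(A, directions[j])))
    for i in range(3) for j in range(3) if i != j
)
```

The inner loop over `conj` grows by one direction per step, which is the storage growth noted on the slide; the subtraction of projections is also where round-off can accumulate for larger or poorly conditioned problems.
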
Another Clue
$\vec r_k \equiv \vec b - A\vec x_k$
$\vec r_{k+1} = \vec r_k - \alpha_{k+1}A\vec v_{k+1}$
Proposition
When performing gradient descent on $f$, $\operatorname{span}\{\vec r_0, \ldots, \vec r_k\} = \operatorname{span}\{\vec r_0, A\vec r_0, \ldots, A^k\vec r_0\}$.
Krylov space?!

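The residual recurrence is what ties gradient descent to powers of $A$: with $\vec v_{k+1} = \vec r_k$, each new residual is the old one minus a multiple of $A$ times it. The sketch below, on a hypothetical 3x3 system, tracks the residual through the recurrence and compares it with the residual computed directly from the definition.

```python
def matvec(A, x):
    """Multiply a dense matrix (given as a list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    """Standard dot product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


# Hypothetical symmetric positive definite system.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]

x = [0.0, 0.0, 0.0]
r = [bi - ai for bi, ai in zip(b, matvec(A, x))]   # r_0 = b - A x_0
max_gap = 0.0
for _ in range(5):
    v = r                                   # gradient descent: v_{k+1} = r_k
    Av = matvec(A, v)
    alpha = dot(v, r) / dot(v, Av)          # closed-form line search
    x = [xi + alpha * vi for xi, vi in zip(x, v)]
    # Recurrence: r_{k+1} = r_k - alpha_{k+1} A v_{k+1}
    r = [ri - alpha * avi for ri, avi in zip(r, Av)]
    # Definition: r_{k+1} = b - A x_{k+1}
    direct = [bi - ai for bi, ai in zip(b, matvec(A, x))]
    max_gap = max(max_gap, max(abs(ri - di) for ri, di in zip(r, direct)))
```

Since $\vec r_1 = \vec r_0 - \alpha_1 A\vec r_0$, the first new residual already lies in $\operatorname{span}\{\vec r_0, A\vec r_0\}$; repeating the argument gives the span identity in the proposition.
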
Gradient Descent: Issue
$\vec x_k - \vec x_0 \neq \arg\min_{\vec v \in \operatorname{span}\{\vec r_0, A\vec r_0, \ldots, A^{k-1}\vec r_0\}} f(\vec x_0 + \vec v)$
But if this did hold... Convergence in $n$ steps!