Solutions and Notes to Selected Problems In: Numerical Optimization by Jorge Nocedal and Stephen J. Wright.


John L. Weatherwax

July 7,

Chapter 5 (Conjugate Gradient Methods)

Notes on the Text

Notes on the linear conjugate gradient method

The objective function we seek to minimize is given by
$$\phi(x) = \frac{1}{2} x^T A x - b^T x. \qquad (1)$$
This is minimized by taking a step from the current best guess at the minimum, $x_k$, in the conjugate direction $p_k$, to get a new best guess
$$x_{k+1} = x_k + \alpha p_k. \qquad (2)$$
To find the value of $\alpha$ to use in the step given in Equation 2, we select the value of $\alpha$ that minimizes $\phi(\alpha) \equiv \phi(x_k + \alpha p_k)$. To find this value we take the derivative of this expression with respect to $\alpha$, set the resulting expression equal to zero, and solve for $\alpha$. Since $\phi(x_k + \alpha p_k)$ can be written as
$$\begin{aligned}
\phi(\alpha) &\equiv \frac{1}{2}(x_k + \alpha p_k)^T A (x_k + \alpha p_k) - b^T (x_k + \alpha p_k) \\
&= \frac{1}{2} x_k^T A x_k + \alpha x_k^T A p_k + \frac{1}{2}\alpha^2 p_k^T A p_k - b^T x_k - \alpha b^T p_k \\
&= \phi(x_k) + \frac{1}{2}\alpha^2 p_k^T A p_k + \alpha (p_k^T A x_k - p_k^T b) \\
&= \phi(x_k) + \frac{1}{2}\alpha^2 p_k^T A p_k + \alpha r_k^T p_k,
\end{aligned}$$
setting the derivative of this expression with respect to $\alpha$ equal to zero gives
$$p_k^T A p_k \, \alpha + r_k^T p_k = 0.$$
Solving for $\alpha$ gives the conjugate direction stepsize
$$\alpha = -\frac{r_k^T p_k}{p_k^T A p_k}, \qquad (3)$$
which is equation 5.6 in the book. Notationally we can add a subscript $k$ to the variable $\alpha$, as in $\alpha_k$, to denote that this is the step size taken in moving from $x_k$ to $x_{k+1}$.

The residual $r$ is defined as $r = Ax - b$, and to get the $(k+1)$st minimization estimate $x_{k+1}$ from $x_k$ we use Equation 2. If we premultiply that expression by $A$ and subtract $b$ from both sides we get
$$A x_{k+1} - b = A x_k - b + \alpha_k A p_k,$$
or, in terms of the residuals, the residual update equation
$$r_{k+1} = r_k + \alpha_k A p_k, \qquad (4)$$
which is the book's equation 5.9.
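To make Equations 2-4 concrete, here is a minimal MATLAB sketch (not part of the original text) that takes one exact line-search step on a small synthetic quadratic; the matrix, right-hand side, and search direction are arbitrary choices used only for illustration.

```matlab
% One exact line-search step: checks Equation 3 (stepsize), Equation 4
% (residual update), and that the new residual is orthogonal to p_k.
n = 5;
B = randn(n);  A = B'*B + n*eye(n);     % an arbitrary SPD matrix (illustrative)
b = randn(n,1);
phi = @(z) 0.5*z'*A*z - b'*z;           % Equation 1
x = zeros(n,1);                         % current iterate x_k
r = A*x - b;                            % residual r_k = A x_k - b
p = -r;                                 % any descent direction; here -r_k
alpha = -(r'*p)/(p'*A*p);               % Equation 3
x_new = x + alpha*p;                    % Equation 2
r_new = r + alpha*A*p;                  % Equation 4
fprintf('|r_{k+1} - (A x_{k+1} - b)| = %g\n', norm(r_new - (A*x_new - b)));
fprintf('phi(x_k) - phi(x_{k+1})     = %g\n', phi(x) - phi(x_new));
fprintf('r_{k+1}^T p_k               = %g\n', r_new'*p);   % ~0 by exact line search
```

The last printed quantity, $r_{k+1}^T p_k \approx 0$, is exactly the fact verified analytically in the next subsection.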

The initial induction step in the expanding subspace minimization theorem

I found it a bit hard to verify the truth of the initial inductive statement used in the proof of the expanding subspace minimization theorem presented in the book. In that proof the initial step requires that one verify $r_1^T p_0 = 0$. This can be shown as follows. The first residual $r_1$ is given by
$$r_1 = \nabla \phi(x_0 + \alpha_0 p_0) = A(x_0 + \alpha_0 p_0) - b.$$
Thus the inner product of $r_1$ with $p_0$ gives
$$r_1^T p_0 = (A x_0 + \alpha_0 A p_0 - b)^T p_0 = x_0^T A p_0 + \alpha_0 p_0^T A p_0 - b^T p_0.$$
If we put in the value $\alpha_0 = -\frac{r_0^T p_0}{p_0^T A p_0}$ the above expression becomes
$$x_0^T A p_0 - \frac{r_0^T p_0}{p_0^T A p_0}\, p_0^T A p_0 - b^T p_0 = (x_0^T A - b^T) p_0 - r_0^T p_0 = r_0^T p_0 - r_0^T p_0 = 0,$$
as we were to show.

Notes on the basic properties of the conjugate gradient method

We want to pick a value for $\beta_k$ such that the new direction $p_k$, given by
$$p_k = -r_k + \beta_k p_{k-1},$$
is $A$-conjugate to the old direction $p_{k-1}$. To enforce this conjugacy, multiply this expression on the left by $p_{k-1}^T A$ to get
$$p_{k-1}^T A p_k = -p_{k-1}^T A r_k + \beta_k p_{k-1}^T A p_{k-1}.$$
If we set the left-hand side of this expression equal to zero and solve for $\beta_k$ we get
$$\beta_k = \frac{p_{k-1}^T A r_k}{p_{k-1}^T A p_{k-1}}. \qquad (5)$$
With this choice the new direction $p_k$ is $A$-conjugate to the old direction $p_{k-1}$.

Notes on the preliminary version of the conjugate gradient method

The stepsize in the conjugate direction is given by Equation 3. If we use the conjugate direction update equation
$$p_k = -r_k + \beta_k p_{k-1}, \qquad (6)$$

and the residual/prior-conjugate-direction orthogonality condition
$$r_k^T p_j = 0 \quad \text{for } 0 \le j < k, \qquad (7)$$
then multiplying Equation 6 on the left by $r_k^T$ gives
$$r_k^T p_k = -r_k^T r_k + \beta_k r_k^T p_{k-1} = -r_k^T r_k.$$
Using this fact in Equation 3 we have an alternative expression for $\alpha_k$, namely
$$\alpha_k = \frac{r_k^T r_k}{p_k^T A p_k}. \qquad (8)$$

In the preliminary conjugate gradient algorithm the stepsize $\beta_{k+1}$ in the conjugate direction is given by Equation 5. Using the residual update equation $r_{k+1} = r_k + \alpha_k A p_k$ to replace $A p_k$ in the expression for $\beta_{k+1}$ gives
$$\beta_{k+1} = \frac{\frac{1}{\alpha_k}\, r_{k+1}^T (r_{k+1} - r_k)}{\frac{1}{\alpha_k}\, p_k^T (r_{k+1} - r_k)} = \frac{r_{k+1}^T r_{k+1}}{p_k^T (r_{k+1} - r_k)},$$
since $r_{k+1}^T r_k = 0$ by residual-residual orthogonality,
$$r_k^T r_i = 0 \quad \text{for } i = 0, 1, \ldots, k-1. \qquad (9)$$
Using the conjugate direction update equation $p_k = -r_k + \beta_k p_{k-1}$ in the denominator above, the denominator becomes $(-r_k + \beta_k p_{k-1})^T (r_{k+1} - r_k)$. Using residual-residual orthogonality (Equation 9) together with the residual/prior-conjugate-direction orthogonality of Equation 7, i.e. $r_k^T p_i = 0$ for $i = 0, 1, \ldots, k-1$, every term but $r_k^T r_k$ vanishes, and we have
$$\beta_{k+1} = \frac{r_{k+1}^T r_{k+1}}{r_k^T r_k}, \qquad (10)$$
as we wanted to show.
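Equations 2, 4, 6, 8 and 10 together give the standard linear CG iteration (the book's Algorithm 5.2). The following is a minimal MATLAB sketch of that iteration, included here only to make the update formulas concrete; the function name and stopping test are my own choices, not from the text.

```matlab
function x = cg_linear(A, b, x0, tol)
% Minimal linear conjugate gradient iteration (a sketch of Algorithm 5.2).
% A must be symmetric positive definite.
  x = x0;
  r = A*x - b;                       % r_0 = A x_0 - b
  p = -r;                            % p_0 = -r_0
  while norm(r) > tol
    Ap    = A*p;
    alpha = (r'*r)/(p'*Ap);          % Equation 8
    x     = x + alpha*p;             % Equation 2
    r_new = r + alpha*Ap;            % Equation 4
    beta  = (r_new'*r_new)/(r'*r);   % Equation 10
    p     = -r_new + beta*p;         % Equation 6
    r     = r_new;
  end
end
```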

Notes on the rate of convergence of the conjugate gradient method

To study the convergence of the conjugate gradient method we first argue that the minimization problem we originally posed, that of minimizing $\phi(x) = \frac{1}{2} x^T A x - b^T x$ over $x$, is equivalent to the problem of minimizing the squared norm $\|x - x^*\|_A^2$. If we denote the minimizer of $\phi(x)$ by $x^*$, so that $x^*$ solves $A x^* = b$, then the minimum value of $\phi$ is
$$\phi(x^*) = \frac{1}{2} x^{*T} A x^* - b^T x^* = \frac{1}{2} x^{*T} b - b^T x^* = -\frac{1}{2} b^T x^*.$$
To show that these two minimization problems are equivalent consider the expression $\frac{1}{2}\|x - x^*\|_A^2$. We have
$$\frac{1}{2}\|x - x^*\|_A^2 = \frac{1}{2}(x - x^*)^T A (x - x^*) = \frac{1}{2} x^T A x - x^T A x^* + \frac{1}{2} x^{*T} A x^*.$$
Since $A x^* = b$ the above becomes
$$\frac{1}{2}\|x - x^*\|_A^2 = \frac{1}{2} x^T A x - x^T b + \frac{1}{2} x^{*T} A x^* = \phi(x) + \frac{1}{2} x^{*T} b = \phi(x) + \frac{1}{2} b^T x^* = \phi(x) - \phi(x^*),$$
verifying the book's equation relating the two objectives.

To study convergence of the conjugate gradient method we will decompose the difference between our initial guess at the solution, denoted $x_0$, and the true solution, denoted $x^*$, in terms of the (orthonormal) eigenvectors $v_i$ of $A$ as
$$x_0 - x^* = \sum_{i=1}^n \xi_i v_i.$$
When we do this, the difference between the $(k+1)$st iterate and $x^*$ is given by
$$x_{k+1} - x^* = \sum_{i=1}^n \big(1 + \lambda_i P_k^*(\lambda_i)\big)\, \xi_i v_i.$$
In terms of the $A$-norm this distance is
$$\begin{aligned}
\|x_{k+1} - x^*\|_A^2
&= \left(\sum_{i=1}^n \big(1 + \lambda_i P_k^*(\lambda_i)\big) \xi_i v_i\right)^T A \left(\sum_{i=1}^n \big(1 + \lambda_i P_k^*(\lambda_i)\big) \xi_i v_i\right) \\
&= \sum_{i=1}^n \sum_{j=1}^n \xi_i \big(1 + \lambda_i P_k^*(\lambda_i)\big)\, \xi_j \big(1 + \lambda_j P_k^*(\lambda_j)\big)\, v_i^T A v_j.
\end{aligned}$$
Since the $v_j$ are orthonormal eigenvectors of $A$ we have $A v_j = \lambda_j v_j$ and $v_i^T v_j = 0$ for $i \ne j$, so the above becomes
$$\sum_{i=1}^n \lambda_i\, \xi_i^2 \big(1 + \lambda_i P_k^*(\lambda_i)\big)^2,$$
as claimed by the book.
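As a quick sanity check of the identity $\phi(x) - \phi(x^*) = \frac{1}{2}\|x - x^*\|_A^2$, here is a small MATLAB snippet (not from the text); the matrix and test point are arbitrary and chosen only for illustration.

```matlab
% Numerical check that phi(x) - phi(x*) = (1/2) ||x - x*||_A^2.
n = 6;
B = randn(n);  A = B'*B + n*eye(n);      % an arbitrary SPD matrix (illustrative)
b = randn(n,1);
xstar = A \ b;                           % the true minimizer solves A x* = b
phi   = @(z) 0.5*z'*A*z - b'*z;
x     = randn(n,1);                      % an arbitrary test point
lhs   = phi(x) - phi(xstar);
rhs   = 0.5*(x - xstar)'*A*(x - xstar);  % (1/2) ||x - x*||_A^2
fprintf('difference = %g\n', abs(lhs - rhs));   % should be near machine precision
```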

Notes on the Polak-Ribiere Method

In this subsection of these notes we derive an alternative expression for $\beta_{k+1}$, used in the conjugate-direction update Equation 6, that is valid for nonlinear optimization problems. Since our state update equation is $x_{k+1} = x_k + \alpha_k p_k$, the gradient of $f(x)$ at the new point $x_{k+1}$ can be computed to second order using Taylor's theorem as
$$\nabla f_{k+1} = \nabla f_k + \alpha_k \bar{G}_k p_k,$$
where $\bar{G}_k$ is the average Hessian over the line segment $[x_k, x_{k+1}]$. Requiring that the new search direction $p_{k+1}$, derived from the standard conjugate direction update equation
$$p_{k+1} = -\nabla f_{k+1} + \beta_{k+1} p_k,$$
be conjugate with respect to the average Hessian $\bar{G}_k$ means that
$$p_{k+1}^T \bar{G}_k p_k = 0,$$
or, using the expression for $p_{k+1}$,
$$-\nabla f_{k+1}^T \bar{G}_k p_k + \beta_{k+1}\, p_k^T \bar{G}_k p_k = 0.$$
This last expression requires that
$$\beta_{k+1} = \frac{\nabla f_{k+1}^T \bar{G}_k p_k}{p_k^T \bar{G}_k p_k}.$$
Recognizing that $\bar{G}_k p_k = \frac{\nabla f_{k+1} - \nabla f_k}{\alpha_k}$, the expression for $\beta_{k+1}$ above becomes
$$\beta_{k+1} = \frac{\nabla f_{k+1}^T (\nabla f_{k+1} - \nabla f_k)}{(\nabla f_{k+1} - \nabla f_k)^T p_k}, \qquad (11)$$
which is the Hestenes-Stiefel formula given in the book.

Problem Solutions

Problem 5.1 (the conjugate gradient algorithm on Hilbert matrices)

See the MATLAB code prob 1.m, where we call the MATLAB routine cgsolve.m to solve $Ax = b$ when $A$ is the $n \times n$ Hilbert matrix (generated in MATLAB using the built-in function hilb.m) and $b$ is a vector of all ones, starting with an initial guess at the minimum of $x_0 = 0$. We do this for the values of $n$ suggested in the text and then plot the number of CG iterations needed to drive the relative residual below a fixed tolerance. The result of this calculation is presented in Figure 1. The definition of convergence here is taken to be when $\frac{\|Ax - b\|}{\|b\|}$ falls below that tolerance. We see that the number of iterations grows relatively quickly with the dimension of the matrix $A$. A self-contained sketch of a similar experiment appears below.
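The author's prob 1.m and cgsolve.m are not reproduced in this text. The following is a rough, self-contained MATLAB sketch of the same experiment using MATLAB's built-in pcg routine in place of cgsolve.m; the list of matrix sizes and the 1e-6 tolerance are assumptions on my part, since the exact values are truncated in the text above.

```matlab
% Sketch of the Problem 5.1 experiment: CG iteration counts on Hilbert matrices.
% NOTE: the sizes and the 1e-6 tolerance are assumed, not taken from the text.
ns    = [5, 8, 12, 20];
iters = zeros(size(ns));
for i = 1:numel(ns)
  n = ns(i);
  A = hilb(n);                  % the (ill-conditioned) n-by-n Hilbert matrix
  b = ones(n, 1);
  % pcg starts from x0 = 0 by default and stops when norm(A*x-b)/norm(b) < tol.
  [~, flag, relres, it] = pcg(A, b, 1e-6, 10*n);
  iters(i) = it;
  fprintf('n = %2d: iterations = %3d, relres = %g, flag = %d\n', n, it, relres, flag);
end
plot(ns, iters, 'o-');
xlabel('size of Hilbert matrix (n)');
ylabel('CG iterations to reach the tolerance');
```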

Figure 1: The number of iterations needed for convergence of the CG algorithm when applied to the (classically ill-conditioned) Hilbert matrix, plotted against the size of the Hilbert matrix (n); the vertical axis is the number of iterations required to reduce the relative residual below the tolerance.

Problem 5.2 (if the $p_i$ are conjugate w.r.t. $A$ then the $p_i$ are linearly independent)

The book's equation 5.4 is the statement that $p_i^T A p_j = 0$ for all $i \ne j$. To show that the vectors $p_i$ are linearly independent we begin by assuming that they are not and show that this leads to a contradiction (a small numerical illustration appears after Problem 5.3 below). That the $p_i$ are not linearly independent means that we can find constants $\alpha_i$ (not all zero) such that
$$\sum_i \alpha_i p_i = 0.$$
If we premultiply this summation by the matrix $A$ we get
$$\sum_i \alpha_i A p_i = 0.$$
Next premultiply by $p_j^T$ to get
$$\alpha_j\, p_j^T A p_j = 0,$$
since $p_j^T A p_i = 0$ for all $i \ne j$. Since $A$ is positive definite the term $p_j^T A p_j > 0$, so we can divide by it and conclude that $\alpha_j = 0$. Since this is true for each value of $j$, every $\alpha_j$ is zero, contradicting the assumption that the $p_i$ are not linearly independent. Thus they must be linearly independent.

Problem 5.3 (verification of the conjugate direction stepsize $\alpha_k$)

See the discussion around Equation 3 in these notes where the requested expression is derived.
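As referenced in Problem 5.2 above, here is a small numerical illustration (not from the text): build a set of A-conjugate directions by Gram-Schmidt in the A inner product and confirm that they are linearly independent. The SPD matrix and the starting vectors are arbitrary.

```matlab
% Illustration of Problem 5.2: A-conjugate directions are linearly independent.
n = 6;
B = randn(n);  A = B'*B + n*eye(n);      % an arbitrary SPD matrix (illustrative)
V = randn(n);                            % random starting vectors (columns)
P = zeros(n);                            % columns will be A-conjugate directions
for k = 1:n
  p = V(:,k);
  for j = 1:k-1                          % A-orthogonalize against the earlier p_j
    p = p - ((P(:,j)'*A*V(:,k))/(P(:,j)'*A*P(:,j))) * P(:,j);
  end
  P(:,k) = p;
end
offdiag = P'*A*P - diag(diag(P'*A*P));
fprintf('max |p_i^T A p_j| for i ~= j : %g\n', max(abs(offdiag(:))));
fprintf('rank(P) = %d (should equal n = %d)\n', rank(P), n);
```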

Problem 5.4 (strong convexity when we step along the conjugate directions $p_k$)

I think this problem is supposed to state that $f(x)$ is a strongly convex quadratic, which means that $f(x)$ has the functional form
$$f(x) = \frac{1}{2} x^T A x - b^T x.$$
In that case, when we take $x$ of the form
$$x = x_0 + \sigma_0 p_0 + \sigma_1 p_1 + \cdots + \sigma_{k-1} p_{k-1},$$
we can write $f$ as a strongly convex function of the variables $\sigma = (\sigma_0, \sigma_1, \ldots, \sigma_{k-1})^T$.

Problem 5.5 (the conjugate directions $p_i$ span the Krylov subspace)

We want to show that the book's equations 5.16 and 5.17 hold for $k = 1$ (a small numerical check of this appears after Problem 5.6 below). The book's equation 5.16 is
$$\mathrm{span}\{r_0, r_1\} = \mathrm{span}\{r_0, A r_0\}.$$
To show this equivalence we need to show that $r_1 \in \mathrm{span}\{r_0, A r_0\}$. Now $r_1$ can be written as $r_1 = A x_1 - b$ with $x_1 = x_0 + \alpha_0 p_0$, so
$$r_1 = A(x_0 + \alpha_0 p_0) - b = A x_0 - b + \alpha_0 A p_0 = r_0 + \alpha_0 A p_0 = r_0 - \alpha_0 A r_0,$$
where we used $r_0 = A x_0 - b$ and $p_0 = -r_0$. This shows that $r_1 \in \mathrm{span}\{r_0, A r_0\}$, and thus $\mathrm{span}\{r_0, r_1\} \subseteq \mathrm{span}\{r_0, A r_0\}$. The other direction, that $\mathrm{span}\{r_0, A r_0\} \subseteq \mathrm{span}\{r_0, r_1\}$, follows by performing the manipulations above in the other direction.

The book's equation 5.17 for $k = 1$ is
$$\mathrm{span}\{p_0, p_1\} = \mathrm{span}\{r_0, A r_0\}.$$
Since $p_0 = -r_0$, to show $\mathrm{span}\{p_0, p_1\} \subseteq \mathrm{span}\{r_0, A r_0\}$ we need to show that $p_1 \in \mathrm{span}\{r_0, A r_0\}$. Since
$$p_1 = -r_1 + \beta_1 p_0 = -(A x_1 - b) - \beta_1 r_0 = -(A(x_0 + \alpha_0 p_0) - b) - \beta_1 r_0 = -r_0 + \alpha_0 A r_0 - \beta_1 r_0,$$
this holds. Again the other direction follows by performing the manipulations above in the other direction.

Problem 5.6 (an alternative form for the conjugate direction stepsize $\beta_{k+1}$)

See the discussion around Equation 10 of these notes where the requested expression is derived.
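As referenced in Problem 5.5 above, here is a small MATLAB sketch (not from the text) that runs one CG step on an arbitrary SPD system and checks numerically that $r_1$ and $p_1$ lie in $\mathrm{span}\{r_0, A r_0\}$.

```matlab
% Numerical illustration of Problem 5.5: after one CG step,
% span{r0, r1} and span{p0, p1} coincide with span{r0, A*r0}.
n = 5;
B = randn(n);  A = B'*B + n*eye(n);     % an arbitrary SPD matrix (illustrative)
b = randn(n,1);
x0 = zeros(n,1);
r0 = A*x0 - b;   p0 = -r0;
alpha0 = (r0'*r0)/(p0'*A*p0);
x1 = x0 + alpha0*p0;
r1 = r0 + alpha0*A*p0;
beta1 = (r1'*r1)/(r0'*r0);
p1 = -r1 + beta1*p0;
K  = [r0, A*r0];                        % basis of the Krylov subspace
fprintf('rank([K r0 r1]) = %d, rank(K) = %d\n', rank([K, r0, r1]), rank(K));
fprintf('rank([K p0 p1]) = %d, rank(K) = %d\n', rank([K, p0, p1]), rank(K));
% Both augmented ranks equal rank(K) = 2, so r1 and p1 lie in span{r0, A*r0}.
```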

Problem 5.7 (the eigensystem of a polynomial expression of a matrix)

Given the eigenvalues $\lambda_i$ and eigenvectors $v_i$ of a matrix $A$, any polynomial expression of $A$, say $P(A)$, has the same eigenvectors with corresponding eigenvalues $P(\lambda_i)$, as can be seen by simply evaluating $P(A) v_i$. This problem is a special case of that result.

Problem 5.9 (deriving the preconditioned CG algorithm)

For this problem we are to derive the preconditioned CG algorithm from the normal CG algorithm. This means that we transform the original problem, that of seeking the minimizer of
$$\phi(x) = \frac{1}{2} x^T A x - b^T x,$$
by transforming the original variable $x$ into a hat variable $\hat{x} = C x$. In this new space the minimization problem above is equivalent to seeking the minimizer of
$$\hat{\phi}(\hat{x}) = \frac{1}{2}(C^{-1}\hat{x})^T A (C^{-1}\hat{x}) - b^T (C^{-1}\hat{x}) = \frac{1}{2}\hat{x}^T (C^{-T} A C^{-1}) \hat{x} - (C^{-T} b)^T \hat{x}.$$
This latter problem we will solve with the CG method, where in the standard CG algorithm we use the matrix $\hat{A}$ and the vector $\hat{b}$ given by
$$\hat{A} = C^{-T} A C^{-1}, \qquad \hat{b} = C^{-T} b.$$
Now, given $\hat{x}_0$ as an initial guess at the minimizer of the hat problem (note that specifying this is equivalent to specifying an initial guess $x_0$ for the minimizer of $\phi(x)$) and following the standard CG algorithm in the hatted variables, we start to derive the preconditioned CG algorithm by setting
$$\hat{r}_0 = C^{-T} A C^{-1} \hat{x}_0 - C^{-T} b, \qquad \hat{p}_0 = -\hat{r}_0, \qquad k = 0.$$
With these initial variables set, we then loop while $\hat{r}_k \ne 0$ and perform the following steps (following Algorithm 5.2):
$$\begin{aligned}
\hat{\alpha}_k &= \frac{\hat{r}_k^T \hat{r}_k}{\hat{p}_k^T C^{-T} A C^{-1} \hat{p}_k} \\
\hat{x}_{k+1} &= \hat{x}_k + \hat{\alpha}_k \hat{p}_k \\
\hat{r}_{k+1} &= \hat{r}_k + \hat{\alpha}_k C^{-T} A C^{-1} \hat{p}_k \\
\hat{\beta}_{k+1} &= \frac{\hat{r}_{k+1}^T \hat{r}_{k+1}}{\hat{r}_k^T \hat{r}_k} \\
\hat{p}_{k+1} &= -\hat{r}_{k+1} + \hat{\beta}_{k+1} \hat{p}_k \\
k &= k + 1.
\end{aligned}$$

After performing these iterations our output will be $\hat{x}$, the minimizer of the quadratic
$$\hat{\phi}(\hat{x}) = \frac{1}{2}\hat{x}^T (C^{-T} A C^{-1}) \hat{x} - (C^{-T} b)^T \hat{x},$$
but we really want to output $x = C^{-1}\hat{x}$. To derive an algorithm that works on $x$ directly, let us write the above algorithm in terms of the unhatted variables $x$ and $r$. Given an initial guess $x_0$ at the minimizer of $\phi(x)$, then $\hat{x}_0 = C x_0$ is the initial guess at the minimizer of $\hat{\phi}(\hat{x})$. Note that
$$\hat{r}_0 = C^{-T} A C^{-1} \hat{x}_0 - C^{-T} b, \quad \text{so} \quad C^T \hat{r}_0 = A x_0 - b = r_0,$$
the residual of the original problem. Thus the residuals transform between hatted and unhatted variables as
$$\hat{r}_k = C^{-T} r_k. \qquad (12)$$
Next note that $\hat{p}_0 = -\hat{r}_0 = -C^{-T} r_0 = C^{-T} p_0$, so the conjugate directions transform between hatted and unhatted variables in the same way, namely
$$\hat{p}_k = C^{-T} p_k. \qquad (13)$$
Using these two substitutions (and recalling that the unknown transforms as $\hat{x}_k = C x_k$), the preconditioned conjugate gradient algorithm becomes, in terms of the unhatted variables,
$$\begin{aligned}
\hat{\alpha}_k &= \frac{r_k^T C^{-1} C^{-T} r_k}{p_k^T C^{-1} C^{-T} A C^{-1} C^{-T} p_k}
             = \frac{r_k^T (C^T C)^{-1} r_k}{p_k^T (C^T C)^{-1} A (C^T C)^{-1} p_k} \qquad (14) \\
x_{k+1} &= x_k + \hat{\alpha}_k C^{-1} \hat{p}_k = x_k + \hat{\alpha}_k C^{-1} C^{-T} p_k = x_k + \hat{\alpha}_k (C^T C)^{-1} p_k \qquad (15) \\
r_{k+1} &= r_k + \hat{\alpha}_k A C^{-1} C^{-T} p_k = r_k + \hat{\alpha}_k A (C^T C)^{-1} p_k \qquad (16) \\
\hat{\beta}_{k+1} &= \frac{r_{k+1}^T C^{-1} C^{-T} r_{k+1}}{r_k^T C^{-1} C^{-T} r_k} = \frac{r_{k+1}^T (C^T C)^{-1} r_{k+1}}{r_k^T (C^T C)^{-1} r_k} \qquad (17) \\
p_{k+1} &= -r_{k+1} + \hat{\beta}_{k+1} p_k \qquad (18) \\
k &= k + 1.
\end{aligned}$$
In each of the expressions above we first made the hat-to-unhat substitution and then simplified the resulting expression. To simplify these expressions further we introduce two new variables. The first variable, $y_k$, is defined by
$$y_k = (C^T C)^{-1} r_k,$$

or equivalently as the solution $y_k$ of the linear system $M y_k = r_k$, where $M = C^T C$. The second variable, $z_k$, is defined similarly as
$$z_k = (C^T C)^{-1} p_k,$$
or the solution of the linear system $M z_k = p_k$. To use these variables, as a first step we multiply Equation 18 by $(C^T C)^{-1}$ on both sides and use the definitions of $z_k$ and $y_k$ to get
$$z_{k+1} = -y_{k+1} + \hat{\beta}_{k+1} z_k.$$
Our algorithm then becomes the following. Given $x_0$, our initial guess at the minimizer of $\phi(x)$, form $r_0 = A x_0 - b$ and solve $M y_0 = r_0$ for $y_0$. Next compute $z_0$ given by
$$z_0 = M^{-1} p_0 = M^{-1}(-r_0) = M^{-1}(-M y_0) = -y_0. \qquad (19)$$
We then set $k = 0$ and iterate the following while $r_k \ne 0$:
$$\begin{aligned}
\hat{\alpha}_k &= \frac{r_k^T y_k}{z_k^T A z_k} \\
x_{k+1} &= x_k + \hat{\alpha}_k z_k \\
r_{k+1} &= r_k + \hat{\alpha}_k A z_k \\
&\text{solve } M y_{k+1} = r_{k+1} \text{ for } y_{k+1} \\
\hat{\beta}_{k+1} &= \frac{r_{k+1}^T y_{k+1}}{r_k^T y_k} \\
z_{k+1} &= -y_{k+1} + \hat{\beta}_{k+1} z_k \\
k &= k + 1.
\end{aligned}$$
Note that this is the same algorithm presented in the book, except that the book denotes the variable $z_k$ by $p_k$, and I think there is an error in the book's initialization of this routine: the book states $p_0 = -r_0$, while I think this expression should be $p_0 = -y_0$ (in their notation); see Equation 19.
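The following is a minimal MATLAB sketch of the preconditioned CG iteration just derived, written in the unhatted variables with the preconditioner supplied as a matrix M; the function name and the use of a Cholesky factorization to solve M y = r are my own choices, not from the text.

```matlab
function x = pcg_sketch(A, b, M, x0, tol)
% Sketch of the preconditioned CG iteration derived in Problem 5.9.
% A and M must be symmetric positive definite; M y = r is solved at each step.
  R = chol(M);                    % factor the preconditioner once: M = R'*R
  solveM = @(r) R \ (R' \ r);
  x = x0;
  r = A*x - b;
  y = solveM(r);                  % M y_0 = r_0
  z = -y;                         % z_0 = -y_0  (Equation 19)
  while norm(r) > tol
    Az    = A*z;
    alpha = (r'*y)/(z'*Az);
    x     = x + alpha*z;
    r_new = r + alpha*Az;
    y_new = solveM(r_new);        % M y_{k+1} = r_{k+1}
    beta  = (r_new'*y_new)/(r'*y);
    z     = -y_new + beta*z;
    r = r_new;  y = y_new;
  end
end
```

With M equal to the identity this reduces to the unpreconditioned iteration sketched earlier.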

Problem 5.10 (deriving the modified residual conjugacy condition)

In the transformed hat problem of Problem 5.9 above we minimize the quadratic
$$\hat{\phi}(\hat{x}) = \frac{1}{2}\hat{x}^T (C^{-T} A C^{-1}) \hat{x} - (C^{-T} b)^T \hat{x},$$
that is, we define
$$\hat{A} \equiv C^{-T} A C^{-1}, \qquad \hat{b} \equiv C^{-T} b,$$
so that the transformed hat problem has a residual $\hat{r}$ given by
$$\hat{r} = \hat{A}\hat{x} - \hat{b} = C^{-T} A C^{-1} C x - C^{-T} b = C^{-T}(A x - b) = C^{-T} r.$$
Thus the orthogonality property of successive residuals, i.e. the book's equation 5.15 applied to the hat problem,
$$\hat{r}_k^T \hat{r}_i = 0 \quad \text{for } i = 0, 1, 2, \ldots, k-1, \qquad (20)$$
becomes, in terms of the original variables of the problem,
$$r_k^T C^{-1} C^{-T} r_i = r_k^T M^{-1} r_i = 0,$$
since $M = C^T C$. This is the modified residual conjugacy condition and is what we were to show.

Problem 5.11 (the expressions for $\beta^{PR}$ and $\beta^{HS}$ reduce to $\beta^{FR}$)

Recall that three expressions suggested for $\beta$ (the conjugate direction stepsize) are
$$\beta_{k+1}^{PR} = \frac{\nabla f_{k+1}^T (\nabla f_{k+1} - \nabla f_k)}{\nabla f_k^T \nabla f_k} \qquad (21)$$
for the Polak-Ribiere formula,
$$\beta_{k+1}^{HS} = \frac{\nabla f_{k+1}^T (\nabla f_{k+1} - \nabla f_k)}{(\nabla f_{k+1} - \nabla f_k)^T p_k} \qquad (22)$$
for the Hestenes-Stiefel formula, and
$$\beta_{k+1}^{FR} = \frac{\nabla f_{k+1}^T \nabla f_{k+1}}{\nabla f_k^T \nabla f_k} \qquad (23)$$
for the Fletcher-Reeves expression. To show that the Polak-Ribiere stepsize $\beta^{PR}$ reduces to the Fletcher-Reeves stepsize $\beta^{FR}$ under the conditions given in this problem, it is sufficient, from the formulas above, to show that $\nabla f_{k+1}^T \nabla f_k = 0$, since the two formulas agree in their other terms. Now when $f(x)$ is a quadratic function
$$f(x) = \frac{1}{2} x^T A x - b^T x + c,$$
for some matrix $A$, vector $b$, and scalar $c$, and $x_{k+1}$ is chosen by an exact line search, then $\nabla f_{k+1} = r_{k+1}$. Thus
$$\nabla f_{k+1}^T \nabla f_k = r_{k+1}^T r_k = 0,$$
by the residual-residual orthogonality of Equation 9 (the book's equation 5.15).

To show that the Hestenes-Stiefel stepsize $\beta^{HS}$ reduces to the Fletcher-Reeves stepsize $\beta^{FR}$ under the same conditions, it is sufficient to show that $(\nabla f_{k+1} - \nabla f_k)^T p_k = \nabla f_k^T \nabla f_k$. Using the results above we have
$$(\nabla f_{k+1} - \nabla f_k)^T p_k = (r_{k+1} - r_k)^T(-r_k + \beta_k p_{k-1}) = -r_{k+1}^T r_k + \beta_k r_{k+1}^T p_{k-1} + r_k^T r_k - \beta_k r_k^T p_{k-1} = r_k^T r_k = \nabla f_k^T \nabla f_k,$$
where we have used the residual/prior-conjugate-direction orthogonality of Equation 7 to show $r_k^T p_{k-1} = 0$ and $r_{k+1}^T p_{k-1} = 0$, together with the residual-residual orthogonality $r_{k+1}^T r_k = 0$.
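A quick numerical confirmation of Problem 5.11 (not from the text): run a few exact-line-search CG steps on an arbitrary strongly convex quadratic and compare the three $\beta$ formulas at each step.

```matlab
% Check that beta_PR, beta_HS and beta_FR coincide for a strongly convex
% quadratic when exact line searches are used (Problem 5.11).
n = 8;
B = randn(n);  A = B'*B + n*eye(n);     % an arbitrary SPD Hessian (illustrative)
b = randn(n,1);
x = randn(n,1);
g = A*x - b;                            % gradient of f(x) = 0.5 x'Ax - b'x
p = -g;                                 % first direction is steepest descent
for k = 1:4
  alpha = -(g'*p)/(p'*A*p);             % exact line search for the quadratic
  x     = x + alpha*p;
  g_new = A*x - b;
  beta_FR = (g_new'*g_new)/(g'*g);
  beta_PR = (g_new'*(g_new - g))/(g'*g);
  beta_HS = (g_new'*(g_new - g))/((g_new - g)'*p);
  fprintf('k=%d: FR=%.6f  PR=%.6f  HS=%.6f\n', k, beta_FR, beta_PR, beta_HS);
  p = -g_new + beta_FR*p;               % any of the three betas gives the same p here
  g = g_new;
end
```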
