TMA 4180 Optimeringsteori THE CONJUGATE GRADIENT METHOD
H. E. Krogstad, IMF

1 INTRODUCTION

This note summarizes the main points in the numerical analysis of the Conjugate Gradient (CG) method. Most of the material is also found in N&W, but the note contains the proof of Eqn. 5.36, which is the main convergence result for CG. The numerical experiments reported in the lecture are also included.

2 DERIVATION

The Conjugate Gradient method is derived for the quadratic test problem

    \min_{x \in \mathbb{R}^n} \phi(x),                                        (1)

where $\phi(x) = \tfrac{1}{2} x^T A x - b^T x$ and $A \in \mathbb{R}^{n \times n}$ is positive definite, $A > 0$. The method was originally considered to be a direct method for linear equations, but its favorable properties as an iterative method were soon realized, and it was later generalized to more general optimization problems.

A key point in the derivation is clever use of the A-scalar product, $\langle x, y \rangle_A = x^T A y$, and the corresponding norm $\|x\|_A = (x^T A x)^{1/2}$ (see the Basic Tools note). We already know that $\nabla\phi(x) = g = Ax - b$ and $\nabla^2\phi = A$, so that

    x^* = \arg\min_{x \in \mathbb{R}^n} \phi(x) = A^{-1} b.                   (2)

The first observation is that a set of $n$ A-orthogonal vectors $p_0, p_1, \ldots, p_{n-1}$, that is,

    p_i^T A p_j = p_i^T A p_i \, \delta_{ij},                                 (3)

is a basis for $\mathbb{R}^n$ (recall your basic linear algebra course for the terminology!). This property can be stated in several equivalent ways:

1. The $n$ vectors $\{p_j\}$ are linearly independent.
2. $\mathrm{span}\{p_0, p_1, \ldots, p_{n-1}\} = \mathbb{R}^n$.
3. Every $x \in \mathbb{R}^n$ can be written uniquely as $x = \sum_{j=0}^{n-1} \sigma_j p_j$.

For the derivation of the CG method, let us assume that we already have a basis $\{p_0, p_1, \ldots, p_{n-1}\}$, that we start at $x_0 = 0$, and let $p_0 = -g_0 = b$ (the starting point is easily adjusted later). The A-orthogonality enables us to write $x^* = \sum_{j=0}^{n-1} \sigma_j p_j$, where

    \sigma_k = \frac{p_k^T A x^*}{p_k^T A p_k} = \frac{p_k^T b}{p_k^T A p_k} = -\frac{p_k^T g_0}{p_k^T A p_k}.      (4)
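A small Matlab sanity check of Eqn. 4 may be illustrative. The sketch below is not from the note: the matrix size is an arbitrary choice, and the eigenvectors of $A$ are used as a convenient A-orthogonal set (they are orthonormal and satisfy $A p_k = \lambda_k p_k$):

    % Sketch: the coefficients (4), computed from b alone, rebuild x* = A\b.
    n = 6;                                % arbitrary size for the check
    R = randn(n);  A = R'*R + n*eye(n);   % a positive definite matrix
    b = randn(n,1);
    [P,~] = eig(A);                       % columns are A-orthogonal directions p_k
    x = zeros(n,1);
    for k = 1:n
        pk    = P(:,k);
        sigma = (pk'*b)/(pk'*A*pk);       % Eqn. (4): x* itself is never used
        x     = x + sigma*pk;
    end
    disp(norm(x - A\b))                   % should be at round-off level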
It is interesting that we can compute the coefficients in the expansion for $x^*$ without knowing $x^*$ itself!

Below, the subspaces

    B_k = \mathrm{span}\{p_0, \ldots, p_{k-1}\}, \quad k = 1, \ldots, n,       (5)

will play an important role. Observe that

    B_1 \subset B_2 \subset \cdots \subset B_n = \mathbb{R}^n.                 (6)

It is a general property of expansions in orthogonal bases that partial sums are the best approximations in the corresponding subspaces spanned by the basis vectors. This means that

    x_k = \sum_{j=0}^{k-1} \sigma_j p_j = \arg\min_{y \in B_k} \|y - x^*\|_A.  (7)

As long as we look around for a good approximation to $x^*$ in $B_k$, $x_k$ is the best choice (in the A-norm)! In the present case, $x_k$ is also best in another sense, namely, it solves the optimization problem

    x_k = \arg\min_{y \in B_k} \phi(y).                                        (8)

Show this yourself by deriving that $\|y - x^*\|_A^2 = 2\phi(y) + b^T A^{-1} b$ (expand the square and use $A x^* = b$).

Equation 8 represents a constrained problem (minimum constrained to $B_k$). The feasible directions $d$ from $x_k$ are all vectors in $B_k$, and since $\nabla\phi(x_k)^T d \ge 0$ for every such $d$, while $-d \in B_k$ whenever $d$ is,

    \nabla\phi(x_k)^T d = g_k^T d = 0 \quad \text{for all } d \in B_k.         (9)

This could equally well be derived from the famous Projection Theorem, since the error $x^* - x_k$ is A-orthogonal to $B_k$:

    0 = (x^* - x_k)^T A d = (b - A x_k)^T d = -g_k^T d \quad \text{for all } d \in B_k.   (10)

The only remaining problem is to actually find $\{p_j\}_{j=0}^{n-1}$, and the standard way would be to do this by a Gram-Schmidt procedure, if we had a set of vectors to start with. If we know $p_0, p_1, \ldots, p_{k-1}$, it is tempting to try

    p_k = -g_k + \beta_k p_{k-1} + \tilde\beta_{k-2} p_{k-2} + \cdots + \tilde\beta_0 p_0,   (11)

since we already know from above that $g_k$ is orthogonal to, and hence linearly independent of, $\{p_0, p_1, \ldots, p_{k-1}\}$. The surprising result is now that it suffices to use

    p_k = -g_k + \beta_k p_{k-1},                                              (12)

and then determine $\beta_k$ so that $p_k^T A p_{k-1} = 0$,

    \beta_k = \frac{p_{k-1}^T A g_k}{p_{k-1}^T A p_{k-1}}.                     (13)

In order to prove that this is really true, we recall that $B_i \subset B_{i+1}$.
The argument uses induction, and the start is easy: Starting with $p_0$, the vector $p_1 = -g_1 + \beta_1 p_0$ can be made A-orthogonal to $p_0$ by choosing $\beta_1$ as in Eqn. 13.

One then needs to verify that $\{p_0, p_1, \ldots, p_{k-1}, p_k\}$ is an A-orthogonal set if the vectors $\{p_0, p_1, \ldots, p_{k-1}\}$ are A-orthogonal. We already know that $p_k^T A p_{k-1} = 0$, but what about $p_k$ and $p_0, p_1, \ldots, p_{k-2}$? Consider $p_k$ and $p_i$ for $i \le k-2$:

    p_i^T A p_k = p_i^T A (-g_k + \beta_k p_{k-1}) = -p_i^T A g_k,             (14)

since we know $p_i^T A p_{k-1} = 0$ by the induction hypothesis. The trick is now to write

    x_{i+1} = x_i + \alpha_i p_i,                                              (15)

multiply by $A$ and subtract $b$ on both sides, so that

    g_{i+1} = g_i + \alpha_i A p_i.                                            (16)

In general, $g_i \in B_{i+1} \subset B_{i+2} \subset \cdots \subset B_k$ and $g_{i+1} \in B_{i+2}$, and then from Eqn. 16, $A p_i = (g_{i+1} - g_i)/\alpha_i \in B_k$. But $g_k$ is orthogonal to $B_k$, so $(A p_i)^T g_k = 0$ for all $i \le k-2$!

Before stating the algorithm, let us consider the modifications when we start at an arbitrary $x_0 \ne 0$. The series expansion for $x^*$ will then be

    x^* = x_0 + \sum_{j=0}^{n-1} \sigma_j p_j,                                 (17)

and exactly as above,

    \sigma_k = \frac{p_k^T A (x^* - x_0)}{p_k^T A p_k} = -\frac{p_k^T (A x_0 - b)}{p_k^T A p_k} = -\frac{p_k^T g_0}{p_k^T A p_k}.    (18)

The CG algorithm may then be written:

    Given x_0
    Define p_0 = -g_0 = -(A x_0 - b)
    for k = 0 : k_max
        alpha_k  = -p_k^T g_k / (p_k^T A p_k)
        x_{k+1}  = x_k + alpha_k p_k
        g_{k+1}  = g_k + alpha_k A p_k
        beta_{k+1} = p_k^T A g_{k+1} / (p_k^T A p_k)
        p_{k+1}  = -g_{k+1} + beta_{k+1} p_k
    end

The expression for $\beta_{k+1}$ may be simplified in several ways, e.g. by utilizing that the gradients are mutually orthogonal (in the ordinary inner product):

    \beta_{k+1} = \frac{p_k^T A g_{k+1}}{p_k^T A p_k} = \frac{g_{k+1}^T (g_{k+1} - g_k)}{p_k^T (g_{k+1} - g_k)} = \frac{g_{k+1}^T g_{k+1}}{g_k^T g_k}.   (19)
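A direct Matlab transcription of this algorithm could look as follows. This is my own sketch, not code from the note; the function name is arbitrary, and the residual-based stopping test is an addition (the loop above simply runs k_max steps):

    function x = cg_quadratic(A, b, x0, kmax, tol)
    % Conjugate Gradient for phi(x) = 0.5*x'*A*x - b'*x with A symmetric
    % positive definite, following the algorithm in the text.
    x = x0;
    g = A*x - b;                      % g_0
    p = -g;                           % p_0
    for k = 1:kmax
        Ap    = A*p;                  % the only matrix-vector product per step
        alpha = -(p'*g)/(p'*Ap);
        x     = x + alpha*p;
        g     = g + alpha*Ap;         % g_{k+1} = g_k + alpha_k*A*p_k
        if norm(g) < tol, break; end  % extra convenience, not in the note
        beta  = (g'*Ap)/(p'*Ap);      % by Eqn. 19 this equals g_{k+1}'g_{k+1}/(g_k'g_k)
        p     = -g + beta*p;
    end
    end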
3 ERROR ANALYSIS

We know that $\{x_k\}$ is a sequence of best approximations, but how good is it really? It turns out that there exists a fairly complete theory for the error analysis, which involves some interesting mathematics.

Let us start at a general $x_0$ with the first search direction $p_0 = -g_0 = -(A x_0 - b)$. Then, from Eqns. 16 and 12,

    p_1 = -g_1 + \beta_1 p_0 = -(g_0 - \alpha_0 A g_0) - \beta_1 g_0 \in \mathrm{span}\{g_0, A g_0\}.    (20)

Continuing in the same way, it is easy to prove that

    p_k \in \mathrm{span}\{g_0, A g_0, A^2 g_0, \ldots, A^k g_0\}.             (21)

Hence, $B_k$ could as well be defined as

    B_k = \mathrm{span}\{g_0, A g_0, A^2 g_0, \ldots, A^{k-1} g_0\},           (22)

and

    x_k - x_0 \in \mathrm{span}\{g_0, A g_0, A^2 g_0, \ldots, A^{k-1} g_0\}.   (23)

The sequence

    g_0, A g_0, A^2 g_0, \ldots                                                (24)

is called a Krylov sequence, and $\{B_k\}$ the Krylov subspaces, after Alexei Nikolaevich Krylov (1863-1945), famous Russian naval architect and applied mathematician. The group of algorithms based on Krylov sequences and subspaces (of which CG is the main one) belongs to the 20th Century Top Ten Algorithms.

Because of (23), we can write

    x_k - x_0 = \sum_{j=0}^{k-1} \gamma_j A^j g_0 = \Big( \sum_{j=0}^{k-1} \gamma_j A^j \Big) g_0 = P_k(A)\, g_0,     (25)

where $P_k(A)$ is a matrix polynomial of degree $k-1$. Furthermore,

    x_k - x^* = P_k(A) g_0 + x_0 - x^* = P_k(A)(A x_0 - A x^*) + x_0 - x^*
              = (P_k(A) A + I)(x_0 - x^*) = Q_k(A)(x_0 - x^*),                 (26)

where $Q_k$ is a matrix polynomial of degree $k$ with $Q_k(0) = 1$.

Like we did for the analysis of the Steepest Descent method, we observe that $A$ has eigenvalues $0 < \lambda_1 \le \lambda_2 \le \cdots \le \lambda_n$ and a corresponding set of orthogonal, normalized eigenvectors $e_1, e_2, \ldots, e_n$. Recall that for $y = \sum_{j=1}^n t_j e_j$,

    \|y\|_A^2 = y^T A y = \sum_{j=1}^n \lambda_j t_j^2.                        (27)
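The next step also uses that a matrix polynomial acts on an eigenvector of $A$ through its scalar value; for completeness, writing $Q_k(\lambda) = \sum_{m=0}^{k} c_m \lambda^m$,

    Q_k(A) e_j = \sum_{m=0}^{k} c_m A^m e_j = \sum_{m=0}^{k} c_m \lambda_j^m e_j = Q_k(\lambda_j)\, e_j.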
Similarly, noting that also $Q_k(A) e_j = Q_k(\lambda_j) e_j$,

    \|Q_k(A) y\|_A^2 = y^T Q_k(A) A Q_k(A) y = \sum_{j=1}^n Q_k^2(\lambda_j) \lambda_j t_j^2.    (28)

If we apply Eqn. 28 to Eqn. 26 with $x_0 - x^* = \sum_{j=1}^n \tau_j e_j$, we obtain

    \|x_k - x^*\|_A^2 = \sum_{j=1}^n Q_k^2(\lambda_j) \lambda_j \tau_j^2.      (29)

This is an exact expression, and since the norm is as small as possible, it will be impossible to decrease it by choosing a different polynomial. Thus, the polynomial $Q_k$ is the optimal solution to the problem

    Q_k = \arg\min_{\deg(Q)=k,\ Q(0)=1} \sum_{j=1}^n Q^2(\lambda_j) \lambda_j \tau_j^2.          (30)

The expression is not particularly tractable, since the optimal $Q_k$ will vary for each $x_0$, but we can also write

    \|x_k - x^*\|_A^2 \le \max_j Q_k^2(\lambda_j) \sum_{j=1}^n \lambda_j \tau_j^2 = \max_j Q_k^2(\lambda_j)\, \|x_0 - x^*\|_A^2,   (31)

which shows that the relative error of $x_k$ is bounded by $\max_j |Q_k(\lambda_j)|$:

    \frac{\|x_k - x^*\|_A}{\|x_0 - x^*\|_A} \le \max_j |Q_k(\lambda_j)|.       (32)

The quite general error estimate in Eqn. 32 has some surprising implications. We are free to put in any polynomial we like, as long as it has degree $k$ and $Q(0) = 1$. Thus, if $A$ happens to have only $k$ different eigenvalues, there exists a $k$-th order polynomial having those eigenvalues as zeros and, by a suitable scaling, we can also obtain $Q(0) = 1$. Then $\|x_k - x^*\|_A = 0$, and the method converges to $x^*$ in $k$ iterations!

The following result is somewhat deeper. It considers the case where we have an idea about the extreme eigenvalues of $A$, $\lambda_1$ and $\lambda_n$, but not about the eigenvalues in between. It is then reasonable to ask how small the right-hand side of Eqn. 32 could be, since we clearly have

    \|x_k - x^*\|_A \le \|x_0 - x^*\|_A \; \min_{Q \text{ of order } k,\ Q(0)=1} \; \max_{\lambda_1 \le \lambda \le \lambda_n} |Q(\lambda)|.    (33)

This turns out to be a classic so-called min-max problem in the approximation of functions, and the solution is given by

    Q_k(\lambda) = T_k\!\Big(\frac{\lambda_n + \lambda_1 - 2\lambda}{\lambda_n - \lambda_1}\Big) \Big/ T_k\!\Big(\frac{\lambda_n + \lambda_1}{\lambda_n - \lambda_1}\Big),    (34)

where $T_k$ is the Chebyshev polynomial of degree $k$. Look up plots and properties of the Chebyshev polynomials. (This is the spelling accepted by MathWorld. When I was a student, we were told that there are 52 different spellings of the name Chebyshev, but MathWorld cites 4.)
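A quick way to inspect the optimal polynomial (34) in Matlab is sketched below; this is not code from the note. Here $T_k$ is evaluated as $\cos(k \arccos x)$ for $|x| \le 1$ and $\cosh(k\,\mathrm{arccosh}\, x)$ for $x \ge 1$, and the interval $[0.3, 2]$ and the degrees anticipate the example and Figure 1 below:

    lambda1 = 0.3; lambdan = 2;                    % interval used in the example below
    lam = linspace(lambda1, lambdan, 400);
    Tk  = @(k,x) (abs(x)<=1).*cos(k*acos(min(max(x,-1),1))) ...
                + (x>1).*cosh(k*acosh(max(x,1)));
    for k = [2 5 10]
        arg = (lambdan + lambda1 - 2*lam)/(lambdan - lambda1);
        Qk  = Tk(k,arg) ./ Tk(k,(lambdan + lambda1)/(lambdan - lambda1));
        plot(lam, Qk); hold on                     % Q_k from Eqn. (34)
    end
    hold off; xlabel('\lambda'); ylabel('Q_k(\lambda)'); legend('k=2','k=5','k=10')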
Since $\max_{|x| \le 1} |T_k(x)| = 1$, inequality 32 now leads to

    \frac{\|x_k - x^*\|_A}{\|x_0 - x^*\|_A} \le \Big[ T_k\!\Big(\frac{\lambda_n + \lambda_1}{\lambda_n - \lambda_1}\Big) \Big]^{-1},    (35)

and

    \frac{\|x_k - x^*\|_A}{\|x_0 - x^*\|_A} \le \Big[ T_k\!\Big(\frac{\kappa + 1}{\kappa - 1}\Big) \Big]^{-1},    (36)

where $\kappa = \lambda_n / \lambda_1$ is the condition number of $A$.

The Chebyshev polynomials satisfy a lot of relations, and one definition (not included in the MathWorld Encyclopedia) is simply

    T_k(x) = \frac{1}{2}\Big[ \big(x + \sqrt{x^2 - 1}\big)^k + \big(x - \sqrt{x^2 - 1}\big)^k \Big], \quad |x| \ge 1.    (37)

With $x = (\kappa + 1)/(\kappa - 1)$, we obtain

    x + \sqrt{x^2 - 1} = \frac{(\sqrt{\kappa} + 1)^2}{\kappa - 1} = \frac{\sqrt{\kappa} + 1}{\sqrt{\kappa} - 1},
    \qquad
    x - \sqrt{x^2 - 1} = \frac{(\sqrt{\kappa} - 1)^2}{\kappa - 1} = \frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1},    (38)

and

    T_k\!\Big(\frac{\kappa + 1}{\kappa - 1}\Big) = \frac{1}{2}\Big[ \Big(\frac{\sqrt{\kappa} + 1}{\sqrt{\kappa} - 1}\Big)^k + \Big(\frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1}\Big)^k \Big] > \frac{1}{2}\Big(\frac{\sqrt{\kappa} + 1}{\sqrt{\kappa} - 1}\Big)^k,    (39)

so that

    \Big[ T_k\!\Big(\frac{\kappa + 1}{\kappa - 1}\Big) \Big]^{-1} < 2 \Big(\frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1}\Big)^k    (40)

when $\kappa > 1$. This leads finally to Eqn. 5.36 in N&W, 2nd Ed. (Note: the corresponding equation in N&W, 1st Ed. states the result with the wrong power, the wrong constant, and the wrong definition of $\kappa$!):

    \|x_k - x^*\|_A \le 2 \Big(\frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1}\Big)^k \|x_0 - x^*\|_A.    (41)

Figure 1 shows the polynomials $Q_k$ for some $k$'s when $\lambda_1 = 0.3$ and $\lambda_n = 2$. The maximum deviation for $Q_{10}$ in the interval $[0.3, 2]$ is only $5.6 \times 10^{-4}$, not bad! Thus, if we have a system of equations and know that the coefficient matrix has eigenvalues in the interval $[0.3, 2]$, then after 10 CG iterations,

    \|x_{10} - x^*\|_A \le 5.6 \times 10^{-4} \, \|x_0 - x^*\|_A,    (42)

regardless of the size of the system!

Penalty and barrier methods (N&W, Chapter 17) always lead to Hessians which have a few very large eigenvalues (actually tending to infinity as the iteration proceeds), whereas the rest are clustered in a fixed range.
Figure 1: The optimal Chebyshev polynomials for the interval [0.3, 2] when k = 2, 5 and 10.

Theorem 5.5 in N&W has some quite important implications for this case:

    \|x_{k+1} - x^*\|_A \le \frac{\lambda_{n-k} - \lambda_1}{\lambda_{n-k} + \lambda_1} \, \|x_0 - x^*\|_A.    (43)

The bound is independent of the size of $\lambda_{n-k+1}, \ldots, \lambda_n$. It is easy to derive the bound from 32 by constructing a $(k+1)$-th order polynomial with zeros at $(\lambda_1 + \lambda_{n-k})/2$ and at $\lambda_{n-k+1}, \ldots, \lambda_n$, when $\lambda_{n-k+1}, \ldots, \lambda_n$ are unique (try it!).

We recognize (43) as the Steepest Descent error bound for a matrix with extreme eigenvalues $\lambda_1$ and $\lambda_{n-k}$, and this is a key observation: By restarting the CG method every $k+1$ iterations, we obtain a convergence rate similar to the SD method, but with a greatly reduced condition number (counting the $k+1$ CG steps as one iteration). The convergence behavior with several clustered groups of eigenvalues is illustrated in N&W.

Another good reference for the Conjugate Gradient method is the book by G. H. Golub and C. F. Van Loan: Matrix Computations, Johns Hopkins University Press (an excellent general reference book in numerical linear algebra!).

4 SOME NUMERICAL EXPERIMENTS

It is easy to experiment with the CG method using Matlab. In the following code we first generate a positive definite matrix $R^T R$, and then the matrix $A = (R^T R)^{npot}$. The power npot steers the eigenvalue distribution and hence the condition number.

    ndim = 100;                 % problem size (value lost in the source; assumed)
    R = randn(ndim);
    for npot = 0.1:0.4:2.1      % range partly lost in the source; assumed
        A = (R'*R)^npot;
        lamb  = eig(A);
        kappa = max(lamb)/min(lamb);
        xsol  = rand(ndim,1);
        b     = A*xsol;
        Norm2 = sqrt(xsol'*xsol);
        NormA = sqrt(xsol'*A*xsol);
        x = zeros(size(b));
        g = A*x - b;
        p = -g;
        for loop = 1:ndim
            Ap   = A*p;               % Only one matrix-vector product!
            alfa = -(p'*g)/(p'*Ap);
            x    = x + alfa*p;
            g    = g + alfa*Ap;       % g = A*x - b;
            beta = (g'*Ap)/(p'*Ap);
            p    = -g + beta*p;
            err2(loop) = sqrt((x-xsol)'*(x-xsol))/Norm2;
            erra(loop) = sqrt((x-xsol)'*A*(x-xsol))/NormA;
        end
        subplot(1,2,1);
        plot(lamb, zeros(size(lamb)), 'o');
        Tittel = ['Eigenvalues, ndim=' num2str(ndim)];
        title(Tittel);
        subplot(1,2,2);
        semilogy(1:ndim, err2, 1:ndim, erra);
        legend('2-norm','A-norm');
        xlabel('Iteration number'); ylabel('Error');
        Tittel = ['npot=' num2str(npot) ', \kappa=' num2str(kappa)];
        title(Tittel);
        pause
    end

Figure 2: Well-conditioned matrix. Fast convergence.

4.1 Exercise

(a) Implement and plot the error bound in Eqn. 41 in the Matlab code. How does this compare with the actual decrease of the error? (N&W say "This bound often gives a large overestimate".) One possible starting point is sketched after this exercise.

(b) Modify the matrix $A$ (as generated above) so that it has $m$ (say, $m = 5$) large eigenvalues by adding a rank-$m$ matrix $L L^T$,

    A = (R^T R)^{npot} + L L^T.

Here $L$ is $n \times m$ and consists of $m$ random column vectors. Test the performance of the CG method.
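As a sketch of one way to start on part (a) of the exercise (not the only one), the bound (41) can be overlaid on the error plot using the variables kappa, erra and ndim from the script above:

    k     = 1:ndim;
    bound = 2*((sqrt(kappa)-1)/(sqrt(kappa)+1)).^k;   % Eqn. (41), relative A-norm error
    semilogy(k, erra, k, bound, '--');
    legend('A-norm error','bound (41)'); xlabel('Iteration number');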
Figure 3: Medium conditioned matrix.

Figure 4: Ill-conditioned matrix. The convergence is very slow, and rounding errors destroy the accuracy severely.