Quasi-Newton Methods
Werner C. Rheinboldt

These are excerpts of material relating to the books [OR00] and [Rhe98] and of write-ups prepared for courses held at the University of Pittsburgh. Some further references are [Kel95], [Kel99], [DS98].

1 Broyden's Method

Let

$$F(x) = 0, \qquad F : \mathbb{R}^n \to \mathbb{R}^n, \tag{1}$$

be a given system of nonlinear equations defined by a sufficiently smooth function $F$. A linearization method for the numerical solution of (1) has the general form

$$x^{k+1} = x^k - B_k^{-1} F(x^k), \qquad k = 0, 1, \ldots \tag{2}$$

where the $n \times n$ matrices $B_k$ are suitably chosen. The integral mean value theorem for $F$ states that

$$\left[ \int_0^1 DF(x + t(y - x))\,dt \right] (y - x) = F(y) - F(x), \qquad \forall x, y \in \mathbb{R}^n.$$

The matrix in brackets can be interpreted as the average of the Jacobian matrix on the line segment between the points $x$ and $y$. This suggests requiring the matrices $B_k$ to satisfy the so-called quasi-Newton condition

$$B_{k+1}(x^{k+1} - x^k) = F(x^{k+1}) - F(x^k).$$

In several papers around 1967, C. G. Broyden suggested that it is numerically advantageous to choose the matrices in (2) such that $\operatorname{rank}(B_{k+1} - B_k)$ is small. This led to the development of the so-called quasi-Newton methods, which can be characterized by the following three properties:

$$\begin{aligned}
&\text{(a)}\quad B_k(x^{k+1} - x^k) + F(x^k) = 0, \\
&\text{(b)}\quad B_{k+1}(x^{k+1} - x^k) = F(x^{k+1}) - F(x^k), \\
&\text{(c)}\quad B_{k+1} = B_k + \Delta B_k, \quad \operatorname{rank} \Delta B_k = m \ge 1,
\end{aligned}
\qquad k = 0, 1, \ldots \tag{3}$$

Up to now only the values $m = 1$ or $m = 2$ have been used in the design of quasi-Newton methods. From (3) we obtain some frequently used relations

$$\begin{aligned}
&\text{(a)}\quad (B_{k+1} - B_k)s^k = F(x^{k+1}), \qquad s^k = x^{k+1} - x^k, \\
&\text{(b)}\quad F(x^{k+1}) = y^k - B_k s^k, \qquad y^k = F(x^{k+1}) - F(x^k).
\end{aligned} \tag{4}$$
C. G. Broyden himself developed two quasi-Newton methods with $m = 1$ and called one of them his "good" method. This terminology has persisted. The good method uses

$$B_{k+1} := B_k + \frac{F(x^{k+1})(s^k)^\top}{(s^k)^\top s^k}, \tag{5}$$

or, in view of (4)(b),

$$B_{k+1} := B_k + \frac{(y^k - B_k s^k)(s^k)^\top}{(s^k)^\top s^k}. \tag{6}$$

As for all standard linearization methods, the matrices $B_k$ should be invertible. Recall the well-known Sherman-Morrison formula:

1.1. For $u, v \in \mathbb{R}^n$ the matrix $I + uv^\top$ is invertible if and only if $1 + u^\top v \ne 0$, and in that case

$$(I + uv^\top)^{-1} = I - \frac{1}{1 + u^\top v}\, uv^\top.$$

If in the Broyden method the matrix $B_k$ is nonsingular, then 1.1 shows that

$$B_{k+1} = B_k \left[ I + \frac{(B_k^{-1}F(x^{k+1}))(s^k)^\top}{(s^k)^\top s^k} \right] \tag{7}$$

is again nonsingular, provided that

$$(s^k)^\top s^k + (B_k^{-1}F(x^{k+1}))^\top s^k \ne 0,$$

in which case the inverse is

$$B_{k+1}^{-1} = \left[ I - \frac{(B_k^{-1}F(x^{k+1}))(s^k)^\top}{(s^k)^\top s^k + (B_k^{-1}F(x^{k+1}))^\top s^k} \right] B_k^{-1}. \tag{8}$$

With $H_k = B_k^{-1}$ and $H_{k+1} = B_{k+1}^{-1}$ this can be written in the form

$$H_{k+1} = H_k + \frac{(s^k - H_k y^k)(s^k)^\top H_k}{(s^k)^\top H_k y^k}. \tag{9}$$

Various convergence results for Broyden's method have been proved. We refer to the cited references and cite only a simplified version of such a result:

1.2. Let $F : \Omega \subset \mathbb{R}^n \to \mathbb{R}^n$ be continuously differentiable on an open set $\Omega$. Suppose that $x^* \in \Omega$ is a solution of $F(x) = 0$ where $DF(x^*)$ is invertible and

$$\|DF(x) - DF(x^*)\| \le \gamma \|x - x^*\|, \qquad \forall x \in \Omega.$$

Then there exist $\delta, \eta > 0$ such that for $\|x^0 - x^*\| < \delta$ and $\|B_0 - DF(x^*)\| < \eta$ Broyden's method converges to $x^*$ and the rate of convergence is superlinear in the sense that

$$\lim_{k \to \infty} \frac{\|x^{k+2} - x^{k+1}\|}{\|x^{k+1} - x^k\|} = 0. \tag{10}$$
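The relation between the direct update (6) and the inverse update (9) can be checked numerically. The following is a minimal sketch in plain Python; the helper names (`broyden_update`, `broyden_inverse_update`) and the 2x2 data are hypothetical and chosen only for illustration, not taken from the notes.

```python
def dot(u, v):
    """Euclidean inner product u^T v."""
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    """Matrix-vector product M v for a list-of-rows matrix."""
    return [dot(row, v) for row in M]


def matmul(A, B):
    """Matrix product A B for list-of-rows matrices."""
    cols = list(zip(*B))
    return [[dot(row, col) for col in cols] for row in A]


def broyden_update(B, s, y):
    """Direct 'good' Broyden update (6): B + (y - B s) s^T / (s^T s)."""
    Bs = matvec(B, s)
    ss = dot(s, s)
    n = len(s)
    return [[B[i][j] + (y[i] - Bs[i]) * s[j] / ss for j in range(n)]
            for i in range(n)]


def broyden_inverse_update(H, s, y):
    """Inverse update (9): H + (s - H y) (s^T H) / (s^T H y)."""
    Hy = matvec(H, y)
    sTH = [dot(s, col) for col in zip(*H)]   # row vector s^T H
    denom = dot(s, Hy)                       # must be nonzero, cf. 1.1
    n = len(s)
    return [[H[i][j] + (s[i] - Hy[i]) * sTH[j] / denom for j in range(n)]
            for i in range(n)]


# Hypothetical 2x2 data chosen only for illustration.
B = [[2.0, 1.0], [0.0, 3.0]]
H = [[0.5, -1.0 / 6.0], [0.0, 1.0 / 3.0]]    # H = B^{-1}
s = [1.0, 2.0]
y = [3.0, 1.0]

B_new = broyden_update(B, s, y)
H_new = broyden_inverse_update(H, s, y)
secant_residual = [a - b for a, b in zip(matvec(B_new, s), y)]
product = matmul(B_new, H_new)               # should be close to the identity
```

Under these assumptions `secant_residual` vanishes up to rounding, which is the quasi-Newton condition (3)(b), and `product` is the identity up to rounding, which confirms that (9) inverts (6).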
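2 Recursive Implementation

With the notation

$$w^k = B_k^{-1} F(x^{k+1}), \tag{11}$$

the inverse formula (8) is

$$B_{k+1}^{-1} = \left[ I - \frac{w^k (s^k)^\top}{(s^k)^\top s^k + (w^k)^\top s^k} \right] B_k^{-1}, \tag{12}$$

and the next step equals

$$s^{k+1} = -B_{k+1}^{-1} F(x^{k+1}) = -\left[ I - \frac{w^k (s^k)^\top}{(s^k)^\top s^k + (w^k)^\top s^k} \right] w^k = -\frac{(s^k)^\top s^k}{(s^k)^\top s^k + (w^k)^\top s^k}\, w^k. \tag{13}$$

From (13) it follows that

$$\frac{s^{k+1} (s^k)^\top}{(s^k)^\top s^k} = -\frac{w^k (s^k)^\top}{(s^k)^\top s^k + (w^k)^\top s^k},$$

whence (12) becomes

$$B_{k+1}^{-1} = \left[ I + \frac{s^{k+1}(s^k)^\top}{(s^k)^\top s^k} \right] B_k^{-1} = \prod_{j=0}^{k} \left[ I + \frac{s^{j+1}(s^j)^\top}{(s^j)^\top s^j} \right] B_0^{-1}, \tag{14}$$

where the factors are ordered so that $j = k$ is the leftmost one, while (13) can be written as

$$s^{k+1} = -\left[ I + \frac{s^{k+1}(s^k)^\top}{(s^k)^\top s^k} \right] w^k, \tag{15}$$

that is,

$$\left[ 1 + \frac{(s^k)^\top w^k}{(s^k)^\top s^k} \right] s^{k+1} = -w^k. \tag{16}$$

Suppose now that the steps $s^j$, $j = 0, 1, \ldots, k$, and their norms have been stored. Then (14) and (16) imply that

$$w^k = \prod_{j=0}^{k-1} \left[ I + \frac{s^{j+1}(s^j)^\top}{(s^j)^\top s^j} \right] w, \qquad w := B_0^{-1} F(x^{k+1}), \tag{17}$$

$$s^{k+1} = -\frac{1}{1 + \tau}\, w^k, \qquad \tau = \frac{(s^k)^\top w^k}{(s^k)^\top s^k},$$

which can be evaluated by the recursive algorithm

    $w := B_0^{-1} F(x^{k+1})$;
    for $j = 0, \ldots, k-1$
        $\tau := [(s^j)^\top w] / \|s^j\|^2$;
        $w := w + \tau s^{j+1}$;
    $\tau := [(s^k)^\top w] / \|s^k\|^2$;
    $s^{k+1} := -[1/(1 + \tau)]\, w$;                                        (18)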
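The recursion (17), (18) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the names `broyden_recursive`, `solve_B0` and `F_example` are hypothetical, the stopping rule is the simple test $\|s^{k+1}\| \le$ tol, and no divergence monitoring is included.

```python
import math


def dot(u, v):
    """Euclidean inner product u^T v."""
    return sum(a * b for a, b in zip(u, v))


def broyden_recursive(F, x0, solve_B0, tol=1e-12, k_max=50):
    """Broyden's good method using only B_0 solves and the stored steps.

    F        : callable returning F(x) as a list
    solve_B0 : callable returning B_0^{-1} r for a vector r (hypothetical name)
    Stops when ||s^{k+1}|| <= tol (a simplified test assumed for this sketch).
    """
    x = list(x0)
    s = [-c for c in solve_B0(F(x))]            # B_0 s^0 = -F(x^0)
    steps = [s]                                 # stored steps s^0, ..., s^k
    norms2 = [dot(s, s)]                        # stored ||s^j||^2
    for k in range(k_max + 1):
        x = [a + b for a, b in zip(x, steps[k])]        # x^{k+1} = x^k + s^k
        w = solve_B0(F(x))                              # w := B_0^{-1} F(x^{k+1})
        for j in range(k):                              # product in (17)
            tau = dot(steps[j], w) / norms2[j]
            w = [a + tau * b for a, b in zip(w, steps[j + 1])]
        tau = dot(steps[k], w) / norms2[k]
        s_next = [-c / (1.0 + tau) for c in w]          # last line of (18)
        if math.sqrt(dot(s_next, s_next)) <= tol:
            return [a + b for a, b in zip(x, s_next)], k + 1
        steps.append(s_next)
        norms2.append(dot(s_next, s_next))
    raise RuntimeError("maximal number of steps reached")


def F_example(x):
    """Hypothetical test system with solution (1, 1)."""
    return [x[0] + 0.1 * x[1] ** 2 - 1.1,
            x[1] + 0.2 * x[0] ** 2 - 1.2]


# B_0 = I is assumed here, so the B_0 solve is the identity map.
x_star, n_steps = broyden_recursive(F_example, [0.0, 0.0], lambda r: list(r))
```

Only the steps and their squared norms are stored; the matrices $B_k$ are never formed, which is the point of the recursive form.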
In order to complete this algorithm, we need some divergence and convergence criteria. In the convergence proof a controlling quantity is the quotient

$$\Theta_k := \frac{\|B_k^{-1} F(x^{k+1})\|}{\|s^k\|}, \qquad k \ge 0, \tag{19}$$

and it turns out that we should declare divergence if the condition

$$\Theta_k < \tfrac{1}{2} \tag{20}$$

is violated. In view of the superlinear convergence it suffices to declare convergence as soon as $\|s^{k+1}\| \le \text{tol}$. Altogether the Broyden algorithm can now be formulated as follows, where in contrast to (18) we work with $v = -w$:

    input: $x^0$, $B_0$, $k_{\max}$, tol;
    solve $B_0 s^0 = -F(x^0)$; $\xi_0 := \|s^0\|$; store $\xi_0$, $s^0$;
    for $k = 0, 1, \ldots, k_{\max}$
        $x^{k+1} := x^k + s^k$;
        solve $B_0 v = -F(x^{k+1})$;
        if $k > 0$
            for $j = 1, \ldots, k$
                $\tau := [(s^{j-1})^\top v] / \xi_{j-1}^2$; $v := v + \tau s^j$;
        endif
        $\tau := [(s^k)^\top v] / \xi_k^2$; $\Theta := \|v\| / \xi_k$;
        if $\Theta \ge 1/2$ then return {divergence};
        $s^{k+1} := v / (1 - \tau)$; $\xi_{k+1} := \|s^{k+1}\|$; store $\xi_{k+1}$, $s^{k+1}$;
        if $\xi_{k+1} \le \text{tol}$ then return {$x^* := x^{k+1} + s^{k+1}$};
    return {maximal number of steps}

An implementation of this algorithm is the FORTRAN program NLEQ1 of P. Deuflhard, U. Nowak, and L. Weimann available in the ZIB-Elib library. There exists also a Matlab version. A somewhat different Matlab program is brsol.m by C. T. Kelley [Kel03].

The recursive form of the Broyden method has shown itself to be very economical in practice. But it has been observed occasionally that the condition of the matrices may deteriorate over several steps, causing the method to become unstable. For any matrix

$$A = I + \frac{uv^\top}{v^\top v}, \qquad u, v \in \mathbb{R}^n, \qquad \kappa = \frac{\|u\|}{\|v\|} < 1,$$
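we have $\|uv^\top\| \le \|u\| \|v\|$ and therefore

$$1 - \kappa \le 1 - \frac{\|uv^\top\|}{v^\top v} \le \|A\| \le 1 + \frac{\|uv^\top\|}{v^\top v} \le 1 + \kappa.$$

This shows that $\|A^{-1}\| \le (1 - \kappa)^{-1}$ and

$$\operatorname{cond}(A) := \|A\| \|A^{-1}\| \le \frac{1 + \kappa}{1 - \kappa}.$$

Hence, for the Broyden matrices (7) it follows from the convergence condition (19), (20) that

$$\operatorname{cond}(B_{k+1}) \le \frac{1 + \Theta_k}{1 - \Theta_k} \operatorname{cond}(B_k) < 3 \operatorname{cond}(B_k),$$

and hence that the growth of the condition numbers is not unduly fast and can be controlled by means of these estimates.

3 Linear Equations

The recursive form of the Broyden method also provides a very useful iterative method for linear problems

$$Ax = b, \qquad A \in GL(\mathbb{R}^n).$$

In that case, with $F(x) = Ax - b$ and $s^k = x^{k+1} - x^k$ as in (4), the update (5) has the form

$$B_{k+1} := B_k + \frac{(Ax^{k+1} - b)(s^k)^\top}{(s^k)^\top s^k}$$

and with $(B_k - A)s^k + Ax^{k+1} - b = 0$ it follows that

$$B_{k+1} - A = (B_k - A)(I - P_k), \qquad P_k = \frac{s^k (s^k)^\top}{(s^k)^\top s^k}. \tag{21}$$

Here $I - P_k$ is the orthogonal projection onto the orthogonal complement of the linear space spanned by $s^k$. We introduce now the matrices $E_j = A^{-1} B_j - I$. Then it follows from (21) that

$$\|E_{j+1}\| \le \|E_j\|, \qquad j \ge 0.$$

Moreover,

$$B_j s^j = b - Ax^j = -A(x^j - x^*), \qquad x^* = A^{-1} b,$$

implies that

$$x^j - x^* = -A^{-1} B_j s^j = -(E_j + I)s^j,$$

and hence that

$$\left( 1 - \frac{\|E_j s^j\|}{\|s^j\|} \right) \|s^j\| \le \|x^j - x^*\| \le \left( 1 + \frac{\|E_j s^j\|}{\|s^j\|} \right) \|s^j\|.$$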
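The projection relation for $B_{k+1} - A$ and the two-sided error bound can be checked on a small example. The sketch below forms the matrices $B_k$ explicitly, which is not how the method would be implemented in practice. It assumes the sign convention $F(x) = Ax - b$, $B_k s^k = b - Ax^k$, and the 2x2 data and the choice $B_0 = \operatorname{diag}(A)$ are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def matmul(A, B):
    cols = list(zip(*B))
    return [[dot(row, col) for col in cols] for row in A]


def inv2(M):
    """Inverse of a 2x2 matrix (enough for this small illustration)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def norm(v):
    return math.sqrt(dot(v, v))


# Hypothetical data: A x = b with exact solution x* = (0.1, 0.6).
A = [[4.0, 1.0], [2.0, 3.0]]
b = [1.0, 2.0]
x_exact = [0.1, 0.6]
B = [[4.0, 0.0], [0.0, 3.0]]          # B_0 = diag(A), an assumed starting matrix
x = [0.0, 0.0]
A_inv = inv2(A)
I2 = [[1.0, 0.0], [0.0, 1.0]]

max_identity_error = 0.0              # largest deviation from the projection relation
bounds_hold = True
for k in range(3):
    r = [bi - ai for bi, ai in zip(b, matvec(A, x))]       # b - A x^k
    s = matvec(inv2(B), r)                                 # B_k s^k = b - A x^k
    # Two-sided error bound in terms of E_k = A^{-1} B_k - I.
    AB = matmul(A_inv, B)
    E = [[AB[i][j] - I2[i][j] for j in range(2)] for i in range(2)]
    e = norm(matvec(E, s)) / norm(s)
    err = norm([xi - xe for xi, xe in zip(x, x_exact)])
    lower = (1.0 - e) * norm(s) - 1e-12
    upper = (1.0 + e) * norm(s) + 1e-12
    bounds_hold = bounds_hold and (lower <= err <= upper)
    # Broyden update with F(x^{k+1}) = A x^{k+1} - b.
    x = [xi + si for xi, si in zip(x, s)]
    f_new = [ai - bi for ai, bi in zip(matvec(A, x), b)]
    ss = dot(s, s)
    B_new = [[B[i][j] + f_new[i] * s[j] / ss for j in range(2)] for i in range(2)]
    # Right-hand side (B_k - A)(I - P_k) of the projection relation.
    D = [[B[i][j] - A[i][j] for j in range(2)] for i in range(2)]
    IP = [[I2[i][j] - s[i] * s[j] / ss for j in range(2)] for i in range(2)]
    rhs = matmul(D, IP)
    for i in range(2):
        for j in range(2):
            max_identity_error = max(max_identity_error,
                                     abs(B_new[i][j] - A[i][j] - rhs[i][j]))
    B = B_new
```

In this sketch `max_identity_error` stays at rounding level and `bounds_hold` remains true, so the step length $\|s^j\|$ brackets the error $\|x^j - x^*\|$ as stated.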
Under the conditions of the local convergence theorem 1.2 one can show that

$$\lim_{j \to \infty} \frac{\|E_j s^j\|}{\|s^j\|} = 0.$$

This leads to the asymptotic error estimate

$$\|x^j - x^*\| \approx \|s^j\|.$$

In order to smooth any possible erratic behavior, it is here useful to work with the average of several steps and to declare convergence if

$$\epsilon_j := \tfrac{1}{2} \left[ \|s^{j-1}\|^2 + 2\|s^j\|^2 + \|s^{j+1}\|^2 \right]^{1/2} \le \beta \|x^j\| \,\text{tol}, \tag{22}$$

with some given safety factor $\beta < 1$. Then the algorithm has the form:

    input: $A$, $b$, $y$, $B_0$, $k_{\max}$, $\tau_{\min}$, $\beta$, tol;
    $x := y$; $r := b - Ay$; $s^0 := B_0^{-1} r$; $\eta_0 := (s^0)^\top s^0$; store $s^0$, $\eta_0$;
    for $k = 0, 1, \ldots, k_{\max}$
        $q := A s^k$; $z := B_0^{-1} q$;
        if $k > 0$
            for $j = 1, \ldots, k$
                $z := z + [(s^{j-1})^\top z / \eta_{j-1}]\, s^j$;
        endif
        $\tau := \eta_k / [(s^k)^\top z]$;
        if $\tau < \tau_{\min}$ then return {restart};
        $x := x + s^k$;
        $s^{k+1} := \tau (s^k - z)$; $\eta_{k+1} := (s^{k+1})^\top s^{k+1}$; store $s^{k+1}$, $\eta_{k+1}$;
        $\epsilon := (1/2) [\eta_{k-1} + 2\eta_k + \eta_{k+1}]^{1/2}$;
        if $\epsilon \le \beta \|x\| \,\text{tol}$ then return {$x^* := x + s^{k+1}$};
    return {maximal number of steps}

Here the inner loop applies the factors of (14) to $z$, and $s^{k+1} = \tau(s^k - z)$ follows from (9) with $y^k = As^k$. A FORTRAN implementation is the GBITR program in the ZIB-Elib library. Note that for the matrix $A$ only a facility for computing the product $Ax$, $x \in \mathbb{R}^n$, has to be provided.

4 Rank-Two Updates

The variety of possible methods increases considerably in the rank-two case. Many of these methods have been developed for application in optimization problems. In that case the interest centers on update formulas which preserve the symmetry of the matrices. Evidently, the direct updates should then have the form

$$B_{k+1} = B_k + \begin{pmatrix} b & c \end{pmatrix} \Sigma \begin{pmatrix} b^\top \\ c^\top \end{pmatrix}, \qquad \Sigma = \begin{pmatrix} \sigma_1 & \sigma_2 \\ \sigma_2 & \sigma_3 \end{pmatrix}, \qquad b, c \in \mathbb{R}^n. \tag{23}$$

Some examples show that here the matrix $\Sigma$ should be nonsingular with a negative determinant, otherwise there may be convergence problems.
Since the vectors $b$, $c$ are essentially free, some suitable basis in $\mathbb{R}^2$ may be chosen in which $\Sigma$ assumes a simpler form. In particular, because of $\det \Sigma < 0$ we may transform $\Sigma$ such that either $\sigma_1$ or $\sigma_3$ is zero. In fact, if, say, $\sigma_3 \ne 0$ then a simple calculation shows that

$$\Sigma = \begin{pmatrix} 1 & \mu \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & \delta \\ \delta & \sigma_3 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ \mu & 1 \end{pmatrix}, \qquad \mu = \frac{\sigma_2 - \delta}{\sigma_3}, \qquad \delta = \sqrt{-\det \Sigma}.$$

Thus, there is no loss of generality to assume that $\sigma_1 = 0$ in (23). As before we use the abbreviations

$$s^k = x^{k+1} - x^k, \qquad y^k = F(x^{k+1}) - F(x^k). \tag{24}$$

The condition (4) requires $y^k - B_k s^k$ to be in the subspace spanned by $b$ and $c$ and hence it is no restriction to set $b = y^k - B_k s^k$. Then, for any $c \in \mathbb{R}^n$ such that $c^\top s^k \ne 0$ it follows that $\sigma_2 = 1/(c^\top s^k)$ and $\sigma_3 = -(y^k - B_k s^k)^\top s^k / (c^\top s^k)^2$. In other words, all symmetric direct update formulas with nonpositive determinant can be written in the form

$$B_{k+1} = B_k + \frac{(y^k - B_k s^k)c^\top + c(y^k - B_k s^k)^\top}{c^\top s^k} - \frac{(y^k - B_k s^k)^\top s^k}{(c^\top s^k)^2}\, cc^\top, \tag{25}$$

provided, of course, that $c^\top s^k \ne 0$.

For $c = s^k$ (25) becomes the Powell-symmetric-Broyden (PSB) update formula

$$B_{k+1} = B_k + \frac{(y^k - B_k s^k)(s^k)^\top + s^k (y^k - B_k s^k)^\top}{(s^k)^\top s^k} - \frac{(y^k - B_k s^k)^\top s^k}{((s^k)^\top s^k)^2}\, s^k (s^k)^\top \tag{26}$$

of M. J. D. Powell, while for $c = y^k$ we obtain the Davidon-Fletcher-Powell (DFP) update formula

$$B_{k+1} = B_k + \frac{(y^k - B_k s^k)(y^k)^\top + y^k (y^k - B_k s^k)^\top}{(y^k)^\top s^k} - \frac{(y^k - B_k s^k)^\top s^k}{((y^k)^\top s^k)^2}\, y^k (y^k)^\top \tag{27}$$

given by W. C. Davidon and independently by R. Fletcher and M. J. D. Powell.

Instead of working with the direct update (23) we may consider updating the inverses $H_k = B_k^{-1}$ such that $H_{k+1} - H_k$ has rank two. Here we can begin with $H_{k+1} - H_k$ in a form analogous to (23) and then proceed as before. We will not go into details, but mention only one of the formulas that can be obtained in this way. It was independently suggested by C. G. Broyden, R. Fletcher, D. Goldfarb and D. F. Shanno, and is generally called the BFGS formula, reflecting the first letters of the names of the four authors:

$$H_{k+1} = \left( I - \frac{s^k (y^k)^\top}{(y^k)^\top s^k} \right) H_k \left( I - \frac{y^k (s^k)^\top}{(y^k)^\top s^k} \right) + \frac{s^k (s^k)^\top}{(y^k)^\top s^k}. \tag{28}$$

This is widely considered the most effective update formula for minimization problems.
As before, we can apply here the Sherman-Morrison formula 1.1 and then obtain the direct-update form of the BFGS update

$$B_{k+1} = B_k + \frac{y^k (y^k)^\top}{(y^k)^\top s^k} - \frac{B_k s^k (B_k s^k)^\top}{(s^k)^\top B_k s^k}. \tag{29}$$
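The rank-two formulas of this section can be compared numerically. The following minimal sketch in plain Python implements the symmetric family with a free vector $c$ (so that $c = s^k$ gives PSB and $c = y^k$ gives DFP) together with the BFGS update in its direct and inverse forms. The function names and the 2x2 data are hypothetical and chosen only so that $B$ is symmetric positive definite and $(y^k)^\top s^k > 0$.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def matmul(A, B):
    cols = list(zip(*B))
    return [[dot(row, col) for col in cols] for row in A]


def rank_two_update(B, s, y, c):
    """Symmetric rank-two update with free vector c, requires c^T s != 0."""
    n = len(s)
    r = [yi - bi for yi, bi in zip(y, matvec(B, s))]      # r = y - B s
    cs = dot(c, s)
    rs = dot(r, s)
    return [[B[i][j] + (r[i] * c[j] + c[i] * r[j]) / cs
             - rs * c[i] * c[j] / cs ** 2 for j in range(n)] for i in range(n)]


def bfgs_direct(B, s, y):
    """Direct BFGS update: B + y y^T/(y^T s) - (B s)(B s)^T/(s^T B s)."""
    n = len(s)
    Bs = matvec(B, s)
    ys, sBs = dot(y, s), dot(s, Bs)
    return [[B[i][j] + y[i] * y[j] / ys - Bs[i] * Bs[j] / sBs
             for j in range(n)] for i in range(n)]


def bfgs_inverse(H, s, y):
    """Inverse BFGS update: L H L^T + s s^T/(y^T s), L = I - s y^T/(y^T s)."""
    n = len(s)
    ys = dot(y, s)
    L = [[(1.0 if i == j else 0.0) - s[i] * y[j] / ys for j in range(n)]
         for i in range(n)]
    R = [list(col) for col in zip(*L)]                    # transpose of L
    LHR = matmul(matmul(L, H), R)
    return [[LHR[i][j] + s[i] * s[j] / ys for j in range(n)] for i in range(n)]


# Hypothetical data: B symmetric positive definite, H = B^{-1}, y^T s = 3 > 0.
B = [[2.0, 1.0], [1.0, 3.0]]
H = [[0.6, -0.2], [-0.2, 0.4]]
s = [1.0, -1.0]
y = [2.0, -1.0]

B_psb = rank_two_update(B, s, y, s)       # c = s: PSB
B_dfp = rank_two_update(B, s, y, y)       # c = y: DFP
B_bfgs = bfgs_direct(B, s, y)
H_bfgs = bfgs_inverse(H, s, y)
```

With these data all three direct updates are symmetric and satisfy $B_{k+1} s^k = y^k$, and the product of `B_bfgs` and `H_bfgs` is the identity up to rounding, as the derivation via the Sherman-Morrison formula predicts.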
5 The BFGS Method in Optimization

Extremal problems are of foremost importance in almost all applications of mathematics. Many boundary value problems of mathematical physics may be phrased as variational problems. For instance, holonomic equilibrium problems in Lagrangian mechanics derive from the minimization of a suitable energy function. Similarly, the determination of a geodesic between two points on a manifold is a minimization problem, and so are optimal control problems in engineering, or problems involving the optimal determination of unknown parameters of a technical process. There are close connections between such extremal problems and the solution of nonlinear equations, as is readily seen in the finite dimensional case.

Let $g : \Omega \subset \mathbb{R}^n \to \mathbb{R}^1$ be some functional on some set $\Omega$. A point $x^* \in \Omega$ is a local minimizer of $g$ in $\Omega$ if there exists an open neighborhood $U$ of $x^*$ in $\mathbb{R}^n$ such that

$$g(x) \ge g(x^*), \qquad \forall x \in U \cap \Omega, \tag{30}$$

and a global minimizer on $\Omega$ if the inequality (30) holds for all $x \in \Omega$. A point $x^*$ in the interior $\operatorname{int}(\Omega)$ of $\Omega$ is a critical point of $g$ if $g$ has a derivative at $x^*$ and $Dg(x^*) = 0$.

A well-known result states that if $x^* \in \operatorname{int}(\Omega)$ is a local minimizer where $g$ is differentiable, then $x^*$ is a critical point of $g$. Of course, a critical point need not be a local minimizer. But if $g$ has a continuous second derivative at a critical point $x^* \in \operatorname{int}(\Omega)$ and the Hessian matrix $D^2 g(x^*)$ is positive definite then $x^*$ is a proper local minimizer; that is, strict inequality holds in (30) for all $x \in U \cap \Omega$, $x \ne x^*$. Conversely, at a local minimizer $x^*$, $D^2 g(x^*)$ is positive semi-definite.

For a differentiable functional $g : \Omega \subset \mathbb{R}^n \to \mathbb{R}^1$ we call the transposed first derivative $\nabla g(x) = Dg(x)^\top \in \mathbb{R}^n$ the gradient of $g$ at $x \in \Omega$. The problem of finding critical points of $g$ is precisely that of solving the gradient system

$$\nabla g(x) = 0, \qquad x \in \Omega. \tag{31}$$

Conversely, a differentiable mapping $F : \Omega \subset \mathbb{R}^n \to \mathbb{R}^n$ is called a gradient or potential mapping on $\Omega$ if there exists a differentiable functional $g : \Omega \subset \mathbb{R}^n \to \mathbb{R}^1$ such that $F(x) = \nabla g(x)$ for all $x \in \Omega$. A continuously differentiable mapping $F$ on an open convex set $\Omega$ is a gradient mapping on $\Omega$ if and only if $DF(x)$ is symmetric for all $x \in \Omega$. This is called Kerner's theorem.

For any gradient mapping the problem of solving $F(x) = 0$ may be replaced by that of minimizing the functional $g$, provided, of course, we keep in mind that a local minimizer of $g$ need not be a critical point, nor is a critical point necessarily a minimizer.

Let $g : \Omega \subset \mathbb{R}^n \to \mathbb{R}^1$ be a (sufficiently smooth) functional for which we want to compute a minimizer. Many of the iterative methods for this purpose have the general form

$$x^{k+1} = x^k - \lambda_k d^k, \qquad k = 0, 1, \ldots, \tag{32}$$

involving a direction vector $d^k \in \mathbb{R}^n$ and a steplength $\lambda_k \ge 0$ chosen such that

$$g(x^k) > g(x^{k+1}), \qquad k = 0, 1, \ldots \tag{33}$$
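Obviously, it will not suffice to ensure only a decrease of the value of $g$; we have to require that the decrease (33) is sufficiently large. Thus, at the $k$-th step of the methods the major tasks are the selection of a suitable direction vector $d^k$ and the construction of an appropriate steplength $\lambda_k$. The literature in this area is very extensive, see, e.g., [Kel99] and [Rhe98] for an introduction and further references.

Clearly, given a current point $x \in \Omega$, we want to use a (nonzero) direction vector $d$ such that for some $\delta > 0$ we have $g(x - td) \le g(x)$ for $t \in [0, \delta)$. From

$$\lim_{t \to 0} \frac{1}{t} \left[ g(x) - g(x - td) \right] = Dg(x)d$$

it follows that, in order for this to hold, it is sufficient that $Dg(x)d > 0$ and necessary that $Dg(x)d \ge 0$. Accordingly, we call a vector $d \ne 0$ an admissible direction of $g$ at a point $x$ if $Dg(x)d > 0$.

In accordance with the linearization methods (2) we consider now methods of the form

$$x^{k+1} = x^k - \lambda_k B_k^{-1} \nabla g(x^k), \qquad k = 0, 1, \ldots \tag{34}$$

Hence the direction vectors are here

$$d^k := B_k^{-1} \nabla g(x^k), \qquad k = 0, 1, \ldots \tag{35}$$

If the matrices $B_k$ are assumed to be symmetric, positive definite, then we have

$$Dg(x^k)d^k = Dg(x^k) B_k^{-1} \nabla g(x^k) = (\nabla g(x^k))^\top B_k^{-1} \nabla g(x^k) > 0 \quad \text{if } \nabla g(x^k) \ne 0, \tag{36}$$

that is, the directions (35) are admissible. This is the reason why in section 4 the emphasis was placed on the construction of update formulas that preserve symmetry. Actually many of these update methods also preserve positive definiteness. In particular, this holds for the BFGS formula:

5.1. With the abbreviations (24) suppose that $B_k$ is symmetric, positive definite, and that $(y^k)^\top s^k > 0$. Then $B_{k+1}$ given by (29) is also symmetric, positive definite.

Proof. By (28) we have

$$B_{k+1}^{-1} = \left( I - \frac{s^k (y^k)^\top}{(y^k)^\top s^k} \right) B_k^{-1} \left( I - \frac{y^k (s^k)^\top}{(y^k)^\top s^k} \right) + \frac{s^k (s^k)^\top}{(y^k)^\top s^k}. \tag{37}$$

Thus $B_{k+1}^{-1}$, and with it $B_{k+1}$, is symmetric. Under the stated conditions the Cauchy-Schwarz inequality for the inner product induced by $B_k$ gives

$$(z^\top B_k s^k)^2 \le ((s^k)^\top B_k s^k)(z^\top B_k z), \qquad \forall z \in \mathbb{R}^n,$$

with equality only if $z$ is a multiple of $s^k$. Moreover, it follows from (29) that

$$z^\top B_{k+1} z = \frac{(z^\top y^k)^2}{(y^k)^\top s^k} + z^\top B_k z - \frac{(z^\top B_k s^k)^2}{(s^k)^\top B_k s^k},$$

whence

$$z^\top B_{k+1} z \ge \frac{(z^\top y^k)^2}{(y^k)^\top s^k} \ge 0.$$

If $z$ is not a multiple of $s^k$ the first inequality is strict, while for $z = \alpha s^k$, $\alpha \ne 0$, we have $z^\top B_{k+1} z = \alpha^2 (y^k)^\top s^k > 0$. Hence $B_{k+1}$ is positive definite, as claimed.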
A step of a descent method of the form (34) with the BFGS update formula has now the generic form, where the sign is absorbed into the direction $d^k$:

    Compute the search direction $d^k = -H_k \nabla g(x^k)$;
    Determine suitable $\lambda_k > 0$ such that $g(x^k) - g(x^k + \lambda_k d^k) > 0$;
    $s^k = \lambda_k d^k$; $x^{k+1} = x^k + s^k$;
    $y^k = \nabla g(x^{k+1}) - \nabla g(x^k)$;
    If $(y^k)^\top s^k \le 0$ then return;
    Update $H_k$ to $H_{k+1}$ by means of the BFGS formula.

Numerous algorithms have been proposed for constructing an acceptable step $\lambda_k$. One of the simplest is the so-called Armijo rule, where we search along the line $t > 0 \mapsto x^k + t d^k$ for a point such that

$$g(x^k) - g(x^k + t d^k) > t \alpha \|\nabla g(x^k)\|^2, \tag{38}$$

where $\alpha \in (0, 1)$ is a given parameter. More specifically we use a backtracking approach and test (38) first with $t = 1$ and then with successively smaller $t = \beta^j$, $j = 0, 1, \ldots, j_{\max}$, where $0 < \beta < 1$. In other words, the algorithm has the generic form:

    input $g$, $x^k$, $d^k$, $\alpha$, $\beta$, $j_{\max}$;
    $g_k = g(x^k)$; $p = \nabla g(x^k)$; $\gamma = \alpha \|p\|^2$; $t = 1$;
    for $j = 0 : j_{\max}$
        if $g_k - g(x^k + t d^k) > t\gamma$ then return {$\lambda_k = t$};
        $t = \beta t$;
    return {failure}

For the implementation of the overall algorithm one has to decide on the storage of all needed data and on a strategy for a more effective handling of the error case $(y^k)^\top s^k \le 0$. These issues are discussed, e.g., in chapter 4 of [Kel99]. The simplest approach is to store the entire matrix $H_k$, which then allows for the computation of the update once the vectors $s^k$ and $y^k$ are available. Clearly, this is costly in storage for large dimensions. A second possibility is to store the sequences $\{s^k\}$ and $\{y^k\}$ and then to recompute recursively the matrices by means of (29) when they are needed. It turns out that with only a modest increase in complexity the required storage can be decreased to one vector per iteration step. We will not enter into the details, but refer to the discussion in section 4.2.1 of [Kel99]. There also a Matlab implementation bfgsopt involving the above Armijo algorithm is given.

Certainly the BFGS updates are not the only possible choice.
In fact, numerous other software packages that implement quasi-Newton methods for minimization problems have been written; see, e.g., [MW93].
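The generic BFGS step and the backtracking line search of this section can be combined into a short sketch. It is written in plain Python under several assumptions: $H_0 = I$, the whole matrix $H_k$ is stored, the iteration simply stops in the error case $(y^k)^\top s^k \le 0$, and the sufficient-decrease test uses the directional derivative $\nabla g(x^k)^\top d^k$, which is the common form of the Armijo rule and a deliberate substitution for $\gamma = \alpha \|p\|^2$. All function names and the test functional are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def armijo(g, x, d, slope, alpha=1e-4, beta=0.5, j_max=30):
    """Backtracking: try t = 1, beta, beta^2, ... until sufficient decrease.

    slope is the directional derivative grad g(x)^T d (negative for descent).
    Returns the accepted step length, or None on failure.
    """
    g0 = g(x)
    t = 1.0
    for _ in range(j_max + 1):
        trial = [xi + t * di for xi, di in zip(x, d)]
        if g0 - g(trial) >= -alpha * t * slope:
            return t
        t *= beta
    return None


def bfgs_minimize(g, grad, x0, tol=1e-6, k_max=100):
    """BFGS descent with the inverse update, starting from H_0 = I."""
    n = len(x0)
    x = list(x0)
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    p = grad(x)
    for _ in range(k_max):
        if math.sqrt(dot(p, p)) <= tol:
            break
        d = [-c for c in matvec(H, p)]                  # d^k = -H_k grad g(x^k)
        lam = armijo(g, x, d, dot(p, d))
        if lam is None:
            break                                       # line search failed
        s = [lam * di for di in d]
        x = [xi + si for xi, si in zip(x, s)]
        p_new = grad(x)
        y = [a - b for a, b in zip(p_new, p)]
        p = p_new
        ys = dot(y, s)
        if ys <= 0.0:
            break                                       # error case y^T s <= 0
        Hy = matvec(H, y)
        yHy = dot(y, Hy)
        # Expanded inverse BFGS update for symmetric H:
        # H - (s (Hy)^T + Hy s^T)/ys + (1 + yHy/ys) s s^T/ys
        H = [[H[i][j] - (s[i] * Hy[j] + Hy[i] * s[j]) / ys
              + (1.0 + yHy / ys) * s[i] * s[j] / ys for j in range(n)]
             for i in range(n)]
    return x


def g_example(x):
    """Hypothetical strictly convex quadratic with minimizer (10/7, -6/7)."""
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2 + x[0] * x[1]


def grad_example(x):
    return [2.0 * (x[0] - 1.0) + x[1], 4.0 * (x[1] + 0.5) + x[0]]


x_min = bfgs_minimize(g_example, grad_example, [0.0, 0.0])
```

Storing the full matrix `H` is the simplest of the storage strategies mentioned above; for large dimensions the limited-storage variants discussed in [Kel99] would be preferred.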
References

[Deu04] P. Deuflhard, Newton Methods for Nonlinear Problems, Springer Verlag, Heidelberg, New York, 2004.
[DS98] J. E. Dennis and Robert B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Classics in Applied Mathematics, Vol. 16, SIAM Publications, Philadelphia, PA. Originally published by Prentice Hall.
[Kel03] C. T. Kelley, Solving Nonlinear Equations with Newton's Method, Fundamentals of Algorithms, SIAM Publications, Philadelphia, PA, 2003.
[Kel95] C. T. Kelley, Iterative Methods for Linear and Nonlinear Equations, Frontiers in Appl. Math., Vol. 16, SIAM Publications, Philadelphia, PA, 1995.
[Kel99] C. T. Kelley, Iterative Methods for Optimization, Frontiers in Appl. Math., Vol. 18, SIAM Publications, Philadelphia, PA.
[MW93] J. J. Moré and S. J. Wright, Optimization Software Guide, SIAM Publications, Philadelphia, PA.
[OR00] J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Classics in Applied Mathematics, Vol. 30, SIAM Publications, Philadelphia, PA, 2000. Originally published by Academic Press, 1970, Russian translation 1976, Chinese translation 1982.
[Rhe98] W. C. Rheinboldt, Methods for Solving Systems of Nonlinear Equations, Regional Conf. Series in Appl. Math., Vol. 70, SIAM Publications, Philadelphia, PA.
More informationECS550NFB Introduction to Numerical Methods using Matlab Day 2
ECS550NFB Introduction to Numerical Methods using Matlab Day 2 Lukas Laffers lukas.laffers@umb.sk Department of Mathematics, University of Matej Bel June 9, 2015 Today Root-finding: find x that solves
More informationSpectral gradient projection method for solving nonlinear monotone equations
Journal of Computational and Applied Mathematics 196 (2006) 478 484 www.elsevier.com/locate/cam Spectral gradient projection method for solving nonlinear monotone equations Li Zhang, Weijun Zhou Department
More informationOptimization II: Unconstrained Multivariable
Optimization II: Unconstrained Multivariable CS 205A: Mathematical Methods for Robotics, Vision, and Graphics Doug James (and Justin Solomon) CS 205A: Mathematical Methods Optimization II: Unconstrained
More information5 Handling Constraints
5 Handling Constraints Engineering design optimization problems are very rarely unconstrained. Moreover, the constraints that appear in these problems are typically nonlinear. This motivates our interest
More informationAM 205: lecture 19. Last time: Conditions for optimality Today: Newton s method for optimization, survey of optimization methods
AM 205: lecture 19 Last time: Conditions for optimality Today: Newton s method for optimization, survey of optimization methods Optimality Conditions: Equality Constrained Case As another example of equality
More informationAM 205: lecture 19. Last time: Conditions for optimality, Newton s method for optimization Today: survey of optimization methods
AM 205: lecture 19 Last time: Conditions for optimality, Newton s method for optimization Today: survey of optimization methods Quasi-Newton Methods General form of quasi-newton methods: x k+1 = x k α