Mesures de criticalité d'ordres 1 et 2 en recherche directe (From first- to second-order criticality measures in direct search)


Slide 1: Title
From first- to second-order criticality measures in direct search
Clément Royer, ENSEEIHT-IRIT, Toulouse, France
Co-authors: S. Gratton, L. N. Vicente
Journées du GDR MOA, 02/12/15

Slide 2: Outline
1. A problem: solving nonconvex problems via second-order methods
2. A context: direct-search methods
3. From first- to second-order polling
4. Second-order analysis and numerical behaviour

Slide 3: Introduction
We are interested in solving an unconstrained optimization problem:
    $\min_{x \in \mathbb{R}^n} f(x)$.
The objective function f is:
- bounded from below and $C^2$, with $\nabla f$ and $\nabla^2 f$ Lipschitz continuous;
- nonconvex: the Hessian matrix is not always positive semidefinite.

Slide 4: Caring about second order
Our definition of a second-order method: an optimization algorithm that exploits the (negative) curvature information contained in the Hessian matrix, to ensure second-order convergence.
Second-order tools for the analysis:
- Taylor expansion:
    $f(x+s) \le f(x) + \nabla f(x)^\top s + \tfrac{1}{2}\, s^\top \nabla^2 f(x)\, s + \tfrac{L_{\nabla^2 f}}{6} \|s\|^3$;
- directional derivative estimate:
    $f(x+s) - 2 f(x) + f(x-s) = s^\top \nabla^2 f(x)\, s + O(\|s\|^3)$.
(A small numerical illustration of the second estimate follows.)
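The directional derivative estimate above can be checked numerically: the sketch below (not from the talk; the helper name curvature_estimate is ours) recovers the Rayleigh quotient of the Hessian along s from three function values only.

    import numpy as np

    def curvature_estimate(f, x, s):
        # Central second difference: f(x+s) - 2 f(x) + f(x-s) equals
        # s^T (Hessian at x) s + O(||s||^3); dividing by ||s||^2 gives
        # the Rayleigh quotient of the Hessian along s.
        return (f(x + s) - 2.0 * f(x) + f(x - s)) / np.dot(s, s)

    # Example: f(x) = x1^2 - x2^2 has curvature -2 along the second axis.
    f = lambda x: x[0]**2 - x[1]**2
    s = 1e-3 * np.array([0.0, 1.0])
    print(curvature_estimate(f, np.zeros(2), s))  # approximately -2.0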

Slide 5: Second-order derivative-based optimization
- Early treatment in trust-region and (curvilinear) line-search methods;
- negative curvature is seldom handled so as to provide second-order convergence guarantees;
- renewed interest with the rise of cubic models: Curtis et al. '13, '14, '15; Wong, ISMP '15.
Main issues:
- the cost of computing negative curvature directions;
- dissociating the contributions from orders 1 and 2;
- no natural scaling between $\|\nabla f(x)\|$ and $\lambda_{\min}(\nabla^2 f(x))$.

Slide 6: Outline
1. A problem: solving nonconvex problems via second-order methods
2. A context: direct-search methods
3. From first- to second-order polling
4. Second-order analysis and numerical behaviour

Slides 7-8: Solving the problem without using the derivatives
We consider a setting in which derivatives of f are unavailable or too expensive to compute.
Derivative-Free Optimization (DFO) methods:
- do not use derivatives within the algorithm;
- fall into two main classes: model-based methods and direct-search methods.
Reference: Introduction to Derivative-Free Optimization, A. R. Conn, K. Scheinberg, L. N. Vicente (2009).

Slides 9-10: A simple direct-search framework
1. Initialization: set $x_0$, $\alpha_0 > 0$, $0 < \theta < 1 \le \gamma$. Set $k = 0$.
2. Poll step: choose a polling set of unitary direction vectors. If there exists $d_k$ within the set such that
       $f(x_k + \alpha_k d_k) - f(x_k) < -\alpha_k^3$,
   then set $x_{k+1} := x_k + \alpha_k d_k$ and $\alpha_{k+1} := \gamma \alpha_k$. Otherwise, set $x_{k+1} := x_k$ and $\alpha_{k+1} := \theta \alpha_k$.
3. Set $k = k + 1$ and go back to the poll step.
Remarks:
- the performance criterion is the number of evaluations of f;
- the theoretical properties mainly depend on the polling choices.
(A sketch of this loop in code follows.)
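As a concrete illustration, here is a minimal, hypothetical Python sketch of this polling loop, using the coordinate positive spanning set [I, -I] and illustrative parameter values; any implementation accompanying the talk may differ.

    import numpy as np

    def direct_search(f, x0, alpha0=1.0, theta=0.5, gamma=2.0, max_iter=500):
        # Minimal sketch: sufficient decrease test f(x+alpha d) - f(x) < -alpha^3,
        # step-size expansion on success, contraction otherwise.
        x = np.asarray(x0, dtype=float)
        n, alpha = x.size, alpha0
        D = np.vstack([np.eye(n), -np.eye(n)])  # unitary polling directions
        for _ in range(max_iter):
            fx = f(x)
            for d in D:
                if f(x + alpha * d) - fx < -alpha**3:  # successful poll
                    x, alpha = x + alpha * d, gamma * alpha
                    break
            else:  # no direction gave sufficient decrease
                alpha = theta * alpha
        return x

    # Example on a smooth nonconvex function:
    f = lambda x: (x[0]**2 - 1.0)**2 + x[1]**2
    print(direct_search(f, np.array([0.2, 0.7])))  # approaches (+-1, 0)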

Slide 11: Order 2 in derivative-free methods
- Few practical methods explicitly deal with nonconvexity;
- for direct search, most results are due to Abramson et al. ('05, '06, '14).
Issues with the existing direct-search approaches:
- they study properties of (unknown) convergent subsequences;
- they rely on density assumptions and on direction sets that change from one iteration to another.
Our objective is to develop a method that exploits second-order properties at the iteration level.

Slide 12: Outline
1. A problem: solving nonconvex problems via second-order methods
2. A context: direct-search methods
3. From first- to second-order polling
4. Second-order analysis and numerical behaviour

Slide 13: Back to the direct-search method
(The framework of slides 9-10 is recalled.) How can we define rules to choose the polling sets?

Slides 14-16: First-order polling quality
- Typical direct-search methods ensure first-order convergence;
- the polling sets must provide good approximations of the negative gradient.
A measure of first-order quality: let D be a set of unitary vectors and $v \in \mathbb{R}^n \setminus \{0\}$. Then
    $\mathrm{cm}(D, v) = \max_{d \in D} \frac{d^\top v}{\|d\|\,\|v\|}$
is called the cosine measure of D at v. If $\mathrm{cm}(D, -\nabla f(x)) > 0$, then D contains a descent direction of f at x. (A direct implementation follows.)
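A direct implementation of the cosine measure as just defined (the helper name cosine_measure_at is ours):

    import numpy as np

    def cosine_measure_at(D, v):
        # cm(D, v) = max over d in D of d^T v / (||d|| ||v||); positive iff
        # some direction in D makes an acute angle with v.
        v = np.asarray(v, dtype=float)
        return max(np.dot(d, v) / (np.linalg.norm(d) * np.linalg.norm(v))
                   for d in D)

    # {e1, e2} is not a PSS in R^2: no descent direction for v = (-1, -1).
    D = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
    print(cosine_measure_at(D, np.array([-1.0, -1.0])))  # negative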

Slides 17-18: Usual polling choice
Positive spanning sets (PSS): D is a PSS if it generates $\mathbb{R}^n$ by nonnegative linear combinations.
- D is a PSS iff $\mathrm{cm}(D, v) > 0$ for every $v \ne 0$;
- a PSS contains at least $n + 1$ vectors;
- example: the coordinate set $D = [I\ {-I}]$.
PSS and first-order convergence rest on two main ideas:
- use the Taylor expansion $f(x + \alpha d) \le f(x) + \alpha \nabla f(x)^\top d + \tfrac{L_{\nabla f}}{2} \alpha^2$;
- assume that for every iteration, $\mathrm{cm}(D_k, -\nabla f(x_k)) \ge \kappa$ with $\kappa \in (0, 1)$.
(An empirical check of the coordinate set's cosine measure follows.)
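A quick empirical check of the known value $\kappa = 1/\sqrt{n}$ for the coordinate set $[I\ {-I}]$ (this value reappears in the corollary of slide 30); the experiment below is ours, not from the talk:

    import numpy as np

    # For the coordinate PSS [I, -I] and any v != 0,
    # cm([I,-I], v) = max_i |v_i| / ||v||, which is always >= 1/sqrt(n).
    n = 5
    rng = np.random.default_rng(0)
    samples = rng.standard_normal((100000, n))
    cms = np.max(np.abs(samples), axis=1) / np.linalg.norm(samples, axis=1)
    print(cms.min(), 1 / np.sqrt(n))  # observed minimum stays >= 1/sqrt(n)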

Slides 19-20: First-order results
First-order polling strategy: poll along a positive spanning set $D_k$.
Convergence arguments:
- independently of $D_k$, $\alpha_k \to 0$;
- on unsuccessful iterations, $\alpha_k \ge O(\kappa\, \|\nabla f(x_k)\|)$.
Theorem (first-order convergence): $\liminf_{k \to \infty} \|\nabla f(x_k)\| = 0$.

Slide 21: A second-order criticality measure
Definition: given a set of unitary vectors D and a symmetric matrix A, the Rayleigh measure of D with respect to A is defined by
    $\mathrm{rm}(D, A) = \min_{d \in V(D)} d^\top A\, d$,
where $V(D) = \{d \in D : -d \in D\}$ is the symmetric part of D.
- The Rayleigh measure is an approximation of the minimum eigenvalue of A;
- we want this approximation to be sufficiently good. (A direct implementation follows.)
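A direct implementation of the Rayleigh measure as just defined (helper name ours); the example shows that for $D = [I\ {-I}]$ it returns the smallest diagonal entry of A, an upper approximation of $\lambda_{\min}(A)$:

    import numpy as np

    def rayleigh_measure(D, A):
        # rm(D, A) = min over the symmetric part V(D) = {d in D : -d in D}
        # of the Rayleigh quotients d^T A d (unitary directions assumed).
        D = [np.asarray(d, dtype=float) for d in D]
        V = [d for d in D if any(np.allclose(-d, e) for e in D)]
        return min(d @ A @ d for d in V)

    A = np.array([[2.0, 1.0], [1.0, -1.0]])
    D = [np.array([1.0, 0.0]), np.array([0.0, 1.0]),
         np.array([-1.0, 0.0]), np.array([0.0, -1.0])]
    # rm = -1.0 (smallest diagonal entry), while lambda_min(A) ~ -1.303.
    print(rayleigh_measure(D, A), np.linalg.eigvalsh(A).min())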

Slide 22: Rayleigh measure and negative curvature
In derivative-based methods, if $\lambda_{\min}(\nabla^2 f(x_k)) < 0$, one uses a direction of sufficient negative curvature:
    $d^\top \nabla^2 f(x_k)\, d \le \beta\, \lambda_{\min}(\nabla^2 f(x_k))$, with $\beta \in (0, 1]$.
In a direct-search environment:
- derivative-free: Hessian eigenvalues cannot be computed;
- direct search: the step size goes to zero.
We will instead be ensuring
    $\mathrm{rm}(D_k, \nabla^2 f(x_k)) \le \beta\, \lambda_{\min}(\nabla^2 f(x_k)) + O(\alpha_k)$.

Slides 23-25: A second-order polling strategy for direct search
Second-order polling rules:
1. Poll along a PSS $D_k$ (first-order rule);
2. poll along $-D_k$;
3. select a basis $B_k \subset D_k$ and build an approximated Hessian $H_k \approx B_k^\top \nabla^2 f(x_k)\, B_k$ using function values;
4. compute a unitary vector $v_k$ such that $H_k v_k = \lambda_{\min}(H_k)\, v_k$; poll along $v_k$ and $-v_k$.
The cost of an iteration is at most $O(n^2)$ evaluations. The polling stops as soon as it encounters a direction d such that $f(x_k + \alpha_k d) - f(x_k) < -\alpha_k^3$. (A finite-difference sketch of steps 3-4 follows.)
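A hypothetical finite-difference sketch of steps 3-4, assuming a standard central/forward stencil to build $H_k$ from $O(n^2)$ function values; the exact construction in the paper may differ:

    import numpy as np

    def min_curvature_direction(f, x, B, alpha):
        # Build H, an approximation of B^T Hessian(x) B, from function
        # values only, then return a unit eigenvector for its smallest
        # eigenvalue. Columns of B form the basis B_k.
        n = B.shape[1]
        fx = f(x)
        fp = [f(x + alpha * B[:, i]) for i in range(n)]
        H = np.empty((n, n))
        for i in range(n):
            H[i, i] = (fp[i] - 2.0 * fx + f(x - alpha * B[:, i])) / alpha**2
            for j in range(i):
                fij = f(x + alpha * (B[:, i] + B[:, j]))
                H[i, j] = H[j, i] = (fij - fp[i] - fp[j] + fx) / alpha**2
        _, V = np.linalg.eigh(H)  # ascending eigenvalues, orthonormal columns
        return V[:, 0]            # unit v_k with H v_k = lambda_min(H) v_k

    # Example: f has negative curvature along the second coordinate at 0.
    f = lambda x: x[0]**2 - 0.5 * x[1]**2
    print(min_curvature_direction(f, np.zeros(2), np.eye(2), 1e-3))  # ~(0, +-1)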

Slide 26: Outline
1. A problem: solving nonconvex problems via second-order methods
2. A context: direct-search methods
3. From first- to second-order polling
4. Second-order analysis and numerical behaviour

Slide 27: Second-order convergence
Assumptions:
- the $D_k$'s are PSS with $\mathrm{cm}(D_k, -\nabla f(x_k)) \ge \kappa > 0$ for all k;
- there exists $\sigma \in (0, 1]$ such that $\sigma_{\min}(B_k)^2 \ge \sigma > 0$ for all k.
Minimum eigenvalue estimate: let k be an unsuccessful iteration and $P_k$ the corresponding polling set; then
    $\mathrm{rm}(P_k, \nabla^2 f(x_k)) \le v_k^\top \nabla^2 f(x_k)\, v_k \le \sigma\, \lambda_{\min}(\nabla^2 f(x_k)) + O(n\, \alpha_k)$.
The factors $\sigma$ and n are due to the approximation error.

Slides 28-29: Second-order convergence (2)
Convergence arguments:
- as before, $\alpha_k \to 0$;
- on an unsuccessful iteration k, one has
    $\alpha_k \ge \max\{\, O(\kappa\, \|\nabla f(x_k)\|),\; O(\sigma\, n^{-1}\, |\lambda_{\min}(\nabla^2 f(x_k))|)\,\}$.
Theorem (second-order convergence):
    $\liminf_{k \to \infty} \max\{\, \|\nabla f(x_k)\|,\; -\lambda_{\min}(\nabla^2 f(x_k))\,\} = 0$.

Slide 30: Second-order worst-case complexity
We aim to reach an $(\epsilon_g, \epsilon_H)$-second-order critical point, i.e., a point where
    $\|\nabla f(x_k)\| < \epsilon_g$ and $\lambda_{\min}(\nabla^2 f(x_k)) > -\epsilon_H$.
Theorem: let $N_{\epsilon_g \epsilon_H}$ be the number of evaluations of f needed to reach an $(\epsilon_g, \epsilon_H)$-second-order critical point; then
    $N_{\epsilon_g \epsilon_H} \le O\!\left( n^2 \max\{\, \kappa^{-3} \epsilon_g^{-3},\; \sigma^{-3} n^{3} \epsilon_H^{-3} \,\} \right)$.
Corollary: choosing $D_k = [I\ {-I}]$ yields $\kappa = 1/\sqrt{n}$ and $\sigma = 1$, and the complexity bound becomes
    $O\!\left( n^{5} \max\{\, \epsilon_g^{-3},\; \epsilon_H^{-3} \,\} \right)$.
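As a sanity check of the corollary (our computation, not on the slide), substitute $\kappa = 1/\sqrt{n}$ and $\sigma = 1$ into the theorem's bound:

    \[
      N_{\epsilon_g \epsilon_H}
        \le O\!\left( n^2 \max\{\, \kappa^{-3}\epsilon_g^{-3},\; \sigma^{-3} n^{3}\epsilon_H^{-3} \,\} \right)
        =   O\!\left( \max\{\, n^{7/2}\,\epsilon_g^{-3},\; n^{5}\,\epsilon_H^{-3} \,\} \right)
        \le O\!\left( n^{5} \max\{\, \epsilon_g^{-3},\; \epsilon_H^{-3} \,\} \right),
    \]

since $\kappa^{-3} = (1/\sqrt{n})^{-3} = n^{3/2}$ and $n^2 \cdot n^{3/2} = n^{7/2} \le n^{5}$.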

Slide 31: Practical insights
On 60 CUTEst problems with negative curvature:
- using symmetric sets generally improves the performance;
- the second-order rules (solid lines in the profiles shown on the slide) allow more problems to be solved.

Slides 32-33: Conclusion
Our contributions:
- the definition of a second-order criticality measure;
- a second-order direct-search method that converges with respect to this measure, together with its associated complexity bound;
- numerical confirmation of the theoretical findings.
For more information: A second-order globally convergent direct-search method and its worst-case complexity. S. Gratton, C. W. Royer, L. N. Vicente. To appear in Optimization.

Slide 34: Towards randomization
- Guaranteeing $P(\mathrm{cm}(D_k, -\nabla f(x_k)) > \kappa) \ge p > 0$ is sufficient for first-order convergence, and we can achieve it in practice (Gratton, Royer, Vicente and Zhang '14);
- can we do the same with second-order properties? (A small sampling sketch follows.)
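A small, hypothetical sampling experiment in the spirit of that result (the values of m and kappa below are illustrative, not from the reference): estimate $p = P(\mathrm{cm}(D_k, v) \ge \kappa)$ when the polling set consists of m independent uniform directions on the unit sphere.

    import numpy as np

    n, m, kappa, trials = 10, 4, 0.1, 10000
    rng = np.random.default_rng(1)
    v = rng.standard_normal(n); v /= np.linalg.norm(v)  # stand-in for -grad f
    hits = 0
    for _ in range(trials):
        D = rng.standard_normal((m, n))
        D /= np.linalg.norm(D, axis=1, keepdims=True)  # unit directions
        hits += (D @ v).max() >= kappa                 # cm(D, v) >= kappa?
    print(hits / trials)  # empirical estimate of p > 0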

Slide 35: Thank you!
