Complexity analysis of second-order algorithms based on line search for smooth nonconvex optimization

1 Complexity analysis of second-order algorithms based on line search for smooth nonconvex optimization. Clément Royer, University of Wisconsin-Madison. Joint work with Stephen J. Wright. MOPTA, Bethlehem, Pennsylvania, USA, August 17, 2017.

2 Smooth nonconvex optimization. We consider an unconstrained smooth problem: min_{x ∈ R^n} f(x). Assumptions on f: f is bounded from below; f is twice continuously differentiable; f is not convex.

3 Optimality conditions. Second-order necessary point: x* satisfies the second-order necessary conditions if ∇f(x*) = 0 and ∇²f(x*) ⪰ 0. Basic paradigm: if x is not a second-order necessary point, there exists d such that (1) dᵀ∇f(x) < 0, a gradient-type direction, and/or (2) dᵀ∇²f(x)d < 0, a negative curvature direction, specific to nonconvex problems.

5 Motivation. Example: nonconvex formulation of low-rank matrix problems. For common classes of problems: min_{U ∈ R^{n×r}, V ∈ R^{m×r}} f(U Vᵀ). Second-order necessary points are global minimizers (or close). Saddle points have negative curvature. Renewed interest: second-order necessary points of nonconvex problems. Needed: efficient algorithms.

7 Second-order complexity. Principle: for a given method and two tolerances ɛ_g, ɛ_H ∈ (0, 1), the objective is to bound the worst-case cost of reaching x_k such that ‖∇f(x_k)‖ ≤ ɛ_g and λ_k = λ_min(∇²f(x_k)) ≥ -ɛ_H. Focus: bound the dependencies on ɛ_g, ɛ_H. Definition of cost? Best rates?

9 Existing complexity results. Nonconvex optimization literature. Classical cost: number of (expensive) iterations. Best methods: Newton-type frameworks.
Algorithms and iteration bounds:
Classical trust region: O(max{ɛ_g^{-2} ɛ_H^{-1}, ɛ_H^{-3}}).
Cubic regularization, TRACE trust region: O(max{ɛ_g^{-3/2}, ɛ_H^{-3}}).

11 Existing complexity results (2). Learning/statistics community. Specific setting: ɛ_g = ɛ, ɛ_H = O(√ɛ). Best Newton-type bound: O(ɛ^{-3/2}). Gradient-based methods have cheaper iterations. Cost measure: Hessian-vector products/gradient evaluations.
Algorithms and bounds:
Gradient descent methods with random noise: Õ(ɛ^{-2}).
Accelerated gradient methods for nonconvex problems: Õ(ɛ^{-7/4}).
Õ(·) hides logarithmic factors. Results hold with high probability.

13 Our objective. Illustrate all the possible complexities (in terms of iterations, evaluations, etc., for arbitrary ɛ_g, ɛ_H, with deterministic and high-probability results)... in a single framework: based on line search; matrix-free (only requires Hessian-vector products); good complexity guarantees.

14 Outline: 1 Our algorithm; 2 Complexity analysis; 3 Inexact variants.

16 Basic framework. Parameters: x_0 ∈ R^n, θ ∈ (0, 1), η > 0, ɛ_g ∈ (0, 1), ɛ_H ∈ (0, 1). For k = 0, 1, 2, ...
1. Compute a search direction d_k.
2. Perform a backtracking line search to compute α_k = θ^{j_k} such that f(x_k + α_k d_k) < f(x_k) - (η/6) α_k³ ‖d_k‖³.
3. Set x_{k+1} = x_k + α_k d_k.
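
To fix ideas, here is a minimal Python sketch of this loop; it is not the authors' implementation, and both the objective f and the routine choose_direction (a stand-in for the direction selection described on the next slides) are assumed interfaces.

```python
# Minimal sketch of the backtracking line-search framework above (assumptions:
# `f` is the objective, `choose_direction` implements the direction selection
# of the next slides and reports when an (eps_g, eps_H)-point is reached).
import numpy as np

def line_search_method(f, choose_direction, x0, theta=0.5, eta=0.1,
                       eps_g=1e-4, eps_H=1e-2, max_iter=1000):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        d, done = choose_direction(x, eps_g, eps_H)   # Step 1
        if done:
            return x
        # Step 2: backtrack until f(x + alpha d) < f(x) - (eta/6) alpha^3 ||d||^3.
        fx, nd3, alpha = f(x), np.linalg.norm(d) ** 3, 1.0
        while f(x + alpha * d) >= fx - (eta / 6.0) * alpha ** 3 * nd3:
            alpha *= theta
        x = x + alpha * d                             # Step 3
    return x
```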

17 Selecting the search direction d_k. Step 1: use gradient-related information. Compute g_k = ∇f(x_k) and R_k = g_kᵀ ∇²f(x_k) g_k / ‖g_k‖².
If R_k < -ɛ_H, set d_k = (R_k / ‖g_k‖) g_k.
Else if R_k ∈ [-ɛ_H, ɛ_H] and ‖g_k‖ > ɛ_g, set d_k = -g_k / ‖g_k‖^{1/2}.
Otherwise, perform Step 2.

18 Selecting the search direction d_k (2). Step 2: use eigenvalue information. Compute an eigenpair (v_k, λ_k) such that λ_k = λ_min(∇²f(x_k)), ∇²f(x_k) v_k = λ_k v_k, v_kᵀ g_k ≤ 0, ‖v_k‖ = 1.
Case λ_k < -ɛ_H: d_k = -λ_k v_k;
Case λ_k > ɛ_H (Newton step): d_k = d_k^n, where ∇²f(x_k) d_k^n = -g_k;
Case λ_k ∈ [-ɛ_H, ɛ_H] (regularized Newton step): d_k = d_k^r, where (∇²f(x_k) + 2ɛ_H I) d_k^r = -g_k.
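
The case analysis of Steps 1 and 2 can be sketched as follows for small dense problems, assuming callables grad and hess for ∇f and ∇²f; this is only an illustration, with a simplified stopping test (the talk's criterion, stated later, also looks at g_{k+1}). The matrix-free variant replaces the eigendecomposition and the linear solves with Lanczos and CG (see the inexact variants below).

```python
# Sketch of the exact direction selection (Steps 1 and 2); `grad` and `hess`
# are assumed callables returning the gradient vector and the Hessian matrix.
import numpy as np

def choose_direction(x, eps_g, eps_H, grad, hess):
    g, H = grad(x), hess(x)
    gnorm = np.linalg.norm(g)
    # Step 1: curvature of f along the gradient direction.
    if gnorm > 0:
        R = g @ H @ g / gnorm ** 2
        if R < -eps_H:                            # enough negative curvature along g
            return (R / gnorm) * g, False
        if abs(R) <= eps_H and gnorm > eps_g:     # gradient step scaled by ||g||^(1/2)
            return -g / np.sqrt(gnorm), False
    # Step 2: leftmost eigenpair of the Hessian.
    lam, V = np.linalg.eigh(H)
    lam_min, v = lam[0], V[:, 0]
    if v @ g > 0:                                 # sign convention: v^T g <= 0
        v = -v
    if lam_min < -eps_H:                          # negative curvature direction
        return -lam_min * v, False
    if gnorm <= eps_g:                            # simplified (eps_g, eps_H)-point test
        return np.zeros_like(g), True
    if lam_min > eps_H:                           # Newton step
        return np.linalg.solve(H, -g), False
    # lam_min in [-eps_H, eps_H]: regularized Newton step
    return np.linalg.solve(H + 2.0 * eps_H * np.eye(g.size), -g), False
```

Plugged into the loop above through a small wrapper, e.g. lambda x, eg, eH: choose_direction(x, eg, eH, grad, hess), this reproduces the exact variant on small problems.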

19 Outline: 1 Our algorithm; 2 Complexity analysis; 3 Inexact variants.

20 Assumptions and notation. Assumptions: L_f(x_0) = {x : f(x) ≤ f(x_0)} is compact; f is twice continuously differentiable on an open set containing L_f(x_0), with Lipschitz continuous Hessian. Notation: L_H is the Lipschitz constant of ∇²f; f_low is a lower bound on {f(x_k)}; U_H is an upper bound on ‖∇²f(x_k)‖.

22 Criterion. Approximate solution: x_k is an (ɛ_g, ɛ_H)-point if min{‖g_k‖, ‖g_{k+1}‖} ≤ ɛ_g and λ_k ≥ -ɛ_H.
Other possibilities:
Remove gradient directions and use only ‖g_{k+1}‖: no cheap gradient steps.
Add a stopping criterion and use only ‖g_k‖: no global/local convergence.

25 Analysis of the method. Key principle: bound the decrease produced at every step while an (ɛ_g, ɛ_H)-point has not been reached.
Five possible directions: two ways of scaling g_k (by its (negative) curvature; by its norm); negative eigenvector; Newton step; regularized Newton step.
One proof technique, typical of backtracking line search: if the unit step is accepted, guaranteed decrease; otherwise, a lower bound on the accepted step size.

27 Example: when d_k = -g_k/‖g_k‖^{1/2}. In that case g_kᵀ ∇²f(x_k) g_k / ‖g_k‖² ∈ [-ɛ_H, ɛ_H] and ‖g_k‖ > ɛ_g.
Unit step accepted: f(x_k) - f(x_{k+1}) ≥ (η/6) ‖d_k‖³ ≥ (η/6) ɛ_g^{3/2}.
Unit step rejected: by Taylor expansion, there exists an accepted step α_k = θ^{j_k} with θ^{j_k} ≥ θ min{ √(3/(L_H + η)), ɛ_g^{1/2} ɛ_H^{-1} }. So the line search terminates and f(x_k) - f(x_{k+1}) ≥ (η/6) α_k³ ‖d_k‖³, which is of order ɛ_g³ ɛ_H^{-3}.
Final decrease: f(x_k) - f(x_{k+1}) ≥ c_g min{ ɛ_g³ ɛ_H^{-3}, ɛ_g^{3/2} }.

28 Decrease bound. General decrease lemma: if at the k-th iteration an (ɛ_g, ɛ_H)-point has not been reached, then f(x_k) - f(x_{k+1}) ≥ c min{ ɛ_g^{3/2}, ɛ_H³, ɛ_g³ ɛ_H^{-3}, ϕ(ɛ_g, ɛ_H)³ }, where ϕ(ɛ_g, ɛ_H) = L_H^{-1} ɛ_H (-2 + √(4 + 2 L_H ɛ_g / ɛ_H²)) and c depends only on L_H, η, θ.

29 Iteration complexity. Iteration complexity bound: the method reaches an (ɛ_g, ɛ_H)-point in at most ((f_0 - f_low)/c) max{ ɛ_g^{-3/2}, ɛ_H^{-3}, ɛ_g^{-3} ɛ_H³, ϕ(ɛ_g, ɛ_H)^{-3} } iterations.
Specific rates: ɛ_g = ɛ, ɛ_H = √ɛ gives O(ɛ^{-3/2}); ɛ_g = ɛ_H = ɛ gives O(ɛ^{-3}). These are the optimal bounds for Newton-type methods.
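
As a quick numerical sanity check of these rates, the bound can be evaluated directly; the constants below (L_H, c, and the gap f_0 - f_low) are placeholder values, not values from the talk.

```python
# Evaluate the iteration bound above for a few tolerance choices; L_H, c and
# the gap f_0 - f_low are placeholder values chosen only for illustration.
import math

def phi(eps_g, eps_H, L_H):
    return (eps_H / L_H) * (-2.0 + math.sqrt(4.0 + 2.0 * L_H * eps_g / eps_H ** 2))

def iteration_bound(eps_g, eps_H, L_H=1.0, c=1.0, gap=1.0):
    return (gap / c) * max(eps_g ** -1.5, eps_H ** -3,
                           eps_g ** -3 * eps_H ** 3,
                           phi(eps_g, eps_H, L_H) ** -3)

for eps in (1e-2, 1e-4):
    # eps_H = sqrt(eps) scales like eps^(-3/2); eps_H = eps scales like eps^(-3).
    print(eps, iteration_bound(eps, math.sqrt(eps)), iteration_bound(eps, eps))
```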

31 Function evaluation complexity. #Iterations = #gradient/#Hessian evaluations; #iterations ≤ #function evaluations.
Line-search iterations: if x_k is not an (ɛ_g, ɛ_H)-point, the line search takes at most O(log_θ(min{ ɛ_g^{1/2} ɛ_H^{-1}, ɛ_H² })) iterations.
Evaluation complexity bound: the method reaches an (ɛ_g, ɛ_H)-point in at most Õ(max{ ɛ_g^{-3/2}, ɛ_H^{-3}, ɛ_g^{-3} ɛ_H³, ϕ(ɛ_g, ɛ_H)^{-3} }) function evaluations.

32 Outline: 1 Our algorithm; 2 Complexity analysis; 3 Inexact variants.

33 Motivation. Algorithmic cost: the method should be matrix-free, yet it uses matrix-related operations (linear system solves; eigenvalue/eigenvector computations). Inexactness: perform these matrix operations inexactly. Main cost unit: matrix-vector products/gradient evaluations.

35 Conjugate gradient for linear systems. We solve systems of the form H d = -g, with H ⪰ ɛ_H I.
Conjugate gradient (CG): we apply the conjugate gradient algorithm with stopping criterion ‖H d + g‖ ≤ (ξ/2) min{ ‖g‖, ɛ_H ‖d‖ }, ξ ∈ (0, 1). If κ = λ_max(H)/λ_min(H), the CG method finds such a vector in at most min{ n, (√κ/2) log(4κ^{5/2}/ξ) } = min{ n, O(√κ log(κ/ξ)) } matrix-vector products.
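
A sketch of CG with this relative stopping test, written against an assumed Hessian-vector product oracle hv, might look as follows; it is not the authors' code and it presumes H ⪰ ɛ_H I as stated above.

```python
# CG sketch for H d = -g with the capped relative residual test above;
# `hv(p)` is assumed to return H p, and H is assumed to satisfy H >= eps_H I.
import numpy as np

def cg_solve(hv, g, eps_H, xi=0.5, max_iter=None):
    n = g.shape[0]
    max_iter = n if max_iter is None else max_iter
    d = np.zeros(n)
    r = g.copy()                      # residual H d + g (equals g at d = 0)
    p = -r
    for _ in range(max_iter):
        if np.linalg.norm(r) <= 0.5 * xi * min(np.linalg.norm(g),
                                               eps_H * np.linalg.norm(d)):
            break                     # ||H d + g|| <= (xi/2) min(||g||, eps_H ||d||)
        Hp = hv(p)
        alpha = (r @ r) / (p @ Hp)
        d = d + alpha * p
        r_new = r + alpha * Hp
        p = -r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return d
```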

37 Lanczos for eigenvalue computation. Lanczos method to compute a minimum eigenvector. It can fail if started deterministically: use a random start. Standard results hold for matrices A ⪰ 0: change the Hessian (apply Lanczos to U_H I - ∇²f(x_k)).
Lanczos iterations: let H ∈ R^{n×n} be symmetric with ‖H‖ ≤ U_H, ɛ > 0, δ ∈ (0, 1). With probability at least 1 - δ, the Lanczos procedure applied to U_H I - H outputs a vector v such that vᵀ H v ≤ λ_min(H) + ɛ in at most min{ n, (ln(n/δ²)/(2√2)) √(U_H/ɛ) } iterations/matrix-vector products.
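
The randomized-Lanczos estimate can be sketched as below: run Lanczos on the shifted matrix U_H I - H from a random start and read off an approximate leftmost eigenpair of H. This is a bare-bones illustration (no reorthogonalization), with hv an assumed Hessian-vector product oracle.

```python
# Bare-bones randomized Lanczos on U_H*I - H to estimate the leftmost
# eigenpair of H; `hv(x)` is assumed to return H x. No reorthogonalization.
import numpy as np

def min_eig_estimate(hv, n, U_H, num_iter, seed=None):
    rng = np.random.default_rng(seed)
    q = rng.standard_normal(n)
    Q, alphas, betas = [q / np.linalg.norm(q)], [], []
    for j in range(min(num_iter, n)):
        w = U_H * Q[-1] - hv(Q[-1])           # apply the shifted matrix U_H I - H
        alphas.append(Q[-1] @ w)
        w -= alphas[-1] * Q[-1]
        if j > 0:
            w -= betas[-1] * Q[-2]
        beta = np.linalg.norm(w)
        if beta < 1e-12:                      # exact invariant subspace found
            break
        betas.append(beta)
        Q.append(w / beta)
    m = len(alphas)
    T = np.diag(alphas) + np.diag(betas[:m - 1], 1) + np.diag(betas[:m - 1], -1)
    _, vecs = np.linalg.eigh(T)
    v = np.column_stack(Q[:m]) @ vecs[:, -1]  # top eigvec of U_H I - H ~ leftmost of H
    v /= np.linalg.norm(v)
    return v @ hv(v), v                       # Rayleigh quotient v^T H v and the vector v
```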

38 Selecting the search direction d_k - inexact version. Step 1: use gradient-related information. Compute g_k = ∇f(x_k) and R_k = g_kᵀ ∇²f(x_k) g_k / ‖g_k‖².
If R_k < -ɛ_H, set d_k = (R_k / ‖g_k‖) g_k.
Else if R_k ∈ [-ɛ_H, ɛ_H] and ‖g_k‖ ≥ ɛ_g, set d_k = -g_k / ‖g_k‖^{1/2}.
Otherwise, perform the inexact Step 2.

39 Selecting the direction d_k - inexact version (2). Inexact Step 2: use (inexact) eigenvalue information. Compute an approximate eigenpair (v_k^i, λ_k^i) such that, with probability 1 - δ, λ_k^i = [v_k^i]ᵀ ∇²f(x_k) v_k^i ≤ λ_k + ɛ_H/2, [v_k^i]ᵀ g_k ≤ 0, ‖v_k^i‖ = 1.
Case λ_k^i < -ɛ_H/2: d_k = v_k^i;
Case λ_k^i > (3/2) ɛ_H (inexact Newton): use CG to obtain d_k = d_k^in with ‖∇²f(x_k) d_k^in + g_k‖ ≤ (ξ/2) min{ ‖g_k‖, ɛ_H ‖d_k^in‖ };
Case λ_k^i ∈ [-ɛ_H/2, (3/2) ɛ_H] (inexact regularized Newton): use CG to obtain d_k = d_k^ir with ‖(∇²f(x_k) + 2ɛ_H I) d_k^ir + g_k‖ ≤ (ξ/2) min{ ‖g_k‖, ɛ_H ‖d_k^ir‖ }.
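
Combining the two sketches above (cg_solve and min_eig_estimate, both assumed interfaces rather than the authors' code), the inexact Step 2 can be illustrated as follows.

```python
# Sketch of the inexact Step 2, reusing `cg_solve` and `min_eig_estimate`
# from the previous sketches; `hv(d)` returns the Hessian-vector product at x_k.
import numpy as np

def inexact_step2(hv, g, eps_H, U_H, lanczos_iters, xi=0.5):
    lam_i, v = min_eig_estimate(hv, g.shape[0], U_H, lanczos_iters)
    if v @ g > 0:                               # enforce the convention v^T g <= 0
        v = -v
    if lam_i < -0.5 * eps_H:                    # approximate negative curvature step
        return v
    if lam_i > 1.5 * eps_H:                     # inexact Newton step via CG
        return cg_solve(hv, g, eps_H, xi)
    # lam_i in [-eps_H/2, 3*eps_H/2]: inexact regularized Newton step via CG
    return cg_solve(lambda d: hv(d) + 2.0 * eps_H * d, g, eps_H, xi)
```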

40 Complexity analysis of the inexact method. Identical reasoning: 5 steps, 1 proof. Using Lanczos with a random start, the negative curvature decrease only holds with probability 1 - δ. With CG, the inexact Newton and regularized Newton steps give slightly different formulas.
Decrease lemma: for any iteration k, if x_k is not an (ɛ_g, ɛ_H)-point, then f(x_k) - f(x_{k+1}) ≥ ĉ min{ ɛ_g³ ɛ_H^{-3}, ɛ_g^{3/2}, ɛ_H³, ϕ(ɛ_g, (ξ/2) ɛ_H)³, ϕ(ɛ_g, (2/(4+ξ)) ɛ_H)³ } with probability at least 1 - δ, where ĉ depends only on L_H, η, θ.

41 Complexity results. Iteration complexity: an (ɛ_g, ɛ_H)-point is reached in at most K̂ := ((f_0 - f_low)/ĉ) max{ ɛ_g^{-3} ɛ_H³, ɛ_g^{-3/2}, ɛ_H^{-3}, ϕ(ɛ_g, (ξ/2) ɛ_H)^{-3}, ϕ(ɛ_g, (2/(4+ξ)) ɛ_H)^{-3} } iterations, with probability at least 1 - K̂ δ.
Cost complexity: the number of Hessian-vector products or gradient evaluations needed to reach an (ɛ_g, ɛ_H)-point is at most min{ n, O(U_H^{1/2} ɛ_H^{-1/2} log(ɛ_H^{-1}/ξ)), O(U_H^{1/2} ɛ_H^{-1/2} log(n/δ²)) } × K̂, with probability at least 1 - K̂ δ.

43 Complexity results (simplified). Setting: ɛ_g = ɛ, ɛ_H = √ɛ. An (ɛ, √ɛ)-point is reached in at most O(ɛ^{-3/2}) iterations and Õ(ɛ^{-7/4}) Hessian-vector products/gradient evaluations, with probability at least 1 - O(ɛ^{-3/2} δ).
Setting δ = 0 gives results that hold with probability 1: iterations O(ɛ^{-3/2}); Hessian-vector products/gradients O(n ɛ^{-3/2}).

44 Summary. Our proposal: a class of second-order line-search methods with the best known complexity guarantees, featuring gradient steps and inexactness, and implementable matrix-free.
For more details: Complexity analysis of second-order line-search algorithms for smooth nonconvex optimization, C. W. Royer and S. J. Wright, arXiv preprint. It also contains local convergence results.

46 Follow-up. Perspectives: numerical testing of our class of methods; extension to constrained problems. Thank you for your attention!
