A Second-Order Path-Following Algorithm for Unconstrained Convex Optimization


Yinyu Ye
Department of Management Science & Engineering and Institute for Computational & Mathematical Engineering, Stanford University

May 2017

Abstract. We present more details of the minimal-norm path-following algorithm for unconstrained smooth convex optimization described in the lecture notes of CME307 and MS&E311 [10].

Iterative optimization algorithms have been modeled as following certain paths to the optimal solution set, such as the central path of linear programming (e.g., [1, 3]) and, more recently, the trajectory of accelerated gradients of Nesterov (e.g., [8, 9]). Moreover, there is an interest in accelerating and globalizing Newton's, or second-order, methods for unconstrained smooth convex optimization, e.g., [5].

Let f(x), x ∈ R^n, be any smooth convex function with continuous second-order derivatives that meets the following local Lipschitz condition: for any point x in the function domain and a constant β ≥ 1,

    ‖∇f(x + d) − ∇f(x) − ∇²f(x)d‖ ≤ β dᵀ∇²f(x)d   whenever ‖d‖ ≤ O(1)    (1)

and x + d is in the function domain. Here ‖·‖ denotes the L2 norm, and the condition resembles the self-concordance condition of [6]. Note that all convex power, logarithmic, barrier, and exponential functions meet this condition, and the function need not be strictly or strongly convex, nor have a bounded solution set. Furthermore, we assume that x = 0 is not a minimizer, that is, ∇f(0) ≠ 0.

We consider the path constructed from the strictly convex minimization problem

    min_x  f(x) + (µ/2)‖x‖²,    (2)

where µ is any positive parameter. The minimizer of (2), denoted by x(µ), satisfies the necessary and sufficient condition

    ∇f(x) + µx = 0.    (3)
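To make the construction concrete, here is a minimal numerical sketch (not part of the note) that builds a small smooth convex test function and computes one path point x(µ) for a fixed µ > 0 by applying a damped Newton method to (2), then checks the optimality condition (3). The log-sum-exp test function, the data A and b, and the helper names grad_hess and path_point are illustrative assumptions, not prescriptions from the note; the same helpers are reused in the later sketches.

```python
import numpy as np

# Illustrative test problem (an assumption, not from the note):
# f(x) = log sum_i exp(a_i^T x + b_i).  Stacking +/- rows makes f coercive,
# so a bounded minimizer exists, while generically grad f(0) != 0.
rng = np.random.default_rng(0)
A0 = rng.standard_normal((10, 4))
A = np.vstack([A0, -A0])
b = rng.standard_normal(20)

def f(x):
    z = A @ x + b
    m = z.max()
    return m + np.log(np.exp(z - m).sum())

def grad_hess(x):
    z = A @ x + b
    p = np.exp(z - z.max())
    p /= p.sum()                                      # softmax(Ax + b)
    g = A.T @ p                                       # gradient of f
    H = A.T @ (np.diag(p) - np.outer(p, p)) @ A       # Hessian of f
    return g, H

def path_point(mu, x0=None, tol=1e-10, max_iter=100):
    """Minimize f(x) + (mu/2)||x||^2, i.e. solve condition (3): grad f(x) + mu x = 0."""
    x = np.zeros(A.shape[1]) if x0 is None else np.array(x0, dtype=float)
    for _ in range(max_iter):
        g, H = grad_hess(x)
        r = g + mu * x                                # residual of condition (3)
        if np.linalg.norm(r) <= tol:
            break
        d = np.linalg.solve(H + mu * np.eye(x.size), -r)   # Newton step for problem (2)
        t = 1.0
        for _ in range(30):                           # crude backtracking, for robustness only
            if (f(x + t * d) + 0.5 * mu * np.sum((x + t * d) ** 2)
                    <= f(x) + 0.5 * mu * np.sum(x ** 2) + 0.25 * t * np.dot(r, d)):
                break
            t *= 0.5
        x = x + t * d
    return x

mu = 0.1
x_mu = path_point(mu)
g, _ = grad_hess(x_mu)
print("residual of (3):", np.linalg.norm(g + mu * x_mu))   # should be near zero
```

In practice the full Newton step is accepted almost immediately on this toy problem; the backtracking is only a safeguard and plays no role in the analysis below.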

Theorem. The following properties on the minimizer of ) hold. i). The minimizer xµ) of ) is unique and continuous with µ. ii). The function fxµ)) is strictly increasing and xµ) is strictly decreasing function of µ. iii). lim µ 0 + xµ) converges to the minimal norm solution of fx). Proof Property i) is based on the fact that fx) + µ x is a strictly convex function for any µ > 0 and its Hessian is positive definite. We prove ii). Let 0 < µ < µ. Then and fxµ )) + µ xµ ) < fxµ)) + µ xµ) fxµ)) + µ xµ) < fxµ )) + µ xµ ). Add the two inequalities on both sides and rearrange them, we have µ µ xµ ) > µ µ xµ). Since µ µ > 0, we have xµ ) > xµ), that is, xµ) is strictly decreasing function of µ. Then, using any one of the original two inequalities, we have fxµ )) < fxµ)). Finally, we prove iii). Let x be an optimizer with the the minimum L norm, then f x) = 0, which, together with 3), indicate fxµ)) f x) + µxµ) = 0. Pre-multiplying xµ) x to both sides, and using the convexity of f, µxµ) x) T xµ) = xµ) x) T fxµ)) f x)) 0. Thus, we have xµ) x T xµ) x xµ), that is, xµ) x) for any µ > 0. If the accumulating limit point x0) x, f must have two different minimum L norm solutions in the convex optimal solution set of f. Then x0) + x) would remain an optimal solution and it has a norm strictly less than x. Thus, x is unique and every accumulating limit point x0) = x, which completes the proof. Let us call the path minimum-norm path and let x k be an approximate path solution for µ = µ k and the path error be fx k ) + µ k x k β µk, which defines a neighborhood of the path. Then, we like to compute a new iterate x k+ remains in the neighborhood of the path, similar to the interior-point path-following algorithms e.g., [7]), that is, fx k+ ) + µ k+ x k+ β µk+, where 0 µ k+ < µ k. 4)

Note that the neighborhood becomes smaller and smaller as the iterates proceed. When µ_k is replaced by µ_{k+1} = (1 − η)µ_k for some number η ∈ (0, 1], we aim to find the solution x such that ∇f(x) + (1 − η)µ_k x = 0. To proceed, we use x^k as the initial solution and apply the Newton iteration

    ∇f(x^k) + ∇²f(x^k)d + (1 − η)µ_k (x^k + d) = 0,

or

    ∇²f(x^k)d + (1 − η)µ_k d = −∇f(x^k) − (1 − η)µ_k x^k,    (5)

and let the new iterate be x^{k+1} = x^k + d. From the second expression, we have

    ‖∇²f(x^k)d + (1 − η)µ_k d‖ = ‖∇f(x^k) + (1 − η)µ_k x^k‖
                               = ‖∇f(x^k) + µ_k x^k − η µ_k x^k‖
                               ≤ ‖∇f(x^k) + µ_k x^k‖ + η µ_k ‖x^k‖
                               ≤ (1/(2β) + η‖x^k‖) µ_k.    (6)

On the other hand,

    ‖∇²f(x^k)d + (1 − η)µ_k d‖² = ‖∇²f(x^k)d‖² + 2(1 − η)µ_k dᵀ∇²f(x^k)d + ((1 − η)µ_k)² ‖d‖².

From the convexity of f, dᵀ∇²f(x^k)d ≥ 0, so together with (6) we have

    ((1 − η)µ_k)² ‖d‖² ≤ (1/(2β) + η‖x^k‖)² (µ_k)²   and   2(1 − η)µ_k dᵀ∇²f(x^k)d ≤ (1/(2β) + η‖x^k‖)² (µ_k)².    (7)

The first inequality of (7) implies

    ‖d‖ ≤ (1/(2β) + η‖x^k‖) / (1 − η).

The second inequality of (7) implies

    dᵀ∇²f(x^k)d ≤ (1/(2β) + η‖x^k‖)² µ_k / (2(1 − η)),

and therefore, writing x^+ = x^{k+1} = x^k + d and using (5) together with condition (1),

    ‖∇f(x^+) + (1 − η)µ_k x^+‖ = ‖∇f(x^+) − (∇f(x^k) + ∇²f(x^k)d) + (∇f(x^k) + ∇²f(x^k)d + (1 − η)µ_k (x^k + d))‖
                               = ‖∇f(x^+) − ∇f(x^k) − ∇²f(x^k)d‖
                               ≤ β dᵀ∇²f(x^k)d
                               ≤ (β / (2(1 − η))) (1/(2β) + η‖x^k‖)² µ_k.

We now just need to choose η ∈ (0, 1) such that

    (1/(2β) + η‖x^k‖) / (1 − η) ≤ O(1),

so that condition (1) applies to the step d, and

    (β / (2(1 − η))) (1/(2β) + η‖x^k‖)² ≤ (1 − η) / (2β)

to satisfy (4), due to (1 − η)µ_k = µ_{k+1}. Since β ≥ 1, setting

    η = 1 / (2β(1 + ‖x^k‖))

would suffice.
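Putting the derivation together, the following sketch implements the resulting path-following iteration: at each step it forms the Newton system (5) with the step-size rule η = 1/(2β(1 + ‖x^k‖)) and takes the full Newton step. It reuses grad_hess from the first sketch; the value β = 2 and the initialization x^0 = 0 with µ_0 = 2β‖∇f(0)‖ (which puts x^0 inside the neighborhood, since then ‖∇f(x^0) + µ_0 x^0‖ = ‖∇f(0)‖ = µ_0/(2β)) are assumptions made for the illustration rather than prescriptions from the note.

```python
import numpy as np

beta = 2.0                                    # assumed constant from condition (1)
n = A.shape[1]
x = np.zeros(n)                               # x^0 = 0; by assumption grad f(0) != 0
g, _ = grad_hess(x)
mu = 2 * beta * np.linalg.norm(g)             # mu_0 chosen so x^0 lies in the neighborhood

k = 0
while mu > 1e-10 and k < 1000:
    eta = 1.0 / (2 * beta * (1 + np.linalg.norm(x)))   # step-size rule derived above
    mu_next = (1 - eta) * mu
    g, H = grad_hess(x)
    d = np.linalg.solve(H + mu_next * np.eye(n),        # Newton system (5)
                        -(g + mu_next * x))
    x = x + d                                           # full step: x^{k+1} = x^k + d
    mu = mu_next
    k += 1

g, _ = grad_hess(x)
print(f"iterations: {k}   final mu: {mu:.1e}   ||grad f(x)||: {np.linalg.norm(g):.1e}")
```

If the iterates stay in the neighborhood (4), the final gradient norm is of order µ(1/(2β) + ‖x‖), so the printed ‖∇f(x)‖ should shrink at the same linear rate as µ.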

This gives a linear convergence of µ down to zero,

    µ_{k+1} = (1 − 1/(2β(1 + ‖x^k‖))) µ_k,

and x^k follows the path to optimality. From Theorem 1, the size of ‖x^k‖ is bounded above by the size of ‖x̄‖, where x̄ is the minimum-norm optimal solution of min_x f(x), which is fixed.

Theorem 2. There is a linearly convergent second-order, or Newton, method for minimizing any smooth convex function that satisfies the local Lipschitz condition (1). More precisely, the convergence rate is

    1 − 1/(2β(1 + ‖x̄‖)),

where x̄ is the minimum-norm optimal solution of min_x f(x).

Practically, one can implement the algorithm in a predictor-corrector fashion (e.g., [4]) to explore wide neighborhoods and to work without knowing the Lipschitz constant β. One can also scale the variable x so that the norm of the minimum-norm solution x̄ is about 1.

References

[1] D. A. Bayer and J. C. Lagarias (1989), The nonlinear geometry of linear programming, Part I: Affine and projective scaling trajectories. Transactions of the American Mathematical Society, 314(2):499–526.

[2] O. Güler and Y. Ye (1993), Convergence behavior of interior point algorithms. Math. Programming, 60:215–228.

[3] N. Megiddo (1989), Pathways to the optimal set in linear programming. In N. Megiddo, editor, Progress in Mathematical Programming: Interior Point and Related Methods, pages 131–158. Springer-Verlag, New York.

[4] S. Mizuno, M. J. Todd, and Y. Ye (1993), On adaptive-step primal-dual interior-point algorithms for linear programming. Mathematics of Operations Research, 18:964–981.

[5] Yurii Nesterov (2017), Accelerating the Universal Newton Methods. In Nemfest, Georgia Tech, Atlanta.

[6] Yu. E. Nesterov and A. S. Nemirovskii (1993), Interior Point Polynomial Methods in Convex Programming: Theory and Algorithms. SIAM, Philadelphia.

[7] J. Renegar (1988), A polynomial-time algorithm, based on Newton's method, for linear programming. Math. Programming, 40:59–93.

[8] Weijie Su, Stephen Boyd, and Emmanuel J. Candès (2014), A Differential Equation for Modeling Nesterov's Accelerated Gradient Method: Theory and Insights. In Advances in Neural Information Processing Systems (NIPS) 27.

[9] Andre Wibisono, Ashia C. Wilson, and Michael I. Jordan (2016), A Variational Perspective on Accelerated Methods in Optimization. https://arxiv.org/abs/1603.04245

[10] Y. Ye (2017), Lecture Note 13: Second-Order Optimization Algorithms II. http://web.stanford.edu/class/msande311/lecture13.pdf