Primal-Dual Interior-Point Methods. Ryan Tibshirani, Convex Optimization 10-725/36-725


Last time: barrier method

Given the problem

$$\min_x \; f(x) \quad \text{subject to} \quad h_i(x) \le 0, \; i = 1, \dots, m, \quad Ax = b$$

where $f, h_i$, $i = 1, \dots, m$ are convex and smooth, we consider

$$\min_x \; t f(x) + \phi(x) \quad \text{subject to} \quad Ax = b$$

where $\phi$ is the log barrier function

$$\phi(x) = -\sum_{i=1}^m \log(-h_i(x))$$

In the barrier method, we solve a sequence of problems

$$\min_x \; t f(x) + \phi(x) \quad \text{subject to} \quad Ax = b$$

for increasing values of $t > 0$, until $m/t \le \epsilon$. In particular, start with $t = t^{(0)} > 0$, and solve the above problem using Newton's method to produce $x^{(0)} = x^\star(t)$. Then, for $k = 1, 2, 3, \dots$, we iterate:

- Solve the barrier problem at $t = t^{(k)}$, using Newton's method initialized at $x^{(k-1)}$, to produce $x^{(k)} = x^\star(t)$
- Stop if $m/t \le \epsilon$
- Else update $t^{(k+1)} = \mu t^{(k)}$, where $\mu > 1$
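In code, the outer loop is short. Below is a minimal sketch in Python; the routine `newton_solve`, which minimizes $t f(x) + \phi(x)$ subject to $Ax = b$ from a given warm start, is a hypothetical placeholder for the inner Newton iterations.

```python
def barrier_method(x0, t0, mu, m, eps, newton_solve):
    """Sketch of the barrier method outer loop.

    newton_solve(t, x_init): hypothetical inner solver that minimizes
    t*f(x) + phi(x) subject to Ax = b, warm-started at x_init.
    m: number of inequality constraints.
    """
    t = t0
    x = newton_solve(t, x0)    # produce x^(0) = x*(t^(0))
    while m / t > eps:         # m/t is the barrier method's duality gap
        t = mu * t             # increase the barrier parameter
        x = newton_solve(t, x) # warm-start at the previous solution
    return x
```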

Outline

Today:
- Motivation from perturbed KKT conditions
- Primal-dual interior-point method
- Backtracking line search
- Highlight on standard form LPs

Barrier method versus primal-dual method

We will cover the primal-dual interior-point method, which solves basically the same problems as the barrier method. The two are pretty similar, but have some key differences.

Overview:
- Both can be motivated in terms of perturbed KKT conditions
- Primal-dual interior-point methods take one Newton step, and move on (no separate inner and outer loops)
- Primal-dual interior-point iterates are not necessarily feasible
- Primal-dual interior-point methods can be more efficient, since they can exhibit better than linear convergence
- Primal-dual interior-point methods are less intuitive...

Back to perturbed KKT conditions

The barrier method iterates, at any particular barrier parameter $t$, can be seen as solving a perturbed version of the KKT conditions:

$$\nabla f(x) + \sum_{i=1}^m u_i \nabla h_i(x) + A^T v = 0$$
$$u_i \cdot h_i(x) = -1/t, \quad i = 1, \dots, m$$
$$h_i(x) \le 0, \quad i = 1, \dots, m, \qquad Ax = b$$
$$u_i \ge 0, \quad i = 1, \dots, m$$

The only difference between these and the actual KKT conditions for our original problem is in the second condition: in the actual KKT conditions it is replaced by

$$u_i \cdot h_i(x) = 0, \quad i = 1, \dots, m$$

i.e., complementary slackness.

We didn't cover this, but the Newton updates for the log barrier problem can be seen as a Newton step for solving these nonlinear equations, after eliminating $u$ (i.e., taking $u_i = -1/(t h_i(x))$, $i = 1, \dots, m$).

Primal-dual interior-point updates are also motivated by a Newton step for solving these nonlinear equations, but without eliminating $u$. Write this concisely as $r(x, u, v) = 0$, where

$$r(x, u, v) = \begin{pmatrix} \nabla f(x) + Dh(x)^T u + A^T v \\ -\operatorname{diag}(u) h(x) - (1/t) \mathbf{1} \\ Ax - b \end{pmatrix}$$

and

$$h(x) = \begin{pmatrix} h_1(x) \\ \vdots \\ h_m(x) \end{pmatrix}, \qquad Dh(x) = \begin{pmatrix} \nabla h_1(x)^T \\ \vdots \\ \nabla h_m(x)^T \end{pmatrix}$$

This is a nonlinear equation in $(x, u, v)$, and hard to solve; so let's linearize, and approximately solve.

Let $y = (x, u, v)$ be the current iterate, and $\Delta y = (\Delta x, \Delta u, \Delta v)$ be the update direction. Define

$$r_{\text{dual}} = \nabla f(x) + Dh(x)^T u + A^T v$$
$$r_{\text{cent}} = -\operatorname{diag}(u) h(x) - (1/t) \mathbf{1}$$
$$r_{\text{prim}} = Ax - b$$

the dual, central, and primal residuals at the current $y = (x, u, v)$. Now we make our first-order approximation

$$0 = r(y + \Delta y) \approx r(y) + Dr(y) \Delta y$$

and we want to solve for $\Delta y$ in the above.

I.e., we solve

$$\begin{pmatrix} \nabla^2 f(x) + \sum_{i=1}^m u_i \nabla^2 h_i(x) & Dh(x)^T & A^T \\ -\operatorname{diag}(u) Dh(x) & -\operatorname{diag}(h(x)) & 0 \\ A & 0 & 0 \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta u \\ \Delta v \end{pmatrix} = -\begin{pmatrix} r_{\text{dual}} \\ r_{\text{cent}} \\ r_{\text{prim}} \end{pmatrix}$$

The solution $\Delta y = (\Delta x, \Delta u, \Delta v)$ is our primal-dual update direction. Note that the update directions for the primal and dual variables are inexorably linked together. (Also, these are different updates than those from the barrier method.)
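As a concrete illustration, here is a minimal NumPy sketch that assembles and solves this block system. The callables `grad_f`, `hess_f`, `h`, `Dh`, and `hess_h` (the term $\sum_i u_i \nabla^2 h_i(x)$) are assumptions standing in for problem-specific oracles; a practical implementation would exploit the block structure rather than solving the dense system directly.

```python
import numpy as np

def pd_newton_direction(x, u, v, t, grad_f, hess_f, h, Dh, hess_h, A, b):
    """Assemble and solve the linearized KKT system for the primal-dual
    update direction (dx, du, dv).

    grad_f(x), hess_f(x): gradient and Hessian of f.
    h(x): length-m vector of constraint values h_i(x).
    Dh(x): m x n Jacobian with rows grad h_i(x)^T.
    hess_h(x, u): the matrix sum_i u_i * hess(h_i)(x).
    All of these callables are assumptions for this sketch.
    """
    n, m, p = x.size, u.size, b.size
    hx, Dhx = h(x), Dh(x)

    r_dual = grad_f(x) + Dhx.T @ u + A.T @ v
    r_cent = -u * hx - np.ones(m) / t
    r_prim = A @ x - b

    # Block KKT matrix from the slide
    KKT = np.block([
        [hess_f(x) + hess_h(x, u), Dhx.T,            A.T],
        [-np.diag(u) @ Dhx,        -np.diag(hx),     np.zeros((m, p))],
        [A,                        np.zeros((p, m)), np.zeros((p, p))],
    ])
    rhs = -np.concatenate([r_dual, r_cent, r_prim])
    dy = np.linalg.solve(KKT, rhs)
    return dy[:n], dy[n:n + m], dy[n + m:]
```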

Surrogate duality gap

Unlike the barrier method, the iterates from the primal-dual interior-point method are not necessarily feasible.

For the barrier method, we have a simple duality gap: $m/t$, since we set $u_i = -1/(t h_i(x))$, $i = 1, \dots, m$, and saw that this was dual feasible.

For the primal-dual interior-point method, we construct a surrogate duality gap:

$$\eta = -h(x)^T u = -\sum_{i=1}^m u_i h_i(x)$$

This would be a bona fide duality gap if we had feasible points (i.e., if $r_{\text{prim}} = 0$ and $r_{\text{dual}} = 0$), but we don't, so it's not.

What value of the parameter $t$ does this correspond to in the perturbed KKT conditions? This is $t = m/\eta$.

Primal-dual interior-point method

Putting it all together, we now have our primal-dual interior-point method. Start with a strictly feasible point $x^{(0)}$ and $u^{(0)} > 0$, $v^{(0)}$. Define $\eta^{(0)} = -h(x^{(0)})^T u^{(0)}$. Then for a barrier parameter $\mu > 1$, we repeat for $k = 1, 2, 3, \dots$:

- Define $t = \mu m / \eta^{(k-1)}$
- Compute the primal-dual update direction $\Delta y$
- Determine the step size $s$
- Update $y^{(k)} = y^{(k-1)} + s \Delta y$
- Compute $\eta^{(k)} = -h(x^{(k)})^T u^{(k)}$
- Stop if $\eta^{(k)} \le \epsilon$ and $(\|r_{\text{prim}}\|_2^2 + \|r_{\text{dual}}\|_2^2)^{1/2} \le \epsilon$

Note that the stopping criterion checks both the surrogate duality gap, and (approximate) primal and dual feasibility.
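A minimal sketch of this loop, assuming the `pd_newton_direction` routine above and a `line_search` routine like the one sketched on the next slide; `grad_f`, `h`, `Dh`, `A`, `b` are the same assumed problem oracles as before.

```python
import numpy as np

def primal_dual_ipm(x, u, v, mu, eps, grad_f, h, Dh, A, b,
                    newton_direction, line_search):
    """Sketch of the primal-dual interior-point outer loop.

    newton_direction(x, u, v, t) -> (dx, du, dv): linearized KKT solve.
    line_search(x, u, v, dx, du, dv, t) -> s: backtracking step size.
    Both are assumed routines (see the surrounding slides).
    """
    m = u.size
    eta = -h(x) @ u                               # surrogate duality gap
    while True:
        t = mu * m / eta
        dx, du, dv = newton_direction(x, u, v, t)
        s = line_search(x, u, v, dx, du, dv, t)
        x, u, v = x + s * dx, u + s * du, v + s * dv
        eta = -h(x) @ u
        r_dual = grad_f(x) + Dh(x).T @ u + A.T @ v
        r_prim = A @ x - b
        # stop on small surrogate gap and (approximate) feasibility
        if eta <= eps and np.sqrt(r_dual @ r_dual + r_prim @ r_prim) <= eps:
            return x, u, v
```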

Backtracking line search

At each step, we need to make sure that we settle on an update $y^+ = y + s \Delta y$, i.e.,

$$x^+ = x + s \Delta x, \quad u^+ = u + s \Delta u, \quad v^+ = v + s \Delta v$$

that maintains $h_i(x^+) < 0$ and $u_i^+ > 0$, $i = 1, \dots, m$.

A multi-stage backtracking line search serves this purpose: start with the largest step size $s_{\max} \le 1$ that makes $u + s \Delta u \ge 0$:

$$s_{\max} = \min\left\{1, \; \min\{-u_i / \Delta u_i : \Delta u_i < 0\}\right\}$$

Then, with parameters $\alpha, \beta \in (0, 1)$, we set $s = 0.99 s_{\max}$, and:

- Update $s = \beta s$, until $h_i(x^+) < 0$, $i = 1, \dots, m$
- Update $s = \beta s$, until $\|r(x^+, u^+, v^+)\|_2 \le (1 - \alpha s) \|r(x, u, v)\|_2$
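A minimal NumPy sketch of this line search; `h` and `resid` (returning the stacked residual $r(x, u, v)$) are assumed problem oracles, as before. When plugging it into the outer loop above, `h` and `resid` would be bound first (e.g., with `functools.partial`).

```python
import numpy as np

def line_search(x, u, v, dx, du, dv, t, h, resid, alpha=0.01, beta=0.5):
    """Multi-stage backtracking line search for the primal-dual method.

    h(x): constraint values h_1(x), ..., h_m(x).
    resid(x, u, v, t): stacked residual r(x, u, v).
    Both callables are assumptions for this sketch.
    """
    # Stage 1: largest s <= 1 keeping the new dual iterate nonnegative
    neg = du < 0
    s_max = min(1.0, np.min(-u[neg] / du[neg])) if neg.any() else 1.0
    s = 0.99 * s_max

    # Stage 2: shrink until strictly feasible in the primal, h_i(x+) < 0
    while np.any(h(x + s * dx) >= 0):
        s *= beta

    # Stage 3: shrink until sufficient decrease of the residual norm
    r0 = np.linalg.norm(resid(x, u, v, t))
    while (np.linalg.norm(resid(x + s * dx, u + s * du, v + s * dv, t))
           > (1 - alpha * s) * r0):
        s *= beta
    return s
```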

Highlight: standard form LP

Recall the standard form LP:

$$\min_x \; c^T x \quad \text{subject to} \quad Ax = b, \; x \ge 0$$

for $c \in \mathbb{R}^n$, $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$. Its dual is:

$$\max_{u, v} \; b^T v \quad \text{subject to} \quad A^T v + u = c, \; u \ge 0$$

(This is not a bad thing to memorize)

Some history

- Dantzig (1940s): the simplex method, still today one of the most well-known/well-studied algorithms for LPs
- Klee and Minty (1972): a pathological LP with $n$ variables and $2n$ constraints on which the simplex method takes $2^n$ iterations to solve
- Khachiyan (1979): polynomial-time algorithm for LPs, based on the ellipsoid method of Nemirovski and Yudin (1976). Strong in theory, weak in practice
- Karmarkar (1984): interior-point polynomial-time method for LPs. Fairly efficient (US Patent 4,744,026, expired in 2006)
- Renegar (1988): Newton-based interior-point algorithm for LPs. Best known theoretical complexity to date

Modern state-of-the-art LP solvers typically use both simplex and interior-point methods.

KKT conditions for standard form LP

The points $x^\star$ and $(u^\star, v^\star)$ are respectively primal and dual optimal LP solutions if and only if they solve:

$$A^T v + u = c$$
$$x_i u_i = 0, \quad i = 1, \dots, n$$
$$Ax = b$$
$$x, u \ge 0$$

(Neat fact: the simplex method maintains the first three conditions and aims for the fourth one... interior-point methods maintain the first and last two, and aim for the second)

The perturbed KKT conditions for the standard form LP are hence:

$$A^T v + u = c$$
$$x_i u_i = 1/t, \quad i = 1, \dots, n$$
$$Ax = b$$
$$x, u \ge 0$$

Let's work through the barrier method and the primal-dual interior-point method, to get a sense of these two.

Barrier method (after eliminating $u$):

$$0 = r_{\text{br}}(x, v) = \begin{pmatrix} A^T v + \frac{1}{t} \operatorname{diag}(x)^{-1} \mathbf{1} - c \\ Ax - b \end{pmatrix}$$

Primal-dual method:

$$0 = r_{\text{pd}}(x, u, v) = \begin{pmatrix} A^T v + u - c \\ \operatorname{diag}(x) u - \frac{1}{t} \mathbf{1} \\ Ax - b \end{pmatrix}$$

Barrier method: $0 = r_{\text{br}}(y + \Delta y) \approx r_{\text{br}}(y) + Dr_{\text{br}}(y) \Delta y$, i.e., we solve

$$\begin{pmatrix} -\frac{1}{t} \operatorname{diag}(x)^{-2} & A^T \\ A & 0 \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta v \end{pmatrix} = -r_{\text{br}}(x, v)$$

and take a step $y^+ = y + s \Delta y$ (with line search for $s > 0$), and iterate until convergence. Then update $t = \mu t$.

Primal-dual method: $0 = r_{\text{pd}}(y + \Delta y) \approx r_{\text{pd}}(y) + Dr_{\text{pd}}(y) \Delta y$, i.e., we solve

$$\begin{pmatrix} 0 & I & A^T \\ \operatorname{diag}(u) & \operatorname{diag}(x) & 0 \\ A & 0 & 0 \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta u \\ \Delta v \end{pmatrix} = -r_{\text{pd}}(x, u, v)$$

and take a step $y^+ = y + s \Delta y$ (with line search for $s > 0$), but only once. Then update $t = \mu t$.
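For the LP case the primal-dual Newton system specializes nicely. Here is a minimal NumPy sketch of one step (a dense solve, for illustration only; real solvers eliminate blocks and exploit sparsity):

```python
import numpy as np

def lp_pd_newton_step(x, u, v, t, A, b, c):
    """One primal-dual Newton step for the standard form LP,
    solving the 3x3 block system on this slide directly."""
    n, m = x.size, b.size
    r_pd = np.concatenate([
        A.T @ v + u - c,          # dual residual
        x * u - np.ones(n) / t,   # centrality residual
        A @ x - b,                # primal residual
    ])
    KKT = np.block([
        [np.zeros((n, n)), np.eye(n),        A.T],
        [np.diag(u),       np.diag(x),       np.zeros((n, m))],
        [A,                np.zeros((m, n)), np.zeros((m, m))],
    ])
    dy = np.linalg.solve(KKT, -r_pd)
    return dy[:n], dy[n:2 * n], dy[2 * n:]
```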

Example: barrier versus primal-dual

Example from B & V 11.3.2 and 11.7.4: standard LP with $n = 50$ variables and $m = 100$ equality constraints. The barrier method uses various values of $\mu$; the primal-dual method uses $\mu = 10$. Both use $\alpha = 0.01$, $\beta = 0.5$.

[Figure: left, the barrier method's duality gap versus Newton iterations, for $\mu = 2, 50, 150$; right, the primal-dual method's surrogate duality gap $\hat{\eta}$ and feasibility residual $r_{\text{feas}} = (\|r_{\text{prim}}\|_2^2 + \|r_{\text{dual}}\|_2^2)^{1/2}$ versus iteration number. Per B & V Figure 11.21, the residual converges rapidly to zero within 24 iterations, and the surrogate gap reaches a very small value in about 28 iterations.]

We can see that primal-dual is faster to converge to high accuracy.

Now consider a sequence of problems with $n = 2m$, and $n$ growing. The barrier method uses $\mu = 100$ and runs just two outer loops (decreasing the duality gap by a factor of $10^4$); the primal-dual method uses $\mu = 10$, and stops when the surrogate duality gap and feasibility gap are at most $10^{-8}$.

[Figure: number of Newton iterations versus $m$, for the barrier method (left) and the primal-dual method (right), with $m$ ranging from $10^1$ to $10^3$ on a log scale.]

The primal-dual method requires only slightly more iterations, despite the fact that it is producing higher-accuracy solutions.

References and further reading

- S. Boyd and L. Vandenberghe (2004), Convex Optimization, Chapter 11
- J. Nocedal and S. Wright (2006), Numerical Optimization, Chapters 14 and 19
- S. Wright (1997), Primal-Dual Interior-Point Methods, Chapters 5 and 6
- J. Renegar (2001), A Mathematical View of Interior-Point Methods