
Primal-Dual Interior-Point Methods
Ryan Tibshirani, Convex Optimization 10-725

Last time: barrier method

Given the problem

$$\min_x \;\; f(x) \quad \text{subject to} \quad h_i(x) \le 0, \; i = 1, \dots, m, \quad Ax = b$$

where $f, h_i$, $i = 1, \dots, m$ are convex and twice differentiable, and strong duality holds, we consider

$$\min_x \;\; t f(x) + \phi(x) \quad \text{subject to} \quad Ax = b$$

where $\phi$ is the log barrier function

$$\phi(x) = -\sum_{i=1}^m \log(-h_i(x))$$

Let $x^*(t)$ be a solution to the barrier problem for a particular $t > 0$, and let $f^*$ be the optimal value in the original problem. We can show that $m/t$ is a duality gap, so that

$$f(x^*(t)) - f^* \le m/t$$

This motivates the barrier method, where we solve the barrier problem for increasing values of $t > 0$, until the duality gap satisfies $m/t \le \epsilon$:

We fix $t^{(0)} > 0$, $\mu > 1$, and use Newton's method to compute $x^{(0)} = x^*(t^{(0)})$, a solution to the barrier problem at $t = t^{(0)}$. For $k = 1, 2, 3, \dots$:
- Solve the barrier problem at $t = t^{(k)}$, using Newton's method initialized at $x^{(k-1)}$, to yield $x^{(k)} = x^*(t^{(k)})$
- Stop if $m/t^{(k)} \le \epsilon$, else update $t^{(k+1)} = \mu t^{(k)}$
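To make the recap concrete, here is a minimal sketch of the barrier method's outer loop in Python. This is an illustration only: `newton_solve` is a hypothetical inner solver for the equality-constrained barrier problem, not something defined in these slides.

```python
def barrier_method(f, phi, A, b, x0, m, t0=1.0, mu=10.0, eps=1e-8):
    """Outer loop of the barrier method: repeatedly solve the barrier
    problem min_x t*f(x) + phi(x) s.t. Ax = b, increasing t by a factor
    mu each time, until the duality gap bound m/t falls below eps."""
    t, x = t0, x0
    while True:
        # newton_solve is a hypothetical helper: Newton's method on the
        # equality-constrained barrier problem, warm-started at current x
        x = newton_solve(lambda z: t * f(z) + phi(z), A, b, x_init=x)
        if m / t <= eps:    # f(x*(t)) - f* <= m/t, so we can stop
            return x
        t = mu * t          # increase t and re-solve
```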

Outline

Today:
- Perturbed KKT conditions, revisited
- Primal-dual interior-point method
- Backtracking line search
- Highlight on standard form LPs

Barrier versus primal-dual method

Today we will discuss the primal-dual interior-point method, which solves basically the same problems as the barrier method. What's the difference between these two? Overview:
- Both can be motivated in terms of perturbed KKT conditions
- Primal-dual interior-point methods take one Newton step and move on (there are no separate inner and outer loops)
- Primal-dual interior-point iterates are not necessarily feasible
- Primal-dual interior-point methods are often more efficient, as they can exhibit better than linear convergence
- But primal-dual interior-point methods are less intuitive...

Perturbed KKT conditions

Barrier method iterates $(x^*(t), u^*(t), v^*(t))$ can be motivated as solving the perturbed KKT conditions:

$$\nabla f(x) + \sum_{i=1}^m u_i \nabla h_i(x) + A^T v = 0$$
$$u_i \cdot h_i(x) = -1/t, \quad i = 1, \dots, m$$
$$h_i(x) \le 0, \; i = 1, \dots, m, \quad Ax = b$$
$$u_i \ge 0, \quad i = 1, \dots, m$$

The only difference between these and the actual KKT conditions for our original problem is the second line: in the actual KKT conditions it is replaced by

$$u_i \cdot h_i(x) = 0, \quad i = 1, \dots, m$$

i.e., complementary slackness.

Perturbed KKT as nonlinear system

We can view this as a nonlinear system of equations, written as

$$r(x, u, v) = \begin{pmatrix} \nabla f(x) + Dh(x)^T u + A^T v \\ -\mathrm{diag}(u) h(x) - (1/t) \mathbf{1} \\ Ax - b \end{pmatrix} = 0$$

where

$$h(x) = \begin{pmatrix} h_1(x) \\ \vdots \\ h_m(x) \end{pmatrix}, \quad Dh(x) = \begin{pmatrix} \nabla h_1(x)^T \\ \vdots \\ \nabla h_m(x)^T \end{pmatrix}$$

Newton's method, recall, is generally a root-finder for a nonlinear system $F(y) = 0$. Approximating $F(y + \Delta y) \approx F(y) + DF(y) \Delta y$ leads to the update

$$\Delta y = -(DF(y))^{-1} F(y)$$

What happens if we apply this to $r(x, u, v) = 0$ above?
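In code, the generic Newton root-finding update is essentially a one-liner. A minimal NumPy sketch (my own illustration), assuming callables `F` and `DF` for the residual and its Jacobian:

```python
import numpy as np

def newton_root_step(F, DF, y):
    """One Newton step for the nonlinear system F(y) = 0: linearize
    F(y + dy) ~= F(y) + DF(y) dy and solve the linear system for dy."""
    dy = np.linalg.solve(DF(y), -F(y))   # dy = -(DF(y))^{-1} F(y)
    return y + dy
```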

Newton on perturbed KKT, v1

Approach 1: from the middle equation (relaxed complementary slackness), note that $u_i = -1/(t h_i(x))$, $i = 1, \dots, m$. So after eliminating $u$, we get

$$r(x, v) = \begin{pmatrix} \nabla f(x) + \sum_{i=1}^m \left( \frac{-1}{t h_i(x)} \right) \nabla h_i(x) + A^T v \\ Ax - b \end{pmatrix} = 0$$

Thus the Newton root-finding update $(\Delta x, \Delta v)$ is determined by

$$\begin{bmatrix} H_{\mathrm{bar}}(x) & A^T \\ A & 0 \end{bmatrix} \begin{pmatrix} \Delta x \\ \Delta v \end{pmatrix} = -r(x, v)$$

where

$$H_{\mathrm{bar}}(x) = \nabla^2 f(x) + \sum_{i=1}^m \frac{1}{t h_i(x)^2} \nabla h_i(x) \nabla h_i(x)^T + \sum_{i=1}^m \left( \frac{-1}{t h_i(x)} \right) \nabla^2 h_i(x)$$

This is just the KKT system solved by one iteration of Newton's method for minimizing the barrier problem.

Newton on perturbed KKT, v2

Approach 2: directly apply the Newton root-finding update, without eliminating $u$. Introduce the notation

$$r_{\mathrm{dual}} = \nabla f(x) + Dh(x)^T u + A^T v$$
$$r_{\mathrm{cent}} = -\mathrm{diag}(u) h(x) - (1/t) \mathbf{1}$$
$$r_{\mathrm{prim}} = Ax - b$$

called the dual, central, and primal residuals at $y = (x, u, v)$. Now the root-finding update $\Delta y = (\Delta x, \Delta u, \Delta v)$ is given by

$$\begin{bmatrix} H_{\mathrm{pd}}(x) & Dh(x)^T & A^T \\ -\mathrm{diag}(u) Dh(x) & -\mathrm{diag}(h(x)) & 0 \\ A & 0 & 0 \end{bmatrix} \begin{pmatrix} \Delta x \\ \Delta u \\ \Delta v \end{pmatrix} = - \begin{pmatrix} r_{\mathrm{dual}} \\ r_{\mathrm{cent}} \\ r_{\mathrm{prim}} \end{pmatrix}$$

where $H_{\mathrm{pd}}(x) = \nabla^2 f(x) + \sum_{i=1}^m u_i \nabla^2 h_i(x)$
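To see the structure of this update concretely, here is an illustrative NumPy sketch (not from the slides) that assembles and solves the block system above, given the problem derivatives evaluated at the current point:

```python
import numpy as np

def pd_newton_direction(grad_f, hess_f, h, Dh, hess_h, A, b, x, u, v, t):
    """Solve the v2 Newton system for (dx, du, dv). Inputs, all evaluated
    at the current x: grad_f (n,), hess_f (n, n), h (m,) with entries
    h_i(x), Dh (m, n), hess_h a list of m Hessians (n, n), A (p, n)."""
    m, n = Dh.shape
    p = A.shape[0]
    H = hess_f + sum(u[i] * hess_h[i] for i in range(m))   # H_pd(x)
    r_dual = grad_f + Dh.T @ u + A.T @ v
    r_cent = -u * h - 1.0 / t                              # -diag(u)h(x) - (1/t)1
    r_prim = A @ x - b
    K = np.block([
        [H,                 Dh.T,              A.T],
        [-np.diag(u) @ Dh,  -np.diag(h),       np.zeros((m, p))],
        [A,                 np.zeros((p, m)),  np.zeros((p, p))],
    ])
    dy = np.linalg.solve(K, -np.concatenate([r_dual, r_cent, r_prim]))
    return dy[:n], dy[n:n + m], dy[n + m:]                 # (dx, du, dv)
```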

Some notes:
- In v2, the update directions for the primal and dual variables are inexorably linked together
- Also, v1 and v2 lead to different (nonequivalent) updates
- As we saw, one iteration of v1 is equivalent to an inner iteration of the barrier method
- And v2 defines a new method, called the primal-dual interior-point method, that we will flesh out shortly
- One complication: in v2, the dual iterates are not necessarily feasible for the original dual problem...

Surrogate duality gap

For the barrier method, we have a simple duality gap: $m/t$, since we set $u_i = -1/(t h_i(x))$, $i = 1, \dots, m$, and saw that this was dual feasible. For the primal-dual interior-point method, we can construct a surrogate duality gap:

$$\eta = -h(x)^T u = -\sum_{i=1}^m u_i h_i(x)$$

This would be a bona fide duality gap if we had feasible points, i.e., $r_{\mathrm{prim}} = 0$ and $r_{\mathrm{dual}} = 0$, but we don't, so it's not. What value of the parameter $t$ does this correspond to in the perturbed KKT conditions? This is $t = m/\eta$.

Primal-dual interior-point method

Putting it all together, we now have our primal-dual interior-point method. Start with $x^{(0)}$ such that $h_i(x^{(0)}) < 0$, $i = 1, \dots, m$, with $u^{(0)} > 0$ and any $v^{(0)}$. Define $\eta^{(0)} = -h(x^{(0)})^T u^{(0)}$. We fix $\mu > 1$, and repeat for $k = 1, 2, 3, \dots$:
- Define $t = \mu m / \eta^{(k-1)}$
- Compute the primal-dual update direction $\Delta y$
- Use backtracking to determine the step size $s$
- Update $y^{(k)} = y^{(k-1)} + s \Delta y$
- Compute $\eta^{(k)} = -h(x^{(k)})^T u^{(k)}$
- Stop if $\eta^{(k)} \le \epsilon$ and $(\|r_{\mathrm{prim}}\|_2^2 + \|r_{\mathrm{dual}}\|_2^2)^{1/2} \le \epsilon$

Note that we stop based on the surrogate duality gap and approximate feasibility. (The line search maintains $h_i(x) < 0$, $u_i > 0$, $i = 1, \dots, m$.)
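Putting the steps above into code, a minimal sketch of the whole loop might look as follows. Everything here is illustrative: `h` is the constraint map, `newton_direction` wraps the block-system solve sketched earlier (with the problem derivatives bound in), `residuals` returns $(r_{\mathrm{prim}}, r_{\mathrm{dual}})$, and `line_search` is the multi-stage backtracking described on the next slide; all four names are my own, not the slides'.

```python
import numpy as np

def primal_dual_ipm(h, x, u, v, m, mu=10.0, eps=1e-8, max_iter=100):
    """Primal-dual interior-point method: one Newton step per iteration,
    with t reset from the surrogate duality gap each time. Assumes
    h_i(x) < 0 and u > 0 initially; the line search preserves this."""
    eta = -h(x) @ u                               # surrogate duality gap
    for _ in range(max_iter):
        t = mu * m / eta                          # t = mu * m / eta^{(k-1)}
        dx, du, dv = newton_direction(x, u, v, t)  # hypothetical helper
        s = line_search(x, u, v, dx, du, dv, t)    # hypothetical helper
        x, u, v = x + s * dx, u + s * du, v + s * dv
        eta = -h(x) @ u
        r_prim, r_dual = residuals(x, u, v)        # hypothetical helper
        if eta <= eps and np.sqrt(r_prim @ r_prim + r_dual @ r_dual) <= eps:
            break
    return x, u, v
```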

Backtracking line search

At each step, we must ensure that we arrive at $y^+ = y + s \Delta y$, i.e.,

$$x^+ = x + s \Delta x, \quad u^+ = u + s \Delta u, \quad v^+ = v + s \Delta v$$

that maintains both $h_i(x^+) < 0$ and $u_i^+ > 0$, $i = 1, \dots, m$.

A multi-stage backtracking line search for this purpose: start with the largest step size $s_{\max} \le 1$ that makes $u + s \Delta u \ge 0$:

$$s_{\max} = \min \left\{ 1, \; \min \{ -u_i / \Delta u_i : \Delta u_i < 0 \} \right\}$$

Then, with parameters $\alpha, \beta \in (0, 1)$, we set $s = 0.99 s_{\max}$, and:
- Update $s = \beta s$, until $h_i(x^+) < 0$, $i = 1, \dots, m$
- Update $s = \beta s$, until $\|r(x^+, u^+, v^+)\|_2 \le (1 - \alpha s) \|r(x, u, v)\|_2$
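A fairly direct transcription of this multi-stage search into Python (again a sketch; `h` and `resid_norm` are assumed module-level helpers returning the vector $h(x)$ and $\|r(x, u, v)\|_2$ for the given $t$):

```python
import numpy as np

def line_search(x, u, v, dx, du, dv, t, alpha=0.01, beta=0.5):
    """Multi-stage backtracking: first cap s so u stays positive, then
    shrink until x stays strictly feasible, then shrink until the
    residual norm shows sufficient decrease."""
    # Largest s <= 1 keeping u + s*du >= 0 (only negative du_i can violate this)
    neg = du < 0
    s_max = min(1.0, np.min(-u[neg] / du[neg])) if neg.any() else 1.0
    s = 0.99 * s_max                      # back off slightly so u stays strictly positive
    while np.any(h(x + s * dx) >= 0):     # shrink until h_i(x+) < 0 for all i
        s *= beta
    r0 = resid_norm(x, u, v, t)
    while resid_norm(x + s * dx, u + s * du, v + s * dv, t) > (1 - alpha * s) * r0:
        s *= beta
    return s
```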

Some history

- Dantzig (1940s): the simplex method, still today one of the most well-known and well-studied algorithms for LPs
- Klee and Minty (1972): a pathological LP with $n$ variables and $2n$ constraints, on which the simplex method takes $2^n$ iterations to solve
- Khachiyan (1979): polynomial-time algorithm for LPs, based on the ellipsoid method of Nemirovski and Yudin (1976). Strong in theory, weak in practice
- Karmarkar (1984): interior-point polynomial-time method for LPs. Fairly efficient (US Patent 4,744,026, expired in 2006)
- Renegar (1988): Newton-based interior-point algorithm for LPs. Best known complexity... until Lee and Sidford (2014)
- Modern state-of-the-art LP solvers typically use both simplex and interior-point methods

Highlight: standard LP

Recall the standard form LP:

$$\min_x \;\; c^T x \quad \text{subject to} \quad Ax = b, \;\; x \ge 0$$

for $c \in \mathbb{R}^n$, $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$. Its dual is:

$$\max_{u,v} \;\; b^T v \quad \text{subject to} \quad A^T v + u = c, \;\; u \ge 0$$

(This is not a bad thing to memorize.)

KKT conditions

The points $x^*$ and $(u^*, v^*)$ are respectively primal and dual optimal LP solutions if and only if they solve:

$$A^T v + u = c$$
$$x_i u_i = 0, \quad i = 1, \dots, n$$
$$Ax = b$$
$$x, u \ge 0$$

Neat fact: the simplex method maintains the first three conditions and aims for the fourth one... interior-point methods maintain the first and last two, and aim for the second.

The perturbed KKT conditions for the standard form LP are hence:

$$A^T v + u = c$$
$$x_i u_i = 1/t, \quad i = 1, \dots, n$$
$$Ax = b$$
$$x, u \ge 0$$

What do our interior-point methods do?

Barrier method (after eliminating $u$):

$$0 = r_{\mathrm{br}}(x, v) = \begin{pmatrix} A^T v + (1/t) \, \mathrm{diag}(x)^{-1} \mathbf{1} - c \\ Ax - b \end{pmatrix}$$

Primal-dual method:

$$0 = r_{\mathrm{pd}}(x, u, v) = \begin{pmatrix} A^T v + u - c \\ \mathrm{diag}(x) u - (1/t) \mathbf{1} \\ Ax - b \end{pmatrix}$$
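For the standard form LP, both residual maps are a few lines of NumPy. A sketch (with the problem data `c`, `A`, `b` passed in explicitly):

```python
import numpy as np

def r_br(x, v, c, A, b, t):
    """Barrier residual for standard form LP, after eliminating u
    via u_i = 1/(t * x_i)."""
    return np.concatenate([A.T @ v + 1.0 / (t * x) - c,
                           A @ x - b])

def r_pd(x, u, v, c, A, b, t):
    """Primal-dual residual: dual feasibility, centering, primal feasibility."""
    return np.concatenate([A.T @ v + u - c,
                           x * u - 1.0 / t,    # diag(x)u - (1/t)1
                           A @ x - b])
```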

Barrier method: set $0 = r_{\mathrm{br}}(y + \Delta y) \approx r_{\mathrm{br}}(y) + D r_{\mathrm{br}}(y) \Delta y$, i.e., solve

$$\begin{bmatrix} -\mathrm{diag}(x)^{-2}/t & A^T \\ A & 0 \end{bmatrix} \begin{pmatrix} \Delta x \\ \Delta v \end{pmatrix} = -r_{\mathrm{br}}(x, v)$$

and take a step $y^+ = y + s \Delta y$ (with line search for $s > 0$), and iterate until convergence. Then update $t = \mu t$.

Primal-dual method: set $0 = r_{\mathrm{pd}}(y + \Delta y) \approx r_{\mathrm{pd}}(y) + D r_{\mathrm{pd}}(y) \Delta y$, i.e., solve

$$\begin{bmatrix} 0 & I & A^T \\ \mathrm{diag}(u) & \mathrm{diag}(x) & 0 \\ A & 0 & 0 \end{bmatrix} \begin{pmatrix} \Delta x \\ \Delta u \\ \Delta v \end{pmatrix} = -r_{\mathrm{pd}}(x, u, v)$$

and take a step $y^+ = y + s \Delta y$ (with line search for $s > 0$), but only once. Then update $t = \mu t$.
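For concreteness, the primal-dual Newton system for the LP can be assembled and solved densely as below. This is an illustrative sketch only; a practical solver would exploit the block structure (e.g., eliminating $\Delta u$ and $\Delta x$ to get a much smaller system) rather than factoring the full matrix.

```python
import numpy as np

def lp_pd_newton_step(x, u, v, c, A, b, t):
    """One primal-dual Newton direction for the standard form LP."""
    n, m = x.size, A.shape[0]
    K = np.block([
        [np.zeros((n, n)),  np.eye(n),         A.T],
        [np.diag(u),        np.diag(x),        np.zeros((n, m))],
        [A,                 np.zeros((m, n)),  np.zeros((m, m))],
    ])
    r = np.concatenate([A.T @ v + u - c,      # r_dual
                        x * u - 1.0 / t,      # r_cent (LP form)
                        A @ x - b])           # r_prim
    dy = np.linalg.solve(K, -r)
    return dy[:n], dy[n:2 * n], dy[2 * n:]    # (dx, du, dv)
```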

Example: barrier versus primal-dual

Example from B & V 11.3.2 and 11.7.4: standard LP with $n = 50$ variables and $m = 100$ equality constraints. The barrier method uses various values of $\mu$; the primal-dual method uses $\mu = 10$. Both use $\alpha = 0.01$, $\beta = 0.5$.

[Figure: left panel, barrier method duality gap versus Newton iterations for $\mu = 2, 50, 150$; right panels, B & V Figure 11.21, showing the primal-dual surrogate duality gap $\hat\eta$ and the feasibility residual $r_{\mathrm{feas}} = (\|r_{\mathrm{prim}}\|_2^2 + \|r_{\mathrm{dual}}\|_2^2)^{1/2}$ versus iteration number. The residual converges rapidly to zero within 24 iterations; the surrogate gap also converges to a very small number in about 28 iterations. The primal-dual interior-point method converges faster than the barrier method, especially if high accuracy is required.]

We can see that the primal-dual method is faster to converge to high accuracy.

Now a sequence of problems with $n = 2m$, and $n$ growing. The barrier method uses $\mu = 100$, and runs two outer loops (decreasing the duality gap by a factor of $10^4$); the primal-dual method uses $\mu = 10$, and stops when the surrogate duality gap and feasibility gap are at most $10^{-8}$.

[Figure: number of Newton iterations versus problem size $m$ (log scale, $10^1$ to $10^3$), for the barrier method (left) and the primal-dual method (right).]

The primal-dual method requires only slightly more iterations, despite the fact that it is producing much higher accuracy solutions.

References and further reading

- S. Boyd and L. Vandenberghe (2004), Convex Optimization, Chapter 11
- J. Nocedal and S. Wright (2006), Numerical Optimization, Chapters 14 and 19
- S. Wright (1997), Primal-Dual Interior-Point Methods, Chapters 5 and 6
- J. Renegar (2001), A Mathematical View of Interior-Point Methods