Lecture 14: Primal-Dual Interior-Point Method


CSE 599: Interplay between Convex Optimization and Geometry (Winter 2018)
Lecturer: Yin Tat Lee

Disclaimer: Please tell me about any mistakes you notice.

This week and next, we study interior point methods. In practice, this framework reduces optimizing a convex function to solving fewer than 50 linear systems; in theory, it reduces the problem to solving Õ(√n) of them. Although it has many numerical stability problems in practice and rarely yields a nearly linear time algorithm, it is the current best framework for optimizing general convex functions, in both theory and practice, in the high accuracy regime. For simplicity, we start by describing the interior point method for linear programs. The first half of the lecture is about the polynomial iteration bound; the second half is about implementation details.

14.1 Short-step interior point method

14.1.1 Basic Properties of the Central Path

We first consider the primal problem

    (P):  min_x c^⊤x   subject to Ax = b, x ≥ 0,

where A ∈ R^{m×n}. The difficulty of linear programming is the constraint x ≥ 0. Without this constraint, we could solve the problem with a single linear system. One natural idea is therefore to replace the hard constraint x ≥ 0 by some other function. So, let us consider the regularized version of the linear program

    (P_t):  min_x c^⊤x − t Σ_{i=1}^n ln x_i   subject to Ax = b.

We will explain the reason for choosing ln x in more detail next lecture. For now, you can think of it as a nice function that blows up at x = 0. One can think of ln x_i as giving a force from each constraint x_i ≥ 0 that makes sure x_i > 0 holds. Since the gradient of ln x blows up as x → 0, when x is close enough to 0, the force is large enough to counter the cost c. As t → 0, the problem (P_t) gets closer to the original problem (P), and hence the minimizer of (P_t) gets closer to a minimizer of (P).

First, we give a formula for the minimizer of (P_t).

Lemma 14.1.1 (Existence and Uniqueness of the Central Path). If the polytope {Ax = b, x ≥ 0} has an interior, then the optimum of (P_t) is uniquely given by

    xs = t,  Ax = b,  A^⊤y + s = c,  (x, s) > 0,

where xs = t is shorthand for x_i s_i = t for all i.
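As a quick sanity check of Lemma 14.1.1, the following MATLAB sketch (a toy random instance of my choosing; all names and sizes are illustrative) minimizes the barrier objective of (P_t) by damped Newton steps and verifies that the resulting primal-dual pair satisfies x_i s_i ≈ t:

% Toy check of Lemma 14.1.1: minimize c'x - t*sum(log(x)) s.t. Ax = b
% by damped Newton steps, then verify the central-path condition x.*s = t.
rng(0); m = 3; n = 8; t = 0.1;
A = randn(m, n); x = ones(n, 1); b = A*x;   % x = 1 is strictly feasible
c = randn(n, 1);
for k = 1:50
    g = c - t./x;                  % gradient of the barrier objective
    H = diag(t./x.^2);             % Hessian of the barrier objective
    KKT = [H, A'; A, zeros(m)];    % Newton step restricted to A*dx = 0
    d = KKT \ [-g; zeros(m, 1)];
    dx = d(1:n);
    y = -d(n+1:end);               % multiplier, signed to match A'y + s = c
    alpha = min(1, 0.9/max([-dx./x; 1e-12]));   % damping keeps x > 0
    x = x + alpha*dx;
end
s = c - A'*y;
disp(max(abs(x.*s - t)))           % ~1e-12: x_i s_i = t for all i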

Proof. The optimality condition of (P_t) is c − t/x = A^⊤y, where y is the dual variable for the constraint Ax = b and t/x denotes the vector with coordinates t/x_i. Writing s_i = t/x_i, we get the formula. The solution is unique because −ln x is strictly convex.

Definition 14.1.2. The central path (x_t, y_t, s_t) of the linear program is given by

    x_t s_t = t,  Ax_t = b,  A^⊤y_t + s_t = c,  (x_t, s_t) > 0.

To give another interpretation of the central path, note that the dual problem is

    (D):  max_{y,s} b^⊤y   subject to A^⊤y + s = c, s ≥ 0.

Note that for any feasible x and (y, s), we have

    c^⊤x − b^⊤y = c^⊤x − x^⊤A^⊤y = x^⊤s ≥ 0.

Hence, (x, y, s) solves the linear program if it satisfies the central path equation with t = 0. The central path is therefore a balanced way to decrease all the products x_i s_i uniformly to 0.

Now, we formalize the intuition that for small t, x_t is a good approximation of the primal solution.

Lemma 14.1.3 (Duality Gap). We have

    Duality Gap = c^⊤x_t − b^⊤y_t = c^⊤x_t − x_t^⊤A^⊤y_t = x_t^⊤s_t = tn.

The interior point method follows this framework:

1. Find C_1, a point close to the central path at t = 1.
2. Until t < ε/n:
   (a) Use C_t to find C_{(1−h)t}, for h = 1/(10√n).

Note that this algorithm only finds a solution with ε error. If the linear program is integral, we can simply stop at a small enough ε and round to the closest integral point.

14.1.2 Finding the initial point

The first question is how to find C_1. This can be handled by extending the problem into a higher dimension. By rescaling the problem and adding a new variable, we can assume the solution satisfies Σ_i x_i = n and that ‖c‖_∞ ≤ 1/2. Now, we consider the following linear program

    min_{x, x̄}  c^⊤x + M x̄   subject to   [ A     b − A1 ] [ x ]   [ b ]
                                            [ 1^⊤   1      ] [ x̄ ] = [ n ],   (x, x̄) ≥ 0,

for some very large M. This linear program has the following properties:

- The all-ones vector is a feasible interior point of this linear program.
- Any vector x such that Ax = b and x ≥ 0 gives a feasible vector [x; 0] for this linear program. Therefore, if M is large enough, this linear program has the same solution as the old linear program.
- The dual polytope

      [ A^⊤          1 ]      [ c ]
      [ (b − A1)^⊤   1 ] y ≤  [ M ]

  has the trivial interior point y = (0_m, −1). (Here we used ‖c‖_∞ ≤ 1/2 and M large enough.)

Therefore, we have an initial point (x, y, s) with (x, s) > 0. To make xs = 1, we can rescale the constraint matrix A.

14.1.3 Following the central path

During the algorithm, we maintain a point (x, y, s) such that Ax = b, A^⊤y + s = c, and xs is close to t. We show how to find a feasible (x + Δx, y + Δy, s + Δs) whose products are even closer to t. We want

    (x + Δx)(s + Δs) ≈ t,   A(x + Δx) = b,   A^⊤(y + Δy) + (s + Δs) = c

(omitting the non-negativity conditions). Using our assumption on (x, y, s), and noting that the second-order term ΔxΔs is small, the equations simplify to the linear system

    [ A   0    0 ] [Δx]   [ 0        ]
    [ 0   A^⊤  I ] [Δy] = [ 0        ]
    [ S   0    X ] [Δs]   [ t·1 − xs ]

This is a linear system, so we can solve it exactly:

Exercise 14.1.4. Let r = t·1 − xs. Prove that SΔx = (I − P)r and XΔs = Pr, where P = XA^⊤(AS^{−1}XA^⊤)^{−1}AS^{−1}.

Lemma 14.1.5. Let x' = x + Δx and s' = s + Δs. If Σ_i (x_i s_i − t)² ≤ ε²t², then Σ_i (x'_i s'_i − t)² ≤ (ε⁴ + O(ε⁵))t².

Proof. Since the Newton step satisfies SΔx + XΔs = r exactly, we have x'_i s'_i − t = (x_i + Δx_i)(s_i + Δs_i) − t = Δx_i Δs_i. Hence, using x_i s_i ≥ (1 − ε)t,

    LHS = Σ_i (x'_i s'_i − t)² = Σ_i (Δx)_i² (Δs)_i² ≤ (1/((1 − ε)²t²)) Σ_i x_i² s_i² (Δx)_i² (Δs)_i².

Using the formulas in Exercise 14.1.4, this can be bounded by

    (1/((1 − ε)²t²)) Σ_i ((I − P)r)_i² (Pr)_i² ≤ (1/((1 − ε)²t²)) ‖(I − P)r‖_4² ‖Pr‖_4² ≤ (1/((1 − ε)²t²)) ‖(I − P)r‖_2² ‖Pr‖_2².

Note that P = (XS)^{1/2} Π (XS)^{−1/2}, where Π = X^{1/2}S^{−1/2} A^⊤ (AS^{−1}XA^⊤)^{−1} A X^{1/2}S^{−1/2} is an orthogonal projection matrix. Hence

    ‖P‖_op ≤ ‖(XS)^{1/2}‖_op ‖Π‖_op ‖(XS)^{−1/2}‖_op ≤ √((1 + ε)/(1 − ε)) ≤ 1/(1 − ε),

and similarly ‖I − P‖_op ≤ 1/(1 − ε). Hence, we have

    LHS ≤ (1/((1 − ε)⁶ t²)) ‖r‖_2⁴ ≤ ε⁴t⁴/((1 − ε)⁶ t²) = (ε⁴ + O(ε⁵))t².
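As a sanity check of Exercise 14.1.4, the following sketch (toy random data; illustrative only, not part of the algorithm) solves the block Newton system directly and compares it against the projection formula:

% Numerically verify S*dx = (I-P)r and X*ds = P*r on random data.
rng(0); m = 4; n = 10;
A = randn(m, n); x = rand(n, 1) + 0.5; s = rand(n, 1) + 0.5;
t = mean(x.*s); r = t*ones(n,1) - x.*s;
X = diag(x); S = diag(s);
% Full block Newton system [A 0 0; 0 A' I; S 0 X][dx; dy; ds] = [0; 0; r]:
M = [A, zeros(m, m), zeros(m, n);
     zeros(n, n), A', eye(n);
     S, zeros(n, m), X];
d = M \ [zeros(m,1); zeros(n,1); r];
dx = d(1:n); ds = d(n+m+1:end);
P = X*A'*((A*(S\X)*A')\(A/S));    % P = X A'(A S^{-1} X A')^{-1} A S^{-1}
fprintf('||S*dx - (I-P)r|| = %.1e\n', norm(S*dx - (r - P*r)));
fprintf('||X*ds - P*r||    = %.1e\n', norm(X*ds - P*r));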

Using this, we have the following theorem:

Theorem 14.1.6. We can solve a linear program with ε error in O(√n log(n/ε)) iterations, where each iteration only needs to solve one linear system.

Proof. Let E = Σ_i (x_i s_i − t)² be the error of the current iterate. We always maintain E ≤ t²/5 for the current (x, y, s) and the current t. Whenever E ≥ t²/20, we decrease E using Lemma 14.1.5 (one Newton step with ε² = 1/5 brings E below t²/20). If not, we decrease t to t(1 − h) with h = 1/(10√n). Note that

    Σ_i (x_i s_i − t(1 − h))² ≤ 2 Σ_i (x_i s_i − t)² + 2t²h²n ≤ 2t²/20 + 2t²/100 ≤ (t(1 − h))²/5.

Therefore, the invariant is preserved after this decrease of t. Hence, each iteration we solve one linear system, and we decrease t by a (1 − 1/(10√n)) factor at least every other iteration. So, it takes O(√n log(n/ε)) iterations to decrease t from 1 to ε/n.

Remark. Due to our reduction to get the initial point, we are paying huge factors inside the logarithmic terms. For many problems, this factor is polynomial, and hence the total iteration count is still O(√n log(n/ε)). However, for some problems such as circuit evaluation, the factor can be exponentially large, and hence we only get an n^{1.5}-type bound. This is natural, because otherwise we would prove that any program can be evaluated in roughly √n depth!

14.1.4 Where does √n come from?

The central path satisfies the following ODE:

    S_t (d/dt)x_t + X_t (d/dt)s_t = 1,   A (d/dt)x_t = 0,   A^⊤ (d/dt)y_t + (d/dt)s_t = 0.

Solving this linear system, we have S_t (dx_t/dt) = (I − P_t)1 and X_t (ds_t/dt) = P_t 1, where P_t = X_t A^⊤(AS_t^{−1}X_tA^⊤)^{−1}AS_t^{−1}. Using x_t s_t = t, we have

    P_t = X_t A^⊤(AX_t²A^⊤)^{−1}AX_t = S_t^{−1}A^⊤(AS_t^{−2}A^⊤)^{−1}AS_t^{−1}

and

    X_t^{−1} (dx_t/dt) = (1/t)(I − P_t)1   and   S_t^{−1} (ds_t/dt) = (1/t)P_t 1.

Equivalently, we have

    d ln x_t / d ln t = (I − P_t)1   and   d ln s_t / d ln t = P_t 1.

Note that ‖P_t 1‖₂ ≤ ‖1‖₂ = √n, since P_t is an orthogonal projection in the form above, and similarly ‖(I − P_t)1‖₂ ≤ √n. Hence, x_t and s_t change by at most a constant factor when we change t by a (1 ± 1/√n) factor.

Exercise 14.1.7. If we are given x such that ‖ln x − ln x_t‖_∞ = O(1), then we can find x_t by solving Õ(1) linear systems.

If v is a random vector of Euclidean length √n, then ‖P_t v‖_∞ = O(√(log n)). Maybe this is the reason the iteration count of interior point methods in practice looks nearly constant rather than polynomial.
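Since P_t = X_t A^⊤(AX_t²A^⊤)^{−1}AX_t is symmetric and idempotent, the bound ‖P_t 1‖₂ ≤ √n is just the fact that an orthogonal projection does not increase the ℓ₂ norm. A tiny numerical illustration (random toy data, illustrative only):

% P_t = X A'(A X^2 A')^{-1} A X is an orthogonal projection, so
% ||P_t*1||_2 <= ||1||_2 = sqrt(n).
rng(1); m = 5; n = 20;
A = randn(m, n); x = rand(n, 1) + 0.1;   % any x > 0 gives this form of P_t
X = diag(x);
P = X*A'*((A*X^2*A')\(A*X));
e = ones(n, 1);
fprintf('||P*1|| = %.3f, sqrt(n) = %.3f\n', norm(P*e), sqrt(n));
% symmetric and idempotent up to numerical error:
fprintf('||P^2-P|| = %.1e, ||P-P''|| = %.1e\n', norm(P^2 - P), norm(P - P'));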

14.2 Mehrotra predictor-corrector method

In practice, we implement the interior point method slightly differently. First, we do not need to start from a feasible point, so we do not need the initial-point reduction. Second, our step does not aim at xs = t. Instead, every step first aims at xs = 0 (the affine direction). Then, we check how large a step we can take in this affine direction; this is used to set the centering parameter. Finally, we compute a second-order update step: since (x + Δx)(s + Δs) = xs + (sΔx + xΔs) + ΔxΔs, the corrector compensates for the second-order term ΔxΔs that the affine step ignores. Formally, the algorithm is as follows:

Compute the initial point: Let x, y, s be the minimizers of min_{Ax=b} ‖x‖² and min_{A^⊤y+s=c} ‖s‖². Set δ_x = max(−1.5 min_i x_i, 0) and δ_s = max(−1.5 min_i s_i, 0). Set

    y^(0) = y,
    x^(0) = x + δ_x + 0.5 (x + δ_x e)^⊤(s + δ_s e) / Σ_i (s_i + δ_s),
    s^(0) = s + δ_s + 0.5 (x + δ_x e)^⊤(s + δ_s e) / Σ_i (x_i + δ_x).

For k = 0, 1, 2, ...:

- Solve the equation

      [ 0      A^⊤   I     ] [Δx_aff]   [ −r_c          ]
      [ A      0     0     ] [Δy_aff] = [ −r_b          ]
      [ S^(k)  0     X^(k) ] [Δs_aff]   [ −X^(k)S^(k)e  ]

  where r_b = Ax^(k) − b and r_c = A^⊤y^(k) + s^(k) − c.

- Calculate

      α_aff^pri  = max{α ∈ [0,1] : x^(k) + αΔx_aff ≥ 0};
      α_aff^dual = max{α ∈ [0,1] : s^(k) + αΔs_aff ≥ 0};
      μ = x^(k)⊤s^(k)/n;
      μ_aff = (x^(k) + α_aff^pri Δx_aff)^⊤(s^(k) + α_aff^dual Δs_aff)/n.

- Set the centering parameter to σ = min((μ_aff/μ)³, 1).

- Solve the equation

      [ 0      A^⊤   I     ] [Δx_cc]   [ 0                     ]
      [ A      0     0     ] [Δy_cc] = [ 0                     ]
      [ S^(k)  0     X^(k) ] [Δs_cc]   [ σμe − ΔX_aff ΔS_aff e ]

- Calculate (Δx^(k), Δy^(k), Δs^(k)) = (Δx_aff + Δx_cc, Δy_aff + Δy_cc, Δs_aff + Δs_cc).

- Set

      α_max^pri  = max{α ≥ 0 : x^(k) + αΔx^(k) ≥ 0};
      α_max^dual = max{α ≥ 0 : s^(k) + αΔs^(k) ≥ 0};
      α_k^pri = min(0.99 α_max^pri, 1);   α_k^dual = min(0.99 α_max^dual, 1).

- Update

      x^(k+1) = x^(k) + α_k^pri Δx^(k);
      (y^(k+1), s^(k+1)) = (y^(k), s^(k)) + α_k^dual (Δy^(k), Δs^(k)).

Listing 2: Mehrotra predictor-corrector method

% This is an implementation of Mehrotra's predictor-corrector algorithm.
% We solve the linear program min <c,x> subject to Ax = b, x >= 0.
function [x, iter, mu] = solvelp(A, b, c)
    m = size(A,1); n = size(A,2); maxiter = 100;
    p = colamd(A');               % reorder rows of A to reduce Cholesky fill-in
    A = A(p,:); b = b(p);

    % Find an initial point
    R = builtin('cholinf', A*A');
    y = R\(R'\(A*c)); s = c - A'*y; x = A'*(R\(R'\b));
    dx = max(-1.5*min(x), 0); ds = max(-1.5*min(s), 0);
    mu0 = 0.5*((x + dx)'*(s + ds));      % Mehrotra's starting-point shift
    x = x + dx + mu0/sum(s + ds);
    s = s + ds + mu0/sum(x + dx);
    nb = mean(abs(b));
    mu = [];
    for iter = 1:maxiter
        mu(end+1) = sum(x.*s)/n;
        rb = A*x - b; rc = A'*y + s - c; rxs = -x.*s;
        if mu(end) < 1e-6 && mean(abs(rb)) < 1e-6*nb, break; end

        % Predictor (affine) step
        R = builtin('cholinf', A*diag(sparse(x./s))*A');
        [dx_a, dy_a, ds_a] = step_dir(A, R, x, s, rc, rb, rxs);

        xa_step = min(dist(x, dx_a), 1); sa_step = min(dist(s, ds_a), 1);
        mu_a = sum((x + xa_step*dx_a).*(s + sa_step*ds_a))/n;
        sigma = min((mu_a/mu(end))^3, 1);

        % Corrector step
        rxs = sigma*mu(end) - dx_a.*ds_a;
        [dx_c, dy_c, ds_c] = step_dir(A, R, x, s, zeros(n,1), zeros(m,1), rxs);

        dx = dx_a + dx_c; dy = dy_a + dy_c; ds = ds_a + ds_c;
        x_step = min(0.99*dist(x, dx), 1);
        s_step = min(0.99*dist(s, ds), 1);

        x = x + x_step*dx; y = y + s_step*dy; s = s + s_step*ds;
    end
    if iter == maxiter, x = ones(n,1)*NaN; end   % x = NaN if it doesn't converge
end

% This outputs the solution of the linear system
%   [0 A' I; A 0 0; S 0 X][dx; dy; ds] = [-rc; -rb; rxs].
% The Cholesky decomposition R'R = A*diag(x./s)*A' is given.
function [dx, dy, ds] = step_dir(A, R, x, s, rc, rb, rxs)
    r = rb + A*((x.*rc + rxs)./s);
    dy = -(R\(R'\r));
    ds = -rc - A'*dy;
    dx = (rxs - x.*ds)./s;
end

% Largest step alpha >= 0 such that x + alpha*dx >= 0 (Inf if dx >= 0).
function step = dist(x, dx)
    step = -x./dx;
    step(dx >= 0) = Inf;
    step = min(step);
end
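A minimal usage sketch for the listing above (hypothetical random instance; assumes solvelp is on the MATLAB path, that your MATLAB version still ships the undocumented cholinf builtin used in Listing 2, and that linprog from the Optimization Toolbox is available for comparison):

% Solve a random feasible LP and compare against linprog (illustrative only).
rng(0); m = 20; n = 50;
A = sparse(randn(m, n));            % colamd in solvelp expects a sparse matrix
x0 = rand(n, 1) + 1; b = A*x0;      % b chosen so the LP is strictly feasible
c = rand(n, 1);                     % c > 0, so the LP is bounded below
[x, iter, mu] = solvelp(A, b, c);
fprintf('iterations = %d, final duality measure = %.1e\n', iter, mu(end));
xref = linprog(c, [], [], A, b, zeros(n, 1), []);
fprintf('objective gap vs linprog: %.1e\n', abs(c'*x - c'*xref));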

We note the following:

1. Usually, linear programs are given in the form min_{Ax=b, l≤x≤u} c^⊤x. The best approach is to handle the bounds directly. But for simplicity of the code, we convert the linear program to the form min_{Ax=b, x≥0} c^⊤x as follows:
   (a) For any unconstrained variable x, replace it by x = x₊ − x₋ with new constraints x₊ ≥ 0 and x₋ ≥ 0.
   (b) For any two-sided constraint l ≤ x ≤ u, introduce x̄ = u − x with the constraints x + x̄ = u, x ≥ l, and x̄ ≥ 0.
   (c) For any one-sided constraint x ≥ l, shift it to x ≥ 0.

2. This code is barely stable enough to pass all test cases in the netlib problems if we perturb c to c + ε1, where ε is some tiny positive number. This avoids the issue that the set of optimal points can be unbounded. All examples are solved in fewer than 70 iterations, most of them in fewer than 40 iterations (for 10⁻⁶ error).

3. I have not implemented the test for infeasibility (see [85]) or the procedure for rounding to an exact solution (see [83]).

4. I have not implemented methods to simplify the linear program or to rescale it. However, to make the algorithm faster for problems with dense columns in A, we split the corresponding columns into many copies and duplicate the corresponding variables (not shown in the MATLAB code above, but included in the zip). Usually this is not done this way; instead, one modifies the linear system solver to handle those columns separately.

5. One may be able to slightly improve the running time by using a higher-order approximation of the central path. For more implementation details, see [84].

References

[83] Erling D. Andersen. On exploiting problem structure in a basis identification procedure for linear programming. INFORMS Journal on Computing, 11(1):95-103, 1999.

[84] Stephen J. Wright. Primal-Dual Interior-Point Methods. SIAM, 1997.

[85] Xiaojie Xu, Pi-Fang Hung, and Yinyu Ye. A simplified homogeneous and self-dual linear programming algorithm and its implementation. Annals of Operations Research, 62(1):151-171, 1996.