Primal-Dual Interior-Point Methods for Linear Programming based on Newton's Method


Robert M. Freund
March, 2004
© 2004 Massachusetts Institute of Technology.

1 The Problem

The logarithmic barrier approach to solving a linear program dates back to the work of Fiacco and McCormick in 1967 in their book Sequential Unconstrained Minimization Techniques, also known simply as SUMT. The method was not believed then to be either practically or theoretically interesting, when in fact today it is both! The method was re-born as a consequence of Karmarkar's interior-point method, and has been the subject of an enormous amount of research and computation, even to this day. In these notes we present the basic algorithm and a basic analysis of its performance.

Consider the linear programming problem in standard form:

$$P: \quad \text{minimize} \;\; c^T x \quad \text{s.t.} \;\; Ax = b, \; x \geq 0,$$

where $x$ is a vector of $n$ variables, whose standard linear programming dual problem is:

$$D: \quad \text{maximize} \;\; b^T \pi \quad \text{s.t.} \;\; A^T \pi + s = c, \; s \geq 0.$$

Given a feasible solution $x$ of $P$ and a feasible solution $(\pi, s)$ of $D$, the duality gap is simply

$$c^T x - b^T \pi = x^T (A^T \pi + s) - (Ax)^T \pi = x^T s \geq 0.$$
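As a quick numerical sanity check of this identity, here is a minimal NumPy sketch; the function name and interface are ours, and it assumes the pair passed in is feasible:

```python
import numpy as np

def duality_gap(A, b, c, x, pi, s):
    # Assumes Ax = b, x >= 0, A^T pi + s = c, s >= 0 (a feasible pair).
    gap_objectives = c @ x - b @ pi   # difference of primal and dual objectives
    gap_slacks = x @ s                # the equivalent complementarity form x^T s
    assert np.isclose(gap_objectives, gap_slacks)
    return gap_slacks
```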

We introduce the following notation, which will be very convenient for manipulating equations. Suppose that $x > 0$. Define the matrix $X$ to be the $n \times n$ diagonal matrix whose diagonal entries are precisely the components of $x$. Then $X$ looks like:

$$X = \begin{pmatrix} x_1 & 0 & \cdots & 0 \\ 0 & x_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & x_n \end{pmatrix}.$$

Notice that $X$ is positive definite, and so is $X^2$, which looks like:

$$X^2 = \begin{pmatrix} (x_1)^2 & 0 & \cdots & 0 \\ 0 & (x_2)^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & (x_n)^2 \end{pmatrix}.$$

Similarly, the matrices $X^{-1}$ and $X^{-2}$ look like:

$$X^{-1} = \begin{pmatrix} 1/x_1 & 0 & \cdots & 0 \\ 0 & 1/x_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1/x_n \end{pmatrix}$$

and

$$X^{-2} = \begin{pmatrix} 1/(x_1)^2 & 0 & \cdots & 0 \\ 0 & 1/(x_2)^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1/(x_n)^2 \end{pmatrix}.$$

Let us introduce a logarithmic barrier term for $P$, with barrier parameter $\mu > 0$. We obtain:

$$P(\mu): \quad \text{minimize} \;\; c^T x - \mu \sum_{j=1}^{n} \ln(x_j) \quad \text{s.t.} \;\; Ax = b, \; x > 0.$$

Because the gradient of the objective function of $P(\mu)$ is simply $c - \mu X^{-1} e$ (where $e$ is the vector of ones, i.e., $e = (1, 1, \ldots, 1)^T$), the Karush-Kuhn-Tucker conditions for $P(\mu)$ are:

$$Ax = b, \; x > 0, \qquad c - \mu X^{-1} e = A^T \pi. \tag{1}$$

If we define $s = \mu X^{-1} e$, then $Xs = \mu e$,

equivalently $XSe = \mu e$, and we can rewrite the Karush-Kuhn-Tucker conditions as:

$$Ax = b, \; x > 0, \qquad A^T \pi + s = c, \qquad XSe - \mu e = 0. \tag{2}$$

From the equations of (2) it follows that if $(x, \pi, s)$ is a solution of (2), then $x$ is feasible for $P$, $(\pi, s)$ is feasible for $D$, and the resulting duality gap is:

$$x^T s = e^T XSe = \mu \, e^T e = n\mu.$$

This suggests that we try solving $P(\mu)$ for a variety of values of $\mu$ as $\mu \to 0$. However, we cannot usually solve (2) exactly, because the third equation group is not linear in the variables. We will instead define a $\beta$-approximate solution of the Karush-Kuhn-Tucker conditions (2). A $\beta$-approximate solution of $P(\mu)$ is defined as any solution $(x, \pi, s)$ of

$$Ax = b, \; x > 0, \qquad A^T \pi + s = c, \qquad \left\| \frac{1}{\mu} Xs - e \right\| \leq \beta. \tag{3}$$

Here the norm $\|\cdot\|$ is the Euclidean norm.
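For illustration, a direct translation of definition (3) into NumPy might look as follows; the function name and the feasibility tolerance are our own choices, not part of the notes:

```python
import numpy as np

def is_beta_approximate(A, b, c, x, pi, s, mu, beta, feas_tol=1e-10):
    # The three conditions of (3): primal feasibility, dual feasibility,
    # and the centrality condition ||(1/mu) X s - e|| <= beta.
    primal_ok = np.all(x > 0) and np.linalg.norm(A @ x - b) <= feas_tol
    dual_ok = np.linalg.norm(A.T @ pi + s - c) <= feas_tol
    centrality = np.linalg.norm((x * s) / mu - np.ones(len(x)))
    return primal_ok and dual_ok and centrality <= beta
```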

Lemma. If ( x, π, s ) is a β-approximate solution of P () and β <, then x is feasible for P, ( π, s) is feasible for D, and the duality gap satisfies: n( β) c T x b T π = x T s n( + β). (4) Proof: Primal feasibility is obvious. To prove dual feasibility, we need to show that s 0. To see this, note that the third equation system of (3) implies that β x j s j β which we can rearrange as: ( β) x j s j ( + β). (5) Therefore x j s j ( β) > 0, which implies x j s j > 0, and so s j > 0. From (5) we have n n n n( β) = ( β) x j s j = x T s ( + β) = n( + β). j= j= j= 2 The Primal-Dual Algorithm Based on the analysis just presented, we are motivated to develop the following algorithm: Step 0. Initialization. Data is (x 0,π 0,s 0, 0 ). k = 0. Assume that (x 0,π 0,s 0 )isa β-approximate solution of P ( 0 ) for some known value of β that satisfies β < 2. 6

Step. Set current values. ( x, π, s ) =(x k,π k,s k ), = k. Step 2. Shrink. Set = α for some α (0, ). Step 3. Compute the primal-dual Newton direction. Compute the Newton step ( x, π, s) for the equation system (2) at (x, π, s) =( x, π, s ) for =, by solving the following system of equations in the variables ( x, π, s): A x = 0 A T π + s = 0 (6) S x + X s = XSe e. Denote the solution to this system by ( x, π, s). Step 4. Update All Values. (x,π,s)=( x, π, s )+( x, π, s) Step 5. Reset Counter and Continue. (x k+,π k+,s k+ )=(x,π,s). =. k k +. Go to Step. k+ Figure shows a picture of the algorithm. Some of the issues regarding this algorithm include: how to set the approximation constant β and the fractional decrease 3 parameter α. We will see that it will be convenient to set β = 40 and 8 α =. + n the derivation of the primal-dual Newton equation system (6) 5 7

Figure 1: A conceptual picture of the interior-point algorithm.
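To make the loop structure concrete, here is a minimal sketch of Steps 1 through 5 in NumPy. It assumes a routine newton_direction that solves the system (6); one such routine is sketched at the end of Section 3. The function name and interface are ours:

```python
import numpy as np

def short_step_ipm(A, x, pi, s, mu, num_iters):
    # Steps 1-5 with the choices beta = 3/40 and
    # alpha = 1 - 1/(8/5 + 8 sqrt(n)) analyzed in Section 4.
    n = len(x)
    alpha = 1.0 - 1.0 / (8.0 / 5.0 + 8.0 * np.sqrt(n))
    for _ in range(num_iters):
        mu = alpha * mu                              # Step 2: shrink mu
        dx, dpi, ds = newton_direction(A, x, s, mu)  # Step 3: Newton direction
        x, pi, s = x + dx, pi + dpi, s + ds          # Step 4: full Newton step
    return x, pi, s, mu
```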

3 The Primal-Dual Newton Step

We introduce a logarithmic barrier term for $P$ to obtain $P(\mu)$:

$$P(\mu): \quad \text{minimize} \;\; c^T x - \mu \sum_{j=1}^{n} \ln(x_j) \quad \text{s.t.} \;\; Ax = b, \; x > 0.$$

Because the gradient of the objective function of $P(\mu)$ is simply $c - \mu X^{-1} e$ (where $e$ is the vector of ones, i.e., $e = (1, 1, \ldots, 1)^T$), the Karush-Kuhn-Tucker conditions for $P(\mu)$ are:

$$Ax = b, \; x > 0, \qquad c - \mu X^{-1} e = A^T \pi. \tag{7}$$

We define $s = \mu X^{-1} e$, whereby $Xs = \mu e$, equivalently $XSe = \mu e$,

and we can rewrite the Karush-Kuhn-Tucker conditions as:

$$Ax = b, \; x > 0, \qquad A^T \pi + s = c, \qquad XSe - \mu e = 0. \tag{8}$$

Let $(\bar{x}, \bar{\pi}, \bar{s})$ be our current iterate, which we assume is primal and dual feasible, namely:

$$A\bar{x} = b, \; \bar{x} > 0, \qquad A^T \bar{\pi} + \bar{s} = c, \; \bar{s} > 0. \tag{9}$$

Introducing a direction $(\Delta x, \Delta\pi, \Delta s)$, the next iterate will be $(\bar{x}, \bar{\pi}, \bar{s}) + (\Delta x, \Delta\pi, \Delta s)$, and we want to solve:

$$A(\bar{x} + \Delta x) = b, \; \bar{x} + \Delta x > 0,$$
$$A^T(\bar{\pi} + \Delta\pi) + (\bar{s} + \Delta s) = c,$$
$$(\bar{X} + \Delta X)(\bar{S} + \Delta S)e - \mu e = 0.$$

Keeping in mind that $(\bar{x}, \bar{\pi}, \bar{s})$ is primal-dual feasible and so satisfies (9), we can rearrange the above to be:

$$A\Delta x = 0, \qquad A^T \Delta\pi + \Delta s = 0, \qquad \bar{S}\Delta x + \bar{X}\Delta s = \mu e - \bar{X}\bar{S}e - \Delta X \Delta S e.$$

Notice that the only nonlinear term in the above system of equations in $(\Delta x, \Delta\pi, \Delta s)$ is the term $\Delta X \Delta S e$ in the last equation. If we erase this term, which is the same as linearizing the equations, we obtain the following primal-dual Newton equation system:

$$A\Delta x = 0, \qquad A^T \Delta\pi + \Delta s = 0, \qquad \bar{S}\Delta x + \bar{X}\Delta s = \mu e - \bar{X}\bar{S}e. \tag{10}$$

The solution $(\Delta x, \Delta\pi, \Delta s)$ of the system (10) is called the primal-dual Newton step. We can manipulate these equations to yield the following formulas for the solution:

$$\Delta x = \left[ I - \bar{S}^{-1}\bar{X}A^T \left( A\bar{X}\bar{S}^{-1}A^T \right)^{-1} A \right] \left( -\bar{x} + \mu \bar{S}^{-1} e \right),$$
$$\Delta\pi = \left( A\bar{X}\bar{S}^{-1}A^T \right)^{-1} A \left( \bar{x} - \mu \bar{S}^{-1} e \right), \tag{11}$$
$$\Delta s = A^T \left( A\bar{X}\bar{S}^{-1}A^T \right)^{-1} A \left( -\bar{x} + \mu \bar{S}^{-1} e \right).$$

Notice, by the way, that the computational effort in these equations lies primarily in solving a single equation:

$$\left( A\bar{X}\bar{S}^{-1}A^T \right) \Delta\pi = A \left( \bar{x} - \mu \bar{S}^{-1} e \right).$$

Once this system is solved, we can easily substitute:

$$\Delta s = -A^T \Delta\pi, \qquad \Delta x = -\bar{x} + \mu \bar{S}^{-1} e - \bar{S}^{-1}\bar{X}\Delta s. \tag{12}$$
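In code, the substitution scheme (12) is only a few lines of NumPy; the following sketch (the function name is ours) solves the single $m \times m$ system for $\Delta\pi$ and back-substitutes:

```python
import numpy as np

def newton_direction(A, x, s, mu):
    # Solve (A X S^{-1} A^T) dpi = A (x - mu S^{-1} e), then substitute
    # via (12): ds = -A^T dpi and dx = -x + mu S^{-1} e - S^{-1} X ds.
    d = x / s                                   # diagonal of X S^{-1}
    M = A @ (d[:, None] * A.T)                  # A X S^{-1} A^T
    dpi = np.linalg.solve(M, A @ (x - mu / s))
    ds = -A.T @ dpi
    dx = -x + mu / s - d * ds
    return dx, dpi, ds
```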

However, let us instead simply work with the primal-dual Newton system (10). Suppose that $(\Delta x, \Delta\pi, \Delta s)$ is the (unique) solution of the primal-dual Newton system (10). We obtain the new value of the variables $(x', \pi', s')$ by taking the Newton step:

$$(x', \pi', s') = (\bar{x}, \bar{\pi}, \bar{s}) + (\Delta x, \Delta\pi, \Delta s).$$

We have the following very powerful convergence theorem, which demonstrates the quadratic convergence of Newton's method for this problem, with an explicit guarantee of the range in which quadratic convergence takes place.

Theorem 3.1 (Explicit Quadratic Convergence of Newton's Method). Suppose that $(\bar{x}, \bar{\pi}, \bar{s})$ is a $\beta$-approximate solution of $P(\mu)$ and $\beta < 1/2$. Let $(\Delta x, \Delta\pi, \Delta s)$ be the solution to the primal-dual Newton equations (10), and let:

$$(x', \pi', s') = (\bar{x}, \bar{\pi}, \bar{s}) + (\Delta x, \Delta\pi, \Delta s).$$

Then $(x', \pi', s')$ is a $\beta^2 \left( \frac{1+\beta}{(1-\beta)^2} \right)$-approximate solution of $P(\mu)$.

Proof: Our current point $(\bar{x}, \bar{\pi}, \bar{s})$ satisfies:

$$A\bar{x} = b, \; \bar{x} > 0, \qquad A^T \bar{\pi} + \bar{s} = c, \qquad \left\| \frac{1}{\mu} \bar{X}\bar{S}e - e \right\| \leq \beta. \tag{13}$$

Furthermore the primal-dual Newton step $(\Delta x, \Delta\pi, \Delta s)$ satisfies:

$$A\Delta x = 0, \qquad A^T \Delta\pi + \Delta s = 0, \qquad \bar{S}\Delta x + \bar{X}\Delta s = \mu e - \bar{X}\bar{S}e. \tag{14}$$

Note from the first two equations of (14) that $\Delta x^T \Delta s = -\Delta x^T A^T \Delta\pi = -(A\Delta x)^T \Delta\pi = 0$. From the third equation of (13) we have

$$\mu(1-\beta) \leq \bar{x}_j \bar{s}_j \leq \mu(1+\beta), \quad j = 1, \ldots, n, \tag{15}$$

which implies:

$$\bar{x}_j \geq \frac{\mu(1-\beta)}{\bar{s}_j} \quad \text{and} \quad \bar{s}_j \geq \frac{\mu(1-\beta)}{\bar{x}_j}, \quad j = 1, \ldots, n. \tag{16}$$

As a result of this we obtain:

$$\mu(1-\beta) \left\| \bar{X}^{-1}\Delta x \right\|^2 = \mu(1-\beta) \, \Delta x^T \bar{X}^{-1}\bar{X}^{-1}\Delta x \leq \Delta x^T \bar{X}^{-1}\bar{S}\Delta x$$
$$= \Delta x^T \bar{X}^{-1} \left( \mu e - \bar{X}\bar{S}e - \bar{X}\Delta s \right) = \Delta x^T \bar{X}^{-1} \left( \mu e - \bar{X}\bar{S}e \right) - \Delta x^T \Delta s$$
$$= \Delta x^T \bar{X}^{-1} \left( \mu e - \bar{X}\bar{S}e \right) \leq \left\| \bar{X}^{-1}\Delta x \right\| \left\| \mu e - \bar{X}\bar{S}e \right\| \leq \left\| \bar{X}^{-1}\Delta x \right\| \mu\beta.$$

From this it follows that

$$\left\| \bar{X}^{-1}\Delta x \right\| \leq \frac{\beta}{1-\beta} < 1.$$

Therefore $x' = \bar{x} + \Delta x = \bar{X}(e + \bar{X}^{-1}\Delta x) > 0$.

We have the exact same chain of inequalities for the dual variables:

$$\mu(1-\beta) \left\| \bar{S}^{-1}\Delta s \right\|^2 = \mu(1-\beta) \, \Delta s^T \bar{S}^{-1}\bar{S}^{-1}\Delta s \leq \Delta s^T \bar{S}^{-1}\bar{X}\Delta s$$
$$= \Delta s^T \bar{S}^{-1} \left( \mu e - \bar{X}\bar{S}e - \bar{S}\Delta x \right) = \Delta s^T \bar{S}^{-1} \left( \mu e - \bar{X}\bar{S}e \right) - \Delta s^T \Delta x$$
$$= \Delta s^T \bar{S}^{-1} \left( \mu e - \bar{X}\bar{S}e \right) \leq \left\| \bar{S}^{-1}\Delta s \right\| \mu\beta.$$

From this it follows that

$$\left\| \bar{S}^{-1}\Delta s \right\| \leq \frac{\beta}{1-\beta} < 1.$$

Therefore $s' = \bar{s} + \Delta s = \bar{S}(e + \bar{S}^{-1}\Delta s) > 0$.

Next note from (14) that for $j = 1, \ldots, n$ we have:

$$x'_j s'_j = (\bar{x}_j + \Delta x_j)(\bar{s}_j + \Delta s_j) = \bar{x}_j \bar{s}_j + \bar{x}_j \Delta s_j + \bar{s}_j \Delta x_j + \Delta x_j \Delta s_j = \mu + \Delta x_j \Delta s_j.$$

Therefore

$$\frac{1}{\mu} x'_j s'_j - 1 = \frac{1}{\mu} \Delta x_j \Delta s_j.$$

From this we obtain:

$$\left\| \frac{1}{\mu} X'S'e - e \right\| = \frac{1}{\mu} \left( \sum_{j=1}^{n} \left( \Delta x_j \Delta s_j \right)^2 \right)^{1/2} = \frac{1}{\mu} \left( \sum_{j=1}^{n} \left( \bar{x}_j \bar{s}_j \right)^2 \left[ \left( \bar{X}^{-1}\Delta x \right)_j \left( \bar{S}^{-1}\Delta s \right)_j \right]^2 \right)^{1/2}$$
$$\leq (1+\beta) \left\| \bar{X}^{-1}\Delta x \right\| \left\| \bar{S}^{-1}\Delta s \right\| \leq (1+\beta) \left( \frac{\beta}{1-\beta} \right)^2 = \beta^2 \frac{1+\beta}{(1-\beta)^2},$$

using $\bar{x}_j \bar{s}_j \leq \mu(1+\beta)$ from (15).

4 Complexity Analysis of the Algorithm

Theorem 4.1 (Relaxation Theorem). Suppose that $(\bar{x}, \bar{\pi}, \bar{s})$ is a $\beta = \frac{3}{40}$-approximate solution of $P(\mu)$. Let

$$\alpha = 1 - \frac{1}{\frac{8}{5} + 8\sqrt{n}}$$

and let $\mu' = \alpha\mu$. Then $(\bar{x}, \bar{\pi}, \bar{s})$ is a $\beta' = \frac{1}{5}$-approximate solution of $P(\mu')$.

Proof: The triplet $(\bar{x}, \bar{\pi}, \bar{s})$ satisfies $A\bar{x} = b$, $\bar{x} > 0$, and $A^T \bar{\pi} + \bar{s} = c$, and so it remains to show that $\left\| \frac{1}{\mu'} \bar{X}\bar{s} - e \right\| \leq \frac{1}{5}$. We have:

$$\left\| \frac{1}{\mu'} \bar{X}\bar{s} - e \right\| = \left\| \frac{1}{\alpha\mu} \bar{X}\bar{s} - e \right\| = \left\| \frac{1}{\alpha} \left( \frac{1}{\mu} \bar{X}\bar{s} - e \right) + \frac{1-\alpha}{\alpha} e \right\|$$
$$\leq \frac{1}{\alpha} \left\| \frac{1}{\mu} \bar{X}\bar{s} - e \right\| + \frac{1-\alpha}{\alpha} \| e \| \leq \frac{\frac{3}{40} + (1-\alpha)\sqrt{n}}{\alpha} = \frac{1}{5},$$

where the final equality follows from the definition of $\alpha$.

Theorem 4.2 (Convergence Theorem). Suppose that $(x^0, \pi^0, s^0)$ is a $\beta = \frac{3}{40}$-approximate solution of $P(\mu^0)$. Then for all $k = 1, 2, 3, \ldots$, $(x^k, \pi^k, s^k)$ is a $\beta = \frac{3}{40}$-approximate solution of $P(\mu^k)$.

Proof: By induction, suppose that the theorem is true for iterates $0, 1, 2, \ldots, k$. Then $(x^k, \pi^k, s^k)$ is a $\beta = \frac{3}{40}$-approximate solution of $P(\mu^k)$. From the Relaxation Theorem, $(x^k, \pi^k, s^k)$ is a $\frac{1}{5}$-approximate solution of $P(\mu^{k+1})$, where $\mu^{k+1} = \alpha\mu^k$. From the Quadratic Convergence Theorem (Theorem 3.1), $(x^{k+1}, \pi^{k+1}, s^{k+1})$ is a $\beta'$-approximate solution of $P(\mu^{k+1})$ for

$$\beta' = \left( \frac{1 + \frac{1}{5}}{\left( 1 - \frac{1}{5} \right)^2} \right) \left( \frac{1}{5} \right)^2 = \frac{3}{40}.$$

Therefore, by induction, the theorem is true for all values of $k$.

Figure 2 shows a better picture of the algorithm.

Theorem 4.3 (Complexity Theorem). Suppose that $(x^0, \pi^0, s^0)$ is a $\beta = \frac{3}{40}$-approximate solution of $P(\mu^0)$. In order to obtain primal and dual feasible solutions $(x^k, \pi^k, s^k)$ with a duality gap of at most $\epsilon$, one needs to run the algorithm for at most

$$k = \left\lceil 10\sqrt{n} \, \ln \left( \frac{43 \, (x^0)^T s^0}{37 \, \epsilon} \right) \right\rceil$$

iterations.

Proof: Let $k$ be as defined above. Note that

$$\alpha = 1 - \frac{1}{\frac{8}{5} + 8\sqrt{n}} \leq 1 - \frac{1}{10\sqrt{n}}.$$

Figure 2: Another picture of the interior-point algorithm.

Therefore

$$\mu^k \leq \left( 1 - \frac{1}{10\sqrt{n}} \right)^k \mu^0.$$

This implies that

$$c^T x^k - b^T \pi^k = (x^k)^T s^k \leq \mu^k n(1+\beta) \leq \left( 1 - \frac{1}{10\sqrt{n}} \right)^k \mu^0 n \left( \frac{43}{40} \right) \leq \left( 1 - \frac{1}{10\sqrt{n}} \right)^k \frac{43}{37} (x^0)^T s^0,$$

where the last inequality uses $\mu^0 n \left( \frac{37}{40} \right) \leq (x^0)^T s^0$ from (4). Taking logarithms, we obtain

$$\ln \left( c^T x^k - b^T \pi^k \right) \leq k \ln \left( 1 - \frac{1}{10\sqrt{n}} \right) + \ln \left( \frac{43}{37} (x^0)^T s^0 \right)$$
$$\leq -\frac{k}{10\sqrt{n}} + \ln \left( \frac{43}{37} (x^0)^T s^0 \right) \leq -\ln \left( \frac{43 \, (x^0)^T s^0}{37 \, \epsilon} \right) + \ln \left( \frac{43}{37} (x^0)^T s^0 \right) = \ln(\epsilon).$$

Therefore $c^T x^k - b^T \pi^k \leq \epsilon$.

5 An Implementable Primal-Dual Interior-Point Algorithm

Herein we describe a more implementable primal-dual interior-point algorithm. This algorithm differs from the previous method in the following respects:

- We do not assume that the current point is near the central path. In fact, we do not assume that the current point is even feasible.
- The fractional decrease parameter $\alpha$ is set to $\alpha = \frac{1}{10}$ rather than the conservative value of $\alpha = 1 - \frac{1}{\frac{8}{5} + 8\sqrt{n}}$.

- We do not necessarily take the full Newton step at each iteration, and we take different step-sizes in the primal and the dual.

Our current point is $(\bar{x}, \bar{\pi}, \bar{s})$, for which $\bar{x} > 0$ and $\bar{s} > 0$, but quite possibly $A\bar{x} \neq b$ and/or $A^T\bar{\pi} + \bar{s} \neq c$. We are given a value of the central path barrier parameter $\mu > 0$. We want to compute a direction $(\Delta x, \Delta\pi, \Delta s)$ so that $(\bar{x} + \Delta x, \bar{\pi} + \Delta\pi, \bar{s} + \Delta s)$ approximately solves the central path equations. We set up the system:

(1) $A(\bar{x} + \Delta x) = b$
(2) $A^T(\bar{\pi} + \Delta\pi) + (\bar{s} + \Delta s) = c$
(3) $(\bar{X} + \Delta X)(\bar{S} + \Delta S)e = \mu e$.

We linearize this system of equations and rearrange the terms to obtain the Newton equations for the current point $(\bar{x}, \bar{\pi}, \bar{s})$:

(1) $A\Delta x = b - A\bar{x} =: r_1$
(2) $A^T\Delta\pi + \Delta s = c - A^T\bar{\pi} - \bar{s} =: r_2$
(3) $\bar{S}\Delta x + \bar{X}\Delta s = \mu e - \bar{X}\bar{S}e =: r_3$

We refer to the solution $(\Delta x, \Delta\pi, \Delta s)$ of the above system as the primal-dual Newton direction at the point $(\bar{x}, \bar{\pi}, \bar{s})$. It differs from that derived earlier only in that earlier it was assumed that $r_1 = 0$ and $r_2 = 0$.

Given our current point $(\bar{x}, \bar{\pi}, \bar{s})$ and a given value of $\mu$, we compute the Newton direction $(\Delta x, \Delta\pi, \Delta s)$ and we update our variables by choosing primal and dual step-sizes $\alpha_P$ and $\alpha_D$ to obtain new values:

$$(\bar{x}, \bar{\pi}, \bar{s}) \leftarrow (\bar{x} + \alpha_P \Delta x, \; \bar{\pi} + \alpha_D \Delta\pi, \; \bar{s} + \alpha_D \Delta s).$$

In order to ensure that $\bar{x} > 0$ and $\bar{s} > 0$, we choose a value of $r$ satisfying $0 < r < 1$ ($r = 0.99$ is a common value in practice), and determine $\alpha_P$ and $\alpha_D$ as follows:

$$\alpha_P = \min \left\{ 1, \; r \min_{\Delta x_j < 0} \left\{ \frac{-\bar{x}_j}{\Delta x_j} \right\} \right\}, \qquad \alpha_D = \min \left\{ 1, \; r \min_{\Delta s_j < 0} \left\{ \frac{-\bar{s}_j}{\Delta s_j} \right\} \right\}.$$

These step-sizes ensure that the next iterate $(\bar{x}, \bar{\pi}, \bar{s})$ satisfies $\bar{x} > 0$ and $\bar{s} > 0$.
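Assembled in NumPy, one full update of this kind might look as follows; we solve the Newton equations as a single block linear system for clarity (the names and the dense assembly are ours, chosen for readability rather than efficiency):

```python
import numpy as np

def infeasible_newton_step(A, b, c, x, pi, s, mu, r=0.99):
    # Newton equations with residuals r1, r2, r3, assembled as one
    # block linear system in (dx, dpi, ds).
    m, n = A.shape
    r1 = b - A @ x
    r2 = c - A.T @ pi - s
    r3 = mu * np.ones(n) - x * s
    K = np.zeros((m + 2 * n, m + 2 * n))
    K[:m, :n] = A                      # A dx         = r1
    K[m:m + n, n:n + m] = A.T          # A^T dpi + ds = r2
    K[m:m + n, n + m:] = np.eye(n)
    K[m + n:, :n] = np.diag(s)         # S dx + X ds  = r3
    K[m + n:, n + m:] = np.diag(x)
    sol = np.linalg.solve(K, np.concatenate([r1, r2, r3]))
    dx, dpi, ds = sol[:n], sol[n:n + m], sol[n + m:]
    # Ratio test: damp the step so that x and s stay strictly positive.
    alpha_p = min(1.0, r * np.min(-x[dx < 0] / dx[dx < 0])) if np.any(dx < 0) else 1.0
    alpha_d = min(1.0, r * np.min(-s[ds < 0] / ds[ds < 0])) if np.any(ds < 0) else 1.0
    return x + alpha_p * dx, pi + alpha_d * dpi, s + alpha_d * ds
```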

5.1 Decreasing the Path Parameter

We also want to shrink $\mu$ at each iteration, in order to (hopefully) shrink the duality gap. The current iterate is $(\bar{x}, \bar{\pi}, \bar{s})$, and the current values satisfy:

$$\mu \approx \frac{\bar{x}^T \bar{s}}{n}.$$

We then re-set $\mu$ to

$$\mu \leftarrow \left( \frac{1}{10} \right) \left( \frac{\bar{x}^T \bar{s}}{n} \right),$$

where the fractional decrease $\frac{1}{10}$ is user-specified.

5.2 The Stopping Criterion

We typically declare the problem solved when it is almost primal feasible, almost dual feasible, and there is almost no duality gap. We set our tolerance $\epsilon$ to be a small positive number, typically $\epsilon = 10^{-8}$, for example, and we stop when:

(1) $\|A\bar{x} - b\| \leq \epsilon$
(2) $\|A^T\bar{\pi} + \bar{s} - c\| \leq \epsilon$
(3) $\bar{s}^T \bar{x} \leq \epsilon$

5.3 The Full Interior-Point Algorithm

1. Given $(x^0, \pi^0, s^0)$ satisfying $x^0 > 0$, $s^0 > 0$, and $\mu^0 > 0$, and $r$ satisfying $0 < r < 1$, and $\epsilon > 0$. Set $k \leftarrow 0$.

2. Test stopping criterion. Check if:
(1) $\|Ax^k - b\| \leq \epsilon$
(2) $\|A^T\pi^k + s^k - c\| \leq \epsilon$
(3) $(s^k)^T x^k \leq \epsilon$.
If so, STOP. If not, proceed.

3. Set

$$\mu \leftarrow \left( \frac{1}{10} \right) \left( \frac{(x^k)^T s^k}{n} \right).$$

4. Solve the Newton equation system:
(1) $A\Delta x = b - Ax^k =: r_1$
(2) $A^T\Delta\pi + \Delta s = c - A^T\pi^k - s^k =: r_2$
(3) $S^k\Delta x + X^k\Delta s = \mu e - X^k S^k e =: r_3$

5. Determine the step-sizes:

$$\alpha_P = \min \left\{ 1, \; r \min_{\Delta x_j < 0} \left\{ \frac{-x^k_j}{\Delta x_j} \right\} \right\}, \qquad \alpha_D = \min \left\{ 1, \; r \min_{\Delta s_j < 0} \left\{ \frac{-s^k_j}{\Delta s_j} \right\} \right\}.$$

6. Update values:

$$(x^{k+1}, \pi^{k+1}, s^{k+1}) \leftarrow (x^k + \alpha_P \Delta x, \; \pi^k + \alpha_D \Delta\pi, \; s^k + \alpha_D \Delta s).$$

Re-set $k \leftarrow k + 1$ and return to Step 2.
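A minimal driver for the six steps above, reusing the infeasible_newton_step sketch from earlier (again with our own names and defaults), might look like:

```python
import numpy as np

def interior_point_lp(A, b, c, x, pi, s, eps=1e-8, max_iters=100):
    # Steps 1-6 of Section 5.3: test the stopping criterion, shrink mu
    # by the factor 1/10, and take a damped Newton step.
    n = len(x)
    for _ in range(max_iters):
        if (np.linalg.norm(A @ x - b) <= eps
                and np.linalg.norm(A.T @ pi + s - c) <= eps
                and s @ x <= eps):
            break                                  # stopping criterion met
        mu = (x @ s) / (10.0 * n)                  # new path parameter
        x, pi, s = infeasible_newton_step(A, b, c, x, pi, s, mu)
    return x, pi, s
```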

5.4 Remarks on Interior-Point Methods

The algorithm just described is almost exactly what is used in commercial interior-point method software. A typical interior-point code will solve a linear or quadratic optimization problem in 25-80 iterations, regardless of the dimension of the problem. These days, interior-point methods have been extended to allow for the solution of a very large class of convex nonlinear optimization problems.

6 Exercises on Interior-Point Methods for LP

1. (Interior Point Method) Verify that the formulas given in (11) indeed solve the equation system (10).

2. (Interior Point Method) Prove Proposition 6.1 of the notes on Newton's method for a logarithmic barrier algorithm for linear programming.

3. (Interior Point Method) Create and implement your own interior-point algorithm to solve the following linear program:

$$\text{LP}: \quad \text{minimize} \;\; c^T x \quad \text{s.t.} \;\; Ax = b, \; x \geq 0,$$

where

$$A = \begin{pmatrix} 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 \end{pmatrix}, \qquad b = \begin{pmatrix} 100 \\ 50 \end{pmatrix},$$

and

$$c = \begin{pmatrix} -9 & -10 & 0 & 0 \end{pmatrix}^T.$$

In running your algorithm, you will need to specify the starting point $(x^0, \pi^0, s^0)$, the starting value of the barrier parameter $\mu^0$, and the value of the fractional decrease parameter $\alpha$. Use the following values and make eight runs of the algorithm as follows:

Starting points:
$(x^0, \pi^0, s^0) = ((1, 1, 1, 1), (0, 0), (1, 1, 1, 1))$
$(x^0, \pi^0, s^0) = ((100, 100, 100, 100), (0, 0), (100, 100, 100, 100))$

Values of $\mu^0$:
$\mu^0 = 1000.0$
$\mu^0 = 10.0$

Values of $\alpha$:
$\alpha = 1 - \frac{1}{10\sqrt{n}}$
$\alpha = \frac{1}{10}$

4. (Interior Point Method) Suppose that $\bar{x} > 0$ is a feasible solution of the following logarithmic barrier problem:

$$P(\mu): \quad \text{minimize} \;\; c^T x - \mu \sum_{j=1}^{n} \ln(x_j) \quad \text{s.t.} \;\; Ax = b, \; x > 0,$$

and that we wish to test if $\bar{x}$ is a $\beta$-approximate solution to $P(\mu)$ for a given value of $\mu$. The conditions for $\bar{x}$ to be a $\beta$-approximate solution to $P(\mu)$ are that there exist values $(\pi, s)$ for which:

$$A\bar{x} = b, \; \bar{x} > 0, \qquad A^T \pi + s = c, \qquad \left\| \frac{1}{\mu} \bar{X}s - e \right\| \leq \beta. \tag{17}$$

Construct an analytic test for whether or not such $(\pi, s)$ exist by solving an associated equality-constrained quadratic program (which has a closed-form solution). HINT: Think of trying to minimize $\left\| \frac{1}{\mu} \bar{X}s - e \right\|$ over appropriate values of $(\pi, s)$.

5. Consider the logarithmic barrier problem:

$$P(\mu): \quad \min \;\; c^T x - \mu \sum_{j=1}^{n} \ln x_j \quad \text{s.t.} \;\; Ax = b, \; x > 0.$$

Suppose $\bar{x}$ is a feasible point, i.e., $\bar{x} > 0$ and $A\bar{x} = b$. Let $\bar{X} = \text{diag}(\bar{x})$.

a. Construct the (primal) Newton direction $\bar{d}$ at $\bar{x}$. Show that $\bar{d}$ can be written as:

$$\bar{d} = \bar{X} \left( I - \bar{X}A^T (A\bar{X}^2 A^T)^{-1} A\bar{X} \right) \left( e - \frac{1}{\mu} \bar{X}c \right).$$

b. Show in the above that $P = I - \bar{X}A^T (A\bar{X}^2 A^T)^{-1} A\bar{X}$ is a projection matrix.

c. Suppose $A^T \bar{\pi} + \bar{s} = c$ and $\bar{s} > 0$ for some $(\bar{\pi}, \bar{s})$. Show that $\bar{d}$ can also be written as:

$$\bar{d} = \bar{X} \left( I - \bar{X}A^T (A\bar{X}^2 A^T)^{-1} A\bar{X} \right) \left( e - \frac{1}{\mu} \bar{X}\bar{s} \right).$$

d. Show that if there exists $(\bar{\pi}, \bar{s})$ with $A^T \bar{\pi} + \bar{s} = c$ and $\bar{s} > 0$, and

$$\left\| e - \frac{1}{\mu} \bar{X}\bar{s} \right\| \leq \frac{1}{4},$$

then $\left\| \bar{X}^{-1}\bar{d} \right\| \leq \frac{1}{4}$, and then the Newton iterate satisfies $\bar{x} + \bar{d} > 0$.