On Generalized Primal-Dual Interior-Point Methods with Non-uniform Complementarity Perturbations for Quadratic Programming

Altuğ Bitlislioğlu and Colin N. Jones

Abstract — This technical note discusses convergence conditions of a generalized variant of primal-dual interior-point methods. The generalization arises from permitting a non-uniform complementarity perturbation vector, which is equivalent to having a different barrier parameter for each constraint instead of a single global barrier parameter. Widely used predictor-corrector methods and recently developed coordinated schemes can be considered specific cases of the non-uniform perturbation framework. For convex quadratic programs, the polynomial complexity result of the standard feasible path-following method with a uniform perturbation vector is extended to the generalized case by imposing safeguarding conditions that keep the iterates close to the central path.

I. INTRODUCTION

We consider the following convex quadratic optimization problem:

min_{x,s} (1/2) x^T H x + q^T x
s.t. Ax + b + s = 0, s ≥ 0, (λ) (1)

where λ is the dual variable associated with the positivity constraint. This problem can be efficiently solved with primal-dual interior-point methods, which enjoy worst-case polynomial complexity [6]. In standard primal-dual methods, the problem is solved with the help of a smoothing of the KKT system. The smoothing is obtained by perturbing the complementarity conditions with a uniform positive vector. In practice, certain correction terms are added to the perturbation vector [4], [7] to account for the non-linearity of the complementarity equation. This type of correction can be considered a non-uniform perturbation vector, since the corrected perturbation parameter can differ for each constraint.
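As an illustration of the last point, a Mehrotra-style corrector [7] replaces the uniform complementarity target ση̂ in each row by ση̂ − Δλ_j^aff Δs_j^aff, which generally differs across constraints. A minimal numerical sketch (the iterate and predictor directions below are illustrative values, not taken from the note):

```python
import numpy as np

# Illustrative iterate and affine-scaling (predictor) directions:
lam = np.array([1.0, 0.5, 2.0])
s = np.array([0.5, 1.0, 0.25])
dlam_aff = np.array([-0.2, 0.1, -0.4])
ds_aff = np.array([0.1, -0.3, 0.2])

eta_hat = lam @ s / lam.size   # average complementarity value
sigma = 0.1                    # centering parameter

# Corrected perturbation vector: the uniform target minus the predictor's
# second-order complementarity error, one entry per constraint.
beta = sigma * eta_hat - dlam_aff * ds_aff
print(beta)                    # entries differ, i.e. the perturbation is non-uniform
```

Each entry of `beta` then plays the role of a constraint-specific barrier parameter, which is exactly the non-uniform setting analyzed in this note.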
Another motivation for using a non-uniform perturbation arises from the implementation of primal-dual methods in a multi-agent setting [1], where it is desirable to use a different perturbation vector for each set of local constraints in order to reduce the communication burden. In this technical note we analyze the convergence properties of primal-dual interior-point methods with a non-uniform complementarity perturbation vector for quadratic programming. Following closely the analysis of [6], we show that in this generalized setting identical worst-case complexity results can be obtained, provided that certain centrality conditions are enforced. These conditions are automatically satisfied in the uniform setting, and they provide guidelines for ensuring safe operation in practical implementations of coordinated or predictor-corrector type primal-dual interior-point methods. In the following, we first introduce the standard primal-dual interior-point method and then provide the complexity result for the non-uniform setting.

II. PRIMAL-DUAL INTERIOR POINT METHOD

This section provides a summary of the standard primal-dual interior-point framework. See [8], [9] for a more in-depth treatment and [5] for an up-to-date survey. The first-order optimality conditions for problem (1), i.e. the Karush-Kuhn-Tucker (KKT) system, can be written as

r(x, s, λ) = 0, (2)

where r = (r_dual^T, r_comp^T, r_prim^T)^T and the following definitions are used:

r_dual(x, λ) = Hx + q + A^T λ,
r_prim(x, s) = Ax + b + s, (3)
r_comp(λ, s) = ΛS e,

where Λ = diag(λ), S = diag(s), e is the vector of ones, and the operator diag(·) constructs a diagonal matrix from the input vector.

*This technical note includes some of the explanatory material from [1] and provides the proof of Theorem 1 of [1]. This work has received support from the European Research Council under the European Union's Seventh Framework Programme (FP7/2007-2013), ERC Grant Agreement 307608.

The authors are with the Laboratoire d'Automatique, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland. {altug.bitlislioglu, colin.jones}@epfl.ch
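The residual map (3) is direct to evaluate. A minimal sketch for a small QP instance (the data H, q, A, b and the starting point are illustrative, not from the note):

```python
import numpy as np

# Illustrative QP data: min 0.5 x'Hx + q'x  s.t.  Ax + b + s = 0, s >= 0.
H = np.array([[2.0, 0.0], [0.0, 2.0]])
q = np.array([-2.0, -5.0])
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
b = np.array([-2.0, -2.0, 0.0])

def residuals(x, s, lam):
    """KKT residuals of (2)-(3): r = (r_dual, r_comp, r_prim)."""
    r_dual = H @ x + q + A.T @ lam
    r_prim = A @ x + b + s
    r_comp = lam * s            # componentwise product, i.e. diag(lam) diag(s) e
    return r_dual, r_comp, r_prim

# A strictly interior (not necessarily feasible) starting point:
x0 = np.zeros(2)
s0 = -(A @ x0 + b) + 1.0        # the shift keeps s0 > 0
lam0 = np.ones(3)
r_d, r_c, r_p = residuals(x0, s0, lam0)
```

A PDIP iteration drives all three residual blocks to zero simultaneously, rather than maintaining exact feasibility at every step.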

Primal-dual interior-point (PDIP) methods operate by applying Newton's method to a smoothed version of the nonlinear set of equations (2). The smoothing is obtained by perturbing the complementary slackness condition,

r_cent(β, λ, s) = ΛSe − β, β > 0, (4)

and using the modified function r̂(β) = (r_dual^T, r_cent^T(β), r_prim^T)^T. In order to retain the optimizer of the original KKT system, the perturbation parameter is decreased progressively. As β → 0, the primal-dual pairs converge to the optimizer of (1), following the so-called central path. In general the perturbation vector β is obtained by multiplying a scalar with a vector of ones; here, however, we will let it attain a different value for each constraint. The Newton direction Δz = (Δx, Δs, Δλ) for the perturbed KKT system can be found by solving the following linear system of equations:

[ H  0  A^T ] [ Δx ]
[ 0  Λ   S  ] [ Δs ] = −r̂(β). (5)
[ A  I   0  ] [ Δλ ]

Long-step methods use the maximum step size t that preserves strict positivity of the pair (λ, s), such that (s + tΔs, λ + tΔλ) > 0 holds, whereas short-step variants further restrict the step size to preserve centrality and/or to ensure sufficient decrease in the residuals. The primal-dual interior-point method terminates when the duality-gap measure η = λ^T s and the primal-dual residuals are reduced below desired thresholds. η becomes equal to the actual duality gap when the iterates are primal and dual feasible, r_dual = 0, r_prim = 0 [2]. For a uniform perturbation vector β, when the perturbed KKT system is solved, that is, when we obtain a point on the central path, we have that

η̂ := η/M = λ^T s/M = β,

where λ, s ∈ R^M, M is the total number of inequality constraints, and the average complementarity value η̂ provides a measure of the duality gap. In practice it is not necessary to wait for convergence to the central path before reducing the perturbation parameter.
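The Newton system (5) and the long-step positivity rule can be sketched as follows (dense linear algebra for illustration only; the 0.995 fraction-to-the-boundary factor is a common heuristic, not from the note):

```python
import numpy as np

def pdip_newton_step(H, q, A, b, x, s, lam, beta):
    """Assemble and solve the Newton system (5) for (dx, ds, dlam);
    beta is the complementarity perturbation vector (uniform or not)."""
    n, M = H.shape[0], A.shape[0]
    K = np.block([
        [H,                np.zeros((n, M)), A.T],
        [np.zeros((M, n)), np.diag(lam),     np.diag(s)],
        [A,                np.eye(M),        np.zeros((M, M))],
    ])
    r_hat = np.concatenate([
        H @ x + q + A.T @ lam,   # r_dual
        lam * s - beta,          # r_cent
        A @ x + b + s,           # r_prim
    ])
    d = np.linalg.solve(K, -r_hat)
    return d[:n], d[n:n + M], d[n + M:]

def max_step(s, lam, ds, dlam, frac=0.995):
    """Largest t in (0, 1] keeping (s + t*ds, lam + t*dlam) > 0 (long-step rule)."""
    t = 1.0
    for v, dv in ((s, ds), (lam, dlam)):
        neg = dv < 0
        if np.any(neg):
            t = min(t, frac * float(np.min(-v[neg] / dv[neg])))
    return t

# Tiny demo: H = [1], q = [-1], one constraint x - 2 + s = 0, at a feasible point.
dx, ds, dlam = pdip_newton_step(np.array([[1.0]]), np.array([-1.0]),
                                np.array([[1.0]]), np.array([-2.0]),
                                np.array([0.5]), np.array([1.5]),
                                np.array([0.5]), np.array([0.075]))
```

On this centered, feasible demo point the dual and primal residual rows of the right-hand side vanish, so the direction only re-targets the complementarity products.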
In the standard feasible path-following method, for which convergence and complexity results are relatively easy to provide [8], the perturbation vector at each step is chosen as β = ση̂ e, σ ∈ (0, 1). To ensure that proximity to the central path is preserved, the iterates can be constrained to stay within certain neighborhoods [3], [8].

III. CONVERGENCE WITH NON-UNIFORM PERTURBATION

For brevity, we write the Newton system in compact form:

M Δz = − [ r_prim,dual ; ΛSe − β ].

Consider the modified Newton step with a non-uniform perturbation vector on the complementarity condition,

M Δz = − [ r_prim,dual ; ΛSe − ΛSξ ], (6)

where the vector ξ ∈ R^M, ξ > 0, specifies how much change is desired in each complementarity pair. For a given σ > 0, ξ can be selected as ξ_j = ση̂/(λ_j s_j) to recover the standard uniform perturbation vector with parameter σ. Note that any corrector applied together with the uniform perturbation can be considered a non-uniform perturbation vector. In this section we state the convergence conditions for quadratic programs (QPs) in the feasible case, where the primal-dual residuals are equal to 0, to simplify the analysis, as is common in the interior-point literature [6], [10]. The results can be extended to the non-feasible case, and for LPs and QPs a feasible point can be obtained by reformulating the problem [5], [6]. We restrict the iterates to lie inside the symmetric neighborhood [6]

N_s = {(x, s, λ) ∈ F^0 : ψη̂ ≤ λ_j s_j ≤ (1/ψ)η̂, j ∈ [1, M]} (7)

for some ψ > 0, where the strictly feasible set F^0 is defined as

F^0 = {(x, s, λ) : (r_prim, r_dual) = 0, (s, λ) > 0}.
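The uniform-recovery choice for ξ and the membership test for the symmetric neighborhood (7) can be sketched as follows (helper names are ours, values illustrative):

```python
import numpy as np

def xi_uniform(lam, s, sigma):
    """xi_j = sigma * eta_hat / (lam_j s_j): with this choice
    Lam S xi = sigma * eta_hat * e, the standard uniform perturbation."""
    eta_hat = lam @ s / lam.size
    return sigma * eta_hat / (lam * s)

def in_symmetric_neighborhood(lam, s, psi):
    """Test (7): psi * eta_hat <= lam_j s_j <= eta_hat / psi for all j."""
    eta_hat = lam @ s / lam.size
    prod = lam * s
    return bool(np.all(prod >= psi * eta_hat) and np.all(prod <= eta_hat / psi))

lam = np.array([1.0, 2.0, 0.5])
s = np.array([1.0, 0.5, 2.0])          # all products lam_j s_j equal 1
xi = xi_uniform(lam, s, sigma=0.1)
```

Here ΛSξ reproduces ση̂ e exactly, and a perfectly centered point like this one lies in N_s for any ψ ∈ (0, 1).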

Furthermore, we restrict the reduction vector ξ to satisfy the following criteria:

∃c, 0 < c < 1 : (1/M) Σ_j ξ_j λ_j s_j = c η̂, (8)

φ (cη̂ / (λ_j s_j)) ≤ ξ_j ≤ (1/φ)(cη̂ / (λ_j s_j)), ∀j, (9)

for some φ > ψ. The constraints (8), (9) enforce that the next point we would like to reach with the Newton step need not be on the central path, as in the uniform case, but should be at least strictly within N_s and should reduce the overall duality gap. By imposing these conditions we will ensure that the linearization error in the Newton step is bounded, and that there is always some finite step size that reduces the duality gap and keeps the iterates within N_s. The following theorem states the convergence and complexity result.

Theorem 1: Let problem (1) be a convex QP and let ε > 0 be given. Suppose that a starting point (x_0, s_0, λ_0) ∈ N_s satisfies

η_0 ≤ 1/ε^κ (10)

for some positive constant κ. Let {η_k} be the sequence generated by the iteration scheme that takes steps using the Newton relation (6) with admissible parameters φ, ψ and reduction vectors ξ_k satisfying (8), (9), with some α, γ ∈ (0, 1), γ < 1 − α and c_k ≤ γ. Then there exist an admissible step size t_k ∈ O(1/M) and an index K ∈ O(M log(1/ε)) such that η_k ≤ ε for all k ≥ K.

The analysis and the proof are similar to those of [6]: we show that, starting from a point within N_s, it is possible to keep the linearization error small and to obtain an admissible step size of order O(1/M) that reduces the duality gap sufficiently and keeps the iterates within N_s. Allowing non-uniformity necessitates the additional constraints (8), (9) on the perturbation vector and results in different bounding terms compared to [6]; however, the complexity result is identical. We start the analysis by deriving bounds on the linearization error.

Lemma 1: For (x, s, λ) ∈ N_s, the Newton direction resulting from (6) satisfies the following relations:

0 ≤ Δλ^T Δs ≤ θ λ^T s, (11)

−θ λ^T s ≤ Δλ_j Δs_j ≤ 2θ λ_j s_j, (12)

where θ is defined as

θ := 2^{−3/2} max( c/(φψ) − 1, 1 − φψc ). (13)

Proof: We start by proving (11). Under the feasibility assumption, the primal-dual residuals are set to 0, and from (5) we have the relations

Δs = −AΔx, A^T Δλ = −HΔx,

and therefore

Δs^T Δλ = −Δx^T A^T Δλ = Δx^T H Δx.

Since the cost function is assumed to be convex, we have H ⪰ 0, which validates the left-hand-side inequality in (11). For the right-hand side of (11) we need a technical result ([5], Lemma 3.3): given two vectors u, v ∈ R^n, if u^T v ≥ 0, then

‖UV e‖ ≤ 2^{−3/2} ‖u + v‖², (14)

where U := diag(u), V := diag(v). With relation (14) at hand, we start with the non-uniformly perturbed centrality equation

SΔλ + ΛΔs = ΛS(ξ − e). (15)

Multiplying both sides by (ΛS)^{−1/2} gives

Λ^{−1/2}S^{1/2}Δλ + Λ^{1/2}S^{−1/2}Δs = (ΛS)^{1/2}(ξ − e). (16)

Since Λ, ΔΛ, S, ΔS are all diagonal matrices, we have that

ΔΛΔS e = (Λ^{−1/2}S^{1/2}ΔΛ)(Λ^{1/2}S^{−1/2}ΔS) e.

Next we use (14) with u := Λ^{−1/2}S^{1/2}Δλ and v := Λ^{1/2}S^{−1/2}Δs. Note that the condition u^T v ≥ 0 is satisfied, since Δλ^T Δs ≥ 0:

‖ΔΛΔS e‖ ≤ 2^{−3/2} ‖Λ^{−1/2}S^{1/2}Δλ + Λ^{1/2}S^{−1/2}Δs‖²
= 2^{−3/2} ‖(ΛS)^{1/2}(ξ − e)‖²
= 2^{−3/2} Σ_j (λ_j s_j)(ξ_j − 1)²
≤ 2^{−3/2} ‖ξ − e‖_∞² λ^T s (since λ_j s_j > 0, ∀j).

Relying on the constraints (8), (9) imposed on the non-uniform perturbation vector ξ, we obtain the parametric bound

‖ξ − e‖_∞ ≤ max( c/(φψ) − 1, 1 − φψc ),

which is equal to θ/2^{−3/2} by definition (13). Therefore we can write the bound on ΔΛΔS e as

‖ΔΛΔS e‖ ≤ θ λ^T s.

Finally, using the Cauchy-Schwarz inequality, we arrive at the upper bound on Δλ^T Δs:

Δλ^T Δs = e^T (ΔΛΔS e) ≤ θ λ^T s,

which proves the right-hand side of (11). Next we prove relation (12), which provides bounds on Δλ_j Δs_j. We write equation (16) element-wise and square both sides:

(s_j/λ_j) Δλ_j² + (λ_j/s_j) Δs_j² + 2 Δλ_j Δs_j = (ξ_j − 1)² λ_j s_j.

Since λ, s > 0, the first two terms are non-negative, and hence

2 Δλ_j Δs_j ≤ (ξ_j − 1)² λ_j s_j, i.e., Δλ_j Δs_j ≤ 0.5 (ξ_j − 1)² λ_j s_j.

Using the relation |Δλ_j Δs_j| ≤ ‖ΔΛΔS e‖_∞ ≤ ‖ΔΛΔS e‖ and the definition of θ, we arrive at

−θ λ^T s ≤ Δλ_j Δs_j ≤ 2θ λ_j s_j.

In the following, we show that a finite step size can be obtained such that the neighborhood constraints are preserved and a sufficient decrease in the duality gap is achieved. For a given direction, we denote the values after a step with step size t ∈ (0, 1] as λ_j(t) = λ_j + tΔλ_j, s_j(t) = s_j + tΔs_j.

Lemma 2: For a given point (x, s, λ) ∈ N_s and the Newton direction resulting from (6), if the step size t ∈ (0, 1] satisfies the conditions

t ≤ c(φ − ψ) / ((ψ + M)θ), (17)

t ≤ c(φ − ψ) / (2θφ), (18)

then the new point obtained by taking a step in the Newton direction with step size t stays in the neighborhood N_s, that is,

(x(t), s(t), λ(t)) ∈ N_s. (19)

Proof: The symmetric neighborhood N_s implies lower- and upper-bound constraints on the complementarity pairs and feasibility of the iterates. We start with the lower-bound constraint. Using Lemma 1, the centrality equation (15), the neighborhood constraint on the current point λ_j s_j ≥ ψη̂, and the bound on the perturbation vector ξ_j ≥ φcη̂/(λ_j s_j), we obtain an initial lower bound on λ_j(t)s_j(t):

λ_j(t)s_j(t) = (λ_j + tΔλ_j)(s_j + tΔs_j)
= λ_j s_j (1 − t) + t ξ_j λ_j s_j + t² Δλ_j Δs_j
≥ ψη̂(1 − t) + tφcη̂ − t²θM η̂.

Next, we derive an upper bound on η̂(t) using relation (8), the centrality equation (15), and Lemma 1:

η̂(t) = λ(t)^T s(t)/M
= ( λ^T s + t(c − 1)λ^T s + t² Δλ^T Δs ) / M
≤ ( λ^T s + t(c − 1)λ^T s + t²θ λ^T s ) / M
= η̂ ( 1 − t + tc + t²θ ). (20)

Using the upper bound on η̂(t) and the lower bound on λ_j(t)s_j(t), we can write a sufficient condition for λ_j(t)s_j(t) ≥ ψη̂(t):

ψη̂(1 − t) + tφcη̂ − t²θM η̂ ≥ ψη̂(1 − t) + tψcη̂ + ψt²θη̂
⇔ (φ − ψ)cη̂ ≥ t((ψ + M)θ)η̂,

which is identical to condition (17). Next we tackle the upper-bound constraint λ_j(t)s_j(t) ≤ (1/ψ)η̂(t), which is equivalent to

λ_j s_j (1 − t) + t ξ_j λ_j s_j + t² Δλ_j Δs_j ≤ (1/ψ)( η̂(1 − t) + tcη̂ + t² Δλ^T Δs / M ).

Seeing that λ_j s_j ≤ (1/ψ)η̂ and ξ_j ≤ cη̂/(φ λ_j s_j), a sufficient condition is

(1/ψ)η̂(1 − t) + t(1/φ)cη̂ + t² Δλ_j Δs_j ≤ (1/ψ)η̂(1 − t) + t(1/ψ)cη̂ + t²(1/ψ) Δλ^T Δs / M
⇔ t ( Δλ_j Δs_j − (1/ψ) Δλ^T Δs / M ) ≤ cη̂ (1/ψ − 1/φ).

Since Δλ_j Δs_j ≤ 2θ λ_j s_j ≤ 2θ(1/ψ)η̂ and Δλ^T Δs ≥ 0, another sufficient condition is

t · 2θ(1/ψ)η̂ ≤ cη̂ (1/ψ − 1/φ) ⇔ t ≤ ψc(1/ψ − 1/φ)/(2θ) = c(φ − ψ)/(2θφ),

which is equivalent to (18). For an LP or QP, the KKT system is composed of linear equations except for the complementarity equation. Therefore the Newton step with r_dual = 0, r_prim = 0 preserves feasibility for any step size, from which we conclude that a step size t ∈ (0, 1] satisfying (17), (18) keeps the iterates within N_s.

Lemma 3: For a given point (x, s, λ) ∈ N_s, a parameter α ∈ (0, 1), and the Newton direction resulting from (6), if the step size t and the mean reduction parameter c satisfy the conditions

1 − c > α, 0 < t ≤ (1 − c − α)/θ, (21)

then the new duality gap η(t) satisfies the reduction requirement

η(t) ≤ (1 − αt) η. (22)

Proof: From (20), we have that η(t) ≤ (1 − γ(t)) η

with γ(t) = t(1 − c) − t²θ. Therefore a sufficient condition for satisfying (22) is

γ(t) = t(1 − c) − t²θ ≥ αt ⇔ t²θ ≤ t(1 − c − α) ⇔ 0 < t ≤ (1 − c − α)/θ,

which is equivalent to (21).

In order to obtain the convergence and complexity result, we first show that there exists a t ∈ O(1/M) satisfying (17), (18) and (21); application of Theorem 3.2 from [10] then provides the desired result. The set of admissible step sizes is given by

{ t : 0 < t ≤ min( (1 − c − α)/θ, c(φ − ψ)/((ψ + M)θ), c(φ − ψ)/(2θφ) ) }, (23)

where θ := 2^{−3/2} max( c/(φψ) − 1, 1 − φψc ). Choose α = 0.1, ψ = 0.1 and φ = 0.2, and fix the reduction parameter c = 0.8 for every step. For these values θ is given by

θ = 2^{−3/2} ( 0.8/(0.2 · 0.1) − 1 ) ≈ 13.8.

It can be shown that the step-size selection t̂ = 1/(200M) is then always admissible, which provides t ∈ O(1/M).

Proof of Theorem 1: We have already shown that, with a judicious selection of parameters, it is possible to find an admissible t ∈ O(1/M). Plugging this into the reduction requirement (22), we get

η_{k+1} ≤ (1 − δ/M) η_k, k = 1, 2, . . . , (24)

for some constant δ > 0. From here we can directly apply Theorem 3.2 from [10] and state the result; we provide the proof for completeness. First we take the log of both sides in (24):

log η_{k+1} ≤ log(1 − δ/M) + log η_k.

Writing the right-hand side recursively and using (10), we get

log η_{k+1} ≤ k log(1 − δ/M) + log η_0 ≤ k log(1 − δ/M) + κ log(1/ε).

The log function has the property that log(1 + x) ≤ x for x > −1. Since δ/M < 1, this leads to

log η_{k+1} ≤ −k(δ/M) + κ log(1/ε).

For the convergence criterion η_k ≤ ε to be satisfied, it is sufficient to have

−k(δ/M) + κ log(1/ε) ≤ log ε,

which is satisfied for all k such that

k ≥ K = (1 + κ) (M/δ) log(1/ε).

REFERENCES

[1] A. Bitlislioglu and C. N. Jones. On coordinated primal-dual interior-point methods for multi-agent optimization. In 56th IEEE Conference on Decision and Control (to appear), Melbourne, Australia, December 2017.
[2] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, Cambridge, 2004.
[3] M. Colombo and J. Gondzio. Further development of multiple centrality correctors for interior point methods. Computational Optimization and Applications, 41(3):277-305, December 2008.
[4] J. Gondzio. Multiple centrality corrections in a primal-dual method for linear programming. Computational Optimization and Applications, 6(2):137-156, September 1996.
[5] J. Gondzio. Interior point methods 25 years later. European Journal of Operational Research, 218(3):587-601, May 2012.
[6] J. Gondzio. Convergence analysis of an inexact feasible interior point method for convex quadratic programming. SIAM Journal on Optimization, 23(3):1510-1527, January 2013.
[7] S. Mehrotra. On the implementation of a primal-dual interior point method. SIAM Journal on Optimization, 2(4):575-601, November 1992.
[8] J. Nocedal and S. J. Wright. Numerical Optimization. Springer, New York, 2000.
[9] F. A. Potra and S. J. Wright. Interior-point methods. Journal of Computational and Applied Mathematics, 124(1-2):281-302, December 2000.
[10] S. J. Wright. Primal-Dual Interior-Point Methods. SIAM, Philadelphia, 1997.