A priori bounds on the condition numbers in interior-point methods

Florian Jarre, Mathematisches Institut, Heinrich-Heine-Universität Düsseldorf, Germany.

Abstract: Interior-point methods are known to be sensitive to ill-conditioning and to the scaling of the data. This paper presents new asymptotically sharp bounds on the condition numbers of the linear systems at each iteration of an interior-point method for solving linear or semidefinite programs, and discusses a stopping test which leads to a problem-independent a priori bound on the condition numbers.

Key words: Condition number, interior-point method, stopping test.

August 7, 2015

1 Introduction

Asymptotically sharp bounds on the condition number of the linear systems in interior-point methods are presented. These bounds can be reduced by a preprocessing that is applicable for dense problems. To motivate the bounds, the preprocessing is discussed first, followed by a statement of the bounds with and without preprocessing. As a starting point, linear programs are considered in Section 2. The extension to semidefinite programs is addressed in Section 3.

1.1 Notation

The components of a vector $x \in \mathbb{R}^n$ are denoted by $x_i$. Inequalities such as $x \ge 0$ are understood componentwise. The Euclidean norm of a vector $x \in \mathbb{R}^n$ is denoted by $\|x\|$, and $\mathrm{Diag}(x)$ denotes the $n \times n$ diagonal matrix with diagonal entries $x_i$ for $1 \le i \le n$. The identity matrix is denoted by $I$, its dimension being evident from the context. The space of symmetric $n \times n$ matrices is denoted by $\mathcal{S}^n$. The scalar product of two symmetric matrices $X, S$ is given by $X \bullet S := \mathrm{trace}(XS)$, inducing the Frobenius norm $\|\cdot\|_F$, and $X \succeq 0$ ($X \succ 0$) indicates that $X$ is symmetric and positive semidefinite (positive definite). The condition number of a matrix is always measured with respect to the 2-norm.

2 Linear programming

Let $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, $c \in \mathbb{R}^n$ be given. Consider a linear program in standard form,

$$(LP) \qquad \min_x\ c^T x \quad \text{s.t.} \quad Ax = b,\ x \ge 0.$$

At each iteration of an interior-point algorithm with an infeasible starting point, given some iterate $x, y, s$ with $x, s > 0$, systems of the following form (or a very similar form) for finding a correction $\Delta x, \Delta y, \Delta s$ are considered:

$$\begin{aligned} A\Delta x &= b - Ax, \\ A^T\Delta y + \Delta s &= c - A^T y - s, \\ S\Delta x + X\Delta s &= \mu_+ e - Xs. \end{aligned} \tag{1}$$

Here, $S = \mathrm{Diag}(s)$, $X = \mathrm{Diag}(x)$, $e = (1, \ldots, 1)^T$, and $\mu_+$ is chosen as a parameter satisfying $\mu_+ \le \mu := x^T s / n$. Solving the second line for $\Delta s$ and the third line for $\Delta x$, and inserting both into the first row, leads to a linear system of the form

$$A D^2 A^T \Delta y = \mathrm{rhs} \tag{2}$$

for some right-hand side $\mathrm{rhs}$, where $D^2 = XS^{-1}$, see e.g. [4].

2.1 Preprocessing of dense LPs

For problems with a dense data matrix $A$, the following preprocessing lends itself to reducing the dependence of interior-point algorithms on the scaling of the data of the linear program: Prior to solving problem (LP), a Cholesky factor $L$ with $LL^T = AA^T$ is computed and $[A, b]$ is replaced with $L^{-1}[A, b]$. (After this transformation, $A$ has orthonormal rows.) Then, $c$ is replaced with $c - A^T A c$, and $b$ and $c$ are scaled to Euclidean norm 1. This scaling implies that the approximate solution generated by the interior-point algorithm needs to be rescaled to fit the original problem. (For $b = 0$ or $c = 0$ one may consider modified problems with $b = Ae$ or $c = e$.)
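This preprocessing is straightforward to transcribe into code. The following is a minimal numpy/scipy sketch (the function name and interface are illustrative assumptions, not from the paper); it assumes $b \ne 0$ and that $c$ is not contained in the row space of $A$, in line with the parenthetical remark above:

    import numpy as np
    from scipy.linalg import cholesky, solve_triangular

    def preprocess_dense_lp(A, b, c):
        # Cholesky factor L with L L^T = A A^T; replace [A, b] by L^{-1} [A, b],
        # after which A has orthonormal rows (A A^T = I).
        L = cholesky(A @ A.T, lower=True)
        A = solve_triangular(L, A, lower=True)
        b = solve_triangular(L, b, lower=True)
        # Replace c by its projection onto the null space of A; on the feasible
        # set {x | Ax = b} this changes the objective only by a constant.
        c = c - A.T @ (A @ c)
        # Scale b and c to Euclidean norm 1.
        return A, b / np.linalg.norm(b), c / np.linalg.norm(c)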

2.2 A geometric stopping test

A point $(x, y, s)$ is primal and dual optimal if $x$ and $(y, s)$ satisfy the linear equations, if $x, s \ge 0$, and if $x$ is orthogonal to $s$ (with respect to the scalar product associated with the Euclidean norm). By the above preprocessing, the origin has Euclidean distance 1 from the set $\{x \mid Ax = b\}$, and from the set $\{(y, s) \mid A^T y + s = c\}$. A natural starting point in this setting is the point $y = 0$, $x = s = e$, for which the angle between $x$ and $s$ is zero. An infeasible-starting-point interior-point method then strives to increase this angle to $\pi/2$ while simultaneously reducing $\|Ax - b\|$ and $\|A^T y + s - c\|$ and maintaining $x, s > 0$. The Euclidean setting of the optimality conditions and of this starting point suggests the following geometric stopping criterion:

Geometric stopping test:

1. Require the Euclidean distance to the primal-dual equality constraints to be at most $\epsilon$, i.e., $\|Ax - b\| \le \epsilon$ and $\|A^T y + s - c\| \le \epsilon$.

2. Require the cosine of the angle between $x$ and $s$ to be less than or equal to $\epsilon$, i.e., $x^T s \le \epsilon \|x\| \|s\|$.

For small $\epsilon > 0$ the second inequality approximately states that the angle between $x$ and $s$ is $\pi/2 - \alpha$ where $0 \le \alpha \lesssim \epsilon$. Typically, infeasible interior-point methods simultaneously reduce $\mu$ and the residuals of the primal-dual equations. By limiting the reduction of $\mu$ in case the norms of the residuals $\|Ax - b\|$ or $\|A^T y + s - c\|$ are large, the algorithm can be tuned such that the stopping test will be triggered by the second inequality. Below, this situation is considered; in particular, iterates are considered for which the stopping test is not yet strictly satisfied, more precisely, where $x^T s \ge \epsilon \|x\| \|s\|$.
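A direct transcription of the two criteria into numpy might look as follows (a sketch only; the function name and interface are hypothetical):

    import numpy as np

    def geometric_stop(A, b, c, x, y, s, eps):
        # 1. Residual norms of the primal and dual equality constraints
        #    at most eps.
        primal_ok = np.linalg.norm(A @ x - b) <= eps
        dual_ok = np.linalg.norm(A.T @ y + s - c) <= eps
        # 2. Cosine of the angle between x and s at most eps.
        angle_ok = x @ s <= eps * np.linalg.norm(x) * np.linalg.norm(s)
        return primal_ok and dual_ok and angle_ok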

The geometric stopping test is not only motivated by the Euclidean setting of the optimality conditions, it also allows one to bound the condition number of the linear systems (2) to be solved at each iteration. This is the subject of the next section.

2.3 Bounds in the dense case

Many implementations of interior-point algorithms generate iterates in a $-\infty$-norm neighborhood of the central path, i.e. $x_i s_i \ge \sigma\mu$ for $1 \le i \le n$, where $\mu := x^T s / n$ and $\sigma \in (0, 1)$ defines the width of the neighborhood. For problems that are preprocessed as in Section 2.1 and for interior-point algorithms that generate iterates in such a $-\infty$-norm neighborhood, and with a stopping test as in Section 2.2, the following lemma applies to all iterates possibly generated by the algorithm:

Lemma 1: Assume that $A$ has orthonormal rows. Let $x, s > 0$ and $\sigma \in (0, 1]$ be given with $x_i s_i \ge \sigma\mu$ for $1 \le i \le n$, where $\mu := x^T s / n$. Assume further that $\epsilon > 0$ is given with $x^T s \ge \epsilon \|x\| \|s\|$. Then, the condition number of the matrix $AD^2A^T$ in (2) is bounded by

$$\mathrm{cond}(AD^2A^T) \le \frac{n^2}{\sigma^2 \epsilon^2}.$$

This bound is asymptotically sharp in the sense that for a certain class of matrices $A$ it cannot be improved by any constant factor when $\min\{n, 1/\epsilon\} \to \infty$.

In case of poorly scaled problems (and for problems that do not have a finite optimal solution) the geometric stopping test implies, however, that the stopping test may be satisfied before the duality gap is below a desired threshold. In this case, conditions such as Lemma 5.3.5 in [2] state that if (LP) has an optimal solution $(x^*, y^*, s^*)$, then $\|(x^*, s^*)\|$ must be rather large.

Proof: First observe that (by the assumption on $A$)

$$\min_{\|y\|=1} y^T A D^2 A^T y \ge \min_{\|x\|=1} x^T D^2 x \qquad \text{and} \qquad \max_{\|y\|=1} y^T A D^2 A^T y \le \max_{\|x\|=1} x^T D^2 x,$$

so that $\mathrm{cond}(AD^2A^T) \le \mathrm{cond}(D^2)$. For $1 \le i \le n$ it follows from $\sigma\mu \le x_i s_i \le x^T s = n\mu$ that

$$\frac{x_i}{s_i} = \frac{x_i^2}{x_i s_i} \le \frac{x_i^2}{\sigma\mu} \le \frac{\|x\|^2}{\sigma\mu} \qquad \text{and} \qquad \frac{x_i}{s_i} = \frac{x_i s_i}{s_i^2} \ge \frac{\sigma\mu}{s_i^2} \ge \frac{\sigma\mu}{\|s\|^2},$$

and hence,

$$\mathrm{cond}(D^2) \le \frac{\|x\|^2}{\sigma\mu} \Big/ \frac{\sigma\mu}{\|s\|^2} = \frac{\|x\|^2 \|s\|^2}{\sigma^2\mu^2} \le \frac{(x^T s / \epsilon)^2}{\sigma^2\mu^2} = \frac{n^2}{\sigma^2\epsilon^2}.$$

To verify that this bound is asymptotically sharp, let $\alpha \ge 1$ and

$$x := \begin{pmatrix} \alpha/\sqrt{\epsilon} \\ \sigma\sqrt{\epsilon}/\alpha \\ 1 \\ \vdots \\ 1 \end{pmatrix}, \qquad s := \begin{pmatrix} \sigma\sqrt{\epsilon}/\alpha \\ \alpha/\sqrt{\epsilon} \\ 1 \\ \vdots \\ 1 \end{pmatrix}, \qquad \text{and} \qquad A := \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \end{pmatrix}.$$

Here, since $\mu = \frac{n - 2 + 2\sigma}{n} \le 1$, the bound $x_i s_i \ge \sigma\mu$ is satisfied. The other assumption to be satisfied is $x^T s \ge \epsilon \|x\| \|s\|$. Since $\|x\| = \|s\|$, this is equivalent to

$$2\sigma + n - 2 \ge \epsilon \left( \frac{\alpha^2}{\epsilon} + \frac{\sigma^2 \epsilon}{\alpha^2} + n - 2 \right),$$

which in turn is implied if

$$(n-2)(1-\epsilon) \ge \alpha^2 \tag{3}$$

holds true. Defining $\alpha := \sqrt{n(1-\rho)}$ for some number $\rho > 0$, it is evident that $\rho$ can be chosen arbitrarily close to zero without violating (3) if $\min\{n, 1/\epsilon\}$ is sufficiently large. The condition number of $AD^2A^T$ is then given by

$$\frac{x_1/s_1}{x_2/s_2} = \frac{\alpha^4}{\sigma^2\epsilon^2},$$

which is arbitrarily close to the given bound. $\square$

The squares in the bound of Lemma 1 are due to the fact that the underlying linear system is a system of normal equations; solving the larger indefinite system (1) instead of (2) generally would be more stable but also computationally more expensive. Note that the bound of Lemma 1 is independent of $\mu$, but due to the rescaling in Section 2.1 it follows that $\|x\| \ge 1$ and $\|s\| \ge 1$ for any feasible $x, s$, so that $\mu \ge \epsilon/n$ for any feasible $x, s$ satisfying the assumptions of Lemma 1.
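The sharpness construction from the proof is easy to reproduce numerically. The following sketch (illustrative only, with arbitrarily chosen values of $n$, $\epsilon$, $\sigma$ and $\rho$) builds $x$, $s$ and $A$ as above and compares $\mathrm{cond}(AD^2A^T)$ with the bound of Lemma 1:

    import numpy as np

    n, eps, sigma, rho = 1000, 1e-3, 0.5, 0.05
    alpha = np.sqrt(n * (1 - rho))        # alpha^2 <= (n-2)(1-eps), i.e. (3) holds
    x = np.ones(n); s = np.ones(n)
    x[0], x[1] = alpha / np.sqrt(eps), sigma * np.sqrt(eps) / alpha
    s[0], s[1] = x[1], x[0]
    assert x @ s >= eps * np.linalg.norm(x) * np.linalg.norm(s)
    A = np.zeros((2, n)); A[0, 0] = A[1, 1] = 1.0      # orthonormal rows
    M = A @ np.diag(x / s) @ A.T                       # A D^2 A^T with D^2 = X S^{-1}
    print(np.linalg.cond(M))              # alpha^4 / (sigma^2 eps^2), about 3.6e12
    print(n**2 / (sigma * eps)**2)        # bound of Lemma 1: 4.0e12

With these values the ratio of the two printed numbers is $(1-\rho)^2 \approx 0.9$, and it approaches 1 as $\rho \to 0$.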

2.4 Bound for sparse systems

Generating orthonormal rows of $A$ as in Section 2.1 is not feasible for large-scale sparse problems. In this case, the proof of Lemma 1 allows for a simple generalization stated in the next corollary:

Corollary 1: If $A$ does not have orthonormal rows but the remaining assumptions of Lemma 1 are satisfied, generalize the notion of condition number to rectangular matrices by defining $\mathrm{cond}(A)$ as the maximum singular value of $A$ divided by its minimum singular value. Then, the bound of Lemma 1 holds in the following weaker form:

$$\mathrm{cond}(AD^2A^T) \le \frac{n^2}{\sigma^2\epsilon^2}\, \mathrm{cond}(A)^2.$$

If all that is known about $A$ is its condition number, then, again, this bound is asymptotically sharp.

3 Semidefinite programs

Let us consider semidefinite programs of the standard form

$$(SDP) \qquad \min_X\ C \bullet X \quad \text{s.t.} \quad \mathcal{A}X = b,\ X \succeq 0.$$

Here, $\mathcal{A}$ is a linear map $\mathcal{A}: \mathcal{S}^n \to \mathbb{R}^m$, the $i$-th component of $\mathcal{A}X$ being given by $A^{(i)} \bullet X$ for some given data matrices $A^{(i)}$. First, we will assume again that $\mathcal{A}$ has orthonormal rows, i.e. $\|A^{(i)}\|_F = 1$ and $A^{(i)} \bullet A^{(j)} = 0$ for $i \ne j$.

For semidefinite programs, the scaling point $W$ of two matrices $X, S \succ 0$ defines the NT-direction [3]. Here, $W$ is given by

$$W := S^{-1/2} \big( S^{1/2} X S^{1/2} \big)^{1/2} S^{-1/2}.$$

Throughout this section $W$ will refer to this particular matrix. Set $P := W^{-1/2}$ and $V := P X P = P^{-1} S P^{-1}$, and for some symmetric matrix $Z$, define $L_V(Z) := \frac{1}{2}(VZ + ZV)$. Then, for semidefinite programs, the equivalent of the third block row of (1) is given by the equation

$$L_V(P \Delta X P) + L_V(P^{-1} \Delta S P^{-1}) = \mu_+ I - V^2.$$

Multiplying this from the left by $(L_V)^{-1}$ and solving the primal-dual system in an analogous way as in (2) for linear programs leads to a linear system of the form

$$\mathcal{A} D_W \mathcal{A}^* \Delta y = \mathrm{rhs},$$

where $\mathcal{A}^*$ denotes the adjoint of $\mathcal{A}$ and $D_W$ is defined by the identity $D_W[\Delta S] \equiv W \Delta S W$ for all symmetric matrices $\Delta S$.

Lemma 1 can also be generalized to semidefinite programs (SDP). We first consider the case where the iterates are on the central path, i.e. $XS = \mu I$ with $\mu = X \bullet S / n$:

Corollary 2: Assume that $\mathcal{A}$ has orthonormal rows. For (SDP) the bound of Lemma 1 then applies to points on the central path ($\sigma = 1$ in Lemma 1) with $X \bullet S \ge \epsilon \|X\|_F \|S\|_F$ in the form

$$\mathrm{cond}(\mathcal{A} D_W \mathcal{A}^*) \le \frac{n^2}{\epsilon^2}.$$
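As a sanity check, the defining property of the scaling point, $WSW = X$ (so that $D_W$ maps $S$ to $X$), can be verified numerically. A small scipy sketch with randomly generated positive definite $X$ and $S$ (illustrative, not from the paper):

    import numpy as np
    from scipy.linalg import sqrtm, inv

    def nt_scaling_point(X, S):
        # W = S^{-1/2} (S^{1/2} X S^{1/2})^{1/2} S^{-1/2}
        S_half = sqrtm(S)
        S_inv_half = inv(S_half)
        W = S_inv_half @ sqrtm(S_half @ X @ S_half) @ S_inv_half
        return (W + W.T) / 2              # symmetrize against rounding errors

    rng = np.random.default_rng(0)
    B = rng.standard_normal((5, 5)); X = B @ B.T + np.eye(5)   # X > 0
    B = rng.standard_normal((5, 5)); S = B @ B.T + np.eye(5)   # S > 0
    W = nt_scaling_point(X, S)
    print(np.linalg.norm(W @ S @ W - X))  # approx. 0, i.e. W S W = X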

Proof: Since $\mathcal{A}$ has orthonormal rows, by the proof of Lemma 1 it suffices to bound the condition number of $D_W$. For $X, S$ on the central path, the matrix $W$ simplifies to $W = X/\sqrt{\mu}$. Let $\Lambda$ be a diagonal matrix with the eigenvalues of $X$ on its diagonal. Then, by orthogonal invariance of the Frobenius norm,

$$\|D_W\|_2 = \max_{\|\Delta S\|_F = 1} \|D_W[\Delta S]\|_F = \frac{1}{\mu} \max_{\|\Delta S\|_F = 1} \|X \Delta S X\|_F = \frac{1}{\mu} \max_{\|\Delta S\|_F = 1} \|\Lambda \Delta S \Lambda\|_F,$$

so that

$$\|D_W\|_2^2 = \frac{1}{\mu^2} \max_{\|\Delta S\|_F = 1} \sum_{i,j} \Lambda_{i,i}^2 \Lambda_{j,j}^2 \Delta S_{i,j}^2 = \frac{1}{\mu^2} \max_k \Lambda_{k,k}^4,$$

and thus, $\|D_W\|_2 = \frac{1}{\mu} \max_k \Lambda_{k,k}^2$. Likewise, $\|(D_W)^{-1}\|_2 = \|D_{W^{-1}}\|_2 = \mu / \min_k \Lambda_{k,k}^2$, and hence, $\mathrm{cond}(D_W) = \mathrm{cond}(X)^2$ $(= \mathrm{cond}(W)^2)$.

By assumption, $\epsilon \|X\|_F \|S\|_F \le X \bullet S$, and the relation $XS = \mu I$ further implies

$$\epsilon \|X\|_F \|S\|_F \le X \bullet S = X \bullet (\mu X^{-1}) = n\mu.$$

Replacing $S$ with $\mu X^{-1}$ on the left-hand side above and dividing by $\mu$ yields $\epsilon \|X\|_F \|X^{-1}\|_F \le n$. Since $\|X\|_F \|X^{-1}\|_F \ge \|X\|_2 \|X^{-1}\|_2$, this implies $\mathrm{cond}(X) \le n/\epsilon$, from which the claim follows. $\square$

The above proof can now be generalized to points in a $-\infty$-norm neighborhood of the central path. To this end, first note that $X^{1/2} S X^{1/2} \succeq \sigma\mu I$ if, and only if, $S^{1/2} X S^{1/2} \succeq \sigma\mu I$ (this follows because $X^{1/2} S X^{1/2}$ and $S^{1/2} X S^{1/2}$ are both similar to $XS$ and thus share the same eigenvalues), so that the matrices $X$ and $S$ are interchangeable in the statement of the next lemma:

Lemma 2: Assume that $\mathcal{A}$ has orthonormal rows. Let $X, S \succ 0$ and $\sigma \in (0, 1]$ be given with $X^{1/2} S X^{1/2} \succeq \sigma\mu I$, where $\mu := X \bullet S / n$. Assume further that $\epsilon > 0$ is given with $X \bullet S \ge \epsilon \|X\|_F \|S\|_F$. Then, the condition number of the matrix $\mathcal{A} D_W \mathcal{A}^*$ is bounded by

$$\mathrm{cond}(\mathcal{A} D_W \mathcal{A}^*) \le \frac{n^2}{\sigma^2 \epsilon^2}.$$

This bound is asymptotically sharp.

Proof: We begin by quoting a result that follows the statement of Lemma 3.1 in [1] (the reference [1] is self-contained and the proof of this result is easy to follow, so the proof is not reproduced here): for $W$ defined as above, the following holds true:

$$2(X^{-1} + S)^{-1} \preceq W \preceq (X + S^{-1})/2. \tag{4}$$

By the proof of Corollary 2 it suffices to bound the condition number of $W$. Denote the minimum and maximum eigenvalues of a symmetric matrix $Z$ by $\lambda_{\min}(Z)$ and $\lambda_{\max}(Z)$. Observe that $W$ remains invariant when multiplying both $X$ and $S$ by the same positive constant. We may therefore assume that $X$ and $S$ are scaled such that $\lambda_{\max}(X) = \lambda_{\max}(S^{-1})$. Let $C := \max(\mathrm{cond}(X), \mathrm{cond}(S))$. Then,

$$X \succeq \lambda_{\min}(X)\, I \succeq \frac{\lambda_{\max}(X)}{C}\, I \qquad \text{and} \qquad S \preceq \lambda_{\max}(S)\, I = \frac{1}{\lambda_{\min}(S^{-1})}\, I \preceq \frac{C}{\lambda_{\max}(X)}\, I.$$

Thus, the inequality (4) implies

$$\frac{\lambda_{\max}(X)}{C}\, I \preceq W \preceq \lambda_{\max}(X)\, I$$

(for the upper bound note that $\lambda_{\max}(S^{-1}) = \lambda_{\max}(X)$), so that $\mathrm{cond}(W) \le C$. Since $X$ and $S$ are interchangeable in Lemma 2, we assume without loss of generality that $\mathrm{cond}(W) \le \mathrm{cond}(X)$. Let $u$ be an eigenvector of $X$ for the eigenvalue $\lambda_{\min}(X)$ with $\|u\| = 1$. Then, by the assumption on the $-\infty$-norm neighborhood,

$$\sigma\mu \le u^T X^{1/2} S X^{1/2} u = \lambda_{\min}(X)\, u^T S u \le \lambda_{\min}(X) \|S\|_2.$$

Since $\|X^{-1}\|_2 = 1/\lambda_{\min}(X)$, this implies

$$\|S\|_2 \ge \sigma\mu \|X^{-1}\|_2.$$

Inserting this in the left-hand side of

$$\epsilon \|X\|_2 \|S\|_2 \le \epsilon \|X\|_F \|S\|_F \le X \bullet S = n\mu$$

and dividing by $\epsilon\sigma\mu$, it follows that $\mathrm{cond}(X) \le n/(\epsilon\sigma)$, from which the claim follows. Linear programs being a special case of semidefinite programs, this bound is asymptotically tight as well. $\square$

The generalization of this result to mappings $\mathcal{A}$ that do not have orthonormal rows follows as in Corollary 1.
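The mean inequality (4) can likewise be tested numerically: both $W - 2(X^{-1} + S)^{-1}$ and $(X + S^{-1})/2 - W$ should be positive semidefinite. A short sketch, reusing $X$, $S$ and nt_scaling_point from the previous example:

    from scipy.linalg import inv, eigvalsh

    lower = 2 * inv(inv(X) + S)           # harmonic mean of X and S^{-1}
    upper = (X + inv(S)) / 2              # arithmetic mean of X and S^{-1}
    print(eigvalsh(W - lower).min())      # >= 0 up to rounding: lower <= W
    print(eigvalsh(upper - W).min())      # >= 0 up to rounding: W <= upper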

λ min (Z) and λ max (Z). Observe that W remains invariant when multiplying both, X and S with the same positive constant. We may therefore assume that X and S are scaled such that λ max (X) = λ max (S ). Let C := max(cond(x), cond(s)). Then, X λ min (X) I = Thus, the inequality (4) implies C λ max (X) I and S λ max(s)i = λ max (X) I W λ max (X)I, C C λ max (S ) I. so that cond(w ) C. Since X and S are interchangeable in Lemma 2, we assume without loss of generality that cond(w ) cond(x). Let u be an eigenvector of X to the eigenvalue λ min (X) with u =. Then, by assumption on the -norm neighborhood, σµ u T X /2 SX /2 u = λ min (X)u T Su λ min (X) S 2. Since X 2 = /λ min (X), this implies Inserting this in the left hands side of S 2 σµ X 2. ɛ X 2 S 2 ɛ X F S F X S = nµ. and dividing by ɛσµ it follows that cond(x) n/(ɛσ) from which the claim follows. Linear programs being a special case of semidefinite programs, this bound is asymptotically tight as well. The generalization of this result to mappings A that do not have orthonormal rows follows as in Corollary. Acknowledgment The author likes to thank Felix Lieder for comments that helped to improve the presentation of this paper. References [] J.D. Lawson and Y. Lim: The Geometric Mean, Matrices, Metrics, and More, The American Mathematical Monthly 08(9) (200) pp. 797 82. [2] S. Mizuno, Infeasible-interior-point algorithms, in: T. Terlaky, ed. Interior Point Methods of Mathematical Programming Applied Optimization 5 Springer (996) pp. 59 87 [3] Y.E. Nesterov and M.J. Todd, Primal-dual interior-point methods for selfscaled cones, SIAM Journal on Optimization 8(2) (998) pp. 324 364. [4] S.J. Wright, Primal-Dual Interior-Point Methods SIAM, Philadelphia (997). 7