Supplement: Hoffman's Error Bounds


IE 8534 1

Supplement: Hoffman's Error Bounds

IE 8534 2

In Lecture 1 we learned that the linear program and its dual problem
\[
(P)\ \min\ c^T x \ \ \text{s.t.}\ Ax = b,\ x \geq 0,
\qquad
(D)\ \max\ b^T y \ \ \text{s.t.}\ A^T y + s = c,\ s \geq 0,
\]
under the Slater condition admit the analytic central path
\[
\{(x(\mu), y(\mu), s(\mu)) : Ax(\mu) = b,\ A^T y(\mu) + s(\mu) = c,\ x(\mu) > 0,\ s(\mu) > 0,\ x_i(\mu)\, s_i(\mu) = \mu \ \text{for } i = 1, \ldots, n;\ \mu > 0\},
\]
and that $\lim_{\mu \to 0} (x(\mu), y(\mu), s(\mu)) = (x(0), y(0), s(0))$ exists, with the limits being optimal solutions for $(P)$ and $(D)$ respectively.
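
To make the central path concrete, here is a minimal numerical sketch (the tiny data A, b, c and the helper F are made up for illustration; numpy and scipy are assumed) that solves the central-path equations of a 2-variable LP for decreasing values of mu and watches the iterates approach the optimal pair.

    import numpy as np
    from scipy.optimize import fsolve

    # Toy data (illustrative only): min x1 + 2*x2  s.t.  x1 + x2 = 1, x >= 0
    A = np.array([[1.0, 1.0]])
    b = np.array([1.0])
    c = np.array([1.0, 2.0])
    m, n = A.shape

    def F(z, mu):
        # Central-path system: Ax = b, A^T y + s = c, x_i s_i = mu
        x, y, s = z[:n], z[n:n+m], z[n+m:]
        return np.concatenate([A @ x - b, A.T @ y + s - c, x * s - mu])

    # Follow the path: warm-start each solve from the previous point
    z = np.concatenate([np.ones(n), np.zeros(m), np.ones(n)])
    for mu in [1.0, 0.1, 0.01, 1e-4]:
        z = fsolve(F, z, args=(mu,))
        x, y, s = z[:n], z[n:n+m], z[n+m:]
        print(f"mu={mu:g}  x={x}  y={y}  s={s}")
    # As mu -> 0 the points approach x(0) = (1, 0), optimal for (P),
    # and (y(0), s(0)) = (1, (0, 1)), optimal for (D).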

IE 8534 3

Now let $c = e$. One can easily show that
\[
y(\mu) = (A X(\mu) A^T)^{-1} b - \mu (A X(\mu) A^T)^{-1} A e,
\]
and
\[
x(\mu) = X(\mu) A^T (A X(\mu) A^T)^{-1} b + \mu e - \mu X(\mu) A^T (A X(\mu) A^T)^{-1} A e. \tag{1}
\]
But why write it in this particular way? There is an amazing fact to note here (Dikin, Stewart, and Todd):
\[
\chi(A) := \sup\{ \| D A^T (A D A^T)^{-1} \| : D \text{ diagonal and } D \succ 0 \} < \infty.
\]
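
Since $\chi(A)$ is a supremum over all positive diagonal scalings, one might expect it to blow up as $D$ becomes extreme. A quick numerical probe (a sketch only: random data, numpy assumed, and the scaling range kept moderate to avoid floating-point trouble) suggests otherwise: the norm stays bounded no matter how the diagonal is skewed.

    import numpy as np

    rng = np.random.default_rng(0)
    m, n = 3, 6
    A = rng.standard_normal((m, n))    # a random full-row-rank matrix

    worst = 0.0
    for _ in range(20000):
        # positive diagonal with entries spread over many orders of magnitude
        d = 10.0 ** rng.uniform(-4, 4, size=n)
        M = (d[:, None] * A.T) @ np.linalg.inv(A @ (d[:, None] * A.T))  # D A^T (A D A^T)^{-1}
        worst = max(worst, np.linalg.norm(M, 2))
    print(worst)   # stays bounded; it never exceeds lambda(A) from Theorem 1 below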

IE 8534 4

Let us try to understand why $\chi(A)$ is a finite number. Another way of writing $\chi(A)$ is the following:
\[
\chi(A) = \sup\left\{ \frac{\|y\|}{\|c\|} : y = \arg\min_y \| D^{1/2} (A^T y - c) \|,\ D \succ 0 \text{ diagonal},\ 0 \neq c \in \mathbb{R}^n \right\}.
\]
Denote
\[
\lambda(A) = \max\{ \| A_I^{-1} \| : |I| = m \text{ with } A_I \text{ invertible} \}.
\]
Clearly, $\lambda(A)$ is finite.

Theorem 1. $\chi(A) = \lambda(A)$.

IE 8534 5

Proof. For any $I$ with $|I| = m$ and $A_I$ non-singular, let $D^\epsilon$ be diagonal with $D^\epsilon_{ii} = 1$ for $i \in I$ and $D^\epsilon_{ii} = \epsilon$ for $i \notin I$. Clearly, $D^\epsilon A^T (A D^\epsilon A^T)^{-1} \to A_I^{-1}$ (padded with zero rows outside $I$) as $\epsilon \to 0$, and so $\lambda(A) \leq \chi(A)$.

To show $\chi(A) \leq \lambda(A)$, choose a fixed $0 \neq c \in \mathbb{R}^n$ and a fixed positive diagonal matrix $D$, and consider the unique $y(c, D)$ that minimizes $\| D^{1/2} (A^T y - c) \|$. Obviously the rank of the active constraints at $y(c, D)$ must equal $m$. Let $J$ be such that $|J| = m$, $A_J$ is non-singular and $A_J^T y(c, D) = c_J$. Hence $y(c, D) = A_J^{-T} c_J$. This shows that
\[
\chi(A) \leq \sup\{ \| A_J^{-T} c_J \| / \|c\| : 0 \neq c \in \mathbb{R}^n,\ |J| = m,\ A_J \text{ non-singular} \}
\leq \sup\{ \| A_J^{-T} \| : |J| = m,\ A_J \text{ non-singular} \} = \lambda(A).
\]
Combining the two inequalities, the theorem follows.
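
Theorem 1 makes $\chi(A)$ computable in principle: enumerate all $m \times m$ bases of $A$ and take the largest inverse norm. The sketch below (random data, numpy assumed; brute-force enumeration, so only sensible for small $n$) compares $\lambda(A)$ with a sampled estimate of the supremum defining $\chi(A)$.

    import numpy as np
    from itertools import combinations

    rng = np.random.default_rng(0)
    m, n = 3, 6
    A = rng.standard_normal((m, n))

    # lambda(A): maximize ||A_I^{-1}|| over all m-column index sets I with A_I invertible
    lam = 0.0
    for I in combinations(range(n), m):
        A_I = A[:, list(I)]
        if abs(np.linalg.det(A_I)) > 1e-12:
            lam = max(lam, np.linalg.norm(np.linalg.inv(A_I), 2))

    # sampled lower estimate of chi(A) over random positive diagonal scalings
    chi_est = 0.0
    for _ in range(20000):
        d = 10.0 ** rng.uniform(-4, 4, size=n)
        M = (d[:, None] * A.T) @ np.linalg.inv(A @ (d[:, None] * A.T))
        chi_est = max(chi_est, np.linalg.norm(M, 2))

    print(lam, chi_est)   # chi_est <= lam, creeping toward lam as D gets more extreme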

IE 8534 6

A related quantity is
\[
\bar\chi(A) := \sup\{ \| D A^T (A D A^T)^{-1} A \| : D \text{ diagonal and } D \succ 0 \} < \infty.
\]
The above quantities play an important role in the complexity analysis of linear programming. Anyway, continuing from (1) we have
\[
\| x(\mu) \| \leq \chi(A)\,\|b\| + \sqrt{n}\,\mu + \sqrt{n}\,\bar\chi(A)\,\mu
\]
(by the triangle inequality applied to (1), using $\|e\| = \sqrt{n}$ and the definitions of $\chi(A)$ and $\bar\chi(A)$). Therefore, $\| x(0) \| \leq \chi(A)\,\|b\|$.

IE 8534 7

But we assumed the Slater condition. What if the Slater condition does not hold (though the problem itself is still feasible)? Let $\delta > 0$, and consider
\[
\{ x : Ax = b + \delta A e,\ x \geq 0 \}.
\]
This system always satisfies the Slater condition (if $\bar x$ is feasible for the original system, then $\bar x + \delta e > 0$ is feasible for the perturbed one). We then know that for any $\delta > 0$ there is $x^\delta \geq 0$ such that
\[
A x^\delta = b + \delta A e, \qquad \| x^\delta \| \leq \chi(A) \| b + \delta A e \|.
\]
Therefore, by taking a limit (along a subsequence if necessary), there is always a feasible solution $x$ satisfying the bound $\| x \| \leq \chi(A) \| b \|$.

Theorem 2. For the linear program (P), if it is feasible then it has a feasible solution whose norm is no more than $\chi(A)\|b\|$; if it has an optimal solution then it has an optimal solution whose norm is no more than $\chi(A)\|b\|$.
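
Here is a small numerical check of the feasibility part of Theorem 2 (a sketch only: random data, numpy/scipy assumed, and $\chi(A)$ computed as $\lambda(A)$ by brute force): the minimum-norm nonnegative solution of $Ax = b$ indeed stays below $\chi(A)\|b\|$.

    import numpy as np
    from itertools import combinations
    from scipy.optimize import minimize

    rng = np.random.default_rng(1)
    m, n = 3, 6
    A = rng.standard_normal((m, n))
    b = A @ rng.uniform(0.0, 1.0, size=n)     # guarantees Ax = b, x >= 0 is feasible

    # chi(A) = lambda(A) by Theorem 1 (brute force over bases)
    chi = max(np.linalg.norm(np.linalg.inv(A[:, list(I)]), 2)
              for I in combinations(range(n), m)
              if abs(np.linalg.det(A[:, list(I)])) > 1e-12)

    # minimum-norm feasible point of {Ax = b, x >= 0}
    res = minimize(lambda x: x @ x, x0=np.ones(n), method="SLSQP",
                   constraints=[{"type": "eq", "fun": lambda x: A @ x - b}],
                   bounds=[(0.0, None)] * n)
    print(np.linalg.norm(res.x), chi * np.linalg.norm(b))   # first number <= second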

IE 8534 8

Another fact that follows immediately is the following.

Lemma 1. Let $J$ be a subset of $\{1, 2, \ldots, n\}$. Denote by $A_J$ (resp. $x_J$) the submatrix of $A$ (resp. subvector of $x$) collecting the columns of $A$ (resp. components of $x$) whose indices belong to $J$. Suppose that $A_J x_J = b$, $x_J \geq 0$ is feasible. Then it always has a feasible solution $\bar x_J$ such that $\| \bar x_J \| \leq \chi(A) \| b \|$.

One way to see this is to observe that the linear program
\[
(\bar P)\quad \min\ e_{\bar J}^T x_{\bar J} \quad \text{s.t.}\ Ax = b,\ x \geq 0
\]
(where $\bar J$ denotes the complement of $J$) is feasible with optimal value $0$, so its optimal solutions satisfy $x_{\bar J} = 0$. Applying Theorem 2, the result follows.

IE 8534 9

Now let us consider the following problem. Suppose that $S = \{ y : A^T y \leq c \}$, and let $z \in \mathbb{R}^m$ be a point not in $S$. The question is: can we reasonably estimate the distance from $z$ to $S$? This is where the issue of error bounds arises. Essentially, we wish to have some computable measure $f(z)$ which tells us something about the unknown quantity $\mathrm{dist}(z, S)$. Consider
\[
\min\ \tfrac{1}{2} \| z - y \|^2 \quad \text{s.t.}\ A^T y \leq c.
\]

IE 8534 10

Let $y^*$ be the optimal solution (the projection of $z$ onto $S$). Applying the KKT conditions, there exist an index set $J \subseteq \{1, 2, \ldots, n\}$ (with $\bar J$ being its complement) and vectors $x, s$ such that
\[
z - y^* = A_J x_J, \quad s = c - A^T y^*, \quad x \geq 0, \quad s \geq 0, \quad s^T x = 0, \quad s_J = 0, \quad x_{\bar J} = 0.
\]
In fact, once the index set $J$ is identified, we may choose any $x_J \geq 0$ satisfying $z - y^* = A_J x_J$, and the above KKT conditions ensure that $y^*$ is the projection. In particular, by Lemma 1 there is a short solution $\bar x$ with $\| \bar x \| \leq \chi(A) \| y^* - z \|$.

IE 8534 11

Putting things together, we have
\[
\| y^* - z \|^2 = (z - y^*)^T A \bar x = (A^T z - A^T y^*)^T \bar x = (A^T z - c)^T \bar x + (c - A^T y^*)^T \bar x
\leq \big( (A^T z - c)_+ \big)^T \bar x \leq \| (A^T z - c)_+ \| \, \| \bar x \| \leq \chi(A) \| (A^T z - c)_+ \| \, \| y^* - z \|,
\]
where the first inequality uses $\bar x \geq 0$ and $(c - A^T y^*)^T \bar x = 0$ (since $\bar x$ is supported on $J$ and $s_J = 0$). Therefore,
\[
\| y^* - z \| \leq \chi(A) \| (A^T z - c)_+ \|.
\]
This gives rise to an important result, known as Hoffman's error bound.

Theorem 3. Suppose that $S = \{ y : A^T y \leq c \} \neq \emptyset$. Then, for any $z$,
\[
\mathrm{dist}(z, S) \leq \chi(A) \| (A^T z - c)_+ \|.
\]
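
Here is a small numerical sanity check of Theorem 3 (a sketch only: random data, numpy/scipy assumed, $\chi(A)$ again computed by brute force as $\lambda(A)$, and the projection obtained via an off-the-shelf QP solve): for random points $z$, the true distance never exceeds $\chi(A)\|(A^T z - c)_+\|$.

    import numpy as np
    from itertools import combinations
    from scipy.optimize import minimize

    rng = np.random.default_rng(2)
    m, n = 3, 6
    A = rng.standard_normal((m, n))
    c = rng.uniform(0.5, 1.5, size=n)          # S = {y : A^T y <= c} contains y = 0

    chi = max(np.linalg.norm(np.linalg.inv(A[:, list(I)]), 2)
              for I in combinations(range(n), m)
              if abs(np.linalg.det(A[:, list(I)])) > 1e-12)

    def dist_to_S(z):
        # distance from z to S, via projection computed as a small QP
        res = minimize(lambda y: 0.5 * np.sum((z - y) ** 2), x0=np.zeros(m),
                       method="SLSQP",
                       constraints=[{"type": "ineq", "fun": lambda y: c - A.T @ y}])
        return np.linalg.norm(res.x - z)

    for _ in range(5):
        z = 10.0 * rng.standard_normal(m)
        residual = np.linalg.norm(np.maximum(A.T @ z - c, 0.0))
        print(dist_to_S(z), "<=", chi * residual)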

IE 8534 12

It is easy to check that there is $C > 0$ such that $\| (A^T z - c)_+ \| \leq C\, \mathrm{dist}(z, S)$. Together with Theorem 3, this shows that $\mathrm{dist}(z, S)$ and $\| (A^T z - c)_+ \|$ are of the same order.

An interesting result related to Hoffman's error bound is as follows: if an affine subspace $\mathcal{A}$ and the polyhedral cone $\mathbb{R}^n_+$ do not intersect, then there must be a positive distance between them. Moreover, there are two points $\hat x \in \mathcal{A}$ and $\hat y \in \mathbb{R}^n_+$ such that $\mathrm{dist}(\hat x, \hat y) = \mathrm{dist}(\mathcal{A}, \mathbb{R}^n_+)$. To show this, it suffices to prove a slightly more general result.

Lemma 2. Suppose that $Q \succeq 0$, let $\epsilon_k \downarrow 0$ be a sequence, and let $P$ be a polyhedron. Suppose that $\{ x \in P : x^T Q x + c^T x \leq \epsilon_k \} \neq \emptyset$ for all $k$. Then $\{ x \in P : x^T Q x + c^T x \leq 0 \} \neq \emptyset$.

IE 8534 13

Proof. Let
\[
x_k = \arg\min\{ \| x \| : x^T Q x + c^T x \leq \epsilon_k,\ x \in P \}, \qquad k = 1, 2, \ldots
\]
If $\{ x_k : k = 1, 2, \ldots \}$ contains a bounded subsequence, then there is a finite cluster point, which lies in the set $\{ x \in P : x^T Q x + c^T x \leq 0 \}$. Let us consider the case where $\{x_k\}$ diverges, i.e. $\| x_k \| \to \infty$. Then there is a subsequence $K$ such that $\lim_{k \in K} x_k / \| x_k \| = d$. Since
\[
\left( \frac{x_k}{\| x_k \|} \right)^T Q \left( \frac{x_k}{\| x_k \|} \right) + \frac{c^T x_k}{\| x_k \|^2} \leq \frac{\epsilon_k}{\| x_k \|^2},
\]
letting $k \to \infty$ along $K$ gives $d^T Q d \leq 0$; since $Q \succeq 0$, we have $Q d = 0$.

IE 8534 14

Without loss of generality, write $P = \{ x : Ax = b,\ x \geq 0 \}$. For each $k$, consider the system
\[
Q x = Q x_k, \qquad A x = b, \qquad x \geq 0.
\]
Clearly, it is feasible ($x_k$ itself is a feasible solution). Now, by Theorem 2 we know that it has a feasible solution $y_k$ such that
\[
\| y_k \| \leq \chi \,\big( \| Q x_k \| + \| b \| \big),
\]
where $\chi$ is the constant of Theorem 2 associated with the stacked coefficient matrix $\begin{pmatrix} Q \\ A \end{pmatrix}$. Dividing both sides by $\| x_k \|$ and using $Q x_k / \| x_k \| \to Q d = 0$, we get $\| y_k \| / \| x_k \| \to 0$ as $k \in K$, $k \to \infty$, which contradicts the fact that $x_k$ was chosen to have the smallest norm.

IE 8534 15

As a consequence, the shortest distance problem
\[
\min\ \| x - y \|^2 \quad \text{s.t.}\ Ax = b,\ y \in \mathbb{R}^n_+
\]
always attains its optimal value. Therefore, if $\{ x : Ax = b,\ x \in \mathbb{R}^n_+ \} = \emptyset$, then we can strictly separate the affine space $\{ x : Ax = b \}$ from the cone $\mathbb{R}^n_+$; i.e. there are $\lambda \in \mathbb{R}^n$ and $c \in \mathbb{R}$ such that $\lambda^T x < c$ for all $x$ with $Ax = b$ and $\lambda^T x > c$ for all $x \in \mathbb{R}^n_+$. This implies: (i) $\lambda \geq 0$; (ii) $c < 0$; (iii) $\lambda^T d = 0$ whenever $A d = 0$, so $\lambda = A^T u$ for some $u \in \mathbb{R}^m$; (iv) $b^T u = \lambda^T x < c < 0$ for any $x$ with $Ax = b$. In other words, if $Ax = b$, $x \geq 0$ has no solution, then there is $u$ with $A^T u \geq 0$ and $b^T u < 0$. The above is the famous Farkas lemma!
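
To see the alternative system concretely, the sketch below (illustrative data; the normalization $b^T u \leq -1$ is an assumption made here to keep the LP bounded; numpy/scipy assumed) takes an infeasible system $Ax = b$, $x \geq 0$ and finds a Farkas certificate $u$ with $A^T u \geq 0$ and $b^T u < 0$ by solving a small LP.

    import numpy as np
    from scipy.optimize import linprog

    # x1 + x2 = -1, x >= 0 is clearly infeasible
    A = np.array([[1.0, 1.0]])
    b = np.array([-1.0])
    m, n = A.shape

    # Find u with A^T u >= 0 and b^T u <= -1.
    # linprog uses "A_ub @ u <= b_ub", so encode -A^T u <= 0 and b^T u <= -1.
    res = linprog(c=np.zeros(m),
                  A_ub=np.vstack([-A.T, b.reshape(1, -1)]),
                  b_ub=np.concatenate([np.zeros(n), [-1.0]]),
                  bounds=[(None, None)] * m)
    u = res.x
    print(u, A.T @ u, b @ u)   # A^T u >= 0 and b^T u < 0: a certificate of infeasibility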

IE 8534 16

One important implication of this analysis is that the projection (affine image) of a polyhedron is always closed.

Theorem 4. Let $L$ be any affine mapping and let $P$ be a polyhedron. Then $L(P)$ is itself a polyhedron.

IE 8534 17

Key References:

J.S. Pang, Error Bounds in Mathematical Programming, Mathematical Programming, 79, 299–332, 1997.

S. Zhang, Global Error Bounds for Convex Conic Problems, SIAM Journal on Optimization, 10, 836–851, 2000.

Z.Q. Luo and S. Zhang, On Extensions of the Frank-Wolfe Theorems, Computational Optimization and Applications, 13, 87–110, 1999.