Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 4


Instructor: Farid Alizadeh
Scribe: Haengju Lee
10/1/2001

1 Overview

We examine the dual of the Fermat-Weber problem. Next we study optimality conditions in the form of a generalized complementary slackness theorem. Finally we start the study of the eigenvalue optimization problem as a semidefinite program.

2 Dual of the Fermat-Weber Problem

Recall that the Fermat-Weber problem seeks a point in $m$-dimensional space minimizing the weighted sum of its Euclidean distances to $n$ given points (see Lecture 1). Given points $v_1, v_2, \ldots, v_n \in \mathbb{R}^m$ and weights $w_1, w_2, \ldots, w_n$, this problem can be formulated as follows:

$$\min \sum_{i=1}^{n} w_i \|v_i - x\|$$

The problem can be written equivalently as a cone-LP over $Q$, the second order cone:

$$\min\ w_1 z_1 + \cdots + w_n z_n \quad \text{s.t.} \quad z_i \ge \|v_i - x\|, \quad i = 1, \ldots, n.$$
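A minimal sketch of this cone-LP in Python, assuming the cvxpy package and its SOC constraint type; the points v and weights w are made-up illustrative data.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, m = 5, 2                      # n points in R^m (illustrative data)
v = rng.standard_normal((n, m))  # the given points v_1, ..., v_n
w = np.ones(n)                   # the weights w_1, ..., w_n

x = cp.Variable(m)               # the sought point
z = cp.Variable(n)               # epigraph variables z_i >= ||v_i - x||

# One second-order cone constraint per point: (z_i, v_i - x) in Q.
constraints = [cp.SOC(z[i], v[i] - x) for i in range(n)]
prob = cp.Problem(cp.Minimize(w @ z), constraints)
prob.solve()
print("optimal value:", prob.value, "optimal x:", x.value)
```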

Since $z_i \ge \|v_i - x\|$ is a second order cone condition, we can rewrite it step by step:

$$z_i \ge \|v_i - x\| \iff \binom{z_i}{x - v_i} \succeq_Q 0 \iff \binom{z_i}{x} \succeq_Q \binom{0}{v_i} \iff z_i e + \hat{x} \succeq_Q \binom{0}{v_i},$$

where $e = (1, 0, \ldots, 0)^T$ and $\hat{x} = \binom{0}{x}$. Now the cone-LP formulation is:

Primal:
$$\min\ w_1 z_1 + \cdots + w_n z_n \quad \text{s.t.} \quad z_i e + \hat{x} \succeq_Q \binom{0}{v_i}, \quad i = 1, \ldots, n.$$

If we define a dual variable $y_i = \binom{y_{i0}}{\bar{y}_i}$ corresponding to the $i$-th second order cone inequality in the Primal, then the dual can be formulated as:

Dual:
$$\max \sum_{i=1}^{n} v_i^T \bar{y}_i \quad \text{s.t.} \quad y_{i0} = w_i,\ i = 1, \ldots, n \ \ (\text{from } z_i), \qquad \bar{y}_1 + \cdots + \bar{y}_n = 0 \ \ (\text{from } x), \qquad y_i \succeq_Q 0,$$

the last constraints holding since they arise from $Q$. After simplification (for instance, eliminating $y_{i0}$) we get:

$$\max \sum_{i=1}^{n} v_i^T \bar{y}_i \quad \text{s.t.} \quad \sum_{i=1}^{n} \bar{y}_i = 0, \qquad \|\bar{y}_i\| \le w_i.$$

The dual of the Fermat-Weber problem has an interesting interpretation in dynamics. Let us assume that the $w_i$ are weights of objects hanging from threads that go through a set of holes in a table. We are to take the other ends of the threads and tie them together at a position of equilibrium, spending a minimal amount of energy. Then the $\bar{y}_i$ are interpreted as forces, and they must add up to zero so that we have equilibrium. The condition $\|\bar{y}_i\| \le w_i$ simply states that the magnitude of the force exerted at the knot by the $i$-th object cannot be larger than its weight. Assuming that the optimal location is $x^*$, we can write the value of the objective function as $\sum_i (v_i - x^*)^T \bar{y}_i$ because $(x^*)^T \sum_i \bar{y}_i = 0$; the objective thus measures potential energy, and the optimum is the location with minimum potential energy. (Question: can you give an interpretation of the Primal, and explain why the primal and dual objectives are equal at the optimum?)

[Figure: five weights $w_1, \ldots, w_5$ hanging by threads through holes in a table, tied together at a knot.]
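The simplified dual is just as easy to sketch, under the same cvxpy assumption and the same made-up data; by strong duality its optimal value matches the primal optimum above.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, m = 5, 2
v = rng.standard_normal((n, m))
w = np.ones(n)

# Dual: max sum_i v_i^T y_i  s.t.  sum_i y_i = 0,  ||y_i|| <= w_i
y = cp.Variable((n, m))
constraints = [cp.sum(y, axis=0) == 0]
constraints += [cp.norm(y[i]) <= w[i] for i in range(n)]
dual = cp.Problem(cp.Maximize(cp.sum(cp.multiply(v, y))), constraints)
dual.solve()
print("dual optimal value:", dual.value)  # equals the primal optimum
```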

3 Duality in different spaces

In many situations an $m$-dimensional cone can be expressed as the intersection of an $n$-dimensional cone and a linear space, $K_1 = K \cap L$, where $n > m$. Then, remembering that a linear space is also a cone and that its dual as a cone is simply its orthogonal complement $L^\perp$ (why?), we get $K_1^* = K^* + L^\perp$. Here $K_1^*$ is the dual of $K_1$ in the space $\mathbb{R}^n$. But if we take the dual within the space $L$, then the dual cone will be $m$-dimensional and different from $K_1^*$; let us call the dual of $K_1$ within the space $L$, $K_1^+$. If it is at all possible to find a good characterization of $K_1^+$, we should use it instead of $K_1^*$. Let us look at an example and see what the problems would be if we don't.

In linear programming our cone is the non-negative orthant $\mathbb{R}^n_+$, and cone-LP is simply the ordinary LP:

Primal:
$$\min\ c^T x \quad \text{s.t.} \quad a_i^T x = b_i,\ i = 1, \ldots, m, \qquad x_j \ge 0,\ j = 1, \ldots, n.$$

Dual:
$$\max\ b^T y \quad \text{s.t.} \quad \sum_{i=1}^{m} y_i a_i + s = c, \qquad s_j \ge 0,\ j = 1, \ldots, n.$$
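A quick numerical sketch of this primal-dual pair, assuming scipy; the data are made up, constructed so the LP is feasible and bounded, and the two optimal values coincide.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
m, n = 3, 6
A = rng.standard_normal((m, n))
b = A @ rng.random(n)               # b = Ax0 with x0 >= 0: primal feasible
y0 = rng.standard_normal(m)
c = A.T @ y0 + rng.random(n) + 0.1  # dual strictly feasible: primal bounded

# Primal: min c^T x  s.t.  Ax = b, x >= 0
primal = linprog(c, A_eq=A, b_eq=b, bounds=(0, None))

# Dual: max b^T y  s.t.  A^T y + s = c, s >= 0, i.e. A^T y <= c.
# linprog minimizes, so minimize -b^T y with y free.
dual = linprog(-b, A_ub=A.T, b_ub=c, bounds=(None, None))

print(primal.fun, -dual.fun)        # equal at the optimum (strong LP duality)
```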

Now suppose that we express the non-negative orthant as the intersection of the positive semidefinite cone and the linear space $L$ of diagonal matrices, that is, $X \in L$ iff $x_{ij} = 0$ for all $i \neq j$. We define the diagonal matrices $C = \mathrm{Diag}(c)$ and $A_i = \mathrm{Diag}(a_i)$, that is, matrices whose $(j, j)$ diagonal entries are $c_j$ (respectively $(a_i)_j$) and whose off-diagonal entries are all zero. Now the primal linear programming problem can be written as a semidefinite programming problem:

Primal:
$$\min \{ C \bullet X \mid A_i \bullet X = b_i \text{ for } i = 1, \ldots, m, \quad X_{ij} = 0 \text{ for } i \neq j, \quad X \succeq 0 \}$$

Note that the condition $X_{ij} = 0$ is the same as $(E_{ij} + E_{ji}) \bullet X = 0$, where $E_{ij}$ is the matrix with all entries 0 except the $(i, j)$ entry, which is one. Now, taking the dual of this SDP, we arrive at a problem that is not equivalent to the dual of the LP:

Dual:
$$\max \{ b^T y \mid \sum_i y_i A_i + \sum_{i < j} s_{ij} (E_{ij} + E_{ji}) \preceq C \}$$

Even if the original LP problem has unique primal and dual solutions, it is unlikely in general that the dual of the SDP formulation has unique solutions: the constraints in the dual imply that $\sum_i y_i a_i \le c$, but there are in general infinitely many choices of the $s_{ij}$ that can be added to a given optimal $y$. The lesson is that it is not a good idea to formulate an LP as an SDP (which was obvious at the outset). But for the same reason it is not generally a good idea to express the dual of a cone-LP over $K_1 = K \cap L$ via $K^* + L^\perp$.

As another example, consider the second order cone $Q$. We know that $x \succeq_Q 0$ iff $\mathrm{Arw}(x) \succeq 0$. Thus again SOCP can be expressed as an SDP: write $Q = P^{n \times n} \cap L$, where $L$ is the linear space of arrow-shaped matrices, i.e., $X_{ij} = 0$ whenever $i \neq j$ with $i \neq 0$ and $j \neq 0$, and $X_{ii} = X_{jj}$ for all $i, j$. But again, formulating an SOCP as an SDP is not a good idea: if we form the dual as an SDP we will have extra and unnecessary variables that play no essential role and can make the solution numerically unstable, even if the original SOCP does not have numerical problems. In future lectures we will see even more compelling reasons why the SOCP problem should be treated in its own right rather than as a special case of SDP.
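A small numpy sketch of the arrow-shaped embedding just described: for $x = (x_0, \bar{x})$, the eigenvalues of $\mathrm{Arw}(x)$ are $x_0 \pm \|\bar{x}\|$ (plus $x_0$ repeated), so $\mathrm{Arw}(x) \succeq 0$ exactly when $x_0 \ge \|\bar{x}\|$, i.e. when $x \in Q$.

```python
import numpy as np

def arw(x):
    """Arrow matrix of x = (x_0, x_1, ..., x_n)."""
    A = x[0] * np.eye(len(x))   # x_0 on the whole diagonal
    A[0, 1:] = x[1:]            # xbar along the first row ...
    A[1:, 0] = x[1:]            # ... and the first column
    return A

x = np.array([2.0, 1.0, -1.0])  # x_0 = 2 >= ||(1,-1)|| ~ 1.414, so x is in Q
eigs = np.linalg.eigvalsh(arw(x))
print(eigs)                     # smallest eigenvalue is x_0 - ||xbar|| >= 0
print(np.isclose(eigs.min(), x[0] - np.linalg.norm(x[1:])))  # True
```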

4 Generalization of Complementary Slackness Conditions

Consider the pair of cone-LP problems:

Primal:
$$\min\ c^T x \quad \text{s.t.} \quad Ax = b, \quad x \succeq_K 0$$

Dual:
$$\max\ b^T y \quad \text{s.t.} \quad A^T y + s = c, \quad s \succeq_{K^*} 0.$$

We saw before that at the optimum the following three relations hold: $x \succeq_K 0$, $s \succeq_{K^*} 0$, and $x^T s = 0$. In the case of LP, SDP and SOCP these conditions actually imply stronger relations, which we now examine.

Example 1 (the non-negative orthant) When $K = K^* = \mathbb{R}^n_+$, at the optimum we have $x_i \ge 0$ and $s_i \ge 0$ for $i = 1, \ldots, n$, and $x^T s = 0$. These imply $x_i s_i = 0$ for $i = 1, \ldots, n$, because if a sum of non-negative numbers $x_i s_i$ is zero, each of them must be zero. This is the familiar complementary slackness theorem of linear programming.

Example 2 (the semidefinite cone) When $K = K^* = P^{n \times n}$, at the optimum $X \succeq 0$, $S \succeq 0$, and $X \bullet S = \mathrm{tr}(XS) = 0$. Since the matrix $S$ is symmetric positive semidefinite, it can be expressed as
$$S = Q^T \Omega Q = (Q^T \Omega^{1/2} Q)(Q^T \Omega^{1/2} Q) = S^{1/2} S^{1/2},$$
where $Q$ is an orthogonal matrix and $\Omega$ is a diagonal matrix containing the eigenvalues of $S$ on its diagonal. This shows that each positive semidefinite matrix has a positive semidefinite square root, denoted $S^{1/2}$ (which is in fact unique). Now,
$$0 = \mathrm{tr}(XS) = \mathrm{tr}\left( X S^{1/2} S^{1/2} \right) = \mathrm{tr}\left( S^{1/2} X S^{1/2} \right).$$
This implies that $S^{1/2} X S^{1/2} = 0$, because $S^{1/2} X S^{1/2}$ is also a positive semidefinite matrix, with non-negative eigenvalues and trace zero; since the trace is the sum of the eigenvalues, this is possible only when all eigenvalues are zero, which, in the case of symmetric matrices, implies that the matrix $S^{1/2} X S^{1/2}$ is zero. Thus $0 = S^{1/2} X S^{1/2} = (S^{1/2} X^{1/2})(X^{1/2} S^{1/2}) = A^T A$ with $A = X^{1/2} S^{1/2}$. We know that $A^T A = 0$ iff $A = 0$; thus $X^{1/2} S^{1/2} = 0$, which implies $XS = 0$. We have shown:

Theorem 1 (Complementary slackness theorem for SDP) If $X$ is optimal for the primal SDP, $(y, S)$ is optimal for the dual SDP, and the duality gap is zero, that is $X \bullet S = 0$, then $XS = 0$.
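A numpy sketch of Theorem 1 on a made-up complementary pair: taking $X$ and $S$ positive semidefinite with mutually orthogonal ranges forces $\mathrm{tr}(XS) = 0$, and the full matrix product $XS$ indeed vanishes, not just its trace.

```python
import numpy as np

rng = np.random.default_rng(2)
# Build an orthonormal basis and split its columns between X and S,
# so their ranges are orthogonal.
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
X = Q[:, :2] @ np.diag([3.0, 1.0]) @ Q[:, :2].T   # X >= 0, rank 2
S = Q[:, 2:] @ np.diag([2.0, 5.0]) @ Q[:, 2:].T   # S >= 0, rank 2

print(np.trace(X @ S))        # ~ 0: zero duality gap
print(np.allclose(X @ S, 0))  # True: XS = 0, as Theorem 1 asserts
```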

Example 3 (the second order cone) When $K = K^* = Q$, we have $x \succeq_Q 0$, $s \succeq_Q 0$ and $x^T s = 0$, where $x, s \in \mathbb{R}^{n+1}$ are indexed from 0. This means that $x_0 \ge \|\bar{x}\|$, $s_0 \ge \|\bar{s}\|$, and $x^T s = 0$. Assuming for now that $x_0 > 0$ and $s_0 > 0$, these conditions give:

$$x_0^2 \ge x_1^2 + \cdots + x_n^2 \quad\Longrightarrow\quad \frac{s_0}{x_0} \sum_i x_i^2 \le x_0 s_0 \tag{1}$$

$$s_0^2 \ge s_1^2 + \cdots + s_n^2 \quad\Longrightarrow\quad \frac{x_0}{s_0} \sum_i s_i^2 \le x_0 s_0 \tag{2}$$

$$x_0 s_0 = -(x_1 s_1 + \cdots + x_n s_n) \tag{3}$$

Now, adding (1) and (2) and using (3), we get

$$0 \ge \sum_i \left( \frac{x_i^2 s_0}{x_0} + \frac{s_i^2 x_0}{s_0} + 2 x_i s_i \right) = \sum_i \frac{x_i^2 s_0^2 + s_i^2 x_0^2 + 2 x_i s_i x_0 s_0}{x_0 s_0} = \sum_i \frac{(x_i s_0 + s_i x_0)^2}{x_0 s_0}.$$

Again, a sum of non-negative numbers is less than or equal to zero; therefore all of them must be zero. We thus have

$$x_i s_0 + x_0 s_i = 0, \quad i = 1, \ldots, n \qquad \text{and} \qquad x^T s = 0.$$

We have shown:

Theorem 2 (Complementary slackness for SOCP) If $x \succeq_Q 0$, $s \succeq_Q 0$, and $x^T s = 0$, then $x_0 s_i + x_i s_0 = 0$ for $i = 1, \ldots, n$.

These conditions (along with $x^T s = 0$) can be written more succinctly as $\mathrm{Arw}(x)\,\mathrm{Arw}(s)\,e = 0$. We implicitly assumed above that $x_0 \neq 0$ and $s_0 \neq 0$. If $x_0 = 0$, then $x \succeq_Q 0$ implies that $x = 0$, and the theorem above is trivially true; the same holds when $s_0 = 0$.
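A numpy sketch of Theorem 2 on a made-up pair: $x$ and $s$ sit on the boundary of $Q$ with $x^T s = 0$, and $\mathrm{Arw}(x)\,\mathrm{Arw}(s)\,e$ comes out as the zero vector.

```python
import numpy as np

def arw(x):
    """Arrow matrix: x_0 on the diagonal, xbar on first row/column."""
    A = x[0] * np.eye(len(x))
    A[0, 1:] = x[1:]
    A[1:, 0] = x[1:]
    return A

# A complementary pair on the boundary of Q: x = (1, xbar) with
# ||xbar|| = 1, and s = (1, -xbar), so x^T s = 1 - ||xbar||^2 = 0.
xbar = np.array([0.6, 0.8])
x = np.concatenate(([1.0], xbar))
s = np.concatenate(([1.0], -xbar))

e = np.zeros(len(x)); e[0] = 1.0
print(x @ s)                # 0: zero duality gap
print(arw(x) @ arw(s) @ e)  # the zero vector: Arw(x) Arw(s) e = 0
```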

5 A general complementary slackness theorem

For a proper cone $K \subseteq \mathbb{R}^n$, define

$$C(K) = \left\{ \binom{x}{s} \;\middle|\; x \succeq_K 0,\; s \succeq_{K^*} 0,\; x^T s = 0 \right\} \subseteq \mathbb{R}^{2n}.$$

Now, on the surface, the set $C(K)$ seems to be a $(2n-1)$-dimensional set: its members have $2n$ coordinates, and since $x^T s = 0$ we are left with $2n - 1$ degrees of freedom. The condition $x \in K$ by itself does not impose a restriction on the dimension of the set, nor does the condition $s \in K^*$. Nevertheless, it turns out that $C(K)$ is actually an $n$-dimensional set! Here is why:

Theorem 3 There is a one-to-one, onto, and continuous mapping from $C(K)$ to $\mathbb{R}^n$.

Before we proceed to the proof we recall the following basic fact.

Fact 1 Let $S \subseteq \mathbb{R}^n$ be a closed convex set and $a \in \mathbb{R}^n$. Then there is a unique point $x = \Pi_S(a)$ in $S$ which is closest to $a$, i.e., there is a unique point $x \in S$ such that $x = \mathrm{argmin}_{y \in S} \|a - y\|$.

The unique point above is called the projection of $a$ onto $S$. The proof of this fact can be found in many texts and is based on Weierstrass's theorem. Now we give the proof of Theorem 3.

Proof: Let $a \in \mathbb{R}^n$ be an arbitrary point, let $x = \Pi_K(a)$ be its projection onto $K$, and define $s = x - a$. We will first show that $s \in K^*$, and then show that the correspondence between $a$ and $(x, s)$ is one-to-one, onto, and continuous.

First we show that $s \in K^*$. For every $u \in K$, define the convex combination $u_\alpha = \alpha u + (1 - \alpha) x$, where $0 \le \alpha \le 1$, and define $\zeta(\alpha) = \|a - u_\alpha\|^2$. Then $\zeta(\alpha)$ is a differentiable function on the interval $[0, 1]$, and $\min_{0 \le \alpha \le 1} \zeta(\alpha)$ is attained at $\alpha = 0$.

Claim: $\left. \frac{d\zeta}{d\alpha} \right|_{\alpha=0} \ge 0$.

Proof of Claim: Otherwise there would exist $\alpha$ in some neighborhood of 0 such that $\|a - u_\alpha\| < \|a - u_0\|$, contradicting the fact that $x = u_0$ is the closest point to $a$ in $K$.

From this claim,

$$\left. \frac{d\zeta}{d\alpha} \right|_{\alpha=0} = -2 (a - x)^T (u - x) \ge 0 \;\iff\; 2 (x - a)^T (u - x) \ge 0 \;\iff\; 2 s^T (u - x) \ge 0. \tag{4}$$

This latter inequality is true for any $u \in K$. If we choose $u = 2x$ then we get $s^T x \ge 0$; if we choose $u = x/2$ then $s^T x \le 0$. We conclude that $x^T s = 0$. If we plug this into (4) we get $s^T u \ge 0$, which means $s \in K^*$. Thus, for each $a$ we get a pair $(x, s) \in C(K)$. Clearly each $a$ results in a unique $(x, s)$: the projection $x$ is unique, and thus so is $s = x - a$. Also, both the projection operation and the map $s = x - a$ are continuous.

Conversely, if $(x, s) \in C(K)$, then we can set $a = x - s$. All we have to show now is that the projection of $a$ onto $K$ is $x$. Assume otherwise. Then there is a point $u \in K$ such that $\|a - u\| < \|a - x\|$, that is,

$$(a - x)^T (a - x) > (a - u)^T (a - u)$$
$$x^T x - 2 (x - s)^T x > u^T u - 2 (x - s)^T u$$

and, noting that $x^T s = 0$,

$$0 > u^T u + x^T x - 2 x^T u + 2 u^T s$$
$$0 > \|u - x\|^2 + 2 s^T u,$$

which implies that $s^T u < 0$, contradicting the fact that $s \in K^*$. (This proof is due to Osman Güler.)

Example 4 ($C(K)$ of the half-line) Let us see what $C(K)$ looks like in the case of the half-line, that is, when $K = K^* = \mathbb{R}_+$:

$$C(\mathbb{R}_+) = \left\{ \binom{x}{s} \;\middle|\; x \ge 0,\; s \ge 0,\; xs = 0 \right\} \subseteq \mathbb{R}^2.$$

In other words, $C(\mathbb{R}_+)$ is the union of the non-negative parts of the $x$ and $s$ axes: it is the real line $\mathbb{R}$ bent at the origin by a 90-degree angle.

Now, the implication of this theorem is that, since $C(K)$ is $n$-dimensional, there must exist a set of $n$ equations, independent in some sense, that define the manifold $C(K)$. These $n$ equations are precisely the complementary slackness conditions. In the case of the non-negative orthant, the semidefinite cone, and the second order cone we were able to obtain these equations explicitly. When the cone $K$ is given by a set of inequalities of the form $g_i(x) \ge 0$ for $i = 1, \ldots, n$, where the $g_i$ are homogeneous convex functions, the classical Karush-Kuhn-Tucker conditions give us a method of obtaining these equations.
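For $K = \mathbb{R}^n_+$ the mapping of Theorem 3 is easy to compute explicitly: projecting $a$ onto the orthant keeps its positive part, and $s = x - a$ keeps its negative part. A minimal numpy sketch of the round trip $a \mapsto (x, s) \mapsto a$:

```python
import numpy as np

rng = np.random.default_rng(3)
a = rng.standard_normal(6)

x = np.maximum(a, 0.0)  # x = projection of a onto the orthant R^n_+
s = x - a               # s = negative part of a; here K* = K = R^n_+

# (x, s) lies in C(K): both non-negative and complementary ...
print(np.all(x >= 0), np.all(s >= 0), x @ s)  # True True 0.0
# ... and the map is invertible: a = x - s recovers a.
print(np.allclose(a, x - s))                  # True
```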

6 Eigenvalue Optimization

In this section we relate the eigenvalues $\lambda_1(A) \ge \lambda_2(A) \ge \cdots \ge \lambda_n(A)$ of a symmetric matrix $A \in S^{n \times n}$ to semidefinite programming. Let us find an SDP formulation of the largest eigenvalue, $\lambda_1(A)$. This problem can be formulated by primal and dual SDPs as follows:

Primal:
$$\min\ z \quad \text{s.t.} \quad zI - A \succeq 0$$

Dual:
$$\max\ A \bullet Y \quad \text{s.t.} \quad I \bullet Y = \mathrm{tr}(Y) = 1, \quad Y \succeq 0$$

The primal formulation simply says: find the smallest $z$ such that $z$ is larger than all eigenvalues of $A$. But $z$ is larger than all eigenvalues of $A$ iff $zI - A$ is positive semidefinite. The dual characterization is obtained by simply taking the dual. Now define the feasible set of the dual to be $S$:

Definition 1
$$S = \{ Y \in S^{n \times n} \mid \mathrm{tr}(Y) = 1,\ Y \succeq 0 \} \tag{5}$$
$$E = \{ qq^T \mid \|q\| = 1 \} \tag{6}$$

We can characterize the extreme points of $S$ as follows:

Theorem 4 $S$ is a convex set, and the set of extreme points of $S$ is $E$.

Proof: Convexity of $S$ is obvious, since it is the intersection of the semidefinite cone and an affine set. If $Y \in S$, the spectral decomposition gives $Y = \omega_1 q_1 q_1^T + \cdots + \omega_k q_k q_k^T$, where $\omega_i \ge 0$ (since $Y \succeq 0$), $\|q_i\| = 1$, and $\sum_i \omega_i = \mathrm{tr}(Y) = 1$. So every element of $S$ is a convex combination of elements of $E$, and hence the extreme points of $S$ are among the elements of $E$. Now we prove that all elements of $E$ are extreme points. Otherwise, for some $qq^T$ there are $p$ and $r$ with $\|p\| = \|r\| = 1$, $pp^T \neq rr^T$, and $0 < \alpha < 1$ such that
$$qq^T = \alpha pp^T + (1 - \alpha) rr^T = \begin{pmatrix} \sqrt{\alpha}\, p & \sqrt{1 - \alpha}\, r \end{pmatrix} \begin{pmatrix} \sqrt{\alpha}\, p & \sqrt{1 - \alpha}\, r \end{pmatrix}^T.$$
The right-hand side has rank 2, contradicting the fact that $\mathrm{rank}(qq^T) = 1$. So the elements of $E$ are extreme points.

Since the optimum of a linear function over a convex set is attained at an extreme point, it follows that the $Y$ that maximizes $A \bullet Y$ in the dual characterization above is of the form $Y = qq^T$ with $\|q\| = 1$. That is,
$$\lambda_1(A) = \max_{\|q\| = 1} q^T A q.$$
This is a well-known result in linear algebra that we have proved using duality of SDP. In future lectures we will use this characterization to express optimization of eigenvalues over an affine class of matrices.
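A sketch of the dual SDP characterization of $\lambda_1(A)$, assuming cvxpy (with a bundled SDP-capable solver) and a made-up symmetric matrix; the SDP value agrees with numpy's largest eigenvalue and with the Rayleigh quotient at the top eigenvector.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(4)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2                      # a made-up symmetric matrix

# Dual SDP: max A . Y  s.t.  tr(Y) = 1, Y >= 0
Y = cp.Variable((5, 5), symmetric=True)
prob = cp.Problem(cp.Maximize(cp.trace(A @ Y)),
                  [cp.trace(Y) == 1, Y >> 0])
prob.solve()

lam, V = np.linalg.eigh(A)             # eigh sorts eigenvalues ascending
q = V[:, -1]                           # eigenvector of the largest eigenvalue
print(prob.value, lam[-1], q @ A @ q)  # all three agree: lambda_1(A)
```

At the optimum the solver returns $Y \approx qq^T$, an extreme point of $S$, exactly as Theorem 4 predicts.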