Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 2

Instructor: Farid Alizadeh
Scribe: Xuan Li
9/17/2001

1 Overview

We survey the basic notions of cones and cone-LP and give several examples, mostly related to semidefinite programming.

2 Program Formulations

The linear and semidefinite programming problems are formulated as follows.

2.1 Standard Form Linear Programming

Let c ∈ R^n, b ∈ R^m, and A ∈ R^{m×n} with rows a_i ∈ R^n, i = 1,...,m.

    min  c^T x
    s.t. a_i^T x = b_i,  i = 1,...,m        (1)
         x ≥ 0

2.2 Semidefinite Programming

Here, instead of the vectors a_i, we use symmetric matrices A_i ∈ S^{n×n} (the set of n×n symmetric matrices), i = 1,...,m, and we use C ∈ S^{n×n} and X ∈ S^{n×n} instead of c and x. The matrix X is required to be positive semidefinite. The inner product is defined as

    A • B = Σ_{i,j} A_ij B_ij = Tr(AB^T) = Tr(AB) = Tr(BA).
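The chain of identities above can be sanity-checked numerically. The following sketch is illustrative only (it is not part of the lecture); it uses plain Python lists for two arbitrarily chosen 2×2 symmetric matrices:

```python
# Check A . B = sum_ij A_ij B_ij = Tr(A B^T) = Tr(AB) = Tr(BA)
# on two small symmetric matrices (the specific entries are made up).

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    n = len(A)
    return [[A[j][i] for j in range(n)] for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

A = [[2.0, 1.0], [1.0, 3.0]]    # symmetric
B = [[0.5, -1.0], [-1.0, 4.0]]  # symmetric

dot = sum(A[i][j] * B[i][j] for i in range(2) for j in range(2))
assert abs(dot - trace(matmul(A, transpose(B)))) < 1e-12
assert abs(dot - trace(matmul(A, B))) < 1e-12   # B symmetric: B^T = B
assert abs(dot - trace(matmul(B, A))) < 1e-12   # Tr(AB) = Tr(BA)
print(dot)  # 11.0
```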

The second equation follows from the definition of the matrix product, and the last one from the observation that even though matrix multiplication is not commutative (AB ≠ BA in general), the diagonal entries of AB and BA are equal, and thus their traces are equal as well. The standard form of semidefinite programming is:

    min  C • X
    s.t. A_i • X = b_i,  i = 1,...,m
         X ⪰ 0

3 Some Notations and Definitions

cone: A set K is called a cone if αx ∈ K for each x ∈ K and each α ≥ 0.

convex cone: A convex cone K is a cone with the additional property that x + y ∈ K for each x, y ∈ K.

pointed cone: A pointed cone K is a cone with the property that K ∩ (−K) = {0}.

open set: A set S is open if for every point s ∈ S there is some ε_s > 0 such that B(s, ε_s) = {x : ||x − s|| < ε_s} ⊆ S.

closed set: A set S is closed if its complement S^c is open.

interior of a set: The interior of a set S is defined as Int(S) := ∪ {T : T ⊆ S, T open}.

closure of a set: The closure of a set S is defined as Cl(S) := ∩ {T : T ⊇ S, T closed}.

boundary of a set: The boundary of a set S is defined as Bd(S) := Cl(S) ∩ (Int(S))^c.

Remark 1 There are some basic facts which can easily be seen from the definitions above:

1. A nonempty open set in R^n is not open in R^m for n < m;
2. similarly, the boundary and the interior of a set are not the same in R^n as in R^m;
3. as a result, one talks about an open set with respect to the topology induced by the vector space spanned by the set S;
4. similarly, we speak of the relative interior and relative boundary of a set, which are understood to be with respect to the topology of the space spanned by the set;
5. a closed set in R^n is also closed in R^m.

Consider the half-closed interval [a, b) = {x : a ≤ x < b} in R^1. The interior of [a, b) in R^1 is the open interval (a, b), and the boundary of [a, b) is {a, b}. But (a, b) is not open in R^2, since for any x ∈ (a, b) we cannot find ε > 0 such that B(x, ε) ⊆ (a, b). The interior of [a, b) in R^2 is empty, and the boundary of [a, b) in R^2 is [a, b]. However, the relative interior of [a, b) in R^n is again (a, b), and the relative boundary is {a, b}.

Definition 1 (Proper Cone) A proper cone K ⊆ R^n is a closed, pointed, convex and full-dimensional cone (i.e., dim(K) = n). A full-dimensional cone is a cone which contains n linearly independent vectors.

Theorem 1 Every proper cone K induces a partial order, defined as follows: for x, y ∈ R^n,

    x ⪰_K y  ⟺  x − y ∈ K,
    x ≻_K y  ⟺  x − y ∈ Int(K).

Proof: First note that x ⪰_K x, since x − x = 0 ∈ K. Secondly, if x ⪰_K y and y ⪰_K x, then x − y ∈ K and y − x ∈ K; since K is a proper cone, and thus a pointed cone, we get x = y. Finally, if x ⪰_K y and y ⪰_K z, then x − z = (x − y) + (y − z) ∈ K, i.e., x ⪰_K z.

4 The Standard Cone Linear Program (K-LP)

    min  c^T x
    s.t. a_i^T x = b_i,  i = 1,...,m
         x ⪰_K 0

where c ∈ R^n, b ∈ R^m, and A ∈ R^{m×n} has rows a_i ∈ R^n, i = 1,...,m.

Observe that every convex optimization problem

    min_{x ∈ C} f(x),

where C is a convex set

and f(x) is convex over C, can be turned into a cone-LP. First turn the problem into one with a linear objective:

    min  z
    s.t. f(x) − z ≤ 0
         x ∈ C.

Since the set C′ = {(z, x) : x ∈ C and f(x) − z ≤ 0} is convex, our problem is now equivalent to the cone-LP

    min  z
    s.t. x_0 = 1
         (x_0, z, x) ⪰_K 0

where

    K = {(x_0, z, x) : (z, x) ∈ x_0 C′ and x_0 ≥ 0}.

[Figure: the convex set C′, embedded in a plane and turned into a cone.]

Definition 2 (Dual Cone) The dual cone K* of a proper cone K is the set {z : z^T x ≥ 0 for all x ∈ K}. It is easy to prove that if K is proper, so is K*.
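As a toy illustration of this definition, one can test candidate dual vectors against sample points of a cone. The sketch below is only a finite-sample heuristic (a real proof must quantify over all of K, so passing the sampled test proves nothing, though failing it disproves membership); it takes K = R^2_+:

```python
# Finite-sample heuristic for the dual-cone condition z^T x >= 0 for all
# x in K, with K = R^2_+ sampled on a small grid. Illustrative only.

def passes_sampled_dual_test(z, samples):
    return all(z[0] * x[0] + z[1] * x[1] >= 0 for x in samples)

samples = [(i, j) for i in range(5) for j in range(5)]  # grid points of R^2_+

assert passes_sampled_dual_test((1.0, 2.0), samples)       # (1,2) lies in K* = K
assert not passes_sampled_dual_test((1.0, -1.0), samples)  # x = (0,1) is a witness
```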

Example 1 (Half line) Let R_+ = {x : x ≥ 0}. The dual cone R_+* is exactly R_+.

Example 2 (Non-negative orthant) Let R^n_+ = {x : x_k ≥ 0 for k = 1,...,n}. The dual cone equals R^n_+; that is, the non-negative orthant is self-dual.

We recall:

Lemma 1 A matrix X is positive semidefinite if it satisfies any one of the following equivalent conditions:
1. a^T X a ≥ 0 for all a ∈ R^n;
2. there exists A ∈ R^{n×n} such that AA^T = X;
3. all eigenvalues of X are non-negative.

Example 3 (The semidefinite cone) Let

    P^{n×n} = {X ∈ R^{n×n} : X is symmetric positive semidefinite}.

We are now interested in (P^{n×n})*.

On one side, suppose Z ∈ (P^{n×n})*, i.e., Z • X ≥ 0 for all X ⪰ 0. Since every X ⪰ 0 can be written X = AA^T by condition 2 of Lemma 1,

    Z • X = Tr(ZX) = Tr(ZAA^T) = Tr(A^T ZA) ≥ 0  for all A ∈ R^{n×n}.

Since Z is symmetric, from linear algebra it can be written as Z = QΓQ^T, where QQ^T = I (that is, Q is an orthogonal matrix) and Γ = diag(γ_1,...,γ_n) holds the eigenvalues of Z. Write Q = [p_1,...,p_n]; then p_i is the unit eigenvector corresponding to γ_i, i.e., p_i^T Z p_i = γ_i and p_i^T p_i = 1. Choosing A = p_i (viewed as an n×1 matrix) gives

    0 ≤ Tr(A^T ZA) = p_i^T Z p_i = γ_i.

So all the eigenvalues of Z are non-negative, i.e., Z ∈ P^{n×n}; hence (P^{n×n})* ⊆ P^{n×n}.

On the other hand, suppose Y ∈ P^{n×n}, so Y = BB^T for some B ∈ R^{n×n}. For every X ∈ P^{n×n} we can write X = AA^T, and

    Y • X = Tr(YX) = Tr(BB^T AA^T) = Tr(A^T BB^T A) = Tr[(B^T A)^T (B^T A)] ≥ 0,

since the last expression is the trace of a Gram matrix. Hence Y ∈ (P^{n×n})*, i.e., P^{n×n} ⊆ (P^{n×n})*.

In conclusion, (P^{n×n})* = P^{n×n}.
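Both directions of this argument can be illustrated numerically. The sketch below (small fixed 2×2 matrices, chosen arbitrarily) builds PSD matrices as AA^T per condition 2 of Lemma 1, checks Tr(ZX) ≥ 0, and exhibits a witness X showing that a symmetric matrix with a negative eigenvalue fails the dual test:

```python
# Illustrative check (not from the lecture) that Tr(ZX) >= 0 when Z and X
# are both PSD, and that a non-PSD symmetric matrix fails against some PSD X.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    n = len(A)
    return [[A[j][i] for j in range(n)] for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def psd_from(A):
    # A A^T is positive semidefinite for any A (condition 2 of Lemma 1)
    return matmul(A, transpose(A))

X = psd_from([[1.0, 2.0], [0.0, 1.0]])   # PSD by construction
Z = psd_from([[3.0, -1.0], [1.0, 1.0]])  # PSD by construction
assert trace(matmul(Z, X)) >= 0          # Z . X >= 0

Z_bad = [[1.0, 0.0], [0.0, -1.0]]        # eigenvalues 1 and -1: not PSD
X_wit = [[0.0, 0.0], [0.0, 1.0]]         # e2 e2^T, a rank-one PSD matrix
assert trace(matmul(Z_bad, X_wit)) < 0   # witnesses Z_bad not in the dual cone
```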

Example 4 (The second order cone) Let Q = {(x_0, x) : x_0 ≥ ||x||}. Q is a proper cone. What is Q*?

On one side, if z = (z_0, z) ∈ Q, then for every (x_0, x) ∈ Q

    (z_0, z)^T (x_0, x) = z_0 x_0 + z^T x ≥ ||z|| ||x|| + z^T x ≥ |z^T x| + z^T x ≥ −z^T x + z^T x = 0,

i.e., Q ⊆ Q*. The middle inequality comes from the Cauchy-Schwarz inequality:

    |z^T x| ≤ ||z|| ||x||.

On the other side, note that e = (1, 0) ∈ Q, so for each element z = (z_0, z) ∈ Q* we must have z^T e = z_0 ≥ 0. We also note that each vector of the form x = (||z||, −z) belongs to Q, for every z ∈ R^n. Thus, in particular, for z = (z_0, z) ∈ Q*,

    z^T x = z_0 ||z|| − ||z||^2 ≥ 0.

Since ||z|| is always non-negative, we get z_0 ≥ ||z||, i.e., Q* ⊆ Q. Therefore, Q* = Q.

Definition 3 An extreme ray of a proper cone K is a half line αx = {αx : α ≥ 0}, for some x ∈ K, such that for each a ∈ αx, if a = b + c with b, c ∈ K, then b, c ∈ αx.

Example 5 (Extreme rays of the second order cone) Let Q be the second order cone. The vectors x = (||x||, x) define the extreme rays of Q. This is fairly easy to prove.

Example 6 (Extreme rays of the semidefinite cone) Let P^{n×n} be the semidefinite cone. The positive semidefinite matrices qq^T of rank 1 form the extreme rays of P^{n×n}. Here is the proof. Any positive semidefinite matrix X can be written in the form X = Σ_i λ_i p_i p_i^T (see the previous lecture for how to get this from the spectral decomposition of X). This shows that all extreme rays must be among matrices of the form qq^T. Now we must show that each qq^T is an extreme ray. Let qq^T = X + Y, where X, Y ⪰ 0. Suppose {q_1 = q, q_2,...,q_n} is an orthogonal set of vectors in R^n. Multiplying from the left by q_i^T and from the right by q_i, we see that q_i^T X q_i + q_i^T Y q_i = 0 for i = 2,...,n; since the summands are both non-negative and add up to zero, they are both zero. Thus q_i^T X q_i = q_i^T Y q_i = 0 for i = 2,...,n. Since X and Y are positive semidefinite, this forces Xq_i = Yq_i = 0, so both X and Y are matrices of rank at most one (their null spaces have dimension at least n − 1), and we might as well write qq^T = xx^T + yy^T. But the right-hand side is a rank-2 matrix unless x and y are proportional, which proves they are both proportional to q.
Thus, the ray through qq^T is an extreme ray of P^{n×n}, for each vector q ∈ R^n.
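The decomposition X = Σ_i λ_i p_i p_i^T used at the start of this proof can be made concrete in the 2×2 case, where the eigendecomposition has a closed form. This is an illustrative sketch, not from the lecture; the helper names and the sample matrix are made up:

```python
import math

# Sketch (2x2 case only): write a PSD matrix X as a nonnegative combination
# lambda_1 q1 q1^T + lambda_2 q2 q2^T of rank-one extreme rays.

def eig_sym_2x2(a, b, c):
    # eigenpairs of the symmetric matrix [[a, b], [b, c]]
    d = math.sqrt((a - c) ** 2 + 4 * b * b)
    l1, l2 = (a + c + d) / 2, (a + c - d) / 2
    if b != 0:
        v1, v2 = (l1 - c, b), (l2 - c, b)
    else:  # already diagonal
        v1, v2 = ((1.0, 0.0), (0.0, 1.0)) if a >= c else ((0.0, 1.0), (1.0, 0.0))
    def unit(v):
        n = math.hypot(*v)
        return (v[0] / n, v[1] / n)
    return (l1, unit(v1)), (l2, unit(v2))

def rank_one(lam, q):
    # the rank-one PSD matrix lam * q q^T
    return [[lam * q[i] * q[j] for j in range(2)] for i in range(2)]

a, b, c = 2.0, 1.0, 2.0                 # X = [[2, 1], [1, 2]], PSD
(l1, q1), (l2, q2) = eig_sym_2x2(a, b, c)
assert l1 >= 0 and l2 >= 0              # PSD: all eigenvalues nonnegative

R1, R2 = rank_one(l1, q1), rank_one(l2, q2)
X = [[R1[i][j] + R2[i][j] for j in range(2)] for i in range(2)]
for row, target in zip(X, [[a, b], [b, c]]):
    for x, t in zip(row, target):
        assert abs(x - t) < 1e-9        # X rebuilt from its extreme-ray pieces
```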

4.1 An Example of a Cone Which Is Not Self-Dual

In the examples above, all the cones were self-dual. But there are cones that are not self-dual. Let F be the set of functions F : R → R with the following properties:

1. F is right-continuous,
2. F is non-decreasing (i.e., if x > y then F(x) ≥ F(y)), and
3. F has bounded variation; that is, F(x) → α > −∞ as x → −∞, and F(x) → β < ∞ as x → ∞.

First observe that functions in F are almost like probability distribution functions, except that their range is the interval [α, β] rather than [0, 1]. Second, the set F is itself a convex cone, and in fact a pointed cone, in the space of such functions.

Now we define a particular kind of moment cone. First, let us define

    u_x = (1, x, x^2, ..., x^n)^T.

The moment cone is defined as

    M_{n+1} = { c = ∫ u_x dF(x) : F ∈ F },

that is, M_{n+1} consists of vectors c where, for each j = 0,...,n, c_j is the j-th moment of a distribution times a non-negative constant.

Lemma 2 M_{n+1} is a proper cone.

Proof: Let us examine the properties we need to prove.

If c ∈ M_{n+1} and α ≥ 0, then αc ∈ M_{n+1}. To see this, observe that there exists F ∈ F such that c = ∫ u_x dF(x). Now if F is right-continuous, non-decreasing and of bounded variation, then all of these properties also hold for αF for each α ≥ 0, and thus αF ∈ F. Therefore, αc = ∫ u_x d(αF(x)) ∈ M_{n+1}. Thus M_{n+1} is a cone.

If c and d are in M_{n+1}, then c + d ∈ M_{n+1}: from c = ∫ u_x dF_1(x) and d = ∫ u_x dF_2(x) we get

    c + d = ∫ u_x d[F_1(x) + F_2(x)] ∈ M_{n+1}.

Thus M_{n+1} is a convex cone.
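The two closure properties just verified can be illustrated with the simplest members of M_{n+1}: moment vectors of point masses, for which the integral reduces to u_a = (1, a, ..., a^n). A toy sketch (the specific numbers are arbitrary):

```python
# Moment vectors of point masses: a unit step at a contributes
# u_a = (1, a, a^2, ..., a^n). Scaling by alpha >= 0 and adding two such
# vectors stays inside the moment cone, mirroring the two proof steps above.

def u(a, n):
    return [a ** j for j in range(n + 1)]

n = 3
c = u(2.0, n)                                # moments of the point mass at 2
d = u(-1.0, n)                               # moments of the point mass at -1

alpha = 0.5
scaled = [alpha * cj for cj in c]            # moments of the scaled measure
summed = [cj + dj for cj, dj in zip(c, d)]   # moments of the sum of measures

assert scaled == [0.5, 1.0, 2.0, 4.0]
assert summed == [2.0, 1.0, 5.0, 7.0]
```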

If c and −c are both in M_{n+1}, then c = 0. Indeed, if c = ∫ u_x dF_1(x) ∈ M_{n+1} and −c ∈ M_{n+1}, then −c = ∫ u_x dF_2(x) for some F_2 ∈ F, and

    c + (−c) = 0 = ∫ u_x d[F_1(x) + F_2(x)].

In particular, ∫ d[F_1(x) + F_2(x)] = 0. Since F_1 + F_2 ∈ F is non-decreasing with F_1(x) + F_2(x) → 0 as x → −∞, we get F_1(x) + F_2(x) = 0 almost everywhere, i.e., F_i(x) = 0, i = 1, 2, almost everywhere. This means c = 0, i.e., M_{n+1} ∩ (−M_{n+1}) = {0}. Thus M_{n+1} is a pointed cone.

M_{n+1} is full-dimensional. Let

    F_a(x) = 0 if x < a,  F_a(x) = 1 if x ≥ a.

Obviously F_a ∈ F, and u_a = ∫ u_x dF_a(x) ∈ M_{n+1} for all a ∈ R. Choose n + 1 distinct points a_1,...,a_{n+1}; then

    det[u_{a_1}, ..., u_{a_{n+1}}] = Π_{i>j} (a_i − a_j) ≠ 0.

Thus M_{n+1} is a full-dimensional cone. (The determinant above is the well-known Vandermonde determinant.)

In addition, we need to show that M_{n+1} is closed. This will be taken up in future lectures.

Example 7 (Extreme rays of M_{n+1}) The extreme rays of M_{n+1} are the rays αu_x for x ∈ R. Every c ∈ M_{n+1} can be written as c = α_1 u_{x_1} + α_2 u_{x_2} + ... + α_{n+1} u_{x_{n+1}}, with α_i ≥ 0 for i = 1,...,n+1. There is a one-to-one correspondence between c ∈ M_{n+1} and the matrix

    H = α_1 u_{x_1} u_{x_1}^T + α_2 u_{x_2} u_{x_2}^T + ... + α_{n+1} u_{x_{n+1}} u_{x_{n+1}}^T.

Such a matrix is called a Hankel matrix. In general, Hankel matrices are those matrices H such that H_ij = h_{i+j}; that is, the entries are constant along all anti-diagonals. A vector c ∈ R^{2n+1} is in the moment cone if and only if the Hankel matrix with H_ij = c_{i+j} is positive semidefinite. Again, these assertions will be proved in future lectures.

Now we examine M_{n+1}*. Let us first consider the cone defined as follows:

    P_{n+1} = {p = (p_0,...,p_n) : p(x) = p_0 + p_1 x + p_2 x^2 + ... + p_n x^n ≥ 0 for all x}.

Lemma 3 Every non-negative polynomial is a sum of squares of polynomials.

Proof: First, it is well known that p(x) can be factored as

    p(x) = c { Π_{j=1}^k (x − α_j − iβ_j)(x − α_j + iβ_j) } { Π_{j=k+1}^{n} (x − γ_j) },

where i = √(−1) and c ≥ 0. We first claim that n must be even: otherwise p(x) → −∞ as x → −∞ (or as x → ∞), and p could not be non-negative. Consequently the number of real roots is even as well, say 2l. Since p(x) ≥ 0, all the real roots must have even multiplicity, because otherwise in a neighborhood of a root with odd multiplicity there would be some t with p(t) < 0. Thus, we can write

    p(x) = c { Π_{j=1}^k (x − α_j − iβ_j)(x − α_j + iβ_j) } { Π_j (x − γ_j)^2 }.

On the other hand, for each pair of conjugate complex roots we have

    (x − α − iβ)(x − α + iβ) = (x − α)^2 + β^2.

Therefore the product expression for p(x) is a product of squares of polynomials and sums of squares of polynomials, which expands into a sum of squares of polynomials.

This means that the extreme rays of the cone of non-negative polynomials must lie among the squares q^2(x). Thus, the coefficient vectors of extreme rays are of the form q * q = q^2, where a * b denotes the convolution of the vectors a and b: for a, b ∈ R^{n+1}, a * b ∈ R^{2n+1} is defined as

    a * b = (a_0 b_0, a_0 b_1 + a_1 b_0, ..., a_0 b_k + a_1 b_{k−1} + ... + a_k b_0, ..., a_n b_n)^T,

and q^2 = q * q. Now, not all square polynomials are extreme rays. In particular, if a square polynomial has non-real roots, then it can be written as a sum of two square polynomials, as shown above. Thus, the extreme rays are among those square polynomials with only real roots. We now argue that these polynomials are indeed extreme rays.

Suppose p(x) = Π_j (x − γ_j)^{2k_j}, with distinct roots γ_j, is not an extreme ray. Then p(x) = q(x) + r(x), and since both q and r are non-negative, we must have q(x) ≤ p(x) everywhere. This means that the degree of q(x) is at most as large as the degree of p. Furthermore, since 0 ≤ q(γ_j) ≤ p(γ_j) = 0, each γ_j is also a root of q(x). But if for some γ_j the multiplicity in p is 2k_j and the multiplicity in q is 2m_j with m_j < k_j, then in some neighborhood of γ_j we would have q(x) > p(x), because (x − γ_j)^{2m_j} > (x − γ_j)^{2k_j} in a small enough neighborhood of γ_j when m_j < k_j; therefore k_j ≤ m_j for each root. Since the degree of p is at least the degree of q, it follows that k_j = m_j for each root. Thus q(x) = αp(x) for some constant α. We have proved:

Corollary 1 p is an extreme ray of P_{n+1} if p = q^2 and q(x) has only real roots.

We now show that P_{n+1} ⊆ M_{n+1}*. Note that for c = Σ_{j=1}^{n+1} β_j u_{x_j} ∈ M_{n+1} and p = Σ_i p_i^2 ∈ P_{n+1},

    (Σ_i p_i^2)^T (Σ_{j=1}^{n+1} β_j u_{x_j}) = Σ_{i,j} β_j [(p_i^2)^T u_{x_j}] ≥ 0,

since β_j ≥ 0 and (p_i^2)^T u_{x_j} = [p_i(x_j)]^2 ≥ 0.

Later in the course we will prove that P_{n+1} = M_{n+1}*.
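The final pairing argument rests on two facts: p^T u_a = p(a), and if p = q * q (convolution of coefficient vectors) then p(a) = q(a)^2 ≥ 0. A small sketch (illustrative; q is an arbitrary example polynomial):

```python
# Pairing of polynomials with moment vectors: for a coefficient vector p,
# p . u_a = p(a); and if p = q * q, then p(a) = q(a)^2 >= 0, so square
# polynomials pair nonnegatively with every point-mass moment vector u_a.

def conv(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def eval_poly(p, x):
    # p . u_x = sum_j p_j x^j
    return sum(pj * x ** j for j, pj in enumerate(p))

q = [1.0, -3.0, 2.0]     # q(x) = 1 - 3x + 2x^2 = (1 - x)(1 - 2x)
p = conv(q, q)           # coefficient vector of p(x) = q(x)^2

for a in (-2.0, 0.0, 0.5, 1.0, 3.0):
    val = eval_poly(p, a)
    assert abs(val - eval_poly(q, a) ** 2) < 1e-9
    assert val >= 0      # nonnegative at every test point
```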