
princeton u. sp'02   cos 598B: algorithms and complexity
Lecture 20: Lift and Project, SDP Duality
Lecturer: Sanjeev Arora   Scribe: Yury Makarychev

Today we will study the Lift and Project method. Then we will prove the SDP duality theorem.

1 Lift and Project

Consider the following integer program for the Vertex Cover problem:

   min Σ_{i∈V} x_i
   s.t.  x_i + x_j ≥ 1   for all {i,j} ∈ E
         x_i ∈ {0,1}     for all i ∈ V

Replacing the last constraint with the constraint 0 ≤ x_i ≤ 1, we get an LP relaxation of the problem. Unfortunately, the integrality gap of this LP is 2. However, we can rewrite the integer program as an equivalent quadratic program:

   min Σ_{i∈V} x_i
   s.t.  (1 - x_i)(1 - x_j) = 0   for all {i,j} ∈ E
         (1 - x_i) x_i = 0        for all i ∈ V
         0 ≤ x_i ≤ 1              for all i ∈ V

The idea of Lift and Project is to simulate quadratic programming (or degree-k programming) with an LP/SDP. We introduce new variables Y_ij that represent the product terms: the intended value is Y_ij = x_i x_j. For brevity, let x_0 = 1, so that Y_i0 = x_i. The condition (1 - x_i)(1 - x_j) = 0 can now be expressed as 1 - Y_i0 - Y_j0 + Y_ij = 0. We also introduce new constraints on the Y_ij:

1. Y_ij = Y_ji;

2. Y_i0 = Y_ii;

3. the matrix Y is positive semidefinite;

4. if a^T x ≥ b was a linear constraint of the original LP, then we add the following linear constraints on the Y_ij (for all i):

   (a^T x) x_i ≥ b x_i
   (a^T x)(1 - x_i) ≥ b (1 - x_i)

(each left-hand side is quadratic in x and hence linear in the Y_ij).
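To make the lift concrete, here is a minimal sketch of one round of this construction for Vertex Cover, written in Python with the cvxpy modeling library (cvxpy and the 4-cycle instance are assumptions of this sketch, not part of the notes); Y is the (n+1)×(n+1) moment matrix, with index 0 playing the role of x_0 = 1.

```python
import cvxpy as cp

# Hypothetical instance: Vertex Cover on a 4-cycle.
n = 4
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]

# Lifted variable: Y[0, 0] = 1 and Y[0, i+1] plays the role of x_i.
Y = cp.Variable((n + 1, n + 1), symmetric=True)
cons = [Y >> 0, Y[0, 0] == 1]                  # constraint 3, and x_0 = 1
cons += [Y[0, i + 1] == Y[i + 1, i + 1]        # constraint 2: Y_i0 = Y_ii
         for i in range(n)]
# Linearized edge constraints (1 - x_i)(1 - x_j) = 0:
cons += [1 - Y[0, i + 1] - Y[0, j + 1] + Y[i + 1, j + 1] == 0
         for (i, j) in edges]
# Projected objective: min sum_i x_i = sum_i Y_0i.
prob = cp.Problem(cp.Minimize(cp.sum(Y[0, 1:])), cons)
prob.solve()
print(prob.value)
```

Constraint 1 (symmetry) is enforced by declaring Y symmetric. The box constraints 0 ≤ x_i ≤ 1 need not be added explicitly: the 2×2 principal minors of a PSD matrix Y with Y_00 = 1 and Y_i0 = Y_ii already force 0 ≤ Y_i0 ≤ 1.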

Fact: All the SDPs we have seen (or that have been useful) can be derived by this process from the obvious linear relaxations (the process may involve several rounds of Lift and Project).

The following section, borrowed from the paper "Towards strong nonapproximability results in the Lovász-Schrijver hierarchy" by Mikhail Alekhnovich, Sanjeev Arora and Iannis Tourlakis, presents the details.

2 LS+ derivation of popular SDP relaxations

To illustrate the power of the LS+ procedure, we sketch how to use a few rounds of LS+ to derive popular SDP relaxations used in famous approximation algorithms. (This was suggested by the reviewers, who pointed out that this is not very well-known.)

It will be more convenient to view LS+ as a method for generating new inequalities. Given any relaxation

   a_r^T x ≥ b_r,   r = 1, 2, ..., m,                                    (1)

(where the trivial constraints 0 ≤ x_i ≤ 1 are assumed to be included), one round of LS+ produces a system of inequalities in the (n+1)^2 variables Y_ij, i, j = 0, 1, ..., n. As mentioned, the intended meaning is that Y_ij = x_i x_j, Y_00 = 1, and Y_i0 = x_i = x_i x_0, so every quadratic expression in the x_i's can be viewed as a linear expression in the Y_ij's. This is how the quadratic inequalities below should be interpreted.

   (1 - x_i) a_r^T x ≥ (1 - x_i) b_r    i = 1, ..., n,  r = 1, ..., m
   x_i a_r^T x ≥ x_i b_r                i = 1, ..., n,  r = 1, ..., m
   x_i x_i = x_i x_0                    i = 1, 2, ..., n

(The last constraint corresponds to the fact that x_i^2 = x_i for 0/1 variables.) Finally, one imposes the condition that (Y_ij) is positive semidefinite. Obviously, any positive combination of the above inequalities is also implied, and the derivations below will use this fact.

2.1 Deriving the GW relaxation

The Goemans-Williamson relaxation for MAX-CUT [GW 94] involves finding unit vectors u_1, u_2, ..., u_n so as to maximize

   (1/4) Σ_{{i,j}∈E} ||u_i - u_j||^2.

This SDP relaxation can be derived by one round of LS+ on the trivial linear relaxation. That relaxation has 0/1 variables x_i and d_ij. In the integer solution, x_i indicates which side of the cut vertex i is on, and d_ij is 1 iff i, j are on opposite sides of the cut.

   max Σ_{{i,j}∈E} d_ij                                                  (2)
   d_ij ≥ x_i - x_j          i, j = 1, 2, ..., n                         (3)
   d_ij ≤ x_i + x_j          i, j = 1, 2, ..., n                         (4)
   d_ij ≤ 2 - (x_i + x_j)    i, j = 1, 2, ..., n                         (5)

Then one round of LS+ generates the following inequalities on d_ij:

   x_i d_ij ≥ x_i (x_i - x_j)                                            (6)
   (1 - x_i) d_ij ≥ (1 - x_i)(x_j - x_i).                                (7)

Adding these and simplifying using the fact that x_i^2 = x_i for 0/1 variables, one obtains d_ij ≥ (x_i - x_j)^2. Similarly one can obtain d_ij ≤ (x_i - x_j)^2, whereby it follows that d_ij = (x_i - x_j)^2 = Y_ii + Y_jj - 2Y_ij. Now if (Y_ij) is any feasible solution, then its Cholesky decomposition gives vectors v_0, v_1, ..., v_n ∈ R^{n+1} such that Y_ij = ⟨v_i, v_j⟩. Then d_ij = ||v_i - v_j||^2. Now define the vectors u_1, u_2, ..., u_n by u_i = v_0 - 2v_i. These satisfy

   d_ij = (1/4) ||u_i - u_j||^2                                          (8)
   ||u_i||^2 = ||v_0||^2 - 4⟨v_0, v_i⟩ + 4||v_i||^2 = 1.                 (9)

Thus the u_i's are a feasible solution to the GW relaxation. We conclude that one round of LS+ produces a relaxation at least as tight as the GW relaxation (and in fact one can show that the two relaxations are the same).
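As a sanity check on this vector construction, the following numpy sketch (my own illustration; the feasible point Y = c c^T with c = (1, x) built from a made-up integer cut is an assumption) factors Y and verifies equations (8) and (9).

```python
import numpy as np

# Feasible point of the lifted relaxation built from an integer cut:
# Y = c c^T with c = (1, x) satisfies Y_00 = 1 and Y_i0 = Y_ii = x_i.
x = np.array([1, 0, 1, 0])
c = np.concatenate(([1.0], x))
Y = np.outer(c, c)

# Factor Y = V^T V; an eigendecomposition handles the singular PSD case
# where np.linalg.cholesky would fail.
lam, Q = np.linalg.eigh(Y)
V = np.sqrt(np.clip(lam, 0, None))[:, None] * Q.T
v = V.T                              # v[i] is v_i, with <v_i, v_j> = Y_ij

u = v[0] - 2 * v[1:]                 # u_i = v_0 - 2 v_i
print(np.allclose((u ** 2).sum(axis=1), 1))        # (9): every ||u_i||^2 = 1
i, j = 0, 1
d_ij = Y[1 + i, 1 + i] + Y[1 + j, 1 + j] - 2 * Y[1 + i, 1 + j]
print(np.isclose(d_ij, ((u[i] - u[j]) ** 2).sum() / 4))   # (8) holds
```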

2.2 Deriving the ARV relaxation

Arora, Rao, and Vazirani [ARV 04] derive their O(√(log n))-approximation for sparsest cut using a similar SDP relaxation whose salient feature is the triangle inequality:

   ||u_i - u_j||^2 + ||u_j - u_k||^2 ≥ ||u_i - u_k||^2   for all i, j, k.

(In other words, d_ij = ||u_i - u_j||^2 forms a metric space.) This relaxation minus the triangle inequality is derived similarly to the GW relaxation above (details omitted). The claim is that the triangle inequality is implied after three rounds of LS+. As shown in [LS 91], r rounds of LS+ imply all inequalities on subsets of size r that are true for the integer solutions. In other words, the induced solution on subsets of size r lies in the convex hull of integer solutions. Thus after three rounds, the d_ij variables restricted to sets of size three lie in the cut cone. Since the cut cone is just the set of ℓ1 (pseudo)metrics, it follows that the d_ij variables form a (pseudo)metric. Thus three rounds of LS+ give a relaxation that is at least as strong as the ARV relaxation.
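The "true for the integer solutions on triples" step can be verified directly; this short Python loop (my own illustration) checks that every 0/1 assignment on three vertices makes d_ab = (x_a - x_b)^2 satisfy the triangle inequality.

```python
from itertools import product

# On a triple, the integer values d_ab = (x_a - x_b)^2 form a cut
# (pseudo)metric, so the triangle inequality holds; three rounds of LS+
# transfer exactly these size-3 facts to the relaxation.
ok = all(
    (x[a] - x[b]) ** 2 + (x[b] - x[c]) ** 2 >= (x[a] - x[c]) ** 2
    for x in product((0, 1), repeat=3)
    for a, b, c in product(range(3), repeat=3)
)
print(ok)  # True
```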

3 Sherali-Adams Relaxations

Now let us generalize the Lift and Project method to k-ary programming [Sherali-Adams 91]. For every subset S of indices of size at most k, we introduce a new variable Y_S. The intended value of Y_S is

   Y_S = Π_{i∈S} x_i.

First we require that for every set S (s.t. |S| ≤ k - 2), the matrix (Y_{S∪{i}∪{j}})_{ij} is positive semidefinite. Then we add another family of generic constraints: for all S, T such that S ∩ T = ∅ and |S| + |T| ≤ k,

   Π_{i∈S} (1 - x_i) · Π_{j∈T} x_j ≥ 0

(expanded into a linear inequality in the Y variables). As often happens in theoretical CS, the case k > 2 is much more complicated than the case k = 2.

4 Discrepancy

4.1 Two Party Case

We are now interested in applying the Lift and Project method to analyze the discrepancy. Recall the definition of the discrepancy for two parties.

Definition 1 (2 party case) The discrepancy of a matrix M is

   ||M||_C = max_{x_1,...,x_n ∈ {0,1}, y_1,...,y_n ∈ {0,1}} |Σ_{i,j} M_ij x_i y_j|.

We saw in the last lecture that the cut norm ||M||_C is well approximated by the ∞→1 norm:

   ||M||_{∞→1} = max_{x_1,...,x_n ∈ {-1,1}, y_1,...,y_n ∈ {-1,1}} Σ_{i,j} M_ij x_i y_j.

Alon and Naor showed that this norm is approximated within a constant factor by the following SDP relaxation:

   max_{u_1,...,u_n ∈ S^{2n-1}, v_1,...,v_n ∈ S^{2n-1}} Σ_{i,j} M_ij ⟨u_i, v_j⟩.
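Both norms can be computed by brute force on tiny instances; the numpy sketch below (my own, with made-up random 3×3 data) checks the standard relationship ||M||_C ≤ ||M||_{∞→1} ≤ 4 ||M||_C.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
n = 3
M = rng.standard_normal((n, n))

def bilinear_max(dom):
    # max over x, y in dom^n of |x^T M y|
    return max(abs(np.array(x) @ M @ np.array(y))
               for x in product(dom, repeat=n)
               for y in product(dom, repeat=n))

cut = bilinear_max((0, 1))      # ||M||_C
inf1 = bilinear_max((-1, 1))    # ||M||_{infty -> 1}
print(cut <= inf1 <= 4 * cut)   # True
```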

4.2 Multiparty Case

We are interested in extending this method to multiparty communication complexity. Recall that in the multiparty case we have a multidimensional matrix M_{ijk} (i.e. a covariant tensor); our objective is to maximize the sum of the entries in a set S over all sets S that are cylinder intersections.

Definition 2 (3 party case) In the 3 party case, each cylinder intersection is determined by its projections on the planes O_ij, O_ik, and O_jk. Let x_ij, y_ik, and z_jk be the indicator functions of these projections. Then

   cylinder intersection = {(i,j,k) : x_ij = y_ik = z_jk = 1} = {(i,j,k) : 1 - x_ij y_ik z_jk = 0}.

So the discrepancy is equal to

   Disc(M) = max_{x_ij ∈ {0,1}, y_ik ∈ {0,1}, z_jk ∈ {0,1}} Σ_{i,j,k} M_{i,j,k} x_ij y_ik z_jk.

Figure 1: We maximize Σ_{i,j,k} M_{i,j,k} x_ij y_ik z_jk over all cylinder intersections.

We now represent this optimization problem as a Sherali-Adams relaxation. Denote the optimum value of this relaxation by SDP_opt(M). Clearly, SDP_opt(M) ≥ Disc(M). How well does SDP_opt(M) approximate Disc(M)?

Question 1: Is SDP_opt(M) at most a small factor times Disc(M)?

Question 2: Show for some natural function (e.g. clique) that SDP_opt is small.

Guess: Perhaps the answer to the first question is No.

How might one solve these problems? The natural way is to use duality.

5 SDP Duality

A general semidefinite program is an optimization problem of the following form:

   min  C • X
   s.t. A_1 • X = b_1
        A_2 • X = b_2
        ...
        A_m • X = b_m
        X ⪰ 0

where C, A_1, ..., A_m are n×n symmetric matrices, and the inner product A • B is defined as

   A • B := Σ_{i,j} A_ij B_ij = tr(AB).

Note that this program is a convex program, since the set of all positive semidefinite matrices forms a convex cone. This follows from the following two observations:

- If A is positive semidefinite, then so is λA for every λ > 0.
- If A and B are positive semidefinite matrices, then so is A + B. Indeed, for every vector v, we have v^T (A + B) v = v^T A v + v^T B v ≥ 0 + 0.
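As a concrete instance of this standard form, here is a small cvxpy sketch (my own, with made-up data) using a single constraint tr(X) = 1; for this choice the optimum is exactly the smallest eigenvalue of C.

```python
import cvxpy as cp
import numpy as np

# min C.X  s.t.  A_1.X = b_1, X >= 0, where A.X = tr(AX) = cp.trace(A @ X).
n = 3
rng = np.random.default_rng(1)
C = rng.standard_normal((n, n))
C = (C + C.T) / 2                 # the data matrices are symmetric

X = cp.Variable((n, n), symmetric=True)
prob = cp.Problem(cp.Minimize(cp.trace(C @ X)),
                  [cp.trace(np.eye(n) @ X) == 1.0,   # A_1 = I, b_1 = 1
                   X >> 0])
prob.solve()
print(prob.value, np.linalg.eigvalsh(C)[0])   # the two values agree
```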

Theorem 1 (Fact at the root of duality) A matrix Y is positive semidefinite iff A • Y ≥ 0 for every positive semidefinite matrix A.

Proof: First, assume Y ⪰ 0. Let A be an arbitrary positive semidefinite matrix. Consider Cholesky decompositions of the matrices A and Y: A = V^T V and Y = Z^T Z. Then

   A • Y = tr(AY) = tr(V^T V Z^T Z) = tr(Z V^T V Z^T) = tr((Z V^T)(Z V^T)^T) ≥ 0,

since the last quantity is the squared Frobenius norm of Z V^T. Note that if A is a positive definite matrix and Y ⪰ 0, Y ≠ 0, then A • Y > 0: since A is nonsingular, V is also nonsingular, therefore Z V^T ≠ 0, which implies A • Y > 0.

The other direction is trivial: for every vector v,

   v^T Y v = Y • (v v^T) ≥ 0,

because v v^T is positive semidefinite. Therefore Y is positive semidefinite. □

Theorem 2 The following system is infeasible iff there exist x_1, ..., x_m ∈ R such that Σ_i x_i A_i ≻ 0 (positive definite):

   A_1 • X = 0
   A_2 • X = 0
   ...
   A_m • X = 0
   X ⪰ 0, X ≠ 0

Proof: Suppose there are no x_1, ..., x_m ∈ R such that Σ_i x_i A_i ≻ 0. This means that the linear subspace {Σ_i x_i A_i} is disjoint from the cone of positive definite matrices (which equals the interior of the cone of positive semidefinite matrices). By the Farkas lemma, there exists a separating hyperplane that contains this linear subspace and such that the semidefinite cone lies on one side of the hyperplane. Let Y ≠ 0 be the normal to this hyperplane. Then Y • A_i = 0 for every i, and Y • A ≥ 0 for every positive semidefinite A, so by Theorem 1, Y ⪰ 0. In other words, Y is a feasible solution of the system. We get a contradiction.

Figure 2: The semidefinite cone is disjoint from the linear subspace {Σ_i x_i A_i}.

The proof of the other direction is straightforward. Assume that there exist x_1, ..., x_m for which Σ_i x_i A_i ≻ 0, but the system has a feasible solution X. We have

   (Σ_i x_i A_i) • X = Σ_i x_i (A_i • X) = Σ_i x_i · 0 = 0;

on the other hand,

   (Σ_i x_i A_i) • X > 0,

since Σ_i x_i A_i is positive definite and X ⪰ 0, X ≠ 0 (by the note in the proof of Theorem 1). We get a contradiction. This concludes the proof. □
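The trace identity used in the proof of Theorem 1 (and implicitly again here) is easy to check numerically; the numpy snippet below (my own illustration, with random factors) confirms that A • Y = tr(AY) = ||Z V^T||_F^2 ≥ 0.

```python
import numpy as np

# For A = V^T V and Y = Z^T Z (generic PSD matrices), the cyclic property
# of the trace gives tr(AY) = tr((Z V^T)(Z V^T)^T) = ||Z V^T||_F^2 >= 0.
rng = np.random.default_rng(2)
n = 4
V = rng.standard_normal((n, n))
Z = rng.standard_normal((n, n))
A, Y = V.T @ V, Z.T @ Z

lhs = np.trace(A @ Y)
rhs = np.linalg.norm(Z @ V.T, 'fro') ** 2
print(np.isclose(lhs, rhs), lhs >= 0)   # True True
```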