MFO seminar on Semidefinite Programming May 30, 2010
Typical combinatorial optimization problem: $\max c^T x$ s.t. $Ax \le b$, $x \in \{0,1\}^n$.
$P := \{x \in \mathbb{R}^n : Ax \le b\}$: the LP relaxation.
$P_I := \mathrm{conv}(P \cap \{0,1\}^n)$: the integral polytope to be found.
Goal: Construct a new relaxation $P'$ such that $P_I \subseteq P' \subseteq P$, leading to $P_I$ after finitely many iterations.
Gomory-Chvátal closure: $P' = \{x : u^T A x \le \lfloor u^T b \rfloor \ \forall u \ge 0$ with $u^T A$ integer$\}$.
$P_I$ is found after $O(n^2 \log n)$ iterations if $P \subseteq [0,1]^n$. [Eisenbrand-Schulz 1999]
But optimization over $P'$ is hard! [Eisenbrand 1999]
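As a toy illustration of one Gomory-Chvátal round (a hypothetical two-variable example, not from the lecture; the helper name is mine), the multiplier $u = 1/2$ already produces the facet $x_1 + x_2 \le 1$ of $P_I$:

```python
import numpy as np

# Toy polytope P = {x in [0,1]^2 : 2*x1 + 2*x2 <= 3} (hypothetical example).
# Its integer hull P_I is described by x1 + x2 <= 1 together with the box.
A = np.array([[2.0, 2.0]])
b = np.array([3.0])

def chvatal_gomory_cut(A, b, u):
    """One Gomory-Chvatal cut: for u >= 0 with u^T A integral,
    u^T A x <= floor(u^T b) is valid for all integer points of P."""
    assert np.all(u >= 0)
    lhs = u @ A                      # must be an integer vector
    assert np.allclose(lhs, np.round(lhs))
    return lhs, np.floor(u @ b)

# u = 1/2 rounds 2*x1 + 2*x2 <= 3 down to x1 + x2 <= 1 in a single step.
lhs, rhs = chvatal_gomory_cut(A, b, np.array([0.5]))
print(lhs, rhs)  # [1. 1.] 1.0
```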
Plan of the lecture
Goal: We present several techniques to construct a hierarchy of LP/SDP relaxations: $P \supseteq P_1 \supseteq \dots \supseteq P_n = P_I$.
- Lovász-Schrijver $N$ / $N_+$ operators (LP / SDP)
- Sherali-Adams construction (LP)
- Lasserre construction (SDP)
Great interest recently in such hierarchies:
- Polyhedral combinatorics: How many rounds are needed to find $P_I$? Which valid inequalities are satisfied after $t$ rounds?
- Complexity theory: What is the integrality gap after $t$ rounds? Link to hardness of the problem?
- Proof systems: Use hierarchies as a model to generate inequalities and show e.g. $P_I = \emptyset$.
Lift-and-project strategy
1. Generate new constraints: Multiply the system $Ax \le b$ by products of the constraints $x_i \ge 0$ and $1 - x_i \ge 0$. → Polynomial system in $x$.
2. Linearize (and lift) by introducing new variables $y_I$ for the products $\prod_{i \in I} x_i$ and setting $x_i^2 = x_i$. → Linear system in $(x, y)$.
3. Project back on the $x$-variable space. → LP relaxation $P'$ satisfying $P_I \subseteq P' \subseteq P$.
Some notation
Write $Ax \le b$ as $a_l^T x \le b_l$ ($l = 1,\dots,m$), or as $g_l^T \binom{1}{x} \ge 0$ ($l = 1,\dots,m$), setting $g_l = \binom{b_l}{-a_l}$.
Homogenization of $P$:
$\tilde P = \left\{\lambda \binom{1}{x} : \lambda \ge 0,\ x \in P\right\} = \{y \in \mathbb{R}^{n+1} : g_l^T y \ge 0 \ (l = 1,\dots,m)\}$.
$V = \{1,\dots,n\}$.
The Lovász-Schrijver construction
1. Multiply $Ax \le b$ by $x_i$ and $1 - x_i$ for $i \in V$. Quadratic system: $g_l^T \binom{1}{x} x_i \ge 0$, $g_l^T \binom{1}{x} (1 - x_i) \ge 0$.
2. Linearize: Introduce the matrix variable $Y = \binom{1}{x}\binom{1}{x}^T$, indexed by $\{0\} \cup V$. Then $Y$ belongs to
$M(P) = \{Y \in \mathcal{S}^{n+1} : Y_{0i} = Y_{ii},\ Ye_i \in \tilde P,\ Y(e_0 - e_i) \in \tilde P \ \forall i\}$,
$M_+(P) = M(P) \cap \mathcal{S}^{n+1}_+$.
3. Project:
$N(P) = \{x \in \mathbb{R}^V : \exists Y \in M(P)$ s.t. $\binom{1}{x} = Ye_0\}$,
$N_+(P) = \{x \in \mathbb{R}^V : \exists Y \in M_+(P)$ s.t. $\binom{1}{x} = Ye_0\}$.
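The membership conditions defining $M_+(P)$ can be checked mechanically. The following numpy sketch (my own toy polytope $P = \{x \ge 0 : x_1 + x_2 \le 1\}$ and helper name `in_M_plus`, not from the lecture) verifies them for the rank-one matrix built from an integral point of $P$:

```python
import numpy as np

# Toy polytope P = {x >= 0 : x1 + x2 <= 1}; the rows of G describe the
# homogenized cone P~ = {y : G y >= 0} in the variables (y0, y1, y2).
G = np.array([[0.0,  1.0,  0.0],    # x1 >= 0
              [0.0,  0.0,  1.0],    # x2 >= 0
              [1.0, -1.0, -1.0]])   # 1 - x1 - x2 >= 0

def in_M_plus(Y, G, tol=1e-9):
    """Lovasz-Schrijver conditions: Y_{0i} = Y_{ii}, the columns
    Y e_i and Y (e_0 - e_i) lie in P~, and Y is PSD."""
    n = Y.shape[0] - 1
    e = np.eye(n + 1)
    for i in range(1, n + 1):
        if abs(Y[0, i] - Y[i, i]) > tol:
            return False
        if np.any(G @ (Y @ e[i]) < -tol):
            return False
        if np.any(G @ (Y @ (e[0] - e[i])) < -tol):
            return False
    return bool(np.all(np.linalg.eigvalsh(Y) >= -tol))

# For an integral point x of P, Y = (1,x)(1,x)^T always lies in M_+(P).
x = np.array([1.0, 0.0])
y = np.concatenate(([1.0], x))
print(in_M_plus(np.outer(y, y), G))  # True
```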
Properties of the N- and N_+-operators
- $P_I \subseteq N_+(P) \subseteq N(P) \subseteq P$.
- $N(P) \subseteq \mathrm{conv}(P \cap \{x : x_i = 0, 1\})$ for all $i \in V$.
- $N^n(P) = P_I$.
- Assume one can optimize in polynomial time over $P$. Then the same holds for $N^t(P)$ and for $N_+^t(P)$, for any fixed $t$.
Example: Consider the $\ell_1$-ball centered at $e/2$:
$P = \left\{x \in \mathbb{R}^V : \sum_{i \in I} x_i + \sum_{i \in V \setminus I} (1 - x_i) \ge \tfrac{1}{2} \ \forall I \subseteq V\right\}$.
Then: $P_I = \emptyset$, but $\tfrac{1}{2} e \in N_+^{n-1}(P)$.
→ $n$ iterations of the $N_+$ operator are needed to find $P_I$.
Application to stable sets [Lovász-Schrijver 1991]
$P = \mathrm{FRAC}(G) = \{x \in \mathbb{R}^V_+ : x_i + x_j \le 1 \ (ij \in E)\}$,
$P_I = \mathrm{STAB}(G)$: the stable set polytope of $G = (V,E)$.
- $N(\mathrm{FRAC}(G)) = \mathrm{FRAC}(G)$ intersected with the odd-circuit constraints $\sum_{i \in V(C)} x_i \le \frac{|C|-1}{2}$ for all odd circuits $C$.
- $Y \in M(\mathrm{FRAC}(G)) \Rightarrow y_{ij} = 0$ for edges $ij \in E$. → $N_+(\mathrm{FRAC}(G)) \subseteq \mathrm{TH}(G)$.
- Any clique inequality $\sum_{i \in Q} x_i \le 1$ is valid for $N_+(P)$, while its $N$-rank is $|Q| - 2$. The $N_+$ operator helps!
- $\frac{n}{\alpha(G)} - 2 \le N\text{-rank} \le n - \alpha(G) - 1$.
- $N_+$-rank $\le \alpha(G)$ [with equality if $G$ = line graph of $K_{2p+1}$].
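The odd-circuit cut can be seen at work numerically. The sketch below (my own example, assuming scipy is available) maximizes $e^T x$ over $\mathrm{FRAC}(C_5)$ and recovers the fractional optimum $5/2$, against $\alpha(C_5) = 2$:

```python
import numpy as np
from scipy.optimize import linprog

# FRAC(C5): x_i + x_j <= 1 on the 5 edges of the 5-cycle, x >= 0.
edges = [(i, (i + 1) % 5) for i in range(5)]
A_ub = np.zeros((5, 5))
for r, (i, j) in enumerate(edges):
    A_ub[r, i] = A_ub[r, j] = 1.0

# Maximize e^T x, i.e. minimize -e^T x.
res = linprog(c=-np.ones(5), A_ub=A_ub, b_ub=np.ones(5), bounds=(0, None))
print(-res.fun)  # 2.5: the all-(1/2) point is optimal, while alpha(C5) = 2

# The odd-circuit inequality sum_i x_i <= (|C|-1)/2 = 2, added by one
# round of the N operator, cuts this fractional optimum down to 2.
```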
The Sherali-Adams construction
1. Multiply $Ax \le b$ by $\prod_{i \in I} x_i \prod_{j \in J} (1 - x_j)$ for all disjoint $I, J \subseteq V$ with $|I \cup J| = t$.
2. Linearize and lift: introduce new variables $y_U$ for all $U \in \mathcal{P}_t(V)$, setting $x_i^2 = x_i$ and $y_i = x_i$.
3. Project back on the $x$-variable space. → Relaxation $SA_t(P)$.
Then: $SA_1(P) = N(P)$ and $SA_t(P) \subseteq N(SA_{t-1}(P))$. Thus: $SA_t(P) \subseteq N^t(P)$.
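Steps 1-2 can be carried out symbolically on a tiny instance. The sketch below (my own dict-based representation and helper name `mul_var`) multiplies the single constraint of FRAC on one edge by $x_1$ and linearizes with $x_i^2 = x_i$, recovering the edge condition $y_{12} = 0$ from the stable set application:

```python
# Multilinear polynomials as {frozenset_of_variables: coefficient}.
def mul_var(p, i):
    """Multiply p by x_i, simplifying with x_i^2 = x_i
    (so monomials stay squarefree); zero coefficients are dropped."""
    out = {}
    for mono, c in p.items():
        m = mono | {i}
        out[m] = out.get(m, 0) + c
    return {m: c for m, c in out.items() if c}

# FRAC on a single edge {1,2}: the constraint g = 1 - x1 - x2 >= 0.
g = {frozenset(): 1, frozenset({1}): -1, frozenset({2}): -1}

# SA step 1-2: multiply by x1, then read the monomial {1,2} as y_12.
print(mul_var(g, 1))                    # {frozenset({1, 2}): -1}:  -y_12 >= 0
print(mul_var({frozenset({2}): 1}, 1))  # {frozenset({1, 2}): 1}:    y_12 >= 0
# Together: y_12 = 0, i.e. the edge condition.
```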
Application to the matching polytope
For $G = (V,E)$, let $P = \{x \in \mathbb{R}^E_+ : x(\delta(v)) \le 1 \ \forall v \in V\}$. Then $P_I$ is the matching polytope of $G$ (= the stable set polytope of the line graph of $G$).
For $G = K_{2p+1}$:
- $N_+$-rank $= p$ [Stephen-Tunçel 1999]
- $N$-rank $\in [2p, p^2]$ [LS 1991] [Goemans-Tunçel 2001]
- SA-rank $= 2p - 1$ [Mathieu-Sinclair 2009]
Detailed analysis of the integrality gap $g_t = \dfrac{\max_{x \in SA_t(P)} e^T x}{\max_{x \in P_I} e^T x} = \dfrac{\max_{x \in SA_t(P)} e^T x}{p}$:
$g_t = 1 + \frac{1}{2p}$ if $t \le p - 1$, and $g_t = 1$ if $t \ge 2p - 1$, with a phase transition at $t = 2p - \Theta(\sqrt{p})$.
A canonical lifting lemma
$x \in \{0,1\}^n \mapsto y_x = \left(\prod_{i \in I} x_i\right)_{I \subseteq V} \in \{0,1\}^{\mathcal{P}(V)} = (1, x_1, \dots, x_n, x_1 x_2, \dots, x_{n-1} x_n, \dots, \prod_{i \in V} x_i)$.
$Z$: the matrix with columns $y_x$ for $x \in \{0,1\}^n$. Equivalently, $Z$ is the 0/1 matrix indexed by $\mathcal{P}(V)$ with $Z(I,J) = 1$ if $I \subseteq J$, 0 else; then $Z^{-1}(I,J) = (-1)^{|J \setminus I|}$ if $I \subseteq J$, 0 else.
If $x \in P \cap \{0,1\}^n$, then $Y = y_x (y_x)^T$ satisfies:
- $Y \succeq 0$;
- $Y_l = g_l(x) Y \succeq 0$ (localizing matrix);
- $Y(I,J)$ depends only on $I \cup J$ (moment matrix).
→ For $y \in \mathbb{R}^{\mathcal{P}(V)}$: $Y = M_V(y) = (y_{I \cup J})_{I,J \subseteq V}$ and $Y_l = M_V(g_l \ast y)$.
Lemma: $P_I$ is equal to the projection on the $x$-variable space of
$\{y \in \mathbb{R}^{\mathcal{P}(V)} : y_0 = 1,\ M_V(y) \succeq 0,\ M_V(g_l \ast y) \succeq 0 \ \forall l\}$.
Sketch of proof:
1. Verify that $M_V(y) = Z \,\mathrm{diag}(Z^{-1} y)\, Z^T$.
2. $M_V(y) \succeq 0 \Rightarrow \lambda := Z^{-1} y \ge 0 \Rightarrow y = Z\lambda = \sum_{x \in \{0,1\}^n} \lambda_x y_x$, where $\sum_x \lambda_x = y_0 = 1$.
3. Use $M_V(g_l \ast y) \succeq 0$ to show that $\lambda_x > 0 \Rightarrow x \in P$ ($\Rightarrow x \in P_I$).
→ Each 0/1 polytope is the projection of a simplex.
Case $n = 2$ (sets ordered $\emptyset, 1, 2, 12$):
$Z = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix}$, $\quad Z^{-1} = \begin{pmatrix} 1 & -1 & -1 & 1 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 1 \end{pmatrix}$,
$M_V(y) = \begin{pmatrix} y_0 & y_1 & y_2 & y_{12} \\ y_1 & y_1 & y_{12} & y_{12} \\ y_2 & y_{12} & y_2 & y_{12} \\ y_{12} & y_{12} & y_{12} & y_{12} \end{pmatrix}$,
and $M_V(y) \succeq 0 \iff Z^{-1} y \ge 0$, i.e., $y_0 - y_1 - y_2 + y_{12} \ge 0$, $y_1 - y_{12} \ge 0$, $y_2 - y_{12} \ge 0$, $y_{12} \ge 0$.
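The factorization in step 1 of the proof can be verified numerically. The sketch below (helper names and ordering mine) builds the zeta matrix $Z$ of the subset lattice for $n = 2$, checks the Möbius formula for $Z^{-1}$, and confirms $M_V(y) = Z\,\mathrm{diag}(Z^{-1}y)\,Z^T$ for a random $y$:

```python
import numpy as np
from itertools import combinations

n = 2
V = list(range(1, n + 1))
subsets = [frozenset(c) for t in range(n + 1) for c in combinations(V, t)]

# Zeta matrix of the subset lattice: Z(I, J) = 1 iff I is a subset of J;
# its columns are exactly the lifted vectors y_x for x in {0,1}^n.
Z = np.array([[1.0 if I <= J else 0.0 for J in subsets] for I in subsets])
# Moebius inversion: Z^{-1}(I, J) = (-1)^{|J \ I|} if I is a subset of J.
Zinv = np.array([[(-1.0) ** len(J - I) if I <= J else 0.0
                  for J in subsets] for I in subsets])
assert np.allclose(Z @ Zinv, np.eye(len(subsets)))

# Step 1 of the proof: M_V(y) = Z diag(Z^{-1} y) Z^T, for any y.
rng = np.random.default_rng(0)
y = rng.random(len(subsets))
ydict = dict(zip(subsets, y))
M = np.array([[ydict[I | J] for J in subsets] for I in subsets])
assert np.allclose(M, Z @ np.diag(Zinv @ y) @ Z.T)
print("M_V(y) = Z diag(Z^-1 y) Z^T verified for n =", n)
```

The same code works for any $n$ (at exponentially growing size), which is exactly why the truncated matrices of the next slide are needed.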
SDP hierarchies
Idea: Get SDP hierarchies by truncating $M_V(y)$ and $M_V(g_l \ast y)$: consider $M_U(y) = (y_{I \cup J})_{I,J \subseteq U}$, indexed by $\mathcal{P}(U)$ for $U \subseteq V$, or $M_t(y) = (y_{I \cup J})_{|I|,|J| \le t}$, indexed by $\mathcal{P}_t(V)$, for some $t \le n$.
1. (local) Get the Sherali-Adams relaxation $SA_t(P)$ when considering $M_U(y) \succeq 0$, $M_W(g_l \ast y) \succeq 0$ for all $U \in \mathcal{P}_t(V)$, $W \in \mathcal{P}_{t-1}(V)$. → LP with variables $y_I$ for all $I \in \mathcal{P}_t(V)$.
2. (global) Get the Lasserre relaxation $L_t(P)$ when considering $M_t(y) \succeq 0$, $M_{t-1}(g_l \ast y) \succeq 0$. → SDP with variables $y_I$ for all $I \in \mathcal{P}_{2t}(V)$.
Obviously: $L_t(P) \subseteq SA_t(P)$.
Link to the Lovász-Schrijver construction
$L_1(P) \subseteq P$, and $L_t(P) \subseteq N_+(L_{t-1}(P))$. Thus: $L_t(P) \subseteq N_+^{t-1}(P)$.
$L_t(P)$ is tighter, but more expensive to compute:
- the SDP for $L_t(P)$ involves one matrix of order $O(n^t)$ and $O(n^{2t})$ variables;
- the SDP for $N_+^{t-1}(P)$ involves $O(n^{t-1})$ matrices of order $n+1$ and $O(n^{t+1})$ variables.
Note: One can define a (block-diagonal) hierarchy, in-between and cheaper than both $L_t(P)$ and $N_+^{t-1}(P)$; roughly, unfold the recursive definition of the LS hierarchy and consider suitably defined principal submatrices of $M_t(y)$ (which can be block-diagonalized to blocks of order $n+1$). [Gvozdenović-Laurent-Vallentin 2009]
Application of the Lasserre construction to stable sets
The localizing conditions in $L_t(\mathrm{FRAC}(G))$ boil down to the edge conditions $y_{ij} = 0$ ($ij \in E$), for $t \ge 2$.
→ Natural generalization of the theta body $\mathrm{TH}(G)$; get the bound $\mathrm{las}^{(t)}(G)$.
Convergence in $\alpha(G)$ steps: $L_t(\mathrm{FRAC}(G)) = \mathrm{STAB}(G)$ for $t \ge \alpha(G)$.
Open: Do there exist graphs $G$ for which $\alpha(G)$ steps are needed?
Question: What is the Lasserre rank of the matching polytope?
Application of the Lasserre construction to Max-Cut
Max-Cut: $\max \sum_{ij \in E} w_{ij} \frac{1 - x_i x_j}{2}$ s.t. $x \in \{\pm 1\}^V$.
Consider $P = [-1,1]^V$, write $x_i^2 = 1$, and project onto the subspace $\mathbb{R}^{\binom{n}{2}}$ indexed by the edges.
The order-1 relaxation is the basic Goemans-Williamson relaxation: $\max \sum_{ij \in E} w_{ij} \frac{1 - X_{ij}}{2}$ s.t. $X \in \mathcal{S}^n_+$, $\mathrm{diag}(X) = e$.
The Lasserre rank of $\mathrm{CUT}(K_n)$ is at least $n/2$ [Laurent 2003] (the first time when $\sum_{ij \in E(K_n)} x_{ij} \ge -n/2$ becomes valid).
Question: Does equality hold? (Yes for $n \le 7$.)
The Lasserre relaxation of order 2 satisfies the triangle inequalities:
$Y = \begin{pmatrix} 1 & y_{12} & y_{13} & y_{23} \\ y_{12} & 1 & y_{23} & y_{13} \\ y_{13} & y_{23} & 1 & y_{12} \\ y_{23} & y_{13} & y_{12} & 1 \end{pmatrix} \succeq 0$ (rows and columns indexed by $\emptyset, 12, 13, 23$)
$\Rightarrow e^T Y e \ge 0 \Rightarrow y_{12} + y_{13} + y_{23} \ge -1$.
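Since $e^T Y e = 4 + 4(y_{12} + y_{13} + y_{23})$, a point violating the triangle inequality makes $Y$ indefinite. A small numeric check (my own helper name, using a violating point $y_{12} = y_{13} = y_{23} = -1/2$):

```python
import numpy as np

def lasserre2_matrix(y12, y13, y23):
    """Principal submatrix of M_2(y) for max-cut on 3 vertices,
    indexed by the sets (emptyset, 12, 13, 23), using x_i^2 = 1."""
    return np.array([[1.0, y12, y13, y23],
                     [y12, 1.0, y23, y13],
                     [y13, y23, 1.0, y12],
                     [y23, y13, y12, 1.0]])

# e^T Y e = 4 + 4*(y12 + y13 + y23), so Y >= 0 (PSD) forces the
# triangle inequality y12 + y13 + y23 >= -1.  Violate it:
Y = lasserre2_matrix(-0.5, -0.5, -0.5)   # sum = -1.5 < -1
print(np.linalg.eigvalsh(Y).min())       # negative (about -0.5): not PSD
```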
Some negative results about integrality gaps of hierarchies for max-cut
Consider the basic LP relaxation of max-cut defined by the triangle inequalities. Its integrality gap is $1/2$.
For the Lovász-Schrijver construction: the integrality gap remains $1/2 + \epsilon$ after $c_\epsilon n$ rounds of the $N$ operator. But the integrality gap is $0.878$ after one round of the $N_+$ operator. [Schoenebeck-Trevisan-Tulsiani 2006]
For the Sherali-Adams construction: the integrality gap remains $1/2 + \epsilon$ after $n^{\gamma_\epsilon}$ iterations. [Charikar-Makarychev-Makarychev 2009]
Some positive results
Chlamtac-Singh [2008] give (for the first time) an approximation algorithm whose approximation guarantee improves indefinitely as one uses higher-order relaxations in the SDP hierarchy, for the maximum independent set problem in a 3-uniform hypergraph $G$. Namely: given $\gamma > 0$, assuming $G$ contains an independent set of cardinality $\gamma n$, one can find an independent set of cardinality $n^{\Omega(\gamma^2)}$ using the relaxation of order $\Theta(1/\gamma^2)$.
Extensions to optimization over polynomials
Minimize $p(x)$ over $\{x : g_j(x) \ge 0\}$. Linearize $p = \sum_\alpha p_\alpha x^\alpha$ by $\sum_\alpha p_\alpha y_\alpha$. Impose SDP conditions on the moment matrix: $M_t(y) = (y_{\alpha+\beta}) \succeq 0$.
→ Hierarchy of SDP relaxations with asymptotic convergence (due to some SOS representation results).
- Exploit equations $h_j(x) = 0$: we saw how to exploit $x_i^2 = x_i$.
- The canonical lifting lemma extends to the finite variety case, i.e., when the equations $h_j = 0$ have finitely many roots.
- Finite convergence of the hierarchy when the equations $h_j = 0$ have finitely many real roots.
Another hierarchy construction via copositive programming
Reformulation: $\alpha(G) = \min \lambda$ s.t. $\lambda(I + A_G) - J \in C_n$, where $C_n = \{M \in \mathcal{S}^n : x^T M x \ge 0 \ \forall x \in \mathbb{R}^n_+\}$ is the copositive cone.
Idea [Parrilo 2000]: Replace $C_n$ by the subcones
$L^{(t)}_n = \{M \in \mathcal{S}^n : (x^T M x)\left(\sum_{i=1}^n x_i\right)^t$ has non-negative coefficients$\}$,
$K^{(t)}_n = \{M \in \mathcal{S}^n : \left(\sum_{i,j=1}^n M_{ij} x_i^2 x_j^2\right)\left(\sum_{i=1}^n x_i^2\right)^t$ is SOS$\}$,
with $L^{(t)}_n \subseteq K^{(t)}_n \subseteq C_n$.
[Pólya] If $M$ is strictly copositive, then $M \in L^{(t)}_n$ for some $t \ge 0$.
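Membership in $L^{(t)}_n$ is a finite coefficient check. The sketch below (function name and the $2 \times 2$ example matrix are my own, not from the lecture) expands $(x^T M x)(\sum_i x_i)^t$ monomial by monomial and illustrates Pólya's theorem on a strictly copositive matrix that enters the LP hierarchy only at $t = 4$:

```python
import numpy as np
from itertools import combinations_with_replacement
from math import factorial

def in_L_t(M, t):
    """Test membership in the cone L^(t)_n: all coefficients of
    (x^T M x) * (x_1 + ... + x_n)^t must be non-negative."""
    n = M.shape[0]

    def multinomial(beta):          # number of ways (sum x_i)^t yields x^beta
        r = factorial(sum(beta))
        for b in beta:
            r //= factorial(b)
        return r

    # Coefficient of x^beta, |beta| = t + 2: sum over M_ij of the ways
    # (sum x_i)^t contributes the remaining monomial x^(beta - e_i - e_j).
    for combo in combinations_with_replacement(range(n), t + 2):
        beta = [combo.count(i) for i in range(n)]
        coeff = 0.0
        for i in range(n):
            for j in range(n):
                g = list(beta)
                g[i] -= 1
                g[j] -= 1
                if min(g) >= 0:
                    coeff += M[i, j] * multinomial(g)
        if coeff < -1e-9:
            return False
    return True

# Strictly copositive example (min of x^T M x on the simplex is 0.2 > 0).
M = np.array([[1.0, -1.0], [-1.0, 2.0]])
print([in_L_t(M, t) for t in range(6)])
# [False, False, False, False, True, True]: M enters L^(t)_2 at t = 4,
# as Polya's theorem guarantees for some finite t.
```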
LP bound: $\nu^{(t)}(G) = \min \lambda$ s.t. $\lambda(I + A_G) - J \in L^{(t)}_n$.
SDP bound: $\vartheta^{(t)}(G) = \min \lambda$ s.t. $\lambda(I + A_G) - J \in K^{(t)}_n$.
- $\nu^{(t)}(G) < \infty \iff t \ge \alpha(G) - 1$.
- $\nu^{(t)}(G) = \alpha(G)$ if $t \ge \alpha(G)^2$.
- $\vartheta^{(0)}(G) = \vartheta'(G)$.
Conjecture [de Klerk-Pasechnik 2002]: $\vartheta^{(t)}(G) = \alpha(G)$ for $t \ge \alpha(G) - 1$.
Yes: for graphs with $\alpha(G) \le 8$ [Gvozdenović-Laurent 2007].
The Lasserre hierarchy refines the copositive hierarchy: $\mathrm{las}^{(t+1)}(G) \le \vartheta^{(t)}(G)$.
Note: The convergence in $\alpha(G)$ steps was easy for the Lasserre hierarchy!