U.C. Berkeley CS294: Beyond Worst-Case Analysis Handout 12 Luca Trevisan October 3, 2017

Scribed by Maxim Rabinovich

Lecture 12

In which we begin to prove that the SDP relaxation exactly recovers communities in the stochastic block model.

Our strategy in this lecture will be to argue that, under suitable conditions on the average internal and external degrees a and b in the SBM, the combinatorial solution achieves the optimum. In the next lecture, we will see that, by an SDP analogue of complementary slackness, we can actually guarantee that the combinatorial solution is the unique solution. We begin with a review of duality theory before diving into the main argument.

1 LP and SDP duality

Duality provides a method to upper bound the optimal value of maximization problems and to lower bound the optimal value of minimization problems. In the linear case, suppose we start with a maximization LP:

    max c^T x   s.t.  Ax ≤ b,  x ≥ 0,                                  (1)

where c ∈ R^n, b ∈ R^m, and A ∈ R^{m×n} is a matrix of constraint coefficients. If we wanted to certify that every feasible point for this LP satisfies a certain upper bound on c^T x, one natural strategy is to attempt to represent c as a non-negative linear combination of the rows a_j of A. Indeed, if we managed to write c = Σ_{j=1}^m y_j a_j with all y_j ≥ 0, we would get that any feasible point x had value

    c^T x = Σ_{j=1}^m y_j a_j^T x = y^T Ax ≤ y^T b = b^T y.

Due to the non-negativity constraint on x, however, we do not need to require that c be exactly such a combination of the rows of A. Instead, it suffices to look for coefficients y_j ≥ 0 such that c ≤ Σ_{j=1}^m y_j a_j = A^T y coordinate-wise. Any such y yields an upper bound of b^T y on the optimum. The problem of finding the best upper bound obtainable by this method is itself an LP, given by:

    min b^T y   s.t.  A^T y ≥ c,  y ≥ 0.                               (2)

In the language of duality theory, (1) is the primal program and (2) is the dual.
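As a quick sanity check, the primal/dual pair (1)-(2) can be solved numerically. The following minimal Python sketch does this with scipy.optimize.linprog on a small instance; the particular A, b, and c are arbitrary choices for illustration.

    # LP duality check. Primal: max c^T x s.t. Ax <= b, x >= 0.
    #                   Dual:   min b^T y s.t. A^T y >= c, y >= 0.
    import numpy as np
    from scipy.optimize import linprog

    A = np.array([[1.0, 2.0], [3.0, 1.0]])
    b = np.array([4.0, 6.0])
    c = np.array([3.0, 2.0])

    # linprog minimizes, so negate c; the default bounds already enforce x >= 0.
    primal = linprog(-c, A_ub=A, b_ub=b)
    # Rewrite A^T y >= c as -A^T y <= -c; default bounds enforce y >= 0.
    dual = linprog(b, A_ub=-A.T, b_ub=-c)

    print(-primal.fun, dual.fun)  # the two optimal values coincide

For feasible bounded LPs the two printed values agree (strong duality); the argument above only establishes the "≤" direction (weak duality), which is all we will need.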

The informal argument we sketched above can be formalized as the following chain of inequalities, which proves that the value of the dual (2) is an upper bound on the value of the primal (1). Indeed, if x ∈ R^n is primal feasible and y ∈ R^m is dual feasible, then

    c^T x ≤ (A^T y)^T x                                                (3)
          = y^T Ax                                                     (4)
          ≤ y^T b                                                      (5)
          = b^T y,                                                     (6)

and the claim follows.

A similar construction can be defined for SDPs. To see how this works, let • denote the matrix inner product on R^{n×n}; that is, M • M' := Σ_{i,j} M_ij M'_ij. An SDP can then be expressed as

    max C • X   s.t.  A_j • X ≤ b_j,  1 ≤ j ≤ m,  X ⪰ 0,

where the matrices A_j ∈ R^{n×n}, and the notation M ⪰ M' is understood as meaning that M − M' is a PSD matrix. We can attempt to port over the idea of the LP dual to this new setting as follows:

    min b^T y   s.t.  Σ_{j=1}^m y_j A_j ⪰ C,  y ≥ 0.                   (7)

We now prove that this construction does yield weak duality, in the sense that the optimum of (7) is an upper bound on the optimum of the primal SDP above. For this, again suppose X ∈ R^{n×n} is primal feasible and y ∈ R^m is dual feasible (that is, feasible for (7)). We then observe that

    b^T y = Σ_{j=1}^m y_j b_j                                          (8)
          ≥ Σ_{j=1}^m y_j (A_j • X)
          = [Σ_{j=1}^m y_j A_j] • X.

Now the only question is: how do we relate [Σ_{j=1}^m y_j A_j] • X to C • X, given that Σ_{j=1}^m y_j A_j ⪰ C? The following lemma answers this question for us.

Lemma 1  Suppose A, B ∈ R^{n×n} are PSD. Then A • B ≥ 0.

Proof: Recall that we may write A = Σ_{k=1}^n λ_k v_k v_k^T, where each λ_k ≥ 0 and the v_k form an orthonormal basis of R^n. By linearity of the matrix inner product, we then have

    A • B = Σ_{k=1}^n λ_k (v_k v_k^T • B).

But now we notice that

    v_k v_k^T • B = Σ_{i,j} v_{k,i} v_{k,j} B_ij = v_k^T B v_k ≥ 0.

Since each λ_k ≥ 0, we conclude that A • B ≥ 0, as required. □

Applying Lemma 1 to the PSD matrices Σ_{j=1}^m y_j A_j − C and X, and combining with the result of the chain of inequalities (8), we find

    b^T y ≥ [Σ_{j=1}^m y_j A_j] • X ≥ C • X.

Taking a minimum over the LHS and a maximum over the RHS yields the comparison of optima that we sought.

Duality is a powerful tool worth becoming more familiar with. To this end, a nice exercise is the following.

Exercise 1  Use SDP duality to prove that the value of the GW relaxation of MaxCut is at most |E| for any graph.
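The inequality of Lemma 1 is the key step in the weak duality argument above, and it is easy to probe numerically. Here is a minimal Python sketch (random instances of an arbitrary size) that generates PSD matrices and confirms their inner product is non-negative.

    # Empirical check of Lemma 1: A • B >= 0 whenever A and B are PSD.
    import numpy as np

    rng = np.random.default_rng(0)
    for _ in range(1000):
        G, H = rng.standard_normal((2, 5, 5))
        A = G @ G.T                    # PSD by construction
        B = H @ H.T                    # PSD by construction
        assert np.sum(A * B) >= 0      # the matrix inner product A • B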

2 Duality for the SBM

To see how duality can help us with the SBM, let us first rewrite the SDP relaxation of the minimum balanced cut problem in a way that looks more like the SDP we analyzed above. Our starting point is the original formulation

    min Σ_{{i,j}∈E} (1 − ⟨x_i, x_j⟩)/2
    s.t.  ||x_i||^2 = 1,  1 ≤ i ≤ n,                                   (9)
          Σ_i x_i = 0.

We aim to rewrite this problem in terms of a matrix variable X ∈ R^{n×n}. Intuitively, the correspondence between that variable and the given variables will be X_ij = ⟨x_i, x_j⟩.

Now, notice that the objective can be rewritten as

    Σ_{{i,j}∈E} (1 − ⟨x_i, x_j⟩)/2 = (1/4) Σ_{i,j} A_ij (1 − ⟨x_i, x_j⟩) = |E|/2 − (1/4)(A • X).

Thus, up to a translation and scaling that does not change the location of the optimal solution, we may replace minimization of the given objective by maximization of A • X, as required for an SDP. We then note that ||x_i||^2 = ⟨x_i, x_i⟩ = X_ii, so the norm constraints can be replaced by constraints of the form E_ii • X = 1, where E_ii is the matrix with a one in cell (i, i) and zeros everywhere else. Finally, we notice that

    ||Σ_i x_i||^2 = Σ_{i,j} ⟨x_i, x_j⟩ = J • X,

where J is the all-ones matrix, so the balance constraint becomes J • X = 0.

We have thus found our way to the following formulation of the SDP relaxation:

    max A • X
    s.t.  E_ii • X = 1,  1 ≤ i ≤ n,                                    (10)
          J • X = 0,
          X ⪰ 0.

We note that, in the notation of the general SDP formulation of Section 1, we can let the constraint index run from 0 to n and set A_0 = J and A_j = E_jj for 1 ≤ j ≤ n. The constraint vector is then b = (0, 1, …, 1) ∈ R^{n+1}.

We are now in a position to write down the dual. For convenience in what follows, we shall view the dual variable as (y_0, y) ∈ R^{n+1}, so the notation will be slightly different from the dual formulation (7); also, since the constraints of (10) are equalities, the dual variables are not constrained in sign. In the modified notation, the dual is

    min Σ_i y_i   s.t.  diag(y) + y_0 J ⪰ A,                           (11)

where, for any vector v ∈ R^n, diag(v) denotes the corresponding diagonal matrix whose entry at (i, i) is v_i. (Note that y_0 does not appear in the objective, since b_0 = 0.)
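For experimentation, the relaxation (10) is a few lines of CVXPY. The sketch below is illustrative only; balanced_cut_sdp is a name chosen here, and it assumes A is a symmetric 0/1 adjacency matrix.

    # Solve the balanced-cut SDP relaxation (10) with CVXPY.
    import cvxpy as cp
    import numpy as np

    def balanced_cut_sdp(A):
        n = A.shape[0]
        X = cp.Variable((n, n), PSD=True)     # X ⪰ 0
        constraints = [cp.diag(X) == 1,       # E_ii • X = 1 for all i
                       cp.sum(X) == 0]        # J • X = 0
        prob = cp.Problem(cp.Maximize(cp.trace(A @ X)), constraints)
        prob.solve()                          # uses the installed SDP solver
        return X.value, prob.value

On an SBM instance in the regime of Theorem 2 below, the returned X should, with high probability, be exactly χχ^T.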

3 Exact Reconstruction in the SBM, Part I

Our aim is to prove the following theorem. As usual, a = pn is the average internal degree and b = qn is the average external degree.

Theorem 2  There exists a universal constant c > 0 such that, whenever a − b > c √(log n · (a + b)), the solution of the SDP relaxation (10) is given by χχ^T. In particular, solving the SDP relaxation yields exact recovery in the SBM in this regime.

3.1 A candidate dual certificate

The main idea behind our proof is that we already know a primal feasible solution to the SDP relaxation (10). Indeed, we can just set

    X_ij = 1 if i and j are in the same community, and X_ij = −1 otherwise.    (12)

Clearly X_ii = 1 for all 1 ≤ i ≤ n, and J • X = 0 because the communities have the same size. Meanwhile, X = χχ^T, where χ ∈ {±1}^n is the indicator of the cut, so X is also PSD. Thus, it is a feasible point for the primal SDP (10), and it corresponds to the optimum of the unrelaxed combinatorial problem.

Our aim is to show that the feasible solution (12) is actually the unique optimal solution. The first step toward this goal is to show that it is in fact optimal, and we shall do this by exhibiting a dual solution whose dual objective value is equal to the primal value of the combinatorial solution.

Notice that the value of the combinatorial solution in the primal is given by

    A • X = Σ_i [ Σ_{j in same community as i} A_ij − Σ_{j in other community} A_ij ] = Σ_i (a_i − b_i),

where a_i is the within-community (or internal) degree of i and b_i is the cross-community (or external) degree of i. A candidate for a dual solution with the same objective value is thus given by taking

    y_i = a_i − b_i,  1 ≤ i ≤ n,   and   y_0 = (a + b)/(2n),           (13)

where the latter only matters for feasibility and not for the objective value. We stress that this vector is actually a random variable, since the a_i and b_i are random quantities.

It is clear by inspection that (13) specifies dual variables that achieve the value of the combinatorial solution (12). The only question is whether this specification of the variables yields a dual feasible point. The main thrust of the proof is thus to show that, with high probability, the proposed dual solution is indeed feasible and therefore optimal: its value is attained by a primal feasible point, so by weak duality both must be optimal.
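The bookkeeping behind (13) can be checked in a few lines. In the sketch below the community sizes and edge probabilities are arbitrary illustrative choices; it samples an SBM adjacency matrix and confirms that the dual objective Σ_i y_i equals the primal value A • X of the combinatorial solution.

    # Check that the candidate dual value matches A • X for X = chi chi^T.
    import numpy as np

    rng = np.random.default_rng(1)
    n, p, q = 200, 0.5, 0.1                   # illustrative parameters
    chi = np.repeat([1.0, -1.0], n // 2)      # ±1 community indicator
    same = np.outer(chi, chi) > 0             # True iff i and j share a community
    A = np.triu((rng.random((n, n)) < np.where(same, p, q)).astype(float), k=1)
    A = A + A.T                               # symmetric SBM adjacency matrix

    X = np.outer(chi, chi)                    # the combinatorial solution (12)
    a_i = (A * same).sum(axis=1)              # internal degrees
    b_i = (A * ~same).sum(axis=1)             # external degrees
    y = a_i - b_i                             # candidate dual variables (13)
    print(np.isclose(y.sum(), np.sum(A * X))) # True: dual value = primal value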

3.2 Proving that the certificate is dual feasible

Feasibility of the proposed dual certificate is equivalent to the positive semidefiniteness condition

    M := diag(y) + y_0 J − A ⪰ 0.

We shall actually prove a stronger statement that will be useful for the complementary slackness part of the proof. We intend to show (a) that Mχ = 0 with probability 1, where again χ is the indicator of the cut, and (b) that x^T M x > 0 for all nonzero x ⊥ χ, with high probability.

The first statement (a) we shall verify directly below. The second (b) will follow from a matrix concentration argument: we shall show that, apart from one eigenvalue of 0 corresponding to χ, the eigenvalues of E[M] are all (a − b)/2, and we will then argue that with high probability

    ||M − E[M]||_op ≤ O(√(log n · (a + b))),

which will yield the theorem when a − b > c √(log n · (a + b)) for a suitable absolute constant c > 0.

To verify that Mχ = 0 with probability 1, we compute

    Mχ = diag(y)χ − Aχ + y_0 Jχ = diag(y)χ − Aχ,

where we have used balance to deduce Jχ = 0. We now observe that

    (diag(y)χ)_i = (a_i − b_i) χ_i,

while

    (Aχ)_i = Σ_j A_ij χ_j = χ_i Σ_j A_ij χ_i χ_j = (a_i − b_i) χ_i,

where we have used the facts that χ_i^2 = 1 and that χ_i χ_j = 1 if i and j are in the same community and −1 otherwise. Thus, diag(y)χ = Aχ and the claim follows.

On the other hand, we observe that E[y_i] = E[a_i − b_i] = (a − b)/2 for 1 ≤ i ≤ n. Thus,

    E[M] = ((a − b)/2) I + ((a + b)/(2n)) J − E[A]
         = ((a − b)/2) I + ((a + b)/(2n)) J − [((a + b)/(2n)) J + ((a − b)/(2n)) χχ^T]
         = ((a − b)/2) I − ((a − b)/(2n)) χχ^T.

It is now clear that χ is in the nullspace of E[M] (as it must be, since it is in the nullspace of M with probability 1) and that all the other eigenvalues of E[M] are (a − b)/2, as claimed. In the next lecture, we shall use these facts, together with a matrix version of Bernstein's inequality, to conclude the proof of Theorem 2.
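The two claims (a) and (b) are easy to observe empirically. The sketch below (illustrative parameters again) assembles M for a sampled SBM and inspects its action on χ and its spectrum.

    # Build M = diag(y) + y_0*J - A for a sampled SBM and inspect it.
    import numpy as np

    rng = np.random.default_rng(2)
    n, p, q = 400, 0.5, 0.1                  # illustrative parameters
    a, b = p * n, q * n
    chi = np.repeat([1.0, -1.0], n // 2)
    same = np.outer(chi, chi) > 0
    A = np.triu((rng.random((n, n)) < np.where(same, p, q)).astype(float), k=1)
    A = A + A.T

    y = (A * same).sum(axis=1) - (A * ~same).sum(axis=1)  # y_i = a_i - b_i
    M = np.diag(y) + ((a + b) / (2 * n)) * np.ones((n, n)) - A

    print(np.max(np.abs(M @ chi)))           # 0: chi is in the nullspace of M
    print(np.linalg.eigvalsh(M)[:2])         # one eigenvalue ~0; the rest positive,
                                             # fluctuating around (a - b)/2 = 80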