ORIE 6334 Spectral Graph Theory                                December 1, 2016
Lecturer: David P. Williamson        Lecture 27 Remix        Scribe: Qinru Shi

Note: This is an altered version of the lecture I actually gave, which followed the structure of the Barak-Steurer proof carefully. With the benefit of some hindsight, I think the following rearrangement of the same elements would have been more effective. (This lecture is derived from lecture notes of Boaz Barak and David Steurer, http://sumofsquares.org/public/lec-arv.html.)

1  Recap of Previous Lecture

Last time we started to prove the following theorem.

Theorem 1 (Arora, Rao, Vazirani, 2004) There is an O(√(log n))-approximation algorithm for sparsest cut.

The proof of the theorem uses an SDP relaxation in terms of vectors v_i ∈ R^n for all i ∈ V. Define distances to be d(i,j) = ||v_i − v_j||^2 and balls to be B(i,r) = {j ∈ V : d(i,j) ≤ r}. We first showed that if there exists a vertex i ∈ V such that |B(i, 1/4)| ≥ n/4, then we can find a cut of sparsity O(1)·OPT. If there does not exist such a vertex in V, then we can find U ⊆ V with |U| ≥ n/2 such that for any i ∈ U, 1/4 ≤ ||v_i||^2 ≤ 4 and there are at least n/4 vertices j ∈ U such that d(i,j) > 1/4.

Then we gave the ARV algorithm.

Algorithm 1: ARV Algorithm
  Pick a random vector r such that each coordinate r(i) ∼ N(0,1)
  Let L = {i ∈ V : v_i · r ≤ −1} and R = {i ∈ V : v_i · r ≥ 1}
  Find a maximal matching M ⊆ {(i,j) ∈ L × R : d(i,j) ≤ Δ}
  Let L', R' be the vertices in L, R respectively that remain uncovered
  Sort i ∈ V by increasing distance to L' (i.e. d(i, L')) to get i_1, i_2, ..., i_n
  Let S_k = {i_1, ..., i_k} and return S = arg min_{1 ≤ k < n} ρ(S_k)

Observation 1 At the end of the ARV algorithm, for any i ∈ L' and j ∈ R', d(i,j) > Δ.

Assume the matching algorithm gives the same matching for r as for −r. Then, we can assume that the probability of i being matched if i ∈ L is the same as the probability of i being matched if i ∈ R.
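The rounding in Algorithm 1 is straightforward to express in code. The following is a minimal Python/NumPy sketch of one run of it, assuming the SDP vectors are given as the rows of a matrix V and the graph as an edge list; the helper names (arv_round, sparsity), the greedy construction of the maximal matching, and the use of ρ(S) = |δ(S)|/(|S| |V \ S|) for the sweep are illustrative assumptions of the sketch, not part of the lecture.

    import numpy as np

    def sparsity(edges, S, n):
        """Uniform sparsity rho(S) = |delta(S)| / (|S| * |V \\ S|) (assumed definition)."""
        S = set(S)
        cut = sum(1 for (u, w) in edges if (u in S) != (w in S))
        return cut / (len(S) * (n - len(S)))

    def arv_round(V, edges, Delta, seed=0):
        """One run of the ARV rounding on SDP vectors V (n x d) and an edge list."""
        rng = np.random.default_rng(seed)
        n, dim = V.shape
        r = rng.standard_normal(dim)                    # random Gaussian direction
        X = V @ r                                       # X_i = v_i . r
        L = [i for i in range(n) if X[i] <= -1.0]       # thresholds at -1 and +1, as in the notes
        R = [i for i in range(n) if X[i] >= 1.0]
        d = lambda i, j: float(np.sum((V[i] - V[j]) ** 2))   # d(i,j) = ||v_i - v_j||^2

        # Greedy maximal matching on Delta-close pairs in L x R.
        covered = set()
        for i in L:
            for j in R:
                if i not in covered and j not in covered and d(i, j) <= Delta:
                    covered.update({i, j})
                    break
        Lp = [i for i in L if i not in covered]         # L': uncovered vertices of L
        if not Lp:
            return None                                 # unlucky draw of r; caller should retry

        # Sweep cut: sort by distance to L' and return the best prefix.
        dist_to_Lp = [min(d(i, j) for j in Lp) for i in range(n)]
        order = list(np.argsort(dist_to_Lp))
        best_k = min(range(1, n), key=lambda k: sparsity(edges, order[:k], n))
        return [int(v) for v in order[:best_k]]

In practice one would redraw r until both |L'| and |R'| are a constant fraction of n; the two theorems below say this happens with constant probability.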

Next, we stated the following two theorems and proved the first one.

Theorem 2 There exists some constant c_1 such that Pr[|L|, |R| ≥ c_1 n] ≥ c_1.

Theorem 3 (Structure Theorem) For Δ = Ω(1/√(log n)), E[|M|] ≤ (c_1/2) n.

The two theorems imply that with constant probability, |L'|, |R'| ≥ (c_1/2) n and d(i,j) > Δ for all i ∈ L' and j ∈ R'. We showed that if this is the case, we can then conclude that the algorithm gives us an O(√(log n))-approximation. Today we turn to the proof of the Structure Theorem.

2  Proof of Structure Theorem

The proof shown in this section is due to Boaz Barak and David Steurer (2016). The original ARV algorithm gives an O((log n)^{2/3})-approximation algorithm and needs another algorithm to reach the guarantee of O(√(log n)). Later, Lee showed that the original ARV algorithm also gives an O(√(log n))-approximation. Both of these analyses are long and technical. In 2016, Rothvoss gave a somewhat easier proof (https://arxiv.org/abs/1607.00854). Very recently Barak and Steurer gave a much easier proof, and this is what we will show today.

Recall the proof ideas we talked about last lecture. We know that
  - (v/||v||) · r ∼ N(0,1);
  - from this it is possible to prove a concentration result showing that
        Pr[v · r ≥ α] ≤ exp(−α^2 / (2||v||^2)).
Thus
        Pr[(v_i − v_j) · r ≥ C√(ln n)] ≤ e^{−C^2 ln n/16} = n^{−C^2/16}
for any i, j ∈ U, since ||v_i − v_j||^2 ≤ 8. Hence, for sufficiently large C, we have (v_i − v_j) · r ≤ C√(ln n) for all i, j ∈ U with high probability. Then one can show that
        E[max_{i,j ∈ U} (v_i − v_j) · r] ≤ 2C√(ln n).
For simplicity of notation, we rename v_i · r as X_i. Then,
        E[max_{i,j ∈ U} (X_i − X_j)] ≤ 2C√(ln n).
For the rest of the lecture, we will restrict our attention to vertices in U and ignore anything outside of U; we let n = |U|, and since |U| ≥ n/2, this only changes the constants in what we need to prove.
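The two Gaussian facts quoted above are easy to check numerically. The snippet below is purely illustrative (the random unit vectors, the dimension, and the trial counts are made-up choices, not anything from the lecture): it estimates E[max_{i,j} (v_i − v_j) · r] for point sets of bounded norm and shows it growing at roughly the √(ln n) rate used above.

    import numpy as np

    rng = np.random.default_rng(1)

    def expected_max_gap(n, dim=50, trials=200):
        """Monte Carlo estimate of E[max_{i,j} (v_i - v_j) . r] for n unit vectors v_i."""
        V = rng.standard_normal((n, dim))
        V /= np.linalg.norm(V, axis=1, keepdims=True)   # ||v_i|| = 1, so ||v_i - v_j|| <= 2
        gaps = []
        for _ in range(trials):
            r = rng.standard_normal(dim)
            X = V @ r                                   # X_i = v_i . r, each N(0,1)
            gaps.append(X.max() - X.min())              # = max_{i,j} (v_i - v_j) . r
        return float(np.mean(gaps))

    for n in [100, 1000, 10000]:
        est = expected_max_gap(n)
        print(n, round(est, 2), round(est / np.sqrt(np.log(n)), 2))  # last column stays roughly constant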

We would like to prove the following lemma.

Lemma 4 There exists a constant c such that for any positive integer k,
        E[max_{i,j ∈ U} (X_i − X_j)] ≥ (4k/n) E[|M|] − c√(kΔ).

If we can prove this lemma, then we have that with high probability
        (4k/n) E[|M|] − c√(kΔ) ≤ 2C√(ln n),
or
        (1/n) E[|M|] ≤ C√(ln n)/(2k) + c√(kΔ)/(4k).
So if we set
        k = ⌈(4C/c_1)√(ln n)⌉ = O(√(ln n))
and
        Δ = (c_1/c)^2 (1/k) = Ω(1/√(ln n)),
then
        (1/n) E[|M|] ≤ c_1/8 + c_1/(4k) ≤ c_1/2,
and we will have proven the Structure Theorem.

How should we prove the lemma? Consider a graph H = (U, E') where E' = {(i,j) ∈ U × U : d(i,j) ≤ Δ}. Let
        H(i,k) = {j ∈ U : j can be reached from i in at most k steps in H}.
Define
        Y(i,k) = max_{j ∈ H(i,k)} (X_j − X_i)
and
        Φ(k) = Σ_{i=1}^n E[Y(i,k)],
where i ranges over all starting points. Then,
        (1/n) Φ(k) ≤ E[max_{i,j ∈ U} (X_i − X_j)].
So to prove Lemma 4, we'll instead prove that
        (1/n) Φ(k) ≥ (4k/n) E[|M|] − c√(kΔ).
Or rather, we'll prove the following, which implies Lemma 4.

Lemma 5 Φ(k) ≥ 4k E[|M|] − cn√(kΔ).
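To make the definitions of H, H(i,k), Y(i,k), and Φ(k) concrete before we bound them, here is a short sketch in the same style as before (the function names and the Monte Carlo estimation of the expectation are my own illustrative choices; the proof below never computes Φ(k), it only bounds it).

    import numpy as np
    from collections import deque

    def build_H(V, Delta):
        """Adjacency lists of H = (U, E'), where E' = {(i,j) : ||v_i - v_j||^2 <= Delta}."""
        n = V.shape[0]
        adj = [[] for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                if np.sum((V[i] - V[j]) ** 2) <= Delta:
                    adj[i].append(j)
                    adj[j].append(i)
        return adj

    def ball_H(adj, i, k):
        """H(i,k): vertices reachable from i in at most k steps of H (breadth-first search)."""
        dist = {i: 0}
        queue = deque([i])
        while queue:
            u = queue.popleft()
            if dist[u] == k:
                continue
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        return list(dist)

    def estimate_Phi(V, Delta, k, trials=200, seed=0):
        """Monte Carlo estimate of Phi(k) = sum_i E[ max_{j in H(i,k)} (X_j - X_i) ]."""
        rng = np.random.default_rng(seed)
        adj = build_H(V, Delta)
        balls = [ball_H(adj, i, k) for i in range(V.shape[0])]
        total = 0.0
        for _ in range(trials):
            X = V @ rng.standard_normal(V.shape[1])     # X_i = v_i . r
            total += sum(max(X[j] for j in B) - X[i] for i, B in enumerate(balls))
        return total / trials

    # Example usage (with V taken from an SDP solution): estimate_Phi(V, Delta=0.1, k=3)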

To prove this lemma we will need the following probability results.

Lemma 6 For any two random variables X and Y,
        E[XY] ≤ E[X]E[Y] + √(Var[X] Var[Y]).

Observation 2 For any vector x,
        E[(x · r)^2] = ||x||^2 E[((x/||x||) · r)^2] = ||x||^2.

Theorem 7 (Borell's Theorem) If Z_1, Z_2, ..., Z_t have mean 0 and are jointly normally distributed, then there exists a constant ĉ such that
        Var[max(Z_1, ..., Z_t)] ≤ ĉ max(Var[Z_1], ..., Var[Z_t]).

Note that in Borell's Theorem, there is no dependence on the number of variables t. We also observe that for j ∈ H(i,k),
        Var[X_j − X_i] = E[(X_j − X_i)^2] = ||v_j − v_i||^2 ≤ kΔ,
by the triangle inequality and the fact that each edge (p,q) in H has ||v_p − v_q||^2 ≤ Δ. The reason why Borell's Theorem is useful is that for fixed i, the random variables X_j − X_i for j ∈ H(i,k) have mean 0 and are jointly normally distributed, so that Borell's Theorem says that
        Var[Y(i,k)] = Var[max_{j ∈ H(i,k)} (X_j − X_i)] ≤ ĉ max_{j ∈ H(i,k)} Var[X_j − X_i]
                    = ĉ max_{j ∈ H(i,k)} E[(X_j − X_i)^2] = ĉ max_{j ∈ H(i,k)} ||v_j − v_i||^2 ≤ ĉkΔ.
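Borell's theorem is the one non-elementary probabilistic input, and its dimension-free character is worth seeing numerically. The snippet below is illustrative only (the random covariance structure is made up, and no attempt is made to estimate the constant ĉ); it compares Var[max(Z_1, ..., Z_t)] with max_i Var[Z_i] as t grows.

    import numpy as np

    rng = np.random.default_rng(2)

    def var_of_max_ratio(t, samples=5000):
        """Var[max(Z_1,...,Z_t)] / max_i Var[Z_i] for a random mean-zero Gaussian vector Z."""
        A = rng.standard_normal((t, t)) / np.sqrt(t)   # Z = A g, with g standard normal
        Z = rng.standard_normal((samples, t)) @ A.T    # each row is one sample of Z
        var_max = Z.max(axis=1).var()
        max_var = (A ** 2).sum(axis=1).max()           # Var[Z_i] = ||i-th row of A||^2
        return var_max / max_var

    for t in [10, 100, 500]:
        print(t, round(var_of_max_ratio(t), 3))        # the ratio stays bounded as t grows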

Now to prove Lemma 5. But before we start, we can reflect a bit on what the lemma actually says. If we think about the expected projections X_j − X_i as we let j be at most k steps away from i, summing over all i, we get a constant times E[|M|] for each of the k steps; this makes sense, since for any matching edge (p,q), we have that |X_p − X_q| ≥ 2, since either X_p ≥ 1 and X_q ≤ −1 or vice versa, so we pick up that difference for each edge in the matching. However, there is also a correction term that corresponds to the variance. The proof is formalized below.

Proof of Lemma 5: If (i,j) ∈ E', then H(j, k−1) ⊆ H(i,k), so if Y(j, k−1) = X_h − X_j where h ∈ H(j, k−1), then
        Y(i,k) ≥ X_h − X_i = Y(j, k−1) + X_j − X_i.
Thus, if (i,j) ∈ M,
        Y(i,k) ≥ Y(j, k−1) + 2,                                        (1)
since X_i ≤ −1 and X_j ≥ 1 given that (i,j) is in the matching. Let N be an arbitrary pairing of the vertices not covered by M. Then, for any (i,j) ∈ N,
        Y(i,k) + Y(j,k) ≥ Y(i, k−1) + Y(j, k−1).                        (2)
Now we want to add both sides over all (i,j) ∈ M ∪ N, take expectations, and get Φ. Unfortunately, if we take an expectation, there will be a coefficient in front of Y(i,k) of the probability that i is in the matching. To get around this issue, we introduce some new random variables. Let
        L_i = 1   if i is matched in M and i ∈ L,
        L_i = 0   if i is matched in M and i ∈ R,
        L_i = 1/2 otherwise,
and
        R_i = 1   if i is matched in M and i ∈ R,
        R_i = 0   if i is matched in M and i ∈ L,
        R_i = 1/2 otherwise.
Note that E[L_i] = E[R_i] = 1/2, since the probability that i is in the matching when i ∈ L is the same as the probability that i is in the matching when i ∈ R. Adding both sides of (1) and (2) over M and N, we get
        Σ_{i=1}^n Y(i,k) L_i ≥ Σ_{j=1}^n Y(j, k−1) R_j + 2|M|.          (3)
Similarly, we have that
        Σ_{j=1}^n Y(j, k−1) R_j ≥ Σ_{i=1}^n Y(i, k−2) L_i + 2|M|.
Then by applying induction, we obtain that for k odd
        Σ_{i=1}^n Y(i,k) L_i ≥ Σ_{j=1}^n Y(j,0) R_j + 2k|M| = 2k|M|,
and for k even
        Σ_{i=1}^n Y(i,k) L_i ≥ Σ_{j=1}^n Y(j,0) L_j + 2k|M| = 2k|M|,
since Y(j,0) = 0 for all j, so that
        Σ_{i=1}^n Y(i,k) L_i ≥ 2k|M|                                    (4)
for any k. By Lemma 6,
        E[Y(i,k) L_i] ≤ E[Y(i,k)] E[L_i] + √(Var[Y(i,k)] Var[L_i]) ≤ (1/2) E[Y(i,k)] + (1/2)√(ĉkΔ),
using E[L_i] = 1/2, Var[L_i] ≤ 1/4, and Var[Y(i,k)] ≤ ĉkΔ. Taking expectations of both sides of (4), we get
        (1/2) Φ(k) + (n/2)√(ĉkΔ) ≥ 2k E[|M|],
or
        Φ(k) ≥ 4k E[|M|] − n√(ĉkΔ),
as desired.
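The only probability facts consumed by this proof are Borell's theorem and Lemma 6, and Lemma 6 is nothing more than the Cauchy-Schwarz bound on covariances. A quick numerical sanity check, with a made-up correlated pair purely for illustration:

    import numpy as np

    rng = np.random.default_rng(3)
    X = rng.standard_normal(100_000)
    Y = 0.7 * X + 0.3 * rng.standard_normal(100_000)          # an arbitrary correlated pair
    lhs = np.mean(X * Y)
    rhs = np.mean(X) * np.mean(Y) + np.sqrt(np.var(X) * np.var(Y))
    print(lhs <= rhs)   # Lemma 6: E[XY] <= E[X]E[Y] + sqrt(Var[X] Var[Y]); prints True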

3  Research Questions

- Is there an easier proof? Or a Cheeger-like proof? Recall the connection to the Cheeger-like inequality over flow packings.

- Can this proof be extended to non-uniform sparsest cut, where for each pair (s_i, t_i) there is a demand d_i and
        ρ(S) = |δ(S)| / Σ_{i : (s_i, t_i) separated by δ(S)} d_i ?
  (See the short sketch below.) The sparsest cut problem corresponds to there being a unit demand between each pair of vertices. For the non-uniform case, it is known that there is an O(√(log n) · log log n)-approximation algorithm, but it is not known if the extra log log n term is necessary.
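For concreteness, here is how the non-uniform objective reads in code (a small hypothetical helper; unit edge capacities and the (s_i, t_i, d_i) triple format are assumptions of this sketch, not anything from the lecture).

    def nonuniform_sparsity(edges, demands, S):
        """rho(S) = |delta(S)| / (total demand separated by S), with unit edge capacities.

        edges:   list of (u, v) pairs
        demands: list of (s_i, t_i, d_i) triples
        S:       one side of the cut, given as a set of vertices
        """
        cut = sum(1 for (u, v) in edges if (u in S) != (v in S))
        sep = sum(d for (s, t, d) in demands if (s in S) != (t in S))
        return float('inf') if sep == 0 else cut / sep

    # Uniform sparsest cut: unit demand between every pair of vertices.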