CS369M: Algorithms for Modern Massive Data Set Analysis    Lecture 15 - 11/11/2009
Partitioning Algorithms that Combine Spectral and Flow Methods
Lecturer: Michael Mahoney    Scribes: Kshipra Bhawalkar and Deyan Simeonov
*Unedited notes

Last time we looked at flow-based graph partitioning. Today we will show why it works.

We are given a graph $G = (V, E)$, a cost function $c : E \to \mathbb{R}_+$, and $k$ pairs of nodes $(s_i, t_i)$ with demands $d(i)$. Let $x(e)$ indicate whether edge $e$ is cut, let $y(i)$ indicate whether commodity $i$ is cut, and let $P_i$, $i = 1, 2, \ldots, k$, be the set of paths between $s_i$ and $t_i$. Then we want:
$$\min \sum_{e \in E} c(e) x(e) \quad \text{s.t.} \quad \sum_{i=1}^k d(i) y(i) \ge 1, \qquad \sum_{e \in P} x(e) \ge y(i) \;\; \forall P \in P_i, \; i = 1, 2, \ldots, k, \qquad y(i) \in \{0,1\}, \; x(e) \in \{0,1\}.$$
We relax $x$ and $y$ to lie in $[0, 1]$ and, given that for any feasible solution $(x, y)$ and any $\alpha > 0$, $(\alpha x, \alpha y)$ is again feasible (so we may normalize the demand constraint), we get the following LP:
$$\min \sum_{e \in E} c(e) x(e) \quad \text{s.t.} \quad \sum_{i=1}^k d(i) y(i) = 1, \qquad \sum_{e \in P} x(e) \ge y(i) \;\; \forall P \in P_i, \; i = 1, 2, \ldots, k, \qquad x(e) \ge 0, \; y(i) \ge 0.$$
Our strategy: solve the LP, then round the solution.

Recall: for $x \in \mathbb{R}^n$, $\|x\|_p = \left(\sum_{i=1}^n |x_i|^p\right)^{1/p}$ is the $p$-norm. Its induced metric is $\|x - y\|_p$.

Theorem (Bourgain '85). Any $n$-point metric space admits an $\alpha$-distortion embedding into $\ell_p$ with $\alpha = O(\log n)$.

Proof idea: use as coordinates the minimal distances of points to suitably chosen random sets.

Comparison between $\ell_1$ and $\ell_2$: $\ell_2$ has good dimension-reduction properties; $\ell_1$ does not. There is a fast (polynomial-time, $O(n^3)$) algorithm to compute the best $\ell_2$ embedding; for $\ell_1$ the problem is NP-hard.
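Bourgain's construction can be sketched in code. The following is a minimal illustrative version, not the tuned construction from the proof: the number of scales `t`, the repetitions per scale `q`, and all function names are choices made here for the sketch.

```python
import random

def bourgain_embedding(dist, n, seed=0):
    """Sketch of Bourgain's embedding: each coordinate of a point is its
    distance to a random subset, with subsets drawn at O(log n) density
    scales.  `dist[i][j]` is the input metric on points 0..n-1."""
    rng = random.Random(seed)
    subsets = []
    t = max(1, n.bit_length())      # number of density scales, ~log2 n
    q = 2 * t                       # repetitions per scale (heuristic here)
    for scale in range(t):
        p = 2.0 ** -(scale + 1)     # inclusion probability for this scale
        for _ in range(q):
            A = {i for i in range(n) if rng.random() < p}
            if A:                   # skip empty draws
                subsets.append(A)
    # coordinate value: distance from point i to the set A
    def d_to_set(i, A):
        return min(dist[i][j] for j in A)
    return [[d_to_set(i, A) for A in subsets] for i in range(n)]
```

By the triangle inequality each coordinate is 1-Lipschitz, i.e., the embedding never expands any single coordinate beyond the original distance; Bourgain's theorem says that, after scaling, the contraction is bounded by $O(\log n)$.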
$\ell_2$ has strong connections to diffusion, low-dimensional spaces, and manifolds; $\ell_1$ has strong connections to cuts, graph partitioning, and multicommodity flows.

Connection between $\ell_1$ and cut metrics: every $\ell_1$ metric can be written as a nonnegative combination of cut metrics. Cut metrics are the extreme rays of the cone of $\ell_1$ metrics, and the minimum of a ratio function over a cone is attained on an extreme ray.

Definition. Given a graph $G(V, E)$ and $S \subseteq V$, we say $\delta_S$ is the cut metric for $S$ if $\delta_S(x, y)$ is the indicator of $x$ and $y$ being on different sides of the cut $(S, \bar{S})$.

Claim. The set of $\ell_1$ metrics is a convex cone, i.e., if $d_1, d_2 \in \ell_1$ and $\alpha_1, \alpha_2 \ge 0$, then $\alpha_1 d_1 + \alpha_2 d_2 \in \ell_1$. (This does not hold for $\ell_2$!)

Proof sketch: each line metric $d^{(i)}(x, y) = |x_i - y_i|$, for points in $\mathbb{R}^n$, is an $\ell_1$ metric, and every $\ell_1$ metric is a sum of line metrics, one per dimension. We can then argue dimension by dimension, concatenating coordinates.

Claim. Let $d$ be a finite $\ell_1$ metric. Then we can write $d$ as
$$d = \sum_{S \subseteq V} \alpha_S \delta_S, \qquad \alpha_S \ge 0, \; \delta_S \text{ a cut metric},$$
i.e.,
$$\mathrm{CUT}_n = \Big\{ d : d = \sum_{S \subseteq V} \alpha_S \delta_S, \; \alpha_S \ge 0 \Big\}$$
(the positive cone generated by the cut metrics) coincides with the set of $n$-point subsets of $\mathbb{R}^m$ under the $\ell_1$ metric.

Proof: ($\mathrm{CUT}_n \subseteq \ell_1$) Let $d = \sum_{S \subseteq V} \alpha_S \delta_S \in \mathrm{CUT}_n$. Introduce one coordinate for each cut $S$ with $\alpha_S > 0$: point $i$ gets value $\alpha_S$ in that coordinate if $i \in S$ and $0$ otherwise. Then the $\ell_1$ distance between $i$ and $j$ is the sum of $\alpha_S$ over all cuts $(S, \bar{S})$ that put $i$ and $j$ in different sets, which is exactly $d(i, j)$.

($\ell_1 \subseteq \mathrm{CUT}_n$): Consider one dimension $d$ and sort the points along that dimension in increasing order. Let $v_1 < v_2 < \cdots < v_k$ be the distinct values along that dimension. Define $k - 1$ cut metrics via $S_i = \{x : x_d \le v_i\}$ and let $\alpha_i = v_{i+1} - v_i$. Then along that dimension
$$|x_d - y_d| = \sum_{i=1}^{k-1} \alpha_i \delta_{S_i}(x, y).$$
Do this for each dimension.

Usefulness: you can optimize over $\ell_1$ metrics. Let $C \subseteq \mathbb{R}^n$ be a convex cone and let $f, g : \mathbb{R}^n \to \mathbb{R}_+$ be linear. Then
$$\min_{x \in C} \frac{f(x)}{g(x)} = \min_{x \text{ extreme ray of } C} \frac{f(x)}{g(x)}.$$
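The ($\ell_1 \subseteq \mathrm{CUT}_n$) direction is completely constructive, so it can be written out directly. A small sketch (function names are ours):

```python
def l1_to_cut_metrics(points):
    """Decompose the l_1 metric on a finite point set into a nonnegative
    combination of cut metrics: for each coordinate d, with distinct sorted
    values v_1 < ... < v_k, take S_i = {x : x_d <= v_i} with weight
    alpha_i = v_{i+1} - v_i.  Returns a list of (alpha_S, S) pairs."""
    n = len(points)
    dims = len(points[0])
    cuts = []
    for d in range(dims):
        vals = sorted({p[d] for p in points})
        for v, v_next in zip(vals, vals[1:]):
            S = frozenset(i for i in range(n) if points[i][d] <= v)
            cuts.append((v_next - v, S))
    return cuts

def cut_distance(cuts, i, j):
    """Distance implied by the decomposition: sum of alpha_S over cuts
    separating i and j."""
    return sum(a for a, S in cuts if (i in S) != (j in S))
```

For any finite point set, `cut_distance` reproduces the $\ell_1$ distance exactly, which is the content of the claim.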
Conductance and Sparsity

Given a graph $G(V, E)$ we define the conductance $h_G$ and sparsity $\phi_G$ as follows:
$$h_G := \min_{S \subseteq V} \frac{|E(S, \bar{S})|}{\min\{|S|, |\bar{S}|\}}, \qquad \phi_G := \min_{S \subseteq V} \frac{|E(S, \bar{S})|}{|S| \cdot |\bar{S}|}.$$

Claim. $\phi_G = \min \big\{ \sum_{(i,j) \in E} d_{ij} \; : \; d \text{ an } \ell_1 \text{ metric}, \; \sum_{i,j} d_{ij} = 1 \big\}$.

Idea: relax to optimization over a larger set, i.e., all metrics:
$$\lambda := \min \sum_{(i,j) \in E} d_{ij} \quad \text{s.t.} \quad \sum_{i,j} d_{ij} = 1, \qquad d_{ij} \ge 0, \qquad d_{ij} = d_{ji}, \qquad d_{ij} + d_{jk} \ge d_{ik}.$$
Clearly $\lambda \le \phi_G$. Less obviously, $\phi_G \le O(\log n) \, \lambda$ (Homework 3).

Algorithm: Given $G$:
1. Solve the LP to get a metric $d$.
2. Use Bourgain's embedding result to approximate $d$ by an $\ell_1$ metric (with loss $O(\log n)$).
3. Round the solution to get a cut: for each dimension, convert the $\ell_1$ metric along that dimension to a cut metric; choose the best cut found.

Note: if we have an $\ell_1$ embedding with distortion factor $\xi$, then we can approximate the cut up to $\xi$ (Homework 3).

What else can you relax to?
- $\ell_2$: not convex.
- $\ell_2^2$: convex, but distortion $\Omega(n)$, and not a metric. (This is what happens in the spectral technique: we can get an eigenvalue $\lambda$ such that $\lambda$ is close to $h_G$ up to a quadratic factor; this is Cheeger's inequality.)
- $\ell_2^2$ + triangle inequality: this gives a metric, and this works (Arora, Rao, Vazirani).
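The rounding step (step 3 of the algorithm) is just a sweep over each coordinate of the embedding, evaluating the sparsity of every prefix cut. A minimal sketch (the function names and the tiny demo graph are ours):

```python
def sparsity(edges, S, n):
    """phi(S) = |E(S, S-bar)| / (|S| * |S-bar|)."""
    S = set(S)
    cut = sum(1 for u, v in edges if (u in S) != (v in S))
    return cut / (len(S) * (n - len(S)))

def best_sweep_cut(edges, n, embedding):
    """Round an l_1 embedding to a cut: for each coordinate, sort vertices
    by that coordinate and try every prefix as S; return (phi, S) for the
    sparsest cut found."""
    best = (float("inf"), None)
    dims = len(embedding[0])
    for d in range(dims):
        order = sorted(range(n), key=lambda i: embedding[i][d])
        for t in range(1, n):
            S = order[:t]
            phi = sparsity(edges, S, n)
            if phi < best[0]:
                best = (phi, set(S))
    return best
```

On two triangles joined by a single edge, with a one-dimensional embedding separating the triangles, the sweep recovers the bridge cut, whose sparsity is $1/9$.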
Various relaxations of the sparsest cut problem.

Actual problem:
$$\phi_G = \min_{S \subseteq V} \frac{|E(S, \bar{S})|}{|S| \cdot |\bar{S}|} = \min_{x \in \{0,1\}^V} \frac{\sum_{i,j} A_{ij} |x_i - x_j|}{\sum_{i,j} |x_i - x_j|}.$$

Spectral method: replace $\ell_1$ with $\ell_2$:
$$\lambda_2 = \min_{x \in \mathbb{R}^V} \frac{\sum_{i,j} A_{ij} (x_i - x_j)^2}{\sum_{i,j} (x_i - x_j)^2}.$$

Leighton-Rao: relax to an arbitrary metric:
$$\min_{d \text{ metric}} \frac{\sum_{i,j} A_{ij} d_{ij}}{\sum_{i,j} d_{ij}}.$$
The Leighton-Rao approach is bad on expanders, but is good on sparse graphs.

Definition. An $\ell_2^2$ representation of a graph $G$ is an assignment of a vector $v_i \in \mathbb{R}^k$ to each node $i$ such that
$$\|v_i - v_j\|_2^2 + \|v_j - v_k\|_2^2 \ge \|v_i - v_k\|_2^2 \quad \forall i, j, k.$$
It is a unit $\ell_2^2$ representation if the points lie on the unit sphere, i.e., $\|v_i\| = 1$ for all $i$.

Things to note:
- The condition that the $\ell_2^2$ distances form a metric means that all angles among the points are non-obtuse (at most $90°$).
- The $\ell_2^2$ metrics form a convex cone.
- $d \in \ell_1 \Rightarrow d \in \ell_2^2$.

Relax to vectors on the unit sphere that form an $\ell_2^2$ metric:
$$\min \sum_{(i,j) \in E} A_{ij} \|x_i - x_j\|_2^2 \quad \text{s.t.} \quad \sum_{i,j} \|x_i - x_j\|_2^2 = 1, \qquad \|x_i - x_j\|_2^2 + \|x_j - x_k\|_2^2 \ge \|x_i - x_k\|_2^2.$$
This is the semidefinite program (SDP) from Arora, Rao, Vazirani.

Theorem: For uniform sparsest cut ($d_{ij} = 1$ for all $i, j$), the SDP integrality gap is $\Theta(\sqrt{\log n})$. For general sparsest cut, the SDP integrality gap is $\Theta(\sqrt{\log n} \, \log \log n)$.

Main structure theorem: Let $v_1, v_2, \ldots, v_n$ be points on the unit ball in $\mathbb{R}^n$ such that $d_{ij} = \|v_i - v_j\|_2^2$ is a metric and the points are well-separated, i.e., $\sum_{i,j} d_{ij} / n^2 \ge \delta = \Omega(1)$. Then there exist disjoint subsets $S, T \subseteq V$ with $|S|, |T| \ge \Omega(n)$ such that
$$\min_{i \in S, j \in T} d_{ij} \ge \Omega(1/\sqrt{\log n}).$$

Flow methods: embed a scaled version of the complete graph into $G$; this gives $O(\log n)$. ARV: embed an arbitrary graph $H$ (iterative construction) such that either $H$ is a good expander or we find a cut. The iterative construction uses the multiplicative-update method from online learning.
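The spectral relaxation above is the easiest one to run end to end: compute $\lambda_2$ and its eigenvector (the Fiedler vector), then sweep on it exactly as in the $\ell_1$ rounding. A small sketch, assuming the unnormalized Laplacian and the conductance-style objective $h(S) = |E(S, \bar{S})| / \min\{|S|, |\bar{S}|\}$ (the function name and demo graph are ours):

```python
import numpy as np

def fiedler_sweep(edges, n):
    """Spectral partitioning sketch: build the Laplacian L = D - A, take
    the eigenvector for the second-smallest eigenvalue (Fiedler vector),
    sort vertices by it, and return the sweep-prefix cut minimizing
    h(S) = |E(S, S-bar)| / min(|S|, |S-bar|)."""
    L = np.zeros((n, n))
    for u, v in edges:
        L[u, u] += 1; L[v, v] += 1
        L[u, v] -= 1; L[v, u] -= 1
    vals, vecs = np.linalg.eigh(L)       # ascending eigenvalues
    lam2, f = vals[1], vecs[:, 1]        # lambda_2 and Fiedler vector
    order = np.argsort(f)
    best_h, best_S = float("inf"), None
    for t in range(1, n):
        S = set(order[:t].tolist())
        cut = sum(1 for u, v in edges if (u in S) != (v in S))
        h = cut / min(len(S), n - len(S))
        if h < best_h:
            best_h, best_S = h, S
    return lam2, best_h, best_S
```

One direction of Cheeger's inequality, $\lambda_2 / 2 \le h_G$, can be checked numerically on any example; the other direction is where the quadratic loss mentioned above comes in.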
References
1. Arora, Rao, and Vazirani, "Geometry, flows, and graph-partitioning algorithms," CACM.
2. Khandekar, Rao, and Vazirani, "Graph partitioning using single commodity flows."
3. Orecchia, Schulman, Vazirani, and Vishnoi, "On partitioning graphs via single commodity flows."