Robert Krauthgamer (Weizmann Institute), Seffi Naor (Technion), Roy Schwartz (Microsoft Research), Kunal Talwar (Microsoft Research)
Problem Definition

Input: $G = (V, E)$, $w : E \to \mathbb{R}^+$. Capacities $n_1, n_2, \ldots, n_k$ s.t. $\sum_{j=1}^{k} n_j \ge n$.

Output: A partition $S_1, \ldots, S_k$ of $V$ where each $|S_j| \le n_j$, minimizing $\frac{1}{2} \sum_{j=1}^{k} \delta(S_j)$.

Note: The number of parts $k$ might depend on $n$. Capacities might be of different magnitudes.
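The objective above can be made concrete with a small sketch. This is a toy illustration, not part of the talk: `partition_cost` and the 4-cycle instance are hypothetical names chosen here to show how $\frac{1}{2} \sum_j \delta(S_j)$ is computed for a capacity-respecting partition.

```python
# Toy sketch of the objective: cost = (1/2) * sum_j delta(S_j), where
# delta(S) is the total weight of edges with exactly one endpoint in S.
# Instance and names are illustrative, not from the talk.

def partition_cost(edges, parts):
    """edges: {(u, v): weight}; parts: list of vertex sets partitioning V."""
    cost = 0.0
    for S in parts:
        # delta(S): weight of edges crossing the boundary of S
        delta = sum(w for (u, v), w in edges.items() if (u in S) != (v in S))
        cost += delta
    # in a partition every cut edge is counted by both of its sides
    return cost / 2.0

edges = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (3, 0): 1.0}  # a 4-cycle
capacities = [2, 2]
parts = [{0, 1}, {2, 3}]
assert all(len(S) <= n_j for S, n_j in zip(parts, capacities))
print(partition_cost(edges, parts))  # cutting the 4-cycle into two paths costs 2.0
```

Splitting the cycle into two antipodal pairs instead (e.g. $\{0,2\}, \{1,3\}$) cuts all four edges, which the same function reports as cost 4.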
Example - Cloud Computing

[Figure: processes assigned to servers with capacities $n_1, \ldots, n_9$]
Motivation

Theoretical: captures well-studied problems: Min-Bisection, Min b-Balanced-Cut, Min k-Partitioning.

Practical:
- Cloud and parallel computing: parallelism.
- Hardware design: VLSI layout, circuit testing.
- Data mining: clustering.
- Social network analysis: community discovery.
- Vision: pattern recognition.
- Scientific computing: linear systems.
Related Work

Heuristics: [Barnes-82], [Barnes-Vannelli-Walker-88], [Sanchis-89], [Hadley-Mark-Vannelli-92], [Rendl-Wolkowicz-95], ... Mainly use spectral theory, local search, and quadratic programming.

Worst-case guarantee: no meaningful known bounds.
Related Work (Cont.)

Balanced Graph Partitioning ($n_j = n/k$):

Min-Bisection ($k = 2$):
- True approx.: $O(\log^{3/2} n)$ [Feige-Krauthgamer-02]; $O(\log n)$ [Räcke-08]
- Bicriteria approx. (related to Sparsest-Cut and Min b-Balanced-Cut): $O(\log n)$ [Leighton-Rao-99]; $O(\sqrt{\log n})$ [Arora-Rao-Vazirani-08]

Min k-Partitioning (general $k$):
- NP-hardness: no true approximation [Andreev-Räcke-06]
- Bicriteria approx.: $(O(\log n), 2)$ [Even-Naor-Rao-Schieber-99]; $(O(\sqrt{\log n \log k}), 2)$ [Krauthgamer-Naor-S-09]; $(O(\varepsilon^{-2} \log^{3/2} n), 1+\varepsilon)$ [Andreev-Räcke-06]; $(O(\log n), 1+\varepsilon)$ [Feldmann-Foschini-12]
Related Work (Cont.)

Capacitated Metric Labeling: the same problem with additional assignment costs.
- $(O(\log n), O(\log k))$ for $n_j = n/k$ [Naor-S-05]
- $(O(\log n), 1)$ for constant $k$ [Andrews-Hajiaghayi-Karloff-Moitra-11]
- Unless $NP \subseteq ZPTIME(n^{\mathrm{polylog}(n)})$, there is no finite approximation that violates capacities by $O(\log^{1/2-\varepsilon} k)$, for any $\varepsilon > 0$ [Andrews-Hajiaghayi-Karloff-Moitra-11]
Our Result

Theorem [Krauthgamer-Naor-S-Talwar-13]: There is a bicriteria approximation algorithm achieving a guarantee of $(O(\log n), O(1))$ for arbitrary capacities.
Known Techniques - Issues

1 Recursive partitioning: how many vertices to cut in each step?
2 Spreading metrics: only spreading with average capacity - insufficient.
3 Räcke's tree decomposition: dynamic programming does not seem to yield polynomial running time.

Our Approach: a configuration LP; randomized rounding + concentration via stopping times.
Configuration LP

$F_j \triangleq \{S : S \subseteq V, |S| \le n_j\}$

(P): $\min \frac{1}{2} \sum_{j=1}^{k} \sum_{S \in F_j} \delta(S)\, x_{S,j}$

s.t.
$\sum_{j=1}^{k} \sum_{S \in F_j : u \in S} x_{S,j} \ge 1$ for all $u \in V$
$\sum_{S \in F_j} x_{S,j} \le 1$ for $j = 1, \ldots, k$
$x_{S,j} \ge 0$ for $j = 1, \ldots, k$, $S \in F_j$
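To see the scale of (P): it has one variable $x_{S,j}$ per pair (part $j$, feasible set $S \in F_j$), so exponentially many in general. A toy sketch (illustrative names, not the talk's code) that enumerates the variable set and its cost coefficients $\delta(S)$ for a tiny instance:

```python
from itertools import combinations

# Sketch: enumerate the configurations F_j = {S : |S| <= n_j} of (P) for a
# small instance, with delta(S) as each variable's cost coefficient.
# Instance and names are illustrative.
V = [0, 1, 2, 3]
edges = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (3, 0): 1.0}  # a 4-cycle
capacities = [2, 2]  # n_1, n_2

def delta(S, edges):
    return sum(w for (u, v), w in edges.items() if (u in S) != (v in S))

F = []  # F[j] = list of (S, delta(S)) with 1 <= |S| <= n_j
for n_j in capacities:
    F_j = []
    for size in range(1, n_j + 1):
        for S in combinations(V, size):
            S = frozenset(S)
            F_j.append((S, delta(S, edges)))
    F.append(F_j)

# C(4,1) + C(4,2) = 10 nonempty sets of size <= 2 per part
print([len(F_j) for F_j in F])  # [10, 10]
```

Even on 4 vertices each part already has 10 configurations, which is why (P) is handled via its dual separation oracle rather than explicitly.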
Configuration LP (Cont.)

Theorem: (P) can be efficiently solved up to a loss of $O(\log n)$ in the objective.

Proof outline: the dual separation oracle of (P) relates to Min ρ-Unbalanced-Cut; techniques from [Räcke-08] give an $O(\log n)$ approximation.

Difficulty: applying the above within algorithms for solving mixed packing/covering LPs yields an $O(\log n)$ loss in both cost and capacity.

Solution: scaling the constraints differently.
Randomized Rounding

Assumption: $n_1 \ge n_2 \ge \ldots \ge n_k$ and each $n_j$ is a power of 2.

Notation: group equal capacities into mega-buckets, e.g., $n_1 = n_2 = n_3 > n_4 = n_5 > n_6 = n_7 = n_8 > \ldots$ yields $W_1, W_2, W_3, \ldots$

$W_i \triangleq \{j : n_j = 2^{-(i-1)} n_1\}$ for $i = 1, 2, \ldots, l$; $k_i \triangleq |W_i|$.
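The bucket definition above is easy to compute. A small sketch (the function name `mega_buckets` is illustrative), assuming the slide's normalization that the capacities are sorted powers of 2:

```python
from collections import defaultdict

# Sketch of the mega-bucket notation: W_i = {j : n_j = 2^-(i-1) * n_1},
# assuming n_1 >= ... >= n_k and each n_j is a power of 2.
def mega_buckets(n):
    """n: list of capacities; returns {i: [indices j in W_i]}."""
    W = defaultdict(list)
    for j, n_j in enumerate(n):
        # ratio n_1 / n_j is a power of 2; 2^(i-1) = ratio gives bucket i
        i = (n[0] // n_j).bit_length()
        W[i].append(j)
    return dict(W)

n = [8, 8, 8, 4, 4, 2, 2, 2]  # mirrors the slide's example grouping
W = mega_buckets(n)
print({i: len(js) for i, js in W.items()})  # k_i per bucket: {1: 3, 2: 2, 3: 3}
```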
Randomized Rounding (Cont.) - General Approach

Idea: cover the vertices by random cuts.

1 $H_i \leftarrow \emptyset$ for every $i = 1, \ldots, l$.
2 While $V \ne \emptyset$:
- Choose $j \sim \mathrm{Unif}[1, \ldots, k]$.
- Choose $S \in F_j$ w.p. $x_{S,j}$.
- Let $r$ be the mega-bucket s.t. $j \in W_r$. $H_r \leftarrow H_r \cup \{S \cap V\}$. $V \leftarrow V \setminus S$.
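The loop above can be simulated directly. This is a toy sketch under stated assumptions, not the paper's tuned procedure: `cover`, `bucket_of`, and the tiny fractional solution `x` are all illustrative names chosen here.

```python
import random

# Toy simulation of the covering loop: pick a part j uniformly, sample a
# cut S from the fractional solution x[j], and assign the newly covered
# vertices S ∩ V to the mega-bucket of j.
def cover(V, x, bucket_of, rng):
    """x[j]: list of (S, prob) with probs summing to <= 1;
    bucket_of[j]: mega-bucket index r such that j is in W_r."""
    V = set(V)
    H = {r: [] for r in set(bucket_of)}
    while V:
        j = rng.randrange(len(x))
        r_val = rng.random()
        acc, chosen = 0.0, None
        for S, p in x[j]:          # sample S ~ x[j] (possibly no cut at all)
            acc += p
            if r_val < acc:
                chosen = S
                break
        if chosen is None:
            continue
        new = set(chosen) & V      # S ∩ V: only still-uncovered vertices
        if new:
            H[bucket_of[j]].append(new)
        V -= new
    return H

rng = random.Random(0)
x = [[({0, 1}, 1.0)], [({2, 3}, 1.0)]]   # a feasible fractional solution
H = cover([0, 1, 2, 3], x, bucket_of=[1, 1], rng=rng)
covered = set().union(*(S for cuts in H.values() for S in cuts))
print(covered)  # {0, 1, 2, 3}
```

Because each iteration keeps only $S \cap V$, the sets collected across all buckets are disjoint and together cover $V$, exactly the invariant the worked example on the next slides illustrates.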
Randomized Rounding - Example

Start with $H_1 = \emptyset, H_2 = \emptyset, H_3 = \emptyset, \ldots, H_l = \emptyset$.

- Sample $(S_1, j_1)$ with $j_1 \in W_3$: $H_3 = \{S_1\}$.
- Sample $(S_2, j_2)$ with $j_2 \in W_1$: $H_1 = \{S_2 \setminus S_1\}$.
- Sample $(S_3, j_3)$ with $j_3 \in W_3$: $H_3 = \{S_1, S_3 \setminus (S_1 \cup S_2)\}$.
- Sample $(S_4, j_4)$ with $j_4 \in W_2$: $H_2 = \{S_4 \setminus (S_2 \cup S_3)\}$.
- Sample $(S_5, j_5)$ with $j_5 \in W_l$: $H_l = \{S_5 \setminus (S_1 \cup S_2 \cup S_3)\}$.
Randomized Rounding (Cont.)

Question: what to do once all vertices are covered?

Merge: while $|H_i| > k_i$, merge the smallest two cuts in $H_i$.

Theorem (cost): $\mathbb{E}[\mathrm{cost}(ALG)] \le 2\, \mathrm{cost}(P)$.

Proof outline: for every edge $(u, v)$,
$\Pr[u \text{ and } v \text{ are covered in different iterations}] \le \sum_{j=1}^{k} \sum_{S \in F_j : (u,v) \in \delta(S)} x_{S,j}.$
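The merge step is a classic smallest-two-first loop, which a heap implements directly. A sketch (the name `merge_bucket` is illustrative, and the tie-breaking counter is an implementation detail, not from the talk):

```python
import heapq

# Sketch of the merge step: while a bucket H_i holds more cuts than k_i,
# merge the two smallest cuts into one.
def merge_bucket(cuts, k_i):
    """cuts: list of disjoint vertex sets; reduce to at most k_i sets."""
    # (size, tiebreak, set) tuples: the counter keeps sets from being compared
    heap = [(len(S), t, S) for t, S in enumerate(cuts)]
    heapq.heapify(heap)
    t = len(cuts)
    while len(heap) > k_i:
        _, _, a = heapq.heappop(heap)   # smallest cut
        _, _, b = heapq.heappop(heap)   # second smallest cut
        heapq.heappush(heap, (len(a | b), t, a | b))
        t += 1
    return [S for _, _, S in heap]

cuts = [{0}, {1}, {2, 3}, {4, 5, 6}]
merged = merge_bucket(cuts, 2)
print(sorted(len(S) for S in merged))  # [3, 4]: {0}+{1} merge, then join {2, 3}
```

Merging the smallest cuts first keeps the resulting parts as balanced as possible, which is what makes bounding the covered mass $N_i$ per bucket (next slides) sufficient for the capacity analysis.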
Capacity - Attempt I

Observation (interchanging cuts within $W_i$): it suffices to upper bound $N_i \triangleq$ the number of vertices covered by $H_i$ at the end.

Note: $\mathbb{E}[N_i] \le k_i 2^{-(i-1)} n_1$.

$l = O(\log k)$ by merging every $W_i$ with $k_i \le 2^{(i-1)/2}$ into $W_1$ ($l$ is the number of mega-buckets).

Conclusion: Markov + union bound $\Rightarrow$ $O(\log k)$ capacity violation.
Capacity - Attempt II

Martingale: $M_{i,t} \triangleq \mathbb{E}[N_i \mid (S_1, j_1), \ldots, (S_t, j_t)]$; $\{M_{i,t}\}_{t \ge 0}$ is a martingale with respect to $\{(S_t, j_t)\}_{t \ge 1}$.

Note:
- $M_{i,0} = \mathbb{E}[N_i]$.
- $|M_{i,t} - M_{i,t-1}| \le k_i 2^{-(i-1)} n_1$ (Lipschitz).
- Number of iterations to cover $V$: $T = \Theta(k \log n)$.

Conclusion: Azuma + union bound $\Rightarrow$ $O(\sqrt{k \log n \cdot \log\log k})$ capacity violation.
Capacity - Attempt III

Worst conditional variance: let $(T_1, r_1), \ldots, (T_{t-1}, r_{t-1})$ be the realization that maximizes
$\mathrm{Var}\left[M_{i,t} - M_{i,t-1} \mid (S_1, j_1) = (T_1, r_1), \ldots, (S_{t-1}, j_{t-1}) = (T_{t-1}, r_{t-1})\right]$;
$v_{i,t}$ is this worst variance value.

Note: for every $t$ a different conditioning might be chosen. Martingale concentration via bounded variances (Bernstein) is not sufficient: $\sum_{t \ge 1} v_{i,t}$ might be too big!
Capacity - Attempt III (Cont.)

Question: is $\sum_{t \ge 1} \mathrm{Var}\left[M_{i,t} - M_{i,t-1} \mid (S_1, j_1), \ldots, (S_{t-1}, j_{t-1})\right]$ small with high probability?

Theorem (sum of conditional variances):
$\Pr\left[\sum_{t \ge 1} \mathrm{Var}\left[M_{i,t} - M_{i,t-1} \mid (S_1, j_1), \ldots, (S_{t-1}, j_{t-1})\right] \le 2\alpha\, k_i 2^{-(i-1)} n_1^2\right] \ge 1 - 1/\alpha.$

Intuition: in later iterations the changes are smaller in expectation.
Capacity - Attempt III (Cont.)

Martingale $\{M_{i,t}\}_{t \ge 0}$ + good event (the variances are small) $\Rightarrow$ Freedman's inequality (stopping-time based concentration).
Capacity - Attempt III (Cont.)

Immediate conclusion: Freedman's inequality yields an $O(\log\log k)$ capacity violation.

Question: how do we get an $O(1)$ capacity violation?
Instance Transformation

[Figure: a sequence of transformations on mega-buckets $W_1, \ldots, W_9$]
Instance Transformation (Cont.)

What is an instance transformation?
- The total capacity of each non-empty $W_i$ grows by a constant factor $c$.
- The inverse transformation moves cuts from $W_i$ to some $W_j$, $j \ne i$.
- The inverse transformation incurs a factor-$c$ capacity violation.

What does an instance transformation provide?
- Large total capacity of $W_i$: better concentration as the total capacity increases.
- The sum of conditional variances might be higher.
- The probability that the sum of conditional variances is small increases.

Conclusion: $O(1)$ capacity violation.
Thank You! Questions?