Graph Sparsifiers: A Survey
Nick Harvey (UBC)
Based on work by: Batson, Benczur, de Carli Silva, Fung, Hariharan, Harvey, Karger, Panigrahi, Sato, Spielman, Srivastava, and Teng
Approximating Dense Objects by Sparse Ones
Examples: floor joists, image compression.
Approximating Dense Graphs by Sparse Ones
Spanners: approximate all distances to within a factor α using only O(n^{1+2/α}) edges (n = # vertices).
Low-stretch trees: approximate most distances to within a factor O(log n) using only n-1 edges.
Overview
- Definitions: cut & spectral sparsifiers
- Applications
- Cut sparsifiers
- Spectral sparsifiers
- A random sampling construction
- Derandomization
Cut Sparsifiers (Karger '94)
Input: an undirected graph G=(V,E) with weights u : E → R+.
Output: a subgraph H=(V,F) of G with weights w : F → R+ such that |F| is small and
    w(δ_H(U)) = (1 ± ε) u(δ_G(U))   for all U ⊆ V,
where w(δ_H(U)) is the weight of edges between U and V\U in H, and u(δ_G(U)) is the weight of edges between U and V\U in G.
Generic Application of Cut Sparsifiers
Instead of running a (slow) algorithm A for some problem P (e.g. min s-t cut, sparsest cut, max cut) on the (dense) input graph G to get an exact/approximate output, first run an (efficient) sparsification algorithm S to produce a sparse graph H that approximately preserves the solution of P. Running A on H (now faster) yields an approximate output.
Relation to Expander Graphs
A graph H on V is an expander if, for some constant c, |δ_H(U)| ≥ c|U| for all U ⊆ V with |U| ≤ n/2.
Let G be the complete graph on V. If we give all edges of H weight w = n, then
    w(δ_H(U)) ≥ c·n·|U| ≈ c·|δ_G(U)|   for all U ⊆ V with |U| ≤ n/2,
since |δ_G(U)| = |U|·(n-|U|). So expanders are essentially sparsifiers of the complete graph.
Relation to Expander Graphs
Simple random construction: the Erdős–Rényi graph G_{n,p} is an expander if p = Ω(log(n)/n), with high probability. This gives an expander with Θ(n log n) edges with high probability. But aren't there much better (sparser) expanders?
Spectral Sparsifiers (Spielman-Teng '04)
Input: an undirected graph G=(V,E) with weights u : E → R+.
Def: the Laplacian is the matrix L_G such that x^T L_G x = Σ_{st∈E} u_st (x_s - x_t)^2 for all x ∈ R^V. L_G is positive semidefinite since this quantity is ≥ 0.
Example (electrical networks): view edge st as a resistor of resistance 1/u_st, and impose voltage x_v at every vertex v. By Ohm's power law P = V^2/R, the power consumed on edge st is u_st (x_s - x_t)^2, so the total power consumed is x^T L_G x.
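The Laplacian identity above is easy to check directly. A minimal sketch, not from the talk (the toy triangle graph and helper names are mine), building a dense Laplacian and verifying x^T L x = Σ_{st} u_st (x_s - x_t)^2:

```python
def laplacian(n, weighted_edges):
    """Return the n x n Laplacian of an undirected weighted graph,
    as a plain list-of-lists: degree terms on the diagonal, minus the
    edge weight off the diagonal."""
    L = [[0.0] * n for _ in range(n)]
    for s, t, u in weighted_edges:
        L[s][s] += u
        L[t][t] += u
        L[s][t] -= u
        L[t][s] -= u
    return L

def quad_form(L, x):
    """Compute x^T L x."""
    n = len(x)
    return sum(x[i] * L[i][j] * x[j] for i in range(n) for j in range(n))

edges = [(0, 1, 2.0), (1, 2, 1.0), (0, 2, 3.0)]   # (s, t, u_st): a toy triangle
L = laplacian(3, edges)
x = [1.0, -2.0, 0.5]

lhs = quad_form(L, x)
rhs = sum(u * (x[s] - x[t]) ** 2 for s, t, u in edges)
assert abs(lhs - rhs) < 1e-9    # the two expressions agree
assert lhs >= 0                 # consistent with L being PSD
```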
Spectral Sparsifiers (Spielman-Teng '04)
Input: an undirected graph G=(V,E) with weights u : E → R+.
Def: the Laplacian is the matrix L_G such that x^T L_G x = Σ_{st∈E} u_st (x_s - x_t)^2 for all x ∈ R^V.
Output: a subgraph H=(V,F) of G with weights w : F → R+ such that |F| is small and
    x^T L_H x = (1 ± ε) x^T L_G x   for all x ∈ R^V.
Spectral sparsifier ⇒ cut sparsifier: restricting to {0,1}-vectors x (indicators of U) gives w(δ_H(U)) = (1 ± ε) u(δ_G(U)) for all U ⊆ V.
Cut vs Spectral Sparsifiers
Number of constraints:
- Cut: w(δ_H(U)) = (1 ± ε) u(δ_G(U)) for all U ⊆ V (2^n constraints)
- Spectral: x^T L_H x = (1 ± ε) x^T L_G x for all x ∈ R^V (infinitely many constraints)
The spectral constraints are SDP feasibility constraints:
    (1-ε) x^T L_G x ≤ x^T L_H x ≤ (1+ε) x^T L_G x for all x ∈ R^V
    ⟺ (1-ε) L_G ⪯ L_H ⪯ (1+ε) L_G,
where X ⪯ Y means Y-X is positive semidefinite.
The spectral constraints are actually easier to handle: checking "is H a spectral sparsifier of G?" is in P, whereas checking "is H a cut sparsifier of G?" is non-uniform sparsest cut, and hence NP-hard.
Application of Spectral Sparsifiers
Consider the linear system L_G x = b. The actual solution is x := L_G^{-1} b (with L_G^{-1} interpreted as the pseudoinverse). Instead, compute y := L_H^{-1} b, where H is a spectral sparsifier of G.
Since (1-ε) L_G ⪯ L_H ⪯ (1+ε) L_G, the vector y has low multiplicative error: ‖y - x‖_{L_G} ≤ 2ε ‖x‖_{L_G}.
Computing y is fast since H is sparse: the conjugate gradient method takes O(n·|F|) time (where |F| = # nonzero entries of L_H).
Theorem [Spielman-Teng '04, Koutis-Miller-Peng '10]: one can compute a vector y with low multiplicative error in O(m log n (log log n)^2) time (m = # edges of G).
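A bare-bones conjugate gradient solve, as a hedged sketch of why sparsity helps (pure Python with a dense matvec for simplicity; a real solver uses a sparse matvec, which is where the |F| factor comes from, plus the preconditioning behind the theorem). A Laplacian is singular with the all-ones vector in its kernel, so we require b ⊥ 1; CG started at 0 then converges within the range of L.

```python
def matvec(L, x):
    return [sum(Li[j] * x[j] for j in range(len(x))) for Li in L]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def conjugate_gradient(L, b, iters=50, tol=1e-12):
    """Solve L y = b for symmetric PSD L, assuming b is orthogonal to
    the kernel of L (for a connected graph, the all-ones vector)."""
    n = len(b)
    y = [0.0] * n
    r = b[:]            # residual b - L y (y = 0 initially)
    p = r[:]
    rs = dot(r, r)
    for _ in range(iters):
        Lp = matvec(L, p)
        alpha = rs / dot(p, Lp)
        y = [yi + alpha * pi for yi, pi in zip(y, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, Lp)]
        rs_new = dot(r, r)
        if rs_new < tol:
            break
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return y

# Laplacian of the path graph 0-1-2-3 with unit weights, assembled by hand.
L = [[ 1, -1,  0,  0],
     [-1,  2, -1,  0],
     [ 0, -1,  2, -1],
     [ 0,  0, -1,  1]]
b = [1.0, 0.0, 0.0, -1.0]   # sums to zero, so a solution exists
y = conjugate_gradient(L, b)
residual = [bi - ri for bi, ri in zip(b, matvec(L, y))]
assert max(abs(v) for v in residual) < 1e-6
```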
Results on Sparsifiers
Combinatorial constructions (cut sparsifiers): Karger '94, Benczur-Karger '96, Fung-Hariharan-Harvey-Panigrahi '11; (spectral sparsifiers): Spielman-Teng '04. These construct sparsifiers with n log^{O(1)} n / ε^2 edges, in nearly linear time.
Linear-algebraic constructions (spectral sparsifiers): Spielman-Srivastava '08, Batson-Spielman-Srivastava '09, de Carli Silva-Harvey-Sato '11. These construct sparsifiers with O(n/ε^2) edges, in poly(n) time.
Sparsifiers by Random Sampling
The complete graph is easy: random sampling gives an expander (i.e., a sparsifier) with O(n log n) edges.
Sparsifiers by Random Sampling
For general graphs we can't sample all edges with the same probability. Idea [BK '96]: sample low-connectivity edges with high probability (keep these), and high-connectivity edges with low probability (eliminate most of these).
Non-uniform Sampling Algorithm [BK '96]
Input: graph G=(V,E), weights u : E → R+.
Output: a subgraph H=(V,F) with weights w : F → R+.
  Choose a parameter ρ.
  Compute probabilities { p_e : e ∈ E }.
  For i = 1 to ρ:
    For each edge e ∈ E:
      With probability p_e, add e to F and increase w_e by u_e/(ρ p_e).
Note: E[|F|] ≤ ρ Σ_e p_e, and E[w_e] = u_e for every e ∈ E, so for every U ⊆ V, E[w(δ_H(U))] = u(δ_G(U)).
Question: can we do this so that the cut values are tightly concentrated and E[|F|] = n log^{O(1)} n?
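The sampling loop above is simple enough to write down directly. A minimal sketch (the function name and toy inputs are mine; edge probabilities are taken as given):

```python
import random

def sparsify_by_sampling(edges, u, p, rho, seed=0):
    """Generic non-uniform sampling loop from the slide.

    edges: list of edge ids; u: dict of weights; p: dict of sampling
    probabilities; rho: number of rounds.  Returns weights w with
    E[w_e] = u_e, since each round contributes u_e/(rho*p_e) with
    probability p_e."""
    rng = random.Random(seed)
    w = {e: 0.0 for e in edges}
    for _ in range(rho):
        for e in edges:
            if rng.random() < p[e]:
                w[e] += u[e] / (rho * p[e])
    return {e: we for e, we in w.items() if we > 0}

# With rho large, each w_e concentrates around u_e.
w = sparsify_by_sampling(['a', 'b'], u={'a': 1.0, 'b': 2.0},
                         p={'a': 0.5, 'b': 0.25}, rho=4000)
assert abs(w['a'] - 1.0) < 0.1 and abs(w['b'] - 2.0) < 0.3
```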
Benczur-Karger '96
In the sampling algorithm above, set ρ = O(log n / ε^2) and p_e = 1/(strength of edge e). Then cuts are preserved to within (1 ± ε) and E[|F|] = O(n log n / ε^2). All strengths can be approximated in m log^{O(1)} n time.
But what is strength? Can't we use connectivity instead?
Fung-Hariharan-Harvey-Panigrahi '11
In the same sampling algorithm, set ρ = O(log^2 n / ε^2) and p_st = 1/(min cut separating s and t). Then cuts are preserved to within (1 ± ε) and E[|F|] = O(n log^2 n / ε^2). All these connectivities can be approximated in O(m + n log n) time.
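The quantity behind p_st = 1/(min cut separating s and t) can be illustrated with a textbook max-flow computation. A minimal Edmonds-Karp sketch for small unweighted graphs (the example graph is mine, and FHHP estimate connectivities far faster than this exact computation):

```python
from collections import deque

def min_st_cut(n, edges, s, t):
    """Edge connectivity between s and t in an unweighted undirected
    graph, via Edmonds-Karp max-flow (each edge gets capacity 1 in
    both directions)."""
    cap = [[0] * n for _ in range(n)]
    for a, b in edges:
        cap[a][b] += 1
        cap[b][a] += 1
    flow = 0
    while True:
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:     # BFS for a shortest augmenting path
            v = q.popleft()
            for w in range(n):
                if cap[v][w] > 0 and parent[w] == -1:
                    parent[w] = v
                    q.append(w)
        if parent[t] == -1:
            return flow                  # no augmenting path: flow = min cut
        v = t
        while v != s:                    # augment by one unit along the path
            cap[parent[v]][v] -= 1
            cap[v][parent[v]] += 1
            v = parent[v]
        flow += 1

# Two triangles joined by a single bridge: connectivity across the
# bridge is 1; within a triangle it is 2.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
assert min_st_cut(6, edges, 0, 5) == 1
assert min_st_cut(6, edges, 0, 1) == 2
```

So the bridge edge would be sampled with probability 1, while the well-connected triangle edges can be sampled at half that rate.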
Overview of Analysis
Most cuts hit a huge number of edges ⇒ extremely concentrated ⇒ whp, most cuts are close to their mean.
Overview of Analysis
Low-connectivity edges get a high sampling probability: a cut hitting only one such edge would be poorly concentrated, but this doesn't happen often; we need a bound on the number of small cuts (Karger '94).
High-connectivity edges get a low sampling probability: a cut hitting many such edges is only reasonably concentrated, but the same cut also hits many low-connectivity (well-sampled) edges, making it highly concentrated overall; the bad case doesn't happen often, and here we need a bound on the number of small Steiner cuts (Fung-Hariharan-Harvey-Panigrahi '11).
Summary for Cut Sparsifiers
- Do non-uniform sampling of edges, with probabilities based on connectivity.
- Decompose the graph into connectivity classes and argue concentration of all cuts; this needs bounds on the number of small cuts.
- BK '96 used strength, not connectivity.
- This yields sparsifiers with O(n log n / ε^2) edges, which is optimal for any independent sampling algorithm.
Spectral Sparsification
Input: graph G=(V,E), weights u : E → R+.
Recall x^T L_G x = Σ_{st∈E} u_st (x_s - x_t)^2; call the term u_st (x_s - x_t)^2 = x^T L_st x.
Goal: find weights w : E → R+ such that most w_e are zero, and
    (1-ε) x^T L_G x ≤ Σ_{e∈E} w_e x^T L_e x ≤ (1+ε) x^T L_G x for all x ∈ R^V,
    i.e., (1-ε) L_G ⪯ Σ_{e∈E} w_e L_e ⪯ (1+ε) L_G.
General problem: given matrices L_e satisfying Σ_e L_e = L_G, find coefficients w_e, mostly zero, such that (1-ε) L_G ⪯ Σ_e w_e L_e ⪯ (1+ε) L_G.
The General Problem: Sparsifying Sums of PSD Matrices
General problem: given PSD matrices L_e such that Σ_e L_e = L, find coefficients w_e, mostly zero, such that (1-ε) L ⪯ Σ_e w_e L_e ⪯ (1+ε) L.
Theorem [Ahlswede-Winter '02]: random sampling gives w with O(n log n / ε^2) non-zeros.
Theorem [de Carli Silva-Harvey-Sato '11], building on [Batson-Spielman-Srivastava '09]: a deterministic algorithm gives w with O(n / ε^2) non-zeros.
Consequences: cut & spectral sparsifiers with O(n/ε^2) edges [BSS '09]; sparsifiers with more properties and O(n/ε^2) edges [dHS '11].
Vector Case
Vector problem: given vectors v_1, …, v_m ∈ [0,1]^n, let v = Σ_i v_i / m. Find coefficients w_e, mostly zero, such that ‖Σ_e w_e v_e - v‖_∞ ≤ ε.
Theorem [Althofer '94, Lipton-Young '94]: there is such a w with O(log n / ε^2) non-zeros. Proof: random sampling & the Hoeffding inequality.
⇒ ε-approximate equilibria with O(log n / ε^2) support in zero-sum games.
Multiplicative version: there is a w with O(n log n / ε^2) non-zeros such that (1-ε) v ≤ Σ_e w_e v_e ≤ (1+ε) v coordinatewise.
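The random-sampling proof in the vector case can be run directly. A hedged sketch (the constant 8 in the sample count and the toy data are my illustrative choices):

```python
import math
import random

def sparse_average(vectors, eps, seed=1):
    """Randomized proof sketch of Althofer / Lipton-Young: sample
    k = O(log n / eps^2) of the vectors uniformly and average them.
    By Hoeffding plus a union bound over the n coordinates, this
    approximates v = average(vectors) in the infinity norm."""
    n = len(vectors[0])
    k = max(1, int(8 * math.log(n) / eps ** 2))
    rng = random.Random(seed)
    picks = [rng.choice(vectors) for _ in range(k)]
    return [sum(p[j] for p in picks) / k for j in range(n)]

rng = random.Random(42)
vectors = [[rng.random() for _ in range(10)] for _ in range(2000)]
v = [sum(vec[j] for vec in vectors) / len(vectors) for j in range(10)]
approx = sparse_average(vectors, eps=0.2)
err = max(abs(a - b) for a, b in zip(approx, v))
assert err <= 0.2   # support size is O(log n / eps^2), independent of m = 2000
```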
Concentration Inequalities
Theorem [Chernoff '52, Hoeffding '63]: let Y_1, …, Y_k be i.i.d. random non-negative real numbers with E[Y_i] = Z and Y_i ≤ uZ. Then
    Pr[ (1/k) Σ_i Y_i ∉ [(1-ε)Z, (1+ε)Z] ] ≤ 2 exp(-Ω(k ε^2 / u)).
Theorem [Ahlswede-Winter '02]: let Y_1, …, Y_k be i.i.d. random PSD n×n matrices with E[Y_i] = Z and Y_i ⪯ uZ. Then
    Pr[ not (1-ε)Z ⪯ (1/k) Σ_i Y_i ⪯ (1+ε)Z ] ≤ 2n exp(-Ω(k ε^2 / u)).
The only difference is the extra factor of n in the failure probability.
Balls & Bins Example
Problem: throw k balls into n bins. We want (max load)/(min load) ≤ 1+ε. How big should k be?
AW theorem: let Y_1, …, Y_k be i.i.d. random PSD matrices such that E[Y_i] = Z and Y_i ⪯ uZ; then (1/k) Σ_i Y_i ≈ Z whp, for suitable k.
Solution: let Y_i be all zeros, except for a single n in a random diagonal entry. Then E[Y_i] = I and Y_i ⪯ nI. Set k = Θ(n log n / ε^2). Whp, every diagonal entry of Σ_i Y_i / k is in [1-ε, 1+ε].
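This example is cheap to simulate. A sketch (the constant 8 in k is my illustrative choice, not the one from the analysis):

```python
import math
import random

def balls_and_bins(n, eps, seed=0):
    """Throw k = Theta(n log n / eps^2) balls into n bins uniformly and
    return the loads scaled so each entry has expectation 1 (these are
    the diagonal entries of sum_i Y_i / k in the slide's setup)."""
    k = int(8 * n * math.log(n) / eps ** 2)
    rng = random.Random(seed)
    loads = [0] * n
    for _ in range(k):
        loads[rng.randrange(n)] += 1
    return [n * load / k for load in loads]

norm = balls_and_bins(50, eps=0.5)
assert all(1 - 0.5 <= v <= 1 + 0.5 for v in norm)   # whp, per the AW bound
```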
Solving the General Problem
General problem: given PSD matrices L_e such that Σ_e L_e = L, find coefficients w_e, mostly zero, such that (1-ε) L ⪯ Σ_e w_e L_e ⪯ (1+ε) L.
By the AW theorem, we can solve the general problem with O(n log n / ε^2) non-zeros:
  Repeat k := Θ(n log n / ε^2) times:
    Pick an edge e with probability p_e := Tr(L_e L^{-1}) / n.
    Increment w_e by 1/(k p_e).
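The sampling procedure can be demonstrated in the special case of diagonal PSD matrices, where L^{-1} and the traces are trivial to compute in pure Python (the diagonal restriction, the constant 32 in k, and the toy data are my illustrative choices):

```python
import math
import random

def sample_sparsifier(mats, eps, seed=0):
    """Ahlswede-Winter-style sampling specialized to diagonal PSD
    matrices, each stored as a list of non-negative floats.  Pick each
    sample e with probability p_e = Tr(L_e L^{-1}) / n and add
    1/(k p_e) to its coefficient."""
    n = len(mats[0])
    L = [sum(m[j] for m in mats) for j in range(n)]               # L = sum_e L_e
    p = [sum(m[j] / L[j] for j in range(n)) / n for m in mats]    # sums to 1
    k = int(32 * n * math.log(n) / eps ** 2)
    rng = random.Random(seed)
    w = [0.0] * len(mats)
    for e in rng.choices(range(len(mats)), weights=p, k=k):
        w[e] += 1.0 / (k * p[e])
    return w, L

rng = random.Random(7)
mats = [[rng.random() + 0.1 for _ in range(5)] for _ in range(5000)]
w, L = sample_sparsifier(mats, eps=0.3)
S = [sum(w[e] * mats[e][j] for e in range(len(mats)) if w[e] > 0)
     for j in range(5)]
assert all(abs(S[j] / L[j] - 1.0) <= 0.3 for j in range(5))   # (1 +/- eps) L
assert sum(1 for x in w if x > 0) < len(mats)                 # most w_e are zero
```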
Derandomization
Vector problem: given vectors v_e ∈ [0,1]^n such that Σ_e v_e = v, find coefficients w_e, mostly zero, such that ‖Σ_e w_e v_e - v‖_∞ ≤ ε.
Theorem [Young '94]: the multiplicative weights method deterministically gives w with O(log n / ε^2) non-zeros. (Alternatively, use pessimistic estimators on the Hoeffding proof.)
General problem: given PSD matrices L_e such that Σ_e L_e = L, find coefficients w_e, mostly zero, such that (1-ε) L ⪯ Σ_e w_e L_e ⪯ (1+ε) L.
Theorem [de Carli Silva-Harvey-Sato '11]: the matrix multiplicative weights method (Arora-Kale '07) deterministically gives w with O(n log n / ε^2) non-zeros. (Alternatively, use matrix pessimistic estimators (Wigderson-Xiao '06).)
MWUM for Balls & Bins
Let λ_i be the load in bin i; initially λ = 0. We maintain barriers with l ≤ λ_i and λ_i ≤ u for all i. Introduce penalty functions Σ_i exp(l - λ_i) and Σ_i exp(λ_i - u).
Find a bin i to throw a ball into such that, after increasing l by δ_l and u by δ_u, neither penalty grows:
    Σ_i exp((l+δ_l) - λ'_i) ≤ Σ_i exp(l - λ_i)   and   Σ_i exp(λ'_i - (u+δ_u)) ≤ Σ_i exp(λ_i - u).
Careful analysis shows O(n log n / ε^2) balls are enough.
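A hedged sketch of this penalty-driven placement (the initial barriers, drift rates, and tie-breaking here are my illustrative choices, not the parameters from the analysis): each ball goes to the bin that keeps the sum of the two exponential penalties smallest, while the barriers drift upward.

```python
import math

def throw_balls(n, k, dl, du):
    """MWUM-style balls & bins: greedily place each ball to minimize
    the combined exponential penalties, as the barriers l and u drift
    up by dl and du per ball."""
    loads = [0] * n
    l, u = -float(n), float(n)      # generous starting barriers
    for _ in range(k):
        def penalty(i):
            # Penalties if the next ball lands in bin i and both
            # barriers take their step.
            loads[i] += 1
            val = (sum(math.exp(l + dl - x) for x in loads)
                   + sum(math.exp(x - (u + du)) for x in loads))
            loads[i] -= 1
            return val
        best = min(range(n), key=penalty)
        loads[best] += 1
        l += dl
        u += du
    return loads

loads = throw_balls(10, 500, dl=0.1, du=0.1)
assert max(loads) - min(loads) <= 1   # the penalties force near-perfect balance
```

Both penalties favor the least-loaded bin, so this greedy rule keeps all loads within one ball of each other.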
MMWUM for the General Problem
Let A = 0 and let λ denote its eigenvalues. We want l ≤ λ_i and λ_i ≤ u for all i. Use penalty functions Tr exp(lI - A) and Tr exp(A - uI).
Find a matrix L_e such that, adding L_e to A and increasing l by δ_l and u by δ_u, neither penalty grows:
    Tr exp((l+δ_l)I - (A + L_e)) ≤ Tr exp(lI - A)   and   Tr exp((A + L_e) - (u+δ_u)I) ≤ Tr exp(A - uI).
Careful analysis shows O(n log n / ε^2) matrices are enough.
Beating Sampling & MMWUM
To get a better bound, make the penalty functions steeper: use Tr (A - lI)^{-1} and Tr (uI - A)^{-1}.
Find a matrix L_e such that, adding L_e to A and increasing l by δ_l and u by δ_u, neither penalty grows:
    Tr ((A + L_e) - (l+δ_l)I)^{-1} ≤ Tr (A - lI)^{-1}   and   Tr ((u+δ_u)I - (A + L_e))^{-1} ≤ Tr (uI - A)^{-1}.
These barriers keep all eigenvalues within [l, u].
General problem: given PSD matrices L_e such that Σ_e L_e = L, find coefficients w_e, mostly zero, such that (1-ε) L ⪯ Σ_e w_e L_e ⪯ (1+ε) L.
Theorem [Batson-Spielman-Srivastava '09 in the rank-1 case; de Carli Silva-Harvey-Sato '11 for the general case]: this gives a solution w with O(n / ε^2) non-zeros.
Applications
Theorem [de Carli Silva-Harvey-Sato '11]: given PSD matrices L_e such that Σ_e L_e = L, there is an algorithm to find w with O(n / ε^2) non-zeros such that (1-ε) L ⪯ Σ_e w_e L_e ⪯ (1+ε) L.
Application 1 (spectral sparsifiers with costs): given costs on the edges of G, one can find a sparsifier H whose cost is at most (1+ε) times the cost of G.
Application 2 (sparse SDP solutions): min { c^T y : Σ_i y_i A_i ⪰ B, y ≥ 0 }, where the A_i and B are PSD, has a nearly optimal solution with O(n / ε^2) non-zeros.
Open Questions
- Sparsifiers for directed graphs
- More constructions of sparsifiers with O(n/ε^2) edges (perhaps randomized?)
- Iterative construction of expander graphs
- More control over the weights w_e
- A combinatorial proof of spectral sparsifiers
- More applications of our general theorem