Graph Sparsifiers: A Survey

Graph Sparsifiers: A Survey. Nick Harvey, UBC. Based on work by: Batson, Benczúr, de Carli Silva, Fung, Hariharan, Harvey, Karger, Panigrahi, Sato, Spielman, Srivastava, and Teng.

Approximating Dense Objects by Sparse Ones. Familiar examples: floor joists (a sparse structure standing in for solid material) and image compression.

Approximating Dense Graphs by Sparse Ones. Spanners: approximate all distances to within a factor α using only O(n^{1+2/α}) edges (n = # vertices). Low-stretch trees: approximate most distances to within a factor O(log n) using only n−1 edges.

Overview:
- Definitions: cut & spectral sparsifiers
- Applications
- Cut sparsifiers
- Spectral sparsifiers
- A random sampling construction
- Derandomization

Cut Sparsifiers (Karger '94). Input: an undirected graph G=(V,E) with weights u : E → ℝ₊. Output: a subgraph H=(V,F) of G with weights w : F → ℝ₊ such that |F| is small and w(δ_H(U)) = (1±ε)·u(δ_G(U)) for all U ⊆ V, where δ_H(U) denotes the edges of H between U and V∖U, and δ_G(U) the corresponding edges of G.
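
To make the definition concrete, here is a minimal Python sketch (not part of the original talk; graph and names are illustrative) that evaluates the cut weights w(δ(U)) on a toy graph, enumerating all 2^n subsets U:

```python
from itertools import combinations

def cut_weight(edges, weights, U):
    """Total weight of edges with exactly one endpoint in U."""
    return sum(w for (s, t), w in zip(edges, weights)
               if (s in U) != (t in U))

# Toy graph: a 4-cycle with unit weights.
V = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
weights = [1.0, 1.0, 1.0, 1.0]

# A cut sparsifier must preserve all 2^n cut values to within (1 +/- eps);
# enumerating them directly is feasible only for tiny n.
for r in range(1, len(V)):
    for U in combinations(V, r):
        print(set(U), cut_weight(edges, weights, set(U)))
```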

Generic Application of Cut Sparsifiers. Given a dense input graph G and a (slow) algorithm A for some problem P (min s-t cut, sparsest cut, max cut, ...), first run an (efficient) sparsification algorithm S to obtain a sparse graph H that approximately preserves the solution of P, then run A on H. A is now faster, and its exact or approximate output on H is an approximate output for G.

Relation to Expander Graphs. A graph H on V is an expander if, for some constant c, |δ_H(U)| ≥ c·|U| for all U ⊆ V with |U| ≤ n/2. Let G be the complete graph on V. If we give all edges of H weight w = n, then w(δ_H(U)) ≥ c·n·|U| ≈ c·|δ_G(U)| for all U ⊆ V with |U| ≤ n/2. So expanders are essentially sparsifiers of the complete graph.

Relation to Expander Graphs. Simple random construction: the Erdős-Rényi graph G_{n,p} is an expander with high probability if p = Ω(log(n)/n). This gives an expander with Θ(n log n) edges with high probability. But aren't there much better expanders?

Spectral Sparsifiers (Spielman-Teng '04). Input: an undirected graph G=(V,E) with weights u : E → ℝ₊. Definition: the Laplacian of G is the matrix L_G such that x^T L_G x = Σ_{st∈E} u_st (x_s − x_t)² for all x ∈ ℝ^V. L_G is positive semidefinite since this sum is ≥ 0. Example (electrical networks): view edge st as a resistor of resistance 1/u_st and impose voltage x_v at every vertex v. By Ohm's power law P = V²/R, the power consumed on edge st is u_st (x_s − x_t)², so the total power consumed is x^T L_G x.
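
The quadratic form is easy to verify numerically. A minimal sketch (illustrative, not from the talk) that builds a weighted Laplacian, checks that x^T L_G x equals Σ u_st (x_s − x_t)², and confirms L_G is PSD:

```python
import numpy as np

def laplacian(n, edges, u):
    """Weighted Laplacian: x^T L x = sum_{st in E} u_st (x_s - x_t)^2."""
    L = np.zeros((n, n))
    for (s, t), w in zip(edges, u):
        L[s, s] += w; L[t, t] += w
        L[s, t] -= w; L[t, s] -= w
    return L

n = 4
edges, u = [(0, 1), (1, 2), (2, 3), (3, 0)], [1.0, 2.0, 1.0, 3.0]
L = laplacian(n, edges, u)

x = np.random.randn(n)                          # arbitrary voltages
power = sum(w * (x[s] - x[t]) ** 2 for (s, t), w in zip(edges, u))
assert np.isclose(x @ L @ x, power)             # total power = x^T L x
print(np.linalg.eigvalsh(L))                    # all eigenvalues >= 0
```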

Spectral Sparsifiers (Spielman-Teng '04). Output: a subgraph H=(V,F) of G with weights w : F → ℝ₊ such that |F| is small and x^T L_H x = (1±ε)·x^T L_G x for all x ∈ ℝ^V. Restricting to {0,1}-vectors x = 1_U gives w(δ_H(U)) = (1±ε)·u(δ_G(U)) for all U ⊆ V, so every spectral sparsifier is a cut sparsifier.

Cut vs Spectral Sparsifiers. Number of constraints: cut sparsifiers must satisfy w(δ_H(U)) = (1±ε)·u(δ_G(U)) for all U ⊆ V (2^n constraints), while spectral sparsifiers must satisfy x^T L_H x = (1±ε)·x^T L_G x for all x ∈ ℝ^V (infinitely many constraints). But the spectral constraints are SDP feasibility constraints: (1−ε)·x^T L_G x ≤ x^T L_H x ≤ (1+ε)·x^T L_G x for all x ∈ ℝ^V is equivalent to (1−ε)·L_G ⪯ L_H ⪯ (1+ε)·L_G, where X ⪯ Y means Y−X is positive semidefinite. So the spectral constraints are actually easier to handle: checking "Is H a spectral sparsifier of G?" is in P, while checking "Is H a cut sparsifier of G?" amounts to non-uniform sparsest cut, so it is NP-hard.
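
The membership-in-P claim is constructive: to check (1−ε)·L_G ⪯ L_H ⪯ (1+ε)·L_G it suffices to compute the eigenvalues of the pencil, e.g. of pinv(L_G)^{1/2}·L_H·pinv(L_G)^{1/2}. A hedged sketch on toy graphs of my choosing (not from the talk):

```python
import numpy as np
from scipy.linalg import sqrtm, pinvh

def laplacian(n, edges, u):
    L = np.zeros((n, n))
    for (s, t), w in zip(edges, u):
        L[s, s] += w; L[t, t] += w
        L[s, t] -= w; L[t, s] -= w
    return L

# G: a triangle with unit weights; H: a reweighted two-edge subgraph.
L_G = laplacian(3, [(0, 1), (1, 2), (2, 0)], [1.0, 1.0, 1.0])
L_H = laplacian(3, [(0, 1), (1, 2)], [1.5, 1.5])

# H is an eps-spectral sparsifier of G iff the nonzero eigenvalues
# below all lie in [1 - eps, 1 + eps].
P = np.real(sqrtm(pinvh(L_G)))          # pinv handles the all-ones nullspace
print(np.linalg.eigvalsh(P @ L_H @ P))  # here: 0, 0.5, 1.5  =>  eps = 0.5
```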

Application of Spectral Sparsifiers. Consider the linear system L_G x = b, with actual solution x := L_G⁻¹ b. Instead, compute y := L_H⁻¹ b, where H is a spectral sparsifier of G. Since (1−ε)·L_G ⪯ L_H ⪯ (1+ε)·L_G, the vector y has low multiplicative error: ‖y−x‖_{L_G} ≤ 2ε·‖x‖_{L_G}. Computing y is fast since H is sparse: the conjugate gradient method takes O(n·|F|) time, where |F| = # nonzero entries of L_H. Theorem [Spielman-Teng '04, Koutis-Miller-Peng '10]: a vector y with low multiplicative error can be computed in O(m log n (log log n)²) time, where m = # edges of G.
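
As a rough illustration of why sparsity helps, here is a hedged sketch (not the Spielman-Teng or Koutis-Miller-Peng solver, just plain conjugate gradient from scipy) solving L_H y = b for a sparse Laplacian; b is projected orthogonal to the all-ones nullspace so the singular system is consistent:

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import cg

def laplacian(n, edges, u):
    L = np.zeros((n, n))
    for (s, t), w in zip(edges, u):
        L[s, s] += w; L[t, t] += w
        L[s, t] -= w; L[t, s] -= w
    return L

n = 50
cycle = [(i, (i + 1) % n) for i in range(n)]        # a sparse graph H
L_H = csr_matrix(laplacian(n, cycle, [1.0] * n))

b = np.random.randn(n)
b -= b.mean()                     # orthogonal to the all-ones nullspace
y, info = cg(L_H, b)              # each CG iteration costs O(|F|) work
print(info, np.linalg.norm(L_H @ y - b))    # info == 0 means converged
```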

Results on Sparsifiers.
- Combinatorial techniques: Karger '94, Benczúr-Karger '96, Fung-Hariharan-Harvey-Panigrahi '11 (cut sparsifiers); Spielman-Teng '04 (spectral sparsifiers). These construct sparsifiers with n·log^{O(1)} n / ε² edges in nearly linear time.
- Linear-algebraic techniques: Spielman-Srivastava '08, Batson-Spielman-Srivastava '09, de Carli Silva-Harvey-Sato '11 (spectral, hence also cut). These construct sparsifiers with O(n/ε²) edges in poly(n) time.

Sparsifiers by Random Sampling. The complete graph is easy! Random sampling gives an expander (i.e., a sparsifier) with O(n log n) edges.

Sparsifiers by Random Sampling. For general graphs we can't sample all edges with the same probability: in a dense, well-connected region we want to eliminate most edges, while a lone bridge edge must be kept. Idea [BK '96]: sample low-connectivity edges with high probability and high-connectivity edges with low probability.

Non-Uniform Sampling Algorithm [BK '96]. Input: graph G=(V,E), weights u : E → ℝ₊. Output: a subgraph H=(V,F) with weights w : F → ℝ₊.

  Choose a parameter ρ and compute probabilities { p_e : e ∈ E }.
  For i = 1 to ρ:
    For each edge e ∈ E:
      With probability p_e: add e to F and increase w_e by u_e/(ρ·p_e).

Note that E[|F|] ≤ ρ·Σ_e p_e and E[w_e] = u_e for all e ∈ E, so for every U ⊆ V, E[w(δ_H(U))] = u(δ_G(U)). Question: can we choose ρ and the p_e so that the cut values are tightly concentrated and E[|F|] = n·log^{O(1)} n?
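
A minimal Python sketch of this skeleton (illustrative; p and ρ are left as inputs precisely because choosing them is the whole game):

```python
import random
from collections import defaultdict

def sample_sparsifier(edges, u, p, rho):
    """rho rounds; in each, keep edge e with probability p[e] and add
    u[e]/(rho*p[e]) to its weight, so that E[w_e] = u_e exactly."""
    w = defaultdict(float)
    for _ in range(rho):
        for e in edges:
            if random.random() < p[e]:
                w[e] += u[e] / (rho * p[e])
    return dict(w)          # F = kept edges, with weights w

# Example: uniform p (bad for general graphs, fine for dense ones).
edges = [(0, 1), (1, 2), (2, 0)]
u = {e: 1.0 for e in edges}
p = {e: 0.5 for e in edges}
print(sample_sparsifier(edges, u, p, rho=4))
```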

Benczúr-Karger '96. Run the sampling algorithm with ρ = O(log n/ε²) and p_e = 1/(strength of edge e). All edge strengths can be approximated in m·log^{O(1)} n time. Then all cuts are preserved to within (1±ε) and E[|F|] = O(n log n/ε²). But what is strength? Can't we use connectivity instead?

Fung-Hariharan-Harvey-Panigrahi '11. Run the sampling algorithm with ρ = O(log² n/ε²) and p_st = 1/(min cut separating s and t). All these connectivities can be approximated in O(m + n log n) time. Then all cuts are preserved to within (1±ε) and E[|F|] = O(n log² n/ε²).

Overview of Analysis. Most cuts hit a huge number of edges ⇒ extremely concentrated ⇒ with high probability, most cuts are close to their mean.

Overview of Analysis. High-connectivity edges get low sampling probability; low-connectivity edges get high sampling probability. A cut that hits only one low-probability edge is poorly concentrated, but this doesn't happen often: one needs a bound on the number of small cuts [Karger '94] and on the number of small Steiner cuts [Fung-Hariharan-Harvey-Panigrahi '11]. A cut that hits many low-probability edges is reasonably concentrated, and since the same cut also hits many high-probability edges, it is highly concentrated overall.

Summary for Cut Sparsifiers.
- Do non-uniform sampling of edges, with probabilities based on connectivity.
- Decompose the graph into connectivity classes and argue concentration of all cuts; this needs bounds on the number of small cuts.
- BK '96 used strength, not connectivity.
- This yields sparsifiers with O(n log n/ε²) edges, which is optimal for any independent sampling algorithm.

Spectral Sparsification. Input: graph G=(V,E), weights u : E → ℝ₊. Recall x^T L_G x = Σ_{st∈E} u_st (x_s − x_t)²; call the st-th term x^T L_st x. Goal: find weights w : E → ℝ₊ such that most w_e are zero and (1−ε)·x^T L_G x ≤ Σ_{e∈E} w_e·x^T L_e x ≤ (1+ε)·x^T L_G x for all x ∈ ℝ^V, i.e., (1−ε)·L_G ⪯ Σ_{e∈E} w_e L_e ⪯ (1+ε)·L_G. General problem: given matrices L_e satisfying Σ_e L_e = L_G, find coefficients w_e, mostly zero, such that (1−ε)·L_G ⪯ Σ_e w_e L_e ⪯ (1+ε)·L_G.

The General Problem: Sparsifying Sums of PSD Matrices. Given PSD matrices L_e such that Σ_e L_e = L, find coefficients w_e, mostly zero, such that (1−ε)·L ⪯ Σ_e w_e L_e ⪯ (1+ε)·L. Theorem [Ahlswede-Winter '02]: random sampling gives w with O(n log n/ε²) non-zeros. Theorem [de Carli Silva-Harvey-Sato '11], building on [Batson-Spielman-Srivastava '09]: a deterministic algorithm gives w with O(n/ε²) non-zeros. Consequences: cut & spectral sparsifiers with O(n/ε²) edges [BSS '09]; sparsifiers with additional properties and O(n/ε²) edges [dCHS '11].

Vector Case. Vector problem: given vectors v_1, …, v_m ∈ [0,1]^n, let v = Σ_i v_i/m. Find coefficients w_e, mostly zero, such that ‖Σ_e w_e v_e − v‖_∞ ≤ ε. Theorem [Althöfer '94, Lipton-Young '94]: there is a w with O(log n/ε²) non-zeros. Proof: random sampling and the Hoeffding inequality. This yields ε-approximate equilibria with O(log n/ε²) support in zero-sum games. Multiplicative version: there is a w with O(n log n/ε²) non-zeros such that (1−ε)·v ≤ Σ_e w_e v_e ≤ (1+ε)·v entrywise.
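
The existence proof is just sampling. A quick numerical illustration of the O(log n/ε²) bound for the additive version (parameters are my own):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, eps = 50, 100_000, 0.1
V = rng.random((m, n))                    # v_1, ..., v_m in [0,1]^n
v = V.mean(axis=0)

k = int(np.ceil(np.log(n) / eps ** 2))    # O(log n / eps^2) samples
sample = V[rng.integers(m, size=k)]       # w puts weight 1/k on each
print(np.abs(sample.mean(axis=0) - v).max())   # l_inf error, ~<= eps
```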

Concentration Inequalities. Theorem [Chernoff '52, Hoeffding '63]: let Y_1, …, Y_k be i.i.d. non-negative random reals with E[Y_i] = Z and Y_i ≤ u·Z. Then Pr[(1−ε)Z ≤ Σ_i Y_i/k ≤ (1+ε)Z] ≥ 1 − 2·exp(−Ω(kε²/u)). Theorem [Ahlswede-Winter '02]: let Y_1, …, Y_k be i.i.d. random PSD n×n matrices with E[Y_i] = Z and Y_i ⪯ u·Z. Then Pr[(1−ε)Z ⪯ Σ_i Y_i/k ⪯ (1+ε)Z] ≥ 1 − 2n·exp(−Ω(kε²/u)). The only difference is the dimension factor n.

Balls & Bins Example. Problem: throw k balls into n bins; we want (max load)/(min load) ≤ 1+ε. How big should k be? AW theorem: let Y_1, …, Y_k be i.i.d. random PSD matrices with E[Y_i] = Z and Y_i ⪯ u·Z; then Σ_i Y_i/k is within (1±ε) of Z with high probability. Solution: let Y_i be all zeros except a single n in a uniformly random diagonal entry. Then E[Y_i] = I and Y_i ⪯ n·I. Set k = Θ(n log n/ε²). With high probability, every diagonal entry of Σ_i Y_i/k is in [1−ε, 1+ε].
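
A quick simulation of this claim (constants chosen loosely by me):

```python
import numpy as np

rng = np.random.default_rng(1)
n, eps = 100, 0.25
k = int(4 * n * np.log(n) / eps ** 2)     # k = Theta(n log n / eps^2)
loads = np.bincount(rng.integers(n, size=k), minlength=n)

diag = loads * n / k                      # diagonal of sum_i Y_i / k
print(diag.min(), diag.max())             # w.h.p. inside [1-eps, 1+eps]
```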

Solving the General Problem. Given PSD matrices L_e such that Σ_e L_e = L, find coefficients w_e, mostly zero, with (1−ε)·L ⪯ Σ_e w_e L_e ⪯ (1+ε)·L. By the AW theorem, the following solves the general problem with O(n log n/ε²) non-zeros: repeat k := Θ(n log n/ε²) times: pick an edge e with probability p_e := Tr(L_e·L⁻¹)/n and increment w_e by 1/(k·p_e).
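
For graph Laplacians L is singular, so L⁻¹ becomes the pseudoinverse L⁺, and Tr(L_e L⁺) = u_e times the effective resistance of edge e; this is exactly the Spielman-Srivastava '08 sampling distribution. A hedged sketch (normalizing by the sum of traces, which here is rank(L) = n−1 rather than n):

```python
import numpy as np

def laplacian(n, edges, u):
    L = np.zeros((n, n))
    for (s, t), w in zip(edges, u):
        L[s, s] += w; L[t, t] += w
        L[s, t] -= w; L[t, s] -= w
    return L

n = 4
edges, u = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)], [1, 1, 1, 1, 2]
Lp = np.linalg.pinv(laplacian(n, edges, u))

# Tr(L_e L^+) = u_e * effective resistance between the endpoints of e.
p = np.array([w * (Lp[s, s] + Lp[t, t] - 2 * Lp[s, t])
              for (s, t), w in zip(edges, u)])
p /= p.sum()          # the traces sum to rank(L) = n - 1
print(p)              # the sampling distribution over edges
```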

Derandomization. Vector problem: given vectors v_e ∈ [0,1]^n with Σ_e v_e = v, find coefficients w_e, mostly zero, such that ‖Σ_e w_e v_e − v‖_∞ ≤ ε. Theorem [Young '94]: the multiplicative weights method deterministically gives w with O(log n/ε²) non-zeros; alternatively, use pessimistic estimators on the Hoeffding proof. General problem: given PSD matrices L_e with Σ_e L_e = L, find w_e, mostly zero, such that (1−ε)·L ⪯ Σ_e w_e L_e ⪯ (1+ε)·L. Theorem [de Carli Silva-Harvey-Sato '11]: the matrix multiplicative weights method (Arora-Kale '07) deterministically gives w with O(n log n/ε²) non-zeros; alternatively, use matrix pessimistic estimators (Wigderson-Xiao '06).

MWUM for Balls & Bins. Let λ_i be the load in bin i; initially λ = 0. We want l ≤ λ_i ≤ u for all i. Introduce the penalty functions Σ_i exp(l − λ_i) and Σ_i exp(λ_i − u). Find a bin i to throw a ball into such that, after increasing l by δ_l and u by δ_u, neither penalty grows: Σ_i exp((l+δ_l) − λ_i) ≤ Σ_i exp(l − λ_i) and Σ_i exp(λ_i − (u+δ_u)) ≤ Σ_i exp(λ_i − u). A careful analysis shows O(n log n/ε²) balls suffice. (Picture: the loads stay between the barriers l and u, which march rightward.)

MMWUM for the General Problem. Let A = 0, with eigenvalues λ_i. We want l ≤ λ_i ≤ u for all i. Use the penalty functions Tr exp(lI − A) and Tr exp(A − uI). Find a matrix L_e such that, adding L_e to A and increasing l by δ_l and u by δ_u, neither penalty grows: Tr exp((l+δ_l)I − (A+L_e)) ≤ Tr exp(lI − A) and Tr exp((A+L_e) − (u+δ_u)I) ≤ Tr exp(A − uI). A careful analysis shows O(n log n/ε²) matrices suffice.
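
A tiny numerical check of this update rule (toy matrices and step sizes of my choosing), accepting a candidate only if both soft-max penalties are non-increasing:

```python
import numpy as np
from scipy.linalg import expm

def penalties(A, l, u):
    """Soft-max penalties Tr exp(lI - A) and Tr exp(A - uI)."""
    I = np.eye(A.shape[0])
    return np.trace(expm(l * I - A)), np.trace(expm(A - u * I))

A, l, u = np.zeros((3, 3)), -1.0, 1.0
Le, dl, du = 0.3 * np.eye(3), 0.1, 0.4    # toy PSD candidate and steps

before = penalties(A, l, u)
after = penalties(A + Le, l + dl, u + du)
print(all(a <= b for a, b in zip(after, before)))   # True: accept L_e
```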

Beating Sampling & MMWUM. To get a better bound, change the penalty functions to be steeper: use the barrier functions Tr (A − lI)⁻¹ and Tr (uI − A)⁻¹. Find a matrix L_e such that, adding L_e to A and increasing l by δ_l and u by δ_u, neither penalty grows: Tr ((A+L_e) − (l+δ_l)I)⁻¹ ≤ Tr (A − lI)⁻¹ and Tr ((u+δ_u)I − (A+L_e))⁻¹ ≤ Tr (uI − A)⁻¹. Then all eigenvalues stay within [l, u]. Theorem [Batson-Spielman-Srivastava '09] for the rank-1 case, [de Carli Silva-Harvey-Sato '11] for the general case: this gives a solution w with O(n/ε²) non-zeros.
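
The same check with the steeper barrier potentials (again a toy instance of my choosing); the potentials are finite only while the eigenvalues of A stay strictly inside (l, u), which is what forces them to remain there:

```python
import numpy as np

def barriers(A, l, u):
    """BSS barrier potentials Tr (A - lI)^{-1} and Tr (uI - A)^{-1}."""
    I = np.eye(A.shape[0])
    return (np.trace(np.linalg.inv(A - l * I)),
            np.trace(np.linalg.inv(u * I - A)))

A, l, u = np.zeros((3, 3)), -1.0, 1.0
Le, dl, du = 0.3 * np.eye(3), 0.1, 0.5    # toy PSD candidate and steps

print(barriers(A, l, u))                  # (3.0, 3.0)
print(barriers(A + Le, l + dl, u + du))   # (2.5, 2.5): neither grew
```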

Applications. Theorem [de Carli Silva-Harvey-Sato '11]: given PSD matrices L_e such that Σ_e L_e = L, there is an algorithm to find w with O(n/ε²) non-zeros such that (1−ε)·L ⪯ Σ_e w_e L_e ⪯ (1+ε)·L. Application 1 (spectral sparsifiers with costs): given costs on the edges of G, one can find a sparsifier H whose cost is at most (1+ε) times the cost of G. Application 2 (sparse SDP solutions): the SDP min { c^T y : Σ_i y_i A_i ⪰ B, y ≥ 0 }, where the A_i and B are PSD, has a nearly optimal solution with O(n/ε²) non-zeros.

Open Questions.
- Sparsifiers for directed graphs.
- More constructions of sparsifiers with O(n/ε²) edges, perhaps randomized.
- Iterative construction of expander graphs.
- More control over the weights w_e.
- A combinatorial proof of spectral sparsifiers.
- More applications of our general theorem.