An SDP-Based Algorithm for Linear-Sized Spectral Sparsification
Yin Tat Lee
Microsoft Research
Redmond, USA

ABSTRACT

For any undirected and weighted graph G = (V, E, w) with n vertices and m edges, we call a sparse subgraph H of G, with proper reweighting of the edges, a (1+ε)-spectral sparsifier if

  (1−ε)·xᵀL_G x ≤ xᵀL_H x ≤ (1+ε)·xᵀL_G x

holds for any x ∈ ℝⁿ, where L_G and L_H are the respective Laplacian matrices of G and H. Noticing that Ω(m) time is needed for any algorithm to construct a spectral sparsifier, and that a spectral sparsifier of G requires Ω(n) edges, a natural question is to investigate, for any constant ε, whether a (1+ε)-spectral sparsifier of G with O(n) edges can be constructed in Õ(m) time, where the Õ notation suppresses polylogarithmic factors. All previous constructions of spectral sparsifiers require either a super-linear number of edges or m^{1+Ω(1)} time. In this work we answer this question affirmatively by presenting an algorithm that, for any undirected graph G and ε > 0, outputs a (1+ε)-spectral sparsifier of G with O(n/ε²) edges in Õ(m/ε^{O(1)}) time. Our algorithm is based on three novel techniques: (1) a new potential function which is much easier to compute yet has similar guarantees as the potential functions used in previous references; (2) an efficient reduction from a two-sided spectral sparsifier to a one-sided spectral sparsifier; (3) constructing a one-sided spectral sparsifier by a semi-definite program.

CCS CONCEPTS
• Mathematics of computing → Probabilistic algorithms; • Theory of computation → Sparsification and spanners;

KEYWORDS
spectral graph theory, spectral sparsification

ACM Reference format:
Yin Tat Lee and He Sun. 2017. An SDP-Based Algorithm for Linear-Sized Spectral Sparsification. In Proceedings of the 49th Annual ACM SIGACT Symposium on the Theory of Computing (STOC'17), Montreal, Canada, June 2017, 10 pages.
DOI: 10.1145/

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org. STOC'17, Montreal, Canada. © 2017 ACM. .../17/06...$15.00. DOI: 10.1145/

He Sun
University of Bristol
Bristol, UK
h.sun@bristol.ac.uk

1 INTRODUCTION

A sparse graph is one whose number of edges is reasonably viewed as being proportional to the number of vertices. Since most algorithms run faster on sparse instances of graphs, and it is more space-efficient to store sparse graphs, it is useful to obtain a sparse representation H of G such that certain properties of G and H are preserved; see Figure 1 for an illustration. Over the past three decades, different notions of graph sparsification have been proposed and widely used to design approximation algorithms. For instance, a spanner H of a graph G is a subgraph of G such that the shortest-path distance between any pair of vertices is approximately preserved [6]. Benczúr and Karger [5] defined a cut sparsifier of a graph G to be a sparse subgraph H such that the value of every cut is approximately the same in G and H. In particular, Spielman and Teng [16] introduced a spectral sparsifier, which is a sparse subgraph H of an undirected graph G such that many spectral properties of the Laplacian matrices of G and H are approximately preserved.
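The spectral-sparsifier guarantee stated in the abstract can be checked numerically on a small graph; the following sketch (ours, not code from the paper) reduces the check to the generalized eigenvalues of the pair (L_H, L_G) restricted to the space orthogonal to the all-ones kernel of a connected graph.

```python
import numpy as np

# Illustrative sketch (ours): H is a (1+eps)-spectral sparsifier of a connected
# G iff the generalized eigenvalues of (L_H, L_G), restricted to the space
# orthogonal to the all-ones vector, all lie in [1 - eps, 1 + eps].
def laplacian(n, edges):
    L = np.zeros((n, n))
    for u, v, w in edges:
        L[u, u] += w; L[v, v] += w
        L[u, v] -= w; L[v, u] -= w
    return L

L_G = laplacian(3, [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0)])  # triangle
L_H = laplacian(3, [(0, 1, 1.5), (1, 2, 1.5)])               # reweighted path

# orthonormal basis of the space orthogonal to the all-ones vector
P = np.linalg.qr(np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]))[0]
M = np.linalg.solve(P.T @ L_G @ P, P.T @ L_H @ P)
ratios = np.sort(np.linalg.eigvals(M).real)
eps_needed = max(ratios[-1] - 1.0, 1.0 - ratios[0])
print("smallest eps for which H sparsifies G:", eps_needed)
```

For this toy pair the ratios are {1/2, 3/2}, so H is a (1+ε)-spectral sparsifier of G exactly for ε ≥ 1/2.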
Formally, for any undirected graph G with n vertices and m edges, we call a subgraph H of G, with proper reweighting of the edges, a (1+ε)-spectral sparsifier if

  (1−ε)·xᵀL_G x ≤ xᵀL_H x ≤ (1+ε)·xᵀL_G x   (1)

holds for any x ∈ ℝⁿ, where L_G and L_H are the respective Laplacian matrices of G and H. Spectral sparsification has been proven to be a remarkably useful tool in algorithm design, linear algebra, combinatorial optimisation, machine learning, and network analysis.

Figure 1: A graph sparsifier is a reweighted subgraph H of an original graph G such that certain properties are preserved. These subgraphs are sparse, and are more space-efficient to store than the original graphs. The picture uses the thickness of the edges in H to represent their weights.

In the seminal work on spectral sparsification, Spielman and Teng [16] showed that, for any undirected graph G of n vertices, a spectral sparsifier of G with only O(n log^c n/ε²) edges exists and
can be constructed in nearly-linear time, where c ≥ 2 is some constant. Both the runtime of their algorithm and the number of edges in the output graph involve large poly-logarithmic factors, and this motivated a sequence of simpler and faster constructions of spectral sparsifiers with fewer edges [2, 3, 10]. In particular, since any constant-degree expander graph with O(n) edges is a spectral sparsifier of an n-vertex complete graph, a natural question is to study, for any n-vertex undirected graph G and constant ε > 0, whether a (1+ε)-spectral sparsifier of G with O(n) edges can be constructed in nearly-linear time. Considered one of the most important open questions about spectral sparsification by Batson et al. [4], this has prompted many efforts towards fast constructions of linear-sized spectral sparsifiers, e.g. [2, 10]; however, the original problem posed in [4] has remained open. In this work we answer this question affirmatively by presenting the first nearly-linear time¹ algorithm for constructing a linear-sized spectral sparsifier. The formal description of our result is as follows:

Theorem 1.1. Let G be any undirected graph with n vertices and m edges. For any 0 < ε < 1, there is an algorithm that runs in Õ(m/ε^{O(1)}) work² and Õ(1/ε^{O(1)}) depth, and produces a (1+ε)-spectral sparsifier of G with O(n/ε²) edges.

Theorem 1.1 shows that a linear-sized spectral sparsifier can be constructed in nearly-linear time in a single-machine setting, and in polylogarithmic time in a parallel setting. The same algorithm can be applied to the matrix setting, with the result summarised as follows:

Theorem 1.2. Given a set of m PSD matrices {M_i}_{i=1}^m, where M_i ∈ ℝ^{n×n}, let M = Σ_{i=1}^m M_i and Z = Σ_{i=1}^m nnz(M_i), where nnz(M_i) is the number of non-zero entries of M_i.
For any 0 < ε < 1, there is an algorithm that runs in Õ((Z + n^ω)/ε^{O(1)}) work and Õ(1/ε^{O(1)}) depth, and produces a (1+ε)-spectral sparsifier of M with O(n/ε²) components, i.e., there are non-negative coefficients {c_i}_{i=1}^m such that |{c_i : c_i ≠ 0}| = O(n/ε²) and

  (1−ε)·M ⪯ Σ_{i=1}^m c_i M_i ⪯ (1+ε)·M.

Here ω is the matrix multiplication constant.

1.1 Related Work

In the seminal paper on spectral sparsification, Spielman and Teng [16] showed that a spectral sparsifier of any undirected graph G can be constructed by decomposing G into multiple nearly-expander graphs, and sparsifying each subgraph individually. This method leads to the first nearly-linear time algorithm for constructing a spectral sparsifier with O(n log^c n/ε²) edges for some c ≥ 2. However, both the runtime of their algorithm and the number of edges in the output graph involve large poly-logarithmic factors. Spielman and Srivastava [15] showed that a (1+ε)-spectral sparsifier of G with O(n log n/ε²) edges can be constructed by sampling the edges of G with probability proportional to their effective resistances, which is conceptually much simpler than the algorithm presented in [16].

¹ We say a graph algorithm runs in nearly-linear time if the algorithm runs in O(m poly log n) time, where m and n are the number of edges and vertices of the input graph.
² Here, the notation Õ(·) hides a factor of log^c n for some positive constant c.

Noticing that any constant-degree expander graph with O(n) edges is a spectral sparsifier of an n-vertex complete graph, Spielman and Srivastava [15] asked whether every n-vertex graph has a spectral sparsifier with O(n) edges. To answer this question, Batson, Spielman and Srivastava [3] presented a polynomial-time algorithm that, for any undirected graph G of n vertices, produces a spectral sparsifier of G with O(n) edges. At a high level, their algorithm, a.k.a. the BSS algorithm, proceeds for O(n) iterations, and in each iteration one edge is chosen deterministically to optimise the change of some potential function. Allen-Zhu et al.
[2] noticed that a less "optimal" edge, based on a different potential function, can be found in almost-linear time, and this leads to an almost-quadratic time algorithm. Generalising their techniques, Lee and Sun [10] showed that a linear-sized spectral sparsifier can be constructed in time O(m^{1+c}) for an arbitrarily small constant c. All of these algorithms proceed for Ω(n^c) iterations, and every iteration takes Ω(m^{1+c}) time for some constant c > 0. Hence, to break the Ω(m^{1+c}) runtime barrier faced by all previous constructions, multiple new techniques are needed.

1.2 Organisation

The remaining part of the paper is organised as follows. We introduce the necessary notions about matrices and graphs in Section 2. In Section 3 we overview our algorithm and proof techniques. For readability, more detailed discussions and technical proofs are deferred to Section 4.

2 PRELIMINARIES

2.1 Matrices

For any n×n real and symmetric matrix A, let λ_min(A) = λ_1(A) ≤ ... ≤ λ_n(A) = λ_max(A) be the eigenvalues of A, where λ_min(A) and λ_max(A) represent the minimum and maximum eigenvalues of A. We call a matrix A positive semi-definite (PSD) if xᵀAx ≥ 0 holds for any x ∈ ℝⁿ, and a matrix A positive definite if xᵀAx > 0 holds for any x ∈ ℝⁿ. For any positive definite matrix A, we define the corresponding ellipsoid by Ellip(A) ≜ {x : xᵀA⁻¹x ≤ 1}.

2.2 Graph Laplacian

Let G = (V, E, w) be a connected, undirected and weighted graph with n vertices, m edges, and weight function w : E → ℝ_{≥0}. We fix an arbitrary orientation of the edges in G, and let B_G ∈ ℝ^{m×n} be the signed edge-vertex incidence matrix defined by

  B_G(e, v) = 1 if v is e's head, −1 if v is e's tail, and 0 otherwise.

We define an m×m diagonal matrix W_G by W_G(e, e) = w_e for any edge e ∈ E[G]. The Laplacian matrix of G is the n×n matrix L_G defined by

  L_G(u, v) = −w(u, v) if u ∼ v, deg(u) if u = v, and 0 otherwise,
where deg(v) = Σ_{u∼v} w(u, v). It is easy to verify that

  xᵀL_G x = xᵀB_Gᵀ W_G B_G x = Σ_{u∼v} w(u, v)·(x_u − x_v)² ≥ 0

holds for any x ∈ ℝⁿ. Hence, the Laplacian matrix of any undirected graph is a PSD matrix. Notice that, by setting x_u = 1 if u ∈ S and x_u = 0 otherwise, xᵀL_G x equals the value of the cut between S and V \ S. Hence, a spectral sparsifier is a stronger notion than a cut sparsifier.

2.3 Other Notations

For any sequence {α_i}_{i=1}^m, we use nnz(α) to denote the number of non-zeros in {α_i}_{i=1}^m. For any two matrices A and B, we write A ⪯ B to represent that B − A is PSD, and A ≺ B to represent that B − A is positive definite. For any two matrices A and B of the same dimension, let A • B ≜ tr(AᵀB), and

  A ⊕ B = [ A 0 ; 0 B ].

3 OVERVIEW OF OUR ALGORITHM

Without loss of generality we study the problem of sparsifying a sum of PSD matrices. The one-to-one correspondence between the construction of a graph sparsifier and the following Problem 1 was presented in [3].

Problem 1. Given a set S of m PSD matrices M_1, ..., M_m with Σ_{i=1}^m M_i = I and 0 < ε < 1, find non-negative coefficients {c_i}_{i=1}^m such that |{c_i : c_i ≠ 0}| = O(n/ε²) and

  (1−ε)·I ⪯ Σ_{i=1}^m c_i M_i ⪯ (1+ε)·I.   (2)

For intuition, one can think of all M_i as rank-1 matrices, i.e., M_i = v_i v_iᵀ for some v_i ∈ ℝⁿ. Given the correspondence between PSD matrices and ellipsoids, Problem 1 essentially asks to use O(n/ε²) vectors from S to construct an ellipsoid whose shape is close to a sphere. To construct such an ellipsoid with the desired shape, all previous algorithms [2, 3, 10] proceed by iterations: in each iteration j the algorithm chooses one or more vectors, denoted by v_{j_1}, ..., v_{j_k}, and adds Δ_j ≜ Σ_{t=1}^k v_{j_t} v_{j_t}ᵀ to the currently constructed matrix by setting A_j = A_{j−1} + Δ_j.
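The identity xᵀL_G x = xᵀB_GᵀW_G B_G x = Σ_{u∼v} w(u,v)(x_u − x_v)² and the cut interpretation above can be sketched numerically; this is a toy example of ours, not code from the paper.

```python
import numpy as np

# Build L = B^T W B for a small weighted graph and verify both the quadratic
# form identity and the cut interpretation (toy example; all names are ours).
edges = [(0, 1, 2.0), (1, 2, 1.0), (0, 2, 3.0), (2, 3, 1.5)]  # (u, v, weight)
n = 4
B = np.zeros((len(edges), n))           # signed edge-vertex incidence matrix
W = np.diag([w for _, _, w in edges])   # diagonal edge-weight matrix
for e, (u, v, _) in enumerate(edges):
    B[e, u], B[e, v] = 1.0, -1.0        # arbitrary orientation: u head, v tail
L = B.T @ W @ B

x = np.array([0.3, -1.2, 0.7, 2.0])
assert np.isclose(x @ L @ x, sum(w * (x[u] - x[v]) ** 2 for u, v, w in edges))
assert np.linalg.eigvalsh(L).min() >= -1e-9      # L is PSD

ind = np.array([1.0, 1.0, 0.0, 0.0])    # indicator of S = {0, 1}
assert np.isclose(ind @ L @ ind, 1.0 + 3.0)      # cut edges (1,2) and (0,2)
```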
To control the shape of the constructed ellipsoid, two barrier values, the upper barrier u_j and the lower barrier l_j, are maintained such that the constructed ellipsoid Ellip(A_j) is sandwiched between the outer sphere u_j·I and the inner sphere l_j·I in every iteration j. That is, the following invariant is always maintained:

  l_j·I ⪯ A_j ⪯ u_j·I.   (3)

To ensure that (3) holds, the two barrier values l_j and u_j are increased properly after each iteration, i.e., u_{j+1} = u_j + δ_{u,j} and l_{j+1} = l_j + δ_{l,j} for some positive values δ_{u,j} and δ_{l,j}. The algorithm continues this process until, after T iterations, Ellip(A_T) is close to a sphere. This implies that A_T is a solution of Problem 1; see Figure 2 for an illustration.

Figure 2: Illustration of the algorithms for constructing a linear-sized spectral sparsifier. The light grey ball and the red ball in iteration j represent the spheres u_j·I and l_j·I, and the blue ellipsoid sandwiched between the two balls corresponds to the ellipsoid constructed in iteration j. After each iteration j, the algorithm increases the values of l_j and u_j by some δ_{l,j} and δ_{u,j} so that the invariant (3) holds in iteration j+1. This process is repeated for T iterations, so that the final constructed ellipsoid is close to a sphere.

However, to turn the scheme described above into an efficient algorithm we need to consider the following issues: Which vectors should we pick in each iteration? How many vectors can be added in each iteration? How should we update u_j and l_j properly so that the invariant (3) always holds? These three questions are closely related to each other: on one hand, one can always pick a single optimal vector in each iteration based on some metric, but such a conservative approach requires a linear number of iterations T = Ω(n/ε²) and super-quadratic time per iteration.
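Invariant (3) is simply an eigenvalue condition; the following minimal sketch (ours, with a hypothetical helper name) makes that explicit.

```python
import numpy as np

# Minimal sketch (ours): l*I <= A <= u*I in the PSD order holds exactly when
# all eigenvalues of the symmetric matrix A lie in the interval [l, u].
def invariant_holds(A, l, u):
    eigs = np.linalg.eigvalsh(A)
    return l <= eigs.min() and eigs.max() <= u

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 4))
A = X @ X.T                              # a random PSD matrix
eigs = np.linalg.eigvalsh(A)
assert invariant_holds(A, eigs.min() - 0.1, eigs.max() + 0.1)
assert not invariant_holds(A, eigs.min() + 0.1, eigs.max() + 0.1)
```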
On the other hand, one can choose multiple less "optimal" vectors to construct Δ_j in iteration j, but this makes the update of the barrier values more challenging if the invariant (3) is to hold. Indeed, the previous constructions [2, 10] speed up their algorithms at the cost of increasing the sparsity, i.e., the number of edges in the sparsifier, by more than a multiplicative constant. To address these issues, we introduce three novel techniques for constructing a spectral sparsifier: First of all, we define a new potential function which is much easier to compute yet has similar guarantees as the potential function introduced in [3]. Secondly, we show that solving Problem 1 with the two-sided constraints in (2) can be reduced to a similar problem with only one-sided constraints. Thirdly, we prove that the problem with one-sided constraints can be solved by a semi-definite program.

3.1 A New Potential Function

To ensure that the constructed ellipsoid Ellip(A) always stays inside the outer sphere u·I, we introduce a potential function Φ^u(A) defined by

  Φ^u(A) ≜ tr exp((uI − A)⁻¹) = Σ_{i=1}^n exp(1/(u − λ_i(A))).

It is easy to see that, when Ellip(A) gets closer to the outer sphere, λ_i(uI − A) becomes smaller and the value of Φ^u(A) increases. Hence, a bounded value of Φ^u(A) ensures that Ellip(A) stays inside the sphere u·I. For the same reason, we introduce a potential function
Φ_l(A) defined by

  Φ_l(A) ≜ tr exp((A − lI)⁻¹) = Σ_{i=1}^n exp(1/(λ_i(A) − l))

to ensure that the inner sphere always stays inside Ellip(A). We also define

  Φ^u_l(A) ≜ Φ^u(A) + Φ_l(A),   (4)

as a bounded value of Φ^u_l(A) implies that the two events occur simultaneously. Our goal is to design a proper update rule to construct {A_j} inductively, so that Φ^{u_j}_{l_j}(A_j) is monotone non-increasing after each iteration. Given this, a bounded value of the initial potential function guarantees that the invariant (3) always holds. To analyse the change of the potential function, we first notice that

  Φ^u_l(A + Δ) ≥ Φ^u_l(A) + tr(e^{(uI−A)⁻¹} (uI − A)⁻² Δ) − tr(e^{(A−lI)⁻¹} (A − lI)⁻² Δ)

by the convexity of the function Φ^u_l. We prove that, as long as the matrix Δ satisfies 0 ⪯ Δ ⪯ δ(uI − A)² and 0 ⪯ Δ ⪯ δ(A − lI)² for some small δ, this first-order approximation is also a good upper bound.

Lemma 3.1. Let A be a symmetric matrix, and let u, l be barrier values such that u − l ≤ 1 and lI ⪯ A ⪯ uI. Assume that 0 ⪯ Δ, Δ ⪯ δ(uI − A)² and Δ ⪯ δ(A − lI)² for δ ≤ 1/10. Then, it holds that

  Φ^u_l(A + Δ) ≤ Φ^u_l(A) + (1+2δ)·tr(e^{(uI−A)⁻¹} (uI − A)⁻² Δ) − (1−2δ)·tr(e^{(A−lI)⁻¹} (A − lI)⁻² Δ).

We remark that this is not the first paper to use a potential function to guide the growth of the ellipsoid. In [3], two potential functions similar to

  Λ_{u,l,p}(A) = tr (uI − A)^{−p} + tr (A − lI)^{−p}   (5)

are used with p = 1. The main drawback is that Λ_{u,l,1} does not differentiate between the following two cases: (1) multiple eigenvalues of A are close to the boundaries, both u and l; (2) one of the eigenvalues of A is very close to one boundary, either u or l. It is known that, when one of the eigenvalues of A is very close to the boundary, it is more difficult to find an optimal vector. It was shown in [2] that this problem can be alleviated by using p ≫ 1. However, this choice of p makes the function Λ_{u,l,p} less smooth, and hence one has to take a smaller step size δ, as shown in the following lemma from [2].

Lemma 3.2 ([2]). Let A be a symmetric matrix and Δ be a rank-1 matrix. Let u, l be barrier values such that lI ⪯ A ⪯ uI.
Assume that 0 ⪯ Δ, Δ ⪯ δ(uI − A) and Δ ⪯ δ(A − lI) for δ ≤ 1/(10p) and p ≥ 10. Then, it holds that

  Λ_{u,l,p}(A + Δ) ≤ Λ_{u,l,p}(A) + p(1+pδ)·tr((uI − A)^{−(p+1)} Δ) − p(1−pδ)·tr((A − lI)^{−(p+1)} Δ).

Notice that, compared with the potential function (5), our new potential function (4) blows up much faster when the eigenvalues of A are close to the boundaries l and u. This makes the problem of finding an optimal vector much easier than with Λ_{u,l,p}. At the same time, we avoid the problem of taking a small step Δ ⪯ (δ/p)(uI − A) by taking a "non-linear" step Δ ⪯ δ(uI − A)². Since there cannot be too many eigenvalues close to the boundaries, this non-linear step allows us to take a large step except in a few directions.

3.2 A Simple Construction Based on Oracle

The second technique we introduce is the reduction from a spectral sparsifier with two-sided constraints to one with one-sided constraints. Geometrically, this is equivalent to requiring the constructed ellipsoid to lie inside another ellipsoid, instead of being sandwiched between two spheres as depicted in Figure 2. Ideally, we would like to reduce the two-sided problem to the following problem: for a set of PSD matrices M = {M_i}_{i=1}^m such that Σ_{i=1}^m M_i ⪯ I, find a sparse representation Δ = Σ_{i=1}^m α_i M_i with small nnz(α) such that Δ ⪯ I. However, in the reduction we use such a matrix Δ = Σ_{i=1}^m α_i M_i to update A_j, and we need the length of Ellip(Δ) to be large in the directions in which Ellip(A_j) is small. To encode this information, we define the generalised one-sided problem as follows:

Definition 3.3 (One-sided Oracle). Let 0 ⪯ B ⪯ I, C⁺ ⪰ 0 and C⁻ ⪰ 0 be symmetric matrices, and let M = {M_i}_{i=1}^m be a set of matrices such that Σ_{i=1}^m M_i = I. We call a randomised algorithm Oracle(M, B, C⁺, C⁻) a one-sided oracle with speed S ∈ (0, 1] and error ε > 0 if the output matrix Δ = Σ_{i=1}^m α_i M_i of Oracle(M, B, C⁺, C⁻) satisfies the following:

  (1) nnz(α) ≤ λ_min(B)·tr(B⁻¹).
  (2) Δ ⪯ B, and α_i ≥ 0 for all i.
  (3) E[C • Δ] ≥ S·λ_min(B)·tr(C) − ε·S·λ_min(B)·tr(|C|), where C = C⁺ − C⁻ and |C| = C⁺ + C⁻.
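Condition (2) of Definition 3.3 is easy to test numerically; the sketch below (ours, with hypothetical helper names) checks Δ ⪯ B by confirming that B − Δ has no negative eigenvalue.

```python
import numpy as np

# Hypothetical helper (ours): test the PSD ordering Delta <= B numerically.
def psd_dominates(B, Delta, tol=1e-9):
    return np.linalg.eigvalsh(B - Delta).min() >= -tol

n, m = 3, 6
rng = np.random.default_rng(1)
vs = rng.standard_normal((m, n))
vs /= np.linalg.norm(vs, axis=1, keepdims=True)   # unit vectors
Ms = [np.outer(v, v) for v in vs]                 # rank-1 PSD matrices M_i
alpha = np.full(m, 0.1)                           # non-negative coefficients
Delta = sum(a * M for a, M in zip(alpha, Ms))     # lambda_max(Delta) <= 0.6

assert (alpha >= 0).all()
assert psd_dominates(np.eye(n), Delta)            # Delta <= I since sum < 1
assert not psd_dominates(Delta, np.eye(n))        # but I <= Delta fails
```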
We show in Section 3.3 the existence of a one-sided oracle with speed S = Ω(1) and error ε = 0, in which case the oracle only requires C as input, instead of C⁺ and C⁻. However, to construct such an oracle efficiently, an additional error is introduced, which depends on C⁺ + C⁻. In the main algorithm Sparsify(M, ε), we maintain the matrix A_j inductively as discussed at the beginning of Section 3. By employing the operator ⊕, we reduce the problem of constructing Δ_j with two-sided constraints to the problem of constructing Δ_j ⊕ Δ_j with one-sided constraints. We also use C to ensure that the length of Ellip(Δ) is large in the directions where the length of Ellip(A_j) is small. See Algorithm 1 for a formal description. To analyse Algorithm 1, we use the fact that the returned Δ_j satisfies the preconditions of Lemma 3.1, and prove in Lemma 4.4 that E[Φ^{u_{j+1}}_{l_{j+1}}(A_{j+1})] ≤ Φ^{u_j}_{l_j}(A_j) for any iteration j. Hence, with high probability, the bounded ratio between u_T and l_T after the final iteration T implies that Ellip(A_T) is close to a sphere. In particular, for any ε < 1/20, a (1+O(ε))-spectral sparsifier can be constructed by calling Oracle O(log²n/(ε²S)) times, as described in the lemma below.
Algorithm 1 Sparsify(M, ε)

 1: j = 0, A_0 = 0
 2: l_0 = −1/4, u_0 = 1/4
 3: while u_j − l_j < 1 do
 4:   B_j = (u_j·I − A_j)² ⊕ (A_j − l_j·I)²
 5:   C⁺ = (1−2ε)·(A_j − l_j·I)⁻² exp((A_j − l_j·I)⁻¹)
 6:   C⁻ = (1+2ε)·(u_j·I − A_j)⁻² exp((u_j·I − A_j)⁻¹)
 7:   Δ_j ⊕ Δ_j = Oracle({M_i ⊕ M_i}_{i=1}^m, B_j, (C⁺ ⊕ C⁺)/2, (C⁻ ⊕ C⁻)/2)
 8:   A_{j+1} = A_j + ε·Δ_j
 9:   δ_{u,j} = ((1+2ε)(1+ε)/(1−4ε))·ε·S·λ_min(B_j)
10:   δ_{l,j} = ((1−2ε)(1−ε)/(1+4ε))·ε·S·λ_min(B_j)
11:   u_{j+1} = u_j + δ_{u,j}, l_{j+1} = l_j + δ_{l,j}
12:   j = j + 1
13: end while
14: Return A_j

Lemma 3.4. Let 0 < ε < 1/20, and suppose we have a one-sided oracle Oracle with speed S and error ε. Then, with constant probability, the algorithm Sparsify(M, ε) outputs a (1+O(ε))-spectral sparsifier with O(n/(ε²S)) vectors by calling Oracle O(log²n/(ε²S)) times.

3.3 Solving Oracle via SDP

Now we show that the required solution of Oracle(M, B, C) indeed exists³, and can further be computed in nearly-linear time by a semi-definite program. We first prove that a matrix satisfying the conditions of Definition 3.3 exists for some absolute constant S = Ω(1) and ε = 0. To make the discussion parallel between Algorithm 1 and the algorithm we present later, we use A to denote the output of the Oracle instead of Δ. We adopt the ideas about the ellipsoid and the two spheres discussed before, but only consider one sphere in the one-sided case. Hence, we introduce a barrier value u_j for each iteration j, where u_0 = 1. We use the potential function Ψ_j = tr((u_j·B − A_j)⁻¹) in our analysis, where u_j is increased by δ_j = 1/(Ψ_j·λ_min(B)) after iteration j. Moreover, since we only need to prove the existence of the required matrix A = Σ_{i=1}^m α_i M_i, our process proceeds for T iterations, where only one matrix is chosen in each iteration. To find a desired matrix, we perform random sampling, where each matrix M_i is sampled with probability prob(M_i) proportional to (M_i • C)₊, i.e.,

  prob(M_i) = (M_i • C)₊ / Σ_{t=1}^m (M_t • C)₊,   (6)

where (x)₊ ≜ max{x, 0}.
Notice that, since our goal is to construct A such that E[C • A] is lower bounded by some threshold as stated in Definition 3.3, we should not pick any matrix M_i with M_i • C < 0. This random sampling procedure is described in Algorithm 2, and the properties of the output matrix are summarised as follows.

³ As the goal here is to prove the existence of Oracle with error ε = 0, the input here is C instead of C⁺ and C⁻.

Algorithm 2 SolutionExistence(M, B, C)

 1: A_0 = 0, u_0 = 1, and T = λ_min(B)·tr(B⁻¹)/2
 2: for j = 0, 1, ..., T − 1 do
 3:   repeat
 4:     Sample a matrix M_t with probability prob(M_t)
 5:     Let Δ_j = M_t / (4·Ψ_j·prob(M_t))
 6:   until Δ_j ⪯ (1/2)·(u_j·B − A_j)
 7:   A_{j+1} = A_j + Δ_j
 8:   δ_j = 1/(Ψ_j·λ_min(B))
 9:   u_{j+1} = u_j + δ_j
10: end for
11: Return (1/u_T)·A_T

Lemma 3.5. Let 0 ⪯ B ⪯ I and C be symmetric matrices, and let M = {M_i}_{i=1}^m be a set of PSD matrices such that Σ_{i=1}^m M_i = I. Then SolutionExistence(M, B, C) outputs a matrix A = Σ_{i=1}^m α_i M_i such that the following holds:

  (1) nnz(α) ≤ λ_min(B)·tr(B⁻¹).
  (2) A ⪯ B, and α_i ≥ 0 for all i.
  (3) E[C • A] ≥ (1/32)·λ_min(B)·tr(C).

Lemma 3.5 shows that the required matrix A defined in Definition 3.3 exists, and can be found by the random sampling described in Algorithm 2. Our key observation is that such a matrix A can also be constructed by a semi-definite program.

Theorem 3.6. Let 0 ⪯ B ⪯ I and C be symmetric matrices, and let M = {M_i}_{i=1}^m be a set of matrices such that Σ_{i=1}^m M_i = I. Let S ⊆ [m] be a random set of λ_min(B)·tr(B⁻¹) coordinates, where every index i is picked with probability prob(M_i). Let A be the solution of the following semi-definite program:

  max Σ_{i∈S} α_i (C • M_i)  subject to  A = Σ_{i∈S} α_i M_i ⪯ B,  α_i ≥ 0.   (7)

Then, we have E[C • A] ≥ (1/32)·λ_min(B)·tr(C).

Taking the SDP formulation (7) and the specific constraints of the Oracle's input into account, the next lemma shows that the required matrix Δ used in each iteration of Sparsify(M, ε) can be computed efficiently by solving a semi-definite program.

Lemma 3.7.
The Oracle used in Algorithm Sparsify(M, ε) can be implemented in Õ((Z + n^ω)/ε^{O(1)}) work and Õ(1/ε^{O(1)}) depth, where Z = Σ_{i=1}^m nnz(M_i) is the total number of non-zeros among the M_i. When the matrix identity Σ_{i=1}^m M_i = I comes from spectral sparsification of graphs, each iteration of Sparsify(M, ε) can be implemented in Õ(m/ε^{O(1)}) work and Õ(1/ε^{O(1)}) depth. Furthermore, the speed of this one-sided oracle is Ω(1) and its error is ε. Combining Lemma 3.4 and Lemma 3.7 gives the proof of the main result.
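The sampling rule (6) that drives SolutionExistence can be sketched on a toy isotropic family {M_i = u_i u_iᵀ} with Σ_i M_i = I. This is our illustration (with C = I); note that a single sample M_i/prob(M_i) is an unbiased estimator of Σ_i M_i = I, and with C = I its trace is exactly n.

```python
import numpy as np

# Toy illustration (ours) of the sampling rule (6) with C = I: matrices
# M_i = u_i u_i^T with sum_i M_i = I are sampled proportionally to (M_i . C)_+.
rng = np.random.default_rng(2)
n, m = 4, 40
V = rng.standard_normal((m, n))
U = V @ np.linalg.inv(np.linalg.cholesky(V.T @ V)).T  # rows u_i, sum u_i u_i^T = I
assert np.allclose(U.T @ U, np.eye(n))

C = np.eye(n)
scores = np.maximum(np.einsum('ij,ij->i', U @ C, U), 0.0)  # (M_i . C)_+
probs = scores / scores.sum()                              # distribution (6)

# E[M_i / prob(M_i)] = sum_i M_i = I, and every sample has trace exactly
# (M_i . I)/prob(M_i) = sum_j ||u_j||^2 = n, since prob(M_i) is prop. ||u_i||^2.
idx = rng.choice(m, size=2000, p=probs)
est = np.mean([np.outer(U[i], U[i]) / probs[i] for i in idx], axis=0)
assert np.isclose(np.trace(est), n)
```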
Proof of Theorem 1.1 and Theorem 1.2. Lemma 3.7 shows that we can construct an Oracle with Ω(1) speed and ε error that runs in Õ((Z + n^ω)/ε^{O(1)}) work and Õ(1/ε^{O(1)}) depth in the matrix setting, and Õ(m/ε^{O(1)}) work and Õ(1/ε^{O(1)}) depth in the graph setting. Combining this with Lemma 3.4, which states that it suffices to call Oracle Õ(1/ε²) times, the main statements hold. ∎

3.4 Further Discussion

Before presenting a more detailed analysis of our algorithm, we compare our new approach with the previous ones for constructing a linear-sized spectral sparsifier, and see how we address the bottlenecks faced in the previous constructions. Notice that all previous algorithms require a super poly-logarithmic number of iterations, and super-linear time for each iteration. For instance, our previous algorithm [10] for constructing a sparsifier with O(pn) edges requires Ω(n^{1/p}) iterations and Ω(n^{1+1/p}) time per iteration, for the following reasons:

- Ω(n^{1+1/p}) time is needed per iteration: Each iteration takes (n/g)^{Ω(1)} time to pick the vectors when (l + g)I ⪯ A ⪯ (u − g)I. To avoid the eigenvalues of A getting too close to the boundaries u or l, i.e., g being too small, we choose a potential function whose value dramatically increases when the eigenvalues of A get close to u or l. As the cost, we need to scale down the added vectors by an n^{1/p} factor.
- Ω(n^{1/p}) iterations are needed: By random sampling, we choose O(n^{1−1/p}) vectors in each iteration and use the matrix Chernoff bound to show that the quality of the added vectors is only a constant factor worse than adding a single vector. Hence, this requires Ω(n^{1/p}) iterations.

In contrast, our new approach breaks these two barriers in the following way:

- A non-linear step: Instead of uniformly rescaling down the vectors we add, we pick much fewer vectors in the directions that blow up, i.e., we impose the condition Δ ⪯ δ(uI − A)² instead of Δ ⪯ (δ/p)(uI − A).
This allows us to use the new potential function (4), with its exp(·) form, to control the eigenvalues in a more aggressive way.

- SDP filtering: By the matrix Chernoff bound, we know that the probability of sampling more than a few bad vectors is small. Informally, we apply semi-definite programming to filter out those bad vectors, and this allows us to add Ω(n/log^{O(1)} n) vectors in each iteration.

4 DETAILED ANALYSIS

In this section we give a detailed analysis of the statements presented in Section 3.

4.1 Analysis of the Potential Function

We now analyse the properties of the potential function (4), and prove Lemma 3.1. The following two facts from matrix analysis will be used in our analysis.

Lemma 4.1 (Woodbury Matrix Identity). Let A ∈ ℝ^{n×n}, U ∈ ℝ^{n×k}, C ∈ ℝ^{k×k} and V ∈ ℝ^{k×n} be matrices. Suppose that A, C and C⁻¹ + V A⁻¹ U are invertible. Then it holds that

  (A + UCV)⁻¹ = A⁻¹ − A⁻¹U(C⁻¹ + V A⁻¹ U)⁻¹ V A⁻¹.

Lemma 4.2 (Golden–Thompson Inequality). It holds for any symmetric matrices A and B that tr e^{A+B} ≤ tr(e^A e^B).

Proof of Lemma 3.1. We analyse the changes of Φ^u and Φ_l individually. First of all, notice that

  (uI − A − Δ)⁻¹ = (uI − A)^{−1/2} (I − (uI − A)^{−1/2} Δ (uI − A)^{−1/2})⁻¹ (uI − A)^{−1/2}.

We define Π ≜ (uI − A)^{−1/2} Δ (uI − A)^{−1/2}. Since 0 ⪯ Δ ⪯ δ(uI − A)² and u − l ≤ 1, it holds that

  Π ⪯ δ(uI − A) ⪯ δ(u − l)·I ⪯ δ·I,

and therefore (I − Π)⁻¹ ⪯ I + (1+δ)·Π. Hence, it holds that

  (uI − A − Δ)⁻¹ ⪯ (uI − A)^{−1/2} (I + (1+δ)·Π) (uI − A)^{−1/2} = (uI − A)⁻¹ + (1+δ)·(uI − A)⁻¹ Δ (uI − A)⁻¹.

By the fact that tr exp(·) is monotone and the Golden–Thompson inequality (Lemma 4.2), we have that

  Φ^u(A + Δ) = tr exp((uI − A − Δ)⁻¹)
             ≤ tr exp((uI − A)⁻¹ + (1+δ)·(uI − A)⁻¹ Δ (uI − A)⁻¹)
             ≤ tr (exp((uI − A)⁻¹) · exp((1+δ)·(uI − A)⁻¹ Δ (uI − A)⁻¹)).

Since 0 ⪯ Δ ⪯ δ(uI − A)² and δ ≤ 1/10 by assumption, we have that (uI − A)⁻¹ Δ (uI − A)⁻¹ ⪯ δ·I, and

  exp((1+δ)·(uI − A)⁻¹ Δ (uI − A)⁻¹) ⪯ I + (1+2δ)·(uI − A)⁻¹ Δ (uI − A)⁻¹.

Hence, it holds that

  Φ^u(A + Δ) ≤ tr (e^{(uI−A)⁻¹} (I + (1+2δ)·(uI − A)⁻¹ Δ (uI − A)⁻¹)) = Φ^u(A) + (1+2δ)·tr(e^{(uI−A)⁻¹} (uI − A)⁻² Δ).

By the same analysis, we have that

  Φ_l(A + Δ) ≤ tr (e^{(A−lI)⁻¹} (I − (1−2δ)·(A − lI)⁻¹ Δ (A − lI)⁻¹)) = Φ_l(A) − (1−2δ)·tr(e^{(A−lI)⁻¹} (A − lI)⁻² Δ).

Combining the bounds on Φ^u(A + Δ) and Φ_l(A + Δ) finishes the proof. ∎
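The Golden–Thompson inequality (Lemma 4.2) is easy to sanity-check numerically; the following is an illustration of ours (not a proof), with sym_expm a helper we introduce.

```python
import numpy as np

# Numeric sanity check (ours) of Golden-Thompson: tr e^{A+B} <= tr(e^A e^B)
# for symmetric A, B, using an eigendecomposition-based matrix exponential.
def sym_expm(M):
    w, Q = np.linalg.eigh(M)
    return (Q * np.exp(w)) @ Q.T

rng = np.random.default_rng(3)
for _ in range(10):
    X = rng.standard_normal((5, 5)); A = (X + X.T) / 2
    Y = rng.standard_normal((5, 5)); B = (Y + Y.T) / 2
    lhs = np.trace(sym_expm(A + B))
    rhs = np.trace(sym_expm(A) @ sym_expm(B))
    assert lhs <= rhs * (1 + 1e-9)
```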
Lemma 4.3. Let A be a symmetric matrix, and let u, l be barrier values such that u − l ≤ 1 and lI ⪯ A ⪯ uI. Assume that 0 ≤ δ_u ≤ δ·λ_min((uI − A)²) and 0 ≤ δ_l ≤ δ·λ_min((A − lI)²) for δ ≤ 1/10. Then, it holds that

  Φ^{u+δ_u}_{l+δ_l}(A) ≤ Φ^u_l(A) − (1−2δ)·δ_u·tr(e^{(uI−A)⁻¹} (uI − A)⁻²) + (1+2δ)·δ_l·tr(e^{(A−lI)⁻¹} (A − lI)⁻²).

Proof. Since 0 ≤ δ_u ≤ δ·λ_min((uI − A)²) and 0 ≤ δ_l ≤ δ·λ_min((A − lI)²), we have that δ_u·I ⪯ δ(uI − A)² and δ_l·I ⪯ δ(A − lI)². The statement follows by an analysis similar to the proof of Lemma 3.1. ∎

4.2 Analysis of the Reduction

We now present the detailed analysis of the reduction from a spectral sparsifier to a one-sided oracle. We first analyse Algorithm 1, and prove that in expectation the value of the potential function is non-increasing. Based on this fact, we then give a proof of Lemma 3.4, which shows that a (1+O(ε))-spectral sparsifier can be constructed by calling Oracle O(log²n/(ε²S)) times.

Lemma 4.4. Let A_j and A_{j+1} be the matrices constructed by Algorithm 1 in iterations j and j+1, and assume that 0 ≤ ε ≤ 1/20. Then, it holds that E[Φ^{u_{j+1}}_{l_{j+1}}(A_{j+1})] ≤ Φ^{u_j}_{l_j}(A_j).

Proof. By the description of Algorithm 1 and Definition 3.3, it holds that

  Δ_j ⊕ Δ_j ⪯ (u_j·I − A_j)² ⊕ (A_j − l_j·I)²,

which implies that Δ_j ⪯ (u_j·I − A_j)² and Δ_j ⪯ (A_j − l_j·I)². Since u_j − l_j ≤ 1 by the algorithm description and 0 ≤ ε ≤ 1/20, by setting Δ = ε·Δ_j in Lemma 3.1 we have

  Φ^{u_j}_{l_j}(A_j + ε·Δ_j) ≤ Φ^{u_j}_{l_j}(A_j) + ε(1+2ε)·tr(e^{(u_jI−A_j)⁻¹} (u_jI − A_j)⁻² Δ_j) − ε(1−2ε)·tr(e^{(A_j−l_jI)⁻¹} (A_j − l_jI)⁻² Δ_j)
    = Φ^{u_j}_{l_j}(A_j) − ε·C • (Δ_j ⊕ Δ_j).   (8)

Notice that the matrices {M_i ⊕ M_i}_{i=1}^m given as input to Oracle always satisfy Σ_{i=1}^m M_i ⊕ M_i = I ⊕ I. Using this and the definition of Oracle, we know that

  E[C • (Δ_j ⊕ Δ_j)] ≥ S·λ_min(B_j)·tr(C) − ε·S·λ_min(B_j)·tr(|C|).

Let α_j = ε·S·λ_min(B_j). Then we have that

  E[Φ^{u_j}_{l_j}(A_{j+1})] ≤ Φ^{u_j}_{l_j}(A_j) − ε·E[C • (Δ_j ⊕ Δ_j)]
    ≤ Φ^{u_j}_{l_j}(A_j) + (1+2ε)(1+ε)·α_j·tr(e^{(u_jI−A_j)⁻¹} (u_jI − A_j)⁻²) − (1−2ε)(1−ε)·α_j·tr(e^{(A_j−l_jI)⁻¹} (A_j − l_jI)⁻²).

On the other hand, using that 0 ≤ ε ≤ 1/20, S ≤ 1 and Δ_j ⪯ (u_jI − A_j)², we have that

  δ_{u,j} ≤ 2ε·λ_min((u_jI − A_{j+1})²)  and  δ_{l,j} ≤ 2ε·λ_min((A_{j+1} − l_jI)²).

Hence, Lemma 4.3 (applied with δ = 2ε) shows that

  Φ^{u_j+δ_{u,j}}_{l_j+δ_{l,j}}(A_{j+1}) ≤ Φ^{u_j}_{l_j}(A_{j+1}) − (1−4ε)·δ_{u,j}·tr(e^{(u_jI−A_{j+1})⁻¹} (u_jI − A_{j+1})⁻²) + (1+4ε)·δ_{l,j}·tr(e^{(A_{j+1}−l_jI)⁻¹} (A_{j+1} − l_jI)⁻²)
    ≤ Φ^{u_j}_{l_j}(A_{j+1}) − (1−4ε)·δ_{u,j}·tr(e^{(u_jI−A_j)⁻¹} (u_jI − A_j)⁻²) + (1+4ε)·δ_{l,j}·tr(e^{(A_j−l_jI)⁻¹} (A_j − l_jI)⁻²).   (9)

By combining (8), (9), and setting

  (1−4ε)·δ_{u,j} = (1+2ε)(1+ε)·α_j,   (1+4ε)·δ_{l,j} = (1−2ε)(1−ε)·α_j,

we have that E[Φ^{u_{j+1}}_{l_{j+1}}(A_{j+1})] ≤ Φ^{u_j}_{l_j}(A_j). ∎
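The first-order bound of Lemma 3.1, which drives the argument above, can be spot-checked numerically. The sketch below is ours: the barriers, δ, and the admissible Δ are arbitrary choices satisfying the lemma's preconditions (Δ is scaled well below the admissible cap for numerical comfort).

```python
import numpy as np

# Spot-check (ours) of Lemma 3.1: compare the potential at A + Delta with its
# first-order upper bound, for random A with l*I < A < u*I and admissible Delta.
def f_of_sym(M, f):
    w, Q = np.linalg.eigh(M)
    return (Q * f(w)) @ Q.T

def potential(A, u, l):
    n = A.shape[0]
    up = np.trace(f_of_sym(u * np.eye(n) - A, lambda w: np.exp(1.0 / w)))
    lo = np.trace(f_of_sym(A - l * np.eye(n), lambda w: np.exp(1.0 / w)))
    return up + lo

rng = np.random.default_rng(4)
n, u, l, delta = 5, 0.5, -0.5, 0.05
lam = rng.uniform(l + 0.1, u - 0.1, size=n)
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = (Q * lam) @ Q.T                                  # l*I < A < u*I

# Delta >= 0 with Delta <= delta*(uI-A)^2 and Delta <= delta*(A-lI)^2,
# via a conservative scalar cap (then shrunk further by 0.01)
cap = delta * min(np.min((u - lam) ** 2), np.min((lam - l) ** 2))
X = rng.standard_normal((n, n)); P = X @ X.T
Delta = 0.01 * cap * P / np.linalg.eigvalsh(P).max()

Su = f_of_sym(u * np.eye(n) - A, lambda w: np.exp(1.0 / w) / w ** 2)
Sl = f_of_sym(A - l * np.eye(n), lambda w: np.exp(1.0 / w) / w ** 2)
bound = (potential(A, u, l)
         + (1 + 2 * delta) * np.trace(Su @ Delta)
         - (1 - 2 * delta) * np.trace(Sl @ Delta))
assert potential(A + Delta, u, l) <= bound + 1e-9
```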
On the other hand, using that 0 ε /20, S and j u j I A j, we have that δ u, j ε + 2ε + ε 4ε 8 λ min u j I A j 2 2ε λ min u j I A j+ 2 and 2ε ε δ l, j ε λ min A j l j I 2 2ε λ min A j+ l j I ε Hence, Lemma 4.3 shows that Φ uj +δ u, j,l j +δ l, j A j+ Φ uj,l j A j+ 4εδ u, j tr e u j I A j+ u j I A j εδ l, j tr e A j+ l j I A j+ l j I 2 Φ uj,l j A j+ 4εδ u, j tr e u j I A j u j I A j εδ l, j tr e A j l j I A j l j I 2. 9 By combining 8, 9, and setting we have that 4εδ u, j = + 2ε + εα j, + 4εδ l, j = 2ε εα j, E [ Φ uj+,l j+ A j+ ] Φ uj,l j A j. Proof of Lemma 3.4. We first bound the number of times the algorithm calls the oracle. Notice that Φ u0,l 0 = 2 tr exp 4 I = 2e 4 n. Hence, by Lemma 4.4 we have E [ Φ uj,l j A j ] = On for any iteration j. By Markov s inequality, it holds that Φ uj,l j A j = n O with high probability in n. In the remainder of the proof, we assume that this event occurs. Since B j = u j I A j 2 A j l j I 2 by definition, it holds that /2 exp λmin Bj Φ uj,l j A j = n O, which implies that λ min Bj = Ω log 2 n. 0 On the other hand, in iteration j the gap between u j and l j is increased by δ u, j δ l, j = Ω ε 2 S λ min B j. Combining this with 0 gives us that ε2 S δ u, j δ l, j = Ω log 2 n for any j. Since u 0 l 0 = /2 and the algorithm terminates once u j l j > for some j, with high probability in n, the algorithm log terminates in O 2 n ε 2 iterations. S Next we prove that the number of M i s involved in the output is at most O n ε 2 S. By the properties of Oracle, the number of matrices in iteration j is at most λ min B j trb j. Since x 2 exp x for all x > 0, it holds for any iteration j that tr B 2 j = tr uj I A j + tr Aj l j I 2 Φ uj,l j A j
By (11), we know that, among the matrices added in iteration j, the average increase of the gap u_j − l_j per added matrix is Ω(ε²·S/Φ^{u_j}_{l_j}(A_j)). Since E[Φ^{u_j}_{l_j}(A_j)] = O(n), for every newly added matrix the gap between u_j and l_j increases in expectation by Ω(ε²·S/n). By the termination condition of the algorithm, i.e., u_j − l_j ≥ 1, and Markov's inequality, the number of matrices picked in total is at most O(n/(ε²S)) with constant probability.

Finally we prove that the output is a (1+O(ε))-spectral sparsifier. Since the condition number of the output matrix A_j is at most u_j/l_j, it suffices to prove that (u_j − l_j)/u_j = O(ε), and this easily follows from the termination condition of the algorithm and (δ_{u,j} − δ_{l,j})/δ_{u,j} = O(ε). ∎

4.3 Existence Proof for Oracle

Proof of Lemma 3.5. The property on nnz(α) follows from the algorithm description. For the second property, notice that every chosen matrix Δ_j in iteration j satisfies Δ_j ⪯ (1/2)·(u_j·B − A_j), which implies that A_j ⪯ u_j·B holds for any iteration j. Hence, A = (1/u_T)·A_T ⪯ B, and α_i ≥ 0 since Ψ_j ≥ 0.

Now we prove the third statement. Let β = Σ_{i=1}^m (M_i • C)₊. Then, for each matrix M_{i_j} picked in iteration j, C • A_j is increased by

  C • Δ_j = (C • M_{i_j}) / (4·Ψ_j·prob(M_{i_j})) = β/(4·Ψ_j).

On the other hand, it holds that

  u_T = u_0 + Σ_{j=0}^{T−1} δ_j = 1 + Σ_{j=0}^{T−1} 1/(Ψ_j·λ_min(B)).

Hence, we have that

  C • A = (1/u_T)·Σ_{j=0}^{T−1} C • Δ_j
        = (Σ_{j=0}^{T−1} β/(4Ψ_j)) / (1 + Σ_{j=0}^{T−1} 1/(Ψ_j·λ_min(B)))
        = (β·λ_min(B)/4) · (Σ_{j=0}^{T−1} Ψ_j⁻¹) / (λ_min(B) + Σ_{j=0}^{T−1} Ψ_j⁻¹)
        ≥ (β·λ_min(B)/4) · (Σ_{j=0}^{T−1} (Ψ_j + Ψ_0)⁻¹) / (λ_min(B) + Σ_{j=0}^{T−1} (Ψ_j + Ψ_0)⁻¹)
        ≥ (β/8)·Σ_{j=0}^{T−1} (Ψ_j + Ψ_0)⁻¹,   (12)

where the last inequality follows by the choice of T. Hence, it suffices to bound Ψ_j. Since Δ_j ⪯ (1/2)·(u_j·B − A_j) ⪯ (1/2)·(u_{j+1}·B − A_j), we have that

  Ψ_{j+1} ≤ tr((u_{j+1}·B − A_j)⁻¹ + 2·(u_{j+1}·B − A_j)⁻¹ Δ_j (u_{j+1}·B − A_j)⁻¹).   (13)
Since $\operatorname{tr}\big((uB - A_j)^{-1}\big)$ is convex in $u$, we have that
$$\operatorname{tr}\big((u_j B - A_j)^{-1}\big) \geq \operatorname{tr}\big((u_{j+1} B - A_j)^{-1}\big) + \delta_j \operatorname{tr}\big((u_{j+1} B - A_j)^{-2} B\big). \quad (14)$$
Combining (13) and (14), we have that
$$\Psi_{j+1} \leq \operatorname{tr}\big((u_j B - A_j)^{-1}\big) - \delta_j \operatorname{tr}\big((u_{j+1} B - A_j)^{-2} B\big) + 2\,\Delta_j \bullet (u_{j+1} B - A_j)^{-2} \leq \Psi_j - \delta_j\,\lambda_{\min}(B)\operatorname{tr}\big((u_{j+1} B - A_j)^{-2}\big) + 2\,\Delta_j \bullet (u_{j+1} B - A_j)^{-2}. \quad (15)$$
Let $\mathcal{E}_j$ be the event that $\Delta_j \preceq \frac{1}{2}(u_j B - A_j)$. Notice that the $\Delta_j$ picked in each iteration always satisfies $\mathcal{E}_j$ by the algorithm description. Since
$$\mathbb{E}\big[\Delta_j \bullet (u_j B - A_j)^{-1}\big] = \sum_{i\colon M_i \bullet C > 0} \operatorname{prob}(M_i)\,\frac{M_i \bullet (u_j B - A_j)^{-1}}{4\Psi_j \operatorname{prob}(M_i)} \leq \frac{1}{4},$$
by Markov's inequality it holds that
$$\mathbb{P}\big[\neg\mathcal{E}_j\big] = \mathbb{P}\Big[\Delta_j \not\preceq \tfrac{1}{2}(u_j B - A_j)\Big] \leq \frac{1}{2},$$
and therefore
$$\mathbb{E}\big[\Delta_j \bullet (u_{j+1} B - A_j)^{-2} \mid \mathcal{E}_j\big] \leq \frac{\mathbb{E}\big[\Delta_j \bullet (u_{j+1} B - A_j)^{-2}\big]}{\mathbb{P}[\mathcal{E}_j]} \leq 2\,\mathbb{E}\big[\Delta_j \bullet (u_{j+1} B - A_j)^{-2}\big] = \frac{1}{2\Psi_j}\sum_{i\colon M_i \bullet C > 0} M_i \bullet (u_{j+1} B - A_j)^{-2} \leq \frac{\operatorname{tr}\big((u_{j+1} B - A_j)^{-2}\big)}{2\Psi_j}.$$
Combining the inequality above, (15), and the fact that every $\Delta_j$ picked by the algorithm satisfies $\mathcal{E}_j$, we have that
$$\mathbb{E}\big[\Psi_{j+1}\big] \leq \Psi_j - \Big(\delta_j\,\lambda_{\min}(B) - \frac{1}{2\Psi_j}\Big)\operatorname{tr}\big((u_{j+1} B - A_j)^{-2}\big).$$
By our choice of $\delta_j$, it holds for any iteration $j$ that $\mathbb{E}\big[\Psi_{j+1}\big] \leq \Psi_j$, and
$$\mathbb{E}\big[\Psi_{j+1} + \Psi_0\big] \leq \mathbb{E}\big[\Psi_j + \Psi_0\big] \leq 2\Psi_0.$$
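The one-step bound on $\Psi_{j+1}$ above rests on the standard fact that $\operatorname{tr}\big((M - \Delta)^{-1}\big) \leq \operatorname{tr}(M^{-1}) + 2\,\Delta \bullet M^{-2}$ whenever $0 \preceq \Delta \preceq M/2$. In the scalar (commuting) case this is the elementary inequality $1/(m - d) \leq 1/m + 2d/m^2$ for $0 \leq d \leq m/2$, with equality at $d = m/2$; the following quick check confirms it numerically (our own sanity check, not the paper's argument):

```python
import random

# Scalar case of tr((M - D)^{-1}) <= tr(M^{-1}) + 2 D . M^{-2} for 0 <= D <= M/2:
# 1/(m - d) <= 1/m + 2d/m^2 whenever 0 <= d <= m/2 (equality at d = m/2).
rng = random.Random(1)
for _ in range(10000):
    m = rng.uniform(0.1, 10.0)
    d = rng.uniform(0.0, m / 2.0)
    assert 1.0 / (m - d) <= 1.0 / m + 2.0 * d / (m * m) + 1e-12
print("inequality verified on 10000 random samples")
```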
Combining this with (12), it holds that
$$\mathbb{E}[C \bullet A] \geq \frac{\beta T^2}{8}\,\mathbb{E}\bigg[\Big(\sum_{j=0}^{T-1}(\Psi_j + \Psi_0)\Big)^{-1}\bigg] \geq \frac{\beta T^2}{8 \cdot 2T\Psi_0} = \frac{\beta T}{16\operatorname{tr}(B^{-1})} \geq \frac{T}{16\operatorname{tr}(B^{-1})}\operatorname{tr} C.$$
The result follows from the fact that $T = \lambda_{\min}(B)\operatorname{tr}(B^{-1})/2$.

Using the lemma above, we can prove that such an $A$ can be found by solving a semidefinite program.

Proof of Theorem 3.6. Note that the probability distribution used in the statement is the same as the one used in SolutionExistence$(M, B, C)$. Therefore, Lemma 3.5 shows that there is a matrix $A$ of the form $\sum_{i=1}^m \alpha_i M_i$ such that
$$\mathbb{E}[C \bullet A] \geq \frac{1}{32}\,\lambda_{\min}(B)\operatorname{tr} C.$$
The statement follows by the fact that $A$ is the solution of the semidefinite program (7) that maximises $C \bullet A$.

4.4 Implementing the SDP in Nearly-Linear Time

Now we discuss how to solve the SDP (7) in nearly-linear time. Since this SDP is a packing SDP, it is known how to solve it in nearly-constant depth [1, 8, 14]. The following result will be used in our analysis.

Theorem 4.5 ([1]). Given an SDP
$$\max_{x \geq 0}\; c^{\top} x \quad \text{subject to} \quad \sum_{i=1}^m x_i A_i \preceq B$$
with $A_i \succeq 0$, $B \succ 0$ and $c \in \mathbb{R}^m$. Suppose that we are given direct access to the vector $c \in \mathbb{R}^m$, and indirect access to the $A_i$ and $B$ via an oracle $\mathcal{O}_{L,\delta}$ which takes as input a vector $x \in \mathbb{R}^m$ and outputs a vector $v \in \mathbb{R}^m$ such that
$$v_i = (1 \pm \delta/2)\; A_i \bullet B^{-1/2} \exp\!\Big(L\, B^{-1/2}\Big(\sum_i x_i A_i - B\Big) B^{-1/2}\Big) B^{-1/2}$$
in $W_{L,\delta}$ work and $D_{L,\delta}$ depth, for any $x$ such that $x_i \geq 0$ and $\sum_{i=1}^m x_i A_i \preceq 2B$. Then, we can output $x$ such that
$$\mathbb{E}\big[c^{\top} x\big] \geq (1 - O(\delta))\,\mathrm{OPT} \quad \text{with} \quad \sum_i x_i A_i \preceq B,$$
in $O\big(W_{L,\delta} \log m \cdot \log(nm/\delta)/\delta^3\big)$ work and $O\big(D_{L,\delta} \log m \cdot \log(nm/\delta)/\delta\big)$ depth, where $L = (4/\delta)\log(nm/\delta)$.

Since we are only interested in a fast implementation of the one-sided oracle used in Sparsify$(M, \varepsilon)$, it suffices to solve the SDP (7) for this particular situation.

Proof of Lemma 3.7. Our basic idea is to use Theorem 4.5 to implement the one-sided oracle.
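To make the shape of such packing SDPs concrete, here is a toy $2 \times 2$ instance of $\max\; C \bullet \sum_i \beta_i M_i$ subject to $\sum_i \beta_i M_i \preceq B$, solved by brute-force grid search (the instance is invented for illustration only, and the grid search merely stands in for the fast solver of Theorem 4.5):

```python
import math

def lam_max(M):
    """Largest eigenvalue of a symmetric 2x2 matrix [[a, b], [b, c]]."""
    (a, b), (_, c) = M
    return 0.5 * (a + c) + math.sqrt(0.25 * (a - c) ** 2 + b * b)

def dot(X, Y):
    """Trace inner product X . Y for 2x2 matrices."""
    return sum(X[i][j] * Y[i][j] for i in range(2) for j in range(2))

# Toy data (hypothetical): two PSD matrices M1, M2, constraint matrix B, cost C.
M1 = [[1.0, 0.0], [0.0, 0.0]]
M2 = [[0.5, 0.5], [0.5, 0.5]]
B = [[1.0, 0.0], [0.0, 1.0]]
C = [[2.0, 0.0], [0.0, 1.0]]

best_val, best_beta = 0.0, (0.0, 0.0)
steps = 200
for i in range(steps + 1):
    for j in range(steps + 1):
        b1, b2 = 3.0 * i / steps, 3.0 * j / steps
        S = [[b1 * M1[r][c] + b2 * M2[r][c] for c in range(2)] for r in range(2)]
        # Feasibility: S <= B in the PSD order iff lambda_max(S - B) <= 0.
        D = [[S[r][c] - B[r][c] for c in range(2)] for r in range(2)]
        if lam_max(D) <= 1e-9:
            val = dot(C, S)
            if val > best_val:
                best_val, best_beta = val, (b1, b2)
print(best_val, best_beta)
```

Note that the optimum mixes both matrices: taking $M_1$ alone gives value $2$, while the search finds roughly $2.10$.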
Notice that each iteration of Sparsify$(M, \varepsilon)$ uses the one-sided oracle with the input
$$C_+ = (1 - 2\varepsilon)\,(A - lI)^{-2} \exp\!\big((A - lI)^{-1}\big), \qquad C_- = (1 + 2\varepsilon)\,(uI - A)^{-2} \exp\!\big((uI - A)^{-1}\big),$$
and $B = \big((uI - A)^{-2} + (A - lI)^{-2}\big)^{-1}$, where we drop the subscript $j$ indicating the iteration for simplicity. To apply Theorem 3.6, we first sample a subset $S \subseteq [m]$, and then solve the SDP
$$\max_{\beta_i \geq 0,\; \beta_i = 0 \text{ unless } i \in S}\; (C_+ - C_-) \bullet \sum_i \beta_i M_i \quad \text{subject to} \quad \sum_i \beta_i M_i \preceq B.$$
By ignoring the matrices $M_i$ with $i \notin S$, this SDP is equivalent to the SDP
$$\max_{\beta \geq 0}\; c^{\top} \beta \quad \text{subject to} \quad \sum_i \beta_i M_i \preceq B,$$
where $c_i = (C_+ - C_-) \bullet M_i$. Now assume that (i) we can approximate $c_i$ with additive error $\delta\,(C_+ + C_-) \bullet M_i$, and (ii) for any $x$ such that $\sum_{i=1}^m x_i M_i \preceq 2B$ and $L = \tilde{O}(1/\delta)$, we can approximate
$$M_i \bullet B^{-1/2} \exp\!\Big(L\, B^{-1/2}\Big(\sum_i x_i M_i - B\Big) B^{-1/2}\Big) B^{-1/2}$$
with multiplicative error $1 \pm \delta$. Then, by Theorem 4.5 we can find a vector $\beta$ such that
$$\mathbb{E}\big[c^{\top} \beta\big] \geq (1 - O(\delta))\,\mathrm{OPT} - \delta \sum_i \beta_i\,(C_+ + C_-) \bullet M_i \geq (1 - O(\delta))\,\mathrm{OPT} - \delta\,(C_+ + C_-) \bullet B \geq \mathrm{OPT} - O(\delta)\,(C_+ + C_-) \bullet B,$$
where we used that $\sum_i \beta_i M_i \preceq B$ and $\mathrm{OPT} \leq (C_+ + C_-) \bullet B$. Since $u - l \leq 1$, we have that $B \preceq \frac{1}{8} I \preceq I$, and hence
$$\mathbb{E}\big[c^{\top} \beta\big] \geq \mathrm{OPT} - O(\delta)\operatorname{tr}(C_+ + C_-) \geq \frac{1}{32}\,\lambda_{\min}(B)\operatorname{tr} C - O\big(\delta \log^2 n\big)\,\lambda_{\min}(B)\operatorname{tr}(C_+ + C_-),$$
where we apply Theorem 3.6 and (10) in the last line. Therefore, this gives an oracle with speed $1/32$ and error $\varepsilon$ by setting $\delta = \varepsilon/\log^2 n$.

The problem of approximating the sampling probabilities $\{c_i\}$, as well as implementing the oracle $\mathcal{O}_{L,\delta}$, is similar to approximating leverage scores [15] and relative leverage scores [2, 10]. All these references use the Johnson–Lindenstrauss lemma to reduce the problem of approximating matrix dot products or traces to matrix
vector multiplication. The only difference is that, instead of computing $(A - lI)^{-(q+1)} x$ and $(uI - A)^{-(q+1)} x$ for a given vector $x$ as in those references, we compute $(A - lI)^{-2} \exp\!\big((A - lI)^{-1}\big) x$ and $(uI - A)^{-2} \exp\!\big((uI - A)^{-1}\big) x$. These can be approximated by Taylor expansion, and the number of Taylor terms required depends on how close the eigenvalues of $A$ are to the boundaries $u$ and $l$. In particular, we show in Section 4.5 that $\tilde{O}(1/g^2)$ terms of the Taylor expansion suffice, where the gap $g$ is the largest number such that $(l + g)I \preceq A \preceq (u - g)I$. Since $1/g^2 = \tilde{O}(1)$ by (10), each iteration can be implemented by solving $\tilde{O}\big(1/\varepsilon^{O(1)}\big)$ linear systems and performing $\tilde{O}\big(1/\varepsilon^{O(1)}\big)$ matrix-vector multiplications. For the matrices coming from graph sparsification, this can be done in nearly-linear work and nearly-constant depth [9, 13]. For general matrices, this can be done in input-sparsity time and nearly-constant depth [7, 11, 12].

4.5 Taylor Expansion of $x^{-2} e^{1/x}$

Theorem 4.6 (Cauchy's Estimates). Suppose $f$ is holomorphic on a neighbourhood of the ball $B = \{z \in \mathbb{C} : |z - s| \leq r\}$. Then it holds that
$$\big|f^{(k)}(s)\big| \leq \frac{k!}{r^k} \sup_{z \in B} |f(z)|.$$

Lemma 4.7. Let $f(x) = x^{-2} e^{1/x}$. For any $0 < x \leq 1$, we have that
$$\Big|f(x) - \sum_{k=0}^{d} \frac{1}{k!}\, f^{(k)}(1)\,(x - 1)^k\Big| \leq 8(d+1)\, e^{2/x}\, x^{-3} \big(1 - \tfrac{x}{2}\big)^d.$$
In particular, if $d \geq c\, x^{-2} \log\big(1/(x\varepsilon)\big)$ for some large enough universal constant $c$, we have that
$$\Big|f(x) - \sum_{k=0}^{d} \frac{1}{k!}\, f^{(k)}(1)\,(x - 1)^k\Big| \leq \varepsilon.$$

Proof. By the integral form of the remainder term in Taylor's theorem, we have that
$$f(x) = \sum_{k=0}^{d} \frac{1}{k!}\, f^{(k)}(1)\,(x - 1)^k + \frac{1}{d!}\int_{1}^{x} f^{(d+1)}(s)\,(x - s)^d\, \mathrm{d}s.$$
For any $s \in [x, 1]$, we define
$$D(s) = \Big\{z \in \mathbb{C} : |z - s| \leq s - \frac{x}{2}\Big\}.$$
Since $|f(z)| \leq (x/2)^{-2}\exp(2/x)$ on $z \in D(s)$, Cauchy's estimates (Theorem 4.6) show that
$$\big|f^{(d+1)}(s)\big| \leq \frac{(d+1)!}{\big(s - \frac{x}{2}\big)^{d+1}} \sup_{z \in D(s)} |f(z)| \leq \frac{(d+1)!}{\big(s - \frac{x}{2}\big)^{d+1}} \cdot \frac{4}{x^2}\, e^{2/x}.$$
Hence, we have that
$$\Big|f(x) - \sum_{k=0}^{d} \frac{1}{k!}\, f^{(k)}(1)\,(x - 1)^k\Big| \leq \frac{1}{d!}\int_x^1 \big|f^{(d+1)}(s)\big|\,(s - x)^d\, \mathrm{d}s \leq 4(d+1)\, e^{2/x}\, x^{-2} \int_x^1 \frac{(s - x)^d}{\big(s - \frac{x}{2}\big)^{d+1}}\, \mathrm{d}s$$
$$\leq 8(d+1)\, e^{2/x}\, x^{-3} \int_x^1 \Big(\frac{s - x}{s - \frac{x}{2}}\Big)^{d}\, \mathrm{d}s \leq 8(d+1)\, e^{2/x}\, x^{-3} \big(1 - \tfrac{x}{2}\big)^{d}.$$

ACKNOWLEDGEMENT

The authors would like to thank Michael Cohen for helpful discussions and for suggesting the ideas that improve the sparsity from $O(n/\varepsilon^3)$ to $O(n/\varepsilon^2)$.
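As a numerical sanity check of Lemma 4.7 (our own illustration; the power-series arithmetic below is not part of the paper), one can compute the Taylor coefficients of $f(x) = x^{-2} e^{1/x}$ around $x = 1$ by series composition and observe that the truncated expansion converges quickly for $x$ bounded away from $0$:

```python
import math

def series_mul(a, b, d):
    """Product of two power series (coefficient lists), truncated to degree d."""
    out = [0.0] * (d + 1)
    for i, ai in enumerate(a[: d + 1]):
        for j, bj in enumerate(b[: d + 1 - i]):
            out[i + j] += ai * bj
    return out

def series_exp(u, d):
    """exp of a power series with u[0] = 0, via n*h_n = sum_k k*u_k*h_{n-k}."""
    h = [0.0] * (d + 1)
    h[0] = 1.0
    for n in range(1, d + 1):
        h[n] = sum(k * u[k] * h[n - k] for k in range(1, n + 1)) / n
    return h

def taylor_coeffs(d):
    """Coefficients of f(x) = x^{-2} e^{1/x} around x = 1, in powers of w = x - 1."""
    inv = [(-1.0) ** k for k in range(d + 1)]             # 1/x = 1/(1+w)
    inv2 = [(k + 1) * (-1.0) ** k for k in range(d + 1)]  # x^{-2}
    u = inv[:]
    u[0] = 0.0                                            # exp(1/x) = e * exp(1/x - 1)
    exp_series = [math.e * c for c in series_exp(u, d)]
    return series_mul(inv2, exp_series, d)

def taylor_eval(x, d):
    """Evaluate the degree-d Taylor polynomial of f at x."""
    w = x - 1.0
    return sum(c * w ** k for k, c in enumerate(taylor_coeffs(d)))

x = 0.5
exact = x ** (-2) * math.exp(1.0 / x)    # f(1/2) = 4 e^2
print(abs(taylor_eval(x, 100) - exact))  # truncation error of the degree-100 expansion
```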
REFERENCES
[1] Zeyuan Allen-Zhu, Yin Tat Lee, and Lorenzo Orecchia. 2016. Using optimization to obtain a width-independent, parallel, simpler, and faster positive SDP solver. In 27th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA).
[2] Zeyuan Allen-Zhu, Zhenyu Liao, and Lorenzo Orecchia. 2015. Spectral Sparsification and Regret Minimization Beyond Matrix Multiplicative Updates. In 47th Annual ACM Symposium on Theory of Computing (STOC).
[3] Joshua D. Batson, Daniel A. Spielman, and Nikhil Srivastava. 2012. Twice-Ramanujan Sparsifiers. SIAM J. Comput. 41(6).
[4] Joshua D. Batson, Daniel A. Spielman, Nikhil Srivastava, and Shang-Hua Teng. 2013. Spectral sparsification of graphs: theory and algorithms. Commun. ACM 56(8).
[5] András Benczúr and David Karger. 1996. Approximating s-t Minimum Cuts in Õ(n²) Time. In 28th Annual ACM Symposium on Theory of Computing (STOC).
[6] L. Paul Chew. 1989. There Are Planar Graphs Almost as Good as the Complete Graph. J. Comput. System Sci. 39(2).
[7] Michael B. Cohen, Yin Tat Lee, Cameron Musco, Christopher Musco, Richard Peng, and Aaron Sidford. 2015. Uniform sampling for matrix approximation. In 6th Innovations in Theoretical Computer Science (ITCS).
[8] Rahul Jain and Penghui Yao. 2011. A parallel approximation algorithm for positive semidefinite programming. In 52nd Annual IEEE Symposium on Foundations of Computer Science (FOCS).
[9] Rasmus Kyng, Yin Tat Lee, Richard Peng, Sushant Sachdeva, and Daniel A. Spielman. 2016. Sparsified Cholesky and multigrid solvers for connection Laplacians. In 48th Annual ACM Symposium on Theory of Computing (STOC).
[10] Yin Tat Lee and He Sun. 2015. Constructing Linear-Sized Spectral Sparsification in Almost-Linear Time. In 56th Annual IEEE Symposium on Foundations of Computer Science (FOCS).
[11] Mu Li, Gary L. Miller, and Richard Peng. 2013. Iterative row sampling. In 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS).
[12] Jelani Nelson and Huy L. Nguyễn. 2013. OSNAP: Faster numerical linear algebra algorithms via sparser subspace embeddings. In 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS).
[13] Richard Peng and Daniel A. Spielman. 2014. An efficient parallel solver for SDD linear systems. In 46th Annual ACM Symposium on Theory of Computing (STOC).
[14] Richard Peng and Kanat Tangwongsan. 2012. Faster and simpler width-independent parallel algorithms for positive semidefinite programming. In 24th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
[15] Daniel A. Spielman and Nikhil Srivastava. 2011. Graph Sparsification by Effective Resistances. SIAM J. Comput. 40(6).
[16] Daniel A. Spielman and Shang-Hua Teng. 2011. Spectral sparsification of graphs. SIAM J. Comput. 40(4).
More informationLower-Stretch Spanning Trees
Lower-Stretch Spanning Trees Michael Elkin Ben-Gurion University Joint work with Yuval Emek, Weizmann Institute Dan Spielman, Yale University Shang-Hua Teng, Boston University + 1 The Setting G = (V, E,
More informationCoordinate Methods for Accelerating l Regression and Faster Approximate Maximum Flow
208 IEEE 59th Annual Symposium on Foundations of Computer Science Coordinate Methods for Accelerating l Regression and Faster Approximate Maximum Flow Aaron Sidford Department of Management Science & Engineering
More informationA Linear Round Lower Bound for Lovasz-Schrijver SDP Relaxations of Vertex Cover
A Linear Round Lower Bound for Lovasz-Schrijver SDP Relaxations of Vertex Cover Grant Schoenebeck Luca Trevisan Madhur Tulsiani Abstract We study semidefinite programming relaxations of Vertex Cover arising
More informationApproximation Algorithms
Approximation Algorithms Chapter 26 Semidefinite Programming Zacharias Pitouras 1 Introduction LP place a good lower bound on OPT for NP-hard problems Are there other ways of doing this? Vector programs
More informationGraph Partitioning Algorithms and Laplacian Eigenvalues
Graph Partitioning Algorithms and Laplacian Eigenvalues Luca Trevisan Stanford Based on work with Tsz Chiu Kwok, Lap Chi Lau, James Lee, Yin Tat Lee, and Shayan Oveis Gharan spectral graph theory Use linear
More informationThe Strong Largeur d Arborescence
The Strong Largeur d Arborescence Rik Steenkamp (5887321) November 12, 2013 Master Thesis Supervisor: prof.dr. Monique Laurent Local Supervisor: prof.dr. Alexander Schrijver KdV Institute for Mathematics
More informationAdaptive Online Gradient Descent
University of Pennsylvania ScholarlyCommons Statistics Papers Wharton Faculty Research 6-4-2007 Adaptive Online Gradient Descent Peter Bartlett Elad Hazan Alexander Rakhlin University of Pennsylvania Follow
More informationBlame and Coercion: Together Again for the First Time
Blame and Coercion: Together Again for the First Time Supplementary Material Jeremy Siek Indiana University, USA jsiek@indiana.edu Peter Thiemann Universität Freiburg, Germany thiemann@informatik.uni-freiburg.de
More informationOn the distinguishability of random quantum states
1 1 Department of Computer Science University of Bristol Bristol, UK quant-ph/0607011 Distinguishing quantum states Distinguishing quantum states This talk Question Consider a known ensemble E of n quantum
More information