Zanetti, L., Sun, H., & Peng, R. (2015). Partitioning Well-Clustered Graphs: Spectral Clustering Works! Paper presented at the Conference on Learning Theory, Paris, France.

Publisher's PDF, also known as Version of record. Link to publication record in Explore Bristol Research. PDF-document.

University of Bristol - Explore Bristol Research

General rights: This document is made available in accordance with publisher policies. Please cite only the published version using the reference above. Full terms of use are available:

JMLR: Workshop and Conference Proceedings vol 40:1–33, 2015

Partitioning Well-Clustered Graphs: Spectral Clustering Works!

Richard Peng, Massachusetts Institute of Technology, Cambridge, USA
He Sun, University of Bristol, Bristol, UK
Luca Zanetti, University of Bristol, Bristol, UK

Abstract

In this work we study the widely used spectral clustering algorithms, i.e. partition a graph into k clusters via (1) embedding the vertices of a graph into a low-dimensional space using the bottom eigenvectors of the Laplacian matrix, and (2) partitioning the embedded points via k-means algorithms. We show that, for a wide class of graphs, spectral clustering algorithms give a good approximation of the optimal clustering. While this approach was proposed in the early 1990s and has comprehensive applications, prior to our work similar results were known only for graphs generated from stochastic models. We also give a nearly-linear time algorithm for partitioning well-clustered graphs, which is based on heat kernel embeddings and approximate nearest neighbor data structures.

1. Introduction

Partitioning a graph into two or more pieces is one of the most fundamental problems in combinatorial optimization, and has comprehensive applications in various disciplines of computer science. One of the most studied graph partitioning problems is the edge expansion problem, i.e., finding a cut with few crossing edges normalized by the size of the smaller side of the cut. Formally, let G = (V, E) be an undirected graph. For any set S, the conductance of S is defined by

    φ_G(S) ≜ |E(S, V \ S)| / vol(S),

where vol(S) is the total weight of edges incident to vertices in S, and the conductance of G is

    φ(G) ≜ min_{S : vol(S) ≤ vol(G)/2} φ_G(S).

The edge expansion problem asks for a set S ⊆ V with vol(S) ≤ vol(V)/2 such that φ_G(S) = φ(G). This problem is known to be NP-hard (Matula and Shahrokhi, 1990) and, assuming the Small Set Expansion Conjecture (Raghavendra et al., 2012), does not admit a polynomial-time algorithm which achieves a constant factor approximation in the worst case.

The k-way partitioning problem is a natural generalization of the edge expansion problem. We call subsets of vertices (i.e. clusters) A_1, …, A_k a k-way partition of G if A_i ∩ A_j = ∅ for different i and j, and ⋃_{i=1}^k A_i = V.

© 2015 R. Peng, H. Sun & L. Zanetti.
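As a concrete illustration of the definitions above, here is a minimal Python sketch (not part of the paper; the function name and the toy graph are ours) that evaluates φ_G(S) for a small unweighted graph given as an adjacency-list dictionary.

```python
from typing import Dict, Hashable, Set

def conductance(adj: Dict[Hashable, Set[Hashable]], S: Set[Hashable]) -> float:
    """phi_G(S) = |E(S, V \\ S)| / vol(S) for an unweighted graph.

    `adj` maps every vertex to the set of its neighbours; vol(S) is the sum
    of the degrees of the vertices in S.  A toy helper for small graphs.
    """
    vol_S = sum(len(adj[u]) for u in S)
    crossing = sum(1 for u in S for v in adj[u] if v not in S)
    return crossing / vol_S if vol_S > 0 else 0.0

# Two triangles joined by a single edge: the natural 2-way partition
# has conductance 1/7 on each side.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(conductance(adj, {0, 1, 2}))   # 0.142857...
```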

The k-way partitioning problem asks for a k-way partition of G such that the conductance of any A_i in the partition is at most the k-way expansion constant, defined by

    ρ(k) ≜ min_{partition A_1, …, A_k} max_{1≤i≤k} φ_G(A_i).    (1.1)

Clusters of low conductance in a real network usually capture the notion of community, and algorithms for finding these subsets have applications in various domains such as community detection and network analysis. In computer vision, most image segmentation procedures are based on region-based merge and split (Coleman and Andrews, 1979), which in turn rely on partitioning graphs into multiple subsets (Shi and Malik, 2000). On the theoretical side, decomposing vertex/edge sets into multiple disjoint subsets is used in designing approximation algorithms for Unique Games (Trevisan, 2008), and efficient algorithms for graph problems (Kelner et al., 2014; Leighton and Rao, 1999; Spielman and Teng, 2011).

Despite the widespread use of various graph partitioning schemes over the past decades, the quantitative relationship between the k-way expansion constant and the eigenvalues of the graph Laplacian was unknown until a sequence of very recent results (Lee et al., 2012; Louis et al., 2012). For instance, Lee et al. (2012) proved the following higher-order Cheeger inequality:

    λ_k / 2 ≤ ρ(k) ≤ O(k²) √λ_k,    (1.2)

where 0 = λ_1 ≤ ⋯ ≤ λ_n are the eigenvalues of the normalized Laplacian matrix 𝓛 of G. Informally, the higher-order Cheeger inequality shows that a graph G has a k-way partition with low ρ(k) if and only if λ_k is small. Indeed, (1.2) implies that a large gap between λ_{k+1} and ρ(k) guarantees (i) the existence of a k-way partition {S_i}_{i=1}^k with bounded φ_G(S_i) ≤ ρ(k), and (ii) that any (k+1)-way partition of G contains a subset with significantly higher conductance ρ(k+1) ≥ λ_{k+1}/2 compared with ρ(k). Hence, a suitable lower bound on the gap Υ for some k, defined by

    Υ ≜ λ_{k+1} / ρ(k),    (1.3)

implies the existence of a k-way partition for which every cluster has low conductance, and that G is a well-clustered graph.

In this paper we study well-clustered graphs which satisfy a gap assumption on Υ; similar assumptions were addressed in previous references, e.g. (Zhu et al., 2013; Oveis Gharan and Trevisan, 2014). To further justify the assumption on Υ, notice first that if the gap assumption does not hold, then G can be partitioned into at least k + 1 subsets of low conductance; secondly, the assumption on Υ is closely related to the gap between λ_{k+1} and λ_k for some k. The value of k for which there is a gap between λ_{k+1} and λ_k has been observed empirically as an indication of the correct number of clusters (Fortunato, 2010; von Luxburg, 2007).

1.1. Our Results

We give structural results that show close connections between the eigenvectors and the indicator vectors of the clusters. This characterization allows us to show that many variants of k-means algorithms, that are based on the spectral embedding and that work in practice, can be rigorously analyzed in theory. Moreover, exploiting our gap assumption, we can approximate this spectral embedding using the heat kernel of the graph.

Combining this with locality-sensitive hashing, we give a nearly-linear time algorithm for the k-way partitioning problem.

Our structural results can be summarized as follows. Let {S_i}_{i=1}^k be a k-way partition of G achieving ρ(k) defined in (1.1). We define {ḡ_i}_{i=1}^k to be the normalized indicator vectors of the clusters {S_i}_{i=1}^k, and {f_i}_{i=1}^k to be the eigenvectors corresponding to the k smallest eigenvalues of 𝓛. We show that, under the condition Υ = Ω(k²), the span of {ḡ_i}_{i=1}^k and the span of {f_i}_{i=1}^k are close to each other, which is stated formally in Theorem 1.1.

Theorem 1.1 (The Structure Theorem) Let {S_i}_{i=1}^k be a k-way partition of G achieving ρ(k), and let Υ = λ_{k+1}/ρ(k) = Ω(k²). Assume that {f_i}_{i=1}^k are the first k eigenvectors of the matrix 𝓛, and ḡ_1, …, ḡ_k ∈ ℝ^n are the indicator vectors of {S_i}_{i=1}^k with proper normalization.¹ Then, the following statements hold:

1. For every ḡ_i, there is a linear combination of {f_i}_{i=1}^k, called f̂_i, such that ‖ḡ_i − f̂_i‖² ≤ 1/Υ.
2. For every f_i, there is a linear combination of {ḡ_i}_{i=1}^k, called ĝ_i, such that ‖f_i − ĝ_i‖² ≤ 1.1k/Υ.

This theorem generalizes a result shown by Arora et al. (Arora et al. 2010, Theorem 2.2), which proves the easier direction (the first statement of Theorem 1.1), and can be considered as a stronger version of the well-known Davis-Kahan theorem (Davis and Kahan, 1970). We remark that, although we use the higher-order Cheeger inequality (1.2) to motivate the definition of Υ, our proof of the structure theorem is self-contained. Specifically, it omits much of the machinery used in the proofs of higher-order and improved Cheeger inequalities (Kwok et al., 2013; Lee et al., 2012).

The structure theorem has several applications. For instance, we look at the well-known spectral embedding F : V → ℝ^k defined by

    F(u) ≜ (1/NormalizationFactor(u)) · (f_1(u), …, f_k(u))^⊺,    (1.4)

where NormalizationFactor(u) ∈ ℝ is the normalization factor for u ∈ V. We use Theorem 1.1 to show that this well-known spectral embedding exhibits very nice geometric properties: (i) all points F(u) from the same cluster are close to each other, (ii) most pairs of points F(u), F(v) from different clusters are far from each other, and (iii) the bigger the value of Υ, the more concentrated the embedded points within the same cluster.

Based on these facts, we analyze the performance of spectral k-means algorithms², aiming at answering the following longstanding open question: Why do spectral k-means algorithms perform well in practice? We show that the partition {A_i}_{i=1}^k produced by the spectral k-means algorithm gives a good approximation of any optimal partition {S_i}_{i=1}^k: every A_i has low conductance, and has large overlap with its corresponding S_i. This allows us to apply various k-means algorithms (Kumar et al., 2004; Ostrovsky et al., 2012) to give spectral clustering algorithms, with different time versus approximation tradeoffs. These algorithms have comprehensive applications, and have been the subject of extensive experimental studies for more than 20 years, e.g. (Ng et al., 2002; von Luxburg, 2007).

1. See the formal definition in Section 3.
2. For simplicity, we use the phrase spectral k-means algorithms to refer to the algorithms which combine a spectral embedding with a k-means algorithm in Euclidean space.
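For readers who want to experiment with the embedding (1.4), the following sketch (assuming SciPy; the function name and the √d_u normalization choice anticipate Section 4.2, and this is not the paper's implementation) computes F together with the bottom eigenvalues, so that the gap between λ_{k+1} and λ_k can also be inspected.

```python
import numpy as np
from scipy.sparse import csgraph
from scipy.sparse.linalg import eigsh

def spectral_embedding(A, k):
    """Row u of the returned (n x k) array is
    F(u) = (1/sqrt(d_u)) * (f_1(u), ..., f_k(u)),
    where f_1, ..., f_k are orthonormal eigenvectors of the k smallest
    eigenvalues of the normalized Laplacian.  The k+1 smallest eigenvalues
    are returned as well, so the gap between lambda_{k+1} and lambda_k
    can be examined."""
    d = np.asarray(A.sum(axis=1)).ravel()
    L = csgraph.laplacian(A, normed=True)       # I - D^{-1/2} A D^{-1/2}
    vals, vecs = eigsh(L, k=k + 1, which='SM')  # 'SM': smallest eigenvalues
    order = np.argsort(vals)
    vals, vecs = vals[order], vecs[:, order]
    F = vecs[:, :k] / np.sqrt(d)[:, None]       # divide row u by sqrt(d_u)
    return F, vals
```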

While similar results on spectral clustering mainly focus on graphs in the stochastic block model, to the best of our knowledge this is the first rigorous analysis of spectral clustering for general graphs exhibiting a multi-cut structure. Our result is as follows:

Theorem 1.2 (Approximation Guarantee of Spectral k-Means Algorithms) Let G be a graph satisfying the condition Υ = λ_{k+1}/ρ(k) = Ω(k³), and k ∈ ℕ. Let F : V → ℝ^k be the embedding defined above. Let {A_i}_{i=1}^k be a k-way partition by any k-means algorithm running in ℝ^k that achieves an approximation ratio APT. Then, the following statements hold: (i) vol(A_i △ S_i) = O(APT · k³/Υ · vol(S_i)), and (ii) φ_G(A_i) = O(φ_G(S_i) + APT · k³/Υ).

We further study fast algorithms for partitioning well-clustered graphs. Notice that, for moderately large values of k, e.g. k = ω(log n), directly applying k-means algorithms and Theorem 1.2 does not give a nearly-linear time algorithm, since (i) obtaining the spectral embedding (1.4) requires Ω(mk) time for computing k eigenvectors, and (ii) most k-means algorithms run in time Ω(nk).

To overcome the first obstacle, we study the so-called heat kernel embedding, an embedding from V to ℝ^n, and approximate the squared distance ‖F(u) − F(v)‖² of the embedded points F(u) and F(v) via their heat-kernel distance. Since a desired approximation of the heat kernel distances of vertices is computable in nearly-linear time, this approach avoids the computation of eigenvectors. For the second obstacle, instead of applying k-means algorithms as a black box we apply approximate nearest-neighbor algorithms. Our approach presents an ad-hoc version of k-means algorithms, and indicates that in certain scenarios the standard Lloyd-type heuristic widely used in k-means algorithms can eventually be avoided. Our third result is as follows:

Theorem 1.3 (Nearly-Linear Time Algorithm For Partitioning Graphs) Let G = (V, E) be a graph of n vertices and m edges, and let k ∈ ℕ be a parameter. Assume that Υ = λ_{k+1}/ρ(k) = Ω(k⁵), and {S_i}_{i=1}^k is a k-way partition such that φ_G(S_i) ≤ ρ(k). Then there is an algorithm which runs in Õ(m) time³ and outputs a k-way partition {A_i}_{i=1}^k such that (i) vol(A_i △ S_i) = O(k³ log² k/Υ · vol(S_i)), and (ii) φ_G(A_i) = O(φ_G(S_i) + k³ log² k/Υ).

We remark that bounds for other expansion parameters of k-way partitioning can be derived from our analysis as well. For instance, one quantity studied extensively in machine learning is the normalized cut (Shi and Malik, 2000). While the k-way expansion of a graph is the maximum conductance of the sets in the partition, the corresponding normalized cut is the sum of the conductances of all the sets in this partition. It is easy to see that these two measures differ by at most a factor of k, and so the normalized cut of a k-way partition from spectral clustering can be bounded as well.

1.2. Related Work

There is a large amount of literature on graph partitioning under various settings. Oveis Gharan and Trevisan (2014) formulate the notion of clusters with respect to the inner and outer conductance: a cluster S should have low outer conductance, and the conductance of the induced subgraph on S should be high. Under a gap assumption between λ_{k+1} and λ_k, they present a polynomial-time algorithm which finds a k-way partition {A_i}_{i=1}^k satisfying the inner- and outer-conductance conditions. In order to ensure that every A_i has high inner conductance, they assume that λ_{k+1} ≥ poly(k) · λ_k^{1/4}, which is much stronger than our assumption. Moreover, their algorithm runs in polynomial time, in contrast to our nearly-linear time algorithm.

3. The Õ(·) term hides a factor of poly log n.

Dey et al. (2014) study the properties of the spectral embedding for graphs having a gap between λ_k and λ_{k+1} and present a k-way partition algorithm, which is based on k-center clustering and is similar in spirit to our work. Using combinatorial arguments, they are able to show that the clusters concentrate around k distant points in the spectral embedding. In contrast to our work, their result only holds for bounded-degree graphs, and cannot provide an approximation guarantee for individual clusters. Moreover, their algorithm runs in nearly-linear time only if k = O(poly log n).

We also explore the separation between λ_k and λ_{k+1} from an algorithmic perspective, and show that this assumption interacts well with heat-kernel embeddings. The heat kernel has been used in previous algorithms for local partitioning (Chung, 2009) and balanced separators (Orecchia et al., 2012). It also plays a key role in current efficient approximation algorithms for finding low-conductance cuts (Orecchia et al., 2008; Sherman, 2009). However, most of these theoretical guarantees are obtained through the matrix multiplicative weights update framework (Arora et al., 2012; Arora and Kale, 2007). Our algorithm instead directly uses the heat-kernel embedding to find low-conductance cuts.

There is also a considerable amount of research on partitioning random graphs. For instance, in the Stochastic Block Model (SBM) (McSherry, 2001), the input graph with k clusters is generated according to probabilities p and q with p > q: an edge between any two vertices within the same cluster is placed with probability p, and an edge between any two vertices from different clusters is placed with probability q. It is proven that spectral algorithms give the correct clustering for certain ranges of p and q, e.g. (McSherry, 2001; Rohe et al., 2011; Vu, 2014). However, the analysis of these algorithms cannot be easily generalized to our setting: we consider graphs where edges are not necessarily chosen independently with certain probabilities, but can be added in an adversarial way. For this reason, standard perturbation theorems used in the analysis of algorithms for SBMs, such as the Davis-Kahan theorem (Davis and Kahan, 1970), cannot always be applied, and ad-hoc arguments specific for graphs, like our structure theorem (Theorem 1.1), become necessary.

2. Preliminaries

Let G = (V, E) be an undirected and unweighted graph with n vertices and m edges. The set of neighbors of a vertex u is represented by N(u), and its degree is d_u = |N(u)|. For any set S ⊆ V, let vol(S) ≜ Σ_{u∈S} d_u. For any sets S, T ⊆ V, we define E(S, T) to be the set of edges between S and T, i.e. E(S, T) ≜ {{u, v} : u ∈ S and v ∈ T}. For simplicity, we write ∂S = E(S, V \ S) for any set S ⊆ V. For two sets X and Y, the symmetric difference of X and Y is defined as X △ Y ≜ (X \ Y) ∪ (Y \ X).

We will work extensively with algebraic objects related to G. We use D to denote the n × n diagonal matrix with D_{uu} = d_u for u ∈ V[G]. The Laplacian matrix of G is defined by L ≜ D − A, where A is the adjacency matrix of G. The normalized Laplacian matrix of G is defined by 𝓛 ≜ D^{-1/2} L D^{-1/2} = I − D^{-1/2} A D^{-1/2}. We denote its n eigenvalues by 0 = λ_1 ≤ ⋯ ≤ λ_n, with corresponding orthonormal eigenvectors f_1, …, f_n.

Note that if G is connected, the first eigenvector is f_1 = D^{1/2} f / ‖D^{1/2} f‖, where f is any non-zero constant vector.
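The following small NumPy sketch (illustrative only; the toy graph is ours) builds L and 𝓛 for a concrete graph and confirms the two facts just stated: λ_1 = 0, and for a connected graph the corresponding eigenvector is proportional to D^{1/2} applied to a constant vector.

```python
import numpy as np

def laplacians(A):
    """Given a dense adjacency matrix A of an undirected graph, return the
    Laplacian L = D - A and the normalized Laplacian
    N = D^{-1/2} L D^{-1/2} = I - D^{-1/2} A D^{-1/2}."""
    d = A.sum(axis=1)
    L = np.diag(d) - A
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return L, D_inv_sqrt @ L @ D_inv_sqrt

# A triangle with a pendant vertex: the graph is connected, so lambda_1 = 0
# and its eigenvector is proportional to D^{1/2} * (1, 1, 1, 1).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L, N = laplacians(A)
vals, vecs = np.linalg.eigh(N)
print(np.isclose(vals[0], 0.0))                                       # True
f1 = vecs[:, 0]
print(np.allclose(f1 / f1[0], np.sqrt(A.sum(axis=1)) / np.sqrt(2)))   # True
```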

For a vector x ∈ ℝ^n, the Euclidean norm of x is given by ‖x‖ = (Σ_{i=1}^n x_i²)^{1/2}. For any f : V → ℝ, the Rayleigh quotient of f with respect to graph G is given by

    R(f) ≜ f^⊺ L f / ‖f‖²_D = (Σ_{{u,v}∈E(G)} (f(u) − f(v))²) / (Σ_u d_u f(u)²),

where ‖f‖²_D ≜ f^⊺ D f. Based on the Rayleigh quotient, the conductance of a cluster S_i can be expressed as φ_G(S_i) = R(g_i), where g_i is the indicator vector of S_i defined in (3.1) below, and the gap Υ can be written as

    Υ = λ_{k+1}/ρ(k) = min_{1≤i≤k} λ_{k+1}/φ_G(S_i) = min_{1≤i≤k} λ_{k+1}/R(g_i).    (2.1)

Throughout the rest of the paper, we will use S_1, …, S_k to denote a k-way partition of G achieving the minimum conductance ρ(k). Note that this partition may not be unique.

3. Connection Between Eigenvectors and Indicator Vectors of Clusters

In this section we study the relations between the multiple cuts of a graph and the eigenvectors of the graph's normalized Laplacian matrix. Given clusters S_1, …, S_k, define the indicator vector of cluster S_i by

    g_i(u) = 1 if u ∈ S_i, and g_i(u) = 0 if u ∉ S_i,    (3.1)

and define the corresponding normalized indicator vector by

    ḡ_i = D^{1/2} g_i / ‖D^{1/2} g_i‖.    (3.2)

A basic result in spectral graph theory states that G has k connected components if and only if the k smallest eigenvalues are 0, implying that the spaces spanned by f_1, …, f_k and ḡ_1, …, ḡ_k are the same. Generalizing this result, we expect that these two spaces would still be similar if the k components of G are loosely connected, in the sense that (i) every eigenvector f_i can be approximately expressed by a linear combination of {ḡ_i}_{i=1}^k, and (ii) every indicator vector ḡ_i can be approximately expressed by a linear combination of {f_i}_{i=1}^k. This leads to our structure theorem, which is illustrated in Figure 1.

Figure 1: Relations among {f̂_i}, {f_i}, {ḡ_i}, and {ĝ_i} given in Theorem 3.1. Here f̂_i is a linear combination of {f_j} with coefficients α_j^(i) and satisfies ‖ḡ_i − f̂_i‖² ≤ 1/Υ (Part 1), while ĝ_i is a linear combination of {ḡ_j} with coefficients β_j^(i) and satisfies ‖f_i − ĝ_i‖² ≤ 1.1k/Υ (Part 2). Υ is the gap defined with respect to λ_{k+1} and ρ(k).
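As a quick sanity check of the identity φ_G(S_i) = R(g_i) used in (2.1), the following NumPy sketch (illustrative only; the helper name and toy graph are ours) evaluates both quantities for an explicit vertex set.

```python
import numpy as np

def indicator_rayleigh(A, S):
    """Return (R(g_S), phi_G(S)) for the indicator vector g_S of S, where
    R(f) = f^T L f / f^T D f.  The numerator counts the edges leaving S and
    the denominator is vol(S), so the two values coincide."""
    n = A.shape[0]
    d = A.sum(axis=1)
    L = np.diag(d) - A
    g = np.zeros(n)
    g[list(S)] = 1.0
    rayleigh = (g @ L @ g) / (g @ (d * g))
    crossing = sum(A[u, v] for u in S for v in range(n) if v not in S)
    return rayleigh, crossing / d[list(S)].sum()

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(indicator_rayleigh(A, {0, 1, 2}))   # both values equal 1/7
```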

Theorem 3.1 Let Υ = Ω(k²), and 1 ≤ i ≤ k. Then, the following statements hold:

1. There is a linear combination of the eigenvectors f_1, …, f_k with coefficients α_j^(i), namely f̂_i = α_1^(i) f_1 + ⋯ + α_k^(i) f_k, such that ‖ḡ_i − f̂_i‖² ≤ 1/Υ.
2. There is a linear combination of the vectors ḡ_1, …, ḡ_k with coefficients β_j^(i), namely ĝ_i = β_1^(i) ḡ_1 + ⋯ + β_k^(i) ḡ_k, such that ‖f_i − ĝ_i‖² ≤ 1.1k/Υ.

Part 1 of Theorem 3.1 shows that the normalized indicator vector ḡ_i of every cluster S_i can be approximated by a linear combination of the first k eigenvectors, with an error depending on the value of Υ. The proof follows from the fact that if ḡ_i has small Rayleigh quotient, then the inner product between ḡ_i and the eigenvectors corresponding to larger eigenvalues must be small. This statement was also shown implicitly in Theorem 2.2 of Arora et al. (2010).

Part 2 of Theorem 3.1 is more interesting, and shows that the opposite direction holds as well, i.e., any f_i (1 ≤ i ≤ k) can be approximated by a linear combination of the normalized indicator vectors {ḡ_i}_{i=1}^k. To sketch the proof, note that if we could write every ḡ_i exactly as a linear combination of {f_i}_{i=1}^k, then we could write every f_i (1 ≤ i ≤ k) as a linear combination of {ḡ_i}_{i=1}^k. This is because both {f_i}_{i=1}^k and {ḡ_i}_{i=1}^k are sets of linearly independent vectors of the same dimension and span{ḡ_1, …, ḡ_k} ⊆ span{f_1, …, f_k}. However, the ḡ_i's are only close to a linear combination of the first k eigenvectors, as shown in Part 1. We denote this combination by f̂_i, and use the fact that the approximation errors are small to show that the vectors {f̂_i}_{i=1}^k are almost orthogonal to each other. This allows us to show that span{f̂_1, …, f̂_k} = span{f_1, …, f_k}, which implies Part 2.

Theorem 3.1 shows a close connection between the first k eigenvectors and the indicator vectors of the clusters. We leverage this, and the fact that the ĝ_i's are almost orthogonal to each other, to show that for any two different clusters S_i, S_j there exists an eigenvector having reasonably different values on the coordinates which correspond to S_i and S_j.

Lemma 3.2 Let Υ = Ω(k³). For any 1 ≤ i ≤ k, let ĝ_i = β_1^(i) ḡ_1 + ⋯ + β_k^(i) ḡ_k be such that ‖f_i − ĝ_i‖² ≤ 1.1k/Υ. Then, for any ℓ ≠ j, there exists i ∈ {1, …, k} such that

    |β_ℓ^(i) − β_j^(i)| ≥ ζ ≜ 1/(10√k).    (3.3)

We point out that it was already shown in Kwok et al. (2013) that the first k eigenvectors can be approximated by a (2k+1)-step function. The quality of that approximation is the same as the one given by our structure theorem. However, a (2k+1)-step approximation is not enough to show that an entire cluster is concentrated around a certain point.

We further compare the structure theorem with standard matrix perturbation results, and point out that the standard matrix perturbation theorems cannot be applied in our setting. For instance, consider a well-clustered graph G that contains a subset C of a cluster such that most neighbors of vertices in C are outside the cluster. In this case the adjacency matrix representing the crossing edges of G has high spectral norm, and hence a standard matrix perturbation argument would not give us a meaningful result. However, our structure theorem takes into account the fact that vol(C) must be small, and that is why the structure theorem is needed to analyze the cut-structure of a graph.

4. Analysis of Spectral k-Means Algorithms

In this section we analyze an algorithm based on the classical spectral clustering paradigm, and give an approximation guarantee of this method on well-clustered graphs. We will show that any k-means algorithm AlgoMean(X, k) with a certain approximation guarantee can be used for the k-way partitioning problem. Furthermore, it suffices to call AlgoMean in a black-box manner with a point set X ⊆ ℝ^d.

This section is structured as follows. We first give a quick overview of spectral and k-means clustering in Section 4.1. In Section 4.2, we use the structure theorem to analyze the spectral embedding. Section 4.3 gives a general result about the k-means algorithm when applied to this embedding, and the proof sketch of Theorem 1.2.

4.1. k-Means Clustering

Given a set of points X ⊆ ℝ^d, a k-means algorithm AlgoMean(X, k) seeks to find a set K of k centers c_1, …, c_k that minimize the sum of the squared distances between every x ∈ X and the center to which it is assigned. Formally, for any partition X_1, …, X_k of the set X ⊆ ℝ^d, we define the cost function by

    COST(X_1, …, X_k) ≜ min_{c_1,…,c_k ∈ ℝ^d} Σ_{i=1}^k Σ_{x∈X_i} ‖x − c_i‖²,

i.e., the COST function minimizes the total squared distance between the points x and their individually closest centers c_i, where c_1, …, c_k are chosen arbitrarily in ℝ^d. We further define the optimal clustering cost by

    Δ_k²(X) ≜ min_{partition X_1,…,X_k} COST(X_1, …, X_k).    (4.1)

A typical spectral k-means algorithm on graphs can be described as follows: (i) Compute the bottom k eigenvectors f_1, …, f_k of the normalized Laplacian matrix⁴ of graph G. (ii) Map every vertex u ∈ V[G] to a point F(u) ∈ ℝ^k according to

    F(u) = (1/NormalizationFactor(u)) · (f_1(u), …, f_k(u))^⊺,    (4.2)

with a proper normalization factor NormalizationFactor(u) ∈ ℝ for each u ∈ V. (iii) Let X ≜ {F(u) : u ∈ V} be the set of embedded points obtained from the vertices of G. Run AlgoMean(X, k), and group the vertices of G into k clusters according to the output of AlgoMean(X, k).

This approach that combines a k-means algorithm with a spectral embedding has been widely used in practice for a long time, although there was a lack of rigorous analysis of its performance prior to our result.

4.2. Analysis of the Spectral Embedding

The first step of the k-means clustering described above is to map the vertices of a graph into points in Euclidean space, through the spectral embedding (4.2). This subsection analyzes the properties of this embedding. Let us define the normalization factor to be NormalizationFactor(u) ≜ √d_u. We will show that the embedding (4.2) with the normalization factor above has very nice properties: embedded points from the same cluster S_i are concentrated around their center c_i ∈ ℝ^k, and embedded points from different clusters of G are far from each other.

4. Other graph matrices (e.g. the adjacency matrix, and the Laplacian matrix) are also widely used in practice. Notice that, with proper normalization, the choice of these matrices does not substantially influence the performance of k-means algorithms.
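The three steps (i)–(iii) can be put together in a few lines. The sketch below assumes SciPy and uses scikit-learn's KMeans as the black-box AlgoMean; any k-means procedure with an approximation guarantee could be substituted, and this is not the implementation analyzed in the paper.

```python
import numpy as np
from scipy.sparse import csgraph
from scipy.sparse.linalg import eigsh
from sklearn.cluster import KMeans

def spectral_kmeans(A, k, random_state=0):
    """Spectral k-means: bottom-k eigenvectors of the normalized Laplacian,
    the embedding F(u) = (1/sqrt(d_u)) * (f_1(u), ..., f_k(u)), and a k-means
    call on the embedded points.  Returns an array of n cluster labels."""
    d = np.asarray(A.sum(axis=1)).ravel()
    L = csgraph.laplacian(A, normed=True)
    _, vecs = eigsh(L, k=k, which='SM')          # columns are f_1, ..., f_k
    X = vecs / np.sqrt(d)[:, None]               # rows are the points F(u)
    return KMeans(n_clusters=k, n_init=10,
                  random_state=random_state).fit_predict(X)
```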

These properties imply that a simple k-means algorithm is able to produce a good clustering.⁵

We first define k points p^(i) ∈ ℝ^k (1 ≤ i ≤ k), where

    p^(i) ≜ (1/√vol(S_i)) · (β_1^(i), …, β_k^(i))^⊺.    (4.3)

We will show in Lemma 4.1 that all embedded points X_i ≜ {F(u) : u ∈ S_i} (1 ≤ i ≤ k) are concentrated around p^(i). Moreover, we bound the total squared distance between the points of the ith cluster X_i and p^(i), which is proportional to 1/Υ: the bigger the value of Υ, the more concentrated the points within the same cluster. Notice that we do not claim that p^(i) is the actual center of X_i. However, these approximate centers p^(i) suffice for our analysis.

Lemma 4.1 It holds that Σ_{i=1}^k Σ_{u∈S_i} d_u ‖F(u) − p^(i)‖² ≤ 1.1k²/Υ.

We will further show in Lemma 4.2 that these points p^(i) (1 ≤ i ≤ k) exhibit another excellent property: the distance between p^(i) and p^(j) is inversely proportional to the volume of the smaller cluster between S_i and S_j. Therefore, points in a cluster S_i of smaller vol(S_i) are far from points in a cluster S_j of bigger vol(S_j). Notice that, if this were not the case, a small misclassification of points in a bigger cluster S_j could introduce a large error in the cluster of smaller volume.

Lemma 4.2 For every i ≠ j, it holds that ‖p^(i) − p^(j)‖² ≥ ζ²/(10 · min{vol(S_i), vol(S_j)}), where ζ is defined in (3.3).

4.3. Proof Sketch of Theorem 1.2

Now we analyze why spectral k-means algorithms perform well for solving the k-way partitioning problem. We assume that A_1, …, A_k is any k-way partition returned by a k-means algorithm with an approximation ratio of APT. We map every vertex u to d_u identical points in ℝ^k. This trick allows us to bound the volume of the overlap between the clusters retrieved by a k-means algorithm and the optimal ones. For this reason we define the cost function of a partition A_1, …, A_k of V[G] by

    COST(A_1, …, A_k) ≜ min_{c_1,…,c_k ∈ ℝ^k} Σ_{i=1}^k Σ_{u∈A_i} d_u ‖F(u) − c_i‖²,

and the optimal clustering cost is defined by

    Δ_k² ≜ min_{partition A_1,…,A_k} COST(A_1, …, A_k).

Lemma 4.3 It holds that Δ_k² ≤ 1.1k²/Υ.

5. Notice that this embedding is similar to the one used in Lee et al. (2012), with the only difference that F(u) is not normalized and so is not necessarily a unit vector. This difference, though, is crucial for our analysis.

Since A_1, …, A_k is the output of a k-means algorithm with approximation ratio APT, by Lemma 4.3 we have that COST(A_1, …, A_k) ≤ APT · Δ_k² ≤ 1.1 · APT · k²/Υ. We will show that this upper bound of 1.1 · APT · k²/Υ suffices to show that the approximate clustering A_1, …, A_k is close to the actual clustering S_1, …, S_k, in the sense that (i) every A_i has low conductance, and (ii) under a proper permutation σ : {1, …, k} → {1, …, k}, the symmetric difference between A_i and S_{σ(i)} is small. The fact is proven by contradiction: if we could always find a set A_i with a large symmetric difference with its corresponding S_{σ(i)}, regardless of how we map {A_i} to their corresponding {S_{σ(i)}}, then the COST value would be high, which contradicts the fact that COST(A_1, …, A_k) ≤ 1.1 · APT · k²/Υ. The core of the whole contradiction argument is the following technical lemma:

Lemma 4.4 Let A_1, …, A_k be a partition of V. Suppose that, for every permutation of the indices σ : {1, …, k} → {1, …, k}, there exists i such that vol(A_i △ S_{σ(i)}) ≥ 2ε · vol(S_{σ(i)}) for ε ≥ 10⁵ · k³/Υ. Then COST(A_1, …, A_k) ≥ 10⁻⁴ · ε/k.

5. Partitioning Well-Clustered Graphs in Nearly-Linear Time

In this section we present a nearly-linear time algorithm for partitioning well-clustered graphs. At a high level, our algorithm follows the general framework of k-means algorithms, and consists of two steps: the seeding step and the grouping step. The seeding step chooses k candidate centers such that each one is close to the actual center of a different cluster. The grouping step assigns the remaining vertices to their individually closest candidate centers. We show that, thanks to the well-separation properties of the embedded points, both steps of a k-means algorithm can be greatly simplified. Together with the heat kernel embedding, we obtain an ad-hoc version of k-means algorithms, which runs in nearly-linear time. Throughout the whole section, we assume that Υ = Ω(k⁵), where Ω(·) here hides a factor of log^c k for some constant c.

5.1. The Seeding Step

The seeding step chooses k points from {F(u)}_{u∈V[G]} such that, with constant probability, every chosen point (i) belongs to a different cluster, and (ii) is close to the center of a cluster. To give an intuition for this seeding step, notice that our approximate center p^(i) satisfies ‖p^(i)‖² ≈ 1/vol(S_i), and most embedded points F(u) are close to their approximate centers, implying that ‖F(u)‖² is approximately equal to 1/vol(S_i) for most vertices u ∈ S_i. Hence, sampling points F(u) with probability proportional to d_u · ‖F(u)‖² ensures that points from different clusters are sampled with approximately the same probability. We prove that, after sampling Θ(k log k) points, which constitute the sampling set C, with constant probability there is at least one point sampled from each cluster. Next we remove the sampled points in C which are close to each other, until there are exactly k points left, and prove that with constant probability the resulting set contains exactly one point from each cluster. See Algorithm 1 for a formal description.

We remark that choosing good candidate centers is crucial for most k-means algorithms, and has been studied extensively in the literature (e.g. Arthur and Vassilvitskii 2007; Ostrovsky et al. 2012). While recent algorithms obtain good initial centers by iteratively picking points from a non-uniform distribution and take Ω(nk) time, Algorithm 1 runs in Õ(k²) time.

Algorithm 1 SEEDANDTRIM(k, x)
1: input: the number of clusters k, and the embedding {x(u)}_{u∈V[G]}
2: Let N = Θ(k log k).
3: for i = 1, …, N do
4:   Set c_i = u with probability proportional to d_u ‖x(u)‖².
5: end for
6: for i = 1, …, N do
7:   Delete all c_j with j < i such that ‖x(c_i) − x(c_j)‖² < ‖x(c_i)‖² / (10⁴ k).
8: end for
9: return c_1, …, c_k

5.2. The Grouping Step

After obtaining the candidate centers, we group the remaining vertices by assigning vertex u to cluster S_i if F(u) is closer to c_i than to the other candidate centers. A naive implementation of this step takes Ω(nk) time. To speed it up, we apply the ε-approximate nearest neighbor data structure (Indyk and Motwani, 1998), which gives us a nearly-linear time algorithm for the grouping step.

5.3. Dealing with Large k

Since computing the spectral embedding F : V[G] → ℝ^k takes Ω(mk) time, SEEDANDTRIM(k, F) and the grouping step together run in nearly-linear time as long as k = O(poly log n). However, for larger values of k this approach becomes problematic, as it is not clear how to obtain the embedding F in nearly-linear time. To handle this case, we notice that both steps only use the pairwise distances of the embedded points F(u) and F(v), rather than the spectral embedding itself. We show that these distances can be approximated by the so-called heat kernel distance, which can be approximately computed in nearly-linear time. Formally speaking, the heat kernel of G with parameter t ≥ 0, called the temperature, is defined by

    H_t ≜ e^{-t𝓛} = Σ_{i=1}^n e^{-tλ_i} f_i f_i^⊺.    (5.1)

We view the heat kernel as a geometric embedding from V[G] to ℝ^n, where the jth coordinate of the embedding is given by the jth bottom eigenvector of H_t, i.e.

    x_t(u) ≜ (1/√d_u) · (e^{-tλ_1} f_1(u), …, e^{-tλ_n} f_n(u)).    (5.2)

We define the squared distance between the points x_t(u) and x_t(v) by

    η_t(u, v) ≜ ‖x_t(u) − x_t(v)‖².    (5.3)

We prove that, under the condition k = Ω(log n) and the gap assumption on Υ, there is a wide range of t for which η_t(u, v) gives a good approximation of ‖F(u) − F(v)‖².
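The heat-kernel distances η_t(u, v) can be estimated without computing any eigenvectors. One standard route, sketched below under the assumption that SciPy's expm_multiply is available, applies exp(−t𝓛) to O(log n) random Gaussian vectors: the rows of D^{-1/2} exp(−t𝓛) have pairwise squared distances exactly η_t(u, v), and a Johnson–Lindenstrauss projection (cf. Dasgupta and Gupta, 2003) preserves these distances up to a small multiplicative error. This is only an illustration; the paper's nearly-linear time guarantee relies on a different approximation of the matrix exponential.

```python
import numpy as np
from scipy.sparse import csgraph
from scipy.sparse.linalg import expm_multiply

def heat_kernel_sketch(A, t, num_projections=None, seed=0):
    """Randomly projected heat-kernel embedding.

    Instead of forming H_t = exp(-t * N) explicitly (N the normalized
    Laplacian), apply exp(-t * N) to O(log n) random Gaussian vectors.  The
    pairwise squared distances between the rows of the result approximate
    eta_t(u, v) up to a small multiplicative error."""
    n = A.shape[0]
    if num_projections is None:
        num_projections = max(1, int(np.ceil(24 * np.log(n))))
    rng = np.random.default_rng(seed)
    d = np.asarray(A.sum(axis=1)).ravel()
    N = csgraph.laplacian(A, normed=True)
    G = rng.standard_normal((n, num_projections)) / np.sqrt(num_projections)
    Y = expm_multiply(-t * N, G)       # exp(-tN) applied to the random vectors
    return Y / np.sqrt(d)[:, None]     # pairwise row distances approximate eta_t

def eta(Y, u, v):
    """Approximate heat-kernel distance eta_t(u, v) from the sketched rows."""
    diff = Y[u] - Y[v]
    return float(diff @ diff)
```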

Lemma 5.1 Let k = Ω(log n) and Υ = Ω(k³). Then, there is a t such that, with high probability, the embedding {x_t(u)}_{u∈V[G]} defined by (5.2) satisfies

    e^{-1} · ‖F(u) − F(v)‖² ≤ ‖x_t(u) − x_t(v)‖² ≤ ‖F(u) − F(v)‖² + n^{-c},
    e^{-1} · ‖F(u)‖² ≤ ‖x_t(u)‖² ≤ ‖F(u)‖² + n^{-c}

for all u, v, where c is some constant. Moreover, these pairwise distances ‖x_t(u) − x_t(v)‖² for all {u, v} ∈ E(G) can be approximated up to a constant factor in Õ(m) time.

Now we use a doubling argument to find a required t: we run the seeding and the grouping steps using the heat kernel embedding for all t of the form t = 2^i, where i = 1, 2, …, O(log n), and keep track of the best partition found so far. It is easy to see that the final partition after enumerating these values of t gives a desired approximation. The overall description of our algorithm for large values of k is shown in Algorithm 2.

Algorithm 2 Clustering Algorithm
1: INPUT: G, k
2: for i = 1, …, k do
3:   A_i := ∅
4: end for
5: COST(A_1, …, A_k) := ∞
6: for t = 2, 4, 8, …, poly(n) do
7:   (c_1, …, c_k) ← SEEDANDTRIM(k, x_t)
8:   Compute a partition A′_1, …, A′_k of V: for every v ∈ V, assign v to its nearest center c_i using the ε-NNS algorithm with ε = log k.
9:   If COST(A′_1, …, A′_k) ≤ COST(A_1, …, A_k), set A_i := A′_i for 1 ≤ i ≤ k.
10: end for
11: return (A_1, …, A_k)

Acknowledgments

Part of this work was done while He Sun and Luca Zanetti worked at the Max Planck Institute for Informatics, and while He Sun was visiting the Simons Institute for the Theory of Computing, UC Berkeley. We are grateful to Luca Trevisan for insightful comments on an early version of our paper, and to Gary Miller for very helpful discussions about heat kernels on graphs.

References

Sanjeev Arora and Satyen Kale. A combinatorial, primal-dual approach to semidefinite programs. In 39th Annual ACM Symposium on Theory of Computing (STOC'07), pages 227–236, 2007.

Sanjeev Arora, Boaz Barak, and David Steurer. Subexponential algorithms for unique games and related problems. In 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS'10), 2010.

Sanjeev Arora, Elad Hazan, and Satyen Kale. The multiplicative weights update method: a meta-algorithm and applications. Theory of Computing, 8(1):121–164, 2012.

David Arthur and Sergei Vassilvitskii. k-means++: the advantages of careful seeding. In 18th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA'07), 2007.

Fan R. K. Chung. A local graph partitioning algorithm using heat kernel pagerank. Internet Mathematics, 6(3):315–330, 2009.

Guy B. Coleman and Harry C. Andrews. Image segmentation by clustering. Proceedings of the IEEE, 67(5), 1979.

Sanjoy Dasgupta and Anupam Gupta. An elementary proof of a theorem of Johnson and Lindenstrauss. Random Structures & Algorithms, 22(1):60–65, 2003.

Chandler Davis and William M. Kahan. The rotation of eigenvectors by a perturbation. III. SIAM Journal on Numerical Analysis, 7(1):1–46, 1970.

Tamal K. Dey, Alfred Rossi, and Anastasios Sidiropoulos. Spectral concentration, robust k-center, and simple clustering. arXiv preprint, 2014.

Santo Fortunato. Community detection in graphs. Physics Reports, 486(3-5):75–174, 2010.

Roger A. Horn and Charles R. Johnson. Matrix Analysis. Cambridge University Press, 2012.

Piotr Indyk and Rajeev Motwani. Approximate nearest neighbors: towards removing the curse of dimensionality. In 30th Annual ACM Symposium on Theory of Computing (STOC'98), 1998.

Jonathan A. Kelner, Yin Tat Lee, Lorenzo Orecchia, and Aaron Sidford. An almost-linear-time algorithm for approximate max flow in undirected graphs, and its multicommodity generalizations. In 25th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA'14), pages 217–226, 2014.

Ioannis Koutis, Alex Levin, and Richard Peng. Improved spectral sparsification and numerical algorithms for SDD matrices. In 29th International Symposium on Theoretical Aspects of Computer Science (STACS'12), pages 266–277, 2012.

Amit Kumar, Yogish Sabharwal, and Sandeep Sen. A simple linear time (1 + ε)-approximation algorithm for geometric k-means clustering in any dimensions. In 45th Annual IEEE Symposium on Foundations of Computer Science (FOCS'04), 2004.

Tsz Chiu Kwok, Lap Chi Lau, Yin Tat Lee, Shayan Oveis Gharan, and Luca Trevisan. Improved Cheeger's inequality: analysis of spectral partitioning algorithms through higher order spectral gap. In 45th Annual ACM Symposium on Theory of Computing (STOC'13), pages 11–20, 2013.

James R. Lee, Shayan Oveis Gharan, and Luca Trevisan. Multi-way spectral partitioning and higher-order Cheeger inequalities. In 44th Annual ACM Symposium on Theory of Computing (STOC'12), pages 1117–1130, 2012.

Frank T. Leighton and Satish Rao. Multicommodity max-flow min-cut theorems and their use in designing approximation algorithms. Journal of the ACM, 46(6):787–832, 1999.

Anand Louis, Prasad Raghavendra, Prasad Tetali, and Santosh Vempala. Many sparse cuts via higher eigenvalues. In 44th Annual ACM Symposium on Theory of Computing (STOC'12), pages 1131–1140, 2012.

David W. Matula and Farhad Shahrokhi. Sparsest cuts and bottlenecks in graphs. Discrete Applied Mathematics, 27(1-2):113–123, 1990.

Frank McSherry. Spectral partitioning of random graphs. In 42nd Annual IEEE Symposium on Foundations of Computer Science (FOCS'01), 2001.

Andrew Y. Ng, Michael I. Jordan, and Yair Weiss. On spectral clustering: analysis and an algorithm. Advances in Neural Information Processing Systems, 2002.

Lorenzo Orecchia, Leonard J. Schulman, Umesh V. Vazirani, and Nisheeth K. Vishnoi. On partitioning graphs via single commodity flows. In 40th Annual ACM Symposium on Theory of Computing (STOC'08), 2008.

Lorenzo Orecchia, Sushant Sachdeva, and Nisheeth K. Vishnoi. Approximating the exponential, the Lanczos method and an Õ(m)-time spectral algorithm for balanced separator. In 44th Annual ACM Symposium on Theory of Computing (STOC'12), pages 1141–1160, 2012.

Rafail Ostrovsky, Yuval Rabani, Leonard J. Schulman, and Chaitanya Swamy. The effectiveness of Lloyd-type methods for the k-means problem. Journal of the ACM, 59(6):28, 2012.

Shayan Oveis Gharan and Luca Trevisan. Partitioning into expanders. In 25th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA'14), pages 1256–1266, 2014.

Prasad Raghavendra, David Steurer, and Madhur Tulsiani. Reductions between expansion problems. In 27th Conference on Computational Complexity (CCC'12), pages 64–73, 2012.

Karl Rohe, Sourav Chatterjee, and Bin Yu. Spectral clustering and the high-dimensional stochastic blockmodel. The Annals of Statistics, 39(4):1878–1915, 2011.

Jonah Sherman. Breaking the multicommodity flow barrier for O(√log n)-approximations to sparsest cut. In 50th Annual IEEE Symposium on Foundations of Computer Science (FOCS'09), 2009.

Jianbo Shi and Jitendra Malik. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8), 2000.

Daniel A. Spielman and Nikhil Srivastava. Graph sparsification by effective resistances. SIAM Journal on Computing, 40(6):1913–1926, 2011.

Daniel A. Spielman and Shang-Hua Teng. Spectral sparsification of graphs. SIAM Journal on Computing, 40(4):981–1025, 2011.

Luca Trevisan. Approximation algorithms for unique games. Theory of Computing, 4(1):111–128, 2008.

Ulrike von Luxburg. A tutorial on spectral clustering. Statistics and Computing, 17(4):395–416, 2007.

Van Vu. A simple SVD algorithm for finding hidden partitions. arXiv preprint, 2014.

Zeyuan Allen Zhu, Silvio Lattanzi, and Vahab S. Mirrokni. A local algorithm for finding well-connected clusters. In 30th International Conference on Machine Learning (ICML'13), 2013.

Appendix A. Omitted Proofs of Section 3

In this section we prove Theorem 3.1 and Lemma 3.2. We begin by proving the first direction of Theorem 3.1.

Proof of Theorem 3.1, Part 1. We write ḡ_i as a linear combination of the eigenvectors of 𝓛, i.e.

    ḡ_i = α_1^(i) f_1 + ⋯ + α_n^(i) f_n,

and let the vector f̂_i be the projection of ḡ_i onto the subspace spanned by {f_j}_{j=1}^k, i.e.

    f̂_i = α_1^(i) f_1 + ⋯ + α_k^(i) f_k.

By the definition of Rayleigh quotients, we have that

    R(g_i) = ḡ_i^⊺ 𝓛 ḡ_i = (α_1^(i) f_1 + ⋯ + α_n^(i) f_n)^⊺ 𝓛 (α_1^(i) f_1 + ⋯ + α_n^(i) f_n)
           = (α_1^(i))² λ_1 + ⋯ + (α_n^(i))² λ_n
           ≥ (α_1^(i))² λ_1 + ⋯ + (α_k^(i))² λ_k + (1 − α²) λ_{k+1},

where α² ≜ (α_1^(i))² + ⋯ + (α_k^(i))². Therefore, we have that 1 − α² ≤ R(g_i)/λ_{k+1} ≤ 1/Υ, and hence

    ‖ḡ_i − f̂_i‖² = (α_{k+1}^(i))² + ⋯ + (α_n^(i))² = 1 − α² ≤ R(g_i)/λ_{k+1} ≤ 1/Υ,

which finishes the proof.

Next we prove Part 2 of Theorem 3.1. We first state two classical results that will be used in our proof.

Theorem A.1 (Geršgorin Circle Theorem) Let A be an n × n matrix, and let R_i(A) = Σ_{j≠i} |A_{i,j}| for 1 ≤ i ≤ n. Then, all eigenvalues of A lie in the union of the Geršgorin discs, defined by

    ⋃_{i=1}^n {z ∈ ℂ : |z − A_{i,i}| ≤ R_i(A)}.

Theorem A.2 (Corollary 6.3.4, Horn and Johnson 2012) Let A be an n × n normal matrix with eigenvalues λ_1, …, λ_n, and let E be an n × n matrix. If λ̂ is an eigenvalue of A + E, then there is some eigenvalue λ_i of A for which |λ̂ − λ_i| ≤ ‖E‖.

17 PENG SUN ZANETTI Proof of Theorem 3. Part By Part, every g i is approximated by a vector ˆf i defined by ˆf i = α i f + α i k f k. Define a k by k matrix A such that A i, = α i, i.e., the th column of matrix A consists of values { } α k i representing ˆf. We define a vector α by i= α = α,, α k. Notice that i the norm of every column of A is close to, and ii different columns are almost orthogonal to each other, since by Cauchy-Schwarz inequality it holds that α i, α α i α max {Rḡi /λ k+, Rḡ /λ k+ } /Υ, for i. This implies that A is almost an orthogonal matrix. Moreover, it holds for any i that A k k A i, = A l,i A l, = α i l α l = α i, α /Υ l= l= while A A i,i = k l= α i l /Υ. Then, by the Geršgorin Circle Theorem cf. Theorem A., it holds that all the eigenvalues of A A are at least /Υ k /Υ = k/υ. Therefore, A has no eigenvalue with value 0 as long as Υ > k, i.e., the vectors { α } k = are linearly independent. Combining this with the fact that span { ˆf,..., ˆf k } span {f,..., f k } and dimspan {f,..., f k } = k, it holds that span { ˆf,..., ˆf k } = span {f,..., f k }. Hence, we can write every f i i k as a linear combination of { ˆf i } k i=, i.e., Now define the value of ĝ i as f i = β i Without loss of generality we assume that = f i = ˆf + β i ĝ i = β i g + β i g + + β i k l= β i β i β i β i l ˆf + + β i k ˆf k. A. = max { β i ˆfl + ˆf β i /Υ β i l l l k g k. A. },..., β i. Then, it holds k β i l ˆf l, ˆf l ˆf l, ˆf l l l k /Υ, β i 6

18 PARTITIONING WELL-CLUSTERED GRAPHS: SPECTRAL CLUSTERING WORKS! which implies for a large enough value of Υ that β i Υ k.. Υ Combining this with Part of Theorem 3., it is easy to see that f i ĝ i k max k β i ˆf g.k/υ. Now we prove Lemma 3., which shows that for any two different clusters S i, S there exists an eigenvector having reasonably different values on the coordinates which correspond to S i and S., Proof of Lemma 3. Let β i = β i,..., βi k for i k. Since ḡi ḡ for any i, we have by the orthonormality of g,, g k that ĝ i, ĝ = β i g + + β i k g k, β g + + β k g k k = β i l β l g l = β i, β, l= and β i, β = ĝi, ĝ f i f i ĝ i, f f ĝ = f i, f f i ĝ i, f f ĝ, f i + f i ĝ i, f ĝ f i ĝ i + f ĝ + f i ĝ i f ĝ. k/υ +.k/υ. Moreover, it holds that β i = ĝi = f i + ĝ i f i + ĝ i f i +.k/υ, and β i = ĝ i = f i + ĝ i f i ĝ i f i.k/υ, which implies that β i. k/υ +.k/υ, +. k/υ +.k/υ. A.3 In other words, we showed that β i s are almost orthonormal. Now we construct a k by k matrix B, where the th column of B is β. By the Geršgorin Circle Theorem Theorem A., all eigenvalues λ of B B satisfies λ B B i,i k. k/υ +.k/υ A.4 7

19 PENG SUN ZANETTI for any i. Combing this with A.3, we have that B eigenvalues have modulus close to. Now we show that β i l and β i are far from each other by contradiction. Suppose there exist l such that ζ max β i ik l β i < 0 k. This implies that the th row and lth row of matrix B are somewhat close to each other. Let us now define matrix E R k k, where E l,i β i β i l, and E t,i = 0 for any t l and i k. Moreover, let Q = B + E. Notice that Q has two identical rows, and rank at most k. Therefore, Q has an eigenvalue with value 0, and the spectral norm E of E, the largest singular value of E, is at most kζ. By definition of matrix Q we have that Q Q = B B + B E + E B + E E. Since B B is symmetric and 0 is an eigenvalue of Q Q, by Theorem A. we know that, if ˆλ is an eigenvalue of Q Q, then there is an eigenvalue λ of B B such that which implies that ˆλ λ B E + E B + E E B E + E B + E E 4 kζ + kζ, ˆλ λ 4 kζ kζ k. k/υ +.k/υ 4 kζ kζ, due to A.3 and A.4. By setting ˆλ = 0, we have that k. k/υ +.k/υ 4 kζ kζ 0. By the condition of Υ = Ωk 3, the inequality above implies that ζ contradiction. 0 k, which leads to a Appendix B. Omitted Proofs of Section 4 In this section we list all the proofs omitted from Section 4. We start proving Lemma 4., which shows that the embedded points of a cluster S i are concentrated around the corresponding point p i. Proof of Lemma 4. Since x = D / x D holds for any x R n, by Theorem 3. we have for any k that k d u i= u S i F u p i Summing over all for k implies that k d u i= u S i F u p i = = D / f D / ĝ D = f ĝ.k/υ. k k d u i= = u S i 8 F u p i.k /Υ.

20 PARTITIONING WELL-CLUSTERED GRAPHS: SPECTRAL CLUSTERING WORKS! The next lemma shows that the l -norm of p i is inversely proportional to the volume of S i. This implies that embedded points from a big cluster are close to the origin, while embedded points from a small cluster are far from the origin. Lemma B. It holds for every i k that 9 p 0 vols i i Proof By 4.3, we have that p i = vols i β i 0 vols i.,..., β k i. Notice that p i is ust the ith row of the matrix B defined in the proof of Lemma 3., normalized by volsi. Since B and B share the same singular values this follows from the SVD decomposition, by A.4 the eigenvalues of BB are close to. But since BB i,i is equal to the squared norm of the ith row of B, we have that β i,..., β k i which implies the statement.. k/υ +.k/υ, +. k/υ +.k/υ, B. We can now prove Lemma 4. which shows that, for every i, p i is far from p. Moreover, their distance is inversely proportional to the volume of the smaller cluster between S i and S. Proof of Lemma 4. Let S i and S be two arbitrary clusters. By Lemma 3., there exists l k such that β l i β l ζ. By the definition of p i and p it follows that p i p i p p By B., we know that k l= β l = β Therefore, we have that p i p i p p β l i k t= β t i β l k t= β t,..., β k ζ 0, + ζ. 0 β l i β l ζ,. 9

21 PENG SUN ZANETTI and p i p i, p p ζ /4. Without loss of generality, we assume that p i p. By Lemma B., it holds that p i 9 0 vols i, and p i p 9 0 vols. Hence, it holds that p i 9 0 min {vols i, vols }. We can now finish the proof by considering two cases based on p i. Case : Suppose that p i 4 p. We have that p i p p i p 3 p i, 4 which implies that p i p 9 p i 6 min {vols i, vols }. Case : Suppose p = α p i for α 4, ]. In this case, we have that p i p p = i + p p i p i, p p i p p p i p + ζ /4 p i p = + α p i ζ /4α p i = + α α + αζ / p i αζ p i ζ 0 min {vols i, vols }, and the lemma follows. The next lemma gives an upper bound to the cost of the optimal k-means clustering which depends on the gap Υ. Proof of Lemma 4.3 Since k is obtained by minimizing over all partitions A,..., A k and c,..., c k, we have that k k d u i= u S i F u p i. B. 0

22 PARTITIONING WELL-CLUSTERED GRAPHS: SPECTRAL CLUSTERING WORKS! Hence the statement follows by applying Lemma 4.. By Lemma 4.3 and the assumption that A,, A k is an APT-factor approximation of an optimal clustering, we have that COSTA,..., A k. APT k /Υ. Now we prove Theorem., which is based on the upper bound of COST and Lemma 4.4. Proof of Theorem. Let A,..., A k be a k-way partition that achieves an approximation ratio of APT. We first show that there exists a permutation σ of the indices such that vol A i S σi 0 5 k 3 APT Υ vols σi for any i k. Assume for contradiction that there is i k such that This implies by Lemma 4.4 that vola i S σi > 05 k 3 APT Υ vol S σi. COSTA,..., A k 0 APT k /Υ, which contradicts to the fact that A,..., A k is an APT-approximation to a k-way partition, whose optimal cost is at most APT k /Υ. Next we bound the conductance of every cluster A i. Let ε = 05 k 3 APT Υ k3 APT = O Υ For any i k, the number of leaving edges of A i is upper bounded by On the other hand, we have that A i A i \ S σi + A i S σi. A i S σi + A i S σi ε vol S σi + φg Sσi vol Sσi + ε vol Sσi = ε + φ G Sσi volsσi. vol A i vol A i S σi ε volsσi. Hence, φ G A i ε + φ GS σi vols σi ε vols σi = ε + φ GS σi ε = Oφ G S σi + APT k 3 /Υ. The rest of this section is devoted to prove Lemma 4.4, which is based on the following highlevel idea: Suppose by contradiction that there is a cluster S which is very different from every cluster A l. Then there is a cluster A i with significant overlap with two different clusters S and S Lemma B.. However, we already proved in Lemma 4. that any two clusters are far from each other. This implies that the COST value of A,..., A k must be high, which leads to a contradiction.

23 PENG SUN ZANETTI Lemma B. such that Suppose for every permutation π : {,..., k} {,..., k} there exists an index i vol A i S πi ε vol Sπi. Then, for any index i there are indices i i and ε i 0 such that and k i= ε i ε. vola i S i vola i S i ε i min {vols i, vols i }, Proof Let σ : {,..., k} {,..., k} be the function defined by σi = argmax k vola i S. vols We first assume that σ is one-to-one, i.e. σ is a permutation. By the hypothesis of the lemma, there exists an index i such that vola i S σi ε vols σi. Without loss of generality, we assume that i =. Notice that vol A S σ = vol A S σ + vol A S. B.3 σ Hence, one of the summations on the right hand side of B.3 is at least ε vol S σ. Now the proof is based on the case distinction. Case : Assume that vol A S σ ε volsσ. We define τ for k to be τ = vol A S σ vol S σ. We have that and by the definition of σ it holds that vol A S σ vol S σ τ ε, vol A S σ vol = τ S σ for k. Setting ε = τ for k and ε = 0 finishes the proof of Case. Case : Assume that vol A S ε vols σ. σ Let us define τ for k, σ, to be B.4 τ = vola S vol S σ. Then, B.4 implies that σ τ ε.

24 PARTITIONING WELL-CLUSTERED GRAPHS: SPECTRAL CLUSTERING WORKS! S i A i S i B i u c i p i p i Figure : We use the fact that p i c i p i c i, and lower bound the value of COST function by only looking at the contribution of points u B i for all i k. The statement in this case holds by assuming vol A S σ ε vol Sσ, since otherwise we have vol S σ vol A S σ = vol A S σ ε vol Sσ ε vol Sσ, and this case was proven in Case. So it suffices to study the case in which σ defined earlier is not one-to-one. Then, there is k such that {σ,, σk}. For any l k, let l = vola l S. vols Then, k l= τ l = ε and it holds for any l k that vol A l S σl vol vola l S = τ l S σl vols. τ Proof of Lemma 4.4 By the assumption of Lemma 4.4 and Lemma B., for every i there exist i i such that vola i S i ε i min {vols i, vols i }, B.5 vola i S i ε i min {vols i, vols i }, for some ε 0, and k ε i ε. i= 3


More information

Approximate Spectral Clustering via Randomized Sketching

Approximate Spectral Clustering via Randomized Sketching Approximate Spectral Clustering via Randomized Sketching Christos Boutsidis Yahoo! Labs, New York Joint work with Alex Gittens (Ebay), Anju Kambadur (IBM) The big picture: sketch and solve Tradeoff: Speed

More information

Discrepancy Theory in Approximation Algorithms

Discrepancy Theory in Approximation Algorithms Discrepancy Theory in Approximation Algorithms Rajat Sen, Soumya Basu May 8, 2015 1 Introduction In this report we would like to motivate the use of discrepancy theory in algorithms. Discrepancy theory

More information

A Schur Complement Cheeger Inequality

A Schur Complement Cheeger Inequality A Schur Complement Cheeger Inequality arxiv:8.834v [cs.dm] 27 Nov 28 Aaron Schild University of California, Berkeley EECS aschild@berkeley.edu November 28, 28 Abstract Cheeger s inequality shows that any

More information

1 Adjacency matrix and eigenvalues

1 Adjacency matrix and eigenvalues CSC 5170: Theory of Computational Complexity Lecture 7 The Chinese University of Hong Kong 1 March 2010 Our objective of study today is the random walk algorithm for deciding if two vertices in an undirected

More information

A local algorithm for finding dense bipartite-like subgraphs

A local algorithm for finding dense bipartite-like subgraphs A local algorithm for finding dense bipartite-like subgraphs Pan Peng 1,2 1 State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences 2 School of Information Science

More information

Kernels of Directed Graph Laplacians. J. S. Caughman and J.J.P. Veerman

Kernels of Directed Graph Laplacians. J. S. Caughman and J.J.P. Veerman Kernels of Directed Graph Laplacians J. S. Caughman and J.J.P. Veerman Department of Mathematics and Statistics Portland State University PO Box 751, Portland, OR 97207. caughman@pdx.edu, veerman@pdx.edu

More information

18.S096: Spectral Clustering and Cheeger s Inequality

18.S096: Spectral Clustering and Cheeger s Inequality 18.S096: Spectral Clustering and Cheeger s Inequality Topics in Mathematics of Data Science (Fall 015) Afonso S. Bandeira bandeira@mit.edu http://math.mit.edu/~bandeira October 6, 015 These are lecture

More information

Lecture: Local Spectral Methods (3 of 4) 20 An optimization perspective on local spectral methods

Lecture: Local Spectral Methods (3 of 4) 20 An optimization perspective on local spectral methods Stat260/CS294: Spectral Graph Methods Lecture 20-04/07/205 Lecture: Local Spectral Methods (3 of 4) Lecturer: Michael Mahoney Scribe: Michael Mahoney Warning: these notes are still very rough. They provide

More information

Constant Arboricity Spectral Sparsifiers

Constant Arboricity Spectral Sparsifiers Constant Arboricity Spectral Sparsifiers arxiv:1808.0566v1 [cs.ds] 16 Aug 018 Timothy Chu oogle timchu@google.com Jakub W. Pachocki Carnegie Mellon University pachocki@cs.cmu.edu August 0, 018 Abstract

More information

Sum-of-Squares Method, Tensor Decomposition, Dictionary Learning

Sum-of-Squares Method, Tensor Decomposition, Dictionary Learning Sum-of-Squares Method, Tensor Decomposition, Dictionary Learning David Steurer Cornell Approximation Algorithms and Hardness, Banff, August 2014 for many problems (e.g., all UG-hard ones): better guarantees

More information

Introduction to Spectral Graph Theory and Graph Clustering

Introduction to Spectral Graph Theory and Graph Clustering Introduction to Spectral Graph Theory and Graph Clustering Chengming Jiang ECS 231 Spring 2016 University of California, Davis 1 / 40 Motivation Image partitioning in computer vision 2 / 40 Motivation

More information

8.1 Concentration inequality for Gaussian random matrix (cont d)

8.1 Concentration inequality for Gaussian random matrix (cont d) MGMT 69: Topics in High-dimensional Data Analysis Falll 26 Lecture 8: Spectral clustering and Laplacian matrices Lecturer: Jiaming Xu Scribe: Hyun-Ju Oh and Taotao He, October 4, 26 Outline Concentration

More information

Lecture 12 : Graph Laplacians and Cheeger s Inequality

Lecture 12 : Graph Laplacians and Cheeger s Inequality CPS290: Algorithmic Foundations of Data Science March 7, 2017 Lecture 12 : Graph Laplacians and Cheeger s Inequality Lecturer: Kamesh Munagala Scribe: Kamesh Munagala Graph Laplacian Maybe the most beautiful

More information

approximation algorithms I

approximation algorithms I SUM-OF-SQUARES method and approximation algorithms I David Steurer Cornell Cargese Workshop, 201 meta-task encoded as low-degree polynomial in R x example: f(x) = i,j n w ij x i x j 2 given: functions

More information

Spectral Partitiong in a Stochastic Block Model

Spectral Partitiong in a Stochastic Block Model Spectral Graph Theory Lecture 21 Spectral Partitiong in a Stochastic Block Model Daniel A. Spielman November 16, 2015 Disclaimer These notes are not necessarily an accurate representation of what happened

More information

Topic: Balanced Cut, Sparsest Cut, and Metric Embeddings Date: 3/21/2007

Topic: Balanced Cut, Sparsest Cut, and Metric Embeddings Date: 3/21/2007 CS880: Approximations Algorithms Scribe: Tom Watson Lecturer: Shuchi Chawla Topic: Balanced Cut, Sparsest Cut, and Metric Embeddings Date: 3/21/2007 In the last lecture, we described an O(log k log D)-approximation

More information

Finding normalized and modularity cuts by spectral clustering. Ljubjana 2010, October

Finding normalized and modularity cuts by spectral clustering. Ljubjana 2010, October Finding normalized and modularity cuts by spectral clustering Marianna Bolla Institute of Mathematics Budapest University of Technology and Economics marib@math.bme.hu Ljubjana 2010, October Outline Find

More information

Randomized Algorithms

Randomized Algorithms Randomized Algorithms Saniv Kumar, Google Research, NY EECS-6898, Columbia University - Fall, 010 Saniv Kumar 9/13/010 EECS6898 Large Scale Machine Learning 1 Curse of Dimensionality Gaussian Mixture Models

More information

Principal Component Analysis

Principal Component Analysis Machine Learning Michaelmas 2017 James Worrell Principal Component Analysis 1 Introduction 1.1 Goals of PCA Principal components analysis (PCA) is a dimensionality reduction technique that can be used

More information

Optimal compression of approximate Euclidean distances

Optimal compression of approximate Euclidean distances Optimal compression of approximate Euclidean distances Noga Alon 1 Bo az Klartag 2 Abstract Let X be a set of n points of norm at most 1 in the Euclidean space R k, and suppose ε > 0. An ε-distance sketch

More information

Graphs in Machine Learning

Graphs in Machine Learning Graphs in Machine Learning Michal Valko INRIA Lille - Nord Europe, France Partially based on material by: Ulrike von Luxburg, Gary Miller, Doyle & Schnell, Daniel Spielman January 27, 2015 MVA 2014/2015

More information

Lecture Introduction. 2 Brief Recap of Lecture 10. CS-621 Theory Gems October 24, 2012

Lecture Introduction. 2 Brief Recap of Lecture 10. CS-621 Theory Gems October 24, 2012 CS-62 Theory Gems October 24, 202 Lecture Lecturer: Aleksander Mądry Scribes: Carsten Moldenhauer and Robin Scheibler Introduction In Lecture 0, we introduced a fundamental object of spectral graph theory:

More information

LOCAL CHEEGER INEQUALITIES AND SPARSE CUTS

LOCAL CHEEGER INEQUALITIES AND SPARSE CUTS LOCAL CHEEGER INEQUALITIES AND SPARSE CUTS CAELAN GARRETT 18.434 FINAL PROJECT CAELAN@MIT.EDU Abstract. When computing on massive graphs, it is not tractable to iterate over the full graph. Local graph

More information

Graph Matching & Information. Geometry. Towards a Quantum Approach. David Emms

Graph Matching & Information. Geometry. Towards a Quantum Approach. David Emms Graph Matching & Information Geometry Towards a Quantum Approach David Emms Overview Graph matching problem. Quantum algorithms. Classical approaches. How these could help towards a Quantum approach. Graphs

More information

Singular Value Decomposition

Singular Value Decomposition Chapter 6 Singular Value Decomposition In Chapter 5, we derived a number of algorithms for computing the eigenvalues and eigenvectors of matrices A R n n. Having developed this machinery, we complete our

More information

Some Useful Background for Talk on the Fast Johnson-Lindenstrauss Transform

Some Useful Background for Talk on the Fast Johnson-Lindenstrauss Transform Some Useful Background for Talk on the Fast Johnson-Lindenstrauss Transform Nir Ailon May 22, 2007 This writeup includes very basic background material for the talk on the Fast Johnson Lindenstrauss Transform

More information

A Simple Algorithm for Clustering Mixtures of Discrete Distributions

A Simple Algorithm for Clustering Mixtures of Discrete Distributions 1 A Simple Algorithm for Clustering Mixtures of Discrete Distributions Pradipta Mitra In this paper, we propose a simple, rotationally invariant algorithm for clustering mixture of distributions, including

More information

Data Analysis and Manifold Learning Lecture 3: Graphs, Graph Matrices, and Graph Embeddings

Data Analysis and Manifold Learning Lecture 3: Graphs, Graph Matrices, and Graph Embeddings Data Analysis and Manifold Learning Lecture 3: Graphs, Graph Matrices, and Graph Embeddings Radu Horaud INRIA Grenoble Rhone-Alpes, France Radu.Horaud@inrialpes.fr http://perception.inrialpes.fr/ Outline

More information

Spectral clustering. Two ideal clusters, with two points each. Spectral clustering algorithms

Spectral clustering. Two ideal clusters, with two points each. Spectral clustering algorithms A simple example Two ideal clusters, with two points each Spectral clustering Lecture 2 Spectral clustering algorithms 4 2 3 A = Ideally permuted Ideal affinities 2 Indicator vectors Each cluster has an

More information

Hardness of Embedding Metric Spaces of Equal Size

Hardness of Embedding Metric Spaces of Equal Size Hardness of Embedding Metric Spaces of Equal Size Subhash Khot and Rishi Saket Georgia Institute of Technology {khot,saket}@cc.gatech.edu Abstract. We study the problem embedding an n-point metric space

More information

Approximating the Exponential, the Lanczos Method and an Õ(m)-Time Spectral Algorithm for Balanced Separator

Approximating the Exponential, the Lanczos Method and an Õ(m)-Time Spectral Algorithm for Balanced Separator Approximating the Exponential, the Lanczos Method and an Õm-Time Spectral Algorithm for Balanced Separator Lorenzo Orecchia MIT Cambridge, MA, USA. orecchia@mit.edu Sushant Sachdeva Princeton University

More information

Reductions Between Expansion Problems

Reductions Between Expansion Problems Reductions Between Expansion Problems Prasad Raghavendra David Steurer Madhur Tulsiani November 11, 2010 Abstract The Small-Set Expansion Hypothesis (Raghavendra, Steurer, STOC 2010) is a natural hardness

More information

Random Feature Maps for Dot Product Kernels Supplementary Material

Random Feature Maps for Dot Product Kernels Supplementary Material Random Feature Maps for Dot Product Kernels Supplementary Material Purushottam Kar and Harish Karnick Indian Institute of Technology Kanpur, INDIA {purushot,hk}@cse.iitk.ac.in Abstract This document contains

More information

SVD, Power method, and Planted Graph problems (+ eigenvalues of random matrices)

SVD, Power method, and Planted Graph problems (+ eigenvalues of random matrices) Chapter 14 SVD, Power method, and Planted Graph problems (+ eigenvalues of random matrices) Today we continue the topic of low-dimensional approximation to datasets and matrices. Last time we saw the singular

More information

Dissertation Defense

Dissertation Defense Clustering Algorithms for Random and Pseudo-random Structures Dissertation Defense Pradipta Mitra 1 1 Department of Computer Science Yale University April 23, 2008 Mitra (Yale University) Dissertation

More information

A Tutorial on Matrix Approximation by Row Sampling

A Tutorial on Matrix Approximation by Row Sampling A Tutorial on Matrix Approximation by Row Sampling Rasmus Kyng June 11, 018 Contents 1 Fast Linear Algebra Talk 1.1 Matrix Concentration................................... 1. Algorithms for ɛ-approximation

More information

Graph Clustering Algorithms

Graph Clustering Algorithms PhD Course on Graph Mining Algorithms, Università di Pisa February, 2018 Clustering: Intuition to Formalization Task Partition a graph into natural groups so that the nodes in the same cluster are more

More information

Lecture 14: Random Walks, Local Graph Clustering, Linear Programming

Lecture 14: Random Walks, Local Graph Clustering, Linear Programming CSE 521: Design and Analysis of Algorithms I Winter 2017 Lecture 14: Random Walks, Local Graph Clustering, Linear Programming Lecturer: Shayan Oveis Gharan 3/01/17 Scribe: Laura Vonessen Disclaimer: These

More information

Approximating maximum satisfiable subsystems of linear equations of bounded width

Approximating maximum satisfiable subsystems of linear equations of bounded width Approximating maximum satisfiable subsystems of linear equations of bounded width Zeev Nutov The Open University of Israel Daniel Reichman The Open University of Israel Abstract We consider the problem

More information

Stanford University CS366: Graph Partitioning and Expanders Handout 13 Luca Trevisan March 4, 2013

Stanford University CS366: Graph Partitioning and Expanders Handout 13 Luca Trevisan March 4, 2013 Stanford University CS366: Graph Partitioning and Expanders Handout 13 Luca Trevisan March 4, 2013 Lecture 13 In which we construct a family of expander graphs. The problem of constructing expander graphs

More information

Spectral Graph Theory Lecture 2. The Laplacian. Daniel A. Spielman September 4, x T M x. ψ i = arg min

Spectral Graph Theory Lecture 2. The Laplacian. Daniel A. Spielman September 4, x T M x. ψ i = arg min Spectral Graph Theory Lecture 2 The Laplacian Daniel A. Spielman September 4, 2015 Disclaimer These notes are not necessarily an accurate representation of what happened in class. The notes written before

More information

Locally-biased analytics

Locally-biased analytics Locally-biased analytics You have BIG data and want to analyze a small part of it: Solution 1: Cut out small part and use traditional methods Challenge: cutting out may be difficult a priori Solution 2:

More information

Learning symmetric non-monotone submodular functions

Learning symmetric non-monotone submodular functions Learning symmetric non-monotone submodular functions Maria-Florina Balcan Georgia Institute of Technology ninamf@cc.gatech.edu Nicholas J. A. Harvey University of British Columbia nickhar@cs.ubc.ca Satoru

More information

RITZ VALUE BOUNDS THAT EXPLOIT QUASI-SPARSITY

RITZ VALUE BOUNDS THAT EXPLOIT QUASI-SPARSITY RITZ VALUE BOUNDS THAT EXPLOIT QUASI-SPARSITY ILSE C.F. IPSEN Abstract. Absolute and relative perturbation bounds for Ritz values of complex square matrices are presented. The bounds exploit quasi-sparsity

More information

Iterative solvers for linear equations

Iterative solvers for linear equations Spectral Graph Theory Lecture 15 Iterative solvers for linear equations Daniel A. Spielman October 1, 009 15.1 Overview In this and the next lecture, I will discuss iterative algorithms for solving linear

More information

Sum-of-Squares and Spectral Algorithms

Sum-of-Squares and Spectral Algorithms Sum-of-Squares and Spectral Algorithms Tselil Schramm June 23, 2017 Workshop on SoS @ STOC 2017 Spectral algorithms as a tool for analyzing SoS. SoS Semidefinite Programs Spectral Algorithms SoS suggests

More information

Using Laplacian Eigenvalues and Eigenvectors in the Analysis of Frequency Assignment Problems

Using Laplacian Eigenvalues and Eigenvectors in the Analysis of Frequency Assignment Problems Using Laplacian Eigenvalues and Eigenvectors in the Analysis of Frequency Assignment Problems Jan van den Heuvel and Snežana Pejić Department of Mathematics London School of Economics Houghton Street,

More information

Lecture 8: Linear Algebra Background

Lecture 8: Linear Algebra Background CSE 521: Design and Analysis of Algorithms I Winter 2017 Lecture 8: Linear Algebra Background Lecturer: Shayan Oveis Gharan 2/1/2017 Scribe: Swati Padmanabhan Disclaimer: These notes have not been subjected

More information

Linear Spectral Hashing

Linear Spectral Hashing Linear Spectral Hashing Zalán Bodó and Lehel Csató Babeş Bolyai University - Faculty of Mathematics and Computer Science Kogălniceanu 1., 484 Cluj-Napoca - Romania Abstract. assigns binary hash keys to

More information

First Efficient Convergence for Streaming k-pca: a Global, Gap-Free, and Near-Optimal Rate

First Efficient Convergence for Streaming k-pca: a Global, Gap-Free, and Near-Optimal Rate 58th Annual IEEE Symposium on Foundations of Computer Science First Efficient Convergence for Streaming k-pca: a Global, Gap-Free, and Near-Optimal Rate Zeyuan Allen-Zhu Microsoft Research zeyuan@csail.mit.edu

More information

Spectral Generative Models for Graphs

Spectral Generative Models for Graphs Spectral Generative Models for Graphs David White and Richard C. Wilson Department of Computer Science University of York Heslington, York, UK wilson@cs.york.ac.uk Abstract Generative models are well known

More information

Cutting Graphs, Personal PageRank and Spilling Paint

Cutting Graphs, Personal PageRank and Spilling Paint Graphs and Networks Lecture 11 Cutting Graphs, Personal PageRank and Spilling Paint Daniel A. Spielman October 3, 2013 11.1 Disclaimer These notes are not necessarily an accurate representation of what

More information

An Algorithmist s Toolkit Nov. 10, Lecture 17

An Algorithmist s Toolkit Nov. 10, Lecture 17 8.409 An Algorithmist s Toolkit Nov. 0, 009 Lecturer: Jonathan Kelner Lecture 7 Johnson-Lindenstrauss Theorem. Recap We first recap a theorem (isoperimetric inequality) and a lemma (concentration) from

More information

An Improved Approximation Algorithm for Requirement Cut

An Improved Approximation Algorithm for Requirement Cut An Improved Approximation Algorithm for Requirement Cut Anupam Gupta Viswanath Nagarajan R. Ravi Abstract This note presents improved approximation guarantees for the requirement cut problem: given an

More information

Kernelization by matroids: Odd Cycle Transversal

Kernelization by matroids: Odd Cycle Transversal Lecture 8 (10.05.2013) Scribe: Tomasz Kociumaka Lecturer: Marek Cygan Kernelization by matroids: Odd Cycle Transversal 1 Introduction The main aim of this lecture is to give a polynomial kernel for the

More information

Spectral Techniques for Clustering

Spectral Techniques for Clustering Nicola Rebagliati 1/54 Spectral Techniques for Clustering Nicola Rebagliati 29 April, 2010 Nicola Rebagliati 2/54 Thesis Outline 1 2 Data Representation for Clustering Setting Data Representation and Methods

More information

Self-Tuning Spectral Clustering

Self-Tuning Spectral Clustering Self-Tuning Spectral Clustering Lihi Zelnik-Manor Pietro Perona Department of Electrical Engineering Department of Electrical Engineering California Institute of Technology California Institute of Technology

More information

A better approximation ratio for the Vertex Cover problem

A better approximation ratio for the Vertex Cover problem A better approximation ratio for the Vertex Cover problem George Karakostas Dept. of Computing and Software McMaster University October 5, 004 Abstract We reduce the approximation factor for Vertex Cover

More information

EIGENVECTOR NORMS MATTER IN SPECTRAL GRAPH THEORY

EIGENVECTOR NORMS MATTER IN SPECTRAL GRAPH THEORY EIGENVECTOR NORMS MATTER IN SPECTRAL GRAPH THEORY Franklin H. J. Kenter Rice University ( United States Naval Academy 206) orange@rice.edu Connections in Discrete Mathematics, A Celebration of Ron Graham

More information

MATH 567: Mathematical Techniques in Data Science Clustering II

MATH 567: Mathematical Techniques in Data Science Clustering II This lecture is based on U. von Luxburg, A Tutorial on Spectral Clustering, Statistics and Computing, 17 (4), 2007. MATH 567: Mathematical Techniques in Data Science Clustering II Dominique Guillot Departments

More information

Math 350 Fall 2011 Notes about inner product spaces. In this notes we state and prove some important properties of inner product spaces.

Math 350 Fall 2011 Notes about inner product spaces. In this notes we state and prove some important properties of inner product spaces. Math 350 Fall 2011 Notes about inner product spaces In this notes we state and prove some important properties of inner product spaces. First, recall the dot product on R n : if x, y R n, say x = (x 1,...,

More information

Discrete quantum random walks

Discrete quantum random walks Quantum Information and Computation: Report Edin Husić edin.husic@ens-lyon.fr Discrete quantum random walks Abstract In this report, we present the ideas behind the notion of quantum random walks. We further

More information

Appendix A. Proof to Theorem 1

Appendix A. Proof to Theorem 1 Appendix A Proof to Theorem In this section, we prove the sample complexity bound given in Theorem The proof consists of three main parts In Appendix A, we prove perturbation lemmas that bound the estimation

More information

Unique Games Conjecture & Polynomial Optimization. David Steurer

Unique Games Conjecture & Polynomial Optimization. David Steurer Unique Games Conjecture & Polynomial Optimization David Steurer Newton Institute, Cambridge, July 2013 overview Proof Complexity Approximability Polynomial Optimization Quantum Information computational

More information

The Simplest Construction of Expanders

The Simplest Construction of Expanders Spectral Graph Theory Lecture 14 The Simplest Construction of Expanders Daniel A. Spielman October 16, 2009 14.1 Overview I am going to present the simplest construction of expanders that I have been able

More information

Algorithmic Primitives for Network Analysis: Through the Lens of the Laplacian Paradigm

Algorithmic Primitives for Network Analysis: Through the Lens of the Laplacian Paradigm Algorithmic Primitives for Network Analysis: Through the Lens of the Laplacian Paradigm Shang-Hua Teng Computer Science, Viterbi School of Engineering USC Massive Data and Massive Graphs 500 billions web

More information

Fixed Parameter Algorithms for Interval Vertex Deletion and Interval Completion Problems

Fixed Parameter Algorithms for Interval Vertex Deletion and Interval Completion Problems Fixed Parameter Algorithms for Interval Vertex Deletion and Interval Completion Problems Arash Rafiey Department of Informatics, University of Bergen, Norway arash.rafiey@ii.uib.no Abstract We consider

More information

Fast Approximation Algorithms for Graph Partitioning using Spectral and Semidefinite-Programming Techniques

Fast Approximation Algorithms for Graph Partitioning using Spectral and Semidefinite-Programming Techniques Fast Approximation Algorithms for Graph Partitioning using Spectral and Semidefinite-Programming Techniques Lorenzo Orecchia Electrical Engineering and Computer Sciences University of California at Berkeley

More information

Fiedler s Theorems on Nodal Domains

Fiedler s Theorems on Nodal Domains Spectral Graph Theory Lecture 7 Fiedler s Theorems on Nodal Domains Daniel A. Spielman September 19, 2018 7.1 Overview In today s lecture we will justify some of the behavior we observed when using eigenvectors

More information

Lecture 9: SVD, Low Rank Approximation

Lecture 9: SVD, Low Rank Approximation CSE 521: Design and Analysis of Algorithms I Spring 2016 Lecture 9: SVD, Low Rank Approimation Lecturer: Shayan Oveis Gharan April 25th Scribe: Koosha Khalvati Disclaimer: hese notes have not been subjected

More information

An SDP-Based Algorithm for Linear-Sized Spectral Sparsification

An SDP-Based Algorithm for Linear-Sized Spectral Sparsification An SDP-Based Algorithm for Linear-Sized Spectral Sparsification ABSTRACT Yin Tat Lee Microsoft Research Redmond, USA yile@microsoft.com For any undirected and weighted graph G = V, E, w with n vertices

More information

An indicator for the number of clusters using a linear map to simplex structure

An indicator for the number of clusters using a linear map to simplex structure An indicator for the number of clusters using a linear map to simplex structure Marcus Weber, Wasinee Rungsarityotin, and Alexander Schliep Zuse Institute Berlin ZIB Takustraße 7, D-495 Berlin, Germany

More information