Dissertation Defense


1 Clustering Algorithms for Random and Pseudo-random Structures
Dissertation Defense
Pradipta Mitra
Department of Computer Science, Yale University
April 23, 2008

2 Committee
Ravi Kannan (Advisor), Dana Angluin, Dan Spielman, Mike Mahoney (Yahoo!)

3 Outline
1. Introduction: Clustering and spectral algorithms
2. Four results:
   1. Clustering using bi-partitioning
   2. Clustering in sparse graphs
   3. A robust clustering algorithm
   4. An entrywise notion for spectral stability
3. Future work

4 Clustering
What is clustering? Given a set of objects S, partition it into disjoint sets or clusters S_1, ..., S_k. The partitioning is done according to some notion of closeness, i.e. objects in a cluster S_r are close to each other and far from objects in other clusters.
Issues: What is the right definition of closeness? Algorithms to find clusters given the right definition.

5 Clustering: Examples
(Figure: faces from the Yale face dataset.)

6 Clustering: Matrices
Term-document matrices: rows are documents (CS Doc 1-3, Medicine Doc 1-3), columns are terms. M - Microprocessor, V - Virtual Memory, C - L2 Cache, A - Algorithm, H - Hemoglobin, K - Kidney, F - Fracture, P - Painkiller. (Table: term counts for each document.)
Moral: Clustering problems can be modelled as object-feature matrices; objects can be seen as vectors in high dimensional space.

7 Mixture models
Each cluster is defined by a simple (high-dimensional) probability distribution. Objects are samples from these distributions.
Hope: Can successfully cluster if the centers (means) µ_1, µ_2 are far apart.
How large does ‖µ_1 − µ_2‖ need to be?
(Figure: two circles whose centers are separated.)
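A minimal numerical sketch of this setup (not from the slides): samples from two spherical Gaussian clusters, comparing the center separation against the typical within-cluster spread. The dimension, sample sizes and separation value are illustrative assumptions.

```python
# Minimal sketch, assuming spherical Gaussian clusters; all sizes,
# dimensions and the separation below are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
d, n_per_cluster, sigma = 50, 200, 1.0
mu1 = np.zeros(d)
mu2 = np.zeros(d)
mu2[0] = 6.0 * sigma                         # the two centers differ in one coordinate

X1 = rng.normal(mu1, sigma, size=(n_per_cluster, d))   # samples from cluster 1
X2 = rng.normal(mu2, sigma, size=(n_per_cluster, d))   # samples from cluster 2

# Separation between empirical centers vs. typical within-cluster distance.
print(np.linalg.norm(X1.mean(axis=0) - X2.mean(axis=0)))
print(sigma * np.sqrt(d))
```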

8 Random graphs
A G_{n,p} random graph is generated by selecting each possible edge independently with probability p.
Example: G_{5,0.5}. (Shown: E[A] and one sampled adjacency matrix A.)
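A sketch of sampling such an adjacency matrix, since the slide's matrices did not survive extraction; the sizes match the G_{5,0.5} example and the helper name gnp_adjacency is my own.

```python
# Minimal sketch: sample the adjacency matrix of a G_{n,p} random graph,
# where each possible edge appears independently with probability p.
import numpy as np

def gnp_adjacency(n, p, rng):
    coins = rng.random((n, n)) < p            # i.i.d. coin flips
    A = np.triu(coins, k=1).astype(float)     # keep the strict upper triangle
    return A + A.T                            # symmetric, zero diagonal

rng = np.random.default_rng(1)
A = gnp_adjacency(5, 0.5, rng)
print(A)            # one sample; E[A] has every off-diagonal entry equal to 0.5
```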

9 Planted partition model for graphs
Total n vertices, divided into k clusters T_1, T_2, ..., T_k of sizes n_1, ..., n_k.
There are k(k+1)/2 probabilities P_rs (= P_sr) such that if v ∈ T_r and u ∈ T_s, the edge e(u, v) is present with probability P_rs.

10 Planted partition model for graphs
(Shown: the probability matrix P, E[A], and one sample A.)
µ_1 = {0.5, 0.5, 0.5, 0.1, 0.1, 0.1}
µ_2 = {0.1, 0.1, 0.1, 0.5, 0.5, 0.5}
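A sketch of sampling from the planted partition model with the slide's probabilities (0.5 within a cluster, 0.1 across). The cluster sizes and the helper name planted_partition are illustrative assumptions.

```python
# Minimal sketch: sample an adjacency matrix from the planted partition
# model.  P[r, s] is the edge probability between clusters r and s.
import numpy as np

def planted_partition(sizes, P, rng):
    labels = np.repeat(np.arange(len(sizes)), sizes)    # cluster label per vertex
    probs = P[labels][:, labels]                        # entrywise edge probabilities (= E[A])
    coins = rng.random(probs.shape) < probs
    A = np.triu(coins, k=1).astype(float)
    return A + A.T, labels

P = np.array([[0.5, 0.1],
              [0.1, 0.5]])
A, labels = planted_partition([3, 3], P, rng=np.random.default_rng(2))
print(A)
```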

11 Algorithmic Issues
Heuristic analysis: analyze an algorithm known to work in practice.
Spectral algorithms: use information about the spectrum (eigenvalues, eigenvectors, singular vectors etc.) of the data matrix to do clustering. Quite popular, seems to work in practice. Singular values and vectors can be computed efficiently.
For a matrix A, the span of the top k singular vectors gives A_k, the rank-k matrix such that for all rank-k matrices M, ‖A − A_k‖ ≤ ‖A − M‖.
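A small sketch of the statement above: A_k built from the top k singular vectors is the best rank-k approximation in spectral norm. The matrix and the rank-k competitor M are random and purely illustrative.

```python
# Minimal sketch: best rank-k approximation via the SVD, checked against
# an arbitrary rank-k competitor in spectral norm.
import numpy as np

rng = np.random.default_rng(3)
A = rng.random((20, 20))
k = 2

U, s, Vt = np.linalg.svd(A)
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]      # span of the top k singular vectors

M = rng.random((20, k)) @ rng.random((k, 20))    # some other rank-k matrix
print(np.linalg.norm(A - A_k, 2), "<=", np.linalg.norm(A - M, 2))
```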

12 Why might this work?
Intuition: eliminates noise; avoids the curse of dimensionality; Cheeger's inequality (relation to sparsest cut).
Convention: eigen/singular values are sorted from largest to smallest (in absolute value), |λ_1| ≥ |λ_2| ≥ |λ_3| ≥ ... The eigen/singular vector corresponding to λ_i is the i-th eigen/singular vector.

13 Why might this work?
A = (the two-cluster example matrix shown on the slide).
Quick definition: if A is square and symmetric, v is an eigenvector of A if ‖v‖ = 1 and Av = λv for some λ (an eigenvalue).

14 Why might this work?
A = (the same example matrix).
1 = {1, ..., 1}^T and A1 = 4·1, so 1 is the first eigenvector.
For v = {1, 1, 1, 1, −1, −1, −1, −1}^T we have Av = 2v. This is the second eigenvector, and it reveals the cluster.
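A sketch reproducing this toy computation. The slide's matrix did not survive extraction, so the within-block and across-block entries (0.75 and 0.25) are assumptions chosen so that A1 = 4·1 and Av = 2v as quoted.

```python
# Minimal sketch: an 8x8 two-block matrix whose first eigenvector is the
# all-ones vector (eigenvalue 4) and whose second eigenvector is constant
# on each cluster with opposite signs (eigenvalue 2).  The entries 0.75 and
# 0.25 are assumed values matching the eigenvalues quoted on the slide.
import numpy as np

a, b = 0.75, 0.25
A = np.full((8, 8), b)
A[:4, :4] = a
A[4:, 4:] = a

w, V = np.linalg.eigh(A)
order = np.argsort(-np.abs(w))
print(w[order[:2]])                 # approximately [4.0, 2.0]
print(np.sign(V[:, order[1]]))      # second eigenvector: +/- pattern reveals the split
```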

15 Previous Work
Lots of work: [B 87], [DF 89], [AK 97], [AKS 98], [VW 2002], ...
McSherry 2001: An instance of the planted partition model A with k clusters can be clustered with probability 1 − o(1) if the following separation condition holds (centers are far apart): for all r ≠ s,
‖µ_r − µ_s‖² ≥ c σ² log n,
where σ² = max_rs P_rs and n = number of vertices.
Assumption: σ² ≥ log⁶ n / n, i.e. at least polylogarithmic degree.
Spectral method: take the best rank-k approximation A_k and run a greedy step on that matrix. This gives an approximate clustering.
Clean-up: use combinatorial projections, i.e. counting edges to approximate partitions.

16 Our Contribution
Clustering by recursive bi-partitioning: use the second singular vector to bi-partition the data; repeat.
Pseudo-random models of clustering: used to model clustering problems for sparse (constant-degree) graphs.
Rotationally invariant algorithms: remove combinatorial/ad-hoc techniques for discrete distributions.
Entrywise bounds for eigenvectors: a different notion of spectral stability for random graphs.

17 Spectral Clustering by Recursive Bi-partitioning

18 Spectral Clustering by Recursive Bi-partitioning
Joint work with Dasgupta, Hopcroft and Kannan (ESA 2006).
Goal: Instead of a rank-k-approximation-based method, use an incremental algorithm that bi-partitions the data at each step.
Result: Clustering is possible if for all r ≠ s,
‖µ_r − µ_s‖² ≥ c (σ_r + σ_s)² log n,
where σ_r² = max_s P_rs ≥ log⁶ n / n.

19 Basic Step
Given A, find the unit vector v_1 that maximizes ‖AJv_1‖, where J = I − (1/n) 1 1^T.
Sort the entries of v_1: v_1(1) ≥ v_1(2) ≥ ... ≥ v_1(n).
Find i, i + 1 such that v_1(i) − v_1(i + 1) is largest.
Return {1, ..., i} and {i + 1, ..., n} as the bi-partition.
Definition refresher: v_1 is the first right singular vector of AJ, and close to the second right singular vector of A.
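A sketch of the Basic Step as stated above: center the columns with J, take the first right singular vector of AJ, and split at the largest consecutive gap. The two-cluster test instance at the bottom is an illustrative planted graph, not data from the thesis.

```python
# Minimal sketch of the Basic Step: center the columns with
# J = I - (1/n) 1 1^T, take the first right singular vector of AJ,
# sort its entries, and split at the largest consecutive gap.
import numpy as np

def basic_step(A):
    n = A.shape[1]
    J = np.eye(n) - np.ones((n, n)) / n
    _, _, Vt = np.linalg.svd(A @ J)
    v1 = Vt[0]                                # first right singular vector of AJ
    order = np.argsort(-v1)                   # indices of entries in decreasing order
    gaps = v1[order[:-1]] - v1[order[1:]]
    i = int(np.argmax(gaps))                  # position of the largest gap
    return order[:i + 1], order[i + 1:]

# An illustrative planted two-cluster instance to exercise the step.
rng = np.random.default_rng(4)
n = 100
probs = np.full((n, n), 0.1)
probs[:50, :50] = 0.5
probs[50:, 50:] = 0.5
A = np.triu((rng.random((n, n)) < probs), 1).astype(float)
A = A + A.T
left, right = basic_step(A)
print(len(left), len(right))
```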

20 Main algorithm
Given A, randomly partition the rows into t = 4 log n parts B_i (i = 1 to t) of equal size.
Bi-partition the (same) columns t times using the Basic Step (last slide).
Combine these (approximate) bi-partitions to find an accurate bi-partition.

21 Analysis
Let's focus on one B_i; call it B. Let B̄ = E(B).
v_1(BJ) is almost structured: let v_1 = v_1(BJ). Then
v_1 = Σ_r α_r g^(r) + v⊥,
where g^(r) is the (normalized) characteristic vector of T_r, v⊥ is orthogonal to each g^(r), and ‖v⊥‖ ≤ 1/c_2.

22 Analysis
Let's focus on one B_i; call it B. Let B̄ = E(B).
Füredi-Komlós '81: if σ² ≥ log⁶ n / n, then ‖B − B̄‖ ≤ 3σ√n.
v_1(BJ) is almost structured: with v_1 = v_1(BJ) = Σ_r α_r g^(r) + v⊥ as before, and using B̄Jv⊥ = 0,
‖B̄J‖ − ‖B − B̄‖ ≤ ‖BJv_1‖ ≤ ‖B̄Jv̄‖ + ‖B − B̄‖ ≤ ‖B̄J‖ √(1 − ‖v⊥‖²) + ‖B − B̄‖.
Using √(1 − x) ≤ 1 − x/2, this gives ‖v⊥‖² ≤ 4 ‖B − B̄‖ / ‖B̄J‖ ≤ 1/c_2².

23 Analysis
v_1 = v̄ + v⊥, where v̄ = Σ_r α_r g^(r).
Claim: When sorted, there is an Ω(1) gap in the α's.
v̄ is orthogonal to 1; this implies Σ_r α_r √n_r = 0, so the α_r cannot all have the same sign.
On the other hand, 1 = ‖v_1‖² = ‖v̄‖² + ‖v⊥‖² ≤ Σ_r α_r² + 1/c_2², so Σ_r α_r² ≥ 1/2.
(Figure: what v̄ looks like: a step vector, constant on each cluster.)
Combining these proves the existence of an Ω(1) gap.

24 Analysis
v_1 = v̄ + v⊥, where v̄ = Σ_r α_r g^(r).
Claim: No more than n_min/c_3 vertices cross the gap (n_min = min_r n_r).
This is implied by the fact that ‖v⊥‖ is small: an Ω(1) gap in the α's is a gap of at least 1/(4√n_min) in the entries of v̄. Suppose m vertices cross the gap. Then
(1/(4√n_min)) √m ≤ ‖v⊥‖ ≤ 1/c_2, so m ≤ 16 n_min / c_2².
(Figure: what v_1 looks like: the step vector plus a small perturbation.)

25 Combining the 4 log n bi-partitions
We showed: no more than n_min/c_3 vertices cross the gap.
Given the 4 log n bi-partitions, construct a graph on the vertices as follows: for each u, v ∈ [n], set e(u, v) = 1 if vertices u and v are on the same side of the bi-partition in at least a 1 − 2ɛ fraction of the cases. Find the connected components of this graph and return them as a (bi-)partition. A sketch of this combining step in code follows below.
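A sketch of the combining step, assuming each of the t bi-partitions is given as a vector of +1/−1 side labels and that ɛ is known; the function name combine and the scipy connected-components call are my own choices, not the thesis's implementation.

```python
# Minimal sketch of the combining step: connect u and v when they fall on
# the same side in at least a (1 - 2*eps) fraction of the t bi-partitions,
# then return connected components as the final (bi-)partition.
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def combine(bipartitions, eps):
    S = np.asarray(bipartitions)                             # shape (t, n), entries +1 / -1
    agree = (S[:, :, None] == S[:, None, :]).mean(axis=0)    # pairwise agreement rates
    E = (agree >= 1 - 2 * eps).astype(int)
    _, comp = connected_components(csr_matrix(E), directed=False)
    return comp                                              # component label for each vertex
```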

26 Combining the 4 log n bi-partitions
Equivalently: a vertex has probability ɛ = 1/c_3 of being misclassified.
Given the 4 log n bi-partitions, construct a graph on the vertices as follows: for each u, v ∈ [n], set e(u, v) = 1 if vertices u and v are on the same side of the bi-partition in at least a 1 − 2ɛ fraction of the cases. Find the connected components of this graph and return them as a (bi-)partition.

27 Combining the 4 log n bi-partitions
Equivalently: a vertex has probability ɛ = 1/c_3 of being misclassified.
Given the 4 log n bi-partitions, construct a graph on the vertices as follows: for each u, v ∈ [n], set e(u, v) = 1 if vertices u and v are on the same side of the bi-partition in at least a 1 − 2ɛ fraction of the cases. Find the connected components of this graph and return them as a (bi-)partition.
Need to show: No two vertices from the same cluster can be put in different components, and we find at least two components.

28 Combining the 4 log n bi-partitions
Equivalently: a vertex has probability ɛ = 1/c_3 of being misclassified.
Given the 4 log n bi-partitions, construct a graph on the vertices as follows: for each u, v ∈ [n], set e(u, v) = 1 if vertices u and v are on the same side of the bi-partition in at least a 1 − 2ɛ fraction of the cases. Find the connected components of this graph and return them as a (bi-)partition.
Clean clusters: no two vertices from the same cluster can be put in different components. Let u, v ∈ T_r. Vertex v is on the right side of the bi-partition in at least a (1 − ɛ) fraction of the cases; the same is true for u. So u and v are on the same side in at least a (1 − 2ɛ) fraction of the cases.

29 Combining the 4 log n bi-partitions
Equivalently: a vertex has probability ɛ = 1/c_3 of being misclassified.
Given the 4 log n bi-partitions, construct a graph on the vertices as follows: for each u, v ∈ [n], set e(u, v) = 1 if vertices u and v are on the same side of the bi-partition in at least a 1 − 2ɛ fraction of the cases. Find the connected components of this graph and return them as a (bi-)partition.
Nontrivial partitions: we find at least two components (a counting argument).

30 Pseudo-randomness and Clustering

31 Sparse graphs?
Goal: Design a model that allows constant-degree graphs.
Problem: The standard condition is ‖µ_r − µ_s‖² ≥ c σ² log n, but a planted partition model with σ² = Θ(d/n) for constant d will have vertices with logarithmic degree.
Our result: We introduce a model where clustering is possible if, for constant α,
‖µ_r − µ_s‖² ≥ c (α²/n) log² α.

32 Solution: Use pseudo-randomness
A graph G(V, E) is (p, α) pseudo-random if for all A, B ⊆ V,
|e(A, B) − p|A||B|| ≤ α √(|A||B|).
Theorem: A G_{n,p} random graph is (p, 2√(np)) pseudo-random (for p ≥ log⁶ n / n).
Proof: E(e(A, B)) = p|A||B|. Using a Chernoff bound,
P(|e(A, B) − E(e(A, B))| > 2√(np|A||B|)) ≤ exp(−2n).
But there are only 2^n · 2^n = 2^{2n} pairs of sets A, B. The claim follows.
Intuition: pseudo-random graphs are deterministic versions of random graphs.
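A sketch that spot-checks the (p, α) pseudo-randomness condition on a G_{n,p} sample for a handful of random subsets (checking all 2^{2n} pairs is of course infeasible); all sizes and the subset-sampling scheme are illustrative.

```python
# Minimal sketch: empirically check |e(A,B) - p|A||B|| <= alpha*sqrt(|A||B|)
# with alpha = 2*sqrt(np), on random vertex subsets of a G_{n,p} sample.
import numpy as np

rng = np.random.default_rng(5)
n, p = 400, 0.2
Adj = np.triu((rng.random((n, n)) < p), 1).astype(float)
Adj = Adj + Adj.T
alpha = 2 * np.sqrt(n * p)

for _ in range(5):
    S = rng.random(n) < 0.5                         # random subset A
    T = rng.random(n) < 0.5                         # random subset B
    e_ST = Adj[np.ix_(S, T)].sum()                  # (ordered) edge count e(A, B)
    slack = alpha * np.sqrt(S.sum() * T.sum())
    print(abs(e_ST - p * S.sum() * T.sum()) <= slack)
```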

33 The model
Graph G, k clusters T_r, r ∈ [k]. For some α, and for each r, s ∈ [k], there is p_rs such that:
G(T_r, T_s) is (p_rs, α) pseudo-random;
also, |e(x, T_s) − p_rs |T_s|| ≤ 2α if x ∈ T_r.
Algorithmic issue: Füredi-Komlós doesn't apply, and there is no independence!
(Shown: Ā and A.)

34 Rotationally Invariant Algorithm for Discrete Distributions

35 Discrete vs. Continuous
Similar results can be proved for discrete and continuous models: ‖µ_r − µ_s‖² ≥ Ω(σ² log n).
The algorithms: (1) share the spectral part that gives an approximation; (2) differ in the clean-up phase; continuous models seem to have more natural algorithms.
Mixture of Gaussians: k high-dimensional Gaussians with centers µ_r, r = 1 to k. The pdf of the r-th cluster/Gaussian is
f_r(x) ∝ exp(−(1/2)(x − µ_r)^T Σ_r^{-1} (x − µ_r)),
where Σ_r is the covariance matrix.

36 Discrete vs. Continuous
We would like an algorithm that is (1) simple, with a natural clean-up phase, (2) rotationally invariant, (3) easily extensible to more complex models.
McSherry 2001, Conjecture: Such an algorithm exists.

37 Discrete vs. Continuous
We would like an algorithm that is (1) simple, with a natural clean-up phase, (2) rotationally invariant, (3) easily extensible to more complex models.
Simplicity: a one-shot distance-based or projection-based algorithm, instead of combinatorial, incremental or sampling techniques.
McSherry 2001, Conjecture: Such an algorithm exists.

38 Discrete vs. Continuous
We would like an algorithm that is (1) simple, with a natural clean-up phase, (2) rotationally invariant, (3) easily extensible to more complex models.
Natural assumption: if the vectors are rotated, the clustering remains the same.
McSherry 2001, Conjecture: Such an algorithm exists.

39 Discrete vs. Continuous
We would like an algorithm that is (1) simple, with a natural clean-up phase, (2) rotationally invariant, (3) easily extensible to more complex models.
Extension: simpler algorithms are easier to adapt, e.g. to models without complete independence or without block structuring.
McSherry 2001, Conjecture: Such an algorithm exists.

40 Discrete vs. Continuous
We would like an algorithm that is (1) simple, with a natural clean-up phase, (2) rotationally invariant, (3) easily extensible to more complex models.
McSherry 2001, Conjecture: Such an algorithm exists.
Our result: The conjecture is true.
Theorem: Consider a matrix generated from a discrete mixture model with k clusters, m objects and n features. Clustering is possible if
‖µ_r − µ_s‖² ≥ c σ² (1 + n/m) log m.

41 Our algorithm
Cluster(A, k):
  Divide A into A_1 and A_2
  {µ̃_r} = Centers(A_1, k)
  Project(A_2, µ̃_1, ..., µ̃_k)
Project(A_2, µ̃_1, ..., µ̃_k):
  Group v ∈ A_2 with the µ̃_r that minimizes ‖v − µ̃_r‖
Centers(A_1, k):
  Uses a spectral algorithm to find approximate clusters P_r, r ∈ [k].
  Returns the empirical centers µ̃_r = (1/|P_r|) Σ_{v ∈ P_r} v
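A sketch of Cluster(A, k) as outlined above. The Centers routine here is a generic stand-in (rank-k projection followed by a simple Lloyd-style grouping) rather than the thesis's exact spectral step; all helper names are mine.

```python
# Minimal sketch of Cluster(A, k): split the samples into A1 and A2,
# estimate centers from A1, and assign each row of A2 to its nearest
# estimated center.  The spectral step is a generic stand-in.
import numpy as np

def centers(A1, k, rng, n_iter=20):
    U, s, Vt = np.linalg.svd(A1, full_matrices=False)
    A1k = (U[:, :k] * s[:k]) @ Vt[:k]                  # rank-k approximation of A1
    mu = A1k[rng.choice(len(A1k), size=k, replace=False)]
    for _ in range(n_iter):                            # simple Lloyd iterations
        d = np.linalg.norm(A1k[:, None, :] - mu[None, :, :], axis=2)
        lab = d.argmin(axis=1)
        mu = np.array([A1k[lab == r].mean(axis=0) if np.any(lab == r) else mu[r]
                       for r in range(k)])
    return mu                                          # empirical centers

def project(A2, mu):
    d = np.linalg.norm(A2[:, None, :] - mu[None, :, :], axis=2)
    return d.argmin(axis=1)                            # nearest-center assignment

def cluster(A, k, rng):
    idx = rng.permutation(len(A))
    half = len(A) // 2
    A1, A2 = A[idx[:half]], A[idx[half:]]              # divide A into A1 and A2
    mu = centers(A1, k, rng)
    return idx[half:], project(A2, mu)
```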

42 Analysis
Lemma: ‖µ̃_r − µ_r‖² ≤ c_1 σ² (1 + n/m) ≤ (1/20) ‖µ_r − µ_s‖².   (1)
Proof idea:
µ̃_r = (1/p̃_r) Σ_{v ∈ P_r} v
p̃_r µ̃_r = Σ_{v ∈ P_r} v = Σ_{v ∈ P'_r} v + Σ_s Σ_{v ∈ Q_rs} v
p̃_r (µ̃_r − µ_r) = Σ_{v ∈ P'_r} (v − µ_r) + Σ_s Σ_{v ∈ Q_rs} (v − µ_r)
The spectral method returns an approximately correct partition. Here P'_r = correctly classified part of P_r, p'_r = |P'_r|, p̃_r = |P_r|, and Q_rs = vertices that should be in P_s but were placed in P_r, q_rs = |Q_rs|.

43 Analysis
p̃_r (µ̃_r − µ_r) = Σ_{v ∈ P'_r} (v − µ_r) + Σ_s Σ_{v ∈ Q_rs} (v − µ_r)
Need to bound ‖Σ_{v ∈ P'_r} (v − µ_r)‖ and, for all s,
Σ_{v ∈ Q_rs} (v − µ_r) = Σ_{v ∈ Q_rs} (v − µ_s) + q_rs µ_s − q_rs µ_r, with ‖q_rs µ_s − q_rs µ_r‖ = q_rs ‖µ_s − µ_r‖.
It turns out q_rs decreases as ‖µ_s − µ_r‖ increases, so the two cancel each other out.
The bound on ‖Σ_{v ∈ P'_r} (v − µ_r)‖ follows from an argument based on a spectral norm bound (a la Füredi-Komlós).

44 Analysis
Lemma: For each sample u, if u ∈ T_r, then for all s ≠ r,
(u − µ̃_r) · (µ̃_s − µ̃_r) ≤ (2/5) ‖µ_r − µ_s‖².
Assume µ̃_r = µ_r + δ_r for all r. Then
(u − µ̃_r) · (µ̃_s − µ̃_r) = (u − µ_r − δ_r) · (µ_s − µ_r − δ_r + δ_s)
= (u − µ_r) · (µ_s − µ_r) − δ_r · (µ_s − µ_r) − δ_r · (δ_s − δ_r) + (u − µ_r) · (δ_s − δ_r).
(u − µ_r) · (µ_s − µ_r) is small by the separation assumption.

45 Analysis
Lemma: For each sample u, if u ∈ T_r, then for all s ≠ r,
(u − µ̃_r) · (µ̃_s − µ̃_r) ≤ (2/5) ‖µ_r − µ_s‖².
Assume µ̃_r = µ_r + δ_r for all r. Then
(u − µ̃_r) · (µ̃_s − µ̃_r) = (u − µ_r) · (µ_s − µ_r) − δ_r · (µ_s − µ_r) − δ_r · (δ_s − δ_r) + (u − µ_r) · (δ_s − δ_r).
|δ_r · (µ_s − µ_r)| ≤ ‖δ_r‖ ‖µ_s − µ_r‖ by Cauchy-Schwarz, and is small since ‖δ_r‖ is small.

46 Analysis
Lemma: For each sample u, if u ∈ T_r, then for all s ≠ r,
(u − µ̃_r) · (µ̃_s − µ̃_r) ≤ (2/5) ‖µ_r − µ_s‖².
Assume µ̃_r = µ_r + δ_r for all r. Then
(u − µ̃_r) · (µ̃_s − µ̃_r) = (u − µ_r) · (µ_s − µ_r) − δ_r · (µ_s − µ_r) − δ_r · (δ_s − δ_r) + (u − µ_r) · (δ_s − δ_r).
δ_r · (δ_s − δ_r) is similarly small.

47 Analysis
Lemma: For each sample u, if u ∈ T_r, then for all s ≠ r,
(u − µ̃_r) · (µ̃_s − µ̃_r) ≤ (2/5) ‖µ_r − µ_s‖².
Assume µ̃_r = µ_r + δ_r for all r. Then
(u − µ̃_r) · (µ̃_s − µ̃_r) = (u − µ_r) · (µ_s − µ_r) − δ_r · (µ_s − µ_r) − δ_r · (δ_s − δ_r) + (u − µ_r) · (δ_s − δ_r).
Main challenge: bounding (u − µ_r) · (δ_s − δ_r).

48 Completing the proof
Claim: (u − µ_r) · (δ_s − δ_r) < c_3 σ² (1 + n/m) log m
Proof idea:
(u − µ_r) · δ_r = Σ_{i ∈ [n]} (u(i) − µ_r(i)) δ_r(i) = Σ_{i ∈ [n]} x(i)
This is a sum of zero-mean random variables x(i).
E(x(i)²) ≤ 2 δ_r(i)² σ², so Σ_i E(x(i)²) ≤ 2 σ² ‖δ_r‖² ≤ c_3 k σ⁴ (1 + n/m).
|x(i)| ≤ |δ_r(i)| ≤ 2 c_4 σ², because the number of 1's in a column can be at most 1.1 m σ².

49 Completing the proof
So we have a sum of absolutely bounded, zero-mean, bounded-variance random variables. Can apply:
Bernstein's inequality: Let {X_i}_{i=1}^n be a collection of independent random variables with Pr{|X_i| ≤ M} = 1 for all i. Then, for any ε ≥ 0,
Pr{ |Σ_{i=1}^n (X_i − E[X_i])| ≥ ε } ≤ exp( −ε² / (2(θ² + Mε/3)) ),
where θ² = Σ_i E[X_i²].
Plugging in our values,
Pr{ |Σ_{i ∈ [n]} x(i)| ≥ c_3 σ² ((1 + n/m) + log m) } ≤ 1/m³.
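A small sketch evaluating the Bernstein bound stated above against a Monte Carlo estimate for a sum of bounded zero-mean variables; the uniform distribution and all parameter values are illustrative, not the quantities from the proof.

```python
# Minimal sketch: Bernstein's tail bound vs. a Monte Carlo estimate for a
# sum of i.i.d. bounded, zero-mean random variables.
import numpy as np

def bernstein_bound(eps, theta_sq, M):
    # Pr(|sum_i (X_i - E X_i)| >= eps) <= exp(-eps^2 / (2 (theta^2 + M eps / 3)))
    return np.exp(-eps**2 / (2.0 * (theta_sq + M * eps / 3.0)))

rng = np.random.default_rng(6)
n, M, trials = 200, 1.0, 20000
X = rng.uniform(-M, M, size=(trials, n))          # bounded, zero mean
theta_sq = n * (2 * M) ** 2 / 12.0                # sum of the variances
eps = 40.0
empirical = float((np.abs(X.sum(axis=1)) >= eps).mean())
print(empirical, "<=", bernstein_bound(eps, theta_sq, M))
```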

50 Entrywise Bounds for Eigenvectors of Random Graphs

51 Well studied: ℓ_2 norm bounds
Already saw: if A is the adjacency matrix of a G_{n,p} graph, ‖A − E(A)‖ ≤ 3√(np). There is a lot of research on similar bounds.
v = v_1(E(A)) = (1/√n) 1.
Question: u = v_1(A) = ?
Goal: Study ‖u − v‖_∞ = max_{i ∈ [n]} |u(i) − v(i)|, a potentially useful notion of spectral stability.

52 Can ℓ_2 give ℓ_∞? Not directly!
The spectral norm bound on ‖A − E(A)‖ can be converted to a bound on ‖u − v‖. The best bound we can get this way is
‖u − v‖_∞ ≤ ‖u − v‖ ≤ 3/√(np).
Too weak! 1/√(np) is much larger than 1/√n.

53 Eigenvector of a Random Graph
(Figure: G_{400,0.2}.)

54 Our result
Let A be the adjacency matrix of a G_{n,p} graph, and u = v_1(A). Then with probability 1 − o(1), for all i,
u(i) = (1/√n)(1 ± ε), where ε = c_2 (log n / log(np)) √(log n / (np)),
for p ≥ log⁶ n / n and a constant c_2.
Essentially optimal.
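A sketch that checks this statement empirically on one G_{n,p} sample: every entry of the top eigenvector should be within a small relative error of 1/√n. The sizes match the earlier figure but are otherwise illustrative.

```python
# Minimal sketch: entrywise behaviour of the top eigenvector of a G_{n,p}
# adjacency matrix -- each entry should be close to 1/sqrt(n).
import numpy as np

rng = np.random.default_rng(7)
n, p = 400, 0.2
A = np.triu((rng.random((n, n)) < p), 1).astype(float)
A = A + A.T

w, V = np.linalg.eigh(A)
u = V[:, -1]                               # eigenvector of the largest eigenvalue
u = u * np.sign(u.sum())                   # fix the global sign (entries positive w.h.p.)
print(np.abs(u * np.sqrt(n) - 1).max())    # max relative deviation from 1/sqrt(n)
```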

55 Proof
Only need a few elementary properties. Let Δ = 2√(log n / (np)). With high probability:
deg(i) = np(1 ± Δ) for all i ∈ [n];
|e(A, B) − p|A||B|| ≤ 2√(np|A||B|) for all A, B;
λ_1(A) ≈ np.
Normalize u = v_1(A) so that max_i u(i) = u(1) = 1.
Au = (np)u, so (Au)(1) = (np)u(1), i.e. Σ_i A(1, i)u(i) = Σ_{i ∈ N(1)} u(i) = np.
Claim: At least np/2 vertices of N(1) have u(i) ≥ 1/2.
We know Σ_{i ∈ N(1)} u(i) = np.

56 Proof
Only need a few elementary properties (as on the previous slide).
Claim: At least np/2 vertices of N(1) have u(i) ≥ 1/2.
We know Σ_{i ∈ N(1)} u(i) = np. If not,
Σ_{i ∈ N(1)} u(i) ≤ (np(1 + Δ) − np/2)(1/2) + (np/2) · 1 = np(1 + Δ)/2 + np/4 < np,
a contradiction.

57 Proof (contd.)
Idea: Extend the argument to successive neighborhood sets. We define a sequence of sets {S_t} for t = 1, ...:
S_1 = {1}
S_{t+1} = {i : i ∈ N(S_t) and u(i) ≥ 1/(c(t + 1))}
How quickly does S_{t+1} grow?
Lemma: Let t* be the last index such that |S_{t*}| ≤ 2n/3. For all t ≤ t*,
|S_{t+1}| ≥ (np) |S_t| / (9 t²).
Exponential increase!

58 Connection to Clustering
Experiments show that for our models, no clean-up is necessary at all.
Needed: Subtle entrywise bounds for the second (and smaller) eigenvectors in the planted model.
(Figure: second eigenvector of a graph with two clusters.)

59 Connection to Clustering
Can show for models with stronger separation conditions:
Theorem: Assume σ² = Ω(1/n). Then the second eigenvector provides a clean clustering if
‖µ_r − µ_s‖² ≥ σ^{2/3} log n.
This is stronger than the standard assumption ‖µ_r − µ_s‖² ≥ σ² log n.

60 Future Work
1. Clustering without clean-up
2. Clustering below the variance bound Ω(σ²)
3. A Chernoff-type bound for entrywise error? Algorithmic applications?

61 Thanks!
