Stanford University Graph Partitioning and Expanders Handout 3 Luca Trevisan May 8, 2013


Lecture 3

In which we analyze the power method to approximate eigenvalues and eigenvectors, and we describe some more algorithmic applications of spectral graph theory.

1 The power method

Last week, we showed that, if $G = (V, E)$ is a $d$-regular graph, and $L$ is its normalized Laplacian matrix with eigenvalues $0 = \lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n$, given an eigenvector of $\lambda_2$, the algorithm SpectralPartition finds, in nearly-linear time $O(|E| + |V| \log |V|)$, a cut $(S, V-S)$ such that $\phi(S) \leq 2\sqrt{\phi(G)}$.

More generally, if, instead of being given an eigenvector $x$ such that $Lx = \lambda_2 x$, we are given a vector $x$ such that $x^T L x \leq (\lambda_2 + \epsilon)\, x^T x$, then the algorithm finds a cut such that $\phi(S) \leq \sqrt{4\phi(G) + 2\epsilon}$. In this lecture we describe and analyze an algorithm that computes such a vector using $O\left( (|V| + |E|) \cdot \frac 1\epsilon \cdot \log \frac{|V|}{\epsilon} \right)$ arithmetic operations.

A symmetric matrix is positive semi-definite (abbreviated PSD) if all its eigenvalues are nonnegative. We begin by describing an algorithm that approximates the largest eigenvalue of a given symmetric PSD matrix. This might not seem to help very much, because we want to compute the second smallest, not the largest, eigenvalue. We will see, however, that the algorithm is easily modified to accomplish what we want.

1.1 The Power Method to Approximate the Largest Eigenvalue

The algorithm works as follows.

© 2013 by Luca Trevisan. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/3.0/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.

Algorithm Power
Input: PSD matrix $M$, parameter $k$
  Pick uniformly at random $x_0 \in \{-1, 1\}^n$
  for $i := 1$ to $k$: $x_i := M x_{i-1}$
  return $x_k$

That is, the algorithm simply picks uniformly at random a vector $x$ with $\pm 1$ coordinates, and outputs $M^k x$. Note that the algorithm performs $O(k \cdot (n + m))$ arithmetic operations, where $m$ is the number of non-zero entries of the matrix $M$.

Theorem 1 For every PSD matrix $M$, positive integer $k$ and parameter $\epsilon > 0$, with probability $\geq 3/16$ over the choice of $x_0$, the algorithm Power outputs a vector $x_k$ such that
$$\frac{x_k^T M x_k}{x_k^T x_k} \geq \lambda_1 \cdot (1-\epsilon) \cdot \frac{1}{1 + 4n(1-\epsilon)^{2k}}$$
where $\lambda_1$ is the largest eigenvalue of $M$.

Note that, in particular, we can have $k = O(\log n / \epsilon)$ and $\frac{x_k^T M x_k}{x_k^T x_k} \geq (1 - O(\epsilon)) \cdot \lambda_1$.
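The following is a minimal sketch of the algorithm in Python with NumPy (not part of the original notes; the function names and the use of a generic matrix $M$ are illustrative choices):

```python
import numpy as np

def power(M, k, rng=None):
    """Algorithm Power: k iterations of x_i := M x_{i-1}, starting from a
    uniformly random x_0 in {-1, +1}^n.  Works for a dense or scipy.sparse M."""
    rng = np.random.default_rng(rng)
    n = M.shape[0]
    x = rng.choice([-1.0, 1.0], size=n)   # x_0 uniform in {-1, +1}^n
    for _ in range(k):                    # each product costs O(n + m) operations
        x = M @ x
        # (in practice one may renormalize x here to avoid overflow;
        #  rescaling does not change the Rayleigh quotient)
    return x

def rayleigh_quotient(M, x):
    """The quantity x^T M x / x^T x bounded in Theorem 1."""
    return float(x @ (M @ x)) / float(x @ x)
```

By Theorem 1, taking $k = O(\log n/\epsilon)$ makes the returned Rayleigh quotient at least $(1-O(\epsilon))\lambda_1$ with probability at least $3/16$; running the algorithm a few independent times and keeping the vector with the largest Rayleigh quotient boosts the success probability.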

Let $\lambda_1 \geq \cdots \geq \lambda_n$ be the eigenvalues of $M$, with multiplicities, and $v_1, \ldots, v_n$ be a system of orthonormal eigenvectors such that $M v_i = \lambda_i v_i$. Theorem 1 is implied by the following two lemmas.

Lemma 2 Let $v \in \mathbb{R}^n$ be a vector such that $\|v\| = 1$. Sample uniformly $x \in \{-1, 1\}^n$. Then
$$\mathbb{P}\left[ |\langle x, v\rangle| \geq \frac 12 \right] \geq \frac 3{16}$$

Lemma 3 Let $x \in \mathbb{R}^n$ be a vector such that $|\langle x, v_1\rangle| \geq \frac 12$. Then, for every positive integer $k$ and positive $\epsilon > 0$, if we define $y := M^k x$, we have
$$\frac{y^T M y}{y^T y} \geq \lambda_1 \cdot (1-\epsilon) \cdot \frac{1}{1 + 4\|x\|^2 (1-\epsilon)^{2k}}$$

It remains to prove the two lemmas.

Proof: (Of Lemma 2) Let $v = (v_1, \ldots, v_n)$. The inner product $\langle x, v\rangle$ is the random variable
$$S := \sum_i x_i v_i$$
Let us compute the first, second, and fourth moment of $S$.
$$\mathbb{E}\, S = 0$$
$$\mathbb{E}\, S^2 = \sum_i v_i^2 = 1$$
$$\mathbb{E}\, S^4 = 3 \left( \sum_i v_i^2 \right)^2 - 2 \sum_i v_i^4 \leq 3$$
Recall that the Paley-Zygmund inequality states that if $Z$ is a non-negative random variable with finite variance, then, for every $0 \leq \delta \leq 1$, we have
$$\mathbb{P}[Z \geq \delta\, \mathbb{E}\, Z] \geq (1-\delta)^2 \cdot \frac{(\mathbb{E}\, Z)^2}{\mathbb{E}\, Z^2} \qquad (1)$$
which follows by noting that
$$\mathbb{E}\, Z = \mathbb{E}[Z \cdot \mathbb{1}_{Z < \delta \mathbb{E} Z}] + \mathbb{E}[Z \cdot \mathbb{1}_{Z \geq \delta \mathbb{E} Z}],$$
that
$$\mathbb{E}[Z \cdot \mathbb{1}_{Z < \delta \mathbb{E} Z}] \leq \delta\, \mathbb{E}\, Z,$$
and that
$$\mathbb{E}[Z \cdot \mathbb{1}_{Z \geq \delta \mathbb{E} Z}] \leq \sqrt{\mathbb{E}\, Z^2} \cdot \sqrt{\mathbb{E}\, \mathbb{1}_{Z \geq \delta \mathbb{E} Z}} = \sqrt{\mathbb{E}\, Z^2} \cdot \sqrt{\mathbb{P}[Z \geq \delta\, \mathbb{E}\, Z]}$$
We apply the Paley-Zygmund inequality to the case $Z = S^2$ and $\delta = 1/4$, and we derive
$$\mathbb{P}\left[ S^2 \geq \frac 14 \right] \geq \left( \frac 34 \right)^2 \cdot \frac 13 = \frac 3{16}$$
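As a quick numerical sanity check of Lemma 2 (an illustration added here, not part of the notes; the dimension, seed, and trial count are arbitrary), one can estimate the probability empirically for a unit vector $v$:

```python
import numpy as np

# Empirical check of Lemma 2: P[ |<x, v>| >= 1/2 ] >= 3/16 for a unit vector v
# and uniform x in {-1, +1}^n.
rng = np.random.default_rng(0)
n, trials = 50, 100_000
v = rng.standard_normal(n)
v /= np.linalg.norm(v)                        # make ||v|| = 1
x = rng.choice([-1.0, 1.0], size=(trials, n))
freq = np.mean(np.abs(x @ v) >= 0.5)          # empirical P[ |<x, v>| >= 1/2 ]
print(freq, ">=", 3 / 16)                     # the lemma guarantees at least 0.1875
```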

Remark 4 The proof of Lemma 2 works even if $x \in \{-1,1\}^n$ is selected according to a 4-wise independent distribution. This means that the algorithm can be derandomized in polynomial time.

Proof: (Of Lemma 3) Let us write $x$ as a linear combination of the eigenvectors
$$x = a_1 v_1 + \cdots + a_n v_n$$
where the coefficients can be computed as $a_i = \langle x, v_i\rangle$. Note that, by assumption, $|a_1| \geq .5$, and that, by orthonormality of the eigenvectors, $\|x\|^2 = \sum_i a_i^2$.

We have
$$y = a_1 \lambda_1^k v_1 + \cdots + a_n \lambda_n^k v_n$$
and so
$$y^T M y = \sum_i a_i^2 \lambda_i^{2k+1} \qquad \text{and} \qquad y^T y = \sum_i a_i^2 \lambda_i^{2k}$$
We need to prove a lower bound to the ratio of the above two quantities. We will compute a lower bound to the numerator and an upper bound to the denominator in terms of the same parameter.

Let $\ell$ be the number of eigenvalues larger than $\lambda_1 \cdot (1-\epsilon)$. Then, recalling that the eigenvalues are sorted in non-increasing order, we have
$$y^T M y \geq \sum_{i=1}^{\ell} a_i^2 \lambda_i^{2k+1} \geq \lambda_1 (1-\epsilon) \sum_{i=1}^{\ell} a_i^2 \lambda_i^{2k}$$
We also see that
$$\sum_{i=\ell+1}^{n} a_i^2 \lambda_i^{2k} \leq \lambda_1^{2k} (1-\epsilon)^{2k} \sum_{i=\ell+1}^{n} a_i^2 \leq \lambda_1^{2k} (1-\epsilon)^{2k} \|x\|^2 \leq 4 a_1^2 \lambda_1^{2k} (1-\epsilon)^{2k} \|x\|^2 \leq 4 \|x\|^2 (1-\epsilon)^{2k} \sum_{i=1}^{\ell} a_i^2 \lambda_i^{2k}$$
using $4 a_1^2 \geq 1$ in the next-to-last step.

So we have
$$y^T y \leq \left( 1 + 4\|x\|^2 (1-\epsilon)^{2k} \right) \cdot \sum_{i=1}^{\ell} a_i^2 \lambda_i^{2k}$$
giving
$$\frac{y^T M y}{y^T y} \geq \lambda_1 \cdot \frac{1-\epsilon}{1 + 4\|x\|^2 (1-\epsilon)^{2k}}$$

Remark 5 Where did we use the assumption that $M$ is positive semidefinite? What happens if we apply this algorithm to the adjacency matrix of a bipartite graph?

1.2 Approximating the Second Largest Eigenvalue

Suppose now that we are interested in finding the second largest eigenvalue of a given PSD matrix $M$. If $M$ has eigenvalues $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$, and we know the eigenvector $v_1$ of $\lambda_1$, then $M$ is a PSD linear map from the orthogonal space to $v_1$ to itself, and $\lambda_2$ is the largest eigenvalue of this linear map. We can then run the previous algorithm on this linear map.

Algorithm Power2
Input: PSD matrix $M$, vector $v_1$, parameter $k$
  Pick uniformly at random $x \in \{-1, 1\}^n$
  $x_0 := x - v_1 \langle x, v_1\rangle$
  for $i := 1$ to $k$: $x_i := M x_{i-1}$
  return $x_k$
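A minimal sketch of Power2 in the same style (again illustrative, not part of the notes; it assumes a unit-length top eigenvector `v1` is given, as in the statement of the algorithm):

```python
import numpy as np

def power2(M, v1, k, rng=None):
    """Algorithm Power2: project the random start off the (unit-length) top
    eigenvector v1, then run k power iterations with the PSD matrix M."""
    rng = np.random.default_rng(rng)
    n = M.shape[0]
    x = rng.choice([-1.0, 1.0], size=n)
    x = x - v1 * (x @ v1)        # x_0 := x - v1 <x, v1>, orthogonal to v1
    for _ in range(k):
        x = M @ x                # stays orthogonal to v1 in exact arithmetic
    return x
```

In exact arithmetic every iterate stays orthogonal to $v_1$, since $M$ maps the orthogonal complement of $v_1$ to itself; in floating point one may want to re-project after each multiplication, a detail not discussed in the notes.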

If $v_1, \ldots, v_n$ is an orthonormal basis of eigenvectors for the eigenvalues $\lambda_1 \geq \cdots \geq \lambda_n$ of $M$, then, at the beginning, we pick a random vector
$$x = a_1 v_1 + a_2 v_2 + \cdots + a_n v_n$$
that, with probability at least $3/16$, satisfies $|a_2| \geq 1/2$. (Cf. Lemma 2.) Then we compute $x_0$, which is the projection of $x$ on the subspace orthogonal to $v_1$, that is
$$x_0 = a_2 v_2 + \cdots + a_n v_n$$
Note that $\|x\|^2 = n$ and that $\|x_0\|^2 \leq n$. The output is the vector
$$x_k = a_2 \lambda_2^k v_2 + \cdots + a_n \lambda_n^k v_n$$
If we apply Lemma 3 to the subspace orthogonal to $v_1$, we see that when $|a_2| \geq 1/2$ we have that, for every $0 < \epsilon < 1$,
$$\frac{x_k^T M x_k}{x_k^T x_k} \geq \lambda_2 \cdot (1-\epsilon) \cdot \frac{1}{1 + 4n(1-\epsilon)^{2k}}$$
We have thus established the following analysis.

Theorem 6 For every PSD matrix $M$, positive integer $k$ and parameter $\epsilon > 0$, if $v_1$ is a length-1 eigenvector of the largest eigenvalue of $M$, then with probability $\geq 3/16$ over the choice of $x_0$, the algorithm Power2 outputs a vector $x_k \perp v_1$ such that
$$\frac{x_k^T M x_k}{x_k^T x_k} \geq \lambda_2 \cdot (1-\epsilon) \cdot \frac{1}{1 + 4n(1-\epsilon)^{2k}}$$
where $\lambda_2$ is the second largest eigenvalue of $M$, counting multiplicities.

1.3 The Second Smallest Eigenvalue of the Laplacian

Finally, we come to the case in which we want to compute the second smallest eigenvalue of the normalized Laplacian matrix $L = I - \frac 1d A$ of a $d$-regular graph $G = (V, E)$, where $A$ is the adjacency matrix of $G$.

Consider the matrix $M := 2I - L = I + \frac 1d A$. Then if $0 = \lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n$ are the eigenvalues of $L$, we have that $2 = 2 - \lambda_1 \geq 2 - \lambda_2 \geq \cdots \geq 2 - \lambda_n \geq 0$ are the eigenvalues of $M$, and that $M$ is PSD. $M$ and $L$ have the same eigenvectors, and so $v_1 = \frac 1{\sqrt n} (1, \ldots, 1)$ is a length-1 eigenvector of the largest eigenvalue of $M$.

By running algorithm Power2, we can find a vector $x$ such that

$$x^T M x \geq (1-\epsilon) \cdot (2 - \lambda_2) \cdot x^T x$$
and, since
$$x^T M x = 2\, x^T x - x^T L x,$$
rearranging we have
$$\frac{x^T L x}{x^T x} \leq \lambda_2 + 2\epsilon$$

If we want to compute a vector whose Rayleigh quotient is, say, at most $2\lambda_2$, then the running time will be $\tilde O((|V| + |E|)/\lambda_2)$, because we need to set $\epsilon = \lambda_2/2$, which is not nearly linear in the size of the graph if $\lambda_2$ is, say, $O(1/|V|)$.

For a running time that is nearly linear in $n$ for all values of $\lambda_2$, one can, instead, apply the power method to the pseudoinverse $L^+$ of $L$. (Assuming that the graph is connected, $L^+ x$ is the unique vector $y$ such that $Ly = x$, if $x \perp (1, \ldots, 1)$, and $L^+ x = 0$ if $x$ is parallel to $(1, \ldots, 1)$.) This is because $L^+$ has eigenvalues $0, 1/\lambda_2, \ldots, 1/\lambda_n$, and so $L^+$ is PSD and $1/\lambda_2$ is its largest eigenvalue.

Although computing $L^+$ is not known to be doable in nearly linear time, there are nearly linear time algorithms that, given $x$, solve in $y$ the linear system $Ly = x$, and this is the same as computing the product $L^+ x$, which is enough to implement algorithm Power2 applied to $L^+$.

In time $O\left( (|V| + |E|) \cdot \left( \log \frac{|V|}{\epsilon} \right)^{O(1)} \right)$, we can find a vector $y$ such that $y = (L^+)^k x$, where $x$ is a random vector in $\{-1, 1\}^n$, shifted to be orthogonal to $(1, \ldots, 1)$, and $k = O\left( \frac 1\epsilon \log \frac{|V|}{\epsilon} \right)$. What is the Rayleigh quotient of such a vector with respect to $L$?

Let $v_1, \ldots, v_n$ be a basis of orthonormal eigenvectors for $L$ and $L^+$. If $0 = \lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n$ are the eigenvalues of $L$, then we have
$$L v_1 = L^+ v_1 = 0$$
and, for $i = 2, \ldots, n$, we have
$$L v_i = \lambda_i v_i \qquad L^+ v_i = \frac 1{\lambda_i} v_i$$
Write $x = a_1 v_1 + \cdots + a_n v_n$, where $\sum_i a_i^2 \leq n$, and assume that, as happens with probability at least $3/16$, we have $a_2^2 \geq \frac 14$. Then
$$y = \sum_{i=2}^{n} a_i \frac 1{\lambda_i^k} v_i$$
and the Rayleigh quotient of $y$ with respect to $L$ is

$$\frac{y^T L y}{y^T y} = \frac{\sum_{i=2}^{n} a_i^2 \lambda_i^{1-2k}}{\sum_{i=2}^{n} a_i^2 \lambda_i^{-2k}}$$
and the analysis proceeds similarly to the analysis of the previous section. If we let $\ell$ be the index such that $\lambda_\ell \leq (1+\epsilon)\lambda_2 < \lambda_{\ell+1}$, then we can upper bound the numerator as
$$\sum_{i=2}^{n} a_i^2 \lambda_i^{1-2k} \leq (1+\epsilon)\lambda_2 \sum_{i=2}^{\ell} a_i^2 \lambda_i^{-2k} + n \cdot \big( (1+\epsilon)\lambda_2 \big)^{1-2k} \leq (1+\epsilon)\lambda_2 \left( \sum_{i=2}^{\ell} a_i^2 \lambda_i^{-2k} + \frac{4 n\, a_2^2}{(1+\epsilon)^{2k} \lambda_2^{2k}} \right)$$
and we can lower bound the denominator as
$$\sum_{i=2}^{n} a_i^2 \lambda_i^{-2k} \geq \sum_{i=2}^{\ell} a_i^2 \lambda_i^{-2k} \geq a_2^2 \lambda_2^{-2k}$$
and the Rayleigh quotient is
$$\frac{y^T L y}{y^T y} \leq (1+\epsilon)\lambda_2 \cdot \left( 1 + \frac{4n}{(1+\epsilon)^{2k}} \right) \leq (1 + 2\epsilon)\lambda_2$$
when $k = O\left( \frac 1\epsilon \log \frac n\epsilon \right)$.

An $O\left( (|V| + |E|) \cdot (\log |V|)^{O(1)} \right)$ algorithm to solve in $y$ the linear system $Ly = x$ was first developed by Spielman and Teng. Faster algorithms (with a lower exponent in the $(\log |V|)^{O(1)}$ part of the running time, and smaller constants) were later developed by Koutis, Miller and Peng, and, very recently, by Kelner, Orecchia, Sidford, and Zhu.
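To make the reduction of this section concrete, here is a sketch that applies the Power2 idea directly to $M = 2I - L = I + \frac 1d A$ for a $d$-regular graph (an illustrative implementation of the direct method only, not of the solver-based method using $L^+$; the function name and the dense adjacency matrix are assumptions):

```python
import numpy as np

def approx_lambda2(A, d, k, rng=None):
    """Estimate lambda_2 of L = I - A/d for a d-regular graph with adjacency
    matrix A, by power iteration on M = 2I - L = I + A/d, with v1 = (1,...,1)/sqrt(n)."""
    n = A.shape[0]
    v1 = np.ones(n) / np.sqrt(n)             # top eigenvector of M (eigenvalue 2)
    rng = np.random.default_rng(rng)
    x = rng.choice([-1.0, 1.0], size=n)
    x = x - v1 * (x @ v1)                    # project away the constant vector
    for _ in range(k):
        x = x + (A @ x) / d                  # multiply by M = I + A/d
    Lx = x - (A @ x) / d                     # L x = x - A x / d
    return float(x @ Lx) / float(x @ x)      # Rayleigh quotient of x w.r.t. L
```

In exact arithmetic the returned value is always at least $\lambda_2$ (the final vector is orthogonal to the all-ones vector), and with probability at least $3/16$ it is at most about $\lambda_2 + 2\epsilon$ once $k = O(\log n/\epsilon)$, matching the discussion above; as noted, this direct method is only nearly linear time when $\lambda_2$ is not too small.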

2 Other applications of spectral graph theory

2.1 Spectral graph theory in irregular graphs

Let $G = (V, E)$ be an undirected graph in which every vertex has positive degree and $A$ be the adjacency matrix of $G$. We want to define a Laplacian matrix $L$ and a Rayleigh quotient such that the $k$-th eigenvalue of $L$ is the minimum over all $k$-dimensional spaces of the maximum Rayleigh quotient in the space, and we want the conductance of a set to be the same as the Rayleigh quotient of the indicator vector of the set. All the facts that we have proved in the regular case essentially reduce to these two properties of the Laplacian and the Rayleigh quotient.

Let $d_v$ be the degree of vertex $v$ in $G$. We define the Rayleigh quotient of a vector $x \in \mathbb{R}^V$ as
$$R_G(x) := \frac{\sum_{\{u,v\} \in E} |x_u - x_v|^2}{\sum_v d_v x_v^2}$$
Let $D$ be the diagonal matrix of degrees such that $D_{u,v} = 0$ if $u \neq v$ and $D_{v,v} = d_v$. Then define the Laplacian of $G$ as
$$L_G := I - D^{-1/2} A D^{-1/2}$$
Note that in a $d$-regular graph we have $D = dI$ and $L_G = I - \frac 1d A$, which is the standard definition.
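A small sketch of these definitions (illustrative NumPy code, not part of the notes; it assumes a 0/1 adjacency matrix with every degree positive, and the helper names are ours):

```python
import numpy as np

def normalized_laplacian(A):
    """L_G = I - D^{-1/2} A D^{-1/2}, where D is the diagonal matrix of degrees."""
    deg = A.sum(axis=1)                       # d_v, assumed positive for every v
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return np.eye(A.shape[0]) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

def rayleigh_quotient_G(A, x):
    """R_G(x) = sum over edges {u,v} of (x_u - x_v)^2, divided by sum_v d_v x_v^2."""
    deg = A.sum(axis=1)
    num = 0.5 * np.sum(A * (x[:, None] - x[None, :]) ** 2)   # each edge counted once
    return num / np.sum(deg * x ** 2)
```

For the indicator vector of a set $S$, the quotient $R_G(\mathbb{1}_S)$ equals $E(S, V-S)/vol(S)$, which (for $vol(S) \leq vol(V)/2$) is the conductance of $S$, exactly the second property asked of the definition above.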

Since $L = L_G$ is a real symmetric matrix, the $k$-th smallest eigenvalue of $L$ is
$$\lambda_k = \min_{k\text{-dimensional } S} \ \max_{x \in S - \{0\}} \frac{x^T L x}{x^T x}$$
Now let us do the change of variable $y = D^{-1/2} x$. We have
$$\lambda_k = \min_{k\text{-dimensional } S} \ \max_{y \in S - \{0\}} \frac{y^T D^{1/2} L D^{1/2} y}{y^T D y}$$
In the denominator, $y^T D y = \sum_v d_v y_v^2$, and in the numerator a simple calculation shows
$$y^T D^{1/2} L D^{1/2} y = y^T (D - A) y = \sum_{\{u,v\} \in E} |y_u - y_v|^2$$
so indeed
$$\lambda_k = \min_{k\text{-dimensional } S} \ \max_{y \in S} R_G(y)$$
For two vectors $y, z$, define the inner product
$$\langle y, z\rangle_G := \sum_v d_v y_v z_v$$
Then we can prove that
$$\lambda_2 = \min_{y :\ \langle y, (1,\ldots,1)\rangle_G = 0} R_G(y)$$
With these definitions and observations in place, it is now possible to repeat the proof of Cheeger's inequality step by step (replacing the condition $\sum_v x_v = 0$ with $\sum_v d_v x_v = 0$, adjusting the definition of Rayleigh quotient, etc.) and prove that if $\lambda_2$ is the second smallest eigenvalue of the Laplacian of an irregular graph $G$, and $\phi(G)$ is the conductance of $G$, then
$$\frac{\lambda_2}{2} \leq \phi(G) \leq \sqrt{2\lambda_2}$$

2.2 Higher-order Cheeger inequality

The Cheeger inequality gives a robust version of the fact that $\lambda_2 = 0$ if and only if $G$ is disconnected. It is possible to also give a robust version of the fact that $\lambda_k = 0$ if and only if $G$ has at least $k$ connected components. We will restrict the discussion to regular graphs.

For a size parameter $s \leq |V|/2$, denote the size-$s$ small-set expansion of a graph by
$$SSE_s(G) := \min_{S \subseteq V :\ |S| \leq s} \phi(S)$$
so that $SSE_{n/2}(G) = \phi(G)$. This is an interesting optimization problem, because in many settings in which non-expanding sets correspond to clusters, it is more interesting to find small non-expanding sets (and, possibly, remove them and iterate to find more) than to find large ones. It has been studied very intensely in the past five years because of its connection with the Unique Games Conjecture, which is in turn one of the key open problems in complexity theory.

If $\lambda_k = 0$, then we know that there are at least $k$ connected components, and, in particular, there is a set $S \subseteq V$ such that $\phi(S) = 0$ and $|S| \leq \frac nk$, meaning that $SSE_{n/k} = 0$. By analogy with the Cheeger inequality, we may look for a robust version of this fact,

of the form $SSE_{n/k} \leq O(\sqrt{\lambda_k})$. Unfortunately there are counterexamples, but Arora, Barak and Steurer have proved that, for every $\delta > 0$,
$$SSE_{\frac{n}{k^{1+\delta}}} \leq O\left( \sqrt{\frac{\lambda_k}{\delta}} \right)$$
To formulate a higher-order version of the Cheeger inequality, we need to define a quantity that generalizes expansion in a different way. For an integer parameter $k$, define the order-$k$ expansion as
$$\phi_k(G) = \min_{S_1, \ldots, S_k \subseteq V \text{ disjoint}} \ \max_{i = 1, \ldots, k} \phi(S_i)$$
Note that $\phi_2(G) = \phi(G)$. Then Lee, Oveis-Gharan and Trevisan prove that
$$\frac{\lambda_k}{2} \leq \phi_k(G) \leq O(k^2) \sqrt{\lambda_k}$$
and
$$\phi_{0.9 k}(G) \leq O(\sqrt{\lambda_k \log k})$$
(which was also proved by Louis, Raghavendra, Tetali and Vempala). The upper bounds are algorithmic, and given $k$ orthogonal vectors all of Rayleigh quotient at most $\lambda$, there are efficient algorithms that find at least $k$ disjoint sets each of expansion at most $O(k^2 \sqrt{\lambda})$ and at least $0.9 k$ disjoint sets each of expansion at most $O(\sqrt{\lambda \log k})$.

2.3 A Cheeger-type inequality for $\lambda_n$

We proved that $\lambda_n = 2$ if and only if $G$ has a bipartite connected component. What happens when $\lambda_n$ is, say, $1.999$? We can define a bipartite version of expansion as follows:
$$\beta(G) := \min_{x \in \{-1, 0, 1\}^V} \frac{\sum_{\{u,v\} \in E} |x_u + x_v|}{\sum_v d_v |x_v|}$$
The above quantity has the following combinatorial interpretation: take a set $S$ of vertices, and a partition of $S$ into two disjoint sets $A, B$. Then define
$$\beta(S, A, B) := \frac{2E(A) + 2E(B) + E(S, V-S)}{vol(S)}$$

where $E(A)$ is the number of edges entirely contained in $A$, and $E(S, V-S)$ is the number of edges with one endpoint in $S$ and one endpoint in $V-S$. We can think of $\beta(S, A, B)$ as measuring what fraction of the edges incident on $S$ we need to delete in order to make $S$ disconnected from the rest of the graph and to make $(A, B)$ a bipartition of the subgraph induced by $S$. In other words, it measures how close $S$ is to being a bipartite connected component. Then we see that
$$\beta(G) = \min_{S \subseteq V,\ (A,B) \text{ partition of } S} \beta(S, A, B)$$
Trevisan proves that
$$\frac 12 (2 - \lambda_n) \leq \beta(G) \leq \sqrt{2 (2 - \lambda_n)}$$
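A small sketch of the combinatorial quantity just defined (illustrative NumPy code, not part of the notes; `adj` is a 0/1 adjacency matrix without self-loops, and `A_idx`, `B_idx` are disjoint integer index arrays):

```python
import numpy as np

def beta(adj, A_idx, B_idx):
    """beta(S, A, B) = (2 E(A) + 2 E(B) + E(S, V-S)) / vol(S), with S = A union B."""
    n = adj.shape[0]
    S = np.concatenate([A_idx, B_idx])
    rest = np.setdiff1d(np.arange(n), S)
    e_A = adj[np.ix_(A_idx, A_idx)].sum() / 2    # edges entirely inside A
    e_B = adj[np.ix_(B_idx, B_idx)].sum() / 2    # edges entirely inside B
    cut = adj[np.ix_(S, rest)].sum()             # edges between S and V - S
    vol_S = adj[S, :].sum()                      # sum of the degrees of vertices in S
    return (2 * e_A + 2 * e_B + cut) / vol_S
```

Minimizing this quantity over all choices of $S$ and of the bipartition $(A, B)$ gives $\beta(G)$; the vector formulation is recovered by taking $x$ equal to $+1$ on $A$, $-1$ on $B$, and $0$ elsewhere.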