Stanford University: Graph Partitioning and Expanders
Handout 3
Luca Trevisan
May 8, 2013

© 2013 by Luca Trevisan. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/3.0/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.

Lecture 3

In which we analyze the power method to approximate eigenvalues and eigenvectors, and we describe some more algorithmic applications of spectral graph theory.

1 The power method

Last week, we showed that, if $G = (V, E)$ is a $d$-regular graph, and $L$ is its normalized Laplacian matrix with eigenvalues $0 = \lambda_1 \le \lambda_2 \le \cdots \le \lambda_n$, given an eigenvector of $\lambda_2$, the algorithm SpectralPartition finds, in nearly-linear time $O(|E| + |V| \log |V|)$, a cut $(S, V - S)$ such that $\phi(S) \le 2\sqrt{\phi(G)}$.

More generally, if, instead of being given an eigenvector $x$ such that $Lx = \lambda_2 x$, we are given a vector $x \perp (1, \ldots, 1)$ such that $x^T L x \le (\lambda_2 + \epsilon) x^T x$, then the algorithm finds a cut such that $\phi(S) \le \sqrt{4\phi(G) + 2\epsilon}$. In this lecture we describe and analyze an algorithm that computes such a vector using $O\left((|V| + |E|) \cdot \frac{1}{\epsilon} \cdot \log \frac{|V|}{\epsilon}\right)$ arithmetic operations.

A symmetric matrix is positive semi-definite (abbreviated PSD) if all its eigenvalues are nonnegative. We begin by describing an algorithm that approximates the largest eigenvalue of a given symmetric PSD matrix. This might not seem to help very much, because we want to compute the second smallest, not the largest, eigenvalue. We will see, however, that the algorithm is easily modified to accomplish what we want.

1.1 The Power Method to Approximate the Largest Eigenvalue

The algorithm works as follows.
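Algorithm Power
Input: PSD matrix $M$, parameter $k$
  Pick uniformly at random $x_0 \sim \{-1, 1\}^n$
  for $i := 1$ to $k$
    $x_i := M \cdot x_{i-1}$
  return $x_k$

That is, the algorithm simply picks uniformly at random a vector $x$ with $\pm 1$ coordinates, and outputs $M^k x$.

Note that the algorithm performs $O(k \cdot (n + m))$ arithmetic operations, where $m$ is the number of non-zero entries of the matrix $M$.

Theorem 1 For every PSD matrix $M$, positive integer $k$ and parameter $\epsilon > 0$, with probability $\ge 3/16$ over the choice of $x_0$, the algorithm Power outputs a vector $x_k$ such that
$$\frac{x_k^T M x_k}{x_k^T x_k} \ge \lambda_1 \cdot (1 - \epsilon) \cdot \frac{1}{1 + 4n(1 - \epsilon)^{2k}}$$
where $\lambda_1$ is the largest eigenvalue of $M$.

Note that, in particular, we can have $k = O(\log n / \epsilon)$ and $\frac{x_k^T M x_k}{x_k^T x_k} \ge (1 - O(\epsilon)) \cdot \lambda_1$.

Let $\lambda_1 \ge \cdots \ge \lambda_n$ be the eigenvalues of $M$, with multiplicities, and $v_1, \ldots, v_n$ be a system of orthonormal eigenvectors such that $M v_i = \lambda_i v_i$. Theorem 1 is implied by the following two lemmas.

Lemma 2 Let $v \in \mathbb{R}^n$ be a vector such that $\|v\| = 1$. Sample uniformly $x \sim \{-1, 1\}^n$. Then
$$\mathbb{P}\left[ |\langle x, v \rangle| \ge \frac{1}{2} \right] \ge \frac{3}{16}.$$

Lemma 3 Let $x \in \mathbb{R}^n$ be a vector such that $|\langle x, v_1 \rangle| \ge \frac{1}{2}$. Then, for every positive integer $k$ and positive $\epsilon > 0$, if we define $y := M^k x$, we have
$$\frac{y^T M y}{y^T y} \ge \lambda_1 \cdot (1 - \epsilon) \cdot \frac{1}{1 + 4\|x\|^2 (1 - \epsilon)^{2k}}.$$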
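As an illustration of Algorithm Power and of the bound in Lemma 3, here is a minimal Python sketch using only the standard library. The helper names (`mat_vec`, `rayleigh`, `power`), the optional `x0` argument, and the small 3x3 test matrix are assumptions made for this example; they do not come from the lecture. The matrix is tridiagonal with eigenvalues $2 + \sqrt{2}$, $2$, $2 - \sqrt{2}$, so the Rayleigh quotient of the output can be compared directly with $\lambda_1$.

```python
import math
import random

def mat_vec(M, x):
    """Multiply a dense matrix (given as a list of rows) by a vector."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

def rayleigh(M, x):
    """Rayleigh quotient x^T M x / x^T x."""
    Mx = mat_vec(M, x)
    return sum(a * b for a, b in zip(x, Mx)) / sum(a * a for a in x)

def power(M, k, x0=None, rng=random):
    """Algorithm Power: return M^k x_0, with x_0 uniform in {-1, 1}^n.

    Passing x0 explicitly is an illustrative convenience, not part of the notes.
    """
    x = list(x0) if x0 is not None else [rng.choice((-1, 1)) for _ in M]
    for _ in range(k):
        x = mat_vec(M, x)
    return x

# Hypothetical 3x3 PSD example with eigenvalues 2 + sqrt(2), 2, 2 - sqrt(2).
M = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]
lam1 = 2.0 + math.sqrt(2.0)
n, k, eps = 3, 20, 0.1

x_k = power(M, k, x0=[1, 1, 1])   # here <x_0, v_1> = 1 + sqrt(2)/2 >= 1/2
lower = lam1 * (1 - eps) / (1 + 4 * n * (1 - eps) ** (2 * k))
print(rayleigh(M, x_k), ">=", lower)
```

For large $k$ or large eigenvalues the entries of $M^k x_0$ can overflow. Rescaling $x_i$ after each multiplication leaves the Rayleigh quotient unchanged, so a practical version would normalize at every step.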
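It remains to prove the two lemmas.

Proof: (Of Lemma 2) Let $v = (v_1, \ldots, v_n)$. The inner product $\langle x, v \rangle$ is the random variable
$$S := \sum_i x_i v_i.$$
Let us compute the first, second, and fourth moment of $S$.
$$\mathbb{E}\, S = 0$$
$$\mathbb{E}\, S^2 = \sum_i v_i^2 = 1$$
$$\mathbb{E}\, S^4 = 3 \left( \sum_i v_i^2 \right)^2 - 2 \sum_i v_i^4 \le 3$$

Recall that the Paley-Zygmund inequality states that if $Z$ is a non-negative random variable with finite variance, then, for every $0 \le \delta \le 1$, we have
$$\mathbb{P}[Z \ge \delta\, \mathbb{E} Z] \ge (1 - \delta)^2 \cdot \frac{(\mathbb{E} Z)^2}{\mathbb{E} Z^2} \qquad (1)$$
which follows by noting that
$$\mathbb{E} Z = \mathbb{E}[Z \cdot 1_{Z < \delta \mathbb{E} Z}] + \mathbb{E}[Z \cdot 1_{Z \ge \delta \mathbb{E} Z}],$$
that
$$\mathbb{E}[Z \cdot 1_{Z < \delta \mathbb{E} Z}] \le \delta\, \mathbb{E} Z,$$
and that
$$\mathbb{E}[Z \cdot 1_{Z \ge \delta \mathbb{E} Z}] \le \sqrt{\mathbb{E} Z^2} \cdot \sqrt{\mathbb{E}\, 1_{Z \ge \delta \mathbb{E} Z}} = \sqrt{\mathbb{E} Z^2} \cdot \sqrt{\mathbb{P}[Z \ge \delta\, \mathbb{E} Z]}.$$

We apply the Paley-Zygmund inequality to the case $Z = S^2$ and $\delta = 1/4$, and we derive
$$\mathbb{P}\left[ S^2 \ge \frac{1}{4} \right] \ge \left( \frac{3}{4} \right)^2 \cdot \frac{1}{3} = \frac{3}{16}.$$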
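The moment computations and the final probability bound in this proof can be checked exactly on a small instance by enumerating all $2^n$ sign vectors. The sketch below is only a sanity check under assumed inputs: the function name `lemma2_stats` and the unit vector proportional to $(1, 2, 3, 4)$ are hypothetical choices for the example.

```python
import itertools
import math

def lemma2_stats(v):
    """Exact P[|<x, v>| >= 1/2], E[S^2], E[S^4] for x uniform in {-1, 1}^n."""
    n = len(v)
    hits, m2, m4 = 0, 0.0, 0.0
    for signs in itertools.product((-1, 1), repeat=n):
        s = sum(si * vi for si, vi in zip(signs, v))
        hits += abs(s) >= 0.5      # counts the sign vectors with |S| >= 1/2
        m2 += s ** 2
        m4 += s ** 4
    total = 2 ** n
    return hits / total, m2 / total, m4 / total

# Hypothetical unit vector: (1, 2, 3, 4) normalized to length 1.
raw = [1.0, 2.0, 3.0, 4.0]
norm = math.sqrt(sum(r * r for r in raw))
v = [r / norm for r in raw]

prob, m2, m4 = lemma2_stats(v)
# The last printed value is the closed form 3 - 2 * sum_i v_i^4 for E[S^4].
print(prob, m2, m4, 3 - 2 * sum(vi ** 4 for vi in v))
```

The enumeration takes time $2^n$, so it is only meant for very small $n$. The lemma itself holds for every unit vector and every $n$.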
Remark 4 The proof of Lemma 2 works even if $x \sim \{-1, 1\}^n$ is selected according to a 4-wise independent distribution. This means that the algorithm can be derandomized in polynomial time.

Proof: (Of Lemma 3) Let us write $x$ as a linear combination of the eigenvectors
$$x = a_1 v_1 + \cdots + a_n v_n$$
where the coefficients can be computed as $a_i = \langle x, v_i \rangle$. Note that, by assumption, $|a_1| \ge .5$, and that, by orthonormality of the eigenvectors, $\|x\|^2 = \sum_i a_i^2$.

We have
$$y = a_1 \lambda_1^k v_1 + \cdots + a_n \lambda_n^k v_n$$
and so
$$y^T M y = \sum_i a_i^2 \lambda_i^{2k+1}$$
and
$$y^T y = \sum_i a_i^2 \lambda_i^{2k}.$$

We need to prove a lower bound to the ratio of the above two quantities. We will compute a lower bound to the numerator and an upper bound to the denominator in terms of the same parameter.

Let $\ell$ be the number of eigenvalues larger than $\lambda_1 \cdot (1 - \epsilon)$. Then, recalling that the eigenvalues are sorted in non-increasing order, we have
$$y^T M y \ge \sum_{i=1}^{\ell} a_i^2 \lambda_i^{2k+1} \ge \lambda_1 (1 - \epsilon) \sum_{i=1}^{\ell} a_i^2 \lambda_i^{2k}.$$

We also see that
$$\sum_{i=\ell+1}^{n} a_i^2 \lambda_i^{2k} \le \lambda_1^{2k} (1 - \epsilon)^{2k} \sum_{i=\ell+1}^{n} a_i^2 \le \lambda_1^{2k} (1 - \epsilon)^{2k} \|x\|^2$$
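$$\le 4 a_1^2 \lambda_1^{2k} (1 - \epsilon)^{2k} \|x\|^2 \le 4 \|x\|^2 (1 - \epsilon)^{2k} \sum_{i=1}^{\ell} a_i^2 \lambda_i^{2k},$$
where the first of these two inequalities uses $a_1^2 \ge 1/4$.

So we have
$$y^T y \le \left( 1 + 4 \|x\|^2 (1 - \epsilon)^{2k} \right) \cdot \sum_{i=1}^{\ell} a_i^2 \lambda_i^{2k}$$
giving
$$\frac{y^T M y}{y^T y} \ge \lambda_1 \cdot (1 - \epsilon) \cdot \frac{1}{1 + 4 \|x\|^2 (1 - \epsilon)^{2k}}.$$

Remark 5 Where did we use the assumption that $M$ is positive semidefinite? What happens if we apply this algorithm to the adjacency matrix of a bipartite graph?

1.2 Approximating the Second Largest Eigenvalue

Suppose now that we are interested in finding the second largest eigenvalue of a given PSD matrix $M$. If $M$ has eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$, and we know the eigenvector $v_1$ of $\lambda_1$, then $M$ is a PSD linear map from the orthogonal space to $v_1$ to itself, and $\lambda_2$ is the largest eigenvalue of this linear map. We can then run the previous algorithm on this linear map.

Algorithm Power2
Input: PSD matrix $M$, vector $v_1$, parameter $k$
  Pick uniformly at random $x \sim \{-1, 1\}^n$
  $x_0 := x - v_1 \cdot \langle x, v_1 \rangle$
  for $i := 1$ to $k$
    $x_i := M \cdot x_{i-1}$
  return $x_k$

If $v_1, \ldots, v_n$ is an orthonormal basis of eigenvectors for the eigenvalues $\lambda_1 \ge \cdots \ge \lambda_n$ of $M$, then, at the beginning, we pick a random vector
$$x = a_1 v_1 + a_2 v_2 + \cdots + a_n v_n$$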
that, with probability at least 3/16, satisfies $|a_2| \ge 1/2$. (Cf. Lemma 2.) Then we compute $x_0$, which is the projection of $x$ on the subspace orthogonal to $v_1$, that is
$$x_0 = a_2 v_2 + \cdots + a_n v_n.$$
Note that $\|x\|^2 = n$ and that $\|x_0\|^2 \le n$.

The output is the vector $x_k$
$$x_k = a_2 \lambda_2^k v_2 + \cdots + a_n \lambda_n^k v_n.$$

If we apply Lemma 3 to the subspace orthogonal to $v_1$, we see that when $|a_2| \ge 1/2$ we have that, for every $0 < \epsilon < 1$,
$$\frac{x_k^T M x_k}{x_k^T x_k} \ge \lambda_2 \cdot (1 - \epsilon) \cdot \frac{1}{1 + 4n(1 - \epsilon)^{2k}}.$$

We have thus established the following analysis.

Theorem 6 For every PSD matrix $M$, positive integer $k$ and parameter $\epsilon > 0$, if $v_1$ is a length-1 eigenvector of the largest eigenvalue of $M$, then with probability $\ge 3/16$ over the choice of $x_0$, the algorithm Power2 outputs a vector $x_k \perp v_1$ such that
$$\frac{x_k^T M x_k}{x_k^T x_k} \ge \lambda_2 \cdot (1 - \epsilon) \cdot \frac{1}{1 + 4n(1 - \epsilon)^{2k}}$$
where $\lambda_2$ is the second largest eigenvalue of $M$, counting multiplicities.

1.3 The Second Smallest Eigenvalue of the Laplacian

Finally, we come to the case in which we want to compute the second smallest eigenvalue of the normalized Laplacian matrix $L = I - \frac{1}{d} A$ of a $d$-regular graph $G = (V, E)$, where $A$ is the adjacency matrix of $G$.

Consider the matrix $M := 2I - L = I + \frac{1}{d} A$. Then if $0 = \lambda_1 \le \ldots \le \lambda_n \le 2$ are the eigenvalues of $L$, we have that
$$2 = 2 - \lambda_1 \ge 2 - \lambda_2 \ge \cdots \ge 2 - \lambda_n \ge 0$$
are the eigenvalues of $M$, and that $M$ is PSD. $M$ and $L$ have the same eigenvectors, and so $v_1 = \frac{1}{\sqrt{n}} (1, \ldots, 1)$ is a length-1 eigenvector of the largest eigenvalue of $M$.

By running algorithm Power2, we can find a vector $x$ such that
$$x^T M x \ge (1 - \epsilon) \cdot (2 - \lambda_2) \cdot x^T x$$
and
$$x^T M x = 2 x^T x - x^T L x$$
and so, rearranging, we have
$$\frac{x^T L x}{x^T x} \le \lambda_2 + 2\epsilon.$$

If we want to compute a vector whose Rayleigh quotient is, say, at most $2\lambda_2$, then the running time will be $\tilde{O}((|V| + |E|)/\lambda_2)$, because we need to set $\epsilon = \lambda_2 / 2$, which is not nearly linear in the size of the graph if $\lambda_2$ is, say, $O(1/|V|)$.

For a running time that is nearly linear in $n$ for all values of $\lambda_2$, one can, instead, apply the power method to the pseudoinverse $L^+$ of $L$. (Assuming that the graph is connected, $L^+ x$ is the unique vector $y \perp (1, \ldots, 1)$ such that $Ly = x$, if $x \perp (1, \ldots, 1)$, and $L^+ x = 0$ if $x$ is parallel to $(1, \ldots, 1)$.) This is because $L^+$ has eigenvalues $0, 1/\lambda_2, \ldots, 1/\lambda_n$, and so $L^+$ is PSD and $1/\lambda_2$ is its largest eigenvalue.

Although computing $L^+$ is not known to be doable in nearly linear time, there are nearly linear time algorithms that, given $x$, solve in $y$ the linear system $Ly = x$, and this is the same as computing the product $L^+ x$, which is enough to implement algorithm Power applied to $L^+$.

In time $O((|V| + |E|) \cdot (\log |V| / \epsilon)^{O(1)})$, we can find a vector $y$ such that $y = (L^+)^k x$, where $x$ is a random vector in $\{-1, 1\}^n$, shifted to be orthogonal to $(1, \ldots, 1)$, and $k = O(\log |V| / \epsilon)$. What is the Rayleigh quotient of such a vector with respect to $L$?

Let $v_1, \ldots, v_n$ be a basis of orthonormal eigenvectors for $L$ and $L^+$. If $0 = \lambda_1 \le \lambda_2 \le \cdots \le \lambda_n$ are the eigenvalues of $L$, then we have
$$L v_1 = L^+ v_1 = 0$$
and, for $i = 2, \ldots, n$, we have
$$L v_i = \lambda_i v_i \qquad L^+ v_i = \frac{1}{\lambda_i} v_i.$$

Write $x = a_2 v_2 + \cdots + a_n v_n$, where $\sum_i a_i^2 \le n$, and assume that, as happens with probability at least 3/16, we have $a_2^2 \ge \frac{1}{4}$. Then
$$y = \sum_{i=2}^{n} a_i \frac{1}{\lambda_i^k} v_i$$
and the Rayleigh quotient of $y$ with respect to $L$ is
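$$\frac{y^T L y}{y^T y} = \frac{\sum_i a_i^2 \frac{1}{\lambda_i^{2k-1}}}{\sum_i a_i^2 \frac{1}{\lambda_i^{2k}}}$$
and the analysis proceeds similarly to the analysis of the previous section. If we let $\ell$ be the index such that $\lambda_\ell \le (1 + \epsilon) \cdot \lambda_2 \le \lambda_{\ell+1}$ then we can upper bound the numerator as
$$\sum_i a_i^2 \frac{1}{\lambda_i^{2k-1}} \le \sum_{i \le \ell} a_i^2 \frac{1}{\lambda_i^{2k-1}} + \frac{1}{(1 + \epsilon)^{2k-1} \lambda_2^{2k-1}} \sum_{i > \ell} a_i^2$$
$$\le \sum_{i \le \ell} a_i^2 \frac{1}{\lambda_i^{2k-1}} + \frac{1}{(1 + \epsilon)^{2k-1} \lambda_2^{2k-1}} \cdot n$$
$$\le \sum_{i \le \ell} a_i^2 \frac{1}{\lambda_i^{2k-1}} + \frac{1}{(1 + \epsilon)^{2k-1} \lambda_2^{2k-1}} \cdot 4n a_2^2$$
$$\le \left( 1 + \frac{4n}{(1 + \epsilon)^{2k-1}} \right) \cdot \sum_{i \le \ell} a_i^2 \frac{1}{\lambda_i^{2k-1}}$$
and we can lower bound the denominator as
$$\sum_i a_i^2 \frac{1}{\lambda_i^{2k}} \ge \sum_{i \le \ell} a_i^2 \frac{1}{\lambda_i^{2k}} \ge \frac{1}{(1 + \epsilon) \lambda_2} \cdot \sum_{i \le \ell} a_i^2 \frac{1}{\lambda_i^{2k-1}}$$
and the Rayleigh quotient is
$$\frac{y^T L y}{y^T y} \le \lambda_2 \cdot (1 + \epsilon) \cdot \left( 1 + \frac{4n}{(1 + \epsilon)^{2k-1}} \right) \le (1 + 2\epsilon) \cdot \lambda_2$$
when $k = O\left( \frac{1}{\epsilon} \log \frac{n}{\epsilon} \right)$.

An $O((|V| + |E|) \cdot (\log |V|)^{O(1)})$ algorithm to solve in $y$ the linear system $Ly = x$ was first developed by Spielman and Teng. Faster algorithms (with a lower exponent in the $(\log |V|)^{O(1)}$ part of the running time, and smaller constants) were later developed by Koutis, Miller and Peng, and, very recently, by Kelner, Orecchia, Sidford, and Zhu.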
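To make the idea of running the power method on $L^+$ concrete, the following Python sketch applies it to the $n$-cycle, a 2-regular graph whose normalized Laplacian has eigenvalues $1 - \cos(2\pi j / n)$. Everything named here is an assumption for illustration: the functions `apply_L`, `center`, `solve_L`, `inverse_power`, and the fixed starting vector are hypothetical. In particular, `solve_L` is a plain conjugate gradient routine used as a simple stand-in for the nearly-linear-time Laplacian solvers cited above, not an implementation of any of them.

```python
import math

def apply_L(x):
    """Normalized Laplacian of the n-cycle (d = 2): (Lx)_i = x_i - (x_{i-1} + x_{i+1}) / 2."""
    n = len(x)
    return [x[i] - 0.5 * (x[i - 1] + x[(i + 1) % n]) for i in range(n)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def center(x):
    """Project x onto the subspace orthogonal to (1, ..., 1)."""
    mean = sum(x) / len(x)
    return [a - mean for a in x]

def solve_L(b, tol=1e-10, max_iter=200):
    """Conjugate gradient for L y = b, assuming b is orthogonal to (1, ..., 1)."""
    y = [0.0] * len(b)
    r, p = list(b), list(b)
    rs = dot(r, r)
    target = (tol ** 2) * rs          # stop at a relative residual norm of tol
    for _ in range(max_iter):
        if rs <= target:
            break
        Lp = apply_L(p)
        alpha = rs / dot(p, Lp)
        y = [yi + alpha * pi for yi, pi in zip(y, p)]
        r = [ri - alpha * li for ri, li in zip(r, Lp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return center(y)                  # re-center to remove rounding drift

def inverse_power(x, k):
    """Approximate (L^+)^k x by k successive solves, for x shifted to be orthogonal to (1, ..., 1)."""
    y = center(x)
    for _ in range(k):
        y = solve_L(y)
    return y

def rayleigh_L(y):
    return dot(y, apply_L(y)) / dot(y, y)

n, k, eps = 8, 10, 0.1
x = [1, 1, 1, 1, -1, -1, -1, -1]      # fixed sign vector that already sums to zero
y = inverse_power(x, k)
lam2 = 1 - math.cos(2 * math.pi / n)  # second smallest eigenvalue of L for the n-cycle
print(rayleigh_L(y), "vs", lam2)
```

Conjugate gradient needs a number of iterations that depends on the spectrum of $L$, so this sketch does not achieve the nearly-linear running time discussed in the text. It only illustrates that access to a solver for $Ly = x$ suffices to run the power method on $L^+$.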
2 Other applications of spectral graph theory

2.1 Spectral graph theory in irregular graphs

Let $G = (V, E)$ be an undirected graph in which every vertex has positive degree and $A$ be the adjacency matrix of $G$. We want to define a Laplacian matrix $L$ and a Rayleigh quotient such that the $k$-th eigenvalue of $L$ is the minimum over all $k$-dimensional spaces of the maximum Rayleigh quotient in the space, and we want the conductance of a set to be the same as the Rayleigh quotient of the indicator vector of the set. All the facts that we have proved in the regular case essentially reduce to these two properties of the Laplacian and the Rayleigh quotient.

Let $d_v$ be the degree of vertex $v$ in $G$. We define the Rayleigh quotient of a vector $x \in \mathbb{R}^V$ as
$$R_G(x) := \frac{\sum_{\{u,v\} \in E} |x_u - x_v|^2}{\sum_v d_v x_v^2}.$$

Let $D$ be the diagonal matrix of degrees such that $D_{u,v} = 0$ if $u \ne v$ and $D_{v,v} = d_v$. Then define the Laplacian of $G$ as
$$L_G := I - D^{-1/2} A D^{-1/2}.$$
Note that in a $d$-regular graph we have $D = dI$ and $L_G = I - \frac{1}{d} A$, which is the standard definition.

Since $L = L_G$ is a real symmetric matrix, the $k$-th smallest eigenvalue of $L$ is
$$\lambda_k = \min_{k\text{-dimensional } S} \; \max_{x \in S} \frac{x^T L x}{x^T x}.$$

Now let us do the change of variable $y \leftarrow D^{-1/2} x$. We have
$$\lambda_k = \min_{k\text{-dimensional } S} \; \max_{y \in S} \frac{y^T D^{1/2} L D^{1/2} y}{y^T D y}.$$

In the denominator, $y^T D y = \sum_v d_v y_v^2$, and in the numerator a simple calculation shows
$$y^T D^{1/2} L D^{1/2} y = y^T (D - A) y = \sum_{\{u,v\} \in E} |y_v - y_u|^2$$
so indeed
$$\lambda_k = \min_{k\text{-dimensional } S} \; \max_{y \in S} R_G(y).$$

For two vectors $y, z$, define the inner product
$$\langle y, z \rangle_G := \sum_v d_v y_v z_v.$$
Then we can prove that
$$\lambda_2 = \min_{y : \langle y, (1, \ldots, 1) \rangle_G = 0} R_G(y).$$

With these definitions and observations in place, it is now possible to repeat the proof of Cheeger's inequality step by step (replacing the condition $\sum_v x_v = 0$ with $\sum_v d_v x_v = 0$, adjusting the definition of Rayleigh quotient, etc.) and prove that if $\lambda_2$ is the second smallest eigenvalue of the Laplacian of an irregular graph $G$, and $\phi(G)$ is the conductance of $G$, then
$$\frac{\lambda_2}{2} \le \phi(G) \le \sqrt{2 \lambda_2}.$$

2.2 Higher-order Cheeger inequality

The Cheeger inequality gives a robust version of the fact that $\lambda_2 = 0$ if and only if $G$ is disconnected. It is possible to also give a robust version of the fact that $\lambda_k = 0$ if and only if $G$ has at least $k$ connected components. We will restrict the discussion to regular graphs.

For a size parameter $s \le |V|/2$, denote the size-$s$ small-set expansion of a graph
$$SSE_s(G) := \min_{S \subseteq V : |S| \le s} \phi(S)$$
so that $SSE_{n/2}(G) = \phi(G)$. This is an interesting optimization problem, because in many settings in which non-expanding sets correspond to clusters, it is more interesting to find small non-expanding sets (and, possibly, remove them and iterate to find more) than to find large ones. It has been studied very intensely in the past five years because of its connection with the Unique Games Conjecture, which is in turn one of the key open problems in complexity theory.

If $\lambda_k = 0$, then we know that there are at least $k$ connected components, and, in particular, there is a set $S \subseteq V$ such that $\phi(S) = 0$ and $|S| \le \frac{n}{k}$, meaning that $SSE_{n/k} = 0$. By analogy with the Cheeger inequality, we may look for a robust version of this fact,
of the form $SSE_{n/k} \le O(\sqrt{\lambda_k})$. Unfortunately there are counterexamples, but Arora, Barak and Steurer have proved that, for every $\delta$,
$$SSE_{\frac{n^{1+\delta}}{k}} \le O\left( \sqrt{\frac{\lambda_k}{\delta}} \right).$$

To formulate a higher-order version of the Cheeger inequality, we need to define a quantity that generalizes expansion in a different way. For an integer parameter $k$, define order-$k$ expansion as
$$\phi_k(G) = \min_{S_1, \ldots, S_k \subseteq V \text{ disjoint}} \; \max_{i = 1, \ldots, k} \phi(S_i).$$
Note that $\phi_2(G) = \phi(G)$. Then Lee, Oveis-Gharan and Trevisan prove that
$$\frac{\lambda_k}{2} \le \phi_k(G) \le O(k^2) \cdot \sqrt{\lambda_k}$$
and
$$\phi_{.9 \cdot k}(G) \le O\left( \sqrt{\lambda_k \cdot \log k} \right)$$
(which was also proved by Louis, Raghavendra, Tetali and Vempala). The upper bounds are algorithmic, and given $k$ orthogonal vectors all of Rayleigh quotient at most $\lambda$, there are efficient algorithms that find at least $k$ disjoint sets each of expansion at most $O(k^2 \sqrt{\lambda})$ and at least $.9 \cdot k$ disjoint sets each of expansion at most $O(\sqrt{\lambda \log k})$.

2.3 A Cheeger-type inequality for $\lambda_n$

We proved that $\lambda_n = 2$ if and only if $G$ has a bipartite connected component. What happens when $\lambda_n$ is, say, 1.999?

We can define a bipartite version of expansion as follows:
$$\beta(G) := \min_{x \in \{-1, 0, 1\}^V,\; x \ne 0} \frac{\sum_{\{u,v\} \in E} |x_u + x_v|}{\sum_v d_v |x_v|}.$$

The above quantity has the following combinatorial interpretation: take a set $S$ of vertices, and a partition of $S$ into two disjoint sets $A, B$. Then define
$$\beta(S, A, B) := \frac{2E(A) + 2E(B) + E(S, V - S)}{vol(S)}$$
where $E(A)$ is the number of edges entirely contained in $A$, and $E(S, V - S)$ is the number of edges with one endpoint in $S$ and one endpoint in $V - S$. We can think of $\beta(S, A, B)$ as measuring what fraction of the edges incident on $S$ we need to delete in order to make $S$ disconnected from the rest of the graph and $A, B$ be a bipartition of the subgraph induced by $S$. In other words, it measures how close $S$ is to being a bipartite connected component. Then we see that
$$\beta(G) = \min_{S \subseteq V,\; A, B \text{ partition of } S} \beta(S, A, B).$$

Trevisan proves that
$$\frac{1}{2} \cdot (2 - \lambda_n) \le \beta(G) \le \sqrt{2 \cdot (2 - \lambda_n)}.$$
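The two formulations of $\beta$ and the final inequality can be checked by brute force on a tiny graph. The Python sketch below uses the 5-cycle, which is 2-regular and not bipartite, and whose normalized Laplacian has largest eigenvalue $1 - \cos(4\pi/5)$. The function names `beta_of_vector`, `beta_sab`, `beta_graph` and the choice of graph are assumptions made for this example. The search enumerates all $3^n$ vectors, so it is only meant for very small graphs.

```python
import itertools
import math

def beta_of_vector(edges, deg, x):
    """Ratio sum_{uv in E} |x_u + x_v| / sum_v d_v |x_v| for a nonzero x."""
    num = sum(abs(x[u] + x[v]) for u, v in edges)
    den = sum(d * abs(xv) for d, xv in zip(deg, x))
    return num / den

def beta_sab(edges, deg, A, B):
    """Combinatorial form (2E(A) + 2E(B) + E(S, V - S)) / vol(S), with S = A union B."""
    S = A | B
    inside_A = sum(1 for u, v in edges if u in A and v in A)
    inside_B = sum(1 for u, v in edges if u in B and v in B)
    crossing = sum(1 for u, v in edges if (u in S) != (v in S))
    return (2 * inside_A + 2 * inside_B + crossing) / sum(deg[v] for v in S)

def beta_graph(edges, deg):
    """Brute-force beta(G): minimize over all nonzero x in {-1, 0, 1}^V."""
    n = len(deg)
    return min(beta_of_vector(edges, deg, x)
               for x in itertools.product((-1, 0, 1), repeat=n) if any(x))

# Hypothetical example: the 5-cycle, a 2-regular non-bipartite graph.
n = 5
edges = [(i, (i + 1) % n) for i in range(n)]
deg = [2] * n
lam_n = 1 - math.cos(4 * math.pi / 5)    # largest eigenvalue of L for the 5-cycle

beta = beta_graph(edges, deg)
print(0.5 * (2 - lam_n), "<=", beta, "<=", math.sqrt(2 * (2 - lam_n)))
```

Taking $x = +1$ on $A$, $-1$ on $B$ and $0$ elsewhere shows why the two forms agree: an edge inside $A$ or inside $B$ contributes 2 to the numerator, an edge leaving $S$ contributes 1, and an edge between $A$ and $B$ contributes 0.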