U.C. Berkeley CS294: Spectral Methods and Expanders
Handout 8
Luca Trevisan
February 17, 2016

Lecture 8: Spectral Algorithms Wrap-up

In which we talk about even more generalizations of Cheeger's inequalities, and we analyze the power method to find approximate eigenvectors, thus having a complete description of a polynomial-time approximation algorithm for sparsest cut.

1 Irregular Graphs

For simplicity, we proved our results on $\lambda_2$ and $\lambda_k$ for regular graphs. Those results extend, essentially with the same proofs, to the case of irregular undirected graphs. In an irregular graph $G = (V, E)$, the notion that generalizes edge expansion is called conductance. If $d_v$ is the degree of vertex $v$, then the conductance of a set $S$ of vertices is

\[ \phi(S) := \frac{E(S, V - S)}{\sum_{v \in S} d_v} \]

We will call the sum of the degrees of a set of vertices the volume of the set, and denote it $vol(S) := \sum_{v \in S} d_v$. The conductance of the graph $G$ is

\[ \phi(G) := \min_{S : vol(S) \le \frac{1}{2} vol(V)} \phi(S) \]

Higher-order conductance is defined as higher-order expansion, but with conductance replacing expansion in the definition.

The Cheeger inequalities

\[ \frac{\lambda_2}{2} \le \phi(G) \le \sqrt{2 \lambda_2} \]

still hold, with the same proof. With some abuse of notation, we will call the following quantity the Rayleigh quotient of $x$

\[ R_L(x) := \frac{\sum_{\{u,v\} \in E} (x_u - x_v)^2}{\sum_{v \in V} d_v x_v^2} \]
even if, technically, it is the Rayleigh quotient of $D^{1/2} x$, where $D$ is the diagonal matrix of degrees.

We can also adapt the proof of the higher-order Cheeger inequality to show

\[ \frac{\lambda_k}{2} \le \phi_k(G) \le O(k^{3.5}) \cdot \sqrt{\lambda_k} \]

2 More Cheeger-type Bounds

We proved that if $(S_F, V - S_F)$ is the cut found by Fiedler's algorithm using the eigenvector of $\lambda_2$, then

\[ \phi(S_F, V - S_F) \le 2 \sqrt{\phi(G)} \]

which is a good bound, although it usually underestimates the quality of the solutions found in practice. (There are graphs, however, for which the above inequality is tight within a constant factor.)

One case in which we can improve the analysis is when there are not too many eigenvalues close to $\lambda_2$.

Theorem 1 There is a constant $c$ such that, if $(S_F, V - S_F)$ is the cut obtained by Fiedler's algorithm using an eigenvector for $\lambda_2$, then, for every $k \ge 2$,

\[ \phi(S_F, V - S_F) \le c \cdot k \cdot \frac{\lambda_2}{\sqrt{\lambda_k}} \]

So we have

\[ \phi(S_F, V - S_F) \le 2c \cdot \phi(G) \cdot \min_{k \ge 2} \frac{k}{\sqrt{\lambda_k}} \]

which is a better bound for families of graphs in which, for some $k$, $\lambda_k \gg k^2 \lambda_2$.

We will not have time to prove Theorem 1, but we will state the two main pieces of its proof.

Lemma 2 Let $x \in \mathbb{R}^V_{\ge 0}$ be a non-negative vector. Then, for every $k$, there is a non-negative vector $y \in \mathbb{R}^V_{\ge 0}$ whose entries take at most $2k$ distinct values and such that

\[ \| x - y \|^2 \le \frac{R_L(x)}{\lambda_k} \cdot \| x \|^2 \]
That is, if $R_L(x) \ll \lambda_k$, then there are $2k$ values such that most entries of $x$ are close to one of those $2k$ values.

Lemma 3 There is a constant $c$ such that, for every pair of non-negative vectors $x \in \mathbb{R}^V_{\ge 0}$ and $y \in \mathbb{R}^V_{\ge 0}$, if the entries of $y$ contain only $k$ distinct values, then there is a threshold $t > 0$ such that

\[ \phi(\{ v : x_v \ge t \}) \le c \cdot k \cdot \left( R_L(x) + \sqrt{R_L(x)} \cdot \frac{\| x - y \|}{\| x \|} \right) \]

The above lemma should be compared to the fact, which was a major piece in the proof of Cheeger's inequality, that if $x \in \mathbb{R}^V_{\ge 0}$ is an arbitrary non-negative vector, then there is a threshold $t > 0$ such that

\[ \phi(\{ v : x_v \ge t \}) \le \sqrt{2 R_L(x)} \]

One obtains Theorem 1 in the following way. Start from an eigenvector $x$ of $\lambda_2$ and, using the first step in the proof of Cheeger's inequality, obtain a vector $x' \in \mathbb{R}^V_{\ge 0}$ with non-negative entries such that $R_L(x') \le R_L(x) = \lambda_2$ and such that the support of $x'$ contains at most $|V|/2$ vertices.

Use Lemma 2 to find a vector $y$ with non-negative entries and with at most $2k$ distinct values among its entries such that $\| x' - y \|^2 \le \frac{\lambda_2}{\lambda_k} \| x' \|^2$. Then use Lemma 3 and the fact that $\lambda_k \le 2$ to conclude that there exists a $t > 0$ such that

\[ \phi(\{ v : x'_v \ge t \}) \le O(k) \cdot \frac{\lambda_2}{\sqrt{\lambda_k}} \]

The set $\{ v : x'_v \ge t \}$ contains at most $|V|/2$ vertices, and it is one of the cuts considered by Fiedler's algorithm on input $x$.

Another property of graphs in which $\lambda_k$ is large for small $k$ is that they contain large expanders as induced subgraphs.

Theorem 4 There is a constant $c$ such that, for every graph $G$ and every $k$, there exists a partition of the vertices into $\ell \le k$ sets $(S_1, \ldots, S_\ell)$ such that, if we call $G_i$ the subgraph induced by the vertex set $S_i$, we have

\[ \phi_{G_i} \ge c \cdot \frac{\lambda_k}{k^2} \]

Theorem 5 If $\phi_{k+1} > (1 + \epsilon) \phi_k$, then there is a partition of the vertices into $k$ subsets $(S_1, \ldots, S_k)$ such that

\[ \forall i \in \{1, \ldots, k\} : \quad \phi_{G_i} \ge \Omega\left( \frac{\epsilon}{k} \right) \cdot \phi_{k+1}, \qquad \phi(S_i) \le k \phi_k \]
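To make the definitions of Section 1 and the threshold-cut fact quoted above concrete, here is a minimal Python sketch on a small irregular graph. The example graph, the vector `x`, and all function names are assumptions made for illustration; the brute-force computation of $\phi(G)$ is exponential in $|V|$ and is only meant for tiny instances.

```python
from itertools import combinations
from math import sqrt

# Hypothetical example: a triangle {0, 1, 2} with a path 2-3-4 attached.
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
N = 5

def degrees(edges, n):
    """Degree d_v of every vertex."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def conductance(S, edges, deg):
    """phi(S) = E(S, V - S) / vol(S)."""
    S = set(S)
    cut = sum(1 for u, v in edges if (u in S) != (v in S))
    return cut / sum(deg[v] for v in S)

def graph_conductance(edges, n):
    """Brute-force phi(G): minimize phi(S) over nonempty S with vol(S) <= vol(V)/2."""
    deg = degrees(edges, n)
    half = sum(deg) / 2
    candidates = (S for size in range(1, n)
                  for S in combinations(range(n), size)
                  if sum(deg[v] for v in S) <= half)
    return min(conductance(S, edges, deg) for S in candidates)

def rayleigh_quotient(x, edges, deg):
    """R_L(x): sum over edges of (x_u - x_v)^2, divided by sum_v d_v x_v^2."""
    num = sum((x[u] - x[v]) ** 2 for u, v in edges)
    den = sum(d * xv * xv for d, xv in zip(deg, x))
    return num / den

def best_threshold_cut(x, edges, deg):
    """Smallest conductance among the sets {v : x_v >= t} over thresholds t > 0."""
    thresholds = sorted({xv for xv in x if xv > 0})
    return min(conductance([v for v, xv in enumerate(x) if xv >= t], edges, deg)
               for t in thresholds)

deg = degrees(EDGES, N)
x = [0.0, 0.0, 0.2, 0.7, 1.0]  # an arbitrary non-negative vector
print("phi(G)             =", graph_conductance(EDGES, N))
print("R_L(x)             =", rayleigh_quotient(x, EDGES, deg))
print("best threshold cut =", best_threshold_cut(x, EDGES, deg))
print("sqrt(2 R_L(x))     =", sqrt(2 * rayleigh_quotient(x, EDGES, deg)))
```

One design point worth noting: for the 0/1 indicator vector of a set $S$, `rayleigh_quotient` returns exactly `conductance(S, ...)`, since the numerator counts the cut edges and the denominator is $vol(S)$.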
3 The Power Method

Earlier in this class, we showed that, if $G = (V, E)$ is a $d$-regular graph, and $L$ is its normalized Laplacian matrix with eigenvalues $0 = \lambda_1 \le \lambda_2 \le \ldots \le \lambda_n$, given an eigenvector of $\lambda_2$, Fiedler's algorithm finds, in nearly-linear time $O(|E| + |V| \log |V|)$, a cut $(S, V - S)$ such that $\phi(S) \le 2 \sqrt{\phi(G)}$.

More generally, if, instead of being given an eigenvector $x$ such that $Lx = \lambda_2 x$, we are given a vector $x \perp (1, \ldots, 1)$ such that $x^T L x \le (\lambda_2 + \epsilon) x^T x$, then the algorithm finds a cut such that $\phi(S) \le \sqrt{4 \phi(G) + 2 \epsilon}$. We will now see how to compute such a vector using

\[ O\left( (|V| + |E|) \cdot \frac{1}{\epsilon} \cdot \log \frac{|V|}{\epsilon} \right) \]

arithmetic operations.

A symmetric matrix is positive semi-definite (abbreviated PSD) if all its eigenvalues are nonnegative. We begin by describing an algorithm that approximates the largest eigenvalue of a given symmetric PSD matrix. This might not seem to help very much, because we want to compute the second smallest, not the largest, eigenvalue. We will see, however, that the algorithm is easily modified to accomplish what we want.

3.1 The Power Method to Approximate the Largest Eigenvalue

The algorithm works as follows.

Algorithm Power
Input: PSD matrix $M$, parameter $k$
    Pick uniformly at random $x_0 \sim \{-1, 1\}^n$
    for $i := 1$ to $k$
        $x_i := M \cdot x_{i-1}$
    return $x_k$

That is, the algorithm simply picks uniformly at random a vector $x$ with $\pm 1$ coordinates, and outputs $M^k x$.

Note that the algorithm performs $O(k \cdot (n + m))$ arithmetic operations, where $m$ is the number of non-zero entries of the matrix $M$.

Theorem 6 For every PSD matrix $M$, positive integer $k$ and parameter $\epsilon > 0$, with probability $\ge 3/16$ over the choice of $x_0$, the algorithm Power outputs a vector $x_k$ such that
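\[ \frac{x_k^T M x_k}{x_k^T x_k} \ge \lambda_1 \cdot (1 - \epsilon) \cdot \frac{1}{1 + 4n (1 - \epsilon)^{2k}} \]

where $\lambda_1$ is the largest eigenvalue of $M$.

Note that, in particular, we can have $k = O(\log n / \epsilon)$ and $\frac{x_k^T M x_k}{x_k^T x_k} \ge (1 - O(\epsilon)) \cdot \lambda_1$.

Let $\lambda_1 \ge \cdots \ge \lambda_n$ be the eigenvalues of $M$, with multiplicities, and $v_1, \ldots, v_n$ be a system of orthonormal eigenvectors such that $M v_i = \lambda_i v_i$. Theorem 6 is implied by the following two lemmas.

Lemma 7 Let $v \in \mathbb{R}^n$ be a vector such that $\| v \| = 1$. Sample uniformly $x \sim \{-1, 1\}^n$. Then

\[ \mathbb{P}\left[ |\langle x, v \rangle| \ge \frac{1}{2} \right] \ge \frac{3}{16} \]

Lemma 8 For every $x \in \mathbb{R}^n$, for every positive integer $k$ and positive $\epsilon > 0$, if we define $y := M^k x$, we have

\[ \frac{y^T M y}{y^T y} \ge \lambda_1 \cdot (1 - \epsilon) \cdot \left( 1 + \frac{\| x \|^2}{\langle x, v_1 \rangle^2} (1 - \epsilon)^{2k} \right)^{-1} \]

It remains to prove the two lemmas.

Proof: (Of Lemma 7) Let $v = (v_1, \ldots, v_n)$. The inner product $\langle x, v \rangle$ is the random variable

\[ S := \sum_i x_i v_i \]

Let us compute the first, second, and fourth moment of $S$.

\[ \mathbb{E}\, S = 0 \]

\[ \mathbb{E}\, S^2 = \sum_i v_i^2 = 1 \]

\[ \mathbb{E}\, S^4 = 3 \left( \sum_i v_i^2 \right)^2 - 2 \sum_i v_i^4 \le 3 \]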
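As a sanity check, the three moments of $S$ can be computed exactly for a small $n$ by enumerating all $2^n$ sign vectors. The sketch below does this in Python; the test vector `raw` and the function name `sign_sum_statistics` are assumptions chosen for illustration. It also reports the exact value of $\mathbb{P}[|\langle x, v \rangle| \ge 1/2]$ for that vector, which can be compared with the $3/16$ of Lemma 7.

```python
from itertools import product
from math import sqrt

def sign_sum_statistics(v):
    """Exact E[S], E[S^2], E[S^4] and P[|S| >= 1/2] for S = <x, v>,
    with x uniform in {-1, 1}^n, by enumerating all 2^n sign vectors."""
    n = len(v)
    m1 = m2 = m4 = 0.0
    hits = 0
    for signs in product((-1, 1), repeat=n):
        s = sum(x * vi for x, vi in zip(signs, v))
        m1 += s
        m2 += s ** 2
        m4 += s ** 4
        hits += abs(s) >= 0.5
    total = 2 ** n
    return m1 / total, m2 / total, m4 / total, hits / total

# Hypothetical unit vector, used only for this check.
raw = [3.0, 1.0, 2.0, 0.5, 1.0, 4.0]
norm = sqrt(sum(r * r for r in raw))
v = [r / norm for r in raw]

m1, m2, m4, prob = sign_sum_statistics(v)
print("E S   =", m1)
print("E S^2 =", m2)
print("E S^4 =", m4, "  closed form:", 3 - 2 * sum(vi ** 4 for vi in v))
print("P[|S| >= 1/2] =", prob, "  Lemma 7 lower bound:", 3 / 16)
```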
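Recall that the Paley-Zygmund inequality states that if $Z$ is a non-negative random variable with finite variance, then, for every $0 \le \delta \le 1$, we have

\[ \mathbb{P}[Z \ge \delta\, \mathbb{E} Z] \ge (1 - \delta)^2 \cdot \frac{(\mathbb{E} Z)^2}{\mathbb{E} Z^2} \qquad (1) \]

which follows by noting that

\[ \mathbb{E} Z = \mathbb{E}[Z \cdot 1_{Z < \delta \mathbb{E} Z}] + \mathbb{E}[Z \cdot 1_{Z \ge \delta \mathbb{E} Z}], \]

that

\[ \mathbb{E}[Z \cdot 1_{Z < \delta \mathbb{E} Z}] \le \delta\, \mathbb{E} Z, \]

and that

\[ \mathbb{E}[Z \cdot 1_{Z \ge \delta \mathbb{E} Z}] \le \sqrt{\mathbb{E} Z^2} \cdot \sqrt{\mathbb{E}\, 1_{Z \ge \delta \mathbb{E} Z}} = \sqrt{\mathbb{E} Z^2} \cdot \sqrt{\mathbb{P}[Z \ge \delta\, \mathbb{E} Z]} \]

We apply the Paley-Zygmund inequality to the case $Z = S^2$ and $\delta = 1/4$, and we derive

\[ \mathbb{P}\left[ S^2 \ge \frac{1}{4} \right] \ge \left( \frac{3}{4} \right)^2 \cdot \frac{1}{3} = \frac{3}{16} \]

Remark 9 The proof of Lemma 7 works even if $x \sim \{-1, 1\}^n$ is selected according to a 4-wise independent distribution. This means that the algorithm can be derandomized in polynomial time.

Proof: (Of Lemma 8) Let us write $x$ as a linear combination of the eigenvectors

\[ x = a_1 v_1 + \cdots + a_n v_n \]

where the coefficients can be computed as $a_i = \langle x, v_i \rangle$. We have

\[ y = a_1 \lambda_1^k v_1 + \cdots + a_n \lambda_n^k v_n \]

and so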
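\[ y^T M y = \sum_i a_i^2 \lambda_i^{2k+1} \]

and

\[ y^T y = \sum_i a_i^2 \lambda_i^{2k} \]

We need to prove a lower bound to the ratio of the above two quantities. We will compute a lower bound to the numerator and an upper bound to the denominator in terms of the same parameter.

Let $\ell$ be the number of eigenvalues larger than $\lambda_1 \cdot (1 - \epsilon)$. Then, recalling that the eigenvalues are sorted in non-increasing order, we have

\[ y^T M y \ge \sum_{i=1}^{\ell} a_i^2 \lambda_i^{2k+1} \ge \lambda_1 (1 - \epsilon) \sum_{i=1}^{\ell} a_i^2 \lambda_i^{2k} \]

We also see that

\[
\begin{aligned}
\sum_{i=\ell+1}^{n} a_i^2 \lambda_i^{2k}
&\le \lambda_1^{2k} \cdot (1 - \epsilon)^{2k} \sum_{i=\ell+1}^{n} a_i^2 \\
&\le \lambda_1^{2k} \cdot (1 - \epsilon)^{2k} \cdot \| x \|^2 \\
&= a_1^2 \lambda_1^{2k} \cdot (1 - \epsilon)^{2k} \cdot \frac{\| x \|^2}{a_1^2} \\
&\le \frac{\| x \|^2}{a_1^2} (1 - \epsilon)^{2k} \sum_{i=1}^{\ell} a_i^2 \lambda_i^{2k}
\end{aligned}
\]

So we have

\[ y^T y \le \left( 1 + \frac{\| x \|^2 (1 - \epsilon)^{2k}}{a_1^2} \right) \cdot \sum_{i=1}^{\ell} a_i^2 \lambda_i^{2k} \]

giving

\[ \frac{y^T M y}{y^T y} \ge \lambda_1 \cdot (1 - \epsilon) \cdot \left( 1 + \frac{\| x \|^2 (1 - \epsilon)^{2k}}{a_1^2} \right)^{-1} \]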
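The algorithm Power and the guarantee of Lemma 8 can be illustrated with a short Python sketch. The test matrix $M = I + A/2$ of a 9-cycle (for which $\lambda_1 = 2$ and $v_1 = (1, \ldots, 1)/3$), the seed, and all function names are assumptions made for this illustration. For brevity the sketch stores $M$ densely, so each product costs $O(n^2)$ rather than the $O(n + m)$ of a sparse representation.

```python
import random

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def matvec(M, x):
    """Dense matrix-vector product M x."""
    return [dot(row, x) for row in M]

def rayleigh(M, x):
    """Rayleigh quotient x^T M x / x^T x."""
    return dot(x, matvec(M, x)) / dot(x, x)

def power(M, k, rng):
    """Algorithm Power: returns the random start x_0 and x_k = M^k x_0."""
    x0 = [rng.choice((-1, 1)) for _ in range(len(M))]
    x = x0
    for _ in range(k):
        x = matvec(M, x)
    return x0, x

def lemma8_bound(lam1, v1, x, k, eps):
    """Right-hand side of Lemma 8; requires <x, v1> != 0."""
    a1_sq = dot(x, v1) ** 2
    return lam1 * (1 - eps) / (1 + dot(x, x) / a1_sq * (1 - eps) ** (2 * k))

def cycle_matrix(n):
    """M = I + A/2 for the n-cycle, a PSD matrix (hypothetical test instance)."""
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        M[i][i] = 1.0
        M[i][(i + 1) % n] += 0.5
        M[i][(i - 1) % n] += 0.5
    return M

n, k = 9, 50                      # n odd, so <x_0, v_1> is never zero
M = cycle_matrix(n)
v1 = [1 / 3] * n                  # unit eigenvector of lambda_1 = 2
x0, xk = power(M, k, random.Random(0))
print("Rayleigh quotient of x_k:", rayleigh(M, xk))
for eps in (0.5, 0.1, 0.01):
    print("eps =", eps, " Lemma 8 lower bound:", lemma8_bound(2.0, v1, x0, k, eps))
```

Lemma 8 holds for every $\epsilon > 0$ simultaneously, which is why the sketch can compare a single output $x_k$ against the bound for several values of `eps`.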
Remark 10 Where did we use the assumption that $M$ is positive semidefinite? What happens if we apply this algorithm to the adjacency matrix of a bipartite graph?

3.2 Approximating the Second Largest Eigenvalue

Suppose now that we are interested in finding the second largest eigenvalue of a given PSD matrix $M$. If $M$ has eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$, and we know the eigenvector $v_1$ of $\lambda_1$, then $M$ is a PSD linear map from the orthogonal space to $v_1$ to itself, and $\lambda_2$ is the largest eigenvalue of this linear map. We can then run the previous algorithm on this linear map.

Algorithm Power2
Input: PSD matrix $M$, vector $v_1$, parameter $k$
    Pick uniformly at random $x \sim \{-1, 1\}^n$
    $x_0 := x - v_1 \cdot \langle x, v_1 \rangle$
    for $i := 1$ to $k$
        $x_i := M \cdot x_{i-1}$
    return $x_k$

If $v_1, \ldots, v_n$ is an orthonormal basis of eigenvectors for the eigenvalues $\lambda_1 \ge \cdots \ge \lambda_n$ of $M$, then, at the beginning, we pick a random vector

\[ x = a_1 v_1 + a_2 v_2 + \cdots + a_n v_n \]

that, with probability at least $3/16$, satisfies $|a_2| \ge 1/2$. (Cf. Lemma 7.) Then we compute $x_0$, which is the projection of $x$ on the subspace orthogonal to $v_1$, that is

\[ x_0 = a_2 v_2 + \cdots + a_n v_n \]

Note that $\| x \|^2 = n$ and that $\| x_0 \|^2 \le n$. The output is the vector $x_k$

\[ x_k = a_2 \lambda_2^k v_2 + \cdots + a_n \lambda_n^k v_n \]

If we apply Lemma 8 to the subspace orthogonal to $v_1$, we see that when $|a_2| \ge 1/2$ we have that, for every $0 < \epsilon < 1$,

\[ \frac{x_k^T M x_k}{x_k^T x_k} \ge \lambda_2 \cdot (1 - \epsilon) \cdot \frac{1}{1 + 4n (1 - \epsilon)^{2k}} \]

We have thus established the following analysis.
Theorem 11 For every PSD matrix $M$, positive integer $k$ and parameter $\epsilon > 0$, if $v_1$ is a length-1 eigenvector of the largest eigenvalue of $M$, then with probability $\ge 3/16$ over the choice of $x_0$, the algorithm Power2 outputs a vector $x_k \perp v_1$ such that

\[ \frac{x_k^T M x_k}{x_k^T x_k} \ge \lambda_2 \cdot (1 - \epsilon) \cdot \frac{1}{1 + 4n (1 - \epsilon)^{2k}} \]

where $\lambda_2$ is the second largest eigenvalue of $M$, counting multiplicities.

3.3 The Second Smallest Eigenvalue of the Laplacian

Finally, we come to the case in which we want to compute the second smallest eigenvalue of the normalized Laplacian matrix $L = I - \frac{1}{d} A$ of a $d$-regular graph $G = (V, E)$, where $A$ is the adjacency matrix of $G$.

Consider the matrix $M := 2I - L = I + \frac{1}{d} A$. Then if $0 = \lambda_1 \le \ldots \le \lambda_n \le 2$ are the eigenvalues of $L$, we have that

\[ 2 = 2 - \lambda_1 \ge 2 - \lambda_2 \ge \cdots \ge 2 - \lambda_n \ge 0 \]

are the eigenvalues of $M$, and that $M$ is PSD. $M$ and $L$ have the same eigenvectors, and so $v_1 = \frac{1}{\sqrt{n}} (1, \ldots, 1)$ is a length-1 eigenvector of the largest eigenvalue of $M$.

By running algorithm Power2, we can find a vector $x \perp (1, \ldots, 1)$ such that

\[ x^T M x \ge (1 - \epsilon) \cdot (2 - \lambda_2) \cdot x^T x \]

Since

\[ x^T M x = 2 x^T x - x^T L x \]

we have, rearranging,

\[ \frac{x^T L x}{x^T x} \le \lambda_2 + 2 \epsilon \]

If we want to compute a vector whose Rayleigh quotient is, say, at most $2 \lambda_2$, then the running time will be $\tilde{O}((|V| + |E|) / \lambda_2)$, because we need to set $\epsilon = \lambda_2 / 2$, which is not nearly linear in the size of the graph if $\lambda_2$ is, say, $O(1/|V|)$.

For a running time that is nearly linear in $n$ for all values of $\lambda_2$, one can, instead, apply the power method to the pseudoinverse $L^+$ of $L$. (Assuming that the graph is connected, $L^+ x$ is the unique vector $y \perp (1, \ldots, 1)$ such that $Ly = x$, if $x \perp (1, \ldots, 1)$, and $L^+ x = 0$ if $x$ is parallel to $(1, \ldots, 1)$.) This is because $L^+$ has eigenvalues $0, 1/\lambda_2, \ldots, 1/\lambda_n$, and so $L^+$ is PSD and $1/\lambda_2$ is its largest eigenvalue.

Although computing $L^+$ is not known to be doable in nearly linear time, there are nearly linear time algorithms that, given $x$, solve in $y$ the linear system $Ly = x$,
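and this is the same as computing the product $L^+ x$, which is enough to implement algorithm Power applied to $L^+$.

(Such algorithms will be discussed in the third part of the course. The algorithms will find an approximate solution $y$ to the linear system $Ly = x$, but this will be sufficient. In the following, we proceed as if the solution were exact.)

In time $O((|V| + |E|) \cdot (\log |V| / \epsilon)^{O(1)})$, we can find a vector $y$ such that $y = (L^+)^k x$, where $x$ is a random vector in $\{-1, 1\}^n$, shifted to be orthogonal to $(1, \ldots, 1)$, and $k = O(\log |V| / \epsilon)$. What is the Rayleigh quotient of such a vector with respect to $L$?

Let $v_1, \ldots, v_n$ be a basis of orthonormal eigenvectors for $L$ and $L^+$. If $0 = \lambda_1 \le \lambda_2 \le \cdots \le \lambda_n$ are the eigenvalues of $L$, then we have

\[ L v_1 = L^+ v_1 = 0 \]

and, for $i = 2, \ldots, n$, we have

\[ L v_i = \lambda_i v_i, \qquad L^+ v_i = \frac{1}{\lambda_i} v_i \]

Write $x = a_2 v_2 + \cdots + a_n v_n$, where $\sum_i a_i^2 \le n$, and assume that, as happens with probability at least $3/16$, we have $a_2^2 \ge \frac{1}{4}$. Then

\[ y = \sum_{i=2}^{n} a_i \frac{1}{\lambda_i^k} v_i \]

and the Rayleigh quotient of $y$ with respect to $L$ is

\[ \frac{y^T L y}{y^T y} = \frac{\sum_{i=2}^{n} a_i^2 \frac{1}{\lambda_i^{2k-1}}}{\sum_{i=2}^{n} a_i^2 \frac{1}{\lambda_i^{2k}}} \]

and the analysis proceeds similarly to the analysis of the previous section. If we let $\ell$ be the index such that $\lambda_\ell \le (1 + \epsilon) \cdot \lambda_2 \le \lambda_{\ell+1}$ then we can upper bound the numerator as

\[
\begin{aligned}
\sum_{i=2}^{n} a_i^2 \frac{1}{\lambda_i^{2k-1}}
&\le \sum_{i=2}^{\ell} a_i^2 \frac{1}{\lambda_i^{2k-1}} + \frac{1}{(1 + \epsilon)^{2k-1} \lambda_2^{2k-1}} \sum_{i > \ell} a_i^2 \\
&\le \sum_{i=2}^{\ell} a_i^2 \frac{1}{\lambda_i^{2k-1}} + \frac{n}{(1 + \epsilon)^{2k-1} \lambda_2^{2k-1}} \\
&\le \sum_{i=2}^{\ell} a_i^2 \frac{1}{\lambda_i^{2k-1}} + \frac{4 n a_2^2}{(1 + \epsilon)^{2k-1} \lambda_2^{2k-1}}
\end{aligned}
\]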
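\[ \le \left( 1 + \frac{4n}{(1 + \epsilon)^{2k-1}} \right) \cdot \sum_{i=2}^{\ell} a_i^2 \frac{1}{\lambda_i^{2k-1}} \]

and we can lower bound the denominator as

\[ \sum_{i=2}^{n} a_i^2 \frac{1}{\lambda_i^{2k}} \ge \sum_{i=2}^{\ell} a_i^2 \frac{1}{\lambda_i^{2k}} \ge \frac{1}{(1 + \epsilon) \lambda_2} \cdot \sum_{i=2}^{\ell} a_i^2 \frac{1}{\lambda_i^{2k-1}} \]

and the Rayleigh quotient is

\[ \frac{y^T L y}{y^T y} \le \lambda_2 \cdot (1 + \epsilon) \cdot \left( 1 + \frac{4n}{(1 + \epsilon)^{2k-1}} \right) \le (1 + 2 \epsilon) \cdot \lambda_2 \]

when $k = O\left( \frac{1}{\epsilon} \log \frac{n}{\epsilon} \right)$.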
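The two approaches of Section 3.3 can be compared on a small instance with the following Python sketch. It is an illustration under stated assumptions, not the nearly-linear-time method referred to above: a plain conjugate-gradient routine (`solve_L`) stands in for the Laplacian solvers, the 8-cycle and the fixed starting vector `x` are hypothetical test data, and all function names are made up for this example.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def cycle(n):
    """Adjacency lists of the n-cycle, a 2-regular graph (hypothetical test instance)."""
    return [[(v - 1) % n, (v + 1) % n] for v in range(n)]

def apply_L(adj, d, x):
    """L x for the normalized Laplacian L = I - A/d of a d-regular graph."""
    return [x[v] - sum(x[u] for u in adj[v]) / d for v in range(len(adj))]

def remove_mean(x):
    """Project x onto the subspace orthogonal to (1, ..., 1)."""
    mean = sum(x) / len(x)
    return [xi - mean for xi in x]

def rayleigh_L(adj, d, y):
    return dot(y, apply_L(adj, d, y)) / dot(y, y)

def power2_laplacian(adj, d, x, k):
    """Algorithm Power2 applied to M = 2I - L, with v_1 parallel to (1, ..., 1)."""
    y = remove_mean(x)
    for _ in range(k):
        Ly = apply_L(adj, d, y)
        y = [2 * yi - li for yi, li in zip(y, Ly)]
    return y

def solve_L(adj, d, b, max_iters=1000, tol=1e-24):
    """Conjugate gradient for L y = b, assuming b is orthogonal to (1, ..., 1)
    and the graph is connected. A stand-in for a nearly-linear-time solver."""
    y = [0.0] * len(b)
    r, p = list(b), list(b)
    rs = dot(r, r)
    for _ in range(max_iters):
        if rs < tol:          # squared residual norm is small enough
            break
        Lp = apply_L(adj, d, p)
        alpha = rs / dot(p, Lp)
        y = [yi + alpha * pi for yi, pi in zip(y, p)]
        r = [ri - alpha * li for ri, li in zip(r, Lp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return remove_mean(y)

def inverse_power(adj, d, x, k):
    """Power method on the pseudoinverse: (L^+)^k x, rescaled to unit length."""
    y = remove_mean(x)
    for _ in range(k):
        y = solve_L(adj, d, y)
        norm = math.sqrt(dot(y, y))
        y = [yi / norm for yi in y]  # rescaling leaves the Rayleigh quotient unchanged
    return y

adj, d = cycle(8), 2
x = [1, 1, 1, 1, -1, -1, -1, -1]         # fixed start; the lecture picks it at random
lam2 = 1 - math.cos(2 * math.pi / 8)     # second smallest eigenvalue of L for the 8-cycle
print("lambda_2                 =", lam2)
print("Power2 on 2I - L, k=100  :", rayleigh_L(adj, d, power2_laplacian(adj, d, x, 100)))
print("power method on L^+, k=20:", rayleigh_L(adj, d, inverse_power(adj, d, x, 20)))
```

The rescaling step in `inverse_power` is a design choice of the sketch: it keeps the entries in a safe floating-point range, and, since the Rayleigh quotient is scale invariant, it does not affect the analysis.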