ORIE 6334 Spectral Graph Theory  September 3, 2016
Lecturer: David P. Williamson  Lecture 7  Scribe: Sam Gutekunst

In this lecture, we introduce normalized adjacency and Laplacian matrices. We state and begin to prove Cheeger's inequality, which relates the second eigenvalue of the normalized Laplacian matrix to a graph's connectivity. Before stating the inequality, we will also define three related measures of the expansion properties of a graph: conductance, edge expansion, and sparsity.

1  Normalized Adjacency and Laplacian Matrices

We use notation from Lap Chi Lau.

Definition 1  The normalized adjacency matrix is $\mathcal{A} = D^{-1/2} A D^{-1/2}$, where $A$ is the adjacency matrix of $G$ and $D = \mathrm{diag}(d)$ for $d(i)$ the degree of node $i$.

For a graph $G$ with no isolated vertices, we can see that
\[
D^{-1/2} = \begin{pmatrix}
\frac{1}{\sqrt{d(1)}} & 0 & \cdots & 0 \\
0 & \frac{1}{\sqrt{d(2)}} & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & \frac{1}{\sqrt{d(n)}}
\end{pmatrix}.
\]

Definition 2  The normalized Laplacian matrix is $\mathcal{L} = I - \mathcal{A}$.

Notice that $\mathcal{L} = I - \mathcal{A} = D^{-1/2}(D - A)D^{-1/2} = D^{-1/2} L_G D^{-1/2}$, for $L_G$ the (unnormalized) Laplacian.

Recall that for the largest eigenvalue $\lambda_1$ of $A$ and the maximum degree $d_{\max}$ of a vertex in the graph, $d_{\mathrm{avg}} \le \lambda_1 \le d_{\max}$. Normalizing the adjacency matrix makes its largest eigenvalue 1, so the analogous result for normalized matrices is the following:

Claim 1  Let $\alpha_1 \ge \cdots \ge \alpha_n$ be the eigenvalues of $\mathcal{A}$ and let $\lambda_1 \le \cdots \le \lambda_n$ be the eigenvalues of $\mathcal{L}$. Then
\[
1 = \alpha_1 \ge \cdots \ge \alpha_n \ge -1, \qquad 0 = \lambda_1 \le \cdots \le \lambda_n \le 2.
\]

⁰This lecture is derived from Lau's 2012 notes, Week 2, http://appsrv.cse.cuhk.edu.hk/~chi/csc560/notes/l02.pdf.
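To make Definitions 1 and 2 concrete before the proof, here is a small pure-Python sanity check (my own illustration, not part of the original notes; the 4-vertex example graph is an arbitrary choice). It builds $\mathcal{A}$ and $\mathcal{L}$ and verifies the identity $\mathcal{L} = I - \mathcal{A} = D^{-1/2} L_G D^{-1/2}$:

```python
# Sanity check: build the normalized adjacency and Laplacian of a small
# graph and verify L = I - A_norm = D^{-1/2} L_G D^{-1/2}.  Pure Python.
import math

edges = [(0, 1), (1, 2), (2, 0), (2, 3)]   # a triangle with a pendant vertex
n = 4

A = [[0.0] * n for _ in range(n)]
for i, j in edges:
    A[i][j] = A[j][i] = 1.0

deg = [sum(A[i]) for i in range(n)]        # degrees d(i)

# Normalized adjacency: A_norm[i][j] = A[i][j] / sqrt(d(i) d(j))
A_norm = [[A[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
          for i in range(n)]

# Normalized Laplacian via Definition 2: L = I - A_norm
L_norm = [[(1.0 if i == j else 0.0) - A_norm[i][j] for j in range(n)]
          for i in range(n)]

# The same matrix via D^{-1/2} L_G D^{-1/2}, where L_G = D - A
L_G = [[(deg[i] if i == j else 0.0) - A[i][j] for j in range(n)]
       for i in range(n)]
L_alt = [[L_G[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]

assert all(abs(L_norm[i][j] - L_alt[i][j]) < 1e-12
           for i in range(n) for j in range(n))
print("L = I - A_norm = D^{-1/2} L_G D^{-1/2} verified")
```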
Proof: First, we show that 0 is an eigenvalue of $\mathcal{L}$ using the vector $x = D^{1/2} e$. Then
\[
\mathcal{L}(D^{1/2} e) = D^{-1/2} L_G D^{-1/2} D^{1/2} e = D^{-1/2} L_G e = 0,
\]
since $e$ is an eigenvector of $L_G$ corresponding to eigenvalue 0. This shows that $D^{1/2} e$ is an eigenvector of $\mathcal{L}$ of eigenvalue 0. To show that it is the smallest eigenvalue, notice that $\mathcal{L}$ is positive semidefinite, as for any $x \in \mathbb{R}^n$:
\[
x^T \mathcal{L} x = x^T (I - \mathcal{A}) x = \sum_{i \in V} x(i)^2 - \sum_{(i,j) \in E} \frac{2\, x(i) x(j)}{\sqrt{d(i) d(j)}} = \sum_{(i,j) \in E} \left( \frac{x(i)}{\sqrt{d(i)}} - \frac{x(j)}{\sqrt{d(j)}} \right)^2 \ge 0.
\]
The last equality can be seen in reverse by expanding $\left( \frac{x(i)}{\sqrt{d(i)}} - \frac{x(j)}{\sqrt{d(j)}} \right)^2$ (each vertex $i$ appears in $d(i)$ edges, so the squared terms sum to $\sum_{i \in V} x(i)^2$). We have now shown that $\mathcal{L}$ has nonnegative eigenvalues, so indeed $\lambda_1 = 0$.

To show that $\alpha_1 = 1$, we make use of the positive semidefiniteness of $\mathcal{L} = I - \mathcal{A}$. This gives us that, for all $x \in \mathbb{R}^n$:
\[
x^T (I - \mathcal{A}) x \ge 0 \implies x^T x \ge x^T \mathcal{A} x \implies \frac{x^T \mathcal{A} x}{x^T x} \le 1. \tag{1}
\]
This Rayleigh quotient gives us the upper bound $\alpha_1 \le 1$. To get equality, consider again $x = D^{1/2} e$. Since, for this $x$, $x^T \mathcal{L} x = 0$, we have $x^T (I - \mathcal{A}) x = 0$. The exact same steps as in Equation (1) yield $\frac{x^T \mathcal{A} x}{x^T x} = 1$, as we now have equality.

To get a similar lower bound on $\alpha_n$, we can show that $I + \mathcal{A}$ is positive semidefinite using a similar sum expansion.² Then
\[
x^T (I + \mathcal{A}) x \ge 0 \implies x^T x \ge -x^T \mathcal{A} x \implies \frac{x^T \mathcal{A} x}{x^T x} \ge -1 \implies \alpha_n \ge -1.
\]
A slick proof of positive semidefiniteness that does not make use of this quadratic is to use the fact that $L_G$ is positive semidefinite. Thus $L_G = B B^T$ for some $B$, so that $\mathcal{L} = V V^T$ for $V = D^{-1/2} B$.

²This time, use
\[
x^T (I + \mathcal{A}) x = \sum_{i \in V} x(i)^2 + \sum_{(i,j) \in E} \frac{2\, x(i) x(j)}{\sqrt{d(i) d(j)}} = \sum_{(i,j) \in E} \left( \frac{x(i)}{\sqrt{d(i)}} + \frac{x(j)}{\sqrt{d(j)}} \right)^2 \ge 0.
\]
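The two facts used so far, that the quadratic form $x^T \mathcal{L} x$ factors into a sum of squares over edges and that $D^{1/2} e$ is a 0-eigenvector, can be checked numerically. A small sketch (my own illustration; the graph and test vector are arbitrary choices):

```python
# Check two facts from the proof on a concrete graph (pure Python):
#   (1) x^T L x = sum over edges of (x(i)/sqrt(d(i)) - x(j)/sqrt(d(j)))^2,
#   (2) D^{1/2} e is a 0-eigenvector of the normalized Laplacian L.
import math

edges = [(0, 1), (1, 2), (2, 0), (2, 3)]   # arbitrary example graph
n = 4
deg = [0] * n
for i, j in edges:
    deg[i] += 1
    deg[j] += 1

def L_times(x):
    """Apply L = I - D^{-1/2} A D^{-1/2} to a vector x."""
    y = list(x)
    for i, j in edges:
        y[i] -= x[j] / math.sqrt(deg[i] * deg[j])
        y[j] -= x[i] / math.sqrt(deg[i] * deg[j])
    return y

x = [0.3, -1.2, 0.7, 2.0]                  # arbitrary test vector

quad = sum(x[i] * v for i, v in enumerate(L_times(x)))   # x^T L x
edge_sum = sum((x[i] / math.sqrt(deg[i]) - x[j] / math.sqrt(deg[j])) ** 2
               for i, j in edges)
assert abs(quad - edge_sum) < 1e-12 and quad >= 0.0      # fact (1)

z = [math.sqrt(d) for d in deg]            # z = D^{1/2} e
assert all(abs(v) < 1e-12 for v in L_times(z))           # fact (2): L z = 0
print("quadratic-form identity and null vector verified")
```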
Finally, notice that $x^T (I + \mathcal{A}) x \ge 0$ implies the following chain:
\[
x^T \mathcal{A} x \ge -x^T x \implies x^T I x - x^T \mathcal{A} x \le 2\, x^T x \implies \frac{x^T \mathcal{L} x}{x^T x} \le 2 \implies \lambda_n \le 2,
\]
using the same Rayleigh quotient trick and that $\lambda_n$ is the maximizer of that quotient. □

Remark  Notice that, given the spectrum of $\mathcal{A}$, we have the following: $-\mathcal{A}$ has as spectrum the negatives of the spectrum of $\mathcal{A}$, and $I - \mathcal{A}$ adds one to each eigenvalue of $-\mathcal{A}$. Hence, $0 = \lambda_1 \le \cdots \le \lambda_n \le 2$ follows directly from $1 = \alpha_1 \ge \cdots \ge \alpha_n \ge -1$.

2  Connectivity and $\lambda_2(\mathcal{L})$

Recall that $\lambda_2(L_G) = 0$ if and only if $G$ is disconnected. The same is true for $\lambda_2(\mathcal{L})$, and we can say more!

2.1  Flavors of Connectivity

Let $S \subseteq V$. Recall that $\delta(S)$ denotes the set of edges with exactly one endpoint in $S$, and define $\mathrm{vol}(S) = \sum_{i \in S} d(i)$.

Definition 3  The conductance of $S \subseteq V$ is
\[
\phi(S) = \frac{|\delta(S)|}{\min\{\mathrm{vol}(S), \mathrm{vol}(V \setminus S)\}}.
\]
The edge expansion of $S$ is $\alpha(S) = \frac{|\delta(S)|}{|S|}$, for $|S| \le \frac{n}{2}$. The sparsity of $S$ is
\[
\rho(S) = \frac{|\delta(S)|}{|S| \cdot |V \setminus S|}.
\]

These measures are similar if $G$ is $d$-regular (i.e., $d(i) = d$ for all $i \in V$). In this case,
\[
\alpha(S) = d\, \phi(S), \qquad \frac{n}{2}\, \rho(S) \le \alpha(S) \le n\, \rho(S).
\]
To see the first equality, e.g., notice that the volume of $S$ is $d|S|$. In general, notice that $0 \le \phi(S) \le 1$ for all $S \subseteq V$. We're usually interested in finding the sets $S$ that minimize these quantities over the entire graph.
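As a quick illustration (not from the notes; the prism graph and the set $S$ are my own choices), the following sketch computes all three measures for one triangle of the 3-regular prism graph and verifies the $d$-regular relations above:

```python
# Compute phi(S), alpha(S), rho(S) for a set S in the 3-regular prism graph
# (two triangles joined by a perfect matching), then check the d-regular
# relations alpha(S) = d*phi(S) and (n/2)*rho(S) <= alpha(S) <= n*rho(S).
edges = [(0, 1), (1, 2), (2, 0),           # triangle 1
         (3, 4), (4, 5), (5, 3),           # triangle 2
         (0, 3), (1, 4), (2, 5)]           # matching between them
n, d = 6, 3
S = {0, 1, 2}                              # one triangle; |S| <= n/2

cut = sum(1 for i, j in edges if (i in S) != (j in S))   # |delta(S)|
vol_S = d * len(S)                         # vol(S) = d|S| in a d-regular graph
vol_rest = d * (n - len(S))

phi = cut / min(vol_S, vol_rest)           # conductance
alpha = cut / len(S)                       # edge expansion
rho = cut / (len(S) * (n - len(S)))        # sparsity

assert alpha == d * phi                    # alpha(S) = d phi(S)
assert (n / 2) * rho <= alpha <= n * rho   # sparsity sandwich
print(f"phi={phi:.4f}, alpha={alpha:.4f}, rho={rho:.4f}")
```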
Definition 4  We define
\[
\phi(G) = \min_{S \subseteq V} \phi(S), \qquad \alpha(G) = \min_{S \subseteq V : |S| \le \frac{n}{2}} \alpha(S), \qquad \rho(G) = \min_{S \subseteq V} \rho(S).
\]
We call a graph $G$ an expander if $\phi(G)$ (or $\alpha(G)$) is large (i.e., a constant³). Otherwise, we say that $G$ has a sparse cut. One algorithm for finding a sparse cut that works well in practice, but that lacks strong theoretical guarantees, is called spectral partitioning.

Algorithm 1: Spectral Partitioning
1  Compute $x_2$, the eigenvector of $\mathcal{L}$ corresponding to $\lambda_2(\mathcal{L})$;
2  Sort $V$ such that $x_2(1) \le x_2(2) \le \cdots \le x_2(n)$;
3  Define the sweep cuts for $i = 1, \ldots, n-1$ by $S_i = \{1, \ldots, i\}$;
4  Return $\min_{i \in \{1, \ldots, n-1\}} \phi(S_i)$.

[Figure: the values $x_2(1) \le x_2(2) \le \cdots \le x_2(n)$ drawn as consecutive bars on a line; each sweep cut corresponds to a cut between two consecutive bars.]

Cheeger's inequality provides some insight into why this algorithm works well.

3  Cheeger's Inequality

We now work towards proving the following:

Theorem 2 (Cheeger's Inequality)  Let $\lambda_2$ be the second smallest eigenvalue of $\mathcal{L}$. Then
\[
\frac{\lambda_2}{2} \le \phi(G) \le \sqrt{2 \lambda_2}.
\]

The theorem proved by Jeff Cheeger actually has to do with manifolds and hypersurfaces; the theorem above is considered to be a discrete analog of Cheeger's original inequality, but the name has stuck. Typically, people think of the first inequality as being easy and the second as being hard. We'll prove the first inequality, and start the proof of the second inequality.

³One should then ask "A constant with respect to what?" Usually one defines families of graphs of increasing size as families of expanders, in which case we want the conductance or expansion constant with respect to the number of vertices.
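Before the proof, the sweep phase of Algorithm 1 (steps 2 to 4) can be sketched in a few lines, assuming $x_2$ has already been computed by some eigensolver. In this illustration (not from the notes) a hand-picked stand-in vector plays the role of $x_2$ on a graph made of two triangles joined by a single edge:

```python
# Sketch of the sweep-cut phase of spectral partitioning (Algorithm 1,
# steps 2-4).  The vector x2 below is a hand-picked stand-in for the true
# second eigenvector; with a real x2 the code is identical.
def best_sweep_cut(edges, deg, x2):
    """Return (min conductance over sweep cuts, the cut achieving it)."""
    n = len(deg)
    order = sorted(range(n), key=lambda i: x2[i])   # sort V by x2 values
    total_vol = sum(deg)
    best_phi, best_S = float("inf"), None
    S = set()
    for idx in order[:-1]:                          # sweep cuts S_1..S_{n-1}
        S.add(idx)
        cut = sum(1 for i, j in edges if (i in S) != (j in S))
        vol_S = sum(deg[i] for i in S)
        phi = cut / min(vol_S, total_vol - vol_S)
        if phi < best_phi:
            best_phi, best_S = phi, set(S)
    return best_phi, best_S

# Two triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
edges = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (2, 3)]
deg = [0] * 6
for i, j in edges:
    deg[i] += 1
    deg[j] += 1
x2 = [-1.0, -0.9, -0.5, 0.5, 0.9, 1.0]             # stand-in embedding

phi, S = best_sweep_cut(edges, deg, x2)
print(phi, sorted(S))   # the best sweep cut separates the two triangles
```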
Proof: Recall that
\[
\lambda_2 = \min_{x : \langle x, D^{1/2} e \rangle = 0} \frac{x^T \mathcal{L} x}{x^T x} = \min_{x : \langle x, D^{1/2} e \rangle = 0} \frac{x^T D^{-1/2} L_G D^{-1/2} x}{x^T x}.
\]
Consider the change of variables obtained by setting $y = D^{-1/2} x$ and $x = D^{1/2} y$:
\[
\lambda_2 = \min_{y : \langle D^{1/2} y, D^{1/2} e \rangle = 0} \frac{y^T L_G y}{(D^{1/2} y)^T (D^{1/2} y)} = \min_{y : \langle D^{1/2} y, D^{1/2} e \rangle = 0} \frac{y^T L_G y}{y^T D y}.
\]
The minimum is being taken over all $y$ such that $\langle D^{1/2} y, D^{1/2} e \rangle = 0$. That is, over $y$ such that
\[
(D^{1/2} y)^T D^{1/2} e = 0 \iff y^T D e = 0 \iff \sum_{i \in V} d(i)\, y(i) = 0.
\]
Hence, we have that
\[
\lambda_2 = \min_{y : \sum_{i \in V} d(i) y(i) = 0} \frac{\sum_{(i,j) \in E} (y(i) - y(j))^2}{\sum_{i \in V} d(i)\, y(i)^2}.
\]
Now let $S^*$ be such that $\phi(G) = \phi(S^*)$, and try defining
\[
\hat{y}(i) = \begin{cases} 1, & i \in S^* \\ 0, & \text{else.} \end{cases}
\]
It would be great if $\lambda_2$ were bounded by $\frac{|\delta(S^*)|}{\sum_{i \in S^*} d(i)} = \frac{|\delta(S^*)|}{\mathrm{vol}(S^*)}$. However, there are two problems. We do not have $\sum_{i \in V} d(i)\, \hat{y}(i) = 0$; moreover, $\frac{|\delta(S^*)|}{\mathrm{vol}(S^*)}$ might not be $\phi(S^*)$, as we want the denominator to be $\min\{\mathrm{vol}(S^*), \mathrm{vol}(V \setminus S^*)\}$. Hence, we redefine
\[
\hat{y}(i) = \begin{cases} \frac{1}{\mathrm{vol}(S^*)}, & i \in S^* \\ -\frac{1}{\mathrm{vol}(V \setminus S^*)}, & \text{else.} \end{cases}
\]
Now we notice that:
\[
\sum_{i \in V} d(i)\, \hat{y}(i) = \sum_{i \in S^*} \frac{d(i)}{\mathrm{vol}(S^*)} - \sum_{i \notin S^*} \frac{d(i)}{\mathrm{vol}(V \setminus S^*)} = 1 - 1 = 0.
\]
Thus, this is a feasible solution to the minimization problem defining $\lambda_2$, and the only edges contributing anything nonzero to the numerator are those with exactly one endpoint in $S^*$. Thus:
\[
\lambda_2 \le \frac{|\delta(S^*)| \left( \frac{1}{\mathrm{vol}(S^*)} + \frac{1}{\mathrm{vol}(V \setminus S^*)} \right)^2}{\sum_{i \in S^*} \frac{d(i)}{\mathrm{vol}(S^*)^2} + \sum_{i \notin S^*} \frac{d(i)}{\mathrm{vol}(V \setminus S^*)^2}}
= \frac{|\delta(S^*)| \left( \frac{1}{\mathrm{vol}(S^*)} + \frac{1}{\mathrm{vol}(V \setminus S^*)} \right)^2}{\frac{1}{\mathrm{vol}(S^*)} + \frac{1}{\mathrm{vol}(V \setminus S^*)}}
= |\delta(S^*)| \left( \frac{1}{\mathrm{vol}(S^*)} + \frac{1}{\mathrm{vol}(V \setminus S^*)} \right)
\]
\[
\le 2\, |\delta(S^*)| \max\left\{ \frac{1}{\mathrm{vol}(S^*)}, \frac{1}{\mathrm{vol}(V \setminus S^*)} \right\}
= \frac{2\, |\delta(S^*)|}{\min\{\mathrm{vol}(S^*), \mathrm{vol}(V \setminus S^*)\}}
= 2\, \phi(G).
\]
This completes the proof of the first inequality. □

To get the second, the idea is to suppose we had a $y$ with
\[
R(y) = \frac{\sum_{(i,j) \in E} (y(i) - y(j))^2}{\sum_{i \in V} d(i)\, y(i)^2}.
\]

Claim 3  We'll be able to find a cut $S \subseteq \mathrm{supp}(y) = \{i \in V : y(i) \ne 0\}$ with
\[
\frac{|\delta(S)|}{\mathrm{vol}(S)} \le \sqrt{2\, R(y)}.
\]

This will not suffice to prove the second part of the inequality, as $\frac{|\delta(S)|}{\mathrm{vol}(S)}$ need not equal $\phi(S)$, but we'll come back to this next lecture.

Proof: Without loss of generality, we assume $\max_{i} y(i)^2 = 1$, as we can scale $y$ if not. Our trick (from Trevisan) is to pick $t \in (0, 1]$ uniformly at random, and let $S_t = \{i \in V : y(i)^2 \ge t\}$. Notice that:
\[
E[\mathrm{vol}(S_t)] = \sum_{i \in V} d(i)\, \Pr[i \in S_t] = \sum_{i \in V} d(i)\, y(i)^2,
\]
and, assuming⁴ that $(i,j) \in E \implies y(i)^2 \le y(j)^2$,
\[
E[|\delta(S_t)|] = \sum_{(i,j) \in E} \Pr[(i,j) \in \delta(S_t)] = \sum_{(i,j) \in E} \Pr[y(i)^2 < t \le y(j)^2] = \sum_{(i,j) \in E} \left( y(j)^2 - y(i)^2 \right).
\]

⁴We make this assumption without loss of generality (it amounts to listing each edge with its endpoint of smaller $y(i)^2$ first) because it doesn't matter in the end and is notationally convenient.
Rewriting the above using the difference of squares and using Cauchy-Schwarz,
\[
\sum_{(i,j) \in E} \left( y(j)^2 - y(i)^2 \right) = \sum_{(i,j) \in E} (y(j) - y(i))(y(j) + y(i)) \le \sqrt{\sum_{(i,j) \in E} (y(j) - y(i))^2} \cdot \sqrt{\sum_{(i,j) \in E} (y(j) + y(i))^2},
\]
where, since $(y(j) + y(i))^2 \le 2(y(i)^2 + y(j)^2)$,
\[
\sum_{(i,j) \in E} (y(j) + y(i))^2 \le \sum_{(i,j) \in E} 2\left( y(i)^2 + y(j)^2 \right) = \sum_{i \in V} 2\, d(i)\, y(i)^2.
\]
This gives that
\[
E[|\delta(S_t)|] \le \sqrt{R(y) \sum_{i \in V} d(i)\, y(i)^2} \cdot \sqrt{2 \sum_{i \in V} d(i)\, y(i)^2} = \sqrt{2\, R(y)} \sum_{i \in V} d(i)\, y(i)^2,
\]
so that
\[
\frac{E[|\delta(S_t)|]}{E[\mathrm{vol}(S_t)]} \le \sqrt{2\, R(y)} \implies E\left[ |\delta(S_t)| - \sqrt{2\, R(y)}\, \mathrm{vol}(S_t) \right] \le 0.
\]
This means that there exists a $t$ such that
\[
\frac{|\delta(S_t)|}{\mathrm{vol}(S_t)} \le \sqrt{2\, R(y)}.
\]
(Note that any such $S_t$ is contained in $\mathrm{supp}(y)$, since $t > 0$.) □

To derandomize the algorithm, look at each of the $n$ possible cuts $S_t$ by looking at sweep cuts for the order $y(1)^2 \ge y(2)^2 \ge \cdots \ge y(n)^2$.
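The threshold-cut argument in the proof of Claim 3 can be checked empirically. The sketch below (my own illustration; the graph and the vector $y$ are arbitrary choices) examines only the $n$ sweep thresholds $t = y(i)^2$ and confirms that the best one meets the guarantee:

```python
# Empirical check of Claim 3: for a vector y, some threshold cut
# S_t = {i : y(i)^2 >= t} satisfies |delta(S_t)| / vol(S_t) <= sqrt(2 R(y)).
# Only the n sweep thresholds t = y(i)^2 need to be examined.
import math

edges = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (2, 3)]
n = 6
deg = [0] * n
for i, j in edges:
    deg[i] += 1
    deg[j] += 1

y = [0.9, 1.0, 0.8, 0.0, 0.0, 0.0]        # arbitrary vector; supp(y) = {0,1,2}

R = (sum((y[i] - y[j]) ** 2 for i, j in edges)
     / sum(deg[i] * y[i] ** 2 for i in range(n)))

best = float("inf")
for t in sorted(set(v * v for v in y if v != 0)):
    S = {i for i in range(n) if y[i] ** 2 >= t}        # threshold (sweep) cut
    cut = sum(1 for i, j in edges if (i in S) != (j in S))
    vol = sum(deg[i] for i in S)
    best = min(best, cut / vol)

assert best <= math.sqrt(2 * R)           # the guarantee of Claim 3
print(f"best ratio {best:.4f} <= sqrt(2R) = {math.sqrt(2 * R):.4f}")
```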