ORIE 4741: Learning with Big Messy Data Spectral Graph Theory Mika Sumida Operations Research and Information Engineering Cornell September 15, 2017 1 / 32
Outline Graph Theory Spectral Graph Theory Laplacian regularizer Spectral Embedding 2 / 32
What is a graph? A graph is a collection of nodes that are connected by a set of lines or arrows models systems where objects have some pairwise relationship with each other 3 / 32
What is a graph? A graph is a collection of nodes that are connected by a set of lines or arrows models systems where objects have some pairwise relationship with each other Q: What are some examples of graphs in real life? 3 / 32
Examples of graphs : Social network Nodes are users Edges could be Facebook friendships, LinkedIn connections (undirected) Instagram and Twitter follows (directed) 4 / 32
Examples of graphs : Transportation network Subway systems, freight networks Roads, bridges, and highway systems 5 / 32
Examples of graphs : Collaboration graphs Hollywood graph Academic collaborations 6 / 32
Formal definition of graphs A graph, G = (V, E), is made up of a Vertex set V = {v 1,..., v n } and an Edge set E = {e ij } We say an edge e ij connects vertices v i and v j. 7 / 32
Formal definition of graphs A graph, G = (V, E), is made up of a Vertex set V = {v 1,..., v n } and an Edge set E = {e ij } We say an edge e ij connects vertices v i and v j. Two basic types of graphs: Undirected graphs: edges are sets e ij = {v i, v j } Directed graphs: edges are ordered e ij = (v i, v j ) 7 / 32
Formal definition of graphs A graph, G = (V, E), is made up of a Vertex set V = {v 1,..., v n } and an Edge set E = {e ij } We say an edge e ij connects vertices v i and v j. Two basic types of graphs: Undirected graphs: edges are sets e ij = {v i, v j } Directed graphs: edges are ordered e ij = (v i, v j ) The degree of a vertex, d(v) = # of edges incident to v 8 / 32
Formal definition of graphs A graph, G = (V, E), is made up of a Vertex set V = {v 1,..., v n } and an Edge set E = {e ij } We say an edge e ij connects vertices v i and v j. Two basic types of graphs: Undirected graphs: edges are sets e ij = {v i, v j } Directed graphs: edges are ordered e ij = (v i, v j ) The degree of a vertex, d(v) = # of edges incident to v G is connected if there is a path between every two vertices in the graph. 8 / 32
Common graphs Path graph 9 / 32
Common graphs Path graph Complete graph 9 / 32
Common graphs Path graph Complete graph Star graph 9 / 32
Common graphs Path graph Complete graph Star graph Cycle 9 / 32
Graph Theory After modeling our system as a graph, we can ask about maximum degree : finding influential people and celebrities finding complete subgraphs (cliques) : detecting communities minimum cut : sever a communication network into pieces shortest paths : routing cars in transportation network 10 / 32
Outline Graph Theory Spectral Graph Theory Laplacian regularizer Spectral Embedding 11 / 32
Spectral Graph Theory Spectral graph theory studies properties of graphs through the eigenvalues (spectra) and eigenvectors of associated graph matrices Eigenvalues of the Laplacian matrix characterize the connectivity of a graph Approximation algorithms for Max Cut Spectral clustering 12 / 32
Adjacency Matrix Encode connections in a graph in a matrix like a spreadsheet Adjacency matrix is a V V matrix with entries: { 1 if there is an edge e ij A(i, j) = 0 otherwise 13 / 32
Adjacency Matrix Encode connections in a graph in a matrix like a spreadsheet Adjacency matrix is a V V matrix with entries: { 1 if there is an edge e ij A(i, j) = 0 otherwise Example: 2 4 1 3 5 0 1 1 1 0 1 0 1 0 0 1 1 0 1 0 1 0 1 0 1 0 0 0 1 0 13 / 32
Degree Matrix Degree matrix is a V V matrix with entries: { d(i) if i = j D(i, j) = 0 otherwise 14 / 32
Degree Matrix Degree matrix is a V V matrix with entries: { d(i) if i = j D(i, j) = 0 otherwise Example: 2 4 1 3 5 3 0 0 0 0 0 2 0 0 0 0 0 3 0 0 0 0 0 3 0 0 0 0 0 1 14 / 32
Laplacian Matrix Laplacian matrix, L = D - A d(i) if i = j L(i, j) = 1 if there is an edge e ij 0 otherwise Example: 2 4 1 3 5 3 1 1 1 0 1 2 1 0 0 1 1 3 1 0 1 0 1 3 1 0 0 0 1 1 15 / 32
Laplacians of common graphs Path graph 16 / 32
Laplacians of common graphs Path graph 1 1 0 0 L P4 = 1 2 1 0 0 1 2 1 0 0 1 1 Complete graph 16 / 32
Laplacians of common graphs Path graph 1 1 0 0 L P4 = 1 2 1 0 0 1 2 1 0 0 1 1 Complete graph 3 1 1 1 L K4 = 1 3 1 1 1 1 3 1 1 1 1 3 16 / 32
Laplacians of common graphs Star graph 17 / 32
Laplacians of common graphs Star graph 3 1 1 1 L S4 = 1 1 0 0 1 0 1 0 1 0 0 1 Cycle 17 / 32
Laplacians of common graphs Star graph 3 1 1 1 L S4 = 1 1 0 0 1 0 1 0 1 0 0 1 Cycle 2 1 0 1 L C4 = 1 2 1 0 0 1 2 1 1 0 1 2 17 / 32
Matrices as operators Can think of matrices as functions operating on vectors M 1 M1 T v M : R n R n where Mv = M 2 v = M2 T v M n Mn T v What happens if we apply the Laplacian to a vector v? 18 / 32
Laplacian operator Think of v as a distribution of weights or values on the vertices of G. (Lv)(i) = L T i v n = L(i, j)v(j) j=1 = d(i)v(i) + n j=1,j i = d(i)v(i) = j:{i,j} E j:{i,j} E L(i, j)v(j) v(j) ( ) v(i) v(j) 19 / 32
Quadratic Form of the Laplacian The quadratic form associated with a matrix M is the function f : R n R where f (v) = v T Mv 20 / 32
Quadratic Form of the Laplacian The quadratic form associated with a matrix M is the function For the Laplacian matrix, n v T Lv = v(i) (Lv)(i) = i=1 f : R n R where f (v) = v T Mv n v(i) i=1 = {i,j} E = {i,j} E = {i,j} E j:{i,j} E ( ) v(i) v(j) ( ) v(i) v(i) v(j) v(i) 2 2v(i)v(j) + v(j) 2 ( ) 2 v(i) v(j) ( ) + v(j) v(j) v(i) 20 / 32
Laplacian operator (Lv)(i) = v T Lv = {i,j} E {i,j} E and ( ) v(i) v(j) ( ) 2 v(i) v(j) Laplacian operators measure the smoothness of v across the edges of G If v = c 1, Lv = 0 = 0 is an eigenvalue of L with associated eigenvector 1 v T Lv 0 so L is positive semi-definite 21 / 32
Properties of the Laplacian matrix L, D, and A are all real symmetric matrices Spectral Theorem: An n n real symmetric matrix has n real eigenvalues with n real eigenvectors that form an orthonormal basis 22 / 32
Properties of the Laplacian matrix L, D, and A are all real symmetric matrices Spectral Theorem: An n n real symmetric matrix has n real eigenvalues with n real eigenvectors that form an orthonormal basis L is positive semi-definite = All eigenvalues of L are non-negative 22 / 32
Properties of the Laplacian matrix L, D, and A are all real symmetric matrices Spectral Theorem: An n n real symmetric matrix has n real eigenvalues with n real eigenvectors that form an orthonormal basis L is positive semi-definite = All eigenvalues of L are non-negative L has eigenvalues 0 = λ 1 λ 2 λ n 22 / 32
Outline Graph Theory Spectral Graph Theory Laplacian regularizer Spectral Embedding 23 / 32
Smooth regularizer d 1 r(w) = (w i+1 w i ) 2 = Sw 2 = w T S T Sw i=1 where S R (d 1) d is the first order difference operator 1 j = i S(i, j) = 1 j = i + 1 0 otherwise 24 / 32
Smoothed least squares problem Why smooth? minimize n (y i w T x i ) 2 + λ Sw 2 i=1 can couple coefficients of adjacent features allow model to change over space or time example: different years in tax data 25 / 32
Smoothed least squares problem Why smooth? minimize n (y i w T x i ) 2 + λ Sw 2 i=1 can couple coefficients of adjacent features allow model to change over space or time example: different years in tax data Can couple any pair of model coefficients, not just (i, i + 1)! 25 / 32
A closer look at the first order difference operator 1 1 0 0 0 0 1 2 1 0 0 0 S T 0 1 2 0 0 0 S = 0 0 0 2 1 0 0 0 0 1 2 1 0 0 0 0 1 1 26 / 32
A closer look at the first order difference operator 1 1 0 0 0 0 1 2 1 0 0 0 S T 0 1 2 0 0 0 S = 0 0 0 2 1 0 0 0 0 1 2 1 0 0 0 0 1 1 Laplacian of the path graph = L G(Path) Sw 2 = w T L G(Path) w 26 / 32
Laplacian regularizer Suppose we have a graph G that encodes relationships between features Product recommendation: Features are whether a customer bought a certain product. Graph has edges between similar products Geographic features: Features are states of residence. Graph has an edge if two states share a border Time series: Features are based on years. Graph has edges between consecutive years (y, y + 1). Add in weaker time dependencies by having edges (y, y + 2) with smaller weights. Laplacian regularizer smooths coefficients over edges of the graph G. 27 / 32
Laplacian Regularized Least Squares Laplacian regularizer smooths coefficients over edges of the graph G. minimize n (y i w T x i ) 2 + w T L G w i=1 = n (y i w T x i ) 2 + i=1 {i,j} E ( ) 2 w(i) w(j) 28 / 32
Outline Graph Theory Spectral Graph Theory Laplacian regularizer Spectral Embedding 29 / 32
Drawing graphs with eigenvectors Laplacian quadratic form measures difference between values of a vector across edges For eigenvector of L, v T Lv = λ Eigenvector of L with small eigenvalue has similar v-values on adjacent vertices Laplacian eigenvectors of low eigenvalue can be used to embed graph into 2-D or 3-D 30 / 32
Spectral Embedding Demo https://github.com/orie4741/demos/ spectralgraphtheory.ipynb 31 / 32
References Daniel Spielman s Spectral Graph Theory notes : http://www.cs.yale.edu/homes/spielman/561/ David Williamson ORIE 6334 notes : https://people.orie.cornell.edu/dpw/orie6334/ index.html 32 / 32