Lecture 9: Laplacian Eigenmaps

Radu Balan, Department of Mathematics, AMSC, CSCAMM and NWC, University of Maryland, College Park, MD. April 18, 2017.

Optimization Criteria

Assume $G = (V, W)$ is an undirected weighted graph with $n$ nodes and weight matrix $W$. We interpret $W_{i,j}$ as the similarity between nodes $i$ and $j$: the larger the weight, the more similar the nodes, and the closer they should be in a geometric graph embedding. Thus we look for a dimension $d > 0$ and a set of points $\{y_1, y_2, \ldots, y_n\} \subset \mathbb{R}^d$ so that $d_{i,j} = \|y_i - y_j\|$ is small whenever the weight $W_{i,j}$ is large.

A natural optimization criterion candidate:
$$J(y_1, y_2, \ldots, y_n) = \frac{1}{2} \sum_{1 \le i,j \le n} W_{i,j} \|y_i - y_j\|^2.$$

Optimization Criteria

Lemma. $J(y_1, \ldots, y_n) = \frac{1}{2} \sum_{1 \le i,j \le n} W_{i,j} \|y_i - y_j\|^2$ is convex in $(y_1, \ldots, y_n)$.

Proof idea: write it as a positive semidefinite quadratic criterion:
$$J = \sum_{i=1}^{n} \|y_i\|^2 \sum_{j=1}^{n} W_{i,j} - \sum_{i,j=1}^{n} W_{i,j} \langle y_i, y_j \rangle.$$
Let $Y = [y_1 \cdots y_n]$ be the $d \times n$ matrix of coordinates, and let $D = \mathrm{diag}(d_k)$, with $d_k = \sum_{i=1}^{n} W_{k,i}$, be the $n \times n$ diagonal degree matrix. A little algebra shows:
$$J(Y) = \mathrm{trace}\left\{ Y (D - W) Y^T \right\}.$$
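To make the lemma concrete, here is a minimal numerical check of the identity $J(Y) = \mathrm{trace}\{Y(D-W)Y^T\}$; it is a sketch in Python/NumPy, with a random symmetric weight matrix and sizes chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 8, 2
W = rng.random((n, n))
W = (W + W.T) / 2                 # symmetric similarity weights (illustrative)
np.fill_diagonal(W, 0.0)          # no self-loops
Y = rng.standard_normal((d, n))   # columns are the points y_1, ..., y_n

# Direct evaluation: J = (1/2) * sum_{i,j} W_ij * ||y_i - y_j||^2
J_direct = 0.5 * sum(W[i, j] * np.sum((Y[:, i] - Y[:, j]) ** 2)
                     for i in range(n) for j in range(n))

# Matrix form from the lemma: J = trace(Y (D - W) Y^T)
D = np.diag(W.sum(axis=1))
J_matrix = np.trace(Y @ (D - W) @ Y.T)

assert np.isclose(J_direct, J_matrix)
```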

Optimization Criteria

Equivalent forms:
$$J(Y) = \mathrm{trace}\left\{ Y (D - W) Y^T \right\} = \mathrm{trace}\left\{ Y \Delta Y^T \right\} = \mathrm{trace}\left\{ \Delta G \right\},$$
where $\Delta = D - W$ is the graph Laplacian and $G = Y^T Y$ is the $n \times n$ Gram matrix. Thus: $J$ is quadratic in $Y$, and positive semidefinite, hence convex. Also: $J$ is linear in $G$.

Question: Are there other convex functions in $Y$ that behave similarly?

Answer: Yes! Examples:
$$J(y_1, \ldots, y_n) = \sum_{1 \le i,j \le n} W_{i,j} \|y_i - y_j\|,$$
$$J(y_1, \ldots, y_n) = \left( \sum_{1 \le i,j \le n} W_{i,j} \|y_i - y_j\|^p \right)^{1/p}, \quad p \ge 1.$$

Constraints

Absent any constraint, minimizing $\mathrm{trace}\left\{ Y \Delta Y^T \right\}$ has the trivial solution $Y = 0$. To avoid this trivial solution, we impose a normalization constraint. Choices:
$$YY^T = I_d \qquad \text{or} \qquad YDY^T = I_d.$$
What does this mean?
$$\sum_{k=1}^{n} y_k y_k^T = I_d \quad \text{(Parseval frame)}, \qquad \sum_{k=1}^{n} d_k y_k y_k^T = I_d \quad \text{(Parseval weighted frame)}.$$
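As a sanity check on the frame interpretation, the following sketch (assuming nothing beyond NumPy; the random orthonormal rows are illustrative) verifies that $YY^T = I_d$ is exactly the Parseval frame condition $\sum_k y_k y_k^T = I_d$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 8, 2
# Orthonormal rows via QR: Y is d x n with Y @ Y.T = I_d by construction.
Q, _ = np.linalg.qr(rng.standard_normal((n, d)))
Y = Q.T

# Parseval frame condition: the columns y_k satisfy sum_k y_k y_k^T = I_d.
frame_sum = sum(np.outer(Y[:, k], Y[:, k]) for k in range(n))
assert np.allclose(frame_sum, np.eye(d))
assert np.allclose(Y @ Y.T, np.eye(d))
```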

The Optimization Problem

Combining one criterion with one constraint:
$$(\mathrm{LE}): \quad \text{minimize } \mathrm{trace}\left\{ Y \Delta Y^T \right\} \quad \text{subject to } YDY^T = I_d,$$
called the Laplacian eigenmap problem. Alternative problem:
$$(\mathrm{UnLE}): \quad \text{minimize } \mathrm{trace}\left\{ Y \Delta Y^T \right\} \quad \text{subject to } YY^T = I_d,$$
called the unnormalized Laplacian eigenmap problem.

The Optimization Problem

How to solve the Laplacian eigenmap problem
$$(\mathrm{LE}): \quad \text{minimize } \mathrm{trace}\left\{ Y \Delta Y^T \right\} \quad \text{subject to } YDY^T = I_d?$$
First note the problem is not convex, because of the equality constraint. How to make it convex? How to solve?

1. First absorb the scaling $D$ into the solution: $\tilde{Y} = Y D^{1/2}$. The problem becomes:
$$\text{minimize } \mathrm{trace}\left\{ \tilde{Y} D^{-1/2} \Delta D^{-1/2} \tilde{Y}^T \right\} = \mathrm{trace}\left\{ \tilde{Y} \tilde{\Delta} \tilde{Y}^T \right\} \quad \text{subject to } \tilde{Y} \tilde{Y}^T = I_d,$$
where $\tilde{\Delta} = D^{-1/2} \Delta D^{-1/2}$ is the normalized Laplacian.
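A quick numerical check of this change of variables, under the same illustrative random-$W$ setup as before: $\mathrm{trace}\{Y \Delta Y^T\}$ and $\mathrm{trace}\{\tilde{Y} \tilde{\Delta} \tilde{Y}^T\}$ agree.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 8, 2
W = rng.random((n, n))
W = (W + W.T) / 2
np.fill_diagonal(W, 0.0)

deg = W.sum(axis=1)
L = np.diag(deg) - W                         # Delta = D - W
D_sqrt = np.diag(np.sqrt(deg))
D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
L_norm = D_inv_sqrt @ L @ D_inv_sqrt         # normalized Laplacian

Y = rng.standard_normal((d, n))
Y_tilde = Y @ D_sqrt                         # the substitution Ytilde = Y D^{1/2}
assert np.isclose(np.trace(Y @ L @ Y.T),
                  np.trace(Y_tilde @ L_norm @ Y_tilde.T))
```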

The Optimization Problem

2. Consider the optimization problem for $P$:
$$\text{minimize } \mathrm{trace}\left\{ \tilde{\Delta} P \right\} \quad \text{subject to } P = P^T \succeq 0, \quad P \preceq I_n, \quad \mathrm{trace}(P) = d.$$

Proposition. Claims:
A. The above optimization problem is a convex SDP.
B. At the optimum: $P = \tilde{Y}^T \tilde{Y}$.
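This SDP can be written down almost verbatim in a modeling language. Below is a hedged sketch using CVXPY (an assumed dependency, not mentioned in the lecture); `laplacian_sdp` and its arguments are hypothetical names:

```python
import cvxpy as cp
import numpy as np

def laplacian_sdp(L_norm: np.ndarray, d: int) -> np.ndarray:
    """Minimize trace(L_norm P) s.t. P symmetric, 0 <= P <= I_n, trace(P) = d."""
    n = L_norm.shape[0]
    P = cp.Variable((n, n), symmetric=True)
    constraints = [P >> 0,                    # P positive semidefinite
                   np.eye(n) - P >> 0,        # P <= I_n in the PSD order
                   cp.trace(P) == d]
    cp.Problem(cp.Minimize(cp.trace(L_norm @ P)), constraints).solve()
    return P.value                            # at optimum, P = Ytilde^T Ytilde
```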

Eigenproblem

The optimal solutions of the (LE) and (UnLE) problems are given by appropriate eigenvectors:
$$\text{minimize } \mathrm{trace}\left\{ \tilde{Y} \tilde{\Delta} \tilde{Y}^T \right\} \quad \text{subject to } \tilde{Y} \tilde{Y}^T = I_d.$$
Solution:
$$\tilde{Y} = \begin{bmatrix} e_1^T \\ \vdots \\ e_d^T \end{bmatrix}, \qquad \tilde{\Delta} e_k = \lambda_k e_k,$$
where $0 = \lambda_1 \le \cdots \le \lambda_d$ are the smallest $d$ eigenvalues and $\|e_k\| = 1$ are the normalized eigenvectors.
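In practice one rarely solves the SDP; a symmetric eigensolver suffices. A minimal sketch using SciPy (`scipy.linalg.eigh` returns eigenvalues in ascending order); the function name and the precomputed `L_norm` argument are illustrative assumptions:

```python
import numpy as np
from scipy.linalg import eigh

def bottom_embedding(L_norm: np.ndarray, d: int) -> np.ndarray:
    """Stack the d bottom eigenvectors of L_norm as the rows of Ytilde."""
    eigvals, eigvecs = eigh(L_norm)   # eigenvalues in ascending order
    return eigvecs[:, :d].T           # d x n matrix Ytilde
```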

Generalized Eigenproblem

$$(\mathrm{LE}): \quad \text{minimize } \mathrm{trace}\left\{ Y \Delta Y^T \right\} \quad \text{subject to } YDY^T = I_d.$$
With $Y = \tilde{Y} D^{-1/2}$, the rows of $\tilde{Y}$ are eigenvectors of the normalized Laplacian: $\tilde{\Delta} e_k = \lambda_k e_k$. Let $f_k$ be the (transposed) rows of $Y$:
$$Y = \begin{bmatrix} f_1^T \\ \vdots \\ f_d^T \end{bmatrix}, \qquad f_k = D^{-1/2} e_k.$$
Thus $\tilde{\Delta} D^{1/2} f_k = \lambda_k D^{1/2} f_k$, or $D^{1/2} \tilde{\Delta} D^{1/2} f_k = \lambda_k D f_k$, or:
$$\Delta f_k = \lambda_k D f_k.$$
This is called the generalized eigenproblem associated to $(\Delta, D)$.
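Conveniently, `scipy.linalg.eigh` also accepts a second positive definite matrix and solves the generalized problem $A f = \lambda B f$ directly, so $\Delta f = \lambda D f$ needs no manual normalization. A sketch, assuming a symmetric weight matrix with strictly positive degrees:

```python
import numpy as np
from scipy.linalg import eigh

def generalized_embedding(W: np.ndarray, d: int) -> np.ndarray:
    """Solve Delta f = lambda D f and return the d x n matrix Y with rows f_k."""
    D = np.diag(W.sum(axis=1))
    L = D - W                      # unnormalized Laplacian Delta
    eigvals, F = eigh(L, D)        # generalized problem: L f = lambda D f
    return F[:, 1:d + 1].T         # skip the bottom (constant) eigenvector
```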

Eigenproblem

Consider the unnormalized Laplacian eigenmap problem:
$$(\mathrm{UnLE}): \quad \text{minimize } \mathrm{trace}\left\{ Y \Delta Y^T \right\} \quad \text{subject to } YY^T = I_d.$$
The solution $Y_{\mathrm{unLE}}$ is the $d \times n$ matrix whose rows are eigenvectors of the unnormalized Laplacian $\Delta = D - W$:
$$\Delta g_k = \mu_k g_k, \qquad \|g_k\| = 1, \qquad 0 = \mu_1 \le \cdots \le \mu_d, \qquad Y_{\mathrm{unLE}} = \begin{bmatrix} g_1^T \\ \vdots \\ g_d^T \end{bmatrix}.$$

What eigenspace to choose?

In most implementations one skips the eigenvectors associated with the 0 eigenvalue. Why? In the unnormalized case, $g_1 = \frac{1}{\sqrt{n}} [1, 1, \ldots, 1]^T$, hence it carries no new information. In your class projects, skip the bottom eigenvector. This fact is explicitly stated in the problem.

Embedding Algorithm

Algorithm (Laplacian Eigenmap Embedding)
Input: weight matrix $W$, target dimension $d$.
1. Construct the diagonal matrix $D = \mathrm{diag}(D_{ii})_{1 \le i \le n}$, where $D_{ii} = \sum_{k=1}^{n} W_{i,k}$.
2. Construct the normalized Laplacian $\tilde{\Delta} = I - D^{-1/2} W D^{-1/2}$.
3. Compute the bottom $d + 1$ eigenvectors $e_1, \ldots, e_{d+1}$: $\tilde{\Delta} e_k = \lambda_k e_k$, $0 = \lambda_1 \le \cdots \le \lambda_{d+1}$.

Embedding Algorithm (cont'd)

Algorithm (Laplacian Eigenmap Embedding, cont'd)
4. Construct the $d \times n$ matrix $Y$:
$$Y = \begin{bmatrix} e_2^T \\ \vdots \\ e_{d+1}^T \end{bmatrix} D^{-1/2}.$$
5. The new geometric graph is obtained by converting the columns of $Y$ into $n$ $d$-dimensional vectors:
$$[\, y_1 \mid \cdots \mid y_n \,] = Y.$$
Output: set of points $\{y_1, y_2, \ldots, y_n\} \subset \mathbb{R}^d$.
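Putting steps 1-5 together, here is a compact sketch of the whole embedding algorithm in Python (NumPy/SciPy); it assumes a symmetric weight matrix with strictly positive degrees and skips the bottom eigenvector as instructed above:

```python
import numpy as np
from scipy.linalg import eigh

def laplacian_eigenmap(W: np.ndarray, d: int) -> np.ndarray:
    """Return an n x d array whose rows are the embedded points y_1, ..., y_n."""
    n = W.shape[0]
    deg = W.sum(axis=1)                                  # step 1: degrees D_ii
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    L_norm = np.eye(n) - D_inv_sqrt @ W @ D_inv_sqrt     # step 2
    eigvals, eigvecs = eigh(L_norm)                      # step 3 (ascending order)
    E = eigvecs[:, 1:d + 1]                              # e_2, ..., e_{d+1}
    Y = E.T @ D_inv_sqrt                                 # step 4: d x n matrix
    return Y.T                                           # step 5: columns -> points
```

For large sparse graphs one would replace `eigh` with `scipy.sparse.linalg.eigsh` (e.g. with shift-invert, `sigma=0`) to extract only the bottom $d + 1$ eigenpairs.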

Example. See: http://www.math.umd.edu/~rvbalan/teaching/amsc663fall2010/PROJECTS/P5/index.html
