Learning on Graphs and Manifolds. CMPSCI 689, Sridhar Mahadevan, UMass Amherst


1 Learning on Graphs and Manifolds. CMPSCI 689, Sridhar Mahadevan, UMass Amherst

2 Outline. Manifold learning is a relatively new area of machine learning (2000 to now). Main idea: model the underlying geometry of the data as a graph, then construct an embedding of the graph. Applications: clustering, semi-supervised learning, regression, and reinforcement learning.

3 Semi-Supervised Learning. In many applications, unlabeled examples are plentiful, but labeled ones are in limited supply. Is it possible to exploit unlabeled data to improve classification? Use the geometry of the space of unlabeled data. Crucial assumption: the label function is smooth on the manifold.

4 Semi-Supervised Learning on Graphs. [Figure: a graph with a few labeled vertices and many unlabeled ones, together with its random walk matrix; note the matrix is non-symmetric.]

5 Label Propagation (Zhu and Ghahramani, 2002). Compute the affinity matrix W. Form the row sums D_ii = Σ_j W_ij. Initialize Y^0 = (y_1, ..., y_l, y_{l+1}, ..., y_n). Iterate: Y^{t+1} = D^{-1} W Y^t, then clamp the observed labels, Y_l^{t+1} = Y_l. Assign labels by the sign of Y. [Figure: the two-moons problem with one "+" and one "-" labeled example.]
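To make the update concrete, here is a minimal NumPy sketch of the iteration above; the Gaussian affinity, the value of sigma, and the iteration count are illustrative choices rather than anything fixed by the slides.

```python
import numpy as np

def label_propagation(X, y, sigma=0.5, n_iter=200):
    """X: (n, d) points; y: (n,) labels in {-1, +1}, with 0 for unlabeled."""
    # Affinity matrix W(i, j) = exp(-||x_i - x_j||^2 / (2 sigma^2))
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    D_inv = 1.0 / W.sum(axis=1)          # inverse row sums, D_ii = sum_j W_ij
    labeled = y != 0
    Y = y.astype(float)
    for _ in range(n_iter):
        Y = D_inv * (W @ Y)              # Y^{t+1} = D^{-1} W Y^t
        Y[labeled] = y[labeled]          # clamp the observed labels
    return np.sign(Y)
```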

6 Label Propagation. [Figure: convergence of the iteration on the two-moons data; final labels shown.]

7 Nonlinear dimensionality reduction (ISOMAP, LLE, Laplacian Eigenmaps, Diffusion Maps, MVU, ...). [Figure: the Swiss roll and its embedding.] The embedding should preserve locality.

8 Locally Linear Embedding [Roweis and Saul, Science 2000]. Learn the weight matrix W: ε(W) = Σ_i ||X_i - Σ_j W_ij X_j||^2. Learn the embedding Y: Φ(Y) = Σ_i ||Y_i - Σ_j W_ij Y_j||^2.
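As a concrete run, here is a short sketch using scikit-learn's LocallyLinearEmbedding on a synthetic Swiss roll; the neighbor count and sample size are arbitrary illustrative choices.

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

# Sample the Swiss roll from slide 7 and unroll it to 2 dimensions.
X, color = make_swiss_roll(n_samples=1500, noise=0.05, random_state=0)
# W is fit from each point's nearest neighbors; the embedding Y then
# minimizes Phi(Y) = sum_i ||Y_i - sum_j W_ij Y_j||^2.
lle = LocallyLinearEmbedding(n_neighbors=12, n_components=2)
Y = lle.fit_transform(X)    # (1500, 2) coordinates, locality preserved
```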

9 [Roweis, LLE]

10 [Roweis, LLE]

11 [Figure]

12 Constructing the Similarity Matrix. Manifold methods take as input a neighborhood similarity matrix W over the data. Gaussian kernel: W(i, j) = exp(-||x_i - x_j||^2 / (2σ^2)). k-NN kernel: W(i, j) = 1 if x_i is among the k nearest neighbors of x_j. The matrix is then normalized and diagonalized (or dilated). Basis functions: the columns (eigenvectors). Embeddings: the rows (sorted in increasing or decreasing order of the eigenvalues).
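A small sketch of the two kernel constructions, assuming NumPy and scikit-learn; the point cloud, sigma, and k are placeholder values.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

X = np.random.rand(200, 3)                       # toy point cloud
# Gaussian kernel: W(i, j) = exp(-||x_i - x_j||^2 / (2 sigma^2))
sigma = 0.3
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
W_gauss = np.exp(-d2 / (2 * sigma ** 2))
# k-NN kernel: W(i, j) = 1 if x_j is among the k nearest neighbors of x_i,
# symmetrized so the resulting graph is undirected.
A = kneighbors_graph(X, n_neighbors=8, mode="connectivity").toarray()
W_knn = np.maximum(A, A.T)
```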

13 Undirected Graph Embedding. Optimization problem of embedding a graph so as to preserve local geometry: min_y Σ_{i,j} (y_i - y_j)^2 w_ij subject to y^T D y = 1, where y_i ∈ R is the embedding of the i-th vertex and D is a diagonal matrix of the row sums of W. [Figure: vertices x_1, x_2, x_3 with edge weight w_13, mapped to embeddings y_1, y_2, y_3.]

14 Graph Embedding. The best mapping is found by solving the generalized eigenvector problem W φ = λ D φ, where D is a diagonal matrix of the row sums of W. If the graph is connected (so that D is invertible), this can be written as D^{-1} W φ = λ φ.

15 Introducing the Laplacian. The graph embedding problem can be written as min_y Σ_{i,j} (y_i - y_j)^2 w_ij = min_y Σ_{i,j} (y_i^2 + y_j^2 - 2 y_i y_j) w_ij = min_y y^T L y, where L = D - W is the combinatorial Laplacian.
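Tying slides 13-15 together, here is a minimal sketch that builds L = D - W on a toy chain graph and solves the generalized eigenproblem with SciPy; the chain graph itself is an arbitrary example.

```python
import numpy as np
from scipy.linalg import eigh

n = 10
W = np.zeros((n, n))                  # 10-vertex chain: w_{i,i+1} = 1
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
D = np.diag(W.sum(axis=1))            # diagonal matrix of row sums
L = D - W                             # combinatorial Laplacian
# Minimizing y^T L y subject to y^T D y = 1 yields L phi = lambda D phi.
evals, evecs = eigh(L, D)             # generalized problem, eigenvalues ascending
y = evecs[:, 1]                       # first nontrivial eigenvector: a 1-D embedding
```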

16 Properties of the Laplacian. The Laplacian L is positive semidefinite. For the two-vertex graph with a single unit-weight edge, the Laplacian is L = [[1, -1], [-1, 1]]. Note that <f, Lf> = f^T L f = (f_1 - f_2)^2, hence <f, Lf> >= 0 for any f ≠ 0, and all the eigenvalues of L are non-negative. The combinatorial Laplacian L = D - W acts on f by (L f)(i) = Σ_{i~j} (f_i - f_j) w_ij.

17 Combinatorial Graph Laplacian (Fiedler, 1973). [Figure: eigenvectors of the combinatorial Laplacian L = D - W on a closed chain (cycle graph).]

18 Normalized Graph Laplacian. The normalized graph Laplacian is defined as L = D^{-1/2} (D - W) D^{-1/2}. Note that D^{-1} W = D^{-1/2} (D^{-1/2} W D^{-1/2}) D^{1/2} = D^{-1/2} (I - L) D^{1/2}. The random walk matrix D^{-1} W is therefore similar to I - L: its eigenvalues are exactly 1 - λ for the eigenvalues λ of the normalized Laplacian, with eigenvectors related by D^{-1/2}.
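A quick numerical check of this similarity relation; the random 6-vertex graph is an arbitrary test case, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.random((6, 6))
W = (W + W.T) / 2
np.fill_diagonal(W, 0.0)                          # random symmetric affinities
d = W.sum(axis=1)
D_isqrt = np.diag(d ** -0.5)
L_norm = np.eye(6) - D_isqrt @ W @ D_isqrt        # normalized Laplacian
P = np.diag(1.0 / d) @ W                          # random walk matrix D^{-1} W
lam_L = np.linalg.eigvalsh(L_norm)                # ascending eigenvalues
lam_P = np.sort(np.linalg.eigvals(P).real)
assert np.allclose(np.sort(1.0 - lam_L), lam_P)   # spectra match via lambda -> 1 - lambda
```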

19 Faculty Collaboration Graph: What hidden structure does this contain?

20 Spectral Clustering. 1. Given a set X of data points to cluster. 2. Form the normalized matrix M = D^{-1/2} W D^{-1/2}. 3. Compute its k largest eigenvectors. 4. Arrange the eigenvectors as the columns of a matrix Y. 5. Each point is embedded in R^k, given by its row of the matrix Y. 6. Run k-means on the new embedding.
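A minimal sketch of the six steps; note that the row normalization of Y before k-means is an extra step taken from the Ng-Jordan-Weiss variant (slide 24), not listed above.

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.cluster import KMeans

def spectral_clustering(W, k):
    d = W.sum(axis=1)
    D_isqrt = np.diag(d ** -0.5)
    M = D_isqrt @ W @ D_isqrt                 # step 2: M = D^{-1/2} W D^{-1/2}
    evals, evecs = eigh(M)                    # symmetric eigenproblem, ascending
    Y = evecs[:, -k:]                         # steps 3-4: k largest eigenvectors
    Y = Y / np.linalg.norm(Y, axis=1, keepdims=True)        # Ng-Jordan-Weiss row scaling
    return KMeans(n_clusters=k, n_init=10).fit_predict(Y)  # steps 5-6
```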

21 Spectral Clustering using the Graph Laplacian. Embedding using the 2nd and 3rd eigenvectors of the graph Laplacian. Cluster 1: Adler, Barrington, Immerman, Kurose, Rosenberg, Shenoy, Sitaraman, Towsley. Cluster 2: Adrion, Allan, Avrunin, Barto, Brock, Clarke, Cohen, Croft, Grupen, Hanson, Jensen, Lehnert, Lesser, Levine, Mahadevan, Manmatha, McCallum, Moll, Moss, Osterweil, Riseman, Rissland, Schultz, Utgoff, Woolf, Zilberstein, Weems.

22 Spectral Clustering. Embedding using the 2nd and 3rd eigenvectors of the graph Laplacian. Cluster 1: Barto, Brock, Grupen, Hanson, Mahadevan, Moll, Moss, Riseman, Schultz, Utgoff, Allan, Avrunin, Clarke, Cohen, Croft, Jensen, Lehnert, Lesser, Levine, Manmatha, McCallum, Osterweil, Rissland, Woolf, Zilberstein. Cluster 2: Adler, Barrington, Immerman, Kurose, Rosenberg, Shenoy, Sitaraman, Towsley, Weems. Cluster 3: Adrion.

23 Spectral Clustering using the Graph Laplacian. [Table: the faculty graph partitioned into six clusters over Adrion, Adler, Barto, Avrunin, Allan, Barrington, Brock, Rosenberg, Grupen, Clarke, Croft, Immerman, Cohen, Sitaraman, Hanson, Lesser, Jensen, Kurose, Lehnert, Weems, Mahadevan, Osterweil, Levine, Shenoy, Rissland, Moll, Manmatha, Towsley, Utgoff, Moss, McCallum, Woolf, Riseman, Zilberstein, Schultz.]

24 [Figure: spectral clustering examples from Ng, Jordan, and Weiss, NIPS.]

25 Regularization Perspective. The combinatorial Laplacian L = D - W acts on f by (L f)(i) = Σ_{i~j} (f_i - f_j) w_ij. We can express <f, Lf> as a Dirichlet sum: <f, Lf> = Σ_{(i,j)} (f_i - f_j)^2 w_ij. The pseudo-inverse of the Laplacian, L^+, defines a reproducing kernel Hilbert space (RKHS).
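A small numerical check that the quadratic form equals the Dirichlet sum; summing over ordered pairs counts each edge twice, hence the factor 1/2 in the sketch below.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.random((5, 5))
W = (W + W.T) / 2
np.fill_diagonal(W, 0.0)                       # toy symmetric affinity matrix
L = np.diag(W.sum(axis=1)) - W                 # combinatorial Laplacian
f = rng.standard_normal(5)
dirichlet = 0.5 * ((f[:, None] - f[None, :]) ** 2 * W).sum()
assert np.isclose(f @ L @ f, dirichlet)        # <f, Lf> equals the Dirichlet sum
L_plus = np.linalg.pinv(L)                     # pseudo-inverse: the kernel of the RKHS
```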

26 Laplacian Eigenmaps (Belkin and Niyogi). Given a set of instances, form the affinity matrix W (e.g., using a Gaussian kernel). Form the combinatorial Laplacian L = D - W. Compute its k lowest eigenvectors, L φ_i = λ_i φ_i. These can be used to smoothly approximate any function on the graph.
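A short sketch of this recipe on a toy chain graph, approximating a smooth function by projecting it onto the k lowest eigenvectors; the graph, k, and target function are illustrative choices.

```python
import numpy as np
from scipy.linalg import eigh

n, k = 50, 5
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0     # chain-graph affinities
L = np.diag(W.sum(axis=1)) - W          # combinatorial Laplacian L = D - W
evals, evecs = eigh(L)                  # L phi_i = lambda_i phi_i, ascending
Phi = evecs[:, :k]                      # the k lowest eigenvectors
f = np.sin(np.linspace(0.0, np.pi, n))  # a smooth function on the graph
f_hat = Phi @ (Phi.T @ f)               # least-squares fit in the span of Phi
```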

27 Least-Squares Reinforcement Learning (Boyan; Bradtke and Barto; Bertsekas and Nedic; Lagoudakis and Parr). Approximate the value function with a linear basis Φ: V̂(s) = Σ_i w_i φ_i(s). The Bellman backup is T^π(V̂(s)) = E^π[R + γ V̂].
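One member of this family is LSTD (Bradtke and Barto); the compact sketch below is illustrative only, with a hypothetical phi feature map supplied by the caller.

```python
import numpy as np

def lstd(transitions, phi, gamma=0.95):
    """transitions: list of (s, r, s') samples; phi maps a state to a feature vector."""
    k = len(phi(transitions[0][0]))
    A = np.zeros((k, k))
    b = np.zeros(k)
    for s, r, s_next in transitions:
        f, f_next = phi(s), phi(s_next)
        A += np.outer(f, f - gamma * f_next)   # accumulates E[phi (phi - gamma phi')^T]
        b += r * f                             # accumulates E[R phi]
    w = np.linalg.solve(A, b)                  # fixed point: V_hat(s) = sum_i w_i phi_i(s)
    return w
```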

28 Learning Representation and Control in Markov Decision Processes (Mahadevan and Maggioni, JMLR 2007). [Figure: a two-room grid world with goal G and its optimal value function; the rooms are connected by bottleneck states.] A polynomial basis does poorly here. How can a good basis be found automatically?

29 Representation Policy Iteration (Mahadevan, UAI 2005). [Diagram: an actor-critic loop. The critic performs policy evaluation using the basis Φ, the actor performs greedy policy improvement, and a representation learner constructs new bases from the sampled trajectories.]

30 Out-of-Sample Extension (Baker, 1976; Williams and Seeger, NIPS 2001). How do we compute the embeddings of new points? The Nystrom extension is a classical interpolation method developed for the solution of integral equations: φ_m(x) = (1/λ_m) Σ_j w_j k(x, s_j) φ_m(s_j). [Figure: the Mountain Car MDP.]
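A sketch of the Nystrom formula above, assuming a Gaussian kernel and uniform quadrature weights w_j (absorbed into the sum); the function name and defaults are illustrative.

```python
import numpy as np

def nystrom_extend(x_new, samples, evecs, evals, sigma=0.5):
    """Embed a new point from eigenvectors/eigenvalues computed on sampled points."""
    # Kernel values k(x, s_j) against the sample set.
    kx = np.exp(-((samples - x_new) ** 2).sum(axis=1) / (2 * sigma ** 2))
    # phi_m(x) = (1 / lambda_m) * sum_j k(x, s_j) phi_m(s_j)
    return (kx @ evecs) / evals
```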

31 RPI in Continuous Domains: Inverted Pendulum [Mahadevan and Maggioni, JMLR 2007]. [Plot: number of balancing steps vs. number of training episodes on the inverted pendulum task, comparing 25 proto-value functions (eigenvectors of the normalized Laplacian, a machine-generated representation) against the radial basis functions of Lagoudakis and Parr, JMLR 2003 (a human-designed representation).]

32 RPI in Continuous Domains: Acrobot Task (4-dimensional state space). [Plot: learning curves comparing the machine-generated representation with a human-designed TD + CMAC baseline; the machine-generated representation is roughly 40x faster.]

33 Other Methods in Manifold Learning. Multiscale diffusion wavelets: instead of localizing bases to a particular eigenvalue, find bases over a frequency band; the bases are local and multiscale, not global. Semi-definite embedding: learn a kernel matrix from the data that preserves local geometry and global variance. ISOMAP: learn an embedding that preserves global distances on the graph.

34 Multiscale Diffusion Wavelets (Coifman and Maggioni, ACHA 2006). [Figure: diffusion wavelet basis functions at levels 1, 3, 4, 6, 7, and 9; the finest levels resemble unit vectors and the coarsest levels resemble global eigenvectors.]

35 Compression in 3D Graphics (Karni and Gotsman, SIGGRAPH 2000). A mesh with ~20,000 vertices takes ~1.5 MB; the idea is to extend JPEG-style transform coding to 3D. [Figure: an object split into topology (the mesh connectivity) and geometry (the X, Y, and Z coordinate functions).]

36 3D Mesh Compression using Diffusion Wavelets (Mahadevan, ICML 2007). [Figure: multiscale diffusion wavelet bases on a mesh, shown at levels 4, 5, 8, 9, and 10.]

37 Compressing Large 3D Objects. Elephant object (file ea4.obj, ~20,000 vertices). [Plot: geometric plus Laplacian reconstruction error vs. the number of bases (in multiples of 10), comparing Laplacian bases against diffusion wavelet (DWT) bases, along with their construction times in seconds.]

38 Summary. Learning on graphs and manifolds exploits the non-Euclidean geometry of the underlying space. Label propagation: semi-supervised learning on graphs. Spectral clustering: uses the eigenvectors of the Laplacian as a new representation. Laplacian eigenmaps: uses the eigenvectors to approximate functions. Diffusion wavelets: a multiscale approach.
