Manifold Coarse Graining for Online Semi-supervised Learning

Size: px
Start display at page:

Download "Manifold Coarse Graining for Online Semi-supervised Learning"

Transcription

1 for Online Semi-supervised Learning Mehrdad Farajtabar, Amirreza Shaban, Hamid R. Rabiee, Mohammad H. Rohban Digital Media Lab, Department of Computer Engineering, Sharif University of Technology, Tehran, Iran. September 6, 2011 M. Farajtabar et al. for Online Semi-supervised Learning September 6, / 23

2 1 Introduction Manifold regularization 2 3 Exact Coarse Graining Approximate Coarse Graining Preserving Manifold Structure 4 Summary and Future Works M. Farajtabar et al. for Online Semi-supervised Learning September 6, / 23

3 Semi-supervised Leraning Introduction Manifold regularization Semi-supervised Learning: utilize unlabeled data to to enhance classification. Manifold assumption: the labeling function varies smoothly with respect to underlying manifold. Manifold structure is modeled by the neighborhood graph of the data points. Application such as image segmentation, handwritten digit recognition, text classification, and etc. M. Farajtabar et al. for Online Semi-supervised Learning September 6, / 23

4 Online setting Outline Introduction Manifold regularization Online classification of data is required in common applications such as object tracking, face recognition in surveillance systems, image retrieval and etc. Usually unlabeled data is easily available in such classification problems. Most of the classic SSL algorithms are not efficient in an online classification setting. This is due to Time complexity of O(n 3 ) in standard implementations. Space complexity. Our approach: data reduction! M. Farajtabar et al. for Online Semi-supervised Learning September 6, / 23

5 Some solutions Outline Introduction Manifold regularization Using a smoothness regularization term on the just τ most recent data. Using RPtree to partition the graph into clusters covering the manifold structure. Replacing neighboring points in Euclidean space with fixed number of centroids. Caorse graining based on diffusion maps. M. Farajtabar et al. for Online Semi-supervised Learning September 6, / 23

6 Our solution Outline Introduction Manifold regularization Our method like above ones rely on data reduction to overcome memory and time limitations. A semi-supervised data reduction method Captures the geometric structure of data Considers the labeled data as a cue to better preserve the classification accuracy. Spectral decomposition is used to find similar nodes on the manifold in order to be merged. Assuming a maximum buffer length of k, we do data reduction whenever this limit is reached. O(k 3 ) Time complexity compared with O(n 3 ) for each newly arrived data. M. Farajtabar et al. for Online Semi-supervised Learning September 6, / 23

7 Problem statement Outline Introduction Manifold regularization Let X u = {x 1,..., x u } and X l = {x u+1,..., x u+l } be sets of unlabeled and labeled data points respectively. Also let y = (y(u + 1),..., y(u + l)) be the vector labels on X l. Our goal is to predict labels of X = X u X l. Let W be the weight matrix of the k-nn graph of X, W (i, j) = exp( x i x j 2 /2σ) where σ is the bandwidth parameter. Defining diagonal matrix D with D(i, i) = n j=1 W (i, j) and the laplacian matrix L un = D W, manifold regularization algorithms minimize the smoothness functional S(f ) = 1 W (i, j)(f (i) f (j)) 2 = f T L un f 2 i,j under some appropriate criteria. M. Farajtabar et al. for Online Semi-supervised Learning September 6, / 23

8 Introduction Manifold regularization Harmonic Solution and Label Propagation Minimizing f T L un f such that f l = y is called Harmonic Solution for manifold regularization problem. Label Propagation is a way for computing HS. In this algorithm labels are propagated from labeled to unlabeled nodes through edges in an iterative manner. Defining the stochastic matrix P as P(i, j) = W (i, j)/ the algorithm is stated as follows 1 Propagation: f (t+1) Pf (t) 2 Clamping: f l = y n W (i, k). k=1 Under appropriate conditions, the solution converges to the HS: f u = (I P uu ) 1 P ul y f l = y. M. Farajtabar et al. for Online Semi-supervised Learning September 6, / 23

9 Spectral view Defining absorbing stochastic matrix, [ ] Puu P Q ul 0 I The entire process of LP may be rewritten in one step So the solution is where f (t+1) = Qf (t) n f (t+1) (j) = q ij f (t) (i). f u (j) = l+u k=u+1 i=1 Q (j, k)y(k), Q lim n Qn. M. Farajtabar et al. for Online Semi-supervised Learning September 6, / 23

10 Spectral view (cont.) Q (j, k) = p k (j), for u + 1 k l + u and 1 < j < n, where p k (j) denotes element j of the k th right eigenvector which is unitary (corresponding to eigenvalue equal to one). f u can be expressed in terms of the right unitary eigenvectors of Q: f u (j) = l+u k=u+1 p k (j)y(k). f u remains unchanged if these eigenvectors are preserved in a CG process. M. Farajtabar et al. for Online Semi-supervised Learning September 6, / 23

11 Intution Outline Exact Coarse Graining Approximate Coarse Graining Preserving Manifold Structure w 13 w 23 = q 13 = q 23 =, w 13 + w 14 w 23 + w 24 w 14 w 24 = q 14 = q 24 =. w 13 + w 14 w 23 + w w 14 w 13 w 23 w w 13 + w 23 w 14 + w w 35 w 48 w 35 w 48 5 w 36 w (a) Before CG 8 5 w 36 w (b) After CG 8 f (1) = q 13 f (3) + q 14 f (4) f (0) = q 03 f (3) + q 04 f (4) f (2) = q 23 f (3) + q 24 f (4) f (3) = q 31 f (1) + q 32 f (2) + f (3) = q 30 f (0) + f (4) = q 41 f (1) + q 42 f (2) + f (4) = q 40 f (0) + M. Farajtabar et al. for Online Semi-supervised Learning September 6, / 23

12 Formal statement Outline Exact Coarse Graining Approximate Coarse Graining Preserving Manifold Structure This process can be modeled by the transformation Q = LQR where d 1 d d 1+d 2 d 1+d L = R = 0.. I n 2. I n In the general case, R and L can be defined such that the transform merges all the nodes with the same neighbors. Rows i and j of Q are equal if and only if p k (i) = p k (j) for all k with λ k 0, where p k is a right eigenvector of Q. M. Farajtabar et al. for Online Semi-supervised Learning September 6, / 23

13 Formal statement(cont.) Exact Coarse Graining Approximate Coarse Graining Preserving Manifold Structure Lemma Lp is the right unitary eigenvector of Q with the same eigenvalue as p, where p is the right unitary eigenvector of Q. Before merge: f u (j) = u+l k=u+1 p k(j)y(k). After Merge: f u(j ) = u +l k=u +1 (Lp k)(j )y(k) Theorem LP solution is preserved for nodes or cluster of nodes in exact CG, i.e. when we merge nodes if their corresponding elements are the same along all right eigenvectors with nonzero eigenvalues. M. Farajtabar et al. for Online Semi-supervised Learning September 6, / 23

14 Motivation Outline Exact Coarse Graining Approximate Coarse Graining Preserving Manifold Structure In real problems the case where neighbors of two or more nodes are exactly the same rarely occurs. Merging nodes when their corresponding elements in eigenvectors are close enough. For example along i th eigenvector we consider two elements approximately the same if their difference is no more than η i, then merge nodes if they are pairwise approximately the same. On the other hand LP results depend on unitary eigenvectors, so a good practice is to do CG on unitary eigenvectors only. M. Farajtabar et al. for Online Semi-supervised Learning September 6, / 23

15 Analysis Outline Exact Coarse Graining Approximate Coarse Graining Preserving Manifold Structure consider the term p i RLp i. Lp i is the approximate right eigenvector of Q. Multiplying by R unfolds the clusters. Defining ε i = p i RLp i, we would like to find an upper bound of ε i for nodes to be merged. The smaller ε i is, the more similar Lp i is to p i Consider node 1 that is placed in a cluster S 1 = {1,..., m}, ε i (1) = p i (1) (RLp i )(1) = η i (1 u 1(1) m j=1 u 1(j) ). M. Farajtabar et al. for Online Semi-supervised Learning September 6, / 23

16 Analysis(cont.) Outline Exact Coarse Graining Approximate Coarse Graining Preserving Manifold Structure Suppose p is the true right eigenvector of Q. We would like to have Lp as its right eigenvector so as to better preserve the manifold structure and LP. It is thus interesting to see whether Lp can be approximately considered as an eigenvector of Q. Considering Q (Lp) = λ(lp) + e, we d like to minimize e. We have n l i=1 e(i) 2 u 1 (i) 2D, where D = k λ 2 n i=1 i j=1 u 1(j) ε i (j) 2 and p i s and λ i s are right eigenvectors and associated eigenvalues that CG is performed along. i : η i 2 1 2kM λ i 2 λmin n = λ l D/n D M. Farajtabar et al. for Online Semi-supervised Learning September 6, / 23

17 Manifold Structure Outline Exact Coarse Graining Approximate Coarse Graining Preserving Manifold Structure Not just unitary eigenvectors, but some other eigenvectors corresponding to larger eigenvalues. (c) Before CG (d) After CG M. Farajtabar et al. for Online Semi-supervised Learning September 6, / 23

18 Eigenvector Preservation Summary and Future Works Real world data: Handwritten digit recognition. Reduce the number of nodes from 1000 to 92. Eigenvalues and cosine similarity of eigenvectors before and after CG. Eigenvalues and eigenvectors are well preserved. This guarantees a good accuracy of classification after reduction i λ i λ i (Lp i ) p i Lp i p i M. Farajtabar et al. for Online Semi-supervised Learning September 6, / 23

19 Online Classification Summary and Future Works Maximum buffer size is 200. New data arrive sequentially. Classification accuracy up to time t. Comparison with the baseline and state of the art work (graph quantization). Manifold structure and label information is considered in the process of data reduction. Accuracy baseline coarse graining 0.84graph quantization Time step Accuracy Time step Accuracy Time step M. Farajtabar et al. for Online Semi-supervised Learning September 6, / 23

20 Manifold Structure Preservation Summary and Future Works CG is done for 500 data points to reduce the data size to 100. Accuracy is averaged over 500 new data points added separately. Prevent new data points recover the manifold structure. An indication of how well the manifold structure is preserved in CG. Accuracy coarse graining 0.84 graph quantization Number of clusters Accuracy Number of clusters M. Farajtabar et al. for Online Semi-supervised Learning September 6, / 23

21 Outlier Robustness Outline Summary and Future Works Noise is added manually and classification accuracy is calculated. Outliers are generated by adding multiples of the data variance. In our method outliers are merged and their effect is reduced while in the graph quantization method separate clusters are devoted to outliers. Accuracy Outlier ratio M. Farajtabar et al. for Online Semi-supervised Learning September 6, / 23

22 Summary and Future Works Summary Semi-supervised CG algorithm is proposed to reduce the number of data points while preserving the manifold structure. The prediction result is closely related to the eigenvectors of a variation of the stochastic matrix. Exact and approximate coarse graining algorithms. Future works: A theoretical analysis of robustness against noise. Extending the spectral view point to other manifold learning methods. Deriving tighter error bounds on CG. M. Farajtabar et al. for Online Semi-supervised Learning September 6, / 23

23 Summary and Future Works Thanks for your Attention. M. Farajtabar et al. for Online Semi-supervised Learning September 6, / 23

Efficient Iterative Semi-Supervised Classification on Manifold

Efficient Iterative Semi-Supervised Classification on Manifold Efficient Iterative Semi-Supervised Classification on Manifold Mehrdad Farajtabar, Hamid R. Rabiee, Amirreza Shaban, Ali Soltani-Farani Digital Media Lab, AICTC Research Center, Department of Computer

More information

What is semi-supervised learning?

What is semi-supervised learning? What is semi-supervised learning? In many practical learning domains, there is a large supply of unlabeled data but limited labeled data, which can be expensive to generate text processing, video-indexing,

More information

L26: Advanced dimensionality reduction

L26: Advanced dimensionality reduction L26: Advanced dimensionality reduction The snapshot CA approach Oriented rincipal Components Analysis Non-linear dimensionality reduction (manifold learning) ISOMA Locally Linear Embedding CSCE 666 attern

More information

Graphs in Machine Learning

Graphs in Machine Learning Graphs in Machine Learning Michal Valko Inria Lille - Nord Europe, France TA: Pierre Perrault Partially based on material by: Mikhail Belkin, Jerry Zhu, Olivier Chapelle, Branislav Kveton October 30, 2017

More information

Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 1/31

Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 1/31 Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking Dengyong Zhou zhou@tuebingen.mpg.de Dept. Schölkopf, Max Planck Institute for Biological Cybernetics, Germany Learning from

More information

Non-linear Dimensionality Reduction

Non-linear Dimensionality Reduction Non-linear Dimensionality Reduction CE-725: Statistical Pattern Recognition Sharif University of Technology Spring 2013 Soleymani Outline Introduction Laplacian Eigenmaps Locally Linear Embedding (LLE)

More information

Face Recognition Using Laplacianfaces He et al. (IEEE Trans PAMI, 2005) presented by Hassan A. Kingravi

Face Recognition Using Laplacianfaces He et al. (IEEE Trans PAMI, 2005) presented by Hassan A. Kingravi Face Recognition Using Laplacianfaces He et al. (IEEE Trans PAMI, 2005) presented by Hassan A. Kingravi Overview Introduction Linear Methods for Dimensionality Reduction Nonlinear Methods and Manifold

More information

Spectral Clustering. Guokun Lai 2016/10

Spectral Clustering. Guokun Lai 2016/10 Spectral Clustering Guokun Lai 2016/10 1 / 37 Organization Graph Cut Fundamental Limitations of Spectral Clustering Ng 2002 paper (if we have time) 2 / 37 Notation We define a undirected weighted graph

More information

Semi-Supervised Learning by Multi-Manifold Separation

Semi-Supervised Learning by Multi-Manifold Separation Semi-Supervised Learning by Multi-Manifold Separation Xiaojin (Jerry) Zhu Department of Computer Sciences University of Wisconsin Madison Joint work with Andrew Goldberg, Zhiting Xu, Aarti Singh, and Rob

More information

Dimension Reduction Techniques. Presented by Jie (Jerry) Yu

Dimension Reduction Techniques. Presented by Jie (Jerry) Yu Dimension Reduction Techniques Presented by Jie (Jerry) Yu Outline Problem Modeling Review of PCA and MDS Isomap Local Linear Embedding (LLE) Charting Background Advances in data collection and storage

More information

Nonlinear Dimensionality Reduction

Nonlinear Dimensionality Reduction Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Kernel PCA 2 Isomap 3 Locally Linear Embedding 4 Laplacian Eigenmap

More information

Online Manifold Regularization: A New Learning Setting and Empirical Study

Online Manifold Regularization: A New Learning Setting and Empirical Study Online Manifold Regularization: A New Learning Setting and Empirical Study Andrew B. Goldberg 1, Ming Li 2, Xiaojin Zhu 1 1 Computer Sciences, University of Wisconsin Madison, USA. {goldberg,jerryzhu}@cs.wisc.edu

More information

Manifold Regularization

Manifold Regularization 9.520: Statistical Learning Theory and Applications arch 3rd, 200 anifold Regularization Lecturer: Lorenzo Rosasco Scribe: Hooyoung Chung Introduction In this lecture we introduce a class of learning algorithms,

More information

MLCC Clustering. Lorenzo Rosasco UNIGE-MIT-IIT

MLCC Clustering. Lorenzo Rosasco UNIGE-MIT-IIT MLCC 2018 - Clustering Lorenzo Rosasco UNIGE-MIT-IIT About this class We will consider an unsupervised setting, and in particular the problem of clustering unlabeled data into coherent groups. MLCC 2018

More information

Learning on Graphs and Manifolds. CMPSCI 689 Sridhar Mahadevan U.Mass Amherst

Learning on Graphs and Manifolds. CMPSCI 689 Sridhar Mahadevan U.Mass Amherst Learning on Graphs and Manifolds CMPSCI 689 Sridhar Mahadevan U.Mass Amherst Outline Manifold learning is a relatively new area of machine learning (2000-now). Main idea Model the underlying geometry of

More information

Nonlinear Dimensionality Reduction

Nonlinear Dimensionality Reduction Nonlinear Dimensionality Reduction Piyush Rai CS5350/6350: Machine Learning October 25, 2011 Recap: Linear Dimensionality Reduction Linear Dimensionality Reduction: Based on a linear projection of the

More information

Data Preprocessing Tasks

Data Preprocessing Tasks Data Tasks 1 2 3 Data Reduction 4 We re here. 1 Dimensionality Reduction Dimensionality reduction is a commonly used approach for generating fewer features. Typically used because too many features can

More information

Data Mining. Dimensionality reduction. Hamid Beigy. Sharif University of Technology. Fall 1395

Data Mining. Dimensionality reduction. Hamid Beigy. Sharif University of Technology. Fall 1395 Data Mining Dimensionality reduction Hamid Beigy Sharif University of Technology Fall 1395 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1395 1 / 42 Outline 1 Introduction 2 Feature selection

More information

Graphs in Machine Learning

Graphs in Machine Learning Graphs in Machine Learning Michal Valko INRIA Lille - Nord Europe, France Partially based on material by: Mikhail Belkin, Jerry Zhu, Olivier Chapelle, Branislav Kveton February 10, 2015 MVA 2014/2015 Previous

More information

Semi Supervised Distance Metric Learning

Semi Supervised Distance Metric Learning Semi Supervised Distance Metric Learning wliu@ee.columbia.edu Outline Background Related Work Learning Framework Collaborative Image Retrieval Future Research Background Euclidean distance d( x, x ) =

More information

Consensus Algorithms for Camera Sensor Networks. Roberto Tron Vision, Dynamics and Learning Lab Johns Hopkins University

Consensus Algorithms for Camera Sensor Networks. Roberto Tron Vision, Dynamics and Learning Lab Johns Hopkins University Consensus Algorithms for Camera Sensor Networks Roberto Tron Vision, Dynamics and Learning Lab Johns Hopkins University Camera Sensor Networks Motes Small, battery powered Embedded camera Wireless interface

More information

Beyond the Point Cloud: From Transductive to Semi-Supervised Learning

Beyond the Point Cloud: From Transductive to Semi-Supervised Learning Beyond the Point Cloud: From Transductive to Semi-Supervised Learning Vikas Sindhwani, Partha Niyogi, Mikhail Belkin Andrew B. Goldberg goldberg@cs.wisc.edu Department of Computer Sciences University of

More information

Unsupervised dimensionality reduction

Unsupervised dimensionality reduction Unsupervised dimensionality reduction Guillaume Obozinski Ecole des Ponts - ParisTech SOCN course 2014 Guillaume Obozinski Unsupervised dimensionality reduction 1/30 Outline 1 PCA 2 Kernel PCA 3 Multidimensional

More information

Classification Semi-supervised learning based on network. Speakers: Hanwen Wang, Xinxin Huang, and Zeyu Li CS Winter

Classification Semi-supervised learning based on network. Speakers: Hanwen Wang, Xinxin Huang, and Zeyu Li CS Winter Classification Semi-supervised learning based on network Speakers: Hanwen Wang, Xinxin Huang, and Zeyu Li CS 249-2 2017 Winter Semi-Supervised Learning Using Gaussian Fields and Harmonic Functions Xiaojin

More information

Spectral Clustering of Polarimetric SAR Data With Wishart-Derived Distance Measures

Spectral Clustering of Polarimetric SAR Data With Wishart-Derived Distance Measures Spectral Clustering of Polarimetric SAR Data With Wishart-Derived Distance Measures STIAN NORMANN ANFINSEN ROBERT JENSSEN TORBJØRN ELTOFT COMPUTATIONAL EARTH OBSERVATION AND MACHINE LEARNING LABORATORY

More information

Certifying the Global Optimality of Graph Cuts via Semidefinite Programming: A Theoretic Guarantee for Spectral Clustering

Certifying the Global Optimality of Graph Cuts via Semidefinite Programming: A Theoretic Guarantee for Spectral Clustering Certifying the Global Optimality of Graph Cuts via Semidefinite Programming: A Theoretic Guarantee for Spectral Clustering Shuyang Ling Courant Institute of Mathematical Sciences, NYU Aug 13, 2018 Joint

More information

How to learn from very few examples?

How to learn from very few examples? How to learn from very few examples? Dengyong Zhou Department of Empirical Inference Max Planck Institute for Biological Cybernetics Spemannstr. 38, 72076 Tuebingen, Germany Outline Introduction Part A

More information

Unsupervised Learning Techniques Class 07, 1 March 2006 Andrea Caponnetto

Unsupervised Learning Techniques Class 07, 1 March 2006 Andrea Caponnetto Unsupervised Learning Techniques 9.520 Class 07, 1 March 2006 Andrea Caponnetto About this class Goal To introduce some methods for unsupervised learning: Gaussian Mixtures, K-Means, ISOMAP, HLLE, Laplacian

More information

Analysis of Spectral Kernel Design based Semi-supervised Learning

Analysis of Spectral Kernel Design based Semi-supervised Learning Analysis of Spectral Kernel Design based Semi-supervised Learning Tong Zhang IBM T. J. Watson Research Center Yorktown Heights, NY 10598 Rie Kubota Ando IBM T. J. Watson Research Center Yorktown Heights,

More information

Predicting Graph Labels using Perceptron. Shuang Song

Predicting Graph Labels using Perceptron. Shuang Song Predicting Graph Labels using Perceptron Shuang Song shs037@eng.ucsd.edu Online learning over graphs M. Herbster, M. Pontil, and L. Wainer, Proc. 22nd Int. Conf. Machine Learning (ICML'05), 2005 Prediction

More information

A PROBABILISTIC INTERPRETATION OF SAMPLING THEORY OF GRAPH SIGNALS. Akshay Gadde and Antonio Ortega

A PROBABILISTIC INTERPRETATION OF SAMPLING THEORY OF GRAPH SIGNALS. Akshay Gadde and Antonio Ortega A PROBABILISTIC INTERPRETATION OF SAMPLING THEORY OF GRAPH SIGNALS Akshay Gadde and Antonio Ortega Department of Electrical Engineering University of Southern California, Los Angeles Email: agadde@usc.edu,

More information

Data Analysis and Manifold Learning Lecture 3: Graphs, Graph Matrices, and Graph Embeddings

Data Analysis and Manifold Learning Lecture 3: Graphs, Graph Matrices, and Graph Embeddings Data Analysis and Manifold Learning Lecture 3: Graphs, Graph Matrices, and Graph Embeddings Radu Horaud INRIA Grenoble Rhone-Alpes, France Radu.Horaud@inrialpes.fr http://perception.inrialpes.fr/ Outline

More information

Introduction to Machine Learning. PCA and Spectral Clustering. Introduction to Machine Learning, Slides: Eran Halperin

Introduction to Machine Learning. PCA and Spectral Clustering. Introduction to Machine Learning, Slides: Eran Halperin 1 Introduction to Machine Learning PCA and Spectral Clustering Introduction to Machine Learning, 2013-14 Slides: Eran Halperin Singular Value Decomposition (SVD) The singular value decomposition (SVD)

More information

Learning gradients: prescriptive models

Learning gradients: prescriptive models Department of Statistical Science Institute for Genome Sciences & Policy Department of Computer Science Duke University May 11, 2007 Relevant papers Learning Coordinate Covariances via Gradients. Sayan

More information

Semi-Supervised Learning

Semi-Supervised Learning Semi-Supervised Learning getting more for less in natural language processing and beyond Xiaojin (Jerry) Zhu School of Computer Science Carnegie Mellon University 1 Semi-supervised Learning many human

More information

Distance Metric Learning in Data Mining (Part II) Fei Wang and Jimeng Sun IBM TJ Watson Research Center

Distance Metric Learning in Data Mining (Part II) Fei Wang and Jimeng Sun IBM TJ Watson Research Center Distance Metric Learning in Data Mining (Part II) Fei Wang and Jimeng Sun IBM TJ Watson Research Center 1 Outline Part I - Applications Motivation and Introduction Patient similarity application Part II

More information

HYPERGRAPH BASED SEMI-SUPERVISED LEARNING ALGORITHMS APPLIED TO SPEECH RECOGNITION PROBLEM: A NOVEL APPROACH

HYPERGRAPH BASED SEMI-SUPERVISED LEARNING ALGORITHMS APPLIED TO SPEECH RECOGNITION PROBLEM: A NOVEL APPROACH HYPERGRAPH BASED SEMI-SUPERVISED LEARNING ALGORITHMS APPLIED TO SPEECH RECOGNITION PROBLEM: A NOVEL APPROACH Hoang Trang 1, Tran Hoang Loc 1 1 Ho Chi Minh City University of Technology-VNU HCM, Ho Chi

More information

LECTURE NOTE #11 PROF. ALAN YUILLE

LECTURE NOTE #11 PROF. ALAN YUILLE LECTURE NOTE #11 PROF. ALAN YUILLE 1. NonLinear Dimension Reduction Spectral Methods. The basic idea is to assume that the data lies on a manifold/surface in D-dimensional space, see figure (1) Perform

More information

Data dependent operators for the spatial-spectral fusion problem

Data dependent operators for the spatial-spectral fusion problem Data dependent operators for the spatial-spectral fusion problem Wien, December 3, 2012 Joint work with: University of Maryland: J. J. Benedetto, J. A. Dobrosotskaya, T. Doster, K. W. Duke, M. Ehler, A.

More information

Dimensionality Reduction

Dimensionality Reduction Dimensionality Reduction Le Song Machine Learning I CSE 674, Fall 23 Unsupervised learning Learning from raw (unlabeled, unannotated, etc) data, as opposed to supervised data where a classification of

More information

Semi-Supervised Learning with Graphs. Xiaojin (Jerry) Zhu School of Computer Science Carnegie Mellon University

Semi-Supervised Learning with Graphs. Xiaojin (Jerry) Zhu School of Computer Science Carnegie Mellon University Semi-Supervised Learning with Graphs Xiaojin (Jerry) Zhu School of Computer Science Carnegie Mellon University 1 Semi-supervised Learning classification classifiers need labeled data to train labeled data

More information

Lecture 10: Dimension Reduction Techniques

Lecture 10: Dimension Reduction Techniques Lecture 10: Dimension Reduction Techniques Radu Balan Department of Mathematics, AMSC, CSCAMM and NWC University of Maryland, College Park, MD April 17, 2018 Input Data It is assumed that there is a set

More information

Spectral Algorithms I. Slides based on Spectral Mesh Processing Siggraph 2010 course

Spectral Algorithms I. Slides based on Spectral Mesh Processing Siggraph 2010 course Spectral Algorithms I Slides based on Spectral Mesh Processing Siggraph 2010 course Why Spectral? A different way to look at functions on a domain Why Spectral? Better representations lead to simpler solutions

More information

Nonlinear Dimensionality Reduction. Jose A. Costa

Nonlinear Dimensionality Reduction. Jose A. Costa Nonlinear Dimensionality Reduction Jose A. Costa Mathematics of Information Seminar, Dec. Motivation Many useful of signals such as: Image databases; Gene expression microarrays; Internet traffic time

More information

Online Social Networks and Media. Link Analysis and Web Search

Online Social Networks and Media. Link Analysis and Web Search Online Social Networks and Media Link Analysis and Web Search How to Organize the Web First try: Human curated Web directories Yahoo, DMOZ, LookSmart How to organize the web Second try: Web Search Information

More information

March 13, Paper: R.R. Coifman, S. Lafon, Diffusion maps ([Coifman06]) Seminar: Learning with Graphs, Prof. Hein, Saarland University

March 13, Paper: R.R. Coifman, S. Lafon, Diffusion maps ([Coifman06]) Seminar: Learning with Graphs, Prof. Hein, Saarland University Kernels March 13, 2008 Paper: R.R. Coifman, S. Lafon, maps ([Coifman06]) Seminar: Learning with Graphs, Prof. Hein, Saarland University Kernels Figure: Example Application from [LafonWWW] meaningful geometric

More information

Statistical Machine Learning

Statistical Machine Learning Statistical Machine Learning Christoph Lampert Spring Semester 2015/2016 // Lecture 12 1 / 36 Unsupervised Learning Dimensionality Reduction 2 / 36 Dimensionality Reduction Given: data X = {x 1,..., x

More information

An Introduction to Spectral Graph Theory

An Introduction to Spectral Graph Theory An Introduction to Spectral Graph Theory Mackenzie Wheeler Supervisor: Dr. Gary MacGillivray University of Victoria WheelerM@uvic.ca Outline Outline 1. How many walks are there from vertices v i to v j

More information

Semi-Supervised Learning of Speech Sounds

Semi-Supervised Learning of Speech Sounds Aren Jansen Partha Niyogi Department of Computer Science Interspeech 2007 Objectives 1 Present a manifold learning algorithm based on locality preserving projections for semi-supervised phone classification

More information

An indicator for the number of clusters using a linear map to simplex structure

An indicator for the number of clusters using a linear map to simplex structure An indicator for the number of clusters using a linear map to simplex structure Marcus Weber, Wasinee Rungsarityotin, and Alexander Schliep Zuse Institute Berlin ZIB Takustraße 7, D-495 Berlin, Germany

More information

Lab 8: Measuring Graph Centrality - PageRank. Monday, November 5 CompSci 531, Fall 2018

Lab 8: Measuring Graph Centrality - PageRank. Monday, November 5 CompSci 531, Fall 2018 Lab 8: Measuring Graph Centrality - PageRank Monday, November 5 CompSci 531, Fall 2018 Outline Measuring Graph Centrality: Motivation Random Walks, Markov Chains, and Stationarity Distributions Google

More information

Automatic Subspace Learning via Principal Coefficients Embedding

Automatic Subspace Learning via Principal Coefficients Embedding IEEE TRANSACTIONS ON CYBERNETICS 1 Automatic Subspace Learning via Principal Coefficients Embedding Xi Peng, Jiwen Lu, Senior Member, IEEE, Zhang Yi, Fellow, IEEE and Rui Yan, Member, IEEE, arxiv:1411.4419v5

More information

Hou, Ch. et al. IEEE Transactions on Neural Networks March 2011

Hou, Ch. et al. IEEE Transactions on Neural Networks March 2011 Hou, Ch. et al. IEEE Transactions on Neural Networks March 2011 Semi-supervised approach which attempts to incorporate partial information from unlabeled data points Semi-supervised approach which attempts

More information

Graphs, Geometry and Semi-supervised Learning

Graphs, Geometry and Semi-supervised Learning Graphs, Geometry and Semi-supervised Learning Mikhail Belkin The Ohio State University, Dept of Computer Science and Engineering and Dept of Statistics Collaborators: Partha Niyogi, Vikas Sindhwani In

More information

Statistical Learning. Dong Liu. Dept. EEIS, USTC

Statistical Learning. Dong Liu. Dept. EEIS, USTC Statistical Learning Dong Liu Dept. EEIS, USTC Chapter 6. Unsupervised and Semi-Supervised Learning 1. Unsupervised learning 2. k-means 3. Gaussian mixture model 4. Other approaches to clustering 5. Principle

More information

Machine Learning for Data Science (CS4786) Lecture 11

Machine Learning for Data Science (CS4786) Lecture 11 Machine Learning for Data Science (CS4786) Lecture 11 Spectral clustering Course Webpage : http://www.cs.cornell.edu/courses/cs4786/2016sp/ ANNOUNCEMENT 1 Assignment P1 the Diagnostic assignment 1 will

More information

Statistical Pattern Recognition

Statistical Pattern Recognition Statistical Pattern Recognition Feature Extraction Hamid R. Rabiee Jafar Muhammadi, Alireza Ghasemi, Payam Siyari Spring 2014 http://ce.sharif.edu/courses/92-93/2/ce725-2/ Agenda Dimensionality Reduction

More information

A graph based approach to semi-supervised learning

A graph based approach to semi-supervised learning A graph based approach to semi-supervised learning 1 Feb 2011 Two papers M. Belkin, P. Niyogi, and V Sindhwani. Manifold regularization: a geometric framework for learning from labeled and unlabeled examples.

More information

LEC 5: Two Dimensional Linear Discriminant Analysis

LEC 5: Two Dimensional Linear Discriminant Analysis LEC 5: Two Dimensional Linear Discriminant Analysis Dr. Guangliang Chen March 8, 2016 Outline Last time: LDA/QDA (classification) Today: 2DLDA (dimensionality reduction) HW3b and Midterm Project 3 What

More information

Iterative Laplacian Score for Feature Selection

Iterative Laplacian Score for Feature Selection Iterative Laplacian Score for Feature Selection Linling Zhu, Linsong Miao, and Daoqiang Zhang College of Computer Science and echnology, Nanjing University of Aeronautics and Astronautics, Nanjing 2006,

More information

Focus was on solving matrix inversion problems Now we look at other properties of matrices Useful when A represents a transformations.

Focus was on solving matrix inversion problems Now we look at other properties of matrices Useful when A represents a transformations. Previously Focus was on solving matrix inversion problems Now we look at other properties of matrices Useful when A represents a transformations y = Ax Or A simply represents data Notion of eigenvectors,

More information

THE HIDDEN CONVEXITY OF SPECTRAL CLUSTERING

THE HIDDEN CONVEXITY OF SPECTRAL CLUSTERING THE HIDDEN CONVEXITY OF SPECTRAL CLUSTERING Luis Rademacher, Ohio State University, Computer Science and Engineering. Joint work with Mikhail Belkin and James Voss This talk A new approach to multi-way

More information

L11: Pattern recognition principles

L11: Pattern recognition principles L11: Pattern recognition principles Bayesian decision theory Statistical classifiers Dimensionality reduction Clustering This lecture is partly based on [Huang, Acero and Hon, 2001, ch. 4] Introduction

More information

Multiscale Wavelets on Trees, Graphs and High Dimensional Data

Multiscale Wavelets on Trees, Graphs and High Dimensional Data Multiscale Wavelets on Trees, Graphs and High Dimensional Data ICML 2010, Haifa Matan Gavish (Weizmann/Stanford) Boaz Nadler (Weizmann) Ronald Coifman (Yale) Boaz Nadler Ronald Coifman Motto... the relationships

More information

Semi-Supervised Learning with Graphs

Semi-Supervised Learning with Graphs Semi-Supervised Learning with Graphs Xiaojin (Jerry) Zhu LTI SCS CMU Thesis Committee John Lafferty (co-chair) Ronald Rosenfeld (co-chair) Zoubin Ghahramani Tommi Jaakkola 1 Semi-supervised Learning classifiers

More information

Beyond Scalar Affinities for Network Analysis or Vector Diffusion Maps and the Connection Laplacian

Beyond Scalar Affinities for Network Analysis or Vector Diffusion Maps and the Connection Laplacian Beyond Scalar Affinities for Network Analysis or Vector Diffusion Maps and the Connection Laplacian Amit Singer Princeton University Department of Mathematics and Program in Applied and Computational Mathematics

More information

Statistical Learning Notes III-Section 2.4.3

Statistical Learning Notes III-Section 2.4.3 Statistical Learning Notes III-Section 2.4.3 "Graphical" Spectral Features Stephen Vardeman Analytics Iowa LLC January 2019 Stephen Vardeman (Analytics Iowa LLC) Statistical Learning Notes III-Section

More information

Learning Eigenfunctions: Links with Spectral Clustering and Kernel PCA

Learning Eigenfunctions: Links with Spectral Clustering and Kernel PCA Learning Eigenfunctions: Links with Spectral Clustering and Kernel PCA Yoshua Bengio Pascal Vincent Jean-François Paiement University of Montreal April 2, Snowbird Learning 2003 Learning Modal Structures

More information

8.1 Concentration inequality for Gaussian random matrix (cont d)

8.1 Concentration inequality for Gaussian random matrix (cont d) MGMT 69: Topics in High-dimensional Data Analysis Falll 26 Lecture 8: Spectral clustering and Laplacian matrices Lecturer: Jiaming Xu Scribe: Hyun-Ju Oh and Taotao He, October 4, 26 Outline Concentration

More information

Spectral clustering. Two ideal clusters, with two points each. Spectral clustering algorithms

Spectral clustering. Two ideal clusters, with two points each. Spectral clustering algorithms A simple example Two ideal clusters, with two points each Spectral clustering Lecture 2 Spectral clustering algorithms 4 2 3 A = Ideally permuted Ideal affinities 2 Indicator vectors Each cluster has an

More information

Spectral Graph Theory Lecture 2. The Laplacian. Daniel A. Spielman September 4, x T M x. ψ i = arg min

Spectral Graph Theory Lecture 2. The Laplacian. Daniel A. Spielman September 4, x T M x. ψ i = arg min Spectral Graph Theory Lecture 2 The Laplacian Daniel A. Spielman September 4, 2015 Disclaimer These notes are not necessarily an accurate representation of what happened in class. The notes written before

More information

Machine Learning Ensemble Learning I Hamid R. Rabiee Jafar Muhammadi, Alireza Ghasemi Spring /

Machine Learning Ensemble Learning I Hamid R. Rabiee Jafar Muhammadi, Alireza Ghasemi Spring / Machine Learning Ensemble Learning I Hamid R. Rabiee Jafar Muhammadi, Alireza Ghasemi Spring 2015 http://ce.sharif.edu/courses/93-94/2/ce717-1 / Agenda Combining Classifiers Empirical view Theoretical

More information

Laplace-Beltrami Eigenfunctions for Deformation Invariant Shape Representation

Laplace-Beltrami Eigenfunctions for Deformation Invariant Shape Representation Laplace-Beltrami Eigenfunctions for Deformation Invariant Shape Representation Author: Raif M. Rustamov Presenter: Dan Abretske Johns Hopkins 2007 Outline Motivation and Background Laplace-Beltrami Operator

More information

Neural Network Training

Neural Network Training Neural Network Training Sargur Srihari Topics in Network Training 0. Neural network parameters Probabilistic problem formulation Specifying the activation and error functions for Regression Binary classification

More information

Clustering. CSL465/603 - Fall 2016 Narayanan C Krishnan

Clustering. CSL465/603 - Fall 2016 Narayanan C Krishnan Clustering CSL465/603 - Fall 2016 Narayanan C Krishnan ckn@iitrpr.ac.in Supervised vs Unsupervised Learning Supervised learning Given x ", y " "%& ', learn a function f: X Y Categorical output classification

More information

Graph-Based Semi-Supervised Learning

Graph-Based Semi-Supervised Learning Graph-Based Semi-Supervised Learning Olivier Delalleau, Yoshua Bengio and Nicolas Le Roux Université de Montréal CIAR Workshop - April 26th, 2005 Graph-Based Semi-Supervised Learning Yoshua Bengio, Olivier

More information

Spectral Generative Models for Graphs

Spectral Generative Models for Graphs Spectral Generative Models for Graphs David White and Richard C. Wilson Department of Computer Science University of York Heslington, York, UK wilson@cs.york.ac.uk Abstract Generative models are well known

More information

Numerical Analysis Lecture Notes

Numerical Analysis Lecture Notes Numerical Analysis Lecture Notes Peter J Olver 8 Numerical Computation of Eigenvalues In this part, we discuss some practical methods for computing eigenvalues and eigenvectors of matrices Needless to

More information

MATH 829: Introduction to Data Mining and Analysis Clustering II

MATH 829: Introduction to Data Mining and Analysis Clustering II his lecture is based on U. von Luxburg, A Tutorial on Spectral Clustering, Statistics and Computing, 17 (4), 2007. MATH 829: Introduction to Data Mining and Analysis Clustering II Dominique Guillot Departments

More information

Periodicity & State Transfer Some Results Some Questions. Periodic Graphs. Chris Godsil. St John s, June 7, Chris Godsil Periodic Graphs

Periodicity & State Transfer Some Results Some Questions. Periodic Graphs. Chris Godsil. St John s, June 7, Chris Godsil Periodic Graphs St John s, June 7, 2009 Outline 1 Periodicity & State Transfer 2 Some Results 3 Some Questions Unitary Operators Suppose X is a graph with adjacency matrix A. Definition We define the operator H X (t)

More information

Manifold Regularization

Manifold Regularization Manifold Regularization Vikas Sindhwani Department of Computer Science University of Chicago Joint Work with Mikhail Belkin and Partha Niyogi TTI-C Talk September 14, 24 p.1 The Problem of Learning is

More information

ECE 521. Lecture 11 (not on midterm material) 13 February K-means clustering, Dimensionality reduction

ECE 521. Lecture 11 (not on midterm material) 13 February K-means clustering, Dimensionality reduction ECE 521 Lecture 11 (not on midterm material) 13 February 2017 K-means clustering, Dimensionality reduction With thanks to Ruslan Salakhutdinov for an earlier version of the slides Overview K-means clustering

More information

EECS 275 Matrix Computation

EECS 275 Matrix Computation EECS 275 Matrix Computation Ming-Hsuan Yang Electrical Engineering and Computer Science University of California at Merced Merced, CA 95344 http://faculty.ucmerced.edu/mhyang Lecture 23 1 / 27 Overview

More information

PCA, Kernel PCA, ICA

PCA, Kernel PCA, ICA PCA, Kernel PCA, ICA Learning Representations. Dimensionality Reduction. Maria-Florina Balcan 04/08/2015 Big & High-Dimensional Data High-Dimensions = Lot of Features Document classification Features per

More information

Lecture: Local Spectral Methods (1 of 4)

Lecture: Local Spectral Methods (1 of 4) Stat260/CS294: Spectral Graph Methods Lecture 18-03/31/2015 Lecture: Local Spectral Methods (1 of 4) Lecturer: Michael Mahoney Scribe: Michael Mahoney Warning: these notes are still very rough. They provide

More information

CS 664 Segmentation (2) Daniel Huttenlocher

CS 664 Segmentation (2) Daniel Huttenlocher CS 664 Segmentation (2) Daniel Huttenlocher Recap Last time covered perceptual organization more broadly, focused in on pixel-wise segmentation Covered local graph-based methods such as MST and Felzenszwalb-Huttenlocher

More information

Discrete Signal Processing on Graphs: Sampling Theory

Discrete Signal Processing on Graphs: Sampling Theory IEEE TRANS. SIGNAL PROCESS. TO APPEAR. 1 Discrete Signal Processing on Graphs: Sampling Theory Siheng Chen, Rohan Varma, Aliaksei Sandryhaila, Jelena Kovačević arxiv:153.543v [cs.it] 8 Aug 15 Abstract

More information

Linear Classification. CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington

Linear Classification. CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington Linear Classification CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington 1 Example of Linear Classification Red points: patterns belonging

More information

Scientific Computing with Case Studies SIAM Press, Lecture Notes for Unit VII Sparse Matrix

Scientific Computing with Case Studies SIAM Press, Lecture Notes for Unit VII Sparse Matrix Scientific Computing with Case Studies SIAM Press, 2009 http://www.cs.umd.edu/users/oleary/sccswebpage Lecture Notes for Unit VII Sparse Matrix Computations Part 1: Direct Methods Dianne P. O Leary c 2008

More information

SVD, Power method, and Planted Graph problems (+ eigenvalues of random matrices)

SVD, Power method, and Planted Graph problems (+ eigenvalues of random matrices) Chapter 14 SVD, Power method, and Planted Graph problems (+ eigenvalues of random matrices) Today we continue the topic of low-dimensional approximation to datasets and matrices. Last time we saw the singular

More information

Linear Algebra: Matrix Eigenvalue Problems

Linear Algebra: Matrix Eigenvalue Problems CHAPTER8 Linear Algebra: Matrix Eigenvalue Problems Chapter 8 p1 A matrix eigenvalue problem considers the vector equation (1) Ax = λx. 8.0 Linear Algebra: Matrix Eigenvalue Problems Here A is a given

More information

Graph Partitioning Using Random Walks

Graph Partitioning Using Random Walks Graph Partitioning Using Random Walks A Convex Optimization Perspective Lorenzo Orecchia Computer Science Why Spectral Algorithms for Graph Problems in practice? Simple to implement Can exploit very efficient

More information

When Dictionary Learning Meets Classification

When Dictionary Learning Meets Classification When Dictionary Learning Meets Classification Bufford, Teresa 1 Chen, Yuxin 2 Horning, Mitchell 3 Shee, Liberty 1 Mentor: Professor Yohann Tendero 1 UCLA 2 Dalhousie University 3 Harvey Mudd College August

More information

2 Graphs, Diffusion Maps, and Semi-supervised Learning

2 Graphs, Diffusion Maps, and Semi-supervised Learning Graphs, Diffusion Maps, and Semi-supervised Learning.1 Graphs Graphs will be one of the main objects of study through these lectures, it is time ( ) to introduce them. V A graph G = (V, E) contains a set

More information

1 Graph Kernels by Spectral Transforms

1 Graph Kernels by Spectral Transforms Graph Kernels by Spectral Transforms Xiaojin Zhu Jaz Kandola John Lafferty Zoubin Ghahramani Many graph-based semi-supervised learning methods can be viewed as imposing smoothness conditions on the target

More information

Math 471 (Numerical methods) Chapter 3 (second half). System of equations

Math 471 (Numerical methods) Chapter 3 (second half). System of equations Math 47 (Numerical methods) Chapter 3 (second half). System of equations Overlap 3.5 3.8 of Bradie 3.5 LU factorization w/o pivoting. Motivation: ( ) A I Gaussian Elimination (U L ) where U is upper triangular

More information

Undirected Graphical Models

Undirected Graphical Models Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Properties Properties 3 Generative vs. Conditional

More information

Connection of Local Linear Embedding, ISOMAP, and Kernel Principal Component Analysis

Connection of Local Linear Embedding, ISOMAP, and Kernel Principal Component Analysis Connection of Local Linear Embedding, ISOMAP, and Kernel Principal Component Analysis Alvina Goh Vision Reading Group 13 October 2005 Connection of Local Linear Embedding, ISOMAP, and Kernel Principal

More information

Spectral Clustering. Zitao Liu

Spectral Clustering. Zitao Liu Spectral Clustering Zitao Liu Agenda Brief Clustering Review Similarity Graph Graph Laplacian Spectral Clustering Algorithm Graph Cut Point of View Random Walk Point of View Perturbation Theory Point of

More information

Clustering compiled by Alvin Wan from Professor Benjamin Recht s lecture, Samaneh s discussion

Clustering compiled by Alvin Wan from Professor Benjamin Recht s lecture, Samaneh s discussion Clustering compiled by Alvin Wan from Professor Benjamin Recht s lecture, Samaneh s discussion 1 Overview With clustering, we have several key motivations: archetypes (factor analysis) segmentation hierarchy

More information