A graph based approach to semi-supervised learning


A graph based approach to semi-supervised learning (1 Feb 2011)

Two papers:
M. Belkin, P. Niyogi, and V. Sindhwani. Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. Journal of Machine Learning Research, 7:2399-2434, 2006.
M. Belkin and P. Niyogi. Towards a theoretical foundation for Laplacian-based manifold methods. Journal of Computer and System Sciences, 2007.

What is semi-supervised learning? Prediction, but with the help of unlabeled examples.

Why semi-supervised learning?
- Practical reasons: unlabeled data is cheap
- A more natural model of human learning

An example


Semi-supervised learning framework 1
- l labeled examples (x, y) generated by a distribution P
- u unlabeled examples drawn from the marginal P_X
- Mercer kernel K

f^* = \operatorname*{argmin}_{f \in H_K} \frac{1}{l} \sum_{i=1}^{l} V(x_i, y_i, f) + \gamma \|f\|_K^2

Semi-supervised learning framework 2
Classical representer theorem:

f^*(x) = \sum_{i=1}^{l} \alpha_i K(x_i, x)
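As a concrete instance of the framework, here is a minimal numpy sketch of the minimizer for the square loss (the regularized least squares case treated later in the talk). The Gaussian kernel, its bandwidth, and the regularization value are illustrative assumptions, not choices made in the talk.

```python
import numpy as np

def rbf_kernel(A, B, sigma=0.3):
    """Gaussian (RBF) Mercer kernel -- an illustrative choice; the slides
    leave K abstract."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def rls_fit(X, y, gamma=1e-4, sigma=0.3):
    """For the square loss V, the minimizer of the objective has the
    closed form alpha = (K + gamma * l * I)^{-1} y, and by the representer
    theorem f*(x) = sum_i alpha_i K(x_i, x)."""
    l = len(X)
    K = rbf_kernel(X, X, sigma)
    return np.linalg.solve(K + gamma * l * np.eye(l), y)

def rls_predict(alpha, X_train, X_new, sigma=0.3):
    # Evaluate the kernel expansion f*(x) = sum_i alpha_i K(x_i, x).
    return rbf_kernel(X_new, X_train, sigma) @ alpha

# Fit a smooth 1-D function from a handful of labeled points.
X = np.linspace(0.0, 1.0, 10)[:, None]
y = np.sin(2 * np.pi * X[:, 0])
alpha = rls_fit(X, y)
y_hat = rls_predict(alpha, X, X)
```

With a small gamma the fitted values stay close to the labels, as expected for a lightly regularized interpolant.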

Manifold regularization: assumptions
Assumptions:
- P is supported on a manifold M
- P(y|x) varies smoothly along geodesics in the intrinsic geometry of P_X

Modified objective:

f^* = \operatorname*{argmin}_{f \in H_K} \frac{1}{l} \sum_{i=1}^{l} V(x_i, y_i, f) + \gamma_A \|f\|_K^2 + \gamma_I \|f\|_I^2

Manifold regularization: known marginal
Theorem. If P_X is known and M is a smooth Riemannian manifold, then

f^*(x) = \sum_{i=1}^{l} \alpha_i K(x_i, x) + \int_M \alpha(z) K(x, z) \, dP_X(z)

Manifold regularization: unknown marginal
- Need to estimate the marginal and \|f\|_I
- Only requires unlabeled data
- Natural choice: \|f\|_I^2 = \int_M \|\nabla_M f\|^2 \, dP_X
- Approximate M with a graph

Manifold regularization: building the graph
- Single-linkage clustering
- Nearest neighbor methods
- Use the graph Laplacian instead of the manifold Laplacian

Manifold regularization: using the graph
Theorem. By choosing exponential weights for the edges, the graph Laplacian converges to the manifold Laplacian in probability.

f^* = \operatorname*{argmin}_{f \in H_K} \frac{1}{l} \sum_{i=1}^{l} V(x_i, y_i, f) + \gamma_A \|f\|_K^2 + \frac{\gamma_I}{(u+l)^2} f^T L f

where L = D - W.
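The graph construction behind this theorem can be sketched in a few lines of numpy. The k-nearest-neighbor rule, the value of k, and the heat-kernel parameter t are illustrative assumptions; the slides only require exponential edge weights.

```python
import numpy as np

def graph_laplacian(X, k=5, t=1.0):
    """Build a k-nearest-neighbor graph with exponential (heat-kernel)
    edge weights W_ij = exp(-||x_i - x_j||^2 / t), symmetrize it,
    and return the unnormalized graph Laplacian L = D - W."""
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]   # nearest neighbors, self excluded
        W[i, nbrs] = np.exp(-d2[i, nbrs] / t)
    W = np.maximum(W, W.T)                  # keep an edge if either endpoint chose it
    D = np.diag(W.sum(axis=1))
    return D - W

# Basic Laplacian identities: L annihilates constant vectors, and
# f^T L f = (1/2) * sum_ij W_ij (f_i - f_j)^2 >= 0 for any f.
X = np.random.default_rng(0).normal(size=(30, 2))
L = graph_laplacian(X)
f = np.random.default_rng(1).normal(size=30)
quad = float(f @ L @ f)
```

The quadratic form f^T L f is exactly the smoothness penalty used in the modified objective above.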

Main result
Theorem. f^*(x) = \sum_{i=1}^{l+u} \alpha_i K(x_i, x)

Regularized least squares
Classical RLS:

\operatorname*{argmin}_{f \in H_K} \frac{1}{l} \sum_{i=1}^{l} (y_i - f(x_i))^2 + \lambda \|f\|_K^2

Solution: f^*(x) = \sum_{i=1}^{l} \alpha_i K(x_i, x), with \alpha = (K + \lambda l I)^{-1} Y.

Laplacian RLS:

\operatorname*{argmin}_{f \in H_K} \frac{1}{l} \sum_{i=1}^{l} (y_i - f(x_i))^2 + \lambda_A \|f\|_K^2 + \frac{\lambda_I}{(u+l)^2} f^T L f

Solution: f^*(x) = \sum_{i=1}^{l+u} \alpha_i K(x_i, x), with \alpha = (J K + \lambda_A l I + \frac{\lambda_I l}{(u+l)^2} L K)^{-1} Y, where J is the diagonal matrix whose first l diagonal entries are 1 and whose remaining u entries are 0.
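The Laplacian RLS solution transcribes directly into numpy. The Gaussian kernel and the regularization values are illustrative assumptions; note that with lam_I = 0 the formula reduces to classical RLS on the labeled points, which gives a handy sanity check.

```python
import numpy as np

def rbf_kernel(A, B, sigma=0.5):
    # Illustrative Gaussian kernel.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def laprls_fit(X, y_labeled, L, lam_A=1e-3, lam_I=1e-2, sigma=0.5):
    """Laplacian RLS over n = l + u points, the first l of them labeled:
    alpha = (J K + lam_A*l*I + lam_I*l/(u+l)^2 * L K)^{-1} Y."""
    n, l = len(X), len(y_labeled)
    K = rbf_kernel(X, X, sigma)
    J = np.diag(np.r_[np.ones(l), np.zeros(n - l)])   # selects labeled points
    Y = np.r_[y_labeled, np.zeros(n - l)]
    A = J @ K + lam_A * l * np.eye(n) + (lam_I * l / n ** 2) * (L @ K)
    alpha = np.linalg.solve(A, Y)
    return alpha, K

# Sanity check: with lam_I = 0 the unlabeled coefficients vanish and the
# labeled block solves the classical RLS system (K_ll + lam_A*l*I) a = y.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 2))
y = np.sign(rng.normal(size=6))
alpha, K = laprls_fit(X, y, L=np.zeros((12, 12)), lam_I=0.0)
alpha_cl = np.linalg.solve(K[:6, :6] + 1e-3 * 6 * np.eye(6), y)
```

The reduction holds because the unlabeled rows of J K are zero, forcing those coefficients to zero, after which the labeled block is exactly the classical system.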

Support vector machines. As with regularized least squares, there is a Laplacian version of the SVM, the Laplacian SVM.

Two moons dataset
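The two-moons picture can be reproduced numerically. Below is a self-contained sketch, with all parameter values as illustrative assumptions: two interleaved noise-free arcs, one labeled point per class, a kNN heat-kernel graph, and the Laplacian RLS solution from the earlier slides. The intrinsic penalty propagates the two labels along the arcs, which a purely supervised fit on two points cannot do.

```python
import numpy as np

def rbf_kernel(A, B, sigma=0.3):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def graph_laplacian(X, k=3, t=0.1):
    # k-NN graph with heat-kernel weights; L = D - W.
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]
        W[i, nbrs] = np.exp(-d2[i, nbrs] / t)
    W = np.maximum(W, W.T)
    return np.diag(W.sum(1)) - W

# Two moons: 40 points per arc, no noise; only the first point of each
# arc is labeled (+1 outer, -1 inner).
t = np.linspace(0.0, np.pi, 40)
outer = np.c_[np.cos(t), np.sin(t)]
inner = np.c_[1.0 - np.cos(t), 0.5 - np.sin(t)]
X = np.r_[outer[:1], inner[:1], outer[1:], inner[1:]]  # labeled points first
y_labeled = np.array([1.0, -1.0])
y_true = np.r_[1.0, -1.0, np.ones(39), -np.ones(39)]

n, l = len(X), 2
K = rbf_kernel(X, X)
L = graph_laplacian(X)
J = np.diag(np.r_[np.ones(l), np.zeros(n - l)])
Y = np.r_[y_labeled, np.zeros(n - l)]
lam_A, lam_I = 1e-4, float(n ** 2) / l   # intrinsic weight ~1 after normalization
A = J @ K + lam_A * l * np.eye(n) + (lam_I * l / n ** 2) * (L @ K)
alpha = np.linalg.solve(A, Y)
accuracy = float(np.mean(np.sign(K @ alpha) == y_true))
```

Because the kNN graph stays within each arc (the arcs are about 0.5 apart, the within-arc spacing about 0.08), the graph has two components and the smoothness penalty spreads each label across its component.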

Wisconsin breast cancer data
- 683 samples. Benign or malignant?
- Features: clump thickness, uniformity of cell size and shape, etc.

Wisconsin breast cancer data: results

Longer term stuff
- Besides geometric structure, what else can we use? Invariance?
- Learning the manifold: a simplicial complex instead of a graph? Homology.
- Nice example in natural image statistics (Mumford et al., 2003)

Longer term stuff 2
- Hickernell, Song, and Zhang. Reproducing kernel Banach spaces with the l1 norm. Preprint.
- Reproducing kernel Banach spaces with the l1 norm II: error analysis for regularized least squares regression. Preprint.