A graph based approach to semi-supervised learning

1 A graph based approach to semi-supervised learning 1 Feb 2011

2 Two papers.
M. Belkin, P. Niyogi, and V. Sindhwani. Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. Journal of Machine Learning Research, 1-48.
M. Belkin and P. Niyogi. Towards a Theoretical Foundation for Laplacian Based Manifold Methods. Journal of Computer and System Sciences, 2007.

3 What is semi-supervised learning? Prediction, but with the help of unlabeled examples.

6 Why semi-supervised learning? Practical reasons: unlabeled data is cheap. It is also a more natural model of human learning.

7 An example

11 Semi-supervised learning framework 1. $l$ labeled examples $(x, y)$ generated by a distribution $P$. $u$ unlabeled examples drawn from the marginal $P_X$. Mercer kernel $K$. $f^* = \arg\min_{f \in \mathcal{H}_K} \frac{1}{l} \sum_{i=1}^{l} V(x_i, y_i, f) + \gamma \|f\|_K^2$

12 Semi-supervised learning framework 2. Classical representer theorem: $f^*(x) = \sum_{i=1}^{l} \alpha_i K(x_i, x)$
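The kernel expansion in the representer theorem can be sketched in a few lines of NumPy. This is only an illustration: the slides do not fix a kernel, so the Gaussian kernel, the bandwidth `sigma`, and the function names `gaussian_kernel` and `kernel_expansion` are assumptions made for this example, and the coefficients `alpha` are arbitrary numbers rather than the solution of any optimization problem.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    """Gram matrix K[i, j] = exp(-||A[i] - B[j]||^2 / (2 sigma^2)).

    The Gaussian kernel is an assumed choice of Mercer kernel.
    """
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))

def kernel_expansion(alpha, X_train, X_query, sigma=1.0):
    """Evaluate f(x) = sum_i alpha_i K(x_i, x) at each row of X_query."""
    return gaussian_kernel(X_query, X_train, sigma) @ alpha

# Three "labeled" points and arbitrary illustrative coefficients.
X_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
alpha = np.array([1.0, -0.5, 0.25])
X_query = np.array([[0.0, 0.0], [2.0, 2.0]])
f_values = kernel_expansion(alpha, X_train, X_query)
```

Whatever the loss $V$, the minimizer is determined by the $l$ numbers $\alpha_i$, so the search over the function space $\mathcal{H}_K$ becomes a finite-dimensional problem.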

14 Manifold regularization: assumptions. $P_X$ is supported on a manifold $M$. $P(y \mid x)$ varies smoothly along geodesics in the intrinsic geometry of $P_X$. Modified objective: $f^* = \arg\min_{f \in \mathcal{H}_K} \frac{1}{l} \sum_{i=1}^{l} V(x_i, y_i, f) + \gamma_A \|f\|_K^2 + \gamma_I \|f\|_I^2$

15 Manifold regularization: known marginal. Theorem: if $P_X$ is known and $M$ is a smooth Riemannian manifold, then $f^*(x) = \sum_{i=1}^{l} \alpha_i K(x_i, x) + \int_M \alpha(z) K(x, z) \, dP_X(z)$

19 Manifold regularization: unknown marginal. Need to estimate the marginal and $\|f\|_I$. This only requires unlabeled data. Natural choice: $\|f\|_I^2 = \int_M \|\nabla_M f\|^2 \, dP_X$. Approximate $M$ with a graph.

21 Manifold regularization: building the graph. Single-linkage clustering. Nearest neighbor methods. Use the graph Laplacian instead of the manifold Laplacian.

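22 Manifold regularization: using the graph. Theorem: by choosing exponential weights for the edges, the graph Laplacian converges to the manifold Laplacian in probability. $f^* = \arg\min_{f \in \mathcal{H}_K} \frac{1}{l} \sum_{i=1}^{l} V(x_i, y_i, f) + \gamma_A \|f\|_K^2 + \frac{\gamma_I}{(u+l)^2} \mathbf{f}^T L \mathbf{f}$, where $\mathbf{f} = (f(x_1), \dots, f(x_{l+u}))^T$ and $L = D - W$, with $W$ the matrix of edge weights and $D$ the diagonal matrix of its row sums.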
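A minimal sketch of the graph construction and the penalty $\mathbf{f}^T L \mathbf{f}$, under stated assumptions: the graph is a symmetrized k-nearest-neighbor graph, and the "exponential weights" are taken to be heat-kernel weights $W_{ij} = \exp(-\|x_i - x_j\|^2 / 4t)$. The function name `knn_graph_laplacian` and the parameters `n_neighbors` and `t` are hypothetical choices for this example, not values given in the slides.

```python
import numpy as np

def knn_graph_laplacian(X, n_neighbors=5, t=1.0):
    """Return (L, W) with L = D - W for a symmetrized kNN graph.

    Edge weights are W[i, j] = exp(-||x_i - x_j||^2 / (4 t)) on edges,
    0 elsewhere (an assumed form of the exponential weights).
    """
    n = X.shape[0]
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)

    # Exclude each point from its own neighbor list.
    masked = sq_dists.copy()
    np.fill_diagonal(masked, np.inf)
    neighbors = np.argsort(masked, axis=1)[:, :n_neighbors]

    adjacency = np.zeros((n, n), dtype=bool)
    rows = np.repeat(np.arange(n), n_neighbors)
    adjacency[rows, neighbors.ravel()] = True
    adjacency |= adjacency.T  # connect i and j if either is a neighbor of the other

    W = np.where(adjacency, np.exp(-sq_dists / (4.0 * t)), 0.0)
    D = np.diag(W.sum(axis=1))
    return D - W, W

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))   # labeled and unlabeled points together
f = rng.normal(size=40)        # values of some function f at those points
L, W = knn_graph_laplacian(X, n_neighbors=5, t=0.5)

# The quadratic form penalizes differences of f across heavy edges:
# f^T L f = (1/2) * sum_ij W_ij (f_i - f_j)^2
penalty = f @ L @ f
pairwise = 0.5 * np.sum(W * (f[:, None] - f[None, :]) ** 2)
```

The identity in the last comment is why $\mathbf{f}^T L \mathbf{f}$ acts as a smoothness penalty: it is small exactly when $f$ takes similar values on points joined by heavily weighted edges.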

23 Main result. Theorem: $f^*(x) = \sum_{i=1}^{l+u} \alpha_i K(x_i, x)$
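One consequence worth writing out: substituting this expansion into the graph-regularized objective turns it into a problem over the coefficient vector $\alpha \in \mathbb{R}^{l+u}$. The sketch below assumes that the loss depends on $f$ only through its value $f(x_i)$, and writes $K$ for the $(l+u) \times (l+u)$ Gram matrix over all labeled and unlabeled points, so that $\mathbf{f} = K\alpha$ and $\|f\|_K^2 = \alpha^T K \alpha$.

```latex
\[
\alpha^* = \arg\min_{\alpha \in \mathbb{R}^{l+u}}
  \frac{1}{l} \sum_{i=1}^{l} V\bigl(x_i, y_i, (K\alpha)_i\bigr)
  + \gamma_A \, \alpha^T K \alpha
  + \frac{\gamma_I}{(u+l)^2} \, \alpha^T K L K \alpha
\]
```

The unlabeled points enter only through $K$ and $L$; the loss term still sums over the $l$ labeled examples.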

27 Regularized least squares.
Classical RLS: $\arg\min_{f \in \mathcal{H}_K} \frac{1}{l} \sum_{i=1}^{l} (y_i - f(x_i))^2 + \lambda \|f\|_K^2$. Solution: $f^*(x) = \sum_{i=1}^{l} \alpha_i K(x_i, x)$, $\alpha = (K + \lambda l I)^{-1} Y$, with $K$ the $l \times l$ Gram matrix and $Y = (y_1, \dots, y_l)^T$.
Laplacian RLS: $\arg\min_{f \in \mathcal{H}_K} \frac{1}{l} \sum_{i=1}^{l} (y_i - f(x_i))^2 + \lambda_A \|f\|_K^2 + \frac{\lambda_I}{(u+l)^2} \mathbf{f}^T L \mathbf{f}$. Solution: $f^*(x) = \sum_{i=1}^{l+u} \alpha_i K(x, x_i)$, $\alpha = \left(JK + \lambda_A l I + \frac{\lambda_I l}{(u+l)^2} LK\right)^{-1} Y$. Here $K$ is the $(l+u) \times (l+u)$ Gram matrix, $J$ is the diagonal matrix with ones in its first $l$ diagonal entries and zeros elsewhere, and $Y = (y_1, \dots, y_l, 0, \dots, 0)^T$.
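Both closed-form solutions are short enough to sketch in NumPy. This is an illustration under assumptions rather than the authors' implementation: the Gaussian kernel, the fully connected graph with weights $\exp(-\|x_i - x_j\|^2 / 4t)$, the toy data, and all function names and parameter values are hypothetical choices for this example. Labeled points are assumed to come first in the stacked data matrix.

```python
import numpy as np

def rbf_gram(A, B, sigma=1.0):
    """Gaussian Gram matrix K[i, j] = exp(-||A[i] - B[j]||^2 / (2 sigma^2))."""
    sq = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=2)
    return np.exp(-sq / (2.0 * sigma**2))

def dense_graph_laplacian(X, t=1.0):
    """L = D - W for a fully connected graph, W_ij = exp(-||xi - xj||^2 / (4 t))."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    W = np.exp(-sq / (4.0 * t))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return np.diag(W.sum(axis=1)) - W

def classical_rls(K_labeled, y, lam):
    """alpha = (K + lam * l * I)^{-1} Y on the labeled points only."""
    l = y.shape[0]
    return np.linalg.solve(K_labeled + lam * l * np.eye(l), y)

def laplacian_rls(K, L, y_labeled, lam_A, lam_I):
    """alpha = (J K + lam_A l I + lam_I l / (u + l)^2 L K)^{-1} Y.

    K and L are (l + u) x (l + u); the first l points are the labeled ones.
    """
    n = K.shape[0]              # n = l + u
    l = y_labeled.shape[0]
    J = np.diag(np.concatenate([np.ones(l), np.zeros(n - l)]))
    Y = np.concatenate([y_labeled, np.zeros(n - l)])
    M = J @ K + lam_A * l * np.eye(n) + (lam_I * l / n**2) * (L @ K)
    return np.linalg.solve(M, Y)

# Toy data: one labeled point per cluster plus unlabeled points around each.
rng = np.random.default_rng(0)
X_labeled = np.array([[-2.0, 0.0], [2.0, 0.0]])
y_labeled = np.array([-1.0, 1.0])
X_unlabeled = np.vstack([
    rng.normal([-2.0, 0.0], 0.3, size=(10, 2)),
    rng.normal([2.0, 0.0], 0.3, size=(10, 2)),
])
X = np.vstack([X_labeled, X_unlabeled])

K = rbf_gram(X, X)
L = dense_graph_laplacian(X)
alpha = laplacian_rls(K, L, y_labeled, lam_A=0.01, lam_I=0.1)
f_all = K @ alpha  # fitted values at all l + u points
```

Setting `lam_I = 0` removes the graph term: the coefficients on the unlabeled points then vanish and the labeled coefficients coincide with `classical_rls`, which is a convenient sanity check on the formula.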

28 Support vector machines. As with regularized least squares, the SVM has a manifold-regularized version, the Laplacian SVM.

29 Two moons dataset

30 Wisconsin breast cancer data. 683 samples. Task: benign or malignant? Features: clump thickness, uniformity of cell size and shape, etc.

31 Wisconsin breast cancer data: results

32 Longer term stuff. Besides geometric structure, what else can we use? Invariance? Learning the manifold: a simplicial complex instead of a graph? Homology. Nice example in natural image statistics (Mumford et al., 2003).

33 Longer term stuff 2. Hickernell, Song, and Zhang. Reproducing kernel Banach spaces with the $\ell^1$ norm. Preprint. Reproducing kernel Banach spaces with the $\ell^1$ norm II: error analysis for regularized least squares regression. Preprint.
