Graphs in Machine Learning


1 Graphs in Machine Learning
Michal Valko, Inria Lille - Nord Europe, France
TA: Pierre Perrault
Partially based on material by: Mikhail Belkin, Jerry Zhu, Olivier Chapelle, Branislav Kveton
October 30, 2017, MVA 2017/2018

2 Previous Lecture
- recommendation on a bipartite graph
- resistive networks: recommendation score as a resistance?
- Laplacian and resistive networks
- resistance distance and random walks
- geometry of the data and the connectivity
- spectral clustering: connectivity vs. compactness
- MinCut, RatioCut, NCut
- spectral relaxations
- manifold learning with Laplacian eigenmaps

3 Previous Lab Session by Pierre Perrault
Content:
- graph construction
- test sensitivity to parameters: σ, k, ε
- spectral clustering
- spectral clustering vs. k-means
- image segmentation
Short written report (graded; all reports together are around 40% of the grade). Check the course website for the policies. Questions go to Piazza. Deadline: , 23:59

4 This Lecture
- manifold learning with Laplacian eigenmaps
- semi-supervised learning (SSL)
- inductive and transductive semi-supervised learning
- SSL with self-training
- SVMs and semi-supervised SVMs = TSVMs
- Gaussian random fields and harmonic solution
- graph-based semi-supervised learning
- transductive learning
- manifold regularization

5 Manifold Learning: Recap
Problem definition (dimensionality reduction / manifold learning): Given {x_i}_{i=1}^N from R^d, find {y_i}_{i=1}^N in R^m, where m ≪ d.
What do we know about dimensionality reduction?
- representation/visualization (2D or 3D)
- an old example: globe to a map
- often assuming the data lie on a manifold M ⊂ R^d
- feature extraction
- linear vs. nonlinear dimensionality reduction
What do we know about linear vs. nonlinear methods?
- linear: ICA, PCA, SVD, ...
- nonlinear methods often preserve only local distances

6 Manifold Learning: Linear vs. Non-linear

7 Manifold Learning: Preserving (just) local distances
d(y_i, y_j) = d(x_i, x_j) only if d(x_i, x_j) is small:
min Σ_{ij} w_ij ||y_i − y_j||²
Looks familiar?

8 Manifold Learning: Laplacian Eigenmaps
Step 1: Solve the generalized eigenproblem Lf = λDf.
Step 2: Assign m new coordinates: x_i ↦ (f_2(i), ..., f_{m+1}(i)).
Note 1: we need the m + 1 smallest eigenvectors.
Note 2: f_1 is useless (it is the constant eigenvector).
mbelkin/papers/lem_nc_03.pdf
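
To make the two steps concrete, here is a minimal numpy/scipy sketch; the k-NN graph with 0/1 weights and all function names are illustrative choices, not the reference implementation from the lecture:

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.neighbors import kneighbors_graph

def laplacian_eigenmaps(X, m=2, k=10):
    # Build a symmetric k-NN graph with 0/1 weights (one of many choices).
    W = kneighbors_graph(X, n_neighbors=k, mode='connectivity').toarray()
    W = np.maximum(W, W.T)                      # symmetrize
    D = np.diag(W.sum(axis=1))                  # degree matrix
    L = D - W                                   # unnormalized Laplacian
    # Step 1: generalized eigenproblem L f = lambda D f.
    eigvals, eigvecs = eigh(L, D)               # ascending eigenvalues
    # Step 2: skip the constant eigenvector f_1, keep f_2, ..., f_{m+1}.
    return eigvecs[:, 1:m + 1]

Y = laplacian_eigenmaps(np.random.rand(200, 10), m=2, k=8)  # 2D embedding
```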

9 Manifold Learning: Laplacian Eigenmaps to 1D
Laplacian Eigenmaps 1D objective:
min_f f^T L f  s.t.  f_i ∈ R,  f^T D 1 = 0,  f^T D f = 1
The meaning of the constraints is similar as for spectral clustering:
- f^T D f = 1 is for scaling
- f^T D 1 = 0 is to avoid the constant eigenvector v_1
What is the solution?

10 Manifold Learning: Example
laplacian-eigenmap- -diffusion-map- -manifold-learning

11 Semi-supervised learning: How is it possible?
This is how children learn! (hypothesis)

12 Semi-supervised learning (SSL)
SSL problem definition: Given {x_i}_{i=1}^N from R^d and labels {y_i}_{i=1}^{n_l}, with n_l ≪ N, find {y_i}_{i=n_l+1}^N (transductive) or find f predicting y well beyond that (inductive).
Some facts about SSL:
- assumes that the unlabeled data is useful
- works with data geometry assumptions: cluster assumption, low-density separation, manifold assumption, smoothness assumptions, generative models, ...
- now it helps, now it does not
- provable cases when it helps
- inductive or transductive/out-of-sample extension

13 SSL: Self-Training

14 SSL: Overview: Self-Training
Input: L = {x_i, y_i}_{i=1}^{n_l} and U = {x_i}_{i=n_l+1}^N
Repeat:
- train f using L
- apply f to (some of) U and add them to L
What are the properties of self-training?
- it is a wrapper method
- heavily depends on the internal classifier
- some theory exists for specific classifiers
- nobody uses it anymore
- errors propagate (unless the clusters are well separated)

15 SSL: Self-Training: Bad Case

16 SSL: Transductive SVM: S3VM

17 SSL: Transductive SVM: Classical SVM
Linear case: f = w^T x + b; we look for (w, b).
Max-margin classification:
max_{w,b} 1/||w||  s.t.  y_i(w^T x_i + b) ≥ 1,  i = 1, ..., n_l
Note the difference between the functional and the geometric margin.
Equivalently:
min_{w,b} ||w||²  s.t.  y_i(w^T x_i + b) ≥ 1,  i = 1, ..., n_l

18 SSL: Transductive SVM: Classical SVM
Max-margin classification, separable case:
min_{w,b} ||w||²  s.t.  y_i(w^T x_i + b) ≥ 1,  i = 1, ..., n_l
Max-margin classification, non-separable case:
min_{w,b} λ||w||² + Σ_i ξ_i  s.t.  y_i(w^T x_i + b) ≥ 1 − ξ_i,  ξ_i ≥ 0,  i = 1, ..., n_l

19 SSL: Transductive SVM: Classical SVM
Max-margin classification, non-separable case:
min_{w,b} λ||w||² + Σ_i ξ_i  s.t.  y_i(w^T x_i + b) ≥ 1 − ξ_i,  ξ_i ≥ 0,  i = 1, ..., n_l
Unconstrained formulation using the hinge loss:
min_{w,b} Σ_{i=1}^{n_l} max(1 − y_i(w^T x_i + b), 0) + λ||w||²
In general:
min_{w,b} Σ_{i=1}^{n_l} V(x_i, y_i, f(x_i)) + λΩ(f)
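
A few lines of numpy make the unconstrained hinge formulation concrete; svm_objective is a hypothetical helper name, not part of any library:

```python
import numpy as np

def svm_objective(w, b, X, y, lam):
    # Hinge loss sum_i max(1 - y_i (w^T x_i + b), 0) plus lambda ||w||^2.
    margins = y * (X @ w + b)
    return np.maximum(1.0 - margins, 0.0).sum() + lam * w @ w
```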

20 SSL: Transductive SVM: Classical SVM: Hinge loss
V(x_i, y_i, f(x_i)) = max(1 − y_i(w^T x_i + b), 0)

21 SSL: Transductive SVM: Unlabeled Examples
min_{w,b} Σ_{i=1}^{n_l} max(1 − y_i(w^T x_i + b), 0) + λ||w||²
How to incorporate unlabeled examples? There are no y's for the unlabeled x.
Prediction of f for (any) x: ŷ = sgn(f(x)) = sgn(w^T x + b)
Pretending that sgn(f(x)) is the true label:
V(x, ŷ, f(x)) = max(1 − ŷ(w^T x + b), 0)
= max(1 − sgn(w^T x + b)(w^T x + b), 0)
= max(1 − |w^T x + b|, 0)
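
Continuing the numpy sketch above, the hat loss in code; the absolute value is exactly the simplification derived on this slide (hat_loss is again a hypothetical helper):

```python
def hat_loss(w, b, X_u):
    # max(1 - |w^T x + b|, 0): small when f(x) is far from the boundary,
    # regardless of which side the unlabeled point falls on.
    f = X_u @ w + b
    return np.maximum(1.0 - np.abs(f), 0.0).sum()
```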

22 SSL: Transductive SVM: Hinge and Hat Loss
What is the difference in the objectives? Hinge loss penalizes? Hat loss penalizes?

23 SSL: Transductive SVM: S3VM
This is what we wanted!

24 SSL: Transductive SVM: Formulation
The main SVM idea stays the same: penalize the margin.
min_{w,b} Σ_{i=1}^{n_l} max(1 − y_i(w^T x_i + b), 0) + λ_1 ||w||² + λ_2 Σ_{i=n_l+1}^{n_l+n_u} max(1 − |w^T x_i + b|, 0)
What is the loss and what is the regularizer? Think of unlabeled data as regularizers for your classifiers!
Practical hint: additionally enforce the class balance.
What is the main issue of TSVM? recent advancements:
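
Assembling the two helpers sketched earlier gives the full S3VM objective. The code only evaluates it; minimizing it is the hard part, since the hat term makes the objective non-convex:

```python
def s3vm_objective(w, b, X_l, y_l, X_u, lam1, lam2):
    # Labeled hinge + lam1 ||w||^2 + lam2 * unlabeled hat loss.
    return svm_objective(w, b, X_l, y_l, lam1) + lam2 * hat_loss(w, b, X_u)
```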

25 SSL with Graphs: Prehistory
Blum/Chawla: Learning from Labeled and Unlabeled Data using Graph Mincuts
*following some insights from vision research in the 1980s

26 SSL with Graphs: MinCut
MinCut SSL: an idea similar to MinCut clustering. Where is the link? What is the formal statement?
We look for f(x) ∈ {±1}:
cut = Σ_{i,j=1}^{n_l+n_u} w_ij (f(x_i) − f(x_j))² = Ω(f)
Why (f(x_i) − f(x_j))² and not |f(x_i) − f(x_j)|?

27 SSL with Graphs: MinCut
We look for f(x) ∈ {±1}:
Ω(f) = Σ_{i,j=1}^{n_l+n_u} w_ij (f(x_i) − f(x_j))²
Clustering was unsupervised; here we have supervised data. Recall the general objective-function framework:
min_{w,b} Σ_{i=1}^{n_l} V(x_i, y_i, f(x_i)) + λΩ(f)
It would be nice to match the prediction on the labeled data:
V(x, y, f(x)) = Σ_{i=1}^{n_l} (f(x_i) − y_i)²

28 SSL with Graphs: MinCut
Final objective function:
min_{f ∈ {±1}^{n_l+n_u}} Σ_{i=1}^{n_l} (f(x_i) − y_i)² + λ Σ_{i,j=1}^{n_l+n_u} w_ij (f(x_i) − f(x_j))²
This is an integer program :( Can we solve it? Are we happy?
We need a better way to reflect the confidence.

29 SSL with Graphs: Harmonic Functions
Zhu/Ghahramani/Lafferty: Semi-Supervised Learning Using Gaussian Fields and Harmonic Functions
*a seminal paper that convinced people to use graphs for SSL
Idea 1: Look for a unique solution. Idea 2: Find a smooth one (the harmonic solution).
Harmonic SSL:
1) As before, we constrain f to match the supervised data:
f(x_i) = y_i,  i ∈ {1, ..., n_l}
2) We enforce the solution f to be harmonic:
f(x_i) = (Σ_{j∼i} f(x_j) w_ij) / (Σ_{j∼i} w_ij),  i ∈ {n_l + 1, ..., n_l + n_u}

30 SSL with Graphs: Harmonic Functions
The harmonic solution is obtained from the mincut one ...
min_{f ∈ {±1}^{n_l+n_u}} Σ_{i=1}^{n_l} (f(x_i) − y_i)² + λ Σ_{i,j=1}^{n_l+n_u} w_ij (f(x_i) − f(x_j))²
... if we just relax the integer constraints to be real ...
min_{f ∈ R^{n_l+n_u}} Σ_{i=1}^{n_l} (f(x_i) − y_i)² + λ Σ_{i,j=1}^{n_l+n_u} w_ij (f(x_i) − f(x_j))²
... or equivalently (note that f(x_i) = f_i) ...
min_{f ∈ R^{n_l+n_u}} Σ_{i,j=1}^{n_l+n_u} w_ij (f(x_i) − f(x_j))²  s.t.  y_i = f(x_i),  i = 1, ..., n_l

31 SSL with Graphs: Harmonic Functions
Properties of the relaxation from ±1 to R:
- there is a closed-form solution for f
- this solution is unique and globally optimal
- it is either constant or has its maximum/minimum on a boundary
- f(x_i) may not be discrete, but we can threshold it
- electric-network interpretation
- random-walk interpretation

32 SSL with Graphs: Harmonic Functions
Random-walk interpretation:
1) start from the vertex you want to label and walk randomly
2) P(j | i) = w_ij / Σ_k w_ik, i.e., P = D^{-1} W
3) finish when a labeled vertex is hit (absorbing random walk)
f_i = probability of reaching a positively labeled vertex

33 SSL with Graphs: Harmonic Functions
How to compute the HS? Option A: iteration/propagation.
Step 1: Set f(x_i) = y_i for i = 1, ..., n_l.
Step 2: Propagate iteratively (only for the unlabeled nodes):
f(x_i) ← (Σ_{j∼i} f(x_j) w_ij) / (Σ_{j∼i} w_ij),  i ∈ {n_l + 1, ..., n_l + n_u}
Properties:
- this will converge to the harmonic solution
- we can set the initial values for the unlabeled nodes arbitrarily
- an interesting option for large-scale data
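
A minimal sketch of Option A, assuming a dense numpy weight matrix W with the labeled nodes coming first (names are illustrative):

```python
import numpy as np

def propagate(W, y_l, n_iter=1000):
    n_l = len(y_l)
    f = np.zeros(W.shape[0])          # arbitrary init for unlabeled nodes
    f[:n_l] = y_l                     # Step 1: clamp the labels
    deg = W.sum(axis=1)               # node degrees (assume no isolated node)
    for _ in range(n_iter):
        f = (W @ f) / deg             # Step 2: weighted neighbor average
        f[:n_l] = y_l                 # re-clamp the labeled nodes
    return f
```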

34 SSL with Graphs: Harmonic Functions
How to compute the HS? Option B: closed-form solution.
Define f = (f(x_1), ..., f(x_{n_l+n_u})) = (f_1, ..., f_{n_l+n_u}).
Ω(f) = Σ_{i,j=1}^{n_l+n_u} w_ij (f(x_i) − f(x_j))² = f^T L f
L is an (n_l + n_u) × (n_l + n_u) matrix:
L = [ L_ll  L_lu ]
    [ L_ul  L_uu ]
How to solve this constrained minimization problem?
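
A quick numerical check of the identity above; with the standard Laplacian the double sum equals 2 f^T L f, so the slide's equality absorbs the factor of 2 into the convention:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.random((6, 6))
W = (W + W.T) / 2                 # symmetric toy weights
np.fill_diagonal(W, 0.0)          # no self-loops
L = np.diag(W.sum(axis=1)) - W    # unnormalized Laplacian
f = rng.standard_normal(6)
double_sum = sum(W[i, j] * (f[i] - f[j]) ** 2
                 for i in range(6) for j in range(6))
assert np.isclose(double_sum, 2 * f @ L @ f)
```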

35 SSL with Graphs: Harmonic Functions
Let us compute the harmonic solution using the harmonic property!
How did we formalize the harmonic property of a circuit? (Lf)_u = 0_u
In matrix notation:
[ L_ll  L_lu ] [ f_l ]   [ ... ]
[ L_ul  L_uu ] [ f_u ] = [ 0_u ]
f_l is constrained to be y_l, and for f_u we get
L_ul f_l + L_uu f_u = 0_u  ⟹  f_u = L_uu^{-1} (−L_ul f_l) = L_uu^{-1} (W_ul f_l).
Note that this does not depend on L_ll.
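
Option B as one linear solve (preferable to forming L_uu^{-1} explicitly); as before, the labeled nodes come first in W:

```python
import numpy as np

def harmonic_solution(W, y_l):
    n_l = len(y_l)
    L = np.diag(W.sum(axis=1)) - W
    # f_u = L_uu^{-1} (W_ul f_l), computed as a single linear solve.
    return np.linalg.solve(L[n_l:, n_l:], W[n_l:, :n_l] @ y_l)
```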

36 SSL with Graphs: Harmonic Functions
Can we see that this calculates the probability of a random walk?
f_u = L_uu^{-1} (−L_ul f_l) = L_uu^{-1} (W_ul f_l)
Note that P = D^{-1} W. Then, equivalently, f_u = (I − P_uu)^{-1} P_ul f_l.
Split the equation into its positive and negative parts:
f_i = [(I − P_uu)^{-1} P_ul f_l]_i
    = Σ_{j: y_j = +1} [(I − P_uu)^{-1} P_ul]_ij − Σ_{j: y_j = −1} [(I − P_uu)^{-1} P_ul]_ij
    = p_i^{(+1)} − p_i^{(−1)}
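
Continuing the sketch above, the absorbing-walk form is the same solve after degree normalization; on any connected toy graph it matches harmonic_solution:

```python
def harmonic_via_walk(W, y_l):
    n_l = len(y_l)
    P = W / W.sum(axis=1, keepdims=True)      # P = D^{-1} W
    n_u = W.shape[0] - n_l
    # f_u = (I - P_uu)^{-1} P_ul f_l
    return np.linalg.solve(np.eye(n_u) - P[n_l:, n_l:], P[n_l:, :n_l] @ y_l)
```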

37 SSL with Graphs: Regularized Harmonic Functions
f_i = p_i^{(+1)} − p_i^{(−1)}  ⟹  f_i = |f_i| · sgn(f_i), i.e., confidence × label.
What if a nasty outlier sneaks in? The prediction for the outlier can be hyperconfident :(
How to control the confidence of the inference? We add a sink to the graph: allow the random walk to die!
The sink is an artificial label node with value 0, and we connect it to every other vertex.
What will this do to our predictions? It depends on the weights on the edges.

38 SSL with Graphs: Regularized Harmonic Functions
How do we compute this regularized random walk? How does γ_g influence the HS?
f_u = (L_uu + γ_g I)^{-1} (W_ul f_l)
What happens to sneaky outliers?
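
Relative to the plain harmonic solve sketched earlier, the sink only adds γ_g to the diagonal of L_uu:

```python
def regularized_harmonic(W, y_l, gamma_g):
    n_l = len(y_l)
    n_u = W.shape[0] - n_l
    L = np.diag(W.sum(axis=1)) - W
    # f_u = (L_uu + gamma_g I)^{-1} (W_ul f_l)
    return np.linalg.solve(L[n_l:, n_l:] + gamma_g * np.eye(n_u),
                           W[n_l:, :n_l] @ y_l)
```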

39 SSL with Graphs: Harmonic Functions
Why don't we represent the sink in L explicitly? Formally, to get the harmonic solution on the graph with the sink ...
[ L_ll + γ_G I_{n_l}   L_lu                 −γ_G 1_{n_l}  ] [ f_l ]   [ ... ]
[ L_ul                 L_uu + γ_G I_{n_u}   −γ_G 1_{n_u}  ] [ f_u ] = [ 0_u ]
[ −γ_G 1_{n_l}^T       −γ_G 1_{n_u}^T       n γ_G         ] [ 0   ]   [ ... ]
... the middle block row gives L_ul f_l + (L_uu + γ_G I_{n_u}) f_u = 0_u ...
... which is the same if we disregard the last column and row ...
[ L_ll + γ_G I_{n_l}  L_lu               ] [ f_l ]   [ ... ]
[ L_ul                L_uu + γ_G I_{n_u} ] [ f_u ] = [ 0_u ]
... and therefore we simply add γ_G to the diagonal of L!

40 SSL with Graphs: Soft Harmonic Functions
Regularized HS objective with Q = L + γ_g I:
min_{f ∈ R^{n_l+n_u}} Σ_{i=1}^{n_l} (f(x_i) − y_i)² + λ f^T Q f
What if we do not really believe that f(x_i) = y_i for all i?
f* = arg min_{f ∈ R^N} (f − y)^T C (f − y) + f^T Q f
C is diagonal with C_ii = c_l for labeled examples and c_u otherwise.
y holds the true label for labeled examples and the pseudo-target y_i = 0 otherwise.

41 SSL with Graphs: Soft Harmonic Functions
f* = arg min_{f ∈ R^n} (f − y)^T C (f − y) + f^T Q f
Closed-form soft harmonic solution: f* = (C^{-1} Q + I)^{-1} y
What are the differences between hard and soft? Not much in practice, but there are provable generalization guarantees for the soft one.
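
A sketch of the soft solution in closed form; c_l, c_u, and γ_g are free parameters, and y carries 0 pseudo-targets on the unlabeled nodes as defined on the previous slide:

```python
import numpy as np

def soft_harmonic(W, y, is_labeled, gamma_g=0.01, c_l=1.0, c_u=1e-3):
    n = W.shape[0]
    Q = np.diag(W.sum(axis=1)) - W + gamma_g * np.eye(n)  # Q = L + gamma_g I
    c_inv = np.where(is_labeled, 1.0 / c_l, 1.0 / c_u)    # diagonal of C^{-1}
    # f* = (C^{-1} Q + I)^{-1} y
    return np.linalg.solve(c_inv[:, None] * Q + np.eye(n), y)
```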

42 SSL with Graphs: Regularized Harmonic Functions
Larger implications of random walks: the random walk relates to the commute distance, which should satisfy the following: vertices in the same cluster of the graph have a small commute distance, whereas two vertices in different clusters of the graph have a large commute distance.
Do we have this property for the HS? What if N → ∞?
Luxburg/Radl/Hein: Getting lost in space: Large sample analysis of the commute distance
people/luxburg/publications/luxburgradlhein2010_paperandsupplement.pdf
Solutions? 1) γ_g  2) amplified commute distance  3) L^p  4) L...
The goal of these solutions: make them remember!

43 SSL with Graphs: Out-of-sample extension
Both MinCut and HFS only infer the labels on the unlabeled data; they are transductive.
What if a new point x_{n_l+n_u+1} arrives? This is also called the out-of-sample extension.
Option 1) Add it to the graph and recompute the HFS.
Option 2) Make the algorithms inductive!
- Allow f to be defined everywhere: f: X → R.
- Allow f(x_i) ≠ y_i. Why? To deal with noise.
Solution: manifold regularization

44 SSL with Graphs: Manifold Regularization
General (S)SL objective:
min_f Σ_{i=1}^{n_l} V(x_i, y_i, f(x_i)) + λΩ(f)
We want to control f also for the out-of-sample data, i.e., everywhere:
Ω(f) = λ_2 f^T L f + λ_1 ∫_{x ∈ X} f(x)² dx
For general kernels:
min_{f ∈ H_K} Σ_{i=1}^{n_l} V(x_i, y_i, f(x_i)) + λ_1 ||f||_K² + λ_2 f^T L f

45 SSL with Graphs: Manifold Regularization
f* = arg min_{f ∈ H_K} Σ_{i=1}^{n_l} V(x_i, y_i, f(x_i)) + λ_1 ||f||_K² + λ_2 f^T L f
Representer theorem for manifold regularization: the minimizer f* has a finite expansion of the form
f*(x) = Σ_{i=1}^{n_l+n_u} α_i K(x, x_i)
V(x, y, f) = (y − f(x))²  →  LapRLS (Laplacian Regularized Least Squares)
V(x, y, f) = max(0, 1 − y f(x))  →  LapSVM (Laplacian Support Vector Machines)
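
For the squared loss, plugging the expansion into the objective reduces LapRLS to a linear system in α. A hedged sketch follows; the exact normalization of λ_1 and λ_2 (by n_l, n_u, etc.) varies across references, so treat the constants as illustrative:

```python
import numpy as np

def lap_rls(K, L, y, labeled, lam1, lam2):
    # Minimize sum over labeled (y_i - f(x_i))^2 + lam1 ||f||_K^2 + lam2 f^T L f
    # over f = sum_i alpha_i K(., x_i); y is 0 on the unlabeled points.
    n = K.shape[0]
    J = np.diag(labeled.astype(float))     # indicator of labeled points
    A = J @ K + lam1 * np.eye(n) + lam2 * L @ K
    return np.linalg.solve(A, J @ y)       # expansion coefficients alpha
```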

46 SSL with Graphs: Laplacian SVMs
f* = arg min_{f ∈ H_K} Σ_{i=1}^{n_l} max(0, 1 − y_i f(x_i)) + γ_A ||f||_K² + γ_I f^T L f
This allows us to learn a function in an RKHS, e.g., with RBF kernels.

47 SSL with Graphs: Laplacian SVMs

48 Checkpoint 1
Semi-supervised learning with graphs:
min_{f ∈ {±1}^{n_l+n_u}} Σ_{i=1}^{n_l} (f(x_i) − y_i)² + λ Σ_{i,j=1}^{n_l+n_u} w_ij (f(x_i) − f(x_j))²
Regularized harmonic solution: f_u = (L_uu + γ_g I)^{-1} (W_ul f_l)

49 Checkpoint 2
Unconstrained regularization in general:
f* = arg min_{f ∈ R^N} (f − y)^T C (f − y) + f^T Q f
Out-of-sample extension: Laplacian SVMs
f* = arg min_{f ∈ H_K} Σ_{i=1}^{n_l} max(0, 1 − y_i f(x_i)) + λ_1 ||f||_K² + λ_2 f^T L f

50 Michal Valko
ENS Paris-Saclay, MVA 2017/2018
SequeL team, Inria Lille - Nord Europe
