Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 1/31

Size: px
Start display at page:

Download "Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 1/31"

Transcription

1 Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking Dengyong Zhou Dept. Schölkopf, Max Planck Institute for Biological Cybernetics, Germany Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 1/31

2 Learning from Examples Input space X, and output space Y = {1, 1}. Training set S = {z 1 = (x 1, y 1 ),..., z l = (x l, y l )} in Z = X Y drawn i.i.d. from some unknown distribution. Classifier f : X Y. Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 2/31

3 Transductive Setting Input space X = {x 1,..., x n }, and output space Y = {1, 1}. Training set S = {z 1 = (x 1, y 1 ),..., z l = (x l, y l )}. Classifier f : X Y. Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 3/31

4 Intuition about classification: Manifold Local consistency. Nearby points are likely to have the same label. Global consistency. Points on the same structure (typically referred to as a cluster or manifold) are likely to have the same label. Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 4/31

5 A Toy Dataset (Two Moons) (a) Toy Data (Two Moons) unlabeled point labeled point 1 labeled point (b) SVM (RBF Kernel) (c) k NN 1.5 (d) Ideal Classification Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 5/31

6 Algorithm 1. Form the affinity matrix W defined by W ij = exp( x i x j 2 /2σ 2 ) if i j and W ii = Construct the matrix S = D 1/2 W D 1/2 in which D is a diagonal matrix with its (i, i)-element equal to the sum of the i-th row of W. 3. Iterate f(t + 1) = αsf(t) + (1 α)y until convergence, where α is a parameter in (0, 1). 4. Let f denote the limit of the sequence {f(t)}. Label each point x i as y i = sgn(f i ). Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 6/31

7 Convergence Theorem. The sequence {f(t)} converges to f = β(i αs) 1 y, where β = 1 α. Proof. Suppose F (0) = Y. By the iteration equation, we have t 1 f(t) = (αs) t 1 Y + (1 α) (αs) i Y. (1) i=0 Since 0 < α < 1 and the eigenvalues of S in [ 1, 1], lim t (αs)t 1 = 0, and lim t t 1 i=0 (αs) i = (I αs) 1. (2) Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 7/31

8 Regularization Framework Cost function Q(f) = 1 [ n 2 W ij ( 1 Dii f i 1 Djj f j ) 2 + µ n ( fi y i ) 2 ] i,j=1 i=1 Smoothness term. Measure the changes between nearby points. Fitting term. Measure the changes from the initial label assignments. Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 8/31

9 Regularization Framework Theorem. f = arg min f F Q(f). Proof. Differentiating Q(f) with respect to f, we have Q f = f Sf + µ(f y) = 0, (1) f=f which can be transformed into f µ Sf µ y = 0. (2) 1 + µ Let α = 1/(1 + µ) and β = µ/(1 + µ). Then (I αs)f = βy. (3) Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 9/31

10 Two Variants Substitute P = D 1 W for S in the iteration equation. Then f = (I αp ) 1 y. Replace S with P T, the transpose of P. Then f = (I αp T ) 1 y, which is equivalent to f = (D αw ) 1 y. Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 10/31

11 Toy Problem 1.5 (a) t = (b) t = (c) t = (d) t = Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 11/31

12 Toy Problem Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 12/31

13 Handwritten Digit Recognition (USPS) k NN (k = 1) SVM (RBF kernel) consistency variant (1) variant (2) 0.3 test error # labeled points Dimension: 16x16. Size: (α = 0.95) Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 13/31

14 Handwritten Digit Recognition (USPS) consistency variant (1) variant (2) test error values of parameter α Size of labeled data: l = 50. Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 14/31

15 Text Classification (20-newsgroups) k NN (k = 1) SVM (RBF kernel) consistency variant (1) variant (2) 0.6 test error # labeled points Dimension: Size: (α = 0.95) Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 15/31

16 Text Classification (20-newsgroups) consistency variant (1) variant (2) 0.4 test error values of parameter α Size of labeled data: l = 50. Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 16/31

17 Spectral Graph Theory Normalized graph Laplacian = D 1/2 (D W )D 1/2. Linear operator on the space of functions defined on the Graph. Theorem. i,j W ij ( ) 2 1 Dii f i 1 f j = f, f. Djj Discrete analogy of Laplace-Beltrami operator on Riemannian Manifold which satisfies f 2 = (f)f. M M Discrete Laplace equation f = y. Green s function G =. Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 17/31

18 Reversible Markov Chains Lazy random walk defined by the transition probability matrix P = (1 α)i + αd 1 W, α (0, 1). Hitting time H ij = E{ number of steps required for a random walk to reach a position x j with an initial position x i }. Commute time C ij = H ij + H ji. Theorem. Let L = (D αw ) 1. Then C ij L ii + L jj L ij L ji Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 18/31

19 Ranking Problem Problem setting. Given a set of point X = {x 1,..., x q, x q+1,..., x n } R m, the first q points are the queries. The task is to rank the remaining points according to their relevances to the queries. Examples. Image, document, movie, book, protein ("killer application"),... Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 19/31

20 Intuition of Ranking: Manifold (a) Two moons ranking problem (b) Ideal ranking query The relevant degrees of points in the upper moon to the query should decrease along the moon shape. All points in the upper moon should be more relevant to the query than the points in the lower moon. Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 20/31

21 Toy Ranking (a) Connected graph Simply ranking the data according to the shortest paths on the graph does not work well. Robust solution is to assemble all paths between two points: f = i αi S i y. Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 21/31

22 Toy Ranking Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 22/31

23 Connection to Google Theorem. For the task of ranking data represented by a connected and undirected graph without queries, f and PageRank yield the same ranking list. Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 23/31

24 Personalized Google: a variant The ranking scores given by PageRank: π(t + 1) = αp T π(t). (4) Add a query term on the right-hand side for the query-based ranking, π(t + 1) = αp T π(t) + (1 α)y. (5) This can be viewed as the personalized version of PageRank. Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 24/31

25 Image Ranking The top-left digit in each panel is the query. The left panel shows the top 99 by our method; and the right panel shows the top 99 by the Euclidean distance. Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 25/31

26 Document Ranking 0.9 (a) autos 0.9 (b) motorcycles inner product inner product inner product manifold ranking (c) baseball manifold ranking inner product manifold ranking (d) hockey manifold ranking Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 26/31

27 Related Work Graph/disffusion/cluster kernel (Kondor et al 2002; Chapelle et al. 2002; Smola et al. 2003). Spectral clustering (Shi et al. 1997; Ng et al. 2001). Manifold learning (nonlinear data reduction)(tenenbaum et al. 2000; Roweis et al. 2000) Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 27/31

28 Related Work Random walks (Szummer et al. 2001). Graph min-cuts (Blum et al. 2001) Learning on manifolds (Belkin et al. 2001). Gaussian random fields (Zhu et al. 2003). Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 28/31

29 Conclusion Proposed a general semi-supervised learning algorithm. Proposed a general example-based ranking algorithm. Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 29/31

30 Next Work Model selection. Active learning. Generalization theory of learning from labeled and unlabeled data. Specifical problems & large-scale problems. Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 30/31

31 References 1. Zhou, D., Bousquet, O., Lal, T.N., Weston, J. and Schölkopf, B.: Learning with Local and Global Consistency. NIPS, Zhou, D., Weston, J., Gretton, A., Bousquet, O. and Schölkopf, B.: Ranking on Data Manifolds. NIPS, Weston, J., C. Leslie, D. Zhou, A. Elisseeff and W. S. Noble: Semi-Supervised Protein Classification using Cluster Kernels. NIPS, Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 31/31

What is semi-supervised learning?

What is semi-supervised learning? What is semi-supervised learning? In many practical learning domains, there is a large supply of unlabeled data but limited labeled data, which can be expensive to generate text processing, video-indexing,

More information

How to learn from very few examples?

How to learn from very few examples? How to learn from very few examples? Dengyong Zhou Department of Empirical Inference Max Planck Institute for Biological Cybernetics Spemannstr. 38, 72076 Tuebingen, Germany Outline Introduction Part A

More information

Discrete vs. Continuous: Two Sides of Machine Learning

Discrete vs. Continuous: Two Sides of Machine Learning Discrete vs. Continuous: Two Sides of Machine Learning Dengyong Zhou Department of Empirical Inference Max Planck Institute for Biological Cybernetics Spemannstr. 38, 72076 Tuebingen, Germany Oct. 18,

More information

A Regularization Framework for Learning from Graph Data

A Regularization Framework for Learning from Graph Data A Regularization Framework for Learning from Graph Data Dengyong Zhou Max Planck Institute for Biological Cybernetics Spemannstr. 38, 7076 Tuebingen, Germany Bernhard Schölkopf Max Planck Institute for

More information

Analysis of Spectral Kernel Design based Semi-supervised Learning

Analysis of Spectral Kernel Design based Semi-supervised Learning Analysis of Spectral Kernel Design based Semi-supervised Learning Tong Zhang IBM T. J. Watson Research Center Yorktown Heights, NY 10598 Rie Kubota Ando IBM T. J. Watson Research Center Yorktown Heights,

More information

Cluster Kernels for Semi-Supervised Learning

Cluster Kernels for Semi-Supervised Learning Cluster Kernels for Semi-Supervised Learning Olivier Chapelle, Jason Weston, Bernhard Scholkopf Max Planck Institute for Biological Cybernetics, 72076 Tiibingen, Germany {first. last} @tuebingen.mpg.de

More information

Active and Semi-supervised Kernel Classification

Active and Semi-supervised Kernel Classification Active and Semi-supervised Kernel Classification Zoubin Ghahramani Gatsby Computational Neuroscience Unit University College London Work done in collaboration with Xiaojin Zhu (CMU), John Lafferty (CMU),

More information

Semi-Supervised Learning with Graphs. Xiaojin (Jerry) Zhu School of Computer Science Carnegie Mellon University

Semi-Supervised Learning with Graphs. Xiaojin (Jerry) Zhu School of Computer Science Carnegie Mellon University Semi-Supervised Learning with Graphs Xiaojin (Jerry) Zhu School of Computer Science Carnegie Mellon University 1 Semi-supervised Learning classification classifiers need labeled data to train labeled data

More information

Semi-Supervised Learning

Semi-Supervised Learning Semi-Supervised Learning getting more for less in natural language processing and beyond Xiaojin (Jerry) Zhu School of Computer Science Carnegie Mellon University 1 Semi-supervised Learning many human

More information

Graphs, Geometry and Semi-supervised Learning

Graphs, Geometry and Semi-supervised Learning Graphs, Geometry and Semi-supervised Learning Mikhail Belkin The Ohio State University, Dept of Computer Science and Engineering and Dept of Statistics Collaborators: Partha Niyogi, Vikas Sindhwani In

More information

Semi-Supervised Learning with Graphs

Semi-Supervised Learning with Graphs Semi-Supervised Learning with Graphs Xiaojin (Jerry) Zhu LTI SCS CMU Thesis Committee John Lafferty (co-chair) Ronald Rosenfeld (co-chair) Zoubin Ghahramani Tommi Jaakkola 1 Semi-supervised Learning classifiers

More information

HYPERGRAPH BASED SEMI-SUPERVISED LEARNING ALGORITHMS APPLIED TO SPEECH RECOGNITION PROBLEM: A NOVEL APPROACH

HYPERGRAPH BASED SEMI-SUPERVISED LEARNING ALGORITHMS APPLIED TO SPEECH RECOGNITION PROBLEM: A NOVEL APPROACH HYPERGRAPH BASED SEMI-SUPERVISED LEARNING ALGORITHMS APPLIED TO SPEECH RECOGNITION PROBLEM: A NOVEL APPROACH Hoang Trang 1, Tran Hoang Loc 1 1 Ho Chi Minh City University of Technology-VNU HCM, Ho Chi

More information

Beyond the Point Cloud: From Transductive to Semi-Supervised Learning

Beyond the Point Cloud: From Transductive to Semi-Supervised Learning Beyond the Point Cloud: From Transductive to Semi-Supervised Learning Vikas Sindhwani, Partha Niyogi, Mikhail Belkin Andrew B. Goldberg goldberg@cs.wisc.edu Department of Computer Sciences University of

More information

Regularization on Discrete Spaces

Regularization on Discrete Spaces Regularization on Discrete Spaces Dengyong Zhou and Bernhard Schölkopf Max Planck Institute for Biological Cybernetics Spemannstr. 38, 72076 Tuebingen, Germany {dengyong.zhou, bernhard.schoelkopf}@tuebingen.mpg.de

More information

1 Graph Kernels by Spectral Transforms

1 Graph Kernels by Spectral Transforms Graph Kernels by Spectral Transforms Xiaojin Zhu Jaz Kandola John Lafferty Zoubin Ghahramani Many graph-based semi-supervised learning methods can be viewed as imposing smoothness conditions on the target

More information

Semi-Supervised Learning on Riemannian Manifolds

Semi-Supervised Learning on Riemannian Manifolds Semi-Supervised Learning on Riemannian Manifolds Mikhail Belkin, Partha Niyogi University of Chicago, Department of Computer Science Abstract. We consider the general problem of utilizing both labeled

More information

Generalized Optimization Framework for Graph-based Semi-supervised Learning

Generalized Optimization Framework for Graph-based Semi-supervised Learning Generalized Optimization Framework for Graph-based Semi-supervised Learning Konstantin Avrachenkov INRIA Sophia Antipolis, France K.Avrachenkov@sophia.inria.fr Alexey Mishenin St. Petersburg State University,

More information

Semi-Supervised Learning in Gigantic Image Collections. Rob Fergus (New York University) Yair Weiss (Hebrew University) Antonio Torralba (MIT)

Semi-Supervised Learning in Gigantic Image Collections. Rob Fergus (New York University) Yair Weiss (Hebrew University) Antonio Torralba (MIT) Semi-Supervised Learning in Gigantic Image Collections Rob Fergus (New York University) Yair Weiss (Hebrew University) Antonio Torralba (MIT) Gigantic Image Collections What does the world look like? High

More information

Learning on Graphs and Manifolds. CMPSCI 689 Sridhar Mahadevan U.Mass Amherst

Learning on Graphs and Manifolds. CMPSCI 689 Sridhar Mahadevan U.Mass Amherst Learning on Graphs and Manifolds CMPSCI 689 Sridhar Mahadevan U.Mass Amherst Outline Manifold learning is a relatively new area of machine learning (2000-now). Main idea Model the underlying geometry of

More information

Semi-Supervised Learning on Riemannian Manifolds

Semi-Supervised Learning on Riemannian Manifolds Machine Learning, 56, 209 239, 2004 c 2004 Kluwer Academic Publishers. Manufactured in The Netherlands. Semi-Supervised Learning on Riemannian Manifolds MIKHAIL BELKIN misha@math.uchicago.edu PARTHA NIYOGI

More information

Classification Semi-supervised learning based on network. Speakers: Hanwen Wang, Xinxin Huang, and Zeyu Li CS Winter

Classification Semi-supervised learning based on network. Speakers: Hanwen Wang, Xinxin Huang, and Zeyu Li CS Winter Classification Semi-supervised learning based on network Speakers: Hanwen Wang, Xinxin Huang, and Zeyu Li CS 249-2 2017 Winter Semi-Supervised Learning Using Gaussian Fields and Harmonic Functions Xiaojin

More information

Semi-Supervised Learning with the Graph Laplacian: The Limit of Infinite Unlabelled Data

Semi-Supervised Learning with the Graph Laplacian: The Limit of Infinite Unlabelled Data Semi-Supervised Learning with the Graph Laplacian: The Limit of Infinite Unlabelled Data Boaz Nadler Dept. of Computer Science and Applied Mathematics Weizmann Institute of Science Rehovot, Israel 76 boaz.nadler@weizmann.ac.il

More information

Graphs in Machine Learning

Graphs in Machine Learning Graphs in Machine Learning Michal Valko Inria Lille - Nord Europe, France TA: Pierre Perrault Partially based on material by: Mikhail Belkin, Jerry Zhu, Olivier Chapelle, Branislav Kveton October 30, 2017

More information

Graph-Based Semi-Supervised Learning

Graph-Based Semi-Supervised Learning Graph-Based Semi-Supervised Learning Olivier Delalleau, Yoshua Bengio and Nicolas Le Roux Université de Montréal CIAR Workshop - April 26th, 2005 Graph-Based Semi-Supervised Learning Yoshua Bengio, Olivier

More information

Limits of Spectral Clustering

Limits of Spectral Clustering Limits of Spectral Clustering Ulrike von Luxburg and Olivier Bousquet Max Planck Institute for Biological Cybernetics Spemannstr. 38, 72076 Tübingen, Germany {ulrike.luxburg,olivier.bousquet}@tuebingen.mpg.de

More information

An Iterated Graph Laplacian Approach for Ranking on Manifolds

An Iterated Graph Laplacian Approach for Ranking on Manifolds An Iterated Graph Laplacian Approach for Ranking on Manifolds Xueyuan Zhou Department of Computer Science University of Chicago Chicago, IL zhouxy@cs.uchicago.edu Mikhail Belkin Department of Computer

More information

EECS 275 Matrix Computation

EECS 275 Matrix Computation EECS 275 Matrix Computation Ming-Hsuan Yang Electrical Engineering and Computer Science University of California at Merced Merced, CA 95344 http://faculty.ucmerced.edu/mhyang Lecture 23 1 / 27 Overview

More information

Manifold Coarse Graining for Online Semi-supervised Learning

Manifold Coarse Graining for Online Semi-supervised Learning for Online Semi-supervised Learning Mehrdad Farajtabar, Amirreza Shaban, Hamid R. Rabiee, Mohammad H. Rohban Digital Media Lab, Department of Computer Engineering, Sharif University of Technology, Tehran,

More information

Integrating Global and Local Structures: A Least Squares Framework for Dimensionality Reduction

Integrating Global and Local Structures: A Least Squares Framework for Dimensionality Reduction Integrating Global and Local Structures: A Least Squares Framework for Dimensionality Reduction Jianhui Chen, Jieping Ye Computer Science and Engineering Department Arizona State University {jianhui.chen,

More information

Global vs. Multiscale Approaches

Global vs. Multiscale Approaches Harmonic Analysis on Graphs Global vs. Multiscale Approaches Weizmann Institute of Science, Rehovot, Israel July 2011 Joint work with Matan Gavish (WIS/Stanford), Ronald Coifman (Yale), ICML 10' Challenge:

More information

Efficient Iterative Semi-Supervised Classification on Manifold

Efficient Iterative Semi-Supervised Classification on Manifold Efficient Iterative Semi-Supervised Classification on Manifold Mehrdad Farajtabar, Hamid R. Rabiee, Amirreza Shaban, Ali Soltani-Farani Digital Media Lab, AICTC Research Center, Department of Computer

More information

Kernel methods for comparing distributions, measuring dependence

Kernel methods for comparing distributions, measuring dependence Kernel methods for comparing distributions, measuring dependence Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Principal component analysis Given a set of M centered observations

More information

Semi-Supervised Learning Using Gaussian Fields and Harmonic Functions

Semi-Supervised Learning Using Gaussian Fields and Harmonic Functions Semi-Supervised Learning Using Gaussian Fields and Harmonic Functions Xiaojin hu HUXJ@CSCMUEDU oubin Ghahramani OUBIN@GASBUCLACU John Lafferty LAFFER@CSCMUEDU School of Computer Science, Carnegie Mellon

More information

Partially labeled classification with Markov random walks

Partially labeled classification with Markov random walks Partially labeled classification with Markov random walks Martin Szummer MIT AI Lab & CBCL Cambridge, MA 0239 szummer@ai.mit.edu Tommi Jaakkola MIT AI Lab Cambridge, MA 0239 tommi@ai.mit.edu Abstract To

More information

COMS 4721: Machine Learning for Data Science Lecture 20, 4/11/2017

COMS 4721: Machine Learning for Data Science Lecture 20, 4/11/2017 COMS 4721: Machine Learning for Data Science Lecture 20, 4/11/2017 Prof. John Paisley Department of Electrical Engineering & Data Science Institute Columbia University SEQUENTIAL DATA So far, when thinking

More information

Semi-Supervised Classification with Universum

Semi-Supervised Classification with Universum Semi-Supervised Classification with Universum Dan Zhang 1, Jingdong Wang 2, Fei Wang 3, Changshui Zhang 4 1,3,4 State Key Laboratory on Intelligent Technology and Systems, Tsinghua National Laboratory

More information

Similarity and kernels in machine learning

Similarity and kernels in machine learning 1/31 Similarity and kernels in machine learning Zalán Bodó Babeş Bolyai University, Cluj-Napoca/Kolozsvár Faculty of Mathematics and Computer Science MACS 2016 Eger, Hungary 2/31 Machine learning Overview

More information

Data Analysis and Manifold Learning Lecture 7: Spectral Clustering

Data Analysis and Manifold Learning Lecture 7: Spectral Clustering Data Analysis and Manifold Learning Lecture 7: Spectral Clustering Radu Horaud INRIA Grenoble Rhone-Alpes, France Radu.Horaud@inrialpes.fr http://perception.inrialpes.fr/ Outline of Lecture 7 What is spectral

More information

Random Walks, Random Fields, and Graph Kernels

Random Walks, Random Fields, and Graph Kernels Random Walks, Random Fields, and Graph Kernels John Lafferty School of Computer Science Carnegie Mellon University Based on work with Avrim Blum, Zoubin Ghahramani, Risi Kondor Mugizi Rwebangira, Jerry

More information

Semi-Supervised Learning through Principal Directions Estimation

Semi-Supervised Learning through Principal Directions Estimation Semi-Supervised Learning through Principal Directions Estimation Olivier Chapelle, Bernhard Schölkopf, Jason Weston Max Planck Institute for Biological Cybernetics, 72076 Tübingen, Germany {first.last}@tuebingen.mpg.de

More information

CS6220: DATA MINING TECHNIQUES

CS6220: DATA MINING TECHNIQUES CS6220: DATA MINING TECHNIQUES Mining Graph/Network Data Instructor: Yizhou Sun yzsun@ccs.neu.edu November 16, 2015 Methods to Learn Classification Clustering Frequent Pattern Mining Matrix Data Decision

More information

Manifold Regularization

Manifold Regularization Manifold Regularization Vikas Sindhwani Department of Computer Science University of Chicago Joint Work with Mikhail Belkin and Partha Niyogi TTI-C Talk September 14, 24 p.1 The Problem of Learning is

More information

Improving Semi-Supervised Target Alignment via Label-Aware Base Kernels

Improving Semi-Supervised Target Alignment via Label-Aware Base Kernels Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence Improving Semi-Supervised Target Alignment via Label-Aware Base Kernels Qiaojun Wang, Kai Zhang 2, Guofei Jiang 2, and Ivan Marsic

More information

Semi-supervised Dictionary Learning Based on Hilbert-Schmidt Independence Criterion

Semi-supervised Dictionary Learning Based on Hilbert-Schmidt Independence Criterion Semi-supervised ictionary Learning Based on Hilbert-Schmidt Independence Criterion Mehrdad J. Gangeh 1, Safaa M.A. Bedawi 2, Ali Ghodsi 3, and Fakhri Karray 2 1 epartments of Medical Biophysics, and Radiation

More information

SPECTRAL CLUSTERING AND KERNEL PRINCIPAL COMPONENT ANALYSIS ARE PURSUING GOOD PROJECTIONS

SPECTRAL CLUSTERING AND KERNEL PRINCIPAL COMPONENT ANALYSIS ARE PURSUING GOOD PROJECTIONS SPECTRAL CLUSTERING AND KERNEL PRINCIPAL COMPONENT ANALYSIS ARE PURSUING GOOD PROJECTIONS VIKAS CHANDRAKANT RAYKAR DECEMBER 5, 24 Abstract. We interpret spectral clustering algorithms in the light of unsupervised

More information

Clustering and efficient use of unlabeled examples

Clustering and efficient use of unlabeled examples Clustering and efficient use of unlabeled examples Martin Szummer MIT AI Lab & CBCL Cambridge, MA 02139 szummer@ai.mit.edu Tommi Jaakkola MIT AI Lab Cambridge, MA 02139 tommi@ai.mit.edu Abstract Efficient

More information

Learning Eigenfunctions: Links with Spectral Clustering and Kernel PCA

Learning Eigenfunctions: Links with Spectral Clustering and Kernel PCA Learning Eigenfunctions: Links with Spectral Clustering and Kernel PCA Yoshua Bengio Pascal Vincent Jean-François Paiement University of Montreal April 2, Snowbird Learning 2003 Learning Modal Structures

More information

Learning with Consistency between Inductive Functions and Kernels

Learning with Consistency between Inductive Functions and Kernels Learning with Consistency between Inductive Functions and Kernels Haixuan Yang Irwin King Michael R. Lyu Department of Computer Science & Engineering The Chinese University of Hong Kong Shatin, N.T., Hong

More information

A graph based approach to semi-supervised learning

A graph based approach to semi-supervised learning A graph based approach to semi-supervised learning 1 Feb 2011 Two papers M. Belkin, P. Niyogi, and V Sindhwani. Manifold regularization: a geometric framework for learning from labeled and unlabeled examples.

More information

Appendix to: On the Relation Between Low Density Separation, Spectral Clustering and Graph Cuts

Appendix to: On the Relation Between Low Density Separation, Spectral Clustering and Graph Cuts Appendix to: On the Relation Between Low Density Separation, Spectral Clustering and Graph Cuts Hariharan Narayanan Department of Computer Science University of Chicago Chicago IL 6637 hari@cs.uchicago.edu

More information

CS6220: DATA MINING TECHNIQUES

CS6220: DATA MINING TECHNIQUES CS6220: DATA MINING TECHNIQUES Mining Graph/Network Data Instructor: Yizhou Sun yzsun@ccs.neu.edu March 16, 2016 Methods to Learn Classification Clustering Frequent Pattern Mining Matrix Data Decision

More information

Hou, Ch. et al. IEEE Transactions on Neural Networks March 2011

Hou, Ch. et al. IEEE Transactions on Neural Networks March 2011 Hou, Ch. et al. IEEE Transactions on Neural Networks March 2011 Semi-supervised approach which attempts to incorporate partial information from unlabeled data points Semi-supervised approach which attempts

More information

Graphs in Machine Learning

Graphs in Machine Learning Graphs in Machine Learning Michal Valko INRIA Lille - Nord Europe, France Partially based on material by: Mikhail Belkin, Jerry Zhu, Olivier Chapelle, Branislav Kveton February 10, 2015 MVA 2014/2015 Previous

More information

Higher Order Learning with Graphs

Higher Order Learning with Graphs Sameer Agarwal sagarwal@cs.ucsd.edu Kristin Branson kbranson@cs.ucsd.edu Serge Belongie sjb@cs.ucsd.edu Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA

More information

Semi-Supervised Learning by Multi-Manifold Separation

Semi-Supervised Learning by Multi-Manifold Separation Semi-Supervised Learning by Multi-Manifold Separation Xiaojin (Jerry) Zhu Department of Computer Sciences University of Wisconsin Madison Joint work with Andrew Goldberg, Zhiting Xu, Aarti Singh, and Rob

More information

Connection of Local Linear Embedding, ISOMAP, and Kernel Principal Component Analysis

Connection of Local Linear Embedding, ISOMAP, and Kernel Principal Component Analysis Connection of Local Linear Embedding, ISOMAP, and Kernel Principal Component Analysis Alvina Goh Vision Reading Group 13 October 2005 Connection of Local Linear Embedding, ISOMAP, and Kernel Principal

More information

Spectral Feature Selection for Supervised and Unsupervised Learning

Spectral Feature Selection for Supervised and Unsupervised Learning Spectral Feature Selection for Supervised and Unsupervised Learning Zheng Zhao Huan Liu Department of Computer Science and Engineering, Arizona State University zhaozheng@asu.edu huan.liu@asu.edu Abstract

More information

Manifold Denoising. Matthias Hein Markus Maier Max Planck Institute for Biological Cybernetics Tübingen, Germany

Manifold Denoising. Matthias Hein Markus Maier Max Planck Institute for Biological Cybernetics Tübingen, Germany Manifold Denoising Matthias Hein Markus Maier Max Planck Institute for Biological Cybernetics Tübingen, Germany {first.last}@tuebingen.mpg.de Abstract We consider the problem of denoising a noisily sampled

More information

One-class Label Propagation Using Local Cone Based Similarity

One-class Label Propagation Using Local Cone Based Similarity One-class Label Propagation Using Local Based Similarity Takumi Kobayashi and Nobuyuki Otsu Abstract In this paper, we propose a novel method of label propagation for one-class learning. For binary (positive/negative)

More information

Laplacian Spectrum Learning

Laplacian Spectrum Learning Laplacian Spectrum Learning Pannagadatta K. Shivaswamy and Tony Jebara Department of Computer Science Columbia University New York, NY 0027 {pks203,jebara}@cs.columbia.edu Abstract. The eigenspectrum of

More information

Graph Quality Judgement: A Large Margin Expedition

Graph Quality Judgement: A Large Margin Expedition Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16) Graph Quality Judgement: A Large Margin Expedition Yu-Feng Li Shao-Bo Wang Zhi-Hua Zhou National Key

More information

Predicting Graph Labels using Perceptron. Shuang Song

Predicting Graph Labels using Perceptron. Shuang Song Predicting Graph Labels using Perceptron Shuang Song shs037@eng.ucsd.edu Online learning over graphs M. Herbster, M. Pontil, and L. Wainer, Proc. 22nd Int. Conf. Machine Learning (ICML'05), 2005 Prediction

More information

Data dependent operators for the spatial-spectral fusion problem

Data dependent operators for the spatial-spectral fusion problem Data dependent operators for the spatial-spectral fusion problem Wien, December 3, 2012 Joint work with: University of Maryland: J. J. Benedetto, J. A. Dobrosotskaya, T. Doster, K. W. Duke, M. Ehler, A.

More information

Spectral Clustering. Guokun Lai 2016/10

Spectral Clustering. Guokun Lai 2016/10 Spectral Clustering Guokun Lai 2016/10 1 / 37 Organization Graph Cut Fundamental Limitations of Spectral Clustering Ng 2002 paper (if we have time) 2 / 37 Notation We define a undirected weighted graph

More information

Online Social Networks and Media. Link Analysis and Web Search

Online Social Networks and Media. Link Analysis and Web Search Online Social Networks and Media Link Analysis and Web Search How to Organize the Web First try: Human curated Web directories Yahoo, DMOZ, LookSmart How to organize the web Second try: Web Search Information

More information

Large-Scale Graph-Based Semi-Supervised Learning via Tree Laplacian Solver

Large-Scale Graph-Based Semi-Supervised Learning via Tree Laplacian Solver Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16) Large-Scale Graph-Based Semi-Supervised Learning via Tree Laplacian Solver Yan-Ming Zhang and Xu-Yao Zhang National Laboratory

More information

Manifold Regularization

Manifold Regularization 9.520: Statistical Learning Theory and Applications arch 3rd, 200 anifold Regularization Lecturer: Lorenzo Rosasco Scribe: Hooyoung Chung Introduction In this lecture we introduce a class of learning algorithms,

More information

Learning gradients: prescriptive models

Learning gradients: prescriptive models Department of Statistical Science Institute for Genome Sciences & Policy Department of Computer Science Duke University May 11, 2007 Relevant papers Learning Coordinate Covariances via Gradients. Sayan

More information

Statistical Learning. Dong Liu. Dept. EEIS, USTC

Statistical Learning. Dong Liu. Dept. EEIS, USTC Statistical Learning Dong Liu Dept. EEIS, USTC Chapter 6. Unsupervised and Semi-Supervised Learning 1. Unsupervised learning 2. k-means 3. Gaussian mixture model 4. Other approaches to clustering 5. Principle

More information

Non-linear Dimensionality Reduction

Non-linear Dimensionality Reduction Non-linear Dimensionality Reduction CE-725: Statistical Pattern Recognition Sharif University of Technology Spring 2013 Soleymani Outline Introduction Laplacian Eigenmaps Locally Linear Embedding (LLE)

More information

Logistic Regression: Online, Lazy, Kernelized, Sequential, etc.

Logistic Regression: Online, Lazy, Kernelized, Sequential, etc. Logistic Regression: Online, Lazy, Kernelized, Sequential, etc. Harsha Veeramachaneni Thomson Reuter Research and Development April 1, 2010 Harsha Veeramachaneni (TR R&D) Logistic Regression April 1, 2010

More information

A PROBABILISTIC INTERPRETATION OF SAMPLING THEORY OF GRAPH SIGNALS. Akshay Gadde and Antonio Ortega

A PROBABILISTIC INTERPRETATION OF SAMPLING THEORY OF GRAPH SIGNALS. Akshay Gadde and Antonio Ortega A PROBABILISTIC INTERPRETATION OF SAMPLING THEORY OF GRAPH SIGNALS Akshay Gadde and Antonio Ortega Department of Electrical Engineering University of Southern California, Los Angeles Email: agadde@usc.edu,

More information

Unsupervised dimensionality reduction

Unsupervised dimensionality reduction Unsupervised dimensionality reduction Guillaume Obozinski Ecole des Ponts - ParisTech SOCN course 2014 Guillaume Obozinski Unsupervised dimensionality reduction 1/30 Outline 1 PCA 2 Kernel PCA 3 Multidimensional

More information

Noise Robust Spectral Clustering

Noise Robust Spectral Clustering Noise Robust Spectral Clustering Zhenguo Li, Jianzhuang Liu, Shifeng Chen, and Xiaoou Tang, Dept. of Information Engineering Microsoft Research Asia The Chinese University of Hong Kong Beijing, China {zgli5,jzliu,sfchen5}@ie.cuhk.edu.hk

More information

Iterative Laplacian Score for Feature Selection

Iterative Laplacian Score for Feature Selection Iterative Laplacian Score for Feature Selection Linling Zhu, Linsong Miao, and Daoqiang Zhang College of Computer Science and echnology, Nanjing University of Aeronautics and Astronautics, Nanjing 2006,

More information

Online Manifold Regularization: A New Learning Setting and Empirical Study

Online Manifold Regularization: A New Learning Setting and Empirical Study Online Manifold Regularization: A New Learning Setting and Empirical Study Andrew B. Goldberg 1, Ming Li 2, Xiaojin Zhu 1 1 Computer Sciences, University of Wisconsin Madison, USA. {goldberg,jerryzhu}@cs.wisc.edu

More information

Semi-Supervised Learning in Reproducing Kernel Hilbert Spaces Using Local Invariances

Semi-Supervised Learning in Reproducing Kernel Hilbert Spaces Using Local Invariances Semi-Supervised Learning in Reproducing Kernel Hilbert Spaces Using Local Invariances Wee Sun Lee,2, Xinhua Zhang,2, and Yee Whye Teh Department of Computer Science, National University of Singapore. 2

More information

Relational Learning with Gaussian Processes

Relational Learning with Gaussian Processes Relational Learning with Gaussian Processes Wei Chu CCLS Columbia Univ. New York, NY 05 Vikas Sindhwani Dept. of Comp. Sci. Univ. of Chicago Chicago, IL 60637 Zoubin Ghahramani Dept. of Engineering Univ.

More information

MACHINE LEARNING. Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA

MACHINE LEARNING. Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA 1 MACHINE LEARNING Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA 2 Practicals Next Week Next Week, Practical Session on Computer Takes Place in Room GR

More information

Semi-supervised Learning using Sparse Eigenfunction Bases

Semi-supervised Learning using Sparse Eigenfunction Bases Semi-supervised Learning using Sparse Eigenfunction Bases Kaushik Sinha Dept. of Computer Science and Engineering Ohio State University Columbus, OH 43210 sinhak@cse.ohio-state.edu Mikhail Belkin Dept.

More information

Information Extraction from Text

Information Extraction from Text Information Extraction from Text Jing Jiang Chapter 2 from Mining Text Data (2012) Presented by Andrew Landgraf, September 13, 2013 1 What is Information Extraction? Goal is to discover structured information

More information

Convex Methods for Transduction

Convex Methods for Transduction Convex Methods for Transduction Tijl De Bie ESAT-SCD/SISTA, K.U.Leuven Kasteelpark Arenberg 10 3001 Leuven, Belgium tijl.debie@esat.kuleuven.ac.be Nello Cristianini Department of Statistics, U.C.Davis

More information

Semi Supervised Distance Metric Learning

Semi Supervised Distance Metric Learning Semi Supervised Distance Metric Learning wliu@ee.columbia.edu Outline Background Related Work Learning Framework Collaborative Image Retrieval Future Research Background Euclidean distance d( x, x ) =

More information

Graph-based Semi-supervised Learning: Realizing Pointwise Smoothness Probabilistically

Graph-based Semi-supervised Learning: Realizing Pointwise Smoothness Probabilistically Graph-based Semi-supervised Learning: Realizing Pointwise Smoothness Probabilistically Yuan Fang Kevin Chen-Chuan Chang Hady W. Lauw University of Illinois at Urbana-Champaign, USA Advanced Digital Sciences

More information

Graph Metrics and Dimension Reduction

Graph Metrics and Dimension Reduction Graph Metrics and Dimension Reduction Minh Tang 1 Michael Trosset 2 1 Applied Mathematics and Statistics The Johns Hopkins University 2 Department of Statistics Indiana University, Bloomington November

More information

ECS289: Scalable Machine Learning

ECS289: Scalable Machine Learning ECS289: Scalable Machine Learning Cho-Jui Hsieh UC Davis Oct 18, 2016 Outline One versus all/one versus one Ranking loss for multiclass/multilabel classification Scaling to millions of labels Multiclass

More information

Kernel Methods in Machine Learning

Kernel Methods in Machine Learning Kernel Methods in Machine Learning Autumn 2015 Lecture 1: Introduction Juho Rousu ICS-E4030 Kernel Methods in Machine Learning 9. September, 2015 uho Rousu (ICS-E4030 Kernel Methods in Machine Learning)

More information

Face Recognition Using Laplacianfaces He et al. (IEEE Trans PAMI, 2005) presented by Hassan A. Kingravi

Face Recognition Using Laplacianfaces He et al. (IEEE Trans PAMI, 2005) presented by Hassan A. Kingravi Face Recognition Using Laplacianfaces He et al. (IEEE Trans PAMI, 2005) presented by Hassan A. Kingravi Overview Introduction Linear Methods for Dimensionality Reduction Nonlinear Methods and Manifold

More information

Lecture: Some Practical Considerations (3 of 4)

Lecture: Some Practical Considerations (3 of 4) Stat260/CS294: Spectral Graph Methods Lecture 14-03/10/2015 Lecture: Some Practical Considerations (3 of 4) Lecturer: Michael Mahoney Scribe: Michael Mahoney Warning: these notes are still very rough.

More information

CS249: ADVANCED DATA MINING

CS249: ADVANCED DATA MINING CS249: ADVANCED DATA MINING Graph and Network Instructor: Yizhou Sun yzsun@cs.ucla.edu May 31, 2017 Methods Learnt Classification Clustering Vector Data Text Data Recommender System Decision Tree; Naïve

More information

Fast Graph Laplacian Regularized Kernel Learning via Semidefinite Quadratic Linear Programming

Fast Graph Laplacian Regularized Kernel Learning via Semidefinite Quadratic Linear Programming Fast Graph Laplacian Regularized Kernel Learning via Semidefinite Quadratic Linear Programming Xiao-Ming Wu Dept. of IE The Chinese University of Hong Kong wxm007@ie.cuhk.edu.hk Zhenguo Li Dept. of IE

More information

An RKHS for Multi-View Learning and Manifold Co-Regularization

An RKHS for Multi-View Learning and Manifold Co-Regularization Vikas Sindhwani vsindhw@us.ibm.com Mathematical Sciences, IBM T.J. Watson Research Center, Yorktown Heights, NY 10598 USA David S. Rosenberg drosen@stat.berkeley.edu Department of Statistics, University

More information

MLCC Clustering. Lorenzo Rosasco UNIGE-MIT-IIT

MLCC Clustering. Lorenzo Rosasco UNIGE-MIT-IIT MLCC 2018 - Clustering Lorenzo Rosasco UNIGE-MIT-IIT About this class We will consider an unsupervised setting, and in particular the problem of clustering unlabeled data into coherent groups. MLCC 2018

More information

Data Analysis and Manifold Learning Lecture 9: Diffusion on Manifolds and on Graphs

Data Analysis and Manifold Learning Lecture 9: Diffusion on Manifolds and on Graphs Data Analysis and Manifold Learning Lecture 9: Diffusion on Manifolds and on Graphs Radu Horaud INRIA Grenoble Rhone-Alpes, France Radu.Horaud@inrialpes.fr http://perception.inrialpes.fr/ Outline of Lecture

More information

Kernel Machines. Pradeep Ravikumar Co-instructor: Manuela Veloso. Machine Learning

Kernel Machines. Pradeep Ravikumar Co-instructor: Manuela Veloso. Machine Learning Kernel Machines Pradeep Ravikumar Co-instructor: Manuela Veloso Machine Learning 10-701 SVM linearly separable case n training points (x 1,, x n ) d features x j is a d-dimensional vector Primal problem:

More information

Diffusion/Inference geometries of data features, situational awareness and visualization. Ronald R Coifman Mathematics Yale University

Diffusion/Inference geometries of data features, situational awareness and visualization. Ronald R Coifman Mathematics Yale University Diffusion/Inference geometries of data features, situational awareness and visualization Ronald R Coifman Mathematics Yale University Digital data is generally converted to point clouds in high dimensional

More information

Discriminative K-means for Clustering

Discriminative K-means for Clustering Discriminative K-means for Clustering Jieping Ye Arizona State University Tempe, AZ 85287 jieping.ye@asu.edu Zheng Zhao Arizona State University Tempe, AZ 85287 zhaozheng@asu.edu Mingrui Wu MPI for Biological

More information

Fast Krylov Methods for N-Body Learning

Fast Krylov Methods for N-Body Learning Fast Krylov Methods for N-Body Learning Nando de Freitas Department of Computer Science University of British Columbia nando@cs.ubc.ca Maryam Mahdaviani Department of Computer Science University of British

More information

ECS289: Scalable Machine Learning

ECS289: Scalable Machine Learning ECS289: Scalable Machine Learning Cho-Jui Hsieh UC Davis Oct 27, 2015 Outline One versus all/one versus one Ranking loss for multiclass/multilabel classification Scaling to millions of labels Multiclass

More information

SEMI-SUPERVISED classification, which estimates a decision

SEMI-SUPERVISED classification, which estimates a decision IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS Semi-Supervised Support Vector Machines with Tangent Space Intrinsic Manifold Regularization Shiliang Sun and Xijiong Xie Abstract Semi-supervised

More information