Manifold Regularization

Size: px
Start display at page:

Download "Manifold Regularization"

Transcription

1 Manifold Regularization Vikas Sindhwani Department of Computer Science University of Chicago Joint Work with Mikhail Belkin and Partha Niyogi TTI-C Talk September 14, 24 p.1

2 The Problem of Learning is drawn from an unknown probability distribution. A Learning Algorithm maps of a hypothesis space mapping. to an element of functions should provide good labels for future examples. Regularization : Choose a simple function that agrees with data. TTI-C Talk September 14, 24 p.2

3 The Problem of Learning Notions of simplicity are the key to successful learning. Here s a simple function that agrees with data. TTI-C Talk September 14, 24 p.3

4 Learning and Prior Knowledge But Simplicity is a Relative Concept. Prior Knowledge of the Marginal can modify our notions of simplicity. TTI-C Talk September 14, 24 p.4

5 Motivation How can we exploit prior knowledge of the marginal distribution? More practically, how can we use unlabeled examples drawn from Why is this important? Natural Data has structure to exploit. Natural Learning is largely semi-supervised. Labels are Expensive, Unlabeled data is cheap and plenty. TTI-C Talk September 14, 24 p.5

6 Contributions A data-dependent, Geometric Regularization Framework for Learning from examples. Representer Theorems provide solutions. Extensions of SVM and RLS for Semi-supervised Learning. Regularized Spectral Clustering and Dimensionality Reduction. The problem of Out-of-sample extensions in graph methods is resolved. Good Empirical Performance. TTI-C Talk September 14, 24 p.6

7 %. Regularization with RKHS Learning in Reproducing Kernel Hilbert Spaces : ' (! " Regularized Least Squares (RLS) : ' ) Support Vector Machine (SVM) : )! +-, * TTI-C Talk September 14, 24 p.7

8 / ; What are RKHS? Hilbert Spaces with a nice property : If two functions are close in the distance derived from the inner product, their values are close. / 1 Reproducing Property : is linear, continuous. By Reisz s Representation theorem, Kernel Function RKHS : : 3 : TTI-C Talk September 14, 24 p.8

9 D D G Why RKHS? Rich Function Spaces with complexity control e.g Gaussian Kernel H I G ' ( 'M I L 'KJ > C >@?BA DFE = < Representer Theorems show that the minimizer has the form : and therefore, 7 O 7 N S NS N RQ S P 9 8 ' ( Motivates kernelization (KPCA, KFD, etc). Good empirical performance. : TTI-C Talk September 14, 24 p.9

10 Known Marginal If is known, solve : ' V %WV ' ( %UT! " Extrinsic and Intrinsic Regularization controls complexity in ambient space. controls complexity in the intrinsic geometry of %XT %V TTI-C Talk September 14, 24 p.1

11 Z Continuous Representer Theorem Assume that the penalty term is sufficiently smooth with respect to the RKHS norm. Then the solution to the optimization problem exists and admits the following representation V ( M N Y N where ] ] [\ is the support of the marginal. TTI-C Talk September 14, 24 p.11

12 _ ^ 9 8 Z Z ( 9 8 Z Z A Manifold Regularizer If, the support of the marginal is a compact submanifold, it seems natural to choose : ' V Z and to find that minimizes : Z %V ' ( %UT! " TTI-C Talk September 14, 24 p.12

13 Z 9 8 Z Z Z Z 9 8 Mc d ) Z Z Z Laplace Beltrami Operator The intrinsic regularizer is a quadratic form involving the Laplace-Beltrami operator on the manifold d Mc ) `ba : Z V because some calculus on manifolds establishes that for any vector field, TTI-C Talk September 14, 24 p.13

14 l l /m S S ) n Passage to the Discrete In reality, is unknown and sampled only via examples. Labels are not required for empirical estimates of. BeUf ' V Manifold Graph S k ghji 4 S. ef Laplace Beltrami Graph Laplacian. `ba Mc ol p o ' V Z ' V S ' S ) TTI-C Talk September 14, 24 p.14

15 ( p o '. Algorithms We have motivated the following optimization problem : Find a function that minimizes : %V ol ' ( %qt! " " r Laplacian RLS ' ) Laplacian SVM )! +-, * TTI-C Talk September 14, 24 p.15

16 ( s S s Empirical Representer Theorem The minimizer admits an expansion BeUf N N BeUf as Proof : Write any S N BeUf 9 3k 8, s. increases the norm. So TTI-C Talk September 14, 24 p.16

17 N p p ' l t N N N N ) r t " "u N r Laplacian RLS By the Representer Theorem, the problem becomes finite dimensional. For Laplacian RLS, we find that minimizes : Bef %WV N %XT ' "! ".,, The solution is :,!, 7 77! 7 77 /m where : Gram Matrix ; and Mc %WV = l t %XT '" TTI-C Talk September 14, 24 p.17

18 , u z u z N r Laplacian SVM For Laplacian SVMs, we solve a QP : p ' ) yx *wv, subject to :, and pt = l D {} ~ BeUf where then invert a linear system : -z t %T %WV p t = l -z %XT ' " TTI-C Talk September 14, 24 p.18

19 4 N Manifold Regularization Input : l labeled and u unlabeled examples Output : _ 5 Algorithm : Contruct adjacency Graph. Compute Laplacian. Choose Kernel matrix K. Choose. (?) Compute. Output %V %T. Compute Gram Y N Bef TTI-C Talk September 14, 24 p.19

20 Out-of-sample Extn.Spectral Clustering Ž Unity of Learning Supervised Partially Supervised Unsupervised SVM/RLS Graph Regularization Graph Mincut š X ƒ ƒ ŠŒ Xˆ ƒ ª ± ª ž³ ²ž± ž± ª q ª U ª «ž œÿž œ b ž B b ž œÿžf œ µµ ª ª ƒ Š ˆ ƒ Out-of-sample Extn. ª q ª U ª «ž œÿž œ ª ª Š ˆ ƒ Reg. Spectral Clust. ŠŒ -ˆ ƒ ¹ µµ ª ª TTI-C Talk September 14, 24 p.2

21 ol p o % º N Æ l % d d N Ç Regularized Spectral Clustering Unsupervised Manifold Regularization : ' ( º ÃÄà > ¾ >¼ ÀÂÁ»½¼ ¾ Representer Theorem : leads to an eigenvalue problem : f ' Å Å and. is the smallest-eigenvalue eigenvector; P projects orthogonal to. TTI-C Talk September 14, 24 p.21

22 Experiments : Synthetic SVM Laplacian SVM Laplacian SVM γ A =.3125 γ I = γ A =.3125 γ I = γ A =.3125 γ I = γ A = 1e 6 γ I = γ A =.1 γ I = γ A =.1 γ I = 1 TTI-C Talk September 14, 24 p.22

23 œ Î œ Î œ Õ Related Algorithms Transductive SVMs [Joachims, Vapnik] œš ¹ µµ b ž š ž Ï ³ ÈÎ b ž š ž Ï ³ ƒ ÈÊÉ œš ž ž Ë ÍÍÍ ŠŒ Ë Ì ˆ Semi-supervised SVMs [Bennet,Fung et al] ž š ž Ï ³ ƒ ÈÊÉ Ñ ž Ë ÍÍÍ ŠÐ Ë Ì ˆ µµ Ó ž š Ï b ž š Ò Ï ³ œš œš žf C Measure-based Reg. [Bousquet et al] Ú b ØbÙ b Ö U ž ž ƒ È É Ô ˆ ž TTI-C Talk September 14, 24 p.23

24 Experiments : Synthetic Data 2.5 SVM 2.5 Transductive SVM 2.5 Laplacian SVM TTI-C Talk September 14, 24 p.24

25 Experiments : Digits Error Rates RLS vs LapRLS RLS LapRLS Error Rates SVM vs LapSVM SVM LapSVM Error Rates TSVM vs LapSVM TSVM LapSVM Classification Problems Classification Problems Classification Problems LapRLS (Test) Out of Sample Extension LapRLS (Unlabeled) LapSVM (Test) Out of Sample Extension LapSVM (Unlabeled) SVM (o), TSVM (x) Deviation Performance Deviation LapSVM Deviation TTI-C Talk September 14, 24 p.25

26 Experiments : Digits 9 8 RLS vs LapRLS RLS (T) RLS (U) LapRLS (T) LapRLS (U) 1 9 SVM vs LapSVM SVM (T) SVM (U) LapSVM (T) LapSVM (U) Average Error Rate 5 4 Average Error Rate Number of Labeled Examples Number of Labeled Examples TTI-C Talk September 14, 24 p.26

27 Experiments : Speech Error Rate (unlabeled set) Error Rates (test set) RLS vs LapRLS RLS LapRLS Labeled Speaker RLS vs LapRLS RLS LapRLS Labeled Speaker Error Rates (unlabeled set) Error Rates (test set) SVM vs TSVM vs LapSVM SVM TSVM LapSVM Labeled Speaker SVM vs TSVM vs LapSVM SVM TSVM LapSVM Labeled Speaker TTI-C Talk September 14, 24 p.27

28 Experiments : Speech 3 RLS 3 LapRLS Error Rate (Test) 25 2 Error Rate (Test) 25 2 Experiment 1 Experiment Error Rate (Unlabeled) Experiment 1 Experiment Error Rate (Unlabeled) 3 SVM 3 LapSVM Error Rate (Test) 25 2 Error Rate (Test) 25 2 Experiment 1 Experiment Error Rate (Unlabeled) Experiment 1 Experiment Error Rate (Unlabeled) TTI-C Talk September 14, 24 p.28

29 Experiments : Text Method PRBEP Error k-nn SGT Naive-Bayes 12.9 Cotraining 6.2 SVM (5.6) 1.41 (2.5) TSVM (1.) 5.22 (.5) LapSVM (2.3) 5.41 (1.) RLS (6.2) (2.7) LapRLS (3.1) 5.99 (1.4) TTI-C Talk September 14, 24 p.29

30 Experiments : Text Performance of RLS, LapRLS Performance of SVM, LapSVM PRBEP 75 7 PRBEP rls (U) rls (T) laprls (U) laprls (T) 65 6 svm (U) svm (T) lapsvm (U) lapsvm (T) Number of Labeled Examples LapSVM performance (Unlabeled) Number of Labeled Examples LapSVM performance (Test) PRBEP PRBEP U=779 l U=35 U=15 78 U=779 l U=35 U= Number of Labeled Examples Number of Labeled Examples TTI-C Talk September 14, 24 p.3

31 Future Work Generalization as a function of labeled and unlabeled examples. Additional Structure : Structured Outputs, Invariances Active Learning, Feature Selection Efficient Algorithms : Linear Methods, Sparse Solutions Applications : Bioinformatics, Text, Speech, Vision,... TTI-C Talk September 14, 24 p.31

Graphs, Geometry and Semi-supervised Learning

Graphs, Geometry and Semi-supervised Learning Graphs, Geometry and Semi-supervised Learning Mikhail Belkin The Ohio State University, Dept of Computer Science and Engineering and Dept of Statistics Collaborators: Partha Niyogi, Vikas Sindhwani In

More information

Beyond the Point Cloud: From Transductive to Semi-Supervised Learning

Beyond the Point Cloud: From Transductive to Semi-Supervised Learning Beyond the Point Cloud: From Transductive to Semi-Supervised Learning Vikas Sindhwani, Partha Niyogi, Mikhail Belkin Andrew B. Goldberg goldberg@cs.wisc.edu Department of Computer Sciences University of

More information

Analysis of Spectral Kernel Design based Semi-supervised Learning

Analysis of Spectral Kernel Design based Semi-supervised Learning Analysis of Spectral Kernel Design based Semi-supervised Learning Tong Zhang IBM T. J. Watson Research Center Yorktown Heights, NY 10598 Rie Kubota Ando IBM T. J. Watson Research Center Yorktown Heights,

More information

A graph based approach to semi-supervised learning

A graph based approach to semi-supervised learning A graph based approach to semi-supervised learning 1 Feb 2011 Two papers M. Belkin, P. Niyogi, and V Sindhwani. Manifold regularization: a geometric framework for learning from labeled and unlabeled examples.

More information

of semi-supervised Gaussian process classifiers.

of semi-supervised Gaussian process classifiers. Semi-supervised Gaussian Process Classifiers Vikas Sindhwani Department of Computer Science University of Chicago Chicago, IL 665, USA vikass@cs.uchicago.edu Wei Chu CCLS Columbia University New York,

More information

How to learn from very few examples?

How to learn from very few examples? How to learn from very few examples? Dengyong Zhou Department of Empirical Inference Max Planck Institute for Biological Cybernetics Spemannstr. 38, 72076 Tuebingen, Germany Outline Introduction Part A

More information

Semi-Supervised Learning of Speech Sounds

Semi-Supervised Learning of Speech Sounds Aren Jansen Partha Niyogi Department of Computer Science Interspeech 2007 Objectives 1 Present a manifold learning algorithm based on locality preserving projections for semi-supervised phone classification

More information

Learning Eigenfunctions: Links with Spectral Clustering and Kernel PCA

Learning Eigenfunctions: Links with Spectral Clustering and Kernel PCA Learning Eigenfunctions: Links with Spectral Clustering and Kernel PCA Yoshua Bengio Pascal Vincent Jean-François Paiement University of Montreal April 2, Snowbird Learning 2003 Learning Modal Structures

More information

Justin Solomon MIT, Spring 2017

Justin Solomon MIT, Spring 2017 Justin Solomon MIT, Spring 2017 http://pngimg.com/upload/hammer_png3886.png You can learn a lot about a shape by hitting it (lightly) with a hammer! What can you learn about its shape from vibration frequencies

More information

IFT LAPLACIAN APPLICATIONS. Mikhail Bessmeltsev

IFT LAPLACIAN APPLICATIONS.   Mikhail Bessmeltsev IFT 6112 09 LAPLACIAN APPLICATIONS http://www-labs.iro.umontreal.ca/~bmpix/teaching/6112/2018/ Mikhail Bessmeltsev Rough Intuition http://pngimg.com/upload/hammer_png3886.png You can learn a lot about

More information

Graphs in Machine Learning

Graphs in Machine Learning Graphs in Machine Learning Michal Valko Inria Lille - Nord Europe, France TA: Pierre Perrault Partially based on material by: Mikhail Belkin, Jerry Zhu, Olivier Chapelle, Branislav Kveton October 30, 2017

More information

Joint distribution optimal transportation for domain adaptation

Joint distribution optimal transportation for domain adaptation Joint distribution optimal transportation for domain adaptation Changhuang Wan Mechanical and Aerospace Engineering Department The Ohio State University March 8 th, 2018 Joint distribution optimal transportation

More information

What is semi-supervised learning?

What is semi-supervised learning? What is semi-supervised learning? In many practical learning domains, there is a large supply of unlabeled data but limited labeled data, which can be expensive to generate text processing, video-indexing,

More information

Multi-View Point Cloud Kernels for Semi-Supervised Learning

Multi-View Point Cloud Kernels for Semi-Supervised Learning Multi-View Point Cloud Kernels for Semi-Supervised Learning David S. Rosenberg, Vikas Sindhwani, Peter L. Bartlett, Partha Niyogi May 29, 2009 Scope In semi-supervised learning (SSL), we learn a predictive

More information

Nonlinear Dimensionality Reduction. Jose A. Costa

Nonlinear Dimensionality Reduction. Jose A. Costa Nonlinear Dimensionality Reduction Jose A. Costa Mathematics of Information Seminar, Dec. Motivation Many useful of signals such as: Image databases; Gene expression microarrays; Internet traffic time

More information

Semi-Supervised Learning in Reproducing Kernel Hilbert Spaces Using Local Invariances

Semi-Supervised Learning in Reproducing Kernel Hilbert Spaces Using Local Invariances Semi-Supervised Learning in Reproducing Kernel Hilbert Spaces Using Local Invariances Wee Sun Lee,2, Xinhua Zhang,2, and Yee Whye Teh Department of Computer Science, National University of Singapore. 2

More information

Discrete vs. Continuous: Two Sides of Machine Learning

Discrete vs. Continuous: Two Sides of Machine Learning Discrete vs. Continuous: Two Sides of Machine Learning Dengyong Zhou Department of Empirical Inference Max Planck Institute for Biological Cybernetics Spemannstr. 38, 72076 Tuebingen, Germany Oct. 18,

More information

Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 1/31

Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking p. 1/31 Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking Dengyong Zhou zhou@tuebingen.mpg.de Dept. Schölkopf, Max Planck Institute for Biological Cybernetics, Germany Learning from

More information

! " # $! % & '! , ) ( + - (. ) ( ) * + / 0 1 2 3 0 / 4 5 / 6 0 ; 8 7 < = 7 > 8 7 8 9 : Œ Š ž P P h ˆ Š ˆ Œ ˆ Š ˆ Ž Ž Ý Ü Ý Ü Ý Ž Ý ê ç è ± ¹ ¼ ¹ ä ± ¹ w ç ¹ è ¼ è Œ ¹ ± ¹ è ¹ è ä ç w ¹ ã ¼ ¹ ä ¹ ¼ ¹ ±

More information

THE UNIVERSITY OF CHICAGO ON SEMI-SUPERVISED KERNEL METHODS A DISSERTATION SUBMITTED TO THE FACULTY OF THE DIVISION OF THE PHYSICAL SCIENCES

THE UNIVERSITY OF CHICAGO ON SEMI-SUPERVISED KERNEL METHODS A DISSERTATION SUBMITTED TO THE FACULTY OF THE DIVISION OF THE PHYSICAL SCIENCES THE UNIVERSITY OF CHICAGO ON SEMI-SUPERVISED KERNEL METHODS A DISSERTATION SUBMITTED TO THE FACULTY OF THE DIVISION OF THE PHYSICAL SCIENCES IN CANDIDACY FOR THE DEGREE OF DOCTOR OF PHILOSOPHY DEPARTMENT

More information

LA PRISE DE CALAIS. çoys, çoys, har - dis. çoys, dis. tons, mantz, tons, Gas. c est. à ce. C est à ce. coup, c est à ce

LA PRISE DE CALAIS. çoys, çoys, har - dis. çoys, dis. tons, mantz, tons, Gas. c est. à ce. C est à ce. coup, c est à ce > ƒ? @ Z [ \ _ ' µ `. l 1 2 3 z Æ Ñ 6 = Ð l sl (~131 1606) rn % & +, l r s s, r 7 nr ss r r s s s, r s, r! " # $ s s ( ) r * s, / 0 s, r 4 r r 9;: < 10 r mnz, rz, r ns, 1 s ; j;k ns, q r s { } ~ l r mnz,

More information

An Analytical Comparison between Bayes Point Machines and Support Vector Machines

An Analytical Comparison between Bayes Point Machines and Support Vector Machines An Analytical Comparison between Bayes Point Machines and Support Vector Machines Ashish Kapoor Massachusetts Institute of Technology Cambridge, MA 02139 kapoor@mit.edu Abstract This paper analyzes the

More information

Framework for functional tree simulation applied to 'golden delicious' apple trees

Framework for functional tree simulation applied to 'golden delicious' apple trees Purdue University Purdue e-pubs Open Access Theses Theses and Dissertations Spring 2015 Framework for functional tree simulation applied to 'golden delicious' apple trees Marek Fiser Purdue University

More information

AN IDENTIFICATION ALGORITHM FOR ARMAX SYSTEMS

AN IDENTIFICATION ALGORITHM FOR ARMAX SYSTEMS AN IDENTIFICATION ALGORITHM FOR ARMAX SYSTEMS First the X, then the AR, finally the MA Jan C. Willems, K.U. Leuven Workshop on Observation and Estimation Ben Gurion University, July 3, 2004 p./2 Joint

More information

SVM example: cancer classification Support Vector Machines

SVM example: cancer classification Support Vector Machines SVM example: cancer classification Support Vector Machines 1. Cancer genomics: TCGA The cancer genome atlas (TCGA) will provide high-quality cancer data for large scale analysis by many groups: SVM example:

More information

Large Scale Semi-supervised Linear SVMs. University of Chicago

Large Scale Semi-supervised Linear SVMs. University of Chicago Large Scale Semi-supervised Linear SVMs Vikas Sindhwani and Sathiya Keerthi University of Chicago SIGIR 2006 Semi-supervised Learning (SSL) Motivation Setting Categorize x-billion documents into commercial/non-commercial.

More information

Semi-Supervised Learning on Riemannian Manifolds

Semi-Supervised Learning on Riemannian Manifolds Semi-Supervised Learning on Riemannian Manifolds Mikhail Belkin, Partha Niyogi University of Chicago, Department of Computer Science Abstract. We consider the general problem of utilizing both labeled

More information

Hou, Ch. et al. IEEE Transactions on Neural Networks March 2011

Hou, Ch. et al. IEEE Transactions on Neural Networks March 2011 Hou, Ch. et al. IEEE Transactions on Neural Networks March 2011 Semi-supervised approach which attempts to incorporate partial information from unlabeled data points Semi-supervised approach which attempts

More information

Semi-Supervised Learning with the Graph Laplacian: The Limit of Infinite Unlabelled Data

Semi-Supervised Learning with the Graph Laplacian: The Limit of Infinite Unlabelled Data Semi-Supervised Learning with the Graph Laplacian: The Limit of Infinite Unlabelled Data Boaz Nadler Dept. of Computer Science and Applied Mathematics Weizmann Institute of Science Rehovot, Israel 76 boaz.nadler@weizmann.ac.il

More information

Learning with Consistency between Inductive Functions and Kernels

Learning with Consistency between Inductive Functions and Kernels Learning with Consistency between Inductive Functions and Kernels Haixuan Yang Irwin King Michael R. Lyu Department of Computer Science & Engineering The Chinese University of Hong Kong Shatin, N.T., Hong

More information

LABELED data is expensive to obtain in terms of both. Laplacian Embedded Regression for Scalable Manifold Regularization

LABELED data is expensive to obtain in terms of both. Laplacian Embedded Regression for Scalable Manifold Regularization Laplacian Embedded Regression for Scalable Manifold Regularization Lin Chen, Ivor W. Tsang, Dong Xu Abstract Semi-supervised Learning (SSL), as a powerful tool to learn from a limited number of labeled

More information

Predicting Graph Labels using Perceptron. Shuang Song

Predicting Graph Labels using Perceptron. Shuang Song Predicting Graph Labels using Perceptron Shuang Song shs037@eng.ucsd.edu Online learning over graphs M. Herbster, M. Pontil, and L. Wainer, Proc. 22nd Int. Conf. Machine Learning (ICML'05), 2005 Prediction

More information

Data dependent operators for the spatial-spectral fusion problem

Data dependent operators for the spatial-spectral fusion problem Data dependent operators for the spatial-spectral fusion problem Wien, December 3, 2012 Joint work with: University of Maryland: J. J. Benedetto, J. A. Dobrosotskaya, T. Doster, K. W. Duke, M. Ehler, A.

More information

Optimal data-independent point locations for RBF interpolation

Optimal data-independent point locations for RBF interpolation Optimal dataindependent point locations for RF interpolation S De Marchi, R Schaback and H Wendland Università di Verona (Italy), Universität Göttingen (Germany) Metodi di Approssimazione: lezione dell

More information

SEMI-SUPERVISED classification, which estimates a decision

SEMI-SUPERVISED classification, which estimates a decision IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS Semi-Supervised Support Vector Machines with Tangent Space Intrinsic Manifold Regularization Shiliang Sun and Xijiong Xie Abstract Semi-supervised

More information

Semi-Supervised Learning on Riemannian Manifolds

Semi-Supervised Learning on Riemannian Manifolds Machine Learning, 56, 209 239, 2004 c 2004 Kluwer Academic Publishers. Manufactured in The Netherlands. Semi-Supervised Learning on Riemannian Manifolds MIKHAIL BELKIN misha@math.uchicago.edu PARTHA NIYOGI

More information

Manifold Regularization

Manifold Regularization 9.520: Statistical Learning Theory and Applications arch 3rd, 200 anifold Regularization Lecturer: Lorenzo Rosasco Scribe: Hooyoung Chung Introduction In this lecture we introduce a class of learning algorithms,

More information

Cluster Kernels for Semi-Supervised Learning

Cluster Kernels for Semi-Supervised Learning Cluster Kernels for Semi-Supervised Learning Olivier Chapelle, Jason Weston, Bernhard Scholkopf Max Planck Institute for Biological Cybernetics, 72076 Tiibingen, Germany {first. last} @tuebingen.mpg.de

More information

Kernels for Multi task Learning

Kernels for Multi task Learning Kernels for Multi task Learning Charles A Micchelli Department of Mathematics and Statistics State University of New York, The University at Albany 1400 Washington Avenue, Albany, NY, 12222, USA Massimiliano

More information

Semi-Supervised Learning

Semi-Supervised Learning Semi-Supervised Learning getting more for less in natural language processing and beyond Xiaojin (Jerry) Zhu School of Computer Science Carnegie Mellon University 1 Semi-supervised Learning many human

More information

Kernel Methods. Machine Learning A W VO

Kernel Methods. Machine Learning A W VO Kernel Methods Machine Learning A 708.063 07W VO Outline 1. Dual representation 2. The kernel concept 3. Properties of kernels 4. Examples of kernel machines Kernel PCA Support vector regression (Relevance

More information

Semi-Supervised Learning with Graphs

Semi-Supervised Learning with Graphs Semi-Supervised Learning with Graphs Xiaojin (Jerry) Zhu LTI SCS CMU Thesis Committee John Lafferty (co-chair) Ronald Rosenfeld (co-chair) Zoubin Ghahramani Tommi Jaakkola 1 Semi-supervised Learning classifiers

More information

Convex Optimization in Classification Problems

Convex Optimization in Classification Problems New Trends in Optimization and Computational Algorithms December 9 13, 2001 Convex Optimization in Classification Problems Laurent El Ghaoui Department of EECS, UC Berkeley elghaoui@eecs.berkeley.edu 1

More information

Advances in Manifold Learning Presented by: Naku Nak l Verm r a June 10, 2008

Advances in Manifold Learning Presented by: Naku Nak l Verm r a June 10, 2008 Advances in Manifold Learning Presented by: Nakul Verma June 10, 008 Outline Motivation Manifolds Manifold Learning Random projection of manifolds for dimension reduction Introduction to random projections

More information

Semi-Supervised Learning with Graphs. Xiaojin (Jerry) Zhu School of Computer Science Carnegie Mellon University

Semi-Supervised Learning with Graphs. Xiaojin (Jerry) Zhu School of Computer Science Carnegie Mellon University Semi-Supervised Learning with Graphs Xiaojin (Jerry) Zhu School of Computer Science Carnegie Mellon University 1 Semi-supervised Learning classification classifiers need labeled data to train labeled data

More information

Kernel expansions with unlabeled examples

Kernel expansions with unlabeled examples Kernel expansions with unlabeled examples Martin Szummer MIT AI Lab & CBCL Cambridge, MA szummer@ai.mit.edu Tommi Jaakkola MIT AI Lab Cambridge, MA tommi@ai.mit.edu Abstract Modern classification applications

More information

Multi-view Laplacian Support Vector Machines

Multi-view Laplacian Support Vector Machines Multi-view Laplacian Support Vector Machines Shiliang Sun Department of Computer Science and Technology, East China Normal University, Shanghai 200241, China slsun@cs.ecnu.edu.cn Abstract. We propose a

More information

A Functional Quantum Programming Language

A Functional Quantum Programming Language A Functional Quantum Programming Language Thorsten Altenkirch University of Nottingham based on joint work with Jonathan Grattage and discussions with V.P. Belavkin supported by EPSRC grant GR/S30818/01

More information

Regression on Manifolds Using Kernel Dimension Reduction

Regression on Manifolds Using Kernel Dimension Reduction Jens Nilsson JENSN@MATHS.LTH.SE Centre for Mathematical Sciences, Lund University, Box 118, SE-221 00 Lund, Sweden Fei Sha FEISHA@CS.BERKELEY.EDU Computer Science Division, University of California, Berkeley,

More information

Laplacian Eigenmaps for Dimensionality Reduction and Data Representation

Laplacian Eigenmaps for Dimensionality Reduction and Data Representation Introduction and Data Representation Mikhail Belkin & Partha Niyogi Department of Electrical Engieering University of Minnesota Mar 21, 2017 1/22 Outline Introduction 1 Introduction 2 3 4 Connections to

More information

Gene expression experiments. Infinite Dimensional Vector Spaces. 1. Motivation: Statistical machine learning and reproducing kernel Hilbert Spaces

Gene expression experiments. Infinite Dimensional Vector Spaces. 1. Motivation: Statistical machine learning and reproducing kernel Hilbert Spaces MA 751 Part 3 Gene expression experiments Infinite Dimensional Vector Spaces 1. Motivation: Statistical machine learning and reproducing kernel Hilbert Spaces Gene expression experiments Question: Gene

More information

Infinite Dimensional Vector Spaces. 1. Motivation: Statistical machine learning and reproducing kernel Hilbert Spaces

Infinite Dimensional Vector Spaces. 1. Motivation: Statistical machine learning and reproducing kernel Hilbert Spaces MA 751 Part 3 Infinite Dimensional Vector Spaces 1. Motivation: Statistical machine learning and reproducing kernel Hilbert Spaces Microarray experiment: Question: Gene expression - when is the DNA in

More information

An RKHS for Multi-View Learning and Manifold Co-Regularization

An RKHS for Multi-View Learning and Manifold Co-Regularization Vikas Sindhwani vsindhw@us.ibm.com Mathematical Sciences, IBM T.J. Watson Research Center, Yorktown Heights, NY 10598 USA David S. Rosenberg drosen@stat.berkeley.edu Department of Statistics, University

More information

An Introduction to Optimal Control Applied to Disease Models

An Introduction to Optimal Control Applied to Disease Models An Introduction to Optimal Control Applied to Disease Models Suzanne Lenhart University of Tennessee, Knoxville Departments of Mathematics Lecture1 p.1/37 Example Number of cancer cells at time (exponential

More information

Contribution from: Springer Verlag Berlin Heidelberg 2005 ISBN

Contribution from: Springer Verlag Berlin Heidelberg 2005 ISBN Contribution from: Mathematical Physics Studies Vol. 7 Perspectives in Analysis Essays in Honor of Lennart Carleson s 75th Birthday Michael Benedicks, Peter W. Jones, Stanislav Smirnov (Eds.) Springer

More information

Chapter 9. Support Vector Machine. Yongdai Kim Seoul National University

Chapter 9. Support Vector Machine. Yongdai Kim Seoul National University Chapter 9. Support Vector Machine Yongdai Kim Seoul National University 1. Introduction Support Vector Machine (SVM) is a classification method developed by Vapnik (1996). It is thought that SVM improved

More information

Non-linear Dimensionality Reduction

Non-linear Dimensionality Reduction Non-linear Dimensionality Reduction CE-725: Statistical Pattern Recognition Sharif University of Technology Spring 2013 Soleymani Outline Introduction Laplacian Eigenmaps Locally Linear Embedding (LLE)

More information

Laplacian Eigenmaps for Dimensionality Reduction and Data Representation

Laplacian Eigenmaps for Dimensionality Reduction and Data Representation Laplacian Eigenmaps for Dimensionality Reduction and Data Representation Neural Computation, June 2003; 15 (6):1373-1396 Presentation for CSE291 sp07 M. Belkin 1 P. Niyogi 2 1 University of Chicago, Department

More information

DISSIPATIVE SYSTEMS. Lectures by. Jan C. Willems. K.U. Leuven

DISSIPATIVE SYSTEMS. Lectures by. Jan C. Willems. K.U. Leuven DISSIPATIVE SYSTEMS Lectures by Jan C. Willems K.U. Leuven p.1/26 LYAPUNOV FUNCTIONS Consider the classical dynamical system, the flow with, the state, and Denote the set of solutions, the vector-field.

More information

Kernel Methods. Barnabás Póczos

Kernel Methods. Barnabás Póczos Kernel Methods Barnabás Póczos Outline Quick Introduction Feature space Perceptron in the feature space Kernels Mercer s theorem Finite domain Arbitrary domain Kernel families Constructing new kernels

More information

Connection of Local Linear Embedding, ISOMAP, and Kernel Principal Component Analysis

Connection of Local Linear Embedding, ISOMAP, and Kernel Principal Component Analysis Connection of Local Linear Embedding, ISOMAP, and Kernel Principal Component Analysis Alvina Goh Vision Reading Group 13 October 2005 Connection of Local Linear Embedding, ISOMAP, and Kernel Principal

More information

Nonlinear Dimensionality Reduction

Nonlinear Dimensionality Reduction Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Kernel PCA 2 Isomap 3 Locally Linear Embedding 4 Laplacian Eigenmap

More information

Vectors. Teaching Learning Point. Ç, where OP. l m n

Vectors. Teaching Learning Point. Ç, where OP. l m n Vectors 9 Teaching Learning Point l A quantity that has magnitude as well as direction is called is called a vector. l A directed line segment represents a vector and is denoted y AB Å or a Æ. l Position

More information

Laplace-Beltrami Eigenfunctions for Deformation Invariant Shape Representation

Laplace-Beltrami Eigenfunctions for Deformation Invariant Shape Representation Laplace-Beltrami Eigenfunctions for Deformation Invariant Shape Representation Author: Raif M. Rustamov Presenter: Dan Abretske Johns Hopkins 2007 Outline Motivation and Background Laplace-Beltrami Operator

More information

Learning on Graphs and Manifolds. CMPSCI 689 Sridhar Mahadevan U.Mass Amherst

Learning on Graphs and Manifolds. CMPSCI 689 Sridhar Mahadevan U.Mass Amherst Learning on Graphs and Manifolds CMPSCI 689 Sridhar Mahadevan U.Mass Amherst Outline Manifold learning is a relatively new area of machine learning (2000-now). Main idea Model the underlying geometry of

More information

Redoing the Foundations of Decision Theory

Redoing the Foundations of Decision Theory Redoing the Foundations of Decision Theory Joe Halpern Cornell University Joint work with Larry Blume and David Easley Economics Cornell Redoing the Foundations of Decision Theory p. 1/21 Decision Making:

More information

Back to the future: Radial Basis Function networks revisited

Back to the future: Radial Basis Function networks revisited Back to the future: Radial Basis Function networks revisited Qichao Que, Mikhail Belkin Department of Computer Science and Engineering Ohio State University Columbus, OH 4310 que, mbelkin@cse.ohio-state.edu

More information

Advanced Introduction to Machine Learning

Advanced Introduction to Machine Learning 10-715 Advanced Introduction to Machine Learning Homework Due Oct 15, 10.30 am Rules Please follow these guidelines. Failure to do so, will result in loss of credit. 1. Homework is due on the due date

More information

Large Scale Semi-supervised Linear SVM with Stochastic Gradient Descent

Large Scale Semi-supervised Linear SVM with Stochastic Gradient Descent Journal of Computational Information Systems 9: 15 (2013) 6251 6258 Available at http://www.jofcis.com Large Scale Semi-supervised Linear SVM with Stochastic Gradient Descent Xin ZHOU, Conghui ZHU, Sheng

More information

ETIKA V PROFESII PSYCHOLÓGA

ETIKA V PROFESII PSYCHOLÓGA P r a ž s k á v y s o k á š k o l a p s y c h o s o c i á l n í c h s t u d i í ETIKA V PROFESII PSYCHOLÓGA N a t á l i a S l o b o d n í k o v á v e d ú c i p r á c e : P h D r. M a r t i n S t r o u

More information

MIT 9.520/6.860, Fall 2017 Statistical Learning Theory and Applications. Class 19: Data Representation by Design

MIT 9.520/6.860, Fall 2017 Statistical Learning Theory and Applications. Class 19: Data Representation by Design MIT 9.520/6.860, Fall 2017 Statistical Learning Theory and Applications Class 19: Data Representation by Design What is data representation? Let X be a data-space X M (M) F (M) X A data representation

More information

Does Unlabeled Data Help?

Does Unlabeled Data Help? Does Unlabeled Data Help? Worst-case Analysis of the Sample Complexity of Semi-supervised Learning. Ben-David, Lu and Pal; COLT, 2008. Presentation by Ashish Rastogi Courant Machine Learning Seminar. Outline

More information

Diffeomorphic Warping. Ben Recht August 17, 2006 Joint work with Ali Rahimi (Intel)

Diffeomorphic Warping. Ben Recht August 17, 2006 Joint work with Ali Rahimi (Intel) Diffeomorphic Warping Ben Recht August 17, 2006 Joint work with Ali Rahimi (Intel) What Manifold Learning Isn t Common features of Manifold Learning Algorithms: 1-1 charting Dense sampling Geometric Assumptions

More information

Integrating Global and Local Structures: A Least Squares Framework for Dimensionality Reduction

Integrating Global and Local Structures: A Least Squares Framework for Dimensionality Reduction Integrating Global and Local Structures: A Least Squares Framework for Dimensionality Reduction Jianhui Chen, Jieping Ye Computer Science and Engineering Department Arizona State University {jianhui.chen,

More information

Classification. The goal: map from input X to a label Y. Y has a discrete set of possible values. We focused on binary Y (values 0 or 1).

Classification. The goal: map from input X to a label Y. Y has a discrete set of possible values. We focused on binary Y (values 0 or 1). Regression and PCA Classification The goal: map from input X to a label Y. Y has a discrete set of possible values We focused on binary Y (values 0 or 1). But we also discussed larger number of classes

More information

Learning gradients: prescriptive models

Learning gradients: prescriptive models Department of Statistical Science Institute for Genome Sciences & Policy Department of Computer Science Duke University May 11, 2007 Relevant papers Learning Coordinate Covariances via Gradients. Sayan

More information

A Beamforming Method for Blind Calibration of Time-Interleaved A/D Converters

A Beamforming Method for Blind Calibration of Time-Interleaved A/D Converters A Beamforming Method for Blind Calibration of Time-nterleaved A/D Converters Bernard C. Levy University of California, Davis Joint wor with Steve Huang September 30, 200 Outline 1 Motivation Problem Formulation

More information

Convergence of Eigenspaces in Kernel Principal Component Analysis

Convergence of Eigenspaces in Kernel Principal Component Analysis Convergence of Eigenspaces in Kernel Principal Component Analysis Shixin Wang Advanced machine learning April 19, 2016 Shixin Wang Convergence of Eigenspaces April 19, 2016 1 / 18 Outline 1 Motivation

More information

Unsupervised Learning Techniques Class 07, 1 March 2006 Andrea Caponnetto

Unsupervised Learning Techniques Class 07, 1 March 2006 Andrea Caponnetto Unsupervised Learning Techniques 9.520 Class 07, 1 March 2006 Andrea Caponnetto About this class Goal To introduce some methods for unsupervised learning: Gaussian Mixtures, K-Means, ISOMAP, HLLE, Laplacian

More information

Data-dependent representations: Laplacian Eigenmaps

Data-dependent representations: Laplacian Eigenmaps Data-dependent representations: Laplacian Eigenmaps November 4, 2015 Data Organization and Manifold Learning There are many techniques for Data Organization and Manifold Learning, e.g., Principal Component

More information

Statistical machine learning and kernel methods. Primary references: John Shawe-Taylor and Nello Cristianini, Kernel Methods for Pattern Analysis

Statistical machine learning and kernel methods. Primary references: John Shawe-Taylor and Nello Cristianini, Kernel Methods for Pattern Analysis Part 5 (MA 751) Statistical machine learning and kernel methods Primary references: John Shawe-Taylor and Nello Cristianini, Kernel Methods for Pattern Analysis Christopher Burges, A tutorial on support

More information

Towards a High Level Quantum Programming Language

Towards a High Level Quantum Programming Language MGS Xmas 04 p.1/?? Towards a High Level Quantum Programming Language Thorsten Altenkirch University of Nottingham based on joint work with Jonathan Grattage and discussions with V.P. Belavkin supported

More information

Synopsis of Numerical Linear Algebra

Synopsis of Numerical Linear Algebra Synopsis of Numerical Linear Algebra Eric de Sturler Department of Mathematics, Virginia Tech sturler@vt.edu http://www.math.vt.edu/people/sturler Iterative Methods for Linear Systems: Basics to Research

More information

Regularization on Discrete Spaces

Regularization on Discrete Spaces Regularization on Discrete Spaces Dengyong Zhou and Bernhard Schölkopf Max Planck Institute for Biological Cybernetics Spemannstr. 38, 72076 Tuebingen, Germany {dengyong.zhou, bernhard.schoelkopf}@tuebingen.mpg.de

More information

A Brief Survey on Semi-supervised Learning with Graph Regularization

A Brief Survey on Semi-supervised Learning with Graph Regularization 000 00 002 003 004 005 006 007 008 009 00 0 02 03 04 05 06 07 08 09 020 02 022 023 024 025 026 027 028 029 030 03 032 033 034 035 036 037 038 039 040 04 042 043 044 045 046 047 048 049 050 05 052 053 A

More information

1 Graph Kernels by Spectral Transforms

1 Graph Kernels by Spectral Transforms Graph Kernels by Spectral Transforms Xiaojin Zhu Jaz Kandola John Lafferty Zoubin Ghahramani Many graph-based semi-supervised learning methods can be viewed as imposing smoothness conditions on the target

More information

Support Vector Machines for Classification: A Statistical Portrait

Support Vector Machines for Classification: A Statistical Portrait Support Vector Machines for Classification: A Statistical Portrait Yoonkyung Lee Department of Statistics The Ohio State University May 27, 2011 The Spring Conference of Korean Statistical Society KAIST,

More information

The Learning Problem and Regularization

The Learning Problem and Regularization 9.520 Class 02 February 2011 Computational Learning Statistical Learning Theory Learning is viewed as a generalization/inference problem from usually small sets of high dimensional, noisy data. Learning

More information

Support Vector Machines (SVMs).

Support Vector Machines (SVMs). Support Vector Machines (SVMs). SemiSupervised Learning. SemiSupervised SVMs. MariaFlorina Balcan 3/25/215 Support Vector Machines (SVMs). One of the most theoretically well motivated and practically most

More information

An Example file... log.txt

An Example file... log.txt # ' ' Start of fie & %$ " 1 - : 5? ;., B - ( * * B - ( * * F I / 0. )- +, * ( ) 8 8 7 /. 6 )- +, 5 5 3 2( 7 7 +, 6 6 9( 3 5( ) 7-0 +, => - +< ( ) )- +, 7 / +, 5 9 (. 6 )- 0 * D>. C )- +, (A :, C 0 )- +,

More information

Optimal Control of PDEs

Optimal Control of PDEs Optimal Control of PDEs Suzanne Lenhart University of Tennessee, Knoville Department of Mathematics Lecture1 p.1/36 Outline 1. Idea of diffusion PDE 2. Motivating Eample 3. Big picture of optimal control

More information

4.3 Laplace Transform in Linear System Analysis

4.3 Laplace Transform in Linear System Analysis 4.3 Laplace Transform in Linear System Analysis The main goal in analysis of any dynamic system is to find its response to a given input. The system response in general has two components: zero-state response

More information

CS 6375 Machine Learning

CS 6375 Machine Learning CS 6375 Machine Learning Nicholas Ruozzi University of Texas at Dallas Slides adapted from David Sontag and Vibhav Gogate Course Info. Instructor: Nicholas Ruozzi Office: ECSS 3.409 Office hours: Tues.

More information

Active and Semi-supervised Kernel Classification

Active and Semi-supervised Kernel Classification Active and Semi-supervised Kernel Classification Zoubin Ghahramani Gatsby Computational Neuroscience Unit University College London Work done in collaboration with Xiaojin Zhu (CMU), John Lafferty (CMU),

More information

Laplacian Spectrum Learning

Laplacian Spectrum Learning Laplacian Spectrum Learning Pannagadatta K. Shivaswamy and Tony Jebara Department of Computer Science Columbia University New York, NY 0027 {pks203,jebara}@cs.columbia.edu Abstract. The eigenspectrum of

More information

Limits of Spectral Clustering

Limits of Spectral Clustering Limits of Spectral Clustering Ulrike von Luxburg and Olivier Bousquet Max Planck Institute for Biological Cybernetics Spemannstr. 38, 72076 Tübingen, Germany {ulrike.luxburg,olivier.bousquet}@tuebingen.mpg.de

More information

QUESTIONS ON QUARKONIUM PRODUCTION IN NUCLEAR COLLISIONS

QUESTIONS ON QUARKONIUM PRODUCTION IN NUCLEAR COLLISIONS International Workshop Quarkonium Working Group QUESTIONS ON QUARKONIUM PRODUCTION IN NUCLEAR COLLISIONS ALBERTO POLLERI TU München and ECT* Trento CERN - November 2002 Outline What do we know for sure?

More information

Face Recognition Using Laplacianfaces He et al. (IEEE Trans PAMI, 2005) presented by Hassan A. Kingravi

Face Recognition Using Laplacianfaces He et al. (IEEE Trans PAMI, 2005) presented by Hassan A. Kingravi Face Recognition Using Laplacianfaces He et al. (IEEE Trans PAMI, 2005) presented by Hassan A. Kingravi Overview Introduction Linear Methods for Dimensionality Reduction Nonlinear Methods and Manifold

More information

Global vs. Multiscale Approaches

Global vs. Multiscale Approaches Harmonic Analysis on Graphs Global vs. Multiscale Approaches Weizmann Institute of Science, Rehovot, Israel July 2011 Joint work with Matan Gavish (WIS/Stanford), Ronald Coifman (Yale), ICML 10' Challenge:

More information

Robust Laplacian Eigenmaps Using Global Information

Robust Laplacian Eigenmaps Using Global Information Manifold Learning and its Applications: Papers from the AAAI Fall Symposium (FS-9-) Robust Laplacian Eigenmaps Using Global Information Shounak Roychowdhury ECE University of Texas at Austin, Austin, TX

More information