MANIFOLD LEARNING: A MACHINE LEARNING PERSPECTIVE. Sam Roweis. University of Toronto Department of Computer Science. [Google: Sam Toronto ]
1 MANIFOLD LEARNING: A MACHINE LEARNING PERSPECTIVE Sam Roweis University of Toronto Department of Computer Science [Google: Sam Toronto ] MSRI High-Dimensional Data Workshop December 10, 2004
2 Manifold Learning
Means many things to many people. In machine learning, it generally refers to a class of unsupervised statistical problems:
Dimensionality reduction of a finite data set to preserve or highlight certain features of the original measurements.
Latent factor modeling of high-dimensional observations using only a small number of underlying causes.
Density estimation, based on a finite sample of points from a distribution over a high-dimensional space.
Mathematically, we assume x = f(y) + noise. We see samples of x generated by some unknown function f(·), an underlying distribution p(y), and some uncharacterized noise process, and we want to learn f(·) (or its inverse).
The problem is ill-posed, so we typically make several strong assumptions, e.g. f is smooth, p(y) is uniform, and the noise is small.
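The generative assumption x = f(y) + noise can be made concrete with a small simulation. The sketch below is illustrative only: the swiss-roll map, the latent ranges, and the name `sample_swiss_roll` are assumptions chosen for the example, not something taken from the talk.

```python
import numpy as np

def sample_swiss_roll(n, noise_std=0.05, seed=0):
    """Draw n noisy observations x = f(y) + noise from a 2-D latent space."""
    rng = np.random.default_rng(seed)
    # Latent coordinates y drawn from a uniform p(y), one of the assumptions on the slide.
    t = rng.uniform(1.5 * np.pi, 4.5 * np.pi, size=n)  # position along the roll
    h = rng.uniform(0.0, 10.0, size=n)                  # height
    y = np.column_stack([t, h])
    # Smooth embedding f: R^2 -> R^3 (a swiss roll; a hypothetical choice of f).
    x_clean = np.column_stack([t * np.cos(t), h, t * np.sin(t)])
    # Small isotropic Gaussian noise in observation space.
    x = x_clean + noise_std * rng.standard_normal(x_clean.shape)
    return y, x

y, x = sample_swiss_roll(1000)
print(y.shape, x.shape)
```

A manifold learner sees only `x`; recovering something like `y` (up to the ambiguities the slide mentions) is the task.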
3 Motivations for Manifold Learning
Most inputs are redundant: Data are points in a high-dimensional space. Coherent structure in the world generates strong correlations between components. Geometrically, observations lie on or near thin, connected, low-dimensional manifolds.
Many processes are nonlinear: We want to model the curved geometry of high-dimensional manifolds. Linearity can be a useful approximation in local domains, but globally it is too strong. Most interesting data have nonlinear structure.
Computational savings: We need to vastly decrease the size of inputs while preserving important similarities and differences. This improves the efficiency of statistical algorithms and avoids the curse of dimensionality.
4 Dimensionality Reduction
Goal: find a set of low-dimensional coordinates y_n for each high-dimensional observation x_n in order to preserve some measure of the original structure.
Appeal: no assumptions about distributions; the data are just what we have in front of us.
Disadvantages: does not generalize to new data; does not explicitly reveal anything about the nature of the underlying process, its latent causes, or the structure of the manifold it induces in the observation space.
5 Dimensionality Reduction
Approach: optimization of low-dimensional coordinates directly, given some carefully designed objective function.
Common theme: how to convert local information into global information (e.g. overlapping local geometric constraints, geodesic distances on local graphs, preserving neighbour identities).
Typical setup: build a locally connected graph on the data sample; use local measurements to induce a global objective function; optimize this objective using an eigenvector method.
Examples of linear methods: SVD, PCA, classical MDS.
Examples of nonlinear methods: Kruskal MDS, Isomap, LLE, Laplacian Eigenmaps, and variants (Conformal Isomap, Hessian LLE, Semidefinite Embedding), Local MDS, Projection Pursuit, self-organizing maps, Stochastic Neighbour Embedding (SNE).
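The "typical setup" (local graph, global objective, eigenvector solver) can be sketched with a Laplacian-eigenmaps-style embedding. This is a minimal NumPy illustration, not the implementation from any of the cited papers: it uses unweighted 0/1 edges, dense matrices, and assumes the neighbourhood graph is connected. The function names are hypothetical.

```python
import numpy as np

def knn_graph(X, k):
    """Symmetric 0/1 adjacency matrix of the k-nearest-neighbour graph."""
    n = X.shape[0]
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # squared distances
    np.fill_diagonal(d2, np.inf)                    # a point is not its own neighbour
    idx = np.argsort(d2, axis=1)[:, :k]             # k nearest neighbours per row
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(W, W.T)                       # connect i~j if either picks the other

def laplacian_eigenmap(X, k=8, dim=2):
    """Embed X using the bottom nontrivial eigenvectors of the graph Laplacian."""
    W = knn_graph(X, k)
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    L_sym = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    evals, evecs = np.linalg.eigh(L_sym)            # ascending eigenvalues
    # Map back to solutions of L v = lambda D v and skip the constant eigenvector.
    Y = d_inv_sqrt[:, None] * evecs[:, 1:dim + 1]
    return Y, evals

# Points along a helix: the 1-D embedding should order them along the curve.
t = np.linspace(0.0, 3.0, 60)
X = np.column_stack([np.cos(t), np.sin(t), t])
Y, evals = laplacian_eigenmap(X, k=4, dim=1)
print(abs(np.corrcoef(Y[:, 0], t)[0, 1]))
```

The local step is the neighbour search; the global objective is the quadratic form of the Laplacian; the optimization is a single symmetric eigendecomposition.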
6 Latent factor models
Goal: build an explicit model (often probabilistic) of the embedding function f(·) that explains the data we saw.
Appeal: explicitly represents underlying causes, allows us to generalize off the data, handles uncertainty and noise naturally.
Disadvantages: too many unknowns to build a full probabilistic model; in particular, there is a fundamental degeneracy between sampling in latent space and curvature of the manifold.
7 Latent factor models
Approach: make very strong assumptions and proceed from there using maximum likelihood learning (or approximations). Typical assumptions: uniform density in latent space, isometric embedding, bounded curvature of the manifold.
Examples of linear methods: probabilistic PCA, factor analysis, etc.
Examples of nonlinear methods: autoencoder neural networks, principal curves/surfaces, generative topographic mapping (GTM), independent components analysis (ICA), Kernel PCA.
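As one concrete instance of maximum likelihood learning in a linear latent factor model, the sketch below fits probabilistic PCA using the closed-form solution of Tipping & Bishop: the noise variance is the mean of the discarded covariance eigenvalues, and the loading matrix is built from the retained eigenvectors. The function names are hypothetical, the rotation of the latent space is fixed to the identity, and the code assumes q is smaller than the data dimension.

```python
import numpy as np

def ppca_ml(X, q):
    """Closed-form maximum likelihood fit of probabilistic PCA with q latent dims."""
    n, d = X.shape
    mu = X.mean(axis=0)
    S = (X - mu).T @ (X - mu) / n                   # sample covariance (ML version)
    evals, evecs = np.linalg.eigh(S)
    evals, evecs = evals[::-1], evecs[:, ::-1]      # sort descending
    sigma2 = evals[q:].mean()                       # noise variance: mean discarded eigenvalue
    # Loadings W = U_q (Lambda_q - sigma2 I)^{1/2}.
    W = evecs[:, :q] * np.sqrt(np.maximum(evals[:q] - sigma2, 0.0))
    return mu, W, sigma2

def ppca_posterior_mean(X, mu, W, sigma2):
    """Posterior mean of the latent variable for each row of X."""
    M = W.T @ W + sigma2 * np.eye(W.shape[1])
    return np.linalg.solve(M, W.T @ (X - mu).T).T

rng = np.random.default_rng(0)
Z = rng.standard_normal((500, 2))                   # latent causes
A = rng.standard_normal((2, 5))                     # hypothetical true loadings
X = Z @ A + 0.1 * rng.standard_normal((500, 5))     # x = f(y) + noise with linear f
mu, W, sigma2 = ppca_ml(X, q=2)
print(W.shape, sigma2)
```

Unlike the coordinate-optimization methods, the fitted model applies to new data directly: `ppca_posterior_mean` can be called on points that were not in the training sample.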
8 Global Coordination of Local Models
Locally simple (e.g. linear) models can be stitched together or aligned to form a global factor model of the entire data space.
Appeal: manifolds often look simple locally (e.g. almost linear, almost uniform data sampling). We can often train a simple model well if it is restricted to a small part of space. Combining models has a long history in statistics as mixture modeling.
Disadvantages: we need to specify what our goal is in coordination and then design new algorithms to achieve it.
Approaches:
Decoupled: train local models and align their internal coordinates later (Teh/Roweis, Brand, Verbeek).
Simultaneous: fit local models in a way that encourages their agreement (Roweis, Saul, Hinton).
9 Other Issues in Manifold Learning
Out-of-sample extensions for many dimensionality reduction methods can be achieved with interpolation techniques such as the Nyström approximation.
Semi-supervised versions of many of these problems arise naturally if we are given class labels, partial observations of hidden causes, correspondence information, etc.
Recent clustering algorithms have used similar techniques and addressed related problems (e.g. spectral clustering, min cut).
Exploration of the fundamental link between spectral nonlinear dimensionality reduction algorithms and kernel methods.
The isolated sub-problem of estimating the underlying (co-)dimensionality of a manifold has received lots of attention.
Increased focus on computational speedups, e.g. landmark methods, efficient iterated eigensolvers, convex programming.
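A Nyström-style out-of-sample extension can be sketched for a kernel-based spectral embedding. This is a simplified illustration under stated assumptions: an RBF kernel, no centering of the kernel matrix, and hypothetical function names. With training eigenpairs (λ_k, v_k) of K, a training point i gets coordinate sqrt(λ_k) v_ik, and a new point x gets (1 / sqrt(λ_k)) Σ_i v_ik K(x, x_i), which agrees with the training coordinates when x is itself a training point.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_spectral_embedding(X, dim, gamma=1.0):
    """Embed training points with the top eigenvectors of the (uncentered) kernel matrix."""
    K = rbf_kernel(X, X, gamma)
    evals, evecs = np.linalg.eigh(K)
    evals, evecs = evals[::-1][:dim], evecs[:, ::-1][:, :dim]  # keep the top `dim`
    Y = evecs * np.sqrt(evals)
    return Y, evals, evecs

def nystrom_extend(X_new, X_train, evals, evecs, gamma=1.0):
    """Nystrom formula: project kernel evaluations onto the training eigenvectors."""
    K_new = rbf_kernel(X_new, X_train, gamma)
    return K_new @ evecs / np.sqrt(evals)

rng = np.random.default_rng(0)
X_train = rng.standard_normal((30, 3))
Y, evals, evecs = fit_spectral_embedding(X_train, dim=2)
X_new = rng.standard_normal((5, 3))
print(nystrom_extend(X_new, X_train, evals, evecs).shape)
```

The extension costs one kernel evaluation per training point for each new input and avoids recomputing the eigendecomposition.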
10 Linear Projection Methods References
Zoubin Ghahramani & Geoff Hinton, The EM algorithm for Mixtures of Factor Analyzers, U. Toronto Tech Report CRG-TR-96-1.
A.J. Bell & T.J. Sejnowski, An information maximisation approach to blind separation and blind deconvolution, Neural Computation 7(6).
David MacKay, Maximum Likelihood and Covariant Algorithms for ICA, unpublished.
A. Hyvarinen, J. Karhunen & E. Oja, Independent Component Analysis, Wiley.
Sam Roweis, EM Algorithms for PCA and SPCA, NIPS 10.
M.E. Tipping & C.M. Bishop, Probabilistic principal component analysis, Journal of the Royal Statistical Society, 61(3), pp. 611.
A. Basilevsky, Statistical Factor Analysis and Related Methods, Wiley, New York.
I. Borg & P. Groenen, Modern Multidimensional Scaling: Theory and Applications, Springer.
T.F. Cox & M.A.A. Cox, Multidimensional Scaling, Chapman and Hall, 2001.
11 Alignment of Local Models References
Michael E. Tipping & Christopher M. Bishop, Mixtures of Probabilistic Principal Component Analysers, Neural Computation 11(2).
Sam Roweis, Lawrence Saul & Geoff Hinton, Global Coordination of Local Linear Models, NIPS 14.
Yee Whye Teh & Sam T. Roweis, Automatic Alignment of Hidden Representations, NIPS 15.
M. Brand, Charting a manifold, NIPS 15.
J.H. Ham, D.D. Lee & L.K. Saul, Learning high dimensional correspondences from low dimensional manifolds, ICML Workshop on The Continuum from Labeled to Unlabeled Data in Machine Learning and Data Mining, 2003.
J.J. Verbeek, S.T. Roweis & N. Vlassis, Non-linear CCA and PCA by Alignment of Local Models, NIPS 16, 2004.
12 References: Neural networks, and other nonparametric mappings
Geoffrey Hinton & Sam T. Roweis, Stochastic Neighbor Embedding, NIPS 15.
G.E. Hinton, P. Dayan & M. Revow, Modeling the manifolds of handwritten digits, IEEE Transactions on Neural Networks.
N. Kambhatla & T. Leen, Dimension reduction by local principal component analysis, Neural Computation, v.9.
C.M. Bishop, M. Svensén & C.K.I. Williams, GTM: The Generative Topographic Mapping, Neural Computation, 10(1).
H. Bourlard & Y. Kamp, Auto-association by multilayer perceptrons and singular value decomposition, Biological Cybernetics, Vol. 59, 1988.
K.I. Diamantaras & S.Y. Kung, Principal Component Neural Networks, John Wiley.
R. Durbin & D. Willshaw, An Analogue Approach to the Travelling Salesman Problem Using an Elastic Net Method, Nature, Vol. 326, 1987.
E. Erwin, K. Obermayer & K. Schulten, Self-organizing maps: ordering, convergence properties and energy functions, Biological Cybernetics, 67(1), 1992.
13 References: Principal Curves and Projection Pursuit
T.J. Hastie & W. Stuetzle, Principal curves, Journal of the American Statistical Association, v.84.
P. Diaconis & D. Freedman, Asymptotics of graphical projection pursuit, Annals of Statistics, v.12.
J.H. Friedman, W. Stuetzle & A. Schroeder, Projection pursuit density estimation, Journal of the American Statistical Association, v.79.
J.H. Friedman & J.W. Tukey, A projection pursuit algorithm for exploratory data analysis, IEEE Transactions on Computers, c-23(9).
P.J. Huber, Projection pursuit, Annals of Statistics, 13(2), 1985.
14 References: Eigenvector Manifold Learning Algorithms
S.T. Roweis & L.K. Saul, Nonlinear dimensionality reduction by locally linear embedding, Science, 290(22).
L.K. Saul & S.T. Roweis, Think globally, fit locally: unsupervised learning of low dimensional manifolds, Journal of Machine Learning Research, v.4.
J.B. Tenenbaum, V. de Silva & J.C. Langford, A Global Geometric Framework for Nonlinear Dimensionality Reduction, Science 290(22).
J.B. Tenenbaum, Mapping a Manifold of Perceptual Observations, NIPS 10.
M. Belkin & P. Niyogi, Laplacian eigenmaps for dimensionality reduction and data representation, Neural Computation 15(6).
M. Belkin & P. Niyogi, Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering, NIPS 14.
Y. Bengio, J. Paiement & P. Vincent, Out-of-sample extensions for LLE, Isomap, MDS, Eigenmaps and spectral clustering, NIPS 16.
V. de Silva & J.B. Tenenbaum, Global versus local methods in nonlinear dimensionality reduction, NIPS 15.
D.L. Donoho & C.E. Grimes, Hessian Eigenmaps: new locally linear embedding techniques for high-dimensional data, Proceedings of the National Academy of Sciences, v.100.
H. Zha & Z. Zhang, Isometric embedding and continuum Isomap, ICML.
K.Q. Weinberger, F. Sha & L.K. Saul, Learning a kernel matrix for nonlinear dimensionality reduction, ICML.
K.Q. Weinberger & L.K. Saul, Unsupervised learning of image manifolds by semidefinite programming, CVPR, 2004.
15 Spectral Clustering References
Andrew Ng, Michael Jordan & Yair Weiss, On spectral clustering: analysis and an algorithm, NIPS 14.
Marina Meila & Jianbo Shi, Learning segmentation by random walks, NIPS 12.
C. Fowlkes, S. Belongie, F. Chung & J. Malik, Spectral grouping using the Nystrom method, IEEE Trans. Pattern Analysis and Machine Intelligence, 26(2).
Jianbo Shi & Jitendra Malik, Normalized cuts and image segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8).
R. Pless & I. Simon, Embedding images in non-flat spaces, Washington U., Tech. Rep. WU-CS-01-43.
F.R.K. Chung, Spectral Graph Theory, American Mathematical Society, 1997.
16 Kernel Methods References
M.A. Aizerman, E.M. Braverman & L.I. Rozoner, Theoretical foundations of the potential function method in pattern recognition learning, Automation and Remote Control, v.25, 1964.
J. Ham, D.D. Lee, S. Mika & B. Scholkopf, A kernel view of dimensionality reduction of manifolds, ICML.
C.K.I. Williams, On a Connection between Kernel PCA and Metric Multidimensional Scaling, NIPS 13.
C.K.I. Williams & M. Seeger, Using the Nystrom method to speed up kernel machines, NIPS 13.
B. Scholkopf, A. Smola & K.-R. Muller, Nonlinear Component Analysis as a Kernel Eigenvalue Problem, Neural Computation, 10(5).
B. Scholkopf, The kernel trick for distances, NIPS 13.
B. Scholkopf & A. Smola, Learning with Kernels, MIT Press.
S. Mika, B. Scholkopf, A. Smola, K. Muller, M. Scholz & G. Ratsch, Kernel PCA and de-noising in feature spaces, NIPS 11, 1999.
17 Useful Overviews, etc. References
Martin Law's Manifold Learning Resource Page: lawhiu/manifold/
Chris Burges' review of dimensionality reduction: cburges/tech reports/tr dimred.pdf
General Mathematical/Statistical Background
B.D. Ripley, Pattern Recognition and Neural Networks, Cambridge University Press, 1996.
R.O. Duda & P.E. Hart, Pattern Classification and Scene Analysis, John Wiley.
G.H. Golub & C.F. Van Loan, Matrix Computations (3rd ed.), Johns Hopkins, 1996.
R.A. Horn & C.R. Johnson, Matrix Analysis, Cambridge University Press, 1985.
J.R. Magnus & H. Neudecker, Matrix Differential Calculus with Applications, Wiley.
T. Hastie, R. Tibshirani & J. Friedman, The Elements of Statistical Learning, Springer-Verlag, 2001.
More informationNonlinear Dimensionality Reduction
Nonlinear Dimensionality Reduction Piyush Rai CS5350/6350: Machine Learning October 25, 2011 Recap: Linear Dimensionality Reduction Linear Dimensionality Reduction: Based on a linear projection of the
More informationNonlinear Methods. Data often lies on or near a nonlinear low-dimensional curve aka manifold.
Nonlinear Methods Data often lies on or near a nonlinear low-dimensional curve aka manifold. 27 Laplacian Eigenmaps Linear methods Lower-dimensional linear projection that preserves distances between all
More informationarxiv: v1 [cs.lg] 30 Jun 2012
Implicit Density Estimation by Local Moment Matching to Sample from Auto-Encoders arxiv:1207.0057v1 [cs.lg] 30 Jun 2012 Yoshua Bengio, Guillaume Alain, and Salah Rifai Department of Computer Science and
More informationValidation of nonlinear PCA
Matthias Scholz. Validation of nonlinear PCA. (pre-print version) The final publication is available at www.springerlink.com Neural Processing Letters, 212, Volume 36, Number 1, Pages 21-3 Doi: 1.17/s1163-12-922-6
More informationDimensionality Reduction: A Comparative Review
Dimensionality Reduction: A Comparative Review L.J.P. van der Maaten, E.O. Postma, H.J. van den Herik MICC, Maastricht University, P.O. Box 616, 6200 MD Maastricht, The Netherlands. Abstract In recent
More informationMACHINE LEARNING. Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA
1 MACHINE LEARNING Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA 2 Practicals Next Week Next Week, Practical Session on Computer Takes Place in Room GR
More informationLocally Linear Embedded Eigenspace Analysis
Locally Linear Embedded Eigenspace Analysis IFP.TR-LEA.YunFu-Jan.1,2005 Yun Fu and Thomas S. Huang Beckman Institute for Advanced Science and Technology University of Illinois at Urbana-Champaign 405 North
More informationWhat, exactly, is a cluster? - Bernhard Schölkopf, personal communication
Chapter 1 Warped Mixture Models What, exactly, is a cluster? - Bernhard Schölkopf, personal communication Previous chapters showed how the probabilistic nature of GPs sometimes allows the automatic determination
More informationUsing Kernel PCA for Initialisation of Variational Bayesian Nonlinear Blind Source Separation Method
Using Kernel PCA for Initialisation of Variational Bayesian Nonlinear Blind Source Separation Method Antti Honkela 1, Stefan Harmeling 2, Leo Lundqvist 1, and Harri Valpola 1 1 Helsinki University of Technology,
More informationMachine learning for pervasive systems Classification in high-dimensional spaces
Machine learning for pervasive systems Classification in high-dimensional spaces Department of Communications and Networking Aalto University, School of Electrical Engineering stephan.sigg@aalto.fi Version
More informationNonlinear Learning using Local Coordinate Coding
Nonlinear Learning using Local Coordinate Coding Kai Yu NEC Laboratories America kyu@sv.nec-labs.com Tong Zhang Rutgers University tzhang@stat.rutgers.edu Yihong Gong NEC Laboratories America ygong@sv.nec-labs.com
More informationSTA 414/2104: Lecture 8
STA 414/2104: Lecture 8 6-7 March 2017: Continuous Latent Variable Models, Neural networks With thanks to Russ Salakhutdinov, Jimmy Ba and others Outline Continuous latent variable models Background PCA
More informationSmooth Bayesian Kernel Machines
Smooth Bayesian Kernel Machines Rutger W. ter Borg 1 and Léon J.M. Rothkrantz 2 1 Nuon NV, Applied Research & Technology Spaklerweg 20, 1096 BA Amsterdam, the Netherlands rutger@terborg.net 2 Delft University
More informationLECTURE NOTE #11 PROF. ALAN YUILLE
LECTURE NOTE #11 PROF. ALAN YUILLE 1. NonLinear Dimension Reduction Spectral Methods. The basic idea is to assume that the data lies on a manifold/surface in D-dimensional space, see figure (1) Perform
More informationSmart PCA. Yi Zhang Machine Learning Department Carnegie Mellon University
Smart PCA Yi Zhang Machine Learning Department Carnegie Mellon University yizhang1@cs.cmu.edu Abstract PCA can be smarter and makes more sensible projections. In this paper, we propose smart PCA, an extension
More informationLinear Heteroencoders
Gatsby Computational Neuroscience Unit 17 Queen Square, London University College London WC1N 3AR, United Kingdom http://www.gatsby.ucl.ac.uk +44 20 7679 1176 Funded in part by the Gatsby Charitable Foundation.
More informationMultiple Similarities Based Kernel Subspace Learning for Image Classification
Multiple Similarities Based Kernel Subspace Learning for Image Classification Wang Yan, Qingshan Liu, Hanqing Lu, and Songde Ma National Laboratory of Pattern Recognition, Institute of Automation, Chinese
More informationTutorial on Gaussian Processes and the Gaussian Process Latent Variable Model
Tutorial on Gaussian Processes and the Gaussian Process Latent Variable Model (& discussion on the GPLVM tech. report by Prof. N. Lawrence, 06) Andreas Damianou Department of Neuro- and Computer Science,
More informationMatching the dimensionality of maps with that of the data
Matching the dimensionality of maps with that of the data COLIN FYFE Applied Computational Intelligence Research Unit, The University of Paisley, Paisley, PA 2BE SCOTLAND. Abstract Topographic maps are
More informationIntegrating Global and Local Structures: A Least Squares Framework for Dimensionality Reduction
Integrating Global and Local Structures: A Least Squares Framework for Dimensionality Reduction Jianhui Chen, Jieping Ye Computer Science and Engineering Department Arizona State University {jianhui.chen,
More informationShort-term Wind Speed Forecasting by Using Model Structure Selection and Manifold Algorithm
by Using Model Structure Selection and Manifold Algorithm 1 School of Automation, Southeast University, Jiangsu, 210096, Nanjing, China Key Laboratory of Measurement and Control for Complex System of Ministry
More informationClustering in kernel embedding spaces and organization of documents
Clustering in kernel embedding spaces and organization of documents Stéphane Lafon Collaborators: Raphy Coifman (Yale), Yosi Keller (Yale), Ioannis G. Kevrekidis (Princeton), Ann B. Lee (CMU), Boaz Nadler
More informationLecture 10: Dimension Reduction Techniques
Lecture 10: Dimension Reduction Techniques Radu Balan Department of Mathematics, AMSC, CSCAMM and NWC University of Maryland, College Park, MD April 17, 2018 Input Data It is assumed that there is a set
More informationISSN: (Online) Volume 3, Issue 5, May 2015 International Journal of Advance Research in Computer Science and Management Studies
ISSN: 2321-7782 (Online) Volume 3, Issue 5, May 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online at:
More information