Supplemental Materials for Local Multidimensional Scaling for Nonlinear Dimension Reduction, Graph Drawing and Proximity Analysis
Lisha Chen and Andreas Buja
Yale University and University of Pennsylvania
October 3, 2008

Abstract

In this supplemental report we include sections that describe (1) a new interpretation of nonlinear dimension reduction in terms of a population framework as opposed to manifold learning, (2) a simulation example based on the well-known Swiss roll to illustrate LMDS on data with known structure, and (3) robustness properties of the LC meta-criterion under an error model. Finally, we include color versions of the plots in the main paper.

1 A Population Framework

Nonlinear dimension reduction can be approached with statistical modeling if it is thought of as manifold learning. Zhang and Zha (2005), for example,
examine tangent space alignment with a theoretical error analysis whereby an assumed target manifold is reconstructed from noisy point observations.

With practical experience in mind, we introduce an alternative to manifold learning and its assumption that the data fall near a warped sheet. We propose instead that the data be modeled by a distribution in high dimensions that describes the data, warts and all, with variation caused by digitization, rounding, uneven sampling density, and so on. In this view the goal is to develop methodology that shows what can be shown in low dimensions and to provide diagnostics that pinpoint problems with the reduction. By avoiding the prejudices implied by assumed models, the data are allowed to show whatever they have to show, be it manifolds, patchworks of manifolds of different dimensions, clusters, sculpted shapes with protrusions or holes, general uneven density patterns, and so on. For an example where this unprejudiced EDA view is successful, see the Frey face image data (Section 4.2, main paper), which exhibit a noisy patchwork of 0-D, 1-D and 2-D submanifolds in the form of clusters, rods between clusters, and webfoot-like structures between rods. This data example also makes it clear that no single dimension reduction may be sufficient to show all that is of interest: very localized methods (K = 4) reveal the underlying video sequence and transitions between clusters, whereas global methods (PCA, MDS) show the extent of the major interpretable clusters and dimensions more realistically. In summary, it is often pragmatic to assume less (a distribution) rather than more (a manifold) and to assign the statistician the task of flattening the distribution to lower dimensions as faithfully and usefully as possible.

An immediate benefit of the distribution view is to recognize Isomap as non-robust due to its use of geodesic distances. For a distribution, a geodesic
distance is the length of a shortest path within the support of the distribution. If a distribution's support is convex, the notion of geodesic distance in the population reduces to chordal distance. For small samples, Isomap may find manifolds that approximate the high-density areas of the distribution, but with increasing sample size the geodesic paths will find shortcuts that cross the low-density areas and asymptotically approximate chordal distances, thus changing the qualitative message of the dimension reduction as N → ∞. By comparison, the LMDS criterion has safer behavior under statistical sampling, as meaningful limits for N → ∞ exist. Here is the population target:

$$
\mathrm{LMDS}_{P,\mathcal{N}}(x) \;=\; E\Big[\big(\|Y - Y'\| - \|x(Y) - x(Y')\|\big)^2\, 1_{[(Y,Y') \in \mathcal{N}]}\Big] \;-\; t\, E\Big[\|x(Y) - x(Y')\|\, 1_{[(Y,Y') \notin \mathcal{N}]}\Big], \tag{1}
$$

where Y, Y' are i.i.d. P(dy) on R^p, N is a symmetric neighborhood definition in R^p, and the configuration y → x(y) is interpreted as a map from the support of P(dy) in R^p to R^d whose quality is being judged by this criterion. When specialized to empirical measures, LMDS_{P,N}(x) becomes LMDS_{D,N}(x(y_1), ..., x(y_N)). A full theory would establish when local minima of (1) exist and when those of (equation 7, main paper) obtained from data converge to those of (1) for N → ∞. Further theory would describe rates with which N and t can be shrunk to obtain finer resolution while still achieving consistency.

The LC meta-criteria also have population versions for dimension reduction: For y in the support of P(dy) (which we assume continuous), let N^Y_α(y) be the Euclidean neighborhood of y that contains mass α: P[Y ∈ N^Y_α(y)] = α; similarly, in configuration space let N^X_α(x(y)) be the mass-α neighborhood of x(y): P[x(Y) ∈ N^X_α(x(y))] = α. Then the population version of the LC meta-criterion for parameter α is

$$
M_\alpha \;=\; \frac{1}{\alpha}\, P\big[\, Y' \in N^Y_\alpha(Y) \ \text{and} \ x(Y') \in N^X_\alpha(x(Y)) \,\big],
$$

where again Y, Y' are i.i.d. P(dy). This specializes to M_K (equation 9, main paper) for K/N = α when P(dy) is an empirical measure (modulo exclusion of the center points, which is asymptotically irrelevant). The quantity M_α is a continuity measure, unlike for example the criterion of Hessian Eigenmaps (Donoho and Grimes 2003), which is a smoothness measure.

A population point of view had proven successful once before in work by Buja, Logan, Reeds and Shepp (1994), which tackled the problem of MDS performance on completely uninformative or null data. It was shown that in the limit (N → ∞) null data {D_{i,j}} produce non-trivial spherical distributions as configurations, and that this effect is also a likely contributor to the horseshoe effect, the artifactual bending of MDS configurations in the presence of noise.

2 A Simulation Example: Swiss Roll Data

To explore the effectiveness of LMDS when the underlying structure is known, we rely on a simulation example widely used in nonlinear dimension reduction: the Swiss roll, which is a 2-D planar rectangle (Figure 9, left) rolled up and embedded in 3-D (Figure 9, right). The objective is to recover the intrinsic 2-D structure from the observed 3-D data. Our data consist of 100 points forming an equispaced 5 × 20 rectangular grid in parameter space, mapped to 3-D by warping the long side along a spiral. We color the data points red, orange, green and blue by dividing the rectangular grid into four chunks of 25 points along the long dimension. This scheme facilitates visual evaluation of a configuration generated with a particular method.
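The data construction just described can be sketched as follows. This is only a sketch under assumptions: the report does not give its exact spiral map, so the common (θ cos θ, θ sin θ) parameterization is used below, and the function name is ours.

```python
import numpy as np

def swiss_roll_grid(n_long=20, n_short=5):
    """Equispaced 5 x 20 grid in parameter space, rolled up into 3-D.

    The spiral map is an assumption (the report does not specify it):
    the long coordinate is sent to angle theta and embedded as
    (theta cos theta, theta sin theta), the short coordinate is kept flat.
    """
    s = np.arange(n_short, dtype=float)              # short side, kept flat
    t = np.arange(n_long, dtype=float)               # long side, to be rolled
    S, T = np.meshgrid(s, t)                         # (n_long, n_short) arrays
    grid2d = np.column_stack([T.ravel(), S.ravel()])  # intrinsic 2-D coordinates

    # roll angle: 1.5 pi to 4.5 pi across the long side (assumed range)
    theta = 1.5 * np.pi * (1.0 + 2.0 * T.ravel() / (n_long - 1))
    X3d = np.column_stack([theta * np.cos(theta),
                           theta * np.sin(theta),
                           S.ravel()])

    # four color chunks of 25 points each along the long dimension
    colors = np.repeat(["red", "orange", "green", "blue"],
                       n_long * n_short // 4)
    return grid2d, X3d, colors
```

The point ordering makes the first 25 rows the red chunk (the five shortest-side points for each of the first five long-side positions), matching the chunking described above.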
Given the data, we obtain a 2-D configuration by following these steps:

(1) Calculate the Euclidean pairwise distances between the 3-D data points.
(2) Define the neighborhood based on metric thresholding (ε-NN). More specifically, for any pair of points (i, j), (i, j) ∈ N iff D_{i,j} ≤ ε = 1. (We have also tried K-NN (K = 4) for defining the neighborhood, with results slightly inferior to metric thresholding.)
(3) Choose a value for the tuning parameter τ.
(4) Optimize the LMDS stress function to generate a configuration.

In Figure 10 we show the configurations for four different values of τ. We see that the configurations with relatively larger τ's (τ = 1, 0.1) tend to be stretched out on the long side, which indicates that the repulsion is too strong. The most desirable configuration is achieved at τ = 0.01, which is most similar to the intrinsic 2-D structure. With no repulsion (τ = 0) the configurations get trapped in local minima, which again indicates that repulsion is essential for achieving meaningful configurations. The quality of the configurations as measured by the LC meta-criterion is consistent with our visual impressions, with larger M_{ε′=1} corresponding to a better configuration. (As in K-NN thresholding, we use ε′ for the calculation of the meta-criterion, in distinction to the ε used in defining the local graph for the stress function.) This again supports the idea that the LC meta-criterion provides a good measure of local faithfulness and of the quality of a configuration.

We also compared the configurations generated by LMDS with those of other dimension reduction methods, including PCA, MDS, Isomap and LLE. The configurations are shown in Figure 11. PCA provides a linear projection of the data that captures the greatest variation of the data, which is determined by the spiral structure. The global structure of the MDS configuration is very similar to the PCA configuration due to the dominance of large distances in the stress function. For Isomap, we use both the original Isomap method and a slightly modified version. The latter uses metric distance scaling instead of classical scaling to generate the configuration from the shortest path distances. Interestingly, Isomap with classical scaling produces a double fan shape which systematically deviates from the underlying true structure. Isomap with metric distance scaling, on the other hand, yields a configuration with less distortion, which is another example of the generally lower desirability of classical scaling compared to metric scaling. The parabolic shape generated by LLE suggests that LLE is able to do the unrolling to some extent but not fully, at least not without further fine tuning (LLE is known to be sensitive to tuning). The last embedding, generated by LMDS, is clearly the winner, and it outperforms its closest competitor, the embedding from Isomap with metric scaling, in terms of faithfulness to local structure. While in Isomap with metric scaling the local texture is just a byproduct of the global embedding, due to the dominance of the large geodesic distances in the stress function, in LMDS one is able to tune τ to obtain a locally faithful configuration. We see this as an appealing feature of LMDS.

3 Moderate Robustness of the LC Meta-Criterion

A referee raised the question of the robustness of the LC meta-criterion to noise. In the presence of noise, the set of K nearest neighbors of each data point will be affected. By measuring the degree of local overlap between neighborhoods
of configurations and noise-contaminated data, can the LC meta-criterion still guide the search for desirable configurations? The referee's intuition was that if the noise level exceeds the diameter of the K nearest neighborhoods, the LC meta-criterion will become essentially random. We explored the question through simulation. The results show that the LC meta-criterion is moderately robust to noise; it certainly does not become completely random for moderate noise levels.

In our simulation, we corrupted the true interpoint distances D_{i,j} of the uncorrupted Swiss roll data with Gaussian noise e_D with mean 0 and standard deviation SD(e_D). In order to measure the ability of the LC meta-criterion to distinguish good from bad configurations, we corrupted the underlying true 2-D configuration (the 5 × 20 regular grid with unit-length grid steps) with Gaussian noise e_X with mean 0 and standard deviation SD(e_X). The idea is that the LC meta-criterion is supposed to show a decline for increasing SD(e_X), the steeper the better. We then calculated the LC meta-criterion M_{K′=4} by measuring the overlaps of K′ = 4 neighborhoods in terms of the corrupted distances D_{i,j} vis-à-vis the corrupted grids. Figure 12 shows M_{K′=4} as a function of the noise level SD(e_X) in each of the six plots, and as a function of the noise level SD(e_D) across the plots. The noise levels we considered for SD(e_X) and SD(e_D) both range from 0 to 1, which is commensurate with the fact that the true underlying grid steps have unit length. Each plot shows the averages and 90% coverage intervals for M_{K′=4} based on 100 simulations. As hoped for, M_{K′=4} generally decreases as SD(e_X) increases. When the distances are moderately contaminated with noise (SD(e_D) = 0.2, 0.4, 0.6), the LC meta-criterion provides a good measure of the quality of a configuration, indicated by the overall downward trend of M_{K′=4}. Only when the noise level is too great (SD(e_D) = 0.8, 1) does the LC meta-criterion lose its ability to distinguish good configurations (small SD(e_X)) from bad configurations (large SD(e_X)).

It is conceivable that a version of the above simulations could be developed into a general diagnostic methodology for data analysis. It would provide insight into the robustness of the structure found, in terms of a sensitivity analysis. We thank the referee for suggesting this simulation study, which may well lead to refining the methodology of non-linear dimension reduction in general.

4 Color Plots

We reproduce here the full series of plots of the main paper; those that were intended to be in color are rendered here in color. Figures 9 onward are new and illustrate the Swiss roll experiments in this supplemental report.

REFERENCES

Buja, A., Logan, B.F., Reeds, J.R., and Shepp, L.A. (1994), "Inequalities and Positive-Definite Functions Arising from a Problem in Multidimensional Scaling," The Annals of Statistics, 22.

Donoho, D.L., and Grimes, C. (2003), "Hessian Eigenmaps: Locally Linear Embedding Techniques for High-Dimensional Data," Proceedings of the National Academy of Sciences, 100 (10).

Zhang, Z., and Zha, H. (2005), "Principal Manifolds and Nonlinear Dimensionality Reduction via Tangent Space Alignment," SIAM Journal on Scientific Computing, 26.
Figure 1: Sculpture Face Data. Three-D LMDS configuration, K = 6, optimized τ. [Axis labels: Up-down Pose, Lighting Direction.]
Figure 2: Sculpture Face Data. Three-D configurations from five methods (PCA, MDS, LMDS, Isomap, LLE). The gray tones encode terciles of known parameters indicated in the three column labels (among them Lighting Direction and Up-down Pose), rotated to best separation.
Figure 3: Sculpture Face Data: Selection of τ and K in the stress function LMDS_{K,τ}. Top left: trace τ → M_{K′=6} for configurations that minimize LMDS_{K=6,τ}; a global maximum is attained near τ = .005. Top right: traces K → M_K and M^adj_K for τ-minimized configurations, constraining K′ = K in LMDS_K; the adjusted criterion points to a global maximum at K′ = K = 8. Bottom: assessing τ-optimized solutions of LMDS_K with traces K′ → M_{K′} and M^(adj)_{K′} for various values of K (legend: K = 4, 6, 8, 10, 25, 125, 650): K = 8 dominates over a range of K′ beginning at 4.
Figure 4: Sculpture Face Data, pointwise diagnostics for 2-D views of 3-D configurations from four methods: 3-D PCA (N_6 = 2.6), 3-D Isomap (N_6 = 4.5), 3-D LLE (N_6 = 2.8), 3-D LMDS (N_6 = 5.2). The colors code N_6(i) as follows: red, overlap 0 to 2; orange, overlap 3 to 4; green, overlap 5 to 6. A comparison of the density of red and orange points reveals the local dominance of Isomap and LMDS over PCA and LLE, as well as an edge of LMDS over Isomap on these data.
Figure 5: Frey Face Data. 2-D views of 3-D configurations from LMDS with four choices of K (K = 4, 8, 12, 20). [Panel labels report M_4; e.g. M_4 = .268 for K = 8.]
Figure 6: Frey Face Data. 2-D views of 3-D configurations, comparing PCA, MDS, Isomap (K = 12), LLE (K = 12), KPCA and LMDS (K = 12).
Figure 7: Frey Face Data. The six configurations (from Figure 6) with connecting lines that reflect the time order of the video footage.
Figure 8: Frey Face Data. Traces of the meta-criterion, K′ → M_{K′}, for various choices of the neighborhood size K in the local stress LMDS_K (legend: K = 4, 8, 12, 20, 50, 100, 150, 200). Left: unadjusted fractional overlap; right: adjusted with MDS as the baseline. MDS is LMDS with K = 1964, which by definition has the profile M^adj_{K′} ≡ 0 in the right-hand frame.
Figure 9: Swiss Roll Data: A 2-D plane (left) rolled up and embedded into 3-D space (right).
Figure 10: Swiss Roll Data. 2-D configurations from LMDS with four choices of τ (τ = 1, 0.1, 0.01, 0). [Panel labels report M_{ε′=1}; e.g. M_{ε′=1} = 0.44 for τ = 1.]
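For concreteness, the local stress behind configurations such as those in Figure 10 can be sketched as follows. Everything below is our reconstruction, not the report's code: plain gradient descent stands in for the unspecified optimizer, the repulsion weight `t` plays the role the text assigns to τ, and the function names are ours.

```python
import numpy as np

def lmds_stress(X, D, inN, t):
    """Local stress: squared-error attraction on neighbor pairs, minus
    t times the summed configuration distances of non-neighbor pairs."""
    d = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    iu = np.triu_indices_from(D, k=1)          # each unordered pair once
    return ((D[iu] - d[iu]) ** 2 * inN[iu]).sum() - t * (d[iu] * ~inN[iu]).sum()

def lmds_fit(D, inN, t=0.01, d=2, steps=800, lr=0.01, seed=0):
    """Minimize the stress by plain gradient descent (a stand-in for the
    optimizer actually used, which this supplement does not describe).

    D   : (n, n) input distance matrix
    inN : (n, n) symmetric boolean neighborhood indicator, False on diagonal
    """
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    X = rng.normal(scale=0.01, size=(n, d))    # small random start
    for _ in range(steps):
        diff = X[:, None, :] - X[None, :, :]               # (n, n, d)
        dist = np.sqrt((diff ** 2).sum(-1)) + np.eye(n)    # guard the diagonal
        # neighbor pairs are pulled toward distance D_ij; others pushed apart
        w = np.where(inN, -2.0 * (D - dist) / dist, -t / dist)
        np.fill_diagonal(w, 0.0)
        X -= lr * (w[:, :, None] * diff).sum(axis=1)
    return X
```

Applied to the Swiss roll, `inN` would be the ε-NN indicator `(D > 0) & (D <= 1)` from step (2), and varying `t` mimics the effect of varying τ in the panels above.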
Figure 11: Swiss Roll Data. 2-D configurations, comparing PCA, MDS, Isomap (with both classical and metric scaling), LLE, and LMDS.
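The two Isomap variants compared in Figure 11 differ only in how the shortest-path distances become coordinates. Classical (Torgerson) scaling does this in closed form by eigendecomposing the double-centered squared distances; a minimal sketch (function name ours):

```python
import numpy as np

def classical_scaling(D, d=2):
    """Classical (Torgerson) scaling of a distance matrix.

    Double-center the squared distances, then use the top-d eigenpairs
    as coordinates. Negative eigenvalues (non-Euclidean input) are
    clipped to zero.
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # inner-product matrix
    vals, vecs = np.linalg.eigh(B)               # ascending eigenvalues
    idx = np.argsort(vals)[::-1][:d]             # take the top d
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))
```

Metric scaling, by contrast, minimizes a stress over the same distances iteratively, which is what allows it to trade off global against local fit.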
Figure 12: Swiss Roll Data. A simulation study of the robustness of the LC meta-criterion. e_D is the noise added to the original 3-D Swiss roll and e_X is the noise added to the intrinsic 2-D structure to generate artificial configurations. [Panels: SD(e_D) = 0, 0.2, 0.4, 0.6, 0.8, 1; horizontal axis: SD(e_X); vertical axis: M_{K′=4}.]
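A single replication of the simulation behind Figure 12 can be sketched as follows. This is our reconstruction under assumptions: the overlap measure below is one plausible reading of M_{K′=4}, the Gaussian noise on the distance matrix is symmetrized (the text does not say how symmetry is maintained), and Section 3's 100-replication averages and coverage intervals are not reproduced.

```python
import numpy as np

def knn_overlap(D_a, D_b, K=4):
    """Mean fractional overlap of the K-NN sets induced by two distance
    matrices (one plausible reading of the LC meta-criterion M_K)."""
    def nn_sets(D):
        Dm = D + np.diag(np.full(D.shape[0], np.inf))  # exclude self
        return [set(np.argsort(Dm[i])[:K]) for i in range(D.shape[0])]
    return np.mean([len(a & b) / K
                    for a, b in zip(nn_sets(D_a), nn_sets(D_b))])

def robustness_replication(sd_D, sd_X, K=4, seed=0):
    """One replication: corrupt the true 5 x 20 grid distances (e_D) and
    the grid itself (e_X), then score the corrupted grid against the
    corrupted distances."""
    rng = np.random.default_rng(seed)
    s, t = np.meshgrid(np.arange(5.0), np.arange(20.0))
    G = np.column_stack([t.ravel(), s.ravel()])        # unit-step grid
    D = np.sqrt(((G[:, None] - G[None]) ** 2).sum(-1))
    E = rng.normal(scale=sd_D, size=D.shape)
    D_noisy = D + (E + E.T) / 2                        # symmetrized e_D
    X_noisy = G + rng.normal(scale=sd_X, size=G.shape) # corrupted grid
    Dx = np.sqrt(((X_noisy[:, None] - X_noisy[None]) ** 2).sum(-1))
    return knn_overlap(D_noisy, Dx, K)
```

With no noise the score is 1 by construction; increasing SD(e_X) at fixed SD(e_D) drives it down, which is the downward trend the panels display.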
Liza Levina Permutation-invariant covariance regularization 1/42 Permutation-invariant regularization of large covariance matrices Liza Levina Department of Statistics University of Michigan Joint work
More informationData dependent operators for the spatial-spectral fusion problem
Data dependent operators for the spatial-spectral fusion problem Wien, December 3, 2012 Joint work with: University of Maryland: J. J. Benedetto, J. A. Dobrosotskaya, T. Doster, K. W. Duke, M. Ehler, A.
More informationLocally Linear Embedded Eigenspace Analysis
Locally Linear Embedded Eigenspace Analysis IFP.TR-LEA.YunFu-Jan.1,2005 Yun Fu and Thomas S. Huang Beckman Institute for Advanced Science and Technology University of Illinois at Urbana-Champaign 405 North
More informationManifold Learning: From Linear to nonlinear. Presenter: Wei-Lun (Harry) Chao Date: April 26 and May 3, 2012 At: AMMAI 2012
Manifold Learning: From Linear to nonlinear Presenter: Wei-Lun (Harry) Chao Date: April 26 and May 3, 2012 At: AMMAI 2012 1 Preview Goal: Dimensionality Classification reduction and clustering Main idea:
More informationOptical Flow-Based Transport on Image Manifolds
Optical Flow-Based Transport on Image Manifolds S. Nagaraj, C. Hegde, A. C. Sankaranarayanan, R. G. Baraniuk ECE Dept., Rice University CSAIL, Massachusetts Inst. Technology ECE Dept., Carnegie Mellon
More informationInformative Laplacian Projection
Informative Laplacian Projection Zhirong Yang and Jorma Laaksonen Department of Information and Computer Science Helsinki University of Technology P.O. Box 5400, FI-02015, TKK, Espoo, Finland {zhirong.yang,jorma.laaksonen}@tkk.fi
More informationMachine Learning. Chao Lan
Machine Learning Chao Lan Clustering and Dimensionality Reduction Clustering Kmeans DBSCAN Gaussian Mixture Model Dimensionality Reduction principal component analysis manifold learning Other Feature Processing
More information(Non-linear) dimensionality reduction. Department of Computer Science, Czech Technical University in Prague
(Non-linear) dimensionality reduction Jiří Kléma Department of Computer Science, Czech Technical University in Prague http://cw.felk.cvut.cz/wiki/courses/a4m33sad/start poutline motivation, task definition,
More informationStatistical Learning. Dong Liu. Dept. EEIS, USTC
Statistical Learning Dong Liu Dept. EEIS, USTC Chapter 6. Unsupervised and Semi-Supervised Learning 1. Unsupervised learning 2. k-means 3. Gaussian mixture model 4. Other approaches to clustering 5. Principle
More informationRegression on Manifolds Using Kernel Dimension Reduction
Jens Nilsson JENSN@MATHS.LTH.SE Centre for Mathematical Sciences, Lund University, Box 118, SE-221 00 Lund, Sweden Fei Sha FEISHA@CS.BERKELEY.EDU Computer Science Division, University of California, Berkeley,
More informationTANGENT SPACE INTRINSIC MANIFOLD REGULARIZATION FOR DATA REPRESENTATION. Shiliang Sun
TANGENT SPACE INTRINSIC MANIFOLD REGULARIZATION FOR DATA REPRESENTATION Shiliang Sun Department o Computer Science and Technology, East China Normal University 500 Dongchuan Road, Shanghai 200241, P. R.
More informationGaussian Process Latent Random Field
Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI-10) Gaussian Process Latent Random Field Guoqiang Zhong, Wu-Jun Li, Dit-Yan Yeung, Xinwen Hou, Cheng-Lin Liu National Laboratory
More informationMAP Examples. Sargur Srihari
MAP Examples Sargur srihari@cedar.buffalo.edu 1 Potts Model CRF for OCR Topics Image segmentation based on energy minimization 2 Examples of MAP Many interesting examples of MAP inference are instances
More informationSample Complexity of Learning Mahalanobis Distance Metrics. Nakul Verma Janelia, HHMI
Sample Complexity of Learning Mahalanobis Distance Metrics Nakul Verma Janelia, HHMI feature 2 Mahalanobis Metric Learning Comparing observations in feature space: x 1 [sq. Euclidean dist] x 2 (all features
More informationRobust Statistics, Revisited
Robust Statistics, Revisited Ankur Moitra (MIT) joint work with Ilias Diakonikolas, Jerry Li, Gautam Kamath, Daniel Kane and Alistair Stewart CLASSIC PARAMETER ESTIMATION Given samples from an unknown
More informationKernel-Based Contrast Functions for Sufficient Dimension Reduction
Kernel-Based Contrast Functions for Sufficient Dimension Reduction Michael I. Jordan Departments of Statistics and EECS University of California, Berkeley Joint work with Kenji Fukumizu and Francis Bach
More informationLocalized Sliced Inverse Regression
Localized Sliced Inverse Regression Qiang Wu, Sayan Mukherjee Department of Statistical Science Institute for Genome Sciences & Policy Department of Computer Science Duke University, Durham NC 2778-251,
More informationUnsupervised learning: beyond simple clustering and PCA
Unsupervised learning: beyond simple clustering and PCA Liza Rebrova Self organizing maps (SOM) Goal: approximate data points in R p by a low-dimensional manifold Unlike PCA, the manifold does not have
More informationSmart PCA. Yi Zhang Machine Learning Department Carnegie Mellon University
Smart PCA Yi Zhang Machine Learning Department Carnegie Mellon University yizhang1@cs.cmu.edu Abstract PCA can be smarter and makes more sensible projections. In this paper, we propose smart PCA, an extension
More informationPrincipal Components Analysis. Sargur Srihari University at Buffalo
Principal Components Analysis Sargur Srihari University at Buffalo 1 Topics Projection Pursuit Methods Principal Components Examples of using PCA Graphical use of PCA Multidimensional Scaling Srihari 2
More informationExploiting Sparse Non-Linear Structure in Astronomical Data
Exploiting Sparse Non-Linear Structure in Astronomical Data Ann B. Lee Department of Statistics and Department of Machine Learning, Carnegie Mellon University Joint work with P. Freeman, C. Schafer, and
More informationRiemannian Manifold Learning for Nonlinear Dimensionality Reduction
Riemannian Manifold Learning for Nonlinear Dimensionality Reduction Tony Lin 1,, Hongbin Zha 1, and Sang Uk Lee 2 1 National Laboratory on Machine Perception, Peking University, Beijing 100871, China {lintong,
More informationLearning gradients: prescriptive models
Department of Statistical Science Institute for Genome Sciences & Policy Department of Computer Science Duke University May 11, 2007 Relevant papers Learning Coordinate Covariances via Gradients. Sayan
More informationData Preprocessing Tasks
Data Tasks 1 2 3 Data Reduction 4 We re here. 1 Dimensionality Reduction Dimensionality reduction is a commonly used approach for generating fewer features. Typically used because too many features can
More informationEstimation of Intrinsic Dimensionality Using High-Rate Vector Quantization
Estimation of Intrinsic Dimensionality Using High-Rate Vector Quantization Maxim Raginsky and Svetlana Lazebnik Beckman Institute, University of Illinois 4 N Mathews Ave, Urbana, IL {maxim,slazebni}@uiuc.edu
More informationNotion of Distance. Metric Distance Binary Vector Distances Tangent Distance
Notion of Distance Metric Distance Binary Vector Distances Tangent Distance Distance Measures Many pattern recognition/data mining techniques are based on similarity measures between objects e.g., nearest-neighbor
More informationDIDELĖS APIMTIES DUOMENŲ VIZUALI ANALIZĖ
Vilniaus Universitetas Matematikos ir informatikos institutas L I E T U V A INFORMATIKA (09 P) DIDELĖS APIMTIES DUOMENŲ VIZUALI ANALIZĖ Jelena Liutvinavičienė 2017 m. spalis Mokslinė ataskaita MII-DS-09P-17-7
More informationBeyond Scalar Affinities for Network Analysis or Vector Diffusion Maps and the Connection Laplacian
Beyond Scalar Affinities for Network Analysis or Vector Diffusion Maps and the Connection Laplacian Amit Singer Princeton University Department of Mathematics and Program in Applied and Computational Mathematics
More informationInderjit Dhillon The University of Texas at Austin
Inderjit Dhillon The University of Texas at Austin ( Universidad Carlos III de Madrid; 15 th June, 2012) (Based on joint work with J. Brickell, S. Sra, J. Tropp) Introduction 2 / 29 Notion of distance
More informationISOMAP TRACKING WITH PARTICLE FILTER
Clemson University TigerPrints All Theses Theses 5-2007 ISOMAP TRACKING WITH PARTICLE FILTER Nikhil Rane Clemson University, nrane@clemson.edu Follow this and additional works at: https://tigerprints.clemson.edu/all_theses
More informationFreeman (2005) - Graphic Techniques for Exploring Social Network Data
Freeman (2005) - Graphic Techniques for Exploring Social Network Data The analysis of social network data has two main goals: 1. Identify cohesive groups 2. Identify social positions Moreno (1932) was
More informationAnalysis of Spectral Kernel Design based Semi-supervised Learning
Analysis of Spectral Kernel Design based Semi-supervised Learning Tong Zhang IBM T. J. Watson Research Center Yorktown Heights, NY 10598 Rie Kubota Ando IBM T. J. Watson Research Center Yorktown Heights,
More informationQUIZ ON CHAPTER 4 APPLICATIONS OF DERIVATIVES; MATH 150 FALL 2016 KUNIYUKI 105 POINTS TOTAL, BUT 100 POINTS
Math 150 Name: QUIZ ON CHAPTER 4 APPLICATIONS OF DERIVATIVES; MATH 150 FALL 2016 KUNIYUKI 105 POINTS TOTAL, BUT 100 POINTS = 100% Show all work, simplify as appropriate, and use good form and procedure
More informationDimensionality Reduction AShortTutorial
Dimensionality Reduction AShortTutorial Ali Ghodsi Department of Statistics and Actuarial Science University of Waterloo Waterloo, Ontario, Canada, 2006 c Ali Ghodsi, 2006 Contents 1 An Introduction to
More informationMachine Learning. CUNY Graduate Center, Spring Lectures 11-12: Unsupervised Learning 1. Professor Liang Huang.
Machine Learning CUNY Graduate Center, Spring 2013 Lectures 11-12: Unsupervised Learning 1 (Clustering: k-means, EM, mixture models) Professor Liang Huang huang@cs.qc.cuny.edu http://acl.cs.qc.edu/~lhuang/teaching/machine-learning
More informationUnsupervised Transformation Learning via Convex Relaxations
Unsupervised Transformation Learning via Convex Relaxations Tatsunori B. Hashimoto John C. Duchi Percy Liang Stanford University Stanford, CA 94305 {thashim,jduchi,pliang}@cs.stanford.edu Abstract Our
More informationThe Curse of Dimensionality for Local Kernel Machines
The Curse of Dimensionality for Local Kernel Machines Yoshua Bengio, Olivier Delalleau & Nicolas Le Roux April 7th 2005 Yoshua Bengio, Olivier Delalleau & Nicolas Le Roux Snowbird Learning Workshop Perspective
More information