Diffusion/Inference geometries of data features, situational awareness and visualization. Ronald R Coifman Mathematics Yale University

Digital data is generally converted to point clouds in high-dimensional Euclidean space. The data is often correlated and analyzed through global linear-algebra tools such as the SVD. More generally, a kernel defining an affinity between data points (usually the kernel of a positive operator) can replace the correlation. Distances between data points are used to build a graph, on which the spectral theory of the Laplace operator provides data clustering. In all these cases the eigenfunctions of some linear operator effect a dimensional reduction through projection into a low-dimensional space. We claim that diffusion (inference) geometries provide a powerful systematic approach to data analysis and organization, a methodology which enables global situational assessment based on local inferences.
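The pipeline just described — affinity kernel, Markov normalization, spectral embedding — can be sketched as follows. The kernel width `eps`, diffusion time `t`, and number of coordinates are illustrative choices, not prescriptions from the slides:

```python
import numpy as np

def diffusion_map(X, eps=1.0, n_coords=3, t=1):
    """Sketch of a diffusion map: Gaussian kernel -> Markov matrix -> embedding."""
    # Pairwise squared Euclidean distances
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / eps)                  # Gaussian affinity kernel
    d = K.sum(axis=1)                      # degrees
    A = K / np.sqrt(np.outer(d, d))        # symmetric conjugate of P = D^-1 K
    w, V = np.linalg.eigh(A)
    order = np.argsort(w)[::-1]
    w, V = w[order], V[:, order]
    psi = V / np.sqrt(d)[:, None]          # right eigenvectors of P; psi[:, 0] is trivial
    # Diffusion coordinates: lambda^t * psi, skipping the constant eigenvector
    return (w[1:n_coords + 1] ** t) * psi[:, 1:n_coords + 1]
```

Working with the symmetric conjugate of the Markov matrix keeps the eigendecomposition numerically stable while yielding the same spectrum.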

Diffusions between A and B have to go through the bottleneck, while C is easily reachable from B. The Markov matrix defining a diffusion can be given by a kernel, or by inference between neighboring nodes. The diffusion distance accounts for the preponderance of inference. The shortest path between A and C is roughly the same as between B and C; the diffusion distance, however, is larger, since the diffusion occurs through a bottleneck.
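The bottleneck effect can be checked numerically on a toy graph (an illustrative construction, not the slide's figure): two cliques joined by a single bridge edge have short shortest paths across the bridge, yet a large diffusion distance across it:

```python
import numpy as np

# Two 4-node cliques joined by one bridge edge (the "bottleneck").
n = 8
W = np.zeros((n, n))
W[:4, :4] = 1.0
W[4:, 4:] = 1.0
np.fill_diagonal(W, 0.0)
W[3, 4] = W[4, 3] = 1.0                # the bottleneck edge

P = W / W.sum(axis=1, keepdims=True)   # Markov diffusion matrix
Pt = np.linalg.matrix_power(P, 3)      # 3-step transition probabilities
pi = W.sum(axis=1) / W.sum()           # stationary distribution

def diffusion_distance(i, j):
    # D_t(i, j)^2 = sum_z (p_t(i, z) - p_t(j, z))^2 / pi(z)
    return np.sqrt((((Pt[i] - Pt[j]) ** 2) / pi).sum())
```

Within one clique the 3-step transition profiles nearly coincide, so the diffusion distance is small; across the bridge most of the diffused mass has not crossed, so the distance is large.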

Original data set; embedding of the data into the first three diffusion coordinates.

The long-term diffusion of a heterogeneous material is remapped below. The left side has a higher proportion of heat-conducting material, thereby reducing the diffusion distance among points; the bottleneck increases that distance.

Diffusion map into 3-D of the heterogeneous graph. The distance between two points measures the diffusion between them.

A mixture of two materials in which the red is very conductive relative to the blue. The green sets are diffusion balls at time 3. On the right is the diffusion map, which encodes the diffusion distance as the Euclidean distance in the map (computed in time on the order of the number of samples, not its square).

A similarity or relationship between two points (digital documents) is computed as a combination of all chains of inference linking them; these are the diffusion metrics. Clustering in this metric leads to robust document segmentation and tracking. Various local criteria of relevance lead to distinct geometries; in these geometries the user can define relevance and filter away unrelated information. Self-organization of digital documents is achieved through local similarity modeling; the top eigenfunctions of the empirical model provide a global organization of the given data set. The diffusion maps embed the data into a low-dimensional Euclidean space and convert the (diffusion) relational inference metric isometrically into the corresponding Euclidean distance.

Diffusion metrics can be computed efficiently as an ordinary Euclidean distance in a low-dimensional embedding given by the diffusion map (total computation time scales linearly with the data size, and can be updated online). Data exploration and perceptualization are enabled by the diffusion map, since it converts complex inference chains into ordinary physical distance in the perceptual displays, providing situational awareness. The diffusion geometry induced by the various chains of inference enables a multiscale hierarchical organization governed by the scaling of the length of the chains. Diffusion from reference points provides efficient linkages to related points, propagating searches into the data while ranking for relevance.

Intrinsic data-cloud filtering by diffusion geometry. This method extends the algorithm in which a curve is parametrized by arc length and then filtered using Fourier modes; in our case the eigenfunctions serve as discretized Fourier-mode approximations.

The data is given as a random cloud; the filter organizes the data. The colors are not part of the data.

A noisy disorganized spiral, filtered at band 14.

Point cloud around a dumbbell surface

Gaussian cloud filtering by diffusion

Diffusion as a search mechanism. Starting with a few labeled points in two classes, the remaining points are identified by the preponderance of evidence. (Szummer, Slonim, Tishby)
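The search-by-diffusion idea can be sketched on synthetic data: diffuse mass from the labeled seeds and classify each point by the preponderance of arriving evidence. The data set, kernel width, and step count below are illustrative, not the slide's pathology data:

```python
import numpy as np

rng = np.random.default_rng(2)
# Two toy classes (Gaussian blobs), each with a single labeled point.
X = np.r_[rng.normal(-2, 0.3, (50, 2)), rng.normal(2, 0.3, (50, 2))]
seeds = {0: 0, 50: 1}                   # point index -> class label

d2 = ((X[:, None] - X[None]) ** 2).sum(-1)
K = np.exp(-d2 / 1.0)
P = K / K.sum(axis=1, keepdims=True)    # Markov diffusion matrix
Pt = np.linalg.matrix_power(P, 5)       # diffuse for 5 steps

# Evidence for class k at point x: diffusion mass reaching x from class-k seeds.
score = np.stack([sum(Pt[i] for i, c in seeds.items() if c == k)
                  for k in (0, 1)], axis=1)
pred = score.argmax(axis=1)
```

Because the evidence combines all chains of inference from the seeds, a point is labeled by its whole neighborhood structure rather than by its single nearest labeled point.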

Conventional nearest-neighbor search compared with a diffusion search. The data is a pathology slide; each pixel is a digital document (the spectrum for each class is shown below).

A pathology slide in which three tissue types are displayed as an RGB mixture, where each color represents the probability of being of a given type.

The first two eigenfunctions organize the small images, which were provided in random order.

Organization of documents using diffusion geometry

A simple model for the generation of point clouds* is provided through the Langevin and Fokker-Planck equations.

*Think of the propagation of fire in the presence of wind, or of water draining down a landscape.
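A minimal sketch of such a generator, using overdamped Langevin dynamics in one dimension with an assumed double-well potential (the slides do not fix a specific potential, so U below is an illustrative choice):

```python
import numpy as np

# Overdamped Langevin dynamics: x' = -grad U(x) dt + sqrt(2 dt) dW,
# for the illustrative double-well potential U(x) = (x^2 - 1)^2.
def langevin_cloud(n_steps=5000, dt=0.01, seed=4):
    rng = np.random.default_rng(seed)
    grad_U = lambda x: 4.0 * x * (x ** 2 - 1.0)
    x, samples = 0.0, []
    for _ in range(n_steps):
        x = x - grad_U(x) * dt + np.sqrt(2.0 * dt) * rng.normal()
        samples.append(x)
    return np.array(samples)
```

The resulting cloud concentrates near the wells of U, with occasional crossings of the barrier — the Fokker-Planck equation describes the evolution of the corresponding density.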

The natural diffusion on the surface of the dumbbell is mapped out in the embedding. Observe that A is closer to B than to C, and that the two lobes are well separated by the bottleneck.

Extension of empirical functions off the data set. An important point of this multiscale analysis involves the relation of the spectral theory on the set to the localization, on and off the set, of the corresponding eigenfunctions. In the case of a smooth compact submanifold of Euclidean space, it can be shown that any band-limited function of band B can be expanded to exponential accuracy in terms of eigenfunctions of the Laplace operator with eigenvalues not exceeding B^2. Conversely, every eigenfunction of the Laplace operator satisfying this condition extends as a band-limited function with band cB. (Both of these statements can be proved by observing that for eigenfunctions of a Laplace operator we can estimate the size of derivatives of order 2m by a power m of the eigenvalue, implying that eigenfunctions on the manifold are well approximated by restrictions of band-limited functions of the corresponding band.)

Multiscale extension of expansions on the data set. The bottom image shows the distance to which low-degree expansions in Laplace eigenfunctions can be extended reliably.