Geometric Inference for Probability distributions

Geometric Inference for Probability Distributions. F. Chazal (Geometrica, INRIA Saclay), D. Cohen-Steiner, Q. Mérigot (Geometrica, INRIA Sophia-Antipolis). June 1, 2009.

Motivation. What is the (relevant) topology/geometry of a point cloud data set in R^d? Motivations: reconstruction, manifold learning and nonlinear dimensionality reduction (NLDR), clustering and segmentation, etc.

Geometric inference. Question: given an approximation C of a geometric object K, is it possible to estimate the topological and geometric properties of K, knowing only the approximation C? The answer depends on the considered class of objects and on a notion of distance between the objects (approximation). There is a positive answer for a large class of compact sets endowed with the Hausdorff distance (i.e. K and C are close if all the points of C (resp. K) are close to K (resp. C)). In this talk: can the considered objects be probability measures on R^d? Motivation: allowing approximations to have outliers or to be corrupted by non-local noise.

Distance functions for geometric inference. Distance function and Hausdorff distance. Distance to a compact K ⊂ R^d: d_K : x ↦ inf_{p ∈ K} ||x − p||. Hausdorff distance between compact sets K, K′ ⊂ R^d: d_H(K, K′) = sup_{x ∈ R^d} |d_K(x) − d_{K′}(x)|. Replace K and C by d_K and d_C, and compare the topology of the offsets K^r = d_K^{-1}([0, r]) and C^r = d_C^{-1}([0, r]).

Stability properties of the offsets. Topological/geometric properties of the offsets of K are stable with respect to Hausdorff approximation: 1. topological stability of the offsets of K (CCSL 06, NSW 06); 2. approximate normal cones (CCSL 08); 3. boundary measures (CCSM 07), curvature measures (CCSLT 09), Voronoi covariance measures (GMO 09).

The problem of outliers. If K′ = K ∪ {x} where d_K(x) > R, then ||d_K − d_{K′}||_∞ > R: offset-based inference methods fail! Question: can we generalize the previous approach by replacing the distance function with a distance-like function that behaves better with respect to noise and outliers?

The three main ingredients for stability. The proofs of stability rely on three important theoretical facts: 1. the stability of the map K ↦ d_K: ||d_K − d_{K′}||_∞ = sup_{x ∈ R^d} |d_K(x) − d_{K′}(x)| = d_H(K, K′); 2. the 1-Lipschitz property of d_K; 3. the 1-concavity of the function d_K²: x ↦ ||x||² − d_K²(x) is convex.

Replacing compact sets by measures. A measure µ is a mass distribution on R^d. Mathematically, it is defined as a map µ that takes a (Borel) subset B ⊆ R^d and outputs a nonnegative number µ(B); moreover we ask that if (B_i) are disjoint subsets, µ(∪_{i∈N} B_i) = Σ_{i∈N} µ(B_i). µ(B) corresponds to the mass of µ contained in B. A point cloud C = {p_1, ..., p_n} defines a measure µ_C = (1/n) Σ_i δ_{p_i}; the volume form on a k-dimensional submanifold M of R^d defines a measure vol_k|_M.

Distance between measures. The Wasserstein distance d_W(µ, ν) between two probability measures µ, ν quantifies the optimal cost of pushing µ onto ν, the cost of moving a small mass dx from x to y being ||x − y||² dx. 1. µ and ν are discrete measures: µ = Σ_i c_i δ_{x_i}, ν = Σ_j d_j δ_{y_j} with Σ_j d_j = Σ_i c_i. 2. Transport plan: a set of coefficients π_ij ≥ 0 with Σ_i π_ij = d_j and Σ_j π_ij = c_i. 3. Cost of a transport plan: C(π) = (Σ_ij ||x_i − y_j||² π_ij)^{1/2}. 4. d_W(µ, ν) := inf_π C(π).

Wasserstein distance. Examples: if C_1 and C_2 are two point clouds with #C_1 = #C_2, then d_W(µ_{C_1}, µ_{C_2}) is the square root of the cost of a minimal least-squares matching between C_1 and C_2. If C = {p_1, ..., p_n} is a point cloud and C′ = {p_1, ..., p_{n−k−1}, o_1, ..., o_k} with d(o_i, C) = R, then d_H(C, C′) ≥ R but d_W(µ_C, µ_{C′}) ≤ √(k/n) (R + diam(C)).
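For equal-size uniform point clouds, the matching characterization above can be checked directly. A minimal sketch using SciPy's assignment solver (the two clouds below are illustrative, not from the talk):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def wasserstein2_point_clouds(C1, C2):
    """W2 between the uniform measures on two equal-size point clouds:
    square root of the mean squared length of an optimal matching."""
    n = len(C1)
    assert len(C2) == n
    # cost[i, j] = squared distance between the i-th and j-th points
    cost = ((C1[:, None, :] - C2[None, :, :]) ** 2).sum(axis=2)
    rows, cols = linear_sum_assignment(cost)
    return np.sqrt(cost[rows, cols].sum() / n)

C1 = np.array([[0.0, 0.0], [1.0, 0.0]])
C2 = np.array([[0.0, 1.0], [1.0, 1.0]])
print(wasserstein2_point_clouds(C1, C2))  # each half-mass moves by 1 -> 1.0
```

For general weighted discrete measures one would instead solve the transport linear program over the coefficients π_ij.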

The distance to a measure. Distance function to a measure, first attempt. Let m ∈ ]0, 1[ be a positive mass and µ a probability measure on R^d: δ_{µ,m}(x) = inf {r > 0 : µ(B(x, r)) > m}. δ_{µ,m}(x) is the smallest distance needed to attain a mass of at least m (the ball B(x, δ_{µ,m}(x)) carries mass m). It coincides with the distance to the k-th nearest neighbor when m = k/n and µ = (1/n) Σ_{i=1}^n δ_{p_i}: δ_{µ,k/n}(x) = ||x − p_C^k(x)||.

Instability of µ ↦ δ_{µ,m}. Distance to a measure, first attempt: let m ∈ ]0, 1[ be a positive mass and µ a probability measure on R^d: δ_{µ,m}(x) = inf {r > 0 : µ(B(x, r)) > m}. Instability under Wasserstein perturbations: let µ_ε = (1/2 − ε) δ_0 + (1/2 + ε) δ_1. For ε > 0: for all x < 0, δ_{µ_ε,1/2}(x) = |x − 1|; for ε = 0: for all x < 0, δ_{µ_0,1/2}(x) = |x|. Consequence: the map µ ↦ δ_{µ,m} ∈ C^0(R^d) is discontinuous, whatever the (reasonable) topology on C^0(R^d).

The distance function to a measure. Definition. If µ is a measure on R^d and m_0 > 0, one lets: d_{µ,m_0} : x ∈ R^d ↦ ( (1/m_0) ∫_0^{m_0} δ_{µ,m}(x)² dm )^{1/2}. Example. Let C = {p_1, ..., p_n} and µ = (1/n) Σ_{i=1}^n δ_{p_i}. Let p_C^k(x) denote the k-th nearest neighbor of x in C, and set m_0 = k_0/n: d_{µ,m_0}(x) = ( (1/k_0) Σ_{k=1}^{k_0} ||x − p_C^k(x)||² )^{1/2}.
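For an empirical measure, the formula above is just a k-nearest-neighbor computation. A minimal NumPy sketch using brute-force distances (the point cloud and query point are illustrative):

```python
import numpy as np

def dtm(X, points, k0):
    """Distance to the empirical measure of `points` at mass m0 = k0/n:
    root mean squared distance from each row of X to its k0 nearest points."""
    # pairwise squared distances, shape (len(X), len(points))
    d2 = ((X[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    d2.sort(axis=1)                      # nearest neighbors first
    return np.sqrt(d2[:, :k0].mean(axis=1))

points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
x = np.array([[0.0, 0.0]])
print(dtm(x, points, 1))   # k0 = 1: plain nearest-neighbor distance -> [0.]
print(dtm(x, points, 2))   # mean of the 2 nearest squared distances -> [sqrt(1/2)]
```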

The distance function to a discrete measure. Example (continued). Let C = {p_1, ..., p_n} and µ = (1/n) Σ_{i=1}^n δ_{p_i}. Let p_C^k(x) denote the k-th nearest neighbor of x in C, and set m_0 = k_0/n: d_{µ,m_0}(x) = ( (1/k_0) Σ_{k=1}^{k_0} ||x − p_C^k(x)||² )^{1/2}. Then ||x||² − d²_{µ,m_0}(x) = (1/k_0) Σ_{k=1}^{k_0} ( 2⟨x, p_C^k(x)⟩ − ||p_C^k(x)||² ). Hence this function is linear inside k_0-th-order Voronoi cells, and: ∇( ||x||² − d²_{µ,m_0}(x) ) = (2/k_0) Σ_{k=1}^{k_0} p_C^k(x).

1-concavity of the squared distance function. Example (continued). Since the k_0 nearest neighbors of y minimize the average squared distance: d²_{µ,m_0}(y) = (1/k_0) Σ_{k=1}^{k_0} ||y − p_C^k(y)||² ≤ (1/k_0) Σ_{k=1}^{k_0} ||y − p_C^k(x)||² = ||y − x||² + (1/k_0) Σ_{k=1}^{k_0} ||x − p_C^k(x)||² + (2/k_0) Σ_{k=1}^{k_0} ⟨y − x, x − p_C^k(x)⟩ = ||y − x||² + d²_{µ,m_0}(x) + ⟨y − x, ∇d²_{µ,m_0}(x)⟩.

Proposition. The distance function to a discrete measure is 1-concave, i.e. ( ||y||² − d²_{µ,m_0}(y) ) − ( ||x||² − d²_{µ,m_0}(x) ) ≥ ⟨ ∇( ||·||² − d²_{µ,m_0} )(x), y − x ⟩. This is equivalent to the function x ↦ ||x||² − d²_{µ,m_0}(x) being convex.
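The convexity claim can be sanity-checked numerically via midpoint convexity along random segments (a sketch assuming NumPy; the cloud and the choice k_0 = 5 are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.standard_normal((30, 2))   # a random point cloud C
k0 = 5

def d2(x):
    """Squared distance-to-measure: mean of squared distances to the k0 NNs."""
    d = ((points - x) ** 2).sum(axis=1)
    d.sort()
    return d[:k0].mean()

def f(x):
    return x @ x - d2(x)                # the function claimed to be convex

# midpoint convexity check along random segments
for _ in range(100):
    x, y = rng.standard_normal(2), rng.standard_normal(2)
    assert f((x + y) / 2) <= (f(x) + f(y)) / 2 + 1e-9
print("midpoint convexity holds on all sampled segments")
```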

Another expression for d_{µ,m_0}. Projection submeasure. For any x ∈ R^d and m > 0, denote by µ_{x,m} the restriction of µ to the ball B = B(x, δ_{µ,m}(x)), whose trace on the sphere ∂B has been rescaled so that the total mass of µ_{x,m} is m. The measure µ_{x,m} gives mass to the multiple projections of x on µ. In the point cloud case, when m_0 = k_0/n + r with 0 < r < 1/n and x is not on a k-th-order Voronoi face: µ_{x,m_0} = Σ_{k=1}^{k_0} (1/n) δ_{p_C^k(x)} + r δ_{p_C^{k_0+1}(x)}.

Another expression for d_{µ,m_0}. Proposition. Let µ be a probability measure on R^d and m_0 > 0. Then, for almost every point x ∈ R^d, d_{µ,m_0} is differentiable at x and: d²_{µ,m_0}(x) = (1/m_0) ∫ ||x − h||² dµ_{x,m_0}(h); ∇( ||x||² − d²_{µ,m_0}(x) ) = (2/m_0) ∫ h dµ_{x,m_0}(h). In the point cloud case: ∇( ||x||² − d²_{µ,m_0}(x) ) = (2/k_0) Σ_{k=1}^{k_0} p_C^k(x). Interpretation: d_{µ,m_0}(x) is the (partial) Wasserstein distance between the Dirac mass m_0 δ_x and µ.

Theorem. 1. The function x ↦ d_{µ,m_0}(x) is 1-Lipschitz. 2. The function x ↦ ||x||² − d²_{µ,m_0}(x) is convex. 3. The map µ ↦ d_{µ,m_0} from probability measures to continuous functions is m_0^{-1/2}-Lipschitz, i.e. ||d_{µ,m_0} − d_{µ′,m_0}||_∞ ≤ m_0^{-1/2} d_W(µ, µ′). The proof of 2 and 3 involves the second expression for d_{µ,m_0}. Properties 1 and 2 are related to the regularity of d_{µ,m_0}, while 3 is a stability property.
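Property 3 can be illustrated in one dimension, where W_2 between equal-size uniform empirical measures is realized by the sorted (monotone) matching. A sketch: the outlier perturbation and grid are illustrative, and the supremum is approximated on a finite grid (which can only make the left-hand side smaller):

```python
import numpy as np

def dtm1d(xs, sample, k0):
    """Distance to the empirical measure of `sample` at m0 = k0/n (1D)."""
    d2 = np.sort((xs[:, None] - sample[None, :]) ** 2, axis=1)
    return np.sqrt(d2[:, :k0].mean(axis=1))

n, k0 = 100, 20
m0 = k0 / n
mu = np.linspace(0.0, 1.0, n)          # uniform sample on [0, 1]
nu = mu.copy()
nu[-2:] = 10.0                          # replace two points by far outliers

# In 1D, W2 between equal-size uniform empirical measures is given
# by the monotone (sorted) matching.
w2 = np.sqrt(((np.sort(mu) - np.sort(nu)) ** 2).mean())

grid = np.linspace(-1.0, 2.0, 400)
sup_diff = np.abs(dtm1d(grid, mu, k0) - dtm1d(grid, nu, k0)).max()
print(sup_diff <= w2 / np.sqrt(m0))    # the theorem's bound -> True
```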

Consequences of the previous properties. 1. Existence of an analogue of the medial axis. 2. Stability of a filtered version of it (as with the µ-medial axis) under Wasserstein perturbations. 3. Stability of the critical function of a measure. 4. The gradient ∇d_{µ,m_0} is L¹-stable. 5. ... ⟹ the distance functions d_{µ,m_0} share many stability and regularity properties with the usual distance function.

Example: square with outliers. [Figure: 10% outliers, k = 150; δ_{µ,m_0} with m_0 = 1/10.]

Example: square with outliers. [Figure: d_{µ,m_0}.]

A 3D example. Reconstruction of an offset from a noisy dataset with 10% outliers.

A reconstruction theorem. Theorem. Let µ be a probability measure of dimension at most k > 0 with compact support K ⊂ R^d such that r_α(K) > 0 for some α ∈ (0, 1]. For any 0 < η < r_α(K), there exist positive constants m_1 = m_1(µ, α, η) > 0 and C = C(m_1) > 0 such that: for any m_0 < m_1 and any probability measure µ′ such that W_2(µ, µ′) < C √m_0, the sublevel set d_{µ′,m_0}^{-1}((−∞, η]) is homotopy equivalent (and even isotopic) to the offsets d_K^{-1}([0, r]) of K for 0 < r < r_α(K).

k-NN density estimation vs distance to a measure. [Figure: the data.] Density is estimated using x ↦ m_0 / (ω_d δ_{µ,m_0}(x)^d), with m_0 = 150/1200.

k-NN density estimation vs distance to a measure. [Figures: the estimated density; log(1 + 1/d_{µ,m_0}).] Density is estimated using x ↦ m_0 / vol_d(B(x, δ_{µ,m_0}(x))), with m_0 = 150/1200 (Devroye-Wagner 77).
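A minimal 1D sketch of this k-NN density estimator, where the ball B(x, r) has volume 2r (the two-point sample and evaluation point are illustrative):

```python
import numpy as np

def knn_density_1d(x, sample, k0):
    """k-NN density estimate m0 / vol(B(x, delta)) in 1D, where delta is
    the distance from x to its k0-th nearest sample point and
    vol(B(x, r)) = 2r."""
    n = len(sample)
    m0 = k0 / n
    delta = np.sort(np.abs(sample - x))[k0 - 1]
    return m0 / (2.0 * delta)

sample = np.array([0.0, 1.0])
print(knn_density_1d(0.25, sample, 1))  # m0 = 1/2, delta = 0.25 -> 1.0
```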

k-NN density estimation vs distance to a measure. 1. The gradient of the estimated density can behave wildly. 2. It exhibits peaks near very dense zones. Point 1 can be fixed using d_{µ,m_0} (because of its semiconcavity); point 2 shows that the distance function is a better-behaved geometric object to associate to a measure.

Pushing data along the gradient of d_{µ,m_0}. A mean-shift-like algorithm (Comaniciu-Meer 02). Theoretical guarantees on the convergence of the algorithm and on the smoothness of trajectories.
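Since (1/2) ∇( ||x||² − d²_{µ,m_0} )(x) is the barycenter of the k_0 nearest neighbors of x, one mean-shift-like step can be sketched as moving each point to that barycenter. This is only an illustration of the idea under that simplification, not the authors' exact algorithm; the cloud and parameters are arbitrary:

```python
import numpy as np

def push_to_knn_barycenter(X, cloud, k0, steps=10):
    """Iteratively move each row of X to the barycenter of its k0 nearest
    neighbors in `cloud` -- a discrete flavor of the gradient flow of
    d^2_{mu,m0} (the gradient of d^2 points away from that barycenter)."""
    X = X.copy()
    for _ in range(steps):
        d2 = ((X[:, None, :] - cloud[None, :, :]) ** 2).sum(axis=2)
        idx = np.argsort(d2, axis=1)[:, :k0]     # indices of k0 nearest neighbors
        X = cloud[idx].mean(axis=1)              # barycenters of the neighbors
    return X

rng = np.random.default_rng(1)
cloud = rng.standard_normal((200, 2)) * 0.1      # one dense cluster near 0
moved = push_to_knn_barycenter(cloud, cloud, k0=20)
print(np.abs(moved).max() <= np.abs(cloud).max())  # barycenters stay in the hull -> True
```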

Pushing data along the gradient of d_{µ,m_0}. [Figure.]

Take-home messages. µ ↦ d_{µ,m_0} provides a way to associate geometry to a measure in Euclidean space. d_{µ,m_0} is robust to Wasserstein perturbations: outliers and noise are easily handled (no assumption on the nature of the noise). d_{µ,m_0} shares regularity properties with the usual distance function to a compact set. Geometric stability results hold in this measure-theoretic setting: topology/geometry of the sublevel sets of d_{µ,m_0}, a stable notion of persistence diagram for µ, ... Algorithm: for finite point clouds, d_{µ,m_0} and ∇d_{µ,m_0} can be easily and efficiently computed in any dimension. For more details: http://geometrica.saclay.inria.fr/team/fred.chazal/papers/rr-6930.pdf