Statistics and Topological Data Analysis
|
|
- Kerry Brian O’Connor’
- 6 years ago
- Views:
Transcription
1 Geometry Understanding in Higher Dimensions CHAIRE D INFORMATIQUE ET SCIENCES NUMÉRIQUES Collège de France - June 2017 Statistics and Topological Data Analysis Bertrand MICHEL Ecole Centrale de Nantes - Lab. de mathématiques Jean Leray
2 Introduction : Topological Data Analysis and Statistics
3 Topological Data Analysis and Topological Inference The aim of TDA is to infer relevant qualitative and quantitative topological structures (clusters, holes...) directly from the data. data : typically point cloud X n Two popular methods in TDA : Mapper algorithm [Singh et al., 2007] and persistent homology [Edelsbrunner et al., 2002].
4 Topological Data Analysis (TDA) Why is topology interesting for data analysis? multiscale compact invariant under coordinate changes stable with respect to (small) perturbations informative topological space topological descriptors β 0 β 1 β 2 point cloud
5 Topological Data Analysis (TDA) For exploratory analysis, visualization
6 Topological Data Analysis (TDA) For exploratory analysis, visualization For feature extraction and statistical learning raw data Topological descriptors Geometric descriptors Other signatures supervised methods unsupervised methods
7 Statistics, Learning and TDA A statistical approach to TDA means that : we consider data as generated from an unknown distribution the inferred topological features by TDA methods are seen as estimators of topological quantities describing an underlying object. β 0 β 1 β 2
8 Statistics, Learning and TDA Directions of research (non-exhaustive list): Consistency / convergence of TDA methods: [Chazal15 JMLR], [Bobrowski 17 Bernouilli] Confidence regions for TDA [Fasy 14 AoS] [Chazal 15 JOCG ] Central tendency for persistent homology [Turner 14 DCG] [Fasy15 Nips] Robust methods for TDA [Chazal 17, EJS Chazal 17 JMLR] Representations of persistence in Euclidean spaces [Bubenik15 JMLR] [Adams15] Develop kernels for topological descriptors [Reininghaus 15 IEEE] [Carriere 17 ICML ] Statistical analysis of Mapper [Carriere 17]...
9 Homology and Persistent homology
10 Topological Stability and Regularity Topological inference : under regularity assumptions, topological properties of X can be recovered from (the off-sets) of a close enough object Y.
11 Topological Stability and Regularity Topological inference : under regularity assumptions, topological properties of X can be recovered from (the off-sets) of a close enough object Y. The local feature size is a local notion of regularity : For x X, lfs X (x) := d (x, M(X c )). The global version of the local feature size is the reach [Federer, 1959] : κ(x) = inf x X c lfs X(x). The reach is small if either X is not smooth or if X is close to being selfintersecting. Weak feature size and its extensions [Chazal and Lieutier, 2007] (by considering the critical values of d X ).
12 Topological Stability and Regularity Topological inference : under regularity assumptions, topological properties of X can be recovered from (the off-sets) of a close enough object Y. Example : d H (X, Y) = inf {α 0 X Y α and Y X α } Theorem [Chazal and Lieutier, 2007]: Let X and Y be two compact sets in R d and let ε > 0 be such that d H (X, Y) < ε, wfs(x) > 2ε and wfs(y) > 2ε. Then for any 0 < α < 2ε, X α and Y β are homotopy equivalent.
13 Topological Stability and Regularity Topological inference : under regularity assumptions, topological properties of X can be recovered from (the off-sets) of a close enough object Y. Example : d H (X, Y) = inf {α 0 X Y α and Y X α } Theorem [Chazal and Lieutier, 2007]: Let X and Y be two compact sets in R d and let ε > 0 be such that d H (X, Y) < ε, wfs(x) > 2ε and wfs(y) > 2ε. Then for any 0 < α < 2ε, X α and Y β are homotopy equivalent. Sampling conditions in Hausdorff metric. Statistical analysis of homotopy inference can be deduced from support estimation of a distribution under the Hausdorff metric.
14 Homology inference [Niyogi et al., 2008 and 2011] [Balakrishnan et al., 2012] : The Betti number (actually the homotopy type) of Riemannian manifolds with positive reach can be recovered with high probability from offsets of a sample on (or close to) the manifold. Homology inference Homotopy is not easy to compute in practice. Singular homology provides a algebraic description of holes in a geometric shape (connected components, loops, etc...) Betti number β k is the rank of the k-th homology group. Computational Topology : Betti numbers can be computed on simplicial complexes.
15 Persistent homology Starting from a point cloud X n, let Filt = (C α ) α A be a fitration of nested simplicial complexes. α Persistent homology: identification of persistent topological features along the filtration. multiscale information ; more stable and more robust ;
16 Barecodes and Persistence Diagrams Filtration of simplicial complexes Filt(X n ) X n Offsets Barcode
17 Barecodes and Persistence Diagrams Filtration of simplicial complexes Filt(X n ) death X n Offsets Barcode birth Dgm (Filt(X n )) Persistence diagram of the filtration Filt(X n ) built on X n.
18 Distance between persistence diagrams and stability death Add the diagonal Dgm 1 Dgm 2 Multiplicity: 2 0 birth The bottleneck distance between two diagrams Dgm 1 and Dgm 2 is d b (Dgm 1, Dgm 2 ) = inf γ Γ sup p Dgm 1 p γ(p) where Γ is the set of all the bijections between Dgm 1 and Dgm 2 and p q = max( x p x q, y p y q ).
19 Distance between persistence diagrams and stability death Add the diagonal Dgm(Filt(X)) Dgm(Filt(Y)) Multiplicity: 2 0 Theorem [Chazal et al., 2012]: For any compact metric spaces (X, ρ) and (Y, ρ ), d b (Dgm(Filt(X)), Dgm(Filt(Y))) 2 d GH (X, Y). Consequently, if X and Y are embedded in the same metric space (M, ρ) then d b (Dgm(Filt(X)), Dgm(Filt(Y))) 2 d H (X, Y). birth
20 Statistics and Persistent homology
21 Persistence diagram inference [Chazal 2015 JMLR] (M, ρ) metric space X compact set in M. X Filt(X) well defined for any compact metric space [Chazal et al., 2012] Dgm(Filt(X)) 0 0 Convergence??? X n Filt( X n ) n points sampled in X according to µ Estimator of Dgm(Filt(K)) 0 0 Dgm(Filt( X n ))
22 Persistence diagram inference For a, b > 0, µ satisfies the (a, b)-standard assumption on its support X µ if for any x X µ and any r > 0 : Theorem: For a, b > 0 : µ(b(x, r)) min(ar b, 1). P(a, b, M) : set of all the probability measures satisfying the (a, b)-standard assumption on the metric space (M, ρ). sup µ P(a,b,M) E [ ] d b (Dgm(Filt(X µ )), Dgm(Filt( X n ))) C ( ln n n ) 1/b where C only depends on a and b. Under additional technical hypotheses, for any estimator Dgm n of Dgm(Filt(X µ )): [ ] lim inf E d b (Dgm(Filt(X µ )), Dgm n ) C n 1/b n sup µ P(a,b,M) where C is an absolute constant.
23 Confidence sets for persistence diagrams [Fasy 2014 AoS] P ( Dgm(Filt(K)) ˆR ) 1 α??
24 Confidence sets for persistence diagrams [Fasy 2014 AoS] P ( Dgm(Filt(K)) ˆR ) 1 α?? Using the Hausdorff stability, we can define confidence sets for persistence diagrams: d b (Dgm (Filt(K)), Dgm (Filt(X n ))) d H (K, X n ). It is sufficient to find c n such that ( ) lim sup dh (K, X n ) > c n α. n
25 Confidence sets for persistence diagrams [Fasy 2014 AoS] Subsampling method: N subsamples X 1 b,n,..., XN b,n of size b. ) Compute T j = d H (X j b,n, X n, j = 1,..., N. Compute L b (t) = 1 N N j=1 1 T j >t, Take c b = 2L 1 b (α). If P satisfies an (a, b) standard assumption then, for n large enough : P ( W (Dgm (Filt(X µ )), Dgm (Filt(X n ))) > c b ) P ( d H (X µ, X n ) > c b ) α + O ( b n ) 1/4
26 Central tendency for persistent homology Frechet mean [Turner 2014] Use an alternative descriptor of persistence : Persistence landscapes [Bubenik, 2015]
27 Persistence landscapes [Bubnik JMLR 2015] d d+b 2 d b 2 d b 2 b X n Dgm(Filt(Xn )) persistence landscape d+b 2 Dgm = { } ( d i+b i, d i+b i ), i I 2 2 Persistence landscape λ of Dgm: λ(k, t) = kmax p D Λ p(t), t R, k N, where kmax is k-th largest value in the set. For p = ( b+d, d b ) Dgm, 2 2 t b t [b, b+d ] 2 Λ p (t) = d t t ( b+d 2, d] 0 otherwise. Stability: For any t R and any k N, λ(k, t) λ (k, t) d b (Dgm, Dgm ).
28 Subsampling methods for pers. homology [Chazal ICML 2015] joint work with F. Chazal, B. Fasy, F. Lecci, A. Rinaldo and L. Wasserman Let X = {X 1,, X m } sampled from µ. λ X : corresponding persistence landscape. Ψ m µ : the measure induced by µ m on the space of persistence landscapes. We consider the point-wise expectations of the (random) persistence landscape under this measure: E Ψ m µ [λ X (t)], t [0, T ] For S1 m,..., Sl m some independent samples of size m from µ m, the empirical counterpart of E Ψ m µ [λ X (t)] is λ m l (t) = 1 l l i=1 λ S m i (t), for all t [0, T ],
29 Subsampling methods for pers. homology [Chazal ICML 2015] Definition: The p-th Wasserstein distance between two measures µ, ν defined on (M, ρ) is W ρ,p (µ, ν) = ( inf Π M M ) 1 [ρ(x, y)] p p dπ(x, y), where the infimum is taken over all measures on M M with marginals µ and ν. Stability of the average landscape: Theorem: Let X µ m and Y ν m, where µ and ν are two probability measures on M. For any p 1 we have E Ψ m µ [λ X ] E Ψ m ν [λ Y ] 2 m 1 p Wρ,p (µ, ν).
30 Subsampling methods for pers. homology [Chazal ICML 2015] Application: Analysis of accelerometer data. Fred Fabrizio Bertrand topological features carry discriminative information no registration/calibration preprocessing step needed
31 Commercial break: Gudhi with Statistical learning Python Libraries and coming soon : Gudhi Stat with more tools for statistics and TDA.
32 Robust TDA
33 Standard TDA methods are not robust to outliers X r := x X B(x, r) = d 1 X ([0, r]) where the distance function d X to X is d X (y) = inf x X x y
34 Standard TDA methods are not robust to outliers X r := x X B(x, r) = d 1 X ([0, r]) where the distance function d X to X is d X (y) = inf x X x y
35 Some possible noise models for geometry Additive noise model P = µ Φ distribution with support the geometric shape G noise distribution Clutter noise model P = πµ + (1 π)u G distribution with support the geometric shape G Uniform distribution on the box G A few outliers P = πµ + (1 π)ψ distribution with support the geometric shape G Distribution of outliers G
36 Robust TDA with an alternative distance function? We would like to consider the sub levels of an alternative distance function related to the sampling measure, which support is X, or close to X.
37 Distance To Measure [Chazal 11 FoCM] Preliminary distance function to a measure P : Let u ]0, 1[ be a positive mass, and P a probability measure on R d : δ P,u (x) = inf {r > 0 : P (B(x, r)) u} δ P,u (x) δ P,u is the smallest distance needed to capture a mass of at least u. x δ P,u is the quantile function at u of the r.v. supp(p ) x X u where X P.
38 Distance To Measure [Chazal 11 FoCM] Preliminary distance function to a measure P : Let u ]0, 1[ be a positive mass, and P a probability measure on R d : δ P,u (x) = inf {r > 0 : P (B(x, r)) u} x δ P,u (x) Definition: Given a probability measure P on R d and m > 0, the distance function to the measure P (DTM) is defined by u supp(p ) d P,m : x R d ( 1 m m 0 ) 1/2 δp,u(x)du 2
39 Distance To Measure [Chazal 11 FoCM] Properties of the DTM : Stability under Wassertein perturbations: d P,m d Q,m 1 m W 2 (P, Q) The function x d 2 P,m (x) is semiconcave, this is ensuring strong regularity properties on the geometry of its sublevel sets. Consequently, if P is a probability distribution close to P for Wasserstein distance W 2, then the sublevel sets of d P,m provide a topologically correct approximation of the support of P.
40 Distance to The Empirical Measure (DTEM) Let X 1,..., X n sample according to P and let P n be the empirical measure. Then d 2 P n (x) = 1 k x X, k (i) 2 n k where X (1) x X (2) x X (k) x X (n) x i=1 X (i) x k = 8
41 Geometric inference with the DTM Theorem: [Chazal et al., 2011] Let µ be a measure that has dimension at most k > 0 with compact support G such that reach α (G) R > 0 for some α > 0. Let ν be another measure and ε be an upper bound on the uniform distance between d G and d ν,m0. Then, for any r [4ε/α 2, R 3ε] and any η ]0, R[, the r-sublevel sets of d µ,m0 and the η-sublevel sets of d G are homotopy equivalent as soon as: W 2 (µ, ν) R m /α 2 C(µ) 1/k m 1/k+1/2 0. In practice : X 1... X n sampled according to P. Assume W 2 (P, µ) small. P n = n i=1 δ X i : empirical measure. Than for n large enough, W 2 (P n, µ) is small and the sublevel sets of d Pn,m provide a topologically correct approximation of G. G
42 Wasserstein deconvolution and DTM denoising Additive noise model P = µ Φ distribution with support the geometric shape G noise distribution G In this case, W 2 (P, µ) can be large. Ideally we would like to denoise directly d Pn,n, but this can be hardly achieved because the DTM is not a linear functional of the measure. Alternative approach : deconvolve the observed measure [Caillerie EJS 2011] X 1,..., X n Deconvolved measure µ n d µn,m W 2 (P n, µ) W 2 ( µ n, µ) m d µn,m d µ ( 0) ( 0)
43 Wasserstein deconvolution and DTM denoising
44 DTM and persistent homology
45 DTM and persistent homology death Dgm P,m X i P d P,m 0 birth d b ( DgmP,m, Dgm Q,m ) dp,m d Q,m 1 m W 2 (P, Q) Take Q = P n... Wasserstein Stability of the DTM [Chazal et al., 2012] Stability of Persistent homology [Cohen- Steiner et al.,2005, Chazal et al., 2012]
46 Estimation of the DTM via the empirical DTM [Chazal EJS 17, Chazal JMLR 17] Quantity of interest: d 2 P n, k n (x) d 2 P, k n (x) Observe that d 2 P,m(x) = 1 m m 0 Fx 1 (u)du where F x is the cdf of x X 2 with X P. The distance to the empirical measure is the empirical counter part of the distance to P : d Pn,m(x) 2 = 1 m m 0 F 1 x,n(u)du where F x,n is the cdf of x X 2 with X P n. Finally we get that d 2 P n, k n (x) d 2 P, k n (x) = 1 m m 0 { F 1 x,n(u) Fx 1 (u) } du
47 Estimation of the DTM via the empirical DTM [Chazal EJS 17, Chazal JMLR 17] Quantity of interest: d 2 P n, k n (x) d 2 P, k n (x) Two complementary approaches of the problem: Asymptotic approach : k nn = m is fixed and n tends to infinity. Non asymptotic approach : n is fixed, and we want a tight control over the fluctuations of the empirical DTM, in function of k, which can be taken very small. We do not use Wasserstein stability for either of the two approaches. Wasserstein rates of convergence [Fournier and Guillin, 2013 ;Dereich et al., 2013] do not provide tight rates for the DTM in this context.
48 Bootstrap and significance of topological features Aim : studying the persistent homology of the sub-levels of the DTM and providing confidence regions. Two alternative boostrap methods : by bootstrapping the DTM Bottleneck Bootstrap
49 Bootstrap and significance of topological features Bootstrapping the DTM P P n
50 Bootstrap and significance of topological features Bootstrapping the DTM P P n P n P n
51 Bootstrap and significance of topological features Bootstrapping the DTM P P n P n P n
52 Bootstrap and significance of topological features Bootstrapping the DTM P P n P n P n Φ(P ) Φ(P n ) Φ(P n ) Φ(P n)
53 Bootstrap and significance of topological features Bootstrapping the DTM For m (0, 1), define c α by P ( n d 2 P,m d 2 P n,m > c α ) = α. Let X 1,..., X n be a sample from P n, and let P n be the corresponding (bootstrap) empirical measure. We consider the bootstrap quantity d P n,m(x) of d Pn,m. The bootstrap estimate ĉ α is defined by P ( n d 2 Pn,m d 2 P n,m > ĉ α X 1,..., X n ) = α where ĉ α can be approximated by Monte Carlo. Theorem: If Fx 1 is regular enough, the DTM is Hadamard differentiable at P. Consequently, the bootstrap method for the DTM is asymptotically valid.
54 Bootstrap and significance of topological features Bootstrapping the DTM Dgm : persistence diagram of the sub-levels of d P,m Dgm : persistence diagram of the sub-levels of d Pn,m. Let { } C n = E Diag : d b ( Dgm, E) ĉα n where Diag is the set of all the persistence diagrams., Then, P(Dgm C n ) = P ( d b (Dgm, Dgm) ĉα n ) P Bootstrap estimate ( ) d 2 P,m d 2 P n,m ĉα n
55 Bootstrap and significance of topological features The Bottleneck Bootstrap Dgm : persistence diagram of the sub-levels of d P,m Dgm : persistence diagram of the sub-levels of d Pn,m. Dgm : persistence diagram of the sub-levels of d P n,m. We directly bootstrap in the set of the persistence diagram by considering the random quantity d b ( Dgm, Dgm). We define ˆt α by ( ndb P ( Dgm ), Dgm) > ˆt α X 1,..., X n The quantile ˆt α can be estimated by Monte Carlo. = α.
56 Bootstrap and significance of topological features For both methods we can identify significant features by putting a band of size 2ĉ α or 2ˆt α around the diagonal: In practice, the bottleneck bootstrap can lead to more precise inferences because in many cases the stability result is not sharp enough: d b ( Dgm, Dgm) d P,m d Pn,m.
57 Concluding remarks TDA methods focus on the topological properties (homology / persistent homology) of a shape. TDA methods can be used as an exploratory method, in particuar when the point cloud is sampled on (close to) a real geometric object as a feature extraction procedure, next these extracted features can be used for learning purposes. TDA is an emerging field, at the interface maths, computer sciences, statistics. Many topics about the statistical analysis of TDA Applications in many fields of sciences ( medecine, biology, dynamic systems, astronomy, dynamical systems, physics...)
Some Topics in Computational Topology. Yusu Wang. AMS Short Course 2014
Some Topics in Computational Topology Yusu Wang AMS Short Course 2014 Introduction Much recent developments in computational topology Both in theory and in their applications E.g, the theory of persistence
More informationSome Topics in Computational Topology
Some Topics in Computational Topology Yusu Wang Ohio State University AMS Short Course 2014 Introduction Much recent developments in computational topology Both in theory and in their applications E.g,
More informationThe Intersection of Statistics and Topology:
The Intersection of Statistics and Topology: Confidence Sets for Persistence Diagrams Brittany Terese Fasy joint work with S. Balakrishnan, F. Lecci, A. Rinaldo, A. Singh, L. Wasserman 3 June 2014 [FLRWBS]
More informationLearning from persistence diagrams
Learning from persistence diagrams Ulrich Bauer TUM July 7, 2015 ACAT final meeting IST Austria Joint work with: Jan Reininghaus, Stefan Huber, Roland Kwitt 1 / 28 ? 2 / 28 ? 2 / 28 3 / 28 3 / 28 3 / 28
More informationMultivariate Topological Data Analysis
Cleveland State University November 20, 2008, Duke University joint work with Gunnar Carlsson (Stanford), Peter Kim and Zhiming Luo (Guelph), and Moo Chung (Wisconsin Madison) Framework Ideal Truth (Parameter)
More informationGeometric Inference for Probability distributions
Geometric Inference for Probability distributions F. Chazal 1 D. Cohen-Steiner 2 Q. Mérigot 2 1 Geometrica, INRIA Saclay, 2 Geometrica, INRIA Sophia-Antipolis 2009 June 1 Motivation What is the (relevant)
More informationMinimax Rates. Homology Inference
Minimax Rates for Homology Inference Don Sheehy Joint work with Sivaraman Balakrishan, Alessandro Rinaldo, Aarti Singh, and Larry Wasserman Something like a joke. Something like a joke. What is topological
More informationGeometric inference for measures based on distance functions The DTM-signature for a geometric comparison of metric-measure spaces from samples
Distance to Geometric inference for measures based on distance functions The - for a geometric comparison of metric-measure spaces from samples the Ohio State University wan.252@osu.edu 1/31 Geometric
More informationGeometric Inference on Kernel Density Estimates
Geometric Inference on Kernel Density Estimates Bei Wang 1 Jeff M. Phillips 2 Yan Zheng 2 1 Scientific Computing and Imaging Institute, University of Utah 2 School of Computing, University of Utah Feb
More informationNonparametric regression for topology. applied to brain imaging data
, applied to brain imaging data Cleveland State University October 15, 2010 Motivation from Brain Imaging MRI Data Topology Statistics Application MRI Data Topology Statistics Application Cortical surface
More informationPersistent homology and nonparametric regression
Cleveland State University March 10, 2009, BIRS: Data Analysis using Computational Topology and Geometric Statistics joint work with Gunnar Carlsson (Stanford), Moo Chung (Wisconsin Madison), Peter Kim
More informationPersistent Homology of Data, Groups, Function Spaces, and Landscapes.
Persistent Homology of Data, Groups, Function Spaces, and Landscapes. Shmuel Weinberger Department of Mathematics University of Chicago William Benter Lecture City University of Hong Kong Liu Bie Ju Centre
More informationGeometric Inference for Probability distributions
Geometric Inference for Probability distributions F. Chazal 1 D. Cohen-Steiner 2 Q. Mérigot 2 1 Geometrica, INRIA Saclay, 2 Geometrica, INRIA Sophia-Antipolis 2009 July 8 F. Chazal, D. Cohen-Steiner, Q.
More informationHomology through Interleaving
Notes by Tamal K. Dey, OSU 91 Homology through Interleaving 32 Concept of interleaving A discrete set P R k is assumed to be a sample of a set X R k, if it lies near to it which we can quantify with the
More informationPersistent Homology: Course Plan
Persistent Homology: Course Plan Andrey Blinov 16 October 2017 Abstract This is a list of potential topics for each class of the prospective course, with references to the literature. Each class will have
More informationInterleaving Distance between Merge Trees
Noname manuscript No. (will be inserted by the editor) Interleaving Distance between Merge Trees Dmitriy Morozov Kenes Beketayev Gunther H. Weber the date of receipt and acceptance should be inserted later
More informationWITNESSED K-DISTANCE
WITNESSED K-DISTANCE LEONIDAS GUIBAS, DMITRIY MOROZOV, AND QUENTIN MÉRIGOT Abstract. Distance function to a compact set plays a central role in several areas of computational geometry. Methods that rely
More informationGeometric Inference for Probability Measures
Geometric Inference for Probability Measures Frédéric CHAZAL David COHEN-STEINER Quentin MÉRIGOT June 21, 2010 Abstract Data often comes in the form of a point cloud sampled from an unknown compact subset
More informationMetric Thickenings of Euclidean Submanifolds
Metric Thickenings of Euclidean Submanifolds Advisor: Dr. Henry Adams Committe: Dr. Chris Peterson, Dr. Daniel Cooley Joshua Mirth Masters Thesis Defense October 3, 2017 Introduction Motivation Can we
More informationA sampling theory for compact sets
ENS Lyon January 2010 A sampling theory for compact sets F. Chazal Geometrica Group INRIA Saclay To download these slides: http://geometrica.saclay.inria.fr/team/fred.chazal/teaching/distancefunctions.pdf
More informationTopological Simplification in 3-Dimensions
Topological Simplification in 3-Dimensions David Letscher Saint Louis University Computational & Algorithmic Topology Sydney Letscher (SLU) Topological Simplification CATS 2017 1 / 33 Motivation Original
More informationAn Introduction to Topological Data Analysis
An Introduction to Topological Data Analysis Elizabeth Munch Duke University :: Dept of Mathematics Tuesday, June 11, 2013 Elizabeth Munch (Duke) CompTop-SUNYIT Tuesday, June 11, 2013 1 / 43 There is no
More informationStability of boundary measures
Stability of boundary measures F. Chazal D. Cohen-Steiner Q. Mérigot INRIA Saclay - Ile de France LIX, January 2008 Point cloud geometry Given a set of points sampled near an unknown shape, can we infer
More informationDistributions of Persistence Diagrams and Approximations
Distributions of Persistence Diagrams and Approximations Vasileios Maroulas Department of Mathematics Department of Business Analytics and Statistics University of Tennessee August 31, 2018 Thanks V. Maroulas
More informationA Kernel on Persistence Diagrams for Machine Learning
A Kernel on Persistence Diagrams for Machine Learning Jan Reininghaus 1 Stefan Huber 1 Roland Kwitt 2 Ulrich Bauer 1 1 Institute of Science and Technology Austria 2 FB Computer Science Universität Salzburg,
More informationSEPARABILITY AND COMPLETENESS FOR THE WASSERSTEIN DISTANCE
SEPARABILITY AND COMPLETENESS FOR THE WASSERSTEIN DISTANCE FRANÇOIS BOLLEY Abstract. In this note we prove in an elementary way that the Wasserstein distances, which play a basic role in optimal transportation
More informationPersistent Homology. 24 Notes by Tamal K. Dey, OSU
24 Notes by Tamal K. Dey, OSU Persistent Homology Suppose we have a noisy point set (data) sampled from a space, say a curve inr 2 as in Figure 12. Can we get the information that the sampled space had
More informationarxiv: v1 [cs.cv] 13 Jul 2017
Deep Learning with Topological Signatures Christoph Hofer Department of Computer Science University of Salzburg Salzburg, Austria, 5020 chofer@cosy.sbg.ac.at Roland Kwitt Department of Computer Science
More informationSTOCHASTIC CONVERGENCE OF PERSISTENCE LANDSCAPES AND SILHOUETTES
STOCHASTIC CONVERGENCE OF PERSISTENCE LANDSCAPES AND SILHOUETTES Frédéric Chazal, Brittany Terese Fasy, Fabrizio Lecci, Alessandro Rinaldo, and Larry Wasserman Abstract. Persistent homology is a widely
More informationLogarithmic Sobolev Inequalities
Logarithmic Sobolev Inequalities M. Ledoux Institut de Mathématiques de Toulouse, France logarithmic Sobolev inequalities what they are, some history analytic, geometric, optimal transportation proofs
More informationGromov-Hausdorff stable signatures for shapes using persistence
Gromov-Hausdorff stable signatures for shapes using persistence joint with F. Chazal, D. Cohen-Steiner, L. Guibas and S. Oudot Facundo Mémoli memoli@math.stanford.edu l 1 Goal Shape discrimination is a
More informationA Gaussian Type Kernel for Persistence Diagrams
TRIPODS Summer Bootcamp: Topology and Machine Learning Brown University, August 2018 A Gaussian Type Kernel for Persistence Diagrams Mathieu Carrière joint work with S. Oudot and M. Cuturi Persistence
More informationarxiv: v5 [math.st] 20 Dec 2018
Tropical Sufficient Statistics for Persistent Homology Anthea Monod 1,, Sara Kališnik 2, Juan Ángel Patiño-Galindo1, and Lorin Crawford 3 5 1 Department of Systems Biology, Columbia University, New York,
More informationGromov-Hausdorff Approximation of Filament Structure Using Reeb-type Graph
Gromov-Hausdorff Approximation of Filament Structure Using Reeb-type Graph Frédéric Chazal and Ruqi Huang and Jian Sun July 26, 2014 Abstract In many real-world applications data appear to be sampled around
More informationWitnessed k-distance
Witnessed k-distance Leonidas J. Guibas, Quentin Mérigot, Dmitriy Morozov To cite this version: Leonidas J. Guibas, Quentin Mérigot, Dmitriy Morozov. Witnessed k-distance. Discrete and Computational Geometry,
More informationStratification Learning through Homology Inference
Stratification Learning through Homology Inference Paul Bendich Sayan Mukherjee Institute of Science and Technology Austria Department of Statistical Science Klosterneuburg, Austria Duke University, Durham
More informationOn Inverse Problems in TDA
Abel Symposium Geiranger, June 2018 On Inverse Problems in TDA Steve Oudot joint work with E. Solomon (Brown Math.) arxiv: 1712.03630 Features Data Rn The preimage problem in the data Sciences feature
More informationComparing persistence diagrams through complex vectors
Comparing persistence diagrams through complex vectors Barbara Di Fabio ARCES, University of Bologna Final Project Meeting Vienna, July 9, 2015 B. Di Fabio, M. Ferri, Comparing persistence diagrams through
More informationarxiv: v1 [stat.me] 27 Sep 2016
Topological Data Analysis Larry Wasserman Department of Statistics/Carnegie Mellon University Pittsburgh, USA, 15217; email: larry@stat.cmu.edu arxiv:1609.08227v1 [stat.me] 27 Sep 2016 Abstract Topological
More informationChristophe A.N. Biscio Department of Mathematical Sciences, Aalborg University, Denmark and Jesper Møller. June 28, Abstract
arxiv:1611.00630v2 [math.st] 27 Jun 2017 The accumulated persistence function, a new useful functional summary statistic for topological data analysis, with a view to brain artery trees and spatial point
More informationScalar Field Analysis over Point Cloud Data
DOI 10.1007/s00454-011-9360-x Scalar Field Analysis over Point Cloud Data Frédéric Chazal Leonidas J. Guibas Steve Y. Oudot Primoz Skraba Received: 19 July 2010 / Revised: 13 April 2011 / Accepted: 18
More informationGromov-Hausdorff Approximation of Filament Structure Using Reeb-type Graph
Noname manuscript No. (will be inserted by the editor) Gromov-Hausdorff Approximation of Filament Structure Using Reeb-type Graph Frédéric Chazal Ruqi Huang Jian Sun Received: date / Accepted: date Abstract
More informationProximity of Persistence Modules and their Diagrams
Proximity of Persistence Modules and their Diagrams Frédéric Chazal David Cohen-Steiner Marc Glisse Leonidas J. Guibas Steve Y. Oudot November 28, 2008 Abstract Topological persistence has proven to be
More informationSize Theory and Shape Matching: A Survey
Size Theory and Shape Matching: A Survey Yi Ding December 11, 008 Abstract This paper investigates current research on size theory. The research reviewed in this paper includes studies on 1-D size theory,
More informationarxiv: v1 [math.at] 27 Jul 2012
STATISTICAL TOPOLOGY USING PERSISTENCE LANDSCAPES PETER BUBENIK arxiv:1207.6437v1 [math.at] 27 Jul 2012 Abstract. We define a new descriptor for persistent homology, which we call the persistence landscape,
More informationA Topological Study of Functional Data and Fréchet Functions of Metric Measure Spaces
A Topological Study of Functional Data and Fréchet Functions of Metric Measure Spaces arxiv:1811.06613v1 [math.at] 15 Nov 2018 Haibin Hang, Facundo Mémoli, Washington Mio Department of Mathematics, Florida
More informationAMS 2000 subject classifications. Primary 62G05, 62G20; secondary 62H12. Key words and phrases. Persistent homology, topology, density estimation.
The Annals of Statistics 2014, Vol. 42, No. 6, 2301 2339 DOI: 10.1214/14-AOS1252 c Institute of Mathematical Statistics, 2014 CONFIDENCE SETS FOR PERSISTENCE DIAGRAMS arxiv:1303.7117v3 [math.st] 20 Nov
More informationarxiv: v4 [math.at] 23 Jan 2015
STATISTICAL TOPOLOGICAL DATA ANALYSIS USING PERSISTENCE LANDSCAPES PETER BUBENIK arxiv:1207.6437v4 [math.at] 23 Jan 2015 Abstract. We define a new topological summary for data that we call the persistence
More informationPersistent Local Systems
Persistent Local Systems Amit Patel Department of Mathematics Colorado State University joint work with Robert MacPherson School of Mathematics Institute for Advanced Study Abel Symposium on Topological
More informationMetric shape comparison via multidimensional persistent homology
Metric shape comparison via multidimensional persistent homology Patrizio Frosini 1,2 1 Department of Mathematics, University of Bologna, Italy 2 ARCES - Vision Mathematics Group, University of Bologna,
More informationManifold Estimation, Hidden Structure and Dimension Reduction
Manifold Estimation, Hidden Structure and Dimension Reduction We consider two related problems: (i) estimating low dimensional structure (ii) using low dimensional approximations for dimension reduction.
More information24 PERSISTENT HOMOLOGY
24 PERSISTENT HOMOLOGY Herbert Edelsbrunner and Dmitriy Morozov INTRODUCTION Persistent homology was introduced in [ELZ00] and quickly developed into the most influential method in computational topology.
More informationPatrizio Frosini. Department of Mathematics and ARCES, University of Bologna
An observer-oriented approach to topological data analysis Part 2: The algebra of group invariant non-expansive operators and its application in the project GIPHOD - Group Invariant Persistent Homology
More informationDeclutter and Resample: Towards parameter free denoising
Declutter and Resample: Towards parameter free denoising Mickaël Buchet, Tamal K. Dey, Jiayuan Wang, Yusu Wang arxiv:1511.05479v2 [cs.cg] 27 Mar 2017 Abstract In many data analysis applications the following
More information5. Persistence Diagrams
Vin de Silva Pomona College CBMS Lecture Series, Macalester College, June 207 Persistence modules Persistence modules A diagram of vector spaces and linear maps V 0 V... V n is called a persistence module.
More informationOptimal Transport and Wasserstein Distance
Optimal Transport and Wasserstein Distance The Wasserstein distance which arises from the idea of optimal transport is being used more and more in Statistics and Machine Learning. In these notes we review
More informationPersistence modules in symplectic topology
Leonid Polterovich, Tel Aviv Cologne, 2017 based on joint works with Egor Shelukhin, Vukašin Stojisavljević and a survey (in progress) with Jun Zhang Morse homology f -Morse function, ρ-generic metric,
More information7. The Stability Theorem
Vin de Silva Pomona College CBMS Lecture Series, Macalester College, June 2017 The Persistence Stability Theorem Stability Theorem (Cohen-Steiner, Edelsbrunner, Harer 2007) The following transformation
More informationGraph Induced Complex: A Data Sparsifier for Homology Inference
Graph Induced Complex: A Data Sparsifier for Homology Inference Tamal K. Dey Fengtao Fan Yusu Wang Department of Computer Science and Engineering The Ohio State University October, 2013 Space, Sample and
More informationHeat Flows, Geometric and Functional Inequalities
Heat Flows, Geometric and Functional Inequalities M. Ledoux Institut de Mathématiques de Toulouse, France heat flow and semigroup interpolations Duhamel formula (19th century) pde, probability, dynamics
More informationStabilizing the unstable output of persistent homology computations
Stabilizing the unstable output of persistent homology computations Peter Bubenik with Paul Bendich (Duke) and Alex Wagner (Florida) May 5, 2017 Conference on Applied and Computational Algebraic Topology
More informationPenalized Barycenters in the Wasserstein space
Penalized Barycenters in the Wasserstein space Elsa Cazelles, joint work with Jérémie Bigot & Nicolas Papadakis Université de Bordeaux & CNRS Journées IOP - Du 5 au 8 Juillet 2017 Bordeaux Elsa Cazelles
More informationPersistence Theory: From Quiver Representations to Data Analysis
Persistence Theory: From Quiver Representations to Data Analysis Comments and Corrections Steve Oudot June 29, 2017 Comments page viii, bottom of page: the following names should be added to the acknowledgements:
More informationPersistence based Clustering
July 9, 2009 TGDA Workshop Persistence based Clustering Primoz Skraba joint work with Frédéric Chazal, Steve Y. Oudot, Leonidas J. Guibas 1-1 Clustering Input samples 2-1 Clustering Input samples Important
More informationA (computer friendly) alternative to Morse Novikov theory for real and angle valued maps
A (computer friendly) alternative to Morse Novikov theory for real and angle valued maps Dan Burghelea Department of mathematics Ohio State University, Columbus, OH based on joint work with Stefan Haller
More informationApplications of Persistent Homology to Time-Varying Systems
Applications of Persistent Homology to Time-Varying Systems Elizabeth Munch Duke University :: Dept of Mathematics March 28, 2013 Time Death Radius Birth Radius Liz Munch (Duke) PhD Defense March 28, 2013
More informationConsistent Manifold Representation for Topological Data Analysis
Consistent Manifold Representation for Topological Data Analysis Tyrus Berry (tberry@gmu.edu) and Timothy Sauer Dept. of Mathematical Sciences, George Mason University, Fairfax, VA 3 Abstract For data
More informationPersistence Landscape of Functional Signal and Its. Application to Epileptic Electroencaphalogram Data
Persistence Landscape of Functional Signal and Its Application to Epileptic Electroencaphalogram Data Yuan Wang, Hernando Ombao, Moo K. Chung 1 Department of Biostatistics and Medical Informatics University
More informationPersistent Homology. 128 VI Persistence
8 VI Persistence VI. Persistent Homology A main purpose of persistent homology is the measurement of the scale or resolution of a topological feature. There are two ingredients, one geometric, assigning
More informationManifold Regularization
9.520: Statistical Learning Theory and Applications arch 3rd, 200 anifold Regularization Lecturer: Lorenzo Rosasco Scribe: Hooyoung Chung Introduction In this lecture we introduce a class of learning algorithms,
More informationarxiv: v2 [math.pr] 15 May 2016
MAXIMALLY PERSISTENT CYCLES IN RANDOM GEOMETRIC COMPLEXES OMER BOBROWSKI, MATTHEW KAHLE, AND PRIMOZ SKRABA arxiv:1509.04347v2 [math.pr] 15 May 2016 Abstract. We initiate the study of persistent homology
More informationSupplementary Figures
Death time Supplementary Figures D i, Å Supplementary Figure 1: Correlation of the death time of 2-dimensional homology classes and the diameters of the largest included sphere D i when using methane CH
More informationExercises from other sources REAL NUMBERS 2,...,
Exercises from other sources REAL NUMBERS 1. Find the supremum and infimum of the following sets: a) {1, b) c) 12, 13, 14, }, { 1 3, 4 9, 13 27, 40 } 81,, { 2, 2 + 2, 2 + 2 + } 2,..., d) {n N : n 2 < 10},
More informationLarge deviations and averaging for systems of slow fast stochastic reaction diffusion equations.
Large deviations and averaging for systems of slow fast stochastic reaction diffusion equations. Wenqing Hu. 1 (Joint work with Michael Salins 2, Konstantinos Spiliopoulos 3.) 1. Department of Mathematics
More informationSupplement to: Subsampling Methods for Persistent Homology
Suppleent to: Subsapling Methods for Persistent Hoology A. Technical results In this section, we present soe technical results that will be used to prove the ain theores. First, we expand the notation
More informationHypoelliptic multiscale Langevin diffusions and Slow fast stochastic reaction diffusion equations.
Hypoelliptic multiscale Langevin diffusions and Slow fast stochastic reaction diffusion equations. Wenqing Hu. 1 (Joint works with Michael Salins 2 and Konstantinos Spiliopoulos 3.) 1. Department of Mathematics
More informationThe DTM-signature for a geometric comparison of metric-measure spaces from samples
The DTM-signature for a geometric comparison of metric-measure spaces from samples Claire Brécheteau To cite this version: Claire Brécheteau The DTM-signature for a geometric comparison of metric-measure
More informationarxiv: v1 [stat.ml] 12 Jun 2017
Kernel method for persistence diagrams via kernel embedding and weight factor Genki Kusano Kenji Fukumizu Yasuaki Hiraoka arxiv:1706.03472v1 [stat.ml] 12 Jun 2017 Abstract Topological data analysis is
More informationMath 209B Homework 2
Math 29B Homework 2 Edward Burkard Note: All vector spaces are over the field F = R or C 4.6. Two Compactness Theorems. 4. Point Set Topology Exercise 6 The product of countably many sequentally compact
More informationASYMPTOTIC STRUCTURE FOR SOLUTIONS OF THE NAVIER STOKES EQUATIONS. Tian Ma. Shouhong Wang
DISCRETE AND CONTINUOUS Website: http://aimsciences.org DYNAMICAL SYSTEMS Volume 11, Number 1, July 004 pp. 189 04 ASYMPTOTIC STRUCTURE FOR SOLUTIONS OF THE NAVIER STOKES EQUATIONS Tian Ma Department of
More informationTopology Guaranteeing Manifold Reconstruction using Distance Function to Noisy Data
Topology Guaranteeing Manifold Reconstruction using Distance Function to Noisy Data ABSTRACT Frédéric Chazal Université de Bourgogne Institute of Mathematics of Burgundy France fchazal@u-bourgogne.fr Given
More informationLocal semiconvexity of Kantorovich potentials on non-compact manifolds
Local semiconvexity of Kantorovich potentials on non-compact manifolds Alessio Figalli, Nicola Gigli Abstract We prove that any Kantorovich potential for the cost function c = d / on a Riemannian manifold
More informationMostow Rigidity. W. Dison June 17, (a) semi-simple Lie groups with trivial centre and no compact factors and
Mostow Rigidity W. Dison June 17, 2005 0 Introduction Lie Groups and Symmetric Spaces We will be concerned with (a) semi-simple Lie groups with trivial centre and no compact factors and (b) simply connected,
More informationOptimal Reconstruction Might be Hard
Optimal Reconstruction Might be Hard Dominique Attali, André Lieutier To cite this version: Dominique Attali, André Lieutier. Optimal Reconstruction Might be Hard. Discrete and Computational Geometry,
More informationData dependent operators for the spatial-spectral fusion problem
Data dependent operators for the spatial-spectral fusion problem Wien, December 3, 2012 Joint work with: University of Maryland: J. J. Benedetto, J. A. Dobrosotskaya, T. Doster, K. W. Duke, M. Ehler, A.
More informationNeuronal structure detection using Persistent Homology
Neuronal structure detection using Persistent Homology J. Heras, G. Mata and J. Rubio Department of Mathematics and Computer Science, University of La Rioja Seminario de Informática Mirian Andrés March
More informationarxiv: v2 [math.at] 10 Nov 2013
SKETCHES OF A PLATYPUS: A SURVEY OF PERSISTENT HOMOLOGY AND ITS ALGEBRAIC FOUNDATIONS MIKAEL VEJDEMO-JOHANSSON arxiv:1212.5398v2 [math.at] 10 Nov 2013 Computer Vision and Active Perception Laboratory KTH
More informationStability and computation of topological invariants of solids in R n
Stability and computation of topological invariants of solids in R n Frédéric Chazal André Lieutier Abstract In this work, one proves that under quite general assumptions one can deduce the topology of
More informationLearning Eigenfunctions: Links with Spectral Clustering and Kernel PCA
Learning Eigenfunctions: Links with Spectral Clustering and Kernel PCA Yoshua Bengio Pascal Vincent Jean-François Paiement University of Montreal April 2, Snowbird Learning 2003 Learning Modal Structures
More informationMatthew L. Wright. St. Olaf College. June 15, 2017
Matthew L. Wright St. Olaf College June 15, 217 Persistent homology detects topological features of data. For this data, the persistence barcode reveals one significant hole in the point cloud. Problem:
More informationStatistical inference on Lévy processes
Alberto Coca Cabrero University of Cambridge - CCA Supervisors: Dr. Richard Nickl and Professor L.C.G.Rogers Funded by Fundación Mutua Madrileña and EPSRC MASDOC/CCA student workshop 2013 26th March Outline
More informationQuantifying Transversality by Measuring the Robustness of Intersections
Quantifying Transversality by Measuring the Robustness of Intersections Herbert Edelsbrunner Dmitriy Morozov, and mit Patel bstract By definition, transverse intersections are stable under infinitesimal
More informationTopological complexity of motion planning algorithms
Topological complexity of motion planning algorithms Mark Grant mark.grant@ed.ac.uk School of Mathematics - University of Edinburgh 31st March 2011 Mark Grant (University of Edinburgh) TC of MPAs 31st
More informationUnlabeled Data: Now It Helps, Now It Doesn t
institution-logo-filena A. Singh, R. D. Nowak, and X. Zhu. In NIPS, 2008. 1 Courant Institute, NYU April 21, 2015 Outline institution-logo-filena 1 Conflicting Views in Semi-supervised Learning The Cluster
More informationCut Locus and Topology from Surface Point Data
Cut Locus and Topology from Surface Point Data Tamal K. Dey Kuiyu Li March 24, 2009 Abstract A cut locus of a point p in a compact Riemannian manifold M is defined as the set of points where minimizing
More informationMetrics on Persistence Modules and an Application to Neuroscience
Metrics on Persistence Modules and an Application to Neuroscience Magnus Bakke Botnan NTNU November 28, 2014 Part 1: Basics The Study of Sublevel Sets A pair (M, f ) where M is a topological space and
More informationInference For High Dimensional M-estimates: Fixed Design Results
Inference For High Dimensional M-estimates: Fixed Design Results Lihua Lei, Peter Bickel and Noureddine El Karoui Department of Statistics, UC Berkeley Berkeley-Stanford Econometrics Jamboree, 2017 1/49
More informationPseudo-Poincaré Inequalities and Applications to Sobolev Inequalities
Pseudo-Poincaré Inequalities and Applications to Sobolev Inequalities Laurent Saloff-Coste Abstract Most smoothing procedures are via averaging. Pseudo-Poincaré inequalities give a basic L p -norm control
More informationLECTURE 3: SMOOTH FUNCTIONS
LECTURE 3: SMOOTH FUNCTIONS Let M be a smooth manifold. 1. Smooth Functions Definition 1.1. We say a function f : M R is smooth if for any chart {ϕ α, U α, V α } in A that defines the smooth structure
More informationModel Selection and Geometry
Model Selection and Geometry Pascal Massart Université Paris-Sud, Orsay Leipzig, February Purpose of the talk! Concentration of measure plays a fundamental role in the theory of model selection! Model
More information