Geometric Inference on Kernel Density Estimates


Jeff M. Phillips (University of Utah), Bei Wang (University of Utah), Yan Zheng (University of Utah)

Abstract

We show that geometric inference of a point cloud can be calculated by examining its kernel density estimate. This intermediate step results in the inference being statistically robust to noise and allows for large computational gains and scalability (e.g. on 100 million points). In particular, by first creating a coreset for the kernel density estimate, the data representing the final geometric and topological structure has size depending only on the error tolerance, not on the size of the original point set or the complexity of the structure. To achieve this result, we study how to replace distance to a measure, as studied by Chazal, Cohen-Steiner, and Mérigot, with the kernel distance. The kernel distance is monotonic with the kernel density estimate (sublevel sets of the kernel distance are superlevel sets of the kernel density estimate), thus allowing us to examine the kernel density estimate in this manner. We show it has several computational and stability advantages. Moreover, we provide an algorithm to estimate its topology using weighted Vietoris-Rips complexes.

1 Introduction

Geometric inference [9] is the task of recovering the topological and geometric features of an object using only a finite sampled point cloud. Specifically, the goal is for the inference from the point cloud to preserve the homeomorphism [, 7], homotopy type [17, 18, 3, 50, 1], or homology [17, 50, 1] of the underlying object. The field has progressively considered inference on more and more general objects, including smooth manifolds [50, 51, 20] and compact subsets of Euclidean space [1]. While the underlying object varies, the approaches have generally relied on combinatorial structures on the point clouds, such as $\alpha$-complexes [30] and Vietoris-Rips complexes [1]. The $\alpha$-complex is typically associated with the offsets of point clouds (i.e. a union of balls of an appropriate radius centered at the point sample).

In this paper we ask: what geometric inference guarantees can be made from kernel density estimates? These well-studied statistical objects [57, 54, 5, 6] create a smooth intermediate representation that is robust to noise and spatial variation, and thanks to recent computational advances [43, 5, 65, 64] can be constructed very efficiently at enormous scales with formal approximation guarantees. We consider the following approach for processing a point set drawn from an unknown object.

1. Create a kernel density estimate of the point set; this approximates the true underlying measure.
2. Approximate the persistence diagram of the sublevel sets filtration of the kernel distance using weighted Vietoris-Rips complexes.
3. Examine the shape and topological properties of the superlevel sets of the kernel density estimates.

To analyze this process, we take a few detours from what might be suggested. First, instead of taking the classical view of an $\alpha$-complex as a union of balls, we explore a newer interpretation as the sublevel sets of a distance function to the object of study.
These distance function based techniques [9] have been used with success in estimating the topology and geometry of the underlying object with certain data models [47, 19, 1, 11, 13, 48, 14]. However, when noise is present, the reconstruction task is much more difficult, as a single stray point can drastically alter the underlying topology and therefore decrease the stability of the distance function. Recently Chazal, Cohen-Steiner, and Mérigot [15] introduced distance to a measure, a distance-like function that is robust to perturbations and noise on the data and can be approximated efficiently [39, 5]. We build on the technology developed in this line of work in order to understand and construct superlevel sets of kernel density estimates.

Second, instead of linearly varying the threshold of superlevel sets on the kernel density estimate, we consider the sublevel sets of the kernel distance, which is roughly the square root of a constant minus the kernel density estimate. In addition to allowing us to analyze the kernel density estimates, this kernel distance [36, 58, 43, 53] offers many robustness guarantees and computational advantages that are on par with, and occasionally better than, those of the distance to a measure.

This is not the first work exploring geometric inference and topological properties of superlevel sets of kernel density estimates. A recent manuscript [4] offers a statistical perspective on this problem and explores, among other things, how the sample size of the input data set affects the robustness of topological properties such as persistence diagrams. By examining this problem through the kernel distance, our bounds on the same questions are considerably smaller and require far fewer parameters.

1.1 Background on Kernels, Kernel Density Estimates, and Kernel Distance

A kernel is a non-negative similarity measure $K : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}^+$; more similar points have higher value.
For any fixed $p \in \mathbb{R}^d$, a kernel $K(p, \cdot)$ can be normalized to be a probability distribution; that is, $\int_{x \in \mathbb{R}^d} K(p, x)\,dx = 1$. For the purposes of this article we focus on the Gaussian kernel (see footnote 1), defined as

$$K(p, x) = \sigma^2 \exp\left(-\|p - x\|^2 / (2\sigma^2)\right).$$

A kernel density estimate [57, 54, 5, 6] is a way to estimate a continuous distribution function over $\mathbb{R}^d$ for a finite point set $P \subset \mathbb{R}^d$. Specifically,

$$\mathrm{KDE}_P(x) = \frac{1}{|P|} \sum_{p \in P} K(p, x).$$

The kernel distance [59, 41, 36, 61, 9, 8, 37, 43, 53] (also called current distance or maximum mean discrepancy) is a metric [49, 58] between two point sets $P$, $Q$ (as long as the kernel used is characteristic [58], a slight restriction of being positive definite [3, 4, 4, 63]; this includes the Gaussian and Laplace kernels). Define a similarity between the two point sets as

$$\kappa(P, Q) = \frac{1}{|P|} \frac{1}{|Q|} \sum_{p \in P} \sum_{q \in Q} K(p, q).$$

Then the kernel distance between two point sets is defined as

$$D_K(P, Q) = \sqrt{\kappa(P, P) + \kappa(Q, Q) - 2\kappa(P, Q)}.$$

When the point set $Q$ is a single point $x$, then $\kappa(P, x) = \mathrm{KDE}_P(x)$.

Kernel density estimates can apply to any measure $\mu$ (on $\mathbb{R}^d$) as $\mathrm{KDE}_\mu(x) = \int_{p \in \mathbb{R}^d} K(p, x)\mu(p)\,dp$. The similarity between two measures is $\kappa(\mu, \nu) = \int_{p \in \mathbb{R}^d} \int_{q \in \mathbb{R}^d} K(p, q)\mu(p)\nu(q)\,dp\,dq$, and the kernel distance between two measures $\mu$ and $\nu$ is still a metric, defined as

$$D_K(\mu, \nu) = \sqrt{\kappa(\mu, \mu) + \kappa(\nu, \nu) - 2\kappa(\mu, \nu)}.$$

When the measure $\nu$ is a unit Dirac mass at $x$ ($\nu(q) = 1$ when $q = x$ and $\nu(q) = 0$ otherwise), then $\kappa(\mu, x) = \mathrm{KDE}_\mu(x)$. Given a finite point set $P \subset \mathbb{R}^d$, we can work with the empirical measure $\mu_P$ defined as $\mu_P = \frac{1}{|P|} \sum_{p \in P} \delta_p$, where $\delta_p$ is the Dirac measure on $p$, and $D_K(\mu_P, \mu_Q) = D_K(P, Q)$.

If $K$ is positive definite, it is said to have the reproducing property [4, 3, 4, 63]. This implies that $K(p, x)$ is an inner product in some reproducing kernel Hilbert space (RKHS) $\mathcal{H}_K$. Specifically, there is a lifting map $\phi : \mathbb{R}^d \to \mathcal{H}_K$ so that $K(p, x) = \langle \phi(p), \phi(x) \rangle_{\mathcal{H}_K}$, and moreover the entire set $P$ can be represented as $\Phi(P) = \frac{1}{|P|} \sum_{p \in P} \phi(p)$, which is a single element of $\mathcal{H}_K$ and has norm $\|\Phi(P)\|_{\mathcal{H}_K} = \sqrt{\kappa(P, P)}$. A single point $x \in \mathbb{R}^d$ also has norm $\|\phi(x)\|_{\mathcal{H}_K} = \sqrt{K(x, x)}$ in this space, and we can also write $\kappa(P, x) = \mathrm{KDE}_P(x) = \langle \Phi(P), \phi(x) \rangle_{\mathcal{H}_K}$.

1.2 Geometric Inference and Distance to a Measure: A Review

Given an unknown object (e.g.
a compact set) $S \subset \mathbb{R}^d$ and a finite point cloud $P \subset \mathbb{R}^d$ that comes from $S$ under some process, geometric inference aims to recover topological and geometric properties of $S$ from $P$. The offset-based (and more generally, the distance function-based) approach for geometric inference reconstructs a geometric and topological approximation of $S$ by offsets from $P$ [11, 1, 13, 14, 15, 17, 18, 20].

Given a compact set $S \subset \mathbb{R}^d$, we can define a distance function $f_S$ to $S$; a common example is $f_S(x) = \inf_{y \in S} \|x - y\|$. The offsets of $S$ are the sublevel sets of $f_S$, denoted $(S)^r = f_S^{-1}([0, r])$. Now an approximation of $S$ by another compact set $P \subset \mathbb{R}^d$ (e.g. a finite point cloud) can be quantified by the Hausdorff distance $d_H(S, P) := \|f_S - f_P\|_\infty = \sup_{x \in \mathbb{R}^d} |f_S(x) - f_P(x)|$ of their distance functions. The intuition behind the inference of topology is that if $d_H(S, P)$ is small, then $f_S$ and $f_P$ are close, and subsequently $S$, $(S)^r$ and $(P)^r$ carry the same topology for an appropriate scale $r$. In other words, to compare the topology of the offsets $(S)^r$ and $(P)^r$, we require Hausdorff stability with respect to their distance functions $f_S$ and $f_P$.

An example of an offset-based topological inference result is formally stated as follows (as a particular version of the reconstruction Theorem 4.6 in [1]), where the reach of a compact set $S$, $\mathrm{reach}(S)$, is defined as the minimum distance between $S$ and its medial axis.

Footnote 1: $K(p, x)$ is normalized so that $K(x, x) = 1$ for $\sigma = 1$. The choice of coefficient $\sigma^2$ is not the standard normalization, but it is perfectly valid as it scales everything by a constant. It has the property that $\sigma^2 - K(p, x) \approx \|p - x\|^2 / 2$ for $\|p - x\|$ small.

Theorem 1.1 (Reconstruction from $f_P$ [1, 15]). Let $S, P \subset \mathbb{R}^d$ be compact sets such that $\mathrm{reach}(S) > R$ and $\varepsilon := d_H(S, P) \le R/7$. Then $S$ and $(P)^r$ are homotopy equivalent (and even isotopic) if $4\varepsilon \le r \le R - 3\varepsilon$.

Here $R$ ensures that the topological properties of $S$ and $(S)^r$ are the same, and the $\varepsilon$ parameter ensures $(S)^r$ and $(P)^r$ are close. Typically $\varepsilon$ is tied to the density with which a point cloud $P$ is sampled from $S$.

In addition to the Hausdorff stability property stated above, as explained in [15], there are a few more specific properties of $f_S$ that enable it to be useful for geometric inference [1, 14, 47, 48]:

(F1) $f_S$ is 1-Lipschitz: for all $x, y \in \mathbb{R}^d$, $|f_S(x) - f_S(y)| \le \|x - y\|$.
(F2) $f_S^2$ is 1-semiconcave: the map $x \in \mathbb{R}^d \mapsto (f_S(x))^2 - \|x\|^2$ is concave.

As described in [15], (F1) ensures that $f_S$ is differentiable almost everywhere and the medial axis of $S$ has zero $d$-volume; and (F2) is a crucial technical tool, e.g. in proving the existence of the flow of the gradient of the distance function for topological inference [1].

Distance to a measure. Given a probability measure $\mu$ on $\mathbb{R}^d$ and a parameter $m_0 > 0$ smaller than the total mass of $\mu$, the distance to a measure $d^{\mathrm{ccm}}_{\mu,m_0} : \mathbb{R}^d \to \mathbb{R}^+$ [15] is defined for any point $x \in \mathbb{R}^d$ as

$$d^{\mathrm{ccm}}_{\mu,m_0}(x) = \left( \frac{1}{m_0} \int_{m=0}^{m_0} (\delta_{\mu,m}(x))^2 \,dm \right)^{1/2}, \quad \text{where } \delta_{\mu,m}(x) = \inf \left\{ r > 0 : \mu(\bar{B}_r(x)) > m \right\},$$

and where $B_r(x)$ is a ball of radius $r$ centered at $x$ and $\bar{B}_r(x)$ is its closure. It has been shown in [15] that $d^{\mathrm{ccm}}_{\mu,m_0}$ is a distance-like function that has the following properties:

(D1) The function $x \mapsto d^{\mathrm{ccm}}_{\mu,m_0}(x)$ is 1-Lipschitz.
(D2) The function $x \mapsto (d^{\mathrm{ccm}}_{\mu,m_0}(x))^2 - \|x\|^2$ is concave.
(D3) [Stability] For probability measures $\mu$ and $\nu$ on $\mathbb{R}^d$ and $m_0 > 0$, $\|d^{\mathrm{ccm}}_{\mu,m_0} - d^{\mathrm{ccm}}_{\nu,m_0}\|_\infty \le \frac{1}{\sqrt{m_0}} W_2(\mu, \nu)$.

Here $W_2$ is the Wasserstein distance $W_2(\mu, \nu) = \left( \inf_{\pi \in \Pi(\mu,\nu)} \int_{\mathbb{R}^d \times \mathbb{R}^d} \|x - y\|^2 \,d\pi(x, y) \right)^{1/2}$ between two measures, where $d\pi(x, y)$ measures the amount of mass transferred from location $x$ to location $y$ and $\pi \in \Pi(\mu, \nu)$ is a transference plan [6].
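To make the definition of distance to a measure concrete, the following minimal sketch evaluates it for the uniform empirical measure on a finite point set. It relies on a standard simplification: when $m_0 = k/n$ for a set of $n$ points, the integral reduces to the root-mean-square distance from $x$ to its $k$ nearest points. The function name `distance_to_measure` and the sample data are illustrative assumptions, not part of the paper.

```python
import math

def distance_to_measure(points, x, k):
    """Distance to the uniform empirical measure on `points` with m0 = k / len(points).

    For this special case the integral in the definition equals the mean of the
    squared distances from x to its k nearest points, so we return its square root.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    sq_dists = sorted(sum((a - b) ** 2 for a, b in zip(p, x)) for p in points)
    return math.sqrt(sum(sq_dists[:k]) / k)

# Hypothetical data: three points near the origin plus one far outlier.
cloud = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
query = (10.0, 10.0)

nearest = distance_to_measure(cloud, query, k=1)  # plain nearest-point distance
robust = distance_to_measure(cloud, query, k=2)   # m0 = 1/2 discounts the lone outlier
print(nearest, robust)
```

With `k=1` the outlier makes the query look like it lies on the object, while a larger `k` (larger $m_0$) requires more mass nearby before the distance becomes small.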
Property (D3) implies that in addition to $d^{\mathrm{ccm}}_{\mu,m_0}$ being suitable for known reconstruction theorems, it is also stable with respect to the Wasserstein distance on $\mu$.

Furthermore, for the reconstruction theorem [1] (Theorem 1.1) to hold, the authors require a distance-like version ([15], Proposition 4.) of the Isotopy Lemma ([38], Proposition 1.8), which adds an extra condition that $d^{\mathrm{ccm}}_{\mu,m_0}$ has to be proper, in the sense that, for every sequence $\{x_i\}$ in $\mathbb{R}^d$ that escapes to infinity, $\{d^{\mathrm{ccm}}_{\mu,m_0}(x_i)\}$ escapes to infinity in $\mathbb{R}^+$ ([46], Proposition 2.17).

On a point set $P$, one can algorithmically construct a topological estimation of $d^{\mathrm{ccm}}_{\mu_P,m_0}$, e.g. approximate the persistence diagram [3] of the sublevel sets filtration of $d^{\mathrm{ccm}}_{\mu_P,m_0}$. In the case of $d^{\mathrm{ccm}}_{\mu_P,m_0}$, the sublevel sets can be approximated by a union of balls, and their topology can be estimated using tools from computational topology such as weighted alpha-shapes [39] and weighted Rips complexes [6].

1.3 Our Results

Here we show how to perform geometric inference on a kernel density estimate of a point set $P$. We accomplish this by showing that a similar set of properties hold for the kernel distance with respect to a measure $\mu$, $d^K_\mu : \mathbb{R}^d \to \mathbb{R}^+$ (in place of distance to a measure $d^{\mathrm{ccm}}_{\mu,m_0}$), defined as

$$d^K_\mu(x) = D_K(\mu, x) = \sqrt{\kappa(\mu, \mu) + \kappa(x, x) - 2\kappa(\mu, x)}.$$

This treats $x$ as a probability measure represented by a Dirac mass at $x$. Specifically, the following properties of $d^K_\mu$ allow it to inherit the reconstruction properties of $d^{\mathrm{ccm}}_{\mu,m_0}$.

(K1) $d^K_\mu$ is 1-Lipschitz.
(K2) $(d^K_\mu)^2$ is 1-semiconcave: the map $x \mapsto (d^K_\mu(x))^2 - \|x\|^2$ is concave.
(K3) $d^K_\mu$ is proper.
(K4) [Stability] If $\mu$ and $\nu$ are two measures on $\mathbb{R}^d$, then $\|d^K_\mu - d^K_\nu\|_\infty \le D_K(\mu, \nu)$.

In addition, we provide constructive topological estimation results associated with $d^K_\mu$. We approximate the persistence diagram of the sublevel sets filtration of $d^K_\mu$ using weighted Rips complexes, following power distance machinery introduced in [6].

We also describe further advantages of the kernel distance.

(i) It has a small coreset representation, which allows for sparse representation and efficient, scalable computation. In particular, an $\varepsilon$-kernel sample [43, 5, 65] $Q$ of $\mu$ is a finite point set whose size only depends on $\varepsilon > 0$ and such that $\max_{x \in \mathbb{R}^d} |\mathrm{KDE}_\mu(x) - \mathrm{KDE}_{\mu_Q}(x)| = \max_{x \in \mathbb{R}^d} |\kappa(\mu, x) - \kappa(\mu_Q, x)| \le \varepsilon$.
(ii) It is Lipschitz with respect to the outlier parameter $\sigma$ when the input $x$ is fixed.
(iii) Its inference is easily interpretable through the superlevel sets of a kernel density estimate.
(iv) As $\sigma$ tends to $\infty$, for any two probability measures $\mu$, $\nu$ the kernel distance is bounded by the Wasserstein distance: $\lim_{\sigma \to \infty} D_K(\mu, \nu) \le W_2(\mu, \nu)$.

2 Properties of the Kernel Distance

Recall we use the $\sigma^2$-normalized Gaussian kernel $K_\sigma(p, x) = \sigma^2 \exp(-\|p - x\|^2 / (2\sigma^2))$. For ease of exposition, unless otherwise noted, we will assume $\sigma$ is fixed and write $K$ instead of $K_\sigma$. Yet, the choice of $\sigma$ is important. It is a smoothing parameter and controls the variance of the kernel; it determines the amount of noise or outliers tolerated by $d^K_\mu$. We will address its significance in Section 4.2 and Appendix A.1 in more detail.

2.1 Semiconcave Property for $d^K_\mu$

Lemma 2.1 (K2). $(d^K_\mu)^2$ is 1-semiconcave: the map $x \mapsto (d^K_\mu(x))^2 - \|x\|^2$ is concave.

Proof. This follows from the second derivative of $T(x) = (d^K_\mu(x))^2 - \|x\|^2$ in any direction always being non-positive. We can rewrite

$$T(x) = \kappa(\mu, \mu) + \kappa(x, x) - 2\kappa(\mu, x) - \|x\|^2 = \kappa(\mu, \mu) + \kappa(x, x) - \int_{p \in \mathbb{R}^d} \left( 2K(p, x) + \|x\|^2 \right) \mu(p)\,dp.$$
Note that both $\kappa(\mu, \mu)$ and $\kappa(x, x)$ are absolute constants, so we can ignore them in the second derivative. Furthermore, by setting $t(p, x) = -2K(p, x) - \|x\|^2$, the second derivative of $T(x)$ is non-positive if the second derivative of $t(p, x)$ is non-positive for all $p, x \in \mathbb{R}^d$. First note that the second derivative of $\|x\|^2$ is a constant 2 in every direction. The second derivative of $K(p, x)$ is symmetric about $p$, so we can consider the second derivative along any vector $u = x - p$ (writing $u$ also for its length),

$$\frac{d^2}{du^2} t(p, x) = 2\left( 1 - \frac{u^2}{\sigma^2} \right) \exp\left( -\frac{u^2}{2\sigma^2} \right) - 2.$$

This expression reaches its maximum value at $u = \|x - p\| = 0$, where it is exactly 0; this follows since $\exp(-u^2/(2\sigma^2))$ is always positive and at most 1, and $u^2/\sigma^2$ is always non-negative.

We also note in Appendix B.3 that semiconcavity follows trivially in the RKHS $\mathcal{H}_K$.
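The sign claim in the proof of Lemma 2.1 can be checked numerically. The sketch below is only an illustration under stated assumptions: it places $p$ at the origin and restricts $t(p, x)$ to a line through $p$, so that $\|x\|^2 = u^2$; the helper names `t`, `t_second_derivative`, and `finite_difference` are hypothetical. It compares the closed-form second derivative with a central finite difference and records its largest value on a grid.

```python
import math

def t(u, sigma):
    """t(p, x) = -2 K(p, x) - ||x||^2 on a line through p, with p at the origin
    and u the signed distance from p (an assumption made for this sketch)."""
    return -2.0 * sigma ** 2 * math.exp(-u * u / (2.0 * sigma ** 2)) - u * u

def t_second_derivative(u, sigma):
    """Closed form: 2 (1 - u^2 / sigma^2) exp(-u^2 / (2 sigma^2)) - 2."""
    return 2.0 * (1.0 - u * u / sigma ** 2) * math.exp(-u * u / (2.0 * sigma ** 2)) - 2.0

def finite_difference(u, sigma, h=1e-3):
    """Central second difference of t, used as an independent check."""
    return (t(u + h, sigma) - 2.0 * t(u, sigma) + t(u - h, sigma)) / (h * h)

sigma = 1.5
grid = [i / 10.0 for i in range(-60, 61)]
worst = max(t_second_derivative(u, sigma) for u in grid)
largest_gap = max(abs(finite_difference(u, sigma) - t_second_derivative(u, sigma)) for u in grid)
print(worst, largest_gap)
```

The largest value is attained at $u = 0$, matching the statement that the second derivative never becomes positive.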

2.2 Lipschitz Property for $d^K_\mu$

We generalize a (folklore [15]) relation between semiconcave and Lipschitz functions, and (re)prove it in Appendix B for completeness. Then, using Lemma 2.1, property (K1) follows.

Lemma 2.2. Consider a function $g(x)$ such that $(g(x))^2$ is $\ell$-semiconcave, for $\ell \ge 1$. Then $g(x)$ must be $\ell$-Lipschitz.

Lemma 2.3 (K1). $d^K_\mu$ is 1-Lipschitz on its input.

2.3 Properness of $d^K_\mu$

For the Isotopy Lemma to hold, we show that $d^K_\mu$ is proper when its range is restricted to be less than $c_\mu := \sqrt{\kappa(\mu, \mu) + \kappa(x, x)}$. Here, the value of $c_\mu$ depends on $\mu$ and not on $x$, since $\kappa(x, x) = K(x, x) = \sigma^2$.

Lemma 2.4 (K3). $d^K_\mu$ is proper.

Proof. We begin by reviewing several equivalent definitions of a proper map.

Definition (i): A continuous map $f : X \to Y$ between two topological spaces is proper if and only if the inverse image of every compact subset in $Y$ is compact in $X$ ([45], page 84; [46], page 45).

If both $X$ and $Y$ are topological manifolds, an alternative characterization of properness exists in terms of divergent sequences.

Definition (ii): A continuous map $f : X \to Y$ between two topological manifolds is proper if and only if for every sequence $\{p_i\}$ in $X$ that escapes to infinity, $\{f(p_i)\}$ escapes to infinity in $Y$ ([46], Proposition 2.17). Here, for a topological space $X$, a sequence $\{p_i\}$ in $X$ escapes to infinity if for every compact set $G \subset X$, there are at most finitely many values of $i$ for which $p_i \in G$ ([46], page 46).

Based on the latter definition, for example, $d^{\mathrm{ccm}}_{\mu,m_0}$ is proper in the sense that it tends to infinity as $x$ goes to infinity.

To prove that $d^K_\mu$ is proper, we prove the following two claims: (a) a continuous function $f : \mathbb{R}^d \to [0, c)$ (where $c$ is a constant) is proper if for every sequence $\{x_i\}$ in $\mathbb{R}^d$ that escapes to infinity, the sequence $\{f(x_i)\}$ tends to $c$; (b) for $f := d^K_\mu$, every sequence $\{x_i\}$ that escapes to infinity has $\{f(x_i)\}$ tending to $c_\mu$, or equivalently, $\kappa(\mu, x_i)$ tends to 0.

We prove claim (a) by proving its contrapositive.
That is, if a continuous function $f : \mathbb{R}^d \to [0, c)$ is not proper, then there exists a sequence $\{x_i\}$ in $\mathbb{R}^d$ that escapes to infinity such that the sequence $\{f(x_i)\}$ does not tend to $c$. Suppose $f$ is not proper; this implies that there exists a constant $b < c$ such that $f^{-1}[0, b]$ is not compact (based on properness definition (i)) and is therefore either not closed or unbounded. We first show that $A := f^{-1}[0, b]$ is closed. We make use of the following theorem ([44], page 88, Theorem 10): a mapping $f$ of a topological space $X$ into a topological space $Y$ is continuous if and only if the pre-image $f^{-1}(F)$ of every closed set $F \subset Y$ is closed in $X$. Since $f$ is continuous, the pre-image of every closed set $[a, b] \subset \mathbb{R}$ is closed in $\mathbb{R}^d$. Therefore $A$ is closed, and hence it must be unbounded. Since every unbounded sequence contains a monotone subsequence that has either $+\infty$ or $-\infty$ as a limit, $A$ contains a sequence $S := \{x_i\}$ that tends to an infinite limit. In addition, as the elements of $S$ escape to infinity, $\{f(x_i)\}$ stays at most $b$ and so does not tend to $c$. Therefore (a) holds by contraposition.

To prove claim (b), we need to show that for every sequence $\{x_i\}$ that escapes to infinity, $\kappa(\mu, x_i)$ tends to 0. For each $x_i$, define a radius $r_i = \|x_i - 0\| / 2$ and a ball $B_i$ that is centered at the origin 0 and has radius $r_i$. As $x_i$ goes to infinity, $r_i$ increases until for any fixed arbitrary $\varepsilon > 0$ we have $\int_{p \in B_i} \mu(p)\,dp \ge 1 - \varepsilon/(2\sigma^2)$ and thus $\int_{q \in \mathbb{R}^d \setminus B_i} \mu(q)\,dq \le \varepsilon/(2\sigma^2)$. Furthermore, let $p_i = \arg\min_{p \in B_i} \|p - x_i\|$, so $\|x_i - p_i\| = r_i$. Thus, also as $x_i$ goes to infinity, $r_i$ increases until for any $\varepsilon > 0$ we have $K(p_i, x_i) \le \varepsilon/2$. We now decompose

$$\kappa(\mu, x_i) = \int_{p \in B_i} K(p, x_i)\mu(p)\,dp + \int_{q \in \mathbb{R}^d \setminus B_i} K(q, x_i)\mu(q)\,dq.$$

Thus for any $\varepsilon > 0$, as $x_i$ goes to infinity, the first term is at most $\varepsilon/2$ since all $K(p, x_i) \le K(p_i, x_i) \le \varepsilon/2$, and the second term is at most $\varepsilon/2$ since $K(q, x) \le \sigma^2$ and $\int_{q \in \mathbb{R}^d \setminus B_i} \mu(q)\,dq \le \varepsilon/(2\sigma^2)$. Since these results hold for all $\varepsilon$, as $x_i$ goes to infinity and $\varepsilon$ goes to 0, $\kappa(\mu, x_i)$ goes to 0.
Combining (a) with (b) and the fact that $d^K_\mu$ is a continuous (in fact, Lipschitz) function, we obtain the properness result.
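Claim (b) can be observed directly on an empirical measure. The following minimal sketch, with hypothetical helper names (`kernel`, `kappa`, `kernel_distance_to_point`, `c_mu`) and made-up sample points, evaluates $d^K_{\mu_P}(x)$ along a ray and prints its gap to $c_\mu = \sqrt{\kappa(\mu_P, \mu_P) + \sigma^2}$, which shrinks as $x$ moves away from the data.

```python
import math

def kernel(p, x, sigma):
    """Gaussian kernel with the sigma^2 normalization used in this paper."""
    sq = sum((a - b) ** 2 for a, b in zip(p, x))
    return sigma ** 2 * math.exp(-sq / (2.0 * sigma ** 2))

def kappa(P, Q, sigma):
    """Average pairwise kernel similarity between point sets P and Q."""
    return sum(kernel(p, q, sigma) for p in P for q in Q) / (len(P) * len(Q))

def kernel_distance_to_point(P, x, sigma):
    """Kernel distance from the empirical measure on P to a Dirac mass at x."""
    sq = kappa(P, P, sigma) + sigma ** 2 - 2.0 * kappa(P, [x], sigma)
    return math.sqrt(max(sq, 0.0))  # clamp tiny negative rounding noise

def c_mu(P, sigma):
    """Upper limit of the kernel distance: sqrt(kappa(P, P) + sigma^2)."""
    return math.sqrt(kappa(P, P, sigma) + sigma ** 2)

P = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # hypothetical sample
sigma = 1.0
limit = c_mu(P, sigma)
radii = (0.0, 2.0, 5.0, 50.0)
gaps = [limit - kernel_distance_to_point(P, (r, 0.0), sigma) for r in radii]
for r, gap in zip(radii, gaps):
    print(r, gap)
```

In floating point the gap reaches zero for very distant points because the kernel values underflow; mathematically the distance stays strictly below $c_\mu$.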

By the definition of properness, Lemma 2.4 implies that $d^K_\mu$ is a closed map and its level set at any value $a \in [0, c_\mu)$ is compact. This also means that the sublevel set of $d^K_\mu$ (for ranges $[0, a) \subset [0, c_\mu)$) is compact. Since the level set (sublevel set) of $d^K_\mu$ corresponds to the level set (superlevel set) of $\mathrm{KDE}_\mu$, we have the following corollary.

Corollary 2.1. The superlevel sets of $\mathrm{KDE}_\mu$ for all ranges whose lower bound $a > 0$ are compact.

The result in [31] implies that the above statement is true for $\mu$ being a uniform measure over a finite point set. The above corollary is a more general statement.

3 Reconstruction and Topological Estimation using Kernel Distance

3.1 Homotopy Equivalent Reconstruction using $d^K_\mu$

We have shown that the kernel distance function $d^K_\mu$ is a distance-like function in the sense defined in [15]: it is a non-negative function which is 1-Lipschitz and proper (with restriction on the range), and whose square is 1-semiconcave. Therefore the reconstruction theory for a distance-like function [15] (which is an extension of results for compact sets [1]) holds in the setting of $d^K_\mu$. Most importantly, we state the following two corollaries for completeness, whose proofs follow trivially from the proofs of Proposition 4. and Theorem 4.6 in [15]. Before their formal statement, we need some notation adapted from [15] to make these statements precise.

Let $\phi : \mathbb{R}^d \to \mathbb{R}^+$ be a distance-like function. A point $x \in \mathbb{R}^d$ is an $\alpha$-critical point if $\phi^2(x + h) \le \phi^2(x) + 2\alpha \|h\| \phi(x) + \|h\|^2$ with $\alpha \in [0, 1]$, for all $h \in \mathbb{R}^d$. Let $(\phi)^r = \{x \in \mathbb{R}^d \mid \phi(x) \le r\}$ denote the sublevel set of $\phi$, and let $(\phi)^{[r_1, r_2]} = \{x \in \mathbb{R}^d \mid r_1 \le \phi(x) \le r_2\}$ denote all points at levels in the range $[r_1, r_2]$. For $\alpha \in [0, 1]$, the $\alpha$-reach of $\phi$ is the maximum $r$ such that $(\phi)^r$ has no $\alpha$-critical point, denoted $\mathrm{reach}_\alpha(\phi)$. When $\alpha = 1$, $\mathrm{reach}_1$ coincides with the reach introduced in [35].

Theorem 3.1 (Isotopy lemma on $d^K_\mu$). Let $r_1 < r_2$ be two positive numbers such that $d^K_\mu$ has no critical points in $(d^K_\mu)^{[r_1, r_2]}$. Then all the sublevel sets $(d^K_\mu)^r$ are isotopic for $r \in [r_1, r_2]$.

Theorem 3.2 (Reconstruction on $d^K_\mu$). Let $d^K_\mu$ and $d^K_\nu$ be two kernel distance functions such that $\|d^K_\mu - d^K_\nu\|_\infty \le \varepsilon$. Suppose $\mathrm{reach}_\alpha(d^K_\mu) \ge R$ for some $\alpha > 0$. Then for all $r \in [4\varepsilon/\alpha^2, R - 3\varepsilon]$ and all $\eta \in (0, R)$, the sublevel sets $(d^K_\mu)^\eta$ and $(d^K_\nu)^r$ are homotopy equivalent for $\varepsilon \le R/(5 + 4/\alpha^2)$.

3.2 Power Distance using $d^K_\mu$

A power distance using $d^K_\mu$ for a measure $\mu$ is defined with a point set $P \subset \mathbb{R}^d$ and a metric $d(\cdot, \cdot)$ on $\mathbb{R}^d$,

$$f_P(\mu, x) = \sqrt{\min_{p \in P} \left( d(p, x)^2 + d^K_\mu(p)^2 \right)}.$$

We consider the instance $f^K_P$ that uses the kernel distance $d(p, x) := D_K(p, x)$. Many details are left to Appendix D.

Given a measure $\mu$, let $p_+ = \arg\max_{q \in \mathbb{R}^d} \kappa(\mu, q)$, and let $P_+ \subset \mathbb{R}^d$ be a point set that contains $p_+$. We can show (Theorem D. and Theorem D.1) that $\frac{1}{\sqrt{2}} d^K_\mu(x) \le f^K_{P_+}(\mu, x) \le \sqrt{14}\, d^K_\mu(x)$. However, constructing $p_+$ exactly seems quite difficult.

Now consider an empirical measure $\mu_P$ defined by a point set $P$. We show (in Theorem D.5 in Appendix D.3) how to construct a point $\hat{p}_+$ (that approximates $p_+$) such that $D_K(P, \hat{p}_+) \le (1 + \delta) D_K(P, p_+)$ for any $\delta > 0$. The median concentration $\Lambda_P$ is a radius such that no point $p \in P$ has more than half of the points

of $P$ within $\Lambda_P$. The runtime is polynomial in $n$ and $1/\delta$, assuming the spread of $P$ is bounded and $\sigma$ is chosen so that $\sigma/\Lambda_P$ and $d$ are absolute constants.

Then we consider $\hat{P}_+ = P \cup \{\hat{p}_+\}$, where $\hat{p}_+$ is found with $\delta = 1/2$ in the above construction. Then we can provide the following multiplicative bound, proven in Theorem D.3. The lower bound holds independent of the choice of $P$, as shown in Theorem D.1.

Theorem 3.3. For any point set $P \subset \mathbb{R}^d$ and point $x \in \mathbb{R}^d$, with empirical measure $\mu_P$ defined by $P$,

$$\frac{1}{\sqrt{2}} d^K_{\mu_P}(x) \le f^K_{\hat{P}_+}(\mu_P, x) \le 8\, d^K_{\mu_P}(x).$$

3.3 Constructive Topological Estimation using $d^K_\mu$

In order to perform constructive topological estimation using the kernel distance $d^K_\mu$, one needs to be able to compute quantities related to its sublevel sets, in particular the persistence diagram of the sublevel sets filtration of $d^K_\mu$. We now describe the tools needed for the kernel distance, based on machinery recently developed by Buchet et al. [6], which shows how to approximate the persistent homology of distance to a measure for any metric space via a power distance construction. Using similar constructions, we can use the weighted Rips filtration to approximate the persistence diagram of the kernel distance.

To state our results, we first require some technical notions and assume basic knowledge of persistent homology (see [3, 33] for a readable background). Given a metric space $X$ with distance $d_X(\cdot, \cdot)$, a set $P \subset X$ and a function $w : P \to \mathbb{R}$, the (general) power distance $f$ associated with $(P, w)$ is defined as $f(x) = \sqrt{\min_{p \in P} \left( d_X(p, x)^2 + w(p)^2 \right)}$. Now given the set $(P, w)$ and its corresponding power distance $f$, one could use the weighted Rips filtration to approximate the persistence diagram of $w$. Consider the sublevel set of $f$, $f^{-1}((-\infty, \alpha])$. It is the union of balls centered at points $p \in P$ with radius $r_p(\alpha) = \sqrt{\alpha^2 - w(p)^2}$ for each $p$. The weighted Čech complex $C_\alpha(P, w)$ for parameter $\alpha$ is the union of simplices $s$ such that $\bigcap_{p \in s} B(p, r_p(\alpha)) \ne \emptyset$.
The weighted Rips complex $R_\alpha(P, w)$ for parameter $\alpha$ is the maximal complex whose 1-skeleton is the same as that of $C_\alpha(P, w)$. The corresponding weighted Rips filtration is denoted $\{R_\alpha(P, w)\}$.

Setting $w := d^K_{\mu_P}$ and given the point set $\hat{P}_+$ described in Section 3.2, we consider the weighted Rips filtration $\{R_\alpha(\hat{P}_+, d^K_{\mu_P})\}$ based on the kernel version of the power distance, $f^K_{\hat{P}_+}$. We can now state the following corollary of Theorem 3.3 (proofs detailed in Appendix D.4). Here, the persistence diagrams are viewed on a logarithmic scale; that is, we change coordinates of points following the mapping $(x, y) \mapsto (\ln x, \ln y)$. $d^{\ln}_B$ denotes the corresponding bottleneck distance between persistence diagrams.

Corollary 3.1. The weighted Rips filtration $\{R_\alpha(\hat{P}_+, d^K_{\mu_P})\}$ can be used to approximate the persistence diagram of $d^K_{\mu_P}$ such that $d^{\ln}_B(\mathrm{Dgm}(d^K_{\mu_P}), \mathrm{Dgm}(\{R_\alpha(\hat{P}_+, d^K_{\mu_P})\})) \le \ln(16)$.

Proof. To prove that two persistence diagrams are close, one can prove that their filtrations are interleaved [10]; that is, two filtrations $\{U_\alpha\}$ and $\{V_\alpha\}$ are $\varepsilon$-interleaved if for any $\alpha$, $U_\alpha \subset V_{\alpha+\varepsilon} \subset U_{\alpha+2\varepsilon}$.

First, Lemmas D.1 and D.13 prove that the persistence diagrams $\mathrm{Dgm}(d^K_{\mu_P})$ and $\mathrm{Dgm}(\{R_\alpha(\hat{P}_+, d^K_{\mu_P})\})$ are well-defined. Second, the results of Theorem 3.3 imply an 8 multiplicative interleaving. Therefore for any $\alpha \in \mathbb{R}$,

$$(d^K_{\mu_P})^{-1}((-\infty, \alpha]) \subset (f^K_{\hat{P}_+})^{-1}((-\infty, 8\alpha]) \subset (d^K_{\mu_P})^{-1}((-\infty, 8\sqrt{2}\,\alpha]).$$

On a logarithmic scale (by taking the natural log of both sides), such an interleaving becomes additive,

$$\ln d^K_{\mu_P} - \ln \sqrt{2} \le \ln f^K_{\hat{P}_+} \le \ln d^K_{\mu_P} + \ln 8.$$

Theorem 4 of [16] implies

$$d^{\ln}_B(\mathrm{Dgm}(d^K_{\mu_P}), \mathrm{Dgm}(f^K_{\hat{P}_+})) \le \ln(8).$$

In addition, by the Persistent Nerve Lemma ([1], Theorem 6 of [55], an extension of the Nerve Theorem [40]), the sublevel sets filtration of $d^K_\mu$, which corresponds to unions of balls of increasing radius, has the same persistent homology as the nerve filtration of these balls (which, by definition, is the Čech filtration). Finally, there exists a multiplicative interleaving between weighted Rips and Čech complexes (Proposition 31 of [16]), $C_\alpha \subset R_\alpha \subset C_{2\alpha}$. We then obtain the following bound on persistence diagrams,

$$d^{\ln}_B(\mathrm{Dgm}(f^K_{\hat{P}_+}), \mathrm{Dgm}(\{R_\alpha(\hat{P}_+, d^K_{\mu_P})\})) \le \ln 2.$$

We use the triangle inequality to obtain the final result:

$$d^{\ln}_B(\mathrm{Dgm}(d^K_{\mu_P}), \mathrm{Dgm}(\{R_\alpha(\hat{P}_+, d^K_{\mu_P})\})) \le \ln(16).$$

Based on Corollary 3.1, we have an algorithm that approximates the persistent homology of the sublevel set filtration of $d^K_\mu$ by constructing the weighted Rips filtration corresponding to the kernel-based power distance and computing its persistent homology. For memory-efficient computation, sparse (weighted) Rips filtrations could be adapted by considering simplices on subsamples at each scale [56, 16], although some restrictions on the space apply.

4 Stability Properties for $d^K_\mu$

Lemma 4.1 (K4). For two measures $\mu$ and $\nu$ on $\mathbb{R}^d$ we have $\|d^K_\mu - d^K_\nu\|_\infty \le D_K(\mu, \nu)$.

Proof. Since $D_K(\cdot, \cdot)$ is a metric, by the triangle inequality, for any $x \in \mathbb{R}^d$ we have $D_K(\mu, x) \le D_K(\mu, \nu) + D_K(\nu, x)$ and $D_K(\nu, x) \le D_K(\nu, \mu) + D_K(\mu, x)$. Therefore for any $x \in \mathbb{R}^d$ we have $|D_K(\mu, x) - D_K(\nu, x)| \le D_K(\mu, \nu)$, proving the claim.

Note that both the Wasserstein and kernel distance are integral probability metrics [49, 58], so both (D3) and (K4) are interesting, but not easily comparable. In the next subsection we attempt to reconcile this.

4.1 Comparing $D_K$ to $W_2$

Lemma 4.2. There is no Lipschitz constant $\gamma$ such that for any two probability measures $\mu$ and $\nu$ we have $W_2(\mu, \nu) \le \gamma D_K(\mu, \nu)$.

Proof.
Consider two measures $\mu$ and $\nu$ which are identical, except that in $\nu$ some positive fractional amount $\tau$ of the mass of $\mu$ is moved to a distance $n$ from the origin. The Wasserstein distance requires a transportation plan that moves this $\tau$ mass to some part of $\mu$, with cost $\tau \cdot \Omega(n)$ in $W_2(\mu, \nu)$. On the other hand, $D_K(\mu, \nu) = \sqrt{\kappa(\mu, \mu) + \kappa(\nu, \nu) - 2\kappa(\mu, \nu)} \le \sqrt{\sigma^2 + \sigma^2 - 0} = \sqrt{2}\sigma$ is bounded.

We conjecture for any two probability measures $\mu$ and $\nu$ that $D_K(\mu, \nu) \le W_2(\mu, \nu)$. This would show that $d^K_\mu$ is at least as stable as $d^{\mathrm{ccm}}_{\mu,m_0}$, since a bound on $W_2(\mu, \nu)$ would also bound $D_K(\mu, \nu)$, but not vice versa. Here we only show that this is true for a specialized case and as $\sigma \to \infty$. To simplify notation, all integrals are assumed to be over the full domain $\mathbb{R}^d$.

Lemma 4.3. Consider two probability measures $\mu$ and $\nu$ on $\mathbb{R}^d$ where $\nu$ is represented by a Dirac mass at a point $x \in \mathbb{R}^d$. Then $d^K_\mu(x) = D_K(\mu, \nu) \le W_2(\mu, \nu)$ for any $\sigma > 0$, where the equality only holds when $\mu$ is also a Dirac mass at $x$.
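As a quick numerical illustration of the statement of Lemma 4.3 (not a substitute for its proof), the sketch below compares $d^K_{\mu_P}(x)$ with $W_2(\mu_P, \delta_x)$ for an empirical measure. It uses the fact that when one measure is a Dirac mass at $x$ there is only one transport plan, so $W_2^2$ is the mean squared distance to $x$. The helper names and the randomly generated sample are assumptions made for this example.

```python
import math
import random

def kernel(p, x, sigma):
    """Gaussian kernel with the sigma^2 normalization used in this paper."""
    sq = sum((a - b) ** 2 for a, b in zip(p, x))
    return sigma ** 2 * math.exp(-sq / (2.0 * sigma ** 2))

def kappa(P, Q, sigma):
    """Average pairwise kernel similarity between point sets P and Q."""
    return sum(kernel(p, q, sigma) for p in P for q in Q) / (len(P) * len(Q))

def kernel_distance_to_point(P, x, sigma):
    """Kernel distance from the empirical measure on P to a Dirac mass at x."""
    sq = kappa(P, P, sigma) + sigma ** 2 - 2.0 * kappa(P, [x], sigma)
    return math.sqrt(max(sq, 0.0))  # clamp tiny negative rounding noise

def wasserstein2_to_dirac(P, x):
    """W_2 between the empirical measure on P and a Dirac mass at x.

    All mass must travel to x, so W_2^2 is the mean squared distance to x.
    """
    total = sum(sum((a - b) ** 2 for a, b in zip(p, x)) for p in P)
    return math.sqrt(total / len(P))

random.seed(0)
P = [(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)) for _ in range(30)]
x = (0.5, -0.25)
for sigma in (0.1, 1.0, 10.0):
    print(sigma, kernel_distance_to_point(P, x, sigma), wasserstein2_to_dirac(P, x))
```

For every bandwidth tried, the kernel distance stays below the Wasserstein distance, as the lemma states.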

Proof. Since both $W_2(\mu, \nu)$ and $D_K(\mu, \nu)$ are metrics and hence non-negative, we can instead consider their squared versions:

$$(W_2(\mu, \nu))^2 = \int_p \|p - x\|^2 \mu(p)\,dp$$

and

$$(D_K(\mu, \nu))^2 = K(x, x) + \int_p \int_q K(p, q)\mu(p)\mu(q)\,dq\,dp - 2\int_p K(p, x)\mu(p)\,dp$$
$$= \sigma^2 \left( 1 + \int_p \int_q \exp\left( -\frac{\|p - q\|^2}{2\sigma^2} \right) \mu(p)\mu(q)\,dq\,dp - 2\int_p \exp\left( -\frac{\|p - x\|^2}{2\sigma^2} \right) \mu(p)\,dp \right).$$

Now use the bound $1 - t \le e^{-t} \le 1$ for $t \ge 0$ to approximate

$$(D_K(\mu, \nu))^2 \le \sigma^2 \left( 1 + \int_p \int_q (1)\,\mu(q)\,dq\,\mu(p)\,dp - 2\int_p \left( 1 - \frac{\|p - x\|^2}{2\sigma^2} \right) \mu(p)\,dp \right) = \int_p \|p - x\|^2 \mu(p)\,dp = (W_2(\mu, \nu))^2.$$

The inequality becomes an equality only when $\|x - p\| = 0$ for all $p$ in the support of $\mu$, and since they are both metrics, this is the only location where they are both 0.

In Appendix C we show that even if $\nu$ is not a unit Dirac mass at $x$, this inequality holds in the limit as $\sigma$ goes to infinity. The technical work towards the next theorem is making precise how $\sigma^2 - K(p, x) \approx \|x - p\|^2 / 2$, how on measures $D_K(\mu, \nu)$ acts like the difference between the means $\bar{\mu} = \int_p p\,\mu(p)\,dp$ and $\bar{\nu}$, and how the difference becomes smaller as $\sigma$ goes to infinity.

Theorem 4.1. For any two probability measures $\mu$ and $\nu$ defined on $\mathbb{R}^d$, $\lim_{\sigma \to \infty} D_K(\mu, \nu) = \|\bar{\mu} - \bar{\nu}\|$ and $\|\bar{\mu} - \bar{\nu}\| \le W_2(\mu, \nu)$. Thus $\lim_{\sigma \to \infty} D_K(\mu, \nu) \le W_2(\mu, \nu)$.

4.2 Stability with Respect to $\sigma$

We now explore the Lipschitz properties of $d^K_\mu$ with respect to the noise parameter $\sigma$. We argue that any distance function that is robust to noise needs some sort of parameter to address how many outliers to ignore or how far away a point should be to be considered an outlier. For instance, this parameter in $d^{\mathrm{ccm}}_{\mu,m_0}$ is $m_0$, which controls the amount of measure $\mu$ to be used in the distance. Here we show that $d^K_\mu$ has a particularly nice property: it is Lipschitz with respect to this choice of $\sigma$ for any fixed $x$. Many details are deferred to Appendix B.

The larger $\sigma$, the more effect outliers have; the smaller $\sigma$, the less the data is smoothed and thus the closer the noise needs to be to the underlying object to affect the inference.
The effect of changing $\sigma$ is a well-studied phenomenon [6, 57, 31] and there is a suite of accepted heuristics for choosing the best value [60], although it is not clear they are intended for problems in geometric inference.

Lemma 4.4. Let $h(\sigma, z) = \exp(-z^2/(2\sigma^2))$. We can bound $h(\sigma, z) \le 1$, $\frac{d}{d\sigma} h(\sigma, z) \le (2/e)/\sigma$ and $\frac{d^2}{d\sigma^2} h(\sigma, z) \le (18/e^3)/\sigma^2$ over any choice of $z > 0$.

Theorem 4.2. For any $x$, $d^K_\mu(x)$ is $\ell$-Lipschitz with respect to $\sigma$, for $\ell = 18/e^3 + 8/e + 2 < 6$.

Proof. (Sketch) Define an indicator function $1_x(p) = 1$ iff $p = x$, and otherwise 0. It is now useful to define a function $f_x(\sigma)$ as

$$f_x(\sigma) = \int_p \int_q \exp\left( -\frac{\|p - q\|^2}{2\sigma^2} \right) \left( \mu(p)\mu(q) + 1_x(p) 1_x(q) - 2\mu(p) 1_x(q) \right) dq\,dp.$$

So $(d^K_\mu(x))^2 = \sigma^2 f_x(\sigma)$, and we can also define $F(\sigma) = (d^K_\mu(x))^2 - \ell\sigma^2 = \sigma^2 f_x(\sigma) - \ell\sigma^2$. Now to prove $d^K_\mu(x)$ is $\ell$-Lipschitz, we can show that $(d^K_\mu)^2$ is $\ell$-semiconcave with respect to $\sigma$, and apply Lemma 2.2. This boils down to showing that the second derivative of $F(\sigma)$ is always non-positive.

$$\frac{d^2}{d\sigma^2} F(\sigma) = \sigma^2 \frac{d^2}{d\sigma^2} f_x(\sigma) + 4\sigma \frac{d}{d\sigma} f_x(\sigma) + 2 f_x(\sigma) - 2\ell.$$

First we note that $\int_p \int_q (\mu(p)\mu(q) + 2\mu(p) 1_x(q) - 1_x(p) 1_x(q))\,dq\,dp \le 2$, since both $\mu$ and $1_x$ integrate to 1. Combining this with Lemma 4.4 using $h(\sigma, \|p - q\|)$, we can now bound each term as follows.

$$\frac{d^2}{d\sigma^2} F(\sigma) \le 2\left( \sigma^2 \frac{18}{e^3 \sigma^2} + 4\sigma \frac{2}{e\sigma} + 2(1) \right) - 2\ell = 2\left( \frac{18}{e^3} + \frac{8}{e} + 2 \right) - 2\ell = 0.$$

Lipschitz in $m_0$ for $d^{\mathrm{ccm}}_{\mu,m_0}$. A Lipschitz property is not known for $d^{\mathrm{ccm}}_{\mu,m_0}$ with respect to $m_0$. Consider a $\mu_P$ for a point set $P \subset \mathbb{R}^d$ where a $\tau$ fraction of the points in $P$ are close to $x \in \mathbb{R}^d$, and the rest are far from $x$. When $m_0$ is above $\tau$, some mass is needed from the far part of $P$, causing a rapid change in value with varying $m_0$. Thus the Lipschitz property with respect to $m_0$ is related to the diameter of $P$, $\max_{p, p' \in P} \|p - p'\|$.

5 Algorithmic and Approximation Observations

Kernel coresets. The kernel distance is robust under random samples [43]. Specifically, if $Q$ is a point set randomly chosen from $\mathbb{R}^d$ proportional to $\mu$, of size $O((1/\varepsilon^2)(d + \log(1/\delta)))$, then $\|d^K_\mu - d^K_Q\|_\infty \le \varepsilon$ with probability at least $1 - \delta$. Thus if $\varepsilon \le R/(5 + 4/\alpha^2)$, then with Theorem 3.2 any sublevel set $(d^K_\mu)^\eta$ is homotopy equivalent to $(d^K_Q)^r$ for $r \in [4\varepsilon/\alpha^2, R - 3\varepsilon]$ and $\eta \in (0, R)$. Furthermore, it is possible to construct subsets $Q$ with the same properties but with even smaller size $|Q| = O(((1/\varepsilon)\sqrt{\log(1/\varepsilon\delta)})^{2d/(d+2)})$ [5]; in particular in $\mathbb{R}^2$ the required size is $|Q| = O((1/\varepsilon)\sqrt{\log(1/\varepsilon\delta)})$. The above three size bounds also ensure that $\|\mathrm{KDE}_\mu - \mathrm{KDE}_Q\|_\infty \le \varepsilon$, so $Q$ is an $\varepsilon$-kernel sample (see Lemma B.).

Exploiting the above constructions, recent work [65] builds a data structure to allow for efficient approximate evaluations of $\mathrm{KDE}_P$ where $|P| = 100{,}000{,}000$. This opens the door for reconstruction-based geometric inference, robust to noise, on enormous datasets.

Stability of persistence diagrams.
Furthermore, the stability results on persistence diagrams [23] hold for kernel density estimates of $\mu$ and $Q$ (where $Q$ is a coreset of $\mu$ with the same size bounds as above); that is, $d_B(\mathrm{Dgm}(KDE_\mu), \mathrm{Dgm}(KDE_Q)) \le \varepsilon$, where $d_B$ is the bottleneck distance between persistence diagrams. Similar properties are observed in [4], but with weaker and more complicated bounds on $Q$.

Geometric inference via superlevel sets of kernel density estimates. A convenient property of $d^K_\mu(\cdot)$ is that it is monotonic with $KDE_\mu(\cdot)$. That is, as $d^K_\mu(x)$ gets smaller, $KDE_\mu(x)$ gets larger. Thus instead of doing geometric inference using sublevel sets of the kernel distance function $d^K_\mu$, we can just examine the superlevel sets of $KDE_\mu$. This provides a very clean and natural interpretation of the reconstruction problem through the well-studied lens of kernel density estimates. Note that the thresholds of a sublevel set of $d^K_\mu$ and of a superlevel set of $KDE_\mu$ that represent the same region of $\mathbb{R}^d$ are not linearly related. Recall $(d^K_\mu(x))^2 = (c_\mu)^2 - 2\,KDE_\mu(x)$, where $(c_\mu)^2 = \sigma^2 + \kappa(\mu, \mu)$ is a constant that depends only on $\mu$.
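The relation between the kernel distance and the kernel density estimate recalled above can be checked numerically. The following is a minimal sketch, assuming the Gaussian kernel is normalized as $K(p, x) = \sigma^2 \exp(-\|p - x\|^2/2\sigma^2)$ (so that $K(x, x) = \sigma^2$) and that $\mu$ is the uniform measure on a finite point set; the function names and the random test data are hypothetical and chosen only for illustration.

```python
import math
import random


def gaussian_kernel(p, x, sigma):
    """Assumed normalization: K(p, x) = sigma^2 * exp(-||p - x||^2 / (2 sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(p, x))
    return sigma ** 2 * math.exp(-sq / (2.0 * sigma ** 2))


def kde(points, x, sigma):
    """KDE_P(x): average kernel value between x and the points of P."""
    return sum(gaussian_kernel(p, x, sigma) for p in points) / len(points)


def kappa_self(points, sigma):
    """kappa(P, P): average kernel value over all ordered pairs of P."""
    n = len(points)
    return sum(gaussian_kernel(p, q, sigma) for p in points for q in points) / (n * n)


def kernel_distance(points, x, sigma, kpp):
    """d^K_P(x) = sqrt(kappa(P, P) + K(x, x) - 2 KDE_P(x))."""
    sq = kpp + sigma ** 2 - 2.0 * kde(points, x, sigma)
    return math.sqrt(max(sq, 0.0))  # clamp tiny negative round-off


random.seed(0)
P = [(random.random(), random.random()) for _ in range(200)]  # illustrative data
sigma = 0.1
kpp = kappa_self(P, sigma)
c_sq = sigma ** 2 + kpp  # (c_P)^2 depends only on P and sigma, not on x

queries = [(random.random(), random.random()) for _ in range(50)]
# Sorting by decreasing KDE and by increasing kernel distance gives the same order.
by_kde = sorted(queries, key=lambda x: kde(P, x, sigma), reverse=True)
by_dist = sorted(queries, key=lambda x: kernel_distance(P, x, sigma, kpp))
```

Because the two orderings agree, a superlevel set of the estimate at threshold $t$ coincides with the sublevel set of the kernel distance at threshold $\sqrt{(c_P)^2 - 2t}$, which is the nonlinear correspondence noted in the text.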

Acknowledgement

The authors thank Don Sheehy, Frédéric Chazal and the rest of the Geometrica group at INRIA-Saclay for enlightening discussions on geometric and topological reconstruction. We also thank Don Sheehy for personal communications regarding the power distance constructions.

References

[1] P. K. Agarwal, S. Har-Peled, H. Kaplan, and M. Sharir. Union of random Minkowski sums and network vulnerability analysis. In Proceedings of 29th Symposium on Computational Geometry, 2013.
[2] N. Amenta and M. Bern. Surface reconstruction by Voronoi filtering. Discrete and Computational Geometry, 22(4).
[3] N. Aronszajn. Theory of reproducing kernels. Transactions of the American Mathematical Society, 68.
[4] S. Balakrishnan, B. T. Fasy, F. Lecci, A. Rinaldo, A. Singh, and L. Wasserman. Statistical inference for persistent homology. Technical report, arXiv, March 2013.
[5] M. Buchet, F. Chazal, S. Y. Oudot, and D. R. Sheehy. Efficient and robust topological data analysis on metric spaces. Technical report, arXiv, 2013.
[6] M. Buchet, F. Chazal, S. Y. Oudot, and D. R. Sheehy. Efficient and robust topological data analysis on metric spaces. arXiv, 2013.
[7] G. Carlsson, G. Singh, and A. Zomorodian. Computing multidimensional persistence. Algorithms and Computation: Lecture Notes in Computer Science, 5878, 2009.
[8] G. Carlsson and A. Zomorodian. The theory of multidimensional persistence. Proc. 23rd Ann. Symp. Computational Geometry, 2007.
[9] F. Chazal and D. Cohen-Steiner. Geometric inference. Tessellations in the Sciences, 2012.
[10] F. Chazal, D. Cohen-Steiner, M. Glisse, L. J. Guibas, and S. Y. Oudot. Proximity of persistence modules and their diagrams. In Proceedings 25th Annual Symposium on Computational Geometry, pages 237-246, 2009.
[11] F. Chazal, D. Cohen-Steiner, and A. Lieutier. Normal cone approximation and offset shape isotopy. Computational Geometry: Theory and Applications, 42, 2009.
[12] F. Chazal, D. Cohen-Steiner, and A. Lieutier. A sampling theory for compact sets in Euclidean space. Discrete & Computational Geometry, 41(3), 2009.
[13] F. Chazal, D. Cohen-Steiner, A. Lieutier, and B. Thibert. Stability of curvature measures. Computer Graphics Forum (Proceedings Symposium on Geometry Processing), 2009.
[14] F. Chazal, D. Cohen-Steiner, and Q. Mérigot. Boundary measures for geometric inference. Foundations of Computational Mathematics, 10(2):221-240, 2010.
[15] F. Chazal, D. Cohen-Steiner, and Q. Mérigot. Geometric inference for probability measures. Foundations of Computational Mathematics, 11(6).

[16] F. Chazal, V. de Silva, M. Glisse, and S. Oudot. The structure and stability of persistence modules. arXiv, 2013.
[17] F. Chazal and A. Lieutier. Weak feature size and persistent homology: computing homology of solids in R^n from noisy data samples. Proceedings 21st Annual Symposium on Computational Geometry, pages 255-262, 2005.
[18] F. Chazal and A. Lieutier. Topology guaranteeing manifold reconstruction using distance function to noisy data. Proceedings 22nd Annual Symposium on Computational Geometry, 2006.
[19] F. Chazal and A. Lieutier. Stability and computation of topological invariants of solids in R^n. Discrete & Computational Geometry, 37(4), 2007.
[20] F. Chazal and A. Lieutier. Smooth manifold reconstruction from noisy and non-uniform approximation with guarantees. Computational Geometry: Theory and Applications, 40(2), 2008.
[21] F. Chazal and S. Oudot. Towards persistence-based reconstruction in Euclidean spaces. Proceedings 24th Annual Symposium on Computational Geometry, pages 232-241, 2008.
[22] S. Cohen. Finding color and shape patterns in images. Technical report, Stanford University: CS-TR.
[23] D. Cohen-Steiner, H. Edelsbrunner, and J. Harer. Stability of persistence diagrams. Discrete and Computational Geometry, 37:103-120, 2007.
[24] H. Daumé. From zero to reproducing kernel Hilbert spaces in twelve pages or less. http://pub.hal3.name/daume04rkhs.ps, 2004.
[25] L. Devroye and L. Györfi. Nonparametric Density Estimation: The L1 View. Wiley.
[26] L. Devroye and G. Lugosi. Combinatorial Methods in Density Estimation. Springer-Verlag, 2001.
[27] T. K. Dey and S. Goswami. Provable surface reconstruction from noisy samples. Computational Geometry, 35(1):124-141, 2006.
[28] S. Durrleman, X. Pennec, A. Trouvé, and N. Ayache. Measuring brain variability via sulcal lines registration: A diffeomorphic approach. In 10th International Conference on Medical Image Computing and Computer Assisted Intervention, 2007.
[29] S. Durrleman, X. Pennec, A. Trouvé, and N. Ayache. Sparse approximation of currents for statistics on curves and surfaces. In 11th International Conference on Medical Image Computing and Computer Assisted Intervention, 2008.
[30] H. Edelsbrunner. The union of balls and its dual shape. Proceedings 9th Annual Symposium on Computational Geometry, pages 218-231.
[31] H. Edelsbrunner, B. T. Fasy, and G. Rote. Add isotropic Gaussian kernels at own risk: More and more resilient modes in higher dimensions. Proceedings 28th Annual Symposium on Computational Geometry, 2012.
[32] H. Edelsbrunner and J. Harer. Persistent homology - a survey. Contemporary Mathematics, 453:257-282.

[33] H. Edelsbrunner and J. Harer. Computational Topology: An Introduction. American Mathematical Society, Providence, RI, USA, 2010.
[34] H. Edelsbrunner, D. Letscher, and A. J. Zomorodian. Topological persistence and simplification. Discrete and Computational Geometry, 28, 2002.
[35] H. Federer. Curvature measures. Transactions of the American Mathematical Society, 93.
[36] J. Glaunès. Transport par difféomorphismes de points, de mesures et de courants pour la comparaison de formes et l'anatomie numérique. PhD thesis, Université Paris 13, 2005.
[37] J. Glaunès and S. Joshi. Template estimation form unlabeled point set data and surfaces for computational anatomy. In International Workshop on Mathematical Foundations of Computational Anatomy, 2006.
[38] K. Grove. Critical point theory for distance functions. Proceedings of Symposia in Pure Mathematics, 54.
[39] L. Guibas, Q. Mérigot, and D. Morozov. Witnessed k-distance. Proceedings 27th Annual Symposium on Computational Geometry, pages 57-64, 2011.
[40] A. Hatcher. Algebraic Topology. Cambridge University Press, 2002.
[41] M. Hein and O. Bousquet. Hilbertian metrics and positive definite kernels on probability measures. In Proceedings 10th International Workshop on Artificial Intelligence and Statistics, 2005.
[42] M. I. Jordan. Reproducing kernel Hilbert spaces. http:// jordan/courses/281b-spring04/lectures/rkhs.ps.
[43] S. Joshi, R. V. Kommaraju, J. M. Phillips, and S. Venkatasubramanian. Comparing distributions and shapes using the kernel distance. Proceedings 27th Annual Symposium on Computational Geometry, 2011.
[44] A. N. Kolmogorov, S. V. Fomin, and R. A. Silverman. Introductory Real Analysis. Dover Publications.
[45] J. M. Lee. Introduction to Topological Manifolds. Springer-Verlag, 2000.
[46] J. M. Lee. Introduction to Smooth Manifolds. Springer, 2003.
[47] A. Lieutier. Any open bounded subset of R^n has the same homotopy type as its medial axis. Computer-Aided Design, 36, 2004.
[48] Q. Mérigot, M. Ovsjanikov, and L. Guibas. Robust Voronoi-based curvature and feature estimation. Proceedings SIAM/ACM Joint Conference on Geometric and Physical Modeling, pages 1-12, 2009.
[49] A. Müller. Integral probability metrics and their generating classes of functions. Advances in Applied Probability, 29(2):429-443.
[50] P. Niyogi, S. Smale, and S. Weinberger. Finding the homology of submanifolds with high confidence from random samples. Discrete and Computational Geometry, 39(1), 2008.
[51] P. Niyogi, S. Smale, and S. Weinberger. A topological view of unsupervised learning from noisy data. Manuscript.

[52] J. M. Phillips. ε-samples for kernels. Proceedings 24th Annual ACM-SIAM Symposium on Discrete Algorithms (to appear), 2013.
[53] J. M. Phillips and S. Venkatasubramanian. A gentle introduction to the kernel distance. arXiv, March 2011.
[54] D. W. Scott. Multivariate Density Estimation: Theory, Practice, and Visualization. Wiley, 1992.
[55] D. R. Sheehy. A multicover nerve for geometric inference. Canadian Conference in Computational Geometry, 2012.
[56] D. R. Sheehy. Linear-size approximations to the Vietoris-Rips filtration. Discrete & Computational Geometry, 49, 2013.
[57] B. W. Silverman. Density Estimation for Statistics and Data Analysis. Chapman & Hall/CRC.
[58] B. K. Sriperumbudur, A. Gretton, K. Fukumizu, B. Schölkopf, and G. R. G. Lanckriet. Hilbert space embeddings and metrics on probability measures. Journal of Machine Learning Research, 11, 2010.
[59] C. Suquet. Distances Euclidiennes sur les mesures signées et application à des théorèmes de Berry-Esséen. Bulletin Belgium Mathematics Society, 2.
[60] B. A. Turlach. Bandwidth selection in kernel density estimation: A review. Discussion paper 9317, Institut de Statistique, UCL, Louvain-la-Neuve, Belgium.
[61] M. Vaillant and J. Glaunès. Surface matching via currents. In Proceedings Information Processing in Medical Imaging, volume 19, pages 381-392, 2005.
[62] C. Villani. Topics in Optimal Transportation. American Mathematical Society, 2003.
[63] G. Wahba. Support vector machines, reproducing kernel Hilbert spaces, and randomization GACV. In Advances in Kernel Methods: Support Vector Learning. Bernhard Schölkopf, Alexander J. Smola, Christopher J. C. Burges, and Rosanna Soentpiet.
[64] C. Yang, R. Duraiswami, N. Gumerov, and L. Davis. Improved fast Gauss transform and efficient kernel density estimation. In Proceedings 9th International Conference on Computer Vision, 2003.
[65] Y. Zheng, J. Jestes, J. M. Phillips, and F. Li. Quality and efficiency in kernel density estimates for large data. In Proceedings ACM Conference on the Management of Data (SIGMOD).

A Experiments

We consider measures $\mu_P$ defined by a point set $P \subset \mathbb{R}^2$. To experimentally visualize the structures of the superlevel sets of kernel density estimates, or equivalently sublevel sets of the kernel distance, we do the simplest thing and just evaluate $d^K_{\mu_P}$ at every grid point on a sufficiently dense grid.
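The dense-grid evaluation just described can be sketched as follows. This is an illustration under stated assumptions, not the authors' experimental code: it assumes the Gaussian kernel $K(p, x) = \sigma^2 \exp(-\|p - x\|^2/2\sigma^2)$, a uniform grid on $[0, 1]^2$, and a brute-force evaluation; `kernel_distance_grid`, `sublevel_mask`, and the toy point set are hypothetical names and data.

```python
import math
import random


def kernel_distance_grid(points, sigma, resolution):
    """Evaluate d^K_P at every node of a resolution x resolution grid on [0, 1]^2."""
    n = len(points)

    def k(p, q):
        return sigma ** 2 * math.exp(-math.dist(p, q) ** 2 / (2.0 * sigma ** 2))

    kpp = sum(k(p, q) for p in points for q in points) / (n * n)  # kappa(P, P)
    grid = []
    for i in range(resolution):          # row index -> y coordinate
        row = []
        for j in range(resolution):      # column index -> x coordinate
            g = (j / (resolution - 1), i / (resolution - 1))
            kde = sum(k(p, g) for p in points) / n
            row.append(math.sqrt(max(kpp + sigma ** 2 - 2.0 * kde, 0.0)))
        grid.append(row)
    return grid


def sublevel_mask(grid, gamma):
    """Boolean mask of grid nodes whose kernel distance is at most gamma."""
    return [[value <= gamma for value in row] for row in grid]


rng = random.Random(0)
# Toy data (an assumption): a single noisy cluster around the center of the square.
P = [(rng.gauss(0.5, 0.05), rng.gauss(0.5, 0.05)) for _ in range(100)]
sigma = 0.1
grid = kernel_distance_grid(P, sigma, resolution=21)
values = [v for row in grid for v in row]
gamma = 0.5 * (min(values) + max(values))  # an arbitrary mid-range isolevel
mask = sublevel_mask(grid, gamma)
```

The brute-force loop costs one kernel evaluation per point and grid node, so it is only practical for small inputs; the coreset and pruning ideas discussed in the paper are what make larger instances feasible.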

Figure 1: Sublevel sets for the kernel distance while varying the isolevel γ, for fixed σ (left) and for fixed isolevel γ but variable σ (right), with Gaussian kernel. The variable values of σ and γ are chosen to make the plots similar.

Grid approximation. Due to the 1-Lipschitz property of the kernel distance, well-chosen grid points have several nice properties. We consider the functions up to some resolution parameter $\varepsilon > 0$, consistent with the parameter used to create a coreset approximation $Q$. Now specifically, consider an axis-aligned grid $G_{\varepsilon,d}$ with edge length $\varepsilon/\sqrt{d}$ so no point $x \in \mathbb{R}^d$ is further than $\varepsilon$ from some grid point $g \in G_{\varepsilon,d}$. Since $K(x, y) \le \varepsilon$ when $\|x - y\| \ge \sigma\sqrt{2\ln(\sigma^2/\varepsilon)} = \delta_{\varepsilon,\sigma}$, we only need to consider grid points $g \in G_{\varepsilon,d}$ which are within $\delta_{\varepsilon,\sigma}$ of some point $p \in P$ (or $q \in Q$, of coreset $Q$ of $P$) [43, 65]. This is at most $(\sqrt{d}/\varepsilon)^d (\delta_{\varepsilon,\sigma})^d = O((\sigma \log(\varepsilon/d)/\varepsilon)^d)$ grid points total for $d$ a fixed constant.

Furthermore, due to the Lipschitz property of $d^K_P$, when considering a specific level set at $r$, a point $x$ such that $d^K_P(x) \le r - \varepsilon$ is no further than $\varepsilon$ from some $g \in G_{\varepsilon,d}$ such that $d^K_P(g) \le r$, and every ball $B_\varepsilon(x)$ of radius $\varepsilon$ centered at some point $x \in \mathbb{R}^d$ such that all $y \in B_\varepsilon(x)$ have $d^K_P(y) \le r$ has some representative point $g \in G_{\varepsilon,d}$ such that $g \in B_\varepsilon(x)$, and hence $d^K_P(g) \le r$. Thus deep regions and spatially thick features are preserved; however, thin passageways or layers that are near the threshold $r$, even if they do not correspond to a critical point, may erroneously become disconnected, causing phantom components or other topological features.

Varying parameter r or σ. We demonstrate the geometric inference on a synthetic dataset in $[0, 1]^2$ where 900 points are chosen near a circle centered at (0.5, 0.5) with radius 0.5 or along a line segment from (0, 0) to (1, 1). Each point has Gaussian noise added with standard deviation […]. The remaining 1100 points are chosen uniformly from $[0, 1]^2$.
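The pruned grid and a synthetic point set of the kind described above can be sketched together. This is a hedged illustration: the helper names are hypothetical, the kernel is assumed to be $K(x, y) = \sigma^2 \exp(-\|x - y\|^2/2\sigma^2)$, and the sample sizes, circle radius, noise level, and the even split between circle and segment are assumptions made for a small demo (the source does not state the noise level or the split).

```python
import itertools
import math
import random


def delta_radius(eps, sigma):
    """Distance at which K(x, y) = sigma^2 exp(-r^2 / (2 sigma^2)) has dropped to eps."""
    return sigma * math.sqrt(2.0 * math.log(sigma ** 2 / eps))


def pruned_grid(points, eps, sigma):
    """Nodes of the grid with edge eps/sqrt(d) that lie within delta_radius of a point."""
    d = len(points[0])
    edge = eps / math.sqrt(d)
    delta = delta_radius(eps, sigma)
    reach = int(math.ceil(delta / edge))  # index offsets that can still be in range
    kept = set()
    for p in points:
        base = [int(round(c / edge)) for c in p]  # grid node nearest to p
        for offset in itertools.product(range(-reach, reach + 1), repeat=d):
            idx = tuple(b + o for b, o in zip(base, offset))
            if math.dist([i * edge for i in idx], p) <= delta:
                kept.add(idx)
    return [tuple(i * edge for i in idx) for idx in sorted(kept)]


def synthetic_points(n_shape, n_noise, radius, noise_std, rng):
    """Noisy samples near a circle or a diagonal segment, plus uniform background noise."""
    pts = []
    for i in range(n_shape):
        if i % 2 == 0:  # assumed split: every other sample goes on the circle
            a = rng.uniform(0.0, 2.0 * math.pi)
            x, y = 0.5 + radius * math.cos(a), 0.5 + radius * math.sin(a)
        else:           # the rest lie on the segment from (0, 0) to (1, 1)
            t = rng.random()
            x, y = t, t
        pts.append((x + rng.gauss(0.0, noise_std), y + rng.gauss(0.0, noise_std)))
    pts.extend((rng.random(), rng.random()) for _ in range(n_noise))
    return pts


rng = random.Random(0)
# Scaled-down demo with assumed values, not the 2000-point dataset from the text.
P = synthetic_points(n_shape=18, n_noise=22, radius=0.25, noise_std=0.01, rng=rng)
eps, sigma = 0.02, 0.2  # requires sigma^2 > eps so that the logarithm is positive
delta = delta_radius(eps, sigma)
G = pruned_grid(P, eps, sigma)
```

Storing integer grid indices in a set avoids duplicate nodes when the neighborhoods of nearby points overlap, which is the common case for dense inputs.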
We use a Gaussian kernel with σ = […]. Figure 1 shows (left) various sublevel sets γ ∈ Γ for the kernel distance at a fixed σ = 0.05 and (right) various superlevel sets for a fixed γ = […], but various values of σ ∈ Σ, where Γ = […] and Σ = [0.0485, […], 0.049, […], 0.05]. This choice of Γ and Σ was made to highlight how similar the isolevels can be.

Non-uniform densities. Another advantage of techniques based on some sort of distance to a measure (as opposed to offsets from the points) is that they more naturally handle input data sets where different regions have different densities of points. In Figure 2 we consider a data set with 2000 points in $[0, 1]^2$ generated from 2 circles, where 200 points are chosen near a circle centered at (0.08, 0.08) with radius 0.04 and 1600 are chosen near a circle centered at (0.9, 0.9) with radius […]. Each point in the top circle has Gaussian noise added with standard deviation 0.01 and the bottom circle has Gaussian noise added with standard deviation […]. The remaining 200 points are chosen uniformly from $[0, 1]^2$. Figure 2(a) shows the sublevel sets of $d^K_\mu$ with a fixed γ = […] and varying σ ∈ Σ, where Σ = [[…], 0.014, […], […], 0.015]. Each circle is recovered at some level. Using distance to a measure (see Figure 2(b)), we tried to tune the parameters so it looked similar to Figure 2(a). Specifically we fixed γ = […] and used values $m_0 \in M_0$, where $M_0$ = [0.05, 0.04, 0.1, 0.2, 0.5].

Figure 2: Sublevel sets of kernel distance (a), and distance to a measure (b) on a data set with two circles with different scales and densities. These plots fix a level set γ and vary (a) σ for $d^K_\mu$ and (b) $m_0$ for $d^{ccm}$. The parameters are chosen to provide similar appearing images.

Figure 3: Alternate kernel density estimates for the same dataset as Figure 1. From left to right, they use the Laplace, triangle, Epanechnikov, and the ball kernel.
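For comparison with the kernel distance, the distance to a measure used in Figure 2(b) can be sketched for the uniform measure on a finite point set, where it reduces to a root mean squared distance to the nearest neighbors carrying total mass $m_0$. This sketch follows that standard definition from Chazal, Cohen-Steiner, and Mérigot [15]; the function name and the one-dimensional toy layout are hypothetical.

```python
import math


def distance_to_measure(points, x, m0):
    """Distance to the uniform measure on `points` with mass parameter m0 in (0, 1].

    Averages the squared distances to the nearest neighbors of x until a total
    mass of m0 is collected; the last neighbor may contribute a fractional share.
    """
    if not 0.0 < m0 <= 1.0:
        raise ValueError("m0 must lie in (0, 1]")
    n = len(points)
    sq_dists = sorted(math.dist(p, x) ** 2 for p in points)
    mass = m0 * n                 # mass measured in units of one point
    full = int(math.floor(mass))  # neighbors that are used entirely
    total = sum(sq_dists[:full])
    frac = mass - full
    if frac > 0.0 and full < n:
        total += frac * sq_dists[full]
    return math.sqrt(total / mass)


# Toy layout (an assumption): half of the points near x, the other half far away.
P = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (10.0, 0.0)]
x = (0.0, 0.0)
near_only = distance_to_measure(P, x, 0.5)   # uses the two closest points
with_far = distance_to_measure(P, x, 0.75)   # must also use the point at distance 3
```

In this toy layout the value more than doubles once $m_0$ exceeds the fraction of points near $x$, which illustrates why the dependence on $m_0$ is tied to how far apart the points of $P$ can be.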

Alternative kernels. We can choose kernels other than the Gaussian kernel in the kernel density estimate, for instance the Laplace kernel $K(p, x) = \exp(-\|x - p\|/\sigma)$, the triangle kernel $K(p, x) = \max\{0, 1 - \|x - p\|/\sigma\}$, the Epanechnikov kernel $K(p, x) = \max\{0, 1 - \|x - p\|^2/\sigma^2\}$, or the ball kernel ($K(p, x) = 1$ if $\|p - x\| \le \sigma$, and 0 otherwise). Figure 3 chooses parameters to make them comparable to Figure 1 (left). Of these only the Laplace kernel is characteristic [58], making the corresponding version of the kernel distance a metric. Investigating which of the above reconstruction theorems hold when using the Laplace or other kernels is an interesting question for future work.

Additionally, normal vector information and even k-forms can be used in the definition of a kernel [36, 61, 29, 28, 37, 43]; this variant is known as the current distance. In some cases it retains its metric properties and has been shown to be very useful for shape alignment in conjunction with medical imaging.

A.1 Discussion: Noise and Scale

Much of geometric and topological reconstruction grew out of the desire to understand shapes at various scales. A common mechanism is offset based; e.g., α-shapes [30] represent the scale of a shape with the α parameter controlling the offsets of a point cloud. There are two parameters with the kernel distance: $r$ controls the offset through the sublevel set of the function, and $\sigma$ controls the noise. We argue that any function which is robust to noise must have a parameter that controls the noise (e.g. $\sigma$ for $d^K_\mu$ and $m_0$ for $d^{ccm}$). Here $\sigma$ clearly defines some sense of scale in the setting of density estimation [57] and has a geometrical interpretation, while $m_0$ represents a fraction of the measure and is hard to interpret geometrically, as illustrated by the lack of a Lipschitz property for $d^{ccm}$ with respect to $m_0$. There are several experiments above, in Appendix A, from which several insights can be drawn.
One observation is that even though there are two parameters $r$ and $\sigma$ that control the scale, the interesting values typically have $r$ very close to $\sigma$. Thus, we recommend first setting $\sigma$ to control the scale at which the data is studied, and then exploring the effect of varying $r$ for values near $\sigma$. Moreover, not much structure seems to be missed by not exploring the space of both parameters; Figure 1 shows that fixing one (of $r$ and $\sigma$) and varying the other can provide very similar superlevel sets. However, it is possible instead to fix $r$ and explore the persistent topological features in the data [34, 32] (those less affected by smoothing) by varying $\sigma$. On the other hand, it remains a challenging problem to study two-parameter persistent homology [8, 7] under the setting of kernel distance (or kernel density estimate).

B Lipschitz Property of $d^K_\mu$

In this section we formalize several Lipschitz properties of the kernel distance. That is, as some parameter changes, how quickly can $d^K_\mu$ change?

B.1 Relating Semiconcavity to Lipschitz Properties

We prove a generalization of a folklore result relating $\ell$-semiconcavity of functions and $\ell$-Lipschitz functions.

Lemma 2.2. Consider a function $g(x)$ such that $(g(x))^2$ is $\ell$-semiconcave, for $\ell \ge 1$; then $g(x)$ must be $\ell$-Lipschitz.
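As a numerical companion to the question posed in this appendix, the following sketch probes how quickly the kernel distance changes as $\sigma$ varies and compares the observed slope with the constant $\ell = 18/e^3 + 8/e + 2$ from Theorem 4.2. It is a finite-difference spot check on one random instance, not a proof; the kernel normalization $K(p, x) = \sigma^2 \exp(-\|p - x\|^2/2\sigma^2)$, the function names, and the test data are assumptions.

```python
import math
import random


def kernel_distance(points, x, sigma):
    """d^K_P(x) for the uniform measure on `points` under the assumed Gaussian kernel."""
    n = len(points)

    def k(p, q):
        return sigma ** 2 * math.exp(-math.dist(p, q) ** 2 / (2.0 * sigma ** 2))

    kpp = sum(k(p, q) for p in points for q in points) / (n * n)
    kpx = sum(k(p, x) for p in points) / n
    return math.sqrt(max(kpp + sigma ** 2 - 2.0 * kpx, 0.0))


def max_slope_in_sigma(points, x, sigmas):
    """Largest finite-difference slope of sigma -> d^K_P(x) over consecutive sigmas."""
    values = [kernel_distance(points, x, s) for s in sigmas]
    return max(
        abs(values[i + 1] - values[i]) / (sigmas[i + 1] - sigmas[i])
        for i in range(len(sigmas) - 1)
    )


ELL = 18.0 / math.e ** 3 + 8.0 / math.e + 2.0  # Lipschitz constant from Theorem 4.2

rng = random.Random(1)
P = [(rng.random(), rng.random()) for _ in range(40)]  # illustrative data
sigmas = [0.01 * (i + 1) for i in range(100)]           # sigma from 0.01 to 1.0
slope = max_slope_in_sigma(P, (0.5, 0.5), sigmas)
```

A single instance cannot certify the bound, but a slope above `ELL` on any input would contradict the theorem, so the check is a cheap regression test for an implementation of the kernel distance.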

