SEPARABILITY AND COMPLETENESS FOR THE WASSERSTEIN DISTANCE

FRANÇOIS BOLLEY

Abstract. In this note we prove in an elementary way that the Wasserstein distances, which play a basic role in optimal transportation problems, turn some spaces of probability measures into separable complete metric spaces.

Introduction

Let $(X, d)$ be a separable complete metric space and $P(X)$ be the set of Borel probability measures on $X$. Given $p > 0$, the $\mathbb{R} \cup \{+\infty\}$-valued map $W_p$ defined on $P(X) \times P(X)$ by

$$W_p(\mu, \nu) = \inf_{\pi} \Big( \iint_{X \times X} d(x, y)^p \, d\pi(x, y) \Big)^{1/p} \quad \text{if } p \ge 1,$$

$$W_p(\mu, \nu) = \inf_{\pi} \iint_{X \times X} d(x, y)^p \, d\pi(x, y) \quad \text{if } 0 < p < 1,$$

where $\pi$ runs over the set of probability measures on $X \times X$ with marginals $\mu$ and $\nu$, defines a metric on the subset $P_p(X)$ of measures $\mu$ in $P(X)$ such that $\int_X d(x_0, x)^p \, d\mu(x)$ is finite for some (and hence any) $x_0$ in $X$: it is called the Wasserstein distance of order $p$ (see [1], [4], [5] or [6] for instance).

These distances are strongly linked to the theory of optimal transportation and have been widely used in applications to partial differential equations, functional inequalities and probability theory. Some of these applications involve probability measures on infinite dimensional spaces such as the Wiener space of $\mathbb{R}^d$-valued continuous functions on the interval $[0, T]$ (as in [3] for instance) or some sets of probability measures on a phase space; this motivates the general framework considered in this note, in which we shall prove:

Theorem. If $(X, d)$ is a separable complete metric space and $p$ a positive number, then the metric space $(P_p(X), W_p)$ is separable and complete.

The completeness property has been proven in [4] by comparing the $W_p$ distances with the weaker Prohorov distance, for which the property is known, and in [1] by means of a deep result by Kolmogorov. Here we shall give a more direct and elementary argument.

Let us note that properties of this kind can be studied as in [2] within the following broader scope of weighted spaces of probability measures.
Let $(X, \tau)$ be a topological space and $\omega$ be a real-valued continuous function on $X$, bounded from below by a positive constant, and let $P_\omega(X)$ denote the set of Borel probability measures $\mu$ on $X$ such that $\int_X \omega(x) \, d\mu(x)$ is finite. We equip $P_\omega(X)$ with the natural weak topology defined by the set $C_{b\omega}(X)$ of real-valued continuous functions $f$ on $X$ such that $\omega^{-1} f$ is bounded on $X$: this topology, which will be denoted $w\text{-}C_{b\omega}$, is defined by the seminorms

$$\mu \mapsto \sup_{i=1,\dots,n} \Big| \int_X f_i(x) \, d\mu(x) \Big|$$
for any finite family $f_1, \dots, f_n$ of functions in $C_{b\omega}(X)$.

Then one can prove that if the topological space $(X, \tau)$ is separable (resp. separable and metrizable, resp. separable, metrizable and topologically complete), then so is $(P_\omega(X), w\text{-}C_{b\omega})$. Conversely if $(P_\omega(X), w\text{-}C_{b\omega})$ is separable (resp. separable, metrizable and topologically complete), then so is $(X, \tau)$ if $(X, \tau)$ is a priori metrizable. As in the case when $\omega = 1$, that is, without weight, where they are known, these properties can be proven either by building some explicit distances on the considered spaces of probability measures, or by abstract functional methods as in [2].

In the case when $(X, d)$ is a separable complete metric space and $\omega = 1 + d(x_0, \cdot)^p$ for $p > 0$, the $w\text{-}C_{b\omega}$ topology on the set $P_\omega(X) = P_p(X)$ is metrized by the distance $W_p$: in particular $(P_p(X), W_p)$ is separable and topologically complete.

The following two sections are devoted to a direct proof of the above theorem, which in particular ensures that $(P_p(X), W_p)$ is complete.

1. Separability

In this section we prove that the metric space $(P_p(X), W_p)$ is separable if $(X, d)$ is a separable complete metric space and $p$ is a positive number.

If $(x_n)_n$ is a sequence dense in $(X, d)$ we actually prove that the countable set of measures of the form $\sum_{n=1}^{N} b_n \delta_{x_n}$, where $N$ is an integer, the $b_n$'s are nonnegative rational numbers with unit sum and $\delta_x$ stands for the point mass at $x$, is dense in $(P_p(X), W_p)$.

Let indeed $\mu$ be a given measure in $P_p(X)$ and $\varepsilon$ be a given positive number.

1. We first approach $\mu$ by a measure $\mu_1 = \sum_{n=1}^{+\infty} a_n \delta_{x_n}$ where the $a_n$'s are nonnegative real numbers with $\sum_{n=1}^{+\infty} a_n = 1$.

For this we note that $X$ is covered by the balls $B(x_n, \varepsilon^{\max(1,1/p)})$ with centers $x_n$ and radius $\varepsilon^{\max(1,1/p)}$, and is partitioned by the sets

$$B_n = B(x_n, \varepsilon^{\max(1,1/p)}) \setminus \bigcup_{k \le n-1} B(x_k, \varepsilon^{\max(1,1/p)}),$$

so that the $a_n = \mu[B_n]$ have unit sum.
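Moreover sending each point in $B_n$ onto $x_n$ for each $n$ defines a transport map between $\mu$ and $\mu_1 = \sum_{n=1}^{+\infty} a_n \delta_{x_n}$ with cost

$$\sum_{n=1}^{+\infty} \int_{B_n} d(x, x_n)^p \, d\mu(x) \le \sum_{n=1}^{+\infty} a_n \, \varepsilon^{p \max(1,1/p)} = \varepsilon^{\max(p,1)};$$

consequently

$$W_p(\mu, \mu_1) \le \varepsilon.$$

2. Then we approach $\mu_1$ by a measure $\mu_2 = \sum_{n=1}^{N} b_n \delta_{x_n}$ where the $b_n$'s are nonnegative rational numbers with $\sum_{n=1}^{N} b_n = 1$.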
First of all $\mu_1$ belongs to $P_p(X)$ since it is at finite $W_p$ distance from the measure $\mu$ in $P_p(X)$. Hence

$$\sum_{n=1}^{+\infty} a_n \, d(x_n, x_1)^p = W_p(\mu_1, \delta_{x_1})^{\max(p,1)}$$

is finite since $\delta_{x_1}$ also belongs to $P_p(X)$. In particular there exists an integer $N$ such that

$$\sum_{n=N+1}^{+\infty} a_n \, d(x_n, x_1)^p \le \varepsilon^{\max(p,1)}.$$

For each $2 \le n \le N$ we now let $b_n$ be a nonnegative rational number such that

$$0 \le a_n - b_n \le \varepsilon^{\max(p,1)} \Big( \sum_{j=1}^{N} a_j \, d(x_j, x_1)^p \Big)^{-1} a_n,$$

and such that

$$b_1 = a_1 + \sum_{n=2}^{N} (a_n - b_n) + \sum_{n=N+1}^{+\infty} a_n$$

is rational: in particular the $b_n$'s have unit sum.

Moreover one can transport $\mu_1$ onto $\mu_2 = \sum_{n=1}^{N} b_n \delta_{x_n}$ by keeping a mass $b_n$ at $x_n$ and sending the remaining mass $a_n - b_n$ from $x_n$ onto $x_1$ for each $2 \le n \le N$, and sending the whole mass $a_n$ from $x_n$ onto $x_1$ for each $n \ge N + 1$; the associated cost is

$$\sum_{n=2}^{N} (a_n - b_n) \, d(x_n, x_1)^p + \sum_{n=N+1}^{+\infty} a_n \, d(x_n, x_1)^p \le 2 \, \varepsilon^{\max(p,1)},$$

so that

$$W_p(\mu_1, \mu_2) \le 2 \, \varepsilon.$$

3. To sum up, we have approached $\mu$ by a measure $\mu_2$ in $P_p(X)$ which has the expected form and lies at $W_p$ distance at most $3 \, \varepsilon$ from $\mu$.

2. Completeness

In this section we prove that the metric space $(P_p(X), W_p)$ is complete if $(X, d)$ is a separable complete metric space and $p$ is a positive number.

Let indeed $(\mu_n)_n$ be a Cauchy sequence in $(P_p(X), W_p)$.

1. For $p \ge 1$ we first prove that $(\mu_n)_n$ is uniformly tight by adapting a classical proof of Ulam's lemma.

Let $\varepsilon$ be a given positive number. We note that $(\mu_n)_n$ is Cauchy in $(P_1(X), W_1)$ since $W_1 \le W_p$. Hence there exists $N$ such that

$$W_1(\mu_n, \mu_N) \le \varepsilon^2$$

for any $n \ge N$ so that, for any $n$, there exists $j \le N$ such that

$$W_1(\mu_n, \mu_j) \le \varepsilon^2. \tag{1}$$
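The finite family $(\mu_j)_{j \le N}$ is uniformly tight by Ulam's lemma, so there exists a compact set $K$ such that $\mu_j(K) \ge 1 - \varepsilon$ for any $j \le N$, whence $q$ points $x_1, \dots, x_q$ in $X$ such that

$$\mu_j(U) \ge 1 - \varepsilon \tag{2}$$

for any $j \le N$, where $U = \bigcup_{k=1}^{q} B(x_k, \varepsilon)$.

Then let $\varphi$ be the $\frac{1}{\varepsilon}$-Lipschitz function defined on $X$ by

$$\varphi(x) = \Big( 1 - \frac{d(x, U)}{\varepsilon} \Big)^{+}.$$

Given $j$ and $n$, if $\pi$ is any joint measure on $X \times X$ with marginals $\mu_j$ and $\mu_n$, then

$$\int_X \varphi(x) \, d\mu_j(x) - \int_X \varphi(y) \, d\mu_n(y) = \iint_{X \times X} (\varphi(x) - \varphi(y)) \, d\pi(x, y) \le \frac{1}{\varepsilon} \iint_{X \times X} d(x, y) \, d\pi(x, y);$$

hence

$$\int_X \varphi(x) \, d\mu_j(x) - \int_X \varphi(y) \, d\mu_n(y) \le \frac{1}{\varepsilon} W_1(\mu_j, \mu_n).$$

On the other hand $1_U \le \varphi \le 1_{U^\varepsilon}$ where $U^\varepsilon = \{x; \ d(x, U) < \varepsilon\}$, so

$$\int_X \varphi(x) \, d\mu_j(x) \ge \mu_j(U) \quad \text{and} \quad \int_X \varphi(y) \, d\mu_n(y) \le \mu_n(U^\varepsilon).$$

Consequently

$$\mu_n(U^\varepsilon) \ge \mu_j(U) - \frac{1}{\varepsilon} W_1(\mu_j, \mu_n). \tag{3}$$

Thus, by (1), (2) and (3), for any $\varepsilon > 0$ we have found $q$ points $x_1, \dots, x_q$ such that

$$\mu_n \Big( X \setminus \bigcup_{k=1}^{q} B(x_k, 2 \varepsilon) \Big) \le 2 \, \varepsilon$$

for any $n$, since $U^\varepsilon \subset \bigcup_{k=1}^{q} B(x_k, 2 \varepsilon)$.

Therefore, replacing $\varepsilon$ by $\varepsilon \, 2^{-m-1}$ where $m$ is any positive integer, there exist $q(m)$ points $x_1^m, \dots, x_{q(m)}^m$ in $X$ such that

$$\mu_n \Big( X \setminus \bigcup_{k=1}^{q(m)} B(x_k^m, \varepsilon \, 2^{-m}) \Big) \le \varepsilon \, 2^{-m}$$

for any $n$. In particular the set

$$S = \bigcap_{m=1}^{+\infty} \bigcup_{k=1}^{q(m)} B(x_k^m, \varepsilon \, 2^{-m})$$

is such that

$$\mu_n(X \setminus S) \le \sum_{m=1}^{+\infty} \mu_n \Big( X \setminus \bigcup_{k=1}^{q(m)} B(x_k^m, \varepsilon \, 2^{-m}) \Big) \le \sum_{m=1}^{+\infty} \varepsilon \, 2^{-m} = \varepsilon$$

for any $n$.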
On the other hand, for any $\rho > 0$, and choosing $m$ such that $\varepsilon \, 2^{-m} \le \rho$, the set $S$ can be covered by the $q(m)$ balls $B(x_k^m, \varepsilon \, 2^{-m})$ with radius $\varepsilon \, 2^{-m} \le \rho$: in other words it is totally bounded, so that its closure $\overline{S}$ is compact since $X$ is complete.

To sum up, the set $\overline{S}$ is compact and satisfies $\mu_n(X \setminus \overline{S}) \le \varepsilon$ for any $n$: this means that the sequence $(\mu_n)_n$ is indeed uniformly tight.

2. We deduce from step 1 that $(\mu_n)_n$ converges in $(P_p(X), W_p)$ in the case when $p \ge 1$.

Indeed $(\mu_n)_n$ is uniformly tight by step 1, so by Prohorov's theorem there exists a subsequence $(\mu_{n'})_{n'}$ of $(\mu_n)_n$ converging to a probability measure $\mu$ on $X$ for the narrow topology. The distance $W_p(\mu, \mu_{n'})$ actually tends to $0$ as $n'$ goes to infinity.

Let indeed $\pi_{n'm}$ be a probability measure on $X \times X$ with marginals $\mu_{n'}$ and $\mu_m$, optimal in the sense that

$$\iint_{X \times X} d(x, y)^p \, d\pi_{n'm}(x, y) = W_p(\mu_{n'}, \mu_m)^p.$$

The sequence $(\mu_{n'})_{n'}$ is uniformly tight, hence so is $(\pi_{n'm})_{n'}$ for given $m$. Thus by Prohorov's theorem again there exists a subsequence $(\pi_{n''m})_{n''}$ of $(\pi_{n'm})_{n'}$ converging to a probability measure $\pi_m$ on $X \times X$ for the narrow topology. Then by semicontinuity

$$\iint_{X \times X} d(x, y)^p \, d\pi_m(x, y) \le \liminf_{n'' \to +\infty} \iint_{X \times X} d(x, y)^p \, d\pi_{n''m}(x, y) = \liminf_{n'' \to +\infty} W_p(\mu_{n''}, \mu_m)^p. \tag{4}$$

But on one hand $\pi_{n''m}$ has marginals $\mu_{n''}$ and $\mu_m$, so at the limit (in $n''$) $\pi_m$ has marginals $\mu$ and $\mu_m$; hence

$$W_p(\mu, \mu_m)^p \le \iint_{X \times X} d(x, y)^p \, d\pi_m(x, y) \tag{5}$$

for any $m$. On the other hand the sequence $(\mu_n)_n$ is Cauchy for the distance $W_p$, so for any $\varepsilon > 0$ and $n, m$ large enough

$$W_p(\mu_n, \mu_m) \le \varepsilon. \tag{6}$$

It finally follows from (4), (5) and (6) that

$$W_p(\mu, \mu_m) \le \varepsilon$$

for $m$ large enough, which means that $\mu$ belongs to $P_p(X)$ and that $W_p(\mu, \mu_{n'})$ indeed tends to $0$ as $n'$ goes to infinity.

Finally $W_p(\mu_n, \mu)$ tends to $0$ as $n$ goes to infinity since the whole sequence $(\mu_n)_n$ is Cauchy in $(P_p(X), W_p)$.

3. We deduce from step 2 that $(\mu_n)_n$ converges in $(P_p(X), W_p)$ in the case when $0 < p < 1$.
Indeed $d^p$ is a distance on $X$ which defines the same topology as $d$, and $(X, d^p)$ is complete if so is $(X, d)$. Moreover $P_p(X, d) = P_1(X, d^p)$ and $W_p(X, d) = W_1(X, d^p)$ in obvious notation. Thus, given an exponent $p \in \, ]0, 1[$ and a metric $d$ on $X$, the results associated with the exponent $p$ and the metric $d$ stem from the results proven in step 2 for the exponent $1$ and the metric $d^p$.
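The reduction used in step 3 can be checked numerically on a toy example. The following is only a minimal sketch, under assumptions that are not part of the note: the space is the real line with $d(x, y) = |x - y|$, the measures are uniform on the same small number of points (so that an optimal coupling can be sought among permutations and a brute-force search is exact), and the helper names `power_metric` and `transport_cost` are hypothetical.

```python
import itertools
import random


def euclid(x, y):
    """Usual distance on the real line (an assumption made for this sketch)."""
    return abs(x - y)


def power_metric(d, p):
    """Return the function (x, y) -> d(x, y)**p, for an exponent 0 < p < 1."""
    return lambda x, y: d(x, y) ** p


def transport_cost(xs, ys, cost):
    """Minimal average cost between the uniform measures on xs and on ys.

    Both lists must have the same length n. For such measures an optimal
    coupling can be chosen among the n! permutations, so the exhaustive
    search below is exact, but it is only usable for very small n.
    """
    n = len(xs)
    return min(
        sum(cost(xs[i], ys[s[i]]) for i in range(n)) / n
        for s in itertools.permutations(range(n))
    )


random.seed(0)
p = 0.5
d_p = power_metric(euclid, p)

# d^p satisfies the triangle inequality (checked on random triples,
# with a small tolerance for floating-point rounding).
triples = [[random.uniform(-5, 5) for _ in range(3)] for _ in range(1000)]
triangle_ok = all(
    d_p(x, z) <= d_p(x, y) + d_p(y, z) + 1e-12 for x, y, z in triples
)

# For 0 < p < 1, W_p on (X, d) is the infimum of the integral of d^p,
# with no p-th root: this is exactly W_1 on (X, d^p).
a = [0.0, 1.0, 2.5, 4.0]
b = [0.5, 1.5, 3.0, 6.0]
c = [-1.0, 0.2, 2.0, 5.0]
w_p_d = transport_cost(a, b, lambda x, y: euclid(x, y) ** p)
w_1_dp = transport_cost(a, b, d_p)

# Consequently W_p inherits the triangle inequality from W_1 on (X, d^p).
w_ab = w_1_dp
w_bc = transport_cost(b, c, d_p)
w_ac = transport_cost(a, c, d_p)
w_triangle_ok = w_ac <= w_ab + w_bc + 1e-12
```

The sketch only illustrates the identity $W_p(X, d) = W_1(X, d^p)$ on finitely supported measures; it says nothing about completeness, which is a statement about limits.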
References

[1] L. Ambrosio, N. Gigli, and G. Savaré. Gradient flows in metric spaces and in the spaces of probability measures. Birkhäuser, Basel, 2005.
[2] F. Bolley. Applications du transport optimal à des problèmes de limites de champ moyen. Thèse de doctorat, Ecole Normale Supérieure de Lyon. Available at http://www.lsp.ups-tlse.fr/fp/bolley, 2005.
[3] F. Bolley. Quantitative concentration inequalities on sample path space for mean field interaction. Preprint available at http://www.lsp.ups-tlse.fr/fp/bolley, 2005.
[4] S. T. Rachev. Probability metrics and the stability of stochastic models. John Wiley and Sons, Chichester, 1991.
[5] S. T. Rachev and L. Rüschendorf. Mass transportation problems. Vol. I and II. Springer, New York, 1998.
[6] C. Villani. Topics in optimal transportation, volume 58 of Grad. Stud. Math. AMS, Providence, 2003.

Ecole Normale Supérieure de Lyon, Umpa (UMR 5669), 46 allée d'Italie, F-69364 Lyon cedex 7
Current address: Institut de Mathématiques, LSP (UMR C5583), Université Paul Sabatier, Route de Narbonne, F-31062 Toulouse cedex 9
E-mail address: bolley@cict.fr