Generalized Orlicz spaces and Wasserstein distances for convex concave scale functions

Bull. Sci. math. 135 (2011) 795–802
www.elsevier.com/locate/bulsci

Karl-Theodor Sturm
Institut für Angewandte Mathematik, Endenicher Allee 60, 53115 Bonn, Germany
E-mail address: sturm@uni-bonn.de
Available online 6 July 2011
doi:10.1016/j.bulsci.2011.07.013

Abstract

Given a strictly increasing, continuous function $\vartheta:\mathbb{R}_+\to\mathbb{R}_+$, based on the cost functional $\int_{X\times X}\vartheta(d(x,y))\,dq(x,y)$, we define the $L^\vartheta$-Wasserstein distance $W_\vartheta(\mu,\nu)$ between probability measures $\mu,\nu$ on some metric space $(X,d)$. The function $\vartheta$ will be assumed to admit a representation $\vartheta=\varphi\circ\psi$ as a composition of a convex and a concave function $\varphi$ and $\psi$, resp. Besides convex functions and concave functions this includes all $C^2$ functions. For such functions $\vartheta$ we extend the concept of Orlicz spaces, defining the metric space $L^\vartheta(X,m)$ of measurable functions $f:X\to\mathbb{R}$ such that, for instance,
\[ d_\vartheta(f,g)\le 1 \iff \int_X \vartheta\big(|f(x)-g(x)|\big)\,d\mu(x)\le 1. \]
© 2011 Elsevier Masson SAS. All rights reserved.

1. Convex-concave compositions

Throughout this paper, $\vartheta$ will be a strictly increasing, continuous function from $\mathbb{R}_+$ to $\mathbb{R}_+$ with $\vartheta(0)=0$.

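Definition 1.1. $\vartheta$ will be called a ccc function ("convex-concave composition") iff there exist two strictly increasing continuous functions $\varphi,\psi:\mathbb{R}_+\to\mathbb{R}_+$ with $\varphi(0)=\psi(0)=0$ s.t. $\varphi$ is convex, $\psi$ is concave and $\vartheta=\varphi\circ\psi$. The pair $(\varphi,\psi)$ will be called a convex-concave factorization of $\vartheta$. The factorization is called minimal (or non-redundant) if for any other factorization $(\tilde\varphi,\tilde\psi)$ the function $\varphi^{-1}\circ\tilde\varphi$ is convex.

Two minimal factorizations of a given function $\vartheta$ differ only by a linear change of variables. Indeed, if $\varphi^{-1}\circ\tilde\varphi$ is convex and also $\tilde\varphi^{-1}\circ\varphi$ is convex then there exists a $\lambda\in(0,\infty)$ s.t. $\tilde\varphi(t)=\varphi(\lambda t)$ and $\tilde\psi(t)=\lambda^{-1}\psi(t)$.

For each convex, concave or ccc function $f:\mathbb{R}_+\to\mathbb{R}_+$ put
\[ f'(t) := f'(t+) := \lim_{h\searrow 0}\frac{1}{h}\big[f(t+h)-f(t)\big]. \]

Lemma 1.2.
(i) For any ccc function $\vartheta$, the function $\log\vartheta'$ is locally of bounded variation and the distribution $(\log\vartheta')'$ defines a signed Radon measure on $(0,\infty)$, henceforth denoted by $d(\log\vartheta')$.
(ii) A pair $(\varphi,\psi)$ of strictly increasing convex or concave, resp., continuous functions with $\varphi(0)=\psi(0)=0$ is a factorization of $\vartheta$ iff
\[ d(\log\vartheta') = (\psi^{-1})_*\,d(\log\varphi') + d(\log\psi') \tag{1} \]
in the sense of signed Radon measures. Here $(\psi^{-1})_*\,d(\log\varphi')$ denotes the push-forward of $d(\log\varphi')$ under $\psi^{-1}$, that is, the measure $d(\log\varphi'\circ\psi)$.
(iii) The factorization $(\varphi,\psi)$ is minimal iff for any other factorization $(\tilde\varphi,\tilde\psi)$
\[ d(\log\psi') \ge d(\log\tilde\psi') \]
in the sense of nonnegative Radon measures on $(0,\infty)$.
(iv) Every ccc function $\vartheta$ admits a minimal factorization $(\check\vartheta,\hat\vartheta)$ given by $\check\vartheta := \vartheta\circ\hat\vartheta^{-1}$ and
\[ \hat\vartheta(x) := \int_0^x \exp\Big(-\int_1^y d\nu_-(z)\Big)\,dy \]
where $d\nu_-(z)$ denotes the negative part of the Radon measure $d\nu(z)=d(\log\vartheta')(z)$, and where $\int_1^y$ is read as $-\int_y^1$ for $y<1$.

Proof. (i), (ii): The chain rule for convex/concave functions yields
\[ \vartheta'(t) = \varphi'\big(\psi(t)\big)\cdot\psi'(t) \]
for each factorization $(\varphi,\psi)$ of a ccc function $\vartheta$. Taking logarithms it implies that $\log\vartheta'$ locally is a BV function (as a difference of two increasing functions) and, hence, that the associated Radon measures satisfy
\[ d(\log\vartheta') = d(\log\varphi'\circ\psi) + d(\log\psi') = (\psi^{-1})_*\,d(\log\varphi') + d(\log\psi'). \]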
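Before the proof continues with (iii) and (iv), identity (1) and the formula in (iv) can be checked by hand on a concrete function. The example below is an illustration that does not appear in the paper: it assumes $\vartheta(t)=\log^2(1+t)$ with the hypothetical factorization $\varphi(s)=s^2$, $\psi(t)=\log(1+t)$, and it reads the formula in (iv) as $\hat\vartheta'(y)=\exp(-\nu_-((1,y]))$ for $y\ge 1$ and $\hat\vartheta'(y)=\exp(\nu_-((y,1]))$ for $y<1$.

```latex
% Illustrative example (not from the paper):
% \vartheta(t) = \log^2(1+t), \varphi(s) = s^2 (convex), \psi(t) = \log(1+t) (concave).
\begin{align*}
\vartheta'(t) &= \varphi'(\psi(t))\,\psi'(t) = \frac{2\log(1+t)}{1+t},\\
d(\log\vartheta')(t) &= \Big[\frac{1}{(1+t)\log(1+t)} - \frac{1}{1+t}\Big]\,dt,\\
d(\log\varphi'\circ\psi)(t) &= \frac{\psi'(t)}{\psi(t)}\,dt = \frac{dt}{(1+t)\log(1+t)},
\qquad d(\log\psi')(t) = -\frac{dt}{1+t}.
\end{align*}
% The two measures in the last line add up to the measure in the second line, as in (1).

% Negative part of \nu = d(\log\vartheta'): it lives on \{t > e-1\}, where \log(1+t) > 1.
\[
d\nu_-(t) = \mathbf{1}_{\{t>e-1\}}\,\frac{1}{1+t}\Big(1-\frac{1}{\log(1+t)}\Big)\,dt
\;\le\; \frac{dt}{1+t} = -\,d(\log\psi')(t).
\]

% Formula (iv), using \vartheta'(e-1) = 2/e:
\[
\hat\vartheta(x) =
\begin{cases}
x, & 0\le x\le e-1,\\[2pt]
(e-1) + \dfrac{e}{2}\big(\log^2(1+x)-1\big), & x> e-1,
\end{cases}
\qquad
\check\vartheta(s) =
\begin{cases}
\log^2(1+s), & 0\le s\le e-1,\\[2pt]
1 + \dfrac{2}{e}\big(s-(e-1)\big), & s> e-1.
\end{cases}
\]
% One checks directly that \check\vartheta\circ\hat\vartheta = \vartheta, that \hat\vartheta is
% concave and that \check\vartheta is convex.
```

In this example $d(\log\hat\vartheta')=-d\nu_-$ vanishes on $(0,e-1)$ while $d(\log\psi')$ is strictly negative there, so the criterion in (iii) fails for $(\varphi,\psi)$: the factorization $(s^2,\log(1+t))$ is a convex-concave factorization but not a minimal one.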

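(iii): The factorization $(\varphi,\psi)$ is minimal if and only if for any other factorization $(\tilde\varphi,\tilde\psi)$ the function $u=\varphi^{-1}\circ\tilde\varphi=\psi\circ\tilde\psi^{-1}$ is convex. Since
\[ \log\psi' = \log u'(\tilde\psi) + \log\tilde\psi', \]
the latter is equivalent to
\[ d(\log\psi') \ge d(\log\tilde\psi'), \]
which is the claim.

(iv): Define $\hat\vartheta$ as above. It remains to verify that $\hat\vartheta<\infty$. Let $(\varphi,\psi)$ be any convex-concave factorization of $\vartheta$. Without restriction assume $\psi'(1)=1$. Then the Hahn decomposition of (1) yields
\[ d\nu_- \le -\,d(\log\psi'). \]
Hence, for all $0\le x\le 1$,
\[ \hat\vartheta(x) = \int_0^x \exp\Big(\int_y^1 d\nu_-(z)\Big)\,dy \le \int_0^x \exp\Big(-\int_y^1 d(\log\psi')(z)\Big)\,dy = \psi(x) < \infty. \tag{2} \]
This already implies that $\hat\vartheta$ is finite, strictly increasing and continuous on $[0,\infty)$. (For instance, for $x>1$ it follows that $\hat\vartheta(x)\le\hat\vartheta(1)+x-1$.) Moreover, one easily verifies that $\hat\vartheta$ is concave. Since $\nu_+,\nu_-$ are the minimal nonnegative measures in the Hahn (or Jordan) decomposition of $\nu=\nu_+-\nu_-$, it follows that $(\check\vartheta,\hat\vartheta)$ is a minimal cc decomposition of $\vartheta$.

Examples 1.3.
Each convex function $\vartheta$ is a ccc function. A minimal factorization is given by $(\vartheta,\mathrm{Id})$.
Each concave function $\vartheta$ is a ccc function. A minimal factorization is given by $(\mathrm{Id},\vartheta)$.
Each $C^2$ function $\vartheta$ with $\vartheta'(0+)>0$ is a ccc function. The minimal factorization is given by
\[ \hat\vartheta(x) := \int_0^x \exp\Big(\int_1^y \frac{\vartheta''(z)\wedge 0}{\vartheta'(z)}\,dz\Big)\,dy \]
and $\check\vartheta := \vartheta\circ\hat\vartheta^{-1}$. (The condition $\vartheta'(0+)>0$ can be replaced by the strictly weaker requirement that the previous integral defining $\hat\vartheta$ is finite.)

2. The metric space $L^\vartheta(X,\mu)$

Let $(X,\Xi,\mu)$ be a $\sigma$-finite measure space and $(\varphi,\psi)$ a minimal ccc factorization of a given function $\vartheta$. Then $L^\vartheta(X,\mu)$ will denote the space of all measurable functions $f:X\to\mathbb{R}$ such that
\[ \int_X \varphi\Big(\frac{1}{t}\,\psi(|f|)\Big)\,d\mu < \infty \]
for some $t\in(0,\infty)$ where as usual functions which agree almost everywhere are identified. Note that, since $r\mapsto\varphi(r)$ grows at least linearly for large $r$, the previous condition is equivalent to the condition
\[ \int_X \varphi\Big(\frac{1}{t}\,\psi(|f|)\Big)\,d\mu \le 1 \]
for some $t\in(0,\infty)$.

Theorem 2.1. $L^\vartheta(X,\mu)$ is a complete metric space with the metric
\[ d_\vartheta(f,g) = \inf\Big\{ t\in(0,\infty):\ \int_X \varphi\Big(\frac{1}{t}\,\psi(|f-g|)\Big)\,d\mu \le 1 \Big\}. \]
The definition of this metric does not depend on the choice of the minimal ccc factorization of the function $\vartheta$. However, choosing an arbitrary convex-concave factorization of $\vartheta$ might change the value of $d_\vartheta$. Note that always $d_\vartheta(f,g)=d_\vartheta(f-g,0)$.

Proof of Theorem 2.1. Let $f,g,h\in L^\vartheta(X,\mu)$ be given and choose $r,s>0$ with $d_\vartheta(f,g)<r$ and $d_\vartheta(g,h)<s$. The latter implies
\[ \int_X \varphi\Big(\frac{1}{r}\,\psi(|f-g|)\Big)\,d\mu \le 1, \qquad \int_X \varphi\Big(\frac{1}{s}\,\psi(|g-h|)\Big)\,d\mu \le 1. \]
Concavity of $\psi$ yields $\psi(|f-h|)\le\psi(|f-g|)+\psi(|g-h|)$. Put $t=r+s$. Then convexity of $\varphi$ implies
\begin{align*}
\varphi\Big(\frac{1}{t}\,\psi(|f-h|)\Big)
&\le \varphi\Big(\frac{r}{t}\cdot\frac{1}{r}\,\psi(|f-g|) + \frac{s}{t}\cdot\frac{1}{s}\,\psi(|g-h|)\Big)\\
&\le \frac{r}{t}\,\varphi\Big(\frac{1}{r}\,\psi(|f-g|)\Big) + \frac{s}{t}\,\varphi\Big(\frac{1}{s}\,\psi(|g-h|)\Big).
\end{align*}
Hence,
\begin{align*}
\int_X \varphi\Big(\frac{1}{t}\,\psi(|f-h|)\Big)\,d\mu
&\le \frac{r}{t}\int_X \varphi\Big(\frac{1}{r}\,\psi(|f-g|)\Big)\,d\mu + \frac{s}{t}\int_X \varphi\Big(\frac{1}{s}\,\psi(|g-h|)\Big)\,d\mu\\
&\le \frac{r}{t}\cdot 1 + \frac{s}{t}\cdot 1 = 1
\end{align*}
and thus $d_\vartheta(f,h)\le t$. This proves that $d_\vartheta(f,h)\le d_\vartheta(f,g)+d_\vartheta(g,h)$.

In order to prove the completeness of the metric, let $(f_n)_n$ be a Cauchy sequence in $L^\vartheta$. Then $d_\vartheta(f_n,f_m)<\varepsilon_n$ for all $n,m$ with $m\ge n$ and suitable $\varepsilon_n\to 0$. Choose an increasing sequence of measurable sets $X_k$, $k\in\mathbb{N}$, with $\mu(X_k)<\infty$ and $\bigcup_k X_k=X$. Then
\[ \int_{X_k} \varphi\Big(\frac{1}{\varepsilon_n}\,\psi(|f_n-f_m|)\Big)\,d\mu \le 1 \]
for all $k,m,n$ with $m\ge n$. Jensen's inequality implies
\[ \varphi\Big(\frac{1}{\mu(X_k)}\int_{X_k}\frac{1}{\varepsilon_n}\,\psi(|f_n-f_m|)\,d\mu\Big) \le \frac{1}{\mu(X_k)} \]
and thus
\[ \int_{X_k} \big|\psi(f_n)-\psi(f_m)\big|\,d\mu \le \varepsilon_n\cdot\mu(X_k)\cdot\varphi^{-1}\Big(\frac{1}{\mu(X_k)}\Big). \]
In other words, $(\psi(f_n))_n$ is a Cauchy sequence in $L^1(X_k,\mu)$. It follows that it has a subsequence $(\psi(f_{n_i}))_i$ which converges $\mu$-almost everywhere on $X_k$. In particular, $(f_{n_i})_i$ converges almost everywhere on $X_k$ towards some limiting function $f$ (which is easily shown to be independent of $k$). Finally, Fatou's lemma now implies
\[ \int_{X_k} \varphi\Big(\frac{1}{\varepsilon_n}\,\psi(|f_n-f|)\Big)\,d\mu \le \liminf_{m\to\infty}\int_{X_k} \varphi\Big(\frac{1}{\varepsilon_n}\,\psi(|f_n-f_m|)\Big)\,d\mu \le 1 \]
for each $k$ and $n\in\mathbb{N}$. Hence,
\[ \int_X \varphi\Big(\frac{1}{\varepsilon_n}\,\psi(|f_n-f|)\Big)\,d\mu \le 1, \]
that is, $d_\vartheta(f_n,f)\le\varepsilon_n$, which proves the claim.

Finally, it remains to verify that
\[ d_\vartheta(f,g)=0 \iff f=g\ \ \mu\text{-a.e. on } X. \]
The implication $\Leftarrow$ is trivial. For the reverse implication, we may argue as in the previous completeness proof: $d_\vartheta(f,g)=0$ will yield
\[ \int_{X_k} \varphi\Big(\frac{1}{t}\,\psi(|f-g|)\Big)\,d\mu \le 1 \]
for all $k\in\mathbb{N}$ and all $t>0$, which in turn implies
\[ \int_{X_k} \big|\psi(f)-\psi(g)\big|\,d\mu = 0. \]
The latter proves $f=g$ $\mu$-a.e. on $X$, which is the claim.

Examples 2.2. If $\vartheta(r)=r^p$ for some $p\in(0,\infty)$ then
\[ d_\vartheta(f,g) = \Big(\int_X |f-g|^p\,d\mu\Big)^{1/p^*} \]
with $p^*:=p$ if $p\ge 1$ and $p^*:=1$ if $p\le 1$.

Proposition 2.3.
(i) If $\vartheta$ is convex then $\|f\|_{L^\vartheta(X,\mu)} := d_\vartheta(f,0)$ is indeed a norm and $L^\vartheta(X,\mu)$ is a Banach space, called Orlicz space. The norm is called Luxemburg norm.
(ii) If $\vartheta$ is concave then
\[ d_\vartheta(f,g) = \int_X \vartheta(|f-g|)\,d\mu \ge \big\|\vartheta(f)-\vartheta(g)\big\|_{L^1(X,\mu)}. \]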

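(iii) For general ccc function $\vartheta=\varphi\circ\psi$
\[ d_\vartheta(f,g) = \big\|\psi(|f-g|)\big\|_{L^\varphi(X,\mu)}. \]
(iv) If $\mu(X)=1$ then for each strictly increasing, convex function $\Phi:\mathbb{R}_+\to\mathbb{R}_+$ with $\Phi^{-1}(1)=1$
\[ d_{\Phi\circ\vartheta}(f,g) \ge d_\vartheta(f,g) \qquad \text{("Jensen's inequality").} \]

Proof. (i) If $\psi(r)=cr$ then obviously $d_\vartheta(tf,0)=t\cdot d_\vartheta(f,0)$. See also standard literature [2].
(ii) Concavity of $\vartheta$ implies $\vartheta(|f-g|)\ge|\vartheta(f)-\vartheta(g)|$.
(iv) Assume that $d_{\Phi\circ\vartheta}(f,g)<t$ for some $t\in(0,\infty)$. It implies
\[ \int_X \Phi\Big(\varphi\Big(\frac{1}{t}\,\psi(|f-g|)\Big)\Big)\,d\mu \le 1. \]
The classical Jensen inequality for integrals yields
\[ \Phi\Big(\int_X \varphi\Big(\frac{1}{t}\,\psi(|f-g|)\Big)\,d\mu\Big) \le 1, \]
which due to the fact that $\Phi^{-1}(1)=1$ in turn implies $d_\vartheta(f,g)\le t$.

3. The $L^\vartheta$-Wasserstein space

Let $(X,d)$ be a complete separable metric space and $\vartheta$ a ccc function with minimal factorization $(\varphi,\psi)$. The $L^\vartheta$-Wasserstein space $\mathcal{P}_\vartheta(X)$ is defined as the space of all probability measures $\mu$ on $X$ (equipped with its Borel $\sigma$-field) s.t.
\[ \int_X \varphi\Big(\frac{1}{t}\,\psi\big(d(x,y)\big)\Big)\,d\mu(x) < \infty \]
for some $y\in X$ and some $t\in(0,\infty)$. The $L^\vartheta$-Wasserstein distance of two probability measures $\mu,\nu\in\mathcal{P}_\vartheta(X)$ is defined as
\[ W_\vartheta(\mu,\nu) = \inf\Big\{ t>0:\ \inf_{q\in\Pi(\mu,\nu)} \int_{X\times X} \varphi\Big(\frac{1}{t}\,\psi\big(d(x,y)\big)\Big)\,dq(x,y) \le 1 \Big\} \]
where $\Pi(\mu,\nu)$ denotes the set of all couplings of $\mu$ and $\nu$, i.e. the set of all probability measures $q$ on $X\times X$ s.t. $q(A\times X)=\mu(A)$, $q(X\times A)=\nu(A)$ for all Borel sets $A\subset X$.

Given two probability measures $\mu,\nu\in\mathcal{P}_\vartheta(X)$, a coupling $q$ of them is called optimal iff
\[ \int_{X\times X} \varphi\Big(\frac{1}{w}\,\psi\big(d(x,y)\big)\Big)\,dq(x,y) \le 1 \]
for $w:=W_\vartheta(\mu,\nu)$.

Proposition 3.1. For each pair of probability measures $\mu,\nu\in\mathcal{P}_\vartheta(X)$ there exists an optimal coupling $q$.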
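Numerically, the outer infimum over $t$ in the definitions of $d_\vartheta$ and $W_\vartheta$ is a one-dimensional search, because $t\mapsto\varphi(\psi(r)/t)$ is decreasing. The following is a minimal sketch in Python under stated assumptions: the measure (or a fixed coupling $q$) has finite support, and the factorization $(\varphi,\psi)$ is supplied by the caller. The names `cost_integral` and `luxemburg_distance` are illustrative and do not come from the paper. For a fixed coupling the result is only an upper bound for $W_\vartheta(\mu,\nu)$; the inner infimum over couplings is not attempted here.

```python
import math


def cost_integral(t, values, weights, phi, psi):
    """Discrete analogue of the integral of phi(psi(r) / t) against a finite measure.

    values  -- the numbers r_i (e.g. |f - g| at atom i, or d(x_i, y_i) for a coupling)
    weights -- the masses of the atoms
    """
    return sum(w * phi(psi(r) / t) for r, w in zip(values, weights))


def luxemburg_distance(values, weights, phi, psi, tol=1e-12):
    """Approximate inf{t > 0 : cost_integral(t) <= 1} by bisection.

    Assumes phi and psi are increasing, continuous and vanish at 0,
    so that the cost integral is decreasing in t.
    """
    # If every atom with positive mass carries the value 0, the infimum is 0.
    if all(r == 0 for r, w in zip(values, weights) if w > 0):
        return 0.0
    lo, hi = 0.0, 1.0
    # Grow hi until it is feasible (cost integral <= 1).
    while cost_integral(hi, values, weights, phi, psi) > 1.0:
        hi *= 2.0
    # Invariant: hi is feasible, lo is infeasible (or equal to 0).
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if cost_integral(mid, values, weights, phi, psi) <= 1.0:
            hi = mid
        else:
            lo = mid
    return hi


# theta(r) = r**2: phi(s) = s**2, psi = identity (the L^2 case of Examples 2.2).
d_convex = luxemburg_distance([3.0, 4.0], [1.0, 1.0], lambda s: s * s, lambda r: r)

# theta(r) = sqrt(r): phi = identity, psi = sqrt (the concave case p = 1/2).
d_concave = luxemburg_distance([4.0, 9.0], [1.0, 1.0], lambda s: s, math.sqrt)
```

With these inputs `d_convex` approximates $(3^2+4^2)^{1/2}=5$ and `d_concave` approximates $\sqrt4+\sqrt9=5$, in line with Examples 2.2. If $\mu=\delta_{x_0}$ is a Dirac mass, the only coupling with $\nu$ is the product measure, so passing the distances $d(x_0,y_i)$ and the weights of $\nu$ gives $W_\vartheta(\mu,\nu)$ itself rather than an upper bound.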

Proof. For $t\in(0,\infty)$ define the cost function $c_t(x,y)=\varphi\big(\frac{1}{t}\,\psi(d(x,y))\big)$. Note that $t\mapsto c_t(x,y)$ is continuous and decreasing. Let $\mu,\nu$ be given s.t. $w:=W_\vartheta(\mu,\nu)<\infty$. Then for all $t>w$ the measures $\mu$ and $\nu$ have finite $c_t$-transportation costs. More precisely,
\[ \inf_{q\in\Pi(\mu,\nu)} \int c_t(x,y)\,dq(x,y) \le 1. \]
Hence, there exist $q_n\in\Pi(\mu,\nu)$ s.t.
\[ \int c_{w+\frac{1}{n}}(x,y)\,dq_n(x,y) \le 1+\frac{1}{n}. \]
In particular, $\int c_{w+1}(x,y)\,dq_n(x,y)\le 2$ for all $n\in\mathbb{N}$. Hence, the family $(q_n)_n$ is tight [3, Lemma 4.4]. Therefore, there exists a converging subsequence $(q_{n_k})_k$ with limit $q\in\Pi(\mu,\nu)$ satisfying
\[ \int c_{w+\frac{1}{n}}(x,y)\,dq(x,y) \le 1+\frac{1}{n} \]
for all $n$ [3, Lemma 4.3] and thus
\[ \int c_w(x,y)\,dq(x,y) \le 1. \]

Proposition 3.2. $W_\vartheta$ is a complete metric on $\mathcal{P}_\vartheta(X)$.

The triangle inequality for $W_\vartheta$ is valid not only on $\mathcal{P}_\vartheta(X)$ but on the whole space $\mathcal{P}(X)$ of probability measures on $X$. The triangle inequality implies that $W_\vartheta(\mu,\nu)<\infty$ for all $\mu,\nu\in\mathcal{P}_\vartheta(X)$.

Proof of Proposition 3.2. Let three probability measures $\mu_1,\mu_2,\mu_3$ on $X$ and numbers $r,s$ with $W_\vartheta(\mu_1,\mu_2)<r$ and $W_\vartheta(\mu_2,\mu_3)<s$ be given. Then there exist a coupling $q_{12}$ of $\mu_1$ and $\mu_2$ and a coupling $q_{23}$ of $\mu_2$ and $\mu_3$ s.t.
\[ \int \varphi\Big(\frac{1}{r}\,\psi\circ d\Big)\,dq_{12} \le 1, \qquad \int \varphi\Big(\frac{1}{s}\,\psi\circ d\Big)\,dq_{23} \le 1. \]
Let $q_{123}$ be the gluing of the two couplings $q_{12}$ and $q_{23}$, see e.g. [1, Lemma 11.8.3]. That is, $q_{123}$ is a probability measure on $X\times X\times X$ s.t. the projection onto the first two factors coincides with $q_{12}$ and the projection onto the last two factors coincides with $q_{23}$. Let $q_{13}$ denote the projection of $q_{123}$ onto the first and third factor. In particular, this will be a coupling of $\mu_1$ and $\mu_3$. Then for $t:=r+s$
\begin{align*}
\int_{X\times X} \varphi\Big(\frac{1}{t}\,\psi\big(d(x,z)\big)\Big)\,dq_{13}(x,z)
&\le \int_{X\times X\times X} \varphi\Big(\frac{1}{t}\,\psi\big(d(x,y)+d(y,z)\big)\Big)\,dq_{123}(x,y,z)\\
&\le \int_{X\times X\times X} \varphi\Big(\frac{r}{t}\cdot\frac{1}{r}\,\psi\big(d(x,y)\big)+\frac{s}{t}\cdot\frac{1}{s}\,\psi\big(d(y,z)\big)\Big)\,dq_{123}(x,y,z)\\
&\le \frac{r}{t}\int \varphi\Big(\frac{1}{r}\,\psi\big(d(x,y)\big)\Big)\,dq_{123}(x,y,z)+\frac{s}{t}\int \varphi\Big(\frac{1}{s}\,\psi\big(d(y,z)\big)\Big)\,dq_{123}(x,y,z)\\
&\le \frac{r}{t}\cdot 1+\frac{s}{t}\cdot 1 = 1.
\end{align*}
Hence, $W_\vartheta(\mu_1,\mu_3)\le t$. This proves the triangle inequality.

To prove completeness, assume that $(\mu_k)_k$ is a $W_\vartheta$-Cauchy sequence, say $W_\vartheta(\mu_n,\mu_k)\le t_n$ for all $k\ge n$ with $t_n\to 0$ as $n\to\infty$. Then there exist couplings $q_{n,k}$ of $\mu_n$ and $\mu_k$ s.t.
\[ \int \varphi\Big(\frac{1}{t_n}\,\psi\big(d(x,y)\big)\Big)\,dq_{n,k}(x,y) \le 1. \tag{3} \]
Jensen's inequality implies
\[ \int \tilde d(x,y)\,dq_{n,k}(x,y) \le t_n\cdot\varphi^{-1}(1) \]
with $\tilde d(x,y):=\psi(d(x,y))$. The latter is a complete metric on $X$ with the same topology as $d$. That is, $(\mu_k)_k$ is a Cauchy sequence w.r.t. the $L^1$-Wasserstein distance on $\mathcal{P}(X,\tilde d)$. Because of completeness of $\mathcal{P}_1(X,\tilde d)$, we thus obtain an accumulation point $\mu$ and a converging subsequence $(\mu_{k_i})_i$. According to [3, Lemma 4.4], this also yields an accumulation point $q_n$ of the sequence $(q_{n,k_i})_i$. Continuity of the involved cost functions together with Fatou's lemma allows us to pass to the limit in (3) to derive
\[ \int \varphi\Big(\frac{1}{t_n}\,\psi\big(d(x,y)\big)\Big)\,dq_n(x,y) \le 1, \]
which proves that $W_\vartheta(\mu,\mu_n)\le t_n\to 0$ as $n\to\infty$.

With a similar argument, one verifies that $W_\vartheta(\mu,\nu)=0$ if and only if $\mu=\nu$.

Remark 3.3. For each pair of probability measures $\mu,\nu$ on $X$
\[ W_\vartheta(\mu,\nu)\le 1 \iff \inf_{q\in\Pi(\mu,\nu)} \int \vartheta\big(d(x,y)\big)\,dq(x,y) \le 1. \]

References

[1] R.M. Dudley, Real Analysis and Probability, Cambridge University Press, 2002.
[2] M.M. Rao, Z.D. Ren, Theory of Orlicz Spaces, Pure Appl. Math., Marcel Dekker, 1991.
[3] C. Villani, Optimal Transport, Old and New, Grundlehren Math. Wiss., vol. 338, Springer, Berlin/Heidelberg, 2009.