arxiv: v1 [math.ap] 10 Oct 2013


The Exponential Formula for the Wasserstein Metric

Katy Craig

10/10/13

arXiv:1310.2912v1 [math.AP] 10 Oct 2013

Abstract

We adapt Crandall and Liggett's method from the Banach space case to give a new proof of the exponential formula for the Wasserstein metric. In doing this, we introduce a new class of metrics, transport metrics, that have stronger convexity properties than the Wasserstein metric. With these, we prove an Euler-Lagrange equation characterizing the discrete gradient flow. We also prove an almost contraction inequality that controls the distance between discrete gradient flows with different initial data. Combining these results, we obtain the exponential formula and quantify the rate at which the discrete gradient flow converges to the continuous gradient flow. We then apply our estimates to give simple proofs of properties of the gradient flow, including the contracting semigroup property and energy dissipation inequality.

Contents

1 Introduction
2 Discrete Gradient Flow in $W_2$: Background and New Results
    2.1 Wasserstein Metric
    2.2 Geodesics and Generalized Geodesics
    2.3 Convexity
    2.4 Differentiability
    2.5 Transport Metrics
    2.6 Discrete Gradient Flow
    2.7 Euler-Lagrange Equation
    2.8 Discrete Variational Inequality
3 Exponential Formula for the Wasserstein Metric
    3.1 Almost Contraction Inequality
    3.2 Relation Between Proximal Maps with Different Time Steps
    3.3 Asymmetric Recursive Inequality
    3.4 Inductive Bound
    3.5 Exponential Formula for the Wasserstein Metric
    3.6 Gradient Flow with Initial Conditions $\mu \in D(E)$
4 Appendix
    4.1 Varying Time Steps
    4.2 Allowing $E(\mu) < +\infty$ when $\mu$ charges small sets

Work partially supported by U.S. National Science Foundation grants DMS and DMS. © 2013 by the author. This paper may be reproduced, in its entirety, for non-commercial purposes.

Key words: Wasserstein metric, gradient flow, exponential formula; Math Subject Classification: 47J, 49K, 49J

1 Introduction

Given a continuously differentiable, convex function $E : \mathbb{R}^d \to \mathbb{R} \cup \{+\infty\}$, the gradient flow of $E$ is the solution to the Cauchy problem

$$\frac{d}{dt} u(t) = -\nabla E(u(t)), \quad u(0) \in D(E) = \{v \in \mathbb{R}^d : E(v) < +\infty\}. \tag{1}$$

Through suitable generalizations of the notion of the gradient, the theory of gradient flows has been extensively studied in Hilbert spaces [3], Banach spaces [6, 7], nonpositively curved metric spaces [9], and general metric spaces [1, 5], including the space of probability measures with finite second moment $\mathcal{P}_2(\mathbb{R}^d)$ endowed with the Wasserstein metric $W_2$. Gradient flow in the Wasserstein metric is of particular interest due to the few restrictions it imposes on initial data (no regularity, just $\mathcal{P}_2(\mathbb{R}^d)$) and the wide variety of partial differential equations that can be studied with this perspective. Formally, the gradient vector field with respect to $W_2$ is $-\nabla \cdot \left( \mu \nabla \frac{\delta E}{\delta \rho}(\mu) \right)$ [1, 12]. However, to rigorously prove existence of the gradient flow and study its properties, one typically works with a discretized version of the problem that doesn't rely on a rigorous notion of the gradient.

In the Euclidean case, this discretized version of (1) is obtained by the implicit Euler method: the discrete gradient flow sequence with time step $\tau$ is

$$\frac{u_n - u_{n-1}}{\tau} = -\nabla E(u_n), \quad u_0 = u.$$

We define the proximal map $J_\tau : \mathbb{R}^d \to \mathbb{R}^d : u \mapsto J_\tau u$ by

$$\frac{J_\tau u - u}{\tau} = -\nabla E(J_\tau u) \iff J_\tau u = (\mathrm{id} + \tau \nabla E)^{-1} u. \tag{2}$$

Thus, the $n$th element of the discrete gradient flow sequence can be expressed as $J_\tau^n u$. Setting $\tau = t/n$ and sending $n \to \infty$ gives the exponential formula relating the discrete gradient flow to the gradient flow $u(t)$:

$$\lim_{n \to \infty} J_{t/n}^n u = \lim_{n \to \infty} \left( \mathrm{id} + \frac{t}{n} \nabla E \right)^{-n} u = u(t).$$

In this paper, we apply Crandall and Liggett's method for Banach space gradient flow [6] to prove the exponential formula for the Wasserstein metric, uniting the Banach space and Wasserstein theories.
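The Euclidean exponential formula can be checked by hand in the simplest case. The following is a minimal sketch, assuming the scalar quadratic energy $E(u) = \lambda u^2 / 2$ (an illustrative choice, not taken from the paper), for which the proximal map (2) is $J_\tau u = u/(1 + \tau\lambda)$ and the gradient flow is $u(t) = e^{-\lambda t} u$. The function names are hypothetical.

```python
import math


def proximal_step(u, tau, lam):
    """One implicit Euler step for E(u) = lam * u**2 / 2.

    Solves (v - u) / tau = -lam * v for v, i.e. v = (id + tau * grad E)^{-1} u.
    """
    return u / (1.0 + tau * lam)


def discrete_flow(u0, t, n, lam):
    """Apply the proximal map n times with time step tau = t / n."""
    tau = t / n
    u = u0
    for _ in range(n):
        u = proximal_step(u, tau, lam)
    return u


def exact_flow(u0, t, lam):
    """Exact gradient flow of E(u) = lam * u**2 / 2 started at u0."""
    return u0 * math.exp(-lam * t)


# Error between the discrete and continuous flow at time t = 1 for lam = 2.
errors = [
    abs(discrete_flow(1.0, 1.0, n, 2.0) - exact_flow(1.0, 1.0, 2.0))
    for n in (10, 100, 1000)
]
```

In this example the error shrinks as the number of steps grows, which is the scalar analogue of the convergence of $J_{t/n}^n u$ to $u(t)$.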
To generalize the notion of the proximal map to a metric space, note that equation (2) is the Euler-Lagrange equation corresponding to the minimization problem

$$J_\tau u = \operatorname*{argmin}_{v \in \mathbb{R}^d} \left\{ \frac{1}{2\tau} |v - u|^2 + E(v) \right\}.$$

This expression for $J_\tau$ allows us to define the proximal map and discrete gradient flow without a rigorous notion of the gradient. We define the Wasserstein proximal map by

$$J_\tau \mu := \operatorname*{argmin}_{\nu \in \mathcal{P}_2(\mathbb{R}^d)} \left\{ \frac{1}{2\tau} W_2^2(\nu, \mu) + E(\nu) \right\}. \tag{3}$$

Likewise, the exponential formula is

$$\lim_{n \to \infty} J_{t/n}^n \mu = S(t)\mu, \tag{4}$$

where $S(t)\mu$ denotes the gradient flow at time $t$ with initial conditions $\mu$. (See Definition 1.3.)

The exponential formula (4) was first proved in the Wasserstein metric by Ambrosio, Gigli, and Savaré, through a careful analysis of affine interpolations of functions of the discrete gradient flow, for example, interpolations of $W_2^2(J_\tau^{n-1}\mu, \nu)$ and $W_2^2(J_\tau^n \mu, \nu)$ [1]. With this method, they obtain the sharp rate of convergence of the discrete gradient flow to the gradient flow and use this to develop many properties of the gradient flow. In spite of the definitive results obtained by their method, Ambrosio, Gigli, and Savaré also raised the question of whether it might be possible to obtain the same results using Crandall and Liggett's method, bringing together the Banach space and Wasserstein theories of gradient flow. Crandall and Liggett's method is appealing due to its robustness and simplicity, relying on convexity inequalities to quantify the behavior of the discrete gradient flow and iterating these inequalities to prove convergence as the time step goes to zero [6].

At first glance, an adaptation of Crandall and Liggett's method to the Wasserstein metric seems unlikely. For $E$ convex, the generalization of $\nabla E$ in the Banach space case is an accretive operator, which, by definition, is an operator for which the corresponding proximal map satisfies

$$\|J_\tau u - J_\tau v\| \le \|u - v\|. \tag{5}$$

While such an inequality does hold in metric spaces of nonpositive curvature (Mayer used it to adapt Crandall and Liggett's method to this case [9]), the Wasserstein metric is nonnegatively curved [1, Theorem 7.3.2], and it is unknown if such a contraction holds.

Still, there exist almost contraction inequalities, such as [1, Lemma 4.2.4] or [4, Theorem 1.3], and Carlen and the author demonstrated their usefulness for studying qualitative properties of gradient flow, applying them to show that many of the remarkable features of the solutions to the porous media and fast diffusion equations, such as convergence to Barenblatt profiles, are also present in the discrete gradient flow [4]. In this paper, we use this type of almost contraction inequality to adapt Crandall and Liggett's method to the Wasserstein metric, giving new proofs of the exponential formula and of several properties of the gradient flow.

A fundamental difference between our method and Crandall and Liggett's is that our almost contraction inequality involves the square distance, rather than the distance itself. This prevents us from applying the triangle inequality, as they did, to control the distance between different elements of the discrete gradient flow. Furthermore, unlike in the Hilbertian case where $x \mapsto \frac{1}{2}\|x - y\|^2$ is 1-convex along geodesics, $\mu \mapsto \frac{1}{2} W_2^2(\mu, \omega)$ is not [1, Example 9.1.5]. This is a recurring difficulty when extending results from Hilbert and Banach spaces to the Wasserstein metric. Ambrosio, Gigli, and Savaré circumvent this in [1] by introducing a different class of curves, generalized geodesics, along which the square distance is 1-convex. We further develop this idea, introducing a class of transport metrics $W_{2,\omega}$ whose geodesics correspond exactly to the generalized geodesics. These metrics satisfy the key property that $\mu \mapsto \frac{1}{2} W_{2,\omega}^2(\mu, \omega)$ is convex along the geodesics induced by $W_{2,\omega}$. This turns out to be the essential fact needed to control the discrete gradient flow and adapt Crandall and Liggett's method to the Wasserstein case.

In sections 2.1 through 2.4, we recall general facts about the Wasserstein metric and functionals defined on this metric space. We will often impose the following assumptions on our functionals.

ASSUMPTION 1.1 (optional domain assumption). $E(\mu) < +\infty$ only if $\mu$ is absolutely continuous with respect to Lebesgue measure.

This assumption ensures that for all $\mu \in D(E)$ and $\nu \in \mathcal{P}(\mathbb{R}^d)$ there exists an optimal transport map $t_\mu^\nu$ from $\mu$ to $\nu$ [2] (see section 2.1). This is purely for notational convenience. In section 4.2 we describe how to remove this assumption.

ASSUMPTION 1.2 (convexity assumption). $E$ is proper, coercive, lower semicontinuous, and $\lambda$-convex along generalized geodesics for $\lambda \in \mathbb{R}$.

This assumption is essential. In particular, the fact that $E$ is $\lambda$-convex along generalized geodesics ensures that $E$ is $\lambda$-convex in the transport metric $W_{2,\omega}$.

In section 2.5, we define the transport metric $W_{2,\omega}$ and the corresponding subdifferential $\partial_{2,\omega}$ and study their properties. In section 2.6, we recall basic facts of the discrete gradient flow, the proximal map $J_\tau$, and the associated minimization problem. In section 2.7, we reframe the minimization problem in terms of the transport metrics, allowing us to prove an Euler-Lagrange equation for the minimizer $J_\tau \mu$. In section 2.8, we recall the discrete variational inequality from [1, Theorem 4.1.2] and prove a stronger version using transport metrics.

In section 3.1, we begin our proof of the exponential formula by proving an asymmetric almost contraction inequality. In section 3.2, we apply our Euler-Lagrange equation to obtain an expression relating proximal maps with different time steps. In sections 3.3 and 3.4, we combine these results to bound the distance between gradient flow sequences with different time steps via an asymmetric induction in the style of Rasmussen [13]. Finally, in section 3.5, we prove the exponential formula (4) and quantify the convergence of the discrete gradient flow to the gradient flow.

Following Ambrosio, Gigli, and Savaré [1, Equation (4.0.13)], we define the Wasserstein gradient flow as follows:

DEFINITION 1.3 (gradient flow). A curve $S(t)\mu : (0, +\infty) \to \mathcal{P}_2(\mathbb{R}^d)$ is the gradient flow of a functional $E$ with initial data $\mu \in D(E)$ if $S(t)\mu \to \mu$ as $t \to 0$ and

$$\frac{1}{2} \frac{d}{dt} W_2^2(S(t)\mu, \omega) + \frac{\lambda}{2} W_2^2(S(t)\mu, \omega) \le E(\omega) - E(S(t)\mu), \quad \forall \omega \in D(E), \text{ Lebesgue a.e. } t > 0. \tag{6}$$

We will sometimes refer to $S(t)\mu$ as the continuous gradient flow, to distinguish it from the discrete gradient flow.

In the particular case that $E$ satisfies convexity assumption 1.2 and is of the form

$$E(\mu) = \begin{cases} \int_{\mathbb{R}^d} F(x, \rho(x), \nabla \rho(x)) \, dx & \text{for } \mu = \rho \, dx, \ \rho \in C^1(\mathbb{R}^d), \\ +\infty & \text{otherwise,} \end{cases}$$

with $F \in C^2(\mathbb{R}^d \times [0, +\infty) \times \mathbb{R}^d)$, (6) is equivalent to the gradient flow $\mu_t := S(t)\mu$ satisfying

$$\frac{d}{dt} \mu_t = \nabla \cdot \left( \mu_t \nabla \frac{\delta E}{\delta \rho_t} \right), \quad \mu_t = \rho_t \, dx,$$

in a weak sense, in the duality with $C_c^\infty(\mathbb{R}^d \times (0, +\infty))$ [1, Lemma 10.4.1, Theorem 11.1.4]. Formally, $\nabla_W E(\mu) = -\nabla \cdot \left( \mu \nabla \frac{\delta E}{\delta \rho} \right)$ is the Wasserstein gradient vector field, where $\frac{\delta E}{\delta \rho_t}$ is the first variation of $E$ at $\rho_t$ [1, 12].

We close section 3.5 by applying our estimates to give simple proofs of properties of the continuous gradient flow, including the contracting semigroup property and the energy dissipation

inequality. Finally, in section 3.6, we extend our results, which only applied to gradient flows with initial conditions $\mu \in D(|\partial E|)$, to include initial conditions $\mu \in D(E)$. (See Definition 2.8 of the metric slope $|\partial E|$.)

In the appendix, we describe two extensions. In section 4.1, we adapt our proof of the exponential formula to include discrete gradient flows with varying time steps. In section 4.2, we describe how to remove the domain assumption 1.1, which we imposed for notational convenience.

This new proof of the convergence of discrete gradient flow suggests several directions for future work. Is there an underlying geometric structure that relates the transport metrics $W_{2,\omega}$ to the Wasserstein metric $W_2$, making the assumption that $E$ is convex along generalized geodesics more geometrically natural? Can this new proof of the exponential formula be used to study the behavior of the gradient flow as the functional $E$ is perturbed or regularized? Is there a generalization of $\nabla E$ in the Wasserstein metric that corresponds to the Banach space notion of an accretive operator?

Acknowledgements: The author thanks Prof. Giuseppe Savaré for suggesting this problem. The author also thanks Prof. Eric Carlen for suggesting the form of Theorem 3.2 and for many helpful conversations.

2 Discrete Gradient Flow in $W_2$: Background and New Results

2.1 Wasserstein Metric

Let $\mathcal{P}(\mathbb{R}^d)$ denote the set of Borel probability measures on $\mathbb{R}^d$. Given $\mu, \nu \in \mathcal{P}(\mathbb{R}^d)$, a Borel measurable map $t : \mathbb{R}^d \to \mathbb{R}^d$ transports $\mu$ onto $\nu$ if $\nu(B) = \mu(t^{-1}(B))$ for all Borel sets $B \subseteq \mathbb{R}^d$. We call $\nu$ the push-forward of $\mu$ under $t$ and write $\nu = t \# \mu$.

Consider a measure $\boldsymbol{\mu} \in \mathcal{P}(\mathbb{R}^d \times \mathbb{R}^d)$. (We distinguish probability measures on $\mathbb{R}^d \times \mathbb{R}^d$ or $\mathbb{R}^d \times \mathbb{R}^d \times \mathbb{R}^d$ from probability measures on $\mathbb{R}^d$ by writing them in bold font.) Let $\pi^1$ be the projection onto the first component of $\mathbb{R}^d \times \mathbb{R}^d$, and let $\pi^2$ be the projection onto the second component. The first and second marginals of $\boldsymbol{\mu}$ are $\pi^1 \# \boldsymbol{\mu} \in \mathcal{P}(\mathbb{R}^d)$ and $\pi^2 \# \boldsymbol{\mu} \in \mathcal{P}(\mathbb{R}^d)$.

Given $\mu, \nu \in \mathcal{P}(\mathbb{R}^d)$, the set of transport plans from $\mu$ to $\nu$ is

$$\Gamma(\mu, \nu) := \{ \boldsymbol{\mu} \in \mathcal{P}(\mathbb{R}^d \times \mathbb{R}^d) : \pi^1 \# \boldsymbol{\mu} = \mu, \ \pi^2 \# \boldsymbol{\mu} = \nu \}.$$

The Wasserstein distance between $\mu$ and $\nu$ is

$$W_2(\mu, \nu) := \left( \inf \left\{ \int_{\mathbb{R}^d \times \mathbb{R}^d} |x - y|^2 \, d\boldsymbol{\mu}(x, y) : \boldsymbol{\mu} \in \Gamma(\mu, \nu) \right\} \right)^{1/2}. \tag{7}$$

When $W_2(\mu, \nu) < +\infty$, there exist plans which attain the infimum. We denote this set of optimal transport plans by $\Gamma_0(\mu, \nu)$. When $\mu$ is absolutely continuous with respect to Lebesgue measure, there is a unique optimal transport plan from $\mu$ to $\nu$ of the form $(\mathrm{id} \times t) \# \mu$, where $\mathrm{id}(x) = x$ is the identity transformation and $t$ is unique $\mu$-a.e. [2]. In particular, there is a map $t$ satisfying $t \# \mu = \nu$ and

$$W_2(\mu, \nu) = \left( \int_{\mathbb{R}^d} |\mathrm{id} - t|^2 \, d\mu \right)^{1/2}.$$

We denote this unique optimal transport map by $t_\mu^\nu$. Furthermore, a Borel measurable map $t$ that transports $\mu$ to $\nu$ is optimal if and only if it is cyclically monotone $\mu$-a.e. [2, 10], i.e. if there exists $N \subseteq \mathbb{R}^d$ with $\mu(N) = 0$ such that for every finite sequence of distinct points $\{x_1, \dots, x_m\} \subseteq \mathbb{R}^d \setminus N$,

$$t(x_1) \cdot (x_2 - x_1) + t(x_2) \cdot (x_3 - x_2) + \dots + t(x_m) \cdot (x_1 - x_m) \le 0.$$

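One technical difficulty when working with the Wasserstein distance on $\mathcal{P}(\mathbb{R}^d)$ is that there exist measures that are infinite distances apart. Throughout this paper, we denote by $\omega_0$ some fixed reference measure and define

$$\mathcal{P}_{2,\omega_0}(\mathbb{R}^d) = \{ \mu \in \mathcal{P}(\mathbb{R}^d) : W_2(\mu, \omega_0) < +\infty \}.$$

By the triangle inequality, $(\mathcal{P}_{2,\omega_0}(\mathbb{R}^d), W_2)$ is a metric space. When $\omega_0 = \delta_0$, the Dirac mass at the origin, this is $\mathcal{P}_2(\mathbb{R}^d)$, the subset of $\mathcal{P}(\mathbb{R}^d)$ with finite second moment.

2.2 Geodesics and Generalized Geodesics

DEFINITION 2.1 (constant speed geodesic). Given a metric space $(X, d)$, a constant speed geodesic $u : [0, 1] \to X$ is a curve satisfying

$$d(u_\alpha, u_\beta) = |\beta - \alpha| \, d(u_0, u_1), \quad \text{for all } \alpha, \beta \in [0, 1].$$

We will often refer to constant speed geodesics simply as geodesics.

As shown in [1, Theorem 7.2.2], all geodesics in $\mathcal{P}_{2,\omega_0}(\mathbb{R}^d)$ are curves of the form

$$\mu_\alpha = \left( (1 - \alpha)\pi^1 + \alpha \pi^2 \right) \# \boldsymbol{\mu}, \quad \boldsymbol{\mu} \in \Gamma_0(\mu_0, \mu_1).$$

If $\mu_0$ is absolutely continuous with respect to Lebesgue measure, the geodesic from $\mu_0$ to $\mu_1$ is unique and of the form

$$\mu_\alpha = \left( (1 - \alpha)\mathrm{id} + \alpha t_{\mu_0}^{\mu_1} \right) \# \mu_0.$$

We now recall the definition of generalized geodesics from [1, Definition 9.2.2]. Given a finite product $\mathbb{R}^d \times \mathbb{R}^d \times \dots \times \mathbb{R}^d$, let $\pi^i$ be the projection onto the $i$th component and $\pi^{i,j}$ be the projection onto the $i$th and $j$th components.

DEFINITION 2.2 (generalized geodesic). Given $\mu_0, \mu_1, \omega \in \mathcal{P}_{2,\omega_0}(\mathbb{R}^d)$, a generalized geodesic from $\mu_0$ to $\mu_1$ with base $\omega$ is a curve $\mu_\alpha : [0, 1] \to \mathcal{P}(\mathbb{R}^d)$ of the form

$$\mu_\alpha := \left( (1 - \alpha)\pi^2 + \alpha \pi^3 \right) \# \boldsymbol{\mu},$$

where $\boldsymbol{\mu} \in \mathcal{P}(\mathbb{R}^d \times \mathbb{R}^d \times \mathbb{R}^d)$ satisfies

$$\pi^{1,2} \# \boldsymbol{\mu} \in \Gamma_0(\omega, \mu_0) \quad \text{and} \quad \pi^{1,3} \# \boldsymbol{\mu} \in \Gamma_0(\omega, \mu_1). \tag{8}$$

We refer to any $\boldsymbol{\mu} \in \mathcal{P}(\mathbb{R}^d \times \mathbb{R}^d \times \mathbb{R}^d)$ that satisfies (8) as a plan that induces a generalized geodesic from $\mu_0$ to $\mu_1$ with base $\omega$.

REMARK 2.3. Such a $\boldsymbol{\mu}$ always exists by [1, Lemma 5.3.2]. If the base $\omega$ equals either $\mu_0$ or $\mu_1$, then $\mu_\alpha$ is a geodesic joining $\mu_0$ and $\mu_1$.

REMARK 2.4. If $\omega$ is absolutely continuous with respect to Lebesgue measure, the generalized geodesic from $\mu_0$ to $\mu_1$ with base $\omega$ is unique and of the form

$$\mu_\alpha = \left( (1 - \alpha) t_\omega^{\mu_0} + \alpha t_\omega^{\mu_1} \right) \# \omega.$$

Since $(1 - \alpha) t_\omega^{\mu_0} + \alpha t_\omega^{\mu_1}$ is a positive combination of optimal transport maps, it is cyclically monotone, hence it is the optimal transport map from $\omega$ to $\mu_\alpha$.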

2.3 Convexity

Given a metric space $(X, d)$, we consider functionals $E : X \to \mathbb{R} \cup \{+\infty\}$ that satisfy the following conditions.

proper: $D(E) := \{u \in X : E(u) < +\infty\} \neq \emptyset$.

coercive: There exist $\tau_0 > 0$, $u_0 \in X$ such that

$$\inf_{v \in X} \left\{ \frac{1}{2\tau_0} d^2(u_0, v) + E(v) \right\} > -\infty. \tag{9}$$

lower semicontinuous: For all $u_n, u \in X$ such that $u_n \to u$, $\liminf_{n \to \infty} E(u_n) \ge E(u)$.

$\lambda$-convex along a curve $u_\alpha$: Given $\lambda \in \mathbb{R}$ and a curve $u_\alpha \in X$,

$$E(u_\alpha) \le (1 - \alpha) E(u_0) + \alpha E(u_1) - \alpha(1 - \alpha) \frac{\lambda}{2} d^2(u_0, u_1), \quad \forall \alpha \in [0, 1]. \tag{10}$$

$\lambda$-convex along geodesics: Given $\lambda \in \mathbb{R}$, for all $u_0, u_1 \in X$, there exists a geodesic $u_\alpha$ from $u_0$ to $u_1$ along which (10) holds.

We will often simply say that $E$ is $\lambda$-convex, or in the case $\lambda = 0$, convex.

Fix $\omega_0 \in \mathcal{P}(\mathbb{R}^d)$ and suppose $(X, d) = (\mathcal{P}_{2,\omega_0}(\mathbb{R}^d), W_2)$. In this setting, convexity is often referred to as displacement convexity [10], and we have the additional stronger notion of convexity along generalized geodesics [1, Definition 9.2.2].

DEFINITION 2.5 ($\lambda$-convex along generalized geodesics). Given $\lambda \in \mathbb{R}$, a functional $E : \mathcal{P}_{2,\omega_0}(\mathbb{R}^d) \to \mathbb{R} \cup \{+\infty\}$ is $\lambda$-convex along a generalized geodesic $\mu_\alpha$ if

$$E(\mu_\alpha) \le (1 - \alpha) E(\mu_0) + \alpha E(\mu_1) - \alpha(1 - \alpha) \frac{\lambda}{2} \int |x_2 - x_3|^2 \, d\boldsymbol{\mu}, \tag{11}$$

where $\boldsymbol{\mu}$ is the plan that induces the generalized geodesic. $E$ is convex along generalized geodesics if, for all $\mu_0, \mu_1, \omega \in \mathcal{P}_{2,\omega_0}(\mathbb{R}^d)$, there exists a generalized geodesic $\mu_\alpha$ from $\mu_0$ to $\mu_1$ with base $\omega$ along which $E$ is convex according to (11).

REMARK 2.6. This definition is slightly different from $E$ being $\lambda$-convex along all of the curves $\mu_\alpha$ according to equation (10), since

$$W_2^2(\mu_0, \mu_1) \le \int |x_2 - x_3|^2 \, d\boldsymbol{\mu}(x). \tag{12}$$

When $\lambda > 0$, equation (11) is stronger, and when $\lambda < 0$, it is weaker.

REMARK 2.7. When $\omega = \mu_0$ or $\mu_1$, $\mu_\alpha$ is simply the geodesic from $\mu_0$ to $\mu_1$ and equality holds in (12). Therefore, $\lambda$-convexity along generalized geodesics implies $\lambda$-convexity along geodesics.

2.4 Differentiability

Given a functional $E$ on a metric space $(X, d)$, we may consider the metric slope.

DEFINITION 2.8 (metric slope). Given a metric space $(X, d)$ and a functional $E : X \to \mathbb{R} \cup \{+\infty\}$, for every $u \in D(E)$, the metric slope of $E$ at $u$ is

$$|\partial E|(u) := \limsup_{v \to u} \frac{(E(u) - E(v))^+}{d(u, v)}.$$

If our metric space is $(X, d) = (\mathcal{P}_{2,\omega_0}(\mathbb{R}^d), W_2)$, we may also consider the subdifferential of $E$ [1, Definition 10.1.1]. For ease of notation, we assume $E$ satisfies domain assumption 1.1. This ensures that, for any $\mu \in D(E)$, $\nu \in \mathcal{P}_{2,\omega_0}(\mathbb{R}^d)$, there exists a unique optimal transport map $t_\mu^\nu$ from $\mu$ to $\nu$. The subdifferential can be defined without this assumption, but the notation becomes more cumbersome. We explain how to extend these results to the general case in section 4.2.

The Wasserstein subdifferential, as originally defined in [1, Section 10], is inspired by the standard Euclidean subdifferential. One simply replaces the Euclidean distance with the Wasserstein metric and the vector $v - u$ with the transport map $t_\mu^\nu - \mathrm{id}$.

DEFINITION 2.9 (Wasserstein subdifferential). Consider $E : \mathcal{P}_{2,\omega_0}(\mathbb{R}^d) \to \mathbb{R} \cup \{+\infty\}$ proper, lower semicontinuous, and satisfying domain assumption 1.1. Given $\mu \in D(|\partial E|)$, $\xi \in L^2(\mu)$ belongs to the Wasserstein subdifferential of $E$ at $\mu$ if

$$E(\nu) - E(\mu) \ge \int_{\mathbb{R}^d} \langle \xi, t_\mu^\nu - \mathrm{id} \rangle \, d\mu + o(W_2(\mu, \nu)) \quad \text{as } \nu \xrightarrow{W_2} \mu.$$

We write $\xi \in \partial E(\mu)$.

REMARK 2.10 (Wasserstein subdifferential and metric slope). [1, Lemma 10.1.5] relates the Wasserstein subdifferential and the metric slope: If $E$ satisfies Assumptions 1.1 and 1.2, then $\mu \in D(|\partial E|)$ if and only if $\partial E(\mu)$ is nonempty. In this case, $|\partial E|(\mu) = \min\{ \|\xi\|_{L^2(\mu)} : \xi \in \partial E(\mu) \}$.

Finally, we recall the definition of the strong subdifferential from [1, 10.1.1]. This quantifies the rate of change of $E$ at $\mu$ when approaching $\mu$ via any transport map, optimal or not.

DEFINITION 2.11 (strong subdifferential). Consider $E : \mathcal{P}_{2,\omega_0}(\mathbb{R}^d) \to \mathbb{R} \cup \{+\infty\}$ proper, lower semicontinuous, and satisfying domain assumption 1.1. $\xi \in \partial E(\mu)$ is a strong subdifferential in case for all measurable maps $t : \mathbb{R}^d \to \mathbb{R}^d$ such that $\|t - \mathrm{id}\|_{L^2(\mu)} < +\infty$,

$$E(t \# \mu) - E(\mu) \ge \int_{\mathbb{R}^d} \langle \xi, t - \mathrm{id} \rangle \, d\mu + o(\|t - \mathrm{id}\|_{L^2(\mu)}) \quad \text{as } t \xrightarrow{L^2(\mu)} \mathrm{id}.$$

2.5 Transport Metrics

A recurring difficulty in extending results from a Hilbert space $(H, \|\cdot\|)$ to the Wasserstein metric $(\mathcal{P}_{2,\omega_0}, W_2)$ is that while $x \mapsto \frac{1}{2}\|x - y\|^2$

is 1-convex along geodesics, $\mu \mapsto \frac{1}{2} W_2^2(\mu, \omega)$ is not [1, Example 9.1.5]. Ambrosio, Gigli, and Savaré circumvent this difficulty by introducing the notion of generalized geodesics and showing that $\mu \mapsto \frac{1}{2} W_2^2(\mu, \omega)$ is 1-convex along generalized geodesics with base $\omega$ [1, Lemma 9.2.1].

In this section, we introduce a class of metrics whose geodesics correspond exactly to the generalized geodesics with a given base. Furthermore, these metrics satisfy the key property that the square metric is convex along geodesics. This convexity turns out to provide the necessary control over the discrete gradient flow to adapt Crandall and Liggett's method to the Wasserstein metric.

For simplicity of notation, we make the following assumption on the measure $\omega$:

ASSUMPTION 2.12 ($\omega$ doesn't charge small sets). $\omega \in \mathcal{P}(\mathbb{R}^d)$ is absolutely continuous with respect to Lebesgue measure.

This ensures the existence of an optimal transport map $t_\omega^\mu$ from $\omega$ to any $\mu \in \mathcal{P}_{2,\omega}(\mathbb{R}^d)$ [2]. We use these optimal transport maps to define the $(2, \omega)$-transport distance. See section 4.2 for how to extend this definition to $\omega$ that are not absolutely continuous with respect to Lebesgue measure.

DEFINITION 2.13 ($(2, \omega)$-transport metric). The $(2, \omega)$-transport metric is $W_{2,\omega} : \mathcal{P}_{2,\omega}(\mathbb{R}^d) \times \mathcal{P}_{2,\omega}(\mathbb{R}^d) \to \mathbb{R}$,

$$W_{2,\omega}(\mu, \nu) := \left( \int |t_\omega^\mu - t_\omega^\nu|^2 \, d\omega \right)^{1/2}.$$

REMARK 2.14. If $\mu = \omega$ or $\nu = \omega$, this reduces to the Wasserstein metric. In general, $W_{2,\omega}(\mu, \nu) \ge W_2(\mu, \nu)$.

In the following proposition, we prove a few key properties of transport metrics. In particular, we show that the geodesics of the $W_{2,\omega}$ metric are exactly the generalized geodesics with base $\omega$, and hence the function $\mu \mapsto W_{2,\omega}^2(\nu, \mu)$ is convex in this metric for any $\nu \in \mathcal{P}_{2,\omega}(\mathbb{R}^d)$.

PROPOSITION 2.15 (properties of the $(2, \omega)$-transport metric).

(i) $W_{2,\omega}$ is a metric on $\mathcal{P}_{2,\omega}(\mathbb{R}^d)$.

(ii) The constant speed geodesics with respect to the $W_{2,\omega}$ metric are exactly the generalized geodesics with base $\omega$. Furthermore, these generalized geodesics $\mu_\alpha$ satisfy

$$W_{2,\omega}^2(\nu, \mu_\alpha) = (1 - \alpha) W_{2,\omega}^2(\nu, \mu_0) + \alpha W_{2,\omega}^2(\nu, \mu_1) - \alpha(1 - \alpha) W_{2,\omega}^2(\mu_0, \mu_1) \quad \forall \nu \in \mathcal{P}_{2,\omega}(\mathbb{R}^d). \tag{13}$$

(iii) Generalized geodesics with base $\omega$ are the unique constant speed geodesics in the $W_{2,\omega}$ metric. Consequently, a functional $E$ is $\lambda$-convex along generalized geodesics with base $\omega$ if and only if it is $\lambda$-convex in the $W_{2,\omega}$ metric. In particular, the function $\mu \mapsto W_{2,\omega}^2(\nu, \mu)$ is 2-convex in the $W_{2,\omega}$ metric for any $\nu \in \mathcal{P}_{2,\omega}(\mathbb{R}^d)$.

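Proof. (i) $W_{2,\omega}$ is symmetric and nonnegative by definition. It is non-degenerate since

$$0 = W_{2,\omega}(\mu, \nu) \ge W_2(\mu, \nu) \implies \mu = \nu.$$

$W_{2,\omega}$ satisfies the triangle inequality since $L^2(\omega)$ satisfies the triangle inequality:

$$W_{2,\omega}(\mu, \nu) = \|t_\omega^\mu - t_\omega^\nu\|_{L^2(\omega)} \le \|t_\omega^\mu - t_\omega^\rho\|_{L^2(\omega)} + \|t_\omega^\rho - t_\omega^\nu\|_{L^2(\omega)} = W_{2,\omega}(\mu, \rho) + W_{2,\omega}(\rho, \nu).$$

(ii) Let $\mu_\alpha := ((1 - \alpha) t_\omega^{\mu_0} + \alpha t_\omega^{\mu_1}) \# \omega$ be the generalized geodesic with base $\omega$ from $\mu_0$ to $\mu_1$ at time $\alpha \in [0, 1]$. By Remark 2.4, $t_\omega^{\mu_\alpha} = (1 - \alpha) t_\omega^{\mu_0} + \alpha t_\omega^{\mu_1}$. Consequently,

$$W_{2,\omega}(\mu_\alpha, \mu_\beta) = \left( \int \left| \left( (1 - \alpha) t_\omega^{\mu_0} + \alpha t_\omega^{\mu_1} \right) - \left( (1 - \beta) t_\omega^{\mu_0} + \beta t_\omega^{\mu_1} \right) \right|^2 d\omega \right)^{1/2} = \left( \int \left| (\beta - \alpha) t_\omega^{\mu_0} + (\alpha - \beta) t_\omega^{\mu_1} \right|^2 d\omega \right)^{1/2} = |\beta - \alpha| \, W_{2,\omega}(\mu_0, \mu_1).$$

This shows that $\mu_\alpha$ is a constant speed geodesic. The second result follows from the corresponding identity of the $L^2(\omega)$ norm:

$$W_{2,\omega}^2(\nu, \mu_\alpha) = \|(1 - \alpha) t_\omega^{\mu_0} + \alpha t_\omega^{\mu_1} - t_\omega^\nu\|_{L^2(\omega)}^2 = (1 - \alpha) \|t_\omega^{\mu_0} - t_\omega^\nu\|_{L^2(\omega)}^2 + \alpha \|t_\omega^{\mu_1} - t_\omega^\nu\|_{L^2(\omega)}^2 - \alpha(1 - \alpha) \|t_\omega^{\mu_0} - t_\omega^{\mu_1}\|_{L^2(\omega)}^2 = (1 - \alpha) W_{2,\omega}^2(\mu_0, \nu) + \alpha W_{2,\omega}^2(\mu_1, \nu) - \alpha(1 - \alpha) W_{2,\omega}^2(\mu_0, \mu_1).$$

(iii) Suppose $\tilde{\mu}_\alpha$ is a constant speed geodesic in the $W_{2,\omega}$ metric from $\mu_0$ to $\mu_1$. Let $\mu_\alpha := ((1 - \alpha) t_\omega^{\mu_0} + \alpha t_\omega^{\mu_1}) \# \omega$ be the generalized geodesic with base $\omega$ from $\mu_0$ to $\mu_1$. Setting $\nu = \tilde{\mu}_\alpha$ in equation (13) gives

$$W_{2,\omega}^2(\tilde{\mu}_\alpha, \mu_\alpha) = (1 - \alpha) W_{2,\omega}^2(\tilde{\mu}_\alpha, \mu_0) + \alpha W_{2,\omega}^2(\tilde{\mu}_\alpha, \mu_1) - \alpha(1 - \alpha) W_{2,\omega}^2(\mu_0, \mu_1).$$

Using the fact that $\tilde{\mu}_\alpha$ is a constant speed geodesic shows

$$W_{2,\omega}^2(\tilde{\mu}_\alpha, \mu_\alpha) = (1 - \alpha)\alpha^2 W_{2,\omega}^2(\mu_1, \mu_0) + \alpha(1 - \alpha)^2 W_{2,\omega}^2(\mu_0, \mu_1) - \alpha(1 - \alpha) W_{2,\omega}^2(\mu_0, \mu_1) = (\alpha + (1 - \alpha) - 1)(1 - \alpha)\alpha W_{2,\omega}^2(\mu_0, \mu_1) = 0.$$

Therefore $\tilde{\mu}_\alpha = \mu_\alpha$ and generalized geodesics are the unique constant speed geodesics in the $W_{2,\omega}$ metric.

We may define the subdifferential with respect to $W_{2,\omega}$ in analogy with the Wasserstein subdifferential, Definition 2.9.

DEFINITION 2.16 ($W_{2,\omega}$ subdifferential). Given $E : \mathcal{P}_{2,\omega}(\mathbb{R}^d) \to \mathbb{R} \cup \{+\infty\}$ proper and lower semicontinuous in $W_{2,\omega}$, $\xi \in L^2(\omega)$ belongs to the $W_{2,\omega}$ subdifferential $\partial_{2,\omega} E(\mu)$ in case

$$E(\nu) - E(\mu) \ge \int \langle \xi, t_\omega^\nu - t_\omega^\mu \rangle \, d\omega + o(W_{2,\omega}(\mu, \nu)) \quad \text{as } \nu \to \mu.$$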
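As a numerical illustration of identity (13) from Proposition 2.15, the following sketch represents each transport map from the base $\omega$ by its values on a grid. It assumes $\omega$ is the uniform measure on $(0, 1)$, approximated by midpoints, and uses three nondecreasing maps chosen only for the example (in one dimension, nondecreasing maps are optimal). None of these choices come from the paper.

```python
import math


def sq_transport_dist(map_a, map_b):
    """Approximate W_{2,omega}^2 as the mean of |t_a - t_b|^2 over the grid."""
    return sum((p - q) ** 2 for p, q in zip(map_a, map_b)) / len(map_a)


# Midpoint grid standing in for omega = uniform measure on (0, 1).
n = 1000
grid = [(i + 0.5) / n for i in range(n)]

# Nondecreasing maps from omega to mu_0, mu_1, and nu (illustrative choices).
t_mu0 = [2.0 * x for x in grid]
t_mu1 = [x ** 3 + 1.0 for x in grid]
t_nu = [math.exp(x) for x in grid]


def identity_13_sides(alpha):
    """Return the left and right sides of identity (13) at time alpha."""
    t_alpha = [(1 - alpha) * p + alpha * q for p, q in zip(t_mu0, t_mu1)]
    lhs = sq_transport_dist(t_nu, t_alpha)
    rhs = (
        (1 - alpha) * sq_transport_dist(t_nu, t_mu0)
        + alpha * sq_transport_dist(t_nu, t_mu1)
        - alpha * (1 - alpha) * sq_transport_dist(t_mu0, t_mu1)
    )
    return lhs, rhs
```

Both sides agree up to floating-point error for every `alpha` in $[0, 1]$, since the identity holds pointwise for the interpolated maps.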

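REMARK 2.17 (lower semicontinuity in $W_2$ vs. $W_{2,\omega}$). By Remark 2.14, $W_{2,\omega}$ induces a stronger topology than $W_2$: convergence in $W_{2,\omega}$ implies convergence in $W_2$. Consequently, if $E$ is lower semicontinuous in $W_2$, it is lower semicontinuous in $W_{2,\omega}$.

REMARK 2.18 (additivity of $W_{2,\omega}$ subdifferential). If $\xi_1 \in \partial_{2,\omega} E_1(\mu)$ and $\xi_2 \in \partial_{2,\omega} E_2(\mu)$,

$$E_1(\nu) + E_2(\nu) - E_1(\mu) - E_2(\mu) \ge \int \langle \xi_1 + \xi_2, t_\omega^\nu - t_\omega^\mu \rangle \, d\omega + o(W_{2,\omega}(\mu, \nu)),$$

so $\xi_1 + \xi_2 \in \partial_{2,\omega}(E_1 + E_2)(\mu)$.

The next proposition provides a characterization of the $W_{2,\omega}$ subdifferential for functionals that are convex in $W_{2,\omega}$, in analogy with [1, Equation (10.1.7)].

PROPOSITION 2.19 ($W_{2,\omega}$ subdifferential for convex function). Given $E$ satisfying the conditions of Definition 2.16, $\xi \in \partial_{2,\omega} E(\mu)$ if and only if

$$E(\nu) - E(\mu) \ge \int \langle \xi, t_\omega^\nu - t_\omega^\mu \rangle \, d\omega + \frac{\lambda}{2} W_{2,\omega}^2(\mu, \nu) \quad \forall \nu. \tag{14}$$

Proof. If (14) holds, then $\xi \in \partial_{2,\omega} E(\mu)$ by Definition 2.16. For the converse, assume $\xi \in \partial_{2,\omega} E(\mu)$. Define $\mu_\alpha = ((1 - \alpha) t_\omega^\mu + \alpha t_\omega^\nu) \# \omega$ to be the generalized geodesic from $\mu$ to $\nu$ with base $\omega$. Since $E$ is $\lambda$-convex in the $W_{2,\omega}$ metric,

$$\frac{E(\mu_\alpha) - E(\mu)}{\alpha} \le E(\nu) - E(\mu) - \frac{\lambda}{2}(1 - \alpha) W_{2,\omega}^2(\mu, \nu). \tag{15}$$

By Proposition 2.15, $W_{2,\omega}(\mu, \mu_\alpha) = \alpha W_{2,\omega}(\mu, \nu)$, and by Remark 2.4, $t_\omega^{\mu_\alpha} = (1 - \alpha) t_\omega^\mu + \alpha t_\omega^\nu$. Combining these with the definition of $\xi \in \partial_{2,\omega} E(\mu)$ gives

$$\liminf_{\alpha \to 0} \frac{E(\mu_\alpha) - E(\mu)}{\alpha} \ge \liminf_{\alpha \to 0} \frac{1}{\alpha} \int \langle \xi, t_\omega^{\mu_\alpha} - t_\omega^\mu \rangle \, d\omega = \liminf_{\alpha \to 0} \frac{1}{\alpha} \int \langle \xi, (1 - \alpha) t_\omega^\mu + \alpha t_\omega^\nu - t_\omega^\mu \rangle \, d\omega = \int \langle \xi, t_\omega^\nu - t_\omega^\mu \rangle \, d\omega.$$

Sending $\alpha \to 0$ in equation (15) shows

$$E(\nu) - E(\mu) \ge \int \langle \xi, t_\omega^\nu - t_\omega^\mu \rangle \, d\omega + \frac{\lambda}{2} W_{2,\omega}^2(\mu, \nu).$$

COROLLARY 2.20. Given $E$ satisfying the conditions of Definition 2.16 with $\lambda \ge 0$, $\mu$ is a minimizer for $E$ if and only if $0 \in \partial_{2,\omega} E(\mu)$.

PROPOSITION 2.21 ($W_{2,\omega}$ subdifferential of $W_2^2(\omega, \cdot)$). The $W_{2,\omega}$ subdifferential of $W_2^2(\omega, \cdot)$ evaluated at $\mu$ contains the element $2(t_\omega^\mu - \mathrm{id})$.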

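Proof.

$$W_2^2(\omega, \nu) - W_2^2(\omega, \mu) = \int |t_\omega^\nu - \mathrm{id}|^2 \, d\omega - \int |t_\omega^\mu - \mathrm{id}|^2 \, d\omega$$
$$= \int |t_\omega^\nu - t_\omega^\mu|^2 + 2\langle t_\omega^\nu, t_\omega^\mu \rangle - 2\langle t_\omega^\nu, \mathrm{id} \rangle + 2\langle t_\omega^\mu, \mathrm{id} \rangle - 2|t_\omega^\mu|^2 \, d\omega$$
$$= W_{2,\omega}^2(\mu, \nu) + 2\int \langle t_\omega^\nu, t_\omega^\mu - \mathrm{id} \rangle + \langle t_\omega^\mu, \mathrm{id} - t_\omega^\mu \rangle \, d\omega$$
$$= W_{2,\omega}^2(\mu, \nu) + 2\int \langle t_\omega^\nu - t_\omega^\mu, t_\omega^\mu - \mathrm{id} \rangle \, d\omega.$$

By Proposition 2.19, this implies that $2(t_\omega^\mu - \mathrm{id}) \in \partial_{2,\omega} W_2^2(\omega, \mu)$.

Finally, if $E$ has a strong subdifferential (Definition 2.11), $E$ has a $W_{2,\omega}$ subdifferential.

LEMMA 2.22 (strong subdifferential vs. $W_{2,\omega}$ subdifferential). Given $E$ satisfying the conditions of Definition 2.11, if $\xi \in \partial E(\mu)$ is a strong subdifferential, then $\xi \circ t_\omega^\mu \in \partial_{2,\omega} E(\mu)$.

Proof. If $E$ has a strong subdifferential $\xi$ at $\mu$, $\xi \in L^2(\mu)$, hence $\xi \circ t_\omega^\mu \in L^2(\omega)$. Furthermore,

$$E(\nu) - E(\mu) \ge \int_{\mathbb{R}^d} \langle \xi, t_\omega^\nu \circ t_\mu^\omega - \mathrm{id} \rangle \, d\mu + o(\|t_\omega^\nu \circ t_\mu^\omega - \mathrm{id}\|_{L^2(\mu)}) = \int_{\mathbb{R}^d} \langle \xi \circ t_\omega^\mu, t_\omega^\nu - t_\omega^\mu \rangle \, d\omega + o(W_{2,\omega}(\mu, \nu)) \quad \forall \nu.$$

Therefore, $\xi \circ t_\omega^\mu \in \partial_{2,\omega} E(\mu)$.

2.6 Discrete Gradient Flow

Given a functional $E$, a time step $\tau > 0$, and $\mu, \nu \in \mathcal{P}_{2,\omega_0}(\mathbb{R}^d)$, the quadratic perturbation of $E$ is

$$\Phi(\tau, \mu; \nu) := \frac{1}{2\tau} W_2^2(\mu, \nu) + E(\nu). \tag{16}$$

The proximal set $J_\tau : \mathcal{P}_{2,\omega_0}(\mathbb{R}^d) \to 2^{\mathcal{P}_{2,\omega_0}(\mathbb{R}^d)}$ corresponding to $E$ is

$$J_\tau(\mu) := \operatorname*{argmin}_{\nu \in \mathcal{P}_{2,\omega_0}(\mathbb{R}^d)} \left\{ \frac{1}{2\tau} W_2^2(\mu, \nu) + E(\nu) \right\}. \tag{17}$$

Define $J_0(\mu) := \mu$.

For the remainder of this section, we consider functionals that satisfy the convexity assumption 1.2. In order to jointly consider the cases $\lambda \ge 0$ and $\lambda < 0$, we follow [1] and define the negative part of $\lambda$ to be

$$\lambda^- := \begin{cases} -\lambda & \text{if } \lambda < 0, \\ 0 & \text{if } \lambda \ge 0. \end{cases}$$

In the case $\lambda \ge 0$, we interpret $\frac{1}{\lambda^-} = +\infty$.

Suppose $\mu \in D(E)$ and $0 < \tau < \frac{1}{\lambda^-}$. (When $\lambda < 0$, the size restriction $0 < \tau < \frac{1}{\lambda^-}$ ensures that $0 < 1 + \lambda\tau < 1$.) Then there exists a unique element in $J_\tau(\mu)$ and the proximal map $J_\tau : D(E) \to D(E) : \mu \mapsto \mu_\tau$ is continuous [1, Theorem 4.1.2].

In [1, Theorem 3.1.6], Ambrosio, Gigli, and Savaré unite the notions of subdifferential and proximal map through the following chain of inequalities. Recall that $|\partial E| : \mathcal{P}_{2,\omega_0} \to \mathbb{R} \cup \{+\infty\}$ is the metric slope; see Definition 2.8.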

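THEOREM AGS1. Given $E$ satisfying convexity assumption 1.2, $\mu \in D(|\partial E|)$, and $0 < \tau < \frac{1}{\lambda^-}$,

$$\frac{\tau^2}{2} |\partial E|^2(\mu_\tau) \le \frac{1}{2} W_2^2(\mu, \mu_\tau) \le \frac{\tau}{1 + \lambda\tau} \left( E(\mu) - E(\mu_\tau) - \frac{1}{2\tau} W_2^2(\mu, \mu_\tau) \right) \le \frac{\tau^2}{2(1 + \lambda\tau)^2} |\partial E|^2(\mu). \tag{18}$$

The discrete gradient flow sequence with time step $\tau$ is constructed via repeated applications of the proximal map:

$$\mu_n = J_\tau(\mu_{n-1}), \quad \mu_0 \in D(E).$$

We write $J_\tau^n$ to indicate $n$ repeated applications of the proximal map, so that $\mu_n = J_\tau^n \mu_0$.

2.7 Euler-Lagrange Equation

THEOREM 2.23 (Euler-Lagrange equation). Assume that $E$ satisfies assumptions 1.1 and 1.2 and $\omega \in D(E)$. Then for $0 < \tau < \frac{1}{\lambda^-}$, $\nu$ is the unique minimizer of the quadratic perturbation $\Phi(\tau, \omega; \cdot)$ if and only if

$$\frac{1}{\tau}(t_\nu^\omega - \mathrm{id}) \in \partial E(\nu) \text{ is a strong subdifferential.} \tag{19}$$

Hence, $\omega_\tau$ is characterized by the fact that $\frac{1}{\tau}(t_{\omega_\tau}^\omega - \mathrm{id}) \in \partial E(\omega_\tau)$.

We assume $\omega \in D(E)$ and $E$ satisfies domain assumption 1.1 to ease notation. See section 4.2 for how the assumption on $\omega$ can be relaxed to $\omega \in \overline{D(E)}$ and the domain assumption can be removed.

Proof of Theorem 2.23. The fact that $\nu$ minimizes $\Phi(\tau, \omega; \cdot)$ implies $\frac{1}{\tau}(t_\nu^\omega - \mathrm{id}) \in \partial E(\nu)$ is a strong subdifferential is proved in [1, Lemma 10.1.2] using a type of argument introduced by Otto [1, 12].

To see the other direction, note that if $\frac{1}{\tau}(t_\nu^\omega - \mathrm{id}) \in \partial E(\nu)$ is a strong subdifferential, then by Lemma 2.22, $\frac{1}{\tau}(\mathrm{id} - t_\omega^\nu) \in \partial_{2,\omega} E(\nu)$. Combining Remark 2.18 and Proposition 2.21 shows

$$\frac{1}{2\tau} \cdot 2(t_\omega^\nu - \mathrm{id}) + \frac{1}{\tau}(\mathrm{id} - t_\omega^\nu) = 0 \in \partial_{2,\omega} \Phi(\tau, \omega; \nu).$$

Since $W_2^2(\omega, \cdot) = W_{2,\omega}^2(\omega, \cdot)$ is 2-convex in the $W_{2,\omega}$ metric and $E$ is $\lambda$-convex in the $W_{2,\omega}$ metric, $\Phi(\tau, \omega; \cdot)$ is $\left( \frac{1}{\tau} + \lambda \right)$-convex in the $W_{2,\omega}$ metric, with $\left( \frac{1}{\tau} + \lambda \right) > 0$. Therefore, by Corollary 2.20, when $0 < \tau < \frac{1}{\lambda^-}$, $0 \in \partial_{2,\omega} \Phi(\tau, \omega; \nu)$, and hence $\nu$ minimizes $\Phi(\tau, \omega; \cdot)$.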
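To make (19) concrete, consider the potential energy $E(\nu) = \int V \, d\nu$ on $\mathbb{R}$ with $V(x) = x^2/2$. This functional is an assumption made for the example; it is not discussed in the text, and it does not satisfy the optional domain assumption 1.1. For a potential energy the proximal map acts pointwise, here as the push-forward under $x \mapsto x/(1 + \tau)$, so the optimal map from $\nu$ back to $\omega$ is $y \mapsto (1 + \tau) y$ and $\frac{1}{\tau}(t_\nu^\omega - \mathrm{id})(y) = y = V'(y)$. The sketch below checks this on a sample of particles standing in for $\omega$; all names are hypothetical.

```python
import random


def prox_quadratic_potential(x, tau):
    """Pointwise proximal step for V(x) = x**2 / 2.

    Minimizes |y - x|**2 / (2 * tau) + y**2 / 2 over y, giving y = x / (1 + tau).
    """
    return x / (1.0 + tau)


random.seed(0)
tau = 0.1

# Particles standing in for the base measure omega (standard Gaussian sample).
omega_particles = [random.gauss(0.0, 1.0) for _ in range(1000)]

# Particles of nu = J_tau(omega): each atom of omega is moved by the prox map.
nu_particles = [prox_quadratic_potential(x, tau) for x in omega_particles]

# Euler-Lagrange residual: (t(y) - y) / tau - V'(y), where t(y) = x is the
# atom of omega that was mapped to y, and V'(y) = y.
el_residual = max(
    abs((x - y) / tau - y) for x, y in zip(omega_particles, nu_particles)
)
```

In this example `el_residual` vanishes up to rounding, which is the particle version of the Euler-Lagrange equation for this particular energy.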

2.8 Discrete Variational Inequality

The notion of a discrete variational inequality was introduced in [1] to gain quantitative control over the discrete gradient flow for functionals that are convex along generalized geodesics. The starting point for this variational inequality in [1] is the observation that, if $E$ is convex along generalized geodesics, it satisfies [1, Assumption 4.0.1], which we recall for the reader's convenience:

ASSUMPTION 4.0.1 [1]. Given $\lambda \in \mathbb{R}$, for every choice of $\mu$, $\nu_0$, and $\nu_1 \in D(E)$ and $0 < \tau < \frac{1}{\lambda^-}$, there exists a curve $\nu_\alpha$, $\alpha \in [0, 1]$, such that

$$\nu \mapsto \Phi(\tau, \mu; \nu) \text{ is } \left( \frac{1}{\tau} + \lambda \right)\text{-convex on } \nu_\alpha.$$

In other words, the quadratic perturbation satisfies the inequality

$$\Phi(\tau, \mu; \nu_\alpha) \le (1 - \alpha)\Phi(\tau, \mu; \nu_0) + \alpha \Phi(\tau, \mu; \nu_1) - \frac{1 + \lambda\tau}{2\tau} \alpha(1 - \alpha) W_2^2(\nu_0, \nu_1). \tag{20}$$

The convexity of $\Phi(\tau, \mu; \cdot)$ implies the following discrete variational inequality [1, Theorem 4.1.2].

THEOREM AGS2. If [1, Assumption 4.0.1] holds and $E$ is proper, coercive, and lower semicontinuous, then for all $0 < \tau < \frac{1}{\lambda^-}$, $\mu \in D(E)$, and $\nu \in D(E)$,

$$\frac{1}{2\tau} \left[ W_2^2(\mu_\tau, \nu) - W_2^2(\mu, \nu) \right] + \frac{\lambda}{2} W_2^2(\mu_\tau, \nu) \le E(\nu) - E(\mu_\tau) - \frac{1}{2\tau} W_2^2(\mu, \mu_\tau). \tag{21}$$

For our purposes, we require not only control of the Wasserstein metric along the discrete gradient flows, but also control over transport metrics along discrete gradient flows. Luckily, the convexity of $E$ along generalized geodesics implies something slightly stronger than [1, Assumption 4.0.1], and we are able to obtain the following slightly stronger notion of convexity of the quadratic perturbation.

In the next two theorems, we assume the base point $\mu \ll \mathcal{L}^d$ so that the transport metric $W_{2,\mu}$ is well defined by Definition 2.13. As before, this assumption is only for ease of notation, and we describe how to remove it in section 4.2.

THEOREM 2.24 (transport metric convexity of quadratic perturbation). Fix $\mu \in \mathcal{P}_{2,\omega_0}(\mathbb{R}^d)$. If $E$ is $\lambda$-convex along generalized geodesics, then for $0 < \tau < \frac{1}{\lambda^-}$,

$$\nu \mapsto \Phi(\tau, \mu; \nu) \text{ is } \left( \frac{1}{\tau} + \lambda \right)\text{-convex on generalized geodesics with base point } \mu.$$

In other words, for every choice of $\nu_0, \nu_1 \in D(E)$, there exists a generalized geodesic $\nu_\alpha$ from $\nu_0$ to $\nu_1$ with base $\mu$ such that

$$\Phi(\tau, \mu; \nu_\alpha) \le (1 - \alpha)\Phi(\tau, \mu; \nu_0) + \alpha \Phi(\tau, \mu; \nu_1) - \frac{1 + \lambda\tau}{2\tau} \alpha(1 - \alpha) W_{2,\mu}^2(\nu_0, \nu_1). \tag{22}$$

Proof. By Proposition 2.15, $\frac{1}{2\tau} W_2^2(\mu, \cdot) = \frac{1}{2\tau} W_{2,\mu}^2(\mu, \cdot)$ is $\frac{1}{\tau}$-convex along all generalized geodesics with base $\mu$. Therefore, if $E$ is $\lambda$-convex along generalized geodesics, their sum $\Phi(\tau, \mu; \cdot)$ is $\left( \frac{1}{\tau} + \lambda \right)$-convex along generalized geodesics with base $\mu$.

15 5 THEOREM 2.25 (discrete variational inequality). Suppose E satisfies assumption.2. Then for all µ D(E) and ν D(E), or, equivalently, 2 W 2 2,µ(µ, ν) W 2 2 (µ, ν)] + λ 2 W 2 2,µ(µ, ν) E(ν) E(µ ) 2 W 2 2 (µ, µ ) ( + λ)w 2 2,µ(µ, ν) W 2 2 (µ, ν) 2 E(ν) E(µ ) ] 2 W 2 2 (µ, µ ) Proof. The following proof is nearly identical to, Theorem 4..2 (ii)], except for the stronger convexity assumption on Φ. By Theorem 2.24, there exists a generalized geodesic ν α from µ to ν with base point µ along which Φ(, µ; ) satisfies inequality (22). Combining this with the fact that µ is the minimizer of Φ(, µ; ) gives Φ(, µ; µ ) Φ(, µ; µ α ) ( α)φ(, µ; µ ) + αφ(, µ; ν) + λ α( α)w 2 2 2,µ(µ, ν). Rearranging and dividing by α, 0 Φ(, µ; ν) Φ(, µ; µ ) + λ ( α)w 2 2 2,µ(µ, ν). Sending α 0 and expanding Φ according to its definition gives the result. 3 Exponential Formula for the Wasserstein Metric Given E satisfying convexity assumption.2, we aim to show that, as the time step goes to zero, the discrete gradient flow converges to the continuous gradient flow lim J n n t/n µ = S(t)µ. (23) The key difficulty in showing (23) is proving that the limit exists, which we accomplish by proving the sequence is Cauchy and using the fact that W 2 is complete, Prop 7..5]. First we consider initial data µ D( E ). In section 3.6, we extend our results to µ D(E). 3. Almost Contraction Inequality In this subsection, we use the discrete variational inequality Theorem AGS2 to prove an almost contraction inequality for the discrete gradient flow. (Theorem AGS2 is sufficient for this purpose we use the stronger discrete variational inequality of Theorem 2.25 in a later section.) Our approach is similar to 4], though instead of symmetrizing the contraction inequality, we leave the inequality in an asymmetric form that is more compatible with the asymmetric induction in sections 3.3 and 3.4. The asymmetry useful a second time when we consider gradient flow with initial conditions ν D(E) see section 3.6. 
For the $\lambda \le 0$ case, we follow the proof of [1, Lemma 4.2.4]. For the $\lambda > 0$ case, we use a new approach. In this case, we rely on the fact that $\lambda > 0$ implies $E$ is bounded below [1, Lemma 2.4.8].

THEOREM 3.1 (almost contraction inequality). Suppose $E$ satisfies convexity assumption 1.2, $\mu \in D(|\partial E|)$, and $\nu \in D(E)$. If $\lambda > 0$, then for all $\tau > 0$,
\[ (1+\lambda\tau)^2\,W_2^2(\mu_\tau,\nu_\tau) \le W_2^2(\mu,\nu) + \tau^2|\partial E|^2(\mu) + 2\lambda\tau^2\left[ E(\nu) - \inf E \right]. \tag{24} \]

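If $\lambda \le 0$, then for all $0 < \tau < -1/\lambda$,
\[ (1+\lambda\tau)^2\,W_2^2(\mu_\tau,\nu_\tau) \le W_2^2(\mu,\nu) + \tau^2|\partial E|^2(\mu). \tag{25} \]
When $\lambda > 0$, $(1+\lambda\tau)^2$ may be large. Consequently, it is not surprising that we must compensate with extra terms on the right hand side of (24) that are not needed when $\lambda \le 0$.

Proof. By Theorem AGS2, recalled for the reader's convenience in section 2.8,
\[ (1+\lambda\tau)\,W_2^2(\mu_\tau,\nu_\tau) - W_2^2(\mu,\nu_\tau) \le 2\tau\left( E(\nu_\tau) - E(\mu_\tau) - \frac{1}{2\tau}W_2^2(\mu,\mu_\tau) \right), \tag{26} \]
\[ (1+\lambda\tau)\,W_2^2(\nu_\tau,\mu) - W_2^2(\nu,\mu) \le 2\tau\left( E(\mu) - E(\nu_\tau) - \frac{1}{2\tau}W_2^2(\nu,\nu_\tau) \right). \tag{27} \]

Consider the case $\lambda > 0$. Dropping the $\frac{1}{2\tau}W_2^2(\nu,\nu_\tau)$ term from (27), dividing by $(1+\lambda\tau)$, and adding to (26) gives
\[ (1+\lambda\tau)\,W_2^2(\mu_\tau,\nu_\tau) - \frac{1}{1+\lambda\tau}W_2^2(\mu,\nu) \le 2\tau\left( E(\nu_\tau) - \frac{1}{1+\lambda\tau}E(\nu_\tau) + \frac{1}{1+\lambda\tau}E(\mu) - E(\mu_\tau) - \frac{1}{2\tau}W_2^2(\mu,\mu_\tau) \right), \]
that is,
\[ (1+\lambda\tau)^2\,W_2^2(\mu_\tau,\nu_\tau) - W_2^2(\mu,\nu) \le 2\tau\left( (1+\lambda\tau)E(\nu_\tau) - E(\nu_\tau) + E(\mu) - (1+\lambda\tau)\left[ E(\mu_\tau) + \frac{1}{2\tau}W_2^2(\mu,\mu_\tau) \right] \right). \]
Since $\lambda > 0$, $E$ is bounded below [1, Lemma 2.4.8]. Applying Theorem AGS and the fact that $E(\mu_\tau) \le E(\mu)$, we have
\[ (1+\lambda\tau)^2\,W_2^2(\mu_\tau,\nu_\tau) - W_2^2(\mu,\nu) \le 2\lambda\tau^2 E(\nu_\tau) + \frac{\tau^2}{1+\lambda\tau}|\partial E|^2(\mu) - 2\lambda\tau^2 \inf E \le \tau^2|\partial E|^2(\mu) + 2\lambda\tau^2\left[ E(\nu) - \inf E \right], \]
which gives the result.

Now consider the case $\lambda \le 0$. Adding (26) and (27) and then applying Theorem AGS gives
\[ (1+\lambda\tau)\,W_2^2(\mu_\tau,\nu_\tau) - W_2^2(\nu,\mu) + \lambda\tau\,W_2^2(\nu_\tau,\mu) \le 2\tau\left[ E(\mu) - E(\mu_\tau) - \frac{1}{2\tau}W_2^2(\mu,\mu_\tau) \right] - W_2^2(\nu,\nu_\tau) \le \frac{\tau^2}{1+\lambda\tau}|\partial E|^2(\mu) - W_2^2(\nu,\nu_\tau). \tag{28} \]
Since for $a, b > 0$ and $0 < \epsilon < 1$, the convex function
\[ \phi(\epsilon) := \frac{a^2}{\epsilon} + \frac{b^2}{1-\epsilon} \]
has the minimum value $(a+b)^2$, attained at $\epsilon = a/(a+b)$, we have
\[ (a+b)^2 \le \frac{a^2}{\epsilon} + \frac{b^2}{1-\epsilon}. \]
Consequently, with $\epsilon := -\lambda\tau$, we obtain
\[ W_2^2(\nu_\tau,\mu) \le \left( W_2(\nu_\tau,\nu) + W_2(\nu,\mu) \right)^2 \le \frac{1}{-\lambda\tau}W_2^2(\nu_\tau,\nu) + \frac{1}{1+\lambda\tau}W_2^2(\nu,\mu). \tag{29} \]
Multiplying by $-\lambda\tau$, summing with (28), multiplying the total by $(1+\lambda\tau)$, and using the fact that $-\lambda\tau < 1$, we obtain
\[ (1+\lambda\tau)^2\,W_2^2(\mu_\tau,\nu_\tau) \le W_2^2(\mu,\nu) + \tau^2|\partial E|^2(\mu), \]
which gives the result.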
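The elementary inequality behind (29) can be checked numerically. The sketch below is only a sanity check, not part of the proof: it evaluates $\phi(\epsilon) = a^2/\epsilon + b^2/(1-\epsilon)$ on a grid and at the claimed minimizer $\epsilon = a/(a+b)$. The function names and the grid size are illustrative choices.

```python
def phi(eps, a, b):
    """phi(eps) = a^2/eps + b^2/(1 - eps), defined for 0 < eps < 1."""
    return a * a / eps + b * b / (1.0 - eps)


def check_split(a, b, grid=999):
    """Compare phi on a uniform grid with its value at eps = a/(a+b).

    Returns (smallest grid value, phi at a/(a+b), (a+b)^2).
    """
    grid_min = min(phi(k / (grid + 1), a, b) for k in range(1, grid + 1))
    eps_star = a / (a + b)
    return grid_min, phi(eps_star, a, b), (a + b) ** 2


if __name__ == "__main__":
    grid_min, at_star, target = check_split(1.0, 3.0)
    print(grid_min, at_star, target)
```

For any $a, b > 0$ the second and third returned values should agree up to rounding, and no grid value should fall below $(a+b)^2$.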

3.2 Relation Between Proximal Maps with Different Time Steps

We now apply the Euler-Lagrange equation, Theorem 2.23, to prove a theorem relating the proximal map with a large time step $\tau$ to the proximal map with a small time step $h$. Assumption 1.1 is purely for notational convenience. See Theorem 4.1 for the general case.

THEOREM 3.2. Suppose $E$ satisfies assumptions 1.1 and 1.2. Then if $\mu \in D(E)$ and $0 < h \le \tau < 1/\lambda^-$,
\[ J_\tau\mu = J_h\left[ \left( \frac{\tau-h}{\tau}\,t_\mu^{\mu_\tau} + \frac{h}{\tau}\,\mathrm{id} \right)\#\mu \right]. \]

COROLLARY 3.3. Under the assumptions of the previous theorem, if $\mu \in D(E)$, $n \ge 1$,
\[ J_\tau^n\mu = J_\tau(J_\tau^{n-1}\mu) = J_h\left[ \left( \frac{\tau-h}{\tau}\,t_{J_\tau^{n-1}\mu}^{J_\tau^n\mu} + \frac{h}{\tau}\,\mathrm{id} \right)\#J_\tau^{n-1}\mu \right]. \]

Proof of Theorem 3.2. By Theorem 2.23,
\[ \xi := \frac{1}{\tau}\left( t_{\mu_\tau}^{\mu} - \mathrm{id} \right) \in \partial E(\mu_\tau) \tag{30} \]
is a strong subdifferential. Next, since $h/\tau < 1$,
\[ (\mathrm{id} + h\xi) = \left( \mathrm{id} + \frac{h}{\tau}\left( t_{\mu_\tau}^{\mu} - \mathrm{id} \right) \right) = \left( \frac{\tau-h}{\tau}\,\mathrm{id} + \frac{h}{\tau}\,t_{\mu_\tau}^{\mu} \right) \]
is cyclically monotone. Consequently, if we define $\nu := (\mathrm{id} + h\xi)\#\mu_\tau$, the optimal transport map is $t_{\mu_\tau}^{\nu} = \mathrm{id} + h\xi$. Rearranging shows $\frac{1}{h}\left( t_{\mu_\tau}^{\nu} - \mathrm{id} \right) = \xi \in \partial E(\mu_\tau)$, so by a second application of Theorem 2.23, $\mu_\tau = \nu_h$.

We now rewrite $\nu$ as a push forward of $\mu$ to obtain the result. By equation (30), $t_{\mu_\tau}^{\mu} = (\mathrm{id} + \tau\xi)$, so
\[ (\mathrm{id} + h\xi) = \left( \frac{\tau-h}{\tau}\,t_\mu^{\mu_\tau} + \frac{h}{\tau}\,\mathrm{id} \right) \circ (\mathrm{id} + \tau\xi). \]
Therefore,
\[ \nu = (\mathrm{id} + h\xi)\#\mu_\tau = \left( \left( \frac{\tau-h}{\tau}\,t_\mu^{\mu_\tau} + \frac{h}{\tau}\,\mathrm{id} \right) \circ (\mathrm{id} + \tau\xi) \right)\#\mu_\tau = \left( \frac{\tau-h}{\tau}\,t_\mu^{\mu_\tau} + \frac{h}{\tau}\,\mathrm{id} \right)\#\mu. \]

After proving Theorem 3.2, we discovered another proof of the same result in [8, 9]. It is nonvariational and quite different from the proof given above, and we hope our proof is of independent interest.

3.3 Asymmetric Recursive Inequality

The following inequality bounds the Wasserstein distance between discrete gradient flow sequences with different time steps in terms of a convex combination of earlier elements of the sequences, plus a small error term. The recursion of this inequality is asymmetric: the $(n, m)$th term is controlled in terms of the $(n-1, m-1)$th term and the $(n, m-1)$th term. A fundamental difference between Crandall and Liggett's asymmetric recursive inequality and Theorem 3.4 is that the former involves the distance while the latter involves the square distance.
(This is a consequence of the fact that our contraction inequality, Theorem 3.1, involves the square distance plus error terms.) Therefore, where Crandall and Liggett are able to use the triangle inequality, we have to use the convexity of the square transport metrics. Passing from the transport metrics back to the Wasserstein metric consumes the bulk of the proof.

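[Figure: lattice diagram with axes $n$ and $m$ marking the points $(n, m)$, $(n-1, m-1)$, and $(n, m-1)$.]

To consider $\lambda \ge 0$ and $\lambda < 0$ jointly in the following theorem, we replace $\lambda$ by $-\lambda^-$: any function that is $\lambda$ convex is also $-\lambda^-$ convex.

THEOREM 3.4 (asymmetric recursive inequality). Suppose $E$ satisfies convexity assumption 1.2 and $\mu \in D(|\partial E|)$. If $0 < h \le \tau < 1/\lambda^-$,
\[ (1-\lambda^- h)^2\,W_2^2(J_\tau^n\mu, J_h^m\mu) \le \frac{h}{\tau}(1-\lambda^-\tau)^{-1}\,W_2^2(J_\tau^{n-1}\mu, J_h^{m-1}\mu) + \frac{\tau-h}{\tau}\,W_2^2(J_\tau^n\mu, J_h^{m-1}\mu) + 2h^2(1-\lambda^- h)^{-2m}|\partial E|^2(\mu). \]

Proof. To simplify notation, we abbreviate $J_\tau^n\mu$ by $J^n$ and $J_h^m\mu$ by $J^m$. First, note that
\begin{align*}
(1-\lambda^- h)^2\,W_2^2(J^n, J^m) &= (1-\lambda^- h)^2\,W_2^2\left( J_h\left( \mu^{J^{n-1}\to J^n}_{\frac{\tau-h}{\tau}} \right), J^m \right) && \text{by Theorem 3.2} \\
&\le W_2^2\left( \mu^{J^{n-1}\to J^n}_{\frac{\tau-h}{\tau}}, J^{m-1} \right) + h^2|\partial E|^2(J^{m-1}) && \text{by Theorem 3.1} \\
&\le W_{2,J^{n-1}}^2\left( \mu^{J^{n-1}\to J^n}_{\frac{\tau-h}{\tau}}, J^{m-1} \right) + h^2|\partial E|^2(J^{m-1}).
\end{align*}
By Proposition 2.5, the $W_{2,J^{n-1}}$ metric is convex along generalized geodesics with base $J^{n-1}$. In particular, it is convex along the geodesic $\mu^{J^{n-1}\to J^n}_\alpha$, which gives
\[ (1-\lambda^- h)^2\,W_2^2(J^n, J^m) \le \frac{h}{\tau}\,W_{2,J^{n-1}}^2(J^{n-1}, J^{m-1}) + \frac{\tau-h}{\tau}\,W_{2,J^{n-1}}^2(J^n, J^{m-1}) + h^2|\partial E|^2(J^{m-1}). \tag{31} \]
The first term on the right hand side coincides with the standard Wasserstein metric. To control the second term, we use the stronger version of the discrete variational inequality, Theorem 2.25. Specifically, replacing $(\mu, \nu)$ in Theorem 2.25 with $(J^{m-1}, J^n)$ and $(J^{n-1}, J^{m-1})$ gives
\begin{align*}
(1-\lambda^- h)\,W_{2,J^{m-1}}^2(J^m, J^n) - W_2^2(J^{m-1}, J^n) &\le 2h\left[ E(J^n) - E(J^m) \right] - W_2^2(J^{m-1}, J^m), \\
(1-\lambda^-\tau)\,W_{2,J^{n-1}}^2(J^n, J^{m-1}) - W_2^2(J^{n-1}, J^{m-1}) &\le 2\tau\left[ E(J^{m-1}) - E(J^n) \right] - W_2^2(J^{n-1}, J^n).
\end{align*}
Multiplying the first inequality by $\tau$, the second inequality by $h$, adding them together, and then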

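applying Theorem AGS gives
\begin{align*}
\tau(1-\lambda^- h)\,W_{2,J^{m-1}}^2(J^m, J^n) &+ h(1-\lambda^-\tau)\,W_{2,J^{n-1}}^2(J^n, J^{m-1}) \\
&\le \tau W_2^2(J^{m-1}, J^n) + h W_2^2(J^{n-1}, J^{m-1}) + 2\tau h\left[ E(J^{m-1}) - E(J^m) - \frac{1}{2h}W_2^2(J^{m-1}, J^m) \right] - h W_2^2(J^{n-1}, J^n) \\
&\le \tau W_2^2(J^{m-1}, J^n) + h W_2^2(J^{n-1}, J^{m-1}) + \frac{\tau h^2}{1-\lambda^- h}|\partial E|^2(J^{m-1}) - h W_2^2(J^{n-1}, J^n). \tag{32}
\end{align*}
As in equation (29), we have
\[ \lambda^-\tau\,W_{2,J^{n-1}}^2(J^{m-1}, J^n) \le W_2^2(J^{n-1}, J^n) + \frac{\lambda^-\tau}{1-\lambda^-\tau}\,W_2^2(J^{n-1}, J^{m-1}). \]
Multiplying this by $h$ and adding it to (32) gives
\[ \tau(1-\lambda^- h)\,W_{2,J^{m-1}}^2(J^m, J^n) + h\,W_{2,J^{n-1}}^2(J^n, J^{m-1}) \le \tau W_2^2(J^{m-1}, J^n) + \frac{h}{1-\lambda^-\tau}\,W_2^2(J^{n-1}, J^{m-1}) + \frac{\tau h^2}{1-\lambda^- h}|\partial E|^2(J^{m-1}). \]
Rearranging and dividing by $h$ gives the upper bound
\[ W_{2,J^{n-1}}^2(J^{m-1}, J^n) \le \frac{\tau}{h}\left( W_2^2(J^{m-1}, J^n) - (1-\lambda^- h)\,W_{2,J^{m-1}}^2(J^m, J^n) \right) + \frac{1}{1-\lambda^-\tau}\,W_2^2(J^{n-1}, J^{m-1}) + \frac{\tau h}{1-\lambda^- h}|\partial E|^2(J^{m-1}). \tag{33} \]
We now combine this with equation (31) to prove the theorem. Substituting (33) into (31) and using $(1-\lambda^- h) \ge (1-\lambda^- h)^2$ gives
\begin{align*}
(1-\lambda^- h)^2\,W_2^2(J^n, J^m) \le{}& \frac{h}{\tau}\,W_2^2(J^{n-1}, J^{m-1}) + h^2|\partial E|^2(J^{m-1}) \\
&+ \frac{\tau-h}{\tau}\left[ \frac{\tau}{h}\left( W_2^2(J^{m-1}, J^n) - (1-\lambda^- h)^2\,W_2^2(J^m, J^n) \right) + \frac{1}{1-\lambda^-\tau}\,W_2^2(J^{m-1}, J^{n-1}) + \frac{\tau h}{1-\lambda^- h}|\partial E|^2(J^{m-1}) \right].
\end{align*}
Simplifying and rearranging,
\[ \frac{\tau}{h}(1-\lambda^- h)^2\,W_2^2(J^n, J^m) \le \left( \frac{h}{\tau} + \frac{\tau-h}{\tau}(1-\lambda^-\tau)^{-1} \right) W_2^2(J^{n-1}, J^{m-1}) + \frac{\tau-h}{h}\,W_2^2(J^n, J^{m-1}) + h^2|\partial E|^2(J^{m-1}) + \frac{(\tau-h)h}{1-\lambda^- h}|\partial E|^2(J^{m-1}). \]
Therefore,
\begin{align*}
(1-\lambda^- h)^2\,W_2^2(J^n, J^m) &\le \frac{h}{\tau}\,\frac{1-\lambda^- h}{1-\lambda^-\tau}\,W_2^2(J^{n-1}, J^{m-1}) + \frac{\tau-h}{\tau}\,W_2^2(J^n, J^{m-1}) + \left[ \frac{h^3}{\tau} + \frac{\tau-h}{\tau}\,\frac{h^2}{1-\lambda^- h} \right]|\partial E|^2(J^{m-1}) \\
&\le \frac{h}{\tau}\,\frac{1}{1-\lambda^-\tau}\,W_2^2(J^{n-1}, J^{m-1}) + \frac{\tau-h}{\tau}\,W_2^2(J^n, J^{m-1}) + \frac{2h^2}{1-\lambda^- h}|\partial E|^2(J^{m-1}),
\end{align*}
since $0 < h \le \tau < 1/\lambda^-$. Finally, applying Theorem AGS and the fact that $(1-\lambda^- h) \ge (1-\lambda^- h)^2$ gives the result:
\[ (1-\lambda^- h)^2\,W_2^2(J^n, J^m) \le \frac{h}{\tau}\,\frac{1}{1-\lambda^-\tau}\,W_2^2(J^{n-1}, J^{m-1}) + \frac{\tau-h}{\tau}\,W_2^2(J^n, J^{m-1}) + 2h^2(1-\lambda^- h)^{-2m}|\partial E|^2(\mu). \]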
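The last steps of this proof rest on two pieces of scalar algebra in the time steps. The sketch below checks them in exact rational arithmetic, under the assumed reading that $\tau$ is the large step, $h$ the small step, and $\lambda^- = \max\{0, -\lambda\}$, with $0 < h \le \tau < 1/\lambda^-$. The function names are illustrative and the check is not a substitute for the argument.

```python
from fractions import Fraction


def combined_coefficient(h, tau, lam_minus):
    """Coefficient h/tau + ((tau - h)/tau) / (1 - lam_minus*tau)."""
    return h / tau + (tau - h) / tau / (1 - lam_minus * tau)


def closed_form(h, tau, lam_minus):
    """Claimed simplification (1 - lam_minus*h) / (1 - lam_minus*tau)."""
    return (1 - lam_minus * h) / (1 - lam_minus * tau)


def error_terms(h, tau, lam_minus):
    """Return (error coefficient before the bound, 2h^2/(1 - lam_minus*h))."""
    before = h**3 / tau + (tau - h) / tau * h**2 / (1 - lam_minus * h)
    after = 2 * h**2 / (1 - lam_minus * h)
    return before, after


if __name__ == "__main__":
    h, tau, lam_minus = Fraction(1, 20), Fraction(1, 4), Fraction(3)
    print(combined_coefficient(h, tau, lam_minus), closed_form(h, tau, lam_minus))
    print(error_terms(h, tau, lam_minus))
```

Using `Fraction` keeps the comparison exact, so the first identity can be tested with `==` rather than a floating-point tolerance.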

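3.4 Inductive Bound

The following inductive bound follows the simplification of Crandall and Liggett's method introduced by Rasmussen in [3]. (See also [5].) One key difference is that, in the Banach space case, one works with the distance, rather than the square distance. While this complicated matters in the proof of Theorem 3.4, it simplifies the induction in Theorem 3.6. We begin by proving a bound on the distance between the 0th and nth terms of the discrete gradient flow sequence.

LEMMA 3.5. Given $E$ as in Assumption 1.2 and $\mu \in D(|\partial E|)$, for all $0 < \tau < 1/\lambda^-$,
\[ W_2(J_\tau^n\mu, \mu) \le n\tau(1-\lambda^-\tau)^{-n}|\partial E|(\mu). \]

Proof. This follows from the triangle inequality, Theorem AGS, and the inequalities $\frac{1}{1+\lambda\tau} \le \frac{1}{1-\lambda^-\tau}$ and $1 \le \frac{1}{1-\lambda^-\tau}$:
\[ W_2(J_\tau^n\mu, \mu) \le \sum_{i=1}^n W_2(J_\tau^i\mu, J_\tau^{i-1}\mu) \le \sum_{i=1}^n \frac{\tau}{1+\lambda\tau}|\partial E|(J_\tau^{i-1}\mu) \le \tau\sum_{i=1}^n (1+\lambda\tau)^{-i}|\partial E|(\mu) \le n\tau(1-\lambda^-\tau)^{-n}|\partial E|(\mu). \]

THEOREM 3.6 (a Rasmussen type inductive bound). Suppose $E$ satisfies convexity assumption 1.2. Then if $\mu \in D(|\partial E|)$ and $0 < h \le \tau < 1/\lambda^-$,
\[ W_2^2(J_\tau^n\mu, J_h^m\mu) \le \left[ (n\tau - mh)^2 + \tau hm + 2\tau^2 n \right](1-\lambda^-\tau)^{-2n}(1-\lambda^- h)^{-2m}|\partial E|^2(\mu). \tag{34} \]

Proof. We proceed by induction. The base case, when either $n = 0$ or $m = 0$, follows from the linear growth estimate Lemma 3.5. We assume the inequality holds for $(n-1, m)$ and $(n, m)$ and show that this implies it holds for $(n, m+1)$. First, we apply the Asymmetric Recursive Inequality, Theorem 3.4,
\[ (1-\lambda^- h)^2\,W_2^2(J_\tau^n\mu, J_h^{m+1}\mu) \le \frac{h}{\tau}(1-\lambda^-\tau)^{-1}\,W_2^2(J_\tau^{n-1}\mu, J_h^m\mu) + \frac{\tau-h}{\tau}\,W_2^2(J_\tau^n\mu, J_h^m\mu) + 2h^2(1-\lambda^- h)^{-2(m+1)}|\partial E|^2(\mu). \]
Next, we divide by $(1-\lambda^- h)^2$ and apply the inductive hypothesis:
\begin{align*}
W_2^2(J_\tau^n\mu, J_h^{m+1}\mu) \le{}& \frac{h}{\tau}\left[ ((n-1)\tau - mh)^2 + \tau hm + 2\tau^2(n-1) \right](1-\lambda^-\tau)^{-1-2(n-1)}(1-\lambda^- h)^{-2(m+1)}|\partial E|^2(\mu) \\
&+ \frac{\tau-h}{\tau}\left[ (n\tau - mh)^2 + \tau hm + 2\tau^2 n \right](1-\lambda^-\tau)^{-2n}(1-\lambda^- h)^{-2(m+1)}|\partial E|^2(\mu) \\
&+ 2h^2(1-\lambda^- h)^{-2(m+1)-2}|\partial E|^2(\mu).
\end{align*}
To control the first term, note that $(1-\lambda^-\tau)^{-1}(1-\lambda^-\tau)^{-2(n-1)} = (1-\lambda^-\tau)^{-2n+1} \le (1-\lambda^-\tau)^{-2n}$ and
\[ \left[ ((n-1)\tau - mh)^2 + \tau hm + 2\tau^2(n-1) \right] = \left[ (n\tau - mh)^2 - 2\tau(n\tau - mh) + \tau^2 + \tau hm + 2\tau^2(n-1) \right]. \]
To control the third term, note that since $0 < h \le \tau < 1/\lambda^-$, $(1-\lambda^- h)^{-2} \le (1-\lambda^-\tau)^{-2} \le (1-\lambda^-\tau)^{-2n}$.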

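Using these estimates, we may group together the three terms and obtain the following bound:
\[ W_2^2(J_\tau^n\mu, J_h^{m+1}\mu) \le \left\{ \frac{h}{\tau}\left[ (n\tau - mh)^2 - 2\tau(n\tau - mh) + \tau^2 + \tau hm + 2\tau^2(n-1) \right] + \frac{\tau-h}{\tau}\left[ (n\tau - mh)^2 + \tau hm + 2\tau^2 n \right] + 2h^2 \right\}(1-\lambda^-\tau)^{-2n}(1-\lambda^- h)^{-2(m+1)}|\partial E|^2(\mu). \]
We now consider the convex combination (plus an additional $2h^2$ term) within the brackets:
\begin{align*}
&\frac{h}{\tau}\left[ (n\tau - mh)^2 - 2\tau(n\tau - mh) + \tau^2 + \tau hm + 2\tau^2(n-1) \right] + \frac{\tau-h}{\tau}\left[ (n\tau - mh)^2 + \tau hm + 2\tau^2 n \right] + 2h^2 \\
&\quad= \frac{h}{\tau}\left[ (n\tau - mh)^2 + \tau hm + 2\tau^2 n \right] + \frac{\tau-h}{\tau}\left[ (n\tau - mh)^2 + \tau hm + 2\tau^2 n \right] + \frac{h}{\tau}\left[ -2\tau(n\tau - mh) - \tau^2 \right] + 2h^2 \\
&\quad= \left[ (n\tau - mh)^2 + \tau hm + 2\tau^2 n \right] - 2(n\tau - mh)h - \tau h + 2h^2 \\
&\quad= (n\tau - mh)^2 - 2(n\tau - mh)h + \tau hm - \tau h + 2\tau^2 n + 2h^2 \\
&\quad= (n\tau - (m+1)h)^2 + h^2 + \tau h(m+1) - 2\tau h + 2\tau^2 n \\
&\quad\le (n\tau - (m+1)h)^2 + \tau h(m+1) + 2\tau^2 n.
\end{align*}
Therefore,
\[ W_2^2(J_\tau^n\mu, J_h^{m+1}\mu) \le \left[ (n\tau - (m+1)h)^2 + \tau h(m+1) + 2\tau^2 n \right](1-\lambda^-\tau)^{-2n}(1-\lambda^- h)^{-2(m+1)}|\partial E|^2(\mu). \]

3.5 Exponential Formula for the Wasserstein Metric

We now combine our previous results to prove the exponential formula for the Wasserstein metric.

THEOREM 3.7 (exponential formula). Suppose $E$ satisfies convexity assumption 1.2. For $\mu \in D(|\partial E|)$, $t \ge 0$, the discrete gradient flow sequence $J^n_{t/n}\mu$ converges as $n \to \infty$. Denote the limit by $S(t)\mu$. The convergence is uniform in $t$ on compact subsets of $[0, +\infty)$, and when $n \ge 2\lambda^- t$, the distance between $J^n_{t/n}\mu$ and $S(t)\mu$ is bounded by
\[ W_2(J^n_{t/n}\mu, S(t)\mu) \le \sqrt{3}\,\frac{t}{\sqrt{n}}\,e^{3\lambda^- t}|\partial E|(\mu). \tag{35} \]

REMARK 3.8 (range of $S(t)$). Given $\mu \in D(|\partial E|)$, we may use the fact that $|\partial E|$ is lower semicontinuous [1, Corollary 2.4.10] and Theorem AGS to conclude
\[ |\partial E|(S(t)\mu) \le \liminf_{n\to\infty} |\partial E|(J^n_{t/n}\mu) \le \liminf_{n\to\infty} (1-\lambda^- t/n)^{-n}|\partial E|(\mu) = e^{\lambda^- t}|\partial E|(\mu). \]
Therefore, $S(t)\mu \in D(|\partial E|)$.

We have shown $W_2(J^n_{t/n}\mu, S(t)\mu) \le O(n^{-1/2})$, which agrees with the rate Crandall and Liggett obtained in a Banach space [6]. By a different method, Ambrosio, Gigli, and Savaré showed $W_2(J^n_{t/n}\mu, S(t)\mu) \le O(n^{-1})$ [1, Theorem 4.0.4], which agrees with the optimal rate in a Hilbert space [4]. Our rate improves upon the rate obtained by Clément and Desch [5], $d(J^n_{t/n}\mu, S(t)\mu) \le O(n^{-1/4})$, though they considered the more general case of gradient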

flow on a metric space $(X, d)$ satisfying [1, Assumption 4.0.1]. (See section 2.8 for the role this assumption played in our own proof.) Though we do not obtain the optimal rate of convergence, we demonstrate that Crandall and Liggett's approach extends to the Wasserstein metric, providing a simple and robust route to the exponential formula and properties of continuous gradient flow. This brings together the Banach space theory with the Wasserstein theory, and it is hoped that this method will help extend the abstract theory of Wasserstein gradient flow to a broader class of functionals.

REMARK 3.9 (varying time steps). In fact, for any partition of the interval $[0, t]$ into $n$ time steps $\tau_1, \dots, \tau_n$, the corresponding discrete gradient flow with varying time steps $\prod_{i=1}^n J_{\tau_i}\mu$ converges to $S(t)\mu$ as the maximum step size goes to zero. See section 4.1.

Our estimates lead to a simple proof of the fact that $S(t)\mu$ is a $\lambda$-contracting semigroup, as originally shown in [1, Proposition 4.3.1].

THEOREM 3.10 ($S(t)$ is a $\lambda$-contracting semigroup). Given $E$ satisfying convexity assumption 1.2, the function
\[ S(t) : D(|\partial E|) \to D(|\partial E|) : \mu \mapsto S(t)\mu \]
on $[0, +\infty)$ is a $\lambda$-contracting semigroup, i.e.
(i) $\lim_{t\to 0} S(t)\mu = S(0)\mu = \mu$
(ii) $S(t+s)\mu = S(t)S(s)\mu$ for $t, s \ge 0$
(iii) $W_2(S(t)\mu, S(t)\nu) \le e^{-\lambda t}\,W_2(\mu, \nu)$

Next, we apply the semigroup property (ii) to conclude that $E(S(t)\mu)$ is nonincreasing.

COROLLARY 3.11. For all $\mu \in D(|\partial E|)$, $E(S(t)\mu)$ is non-increasing for $t \in [0, +\infty)$.

Combining the previous results, we prove that $S(t)$ is the continuous gradient flow, in the sense of Definition 1.3.

THEOREM 3.12 ($S(t)\mu$ is the continuous gradient flow). Given $E$ satisfying convexity assumption 1.2 and $\mu \in D(|\partial E|)$, $S(t)\mu$ is the continuous gradient flow for $E$ with initial conditions $\mu$. Furthermore,
\[ W_2(S(t)\mu, S(s)\mu) \le |t-s|\,e^{\lambda^- t}e^{\lambda^- s}|\partial E|(\mu), \tag{36} \]
so $S(t)\mu$ is locally Lipschitz on $[0, +\infty)$.

Finally, we use our method to give a simple proof of the energy dissipation inequality, which shows the regularizing effect of the gradient flow.
COROLLARY 3.13 (Energy Dissipation Inequality). Given $E$ satisfying convexity assumption 1.2 and $\mu \in D(|\partial E|)$, for all $t_0, t_1 \ge 0$,
\[ \int_{t_0}^{t_1} |\partial E|^2(S(s)\mu)\,ds \le E(S(t_0)\mu) - E(S(t_1)\mu). \]

We now turn to the proofs of these results.
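The exponential formula can be illustrated in the simplest setting that reduces to the Euclidean case. The sketch below assumes the potential energy $E(\mu) = \int \frac{\lambda}{2}x^2\,d\mu$ on the real line and Dirac initial data $\delta_{x_0}$, for which the proximal map sends $\delta_x$ to $\delta_{x/(1+\lambda\tau)}$, the Wasserstein distance between Dirac masses is the distance between their locations, and $|\partial E|(\delta_x) = |\lambda x|$. These modelling choices and all function names are assumptions made for illustration; the code compares the discrete flow with $x_0 e^{-\lambda t}$ and evaluates the right hand side of the error bound in Theorem 3.7.

```python
import math


def proximal_step(x, tau, lam):
    """Minimizer of y -> (y - x)^2 / (2*tau) + lam * y^2 / 2 (needs 1 + lam*tau > 0)."""
    return x / (1.0 + lam * tau)


def discrete_flow(x0, t, n, lam):
    """n proximal steps of size t/n starting from x0."""
    x = x0
    tau = t / n
    for _ in range(n):
        x = proximal_step(x, tau, lam)
    return x


def exact_flow(x0, t, lam):
    """Solution of x'(t) = -lam * x(t), x(0) = x0."""
    return x0 * math.exp(-lam * t)


def rate_bound(x0, t, n, lam):
    """sqrt(3) * t / sqrt(n) * exp(3 * lam_minus * t) * |lam * x0|."""
    lam_minus = max(0.0, -lam)
    return math.sqrt(3.0) * t / math.sqrt(n) * math.exp(3.0 * lam_minus * t) * abs(lam * x0)


if __name__ == "__main__":
    for n in (4, 16, 64, 256):
        err = abs(discrete_flow(1.0, 1.0, n, 2.0) - exact_flow(1.0, 1.0, 2.0))
        print(n, err, rate_bound(1.0, 1.0, n, 2.0))
```

In this scalar example the observed error shrinks faster than the $n^{-1/2}$ bound, which is consistent with the remark that the bound is not the optimal rate.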

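Proof of Theorem 3.7. By Theorem 3.6, for fixed $t \ge 0$, if we define $\tau := t/n$, $h := t/m$, with $m \ge n > 2t\lambda^-$, so $0 \le h \le \tau < \frac{1}{2\lambda^-}$,
\[ W_2^2(J^n_{t/n}\mu, J^m_{t/m}\mu) \le \frac{3t^2}{n}(1-\lambda^- t/n)^{-2n}(1-\lambda^- t/m)^{-2m}|\partial E|^2(\mu) \le 3\,\frac{t^2}{n}\,e^{8\lambda^- t}|\partial E|^2(\mu). \tag{37} \]
In the second inequality, we use that $(1-\alpha)^{-1} \le e^{2\alpha}$ for $\alpha \in [0, 1/2]$. Thus, the sequence $J^n_{t/n}\mu$ is Cauchy, and $\lim_{n\to\infty} J^n_{t/n}\mu$ exists. The estimate (37) shows that the convergence is uniform in $t$ on compact subsets of $[0, +\infty)$. If $S(t)\mu$ denotes the limit, then sending $m \to \infty$ in the first inequality of (37) gives the error estimate
\[ W_2^2(J^n_{t/n}\mu, S(t)\mu) \le \frac{3t^2}{n}\,e^{6\lambda^- t}|\partial E|^2(\mu). \tag{38} \]

Proof of Theorem 3.10. (i) follows from Lemma 3.5, since
\[ W_2(S(t)\mu, \mu) = \lim_{n\to\infty} W_2(J^n_{t/n}\mu, \mu) \le \lim_{n\to\infty} t(1-\lambda^- t/n)^{-n}|\partial E|(\mu) = t e^{\lambda^- t}|\partial E|(\mu) \to 0 \quad \text{as } t \to 0. \]

We now turn to the contraction property (iii). Our proof of the $\lambda > 0$ case is new, using the almost contraction inequality, Theorem 3.1. For completeness, we recall the proof of [1, Proposition 4.3.1], which shows the $\lambda \le 0$ case. Iterating the contraction inequality from Theorem 3.1 for $\lambda > 0$ and applying Theorem AGS,
\begin{align*}
W_2^2(J^n_{t/n}\mu, J^n_{t/n}\nu) &\le (1+\lambda(t/n))^{-2n}\,W_2^2(\mu,\nu) + \sum_{i=1}^n (t/n)^2(1+\lambda(t/n))^{-2i}\left( |\partial E|^2(J^{n-i}_{t/n}\mu) + 2\lambda\left[ E(J^{n-i}_{t/n}\nu) - \inf E \right] \right) \tag{39} \\
&\le (1+\lambda(t/n))^{-2n}\,W_2^2(\mu,\nu) + n(t/n)^2\left( |\partial E|^2(\mu) + 2\lambda\left[ E(\nu) - \inf E \right] \right). \tag{40}
\end{align*}
Likewise, for $\lambda \le 0$, $n > -t\lambda$, we have
\begin{align*}
W_2^2(J^n_{t/n}\mu, J^n_{t/n}\nu) &\le (1+\lambda(t/n))^{-2n}\,W_2^2(\mu,\nu) + \sum_{i=1}^n (t/n)^2(1+\lambda(t/n))^{-2i}|\partial E|^2(J^{n-i}_{t/n}\mu) \\
&\le (1+\lambda(t/n))^{-2n}\,W_2^2(\mu,\nu) + n(t/n)^2(1+\lambda(t/n))^{-2n}|\partial E|^2(\mu). \tag{41}
\end{align*}
Sending $n \to \infty$ in both cases shows
\[ W_2^2(S(t)\mu, S(t)\nu) \le e^{-2\lambda t}\,W_2^2(\mu,\nu). \]

We now prove the semigroup property (ii). First, we show that $S(t)^m\mu = S(mt)\mu$ for fixed $m \in \mathbb{N}$. To consider $\lambda \ge 0$ and $\lambda < 0$ jointly, we replace $\lambda$ by $-\lambda^- \le 0$, since any function that is $\lambda$ convex is also $-\lambda^-$ convex. First, note that
\begin{align*}
W_2(S(t)^m\mu, (J^n_{t/n})^m\mu) &= W_2(S(t)^m\mu, J^n_{t/n}(J^n_{t/n})^{m-1}\mu) \\
&\le W_2(S(t)^m\mu, J^n_{t/n}S(t)^{m-1}\mu) + W_2(J^n_{t/n}S(t)^{m-1}\mu, J^n_{t/n}(J^n_{t/n})^{m-1}\mu). \tag{42}
\end{align*}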

Remark 3.8 ensures $S(t)^{m-1}\mu \in D(|\partial E|)$, so by Theorem 3.7, $J^n_{t/n}S(t)^{m-1}\mu \to S(t)^m\mu$ as $n \to \infty$. Consequently, we may choose $n$ large enough so that the first term is arbitrarily small for fixed $m \in \mathbb{N}$. We bound the second term in (42) using (41). By Remark 3.8, $|\partial E|^2(S(t)^{m-1}\mu) \le e^{2(m-1)\lambda^- t}|\partial E|^2(\mu)$. Therefore,
\[ W_2^2(J^n_{t/n}S(t)^{m-1}\mu, J^n_{t/n}(J^n_{t/n})^{m-1}\mu) \le (1-\lambda^-(t/n))^{-2n}\,W_2^2(S(t)^{m-1}\mu, (J^n_{t/n})^{m-1}\mu) + n(t/n)^2 e^{2(m-1)\lambda^- t}(1-\lambda^-(t/n))^{-2n}|\partial E|^2(\mu). \]
Thus, taking square roots of both sides and combining with (42) shows that for all $\epsilon > 0$, there exists $n$ large enough so that
\[ W_2(S(t)^m\mu, (J^n_{t/n})^m\mu) \le \epsilon + e^{4\lambda^- t}\,W_2(S(t)^{m-1}\mu, (J^n_{t/n})^{m-1}\mu). \]
Iterating this shows that for $n$ large enough,
\[ W_2(S(t)^m\mu, (J^n_{t/n})^m\mu) \le \epsilon\left( \sum_{i=0}^{m-1} e^{4i\lambda^- t} \right) + e^{4m\lambda^- t}\,W_2(\mu,\mu) \le \epsilon\,m\,e^{4(m-1)\lambda^- t}. \tag{43} \]
We now apply this to show $S(t)^m\mu = S(mt)\mu$. By the triangle inequality,
\[ W_2(S(t)^m\mu, S(mt)\mu) \le W_2(S(t)^m\mu, (J^n_{t/n})^m\mu) + W_2((J^n_{t/n})^m\mu, S(mt)\mu). \]
The first term can be made arbitrarily small by (43). Since $W_2((J^n_{t/n})^m\mu, S(mt)\mu) = W_2(J^{nm}_{tm/nm}\mu, S(mt)\mu)$, by Theorem 3.7 we may choose $n$ large so the second term is arbitrarily small. Therefore, $S(t)^m\mu = S(mt)\mu$. This shows that for any $l, k, r, s \in \mathbb{N}$,
\[ S\left( \frac{l}{k} + \frac{r}{s} \right)\mu = S\left( \frac{ls + rk}{ks} \right)\mu = \left[ S\left( \frac{1}{ks} \right) \right]^{ls+rk}\mu = \left[ S\left( \frac{1}{ks} \right) \right]^{ls}\left[ S\left( \frac{1}{ks} \right) \right]^{rk}\mu = S\left( \frac{l}{k} \right)S\left( \frac{r}{s} \right)\mu. \]
Since $S(t)\mu$ is continuous in $t \in [0, +\infty)$, $S(t+s)\mu = S(t)S(s)\mu$ for all $t, s \ge 0$.

Proof of Corollary 3.11. For $t \ge 0$, the lower semicontinuity of $E$ and definition of the proximal map (7) imply
\[ E(S(t)\mu) \le \liminf_{n\to\infty} E(J^n_{t/n}\mu) \le \liminf_{n\to\infty} E(\mu) = E(\mu). \]
The result then follows from the semigroup property, Theorem 3.10 (ii).

Proof of Theorem 3.12. First, we show that $S(t)\mu$ is locally Lipschitz continuous in $t$. Given $t, s \ge 0$, define $\tau := t/n$, $h := s/m$ for $m$ and $n$ large enough so that $0 \le h \le \tau < 1/\lambda^-$. By Theorem 3.6,
\[ W_2^2(J^n_{t/n}\mu, J^m_{s/m}\mu) \le \left[ (t-s)^2 + \frac{ts}{n} + \frac{2t^2}{n} \right](1-\lambda^- t/n)^{-2n}(1-\lambda^- s/m)^{-2m}|\partial E|^2(\mu). \tag{44} \]
Sending $n, m \to \infty$ and taking the square root of both sides gives
\[ W_2(S(t)\mu, S(s)\mu) \le |t-s|\,e^{\lambda^- t}e^{\lambda^- s}|\partial E|(\mu). \tag{45} \]
We now turn to the proof that $S(t)\mu$ is the continuous gradient flow for $E$ with initial conditions $\mu$ in the sense of Definition 1.3. We already showed $S(t)\mu \to \mu$ as $t \to 0$ in part (i) of Theorem 3.10, so it remains to show that $S(t)\mu$ satisfies (6).


Banach Journal of Mathematical Analysis ISSN: (electronic) Banach J Math Anal (009), no, 64 76 Banach Journal of Mathematical Analysis ISSN: 75-8787 (electronic) http://wwwmath-analysisorg ON A GEOMETRIC PROPERTY OF POSITIVE DEFINITE MATRICES CONE MASATOSHI ITO,

More information

On a result of Pazy concerning the asymptotic behaviour of nonexpansive mappings

On a result of Pazy concerning the asymptotic behaviour of nonexpansive mappings On a result of Pazy concerning the asymptotic behaviour of nonexpansive mappings arxiv:1505.04129v1 [math.oc] 15 May 2015 Heinz H. Bauschke, Graeme R. Douglas, and Walaa M. Moursi May 15, 2015 Abstract

More information

Second order forward-backward dynamical systems for monotone inclusion problems

Second order forward-backward dynamical systems for monotone inclusion problems Second order forward-backward dynamical systems for monotone inclusion problems Radu Ioan Boţ Ernö Robert Csetnek March 6, 25 Abstract. We begin by considering second order dynamical systems of the from

More information

GRADIENT FLOWS FOR NON-SMOOTH INTERACTION POTENTIALS

GRADIENT FLOWS FOR NON-SMOOTH INTERACTION POTENTIALS GRADIENT FLOWS FOR NON-SMOOTH INTERACTION POTENTIALS J. A. CARRILLO, S. LISINI, E. MAININI Abstract. We deal with a nonlocal interaction equation describing the evolution of a particle density under the

More information

MAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9

MAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9 MAT 570 REAL ANALYSIS LECTURE NOTES PROFESSOR: JOHN QUIGG SEMESTER: FALL 204 Contents. Sets 2 2. Functions 5 3. Countability 7 4. Axiom of choice 8 5. Equivalence relations 9 6. Real numbers 9 7. Extended

More information

Local strong convexity and local Lipschitz continuity of the gradient of convex functions

Local strong convexity and local Lipschitz continuity of the gradient of convex functions Local strong convexity and local Lipschitz continuity of the gradient of convex functions R. Goebel and R.T. Rockafellar May 23, 2007 Abstract. Given a pair of convex conjugate functions f and f, we investigate

More information

WEAK CONVERGENCE OF RESOLVENTS OF MAXIMAL MONOTONE OPERATORS AND MOSCO CONVERGENCE

WEAK CONVERGENCE OF RESOLVENTS OF MAXIMAL MONOTONE OPERATORS AND MOSCO CONVERGENCE Fixed Point Theory, Volume 6, No. 1, 2005, 59-69 http://www.math.ubbcluj.ro/ nodeacj/sfptcj.htm WEAK CONVERGENCE OF RESOLVENTS OF MAXIMAL MONOTONE OPERATORS AND MOSCO CONVERGENCE YASUNORI KIMURA Department

More information

Discrete Ricci curvature via convexity of the entropy

Discrete Ricci curvature via convexity of the entropy Discrete Ricci curvature via convexity of the entropy Jan Maas University of Bonn Joint work with Matthias Erbar Simons Institute for the Theory of Computing UC Berkeley 2 October 2013 Starting point McCann

More information

Behaviour of Lipschitz functions on negligible sets. Non-differentiability in R. Outline

Behaviour of Lipschitz functions on negligible sets. Non-differentiability in R. Outline Behaviour of Lipschitz functions on negligible sets G. Alberti 1 M. Csörnyei 2 D. Preiss 3 1 Università di Pisa 2 University College London 3 University of Warwick Lars Ahlfors Centennial Celebration Helsinki,

More information

Empirical Processes: General Weak Convergence Theory

Empirical Processes: General Weak Convergence Theory Empirical Processes: General Weak Convergence Theory Moulinath Banerjee May 18, 2010 1 Extended Weak Convergence The lack of measurability of the empirical process with respect to the sigma-field generated

More information

The small ball property in Banach spaces (quantitative results)

The small ball property in Banach spaces (quantitative results) The small ball property in Banach spaces (quantitative results) Ehrhard Behrends Abstract A metric space (M, d) is said to have the small ball property (sbp) if for every ε 0 > 0 there exists a sequence

More information

Convex Optimization Notes

Convex Optimization Notes Convex Optimization Notes Jonathan Siegel January 2017 1 Convex Analysis This section is devoted to the study of convex functions f : B R {+ } and convex sets U B, for B a Banach space. The case of B =

More information

Lecture 4: Numerical solution of ordinary differential equations

Lecture 4: Numerical solution of ordinary differential equations Lecture 4: Numerical solution of ordinary differential equations Department of Mathematics, ETH Zürich General explicit one-step method: Consistency; Stability; Convergence. High-order methods: Taylor

More information

Regularity of local minimizers of the interaction energy via obstacle problems

Regularity of local minimizers of the interaction energy via obstacle problems Regularity of local minimizers of the interaction energy via obstacle problems J. A. Carrillo, M. G. Delgadino, A. Mellet September 22, 2014 Abstract The repulsion strength at the origin for repulsive/attractive

More information

Spaces with Ricci curvature bounded from below

Spaces with Ricci curvature bounded from below Spaces with Ricci curvature bounded from below Nicola Gigli March 10, 2014 Lessons Basics of optimal transport Definition of spaces with Ricci curvature bounded from below Analysis on spaces with Ricci

More information

(x k ) sequence in F, lim x k = x x F. If F : R n R is a function, level sets and sublevel sets of F are any sets of the form (respectively);

(x k ) sequence in F, lim x k = x x F. If F : R n R is a function, level sets and sublevel sets of F are any sets of the form (respectively); STABILITY OF EQUILIBRIA AND LIAPUNOV FUNCTIONS. By topological properties in general we mean qualitative geometric properties (of subsets of R n or of functions in R n ), that is, those that don t depend

More information

NORMS ON SPACE OF MATRICES

NORMS ON SPACE OF MATRICES NORMS ON SPACE OF MATRICES. Operator Norms on Space of linear maps Let A be an n n real matrix and x 0 be a vector in R n. We would like to use the Picard iteration method to solve for the following system

More information

Lebesgue Measure on R n

Lebesgue Measure on R n CHAPTER 2 Lebesgue Measure on R n Our goal is to construct a notion of the volume, or Lebesgue measure, of rather general subsets of R n that reduces to the usual volume of elementary geometrical sets

More information

If Y and Y 0 satisfy (1-2), then Y = Y 0 a.s.

If Y and Y 0 satisfy (1-2), then Y = Y 0 a.s. 20 6. CONDITIONAL EXPECTATION Having discussed at length the limit theory for sums of independent random variables we will now move on to deal with dependent random variables. An important tool in this

More information

Introduction to Gradient Flows in Metric Spaces (II)

Introduction to Gradient Flows in Metric Spaces (II) Introduction to Gradient Flows in Metric Spaces (II) Philippe Clément November 23, 200 Preface The aim of these Lectures Notes is to provide a gentle introduction to the theory of gradient flows in metric

More information

Introduction to Real Analysis Alternative Chapter 1

Introduction to Real Analysis Alternative Chapter 1 Christopher Heil Introduction to Real Analysis Alternative Chapter 1 A Primer on Norms and Banach Spaces Last Updated: March 10, 2018 c 2018 by Christopher Heil Chapter 1 A Primer on Norms and Banach Spaces

More information

Gradient flows and a Trotter Kato formula of semi-convex functions on CAT(1)-spaces

Gradient flows and a Trotter Kato formula of semi-convex functions on CAT(1)-spaces Gradient flows and a Trotter Kato formula of semi-convex functions on CAT(1)-spaces Shin-ichi Ohta & Miklós Pálfia November 0, 015 Abstract We generalize the theory of gradient flows of semi-convex functions

More information

Convexity in R n. The following lemma will be needed in a while. Lemma 1 Let x E, u R n. If τ I(x, u), τ 0, define. f(x + τu) f(x). τ.

Convexity in R n. The following lemma will be needed in a while. Lemma 1 Let x E, u R n. If τ I(x, u), τ 0, define. f(x + τu) f(x). τ. Convexity in R n Let E be a convex subset of R n. A function f : E (, ] is convex iff f(tx + (1 t)y) (1 t)f(x) + tf(y) x, y E, t [0, 1]. A similar definition holds in any vector space. A topology is needed

More information

Pseudo-Poincaré Inequalities and Applications to Sobolev Inequalities

Pseudo-Poincaré Inequalities and Applications to Sobolev Inequalities Pseudo-Poincaré Inequalities and Applications to Sobolev Inequalities Laurent Saloff-Coste Abstract Most smoothing procedures are via averaging. Pseudo-Poincaré inequalities give a basic L p -norm control

More information

3 Integration and Expectation

3 Integration and Expectation 3 Integration and Expectation 3.1 Construction of the Lebesgue Integral Let (, F, µ) be a measure space (not necessarily a probability space). Our objective will be to define the Lebesgue integral R fdµ

More information

Introduction to optimal transport

Introduction to optimal transport Introduction to optimal transport Nicola Gigli May 20, 2011 Content Formulation of the transport problem The notions of c-convexity and c-cyclical monotonicity The dual problem Optimal maps: Brenier s

More information

ON GAP FUNCTIONS OF VARIATIONAL INEQUALITY IN A BANACH SPACE. Sangho Kum and Gue Myung Lee. 1. Introduction

ON GAP FUNCTIONS OF VARIATIONAL INEQUALITY IN A BANACH SPACE. Sangho Kum and Gue Myung Lee. 1. Introduction J. Korean Math. Soc. 38 (2001), No. 3, pp. 683 695 ON GAP FUNCTIONS OF VARIATIONAL INEQUALITY IN A BANACH SPACE Sangho Kum and Gue Myung Lee Abstract. In this paper we are concerned with theoretical properties

More information

FULL CHARACTERIZATION OF OPTIMAL TRANSPORT PLANS FOR CONCAVE COSTS

FULL CHARACTERIZATION OF OPTIMAL TRANSPORT PLANS FOR CONCAVE COSTS FULL CHARACTERIZATION OF OPTIMAL TRANSPORT PLANS FOR CONCAVE COSTS PAUL PEGON, DAVIDE PIAZZOLI, FILIPPO SANTAMBROGIO Abstract. This paper slightly improves a classical result by Gangbo and McCann (1996)

More information

A Unified Analysis of Nonconvex Optimization Duality and Penalty Methods with General Augmenting Functions

A Unified Analysis of Nonconvex Optimization Duality and Penalty Methods with General Augmenting Functions A Unified Analysis of Nonconvex Optimization Duality and Penalty Methods with General Augmenting Functions Angelia Nedić and Asuman Ozdaglar April 16, 2006 Abstract In this paper, we study a unifying framework

More information

Fokker-Planck Equation on Graph with Finite Vertices

Fokker-Planck Equation on Graph with Finite Vertices Fokker-Planck Equation on Graph with Finite Vertices January 13, 2011 Jointly with S-N Chow (Georgia Tech) Wen Huang (USTC) Hao-min Zhou(Georgia Tech) Functional Inequalities and Discrete Spaces Outline

More information

arxiv: v1 [math.ap] 10 Apr 2013

arxiv: v1 [math.ap] 10 Apr 2013 QUASI-STATIC EVOLUTION AND CONGESTED CROWD TRANSPORT DAMON ALEXANDER, INWON KIM, AND YAO YAO arxiv:1304.3072v1 [math.ap] 10 Apr 2013 Abstract. We consider the relationship between Hele-Shaw evolution with

More information

A convergence result for an Outer Approximation Scheme

A convergence result for an Outer Approximation Scheme A convergence result for an Outer Approximation Scheme R. S. Burachik Engenharia de Sistemas e Computação, COPPE-UFRJ, CP 68511, Rio de Janeiro, RJ, CEP 21941-972, Brazil regi@cos.ufrj.br J. O. Lopes Departamento

More information

REGULARITY OF MONOTONE TRANSPORT MAPS BETWEEN UNBOUNDED DOMAINS

REGULARITY OF MONOTONE TRANSPORT MAPS BETWEEN UNBOUNDED DOMAINS REGULARITY OF MONOTONE TRANSPORT MAPS BETWEEN UNBOUNDED DOMAINS DARIO CORDERO-ERAUSQUIN AND ALESSIO FIGALLI A Luis A. Caffarelli en su 70 años, con amistad y admiración Abstract. The regularity of monotone

More information

{Euclidean, metric, and Wasserstein} gradient flows: an overview

{Euclidean, metric, and Wasserstein} gradient flows: an overview Bull. Math. Sci. (2017) 7:87 154 DOI 10.1007/s13373-017-0101-1 {Euclidean, metric, and Wasserstein} gradient flows: an overview Filippo Santambrogio 1 Received: 11 September 2016 / Revised: 9 February

More information

A generic property of families of Lagrangian systems

A generic property of families of Lagrangian systems Annals of Mathematics, 167 (2008), 1099 1108 A generic property of families of Lagrangian systems By Patrick Bernard and Gonzalo Contreras * Abstract We prove that a generic Lagrangian has finitely many

More information

THEOREMS, ETC., FOR MATH 515

THEOREMS, ETC., FOR MATH 515 THEOREMS, ETC., FOR MATH 515 Proposition 1 (=comment on page 17). If A is an algebra, then any finite union or finite intersection of sets in A is also in A. Proposition 2 (=Proposition 1.1). For every

More information

Measurable Choice Functions

Measurable Choice Functions (January 19, 2013) Measurable Choice Functions Paul Garrett garrett@math.umn.edu http://www.math.umn.edu/ garrett/ [This document is http://www.math.umn.edu/ garrett/m/fun/choice functions.pdf] This note

More information

HILBERT SPACES AND THE RADON-NIKODYM THEOREM. where the bar in the first equation denotes complex conjugation. In either case, for any x V define

HILBERT SPACES AND THE RADON-NIKODYM THEOREM. where the bar in the first equation denotes complex conjugation. In either case, for any x V define HILBERT SPACES AND THE RADON-NIKODYM THEOREM STEVEN P. LALLEY 1. DEFINITIONS Definition 1. A real inner product space is a real vector space V together with a symmetric, bilinear, positive-definite mapping,

More information

Dynamic and Stochastic Brenier Transport via Hopf-Lax formulae on Was

Dynamic and Stochastic Brenier Transport via Hopf-Lax formulae on Was Dynamic and Stochastic Brenier Transport via Hopf-Lax formulae on Wasserstein Space With many discussions with Yann Brenier and Wilfrid Gangbo Brenierfest, IHP, January 9-13, 2017 ain points of the

More information

Duality of multiparameter Hardy spaces H p on spaces of homogeneous type

Duality of multiparameter Hardy spaces H p on spaces of homogeneous type Duality of multiparameter Hardy spaces H p on spaces of homogeneous type Yongsheng Han, Ji Li, and Guozhen Lu Department of Mathematics Vanderbilt University Nashville, TN Internet Analysis Seminar 2012

More information

16 1 Basic Facts from Functional Analysis and Banach Lattices

16 1 Basic Facts from Functional Analysis and Banach Lattices 16 1 Basic Facts from Functional Analysis and Banach Lattices 1.2.3 Banach Steinhaus Theorem Another fundamental theorem of functional analysis is the Banach Steinhaus theorem, or the Uniform Boundedness

More information

Building Infinite Processes from Finite-Dimensional Distributions

Building Infinite Processes from Finite-Dimensional Distributions Chapter 2 Building Infinite Processes from Finite-Dimensional Distributions Section 2.1 introduces the finite-dimensional distributions of a stochastic process, and shows how they determine its infinite-dimensional

More information

Some Background Material

Some Background Material Chapter 1 Some Background Material In the first chapter, we present a quick review of elementary - but important - material as a way of dipping our toes in the water. This chapter also introduces important

More information

Controlling and Stabilizing a Rigid Formation using a few agents

Controlling and Stabilizing a Rigid Formation using a few agents Controlling and Stabilizing a Rigid Formation using a few agents arxiv:1704.06356v1 [math.ds] 20 Apr 2017 Abstract Xudong Chen, M.-A. Belabbas, Tamer Başar We show in this paper that a small subset of

More information

Continuous Functions on Metric Spaces

Continuous Functions on Metric Spaces Continuous Functions on Metric Spaces Math 201A, Fall 2016 1 Continuous functions Definition 1. Let (X, d X ) and (Y, d Y ) be metric spaces. A function f : X Y is continuous at a X if for every ɛ > 0

More information

Parameter Dependent Quasi-Linear Parabolic Equations

Parameter Dependent Quasi-Linear Parabolic Equations CADERNOS DE MATEMÁTICA 4, 39 33 October (23) ARTIGO NÚMERO SMA#79 Parameter Dependent Quasi-Linear Parabolic Equations Cláudia Buttarello Gentile Departamento de Matemática, Universidade Federal de São

More information

On a Class of Multidimensional Optimal Transportation Problems

On a Class of Multidimensional Optimal Transportation Problems Journal of Convex Analysis Volume 10 (2003), No. 2, 517 529 On a Class of Multidimensional Optimal Transportation Problems G. Carlier Université Bordeaux 1, MAB, UMR CNRS 5466, France and Université Bordeaux

More information

Introduction to Dynamical Systems

Introduction to Dynamical Systems Introduction to Dynamical Systems France-Kosovo Undergraduate Research School of Mathematics March 2017 This introduction to dynamical systems was a course given at the march 2017 edition of the France

More information

Chapter 3: Baire category and open mapping theorems

Chapter 3: Baire category and open mapping theorems MA3421 2016 17 Chapter 3: Baire category and open mapping theorems A number of the major results rely on completeness via the Baire category theorem. 3.1 The Baire category theorem 3.1.1 Definition. A

More information

Density of Lipschitz functions and equivalence of weak gradients in metric measure spaces

Density of Lipschitz functions and equivalence of weak gradients in metric measure spaces Density of Lipschitz functions and equivalence of weak gradients in metric measure spaces Luigi Ambrosio Nicola Gigli Giuseppe Savaré May 16, 212 Abstract We compare several notion of weak (modulus of)

More information

Hausdorff Measure. Jimmy Briggs and Tim Tyree. December 3, 2016

Hausdorff Measure. Jimmy Briggs and Tim Tyree. December 3, 2016 Hausdorff Measure Jimmy Briggs and Tim Tyree December 3, 2016 1 1 Introduction In this report, we explore the the measurement of arbitrary subsets of the metric space (X, ρ), a topological space X along

More information

Math 5051 Measure Theory and Functional Analysis I Homework Assignment 3

Math 5051 Measure Theory and Functional Analysis I Homework Assignment 3 Math 551 Measure Theory and Functional Analysis I Homework Assignment 3 Prof. Wickerhauser Due Monday, October 12th, 215 Please do Exercises 3*, 4, 5, 6, 8*, 11*, 17, 2, 21, 22, 27*. Exercises marked with

More information

Mean-field dual of cooperative reproduction

Mean-field dual of cooperative reproduction The mean-field dual of systems with cooperative reproduction joint with Tibor Mach (Prague) A. Sturm (Göttingen) Friday, July 6th, 2018 Poisson construction of Markov processes Let (X t ) t 0 be a continuous-time

More information

Some Properties of the Augmented Lagrangian in Cone Constrained Optimization

Some Properties of the Augmented Lagrangian in Cone Constrained Optimization MATHEMATICS OF OPERATIONS RESEARCH Vol. 29, No. 3, August 2004, pp. 479 491 issn 0364-765X eissn 1526-5471 04 2903 0479 informs doi 10.1287/moor.1040.0103 2004 INFORMS Some Properties of the Augmented

More information