A DIFFUSIVE MODEL FOR MACROSCOPIC CROWD MOTION WITH DENSITY CONSTRAINTS

Similar documents
Spaces with Ricci curvature bounded from below

A description of transport cost for signed measures

Local semiconvexity of Kantorovich potentials on non-compact manifolds

Gradient Flow. Chang Liu. April 24, Tsinghua University. Chang Liu (THU) Gradient Flow April 24, / 91

The semi-geostrophic equations - a model for large-scale atmospheric flows

A Modest Proposal for MFG with Density Constraints

for all subintervals I J. If the same is true for the dyadic subintervals I D J only, we will write ϕ BMO d (J). In fact, the following is true

Calculus of Variations. Final Examination

On a Class of Multidimensional Optimal Transportation Problems

Discrete Ricci curvature via convexity of the entropy

Dynamic and Stochastic Brenier Transport via Hopf-Lax formulae on Was

Stochastic optimal control with rough paths

PATH FUNCTIONALS OVER WASSERSTEIN SPACES. Giuseppe Buttazzo. Dipartimento di Matematica Università di Pisa.

PARTIAL REGULARITY OF BRENIER SOLUTIONS OF THE MONGE-AMPÈRE EQUATION

Variational approach to mean field games with density constraints

An introduction to Mathematical Theory of Control

Optimization and Optimal Control in Banach Spaces

Entropy Methods for Reaction-Diffusion Equations with Degenerate Diffusion Arising in Reversible Chemistry

DIFFERENTIAL GEOMETRY, LECTURE 16-17, JULY 14-17

Empirical Processes: General Weak Convergence Theory

{Euclidean, metric, and Wasserstein} gradient flows: an overview

arxiv: v1 [math.ap] 10 Apr 2013

Regularity and compactness for the DiPerna Lions flow

A COUNTEREXAMPLE TO AN ENDPOINT BILINEAR STRICHARTZ INEQUALITY TERENCE TAO. t L x (R R2 ) f L 2 x (R2 )

Lagrangian discretization of incompressibility, congestion and linear diffusion

RENORMALIZED SOLUTIONS ON QUASI OPEN SETS WITH NONHOMOGENEOUS BOUNDARY VALUES TONI HUKKANEN

Topology-preserving diffusion equations for divergence-free vector fields

Convex discretization of functionals involving the Monge-Ampère operator

2 A Model, Harmonic Map, Problem

Laplace s Equation. Chapter Mean Value Formulas

arxiv: v2 [math.ap] 28 Nov 2016

Kramers formula for chemical reactions in the context of Wasserstein gradient flows. Michael Herrmann. Mathematical Institute, University of Oxford

Minimal time mean field games

The Skorokhod reflection problem for functions with discontinuities (contractive case)

Logarithmic Sobolev Inequalities

Solving Dual Problems

Local minimization as a selection criterion

FULL CHARACTERIZATION OF OPTIMAL TRANSPORT PLANS FOR CONCAVE COSTS

Near-Potential Games: Geometry and Dynamics

LECTURE 1: SOURCES OF ERRORS MATHEMATICAL TOOLS A PRIORI ERROR ESTIMATES. Sergey Korotov,

Existence and Uniqueness

Fokker-Planck Equation on Graph with Finite Vertices

Variational Formulations

UNIQUENESS ISSUES FOR EVOLUTIVE EQUATIONS WITH DENSITY CONSTRAINTS

Near-Potential Games: Geometry and Dynamics

Measurable functions are approximately nice, even if look terrible.

Metric Spaces and Topology

Gradient Flows: Qualitative Properties & Numerical Schemes

Anisotropic congested transport

NEW FUNCTIONAL INEQUALITIES

Statistics 612: L p spaces, metrics on spaces of probabilites, and connections to estimation

WEYL S LEMMA, ONE OF MANY. Daniel W. Stroock

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra.

Some SDEs with distributional drift Part I : General calculus. Flandoli, Franco; Russo, Francesco; Wolf, Jochen

P(E t, Ω)dt, (2) 4t has an advantage with respect. to the compactly supported mollifiers, i.e., the function W (t)f satisfies a semigroup law:

Global Existence of Large BV Solutions in a Model of Granular Flow

Densities for the Navier Stokes equations with noise

A PLANAR SOBOLEV EXTENSION THEOREM FOR PIECEWISE LINEAR HOMEOMORPHISMS

Geodesics on Shape Spaces with Bounded Variation and Sobolev Metrics

ON THE REGULARITY OF SAMPLE PATHS OF SUB-ELLIPTIC DIFFUSIONS ON MANIFOLDS

1/12/05: sec 3.1 and my article: How good is the Lebesgue measure?, Math. Intelligencer 11(2) (1989),

Global minimization. Chapter Upper and lower bounds

Convex discretization of functionals involving the Monge-Ampère operator

Contents: 1. Minimization. 2. The theorem of Lions-Stampacchia for variational inequalities. 3. Γ -Convergence. 4. Duality mapping.

Part V. 17 Introduction: What are measures and why measurable sets. Lebesgue Integration Theory

September Math Course: First Order Derivative

Iteration-complexity of first-order penalty methods for convex programming

A dual form of the sharp Nash inequality and its weighted generalization

Moment Measures. Bo az Klartag. Tel Aviv University. Talk at the asymptotic geometric analysis seminar. Tel Aviv, May 2013

Basic Properties of Metric and Normed Spaces

Mean field games via probability manifold II

A Semi-Lagrangian discretization of non linear fokker Planck equations

ASYMPTOTICALLY NONEXPANSIVE MAPPINGS IN MODULAR FUNCTION SPACES ABSTRACT

Wiener Measure and Brownian Motion

PDEs in Image Processing, Tutorials

The optimal partial transport problem

Multivariate Differentiation 1

THE INVERSE FUNCTION THEOREM FOR LIPSCHITZ MAPS

On Friedrichs inequality, Helmholtz decomposition, vector potentials, and the div-curl lemma. Ben Schweizer 1

The Dirichlet s P rinciple. In this lecture we discuss an alternative formulation of the Dirichlet problem for the Laplace equation:

Newton Method with Adaptive Step-Size for Under-Determined Systems of Equations

Some Background Material

Stability of Feedback Solutions for Infinite Horizon Noncooperative Differential Games

Euler Equations: local existence

A new Hellinger-Kantorovich distance between positive measures and optimal Entropy-Transport problems

Numerical methods for Mean Field Games: additional material

NOTES ON CALCULUS OF VARIATIONS. September 13, 2012

Una aproximación no local de un modelo para la formación de pilas de arena

u xx + u yy = 0. (5.1)

1 Lyapunov theory of stability

ON KRONECKER PRODUCTS OF CHARACTERS OF THE SYMMETRIC GROUPS WITH FEW COMPONENTS

GLOBAL LIPSCHITZ CONTINUITY FOR MINIMA OF DEGENERATE PROBLEMS

Continuum Limit of Forward Kolmogorov Equation Friday, March 06, :04 PM

LECTURE 15: COMPLETENESS AND CONVEXITY

Régularité des équations de Hamilton-Jacobi du premier ordre et applications aux jeux à champ moyen

A few words about the MTW tensor

REMARKS ON THE VANISHING OBSTACLE LIMIT FOR A 3D VISCOUS INCOMPRESSIBLE FLUID

Analysis II: The Implicit and Inverse Function Theorems

Universität des Saarlandes. Fachrichtung 6.1 Mathematik

Spaces with Ricci curvature bounded from below

Transcription:

A DIFFUSIVE MODEL FOR MACROSCOPIC CROWD MOTION WITH DENSITY CONSTRAINTS Abstract. In the spirit of the macroscopic crowd motion models with hard congestion (i.e. a strong density constraint ρ ) introduced by Maury et al. some years ago, we analyze a variant of the same models where diffusion of the agents is also take into account. From the modeling point of view, this means that individuals try to follow a given spontaneous velocity, but are subject to a Brownian diffusion, and have to adapt to a density constraint which introduces a pressure term affecting the movement. From the PDE point of view, this corresponds to a modified Fokker- Planck equation, where with an additional gradient of a pressure (only living in the saturated zone {ρ = }) in the drift. The paper proves existence and some estimates, based on optimal transport techniques.. Introduction In the past few years modeling crowd behavior has become a very active field of applied mathematics. Beyond their importance in real life applications, these modeling problems serve as basic ideas to understand many other phenomena coming for example from biology (cell migration, tumor growth, pattern formations in animal populations, etc.), particle physics and economics. A first non-exhaustive list of references for these problems is [6, 7, 8, 9,, 4, 5, 7, 8, 4]. A very natural question in all these models is the problem of congestion phenomena: in many practical situations, very high quantities of individuals could try to occupy the same spot, which could be impossible, or lead to strong negative effects on the motion, because of natural limitations on the crowd density. These phenomenas have been studied by using different models, which could be either microscopic (based on ODEs on the motion of a high number of agents) or macroscopic (describing the agents via their density and velocity, typically with Eulerian formalism). Let us concentrate on the macroscopic models, where the density ρ plays a crucial role. These very same models can be characterized either by soft congestion effects (i.e. the higher the density the slower the motion), or by hard congestion (i.e. an abrupt threshold effect: if the density touches a certain maximal value, the motion is strongly affected, while nothing happens for smaller values of the density). See [3] for comparison between the different classes of models. This last class of models, due to the discontinuity in the congestion effects, presents new mathematical difficulties, which cannot be analyzed with the usual techniques from conservation laws (or, more generally, evolution PDEs) used for soft congestion. A very powerful tool to attack macroscopic hard-congestion problems is the theory of optimal transportation (see [3, 9]), as we can see in [, 3, 8, 3]. In this framework, the density of the Date: March 8, 5. Key words and phrases: diffusive crowd motion model; Fokker-Planck equation; density constraints; optimal transportation. Mathematics Subject Classification: 35K6; 49D; 49J45.

agents solves a continuity equation (with velocity field taking into account the congestion effects), and can be seen as a curve in the Wasserstein space. Our aim in this paper is to endow the macroscopic hard congestion models of [, 3, 8, 3] with diffusion effects. In other words, we want to model a crowd, where every agent has a spontaneous velocity, a non-degenerate diffusion (i.e. a stochastic component in its motion), driven by a Brownian motion, and is subject to a density constraint. Implementing this new element into these models could give a better approximation of reality, when dealing with large populations. We also underline that one of the goals of this analysis is to better prepare these hard congestion crowd motion models for a possible analysis in the framework of Mean Field Games (see [9,, ], and also [3]). These MFG models usually involve a stochastic term, also implying regularizing effects, which are useful in the mathematical analysis of the corresponding PDEs... The existing first order models in the light of [, 3]. Some macroscopic models for crowd motion with density constraints and hard congestion effects were studied in [3] and []. We briefly present them as follows: The density of the population in a bounded (convex) domain R d is described by a probability measure ρ P(). The initial density ρ P() evolves in time, and ρ t denotes its value at each time t [, T ]. The spontaneous velocity field of the population is a given time-dependent field, denoted by u t. It represents the velocity that each individual would like to follow in the absence of the others. Ignoring the density constraint, this would give rise to the continuity equation t ρ t + (ρ t u t ) =. We observe that in the original work [] the vector field u t (x) was taken of the form D(x) (independent of time and of gradient form) but we try here to be more general (see [8] for the general case, which requires some extra regularity assumption). The set of admissible densities will be denoted by K := {ρ P() : ρ }. In order to guarantee that K is neither empty nor trivial, we suppose >. The set of admissible velocity fields with respect to the density ρ is characterized by the sign of the divergence of the velocity field on the saturated zone; formally we set { } adm(ρ) := v : R d : v on {ρ = }. We consider the projection operator P in the L (L d ) or in the L (ρ) sense (which will be the same, because the only relevant zone is {ρ = }). Finally we solve the following modified continuity equation for ρ (.) t ρ t + (ρ t P adm(ρt)[u t ] ) =, where the main point is that ρ is advected by a vector field, compatible with the constraints, which is the closest to the spontaneous one. The problem in solving Equation (.) is that the projected field has very low regularity: it is a priori only L in x, and it does not depend smoothly on ρ neither (since a density and a density ε give very different projection operators). By the way, its divergence is not well-defined neither. The insertion of a diffusive behavoir in the model also leads to easier and more general uniqueness results: this is the object of an on-going work of the first author in collaboration with S. Di Marino, see [].

DIFFUSIVE EVOLUTION UNDER DENSITY CONSTRAINTS 3 To handle this issue we need to redefine the set of admissible velocities by duality: { } adm(ρ) = v L (ρ) : v p, p H (), p, p( ρ) =. With the help of this formulation we always have the orthogonal decomposition u = P adm(ρ) [u] + p, where p press(ρ) := { p H (ρ) : p, p( ρ) = }. Indeed, the cones adm(ρ) and press(ρ) are dual to each other. Via this approach the continuity equation (.) can be rewritten as a system for (ρ, p) which is (.) { t ρ t + (ρ t (u t p t )) = p, ρ, p( ρ) =. We can naturally endow this equation/system with initial condition ρ(, x) = ρ (x) (ρ K) and with Neumann boundary conditions... A diffusive counterpart. The goal of our work is to study a second order model of crowd movements with hard congestion effect where beside the transport factor a non-degenerate diffusion is present as well. The diffusion is the consequence of a randomness (a Brownian motion) in the movement of the crowd. With the ingredients that we introduced so far, we would like to modify the Fokker-Planck equation t ρ t ρ t + (ρ t u t ) = (always equipped with the natural Neumann boundary conditions on ) in order to take into account the density constraint ρ t. We observe that the Fokker Planck equation is derived from a motion dx t = u t (X t )dt + db t, but is macroscopically represented by the advection of the density ρ t by the vector field ρ t + u t. Projecting onto the set of admissible velocities raises a natural question: should we project only u t, and then apply the diffusion, or project the whole vector field, including ρ t? Indeed, this is not a real issue, as ρ t = on the ρ t saturated set {ρ t = }. This corresponds to the fact that the Heat Kernel preserves the constraint ρ. As a consequence, we consider the modified Fokker-Planck type equation ρ t ρ t (.3) { t ρ t ρ t + (ρ t P adm(ρt)[u t ] ) =, ρ(, x) = ρ (x), in, which can also be written equivalently as (.4) { t ρ t ρ t + (ρ t (u t p t )) = p, ρ, p( ρ) =, ρ(, x) = ρ (x), in. This equation describes the law of a motion where each particle solves the stochastic differential equation dx t = (u t (X t ) p t (X t ))dt + db t, where B t is the standard d-dimensional Brownian motion. More precisely, if we know the solution X t of the SODE above, and we define ρ t := L(X t ), the law of the random variable X t, this will solve the Fokker-Planck equation (.4).

4.3. Structure of the paper and main results. The main goal of the paper is to provide an existence result, with some extra estimates, for the Fokker-Planck equation (.4) via time discretization, using the so-called splitting method (the two main ingredients of the equation, i.e. the advection with diffusion on one hand, and the density constraint on the other hand, are treated one after the other). In Section we will collect some preliminary results, including what we need from optimal transport and from the previous works about density-constrained crowd motion, in particular on the projection operator onto the set K. In Section 3 we will provide the existence result we aim at, by a splitting scheme and some entropy bounds; the solution will be a curve of measures in H ([, T ]; W ()). In Section 4 we will make use of BV estimates to justify that the solution we just built is also Lip([, T ]; W ()) and satisfies a global BV bound ρ t BV C: this requires to combine BV estimates on the Fokker-Planck equation (which are available depending on the regularity of the vector field u) with BV estimates on the projection operator on K (which have been recently proven in []). Section 5 presents a short review of alternative approaches, all discretized in time, but based either on gradient-flow techniques (the JKO scheme, see [3]) or on different splitting methods. Finally, in the Appendix we detail the BV estimates on the Fokker-Planck equation (without any density constraint) that we could find; this seems to be a delicate matter, interesting in itself, and we are not aware of the sharp assumptions on the vector field u to guarantee the BV estimate that we need.. Preliminaries.. Basic definitions and general facts on optimal transport. Here we collect some tools from the theory of optimal transportation, Wasserstein spaces, its dynamical formulation, etc. which will be used later on. We set our problem either in a compact convex domain R d with smooth boundary (or in the d dimensional flat tours := T d, even if we will not adapt all our notations to the torus case). We refer to [3, 9] for more details. Given two probability measures µ, ν P() and for p we define the usual Wasserstein metric by means of the Monge-Kantorovich optimal transportation problem { W p (µ, ν) := inf x y p dγ(x, y) : γ Π(µ, ν) where Π(µ, ν) := {γ P( ) : (π x ) # γ = µ, (π y ) # γ = ν} and π x and π y denote the canonical projections from onto. This quantity happens to be a distance on P() which metrizes the weak- convergence of probability measures; we denote by W p () the space of probabilities on endowed with this distance. Moreover, in the quadratic case p = and under the assumption µ L d (the d dimensional Lebesgue measure on ) in the late 8 s Y. Brenier showed (see [4, 5]) that actually the optimal γ in the above problem is induced by a map, which is the gradient of a convex function, i.e. there exists T : and ψ : R convex such that T = ψ and γ := (id, T ) # µ. The function ψ is obtained as ψ(x) = x ϕ(x), where ϕ is the so-called Kantorovich potential for the transport from µ to ν, and is characterized as the solution of a dual problem that we will not develop here. In this way, the optimal transport map T can also be written as T (x) = x ϕ(x). Later in the 9 s R. McCann (see [5]) introduced a notion of interpolation between probability measures: the curve µ t := (( t)x + ty) # γ, for t [, ], gives a constant speed geodesic in the Wasserstein space connecting µ := µ and µ := ν. } p,

DIFFUSIVE EVOLUTION UNDER DENSITY CONSTRAINTS 5 Based on this notion of interpolation in J.-D. Benamou and Y. Brenier used some ideas from fluid mechanics to give a dynamical formulation to the Monge-Kantorovich problem (see [3]). They showed that W p p p (µ, ν) = inf {B p (E t, µ t ) : t µ t + (E t ) =, µ = µ, µ = ν}, where B p : M([, ] ) d L ([, ]; W p ()) R is given by de t p B p (E, µ) := p dµ t dµ t (x)dt, if E t µ t, a.e. t [, ], +, otherwise. It is well-known that B p is jointly convex and l.s.c. w.r.t the weak- convergence of measures (see Section 5.3. in [9]) and that, if t µ t + E t =, then B p (E, µ) < + implies that t µ t is a curve in W,p ([, ]; W p ()). Coming back to curves in Wasserstein spaces, it is well known (see [] or Section 5.3 in [9]) that for any distributional solution µ t (being a narrowly continuous curve in (P(), W p )) of the continuity equation t µ t + (v t µ t ) = we have the relations µ Wp (t) v t L p µt and W p (ϱ t, ϱ s ) t s µ Wp (τ) dτ, where we denoted by µ Wp (t) the metric derivative w.r.t. W p of the curve µ t (see for instance [] for general notions about curves in metric spaces and their metric derivative). For curves µ t which are geodesic in W p we have the equality W p (µ, ν) = µ Wp (t) dt = v t L p µt dt. The last equality is in fact the Benamou-Brenier formula with the optimal velocity field v t being the density of the optimal E t w.r.t. the optimal µ t. This optimal velocity field v t can be computed as v t := (T id) (T t ), where T t := ( t)id + tt is the transport in McCann s interpolation (we assume here that the initial measure µ is absolutely continuous, so that we can use transport maps instead of plans). This vector field is easily obtained if we consider that in this interpolation particles move with constant speed T (x) x, but x represents here a Lagrangian coordinate, and not an Eulerian one: if we want to know the velocity at time t at a given point, we have to find out first the original position of the particle passing through that point at that time. In the sequel we will also need the notion of entropy of a probability density, and for any probability measure ϱ P() we define it as ϱ(x) log ϱ(x) dx, if ϱ L E(ϱ) := d, +, otherwise. We recall that this functional is l.s.c. and geodesically convex in the W topology. As we will be mainly working with absolutely continuous probability measures (w.r.t. Lebesgue), we often identify measures with their densities. We denote by M(X) the signed Radon measures on X. Observe that µ L ([, ]; W p()) only means that µ = (mu t) t is a time-dependent family of probability measures.

6.. Projection problems in Wasserstein spaces. Our analysis relies a lot on the projection operator P K in the sense of W. Here K := {ρ P() : ρ } and P K [µ] := argmin ρ K W (µ, ρ). We recall (see [, 3] and []) the main properties of the projection P K operator. As far as is compact, for any probability measure µ, the minimizer in min ρ K W (µ, ρ) exists and is unique, and the operator P K is continuous (it is even C,/ for the W distance). The projection P K [µ] saturates the constraint ρ in the sense that for any µ P() there exists a measurable set B such that P K [µ] = B + µ ac B c, where µ ac is the absolutely continuous part of µ. The projection is characterized in terms of a pressure field, in the sense that ρ = P K [µ] if and only if there exists a Lipschitz function p, with p( ρ) =, and such that the optimal transport map T from ρ to µ is given by T := id ϕ = id + p. There is (as proven in []) a quantified BV estimate: if µ BV (in the sense that it is absolutely continuous and that its density belongs to BV ()), then P K [µ] is also BV and T V (P K [µ], ) T V (µ, ). This last BV estimate will be crucial in Section 4, and it is important to have it in this very form (other estimates of the form T V (P K [µ], ) at V (µ, ) + b would not be as useful as this one, as they cannot be easily iterated). 3. Existence via a splitting-up type algorithm (Main Scheme) Similarly to the approach in [3] (see the algorithm (3) and Theorem 3.5) for a general, nongradient, vector field, we will build a theoretical algorithm, after time-discretization, to produce a solution of (.4). In this section the spontaneous velocity field is a general vector field u t : R d (not necessarily a gradient), which depends also on time. We will work on a time interval [, T ] and in a bounded convex domain R d (the case of the flat torus is even simpler and we will not discuss it in details). We consider ρ P ac () to be given, which represents the initial density of the population, and we suppose ρ K. 3.. Splitting using the Fokker-Planck equation. We assume here that u t L () d for all t [, T ]. Let us consider the following scheme. Main scheme: Let τ > be a small time step with N := [T/τ]. Let us set ρ τ := ρ and for every k {,..., N} we define ρ τ k+ from ρτ k in the following way. First we solve (3.) { t ϱ t ϱ t + (ϱ t u t+ ) =, t ], τ], ϱ = ρ τ k, equipped with the natural Neumann boundary condition (( ϱ t u t ) n = a.e. on ) and set ρ τ k+ = P K[ ρ τ k+ ], where ρτ k+ = ϱ τ. See Figure on the right. ρ τ k t ϱ t ϱ t + (ϱ t u t+ ) = ρ τ k+ = ϱ τ Figure. One time step id + τ p τ k+ ρ τ k+

DIFFUSIVE EVOLUTION UNDER DENSITY CONSTRAINTS 7 This means: first follow the Fokker-Planck equation, ignoring the density constraint, for a time τ, then project. In order to state and prove the convergence of the scheme, we need to define some suitable interpolations of the discrete sequence of densities that we have just introduced. First interpolation. We define the following curves of densities, velocities and momentums constructed with the help of the ρ τ k s. First set { ρ τ ϱ(t ), if t [, (k + /)τ[, t := ( ) id + ((k + )τ t) p τ k+, if t [(k + /)τ, (k + )τ[, # ρτ k+ where ϱ t is the solution of the Fokker-Planck equation (3.) with initial datum ρ τ k and pτ k+ arises from the projection of ρ τ k+, more precisely (id + τ pτ k+ ) is the optimal transport from ρτ k+ to ρ τ k+. What are we doing? We are fitting into a time interval of length τ the two steps of our algorithm. First we follow the FP equation (3.) at double speed, then we interpolate between the measure we reached and its projection following the geodesic between them. This geodesic is easily described as an image measure of ρ τ k+ through McCann s interpolation. By the construction it is clear that ρ τ t is a continuous curve in P() for t [, T ]. We now define a family of time-dependent vector fields though { vt τ ϱ (t ) := ϱ + u (t ) t, if t [, (k + /)τ[, p τ k+ (id + ((k + )τ t) pτ k+ ), if t [(k + /)τ, (k + )τ[, and finally let use define the curve of momentums simply as E τ t := ρ τ t v τ t. Second interpolation. We define another interpolation as follows. Set ρ τ t := ϱ t, if t [, (k + )τ[, where ϱ t is (again) the solution of the Fokker-Planck equation (3.) on the time interval [, τ] with initial datum ρ τ k. Here we do not double its speed. We define the curve of velocities ṽ τ t := ϱ t ϱ t + u t, if t [, (k + )τ[, and we build the curve of momentums by Ẽτ t := ρ τ t ṽt τ. Mind the two differences in the construction of ρ τ t and ρ τ t (hence in the construction of vt τ and ṽt τ and Et τ and Ẽτ t ): ) the first one is continuous in time, while the second one is not; ) in the first construction we have taken into account the projection operator explicitly, while in the second one we see it just in an indirect manner (via the jumps occurring at every time of the form t = ). Third interpolation. For each τ, we also define piecewise constant curves, ˆρ τ t := ρ τ k+, ˆv τ t := p τ k+, if t [, (k + )τ[, if t [, (k + )τ[, and Êτ t := ˆρ τ t ˆv t τ. We remark that p τ k+ ( ρτ k+ ) =, hence the curve of momentums is just Ê τ t := p τ k+, if t [, (k + )τ[. In order to prove the convergence of the scheme above, we will obtain uniform H ([, T ]; W ()) bounds for the curves ρ τ. A key observation here is that the metric derivative (w.r.t. W ) of the solution of the Fokker-Planck equation is comparable with the time differential of the entropy functional along the same solution (see Lemma 3.). Now we state the main theorem of this section.

8 Theorem 3.. There exists a continuous curve [, T ] t ρ t W () and some measures E, Ẽ, Ê M([, T ] ) such that the curves ρτ, ρ τ, ˆρ τ converge uniformly in W () to ρ and E τ E, Ẽ τ Ẽ, Ê τ Ê, in M([, T ] ) d, as τ. Moreover E = Ẽ Ê and for a.e. t there exist v t, ṽ t, ˆv t L ρ t () d such that E = ρv, Ẽ = ρṽ, T ) Ê = ρˆv, ( v t L + ṽ ρt t L + ˆv ρt t L dt < +, v = ṽ ˆv, ρt Ẽt = ρ t u t ρ t and ˆv t = p t with p t and p t ( ρ t ) =. As a consequence, ρ t is a weak solution of the modified Fokker-Planck equation t ρ t ρ t + (ρ t (u t p t )) =, in ], T ], p (3.) t, ρ t, p t ( ρ t ) =, in [, T ], ( ρ t u t + p t ) n =, on [, T ], ρ(t =, ) = ρ, with the natural Neumann boundary conditions on. To prove this theorem we will use the following tools. Lemma 3.. Let us consider a solution ϱ t of the Fokker-Planck equation with the velocity field u t. Then for any time interval ]a, b[ we have the following estimate b (3.3) ϱ t + u t ϱ t dxdt E(ϱ a ) E(ϱ b ) + b u t ϱ t dxdt a In particular this implies (3.4) where ϱ t W b a ϱ t ϱ t W dt E(ϱ a ) E(ϱ b ) + b a a u t ϱ t dxdt, denotes the metric derivative of the curve t ϱ t W (). Proof. To prove this inequality, let us compute first d dt E(ϱ t) = (log ϱ t + ) t ϱ t dx = log ϱ t ( ϱ t (ϱ t u t )) dx = ( ϱ t ) + u t ϱ t dx, ϱ t where we used the conservation of mass (hence t ϱ t dx = ) and the Neumann boundary conditions in the integration by parts. We now compare this with ϱ t + u t ϱ t ϱ t dx ( u t ϱ t ) ϱ t dx = ϱ t u t dx ϱ t ( ϱt ) ϱ t u t dx = d dt E(ϱ t). This provides the first part of the statement, i.e. (3.3). If we combine this with the fact that the metric derivative of the curve t ϱ t is always less or equal than the L ϱ t norm of the velocity field in the continuity equation, we also get ϱ t W u t ϱ t d dt E(ϱ t), ϱ t

DIFFUSIVE EVOLUTION UNDER DENSITY CONSTRAINTS 9 and hence (3.4). Corollary 3.3. From the inequality (3.4) we deduce that E(ϱ b ) E(ϱ a ) b u t ϱ t dxdt, hence in particular for u t L () d we have that a In particular, if ϱ a, then we have that E(ϱ b ) E(ϱ a ) u L (b a). E(ϱ b ) u L (b a). The same estimate can be applied to the curve ρ τ, with a = and b ], (k + )τ[, thus obtaining E( ρ τ t ) Cτ for every t. Lemma 3.4. For any ρ P() we have E (P K [ρ]) E(ρ). Proof. We can assume ρ L d otherwise the claim is straightforward. As we pointed out in Section., we know that there exists a measurable set B such that P K [ρ] = B + ρ B c. Hence it is enough to prove that ρ log ρ dx = B B P K [ρ] log P K [ρ]dx, as the entropies on B c coincide. As the mass of ρ and P K [ρ] is the same on the whole, and they coincide on B c, we have ρ(x)dx = P K [ρ]dx = B. Then, by Jensen s inequality we have ( ρ log ρ dx B B The entropy decay follows. B B B B ) ( ) ρ dx log ρ dx =. B B To analyse the pressure field we will need the following result. Lemma 3.5. Let {p τ } τ> be a bounded sequence in L ([, T ]; H ()) and {ρ τ } τ> a sequence of piecewise constant curves valued in W (), which satisfy W (ρ τ (a), ρ τ (b)) C b a + τ for all a < b [, T ] and ρ τ C for a fixed constant C. Suppose that and that Then p( ρ) = a.e. p τ, p τ ( ρ τ ) =, ρ τ, p τ p weakly in L ([, T ]; H ()) and ρ τ ρ uniformly in W (). Proof. The proof of this result is the same as in Step 3 of Section 3. of [] (see also [8]), hence we omit it.

Lemma 3.6. (i) For every τ > and k we have W (ρ τ k, ρτ k+ ), W (ρ τ k, ρτ k+ ) τc ( E(ρ τ k ) E(ρτ k+ )) + Cτ, where C > only depends on u L. (ii) There exists a constant C, only depending on ρ and u L, such that B (E τ, ρ τ ) C, B (Ẽτ, ρ τ ) C and B (Êτ, ˆρ τ ) C. (iii) For the curve [, T ] t ρ τ t we have that T (ρ τ t ) W dt C, for a C > independent of τ. In particular, we have a uniform Hölder bound on ρ t : W (ρ t (a), ρ t (b)) C b a for every b > a. (iv) E τ, Ẽτ, Êτ are uniformly bounded sequences in M([, T ] ) d. Proof. (i) First by the triangle inequality and by the fact that ρ τ k+ = P K[ ρ τ k+ ] we have that (3.5) W (ρ τ k, ρτ k+ ) W (ρ τ k, ρτ k+ ) + W ( ρ τ k+, ρτ k+ ) W (ρ τ k, ρτ k+ ). We use (as before) the notation ϱ t, t [, τ] for the solution of the Fokker-Planck equation (3.) with initial datum ρ τ k, in particular we have ϱ τ = ρ τ k+. Using Lemma 3. and since ϱ = ρ τ k and τ ϱ τ = ρ τ k+ we have by (3.4) and using W (ρ τ k, ρτ k+ ) ϱ t W dt ( τ W (ρ τ k, ρτ k+ (τ ) ϱ t W dt ) ) τ (E(ϱ ) E(ϱ τ )) + τ τ τ ( E(ρ τ k ) E( ρτ k+ )) + Cτ τ ( E(ρ τ k ) E(ρτ k+ )) + Cτ, u +t ϱ t dxdt where C > is a constant depending just on u L. We also used the fact that E(ρ τ k+ ) E( ρτ k+ ), a consequence of Lemma 3.4. Now by the means of (3.5) we obtain W (ρ τ k, ρτ k+ ) τc ( E(ρ τ k ) E(ρτ k+ )) + Cτ. (ii) We use Lemma 3. on the intervals of type [, (k + /)τ[ and the fact that on each interval of type [(k + /)τ, (k + )τ[ the curve ρ τ t is a constant speed geodesic. In particular, on these intervals we have (ρ τ ) W = v τ t L ρ τ t On the other hand we also have τ p τ k+ L ρ τ k+ = τ p τ k+ L ρ τ k+ = W (ρ τ k+, ρτ k+ ). = W (ρ τ k+, ρτ k+ ) W (ρ τ k, ρτ k+ ) τc ( E(ρ τ k ) E(ρτ k+ )) + Cτ. Hence we obtain (k+)τ (k+/)τ = v τ t L (ρ τ t ) dt 4 ϱ (t ) ϱ (t ) + u t (k+)τ ϱ (t ) (x) dxdt + 4 p τ k+ ρ τ k+ dxdt (k+/)τ C ( E(ρ τ k ) E(ρτ k+ )) + Cτ + τ p τ k+ L ρ τ k+

DIFFUSIVE EVOLUTION UNDER DENSITY CONSTRAINTS C ( E(ρ τ k ) E(ρτ k+ )) + Cτ. Hence by adding up we obtain B (E τ, ρ τ ) { ( C E(ρ τ k ) E(ρ τ k+ )) + Cτ } = C ( E(ρ τ ) E(ρ τ N+) ) + CT C. k The estimate on B (Ẽτ, ρ τ ) and B (Êτ, ˆρ τ ) are completely analogous and descend from the previous computations. (iii) The estimate on B (E τ, ρ τ ) implies a bound on for ρ τ (i.e., the pair (E τ, ρ τ ) solves the continuity equation). (iv) In order to estimate the total mass of E we write T T ( E τ ([, T ] ) = vt τ ρ τ t dxdt vt τ ρ τ t dx ( T T vt τ ρ τ t dxdt) C. The bounds on Ẽτ and Êτ rely on the same argument. T (ρ τ t ) W dt because v τ is a velocity field ) ( ρ τ t dx Proof of Theorem 3.. We use the tools from Lemma 3.6. Step. By the bounds on the metric derivative of the curves ρ τ t we get compactness, i.e. there exists a curve [, T ] t ρ t P() such that ρ τ t converges uniformly in [, T ] w.r.t. W, in particular weakly- in P() for all t [, T ]. It is easy to see that ρ τ and ˆρ τ are converging to the same curve. Indeed we have ρ τ t = ρ τ s(t) and ˆρτ t = ρ τ ŝ(t) for s(t) t τ and ŝ(t) t τ, which implies W (ρ τ t, ρ τ t ), W (ρ τ t, ˆρ τ t ) Cτ. This provides the convergence to the same limit. Step. By the boundedness of E τ, Ẽτ and Êτ in M([, T ] ) d we have the existence of E, Ẽ, Ê M([, T ] )d such that E τ E, Ẽ τ Ẽ, Êτ Ê as τ. Now we show that E = Ẽ Ê. Indeed, let us show that for any f Lip([, T ] )d test function we have that T ( ) f t Et τ (Ẽτ t + Êτ t ) (dx, dt), as τ. First for each k {,..., N} we have that (k+/)τ (k+)τ f t Et τ (dx, dt) = f (t+)/ ( ϱ t + u t ϱ t )(dx, dt) and (k+)τ (k+/)τ f t E τ t (dx, dt) = = = (k+)τ (k+)τ (k+)τ + (k+)τ ) dt (k+)τ ( ) f t Ẽτ t (dx, dt) + f(t+)/ f t Ẽt τ (dx, dt) f (t+(k+)τ)/ (id + ((k + )τ t) p τ k+) p τ k+ρ τ k+(dx, dt) f t Êτ t (dx, dt) ( ft f (t+(k+)τ)/ (id + ((k + )τ t)) ) ˆv τ t ˆρ τ t (dx, dt)

This implies that T f t (Et τ Ẽτ t + Êτ t )(dx, dt) (k+)τ Lip(f)τ Ẽτ t (dx, dt) k + (k+)τ Lip(f)τ ( + ˆv t τ ) Êτ t (dx, dt) k ( Ẽτ ) τclip(f) ([, T ] ) + Êτ ([, T ] ) + B (Ê, ˆρ) τclip(f), for a uniform constant C >. Letting τ we prove the claim. Step 3. The bounds on B (E τ, ρ τ ), B (Ẽτ, ρ τ ) and B (Êτ, ˆρ τ ) pass to the limit by semicontinuity and allow to conclude that E, Ẽ and Ê are vector valued Radon measures absolutely continuous w.r.t. ρ. Hence there exist v t, ṽ t, ˆv t L ρ t () such that E = ρv, Ẽ = ρṽ and Ê = ρˆv. Step 4. We now look at the equations satisfied by E, Ẽ and Ê. First we use tρ τ + E τ =, we pass to the limit as τ, and we get t ρ + E =. Then, we use Ẽτ = ρ τ + u t ρ τ, we pass to the limit again as τ, and we get Ẽ = ρ + u t ρ. To justify the above limit, the only delicate point is passing to the limit the term u t ρ τ, since u is only L, and ρ τ converges weakly as measures, and we are a priori only allowed to multiply it by continuous functions. Yet, we remark that by Corollary 3.3 we have that E( ρ τ t ) Cτ for all t [, T ]. In particular, this provides, for each t, uniform integrability for ρ τ t and turns the weak convergence as measures into weak convergence in L. This allows to multiply by u t in the weak limit. Finally, we look at Êτ. There exists a piecewise constant (in time) function p τ (defined as p τ k+ on every interval ], (k + )τ]) such that p τ, p τ ( ˆρ τ ) =, T T T (3.6) p τ (dx, dt) = p τ ˆρ τ (dx, dt) = ˆv τ ˆρ τ (dx, dt) C and Êτ = p τ ˆρ τ = p τ. The bound (3.6) implies that p τ is uniformly bounded in L (, T ; H ()). Since for every t we have {p τ t = } {ˆρ τ t < }, we can use a version of Poincaré s inequality, and get a uniform bound in L ([, T ]; L ()) = L ([, T ] ). Hence there exists p L ([, T ] ) such that p τ p weakly in L as τ. In particular we have Ê = p. Moreover it is clear that p and by Lemma 3.5 we obtain p( ρ) = a.e. as well. Indeed, the assumptions of the Lemma are easily checked: we only need to estimate W (ˆρ τ (a), ˆρ τ (b)) for b > a, but we have W (ˆρ τ (a), ˆρ τ (b)) = W (ρ τ (k a τ), ρ τ (k b τ)) C k b k a, for k b τ b + τ and k a a. Once we have Ê = p with p( ρ) =, p L ([, T ]; H ()) and ρ L, we can also write Ê = p = ρ p. If we sum up our results, using E = Ẽ Ê, we have t ρ ρ + (ρ(u p)) = together with p, ρ, p( ρ) =.

DIFFUSIVE EVOLUTION UNDER DENSITY CONSTRAINTS 3 As usual, this equation is satisfied in a weak sense, with Neumann boundary conditions. 4. Uniform Lip([, T ]; W )) and BV estimates In this section we provide uniform estimates for the curves ρ τ, ρ τ and ˆρ τ of the following form: we prove uniform BV (in space) bounds on ρ τ (which implies the same bound for ˆρ τ ) and uniform Lipschitz bounds in time for the W distance for ρ τ. This is a small improvement compared to the previous section, both in time regularity (Lipschitz instead of H, but for W instead of W ) and in space (any higher order regularity of ρ was absent from the previous results). Nevertheless there is a price to pay for this improvement: we have to assume higher regularity for the velocity field. These uniform Lipschitz in time bounds are based both on BV estimates for the Fokker-Planck equation (see Lemma A. from Appendix A) and for the projection operator P K (see []). The assumption on u is essentially the following: we need to control the growth of the total variation of the solutions of the Fokker-Planck equation (3.), and we need to iterate this bound along time steps. We will discuss in the Appendix the different BV estimates on the Fokker-Planck equation that we were able to find. The desired estimate is true for u t C, () and u t (x) n(x) = for. x, and seems to be an open problem if u is only Lipschitz continuous. We will also assume ρ BV (). Despite these extra regularity assumptions, we think these estimates have their own interest, exploiting some finer properties of the solutions of the Fokker-Planck equation and of the Wasserstein projection operator. Before entering into the details of the estimates, we want to discuss why we concentrate on BV estimates (instead of Sobolev ones) and on W (instead of W p, p > ). The main reason is the role of the projection operator: indeed, even if ρ W,p (), we do not have in general P K [ρ] W,p because the projection creates some jumps at the boundary of {P K [ρ] = }. This prevents from obtaining any W,p estimate for p >. On the other hand, [] exactly proves a BV estimate on P K [ρ] and paves the way to BV bounds for our equation. Concerning the regularity in time, we observe that the velocity field in the Fokker-Planck equation contains a term in ρ/ρ. Since the metric derivative in W p is given by the L p norm (w.r.t. ρ t ) of the velocity field, it is clear that estimates in W p for p > would require spatial W,p estimates on the solution itself, which are impossible for p >. The precise result that we prove is Theorem 4.. Supposing u t C, C and ρ BV (), then we ρ τ t BV C and W (ρ τ k, ρτ k+ ) Cτ. As a consequence we also have ρ Lip([, T ]; W )) L ([, T ]; BV ()). To prove this theorem we need the following lemmas. Lemma 4.. Suppose u t Lip C and u t n = on. Then for the solution ϱ t of (A.) we have the estimate ϱ t L ϱ L e Ct, where C = u t L. Proof. Let us set f : [, + [ R, f t := ϱ t e Ct with a fixed constant C >. We have which means that f t is a solution of t f t = t ϱ t e Ct Cf t = e Ct ( ϱ t (ϱ t u t )) Cf t, t f t = f t (f t u t ) Cf t.

4 Now for a p > let us denote I p (t) := f t p L and compute p d dt I p(t) = p f t p t f t dx = p f t p ( f t (f t u t ) Cf t ) dx = pci p (t) p(p ) f t p f t dx + p(p ) f t p f t u t dx pci p (t) (p ) f t p u t dx ((p ) u t L pc)i p (t) By Grönwall s lemma we obtain that I p (t) e t((p ) u L pc) I p (), hence Ip /p (t) e t((p )/p u L C) Ip /p (), and sending p + we get that Using the definition of f t we get the estimation which proves the claim. f t L e t( u L C) f L. ϱ t L e t u L ϱ L, We remark that the above lemma implies in particular that after every step in the Main scheme we have ρ τ k+ eτc + Cτ, where c := u L. Let us now present the following lemma as well. Corollary 4.3. Along the iterations of our Main scheme, for every k we have W ( ρ τ k+, ρτ k+ ) τc for a constant C > independent of τ. Proof. With the saturation property of the projection (see Section. or []), we know that there exists a measurable set B such that ρ τ k+ = ρτ k+ B + \B. On the other hand we know that W ( ρ τ k+, ρτ k+ ) = sup f( ρ τ k+ ρτ k+ ) dx f Lip (), f diam() = sup f Lip (), f diam() f( ρ τ k+ \B ) dx τc diam(). We used the fact that the competitors f in the dual formula can be taken positive and bounded by the diameter of, just by adding a suitable constant. This implies as well that C is depending on c, and diam(). Proof of Theorem 4.. First we take care of the BV estimate. Lemma A. in the Appendix guarantees, for t ], (k +)τ[, that we have T V ( ρ τ t ) Cτ +e Cτ T V (ρ τ k ). Together with the BV bound on the projection that we presented in Section. (taken from []), this can be iterated, providing a uniform bound (depending on T V (ρ ), T and sup t u t C,) on ρ τ t BV. Passing this estimate to the limit as τ we get ρ L ([, T ]; BV ()).

DIFFUSIVE EVOLUTION UNDER DENSITY CONSTRAINTS 5 Then we estimate the behavior in terms of W. We estimate (k+)τ (k+)τ ( ρ W (ρ τ k, ρτ k+ ) ( ρ τ t ) τ W dt t Hence, we obtain This in particular means, for b > a, (k+)τ ρ τ t ) + u t ρ τ t dxdt ρ τ t BV dt + Cτ Cτ. W (ρ τ k, ρτ k+ ) W (ρ τ k, ρτ k+ ) + W ( ρ τ k+, ρτ k+ ) τc. W (ˆρ τ (a), ˆρ τ (b)) C(b a + τ). We can pass this relation to the limit, using that, for every t, we have ˆρ τ t hence also in W (), since W W ), we get W (ρ(a), ρ(b)) C(b a), which means that ρ is Lipschitz continuous in W (). ρ t in W () (and 5. Variations on a theme: some reformulations of the Main scheme In this section we propose some alternative approaches to study the problem (.4). The general idea is to discretize in time, and give a way to produce a measure ρ τ k+ starting from ρτ k. Observe that the interpolations that we proposed in the previous sections ρ τ, ρ τ and ˆρ τ are only technical tools to state and prove a convergence result, and the most important point is exactly the definition of ρ τ k+. The alternative approaches proposed here explore different ideas, more difficult to implement than then one that we presented in Section 3, and/or restricted to some particular cases (for instance when u is a gradient). They have their own modeling interest and this is the main reason justifying their sketchy presentation. 5.. Variant : transport, diffusion then projection. We recall that the original splitting approach for the equation without diffusion ([3, 8]) exhibited an important difference compared to what we did in Section 3. Indeed, in the first phase of each time step (i.e. before the projection) the particles follow the vector field u and ρ τ k+ was not defined as the solution of a continuity equation with advection velocity given by u t, but as the image of ρ τ k via a straight-line transport: ρ τ k+ := (id + τu ) # ρ τ k. One can wonder whether it is possible to follow a similar approach here. A possible way to proceed is the following: take a random variable X distributed according to ρ τ k, and define ρτ k+ as the law of X + τu (X) + B τ, where B is a Brownian motion, independent of X. This exactly means that every particle moves starting from its initial position X, following a displacement ruled by u, but adding a stochastic effect in the form of the value at time τ of a Brownian motion. We can check that this means ρ τ k+ := η τ ((id + τu ) # ρ τ k ), where η τ is a Gaussian kernel with zero-mean and variance τ, i.e. η τ (x) := 4τπ e x 4τ. Then we define ρ τ k+ := P K [ ρ k+ ].

6 Despite the fact that this scheme is very natural and essentially not that different from the Main scheme, we have to be careful with the analysis. First we have to quantify somehow the distance W p (ρ τ k, ρτ k+ ) for some p and show that this is of order τ in some sense. Second, we need to be careful when performing the convolution with the heat kernel (or adding the Brownian motion, which is the same): this requires either to work in the whole space (which was not our framework) or in a periodic setting ( = T d, the flat torus, which is qutie restrictive). Otherwise, the explicit convolution step should be replaced with some other construction, such as following the Heat equation (with Neumann boundary conditions) for a time τ. But this brings back to a situation very similar to the Main scheme, with the additional difficulty that we do not really have estimates on (id + τu ) # ρ τ k. 5.. Variant : gradient flow techniques for gradient velocity fields. In this section we assume that the velocity field of the population is given by the opposite of the gradient of a function, u t = V t a typical example is given when we take for V the distance function to the exit (see the discussions in [] about this type of question). We start from the case where V does not depend on time, and we suppose V W, (). In this particular case beside the splitting approach the problem has a variational structure, hence it is possible to show the existence by the means of gradient flows in Wasserstein spaces. Since the celebrated paper of Jordan, Kinderlehrer and Otto ([3]) we know that the solutions of the Fokker-Planck equation (with a gradient vector field) can be obtained with the help of the gradient flow of a perturbed entropy functional with respect to the Wasserstein distance W. This formulation of JKO scheme was also used in [] for the first order model with density constraints. It is easy to combine the JKO scheme with density constraints to study the second order/diffusive model. As a slight modification of the model from [], we can consider the following discrete implicite Euler (or JKO) scheme. As usual, we fix a time step τ >, ρ τ = ρ and for all k N we just need to define ρ τ k+. We take { (5.) ρ τ k+ = argmin ρ P() V (x)ρ(x) dx + E(ρ) + I K (ρ) + τ W (ρ, ρ τ k ) where I K is the indicator function of K, which is {, if x K, I K (x) := +, otherwise. The usual techniques from [3, ] can be used to identify that the problem (.4) is the gradient flow of the functional ρ J(ρ) := V (x)ρ(x) dx + E(ρ) + I K (ρ) and that the above discrete scheme converges (up to a subsequence) to a solution of (.4), thus proving existence. The key estimate for compactness is τ W (ρ τ k+, ρτ k ) J(ρτ k ) J(ρτ k+ ), which can be summed up (as on the r.h.s. we have a telescopic series), thus obtaining the same bounds on B that we used in Section 3. It is also possible to study a variant where V depends on time. We assume for simplicity that V Lip([, T ] ) (this is a simplification, less regularity in space, such as W,, could be sufficient). In this case we define J t (ρ) := V t (x)ρ(x) dx + E(ρ) + I K (ρ) },

and DIFFUSIVE EVOLUTION UNDER DENSITY CONSTRAINTS 7 (5.) ρ τ k+ = argmin ρ P() { J (ρ) + } τ W (ρ, ρ τ k ), The analysis proceeds similarly, with the only exception that the we get τ W (ρ τ k+, ρτ k ) J (ρ τ k ) J (ρ τ k+ ), which is no more a a telescopic series. Yet, we have J (ρ τ k+ ) J (k+)τ (ρ τ k+ ) + Lip(V )τ, and we can go on with a telescopic sum plus a remainder of the order of τ. 5.3. Variant 3: transport then gradient flow-like step with the penalized entropy functional. We present now a different scheme, which combines some of the previous approaches. It could formally provide a solution of the same equation, but presents some extra difficulties. We define now ρ τ k+ := (id + τu ) # ρ τ k and with the help of this we define ρ τ k+ := argmin ρ KE(ρ) + τ W (ρ, ρ τ k+ ). In the last optimization problem we minimize a strictly convex and l.s.c. functionals, and hence we have existence and uniqueness of the solution. The formal reason for this scheme being adapted to the equation is that we perform a step of a JKO scheme in the spirit of [3] (without the density constraint) or of [] (without the entropy term). This should let a term ρ (ρ p) appear in the evolution equation. The term (ρu) is due to the first step (the definition of ρ τ k+ ). To explain a little bit more for the unexperienced reader, we consider the optimality conditions for the above minimization problem. Following [], we can say that ρ K is optimal if and only if there exists a constant l R and a Kantorovich potential ϕ for the transport from ρ to ρ τ k such that on ( ln ρ + ϕ ) τ < l, ρ = on ( ρ τ k+ ln ρ + ϕ ) τ > l, [, ] on ( ln ρ + ϕ ) τ = l. We then define p = (l ln ρ ϕ τ ) + and we get p press(ρ). Moreover, ρ a.e. p = ρ ρ ϕ τ. ρ τ k ρ τ k+ We then use the fact that the optimal transport id τ(u (k+)τ p ρ ρ )+o(τ) is of the form T = id ϕ and obtain a situation as is sketched in Figure. Figure. One time step id + τu id + τ( p + ρ ρ ) Notice that (id + τu ) (id + τp) = id τ(u (k+)τ p) + o(τ) provided u is regular enough. Formally we can pass to the limit τ and have t ρ ρ + (ρ(u p)) =. Yet, this turns out to be quite naïve, because we cannot get proper estimates on W (ρ τ k, ρτ k+ ). Indeed, this is mainly due to the hybrid nature of the scheme, i.e. a gradient flow for the diffusion and the projection part on one hand and a free transport on the other hand. The typical estimate in the JKO scheme comes from the fact that one can bound W (ρ τ k, ρτ k+ ) /τ with the opposite of the increment of the energy, and that this gives rise to a telescopic sum. Yet, this is not the case whenever the base point for a new time step is not equal to the previous minimizer. These kinds of difficulties are matter of current study, in particular for mixed systems and/or multiple populations.

8 Appendix A. BV -type estimates for the Fokker-Planck equation Here we present some Total Variation (T V ) decay results (in time) for the solutions of the Fokker-Planck equation. Some are very easy, some trickier. The goal is to look at those estimates which can be easily iterated in time and combined with the decay of the T V via the projection operator, as we did in Section 4. Let us take a vector field v : [, + [ R d (we will choose later which regularity we need) and consider in the problem t ρ t ρ t + (ρ t v t ) =, in ], + [, (A.) ( ρ t v t ) n =, on ], + [, ρ(, ) = ρ, in, for ρ BV () P(). Lemma A.. Suppose v t C, C for a.e. t [, + [. Suppose that either = T d, or that is convex and v n = on. Then, we have the following total variation decay estimate (A.) ρ t dx C(t s) + e C(t s) ρ s dx, s t, where C > is a constant depending just on the C, norm of v. Proof. First we remark that by the regularity of v the quantity v L + Dv L + ( v) L is uniformly bounded. Let us drop now the dependence on t in our notation and calculate in coordinates ( ) d ρ ρ dx = dt ρ ( ρ ρ j tρ) dx = ( ρ (vρ)) dx = ρ iij ( (vρ)) j dx ρ ρ j i ( ) ρ ij = ρ ρ jρ k ρ ki ρ ij ρ j ( ρ 3 dx + B v i,k ρ ij ρ + viρ i j + vjρ i i + v i ) ρ ij dx j,i B + C + C ρ dx + ρ v dx + B B + B + C + C ρ dx. Here the B i are the boundary terms, i.e. ρ j n i ρ ij B := ρ dh d and B := v n ρ dh d. The constant C > only depends on v L + v L + ( v) L. We used as well the fact ( ) ρ ij that ρ ρ jρ k ρ ki ρ ij ρ 3 dx.,k Now, it is clear that in the case of the torus the boundary terms B and B do not exist, hence we conclude by Grönwall s lemma. In the case of the convex domain we have B = (because of the assumption v n = ) and B because of the next Lemma A.. Lemma A.. Suppose that u : R d is a smooth vector field with u n = on, ρ is a smooth function with ρ n = on, and that R d is a smooth convex set parametrized as

DIFFUSIVE EVOLUTION UNDER DENSITY CONSTRAINTS 9 = {h < } for a smooth convex function h with h = on (so that n = h on ). Then we have, on the whole boundary, u i jρ j n i = u i h ij ρ j. In particular, we have ρ ij ρ j n i. Proof. The Neumann boundary assumption on u means u(γ(t)) h(γ(t)) = for every curve γ valued in and for all t. Differentiating in t, we get u i j(γ(t))(γ (t)) j h i (γ(t)) + u i (γ(t))h ij (γ(t))(γ (t)) j =. Take a point x and choose a curve γ with γ(t ) = x and γ (t ) = ρ(x ) (which is possible, since this vector is tangent to by assumption). This gives the first part of the statement. The second part, i.e. ρ ij ρ j n i, is obtained by taking u = ρ and using that D h(x ) is a positive definite matrix. Remark A.3. If we look attentively at the proof of Lemma A., we can see that we did not really exploit the regularizing effects of the diffusion term in the equation. This means that the regularity estimate that we provide are the same that we would have without diffusion: in this case, the density ρ t is obtained from the initial density as the image through the flow of v. Thus, the density depends on the determinant of the Jacobian of the flow, hence on the derivatives of v. It is normal that, if we want BV bounds on ρ t, we need assumptions on two derivatives of v. We would like to prove some form of BV estimates under weaker regularity assumptions on v, trying to exploit the diffusion effects. In particular, we would like to treat the case where v is only C,. As we will see in the following lemma, this degenerates in some sense. Lemma A.4. Suppose that is either the torus or a smooth convex set = {h < } parameterized as a level set of a smooth convex function h. Let v t : R d be a vector field for t [, T ], Lipschitz and bounded in space, uniformly in time. In the case of a convex domain, suppose v n = on. Let H : R d R be given by H(z) := ε + z. Now let ρ t (sufficiently smooth) be the solution of the Fokker-Planck equation with homogeneous Neumann boundary condition. Then there exists a constant C > (depending on v and ) such that (A.3) H( ρ t ) dx H( ρ ) dx + Cεt + C t ρ s L ε ds. Proof. First let us discuss about some properties of H. It is smooth, its gradient is H(z) = z H(z) and it satisfies H(z) z H(z), z R d. Moreover its Hessian matrix is given by where δ ij = [ δ ij H (z) z i z j ] [H ij (z)] {,...,d} = H 3 (z) {, if i = j,, if j j, {,...,d} is the Kronecker symbol. = H(z) I d H 3 (z) z z, z Rd, Note that, from this computation, the matrix D H is bounded from above by H, and hence by ε. Moreover we introduce a uniform constant C > such that v L + v L + Dv L C.

Now to show the estimate of this lemma we calculate the quantity d H( ρ t ) dx. dt d H( ρ t ) dx = H( ρ t ) t ρ t dx = H( ρ t ) ( ρ t (v t ρ t )) dx dt = H( ρ t ) ρ t dx H( ρ t ) ( (v t ρ t )) dx =: (I) + (II) Now we study each term separately and for the simplicity we drop the t subscripts in the followings. We start from the case of the torus, where there is no boundary term in the integration by parts. (I) = H( ρ) ρ dx = H j ( ρ)ρ jii dx = H kj ( ρ)ρ ik ρ ji dx (II) = =,k =: (II a ) + (II b ). j,i j,i,k H( ρ) ( (vρ)) dx = H j ( ρ)(v i ρ) ij dx H jk ( ρ)ρ ki vjρ i dx + H jk ( ρ)ρ ki v i ρ j dx,k First look at the term (II a ). Since the matrix H jk is positive definite, we can apply a Young inequality for each index i and obtain (II a ) = H jk ( ρ)ρ ki vjρ i dx H jk ( ρ)ρ ki ρ ij dx + H jk ( ρ)v i jvk i ρ dx,k,k (I) + C ρ L D H L.,k The L norm in the second term will be estimated by the L norm for the sake of simplicity (see Remark A.5 below). For the term (II b ) we first make a point-wise computation H jk ( ρ)ρ ki v i ρ j =,k = H 3 ( ρ) ε H 3 ( ρ) [Di ρ (ε I d + ρ I d ρ ρ ) ρ]v i i v i Di ρ ρ = ε i i ( ) v i i. H( ρ) where Di ρ denotes the ith row in the Hessian matrix of ρ and we used ( ρ I d ρ ρ ) ρ =. Integrating by parts we obtain (II b ) = ε ( v) H( ρ) dx Cε /H L Cε, where we used H(z) ε.

DIFFUSIVE EVOLUTION UNDER DENSITY CONSTRAINTS Summing up all the terms we get and using D H ε we get d H( ρ t ) dx dt (I) + C ρ t L D H L + Cε Cε + C ρ t L ε, which proves the claim. If we switch to the case of a smooth bounded convex domain, we have to handle boundary terms. These terms are H j ( ρ)ρ ij n i H j ( ρ)ρvjn i i, where we ignored those terms involving n i v i (i.e., the integration by parts in (II b ), and the term H j ( ρ)ρ j n i v i in the integration by parts of (II a )), since we already supposed v n =. We use here Lemma A., which provides H j ( ρ)ρ ij n i ρh j ( ρ)vjn i i = H( ρ) ( ρj ρ ij n i ρρ j v i jn i) = H( ρ) ( ρj h ij ρ i ρρ j h ij v i). If we use the fact that the matrix D h is positive definite and a Young inequality, we get ρ jh ij ρ i and ρ ρ j h ij v i ρ j h ij ρ i + ρ v j h ij v i, which implies ( ρj ρ ij n i ρρ j v i H( ρ) jn i) ρ H( ρ) D h L v C ρ L. ε This provides the desired estimate on the boundary term. Remark A.5. In the above proof, we needed to use the L norm of ρ only in the boundary term. When there is no boundary term, the L norm is enough, in order to handle the term (II a ). In both cases, the norm of ρ can be bounded in terms of the initial norm multiplied by e Ct, where C bounds the divergence of v. On the other hand, in the torus case, one only needs to suppose ρ L and in the convex case ρ L. Both assumptions are satisfied in the applications to crowd motion with density constraints. We have seen that the constants in the above inequality depend on ε and explode as ε. This prevents us to obtain a clean estimate on the BV norm in this context, but at least proves that ρ BV ρ t BV for all t > (to achieve this result, we just need to take ε = ). Unfortunately, the quantity which is estimated is not the BV norm, but the integral H( ρ). This is not enough for the purpose of the applications to Section 4, as it is unfortunately not true that the projection operator decreases the value of this other functional 3. If we stay interested to the value of the BV norm, we can provide the following estimate. 3 Here is a simple counter-example: consider µ = g(x)dx a BV density on [, ] R, with g defined as follows. Divide the interval [, ] into K intervals J i of length r (with rk = ); call t i the center of each interval J i (i.e. t i = ir + r, for i =,..., K ) and set g(x) = L + r (x t i) on each J i with i odd, and g(x) = on J i for i even, taking L = πr/4. It is not difficult to check that the projection of µ is equal to the indicator function of the union of all the intervals J i with i odd, and that the value of H( ρ) has increased by K( π/)r = π/4, i.e. by a positive constant (see Figure 3).