Convex discretization of functionals involving the Monge-Ampère operator
Quentin Mérigot, CNRS / Université Paris-Dauphine
Joint work with J.-D. Benamou, G. Carlier and É. Oudet
Workshop on Optimal Transport in the Applied Sciences, December 8-12, 2014, RICAM, Linz
1. Motivation: Gradient flows in Wasserstein space
Background: Optimal transport

P₂(ℝᵈ) := {μ ∈ P(ℝᵈ); ∫ |x|² dμ(x) < +∞},  P₂ᵃᶜ(ℝᵈ) := P₂(ℝᵈ) ∩ L¹(ℝᵈ).

Wasserstein distance between μ, ν ∈ P₂(ℝᵈ): with the set of transport plans
Γ(μ, ν) := {π ∈ P(ℝᵈ × ℝᵈ); p₁#π = μ, p₂#π = ν},

Definition: W₂²(μ, ν) := min_{π ∈ Γ(μ,ν)} ∫ |x − y|² dπ(x, y).

Relation to convex functions. Def: K := finite convex functions on ℝᵈ.

Theorem (Brenier): Given μ ∈ P₂ᵃᶜ(ℝᵈ), the map φ ∈ K ↦ ∇φ#μ ∈ P₂(ℝᵈ) is surjective, and moreover
W₂²(μ, ∇φ#μ) = ∫_{ℝᵈ} |x − ∇φ(x)|² dμ(x).  [Brenier 91]

Given any μ ∈ P₂ᵃᶜ(ℝᵈ), we thus get a parameterization of P₂(ℝᵈ), as seen from μ.
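The Kantorovich definition above is directly computable for discrete measures: W₂² is the value of a linear program over couplings π with prescribed marginals. A minimal sketch (not part of the talk; the generic LP solver scipy.optimize.linprog is used only for illustration and scales poorly):

```python
import numpy as np
from scipy.optimize import linprog

def w2_squared(xs, mu, ys, nu):
    """W2^2 between discrete measures mu = sum_i mu_i delta_{xs[i]} and
    nu = sum_j nu_j delta_{ys[j]}: the Kantorovich linear program
    min_{pi >= 0} sum_ij |x_i - y_j|^2 pi_ij over couplings pi."""
    n, m = len(xs), len(ys)
    cost = ((xs[:, None, :] - ys[None, :, :]) ** 2).sum(-1)  # |x_i - y_j|^2
    # marginal constraints: rows of pi sum to mu, columns of pi sum to nu
    A_eq = np.zeros((n + m, n * m))
    for i in range(n):
        A_eq[i, i * m:(i + 1) * m] = 1.0
    for j in range(m):
        A_eq[n + j, j::m] = 1.0
    b_eq = np.concatenate([mu, nu])
    return linprog(cost.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None)).fun

# Two unit point masses at distance 1: W2^2 = 1.
xs, ys = np.array([[0.0, 0.0]]), np.array([[1.0, 0.0]])
print(w2_squared(xs, np.array([1.0]), ys, np.array([1.0])))  # -> 1.0
```

Brenier's theorem says that for μ absolutely continuous the optimal π is supported on the graph of a gradient of a convex function; the LP above does not exploit this structure.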
Motivation 1: Crowd Motion Under Congestion

JKO scheme for crowd motion with hard congestion [Maury-Roudneff-Chupin-Santambrogio 10]:

ρᵏ⁺¹_τ = argmin_{σ ∈ P₂(X)} (1/2τ) W₂²(ρᵏ_τ, σ) + E(σ) + U(σ)   (★)

where X ⊆ ℝᵈ is convex and bounded, E(ν) := ∫_{ℝᵈ} V(x) dν(x) is the potential energy, and the congestion term is
U(ν) := 0 if dν = f dℋᵈ with f ≤ 1, +∞ if not.

Assuming σ = ∇φ#ρᵏ_τ with φ convex, the Wasserstein term becomes explicit:

(★) ⇔ min_φ (1/2τ) ∫_{ℝᵈ} |x − ∇φ(x)|² ρᵏ_τ(x) dx + E(∇φ#ρᵏ_τ) + U(∇φ#ρᵏ_τ).

On the other hand, the constraint becomes strongly nonlinear:
U(∇φ#ρᵏ_τ) < +∞  ⇔  det D²φ(x) ≥ ρᵏ_τ(x).
Motivation 2: Nonlinear Diffusion

(◇) ∂ρ/∂t = div[ρ ∇(U′(ρ) + V + W ⋆ ρ)],  ρ(0, ·) = ρ₀,  ρ(t, ·) ∈ Pᵃᶜ(ℝᵈ).

Formally, (◇) can be seen as the W₂-gradient flow of U + E, with

U(ν) := ∫_{ℝᵈ} U(f(x)) dx if dν = f dℋᵈ, +∞ if not   (internal energy; ex: U(r) = r log r gives the entropy),

E(ν) := ∫_{ℝᵈ} V(x) dν(x) + ∫_{ℝᵈ×ℝᵈ} W(x − y) d[ν ⊗ ν](x, y)   (potential energy + interaction energy).

JKO time-discrete scheme: for τ > 0 [Jordan-Kinderlehrer-Otto 98],

ρᵏ⁺¹_τ = argmin_{σ ∈ P(ℝᵈ)} (1/2τ) W₂²(ρᵏ_τ, σ) + U(σ) + E(σ).

Many applications: porous medium equation, cell movement via chemotaxis, Cournot-Nash equilibria, etc.
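To make the JKO scheme concrete, here is a toy one-dimensional step (an illustrative sketch, not the discretization of the talk): with only a potential energy E(ν) = ∫ V dν and μ represented by N equal-mass particles at sorted positions, the 1D Wasserstein term is a sum of squared displacements, and a JKO step is a smooth convex minimization over the new positions.

```python
import numpy as np
from scipy.optimize import minimize

def jko_step_1d(x, tau, V):
    """One JKO step in 1D for E(nu) = ∫ V dnu alone, with mu represented
    by N equal-mass particles at sorted positions x (its quantiles).
    For equal-mass sorted particles the optimal coupling is monotone, so
    W2^2 is (1/N) * sum of squared displacements."""
    obj = lambda y: (0.5 / tau) * np.mean((y - x) ** 2) + np.mean(V(y))
    return minimize(obj, x).x

# For V(y) = y^2/2 the step decouples per particle into the implicit
# Euler step y_i = x_i / (1 + tau) for the ODE x' = -V'(x).
x = np.linspace(-1.0, 1.0, 11)
y = jko_step_1d(x, tau=0.5, V=lambda y: 0.5 * y ** 2)
print(np.allclose(y, x / 1.5, atol=1e-4))  # -> True
```

This recovers the classical observation that one JKO step is an implicit (backward Euler) step of the underlying evolution; the internal energy U is what makes the multi-dimensional problem genuinely nonlinear.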
Displacement Convex Setting

For X convex bounded and μ ∈ Pᵃᶜ(X), let K_X := {φ convex; ∇φ ∈ X}. Then

min_{ν ∈ P(X)} (1/2τ) W₂²(μ, ν) + E(ν) + U(ν)  ⇔  min_{φ ∈ K_X} (1/2τ) W₂²(μ, ∇φ#μ) + U(∇φ#μ) + E(∇φ#μ)   (★_X)

since the gradient constraint ∇φ ∈ X encodes spt(ν) ⊆ X.

When is the minimization problem (★_X) convex? Recall

U(ν) := ∫_{ℝᵈ} U(dν/dℋᵈ) dx,  E(ν) := ∫_{ℝᵈ} (V + W ⋆ ν) dν.

Theorem [McCann 94]: (★_X) is convex if
(H1) V, W: ℝᵈ → ℝ are convex functions,
(H2) r ↦ rᵈ U(r⁻ᵈ) is convex non-increasing, with U(0) = 0.

NB: with MA[φ](x) := det(D²φ(x)), one has U(∇φ#ρ) = ∫ U(ρ(x)/MA[φ](x)) MA[φ](x) dx.

Goal: a convergent and convex spatial discretization of (★_X), and numerical applications.
Numerical applications of the JKO scheme

Numerical applications of the (variational) JKO scheme are still limited:
- 1D: monotone rearrangement, e.g. [Kinderlehrer-Walkington 99] [Blanchet-Calvez-Carrillo 08] [Agueh-Bowles 09]
- 2D: optimal transport plans → diffeomorphisms [Carrillo-Moll 09]; U = hard congestion term [Maury-Roudneff-Chupin-Santambrogio 10]

Our goal is to approximate a JKO step numerically, in dimension 2: for X convex bounded and μ ∈ Pᵃᶜ(X),

(★_X)  min_{ν ∈ P(X)} (1/2τ) W₂²(μ, ν) + E(ν) + U(ν).

Plan of the talk:
- Part 2: Convex discretization of the problem (★_X) under McCann's hypotheses.
- Part 3: Γ-convergence results from the discrete problem to the continuous one.
- Part 4: Numerical simulations: nonlinear diffusion, crowd motion.
2. Convex discretization of a JKO step
Prior work: Discretizing the Space of Convex Functions

min_{φ ∈ K} ∫_X F(∇φ(x), φ(x)) dμ(x),  K := {φ convex},  discretized over a finite set P ⊆ X with N := card(P).

- Finite elements: piecewise-linear convex functions over a fixed mesh; ≈ N constraints, but there is a non-convergence result [Choné-Le Meur 99].
- Finite differences, using convex interpolates [Carlier-Lachand-Robert-Maury 01]: convergent, but the number of constraints is ≈ N² [Ekeland-Moreno-Bromberg 10].
- Adaptive methods [Mirebeau 14] [Oberman 14]; exterior parameterization [Oudet-M. 14].

Our functional involves the Monge-Ampère operator det D²φ: this will help us.
Discretizing the Space of Convex Functions

min_{φ ∈ K_X} (1/2τ) W₂²(μ, ∇φ#μ) + E(∇φ#μ) + U(∇φ#μ),  K_X := {φ convex; ∇φ ∈ X}.

Definition: Given P ⊆ X finite, K_X(P) := {ψ|_P; ψ ∈ K_X} is a finite-dimensional convex set.

Extension of φ ∈ K_X(P): φ̂ := max{ψ; ψ ∈ K_X and ψ|_P ≤ φ} ∈ K_X.

Discrete push-forward of μ_P = Σ_{p∈P} μ_p δ_p ∈ P(P):

Definition: For φ ∈ K_X(P), let V_p := ∂φ̂(p) ⊆ X and

Gφ#μ_P := Σ_{p∈P} [μ_p / ℋᵈ(V_p)] ℋᵈ|_{V_p} ∈ Pᵃᶜ(X).
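The extension φ̂ is the upper envelope of affine functions lying below φ on P, so φ̂(p) = φ(p) exactly when the lifted point (p, φ(p)) lies on the lower convex hull of the lifted set. A small sketch (illustrative, not the talk's code) that tests this via scipy's ConvexHull; it only probes the convexity part of K_X(P), not the gradient constraint ∇φ̂ ∈ X:

```python
import numpy as np
from scipy.spatial import ConvexHull

def on_lower_hull(P, phi):
    """Mark the points p where phi(p) = phi_hat(p), i.e. where the lifted
    point (p, phi(p)) lies on a downward-facing facet of the convex hull
    of the lifted set {(q, phi(q)); q in P}."""
    lifted = np.column_stack([P, phi])
    hull = ConvexHull(lifted)
    active = np.zeros(len(P), dtype=bool)
    for simplex, eq in zip(hull.simplices, hull.equations):
        if eq[-2] < 0:  # outward normal points downward: lower-hull facet
            active[simplex] = True
    return active

# Square corners + center, with the convex values phi(p) = |p|^2: all active.
P = np.array([[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0], [0.0, 0.0]])
phi = (P ** 2).sum(axis=1)
print(on_lower_hull(P, phi).all())   # -> True
phi[4] = 3.0                         # lift the center value above the envelope
print(on_lower_hull(P, phi)[4])      # -> False
```

In the second case the raised center point no longer agrees with its convex envelope, so the values (φ(p))_{p∈P} leave the set K_X(P).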
Convex Space-Discretization of One JKO Step

Theorem [Benamou, Carlier, M., Oudet 14]: Under McCann's hypotheses (H1) and (H2), the problem

min_{φ ∈ K_X(P)} (1/2τ) W₂²(μ_P, Hφ#μ_P) + E(Hφ#μ_P) + U(Gφ#μ_P)

is convex, and the minimum is unique if r ↦ rᵈ U(r⁻ᵈ) is strictly convex.

Unfortunately, two different definitions of push-forward (H and G) seem necessary.

The convexity of φ ∈ K_X(P) ↦ U(Gφ#μ_P) follows from the log-concavity of the discrete Monge-Ampère operator MA[φ](p) := ℋᵈ(∂φ̂(p)).

If lim_{r→∞} U(r)/r = +∞, the internal energy is a barrier for convexity, i.e.

U(Gφ#μ_P) < +∞  ⟹  ∀p ∈ P, MA[φ](p) = ℋᵈ(∂φ̂(p)) > 0  ⟹  φ lies in the interior of K_X(P).

NB: |P| nonlinear constraints vs ≈ |P|² linear constraints.
Convex Space-Discretization of the Internal Energy

Proposition: Under McCann's assumption (H2), φ ∈ K_X(P) ↦ U(Gφ#μ_P) is convex.

Discrete Monge-Ampère operator: MA[φ](p) := ℋᵈ(∂φ̂(p)).

Proof: Recall Gφ#μ_P = Σ_{p∈P} [μ_p / MA[φ](p)] 1_{∂φ̂(p)}. With U(σ) = ∫ U(σ(x)) dx,

U(Gφ#μ_P) = Σ_{p∈P} U(μ_p / MA[φ](p)) MA[φ](p)

(NB: compare with U(∇φ#ρ) = ∫ U(ρ(x)/MA[φ](x)) MA[φ](x) dx). In the case U(r) = r log r,

U(Gφ#μ_P) = −Σ_{p∈P} μ_p log(MA[φ](p)) + Σ_{p∈P} μ_p log(μ_p).

Lemma: Given φ₀, φ₁ ∈ K_X(P) and φ_t := (1 − t)φ₀ + tφ₁, one has ∂φ̂_t(p) ⊇ (1 − t)∂φ̂₀(p) ⊕ t∂φ̂₁(p), where ⊕ is the Minkowski sum. Hence

log ℋᵈ(∂φ̂_t(p)) ≥ log ℋᵈ((1 − t)∂φ̂₀(p) ⊕ t∂φ̂₁(p))
             ≥ (1 − t) log ℋᵈ(∂φ̂₀(p)) + t log ℋᵈ(∂φ̂₁(p))   (Brunn-Minkowski),

so that U(Gφ_t#μ_P) ≤ (1 − t) U(Gφ₀#μ_P) + t U(Gφ₁#μ_P). ∎
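The Brunn-Minkowski step, i.e. log-concavity of the volume along Minkowski combinations, can be sanity-checked numerically on axis-aligned boxes, for which the combination (1−t)A ⊕ tB is again a box with interpolated side lengths (a toy verification on a special case, of course not a proof):

```python
import numpy as np

# Check log H^d((1-t)A + tB) >= (1-t) log H^d(A) + t log H^d(B) on
# axis-aligned boxes in R^3: the Minkowski combination of two boxes with
# side lengths a, b is the box with side lengths (1-t)*a + t*b.
rng = np.random.default_rng(0)
for _ in range(1000):
    a, b = rng.uniform(0.1, 2.0, size=(2, 3))   # side lengths of A and B
    t = rng.uniform(0.0, 1.0)
    lhs = np.sum(np.log((1 - t) * a + t * b))   # log volume of the combination
    rhs = (1 - t) * np.sum(np.log(a)) + t * np.sum(np.log(b))
    assert lhs >= rhs - 1e-12
print("log-concavity of the volume verified on 1000 random box pairs")
```

For boxes the inequality reduces to concavity of the logarithm in each coordinate; the full Brunn-Minkowski inequality extends it to arbitrary convex bodies, which is exactly what the proof above uses for the subdifferential cells.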
3. A Γ-convergence result
Convergence of the Space-Discretization

Setting: X bounded and convex, μ ∈ Pᵃᶜ(X) with density c⁻¹ ≤ ρ ≤ c.

min_{ν ∈ P(X)} (1/2τ) W₂²(μ, ν) + E(ν) + U(ν)   (★)

(C1) E continuous, U l.s.c. on (P(X), W₂);
(C2) U(ρ) = ∫ U(ρ(x)) dx, where U is convex and satisfies McCann's condition for displacement convexity.

Theorem [Benamou, Carlier, M., Oudet 14]: Let P_n ⊆ X be finite, μ_n ∈ P(P_n) with lim_n W₂(μ_n, μ) = 0, and consider

min_{φ ∈ K_X(P_n)} (1/2τ) W₂²(μ_n, Hφ#μ_n) + E(Hφ#μ_n) + U(Gφ#μ_n).   (★_n)

If φ_n minimizes (★_n), then (Gφ_n#μ_n) is a minimizing sequence for (★).

The proof relies on Caffarelli's regularity theorem. When P_n is a regular grid, there is an alternative (and quantitative) argument.
Proof of the convergence theorem

JKO step: X bounded and convex, μ ∈ Pᵃᶜ(X) with density c⁻¹ ≤ ρ ≤ c. Let

m := min_{σ ∈ P(X)} F(σ),  m_n := min_{φ_n ∈ K_X(P_n)} F_n(Gφ_n#μ_n),

where F(σ) := W₂²(μ, σ) + E(σ) + U(σ) and F_n(σ) := W₂²(μ_n, σ) + E(σ) + U(σ).

Step 1 (lower bound): lim inf m_n ≥ m.
Step 2 (upper bound): lim sup m_n ≤ m = min_{σ ∈ P(X)} F(σ).

Using (C2) and a convolution argument, Step 2 reduces to:

Step 2': Given any probability density σ ∈ C⁰(X) with ε < σ < ε⁻¹, there exist φ_n ∈ K_X(P_n) s.t. lim_n ‖σ − σ_n‖_{L∞(X)} = 0, where σ_n is the density of Gφ_n#μ_n.
Proof of the convergence theorem: Upper bound

Step 2': Given any probability density σ ∈ C⁰(X) s.t. ε < σ < ε⁻¹, there exist φ_n ∈ K_X(P_n) s.t. lim_n ‖σ − σ_n‖_{L∞(X)} = 0, where σ_n := density of Gφ_n#μ_n.

(a) Given μ_n = Σ_{p∈P_n} μ_p δ_p, take φ_n ∈ K_X(P_n) an optimal transport potential between μ_n and σ: ∀p ∈ P_n, σ(V_pⁿ) = μ_p, where V_pⁿ := ∂φ̂_n(p).
(b) σ_n := density of Gφ_n#μ_n = Σ_{p∈P_n} [σ(V_pⁿ)/ℋᵈ(V_pⁿ)] 1_{V_pⁿ}.
(c) If max_{p∈P_n} diam(V_pⁿ) →_n 0 (1), then lim_n ‖σ − σ_n‖_{L∞(X)} = 0.
(d) Assume (1) fails: ∃p_n ∈ P_n s.t. diam(V_{p_n}ⁿ) ≥ ε. (2) Up to extraction, φ̂_n → φ, so that diam(∂φ(x)) ≥ ε, where x := lim_n p_n ∈ X.
(e) Moreover, ∇φ#ρ = σ. By Caffarelli's regularity theorem, φ ∈ C¹: this contradicts (2). ∎
4. Numerical results
Computing the discrete Monge-Ampère operator

U(Gφ#μ_P) = Σ_{p∈P} U(μ_p / MA[φ](p)) MA[φ](p), with MA[φ](p) := ℋᵈ(∂φ̂(p)).

Goal: fast computation of MA[φ](p) and its 1st/2nd derivatives w.r.t. φ.

We rely on the notion of Laguerre (or power) cell from computational geometry.

Definition: Given a function φ: P → ℝ,
Lag_P^φ(p) := {y ∈ ℝᵈ; ∀q ∈ P, φ(q) ≥ φ(p) + ⟨q − p | y⟩}.

Lemma: For φ ∈ K_X(P) and p ∈ P, ∂φ̂(p) = Lag_P^φ(p) ∩ X.

For φ(p) = |p|²/2, one recovers the Voronoi cell: Lag_P^φ(p) = {y; ∀q ∈ P, |q − y|² ≥ |p − y|²}. Computation in time O(|P| log |P|) in 2D.
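In 2D a single Laguerre cell can be computed naively by clipping X against the |P| − 1 half-planes in the definition above; this costs O(|P|) per cell instead of the O(|P| log |P|) of a proper power-diagram construction, but it makes the definition concrete (an illustrative sketch, not the implementation used in the talk):

```python
import numpy as np

def clip_halfplane(poly, a, b):
    """Clip a convex polygon (list of CCW vertices) against {y: <a,y> <= b}
    (Sutherland-Hodgman with a single clipping line)."""
    out, n = [], len(poly)
    for i in range(n):
        P, Q = poly[i], poly[(i + 1) % n]
        fP, fQ = np.dot(a, P) - b, np.dot(a, Q) - b
        if fP <= 0:
            out.append(P)
        if fP * fQ < 0:                       # the edge crosses the line
            out.append(P + (fP / (fP - fQ)) * (Q - P))
    return out

def laguerre_cell_area(p, phi_p, others, X_poly):
    """Area of Lag_P^phi(p) ∩ X = {y: <q-p|y> <= phi(q)-phi(p) for all q},
    i.e. the discrete Monge-Ampere value MA[phi](p) in 2D."""
    poly = [np.asarray(v, float) for v in X_poly]
    for q, phi_q in others:
        poly = clip_halfplane(poly, np.asarray(q, float) - p, phi_q - phi_p)
        if not poly:
            return 0.0
    x = np.array([v[0] for v in poly]); y = np.array([v[1] for v in poly])
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(np.roll(x, -1), y))

# phi(p) = |p|^2/2 gives Voronoi cells: two symmetric points in the square
# [-1,1]^2 split it into two halves of area 2 each.
square = [(-1, -1), (1, -1), (1, 1), (-1, 1)]
area = laguerre_cell_area(np.array([-0.5, 0.0]), 0.125,
                          [((0.5, 0.0), 0.125)], square)
print(area)  # -> 2.0
```

The derivatives of these cell areas with respect to the values of φ are what feed the Newton solver mentioned below.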
Computing the discrete Monge-Ampère operator

Global construction of the intersections (Lag_P^φ(p) ∩ X)_{p∈P} in 2D.

Assumption: ∂X = ⋃_{s∈S} s, with S a finite family of segments.

The combinatorics are stored as an (abstract) triangulation T of the finite set P ∪ S, i.e.
- (p₁, p₂, p₃) ∈ T iff Lag_P^φ(p₁) ∩ Lag_P^φ(p₂) ∩ Lag_P^φ(p₃) ∩ X ≠ ∅,
- (p₁, p₂, s₁) ∈ T iff Lag_P^φ(p₁) ∩ Lag_P^φ(p₂) ∩ s₁ ∩ X ≠ ∅,
- (p₁, s₁, s₂) ∈ T iff Lag_P^φ(p₁) ∩ s₁ ∩ s₂ ∩ X ≠ ∅.

Computation in time O(|P| log |P| + |S|) in 2D.
Computation of MA[φ](p) = ℋ²(Lag_P^φ(p) ∩ X) and its derivatives. The sparsity structure of the Jacobian/Hessian is encoded in T:
- ∂MA[φ](p)/∂φ(q) ≠ 0 ⟹ (p, q) is an edge of T,
- ∂²MA[φ](p)/∂φ(r)∂φ(q) ≠ 0 ⟹ (p, q, r) is a triangle of T.
Example 1: Nonlinear diffusion on point clouds

(◇) ∂ρ/∂t = Δρᵐ on X,  ∂ρᵐ/∂n = 0 on ∂X
- fast diffusion equation: m ∈ [1 − 1/d, 1);
- porous medium equation: m > 1.

(◇) is the gradient flow in (P(X), W₂) of U(ρ) = ∫ U(ρ(x)) dx with U(r) = rᵐ/(m − 1). [Otto]

Algorithm:
Input: μ⁰ := Σ_{x∈P⁰} δ_x ∈ P(P⁰), τ > 0.
For k ∈ {0, …, T}:
  φ ← argmin_{φ ∈ K_X(Pᵏ)} (1/2τ) W₂²(μᵏ, Hφ#μᵏ) + U(Gφ#μᵏ)   (solved by Newton's method)
  μᵏ⁺¹ ← ∇φ̂#μᵏ;  Pᵏ⁺¹ ← spt(μᵏ⁺¹).
Example 2: Crowd motion and congestion

Gradient flow model of crowd motion with congestion, with a JKO scheme [Maury-Roudneff-Chupin-Santambrogio 10]:

μᵏ⁺¹ = argmin_{ν ∈ P(X)} (1/2τ) W₂²(μᵏ, ν) + E(ν) + U(ν),

E(ν) := ∫_X V(x) dν(x),  U(ν) := 0 if dν/dℋᵈ ≤ 1, +∞ if not.

Prop: The congestion term U is convex under generalized displacements.

This is a convex optimization problem if V is λ-convex (V + λ|·|² convex) and τ ≤ 1/(2λ).

We solve this problem with a relaxed hard congestion term:

U_α(ρ) := −∫ ρ(x)^α log(1 − ρ(x)^{1/d}) dx.

[Plot: r ↦ −r^α log(1 − r^{1/d}) on [0, 1], for α = 1, 5, 10.]

Prop: (i) U_α is convex under generalized displacements; (ii) U_α Γ-converges to U as α → ∞, and βU₁ Γ-converges to U as β → 0.
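The qualitative behavior of the relaxation can be checked directly on the integrand r ↦ −r^α log(1 − r^{1/d}): it is finite for r < 1, blows up as r → 1 (so it acts as a barrier for the constraint ρ ≤ 1), and flattens on [0, 1) as α grows (a quick numerical sanity check, not part of the talk):

```python
import math

# Integrand of the relaxed congestion term in dimension d = 2:
# U_alpha(r) = -r^alpha * log(1 - r^(1/d)).
d = 2
U = lambda r, alpha: -r ** alpha * math.log(1.0 - r ** (1.0 / d))

print(U(0.999, 1.0) > U(0.5, 1.0))   # blows up near r = 1 -> True
print(U(0.5, 10.0) < U(0.5, 1.0))    # larger alpha flattens r < 1 -> True
```

This matches the Γ-convergence statement: as α grows the penalty vanishes on densities below 1 while remaining infinite at the congestion threshold.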
Example 2: Crowd motion and congestion

Initial density on X = [−2, 2]², with P = a 200 × 200 regular grid; potential V(x) = |x − (2, 0)|² + 5 exp(−5|x|²/2).

Algorithm:
Input: μ⁰ ∈ P(P), τ > 0, α > 0, β ≫ 1.
For k ∈ {0, …, T}:
  φ ← argmin_{φ ∈ K_X(P)} (1/2τ) W₂²(μᵏ, Hφ#μᵏ) + E(Hφ#μᵏ) + αU_β(Gφ#μᵏ)
  ν ← ∇φ̂#μᵏ;  μᵏ⁺¹ ← projection of ν on P.
5. Extension to Other Convexity-like Constraints
Support Functions of Convex Bodies

Objective: a logarithmic barrier for the space of support functions of convex bodies, yielding an interior-point method for shape optimization problems: Minkowski, Meissner, etc.

Definition: Given a convex body K, h_K: u ∈ S^{d−1} ↦ max_{p∈K} ⟨u | p⟩.

K^s(P) := {h_K|_P; K bounded convex body}, for a finite set of normal directions P = {n₁, …, n_N} ⊆ S^{d−1}.

Given h = (h₁, …, h_N), set K(h) := ⋂_{i=1}^N {x; ⟨x | n_i⟩ ≤ h_i}.

Prop: Φ(h) := −Σ_{i=1}^N log(ℋ^{d−1}(i-th face of K(h))) is a convex barrier for K^s(P).

Extension to (a) radial parameterization of convex bodies, (b) reflector surfaces, etc.
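For a polytope given by its vertices, the support function is a finite maximum of linear forms; the following sketch evaluates it and checks the characteristic subadditivity h_K(u + v) ≤ h_K(u) + h_K(v) on random directions (illustration only):

```python
import numpy as np

# Support function of a polytope: h_K(u) = max_{p in K} <u, p> reduces
# to a maximum over the vertices of K.
def support(K_vertices, u):
    return max(np.dot(u, p) for p in K_vertices)

K = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 1.0]])  # a triangle
print(support(K, np.array([1.0, 0.0])))             # -> 2.0

# Subadditivity (together with 1-homogeneity, it characterizes support
# functions among functions on R^d).
rng = np.random.default_rng(1)
print(all(support(K, u + v) <= support(K, u) + support(K, v) + 1e-12
          for u, v in rng.standard_normal((100, 2, 2))))  # -> True
```

The barrier Φ above penalizes vectors h whose polytope K(h) loses a face, i.e. h leaving the set K^s(P) of genuine sampled support functions.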
c-concave Functions

Cost function: c: X × Y → ℝ.

Definition: φ: X → ℝ is c-concave if ∃ψ: Y → ℝ s.t. φ = min_y c(·, y) + ψ(y).

K^c(P) := {φ|_P; φ: X → ℝ is c-concave}.

[Figalli, Kim, McCann 10]: characterization of the cost functions c for which the space K^c is convex; application to the generalized principal-agent problem.

Under the same hypotheses, there exists a convex logarithmic barrier for K^c(P):

c-Voronoi diagram: Vor_c^ψ(p) := {x ∈ X; ∀q ∈ P, c(x, p) + ψ(p) ≤ c(x, q) + ψ(q)}.

The pulled-back cell exp_p⁻¹(Vor_c^ψ(p)) ⊆ ℝᵈ is convex, where exp_p := [∇_x c(·, p)]⁻¹.

Prop: Φ(ψ) := −Σ_{p∈P} log(ℋᵈ(exp_p⁻¹(Vor_c^ψ(p)))) is a convex barrier for K^c(P).
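On finite sets X, Y the two c-transforms are elementwise min/max over a cost matrix, and c-concavity of φ is equivalent to invariance under the double transform; since one transform always lands in the c-concave class, a triple transform equals a single one. A small numerical check (illustrative; the point clouds and the quadratic cost are arbitrary choices):

```python
import numpy as np

# Discrete c-transforms on finite X (8 points) and Y (9 points):
#   psi^c(x)    = min_y c(x,y) + psi(y)      (functions on Y -> functions on X)
#   phi^cbar(y) = max_x phi(x) - c(x,y)      (functions on X -> functions on Y)
rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, (8, 2)); Y = rng.uniform(-1, 1, (9, 2))
c = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)   # c(x,y) = |x-y|^2

psi_c  = lambda psi: np.min(c + psi[None, :], axis=1)
phi_cb = lambda phi: np.max(phi[:, None] - c, axis=0)

psi = rng.standard_normal(9)
phi = psi_c(psi)                 # c-concave by construction
print(np.allclose(psi_c(phi_cb(phi)), phi))  # -> True
```

This is the discrete counterpart of the definition above: φ belongs to K^c(P) exactly when it survives the double c-transform unchanged.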