A Variational Model for Data Fitting on Manifolds by Minimizing the Acceleration of a Bézier Curve

A Variational Model for Data Fitting on Manifolds by Minimizing the Acceleration of a Bézier Curve Ronny Bergmann a Technische Universität Chemnitz Kolloquium, Department Mathematik, FAU Erlangen-Nürnberg, Erlangen, December 11, 2018 a joint work with P.-Y. Gousenbourger, UCLouvain, Louvain-la-Neuve, Belgium.

Moin. 1. Riemannian Manifolds & Optimization 2. Data Fitting on Riemannian Manifolds 3. Bézier Curves and Generalized Bézier Curves 4. Discretized Acceleration of a Bézier Curve 5. Gradient Descent on a Manifold 6. Numerical Examples 1

An m-dimensional Riemannian Manifold M ξ x T x M log x x exp x ν x log x g( ; x, y) PT x y (ν) y M A d-dimensional Riemannian manifold can be informally defined as a set M covered with a suitable collection of charts, that identify subsets of M with open subsets of R d and a continously varying inner product on the tangential spaces. [Absil, Mahony, Sepulchre, 2008] 2

An m-dimensional Riemannian Manifold M ξ x T x M log x x exp x ν x log x g( ; x, y) PT x y (ν) y M Geodesic g( ; x, y) shortest path (on M) between x, y M Tangent space T x M at x, with inner product, x Logarithmic map log x y = ġ(0; x, y) speed towards y Exponential map exp x ξ x = g(1), where g(0) = x, ġ(0) = ξ x Parallel transport PT x y (ν) of ν T x M along g( ; x, y) 2

Optimization on Manifolds Let N be a Riemannian manifold and E : N R. We aim to solve argmin E(x) x N often: product manifold N = M n for n N 2 : manifold-valued image processing highdimensional problem locally: convexity defined via geodesics 3

Variational Methods on Manifolds Variational methods model a trade-off between staying close to the data and minimizing a certain property E(x) = D(x; f) + αr(x) α > 0 is a weight f N is given Data data or similarity term D(x; f) regularizer / prior R(x) 4

Differential and Gradient The differential D x f = Df : T M R of a real-valued function f : M R is the push-forward of f. Intuition: Given x M and ξ T x M, then Df(x)[ξ] is the directional derivative of f. The gradient f : M T M is the tangent vector fulfilling M f(x), η x = Df(x)[η] for all η T x M gradient descent (with e.g. Armijo s rule) 5

Data Fitting on Manifolds Given data points d 0,..., d n on a Riemannian manifold M and time points t i I, find a nice curve γ : I M, γ Γ, such that γ(t i ) = d i (interpolation) or γ(t i ) d i (approximation). Γ set of geodesics & approximation: geodesic regression [Rentmeesters, 11; Fletcher, 13; Boumal 13] Γ Sobolev space of curves: Inifinite-dimensional problem [Samir et. al., 12] Γ composite Bézier curves; LSs in tangent spaces [Arnould et. al. 15; Gousenbourger, Massart, Absil, 18] Discretized curve, Γ = M N, [Boumal, Absil, 11] This talk nice means minimal (discretized) acceleration ( as straight as possible ) for Γ the set of composite Bézier curves. In Euclidean space: Natural cubic splines as closed form solution. 6

(Euclidean) Bézier Curves Definition [Bézier, 62] A Bézier curve β K of degree K N 0 is a function β K : [0, 1] R d parametrized by control points b 0,..., b K R n and defined by β K (t; b 0,..., b K ) := K b j B j,k (t), j=0 [Bernstein, 1912] where B j,k = ( K j ) t j (1 t) K j are the Bernstein polynomials of degree K. Evaluation via Casteljau s algorithm. [de Casteljau, 59] 7

Illustration of de Casteljau s Algorithm b 1 b 2 b 0 The set of control points b 0, b 1, b 2, b 3. b 3 8

Illustration of de Casteljau s Algorithm b 1 x [1] 1 b 2 x [1] 0 x [1] 2 b 0 Evaluate line segments at t = 1 4, obtain x[1] 0, x[1] 1, x[1] 2. 8 b 3

Illustration of de Casteljau s Algorithm b 1 x [1] 1 x [2] 1 b 2 x [1] 0 x [2] 0 x [1] 2 b 0 Repeat evaluation for new line segments to obtain x [2] 0, x[2] 1. 8 b 3

Illustration of de Casteljau s Algorithm b 1 x [1] 1 x [2] 1 b 2 x [1] 0 x [2] 0 x [3] 0 x [1] 2 b 0 Repeat for the last segment to obtain β 3 ( 1 4 ; b 0, b 1, b 2, b 3 ) = x [3] 0. 8 b 3

Illustration of de Casteljau s Algorithm x [1] 0 b 1 x [2] 0 x [1] 1 x [3] 0 x [2] 1 b 2 x [1] 2 b 0 Same procedure for evaluation of β 3 ( 1 2 ; b 0, b 1, b 2, b 3 ). b 3 8

Illustration of de Casteljau s Algorithm b 1 x [1] 0 x [2] 0 x [1] 1 b 2 x [3] 0 x [2] 1 b 0 x [1] 2 Same procedure for evaluation of β 3 ( 3 4 ; b 0, b 1, b 2, b 3 ). b 3 8

Illustration of de Casteljau s Algorithm b 1 b 2 b 0 Complete curve β 3 (t; b 0, b 1, b 2, b 3 ). b 3 8

Composite Bézier Curves Definition A composite Bezier curve B : [0, n] R d is defined as β K (t; b 0 0 B(t) :=,..., b0 K ) if t [0, 1], β K (t i; b i 0,..., bi K ), if t (i, i + 1], i = 1,..., n 1. 9

Composite Bézier Curves Definition A composite Bezier curve B : [0, n] R d is defined as β K (t; b 0 0 B(t) :=,..., b0 K ) if t [0, 1], β K (t i; b i 0,..., bi K ), if t (i, i + 1], i = 1,..., n 1. Denote ith segment by B i (t) = β K (t; b i 0,..., bi K ) and p i = b i 0. b 0 2 b 2 2 p 3 b 3 1 p 1 b 1 1 b 0 1 b 0 0 = p 0 p 2 b 2 1 p 4 b 1 2 b 4 2 B 0 B 1 B 2 B 3 B 4 b 3 2 b 4 1 p 5 = b 4 3 9

Composite Bézier Curves Definition A composite Bezier curve B : [0, n] R d is defined as β K (t; b 0 0 B(t) :=,..., b0 K ) if t [0, 1], β K (t i; b i 0,..., bi K ), if t (i, i + 1], i = 1,..., n 1. Denote ith segment by B i (t) = β K (t; b i 0,..., bi K ) and p i = b i 0. continuous iff B i 1 (1) = B i (0), i = 1,..., n 1 b i 1 K = bi 0 = p i, i = 1,..., n 1 continuously differentiable iff p i = 1 2 (bi 1 K 1 + bi 1 ) 9

Bézier Curves on a Manifold Definition. Let M be a Riemannian manifold and b 0,..., b K M, K N. [Park, Ravani, 1995; Popiel, Noakes, 2007] The (generalized) Bézier curve of degree k, k K, is defined as β k (t; b 0,..., b k ) = g(t; β k 1 (t; b 0,..., b k 1 ), β k 1 (t; b 1,..., b k )), if k > 0, and β 0 (t; b 0 ) = b 0. Bézier curves β 1 (t; b 0, b 1 ) = g(t; b 0, b 1 ) are geodesics. composite Bézier curves B : [0, n] M completely analogue (using geodesics for line segments) 10

Bézier Curves on a Manifold Definition. Let M be a Riemannian manifold and b 0,..., b K M, K N. [Park, Ravani, 1995; Popiel, Noakes, 2007] The (generalized) Bézier curve of degree k, k K, is defined as β k (t; b 0,..., b k ) = g(t; β k 1 (t; b 0,..., b k 1 ), β k 1 (t; b 1,..., b k )), if k > 0, and β 0 (t; b 0 ) = b 0. The Riemannian composite Bezier curve B(t) is continuous iff B i 1 (1) = B i (0), i = 1,..., n 1 b i 1 K = bi 0 =: p i, i = 1,..., n 1 continuously differentiable iff p i = g ( 1 2 ; bi 1 K 1, 1) bi or b i 1 K 1 = g( 2; b i 1, p ) i 10

Illustration of a Composite Bézier Curve on the Sphere S 2 The directions, e.g. log pj b 1 j, are now tangent vectors. 11

A Variational Model for Data Fitting Let d 0,..., d n M. A model for data fitting reads E(B) = λ 2 n d 2 M(B(k), d k ) + k=0 n 0 D2 B(t) 2 dt 2 B(t), dt λ > 0, where B Γ is from the set of continuously differentiable composite Bezier curve of degree K with n segments. Goal: find minimizer B Γ finite dimensional optimization problem in the control points b i j, i.e. on ML with L = n(k 1) + 2 λ yields interpolation (p k = d k ) L = n(k 2) + 1 On M = R m : closed form solution, natural (cubic) splines 12

Interlude: Second Order Differences on Manifolds Second order difference: [RB et al., 2014; RB, Weinmann, 2016; Bačák et al., 2016] d 2 (x, y, z) := min c C x,z d M (c, y), x, y, z M, 1 C x,z mid point(s) of geodesic(s) g( ; x, z) 2 x 2y + z 2 = 1 2 (x + z) y 2 y min d M (c, y) c C x,z y c x c(x, z) z x c(x, z) z M = S 2 13

Discretizing the Data Fitting Model We discretize the absolute second order covariant derivative n 0 D2 B(t) 2 dt 2 N 1 dt γ(t) k=1 s d 2 2 [B(s i 1), B(s i ), B(s i+1 )] 4. s on equidistant points s 0,..., s N with step size s = s 1 s 0. Evaluating E(B) consists of evaluation of geodesics and squared (Riemannian) distances (N + 1)K geodesics to evaluate the Bézier segments N geodesics to evaluate the mid points N squared distances to obtain the second order absolute finite differences squared 14

Gradient and Chain Rule on a Manifold The gradient M f(x) T x M of f : M R, x M, is defined as the tangent vector that fulfills M f(x), ξ x = Df(x)[ξ] for all ξ T x M. For a composition F (x) = (g h)(x) = g(h(x)) of two functions g, h: M M the chain rule reads for x M and ξ T x M as D x F [ξ] = D h(x) g [ D x h[ξ] ], where D x h[ξ] T h(x) M and D x F [ξ] T F (x) M. 15

The Differential of a Geodesic w.r.t. its Start Point The geodesic variation Γ g,ξ (s, t) := exp γx,ξ (s)(tζ(s)), s ( ε, ε), t [0, 1], ε > 0. is used to define the Jacobi field J g,ξ (t) = s Γ g,ξ(s, t) s=0. y g(t; x, y) J g,ξ (t) g( ; x, y) J g,ξ ζ(0) ζ(ŝ) x ξ = J g,ξ (0) Γ g,ξ (ŝ, 0) Γ g,ξ (s, t) Γ g,ξ (s, 0) = γ x,ξ (s) Then the differential reads D x g(t,, y)[ξ] = J g,ξ (t). 16

Implementing Jacobi Fields on Symmetric Spaces A manifold is symmetric if for every geodesic g and avery x M the mapping g(t) g( t) is an isometry at least locally around x = g(0). Then one can diagonalize the curvature tensor R, let κ l denote its eigenvalues. let {ξ 1,..., ξ m } T x M be an ONB of eigenvalues with ξ 1 = log x y. parallel transport Ξ j (t) = PT x g(t;x,y) ξ j, j = 1,..., m 17

Implementing Jacobi Fields on Symmetric Spaces II Decompose η = m η l ξl. Then i=1 D x g(t; x, y)[η] = J g,η (t) = m η l J g,ξl (t), l=1 with sinh (d ) g(1 t) κ l sinh(d g κl Ξ l (t) if κ l < 0, J g,ξl (t) = sin (d ) g(1 t) κ l sin( κ l d g) Ξ l (t) if κ l > 0, (1 t)ξ l (t) if κ l = 0. 18

Implementing the Gradient using adjoint Jacobi Fields. The adjoint Jacobi fields Jg, (t): T g(t) M T x M are characterized by J g,ξ (t), η g(t) = ξ, Jg,η(t) x, for all ξ T x M, η T g(t;x,y) M. can be computed without extra efforts, i.e. the same ODEs occur. can be used to calculate the gradient the gradient of iterated evaluations of geodesics can be computed by composition of (adjoint) Jacobi fields 19

Gradient Descent on a Manifold Let N = M L be the product manifold of M, Input. f : N R, its gradient N f, initial data x (0) = b N step sizes s k > 0, k N. Output: ˆx N k 0 repeat x (k+1) exp x (k) ( sk N f(x (k) ) ) k k + 1 until a stopping criterion is reached return ˆx := x (k) 20

Armijo Step Size Rule Let x = x (k) be an iterate from the gradient descent algorithm, β, σ (0, 1), α > 0. Let m be the smallest positive integer such that f(x) f ( exp x ( β m α N f(x)) ) σβ m α N f(x) x Set the step size s k := β m α. 21

Minimizing with Known Minimizer Original Minimized 9 10 2 Absolute first order differences log B(ti) B(t i+1 ) B(ti) 3 10 2 0 1 2 1 3 2 2 22

Interpolation by Bézier Curves with Minimal Acceleration. A comp. Bezier curve (black) and its mnimizer (blue). 23

Approximation by Bézier Curves with Minimal Acceleration. In the following video λ is slowly decreased from 10 to 0. The initial setting, λ = 10. 24

Approximation by Bézier Curves with Minimal Acceleration. In the following video λ is slowly decreased from 10 to 0. Summary of the video. 24

Comparison to Previous Approach [Gousenbourger, Massart, Absil, 2018] This curve (dashed) is too global to be solved in a tangent space (dotted) correctly, while our method (blue) still works. 25

An Example of Rotations M = SO(3) Initialization with approach from composite splines [Gousenbourger, Massart, Absil, 2018] Our method outperforms the approach of solving linear systems in tangent spaces, but their approach can serve as an initialization. 26

Summary Data fitting on manifolds with Bézier curves minimizing their acceleration computed the gradient with respect to control points employed Jacobi fields and their adjoints. implemented within the MVIR toolbox (available soon) ronnybergmann.net/mvirt/ a Julia implementation in preparation (Manopt.jl) 27

Literature M. Bačák, R. Bergmann, G. Steidl, and A. Weinmann. A Second Order Non-Smooth Variational Model for Restoring Manifold-Valued Images. In: SIAM Journal on Scientific Computing 38.1 (2016), A567 A597. doi: 10.1137/15M101988X. R. Bergmann and P.-Y. Gousenbourger. A variational model for data fitting on manifolds by minimizing the acceleration of a Bézier curve. In: Frontiers in Applied Mathematics and Statistics (2018). doi: 10.3389/fams.2018.00059. arxiv: 1807.10090. P.-A. Absil, R. Mahony, and R. Sepulchre. Optimization Algorithms on Matrix Manifolds. Princeton University Press, 2008. doi: 10.1515/9781400830244. P.-Y. Gousenbourger, E. Massart, and P.-A. Absil. Data fitting on manifolds with composite Bézier-like curves and blended cubic splines. In: Journal of Mathematical Imaging and Vision (2018). accepted. doi: 10.1007/s10851-018-0865-2. ronnybergmann.net/talks/2018-dresden-institutsseminar-accbezier.pdf 28