Proximal point algorithm in Hadamard spaces

Similar documents
A NOTION OF NONPOSITIVE CURVATURE FOR GENERAL METRIC SPACES

The Metric Geometry of the Multivariable Matrix Geometric Mean

Results from MathSciNet: Mathematical Reviews on the Web c Copyright American Mathematical Society 2000

Available online at J. Nonlinear Sci. Appl., 10 (2017), Research Article

A splitting minimization method on geodesic spaces

Projection methods in geodesic metric spaces

PROXIMAL POINT ALGORITHMS INVOLVING FIXED POINT OF NONSPREADING-TYPE MULTIVALUED MAPPINGS IN HILBERT SPACES

Viscosity approximation methods for the implicit midpoint rule of nonexpansive mappings in CAT(0) Spaces

Generalized monotonicity and convexity for locally Lipschitz functions on Hadamard manifolds

Basic Properties of Metric and Normed Spaces

MONOTONE OPERATORS ON BUSEMANN SPACES

Spaces with Ricci curvature bounded from below

Spaces with Ricci curvature bounded from below

A new proof of Gromov s theorem on groups of polynomial growth

Metric Spaces and Topology

Iterative Convex Optimization Algorithms; Part One: Using the Baillon Haddad Theorem

PATH FUNCTIONALS OVER WASSERSTEIN SPACES. Giuseppe Buttazzo. Dipartimento di Matematica Università di Pisa.

A Second Order Non-smooth Variational Model for Restoring Manifold-valued Images

Negative sectional curvature and the product complex structure. Harish Sheshadri. Department of Mathematics Indian Institute of Science Bangalore

Lecture 1: Background on Convex Analysis

Scientific report for the period December November 2014

ICM 2014: The Structure and Meaning. of Ricci Curvature. Aaron Naber ICM 2014: Aaron Naber

g 2 (x) (1/3)M 1 = (1/3)(2/3)M.

arxiv: v3 [q-bio.pe] 23 Apr 2018

RANDOM FIELDS AND GEOMETRY. Robert Adler and Jonathan Taylor

Coordinate Update Algorithm Short Course Proximal Operators and Algorithms

Computers and Mathematics with Applications

ALEXANDER LYTCHAK & VIKTOR SCHROEDER

Zair Ibragimov CSUF. Talk at Fullerton College July 14, Geometry of p-adic numbers. Zair Ibragimov CSUF. Valuations on Rational Numbers

Functional Analysis. Franck Sueur Metric spaces Definitions Completeness Compactness Separability...

Section 6. Laplacian, volume and Hessian comparison theorems

Fixed Point Theory in Reflexive Metric Spaces

Stochastic calculus and martingales on trees

arxiv:math/ v1 [math.dg] 24 May 2005

Krasnoselskii-Mann Method for Multi-valued Non-Self Mappings in CAT(0) Spaces

MAT 578 FUNCTIONAL ANALYSIS EXERCISES

Tree Space: Algorithms & Applications Part I. Megan Owen University of Waterloo

Variational inequalities for set-valued vector fields on Riemannian manifolds

On the convergence of SP-iterative scheme for three multivalued nonexpansive mappings in CAT (κ) spaces

Multivariable Calculus

Proximal Point Method for a Class of Bregman Distances on Riemannian Manifolds

Strictly convex functions on complete Finsler manifolds

arxiv: v4 [math.dg] 18 Jun 2015

THE INVERSE FUNCTION THEOREM

Loos Symmetric Cones. Jimmie Lawson Louisiana State University. July, 2018

Paraconvex functions and paraconvex sets

Variational methods for restoration of phase or orientation data

Goebel and Kirk fixed point theorem for multivalued asymptotically nonexpansive mappings

MATH 51H Section 4. October 16, Recall what it means for a function between metric spaces to be continuous:

Continuity. Matt Rosenzweig

SOME ELEMENTARY GENERAL PRINCIPLES OF CONVEX ANALYSIS. A. Granas M. Lassonde. 1. Introduction

Topological properties

First-order Methods for Geodesically Convex Optimization

8.8. Codimension one isoperimetric inequalities Distortion of a subgroup in a group 283

Metric Thickenings of Euclidean Submanifolds

Problem Set 6: Solutions Math 201A: Fall a n x n,

Reflexive Metric Spaces and The Fixed Point Property

MATH 4200 HW: PROBLEM SET FOUR: METRIC SPACES

Hopf-Rinow and Hadamard Theorems

SOME REMARKS ON THE SPACE OF DIFFERENCES OF SUBLINEAR FUNCTIONS

CONVERGENCE OF APPROXIMATING FIXED POINTS FOR MULTIVALUED NONSELF-MAPPINGS IN BANACH SPACES. Jong Soo Jung. 1. Introduction

Strongly convex functions, Moreau envelopes and the generic nature of convex functions with strong minimizers

Logarithmic Sobolev Inequalities

Convex Optimization Notes

Fixed point sets of parabolic isometries of CAT(0)-spaces

THE FUNDAMENTAL GROUP OF NON-NEGATIVELY CURVED MANIFOLDS David Wraith The aim of this article is to oer a brief survey of an interesting, yet accessib

On a result of Pazy concerning the asymptotic behaviour of nonexpansive mappings

Laplace Operator and Heat Kernel for Shape Analysis

DIFFERENTIAL GEOMETRY, LECTURE 16-17, JULY 14-17

NON POSITIVELY CURVED METRIC IN THE SPACE OF POSITIVE DEFINITE INFINITE MATRICES

WARPED PRODUCT METRICS ON R 2. = f(x) 1. The Levi-Civita Connection We did this calculation in class. To quickly recap, by metric compatibility,

MATH 54 - TOPOLOGY SUMMER 2015 FINAL EXAMINATION. Problem 1

Generalized Ricci Bounds and Convergence of Metric Measure Spaces

PLURISUBHARMONIC FUNCTIONS AND THE STRUCTURE OF COMPLETE KÄHLER MANIFOLDS WITH NONNEGATIVE CURVATURE

Ricci Curvature and Bochner Formula on Alexandrov Spaces

McGill University Department of Mathematics and Statistics. Ph.D. preliminary examination, PART A. PURE AND APPLIED MATHEMATICS Paper BETA

Beyond Scalar Affinities for Network Analysis or Vector Diffusion Maps and the Connection Laplacian

Convergence of three-step iterations for Ciric-quasi contractive operator in CAT(0) spaces

Extending Lipschitz and Hölder maps between metric spaces

Smooth Dynamics 2. Problem Set Nr. 1. Instructor: Submitted by: Prof. Wilkinson Clark Butler. University of Chicago Winter 2013

OPTIMALITY CONDITIONS FOR GLOBAL MINIMA OF NONCONVEX FUNCTIONS ON RIEMANNIAN MANIFOLDS

Differential Geometry II Lecture 1: Introduction and Motivation

Pythagorean Property and Best Proximity Pair Theorems

Groups up to quasi-isometry

ASYMPTOTIC INVARIANTS OF HADAMARD MANIFOLDS. Mohamad A. Hindawi. A Dissertation. Mathematics

Geometric Analysis, Karlovassi, Samos

Applied Analysis (APPM 5440): Final exam 1:30pm 4:00pm, Dec. 14, Closed books.

The problems I left behind

Convex Analysis and Economic Theory Winter 2018

Convergence Theorems of Approximate Proximal Point Algorithm for Zeroes of Maximal Monotone Operators in Hilbert Spaces 1

Functional Analysis I

MA3051: Mathematical Analysis II

TEST CODE: PMB SYLLABUS

PDEs in Image Processing, Tutorials

Chapter 2 Convex Analysis

The Hopf argument. Yves Coudene. IRMAR, Université Rennes 1, campus beaulieu, bat Rennes cedex, France

The Structure of Singular Spaces of Dimension 2

Chapter 3. Riemannian Manifolds - I. The subject of this thesis is to extend the combinatorial curve reconstruction approach to curves

Short note on compact operators - Monday 24 th March, Sylvester Eriksson-Bique

Boundaries in non positive curvature

Transcription:

Proximal point algorithm in Hadamard spaces Miroslav Bacak Télécom ParisTech Optimisation Géométrique sur les Variétés - Paris, 21 novembre 2014

Contents of the talk 1 Basic facts on Hadamard spaces 2 Proximal point algorithm 3 Applications to computational phylogenetics

Proximal point algorithm in Hadamard spaces Why? Well... it is used in: Phylogenetics: computing medians and means of phylogenetic trees. diffusion tensor imaging: the space P (n, R) of symmetric positive definite matrices n n with real entries is a Hadamard space if it is equipped with the Riemannian metric X, Y A := Tr ( A 1 XA 1 Y ), X, Y T A (P (n, R)), for every A P (n, R). Computational biology: shape analyses of tree-like structures:

Tree-like structures in organisms Figure: Bronchial tubes in lungs Figure: Transport system in plants Figure: Human circulatory system

Definition of Hadamard space Let (H, d) be a complete metric space where: 1 any two points x 0 and x 1 are connected by a geodesic x: [0, 1] H: t x t, 2 and, d (y, x t ) 2 (1 t)d (y, x 0 ) 2 + td (y, x 1 ) 2 t(1 t)d (x 0, x 1 ) 2, for every y H. Then (H, d) is called a Hadamard space. For today: assume that local compactness.

Geodesic space x 2 x 1

Geodesic space x 2 x 1

Geodesic space (1 λ)x 1 + λx 2 x 2 x 1

Definition of nonpositive curvature A geodesic triangle in a geodesic space: x 3 x 2 x 1

Terminology remark CAT(κ) spaces, for κ R, were introduced in 1987 by Michail Gromov C = Cartan A = Alexandrov T = Toponogov We are particularly interested in CAT(0) spaces.

Examples of Hadamard spaces 1 Hilbert spaces, the Hilbert ball 2 complete simply connected Riemannian manifolds with Sec 0 3 R-trees: a metric space T is an R-tree if for x, y T there is a unique geodesic [x, y] if [x, y] [y, z] = {y}, then [x, z] = [x, y] [y, z] 4 Euclidean buildings 5 the BHV tree space (space of phylogenetic trees) 6 L 2 (M, H), where (M, µ) is a probability space: ( d 2 (u, v) := M ) 1 d (u(x), v(x)) 2 2 dµ(x), u, v L 2 (M, H)

Convexity in Hadamard spaces Let (H, d) be a Hadamard space. These spaces allow for a natural definition of convexity: Definition A set C H is convex if, given x, y C, we have [x, y] C. Definition A function f : H (, ] is convex if f γ is a convex function for each geodesic γ : [0, 1] H.

Convexity in Hadamard spaces Let (H, d) be a Hadamard space. These spaces allow for a natural definition of convexity: Definition A set C H is convex if, given x, y C, we have [x, y] C. Definition A function f : H (, ] is convex if f γ is a convex function for each geodesic γ : [0, 1] H.

Convexity in Hadamard spaces Let (H, d) be a Hadamard space. These spaces allow for a natural definition of convexity: Definition A set C H is convex if, given x, y C, we have [x, y] C. Definition A function f : H (, ] is convex if f γ is a convex function for each geodesic γ : [0, 1] H.

Examples of convex functions 1 The indicator function of a convex closed set C H : ι C (x) := 0, if x C, and ι C (x) :=, if x / C. 2 The distance function to a closed convex subset C H : d C (x) := inf d(x, c), x H. c C 3 The displacement function of an isometry T : H H : δ T (x) := d(x, T x), x H.

Examples of convex functions 1 The indicator function of a convex closed set C H : ι C (x) := 0, if x C, and ι C (x) :=, if x / C. 2 The distance function to a closed convex subset C H : d C (x) := inf d(x, c), x H. c C 3 The displacement function of an isometry T : H H : δ T (x) := d(x, T x), x H.

Examples of convex functions 1 The indicator function of a convex closed set C H : ι C (x) := 0, if x C, and ι C (x) :=, if x / C. 2 The distance function to a closed convex subset C H : d C (x) := inf d(x, c), x H. c C 3 The displacement function of an isometry T : H H : δ T (x) := d(x, T x), x H.

Examples of convex functions 4 Let c: [0, ) H be a geodesic ray. The function b c : H R defined by b c (x) := lim t [d (x, c(t)) t], x H, is called the Busemann function associated to the ray c. 5 The energy of a mapping u: M H given by E(u) := d (u(x), u(y)) 2 p(x, dy)dµ(x), M M where (M, µ) is a measure space with a Markov kernel p(x, dy). E is convex continuous on L 2 (M, H).

Examples of convex functions 4 Let c: [0, ) H be a geodesic ray. The function b c : H R defined by b c (x) := lim t [d (x, c(t)) t], x H, is called the Busemann function associated to the ray c. 5 The energy of a mapping u: M H given by E(u) := d (u(x), u(y)) 2 p(x, dy)dµ(x), M M where (M, µ) is a measure space with a Markov kernel p(x, dy). E is convex continuous on L 2 (M, H).

Examples of convex functions 6 Given a 1,..., a N H and w 1,..., w N > 0, set f(x) := N w n d (x, a n ) p, x H, n=1 where p [1, ). If p = 1, we get Fermat-Weber problem for optimal facility location. A minimizer of f is called a median. If p = 2, then a minimizer of f is the barycenter of µ := or the mean of a 1,..., a N. N w n δ an, n=1

Examples of convex functions 6 Given a 1,..., a N H and w 1,..., w N > 0, set f(x) := N w n d (x, a n ) p, x H, n=1 where p [1, ). If p = 1, we get Fermat-Weber problem for optimal facility location. A minimizer of f is called a median. If p = 2, then a minimizer of f is the barycenter of µ := or the mean of a 1,..., a N. N w n δ an, n=1

Strongly convex functions A function f : H (, ] is strongly convex with parameter β > 0 if f ((1 t)x + ty) (1 t)f(x) + tf(y) βt(1 t)d(x, y) 2, for any x, y H and t [0, 1]. Each strongly has a unique minimizer. Example Given y H, the function f := d(y, ) 2 is strongly convex. Indeed, d (y, x t ) 2 (1 t)d (y, x 0 ) 2 + td (y, x 1 ) 2 t(1 t)d (x 0, x 1 ) 2, for each geodesic x: [0, 1] H.

1 Basic facts on Hadamard spaces 2 Proximal point algorithm 3 Applications to computational phylogenetics

Proximal point algorithm Let f : H (, ] be convex lsc. Optimization problem: min f(x). x H Recall: no (sub)differential, no shooting (singularities). Implicit methods are appropriate. The PPA generates a sequence [ x i := J λi (x i 1 ) := arg min f(y) + 1 ] d(y, x i 1 ) 2, y H 2λ i where x 0 H is a given starting point and λ i > 0, for each i N.

Proximal point algorithm Let f : H (, ] be convex lsc. Optimization problem: min f(x). x H Recall: no (sub)differential, no shooting (singularities). Implicit methods are appropriate. The PPA generates a sequence [ x i := J λi (x i 1 ) := arg min f(y) + 1 ] d(y, x i 1 ) 2, y H 2λ i where x 0 H is a given starting point and λ i > 0, for each i N.

Convergence of proximal point algorithm Theorem (M.B., 2011) Let f : H (, ] be a convex lsc function attaining its minimum. Given x 0 H and (λ i ) such that 1 λ i =, the PPA sequence (x i ) converges to a minimizer of f. (Resolvents are firmly nonexpansive - cheap version for λ i = λ.) Disadvantage: The resolvents x i := J λi (x i 1 ) := arg min y H are often difficult to compute. [ f(y) + 1 ] d(y, x i 1 ) 2, 2λ i

Convergence of proximal point algorithm Theorem (M.B., 2011) Let f : H (, ] be a convex lsc function attaining its minimum. Given x 0 H and (λ i ) such that 1 λ i =, the PPA sequence (x i ) converges to a minimizer of f. (Resolvents are firmly nonexpansive - cheap version for λ i = λ.) Disadvantage: The resolvents x i := J λi (x i 1 ) := arg min y H are often difficult to compute. [ f(y) + 1 ] d(y, x i 1 ) 2, 2λ i

Splitting proximal point algorithm Let f 1,..., f N be convex lsc and consider N f(x) := f n (x), x H. n=1 Example (Median and mean) f n := d (, a n ), f n := d (, a n ) 2. Key idea: apply resolvents J n λ s of f n s in a cyclic or random order.

Splitting proximal point algorithm Let f 1,..., f N be convex lsc and consider N f(x) := f n (x), x H. n=1 Example (Median and mean) f n := d (, a n ), f n := d (, a n ) 2. Key idea: apply resolvents J n λ s of f n s in a cyclic or random order.

Splitting proximal point algorithm Let f 1,..., f N be convex lsc and consider N f(x) := f n (x), x H. n=1 Example (Median and mean) f n := d (, a n ), f n := d (, a n ) 2. Key idea: apply resolvents J n λ s of f n s in a cyclic or random order.

Splitting proximal point algorithm Let x 0 H be a starting point. For each k N 0 we apply resolvents in cyclic order: x kn+1 := J 1 λ k (x kn ), x kn+2 := J 2 λ k (x kn+1 ),. x kn+n := J N λ k (x kn+n 1 ), or in random order: x i+1 := J r i λ i (x i ), where (r i ) are random variables with values in {1,..., N}.

Convergence of splitting proximal point algorithm Theorem (Cyclic order version + Random order version) Assume that f n are Lipschitz (or locally Lipschitz and the minimizing sequence is bounded). Then 1 the cyclic PPA sequence converges to a minimizer of f 2 the random PPA sequence converges to a minimizer of f almost surely. Assumptions are satisfied for f(x) := where p [1, ). N w n d (x, a n ) p, x H, n=1

Convergence of splitting proximal point algorithm Theorem (Cyclic order version + Random order version) Assume that f n are Lipschitz (or locally Lipschitz and the minimizing sequence is bounded). Then 1 the cyclic PPA sequence converges to a minimizer of f 2 the random PPA sequence converges to a minimizer of f almost surely. Assumptions are satisfied for f(x) := where p [1, ). N w n d (x, a n ) p, x H, n=1

Splitting proximal point algorithm (for mean) Hence instead of computing (the usual PPA) x i+1 := arg min z H [ N ] d (z, a n ) 2 + 1 d (z, x i ) 2, 2λ i n=1 we are to minimize the function x i+1 := arg min z H [ d (z, a n ) 2 + 1 2λ i d (z, x i ) 2 where a n are chosen in a cyclic or random order. This is one-dimensional problem! = x i+1 is a convex combination of a n and x i. ],

Splitting proximal point algorithm (for mean) Hence instead of computing (the usual PPA) x i+1 := arg min z H [ N ] d (z, a n ) 2 + 1 d (z, x i ) 2, 2λ i n=1 we are to minimize the function x i+1 := arg min z H [ d (z, a n ) 2 + 1 2λ i d (z, x i ) 2 where a n are chosen in a cyclic or random order. This is one-dimensional problem! = x i+1 is a convex combination of a n and x i. ],

1 Basic facts on Hadamard spaces 2 Proximal point algorithm 3 Applications to computational phylogenetics

Left: One of many evolutionary trees Right: A picture of an evolutionary tree by Charles Darwin (1837)

Billera-Holmes-Vogtmann tree space T d 0 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 1 2 3 4 Figure: 5 out of 15 orthants of T 4

Billera-Holmes-Vogtmann tree space T d Figure: A finite set of trees in T 4

Billera-Holmes-Vogtmann tree space T d Figure: Consider the most frequent tree topology only

Computing the mean: Random order version Algorithm (SPPA with f n := d(, T n ) 2 ) Input: T 1,..., T N T d Step 1: S 1 := T 1 and i := 1 Step 2: choose r {1,..., N} at random Step 3: S i+1 := 1 i+1 T r + i i+1 S i Step 4: i := i + 1 Step 5: go to Step 2 Geodesics can be computed in polynomial time: The Owen-Provan algorithm (2011)

Computing the mean: Random order version Algorithm (SPPA with f n := d(, T n ) 2 ) Input: T 1,..., T N T d Step 1: S 1 := T 1 and i := 1 Step 2: choose r {1,..., N} at random Step 3: S i+1 := 1 i+1 T r + i i+1 S i Step 4: i := i + 1 Step 5: go to Step 2 Geodesics can be computed in polynomial time: The Owen-Provan algorithm (2011)

Computing the mean: Random order version Algorithm (SPPA with f n := d(, T n ) 2 ) Input: T 1,..., T N T d Step 1: S 1 := T 1 and i := 1 Step 2: choose r {1,..., N} at random Step 3: S i+1 := 1 i+1 T r + i i+1 S i Step 4: i := i + 1 Step 5: go to Step 2 Geodesics can be computed in polynomial time: The Owen-Provan algorithm (2011)

Computing the mean: Random order version T 5 T 4 T 6 T 3 T 7 T 1 T 2

Computing the mean: Random order version T 5 T 4 T 6 T 7 S 1 T 3 T 1 T 2

Computing the mean: Random order version T 5 T 4 T 6 T 7 S 1 S 2 T 3 T 1 T 2

Computing the mean: Random order version T 5 T 4 T 6 T 7 S 1 S 3 S 2 T 3 T 1 T 2

Computing the mean: Random order version T 5 T 4 T 6 T 7 S 1 S 4 S 3 S 2 T 3 T 1 T 2

Computing the mean: Random order version T 5 T 4 T 6 T 3 T 7 T 1 T 2

Computing the mean: Random order version T 5 T 4 T 6 T 3 T 7 T 1 T 2

Computing the mean: Random order version T 5 T 4 T 6 T 3 T 7 T 1 T 2

Computing the mean: Random order version T 5 T 4 T 6 T 3 T 7 T 1 T 2

Computing the mean: Random order version T 5 T 4 T 6 T 3 T 7 T 1 T 2

Le Big Data Space T d : orthant dimension = d 2, # of orthants = (2d 3)!! The actual dimension of T d is d + 1 + (d 2)(2d 3)!! # of trees = N (e.g. coming from an MCMC simulation) Our computation: d = 12 hence dim 10 11 and N = 10 5

Le Big Data Space T d : orthant dimension = d 2, # of orthants = (2d 3)!! The actual dimension of T d is d + 1 + (d 2)(2d 3)!! # of trees = N (e.g. coming from an MCMC simulation) Our computation: d = 12 hence dim 10 11 and N = 10 5

Resuls (courtesy of Philipp Benner) Pika Rabbit Orangutan Baboon Marmoset Gorilla Human Chimpanzee Kangaroo rat Squirrel Guinea pig Rat Mouse Pika Rabbit Squirrel Orangutan Gorilla Human Chimpanzee Baboon Guinea pig Marmoset Kangaroo rat Rat Mouse Pika Rabbit Kangaroo rat Rat Mouse Squirrel Guinea pig Orangutan Baboon Marmoset Gorilla Human Chimpanzee Pika Rabbit Squirrel Kangaroo rat Guinea pig Rat Mouse Orangutan Gorilla Human Chimpanzee Baboon Marmoset

Resuls (courtesy of Philipp Benner)... continued. Pika Rabbit Baboon Marmoset Orangutan Gorilla Chimpanzee Rat Human Mouse Kangaroo rat Guinea pig Squirrel Figure: Approximation of the mean of the 100,000 trees.

References 1 M.B.: Computing medians and means in Hadamard spaces, SIAM J. Optim. 24 (2014), no. 3, 1542 1566. 2 Benner, Bacak, Bourguignon: Point estimates in phylogenetic reconstructions, Bioinformatics, Vol 30 (2014), Issue 17. 3 M.B.: Convex analysis and optimization in Hadamard spaces, De Gruyter, Berlin, 2014.