Optimal mass transport as a distance measure between images


1 Optimal mass transport as a distance measure between images. Axel Ringh, Department of Mathematics, KTH Royal Institute of Technology, Stockholm, Sweden. 21st of June 2018, INRIA, Sophia-Antipolis, France.

2 Acknowledgements. This is based on joint work with Johan Karlsson (Department of Mathematics, KTH Royal Institute of Technology, Stockholm, Sweden) [1]. I acknowledge financial support from the Swedish Research Council (VR) and the Swedish Foundation for Strategic Research (SSF). Code: the code is based on ODL. [1] J. Karlsson and A. Ringh. Generalized Sinkhorn iterations for regularizing inverse problems using optimal mass transport. SIAM Journal on Imaging Sciences, 10(4), 2017.

3 Outline: Background; Inverse problems; Optimal mass transport; Sinkhorn iterations - solving discretized optimal transport problems; Sinkhorn iterations as dual coordinate ascent; Inverse problems with optimal mass transport priors; Example in computerized tomography.

4-5 Inverse problems. Consider the problem of recovering $f \in X$ from data $g \in Y$, given by $g = A(f) + \text{noise}$. Notation: $X$ is called the reconstruction space, $Y$ is called the data space, $A : X \to Y$ is the forward operator, and $A^* : Y \to X$ denotes the adjoint operator. Problems of interest are ill-posed inverse problems: a solution might not exist, the solution might not be unique, or the solution does not depend continuously on the data. Simply put: $A^{-1}$ does not exist as a continuous bijection! It comes down to finding an approximate inverse $A^\dagger$ so that $g = A(f) + \text{noise}$ implies $A^\dagger(g) \approx f$.

6 Variational regularization. A common technique for solving ill-posed inverse problems is variational regularization:
$$\arg\min_{f \in X} \; G(A(f), g) + \lambda F(f),$$
where $G : Y \times Y \to \mathbb{R}$ is the data discrepancy functional, $F : X \to \mathbb{R}$ is the regularization functional, and $\lambda$ is the regularization parameter, controlling the trade-off between data matching and regularization. A common example in imaging is total variation regularization: $G(h, g) = \|h - g\|_2^2$, $F(f) = \|\nabla f\|_1$. If $A$ is linear this is a convex problem!
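
As a small illustration (my own sketch, not from the talk), the anisotropic total variation $\|\nabla f\|_1$ and the data discrepancy $\|h - g\|_2^2$ of a discrete image can be evaluated as follows; a full variational reconstruction would then minimize their weighted sum.

```python
import numpy as np

def total_variation(f):
    """Anisotropic TV: sum of absolute finite differences in both directions."""
    return np.abs(np.diff(f, axis=0)).sum() + np.abs(np.diff(f, axis=1)).sum()

def data_discrepancy(h, g):
    """Squared L2 data mismatch G(h, g) = ||h - g||_2^2."""
    return np.sum((h - g) ** 2)

# Toy example: a piecewise-constant image and a noisy copy of it.
rng = np.random.default_rng(0)
f = np.zeros((64, 64)); f[16:48, 16:48] = 1.0
g = f + 0.1 * rng.standard_normal(f.shape)

lam = 0.1  # regularization parameter (illustrative value)
objective = data_discrepancy(f, g) + lam * total_variation(f)
print(objective)
```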

7-9 Incorporating prior information in variational schemes. How can one incorporate prior information in such a scheme? One way: consider
$$\arg\min_{f \in X} \; G(A(f), g) + \lambda F(f) + \gamma H(f, \tilde f),$$
where $\tilde f$ is a prior/template and $H$ defines closeness to $\tilde f$. What is a good choice for $H$? Scenarios where this is potentially of interest: incomplete measurements, e.g. limited angle tomography; spatiotemporal imaging, where the data is a time-series of data sets $\{g_t\}_{t=0}^T$. For each set the underlying image has undergone a deformation, and each data set $g_t$ normally contains less information, so $A^*(g_t)$ is a poor reconstruction. Approach: solve the coupled inverse problems
$$\arg\min_{f_0, \ldots, f_T \in X} \; \sum_{j=0}^{T} \big[ G(A(f_j), g_j) + \lambda F(f_j) \big] + \sum_{j=1}^{T} \gamma H(f_{j-1}, f_j).$$

10-12 Measuring distances between functions: the $L^p$ metrics. Given two functions $f_0(x)$ and $f_1(x)$, what is a suitable way to measure the distance between the two? One suggestion: measure it pointwise, e.g., using an $L^p$ metric,
$$\|f_0 - f_1\|_p = \Big( \int_D |f_0(x) - f_1(x)|^p \, dx \Big)^{1/p}.$$
Drawbacks: it is, for example, insensitive to shifts. Example: it gives the same distance from $f_0$ to $f_1$ as from $f_0$ to $f_2$: $\|f_0 - f_1\|_1 = \|f_0 - f_2\|_1 = 8$.
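
As a small numerical illustration of this shift insensitivity (my own example, not from the talk), the sketch below compares the $L^1$ distance between a discretized bump and two shifted copies: once the supports are disjoint, a small shift and a large shift give the same distance.

```python
import numpy as np

x = np.linspace(0, 10, 1001)
dx = x[1] - x[0]

def bump(center, width=1.0, height=4.0):
    """Indicator-like bump of given height and width centered at `center`."""
    return height * (np.abs(x - center) < width / 2).astype(float)

f0 = bump(2.0)   # reference
f1 = bump(4.0)   # small shift
f2 = bump(8.0)   # large shift

# L1 distances: identical as soon as the bumps no longer overlap, so the
# metric cannot tell a small displacement from a large one.
d1 = np.sum(np.abs(f0 - f1)) * dx
d2 = np.sum(np.abs(f0 - f2)) * dx
print(d1, d2)  # identical values (about 8), regardless of the size of the shift
```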

13-18 Optimal mass transport - Monge formulation. Gaspard Monge formulated optimal mass transport: optimal transport of soil for the construction of forts and roads. Let $c(x_0, x_1) : X \times X \to \mathbb{R}_+$ describe the cost of transporting a unit mass from location $x_0$ to $x_1$. Given two functions $f_0, f_1 : X \to \mathbb{R}_+$, find the function $\phi : X \to X$ minimizing the transport cost
$$\int_X c(x, \phi(x)) f_0(x) \, dx,$$
where $\phi$ is a mass preserving map from $f_0$ to $f_1$:
$$\int_{x \in A} f_1(x) \, dx = \int_{\phi(x) \in A} f_0(x) \, dx \quad \text{for all } A \subset X.$$
This is a nonconvex problem!

19-23 Optimal mass transport - Kantorovich formulation. Leonid Kantorovich: convex formulation and duality theory; Nobel Memorial Prize in Economics 1975 for contributions to the theory of optimum allocation of resources. Again, let $c(x_0, x_1)$ denote the cost of transporting a unit mass from the point $x_0$ to the point $x_1$. Given two functions $f_0, f_1 : X \to \mathbb{R}_+$, find a transport plan $M : X \times X \to \mathbb{R}_+$, where $M(x_0, x_1)$ is the amount of mass moved between $x_0$ and $x_1$. Look for the $M$ that minimizes the transportation cost:
$$\min_{M \ge 0} \int_{X \times X} c(x_0, x_1) M(x_0, x_1) \, dx_0 dx_1$$
subject to
$$f_0(x_0) = \int_X M(x_0, x_1) \, dx_1, \quad x_0 \in X, \qquad f_1(x_1) = \int_X M(x_0, x_1) \, dx_0, \quad x_1 \in X.$$
(Figure: illustration of a transportation plan $M$.)

24-25 Measuring distances between functions: optimal mass transport. Define a distance between two functions $f_0(x)$ and $f_1(x)$ using optimal transport:
$$T(f_0, f_1) := \;\min_{M \ge 0} \int_{X \times X} c(x_0, x_1) M(x_0, x_1) \, dx_0 dx_1 \quad \text{s.t.} \quad f_0(x_0) = \int_X M(x_0, x_1) \, dx_1, \; x_0 \in X, \quad f_1(x_1) = \int_X M(x_0, x_1) \, dx_0, \; x_1 \in X.$$
If $d(x, y)$ is a metric on $X$ and $c(x, y) = d(x, y)^p$, then $T(f_0, f_1)^{1/p}$ is a metric on the space of measures. Example revisited: let $c(x, y) = \|x - y\|_2^2$; then $T(f_0, f_1) = 4 \cdot (\sqrt{2})^2 = 8$ while $T(f_0, f_2) = 4 \cdot 5^2 = 100$. This indicates that optimal transport is a more natural distance between two images than $L^p$, at least if one is a deformation of the other.
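
For intuition, here is a small sketch (my own illustration, not from the talk) using SciPy's one-dimensional 1-Wasserstein distance, i.e. optimal transport with cost $c(x,y) = |x - y|$ rather than the squared cost used on the slide: unlike the $L^1$ comparison above, the transport distance grows with the size of the shift.

```python
from scipy.stats import wasserstein_distance

# Point masses on the line at positions 2, 4 and 8 (unit total mass each).
f0, f1, f2 = [2.0], [4.0], [8.0]

# The 1-Wasserstein distance grows with the shift, unlike the L1 distance
# between the densities, which saturates once the supports are disjoint.
print(wasserstein_distance(f0, f1))  # 2.0 (small shift)
print(wasserstein_distance(f0, f2))  # 6.0 (large shift)
```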

26-29 Optimal mass transport - discrete formulation. How can the optimal transport problem be solved? Here: discretize, then optimize. This gives a linear programming problem: for vectors $f_0 \in \mathbb{R}^n$ and $f_1 \in \mathbb{R}^m$ and a cost matrix $C = [c_{ij}] \in \mathbb{R}^{n \times m}$, where $c_{ij}$ defines the transportation cost between pixels $x_i$ and $x_j$, find the transportation plan $M = [m_{ij}] \in \mathbb{R}^{n \times m}$, where $m_{ij}$ is the mass transported between pixels $x_i$ and $x_j$, by solving
$$\min_{m_{ij} \ge 0} \; \sum_{i=1}^{n} \sum_{j=1}^{m} c_{ij} m_{ij} \quad \text{subject to} \quad \sum_{j=1}^{m} m_{ij} = f_0(i), \; i = 1, \ldots, n, \qquad \sum_{i=1}^{n} m_{ij} = f_1(j), \; j = 1, \ldots, m,$$
or, in matrix form,
$$\min_{M \ge 0} \; \operatorname{trace}(C^T M) \quad \text{subject to} \quad M \mathbf{1}_m = f_0, \; M^T \mathbf{1}_n = f_1.$$
The number of variables is $n \cdot m$. To compare the distance between two images with $n = m$ pixels each, the plan $M \in \mathbb{R}^{n \times m}$ has $n \cdot m$ variables, which for typical image sizes is prohibitively large!
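
As a concrete illustration of the discrete problem (my own sketch, not from the talk), the code below solves a tiny instance with SciPy's generic LP solver; for realistic image sizes this is exactly the approach that becomes prohibitively expensive.

```python
import numpy as np
from scipy.optimize import linprog

# Tiny discrete OT instance: n = m = 3 bins on a line with squared-distance cost.
x = np.array([0.0, 1.0, 2.0])
f0 = np.array([0.5, 0.3, 0.2])
f1 = np.array([0.2, 0.3, 0.5])           # same total mass as f0
C = (x[:, None] - x[None, :]) ** 2       # c_ij = (x_i - x_j)^2

n, m = C.shape
# Equality constraints: row sums of M equal f0, column sums equal f1.
A_eq = np.zeros((n + m, n * m))
for i in range(n):
    A_eq[i, i * m:(i + 1) * m] = 1.0     # sum_j m_ij = f0(i)
for j in range(m):
    A_eq[n + j, j::m] = 1.0              # sum_i m_ij = f1(j)
b_eq = np.concatenate([f0, f1])

res = linprog(C.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
M = res.x.reshape(n, m)
print("T(f0, f1) =", res.fun)
print(M.round(3))
```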

30-34 Optimal mass transport - Sinkhorn iterations. The original problem is too computationally demanding. Solution: introduce an entropy barrier/regularization term $D(M) = \sum_{i,j} \big( m_{ij} \log(m_{ij}) - m_{ij} + 1 \big)$ [1]:
$$T_\epsilon(f_0, f_1) := \min_{M \ge 0} \; \operatorname{trace}(C^T M) + \epsilon D(M) \quad \text{subject to} \quad f_0 = M \mathbf{1}, \; f_1 = M^T \mathbf{1}.$$
Using Lagrangian relaxation gives
$$L(M, \lambda_0, \lambda_1) = \operatorname{trace}(C^T M) + \epsilon D(M) + \lambda_0^T (f_0 - M \mathbf{1}) + \lambda_1^T (f_1 - M^T \mathbf{1}).$$
Given dual variables $\lambda_0, \lambda_1$, the minimizing $m_{ij}$ satisfies
$$0 = \frac{\partial L(M, \lambda_0, \lambda_1)}{\partial m_{ij}} = c_{ij} + \epsilon \log(m_{ij}) - \lambda_0(i) - \lambda_1(j).$$
Expressed elementwise, $m_{ij} = e^{\lambda_0(i)/\epsilon} \, e^{-c_{ij}/\epsilon} \, e^{\lambda_1(j)/\epsilon}$, i.e.,
$$M = \operatorname{diag}(e^{\lambda_0/\epsilon}) \, K \, \operatorname{diag}(e^{\lambda_1/\epsilon}), \quad \text{where } K = \exp(-C/\epsilon).$$
Here and in what follows $\exp(\cdot)$, $\log(\cdot)$, and $./$ denote elementwise functions and operations. The solution $M = \operatorname{diag}(e^{\lambda_0/\epsilon}) K \operatorname{diag}(e^{\lambda_1/\epsilon})$ is thus specified by only $n + m$ unknowns!
[1] M. Cuturi. Sinkhorn distances: Lightspeed computation of optimal transport. In Advances in Neural Information Processing Systems, 2013.

35-37 Optimal mass transport - Sinkhorn iterations. How do we find the values of $\lambda_0$ and $\lambda_1$? Let $u_0 = \exp(\lambda_0/\epsilon)$ and $u_1 = \exp(\lambda_1/\epsilon)$. The optimal solution $M = \operatorname{diag}(u_0) K \operatorname{diag}(u_1)$ needs to satisfy
$$\operatorname{diag}(u_0) K \operatorname{diag}(u_1) \mathbf{1} = f_0 \quad \text{and} \quad \operatorname{diag}(u_1) K^T \operatorname{diag}(u_0) \mathbf{1} = f_1.$$
Theorem (Sinkhorn iterations [2]). For any matrix $K$ with positive elements there are diagonal matrices $\operatorname{diag}(u_0)$, $\operatorname{diag}(u_1)$ such that $M = \operatorname{diag}(u_0) K \operatorname{diag}(u_1)$ has prescribed row- and column-sums $f_0$ and $f_1$. The vectors $u_0$ and $u_1$ can be obtained by alternating marginalization:
$$u_0 = f_0 ./ (K u_1), \qquad u_1 = f_1 ./ (K^T u_0).$$
Each iteration only requires the multiplications $K u_1$ and $K^T u_0$; this is the bottleneck. The convergence rate is linear. The scheme is thus highly computationally efficient, allowing for solving large problems.
[2] R. Sinkhorn. Diagonal equivalence to matrices with prescribed row and column sums. The American Mathematical Monthly, 74(4), 1967.
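
A minimal NumPy sketch of these iterations (my own illustration; the variable names, the fixed iteration count, and the toy 1D problem are assumptions, not from the talk):

```python
import numpy as np

def sinkhorn(C, f0, f1, eps=0.05, n_iter=500):
    """Entropy-regularized OT via Sinkhorn iterations.

    C  : (n, m) cost matrix
    f0 : (n,) row marginal, f1 : (m,) column marginal (equal total mass)
    Returns the transport plan M = diag(u0) K diag(u1).
    """
    K = np.exp(-C / eps)
    u1 = np.ones(C.shape[1])
    for _ in range(n_iter):
        u0 = f0 / (K @ u1)        # enforce prescribed row sums
        u1 = f1 / (K.T @ u0)      # enforce prescribed column sums
    return u0[:, None] * K * u1[None, :]

# Tiny example on a 1D grid with squared-distance cost.
x = np.linspace(0.0, 1.0, 50)
C = (x[:, None] - x[None, :]) ** 2
f0 = np.exp(-((x - 0.3) ** 2) / 0.01); f0 /= f0.sum()
f1 = np.exp(-((x - 0.7) ** 2) / 0.01); f1 /= f1.sum()
M = sinkhorn(C, f0, f1)
print("transport cost of the entropic plan:", np.sum(C * M))
```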

38-39 Sinkhorn iterations as dual coordinate ascent. There are several ways to motivate the Sinkhorn iterations [1]: diagonal matrix scaling, Bregman projections, and Dykstra's algorithm. Here we will introduce yet another interpretation: use a dual formulation, in which the Sinkhorn iteration corresponds to dual coordinate ascent. This allows us to generalize the Sinkhorn iterations, giving an approach for addressing inverse problems with an optimal transport term.
[1] J.D. Benamou, G. Carlier, M. Cuturi, L. Nenna, and G. Peyré. Iterative Bregman projections for regularized transportation problems. SIAM Journal on Scientific Computing, 37(2), A1111-A1138, 2015.

40-43 Sinkhorn iterations as dual coordinate ascent. Lagrangian relaxation gave the optimal form of the primal variable, $M^* = \operatorname{diag}(u_0) K \operatorname{diag}(u_1)$. The Lagrangian dual function is
$$\varphi(u_0, u_1) := \min_{M \ge 0} L(M, u_0, u_1) = L(M^*, u_0, u_1) = \ldots = \epsilon \log(u_0)^T f_0 + \epsilon \log(u_1)^T f_1 - \epsilon\, u_0^T K u_1 + \epsilon n m.$$
The dual problem is thus $\max_{u_0, u_1} \varphi(u_0, u_1)$. Taking the gradient with respect to $u_0$ and setting it equal to zero gives $0 = f_0 ./ u_0 - K u_1$, and with respect to $u_1$ gives $0 = f_1 ./ u_1 - K^T u_0$. These are the Sinkhorn iterations!
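
The elided step can be filled in as follows (a sketch of the computation, using the optimality condition $c_{ij} + \epsilon \log m^*_{ij} - \lambda_0(i) - \lambda_1(j) = 0$ from above and $u_k = e^{\lambda_k/\epsilon}$):
$$
\begin{aligned}
L(M^*, \lambda_0, \lambda_1)
&= \sum_{i,j} c_{ij} m^*_{ij} + \epsilon \sum_{i,j}\big(m^*_{ij}\log m^*_{ij} - m^*_{ij} + 1\big) + \lambda_0^T(f_0 - M^*\mathbf{1}) + \lambda_1^T(f_1 - M^{*T}\mathbf{1}) \\
&= \sum_{i,j} m^*_{ij}\big(c_{ij} + \epsilon\log m^*_{ij} - \lambda_0(i) - \lambda_1(j)\big) - \epsilon\sum_{i,j} m^*_{ij} + \epsilon n m + \lambda_0^T f_0 + \lambda_1^T f_1 \\
&= \lambda_0^T f_0 + \lambda_1^T f_1 - \epsilon\, u_0^T K u_1 + \epsilon n m
 = \epsilon \log(u_0)^T f_0 + \epsilon \log(u_1)^T f_1 - \epsilon\, u_0^T K u_1 + \epsilon n m,
\end{aligned}
$$
where the second equality groups $\lambda_0^T M^*\mathbf{1} + \lambda_1^T M^{*T}\mathbf{1} = \sum_{i,j} (\lambda_0(i) + \lambda_1(j)) m^*_{ij}$ with the first two sums, the third uses the optimality condition and $\sum_{i,j} m^*_{ij} = u_0^T K u_1$, and the last uses $\lambda_k = \epsilon \log u_k$.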

44-46 Sinkhorn iterations as dual coordinate ascent. For $g$ proper, convex and lower semicontinuous, we can now consider problems of the form
$$\min_{f_1} \; T_\epsilon(f_0, f_1) + g(f_1) = \min_{M \ge 0,\, f_1} \; \operatorname{trace}(C^T M) + \epsilon D(M) + g(f_1) \quad \text{subject to} \quad f_0 = M \mathbf{1}, \; f_1 = M^T \mathbf{1}.$$
This can be solved by dual coordinate ascent,
$$u_0 = f_0 ./ (K u_1), \qquad 0 \in \partial g^*(-\epsilon \log(u_1)) ./ u_1 - K^T u_0,$$
if the second inclusion can be solved efficiently. It is solvable when $g^*(\cdot)$ is component-wise. Examples of such cases: $g(\cdot) = I_{\{\hat f\}}(\cdot)$, the indicator function on $\{\hat f\}$, which recovers the original optimal transport problem; and, as used on the following slides, $g(\cdot) = \frac{1}{2\sigma}\|\cdot - h\|_2^2$.
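
To see where the second condition comes from (a sketch of the reasoning, consistent with the derivation above): minimizing the Lagrangian over $f_1$ gives $0 \in \partial g(f_1) + \lambda_1$, i.e. $f_1 \in \partial g^*(-\lambda_1) = \partial g^*(-\epsilon \log u_1)$, while primal feasibility requires $f_1 = M^T \mathbf{1} = \operatorname{diag}(u_1) K^T u_0$. Combining the two and dividing elementwise by $u_1 > 0$ yields
$$0 \in \partial g^*(-\epsilon \log(u_1)) ./ u_1 - K^T u_0.$$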

47-50 Inverse problems with optimal mass transport priors. Consider the inverse problem
$$\min_{f_1 \ge 0} \; \|\nabla f_1\|_1 \quad \text{subject to} \quad \|A f_1 - g\|_2 \le \kappa,$$
with TV-regularization term $\|\nabla f_1\|_1$, forward model $A$, data $g$, and data mismatch term $\|A f_1 - g\|_2$. To make the reconstruction $f_1$ close to a prior $f_0$, add an optimal transport term:
$$\min_{f_1 \ge 0} \; \|\nabla f_1\|_1 + \gamma T_\epsilon(f_0, f_1) \quad \text{subject to} \quad \|A f_1 - g\|_2 \le \kappa.$$
This is a large convex optimization problem with several terms, so variable splitting is common: ADMM, the primal-dual hybrid gradient algorithm, or the primal-dual Douglas-Rachford algorithm. A common tool in these algorithms is the proximal operator of the involved functions $F$ [1]:
$$\operatorname{Prox}^\sigma_F(h) = \operatorname*{argmin}_{f} \; F(f) + \frac{1}{2\sigma} \|f - h\|_2^2.$$
[1] R.T. Rockafellar. Monotone operators and the proximal point algorithm. SIAM Journal on Control and Optimization, 14(5), 1976.

51-53 Inverse problems with optimal mass transport priors - generalized Sinkhorn iterations. We want to compute the proximal operator of $T_\epsilon(f_0, \cdot)$, given by
$$\operatorname{Prox}^\sigma_{T_\epsilon(f_0, \cdot)}(h) = \operatorname*{argmin}_{f_1} \; T_\epsilon(f_0, f_1) + \frac{1}{2\sigma} \|f_1 - h\|_2^2.$$
Thus we want to solve
$$\min_{M \ge 0,\, f_1} \; \operatorname{trace}(C^T M) + \epsilon D(M) + \frac{1}{2\sigma} \|f_1 - h\|_2^2 \quad \text{subject to} \quad f_0 = M \mathbf{1}, \; f_1 = M^T \mathbf{1}.$$
Using dual coordinate ascent with $g(\cdot) = \frac{1}{2\sigma} \|\cdot - h\|_2^2$, we get the algorithm
1. $u_0 = f_0 ./ (K u_1)$,
2. $u_1 = \exp\!\Big( \frac{h}{\sigma\epsilon} - \omega\big( \frac{h}{\sigma\epsilon} + \log(K^T u_0) - \log(\sigma\epsilon) \big) \Big)$,
where step 2 is the elementwise solution of $\operatorname{diag}(u_1) K^T u_0 + \sigma\epsilon \log(u_1) = h$. Compare with the standard Sinkhorn iterations $u_0 = f_0./(K u_1)$, $u_1 = f_1./(K^T u_0)$. Here $\omega$ denotes the (elementwise) Wright omega function, i.e., the solution of $x = \log(\omega(x)) + \omega(x)$. The second step is solved elementwise; the bottleneck is still the computation of $K u_1$ and $K^T u_0$.
Theorem. The algorithm is globally convergent, with a linear convergence rate.
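
A NumPy/SciPy sketch of this proximal step (my own illustration based on the update written above, with assumed names, parameter values, and toy data; scipy.special.wrightomega provides the elementwise Wright omega function):

```python
import numpy as np
from scipy.special import wrightomega

def sinkhorn_prox(C, f0, h, eps=0.05, sigma=1.0, n_iter=500):
    """Approximate Prox of T_eps(f0, .) at h, i.e.
    argmin_f1 T_eps(f0, f1) + ||f1 - h||^2 / (2*sigma),
    via the generalized Sinkhorn iterations sketched above."""
    K = np.exp(-C / eps)
    u1 = np.ones(C.shape[1])
    for _ in range(n_iter):
        u0 = f0 / (K @ u1)
        # Elementwise solve of u1*(K^T u0) + sigma*eps*log(u1) = h via Wright omega.
        z = h / (sigma * eps) + np.log(K.T @ u0) - np.log(sigma * eps)
        u1 = np.exp(h / (sigma * eps) - np.real(wrightomega(z)))
    return u1 * (K.T @ u0)   # f1 = M^T 1

# Tiny usage example on a 1D grid (same toy setup as the Sinkhorn sketch above).
x = np.linspace(0.0, 1.0, 50)
C = (x[:, None] - x[None, :]) ** 2
f0 = np.exp(-((x - 0.3) ** 2) / 0.01); f0 /= f0.sum()
h = np.exp(-((x - 0.7) ** 2) / 0.01); h /= h.sum()
f1 = sinkhorn_prox(C, f0, h)
```

Within a splitting method such as Douglas-Rachford or ADMM, a prox of this kind would be evaluated once per outer iteration.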

54 Example in computerized tomography. Computerized tomography (CT) is an imaging modality used in many areas, e.g., medicine. The object is probed with X-rays. Different materials attenuate X-rays differently, so the incoming and outgoing intensities give information about the object. Simplest model:
$$\int_{L_{r,\theta}} f(x) \, dx = \log\Big(\frac{I_0}{I}\Big),$$
where $f(x)$ is the attenuation in the point $x$, which is what we want to reconstruct, $L_{r,\theta}$ is the line along which the X-ray beam travels, and $I_0$ and $I$ are the incoming and outgoing intensities. (Illustration from Wikipedia.)

55 Example in computerized tomography. Parallel beam 2D CT example: reconstruction space of pixels; 30 angles in $[\pi/4, 3\pi/4]$ (limited angle); detector partition: uniform, 350 bins; noise level 5%. (Figures: (a) Shepp-Logan phantom, (b) prior.)
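
Since the talk's code builds on ODL, here is a hedged sketch of how such a limited-angle parallel-beam setup could be assembled with ODL; the grid size, domain bounds, and detector extent are my own assumptions, as the transcript does not specify the actual discretization.

```python
import numpy as np
import odl

# Reconstruction space: a 256 x 256 pixel grid (size assumed for illustration).
space = odl.uniform_discr([-20, -20], [20, 20], [256, 256])

# Limited-angle parallel-beam geometry: 30 angles in [pi/4, 3*pi/4], 350 detector bins.
angle_partition = odl.uniform_partition(np.pi / 4, 3 * np.pi / 4, 30)
detector_partition = odl.uniform_partition(-30, 30, 350)
geometry = odl.tomo.Parallel2dGeometry(angle_partition, detector_partition)

# Forward operator A and noisy data g (roughly 5% additive white noise).
A = odl.tomo.RayTransform(space, geometry)   # requires a backend such as astra or skimage
f_true = odl.phantom.shepp_logan(space, modified=True)
g = A(f_true)
noise_scale = 0.05 * np.abs(g.asarray()).mean()
g_noisy = g + noise_scale * odl.phantom.white_noise(A.range)
```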

56-57 Example in computerized tomography. TV-regularization and $\ell_2^2$ prior:
$$\min_{f_1} \; \gamma \|f_0 - f_1\|_2^2 + \|\nabla f_1\|_1 \quad \text{subject to} \quad \|A f_1 - w\|_2 \le \kappa.$$
TV-regularization and optimal transport prior:
$$\min_{f_1} \; \gamma T_\epsilon(f_0, f_1) + \|\nabla f_1\|_1 \quad \text{subject to} \quad \|A f_1 - w\|_2 \le \kappa.$$
(Figures: Shepp-Logan phantom; prior; TV-regularization; TV-regularization and $\ell_2^2$ prior ($\gamma = 10$); TV-regularization and optimal transport prior ($\gamma = 4$).)

58 Example in computerized tomography. Comparing different regularization parameters for the problem with the $\ell_2^2$ prior:
$$\min_{f_1} \; \gamma \|f_0 - f_1\|_2^2 + \|\nabla f_1\|_1 \quad \text{subject to} \quad \|A f_1 - w\|_2 \le \kappa.$$
(Figure: reconstructions using the $\ell_2^2$ prior with different regularization parameters, panels for $\gamma = 1, 10, 100, 1000, \ldots$)

59-60 Example in computerized tomography. Parallel beam 2D CT example: reconstruction space of pixels; 30 angles in $[0, \pi]$; detector partition: uniform, 350 bins; noise level 3%. (Figures: phantom; prior; TV-regularization; TV-regularization and $\ell_2^2$ prior ($\gamma = 10$); TV-regularization and optimal transport prior ($\gamma = 4$).)

61-62 Conclusions and further work. Conclusions: optimal mass transport is a viable framework for imaging applications; a generalized Sinkhorn iteration computes the proximal operator of the optimal transport cost; variable splitting is used for solving the inverse problem; application to CT reconstruction using optimal transport priors. Potential future directions: application to spatiotemporal image reconstruction,
$$\arg\min_{f_0, \ldots, f_T \in X} \; \sum_{j=0}^{T} \big[ G(A(f_j), g_j) + \lambda F(f_j) \big] + \sum_{j=1}^{T} \gamma H(f_{j-1}, f_j);$$
more efficient ways of solving the dual problem; and learning for inverse problems using optimal transport as a loss function.

63 Thank you for your attention! Questions?
