Combining multiresolution analysis and non-smooth optimization for texture segmentation


Combining multiresolution analysis and non-smooth optimization for texture segmentation
Nelly Pustelnik, CNRS, Laboratoire de Physique de l'ENS de Lyon

Motivation: geometric textures are periodic; stochastic textures are scale-free.

[Figure: three example signals, each shown in time and as log power vs. log frequency: a sinusoidal signal (periodic), a sinusoidal signal plus noise (periodic), and a monofractal signal (scale-free).]

Texture segmentation

[Figure: mask (Ω₁, Ω₂), synthetic image, real texture.]

Segmentation: estimate the boundary between Ω₁ and Ω₂.
- Contribution 1: Discrete Mumford-Shah.
- Contribution 2: Chan-Vese model.

Texture = local dependence = local regularity.
- Contribution 3: Joint estimation and segmentation.

SIROCCO project (start), Projet Jeunes Chercheur.e.s GdR ISIS 2013-2015, Défi Imag'In CNRS 2017.

Joint work with: B. Pascal, M. Foare, P. Abry, V. Vidal, J.-C. Géminard (LPENSL), L. Condat (GIPSA-Lab), H. Wendt, N. Dobigeon (IRIT).

Difficulties: large-size data (> 2 million pixels), accurate transitions, avoiding irregular contours.

Summary
1. Basics: wavelets and proximal tools
2. Segmentation by means of proximal tools
3. Two-step texture segmentation relying on a scale-free descriptor
4. Joint texture segmentation

Wavelet transform and sparsity + prox

Wavelets: sparse representations of most natural signals. Dyadic wavelet transform, denoted F ∈ R^{|Ω|×|Ω|}: filterbank implementation, orthonormal transform: FF^⊤ = F^⊤F = I. For g ∈ R^Ω, ζ = Fg.

Soft-thresholding, applied componentwise:

soft_λ(ζ) = ( max{|ζ_i| − λ, 0} sign(ζ_i) )_{i∈Ω} = prox_{λ‖·‖₁}(ζ),

i.e., soft_λ(ζ) = arg min_ν ½‖ν − ζ‖₂² + λ‖ν‖₁.

Wavelet denoising:

û = arg min_u ½‖u − g‖₂² + λ‖Fu‖₁ = F^⊤ soft_λ(Fg) = prox_{λ‖F·‖₁}(g).

[Figure: pipeline g → ζ = Fg → soft_λ(Fg) → û = F^⊤ soft_λ(Fg); plot of the soft-thresholding nonlinearity vs. the identity, with dead zone (−λ, λ).]
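As a minimal numerical sketch of the slide above (not the speaker's code): soft-thresholding of wavelet coefficients in Python, assuming NumPy and PyWavelets are available; the wavelet ('db4'), the noise level, and λ are arbitrary choices, and pywt's multilevel transform stands in for the orthonormal F.

```python
import numpy as np
import pywt  # PyWavelets, assumed available

def soft(z, lam):
    """Soft-thresholding: prox of lam * ||.||_1, applied entrywise."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

# Noisy piecewise-regular 1-D signal g
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 1024)
g = np.sign(np.sin(6 * np.pi * t)) + 0.3 * rng.standard_normal(t.size)

# zeta = F g: dyadic (here orthogonal) wavelet transform
coeffs = pywt.wavedec(g, "db4", mode="periodization")

# u_hat = F^T soft_lambda(F g): threshold detail coefficients, then invert
lam = 0.2
u_hat = pywt.waverec([coeffs[0]] + [soft(c, lam) for c in coeffs[1:]],
                     "db4", mode="periodization")
```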

Non-smooth optimization

û ∈ Argmin_{u∈R^Ω} Σ_{s=1}^S f_s(F_s u), with F_s: linear operators, f_s: proper, convex, l.s.c. functions.

Since 2004, numerous proximal algorithms [Bauschke-Combettes, 2017]:
- Forward-Backward: S = 2, f₁ with Lipschitz gradient, and F₂ = Id
- Douglas-Rachford: S = 2 and F₁ = F₂ = Id
- PPXA: F₁ = ... = F_S = Id
- ADMM: requires inverting Σ_{s=1}^S F_s^⊤F_s
- Primal-dual: ...

Flexibility in the design of objective functions.

Non-smooth optimization

û ∈ Argmin_{u∈R^Ω} Σ_{s=1}^S f_s(F_s u), with F_s: linear operators, f_s: proper, convex, l.s.c. functions.

Handling large-size problems:
- Closed-form expressions of the proximity operators: prox_{f_s}(u) = arg min_ν ½‖ν − u‖₂² + f_s(ν).
- Avoid splitting: compute prox_{Σ_s f_s} directly.
- Exploit properties of f_s (strong convexity) and of F_s.
- Block-coordinate approaches.
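To make the closed-form-prox point concrete, here is a hedged forward-backward (ISTA) sketch for min_u ½‖Au − g‖₂² + λ‖u‖₁, i.e., S = 2 with f₁ smooth and f₂ = λ‖·‖₁; the matrix A, λ, and the iteration count are placeholder choices.

```python
import numpy as np

def soft(z, lam):
    # Closed-form prox of lam * ||.||_1: no inner splitting needed
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def forward_backward(A, g, lam, n_iter=200):
    """ISTA for min_u 0.5 * ||A u - g||_2^2 + lam * ||u||_1."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of grad f1
    u = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ u - g)               # forward (gradient) step on f1
        u = soft(u - step * grad, step * lam)  # backward (prox) step on f2
    return u

rng = np.random.default_rng(1)
A = rng.standard_normal((64, 128))
u_true = np.zeros(128)
u_true[[5, 40, 90]] = [1.0, -2.0, 1.5]
g = A @ u_true + 0.01 * rng.standard_normal(64)
u_hat = forward_backward(A, g, lam=0.05)
```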

Summary
1. Basics: wavelets and proximal tools
2. Segmentation by means of proximal tools
3. Two-step texture segmentation relying on a scale-free descriptor
4. Joint texture segmentation

Mumford-Shah [Mumford-Shah, 1989]

minimize_{u,K} ½∫_Ω (u − g)² dxdy + β∫_{Ω∖K} |∇u|² dxdy + λH¹(K ∩ Ω)
(fidelity) (smoothness) (length)

- Ω: image domain,
- g ∈ L^∞(Ω): input (possibly noisy),
- u ∈ W^{1,2}(Ω): piecewise smooth approximation of g, where W^{1,2}(Ω) = {u ∈ L²(Ω) : ∇u ∈ L²(Ω)} and ∇ is the weak derivative operator,
- K: set of discontinuities,
- H¹: Hausdorff measure.

[Figure: g → (û, K̂).]

Total variation model

minimize_{u,K} ½∫_Ω (u − g)² dxdy + β∫_{Ω∖K} |∇u|² dxdy + λH¹(K ∩ Ω)

Discrete piecewise constant relaxation:

minimize_u ½‖u − g‖₂² + λ TV(u)

+ Convex.
+ Fast implementation due to strong convexity.

TV denotes some form of the 2-D discrete total variation, i.e.,

(∀u ∈ R^Ω) TV(u) = Σ_{i₁=1}^{N₁} Σ_{i₂=1}^{N₂} √( |u_{i₁+1,i₂} − u_{i₁,i₂}|² + |u_{i₁,i₂+1} − u_{i₁,i₂}|² ) = ‖Du‖_{2,1}.
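As a quick illustration (an off-the-shelf substitute, not the talk's implementation), the discrete model above can be solved with scikit-image's Chambolle-type TV denoiser; the test image and the weight (which plays the role of λ up to the library's scaling) are arbitrary.

```python
import numpy as np
from skimage import data
from skimage.restoration import denoise_tv_chambolle

# Noisy image g
rng = np.random.default_rng(2)
g = data.camera().astype(float) / 255.0
g = g + 0.1 * rng.standard_normal(g.shape)

# Piecewise constant estimate: approx. argmin_u 0.5*||u - g||^2 + lam*TV(u)
u_tv = denoise_tv_chambolle(g, weight=0.1)
```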

Total variation model

[Figure: g; û_TV with λ = 100; û_TV with λ = 500.]

Proposed Discrete Mumford-Shah [Foare-Pustelnik-Condat, 2018]

minimize_{u,e} ½‖u − g‖₂² + β‖(1 − e) ⊙ Du‖₂² + λR(e)

- Ω = {1,...,N₁} × {1,...,N₂},
- g ∈ R^Ω: input (possibly noisy),
- u ∈ R^Ω: piecewise smooth approximation of g,
- D ∈ R^{|E|×|Ω|}: models a finite difference operator,
- e ∈ R^{|E|}: edge variable between nodes, whose value is 1 when a contour change is detected and 0 otherwise,
- R: non-smooth, to favor sparse solutions (i.e., a short K).

Solved with a hybrid linearized proximal alternating minimization (an alternative to PALM [Bolte et al., 2014]).

Segmentation methods: summary [Pustelnik-Condat, 2017]

Total Variation:  + fast; + piecewise constant;  − inaccurate contours.
Discrete MS:      + extracts contours; + identifies smooth variations; + piecewise smooth → piecewise constant;  − time consuming; − parameters to tune.
Chan-Vese:        + good segmentation results;  − time consuming; − parameters to tune: number of labels, mean values μ_q.

Summary
1. Basics: wavelets and proximal tools
2. Segmentation by means of proximal tools
3. Two-step texture segmentation relying on a scale-free descriptor
4. Joint texture segmentation

Local regularity (1D)

[Figure: sample 1-D signal.]

f is α-regular at y ⟺ |f(x) − f(y)| ≤ χ|x − y|^α. Examples: α = 1/10, α = 1/2.

Local regularity (1D)

Definition: (∀y) h(y) = sup{α such that f is α-regular at y}.

Compute h(y) at every point?

Pointwise regularity and wavelet transform modulus [extracted from Mallat, 1998]

log₂ |Wf(u, s)| ≤ log₂ A + (α + 1/2) log₂ s.

To extract α at each location: compute the slope. But the continuous wavelet transform is not adapted to large-size images.

Local regularity and wavelet leaders

Discrete wavelet coefficients:
- Coefficients at scale j ∈ {1,...,J} and subband m ∈ {1,2,3}: ζ_{j,m} = H_{j,m} g.
- Orthonormal transform: F = [H_{1,1}^⊤, ..., H_{J,3}^⊤, L_{J,4}^⊤]^⊤ and ζ = Fg, where g ∈ R^N, H_{j,m} ∈ R^{(N/4^j)×N}, and L_{J,4} ∈ R^{(N/4^J)×N}.

Local regularity and wavelet leaders

Wavelet leader at scale j and location k: local supremum of all wavelet coefficients taken within a spatial neighborhood, across all finer scales j′ ≤ j:

L_{j,k} = sup_{m∈{1,2,3}, λ_{j′,k′} ⊂ Λ_{j,k}} |ζ_{j′,m,k′}|, where λ_{j,k} = [k2^j, (k+1)2^j) and Λ_{j,k} = ∪_{p∈{−1,0,1}²} λ_{j,k+p}.
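A rough NumPy sketch of the leaders computation, under simplifying assumptions: the three detail subbands per scale come from PyWavelets' orthogonal 2-D transform (an assumption, not the slides' filterbank), suprema are propagated up from finer scales with 2×2 block maxima, and the 3×3 spatial neighborhood Λ_{j,k} is handled with a maximum filter.

```python
import numpy as np
import pywt
from scipy.ndimage import maximum_filter

def wavelet_leaders(g, wavelet="db3", J=4):
    """leaders[j-1][k] = sup of |coefficients| over the 3x3 neighborhood
    of k at scale j, including all finer scales j' <= j."""
    coeffs = pywt.wavedec2(g, wavelet, mode="periodization", level=J)
    details = coeffs[1:][::-1]          # reorder: finest scale (j = 1) first
    leaders, S = [], None
    for (cH, cV, cD) in details:
        s = np.maximum(np.abs(cH), np.maximum(np.abs(cV), np.abs(cD)))
        if S is not None:
            # carry suprema up from the finer scale: 2x2 block maximum
            h, w = S.shape
            S = S[: h - h % 2, : w - w % 2]
            S = S.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))
            s = np.maximum(s, S[: s.shape[0], : s.shape[1]])
        S = s
        leaders.append(maximum_filter(s, size=3, mode="wrap"))
    return leaders
```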

Multiresolution + nonlinearity → local regularity

Behavior through the scales [Jaffard, 2004]: L_{j,k} ≃ s_n 2^{j h_n} when 2^j → 0 (where k = 2^j n).

Linear regression across scales [Wendt et al., 2009]: ĥ_n = Σ_j w_{j,k} log₂ L_{j,k}, unbiased when Σ_j w_{j,k} = 0 and Σ_j j·w_{j,k} = 1.

[Figure: mask (Ω₁, Ω₂), original g, estimate ĥ.]
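A sketch of the regression step; the closed-form weights below are the ordinary-least-squares choice satisfying the two unbiasedness constraints (an assumption, not necessarily the exact weighting of [Wendt et al., 2009]), and replication is used to align the per-scale grids of the previous sketch (finest scale first).

```python
import numpy as np

def regression_weights(js):
    """Weights w_j with sum_j w_j = 0 and sum_j j * w_j = 1 (OLS slope)."""
    js = np.asarray(js, float)
    S0, S1, S2 = len(js), js.sum(), (js ** 2).sum()
    return (S0 * js - S1) / (S0 * S2 - S1 ** 2)

def local_regularity(leaders, js):
    """h_hat = sum_j w_j log2 L_j, all scales upsampled to the finest grid."""
    w = regression_weights(js)
    h = np.zeros(leaders[0].shape)
    for wj, Lj in zip(w, leaders):
        r0 = h.shape[0] // Lj.shape[0]
        r1 = h.shape[1] // Lj.shape[1]
        h += wj * np.log2(np.kron(Lj, np.ones((r0, r1))))  # replication
    return h
```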

Multiresolution + nonlinearity + nonsmooth

Total variation: piecewise constant estimate

ĥ_TV = arg min_u ½‖u − Σ_j w_j log₂ L_j‖₂² + λ‖Du‖₁

[Diagram: linear transform (wavelet) → nonlinear transform (log₂ of leaders) → linear transform (regression across scales) → ĥ, followed by nonlinear ℓ₁ minimization.]

Multiresolution + nonlinearity + nonsmooth

[Figure: mask (Ω₁, Ω₂), original g, estimate ĥ, estimate ĥ_TV.]

Summary
1. Basics: wavelets and proximal tools
2. Segmentation by means of proximal tools
3. Two-step texture segmentation relying on a scale-free descriptor
4. Joint texture segmentation

Multiresolution + nonlinearity + nonsmooth

Total variation: joint estimation and segmentation [Pustelnik et al., 2016]

(ĥ_TVW, ŵ) = arg min_{u,w} ½‖u − Σ_j w_j log₂ L_j‖₂² + λ‖Du‖₁ + d_C(w)

Relaxed unbiasedness constraint: C = {w ∈ R^{J×|Ω|} : (∀k) Σ_j w_{j,k} = 0 and Σ_j j·w_{j,k} = 1},

d_C(w) = ‖w − P_C(w)‖₂, with P_C(w) = arg min_{ν∈C} ½‖ν − w‖₂².
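C acts independently on each pixel k as the intersection of two hyperplanes, so P_C has a closed form: the affine projection w − A^⊤(AA^⊤)^{−1}(Aw − b) with A = [[1,...,1],[1,2,...,J]] and b = (0,1). A hedged NumPy sketch of this derivation (not the talk's code):

```python
import numpy as np

def project_C(w):
    """P_C for C = {w : sum_j w_j = 0, sum_j j * w_j = 1}, pixelwise.
    w has shape (J, n_pixels)."""
    J = w.shape[0]
    A = np.vstack([np.ones(J), np.arange(1, J + 1, dtype=float)])  # (2, J)
    b = np.array([0.0, 1.0])
    M = np.linalg.inv(A @ A.T)                                     # (2, 2)
    return w - A.T @ (M @ (A @ w - b[:, None]))

def dist_C(w):
    """d_C(w) = ||w - P_C(w)||_2."""
    return np.linalg.norm(w - project_C(w))
```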

Multiresolution + nonlinearity + nonsmooth

[Figure: mask (Ω₁, Ω₂), original g, estimates ĥ, ĥ_TV, ĥ_TVW.]

Multiresolution + nonlinearity + nonsmooth

[Figures: original g; estimates ĥ, ĥ_TV, ĥ_TVW; comparison with [Yuan et al., 2015] and [Arbelaez et al., 2011].]

Multiresolution + nonlinearity + nonsmooth

(ĥ_TVW, ŵ) = arg min_{(u,w)} ½‖u − Σ_j w_j log₂ L_j‖₂² + λ‖Du‖₁ + d_C(w)

+ Good texture segmentation performance.
+ Convex minimization formulation.
+ Combined estimation and segmentation (contrary to ĥ_TV).
− Computational cost: not adapted to large-scale data.

Multiresolution + nonlinearity → local regularity

Behavior through the scales [Jaffard, 2004]: L_{j,k} ≃ s_n 2^{j h_n} as 2^j → 0 (where k = 2^j n), hence

log₂ L_{j,k} ≈ log₂ s_n + j h_n as 2^j → 0.

PLOVER: Piecewise constant LOcal VariancE and Regularity estimation [Pascal et al., 2018]

Find (v̂, ĥ) ∈ Argmin_{v,h} Σ_j ‖log₂ L_j − v − jh‖₂² + η‖Dh‖₁ + ζ‖Dv‖₁, with ŝ = 2^v̂.

+ Strongly convex → computationally efficient.
+ Combines estimation and segmentation.
+ Joint estimation of the local variance and local regularity.
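Before the TV terms enter, the data-fit part of PLOVER decouples pixelwise into a 2×2 least-squares problem in (v_n, h_n). The sketch below solves these normal equations as an initialization; it is an illustration of the model, not the PLOVER solver itself.

```python
import numpy as np

def plover_init(log_leaders, js):
    """Pixelwise least squares for log2 L_j ~ v + j * h.
    log_leaders: shape (J, n1, n2), log2-leaders on a common grid."""
    js = np.asarray(js, float)
    S0, S1, S2 = len(js), js.sum(), (js ** 2).sum()
    y0 = log_leaders.sum(axis=0)                        # sum_j y_j
    y1 = (js[:, None, None] * log_leaders).sum(axis=0)  # sum_j j * y_j
    det = S0 * S2 - S1 ** 2
    v = (S2 * y0 - S1 * y1) / det
    h = (S0 * y1 - S1 * y0) / det
    return v, h                                         # s_hat = 2 ** v
```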

Multiresolution + nonlinearity + nonsmooth + fast

[Figure: (a) synthetic texture x, (b) s mask, (c) h mask; estimated local variance and local regularity maps for each method.]

SNR of the estimates:

                      Linear regr.  Disjoint TV  PLOVER   Disjoint re-est.  PLOVER re-est.
Local variance           2.7496        9.9722    10.2854      8.0758           8.0241
Local regularity        -5.3411       -4.2591    -4.1325      0.14181          0.24025

Multiresolution + nonlinearity + nonsmooth + fast

[Figure: image g ∈ R^N with zoom; PLOVER estimates ŝ and ĥ; comparison with [Arbelaez, 2011], [Yuan, 2015], and disjoint TV.]

Conclusions
- HL-PAM for a fast discrete Mumford-Shah: several applications, ranging from image restoration to graph analysis.
- Proximity operator of a sum of two functions: application to segmentation and depth-map estimation.
- Scale-free descriptors in a variational framework: a large-scale texture segmentation procedure.

Perspectives
- TV denoising / Chan-Vese / D-MS: a procedure offering experts accurate estimation and segmentation. D-MS allows going from piecewise smooth to piecewise constant; both are of interest in applications.
- HL-PAM and strong convexity?
- Quantify the dead zone w.r.t. scale.
- Regularization parameter selection.
- Integrate anisotropy.

References
- N. Pustelnik, H. Wendt, P. Abry, N. Dobigeon, Combining local regularity estimation and total variation optimization for scale-free texture segmentation, IEEE Trans. on Computational Imaging, vol. 2, no. 4, pp. 468-479, Dec. 2016.
- N. Pustelnik, L. Condat, Proximity operator of a sum of functions; application to depth map estimation, IEEE Signal Processing Letters, 2017.
- M. Foare, N. Pustelnik, L. Condat, A new proximal method for joint image restoration and edge detection with the Mumford-Shah model, accepted, ICASSP 2018.
- B. Pascal, N. Pustelnik, P. Abry, M. Serres, V. Vidal, Joint estimation of local variance and local regularity for texture segmentation. Application to multiphase flow characterization, submitted to IEEE ICIP 2018.
- J. Frecon, N. Pustelnik, N. Dobigeon, H. Wendt, and P. Abry, Bayesian selection for the regularization parameter in TV-ℓ0 denoising problems, IEEE Trans. on Signal Processing, 2017.

Proposed Discrete Mumford-Shah

minimize_{u,e} ½‖u − g‖₂² + β‖(1 − e) ⊙ Du‖₂² + λR(e)

[Figure: g, û, ê.]

Proposed Discrete Mumford-Shah

minimize_{u,e} ½‖u − g‖₂² + β‖(1 − e) ⊙ Du‖₂² + λR(e)

R: favors binary (i.e., {0,1}^|E|) and sparse solutions (i.e., a short K):
1. Ambrosio-Tortorelli approximation [Ambrosio-Tortorelli, 1990], [Foare-Lachaud-Talbot, 2016]: R(e) = ε‖De‖₂² + (1/(4ε))‖e‖₂², with ε > 0.
2. ℓ₁-norm: R(e) = ‖e‖₁.
3. Quadratic-ℓ₁ [Foare-Pustelnik-Condat, 2017]: R(e) = Σ_{i=1}^{|E|} max{|e_i|, e_i²/(4ε)}.

Proposed Discrete Mumford-Shah

minimize_{u,e} Ψ(u, e) := ½‖u − g‖₂² + S(e, Du) + λR(e), with S(e, Du) = β‖(1 − e) ⊙ Du‖₂².

PALM [Bolte et al., 2014]
Set u^[0] ∈ R^Ω and e^[0] ∈ R^|E|. For l ∈ N:
  Set γ > 1 and c_l = γ χ(e^[l]).
  u^[l+1] = prox_{(1/c_l)·½‖·−g‖₂²}( u^[l] − (1/c_l) ∇_u S(e^[l], Du^[l]) )
  Set δ > 1 and d_l = δ ν(u^[l+1]).
  e^[l+1] = prox_{(1/d_l)·λR}( e^[l] − (1/d_l) ∇_e S(e^[l], Du^[l+1]) )

Under technical assumptions, the sequence (u^[l], e^[l])_{l∈N} converges to a critical point (u*, e*) of Ψ.

Proposed Discrete Mumford-Shah

minimize_{u,e} Ψ(u, e) := ½‖u − g‖₂² + S(e, Du) + λR(e), with S(e, Du) = β‖(1 − e) ⊙ Du‖₂².

Proposed HL-PAM [Foare-Pustelnik-Condat, 2017]
Set u^[0] ∈ R^Ω and e^[0] ∈ R^|E|. For l ∈ N:
  Set γ > 1 and c_l = γ χ(e^[l]).
  u^[l+1] = prox_{(1/c_l)·½‖·−g‖₂²}( u^[l] − (1/c_l) ∇_u S(e^[l], Du^[l]) )
  Set d_l > 0.
  e^[l+1] = prox_{(1/d_l)(λR + S(·, Du^[l+1]))}( e^[l] )

Under technical assumptions, the sequence (u^[l], e^[l])_{l∈N} converges to a critical point (u*, e*) of Ψ.
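A compact NumPy sketch of HL-PAM for the choice R = ‖·‖₁ (so each σ_i = |·|), with forward differences for D, the bound ‖D‖² ≤ 8, and the separable e-update given by the proposition below; the step rules (fixed γ and d_l), boundary handling, and iteration count are simplifying assumptions, not the paper's exact settings.

```python
import numpy as np

def grad(u):
    """D: vertical/horizontal forward differences (Neumann boundary)."""
    gy = np.zeros_like(u); gy[:-1, :] = u[1:, :] - u[:-1, :]
    gx = np.zeros_like(u); gx[:, :-1] = u[:, 1:] - u[:, :-1]
    return np.stack([gy, gx])

def grad_adj(p):
    """D^T: adjoint of grad (a negative divergence)."""
    out = np.zeros_like(p[0])
    out[1:, :] += p[0][:-1, :]; out[:-1, :] -= p[0][:-1, :]
    out[:, 1:] += p[1][:, :-1]; out[:, :-1] -= p[1][:, :-1]
    return out

def hl_pam_l1(g, beta=20.0, lam=0.01, gamma=1.01, d=1.0, n_iter=300):
    """Sketch: min_{u,e} 0.5||u-g||^2 + beta||(1-e) o Du||^2 + lam||e||_1."""
    u, e = g.copy(), np.zeros((2,) + g.shape)
    for _ in range(n_iter):
        # u-step: gradient step on S(e, D.), then prox of (1/c)*0.5||.-g||^2
        c = gamma * 2.0 * beta * 8.0 * max(np.max((1 - e) ** 2), 1e-12)
        v = u - (2.0 * beta / c) * grad_adj((1 - e) ** 2 * grad(u))
        u = (c * v + g) / (c + 1.0)
        # e-step: exact separable prox of (1/d)(lam||.||_1 + S(., Du)) at e
        q = 2.0 * beta * grad(u) ** 2
        t = (q + d * e) / (q + d)       # quadratic part of S absorbed
        e = np.sign(t) * np.maximum(np.abs(t) - lam / (q + d), 0.0)
    return u, e
```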

Proposed Discrete Mumford-Shah

Assumptions
1. The updates of u^[l+1] and e^[l+1] have closed-form expressions;
2. ∇_u S(e^[l], ·) is globally Lipschitz with modulus χ(e^[l]) for every l ∈ N, and there exist χ₋, χ₊ > 0 such that χ₋ ≤ χ(e^[l]) ≤ χ₊;
3. (d_l)_{l∈N} is a positive sequence such that the stepsizes d_l belong to (d₋, d₊) for some positive d₋ ≤ d₊.

Proposed Discrete Mumford-Shah

Proposition [Foare-Pustelnik-Condat, 2017]
Assume that R is separable, i.e., (∀e = (e_i)_{1≤i≤|E|}) R(e) = Σ_i σ_i(e_i), where each σ_i : R → ]−∞, +∞] has a closed-form proximity operator. Let d_l > 0. Then

prox_{(1/d_l)(λR + S(·, Du^[l+1]))}(e^[l]) = ( prox_{λσ_i / (2β(Du^[l+1])_i² + d_l)}( (2β(Du^[l+1])_i² + d_l e_i^[l]) / (2β(Du^[l+1])_i² + d_l) ) )_{1≤i≤|E|}.

Proposed Discrete Mumford-Shah

Proposition [Foare-Pustelnik-Condat, 2017]
For every η ∈ R and τ, ε > 0,

prox_{τ max{|·|, (·)²/(4ε)}}(η) = sign(η) max{ 0, min( |η| − τ, max( 4ε, |η| / (τ/(2ε) + 1) ) ) }.
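A direct NumPy transcription of this closed form, with a brute-force grid check; the test values of η, τ, ε are arbitrary.

```python
import numpy as np

def prox_quad_l1(eta, tau, eps):
    """prox of tau * max{|.|, (.)^2 / (4 eps)}, evaluated elementwise."""
    eta = np.asarray(eta, float)
    inner = np.maximum(4 * eps, np.abs(eta) / (tau / (2 * eps) + 1))
    return np.sign(eta) * np.maximum(0.0, np.minimum(np.abs(eta) - tau, inner))

# Sanity check: grid minimization of tau * f(x) + 0.5 * (x - eta)^2
eta, tau, eps = 1.7, 0.4, 0.2
x = np.linspace(-5, 5, 200001)
obj = tau * np.maximum(np.abs(x), x ** 2 / (4 * eps)) + 0.5 * (x - eta) ** 2
assert abs(x[np.argmin(obj)] - prox_quad_l1(eta, tau, eps)) < 1e-3
```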

Proposed Discrete Mumford-Shah

Convergence, PALM versus HL-PAM: Ψ(u^[l], e^[l]) w.r.t. iterations l.

[Figure: objective decay over 300 iterations for PALM with d_l = 0.5/β and HL-PAM with d_l ∈ {0.5/β, 5/β, 50/β, 500/β}.]

Proposed Discrete Mumford-Shah

[Figures, four examples: data g; TV; [Strekalovskiy-Cremers, 2014]; [Foare-Lachaud-Talbot, 2016]; ℓ₁; quadratic-ℓ₁.]

Proposed Discrete Mumford-Shah

Convergence speed: time until |Ψ(u^[l+1], e^[l+1]) − Ψ(u^[l], e^[l])| < 10⁻⁴

Image                   TV    [Foare-Lachaud-Talbot, 2016]    ℓ₁     quadratic-ℓ₁
dots (|Ω| = 128²)       0.4        43.6                        2.2       2.1
dots (|Ω| = 256²)       2.2       231.3                        6.2       5.5
dots (|Ω| = 512²)      30.8      1446.5                      116.3      90.3
ellipse (|Ω| = 128²)    0.7        55.3                        7.5       4.3
ellipse (|Ω| = 256²)    4.4       507.2                       34.8      17.2
ellipse (|Ω| = 512²)   48.8      5038.6                      535.7     385.6
peppers (|Ω| = 128²)    1.1       167.7                       22.3      19.9
peppers (|Ω| = 256²)    8.8      1014.4                       78.6      81.3
peppers (|Ω| = 512²)   61.8     10038.6                      647.5     650.8

Chan-Vese model

minimize_{u,K} ½∫_Ω (u − g)² dxdy + β∫_{Ω∖K} |∇u|² dxdy + λH¹(K ∩ Ω)

Discrete piecewise constant relaxation with fixed label number [Chan-Vese, 2001]:

minimize_{(θ^(q))_{1≤q≤Q−1}} Σ_{q=1}^Q ⟨θ^(q−1) − θ^(q), (μ_q − g)²⟩ + λ Σ_{q=1}^Q TV(θ^(q−1) − θ^(q))
s.t. 1 ≡ θ^(0) ≥ θ^(1) ≥ ... ≥ θ^(Q−1) ≥ θ^(Q) ≡ 0.

[Figure: a three-region image g (Ω₁, Ω₂, Ω₃) and its level functions θ^(0) ≥ θ^(1) ≥ θ^(2) ≥ θ^(3).]

Chan-Vese model

minimize_{Θ=(θ^(q))_{1≤q≤Q−1}} Σ_{q=1}^{Q−1} ⟨β^(q), θ^(q)⟩ + λ Σ_{q=1}^Q ‖DH_qΘ‖_{2,1} + ι_{[0,1]^{Q×|Ω|}}(Θ) + ι_E(Θ)

- β^(q) = (μ_{q+1} − g)² − (μ_q − g)²,
- H_q : R^{Q×|Ω|} → R^{|Ω|} : Θ ↦ θ^(q−1) − θ^(q),
- E = {Θ ∈ R^{Q×|Ω|} : θ^(1) ≥ ... ≥ θ^(Q−1)}.

Use of splitting proximal algorithms to deal with a sum of convex but non-smooth functions.

Chan-Vese model

Three-term splitting: minimize_Θ [Σ_{q=1}^{Q−1} ⟨β^(q), θ^(q)⟩ + λ Σ_{q=1}^Q ‖DH_qΘ‖_{2,1}] + ι_{[0,1]^{Q×|Ω|}}(Θ) + ι_E(Θ)

Two-term splitting: minimize_Θ [Σ_{q=1}^{Q−1} ⟨β^(q), θ^(q)⟩ + λ Σ_{q=1}^Q ‖DH_qΘ‖_{2,1}] + [ι_{[0,1]^{Q×|Ω|}}(Θ) + ι_E(Θ)]

Question: when is it possible to compute the proximity operator of a sum of functions rather than splitting it? Would it be more efficient?

Chan-Vese model

Proposition [Pustelnik-Condat, 2017]
(i) h is separable: for some h₀ ∈ Γ₀(R), (∀x = (x_i)_{i∈Ω}) h(x) = Σ_{i∈Ω} h₀(x_i).
(ii) g has the form (∀x = (x_i)_{i∈Ω}) g(x) = Σ_{(m,m′)∈Υ⊂Ω²} σ_{C_{m,m′}}(x_m − x_{m′}), where σ_{C_{m,m′}} : t ∈ R ↦ sup{tp : p ∈ C_{m,m′}} is the support function of a closed real interval C_{m,m′} with inf C_{m,m′} = a_{m,m′} and sup C_{m,m′} = b_{m,m′}, for some a_{m,m′} ∈ R ∪ {−∞} and b_{m,m′} ∈ R ∪ {+∞}, a_{m,m′} ≤ b_{m,m′}:

(∀t ∈ R) σ_{C_{m,m′}}(t) = a_{m,m′} t if t < 0; 0 if t = 0; b_{m,m′} t if t > 0.

Under assumptions (i) and (ii), prox_{g+h} = prox_h ∘ prox_g.

Chan-Vese model

Particular cases:
- Fused Lasso: Ω = {1,...,N} and Υ = {(1,2), (2,3), ..., (N−1,N)}, with b_{n,n+1} = −a_{n,n+1} = ω_n ≥ 0 and h₀ = λ|·|, so that g(x) = Σ_{n=1}^{N−1} ω_n |x_{n+1} − x_n|.
- Chan-Vese: Ω = {1,...,Q} and Υ = {(1,2), (2,3), ..., (Q−1,Q)}, with a_{n,n+1} = 0, b_{n,n+1} = +∞, and h₀ = ι_{[0,1]}, so that g = ι_E.

Compute P_E with the Pool Adjacent Violators Algorithm (PAVA) [Ayer et al., 1955]; see the sketch below.
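A hedged sketch of the Chan-Vese case: a standard stack-based PAVA (my implementation, not the reference's pseudocode) computes P_E; since E asks for a nonincreasing sequence, the input is reversed around a nondecreasing projection, and composing with clipping to [0,1] gives prox_{ι_[0,1]+ι_E} = P_{[0,1]} ∘ P_E per the proposition above.

```python
import numpy as np

def pava_increasing(y):
    """Euclidean projection onto nondecreasing sequences (stack-based PAVA)."""
    blocks = []                              # (mean, size) of merged blocks
    for v in y:
        m, w = float(v), 1
        while blocks and blocks[-1][0] > m:  # violator: pool the blocks
            pm, pw = blocks.pop()
            m = (pm * pw + m * w) / (pw + w)
            w += pw
        blocks.append((m, w))
    out = np.empty(len(y))
    i = 0
    for m, w in blocks:
        out[i : i + w] = m
        i += w
    return out

def prox_chanvese(theta):
    """prox of iota_[0,1] + iota_E: nonincreasing projection, then clipping."""
    p_E = pava_increasing(theta[::-1])[::-1]
    return np.clip(p_E, 0.0, 1.0)

# e.g. prox_chanvese(np.array([0.2, 1.4, 0.9, -0.1])) -> nonincreasing in [0,1]
```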

Chan-Vese model

[Figure: g; results for λ = 10³ and λ = 10⁴.]

Chan-Vese model

[Figure: convergence comparison on a log scale (10⁰ to 10¹⁰) over 15000 iterations: minimal splitting (proposed method), intermediate splitting, full splitting.]