A memory gradient algorithm for ℓ2-ℓ0 regularization with applications to image restoration


A memory gradient algorithm for ℓ2-ℓ0 regularization with applications to image restoration
E. Chouzenoux, A. Jezierska, J.-C. Pesquet and H. Talbot
Université Paris-Est, Laboratoire d'Informatique Gaspard Monge, UMR CNRS 8049
Journées GDR MOA-MSPC, Optimisation et traitement d'images

Outline
1. General context
2. ℓ2-ℓ0 regularization functions
   - Existence of minimizers
   - Epi-convergence property
3. Minimization of $F_\delta$
   - Proposed algorithm
   - Convergence results
4. Application to image restoration
   - Image denoising
   - Image deblurring
   - Image segmentation
   - Texture + geometry decomposition
5. Conclusion

Context: Image restoration
We observe data $y \in \mathbb{R}^Q$, related to the original image $\bar{x} \in \mathbb{R}^N$ through
$$y = H\bar{x} + u, \qquad H \in \mathbb{R}^{Q \times N},$$
where $u$ is the noise.
Objective: restore the unknown original image $\bar{x}$ from $H$ and $y$.
[Images: observed image $y$ and original image $\bar{x}$]

Context: Penalized optimization problem
Find
$$\min_{x \in \mathbb{R}^N} F(x) = \Phi(Hx - y) + \lambda R(x),$$
where
- $\Phi$: data-fidelity term, related to the noise
- $R$: regularization term, related to some a priori assumptions
- $\lambda$: regularization weight
Assumption: $\bar{x}$ is sparse in a dictionary $V$ of analysis vectors in $\mathbb{R}^N$:
$$F_0(x) = \Phi(Hx - y) + \lambda\, \ell_0(V x).$$

Replacing the ℓ0 penalty by a smooth surrogate leads to
$$F_\delta(x) = \Phi(Hx - y) + \lambda \sum_{c=1}^{C} \psi_\delta(V_c^\top x),$$
where $\psi_\delta$ is a differentiable, non-convex approximation of the ℓ0 penalty.
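As a concrete instance (matching the denoising setting used later in the deck, with a quadratic data-fidelity term, $H = I$, and the exponential potential):
$$F_\delta(x) = \tfrac{1}{2}\|x - y\|^2 + \lambda \sum_{c=1}^{C}\left(1 - \exp\left(-\frac{(V_c^\top x)^2}{2\delta^2}\right)\right),$$
and each penalty term tends to the indicator of $V_c^\top x \neq 0$ as $\delta \to 0$, so $F_\delta$ approaches the ℓ0-penalized objective $F_0$.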

ℓ2-ℓ0 regularization functions
We consider the following class of potential functions:
1. $(\forall \delta \in (0,+\infty))$ $\psi_\delta$ is differentiable.
2. $(\forall \delta \in (0,+\infty))$ $\lim_{t \to +\infty} \psi_\delta(t) = 1$.
3. $(\forall \delta \in (0,+\infty))$ $\psi_\delta(t) = O(t^2)$ for small $t$.
Examples:
$$\psi_\delta(t) = \frac{t^2}{2\delta^2 + t^2}, \qquad \psi_\delta(t) = 1 - \exp\Bigl(-\frac{t^2}{2\delta^2}\Bigr).$$
[Plot: both potentials as functions of $t$, equal to 0 at $t = 0$ and increasing towards 1 as $|t|$ grows]
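A minimal numerical sketch of the two example potentials (Python/NumPy; the function names are chosen here for illustration), showing the ℓ0-like behaviour as δ shrinks:

```python
import numpy as np

def psi_rational(t, delta):
    # psi_delta(t) = t^2 / (2*delta^2 + t^2): ~ t^2/(2 delta^2) near 0, -> 1 at infinity
    return t**2 / (2.0 * delta**2 + t**2)

def psi_exp(t, delta):
    # psi_delta(t) = 1 - exp(-t^2 / (2*delta^2)): same qualitative behaviour
    return 1.0 - np.exp(-t**2 / (2.0 * delta**2))

t = np.array([0.0, 0.5, 2.0])
for delta in (1.0, 0.1, 0.01):
    # As delta -> 0, psi_delta(t) -> 0 for t = 0 and -> 1 for t != 0 (l0-like behaviour)
    print(delta, psi_rational(t, delta), psi_exp(t, delta))
```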

Existence of minimizers
$$F_\delta(x) = \Phi(Hx - y) + \lambda \sum_{c=1}^{C} \psi_\delta(V_c^\top x)$$
Difficulty: $F_\delta$ is a non-convex, non-coercive function.
Proposition. Assume that
$$\lim_{\|x\| \to +\infty} \Phi(x) = +\infty \quad \text{and} \quad \operatorname{Ker} H = \{0\}.$$
Then, for every $\delta > 0$, $F_\delta$ has a minimizer.

Existence of minimizers (continued)
$$F_\delta(x) = \Phi(Hx - y) + \lambda \sum_{c=1}^{C} \psi_\delta(V_c^\top x) + \|\Pi x\|^2$$
Difficulty: $F_\delta$ is a non-convex, non-coercive function.
Proposition. Assume that
$$\lim_{\|x\| \to +\infty} \Phi(x) = +\infty \quad \text{and} \quad \operatorname{Ker} H \cap \operatorname{Ker} \Pi = \{0\}.$$
Then, for every $\delta > 0$, $F_\delta$ has a minimizer.

Epi-convergence to the ℓ0-penalized objective function
Assumptions:
1. $(\forall (\delta_1,\delta_2) \in (0,+\infty)^2)$ $\delta_1 \le \delta_2 \Rightarrow (\forall t \in \mathbb{R})\ \psi_{\delta_1}(t) \ge \psi_{\delta_2}(t) \ge 0$.
2. $(\forall t \in \mathbb{R})$ $\lim_{\delta \to 0} \psi_\delta(t) = \begin{cases} 0 & \text{if } t = 0 \\ 1 & \text{otherwise.} \end{cases}$
3. $\Phi$ is coercive and $\operatorname{Ker} H \cap \operatorname{Ker} \Pi = \{0\}$.
Proposition. Let $(\delta_n)_{n \in \mathbb{N}}$ be a decreasing sequence of positive real numbers converging to 0. Under the above assumptions,
$$\inf F_{\delta_n} \to \inf F_0 \quad \text{as } n \to +\infty.$$
In addition, if for every $n \in \mathbb{N}$, $\hat{x}_n$ is a minimizer of $F_{\delta_n}$, then the sequence $(\hat{x}_n)_{n \in \mathbb{N}}$ is bounded and all its cluster points are minimizers of $F_0$.

Iterative minimization of $F_\delta(x)$
Descent algorithm:
$$x_{k+1} = x_k + \alpha_k d_k, \qquad (\forall k \in \{0,\ldots,K\})$$
- $d_k$: search direction satisfying $g_k^\top d_k < 0$, where $g_k = \nabla F_\delta(x_k)$.
  Examples: gradient, conjugate gradient, Newton, truncated Newton, ...
- stepsize $\alpha_k$: approximate minimizer of $f_{k,\delta}(\alpha)\colon \alpha \mapsto F_\delta(x_k + \alpha d_k)$.

Generalization: subspace algorithm [Zibulevsky10]
$$x_{k+1} = x_k + \sum_{i=1}^{r} s_{i,k}\, d_k^i, \qquad (\forall k \in \{0,\ldots,K\})$$
- $D_k = [d_k^1, \ldots, d_k^r]$: set of search directions.
  Example: super-memory gradient, $D_k = [-g_k, d_{k-1}, \ldots, d_{k-m}]$.
- stepsize $s_k$: approximate minimizer of $f_{k,\delta}(s)\colon s \mapsto F_\delta(x_k + D_k s)$.

Majorize-Minimize principle [Hunter04]
Objective: find $\hat{x} \in \operatorname{Argmin}_x F_\delta(x)$.
For all $x'$, let $Q(\cdot, x')$ be a tangent majorant of $F_\delta$ at $x'$, i.e.,
$$Q(x, x') \ge F_\delta(x) \quad \forall x, \qquad Q(x', x') = F_\delta(x').$$
MM algorithm: $\forall j \in \{0,\ldots,J\}$, $x_{j+1} \in \operatorname{Argmin}_x Q(x, x_j)$.
[Figure: one MM step, minimizing the majorant $Q(\cdot, x_j)$ of $F_\delta$ moves from $x_j$ to $x_{j+1}$]
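A standard one-line consequence of these two properties (not spelled out on the slide) is that every MM step decreases the objective:
$$F_\delta(x_{j+1}) \le Q(x_{j+1}, x_j) \le Q(x_j, x_j) = F_\delta(x_j),$$
where the first inequality is the majorization property and the second holds because $x_{j+1}$ minimizes $Q(\cdot, x_j)$.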

Quadratic tangent majorant function
Assumption: for all $x'$, there exists $A(x')$, positive definite, such that
$$Q(x, x') = F_\delta(x') + \nabla F_\delta(x')^\top (x - x') + \tfrac{1}{2}(x - x')^\top A(x')\,(x - x')$$
is a quadratic tangent majorant of $F_\delta$ at $x'$.
Construction of $A(\cdot)$ for $G(x) = \sum_c \varphi([Vx - w]_c)$:
- If $\varphi$ is $C^1$ on $\mathbb{R}$ and $\varphi$ has an $L$-Lipschitz gradient on $\mathbb{R}$, then $A(x') = a\, V^\top V$ with $a \ge L$.

Construction of $A(\cdot)$, second case [Allain06]:
- If $\varphi$ is $C^1$ and even on $\mathbb{R}$, and $\varphi(\sqrt{\cdot})$ is concave on $\mathbb{R}_+$, then
$$A(x') = V^\top \operatorname{Diag}\{\omega([Vx' - w])\}\, V, \qquad \omega(u) = \dot{\varphi}(u)/u \in (0, +\infty).$$
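A small sketch of this second construction for the exponential potential $\psi_\delta(t) = 1 - \exp(-t^2/2\delta^2)$ (Python/NumPy, dense matrices, $w = 0$; all names are illustrative): here $\dot{\psi}_\delta(u)/u = \exp(-u^2/2\delta^2)/\delta^2$, so the weights stay strictly positive.

```python
import numpy as np

def hq_weights(u, delta):
    # omega(u) = psi_delta'(u)/u for psi_delta(t) = 1 - exp(-t^2/(2*delta^2))
    return np.exp(-u**2 / (2.0 * delta**2)) / delta**2

def majorant_matrix(V, x_prime, delta):
    # A(x') = V^T Diag{omega(V x')} V  (dense version, for small problems only)
    u = V @ x_prime
    return V.T @ (hq_weights(u, delta)[:, None] * V)

# Example: V = first-order differences on a signal of length 5
N = 5
V = np.eye(N, k=1)[:N - 1] - np.eye(N)[:N - 1]
A = majorant_matrix(V, np.array([0.0, 1.0, 1.2, 5.0, 5.1]), delta=1.0)
print(A.shape)  # (5, 5)
```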

Majorize-Minimize multivariate stepsize [Chouzenoux11]
$$x_{k+1} = x_k + D_k s_k, \qquad (\forall k \in \{0,\ldots,K\})$$
- $D_k$: set of directions.
- $s_k$: resulting from MM minimization of $f_{k,\delta}(s)\colon s \mapsto F_\delta(x_k + D_k s)$.
- $q_k(s, s_k^j)$: quadratic tangent majorant of $f_{k,\delta}$ at $s_k^j$, with Hessian $B_{s_k^j} = D_k^\top A(x_k + D_k s_k^j)\, D_k$.
MM minimization in the subspace:
$$s_k^0 = 0, \qquad s_k^{j+1} \in \operatorname{Argmin}_s q_k(s, s_k^j) \quad (\forall j \in \{0,\ldots,J-1\}), \qquad s_k = s_k^J.$$

Proposed algorithm
Majorize-Minimize Memory Gradient (MM-MG) algorithm. For $k = 1, \ldots, K$:
1. Compute the set of directions, for example $D_k = [-g_k,\ x_k - x_{k-1}]$.
2. Set $s_k^0 = 0$.
3. For $j \in \{0,\ldots,J-1\}$:
   $$B_{s_k^j} = D_k^\top A(x_k + D_k s_k^j)\, D_k, \qquad s_k^{j+1} = s_k^j - B_{s_k^j}^{-1} \nabla f_{k,\delta}(s_k^j).$$
4. Set $s_k = s_k^J$.
5. Update $x_{k+1} = x_k + D_k s_k$.
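Below is a compact sketch of this algorithm for a 1D analogue of the denoising problem of the next section ($H = I$, quadratic data term, exponential potential, $V$ = first-order differences). It is an illustrative implementation with hypothetical names, not the authors' code, and uses $J = 1$ inner MM step by default.

```python
import numpy as np

def grad_op(x):
    # V x: first-order differences of a 1D signal (analysis operator)
    return np.diff(x)

def grad_adj(u):
    # V^T u: adjoint of the difference operator
    out = np.zeros(u.size + 1)
    out[:-1] -= u
    out[1:] += u
    return out

def mm_mg_denoise(y, lam=1.0, delta=0.5, K=200, J=1):
    """MM-MG sketch for F_delta(x) = 0.5*||x - y||^2 + lam * sum_c psi_delta((Vx)_c),
    with psi_delta(t) = 1 - exp(-t^2 / (2*delta^2))."""
    def grad_F(x):
        u = grad_op(x)
        dpsi = (u / delta**2) * np.exp(-u**2 / (2 * delta**2))   # psi_delta'(u)
        return (x - y) + lam * grad_adj(dpsi)

    def apply_A(x_ref, d):
        # A(x_ref) d = d + lam * V^T Diag{omega(V x_ref)} V d, omega(u) = psi'(u)/u
        omega = np.exp(-grad_op(x_ref)**2 / (2 * delta**2)) / delta**2
        return d + lam * grad_adj(omega * grad_op(d))

    x, x_prev = y.copy(), y.copy()
    for _ in range(K):
        g = grad_F(x)
        D = np.stack([-g, x - x_prev], axis=1)        # memory-gradient subspace
        s = np.zeros(2)
        for _ in range(J):                            # MM steps on the multivariate stepsize
            z = x + D @ s
            B = D.T @ np.column_stack([apply_A(z, D[:, i]) for i in range(2)])
            # pinv handles the rank-deficient first iteration (x - x_prev = 0)
            s = s - np.linalg.pinv(B) @ (D.T @ grad_F(z))
        x_prev, x = x, x + D @ s
    return x

# Usage: denoise a noisy piecewise-constant signal
rng = np.random.default_rng(0)
clean = np.concatenate([np.zeros(50), np.ones(50)])
noisy = clean + 0.1 * rng.standard_normal(100)
print(np.round(np.linalg.norm(mm_mg_denoise(noisy) - clean), 3))
```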

Convergence results
Assumptions:
➀ $\Phi$ is coercive and $\operatorname{Ker} H \cap \operatorname{Ker} \Pi = \{0\}$.
➁ The gradient of $\Phi$ is $L$-Lipschitzian.
➂ $\psi_\delta$ is even and $\psi_\delta(\sqrt{\cdot})$ is concave on $\mathbb{R}_+$. Moreover, there exists $\omega \in [0,+\infty)$ such that $(\forall t \in (0,+\infty))$ $0 \le \dot{\psi}_\delta(t) \le \omega t$. In addition, $\lim_{t \to 0,\, t > 0} \dot{\psi}_\delta(t)/t \in \mathbb{R}$.
➃ $F_\delta$ satisfies the Łojasiewicz inequality [Attouch10a, Attouch10b]: for every $\bar{x} \in \mathbb{R}^N$ and every bounded neighborhood $E$ of $\bar{x}$, there exist constants $\kappa > 0$, $\zeta > 0$ and $\theta \in [0,1)$ such that
$$\|\nabla F_\delta(x)\| \ge \kappa\, |F_\delta(x) - F_\delta(\bar{x})|^{\theta}$$
for every $x \in E$ such that $|F_\delta(x) - F_\delta(\bar{x})| < \zeta$.
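For intuition (a textbook example, not from the slides): the inequality holds for $F(x) = \|x\|^2$ around its minimizer $\bar{x} = 0$ with $\theta = 1/2$ and $\kappa = 2$, since
$$\|\nabla F(x)\| = 2\|x\| = 2\,|F(x) - F(\bar{x})|^{1/2}.$$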

Convergence results
Theorem. Under Assumptions ➀, ➁ and ➂, for all $J \ge 1$, the MM-MG algorithm is such that
$$\lim_{k \to \infty} \nabla F_\delta(x_k) = 0.$$
Furthermore, if Assumption ➃ is fulfilled, then
- the MM-MG algorithm generates a sequence converging to a critical point $\hat{x}$ of $F_\delta$;
- the sequence $(x_k)_{k \in \mathbb{N}}$ has a finite length, in the sense that $\sum_{k=0}^{+\infty} \|x_{k+1} - x_k\| < +\infty$.

Image denoising
[Images: original image $\bar{x}$ and noisy image $y$, 512 x 512, SNR = 15 dB]
$$F_\delta(x) = \tfrac{1}{2}\|x - y\|^2 + \lambda \sum_c \psi_\delta(V_c^\top x), \qquad \psi_\delta(t) = 1 - \exp(-t^2/2\delta^2)$$
Compared algorithms: MM-MG / Beck-Teboulle / half-quadratic.
Comparison with discrete (combinatorial) methods for $\psi_\delta(t) = \delta^{-2}\min(t^2, \delta^2)$.

Results
[Image: denoised image, SNR = 24.4 dB]

Results

Algorithm                           Time      SNR (dB)
MM-MG                               28 s      24.4
Beck-Teboulle [Beck09]              45 s      24.45
Half-Quadratic [Allain06]           97 s      24.42
Graph Cut Swap [Boykov02]           365 s     23.98
Tree-Reweighted [Felzenszwalb10]    181 s     24.28
Belief Propagation [Kolmogorov06]   1958 s    24.28

Image deblurring
[Images: original image $\bar{x}$ and noisy blurred image $y$, 256 x 256, motion blur (9 pixels), SNR = 10 dB]
$$F_\delta(x) = \tfrac{1}{2}\|Hx - y\|^2 + \lambda \sum_c \psi_\delta(V_c^\top x) + \tau\|x\|^2, \qquad \tau = 10^{-10}$$
Non-convex penalty: $\psi_\delta(t) = 1 - \exp(-t^2/2\delta^2)$.
Convex penalty: $\psi_\delta(t) = \sqrt{1 + t^2/\delta^2} - 1$.
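A sketch of how the data term changes for deblurring (Python/NumPy, 1D circular convolution as a stand-in blur model, illustrative names); the regularization part is handled exactly as in the denoising sketch above.

```python
import numpy as np

h = np.ones(9) / 9.0   # a 9-tap uniform kernel, standing in for the 9-pixel motion blur

def H(x):
    # circular convolution with h (simple blur model)
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h, x.size)))

def Ht(r):
    # adjoint of the circular convolution (correlation with h)
    return np.real(np.fft.ifft(np.fft.fft(r) * np.conj(np.fft.fft(h, r.size))))

def grad_data(x, y, tau=1e-10):
    # gradient of 0.5*||Hx - y||^2 + tau*||x||^2
    return Ht(H(x) - y) + 2.0 * tau * x

# Usage: gradient of the data term at a blurred, noisy observation
x0 = np.zeros(64); x0[20:40] = 1.0
y = H(x0) + 0.01 * np.random.default_rng(1).standard_normal(64)
print(grad_data(x0, y)[:3])
```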

Results: non-convex penalty
[Image: restored image, SNR = 18.82 dB]

Algorithm   Time     SNR (dB)
MM-MG       29 s     18.82
B-T         32 s     18.67
H-Q         141 s    18.52

Results: convex penalty
[Image: restored image, SNR = 17.91 dB]

Algorithm   Time      SNR (dB)
MM-MG       194 s     17.91
B-T         900 s     17.91
H-Q         1824 s    17.91

Image segmentation
[Image: original image $\bar{x}$, 256 x 256]
$$F_\delta(x) = \tfrac{1}{2}\|x - \bar{x}\|^2 + \lambda \sum_c \psi_\delta(V_c^\top x), \quad \text{with large } \lambda \text{ and small } \delta$$
Non-convex penalty: $\psi_\delta(t) = 1 - \exp(-t^2/2\delta^2)$.
Convex penalty: $\psi_\delta(t) = \sqrt{1 + t^2/\delta^2} - 1$.

Results
[Images: segmentation with the non-convex penalty and with the convex penalty]

Results
[Plot: intensity profile along the 50th line of the image, comparing the initial image with the non-convex and convex penalty results]

Results (continued)

Algorithm        Time (non-convex)   Time (convex)
MM-MG            15 s                5 s
Beck-Teboulle    21 s                50 s
Half-Quadratic   39 s                31 s

Texture + geometry decomposition
[Images: original image $\bar{x}$ and noisy image $y$, 256 x 256, SNR = 15 dB]
Decompose $y = \hat{x} + \check{x}$, with $\hat{x}$ the geometry component and $\check{x}$ the texture + noise component, where
$$\hat{x} \in \operatorname{Argmin}_x\ \tfrac{1}{2}\,\|\Delta^{-1}(x - y)\|^2 + \lambda \sum_c \psi_\delta(V_c^\top x) \qquad \text{[Osher03]}$$
Non-convex penalty / convex penalty.

Results: non-convex penalty
[Images: geometry component $\hat{x}$ and texture + noise component $\check{x}$]
MM-MG algorithm: convergence in 91 s.

Results: convex penalty
[Images: geometry component $\hat{x}$ and texture + noise component $\check{x}$]
MM-MG algorithm: convergence in 134 s.

Conclusion
- Majorize-Minimize memory gradient algorithm for ℓ2-ℓ0 minimization.
- Faster than combinatorial optimization techniques on the tested problems.
- Simple to implement.
Future work:
- Constrained case.
- Non-differentiable case.
- How to ensure global convergence?

Bibliography
M. Allain, J. Idier and Y. Goussard. On global and local convergence of half-quadratic algorithms. IEEE Transactions on Image Processing, 15(5):1130-1142, 2006.
H. Attouch, J. Bolte and B. F. Svaiter. Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods. http://www.optimization-online.org, 2010.
H. Attouch, J. Bolte, P. Redont and A. Soubeyran. Proximal alternating minimization and projection methods for nonconvex problems: an approach based on the Kurdyka-Lojasiewicz inequality. Mathematics of Operations Research, 35(2):438-457, 2010.
A. Beck and M. Teboulle. Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. IEEE Transactions on Image Processing, 18(11):2419-2434, 2009.
E. Chouzenoux, J. Idier and S. Moussaoui. A Majorize-Minimize strategy for subspace optimization applied to image restoration. IEEE Transactions on Image Processing, 20(6):1517-1528, 2011.

Thanks for your attention!