
Thai Journal of Mathematics
Volume 15 (2017) Number 1 : 1-7
http://thaijmath.in.cmu.ac.th
ISSN 1686-0209

About Split Proximal Algorithms for the Q-Lasso

Abdellatif Moudafi
Aix Marseille Université, CNRS-L.S.I.S UMR 7296, France
e-mail : abdellatif.moudafi@univ-amu.fr

Abstract : Numerous problems in signal processing and imaging, statistical learning and data mining, or computer vision can be formulated as optimization problems which consist in minimizing a sum of convex functions, not necessarily differentiable, possibly composed with linear operators. Each function is typically either a data fidelity term or a regularization term enforcing some properties on the solution; see for example [1, 2] and references therein. In this note we are interested in the general form of the Q-Lasso introduced in [3], which generalizes the well-known Lasso of Tibshirani [4]. Here Q is a closed convex subset of a Euclidean m-space, for some integer m, that can be interpreted as the set of errors within a given tolerance level when linear measurements are taken to recover a signal/image via the Lasso. Only the unconstrained case was discussed in [3]; here we discuss some split proximal algorithms for solving the general case. It is worth mentioning that the Lasso models a number of applied problems arising from machine learning and signal/image processing, due to the fact that it promotes the sparsity of a signal.

Keywords : Q-Lasso; basis pursuit; proximal algorithm; soft-thresholding; regularization; Douglas-Rachford algorithm; forward-backward iterations.

2010 Mathematics Subject Classification : 49J53; 65K10; 49M37; 90C25.

Copyright 2017 by the Mathematical Association of Thailand. All rights reserved.

1 Introduction

The lasso of Tibshirani [4] is the minimization problem

    \min_{x \in \mathbb{R}^n} \frac{1}{2}\|Ax - b\|_2^2 + \gamma \|x\|_1,    (1.1)

where A is an m×n real matrix, b ∈ ℝ^m and γ > 0 is a tuning parameter. It is equivalent to the basis pursuit (BP) problem of Chen et al. [5]

    \min_{x \in \mathbb{R}^n} \|x\|_1  subject to  Ax = b.    (1.2)

However, due to measurement errors, the constraint Ax = b is actually inexact; it turns out that problem (1.2) is reformulated as

    \min_{x \in \mathbb{R}^n} \|x\|_1  subject to  \|Ax - b\|_p \le \varepsilon,    (1.3)

where ε > 0 is the tolerance level of errors and p is often 1, 2 or ∞. It is noticed in [3] that if we let Q := B_ε(b), the closed ball in ℝ^m with center b and radius ε, then (1.3) is rewritten as

    \min_{x \in \mathbb{R}^n} \|x\|_1  subject to  Ax ∈ Q.    (1.4)

With Q a nonempty closed convex subset of ℝ^m and P_Q the projection from ℝ^m onto Q, and since the constraint Ax ∈ Q is equivalent to the condition Ax − P_Q(Ax) = 0, this leads to the following equivalent Lagrangian formulation

    \min_{x \in \mathbb{R}^n} \frac{1}{2}\|(I - P_Q)Ax\|_2^2 + \gamma \|x\|_1,    (1.5)

with γ > 0 a Lagrangian multiplier. A connection is also made in [3] with the so-called split feasibility problem [6], which is stated as finding x verifying

    x ∈ C,  Ax ∈ Q,    (1.6)

where C and Q are closed convex subsets of ℝ^n and ℝ^m, respectively. An equivalent minimization formulation of (1.6) is

    \min_{x \in C} \frac{1}{2}\|(I - P_Q)Ax\|_2^2.    (1.7)

Its l_1 regularization is given as

    \min_{x \in C} \frac{1}{2}\|(I - P_Q)Ax\|_2^2 + \gamma \|x\|_1,    (1.8)

where γ > 0 is a regularization parameter. The main difficulty in solving (1.8) stems from the fact that m and n are typically very large, hence the denomination of large-scale optimization. It is not possible to manipulate, at every iteration of an algorithm, matrices of size n×n, like the Hessian of a function. So proximal algorithms, which only exploit first-order information of the functions, are often the only viable way to solve (1.8).
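For concreteness, the building blocks of (1.5)-(1.8) are easy to evaluate once the projection P_Q is available; when Q = B_ε(b) as in (1.4), P_Q has a closed form. The following NumPy sketch is only an illustration under that assumption (the function names are chosen here for exposition and do not appear in the text): it evaluates the Q-Lasso objective and the gradient A^t(I − P_Q)Ax of its smooth part.

```python
import numpy as np

def proj_ball(z, center, radius):
    """Euclidean projection of z onto the closed ball Q = B_radius(center)."""
    d = z - center
    nrm = np.linalg.norm(d)
    return z if nrm <= radius else center + radius * d / nrm

def q_lasso_objective(x, A, center, radius, gamma):
    """Value of (1/2)||(I - P_Q)Ax||_2^2 + gamma*||x||_1 for a ball-shaped Q, cf. (1.5)."""
    Ax = A @ x
    r = Ax - proj_ball(Ax, center, radius)      # (I - P_Q)Ax
    return 0.5 * r @ r + gamma * np.abs(x).sum()

def q_lasso_smooth_gradient(x, A, center, radius):
    """Gradient of the smooth part: A^t (I - P_Q) A x."""
    Ax = A @ x
    return A.T @ (Ax - proj_ball(Ax, center, radius))
```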

In this note, we propose two proximal algorithms to solve problem (1.8) by full splitting; that is, at every iteration, the only operations involved are evaluations of the gradient of the function \frac{1}{2}\|(I - P_Q)A(\cdot)\|_2^2, of the proximal mapping of γ‖·‖_1, and of A or its transpose A^t. In [3], properties and iterative methods for (1.5) are investigated.

Remember also that many authors devoted their works to the unconstrained minimization problem \min_{x \in H} f_1(x) + f_2(x), where f_1, f_2 are two proper, convex, lower semicontinuous functions defined on a Hilbert space H and f_2 is differentiable with a β-Lipschitz continuous gradient for some β > 0. An effective method to solve it is the forward-backward algorithm, which from an initial value x_0 generates a sequence (x_k) by the iteration

    x_{k+1} = (1 - \lambda_k) x_k + \lambda_k \, \mathrm{prox}_{\gamma_k f_1}\big(x_k - \gamma_k \nabla f_2(x_k)\big),    (1.9)

where γ_k > 0 is the algorithm step-size and 0 < λ_k < 1 is a relaxation parameter. It is well-known, see for instance [1], that if (γ_k) is bounded and (λ_k) is bounded from below, then (x_k) converges weakly to a solution of \min_{x \in H} f_1(x) + f_2(x), provided that the set of solutions is nonempty.

To relax the assumption on the differentiability of f_2, the Douglas-Rachford algorithm was introduced. It generates a sequence (y_k) as follows:

    y_{k+1/2} = \mathrm{prox}_{\kappa f_2}(y_k);
    y_{k+1} = y_k + \tau_k \big( \mathrm{prox}_{\kappa f_1}(2 y_{k+1/2} - y_k) - y_{k+1/2} \big),    (1.10)

where κ > 0 and (τ_k) is a sequence of positive reals. It is well-known that (y_k) converges weakly to a point y such that \mathrm{prox}_{\kappa f_2}(y) is a solution of the unconstrained minimization problem above, provided that τ_k ∈ ]0, 2[ for all k ∈ ℕ, that \sum_{k=0}^{\infty} \tau_k(2 - \tau_k) = +\infty, and that the set of solutions is nonempty.
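As a minimal illustration of the two generic schemes (1.9) and (1.10), the sketch below treats the proximal mappings as callables prox(z, t) ≈ prox_{t·f}(z); the function and argument names are assumptions made for exposition only.

```python
import numpy as np

def forward_backward(x0, grad_f2, prox_f1, steps, relaxations):
    """Generic forward-backward iteration (1.9):
    x_{k+1} = (1 - lam_k)*x_k + lam_k * prox_{gam_k f1}(x_k - gam_k * grad f2(x_k))."""
    x = x0
    for gam, lam in zip(steps, relaxations):
        x = (1.0 - lam) * x + lam * prox_f1(x - gam * grad_f2(x), gam)
    return x

def douglas_rachford(y0, prox_f1, prox_f2, kappa, relaxations):
    """Generic Douglas-Rachford iteration (1.10); returns prox_{kappa f2} of the last iterate."""
    y = y0
    for tau in relaxations:
        y_half = prox_f2(y, kappa)
        y = y + tau * (prox_f1(2.0 * y_half - y, kappa) - y_half)
    return prox_f2(y, kappa)
```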

In this note we are interested in (1.8), which is more general and reduces to (1.5) when the set of constraints is the entire space ℝ^n. The involvement of the convex set C brings some technical difficulties, which are overcome in what follows. Since the same properties for problem (1.8) may be obtained by mimicking the analysis developed in [3] for (1.5), we will focus our attention on the algorithmic aspect.

2 Split Proximal Algorithms

In this section, we introduce several proximal iterative methods for solving the Q-Lasso (1.8) in its general form. For the sake of our purpose we confine ourselves to the finite dimensional setting, but our analysis is still valid in Hilbert spaces; just replace the transpose A^t of A by its adjoint operator A^* and the convergence by weak convergence. To begin with, recall that the proximal mapping (or Moreau's proximity operator) of a proper, convex and lower semicontinuous function ϕ with parameter λ > 0 is defined by

    \mathrm{prox}_{\lambda\varphi}(x) := \arg\min_{v \in \mathbb{R}^n} \Big\{ \varphi(v) + \frac{1}{2\lambda}\|v - x\|^2 \Big\},  x ∈ ℝ^n,

and that it has a closed-form expression in some important cases. For example, if ϕ = ‖·‖_1, then for x ∈ ℝ^n

    \mathrm{prox}_{\lambda\|\cdot\|_1}(x) = \big(\mathrm{prox}_{\lambda|\cdot|}(x_1), \ldots, \mathrm{prox}_{\lambda|\cdot|}(x_n)\big),    (2.1)

where \mathrm{prox}_{\lambda|\cdot|}(x_k) = \mathrm{sgn}(x_k)\max\{|x_k| - \lambda, 0\} for k = 1, 2, \ldots, n. If ϕ(x) = \frac{1}{2}\|Ax - y\|^2, then

    \mathrm{prox}_{\gamma\varphi}(x) = (I + \gamma A^t A)^{-1}(x + \gamma A^t y),  with  (I + \gamma A^t A)^{-1} A^t = A^t (I + \gamma A A^t)^{-1},

and if ϕ = i_C, the indicator function defined by i_C(x) = 0 if x ∈ C and +∞ otherwise, we have \mathrm{prox}_{\gamma\varphi}(x) = \mathrm{Proj}_C(x) := \arg\min_{z \in C}\|x - z\|; such a function is convenient to enforce hard constraints on the solution. Observe that the minimization problem (1.8) can be written as

    \min_{x \in \mathbb{R}^n} \frac{1}{2}\|(I - P_Q)Ax\|_2^2 + \gamma\|x\|_1 + i_C(x).    (2.2)

Remember also that the subdifferential is defined as

    \partial\varphi(x) := \{ u ∈ ℝ^n : \varphi(z) \ge \varphi(x) + \langle u, z - x \rangle  \forall z ∈ ℝ^n \}.

It is easily seen that \partial\big(\tfrac{1}{2}\|Ax - y\|^2\big) = \{A^t(Ax - y)\}, that

    \big(\partial\|\cdot\|_1(x)\big)_i = \{\mathrm{sign}(x_i)\}  if x_i ≠ 0,   [-1, 1]  if x_i = 0,

and that \partial i_C is nothing but the normal cone to C.

Very recently, several split proximal algorithms were proposed to minimize the problem

    \min_{x \in \mathbb{R}^n} f(x) + g(x) + i_C(x),    (2.3)

where f, g are two proper, convex and lower semicontinuous functions and C is a nonempty closed convex set. Based on the algorithms introduced in [5], we propose two solutions for solving the more general problem (2.2). Both of them are obtained by coupling the forward-backward algorithm and the Douglas-Rachford algorithm. We then mention other solutions, introduced in [7], founded on a combination of the Krasnoselskii-Mann iteration and the Douglas-Rachford algorithm, or on an alternating algorithm. Other solutions may be considered by using the primal-dual algorithms proposed in [2].
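Before detailing the two algorithms, the following NumPy sketch records the closed-form proximal mappings recalled above; the function names are illustrative, and the box-shaped C is only an example of a constraint set whose projection is easy to compute.

```python
import numpy as np

def soft_threshold(x, lam):
    """prox_{lam*||.||_1}(x): componentwise soft-thresholding, cf. (2.1)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def prox_least_squares(x, A, y, gam):
    """prox_{gam*(1/2)||A(.) - y||^2}(x) = (I + gam*A^t A)^{-1} (x + gam*A^t y)."""
    n = A.shape[1]
    return np.linalg.solve(np.eye(n) + gam * (A.T @ A), x + gam * (A.T @ y))

def proj_box(x, lo, hi):
    """Projection onto C = [lo, hi]^n, i.e. prox of the indicator i_C for a simple box-shaped C."""
    return np.clip(x, lo, hi)
```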

2.1 Insertion of a Forward-Backward Step in the Douglas-Rachford Algorithm

To apply the Douglas-Rachford algorithm with g_1 = γ‖·‖_1 and g_2 = \frac{1}{2}\|(I - P_Q)A(\cdot)\|_2^2 + i_C, we need to determine their proximal mappings. The main difficulty lies in the computation of the second one, namely \mathrm{prox}_{\kappa(\frac{1}{2}\|(I - P_Q)A(\cdot)\|_2^2 + i_C)}. As in [5], we can use a forward-backward algorithm to achieve this goal. The resulting algorithm is

Algorithm 1:
1. Set γ ∈ ]0, 2/(κ‖A‖²)], λ ∈ ]0, 1] and κ ∈ ]0, +∞[. Choose (τ_k)_{k∈ℕ} satisfying τ_k ∈ ]0, 2[ for all k ∈ ℕ and \sum_{k=0}^{\infty} \tau_k(2 - \tau_k) = +\infty.
2. Set k = 0 and y_0 = y_{-1/2} ∈ C.
3. Set x_{k,0} = y_{k-1/2}.
4. For i = 0, ..., N_k:
   a) choose γ_{k,i} ∈ [γ, 2/(κ‖A‖²)[ and λ_{k,i} ∈ [λ, 1];
   b) compute
      x_{k,i+1} = x_{k,i} + λ_{k,i} \Big( P_C\Big( \frac{x_{k,i} - \gamma_{k,i}\big(\kappa A^t (I - P_Q) A x_{k,i} - y_k\big)}{1 + \gamma_{k,i}} \Big) - x_{k,i} \Big).
5. Set y_{k+1/2} = x_{k,N_k}.
6. Set y_{k+1} = y_k + τ_k \big( \mathrm{prox}_{\kappa\gamma\|\cdot\|_1}(2 y_{k+1/2} - y_k) - y_{k+1/2} \big).
7. Increment k ← k + 1 and go to 3.

By a judicious choice of N_k, the convergence of the sequence (y_k) to a point y such that \mathrm{prox}_{\kappa(\frac{1}{2}\|(I - P_Q)A(\cdot)\|_2^2 + i_C)}(y) solves problem (2.2) follows directly by applying [5, Proposition 4.1].
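For illustration, a compact NumPy sketch of Algorithm 1 is given below. The helper callables proj_Q and proj_C, the fixed inner step size and the default parameters are assumptions made for exposition; they are not prescribed by the text above. A call such as algorithm1(A, lambda z: proj_ball(z, b, eps), proj_C, gamma=0.1, kappa=1.0, n_outer=200, n_inner=5) would reuse the ball projection sketched earlier.

```python
import numpy as np

def algorithm1(A, proj_Q, proj_C, gamma, kappa, n_outer, n_inner, tau=1.0, lam=1.0, y0=None):
    """Sketch of Algorithm 1: outer Douglas-Rachford on g1 = gamma*||.||_1 and
    g2 = (1/2)||(I - P_Q)A(.)||^2 + i_C, with prox_{kappa*g2} approximated by an
    inner forward-backward loop."""
    n = A.shape[1]
    y = np.zeros(n) if y0 is None else y0.copy()
    y_half = y.copy()                                    # plays the role of y_{-1/2}, assumed to lie in C
    step = 1.0 / (kappa * np.linalg.norm(A, 2) ** 2)     # gamma_{k,i} in ]0, 2/(kappa*||A||^2)[
    for _ in range(n_outer):
        # inner forward-backward loop: approximate prox_{kappa*g2}(y_k), warm-started at y_{k-1/2}
        x = y_half.copy()
        for _ in range(n_inner):
            grad = kappa * (A.T @ (A @ x - proj_Q(A @ x)))           # kappa * A^t (I - P_Q) A x
            x = x + lam * (proj_C((x - step * (grad - y)) / (1.0 + step)) - x)
        y_half = x
        # Douglas-Rachford update; prox_{kappa*gamma*||.||_1} is soft-thresholding
        z = 2.0 * y_half - y
        y = y + tau * (np.sign(z) * np.maximum(np.abs(z) - kappa * gamma, 0.0) - y_half)
    return y_half
```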

2.2 Insertion of a Douglas-Rachford Step in the Forward-Backward Algorithm

We now consider f_1 = κ‖·‖_1 + i_C and f_2 = \frac{1}{2}\|(I - P_Q)A(\cdot)\|_2^2 (κ here plays the role of the parameter γ in (2.2)). Since f_2 has an ‖A‖²-Lipschitz gradient, we can apply the forward-backward algorithm. This requires, however, the computation of \mathrm{prox}_{\gamma_k(\kappa\|\cdot\|_1 + i_C)}, which can be performed with Douglas-Rachford iterations. The resulting algorithm is

Algorithm 2:
1. Choose γ_k and λ_k satisfying 0 < \inf_k γ_k ≤ \sup_k γ_k < 2/‖A‖² and 0 < λ ≤ λ_k ≤ 1. Set τ ∈ ]0, 2].
2. Set k = 0 and x_0 ∈ C.
3. Set \bar{x}_k = x_k - γ_k A^t (I - P_Q) A x_k.
4. Set y_{k,0} = 2\,\mathrm{prox}_{\gamma_k\kappa\|\cdot\|_1}(\bar{x}_k) - \bar{x}_k.
5. For i = 0, 1, ..., M_k:
   a) compute y_{k,i+1/2} = P_C\big( (y_{k,i} + \bar{x}_k)/2 \big);
   b) choose τ_{k,i} ∈ [τ, 2];
   c) compute y_{k,i+1} = y_{k,i} + τ_{k,i} \big( \mathrm{prox}_{\gamma_k\kappa\|\cdot\|_1}(2 y_{k,i+1/2} - y_{k,i}) - y_{k,i+1/2} \big);
   d) if y_{k,i+1} = y_{k,i}, then go to 6.
6. Set x_{k+1} = x_k + λ_k ( y_{k,i+1/2} - x_k ).
7. Increment k ← k + 1 and go to 3.

A direct application of [5, Proposition 4.2] ensures the existence of positive integers (\bar{M}_k) such that, if M_k ≥ \bar{M}_k for all k ≥ 0, then the sequence (x_k) converges weakly to a solution of problem (2.2).
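As with Algorithm 1, a brief NumPy sketch of Algorithm 2 may help fix ideas; the callables proj_Q and proj_C, the fixed step size and the default parameters are again illustrative assumptions rather than part of the analysis above.

```python
import numpy as np

def algorithm2(A, proj_Q, proj_C, kappa, n_outer, n_inner, tau=1.0, lam=1.0, x0=None):
    """Sketch of Algorithm 2: outer forward-backward on f1 = kappa*||.||_1 + i_C and
    f2 = (1/2)||(I - P_Q)A(.)||^2, with prox_{gamma_k*f1} approximated by an
    inner Douglas-Rachford loop."""
    n = A.shape[1]
    x = np.zeros(n) if x0 is None else x0.copy()
    gam = 1.0 / np.linalg.norm(A, 2) ** 2                # gamma_k in ]0, 2/||A||^2[
    soft = lambda z, t: np.sign(z) * np.maximum(np.abs(z) - t, 0.0)
    for _ in range(n_outer):
        x_bar = x - gam * (A.T @ (A @ x - proj_Q(A @ x)))    # forward (gradient) step
        # inner Douglas-Rachford loop: approximate prox_{gam*(kappa*||.||_1 + i_C)}(x_bar)
        y = 2.0 * soft(x_bar, gam * kappa) - x_bar
        y_half = x_bar
        for _ in range(n_inner):
            y_half = proj_C((y + x_bar) / 2.0)
            y_new = y + tau * (soft(2.0 * y_half - y, gam * kappa) - y_half)
            if np.allclose(y_new, y):
                break
            y = y_new
        x = x + lam * (y_half - x)                           # relaxed backward step
    return x
```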

Remark 2.1. Other split proximal algorithms may be designed by combining the fixed-point idea, introduced in [8], for computing the proximal mapping of the composition of a convex function with a linear operator, with the analysis developed in [7] for computing the proximal mapping of the sum of two convex functions. Primal-dual algorithms considered in [2] can also be used. Note that there are often several ways to assign the functions of (2.2) to the terms used in the generic problems investigated in [2, 8, 9].

Acknowledgements : I would like to thank my team I&M (Image et Modèles) and the head of LSIS at AMU, Prof. M. Ouladsine.

References

[1] P.L. Combettes, J.-C. Pesquet, A Douglas-Rachford splitting approach to nonsmooth convex variational signal recovery, IEEE J. Selected Topics Signal Process. 1 (2007) 564-574.
[2] L. Condat, A generic proximal algorithm for convex optimization: application to total variation minimization, IEEE Signal Proc. Letters 21 (8) (2014) 985-989.
[3] M.A. Alghamdi, M. Ali Alghamdi, N. Shahzad, H.-K. Xu, Properties and iterative methods for the Q-Lasso, Abstract and Applied Analysis (2013), Article ID 250943, 8 pages.
[4] R. Tibshirani, Regression shrinkage and selection via the Lasso, Journal of the Royal Statistical Society (Series B) 58 (1) (1996) 267-288.
[5] S.S. Chen, D.L. Donoho, M.A. Saunders, Atomic decomposition by basis pursuit, SIAM Journal on Scientific Computing 20 (1) (1998) 33-61.
[6] Y. Censor, T. Elfving, A multiprojection algorithm using Bregman projections in a product space, Numerical Algorithms 8 (1994) 221-239.
[7] A. Moudafi, Computing the resolvent of composite operators, CUBO A Mathematical Journal 16 (3) (2014) 87-96.
[8] Ch.A. Micchelli, L. Shen, Y. Xu, X. Zeng, Proximity algorithms for the L1/TV image denoising model, Advances in Computational Mathematics 38 (2013) 401-426.
[9] C. Chaux, J.-C. Pesquet, N. Pustelnik, Nested iterative algorithms for convex constrained image recovery problems, SIAM Journal on Imaging Sciences 2 (2) (2009) 730-762.

(Received 8 October 2015)
(Accepted 5 September 2016)

Thai J. Math. Online @ http://thaijmath.in.cmu.ac.th