A Random block-coordinate primal-dual proximal algorithm with application to 3D mesh denoising Emilie CHOUZENOUX Laboratoire d Informatique Gaspard Monge - CNRS Univ. Paris-Est, France Horizon Maths 2014 16 December 2014 EC (UPE) IFPEN 16 Dec. 2014 1 / 29
In collaboration with J.-C. Pesquet A. Repetti EC (UPE) IFPEN 16 Dec. 2014 2 / 29
Introduction EC (UPE) IFPEN 16 Dec. 2014 3 / 29
Valued graphs V = { v (i) i {1,...,M} } set of vertices = objects v (i) V i {1,...,M} EC (UPE) IFPEN 16 Dec. 2014 4 / 29
Valued graphs V = { v (i) i {1,...,M} } set of vertices = objects v (i) V i {1,...,M} E = { e (i,j) (i,j) E } set of edges = object relationships e (i,j) E (i,j) E EC (UPE) IFPEN 16 Dec. 2014 4 / 29
Valued graphs V = { v (i) i {1,...,M} } set of vertices = objects v (i) V i {1,...,M} E = { e (i,j) (i,j) E } set of edges = object relationships e (i,j) E (i,j) E directed reflexive graph: E M 2 EC (UPE) IFPEN 16 Dec. 2014 4 / 29
Valued graphs V = { v (i) i {1,...,M} } set of vertices = objects v (i) V i {1,...,M} E = { e (i,j) (i,j) E } set of edges = object relationships e (i,j) E (i,j) E directed reflexive graph: E M 2 directed nonreflexive graph: E M(M 1) EC (UPE) IFPEN 16 Dec. 2014 4 / 29
Valued graphs V = { v (i) i {1,...,M} } set of vertices = objects v (i) V i {1,...,M} E = { e (i,j) (i,j) E } set of edges = object relationships e (i,j) E (i,j) E directed reflexive graph: E M 2 directed nonreflexive graph: E M(M 1) undirected nonreflexive graph: E M(M 1)/2 EC (UPE) IFPEN 16 Dec. 2014 4 / 29
Valued graphs V = { v (i) i {1,...,M} } set of vertices = objects v (i) V i {1,...,M} E = { e (i,j) (i,j) E } set of edges = object relationships e (i,j) E (i,j) E directed reflexive graph: E M 2 directed nonreflexive graph: E M(M 1) undirected nonreflexive graph: E M(M 1)/2 (x (i) ) 1 i M : weights on vertices (scalars or vectors) EC (UPE) IFPEN 16 Dec. 2014 4 / 29
Valued graphs V = { v (i) i {1,...,M} } set of vertices = objects v (i) V i {1,...,M} E = { e (i,j) (i,j) E } set of edges = object relationships e (i,j) E (i,j) E directed reflexive graph: E M 2 directed nonreflexive graph: E M(M 1) undirected nonreflexive graph: E M(M 1)/2 (x (i) ) 1 i M : weights on vertices (scalars or vectors) (x (i,j) ) (i,j) E : weights on edges (scalars or vectors) EC (UPE) IFPEN 16 Dec. 2014 4 / 29
Variational formulation Objective function The cost of a given choice of the weights is evaluated by Φ ( (x (i) ) 1 i M,(x (i,j) ) (i,j) E ) EC (UPE) IFPEN 16 Dec. 2014 5 / 29
Variational formulation Objective function The cost of a given choice of the weights is evaluated by Φ ( (x (i) ) 1 i M,(x (i,j) ) ) (i,j) E }{{} x where [ (x x = (i) ] ) 1 i M (x (i,j) H, ) (i,j) E H separable real Hilbert space, and Φ Γ 0 (H): class of convex lower-semicontinuous functions from H to ],+ ] with a nonempty domain. Example: scalar weights H = R N with N = M + E Problem: How to solve this very large-scale minimization problem in an efficient manner? EC (UPE) IFPEN 16 Dec. 2014 5 / 29
First trick: parallel splitting EC (UPE) IFPEN 16 Dec. 2014 6 / 29
First trick: parallel splitting Split Φ into simpler building blocks that is... EC (UPE) IFPEN 16 Dec. 2014 7 / 29
First trick: parallel splitting Split Φ into simpler building blocks that is... ( x H) Φ(x) = f(x)+h(x)+ q (g k l k )(L k x) where f Γ 0 (H) h convex, µ-lipschitz differentiable function with µ ]0, + [ ( k {1,...,q}) g k Γ 0 (G k ), G k separable real Hilbert space l k Γ 0 (G k ) ν k -strongly convex with ν k ]0,+ [ L k : H G k linear and bounded g k l k inf-convolution of g k and l k : k=1 ( v k G k ) (g k l k )(v k ) = inf g k (v v k G k )+l k(v k v k ) k g k ι {0} = g k. EC (UPE) IFPEN 16 Dec. 2014 7 / 29
First trick: parallel splitting Split Φ into simpler building blocks that is... ( x H) q Φ(x) = f(x)+h(x)+ (g k l k )(L k x) k=1 Difficulties: large-size optimization problem functions f, (g k ) 1 k q, or (l k ) 1 k q often nonsmooth (indicator functions of constraint sets, sparsity measures,...) linear operator inversions required by standard optimization methods (e.g. ADMM) difficult to perform due to the form of operators (L k ) 1 k q (e.g. weighted incidence matrices). EC (UPE) IFPEN 16 Dec. 2014 7 / 29
Second trick: primal-dual strategy EC (UPE) IFPEN 16 Dec. 2014 8 / 29
Second trick: primal-dual strategy Dualize the problem. Let H be a Hilbert space and f: H ],+ ]. The conjugate of f is f : H [,+ ] such that ( u H) f ( ) (u) = sup x u f(x). x H Adrien-Marie Legendre Werner Fenchel (1752 1833) (1905 1988) EC (UPE) IFPEN 16 Dec. 2014 9 / 29
Conjugate versus Fourier transform conjugate Fourier transform Property h(x) h (u) h(x) ĥ(ν) invariant function 1 2 x 2 1 2 u 2 e π x 2 e π ν 2 translation f(x c) f (u) + u c f(x c) j2π ν c e f(ν) c H dual translation f(x) + x c f (u c) e j2π x c f(x c) f(ν c) c H scalar ) multiplication αf(x) αf u α αf(x) α f(ν) α ]0,+ [ scaling α R ( ) f xα f (αu) ( ) f xα α f(αν) isomorphism L B(G,H) f(lx) f ( L u ) f(lx) 1 L ν ) det(l) reflection f( x) f ( u) f( x) f( ν) separability N N N N ϕ n(x (n) ) ϕ n (u(n) ) ϕ n(x (n) ) ϕ n(ν (n) ) n=1 n=1 n=1 n=1 x = (x (n) ) 1 n N u = (u (n) ) 1 n N x = (x (n) ) 1 n N ν = (ν (n) ) 1 n N isotropy ψ( x ) ψ ( u ) ψ( x ) ψ( ν ) inf-convolution (f g)(x) f (u) + g (u) (f g)(x) f(ν)ĝ(ν) /convolution = f(y)g(x y)dy H sum/product f(x) + g(x) (f g )(u) f(x)g(x) (f,g) ( Γ 0 (H) ) 2 ( f ĝ)(ν) domf domg identity element ι {0} (x) 0 δ(x) 1 of convolution identity element 0 ι {0} (u) 1 δ(ν) of addition/product EC (UPE) IFPEN 16 Dec. 2014 10 / 29
Primal-dual formulation Find an element of the set F of solutions to the primal problem minimize x H q f(x)+h(x)+ (g k l k )(L k x) k=1 and an element of the set F of solutions to the dual problem ( minimize (f h ) v 1 G 1,...,v q G q q ) L k v k + k=1 We assume that there exists x H such that 0 f(x)+ h(x)+ q ( g k (v k )+lk (v k) ). k=1 q L k ( g k l k ) ( L k x ). k=1 EC (UPE) IFPEN 16 Dec. 2014 11 / 29
Parallel proximal primal-dual algorithm Algorithm 1 for n = 0,1,... y n proxf (x W 1 n W ( q L kv k,n + h(x n ) )) k=1 x n+1 = x n +λ n (y n x n ) for k = 1,...,q ( u k,n prox U 1 k gk vk,n +U k (L k (2y n x n ) lk(v k,n )) ) v k,n+1 = v k,n +λ n (u k,n v k,n ), EC (UPE) IFPEN 16 Dec. 2014 12 / 29
Parallel proximal primal-dual algorithm Algorithm 1 where for n = 0,1,... y n proxf (x W 1 n W ( q L kv k,n + h(x n ) )) k=1 x n+1 = x n +λ n (y n x n ) for k = 1,...,q ( u k,n prox U 1 k gk vk,n +U k (L k (2y n x n ) lk(v k,n )) ) v k,n+1 = v k,n +λ n (u k,n v k,n ), W : H H strongly positive self-adjoint bounded linear operator and ( k {1,...,q}) U k : G k G k strongly positive self-adjoint bounded linear operator such that ( 1 ( q ) ) 1/2 k=1 U1/2 k L k W 1/2 2 min{ W 1 µ,( U k 1 ν k ) 1 k q } > 1 2. EC (UPE) IFPEN 16 Dec. 2014 12 / 29
Parallel proximal primal-dual algorithm Algorithm 1 for n = 0,1,... y n proxf (x W 1 n W ( q L kv k,n + h(x n ) )) k=1 x n+1 = x n +λ n (y n x n ) for k = 1,...,q ( u k,n prox U 1 k gk vk,n +U k (L k (2y n x n ) lk(v k,n )) ) v k,n+1 = v k,n +λ n (u k,n v k,n ), where prox W 1 f proximity operator of f in (H, W 1) ( x H) proxf W 1 (x) = argmin y H f(y)+ 1 2 y x 2 W 1, prox U 1 k g proximity operator of g k k in (G k, U 1) k ( n N) λn ]0,1] such that inf n N λ n > 0. EC (UPE) IFPEN 16 Dec. 2014 12 / 29
Parallel proximal primal-dual algorithms Advantages: No linear operator inversion. Use of proximable or/and differentiable functions. Use of preconditioning linear operators. EC (UPE) IFPEN 16 Dec. 2014 13 / 29
Parallel proximal primal-dual algorithms Advantages: No linear operator inversion. Use of proximable or/and differentiable functions. Use of preconditioning linear operators. Bibliographical remarks: methods based on Forward-Backward iteration type I: [Vũ,2013][Condat,2013] (extensions of [Esser et al.,2010][chambolle,pock,2011]) type II: [Combettes et al.,2014] (extensions of [Loris,Verhoeven,2011][Chen et al.,2014]) methods based on Forward-Backward-Forward iteration [Combettes,Pesquet,2012] projection based methods [Alotaibi et al.,2013]... EC (UPE) IFPEN 16 Dec. 2014 13 / 29
Third trick: block-coordinate strategy EC (UPE) IFPEN 16 Dec. 2014 14 / 29
Third trick: block-coordinate strategy Split variable: x = (x 1,...,x p ) H 1 H p = H where H 1,...,H p separable real Hilbert spaces. At each iteration n, update only a subset of components. ( Gauss-Seidel). EC (UPE) IFPEN 16 Dec. 2014 15 / 29
Third trick: block-coordinate strategy Split variable: x = (x 1,...,x p ) H 1 H p = H where H 1,...,H p separable real Hilbert spaces. At each iteration n, update only a subset of components. ( Gauss-Seidel). Advantage: reduced complexity and memory requirements per iteration Useful for large-scale optimization Assumptions: f(x) = p j=1 f j(x j ), h(x) = p j=1 h j(x j ) where ( j {1,...,p}) f j Γ 0 (H j ), h j convex µ j -Lipschitz differentiable with µ j ]0,+ [. In addition, for every k {1,...,q}, L k x = p j=1 L k,jx j, where L k,j : H j G k linear and bounded. EC (UPE) IFPEN 16 Dec. 2014 15 / 29
Primal-dual variational formulation Find an element of the set F of solutions to the primal problem minimize x 1 H 1,...,x p H p p ( fj (x j )+h j (x j ) ) + j=1 q ( p ) (g k l k ) L k,j x j k=1 j=1 and an element of the set F of solutions to the dual problem minimize v 1 G 1,...,v q G q p ( (fj h j) j=1 q ) L k,j v k + k=1 q ( g k (v k )+lk (v k) ). We assume that there exists (x 1,...,x p ) H 1 H p such that ( j {1,...,p}) 0 f j (x j )+ h j (x j ) q ( p ) + L k,j ( g k l k ) L k,j x j. k=1 k=1 j =1 EC (UPE) IFPEN 16 Dec. 2014 16 / 29
Random block-coordinate primal-dual algorithm Algorithm 2 for n = 0,1,... for j = 1,...,p ( ( y j,n = ε j,n prox W 1 j f xj,n j W j ( L k,jv k,n + h j (x j,n )+c j,n ) ) k L ) j +a j,n x j,n+1 = x j,n +λ n ε j,n (y j,n x j,n ) for k = 1,...,q ( ( u k,n = ε p+k,n prox U 1 k g vk,n +U k ( L k,j (2y j,n x j,n ) lk(v k,n ) k j L k +d k,n ) ) ) +b k,n where v k,n+1 = v k,n +λ n ε p+k,n (u k,n v k,n ), (εn ) n N identically distributed D-valued random variables with D = {0,1} p+q {0} binary variables signaling the blocks to be activated EC (UPE) IFPEN 16 Dec. 2014 17 / 29
Random block-coordinate primal-dual algorithm Algorithm 2 for n = 0,1,... for j = 1,...,p ( ( y j,n = ε j,n prox W 1 j f xj,n j W j ( L k,jv k,n + h j (x j,n )+c j,n ) ) k L ) j +a j,n x j,n+1 = x j,n +λ n ε j,n (y j,n x j,n ) for k = 1,...,q ( ( u k,n = ε p+k,n prox U 1 k g vk,n +U k ( L k,j (2y j,n x j,n ) lk(v k,n ) k j L k +d k,n ) ) ) +b k,n where v k,n+1 = v k,n +λ n ε p+k,n (u k,n v k,n ), x0, (a n ) n N, and (c n ) n N H-valued random variables, v 0, (b n ) n N, and (d n ) n N G-valued random variables with G = G 1 G q (a n ) n N, (b n ) n N, (c n ) n N, and (d n ) n N : error terms EC (UPE) IFPEN 16 Dec. 2014 17 / 29
Random block-coordinate primal-dual algorithm Algorithm 2 for n = 0,1,... for j = 1,...,p ( y j,n ε j,n prox W 1 j f xj,n j W j ( L k,jv k,n + h j (x j,n )) ) k L j x j,n+1 = x j,n +λ n ε j,n (y j,n x j,n ) for k = 1,...,q u k,n ε p+k,n prox U 1( k gk vk,n +U k ( L k,j (2y j,n x j,n ) l ) k(v k,n )) j L k v k,n+1 = v k,n +λ n ε p+k,n (u k,n v k,n ), where ( j {1,...,p}) Wj : H j H j strongly positive self-adjoint bounded operator and ( k {1,...,q}) U k : G k G k strongly positive self-adjoint bounded operator such that ( 1 ( p j=1 ) ) 1/2 q k=1 U1/2 k L k,j W 1/2 j 2 min{( W j 1 µ j ) 1 j p,( U k 1 ν k ) 1 k q } > 1 2. EC (UPE) IFPEN 16 Dec. 2014 17 / 29
Random block-coordinate primal-dual algorithm Algorithm 2 for n = 0,1,... for j = 1,...,p ( y j,n ε j,n prox W 1 j f xj,n j W j ( L k,jv k,n + h j (x j,n )) ) k L j x j,n+1 = x j,n +λ n ε j,n (y j,n x j,n ) for k = 1,...,q u k,n ε p+k,n prox U 1( k gk vk,n +U k ( L k,j (2y j,n x j,n ) l ) k(v k,n )) j L k v k,n+1 = v k,n +λ n ε p+k,n (u k,n v k,n ), where ( k {1,...,q}) Lk = { j {1,...,p} L k,j 0 }, ( j {1,...,p}) L j = { k {1,...,q} L k,j 0 }, prox W 1 j f j prox U 1 k gk proximity operator of f j in (H j, W 1), j proximity operator of gk in (G k, U 1) k EC (UPE) IFPEN 16 Dec. 2014 17 / 29
Random block-coordinate primal-dual algorithm Theorem Set ( n N) X n = σ(x n,v n ) 0 n n and E n = σ(ε n ). Assume that 1 E( an 2 X n ) < +, n N E( bn 2 X n ) < +, n N E( cn 2 X n ) < +, and n N E( dn 2 X n ) < + P-a.s. 2 For every n N, E n and X n are independent, and ( k {1,...,q}) P[ε p+k,0 = 1] > 0. 3 For every j {1,...,p} and n N, { ω Ω ε p+k,n (ω) = 1 } { ω Ω ε j,n (ω) = 1 }. k L j Then, (x n ) n N converges weakly P-a.s. to an F-valued random variable, and (v n ) n N converges weakly P-a.s. to an F -valued random variable. EC (UPE) IFPEN 16 Dec. 2014 18 / 29
Application to 3D mesh denoising EC (UPE) IFPEN 16 Dec. 2014 19 / 29
Mesh denoising problem Undirected nonreflexive graph V: set of vertices of the mesh E: set of edges of the mesh x = (x (i) ) 1 i M where, for every i {1,...,M}, x (i) R 3 are 3D coordinates of the i-th vertex of the object H = R 3M. Cost function: Φ(x) = M ι Cj (x (j) )+ψ j (x (j) z (j) )+η j (x (j) x (i) ) i Nj 1,2 j=1 where, for every j {1,...,M}, Cj nonempty convex subset of R 3 ψj : R 3 R convex, Lipschitz differentiable function z (j) : 3D measured coordinates of the j-th vertex Nj : neighborhood of j-th vertex (ηj ) 1 j M : nonnegative regularization constants. EC (UPE) IFPEN 16 Dec. 2014 20 / 29
Mesh denoising problem Implementation details: a block a vertex p = M ( j {1,...,M}) fj = ι Cj where C j : box constraint ( j {1,...,M}) hj = ψ j ( z j ) l 2 -l 1 Huber function robust data fidelity measure q = M and ( k {1,...,M}) ( x H) g k (L k x) = (x (k) x (i) ) i Nk 1,2 ( k {1,...,M}) lk = ι {0}. Simulation scenario: E = E1 E 2 with E 1 E 2. additive independent noise with distribution E 1 N(0,σ1), 2 E 2 πn(0,σ2)+(1 π)n(0,(σ 2 2) 2 ), π (0,1). probability of variable activation { p if j E 1 ( j {1,...,M})( n N) P(ε j,n = 1) = 1 otherwise. EC (UPE) IFPEN 16 Dec. 2014 21 / 29
Simulation results Original mesh, Noisy mesh, σ 1 = 10 3, π = 0.98, E 1 = 6844, E 2 = 1327. (σ 2,σ 2 ) = (5 10 3,1.5 10 2 ), MSE = 5.45 10 6. EC (UPE) IFPEN 16 Dec. 2014 22 / 29
Simulation results Proposed reconstruction, Laplacian smoothing, MSE = 8.89 10 7. MSE = 1.29 10 6. EC (UPE) IFPEN 16 Dec. 2014 23 / 29
Complexity 85 80 75 70 C(p) 65 60 55 50 45 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 p EC (UPE) IFPEN 16 Dec. 2014 24 / 29
Simulation results Original mesh, Noisy mesh, σ1 = 5 10 4, π = 0.95, E1 = 18492, E2 = 4506. (σ2, σ2 ) = (2 10 3, 4 10 3 ), MSE = 1.08 10 6. EC (UPE) IFPEN 16 Dec. 2014 25 / 29
Simulation results Proposed reconstruction, Laplacian smoothing, MSE = 2.19 10 7. MSE = 2.95 10 7. EC (UPE) IFPEN 16 Dec. 2014 26 / 29
Complexity 55 50 C(p) 45 40 35 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 p EC (UPE) IFPEN 16 Dec. 2014 27 / 29
Conclusion No linear operator inversion. Flexibility in the random activation of primal/dual components. Possibility to address other problems than denoising [Couprie et al.,2013] Fourth trick: EC (UPE) IFPEN 16 Dec. 2014 28 / 29
Conclusion No linear operator inversion. Flexibility in the random activation of primal/dual components. Possibility to address other problems than denoising [Couprie et al.,2013] Fourth trick:... employ asynchronous distributed strategies. EC (UPE) IFPEN 16 Dec. 2014 28 / 29
Some references P. L. Combettes and J.-C. Pesquet Proximal splitting methods in signal processing in Fixed-Point Algorithms for Inverse Problems in Science and Engineering, H. H. Bauschke, R. Burachik, P. L. Combettes, V. Elser, D. R. Luke, and H. Wolkowicz editors. Springer-Verlag, New York, pp. 185-212, 2011. C. Couprie, L. Grady, L. Najman, J.-C. Pesquet, and H. Talbot Dual constrained TV-based regularization on graphs SIAM Journal on Imaging Sciences, vol. 6, no 3, pp. 1246-1273, 2013. P. L. Combettes, L. Condat, J.-C. Pesquet, and B. C. Vũ A forward-backward view of some primal-dual optimization methods in image recovery IEEE International Conference on Image Processing (ICIP 2014), 5 p., Paris, France, Oct. 27-30, 2014. P. Combettes and J.-C Pesquet Stochastic quasi-fejér block-coordinate fixed point iterations with random sweeping 2014, http://arxiv.org/abs/1404.7536. N. Komodakis and J.-C. Pesquet Playing with duality: An overview of recent primal-dual approaches for solving large-scale optimization problems to appear in Signal Processing Magazine, 2014. J.-C. Pesquet and A. Repetti A class of randomized primal-dual algorithms for distributed optimization to appear in Journal of Nonlinear and Convex Analysis, 2014. A. Repetti, E. Chouzenoux and J.-C. Pesquet A random block-coordinate primal-dual proximal algorithm with application to 3D mesh denoising submitted to IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2015). EC (UPE) IFPEN 16 Dec. 2014 29 / 29