Monotone Operator Splitting Methods in Signal and Image Recovery

1 Monotone Operator Splitting Methods in Signal and Image Recovery. P.L. Combettes¹, J.-C. Pesquet², and N. Pustelnik³. ¹ Univ. Pierre et Marie Curie, Paris 6, LJLL CNRS UMR. ² Univ. Paris-Est, LIGM CNRS UMR. ³ ENS Lyon, Laboratoire de Physique CNRS UMR 5672. IEEE ICASSP Tutorial, May 2014.

2 Monotone operators and inverse problems [Microscopy, ISBI Challenge 2013, F. Soulez]. (Figure: original image vs. degraded image.) Original image $\overline{x}\in\mathbb{R}^N$; degraded image $z = P_\alpha(H\overline{x})\in\mathbb{R}^M$. $H\in\mathbb{R}^{M\times N}$: matrix associated with the degradation operator. $P_\alpha\colon\mathbb{R}^M\to\mathbb{R}^M$: noise degradation with parameter $\alpha$ (e.g. Poisson noise). Goal: find a good estimate of $\overline{x}$ from the observations $z$, using some a priori knowledge on $H$ and on the noise statistics.

4 Monotone operators and inverse problems. Inverse problem: find an estimate $\widehat{x}$ close to $\overline{x}$ from the observations $z = P_\alpha(H\overline{x})$. Inverse filtering (if $M = N$ and $H$ is invertible): $\widehat{x} = H^{-1}z = H^{-1}(H\overline{x}+b)$ if $b\in\mathbb{R}^M$ is an additive noise, i.e. $\widehat{x} = \overline{x}+H^{-1}b$. Closed-form expression, but amplification of the noise if $H$ is ill-conditioned (ill-posed problem).

5 Monotone operators and inverse problems. Inverse problem: find an estimate $\widehat{x}$ close to $\overline{x}$ from the observations $z = P_\alpha(H\overline{x})$. Inverse filtering (if $M \ge N$ and $\operatorname{rank}H = N$): $\widehat{x} = (H^\top H)^{-1}H^\top z = (H^\top H)^{-1}H^\top(H\overline{x}+b)$ if $b\in\mathbb{R}^M$ is an additive noise, i.e. $\widehat{x} = \overline{x}+(H^\top H)^{-1}H^\top b$. Closed-form expression, but amplification of the noise if $H$ is ill-conditioned (ill-posed problem).
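To make the amplification phenomenon concrete, here is a minimal numerical sketch (Python/NumPy; the Gaussian blur matrix and noise level are illustrative choices, not taken from the tutorial):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 50
i = np.arange(N)
H = np.exp(-0.5 * ((i[:, None] - i[None, :]) / 2.0) ** 2)  # Gaussian blur matrix
H /= H.sum(axis=1, keepdims=True)                          # row-normalized

x_bar = np.zeros(N)
x_bar[[10, 25, 40]] = 1.0                  # "original image": three spikes
b = 1e-3 * rng.normal(size=N)              # small additive noise
z = H @ x_bar + b                          # observations

x_hat = np.linalg.solve(H, z)              # inverse filtering: x_hat = x_bar + H^{-1} b
print(f"cond(H)           = {np.linalg.cond(H):.1e}")
print(f"||b||             = {np.linalg.norm(b):.1e}")
print(f"||x_hat - x_bar|| = {np.linalg.norm(x_hat - x_bar):.1e}  (amplified noise)")
```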

6 Monotone operators and inverse problems. Inverse problem: find an estimate $\widehat{x}$ close to $\overline{x}$ from the observations $z = P_\alpha(H\overline{x})$. Variational approach: $\widehat{x}\in\operatorname{Argmin}_{x\in\mathbb{R}^N} f_1(x)+f_2(x)$, where $f_1$ is a data fidelity term (e.g. $\|Hx-z\|^2$) and $f_2$ a regularization term (e.g. $\lambda\|x\|_p^p$ with $p\ge 1$ and $\lambda\in\,]0,+\infty[$). Often no closed-form expression (e.g. if $p\neq 2$ and $H\neq\mathrm{Id}$), or the solution is expensive to compute (e.g. if $p=2$, $H\neq\mathrm{Id}$ and $N\gg 1$) $\Rightarrow$ iterative strategy.

7 Monotone operators and inverse problems. Inverse problem: find an estimate $\widehat{x}$ close to $\overline{x}$ from the observations $z = P_\alpha(H\overline{x})$. Variational approach (more general context): $\widehat{x}\in\operatorname{Argmin}_{x\in\mathbb{R}^N}\sum_{i=1}^m f_i(x)$, where each $f_i$ may denote a data fidelity term, a (hybrid) regularization term, or a constraint $\Rightarrow$ iterative strategy.

8 Monotone operators and inverse problems. Iterative strategy = optimization algorithm: construct a sequence $(x_n)_{n\in\mathbb{N}}$ that converges to $\widehat{x}\in\operatorname{Argmin}_{x\in\mathbb{R}^N}\sum_{i=1}^m f_i(x)$, of the form $(\forall n\in\mathbb{N})\ x_{n+1}=Tx_n$, where $T$ denotes an operator from $\mathbb{R}^N$ to $\mathbb{R}^N$. How can we build $T$ from the functionals $(f_i)_{1\le i\le m}$ involved in the minimization problem? Which properties are required of $T$ in order to ensure the convergence of $(x_n)_{n\in\mathbb{N}}$ to $\widehat{x}$?

9 Naive answer. Fixed point theorem (É. Picard, 1856–1941): if $\overline{x}$ is a fixed point of $T$, i.e. $\overline{x}=T\overline{x}$, and $T$ is a strict contraction, i.e. there exists $\rho\in[0,1[$ such that $(\forall (x,x')\in\mathbb{R}^N\times\mathbb{R}^N)\ \|Tx-Tx'\|\le\rho\|x-x'\|$, then $(x_n)_{n\in\mathbb{N}}$ converges to $\overline{x}$. Proof: for all $n\in\mathbb{N}$, $\|x_{n+1}-\overline{x}\| = \|Tx_n-T\overline{x}\| \le \rho\|x_n-\overline{x}\|$. Consequently, $\|x_n-\overline{x}\| \le \rho^n\|x_0-\overline{x}\|$. Hence we have proved that $(x_n)_{n\in\mathbb{N}}$ converges linearly to $\overline{x}$.
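A quick sketch of the theorem in action (Python/NumPy; the affine contraction and $\rho = 1/2$ are illustrative): the error to the fixed point is divided by $\rho$ at every iteration, i.e. linear convergence.

```python
import numpy as np

rho, c = 0.5, np.array([1.0, -2.0])
T = lambda x: rho * x + c            # strict contraction with constant rho
x_fix = c / (1 - rho)                # unique fixed point: x_fix = T(x_fix)

x = np.zeros(2)
for n in range(8):
    print(n, np.linalg.norm(x - x_fix))  # errors decay like rho^n
    x = T(x)                             # x_{n+1} = T x_n
```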

10 Why do we need to go further? Limitations: it is difficult (sometimes even impossible) to build a strictly contractive operator $T$. One may prefer iterations built as $(\forall n\in\mathbb{N})\ x_{n+1}=T_nx_n$, where $T_n$ denotes an operator from $\mathbb{R}^N$ to $\mathbb{R}^N$. It is often intricate to build $T_n$ directly, while it may be easier to write $T_n$ as a composition of simpler operators ("splitting techniques"). $T_n$ can be multivalued, i.e. $(\forall n\in\mathbb{N})\ x_{n+1}\in T_nx_n$.

11 Tutorial philosophy. Provide a modern vision of convex optimization in order to deal with nonsmooth problems (sparsity), possibly non-finite functions, monotone operators, ... Provide a powerful framework to capture many convex optimization algorithms (forward–backward, Douglas–Rachford, ADMM, ...) in a unifying form. Introduce the technical literature on this topic, which deals with infinite-dimensional Hilbert spaces... even if most signal/image processing applications are in finite dimension.

12 Tutorial philosophy. Illustrate the performance of the algorithms on inverse problem examples (without giving too many details). Do not explore all the applications of monotone operators (denoising, restoration, reconstruction, machine learning, resource allocation, networking, communications, ...). Focus on the convergence of the iterates $(x_n)_{n\in\mathbb{N}}$ rather than on the convergence of the criterion $\big(\sum_{i=1}^m f_i(x_n)\big)_{n\in\mathbb{N}}$.

15 A pioneer: Jean-Jacques Moreau (1923–2014).

16 Reference book: H.H. Bauschke and P.L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces, Springer, 2011.

17 Outline. 1. Background on monotone and maximally monotone operators: inversion, subdifferential, conjugate of a convex function. 2. Nonexpansive operators: taxonomy, resolvent, and proximity operator. 3. Search for a zero of a maximally monotone operator: fixed points, Fejér monotonicity, Douglas–Rachford, forward–backward. 4. Duality: main theorems, ADMM, primal–dual methods.

18 Part 1: Background. 1. Monotone operators: definition, properties, basic operations, inversion, maximality, usefulness for convex optimization (subdifferential). 2. Maximally monotone operators: properties, basic operations, inversion, usefulness of inversion for convex optimization.

19 Hilbert spaces. A (real) Hilbert space $H$ is a complete real vector space endowed with an inner product $\langle\cdot\mid\cdot\rangle$ whose associated norm is $(\forall x\in H)\ \|x\| = \sqrt{\langle x\mid x\rangle}$. Particular case: $H=\mathbb{R}^N$ (Euclidean space of dimension $N$). $2^H$ is the power set of $H$, i.e. the family of all subsets of $H$.

21 Hilbert spaces. Let $H$ and $G$ be two Hilbert spaces. A linear operator $L\colon H\to G$ is bounded (or continuous) if $\|L\| = \sup_{\|x\|\le 1}\|Lx\| < +\infty$. In finite dimension, every linear operator is bounded. $B(H,G)$: Banach space of bounded linear operators from $H$ to $G$.

25 Hilbert spaces. Let $H$ and $G$ be two Hilbert spaces and let $L\in B(H,G)$. Its adjoint $L^*$ is the operator in $B(G,H)$ defined by $(\forall (x,y)\in H\times G)\ \langle y\mid Lx\rangle_G = \langle L^*y\mid x\rangle_H$, i.e. $\langle Lx\mid y\rangle = \langle x\mid L^*y\rangle$.

28 Hilbert spaces. Example: if $L\colon H\to H^n\colon y\mapsto (y,\dots,y)$, then $L^*\colon H^n\to H\colon x=(x_1,\dots,x_n)\mapsto \sum_{i=1}^n x_i$. Proof: $\langle Ly\mid x\rangle = \langle (y,\dots,y)\mid (x_1,\dots,x_n)\rangle = \sum_{i=1}^n\langle y\mid x_i\rangle = \big\langle y\mid \sum_{i=1}^n x_i\big\rangle$.
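A small numeric sanity check of this adjoint computation (Python/NumPy; the dimensions are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 3, 4                                    # n copies of vectors in R^d
y = rng.normal(size=d)
x = rng.normal(size=(n, d))                    # x = (x_1, ..., x_n) in H^n

lhs = sum(y @ x[i] for i in range(n))          # <Ly | x> = sum_i <y | x_i>
rhs = y @ x.sum(axis=0)                        # <y | L* x> with L* x = sum_i x_i
print(np.isclose(lhs, rhs))                    # True
```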

29 Hilbert spaces. We have $(L^*)^* = L$. If $L$ is bijective (i.e. an isomorphism), then $L^{-1}\in B(G,H)$ and $(L^{-1})^* = (L^*)^{-1}$. If $H=\mathbb{R}^N$ and $G=\mathbb{R}^M$, then $L^* = L^\top$.

30 Hilbert spaces. Let $H$ be a Hilbert space and $L\in B(H,H)$. $L$ is self-adjoint if $L^*=L$. $L$ is positive if $(\forall x\in H)\ \langle x\mid Lx\rangle\ge 0$. $L$ is strictly positive if $L$ is positive and if $(\forall x\in H)\ \langle x\mid Lx\rangle = 0 \Rightarrow x=0$.

31 Mappings versus multivalued operators. Let $H$ be a real Hilbert space. $A$ is an $H$-valued mapping defined on $D\subset H$ if $A\colon D\to H\colon x\mapsto A(x)$. Example: $A\colon\,]0,+\infty[\,\to\mathbb{R}\colon x\mapsto \ln x$ (figure: graph of $\ln x$).

32 Mappings versus multivalued operators. $A$ is a (multivalued) operator if $A\colon H\to 2^H\colon x\mapsto Ax\subset H$. Example: $A\colon\mathbb{R}\to 2^{\mathbb{R}}\colon x\mapsto \{x\}$ if $x>0$, $[0,1]$ if $x=0$, $\varnothing$ if $x<0$ (figure: graph of $A$).

33 Graph. Let $H$ be a real Hilbert space and $A\colon H\to 2^H$. The graph of $A$ is $\operatorname{gra}A = \{(x,u)\in H^2 \mid u\in Ax\}$. (Figures: four graph examples — single-valued, multivalued, multivalued, single-valued.)

35 Monotone operator: definition. Let $H$ be a real Hilbert space and $A\colon H\to 2^H$. $A$ is monotone if $(\forall (x_1,u_1)\in\operatorname{gra}A)(\forall (x_2,u_2)\in\operatorname{gra}A)\ \langle u_1-u_2\mid x_1-x_2\rangle\ge 0$. (Figures: which of the four example graphs are monotone operators?)

37 Monotone operator: example. Let $H$ be a real Hilbert space and $A\in B(H,H)$. $A$ is monotone $\Leftrightarrow$ $A$ is positive; $A$ monotone $\Leftrightarrow$ $A+A^*$ monotone. Proof: $A$ monotone $\Leftrightarrow (\forall (x_1,x_2)\in H^2)\ \langle x_1-x_2\mid Ax_1-Ax_2\rangle\ge 0 \Leftrightarrow (\forall x\in H)\ 2\langle x\mid Ax\rangle\ge 0 \Leftrightarrow (\forall x\in H)\ \langle x\mid Ax\rangle + \langle A^*x\mid x\rangle\ge 0 \Leftrightarrow (\forall x\in H)\ \langle x\mid (A+A^*)x\rangle\ge 0 \Leftrightarrow A+A^*$ monotone. For $A\in B(H,H)$ to be monotone, $A$ is not required to be self-adjoint. Example: $A\in B(H,H)$ skew (i.e. $A^* = -A$) is monotone.

40 Domain. Let $H$ be a real Hilbert space and $A\colon H\to 2^H$. The domain of $A$ is $\operatorname{dom}A = \{x\in H\mid Ax\neq\varnothing\}$. (Figures: which domain?) Let $C\subset H$. If $\operatorname{dom}A = C$ and, for every $x\in C$, $Ax$ is a singleton, we view $A$ as a mapping from $C$ to $H$.

43 Monotone operator: properties. Let $H$ and $G$ be two Hilbert spaces, and let $A\colon H\to 2^H$ and $B\colon G\to 2^G$ be two monotone operators. The following operators are monotone: $x\mapsto y+\gamma\rho A(\rho x+z) = \{y+\gamma\rho u\mid u\in A(\rho x+z)\}$, where $(y,z)\in H^2$, $\gamma\in[0,+\infty[$ and $\rho\in\mathbb{R}$; $A\times B\colon H\times G\to 2^{H\times G}\colon (x,y)\mapsto Ax\times By$; $A+B\colon x\mapsto\{u+v\mid u\in Ax,\ v\in Bx\}$ if $G=H$; $L^*BL\colon x\mapsto\{L^*v\mid v\in B(Lx)\}$ if $L\in B(H,G)$.

44 Monotone operator: inversion. Let $H$ be a Hilbert space and $A\colon H\to 2^H$. $A^{-1}$ is the operator from $H$ to $2^H$ whose graph is $\operatorname{gra}(A^{-1}) = \{(u,x)\mid (x,u)\in\operatorname{gra}A\}$. (Figures: graph of $A$ and graph of $A^{-1}$, obtained by reflecting $\operatorname{gra}A$ across the diagonal.) If $A$ is monotone, then $A^{-1}$ is monotone.

47 Maximally monotone operator: definition. Let $H$ be a Hilbert space and $A\colon H\to 2^H$. $A$ is maximally monotone if $A$ is monotone and if there exists no monotone operator $B\colon H\to 2^H$ (different from $A$) such that $\operatorname{gra}B$ properly contains $\operatorname{gra}A$. (Figures: which of the example graphs are maximally monotone operators?)

50 Maximally monotone operator: second definition. $A\colon H\to 2^H$ is maximally monotone if one of the following equivalent conditions is satisfied: (i) $A$ is monotone and there exists no monotone operator $B\colon H\to 2^H$ such that $\operatorname{gra}B$ properly contains $\operatorname{gra}A$; (ii) for every $(x_1,u_1)\in H^2$, $(x_1,u_1)\in\operatorname{gra}A \Leftrightarrow (\forall (x_2,u_2)\in\operatorname{gra}A)\ \langle x_1-x_2\mid u_1-u_2\rangle\ge 0$.

51 Continuous functions. Let $H$ be a Hilbert space and let $A\colon H\to H$ be monotone and continuous. Then $A$ is maximally monotone. Example: if $L\in B(H,H)$ is positive, then $L$ is maximally monotone.

52 Maximally monotone operators: usefulness in convex optimization.

55 Convex analysis: definitions. Let $f\colon H\to\,]-\infty,+\infty]$, where $H$ is a Hilbert space. The domain of $f$ is $\operatorname{dom}f = \{x\in H\mid f(x)<+\infty\}$. The function $f$ is proper if $\operatorname{dom}f\neq\varnothing$. (Figures: domains of two example functions — $\operatorname{dom}f = \mathbb{R}$ (proper) and $\operatorname{dom}f = \,]0,\delta]$ (proper).)

58 Convex analysis: definitions. $C\subset H$ is a convex set if $(\forall (x,y)\in C^2)(\forall\alpha\in\,]0,1[)\ \alpha x+(1-\alpha)y\in C$. (Figures: which of the three example sets are convex?)

60 Convex analysis: definitions. $f\colon H\to\,]-\infty,+\infty]$ is a convex function if $(\forall (x,y)\in H^2)(\forall\alpha\in\,]0,1[)\ f(\alpha x+(1-\alpha)y)\le\alpha f(x)+(1-\alpha)f(y)$. (Figures: three example functions, among them $f = \iota_{[-\delta,\delta]}$, equal to $0$ on $[-\delta,\delta]$ and $+\infty$ otherwise; which ones are convex?)

63 Convex analysis: definitions. $f\colon H\to\,]-\infty,+\infty]$ is convex $\Leftrightarrow$ the epigraph of $f$, i.e. $\operatorname{epi}f = \{(x,\zeta)\in\operatorname{dom}f\times\mathbb{R}\mid f(x)\le\zeta\}$, is convex. (Figures: epigraphs of the three example functions.) $f\colon H\to[-\infty,+\infty[$ is concave if $-f$ is convex.

67 Convex analysis: definitions. Let $f\colon H\to\,]-\infty,+\infty]$. $f$ is a lower semi-continuous (l.s.c.) function on $H$ if $\operatorname{epi}f$ is closed. (Figures: which of the example functions are l.s.c.?)

72 Convex analysis: definitions/properties. $\Gamma_0(H)$: class of convex, l.s.c., and proper functions from $H$ to $]-\infty,+\infty]$. Every continuous function on $H$ is l.s.c. Every finite sum of l.s.c. (convex) functions is l.s.c. (convex). Let $(f_i)_{i\in I}$ be a family of l.s.c. (convex) functions: $\sup_{i\in I}f_i$ is l.s.c. (convex).

73 Subdifferential of a convex function: definition. Let $f\colon H\to\,]-\infty,+\infty]$ be a proper function. The (Moreau) subdifferential of $f$, denoted by $\partial f$, is such that $\partial f\colon H\to 2^H\colon x\mapsto\{u\in H\mid (\forall y\in H)\ \langle y-x\mid u\rangle + f(x)\le f(y)\}$. (Figures: the affine minorant $y\mapsto f(x)+\langle y-x\mid u\rangle$ touches the graph of $f$ at $x$, shown for various points $x$ and subgradients $u$.)

83 Subdifferential of a convex function: properties. Fermat's rule: $0\in\partial f(x) \Leftrightarrow (\forall y\in H)\ \langle y-x\mid 0\rangle + f(x)\le f(y) \Leftrightarrow x\in\operatorname{Argmin}f$.

84 Subdifferential of a convex function: properties. $u\in\partial f(x)$ is a subgradient of $f$ at $x$. If $x\notin\operatorname{dom}f$, then $\partial f(x)=\varnothing$. For every $x\in\operatorname{dom}f$, $\partial f(x)$ is a closed and convex set.

87 Subdifferential of a convex function: properties. $\partial f$ is a monotone operator. Proof: let $u_1\in\partial f(x_1)$ and $u_2\in\partial f(x_2)$. By the subdifferential definition, $\langle x_2-x_1\mid u_1\rangle + f(x_1)\le f(x_2)$ and $\langle x_1-x_2\mid u_2\rangle + f(x_2)\le f(x_1)$; adding these inequalities yields $\langle x_1-x_2\mid u_1-u_2\rangle\ge 0$.

91 Subdifferential of a convex function: properties. The subdifferential of a convex and proper function is monotone; if $f$ is Gâteaux differentiable at $x$, then $\partial f(x)=\{\nabla f(x)\}$; but it is not necessarily maximally monotone. Counterexample: for every $x\in\mathbb{R}$, let $f(x)=x$ if $x>0$, $+\infty$ otherwise, and $g(x)=x$. Then $\partial f(x)=\{1\}$ if $x>0$, $\varnothing$ otherwise, and $\partial g(x)=\{1\}$. Consequently, $\operatorname{gra}\partial f = \,]0,+\infty[\,\times\{1\} \subsetneq \mathbb{R}\times\{1\} = \operatorname{gra}\partial g$. (Figure: graphs of $f$, $g$ and their subdifferentials.)

93 Subdifferential of a convex function: properties. The subdifferential of a convex, proper and l.s.c. function is maximally monotone. Example: for every $x\in\mathbb{R}$, let $h(x)=x$ if $x\ge 0$, $+\infty$ otherwise (the l.s.c. version of the previous counterexample's $f$), and $g(x)=x$. Then $\partial h(x)=\{1\}$ if $x>0$, $]-\infty,1]$ if $x=0$, $\varnothing$ otherwise, and $\partial g(x)=\{1\}$. Consequently, $\operatorname{gra}\partial h\supsetneq\operatorname{gra}\partial f$: lower semicontinuity fills in the graph at $0$ and restores maximality. (Figure: graphs of $g$, $h$ and their subdifferentials.) If $H=\mathbb{R}$, there is equivalence between both properties.

96 Subdifferential of a convex function: example. Let $C\subset H$. The indicator function of $C$ is $(\forall x\in H)\ \iota_C(x) = 0$ if $x\in C$, $+\infty$ otherwise. Example: $C=[\delta_1,\delta_2]$ (figure: graph of $\iota_{[\delta_1,\delta_2]}$).

97 Subdifferential of a convex function: example. For every $x\in H$, $\partial\iota_C(x)$ is the normal cone to $C$ at $x$, defined by $N_C(x) = \{u\in H\mid (\forall y\in C)\ \langle u\mid y-x\rangle\le 0\}$ if $x\in C$, $\varnothing$ otherwise. (Figure: normal cones at two boundary points of a convex set $C$.) If $x\in\operatorname{int}C$, then $N_C(x)=\{0\}$. If $C$ is a vector space, then, for every $x\in C$, $N_C(x)=C^\perp$.

99 Maximally monotone operators: properties.

100 Maximally monotone operator: properties. Let $A\colon H\to 2^H$ be a maximally monotone operator. For every $x\in H$, $Ax$ is closed and convex. Proof: $Ax = \bigcap_{(x',u')\in\operatorname{gra}A}\{u\in H\mid \langle x-x'\mid u-u'\rangle\ge 0\}$; consequently, $Ax$ is an intersection of closed convex sets.

101 Maximally monotone operator: properties. Let $A\colon H\to 2^H$ and $B\colon G\to 2^G$ be two maximally monotone operators. The following operators are maximally monotone: $y+\gamma\rho A(\rho\,\cdot+z)$, where $(y,z)\in H^2$, $\gamma\in[0,+\infty[$ and $\rho\in\mathbb{R}$; $A\times B$; $A^{-1}$.

102 Inverse of a maximally monotone operator: usefulness?

104 Conjugate: definition. Let $H$ be a Hilbert space and $f\colon H\to\,]-\infty,+\infty]$. The conjugate of $f$ is the function $f^*\colon H\to[-\infty,+\infty]$ such that $(\forall u\in H)\ f^*(u) = \sup_{x\in H}\big(\langle x\mid u\rangle - f(x)\big)$. (Figures: geometric interpretation of $f^*(u)$ via the linear function $x\mapsto\langle x\mid u\rangle$ and the graph of $f$.)

108 Conjugate: definition. Example: $(\forall x\in\mathbb{R}^N)\ f(x)=\frac{1}{q}\|x\|_q^q$ with $q\in\,]1,+\infty[$ $\Rightarrow$ $(\forall u\in\mathbb{R}^N)\ f^*(u)=\frac{1}{q^*}\|u\|_{q^*}^{q^*}$ with $\frac{1}{q}+\frac{1}{q^*}=1$.
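This conjugate pair can be checked numerically by brute force on $H=\mathbb{R}$ (Python/NumPy; the grid is an illustrative discretization of the sup):

```python
import numpy as np

q = 3.0
q_star = q / (q - 1.0)                 # 1/q + 1/q* = 1
x = np.linspace(-20.0, 20.0, 400001)   # grid wide enough for the sup to be attained

for u in (-2.0, -0.5, 0.0, 1.0, 3.0):
    f_star_num = np.max(x * u - np.abs(x) ** q / q)   # sup_x (<x|u> - f(x))
    f_star_cf = np.abs(u) ** q_star / q_star          # closed form (1/q*) |u|^{q*}
    print(f"u={u:5.1f}  numeric={f_star_num:.5f}  closed-form={f_star_cf:.5f}")
```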

109 Conjugate: definition. $f^*$ is l.s.c. and convex. Moreau–Fenchel theorem: let $f\colon H\to\,]-\infty,+\infty]$ be a proper function; $f$ is l.s.c. and convex $\Leftrightarrow$ $f^{**}=f$. Consequence: if $f\in\Gamma_0(H)$, then $f^*\in\Gamma_0(H)$.

111 Conjugate: definition. Let $f\in\Gamma_0(H)$. Then $(\partial f)^{-1} = \partial f^*$.

112 Conjugate versus Fourier transform: analogy table, with the conjugation rule $h(x)\mapsto h^*(u)$ and the Fourier rule $h(x)\mapsto\widehat{h}(\nu)$.
Invariant function: $\frac{1}{2}\|x\|^2 \mapsto \frac{1}{2}\|u\|^2$; $\exp(-\pi\|x\|^2)\mapsto\exp(-\pi\|\nu\|^2)$.
Translation ($c\in H$): $f(x-c)\mapsto f^*(u)+\langle u\mid c\rangle$; $f(x-c)\mapsto \exp(-\jmath 2\pi\langle \nu\mid c\rangle)\,\widehat{f}(\nu)$.
Dual translation ($c\in H$): $f(x)+\langle x\mid c\rangle\mapsto f^*(u-c)$; $\exp(\jmath 2\pi\langle x\mid c\rangle)f(x)\mapsto\widehat{f}(\nu-c)$.
Scalar multiplication ($\alpha\in\,]0,+\infty[$): $\alpha f(x)\mapsto\alpha f^*(u/\alpha)$; $\alpha f(x)\mapsto\alpha\widehat{f}(\nu)$.
Scaling ($\alpha\in\mathbb{R}^*$): $f(x/\alpha)\mapsto f^*(\alpha u)$; $f(x/\alpha)\mapsto|\alpha|\,\widehat{f}(\alpha\nu)$.
Isomorphism ($L\in B(G,H)$): $f(Lx)\mapsto f^*\big((L^{-1})^*u\big)$; $f(Lx)\mapsto\frac{1}{|\det L|}\widehat{f}\big((L^{-1})^*\nu\big)$.
Reflection: $f(-x)\mapsto f^*(-u)$; $f(-x)\mapsto\widehat{f}(-\nu)$.
Separability ($x=(x^{(n)})_{1\le n\le N}$, $u=(u^{(n)})_{1\le n\le N}$, $\nu=(\nu^{(n)})_{1\le n\le N}$): $\sum_{n=1}^N\varphi_n(x^{(n)})\mapsto\sum_{n=1}^N\varphi_n^*(u^{(n)})$; $\prod_{n=1}^N\varphi_n(x^{(n)})\mapsto\prod_{n=1}^N\widehat{\varphi}_n(\nu^{(n)})$.
Isotropy: $\psi(\|x\|)\mapsto\psi^*(\|u\|)$; $\psi(\|x\|)\mapsto$ a radial function of $\|\nu\|$.
Identity element of (inf-)convolution: $\iota_{\{0\}}(x)\mapsto 0$; $\delta(x)\mapsto 1$.
Identity element of addition/product: $0\mapsto\iota_{\{0\}}(u)$; $1\mapsto\delta(\nu)$.

113 Conjugate: example. Let $H$ be a Hilbert space and $C\subset H$. $\sigma_C$ is the support function of $C$ if $(\forall u\in H)\ \sigma_C(u) = \sup_{x\in C}\langle x\mid u\rangle = \iota_C^*(u)$. (Figures: $\iota_{[\delta_1,\delta_2]}$ and its support function $\sigma_C$.)

116 Conjugate: example. Let $H$ be a Hilbert space. $f\colon H\to\,]-\infty,+\infty]$ is positively homogeneous if $(\forall x\in H)(\forall\alpha\in\,]0,+\infty[)\ f(\alpha x)=\alpha f(x)$. $f$ is positively homogeneous and belongs to $\Gamma_0(H)$ $\Leftrightarrow$ $f=\sigma_C$, where $C$ is a nonempty closed convex subset of $H$. Example 1: let $f\colon\mathbb{R}\to\,]-\infty,+\infty]\colon x\mapsto \delta_1x$ if $x<0$, $0$ if $x=0$, $\delta_2x$ if $x>0$, with $-\infty\le\delta_1<\delta_2\le+\infty$. Then $f=\sigma_C$, where $C$ is the closed real interval such that $\inf C=\delta_1$ and $\sup C=\delta_2$. Example 2: let $f$ be an $\ell_q$ norm of $\mathbb{R}^N$ with $q\in[1,+\infty]$. Then $f=\sigma_C$, where $C=\{y\in\mathbb{R}^N\mid\|y\|_{q^*}\le 1\}$ with $\frac{1}{q}+\frac{1}{q^*}=1$. Particular case: $\ell_1$ norm of $\mathbb{R}^N$ $\Rightarrow$ $C=[-1,1]^N$.

119 Maximally monotone operators: properties.

120 Maximally monotone operator: sum. Let $A$ and $B$ be two maximally monotone operators. $A+B$ is monotone, but may not be maximally monotone.

121 Maximally monotone operator: sum. Let $H$ be a Hilbert space, and let $A$ and $B$ be two maximally monotone operators from $H$ to $2^H$ such that one of the following assumptions is satisfied: $\operatorname{dom}B=H$; $\operatorname{dom}A\cap\operatorname{int}(\operatorname{dom}B)\neq\varnothing$; $0\in\operatorname{int}(\operatorname{dom}A-\operatorname{dom}B)$. Then $A+B$ is maximally monotone. Consequence: let $\alpha\in[0,+\infty[$; if $A$ is maximally monotone, then $A+\alpha\mathrm{Id}$ is maximally monotone.

123 Maximally monotone operator: linear transform. Let $H$ and $G$ be two Hilbert spaces, let $B\colon G\to 2^G$ be a maximally monotone operator, and let $L\in B(H,G)$ be such that one of the following assumptions is satisfied: $L$ is surjective; $0\in\operatorname{int}(\operatorname{dom}B-\operatorname{ran}L)$. Then $L^*BL$ is maximally monotone. Consequence: let $\mu\in\,]0,+\infty[$; if $A$ is maximally monotone and $LL^*=\mu\mathrm{Id}$, then $L^*AL$ is maximally monotone. Proof: $LL^*=\mu\mathrm{Id}\Rightarrow L$ is surjective.

125 Part 2: Nonexpansive operators. 1. Background on nonexpansive operators: definition, properties, examples, resolvent. 2. Proximity operator: definition, properties, examples.

126 Nonexpansive operator: definition. Let $H$ be a Hilbert space and let $C$ be a nonempty subset of $H$. $A\colon C\to H$ is nonexpansive if $(\forall (x,y)\in C^2)\ \|Ax-Ay\|\le\|x-y\|$. More generally, let $\nu\in\,]0,+\infty[$: $A$ is $\nu$-Lipschitzian if $(\forall (x,y)\in C^2)\ \|Ax-Ay\|\le\nu\|x-y\|$, i.e. $\nu^{-1}A$ is nonexpansive. (Figure: Lipschitz $\supset$ nonexpansive.)

129 Nonexpansive operator: definition. Let $H$ be a real Hilbert space. $A\colon H\to 2^H$ is firmly nonexpansive if $(\forall (x,u)\in\operatorname{gra}A)(\forall (y,v)\in\operatorname{gra}A)\ \|u-v\|^2\le\langle u-v\mid x-y\rangle$. For a single-valued $A\colon C\to H$, this reads $(\forall x\in C)(\forall y\in C)\ \|Ax-Ay\|^2\le\langle Ax-Ay\mid x-y\rangle$, or equivalently $(\forall (x,y)\in C^2)\ \|Ax-Ay\|^2+\|(\mathrm{Id}-A)x-(\mathrm{Id}-A)y\|^2\le\|x-y\|^2$. $A$ is firmly nonexpansive $\Leftrightarrow$ $\mathrm{Id}-A$ is firmly nonexpansive $\Leftrightarrow$ $2A-\mathrm{Id}$ (the reflection of $A$) is nonexpansive. $A$ firmly nonexpansive $\Rightarrow$ $A$ nonexpansive. (Figure: Lipschitz $\supset$ nonexpansive $\supset$ firmly nonexpansive.)

135 Nonexpansive operator: definition. Let $H$ be a real Hilbert space, $C$ a nonempty subset of $H$, $A\colon C\to H$ and $\beta\in\,]0,+\infty[$. $A$ is $\beta$-cocoercive if $\beta A$ is firmly nonexpansive, i.e. $(\forall x\in C)(\forall y\in C)\ \beta\|Ax-Ay\|^2\le\langle x-y\mid Ax-Ay\rangle$. Let $G$ be a real Hilbert space and $L\in B(G,H)$ nonzero: $A$ is $\beta$-cocoercive $\Rightarrow$ $L^*AL$ is $\|L\|^{-2}\beta$-cocoercive. $A$ is $\beta$-cocoercive $\Rightarrow$ $A$ is $\beta^{-1}$-Lipschitzian. $A\colon H\to H$ is $\beta$-cocoercive $\Rightarrow$ $A$ is maximally monotone. (Figure: Lipschitz $\supset$ nonexpansive $\supset$ firmly nonexpansive; cocoercive.)

140 Nonexpansive operator: definition. Let $H$ be a real Hilbert space, $C$ a nonempty subset of $H$, $A\colon C\to H$ and $\alpha\in\,]0,1[$. $A$ is $\alpha$-averaged if there exists a nonexpansive operator $R\colon C\to H$ such that $A=(1-\alpha)\mathrm{Id}+\alpha R$, or equivalently if $(\forall (x,y)\in C^2)\ \|Ax-Ay\|^2+\frac{1-\alpha}{\alpha}\|(\mathrm{Id}-A)x-(\mathrm{Id}-A)y\|^2\le\|x-y\|^2$.

142 Nonexpansive operator: properties of averaged operators. $A$ $\alpha$-averaged $\Rightarrow$ $A$ nonexpansive. $A$ $\frac{1}{2}$-averaged $\Leftrightarrow$ $A$ firmly nonexpansive. $A$ $\alpha$-averaged $\Rightarrow$ $A$ $\alpha'$-averaged for every $\alpha'\in[\alpha,1[$. Let $\lambda\in\,]0,1/\alpha[$: $A$ $\alpha$-averaged $\Rightarrow$ $(1-\lambda)\mathrm{Id}+\lambda A$ is $\lambda\alpha$-averaged. Let $(\omega_i)_{1\le i\le n}\in\,]0,1]^n$ with $\sum_{i=1}^n\omega_i=1$ and let $(\alpha_i)_{1\le i\le n}\in\,]0,1[^n$. If, for every $i\in\{1,\dots,n\}$, $A_i\colon C\to H$ is $\alpha_i$-averaged, then $\sum_{i=1}^n\omega_iA_i$ is $\alpha$-averaged with $\alpha=\max_{1\le i\le n}\alpha_i$. If, for every $i\in\{1,\dots,n\}$, $A_i\colon C\to C$ is $\alpha_i$-averaged, then $A_1\circ\cdots\circ A_n$ is $\alpha$-averaged with $\alpha = n\big/\big(n-1+\frac{1}{\max_{1\le i\le n}\alpha_i}\big)$. $A\colon H\to H$ $\alpha$-averaged with $\alpha\in\,]0,1/2]$ $\Rightarrow$ $A$ is maximally monotone.

145 Nonexpansive operator: recap (when the domain $C$ is equal to $H$). (Figure: nested operator classes — Lipschitz $\supset$ nonexpansive $\supset$ $\alpha$-averaged $\supset$ firmly nonexpansive; cocoercive; maximally monotone.)

147 Nonexpansive operator: properties. Let $A\colon C\to H$, $\beta\in\,]0,+\infty[$ and $\gamma\in\,]0,2\beta[$. If $A$ is $\beta$-cocoercive, then $\mathrm{Id}-\gamma A$ is $\gamma/(2\beta)$-averaged. Proof: $A$ $\beta$-cocoercive $\Leftrightarrow$ $\beta A$ firmly nonexpansive, so there exists a nonexpansive operator $R\colon C\to H$ such that $\beta A=(\mathrm{Id}+R)/2$. Thus $\mathrm{Id}-\gamma A = \big(1-\frac{\gamma}{2\beta}\big)\mathrm{Id}+\frac{\gamma}{2\beta}(-R)$; $-R$ being nonexpansive, $\mathrm{Id}-\gamma A$ is $\gamma/(2\beta)$-averaged.

148 Nonexpansive operators: what is their use?

150 Nonexpansive operator: example. Descent lemma: let $H$ be a real Hilbert space, $f\colon H\to\mathbb{R}$ and $\nu\in\,]0,+\infty[$; if $f$ is differentiable and its gradient is $\nu$-Lipschitzian, then $(\forall (x,y)\in H^2)\ f(y)\le f(x)+\langle y-x\mid\nabla f(x)\rangle+\frac{\nu}{2}\|y-x\|^2$. Baillon–Haddad theorem: let $f\in\Gamma_0(H)$ and $\nu\in\,]0,+\infty[$; if $f$ is differentiable, then $\nabla f$ $\nu$-Lipschitzian $\Leftrightarrow$ $\nabla f$ $\nu^{-1}$-cocoercive. Consequences: for $\gamma\in\,]0,2\nu^{-1}[$, $f$ differentiable and $\nabla f$ $\nu$-Lipschitzian $\Rightarrow$ $\mathrm{Id}-\gamma\nabla f$ (the gradient descent operator) is $\gamma\nu/2$-averaged; $f$ differentiable and $\nabla f$ $\nu$-Lipschitzian $\Leftrightarrow$ $f^*$ is $\nu^{-1}$-strongly convex. Remark: $f^*$ is $\nu^{-1}$-strongly convex if $f^*-\nu^{-1}\|\cdot\|^2/2$ is convex.
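A numeric illustration of the gradient-descent consequence (Python/NumPy; the random quadratic is an illustrative choice): for $f(x)=\frac{1}{2}\langle x\mid Qx\rangle$, $\nabla f = Q\cdot$ is $\nu$-Lipschitzian with $\nu=\lambda_{\max}(Q)$, and $\mathrm{Id}-\gamma\nabla f$ is nonexpansive (being $\gamma\nu/2$-averaged) whenever $\gamma\in\,]0,2/\nu[$.

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.normal(size=(5, 5))
Q = M.T @ M                               # f(x) = <x|Qx>/2, grad f(x) = Q x
nu = np.linalg.eigvalsh(Q).max()          # Lipschitz constant of grad f
gamma = 1.9 / nu                          # any step in ]0, 2/nu[

G = np.eye(5) - gamma * Q                 # gradient descent operator Id - gamma grad f
ratios = [np.linalg.norm(G @ d) / np.linalg.norm(d)
          for d in rng.normal(size=(1000, 5))]
print(max(ratios) <= 1.0)                 # True: Id - gamma grad f is nonexpansive
```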

154 Nonexpansive operators: generalities. Definition, properties, examples, resolvent.

155 Resolvent: definition. Let $H$ be a Hilbert space and $A\colon H\to 2^H$. The resolvent of $A$ is $J_A = (\mathrm{Id}+A)^{-1}$. Example (figures): graph of $A$; graphs of $A$ and $\mathrm{Id}$; graph of $A+\mathrm{Id}$; graph of $J_A$, obtained by reflecting $\operatorname{gra}(\mathrm{Id}+A)$ across the diagonal.

161 Resolvent: definition. The range of an operator $B\colon H\to 2^H$ is $\operatorname{ran}B=\{u\in H\mid\exists x\in H,\ u\in Bx\}$. Minty's theorem: let $H$ be a Hilbert space and $A\colon H\to 2^H$ a monotone operator; $\operatorname{ran}(\mathrm{Id}+A)=H$ $\Leftrightarrow$ $A$ is maximally monotone.

162 Resolvent: properties. Let $H$ be a real Hilbert space and $A\colon H\to 2^H$. $A$ is monotone $\Leftrightarrow$ $J_A$ is firmly nonexpansive (remark: $J_A\colon\operatorname{ran}(\mathrm{Id}+A)\to H$). $A$ is maximally monotone $\Leftrightarrow$ $J_A\colon H\to H$ is firmly nonexpansive. Proof: the previous equivalence + Minty's theorem. Let $A\colon H\to 2^H$ be maximally monotone and $\gamma\in\,]0,+\infty[$: for every $x\in H$, there exists a unique $p\in H$ such that $x-p\in\gamma Ap$, namely $p=J_{\gamma A}x$. Proof: $x\in(\mathrm{Id}+\gamma A)(p) \Leftrightarrow p\in(\mathrm{Id}+\gamma A)^{-1}x \Leftrightarrow p=J_{\gamma A}x$.

165 Resolvent: properties. Let $H$ be a real Hilbert space, $A\colon H\to 2^H$ maximally monotone and $\gamma\in\,]0,+\infty[$. $J_{\gamma A}$ and $\mathrm{Id}-J_{\gamma A}$ are firmly nonexpansive. The reflected resolvent $R_{\gamma A}=2J_{\gamma A}-\mathrm{Id}$ is nonexpansive. The Yosida approximation of $A$ of index $\gamma$, ${}^{\gamma}\!A = \frac{1}{\gamma}(\mathrm{Id}-J_{\gamma A})$, is $\gamma$-cocoercive.

166 Resolvent: properties. Let $H$ be a Hilbert space and $A\colon H\to 2^H$ a maximally monotone operator. Let $z\in H$ and $B=A(\cdot-z)$: then $J_B=z+J_A(\cdot-z)$. Let $z\in H$ and $B=z+A$: then $J_B=J_A(\cdot-z)$. Let $\alpha\in[0,+\infty[$ and $B=A+\alpha\mathrm{Id}$: then $J_B=J_{\frac{A}{1+\alpha}}\big(\frac{\cdot}{1+\alpha}\big)$.

167 Resolvent: properties. For every $i\in\{1,\dots,n\}$, let $H_i$ be a Hilbert space and $A_i\colon H_i\to 2^{H_i}$ a maximally monotone operator. Then $J_{A_1\times\cdots\times A_n} = J_{A_1}\times\cdots\times J_{A_n}\colon H_1\times\cdots\times H_n\to H_1\times\cdots\times H_n\colon (x_1,\dots,x_n)\mapsto(J_{A_1}x_1,\dots,J_{A_n}x_n)$.

168 Resolvent: properties. Let $H$ be a Hilbert space, $A\colon H\to 2^H$ a maximally monotone operator and $\gamma\in\,]0,+\infty[$. Then $J_{\gamma A^{-1}} = \mathrm{Id}-\gamma J_{\gamma^{-1}A}(\gamma^{-1}\cdot)$.

169 Resolvent: properties. Let $H$ and $G$ be two Hilbert spaces, $A\colon H\to 2^H$ a maximally monotone operator and $L\in B(G,H)$ such that $LL^*=\mu\mathrm{Id}$ with $\mu\in\,]0,+\infty[$. Then $J_{L^*AL} = \mathrm{Id}-\mu^{-1}L^*\circ(\mathrm{Id}-J_{\mu A})\circ L$. Particular case: if $L\in B(H,H)$ is unitary, then $J_{L^*AL}=L^*\circ J_A\circ L$.

171 Resolvent: properties. Let $H$ be a Hilbert space and $A\colon H\to 2^H$ a maximally monotone operator. If $B=\rho A(\rho\,\cdot)$ with $\rho\in\mathbb{R}$, then $J_B=\rho^{-1}J_{\rho^2A}(\rho\,\cdot)$. Proof: set $L=\rho\mathrm{Id}$ and apply the formula $J_{L^*AL}=\mathrm{Id}-\mu^{-1}L^*(\mathrm{Id}-J_{\mu A})L$. If $B=-A(-\cdot)$, then $J_B=-J_A(-\cdot)$.

173 From the resolvent to the proximity operator.

175 Convex analysis. Let $H$ be a Hilbert space and $f\colon H\to\,]-\infty,+\infty]$. $f$ is coercive if $\lim_{\|x\|\to+\infty}f(x)=+\infty$. (Figures: which of the three example functions are coercive?)

178 Convex analysis. Let $H$ be a Hilbert space and $f\colon H\to\,]-\infty,+\infty]$. $f$ is strictly convex if $(\forall x\in\operatorname{dom}f)(\forall y\in\operatorname{dom}f)(\forall\alpha\in\,]0,1[)\ x\neq y\Rightarrow f(\alpha x+(1-\alpha)y)<\alpha f(x)+(1-\alpha)f(y)$. (Figures: which of the three example functions are strictly convex?)

181 Convex analysis. Let $H$ be a Hilbert space and $C$ a closed convex subset of $H$. Let $f\in\Gamma_0(H)$ be such that $\operatorname{dom}f\cap C\neq\varnothing$. If $f$ is coercive or $C$ is bounded, then there exists $p\in C$ such that $f(p)=\inf_{x\in C}f(x)$. Moreover, if $f$ is strictly convex, this minimizer $p$ is unique.

182 Proximity operator: definition. Let $H$ be a Hilbert space, $f\in\Gamma_0(H)$ and $\gamma\in\,]0,+\infty[$. For every $x\in H$, there exists a unique $p\in H$ such that $f(p)+\frac{1}{2\gamma}\|p-x\|^2 = \inf_{y\in H}f(y)+\frac{1}{2\gamma}\|y-x\|^2$. The Moreau envelope of $f$ of parameter $\gamma$ is ${}^{\gamma}\!f\colon H\to\mathbb{R}\colon x\mapsto\inf_{y\in H}f(y)+\frac{1}{2\gamma}\|y-x\|^2$. The proximity operator of $f$ is $\operatorname{prox}_f\colon H\to H\colon x\mapsto\operatorname{argmin}_{y\in H}f(y)+\frac{1}{2}\|y-x\|^2$. (Figures: $f$, ${}^{\gamma}\!f$ and $\operatorname{prox}_f$.)
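For $f=|\cdot|$ on $\mathbb{R}$ both objects are explicit: $\operatorname{prox}_{\gamma f}$ is the soft-thresholding operation and ${}^{\gamma}\!f$ is the Huber function. A sketch (Python/NumPy) comparing these closed forms with a brute-force inner minimization over a grid (grid bounds are arbitrary):

```python
import numpy as np

gamma = 0.8
y = np.linspace(-10.0, 10.0, 200001)   # grid for the inner minimization

for x in (-3.0, -0.3, 0.0, 1.5):
    vals = np.abs(y) + (y - x) ** 2 / (2 * gamma)
    env_num, prox_num = vals.min(), y[vals.argmin()]
    prox_cf = np.sign(x) * max(abs(x) - gamma, 0.0)                     # soft-thresholding
    env_cf = x**2 / (2*gamma) if abs(x) <= gamma else abs(x) - gamma/2  # Huber function
    print(f"x={x:5.2f}  envelope {env_num:.4f} ~ {env_cf:.4f}"
          f"  prox {prox_num:.4f} ~ {prox_cf:.4f}")
```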

185–202 Proximity operator: definition. (Animated graphical illustration of the proximity operator; no textual content on these slides.)

203 Proximity operator: definition. Let $H$ be a Hilbert space, $f\in\Gamma_0(H)$ and $g\in\Gamma_0(H)$. If $\operatorname{dom}f\cap\operatorname{int}(\operatorname{dom}g)\neq\varnothing$, then $\partial(f+g)=\partial f+\partial g$. Let $f\in\Gamma_0(H)$: $\operatorname{prox}_f = J_{\partial f}$. Proof: by Fermat's rule, for every $x\in H$, $p=\operatorname{argmin}\big(f+\frac{1}{2}\|\cdot-x\|^2\big) \Leftrightarrow 0\in\partial\big(f+\frac{1}{2}\|\cdot-x\|^2\big)(p) \Leftrightarrow 0\in\partial f(p)+p-x \Leftrightarrow x\in(\mathrm{Id}+\partial f)(p) \Leftrightarrow p=(\mathrm{Id}+\partial f)^{-1}(x)$. Remark: since $\operatorname{dom}(\operatorname{prox}_f)=H$, this provides a proof that $\partial f$ is maximally monotone!

206 Proximity operator: properties. Let $H$ be a Hilbert space, $f\in\Gamma_0(H)$ and $(x,p)\in H^2$: $p=\operatorname{prox}_fx \Leftrightarrow (\forall y\in H)\ \langle y-p\mid x-p\rangle+f(p)\le f(y)$. Let $\gamma\in\,]0,+\infty[$: ${}^{\gamma}\!f$ is differentiable, $\nabla({}^{\gamma}\!f)$ is $\gamma^{-1}$-Lipschitzian, and $\nabla({}^{\gamma}\!f) = \gamma^{-1}(\mathrm{Id}-\operatorname{prox}_{\gamma f}) = {}^{\gamma}(\partial f)$ (Moreau envelope $\leftrightarrow$ Yosida approximation). Proof: the previous property + ... calculations. Interpretation: ${}^{\gamma}\!f$ is a smooth approximation of $f$.

210 Proximity operator: properties. Let $H$ be a Hilbert space, $x\in H$ and $f\in\Gamma_0(H)$.
Translation ($z\in H$): $g=f(\cdot-z)$ $\Rightarrow$ $\operatorname{prox}_gx = z+\operatorname{prox}_f(x-z)$.
Quadratic perturbation ($z\in H$, $\alpha>0$, $\gamma\in\mathbb{R}$): $g=f+\alpha\|\cdot\|^2/2+\langle\cdot\mid z\rangle+\gamma$ $\Rightarrow$ $\operatorname{prox}_gx = \operatorname{prox}_{\frac{f}{\alpha+1}}\big(\frac{x-z}{\alpha+1}\big)$.
Scale change ($\rho\in\mathbb{R}^*$): $g=f(\rho\,\cdot)$ $\Rightarrow$ $\operatorname{prox}_gx = \frac{1}{\rho}\operatorname{prox}_{\rho^2f}(\rho x)$.
Reflection: $g=f(-\cdot)$ $\Rightarrow$ $\operatorname{prox}_gx = -\operatorname{prox}_f(-x)$.
Moreau envelope ($\gamma>0$): $g={}^{\gamma}\!f = \inf_{y\in H}f(y)+\frac{1}{2\gamma}\|\cdot-y\|^2$ $\Rightarrow$ $\operatorname{prox}_gx = \frac{1}{1+\gamma}\big(\gamma x+\operatorname{prox}_{(1+\gamma)f}(x)\big)$.

211 Proximity operator: properties. For every $i\in\{1,\dots,n\}$, let $H_i$ be a Hilbert space and $f_i\in\Gamma_0(H_i)$. For all $(x_1,\dots,x_n)\in H_1\times\cdots\times H_n$, if $f(x_1,\dots,x_n)=\sum_{i=1}^nf_i(x_i)$, then $\operatorname{prox}_f(x_1,\dots,x_n) = \big(\operatorname{prox}_{f_i}(x_i)\big)_{1\le i\le n}$.

212 Proximity operator: properties. Let $H$ be a separable Hilbert space and $(b_i)_{i\in I}$ an orthonormal basis of $H$. For every $i\in I$, let $\varphi_i\in\Gamma_0(\mathbb{R})$ be such that $\varphi_i\ge 0$. For every $x\in H$, if $f(x)=\sum_{i\in I}\varphi_i(\langle x\mid b_i\rangle)$, then $\operatorname{prox}_f(x)=\sum_{i\in I}\operatorname{prox}_{\varphi_i}(\langle x\mid b_i\rangle)\,b_i$. Remark: the assumption $(\forall i\in I)\ \varphi_i\ge 0$ can be relaxed if $H$ is finite-dimensional. Example: $H=\mathbb{R}^N$, $(b_i)_{1\le i\le N}$ the canonical basis of $\mathbb{R}^N$, $f=\lambda\|\cdot\|_1$ with $\lambda\in[0,+\infty[$: $(\forall x=(x^{(i)})_{1\le i\le N}\in\mathbb{R}^N)\ \operatorname{prox}_{\lambda\|\cdot\|_1}(x) = \big(\operatorname{prox}_{\lambda|\cdot|}(x^{(i)})\big)_{1\le i\le N}$, i.e. componentwise soft-thresholding.

214 Proximity operator: properties. Moreau decomposition formula: let $H$ be a Hilbert space, $f\in\Gamma_0(H)$ and $\gamma\in\,]0,+\infty[$; then $(\forall x\in H)\ \operatorname{prox}_{\gamma f}x = x-\gamma\operatorname{prox}_{\gamma^{-1}f^*}(\gamma^{-1}x)$. Example: if $H=\mathbb{R}^N$ and $f=\frac{1}{q}\|\cdot\|_q^q$ with $q\in\,]1,+\infty[$, then $f^*=\frac{1}{q^*}\|\cdot\|_{q^*}^{q^*}$ with $1/q+1/q^*=1$, and $(\forall x\in\mathbb{R}^N)\ \operatorname{prox}_{\frac{\gamma}{q}\|\cdot\|_q^q}x = x-\gamma\operatorname{prox}_{\frac{1}{\gamma q^*}\|\cdot\|_{q^*}^{q^*}}(\gamma^{-1}x)$.
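A numeric check of the decomposition (Python/NumPy) with $f=\|\cdot\|_1$, for which $f^*=\iota_{[-1,1]^N}$ (see the $\ell_1$ example above), so that $\operatorname{prox}_{\gamma^{-1}f^*}$ is simply the projection onto $[-1,1]^N$:

```python
import numpy as np

rng = np.random.default_rng(3)
gamma = 0.7
x = 2.0 * rng.normal(size=6)

soft = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t, 0.0)  # prox of t||.||_1
proj_box = lambda v: np.clip(v, -1.0, 1.0)  # prox of f* = indicator of [-1,1]^N
# (scaling an indicator leaves it unchanged, so prox_{f*/gamma} = proj_box)

lhs = soft(x, gamma)                        # prox_{gamma f}(x)
rhs = x - gamma * proj_box(x / gamma)       # x - gamma prox_{f*/gamma}(x/gamma)
print(np.allclose(lhs, rhs))                # True
```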

216 Proximity operator: properties. (Figure: graphs of the prox of the $|\cdot|^p$ penalties on $\mathbb{R}$ for $p = 1$, $4/3$, $3/2$, $2$, $3$, ...)

217 Proximity operator: properties. Let $H$ and $G$ be two Hilbert spaces, $f\in\Gamma_0(H)$ and $L\in B(G,H)$ such that $\operatorname{ran}L=H$. Then $\partial(f\circ L) = L^*\circ\partial f\circ L$. Let $f\in\Gamma_0(H)$ and $L\in B(G,H)$ such that $LL^*=\mu\mathrm{Id}$ with $\mu\in\,]0,+\infty[$. Then $\operatorname{prox}_{f\circ L} = \mathrm{Id}-\mu^{-1}L^*\circ(\mathrm{Id}-\operatorname{prox}_{\mu f})\circ L$. Remark: a useful property for data fidelity terms involving a neg-log-likelihood $f$ and a synthesis tight frame operator $L$.
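A sketch of the tight-frame formula (Python/NumPy) with the illustrative frame $L=[\mathrm{Id}\ \mathrm{Id}]$ (so $LL^\top=2\mathrm{Id}$, $\mu=2$) and $f=\lambda\|\cdot\|_1$; the result is validated against the variational characterization of the prox from slide 206, tested on random points:

```python
import numpy as np

rng = np.random.default_rng(4)
n, lam, mu = 4, 0.7, 2.0
L = np.hstack([np.eye(n), np.eye(n)])        # n x 2n tight frame: L L^T = 2 Id

soft = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t, 0.0)
f = lambda u: lam * np.abs(u).sum()          # f = lam ||.||_1 on R^n
g = lambda y: f(L @ y)                       # g = f o L on R^{2n}

x = rng.normal(size=2 * n)
Lx = L @ x
p = x - L.T @ (Lx - soft(Lx, mu * lam)) / mu   # prox_{f o L}(x) by the formula

# check: p = prox_g(x)  <=>  (for all y)  <y - p | x - p> + g(p) <= g(y)
ok = all((y - p) @ (x - p) + g(p) <= g(y) + 1e-10
         for y in rng.normal(size=(1000, 2 * n)))
print(ok)                                     # True
```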

220 Proximity operator: properties. Particular case: $L\in B(H,H)$ unitary $\Rightarrow$ $\operatorname{prox}_{f\circ L} = L^*\circ\operatorname{prox}_f\circ L$. Illustration: denoising using an $\ell_1$ penalty on the coefficients resulting from an orthogonal wavelet transform $L$, i.e. $\operatorname{prox}_{\lambda\|L\cdot\|_1} = L^*\circ\operatorname{prox}_{\lambda\|\cdot\|_1}\circ L$ (figure).

221 Proximity operator: examples. Projection: let $H$ be a Hilbert space and $C$ a nonempty closed convex subset of $H$; $(\forall x\in H)\ \operatorname{prox}_{\iota_C}(x) = \operatorname{argmin}_{y\in C}\frac{1}{2}\|y-x\|^2 = P_C(x)$ (figure). Remarks: $p=P_C(x) \Leftrightarrow x-p\in\partial\iota_C(p)=N_C(p) \Leftrightarrow (\forall y\in C)\ \langle y-p\mid x-p\rangle\le 0$. Particular case: if $C$ is a vector space, $p=P_C(x)\Leftrightarrow x-p\in C^\perp$. ${}^{\gamma}\iota_C = (2\gamma)^{-1}d_C^2$, where the distance to the convex set $C$ is defined by $d_C\colon x\mapsto\inf_{y\in C}\|y-x\| = \|x-P_Cx\|$. We then have $\nabla d_C^2 = 2(\mathrm{Id}-P_C)$.

223 Proximity operator: examples. Quadratic function: let $H$ and $G$ be two Hilbert spaces, $L\in B(H,G)$, $\gamma\in\,]0,+\infty[$ and $z\in G$. $f=\gamma\|L\cdot-z\|^2/2$ $\Rightarrow$ $\operatorname{prox}_f = (\mathrm{Id}+\gamma L^*L)^{-1}(\cdot+\gamma L^*z)$.
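This is the workhorse prox for least-squares data fidelity terms. A sketch (Python/NumPy; the sizes are arbitrary) evaluating it by a linear solve and checking Fermat's rule $p+\gamma L^\top(Lp-z)=x$:

```python
import numpy as np

rng = np.random.default_rng(5)
M, N, gamma = 8, 5, 0.5
L = rng.normal(size=(M, N))
z = rng.normal(size=M)
x = rng.normal(size=N)

# prox_f(x) = (Id + gamma L^T L)^{-1} (x + gamma L^T z) for f = gamma ||L . - z||^2 / 2
p = np.linalg.solve(np.eye(N) + gamma * L.T @ L, x + gamma * L.T @ z)
print(np.allclose(p + gamma * L.T @ (L @ p - z), x))   # True: 0 = grad f(p) + p - x
```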

224 Proximity operator: examples. Support function: let $H$ be a Hilbert space and $C\subset H$ nonempty closed convex; $(\forall x\in H)\ \operatorname{prox}_{\sigma_C} = \mathrm{Id}-P_C$. Soft-thresholding: $H=\mathbb{R}$, $\delta_1=\inf C$ and $\delta_2=\sup C$. For every $x\in\mathbb{R}$, $\sigma_C(x) = \delta_1x$ if $x<0$, $0$ if $x=0$, $\delta_2x$ if $x>0$, and $\operatorname{prox}_{\sigma_C}(x) = \operatorname{soft}_C(x) = x-\delta_1$ if $x<\delta_1$, $0$ if $x\in C$, $x-\delta_2$ if $x>\delta_2$. (Figures: $P_C$ and $\operatorname{prox}_{\sigma_C}$.)

225 Part 3: Search for a zero. 1. Zeros of a (maximally) monotone operator. 2. Fixed points. 3. Convergence: definition, Fejér monotonicity, demiclosedness principle. 4. Algorithms: Krasnosel'skii–Mann, Douglas–Rachford, PPXA, forward–backward, forward–backward–forward.

226 Monotone operator: zeros. Let $H$ be a Hilbert space and $A\colon H\to 2^H$ a monotone operator. The set of zeros of $A$, denoted $\operatorname{zer}A$, is $\operatorname{zer}A = \{x\in H\mid 0\in Ax\}$. If $A$ is maximally monotone, then $\operatorname{zer}A$ is closed and convex. If $A$ is maximally monotone and one of the following assumptions is satisfied: $A$ is surjective; $\operatorname{dom}A$ is bounded, then $\operatorname{zer}A\neq\varnothing$.

228 Monotone operator: zeros. Let $H$ be a Hilbert space. $A\colon H\to 2^H$ is strictly monotone if $(\forall (x_1,u_1)\in\operatorname{gra}A)(\forall (x_2,u_2)\in\operatorname{gra}A)\ x_1\neq x_2 \Rightarrow \langle u_1-u_2\mid x_1-x_2\rangle>0$. If $A$ is strictly monotone, then $\operatorname{zer}A$ is at most a singleton. Proof: if $(x_1,x_2)\in(\operatorname{zer}A)^2$ and $x_1\neq x_2$, then $0 = \langle 0-0\mid x_1-x_2\rangle > 0$, a contradiction!

230 Monotone operator: zeros. Let $H$ be a Hilbert space and $\beta\in\,]0,+\infty[$. $A\colon H\to 2^H$ is $\beta$-strongly monotone if $(\forall (x_1,u_1)\in\operatorname{gra}A)(\forall (x_2,u_2)\in\operatorname{gra}A)\ \langle u_1-u_2\mid x_1-x_2\rangle\ge\beta\|x_1-x_2\|^2$. $A$ is $\beta$-strongly monotone $\Leftrightarrow$ $A^{-1}\colon H\to H$ is $\beta$-cocoercive. $A$ $\beta$-strongly monotone $\Rightarrow$ $A$ strictly monotone. If $A$ is maximally monotone and $\beta$-strongly monotone, then $\operatorname{zer}A$ is a singleton.

233 Monotone operator: zeros and minimizers. Let $H$ be a Hilbert space and $f\colon H\to\,]-\infty,+\infty]$ a proper function. $f$ strictly convex $\Rightarrow$ $\partial f$ strictly monotone. $f$ strictly convex $\Rightarrow$ $f$ has at most one minimizer.

234 Monotone operator: zeros and minimizers. Let $H$ be a Hilbert space and $\beta\in\,]0,+\infty[$. If $f\colon H\to\,]-\infty,+\infty]$ is proper and $\beta$-strongly convex, then $\partial f$ is $\beta$-strongly monotone. If $f\in\Gamma_0(H)$ is $\beta$-strongly convex, then $f$ has a unique minimizer.

235 Fixed point algorithms: zeros and fixed points. Let $H$ be a Hilbert space and $B\colon H\to 2^H$. The set of fixed points of $B$ is $\operatorname{Fix}B = \{x\in H\mid x\in Bx\}$. Let $A\colon H\to 2^H$ be a monotone operator and $\gamma\in\,]0,+\infty[$. Then $\operatorname{Fix}J_{\gamma A} = \operatorname{zer}A$. Proof: $(\forall x\in H)\ 0\in Ax \Leftrightarrow 0\in\gamma Ax \Leftrightarrow x\in(\mathrm{Id}+\gamma A)x \Leftrightarrow x\in(\mathrm{Id}+\gamma A)^{-1}x \Leftrightarrow x=J_{\gamma A}x$.

239 Fixed point algorithm: convergence. Let $H$ be a Hilbert space, $(x_n)_{n\in\mathbb{N}}$ a sequence in $H$ and $x\in H$. $(x_n)_{n\in\mathbb{N}}$ converges strongly to $x$ if $\lim_{n\to+\infty}\|x_n-x\|=0$; notation: $x_n\to x$. $(x_n)_{n\in\mathbb{N}}$ converges weakly to $x$ if $(\forall y\in H)\ \lim_{n\to+\infty}\langle y\mid x_n-x\rangle=0$; notation: $x_n\rightharpoonup x$. Remark: in a finite-dimensional Hilbert space, strong and weak convergence are equivalent.

240 Fixed point algorithm: convergence. Let $(x_n)_{n\in\mathbb{N}}$ be a sequence of $H$. $(x_n)_{n\in\mathbb{N}}$ converges weakly if and only if $(x_n)_{n\in\mathbb{N}}$ is bounded and possesses at most one sequential cluster point in the weak topology ($x$ is a sequential cluster point of $(x_n)_{n\in\mathbb{N}}$ in the weak topology if there exists a subsequence $(x_{n_k})_{k\in\mathbb{N}}$ of $(x_n)_{n\in\mathbb{N}}$ that converges weakly to $x$). Illustration (figure): an oscillating sequence $(x_n)_{n\in\mathbb{N}}$ can be bounded yet have two sequential cluster points, $-1$ and $1$; such a sequence does not converge.

242 Fixed point algorithm: Fejér-monotone sequence. Let $H$ be a Hilbert space, $D$ a nonempty subset of $H$ and $(x_n)_{n\in\mathbb{N}}$ a sequence in $H$. $(x_n)_{n\in\mathbb{N}}$ is Fejér-monotone with respect to $D$ if $(\forall x\in D)(\forall n\in\mathbb{N})\ \|x_{n+1}-x\|\le\|x_n-x\|$. If $(x_n)_{n\in\mathbb{N}}$ is Fejér-monotone with respect to $D$, then $(x_n)_{n\in\mathbb{N}}$ is bounded and, for every $x\in D$, $(\|x_n-x\|)_{n\in\mathbb{N}}$ converges.

243 Fixed point algorithm: Fejér-monotone sequence. Fejér-monotone convergence: let $H$ be a Hilbert space, $D$ a nonempty subset of $H$ and $(x_n)_{n\in\mathbb{N}}$ a sequence in $H$. $(x_n)_{n\in\mathbb{N}}$ converges weakly to a point in $D$ if $(x_n)_{n\in\mathbb{N}}$ is Fejér-monotone with respect to $D$ and every sequential cluster point of $(x_n)_{n\in\mathbb{N}}$ in the weak topology lies in $D$.

244 Fixed point algorithm: Fejér-monotone sequence. Demiclosedness principle: let $H$ be a Hilbert space, $C$ a nonempty closed convex subset of $H$ and $T\colon C\to H$ a nonexpansive operator. If $(x_n)_{n\in\mathbb{N}}$ is a sequence in $C$ that converges weakly to $x$ and if $Tx_n-x_n\to 0$, then $x\in\operatorname{Fix}T$.

245 Fixed point algorithm: Fejér-monotone sequences. Let $H$ be a Hilbert space, $C$ a nonempty subset of $H$ and $T\colon C\to C$ a nonexpansive operator such that $\operatorname{Fix}T\neq\varnothing$. Let $x_0\in C$ and $(\forall n\in\mathbb{N})\ x_{n+1}=Tx_n$. If $x_n-Tx_n\to 0$, then $(x_n)_{n\in\mathbb{N}}$ converges weakly to a point in $\operatorname{Fix}T$. Proof: $(x_n)_{n\in\mathbb{N}}$ is Fejér-monotone with respect to $\operatorname{Fix}T$ + demiclosedness principle.

248 Fixed point algorithm: proximal point algorithm 105/161 Let H be a Hilbert space. Let A : H → 2^H be a maximally monotone operator such that zer A ≠ ∅. Let γ ∈ ]0,+∞[ and x_0 ∈ H. If (∀n ∈ N) x_{n+1} = J_γA x_n, then (x_n)_{n∈N} converges weakly to a point in zer A. Proof: J_γA : H → H is nonexpansive and x_n − J_γA x_n → 0.
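A minimal Python/NumPy sketch of the proximal point iteration (illustrative, not part of the original slides), assuming A = ∂f with f = ‖·‖₁, so that the resolvent J_γA is componentwise soft-thresholding; the helper names and toy data are hypothetical.

```python
import numpy as np

def soft_threshold(x, t):
    """Resolvent J_{tA} for A the subdifferential of ||.||_1 (= prox of t||.||_1)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def proximal_point(x0, resolvent, gamma=1.0, n_iter=50):
    """Iterate x_{n+1} = J_{gamma A} x_n for a maximally monotone A."""
    x = x0.copy()
    for _ in range(n_iter):
        x = resolvent(x, gamma)
    return x

x0 = np.array([3.0, -1.5, 0.2])
x = proximal_point(x0, soft_threshold, gamma=0.5)
print(x)  # converges to 0, the unique zero of the subdifferential of ||.||_1
```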

251 Fixed point algorithm: proximal point algorithm 106/161 Let H be a Hilbert space. Let A : H → 2^H be a maximally monotone operator such that zer A ≠ ∅. Let (γ_n)_{n∈N} be a sequence in ]0,+∞[ such that ∑_{n=0}^{+∞} γ_n² = +∞ and let x_0 ∈ H. If (∀n ∈ N) x_{n+1} = J_{γ_n A} x_n, then (x_n)_{n∈N} converges weakly to a point in zer A. Proof: ... more challenging.

252 Optimization method: proximal point algorithm 107/161 Let H be a Hilbert space. Let f ∈ Γ₀(H) such that Argmin f ≠ ∅. Let (γ_n)_{n∈N} be a sequence in ]0,+∞[ such that ∑_{n=0}^{+∞} γ_n² = +∞ and let x_0 ∈ H. If (∀n ∈ N) x_{n+1} = prox_{γ_n f} x_n, then (x_n)_{n∈N} converges weakly to a (global) minimizer of f.

254 Fixed point algorithm: Fejér-monotone sequence 108/161 Krasnosel'skii-Mann algorithm. Let H be a Hilbert space and C be a nonempty closed convex subset of H. Let T : C → C be a nonexpansive operator such that Fix T ≠ ∅. Let (λ_n)_{n∈N} be a sequence in [0,1] such that ∑_{n∈N} λ_n(1−λ_n) = +∞. Let x_0 ∈ C and (∀n ∈ N) x_{n+1} = x_n + λ_n(Tx_n − x_n). Then (x_n)_{n∈N} converges weakly to a point in Fix T. Proof: (x_n)_{n∈N} is Fejér-monotone with respect to Fix T and (Tx_n − x_n)_{n∈N} converges strongly to 0.
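A small self-contained sketch of the Krasnosel'skii-Mann iteration (illustrative assumption: T is the composition of the projections onto two lines in R², a nonexpansive map whose only fixed point is their intersection, the origin).

```python
import numpy as np

def proj_line(x, d):
    """Projection onto the line R*d (d a unit vector): firmly nonexpansive."""
    return np.dot(x, d) * d

d1 = np.array([1.0, 0.0])
d2 = np.array([1.0, 1.0]) / np.sqrt(2)
T = lambda x: proj_line(proj_line(x, d2), d1)  # nonexpansive, Fix T = {0}

x = np.array([4.0, -2.0])
for n in range(200):
    lam = 0.5  # constant relaxation in [0,1]; sum lam*(1-lam) = +inf
    x = x + lam * (T(x) - x)
print(x)  # -> [0, 0], the intersection of the two lines
```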

257 Fixed point algorithm: Douglas-Rachford 109/161 Let H be a finite dimensional Hilbert space. Let A : H → 2^H and B : H → 2^H be two maximally monotone operators. Let γ ∈ ]0,+∞[ and let (λ_n)_{n∈N} be a sequence in [0,2] such that ∑_{n∈N} λ_n(2−λ_n) = +∞. We assume that zer(A+B) ≠ ∅. Let x_0 ∈ H and
(∀n ∈ N)  y_n = J_γB x_n;  z_n = J_γA(2y_n − x_n);  x_{n+1} = x_n + λ_n(z_n − y_n).
The following properties are satisfied: x_n → x̄; (y_n)_{n∈N} and (z_n)_{n∈N} converge to J_γB x̄ ∈ zer(A+B).

260 Fixed point algorithm: Douglas-Rachford 110/161 Proof: We set T = R_γA ∘ R_γB (composition of the reflected resolvents). 1. T is nonexpansive. 2. zer(A+B) = J_γB(Fix T). 3. zer(A+B) ≠ ∅ ⇒ Fix T ≠ ∅. 4. We have (∀n ∈ N) x_{n+1} = x_n + λ_n(J_γA(2J_γB x_n − x_n) − J_γB x_n) = x_n + (λ_n/2)(Tx_n − x_n): Krasnosel'skii-Mann algorithm with relaxation steps (λ_n/2)_{n∈N}. 5. Invoke continuity of J_γB.

261 Fixed point algorithm: Douglas-Rachford 111/161 Let H be a Hilbert space. Let A : H → 2^H and B : H → 2^H be two maximally monotone operators. Let γ ∈ ]0,+∞[ and let (λ_n)_{n∈N} be a sequence in [0,2] such that ∑_{n∈N} λ_n(2−λ_n) = +∞. We assume that zer(A+B) ≠ ∅. Let x_0 ∈ H and
(∀n ∈ N)  y_n = J_γB x_n;  z_n = J_γA(2y_n − x_n);  x_{n+1} = x_n + λ_n(z_n − y_n).
The following properties are satisfied: x_n ⇀ x̄; (y_n)_{n∈N} and (z_n)_{n∈N} converge weakly to J_γB x̄ ∈ zer(A+B).

262 Optimization algorithm: Douglas-Rachford 112/161 Let H be a Hilbert space. Let f ∈ Γ₀(H) and g ∈ Γ₀(H). Let γ ∈ ]0,+∞[ and let (λ_n)_{n∈N} be a sequence in [0,2] such that ∑_{n∈N} λ_n(2−λ_n) = +∞. We assume that zer(∂f + ∂g) ≠ ∅. Let x_0 ∈ H and
(∀n ∈ N)  y_n = prox_γg x_n;  z_n = prox_γf(2y_n − x_n);  x_{n+1} = x_n + λ_n(z_n − y_n).
The following properties are satisfied: x_n ⇀ x̄; z_n − y_n → 0; y_n ⇀ ŷ, z_n ⇀ ŷ, where ŷ = prox_γg x̄ ∈ Argmin(f + g).
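A minimal sketch of these Douglas-Rachford iterations (illustrative, with assumed toy data), taking g = ‖A· − z‖₂² and f = η‖·‖₁ as in the restoration example below; prox_γg is computed here by solving a small linear system, one option when A has moderate size.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))
z = rng.standard_normal(20)
eta, gamma, lam = 0.1, 1.0, 1.0

def prox_g(x):
    """prox of gamma*||A. - z||_2^2: solve (I + 2 gamma A^T A) u = x + 2 gamma A^T z."""
    return np.linalg.solve(np.eye(A.shape[1]) + 2 * gamma * A.T @ A,
                           x + 2 * gamma * A.T @ z)

def prox_f(x):
    """prox of gamma*eta*||.||_1: soft-thresholding at gamma*eta."""
    return np.sign(x) * np.maximum(np.abs(x) - gamma * eta, 0.0)

x = np.zeros(50)
for n in range(300):
    y = prox_g(x)
    s = prox_f(2 * y - x)          # z_n in the slide notation
    x = x + lam * (s - y)
print(y)  # y_n converges to a minimizer of ||A. - z||_2^2 + eta*||.||_1
```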

264 Optimization algorithm: Douglas-Rachford 113/161 Image restoration: z = Ax̄ + n, where x̄ ∈ R^N is the original image (unknown), A ∈ R^{N×N} is the blur operator, n ∈ R^N is an additive noise (white zero-mean Gaussian), and z ∈ R^N is the observation = blur + noise. Find an image x̂ close to x̄ using z. [Figure: degraded image z]

265 Optimization algorithm: Douglas-Rachford 114/161 Image restoration: variational approach
x̂ ∈ Argmin_{x∈R^N} ‖Ax − z‖₂² + η‖Wx‖₁, with η ∈ ]0,+∞[ and W ∈ R^{N×N}.
Douglas-Rachford algorithm with g = ‖A· − z‖₂² and f = η‖W·‖₁:
(∀n ∈ N)  y_n = prox_γg x_n (closed form);  z_n = prox_γf(2y_n − x_n);  x_{n+1} = x_n + λ_n(z_n − y_n).
Closed forms: prox_{γη‖·‖₁} = soft_{[−γη,γη]}, and, if W⁻¹ = W* (e.g. orthonormal wavelet transform), prox_{f∘W} = W* ∘ prox_f ∘ W.
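A short sketch of the two closed forms just named (illustrative; a 2-D rotation stands in for an orthonormal wavelet transform, which is an assumption for the demo only).

```python
import numpy as np

def soft(x, t):
    """prox of t*||.||_1, applied componentwise."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def prox_l1_analysis(x, W, t):
    """prox of t*||W.||_1 when W is orthonormal (W^{-1} = W^T):
    prox_{f o W} = W^T o prox_f o W."""
    return W.T @ soft(W @ x, t)

theta = np.pi / 6   # toy orthonormal transform (stand-in for a wavelet transform)
W = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
print(prox_l1_analysis(np.array([2.0, -0.5]), W, 0.3))
```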

269 Optimization algorithm: Douglas-Rachford 114/161 [Figure: degraded image z and restored image x̂ [DR DWT]]

270 Optimization algorithm: Douglas-Rachford 114/161 [Plot: convergence profile ‖y_n − ȳ‖ versus iterations for γ ∈ {50, 10², 5·10², 10³, ...}]

271 Optimization algorithm: Douglas-Rachford 114/161 [Plot: convergence profile versus iterations for λ_n ∈ {1, 1.5, 2.1}]

272 Optimization algorithm: Douglas-Rachford 114/161 [Plot: DR (‖y_n − ȳ‖, red) versus proximal point algorithm (‖x_n − x̄‖, blue) as a function of time (sec)]

273 Optimization algorithm: Douglas-Rachford 115/161 Let H and G be two Hilbert spaces. Let g ∈ Γ₀(H) and L ∈ B(G,H) such that ran L is closed and L*L is an isomorphism. Let γ ∈ ]0,+∞[ and let (λ_n)_{n∈N} be a sequence in [0,2] such that ∑_{n∈N} λ_n(2−λ_n) = +∞. We assume that zer(L* ∘ ∂g ∘ L) ≠ ∅. Let x_0 ∈ H, v_0 = (L*L)⁻¹L*x_0 and
(∀n ∈ N)  y_n = prox_γg x_n;  c_n = (L*L)⁻¹L*y_n;  x_{n+1} = x_n + λ_n(L(2c_n − v_n) − y_n);  v_{n+1} = v_n + λ_n(c_n − v_n).
We have then: v_n ⇀ v̄ where v̄ ∈ Argmin(g ∘ L).

275 Optimization algorithm: Douglas-Rachford 116/161 Sketch of proof: minimize_{v∈G} g(Lv) ⇔ minimize_{x∈H} ι_E(x) + g(x), where E = ran L. We apply the Douglas-Rachford algorithm with f = ι_E.

276 Optimization algorithm: Douglas-Rachford 117/161 Particular case of the Douglas-Rachford algorithm: PPXA+. H = H₁ × ··· × H_m where H₁,...,H_m are Hilbert spaces; (∀x = (x₁,...,x_m) ∈ H) g(x) = ∑_{i=1}^m g_i(x_i), where (∀i ∈ {1,...,m}) g_i ∈ Γ₀(H_i); L : v ↦ (L₁v,...,L_mv), where (∀i ∈ {1,...,m}) L_i ∈ B(G,H_i). Let (x_{0,i})_{1≤i≤m} ∈ H, v_0 = (∑_{i=1}^m L_i*L_i)⁻¹ ∑_{i=1}^m L_i*x_{0,i} and
(∀n ∈ N)  y_{n,i} = prox_{γg_i} x_{n,i}, i ∈ {1,...,m};  c_n = (∑_{i=1}^m L_i*L_i)⁻¹ ∑_{i=1}^m L_i*y_{n,i};  x_{n+1,i} = x_{n,i} + λ_n(L_i(2c_n − v_n) − y_{n,i}), i ∈ {1,...,m};  v_{n+1} = v_n + λ_n(c_n − v_n).
We have then v_n ⇀ v̄ ∈ Argmin ∑_{i=1}^m g_i ∘ L_i.

278 Optimization algorithm: Douglas-Rachford 117/161 Particular case of the Douglas-Rachford algorithm: PPXA. Same setting as PPXA+; if H₁ = ... = H_m and L₁ = ... = L_m = Id (consensus trick), let (x_{0,i})_{1≤i≤m} ∈ H, v_0 = (1/m)∑_{i=1}^m x_{0,i} and
(∀n ∈ N)  y_{n,i} = prox_{γg_i} x_{n,i}, i ∈ {1,...,m};  c_n = (1/m)∑_{i=1}^m y_{n,i};  x_{n+1,i} = x_{n,i} + λ_n(2c_n − v_n − y_{n,i}), i ∈ {1,...,m};  v_{n+1} = v_n + λ_n(c_n − v_n).
We have then v_n ⇀ v̄ ∈ Argmin ∑_{i=1}^m g_i. A sketch of this consensus form follows.
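A minimal consensus-PPXA sketch (illustrative; the two quadratic terms and their prox formulas are assumptions for the demo, with prox of γ·½‖·−c‖² equal to (x+γc)/(1+γ)).

```python
import numpy as np

a, b = np.array([1.0, 0.0]), np.array([0.0, 2.0])
# g_i = 0.5*||. - c_i||^2, so prox_{gamma g_i}(x) = (x + gamma*c_i)/(1 + gamma)
prox = [lambda x, g=1.0, c=a: (x + g * c) / (1 + g),
        lambda x, g=1.0, c=b: (x + g * c) / (1 + g)]

m, lam = 2, 1.0
xs = [np.zeros(2) for _ in range(m)]
v = sum(xs) / m
for n in range(100):
    ys = [prox[i](xs[i]) for i in range(m)]
    c = sum(ys) / m                                   # consensus average
    xs = [xs[i] + lam * (2 * c - v - ys[i]) for i in range(m)]
    v = v + lam * (c - v)
print(v)  # -> (a + b)/2, the minimizer of g_1 + g_2
```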

279 Optimization algorithm: PPXA+ 118/161 Image restoration: variational approach
x̂ ∈ Argmin_{x∈R^N} ‖Ax − z‖₂² + η‖[H V]x‖_{2,1} + ι_C(x), with η ∈ ]0,+∞[, H, V ∈ R^{N×N} and C = [0,255]^N.
PPXA+ with g₁ = ‖A· − z‖₂² and L₁ = Id; g₂ = η‖·‖_{2,1} and L₂ = [H V]; g₃ = ι_C and L₃ = Id:
(∀n ∈ N)  y_{n,i} = prox_{γg_i} x_{n,i}, i ∈ {1,2,3} (closed form);  c_n = (∑_{i=1}^3 L_i*L_i)⁻¹ ∑_{i=1}^3 L_i*y_{n,i} (closed form);  x_{n+1,i} = x_{n,i} + λ_n(L_i(2c_n − v_n) − y_{n,i}), i ∈ {1,2,3};  v_{n+1} = v_n + λ_n(c_n − v_n).

281 Optimization algorithm: PPXA+ 118/161 [Figure: degraded image z and restored image x̂ [PPXA TV]]

282 Optimization algorithm: PPXA+ 118/161 [Figure: degraded image z and restored image x̂ [DR DWT], for comparison]

283 Optimization algorithm: PPXA+ 118/161 [Plot: convergence profile versus iterations for γ ∈ {5·10², 10³, 5·10³, 10⁴, ...}]

284 Optimization algorithm: PPXA+ 118/161 [Plot: ‖v_n − v̄‖ versus iterations for λ_n ∈ {1, 1.8, 2.1}]

285 Fixed point algorithm: α-averaged operator 119/161 Let H be a Hilbert space. Let T : H → H be an α-averaged operator with α ∈ ]0,1[ such that Fix T ≠ ∅. Let (λ_n)_{n∈N} be a sequence in [0,1/α] such that ∑_{n∈N} λ_n(1−αλ_n) = +∞. Let x_0 ∈ H and (∀n ∈ N) x_{n+1} = x_n + λ_n(Tx_n − x_n). Then (x_n)_{n∈N} converges weakly to a point in Fix T. Proof: use properties of the Krasnosel'skii-Mann algorithm.

288 Fixed point algorithm: Forward-Backward 120/161 Let H be a Hilbert space. Let A : H → 2^H be a maximally monotone operator. Let B : H → H be a β-cocoercive operator with β ∈ ]0,+∞[. Let γ ∈ ]0,2β[ and δ = min{1, β/γ} + 1/2. Let (λ_n)_{n∈N} be a sequence in [0,δ[ such that ∑_{n∈N} λ_n(δ−λ_n) = +∞. We assume that zer(A+B) ≠ ∅. Let x_0 ∈ H and
(∀n ∈ N)  y_n = x_n − γBx_n;  x_{n+1} = x_n + λ_n(J_γA y_n − x_n).
(x_n)_{n∈N} converges weakly to a point in zer(A+B).

291 Fixed point algorithm: Forward-Backward 120/161 [Plot: the relaxation bound δ = min{1, β/γ} + 1/2 as a function of γ/β ∈ ]0,2[: constant equal to 3/2 on ]0,1], then decreasing towards 1 on ]1,2[]

292 Fixed point algorithm: Forward-Backward 121/161 Proof: Let T = J_γA ∘ (Id − γB). 1. Fix T = zer(A+B): (∀x ∈ H) x ∈ Fix T ⇔ x − γBx ∈ (Id + γA)x ⇔ 0 ∈ Ax + Bx. 2. The iterations can be written as (∀n ∈ N) x_{n+1} = x_n + λ_n(Tx_n − x_n). 3. T is α-averaged: B β-cocoercive and γ ∈ ]0,2β[ ⇒ Id − γB is γ/(2β)-averaged, and J_γA is 1/2-averaged; hence T is α-averaged with α = 2β/(4β−γ), so that 1/α = 2 − γ/(2β) ≥ δ, and the convergence result for α-averaged operators applies.

293 Fixed point algorithm: Forward-Backward 122/161 Let H be a Hilbert space. Let A : H → 2^H be a maximally monotone operator. Let B : H → H be β-cocoercive where β ∈ ]0,+∞[. Let (γ_n)_{n∈N} be a sequence in [γ_min, γ_max] where 0 < γ_min ≤ γ_max < 2β. Let (λ_n)_{n∈N} be a sequence in [λ_min, 1] where λ_min ∈ ]0,1]. We assume that zer(A+B) ≠ ∅. Let x_0 ∈ H and
(∀n ∈ N)  y_n = x_n − γ_n Bx_n;  x_{n+1} = x_n + λ_n(J_{γ_n A} y_n − x_n).
(x_n)_{n∈N} converges weakly to a point in zer(A+B). Remark: when B = 0 and λ_n ≡ 1, the algorithm reduces to the proximal point algorithm.

294 Optimization algorithm: Forward-Backward 123/161 Let H be a Hilbert space. Let f ∈ Γ₀(H) and g ∈ Γ₀(H) such that ∇g is 1/β-Lipschitz where β ∈ ]0,+∞[. Let (γ_n)_{n∈N} be a sequence in [γ_min, γ_max] where 0 < γ_min ≤ γ_max < 2β. Let (λ_n)_{n∈N} be a sequence in [λ_min, 1] where λ_min ∈ ]0,1]. We assume that Argmin(f + g) ≠ ∅. Let x_0 ∈ H and
(∀n ∈ N)  y_n = x_n − γ_n ∇g(x_n);  x_{n+1} = x_n + λ_n(prox_{γ_n f} y_n − x_n).
(x_n)_{n∈N} converges weakly to a minimizer of f + g.
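A minimal forward-backward (ISTA-type) sketch for f = η‖·‖₁ and g = ‖A· − z‖₂² (illustrative; the toy data and step choice are assumptions).

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((30, 60))
z = rng.standard_normal(30)
eta = 0.2

lip = 2 * np.linalg.norm(A, 2) ** 2   # Lipschitz constant of grad g, so beta = 1/lip
gamma = 1.8 / lip                      # constant step in ]0, 2*beta[

grad_g = lambda x: 2 * A.T @ (A @ x - z)
soft = lambda u, t: np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

x = np.zeros(60)
for n in range(500):
    y = x - gamma * grad_g(x)          # forward (gradient) step
    x = soft(y, gamma * eta)           # backward (proximal) step, lambda_n = 1
print(np.count_nonzero(x))             # sparse minimizer of ||A. - z||_2^2 + eta*||.||_1
```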

295 Optimization algorithm: Forward-Backward 124/161 MM (Majorization-Minimization) interpretation. For all n ∈ N, let p_n = prox_{γ_n f}(x_n − γ_n ∇g(x_n)). We have, by convexity of f, f(x_{n+1}) = f((1−λ_n)x_n + λ_n p_n) ≤ (1−λ_n)f(x_n) + λ_n f(p_n). We use the descent lemma, g(x_{n+1}) ≤ g(x_n) + ⟨∇g(x_n) | x_{n+1} − x_n⟩ + (1/(2β))‖x_{n+1} − x_n‖², the proximity operator definition and the subdifferential definition:
x_n − γ_n ∇g(x_n) − p_n ∈ γ_n ∂f(p_n)
⇒ γ_n f(p_n) + ⟨x_n − γ_n ∇g(x_n) − p_n | x_n − p_n⟩ ≤ γ_n f(x_n)
⇒ λ_n f(p_n) + ⟨∇g(x_n) | x_{n+1} − x_n⟩ + γ_n⁻¹ λ_n⁻¹ ‖x_{n+1} − x_n‖² ≤ λ_n f(x_n),
since x_{n+1} − x_n = λ_n(p_n − x_n).

296 Optimization algorithm: Forward-Backward 124/161 MM (Majorization-Minimization) interpretation. Combining the above inequalities,
f(x_{n+1}) + g(x_{n+1}) + (1/(γ_n λ_n) − 1/(2β)) ‖x_{n+1} − x_n‖² ≤ f(x_n) + g(x_n).
Thus, if γ_n⁻¹λ_n⁻¹ − (2β)⁻¹ ≥ 0, i.e. γ_n λ_n ≤ 2β, then (f(x_n) + g(x_n))_{n∈N} is a decreasing sequence. Because (∀n ∈ N) p_n = prox_{γ_n f} y_n ∈ dom f, if x_0 ∈ dom f, then (∀n ∈ N) x_{n+1} = (1−λ_n)x_n + λ_n p_n ∈ dom f. Thus (f(x_n) + g(x_n))_{n∈N} is a real decreasing sequence.

297 Optimization algorithm: Forward-Backward 125/161 Beck-Teboulle proximal gradient algorithm (Nesterov acceleration). Let x_0 = z_0 ∈ H, t_0 = 1, and
(∀n ∈ N)  y_n = z_n − β∇g(z_n);  x_{n+1} = prox_{βf} y_n;  t_{n+1} = (1 + √(1 + 4t_n²))/2;  λ_n = 1 + (t_n − 1)/t_{n+1};  z_{n+1} = x_n + λ_n(x_{n+1} − x_n).
Convergence of ((f+g)(x_n))_{n∈N} to the minimum at the optimal O(1/n²) rate, but convergence of (x_n)_{n∈N} is not secured theoretically.
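A compact sketch of this accelerated scheme on the same toy problem as above (illustrative assumptions: f = η‖·‖₁, g = ‖A· − z‖₂², random data).

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((30, 60))
z = rng.standard_normal(30)
eta = 0.2
beta = 1.0 / (2 * np.linalg.norm(A, 2) ** 2)   # 1 / Lipschitz constant of grad g

soft = lambda u, t: np.sign(u) * np.maximum(np.abs(u) - t, 0.0)
grad_g = lambda u: 2 * A.T @ (A @ u - z)

x = np.zeros(60)
zv = x.copy()          # z_n in the slide (extrapolated point)
t = 1.0
for n in range(300):
    y = zv - beta * grad_g(zv)
    x_new = soft(y, beta * eta)                 # prox of beta*eta*||.||_1
    t_new = (1 + np.sqrt(1 + 4 * t ** 2)) / 2
    lam = 1 + (t - 1) / t_new
    zv = x + lam * (x_new - x)                  # z_{n+1} = x_n + lam_n (x_{n+1} - x_n)
    x, t = x_new, t_new
print(np.count_nonzero(x))
```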

298 Optimization algorithm: Forward-Backward 126/161 Image restoration: variational approach
û ∈ Argmin_u ‖AF*u − z‖₂² + η‖u‖₁, with η ∈ ]0,+∞[ and F* ∈ R^{N×M} (frame synthesis operator).
FB with g = ‖AF*· − z‖₂², f = η‖·‖₁ and x̂ = F*û:
(∀n ∈ N)  y_n = x_n − γ_n ∇g(x_n) (closed form: ∇g = 2FA*(AF*· − z));  x_{n+1} = x_n + λ_n(prox_{γ_n f} y_n − x_n) (closed form: soft-thresholding).

300 Optimization algorithm: Forward-Backward 126/161 [Figure: degraded image z and restored image [FB DTT]]

301 Optimization algorithm: Forward-Backward 126/161 [Figure: degraded image z and restored image [PPXA TV], for comparison]

302 Optimization algorithm: Forward-Backward 126/161 [Figure: degraded image z and restored image [DR DWT], for comparison]

303 Optimization algorithm: Forward-Backward 126/161 [Plot: convergence profile versus iterations for γ_n ∈ {0.1, 1.5, 1.9, 2}, relative to the stability bound 2/‖AF*‖²]

304 Optimization algorithm: Forward-Backward 126/161 [Plot: ‖x_n − x̄‖ versus iterations for λ_n ∈ {0.5, 1, 1.1}]

305 Optimization algorithm: projected gradient 127/161 Let H be a Hilbert space. Let C be a nonempty closed convex subset of H. Let g ∈ Γ₀(H) such that ∇g is 1/β-Lipschitz where β ∈ ]0,+∞[. Let (γ_n)_{n∈N} be a sequence in [γ_min, γ_max] where 0 < γ_min ≤ γ_max < 2β. Let (λ_n)_{n∈N} be a sequence in [λ_min, 1] where λ_min ∈ ]0,1]. We assume that Argmin_{x∈C} g(x) ≠ ∅. Let x_0 ∈ H and
(∀n ∈ N)  y_n = x_n − γ_n ∇g(x_n);  x_{n+1} = x_n + λ_n(P_C y_n − x_n).
(x_n)_{n∈N} converges weakly to a minimizer of g on C.
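A minimal projected-gradient sketch with the box constraint C = [0,255]^N used in the restoration examples (toy data assumed).

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((30, 40))
z = rng.standard_normal(30)
gamma = 0.9 / np.linalg.norm(A, 2) ** 2  # in ]0, 2*beta[ since grad g is 2||A||^2-Lipschitz

grad_g = lambda x: 2 * A.T @ (A @ x - z)
proj_C = lambda x: np.clip(x, 0.0, 255.0)   # P_C for the box C = [0, 255]^N

x = np.full(40, 300.0)                      # infeasible starting point
for n in range(300):
    x = proj_C(x - gamma * grad_g(x))       # lambda_n = 1
print(x.min(), x.max())                     # iterates end up in C
```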

306 Optimization algorithm: projected gradient 128/161 Image restoration, regularized variational approach: û ∈ Argmin_u ‖AF*u − z‖₂² + η‖u‖₁, with η ∈ ]0,+∞[ and F* ∈ R^{N×M}. Image restoration, constrained variational approach: û ∈ Argmin_{u : ‖u‖₁ ≤ ε} ‖AF*u − z‖₂², with ε ∈ ]0,+∞[.

307 Fixed point algorithm: Forward-Backward-Forward 129/161 Let H be a Hilbert space. Let A : H → 2^H be a maximally monotone operator. Let B : H → H be a monotone and 1/β-Lipschitz operator, where β ∈ ]0,+∞[. Let ε ∈ ]0, β/(1+β)[ and let (γ_n)_{n∈N} be a sequence in [ε, (1−ε)β]. We assume that zer(A+B) ≠ ∅. Let x_0 ∈ H and
(∀n ∈ N)  y_n = x_n − γ_n Bx_n;  p_n = J_{γ_n A} y_n;  q_n = p_n − γ_n Bp_n;  x_{n+1} = x_n − y_n + q_n.
(x_n)_{n∈N} and (p_n)_{n∈N} converge weakly to x̄ ∈ zer(A+B).

308 Fixed point algorithm: Forward-Backward-Forward 130/161 Let H be a Hilbert space. Let f ∈ Γ₀(H). Let g ∈ Γ₀(H) such that ∇g is 1/β-Lipschitz where β ∈ ]0,+∞[. Let ε ∈ ]0, β/(1+β)[ and let (γ_n)_{n∈N} be a sequence in [ε, (1−ε)β]. We assume that Argmin(f + g) ≠ ∅. Let x_0 ∈ H and
(∀n ∈ N)  y_n = x_n − γ_n ∇g(x_n);  p_n = prox_{γ_n f} y_n;  q_n = p_n − γ_n ∇g(p_n);  x_{n+1} = x_n − y_n + q_n.
(x_n)_{n∈N} and (p_n)_{n∈N} converge weakly to x̄ ∈ Argmin(f + g).
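A minimal sketch of these Tseng-type iterations, with two gradient evaluations per step (toy f = η‖·‖₁ and g = ‖A· − z‖₂² assumed).

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((25, 50))
z = rng.standard_normal(25)
eta = 0.1

beta = 1.0 / (2 * np.linalg.norm(A, 2) ** 2)  # grad g is (1/beta)-Lipschitz
gamma = 0.9 * beta                             # constant step in [eps, (1 - eps)*beta]

soft = lambda u, t: np.sign(u) * np.maximum(np.abs(u) - t, 0.0)
grad_g = lambda x: 2 * A.T @ (A @ x - z)

x = np.zeros(50)
for n in range(500):
    y = x - gamma * grad_g(x)        # first forward step
    p = soft(y, gamma * eta)         # backward step: prox of gamma*eta*||.||_1
    q = p - gamma * grad_g(p)        # second forward step (correction)
    x = x - y + q
print(np.count_nonzero(x))
```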

309 Part 4: Duality 131/161
1. General duality concepts: primal and dual problems; duality theorems; inf-convolution
2. Augmented Lagrangian algorithms: ADMM; SDMM
3. Primal-dual algorithms: FB-based PD algorithm; M+LFBF algorithm

310 Duality 132/161 Let H be a Hilbert space. Let A : H → 2^H and B : H → 2^H. The parallel sum of A and B is A □ B = (A⁻¹ + B⁻¹)⁻¹. If A and B are monotone, then A □ B is monotone. A □ N_{{0}} = A, where N_{{0}} = ∂ι_{{0}} and gra N_{{0}} = {0} × H. A □ B = B □ A.

311 Duality 133/161 Primal problem: let H and G be two Hilbert spaces; let A : H → 2^H, C : H → 2^H, B : G → 2^G and D : G → 2^G be monotone operators; let L ∈ B(H,G). Find x̄ ∈ H such that 0 ∈ Ax̄ + Cx̄ + L*(B □ D)(Lx̄). Dual problem: find v̄ ∈ G such that 0 ∈ −L(A⁻¹ □ C⁻¹)(−L*v̄) + B⁻¹v̄ + D⁻¹v̄.

312 Duality theorem 134/161 Let x̄ ∈ H and v̄ ∈ G. (x̄, v̄) is a Kuhn-Tucker point if −L*v̄ ∈ (A+C)x̄ and Lx̄ ∈ (B⁻¹ + D⁻¹)v̄. If x̄ ∈ H is a solution to the primal problem, then there exists a solution v̄ to the dual problem such that (x̄, v̄) is a Kuhn-Tucker point. If v̄ ∈ G is a solution to the dual problem, then there exists a solution x̄ to the primal problem such that (x̄, v̄) is a Kuhn-Tucker point. If (x̄, v̄) is a Kuhn-Tucker point, then x̄ is a solution to the primal problem and v̄ is a solution to the dual problem.

315 Inf-convolution 135/161 Let H be a Hilbert space. Let f : H → ]−∞,+∞] and g : H → ]−∞,+∞]. The inf-convolution of f and g is
f □ g : H → [−∞,+∞] : x ↦ inf_{y∈H} f(y) + g(x−y) = inf_{(u,v)∈H², u+v=x} f(u) + g(v).
Properties: f □ ι_{{0}} = f; f □ g = g □ f; dom(f □ g) = dom f + dom g; the Moreau envelope of index γ is f □ (1/(2γ))‖·‖², γ ∈ ]0,+∞[. If f : H → ]−∞,+∞] and g : H → ]−∞,+∞] are convex, then f □ g is convex. (A worked example of the Moreau envelope follows.)
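Worked example (added for illustration): for f = |·| on R, the Moreau envelope f □ (1/(2γ))|·|² has a closed form, the infimum being attained at y = soft_{[−γ,γ]}(x); this is the Huber function:

\[
\big(f \,\square\, \tfrac{1}{2\gamma}|\cdot|^{2}\big)(x)
=\inf_{y\in\mathbb{R}}\Big(|y|+\tfrac{1}{2\gamma}(x-y)^{2}\Big)
=\begin{cases}
\dfrac{x^{2}}{2\gamma}, & |x|\le \gamma,\\[4pt]
|x|-\dfrac{\gamma}{2}, & |x|>\gamma.
\end{cases}
\]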

319 Inf-convolution 136/161 Conjugation of an inf-convolution versus Fourier transform of a convolution (H finite dimensional):

Property                      | Conjugation: h(x) ↔ h*(u)   | Fourier: h(x) ↔ ĥ(ν)
inf-convolution / convolution | (f □ g)(x) ↔ f*(u) + g*(u)  | (f ∗ g)(x) ↔ f̂(ν)ĝ(ν)
sum / product                 | f(x) + g(x) ↔ (f* □ g*)(u)  | f(x)g(x) ↔ (f̂ ∗ ĝ)(ν)

with (f □ g)(x) = inf_{y∈H} f(y) + g(x−y) and (f ∗ g)(x) = ∫_H f(y)g(x−y) dy; the sum rule in the first column requires f ∈ Γ₀(H), g ∈ Γ₀(H) and dom f ∩ dom g ≠ ∅.

320 Inf-convolution 137/161 Let H be a Hilbert space. Let f ∈ Γ₀(H) and g ∈ Γ₀(H). If dom f* ∩ int(dom g*) ≠ ∅, then ∂(f □ g) (inf-convolution) = ∂f □ ∂g (parallel sum). Proof: ∂(f □ g) = ∂(f* + g*)* = (∂(f* + g*))⁻¹ = (∂f* + ∂g*)⁻¹ = ((∂f)⁻¹ + (∂g)⁻¹)⁻¹ = ∂f □ ∂g.

322 Fenchel-Rockafellar duality 138/161 Primal problem: let H and G be two Hilbert spaces; let f : H → ]−∞,+∞], g : G → ]−∞,+∞] and L ∈ B(H,G). We want to minimize_{x∈H} f(x) + g(Lx). Dual problem: we want to minimize_{v∈G} f*(−L*v) + g*(v).

323 Fenchel-Rockafellar duality 139/161 Weak duality. Let H and G be two Hilbert spaces, let f be a proper function from H to ]−∞,+∞], let g be a proper function from G to ]−∞,+∞] and let L ∈ B(H,G). Let µ = inf_{x∈H} f(x) + g(Lx) and µ* = inf_{v∈G} f*(−L*v) + g*(v). We have µ ≥ −µ*. If µ ∈ R, µ + µ* is called the duality gap.

324 Fenchel-Rockafellar duality 140/161 Let H and G be two Hilbert spaces, f ∈ Γ₀(H), g ∈ Γ₀(G) and L ∈ B(H,G). If int(dom g) ∩ L(dom f) ≠ ∅ or dom g ∩ int(L(dom f)) ≠ ∅, then ∂f + L* ∘ ∂g ∘ L = ∂(f + g∘L).

325 Fenchel-Rockafellar duality 141/161 Duality theorem (1). Let H and G be two Hilbert spaces, f ∈ Γ₀(H), g ∈ Γ₀(G) and L ∈ B(H,G). zer(∂f + L* ∘ ∂g ∘ L) ≠ ∅ ⇔ zer(−L ∘ ∂f* ∘ (−L*) + ∂g*) ≠ ∅.

326 Fenchel-Rockafellar duality 142/161 Duality theorem (2). Let H and G be two Hilbert spaces, f ∈ Γ₀(H), g ∈ Γ₀(G) and L ∈ B(H,G). If there exists x̄ ∈ H such that 0 ∈ ∂f(x̄) + L*∂g(Lx̄), then x̄ is a solution to the primal problem. Moreover, −L*v̄ ∈ ∂f(x̄) and Lx̄ ∈ ∂g*(v̄), where v̄ is a solution to the dual problem. Conversely, if there exists (x̄, v̄) ∈ H × G such that −L*v̄ ∈ ∂f(x̄) and Lx̄ ∈ ∂g*(v̄), then x̄ (resp. v̄) is a solution to the primal (resp. dual) problem. Particular case: if f = φ + (1/2)‖· − z‖² where φ ∈ Γ₀(H) and z ∈ H, then −L*v̄ ∈ ∂f(x̄) ⇔ −L*v̄ ∈ ∂φ(x̄) + x̄ − z. We have then x̄ = prox_φ(−L*v̄ + z).

327 Fenchel-Rockafellar duality 143/161 Duality theorem (3): strong duality. Let H and G be two Hilbert spaces, f ∈ Γ₀(H), g ∈ Γ₀(G) and L ∈ B(H,G). If int(dom g) ∩ L(dom f) ≠ ∅ or dom g ∩ int(L(dom f)) ≠ ∅, then µ = inf_{x∈H} f(x) + g(Lx) = −min_{v∈G} f*(−L*v) + g*(v) = −µ*.

328 Fenchel-Rockafellar duality 144/161 Extension of the duality theorem. Let H and G be two Hilbert spaces, f ∈ Γ₀(H), h ∈ Γ₀(H), g ∈ Γ₀(G), l ∈ Γ₀(G) such that dom h = H and dom l = G, and let L ∈ B(H,G). If there exists x̄ ∈ H such that 0 ∈ ∂f(x̄) + L*(∂g □ ∂l)(Lx̄) + ∂h(x̄), then x̄ is a solution to the primal problem minimize_{x∈H} f(x) + (g □ l)(Lx) + h(x). Moreover, −L*v̄ ∈ ∂f(x̄) + ∂h(x̄) and Lx̄ ∈ ∂g*(v̄) + ∂l*(v̄), where v̄ is a solution to the dual problem minimize_{v∈G} (f* □ h*)(−L*v) + g*(v) + l*(v).

329 Augmented Lagrangian method 145/161 ADMM algorithm (Alternating-direction method of multipliers) = Douglas-Rachford for the dual problem. Let H and G be two Hilbert spaces. Let f ∈ Γ₀(H) and g ∈ Γ₀(G). Let L ∈ B(H,G) such that L*L is an isomorphism and let γ ∈ ]0,+∞[. The Douglas-Rachford iterations to minimize f* ∘ (−L*) + g* are
(∀n ∈ N)  v_n = prox_{γg*} u_n;  w_n = prox_{γ f*∘(−L*)}(2v_n − u_n);  u_{n+1} = u_n + w_n − v_n.
They can be rewritten step by step:
1. By Moreau's decomposition, v_n = u_n − γ prox_{γ⁻¹g}(γ⁻¹u_n), and the w-update is characterized by 2v_n − u_n − w_n ∈ γ ∂(f* ∘ (−L*))(w_n) = −γL ∂f*(−L*w_n).
2. Introducing x_n ∈ ∂f*(−L*w_n) such that 2v_n − u_n − w_n = −γLx_n, we get w_n = 2v_n − u_n + γLx_n with −L*w_n ∈ ∂f(x_n).
3. Hence −L*(2v_n − u_n + γLx_n) ∈ ∂f(x_n) and u_{n+1} = u_n + w_n − v_n = v_n + γLx_n. Using y_n = γ⁻¹(u_n − v_n) and z_n = γ⁻¹v_n, this becomes y_n = prox_{γ⁻¹g}(γ⁻¹u_n), L*(y_n − z_n − Lx_n) ∈ γ⁻¹∂f(x_n), u_{n+1} = γz_n + γLx_n.
4. The inclusion characterizes x_n = argmin_{x∈H} (1/2)‖Lx − y_n + z_n‖² + γ⁻¹f(x).
Final form of the iterations:
(∀n ∈ N)  x_n = argmin_{x∈H} (1/2)‖Lx − y_n + z_n‖² + γ⁻¹f(x);  s_n = Lx_n;  y_{n+1} = prox_{g/γ}(z_n + s_n);  z_{n+1} = z_n + s_n − y_{n+1}.

339 Augmented Lagrangian method 146/161 ADMM algorithm: Lagrangian interpretation. minimize_{x∈H} f(x) + g(Lx) ⇔ minimize_{x∈H, y∈G, Lx=y} f(x) + g(y). Lagrange function: (∀(x,y,v) ∈ H × G²) L(x,y,v) = f(x) + g(y) + ⟨v | Lx − y⟩, where v ∈ G denotes the Lagrange multiplier. Augmented Lagrange function: let γ ∈ ]0,+∞[; we define (∀(x,y,z) ∈ H × G²) L(x,y,z) = f(x) + g(y) + γ⟨z | Lx − y⟩ + (γ/2)‖Lx − y‖², so that
∂_x L(x,y,z) = ∂f(x) + γL*z + γL*(Lx − y);  ∂_y L(x,y,z) = ∂g(y) − γz + γ(y − Lx);  ∇_z L(x,y,z) = γ(Lx − y).
Thus (0,0,0) ∈ (∂_x L(x̂,ŷ,ẑ), ∂_y L(x̂,ŷ,ẑ), ∇_z L(x̂,ŷ,ẑ)) ⇔ −γL*ẑ ∈ ∂f(x̂), γẑ ∈ ∂g(Lx̂) ⇔ Lx̂ ∈ ∂g*(γẑ), and ŷ = Lx̂.
To sum up: (x̂,ŷ,ẑ) is a critical point of L ⇔ ŷ = Lx̂, x̂ is a solution to the primal problem and γẑ is a solution to the dual problem. Moreover, (x̂,ŷ,ẑ) is a critical point of L ⇔ (x̂,ŷ,ẑ) is a saddle point of L, i.e. L(x̂,ŷ,z) ≤ L(x̂,ŷ,ẑ) ≤ L(x,y,ẑ).
Algorithm to find (x̂,ŷ,ẑ):
(∀n ∈ N)  x_n = argmin_{x∈H} L(x,y_n,z_n);  y_{n+1} = argmin_{y∈G} L(x_n,y,z_n);  z_{n+1} such that L(x_n,y_{n+1},z_{n+1}) ≥ L(x_n,y_{n+1},z_n).
Explicitly:
(∀n ∈ N)  x_n = argmin_{x∈H} f(x) + γ⟨z_n | Lx − y_n⟩ + (γ/2)‖Lx − y_n‖² = argmin_{x∈H} (1/2)‖Lx − y_n + z_n‖² + γ⁻¹f(x);
y_{n+1} = argmin_{y∈G} g(y) + γ⟨z_n | Lx_n − y⟩ + (γ/2)‖Lx_n − y‖² = prox_{g/γ}(z_n + Lx_n);
z_{n+1} = z_n + (1/γ)∇_z L(x_n,y_{n+1},z_n) = z_n + Lx_n − y_{n+1}.

346 Augmented Lagrangian method 147/161 ADMM algorithm (Alternating-direction method of multipliers). Let H and G be two Hilbert spaces. Let f ∈ Γ₀(H) and g ∈ Γ₀(G). Let L ∈ B(H,G) such that L*L is an isomorphism and let γ ∈ ]0,+∞[. We assume that int(dom g) ∩ L(dom f) ≠ ∅ or dom g ∩ int(L(dom f)) ≠ ∅, and that Argmin(f + g∘L) ≠ ∅. Let
(∀n ∈ N)  x_n = argmin_{x∈H} (1/2)‖Lx − y_n + z_n‖² + γ⁻¹f(x);  s_n = Lx_n;  y_{n+1} = prox_{g/γ}(z_n + s_n);  z_{n+1} = z_n + s_n − y_{n+1}.
We have: x_n ⇀ x̄ where x̄ ∈ Argmin(f + g∘L); γz_n ⇀ v̄ where v̄ ∈ Argmin(f* ∘ (−L*) + g*).
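A minimal ADMM sketch on a 1-D total-variation-type denoising problem (illustrative assumptions: f = ‖· − z‖₂², g = η‖·‖₁, L the first-difference matrix; the multiplier is named w here because z denotes the data).

```python
import numpy as np

rng = np.random.default_rng(5)
N = 50
x_true = np.repeat([1.0, 3.0, 2.0], [20, 15, 15])   # piecewise-constant signal
z = x_true + 0.3 * rng.standard_normal(N)           # noisy observation
eta, gamma = 1.0, 1.0

L = np.diff(np.eye(N), axis=0)      # first-difference operator (L in the slides)
soft = lambda u, t: np.sign(u) * np.maximum(np.abs(u) - t, 0.0)
M = L.T @ L + (2.0 / gamma) * np.eye(N)             # normal matrix of the x-update

y = np.zeros(N - 1)                 # y_n
w = np.zeros(N - 1)                 # multiplier (z_n in the slides)
for n in range(300):
    # x_n = argmin_x 0.5 ||Lx - y_n + w_n||^2 + (1/gamma) ||x - z||^2
    x = np.linalg.solve(M, L.T @ (y - w) + (2.0 / gamma) * z)
    s = L @ x
    y = soft(w + s, eta / gamma)    # prox_{g/gamma} with g = eta*||.||_1
    w = w + s - y
print(np.round(x, 2))               # near piecewise-constant estimate
```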

348 Augmented Lagrangian method 148/161 SDMM algorithm (Simultaneous-direction method of multipliers). Let H and G₁,...,G_m be Hilbert spaces. Let (∀i ∈ {1,...,m}) g_i ∈ Γ₀(G_i) and L_i ∈ B(H,G_i). Let γ ∈ ]0,+∞[. We assume that ∑_{i=1}^m L_i*L_i is an isomorphism and that there exists x ∈ H such that (∀i ∈ {1,...,m}) L_i x ∈ int(dom g_i). Let
(∀n ∈ N)  x_n = (∑_{i=1}^m L_i*L_i)⁻¹ ∑_{i=1}^m L_i*(y_{n,i} − z_{n,i});  s_{n,i} = L_i x_n, i ∈ {1,...,m};  y_{n+1,i} = prox_{g_i/γ}(z_{n,i} + s_{n,i}), i ∈ {1,...,m};  z_{n+1,i} = z_{n,i} + s_{n,i} − y_{n+1,i}, i ∈ {1,...,m}.
We have: x_n ⇀ x̄ where x̄ ∈ Argmin ∑_{i=1}^m g_i ∘ L_i; γz_n = γ(z_{n,i})_{1≤i≤m} ⇀ v̄, where v̄ = (v̄_i)_{1≤i≤m} minimizes ∑_{i=1}^m g_i*(v_i) subject to ∑_{i=1}^m L_i*v_i = 0.

350 Primal-dual method 149/161 FB-based PD algorithm (Condat-Vũ-...). Let H and G be two Hilbert spaces and L ∈ B(H,G). Let A : H → 2^H and B : G → 2^G be two maximally monotone operators. Let C : H → H be a µ-cocoercive operator with µ ∈ ]0,+∞[. Let D : G → 2^G be a ν-strongly monotone operator with ν ∈ ]0,+∞[. Let β = min{µ,ν}, τ ∈ ]0,+∞[, σ ∈ ]0,+∞[, ρ = min{τ⁻¹, σ⁻¹}(1 − √(τσ)‖L‖) and δ = min{1, ρβ} + 1/2. We assume that 2ρβ > 1. Let (λ_n)_{n∈N} be a sequence in ]0,δ[ such that ∑_{n∈N} λ_n(δ−λ_n) = +∞. We assume that zer(A + C + L*(B □ D)L) ≠ ∅. Let x_0 ∈ H, v_0 ∈ G and
(∀n ∈ N)  p_n = J_τA(x_n − τ(Cx_n + L*v_n));  q_n = J_{σB⁻¹}(v_n + σ(L(2p_n − x_n) − D⁻¹v_n));  (x_{n+1},v_{n+1}) = (x_n,v_n) + λ_n((p_n,q_n) − (x_n,v_n)).
We have: x_n ⇀ x̄ ∈ zer(A + C + L*(B □ D)L) and v_n ⇀ v̄ ∈ zer(−L(A⁻¹ □ C⁻¹)(−L*) + B⁻¹ + D⁻¹).

352 Primal-dual method 150/161 Proof: (x̄, v̄) is a Kuhn-Tucker point iff (x̄, v̄) ∈ zer(P + Q), where P = M + S is maximally monotone, with M : (x,v) ↦ (Ax, B⁻¹v) and S : (x,v) ↦ (L*v, −Lx) (a skew linear operator), and Q : (x,v) ↦ (Cx, D⁻¹v) is β-cocoercive with β = min{µ,ν}. ⇒ Forward-Backward algorithm.

353 Primal-dual optimization algorithm 151/161 FB-based PD algorithm. Let H and G be two Hilbert spaces and L ∈ B(H,G). Let f ∈ Γ₀(H) and g ∈ Γ₀(G). Let h ∈ Γ₀(H) have a 1/µ-Lipschitzian gradient and let l ∈ Γ₀(G) be ν-strongly convex; set β = min{µ,ν}. Let τ ∈ ]0,+∞[, σ ∈ ]0,+∞[, ρ = min{τ⁻¹, σ⁻¹}(1 − √(τσ)‖L‖) and δ = min{1, ρβ} + 1/2. We assume that 2ρβ > 1. Let (λ_n)_{n∈N} be a sequence in ]0,δ[ such that ∑_{n∈N} λ_n(δ−λ_n) = +∞. We assume that zer(∂f + ∇h + L*(∂g □ ∂l)L) ≠ ∅. Let x_0 ∈ H, v_0 ∈ G and
(∀n ∈ N)  p_n = prox_τf(x_n − τ(∇h(x_n) + L*v_n));  q_n = prox_{σg*}(v_n + σ(L(2p_n − x_n) − ∇l*(v_n)));  (x_{n+1},v_{n+1}) = (x_n,v_n) + λ_n((p_n,q_n) − (x_n,v_n)).
We have: x_n ⇀ x̄ ∈ Argmin(f + h + (g □ l)∘L) and v_n ⇀ v̄ ∈ Argmin((f* □ h*)∘(−L*) + g* + l*).
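A minimal sketch of this primal-dual scheme on the same 1-D total-variation problem as in the ADMM example, with f = 0, h = ‖· − z‖₂² and g = η‖·‖₁ composed with the difference operator L (illustrative; the step sizes are chosen conservatively, following the sufficient condition 1/τ − σ‖L‖² ≥ Lip(∇h)/2 used in Condat's variant, which is an assumption of this sketch).

```python
import numpy as np

rng = np.random.default_rng(6)
N = 50
x_true = np.repeat([1.0, 3.0, 2.0], [20, 15, 15])
z = x_true + 0.3 * rng.standard_normal(N)
eta = 1.0

L = np.diff(np.eye(N), axis=0)     # ||L||^2 <= 4 for first differences
grad_h = lambda x: 2 * (x - z)     # h = ||. - z||_2^2, gradient is 2-Lipschitz
tau, sigma, lam = 0.5, 0.1, 1.0    # 1/tau - sigma*||L||^2 = 1.6 >= 1

x = np.zeros(N)
v = np.zeros(N - 1)
for n in range(500):
    p = x - tau * (grad_h(x) + L.T @ v)                     # prox_{tau f} = Id since f = 0
    q = np.clip(v + sigma * (L @ (2 * p - x)), -eta, eta)   # prox_{sigma g*} = projection onto [-eta, eta]^{N-1}
    x = x + lam * (p - x)
    v = v + lam * (q - v)
print(np.round(x, 2))
```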

355 Primal-dual optimization algorithm 152/161 FB-based PD algorithm: remarks.
* No operator inversion is required.
* Allows the use of proximable and/or differentiable functions.
* When h = 0, l = ι_{{0}}, λ_n ≡ 1 and στ‖L‖² < 1, this yields the Chambolle-Pock algorithm:
(∀n ∈ N)  x_{n+1} = prox_τf(x_n − τL*v_n);  y_n = 2x_{n+1} − x_n;  v_{n+1} = prox_{σg*}(v_n + σLy_n).
* When g = 0, l = ι_{{0}} and L = 0, this yields the forward-backward algorithm:
(∀n ∈ N)  p_n = prox_τf(x_n − τ∇h(x_n));  x_{n+1} = x_n + λ_n(p_n − x_n).
* In the limit case when h = 0, l = ι_{{0}}, λ_n ≡ 1, L = Id and σ = 1/τ, this yields the Douglas-Rachford algorithm:
(∀n ∈ N)  x_{n+1} = prox_τf(x_n − τv_n);  s_n = prox_τg(2x_{n+1} − (x_n − τv_n));  x_{n+1} − τv_{n+1} = (x_n − τv_n) + s_n − x_{n+1}.

362 FB-based PD algorithm 153/161 Image restoration: variational approach
x̂ ∈ Argmin_{x∈R^N} ‖Ax − z‖₂² + η‖[W₁H₁ ... W_JH_J]x‖_{2,1} + ι_C(x), with η > 0, H₁,...,H_J ∈ R^{N×N} (e.g. operators associated with filters), W₁,...,W_J ∈ R^{N×N} (e.g. weights), C = [0,255]^N.
PPXA+ would require (2Id + H₁*W₁*W₁H₁ + ··· + H_J*W_J*W_JH_J)⁻¹: complicated. FB-based PD algorithm: implementation without inversion.

363 FB-based PD algorithm 154/161 Image restoration: x̂ ∈ Argmin_{x∈R^N} ‖Ax − z‖₂² + η‖[W₁H₁ ... W_JH_J]x‖_{2,1} + ι_C(x). [Figure: degraded image z and restored image [PD NLTV]]

364 FB-based PD algorithm 154/161 [Figure: degraded image z and restored image [FB DTT], for comparison]

365 FB-based PD algorithm 154/161 [Figure: degraded image z and restored image [PPXA TV], for comparison]


GENERAL NONCONVEX SPLIT VARIATIONAL INEQUALITY PROBLEMS. Jong Kyu Kim, Salahuddin, and Won Hee Lim Korean J. Math. 25 (2017), No. 4, pp. 469 481 https://doi.org/10.11568/kjm.2017.25.4.469 GENERAL NONCONVEX SPLIT VARIATIONAL INEQUALITY PROBLEMS Jong Kyu Kim, Salahuddin, and Won Hee Lim Abstract. In this

More information

An Algorithm for Splitting Parallel Sums of Linearly Composed Monotone Operators, with Applications to Signal Recovery

An Algorithm for Splitting Parallel Sums of Linearly Composed Monotone Operators, with Applications to Signal Recovery An Algorithm for Splitting Parallel Sums of Linearly Composed Monotone Operators, with Applications to Signal Recovery Stephen R. Becker and Patrick L. Combettes UPMC Université Paris 06 Laboratoire Jacques-Louis

More information

Operator Splitting for Parallel and Distributed Optimization

Operator Splitting for Parallel and Distributed Optimization Operator Splitting for Parallel and Distributed Optimization Wotao Yin (UCLA Math) Shanghai Tech, SSDS 15 June 23, 2015 URL: alturl.com/2z7tv 1 / 60 What is splitting? Sun-Tzu: (400 BC) Caesar: divide-n-conquer

More information

WEAK CONVERGENCE OF RESOLVENTS OF MAXIMAL MONOTONE OPERATORS AND MOSCO CONVERGENCE

WEAK CONVERGENCE OF RESOLVENTS OF MAXIMAL MONOTONE OPERATORS AND MOSCO CONVERGENCE Fixed Point Theory, Volume 6, No. 1, 2005, 59-69 http://www.math.ubbcluj.ro/ nodeacj/sfptcj.htm WEAK CONVERGENCE OF RESOLVENTS OF MAXIMAL MONOTONE OPERATORS AND MOSCO CONVERGENCE YASUNORI KIMURA Department

More information

M. Marques Alves Marina Geremia. November 30, 2017

M. Marques Alves Marina Geremia. November 30, 2017 Iteration complexity of an inexact Douglas-Rachford method and of a Douglas-Rachford-Tseng s F-B four-operator splitting method for solving monotone inclusions M. Marques Alves Marina Geremia November

More information

Convex Analysis and Optimization Chapter 2 Solutions

Convex Analysis and Optimization Chapter 2 Solutions Convex Analysis and Optimization Chapter 2 Solutions Dimitri P. Bertsekas with Angelia Nedić and Asuman E. Ozdaglar Massachusetts Institute of Technology Athena Scientific, Belmont, Massachusetts http://www.athenasc.com

More information

Dual methods for the minimization of the total variation

Dual methods for the minimization of the total variation 1 / 30 Dual methods for the minimization of the total variation Rémy Abergel supervisor Lionel Moisan MAP5 - CNRS UMR 8145 Different Learning Seminar, LTCI Thursday 21st April 2016 2 / 30 Plan 1 Introduction

More information

arxiv: v3 [math.oc] 18 Apr 2012

arxiv: v3 [math.oc] 18 Apr 2012 A class of Fejér convergent algorithms, approximate resolvents and the Hybrid Proximal-Extragradient method B. F. Svaiter arxiv:1204.1353v3 [math.oc] 18 Apr 2012 Abstract A new framework for analyzing

More information

Parcours OJD, Ecole Polytechnique et Université Pierre et Marie Curie 05 Mai 2015

Parcours OJD, Ecole Polytechnique et Université Pierre et Marie Curie 05 Mai 2015 Examen du cours Optimisation Stochastique Version 06/05/2014 Mastère de Mathématiques de la Modélisation F. Bonnans Parcours OJD, Ecole Polytechnique et Université Pierre et Marie Curie 05 Mai 2015 Authorized

More information

Convex Analysis Notes. Lecturer: Adrian Lewis, Cornell ORIE Scribe: Kevin Kircher, Cornell MAE

Convex Analysis Notes. Lecturer: Adrian Lewis, Cornell ORIE Scribe: Kevin Kircher, Cornell MAE Convex Analysis Notes Lecturer: Adrian Lewis, Cornell ORIE Scribe: Kevin Kircher, Cornell MAE These are notes from ORIE 6328, Convex Analysis, as taught by Prof. Adrian Lewis at Cornell University in the

More information

Sum of two maximal monotone operators in a general Banach space is maximal

Sum of two maximal monotone operators in a general Banach space is maximal arxiv:1505.04879v1 [math.fa] 19 May 2015 Sum of two maximal monotone operators in a general Banach space is maximal S R Pattanaik, D K Pradhan and S Pradhan May 20, 2015 Abstract In a real Banach space,

More information

Tight Rates and Equivalence Results of Operator Splitting Schemes

Tight Rates and Equivalence Results of Operator Splitting Schemes Tight Rates and Equivalence Results of Operator Splitting Schemes Wotao Yin (UCLA Math) Workshop on Optimization for Modern Computing Joint w Damek Davis and Ming Yan UCLA CAM 14-51, 14-58, and 14-59 1

More information

Operators. Pontus Giselsson

Operators. Pontus Giselsson Operators Pontus Giselsson 1 Today s lecture forward step operators gradient step operators (subclass of forward step operators) resolvents proimal operators (subclass of resolvents) reflected resolvents

More information

Monotone operators and bigger conjugate functions

Monotone operators and bigger conjugate functions Monotone operators and bigger conjugate functions Heinz H. Bauschke, Jonathan M. Borwein, Xianfu Wang, and Liangjin Yao August 12, 2011 Abstract We study a question posed by Stephen Simons in his 2008

More information

arxiv: v1 [math.fa] 25 May 2009

arxiv: v1 [math.fa] 25 May 2009 The Brézis-Browder Theorem revisited and properties of Fitzpatrick functions of order n arxiv:0905.4056v1 [math.fa] 25 May 2009 Liangjin Yao May 22, 2009 Abstract In this note, we study maximal monotonicity

More information

Iterative algorithms based on the hybrid steepest descent method for the split feasibility problem

Iterative algorithms based on the hybrid steepest descent method for the split feasibility problem Available online at www.tjnsa.com J. Nonlinear Sci. Appl. 9 (206), 424 4225 Research Article Iterative algorithms based on the hybrid steepest descent method for the split feasibility problem Jong Soo

More information

Analysis Preliminary Exam Workshop: Hilbert Spaces

Analysis Preliminary Exam Workshop: Hilbert Spaces Analysis Preliminary Exam Workshop: Hilbert Spaces 1. Hilbert spaces A Hilbert space H is a complete real or complex inner product space. Consider complex Hilbert spaces for definiteness. If (, ) : H H

More information

A generalized forward-backward method for solving split equality quasi inclusion problems in Banach spaces

A generalized forward-backward method for solving split equality quasi inclusion problems in Banach spaces Available online at www.isr-publications.com/jnsa J. Nonlinear Sci. Appl., 10 (2017), 4890 4900 Research Article Journal Homepage: www.tjnsa.com - www.isr-publications.com/jnsa A generalized forward-backward

More information

I P IANO : I NERTIAL P ROXIMAL A LGORITHM FOR N ON -C ONVEX O PTIMIZATION

I P IANO : I NERTIAL P ROXIMAL A LGORITHM FOR N ON -C ONVEX O PTIMIZATION I P IANO : I NERTIAL P ROXIMAL A LGORITHM FOR N ON -C ONVEX O PTIMIZATION Peter Ochs University of Freiburg Germany 17.01.2017 joint work with: Thomas Brox and Thomas Pock c 2017 Peter Ochs ipiano c 1

More information

A Unified Approach to Proximal Algorithms using Bregman Distance

A Unified Approach to Proximal Algorithms using Bregman Distance A Unified Approach to Proximal Algorithms using Bregman Distance Yi Zhou a,, Yingbin Liang a, Lixin Shen b a Department of Electrical Engineering and Computer Science, Syracuse University b Department

More information

Subgradient Projectors: Extensions, Theory, and Characterizations

Subgradient Projectors: Extensions, Theory, and Characterizations Subgradient Projectors: Extensions, Theory, and Characterizations Heinz H. Bauschke, Caifang Wang, Xianfu Wang, and Jia Xu April 13, 2017 Abstract Subgradient projectors play an important role in optimization

More information

Generalized greedy algorithms.

Generalized greedy algorithms. Generalized greedy algorithms. François-Xavier Dupé & Sandrine Anthoine LIF & I2M Aix-Marseille Université - CNRS - Ecole Centrale Marseille, Marseille ANR Greta Séminaire Parisien des Mathématiques Appliquées

More information

arxiv: v2 [math.fa] 21 Jul 2013

arxiv: v2 [math.fa] 21 Jul 2013 Applications of Convex Analysis within Mathematics Francisco J. Aragón Artacho, Jonathan M. Borwein, Victoria Martín-Márquez, and Liangjin Yao arxiv:1302.1978v2 [math.fa] 21 Jul 2013 July 19, 2013 Abstract

More information

Adaptive discretization and first-order methods for nonsmooth inverse problems for PDEs

Adaptive discretization and first-order methods for nonsmooth inverse problems for PDEs Adaptive discretization and first-order methods for nonsmooth inverse problems for PDEs Christian Clason Faculty of Mathematics, Universität Duisburg-Essen joint work with Barbara Kaltenbacher, Tuomo Valkonen,

More information

Subdifferential representation of convex functions: refinements and applications

Subdifferential representation of convex functions: refinements and applications Subdifferential representation of convex functions: refinements and applications Joël Benoist & Aris Daniilidis Abstract Every lower semicontinuous convex function can be represented through its subdifferential

More information

Lecture 2: Convex Sets and Functions

Lecture 2: Convex Sets and Functions Lecture 2: Convex Sets and Functions Hyang-Won Lee Dept. of Internet & Multimedia Eng. Konkuk University Lecture 2 Network Optimization, Fall 2015 1 / 22 Optimization Problems Optimization problems are

More information

Dual Proximal Gradient Method

Dual Proximal Gradient Method Dual Proximal Gradient Method http://bicmr.pku.edu.cn/~wenzw/opt-2016-fall.html Acknowledgement: this slides is based on Prof. Lieven Vandenberghes lecture notes Outline 2/19 1 proximal gradient method

More information

Brøndsted-Rockafellar property of subdifferentials of prox-bounded functions. Marc Lassonde Université des Antilles et de la Guyane

Brøndsted-Rockafellar property of subdifferentials of prox-bounded functions. Marc Lassonde Université des Antilles et de la Guyane Conference ADGO 2013 October 16, 2013 Brøndsted-Rockafellar property of subdifferentials of prox-bounded functions Marc Lassonde Université des Antilles et de la Guyane Playa Blanca, Tongoy, Chile SUBDIFFERENTIAL

More information

BASICS OF CONVEX ANALYSIS

BASICS OF CONVEX ANALYSIS BASICS OF CONVEX ANALYSIS MARKUS GRASMAIR 1. Main Definitions We start with providing the central definitions of convex functions and convex sets. Definition 1. A function f : R n R + } is called convex,

More information

Primal-dual algorithms for the sum of two and three functions 1

Primal-dual algorithms for the sum of two and three functions 1 Primal-dual algorithms for the sum of two and three functions 1 Ming Yan Michigan State University, CMSE/Mathematics 1 This works is partially supported by NSF. optimization problems for primal-dual algorithms

More information

Self-dual Smooth Approximations of Convex Functions via the Proximal Average

Self-dual Smooth Approximations of Convex Functions via the Proximal Average Chapter Self-dual Smooth Approximations of Convex Functions via the Proximal Average Heinz H. Bauschke, Sarah M. Moffat, and Xianfu Wang Abstract The proximal average of two convex functions has proven

More information

Chapter 2: Preliminaries and elements of convex analysis

Chapter 2: Preliminaries and elements of convex analysis Chapter 2: Preliminaries and elements of convex analysis Edoardo Amaldi DEIB Politecnico di Milano edoardo.amaldi@polimi.it Website: http://home.deib.polimi.it/amaldi/opt-14-15.shtml Academic year 2014-15

More information

GEOMETRIC APPROACH TO CONVEX SUBDIFFERENTIAL CALCULUS October 10, Dedicated to Franco Giannessi and Diethard Pallaschke with great respect

GEOMETRIC APPROACH TO CONVEX SUBDIFFERENTIAL CALCULUS October 10, Dedicated to Franco Giannessi and Diethard Pallaschke with great respect GEOMETRIC APPROACH TO CONVEX SUBDIFFERENTIAL CALCULUS October 10, 2018 BORIS S. MORDUKHOVICH 1 and NGUYEN MAU NAM 2 Dedicated to Franco Giannessi and Diethard Pallaschke with great respect Abstract. In

More information

Fenchel Duality between Strong Convexity and Lipschitz Continuous Gradient

Fenchel Duality between Strong Convexity and Lipschitz Continuous Gradient Fenchel Duality between Strong Convexity and Lipschitz Continuous Gradient Xingyu Zhou The Ohio State University zhou.2055@osu.edu December 5, 2017 Xingyu Zhou (OSU) Fenchel Duality December 5, 2017 1

More information

A user s guide to Lojasiewicz/KL inequalities

A user s guide to Lojasiewicz/KL inequalities Other A user s guide to Lojasiewicz/KL inequalities Toulouse School of Economics, Université Toulouse I SLRA, Grenoble, 2015 Motivations behind KL f : R n R smooth ẋ(t) = f (x(t)) or x k+1 = x k λ k f

More information

1 Introduction and preliminaries

1 Introduction and preliminaries Proximal Methods for a Class of Relaxed Nonlinear Variational Inclusions Abdellatif Moudafi Université des Antilles et de la Guyane, Grimaag B.P. 7209, 97275 Schoelcher, Martinique abdellatif.moudafi@martinique.univ-ag.fr

More information

Compressed Sensing: a Subgradient Descent Method for Missing Data Problems

Compressed Sensing: a Subgradient Descent Method for Missing Data Problems Compressed Sensing: a Subgradient Descent Method for Missing Data Problems ANZIAM, Jan 30 Feb 3, 2011 Jonathan M. Borwein Jointly with D. Russell Luke, University of Goettingen FRSC FAAAS FBAS FAA Director,

More information

Fitzpatrick functions, cyclic monotonicity and Rockafellar s antiderivative

Fitzpatrick functions, cyclic monotonicity and Rockafellar s antiderivative Fitzpatrick functions, cyclic monotonicity and Rockafellar s antiderivative Sedi Bartz, Heinz H. Bauschke, Jonathan M. Borwein, Simeon Reich, and Xianfu Wang March 2, 2006 Version 1.24 Corrected Galleys)

More information

The Brezis-Browder Theorem in a general Banach space

The Brezis-Browder Theorem in a general Banach space The Brezis-Browder Theorem in a general Banach space Heinz H. Bauschke, Jonathan M. Borwein, Xianfu Wang, and Liangjin Yao March 30, 2012 Abstract During the 1970s Brezis and Browder presented a now classical

More information

A First Course in Optimization: Answers to Selected Exercises

A First Course in Optimization: Answers to Selected Exercises A First Course in Optimization: Answers to Selected Exercises Charles L. Byrne Department of Mathematical Sciences University of Massachusetts Lowell Lowell, MA 01854 December 2, 2009 (The most recent

More information

Firmly Nonexpansive Mappings and Maximally Monotone Operators: Correspondence and Duality

Firmly Nonexpansive Mappings and Maximally Monotone Operators: Correspondence and Duality Firmly Nonexpansive Mappings and Maximally Monotone Operators: Correspondence and Duality Heinz H. Bauschke, Sarah M. Moffat, and Xianfu Wang June 8, 2011 Abstract The notion of a firmly nonexpansive mapping

More information

A memory gradient algorithm for l 2 -l 0 regularization with applications to image restoration

A memory gradient algorithm for l 2 -l 0 regularization with applications to image restoration A memory gradient algorithm for l 2 -l 0 regularization with applications to image restoration E. Chouzenoux, A. Jezierska, J.-C. Pesquet and H. Talbot Université Paris-Est Lab. d Informatique Gaspard

More information