Fonctions Perspectives et Statistique en Grande Dimension (Perspective Functions and High-Dimensional Statistics)


1 Fonctions Perspectives et Statistique en Grande Dimension
Patrick L. Combettes
Department of Mathematics, North Carolina State University, Raleigh, NC 27695, USA
Based on joint work with C. L. Müller, Flatiron Institute, New York
Journées MAS-MODE de la SMAI, Institut Henri Poincaré, Paris, 9 January 2017
Patrick L. Combettes, Fonctions Perspectives et Statistique, 1/31


5 Some optimization problems in statistics
Standard finite-dimensional linear model: observation z = Xb + σe = (ζ_i)_{1≤i≤n} ∈ R^n, unknown b = (β_j)_{1≤j≤p} ∈ R^p.
Belloni et al.'s square-root lasso (2011): minimize over b ∈ R^p: ‖Xb − z‖₂ + α‖b‖₁
Sun and Zhang's scaled lasso (2012): minimize over b ∈ R^p, σ > 0: ‖Xb − z‖₂²/(2nσ) + σ/2 + α‖b‖₁
Lederer and Müller's TREX estimator (2015): minimize over b ∈ R^p: ‖Xb − z‖₂²/‖Xᵀ(Xb − z)‖_∞ + α‖b‖₁
Owen's penalized concomitant M-estimators (2007): minimize over b, σ, τ: nσ + σ Σ_{i=1}^n Huber((ζ_i − ⟨b | x_i⟩)/σ) + pτ + τ Σ_{j=1}^p Berhu(β_j/τ)
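A quick numerical sketch (not part of the slides; the data values are hypothetical) of how the scaled lasso relates to the square-root lasso: minimizing the scaled-lasso objective over σ in closed form gives σ̂ = ‖Xb − z‖₂/√n, and the partially minimized data term is exactly ‖Xb − z‖₂/√n, the square-root-lasso data term up to rescaling.

```python
import math

def scaled_lasso_obj(sigma, r_norm_sq, n):
    # (1/(2n)) * ||Xb - z||^2 / sigma + sigma / 2
    # (the l1 term in b is omitted: it does not depend on sigma)
    return r_norm_sq / (2 * n * sigma) + sigma / 2

n, r_norm_sq = 50, 7.3                   # hypothetical residual norm ||Xb - z||^2
sigma_star = math.sqrt(r_norm_sq / n)    # claimed closed-form minimizer

# a grid search confirms the closed-form minimizer
grid = [0.01 * k for k in range(1, 2000)]
sigma_grid = min(grid, key=lambda s: scaled_lasso_obj(s, r_norm_sq, n))
assert abs(sigma_grid - sigma_star) < 0.01

# the partially minimized objective equals ||Xb - z||_2 / sqrt(n),
# i.e., the square-root-lasso data term
assert abs(scaled_lasso_obj(sigma_star, r_norm_sq, n)
           - math.sqrt(r_norm_sq / n)) < 1e-12
```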


12 Some optimization problems in statistics
Problems involving the Fisher information of a multidimensional density x > 0 (Fisher, 1925): ∫_{R^N} ‖∇x(t)‖₂²/x(t) dt
Problems involving various notions of divergence between x > 0 and y > 0:
pth order Hellinger: ∫_{R^N} |x(t)^{1/p} − y(t)^{1/p}|^p dt
Kullback-Leibler: ∫_{R^N} x(t) ln(x(t)/y(t)) dt
Rényi: ∫_{R^N} x(t)^α y(t)^{1−α} dt
Pearson: ∫_{R^N} |x(t) − y(t)|²/y(t) dt
What is the common structure underlying these formulations?
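A numerical hint at the common structure (an editorial illustration with hypothetical discrete densities, not from the slides): each divergence integrand can be written as a perspective-type expression y φ(x/y). For Kullback-Leibler, with φ(t) = t ln t, the identity y φ(x/y) = x ln(x/y) holds pointwise:

```python
import math

def perspective(phi, eta, y):
    # perspective of phi at (eta, y), on the eta > 0 branch only
    if eta > 0:
        return eta * phi(y / eta)
    raise ValueError("only the eta > 0 branch is needed here")

phi_kl = lambda t: t * math.log(t)   # KL integrand phi(t) = t ln t

# two hypothetical discrete positive densities
x = [0.2, 0.5, 0.3]
y = [0.4, 0.4, 0.2]

kl_direct = sum(xi * math.log(xi / yi) for xi, yi in zip(x, y))
kl_perspective = sum(perspective(phi_kl, yi, xi) for xi, yi in zip(x, y))

assert abs(kl_direct - kl_perspective) < 1e-12
```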

13 Perspective functions: Definition
H, G real Hilbert spaces.
Γ₀(G): set of proper lower semicontinuous convex functions from G to ]−∞,+∞], with dom ϕ = { x ∈ G : ϕ(x) < +∞ }.
For ϕ ∈ Γ₀(G), rec ϕ is the recession function of ϕ: given z ∈ dom ϕ,
(∀y ∈ G) (rec ϕ)(y) = sup_{x ∈ dom ϕ} ( ϕ(x + y) − ϕ(x) ).
(Lower semicontinuous envelope of the) perspective function of ϕ:
ϕ̃ : R × G → ]−∞,+∞] : (η, y) ↦
  η ϕ(y/η), if η > 0;
  (rec ϕ)(y), if η = 0;
  +∞, if η < 0.
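To make the definition concrete, take G = R and ϕ = |·|²; then ϕ̃(η, y) = y²/η for η > 0, which is jointly convex in (η, y) and positively homogeneous. A pure-Python spot check (random midpoint-convexity test; the tolerances are illustrative):

```python
import random

def pers_sq(eta, y):
    # perspective of phi(t) = t**2: eta * phi(y / eta) = y**2 / eta for eta > 0
    return y * y / eta

random.seed(0)
for _ in range(1000):
    e1, e2 = random.uniform(0.1, 5), random.uniform(0.1, 5)
    y1, y2 = random.uniform(-5, 5), random.uniform(-5, 5)
    mid = pers_sq((e1 + e2) / 2, (y1 + y2) / 2)
    # joint (midpoint) convexity in (eta, y)
    assert mid <= (pers_sq(e1, y1) + pers_sq(e2, y2)) / 2 + 1e-9
    # positive homogeneity: pers(lam*eta, lam*y) = lam * pers(eta, y)
    lam = random.uniform(0.1, 3)
    assert abs(pers_sq(lam * e1, lam * y1) - lam * pers_sq(e1, y1)) < 1e-6
```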


17 Perspective functions: Properties
Let ϕ ∈ Γ₀(G). Then:
ϕ̃ ∈ Γ₀(R × G)
ϕ̃ is positively homogeneous
Let C = { (µ, u) ∈ R × G : µ + ϕ*(u) ≤ 0 }. Then ϕ̃ = σ_C and (ϕ̃)* = ι_C
Let η ∈ R and y ∈ G. Then
∂ϕ̃(η, y) =
  { (ϕ(y/η) − ⟨y | u⟩/η, u) : u ∈ ∂ϕ(y/η) }, if η > 0;
  { (µ, u) ∈ C : σ_{dom ϕ*}(y) = ⟨u | y⟩ }, if η = 0 and y ≠ 0;
  C, if η = 0 and y = 0;
  ∅, if η < 0


20 Perspective functions: Properties
Let ϕ ∈ Γ₀(G). Then:
Let ψ ∈ Γ₀(G) be such that dom ϕ ∩ dom ψ ≠ ∅, and let λ ∈ ]0,+∞[. Then [λϕ + ψ]~ = λϕ̃ + ψ̃ ∈ Γ₀(R × G).
Let Λ : H → G be linear and bounded, with ran Λ ∩ dom ϕ ≠ ∅. Set Λ̃ : R × H → R × G : (ξ, x) ↦ (ξ, Λx). Then [ϕ ∘ Λ]~ = ϕ̃ ∘ Λ̃ ∈ Γ₀(R × H).
Let ψ ∈ Γ₀(G) and let C be a closed convex subset of G such that C ∩ dom ψ ≠ ∅. Set
g : (η, y) ↦
  η ψ(y/η), if η > 0 and y ∈ η(C ∩ dom ψ);
  (rec ψ)(y), if η = 0 and y ∈ rec C;
  +∞, otherwise.
Then g = [ι_C + ψ]~ ∈ Γ₀(R × G).

21 Perspective functions: Examples
Let ψ ∈ Γ₀(G) and let env ψ : y ↦ inf_{x∈G} ( ψ(x) + ‖y − x‖²/2 ) be the Moreau envelope of ψ. Set
g : (η, y) ↦
  ‖y‖²/(2η) − η(env ψ)(y/η), if η > 0;
  σ_{dom ψ}(y), if η = 0;
  +∞, if η < 0.
Then g = [env(ψ*)]~ ∈ Γ₀(R × G).

22 Perspective functions: Examples
Take ψ = ι_{B(0;1)} in the previous example and set
g : (η, y) ↦
  ‖y‖ − η/2, if ‖y‖ > η and η > 0;
  ‖y‖²/(2η), if ‖y‖ ≤ η and η > 0;
  ‖y‖, if η = 0;
  +∞, if η < 0.
Then g = ϕ̃, where ϕ = env‖·‖ = ‖·‖²/2 − d_{B(0;1)}²/2 is the generalized Huber function. In computer vision, g is called the bivariate Huber function. It also shows up in Owen's concomitant M-estimator formulation.
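The Moreau-envelope identity behind the Huber function can be checked numerically in one dimension, where env|·| = |·|²/2 − d_{[−1,1]}²/2 equals the classical Huber function with parameter 1 (a grid-based editorial sketch):

```python
def huber(y):
    # closed-form Huber function with parameter 1
    return y * y / 2 if abs(y) <= 1 else abs(y) - 0.5

def env_abs(y, step=1e-4):
    # Moreau envelope of |.|: inf_x |x| + (y - x)**2 / 2, by grid search
    xs = [k * step for k in range(-60000, 60001)]
    return min(abs(x) + (y - x) ** 2 / 2 for x in xs)

def via_distance(y):
    # |y|^2/2 - d(y, [-1,1])^2/2
    d = max(abs(y) - 1.0, 0.0)
    return y * y / 2 - d * d / 2

for y in [-3.0, -0.7, 0.0, 0.4, 2.5]:
    assert abs(env_abs(y) - huber(y)) < 1e-3      # envelope matches Huber
    assert abs(via_distance(y) - huber(y)) < 1e-12  # distance identity is exact
```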

23 Perspective functions: Examples
Let C and D be nonempty closed convex subsets of G, and let ρ ∈ ]0,+∞[. Set
g : (η, y) ↦
  η d_C²(y/η)/(2ρ) + σ_D(y), if η > 0 and y ∉ ηC;
  σ_D(y), if η > 0 and y ∈ ηC;
  σ_D(y), if η = 0 and y ∈ rec C;
  +∞, otherwise.
Then g = ϕ̃ ∈ Γ₀(R × G), where ϕ = d_C²/(2ρ) + σ_D. A special case of g appears in computer vision. If G = R and D = [−1, 1], ϕ is the Berhu (reversed Huber) function used in mechanics and in Owen's concomitant M-estimator formulation.

24 Perspective functions: Examples
Let ψ : G → [0,+∞] be a proper lower semicontinuous positively homogeneous convex function, let δ ∈ R, let ρ ∈ [0,+∞[, let p ∈ [1,+∞[, and set
g : (η, y) ↦
  δη + (ρη^p + ψ^p(y))^{1/p}, if η ≥ 0;
  +∞, if η < 0.
Then g = [δ + (ρ + ψ^p)^{1/p}]~ ∈ Γ₀(R × G).
Let φ ∈ Γ₀(R) be an even function, let v ∈ G, let δ ∈ R, and set
g : (η, y) ↦
  η φ(‖y‖/η) + ⟨y | v⟩ + δη, if η > 0;
  (rec φ)(‖y‖) + ⟨y | v⟩, if η = 0;
  +∞, if η < 0.
Then g = [φ ∘ ‖·‖ + ⟨· | v⟩ + δ]~ ∈ Γ₀(R × G).

25 Perspective functions: Examples
The divergences between x > 0 and y > 0 discussed earlier are of the form ∫_{R^N} ϕ̃(y(t), x(t)) dt, where:
pth order Hellinger: ϕ(ξ) = |ξ^{1/p} − 1|^p, if ξ > 0; +∞, otherwise
Kullback-Leibler: ϕ(ξ) = ξ ln ξ, if ξ > 0; +∞, otherwise
Rényi: ϕ(ξ) = ξ^α, if ξ > 0; +∞, otherwise
Pearson: ϕ(ξ) = |ξ − 1|²

26 Composite perspective functions
Let L : H → G be linear and bounded, let ϕ ∈ Γ₀(G), let r ∈ G, let u ∈ H, let ρ ∈ R, and set
f : x ↦
  (⟨x | u⟩ − ρ) ϕ((Lx − r)/(⟨x | u⟩ − ρ)), if ⟨x | u⟩ > ρ;
  (rec ϕ)(Lx − r), if ⟨x | u⟩ = ρ;
  +∞, if ⟨x | u⟩ < ρ.
Suppose that there exists z ∈ H such that Lz − r ∈ (⟨z | u⟩ − ρ) dom ϕ and ⟨z | u⟩ − ρ > 0, and set A : H → R × G : x ↦ (⟨x | u⟩ − ρ, Lx − r). Then f = ϕ̃ ∘ A ∈ Γ₀(H).

28 Composite perspective functions: Examples
Let L : H → G be linear and bounded, let |||·||| be a norm on G such that, for some χ ∈ ]0,+∞[, |||·||| ≤ χ‖·‖, let r ∈ G, let u ∈ H, let ρ ∈ R, and let q and s be in ]1,+∞[. Set
h : x ↦
  |||Lx − r|||^{qs}/(⟨x | u⟩ − ρ)^{(q−1)s}, if ⟨x | u⟩ > ρ;
  0, if Lx = r and ⟨x | u⟩ = ρ;
  +∞, otherwise.
Then h ∈ Γ₀(H).
Let (Ω, F, P) be a probability space, let H = L²(Ω, F, P), let p ∈ ]1, 2], and let q and s be in ]1,+∞[. Set
h : X ↦
  (E|X|^p)^{qs/p}/(EX)^{(q−1)s}, if EX > 0;
  0, if X = 0 a.s.;
  +∞, otherwise.
Then h ∈ Γ₀(H).

29 Composite perspective functions: Examples
Let (Ω, F, µ) be a measure space, let G be a separable real Hilbert space, and let ϕ ∈ Γ₀(G). Set 𝓗 = L²((Ω,F,µ); R) and 𝓖 = L²((Ω,F,µ); G), and suppose that µ(Ω) < +∞ or ϕ ≥ ϕ(0) = 0. For every x ∈ 𝓗, set Ω₀(x) = { ω ∈ Ω : x(ω) = 0 } and Ω₊(x) = { ω ∈ Ω : x(ω) > 0 }. Define
Φ : 𝓗 × 𝓖 → ]−∞,+∞] : (x, y) ↦
  ∫_{Ω₀(x)} (rec ϕ)(y(ω)) µ(dω) + ∫_{Ω₊(x)} x(ω) ϕ(y(ω)/x(ω)) µ(dω),
    if x ≥ 0 a.e. and (rec ϕ)(y)1_{Ω₀(x)} + xϕ(y/x)1_{Ω₊(x)} ∈ L¹((Ω,F,µ); R);
  +∞, otherwise.
Then Φ ∈ Γ₀(𝓗 × 𝓖).

30 Composite perspective functions: Examples
Corollary: let Ω be a nonempty open subset of R^N and let H be the Sobolev space H¹(Ω), i.e., H = { x ∈ L²(Ω) : ∇x ∈ (L²(Ω))^N }. For every x ∈ H, set Ω₋(x) = { t ∈ Ω : x(t) < 0 }, Ω₀(x) = { t ∈ Ω : x(t) = 0 }, and Ω₊(x) = { t ∈ Ω : x(t) > 0 }. Let ϕ ∈ Γ₀(R^N) be such that ϕ ≥ ϕ(0) = 0, and define
f : H → ]−∞,+∞] : x ↦
  ∫_{Ω₀(x)} (rec ϕ)(∇x(t)) dt + ∫_{Ω₊(x)} x(t) ϕ(∇x(t)/x(t)) dt, if x ≥ 0 a.e.;
  +∞, otherwise.
Then f ∈ Γ₀(H).

31 Composite perspective functions: Examples
The Fisher information
f : H¹(Ω) → ]−∞,+∞] : x ↦
  ∫_{Ω₊(x)} ‖∇x(t)‖₂²/x(t) dt, if x ≥ 0 a.e. and [x = 0 ⇒ ∇x = 0] a.e.;
  +∞, otherwise
is in Γ₀(H¹(Ω)).
For (x, y) = ((ξ_i)_{i∈I}, (η_i)_{i∈I}) ∈ R^{2N}, set I₀(x, y) = { i ∈ I : ξ_i = 0 and η_i < 0 } and
d_φ(x, y) =
  Σ_{i∈I₀(x)} η_i + Σ_{i∈I₊(x)} |η_i^{1/p} − ξ_i^{1/p}|^p, if I₋(x) ∪ I₀(x, y) = ∅;
  +∞, otherwise.
Then d_φ ∈ Γ₀(R^{2N}). We recover the Kolmogorov variational divergence for p = 1 and the Hellinger divergence for p = 2.

32 Perspective functions: Proximity operator
The Moreau proximity operator of g ∈ Γ₀(G) is
prox_g : G → G : x ↦ argmin_{y∈G} ( g(y) + (1/2)‖x − y‖² ).
It is an essential tool in the design of splitting algorithms for solving a variety of convex minimization problems, especially in data science over the past decade.
PLC and V. R. Wajs, Signal recovery by proximal forward-backward splitting, Multiscale Model. Simul., vol. 4, 2005
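For instance, the proximity operator of γ|·| on R is soft thresholding; a brute-force argmin of the defining objective agrees with the closed form (an illustrative sketch, not from the slides):

```python
def soft(x, gamma):
    # closed-form prox of gamma * |.| : soft thresholding
    return max(abs(x) - gamma, 0.0) * (1.0 if x >= 0 else -1.0)

def prox_numeric(x, gamma, step=1e-4):
    # direct minimization of gamma*|y| + (x - y)^2 / 2 over a grid
    ys = [k * step for k in range(-50000, 50001)]
    return min(ys, key=lambda y: gamma * abs(y) + (x - y) ** 2 / 2)

for x, gamma in [(2.3, 0.5), (-1.2, 0.4), (0.1, 0.3)]:
    assert abs(prox_numeric(x, gamma) - soft(x, gamma)) < 1e-3
```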

33 Proximity operators
Many common convex functions in data processing (statistics, machine learning, image recovery, data denoising, support vector machines, signal processing) have explicit proximity operators:
ℓ₁ norm, Schatten norm, nuclear norm, Huber's function, Berhu function, elastic net regularizer, hinge loss, Fisher information, distance function, Vapnik's ε-insensitive loss, Burg's entropy, etc.

35 Proximity operators
Basic properties:
p = prox_f x ⇔ x − p ∈ ∂f(p)
prox_f + prox_{f*} = Id (Moreau's decomposition)
For f = ι_V, V a closed vector subspace: P_V + P_{V⊥} = Id
prox_{ρ|·|} = Id − prox_{(ρ|·|)*} = Id − P_{[−ρ,ρ]} = soft_ρ
(prox_f x, x − prox_f x) = (prox_f x, prox_{f*} x) ∈ gra ∂f
Fix prox_f = Argmin f
‖prox_f x − prox_f y‖² ≤ ‖x − y‖² − ‖(x − prox_f x) − (y − prox_f y)‖²
The last two properties suggest the conceptual algorithm x_{n+1} = prox_f x_n to minimize f, which is at the root of proximal splitting algorithms.
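Moreau's decomposition can be verified for f = γ|·| on R: prox_f is soft thresholding, prox_{f*} is the projection onto [−γ, γ], and they sum to the identity (editorial sketch):

```python
def soft(x, gamma):
    # prox of f = gamma * |.|
    return max(abs(x) - gamma, 0.0) * (1.0 if x >= 0 else -1.0)

def proj_interval(x, gamma):
    # prox of f* = iota_[-gamma, gamma]: projection onto [-gamma, gamma]
    return min(max(x, -gamma), gamma)

gamma = 0.7
for x in [-2.0, -0.5, 0.0, 0.3, 1.9]:
    # Moreau's decomposition: prox_f + prox_{f*} = Id
    assert abs(soft(x, gamma) + proj_interval(x, gamma) - x) < 1e-12
```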

36 Proximal splitting methods in convex optimization
f ∈ Γ₀(H), ϕ_k ∈ Γ₀(G_k), ℓ_k ∈ Γ₀(G_k) strongly convex, L_k : H → G_k linear and bounded with ‖L_k‖ = 1, h : H → R convex and smooth:
minimize over x ∈ H: f(x) + Σ_{k=1}^p (ϕ_k □ ℓ_k)(L_k x − r_k) + h(x)
where ϕ_k □ ℓ_k : x ↦ inf_y ( ϕ_k(y) + ℓ_k(x − y) ) is the infimal convolution.
Example: multiview total variation image recovery from observations r_k = L_k x + w_k:
minimize over x ∈ H: Σ_{k∈N} φ_k(⟨x | e_k⟩) + Σ_{k=1}^p α_k d_{C_k}(L_k x − r_k) + β‖x‖_{1,2}
(each distance term d_{C_k} relaxes the hard constraint ι_{C_k}).
A splitting algorithm activates each function and each linear operator individually.

37 Proximal splitting methods in convex optimization
Algorithm: for n = 0, 1, ...
  y_{1,n} = x_n − (∇h(x_n) + Σ_{k=1}^p L_k* v_{k,n})
  p_{1,n} = prox_f y_{1,n}
  for k = 1, ..., p:
    y_{2,k,n} = v_{k,n} + (L_k x_n − ∇ℓ_k*(v_{k,n}))
    p_{2,k,n} = prox_{ϕ_k*}(y_{2,k,n} − r_k)
    q_{2,k,n} = p_{2,k,n} + (L_k p_{1,n} − ∇ℓ_k*(p_{2,k,n}))
    v_{k,n+1} = v_{k,n} − y_{2,k,n} + q_{2,k,n}
  q_{1,n} = p_{1,n} − (∇h(p_{1,n}) + Σ_{k=1}^p L_k* p_{2,k,n})
  x_{n+1} = x_n − y_{1,n} + q_{1,n}
(x_n)_{n∈N} converges weakly to a solution.
PLC, Systems of structured monotone inclusions: Duality, algorithms, and applications, SIAM J. Optim., vol. 23, 2013

38 Perspective functions: Proximity operator
Let ϕ ∈ Γ₀(G), let γ ∈ ]0,+∞[, let η ∈ R, and let y ∈ G.
Suppose that η + γϕ*(y/γ) ≤ 0. Then prox_{γϕ̃}(η, y) = (0, 0).
Suppose that dom ϕ* is open and that η + γϕ*(y/γ) > 0. Then prox_{γϕ̃}(η, y) = (η + γϕ*(p), y − γp), where p is the unique solution of the inclusion y ∈ γp + (η + γϕ*(p)) ∂ϕ*(p). If ϕ* is differentiable at p, then p is characterized by y = γp + (η + γϕ*(p)) ∇ϕ*(p).
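As an illustration of this characterization in the scalar case ϕ = |·|²/2 (so ϕ* = |·|²/2 and ∇ϕ*(p) = p; an editorial sketch under these assumptions, with η ≥ 0 so the equation is monotone): the condition y = γp + (η + γϕ*(p))∇ϕ*(p) becomes a cubic in p, solvable by bisection, and the resulting point agrees with a brute-force prox computation.

```python
def prox_perspective_sq(eta, y, gamma):
    # prox of gamma * (perspective of phi = |.|^2/2) at (eta, y)
    # phi*(u) = u**2/2; if eta + gamma*phi*(y/gamma) <= 0, the prox is (0, 0)
    if eta + gamma * (y / gamma) ** 2 / 2 <= 0:
        return (0.0, 0.0)
    # solve y = gamma*p + (eta + gamma*p**2/2)*p by bisection
    # (the left-hand side is increasing in p when eta >= 0)
    f = lambda p: gamma * p + (eta + gamma * p * p / 2) * p - y
    lo, hi = -100.0, 100.0
    for _ in range(200):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if f(mid) <= 0 else (lo, mid)
    p = (lo + hi) / 2
    return (eta + gamma * p * p / 2, y - gamma * p)

eta, y, gamma = 1.0, 2.0, 0.5
chi, q = prox_perspective_sq(eta, y, gamma)

# brute-force check: (chi, q) minimizes
# gamma * v^2/(2c) + ((c - eta)^2 + (v - y)^2)/2 over c > 0
def obj(c, v):
    return gamma * v * v / (2 * c) + ((c - eta) ** 2 + (v - y) ** 2) / 2

best = min((obj(0.01 * i, 0.01 * j), 0.01 * i, 0.01 * j)
           for i in range(1, 400) for j in range(-400, 400))
assert obj(chi, q) <= best[0] + 1e-4
assert abs(chi - best[1]) < 0.02 and abs(q - best[2]) < 0.02
```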

39 Perspective functions: Proximity operator
Let v ∈ G, let δ ∈ R, and let φ ∈ Γ₀(R) be an even function such that φ* is differentiable on R. Define
g : (η, y) ↦
  ηφ(‖y‖/η) + ⟨y | v⟩ + δη, if η > 0;
  0, if y = 0 and η = 0;
  +∞, otherwise.
Let γ ∈ ]0,+∞[, let η ∈ R, let y ∈ G, and set ψ : s ↦ (φ*(s) + η/γ − δ) φ*′(s) + s. Then ψ is invertible. Moreover, if η + γφ*(‖y/γ − v‖) > γδ, set
t = ψ⁻¹(‖y/γ − v‖) and p = v + t(y − γv)/‖y − γv‖.
Then
prox_{γg}(η, y) =
  (η + γ(φ*(t) − δ), y − γp), if η + γφ*(‖y/γ − v‖) > γδ;
  (0, 0), if η + γφ*(‖y/γ − v‖) ≤ γδ.

40 Perspective functions: Proximity operator
Let v ∈ G, let δ ∈ R, let α ∈ ]0,+∞[, let q ∈ ]1,+∞[, and consider the function
g : (η, y) ↦
  ‖y‖^q/(αη^{q−1}) + ⟨y | v⟩ + δη, if η > 0;
  0, if y = 0 and η = 0;
  +∞, otherwise.
Let γ ∈ ]0,+∞[, set q* = q/(q − 1), set ϑ = (α(1 − 1/q*))^{q*−1}, and take η ∈ R and y ∈ G. If q*γ^{q*−1}(η − γδ) + ϑ‖y − γv‖^{q*} > 0, let t ∈ [0,+∞[ be the unique solution of the equation
t^{2q*−1} + (q*/(ϑγ))(η − γδ) t^{q*−1} + (q*/ϑ²)(t − ‖y − γv‖/γ) = 0
and set p = v + t(y − γv)/‖y − γv‖. Then
prox_{γg}(η, y) =
  (η + γ(ϑt^{q*}/q* − δ), y − γp), if q*γ^{q*−1}(η − γδ) + ϑ‖y − γv‖^{q*} > 0;
  (0, 0), if q*γ^{q*−1}(η − γδ) + ϑ‖y − γv‖^{q*} ≤ 0.

41 Perspective functions: Proximity operator
Let (Ω, F, µ), G, ϕ, 𝓗, 𝓖, and Φ be as on slide 29 (integral functional built from the perspective of ϕ, with µ(Ω) < +∞ or ϕ ≥ ϕ(0) = 0). Let x ∈ 𝓗 and y ∈ 𝓖, and set, for µ-almost every ω ∈ Ω, (p(ω), q(ω)) = prox_{ϕ̃}(x(ω), y(ω)). Then prox_Φ(x, y) = (p, q).

42 Perspective functions: Proximity operator
We can also handle cases where dom ϕ* is not open. Consider the perspective function
ϕ̃ : R² → ]−∞,+∞] : (η, y) ↦
  d_{[−εη, εη]}(y), if η ≥ 0;
  +∞, if η < 0
of the Vapnik loss function ϕ = max{|·| − ε, 0}; here ϕ* = ε|·| + ι_{[−1,1]}.
Let η ∈ R, let y ∈ R, and set (χ, q) = prox_{γϕ̃}(η, y). Then:
If η + ε|y| ≤ 0 and |y| ≤ γ, then (χ, q) = (0, 0).
If η ≤ −γε and |y| > γ, then (χ, q) = (0, y − γ sign(y)).
If η > −γε and |y| > εη + γ(1 + ε²), then (χ, q) = (η + γε, y − γ sign(y)).
If |y| > −η/ε and εη < |y| ≤ εη + γ(1 + ε²), then (χ, q) = ((η + ε|y|)/(1 + ε²), ε(η + ε|y|) sign(y)/(1 + ε²)).
If η ≥ 0 and |y| ≤ εη, then (χ, q) = (η, y).
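For comparison, the prox of the underlying univariate Vapnik loss γ max{|·| − ε, 0} alone (not of its perspective) has a simple three-zone form, checked here against a brute-force argmin (editorial sketch):

```python
def prox_vapnik(x, eps, gamma):
    # prox of gamma * max(|.| - eps, 0)
    ax, s = abs(x), (1.0 if x >= 0 else -1.0)
    if ax <= eps:
        return x               # inside the insensitivity band: left untouched
    if ax <= eps + gamma:
        return s * eps         # shrunk onto the band boundary
    return s * (ax - gamma)    # shifted toward the band

def prox_numeric(x, eps, gamma, step=1e-4):
    ys = [k * step for k in range(-60000, 60001)]
    return min(ys, key=lambda y: gamma * max(abs(y) - eps, 0.0)
               + (x - y) ** 2 / 2)

eps, gamma = 0.5, 0.8
for x in [-3.0, -1.0, -0.3, 0.2, 0.9, 2.4]:
    assert abs(prox_vapnik(x, eps, gamma) - prox_numeric(x, eps, gamma)) < 2e-3
```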

43 Applications in high-dimensional statistics
Linear data model: z = Xb + σe.
Penalized concomitant M-estimators:
minimize over σ ∈ R, τ ∈ R, b ∈ R^p: Σ_{i=1}^n ϕ̃_i(σ, ⟨X_{i:} | b⟩ − ζ_i) + Σ_{j=1}^p ψ̃_j(τ, ⟨a_j | b⟩).
This model unifies various robust regression procedures.
It can be solved efficiently by the block-iterative proximal splitting method of PLC and J. Eckstein, Asynchronous block-iterative primal-dual decomposition methods for monotone inclusions, Mathematical Programming, published online.
Another model of interest: the generalized TREX:
minimize over b ∈ R^p: ‖Xb − z‖₂^q / (α‖Xᵀ(Xb − z)‖_∞^{q−1}) + ‖b‖₁

44 Applications in high-dimensional statistics
The nonconvex generalized TREX problem can be rewritten as a system of 2p convex problems:
minimize over b ∈ R^p such that x_jᵀ(Xb − z) > 0: ‖Xb − z‖₂^q / (α(x_jᵀ(Xb − z))^{q−1}) + ‖b‖₁,
where x_j = sX_{:j}, s ∈ {−1, 1}. Each subproblem involves the (shifted) perspective function
g_j : (η, y) ↦
  ‖y − z‖₂^q / (α(η − x_jᵀz)^{q−1}), if η > x_jᵀz;
  0, if y = z and η = x_jᵀz;
  +∞, otherwise
of ‖·‖₂^q/α composed with the linear operator b ↦ (x_jᵀXb, Xb), and h = [‖·‖₁]~ = ‖·‖₁. It can be solved, for instance, by a Douglas-Rachford-like algorithm.

45 Applications in high-dimensional statistics
prox_h is the standard soft thresholding operator.
With q* = q/(q − 1) and ϑ = (α(1 − 1/q*))^{q*−1}, we have
prox_{γg_j}(η, y) =
  (η + γϑt^{q*}/q*, y − γp), if q*γ^{q*−1}(η − x_jᵀz) + ϑ‖y − z‖₂^{q*} > 0;
  (x_jᵀz, z), if q*γ^{q*−1}(η − x_jᵀz) + ϑ‖y − z‖₂^{q*} ≤ 0,
where
p =
  t(y − z)/‖y − z‖, if y ≠ z;
  0, if y = z,
and t is the unique solution in ]0,+∞[ of the reduced equation
s^{2q*−1} + (q*/(ϑγ))(η − x_jᵀz) s^{q*−1} + (q*/ϑ²)(s − ‖y − z‖/γ) = 0.

46 Applications in high-dimensional statistics
Algorithm for the jth generalized TREX subproblem: for k = 0, 1, ...
  q_k = M_j x_k − y_k
  b_k = x_k − R_j q_k
  c_k = M_j b_k
  z_k = prox_{γh}(2b_k − x_k)
  t_k = prox_{γg_j}(2c_k − y_k)
  x_{k+1} = x_k + µ_k(z_k − b_k)
  y_{k+1} = y_k + µ_k(t_k − c_k)
(b_k)_{k∈N} converges to a solution b of the subproblem. See the paper for a detailed numerical application to sparse regression.
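The Douglas-Rachford iteration pattern behind this scheme can be sketched on a toy problem: minimize ½(x − z)² + λ|x|, whose solution is the soft threshold soft(z, λ). This is an editorial illustration, not the TREX subproblem itself (M_j, R_j, and g_j above are problem-specific):

```python
def prox_f(x, z, gamma):
    # prox of gamma * (1/2)(x - z)^2
    return (x + gamma * z) / (1 + gamma)

def prox_g(x, gamma):
    # prox of gamma * |.| : soft thresholding
    return max(abs(x) - gamma, 0.0) * (1.0 if x >= 0 else -1.0)

def douglas_rachford(z, lam, gamma=1.0, mu=1.0, iters=200):
    y = 0.0
    for _ in range(iters):
        b = prox_g(y, gamma * lam)        # b_k = prox_{gamma g}(y_k)
        r = prox_f(2 * b - y, z, gamma)   # reflected point through prox of f
        y = y + mu * (r - b)              # Douglas-Rachford governing update
    return prox_g(y, gamma * lam)

z, lam = 2.0, 0.7
sol = douglas_rachford(z, lam)
expected = max(abs(z) - lam, 0.0) * (1.0 if z >= 0 else -1.0)  # soft(z, lam)
assert abs(sol - expected) < 1e-6
```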

47 References
PLC and C. L. Müller, Perspective functions: Proximal calculus and applications in high-dimensional statistics, J. Math. Anal. Appl., 2017 (published online)
PLC, Perspective functions: Properties, constructions, and examples
H. H. Bauschke and PLC, Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2nd ed., Springer, New York, February 2017


More information

Making Flippy Floppy

Making Flippy Floppy Making Flippy Floppy James V. Burke UW Mathematics jvburke@uw.edu Aleksandr Y. Aravkin IBM, T.J.Watson Research sasha.aravkin@gmail.com Michael P. Friedlander UBC Computer Science mpf@cs.ubc.ca Vietnam

More information

Iterative Convex Optimization Algorithms; Part One: Using the Baillon Haddad Theorem

Iterative Convex Optimization Algorithms; Part One: Using the Baillon Haddad Theorem Iterative Convex Optimization Algorithms; Part One: Using the Baillon Haddad Theorem Charles Byrne (Charles Byrne@uml.edu) http://faculty.uml.edu/cbyrne/cbyrne.html Department of Mathematical Sciences

More information

ECE G: Special Topics in Signal Processing: Sparsity, Structure, and Inference

ECE G: Special Topics in Signal Processing: Sparsity, Structure, and Inference ECE 18-898G: Special Topics in Signal Processing: Sparsity, Structure, and Inference Sparse Recovery using L1 minimization - algorithms Yuejie Chi Department of Electrical and Computer Engineering Spring

More information

PROXIMAL THRESHOLDING ALGORITHM FOR MINIMIZATION OVER ORTHONORMAL BASES

PROXIMAL THRESHOLDING ALGORITHM FOR MINIMIZATION OVER ORTHONORMAL BASES PROXIMAL THRESHOLDING ALGORITHM FOR MINIMIZATION OVER ORTHONORMAL BASES Patrick L. Combettes and Jean-Christophe Pesquet Laboratoire Jacques-Louis Lions UMR CNRS 7598 Université Pierre et Marie Curie Paris

More information

MIT 9.520/6.860, Fall 2018 Statistical Learning Theory and Applications. Class 08: Sparsity Based Regularization. Lorenzo Rosasco

MIT 9.520/6.860, Fall 2018 Statistical Learning Theory and Applications. Class 08: Sparsity Based Regularization. Lorenzo Rosasco MIT 9.520/6.860, Fall 2018 Statistical Learning Theory and Applications Class 08: Sparsity Based Regularization Lorenzo Rosasco Learning algorithms so far ERM + explicit l 2 penalty 1 min w R d n n l(y

More information

Subgradient Projectors: Extensions, Theory, and Characterizations

Subgradient Projectors: Extensions, Theory, and Characterizations Subgradient Projectors: Extensions, Theory, and Characterizations Heinz H. Bauschke, Caifang Wang, Xianfu Wang, and Jia Xu April 13, 2017 Abstract Subgradient projectors play an important role in optimization

More information

Sparse Regularization via Convex Analysis

Sparse Regularization via Convex Analysis Sparse Regularization via Convex Analysis Ivan Selesnick Electrical and Computer Engineering Tandon School of Engineering New York University Brooklyn, New York, USA 29 / 66 Convex or non-convex: Which

More information

arxiv: v4 [math.oc] 29 Jan 2018

arxiv: v4 [math.oc] 29 Jan 2018 Noname manuscript No. (will be inserted by the editor A new primal-dual algorithm for minimizing the sum of three functions with a linear operator Ming Yan arxiv:1611.09805v4 [math.oc] 29 Jan 2018 Received:

More information

The proximal mapping

The proximal mapping The proximal mapping http://bicmr.pku.edu.cn/~wenzw/opt-2016-fall.html Acknowledgement: this slides is based on Prof. Lieven Vandenberghes lecture notes Outline 2/37 1 closed function 2 Conjugate function

More information

Duality and dynamics in Hamilton-Jacobi theory for fully convex problems of control

Duality and dynamics in Hamilton-Jacobi theory for fully convex problems of control Duality and dynamics in Hamilton-Jacobi theory for fully convex problems of control RTyrrell Rockafellar and Peter R Wolenski Abstract This paper describes some recent results in Hamilton- Jacobi theory

More information

On the order of the operators in the Douglas Rachford algorithm

On the order of the operators in the Douglas Rachford algorithm On the order of the operators in the Douglas Rachford algorithm Heinz H. Bauschke and Walaa M. Moursi June 11, 2015 Abstract The Douglas Rachford algorithm is a popular method for finding zeros of sums

More information

Brøndsted-Rockafellar property of subdifferentials of prox-bounded functions. Marc Lassonde Université des Antilles et de la Guyane

Brøndsted-Rockafellar property of subdifferentials of prox-bounded functions. Marc Lassonde Université des Antilles et de la Guyane Conference ADGO 2013 October 16, 2013 Brøndsted-Rockafellar property of subdifferentials of prox-bounded functions Marc Lassonde Université des Antilles et de la Guyane Playa Blanca, Tongoy, Chile SUBDIFFERENTIAL

More information

Accelerated Block-Coordinate Relaxation for Regularized Optimization

Accelerated Block-Coordinate Relaxation for Regularized Optimization Accelerated Block-Coordinate Relaxation for Regularized Optimization Stephen J. Wright Computer Sciences University of Wisconsin, Madison October 09, 2012 Problem descriptions Consider where f is smooth

More information

Inertial Douglas-Rachford splitting for monotone inclusion problems

Inertial Douglas-Rachford splitting for monotone inclusion problems Inertial Douglas-Rachford splitting for monotone inclusion problems Radu Ioan Boţ Ernö Robert Csetnek Christopher Hendrich January 5, 2015 Abstract. We propose an inertial Douglas-Rachford splitting algorithm

More information

The Fitzpatrick Function and Nonreflexive Spaces

The Fitzpatrick Function and Nonreflexive Spaces Journal of Convex Analysis Volume 13 (2006), No. 3+4, 861 881 The Fitzpatrick Function and Nonreflexive Spaces S. Simons Department of Mathematics, University of California, Santa Barbara, CA 93106-3080,

More information

Recent Developments of Alternating Direction Method of Multipliers with Multi-Block Variables

Recent Developments of Alternating Direction Method of Multipliers with Multi-Block Variables Recent Developments of Alternating Direction Method of Multipliers with Multi-Block Variables Department of Systems Engineering and Engineering Management The Chinese University of Hong Kong 2014 Workshop

More information

Learning with stochastic proximal gradient

Learning with stochastic proximal gradient Learning with stochastic proximal gradient Lorenzo Rosasco DIBRIS, Università di Genova Via Dodecaneso, 35 16146 Genova, Italy lrosasco@mit.edu Silvia Villa, Băng Công Vũ Laboratory for Computational and

More information

Convex Optimization on Large-Scale Domains Given by Linear Minimization Oracles

Convex Optimization on Large-Scale Domains Given by Linear Minimization Oracles Convex Optimization on Large-Scale Domains Given by Linear Minimization Oracles Arkadi Nemirovski H. Milton Stewart School of Industrial and Systems Engineering Georgia Institute of Technology Joint research

More information

Douglas-Rachford splitting for nonconvex feasibility problems

Douglas-Rachford splitting for nonconvex feasibility problems Douglas-Rachford splitting for nonconvex feasibility problems Guoyin Li Ting Kei Pong Jan 3, 015 Abstract We adapt the Douglas-Rachford DR) splitting method to solve nonconvex feasibility problems by studying

More information

Existence and Approximation of Fixed Points of. Bregman Nonexpansive Operators. Banach Spaces

Existence and Approximation of Fixed Points of. Bregman Nonexpansive Operators. Banach Spaces Existence and Approximation of Fixed Points of in Reflexive Banach Spaces Department of Mathematics The Technion Israel Institute of Technology Haifa 22.07.2010 Joint work with Prof. Simeon Reich General

More information

I P IANO : I NERTIAL P ROXIMAL A LGORITHM FOR N ON -C ONVEX O PTIMIZATION

I P IANO : I NERTIAL P ROXIMAL A LGORITHM FOR N ON -C ONVEX O PTIMIZATION I P IANO : I NERTIAL P ROXIMAL A LGORITHM FOR N ON -C ONVEX O PTIMIZATION Peter Ochs University of Freiburg Germany 17.01.2017 joint work with: Thomas Brox and Thomas Pock c 2017 Peter Ochs ipiano c 1

More information

Fenchel-Moreau Conjugates of Inf-Transforms and Application to Stochastic Bellman Equation

Fenchel-Moreau Conjugates of Inf-Transforms and Application to Stochastic Bellman Equation Fenchel-Moreau Conjugates of Inf-Transforms and Application to Stochastic Bellman Equation Jean-Philippe Chancelier and Michel De Lara CERMICS, École des Ponts ParisTech First Conference on Discrete Optimization

More information

Uses of duality. Geoff Gordon & Ryan Tibshirani Optimization /

Uses of duality. Geoff Gordon & Ryan Tibshirani Optimization / Uses of duality Geoff Gordon & Ryan Tibshirani Optimization 10-725 / 36-725 1 Remember conjugate functions Given f : R n R, the function is called its conjugate f (y) = max x R n yt x f(x) Conjugates appear

More information

Stochastic model-based minimization under high-order growth

Stochastic model-based minimization under high-order growth Stochastic model-based minimization under high-order growth Damek Davis Dmitriy Drusvyatskiy Kellie J. MacPhee Abstract Given a nonsmooth, nonconvex minimization problem, we consider algorithms that iteratively

More information

Proximal Methods for Optimization with Spasity-inducing Norms

Proximal Methods for Optimization with Spasity-inducing Norms Proximal Methods for Optimization with Spasity-inducing Norms Group Learning Presentation Xiaowei Zhou Department of Electronic and Computer Engineering The Hong Kong University of Science and Technology

More information

Robust Principal Component Analysis Based on Low-Rank and Block-Sparse Matrix Decomposition

Robust Principal Component Analysis Based on Low-Rank and Block-Sparse Matrix Decomposition Robust Principal Component Analysis Based on Low-Rank and Block-Sparse Matrix Decomposition Gongguo Tang and Arye Nehorai Department of Electrical and Systems Engineering Washington University in St Louis

More information

Phenomena in high dimensions in geometric analysis, random matrices, and computational geometry Roscoff, France, June 25-29, 2012

Phenomena in high dimensions in geometric analysis, random matrices, and computational geometry Roscoff, France, June 25-29, 2012 Phenomena in high dimensions in geometric analysis, random matrices, and computational geometry Roscoff, France, June 25-29, 202 BOUNDS AND ASYMPTOTICS FOR FISHER INFORMATION IN THE CENTRAL LIMIT THEOREM

More information

Monotone Operator Splitting Methods in Signal and Image Recovery

Monotone Operator Splitting Methods in Signal and Image Recovery Monotone Operator Splitting Methods in Signal and Image Recovery P.L. Combettes 1, J.-C. Pesquet 2, and N. Pustelnik 3 2 Univ. Pierre et Marie Curie, Paris 6 LJLL CNRS UMR 7598 2 Univ. Paris-Est LIGM CNRS

More information

Stability of optimization problems with stochastic dominance constraints

Stability of optimization problems with stochastic dominance constraints Stability of optimization problems with stochastic dominance constraints D. Dentcheva and W. Römisch Stevens Institute of Technology, Hoboken Humboldt-University Berlin www.math.hu-berlin.de/~romisch SIAM

More information

arxiv: v2 [math.oc] 21 Nov 2017

arxiv: v2 [math.oc] 21 Nov 2017 Unifying abstract inexact convergence theorems and block coordinate variable metric ipiano arxiv:1602.07283v2 [math.oc] 21 Nov 2017 Peter Ochs Mathematical Optimization Group Saarland University Germany

More information

A General Framework for a Class of Primal-Dual Algorithms for TV Minimization

A General Framework for a Class of Primal-Dual Algorithms for TV Minimization A General Framework for a Class of Primal-Dual Algorithms for TV Minimization Ernie Esser UCLA 1 Outline A Model Convex Minimization Problem Main Idea Behind the Primal Dual Hybrid Gradient (PDHG) Method

More information

Adaptive Primal Dual Optimization for Image Processing and Learning

Adaptive Primal Dual Optimization for Image Processing and Learning Adaptive Primal Dual Optimization for Image Processing and Learning Tom Goldstein Rice University tag7@rice.edu Ernie Esser University of British Columbia eesser@eos.ubc.ca Richard Baraniuk Rice University

More information

THE L 2 -HODGE THEORY AND REPRESENTATION ON R n

THE L 2 -HODGE THEORY AND REPRESENTATION ON R n THE L 2 -HODGE THEORY AND REPRESENTATION ON R n BAISHENG YAN Abstract. We present an elementary L 2 -Hodge theory on whole R n based on the minimization principle of the calculus of variations and some

More information

PARTIAL REGULARITY OF BRENIER SOLUTIONS OF THE MONGE-AMPÈRE EQUATION

PARTIAL REGULARITY OF BRENIER SOLUTIONS OF THE MONGE-AMPÈRE EQUATION PARTIAL REGULARITY OF BRENIER SOLUTIONS OF THE MONGE-AMPÈRE EQUATION ALESSIO FIGALLI AND YOUNG-HEON KIM Abstract. Given Ω, Λ R n two bounded open sets, and f and g two probability densities concentrated

More information

THROUGHOUT this paper, we let C be a nonempty

THROUGHOUT this paper, we let C be a nonempty Strong Convergence Theorems of Multivalued Nonexpansive Mappings and Maximal Monotone Operators in Banach Spaces Kriengsak Wattanawitoon, Uamporn Witthayarat and Poom Kumam Abstract In this paper, we prove

More information

PROXIMAL THRESHOLDING ALGORITHM FOR MINIMIZATION OVER ORTHONORMAL BASES

PROXIMAL THRESHOLDING ALGORITHM FOR MINIMIZATION OVER ORTHONORMAL BASES SIAM J. Optim. to appear PROXIMAL THRESHOLDING ALGORITHM FOR MINIMIZATION OVER ORTHONORMAL BASES PATRICK L. COMBETTES AND JEAN-CHRISTOPHE PESQUET Abstract. The notion of soft thresholding plays a central

More information

Conductivity imaging from one interior measurement

Conductivity imaging from one interior measurement Conductivity imaging from one interior measurement Amir Moradifam (University of Toronto) Fields Institute, July 24, 2012 Amir Moradifam (University of Toronto) A convergent algorithm for CDII 1 / 16 A

More information

Controllability of linear PDEs (I): The wave equation

Controllability of linear PDEs (I): The wave equation Controllability of linear PDEs (I): The wave equation M. González-Burgos IMUS, Universidad de Sevilla Doc Course, Course 2, Sevilla, 2018 Contents 1 Introduction. Statement of the problem 2 Distributed

More information

On Total Convexity, Bregman Projections and Stability in Banach Spaces

On Total Convexity, Bregman Projections and Stability in Banach Spaces Journal of Convex Analysis Volume 11 (2004), No. 1, 1 16 On Total Convexity, Bregman Projections and Stability in Banach Spaces Elena Resmerita Department of Mathematics, University of Haifa, 31905 Haifa,

More information

Extensions of the CQ Algorithm for the Split Feasibility and Split Equality Problems

Extensions of the CQ Algorithm for the Split Feasibility and Split Equality Problems Extensions of the CQ Algorithm for the Split Feasibility Split Equality Problems Charles L. Byrne Abdellatif Moudafi September 2, 2013 Abstract The convex feasibility problem (CFP) is to find a member

More information

Shiqian Ma, MAT-258A: Numerical Optimization 1. Chapter 9. Alternating Direction Method of Multipliers

Shiqian Ma, MAT-258A: Numerical Optimization 1. Chapter 9. Alternating Direction Method of Multipliers Shiqian Ma, MAT-258A: Numerical Optimization 1 Chapter 9 Alternating Direction Method of Multipliers Shiqian Ma, MAT-258A: Numerical Optimization 2 Separable convex optimization a special case is min f(x)

More information

OWL to the rescue of LASSO

OWL to the rescue of LASSO OWL to the rescue of LASSO IISc IBM day 2018 Joint Work R. Sankaran and Francis Bach AISTATS 17 Chiranjib Bhattacharyya Professor, Department of Computer Science and Automation Indian Institute of Science,

More information

ON PROXIMAL POINT-TYPE ALGORITHMS FOR WEAKLY CONVEX FUNCTIONS AND THEIR CONNECTION TO THE BACKWARD EULER METHOD

ON PROXIMAL POINT-TYPE ALGORITHMS FOR WEAKLY CONVEX FUNCTIONS AND THEIR CONNECTION TO THE BACKWARD EULER METHOD ON PROXIMAL POINT-TYPE ALGORITHMS FOR WEAKLY CONVEX FUNCTIONS AND THEIR CONNECTION TO THE BACKWARD EULER METHOD TIM HOHEISEL, MAXIME LABORDE, AND ADAM OBERMAN Abstract. In this article we study the connection

More information

Sparse and Regularized Optimization

Sparse and Regularized Optimization Sparse and Regularized Optimization In many applications, we seek not an exact minimizer of the underlying objective, but rather an approximate minimizer that satisfies certain desirable properties: sparsity

More information

WEAK CONVERGENCE OF RESOLVENTS OF MAXIMAL MONOTONE OPERATORS AND MOSCO CONVERGENCE

WEAK CONVERGENCE OF RESOLVENTS OF MAXIMAL MONOTONE OPERATORS AND MOSCO CONVERGENCE Fixed Point Theory, Volume 6, No. 1, 2005, 59-69 http://www.math.ubbcluj.ro/ nodeacj/sfptcj.htm WEAK CONVERGENCE OF RESOLVENTS OF MAXIMAL MONOTONE OPERATORS AND MOSCO CONVERGENCE YASUNORI KIMURA Department

More information

Optimization and Optimal Control in Banach Spaces

Optimization and Optimal Control in Banach Spaces Optimization and Optimal Control in Banach Spaces Bernhard Schmitzer October 19, 2017 1 Convex non-smooth optimization with proximal operators Remark 1.1 (Motivation). Convex optimization: easier to solve,

More information

Convergence analysis for a primal-dual monotone + skew splitting algorithm with applications to total variation minimization

Convergence analysis for a primal-dual monotone + skew splitting algorithm with applications to total variation minimization Convergence analysis for a primal-dual monotone + skew splitting algorithm with applications to total variation minimization Radu Ioan Boţ Christopher Hendrich November 7, 202 Abstract. In this paper we

More information

FINDING BEST APPROXIMATION PAIRS RELATIVE TO A CONVEX AND A PROX-REGULAR SET IN A HILBERT SPACE

FINDING BEST APPROXIMATION PAIRS RELATIVE TO A CONVEX AND A PROX-REGULAR SET IN A HILBERT SPACE FINDING BEST APPROXIMATION PAIRS RELATIVE TO A CONVEX AND A PROX-REGULAR SET IN A HILBERT SPACE D. RUSSELL LUKE Abstract. We study the convergence of an iterative projection/reflection algorithm originally

More information

1 Sparsity and l 1 relaxation

1 Sparsity and l 1 relaxation 6.883 Learning with Combinatorial Structure Note for Lecture 2 Author: Chiyuan Zhang Sparsity and l relaxation Last time we talked about sparsity and characterized when an l relaxation could recover the

More information

Second order forward-backward dynamical systems for monotone inclusion problems

Second order forward-backward dynamical systems for monotone inclusion problems Second order forward-backward dynamical systems for monotone inclusion problems Radu Ioan Boţ Ernö Robert Csetnek March 6, 25 Abstract. We begin by considering second order dynamical systems of the from

More information

Continuous Sets and Non-Attaining Functionals in Reflexive Banach Spaces

Continuous Sets and Non-Attaining Functionals in Reflexive Banach Spaces Laboratoire d Arithmétique, Calcul formel et d Optimisation UMR CNRS 6090 Continuous Sets and Non-Attaining Functionals in Reflexive Banach Spaces Emil Ernst Michel Théra Rapport de recherche n 2004-04

More information

STRONG CONVERGENCE THEOREMS FOR COMMUTATIVE FAMILIES OF LINEAR CONTRACTIVE OPERATORS IN BANACH SPACES

STRONG CONVERGENCE THEOREMS FOR COMMUTATIVE FAMILIES OF LINEAR CONTRACTIVE OPERATORS IN BANACH SPACES STRONG CONVERGENCE THEOREMS FOR COMMUTATIVE FAMILIES OF LINEAR CONTRACTIVE OPERATORS IN BANACH SPACES WATARU TAKAHASHI, NGAI-CHING WONG, AND JEN-CHIH YAO Abstract. In this paper, we study nonlinear analytic

More information

SEMI-SMOOTH SECOND-ORDER TYPE METHODS FOR COMPOSITE CONVEX PROGRAMS

SEMI-SMOOTH SECOND-ORDER TYPE METHODS FOR COMPOSITE CONVEX PROGRAMS SEMI-SMOOTH SECOND-ORDER TYPE METHODS FOR COMPOSITE CONVEX PROGRAMS XIANTAO XIAO, YONGFENG LI, ZAIWEN WEN, AND LIWEI ZHANG Abstract. The goal of this paper is to study approaches to bridge the gap between

More information

ECS289: Scalable Machine Learning

ECS289: Scalable Machine Learning ECS289: Scalable Machine Learning Cho-Jui Hsieh UC Davis Sept 29, 2016 Outline Convex vs Nonconvex Functions Coordinate Descent Gradient Descent Newton s method Stochastic Gradient Descent Numerical Optimization

More information

GENERAL NONCONVEX SPLIT VARIATIONAL INEQUALITY PROBLEMS. Jong Kyu Kim, Salahuddin, and Won Hee Lim

GENERAL NONCONVEX SPLIT VARIATIONAL INEQUALITY PROBLEMS. Jong Kyu Kim, Salahuddin, and Won Hee Lim Korean J. Math. 25 (2017), No. 4, pp. 469 481 https://doi.org/10.11568/kjm.2017.25.4.469 GENERAL NONCONVEX SPLIT VARIATIONAL INEQUALITY PROBLEMS Jong Kyu Kim, Salahuddin, and Won Hee Lim Abstract. In this

More information

On the validity of the Euler Lagrange equation

On the validity of the Euler Lagrange equation J. Math. Anal. Appl. 304 (2005) 356 369 www.elsevier.com/locate/jmaa On the validity of the Euler Lagrange equation A. Ferriero, E.M. Marchini Dipartimento di Matematica e Applicazioni, Università degli

More information

2D HILBERT-HUANG TRANSFORM. Jérémy Schmitt, Nelly Pustelnik, Pierre Borgnat, Patrick Flandrin

2D HILBERT-HUANG TRANSFORM. Jérémy Schmitt, Nelly Pustelnik, Pierre Borgnat, Patrick Flandrin 2D HILBERT-HUANG TRANSFORM Jérémy Schmitt, Nelly Pustelnik, Pierre Borgnat, Patrick Flandrin Laboratoire de Physique de l Ecole Normale Suprieure de Lyon, CNRS and Université de Lyon, France first.last@ens-lyon.fr

More information

First-order methods for structured nonsmooth optimization

First-order methods for structured nonsmooth optimization First-order methods for structured nonsmooth optimization Sangwoon Yun Department of Mathematics Education Sungkyunkwan University Oct 19, 2016 Center for Mathematical Analysis & Computation, Yonsei University

More information

Support Vector Machines

Support Vector Machines Support Vector Machines Support vector machines (SVMs) are one of the central concepts in all of machine learning. They are simply a combination of two ideas: linear classification via maximum (or optimal

More information

Coordinate Update Algorithm Short Course Operator Splitting

Coordinate Update Algorithm Short Course Operator Splitting Coordinate Update Algorithm Short Course Operator Splitting Instructor: Wotao Yin (UCLA Math) Summer 2016 1 / 25 Operator splitting pipeline 1. Formulate a problem as 0 A(x) + B(x) with monotone operators

More information

Convex Optimization Algorithms for Machine Learning in 10 Slides

Convex Optimization Algorithms for Machine Learning in 10 Slides Convex Optimization Algorithms for Machine Learning in 10 Slides Presenter: Jul. 15. 2015 Outline 1 Quadratic Problem Linear System 2 Smooth Problem Newton-CG 3 Composite Problem Proximal-Newton-CD 4 Non-smooth,

More information

Γ-convergence of functionals on divergence-free fields

Γ-convergence of functionals on divergence-free fields Γ-convergence of functionals on divergence-free fields N. Ansini Section de Mathématiques EPFL 05 Lausanne Switzerland A. Garroni Dip. di Matematica Univ. di Roma La Sapienza P.le A. Moro 2 0085 Rome,

More information

Non-smooth Non-convex Bregman Minimization: Unification and New Algorithms

Non-smooth Non-convex Bregman Minimization: Unification and New Algorithms JOTA manuscript No. (will be inserted by the editor) Non-smooth Non-convex Bregman Minimization: Unification and New Algorithms Peter Ochs Jalal Fadili Thomas Brox Received: date / Accepted: date Abstract

More information