Signal Processing and Networks Optimization Part VI: Duality


Signal Processing and Networks Optimization, Part VI: Duality
Pierre Borgnat (1), Jean-Christophe Pesquet (2), Nelly Pustelnik (1)
(1) ENS Lyon, Laboratoire de Physique, CNRS UMR 5672, pierre.borgnat@ens-lyon.fr, nelly.pustelnik@ens-lyon.fr
(2) LIGM, Univ. Paris-Est, CNRS UMR 8049, jean-christophe.pesquet@univ-paris-est.fr

2/29 Fenchel-Rockafellar duality

Primal problem. Let $\mathcal{H}$ and $\mathcal{G}$ be two real Hilbert spaces, let $f\colon\mathcal{H}\to\left]-\infty,+\infty\right]$, $g\colon\mathcal{G}\to\left]-\infty,+\infty\right]$, and let $L\in\mathcal{B}(\mathcal{H},\mathcal{G})$. We want to
minimize over $x\in\mathcal{H}$: $f(x)+g(Lx)$.

Dual problem. With the same data, the dual problem is to
minimize over $v\in\mathcal{G}$: $f^*(-L^*v)+g^*(v)$.

3/29 Fenchel-Rockafellar duality

Weak duality. Let $\mathcal{H}$ and $\mathcal{G}$ be two real Hilbert spaces, let $f$ be a proper function from $\mathcal{H}$ to $\left]-\infty,+\infty\right]$, let $g$ be a proper function from $\mathcal{G}$ to $\left]-\infty,+\infty\right]$, and let $L\in\mathcal{B}(\mathcal{H},\mathcal{G})$. Let
$\mu=\inf_{x\in\mathcal{H}} f(x)+g(Lx)$ and $\mu^*=\inf_{v\in\mathcal{G}} f^*(-L^*v)+g^*(v)$.
We have $\mu\geq-\mu^*$. If $\mu\in\mathbb{R}$, $\mu+\mu^*$ is called the duality gap.

Proof: according to the Fenchel-Young inequality, for every $x\in\mathcal{H}$ and $v\in\mathcal{G}$,
$f(x)+g(Lx)+f^*(-L^*v)+g^*(v)\geq\langle x\mid -L^*v\rangle+\langle Lx\mid v\rangle=0$.
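As a minimal numerical sanity check of this inequality (a Python sketch with arbitrary data, assuming the illustrative choices $f(x)=\frac{1}{2}\|x-a\|^2$ and $g=\lambda\|\cdot\|_1$, whose conjugates are explicit):

import numpy as np

rng = np.random.default_rng(0)
K, N, lam = 5, 8, 0.7
L = rng.standard_normal((K, N))
a = rng.standard_normal(N)

f = lambda x: 0.5 * np.sum((x - a) ** 2)                 # f(x) = 0.5 ||x - a||^2
f_star = lambda u: u @ a + 0.5 * np.sum(u ** 2)          # f*(u) = <u, a> + 0.5 ||u||^2
g = lambda y: lam * np.sum(np.abs(y))                    # g(y) = lam ||y||_1
g_star = lambda v: 0.0 if np.max(np.abs(v)) <= lam else np.inf   # indicator of ||v||_inf <= lam

# Fenchel-Young: f(x) + g(Lx) + f*(-L^T v) + g*(v) >= <x | -L^T v> + <Lx | v> = 0
for _ in range(1000):
    x = rng.standard_normal(N)
    v = rng.uniform(-lam, lam, K)                        # kept inside dom g* so the sum is finite
    assert f(x) + g(L @ x) + f_star(-L.T @ v) + g_star(v) >= -1e-10

In particular, every dual value $-(f^*(-L^*v)+g^*(v))$ lower-bounds every primal value $f(x)+g(Lx)$, which is exactly weak duality.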

4/29 Fenchel-Rockafellar duality

Strong duality. Let $\mathcal{H}$ and $\mathcal{G}$ be two real Hilbert spaces, $f\in\Gamma_0(\mathcal{H})$, $g\in\Gamma_0(\mathcal{G})$, and $L\in\mathcal{B}(\mathcal{H},\mathcal{G})$. If $\operatorname{int}(\operatorname{dom}g)\cap L(\operatorname{dom}f)\neq\varnothing$ or $\operatorname{dom}g\cap\operatorname{int}\big(L(\operatorname{dom}f)\big)\neq\varnothing$, then
$\mu=\inf_{x\in\mathcal{H}} f(x)+g(Lx)=-\min_{v\in\mathcal{G}} f^*(-L^*v)+g^*(v)=-\mu^*$.

5/29 Example 1: Linear programming

Let $L\in\mathbb{R}^{K\times N}$, $b\in\mathbb{R}^K$, and $c\in\mathbb{R}^N$. The primal problem
Primal-LP: minimize $c^\top x$ subject to $Lx\geq b$, $x\in[0,+\infty[^N$,
is associated with the dual problem
Dual-LP: maximize $b^\top y$ subject to $L^\top y\leq c$, $y\in[0,+\infty[^K$.
In addition, if the primal problem has a solution, then strong duality holds.

Proof: set, for every $x\in\mathcal{H}=\mathbb{R}^N$, $f(x)=c^\top x+\iota_{[0,+\infty[^N}(x)$, and, for every $z\in\mathcal{G}=\mathbb{R}^K$, $g(z)=\iota_{[0,+\infty[^K}(z-b)$, with $y=-v$.
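To see the Primal-LP/Dual-LP pair in action, here is a short Python check (an illustration with random feasible data, not from the slides) that solves both problems with scipy and compares optimal values:

import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
K, N = 4, 6
L = rng.standard_normal((K, N))
c = rng.random(N)                  # c >= 0 keeps the primal bounded below on x >= 0
x0 = rng.random(N)
b = L @ x0 - 0.1                   # L x0 >= b, so the primal is feasible

# Primal-LP: minimize c^T x  s.t.  Lx >= b, x >= 0   (Lx >= b rewritten as -Lx <= -b)
primal = linprog(c, A_ub=-L, b_ub=-b, bounds=(0, None))
# Dual-LP:   maximize b^T y  s.t.  L^T y <= c, y >= 0
dual = linprog(-b, A_ub=L.T, b_ub=c, bounds=(0, None))

print(primal.fun, -dual.fun)       # equal up to solver tolerance: strong duality holds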

6/29 Example 2: Consensus and sharing

Let $\mathcal{H}$ be a real Hilbert space. For every $i\in\{1,\dots,m\}$, let $g_i\colon\mathcal{H}\to\left]-\infty,+\infty\right]$ and $h_i\colon\mathcal{H}\to\left]-\infty,+\infty\right]$.
The consensus problem is: minimize $\sum_{i=1}^m g_i(x_i)$ over $(x_1,\dots,x_m)\in\mathcal{H}^m$ subject to $x_1=\dots=x_m$.
The sharing problem is: maximize $\sum_{i=1}^m h_i(u_i)$ over $(u_1,\dots,u_m)\in\mathcal{H}^m$ subject to $u_1+\dots+u_m=u$, where $u\in\mathcal{H}$.
If, for every $i\in\{1,\dots,m\}$, $h_i=-g_i^*(\cdot-u/m)$, then sharing is the dual problem of consensus.

Proof: set $L=\operatorname{Id}$ and, for every $x=(x_1,\dots,x_m)\in\mathcal{H}^m$, $f(x)=\iota_{\Lambda_m}(x)$ and $g(x)=\sum_{i=1}^m g_i(x_i)$, where $\Lambda_m=\{(x_1,\dots,x_m)\in\mathcal{H}^m\mid x_1=\dots=x_m\}$.

7/29 Fenchel-Rockafellar duality

Let $\mathcal{H}$ and $\mathcal{G}$ be two real Hilbert spaces, $f\colon\mathcal{H}\to\left]-\infty,+\infty\right]$, $g\colon\mathcal{G}\to\left]-\infty,+\infty\right]$, and $L\in\mathcal{B}(\mathcal{H},\mathcal{G})$. If $\operatorname{dom}g\cap L(\operatorname{dom}f)\neq\varnothing$, then
$(\forall x\in\mathcal{H})\quad \partial f(x)+L^*\partial g(Lx)\subset\partial(f+g\circ L)(x)$.

Let $\mathcal{H}$ and $\mathcal{G}$ be two real Hilbert spaces, $f\in\Gamma_0(\mathcal{H})$, $g\in\Gamma_0(\mathcal{G})$, and $L\in\mathcal{B}(\mathcal{H},\mathcal{G})$. If $\operatorname{int}(\operatorname{dom}g)\cap L(\operatorname{dom}f)\neq\varnothing$ or $\operatorname{dom}g\cap\operatorname{int}\big(L(\operatorname{dom}f)\big)\neq\varnothing$, then
$\partial f+L^*\partial g(L\cdot)=\partial(f+g\circ L)$.

8/29 Fenchel-Rockafellar duality

Duality theorem (1). Let $\mathcal{H}$ and $\mathcal{G}$ be two real Hilbert spaces, $f\in\Gamma_0(\mathcal{H})$, $g\in\Gamma_0(\mathcal{G})$, and $L\in\mathcal{B}(\mathcal{H},\mathcal{G})$. Then
$\operatorname{zer}\big(\partial f+L^*\partial g(L\cdot)\big)\neq\varnothing \;\Leftrightarrow\; \operatorname{zer}\big((-L)\partial f^*(-L^*\cdot)+\partial g^*\big)\neq\varnothing$.

Proof:
$(\exists x\in\mathcal{H})\; 0\in\partial f(x)+L^*\partial g(Lx)$
$\Leftrightarrow (\exists x\in\mathcal{H})(\exists v\in\mathcal{G})\; -L^*v\in\partial f(x)$ and $v\in\partial g(Lx)$
$\Leftrightarrow (\exists x\in\mathcal{H})(\exists v\in\mathcal{G})\; x\in\partial f^*(-L^*v)$ and $Lx\in\partial g^*(v)$
$\Leftrightarrow (\exists v\in\mathcal{G})\; 0\in -L\partial f^*(-L^*v)+\partial g^*(v)$.

9/29 Fenchel-Rockafellar duality

Duality theorem (2). Let $\mathcal{H}$ and $\mathcal{G}$ be two real Hilbert spaces, $f\in\Gamma_0(\mathcal{H})$, $g\in\Gamma_0(\mathcal{G})$, and $L\in\mathcal{B}(\mathcal{H},\mathcal{G})$.
If there exists $\hat{x}\in\mathcal{H}$ such that $0\in\partial f(\hat{x})+L^*\partial g(L\hat{x})$, then $\hat{x}$ is a solution to the primal problem. Moreover, there exists a solution $\hat{v}$ to the dual problem such that $-L^*\hat{v}\in\partial f(\hat{x})$ and $L\hat{x}\in\partial g^*(\hat{v})$.
If there exists $(\hat{x},\hat{v})\in\mathcal{H}\times\mathcal{G}$ such that $-L^*\hat{v}\in\partial f(\hat{x})$ and $L\hat{x}\in\partial g^*(\hat{v})$, then $\hat{x}$ (resp. $\hat{v}$) is a solution to the primal (resp. dual) problem.
If $(\hat{x},\hat{v})\in\mathcal{H}\times\mathcal{G}$ is such that $-L^*\hat{v}\in\partial f(\hat{x})$ and $L\hat{x}\in\partial g^*(\hat{v})$, then $(\hat{x},\hat{v})$ is called a Kuhn-Tucker point.

10/29 Fenchel-Rockafellar duality

Proof: $0\in\partial f(\hat{x})+L^*\partial g(L\hat{x})\subset\partial(f+g\circ L)(\hat{x})$. Then, according to Fermat's rule, $\hat{x}$ is a solution to the primal problem. In addition, there exists $\hat{v}\in\mathcal{G}$ such that
$0\in\partial f(\hat{x})+L^*\hat{v}$ and $\hat{v}\in\partial g(L\hat{x})$, i.e. $-L^*\hat{v}\in\partial f(\hat{x})$ and $L\hat{x}\in\partial g^*(\hat{v})$.
We also have $\hat{x}\in\partial f^*(-L^*\hat{v})$, which implies that $0\in -L\partial f^*(-L^*\hat{v})+\partial g^*(\hat{v})$.
On the other hand, $0\in -L\partial f^*(-L^*\hat{v})+\partial g^*(\hat{v})\subset\partial\big(f^*(-L^*\cdot)+g^*\big)(\hat{v})$, hence $\hat{v}$ is a solution to the dual problem.
The second assertion is shown in a similar manner.

10/29 Fenchel-Rockafellar duality

Particular case: if $f=\varphi+\frac{1}{2}\|\cdot-z\|^2$ where $\varphi\in\Gamma_0(\mathcal{H})$ and $z\in\mathcal{H}$, then
$-L^*\hat{v}\in\partial f(\hat{x}) \;\Leftrightarrow\; -L^*\hat{v}\in\partial\varphi(\hat{x})+\hat{x}-z \;\Leftrightarrow\; 0\in\hat{x}+L^*\hat{v}-z+\partial\varphi(\hat{x})$.
Hence, $\hat{x}=\operatorname{prox}_\varphi(-L^*\hat{v}+z)$.
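For instance, if $\varphi=\lambda\|\cdot\|_1$, the proximity operator in this recovery formula is the componentwise soft-thresholding; a small Python sketch (with made-up $L$, $z$ and a dual point $\hat{v}$) of how $\hat{x}$ would be evaluated:

import numpy as np

def prox_l1(u, lam):
    # prox of phi = lam * ||.||_1: componentwise soft-thresholding
    return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)

rng = np.random.default_rng(2)
L = rng.standard_normal((3, 5))
z = rng.standard_normal(5)
v_hat = rng.standard_normal(3)                 # a (hypothetical) dual solution
x_hat = prox_l1(-L.T @ v_hat + z, lam=0.5)     # x_hat = prox_phi(-L* v_hat + z)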

11/29 Link with Lagrange duality

Minimax problem. Let $\mathcal{H}$ and $\mathcal{G}$ be two real Hilbert spaces, $f\colon\mathcal{H}\to\left]-\infty,+\infty\right]$, $g\colon\mathcal{G}\to\left]-\infty,+\infty\right]$, and $L\in\mathcal{B}(\mathcal{H},\mathcal{G})$. The primal problem is equivalent to finding
$\mu=\inf_{(x,y)\in\mathcal{H}\times\mathcal{G}}\;\sup_{v\in\mathcal{G}}\;\mathcal{L}(x,y,v)$
where $\mathcal{L}$ is the Lagrange function defined as
$\big(\forall(x,y,v)\in\mathcal{H}\times\mathcal{G}^2\big)\quad \mathcal{L}(x,y,v)=f(x)+g(y)+\langle v\mid Lx-y\rangle$.

Proof:
$\mu=\inf_{x\in\mathcal{H}} f(x)+g(Lx)=\inf_{(x,y)\in\mathcal{H}\times\mathcal{G}} f(x)+g(y)+\iota_{\{0\}}(Lx-y)$
$=\inf_{(x,y)\in\mathcal{H}\times\mathcal{G}} f(x)+g(y)+\sup_{v\in\mathcal{G}}\langle v\mid Lx-y\rangle$
$=\inf_{(x,y)\in\mathcal{H}\times\mathcal{G}}\sup_{v\in\mathcal{G}} f(x)+g(y)+\langle v\mid Lx-y\rangle$.

11/29 Link with Lagrange duality

Maximin problem. Let $\mathcal{H}$ and $\mathcal{G}$ be two real Hilbert spaces, $f\colon\mathcal{H}\to\left]-\infty,+\infty\right]$, $g\colon\mathcal{G}\to\left]-\infty,+\infty\right]$, and $L\in\mathcal{B}(\mathcal{H},\mathcal{G})$. The dual problem is equivalent to finding
$-\mu^*=\sup_{v\in\mathcal{G}}\;\inf_{(x,y)\in\mathcal{H}\times\mathcal{G}}\;\mathcal{L}(x,y,v)$
where $\mathcal{L}$ is the Lagrange function defined as
$\big(\forall(x,y,v)\in\mathcal{H}\times\mathcal{G}^2\big)\quad \mathcal{L}(x,y,v)=f(x)+g(y)+\langle v\mid Lx-y\rangle$.

Proof:
$-\mu^*=-\inf_{v\in\mathcal{G}} f^*(-L^*v)+g^*(v)$
$=-\inf_{v\in\mathcal{G}}\Big(\sup_{x\in\mathcal{H}}\langle x\mid -L^*v\rangle-f(x)\Big)+\Big(\sup_{y\in\mathcal{G}}\langle y\mid v\rangle-g(y)\Big)$
$=-\inf_{v\in\mathcal{G}}\Big(-\inf_{(x,y)\in\mathcal{H}\times\mathcal{G}} f(x)+g(y)+\langle v\mid Lx-y\rangle\Big)$
$=\sup_{v\in\mathcal{G}}\inf_{(x,y)\in\mathcal{H}\times\mathcal{G}} f(x)+g(y)+\langle v\mid Lx-y\rangle$.

Remark: $v$ is called the Lagrange multiplier associated with the constraint $Lx=y$.

12/29 Link with Lagrange duality

Let $(\hat{x},\hat{y},\hat{v})\in\mathcal{H}\times\mathcal{G}^2$. $(\hat{x},\hat{y},\hat{v})$ is a saddle point of the Lagrange function $\mathcal{L}$ if
$\big(\forall(x,y,v)\in\mathcal{H}\times\mathcal{G}^2\big)\quad \mathcal{L}(\hat{x},\hat{y},v)\leq\mathcal{L}(\hat{x},\hat{y},\hat{v})\leq\mathcal{L}(x,y,\hat{v})$.

Let $\mathcal{H}$ and $\mathcal{G}$ be two real Hilbert spaces, $f\in\Gamma_0(\mathcal{H})$, $g\in\Gamma_0(\mathcal{G})$, and $L\in\mathcal{B}(\mathcal{H},\mathcal{G})$. Let $(\hat{x},\hat{y},\hat{v})\in\mathcal{H}\times\mathcal{G}^2$. Assume that $\operatorname{int}(\operatorname{dom}g)\cap L(\operatorname{dom}f)\neq\varnothing$ or $\operatorname{dom}g\cap\operatorname{int}\big(L(\operatorname{dom}f)\big)\neq\varnothing$. Then
$(\hat{x},\hat{y},\hat{v})$ is a saddle point of the Lagrange function $\Leftrightarrow$ $(\hat{x},\hat{v})$ is a Kuhn-Tucker point and $\hat{y}=L\hat{x}$.

13/29 Link with Lagrange duality

Proof ($\Rightarrow$): if $(\hat{x},\hat{y},\hat{v})$ is a saddle point of $\mathcal{L}$, then it is a critical point of $\mathcal{L}$, that is,
$0\in\partial_x\mathcal{L}(\hat{x},\hat{y},\hat{v})=\partial f(\hat{x})+L^*\hat{v}$,
$0\in\partial_y\mathcal{L}(\hat{x},\hat{y},\hat{v})=\partial g(\hat{y})-\hat{v}$,
$0=\nabla_v\mathcal{L}(\hat{x},\hat{y},\hat{v})=L\hat{x}-\hat{y}$,
which is equivalent to $-L^*\hat{v}\in\partial f(\hat{x})$, $\hat{v}\in\partial g(\hat{y})$, $\hat{y}=L\hat{x}$, i.e. $-L^*\hat{v}\in\partial f(\hat{x})$, $L\hat{x}\in\partial g^*(\hat{v})$, $\hat{y}=L\hat{x}$.

13/29 Link with Lagrange duality

Proof ($\Leftarrow$): conversely, assume that $(\hat{x},\hat{v})$ is a Kuhn-Tucker point and $\hat{y}=L\hat{x}$. Since $\hat{x}$ (resp. $\hat{v}$) is a solution to the primal (resp. dual) problem,
$\mu=\inf_{(x,y)\in\mathcal{H}\times\mathcal{G}}\sup_{v\in\mathcal{G}}\mathcal{L}(x,y,v)=\sup_{v\in\mathcal{G}}\mathcal{L}(\hat{x},\hat{y},v)$,
$-\mu^*=\sup_{v\in\mathcal{G}}\inf_{(x,y)\in\mathcal{H}\times\mathcal{G}}\mathcal{L}(x,y,v)=\inf_{(x,y)\in\mathcal{H}\times\mathcal{G}}\mathcal{L}(x,y,\hat{v})$.
By strong duality, $\sup_{v\in\mathcal{G}}\mathcal{L}(\hat{x},\hat{y},v)=\inf_{(x,y)\in\mathcal{H}\times\mathcal{G}}\mathcal{L}(x,y,\hat{v})$, which can be rewritten as
$\big(\forall(x,y,v)\in\mathcal{H}\times\mathcal{G}^2\big)\quad \mathcal{L}(\hat{x},\hat{y},v)\leq\mathcal{L}(x,y,\hat{v})$,
or equivalently
$\big(\forall(x,y,v)\in\mathcal{H}\times\mathcal{G}^2\big)\quad \mathcal{L}(\hat{x},\hat{y},v)\leq\mathcal{L}(\hat{x},\hat{y},\hat{v})\leq\mathcal{L}(x,y,\hat{v})$.

14/29 Alternating-direction method of multipliers

Idea: iterations for finding a saddle point $(\hat{x},\hat{y},\hat{v})$:
$(\forall n\in\mathbb{N})\quad x_n\in\operatorname{Argmin}\mathcal{L}(\cdot,y_n,v_n)$, $\;y_{n+1}\in\operatorname{Argmin}\mathcal{L}(x_n,\cdot,v_n)$, $\;v_{n+1}$ such that $\mathcal{L}(x_n,y_{n+1},v_{n+1})\geq\mathcal{L}(x_n,y_{n+1},v_n)$.
But convergence is not guaranteed in general!

Solution: introduce an augmented Lagrange function. Let $\gamma\in\left]0,+\infty\right[$ and define
$\big(\forall(x,y,z)\in\mathcal{H}\times\mathcal{G}^2\big)\quad \widetilde{\mathcal{L}}(x,y,z)=f(x)+g(y)+\gamma\langle z\mid Lx-y\rangle+\frac{\gamma}{2}\|Lx-y\|^2$.
The Lagrange multiplier is $v=\gamma z$.

15/29 Alternating-direction method of multipliers

Let $\mathcal{H}$ and $\mathcal{G}$ be two real Hilbert spaces, $f\in\Gamma_0(\mathcal{H})$, $g\in\Gamma_0(\mathcal{G})$, and $L\in\mathcal{B}(\mathcal{H},\mathcal{G})$. Let $(\hat{x},\hat{y},\hat{z})\in\mathcal{H}\times\mathcal{G}^2$. Assume that $\operatorname{int}(\operatorname{dom}g)\cap L(\operatorname{dom}f)\neq\varnothing$ or $\operatorname{dom}g\cap\operatorname{int}\big(L(\operatorname{dom}f)\big)\neq\varnothing$. Then
$(\hat{x},\hat{y},\hat{z})$ is a saddle point of the augmented Lagrange function $\widetilde{\mathcal{L}}$ $\Leftrightarrow$ $(\hat{x},\gamma\hat{z})$ is a Kuhn-Tucker point and $\hat{y}=L\hat{x}$.

16/29 Alternating-direction method of multipliers

Proof ($\Rightarrow$): if $(\hat{x},\hat{y},\hat{z})$ is a saddle point of $\widetilde{\mathcal{L}}$, then
$0\in\partial_x\widetilde{\mathcal{L}}(\hat{x},\hat{y},\hat{z})=\partial f(\hat{x})+\gamma L^*\hat{z}+\gamma L^*(L\hat{x}-\hat{y})$,
$0\in\partial_y\widetilde{\mathcal{L}}(\hat{x},\hat{y},\hat{z})=\partial g(\hat{y})-\gamma\hat{z}+\gamma(\hat{y}-L\hat{x})$,
$0=\nabla_z\widetilde{\mathcal{L}}(\hat{x},\hat{y},\hat{z})=\gamma(L\hat{x}-\hat{y})$,
which is equivalent to
$0\in\partial_x\mathcal{L}(\hat{x},\hat{y},\gamma\hat{z})$, $\;0\in\partial_y\mathcal{L}(\hat{x},\hat{y},\gamma\hat{z})$, $\;0=\nabla_v\mathcal{L}(\hat{x},\hat{y},\gamma\hat{z})$.

16/29 Alternating-direction method of multipliers

Proof ($\Leftarrow$): conversely, if $(\hat{x},\gamma\hat{z})$ is a Kuhn-Tucker point and $\hat{y}=L\hat{x}$, then $(\hat{x},\hat{y},\gamma\hat{z})$ is a saddle point of $\mathcal{L}$. In addition,
$\big(\forall(x,y,z)\in\mathcal{H}\times\mathcal{G}^2\big)\quad \widetilde{\mathcal{L}}(x,y,z)=\mathcal{L}(x,y,\gamma z)+\frac{\gamma}{2}\|Lx-y\|^2$.
It can be deduced that
$\widetilde{\mathcal{L}}(\hat{x},\hat{y},z)=\mathcal{L}(\hat{x},\hat{y},\gamma z)\leq\mathcal{L}(\hat{x},\hat{y},\gamma\hat{z})=\widetilde{\mathcal{L}}(\hat{x},\hat{y},\hat{z})\leq\mathcal{L}(x,y,\gamma\hat{z})\leq\widetilde{\mathcal{L}}(x,y,\hat{z})$.

17/29 Alternating-direction method of multipliers

Algorithm for finding a saddle point:
$(\forall n\in\mathbb{N})\quad x_n=\operatorname*{argmin}_{x\in\mathcal{H}}\widetilde{\mathcal{L}}(x,y_n,z_n)$, $\;y_{n+1}=\operatorname*{argmin}_{y\in\mathcal{G}}\widetilde{\mathcal{L}}(x_n,y,z_n)$, $\;z_{n+1}$ such that $\widetilde{\mathcal{L}}(x_n,y_{n+1},z_{n+1})\geq\widetilde{\mathcal{L}}(x_n,y_{n+1},z_n)$.
By performing a gradient ascent on the Lagrange multiplier,
$(\forall n\in\mathbb{N})\quad x_n=\operatorname*{argmin}_{x\in\mathcal{H}} f(x)+\gamma\langle z_n\mid Lx-y_n\rangle+\frac{\gamma}{2}\|Lx-y_n\|^2$, $\;y_{n+1}=\operatorname*{argmin}_{y\in\mathcal{G}} g(y)+\gamma\langle z_n\mid Lx_n-y\rangle+\frac{\gamma}{2}\|Lx_n-y\|^2$, $\;z_{n+1}=z_n+\frac{1}{\gamma}\nabla_z\widetilde{\mathcal{L}}(x_n,y_{n+1},z_n)$,
which simplifies to
$(\forall n\in\mathbb{N})\quad x_n=\operatorname*{argmin}_{x\in\mathcal{H}}\frac{1}{2}\|Lx-y_n+z_n\|^2+\frac{1}{\gamma}f(x)$, $\;y_{n+1}=\operatorname{prox}_{g/\gamma}(z_n+Lx_n)$, $\;z_{n+1}=z_n+Lx_n-y_{n+1}$.

18/29 Augmented Lagrangian method

ADMM algorithm (Alternating-direction method of multipliers). Let $\mathcal{H}$ and $\mathcal{G}$ be two Hilbert spaces, $f\in\Gamma_0(\mathcal{H})$, $g\in\Gamma_0(\mathcal{G})$, and let $L\in\mathcal{B}(\mathcal{H},\mathcal{G})$ be such that $L^*L$ is an isomorphism. Let $\gamma\in\left]0,+\infty\right[$. We assume that $\operatorname{int}(\operatorname{dom}g)\cap L(\operatorname{dom}f)\neq\varnothing$ or $\operatorname{dom}g\cap\operatorname{int}\big(L(\operatorname{dom}f)\big)\neq\varnothing$, and that $\operatorname{Argmin}(f+g\circ L)\neq\varnothing$. Let $(y_0,z_0)\in\mathcal{G}^2$ and
$(\forall n\in\mathbb{N})\quad x_n=\operatorname*{argmin}_{x\in\mathcal{H}}\frac{1}{2}\|Lx-y_n+z_n\|^2+\frac{1}{\gamma}f(x)$, $\;s_n=Lx_n$, $\;y_{n+1}=\operatorname{prox}_{g/\gamma}(z_n+s_n)$, $\;z_{n+1}=z_n+s_n-y_{n+1}$.
We have:
$x_n\to\hat{x}$ where $\hat{x}\in\operatorname{Argmin}(f+g\circ L)$,
$\gamma z_n\to\hat{v}$ where $\hat{v}\in\operatorname{Argmin}\big(f^*(-L^*\cdot)+g^*\big)$.

Sketch of the proof: ADMM can be shown to be equivalent to the Douglas-Rachford algorithm applied to the dual problem.
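To make the iteration concrete, here is a compact Python sketch of ADMM for one illustrative instance of minimize $f(x)+g(Lx)$ (my own choices, not the slides'): $f(x)=\frac{1}{2}\|x-a\|^2$ so the x-step is a linear solve, $g=\lambda\|\cdot\|_1$ so $\operatorname{prox}_{g/\gamma}$ is a soft-thresholding, and $L$ a random full-column-rank matrix so that $L^*L$ is invertible.

import numpy as np

rng = np.random.default_rng(3)
K, N, lam, gamma = 30, 10, 0.5, 1.0
L = rng.standard_normal((K, N))           # full column rank with probability 1, so L^T L is invertible
a = rng.standard_normal(N)

def soft(u, t):                           # prox of t * ||.||_1
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

y = np.zeros(K)
z = np.zeros(K)
A = L.T @ L + np.eye(N) / gamma           # normal equations of the x-step
for n in range(200):
    # x_n = argmin_x 0.5 ||Lx - y_n + z_n||^2 + f(x)/gamma, with f = 0.5 ||. - a||^2
    x = np.linalg.solve(A, L.T @ (y - z) + a / gamma)
    s = L @ x
    y = soft(z + s, lam / gamma)          # y_{n+1} = prox_{g/gamma}(z_n + s_n)
    z = z + s - y                         # multiplier update (v_n = gamma * z_n)

print(0.5 * np.sum((x - a) ** 2) + lam * np.sum(np.abs(L @ x)))   # primal objective value

In line with the convergence statement above, $\gamma z_n$ then approximates a dual solution.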

19/29 Augmented Lagrangian method

Extension: splitting more than 2 functions. We want to
minimize over $x\in\mathcal{H}$: $\sum_{i=1}^m f_i(x)$,
where $\mathcal{H}$ is a real Hilbert space and, for every $i\in\{1,\dots,m\}$, $f_i\in\Gamma_0(\mathcal{H})$.
Idea: reformulate the problem as a consensus one in the product space $\boldsymbol{\mathcal{H}}=\mathcal{H}^m$:
minimize over $\boldsymbol{x}=(x_1,\dots,x_m)\in\boldsymbol{\mathcal{H}}$: $\sum_{i=1}^m f_i(x_i)+\iota_{\Lambda_m}(\boldsymbol{x})$,
where $\Lambda_m=\{(x_1,\dots,x_m)\in\boldsymbol{\mathcal{H}}\mid x_1=\dots=x_m\}$.

20/29 Augmented Lagrangian method

Use of ADMM: set $g=\iota_{\Lambda_m}$, $L=\operatorname{Id}$, and, for every $\boldsymbol{x}=(x_1,\dots,x_m)\in\boldsymbol{\mathcal{H}}$, $f(\boldsymbol{x})=\sum_{i=1}^m f_i(x_i)$.
Resulting SDMM (Simultaneous Direction Method of Multipliers):
$(\forall n\in\mathbb{N})\quad \boldsymbol{x}_n=\operatorname*{argmin}_{\boldsymbol{x}\in\boldsymbol{\mathcal{H}}}\frac{1}{2}\|\boldsymbol{x}-\boldsymbol{y}_n+\boldsymbol{z}_n\|^2+\frac{1}{\gamma}f(\boldsymbol{x})$, $\;\boldsymbol{y}_{n+1}=\operatorname{prox}_{g/\gamma}(\boldsymbol{z}_n+\boldsymbol{x}_n)$, $\;\boldsymbol{z}_{n+1}=\boldsymbol{z}_n+\boldsymbol{x}_n-\boldsymbol{y}_{n+1}$.

20/29 Augmented Lagrangian method

Since $g=\iota_{\Lambda_m}$ and $L=\operatorname{Id}$, the SDMM iteration reads
$(\forall n\in\mathbb{N})\quad \boldsymbol{x}_n=\operatorname{prox}_{f/\gamma}(\boldsymbol{y}_n-\boldsymbol{z}_n)$, $\;\boldsymbol{y}_{n+1}=P_{\Lambda_m}(\boldsymbol{z}_n+\boldsymbol{x}_n)$, $\;\boldsymbol{z}_{n+1}=\boldsymbol{z}_n+\boldsymbol{x}_n-\boldsymbol{y}_{n+1}$.
Note that $P_{\Lambda_m}\boldsymbol{z}_{n+1}=P_{\Lambda_m}(\boldsymbol{z}_n+\boldsymbol{x}_n)-P_{\Lambda_m}\boldsymbol{y}_{n+1}=\boldsymbol{y}_{n+1}-\boldsymbol{y}_{n+1}=0$.

20/29 Augmented Lagrangian method

Since $P_{\Lambda_m}\boldsymbol{z}_n=0$ for every $n\geq1$, the iteration simplifies to
$(\forall n\in\mathbb{N})\quad \boldsymbol{x}_n=\operatorname{prox}_{f/\gamma}(\boldsymbol{y}_n-\boldsymbol{z}_n)$, $\;\boldsymbol{y}_{n+1}=P_{\Lambda_m}\boldsymbol{x}_n$, $\;\boldsymbol{z}_{n+1}=\boldsymbol{z}_n+\boldsymbol{x}_n-\boldsymbol{y}_{n+1}$,
provided that $P_{\Lambda_m}\boldsymbol{z}_0=0$.

20/29 Augmented Lagrangian method

Projection onto $\Lambda_m$: for every $\boldsymbol{u}=(u_1,\dots,u_m)\in\boldsymbol{\mathcal{H}}$, $P_{\Lambda_m}(\boldsymbol{u})=(\bar{u},\dots,\bar{u})$ where $\bar{u}=\frac{1}{m}\sum_{i=1}^m u_i$.
Resulting SDMM:
$(\forall n\in\mathbb{N})\quad \boldsymbol{x}_n=\operatorname{prox}_{f/\gamma}(\boldsymbol{y}_n-\boldsymbol{z}_n)$, $\;\boldsymbol{y}_{n+1}=P_{\Lambda_m}\boldsymbol{x}_n$, $\;\boldsymbol{z}_{n+1}=\boldsymbol{z}_n+\boldsymbol{x}_n-\boldsymbol{y}_{n+1}$,
provided that $P_{\Lambda_m}\boldsymbol{z}_0=0$.

20/29 Augmented Lagrangian method

Written componentwise (with $\bar{x}_{n-1}$ denoting the common value of the entries of $\boldsymbol{y}_n$), SDMM becomes
$(\forall n\in\mathbb{N})\quad x_{n,i}=\operatorname{prox}_{f_i/\gamma}(\bar{x}_{n-1}-z_{n,i})$ for $i\in\{1,\dots,m\}$, $\;\bar{x}_n=\frac{1}{m}\sum_{i=1}^m x_{n,i}$, $\;z_{n+1,i}=z_{n,i}+x_{n,i}-\bar{x}_n$ for $i\in\{1,\dots,m\}$,
provided that $\sum_{i=1}^m z_{0,i}=0$.
Parallel computation of the proximity operators, but centralized computation of the average.
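A short Python sketch of this SDMM iteration, for the illustrative case $f_i(x)=\frac{1}{2}\|x-a_i\|^2$ (so that $\operatorname{prox}_{f_i/\gamma}$ has a closed form and the consensus solution is simply the average of the $a_i$):

import numpy as np

rng = np.random.default_rng(4)
m, d, gamma = 5, 3, 1.0
a = rng.standard_normal((m, d))                 # f_i(x) = 0.5 ||x - a_i||^2

def prox_fi(u, ai):                             # prox_{f_i/gamma}(u) = (gamma u + a_i) / (gamma + 1)
    return (gamma * u + ai) / (gamma + 1.0)

xbar = np.zeros(d)                              # common value of the entries of y_0
z = np.zeros((m, d))                            # sum_i z_{0,i} = 0
for n in range(100):
    x = np.array([prox_fi(xbar - z[i], a[i]) for i in range(m)])   # parallel proximal steps
    xbar = x.mean(axis=0)                                          # centralized averaging
    z = z + x - xbar                                               # componentwise multiplier update

print(xbar, a.mean(axis=0))                     # both close to the consensus minimizer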

21/29 Distributed optimization

$V=\{1,\dots,m\}$: set of indices of the vertices of an undirected nonreflexive graph; $\boldsymbol{x}\in\boldsymbol{\mathcal{H}}$: vector of node weights; $E$: set of edges $e_{i,j}$, indexed by pairs $(i,j)$ with $i<j$.

The graph is assumed to be connected:
$\boldsymbol{x}\in\Lambda_m \;\Leftrightarrow\; \big(\forall(i,j)\in E\big)\;(x_i,x_j)\in\Lambda_2$.

For every $(i,j)\in E$, we define
decimation operators: $L_{i,j}\colon\boldsymbol{\mathcal{H}}\to\mathcal{H}^2\colon\boldsymbol{x}\mapsto(x_i,x_j)$;
interpolation operators: $L_{i,j}^*\colon\mathcal{H}^2\to\boldsymbol{\mathcal{H}}\colon(y_1,y_2)\mapsto\boldsymbol{x}=(x_{i'})_{1\leq i'\leq m}$ where, for every $i'\in\{1,\dots,m\}$, $x_{i'}=y_1$ if $i'=i$, $x_{i'}=y_2$ if $i'=j$, and $x_{i'}=0$ otherwise.

The concatenated decimation operator $\boldsymbol{L}=(L_{i,j})_{(i,j)\in E}$ is such that
$\boldsymbol{L}^*\boldsymbol{L}=\sum_{(i,j)\in E}L_{i,j}^*L_{i,j}=\operatorname{Diag}(d_1\operatorname{Id},\dots,d_m\operatorname{Id})$,
where, for every $i\in\{1,\dots,m\}$, $d_i$ is the degree of the vertex of index $i$.
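The identity $\boldsymbol{L}^*\boldsymbol{L}=\operatorname{Diag}(d_1\operatorname{Id},\dots,d_m\operatorname{Id})$ is easy to check numerically; a tiny Python sketch with $\mathcal{H}=\mathbb{R}$ and an arbitrary 4-vertex graph (illustrative only):

import numpy as np

edges = [(0, 1), (0, 2), (1, 2), (2, 3)]       # undirected graph on 4 vertices, stored with i < j
m = 4

# stack the decimation operators L_{i,j}: x -> (x_i, x_j) into one matrix
L = np.zeros((2 * len(edges), m))
for e, (i, j) in enumerate(edges):
    L[2 * e, i] = 1.0
    L[2 * e + 1, j] = 1.0

deg = np.zeros(m)
for i, j in edges:
    deg[i] += 1
    deg[j] += 1

print(np.allclose(L.T @ L, np.diag(deg)))      # True: L^* L = Diag(d_1, ..., d_m)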

22/29 Distributed optimization

Distributed consensus:
minimize over $\boldsymbol{x}=(x_1,\dots,x_m)\in\boldsymbol{\mathcal{H}}$: $\sum_{i=1}^m f_i(x_i)+\underbrace{\iota_{\Lambda_m}(\boldsymbol{x})}_{\sum_{(i,j)\in E}\iota_{\Lambda_2}(x_i,x_j)}$,
that is,
minimize over $\boldsymbol{x}\in\boldsymbol{\mathcal{H}}$: $\sum_{i=1}^m f_i(x_i)+\sum_{(i,j)\in E}\iota_{\Lambda_2}(L_{i,j}\boldsymbol{x})$.

Use of ADMM: set, for every $\boldsymbol{x}=(x_1,\dots,x_m)\in\boldsymbol{\mathcal{H}}$, $f(\boldsymbol{x})=\sum_{i=1}^m f_i(x_i)$, and, for every $\boldsymbol{y}=(y_{i,j})_{(i,j)\in E}$ with $y_{i,j}\in\mathcal{H}^2$, $g(\boldsymbol{y})=\sum_{(i,j)\in E}\iota_{\Lambda_2}(y_{i,j})$.

22/29 Distributed optimization

Distributed ADMM (1):
$(\forall n\in\mathbb{N})\quad \boldsymbol{x}_n=\operatorname*{argmin}_{\boldsymbol{x}\in\boldsymbol{\mathcal{H}}}\frac{1}{2}\|\boldsymbol{L}\boldsymbol{x}-\boldsymbol{y}_n+\boldsymbol{z}_n\|^2+\frac{1}{\gamma}f(\boldsymbol{x})$, $\;\boldsymbol{s}_n=\boldsymbol{L}\boldsymbol{x}_n$, $\;\boldsymbol{y}_{n+1}=\operatorname{prox}_{g/\gamma}(\boldsymbol{z}_n+\boldsymbol{s}_n)$, $\;\boldsymbol{z}_{n+1}=\boldsymbol{z}_n+\boldsymbol{s}_n-\boldsymbol{y}_{n+1}$.

Distributed ADMM (2): provided that $\big(\forall(i,j)\in E\big)\;P_{\Lambda_2}z_{0,i,j}=0$,
$(\forall n\in\mathbb{N})\quad 0\in\boldsymbol{L}^*(\boldsymbol{L}\boldsymbol{x}_n-\boldsymbol{y}_n+\boldsymbol{z}_n)+\frac{1}{\gamma}\partial f(\boldsymbol{x}_n)$,
$\;s_{n,i,j}=(x_{n,i},x_{n,j})$ for $(i,j)\in E$,
$\;y_{n+1,i,j}=P_{\Lambda_2}(z_{n,i,j}+s_{n,i,j})=P_{\Lambda_2}s_{n,i,j}$ for $(i,j)\in E$,
$\;z_{n+1,i,j}=z_{n,i,j}+s_{n,i,j}-y_{n+1,i,j}$ for $(i,j)\in E$.

22/29 Distributed optimization

Distributed ADMM (3):
$(\forall n\in\mathbb{N})\quad 0\in d_i x_{n,i}-\big(\boldsymbol{L}^*(\boldsymbol{y}_n-\boldsymbol{z}_n)\big)_i+\frac{1}{\gamma}\partial f_i(x_{n,i})$ for $i\in V$,
$\;s_{n,i,j}=(x_{n,i},x_{n,j})$ for $(i,j)\in E$,
$\;y_{n+1,i,j}=P_{\Lambda_2}s_{n,i,j}$ for $(i,j)\in E$,
$\;z_{n+1,i,j}=z_{n,i,j}+s_{n,i,j}-y_{n+1,i,j}$ for $(i,j)\in E$.

22/29 Distributed optimization

Distributed ADMM (4):
$(\forall n\in\mathbb{N})\quad x_{n,i}=\operatorname{prox}_{f_i/(\gamma d_i)}\Big(d_i^{-1}\big(\boldsymbol{L}^*(\boldsymbol{y}_n-\boldsymbol{z}_n)\big)_i\Big)$ for $i\in V$,
$\;\bar{y}_{n+1,i,j}=\frac{1}{2}(x_{n,i}+x_{n,j})$ for $(i,j)\in E$,
$\;y_{n+1,i,j}=(\bar{y}_{n+1,i,j},\bar{y}_{n+1,i,j})$,
$\;z_{n+1,i,j,1}=z_{n,i,j,1}+x_{n,i}-\bar{y}_{n+1,i,j}$ for $(i,j)\in E$,
$\;z_{n+1,i,j,2}=z_{n,i,j,2}+x_{n,j}-\bar{y}_{n+1,i,j}$ for $(i,j)\in E$.

22/29 Distributed optimization

Distributed ADMM (5):
$(\forall n\in\mathbb{N})\quad x_{n,i}=\operatorname{prox}_{f_i/(\gamma d_i)}\Big(d_i^{-1}\Big(\sum_{j:(i,j)\in E}(\bar{y}_{n,i,j}-z_{n,i,j,1})+\sum_{j:(j,i)\in E}(\bar{y}_{n,j,i}-z_{n,j,i,2})\Big)\Big)$ for $i\in V$,
$\;\bar{y}_{n+1,i,j}=\frac{1}{2}(x_{n,i}+x_{n,j})$ for $(i,j)\in E$,
$\;z_{n+1,i,j,1}=z_{n,i,j,1}+x_{n,i}-\bar{y}_{n+1,i,j}$ for $(i,j)\in E$,
$\;z_{n+1,j,i,2}=z_{n,j,i,2}+x_{n,i}-\bar{y}_{n+1,j,i}$ for $(j,i)\in E$.
Note that
$\bar{y}_{n+1,i,j}-z_{n+1,i,j,1}=2\bar{y}_{n+1,i,j}-x_{n,i}-z_{n,i,j,1}=x_{n,j}-z_{n,i,j,1}$ and $\bar{y}_{n+1,j,i}-z_{n+1,j,i,2}=x_{n,j}-z_{n,j,i,2}$.

Distributed ADMM (6):
$(\forall n\in\mathbb{N})\quad z_{n+1,i,j,1}=z_{n,i,j,1}+\frac{1}{2}(x_{n,i}-x_{n,j})$ for $(i,j)\in E$,
$\;z_{n+1,j,i,2}=z_{n,j,i,2}+\frac{1}{2}(x_{n,i}-x_{n,j})$ for $(j,i)\in E$,
$\;\bar{x}_{n,i}=d_i^{-1}\Big(\sum_{j:(i,j)\in E}x_{n,j}+\sum_{j:(j,i)\in E}x_{n,j}\Big)$ for $i\in V$,
$\;\bar{z}_{n,i}=d_i^{-1}\Big(\sum_{j:(i,j)\in E}z_{n,i,j,1}+\sum_{j:(j,i)\in E}z_{n,j,i,2}\Big)$ for $i\in V$,
$\;x_{n+1,i}=\operatorname{prox}_{f_i/(\gamma d_i)}(\bar{x}_{n,i}-\bar{z}_{n,i})$ for $i\in V$.

22/29 Distributed optimization

Distributed ADMM (final form):
$(\forall n\in\mathbb{N})\quad \bar{x}_{n,i}=d_i^{-1}\Big(\sum_{j:(i,j)\in E}x_{n,j}+\sum_{j:(j,i)\in E}x_{n,j}\Big)$ for $i\in V$,
$\;\bar{z}_{n+1,i}=\bar{z}_{n,i}+\frac{1}{2}(x_{n,i}-\bar{x}_{n,i})$ for $i\in V$,
$\;x_{n+1,i}=\operatorname{prox}_{f_i/(\gamma d_i)}(\bar{x}_{n,i}-\bar{z}_{n,i})$ for $i\in V$.
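A minimal Python simulation of this final form on a small connected graph, again with the illustrative choice $f_i(x)=\frac{1}{2}\|x-a_i\|^2$ so that $\operatorname{prox}_{f_i/(\gamma d_i)}$ is explicit (a sketch under these assumptions, not the authors' implementation):

import numpy as np

edges = [(0, 1), (0, 2), (1, 2), (2, 3)]             # connected graph, i < j
m, d, gamma = 4, 2, 1.0
rng = np.random.default_rng(5)
a = rng.standard_normal((m, d))                      # f_i(x) = 0.5 ||x - a_i||^2

neighbors = [[] for _ in range(m)]
for i, j in edges:
    neighbors[i].append(j)
    neighbors[j].append(i)
deg = np.array([len(nb) for nb in neighbors], dtype=float)

def prox_fi(u, ai, di):                              # prox_{f_i/(gamma d_i)}(u)
    return (gamma * di * u + ai) / (gamma * di + 1.0)

x = np.zeros((m, d))
zbar = np.zeros((m, d))
for n in range(300):
    xbar = np.array([x[nb].mean(axis=0) for nb in neighbors])               # local neighbor averages
    x_new = np.array([prox_fi(xbar[i] - zbar[i], a[i], deg[i]) for i in range(m)])
    zbar = zbar + 0.5 * (x - xbar)                                          # local multiplier update
    x = x_new

print(x[0], a.mean(axis=0))                          # every node approaches the consensus solution

Each node only needs its own data and its neighbors' current values, so every step is local.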

23/29 Primal-dual methods

Problem. Let $\mathcal{H}$ and $\mathcal{G}$ be two real Hilbert spaces, $f\in\Gamma_0(\mathcal{H})$, $h\in\Gamma_0(\mathcal{H})$, $g\in\Gamma_0(\mathcal{G})$, and $L\in\mathcal{B}(\mathcal{H},\mathcal{G})$. It is assumed that $h$ is differentiable and has a $\beta$-Lipschitzian gradient with $\beta\in\left]0,+\infty\right[$. We want to
minimize over $x\in\mathcal{H}$: $f(x)+g(Lx)+h(x)$.

24/29 Primal-dual methods

ADMM algorithm: let $\gamma\in\left]0,+\infty\right[$ and
$(\forall n\in\mathbb{N})\quad x_n=\operatorname*{argmin}_{x\in\mathcal{H}}\frac{1}{2}\|Lx-y_n+z_n\|^2+\frac{1}{\gamma}\big(f(x)+h(x)\big)$, $\;y_{n+1}=\operatorname{prox}_{g/\gamma}(z_n+Lx_n)$, $\;z_{n+1}=z_n+Lx_n-y_{n+1}$.
Limitations:
the computation of $x_n$ at iteration $n\in\mathbb{N}$ may be complicated;
convergence requires $L^*L$ to be invertible;
the smoothness of $h$ is not exploited.

25/29 Primal-dual methods

Idea 1: the optimization problem is reformulated as finding
$\inf_{x\in\mathcal{H}}\sup_{v\in\mathcal{G}}\; f(x)+h(x)+\langle v\mid Lx\rangle-g^*(v)$.
Arrow-Hurwitz method: let $(\tau_n)_{n\in\mathbb{N}}$ and $(\sigma_n)_{n\in\mathbb{N}}$ be sequences in $\left]0,+\infty\right[$ and
$(\forall n\in\mathbb{N})\quad t_n\in\partial f(x_n)$, $\;x_{n+1}=x_n-\tau_n\big(t_n+\nabla h(x_n)+L^*v_n\big)$, $\;s_n\in\partial g^*(v_n)$, $\;v_{n+1}=v_n-\sigma_n(s_n-Lx_{n+1})$.
This requires stringent conditions on the choice of the step-sizes (e.g. decaying to zero).

26/29 Primal-dual methods

Idea 2: use implicit updates
$(\forall n\in\mathbb{N})\quad t_n\in\partial f(x_{n+1})$, $\;x_{n+1}=x_n-\tau_n\big(t_n+\nabla h(x_n)+L^*v_n\big)$, $\;s_n\in\partial g^*(v_{n+1})$, $\;v_{n+1}=v_n-\sigma_n(s_n-Lx_{n+1})$,
i.e.
$0\in x_{n+1}-x_n+\tau_n\big(\nabla h(x_n)+L^*v_n\big)+\tau_n\partial f(x_{n+1})$ and $0\in v_{n+1}-v_n-\sigma_n Lx_{n+1}+\sigma_n\partial g^*(v_{n+1})$,
i.e.
$x_{n+1}=\operatorname{prox}_{\tau_n f}\big(x_n-\tau_n(\nabla h(x_n)+L^*v_n)\big)$ and $v_{n+1}=\operatorname{prox}_{\sigma_n g^*}\big(v_n+\sigma_n Lx_{n+1}\big)$.
This still does not converge for constant values of the step-sizes.

27/29 Primal-dual methods

Idea 3: in the dual update, replace $x_{n+1}$ by the extrapolated point $2x_{n+1}-x_n$:
$(\forall n\in\mathbb{N})\quad x_{n+1}=\operatorname{prox}_{\tau_n f}\big(x_n-\tau_n(\nabla h(x_n)+L^*v_n)\big)$, $\;v_{n+1}=\operatorname{prox}_{\sigma_n g^*}\big(v_n+\sigma_n L(2x_{n+1}-x_n)\big)$.

28/29 Primal-dual optimization algorithm

Convergence of the PD algorithm. Let $\mathcal{H}$ and $\mathcal{G}$ be two Hilbert spaces, $L\in\mathcal{B}(\mathcal{H},\mathcal{G})$, $f\in\Gamma_0(\mathcal{H})$, $g\in\Gamma_0(\mathcal{G})$, and let $h\in\Gamma_0(\mathcal{H})$ have a $\beta$-Lipschitzian gradient. Let $\tau\in\left]0,+\infty\right[$ and $\sigma\in\left]0,+\infty\right[$. We assume that $\tau^{-1}-\sigma\|L\|^2\geq\beta/2$ and $\operatorname{zer}\big(\partial f+\nabla h+L^*\partial g(L\cdot)\big)\neq\varnothing$. Let $x_0\in\mathcal{H}$, $v_0\in\mathcal{G}$, and
$(\forall n\in\mathbb{N})\quad x_{n+1}=\operatorname{prox}_{\tau f}\big(x_n-\tau(\nabla h(x_n)+L^*v_n)\big)$, $\;v_{n+1}=\operatorname{prox}_{\sigma g^*}\big(v_n+\sigma L(2x_{n+1}-x_n)\big)$.
We have:
$x_n\to\hat{x}\in\operatorname{Argmin}(f+h+g\circ L)$,
$v_n\to\hat{v}\in\operatorname{Argmin}\big((f+h)^*(-L^*\cdot)+g^*\big)$.

Sketch of the proof: rewrite the algorithm as a forward-backward iteration for solving a saddle point problem.
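A compact Python sketch of this primal-dual iteration for one illustrative instance (my own choices): $h(x)=\frac{1}{2}\|Ax-b\|^2$ (so $\beta=\|A\|^2$), $f=\mu\|\cdot\|_1$, $g=\lambda\|\cdot\|_1$, and $L$ a first-difference operator; the step-sizes are chosen so that $\tau^{-1}-\sigma\|L\|^2=\beta/2$.

import numpy as np

rng = np.random.default_rng(6)
K, N, mu, lam = 20, 15, 0.1, 0.5
A = rng.standard_normal((K, N))
b = rng.standard_normal(K)
L = np.diff(np.eye(N), axis=0)                     # first-difference operator, shape (N-1, N)

beta = np.linalg.norm(A, 2) ** 2                   # Lipschitz constant of grad h, h = 0.5 ||Ax - b||^2
sigma = 1.0
tau = 1.0 / (sigma * np.linalg.norm(L, 2) ** 2 + beta / 2)   # 1/tau - sigma ||L||^2 = beta/2

def soft(u, t):                                    # prox of t * ||.||_1
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

x = np.zeros(N)
v = np.zeros(N - 1)
for n in range(500):
    grad_h = A.T @ (A @ x - b)
    x_new = soft(x - tau * (grad_h + L.T @ v), tau * mu)        # prox_{tau f}, f = mu ||.||_1
    v = np.clip(v + sigma * L @ (2 * x_new - x), -lam, lam)     # prox_{sigma g*}, g = lam ||.||_1
    x = x_new

print(0.5 * np.sum((A @ x - b) ** 2) + mu * np.sum(np.abs(x)) + lam * np.sum(np.abs(L @ x)))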

29/29 Exercise

Let $\mathcal{H}$ and $(\mathcal{G}_i)_{1\leq i\leq m}$ be real Hilbert spaces. Let $f\in\Gamma_0(\mathcal{H})$, let $h\in\Gamma_0(\mathcal{H})$, and, for every $i\in\{1,\dots,m\}$, let $g_i\in\Gamma_0(\mathcal{G}_i)$ and $L_i\in\mathcal{B}(\mathcal{H},\mathcal{G}_i)$. It is assumed that $h$ is differentiable and has a $\beta$-Lipschitzian gradient with $\beta\in\left]0,+\infty\right[$. Propose a primal-dual algorithm to solve
minimize over $x\in\mathcal{H}$: $f(x)+\sum_{i=1}^m g_i(L_ix)+h(x)$.
