Alternating Direction Augmented Lagrangian Algorithms for Convex Optimization


1 Alternating Direction Augmented Lagrangian Algorithms for Convex Optimization
Joint work with Bo Huang, Shiqian Ma, Tony Qin, Katya Scheinberg, Zaiwen Wen and Wotao Yin
Department of IEOR, Columbia University
MOPTA Conference, Lehigh University, August 19, 2010

2 Alternating direction augmented Lagrangian methods
Long history: goes back to Gabay and Mercier, Glowinski and Marrocco, Lions and Mercier, Passty, etc.
- Optimization models from partial differential equations
- Maximal monotone operators
- Variational inequalities
- Nonlinear convex optimization
- Linear programming
- Nonsmooth $\ell_1$-minimization, compressive sensing

3 Outline
1. ADAL methods for minimizing a convex function that can be written as a sum of simpler functions
2. ADAL methods for solving SDPs

4 Introduction
(SUM-K) $\min\ F(x) \equiv \sum_{i=1}^{K} f_i(x)$
(SUM-2) $\min\ F(x) \equiv f(x) + g(x)$
Minimize the sum of convex functions. Assume the following problem is easy:
$\min_x\ \tau f_i(x) + \tfrac{1}{2}\|x - y\|^2$
Examples of $f_i$: $\|x\|_1$, $\|x\|_2$, $\|Ax - b\|^2$, $\|X\|_*$, $-\log\det(X)$, ...
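
To make the "easy subproblem" concrete, here is a minimal NumPy sketch (illustrative, not from the slides) of the proximal map for $f_i(x) = \|x\|_1$, i.e. the solution of $\min_x \tau\|x\|_1 + \tfrac{1}{2}\|x - y\|^2$:

```python
import numpy as np

def prox_l1(y, tau):
    # Soft-thresholding: solves min_x tau*||x||_1 + 0.5*||x - y||^2 componentwise.
    return np.sign(y) * np.maximum(np.abs(y) - tau, 0.0)

print(prox_l1(np.array([3.0, -0.2, 0.7]), 0.5))   # [ 2.5 -0.   0.2]
```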

5 Examples
Compressed sensing: $\min\ \rho\|x\|_1 + \|Ax - b\|_2^2$
Matrix rank minimization: $\min\ \rho\|X\|_* + \|\mathcal{A}(X) - b\|_2^2$
Robust PCA: $\min_{X,Y}\ \|X\|_* + \rho\|Y\|_1 \ :\ X + Y = M$
Sparse inverse covariance selection: $\min\ -\log\det(X) + \langle\Sigma, X\rangle + \rho\|X\|_1$

6 Variable Splitting (SUM-2)
$\min\ f(x) + g(x)$
Variable splitting:
$\min\ f(x) + g(y) \quad \text{s.t.}\ x = y$
Augmented Lagrangian function:
$\mathcal{L}(x, y; \lambda) := f(x) + g(y) - \langle\lambda, x - y\rangle + \tfrac{1}{2\mu}\|x - y\|^2$
Augmented Lagrangian method:
$(x^{k+1}, y^{k+1}) := \arg\min_{(x,y)} \mathcal{L}(x, y; \lambda^k)$
$\lambda^{k+1} := \lambda^k - (x^{k+1} - y^{k+1})/\mu$

7 Alternating Direction Augmented Lagrangian (ADAL)
$\mathcal{L}(x, y; \lambda) := f(x) + g(y) - \langle\lambda, x - y\rangle + \tfrac{1}{2\mu}\|x - y\|^2$
Solve the augmented Lagrangian subproblem alternatingly:
$x^{k+1} := \arg\min_x \mathcal{L}(x, y^k; \lambda^k)$
$y^{k+1} := \arg\min_y \mathcal{L}(x^{k+1}, y; \lambda^k)$
$\lambda^{k+1} := \lambda^k - (x^{k+1} - y^{k+1})/\mu$
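
A minimal generic sketch of this loop (not the authors' code): completing the square in each subproblem reduces the $x$- and $y$-updates to proximal steps, where `prox_f(v, t)` is assumed to return $\arg\min_x t f(x) + \tfrac{1}{2}\|x - v\|^2$ and similarly for `prox_g`:

```python
import numpy as np

def adal(prox_f, prox_g, x0, mu=1.0, iters=100):
    """Generic ADAL loop for min f(x) + g(y) s.t. x = y."""
    y = x0.copy()
    lam = np.zeros_like(x0)
    for _ in range(iters):
        x = prox_f(y + mu * lam, mu)      # x-subproblem
        y = prox_g(x - mu * lam, mu)      # y-subproblem
        lam = lam - (x - y) / mu          # multiplier update
    return x, y, lam

# tiny illustrative example: min 0.5*||x - b||^2 + rho*||x||_1
b, rho = np.array([2.0, -0.3, 1.0]), 0.5
prox_f = lambda v, t: (v + t * b) / (1.0 + t)
prox_g = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - rho * t, 0.0)
x, y, lam = adal(prox_f, prox_g, np.zeros(3))
print(x)   # close to the soft-threshold of b: [1.5, 0., 0.5]
```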

8 Symmetric ADAL
$\mathcal{L}(x, y; \lambda) := f(x) + g(y) - \langle\lambda, x - y\rangle + \tfrac{1}{2\mu}\|x - y\|^2$
Symmetric version:
$x^{k+1} := \arg\min_x \mathcal{L}(x, y^k; \lambda^k)$
$\lambda^{k+1/2} := \lambda^k - (x^{k+1} - y^k)/\mu$
$y^{k+1} := \arg\min_y \mathcal{L}(x^{k+1}, y; \lambda^{k+1/2})$
$\lambda^{k+1} := \lambda^{k+1/2} - (x^{k+1} - y^{k+1})/\mu$
Optimality conditions lead to (assuming $f$ and $g$ are smooth)
$\lambda^{k+1/2} = \nabla f(x^{k+1}), \qquad \lambda^{k+1} = -\nabla g(y^{k+1})$

9 Alternating Linearization Method (SUM-2)
$\min\ F(x) \equiv f(x) + g(x)$
Define
$Q_g(u, v) := f(u) + g(v) + \langle\nabla g(v), u - v\rangle + \tfrac{1}{2\mu}\|u - v\|^2$
$Q_f(u, v) := f(u) + \langle\nabla f(u), v - u\rangle + \tfrac{1}{2\mu}\|u - v\|^2 + g(v)$
Alternating Linearization Method (ALM):
$x^{k+1} := \arg\min_x Q_g(x, y^k)$
$y^{k+1} := \arg\min_y Q_f(x^{k+1}, y)$
Proposed by Kiwiel, Rosa and Ruszczynski for stochastic programming. Gauss-Seidel-like algorithm.

10 Complexity Bound of ALM
Theorem (Goldfarb, Ma and Scheinberg, 2009). Assume $\nabla f$ and $\nabla g$ are Lipschitz continuous with constants $L(f)$ and $L(g)$. For $\mu \le 1/\max\{L(f), L(g)\}$, ALM satisfies
$F(y^k) - F(x^*) \le \frac{\|x^0 - x^*\|^2}{4\mu k}$
Therefore:
- convergence in objective value
- $O(1/\epsilon)$ iterations for an $\epsilon$-optimal solution
The first complexity result for splitting and alternating direction type methods.
Question: Can we improve the complexity?

11 Optimal Gradient Methods
$\min\ f(x)$ (assuming $\nabla f$ is Lipschitz continuous)
$\epsilon$-optimal solution: $f(x) - f(x^*) \le \epsilon$
Classical gradient method, complexity $O(1/\epsilon)$:
$x^k = x^{k-1} - \tau_k \nabla f(x^{k-1})$
Nesterov's acceleration technique (1983), complexity $O(1/\sqrt{\epsilon})$:
$x^k := y^{k-1} - \tau_k \nabla f(y^{k-1})$
$y^k := x^k + \frac{k-1}{k+2}(x^k - x^{k-1})$
Optimal first-order method; the best one can get.
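
A small sketch of the accelerated scheme on a smooth quadratic, assuming a known Lipschitz constant and using the $(k-1)/(k+2)$ momentum rule above; the test problem and step size are illustrative:

```python
import numpy as np

def nesterov_gradient(grad_f, x0, step, iters=200):
    """Accelerated gradient method with the (k-1)/(k+2) momentum rule."""
    x_prev = x0.copy()
    y = x0.copy()
    for k in range(1, iters + 1):
        x = y - step * grad_f(y)                      # gradient step at the extrapolated point
        y = x + (k - 1.0) / (k + 2.0) * (x - x_prev)  # momentum/extrapolation
        x_prev = x
    return x

# example: f(x) = 0.5*||A x - b||^2, step = 1/L with L = ||A||_2^2
A = np.array([[3.0, 1.0], [0.0, 2.0]])
b = np.array([1.0, -1.0])
grad = lambda x: A.T @ (A @ x - b)
L = np.linalg.norm(A, 2) ** 2
print(nesterov_gradient(grad, np.zeros(2), 1.0 / L))  # approx. [0.5, -0.5]
```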

12 ISTA and FISTA (Beck and Teboulle, 2009)
Assume $g$ is smooth.
$\min\ F(x) \equiv f(x) + g(x)$
Fixed point algorithm (also called ISTA in compressed sensing):
$x^{k+1} := \arg\min_x Q_f(x, x^k)$
or equivalently
$x^{k+1} := \arg\min_x\ \tau f(x) + \tfrac{1}{2}\|x - (x^k - \tau\nabla g(x^k))\|^2$
Never minimizes $g$.
Iteration complexity: $O(1/\epsilon)$ for an $\epsilon$-optimal solution ($F(x^k) - F(x^*) \le \epsilon$).

13 ISTA and FISTA (Beck and Teboulle, 2009)
$\min\ F(x) \equiv f(x) + g(x)$
Fast ISTA (FISTA):
$x^k := \arg\min_x\ \tau f(x) + \tfrac{1}{2}\|x - (y^k - \tau\nabla g(y^k))\|^2$
$t_{k+1} := \left(1 + \sqrt{1 + 4t_k^2}\right)/2$
$y^{k+1} := x^k + \frac{t_k - 1}{t_{k+1}}(x^k - x^{k-1})$
Complexity $O(1/\sqrt{\epsilon})$.
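
A compact FISTA sketch for the lasso-type instance $\min_x \rho\|x\|_1 + \tfrac{1}{2}\|Ax - b\|^2$ (so $f$ is the $\ell_1$ term and $g$ is the smooth term, matching the split above); the step size choice and variable names are illustrative:

```python
import numpy as np

def fista_lasso(A, b, rho, iters=300):
    """FISTA sketch for min rho*||x||_1 + 0.5*||A x - b||^2."""
    tau = 1.0 / np.linalg.norm(A, 2) ** 2          # step <= 1/L(g)
    soft = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t, 0.0)
    x_prev = np.zeros(A.shape[1])
    y = x_prev.copy()
    t = 1.0
    for _ in range(iters):
        x = soft(y - tau * A.T @ (A @ y - b), rho * tau)    # prox-gradient step at y
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0   # momentum parameter update
        y = x + (t - 1.0) / t_next * (x - x_prev)           # extrapolation
        x_prev, t = x, t_next
    return x
```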

14 Fast Alternating Linearization Method
ALM:
$x^{k+1} := \arg\min_x Q_g(x, y^k)$
$y^{k+1} := \arg\min_y Q_f(x^{k+1}, y)$
Accelerate ALM in the same way as FISTA.
Fast Alternating Linearization Method (FALM):
$x^k := \arg\min_x Q_g(x, z^k)$
$y^k := \arg\min_y Q_f(x^k, y)$
$w^k := (x^k + y^k)/2$
$t_{k+1} := \left(1 + \sqrt{1 + 4t_k^2}\right)/2$
$z^{k+1} := w^k + \frac{1}{t_{k+1}}\left(t_k(y^k - w^{k-1}) - (w^k - w^{k-1})\right)$
The computational effort at each iteration is almost unchanged.

15 Complexity Bound for FALM
$\min\ F(x) \equiv f(x) + g(x)$
Theorem (Goldfarb, Ma and Scheinberg, 2009). Assume $\nabla f$ and $\nabla g$ are Lipschitz continuous with constants $L(f)$ and $L(g)$. For $\mu \le 1/\max\{L(f), L(g)\}$, FALM satisfies
$F(y^k) - F(x^*) \le \frac{\|x^0 - x^*\|^2}{\mu(k+1)^2}$
Therefore:
- convergence in objective value
- $O(1/\sqrt{\epsilon})$ iterations for an $\epsilon$-optimal solution
- optimal first-order method

16 ALM with skipping steps
At the $k$-th iteration of ALM-S:
$x^{k+1} := \arg\min_x \mathcal{L}_\mu(x, y^k; \lambda^k)$
If $F(x^{k+1}) > \mathcal{L}_\mu(x^{k+1}, y^k; \lambda^k)$, then $x^{k+1} := y^k$
$y^{k+1} := \arg\min_y Q_f(y, x^{k+1})$
$\lambda^{k+1} := \nabla f(x^{k+1}) - (x^{k+1} - y^{k+1})/\mu$
Note that only one function is required to be smooth.

17 Complexity result of ALM-S
Theorem (Goldfarb and Scheinberg, 2010). Assume $\nabla f$ is Lipschitz continuous. For $\mu \le 1/L(f)$, the iterates $y^k$ in ALM-S satisfy
$F(y^k) - F(x^*) \le \frac{\|x^0 - x^*\|^2}{2\mu(k + k_{ns})}, \quad \forall k,$
where $k_{ns}$ is the number of iterations among the first $k$ for which $F(x^{k+1}) \le \mathcal{L}_\mu(x^{k+1}, y^k; \lambda^k)$.
Thus $O(1/\epsilon)$ iterations are needed to obtain an $\epsilon$-optimal solution.
A similar algorithm can be designed for FALM with $O(1/\sqrt{\epsilon})$ complexity, where again only one function is required to be smooth.

18 Complexity result of ADAL
Theorem (Goldfarb and Huang, 2010). Assume both $\nabla f$ and $\nabla g$ are Lipschitz continuous. For $\mu \le 1/\max\{L(f), L(g)\}$, ADAL needs $O(1/\epsilon)$ iterations to obtain an $\epsilon$-optimal solution.
Thus far we have been unable to develop a fast $O(1/\sqrt{\epsilon})$ version of ADAL.

19 SUM-K
From P. L. Lions and B. Mercier's 1979 paper on operator splitting: the generalization from 2 to $K$ functions is possible, but it seems that proving convergence for $K \ge 3$ is difficult.
$\min\ F(x) \equiv f(x) + g(x) + h(x)$
Define
$Q_{gh}(u, v, w) := f(u) + g(v) + \langle\nabla g(v), u - v\rangle + \|u - v\|^2/2\mu + h(w) + \langle\nabla h(w), u - w\rangle + \|u - w\|^2/2\mu.$
Gauss-Seidel-like scheme:
$x^{k+1} := \arg\min Q_{gh}(x, y^k, z^k)$
$y^{k+1} := \arg\min Q_{fh}(x^{k+1}, y, z^k)$
$z^{k+1} := \arg\min Q_{fg}(x^{k+1}, y^{k+1}, z)$
However, no complexity results are known for this Gauss-Seidel-like algorithm!

20 MSA: A Jacobi-Type Algorithm
$\min\ F(x) \equiv f(x) + g(x) + h(x)$
Multiple Splitting Algorithm (MSA), a Jacobi-type algorithm whose subproblems can be solved in parallel:
$x^{k+1} := \arg\min Q_{gh}(x, w^k, w^k)$
$y^{k+1} := \arg\min Q_{fh}(w^k, y, w^k)$
$z^{k+1} := \arg\min Q_{fg}(w^k, w^k, z)$
$w^{k+1} := (x^{k+1} + y^{k+1} + z^{k+1})/3$
We have a complexity result!
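
A generic sketch of this Jacobi-type scheme for $K$ functions, assuming each $f_i$ is available through its gradient and proximal map; the reduction of each $Q$-subproblem to a single prox step (and the resulting constant $\mu/(K-1)$) is this sketch's own bookkeeping, not a formula from the slides:

```python
import numpy as np

def msa(fs, w0, mu, iters=100):
    """Jacobi-type multiple splitting sketch for min sum_i f_i(x), K = len(fs) >= 2.

    fs is a list of (grad_i, prox_i) pairs, where prox_i(v, t) solves
    argmin_u t*f_i(u) + 0.5*||u - v||^2.
    """
    K = len(fs)
    w = w0.copy()
    for _ in range(iters):
        blocks = []
        for i, (grad_i, prox_i) in enumerate(fs):
            # linearize all functions except f_i at the common point w
            s = sum(g(w) for j, (g, _) in enumerate(fs) if j != i)
            # block subproblem: f_i(u) + <s, u - w> + (K-1)/(2*mu)*||u - w||^2
            t = mu / (K - 1)
            blocks.append(prox_i(w - t * s, t))
        w = sum(blocks) / K   # the K blocks are independent, so they can run in parallel
    return w
```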

21 Complexity Bound for MSA
$\min\ F(x) \equiv f(x) + g(x) + h(x)$
Theorem (Goldfarb and Ma, 2009). Assume $\nabla f$, $\nabla g$ and $\nabla h$ are Lipschitz continuous with constants $L(f)$, $L(g)$ and $L(h)$. For $\mu \le 1/\max\{L(f), L(g), L(h)\}$, MSA satisfies
$\min\{F(x^k), F(y^k), F(z^k)\} - F(x^*) \le \frac{\|x^0 - x^*\|^2}{\mu k}.$
Therefore:
- convergence in objective value
- $O(1/\epsilon)$ iterations for an $\epsilon$-optimal solution

22 Fast Multiple Splitting Algorithm
Fast Multiple Splitting Algorithm (FaMSA):
$x^k := \arg\min Q_{gh}(x, w_x^k, w_x^k)$
$y^k := \arg\min Q_{fh}(w_y^k, y, w_y^k)$
$z^k := \arg\min Q_{fg}(w_z^k, w_z^k, z)$
$w^k := (x^k + y^k + z^k)/3$
$t_{k+1} := \left(1 + \sqrt{1 + 4t_k^2}\right)/2$
$w_x^{k+1} := w^k + \frac{1}{t_{k+1}}\left[t_k(x^k - w^k) - (w^k - w^{k-1})\right]$
$w_y^{k+1} := w^k + \frac{1}{t_{k+1}}\left[t_k(y^k - w^k) - (w^k - w^{k-1})\right]$
$w_z^{k+1} := w^k + \frac{1}{t_{k+1}}\left[t_k(z^k - w^k) - (w^k - w^{k-1})\right]$

23 Complexity Bound for FaMSA
$\min\ F(x) \equiv f(x) + g(x) + h(x)$
Theorem (Goldfarb and Ma, 2009). Assume $\nabla f$, $\nabla g$ and $\nabla h$ are Lipschitz continuous with constants $L(f)$, $L(g)$ and $L(h)$. For $\mu \le 1/\max\{L(f), L(g), L(h)\}$, FaMSA satisfies
$\min\{F(x^k), F(y^k), F(z^k)\} - F(x^*) \le \frac{4\|x^0 - x^*\|^2}{\mu(k+1)^2}$
Therefore:
- convergence in objective value
- $O(1/\sqrt{\epsilon})$ iterations for an $\epsilon$-optimal solution
- optimal first-order method

24 Comparison of ALM/FALM and MSA/FaMSA
ALM/FALM:
- Gauss-Seidel-like algorithms
- expected to be faster than MSA/FaMSA since the information from the current iteration is used
- complexity results for (SUM-2), no results for (SUM-K) when $K \ge 3$
- only one function needs to be smooth
MSA/FaMSA:
- Jacobi-like algorithms
- can be done in parallel
- complexity results for (SUM-K) for any $K \ge 2$

25 ALM and FALM for Robust PCA
Robust PCA: $f(X) = \|X\|_*$, $g(Y) = \rho\|Y\|_1$,
$\min_{X, Y \in \mathbb{R}^{m \times n}} \{\|X\|_* + \rho\|Y\|_1 : X + Y = M\}$
Subproblem w.r.t. $X$ (a matrix shrinkage operator, corresponds to an SVD):
$X^{k+1} := \arg\min_X\ f(X) + g(Y^k) + \langle\nabla g(Y^k), M - X - Y^k\rangle + \|X + Y^k - M\|_F^2/2\mu$
Subproblem w.r.t. $Y$ (a vector shrinkage operator):
$Y^{k+1} := \arg\min_Y\ f(X^{k+1}) + \langle\nabla f(X^{k+1}), M - X^{k+1} - Y\rangle + \|X^{k+1} + Y - M\|_F^2/2\mu + g(Y)$
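
The two shrinkage operators referred to above, as a minimal NumPy sketch (singular value thresholding for the nuclear-norm subproblem, entrywise soft thresholding for the $\ell_1$ subproblem); function names are illustrative:

```python
import numpy as np

def matrix_shrink(V, tau):
    # Singular value thresholding: argmin_X tau*||X||_* + 0.5*||X - V||_F^2 (one SVD per call).
    U, s, Vt = np.linalg.svd(V, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def vector_shrink(V, tau):
    # Entrywise soft thresholding: argmin_Y tau*||Y||_1 + 0.5*||Y - V||_F^2.
    return np.sign(V) * np.maximum(np.abs(V) - tau, 0.0)
```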

26 Numerical Results Surveillance video

27 Numerical Results (cont.) Surveillance video

28 Numerical Results (cont.) Shadow and specularities removal from face images

29 Numerical Results (cont.) Video denoising

30 Summary of Numerical Results
Problem          n_1   n_2   SVDs   CPU
Hall                                 :03
Escalator                            :53
Campus (color)                       :01:01
Face                                 :39
Video denoising                      :12:49

31 Conclusion and future work
Our contribution:
- Alternating linearization and multiple splitting methods
- Optimal first-order methods
- First complexity results for splitting and alternating direction methods
Current and future work:
- Extension of ALM/FALM, MSA/FaMSA to constrained problems
- Extension of MSA/FaMSA to nonsmooth problems
- Development of a fast ADAL with $O(1/\sqrt{\epsilon})$ complexity
- Line search variants
- Applications in many fields such as medical imaging, machine learning, model selection, optimal acquisition basis selection (radar), etc.

32 ADAL applied to SDP
Dual problem:
$(D)\quad \min_{y \in \mathbb{R}^m,\, S \in S^n}\ -b^\top y \quad \text{s.t.}\quad \mathcal{A}^*(y) + S = C,\ S \succeq 0,$
where $\mathcal{A}(X) := \left(\langle A^{(1)}, X\rangle, \ldots, \langle A^{(m)}, X\rangle\right)^\top$ and $\mathcal{A}^*(y) := \sum_{i=1}^m y_i A^{(i)}$.
Let $A := \left[\mathrm{vec}(A^{(1)}), \ldots, \mathrm{vec}(A^{(m)})\right]^\top \in \mathbb{R}^{m \times n^2}$.
Constraints: $\mathcal{A}(X) = b \iff A\,\mathrm{vec}(X) = b$.
Operators related to $\mathcal{A}(\cdot)$: $\mathcal{A}(\mathcal{A}^*(y)) = (AA^\top)y$, $\mathcal{A}^*(\mathcal{A}(X)) = \mathrm{mat}\left((A^\top A)\,\mathrm{vec}(X)\right)$.
Assumption: the matrix $A$ has full row rank and the Slater condition holds for the primal SDP.

33 Augmented Lagrangian method
Augmented Lagrangian function:
$\mathcal{L}_\mu(X, y, S) = -b^\top y + \langle X, \mathcal{A}^*(y) + S - C\rangle + \frac{1}{2\mu}\|\mathcal{A}^*(y) + S - C\|_F^2.$
Algorithmic framework: compute $y^{k+1}$ and $S^{k+1}$ at the $k$-th iteration by
$(DL)\quad \min_{y \in \mathbb{R}^m,\, S \in S^n} \mathcal{L}_\mu(X^k, y, S), \quad \text{s.t.}\ S \succeq 0,$
then update the primal variable $X^{k+1}$ by
$X^{k+1} := X^k + \frac{\mathcal{A}^*(y^{k+1}) + S^{k+1} - C}{\mu}.$
Pros and cons:
- Pros: rich theory, well understood, and many algorithms
- Cons: $\mathcal{L}_\mu(X^k, y, S)$ is not separable in $y$ and $S$, and the subproblem (DL) is difficult to minimize.

34 Alternating direction method
Divide the variables into different blocks according to their roles, and minimize the augmented Lagrangian function with respect to one block at a time while all other blocks are fixed.
Framework:
$y^{k+1} := \arg\min_{y \in \mathbb{R}^m} \mathcal{L}_\mu(X^k, y, S^k),$
$S^{k+1} := \arg\min_{S \in S^n} \mathcal{L}_\mu(X^k, y^{k+1}, S), \quad S \succeq 0,$
$X^{k+1} := X^k + \frac{\mathcal{A}^*(y^{k+1}) + S^{k+1} - C}{\mu}.$

35 Alternating direction method
Solving the subproblem $y^{k+1} = \arg\min_{y \in \mathbb{R}^m} \mathcal{L}_\mu(X^k, y, S^k)$.
First-order optimality condition:
$\nabla_y \mathcal{L}_\mu := \mathcal{A}(X^k) - b + \frac{1}{\mu}\mathcal{A}\left(\mathcal{A}^*(y^{k+1}) + S^k - C\right) = 0.$
Since $AA^\top$ is invertible,
$y^{k+1} := y(S^k, X^k) := -(AA^\top)^{-1}\left(\mu(\mathcal{A}(X^k) - b) + \mathcal{A}(S^k - C)\right).$
Compute $AA^\top$ prior to executing the algorithm. The operator $AA^\top$ is the identity for many SDP relaxations of combinatorial optimization problems, such as the maxcut, maximum stable set and bisection problems.

36 Alternating direction method
Solving the subproblem $S^{k+1} = \arg\min_{S \in S^n} \mathcal{L}_\mu(X^k, y^{k+1}, S),\ S \succeq 0$.
Equivalent formulation:
$\min_{S \in S^n} \|S - V^{k+1}\|_F^2, \quad S \succeq 0,$
where $V^{k+1} := V(S^k, X^k) = C - \mathcal{A}^*(y(S^k, X^k)) - \mu X^k$.
Hence, the solution is $S^{k+1} := V_+^{k+1} := Q_+ \Sigma_+ Q_+^\top$, where
$V^{k+1} = Q\Sigma Q^\top = \begin{pmatrix} Q_+ & Q_- \end{pmatrix} \begin{pmatrix} \Sigma_+ & 0 \\ 0 & \Sigma_- \end{pmatrix} \begin{pmatrix} Q_+ & Q_- \end{pmatrix}^\top.$
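
In code, the split of $V$ into $V_+$ and $V_-$ is one symmetric eigendecomposition; a minimal sketch (illustrative):

```python
import numpy as np

def psd_parts(V):
    """Split a symmetric V into V_+ (projection onto the PSD cone) and V_-.

    V = V_+ - V_- with V_+, V_- PSD; V_+ solves min_{S >= 0} ||S - V||_F^2.
    """
    w, Q = np.linalg.eigh(V)
    V_plus = (Q * np.maximum(w, 0.0)) @ Q.T   # Q diag(max(w,0)) Q^T
    return V_plus, V_plus - V
```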

37 Alternating direction method
Updating the Lagrange multiplier $X^{k+1}$.
Updating formula:
$X^{k+1} := X^k + \frac{\mathcal{A}^*(y^{k+1}) + S^{k+1} - C}{\mu}$
Equivalent formulation:
$X^{k+1} = \frac{1}{\mu}(S^{k+1} - V^{k+1}) = \frac{1}{\mu}V_-^{k+1},$
where $V_-^{k+1} := -Q_- \Sigma_- Q_-^\top$. Note that $X^{k+1}$ is also the optimal solution of
$\min_{X \in S^n} \|\mu X + V^{k+1}\|_F^2, \quad X \succeq 0.$
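
Putting the three updates together, a minimal sketch of the resulting ADAL iteration for the dual SDP, under the simplifying assumption $AA^\top = I$ (as for the maxcut-type relaxations mentioned above) so that the $y$-update needs no linear solve; representing $\mathcal{A}$ by a list of constraint matrices is an illustrative choice:

```python
import numpy as np

def adal_sdp(A_mats, b, C, mu=1.0, iters=200):
    """ADAL sketch for the dual SDP  min -b'y  s.t.  A*(y) + S = C, S PSD.

    A_mats is a list of symmetric constraint matrices A^(i); this sketch
    assumes A A^T = I (e.g. orthonormalized constraints).
    """
    n = C.shape[0]
    X = np.zeros((n, n))
    S = np.zeros((n, n))
    Aop = lambda M: np.array([np.sum(Ai * M) for Ai in A_mats])   # A(M)
    Astar = lambda y: sum(yi * Ai for yi, Ai in zip(y, A_mats))   # A*(y)
    for _ in range(iters):
        y = -(mu * (Aop(X) - b) + Aop(S - C))        # y(S, X) with AA^T = I
        V = C - Astar(y) - mu * X                    # V(S, X)
        w, Q = np.linalg.eigh(V)
        S = (Q * np.maximum(w, 0.0)) @ Q.T           # S = V_+
        X = (S - V) / mu                             # X = V_- / mu
    return X, y, S
```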

38 Convergence results
Let $P(V) := (V_+, V_-)$.
Fixed point formulation:
$y^{k+1} = y(S^k, X^k),$
$\left(S^{k+1}, \mu X^{k+1}\right) = P(V^{k+1}) = P(V(S^k, X^k)).$
Equivalence of the optimality conditions:
- $(X, y, S)$ satisfies the KKT optimality conditions $\mathcal{A}(X) = b$, $\mathcal{A}^*(y) + S = C$, $SX = 0$, $X \succeq 0$, $S \succeq 0$.
- $(X, y, S)$ satisfies $y = y(S, X)$ and $(S, \mu X) = P(V(S, X))$.
Main result: the sequence $\{(X^k, y^k, S^k)\}$ generated by our algorithm from any starting point converges to a solution $(X^*, y^*, S^*) \in \mathcal{X}^*$, where $\mathcal{X}^*$ is the set of primal and dual solution pairs.
No complexity results are currently known.

39 Convergence results
For any $V, \hat V \in S^n$,
$\|P(V) - P(\hat V)\|_F \le \|V - \hat V\|_F,$
with equality holding if and only if $V_+^\top \hat V_- = 0$ and $V_-^\top \hat V_+ = 0$.
For any $S, X, \hat S, \hat X \in S_+^n$,
$\|V(S, X) - V(\hat S, \hat X)\|_F \le \left\|\left(S - \hat S,\ \mu(X - \hat X)\right)\right\|_F,$
with equality holding if and only if $V - \hat V = (S - \hat S) - \mu(X - \hat X)$.
Let $(X^*, y^*, S^*)$ be an optimal solution. If
$\|P(V(S, X)) - P(V(S^*, X^*))\|_F = \left\|\left(S - S^*,\ \mu(X - X^*)\right)\right\|_F,$
then $(S, \mu X)$ is a fixed point, and hence $(X, y, S)$ is a primal and dual optimal solution pair.

40 An expanded problem
Primal problem:
$\min\ \langle C, X\rangle \quad \text{s.t.}\quad \mathcal{A}(X) = b,\ \mathcal{B}(X) \ge d,\ X \succeq 0,\ X \ge 0,$
where $\mathcal{B}(X) := \left(\langle B^{(1)}, X\rangle, \ldots, \langle B^{(q)}, X\rangle\right)^\top$, $B^{(i)} \in S^n$.
Dual problem:
$\min_{y \in \mathbb{R}^m,\, v \in \mathbb{R}^q,\, S \in S^n,\, Z \in S^n}\ -b^\top y - d^\top v$
$\text{s.t.}\quad \mathcal{A}^*(y) + \mathcal{B}^*(v) + S + Z = C, \quad v \ge 0,\ S \succeq 0,\ Z \ge 0.$

41 A multiple splitting scheme
Augmented Lagrangian function:
$\mathcal{L}_\mu = -b^\top y - d^\top v + \langle X, \mathcal{A}^*(y) + \mathcal{B}^*(v) + S + Z - C\rangle + \frac{1}{2\mu}\|\mathcal{A}^*(y) + \mathcal{B}^*(v) + S + Z - C\|_F^2.$
Framework:
$y^{k+1} := \arg\min_{y \in \mathbb{R}^m} \mathcal{L}_\mu(X^k, y, v^k, Z^k, S^k),$
$v^{k+1} := \arg\min_{v \in \mathbb{R}^q} \mathcal{L}_\mu(X^k, y^{k+1}, v, Z^k, S^k), \quad v \ge 0,$
$Z^{k+1} := \arg\min_{Z \in S^n} \mathcal{L}_\mu(X^k, y^{k+1}, v^{k+1}, Z, S^k), \quad Z \ge 0,$
$S^{k+1} := \arg\min_{S \in S^n} \mathcal{L}_\mu(X^k, y^{k+1}, v^{k+1}, Z^{k+1}, S), \quad S \succeq 0,$
$X^{k+1} := X^k + \frac{\mathcal{A}^*(y^{k+1}) + \mathcal{B}^*(v^{k+1}) + S^{k+1} + Z^{k+1} - C}{\mu}.$

42 Eigenvalue decomposition
Partial eigenvalue decomposition: $X$ and $S$ share the same set of eigenvectors since $XS = 0$, so we only need to compute either $V_+^k$ or $V_-^k$.
Adaptive scheme (suppose that $V^{k-1}$ is known): let $\kappa_+(V^{k-1})$ be the total number of positive eigenvalues of $V^{k-1}$; compute $V_+^k$ for $S^k$ if $\kappa_+(V^{k-1}) \le n/2$, otherwise compute $V_-^k$ for $X^k$.
Direct and iterative methods: DSYEVX in LAPACK and ARPACK.
Warm start techniques: $V_+^k$ is the optimal solution of $\min_{S \succeq 0} \|S - V^k\|_F^2$.
Nonlinear programming approach:
$R^{k+1} := \arg\min_{R \in \mathbb{R}^{n \times \kappa}} \|RR^\top - V^{k+1}\|_F^2,$
starting from $R := Q_+^k (\Sigma_+^k)^{1/2}$.

43 Adjusting the penalty parameter µ
Tune $\mu$ so that the primal and dual infeasibilities are balanced; they are proportional to $1/\mu$ and $\mu$, respectively.
We decrease (increase) $\mu$ by a factor $\gamma$, $0 < \gamma < 1$, if the primal infeasibility is less than (greater than) a multiple of the dual infeasibility for a number of consecutive iterations.
$\mu$ is required to remain within an interval $[\mu_{\min}, \mu_{\max}]$, where $0 < \mu_{\min} < \mu_{\max} < \infty$.
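
A small sketch of such an adjustment rule; the factor, ratio, patience and bounds below are illustrative placeholders, not the values used in the experiments:

```python
def update_mu(mu, pinf, dinf, state, gamma=0.5, ratio=10.0, patience=50,
              mu_min=1e-4, mu_max=1e4):
    """Shrink (grow) mu after pinf has stayed below (above) a multiple of dinf
    for `patience` consecutive iterations; state = {'low': 0, 'high': 0} is
    carried across iterations by the caller."""
    if pinf < dinf / ratio:
        state['low'] += 1
        state['high'] = 0
    elif pinf > dinf * ratio:
        state['high'] += 1
        state['low'] = 0
    else:
        state['low'] = state['high'] = 0
    if state['low'] >= patience:
        mu, state['low'] = max(gamma * mu, mu_min), 0      # decrease by factor gamma
    elif state['high'] >= patience:
        mu, state['high'] = min(mu / gamma, mu_max), 0     # increase by factor 1/gamma
    return mu
```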

44 Other practical issues
Step size $\rho$ for updating $X$:
$X^{k+1} = X^k + \rho\,\frac{\mathcal{A}^*(y^{k+1}) + S^{k+1} - C}{\mu} = (1 - \rho)X^k + \frac{\rho}{\mu}(S^{k+1} - V^{k+1}) = (1 - \rho)X^k + \rho \hat X^{k+1},$
where $\hat X^{k+1}$ denotes the full ($\rho = 1$) update.
Termination rule (optimality and feasibilities):
$\mathrm{pinf} = \frac{\|\mathcal{A}(X) - b\|_2}{1 + \|b\|_2}, \quad \mathrm{dinf} = \frac{\|C - \mathcal{A}^*(y) - S - Z\|_F}{1 + \|C\|_1}, \quad \mathrm{gap} = \frac{|b^\top y - \langle C, X\rangle|}{1 + |b^\top y| + |\langle C, X\rangle|}.$
Stop when $\delta := \max\{\mathrm{pinf}, \mathrm{dinf}, \mathrm{gap}\} \le \epsilon$.
Stagnation: with an iteration counter $\mathrm{it}_{stag}$, also stop if
$(\mathrm{it}_{stag} > h_1 \text{ and } \delta \le 10\epsilon)$ or $(\mathrm{it}_{stag} > h_2 \text{ and } \delta \le 10^2\epsilon)$ or $(\mathrm{it}_{stag} > h_3 \text{ and } \delta \le 10^3\epsilon)$,
where $1 < h_1 < h_2 < h_3$ are integers representing different levels of difficulty.
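
The three stopping measures in code, following the formulas above (taking the $\|C\|_1$ normalization entrywise is an assumption of this sketch; `Aop` and `Astar` are the operators $\mathcal{A}$ and $\mathcal{A}^*$ passed in as callables):

```python
import numpy as np

def stopping_measure(Aop, Astar, b, C, X, y, S, Z):
    # pinf, dinf and gap as defined on the slide; returns delta = max of the three.
    pinf = np.linalg.norm(Aop(X) - b) / (1.0 + np.linalg.norm(b))
    dinf = np.linalg.norm(C - Astar(y) - S - Z, 'fro') / (1.0 + np.abs(C).sum())
    pobj, dobj = np.sum(C * X), b @ y
    gap = abs(dobj - pobj) / (1.0 + abs(dobj) + abs(pobj))
    return max(pinf, dinf, gap)
```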

45 Numerical Results
Test problems: frequency assignment problems and maximal stable set problems.
Comparison with SDPNAL (Zhao, Sun and Toh): a semi-smooth Newton-CG method for minimizing the dual augmented Lagrangian function.
Codes were written in C as MEX-files callable from MATLAB (Release 7.3.0), and all experiments were performed on a Dell Precision 670 workstation with an Intel Xeon 3.4 GHz CPU and 6 GB of RAM.

46 Numerical Results: frequency assignment problems
Graph $G = (V, E)$; weight matrix $W \in S^r$.
Formulation:
$\min\ \left\langle \frac{1}{2k}\,\mathrm{diag}(We) + \frac{k-1}{2k}\,W,\ X\right\rangle$
$\text{s.t.}\quad X_{i,j} \ge -\frac{1}{k-1},\ (i,j) \in E\setminus T, \qquad X_{i,j} = -\frac{1}{k-1},\ (i,j) \in T,$
$\mathrm{diag}(X) = e, \quad X \succeq 0.$
Operator $\mathcal{A}$: the edges in $T$ and $\mathrm{diag}(X) = e$. Operator $\mathcal{B}$: the edges in $E\setminus T$.
Scaling: $X_{ij}/\sqrt{2} = -\frac{\sqrt{2}}{2(k-1)}$ and $X_{ij}/\sqrt{2} \ge -\frac{\sqrt{2}}{2(k-1)}$.
$y^{k+1} = -\left(\mu(\mathcal{A}(X^k) - b) + \mathcal{A}(S^k - C)\right),$
$v^{k+1} = \max\left(-\left(\mu(\mathcal{B}(X^k) - d) + \mathcal{B}(S^k - C)\right),\ 0\right).$

47 Numerical Results: frequency assignment problems
Table: computational results of ADAL (n, m̂, pinf, dinf, itr, gap, cpu) and SDPNAL (gap, cpu) on the fap frequency assignment instances.

48 Numerical Results: frequency assignment problems
Figure: performance profiles of two variants of ADAL (SDPAD ρ=1.6, SDPAD ρ=1), SDPNAL and mprw; panels: (a) gap, (b) cpu.

49 Numerical Results: computing θ(G) and θ_+(G)
Formulation:
$\theta(G) := \max\ \langle ee^\top, X\rangle \quad \text{s.t.}\ X_{ij} = 0,\ (i,j) \in E,\quad \langle I, X\rangle = 1,\quad X \succeq 0,$
$\theta_+(G) := \max\ \langle ee^\top, X\rangle \quad \text{s.t.}\ X_{ij} = 0,\ (i,j) \in E,\quad \langle I, X\rangle = 1,\quad X \succeq 0,\ X \ge 0.$
Scaling: $X_{ij}/\sqrt{2} = 0$ and $\frac{1}{\sqrt{n}}\langle I, X\rangle = \frac{1}{\sqrt{n}}$.

50 Numerical Results: computing θ(G)
Table: computational results of ADAL (n, m̂, pinf, dinf, itr, gap, cpu) and SDPNAL (gap, cpu) on θ(G) instances from the theta, MANN-a, sanr, c-fat, hamming, brock, keller, p-hat, G, 1dc/1et/1tc/1zc and 2dc graph collections.

51 Numerical Results: computing θ_+(G)
Table: computational results of ADAL (n, m̂, pinf, dinf, itr, gap, cpu) and SDPNAL (gap, cpu) on θ_+(G) instances from the same graph collections as for θ(G).

52 Numerical Results: computing θ(G)
Figure: performance profiles of two variants of ADAL (SDPAD ρ=1.6, SDPAD ρ=1) and SDPNAL; panels: (a) gap, (b) cpu.

53 Numerical Results: computing θ_+(G)
Figure: performance profiles of two variants of ADAL (SDPAD ρ=1.6, SDPAD ρ=1) and SDPNAL; panels: (a) gap, (b) cpu.

54 Numerical Results: BIQ problems
The binary integer quadratic programming problem:
$\min\ x^\top Q x \quad \text{s.t.}\ x \in \{0, 1\}^n,$
whose SDP relaxation is:
$\min\ \left\langle \begin{pmatrix} Q & 0 \\ 0 & 0 \end{pmatrix}, X \right\rangle$
$\text{s.t.}\quad X_{ii} - X_{n,i} = 0,\ i = 1, \ldots, n-1, \quad X_{nn} = 1, \quad X \succeq 0,\ X \ge 0.$
Scaling: $\sqrt{2/3}\,(X_{ii} - X_{n,i}) = 0$.
The best upper bound is taken from the Biq Mac library, and
$\%\mathrm{dgap} := \frac{\text{best upper bound} - \mathrm{dobj}}{\text{best upper bound}} \times 100\%.$

55 Numerical Results: BIQ problems
Figure: performance profiles of two variants of ADAL (SDPAD ρ=1.6, SDPAD ρ=1) and SDPNAL; panels: (a) %dgap, (b) cpu.
