A New Look at the Performance Analysis of First-Order Methods

1 A New Look at the Performance Analysis of First-Order Methods
Marc Teboulle, School of Mathematical Sciences, Tel Aviv University
Joint work with Yoel Drori, Google's R&D Center, Tel Aviv
Optimization without Borders, In Honor of Yuri Nesterov's 60th Birthday, February 7-12, 2016, Les Houches, France

2 The Problem
Problem: How to evaluate the performance (complexity) of an optimization method for a given class of problems?

3 The Problem
Problem: How to evaluate the performance (complexity) of an optimization method for a given class of problems?
We focus on first-order methods for smooth convex minimization.
Assumption.
(M) min{f(x) : x ∈ R^d}, f convex and f ∈ C^{1,1}_L(R^d).
(M) is solvable, i.e., the optimal set X*(f) := argmin f is nonempty.
Given any starting point x_0, there exists R > 0 such that ‖x_0 − x*‖ ≤ R, x* ∈ X*(f).

4 The Problem
Problem: How to evaluate the performance (complexity) of an optimization method for a given class of problems?
We focus on first-order methods for smooth convex minimization.
Assumption.
(M) min{f(x) : x ∈ R^d}, f convex and f ∈ C^{1,1}_L(R^d).
(M) is solvable, i.e., the optimal set X*(f) := argmin f is nonempty.
Given any starting point x_0, there exists R > 0 such that ‖x_0 − x*‖ ≤ R, x* ∈ X*(f).
We completely depart from conventional approaches...

5 Black-Box First-Order Methods
A black-box optimization method is an algorithm A which has knowledge of:
The underlying space R^d
The family of functions F to be minimized
Oracle Model of Optimization, Nemirovsky-Yudin (1983)

6 Black-Box First-Order Methods
A black-box optimization method is an algorithm A which has knowledge of:
The underlying space R^d
The family of functions F to be minimized
The function itself is not known. To gain information on the objective function f to be minimized, the algorithm A queries a subroutine which, given an input point in R^d, returns the value of f and its gradient ∇f at that point.
First-Order Method: The Algorithm A
The algorithm starts with an initial point x_0 ∈ R^d and generates a finite sequence of points {x_i : i = 1, ..., N}, where at each step the algorithm depends only on the previous steps, their function values and gradients, via some rule:
x_{i+1} = A(x_0, ..., x_i; f(x_0), ..., f(x_i); ∇f(x_0), ..., ∇f(x_i)),  i = 0, 1, ..., N−1.
Note that the algorithm has another implicit knowledge: ‖x_0 − x*‖ ≤ R.
Oracle Model of Optimization, Nemirovsky-Yudin (1983)
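To make the oracle model concrete, here is a minimal Python sketch (illustrative only, not from the talk): the method never sees f itself, only the pairs (f(x_i), ∇f(x_i)) returned by a first-order oracle; the quadratic test function and all names are assumptions made for the example.

```python
import numpy as np

# A first-order oracle for the smooth convex quadratic f(x) = 0.5*x'Ax - b'x
# (illustrative test function; the method below only sees what the oracle returns).
def make_oracle(A, b):
    def oracle(x):
        return 0.5 * x @ A @ x - b @ x, A @ x - b   # (f(x), grad f(x))
    return oracle

# A generic black-box first-order method: x_{i+1} = rule(points, values, gradients).
def run_first_order_method(oracle, x0, N, rule):
    xs, fs, gs = [np.array(x0, dtype=float)], [], []
    for i in range(N):
        fi, gi = oracle(xs[i])
        fs.append(fi)
        gs.append(gi)
        xs.append(rule(xs, fs, gs))
    return xs

# Plain gradient descent as one instance of such a rule (step 1/L).
A = np.array([[4.0, 0.0], [0.0, 1.0]])   # L = largest eigenvalue = 4
b = np.zeros(2)
L = 4.0
rule_gd = lambda xs, fs, gs: xs[-1] - (1.0 / L) * gs[-1]
iterates = run_first_order_method(make_oracle(A, b), [1.0, 1.0], N=10, rule=rule_gd)
```

The gradient method, the heavy ball method and the fast gradient method discussed below all fit this template through different choices of the rule.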

7 Performance/Complexity of an Algorithm
We measure the worst-case performance (or complexity) of an algorithm A by looking at the absolute inaccuracy
δ(f, x_N) = f(x_N) − f(x*),
where x_N is the output of the algorithm after making N calls to the oracle. The worst case is taken over all possible functions f ∈ F with starting points x_0 satisfying ‖x_0 − x*‖ ≤ R, where x* ∈ X*(f).
Problem: We look at finding the maximal absolute inaccuracy over all possible inputs to the algorithm. This leads to the following...

8 Main Observation
The worst-case performance of an optimization method is by itself an optimization problem!

9 Main Observation
The worst-case performance of an optimization method is by itself an optimization problem!
The Performance Estimation Problem (PEP)
To measure the worst-case performance of an algorithm A we need to solve the following Performance Estimation Problem (PEP):
(P) max f(x_N) − f(x*)
s.t. f ∈ F,
x_{i+1} = A(x_0, ..., x_i; f(x_0), ..., f(x_i); ∇f(x_0), ..., ∇f(x_i)), i = 0, ..., N−1,
x* ∈ X*(f), ‖x* − x_0‖ ≤ R,
x_0, ..., x_N, x* ∈ R^d.
PEP is an abstract optimization problem in infinite dimension: f ∈ F. Clearly intractable!?!..

10 A Methodology to Tackle PEP: Basic Informal Approach
A. Relax the functional constraint (f ∈ F) by new variables and constraints in R^d to build a finite-dimensional problem. This is done by:
1. Exploiting adequate properties of the class F at the points x_0, ..., x_N, x* ∈ R^d.
2. Using the rule(s) describing the given algorithm A.

11 A Methodology to Tackle PEP: Basic Informal Approach
A. Relax the functional constraint (f ∈ F) by new variables and constraints in R^d to build a finite-dimensional problem. This is done by:
1. Exploiting adequate properties of the class F at the points x_0, ..., x_N, x* ∈ R^d.
2. Using the rule(s) describing the given algorithm A.
The resulting relaxed finite-dimensional problem remains a valid upper bound on f(x_N) − f(x*). Yet, this problem remains nontrivial to tackle. So what else can be done...?

12 A Methodology to Tackle PEP: Basic Informal Approach
A. Relax the functional constraint (f ∈ F) by new variables and constraints in R^d to build a finite-dimensional problem. This is done by:
1. Exploiting adequate properties of the class F at the points x_0, ..., x_N, x* ∈ R^d.
2. Using the rule(s) describing the given algorithm A.
The resulting relaxed finite-dimensional problem remains a valid upper bound on f(x_N) − f(x*). Yet, this problem remains nontrivial to tackle. So what else can be done...?
B. More relaxations..!! For a given class of algorithms A:
1. Exploit the structure of PEP to simplify it.
2. Develop a novel relaxation technique and duality to find an upper bound to this problem.
Despite "massive" relaxations: We derive new and better complexity bounds than currently known.
In principle... this approach is universal. It can be applied to any optimization algorithm...!

13 Relaxing the functional constraint f ∈ F
We focus on first-order methods (FOM) for smooth convex problems, that is: convex f ∈ F = C^{1,1}_L.
We start with the following well-known fact for convex f in C^{1,1}_L.

14 Relaxing the functional constraint f ∈ F
We focus on first-order methods (FOM) for smooth convex problems, that is: convex f ∈ F = C^{1,1}_L.
We start with the following well-known fact for convex f in C^{1,1}_L.
Proposition
Suppose f : R^d → R is convex and has a Lipschitz continuous gradient with constant L. Then for every x, y ∈ R^d:
(1/(2L))‖∇f(x) − ∇f(y)‖² ≤ f(x) − f(y) − ⟨∇f(y), x − y⟩.  (1)

15 Relaxing the functional constraint f ∈ F
We focus on first-order methods (FOM) for smooth convex problems, that is: convex f ∈ F = C^{1,1}_L.
We start with the following well-known fact for convex f in C^{1,1}_L.
Proposition
Suppose f : R^d → R is convex and has a Lipschitz continuous gradient with constant L. Then for every x, y ∈ R^d:
(1/(2L))‖∇f(x) − ∇f(y)‖² ≤ f(x) − f(y) − ⟨∇f(y), x − y⟩.  (1)
The relaxation scheme - a sort of "discretization"
Apply (1) at the points x_0, ..., x_N and x*.
Use the resulting inequalities as "constraints" instead of the functional constraint f ∈ F.

16 Relaxing the functional constraint f ∈ F
We focus on first-order methods (FOM) for smooth convex problems, that is: convex f ∈ F = C^{1,1}_L.
We start with the following well-known fact for convex f in C^{1,1}_L.
Proposition
Suppose f : R^d → R is convex and has a Lipschitz continuous gradient with constant L. Then for every x, y ∈ R^d:
(1/(2L))‖∇f(x) − ∇f(y)‖² ≤ f(x) − f(y) − ⟨∇f(y), x − y⟩.  (1)
The relaxation scheme - a sort of "discretization"
Apply (1) at the points x_0, ..., x_N and x*.
Use the resulting inequalities as "constraints" instead of the functional constraint f ∈ F.
Define
δ_i := (f(x_i) − f(x*)) / (L‖x* − x_0‖²),  g_i := ∇f(x_i) / (L‖x* − x_0‖),  i = 0, ..., N, ∗.
In terms of δ_i, g_i, condition (1) becomes
(1/2)‖g_i − g_j‖² ≤ δ_i − δ_j − ⟨g_j, (x_i − x_j)/‖x* − x_0‖⟩,  i, j = 0, ..., N, ∗.  (2)
We now treat x*, {x_i, δ_i, g_i}_{i=0}^N as the optimization variables, instead of f ∈ C^{1,1}_L.
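As a quick sanity check (not part of the talk), inequality (1) and its normalized form (2) can be verified numerically on a convex quadratic, for which the minimizer is x* = 0; all data below are illustrative assumptions.

```python
import numpy as np

# Check inequality (1) and condition (2) on f(x) = 0.5*x'Qx (grad f(x) = Qx,
# L = largest eigenvalue of Q, minimizer x* = 0).
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
Q = M @ M.T
L = np.linalg.eigvalsh(Q).max()
f = lambda z: 0.5 * z @ Q @ z
grad = lambda z: Q @ z

x, y = rng.standard_normal(5), rng.standard_normal(5)
lhs1 = np.linalg.norm(grad(x) - grad(y))**2 / (2 * L)
rhs1 = f(x) - f(y) - grad(y) @ (x - y)
assert lhs1 <= rhs1 + 1e-12                 # inequality (1)

x0, xstar = rng.standard_normal(5), np.zeros(5)
R = np.linalg.norm(xstar - x0)
delta = lambda z: (f(z) - f(xstar)) / (L * R**2)   # normalized function values
g = lambda z: grad(z) / (L * R)                    # normalized gradients
lhs2 = 0.5 * np.linalg.norm(g(x) - g(y))**2
rhs2 = delta(x) - delta(y) - g(y) @ ((x - y) / R)
assert lhs2 <= rhs2 + 1e-12                 # condition (2)
```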

17 A (Relaxed) Finite-Dimensional PEP
Replacing the constraint on f by the constraints (2), we reach a relaxed finite-dimensional PEP in the variables x*, {x_i, δ_i, g_i}_{i=0}^N:
(P) max_{x*, x_i, g_i ∈ R^d, δ_i ∈ R} L‖x* − x_0‖² δ_N
s.t. (1/2)‖g_i − g_j‖² ≤ δ_i − δ_j − ⟨g_j, (x_i − x_j)/‖x* − x_0‖⟩, i, j = 0, ..., N, ∗,
x_{i+1} = A(x_0, ..., x_i; δ_0, ..., δ_i; g_0, ..., g_i), i = 0, ..., N−1,
‖x* − x_0‖ ≤ R.
Since (P) is a relaxation of the original maximization problem, its solution still provides a valid upper bound on the complexity of the given method A: f(x_N) − f(x*) ≤ val(P).
We will now show our main results for:
1. The gradient method.
2. A broad class of first-order methods.

18 PEP for the Gradient Method
Algorithm (GM)
0. Input: N, h, f ∈ C^{1,1}_L(R^d) convex, x_0 ∈ R^d.
1. For i = 0, ..., N−1, compute x_{i+1} = x_i − (h/L)∇f(x_i), (h > 0).

19 PEP for the Gradient Method
Algorithm (GM)
0. Input: N, h, f ∈ C^{1,1}_L(R^d) convex, x_0 ∈ R^d.
1. For i = 0, ..., N−1, compute x_{i+1} = x_i − (h/L)∇f(x_i), (h > 0).
After some transformations, PEP for (GM) becomes a nonconvex quadratic problem:
(P) max_{g_i ∈ R^d, δ_i ∈ R} LR²δ_N
s.t. (1/2)‖g_i − g_j‖² ≤ δ_i − δ_j − ⟨g_j, Σ_{t=i+1}^{j} h g_{t−1}⟩, i < j = 0, ..., N,
(1/2)‖g_i − g_j‖² ≤ δ_i − δ_j + ⟨g_j, Σ_{t=j+1}^{i} h g_{t−1}⟩, j < i = 0, ..., N,
(1/2)‖g_i‖² ≤ δ_i, i = 0, ..., N,
(1/2)‖g_i‖² ≤ −δ_i − ⟨g_i, ν + Σ_{t=1}^{i} h g_{t−1}⟩, i = 0, ..., N.
Notation: ν ∈ R^d is any unit vector; "i < j = 0, ..., N" is shorthand for i = 0, ..., N−1, j = i+1, ..., N.
As is, PEP remains impossible to tackle..!?.. We turn to the second phase: reformulation and more relaxations!

20 Analyzing PEP for the Gradient Method
The main steps (see the paper for details):
We further drop constraints... This still yields a valid upper bound!
Reformulate it as a Quadratic Matrix (QM) optimization problem:
(G') max_{G ∈ R^{(N+1)×d}, δ ∈ R^{N+1}} LR²δ_N
s.t. Tr(GᵀA_{i−1,i}G) ≤ δ_{i−1} − δ_i, i = 1, ..., N,
Tr(GᵀD_iG + νe_{i+1}ᵀG) ≤ −δ_i, i = 0, ..., N.
The matrices A_{i−1,i}, D_i ∈ S^{N+1} are explicitly given in terms of h.
({e_{i+1}}_{i=0}^N are the canonical unit vectors in R^{N+1} and ν ∈ R^d is a unit vector.)

21 Analyzing PEP for the Gradient Method
The main steps (see the paper for details):
We further drop constraints... This still yields a valid upper bound!
Reformulate it as a Quadratic Matrix (QM) optimization problem:
(G') max_{G ∈ R^{(N+1)×d}, δ ∈ R^{N+1}} LR²δ_N
s.t. Tr(GᵀA_{i−1,i}G) ≤ δ_{i−1} − δ_i, i = 1, ..., N,
Tr(GᵀD_iG + νe_{i+1}ᵀG) ≤ −δ_i, i = 0, ..., N.
The matrices A_{i−1,i}, D_i ∈ S^{N+1} are explicitly given in terms of h.
({e_{i+1}}_{i=0}^N are the canonical unit vectors in R^{N+1} and ν ∈ R^d is a unit vector.)
To find an upper bound on problem (G') we use duality. We further exploit the special structure of this (QM), and a dimension reduction result, to derive a tractable SDP dual.

22 A Dual Problem for the Quadratic Matrix Problem (G')
(DG') min_{λ ∈ R^N, t ∈ R} { (1/2)LR²t : λ ∈ Λ, S(λ, t) ⪰ 0 },
where
Λ := {λ ∈ R^N : λ_{i+1} − λ_i ≥ 0, i = 1, ..., N−1; 1 − λ_N ≥ 0; λ_i ≥ 0, i = 1, ..., N},
S(λ, t) := [ (1−h)S_0(λ) + hS_1(λ),  q ;  qᵀ,  t ] ∈ S^{N+2},
q := (λ_1, λ_2 − λ_1, ..., λ_N − λ_{N−1}, 1 − λ_N)ᵀ,
and S_0, S_1 ∈ S^{N+1} are defined as follows:
S_0(λ) is the tridiagonal matrix with diagonal (2λ_1, 2λ_2, ..., 2λ_N, λ_N) and sub/super-diagonal entries (−λ_1, −λ_2, ..., −λ_N),  (3)
and S_1(λ) is the full matrix whose entries are given explicitly in terms of λ_1, ..., λ_N and the differences λ_{i+1} − λ_i (see the paper for the exact definition).  (4)

23 A Feasible Analytical Solution of this SDP can be found! A crucial and hard part of the proof!

24 A Feasible Analytical Solution of this SDP can be found! A crucial and hard part of the proof!
Lemma
Let t = 1/(2Nh + 1) and λ_i = i/(2N + 1), i = 1, ..., N. Then:
1. the matrices S_0(λ), S_1(λ) ∈ S^{N+1} defined in (3)-(4) are positive definite for every N ∈ N;
2. the pair (λ, t) is feasible for (DG').
Equipped with this result, invoking standard duality leads to the desired complexity result for GM.

25 Complexity Bound for the Gradient Method
Theorem
Let f ∈ C^{1,1}_L(R^d) and let x_0, ..., x_N ∈ R^d be generated by (GM) with 0 < h ≤ 1. Then
f(x_N) − f(x*) ≤ LR²/(4Nh + 2).  (5)
(For comparison, the classical bound on the gradient method is f(x_N) − f(x*) ≤ LR²/(2N).)

26 Complexity Bound for the Gradient Method
Theorem
Let f ∈ C^{1,1}_L(R^d) and let x_0, ..., x_N ∈ R^d be generated by (GM) with 0 < h ≤ 1. Then
f(x_N) − f(x*) ≤ LR²/(4Nh + 2).  (5)
(For comparison, the classical bound on the gradient method is f(x_N) − f(x*) ≤ LR²/(2N).)
We further prove that this bound is tight!
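Before turning to tightness, note that the bound (5) is easy to probe numerically; a minimal sketch (random convex quadratic with minimizer x* = 0; all data are illustrative assumptions):

```python
import numpy as np

# Empirical check of f(x_N) - f(x*) <= L R^2 / (4 N h + 2) for the gradient method.
rng = np.random.default_rng(3)
M = rng.standard_normal((20, 20))
Q = M @ M.T                         # f(x) = 0.5*x'Qx, so f* = 0 at x* = 0
L = np.linalg.eigvalsh(Q).max()
f = lambda z: 0.5 * z @ Q @ z

N, h = 30, 1.0
x = rng.standard_normal(20)
R = np.linalg.norm(x)               # distance from x_0 to the minimizer
for _ in range(N):
    x = x - (h / L) * (Q @ x)       # GM step
print(f(x), L * R**2 / (4 * N * h + 2))   # the left value never exceeds the right one
```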

27 The Bound is Tight
Theorem
Let L > 0, N ∈ N and d ∈ N. Then for every h > 0 there exists a convex function ϕ ∈ C^{1,1}_L(R^d) and a point x_0 ∈ R^d such that, after N iterations, Algorithm GM reaches an approximate solution x_N with the following absolute inaccuracy:
ϕ(x_N) − ϕ* = LR²/(4Nh + 2).

28 The Bound is Tight
Theorem
Let L > 0, N ∈ N and d ∈ N. Then for every h > 0 there exists a convex function ϕ ∈ C^{1,1}_L(R^d) and a point x_0 ∈ R^d such that, after N iterations, Algorithm GM reaches an approximate solution x_N with the following absolute inaccuracy:
ϕ(x_N) − ϕ* = LR²/(4Nh + 2).
Interestingly... this ϕ is nothing else but the Moreau envelope of ‖x‖/(2Nh + 1)!
ϕ(x) = ‖x‖/(2N+1) − 1/(2(2N+1)²) if ‖x‖ ≥ 1/(2N+1), and ϕ(x) = (1/2)‖x‖² if ‖x‖ < 1/(2N+1) (displayed for h = 1 and L = 1), with x_0 = e_1.
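This worst case can be reproduced in a few lines; the sketch below takes L = R = 1 and uses the general-h form of ϕ (kink at 1/(2Nh+1)), which is an assumption consistent with the theorem above:

```python
import numpy as np

# phi is the Moreau envelope of |x|/(2Nh+1): a Huber-type function with
# slope 1/(2Nh+1) outside the interval [-1/(2Nh+1), 1/(2Nh+1)].
N, h = 10, 1.0
c = 1.0 / (2 * N * h + 1)

phi = lambda x: c * abs(x) - 0.5 * c**2 if abs(x) >= c else 0.5 * x**2
dphi = lambda x: c * np.sign(x) if abs(x) >= c else x

x = 1.0                                 # x_0 = 1, so R = |x_0 - x*| = 1, x* = 0
for _ in range(N):
    x = x - h * dphi(x)                 # GM with L = 1
print(phi(x), 1.0 / (4 * N * h + 2))    # the two values coincide (up to rounding)
```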

29 A Conjecture for GM with 0 < h < 2
We conclude this part by raising a conjecture on the worst-case performance of the gradient method with a constant step size 0 < h < 2.

30 A Conjecture for GM with 0 < h < 2
We conclude this part by raising a conjecture on the worst-case performance of the gradient method with a constant step size 0 < h < 2.
Conjecture
Suppose the sequence x_0, ..., x_N is generated by Algorithm GM with 0 < h < 2. Then
f(x_N) − f(x*) ≤ (LR²/2) max{ 1/(2Nh + 1), (1 − h)^{2N} }.
Note: when 0 < h ≤ 1 the bound above coincides with our previous bound.
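For 1 < h < 2, the second term of the conjectured maximum is already attained by the simplest quadratic; a small illustrative sketch (the values of L, R, N, h are chosen arbitrarily):

```python
# On f(x) = (L/2) x^2 each GM step multiplies x by (1 - h), so after N steps
# f(x_N) - f(x*) = (L R^2 / 2) (1 - h)^(2N), the second term in the conjecture.
L, R, N, h = 1.0, 1.0, 8, 1.5
x = R
for _ in range(N):
    x = x - (h / L) * (L * x)
print(0.5 * L * x**2, 0.5 * L * R**2 * (1 - h)**(2 * N))   # identical values
```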

31 A Wide Class of First-Order Algorithms
Consider the following class of first-order algorithms:
Algorithm (FO)
0. Input: f ∈ C^{1,1}_L(R^d), x_0 ∈ R^d.
1. For i = 0, ..., N−1, compute x_{i+1} = x_i − (1/L) Σ_{k=0}^{i} h_k^{(i+1)} ∇f(x_k).
1. We now show that the class (FO) covers some fundamental schemes beyond the gradient method.
2. For this class we establish a complexity bound that can be efficiently computed via SDP solvers.
3. Furthermore, we derive an "optimized" algorithm of this form by finding optimal step sizes h_k^{(i)}.
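A minimal Python sketch of one method from the class (FO); the table H and all names are illustrative, with H[i][k] playing the role of h_k^{(i)}:

```python
import numpy as np

# Class (FO): the i-th update combines all past gradients with weights h_k^{(i+1)}.
def fo_method(grad_f, x0, L, H, N):
    """H[i][k] holds h_k^{(i)} for 1 <= i <= N and 0 <= k <= i-1."""
    xs, gs = [np.array(x0, dtype=float)], []
    for i in range(N):
        gs.append(grad_f(xs[i]))
        step = sum(H[i + 1][k] * gs[k] for k in range(i + 1))
        xs.append(xs[i] - step / L)
    return xs

# The gradient method with h = 1 is the special case h_k^{(i)} = 1 if k = i-1, else 0.
N = 5
H = {i: {k: (1.0 if k == i - 1 else 0.0) for k in range(i)} for i in range(1, N + 1)}
xs = fo_method(lambda x: 2.0 * x, np.ones(3), L=2.0, H=H, N=N)
```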

32 Example 1: the Heavy Ball Method
Example (The heavy ball method, HBM, Polyak (1964))
0. Input: f ∈ C^{1,1}_L(R^d), x_0 ∈ R^d.
1. x_1 ← x_0 − (α/L)∇f(x_0), (α > 0).
2. For i = 1, ..., N−1 compute: x_{i+1} = x_i − (α/L)∇f(x_i) + β(x_i − x_{i−1}), (β > 0).

33 Example 1: the Heavy Ball Method
Example (The heavy ball method, HBM, Polyak (1964))
0. Input: f ∈ C^{1,1}_L(R^d), x_0 ∈ R^d.
1. x_1 ← x_0 − (α/L)∇f(x_0), (α > 0).
2. For i = 1, ..., N−1 compute: x_{i+1} = x_i − (α/L)∇f(x_i) + β(x_i − x_{i−1}), (β > 0).
By recursively eliminating the term x_i − x_{i−1} in the last step, we can rewrite step 2 as follows:
x_{i+1} = x_i − (α/L) Σ_{k=0}^{i} β^{i−k} ∇f(x_k),
hence this method clearly fits in the class (FO).
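This rewriting is easy to confirm numerically; a small sketch (diagonal quadratic and values of α, β chosen for illustration) checks that the two recursions produce identical iterates:

```python
import numpy as np

# Heavy-ball recursion vs. its (FO) form x_{i+1} = x_i - (alpha/L)*sum_k beta^(i-k)*grad f(x_k).
Q = np.diag([1.0, 2.0, 4.0])
L = 4.0
grad = lambda x: Q @ x
alpha, beta, N = 1.0, 0.5, 6
x0 = np.array([1.0, -2.0, 3.0])

xs = [x0, x0 - (alpha / L) * grad(x0)]            # heavy-ball form
for i in range(1, N):
    xs.append(xs[i] - (alpha / L) * grad(xs[i]) + beta * (xs[i] - xs[i - 1]))

ys = [x0]                                         # (FO) form with h_k^{(i+1)} = alpha*beta^(i-k)
for i in range(N):
    ys.append(ys[i] - (alpha / L) * sum(beta**(i - k) * grad(ys[k]) for k in range(i + 1)))

assert all(np.allclose(a, b) for a, b in zip(xs, ys))
```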

34 Example 2: Nesterov's Fast Gradient Method
Example (Nesterov's fast gradient method, FGM (1983))
0. Input: f ∈ C^{1,1}_L(R^d), x_0 ∈ R^d, y_1 ← x_0, t_1 ← 1.
For i = 1, ..., N compute:
1. x_i ← y_i − (1/L)∇f(y_i),
2. t_{i+1} ← (1 + √(1 + 4t_i²))/2,
3. y_{i+1} ← x_i + ((t_i − 1)/t_{i+1})(x_i − x_{i−1}).
This algorithm is as simple as the gradient method, yet achieves an optimal convergence rate of O(1/N²):
f(x_N) − f(x*) ≤ 2L‖x_0 − x*‖²/(N + 1)²,  ∀ x* ∈ X*(f)
(the lower bound for this class is 3L‖x_0 − x*‖²/(32(N + 1)²)).

35 Example 2: Nesterov's Fast Gradient Method
Example (Nesterov's fast gradient method, FGM (1983))
0. Input: f ∈ C^{1,1}_L(R^d), x_0 ∈ R^d, y_1 ← x_0, t_1 ← 1.
For i = 1, ..., N compute:
1. x_i ← y_i − (1/L)∇f(y_i),
2. t_{i+1} ← (1 + √(1 + 4t_i²))/2,
3. y_{i+1} ← x_i + ((t_i − 1)/t_{i+1})(x_i − x_{i−1}).
This algorithm is as simple as the gradient method, yet achieves an optimal convergence rate of O(1/N²):
f(x_N) − f(x*) ≤ 2L‖x_0 − x*‖²/(N + 1)²,  ∀ x* ∈ X*(f)
(the lower bound for this class is 3L‖x_0 − x*‖²/(32(N + 1)²)).
This algorithm includes 2 sequences of points (x_i, y_i). At first glance, it does not appear to belong to the class (FO)...
...It can be shown that FGM fits in the class (FO) (see the paper).
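For completeness, a minimal Python sketch of FGM exactly as written in steps 1-3 above (the quadratic instance is an illustrative assumption):

```python
import numpy as np

# Nesterov's fast gradient method (constant step 1/L), following the slide's steps 1-3.
def fgm(grad_f, x0, L, N):
    x_prev = np.array(x0, dtype=float)   # x_{i-1}, initialized with x_0
    y, t = x_prev.copy(), 1.0            # y_1 = x_0, t_1 = 1
    for _ in range(N):
        x = y - grad_f(y) / L                              # step 1
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t**2)) / 2.0   # step 2
        y = x + ((t - 1.0) / t_next) * (x - x_prev)        # step 3
        x_prev, t = x, t_next
    return x_prev                        # x_N

Q = np.diag([1.0, 10.0, 100.0])
xN = fgm(lambda x: Q @ x, np.ones(3), L=100.0, N=50)
```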

36 PEP for the Wide Class of First-Order Algorithms (FO)
Applying our approach to FO (as done for GM), we derive the following PEP:
max_{g_i ∈ R^d, δ_i ∈ R} LR²δ_N
s.t. (1/2)‖g_i − g_j‖² ≤ δ_i − δ_j − ⟨g_j, Σ_{t=i+1}^{j} Σ_{k=0}^{t−1} h_k^{(t)} g_k⟩, i < j = 0, ..., N,
(1/2)‖g_i − g_j‖² ≤ δ_i − δ_j + ⟨g_j, Σ_{t=j+1}^{i} Σ_{k=0}^{t−1} h_k^{(t)} g_k⟩, j < i = 0, ..., N,
(1/2)‖g_i‖² ≤ δ_i, i = 0, ..., N,
(1/2)‖g_i‖² ≤ −δ_i − ⟨g_i, ν + Σ_{t=1}^{i} Σ_{k=0}^{t−1} h_k^{(t)} g_k⟩, i = 0, ..., N.
In this general case an analytical solution appears unlikely...

37 PEP for the Wide Class of First-Order Algorithms (FO)
Applying our approach to FO (as done for GM), we derive the following PEP:
max_{g_i ∈ R^d, δ_i ∈ R} LR²δ_N
s.t. (1/2)‖g_i − g_j‖² ≤ δ_i − δ_j − ⟨g_j, Σ_{t=i+1}^{j} Σ_{k=0}^{t−1} h_k^{(t)} g_k⟩, i < j = 0, ..., N,
(1/2)‖g_i − g_j‖² ≤ δ_i − δ_j + ⟨g_j, Σ_{t=j+1}^{i} Σ_{k=0}^{t−1} h_k^{(t)} g_k⟩, j < i = 0, ..., N,
(1/2)‖g_i‖² ≤ δ_i, i = 0, ..., N,
(1/2)‖g_i‖² ≤ −δ_i − ⟨g_i, ν + Σ_{t=1}^{i} Σ_{k=0}^{t−1} h_k^{(t)} g_k⟩, i = 0, ..., N.
In this general case an analytical solution appears unlikely...
Nevertheless, using techniques similar to the ones used for GM, we establish a dual bound that can be efficiently computed via any SDP solver. More precisely, we obtain the following result.

38 A Bound on Algorithm FO via Convex SDP
Theorem
Fix any N, d ∈ N. Let f ∈ C^{1,1}_L(R^d) be convex and suppose that x_0, ..., x_N ∈ R^d are generated by Algorithm FO, and that (DQ') is solvable. Then
f(x_N) − f(x*) ≤ LR² B(h).
Here B(·) is the value of the convex SDP (DQ'):
(DQ') B(h) = min_{λ,τ,t} (1/2)t
s.t. [ Σ_{i=1}^{N} λ_i Ã_{i−1,i}(h) + Σ_{i=0}^{N} τ_i D̃_i(h),  τ/2 ;  τᵀ/2,  t/2 ] ⪰ 0,  (λ, τ) ∈ Λ,
Λ := {(λ, τ) ∈ R^N_+ × R^{N+1}_+ : τ_0 = λ_1, λ_i − λ_{i+1} + τ_i = 0, i = 1, ..., N−1, λ_N + τ_N = 1},
and the matrices Ã_{i−1,i}(h), D̃_i(h) are explicitly given in terms of h = (h_k^{(i)})_{0≤k<i≤N}.
Note: The bound is independent of the dimension d.

39 Numerical Examples
[Figure: performance estimate (log scale) versus N for the heavy ball method (α = 1, β = 0.5), the FGM main sequence, the FGM auxiliary sequence, and the FGM analytical bound.]
FGM analytical bound = 2LR²/(N + 1)².
HBM is not competitive versus FGM.
Conjecture 2: f(x_i), f(y_i) converge to the optimal value with the same rate of convergence.

40 Numerical Examples
[Figure: performance estimate (log scale) versus N for the heavy ball method (α = 1, β = 0.5), the FGM main sequence, the FGM auxiliary sequence, and the FGM analytical bound.]
FGM analytical bound = 2LR²/(N + 1)².
HBM is not competitive versus FGM.
Conjecture 2: f(x_i), f(y_i) converge to the optimal value with the same rate of convergence.
...just proven by Kim-Fessler (2015).

41 Finding an "Optimized" Algorithm
Given B(h), a natural question is: how to find the "best" algorithm with respect to the bound? i.e., the best step sizes. That is, find h* = argmin_h B(h), which leads to the mini-max problem:
min_{h_k^{(i)}} max_{x_i, g_i ∈ R^d, δ_i ∈ R} δ_N
s.t. (1/(2L))‖g_i − g_j‖² ≤ δ_i − δ_j − ⟨g_j, x_i − x_j⟩, i, j = 0, ..., N, ∗,
x_{i+1} = x_i − (1/L) Σ_{k=0}^{i} h_k^{(i+1)} g_k, i = 0, ..., N−1,
‖x* − x_0‖ ≤ R.
Once again, we face a challenging problem...

42 Finding an "Optimized" Algorithm
Given B(h), a natural question is: how to find the "best" algorithm with respect to the bound? i.e., the best step sizes. That is, find h* = argmin_h B(h), which leads to the mini-max problem:
min_{h_k^{(i)}} max_{x_i, g_i ∈ R^d, δ_i ∈ R} δ_N
s.t. (1/(2L))‖g_i − g_j‖² ≤ δ_i − δ_j − ⟨g_j, x_i − x_j⟩, i, j = 0, ..., N, ∗,
x_{i+1} = x_i − (1/L) Σ_{k=0}^{i} h_k^{(i+1)} g_k, i = 0, ..., N−1,
‖x* − x_0‖ ≤ R.
Once again, we face a challenging problem...
Using semidefinite relaxations, duality and linearization, a solution to this problem can be efficiently approximated.

43 An Optimized Algorithm: Solution Step I
Remove some selected constraints and eliminate x_i using the equality constraints:
min_{h_k^{(i)}} max_{x*, g_i ∈ R^d, δ_i ∈ R} δ_N
s.t. (1/(2L))‖g_{i−1} − g_i‖² ≤ δ_{i−1} − δ_i − (1/L)⟨g_i, Σ_{k=0}^{i−1} h_k^{(i)} g_k⟩, i = 1, ..., N,
(1/(2L))‖g_i‖² ≤ −δ_i − ⟨g_i, x* − x_0 + (1/L) Σ_{t=1}^{i} Σ_{k=0}^{t−1} h_k^{(t)} g_k⟩, i = 0, ..., N,
‖x* − x_0‖² ≤ R².

44 An Optimized Algorithm: Solution Step I
Remove some selected constraints and eliminate x_i using the equality constraints:
min_{h_k^{(i)}} max_{x*, g_i ∈ R^d, δ_i ∈ R} δ_N
s.t. (1/(2L))‖g_{i−1} − g_i‖² ≤ δ_{i−1} − δ_i − (1/L)⟨g_i, Σ_{k=0}^{i−1} h_k^{(i)} g_k⟩, i = 1, ..., N,
(1/(2L))‖g_i‖² ≤ −δ_i − ⟨g_i, x* − x_0 + (1/L) Σ_{t=1}^{i} Σ_{k=0}^{t−1} h_k^{(t)} g_k⟩, i = 0, ..., N,
‖x* − x_0‖² ≤ R².
Take the dual of the inner "max" problem to obtain a nonconvex (bilinear) SDP:
(BIL) min_{h,λ,τ,t} { (1/2)t : [ Σ_{i=1}^{N} λ_i Ã_i(h) + Σ_{i=0}^{N} τ_i D̃_i(h),  τ/2 ;  τᵀ/2,  t/2 ] ⪰ 0, (λ, τ) ∈ Λ },
Ã_i(h) := (1/2)(e_i − e_{i+1})(e_i − e_{i+1})ᵀ + (1/2) Σ_{k=0}^{i−1} h_k^{(i)} (e_{i+1}e_{k+1}ᵀ + e_{k+1}e_{i+1}ᵀ),
D̃_i(h) := (1/2) e_{i+1}e_{i+1}ᵀ + (1/2) Σ_{t=1}^{i} Σ_{k=0}^{t−1} h_k^{(t)} (e_{i+1}e_{k+1}ᵀ + e_{k+1}e_{i+1}ᵀ),
Λ := {(λ, τ) ∈ R^N_+ × R^{N+1}_+ : τ_0 = λ_1, λ_i − λ_{i+1} + τ_i = 0, i = 1, ..., N−1, λ_N + τ_N = 1}.

45 Optimized Algorithm: Solution Step II
Define a new variable (linearize the bilinear nonconvex SDP):
r_{i,k} = λ_i h_k^{(i)} + τ_i Σ_{t=k+1}^{i} h_k^{(t)},  i = 1, ..., N,  k = 0, ..., i−1,
to obtain a convex SDP:
(LIN) min_{r,λ,τ,t} { (1/2)t : [ S(r, λ, τ),  τ/2 ;  τᵀ/2,  t/2 ] ⪰ 0, (λ, τ) ∈ Λ },
where
S(r, λ, τ) = (1/2) Σ_{i=1}^{N} λ_i (e_i − e_{i+1})(e_i − e_{i+1})ᵀ + (1/2) Σ_{i=0}^{N} τ_i e_{i+1}e_{i+1}ᵀ + (1/2) Σ_{i=1}^{N} Σ_{k=0}^{i−1} r_{i,k} (e_{i+1}e_{k+1}ᵀ + e_{k+1}e_{i+1}ᵀ).

46 Optimized Algorithm: Solution Step II
Define a new variable (linearize the bilinear nonconvex SDP):
r_{i,k} = λ_i h_k^{(i)} + τ_i Σ_{t=k+1}^{i} h_k^{(t)},  i = 1, ..., N,  k = 0, ..., i−1,
to obtain a convex SDP:
(LIN) min_{r,λ,τ,t} { (1/2)t : [ S(r, λ, τ),  τ/2 ;  τᵀ/2,  t/2 ] ⪰ 0, (λ, τ) ∈ Λ },
where
S(r, λ, τ) = (1/2) Σ_{i=1}^{N} λ_i (e_i − e_{i+1})(e_i − e_{i+1})ᵀ + (1/2) Σ_{i=0}^{N} τ_i e_{i+1}e_{i+1}ᵀ + (1/2) Σ_{i=1}^{N} Σ_{k=0}^{i−1} r_{i,k} (e_{i+1}e_{k+1}ᵀ + e_{k+1}e_{i+1}ᵀ).
Theorem (Use the solution of (LIN) to solve (BIL) and get the optimal h.)
Suppose (r*, λ*, τ*, t*) is an optimal solution of (LIN). Then (h*, λ*, τ*, t*) is an optimal solution of (BIL), where h* = (h_k^{*(i)})_{0≤k<i≤N} is defined by the following recursive rule:
h_k^{*(i)} = (r*_{i,k} − τ*_i Σ_{t=k+1}^{i−1} h_k^{*(t)}) / (λ*_i + τ*_i) if λ*_i + τ*_i > 0, and 0 otherwise,  for i = 1, ..., N, k = 0, ..., i−1.
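Since S(r, λ, τ) and Λ are fully specified above, the SDP (LIN) can be set up directly; below is a hedged Python/cvxpy sketch (my own reconstruction, not the authors' code; any SDP-capable solver can be used), together with the recovery of the step sizes h via the theorem's recursion.

```python
import numpy as np
import cvxpy as cp

N = 5
E = np.eye(N + 1)                         # rows E[0..N] play the role of e_1..e_{N+1}

lam = cp.Variable(N, nonneg=True)         # lambda_1..lambda_N
tau = cp.Variable(N + 1, nonneg=True)     # tau_0..tau_N
t = cp.Variable()
r = {(i, k): cp.Variable() for i in range(1, N + 1) for k in range(i)}

# Build S(r, lambda, tau) exactly as defined on the slide.
S = 0
for i in range(1, N + 1):
    v = E[i - 1] - E[i]
    S = S + 0.5 * lam[i - 1] * np.outer(v, v)
for i in range(N + 1):
    S = S + 0.5 * tau[i] * np.outer(E[i], E[i])
for i in range(1, N + 1):
    for k in range(i):
        S = S + 0.5 * r[i, k] * (np.outer(E[i], E[k]) + np.outer(E[k], E[i]))

# LMI [S, tau/2; tau'/2, t/2] >= 0 plus the linear constraints defining Lambda.
M = cp.bmat([[S, cp.reshape(tau, (N + 1, 1)) / 2],
             [cp.reshape(tau, (1, N + 1)) / 2, cp.reshape(t, (1, 1)) / 2]])
cons = [M >> 0, tau[0] == lam[0], lam[N - 1] + tau[N] == 1]
cons += [lam[i - 1] - lam[i] + tau[i] == 0 for i in range(1, N)]
cp.Problem(cp.Minimize(0.5 * t), cons).solve()

# Recover h_k^{(i)} from (r, lambda, tau) via the recursive rule of the theorem.
h = {}
for i in range(1, N + 1):
    for k in range(i):
        prev = sum(h[(s, k)] for s in range(k + 1, i))
        den = lam.value[i - 1] + tau.value[i]
        h[(i, k)] = (r[i, k].value - tau.value[i] * prev) / den if den > 1e-12 else 0.0
print("bound factor multiplying L*R^2:", 0.5 * t.value)
```

Feeding the recovered table h into the class-(FO) update then gives the optimized method whose bound is (1/2)t*·LR², as in the N = 5 example below.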

47 An Optimized Algorithm: Numerical Results
[Figure: performance estimate (log scale) versus N for the heavy ball method (α = 1, β = 0.5), the FGM main sequence, the FGM auxiliary sequence, the FGM analytical bound, and the optimized algorithm.]
The bound on the new algorithm is two times better than the bound on Nesterov's FGM!

48 An Optimized Algorithm: Example with N = 5
Example
A first-order algorithm with optimal step sizes for N = 5 takes the form
x_{i+1} = x_i − (1/L) Σ_{k=0}^{i} h_k^{(i+1)} ∇f(x_k),  i = 0, ..., 4,
with numerically optimized coefficients h_k^{(i)}. We then get f(x_5) − f(x*) ≤ 0.09 LR² for any x* ∈ X*(f).

49 Concluding Remarks and Extensions
The PEP framework offers a new approach to derive complexity bounds.
Finding a bound for the PEP problem is challenging!
Numerical bounds require solving an SDP whose dimension depends on N.

50 Concluding Remarks and Extensions
The PEP framework offers a new approach to derive complexity bounds.
Finding a bound for the PEP problem is challenging!
Numerical bounds require solving an SDP whose dimension depends on N.
Two very recent works building on PEP:
Kim-Fessler (MP-2015) confirmed our Conjecture 2. They also derived an efficient "optimized" algorithm, with an analytical bound for the auxiliary sequence y_k.
Taylor et al. (2015): tightness of our relaxations and numerical bounds + the smooth strongly convex case.

51 Concluding Remarks and Extensions
The PEP framework offers a new approach to derive complexity bounds.
Finding a bound for the PEP problem is challenging!
Numerical bounds require solving an SDP whose dimension depends on N.
Two very recent works building on PEP:
Kim-Fessler (MP-2015) confirmed our Conjecture 2. They also derived an efficient "optimized" algorithm, with an analytical bound for the auxiliary sequence y_k.
Taylor et al. (2015): tightness of our relaxations and numerical bounds + the smooth strongly convex case.
Extensions:
Analyze other algorithms (e.g., with constraints: done for the projected gradient) and different classes F of input functions/optimization models...
PEP is also useful as a constructive approach to design new algorithms...
In our recent work on nonsmooth problems we derive an optimal Kelley-like cutting plane method. [To appear in Math. Prog.]

52 HAPPY BIRTHDAY YURI!
