Convex Optimization on Large-Scale Domains Given by Linear Minimization Oracles

1 Convex Optimization on Large-Scale Domains Given by Linear Minimization Oracles
Arkadi Nemirovski, H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology
Joint research with Anatoli Juditsky, Université J. Fourier, Grenoble
London Optimization Workshop, King's College, London, June 9-10, 2014

2 Overview
• Problems of interest and motivation
• Linear Minimization Oracles and the classical Conditional Gradient Algorithm
• Fenchel-type representations and LMO-based Convex Optimization:
  - nonsmooth convex minimization
  - variational inequalities with monotone operators and convex-concave saddle points

3 Motivation
Problem of Interest: Variational Inequality with monotone operator
Find $x_* \in X$: $\langle \Phi(x), x - x_* \rangle \ge 0$ for all $x \in X$   [VI($\Phi, X$)]
• $X$: convex compact subset of a Euclidean space $E$
• $\Phi : X \to E$ is monotone: $\langle \Phi(x) - \Phi(x'), x - x' \rangle \ge 0$ for all $x, x' \in X$
Examples:
• Convex Minimization: $\Phi(x) \in \partial f(x)$, $x \in X$, for a convex Lipschitz continuous function $f : X \to \mathbb{R}$. Solutions to VI($\Phi, X$) are exactly the minimizers of $f$ on $X$.
• Convex-Concave Saddle Points: $X = U \times V$, $\Phi(u, v) = [\Phi_u(u, v) \in \partial_u f(u, v);\ \Phi_v(u, v) \in \partial_v[-f(u, v)]]$ for a convex-concave Lipschitz continuous $f(u, v) : X \to \mathbb{R}$. Solutions to VI($\Phi, X$) are exactly the saddle points of $f$ on $U \times V$.
When problem sizes make Interior Point algorithms prohibitively time consuming, First Order Methods (FOMs) become the methods of choice. Reasons: under favorable circumstances, FOMs (a) have cheap steps and (b) exhibit nearly dimension-independent sublinear convergence rates.
Note: Perhaps one could survive without (b), but (a) is a must!

4 Proximal FOMs
Find $x_* \in X$: $\langle \Phi(x), x - x_* \rangle \ge 0$ for all $x \in X$   [VI($\Phi, X$)]
• $X$: convex compact subset of a Euclidean space $E$
• $\Phi : X \to E$ is monotone: $\langle \Phi(x) - \Phi(x'), x - x' \rangle \ge 0$ for all $x, x' \in X$
Fact: Most FOMs for large-scale convex optimization (Subgradient Descent, Mirror Descent, Nesterov's Fast Gradient Methods, ...) are proximal algorithms. To allow for proximal methods with cheap iterations, $X$ should admit a cheap proximal setup: a $C^1$ strongly convex distance generating function (d.g.f.) $\omega(\cdot) : X \to \mathbb{R}$ leading to easy-to-compute Bregman projections
$e \mapsto \operatorname*{argmin}_{x \in X} \big[\omega(x) + \langle e, x \rangle\big]$
Note: If $X$ admits a cheap proximal setup, then $X$ admits a cheap Linear Minimization Oracle capable of minimizing linear forms over $X$.
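As a concrete illustration of a cheap proximal setup (our own minimal sketch, not part of the slides), take the entropy d.g.f. $\omega(x) = \sum_i x_i \ln x_i$ on the probability simplex; the Bregman projection above then has a closed-form softmax solution. All names below are illustrative.

```python
import numpy as np

def entropy_bregman_projection(e):
    """Bregman projection argmin_{x in simplex} [omega(x) + <e, x>]
    for the entropy d.g.f. omega(x) = sum_i x_i ln x_i.
    The minimizer has the closed form x_i proportional to exp(-e_i)."""
    z = -e - np.max(-e)        # shift for numerical stability
    w = np.exp(z)
    return w / w.sum()

# toy usage: project a linear form onto the simplex
e = np.array([0.3, -1.2, 0.5])
x = entropy_bregman_projection(e)
print(x, x.sum())              # a point of the simplex, entries sum to 1
```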

5 Proximal FOMs: bottlenecks
Find $x_* \in X$: $\langle \Phi(x), x - x_* \rangle \ge 0$ for all $x \in X$   [VI($\Phi, X$)]
• $X$: convex compact subset of a Euclidean space $E$
• $\Phi : X \to E$ is monotone: $\langle \Phi(x) - \Phi(x'), x - x' \rangle \ge 0$ for all $x, x' \in X$
In several important cases, $X$ does not admit a cheap proximal setup, but does allow for a cheap LMO:
Example 1: $X \subset \mathbb{R}^{m \times n}$ is the nuclear norm ball, or the spectahedron (the set of symmetric psd $m \times m$ matrices with unit trace). Here a Bregman projection requires the full singular value decomposition of an $m \times n$ matrix (resp., the full eigenvalue decomposition of a symmetric $m \times m$ matrix). The LMO is much cheaper: it reduces to computing (e.g., by the Power method) the leading pair of singular vectors (resp., the leading eigenvector) of a matrix.
Example 2: $X$ is the Total Variation ball in the space of $m \times n$ zero mean images. Here already the simplest Bregman projection reduces to the highly computationally demanding metric projection onto the TV ball. The LMO is much cheaper: it reduces to solving a max flow problem on a simple $mn$-node network with $2mn$ arcs.
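A minimal sketch of the LMO from Example 1 for the nuclear norm ball (our illustration, not from the slides): minimizing the linear form $\langle \eta, x \rangle = \mathrm{Tr}(\eta^T x)$ over $\{\|x\|_{\mathrm{nuc}} \le r\}$ only needs the leading singular triplet of $\eta$, here obtained with SciPy's iterative `svds`; function names are ours.

```python
import numpy as np
from scipy.sparse.linalg import svds

def lmo_nuclear_ball(eta, radius=1.0):
    """LMO for the nuclear-norm ball: argmin_{||x||_nuc <= radius} Tr(eta^T x).
    The minimizer is -radius * u1 v1^T, with (u1, v1) the leading singular
    vectors of eta, and the optimal value is -radius * ||eta||_2,2."""
    u, s, vt = svds(eta, k=1)                 # leading singular triplet only
    return -radius * np.outer(u[:, 0], vt[0, :])

# toy usage
eta = np.random.randn(200, 300)
x = lmo_nuclear_ball(eta)
print(np.trace(eta.T @ x))                    # approx. -||eta||_2,2
```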

6 Illustration: LMO vs. Bregman projection
• Computing the leading pair of singular vectors of a large matrix takes 64.4 sec, by a factor of 7.5 cheaper than computing the full singular value decomposition.
• Computing the leading eigenvector of a large symmetric matrix takes 10.9 sec, by a factor of 13.0 cheaper than computing the full eigenvalue decomposition.
• Minimizing a linear form over the TV ball in the space of images takes 55.6 sec, by a factor of 20.6 cheaper than computing the metric projection onto the ball.
Platform: desktop CPU, 16.0 GB RAM, 64-bit Windows 7.
Our goal: Solving large-scale problems with convex structure (convex minimization, convex-concave saddle points, variational inequalities with monotone operators) on LMO-represented domains.

7 Beyond Proximal FOMs: Conditional Gradient
Seemingly the only standard technique for handling LMO-represented domains is the Conditional Gradient Algorithm [Frank & Wolfe '58], solving smooth convex minimization problems
$\mathrm{Opt} = \min_{x \in X} f(x)$   (P)
CGA is the recurrence [$x_1 \in X$]
$x_t \mapsto x_t^+ \in \operatorname*{Argmin}_{x \in X} \langle f'(x_t), x \rangle \ \mapsto\ x_{t+1}:\ f(x_{t+1}) \le f\big(x_t + \tfrac{2}{t+1}[x_t^+ - x_t]\big)\ \&\ x_{t+1} \in X, \quad t = 1, 2, \ldots$
Theorem [well known]: Let $f(\cdot)$ be convex and $(\kappa, L)$-smooth for some $\kappa \in (1, 2]$:
$\forall x, y \in X:\ f(y) \le f(x) + \langle f'(x), y - x \rangle + \tfrac{L}{\kappa}\|x - y\|_X^\kappa$
[$\|\cdot\|_X$: norm on $\mathrm{Lin}(X)$ with the unit ball $\tfrac{1}{2}[X - X]$]
Then
$f(x_t) - \mathrm{Opt} \le \dfrac{2^{2\kappa}}{\kappa(3 - \kappa)} \cdot \dfrac{L}{(t+1)^{\kappa - 1}}, \quad t = 2, 3, \ldots$
CGA was extended recently [Harchaoui, Juditsky, Nem. '13] to norm-regularized problems like
$\min_{x \in K} \big[f(x) + \|x\|\big]$
$K$: cone with LMO-represented $K \cap \{x : \|x\| \le 1\}$; $f$: convex and smooth.
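A minimal runnable sketch of the CGA recurrence above, using the simplest admissible update $x_{t+1} = x_t + \tfrac{2}{t+1}(x_t^+ - x_t)$ (our illustration; oracle and function names are ours, and the toy problem is only for demonstration).

```python
import numpy as np

def conditional_gradient(grad, lmo, x0, n_iters=200):
    """Conditional Gradient (Frank-Wolfe): x_t^+ = LMO(f'(x_t)),
    x_{t+1} = x_t + (2/(t+1)) * (x_t^+ - x_t)."""
    x = x0
    for t in range(1, n_iters + 1):
        x_plus = lmo(grad(x))              # one LMO call per iteration
        x = x + (2.0 / (t + 1)) * (x_plus - x)
    return x

# toy usage: minimize 0.5*||A x - b||^2 over the standard simplex
rng = np.random.default_rng(0)
A, b = rng.standard_normal((30, 10)), rng.standard_normal(30)
grad = lambda x: A.T @ (A @ x - b)
lmo_simplex = lambda g: np.eye(len(g))[np.argmin(g)]   # best vertex for <g, .>
x = conditional_gradient(grad, lmo_simplex, np.ones(10) / 10)
print(0.5 * np.linalg.norm(A @ x - b) ** 2)
```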

8 Fenchel-Type Representations of Functions
Question: How to carry out nonsmooth convex minimization and how to solve other smooth/nonsmooth problems with convex structure on LMO-represented domains?
Proposed answer: Use Fenchel-type representations.
• A Fenchel representation (F.r.) of a function $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ is
$f(x) = \sup_y \big[\langle x, y \rangle - f^*(y)\big]$, $f^*$: proper convex lower semicontinuous.
• A Fenchel-type representation (F-t.r.) of $f$ is
$f(x) = \sup_y \big[\langle x, Ay + a \rangle - \phi(y)\big]$, $\phi$: proper convex lower semicontinuous.
• Good F-t.r.: $Y := \mathrm{Dom}\,\phi$ is compact and $\phi \in \mathrm{Lip}(Y)$.
The F.r. of a proper convex lower semicontinuous $f$ exists in nature and is unique, but usually is not available numerically. In contrast, F-t.r.'s admit a fully algorithmic calculus: all basic convexity-preserving operations, as applied to operands given by F-t.r.'s, yield an explicit F-t.r. of the result.
Typical well-structured convex functions admit explicit good F-t.r.'s (even with affine $\phi$'s).
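For concreteness, here is a simple worked instance of a good F-t.r. with affine (in fact zero) $\phi$; it is our own illustration rather than an example from the slides.

```latex
% The l_1 norm on R^n admits the good Fenchel-type representation
\[
  \|x\|_1 \;=\; \max_{y \in Y}\,\langle x, y\rangle,
  \qquad Y=\{y\in\mathbb{R}^n:\ \|y\|_\infty\le 1\},\quad \varphi\equiv 0,
\]
% with Y compact and \varphi trivially Lipschitz on Y.  Likewise,
% \max_i x_i = \max_{y\in\Delta_n}\langle x,y\rangle with \Delta_n the
% probability simplex: again a good F-t.r. with A = Id, a = 0, \varphi = 0.
```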

9 Example
The F.r. of $f_1 + f_2$ is given by the computationally demanding inf-convolution:
$(f_1 + f_2)^*(y) = \inf_{y_1 + y_2 = y} \big[f_1^*(y_1) + f_2^*(y_2)\big]$
In contrast, an F-t.r. of $f_1 + f_2$ is readily given by F-t.r.'s of $f_1$, $f_2$:
$f_i(x) = \sup_{y_i \in Y_i} \big[\langle x, A_i y_i + a_i \rangle - \phi_i(y_i)\big]$, $i = 1, 2$
$\Rightarrow\ f_1(x) + f_2(x) = \sup_{y = [y_1; y_2] \in Y := Y_1 \times Y_2} \Big[\big\langle x, \underbrace{A_1 y_1 + a_1 + A_2 y_2 + a_2}_{Ay + a}\big\rangle - \underbrace{[\phi_1(y_1) + \phi_2(y_2)]}_{\phi(y)}\Big]$

10 Nonsmooth Convex Minimization via Fenchel-Type Representation
When solving the convex minimization problem
$\mathrm{Opt}(P) = \min_{x \in X} f(x)$   (P)
a good F-t.r. of the objective, $f(x) = \max_{y \in Y = \mathrm{Dom}\,\phi} \big[\langle x, Ay + a \rangle - \phi(y)\big]$, gives rise to the dual problem
$[-\mathrm{Opt}(P) =]\ \mathrm{Opt}(D) = \min_{y \in Y} \Big[F(y) := \phi(y) - \min_{x \in X} \langle x, Ay + a \rangle\Big]$   (D)
Observation: The LMO for $X$ combines with a First Order oracle for $\phi$ to induce a First Order oracle for $F$.
⇒ When a First Order oracle for $\phi$ and an LMO for $X$ are available, (D) is well suited for solving by FOMs (e.g., by proximal methods, provided $Y$ admits a cheap proximal setup).
Strategy: Solve (D) and then recover a solution to (P).
Question: How to recover a good solution to the problem of interest (P) from the information acquired when solving (D)?
Proposed answer: Use accuracy certificates.
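A minimal sketch (ours, with illustrative names) of the induced First Order oracle for the dual objective $F$: one LMO call for $X$ plus the oracle for $\phi$ delivers $F(y)$, $F'(y)$, and the point $x(y)$ that the accuracy certificates of the next slides will average. Here `lmo_X`, `phi`, `phi_grad` are assumed user-supplied callables, and everything is written for the vector case.

```python
import numpy as np

def dual_oracle(y, A, a, phi, phi_grad, lmo_X):
    """First Order oracle for F(y) = phi(y) - min_{x in X} <x, Ay + a>:
       F(y)  = phi(y) - <x(y), Ay + a>
       F'(y) = phi'(y) - A^T x(y),   x(y) in Argmin_{x in X} <x, Ay + a>."""
    c = A @ y + a
    x_y = lmo_X(c)                      # single LMO call for X
    F_val = phi(y) - x_y @ c
    F_grad = phi_grad(y) - A.T @ x_y
    return F_val, F_grad, x_y           # x_y is kept: certificates average it
```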

11 Accuracy Certificates
Assume we are applying an N-step FOM to a convex problem
$\mathrm{Opt} = \min_{y \in Y} F(y)$,   (P)
and have generated search points $y_t \in Y$ augmented with first order information $(F(y_t), F'(y_t))$, $1 \le t \le N$.
An accuracy certificate for the execution protocol $I_N = \{y_t, F(y_t), F'(y_t)\}_{t=1}^N$ is a collection $\lambda^N = \{\lambda^N_t\}_{t=1}^N$ of $N$ nonnegative weights summing up to 1.
An accuracy certificate $\lambda^N$ and an execution protocol $I_N$ give rise to
• the Resolution $\mathrm{Res}(I_N, \lambda^N) = \max_{y \in Y} \sum_{t=1}^N \lambda^N_t \langle F'(y_t), y_t - y \rangle$
• the Gap $\mathrm{Gap}(I_N, \lambda^N) = \min_{t \le N} F(y_t) - \big[\sum_{t=1}^N \lambda^N_t F(y_t) - \mathrm{Res}(I_N, \lambda^N)\big] \le \mathrm{Res}(I_N, \lambda^N)$
Simple Theorem I [Nem., Onn, Rothblum '10]: Let $y^N$ be the best (with the smallest value of $F$) of the search points $y_1, \ldots, y_N$, and let $\widehat{y}^N = \sum_{t=1}^N \lambda^N_t y_t$. Then $y^N$, $\widehat{y}^N$ are feasible solutions to (P) satisfying
$F(\widehat{y}^N) - \mathrm{Opt} \le \mathrm{Res}(I_N, \lambda^N), \quad F(y^N) - \mathrm{Opt} \le \mathrm{Gap}(I_N, \lambda^N) \le \mathrm{Res}(I_N, \lambda^N)$
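A minimal sketch (ours) of how Resolution and Gap are evaluated from an execution protocol and a certificate; the maximization over $Y$ in Res is itself a single linear minimization, so an LMO for $Y$ (assumed supplied as `lmo_Y`) suffices.

```python
import numpy as np

def resolution_and_gap(ys, F_vals, F_grads, lam, lmo_Y):
    """Res = max_{y in Y} sum_t lam_t <F'(y_t), y_t - y>
       Gap = min_t F(y_t) - [sum_t lam_t F(y_t) - Res]   (Gap <= Res).
    lmo_Y(c) must return argmin_{y in Y} <c, y>."""
    g = sum(l * gr for l, gr in zip(lam, F_grads))                  # aggregated gradient
    const = sum(l * (gr @ y) for l, y, gr in zip(lam, ys, F_grads))
    res = const - g @ lmo_Y(g)                                      # max over Y = one LMO call
    gap = min(F_vals) - (float(np.dot(lam, F_vals)) - res)
    return res, gap
```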

12 Accuracy Certificates (continued)
$\mathrm{Opt}(P) = \min_{x \in X} \big[f(x) := \max_{y \in Y} [\langle x, Ay + a \rangle - \phi(y)]\big]$   (P)
Let $I_N = \{y_t \in Y, F(y_t), F'(y_t)\}_{t=1}^N$ be the N-step execution protocol built by an FOM as applied to
$\mathrm{Opt}(D) = \min_{y \in Y} \big\{F(y) := \phi(y) - \min_{x \in X} \langle x, Ay + a \rangle\big\}$   (D)
and let $x_t \in \operatorname*{Argmin}_{x \in X} \langle x, Ay_t + a \rangle$ be the LMO's answers obtained when mimicking the First Order oracle for $F$:
$F(y_t) = \phi(y_t) - \langle x_t, Ay_t + a \rangle \ \ \& \ \ F'(y_t) = \phi'(y_t) - A^T x_t$
Simple Theorem II [Cox, Juditsky, Nem. '13]: Let $\lambda^N$ be an accuracy certificate for $I_N$ and $\widehat{x}^N = \sum_{t=1}^N \lambda^N_t x_t$. Then $\widehat{x}^N$ is feasible for (P) and
$f(\widehat{x}^N) - \mathrm{Opt}(P) \le \mathrm{Res}(I_N, \lambda^N)$
(in fact, the right hand side can be replaced with $\mathrm{Gap}(I_N, \lambda^N)$).

13 LMO-Based Nonsmooth Convex Minimization (continued)
$\mathrm{Opt}(P) = \min_{x \in X} \big\{f(x) = \max_{y \in Y} [\langle x, Ay + a \rangle - \phi(y)]\big\}$   (P)
$[-\mathrm{Opt}(P) =]\ \mathrm{Opt}(D) = \min_{y \in Y} \big\{F(y) = \phi(y) - \min_{x \in X} \langle x, Ay + a \rangle\big\}$   (D)
Conclusion: Mimicking the First Order oracle for (D) via the LMO for $X$ and solving (D) by an FOM producing accuracy certificates, after $N = 1, 2, \ldots$ iterations we have at our disposal feasible solutions $\widehat{x}^N$ to the problem of interest (P) such that $f(\widehat{x}^N) - \mathrm{Opt}(P) \le \mathrm{Gap}(I_N, \lambda^N)$.
Fact: A wide spectrum of FOMs allow for augmenting execution protocols by good accuracy certificates, meaning that $\mathrm{Res}(I_N, \lambda^N)$ (and thus $\mathrm{Gap}(I_N, \lambda^N)$) obeys the standard efficiency estimates of the algorithms in question.
• For some FOMs (Subgradient/Mirror Descent, Nesterov's Fast Gradient Method for smooth convex minimization, and full memory Mirror Descent Bundle Level algorithms), good certificates are readily available.
• Several FOMs (polynomial time Cutting Plane algorithms, like the Ellipsoid and Inscribed Ellipsoid methods, and truncated memory Mirror Descent Bundle Level algorithms) can be modified in order to produce good certificates. The required modifications are costless: the complexity of an iteration remains basically intact.

14 LMO-Based Nonsmooth Convex Minimization (continued)
$\mathrm{Opt}(P) = \min_{x \in X} \big\{f(x) = \max_{y \in Y} [\langle x, Ay + a \rangle - \phi(y)]\big\}$   (P)
$[-\mathrm{Opt}(P) =]\ \mathrm{Opt}(D) = \min_{y \in Y} \big\{F(y) = \phi(y) - \min_{x \in X} \langle x, Ay + a \rangle\big\}$   (D)
• Let $Y$ be equipped with a cheap proximal setup ⇒ (P) can be solved by applying to (D) a proximal algorithm with good accuracy certificates (e.g., various versions of Mirror Descent) and recovering from the certificates approximate solutions to (P). With this approach, an iteration requires a single call to the LMO for $X$ and a single computation of a Bregman projection $\xi \mapsto \operatorname*{argmin}_{y \in Y} [\langle \xi, y \rangle + \omega(y)]$.
• An alternative is to use the F-t.r. of $f$ and the proximal setup for $Y$ to approximate $f$ by
$f_\delta(x) = \max_{y \in Y} \big\{\langle x, Ay + a \rangle - \phi(y) - \delta\omega(y)\big\}$
and to minimize the $C^{1,1}$ function $f_\delta(\cdot)$ over $X$ by Conditional Gradient.
Note: The alternative is just Nesterov's smoothing, with the smooth minimization carried out by the LMO-based Conditional Gradient rather than by proximal Fast Gradient methods.
Fact: When $\phi$ is affine (quite typical!), both approaches result in methods with the same iteration complexity and the same $O(1/\sqrt{t})$ efficiency estimate.

15 LMO-Based Nonsmooth Convex Minimization: How It Works
Test problems: Matrix Completion with uniform fit
$\mathrm{Opt} = \min_{x \in \mathbb{R}^{p \times p}:\ \|x\|_{\mathrm{nuc}} \le 1} \big\{f(x) := \max_{(i,j) \in \Omega} |x_{ij} - a_{ij}|\big\} = \min_{x \in \mathbb{R}^{p \times p}:\ \|x\|_{\mathrm{nuc}} \le 1} \big\{\max_{y \in Y} \sum_{(i,j) \in \Omega} y_{ij}(x_{ij} - a_{ij})\big\}$
$Y = \{y = \{y_{ij} : (i,j) \in \Omega\} : \sum_{(i,j) \in \Omega} |y_{ij}| \le 1\}$
$\Omega$: N-element collection of cells in a $p \times p$ matrix.
Results, I: Restricted Memory Bundle-Level algorithm on a low-size ($p = 512$, $N = 512$) Matrix Completion instance: [table reporting the progress ratio Gap$_1$/Gap$_t$ as a function of the memory depth]
Results, II: Subgradient Descent on Matrix Completion: [table reporting Gap$_1$, Gap$_1$/Gap$_{32}$, Gap$_1$/Gap$_{128}$, Gap$_1$/Gap$_{1024}$ and CPU time (sec) for several sizes $p$, $N$]
Platform: desktop PC with Intel Core2 CPU and 16 GB RAM, Windows 7-64 OS.

16 From Nonsmooth LMO-Based Convex Minimization to Variational Inequalities and Saddle Points
Motivating Example: Consider the Matrix Completion problem
$\mathrm{Opt} = \min_{u:\ \|u\|_{\mathrm{nuc}} \le 1} \big[f(u) := \|\mathcal{A}u - b\|_{2,2}\big]$
• $u \mapsto \mathcal{A}u$: $\mathbb{R}^{n \times n} \to \mathbb{R}^{m \times m}$, e.g., $\mathcal{A}u = \sum_{i=1}^k l_i u r_i^T$
• $\|\cdot\|_{2,2}$: spectral norm (largest singular value) of a matrix
A Fenchel-type representation of $f$ is immediate: $f(u) = \max_{\|v\|_{\mathrm{nuc}} \le 1} \langle v, \mathcal{A}u - b \rangle$
⇒ the problem of interest reduces to the bilinear saddle point problem
$\min_{u \in U} \max_{v \in V} \langle v, \mathcal{A}u - b \rangle$, $\quad U = \{u \in \mathbb{R}^{n \times n} : \|u\|_{\mathrm{nuc}} \le 1\}$, $V = \{v \in \mathbb{R}^{m \times m} : \|v\|_{\mathrm{nuc}} \le 1\}$,
where both $U$ and $V$ admit computationally cheap LMOs, but do not admit computationally cheap proximal setups.
⇒ Our previous approach (same as any other known approach) is inapplicable: we needed $Y$ ($= V$) to be proximal-friendly...
(?) How to solve convex-concave saddle point problems on products of LMO-represented domains?

17 Fenchel-Type Representation of Monotone Operator: Definition
Definition (Fenchel-type representation): Let $X \subset E$ be a convex compact set in a Euclidean space $E$, and let $\Phi : X \to E$ be a vector field on $X$. A Fenchel-type representation of $\Phi$ on $X$ is
$\Phi(x) = Ay(x) + a$   (*)
where
• $y \mapsto Ay + a : F \to E$ is an affine mapping from a Euclidean space $F$ into $E$;
• $y(x)$ is a strong solution to VI($G(\cdot) - A^T x$, $Y$);
• $Y \subset F$ is convex and compact, and $G(\cdot) : Y \to F$ is monotone.
$F, Y, A, a, y(\cdot), G(\cdot)$ is the data of the representation.
Definition: The dual operator induced by the F-t.r. (*) is
$\Theta(y) = G(y) - A^T x(y) : Y \to F$, $\quad x(y) \in \operatorname*{Argmin}_{x \in X} \langle Ay + a, x \rangle$
The v.i. VI($\Theta$, $Y$) is called the dual (induced by (*)) of the primal v.i. VI($\Phi$, $X$).

18 Fenchel-Type Representation of Monotone Operator (continued)
Facts:
• If an operator $\Phi : X \to E$ admits a representation on a convex compact set $X \subset E$, then $\Phi$ is monotone on $X$.
• The dual operator $\Theta$ induced by a Fenchel-type representation of a monotone operator is monotone; $\Theta$ is bounded, provided $G(\cdot)$ is so.
Calculus of Fenchel-type Representations: F-t.r.'s of monotone operators admit a fully algorithmic calculus:
• F-t.r.'s of the operands of the basic monotonicity-preserving operations (summation with nonnegative coefficients, direct summation, affine substitution of variables) can be straightforwardly converted into an F-t.r. of the result.
• An affine monotone operator admits an explicit F-t.r. on every compact domain.
• A good F-t.r. $f(x) = \max_{y \in Y} [\langle x, Ay + a \rangle - \phi(y)]$ of a convex function $f : X \to \mathbb{R}$ induces an F-t.r. of a subgradient field of $f$, provided $\phi \in C^1(Y)$.

19 A Digression: Variational Inequalities with Monotone Operators: Accuracy Measures
Find $x_* \in X$: $\langle \Phi(x), x - x_* \rangle \ge 0$ for all $x \in X$   [VI($\Phi, X$)]
• A natural measure of (in)accuracy of a candidate solution $\bar{x} \in X$ to VI($\Phi, X$) is the dual gap function
$\varepsilon_{\mathrm{vi}}(\bar{x}\,|\,\Phi, X) = \sup_{x \in X} \langle \Phi(x), \bar{x} - x \rangle$
• When VI($\Phi, X$) comes from a convex-concave saddle point problem, i.e., $X = U \times V$ for convex compact sets $U$, $V$ and $\Phi(u, v) = [\Phi_u(u, v) \in \partial_u f(u, v);\ \Phi_v(u, v) \in \partial_v[-f(u, v)]]$ for a Lipschitz continuous convex-concave function $f(u, v) : X = U \times V \to \mathbb{R}$, another natural accuracy measure is the saddle point inaccuracy
$\varepsilon_{\mathrm{sad}}(\bar{x} = [\bar{u}; \bar{v}]\,|\,f, U, V) := \max_{v \in V} f(\bar{u}, v) - \min_{u \in U} f(u, \bar{v})$
Explanation: A convex-concave saddle point problem gives rise to two dual to each other convex programs
$\mathrm{Opt}(P) = \min_{u \in U} \big[\overline{f}(u) := \max_{v \in V} f(u, v)\big]$   (P)
$\mathrm{Opt}(D) = \max_{v \in V} \big[\underline{f}(v) := \min_{u \in U} f(u, v)\big]$   (D)
with equal optimal values: $\mathrm{Opt}(P) = \mathrm{Opt}(D)$. Here $\varepsilon_{\mathrm{sad}}(\bar{u}, \bar{v}\,|\,f, U, V)$ is the sum of the non-optimalities, in terms of the respective objectives, of $\bar{u} \in U$ as a solution to (P) and of $\bar{v} \in V$ as a solution to (D).
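For the bilinear case used later in the talk, the saddle point inaccuracy is cheap to evaluate; below is our own minimal sketch of computing $\varepsilon_{\mathrm{sad}}$ for $f(u,v) = \langle a, u \rangle + \langle b, v \rangle + \langle v, Au \rangle$ with one LMO call for $U$ and one for $V$ (function names are ours).

```python
import numpy as np

def eps_sad_bilinear(u_bar, v_bar, a, b, A, lmo_U, lmo_V):
    """Saddle point inaccuracy for f(u,v) = <a,u> + <b,v> + <v,Au>:
       eps_sad = max_{v in V} f(u_bar, v) - min_{u in U} f(u, v_bar).
    lmo_U / lmo_V return the minimizer of a linear form over U / V."""
    v_star = lmo_V(-(b + A @ u_bar))     # maximizer of <b + A u_bar, v> over V
    u_star = lmo_U(a + A.T @ v_bar)      # minimizer of <a + A^T v_bar, u> over U
    upper = a @ u_bar + (b + A @ u_bar) @ v_star
    lower = b @ v_bar + (a + A.T @ v_bar) @ u_star
    return upper - lower
```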

20 Why Accuracy Certificates Certify Accuracy?
Fact: Let a v.i. VI($\Psi$, $Z$) with a monotone operator $\Psi$ and a convex compact $Z$ be solved by an N-step FOM, let $I_N = \{z_i \in Z, \Psi(z_i)\}_{i=1}^N$ be the execution protocol, and let $\lambda^N = \{\lambda_i \ge 0\}_{i=1}^N$, $\sum_i \lambda_i = 1$, be an accuracy certificate. Then $z^N = \sum_{i=1}^N \lambda_i z_i$ is a feasible solution to VI($\Psi$, $Z$), and
$\varepsilon_{\mathrm{vi}}(z^N\,|\,\Psi, Z) \le \mathrm{Res}(I_N, \lambda^N) := \max_{z \in Z} \sum_{i=1}^N \lambda_i \langle \Psi(z_i), z_i - z \rangle$
When $\Psi$ is associated with a convex-concave saddle point problem $\min_{u \in U} \max_{v \in V} f(u, v)$, we also have $\varepsilon_{\mathrm{sad}}(z^N\,|\,f, U, V) \le \mathrm{Res}(I_N, \lambda^N)$.
Fact: Let $\Psi$ be a bounded vector field on a convex compact domain $Z$. For every $N = 1, 2, \ldots$, a properly designed N-step proximal FOM (Mirror Descent) as applied to VI($\Psi$, $Z$) generates an execution protocol $I_N$ and an accuracy certificate $\lambda^N$ such that
$\mathrm{Res}(I_N, \lambda^N) \le O(1/\sqrt{N})$
If $\Psi$ is Lipschitz continuous on $Z$, then for a properly selected N-step FOM (Mirror Prox) the efficiency estimate improves to $\mathrm{Res}(I_N, \lambda^N) \le O(1/N)$.
In both cases, the factors hidden in $O(\cdot)$ are explicitly given by the parameters of the proximal setup and the magnitude of $\Psi$ (first case), or the Lipschitz constant of $\Psi$ (second case).
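A minimal sketch (ours, under an entropy/simplex setup and with illustrative names) of an N-step Mirror Descent for VI($\Psi$, $Z$) that keeps the execution protocol and the standard certificate $\lambda_t = \gamma_t / \sum_s \gamma_s$; the step sizes below are a simple $O(1/\sqrt{t})$ choice, not a tuned one.

```python
import numpy as np

def mirror_descent_vi(Psi, prox, z0, gammas):
    """Mirror Descent for VI(Psi, Z): z_{t+1} = prox(z_t, gamma_t * Psi(z_t)).
    Returns the execution protocol {z_t, Psi(z_t)} and the certificate
    lam_t = gamma_t / sum_s gamma_s."""
    z, zs, Ps = z0, [], []
    for g in gammas:
        v = Psi(z)
        zs.append(z); Ps.append(v)
        z = prox(z, g * v)
    lam = np.asarray(gammas) / np.sum(gammas)
    return zs, Ps, lam

def prox_entropy(z, xi):
    """Entropy prox-mapping on the simplex: argmin_w [omega(w) + <xi - omega'(z), w>]
    with omega(w) = sum_i w_i ln w_i, i.e. w_i proportional to z_i * exp(-xi_i)."""
    w = z * np.exp(-xi - np.max(-xi))
    return w / w.sum()

# toy usage: affine monotone operator Psi(z) = M z + q with skew-symmetric M
rng = np.random.default_rng(0)
S = rng.standard_normal((5, 5)); M, q = S - S.T, rng.standard_normal(5)
Psi = lambda z: M @ z + q
zs, Ps, lam = mirror_descent_vi(Psi, prox_entropy, np.ones(5) / 5,
                                0.3 / np.sqrt(np.arange(1, 201)))
z_hat = sum(l * z for l, z in zip(lam, zs))    # the certified point z^N
```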

21 Solving Monotone Variational Inequalities on LMO-Represented Domains
In order to solve a primal v.i. VI($\Phi$, $X$) given an F-t.r.
$\Phi(x) = Ay(x) + a$, where $y(x) \in Y$: $\langle G(y(x)) - A^T x, y - y(x) \rangle \ge 0$ for all $y \in Y$,
we solve the dual v.i. VI($\Theta$, $Y$),
$\Theta(y) = G(y) - A^T x(y)$, where $x(y) \in \operatorname*{Argmin}_{x \in X} \langle x, Ay + a \rangle$.
Note: Computing $\Theta(y)$ reduces to computing $G(y)$, multiplying by $A$ and $A^T$, and a single call to the LMO representing $X$.
Theorem [Juditsky, Nem. '13]: Let $I_N = \{y_i \in Y, \Theta(y_i)\}_{i=1}^N$ be the execution protocol of an FOM applied to the dual v.i. VI($\Theta$, $Y$), and let $\lambda^N = \{\lambda_i \ge 0\}_{i=1}^N$, $\sum_i \lambda_i = 1$, be an associated accuracy certificate. Then $x^N = \sum_{i=1}^N \lambda_i x(y_i)$ is a feasible solution to the primal v.i. VI($\Phi$, $X$) and
$\varepsilon_{\mathrm{vi}}(x^N\,|\,\Phi, X) \le \mathrm{Res}(I_N, \lambda^N) := \max_{y \in Y} \sum_{i=1}^N \lambda_i \langle \Theta(y_i), y_i - y \rangle$
If $\Phi$ is associated with the bilinear convex-concave saddle point problem $\min_{u \in U} \max_{v \in V} [f(u, v) = \langle a, u \rangle + \langle b, v \rangle + \langle v, Au \rangle]$, then also $\varepsilon_{\mathrm{sad}}(x^N\,|\,f, U, V) \le \mathrm{Res}(I_N, \lambda^N)$.
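The overall scheme of this slide amounts to a few lines; here is our own sketch of the dual operator call and of the primal recovery $x^N = \sum_i \lambda_i x(y_i)$, assuming user-supplied `G` and `lmo_X` and vector-space data.

```python
import numpy as np

def dual_operator(y, A, a, G, lmo_X):
    """Theta(y) = G(y) - A^T x(y), x(y) in Argmin_{x in X} <x, Ay + a>:
    one LMO call plus multiplications by A and A^T."""
    x_y = lmo_X(A @ y + a)
    return G(y) - A.T @ x_y, x_y

def recover_primal(xs, lam):
    """x^N = sum_i lam_i x(y_i): the certified approximate solution to VI(Phi, X)."""
    return sum(l * x for l, x in zip(lam, xs))
```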

22 How it Works
As applied to the Motivating Example
$\mathrm{Opt} = \min_{u \in \mathbb{R}^{n \times n},\ \|u\|_{\mathrm{nuc}} \le 1} \big[f(u) := \|\mathcal{A}u - b\|_{2,2}\big] = \min_{u \in \mathbb{R}^{n \times n},\ \|u\|_{\mathrm{nuc}} \le 1}\ \max_{v \in \mathbb{R}^{m \times m},\ \|v\|_{\mathrm{nuc}} \le 1} \langle v, \mathcal{A}u - b \rangle$, $\quad \mathcal{A}u = \sum_{i=1}^k l_i u r_i^T$,
our approach results in a method yielding in $N = 1, 2, \ldots$ steps feasible approximate solutions $u^N$ to the problem of interest and lower bounds $\mathrm{Opt}_N$ on $\mathrm{Opt}$ such that
$\mathrm{Gap}_N \equiv f(u^N) - \mathrm{Opt}_N \le O(1)\,\|\mathcal{A}\|_{2,2}/\sqrt{N}$
[Table reporting Gap$_N$, Gap$_1$/Gap$_N$ and CPU time (sec) versus the iteration count $N$ for instances with $k = 2$ and sizes ranging from $(m, n) = (512, 1024)$ to $(m, n) = (8192, 16384)$]
Platform: 4 x 3.40 GHz desktop with 16 GB RAM, 64-bit Windows 7 OS.
Note: The design dimension of the largest instance is $2^{28} = 268{,}435{,}456$.

23 References
• Bruce Cox, Anatoli Juditsky, Arkadi Nemirovski, "Dual subgradient algorithms for large-scale nonsmooth learning problems", Mathematical Programming Series B 148 (2014), 143-180; arXiv preprint, Aug. 2013.
• Anatoli Juditsky, Arkadi Nemirovski, "Solving variational inequalities with monotone operators on domains given by Linear Minimization Oracles", Mathematical Programming Series A, DOI 10.1007/s10107-015-0876-3; arXiv preprint, Dec. 2013.
