A. Derivation of regularized ERM duality

For completeness, in this section we derive the dual (5) to the problem of computing the proximal operator for the ERM objective (3). We can rewrite the primal problem as

    min_{x ∈ R^d, z ∈ R^n}  Σᵢ φ_i(z_i) + (λ/2)‖x − s‖²   subject to z_i = a_iᵀx, for i = 1, …, n.

By convex duality, this is equivalent to the Lagrangian saddle-point problem

    min_{x,z} max_{y ∈ R^n}  Σᵢ φ_i(z_i) + (λ/2)‖x − s‖² + yᵀ(Ax − z)
        = max_{y} min_{x,z}  Σᵢ φ_i(z_i) + (λ/2)‖x − s‖² + yᵀ(Ax − z)
        = max_{y} { min_z Σᵢ [φ_i(z_i) − y_i z_i] + min_x [(λ/2)‖x − s‖² + yᵀAx] }
        = max_{y} { −Σᵢ max_{z_i} [y_i z_i − φ_i(z_i)] − max_x [−yᵀAx − (λ/2)‖x − s‖²] }
        = max_{y}  −Σᵢ φ_i*(y_i) − (1/(2λ))‖Aᵀy‖² + sᵀAᵀy
        = −min_{y}  Σᵢ φ_i*(y_i) + (1/(2λ))‖Aᵀy‖² − sᵀAᵀy.

In the fourth step we used max_x {−yᵀAx − (λ/2)‖x − s‖²} = (1/(2λ))‖Aᵀy‖² − sᵀAᵀy, attained at x = s − (1/λ)Aᵀy. The final negated problem is precisely the dual formulation (5). The first problem above is a Lagrangian saddle-point problem, where the Lagrangian is defined as

    L(x, y, z) = Σᵢ φ_i(z_i) + (λ/2)‖x − s‖² + yᵀ(Ax − z).

The dual-to-primal mapping (6) and primal-to-dual mapping (7) are implied by the KKT conditions under L, and can be derived by solving for x, y, and z in the system ∇L(x, y, z) = 0. The duality gap in this context is defined as

    gap_{s,λ}(x, y) = f_{s,λ}(x) + g_{s,λ}(y).   (9)

Weak duality dictates that gap_{s,λ}(x, y) ≥ 0 for all x ∈ R^d and y ∈ R^n, and strong duality holds: equality is attained when x is primal-optimal and y is dual-optimal.

B. Additional lemmas

This section establishes technical lemmas useful in statements and proofs throughout the paper.

Lemma B.1 (Standard bounds for smooth, strongly convex functions). Let f : R^k → R be differentiable, let x ∈ R^k, and let x^opt be a minimizer of f. If f is L-smooth, then

    (1/(2L))‖∇f(x)‖² ≤ f(x) − f(x^opt) ≤ (L/2)‖x − x^opt‖².

If f is μ-strongly convex, we have

    (μ/2)‖x − x^opt‖² ≤ f(x) − f(x^opt) ≤ (1/(2μ))‖∇f(x)‖².

Proof. Apply the definitions of smoothness and strong convexity at the points x and x^opt and minimize the resulting quadratic form.

Lemma B.2 (Errors for primal-dual pairs). Consider F, f_{s,λ}, g_{s,λ}, ˆx_{s,λ}, and ŷ, as defined in (1), (3), (5), (6), and (7), respectively. Then for all y ∈ R^n,

    f_{s,λ}(ˆx_{s,λ}(y)) − f_{s,λ}^opt ≤ (R²L(nR²L + λ)/λ²) (g_{s,λ}(y) − g_{s,λ}^opt).

Furthermore, let x_{s,λ}^opt = argmin_x f_{s,λ}(x) and y_{s,λ}^opt = argmin_y g_{s,λ}(y). Then for all y ∈ R^n,

    ‖ˆx_{s,λ}(y) − x_{s,λ}^opt‖ ≤ (√n R/λ) ‖y − y_{s,λ}^opt‖.

Proof. Because F is nR²L-smooth, f_{s,λ} is (nR²L + λ)-smooth. Consequently, for all x ∈ R^d we have

    f_{s,λ}(x) − f_{s,λ}^opt ≤ ((nR²L + λ)/2) ‖x − x_{s,λ}^opt‖².

Since we know that x_{s,λ}^opt = s − (1/λ)Aᵀy_{s,λ}^opt and AAᵀ ⪯ nR² I,

    ‖ˆx_{s,λ}(y) − x_{s,λ}^opt‖² = (1/λ²)(y − y_{s,λ}^opt)ᵀAAᵀ(y − y_{s,λ}^opt) ≤ (nR²/λ²) ‖y − y_{s,λ}^opt‖².   (10)

Finally, since each φ_i* is (1/L)-strongly convex, G is (n/L)-strongly convex and hence so is g_{s,λ}. Therefore,

    (n/(2L)) ‖y − y_{s,λ}^opt‖² ≤ g_{s,λ}(y) − g_{s,λ}^opt.

Substituting in (10) yields the result.

By adding the dual error to both sides of the inequality in Lemma B.2, we obtain the following corollary.

Corollary B.3 (Dual error bounds gap). Consider g_{s,λ}, gap_{s,λ}, ˆx_{s,λ}, and ŷ, as defined in (5), (9), (6), and (7), respectively. For all s ∈ R^d and y ∈ R^n,

    gap_{s,λ}(ˆx_{s,λ}(y), y) ≤ (1 + R²L(nR²L + λ)/λ²)(g_{s,λ}(y) − g_{s,λ}^opt).

Another corollary arises by combining Lemmas B.1 and B.2.

Corollary B.4 (Dual error bounds primal gradient). Consider f_{s,λ}, g_{s,λ}, and ˆx_{s,λ}, as defined in (3), (5), and (6), respectively. For all s ∈ R^d and y ∈ R^n,

    (1/(2(nR²L + λ))) ‖∇f_{s,λ}(ˆx_{s,λ}(y))‖² ≤ f_{s,λ}(ˆx_{s,λ}(y)) − f_{s,λ}^opt ≤ (R²L(nR²L + λ)/λ²)(g_{s,λ}(y) − g_{s,λ}^opt).
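The two-sided bounds of Lemma B.1 can be sanity-checked numerically. The sketch below uses an illustrative diagonal quadratic; the instance (and all concrete values) is an assumption for illustration, not taken from the paper.

```python
import numpy as np

# Numerical sanity check of Lemma B.1 on a strongly convex quadratic
# f(x) = 0.5 * x^T H x, whose smoothness L and strong convexity mu are the
# extreme eigenvalues of H. The minimizer is x_opt = 0.
rng = np.random.default_rng(0)
H = np.diag([1.0, 4.0, 10.0])   # eigenvalues give mu = 1, L = 10
L, mu = 10.0, 1.0
x_opt = np.zeros(3)

def f(x):
    return 0.5 * x @ H @ x

def grad(x):
    return H @ x

for _ in range(100):
    x = rng.standard_normal(3)
    err = f(x) - f(x_opt)
    g2 = np.linalg.norm(grad(x)) ** 2
    d2 = np.linalg.norm(x - x_opt) ** 2
    # L-smoothness: ||grad f||^2/(2L) <= f(x) - f_opt <= (L/2)||x - x_opt||^2
    assert g2 / (2 * L) <= err + 1e-12 <= L / 2 * d2 + 1e-12
    # mu-strong convexity: (mu/2)||x - x_opt||^2 <= f(x) - f_opt <= ||grad f||^2/(2 mu)
    assert mu / 2 * d2 <= err + 1e-12 <= g2 / (2 * mu) + 1e-12
print("Lemma B.1 bounds verified")
```

The same four inequalities are exactly the ones invoked repeatedly in the proofs of Lemmas B.2, C.3, and D.1.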

Lemma B.5 (Gap for primal-dual pairs). Consider F, f_{s,λ}, g_{s,λ}, gap_{s,λ}, ˆx_{s,λ}, and ŷ, as defined in (1), (3), (5), (9), (6), and (7), respectively. For any x, s ∈ R^d,

    gap_{s,λ}(x, ŷ(x)) = (1/(2λ)) ‖∇F(x) + λ(x − s)‖².

Separately, for any y ∈ R^n and s ∈ R^d,

    gap_{s,λ}(ˆx_{s,λ}(y), y) = F(ˆx_{s,λ}(y)) + g_{s,λ}(y) + (1/(2λ)) ‖Aᵀy‖².   (13)

Proof. To prove the first identity, let ŷ = ŷ(x) for brevity. Recall that

    ŷ_i = φ_i'(a_iᵀx) = argmax_{η} { η · a_iᵀx − φ_i*(η) }   (14)

by definition, and hence xᵀa_i ŷ_i − φ_i*(ŷ_i) = φ_i(a_iᵀx). Observe that

    gap_{s,λ}(x, ŷ) = Σᵢ φ_i(a_iᵀx) + Σᵢ φ_i*(ŷ_i) − xᵀAᵀŷ + xᵀAᵀŷ + (1/(2λ))‖Aᵀŷ‖² − sᵀAᵀŷ + (λ/2)‖x − s‖²
        = Σᵢ [ φ_i(a_iᵀx) + φ_i*(ŷ_i) − xᵀa_i ŷ_i ]  { = 0 by (14) }  + (1/(2λ)) ‖Aᵀŷ + λ(x − s)‖²
        = (1/(2λ)) ‖Σᵢ a_i φ_i'(a_iᵀx) + λ(x − s)‖²
        = (1/(2λ)) ‖∇F(x) + λ(x − s)‖².

For the second identity (13), let ˆx = ˆx_{s,λ}(y) for brevity. Fix s ∈ R^d and λ > 0. Define r(x) = (λ/2)‖x − s‖², and note that its Fenchel conjugate is r*(ξ) = (1/(2λ))‖ξ‖² + sᵀξ. With this notation we can write

    f_{s,λ}(x) = F(x) + r(x),    g_{s,λ}(y) = G(y) + r*(−Aᵀy).

Observe that

    ˆx = s − (1/λ)Aᵀy = argmin_x { (λ/2)‖x − s‖² + xᵀAᵀy } = argmax_x { xᵀ(−Aᵀy) − r(x) } = ∇r*(−Aᵀy).

Note also that g_{s,λ} may be rewritten as

    g_{s,λ}(y) = G(y) + (1/(2λ))‖Aᵀy‖² − sᵀAᵀy = G(y) + (1/λ)‖Aᵀy‖² − sᵀAᵀy − (1/(2λ))‖Aᵀy‖² = G(y) − ˆxᵀAᵀy − (1/(2λ))‖Aᵀy‖².

Combining the above two observations, and noting that the first implies equality in the Fenchel–Young inequality, i.e. r(ˆx) + r*(−Aᵀy) = −ˆxᵀAᵀy,

    gap_{s,λ}(ˆx, y) = f_{s,λ}(ˆx) + g_{s,λ}(y) = F(ˆx) + G(y) + r(ˆx) + r*(−Aᵀy) = F(ˆx) + G(y) − ˆxᵀAᵀy = F(ˆx) + g_{s,λ}(y) + (1/(2λ))‖Aᵀy‖²,
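The duality framework of Sections A and B admits a quick numerical check. The following sketch instantiates the setup with squared losses φ_i(z) = ½(z − b_i)², whose conjugates are φ_i*(η) = ½η² + b_iη; the matrix A, vector b, center s, and parameter λ are illustrative assumptions, not data from the paper.

```python
import numpy as np

# Sanity check of weak duality and both identities of Lemma B.5 for squared
# losses phi_i(z) = 0.5*(z - b_i)^2 with conjugate phi_i*(y) = 0.5*y^2 + b_i*y.
rng = np.random.default_rng(1)
n, d, lam = 5, 3, 0.7
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)
s = rng.standard_normal(d)

F = lambda x: 0.5 * np.sum((A @ x - b) ** 2)
gradF = lambda x: A.T @ (A @ x - b)
f = lambda x: F(x) + lam / 2 * np.sum((x - s) ** 2)                      # primal (3)
G = lambda y: np.sum(0.5 * y ** 2 + b * y)                               # sum of conjugates
g = lambda y: G(y) + np.sum((A.T @ y) ** 2) / (2 * lam) - s @ (A.T @ y)  # dual (5)
gap = lambda x, y: f(x) + g(y)                                           # gap (9)

x = rng.standard_normal(d)
y = rng.standard_normal(n)
assert gap(x, y) >= -1e-10                       # weak duality: gap >= 0

yhat = A @ x - b                                 # primal-to-dual map (7)
lhs = gap(x, yhat)
rhs = np.sum((gradF(x) + lam * (x - s)) ** 2) / (2 * lam)   # first identity of B.5
assert abs(lhs - rhs) < 1e-8

xhat = s - A.T @ y / lam                         # dual-to-primal map (6)
rhs2 = F(xhat) + g(y) + np.sum((A.T @ y) ** 2) / (2 * lam)  # identity (13)
assert abs(gap(xhat, y) - rhs2) < 1e-8
print("duality identities verified")
```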

proving the claim.

C. Convergence analysis of Dual APPA

The goal of this section is to establish the convergence rate of Algorithm 3, Dual APPA. It is structured as follows:

Mirroring Lemma 3.1, we establish Lemma C.1, concerning contraction of the norm of the gradient, ‖∇F‖, rather than of the error in function value F − F^opt. Similar to Lemma 3.1, this lemma is self-contained and assumes only that F is μ-strongly convex, not that it is the ERM objective. We then directly establish a geometric rate upper bound on the gradient norm over the course of the algorithm, relying in the process on Lemma C.1. Finally, we appeal to the strong convexity of F to establish that the error F − F^opt too is subject to the same rate, up to an extra multiplicative factor.

Lemma C.1 (Gradient-norm reduction implies contraction). Suppose F : R^d → R is μ-strongly convex and that λ > 0. Define f_s : R^d → R by f_s(x) = F(x) + (λ/2)‖x − s‖². For any x₀, x₁ ∈ R^d,

    ‖∇F(x₁)‖ ≤ (1 + λ/(λ + μ)) ‖∇f_{x₀}(x₁)‖ + (λ/(λ + μ)) ‖∇F(x₀)‖.   (15)

Corollary C.2. Define γ := λ/(λ + μ) and let τ be any scalar in the interval [γ, 1). For any x₀, x₁ ∈ R^d,

    ‖∇F(x₁)‖ ≤ τ ‖∇F(x₀)‖   (16)

whenever

    ‖∇f_{x₀}(x₁)‖ ≤ ((τ − γ)/(1 + γ)) ‖∇F(x₀)‖.

Proof of Lemma C.1. Taking gradients of f_{x₀} at x₁ and at x₀ we have that

    ∇F(x₁) = ∇f_{x₀}(x₁) − λ(x₁ − x₀),    ∇f_{x₀}(x₀) = ∇F(x₀).

By the triangle inequality,

    ‖∇F(x₁)‖ ≤ ‖∇f_{x₀}(x₁)‖ + λ‖x₁ − x₀‖.

Because f_{x₀} is (λ + μ)-strongly convex,

    (λ + μ)‖x₁ − x₀‖ ≤ ‖∇f_{x₀}(x₁) − ∇f_{x₀}(x₀)‖ ≤ ‖∇f_{x₀}(x₁)‖ + ‖∇f_{x₀}(x₀)‖ = ‖∇f_{x₀}(x₁)‖ + ‖∇F(x₀)‖.

Combining the previous two observations proves the claim.

Throughout the remainder of this section, we consider the functions defined in the main setup section: namely F, f_{s,λ}, g_{s,λ}, ˆx_{s,λ}, and ŷ, as defined in (1), (3), (5), (6), and (7), respectively, where F is a μ-strongly convex sum whose constituent summands are each L-smooth scalar functions operating on the inner product of the variable x ∈ R^d with a vector a_i of Euclidean norm at most R. For notational brevity, we omit the subscript λ, and we denote iterates of the algorithm with numerical subscripts (e.g. x₁, x₂, … and y₁, y₂, …) rather than parenthesized superscripts.
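The bound of Lemma C.1 can be verified numerically; the sketch below checks it on an illustrative strongly convex quadratic (the instance is an assumption, not the paper's F).

```python
import numpy as np

# Numerical check of Lemma C.1's triangle-inequality bound on a mu-strongly
# convex quadratic F(x) = 0.5 x^T H x (illustrative instance).
rng = np.random.default_rng(2)
H = np.diag([2.0, 3.0, 5.0])    # mu = 2 (smallest eigenvalue)
mu, lam = 2.0, 1.5
gradF = lambda x: H @ x

for _ in range(100):
    x0, x1 = rng.standard_normal(3), rng.standard_normal(3)
    grad_fx0_at_x1 = gradF(x1) + lam * (x1 - x0)   # gradient of f_{x0} at x1
    lhs = np.linalg.norm(gradF(x1))
    rhs = (1 + lam / (lam + mu)) * np.linalg.norm(grad_fx0_at_x1) \
          + lam / (lam + mu) * np.linalg.norm(gradF(x0))
    assert lhs <= rhs + 1e-10
print("Lemma C.1 bound verified")
```

Note that the check exercises arbitrary pairs (x₀, x₁): the lemma places no requirement that x₁ approximately minimize f_{x₀}; that requirement enters only through Corollary C.2.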
We begin by bounding the error of the dual regularized ERM problem when the center of regularization changes. This characterizes the initial error at the beginning of each Dual APPA iteration.

Lemma C.3 (Dual error is bounded across re-centering). Fix y ∈ R^n and x ∈ R^d. Let x' = ˆx_x(y). Then

    g_{x'}(y) − g_{x'}^opt ≤ c₁ (g_x(y) − g_x^opt) + c₂ ‖∇F(x)‖²,

where c₁ = 1 + (1 + 2λ/(λ + μ)) R²L(nR²L + λ)/λ² and c₂ = λ/(λ + μ)². In other words, the dual error g_s − g_s^opt is bounded across a re-centering step by multiples of previous sub-optimality measurements, namely dual error and gradient norm.

Proof. We first show how re-centering, from x to x', increases the duality gap between y and x', the new center. The dual function value changes from

    g_x(y) = G(y) + (1/(2λ))‖Aᵀy‖² − xᵀAᵀy

to

    g_{x'}(y) = G(y) + (1/(2λ))‖Aᵀy‖² − x'ᵀAᵀy = g_x(y) + (x − x')ᵀAᵀy = g_x(y) + λ‖x − x'‖²,

where the last step uses x' = x − (1/λ)Aᵀy; i.e., it increases by λ‖x − x'‖². Meanwhile, the primal function value changes from f_x(x') = F(x') + (λ/2)‖x' − x‖² to f_{x'}(x') = F(x'); i.e., it decreases by (λ/2)‖x − x'‖². Hence, in total, the new duality gap is

    gap_{x'}(x', y) = gap_x(x', y) + (λ/2)‖x − x'‖².

By the (λ + μ)-strong convexity of f_x, with x_x^opt its minimizer,

    (λ/2)‖x − x'‖² ≤ λ‖x − x_x^opt‖² + λ‖x' − x_x^opt‖² ≤ (2λ/(λ + μ)) [ (f_x(x) − f_x^opt) + (f_x(x') − f_x^opt) ],

and, since ∇f_x(x) = ∇F(x), Lemma B.1 gives f_x(x) − f_x^opt ≤ (1/(2(λ + μ)))‖∇F(x)‖². Therefore

    gap_{x'}(x', y) ≤ gap_x(x', y) + (2λ/(λ + μ)) (f_x(x') − f_x^opt) + (λ/(λ + μ)²) ‖∇F(x)‖².

Combining the last two chains of inequalities, and abbreviating L̄ = nR²L + λ, we bound the re-centered dual error of y:

    g_{x'}(y) − g_{x'}^opt ≤ gap_{x'}(x', y)
        ≤ gap_x(x', y) + (2λ/(λ + μ)) (f_x(x') − f_x^opt) + (λ/(λ + μ)²) ‖∇F(x)‖²
        ≤ (1 + R²L L̄/λ²)(g_x(y) − g_x^opt) + (2λ/(λ + μ))(R²L L̄/λ²)(g_x(y) − g_x^opt) + (λ/(λ + μ)²) ‖∇F(x)‖²
        = c₁ (g_x(y) − g_x^opt) + c₂ ‖∇F(x)‖²,

where the final inequality applies Corollary B.3 (recalling x' = ˆx_x(y)) and Lemma B.2. Recalling the choices of the numerical constants c₁ and c₂, the claim is proven.

For later reference, we also record a consequence of the L̄-smoothness of f_x (Lemma B.1) together with Lemma B.2:

    ‖∇f_x(ˆx_x(y))‖² ≤ 2L̄ (f_x(ˆx_x(y)) − f_x^opt) ≤ (2R²L L̄²/λ²)(g_x(y) − g_x^opt).   (18)

Finally, we prove that for an appropriate choice of oracle parameter σ, independent of the target error ε, the iterates produced by Dual APPA are bounded above by a geometric sequence. In doing so, it will be convenient to abbreviate the numerical constant appearing in (18),

    c₃ = 2R²L(nR²L + λ)²/λ².

Proposition C.4 (Convergence rate of Dual APPA in gradient norm). Define γ = λ/(λ + μ) and let τ be any scalar in [γ, 1). Define

    r = max{ 1/2, τ² }

and

    σ = max{ 1/(2λr),  (c₁ + c₂)/r²,  (1 + γ)² c₃ /(2λ(τ − γ)²),  (1 + γ)² c₃ (c₁ + c₂)/((τ − γ)² r) }.

In the execution of Dual APPA (Algorithm 3), for every iteration t ≥ 1,

    g_{x_{t−1}}(y_t) − g_{x_{t−1}}^opt ≤ c r^t   and   ‖∇F(x_t)‖² ≤ c r^t,

where c = ‖∇F(x₀)‖².

Proof. The argument proceeds by strong induction on t. We begin the inductive argument with the base case t = 1. For the first invariant, the oracle guarantees a σ-fold decrease of the dual error, and the algorithm explicitly takes y₀ = ŷ(x₀), so that Lemma B.5 gives g_{x₀}(y₀) − g_{x₀}^opt ≤ gap_{x₀}(x₀, y₀) = (1/(2λ))‖∇F(x₀)‖². Hence

    g_{x₀}(y₁) − g_{x₀}^opt ≤ σ⁻¹ (g_{x₀}(y₀) − g_{x₀}^opt) ≤ σ⁻¹ (1/(2λ)) ‖∇F(x₀)‖² ≤ c r.

So this part of the claim holds, as σ ≥ 1/(2λr).

The second invariant holds immediately at t = 0, as c = ‖∇F(x₀)‖². We will also explicitly handle the case t = 1 for the second invariant. By (18) and our choice of σ, we have that

    ‖∇f_{x₀}(x₁)‖² ≤ c₃ (g_{x₀}(y₁) − g_{x₀}^opt) ≤ c₃ σ⁻¹ (1/(2λ)) ‖∇F(x₀)‖² ≤ ((τ − γ)/(1 + γ))² ‖∇F(x₀)‖²,

and so, by Corollary C.2,

    ‖∇F(x₁)‖ ≤ τ ‖∇F(x₀)‖,

and the claim follows as τ² ≤ r.

Now consider t ≥ 2. Concerning the first invariant, the oracle again guarantees a σ-fold decrease of the dual error, and so, by Lemma C.3 and the inductive hypotheses,

    g_{x_{t−1}}(y_t) − g_{x_{t−1}}^opt ≤ σ⁻¹ (g_{x_{t−1}}(y_{t−1}) − g_{x_{t−1}}^opt)
        ≤ σ⁻¹ [ c₁ (g_{x_{t−2}}(y_{t−1}) − g_{x_{t−2}}^opt) + c₂ ‖∇F(x_{t−2})‖² ]
        ≤ σ⁻¹ [ c₁ c r^{t−1} + c₂ c r^{t−2} ]
        ≤ σ⁻¹ (c₁ + c₂) c r^{t−2} ≤ c r^t,

using σ ≥ (c₁ + c₂)/r². For the second invariant, by Lemma C.1, inequality (18), the chain just established, and the choice of σ, we have that

    ‖∇F(x_t)‖ ≤ (1 + γ) ‖∇f_{x_{t−1}}(x_t)‖ + γ ‖∇F(x_{t−1})‖
        ≤ (1 + γ) ( c₃ σ⁻¹ (c₁ + c₂) c r^{t−2} )^{1/2} + γ (c r^{t−1})^{1/2}
        ≤ (τ − γ)(c r^{t−1})^{1/2} + γ (c r^{t−1})^{1/2}
        = τ (c r^{t−1})^{1/2} ≤ (c r^t)^{1/2},

where the final inequality is due to the choice of r ≥ τ². This completes the inductive argument.

Equipped with Proposition C.4, we translate it into a statement about the convergence rate of Dual APPA in error, rather than in gradient norm, to prove Theorem .0:

Proof of Theorem .0. Define γ, τ, r, and σ as in Proposition C.4. In the execution of Dual APPA (Algorithm 3), at each iteration t, we claim that

    F(x_t) − F^opt ≤ (nR²L/μ) ε₀ r^t,
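The geometric decay driving the argument can be illustrated with an exact proximal-point iteration on a small quadratic, a stand-in for Dual APPA's inexact dual oracle; each exact step contracts ‖∇F‖ by at least γ = λ/(λ + μ), matching Corollary C.2 in the limit τ → γ. The instance below is an assumption for illustration.

```python
import numpy as np

# Proximal-point iteration on a strongly convex quadratic F(x) = 0.5 x^T H x,
# with the prox subproblem solved exactly in closed form. Checks the geometric
# decay of ||grad F|| promised by Corollary C.2 / Proposition C.4.
H = np.diag([1.0, 3.0, 8.0])          # mu = 1
mu, lam = 1.0, 2.0
gamma = lam / (lam + mu)               # contraction factor of Corollary C.2
gradF = lambda x: H @ x

x = np.array([1.0, -2.0, 0.5])
norms = [np.linalg.norm(gradF(x))]
for _ in range(20):
    # exact minimizer of f_x(z) = F(z) + (lam/2)||z - x||^2
    x = np.linalg.solve(H + lam * np.eye(3), lam * x)
    norms.append(np.linalg.norm(gradF(x)))

# each exact prox step contracts the gradient norm at least by gamma
for a, b in zip(norms, norms[1:]):
    assert b <= gamma * a + 1e-12
print("geometric decay verified")
```

On a quadratic the contraction is exact: ∇F(x⁺) = λ(H + λI)⁻¹∇F(x), whose eigenvalues are λ/(h_i + λ) ≤ γ, which is why the assertion holds step by step.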

where ε₀ = F(x₀) − F^opt. Consequently, for any ε > 0, in order to achieve ε error in Dual APPA, it suffices to take a number of iterations

    T ≥ (1/(1 − r)) ( log(nR²L/μ) + log(ε₀/ε) ).

The first part of the claim follows directly from Proposition C.4, using the smoothness and strong convexity of F: indeed, F(x_t) − F^opt ≤ (1/(2μ))‖∇F(x_t)‖² ≤ (1/(2μ)) c r^t, and c = ‖∇F(x₀)‖² ≤ 2nR²L ε₀. The second part follows from a calculation sufficient to make the right-hand side of the claimed bound smaller than a given ε.

D. Convergence analysis of Accelerated APPA

The goal of this section is to establish the convergence rate of Algorithm 2, Accelerated APPA, and prove Theorem .6. Note that the results in this section use nothing about the structure of F other than strong convexity, and thus they apply to the general ERM problem (1); they are written in greater generality to make this distinction clear. Aspects of the proofs in this section bear resemblance to the analysis in Shalev-Shwartz & Zhang (2014), which achieves similar results in a more specialized setting.

The rest of this section is structured as follows: In Lemma D.1 we show that applying a primal oracle to the inner minimization problem gives us a quadratic lower bound on the objective. In Lemma D.2 we use this lower bound to construct a series of lower bounds for the main objective function f and accelerate the APPA algorithm; this comprises the bulk of the analysis. In Lemma D.3 we show that the requirements of Lemma D.2 can be met by using a primal oracle that decreases the error by a constant factor. In Lemma D.4 we analyze the initial error requirements of Lemma D.2. In Lemma D.5 we provide an auxiliary lemma for combining two quadratic functions, which we use in the earlier proofs. The proof of Theorem .6 follows immediately from these lemmas.

Lemma D.1. Let f : R^n → R be μ-strongly convex and let f_{x₀,λ}(x) := f(x) + (λ/2)‖x − x₀‖². Suppose x⁺ satisfies

    f_{x₀,λ}(x⁺) ≤ min_{x ∈ R^n} f_{x₀,λ}(x) + ε.

Then for μ̃ := μ/2, g := λ(x₀ − x⁺), and all x ∈ R^n we have

    f(x) ≥ f(x⁺) − (1/(2μ̃))‖g‖² + (μ̃/2) ‖x − x₀ + ((λ + μ̃)/(λμ̃)) g‖² − (2 + λ/μ̃) ε.

Note that as μ̃ = μ/2 we are only losing a factor of 2 in the strong convexity parameter for our lower bound. This allows us to account for the oracle error without loss in our ultimate convergence rates.

Proof. Let x^opt = argmin_x f_{x₀,λ}(x). Since f is μ-strongly convex, clearly f_{x₀,λ} is (λ + μ)-strongly convex, and by Lemma B.1

    f_{x₀,λ}(x) ≥ f_{x₀,λ}(x^opt) + ((λ + μ)/2)‖x − x^opt‖².

By Cauchy–Schwarz (in the form of Young's inequality) we know that, for any θ ∈ (0, 1),

    ‖x − x^opt‖² ≥ (1 − θ)‖x − x⁺‖² − (1/θ − 1)‖x⁺ − x^opt‖².

On the other hand, since f_{x₀,λ}(x⁺) ≤ f_{x₀,λ}(x^opt) + ε, the first display gives ((λ + μ)/2)‖x⁺ − x^opt‖² ≤ ε. Choosing θ = μ̃/(λ + μ) and combining,

    f_{x₀,λ}(x) ≥ f_{x₀,λ}(x⁺) + ((λ + μ̃)/2)‖x − x⁺‖² − (2 + λ/μ̃) ε.

Now, since x − x⁺ = (x − x₀) + (1/λ)g, we have

    ‖x − x⁺‖² = ‖x − x₀‖² + (2/λ)⟨g, x − x₀⟩ + (1/λ²)‖g‖²,

and, using the facts f_{x₀,λ}(x) = f(x) + (λ/2)‖x − x₀‖² and f_{x₀,λ}(x⁺) = f(x⁺) + (1/(2λ))‖g‖², we obtain

    f(x) ≥ f(x⁺) + (1/(2λ))‖g‖² + (μ̃/2)‖x − x₀‖² + (1 + μ̃/λ)⟨g, x − x₀⟩ + ((λ + μ̃)/(2λ²))‖g‖² − (2 + λ/μ̃) ε.

The right-hand side is a quadratic function of x; completing the square, it obtains its minimum when x = x₀ − ((λ + μ̃)/(λμ̃))g, with minimum value f(x⁺) − (1/(2μ̃))‖g‖² − (2 + λ/μ̃)ε, which yields the stated form.

Lemma D.2. Let f : R^n → R be μ-strongly convex and suppose that in each iteration t we have ψ_t(x) = ψ_t^opt + (μ̃/2)‖x − v_t‖² such that f(x) ≥ ψ_t(x) for all x. Let ρ := μ̃/(λ + μ̃), let f_{y,λ}(x) := f(x) + (λ/2)‖x − y‖², and define

    y_t := (x_t + ρ^{1/2} v_t)/(1 + ρ^{1/2}),
    g_t := λ(y_t − x_{t+1}),
    v_{t+1} := (1 − ρ^{1/2}) v_t + ρ^{1/2} ( y_t − ((λ + μ̃)/(λμ̃)) g_t ).

Suppose the iterates satisfy

    E[f_{y_t,λ}(x_{t+1})] − f_{y_t,λ}^opt ≤ (ρ^{3/2}/4)(f(x_t) − ψ_t^opt).

Then we have that

    E[f(x_t) − ψ_t^opt] ≤ (1 − ρ^{1/2}/2)^t (f(x₀) − ψ₀^opt).

Proof. Regardless of how y_t is chosen, Lemma D.1 (applied with center y_t, output x_{t+1}, and error ε_t := f_{y_t,λ}(x_{t+1}) − f_{y_t,λ}^opt) shows that for γ := 2 + λ/μ̃ and all x ∈ R^n,

    f(x) ≥ f(x_{t+1}) − (1/(2μ̃))‖g_t‖² + (μ̃/2)‖x − y_t + ((λ + μ̃)/(λμ̃)) g_t‖² − γ ε_t.

Thus, for β = 1 − ρ^{1/2} we can let

    ψ_{t+1}(x) := β ψ_t(x) + (1 − β) [ f(x_{t+1}) − (1/(2μ̃))‖g_t‖² + (μ̃/2)‖x − y_t + ((λ + μ̃)/(λμ̃)) g_t‖² − γ ε_t ],

which is again a lower bound for f; by Lemma D.5 it is of the required form ψ_{t+1}^opt + (μ̃/2)‖x − v_{t+1}‖², with v_{t+1} as defined above. Expanding ψ_{t+1}^opt via Lemma D.5 and instantiating x = x_t in the lower bound above, and noting that we have chosen y_t precisely so that the resulting inner-product term vanishes, a direct calculation gives

    E[f(x_{t+1}) − ψ_{t+1}^opt] ≤ (1 − ρ^{1/2}) E[f(x_t) − ψ_t^opt] + (additional terms controlled by E[ε_t]).

By assumption E[ε_t] ≤ (ρ^{3/2}/4)(f(x_t) − ψ_t^opt), and the choice β = 1 − ρ^{1/2} ensures the additional terms total at most (ρ^{1/2}/2)(f(x_t) − ψ_t^opt), whence

    E[f(x_{t+1}) − ψ_{t+1}^opt] ≤ (1 − ρ^{1/2}/2)(f(x_t) − ψ_t^opt).

The claim follows by induction.

Lemma D.3. Under the setting of Lemma D.2, we have

    f_{y_t,λ}(x_t) − f_{y_t,λ}^opt ≤ f(x_t) − ψ_t^opt.

In particular, in order to achieve E[f_{y_t,λ}(x_{t+1})] − f_{y_t,λ}^opt ≤ (ρ^{3/2}/8)(f(x_t) − ψ_t^opt), we only need an oracle that shrinks the function error f_{y_t,λ} − f_{y_t,λ}^opt by a factor of ρ^{3/2}/8 in expectation.

Proof. We know

    f_{y_t,λ}(x_t) − f(x_t) = (λ/2)‖x_t − y_t‖² = (ρλ/(2(1 + ρ^{1/2})²))‖x_t − v_t‖²,

since x_t − y_t = ρ^{1/2}(x_t − v_t)/(1 + ρ^{1/2}). We will show the lower bound f_{y_t,λ}^opt is larger than ψ_t^opt by the same amount. This is because for all x we have

    f_{y_t,λ}(x) = f(x) + (λ/2)‖x − y_t‖² ≥ ψ_t^opt + (μ̃/2)‖x − v_t‖² + (λ/2)‖x − y_t‖².

The right-hand side is a quadratic function, whose optimal point is at x = (μ̃ v_t + λ y_t)/(μ̃ + λ) and whose optimal value is equal to

    ψ_t^opt + (μ̃λ/(2(μ̃ + λ)))‖v_t − y_t‖² = ψ_t^opt + (ρλ/2)‖v_t − y_t‖² = ψ_t^opt + (ρλ/(2(1 + ρ^{1/2})²))‖x_t − v_t‖²,

since v_t − y_t = (v_t − x_t)/(1 + ρ^{1/2}) and, by definition of ρ, μ̃λ/(μ̃ + λ) = ρλ. The two quantities are exactly equal, and therefore

    f_{y_t,λ}(x_t) − f_{y_t,λ}^opt ≤ f(x_t) − ψ_t^opt.

Remark. The preceding lemma shows that moving to the regularized problem has the same effect on the primal function value and on the lower bound. This is a result of the choice of β in the proof of Lemma D.2. However, this does not mean the choice of β is very fragile: we may choose any β between the current choice and 1; the effect on this lemma is that the increase in the primal function value becomes smaller than the increase in the lower bound, so the lemma continues to hold.

Lemma D.4. Let ψ₀^opt := f(x₀) − (2 + λ/μ̃)(f(x₀) − f^opt) and v₀ := x₀; then ψ₀(x) := ψ₀^opt + (μ̃/2)‖x − v₀‖² is a valid lower bound for f. In particular, when λ = LR², we have f(x₀) − ψ₀^opt ≤ 4κ(f(x₀) − f^opt) for the condition number κ := LR²/μ, using κ ≥ 1.

Proof. This lemma is a direct corollary of Lemma D.1 with x⁺ = x₀ (so that g = 0) and ε = f_{x₀,λ}(x₀) − f_{x₀,λ}^opt ≤ f(x₀) − f^opt.

Lemma D.5. Suppose that for all x we have f₁(x) = ψ₁ + (μ̃/2)‖x − v₁‖² and f₂(x) = ψ₂ + (μ̃/2)‖x − v₂‖². Then

    α f₁(x) + (1 − α) f₂(x) = ψ_α + (μ̃/2)‖x − v_α‖²,

where

    v_α = α v₁ + (1 − α) v₂   and   ψ_α = α ψ₁ + (1 − α) ψ₂ + (μ̃/2) α(1 − α)‖v₁ − v₂‖².

Proof. Setting the gradient of α f₁(x) + (1 − α) f₂(x) to 0, we see that v_α must satisfy

    α μ̃ (v_α − v₁) + (1 − α) μ̃ (v_α − v₂) = 0,

and thus v_α = α v₁ + (1 − α) v₂. Finally,

    ψ_α = α [ψ₁ + (μ̃/2)‖v_α − v₁‖²] + (1 − α)[ψ₂ + (μ̃/2)‖v_α − v₂‖²]
        = α ψ₁ + (1 − α) ψ₂ + (μ̃/2)[α(1 − α)² + α²(1 − α)]‖v₁ − v₂‖²
        = α ψ₁ + (1 − α) ψ₂ + (μ̃/2) α(1 − α)‖v₁ − v₂‖².
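Lemma D.5 is purely algebraic and easy to check numerically; the sketch below verifies the identity on random inputs (the curvature, offsets, and centers are illustrative assumptions).

```python
import numpy as np

# Numerical check of Lemma D.5: a convex combination of two quadratics with
# identical curvature is again such a quadratic, with the stated center
# v_alpha and offset psi_alpha.
rng = np.random.default_rng(3)
curv, alpha = 1.3, 0.4          # shared curvature (mu-tilde in the lemma)
psi1, psi2 = 0.7, -0.2
v1, v2 = rng.standard_normal(4), rng.standard_normal(4)

f1 = lambda x: psi1 + curv / 2 * np.sum((x - v1) ** 2)
f2 = lambda x: psi2 + curv / 2 * np.sum((x - v2) ** 2)

v_a = alpha * v1 + (1 - alpha) * v2
psi_a = (alpha * psi1 + (1 - alpha) * psi2
         + curv / 2 * alpha * (1 - alpha) * np.sum((v1 - v2) ** 2))
fa = lambda x: psi_a + curv / 2 * np.sum((x - v_a) ** 2)

for _ in range(100):
    x = rng.standard_normal(4)
    assert abs(alpha * f1(x) + (1 - alpha) * f2(x) - fa(x)) < 1e-10
print("Lemma D.5 identity verified")
```

This is exactly the combination step used to form ψ_{t+1} from ψ_t and the fresh lower bound of Lemma D.1 in the proof of Lemma D.2.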


More information

PARTIAL SEMIGROUPS AND PRIMALITY INDICATORS IN THE FRACTAL GENERATION OF BINOMIAL COEFFICIENTS TO A PRIME SQUARE MODULUS

PARTIAL SEMIGROUPS AND PRIMALITY INDICATORS IN THE FRACTAL GENERATION OF BINOMIAL COEFFICIENTS TO A PRIME SQUARE MODULUS PARTIAL SEMIGROUPS AND PRIMALITY INDICATORS IN THE FRACTAL GENERATION OF BINOMIAL COEFFICIENTS TO A PRIME SQUARE MODULUS D. DOAN, B. KIVUNGE 2, J.J. POOLE 3, J.D.H SMITH 4, T. SYKES, AND M. TEPLITSKIY

More information

Uses of duality. Geoff Gordon & Ryan Tibshirani Optimization /

Uses of duality. Geoff Gordon & Ryan Tibshirani Optimization / Uses of duality Geoff Gordon & Ryan Tibshirani Optimization 10-725 / 36-725 1 Remember conjugate functions Given f : R n R, the function is called its conjugate f (y) = max x R n yt x f(x) Conjugates appear

More information

Dual Decomposition.

Dual Decomposition. 1/34 Dual Decomposition http://bicmr.pku.edu.cn/~wenzw/opt-2017-fall.html Acknowledgement: this slides is based on Prof. Lieven Vandenberghes lecture notes Outline 2/34 1 Conjugate function 2 introduction:

More information

Unconstrained optimization

Unconstrained optimization Chapter 4 Unconstrained optimization An unconstrained optimization problem takes the form min x Rnf(x) (4.1) for a target functional (also called objective function) f : R n R. In this chapter and throughout

More information

Quantitative stability of the Brunn-Minkowski inequality for sets of equal volume

Quantitative stability of the Brunn-Minkowski inequality for sets of equal volume Quantitative stabilit of the Brunn-Minkowski inequalit for sets of equal volume Alessio Figalli and David Jerison Abstract We prove a quantitative stabilit result for the Brunn-Minkowski inequalit on sets

More information

Additional Homework Problems

Additional Homework Problems Additional Homework Problems Robert M. Freund April, 2004 2004 Massachusetts Institute of Technology. 1 2 1 Exercises 1. Let IR n + denote the nonnegative orthant, namely IR + n = {x IR n x j ( ) 0,j =1,...,n}.

More information

Lecture 5. Theorems of Alternatives and Self-Dual Embedding

Lecture 5. Theorems of Alternatives and Self-Dual Embedding IE 8534 1 Lecture 5. Theorems of Alternatives and Self-Dual Embedding IE 8534 2 A system of linear equations may not have a solution. It is well known that either Ax = c has a solution, or A T y = 0, c

More information

MAT389 Fall 2016, Problem Set 2

MAT389 Fall 2016, Problem Set 2 MAT389 Fall 2016, Problem Set 2 Circles in the Riemann sphere Recall that the Riemann sphere is defined as the set Let P be the plane defined b Σ = { (a, b, c) R 3 a 2 + b 2 + c 2 = 1 } P = { (a, b, c)

More information

Solution Methods. Richard Lusby. Department of Management Engineering Technical University of Denmark

Solution Methods. Richard Lusby. Department of Management Engineering Technical University of Denmark Solution Methods Richard Lusby Department of Management Engineering Technical University of Denmark Lecture Overview (jg Unconstrained Several Variables Quadratic Programming Separable Programming SUMT

More information

arxiv: v1 [math.oc] 1 Jul 2016

arxiv: v1 [math.oc] 1 Jul 2016 Convergence Rate of Frank-Wolfe for Non-Convex Objectives Simon Lacoste-Julien INRIA - SIERRA team ENS, Paris June 8, 016 Abstract arxiv:1607.00345v1 [math.oc] 1 Jul 016 We give a simple proof that the

More information

Lecture 13 Newton-type Methods A Newton Method for VIs. October 20, 2008

Lecture 13 Newton-type Methods A Newton Method for VIs. October 20, 2008 Lecture 13 Newton-type Methods A Newton Method for VIs October 20, 2008 Outline Quick recap of Newton methods for composite functions Josephy-Newton methods for VIs A special case: mixed complementarity

More information

SOR- and Jacobi-type Iterative Methods for Solving l 1 -l 2 Problems by Way of Fenchel Duality 1

SOR- and Jacobi-type Iterative Methods for Solving l 1 -l 2 Problems by Way of Fenchel Duality 1 SOR- and Jacobi-type Iterative Methods for Solving l 1 -l 2 Problems by Way of Fenchel Duality 1 Masao Fukushima 2 July 17 2010; revised February 4 2011 Abstract We present an SOR-type algorithm and a

More information

Lecture 4: Exact ODE s

Lecture 4: Exact ODE s Lecture 4: Exact ODE s Dr. Michael Doughert Januar 23, 203 Exact equations are first-order ODE s of a particular form, and whose methods of solution rel upon basic facts concerning partial derivatives,

More information

The Entropy Power Inequality and Mrs. Gerber s Lemma for Groups of order 2 n

The Entropy Power Inequality and Mrs. Gerber s Lemma for Groups of order 2 n The Entrop Power Inequalit and Mrs. Gerber s Lemma for Groups of order 2 n Varun Jog EECS, UC Berkele Berkele, CA-94720 Email: varunjog@eecs.berkele.edu Venkat Anantharam EECS, UC Berkele Berkele, CA-94720

More information

Optimality Conditions for Constrained Optimization

Optimality Conditions for Constrained Optimization 72 CHAPTER 7 Optimality Conditions for Constrained Optimization 1. First Order Conditions In this section we consider first order optimality conditions for the constrained problem P : minimize f 0 (x)

More information

Lecture: Duality of LP, SOCP and SDP

Lecture: Duality of LP, SOCP and SDP 1/33 Lecture: Duality of LP, SOCP and SDP Zaiwen Wen Beijing International Center For Mathematical Research Peking University http://bicmr.pku.edu.cn/~wenzw/bigdata2017.html wenzw@pku.edu.cn Acknowledgement:

More information

Sec. 1.1: Basics of Vectors

Sec. 1.1: Basics of Vectors Sec. 1.1: Basics of Vectors Notation for Euclidean space R n : all points (x 1, x 2,..., x n ) in n-dimensional space. Examples: 1. R 1 : all points on the real number line. 2. R 2 : all points (x 1, x

More information

A Beckman-Quarles type theorem for linear fractional transformations of the extended double plane

A Beckman-Quarles type theorem for linear fractional transformations of the extended double plane Rose- Hulman Undergraduate Mathematics Journal A Beckman-Quarles tpe theorem for linear fractional transformations of the extended double plane Andrew Jullian Mis a Josh Keilman b Volume 12, No. 2, Fall

More information

Implications of the Constant Rank Constraint Qualification

Implications of the Constant Rank Constraint Qualification Mathematical Programming manuscript No. (will be inserted by the editor) Implications of the Constant Rank Constraint Qualification Shu Lu Received: date / Accepted: date Abstract This paper investigates

More information

Finite Dimensional Optimization Part III: Convex Optimization 1

Finite Dimensional Optimization Part III: Convex Optimization 1 John Nachbar Washington University March 21, 2017 Finite Dimensional Optimization Part III: Convex Optimization 1 1 Saddle points and KKT. These notes cover another important approach to optimization,

More information

Unconstrained minimization of smooth functions

Unconstrained minimization of smooth functions Unconstrained minimization of smooth functions We want to solve min x R N f(x), where f is convex. In this section, we will assume that f is differentiable (so its gradient exists at every point), and

More information

Second Proof: Every Positive Integer is a Frobenius Number of Three Generators

Second Proof: Every Positive Integer is a Frobenius Number of Three Generators International Mathematical Forum, Vol., 5, no. 5, - 7 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/.988/imf.5.54 Second Proof: Ever Positive Integer is a Frobenius Number of Three Generators Firu Kamalov

More information

Examination paper for TMA4180 Optimization I

Examination paper for TMA4180 Optimization I Department of Mathematical Sciences Examination paper for TMA4180 Optimization I Academic contact during examination: Phone: Examination date: 26th May 2016 Examination time (from to): 09:00 13:00 Permitted

More information

Primal-dual coordinate descent A Coordinate Descent Primal-Dual Algorithm with Large Step Size and Possibly Non-Separable Functions

Primal-dual coordinate descent A Coordinate Descent Primal-Dual Algorithm with Large Step Size and Possibly Non-Separable Functions Primal-dual coordinate descent A Coordinate Descent Primal-Dual Algorithm with Large Step Size and Possibly Non-Separable Functions Olivier Fercoq and Pascal Bianchi Problem Minimize the convex function

More information

Math 361: Homework 1 Solutions

Math 361: Homework 1 Solutions January 3, 4 Math 36: Homework Solutions. We say that two norms and on a vector space V are equivalent or comparable if the topology they define on V are the same, i.e., for any sequence of vectors {x

More information

REVIEW OF DIFFERENTIAL CALCULUS

REVIEW OF DIFFERENTIAL CALCULUS REVIEW OF DIFFERENTIAL CALCULUS DONU ARAPURA 1. Limits and continuity To simplify the statements, we will often stick to two variables, but everything holds with any number of variables. Let f(x, y) be

More information

Gradient Sliding for Composite Optimization

Gradient Sliding for Composite Optimization Noname manuscript No. (will be inserted by the editor) Gradient Sliding for Composite Optimization Guanghui Lan the date of receipt and acceptance should be inserted later Abstract We consider in this

More information

LMI Methods in Optimal and Robust Control

LMI Methods in Optimal and Robust Control LMI Methods in Optimal and Robust Control Matthew M. Peet Arizona State University Lecture 15: Nonlinear Systems and Lyapunov Functions Overview Our next goal is to extend LMI s and optimization to nonlinear

More information

A sharp Blaschke Santaló inequality for α-concave functions

A sharp Blaschke Santaló inequality for α-concave functions Noname manuscript No will be inserted b the editor) A sharp Blaschke Santaló inequalit for α-concave functions Liran Rotem the date of receipt and acceptance should be inserted later Abstract We define

More information

PDE Constrained Optimization selected Proofs

PDE Constrained Optimization selected Proofs PDE Constrained Optimization selected Proofs Jeff Snider jeff@snider.com George Mason University Department of Mathematical Sciences April, 214 Outline 1 Prelim 2 Thms 3.9 3.11 3 Thm 3.12 4 Thm 3.13 5

More information

arxiv: v7 [math.oc] 22 Feb 2018

arxiv: v7 [math.oc] 22 Feb 2018 A SMOOTH PRIMAL-DUAL OPTIMIZATION FRAMEWORK FOR NONSMOOTH COMPOSITE CONVEX MINIMIZATION QUOC TRAN-DINH, OLIVIER FERCOQ, AND VOLKAN CEVHER arxiv:1507.06243v7 [math.oc] 22 Feb 2018 Abstract. We propose a

More information

Newton s Method. Ryan Tibshirani Convex Optimization /36-725

Newton s Method. Ryan Tibshirani Convex Optimization /36-725 Newton s Method Ryan Tibshirani Convex Optimization 10-725/36-725 1 Last time: dual correspondences Given a function f : R n R, we define its conjugate f : R n R, Properties and examples: f (y) = max x

More information

Lecture 8. Strong Duality Results. September 22, 2008

Lecture 8. Strong Duality Results. September 22, 2008 Strong Duality Results September 22, 2008 Outline Lecture 8 Slater Condition and its Variations Convex Objective with Linear Inequality Constraints Quadratic Objective over Quadratic Constraints Representation

More information

Second Order Optimality Conditions for Constrained Nonlinear Programming

Second Order Optimality Conditions for Constrained Nonlinear Programming Second Order Optimality Conditions for Constrained Nonlinear Programming Lecture 10, Continuous Optimisation Oxford University Computing Laboratory, HT 2006 Notes by Dr Raphael Hauser (hauser@comlab.ox.ac.uk)

More information

On the Local Quadratic Convergence of the Primal-Dual Augmented Lagrangian Method

On the Local Quadratic Convergence of the Primal-Dual Augmented Lagrangian Method Optimization Methods and Software Vol. 00, No. 00, Month 200x, 1 11 On the Local Quadratic Convergence of the Primal-Dual Augmented Lagrangian Method ROMAN A. POLYAK Department of SEOR and Mathematical

More information

8. Conjugate functions

8. Conjugate functions L. Vandenberghe EE236C (Spring 2013-14) 8. Conjugate functions closed functions conjugate function 8-1 Closed set a set C is closed if it contains its boundary: x k C, x k x = x C operations that preserve

More information

Some Results Concerning Uniqueness of Triangle Sequences

Some Results Concerning Uniqueness of Triangle Sequences Some Results Concerning Uniqueness of Triangle Sequences T. Cheslack-Postava A. Diesl M. Lepinski A. Schuyler August 12 1999 Abstract In this paper we will begin by reviewing the triangle iteration. We

More information

Lecture 5. The Dual Cone and Dual Problem

Lecture 5. The Dual Cone and Dual Problem IE 8534 1 Lecture 5. The Dual Cone and Dual Problem IE 8534 2 For a convex cone K, its dual cone is defined as K = {y x, y 0, x K}. The inner-product can be replaced by x T y if the coordinates of the

More information

Shiqian Ma, MAT-258A: Numerical Optimization 1. Chapter 4. Subgradient

Shiqian Ma, MAT-258A: Numerical Optimization 1. Chapter 4. Subgradient Shiqian Ma, MAT-258A: Numerical Optimization 1 Chapter 4 Subgradient Shiqian Ma, MAT-258A: Numerical Optimization 2 4.1. Subgradients definition subgradient calculus duality and optimality conditions Shiqian

More information

Accelerated Randomized Primal-Dual Coordinate Method for Empirical Risk Minimization

Accelerated Randomized Primal-Dual Coordinate Method for Empirical Risk Minimization Accelerated Randomized Primal-Dual Coordinate Method for Empirical Risk Minimization Lin Xiao (Microsoft Research) Joint work with Qihang Lin (CMU), Zhaosong Lu (Simon Fraser) Yuchen Zhang (UC Berkeley)

More information

A GLOBALLY CONVERGENT STABILIZED SQP METHOD: SUPERLINEAR CONVERGENCE

A GLOBALLY CONVERGENT STABILIZED SQP METHOD: SUPERLINEAR CONVERGENCE A GLOBALLY CONVERGENT STABILIZED SQP METHOD: SUPERLINEAR CONVERGENCE Philip E. Gill Vyacheslav Kungurtsev Daniel P. Robinson UCSD Center for Computational Mathematics Technical Report CCoM-14-1 June 30,

More information

Stability of Feedback Solutions for Infinite Horizon Noncooperative Differential Games

Stability of Feedback Solutions for Infinite Horizon Noncooperative Differential Games Stability of Feedback Solutions for Infinite Horizon Noncooperative Differential Games Alberto Bressan ) and Khai T. Nguyen ) *) Department of Mathematics, Penn State University **) Department of Mathematics,

More information

Relative-Continuity for Non-Lipschitz Non-Smooth Convex Optimization using Stochastic (or Deterministic) Mirror Descent

Relative-Continuity for Non-Lipschitz Non-Smooth Convex Optimization using Stochastic (or Deterministic) Mirror Descent Relative-Continuity for Non-Lipschitz Non-Smooth Convex Optimization using Stochastic (or Deterministic) Mirror Descent Haihao Lu August 3, 08 Abstract The usual approach to developing and analyzing first-order

More information

Stochastic model-based minimization under high-order growth

Stochastic model-based minimization under high-order growth Stochastic model-based minimization under high-order growth Damek Davis Dmitriy Drusvyatskiy Kellie J. MacPhee Abstract Given a nonsmooth, nonconvex minimization problem, we consider algorithms that iteratively

More information

min f(x). (2.1) Objectives consisting of a smooth convex term plus a nonconvex regularization term;

min f(x). (2.1) Objectives consisting of a smooth convex term plus a nonconvex regularization term; Chapter 2 Gradient Methods The gradient method forms the foundation of all of the schemes studied in this book. We will provide several complementary perspectives on this algorithm that highlight the many

More information

minimize x subject to (x 2)(x 4) u,

minimize x subject to (x 2)(x 4) u, Math 6366/6367: Optimization and Variational Methods Sample Preliminary Exam Questions 1. Suppose that f : [, L] R is a C 2 -function with f () on (, L) and that you have explicit formulae for

More information

LECTURE 10 LECTURE OUTLINE

LECTURE 10 LECTURE OUTLINE LECTURE 10 LECTURE OUTLINE Min Common/Max Crossing Th. III Nonlinear Farkas Lemma/Linear Constraints Linear Programming Duality Convex Programming Duality Optimality Conditions Reading: Sections 4.5, 5.1,5.2,

More information

Orthogonality. 6.1 Orthogonal Vectors and Subspaces. Chapter 6

Orthogonality. 6.1 Orthogonal Vectors and Subspaces. Chapter 6 Chapter 6 Orthogonality 6.1 Orthogonal Vectors and Subspaces Recall that if nonzero vectors x, y R n are linearly independent then the subspace of all vectors αx + βy, α, β R (the space spanned by x and

More information

Problem 1 (Exercise 2.2, Monograph)

Problem 1 (Exercise 2.2, Monograph) MS&E 314/CME 336 Assignment 2 Conic Linear Programming January 3, 215 Prof. Yinyu Ye 6 Pages ASSIGNMENT 2 SOLUTIONS Problem 1 (Exercise 2.2, Monograph) We prove the part ii) of Theorem 2.1 (Farkas Lemma

More information