Part 3: Trust-region methods for unconstrained optimization. Nick Gould (RAL)


Part 3: Trust-region methods for unconstrained optimization
Nick Gould (RAL)

  minimize_{x ∈ R^n} f(x)

MSc course on nonlinear optimization

UNCONSTRAINED MINIMIZATION

  minimize_{x ∈ R^n} f(x)

where the objective function f : R^n → R
  assume that f ∈ C^1 (sometimes C^2) with Lipschitz continuous derivatives
  in practice this assumption is often violated, but it is not essential

LINESEARCH VS TRUST-REGION METHODS

Linesearch methods:
  pick a descent direction p_k
  pick a stepsize α_k to reduce f(x_k + α p_k)
  set x_{k+1} = x_k + α_k p_k

Trust-region methods:
  pick a step s_k to reduce a model of f(x_k + s)
  accept x_{k+1} = x_k + s_k if the decrease predicted by the model is inherited by f(x_k + s_k)
  otherwise set x_{k+1} = x_k and refine the model

TRUST-REGION MODEL PROBLEM

Model f(x_k + s) by:
  a linear model      m_k^L(s) = f_k + g_k^T s
  a quadratic model   m_k^Q(s) = f_k + g_k^T s + (1/2) s^T B_k s,   with B_k symmetric

Major difficulties:
  the models may not resemble f(x_k + s) if s is large
  the models may be unbounded from below:
    the linear model always is, unless g_k = 0
    the quadratic model always is if B_k is indefinite, and possibly if B_k is only positive semi-definite

THE TRUST REGION

Prevent the model m_k(s) from unboundedness by imposing a trust-region constraint

  ||s|| ≤ Δ_k   for some suitable scalar radius Δ_k > 0

⇒ the trust-region subproblem

  approximately minimize_{s ∈ R^n} m_k(s)   subject to   ||s|| ≤ Δ_k

in theory the analysis does not depend on the norm; in practice it might!

OUR MODEL

For simplicity, concentrate on the second-order (Newton-like) model

  m_k(s) = m_k^Q(s) = f_k + g_k^T s + (1/2) s^T B_k s

and the ℓ_2 trust-region norm ||·|| = ||·||_2.

Note: B_k = H_k is allowed. The analysis for other trust-region norms simply adds extra constants to the following results.

BASIC TRUST-REGION METHOD

Given k = 0, Δ_0 > 0 and x_0, until convergence do:

  Build the second-order model m_k(s) of f(x_k + s).
  Solve the trust-region subproblem to find s_k for which m_k(s_k) < f_k and ||s_k|| ≤ Δ_k, and define

      ρ_k = [f_k - f(x_k + s_k)] / [f_k - m_k(s_k)].

  If ρ_k ≥ η_v                      [very successful]   (0 < η_v < 1)
      set x_{k+1} = x_k + s_k and Δ_{k+1} = γ_i Δ_k      (γ_i ≥ 1)
  Otherwise, if ρ_k ≥ η_s           [successful]         (0 < η_s ≤ η_v < 1)
      set x_{k+1} = x_k + s_k and Δ_{k+1} = Δ_k
  Otherwise                         [unsuccessful]
      set x_{k+1} = x_k and Δ_{k+1} = γ_d Δ_k            (0 < γ_d < 1)

  Increase k by 1.

SOLVE THE TRUST-REGION SUBPROBLEM?

At the very least, aim to achieve as much reduction in the model as would one iteration of steepest descent.

Cauchy point: s_k^C = -α_k^C g_k, where

  α_k^C = arg min_{α > 0} m_k(-α g_k)   subject to   α ||g_k|| ≤ Δ_k
        = arg min_{0 < α ≤ Δ_k / ||g_k||} m_k(-α g_k)

i.e. minimize a quadratic on a line segment ⇒ very easy!

Require that m_k(s_k) ≤ m_k(s_k^C) and ||s_k|| ≤ Δ_k; in practice, hope to do far better than this.
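For illustration, here is a minimal Python/NumPy sketch of the basic iteration above, using the Cauchy point as the (crude) approximate subproblem solver. The function names (trust_region, cauchy_point) and parameter values are my own illustrative choices, not part of the original slides.

import numpy as np

def cauchy_point(g, B, Delta):
    # Minimize m(-alpha*g) = f - alpha*||g||^2 + 0.5*alpha^2*g'Bg over 0 < alpha <= Delta/||g||.
    gnorm = np.linalg.norm(g)
    gBg = g @ B @ g
    alpha_max = Delta / gnorm
    if gBg <= 0:
        alpha = alpha_max                      # nonpositive curvature: go to the boundary
    else:
        alpha = min(gnorm**2 / gBg, alpha_max)
    return -alpha * g

def trust_region(f, grad, hess, x0, Delta0=1.0, eta_v=0.9, eta_s=0.1,
                 gamma_i=2.0, gamma_d=0.5, tol=1e-8, max_iter=200):
    x, Delta = np.asarray(x0, dtype=float), Delta0
    for _ in range(max_iter):
        fk, gk, Bk = f(x), grad(x), hess(x)
        if np.linalg.norm(gk) <= tol:
            break
        s = cauchy_point(gk, Bk, Delta)                  # any s with m(s) <= m(s^C) would do
        model_decrease = -(gk @ s + 0.5 * s @ Bk @ s)    # f_k - m_k(s_k) > 0 for the Cauchy point
        rho = (fk - f(x + s)) / model_decrease
        if rho >= eta_s:                                 # successful: accept the step
            x = x + s
        if rho >= eta_v:                                 # very successful: expand the radius
            Delta *= gamma_i
        elif rho < eta_s:                                # unsuccessful: shrink the radius
            Delta *= gamma_d
    return x

# usage on a simple convex quadratic (minimizer at [1, -2]):
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -3.0])
f = lambda x: 0.5 * x @ A @ x - b @ x
grad = lambda x: A @ x - b
hess = lambda x: A
print(trust_region(f, grad, hess, np.zeros(2)))          # approximately the solution of A x = b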

ACHIEVABLE MODEL DECREASE

Theorem 3.1. If m_k(s) is the second-order model and s_k^C is its Cauchy point within the trust region ||s|| ≤ Δ_k, then

  f_k - m_k(s_k^C) ≥ (1/2) ||g_k|| min[ ||g_k|| / (1 + ||B_k||), Δ_k ].

PROOF OF THEOREM 3.1

  m_k(-α g_k) = f_k - α ||g_k||^2 + (1/2) α^2 g_k^T B_k g_k.

The result is immediate if g_k = 0. Otherwise, there are 3 possibilities:

(i) the curvature g_k^T B_k g_k ≤ 0 ⇒ m_k(-α g_k) is unbounded from below as α increases ⇒ the Cauchy point occurs on the trust-region boundary.

(ii) the curvature g_k^T B_k g_k > 0 and the minimizer of m_k(-α g_k) occurs at or beyond the trust-region boundary ⇒ the Cauchy point occurs on the trust-region boundary.

(iii) the curvature g_k^T B_k g_k > 0 and the minimizer of m_k(-α g_k), and hence the Cauchy point, occurs before the trust-region boundary is reached.

Consider each case in turn.

Case (i): g_k^T B_k g_k ≤ 0 and α ≥ 0 ⇒

  m_k(-α g_k) = f_k - α ||g_k||^2 + (1/2) α^2 g_k^T B_k g_k ≤ f_k - α ||g_k||^2.   (1)

The Cauchy point lies on the boundary of the trust region ⇒

  α_k^C = Δ_k / ||g_k||.   (2)

(1) + (2) ⇒

  f_k - m_k(s_k^C) ≥ ||g_k||^2 Δ_k / ||g_k|| = ||g_k|| Δ_k ≥ (1/2) ||g_k|| Δ_k.

Case (ii): here

  α_k^* := arg min_{α ≥ 0} [ f_k - α ||g_k||^2 + (1/2) α^2 g_k^T B_k g_k ]   (3)

  ⇒ α_k^* = ||g_k||^2 / (g_k^T B_k g_k),   while   α_k^C = Δ_k / ||g_k|| ≤ α_k^*   (4)

  ⇒ α_k^C g_k^T B_k g_k ≤ ||g_k||^2.   (5)

(3) + (4) + (5) ⇒

  f_k - m_k(s_k^C) = α_k^C ||g_k||^2 - (1/2) [α_k^C]^2 g_k^T B_k g_k
                   ≥ (1/2) α_k^C ||g_k||^2 = (1/2) ||g_k||^2 Δ_k / ||g_k|| = (1/2) ||g_k|| Δ_k.

Case (iii): here α_k^C = α_k^* = ||g_k||^2 / (g_k^T B_k g_k), and

  f_k - m_k(s_k^C) = α_k^* ||g_k||^2 - (1/2) (α_k^*)^2 g_k^T B_k g_k
                   = ||g_k||^4 / (g_k^T B_k g_k) - (1/2) ||g_k||^4 / (g_k^T B_k g_k)
                   = (1/2) ||g_k||^4 / (g_k^T B_k g_k)
                   ≥ (1/2) ||g_k||^2 / (1 + ||B_k||),

since g_k^T B_k g_k ≤ ||B_k|| ||g_k||^2 ≤ (1 + ||B_k||) ||g_k||^2 because of the Cauchy-Schwarz inequality.

Corollary 3.2. If m_k(s) is the second-order model, and s_k is an improvement on the Cauchy point within the trust region ||s|| ≤ Δ_k, then

  f_k - m_k(s_k) ≥ (1/2) ||g_k|| min[ ||g_k|| / (1 + ||B_k||), Δ_k ].
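A quick numerical sanity check of the bound in Theorem 3.1 / Corollary 3.2; the random test data and the tolerance are purely illustrative choices of mine, not part of the slides.

import numpy as np

rng = np.random.default_rng(0)
for _ in range(1000):
    n = 5
    A = rng.standard_normal((n, n))
    B = 0.5 * (A + A.T)                          # symmetric, typically indefinite
    g = rng.standard_normal(n)
    Delta = rng.uniform(0.1, 2.0)
    # Cauchy point: minimize f - alpha*||g||^2 + 0.5*alpha^2*g'Bg over 0 < alpha <= Delta/||g||
    gnorm, gBg = np.linalg.norm(g), g @ B @ g
    alpha = Delta / gnorm if gBg <= 0 else min(gnorm**2 / gBg, Delta / gnorm)
    s_c = -alpha * g
    decrease = -(g @ s_c + 0.5 * s_c @ B @ s_c)  # f_k - m_k(s_k^C)
    bound = 0.5 * gnorm * min(gnorm / (1 + np.linalg.norm(B, 2)), Delta)
    assert decrease >= bound - 1e-12             # Theorem 3.1 holds on this instance
print("Theorem 3.1 bound held on all random instances")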

DIFFERENCE BETWEEN MODEL AND FUNCTION

Lemma 3.3. Suppose that f ∈ C^2, and that the true and model Hessians satisfy the bounds ||H(x)|| ≤ κ_h for all x and ||B_k|| ≤ κ_b for all k, for some κ_h ≥ 1 and κ_b ≥ 0. Then, where κ_d = (1/2)(κ_h + κ_b),

  |f(x_k + s_k) - m_k(s_k)| ≤ κ_d Δ_k^2   for all k.

PROOF OF LEMMA 3.3

The mean value theorem gives

  f(x_k + s_k) = f(x_k) + s_k^T ∇_x f(x_k) + (1/2) s_k^T ∇_xx f(ξ_k) s_k

for some ξ_k ∈ [x_k, x_k + s_k]. Thus

  |f(x_k + s_k) - m_k(s_k)| = |(1/2) s_k^T H(ξ_k) s_k - (1/2) s_k^T B_k s_k|
                            ≤ (1/2) |s_k^T H(ξ_k) s_k| + (1/2) |s_k^T B_k s_k|
                            ≤ (1/2)(κ_h + κ_b) ||s_k||^2 ≤ κ_d Δ_k^2

using the triangle and Cauchy-Schwarz inequalities.

ULTIMATE PROGRESS AT NON-OPTIMAL POINTS

Lemma 3.4. Suppose that f ∈ C^2, that the true and model Hessians satisfy the bounds ||H_k|| ≤ κ_h and ||B_k|| ≤ κ_b for all k, for some κ_h ≥ 1 and κ_b ≥ 0, and that κ_d = (1/2)(κ_h + κ_b). Suppose furthermore that g_k ≠ 0 and that

  Δ_k ≤ ||g_k|| min[ 1 / (κ_h + κ_b), (1 - η_v) / (2 κ_d) ].

Then iteration k is very successful and Δ_{k+1} ≥ Δ_k.

PROOF OF LEMMA 3.4

By definition, 1 + ||B_k|| ≤ κ_h + κ_b, so the first bound on Δ_k gives

  Δ_k ≤ ||g_k|| / (κ_h + κ_b) ≤ ||g_k|| / (1 + ||B_k||).

Corollary 3.2 then gives

  f_k - m_k(s_k) ≥ (1/2) ||g_k|| min[ ||g_k|| / (1 + ||B_k||), Δ_k ] = (1/2) ||g_k|| Δ_k.

Lemma 3.3 and the second bound on Δ_k then give

  |ρ_k - 1| = |f(x_k + s_k) - m_k(s_k)| / |f_k - m_k(s_k)| ≤ κ_d Δ_k^2 / ((1/2) ||g_k|| Δ_k) = 2 κ_d Δ_k / ||g_k|| ≤ 1 - η_v

⇒ ρ_k ≥ η_v ⇒ the iteration is very successful.

RADIUS WON'T SHRINK TO ZERO AT NON-OPTIMAL POINTS

Lemma 3.5. Suppose that f ∈ C^2, that the true and model Hessians satisfy the bounds ||H_k|| ≤ κ_h and ||B_k|| ≤ κ_b for all k, for some κ_h ≥ 1 and κ_b ≥ 0, and that κ_d = (1/2)(κ_h + κ_b). Suppose furthermore that there exists a constant ε > 0 such that ||g_k|| ≥ ε for all k. Then

  Δ_k ≥ κ_ε := ε γ_d min[ 1 / (κ_h + κ_b), (1 - η_v) / (2 κ_d) ]   for all k.

PROOF OF LEMMA 3.5

Suppose otherwise that iteration k is the first for which Δ_{k+1} < κ_ε. Then Δ_k > Δ_{k+1} ⇒ iteration k was unsuccessful ⇒ Δ_{k+1} = γ_d Δ_k. Hence

  Δ_k < κ_ε / γ_d = ε min[ 1 / (κ_h + κ_b), (1 - η_v) / (2 κ_d) ] ≤ ||g_k|| min[ 1 / (κ_h + κ_b), (1 - η_v) / (2 κ_d) ].

But this contradicts the assertion of Lemma 3.4 that iteration k must then be very successful (and hence that Δ_{k+1} ≥ Δ_k).

POSSIBLE FINITE TERMINATION

Lemma 3.6. Suppose that f ∈ C^2, and that both the true and model Hessians remain bounded for all k. Suppose furthermore that there are only finitely many successful iterations. Then x_k = x_* for all sufficiently large k, and g(x_*) = 0.

PROOF OF LEMMA 3.6

x_{k_0 + j} = x_{k_0 + 1} = x_* for all j ≥ 1, where k_0 is the index of the last successful iterate. All iterations are unsuccessful for sufficiently large k ⇒ {Δ_k} → 0. Lemma 3.4 then implies that if ||g_{k_0 + 1}|| > 0 there must be a successful iteration of index larger than k_0, which is impossible ⇒ g_{k_0 + 1} = 0.

GLOBAL CONVERGENCE OF ONE SEQUENCE

Theorem 3.7. Suppose that f ∈ C^2, and that both the true and model Hessians remain bounded for all k. Then either

  g_l = 0 for some l ≥ 0,   or   lim_{k→∞} f_k = -∞,   or   lim inf_{k→∞} ||g_k|| = 0.

PROOF OF THEOREM 3.7

Let S be the index set of successful iterations. Lemma 3.6 ⇒ Theorem 3.7 is true when |S| is finite. So consider |S| = ∞, and suppose that f_k is bounded below and that

  ||g_k|| ≥ ε   (6)

for some ε > 0 and all k, and consider some k ∈ S. Corollary 3.2, Lemma 3.5, and the assumption (6) ⇒

  f_k - f_{k+1} ≥ η_s [f_k - m_k(s_k)] ≥ δ_ε := (1/2) η_s ε min[ ε / (1 + κ_b), κ_ε ]

⇒  f_0 - f_{k+1} = Σ_{j=0, j∈S}^k [f_j - f_{j+1}] ≥ σ_k δ_ε,

where σ_k is the number of successful iterations up to iteration k. But lim_{k→∞} σ_k = +∞, so f_k would be unbounded below: a contradiction ⇒ a subsequence of the ||g_k|| → 0.

GLOBAL CONVERGENCE

Theorem 3.8. Suppose that f ∈ C^2, and that both the true and model Hessians remain bounded for all k. Then either

  g_l = 0 for some l ≥ 0,   or   lim_{k→∞} f_k = -∞,   or   lim_{k→∞} g_k = 0.

PROOF OF THEOREM 3.8

Suppose otherwise that f_k is bounded from below, and that there is a subsequence {t_i} ⊆ S such that

  ||g_{t_i}|| ≥ 2ε > 0   (7)

for some ε > 0 and for all i. Theorem 3.7 ⇒ there is a subsequence {l_i} ⊆ S such that

  ||g_k|| ≥ ε for t_i ≤ k < l_i   and   ||g_{l_i}|| < ε.   (8)

Now restrict attention to the indices in

  K := {k ∈ S : t_i ≤ k < l_i for some i}.

Figure 3.1: The subsequences {t_i}, {l_i} and K ⊆ S of the proof of Theorem 3.8, with ||g_k|| plotted against k and the levels ε and 2ε marked.

As in the proof of Theorem 3.7, (8) ⇒

  f_k - f_{k+1} ≥ η_s [f_k - m_k(s_k)] ≥ (1/2) η_s ε min[ ε / (1 + κ_b), Δ_k ]   (9)

for all k ∈ K. The left-hand side of (9) → 0 as k → ∞ ⇒ lim_{k→∞, k∈K} Δ_k = 0, and so (9) ⇒

  Δ_k ≤ 2 [f_k - f_{k+1}] / (ε η_s)   for k ∈ K sufficiently large.

Hence

  ||x_{t_i} - x_{l_i}|| ≤ Σ_{j=t_i, j∈K}^{l_i - 1} ||x_j - x_{j+1}|| ≤ Σ_{j=t_i, j∈K}^{l_i - 1} Δ_j ≤ (2 / (ε η_s)) [f_{t_i} - f_{l_i}]   (10)

for i sufficiently large. But the right-hand side of (10) → 0, so ||x_{t_i} - x_{l_i}|| → 0 as i tends to infinity, and continuity then gives ||g_{t_i} - g_{l_i}|| → 0.

This is impossible, as ||g_{t_i} - g_{l_i}|| ≥ ε by the definition of {t_i} and {l_i} ⇒ no subsequence satisfying (7) can exist.

II: SOLVING THE TRUST-REGION SUBPROBLEM

(approximately)

  minimize_{s ∈ R^n} q(s) ≡ s^T g + (1/2) s^T B s   subject to   ||s|| ≤ Δ

AIM: find s_* so that q(s_*) ≤ q(s^C) and ||s_*|| ≤ Δ.

Might solve
  exactly ⇒ a Newton-like method
  approximately ⇒ steepest descent / conjugate gradients

THE ℓ_2-NORM TRUST-REGION SUBPROBLEM

  minimize_{s ∈ R^n} q(s) ≡ s^T g + (1/2) s^T B s   subject to   ||s||_2 ≤ Δ

Solution characterisation result:

Theorem 3.9. Any global minimizer s_* of q(s) subject to ||s||_2 ≤ Δ satisfies the equation

  (B + λ_* I) s_* = -g,

where B + λ_* I is positive semi-definite, λ_* ≥ 0 and λ_* (||s_*||_2 - Δ) = 0. If B + λ_* I is positive definite, s_* is unique.

PROOF OF THEOREM 3.9

The problem is equivalent to minimizing q(s) subject to (1/2) Δ^2 - (1/2) s^T s ≥ 0. Theorem 1.9 ⇒

  g + B s_* = -λ_* s_*   (11)

for some Lagrange multiplier λ_* ≥ 0 for which either λ_* = 0 or ||s_*||_2 = Δ (or both). It remains to show that B + λ_* I is positive semi-definite.

If s_* lies in the interior of the trust region, then λ_* = 0, and Theorem 1.10 ⇒ B + λ_* I = B is positive semi-definite.

If ||s_*||_2 = Δ and λ_* = 0, Theorem 1.10 ⇒ v^T B v ≥ 0 for all v ∈ N_+ = {v : s_*^T v ≥ 0}. If v ∉ N_+, then -v ∈ N_+ ⇒ v^T B v ≥ 0 for all v.

The only remaining case is where ||s_*||_2 = Δ and λ_* > 0. Theorem 1.10 ⇒ v^T (B + λ_* I) v ≥ 0 for all v ∈ N_+ = {v : s_*^T v = 0}, so it remains to consider v^T (B + λ_* I) v when s_*^T v ≠ 0.

Figure 3.2: Construction of the missing directions of positive curvature.

Let s be any point on the boundary δR of the trust region R, and let w = s - s_*. Then

  w^T s_* = (s - s_*)^T s_* = -(1/2) (s - s_*)^T (s - s_*) = -(1/2) w^T w   (12)

since ||s||_2 = Δ = ||s_*||_2. (11) + (12) ⇒

  q(s) - q(s_*) = w^T (g + B s_*) + (1/2) w^T B w = -λ_* w^T s_* + (1/2) w^T B w = (1/2) w^T (B + λ_* I) w   (13)

⇒ w^T (B + λ_* I) w ≥ 0, since s_* is a global minimizer. But the choice

  s = s_* - 2 (s_*^T v / v^T v) v ∈ δR

gives (for this s) w ∝ v ⇒ v^T (B + λ_* I) v ≥ 0.

When B + λ_* I is positive definite, s_* = -(B + λ_* I)^{-1} g. If s_* ∈ δR and s ∈ R, (12) and (13) become w^T s_* ≤ -(1/2) w^T w and q(s) ≥ q(s_*) + (1/2) w^T (B + λ_* I) w respectively. Hence q(s) > q(s_*) for any s ≠ s_*. If s_* is interior, λ_* = 0, B is positive definite, and thus s_* is the unique unconstrained minimizer of q(s).

ALGORITHMS FOR THE ℓ_2-NORM SUBPROBLEM

Two cases:

  B positive semi-definite and B s = -g satisfies ||s||_2 ≤ Δ ⇒ s_* = s
  B indefinite, or B s = -g satisfies ||s||_2 > Δ

In the second case,

  (B + λ_* I) s_* = -g   and   s_*^T s_* = Δ^2,

a nonlinear (quadratic) system in s and λ ⇒ concentrate on this.

THE EQUALITY-CONSTRAINED ℓ_2-NORM SUBPROBLEM

Suppose B has the spectral decomposition

  B = U^T Λ U,   U orthogonal with the eigenvectors of B as its rows, Λ diagonal with eigenvalues λ_1 ≤ λ_2 ≤ ... ≤ λ_n.

Require B + λI positive semi-definite ⇒ λ ≥ -λ_1.

Define  s(λ) = -(B + λI)^{-1} g.   Require  ψ(λ) := ||s(λ)||_2^2 = Δ^2.

Note that

  ψ(λ) = ||U^T (Λ + λI)^{-1} U g||_2^2 = Σ_{i=1}^n γ_i^2 / (λ_i + λ)^2,   where γ_i = e_i^T U g.
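The formula for ψ(λ) translates directly into a small dense solver. The following NumPy sketch assumes the easy (non-hard) case: it returns the interior solution when that is valid, and otherwise finds λ ≥ max(0, -λ_1) with ||s(λ)||_2 = Δ. Bisection is used here purely for simplicity; the slides' preferred Newton iteration on the secular equation appears later. All names and tolerances are illustrative.

import numpy as np

def psi(lam, eigvals, gamma):
    # psi(lambda) = ||s(lambda)||_2^2 = sum_i gamma_i^2 / (lambda_i + lambda)^2
    return np.sum(gamma**2 / (eigvals + lam)**2)

def tr_subproblem_eig(B, g, Delta, tol=1e-12):
    eigvals, U = np.linalg.eigh(B)           # B = U diag(eigvals) U^T, eigenvalues ascending
    gamma = U.T @ g                          # gamma_i = u_i^T g
    lam_low = max(0.0, -eigvals[0])
    # interior (or lambda = lam_low) solution if B + lam_low*I is PD and ||s(lam_low)|| <= Delta
    if eigvals[0] + lam_low > 0 and psi(lam_low, eigvals, gamma) <= Delta**2:
        return -U @ (gamma / (eigvals + lam_low)), lam_low
    # otherwise psi decreases on (lam_low, inf): bisect for psi(lam) = Delta^2
    lam_hi = lam_low + np.linalg.norm(g) / Delta + 1.0   # guarantees psi(lam_hi) < Delta^2
    lam = 0.5 * (lam_low + lam_hi)
    while lam_hi - lam_low > tol * max(1.0, lam_hi):
        lam = 0.5 * (lam_low + lam_hi)
        if psi(lam, eigvals, gamma) > Delta**2:
            lam_low = lam                    # still outside the ball: increase lambda
        else:
            lam_hi = lam
    return -U @ (gamma / (eigvals + lam)), lam

# quick check of Theorem 3.9's conditions on random (almost surely non-hard-case) data:
rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6)); B = 0.5 * (A + A.T); g = rng.standard_normal(6); Delta = 0.7
s, lam = tr_subproblem_eig(B, g, Delta)
print(np.linalg.norm((B + lam * np.eye(6)) @ s + g),   # ~0: (B + lam*I)s = -g
      lam,                                             # lam >= 0
      np.linalg.norm(s) - Delta)                       # ~0 if the solution is on the boundary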

CONVEX EXAMPLE

[Figure: ψ(λ) plotted against λ for a small convex example (B positive definite), showing the level Δ^2 and the solution curve traced out as Δ varies; the data B and g are given in the original figure.]

NONCONVEX EXAMPLE

[Figure: ψ(λ) plotted against λ for a small nonconvex example (B indefinite), showing the level Δ^2 and the pole at minus the leftmost eigenvalue; the data B and g are given in the original figure.]

THE HARD CASE

[Figure: ψ(λ) plotted against λ for a hard-case example, in which ψ(λ) stays below Δ^2 as λ approaches minus the leftmost eigenvalue; the data B and g are given in the original figure.]

SUMMARY

For indefinite B, the hard case occurs when g is orthogonal to the eigenvector u_1 for the most negative eigenvalue λ_1:
  OK if the radius Δ is small enough
  no obvious solution to the equations ... but the solution is actually of the form s_lim + σ u_1, where

    s_lim = lim_{λ → -λ_1^+} s(λ)   and   σ is chosen so that ||s_lim + σ u_1||_2 = Δ.
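A sketch of how the hard-case solution above can be formed numerically, under the assumptions that λ_1 is simple (or that g is orthogonal to the whole leftmost eigenspace), that ||s_lim||_2 ≤ Δ, and that the eigendecomposition is affordable. The pseudo-inverse construction and the names are illustrative; a practical code would detect the hard case with a tolerance.

import numpy as np

def tr_hard_case(B, g, Delta):
    # Hard case: lambda_* = -lambda_1, s_lim = -(B - lambda_1 I)^+ g, and a multiple of the
    # leftmost eigenvector u_1 is added so that the step reaches the trust-region boundary.
    eigvals, U = np.linalg.eigh(B)
    lam1, u1 = eigvals[0], U[:, 0]
    s_lim = -np.linalg.pinv(B - lam1 * np.eye(len(g))) @ g   # limit of s(lambda) as lambda -> -lambda_1
    c = s_lim @ u1                                           # = 0 in the exact hard case
    # choose sigma >= 0 with ||s_lim + sigma*u1||_2 = Delta (u1 has unit norm)
    sigma = -c + np.sqrt(max(c**2 + Delta**2 - s_lim @ s_lim, 0.0))
    return s_lim + sigma * u1, -lam1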

HOW TO SOLVE ||s(λ)||_2 = Δ?

DON'T!! Solve instead the secular equation

  φ(λ) := 1 / ||s(λ)||_2 - 1 / Δ = 0

  it has no poles, and is smallest at the eigenvalues (except in the hard case!)
  it is an analytic function ⇒ ideal for Newton's method
  globally convergent (ultimately at a quadratic rate, except in the hard case)
  need to safeguard to protect Newton from the hard and interior-solution cases

THE SECULAR EQUATION

[Figure: φ(λ) plotted against λ for a small two-dimensional example of the form min q(s) subject to ||s|| ≤ Δ; the problem data are given in the original figure.]

NEWTON'S METHOD FOR THE SECULAR EQUATION

The Newton correction at λ is -φ(λ)/φ'(λ). Differentiating

  φ(λ) = 1 / ||s(λ)||_2 - 1 / Δ = (s^T(λ) s(λ))^{-1/2} - 1 / Δ

gives

  φ'(λ) = - [s^T(λ) ∇_λ s(λ)] / (s^T(λ) s(λ))^{3/2} = - [s^T(λ) ∇_λ s(λ)] / ||s(λ)||_2^3.

Differentiating the defining equation

  (B + λI) s(λ) = -g   ⇒   (B + λI) ∇_λ s(λ) + s(λ) = 0.

Notice that, rather than ∇_λ s(λ), merely

  s^T(λ) ∇_λ s(λ) = -s^T(λ) (B + λI)^{-1} s(λ)

is required for φ'(λ). Given the Cholesky factorization B + λI = L(λ) L^T(λ),

  s^T(λ) (B + λI)^{-1} s(λ) = s^T(λ) L^{-T}(λ) L^{-1}(λ) s(λ) = (L^{-1}(λ) s(λ))^T (L^{-1}(λ) s(λ)) = ||w(λ)||_2^2

where L(λ) w(λ) = s(λ).

NEWTON'S METHOD & THE SECULAR EQUATION

Let λ > -λ_1 and Δ > 0 be given.
Until convergence do:
  Factorize B + λI = L L^T
  Solve L L^T s = -g
  Solve L w = s
  Replace λ by λ + [ ||s||_2^2 / ||w||_2^2 ] [ (||s||_2 - Δ) / Δ ]
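A direct NumPy transcription of this iteration. It is a minimal sketch: the iteration count and stopping tolerance are my own choices, and the safeguarding against the hard and interior-solution cases that the slide warns about is omitted, so it assumes a starting λ_0 > -λ_1 on the boundary-solution side.

import numpy as np

def secular_newton(B, g, Delta, lam0, n_iter=50):
    # Newton's method on phi(lambda) = 1/||s(lambda)||_2 - 1/Delta, following the slide.
    lam = lam0
    for _ in range(n_iter):
        L = np.linalg.cholesky(B + lam * np.eye(len(g)))   # B + lam*I = L L^T (needs lam > -lambda_1)
        y = np.linalg.solve(L, -g)                         # forward solve
        s = np.linalg.solve(L.T, y)                        # back solve: (B + lam*I) s = -g
        w = np.linalg.solve(L, s)                          # L w = s
        s_norm, w_norm = np.linalg.norm(s), np.linalg.norm(w)
        lam = lam + (s_norm**2 / w_norm**2) * ((s_norm - Delta) / Delta)
        if abs(s_norm - Delta) <= 1e-12 * Delta:
            break
    return s, lam

# example use: random symmetric B, starting value chosen safely just above -lambda_1
rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5)); B = 0.5 * (A + A.T); g = rng.standard_normal(5); Delta = 0.5
lam0 = max(0.0, -np.linalg.eigvalsh(B)[0]) + 1e-3          # only to get a safe start for the demo
s, lam = secular_newton(B, g, Delta, lam0)
print(np.linalg.norm(s), Delta)                            # should agree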

SOLVING THE LARGE-SCALE PROBLEM

  when n is large, factorization may be impossible
  may instead try to use an iterative method to approximate the solution
  steepest descent leads to the Cauchy point
  obvious generalization: conjugate gradients ...
  ... but what about the trust region? what about negative curvature?

CONJUGATE GRADIENTS TO MINIMIZE q(s)

Given s_0 = 0, set g_0 = g, d_0 = -g and i = 0.
Until ||g_i|| is small or breakdown occurs, iterate:

  α_i = ||g_i||_2^2 / (d_i^T B d_i)
  s_{i+1} = s_i + α_i d_i
  g_{i+1} = g_i + α_i B d_i
  β_i = ||g_{i+1}||_2^2 / ||g_i||_2^2
  d_{i+1} = -g_{i+1} + β_i d_i

and increase i by 1.

Important features:
  g_j = B s_j + g   for all j = 0, ..., i
  d_j^T g_{i+1} = 0   for all j = 0, ..., i
  g_j^T g_{i+1} = 0   for all j = 0, ..., i
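A minimal NumPy sketch of this (unpreconditioned) conjugate-gradient iteration for q(s) = g^T s + (1/2) s^T B s, stopping on a small residual or on the negative-curvature breakdown mentioned above. Names and tolerances are illustrative.

import numpy as np

def cg_quadratic(B, g, tol=1e-10, max_iter=None):
    # Conjugate gradients for q(s) = g^T s + 0.5 s^T B s, started from s_0 = 0.
    n = len(g)
    max_iter = max_iter or n
    s = np.zeros(n)
    r = g.copy()             # r_i = grad q(s_i) = B s_i + g
    d = -r
    for _ in range(max_iter):
        if np.linalg.norm(r) <= tol:
            break
        curv = d @ B @ d
        if curv <= 0:
            raise ValueError("negative curvature: q is unbounded below along d")
        alpha = (r @ r) / curv
        s = s + alpha * d
        r_new = r + alpha * (B @ d)
        beta = (r_new @ r_new) / (r @ r)
        d = -r_new + beta * d
        r = r_new
    return s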

CRUCIAL PROPERTY OF CONJUGATE GRADIENTS

Theorem 3.10. Suppose that the conjugate gradient method is applied to minimize q(s) starting from s_0 = 0, and that d_i^T B d_i > 0 for 0 ≤ i ≤ k. Then the iterates s_j satisfy the inequalities

  ||s_j||_2 < ||s_{j+1}||_2   for 0 ≤ j ≤ k.

PROOF OF THEOREM 3.10

First show that

  d_i^T d_j = ||g_i||_2^2 ||d_j||_2^2 / ||g_j||_2^2 > 0   (14)

for all 0 ≤ j ≤ i ≤ k. For any i, (14) is trivially true for j = i. Suppose it is also true for all 0 ≤ j ≤ i ≤ l. Then the update for d_{l+1} gives

  d_{l+1} = -g_{l+1} + (||g_{l+1}||_2^2 / ||g_l||_2^2) d_l.

Forming the inner product with d_j, and using the fact that d_j^T g_{l+1} = 0 for all j = 0, ..., l, together with (14) when i = l, reveals

  d_{l+1}^T d_j = -g_{l+1}^T d_j + (||g_{l+1}||_2^2 / ||g_l||_2^2) d_l^T d_j
               = (||g_{l+1}||_2^2 / ||g_l||_2^2) (||g_l||_2^2 ||d_j||_2^2 / ||g_j||_2^2)
               = ||g_{l+1}||_2^2 ||d_j||_2^2 / ||g_j||_2^2 > 0.

Thus (14) is true for i = l + 1, and hence for all 0 ≤ j ≤ i ≤ k.

Now we have from the algorithm that

  s_i = s_0 + Σ_{j=0}^{i-1} α_j d_j = Σ_{j=0}^{i-1} α_j d_j

as, by assumption, s_0 = 0. Hence

  s_i^T d_i = Σ_{j=0}^{i-1} α_j d_j^T d_i > 0   (15)

as each α_j > 0, which follows from the definition of α_j, since d_j^T B d_j > 0, and from relationship (14). Hence

  ||s_{i+1}||_2^2 = s_{i+1}^T s_{i+1} = (s_i + α_i d_i)^T (s_i + α_i d_i) = s_i^T s_i + 2 α_i s_i^T d_i + α_i^2 d_i^T d_i > s_i^T s_i = ||s_i||_2^2

follows directly from (15) and α_i > 0, which is the required result.

TRUNCATED CONJUGATE GRADIENTS

Apply the conjugate gradient method, but terminate at iteration i if

  1. d_i^T B d_i ≤ 0   ⇒ the problem is unbounded along d_i
  2. ||s_i + α_i d_i||_2 > Δ   ⇒ the solution lies on the trust-region boundary

In both cases, stop with s_* = s_i + α^B d_i, where α^B is chosen as the positive root of ||s_i + α^B d_i||_2 = Δ.

Crucially,

  q(s_*) ≤ q(s^C)   and   ||s_*||_2 ≤ Δ

⇒ the TR algorithm converges to a first-order critical point.
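A minimal sketch of this truncated (Steihaug-Toint-style) conjugate-gradient method, terminating on the two conditions above by stepping to the trust-region boundary. Function and variable names are illustrative.

import numpy as np

def boundary_step(s, d, Delta):
    # positive root alpha of ||s + alpha*d||_2 = Delta
    a, b, c = d @ d, 2 * (s @ d), s @ s - Delta**2
    return (-b + np.sqrt(b**2 - 4 * a * c)) / (2 * a)

def truncated_cg(B, g, Delta, tol=1e-10, max_iter=None):
    n = len(g)
    max_iter = max_iter or n
    s, r = np.zeros(n), g.copy()
    d = -r
    for _ in range(max_iter):
        if np.linalg.norm(r) <= tol:
            break
        curv = d @ B @ d
        if curv <= 0:                                  # q unbounded along d: go to the boundary
            return s + boundary_step(s, d, Delta) * d
        alpha = (r @ r) / curv
        if np.linalg.norm(s + alpha * d) >= Delta:     # step leaves the region: stop on the boundary
            return s + boundary_step(s, d, Delta) * d
        s = s + alpha * d
        r_new = r + alpha * (B @ d)
        d = -r_new + ((r_new @ r_new) / (r @ r)) * d
        r = r_new
    return s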

HOW GOOD IS TRUNCATED C.G.?

In the convex case ... very good:

Theorem 3.11. Suppose that the truncated conjugate gradient method is applied to minimize q(s), and that B is positive definite. Then the computed and actual solutions to the problem, s_* and s^M, satisfy the bound

  q(s_*) ≤ (1/2) q(s^M).

In the non-convex case ... maybe poor:

  e.g., if g = 0 and B is indefinite ⇒ q(s_*) = 0.

WHAT CAN WE DO IN THE NON-CONVEX CASE?

Solve the problem over a subspace:
  instead of the B-conjugate basis D_i used by CG, use the equivalent Lanczos orthonormal basis
  (Gram-Schmidt applied to the CG (Krylov) basis D_i)

Subspace Q_i = {s : s = Q_i s_q for some s_q ∈ R^i}

The matrix Q_i is such that Q_i^T Q_i = I and Q_i^T B Q_i = T_i, where T_i is tridiagonal, and Q_i^T g = ||g||_2 e_1.

Q_i is trivial to generate from the CG basis D_i.

GENERALIZED LANCZOS TRUST-REGION METHOD

⇒ s_i = Q_i s_q^i, where

  s_i = arg min_{s ∈ Q_i} q(s)   subject to   ||s||_2 ≤ Δ

  s_q^i = arg min_{s_q ∈ R^i} ||g||_2 e_1^T s_q + (1/2) s_q^T T_i s_q   subject to   ||s_q||_2 ≤ Δ

Advantages:
  T_i has very sparse factors ⇒ can solve the problem using the earlier secular-equation approach
  can exploit all the structure here ⇒ use the solution of one problem to initialize the next
  until the trust-region boundary is reached, it is just conjugate gradients ⇒ switch when we get there
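For illustration, a minimal Lanczos sketch that builds the orthonormal basis Q_i and the tridiagonal T_i = Q_i^T B Q_i from which the small subproblem above would be solved. It assumes exact arithmetic and no breakdown; a practical implementation would also need reorthogonalization and the secular-equation solve on T_i. Names are illustrative.

import numpy as np

def lanczos_basis(B, g, i):
    # Build Q = [q_0 ... q_{i-1}] with Q^T Q = I, T = Q^T B Q tridiagonal, Q^T g = ||g||_2 e_1.
    n = len(g)
    Q = np.zeros((n, i))
    alpha = np.zeros(i)          # diagonal of T
    beta = np.zeros(i - 1)       # off-diagonal of T
    Q[:, 0] = g / np.linalg.norm(g)
    for j in range(i):
        v = B @ Q[:, j]
        alpha[j] = Q[:, j] @ v
        v = v - alpha[j] * Q[:, j]
        if j > 0:
            v = v - beta[j - 1] * Q[:, j - 1]
        if j + 1 < i:
            beta[j] = np.linalg.norm(v)
            Q[:, j + 1] = v / beta[j]
    return Q, np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)

# sanity check: Q is orthonormal, Q^T B Q = T is tridiagonal, and Q^T g = ||g||_2 e_1
rng = np.random.default_rng(3)
A = rng.standard_normal((8, 8)); B = 0.5 * (A + A.T); g = rng.standard_normal(8)
Q, T = lanczos_basis(B, g, 5)
print(np.allclose(Q.T @ Q, np.eye(5)),
      np.allclose(Q.T @ B @ Q, T),
      np.allclose(Q.T @ g, np.linalg.norm(g) * np.eye(5)[:, 0]))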
