Global and derivative-free optimization Lectures 1-4
1 Global and derivative-free optimization, Lectures 1-4
Coralia Cartis, University of Oxford
INFOMM CDT: Contemporary Numerical Techniques
Global and derivative-free optimization, lectures 1-4, p. 1/46
2 Lectures 1-4: outline
- Brief overview of derivative-based methods for local NLO.
- Global optimization: definition and overview.
- Derivative-Free Optimization (DFO): motivation and applications.
- Overview of DFO algorithms
  - Model-based (+ with probabilistic models, later)
  - Direct-search, pattern search, Nelder-Mead
  - Implicit filtering (in the context of an application)
- Overview of GO algorithms (briefly)
  - Stochastic methods
  - Deterministic methods
  - Branch-and-Bound; Interval Methods; Response Surface Methods; Modern Branch-and-Bound
3 Nonlinear optimization: derivative-based algorithms
minimize $f(x)$ subject to $x \in \Omega \subseteq \mathbb{R}^n$. (P)
- $f : \Omega \to \mathbb{R}$ is (generally) smooth and nonconvex.
- $\Omega$ is the feasible set (determined by finitely many constraints).
These algorithms
- are guaranteed to find local minimizers of (P);
- rely heavily on accurate/exact derivative information of $f$ and the constraints.
Optimality conditions: for example, when $\Omega = \mathbb{R}^n$,
- $x^*$ local minimizer of $f$ $\Longrightarrow$ $\nabla f(x^*) = 0$ and $\nabla^2 f(x^*) \succeq 0$;
- $\nabla f(x^*) = 0$ and $\nabla^2 f(x^*) \succ 0$ $\Longrightarrow$ $x^*$ local minimizer of $f$.
These are used as termination criteria for algorithms.
Taylor expansions of $f$: at the $k$th iterate $x^k \approx x^*$,
$f(x^k + s) \approx m_k(s) = f(x^k) + s^T \nabla f(x^k) \left[ + \tfrac{1}{2} s^T \nabla^2 f(x^k) s \right]$,
used in algorithm construction.
4 Nonlinear optimization: derivative-based algorithms...
Methods for local unconstrained optimization [i.e., $\Omega = \mathbb{R}^n$ in (P)]
A Generic Method (GM)
Choose $\epsilon > 0$ and $x^0 \in \mathbb{R}^n$. While (TERMINATION CRITERIA not achieved), REPEAT:
- compute the change $x^{k+1} - x^k = \alpha^k s^k$ [linesearch, trust-region] to ensure $f(x^{k+1}) \le f(x^k)$, where $\alpha^k \in [0, 1]$ and $s^k = \arg\min_{s \in \mathbb{R}^n} m_k(s) \approx f(x^k + s)$;
- set $x^{k+1} := x^k + \alpha^k s^k$, $k := k + 1$.
TC: $\|\nabla f(x^k)\| \le \epsilon$; maybe also $\lambda_{\min}(\nabla^2 f(x^k)) \ge -\epsilon$.
5 Nonlinear optimization: derivative-based algorithms...
Linesearch methods for local unconstrained optimization
- compute a descent direction $s^k$ from $x^k$ [i.e., $(s^k)^T \nabla f(x^k) < 0$];
- set $x^{k+1} = x^k + \alpha^k s^k$ to decrease $f$ [$\alpha^k$ from an (in)exact linesearch].
Choices of direction:
- steepest descent: $s^k = -\nabla f(x^k)$, which minimizes the linear model $m_k(s)$ over $\|s\| = \|\nabla f(x^k)\|$;
- Newton: $s^k = -\nabla^2 f(x^k)^{-1} \nabla f(x^k)$, which minimizes the quadratic model $m_k(s)$ over $s$.
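The two directions on this slide can be compared on a small example. The sketch below is illustrative only: the test function $f(x) = x_1^2 + 10 x_2^2$ and all function names are assumptions, not part of the lecture, and the 2x2 Hessian is inverted explicitly to keep the block dependency-free.

```python
def grad(x):
    # Gradient of the assumed test function f(x) = x1^2 + 10*x2^2.
    return [2.0 * x[0], 20.0 * x[1]]

def hess(x):
    # Hessian of the same function (constant, positive definite).
    return [[2.0, 0.0], [0.0, 20.0]]

def steepest_descent_direction(x):
    # s = -grad f(x)
    g = grad(x)
    return [-g[0], -g[1]]

def newton_direction(x):
    # s = -H^{-1} g, using the explicit inverse of a 2x2 matrix.
    g = grad(x)
    (a, b), (c, d) = hess(x)
    det = a * d - b * c
    return [-(d * g[0] - b * g[1]) / det,
            -(-c * g[0] + a * g[1]) / det]

def is_descent(x, s):
    # Descent test from the slide: s^T grad f(x) < 0.
    g = grad(x)
    return g[0] * s[0] + g[1] * s[1] < 0.0

x = [1.0, 1.0]
s_sd = steepest_descent_direction(x)
s_newton = newton_direction(x)
```

Both directions pass the descent test at this point; because the assumed function is a convex quadratic, the full Newton step lands on its minimizer (the origin) in one iteration, whereas steepest descent would need a linesearch and several iterations.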
6 Nonlinear optimization: derivative-based algorithms...
Trust-region methods for local unconstrained optimization
- compute a step $s^k$ from $x^k$ to improve the model $m_k(s)$ of $f$ within the trust region $\|s\| \le \Delta_k$:
  $s^k = (\text{approx.}) \arg\min_s m_k(s)$ subject to $\|s\| \le \Delta_k$;
- set $x^{k+1} = x^k + s^k$ if $m_k$ and $f$ agree at $x^k + s^k$;
- otherwise set $x^{k+1} = x^k$ and reduce the radius $\Delta_k$.
7 Nonlinear optimization: derivative-based algorithms...
How to compute/provide derivatives to a solver?
- Calculate derivatives by hand when the objective and constraints are easy/simple; the user provides code that computes them.
- Calculate or approximate derivatives automatically:
  - Automatic differentiation: breaks down the computer code for evaluating $f$ into elementary arithmetic operations and differentiates by the chain rule. Software: ADIFOR, ADOL-C.
  - Symbolic differentiation: manipulates the algebraic expression of $f$ (if available). Software: symbolic packages of MAPLE, MATHEMATICA, MATLAB.
  - Finite differencing: approximates derivatives.
See Nocedal & Wright, Numerical Optimization (2nd edition, 2006) for more details.
8 Nonlinear optimization: derivative-based algorithms...
Advantages and successes
- global convergence to stationary points of (P) under mild assumptions on the problem class; fast local convergence for Newton-like variants.
- can solve large-scale problems ($n$ large, at least of order $10^3$) efficiently, even when (P) has nonlinear constraints.
Limitations
- only guaranteed to provide local solutions of (P) when (P) is nonconvex.
- require accurate or exact first, and sometimes even second, derivatives of the objective $f$ and constraints to be available.
9 Global and derivative-free optimization algorithms
Attempt to overcome the limitations of derivative-based NLO algorithms for local minimization:
- Global Optimization (GO): the global minimizer of (P) is required; derivatives are allowed.
- Derivative-Free Optimization (DFO): derivatives are unavailable, even if (P) may be smooth; use only function values to construct iterates that approach a (local) minimizer.
For the remainder, $\Omega = \mathbb{R}^n$ in (P), i.e., we solve
minimize $f(x)$ subject to $x \in \mathbb{R}^n$. (UP)
[GO and DFO may not deal with nonlinear constraints, at best bounds.]
Comparison to local optimization of (UP): GO is more difficult (generally, NP-hard); in DFO, we lose problem information.
$\Longrightarrow$ for both GO and DFO, we are often content with improvement rather than optimization.
10 Global optimization
Consider (UP).
- When $f$ is convex, $x^*$ local minimizer of $f$ $\Longrightarrow$ $x^*$ global minimizer of $f$. Hence, for such instances, local optimization = GO.
- $f$ nonconvex and bounded below: how to compute the global minimizer of $f$ in the presence of local minimizers/high oscillations and sometimes noise?
- A local optimization algorithm gets trapped at local minimizers and cannot advance further towards the global solution. How do GO methods avoid this?
- When to terminate a GO algorithm?
11 Global optimization...
Applications: many of the grand challenge problems of scientific computing, such as weather forecasting, electronic-structural design, protein folding, molecular dynamics, etc.
Methods (to be addressed): branch-and-bound, multistart local search, randomized, etc.
Limitations
- on average, can efficiently solve only problems of (very) small scale (in the order of 10 variables); better if parallelism is employed.
- difficulties with incorporating nonlinear constraints; only bound constraints are (more) straightforward.
12 Derivative-Free Optimization (DFO)
Consider (UP). Even when $f$ is smooth, for many applications:
- Exact first derivatives of $f$ are unavailable: $f(x)$ is given by a black-box code, proprietary code or a simulation package.
- Computing $f(x)$ for any given $x$ is expensive: $f(x)$ is given by a time-consuming numerical simulation or lab experiments.
- Numerical approximation of $\nabla f(x)$ is impractically expensive or slow: using finite differencing to approximate $\nabla f(x)$ when $f(x)$ is expensive.
- The values of $f(x)$ are noisy, i.e., the evaluation of $f(x)$ is inaccurate, for example when $f(x)$ depends on discretization, sampling, inaccurate data, etc. Then gradient information is meaningless.
13 DFO: effect of noise on finite-differencing
Effect of noisy function values on finite-differencing: let $F$ be smooth and $\Psi$ be noise so that $f(x) = F(x) + \Psi(x)$.
Central-Difference (CD) formula for $\nabla f(x)$ with stepsize $h$:
$\frac{\partial f}{\partial x_i}(x) \approx \frac{f(x + h e_i) - f(x - h e_i)}{2h}$, $i = 1, \dots, n$,
and let $\eta(x, h) := \sup_{\|z - x\| \le h} |\Psi(z)|$.
$\Longrightarrow$ $\|\nabla_h f(x) - \nabla F(x)\| \le L_F h^2 + \frac{\eta(x, h)}{h}$,
where $\nabla_h f(x)$ is the CD approximation and $L_F$ is the Lipschitz constant of $\nabla^2 F$.
If $\eta(x, h)/h$ dominates in the RHS, then we are lucky if $-\nabla_h f(x)$ is a descent direction. May use DFO methods in that case.
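The trade-off between the $h^2$ truncation term and the $\eta/h$ noise term can be seen numerically. The following is a minimal sketch under stated assumptions: the smooth part $F(x) = \sin x_1 + x_2^2$, the uniform noise model bounded by `eta`, and all function names are illustrative choices, not taken from the lecture.

```python
import math
import random

def central_difference(f, x, h):
    # CD approximation of the gradient: (f(x + h e_i) - f(x - h e_i)) / (2h).
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2.0 * h))
    return g

def F(x):
    # Assumed smooth part; its third derivatives are bounded by 1.
    return math.sin(x[0]) + x[1] ** 2

def grad_F(x):
    return [math.cos(x[0]), 2.0 * x[1]]

def make_noisy(F, eta, seed=0):
    # f = F + Psi with |Psi| <= eta (uniform noise, an assumed noise model).
    rng = random.Random(seed)
    def f(x):
        return F(x) + rng.uniform(-eta, eta)
    return f

def max_error(g, g_true):
    return max(abs(a - b) for a, b in zip(g, g_true))

x = [0.3, -0.7]
eta = 1e-6
f = make_noisy(F, eta)
# Error of the CD gradient for a large, a moderate and a tiny stepsize.
errors = {h: max_error(central_difference(f, x, h), grad_F(x))
          for h in (1e-1, 1e-3, 1e-8)}
```

For this assumed $F$, each component error is bounded by $h^2/6 + \eta/h$. With $h = 10^{-8}$ the bound is about $100$, so the computed "gradient" may carry no usable information, which is the situation in which the slide suggests DFO methods.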
14 DFO: Illustrative application
Tuning of algorithmic parameters
- Consider some nonlinear optimization solver (say, trust region). Its performance depends on parameter choices: starting point, initial trust-region radius, successful step parameter, etc.
- For their automatic (optimal) adjustment, solve for instance
  $\min_{p \in \mathbb{R}^{n_p}} f(p) = \mathrm{CPU}(\text{solver}; p)$ subject to $p \in P$,
  where $p$ is the vector of all parameters to be tuned and $P = \{p : l \le p \le u\}$.
- Derivative calculation is hard; $f$ is possibly nondifferentiable.
Other applications: automatic error analysis, engineering design, molecular geometry, etc.
15 DFO methods
- use only objective function values to construct iterates.
- do not essentially compute an approximate gradient.
- instead, form a sample of points (less tightly clustered than for finite differences); use the associated function values to generate $x^{k+1}$ so as to ensure descent; must also control the geometry of the sample sets.
- Algorithms (to be addressed): model-based trust-region, direct-search algorithms, etc.
- compute an approximate (local) solution with few function evaluations; asymptotic speed is irrelevant as there are no optimality conditions for termination.
- also suitable (but not guaranteed to be successful) for nonsmooth and for global optimization.
16 DFO methods...
Limitations. With current state-of-the-art DFO methods, expect to successfully solve problems provided:
- the problem is small-scale (in the order of $10^2$ variables);
- $f$ is quite smooth;
- accurate finite-differencing cannot be achieved ($f$ noisy or expensive, etc.);
- high accuracy is not required (as the methods are slow asymptotically).
17 Derivative-free optimization
Model-based derivative-free methods
- Model-based derivative-free algorithm
- Interpolation models
- Polynomial interpolation
- Geometry of the sample set
- Comments
18 Models in optimization methods
minimize $f(x)$ subject to $x \in \mathbb{R}^n$. (UP)
- Derivative-based methods rely on (linear or quadratic) Taylor models of $f$:
  $f(x^k + s) \approx f(x^k) + s^T \nabla f(x^k) \left( + \tfrac{1}{2} s^T \nabla^2 f(x^k) s \right) =: m_k(s)$.
- They need accurate gradient values $\nabla f(x^k)$ [and maybe Hessians $\nabla^2 f(x^k)$].
- How to construct models of $f$ when derivatives are unavailable / don't exist / cannot be approximated? By interpolation of $f$ on a set of appropriately chosen sample points.
19 Models in derivative-free optimization methods
- Sample set: $Y = \{y^1, \dots, y^q\}$ for some $q$; $\{f(y^1), \dots, f(y^q)\}$ assumed to be known/computed.
- $x^k$ is the current iterate/estimate of the minimizer $x^*$, with $x^k \in Y$ and $f(x^k) \le f(y^i)$, $i = 1, \dots, q$.
- Model: $m_k(s) = c + s^T g \left( + \tfrac{1}{2} s^T H s \right)$, where $c \in \mathbb{R}$, $g \in \mathbb{R}^n$ (and $H \in \mathbb{R}^{n \times n}$ symmetric) are unknown.
- Compute $c \in \mathbb{R}$, $g \in \mathbb{R}^n$ (and $H \in \mathbb{R}^{n \times n}$) to satisfy the interpolation conditions:
  $m_k(y^i - x^k) = f(y^i)$ for all $i \in \{1, \dots, q\}$. (IC)
- Need $q = n + 1$ for $m_k$ linear (i.e., $H = 0$); $m_k$ quadratic needs $q = (n + 1)(n + 2)/2$; connect to finite differences.
20 Model-based DFO algorithm
Issues to address:
- model interpolation: the matrix of the linear system (IC) must be nonsingular and well-conditioned.
- model minimization: since $m_k$ may be nonconvex, add a TR constraint:
  $s^k = \arg\min_{s \in \mathbb{R}^n} m_k(s)$ subject to $\|s\| \le \Delta_k$. (TR)
- update $m_k$ rather than recompute it (only one point leaves $Y$ and a new one enters);
- improve the geometry of $Y$ to help with the (conditioning of the) model interpolation step.
A complete algorithm is very involved; here we give a generic framework.
21 Model-based DFO algorithm...
Let $s^k$ be a(n approximate) solution of (TR). Then
- predicted model decrease: $m_k(0) - m_k(s^k) = f(x^k) - m_k(s^k)$;
- actual function decrease: $f(x^k) - f(x^k + s^k)$.
The trust-region radius $\Delta_k$ is chosen based on the value of
$\rho_k := \frac{f(x^k) - f(x^k + s^k)}{f(x^k) - m_k(s^k)}$.
- If $\rho_k \ge \eta$, where $\eta \in (0, 1)$: $x^{k+1} := x^k + s^k$, $\Delta_{k+1} \ge \Delta_k$.
- If $\rho_k < \eta$: $x^{k+1} = x^k$ and $\Delta_k$ is reduced or $Y$ is improved.
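The acceptance test on this slide can be written as a small helper. This is a sketch only: the function name, the growth and shrink factors, and the default $\eta = 0.1$ are assumptions chosen for illustration, and the alternative of improving $Y$ instead of shrinking the radius is not modelled here.

```python
def trust_region_update(f_old, f_new, m_new, delta,
                        eta=0.1, grow=2.0, shrink=0.5):
    """Return (accepted, new_delta, rho) for one trust-region step.

    f_old = f(x^k), f_new = f(x^k + s^k), m_new = m_k(s^k).
    The factors grow >= 1 and shrink in (0, 1) are illustrative choices.
    """
    predicted = f_old - m_new          # predicted model decrease
    if predicted <= 0.0:
        raise ValueError("the model must predict a decrease")
    rho = (f_old - f_new) / predicted  # actual / predicted decrease
    if rho >= eta:
        # Successful step: accept it; the radius does not decrease.
        return True, grow * delta, rho
    # Unsuccessful step: reject it and reduce the radius.
    return False, shrink * delta, rho

# Model predicted a decrease of 2.5, the function actually decreased by 2.0.
accepted, new_delta, rho = trust_region_update(10.0, 8.0, 7.5, 1.0)
```

In the example call, $\rho_k = 2.0 / 2.5 = 0.8 \ge \eta$, so the step is accepted and the radius is doubled.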
22 Generic model-based DFO algorithm
Given $Y = \{y^1, \dots, y^q\}$ such that (IC) is nonsingular, $x^0 \in Y$ such that $f(x^0) \le f(y^i)$ for $i = 1, \dots, q$, $\eta \in (0, 1)$, $\Delta_0 > 0$ and $k = 0$.
While (TC not satisfied), do:
1. Form the linear/quadratic model $m_k(s)$ to satisfy (IC).
2. Solve (approximately) the (TR) subproblem for $s^k$ with $m_k(s^k) < f(x^k)$ ("sufficiently"). Compute $\rho_k := [f(x^k) - f(x^k + s^k)] / [f(x^k) - m_k(s^k)]$.
3. If $\rho_k \ge \eta$, then [successful step] set $x^{k+1} := x^k + s^k$, $\Delta_{k+1} \ge \Delta_k$, replace some $y^i \in Y$ by $x^{k+1}$.
   Else if ($\rho_k < \eta$) and ($Y$ need not be improved), then [unsuccessful step] set $x^{k+1} = x^k$ and $\Delta_{k+1} < \Delta_k$.
   End(if)
4. Geometry-improving step...
23 Generic model-based DFO algorithm... (continued...)
4. Invoke a geometry-improving procedure to update $Y$ [one point leaves $Y$, a new one enters so as to improve the conditioning of (IC)].
   - choose $\hat{x} \in Y$ such that $f(\hat{x}) \le f(y^i)$ for all $y^i \in Y$;
   - set $\Delta_{k+1} := \Delta_k$; recompute $\rho_k$ for $x^k + s^k := \hat{x}$;
   - If $\rho_k \ge \eta$, then set $x^{k+1} = \hat{x}$; Else set $x^{k+1} = x^k$. End(if)
5. Let $k := k + 1$.
24 Model-based DFO algorithm: comments
- $\rho_k < \eta$ $\Longrightarrow$ the trust region is too large OR the sample set $Y$ is inadequate (degenerate): iterates confined to a low-dimensional surface of $\mathbb{R}^n$ that does not contain the solution.
- replace a point in $Y$ if the condition number of (IC) is too high, so as to improve this condition number.
- initial $Y$: vertices and edge midpoints of a simplex in $\mathbb{R}^n$.
- quadratic models are expensive: $O(n^2)$ function evaluations for the initial model set-up; $O(n^4)$ arithmetic operations per iteration for model update and minimization (cheaper quadratic model: see Frobenius norm updates).
- use linear models (at least at the start of the algorithm, until enough function evaluations have been calculated): $O(n)$ function evaluations for the initial model set-up; $O(n^3)$ operations per iteration.
25 Model construction by interpolation
Linear model (linear polynomial in $n$ variables): $m_k(s) = f(x^k) + s^T g$.
- Need $q = n + 1$ in (IC) to determine $c \in \mathbb{R}$, $g \in \mathbb{R}^n$; but $c = f(x^k)$ and so (IC) provide
  $f(x^k) + (y^i - x^k)^T g = f(y^i)$, $i = 1, \dots, n$,
  or equivalently,
  $(y^i - x^k)^T g = f(y^i) - f(x^k)$, $i = 1, \dots, n$.
- Thus $g$, and hence $m_k(s)$, is uniquely defined $\Longleftrightarrow$ $\{y^1 - x^k, y^2 - x^k, \dots, y^n - x^k\}$ linearly independent $\Longleftrightarrow$ $\{x^k, y^1, y^2, \dots, y^n\}$ nondegenerate simplex.
- $\mathcal{P}_n^1$: polynomials of degree at most 1 in $\mathbb{R}^n$; $\dim \mathcal{P}_n^1 = n + 1$; monomial basis = natural basis $\phi = \{1, x_1, \dots, x_n\}$; $\phi_j(x) = x_j$.
- $m_k(y^i - x^k) = \sum_{j=1}^{q} \alpha_j \phi_j(y^i) = f(y^i)$, $i = 1, \dots, q$.
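The linear interpolation conditions amount to an $n \times n$ linear system for $g$. The sketch below solves it with a small hand-written Gaussian elimination so that it has no dependencies; `solve`, `linear_model` and the test functions are hypothetical names introduced for illustration, not part of the lecture or of any library.

```python
def solve(A, b):
    # Gaussian elimination with partial pivoting for a small dense system.
    n = len(b)
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[piv][col]) < 1e-14:
            raise ValueError("degenerate sample set: (IC) matrix is singular")
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def linear_model(f, xk, Y):
    # Rows (y^i - x^k)^T, right-hand side f(y^i) - f(x^k); Y holds n points.
    fk = f(xk)
    A = [[yi[j] - xk[j] for j in range(len(xk))] for yi in Y]
    rhs = [f(yi) - fk for yi in Y]
    g = solve(A, rhs)
    def m(s):
        # m_k(s) = f(x^k) + s^T g
        return fk + sum(sj * gj for sj, gj in zip(s, g))
    return g, m

# For an affine function the interpolation model recovers the exact gradient.
f_affine = lambda x: 3.0 * x[0] - 2.0 * x[1] + 1.0
xk_demo = [0.2, 0.1]
Y_demo = [[1.0, 0.5], [-0.5, 2.0]]
g_demo, m_demo = linear_model(f_affine, xk_demo, Y_demo)
```

When the edge vectors $y^i - x^k$ are linearly dependent (a degenerate simplex), the elimination finds no usable pivot and the helper raises an error, which mirrors the uniqueness condition on the slide.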
26 Model construction by interpolation...
Quadratic model (quadratic polynomial in $n$ variables):
$m_k(s) = f(x^k) + s^T g + \tfrac{1}{2} s^T H s$,
or equivalently, by symmetry of $H$,
$m_k(s) = f(x^k) + s^T g + \sum_{i<j} H_{ij} s_i s_j + \tfrac{1}{2} \sum_i H_{ii} s_i^2$,
that is,
$m_k(s) = f(x^k) + \hat{s}^T \hat{g}$, where $\hat{s} = \left( s, \{s_i s_j\}_{i<j}, \{\tfrac{1}{\sqrt{2}} s_i^2\} \right)$ and $\hat{g} = \left( g, \{H_{ij}\}_{i<j}, \{\tfrac{1}{\sqrt{2}} H_{ii}\} \right)$.
- $\mathcal{P}_n^2$: polynomials of degree at most 2 in $\mathbb{R}^n$; $\dim \mathcal{P}_n^2 = (n + 1)(n + 2)/2 = q$; monomial basis = natural basis $\phi = \{\phi_j : j = 1, \dots, q\}$;
  $m_k(y^i - x^k) = \sum_{j=1}^{q} \alpha_j \phi_j(y^i) = f(y^i)$, $i = 1, \dots, q$.
- Thus $m_k$ is uniquely defined $\Longleftrightarrow$ $\delta(\phi, Y) = \det(\{\phi_j(y^i)\}_{ij}) \ne 0$, for some polynomial basis $\phi$.
27 Model construction by interpolation...
(IC) $\Longleftrightarrow$ $M(\phi, Y) \alpha_\phi = f(Y)$, where $M(\phi, Y)_{ij} = \phi_j(y^i)$ and $f(Y)_i = f(y^i)$, $i, j = 1, \dots, q$.
$Y$ poised for interpolation $\Longleftrightarrow$ $\delta(\phi, Y) = \det M(\phi, Y) \ne 0$ for some basis $\phi$ $\Longleftrightarrow$ $\delta(\phi, Y) \ne 0$ for any basis $\phi$ $\Longleftrightarrow$ the interpolating polynomial $m_k(s)$ exists and is unique.
Other (useful) polynomial bases: Lagrange polynomials; Newton fundamental polynomials.
Lagrange polynomials: given $Y = \{y^1, \dots, y^q\}$, the Lagrange polynomials $\chi_j(x)$, $j = 1, \dots, q$, are such that $\chi_j(y^i) = 1$ if $i = j$ and $\chi_j(y^i) = 0$ if $i \ne j$.
$Y$ poised $\Longrightarrow$ the basis $\{\chi_j(x)\}_j$ uniquely exists $\Longrightarrow$ interpolating polynomial of $f$ on $Y$: $m_k(s) = \sum_{j=1}^{q} f(y^j) \chi_j(x^k + s)$.
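In one dimension the Lagrange polynomials have a closed form, which makes the definitions above easy to check. The sketch below covers only that special case ($n = 1$, so a quadratic model needs $q = 3$ points, and the set is poised exactly when the points are distinct); the function names are assumptions for illustration, and the model is written in terms of $x = x^k + s$.

```python
def lagrange_basis(Y):
    # Y: list of distinct real sample points (poised in 1-D).
    def chi(j, x):
        # chi_j(x) = prod_{i != j} (x - y^i) / (y^j - y^i)
        value = 1.0
        for i, yi in enumerate(Y):
            if i != j:
                value *= (x - yi) / (Y[j] - yi)
        return value
    return chi

def interpolant(f, Y):
    # m(x) = sum_j f(y^j) chi_j(x)
    chi = lagrange_basis(Y)
    fvals = [f(y) for y in Y]
    return lambda x: sum(fvals[j] * chi(j, x) for j in range(len(Y)))

Y_pts = [-1.0, 0.5, 2.0]
chi = lagrange_basis(Y_pts)
quadratic = lambda x: 2.0 * x * x - 3.0 * x + 1.0
model = interpolant(quadratic, Y_pts)
```

With three distinct points the model reproduces any quadratic exactly, and $\chi_j(y^i)$ equals 1 when $i = j$ and 0 otherwise, as in the definition on the slide.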
28 Updating Y to improve its geometry
Remove $y^- \in Y$ and add $y^+$ to give $Y^+$ so that $|\delta(\phi, Y^+)|$ increases in magnitude.
Property: $|\delta(\phi, Y^+)| \le |\chi_j(y^+)| \, |\delta(\phi, Y)|$ with $j$ = index of $y^-$.
- When $\rho_k \ge \eta$: $y^+ = x^k + s^k$ and $y^- = \arg\max_{y^j \in Y} |\chi_j(x^k + s^k)|$.
- When $\rho_k < \eta$: check whether $Y$ needs improvement:
  [$Y$ is adequate at $x^k$ if for all $y^j \in Y$ such that $\|y^j - x^k\| \le \Delta_k$, $|\delta(\phi, Y)|$ cannot be doubled if $y^j$ is replaced by some $y$ inside the TR constraint.]
  - If $Y$ is adequate, choose $\Delta_{k+1} < \Delta_k$, $x^{k+1} = x^k$, leave $Y$ unchanged.
  - Else, for every $y^j \in Y$, define the potential replacement $y_r^j = \arg\max_{\|y - x^k\| \le \Delta_k} |\chi_j(y)|$. Let $y^- = y^j = \arg\max_{y^i \in Y} |\chi_i(y_r^i)|$.
29 Constructing cheaper quadratic models
- extension to DFO of quasi-Newton techniques;
- only $O(n)$ function evaluations required to construct the quadratic model, with an $O(n^3)$ arithmetic cost per iteration.
Compute $c$, $g$ and $H$ by solving
$\min_{c, g, H} \|H - H_k\|_F$ subject to $H = H^T$, $m_k(y^i - x^k) = f(y^i)$, $i = 1, \dots, \hat{q}$,
where $\|\cdot\|_F$ is the Frobenius norm and $H_k$ is the previous model Hessian.
- $\hat{q} = O(n)$: $\hat{q} \ge n + 2$ so we can compute $c$, $g$ and some $H_{k+1} \ne H_k$; practical value: $\hat{q} = 2n + 1$.
- like before, we need to consider the geometry of $Y$...
Software (model-based implementations): COBYLA (linear), DFO (quadratic), UOBYQA (quadratic), WEDGE (quadratic), NEWUOA (cheap quadratic based on quasi-Newton updating).
30 Derivative-free optimization
Direct-search derivative-free methods
- Linesearch methods
- Coordinate search method
- Pattern search methods
- Simplex methods
- Nelder-Mead algorithm
31 Linesearch derivative-free methods
minimize $f(x)$ subject to $x \in \mathbb{R}^n$. (UP)
A Generic Linesearch DF Method (GLM-DF)
Choose $x^0 \in \mathbb{R}^n$; $k = 0$. While (TC not satisfied), REPEAT:
- choose a search direction $s^k \in \mathbb{R}^n$ from $x^k$;
- if possible, compute a stepsize $\alpha^k \in \mathbb{R}$ along $s^k$ such that $f(x^k + \alpha^k s^k) < f(x^k)$;
- set $x^{k+1} := x^k + \alpha^k s^k$ (if such a step $\alpha^k$ exists) and $x^{k+1} := x^k$ (otherwise), and $k := k + 1$.
Recall derivative-based linesearch methods: $s^k$ is a descent direction if $(s^k)^T \nabla f(x^k) < 0$ $\Longrightarrow$ $f(x^k + \alpha s^k) < f(x^k)$ for $\alpha > 0$ sufficiently small; $s^k = -\nabla f(x^k)$ is a descent direction.
32 Linesearch derivative-free methods: linesearch
[Figure: (a) Steps are too long. (b) Steps are too short. (c) Bad search direction. Source: Kolda, Lewis & Torczon (SIREV).]
- Exact linesearch: $\alpha^k = \arg\min_{\alpha \in \mathbb{R}} \phi_k(\alpha) = f(x^k + \alpha s^k)$.
- Inexact linesearch: sufficient decrease (Armijo-like condition):
  $f(x^k + \alpha^k s^k) < f(x^k) - \rho(\alpha^k)$,
  where $\rho(t) \ge 0$ is an increasing function of $t$, with $\rho(t)/t \to 0$ as $t \to 0$.
- Use backtracking to satisfy the Armijo-like condition.
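Backtracking against the Armijo-like condition needs only function values. The following is a minimal sketch, assuming the forcing function $\rho(t) = \gamma t^2$ (nonnegative, increasing for $t > 0$, and $\rho(t)/t \to 0$); the function name, the halving factor `tau` and the cut-off `alpha_min` are illustrative assumptions.

```python
def df_backtracking(f, x, s, alpha0=1.0, gamma=1e-4, tau=0.5,
                    alpha_min=1e-12):
    """Derivative-free backtracking along direction s from x.

    Accepts the first alpha with f(x + alpha*s) < f(x) - gamma*alpha**2.
    Returns (alpha, new_point); alpha = 0.0 means no acceptable step was
    found, in which case the point is left unchanged.
    """
    fx = f(x)
    alpha = alpha0
    while alpha >= alpha_min:
        trial = [xi + alpha * si for xi, si in zip(x, s)]
        if f(trial) < fx - gamma * alpha ** 2:
            return alpha, trial
        alpha *= tau  # step too long: shrink and try again
    return 0.0, list(x)

sphere = lambda x: x[0] ** 2 + x[1] ** 2
# Along s = (-4, 0) from (1, 1) the steps 1 and 0.5 overshoot; 0.25 is accepted.
alpha_demo, x_demo = df_backtracking(sphere, [1.0, 1.0], [-4.0, 0.0])
```

If the direction gives no decrease at all (for example $s = (1, 0)$ from the same point), the loop exhausts its stepsizes and returns a zero step, which corresponds to the "otherwise $x^{k+1} := x^k$" branch of the generic linesearch method.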
33 Linesearch DF methods: choice of search directions
Example: Coordinate search method
$x^1 = x^0 + \alpha^0 e_1$
$x^2 = x^1 + \alpha^1 e_2$
$\vdots$
$x^n = x^{n-1} + \alpha^{n-1} e_n$
$x^{n+1} = x^n + \alpha^n e_1$
$\vdots$
[Figure: coordinate search iterates $x^0$, $x^1$, ... approaching $x^*$.]
- $\alpha^k$ computed by exact or inexact linesearch.
- inefficient behaviour: coordinate direction (almost) orthogonal to $\nabla f(x^k)$; see Figs. 1 (c) & 2.
34 Linesearch DF methods: choice of search directions
Coordinate search method (continued...)
- CS with exact linesearch: example of failure to converge to a stationary point of $f$ (MJD Powell, 1973).
- efficient when the variables are essentially uncoupled (equivalent to a nearly diagonal Hessian). Problem about coordinate search.
- when convergent, the local rate of convergence is often slower than steepest descent (Luenberger, 2003): one step of SD $\approx$ $n$ steps of CS.
- globally convergent variants exist (strong assumptions or sophisticated linesearch); for example, assume that along each coordinate direction, $f$ has a unique minimizer.
35 Linesearch DF methods: choice of search directions
Other variants of coordinate search and linesearch DF:
- "Back and forth" / Double sweep method: search along $e_1, e_2, \dots, e_n, e_{n-1}, \dots, e_2, e_1, e_2, \dots$
- Hooke & Jeeves: search along the $n$ coordinates, then along the line from the first to the last point in the cycle.
- Conjugate directions algorithm (connection to derivative-based Conjugate Gradients).
Global convergence for GLM-DF: prevent inefficient behaviour by requiring that
$\cos \theta_k = \frac{-\nabla f(x^k)^T s^k}{\|\nabla f(x^k)\| \, \|s^k\|} \ge \delta > 0$ for all $k$.
When the gradient of $f$ is unavailable, require instead
$\min_{v \ne 0} \max_{j \in \{0, \dots, n-1\}} \frac{|v^T s^{k+j}|}{\|v\| \, \|s^{k+j}\|} \ge \delta > 0$ for all $k$
$\Longrightarrow$ $\mathrm{span}\{s^0, s^1, \dots, s^{n-1}\} = \mathbb{R}^n$.
Still not enough for global convergence: need a sophisticated linesearch (Lucidi et al, 2002).
36 Pattern-search methods
Motivated by the need to make use of parallelization of function evaluations in linesearch methods.
Pattern-search algorithm
Given $\epsilon > 0$, $\theta_1 \in (0, 1)$, $\theta_2 \ge 1$, choose $x^0 \in \mathbb{R}^n$, stepsize $\alpha^0 > \epsilon$, initial direction set $D_0$; $k = 0$;
While ($\alpha^k > \epsilon$), do:
1. If the sufficient decrease condition holds at $\alpha^k$ for some $s^i \in D_k$, then set $x^{k+1} = x^k + \alpha^k s^i$ and $\alpha^{k+1} = \theta_2 \alpha^k$.
2. Else set $x^{k+1} = x^k$ and $\alpha^{k+1} = \theta_1 \alpha^k$.
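The two-step loop above can be sketched in a few lines. The choices below are assumptions for illustration rather than the lecture's prescription: the direction set is fixed to the $2n$ coordinate directions $\pm e_i$, the sufficient decrease function is $\rho(t) = \gamma t^2$, and the function name, default parameters and `max_iter` safeguard are hypothetical.

```python
def pattern_search(f, x0, alpha0=1.0, eps=1e-6, theta1=0.5, theta2=1.0,
                   gamma=1e-4, max_iter=10000):
    n = len(x0)
    # Direction set D = {e_1, -e_1, ..., e_n, -e_n}, fixed for all k.
    D = []
    for i in range(n):
        for sign in (1.0, -1.0):
            d = [0.0] * n
            d[i] = sign
            D.append(d)
    x, fx, alpha, k = list(x0), f(x0), alpha0, 0
    while alpha > eps and k < max_iter:
        for s in D:
            trial = [xi + alpha * si for xi, si in zip(x, s)]
            ft = f(trial)
            if ft < fx - gamma * alpha ** 2:   # sufficient decrease
                x, fx = trial, ft              # step 1: successful poll
                alpha = theta2 * alpha
                break
        else:
            alpha = theta1 * alpha             # step 2: contract the stepsize
        k += 1
    return x, fx, k

bowl = lambda x: (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2
x_ps, f_ps, iters_ps = pattern_search(bowl, [0.0, 0.0])
```

The stepsize is held fixed while polling the directions within one iteration, and the loop stops once the stepsize has been contracted below `eps`. Since the trial evaluations within an iteration are independent, they could be computed in parallel, which is the motivation stated on the slide.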
37 Pattern-search methods...
- instead of one search direction $s^k$, at each PS iteration, we have a set of directions $D_k$;
- conditions for a good set of directions $D_k$: at least one direction in $D_k$ should give descent, in the sense that
  $\min_{v \ne 0} \max_{s \in D_k} \frac{v^T s}{\|v\| \, \|s\|} \ge \delta > 0$ for all $k$. (*)
  (very similar to the linesearch methods condition earlier).
- Also, require that $0 < s_{\min} \le \|s\| \le s_{\max}$ for all $s \in D_k$. (**)
Note that $D_k = \{e_i, -e_i\}$ does not satisfy (*).
38 Pattern-search methods...
Suitable choices for $D_k$ that satisfy (*) and (**):
- coordinate directions: $\{e_1, e_2, \dots, e_n, -e_1, -e_2, \dots, -e_n\}$.
- $\{s^i = \tfrac{1}{2n} e - e_i\}$, $i = 1, \dots, n$, and $s^{n+1} = \tfrac{1}{2n} e$, where $e = (1, 1, \dots, 1)^T$.
Stepsize $\alpha^k$ is fixed during the $k$th iteration; the sufficient decrease condition (see Inexact linesearch earlier) is checked at $\alpha^k$ along each direction in $D_k$.
Suitable values for $\rho(t)$: $\gamma t^2 \|s^k\|$ or $\gamma t^{3/2}$, etc.
Pattern-search software packages: APPS (Hough, Kolda & Torczon), DIRECT (Jones, Perttunen & Stuckman), etc.
39 Pattern-search methods...
$D_k = \{e_1, e_2, -e_1, -e_2\}$, $n = 2$.
[Figure panels: (a) Initial pattern (b) Move North (c) Move West (d) Move North (e) Contract (f) Move West. Source: Kolda, Lewis & Torczon (SIREV).]
40 Simplex methods: Nelder-Mead
Nothing to do with simplex methods for linear programming.
Nelder-Mead (1965): the most popular algorithm with users of optimization: easy to understand and implement, not sophisticated. But it is heuristic, not rigorous or reliable, and hence not popular with optimizers.
Connection to (linear) model-based DFO: NM and simplex methods keep a simplex of points at each iteration, but do not construct a linear approximation of $f$ over this simplex; they only use function values at the vertices of the simplex and certain operations on the simplex.
- Vertices of the simplex: $Y = \{x^k, y^1, \dots, y^n\}$;
- Edges from $x^k$: matrix $M = \left( y^1 - x^k \;\; y^2 - x^k \;\; \dots \;\; y^n - x^k \right)$;
- $M$ nonsingular $\Longleftrightarrow$ the simplex (i.e., the convex hull of $Y$) is nondegenerate.
Connection to pattern-search: set of search directions $D_k = \{s^i = y^i - x^k : i = 1, \dots, n\}$.
41 The Nelder-Mead algorithm
Change notation: $Y = \{x^1, \dots, x^{n+1}\}$ at iteration $k$ with $f(x^1) \le f(x^2) \le \dots \le f(x^{n+1})$.
- Attempt to improve the worst function value $f(x^{n+1})$.
- Centroid of the best $n$ points: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x^i$.
- "Search direction": $x^{n+1} - \bar{x}$; $\bar{x}(\alpha) = \bar{x} + \alpha (x^{n+1} - \bar{x})$.
Simplex operations, illustrated for $n = 2$ (S. Richards, 2010):
[Figure panels: (a) Reflection (b) Expansion (c) Outside Contraction (d) Inside Contraction (e) Shrink.]
42 The Nelder-Mead algorithm
Given $\rho \ge 1$ (reflection), $\chi > \rho$ (expansion), $\gamma \in (0, 1)$ (contraction) and $\sigma \in (0, 1)$ (shrinkage); initial simplex $Y = \{x^1, \dots, x^{n+1}\}$ in $\mathbb{R}^n$, $k = 0$;
While (TC not satisfied), do:
1. Order vertices: $f(x^1) \le f(x^2) \le \dots \le f(x^{n+1})$.
2. (Reflection) Compute $x_r = \bar{x}(-\rho)$ and $f(x_r)$.
   If $f(x^1) \le f(x_r) < f(x^n)$, replace $x^{n+1} \in Y$ by $x_r$; $k = k + 1$.
   Else if $f(x_r) < f(x^1)$, then
3. (Expand) Compute $x_e = \bar{x}(-\chi)$ and $f(x_e)$.
   If $f(x_e) < f(x_r)$, replace $x^{n+1} \in Y$ by $x_e$; $k = k + 1$.
   Else replace $x^{n+1} \in Y$ by $x_r$; $k = k + 1$. (End if)
   Else (i.e., $f(x_r) \ge f(x^n)$).
43 The Nelder-Mead algorithm (continued)...
   Else (i.e., $f(x_r) \ge f(x^n)$)
4. (Contract) If $f(x^n) \le f(x_r) < f(x^{n+1})$, then (outside contraction)
   Compute $x_{oc} = \bar{x}(-\gamma)$ and $f(x_{oc})$.
   If $f(x_{oc}) \le f(x_r)$, replace $x^{n+1} \in Y$ by $x_{oc}$; $k = k + 1$.
   Else go to Step 5. (End if)
   Else (i.e., $f(x_r) \ge f(x^{n+1})$), then (inside contraction)
   Compute $x_{ic} = \bar{x}(\gamma)$ and $f(x_{ic})$.
   If $f(x_{ic}) < f(x^{n+1})$, replace $x^{n+1} \in Y$ by $x_{ic}$; $k = k + 1$.
   Else go to Step 5. (End if)
   (End if) (End if)
5. (Shrink) Define $n$ new vertices $y^i = x^1 + \sigma (x^{i+1} - x^1)$, $i = 1, \dots, n$, and the new simplex $Y^+ = \{x^1, y^1, \dots, y^n\}$; $k = k + 1$.
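Steps 1 to 5 translate fairly directly into code. The sketch below is an illustrative implementation under stated assumptions: it uses the parameter values $\rho = 1$, $\chi = 2$, $\gamma = \sigma = 1/2$, stops only when both the function-value spread and the simplex size are small (a conservative variant of the usual either/or termination test), and all names, tolerances and the iteration cap are hypothetical.

```python
import math

def nelder_mead(f, simplex, rho=1.0, chi=2.0, gamma=0.5, sigma=0.5,
                f_tol=1e-12, x_tol=1e-6, max_iter=2000):
    n = len(simplex) - 1
    Y = [(f(x), list(x)) for x in simplex]           # pairs (f(x^i), x^i)
    for _ in range(max_iter):
        Y.sort(key=lambda p: p[0])                   # Step 1: order vertices
        best, worst = Y[0][1], Y[-1][1]
        f1, fn, fworst = Y[0][0], Y[-2][0], Y[-1][0]
        size = max(math.dist(p[1], best) for p in Y[1:])
        if fworst - f1 <= f_tol and size <= x_tol * max(1.0, math.hypot(*best)):
            break
        centroid = [sum(p[1][j] for p in Y[:-1]) / n for j in range(n)]

        def point(alpha):
            # xbar(alpha) = xbar + alpha * (x^{n+1} - xbar)
            return [c + alpha * (w - c) for c, w in zip(centroid, worst)]

        xr = point(-rho)                             # Step 2: reflection
        fr = f(xr)
        if f1 <= fr < fn:
            Y[-1] = (fr, xr)
            continue
        if fr < f1:                                  # Step 3: expansion
            xe = point(-chi)
            fe = f(xe)
            Y[-1] = (fe, xe) if fe < fr else (fr, xr)
            continue
        if fr < fworst:                              # Step 4: outside contraction
            xc = point(-gamma)
            fc = f(xc)
            accept = fc <= fr
        else:                                        # Step 4: inside contraction
            xc = point(gamma)
            fc = f(xc)
            accept = fc < fworst
        if accept:
            Y[-1] = (fc, xc)
            continue
        # Step 5: shrink all vertices towards the best one.
        shrunk = [[b + sigma * (v - b) for b, v in zip(best, p[1])]
                  for p in Y[1:]]
        Y = [Y[0]] + [(f(x), x) for x in shrunk]
    Y.sort(key=lambda p: p[0])
    return Y[0][1], Y[0][0]

target = lambda x: (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2
x_nm, f_nm = nelder_mead(target, [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
```

Each non-shrink iteration costs one or two new function evaluations, while the initial ordering and every shrink step evaluate $f$ at many vertices, which matches the cost discussion on the following slide.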
44 The Nelder-Mead algorithm: some properties
- Termination conditions: function values at the simplex vertices are close to each other, or the simplex has become too small ($\max_i \|x^i - x^1\| \le \epsilon \max(1, \|x^1\|)$).
- Function-evaluation cost: $k = 0$ and any shrinkage step are expensive ($n + 1$ function values); otherwise, one or two function evaluations per operation.
- Limited convergence results: only for $n = 1$ and $n = 2$. Other simplex methods have better convergence theory; see Torczon (1991).
- Examples of failure (many), documented (McKinnon 1998).
45 The Nelder-Mead algorithm: convergence (Lagarias et al, 1998)
Theorem 1. ($n = 1$) Let $f : \mathbb{R} \to \mathbb{R}$ be a strictly convex objective with bounded level sets. Assume the initial simplex is nondegenerate. Apply the Nelder-Mead algorithm to minimizing $f$. Then both end points of the Nelder-Mead interval (i.e., simplex in one dimension) converge to the minimizer $x^*$ of $f$.
Theorem 2. ($n = 2$) Let $f : \mathbb{R}^2 \to \mathbb{R}$ be a strictly convex objective with bounded level sets. Assume the initial simplex is nondegenerate and that $\rho = 1$, $\chi = 2$ and $\gamma = 1/2$. Apply the Nelder-Mead algorithm to minimizing $f$. Then
$\lim_{k \to \infty} f(x^{1,k}) = \lim_{k \to \infty} f(x^{2,k}) = \lim_{k \to \infty} f(x^{3,k})$
and
$\lim_{k \to \infty} \mathrm{diam}(\mathrm{conv}(Y_k)) = 0$.
46 Illustrations of Nelder-Mead algorithm in action (Margaret Wright, 2013)
47 A 2-d NM picture (note the ease of understanding what's happening!):
48 Nelder-Mead on the McKinnon counterexample:
49 Similar things happen on the more complicated (in)famous Rosenbrock function, $f = 100 (x_1^2 - x_2)^2 + (1 - x_1)^2$, with its curving steep-sided valley. Coordinate search; 81 function evaluations, step =
50 Nelder Mead, 76 function evaluations