
Optimization

Last time: Root finding: definition, motivation. Algorithms: bisection, false position, secant, Newton-Raphson. Convergence & tradeoffs. Example applications of Newton's method. Root finding in > 1 dimension.

Today: Introduction to optimization. Definition and motivation. 1-dimensional methods: golden section, discussion of error, Newton's method. Multi-dimensional methods: Newton's method, steepest descent, conjugate gradient. General strategies, value-only methods.

Ingredients: Objective function. Variables. Constraints. Find values of the variables that minimize or maximize the objective function while satisfying the constraints.

Different Kinds of Optimization. Figure from: Optimization Technology Center, http://www-p.mcs.anl.gov/otc/guide/optweb/

Different Optimization Techniques. Algorithms have very different flavor depending on the specific problem: closed form vs. numerical vs. discrete; local vs. global minima; running times ranging from O(1) to NP-hard. Today: focus on continuous numerical methods.

Optimization in 1-D. Look for analogies to bracketing in root-finding. What does it mean to bracket a minimum? Three points (x_left, f(x_left)), (x_mid, f(x_mid)), (x_right, f(x_right)) with x_left < x_mid < x_right, f(x_mid) < f(x_left), and f(x_mid) < f(x_right).

Optimization in 1-D. Once we have these properties, there is at least one local minimum between x_left and x_right. Establishing the bracket initially: given x_initial and an increment, evaluate f(x_initial) and f(x_initial + increment). If decreasing, step until you find an increase; else, step in the opposite direction until you find an increase. Grow the increment (by a constant factor) at each step. For maximization: substitute -f for f.
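
As a concrete illustration, here is a minimal Python sketch of this bracketing procedure; the function name bracket_minimum, the default step, and the growth factor are illustrative choices, not part of the lecture.

def bracket_minimum(f, x0, step=1.0, grow=2.0):
    """Walk downhill from x0 with a growing step until f increases again.
    Returns (left, mid, right) with f(mid) <= f(left) and f(mid) <= f(right)."""
    a, b = x0, x0 + step
    fa, fb = f(a), f(b)
    if fb > fa:                      # stepping uphill: reverse direction
        a, b, fa, fb = b, a, fb, fa
        step = -step
    c, fc = b + step, f(b + step)
    while fc < fb:                   # keep going downhill, growing the increment
        step *= grow
        a, fa = b, fb
        b, fb = c, fc
        c, fc = b + step, f(b + step)
    return (a, b, c) if a < c else (c, b, a)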

Optimization in 1-D. Strategy: evaluate the function at some new point x_new inside the current bracket (x_left, f(x_left)), (x_mid, f(x_mid)), (x_right, f(x_right)).

Optimization in 1-D. Strategy: evaluate the function at some new point x_new. Here, the new bracket points are x_new, x_mid, x_right.

Optimization in 1-D. Strategy: evaluate the function at some new point x_new. Here, the new bracket points are x_left, x_new, x_mid.

Optimization in 1-D. Unlike with root-finding, we can't always guarantee that the interval will be reduced by a factor of 2. Let's find the optimal place for x_mid, relative to x_left and x_right, that will guarantee the same factor of reduction regardless of the outcome.

Optimization in 1-D. (Figure: the bracket is divided into segments of relative width α, α², and 1 − α², with x_new placed between x_left and x_mid.) If f(x_new) < f(x_mid), the new interval has width α; else the new interval has width 1 − α².

Golden Section Search. To assure the same interval size either way, we want α = 1 − α², so α = (√5 − 1)/2 ≈ 0.618, the reciprocal of the golden ratio Φ. The interval therefore shrinks by a constant factor (≈ 0.618) per iteration: linear convergence.
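
A golden section search along these lines might look like the following Python sketch (names and the tolerance are mine); each pass reuses one of the two interior points, so only one new function evaluation is needed per iteration.

import math

def golden_section(f, left, right, tol=1e-8):
    """Shrink the bracket [left, right] by a factor of ~0.618 per iteration."""
    alpha = (math.sqrt(5) - 1) / 2           # ~0.618, reciprocal of the golden ratio
    x1 = right - alpha * (right - left)
    x2 = left + alpha * (right - left)
    f1, f2 = f(x1), f(x2)
    while right - left > tol:
        if f1 < f2:                          # minimum lies in [left, x2]
            right, x2, f2 = x2, x1, f1
            x1 = right - alpha * (right - left)
            f1 = f(x1)
        else:                                # minimum lies in [x1, right]
            left, x1, f1 = x1, x2, f2
            x2 = left + alpha * (right - left)
            f2 = f(x2)
    return 0.5 * (left + right)

# e.g. golden_section(lambda x: (x - 2.0) ** 2, 0.0, 5.0) is approximately 2.0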

Sources of Error. When we find a minimum at x*, why is it different from the true minimum x_min? 1. Obvious: the width of the bracket, |x* − x_min| ≤ x_right − x_left. 2. Less obvious: floating point representation: the computed f near the minimum is only accurate to about ε_mach·|f(x_min)|, so values with |f(x*) − f(x_min)| ≲ ε_mach·|f(x_min)| are indistinguishable.

Stopping Criterion for Golden Section. Q: When is (x_right − x_left) small enough that the discrepancy between x* and x_min is limited by the rounding error in f(x_min)? Use a Taylor series, knowing that f'(x_min) is around 0: f(x) ≈ f(x_min) + f'(x_min)(x − x_min) + ½ f''(x_min)(x − x_min)² ≈ f(x_min) + ½ f''(x_min)(x − x_min)². So the condition |f(x) − f(x_min)| < ε_mach |f(x_min)| holds whenever |x − x_min| < √ε_mach · √(2 |f(x_min)| / |f''(x_min)|).

Implications. Rule of thumb: it is pointless to ask for more accuracy in x than √ε_mach. Q: So, what happens to the number of accurate digits in the result when you switch from single precision (~7 digits) to double (~16 digits) for x and f(x)? A: You gain only ~4 more accurate digits.

Faster 1-D Optimization. Trade off super-linear convergence for worse robustness; combine with golden section search for safety. Usual bag of tricks: fit a parabola through 3 points and find its minimum; compute derivatives as well as positions and fit a cubic; use second derivatives: Newton's method.
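
For the parabola trick, the minimum of the parabola through three bracketing points has a closed form; the helper below is an illustrative Python sketch (a robust routine such as Brent's method adds safeguards, e.g. against a nearly flat denominator).

def parabola_min(a, b, c, fa, fb, fc):
    """Abscissa of the minimum of the parabola through (a,fa), (b,fb), (c,fc)."""
    p = (b - a) ** 2 * (fb - fc) - (b - c) ** 2 * (fb - fa)
    q = (b - a) * (fb - fc) - (b - c) * (fb - fa)
    return b - 0.5 * p / q      # caller should fall back to golden section if q ~ 0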

Newton's Method

Newton's Method. At each step: x_{k+1} = x_k − f'(x_k) / f''(x_k). Requires 1st and 2nd derivatives. Quadratic convergence.
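
A minimal sketch of this update in Python, with the first and second derivatives passed in explicitly (function and parameter names are my own):

def newton_1d(df, d2f, x, tol=1e-10, max_iter=50):
    """Iterate x <- x - f'(x)/f''(x). Converges quadratically near a minimum
    where f''(x) > 0; may diverge or land on a maximum otherwise."""
    for _ in range(max_iter):
        step = df(x) / d2f(x)
        x -= step
        if abs(step) < tol:
            break
    return x

# e.g. newton_1d(lambda x: 2 * (x - 3), lambda x: 2.0, x=0.0) returns 3.0 in one step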

Questions?

Multidimensional Optimization

Multi-Dimensional Optimization. Important in many areas: finding the best design in some parameter space; fitting a model to measured data. Hard in general: multiple extrema, saddles, curved/elongated valleys, etc.; can't bracket (but there are trust region methods). In general, easier than root-finding: can always walk downhill; minimizing one scalar function, not simultaneously satisfying multiple functions.

Problem with Saddle

Newton's Method in Multiple Dimensions. Replace the 1st derivative with the gradient and the 2nd derivative with the Hessian: ∇f = (∂f/∂x, ∂f/∂y), H(x, y) = [ ∂²f/∂x², ∂²f/∂x∂y ; ∂²f/∂x∂y, ∂²f/∂y² ].

Newton's Method in Multiple Dimensions. In 1 dimension: x_{k+1} = x_k − f'(x_k) / f''(x_k). So, in multiple dimensions: x_{k+1} = x_k − H^{-1}(x_k) ∇f(x_k). Can be fragile unless the function is smooth and the starting point is close to the minimum.
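
A sketch of a single multidimensional Newton step using NumPy; it solves H d = ∇f rather than forming the inverse Hessian explicitly (the names and the quadratic example are mine):

import numpy as np

def newton_step(grad, hess, x):
    """One step of x_{k+1} = x_k - H(x_k)^{-1} grad f(x_k), via a linear solve."""
    return x - np.linalg.solve(hess(x), grad(x))

# Example: f(x, y) = x^2 + 10 y^2 is quadratic, so one step reaches the minimum
grad = lambda x: np.array([2.0 * x[0], 20.0 * x[1]])
hess = lambda x: np.diag([2.0, 20.0])
print(newton_step(grad, hess, np.array([3.0, -1.0])))    # -> [0. 0.]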

Other Methods. What if you can't / don't want to use the 2nd derivative? Quasi-Newton methods estimate the Hessian. Alternative: walk along the (negative of the) gradient: perform a 1-D minimization along the line passing through the current point in the direction of the gradient; once done, re-compute the gradient and iterate.
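
A steepest descent sketch in Python; for brevity the 1-D minimization along the gradient direction is replaced by a crude backtracking search, which is an assumption on my part rather than the lecture's prescription.

import numpy as np

def steepest_descent(f, grad, x, tol=1e-6, max_iter=1000):
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:          # gradient (nearly) zero: stop
            break
        t = 1.0
        while f(x - t * g) >= f(x) and t > 1e-12:
            t *= 0.5                         # backtrack until the step decreases f
        x = x - t * g
    return x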

Steepest Descent

Problem With Steepest Descent

Conjugate Gradient Methods. Idea: avoid undoing minimization that's already been done. Walk along direction d_{k+1} = −g_{k+1} + β_k d_k, where g is the gradient. Polak and Ribière formula: β_k = g_{k+1}^T (g_{k+1} − g_k) / (g_k^T g_k).

Conjugate Gradient Methods. Conjugate gradient implicitly obtains information about the Hessian. For a quadratic function in n dimensions, it gets the exact solution in n steps (ignoring roundoff error). Works well in practice.
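
A nonlinear conjugate gradient sketch using the Polak-Ribière formula from the previous slide; the inner line search is again a simple backtracking loop, and resetting a negative β to zero is a common practical safeguard rather than something stated in the lecture.

import numpy as np

def nonlinear_cg(f, grad, x, tol=1e-6, max_iter=200):
    g = grad(x)
    d = -g
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        t = 1.0
        while f(x + t * d) >= f(x) and t > 1e-12:
            t *= 0.5                                  # crude line minimization
        x = x + t * d
        g_new = grad(x)
        beta = g_new.dot(g_new - g) / g.dot(g)        # Polak-Ribiere
        d = -g_new + max(beta, 0.0) * d               # restart if beta < 0
        g = g_new
    return x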

Value-Only Methods in Multi-Dimensions. If you can't evaluate gradients, life is hard. Can use approximate (numerically evaluated) gradients: ∇f(x) ≈ [ (f(x + δe_1) − f(x))/δ, (f(x + δe_2) − f(x))/δ, (f(x + δe_3) − f(x))/δ, ... ].
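
A forward-difference gradient along the lines of the formula above (the step size delta and the function name are illustrative):

import numpy as np

def fd_gradient(f, x, delta=1e-6):
    """Approximate grad f(x) with forward differences along each unit vector e_i."""
    fx = f(x)
    g = np.zeros(len(x))
    for i in range(len(x)):
        e = np.zeros(len(x))
        e[i] = delta
        g[i] = (f(x + e) - fx) / delta
    return g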

Generic Optimization Strategies. Uniform sampling: cost rises exponentially with the number of dimensions. Heuristic: compass search. Try a step along each coordinate in turn; if you can't find a lower value, halve the step size.
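
A compass search sketch following this heuristic (initial step size, shrink factor, and names are my own choices):

import numpy as np

def compass_search(f, x, step=1.0, tol=1e-6):
    fx = f(x)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for sign in (+1.0, -1.0):        # try +/- step along each coordinate
                trial = x.copy()
                trial[i] += sign * step
                ft = f(trial)
                if ft < fx:
                    x, fx, improved = trial, ft, True
        if not improved:
            step *= 0.5                      # no lower value found: halve step size
    return x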

Generic Optimization Strategies. Simulated annealing: maintain a temperature T; pick a random direction d and try a step of size dependent on T. If the value is lower than the current one, accept; if the value is higher, accept with probability ~ exp((f(x_current) − f(x_new)) / T). Annealing schedule: how fast does T decrease? Slow but robust: can avoid non-global minima.
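
A bare-bones simulated annealing sketch; the geometric cooling schedule and the choice to scale the step size directly by T are illustrative assumptions, not part of the lecture.

import math, random
import numpy as np

def simulated_annealing(f, x, T0=1.0, cooling=0.999, n_steps=10000):
    fx, T = f(x), T0
    for _ in range(n_steps):
        d = np.random.randn(len(x))
        d *= T / np.linalg.norm(d)           # step size depends on temperature
        x_new = x + d
        f_new = f(x_new)
        if f_new < fx or random.random() < math.exp((fx - f_new) / T):
            x, fx = x_new, f_new             # always accept downhill; sometimes uphill
        T *= cooling                         # annealing schedule
    return x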

Downhill Simplex Method (Nelder-Mead). Keep track of n+1 points in n dimensions: the vertices of a simplex (a triangle in 2D, a tetrahedron in 3D, etc.). At each iteration the simplex can move, expand, or contract. Sometimes known as the amoeba method: the simplex oozes along the function.

Downhill Simplex Method (Nelder-Mead). Basic operation: reflection. (Figure: the location probed by the reflection step, opposite the worst point, i.e. the one with the highest function value.)

Downhill Simplex Method (Nelder-Mead). If the reflection resulted in the best (lowest) value so far, try an expansion (figure: the location probed by the expansion step). Else, if the reflection helped at all, keep it.

Downhill Simplex Method (Nelder-Mead). If the reflection didn't help (the reflected point is still the worst), try a contraction (figure: the location probed by the contraction step).

Downhill Simplex Method (Nelder-Mead). If all else fails, shrink the simplex around the best point.

Downhill Simplex Method (Nelder-Mead). The method is fairly efficient at each iteration (typically 1-2 function evaluations) but can take lots of iterations. Somewhat flaky: sometimes needs a restart after the simplex collapses on itself, etc. Benefits: simple to implement, doesn't need derivatives, doesn't care about function smoothness, etc.
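
In practice one rarely hand-rolls Nelder-Mead; as a usage sketch (not part of the lecture), SciPy exposes it directly:

from scipy.optimize import minimize

result = minimize(lambda p: (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2,
                  x0=[0.0, 0.0], method='Nelder-Mead')
print(result.x)     # approximately [1, -2], found without any derivatives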

Rosenbrock's Function: f(x, y) = 100(y − x²)² + (1 − x)². Designed specifically for testing optimization techniques. Curved, narrow valley.
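
A sketch of what such a test might look like in Python, starting from the customary point (-1.2, 1); the choice of methods and starting point here is mine, not the lecture's.

import numpy as np
from scipy.optimize import minimize

def rosenbrock(p):
    x, y = p
    return 100.0 * (y - x ** 2) ** 2 + (1.0 - x) ** 2

for method in ('Nelder-Mead', 'CG', 'BFGS'):
    res = minimize(rosenbrock, x0=np.array([-1.2, 1.0]), method=method)
    print(method, res.x, res.nfev)   # each should approach the minimum at (1, 1)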

Demo

Global Optimization. In general, you can't guarantee that you've found the global (rather than a local) minimum. Some heuristics: multi-start (try local optimization from several starting positions); very slow simulated annealing; use analytical methods (or graphing) to determine behavior and guide methods to the correct neighborhoods.

Software notes

Software. Matlab: fminbnd, for a function of 1 variable with bound constraints, based on golden section & parabolic interpolation; f(x) doesn't need to be defined at the endpoints. fminsearch: simplex method (i.e., no derivative needed). Optimization Toolbox (available free @ Princeton). Also useful for visualizing functions of two variables: meshgrid, surf. Excel: Solver.