Solving Nonlinear Equations & Optimization

One Dimension
Problem: for a function f(x), find x_0 such that f(x_0) = 0.

One Root: The Bisection Method
This one's guaranteed to converge, at least to a singularity if not an actual root.
1. Start with a and b such that f(a) and f(b) have opposite signs.
2. Choose the midpoint c = a + (b − a)/2.
3. If f(c) has a sign opposite to f(a), then set b = c. Otherwise, set a = c.
4. Repeat until the desired tolerance is attained.
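A minimal R sketch of these steps (the helper name bisect, its defaults, and the example call are mine, not from the slides):

```r
# Bisection: assumes f(a) and f(b) have opposite signs.
bisect <- function(f, a, b, tol = 1e-8, maxit = 1000) {
  if (f(a) * f(b) > 0) stop("f(a) and f(b) must have opposite signs")
  for (i in seq_len(maxit)) {
    mid <- a + (b - a) / 2                    # step 2: midpoint
    if (f(mid) == 0 || (b - a) / 2 < tol) return(mid)
    if (sign(f(mid)) != sign(f(a))) b <- mid  # step 3: root lies in [a, mid]
    else a <- mid                             #         root lies in [mid, b]
  }
  mid                                         # step 4: estimate after maxit halvings
}

bisect(function(x) x^2 - 2, 0, 2)  # approx. 1.414214
```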

One Root: Brent's Method
Combines bracketing with local quadratic interpolation of three points. At a given iteration, if the next computed point falls outside of the bracketing interval, a bisection step is used instead. This is the method underlying uniroot in R. More details in Press et al. (1992). Brent's is the method most highly recommended by NR for single nonlinear root-finding.
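For example, uniroot takes a function and a bracketing interval:

```r
# uniroot() is base R's bracketed (Brent-style) root finder
uniroot(function(x) cos(x) - x, interval = c(0, 1))$root  # approx. 0.739085
```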

One Root: Newton's Method
Local linear approximation using f'(x). Steps: With a first guess x_0, compute f'(x_0), the slope of the approximating (tangent) line. The next guess x_1 is the root of the tangent line extending from x_0, i.e., x_1 = x_0 − f(x_0)/f'(x_0). Iterate until convergence.
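A short R sketch of the update x_{n+1} = x_n − f(x_n)/f'(x_n) (the function name, defaults, and example are mine):

```r
# Newton's method in one dimension; fprime is the analytic derivative f'(x)
newton1d <- function(f, fprime, x0, tol = 1e-10, maxit = 100) {
  x <- x0
  for (i in seq_len(maxit)) {
    step <- f(x) / fprime(x)
    x <- x - step                  # x_{n+1} = x_n - f(x_n)/f'(x_n)
    if (abs(step) < tol) break     # stop once updates are negligible
  }
  x
}

newton1d(function(x) x^2 - 2, function(x) 2 * x, x0 = 1)  # approx. 1.414214
```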

A Comparison

Method      Requires f'?   Guaranteed?   Convergence
Bisection   No             Yes           Linear
Brent's     No             Almost        Superlinear
Newton's    Yes            No            Quadratic*

* If close. These same relative trade-offs exist for higher-dimensional procedures.

Optimization in One Dimension
Problem: for a function f(x), find x_m such that f(x) > f(x_m) (a minimum) or f(x) < f(x_m) (a maximum) for all x near x_m. We'll focus on minima, since finding a max for f is equivalent to finding a min for −f.
Global versus local: multiple extrema; boundaries.

One-dimensional: Golden Section Search
An analogue to the bisection method for finding roots. Proceeds as follows:
Begin with 3 points x_1 < x_2 < x_3 that are thought to contain a local minimum.
Choose a new point x_0 such that x_1 < x_0 < x_3.
Form a new bracketing interval based on the relative values of f(x_0) and f(x_2). For example, if x_0 < x_2, then the new interval is [x_0, x_3] if f(x_0) > f(x_2), or it's [x_1, x_2] if f(x_0) < f(x_2).
Iterate until convergence.

What does "Golden" mean?
The question is: following the steps on the previous slide, how do we select x_0?
The answer is: we make a choice that guarantees a proportional reduction in the width of the interval at each step. For example, if x_0 < x_2, then for this to happen regardless of the value of f(x_0) we need to satisfy x_0 − x_1 = x_3 − x_2 = α(x_3 − x_1), where α represents the proportion of the interval eliminated at each step. To get the same reduction at the next iteration, the points must also satisfy x_2 − x_0 = α(x_2 − x_1) = α[α(x_3 − x_1) + (x_2 − x_0)], so x_2 − x_0 = (x_3 − x_1) α²/(1 − α). Since (x_0 − x_1) + (x_2 − x_0) + (x_3 − x_2) = x_3 − x_1, it follows that 2α + α²/(1 − α) = 1, a quadratic whose only solution satisfying 0 < α < 1 is α = (3 − √5)/2. Hence, the proportion of the interval remaining after each iteration is 1 − α = (√5 − 1)/2 ≈ 0.618, which is known as the Golden Mean.

How do we use the value α?
Start with an interval [x_1, x_3] thought to contain the min.
Select the interior points x_0 = x_1 + α(x_3 − x_1) and x_2 = x_3 − α(x_3 − x_1).
Evaluate f(x_0) and f(x_2).
If f(x_0) < f(x_2), the new interval is [x_1, x_2] and the next point selected is x_1 + α(x_2 − x_1).
If f(x_0) > f(x_2), the new interval is [x_0, x_3] and the next point selected is x_3 − α(x_3 − x_0).
Iterate.
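A minimal R sketch of this procedure (the function name golden and the example are mine; for clarity it re-evaluates f at both interior points on every pass rather than caching values):

```r
# Golden section search for a minimum bracketed by [x1, x3]
golden <- function(f, x1, x3, tol = 1e-8) {
  alpha <- (3 - sqrt(5)) / 2                # proportion eliminated each step
  x0 <- x1 + alpha * (x3 - x1)
  x2 <- x3 - alpha * (x3 - x1)
  while (x3 - x1 > tol) {
    if (f(x0) < f(x2)) {                    # min is in [x1, x2]
      x3 <- x2; x2 <- x0
      x0 <- x1 + alpha * (x3 - x1)
    } else {                                # min is in [x0, x3]
      x1 <- x0; x0 <- x2
      x2 <- x3 - alpha * (x3 - x1)
    }
  }
  (x1 + x3) / 2
}

golden(function(x) (x - 2)^2, 0, 5)  # approx. 2
```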

Brent's Method
Works in a manner analogous to Brent's for root-finding: local quadratic interpolation, with a safety net in case new points fall outside of the bracket. Too complicated to describe here (a lot of housekeeping computations), although you can find out more in NR. This is the method used by R's optimize function.
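For example:

```r
# optimize() is base R's one-dimensional minimizer (golden section search
# combined with successive parabolic interpolation)
optimize(function(x) (x - 2)^2 + 1, interval = c(0, 5))
# $minimum approx. 2, $objective approx. 1
```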

Solving Several Nonlinear Equations
The problem is to find solutions for a system of the form:
f_1(x_1, x_2, …, x_p) = 0,
f_2(x_1, x_2, …, x_p) = 0,
…
f_p(x_1, x_2, …, x_p) = 0.

Options
Multivariate Newton's method, or Newton-Raphson (NR).
Modified NR: line searches and backtracking.
Multivariate secant method: Broyden's method.
Similar trade-offs apply as we discussed with one equation, in terms of convergence and knowledge of the Jacobian.

Why is finding several roots such a problem?
"There are no good, general methods for solving systems of more than one nonlinear equation" (from NR).
Often, the functions f_1, f_2, …, f_p have nothing to do with each other.
Finding solutions of the system means identifying where the p zero contours (in higher dimensions, (p − 1)-dimensional zero hypersurfaces) simultaneously intersect. These can be difficult to home in on without some insight into how the p functions relate to one another.
See the example on the following slide, with p = 2.

[Figure reproduced from Numerical Recipes: zero contours of two functions in two dimensions.]

Developing a multivariate linear approximation:
Let F denote the entire vector of p functions f_i, and let x = (x_1, …, x_p) denote the entire vector of values x_i, for i = 1, …, p. The Taylor series expansion of f_i in a neighborhood of x is:
f_i(x + δ) = f_i(x) + Σ_j (∂f_i/∂x_j) δ_j + O(δ²), where the sum runs over j = 1, …, p.
Note that the partial derivatives in this equation arise from the Jacobian matrix J of F. So in matrix notation we have:
F(x + δ) = F(x) + J·δ + O(δ²).

Newton-Raphson
From the expansion on the previous slide, neglecting terms of order δ² and higher and setting F(x + δ) equal to zero, we obtain a set of linear equations for the corrections δ that move each function simultaneously closer to zero:
J·δ = −F(x),
which can be solved using LU decomposition. This gives us an iterative approach for correcting and updating a solution:
x_new = x_old + δ,
which we iterate to convergence (judged by, e.g., how close either the 1-norm or the ∞-norm of δ is to zero).

Evaluating the Jacobian
As we often cannot easily evaluate the Jacobian analytically, a conventional option is numerical differentiation. Numerical evaluation of the Jacobian relies on finite difference equations. An approximate value of the (i, j)th element of J is given by:
J_ij ≈ [f_i(x + h_j e_j) − f_i(x)] / h_j,
where h_j is some very small number and e_j represents a vector with 1 at the jth position and zeroes everywhere else.
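A small R sketch of this approximation (the name num_jacobian and the step-size scaling are mine):

```r
# Forward-difference approximation to the Jacobian of a vector-valued F at x
num_jacobian <- function(F, x, h = sqrt(.Machine$double.eps)) {
  Fx <- F(x)
  p  <- length(x)
  J  <- matrix(0, nrow = length(Fx), ncol = p)
  for (j in seq_len(p)) {
    hj <- h * max(abs(x[j]), 1)     # scale the step to the size of x_j
    ej <- numeric(p); ej[j] <- 1    # unit vector e_j
    J[, j] <- (F(x + hj * ej) - Fx) / hj
  }
  J
}
```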

Modified Newton-Raphson
Note that a full Newton step can be represented as δ = −J⁻¹F. When we are not close enough to the solution, this step is not guaranteed to decrease the value of the function. How do we know if we should take the full step? One strategy is to require that the step decrease the inner product F·F, which is the same requirement as trying to minimize g = (F·F)/2. Another point to note is that the Newton step is a descent direction for g: ∇g·δ = (F·J)·(−J⁻¹F) = −F·F < 0.

Strategy: Modified Newton-Raphson (continued)
i. Define p = δ, and a Newton iteration as x_new = x_old + λp, where a full Newton step specifies λ = 1.
ii. If g is reduced, then go to the next iteration.
iii. If g is not reduced, then backtrack, selecting some λ < 1. The value of λ for a conventional backtrack is selected to ensure that the average rate of decrease of g is at least some fraction of the initial rate of decrease, and that the rate of decrease of g at the new value of x is some fraction of the rate for the old value of x.
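A simplified R sketch of the backtracking idea, halving λ until g decreases (the full strategy described in NR uses a sufficient-decrease condition with polynomial interpolation; this loop, and all names in it, are only an illustration):

```r
# g(x) = 0.5 * sum(F(x)^2), the objective whose decrease we require
g <- function(F, x) 0.5 * sum(F(x)^2)

# Take the Newton correction delta, shortening it (lambda < 1) if needed
backtrack_step <- function(F, x_old, delta, max_halvings = 30) {
  lambda <- 1
  g_old  <- g(F, x_old)
  for (k in seq_len(max_halvings)) {
    x_new <- x_old + lambda * delta
    if (g(F, x_new) < g_old) return(x_new)  # accept the (possibly shortened) step
    lambda <- lambda / 2                    # otherwise backtrack
  }
  x_old                                     # no acceptable step found
}
```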

Multidimensional Optimization
The problem: find a minimum for the function f(x_1, …, x_p).
Note that in many statistical applications the functions we wish to optimize (e.g., log-likelihoods) are convex, and hence fairly well behaved.
Also, in terms of the various approaches, the options involve trade-offs between rate of convergence and information about the gradient and Hessian. The latter two can often be numerically evaluated.

Strategies
1. Newton-Raphson applied to the gradient.
2. Nelder-Mead Simplex Method (no gradient required).
3. Powell's Method.
4. Conjugate Gradient Methods.
5. Variable Metric Methods.

Nelder-Mead Simplex Approach
A simplex is a figure with p + 1 vertices in p dimensions: a triangle in two dimensions, or a tetrahedron in three dimensions.
Start with a set of p + 1 points that define a finite simplex (i.e., one having finite volume).
The simplex method then takes a series of reflective steps, moving the highest point (where f is largest) through the opposite face of the simplex to a lower point.
Steps are designed to preserve the volume, but the simplex may expand (lengthen) where feasible to facilitate convergence. When the simplex reaches a valley floor, it takes contractive steps.
The NR implementation descriptively refers to this routine as "amoeba".

Possible simplex moves:

Powell's Method (aka Direction Set Methods)
We know how to minimize a nonlinear function of a single variable. Given a one-dimensional approach, a direction set method proceeds as follows:
Start at a point x_0 = (x_1, …, x_p).
Consider a set of vector directions n_1, n_2, …, n_p (e.g., these might arise from the gradient of f).
In the direction n_1, find the scalar λ that minimizes f(x_0 + λn_1) using a one-dimensional method. Replace x_0 with x_0 + λn_1.
Iterate through n_2, …, n_p, and continue iterating until convergence.
Note that you can use whatever one-dimensional optimization routine you want for the line minimizations (say, Brent's or the Golden Section Search); a sketch of a single pass appears below.
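This sketch uses optimize as the one-dimensional routine and the coordinate axes as a fixed direction set (Powell's method would also update the directions as it goes; all names and the example are mine):

```r
# One pass of a direction-set method: line-minimize along each direction in turn
direction_set_pass <- function(f, x, directions, lambda_max = 10) {
  for (i in seq_len(ncol(directions))) {
    n_i <- directions[, i]
    lambda_star <- optimize(function(lambda) f(x + lambda * n_i),
                            interval = c(-lambda_max, lambda_max))$minimum
    x <- x + lambda_star * n_i              # replace x with x + lambda * n_i
  }
  x
}

f <- function(x) (x[1] - 1)^2 + 2 * (x[2] + 3)^2
direction_set_pass(f, c(0, 0), diag(2))     # approx. c(1, -3)
```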

Conjugate Gradient Methods
If you can compute the gradient, it turns out that you can enjoy substantial computational savings over a direction set method. The idea is to choose directions based on the gradient, but it turns out that the path of steepest descent (i.e., given a current guess x_i for the minimum, the direction of steepest descent is the negative gradient evaluated at x_i) is not a good direction. See the figure on the following slide.
Instead, a set of conjugate directions is derived such that we do not just proceed down the new gradient, but in a direction that is conjugate to the old gradient and conjugate to all previous directions traversed.
Note: given the symmetric Hessian H, two vectors n_i and n_j are said to be conjugate if n_i·H·n_j = 0.
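In R, a conjugate gradient minimizer is available through optim with method = "CG"; the objective, gradient, and starting values below are mine:

```r
# Conjugate gradients via optim(); supplying the analytic gradient gr
# avoids a finite-difference approximation
f  <- function(x) sum((x - c(1, 2, 3))^2)
gr <- function(x) 2 * (x - c(1, 2, 3))
optim(rep(0, 3), f, gr, method = "CG")$par  # approx. c(1, 2, 3)
```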

Problems with Steepest Descent
(a) In a long, narrow valley, steepest descent takes many steps to reach the valley floor. (b) For a single magnified step, the direction begins perpendicular to the contours, but winds up parallel to the local contours where the minimum along the step is reached.

Quasi-Newton Methods
Similar to conjugate gradient methods, in the sense that we are accumulating information from p successive line minimizations, using gradient information to find the minimum of a quadratic form. Quasi-Newton methods can be thought of as a means of applying Newton-Raphson to the gradient, without the need for the Hessian. Using N-R with the gradient, given a current guess x_i, the next guess is given by:
x_{i+1} = x_i − H⁻¹ ∇f(x_i).
Note that with quasi-Newton, we start out with a positive-definite matrix used as an approximation to the Hessian. Successive iterations update this approximation, which converges to the actual Hessian. The most common implementations of this approach are the so-called Davidon-Fletcher-Powell (DFP) and Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithms.

Newton-Raphson in R:
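The R code on this slide did not survive transcription; the following is a minimal sketch consistent with the preceding slides rather than the original code (the function name, example system, and starting values are mine):

```r
# Multivariate Newton-Raphson: solve J delta = -F(x), then update x <- x + delta
newton_system <- function(F, x0, tol = 1e-10, maxit = 100) {
  x <- x0
  for (i in seq_len(maxit)) {
    Fx <- F(x)
    p  <- length(x)
    J  <- matrix(0, length(Fx), p)
    for (j in seq_len(p)) {            # forward-difference Jacobian, as above
      h <- sqrt(.Machine$double.eps) * max(abs(x[j]), 1)
      e <- numeric(p); e[j] <- 1
      J[, j] <- (F(x + h * e) - Fx) / h
    }
    delta <- solve(J, -Fx)             # linear solve for the correction
    x <- x + delta
    if (max(abs(delta)) < tol) break   # infinity-norm convergence check
  }
  x
}

# Example: intersect the circle x^2 + y^2 = 4 with the line y = x
F <- function(x) c(x[1]^2 + x[2]^2 - 4, x[2] - x[1])
newton_system(F, c(1, 0.5))  # approx. c(1.4142, 1.4142)
```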

Simplex and Quasi-Newton Methods in R:
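The original code here was also not transcribed; a hedged sketch using base R's optim (the objective and starting values are mine) might look like:

```r
# optim() defaults to Nelder-Mead; method = "BFGS" gives a quasi-Newton fit.
# If gr is omitted, BFGS approximates the gradient by finite differences.
f <- function(x) (x[1] - 1)^2 + 5 * (x[2] + 2)^2

optim(c(0, 0), f)                   # Nelder-Mead (no gradient required)
optim(c(0, 0), f, method = "BFGS")  # quasi-Newton; both converge near c(1, -2)
```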