Root Finding and Optimization
Ramses van Zon, SciNet, University of Toronto
Scientific Computing Lecture 11, February 11, 2014
Root Finding

It is not uncommon in scientific computing to want to solve an equation numerically.
If there is one unknown and only one equation, we can always write the equation as f(x) = 0.
If x satisfies this equation, it is called a root.
If there is a set of equations, one can write:
  f1(x1, x2, x3, ...) = 0
  f2(x1, x2, x3, ...) = 0
  f3(x1, x2, x3, ...) = 0
  ...
The one-dimensional case is considerably easier to solve, so we consider it first.
1D root finding

What's so nice about 1D?
Algorithms always start from an initial guess and (usually) a bounding interval [a, b] in which the root is to be found.
If f(a) and f(b) have opposite signs, and f is continuous, there must be a root inside the interval: the root is bracketed.
Consecutive refinement of the interval until b − a < ε is guaranteed to find the root.
Bracketing

Must find a bounding interval first. Strategies:
- Plot the function.
- Make a conservative guess for [a, b], then slice up the interval, checking for a sign change of f.
- Make a wild guess and expand the interval until f exhibits a sign change.
Troublemakers: no sign change at all, sign changes that are easily missed, and singularities (a sign change without a root).
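The "wild guess and expand" strategy can be sketched in code. This is a minimal illustration, not the lecture's implementation: the name `expand_bracket`, the growth factor 1.6, and the example function are choices of this sketch.

```cpp
#include <cmath>

// Expand [a,b] geometrically until f changes sign at the endpoints
// (or until we give up). Returns true if a sign change was found.
// The endpoint with the larger |f| is pushed outward each step.
bool expand_bracket(double (*f)(double), double& a, double& b,
                    int maxit = 50)
{
    double fa = f(a), fb = f(b);
    for (int i = 0; i < maxit && fa*fb > 0.0; ++i) {
        if (std::fabs(fa) < std::fabs(fb)) {
            a += 1.6*(a - b);   // move the left end further left
            fa = f(a);
        } else {
            b += 1.6*(b - a);   // move the right end further right
            fb = f(b);
        }
    }
    return fa*fb <= 0.0;        // true: [a,b] now brackets a root
}

// example function with a root at x = 10, outside an initial [0,1]
double shifted(double x) { return x - 10.0; }
```

Note that because the expansion takes large steps, it can jump straight over a pair of closely spaced roots: the "easily missed" troublemaker above.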
Suppose we've bracketed a root, what's next?

Classic root finding algorithms:
- Bisection
- Secant / False Position
- Ridders' / Brent
- Newton-Raphson (requires derivatives)
All of these zero in on the root, but have different convergence and stability characteristics.
Note: for polynomial f there are specialized routines (Muller, Laguerre); for eigenvalue problems there are specialized routines (linear algebra).
Bisection method

- Split the interval [a1, b1] at its midpoint.
- Compute f(midpoint).
- Keep the half in which f changes sign as the new root bracket [a2, b2].
- Repeat until the bracket is small enough.
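As a sketch of the idea in code (illustrative, not the lecture's implementation; the name `bisect` and the tolerance are assumptions of this example):

```cpp
#include <cmath>

// Bisection: repeatedly halve a bracketing interval [a,b], where f(a)
// and f(b) have opposite signs, keeping the half across which f
// changes sign.
double bisect(double (*f)(double), double a, double b,
              double eps = 1e-10)
{
    double fa = f(a);
    while (b - a > eps) {
        double m  = 0.5*(a + b);   // midpoint
        double fm = f(m);
        if (fa*fm <= 0.0) {        // sign change in [a,m]
            b = m;
        } else {                   // sign change in [m,b]
            a = m;
            fa = fm;
        }
    }
    return 0.5*(a + b);
}
```

For example, `bisect([](double x){ return std::cos(x) - x; }, 0.0, 1.0)` converges to about 0.739085; each iteration halves the bracket, gaining one bit of accuracy per step.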
False Position Method

- Linearly interpolate f between the bracket endpoints a1 and b1.
- Solve the interpolating line for the next x.
- Get the new root bracket and repeat.
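A minimal sketch of false position (again illustrative; the name `false_position` and the stopping rule on |f| are choices of this example):

```cpp
#include <cmath>

// False position: like bisection, but the new point is the zero of
// the straight line through (a, f(a)) and (b, f(b)) instead of the
// midpoint; the sign-change bracket is kept at every step.
double false_position(double (*f)(double), double a, double b,
                      double tol = 1e-12, int maxit = 200)
{
    double fa = f(a), fb = f(b), x = a;
    for (int i = 0; i < maxit; ++i) {
        x = b - fb*(b - a)/(fb - fa);      // linear interpolation
        double fx = f(x);
        if (std::fabs(fx) < tol) break;    // |f| small enough: done
        if (fa*fx < 0.0) { b = x; fb = fx; }   // root now in [a,x]
        else             { a = x; fa = fx; }   // root now in [x,b]
    }
    return x;
}
```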
Other non-derivative methods

- Secant: like false position, but keep the most recent point.
- Ridders' method: uses an exponential fit.
- Brent's method: uses an inverse quadratic fit.
Newton-Raphson method

- Guess x1.
- From the function and its derivative, approximate the root: x → x − f(x)/f′(x).
- Repeat, generating x2, x3, ...
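The iteration is short enough to sketch directly (illustrative code; `newton` and the example function are assumptions of this sketch):

```cpp
#include <cmath>

// Newton-Raphson iteration: x -> x - f(x)/f'(x),
// i.e. follow the tangent line at x down to its zero.
double newton(double (*f)(double), double (*df)(double),
              double x, double tol = 1e-12, int maxit = 50)
{
    for (int i = 0; i < maxit; ++i) {
        double fx = f(x);
        if (std::fabs(fx) < tol) break;
        x -= fx/df(x);
    }
    return x;
}

// example: the positive root of x^2 - 2 is sqrt(2)
double g (double x) { return x*x - 2.0; }
double dg(double x) { return 2.0*x; }
```

`newton(g, dg, 1.0)` converges quadratically to 1.41421356... Note that if a step lands near a zero of f′, the tangent shoots off to a far-away point: the method can be unstable.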
Convergence and Stability

method           convergence                          stability
Bisection        ε_{n+1} = ε_n/2                      stable
Secant           ε_{n+1} = c ε_n^1.6                  no bracket guarantee
False position   ε_{n+1} = ε_n/2 ... c ε_n^1.6        stable
Ridders'         ε_{n+2} = c ε_n^2                    stable
Brent            ε_{n+1} = ε_n/2 ... c ε_n^2          stable
Newton-Raphson   ε_{n+1} = c ε_n^2                    can be unstable
GNU Scientific Library (GSL)

A C library containing many useful scientific routines:
- Root finding
- Minimization
- Sorting
- Linear algebra, eigensystems
- Fast Fourier transforms
- Integration, differentiation, interpolation, approximation
- Random numbers
- Statistics, histograms, fitting
- Monte Carlo integration, simulated annealing
- ODEs
- Polynomials, permutations
- Special functions
- Vectors, matrices
GSL root finding example

#include <iostream>
#include <cmath>
#include <gsl/gsl_roots.h>

struct Params {
    double v, w, a, b, c;
};

double examplefunction(double x, void* param)
{
    Params* p = (Params*)param;
    return p->a*cos(sin(p->v + p->w*x)) + p->b*x - p->c*x*x;
}

int main()
{
    double x_lo = -4.0;
    double x_hi = 5.0;
    Params args = {0.3, 2/3.0, 2.0, 1/1.3, 1/30.0};
    gsl_root_fsolver* solver;
    gsl_function wrapper;
    solver = gsl_root_fsolver_alloc(gsl_root_fsolver_brent);
    wrapper.function = examplefunction;
    wrapper.params   = &args;
    gsl_root_fsolver_set(solver, &wrapper, x_lo, x_hi);
    std::cout << " iter [ lower, upper] root err\n";
    int status = 1;
    for (int iter = 0; status and iter < 100; ++iter) {
        gsl_root_fsolver_iterate(solver);
        double x_rt = gsl_root_fsolver_root(solver);
        double x_lo = gsl_root_fsolver_x_lower(solver);
        double x_hi = gsl_root_fsolver_x_upper(solver);
        std::cout << iter << " " << x_lo << " " << x_hi << " "
                  << x_rt << " " << x_hi - x_lo << "\n";
        status = gsl_root_test_interval(x_lo, x_hi, 0, 0.001);
    }
    gsl_root_fsolver_free(solver);
    return status;
}

Compilation and linkage:
$ g++ -c -O2 -I GSLINCDIR gslrx.cc -o gslrx.o
$ g++ gslrx.o -o gslrx -L GSLLIBDIR -lgsl -lgslcblas
(specify the directories if GSL is installed in a non-standard location)
Multidimensional Root Finding

f(x) = 0, or

  f1(x1, x2, x3, ..., xD) = 0
  f2(x1, x2, x3, ..., xD) = 0
  ...
  fD(x1, x2, x3, ..., xD) = 0

- Cannot bracket a root with a finite number of points.
- The roots of each equation define a (D−1)-dimensional hypersurface.
- We are looking for the possible intersections of these hypersurfaces.
Newton-Raphson for Multidimensional Root Finding

Given a good initial guess x0, Newton-Raphson can work in arbitrary dimensions:

  f(x0 + δx) ≈ f(x0) + (∂f/∂x) · δx = 0
  −f(x0) = (∂f/∂x) δx = J δx
  δx = −J⁻¹ f(x0)

where J = ∂f/∂x is the Jacobian matrix. Requires inverting a D×D matrix, or at least solving a linear set of equations: see the lecture on linear algebra.
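For D = 2 the update δx = −J⁻¹ f(x0) can be written out explicitly. The following sketch uses a hypothetical example system (f1 = x² + y² − 4, f2 = x − y, chosen for this illustration) and solves the 2×2 linear system by Cramer's rule:

```cpp
#include <cmath>

// 2D Newton-Raphson for the example system
//   f1(x,y) = x^2 + y^2 - 4 = 0
//   f2(x,y) = x - y         = 0
// whose roots are (±sqrt(2), ±sqrt(2)).
void newton2d(double& x, double& y, double tol = 1e-12, int maxit = 50)
{
    for (int i = 0; i < maxit; ++i) {
        double f1 = x*x + y*y - 4.0;
        double f2 = x - y;
        if (std::fabs(f1) + std::fabs(f2) < tol) break;
        // Jacobian J = [ df1/dx  df1/dy ] = [ 2x  2y ]
        //              [ df2/dx  df2/dy ]   [  1  -1 ]
        double j11 = 2.0*x, j12 = 2.0*y, j21 = 1.0, j22 = -1.0;
        double det = j11*j22 - j12*j21;
        // solve J * (dx,dy) = (-f1,-f2) by Cramer's rule
        double dx = (-f1*j22 + f2*j12)/det;
        double dy = (-f2*j11 + f1*j21)/det;
        x += dx;
        y += dy;
    }
}
```

Starting from (1, 2) this converges to (√2, √2) in a handful of iterations. For larger D, replace the explicit formulas with a call to a linear solver.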
Convergence and Stability

As in 1D, Newton-Raphson can be unstable. We need some safeguard so that the iteration steps do not spin out of control. There are several ways, e.g. make sure that |f(x)|² gets smaller in each step. This can potentially still fail, but usually does the trick.
Optimization
Optimization = Minimization = Maximization

There's a function f(x1, x2, ..., xD). We want to know for which set of x1, x2, ... the function is at its minimum. Maximizing f is minimizing −f.
This differs from solving ∂f/∂xi = 0: stationary points of f also include maxima and saddle points.
Once more, D = 1 is substantially different from the higher-dimensional case.
There's no guarantee that we find the global minimum; methods return local minima.
1D Minimization Strategy

Local minimum: smallest function value in its neighbourhood.
Global minimum: smallest such function value for all x.
Generalize bracketing to minimization: a < b < c bracket a minimum of f if f(a) > f(b) < f(c).
Once a minimum is bracketed, we can successively refine the bracket.
Minimization Methods for 1D

Golden Section Method
- Choose the larger of the intervals [a, b] and [b, c].
- Place a new point d a fraction w into the larger interval.
- The new triplet is the next set of three points bracketing the minimum.
- Optimal choice: w = (3 − √5)/2 ≈ 0.38197.

Brent's Method
- Uses quadratic interpolation through the three points to find a new point.

All of these are in the GSL library.
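The golden section steps above can be sketched as follows (illustrative code, not from the lecture; `golden_min` and the example function are assumptions of this sketch):

```cpp
#include <cmath>

// Golden section search on a bracket a < b < c with f(a) > f(b) < f(c).
// Each step places a trial point a fraction w into the larger
// subinterval and keeps a triplet that still brackets the minimum.
double golden_min(double (*f)(double), double a, double c,
                  double eps = 1e-8)
{
    const double w = 0.5*(3.0 - std::sqrt(5.0));   // ≈ 0.38197
    double b  = a + w*(c - a);                     // initial interior point
    double fb = f(b);
    while (c - a > eps) {
        double d = (c - b > b - a) ? b + w*(c - b)  // probe larger side
                                   : b - w*(b - a);
        double fd = f(d);
        if (fd < fb) {               // d is the new best point
            if (d > b) a = b; else c = b;
            b  = d;
            fb = fd;
        } else {                     // b stays best; d tightens the bracket
            if (d > b) c = d; else a = d;
        }
    }
    return b;
}

// example with a single minimum at x = 1
double parab(double x) { return (x - 1.0)*(x - 1.0); }
```

With the interior point started a fraction w into the interval, the bracket shrinks by a factor of about 0.618 per function evaluation, which is why w is called the optimal choice.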
Finding the global minimum

There's no guarantee that we find the global minimum; methods return local minima. If we're not sure:
- Essentially, try again from different starting points, and check which of the found minima is the lowest.
- Some methods, especially those using random numbers like simulated annealing, have ways of escaping local minima.
Multidimensional Minimization

No bracketing method is available. Not as dire as multidimensional root finding: down is down, so as long as the function is bounded from below, you're likely to find at least a local minimum.

Common strategy
Most methods are built upon 1D methods:
- Start from a trial point P.
- Perform one-dimensional minimizations in D independent directions.
- Hope for the best.
- Test for convergence.
Finding independent directions

Just choosing D fixed directions leads to a staircase scenario: it might get to the minimum eventually, but requires a lot of small 1D minimizations to do so.
Finding independent directions

Non-gradient based direction sets
- Use the previous 1D minimizations to guide the choice of direction sets: Powell's method.

Gradient based direction sets
- These use the derivatives to determine in which directions to go next.
- Steepest descent: it may seem like a good idea to just go in the direction of the (negative) gradient, but you would have to take many right-angle turns.
- Conjugate gradient: essentially, you're solving the system as if it were quadratic. Better.

All of these are in the GSL library.
Simulated Annealing

All of these methods have trouble with functions that have (many) local minima. Let's look for inspiration in Nature. Why? Nature is a very good minimizer. For instance, when the temperature drops below 0 °C, water molecules find a new low-energy configuration, which any of our previous minimization methods would have a tough time finding: D = O(10²³). The mechanism behind this is that while the temperature drops, the molecules are constantly moving in and out of local minima, thus exploring much more of configurational space.
Simulated Annealing

Ok, so how does that help us?
- The D-dimensional space becomes a configuration space.
- f becomes an energy function for the system.
- We assign the system a temperature T.
- We assign the system a dynamics such that the equilibrium distribution of configurations is P(x) ∝ e^(−f(x)/T). E.g. use the Monte Carlo method from last week's lecture!
- Evolve this system with the given dynamics, but from time to time, reduce the temperature T.
- As T → 0 the system will get trapped in a minimum. If we anneal slowly enough, there's a good chance it's the global minimum.
Simulated Annealing

Final notes
- Typically does not converge as fast as the deterministic methods.
- But applicable in many hard cases (travelling salesman).
- GSL has it too.
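A toy 1D version of the scheme makes the pieces concrete. Everything here is an illustrative choice, not from the lecture or GSL: the Metropolis acceptance rule, the geometric cooling schedule, the step size, and the test function.

```cpp
#include <cmath>
#include <random>

// Toy 1D simulated annealing: Metropolis moves at temperature T,
// with T lowered geometrically. All parameters (initial T, cooling
// factor, step size, moves per temperature) are illustrative choices.
double anneal(double (*f)(double), double x, unsigned seed = 42)
{
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> u(0.0, 1.0);
    double fx = f(x);
    for (double T = 2.0; T > 1e-4; T *= 0.95) {   // cooling schedule
        for (int k = 0; k < 200; ++k) {           // moves per temperature
            double xn = x + (u(gen) - 0.5);       // random trial step
            double fn = f(xn);
            // Metropolis rule: always accept downhill moves; accept
            // uphill moves with probability exp(-(fn - fx)/T)
            if (fn < fx || u(gen) < std::exp((fx - fn)/T)) {
                x  = xn;
                fx = fn;
            }
        }
    }
    return x;
}

// a function with many local minima; the global minimum is at x = 0
double wiggly(double x) { return x*x + 3.0*std::sin(5.0*x)*std::sin(5.0*x); }
```

At high T the walker hops freely between the basins of `wiggly`; as T drops it settles into a deep one. GSL's production version of this idea is `gsl_siman_solve` in `<gsl/gsl_siman.h>`.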