Numerical Methods in Informatics
Lecture 2, 30.09.2016: Nonlinear Equations in One Variable
http://www.math.uzh.ch/binf4232
Tulin Kaman, Institute of Mathematics, University of Zurich
E-mail: tulin.kaman@math.uzh.ch
T. Kaman, Numerical Methods in Informatics, Lecture 2, 30.09.2016
3. Nonlinear Equations in One Variable

Goals of today
To develop useful methods for a basic, simply stated problem, including such favourites as fixed point iteration and Newton's method; to develop and assess several algorithmic concepts that are prevalent throughout the field of numerical computing; to study basic algorithms for minimizing a function in one variable.
Outline
- Bisection method
- Fixed point iteration
- Newton's method and variants
- Minimizing a function in one variable
Solving nonlinear equations
We want to find solutions of the scalar nonlinear equation f(x) = 0, where the independent variable x lies in an interval [a, b]. If x* is a value of x at which f vanishes, then x* is a root or zero of f.

How many roots? This depends not only on the function but also on the interval: f(x) = sin(x) has one root in [−π/2, π/2], two roots in [−π/4, 5π/4], and no roots in [π/4, 3π/4].

Why study a nonlinear problem before addressing linear ones?! One linear equation, e.g. ax = b, is too simple (solution: x = b/a for a ≠ 0), whereas a system of linear equations (Chapters 5 and 7) can have many complications. Several important general methods, and several important algorithm properties, can be described and defined in this simple context.
Solving nonlinear equations: examples
1. f(x) = sin(x). Since sin(nπ) = 0 for any integer n, on the interval [a, b] = [π/2, 3π/2] there is one root, x* = π; on [a, b] = [0, 4π] there are five roots.
2. f(x) = x^3 − 30x^2 + 2552, 0 ≤ x ≤ 20. One root.
3. f(x) = 10 cosh(x/4) − x, for all real x. No real root: not every equation has a solution.
Iterative methods for finding roots
It is not realistic to expect to find a solution of a nonlinear equation by a closed-form formula: such formulas exist, practically speaking, only for very low order polynomials. We have to resort to iterative methods: starting with an initial iterate x0, the method generates a sequence of iterates x1, x2, … that (hopefully) converges to a root of the function. A rough knowledge of the root's location is required. To find approximate locations of roots we can roughly plot the function, or probe it, trying to find two arguments a, b such that f(a)f(b) < 0.

Intermediate value theorem: If f ∈ C[a, b] and s is a value such that f(â) ≤ s ≤ f(b̂) for two numbers â, b̂ ∈ [a, b], then there exists a real number c ∈ [a, b] for which f(c) = s.
Iterative methods for finding roots
Generally, for a nonlinear problem we must consider an iterative method: starting with an initial iterate (guess) x0, generate a sequence of iterates x1, x2, …, x_k, … that hopefully converges to a root x*. Desirable qualities of a root-finding algorithm are:
- Efficient: requires a small number of function evaluations.
- Robust: fails rarely, if ever; announces failure if it does fail.
- Requires a minimal amount of additional information, such as the derivative of f.
- Requires f to satisfy only minimal smoothness properties.
- Generalizes easily and naturally to many equations in many unknowns.
Like many other wish lists, this one is hard to fully satisfy...
Bisection method
Simple and safe, requiring only minimal assumptions on the function f(x); but slow, and hard to generalize to higher dimensions.
Bisection method: development
Given a < b such that f(a)f(b) < 0, there must be a root in [a, b]; refer to [a, b] as the uncertainty interval. At each iteration, evaluate f(p) at the midpoint p = (a + b)/2 and check the sign of f(a)f(p): if positive, set a ← p; if negative, set b ← p. This halves the length of the uncertainty interval at each iteration, so, setting x_n = p, the error after n iterations satisfies

  |x* − x_n| ≤ ((b − a)/2) · 2^(−n).

Stopping criterion: given an (absolute) tolerance atol, require ((b − a)/2) · 2^(−n) ≤ atol. This allows an a priori determination of the number of iterations n, which is unusual among algorithms for nonlinear problems:

  n = ⌈log2((b − a)/(2 · atol))⌉.
The bisect function (MATLAB):

  function [p,n] = bisect(func,a,b,fa,fb,atol)
  % Bisection: find a root of func in [a,b] to absolute tolerance atol,
  % given fa = func(a), fb = func(b) with fa*fb < 0.
  if (a >= b) || (fa*fb >= 0) || (atol <= 0)
      disp('something wrong with the input: quitting');
      p = NaN; n = NaN;
      return
  end
  n = ceil(log2(b-a) - log2(2*atol));
  for k = 1:n
      p = (a+b)/2;
      fp = feval(func,p);
      if fa * fp < 0
          b = p; fb = fp;
      else
          a = p; fa = fp;
      end
  end
  p = (a+b)/2;
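For readers without MATLAB, the routine above can be transcribed roughly into Python (an illustrative sketch, not part of the original lecture; the signature is simplified to compute f(a), f(b) internally, and the cubic f(x) = x^3 − 30x^2 + 2552 from the earlier example serves as a test problem):

```python
import math

def bisect(func, a, b, atol):
    """Bisection: find a root of func in [a, b] to absolute tolerance atol.
    Requires func(a) and func(b) to have opposite signs."""
    fa, fb = func(a), func(b)
    if a >= b or fa * fb >= 0 or atol <= 0:
        raise ValueError("something wrong with the input")
    # number of iterations known a priori: ((b - a)/2) * 2**(-n) <= atol
    n = math.ceil(math.log2(b - a) - math.log2(2 * atol))
    for _ in range(n):
        p = (a + b) / 2
        fp = func(p)
        if fa * fp < 0:
            b = p            # root lies in [a, p]
        else:
            a, fa = p, fp    # root lies in [p, b]
    return (a + b) / 2, n

root, n = bisect(lambda x: x**3 - 30 * x**2 + 2552, 0, 20, 1e-8)
```

With atol = 1e−8 on [0, 20] the a priori count is n = ⌈log2(20/(2e−8))⌉ = 30 iterations.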
Fixed point iteration
This is an intuitively appealing approach which often leads to simple algorithms for complicated problems.
1. Write the given problem f(x) = 0 as g(x) = x, so that f(x*) = 0 iff g(x*) = x*; x* is a fixed point of the mapping g.
2. Iterate: x_{k+1} = g(x_k), k = 0, 1, …, starting with a guess x0.
It's all in the choice of the function g.
Choosing the function g
The convergence properties of the fixed point iteration depend on the choice of the function g. There are many possible choices of g for a given f: we are really considering a family of methods. Examples:
- g(x) = x − f(x),
- g(x) = x + 2f(x),
- g(x) = x − f(x)/f′(x) (assuming f′ exists and f′(x) ≠ 0).
The first two choices are simple; the last one has the potential to yield fast convergence. We want the resulting method to be simple, to converge, and to do so rapidly.
Questions that arise
Suppose that we have somehow determined a continuous function g ∈ C[a, b], and consider the fixed point iteration.
1. Is there a fixed point x* in [a, b]?
2. If yes, is it unique?
3. Does the sequence of iterates converge to a root x*?
4. If yes, how fast?
5. If not, does this mean that no root exists?
Fixed Point Theorem
If g ∈ C[a, b] and a ≤ g(x) ≤ b for all x ∈ [a, b], then there is a fixed point x* in the interval [a, b]. If, in addition, the derivative g′ exists and there is a constant ρ < 1 such that |g′(x)| ≤ ρ for all x ∈ (a, b), then the fixed point x* is unique in this interval.
Convergence of the fixed point iteration
Assuming ρ < 1 as in the fixed point theorem, we obtain

  |x_{k+1} − x*| = |g(x_k) − g(x*)| ≤ ρ|x_k − x*|.

This is a contraction by the factor ρ. So

  |x_{k+1} − x*| ≤ ρ|x_k − x*| ≤ ρ^2 |x_{k−1} − x*| ≤ … ≤ ρ^{k+1} |x0 − x*|.

Since the contraction factor satisfies ρ < 1, ρ^k → 0 as k → ∞. The smaller ρ, the faster the convergence.
Graphical illustration: x = e^(−x), starting from x0 = 1. This yields x1 = e^(−1), x2 = e^(−e^(−1)), …
Example: cosh with two roots
f(x) = g(x) − x, g(x) = 2 cosh(x/4). The functions x and 2 cosh(x/4) meet at two locations.
Fixed point iteration with g(x) = 2 cosh(x/4)
There are two fixed points (roots of f): 2 ≤ x1* ≤ 4 and 8 ≤ x2* ≤ 10, with x1* ≈ 2.35755106 and x2* ≈ 8.50719958. For tolerance 1e−8:
- Starting at x0 = 2, converge to x1* in 16 iterations.
- Starting at x0 = 4, converge to x1* in 18 iterations.
- Starting at x0 = 8, converge to x1* (even though x2* is closer to x0).
- Starting at x0 = 10, obtain overflow in 3 iterations.
Note: bisection yields both roots in 27 iterations.
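This behavior is easy to reproduce. The following Python sketch (illustrative, not the lecture's code; the stopping test on successive iterates is an assumption) confirms that both x0 = 2 and x0 = 8 land on the lower fixed point:

```python
import math

def fixed_point(g, x0, atol=1e-8, maxiter=100):
    """Iterate x_{k+1} = g(x_k) until successive iterates differ by < atol."""
    x = x0
    for k in range(1, maxiter + 1):
        x_new = g(x)
        if abs(x_new - x) < atol:
            return x_new, k
        x = x_new
    raise RuntimeError("no convergence within maxiter iterations")

g = lambda x: 2 * math.cosh(x / 4)

x_a, _ = fixed_point(g, 2.0)   # converges to the lower fixed point x1*
x_b, _ = fixed_point(g, 8.0)   # also converges to x1*, not the nearer x2*
```

The iteration cannot reach x2*: there |g′(x2*)| > 1, so x2* repels iterates.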
Rate of convergence
Suppose we want |x_k − x*| ≤ 0.1 |x0 − x*|. Since |x_k − x*| ≤ ρ^k |x0 − x*|, we want ρ^k ≤ 0.1, i.e. k ≥ −1/log10 ρ. Definition of the rate of convergence:

  rate = −log10 ρ.

It then takes about k = ⌈1/rate⌉ iterations to reduce the error by more than an order of magnitude. The higher the rate (i.e. the smaller ρ), the faster the convergence.
Return to example: cosh with two roots
- Bisection: ρ = 0.5, so rate = −log10 0.5 ≈ 0.3 and k = 4.
- For the root x1* of the fixed point example, ρ ≈ 0.31, so rate = −log10 0.31 ≈ 0.5 and k = 2.
- For the root x2* of the fixed point example, ρ > 1, so rate = −log10 ρ < 0: no convergence.
Newton's method
Not so simple; not very safe or robust; requires more than continuity of f(x). But fast, and generalizes automatically to systems.
Derivation
By Taylor series,

  f(x) = f(x_k) + f′(x_k)(x − x_k) + f″(η)(x − x_k)^2 / 2.

So, for x = x*,

  0 = f(x_k) + f′(x_k)(x* − x_k) + O((x* − x_k)^2).

The method is obtained by neglecting the nonlinear term, defining

  0 = f(x_k) + f′(x_k)(x_{k+1} − x_k),

which gives the iteration step

  x_{k+1} = x_k − f(x_k)/f′(x_k), k = 0, 1, 2, …
Geometric interpretation
Newton's method: the next iterate is the x-intercept of the tangent line to f at the current iterate.
Example: cosh with two roots
f(x) = 2 cosh(x/4) − x has two roots, x1* ≈ 2.35755106 and x2* ≈ 8.50719958, in the interval [2, 10]. Newton's iteration is

  x_{k+1} = x_k − (2 cosh(x_k/4) − x_k) / (0.5 sinh(x_k/4) − 1).

For absolute tolerance 1e−8:
- Starting at x0 = 2 requires 4 iterations to reach x1*.
- Starting at x0 = 4 requires 5 iterations to reach x1*.
- Starting at x0 = 8 requires 5 iterations to reach x2*.
- Starting at x0 = 10 requires 6 iterations to reach x2*.
The values of f(x_k), starting from x0 = 8, are displayed below:

  k       0         1        2        3        4         5
  f(x_k)  -4.76e-1  8.43e-2  1.56e-3  5.65e-7  7.28e-14  1.78e-15

Note that the number of significant digits essentially doubles at each iteration (until the 5th, when roundoff error takes over).
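A minimal Newton solver in Python (an illustrative sketch, not the lecture's code; the stopping test on the Newton step size is an assumption) reproduces both roots:

```python
import math

def newton(f, fprime, x0, atol=1e-8, maxiter=50):
    """Newton's method: x_{k+1} = x_k - f(x_k)/f'(x_k)."""
    x = x0
    for k in range(1, maxiter + 1):
        step = f(x) / fprime(x)
        x = x - step
        if abs(step) < atol:
            return x, k
    raise RuntimeError("no convergence within maxiter iterations")

f  = lambda x: 2 * math.cosh(x / 4) - x
fp = lambda x: 0.5 * math.sinh(x / 4) - 1

r1, _ = newton(f, fp, 2.0)    # lower root x1*
r2, _ = newton(f, fp, 8.0)    # upper root x2*
```

Unlike the fixed point iteration, Newton's method reaches whichever root its starting point leads it to, including x2*.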
Convergence theorem for Newton's method
If f ∈ C^2[a, b] and there is a root x* in [a, b] such that f(x*) = 0 and f′(x*) ≠ 0, then there is a number δ > 0 such that, starting with x0 anywhere in the neighborhood [x* − δ, x* + δ], Newton's method converges quadratically.

Idea of proof: expand f(x*) in a Taylor series about x_k; divide by f′(x_k), rearrange, and replace x_k − f(x_k)/f′(x_k) by x_{k+1}; this yields the relation between e_{k+1} = x_{k+1} − x* and e_k = x_k − x*.
Proof of the convergence theorem for Newton's method
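The proof itself was left for the board; one standard way to carry out the sketch above is the following (with e_k = x_k − x* and ξ_k a point between x_k and x*):

```latex
% Taylor expansion of f(x^*) about x_k, with remainder point \xi_k:
0 = f(x^*) = f(x_k) + f'(x_k)(x^* - x_k) + \tfrac{1}{2} f''(\xi_k)(x^* - x_k)^2 .
% Divide by f'(x_k) (nonzero near x^*, since f'(x^*) \neq 0):
0 = \frac{f(x_k)}{f'(x_k)} + x^* - x_k + \frac{f''(\xi_k)}{2 f'(x_k)}\, e_k^2 .
% Recognize x_{k+1} = x_k - f(x_k)/f'(x_k) in the first two terms:
x_{k+1} - x^* = \frac{f''(\xi_k)}{2 f'(x_k)}\, e_k^2
\quad\Longrightarrow\quad
|e_{k+1}| \le M |e_k|^2 , \qquad
M = \frac{\max_{|x - x^*| \le \delta} |f''(x)|}{2 \min_{|x - x^*| \le \delta} |f'(x)|} .
```

Choosing δ small enough that M·δ < 1 keeps the iterates in the neighborhood and forces e_k → 0 quadratically.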
Secant method (quasi-Newton)
One potential disadvantage of Newton's method is the need to know and evaluate the derivative of f. No derivative is needed in the secant method. Observe that near the root (assuming convergence)

  f′(x_k) ≈ (f(x_k) − f(x_{k−1})) / (x_k − x_{k−1}).

So, the secant iteration is

  x_{k+1} = x_k − f(x_k)(x_k − x_{k−1}) / (f(x_k) − f(x_{k−1})), k = 1, 2, …

The secant method is not a fixed point method but a multi-point method. Note the need for two initial iterates x0 and x1: this is a two-step method.
Example: cosh with two roots
f(x) = 2 cosh(x/4) − x has two roots, x1* ≈ 2.35755106 and x2* ≈ 8.50719958, in the interval [2, 10]. The secant iteration is

  x_{k+1} = x_k − (2 cosh(x_k/4) − x_k)(x_k − x_{k−1}) / [(2 cosh(x_k/4) − x_k) − (2 cosh(x_{k−1}/4) − x_{k−1})].

For absolute tolerance 1e−8:
- Starting at x0 = 2 and x1 = 4 requires 7 iterations to reach x1*.
- Starting at x0 = 10 and x1 = 8 requires 7 iterations to reach x2*.

  k       0     1         2         3        4         5         6
  f(x_k)  2.26  -4.76e-1  -1.64e-1  2.45e-2  -9.93e-4  -5.62e-6  1.30e-9

Observe superlinear convergence (convergence order ≈ 1.618):

  |x_{k+1} − x*| ≤ c |x_k − x*|^1.618, for some constant c,

much faster than bisection and fixed point iteration, yet not quite as fast as Newton's iteration.
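A Python sketch of the secant method (illustrative, not the lecture's code; the stopping test on successive iterates is an assumption) reproduces both roots from the starting pairs listed above:

```python
import math

def secant(f, x0, x1, atol=1e-8, maxiter=50):
    """Secant method: Newton with f'(x_k) replaced by the slope of the
    chord through the last two iterates. Needs two starting points."""
    f0, f1 = f(x0), f(x1)
    for k in range(1, maxiter + 1):
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        if abs(x2 - x1) < atol:
            return x2, k
        x0, f0 = x1, f1
        x1, f1 = x2, f(x2)
    raise RuntimeError("no convergence within maxiter iterations")

f = lambda x: 2 * math.cosh(x / 4) - x

r1, _ = secant(f, 2.0, 4.0)    # lower root x1*
r2, _ = secant(f, 10.0, 8.0)   # upper root x2*
```

Only one new function evaluation is needed per step, versus two (f and f′) for Newton, which often makes the secant method cheaper overall despite its lower convergence order.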
Newton iteration for multiple roots
If f(x) = (x − x*)^m q(x) with q(x*) ≠ 0, then x* is a root of multiplicity m. The fast local convergence rate of both the Newton and secant methods is predicated on the assumption that f′(x*) ≠ 0. If m > 1, then f′(x*) = 0 and Newton's method is only linearly convergent, with contraction factor ρ = 1 − 1/m. Show! (Idea of proof: with g(x) = x − f(x)/f′(x), derive g′(x) and evaluate it at x*.)
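The contraction factor ρ = 1 − 1/m is easy to observe numerically. A small experiment (my own illustration, not from the lecture) with the double root f(x) = (x − 1)^2, so m = 2 and the predicted factor is 1/2:

```python
def newton_step(x):
    # Newton step for f(x) = (x - 1)^2, a double root (m = 2) at x* = 1:
    # x - f(x)/f'(x) = x - (x - 1)/2, so the error is exactly halved
    f = (x - 1) ** 2
    fp = 2 * (x - 1)
    return x - f / fp

x = 2.0
errors = []
for _ in range(10):
    x = newton_step(x)
    errors.append(abs(x - 1.0))

# successive error ratios equal rho = 1 - 1/m = 0.5: linear, not quadratic
ratios = [errors[i + 1] / errors[i] for i in range(len(errors) - 1)]
```

The error shrinks by a fixed factor 1/2 per step, in contrast to the doubling of significant digits seen at the simple roots of the cosh example.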
Review: Convergence
Definition: speed (rate) of convergence. Let {x_k}, k = 0, 1, …, be a sequence in R^n that converges to x* ∈ R^n, and assume x_k ≠ x* for each k. We say that the rate of convergence of {x_k} to x* is of order r, with asymptotic error constant C, if

  lim_{k→∞} ‖x_{k+1} − x*‖ / ‖x_k − x*‖^r = C,

where r ≥ 1 and C > 0.
Convergence rates of the methods
- Bisection method: converges linearly, with asymptotic error constant 1/2.
- Fixed point iteration: converges linearly, with asymptotic error constant |g′(x*)| (assuming 0 < |g′(x*)| < 1).
- Newton's method: converges quadratically, with asymptotic error constant |f″(x*)/(2f′(x*))|.
Recommendation: determine the convergence rate of each method.
Review: speed (rate) of convergence
A given method is said to be
- linearly convergent if there is a constant ρ < 1 such that |x_{k+1} − x*| ≤ ρ|x_k − x*|,
- quadratically convergent if there is a constant M such that |x_{k+1} − x*| ≤ M|x_k − x*|^2,
- superlinearly convergent if there is a sequence of constants ρ_k → 0 such that |x_{k+1} − x*| ≤ ρ_k|x_k − x*|,
for all k sufficiently large.
Minimizing a function in one variable
A major source of applications giving rise to root finding is optimization. One-variable version: find an argument x = x̂ that minimizes a given objective function. Example: find x = x̂ that minimizes Φ(x) = 10 cosh(x/4) − x. This function has no zeros, but it has one minimizer, around x = 1.6.
Conditions for an optimum, and algorithms
Necessary condition for an optimum: suppose Φ ∈ C^2 and denote f(x) = Φ′(x). Then a zero of f is a critical point of Φ, i.e. a point x* where Φ′(x*) = 0. To be a minimizer or a maximizer, x* must be a critical point.

Sufficient condition for an optimum: a critical point x* is
- a minimizer of Φ(x) if also Φ″(x*) > 0,
- a maximizer of Φ(x) if also Φ″(x*) < 0;
- further investigation is required if Φ″(x*) = 0.

Hence, an algorithm for finding a minimizer is obtained by using one of the methods we have learned to find the roots x* of Φ′(x), then checking for each such root whether Φ″(x*) > 0.
Example
To find a minimizer of Φ(x) = 10 cosh(x/4) − x:
1. Calculate the first derivative: Φ′(x) = 2.5 sinh(x/4) − 1.
2. Find the root of Φ′(x) = 0 using any of our methods, obtaining x* ≈ 1.56014.
3. Calculate the second derivative: Φ″(x) = (2.5/4) cosh(x/4) > 0 for all x, so x* is a minimizer.
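The three steps above can be put together in a short Python sketch (illustrative, not the lecture's code), using Newton's method on Φ′, for which Φ″ conveniently serves as the derivative:

```python
import math

def newton(f, fprime, x0, atol=1e-8, maxiter=50):
    """Newton's method for f(x) = 0."""
    x = x0
    for _ in range(maxiter):
        step = f(x) / fprime(x)
        x = x - step
        if abs(step) < atol:
            return x
    raise RuntimeError("no convergence within maxiter iterations")

# Phi(x) = 10*cosh(x/4) - x; minimize by solving Phi'(x) = 0
dphi  = lambda x: 2.5 * math.sinh(x / 4) - 1       # Phi'(x)
d2phi = lambda x: (2.5 / 4) * math.cosh(x / 4)     # Phi''(x) > 0 for all x

xmin = newton(dphi, d2phi, 1.0)   # root of Phi' near the plotted minimum
```

Since Φ″ > 0 everywhere, any root of Φ′ found this way automatically satisfies the sufficient condition and is a minimizer.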