Week 3

Jack Carl Kiefer (1924-1981)
Jack Kiefer was an American statistician. Much of his research was on the optimal design of experiments. However, he also made significant contributions to other areas of statistics and optimization, including the introduction of golden section search (his master's thesis work) and the Dvoretzky-Kiefer-Wolfowitz inequality.

Question: How fast does x_n converge to x*? To answer this question, we study the order of convergence.

Order of convergence
Let x* be a fixed point of g (also a root of f). We have
  x_{n+1} = g(x_n),  x* = g(x*)
Suppose x_n converges to x*. Let e_n = x_n - x*. We have
  e_{n+1} = x_{n+1} - x* = g(x_n) - g(x*) = g(x* + e_n) - g(x*)
Note: If we use the mean value theorem, then g(x_n) - g(x*) = g'(ξ_n)(x_n - x*) with ξ_n between x_n and x*. It is hard to evaluate g'(ξ_n).
We apply the Taylor expansion of g at x*:
  g(x* + e_n) = g(x*) + g'(x*) e_n + (1/2!) g''(x*) e_n^2 + o(e_n^2)
  ==> e_{n+1} = g'(x*) e_n + (1/2!) g''(x*) e_n^2 + o(e_n^2)

Notations (big O and small o):
If F(h)/h^p -> C != 0 as h -> 0, then we say F(h) = O(h^p) as h -> 0
("F(h) is big O of h^p" or "F(h) is of order h^p").
If F(h)/h^p -> 0 as h -> 0, then we say F(h) = o(h^p) as h -> 0.
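The error recursion e_{n+1} = g'(x*) e_n + (1/2!) g''(x*) e_n^2 + ... can be observed numerically. Below is a small Python sketch (an illustration, not one of the course's MATLAB files); the iteration x_{n+1} = e^{-x_n} and the function f(x) = x - e^{-x} are convenient stand-ins that share the fixed point x* ≈ 0.567.

```python
import math

# Fixed-point iteration x_{n+1} = exp(-x_n): e_{n+1}/e_n -> g'(x*) = -exp(-x*).
x = 0.5
for _ in range(200):
    x = math.exp(-x)
xstar = x                      # fixed point of g(x) = exp(-x), about 0.567

x, ratios = 0.5, []
for _ in range(5):
    e_old = x - xstar
    x = math.exp(-x)
    ratios.append((x - xstar) / e_old)
# ratios[-1] is close to g'(x*) = -xstar: linear convergence.

# Newton's method for f(x) = x - exp(-x) (same root x*):
x = 0.5
for _ in range(3):
    x = x - (x - math.exp(-x)) / (1.0 + math.exp(-x))
newton_err = abs(x - xstar)    # already at round-off level: quadratic convergence
```

The successive error ratios of the fixed-point iteration settle near g'(x*) ≈ -0.567, while Newton's method reaches machine accuracy in three steps.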
Definition of order of convergence
If g'(x*) != 0, then we have
  e_{n+1} ≈ g'(x*) e_n ==> |e_{n+1}| ≈ |g'(x*)| |e_n|.
This is called linear convergence (first order convergence).
If g'(x*) = 0 but g''(x*) != 0, then we have
  e_{n+1} ≈ (g''(x*)/2!) e_n^2.
This is called quadratic convergence (second order convergence).
If g'(x*) = g''(x*) = ... = g^(p-1)(x*) = 0 but g^(p)(x*) != 0, then we have
  e_{n+1} ≈ (g^(p)(x*)/p!) e_n^p.
This is called p-th order convergence.
See NL_solvers/newton_convg.m (which shows Newton's method converges quadratically) and NL_solvers/iter_convg.m (which shows the linear convergence of the iteration x_{n+1} = e^{-x_n} for solving x = e^{-x}).

Multiplicity of a root
Recall that x* is a root of f if f(x*) = 0. Taylor expansion of f around x*:
  f(x) = f(x*) + f'(x*)(x - x*) + (1/2!) f''(x*)(x - x*)^2 + ...
Definition: If f(x*) = 0 but f'(x*) != 0, then x* is called a simple root (a root of multiplicity 1).
If f(x*) = f'(x*) = 0 but f''(x*) != 0, then x* is called a double root (a root of multiplicity 2).
If f(x*) = f'(x*) = ... = f^(p-1)(x*) = 0 but f^(p)(x*) != 0, then x* is called a root of multiplicity p.

Order of convergence of Newton's method
Newton's method: x_{n+1} = g(x_n), g(x) = x - f(x)/f'(x).
1. If x* is a simple root (f(x*) = 0, f'(x*) != 0), then we have
  g'(x) = f(x) f''(x) / (f'(x))^2 ==> g'(x*) = 0
==> at least quadratic convergence.
Conclusion: At a simple root, Newton's method has at least quadratic convergence.
2. If x* is a root of multiplicity p (p > 1), then we use the Taylor expansion to calculate g'(x*). (You can skip the following derivation since it has been shown before.)
  f(x*) = f'(x*) = ... = f^(p-1)(x*) = 0 but f^(p)(x*) != 0
  ==> f(x) ≈ (f^(p)(x*)/p!) (x - x*)^p
  ==> f'(x) ≈ (f^(p)(x*)/(p-1)!) (x - x*)^{p-1}
  ==> f(x)/f'(x) ≈ (x - x*)/p
  ==> g(x) ≈ x - (x - x*)/p
Comparing with the Taylor expansion of g,
  g(x) = g(x*) + g'(x*)(x - x*) + ...,
we obtain that at x*:
  g'(x*) = 1 - 1/p != 0 ==> linear convergence.
Conclusion: At a root of multiplicity p > 1, Newton's method has linear convergence.
Question: How to regain quadratic convergence at a root of multiplicity p > 1?
Schroder's method: Suppose we know the multiplicity p of the root. Then we use the iterative method
  x_{n+1} = g(x_n),  g(x) = x - p f(x)/f'(x).
This is called Schroder's method. For Schroder's method,
  g'(x) = 1 - p + p f(x) f''(x)/(f'(x))^2 = 1 - p + p [ (p-1)/p + O(x - x*) ]
  ==> g'(x*) = 1 - p + (p - 1) = 0
==> It has at least quadratic convergence.
Note: Schroder's method requires that we know the multiplicity p of the root.
Remark: There are other variations of Newton's method. For example, the secant method uses a finite difference to estimate the derivative:
  x_{n+1} = x_n - f(x_n) (x_n - x_{n-1}) / (f(x_n) - f(x_{n-1})),
given two values x_0, x_1 that are near the root.
The bisection method can only be used to solve a single equation. Newton's method can be extended to solve a system of nonlinear equations.
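Both conclusions are easy to check numerically. The Python sketch below is an illustration (not a course file); f(x) = (x-1)^2 (x+2) is a hypothetical test function with a root of multiplicity p = 2 at x* = 1. Plain Newton converges linearly with ratio 1 - 1/p = 1/2, while Schroder's method with the correct p converges quadratically.

```python
def f(x):  return (x - 1.0) ** 2 * (x + 2.0)
def fp(x): return 2.0 * (x - 1.0) * (x + 2.0) + (x - 1.0) ** 2

# Plain Newton at the double root: e_{n+1}/e_n -> 1 - 1/p = 1/2.
x, errs = 1.5, []
for _ in range(8):
    x = x - f(x) / fp(x)
    errs.append(abs(x - 1.0))
newton_ratio = errs[-1] / errs[-2]     # close to 0.5: linear convergence

# Schroder's method with the known multiplicity p = 2: quadratic again.
p, x = 2, 1.5
for _ in range(6):
    d = fp(x)
    if d == 0.0:                       # landed exactly on the root
        break
    x = x - p * f(x) / d
schroder_err = abs(x - 1.0)            # at round-off level after a few steps
```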
Newton's method for solving a non-linear system f(x) = 0
Notations:
  x = (x_1, x_2, ..., x_N)^T,  f(x) = (f_1(x), f_2(x), ..., f_N(x))^T
Definition: If f(x*) = 0, then we say x* is a solution of f(x) = 0.
Newton's method:
Suppose f is differentiable. We start with a point x^(0). The goal is to find one solution of f(x) = 0.
Taylor expansion of f around x^(0):
  f(x) ≈ f(x^(0)) + f'(x^(0)) (x - x^(0)),
where f'(x) is the Jacobian matrix of f: entry (i, j) of f'(x) is ∂f_i/∂x_j, i.e.
  row i of f'(x) = ( ∂f_i/∂x_1, ∂f_i/∂x_2, ..., ∂f_i/∂x_N ),  i = 1, ..., N.
Near x^(0), f(x) is well approximated by f(x^(0)) + f'(x^(0))(x - x^(0)).
Strategy: Start with x^(0). Instead of solving f(x) = 0 directly, we solve
  f(x^(0)) + f'(x^(0)) (x - x^(0)) = 0.
Let Δx^(0) = x - x^(0). Then Δx^(0) is the solution of the linear system
  f'(x^(0)) Δx = -f(x^(0))    (in MATLAB: dx = -f'(x0) \ f(x0))
  ==> x^(1) = x^(0) + Δx^(0).
Take x^(1) as the new starting point and repeat the process.
In general,
  x^(n+1) = x^(n) + Δx^(n),  where Δx^(n) is the solution of f'(x^(n)) Δx = -f(x^(n)).
If ||Δx^(n)|| < tol, then stop.

Norm of a vector u = (u_1, u_2, ..., u_N)^T:
  1-norm:        ||u||_1   := sum_{j=1}^N |u_j|
  2-norm:        ||u||_2   := ( sum_{j=1}^N |u_j|^2 )^{1/2}
  infinity-norm: ||u||_inf := max_{1<=j<=N} |u_j|
  p-norm:        ||u||_p   := ( sum_{j=1}^N |u_j|^p )^{1/p}

Several issues about Newton's method for solving f(x) = 0:
1) Calculating Δx^(n) requires the solution of a linear system A Δx = b, A = f'(x^(n)), b = -f(x^(n)).
2) In MATLAB, to solve A Δx = b, we use Δx = A \ b.
3) The cost of calculating A = f'(x^(n)) is O(N^2).
4) The cost of solving A Δx = b is O(N^3).

Talk about Assignment 3, NL_sys_solver/run_newton.m, which is similar to a problem in Assignment 3.
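run_newton.m is not reproduced here. Instead, here is a minimal Python sketch of the same idea on a hypothetical 2-by-2 system f_1 = x^2 + y^2 - 4 = 0, f_2 = xy - 1 = 0, with the linear solve f'(x) Δx = -f(x) written out by Cramer's rule (standing in for MATLAB's backslash).

```python
def F(x, y):
    # The two residuals f_1, f_2 of the (hypothetical) test system.
    return (x * x + y * y - 4.0, x * y - 1.0)

def J(x, y):
    # Jacobian entries [[df1/dx, df1/dy], [df2/dx, df2/dy]].
    return (2.0 * x, 2.0 * y, y, x)

x, y = 2.0, 0.5                 # starting point x^(0)
for _ in range(20):
    f1, f2 = F(x, y)
    a11, a12, a21, a22 = J(x, y)
    det = a11 * a22 - a12 * a21
    # Solve J * [dx, dy]^T = -F by Cramer's rule (MATLAB: dx = A \ b).
    dx = (-f1 * a22 + f2 * a12) / det
    dy = (-a11 * f2 + a21 * f1) / det
    x, y = x + dx, y + dy
    if max(abs(dx), abs(dy)) < 1e-12:   # stop when ||dx||_inf < tol
        break

res1, res2 = F(x, y)
```

For large N one would of course use a real linear solver instead of Cramer's rule, which is exactly where the O(N^3) cost per iteration comes from.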
The golden search method for minimization
Definition: If f attains its minimum at x*, then we define
  x* = argmin f(x),  f(x*) = min f(x).
The golden search method:
Suppose f is concave up and has a minimum in [a, b]. The goal is to find x* = argmin_{x in [a,b]} f(x) (the value of x at which f attains the minimum).
Note: To find a maximum, consider -f instead and search for its minimum.
Strategy: Start with [a, b].
(Draw the graph of a concave-up f to show that it is not possible to determine which sub-interval contains the minimum if we only consider just the one point c = (a+b)/2.)
Consider
  r_1 = a + (b - a)(1 - g),  r_2 = a + (b - a) g,  0.5 < g < 1.
Note that r_1 is on the left of the midpoint (a+b)/2 while r_2 is on the right of the midpoint.
(Draw the graph of a concave-up f with a, b, r_1, r_2.)
If f(r_1) < f(r_2), repeat the process with [a, r_2]. Otherwise, repeat the process with [r_1, b]. If the size of the interval < tol, stop.
The number of function evaluations per iteration = 2.
Question: How to reduce it to 1?
Suppose we repeat the process with [a, r_2]:
  a_new = a,  b_new = r_2 = a + (b - a) g.
The new test points are
  r_1,new = a + (b_new - a)(1 - g) = a + (b - a) g (1 - g),
  r_2,new = a + (b_new - a) g = a + (b - a) g^2.
If we select g such that g^2 = 1 - g, then we have r_2,new = r_1 and f(r_2,new) = f(r_1), which is already known. This reduces the number of function evaluations per iteration to 1.
Solving g^2 = 1 - g (taking the positive root), we obtain
  g = (sqrt(5) - 1)/2 ≈ 0.618.
This number is called the golden ratio. The method is called the golden search method.

Example: Use the golden search method to find a minimum of f(x) = e^{x/4} - sin(x) in [0, 2].
First, try the following MATLAB code to see roughly where the minimum point lies:
x=[0:0.01:2]; y=exp(x/4)-sin(x);
plot(x,y)
Go to Golden_search/golden_1.m

MATLAB code (minimizing f(x)):
clear;
a = 0; b = 2;
tol = 1.0e-10; n = 0;
%
g=(sqrt(5)-1)/2;
r1=a+(b-a)*(1-g); f1=f(r1);
r2=a+(b-a)*g; f2=f(r2);
%
while (b-a) > tol,
  n = n+1;
  if f1 < f2,
    b=r2;
    r2=r1; f2=f1;
    r1=a+(b-a)*(1-g); f1=f(r1);
  else
    a=r1;
    r1=r2; f1=f2;
    r2=a+(b-a)*g; f2=f(r2);
  end
end
x = (a+b)/2;

In a separate file named f.m:
function [y]=f(x)
y=exp(x/4)-sin(x);

Remark: One can use the MATLAB built-in function fminsearch to do the job.
FMINSEARCH Multidimensional unconstrained nonlinear minimization (Nelder-Mead).
X = FMINSEARCH(FUN,X0) starts at X0 and attempts to find a local minimizer X of the function FUN. FUN is a function handle. FUN accepts input X and returns a scalar function value F evaluated at X. X0 can be a scalar, vector or matrix.

format long
fminsearch(@(x) exp(x/4)-sin(x), 1)   % 1 is a starting guess
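For readers without MATLAB, here is a Python transcription of the same loop (an illustration of golden_1.m's logic, not the course file itself), applied to the example f(x) = e^{x/4} - sin(x) on [0, 2]; it also checks the predicted iteration count.

```python
import math

G = (math.sqrt(5.0) - 1.0) / 2.0        # golden ratio, about 0.618

def golden_search(f, a, b, tol=1e-10):
    # One new function evaluation per iteration: g^2 = 1 - g makes the
    # retained interior point reusable in the next step.
    r1 = a + (b - a) * (1.0 - G); f1 = f(r1)
    r2 = a + (b - a) * G;         f2 = f(r2)
    while b - a > tol:
        if f1 < f2:
            b, r2, f2 = r2, r1, f1
            r1 = a + (b - a) * (1.0 - G); f1 = f(r1)
        else:
            a, r1, f1 = r1, r2, f2
            r2 = a + (b - a) * G; f2 = f(r2)
    return 0.5 * (a + b)

xmin = golden_search(lambda x: math.exp(x / 4.0) - math.sin(x), 0.0, 2.0)

# Predicted number of iterations for tol = 1e-10 on [0, 2]:
N = math.ceil(math.log(2.0 / 1e-10) / math.log(1.0 / G))
```

The minimizer lands near x ≈ 1.22, where f'(x) = e^{x/4}/4 - cos(x) changes sign, and N comes out to 50 as derived below.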
golden_1.m is similar to problem 3 in the homework assignment. For our homework problem we cannot use fminsearch since there is no analytical expression for our function.
Note: The golden search method is guaranteed to converge. In each iteration step, the size of the interval is multiplied by a factor of g ≈ 0.618:
  r_2 - a = (b - a) g,  b - r_1 = (b - a) g.
Let N be the number of iterations needed to reach the error tolerance:
  (b - a) g^N <= tol
  ==> g^N <= tol/(b - a)
  ==> N log(1/g) >= log((b - a)/tol)
  ==> N >= log((b - a)/tol) / log(1/g).
Example: a = 0, b = 2, tol = 10^{-10}
  ==> N >= log(2 x 10^{10}) / log(1/g) ≈ 49.3
  ==> N = 50.

Use Newton's method for minimization problems
Solving x* = argmin G(x) for scalar x is equivalent to solving G'(x) = 0.
Solving x* = argmin G(x) for vector x is equivalent to solving ∇G(x) = 0.
The gradient of G is defined as
  ∇G(x) = ( ∂G/∂x_1, ∂G/∂x_2, ..., ∂G/∂x_N )^T.
The bonus problem in Assignment 3 asks you to use Newton's method to find the optimal values of the parameters (a, b, c) such that the error between the fitting function g(x) = e^{cx} cos(ax + b) and the data from data.txt is minimized.

Numerical analysis uses computers a lot.
Question: How are real numbers stored in computers?

Floating point representation and round-off error
Floating point representation
In computers, a non-zero real number x is represented as
  fl(x) = σ (0.a_1 a_2 ... a_t)_β x β^p
β = 2 is the base of the number system. β is fixed.
σ = +1 or -1 is the sign. It depends on x, and occupies 1 bit.
(0.a_1 a_2 ... a_t)_β is called the mantissa. It depends on x.
t: the number of bits in the mantissa. t is fixed.
Mathematically,
  (0.a_1 a_2 ... a_t)_β = a_1 β^{-1} + a_2 β^{-2} + ... + a_t β^{-t},  a_i in {0, 1, ..., β-1}.
β = 2 ==> a_i = 0 or a_i = 1.
We can always make a_1 != 0: if a_1 = 0 but a_2 != 0, then shift the mantissa left and decrease p by 1, so fl(x) = σ (0.a_2 a_3 ... )_2 x 2^{p-1}. In base 2, a_1 != 0 means a_1 = 1 ==> we do not need to store a_1! ==> The mantissa occupies only (t-1) bits.
p is the exponent. p depends on x, and occupies k bits. p is stored as
  p + bias = (b_{k-1} b_{k-2} ... b_1 b_0)_2 = b_{k-1} 2^{k-1} + ... + b_1 2 + b_0,  b_i in {0, 1}.
k: the number of bits used to store p. k is fixed.
bias: the bias to make p + bias positive. bias is fixed.
The range of p is determined by k: L <= p <= U.
Note: L is negative and U is positive.
fl(x) occupies 1 + (t-1) + k = t + k bits:
  sign (1 bit) | mantissa (t-1 bits) | exponent (k bits)

Example: 3.5 = 0.875 x 2^2 and 0.875 = (0.111)_2, so -3.5 = -(0.111)_2 x 2^2.

Machine precision
In the floating point representation system, 1 = (0.10...0)_2 x 2^1. The smallest representable number larger than 1 is (0.10...01)_2 x 2^1 = 1 + 2^{1-t}.
For this reason, 2^{1-t} is called the machine precision.
(Draw the real axis with 1 and 1 + 2^{1-t}.)

Round-off error
Round-off error is the error in the floating point representation. We first consider the simple case where we do truncating.
Let x = (0.a_1 a_2 ... a_t a_{t+1} ...)_2 x 2^p.
  ==> fl(x) = (0.a_1 a_2 ... a_t)_2 x 2^p  (truncating)
  ==> x - fl(x) = (0.a_{t+1} a_{t+2} ...)_2 x 2^{p-t}
The absolute error (if we do truncating) is
  |x - fl(x)| = (0.a_{t+1} a_{t+2} ...)_2 x 2^{p-t} <= 2^{p-t}.
Here we have used the fact that (0.a_{t+1} a_{t+2} ...)_2 <= 1.
The relative error (if we do truncating) is
  |x - fl(x)| / |x| <= 2^{p-t} / ( (0.a_1 a_2 ...)_2 x 2^p ) <= 2^{p-t} / (2^{-1} x 2^p) = 2^{1-t}.
Here we have used (0.a_1 a_2 ...)_2 >= (0.1)_2 = 1/2 since a_1 = 1.
If we do rounding, we have
  |x - fl(x)| <= (1/2) x 2^{p-t},  |x - fl(x)| / |x| <= 2^{-t}.

A quick example (base β = 10, t = 2): x = 3.59 = +(0.359) x 10^1, p = 1.
  Rounding:   fl(x) = 3.6,  relative error = 0.01/3.59 ≈ 0.0028.
  Truncating: fl(x) = 3.5,  relative error = 0.09/3.59 ≈ 0.025.
x = 3.54:
  Rounding:   fl(x) = 3.5,  relative error = 0.04/3.54 ≈ 0.011.
  Truncating: fl(x) = 3.5,  relative error = 0.04/3.54 ≈ 0.011.
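The base-10 truncating and rounding rules can be mimicked with a short helper. This is a Python sketch for illustration (positive x only, keeping t significant decimal digits); the function name `fl` is just a label for this note.

```python
import math

def fl(x, t, mode):
    # Represent x > 0 as m * 10^p with 0.1 <= m < 1, then keep t digits of m.
    p = math.floor(math.log10(x)) + 1
    scale = 10.0 ** (t - p)
    if mode == "trunc":
        return math.floor(x * scale) / scale           # chop extra digits
    return math.floor(x * scale + 0.5) / scale         # round to nearest

r359 = (fl(3.59, 2, "round"), fl(3.59, 2, "trunc"))    # (3.6, 3.5)
r354 = (fl(3.54, 2, "round"), fl(3.54, 2, "trunc"))    # (3.5, 3.5)
```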
A more convenient form of fl(x)
Let ε = (fl(x) - x)/x. We can write fl(x) as
  fl(x) = x (1 + ε),  |ε| <= 2^{-t} (if we do rounding),  |ε| <= 2^{1-t} (if we do truncating).
Note: This form of fl(x) is very useful in error analysis.

Example: IEEE double precision floating point representation
  fl(x) = σ (0.a_1 a_2 ... a_t)_2 x 2^p,  p + bias = (b_{k-1} ... b_1 b_0)_2,  L <= p <= U
  β = 2, t = 53, k = 11, bias = 1023, L = -1022, U = 1023.
MATLAB uses the IEEE-754 double precision floating-point standard to store data and perform arithmetical operations.
Several issues:
fl(x) occupies 53 + 11 = 64 bits, or 8 bytes (1 byte = 8 bits).
Machine precision: 2^{1-t} = 2^{-52} ≈ 2.2 x 10^{-16}.
Round-off error: fl(x) = x (1 + ε), |ε| <= 2^{-t} = 2^{-53} ≈ 1.1 x 10^{-16}.
The range of p: -1022 <= p <= 1023 ==> 1 <= p + 1023 <= 2046.
p is stored as p + 1023 = (b_10 b_9 ... b_0)_2.
The patterns (00000000000)_2 = 0 and (11111111111)_2 = 2047 are not used for storing p. (00000000000)_2 represents the real number zero.
(11111111111)_2 = 2047 represents Inf, -Inf, NaN, etc.

Example (toy model): Assume bias = 0, t = 3, k = 2, with sign bit 1 for negative and 0 for positive. Consider the bit pattern with sign = 1, mantissa (0.a_1 a_2 a_3)_2 = (0.110)_2, and exponent bits (b_1 b_0)_2 = (11)_2. What is this number?
  p + bias = (11)_2 = 3 ==> p = 3,  2^p = 8
  x = -(0.110)_2 x 2^3 = -(3/4) x 8 = -6.

Overflow and underflow
In the double precision floating point representation, L <= p <= U and
  fl(x) = σ (0.a_1 a_2 ... a_t)_2 x 2^p.
The largest number is B = (0.11...1)_2 x 2^U ≈ 2^U = 2^{1023} ≈ 10^{308}.
The smallest non-zero number is b = (0.10...0)_2 x 2^L = 2^{L-1} = 2^{-1023} ≈ 10^{-308}.
Overflow: If |x| > B, then set fl(x) = ±Inf.
Note: overflow is a fatal error.
Underflow: If 0 < |x| < b, then set fl(x) = 0.
Note: underflow is a non-fatal error.

Summary: Double precision floating point representation
  fl(x) = σ (0.a_1 ... a_t)_2 x 2^p,  p + bias = (b_{k-1} ... b_0)_2,  L <= p <= U
  β = 2, t = 53, k = 11, bias = 1023, L = -1022, U = 1023.
Machine precision: 2^{1-t} = 2^{-52} ≈ 2.2 x 10^{-16}.
Round-off error: fl(x) = x (1 + ε), |ε| <= 2^{-t} = 2^{-53} ≈ 1.1 x 10^{-16}.
Overflow and underflow:
  The largest number is B ≈ 2^U = 2^{1023} ≈ 10^{308}. If |x| > B, then set fl(x) = ±Inf (overflow is a fatal error).
  The smallest non-zero number is b = 2^{L-1} = 2^{-1023} ≈ 10^{-308}. If 0 < |x| < b, then set fl(x) = 0 (underflow is a non-fatal error).

Example
To fully understand the floating point representation system, let us see a very simple example. Suppose t = 3, k = 2, β = 2, bias = 2, and -1 <= p <= 0 (so p + bias is 1 or 2). Then a real number is represented by
  fl(x) = ±(0.a_1 a_2 a_3)_2 x 2^p,  where a_1 = 1 and a_2, a_3 = 0 or 1.
All the positive numbers that can be represented by this simple system are:
  p = -1: (0.100)_2 x 2^{-1} = 1/4, (0.101)_2 x 2^{-1} = 5/16, (0.110)_2 x 2^{-1} = 3/8, (0.111)_2 x 2^{-1} = 7/16;
  p = 0:  (0.100)_2 = 1/2, (0.101)_2 = 5/8, (0.110)_2 = 3/4, (0.111)_2 = 7/8;
together with the corresponding negative numbers -1/4, -5/16, -3/8, -7/16, -1/2, -5/8, -3/4, -7/8, plus 0 and Inf, -Inf, NaN. The last two groups are represented as follows: the exponent field p + bias = (00)_2 represents the real number zero, while p + bias = (11)_2 represents Inf, -Inf, NaN.
In this example the machine precision is 2^{1-t} = 1/4, and the represented numbers are NOT uniformly distributed on the number line (even though the distribution is uniform in each subinterval).
(Figure: the numbers represented by the simple floating point system, plotted on the number line between -1 and 1.)
Also, note that in this system 1 cannot be represented. If one increases the range of the exponent, then 1 can be represented and the smallest number that is bigger than 1 will be 1 + machine precision. Furthermore, the smallest positive number in this system is 1/4 (same as the machine precision, which is a coincidence) and the largest positive number is 7/8.
Give me any number now and we can decide how it is represented in this system. Let us pick 1. Since 1 > 7/8, it is represented as Inf. How about 1/8 or -1/8? Since 1/8 < 1/4, this is underflow and we set them to zero.
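The toy system can be enumerated mechanically, and the field layout of real IEEE doubles can be inspected the same way. Below is a Python sketch; note that actual IEEE-754 hardware stores doubles in the normalized form 1.f x 2^{E-1023}, which matches the 0.a_1 a_2 ... convention above up to a shift of the exponent.

```python
import struct
from fractions import Fraction

# Enumerate the toy system: fl(x) = +/-(0.1 a2 a3)_2 * 2^p with p in {-1, 0}.
values = set()
for a2 in (0, 1):
    for a3 in (0, 1):
        m = Fraction(1, 2) + Fraction(a2, 4) + Fraction(a3, 8)
        for p in (-1, 0):
            values.add(m * Fraction(2) ** p)
            values.add(-m * Fraction(2) ** p)

positives = sorted(v for v in values if v > 0)
# positives: 1/4, 5/16, 3/8, 7/16, 1/2, 5/8, 3/4, 7/8 -- and 1 is absent.

# Field layout of a real IEEE-754 double:
# 1 sign bit | 11 exponent bits | 52 stored mantissa bits.
def double_fields(x):
    (n,) = struct.unpack(">Q", struct.pack(">d", x))
    return n >> 63, (n >> 52) & 0x7FF, n & ((1 << 52) - 1)

# -3.5 = -(1.75) * 2^1: sign 1, biased exponent 1 + 1023, fraction 0.75 * 2^52.
s, e, m = double_fields(-3.5)
```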
Example
In order to see that the smallest positive number is not necessarily the same as the machine precision, let us consider the case where t = 2, k = 2, β = 2, bias = 2, -1 <= p <= 0. Then the machine precision is 2^{1-t} = 1/2, but the smallest positive number is still (0.10)_2 x 2^{-1} = 1/4.
You can play with more simple cases to gain a better understanding.

Four sources of error
Example: Consider a model for a wild salmon population. The goal of modeling is to do predictions. Let y(t) be the population, governed by
  y' = r y (1 - (y/K)^p) - H(y, t)
  [ r y: exponential growth; (1 - (y/K)^p): effect of limited resource; H(y, t): effect of harvest ]
Error source #1: error in the mathematical model
  The effect of limited resource may not be (1 - (y/K)^p).
Error source #2: error in the parameters
  The estimated values of r, K and p are not exact.
Error source #3: discretization error
  To predict the population, we need to solve
    y' = F(y, t),  y(0) = y_0,  where F(y, t) = r y (1 - (y/K)^p) - H(y, t).
  Let t_n = n Δt. Then
    y(t_{n+1}) = y(t_n) + Δt y'(t_n) + O(Δt^2),  y'(t_n) = F(y(t_n), t_n)
    ==> y(t_{n+1}) ≈ y(t_n) + Δt F(y(t_n), t_n).
  Euler's method: y_{n+1} = y_n + Δt F(y_n, t_n).
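The salmon model's parameters are not specified here, so the discretization error is easiest to see on a stand-in test equation with a known solution. The Python sketch below (illustration only; y' = -y, y(0) = 1 is a hypothetical test problem, not the salmon model) applies Euler's method with two step sizes and checks that halving Δt roughly halves the error, i.e. the error is O(Δt).

```python
import math

def euler(F, y0, T, n):
    # Euler's method: y_{k+1} = y_k + dt * F(y_k, t_k), k = 0, ..., n-1.
    dt, y, t = T / n, y0, 0.0
    for _ in range(n):
        y = y + dt * F(y, t)
        t += dt
    return y

# Test problem y' = -y, y(0) = 1; exact solution y(T) = exp(-T).
F = lambda y, t: -y
exact = math.exp(-1.0)
e1 = abs(euler(F, 1.0, 1.0, 100) - exact)   # dt = 1/100
e2 = abs(euler(F, 1.0, 1.0, 200) - exact)   # dt = 1/200
ratio = e1 / e2                             # close to 2: first-order accuracy
```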
The numerical solution satisfies
  y_n = y(t_n) + O(Δt)   [error due to discretization].
Error source #4: round-off error
  For most real numbers, fl(x) != x.

Loss of accuracy due to numerical cancellation
Consider the calculation of A - B where A > 0 and B > 0.
  fl(A) = A (1 + ε_1),  |ε_1| ~ 10^{-16}
  fl(B) = B (1 + ε_2),  |ε_2| ~ 10^{-16}
  ==> fl(A) - fl(B) = (A - B) + (A ε_1 - B ε_2)
The absolute error:
  | (fl(A) - fl(B)) - (A - B) | = |A ε_1 - B ε_2| <= (A + B) x 10^{-16}.
The relative error:
  |A ε_1 - B ε_2| / |A - B| <= (A + B)/|A - B| x 10^{-16}.
That is, in the calculation of A - B, the round-off error is magnified by a factor of (A + B)/|A - B|. When A is close to B, the factor (A + B)/|A - B| is large and we lose accuracy.
When (A + B)/|A - B| ~ 10^{16}, the relative error is 10^{16} x 10^{-16} ~ O(1) and there is no accuracy at all.

Example: Solving a x^2 + b x + c = 0 for a = 1, b = 10^8, c = -0.3:
  r_1 = ( -b + sqrt(b^2 - 4ac) ) / (2a)
Numerical formula #1:
  r1 = (-b+sqrt(b^2-4*a*c))/(2*a)
Numerical cancellation occurs!
Numerical result: r1 ≈ 7.45 x 10^{-9} (not a good solution, since a r1^2 + b r1 + c ≈ 0.45, far from 0).
Now let us find a good numerical formula. Multiplying both the numerator and the denominator by b + sqrt(b^2 - 4ac), we have
  r_1 = ( -b + sqrt(b^2 - 4ac) ) ( b + sqrt(b^2 - 4ac) ) / ( 2a ( b + sqrt(b^2 - 4ac) ) )
      = ( (b^2 - 4ac) - b^2 ) / ( 2a ( b + sqrt(b^2 - 4ac) ) )
      = -2c / ( b + sqrt(b^2 - 4ac) ).
Numerical formula #2:
  r1 = -2*c/(b+sqrt(b^2-4*a*c))
Numerical result: r1 ≈ 3.0 x 10^{-9} (a good solution, since a r1^2 + b r1 + c ≈ 0 up to round-off).
See Codes\Num_Cancel\Num_cancel_1.m.
What is wrong with numerical formula #1? We identify
  A = sqrt(b^2 - 4ac),  B = b   (a = 1, b = 10^8, c = -0.3)
  ==> A + B = sqrt(b^2 - 4ac) + b ≈ 2b = 2 x 10^8
  ==> A - B = sqrt(b^2 - 4ac) - b = -4ac / ( sqrt(b^2 - 4ac) + b ) ≈ 4 x 0.3 / (2 x 10^8) = 0.6 x 10^{-8}
  ==> (A + B)/|A - B| ≈ 2 x 10^8 / (0.6 x 10^{-8}) ≈ 3.3 x 10^{16}
  ==> The relative error ≈ (A + B)/|A - B| x 10^{-16} ≈ 3.3 ~ O(1).
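The two formulas are easy to compare side by side. The Python sketch below uses illustrative coefficients with b^2 >> |4ac| (similar in spirit to Num_cancel_1.m, but not necessarily the exact values used there).

```python
import math

# Quadratic with b^2 >> |4ac|: the root of smaller magnitude suffers
# catastrophic cancellation in formula #1 but not in formula #2.
a, b, c = 1.0, 1.0e8, -0.3

disc = math.sqrt(b * b - 4.0 * a * c)
r_bad  = (-b + disc) / (2.0 * a)     # formula #1: -b + disc cancels badly
r_good = -2.0 * c / (b + disc)       # formula #2: no cancellation

def residual(r):
    # How well r satisfies a r^2 + b r + c = 0.
    return a * r * r + b * r + c
```

Plugging each root back in, formula #2's residual is at round-off level while formula #1's is O(1): the subtraction -b + disc destroys nearly all significant digits.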
Example: Compute (e^b - 1)/b directly for a small value of b. Numerical cancellation occurs in e^b - 1, and the direct computation is inaccurate. For small values of b, we use the Taylor series expansion for e^b to avoid cancellation:
  (e^b - 1)/b = (1/b) ( b + b^2/2! + b^3/3! + b^4/4! + ... ) = 1 + 0.5 b + b^2/6 + b^3/24 + ...
See Codes\Num_Cancel\Num_cancel_2.m.
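A short Python illustration (not Num_cancel_2.m itself; b = 10^{-15} is an illustrative choice) compares the direct formula, the truncated Taylor series, and the standard library routine expm1, which computes e^b - 1 without cancellation.

```python
import math

b = 1e-15
direct   = (math.exp(b) - 1.0) / b       # cancellation in exp(b) - 1
taylor   = 1.0 + 0.5 * b + b * b / 6.0   # 1 + b/2! + b^2/3! + ...
accurate = math.expm1(b) / b             # library routine, avoids cancellation
```

The Taylor value agrees with expm1 to machine precision, while the direct formula is off in the first decimal place for this b.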