Introductory Numerical Analysis


Introductory Numerical Analysis Lecture Notes, December 16, 2017

Contents

1 Introduction
  1.1 Floating Point Numbers
  1.2 Computational Errors
  1.3 Algorithm
  1.4 Calculus Review
2 Root Finding
  2.1 Bisection Method
  2.2 Fixed Point Iteration
  2.3 Newton-Raphson Method
  2.4 Order of Convergence
3 Interpolation
  3.1 Lagrange Polynomials
  3.2 Cubic Spline Interpolation
4 Numerical Differentiation and Integration
  4.1 Numerical Differentiation
  4.2 Elements of Numerical Integration
  4.3 Composite Numerical Integration
5 Differential Equations
  5.1 Euler's Method
  5.2 Higher-order Taylor's Method
  5.3 Runge-Kutta Method
6 Linear Algebra
  6.1 Introduction to Linear Algebra
  6.2 Systems of Linear Equations: Gaussian Elimination
  6.3 Jacobi and Gauss-Seidel Methods
  6.4 Eigenvalues: Power Method
7 Additional Topics
  7.1 PDE: Heat Equation
  7.2 Least Squares Approximation

1 Introduction

Sometimes we cannot solve a problem analytically. For example: find the root x* of f(x) = e^x − x − 2 on the interval [0, 2]. Also, we do not have a general analytic formula or technique to find roots of a polynomial of degree 5 or more (see Galois theory). We solve these kinds of problems numerically: construct a sequence {x_n} that converges to x*, i.e., lim_{n→∞} x_n = x*, and approximate x* by finding x_k for some k for which f(x_k) = e^{x_k} − x_k − 2 ≈ 0.

Numerical analysis includes the study of the following:
- Construct a sequence {x_n} that converges to the solution (iteration formula).
- Determine how fast {x_n} converges to the solution (rate of convergence).
- Find bounds on the error committed at a certain iteration x_n (error analysis).

We will cover numerical methods for the following topics: root finding, interpolation, differentiation and integration, differential equations, linear algebra.

1.1 Floating Point Numbers

We know π = 3.14159265358979..., where the decimal digits never terminate. For numerical calculations, we consider only a finite number of digits of a number. A t-digit floating point number of base 10 is of the form

    ±0.a_1 a_2 ... a_t × 10^e,

where 0.a_1 a_2 ... a_t is called the mantissa and e is called the exponent. Usually the mantissa 0.a_1 a_2 ... a_t is normalized, i.e., a_1 ≠ 0. For example, the normalized 15-digit floating point number of π is fl(π) = 0.314159265358979 × 10^1.

Note that floating point numbers are approximations of the exact numbers, obtained by either chopping or rounding the digits. The error in calculations caused by the use of floating point numbers is called roundoff error. For example, a computer may calculate

    (√2)² − 2 = 0.44408920985006 × 10^−15,

which is just roundoff error. Note that since floating point numbers are rational numbers, a computer cannot express any irrational number without error. Also note that almost all computers use binary floating point numbers (see the remark below).

Remark. A 64-bit computer uses the IEEE 754-2008 standard, which defines the following format for 64-bit binary floating point numbers (double-precision floating point format):

    s (1-bit sign) | x (11-bit exponent) | f (52-bit fraction)

which converts to the decimal number

    (−1)^s 2^{x_10 − 1023} (1 + (f_1 2^−1 + f_2 2^−2 + ... + f_52 2^−52)),

where x_10 is the decimal value of x. For example, 1 10000000010 1100...0 converts to

    (−1)^1 2^{1026 − 1023} (1 + 2^−1 + 2^−2 + 0·2^−3 + ... + 0·2^−52) = −2^3 (1 + 1/2 + 1/4) = −14.

In this format the magnitude of the largest representable decimal number is about 1.7976931348623 × 10^308. If a number has magnitude bigger than that, a 64-bit computer stops working; this is called an overflow. Note that the single-precision floating point format uses 32 bits, including a 23-bit fraction.

1.2 Computational Errors

When we approximate a number x by a number x̄, there are a few ways to measure the error:

    Absolute error: |x − x̄|
    Relative error: |x − x̄| / |x|, x ≠ 0

For example, if we approximate x = 1.24 by x̄ = 1.25, then the absolute error is |x − x̄| = |1.24 − 1.25| = 0.01 and the relative error is |x − x̄| / |x| = 0.01/1.24 ≈ 0.008.

The relative error gives us information about the number of decimal digits in which x and x̄ agree. We say x̄ approximates x to n significant digits if n is the largest nonnegative integer for which |x − x̄| / |x| < 5 × 10^−n. Since 0.008 < 5 × 10^−2 and 0.008 ≥ 5 × 10^−3, the largest such nonnegative integer is n = 2. Thus x = 1.24 and x̄ = 1.25 agree to 2 significant digits.
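These roundoff and significant-digit calculations can be checked directly. A minimal Python sketch follows; the choice of (√2)² − 2 as the roundoff demonstration is an assumption (any expression that is exactly zero in real arithmetic would do).

```python
import math

# Roundoff error: (sqrt(2))**2 - 2 is exactly 0 in real arithmetic,
# but a tiny nonzero number in double-precision floating point.
roundoff = math.sqrt(2) ** 2 - 2

# Absolute and relative error of approximating x = 1.24 by xbar = 1.25.
x, xbar = 1.24, 1.25
abs_err = abs(x - xbar)           # about 0.01
rel_err = abs(x - xbar) / abs(x)  # about 0.008

# Significant digits: the largest n with rel_err < 5 * 10**(-n).
n = 0
while rel_err < 5 * 10 ** (-(n + 1)):
    n += 1
print(roundoff, abs_err, rel_err, n)
```

Running it confirms the text: the "zero" comes out on the order of 10^−16, and n = 2 significant digits.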

1.3 Algorithm

An algorithm is a set of ordered logical operations which, applied to a problem defined by a set of data (input data), produces a solution (output data). An algorithm is usually written in an informal way (pseudocode) before writing it in the syntax of a computer language such as C, Java, Python, etc. The following is an example of pseudocode to find n!:

Algorithm you-name-it
Input: nonnegative integer n
Output: n!
  fact = 1
  for i = 2 to n
    fact = fact * i
  end for
  return fact

Stopping criteria: Sometimes we need a stopping criterion to terminate an algorithm. For example, when an algorithm approximates a solution x* by constructing a sequence {x_n}, the algorithm needs to stop after finding x_k for some k. There is no universal stopping criterion, as it depends on the problem, the acceptable error (i.e., error < tolerance, say 10^−4), the maximum number of iterations, etc.

1.4 Calculus Review

You should review the limit definitions of the derivative and the Riemann integral of a function from a standard textbook. The following are some theorems which will be used later.

Theorem. Let f be a differentiable function on [a, b].
- f is increasing on [a, b] if f′(x) > 0 for all x in (a, b).
- If f has a local maximum or minimum value at c in (a, b), then f′(c) = 0 (c is a critical number).
- If f′(c) = 0 and f″(c) < 0, then f(c) is a local maximum value.
- If f′(c) = 0 and f″(c) > 0, then f(c) is a local minimum value.

Theorem. Let f be a continuous function on [a, b]. If f(c) is the absolute maximum or minimum value of f on [a, b], then either f′(c) does not exist, or f′(c) = 0, or c = a, or c = b.
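The pseudocode above translates directly into Python (the function name `factorial` is ours, not part of the notes):

```python
def factorial(n):
    """Compute n! following the pseudocode: fact = 1; for i = 2 to n: fact = fact * i."""
    fact = 1
    for i in range(2, n + 1):
        fact = fact * i
    return fact
```

Note that for n = 0 or n = 1 the loop body never executes and the function correctly returns 1.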

Intermediate Value Theorem: Let f be a function such that (i) f is continuous on [a, b], and (ii) N is a number between f(a) and f(b). Then there is at least one number c in (a, b) such that f(c) = N. In the particular case when f(a)f(b) < 0, i.e., f(a) and f(b) are of opposite signs, there is at least one root c of f in (a, b).

Mean Value Theorem: Let f be a function such that (i) f is continuous on [a, b], and (ii) f is differentiable on (a, b). Then there is a number c in (a, b) such that

    (f(b) − f(a)) / (b − a) = f′(c).

[Figure: the secant line through (a, f(a)) and (b, f(b)) is parallel to the tangent line at c. A second figure shows y = f(x) together with the Taylor polynomials y = T_1(x), y = T_2(x), y = T_3(x).]

Taylor's Theorem: Let n ≥ 1 be an integer and f be a function such that (i) f^(n) is continuous on [a, b], and (ii) f^(n) is differentiable on (a, b). Let c be a number in (a, b). Then for all x in [a, b], we have

    f(x) = f(c) + f′(c)/1! (x − c) + f″(c)/2! (x − c)² + ... + f^(n)(c)/n! (x − c)^n + f^(n+1)(ξ)/(n+1)! (x − c)^{n+1},

for some number ξ (depending on x) between c and x.

Sometimes we simply write f(x) = T_n(x) + R_n(x), where T_n(x) = Σ_{k=0}^{n} f^(k)(c)/k! (x − c)^k is the Taylor polynomial of degree n and R_n(x) = f^(n+1)(ξ)/(n+1)! (x − c)^{n+1} is the remainder term.

2 Root Finding

In this chapter we will find roots of a given function f, i.e., x* for which f(x*) = 0.

2.1 Bisection Method

Suppose f is a continuous function on [a, b] and f(a) and f(b) have opposite signs. Then by the IVT (Intermediate Value Theorem), there is at least one root x* of f in (a, b). For simplicity, let us assume the root x* is unique.

Set a_1 = a and b_1 = b. Let x_1 = (a_1 + b_1)/2, which breaks [a_1, b_1] into two subintervals [a_1, x_1] and [x_1, b_1]. Then there are three possibilities:

1. x_1 = x*: f(x_1) = 0 and we are done.
2. x* ∈ (a_1, x_1): f(x_1) and f(a_1) have opposite signs; set a_2 = a_1, b_2 = x_1.
3. x* ∈ (x_1, b_1): f(x_1) and f(b_1) have opposite signs; set a_2 = x_1, b_2 = b_1.

Set x_2 = (a_2 + b_2)/2. We can continue this process of bisecting an interval [a_n, b_n] containing x* and taking the approximation x_n = (a_n + b_n)/2 of x*. We will show that {x_n} converges to x*.

Example. Do five iterations of the bisection method to approximate the root x* of f(x) = e^x − x − 2 on the interval [0, 2].

Solution.

n | a_n   | f(a_n) | b_n  | f(b_n) | x_n = (a_n + b_n)/2 | f(x_n)
1 | 0     | −      | 2    | +      | 1                   | −
2 | 1     | −      | 2    | +      | 1.5                 | +
3 | 1     | −      | 1.5  | +      | 1.25                | +
4 | 1     | −      | 1.25 | +      | 1.125               | −
5 | 1.125 | −      | 1.25 | +      | 1.1875              | +

Since |x_5 − x_4| = |1.1875 − 1.125| = 0.0625 < 0.1 = 10^−1, we can (roughly) say that the root is x_5 = 1.1875 correct to one decimal place. But why roughly? If x_n is correct to t decimal places, then |x* − x_n| < 10^−t, but the converse is not true. For example, x* = 1.146193 is correct to 6 decimal places (believe me!). So x_11 = 1.145507 is only correct to 2 decimal places, yet |x* − x_11| < 10^−3. Similarly, if x = 1.000 is approximated by x̄ = 0.999, then |x − x̄| = 0.001 < 10^−2, but x̄ = 0.999 is not correct to any decimal place of x = 1.000. Also note that we computed |x_5 − x_4|, not |x* − x_5| (which cannot be computed since x* is unknown).

It would be useful to know the number of iterations that guarantees a certain accuracy of the root, say within 10^−3. That is, to find n for which |x* − x_n| < 10^−3, i.e., x* − 10^−3 < x_n < x* + 10^−3. So we have to find an upper bound for the absolute error of the nth iteration x_n.

The Maximum Error: Let ε_n be the absolute error of the nth iteration x_n. Then

    ε_n = |x* − x_n| ≤ (b − a)/2^n.

Proof. Since x* ∈ [a_n, b_n], x_n is the midpoint of [a_n, b_n], and b_n − a_n = (b_{n−1} − a_{n−1})/2 for all n,

    ε_n = |x* − x_n| ≤ (b_n − a_n)/2 = (b_{n−1} − a_{n−1})/2² = ... = (b_1 − a_1)/2^n = (b − a)/2^n.

Example. Find the number of iterations of the bisection method that guarantees approximating the root x* of f(x) = e^x − x − 2 on [0, 2] with accuracy within 10^−3.

Solution.

    ε_n = |x* − x_n| ≤ (2 − 0)/2^n = 2/2^n < 10^−3
    ⟹ 2 × 10^3 < 2^n
    ⟹ ln 2 + 3 ln 10 < n ln 2
    ⟹ (ln 2 + 3 ln 10)/ln 2 < n
    ⟹ 10.97 < n.

Thus the 11th iteration guarantees accuracy of the root within 10^−3. Note that the root is x* = 1.146193 correct to 6 decimal places. So x_10 = 1.146484 and x_11 = 1.145507 both have accuracy within 10^−3 (check: |x* − x_10| < 10^−3, |x* − x_11| < 10^−3). Thus accuracy within 10^−3 is reached even before the 11th iteration. It is interesting to note that x_10 = 1.146484 is correct to 3 decimal places whereas x_11 = 1.145507 is only correct to 2 decimal places.
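The iteration count from the worst-case bound can be computed in a line of Python, using the example's values a = 0, b = 2, tol = 10^−3:

```python
import math

# Smallest n with (b - a) / 2**n < tol, i.e. the smallest integer n
# strictly greater than log2((b - a) / tol).
a, b, tol = 0.0, 2.0, 1e-3
n = math.floor(math.log((b - a) / tol, 2)) + 1
print(n)  # 11
```

Here floor(...) + 1 is used instead of ceil(...) so that the strict inequality still holds in the edge case where (b − a)/tol is an exact power of 2.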

Convergence: The sequence {x_n} constructed by the bisection method converges to the solution x* because

    lim_{n→∞} |x* − x_n| ≤ lim_{n→∞} (b − a)/2^n = 0 ⟹ lim_{n→∞} |x* − x_n| = 0.

But it converges really slowly in comparison to other methods (see Section 2.4).

Algorithm Bisection-Method
Input: f(x) = e^x − x − 2, interval [0, 2], tolerance 10^−3, maximum number of iterations 50
Output: an approximate root of f on [0, 2] within 10^−3 or a message of failure
  set a = 0 and b = 2; xold = a;
  for i = 1 to 50
    x = (a + b)/2;
    if |x − xold| < 10^−3      % checking required accuracy
      FoundSolution = true;    % done
      break;                   % leave for environment
    end if
    if f(a)f(x) < 0
      set b = x                % root lies in [a, x]
    else
      set a = x                % root lies in [x, b]
    end if
    xold = x;                  % update xold for the next iteration
  end for
  if FoundSolution
    return x
  else
    print "the required accuracy is not reached in 50 iterations"
  end if
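A Python version of the pseudocode above; the function name and the use of exceptions for the failure message are our choices, not part of the notes:

```python
import math

def bisection(f, a, b, tol=1e-3, max_iter=50):
    """Bisection method: stop when successive midpoints differ by
    less than tol, or fail after max_iter iterations."""
    if f(a) * f(b) >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    x_old = a
    for _ in range(max_iter):
        x = (a + b) / 2
        if abs(x - x_old) < tol:   # checking required accuracy
            return x
        if f(a) * f(x) < 0:
            b = x                  # root lies in [a, x]
        else:
            a = x                  # root lies in [x, b]
        x_old = x
    raise RuntimeError("required accuracy not reached in max_iter iterations")

root = bisection(lambda x: math.exp(x) - x - 2, 0, 2)
print(root)
```

On the running example this returns an approximation near x* = 1.146193, within the 10^−3 tolerance.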

2.2 Fixed Point Iteration

A number p is a fixed point of a function g if g(p) = p. For example, if g(x) = x² − 2, then solving g(x) = x² − 2 = x we get x = −1, 2. We can easily check g(−1) = −1 and g(2) = 2. Thus −1 and 2 are fixed points of g. Note that the fixed points of g are the x-values of the points of intersection of the curve y = g(x) and the line y = x.

[Figure: y = x² − 2 and y = x, intersecting at (−1, −1) and (2, 2).]

The following shows the equivalence of root finding and finding fixed points.

Observation. p is a fixed point of a function g if and only if p is a root of f(x) = x − g(x).

If p is a fixed point of g, then g(p) = p and consequently f(p) = p − g(p) = 0. The converse follows similarly. Note that there are multiple choices for f, such as f(x) = e^x (x − g(x)), f(x) = −1 + e^{x − g(x)}, etc.

The following theorem gives us sufficient conditions for the existence and uniqueness of a fixed point.

Theorem 1 (Fixed Point Theorem). Let g be a function on [a, b].
1. (Existence) If g is continuous and a ≤ g(x) ≤ b for all x ∈ [a, b], then g has a fixed point in [a, b].
2. (Uniqueness) Moreover, if |g′(x)| < 1 for all x ∈ (a, b), then g has a unique fixed point in [a, b].

Proof. Suppose g is continuous and a ≤ g(x) ≤ b for all x ∈ [a, b]. If g(a) = a or g(b) = b, then a or b is a fixed point of g. Otherwise g(a) > a and g(b) < b, because a ≤ g(x) ≤ b for all x ∈ [a, b]. Define a new function h by h(x) = x − g(x). Since g is continuous, h is also continuous. Also note that h(a) = a − g(a) < 0 and h(b) = b − g(b) > 0. By the IVT, h(c) = 0 for some c ∈ (a, b). Now h(c) = c − g(c) = 0 ⟹ g(c) = c, i.e., c is a fixed point of g.

Suppose |g′(x)| < 1 for all x ∈ (a, b). To show uniqueness of c, suppose d is another fixed point of g in [a, b]. WLOG let d > c. Applying the MVT to g on [c, d], we find t ∈ (c, d) such that g(d) − g(c) = g′(t)(d − c). Since |g′(x)| < 1 for all x ∈ (a, b),

    |g(d) − g(c)| = |g′(t)| (d − c) < d − c.

Since g(c) = c and g(d) = d, we have |g(d) − g(c)| = d − c, which contradicts |g(d) − g(c)| < d − c.

Suppose a ≤ g(x) ≤ b for all x ∈ [a, b] and |g′(x)| ≤ k < 1 for all x ∈ (a, b). For any initial approximation x_0 in [a, b], the fixed point iteration constructs a sequence {x_n}, where

    x_{n+1} = g(x_n), n = 0, 1, 2, ...,

to approximate the unique fixed point x* of g in [a, b]. We will show that {x_n} converges to x*.

[Figure: cobweb diagram of the iteration x_{n+1} = g(x_n) approaching the intersection of y = g(x) and y = x.]

Example. This problem approximates the root x* of f(x) = e^x − x − 2 on the interval [0, 2].
(a) Find a function g that has a unique fixed point which is the root x* of f(x) = e^x − x − 2 on the interval [0, 2].
(b) Do six iterations of the fixed point iteration to approximate the root x* of f(x) = e^x − x − 2 on the interval [0, 2] using x_0 = 1.

Solution. (a) We need to find a function g such that (i) 0 ≤ g(x) ≤ 2 for all x ∈ [0, 2], and (ii) |g′(x)| < 1 for all x ∈ (0, 2).

f(x) = e^x − x − 2 = 0 ⟹ x = e^x − 2. But g(x) = e^x − 2 does not satisfy (i), as g(0) = −1 < 0, and also fails (ii), as g′(2) = e² > 1. Note instead that

    f(x) = e^x − x − 2 = 0 ⟹ e^x = x + 2 ⟹ x = ln(x + 2).

Take g(x) = ln(x + 2). Then g is an increasing function with g(0) = ln 2 > 0 and g(2) = ln 4 < 2. Thus 0 ≤ g(x) ≤ 2 for all x ∈ [0, 2]. Also |g′(x)| = 1/(x + 2) ≤ 1/2 < 1 for all x ∈ (0, 2). Thus g has a unique fixed point in [0, 2], which is the root x* of f(x) = e^x − x − 2 on the interval [0, 2].

(b) Use g(x) = ln(x + 2) and x_0 = 1.

n | x_{n−1} | x_n = g(x_{n−1})
1 | 1       | 1.0986
2 | 1.0986  | 1.1309
3 | 1.1309  | 1.1413
4 | 1.1413  | 1.1446
5 | 1.1446  | 1.1456
6 | 1.1456  | 1.1460

Since |x_6 − x_5| = |1.1460 − 1.1456| = 0.0004 < 0.001 = 10^−3, we can say that the root is x_6 = 1.1460, roughly correct to three decimal places (which is true indeed, as x* = 1.146193).

The Maximum Error: Let ε_n be the absolute error of the nth iteration x_n. Then

    ε_n = |x* − x_n| ≤ k^n max{x_0 − a, b − x_0}  and  ε_n = |x* − x_n| ≤ k^n/(1 − k) |x_1 − x_0|.

Proof. Applying the MVT to g on the interval with endpoints x* and x_{n−1}, we find ξ_{n−1} between x* and x_{n−1} such that g(x*) − g(x_{n−1}) = g′(ξ_{n−1})(x* − x_{n−1}). Then

    |x* − x_n| = |g(x*) − g(x_{n−1})| = |g′(ξ_{n−1})| |x* − x_{n−1}| ≤ k |x* − x_{n−1}|.

Continuing this process, we get |x* − x_n| ≤ k |x* − x_{n−1}| ≤ k² |x* − x_{n−2}| ≤ ... ≤ k^n |x* − x_0|. Since x*, x_0 ∈ (a, b), we have |x* − x_0| ≤ max{x_0 − a, b − x_0}. Thus ε_n = |x* − x_n| ≤ k^n max{x_0 − a, b − x_0}. Note that similarly, applying the MVT to g on the interval with endpoints x_n and x_{n+1}, we can show that |x_{n+1} − x_n| ≤ k^n |x_1 − x_0|.
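Part (b) can be reproduced in a few lines of Python:

```python
import math

g = lambda x: math.log(x + 2)  # the g found in part (a)

# Six iterations of x_{n+1} = g(x_n) starting from x0 = 1,
# reproducing the table in part (b).
x = 1.0
iterates = []
for _ in range(6):
    x = g(x)
    iterates.append(x)
print([round(v, 4) for v in iterates])
```

The printed values match the table up to rounding in the last digit, and the sixth iterate is within about 10^−3 of x* = 1.146193.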

For the other bound, let m > n ≥ 0. Then

    |x_m − x_n| = |x_m − x_{m−1} + x_{m−1} − x_{m−2} + ... + x_{n+1} − x_n|
                ≤ |x_m − x_{m−1}| + |x_{m−1} − x_{m−2}| + ... + |x_{n+1} − x_n|
                ≤ k^{m−1} |x_1 − x_0| + k^{m−2} |x_1 − x_0| + ... + k^n |x_1 − x_0|
                = k^n |x_1 − x_0| (1 + k + ... + k^{m−n−1}).

Letting m → ∞,

    |x* − x_n| = lim_{m→∞} |x_m − x_n| ≤ k^n |x_1 − x_0| Σ_{i=0}^{∞} k^i = k^n/(1 − k) |x_1 − x_0|  (as 0 < k < 1).

Convergence: The sequence {x_n} constructed by the fixed point iteration converges to the unique fixed point x* irrespective of the choice of the initial approximation x_0, because

    lim_{n→∞} |x* − x_n| ≤ lim_{n→∞} k^n max{x_0 − a, b − x_0} = 0 (as 0 < k < 1) ⟹ lim_{n→∞} |x* − x_n| = 0.

It converges really fast when k is close to 0.

Example. Find the number of iterations of the fixed point iteration that guarantees approximating the root x* of f(x) = e^x − x − 2 on [0, 2] using x_0 = 1 with accuracy within 10^−3.

Solution. Consider g(x) = ln(x + 2) on [0, 2], where |g′(x)| ≤ 1/2 = k < 1 for all x ∈ (0, 2).

    ε_n = |x* − x_n| ≤ k^n max{x_0 − a, b − x_0} = (1/2)^n · 1 < 10^−3
    ⟹ 10^3 < 2^n
    ⟹ 3 ln 10 < n ln 2
    ⟹ 3 ln 10 / ln 2 < n
    ⟹ 9.97 < n.

Thus the 10th iteration guarantees accuracy of the root within 10^−3. But note that |x* − x_5| ≈ |1.1462 − 1.1457| ≈ 0.0004 < 10^−3. Thus accuracy within 10^−3 is reached at the 5th iteration (way before the 10th iteration). Also note that the other bound, ε_n = |x* − x_n| ≤ k^n/(1 − k) |x_1 − x_0|, gives 10.94 < n, which does not improve our answer.

Algorithm Fixed-Point-Iteration
Input: g(x) = ln(x + 2), interval [0, 2], an initial approximation x_0, tolerance 10^−3, maximum number of iterations 50
Output: an approximate fixed point of g on [0, 2] within 10^−3 or a message of failure
  set x = x_0 and xold = x_0;
  for i = 1 to 50
    x = g(x);
    if |x − xold| < 10^−3      % checking required accuracy
      FoundSolution = true;    % done
      break;                   % leave for environment
    end if
    xold = x;                  % update xold for the next iteration
  end for
  if FoundSolution
    return x
  else
    print "the required accuracy is not reached in 50 iterations"
  end if
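A Python version of the pseudocode above; an exception stands in for the printed failure message, and the function name is our choice:

```python
import math

def fixed_point(g, x0, tol=1e-3, max_iter=50):
    """Fixed point iteration: stop when successive iterates differ by
    less than tol, or fail after max_iter iterations."""
    x_old = x0
    for _ in range(max_iter):
        x = g(x_old)
        if abs(x - x_old) < tol:   # checking required accuracy
            return x
        x_old = x
    raise RuntimeError("required accuracy not reached in max_iter iterations")

root = fixed_point(lambda x: math.log(x + 2), 1.0)
print(root)
```

With g(x) = ln(x + 2) and x_0 = 1 this stops after a handful of iterations, as in the worked example.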

2.3 Newton-Raphson Method

Suppose f is a function with a unique root x* in [a, b]. Assume f″ is continuous on [a, b]. To find x*, let x_0 be a good initial approximation (i.e., x* − x_0 ≈ 0) with f′(x_0) ≠ 0. Using Taylor's Theorem for f about x_0, we get

    f(x) = f(x_0) + f′(x_0)/1! (x − x_0) + f″(ξ)/2! (x − x_0)²,

for some number ξ (depending on x) between x_0 and x. Plugging in x = x*, we get

    0 = f(x*) = f(x_0) + f′(x_0)/1! (x* − x_0) + f″(ξ)/2! (x* − x_0)²
              ≈ f(x_0) + f′(x_0)/1! (x* − x_0),

since f″(ξ)/2! (x* − x_0)² ≈ 0 (as x* − x_0 ≈ 0). Now solving for x*, we get

    x* ≈ x_0 − f(x_0)/f′(x_0).

So x_1 = x_0 − f(x_0)/f′(x_0) is an approximation to x*. Using x_1, similarly we get x_2 = x_1 − f(x_1)/f′(x_1). Continuing the process, we get a sequence {x_n} approximating x*, where

    x_{n+1} = x_n − f(x_n)/f′(x_n), n = 0, 1, 2, ...

[Figure: each x_{n+1} is where the tangent line to y = f(x) at (x_n, f(x_n)) crosses the x-axis.]

Observation. x_{n+1} is the x-intercept of the tangent line to y = f(x) at (x_n, f(x_n)). The equation of the tangent line to y = f(x) at (x_n, f(x_n)) is y = f(x_n) + f′(x_n)(x − x_n).

For the x-intercept, y = 0. So

    0 = f(x_n) + f′(x_n)(x − x_n) ⟹ x = x_n − f(x_n)/f′(x_n).

Thus x_{n+1} is the x-intercept of the tangent line to y = f(x) at (x_n, f(x_n)).

Note. We will show later that {x_n} converges to x* for a good choice of the initial approximation x_0. If x_0 is far from x*, then {x_n} may not converge to x*.

Example. Do four iterations of the Newton-Raphson method to approximate the root x* of f(x) = e^x − x − 2 on the interval [0, 2] using x_0 = 1.

Solution. f(x) = e^x − x − 2 ⟹ f′(x) = e^x − 1, and f′(1) = e − 1 ≠ 0. Thus

    x_{n+1} = x_n − (e^{x_n} − x_n − 2)/(e^{x_n} − 1), n = 0, 1, 2, ...

n | x_n    | f(x_n)  | f′(x_n) | x_{n+1} = x_n − f(x_n)/f′(x_n)
0 | 1      | −0.2817 | 1.7183  | 1.1639
1 | 1.1639 | 0.0384  | 2.2024  | 1.1464
2 | 1.1464 | 0.0004  | 2.1468  | 1.1462
3 | 1.1462 | 0.0000  | 2.1462  | 1.1462

Since |x_4 − x_3| < 10^−4, we can say that the root is x_4 = 1.1462, roughly correct to three decimal places (almost true, as x* = 1.146193). In exact arithmetic x_4 would be correct to about 12 decimal places! Note that this sequence converges to the root faster than those of the other methods (why?).

Convergence: Suppose f is a function with a simple root x* in [a, b], i.e., f(x*) = 0 and f′(x*) ≠ 0. Assume f″ is continuous on [a, b]. Then there is a δ > 0 such that the sequence {x_n} constructed by the Newton-Raphson method converges to x* for any choice of the initial approximation x_0 ∈ (x* − δ, x* + δ).

Proof. Consider the following function g on [a, b]:

    g(x) = x − f(x)/f′(x).

First note that x* is a fixed point of g. Since f′(x*) ≠ 0, by the continuity of f′ on [a, b], there is a δ_1 > 0 such that f′(x) ≠ 0 for all x ∈ [x* − δ_1, x* + δ_1]. So g is continuous on [x* − δ_1, x* + δ_1].

The convergence follows from that of the fixed point iteration of g on [x* − δ, x* + δ], if we can show that there is a positive δ < δ_1 such that (i) x* − δ ≤ g(x) ≤ x* + δ for all x ∈ [x* − δ, x* + δ], and (ii) |g′(x)| < 1 for all x ∈ (x* − δ, x* + δ).

To prove (ii), note that

    g′(x) = 1 − [f′(x)f′(x) − f(x)f″(x)]/(f′(x))² = f(x)f″(x)/(f′(x))²
    ⟹ g′(x*) = f(x*)f″(x*)/(f′(x*))² = 0  (since f(x*) = 0, f′(x*) ≠ 0).

Since g′(x*) = 0, by the continuity of g′ on [x* − δ_1, x* + δ_1], there is a positive δ < δ_1 such that |g′(x)| < 1 for all x ∈ (x* − δ, x* + δ). So we have (ii).

To show (i), take x ∈ [x* − δ, x* + δ]. By the MVT applied to g, we have g(x*) − g(x) = g′(ξ)(x* − x) for some ξ between x and x*. Thus

    |x* − g(x)| = |g(x*) − g(x)| = |g′(ξ)| |x* − x| < |x* − x| ≤ δ,

i.e., x* − δ ≤ g(x) ≤ x* + δ.

Note that if the root x* of f is not simple, i.e., f′(x*) = 0, then this proof does not work, but {x_n} may still converge to x*. For a multiple root x* of f we use a modified Newton-Raphson iteration formula.

The Secant Method: In the iteration formula of the Newton-Raphson method,

    x_{n+1} = x_n − f(x_n)/f′(x_n), n = 0, 1, 2, ...,

we need to calculate the derivative f′(x_n), which may sometimes be difficult. To avoid such calculations, the secant method modifies the formula by replacing f′(x_n) with an approximation. Note that

    f′(x_n) = lim_{x→x_n} (f(x) − f(x_n))/(x − x_n).

If x_{n−1} is close to x_n, then

    f′(x_n) ≈ (f(x_{n−1}) − f(x_n))/(x_{n−1} − x_n) = (f(x_n) − f(x_{n−1}))/(x_n − x_{n−1}).

Thus

    x_{n+1} = x_n − f(x_n)/f′(x_n) ≈ x_n − (x_n − x_{n−1}) f(x_n)/(f(x_n) − f(x_{n−1})).

So the iteration formula of the secant method is

    x_{n+1} = x_n − (x_n − x_{n−1}) f(x_n)/(f(x_n) − f(x_{n−1})), n = 1, 2, 3, ...

Note that geometrically x_{n+1} is the x-intercept of the secant line joining (x_n, f(x_n)) and (x_{n−1}, f(x_{n−1})). Also note that the convergence of the sequence {x_n} of the secant method is slower than that of the Newton-Raphson method.
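Both iteration formulas are easy to sketch in Python; the names, tolerances, and stopping rules below are our illustrative choices:

```python
import math

def newton(f, fprime, x0, tol=1e-10, max_iter=50):
    """Newton-Raphson: x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        x_new = x - f(x) / fprime(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

def secant(f, x0, x1, tol=1e-10, max_iter=50):
    """Secant method: f'(x_n) replaced by a difference quotient."""
    for _ in range(max_iter):
        x2 = x1 - (x1 - x0) * f(x1) / (f(x1) - f(x0))
        if abs(x2 - x1) < tol:
            return x2
        x0, x1 = x1, x2
    return x1

f = lambda x: math.exp(x) - x - 2
root_newton = newton(f, lambda x: math.exp(x) - 1, 1.0)
root_secant = secant(f, 1.0, 2.0)
print(root_newton, root_secant)
```

Both converge to x* = 1.146193... on the running example; Newton takes fewer iterations, as the next section explains.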

2.4 Order of Convergence

In the preceding three sections we learned techniques for constructing a sequence {x_n} that converges to the root x* of a function. But their speeds of convergence are different. In this section we compare them by their order of convergence.

Definition. The convergence of a sequence {x_n} to x* is of order p if for some constant c > 0,

    lim_{n→∞} |x_{n+1} − x*| / |x_n − x*|^p = lim_{n→∞} ε_{n+1}/ε_n^p = c.

For higher order convergence (i.e., larger p), the sequence converges more rapidly. The convergence is called linear, quadratic, and superlinear if p = 1, p = 2, and 1 < p < 2, respectively. We say {x_n} converges linearly, quadratically, or superlinearly to x*. In quadratic convergence we have ε_{n+1} ≈ c ε_n², and then the number of correct digits of x_n roughly doubles at each iteration.

Example. The sequence {1/2^n} converges linearly to 0.

    lim_{n→∞} (1/2^{n+1}) / (1/2^n)^p = lim_{n→∞} 2^{pn − n − 1} = { 0 if p < 1;  1/2 if p = 1;  ∞ if p > 1 }.

So p = 1 and c = 1/2.

Example. The rate of convergence of the bisection method is linear. Recall that ε_n = |x* − x_n| ≤ (b − a)/2^n. So the rate of convergence of the sequence {x_n} is (at least) similar to that of {1/2^n}. We denote this by x_n = x* + O(1/2^n). Since {1/2^n} converges linearly, {x_n} also converges linearly.

Example. The rate of convergence of the fixed point iteration is linear when g′(x*) ≠ 0. Consider the sequence {x_n}, where x_{n+1} = g(x_n), that converges to x* with g′(x*) ≠ 0. We have shown before, by applying the MVT to g on the interval with endpoints x* and x_n, that there is ξ_n between x* and x_n such that

    x* − x_{n+1} = g(x*) − g(x_n) = g′(ξ_n)(x* − x_n).

Since {x_n} converges to x* and ξ_n lies between x* and x_n, {ξ_n} also converges to x*. Then

    lim_{n→∞} |x* − x_{n+1}| / |x* − x_n| = lim_{n→∞} |g′(ξ_n)| = |g′(lim_{n→∞} ξ_n)| = |g′(x*)| ≠ 0  (assuming continuity of g′).

So p = 1.

Example. The rate of convergence of the Newton-Raphson method for finding a simple root is quadratic.

Recall that to find a simple root x* of f (i.e., f′(x*) ≠ 0), we used the fixed point iteration on g(x) = x − f(x)/f′(x) for any choice of the initial approximation x_0 ∈ (x* − δ, x* + δ) for small δ > 0. Also recall that since f(x*) = 0 and f′(x*) ≠ 0, g′(x*) = 0. By Taylor's Theorem for g about x*, we get

    g(x) = g(x*) + g′(x*)/1! (x − x*) + g″(ξ)/2! (x − x*)²,

for some ξ between x* and x. For x = x_n, we get ξ_n between x_n and x* such that

    g(x_n) = g(x*) + g′(x*)/1! (x_n − x*) + g″(ξ_n)/2! (x_n − x*)²
    ⟹ x_{n+1} = x* + g″(ξ_n)/2! (x_n − x*)²   (since g(x_n) = x_{n+1}, g(x*) = x*, g′(x*) = 0)
    ⟹ x_{n+1} − x* = g″(ξ_n)/2! (x_n − x*)²
    ⟹ lim_{n→∞} |x_{n+1} − x*| / |x_n − x*|² = lim_{n→∞} |g″(ξ_n)| / 2! = |g″(x*)| / 2  (assuming continuity of g″),

where the last step uses that {x_n} converges to x* and ξ_n lies between x_n and x*, so {ξ_n} also converges to x*. Thus

    lim_{n→∞} |x_{n+1} − x*| / |x_n − x*|² = |g″(x*)| / 2 = |f″(x*)| / (2 |f′(x*)|).

Note.
1. If the root is not simple, the Newton-Raphson method may still converge, but the rate of convergence is linear rather than quadratic. Verify this by approximating the double root zero of f(x) = x² using x_0 = 1.
2. A modified Newton-Raphson method to find a multiple root has linear convergence.
3. The order of convergence of the secant method is superlinear (p = (1 + √5)/2 ≈ 1.62), which is between that of the bisection method and that of the Newton-Raphson method. (Proof skipped.)
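The quadratic order can be checked empirically: from ε_{n+1} ≈ c ε_n^p, two consecutive error ratios give p ≈ log(ε_{n+1}/ε_n) / log(ε_n/ε_{n−1}). A Python sketch using Newton's method on the running example (the final iterate stands in for the unknown x*, an assumption justified by its very rapid convergence):

```python
import math

f = lambda x: math.exp(x) - x - 2
fp = lambda x: math.exp(x) - 1

# Generate Newton iterates x0, ..., x4 for f on the running example.
xs = [1.0]
for _ in range(4):
    x = xs[-1]
    xs.append(x - f(x) / fp(x))

x_star = xs[-1]  # use the last (very accurate) iterate as the reference value
errs = [abs(x - x_star) for x in xs[:-1]]

# Estimate the order p from three consecutive errors.
p = math.log(errs[2] / errs[1]) / math.log(errs[1] / errs[0])
print(p)  # close to 2 for a simple root
```

The same experiment with the bisection or fixed point iterates gives p near 1, matching the examples above.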

3 Interpolation

Suppose we have a set of data in which for each x_i we have y_i:

x_0 | y_0
x_1 | y_1
x_2 | y_2
x_3 | y_3
... | ...
x_n | y_n

So there is an unknown function f for which f(x_i) = y_i, i = 0, 1, ..., n. With this data we would like to predict f(x̄) for a given x̄ ∈ [x_0, x_n] with x̄ ≠ x_i, i = 0, 1, ..., n. This method of finding untabulated data from a given table of data is called interpolation.

How do we find or approximate the unknown function f? One easy answer is to use a piecewise linear function f* such that

    f*(x) = y_i + (x − x_i)(y_i − y_{i−1})/(x_i − x_{i−1}) for all x ∈ [x_{i−1}, x_i].

But this is too naive, because it assumes the functional values change at the constant rate (y_i − y_{i−1})/(x_i − x_{i−1}) on the entire interval [x_{i−1}, x_i].

There are multiple techniques for approximating f. We will mainly focus on approximating f by a polynomial P_n of degree n, called the interpolating polynomial; the method is called polynomial interpolation. Polynomial interpolation is suggested by the following theorem.

Theorem (Weierstrass Approximation Theorem). For a given continuous function f on [a, b] and a small positive number ε, there exists a polynomial P such that

    |f(x) − P(x)| < ε, i.e., P(x) − ε < f(x) < P(x) + ε, for all x in [a, b].

[Figure: y = f(x) squeezed between y = P(x) − ε and y = P(x) + ε on [x_0, x_n].]

How do we find such a polynomial P? What is the maximum error in approximating f(x̄) by P(x̄)?

3.1 Lagrange Polynomials

For two distinct points (x_0, y_0) and (x_1, y_1), there is a unique polynomial P_1 of degree at most 1 such that P_1(x_0) = y_0 and P_1(x_1) = y_1. It can be verified that

    P_1(x) = y_0 + (y_1 − y_0)/(x_1 − x_0) (x − x_0) = y_0 (x − x_1)/(x_0 − x_1) + y_1 (x − x_0)/(x_1 − x_0),

whose graph is the straight line joining (x_0, y_0) and (x_1, y_1). We can extend this idea to n + 1 distinct points.

Theorem. Suppose (x_0, y_0), (x_1, y_1), ..., (x_n, y_n) are n + 1 points where x_0, x_1, ..., x_n are distinct, and f is a function such that f(x_i) = y_i, i = 0, 1, ..., n. Then there is a unique polynomial P_n of degree at most n such that P_n(x_i) = f(x_i), i = 0, 1, ..., n.

Proof (sketch). Consider a polynomial P_n(x) = a_0 + a_1 x + a_2 x² + ... + a_n x^n for which P_n(x_i) = f(x_i) = y_i, i = 0, 1, ..., n. This gives a system of n + 1 equations in the n + 1 variables a_0, a_1, ..., a_n:

    a_0 + a_1 x_i + a_2 x_i² + ... + a_n x_i^n = y_i, i = 0, 1, ..., n.

Its matrix form is A x = b, where x = [a_0, ..., a_n]^T, b = [y_0, ..., y_n]^T, and A is the Vandermonde matrix

    A = [ 1  x_0  x_0²  ...  x_0^n ]
        [ 1  x_1  x_1²  ...  x_1^n ]
        [ ...                      ]
        [ 1  x_n  x_n²  ...  x_n^n ]

The determinant det(A) = ∏_{0 ≤ i < j ≤ n} (x_j − x_i) ≠ 0, as x_0, x_1, ..., x_n are distinct. So A is invertible and we have a unique solution [a_0, ..., a_n]^T = x = A^{−1} b, giving a unique polynomial P_n of degree at most n.

Note that there are infinitely many polynomials P of degree more than n for which P(x_i) = f(x_i) = y_i, i = 0, 1, ..., n. One construction of the polynomial P_n of degree at most n such that P_n(x_i) = f(x_i) = y_i, i = 0, 1, ..., n is given by Joseph Lagrange:

Lagrange polynomial: P_n(x) = y_0 L_0(x) + y_1 L_1(x) + ... + y_n L_n(x), where L_i is given by

    L_i(x) = ∏_{j=0, j≠i}^{n} (x − x_j)/(x_i − x_j)
           = [(x − x_0) ... (x − x_{i−1})(x − x_{i+1}) ... (x − x_n)] / [(x_i − x_0) ... (x_i − x_{i−1})(x_i − x_{i+1}) ... (x_i − x_n)].

Note that L_i(x_i) = 1 and L_i(x_j) = 0 for all j ≠ i. Thus P_n(x_i) = y_i = f(x_i), i = 0, 1, ..., n.

Example. Approximate f(2) by constructing the Lagrange polynomial P_2 of degree 2 from the following data:

x    | 1  3     4
f(x) | 0  4.39  5.54

Solution. P_2 is given by P_2(x) = y_0 L_0(x) + y_1 L_1(x) + y_2 L_2(x), where

    L_0(x) = (x − 3)(x − 4)/((1 − 3)(1 − 4)) = (x − 3)(x − 4)/6,
    L_1(x) = (x − 1)(x − 4)/((3 − 1)(3 − 4)) = −(x − 1)(x − 4)/2,
    L_2(x) = (x − 1)(x − 3)/((4 − 1)(4 − 3)) = (x − 1)(x − 3)/3.

Thus

    P_2(x) = 0 · (x − 3)(x − 4)/6 − 4.39 (x − 1)(x − 4)/2 + 5.54 (x − 1)(x − 3)/3.

Thus f(2) ≈ P_2(2) = 2.54.

[Figure: graphs of y = 4 ln x and y = P_2(x) on [0, 4].]

The preceding example has the table for f(x) = 4 ln x.

Maximum Error: If a function f that has continuous f^(n+1) on [x_0, x_n] is interpolated by the Lagrange polynomial P_n using the n+1 points (x_0, y_0), (x_1, y_1), ..., (x_n, y_n), then the error is given by the following for each x ∈ [x_0, x_n]:

    f(x) − P_n(x) = f^(n+1)(ξ)/(n+1)! (x − x_0)(x − x_1) ... (x − x_n),

where ξ ∈ (x_0, x_n) depends on x.

Proof. (Sketch) If x = x_i, i = 0, 1, ..., n, then

    f(x_i) − P_n(x_i) = 0 = f^(n+1)(ξ)/(n+1)! (x_i − x_0)(x_i − x_1) ... (x_i − x_n).

For a fixed x ≠ x_i, i = 0, 1, ..., n, define a function g on [x_0, x_n] by

    g(t) = f(t) − P_n(t) − [f(x) − P_n(x)] ∏_{j=0}^{n} (t − x_j)/(x − x_j).

Verify that g^(n+1) is continuous on [x_0, x_n] and g is zero at the n+2 points x, x_0, x_1, ..., x_n. Then by the Generalized Rolle's Theorem, g^(n+1)(ξ) = 0 for some ξ ∈ (x_0, x_n), which implies (steps skipped)

    f^(n+1)(ξ) − (n+1)! [f(x) − P_n(x)] / ∏_{j=0}^{n} (x − x_j) = 0.

Now solving for f(x), we get

    f(x) − P_n(x) = f^(n+1)(ξ)/(n+1)! (x − x_0)(x − x_1) ... (x − x_n).

So the maximum error is given by

    |f(x) − P_n(x)| ≤ M K/(n+1)!,

where M = max_{x ∈ [x_0, x_n]} |f^(n+1)(x)| and K = max_{x ∈ [x_0, x_n]} |(x − x_0)(x − x_1) ... (x − x_n)|.

This error bound does not have a direct practical application as f is unknown. But it shows that the error decreases when we take more points (most of the time!). Note that if f is a polynomial of degree at most n, then f^(n+1) = 0 and consequently M = 0, giving no error.

Example. Find the maximum error in approximating f(x) = 4 ln x on [1, 4] by the Lagrange polynomial P_2 using the points x = 1, 3, 4.

Solution. Since f'''(x) = 8/x^3 is decreasing on [1, 4],

    M = max_{x ∈ [1,4]} |f'''(x)| = f'''(1) = 8.

Now we find the extreme values of g(x) = (x − 1)(x − 3)(x − 4) = x^3 − 8x^2 + 19x − 12. Note that

    g'(x) = 3x^2 − 16x + 19 = 0  ⟹  x = (8 ± √7)/3,

and (8 ± √7)/3 ∈ [1, 4]. We have

    g((8 + √7)/3) = (20 − 14√7)/27,  g((8 − √7)/3) = (20 + 14√7)/27,  g(1) = 0,  g(4) = 0.

Since g((8 − √7)/3) > |g((8 + √7)/3)|,

    K = max_{x ∈ [1,4]} |(x − 1)(x − 3)(x − 4)| = (20 + 14√7)/27.

Thus the maximum error is

    M K/(n+1)! = 8 (20 + 14√7)/27 / 3! ≈ 2.81.

The last example has the table for f(x) = 4 ln x. So f(2) = 4 ln 2 ≈ 2.77 and the absolute error is |P_2(2) − f(2)| = |2.54 − 2.77| = 0.23. But approximating f(x) by P_2(x) for an arbitrary x ∈ [1, 4] can have error as large as the bound 2.81.

Note that another construction of the unique polynomial P_n of degree at most n such that P_n(x_i) = f(x_i), i = 0, 1, ..., n is given by Isaac Newton:

    P_n(x) = a_0 + a_1 (x − x_0) + a_2 (x − x_0)(x − x_1) + ... + a_n (x − x_0) ... (x − x_{n−1}),

where the a_i, i = 0, ..., n are found by Divided Differences.
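The value of K above can be checked numerically by a brute-force grid search over [1, 4]; this is only a sanity check of the calculus, under the assumption that a fine grid captures the maximum of the smooth function g.

```python
import math

# K = max |(x-1)(x-3)(x-4)| on [1,4]; the calculus gives (20 + 14*sqrt(7))/27.
g = lambda x: (x - 1) * (x - 3) * (x - 4)
K_exact = (20 + 14 * math.sqrt(7)) / 27
# brute-force check on a fine grid (smooth g, so the grid max is very close)
K_grid = max(abs(g(1 + 3 * k / 100000)) for k in range(100001))
# error bound M*K/(n+1)! with M = 8, n = 2
bound = 8 * K_exact / math.factorial(3)
```

Here `bound` evaluates to about 2.8168, matching the ≈ 2.81 quoted above.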

3.2 Cubic Spline Interpolation

There are some problems in approximating an unknown function f on [x_0, x_n] by a single polynomial P_n using n+1 points (x_0, y_0), (x_1, y_1), ..., (x_n, y_n): the values P_n(x) may oscillate near the end points (Runge's phenomenon), and then the maximum error |f(x) − P_n(x)| → ∞ as n → ∞, i.e., the error grows as we take more points. For example, consider Runge's function f(x) = 1/(1 + 25x^2) on [−1, 1].

To avoid these problems, we use a piecewise interpolating polynomial S on the intervals [x_0, x_1], [x_1, x_2], ..., [x_{n−1}, x_n]. This is called piecewise polynomial interpolation. But in piecewise linear interpolation, the piecewise linear polynomial S is not smooth, i.e., S'(x) is not continuous at the nodes x_1, ..., x_{n−1}. If S is piecewise quadratic, then we get smoothness, but it does not work when both S'(x_0) and S'(x_n) are prescribed. So we will study piecewise cubic interpolation.

[Figure: a piecewise interpolant y = S(x) with pieces S_1, S_2, S_3, ..., S_n on [x_0, x_n].]

A spline S of degree k is a piecewise polynomial of degree at most k such that S^(k−1) is continuous. A cubic spline is a piecewise cubic with continuous first and second derivatives:

1. S(x) = S_i(x) on [x_{i−1}, x_i], for i = 1, 2, ..., n
2. S_i(x_i) = y_i = S_{i+1}(x_i) for i = 1, ..., n−1, S_1(x_0) = y_0, and S_n(x_n) = y_n
3. S_i'(x_i) = S_{i+1}'(x_i) for i = 1, ..., n−1
4. S_i''(x_i) = S_{i+1}''(x_i) for i = 1, ..., n−1

A cubic spline S is called natural if S''(x_0) = S''(x_n) = 0, and clamped if S'(x_0) and S'(x_n) are provided.

Example. Approximate f(2) by constructing a natural cubic spline S from the following data:

x    | 1  3  4
f(x) | 1  0  −1

Solution. We define S piecewise as follows:

    S(x) = S_1(x) = a_1 + b_1 (x − 1) + c_1 (x − 1)^2 + d_1 (x − 1)^3  if x ∈ [1, 3]
         = S_2(x) = a_2 + b_2 (x − 3) + c_2 (x − 3)^2 + d_2 (x − 3)^3  if x ∈ [3, 4]

Using the conditions of a cubic spline together with the natural boundary conditions, we get a system of 8 equations in 8 variables:

    S_1(1) = 1        ⟹ a_1 = 1
    S_1(3) = 0        ⟹ a_1 + 2b_1 + 4c_1 + 8d_1 = 0
    S_2(3) = 0        ⟹ a_2 = 0
    S_2(4) = −1       ⟹ a_2 + b_2 + c_2 + d_2 = −1
    S_1'(3) = S_2'(3)  ⟹ b_1 + 4c_1 + 12d_1 = b_2
    S_1''(3) = S_2''(3) ⟹ 2c_1 + 12d_1 = 2c_2
    S_1''(1) = 0       ⟹ 2c_1 = 0
    S_2''(4) = 0       ⟹ 2c_2 + 6d_2 = 0

Solving (using linear algebra) we get the unique solution

    (a_1, b_1, c_1, d_1, a_2, b_2, c_2, d_2) = (1, −1/3, 0, −1/24, 0, −5/6, −1/4, 1/12).

Thus

    S(x) = 1 − (1/3)(x − 1) − (1/24)(x − 1)^3          if x ∈ [1, 3]
         = −(5/6)(x − 3) − (1/4)(x − 3)^2 + (1/12)(x − 3)^3  if x ∈ [3, 4]

Thus f(2) ≈ S(2) = 15/24 ≈ 0.63.
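The 8×8 system above can be assembled and solved directly. A minimal sketch in pure Python (no numpy); the helper `solve` is my own small Gaussian-elimination routine, not something from the notes.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting on a small dense system."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            m = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= m * M[k][c]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        x[k] = (M[k][n] - sum(M[k][c] * x[c] for c in range(k + 1, n))) / M[k][k]
    return x

# Unknown order: [a1, b1, c1, d1, a2, b2, c2, d2]; one row per spline condition.
A = [
    [1, 0, 0, 0,  0, 0,  0, 0],   # S1(1)  = 1
    [1, 2, 4, 8,  0, 0,  0, 0],   # S1(3)  = 0
    [0, 0, 0, 0,  1, 0,  0, 0],   # S2(3)  = 0
    [0, 0, 0, 0,  1, 1,  1, 1],   # S2(4)  = -1
    [0, 1, 4, 12, 0, -1, 0, 0],   # S1'(3)  = S2'(3)
    [0, 0, 2, 12, 0, 0, -2, 0],   # S1''(3) = S2''(3)
    [0, 0, 2, 0,  0, 0,  0, 0],   # S1''(1) = 0  (natural)
    [0, 0, 0, 0,  0, 0,  2, 6],   # S2''(4) = 0  (natural)
]
b = [1, 0, 0, -1, 0, 0, 0, 0]
a1, b1, c1, d1, a2, b2, c2, d2 = solve(A, b)
S_at_2 = a1 + b1 + c1 + d1  # S(2) = S1(2), since 2 lies in [1, 3] and x - 1 = 1
```

Running this reproduces the coefficients above and S(2) = 15/24 = 0.625.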

[Figure: the natural cubic spline pieces S_1 on [1, 3] and S_2 on [3, 4] through (1, 1), (3, 0), (4, −1).]

Note in the preceding example that we got a unique solution and hence a unique natural cubic spline. But we did not just get lucky, because this is always the case:

Theorem. There is a unique natural cubic spline, and a unique clamped cubic spline, passing through the n+1 points (x_0, y_0), (x_1, y_1), ..., (x_n, y_n).

Proof. (Sketch) We have in total 4n unknown coefficients a_i, b_i, c_i, d_i, i = 1, ..., n in the n cubics S_1, ..., S_n. Using the 4n conditions of a cubic spline together with the natural or clamped boundary conditions, we get a system of 4n equations in 4n variables. Using algebraic substitutions and linear algebra (steps skipped), we get a unique solution.

Note that a clamped cubic spline usually gives a better approximation than a natural cubic spline near the endpoints of [x_0, x_n].

4 Numerical Differentiation and Integration

In this chapter we will learn numerical methods for the derivative and integral of a function.

4.1 Numerical Differentiation

In this section we will numerically find f'(x) evaluated at x = x_0. We need numerical techniques for derivatives when f'(x) has a complicated expression or f(x) is not explicitly given. By the limit definition of the derivative,

    f'(x_0) = lim_{h→0} [f(x_0 + h) − f(x_0)]/h.

So when h > 0 is small, we have

    f'(x_0) ≈ [f(x_0 + h) − f(x_0)]/h,

which is called the two-point forward difference formula (FDF). Similarly the two-point backward difference formula (BDF) is

    f'(x_0) ≈ [f(x_0) − f(x_0 − h)]/h.

Taking the average of the FDF and BDF, we get the two-point centered difference formula (CDF):

    f'(x_0) ≈ [f(x_0 + h) − f(x_0 − h)]/(2h).

[Figure: the secant line through (x_0 − h, f(x_0 − h)) and (x_0 + h, f(x_0 + h)) approximating the tangent line of slope f'(x_0).]

Note that the CDF gives better accuracy than the FDF and BDF (explained later). But the CDF does not work if f(x) is not known on one side of x_0. All the difference formulas suffer from round-off errors when h is too small.

Example. Let f(x) = x^2 e^x. Approximate f'(1) using the FDF, BDF, and CDF with h = 0.2.

Solution.

    Two-point FDF: f'(1) ≈ [f(1.2) − f(1)]/0.2 = (4.78 − 2.71)/0.2 = 10.35
    Two-point BDF: f'(1) ≈ [f(1) − f(0.8)]/0.2 = (2.71 − 1.42)/0.2 = 6.45
    Two-point CDF: f'(1) ≈ [f(1.2) − f(0.8)]/(2 · 0.2) = (4.78 − 1.42)/0.4 = 8.4

Analytically we know f'(1) = 3e ≈ 8.15. So the absolute errors are |10.35 − 3e| = 2.19, |6.45 − 3e| = 1.70, and |8.4 − 3e| = 0.24 respectively. So the CDF gives the least error.

Errors in finite difference formulas: By Taylor's theorem on f about x_0, we get

    f(x) = f(x_0) + f'(x_0)/1! (x − x_0) + f''(ξ)/2! (x − x_0)^2.

Plugging in x = x_0 + h, we get

    f(x_0 + h) = f(x_0) + f'(x_0)/1! h + f''(ξ_1)/2! h^2

for some ξ_1 ∈ (x_0, x_0 + h). Solving for f'(x_0), we get

    f'(x_0) = [f(x_0 + h) − f(x_0)]/h − f''(ξ_1)/2 h.

The maximum error in the FDF is

    (h/2) max_{x ∈ (x_0, x_0+h)} |f''(x)|.

So the error in the FDF is O(h) (i.e., absolute error ≤ ch for some c > 0). It means a small step size h results in a more accurate derivative. We say the FDF is first-order accurate. Similarly the BDF is also first-order accurate with maximum error

    (h/2) max_{x ∈ (x_0−h, x_0)} |f''(x)|.

For the CDF, note that

    f(x_0 + h) = f(x_0) + f'(x_0)/1! h + f''(x_0)/2! h^2 + f'''(ξ_1)/3! h^3,
    f(x_0 − h) = f(x_0) − f'(x_0)/1! h + f''(x_0)/2! h^2 − f'''(ξ_2)/3! h^3,

for some ξ_1 ∈ (x_0, x_0 + h) and ξ_2 ∈ (x_0 − h, x_0). Subtracting we get,

    f(x_0 + h) − f(x_0 − h) = 2f'(x_0) h + [f'''(ξ_1) + f'''(ξ_2)]/6 h^3
    ⟹ [f(x_0 + h) − f(x_0 − h)]/(2h) = f'(x_0) + [f'''(ξ_1) + f'''(ξ_2)]/12 h^2
    ⟹ f'(x_0) = [f(x_0 + h) − f(x_0 − h)]/(2h) − [f'''(ξ_1) + f'''(ξ_2)]/12 h^2.

Assuming continuity of f''' and using the IVT on f''', we get

    f'''(ξ) = [f'''(ξ_1) + f'''(ξ_2)]/2,

for some ξ between ξ_2 and ξ_1, so ξ ∈ (x_0 − h, x_0 + h). Thus

    f'(x_0) = [f(x_0 + h) − f(x_0 − h)]/(2h) − f'''(ξ)/6 h^2.

The maximum error in the CDF is

    (h^2/6) max_{x ∈ (x_0−h, x_0+h)} |f'''(x)|.

So the error in the CDF is O(h^2), i.e., the CDF is second-order accurate, which is better than first-order accurate as h^2 << h for small h > 0.

Example. Consider f(x) = x^2 e^x. Find the maximum error in approximating f'(1) by the FDF, BDF, and CDF with h = 0.2.

Solution. f''(x) = (x^2 + 4x + 2)e^x and f'''(x) = (x^2 + 6x + 6)e^x are increasing functions for x > 0. So max_{x ∈ (1, 1.2)} |f''(x)| = f''(1.2) ≈ 27.3.

    Maximum error in two-point FDF: (0.2/2) max_{x ∈ (1, 1.2)} |f''(x)| = 0.1 f''(1.2) ≈ 2.73
    Maximum error in two-point BDF: (0.2/2) max_{x ∈ (0.8, 1)} |f''(x)| = 0.1 f''(1) ≈ 1.90
    Maximum error in two-point CDF: ((0.2)^2/6) max_{x ∈ (0.8, 1.2)} |f'''(x)| = (0.04/6) f'''(1.2) ≈ 0.32

Derivative from the Lagrange polynomial: If f is not explicitly given but we know (x_i, f(x_i)) for i = 0, 1, ..., n, then f is approximated by the Lagrange polynomial:

    f(x) = Σ_{i=0}^{n} f(x_i) L_i(x) + f^(n+1)(ξ(x))/(n+1)! ∏_{i=0}^{n} (x − x_i),

where ξ ∈ (x_0, x_n) and L_i(x) = ∏_{j=0, j≠i}^{n} (x − x_j)/(x_i − x_j). Differentiating both sides and evaluating at x = x_j, we get (steps skipped, but note that d/dx [∏_{i=0}^{n} (x − x_i)] evaluated at x = x_j equals ∏_{i=0, i≠j}^{n} (x_j − x_i))

    f'(x_j) = Σ_{i=0}^{n} f(x_i) L_i'(x_j) + f^(n+1)(ξ)/(n+1)! ∏_{i=0, i≠j}^{n} (x_j − x_i).

If the points x_0, x_1, ..., x_n are equally spaced, i.e., x_j = x_0 + jh, then we get

    f'(x_j) = Σ_{i=0}^{n} f(x_i) L_i'(x_j) + O(h^n).   (1)

It can be verified that the two-point FDF and BDF are obtained from (1) using n = 1. Similarly n = 2 gives the three-point FDF and BDF and the two-point CDF, whose errors are O(h^2):

    Three-point FDF: f'(x_0) ≈ [−3f(x_0) + 4f(x_0 + h) − f(x_0 + 2h)]/(2h)
    Three-point BDF: f'(x_0) ≈ [3f(x_0) − 4f(x_0 − h) + f(x_0 − 2h)]/(2h)

Example. From the following table approximate f'(1) by the three-point FDF and BDF.

x    | 0.6   0.8   1     1.2   1.4
f(x) | 0.65  1.42  2.71  4.78  7.94

Solution. Here h = 0.2.

    Three-point FDF: f'(1) ≈ [−3f(1) + 4f(1.2) − f(1.4)]/(2 · 0.2) = 7.62
    Three-point BDF: f'(1) ≈ [3f(1) − 4f(0.8) + f(0.6)]/(2 · 0.2) = 7.75

Note that the table is given for f(x) = x^2 e^x, so f'(1) = 3e ≈ 8.15. Then the absolute errors are |7.62 − 3e| = 0.53 and |7.75 − 3e| = 0.40 respectively. Notice that the three-point FDF and BDF give less error than the two-point FDF and BDF respectively.
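The three-point formulas applied to the table values can be sketched as follows (names `fdf3`/`bdf3` are mine; the inputs are the already-tabulated function values, as in the example):

```python
def fdf3(f0, f1, f2, h):
    """Three-point FDF at x0 from f(x0), f(x0+h), f(x0+2h):
    (-3 f(x0) + 4 f(x0+h) - f(x0+2h)) / (2h)."""
    return (-3 * f0 + 4 * f1 - f2) / (2 * h)

def bdf3(f0, fm1, fm2, h):
    """Three-point BDF at x0 from f(x0), f(x0-h), f(x0-2h):
    (3 f(x0) - 4 f(x0-h) + f(x0-2h)) / (2h)."""
    return (3 * f0 - 4 * fm1 + fm2) / (2 * h)
```

With the table values at 1, 1.2, 1.4 and at 1, 0.8, 0.6 these give 7.625 and 7.75, matching the example (the text truncates 7.625 to 7.62).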

4.2 Elements of Numerical Integration

Sometimes it is hard to calculate a definite integral analytically, for example ∫_0^1 e^{x^2} dx. To approximate such an integral ∫_a^b f(x) dx, we break [a, b] into n subintervals [x_0, x_1], [x_1, x_2], ..., [x_{n−1}, x_n] where x_i = a + ih and h = (b − a)/n. Then we approximate the integral by a finite sum given by a quadrature rule (or quadrature formula):

    ∫_a^b f(x) dx ≈ Σ_{i=0}^{n} c_i f(x_i).

A quadrature rule you have seen before is the Midpoint Rule:

    ∫_a^b f(x) dx ≈ (b − a) f((a + b)/2).

It approximates the area given by ∫_a^b f(x) dx by the area of the rectangle with length (b − a) and width f((a + b)/2).

[Figure: rectangle of height f((a + b)/2) over [a, b] under the curve y = f(x).]

Let's discuss other quadrature rules. Recall that we can approximate f(x) by the Lagrange polynomial P_n(x) of degree n using the n+1 points a = x_0, x_1, ..., x_n = b:

    f(x) ≈ Σ_{i=0}^{n} f(x_i) L_i(x),

where L_i(x) = ∏_{j=0, j≠i}^{n} (x − x_j)/(x_i − x_j). Integrating both sides, we get

    ∫_a^b f(x) dx ≈ ∫_a^b Σ_{i=0}^{n} f(x_i) L_i(x) dx = Σ_{i=0}^{n} f(x_i) [∫_a^b L_i(x) dx]

    ⟹ ∫_a^b f(x) dx ≈ Σ_{i=0}^{n} c_i f(x_i),  where c_i = ∫_a^b L_i(x) dx.

We will discuss the quadrature rules given by n = 1 and n = 2. For n = 1, we have n+1 = 2 points a = x_0, x_1 = b and then

    f(x) ≈ P_1(x) = f(x_0) (x − x_1)/(x_0 − x_1) + f(x_1) (x − x_0)/(x_1 − x_0) = −f(a) (x − b)/(b − a) + f(b) (x − a)/(b − a)

    ⟹ ∫_a^b f(x) dx ≈ −f(a)/(b − a) ∫_a^b (x − b) dx + f(b)/(b − a) ∫_a^b (x − a) dx
      = −f(a)/(b − a) [(x − b)^2/2]_a^b + f(b)/(b − a) [(x − a)^2/2]_a^b
      = (b − a) [f(a) + f(b)]/2.

So we get the Trapezoidal Rule:

    ∫_a^b f(x) dx ≈ (b − a) [f(a) + f(b)]/2.

It approximates the area given by ∫_a^b f(x) dx by the area of the trapezoid with height (b − a) and bases f(a) and f(b).

[Figure: trapezoid through (a, f(a)) and (b, f(b)) under the curve y = f(x).]

The error in the trapezoidal rule is the integral of the error term for the Lagrange polynomial:

    E_T = ∫_a^b f''(ξ(x))/2! (x − a)(x − b) dx.

By the Weighted MVT (where f''(ξ(x)) is continuous and (x − a)(x − b) does not change sign in [a, b]), we get a constant c ∈ (a, b) such that

    E_T = f''(c)/2 ∫_a^b (x − a)(x − b) dx = −f''(c) (b − a)^3/12.

Similarly for n = 2, we have n+1 = 3 points a = x_0, x_1 = (a + b)/2, x_2 = b and then

    f(x) ≈ P_2(x) = f(x_0) (x − x_1)(x − x_2)/((x_0 − x_1)(x_0 − x_2)) + f(x_1) (x − x_0)(x − x_2)/((x_1 − x_0)(x_1 − x_2)) + f(x_2) (x − x_0)(x − x_1)/((x_2 − x_0)(x_2 − x_1)).

Integrating we get Simpson's Rule:

    ∫_a^b f(x) dx ≈ (b − a)/6 [f(a) + 4f((a + b)/2) + f(b)],

where the error in Simpson's Rule (obtained from the Taylor polynomial T_3(x) of f about x = (a + b)/2 with the error term) is

    E_S = −((b − a)/2)^5/90 f^(4)(c).

Note from E_T that if f(x) is a polynomial of degree at most 1, then f'' = 0 and consequently E_T = 0, so the trapezoidal rule gives the exact integral. Similarly if f(x) is a polynomial of degree at most 3, then E_S = 0 and consequently Simpson's rule gives the exact integral.

Example. Approximate ∫_0^2 x^3 dx by the Midpoint Rule, Trapezoidal Rule, and Simpson's Rule.

Solution. First of all let's find the exact integral:

    ∫_0^2 x^3 dx = [x^4/4]_0^2 = 4.

    Midpoint:    ∫_0^2 x^3 dx ≈ (b − a) f((a + b)/2) = (2 − 0) · 1^3 = 2
    Trapezoidal: ∫_0^2 x^3 dx ≈ (b − a) [f(a) + f(b)]/2 = (2 − 0)(0 + 8)/2 = 8
    Simpson's:   ∫_0^2 x^3 dx ≈ (b − a)/6 [f(a) + 4f((a + b)/2) + f(b)] = (2 − 0)/6 [0 + 4 · 1^3 + 2^3] = 4

Simpson's Rule gives the best approximation, which turns out to be the exact integral. Note that the error of the Midpoint Rule is always half of that of the Trapezoidal Rule in magnitude (and of opposite sign). Because the Midpoint Rule is obtained by integrating the Taylor polynomial T_1(x) of f about x = (a + b)/2 together with its remainder term, we can show that

    E_M = (b − a)^3/24 f''(c).
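The three simple rules can be sketched and checked on the example ∫_0^2 x^3 dx (function names are my own shorthand):

```python
def midpoint_rule(f, a, b):
    """Midpoint Rule: (b - a) * f((a + b)/2)."""
    return (b - a) * f((a + b) / 2)

def trapezoid(f, a, b):
    """Trapezoidal Rule: (b - a) * (f(a) + f(b)) / 2."""
    return (b - a) * (f(a) + f(b)) / 2

def simpson(f, a, b):
    """Simpson's Rule: (b - a)/6 * (f(a) + 4 f((a+b)/2) + f(b))."""
    return (b - a) / 6 * (f(a) + 4 * f((a + b) / 2) + f(b))
```

On f(x) = x^3 over [0, 2] these return 2, 8, and 4 respectively; Simpson's is exact because the degree is at most 3.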

4.3 Composite Numerical Integration

Approximating ∫_a^b f(x) dx by quadrature rules like the trapezoidal and Simpson's rules will give large error when the interval [a, b] is large. We could modify those rules by using n+1 points instead of 2 or 3 points, but then the Lagrange polynomial of degree n might give large error near the end points for large n. So we use a composite quadrature rule that breaks [a, b] into n subintervals [x_0, x_1], [x_1, x_2], ..., [x_{n−1}, x_n] (with x_i = a + ih and h = (b − a)/n) and approximates the integral by applying a quadrature rule on each subinterval and adding the results up:

    ∫_a^b f(x) dx = ∫_{x_0}^{x_1} f(x) dx + ∫_{x_1}^{x_2} f(x) dx + ... + ∫_{x_{n−1}}^{x_n} f(x) dx.

[Figure: trapezoids over the subintervals [x_0, x_1], ..., [x_{n−1}, x_n] under the curve y = f(x).]

Applying the trapezoidal rule on each subinterval [x_{i−1}, x_i], we get

    ∫_a^b f(x) dx = Σ_{i=1}^{n} ∫_{x_{i−1}}^{x_i} f(x) dx ≈ Σ_{i=1}^{n} h [f(x_{i−1}) + f(x_i)]/2
                  = h/2 [f(x_0) + 2f(x_1) + ... + 2f(x_{n−1}) + f(x_n)].

So the Composite Trapezoidal Rule is

    ∫_a^b f(x) dx ≈ h/2 [f(x_0) + 2f(x_1) + ... + 2f(x_{n−1}) + f(x_n)].

Similarly the Composite Midpoint Rule is

    ∫_a^b f(x) dx ≈ h Σ_{i=1}^{n} f((x_{i−1} + x_i)/2).

For the Composite Simpson's Rule, we take even n and apply the simple Simpson's Rule to the subintervals [x_0, x_2], [x_2, x_4], ..., [x_{n−2}, x_n]:

    ∫_a^b f(x) dx = Σ_{i=1}^{n/2} ∫_{x_{2i−2}}^{x_{2i}} f(x) dx ≈ Σ_{i=1}^{n/2} h/3 [f(x_{2i−2}) + 4f(x_{2i−1}) + f(x_{2i})]
    = h/3 [(f(x_0) + 4f(x_1) + f(x_2)) + (f(x_2) + 4f(x_3) + f(x_4)) + ... + (f(x_{n−2}) + 4f(x_{n−1}) + f(x_n))]
    = h/3 [f(x_0) + f(x_n) + 4(f(x_1) + f(x_3) + ... + f(x_{n−1})) + 2(f(x_2) + f(x_4) + ... + f(x_{n−2}))]
    = h/3 [f(x_0) + f(x_n) + 4 Σ_{i=1}^{n/2} f(x_{2i−1}) + 2 Σ_{i=1}^{(n−2)/2} f(x_{2i})].

Example. Approximate ∫_0^2 e^x dx using 4 subintervals in (a) the Composite Trapezoidal Rule, (b) the Composite Midpoint Rule, (c) the Composite Simpson's Rule.

Solution. First of all let's find the exact integral:

    ∫_0^2 e^x dx = [e^x]_0^2 = e^2 − 1 ≈ 6.389.

n = 4 ⟹ h = (2 − 0)/4 = 0.5 and the 4 subintervals are [0, 0.5], [0.5, 1], [1, 1.5], [1.5, 2].

    CTR: ∫_0^2 e^x dx ≈ 0.5/2 [e^0 + 2e^{0.5} + 2e^1 + 2e^{1.5} + e^2] ≈ 6.52
    CMR: ∫_0^2 e^x dx ≈ 0.5 [e^{0.25} + e^{0.75} + e^{1.25} + e^{1.75}] ≈ 6.32
    CSR: ∫_0^2 e^x dx ≈ 0.5/3 [e^0 + 4e^{0.5} + 2e^1 + 4e^{1.5} + e^2] ≈ 6.39
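The composite rules can be sketched as follows and checked against the example above (function names are my own; `n` is the number of subintervals):

```python
import math

def composite_trapezoid(f, a, b, n):
    """Composite Trapezoidal Rule: h/2 [f0 + 2 f1 + ... + 2 f_{n-1} + fn]."""
    h = (b - a) / n
    return h / 2 * (f(a) + f(b) + 2 * sum(f(a + i * h) for i in range(1, n)))

def composite_midpoint(f, a, b, n):
    """Composite Midpoint Rule: h * sum of f at subinterval midpoints."""
    h = (b - a) / n
    return h * sum(f(a + (i - 0.5) * h) for i in range(1, n + 1))

def composite_simpson(f, a, b, n):
    """Composite Simpson's Rule; n must be even."""
    assert n % 2 == 0, "composite Simpson needs an even number of subintervals"
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + i * h) for i in range(1, n, 2))   # odd-index nodes
    s += 2 * sum(f(a + i * h) for i in range(2, n - 1, 2))  # even interior nodes
    return h / 3 * s
```

For ∫_0^2 e^x dx with n = 4 these return approximately 6.52, 6.32, and 6.39, matching the worked values.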

The error in the composite trapezoidal rule (using n subintervals) is given by

    E_{T_n} = Σ_{i=1}^{n} [−f''(c_i) (x_i − x_{i−1})^3/12] = −h^3/12 Σ_{i=1}^{n} f''(c_i).

Assuming continuity of f'' on (a, b), by the IVT we can find c ∈ (a, b) such that

    (1/n) Σ_{i=1}^{n} f''(c_i) = f''(c).

Thus Σ_{i=1}^{n} f''(c_i) = n f''(c) and then E_{T_n} = −n h^3/12 f''(c). Note that n = (b − a)/h. Then the error for the composite trapezoidal rule becomes

    E_{T_n} = −(b − a) h^2/12 f''(c).

Similarly we get the errors for the composite midpoint and Simpson's rules:

    E_{M_n} = (b − a) h^2/24 f''(c),    E_{S_n} = −(b − a) h^4/180 f^(4)(c).

Note that since the errors are O(h^2) and O(h^4), small step sizes lead to a more accurate integral.

Example. Find the step size h and the number of subintervals n required to approximate ∫_0^2 e^x dx correct to within 10^{−2} using (a) the Composite Trapezoidal Rule, (b) the Composite Midpoint Rule, (c) the Composite Simpson's Rule.

Solution. Note f''(x) = f^(4)(x) = e^x, which has maximum absolute value e^2 on [0, 2].

    |E_{T_n}| = (b − a) h^2/12 |f''(c)| ≤ (2 − 0)/12 h^2 max_{[0,2]} |f''(x)| = (2/12)(2/n)^2 e^2 < 10^{−2}
    ⟹ n > √(200 e^2/3) ≈ 22.19.

Thus for the CTR we need n = 23 and h = (2 − 0)/23 = 2/23 ≈ 0.087. Similarly for the CMR we need n = 16 and h = (2 − 0)/16 = 0.125, and for the CSR we need n = 4 and h = (2 − 0)/4 = 0.5.

Algorithm Composite-Simpson's
Input: function f, interval [a, b], an even number n of subintervals
Output: an approximation of ∫_a^b f(x) dx

    set h = (b − a)/n;
    set I = f(a) + f(b);
    for i = 1 to n/2
        I = I + 4 · f(a + (2i − 1) · h)
    end for
    for i = 1 to (n − 2)/2
        I = I + 2 · f(a + 2i · h)
    end for
    return I · h/3

5 Differential Equations

In this chapter we numerically solve the following IVP (initial value problem):

    dy/dt = f(t, y),  a ≤ t ≤ b,  y(a) = c.   (2)

Instead of finding y = y(t) on [a, b], we break [a, b] into n subintervals [t_0, t_1], [t_1, t_2], ..., [t_{n−1}, t_n] and approximate y(t_i), i = 0, 1, ..., n. But if we need y(t), it can be approximated by the Lagrange polynomial P_n using t_0, t_1, ..., t_n.

Before approximating a solution y = y(t), we must ask whether (2) has a solution and whether it is unique on [a, b]. The answer is given by the Existence and Uniqueness Theorem:

Theorem 5.1. The IVP (2) has a unique solution y(t) on [a, b] if

1. f is continuous on D = {(t, y) | a ≤ t ≤ b, −∞ < y < ∞}, and
2. f satisfies a Lipschitz condition on D with constant L:

    |f(t, y_1) − f(t, y_2)| ≤ L |y_1 − y_2|,  for all (t, y_1), (t, y_2) ∈ D.

When we approximate y(t_i), i = 0, 1, ..., n for the unique solution y(t), we might commit some round-off errors. So we ask whether the IVP (2) is well-posed: does a small change in the problem (i.e., a small change in f, c) give a small change in the solution? It can be proved that the IVP (2) is well-posed if f satisfies a Lipschitz condition on D. Also note that if |f_y(t, y)| ≤ L on D, then f satisfies a Lipschitz condition on D with constant L.

5.1 Euler's Method

We break [a, b] into n subintervals [t_0, t_1], [t_1, t_2], ..., [t_{n−1}, t_n] where t_i = a + ih and h = (b − a)/n. Euler's Method finds y_0, y_1, ..., y_n such that y_i ≈ y(t_i), i = 0, 1, ..., n:

    y_0 = c
    y_{i+1} = y_i + h f(t_i, y_i),  i = 0, 1, ..., n − 1.

To justify the iterative formula, use Taylor's theorem on y about t = t_i:

    y(t) = y(t_i) + (t − t_i) y'(t_i) + (t − t_i)^2/2 y''(ξ_i)
    ⟹ y(t_{i+1}) = y(t_i) + (t_{i+1} − t_i) y'(t_i) + (t_{i+1} − t_i)^2/2 y''(ξ_i)
                 = y(t_i) + h f(t_i, y(t_i)) + h^2/2 y''(ξ_i)
    ⟹ y(t_{i+1}) ≈ y_i + h f(t_i, y_i) =: y_{i+1}.

Example. Use Euler's method with step size h = 0.5 to approximate the solution of the following IVP:

    dy/dt = t^2 − y,  0 ≤ t ≤ 3,  y(0) = 1.

Solution. We have h = 0.5, t_0 = 0, y_0 = 1 and f(t, y) = t^2 − y. So

    y_{i+1} = y_i + 0.5(t_i^2 − y_i) = 0.5(t_i^2 + y_i)

    t_1 = 0 + 1 · 0.5 = 0.5,  y_1 = 0.5(t_0^2 + y_0) = 0.5(0 + 1) = 0.5
    t_2 = 0 + 2 · 0.5 = 1,    y_2 = 0.5(t_1^2 + y_1) = 0.5(0.25 + 0.5) = 0.375,  etc.

i | t_i | y_i
0 | 0   | 1
1 | 0.5 | 0.5
2 | 1   | 0.375
3 | 1.5 | 0.6875
4 | 2   | 1.46875
5 | 2.5 | 2.734375
6 | 3   | 4.4921875

[Figure: the exact solution y = (t^2 − 2t + 2) − e^{−t} and the Euler approximations on [0, 3].]

Geometric Interpretation: The tangent line to the solution y = y(t) at the point (t_0, y_0) has slope dy/dt |_{(t_0, y_0)} = f(t_0, y_0). So an equation of the tangent line is

    y = y_0 + (t − t_0) f(t_0, y_0).

If t_1 is close to t_0, then y_1 = y_0 + (t_1 − t_0) f(t_0, y_0) = y_0 + h f(t_0, y_0) would be a good approximation to y(t_1). Similarly if t_2 is close to t_1, then y(t_2) ≈ y_2 = y_1 + h f(t_1, y_1).
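Euler's method is a short loop; the sketch below (name `euler` is mine) reproduces the table above for dy/dt = t^2 − y, y(0) = 1 with h = 0.5.

```python
def euler(f, a, b, y0, n):
    """Euler's method on [a, b] with n steps:
    y_{i+1} = y_i + h * f(t_i, y_i). Returns the lists of t_i and y_i."""
    h = (b - a) / n
    ts, ys = [a], [y0]
    for i in range(n):
        ys.append(ys[-1] + h * f(ts[-1], ys[-1]))
        ts.append(a + (i + 1) * h)
    return ts, ys
```

Calling `euler(lambda t, y: t**2 - y, 0, 3, 1, 6)` yields y_2 = 0.375 and y_6 = 4.4921875, as in the table.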

Maximum error: Suppose D = {(t, y) | a ≤ t ≤ b, −∞ < y < ∞} and f satisfies a Lipschitz condition on D with constant L. Suppose y(t) is the unique solution to (2) and |y''(t)| ≤ M for all t ∈ [a, b]. Then for the approximation y_i of y(t_i) by Euler's method with step size h, we have

    |y(t_i) − y_i| ≤ (hM)/(2L) [e^{L(t_i − a)} − 1],  i = 0, 1, ..., n.

Proof. Use Taylor's theorem and some inequalities. See a standard textbook.

Example. Find the maximum error in approximating y(1) by y_2 in the preceding example. Compare it with the actual absolute error using the solution y = (t^2 − 2t + 2) − e^{−t}.

Solution. f(t, y) = t^2 − y ⟹ |f_y| = |−1| = 1 = L for all y. Thus f satisfies a Lipschitz condition on D = (0, 3) × (−∞, ∞) with the constant L = 1. Now y = (t^2 − 2t + 2) − e^{−t} ⟹ y'' = 2 − e^{−t}. Since y''' = e^{−t} > 0, y'' is an increasing function, and then |y''| = 2 − e^{−t} ≤ 2 − e^{−3} ≈ 1.95 = M for all t ∈ [0, 3]. Note h = 0.5, t_2 = 1, and a = 0. Thus

    |y(1) − y_2| ≤ (hM)/(2L) [e^{L(1 − 0)} − 1] = (0.5 · 1.95/2)(e − 1) ≈ 0.83.

Using the solution y = (t^2 − 2t + 2) − e^{−t}, we get the actual absolute error

    |y(1) − y_2| = |(1 − e^{−1}) − 0.375| ≈ 0.25.

5.2 Higher-Order Taylor Methods

Recall that Euler's method was derived by approximating y(t) by its Taylor polynomial of degree 1 about t = t_i. Similarly we can approximate y(t) by its Taylor polynomial of degree k for any given positive integer k. By Taylor's theorem on y about t = t_i,

    y(t) = y(t_i) + (t − t_i) y'(t_i) + (t − t_i)^2/2! y''(t_i) + ... + (t − t_i)^k/k! y^(k)(t_i) + (t − t_i)^{k+1}/(k+1)! y^(k+1)(ξ_i)
    ⟹ y(t_{i+1}) = y(t_i) + h y'(t_i) + h^2/2! y''(t_i) + ... + h^k/k! y^(k)(t_i) + h^{k+1}/(k+1)! y^(k+1)(ξ_i).

Since y' = f(t, y), we have y'' = f'(t, y), ..., y^(k) = f^(k−1)(t, y). Thus

    y(t_{i+1}) = y(t_i) + h f(t_i, y(t_i)) + h^2/2! f'(t_i, y(t_i)) + ... + h^k/k! f^(k−1)(t_i, y(t_i)) + h^{k+1}/(k+1)! f^(k)(ξ_i, y(ξ_i))
    ≈ y_i + h [f(t_i, y_i) + h/2! f'(t_i, y_i) + ... + h^{k−1}/k! f^(k−1)(t_i, y_i)] + O(h^{k+1})
    ⟹ y(t_{i+1}) ≈ y_i + h T_k(t_i, y_i) =: y_{i+1},

where

    T_k(t_i, y_i) = f(t_i, y_i) + h/2! f'(t_i, y_i) + ... + h^{k−1}/k! f^(k−1)(t_i, y_i).

Thus the Taylor Method of order k finds y_0, y_1, ..., y_n such that y_i ≈ y(t_i), i = 0, 1, ..., n:

    y_0 = c
    y_{i+1} = y_i + h T_k(t_i, y_i),  i = 0, 1, ..., n − 1.

Example. Use the Taylor method of order 2 with step size h = 0.5 to approximate the solution of the following IVP:

    dy/dt = t^2 − y,  0 ≤ t ≤ 2,  y(0) = 1.

Solution. We have h = 0.5, t_0 = 0, y_0 = 1 and y' = f(t, y) = t^2 − y. Then

    f'(t, y) = d/dt (t^2 − y) = 2t − y' = 2t − (t^2 − y) = 2t − t^2 + y.

So by the Taylor method of order 2,

    y_{i+1} = y_i + h f(t_i, y_i) + h^2/2! f'(t_i, y_i)
            = y_i + 0.5(t_i^2 − y_i) + (0.5)^2/2 (2t_i − t_i^2 + y_i)
            = (3t_i^2 + 2t_i + 5y_i)/8

    t_1 = 0 + 1 · 0.5 = 0.5,  y_1 = (3t_0^2 + 2t_0 + 5y_0)/8 = 0.625
    t_2 = 0 + 2 · 0.5 = 1,    y_2 = (3t_1^2 + 2t_1 + 5y_1)/8 = 0.609375,  etc.

i | t_i | y_i
0 | 0   | 1
1 | 0.5 | 0.625
2 | 1   | 0.609375
3 | 1.5 | 1.0058
4 | 2   | 1.8474
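The order-2 Taylor method for this particular IVP can be sketched as follows (the name `taylor2` is mine; f and its total derivative f' are hard-coded for y' = t^2 − y as derived above):

```python
def taylor2(a, b, y0, n):
    """Order-2 Taylor method for y' = t^2 - y:
    y_{i+1} = y_i + h f(t_i, y_i) + h^2/2 f'(t_i, y_i)."""
    h = (b - a) / n
    f = lambda t, y: t ** 2 - y
    fp = lambda t, y: 2 * t - t ** 2 + y  # total derivative d/dt f(t, y(t))
    ys = [y0]
    for i in range(n):
        t = a + i * h
        ys.append(ys[-1] + h * f(t, ys[-1]) + h ** 2 / 2 * fp(t, ys[-1]))
    return ys
```

With a = 0, b = 2, y0 = 1, n = 4 this reproduces the table: y_1 = 0.625, y_2 = 0.609375, y_4 ≈ 1.8474.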