EAD 115. Numerical Solution of Engineering and Scientific Problems. David M. Rocke Department of Applied Science


Multidimensional Unconstrained Optimization Suppose we have a function f() of more than one variable f(x 1, x 2,, x n ) We want to find the values of x 1, x 2,, x n that give f() the largest (or smallest) possible value Graphical solution is not possible, but a graphical picture helps understanding Hilltops and contour maps

Methods of solution Direct or non-gradient methods do not require derivatives Grid search Random search One variable at a time Line searches and Powell s method Simplex optimization

Gradient methods use first and possibly second derivatives Gradient is the vector of first partials Hessian is the matrix of second partials Steepest ascent/descent Conjugate gradient Newton s method Quasi-Newton methods

Grid and Random Search Given a function and limits on each variable, generate a set of random points in the domain, and eventually choose the one with the largest function value Alternatively, divide the interval on each variable into small segments and check the function for all possible combinations

Features of Random and Grid Search. Slow and inefficient. Requires knowledge of the domain. Works even for discontinuous functions. Poor in high dimension. Grid search can be used iteratively, with progressively narrowing domains.

Line searches Given a starting point and a direction, search for the maximum, or for a good next point, in that direction. Equivalent to one dimensional optimization, so can use Newton s method or another method from previous chapter Different methods use different directions

x = (x_1, x_2, ..., x_n)
v = (v_1, v_2, ..., v_n)
f(x) = f(x_1, x_2, ..., x_n)
g(λ) = f(x + λv)

One-Variable-at-a Time Search Given a function f() of n variables, search in the direction in which only variable 1, changes Then search in the direction from that point in which only variable 2 changes, etc. Slow and inefficient in general Can speed up by searching in a direction after n changes (pattern direction)

Powell s Method If f() is quadratic, and if two points are found by line searches in the same direction from two different starting points, then the line joining the two ending points (a conjugate direction) heads toward the optimum Since many functions we encounter are approximately quadratic near the optimum, this can be effective

Start with a point x 0 and two random directions h 1 and h 2 Search in the direction of h 1 from x 0 to find a new point x 1 Search in the direction of h 2 from x 1 to find a new point x 2. Let h 3 be the direction joining x 0 to x 2 Search in the direction of h 3 from x 2 to find a new point x 3 Search in the direction of h 2 from x 3 to find a new point x 4 Search in the direction of h 3 from x 4 to find a new point x 5

Points x 3 and x 5 have been found by searching in the direction of h 3 from two starting points x 2 and x 4 Call the direction joining x 3 and x 5 h 4 Search in the direction of h 4 from x 5 to find a new point x 6 The new point x 6 will be exactly the optimum if f() is quadratic The iterations can then be repeated Errors estimated by change in x or in f()

Nelder-Mead Simplex Algorithm Direct search method that uses simplices, which are triangles in dimension 2, pyramids in dimension 3, etc. At each iteration a new point is added usually in the direction of the face of the simplex with largest function values


Gradient Methods The gradient of f() at a point x is the vector of partial derivatives of the function f() at x For smooth functions, the gradient is zero at an optimum, but may also be zero at a non-optimum The gradient points uphill The gradient is orthogonal to the contour lines of a function at a point

Directional Derivatives. Given a point x in R^n, a unit direction v, and a function f() of n variables, we can define a new function g() of one variable by g(λ) = f(x + λv). The derivative g'(λ) is the directional derivative of f() at x in the direction of v. This is greatest when v is in the gradient direction.

x = (x_1, x_2, ..., x_n)
v = (v_1, v_2, ..., v_n), with v^T v = Σ_i v_i^2 = 1
f(x) = f(x_1, x_2, ..., x_n)
∇f = (∂f/∂x_1, ∂f/∂x_2, ..., ∂f/∂x_n)
g(λ) = f(x + λv)
g'(0) = (∇f)^T v = (∂f/∂x_1) v_1 + (∂f/∂x_2) v_2 + ... + (∂f/∂x_n) v_n

Steepest Ascent The gradient direction is the direction of steepest ascent, but not necessarily the direction leading directly to the summit We can search along the direction of steepest ascent until a maximum is reached Then we can search again from a new steepest ascent direction

f(x_1, x_2) = x_1 x_2^2 at (2, 2): f(2, 2) = 8
∂f/∂x_1 = x_2^2, so ∂f/∂x_1 (2, 2) = 4
∂f/∂x_2 = 2 x_1 x_2, so ∂f/∂x_2 (2, 2) = 8
∇f(2, 2) = (4, 8), so (2 + 4λ, 2 + 8λ) is the gradient line
g(λ) = f(2 + 4λ, 2 + 8λ) = (2 + 4λ)(2 + 8λ)^2

The Hessian The Hessian of a function f() is the matrix of second partial derivatives The gradient is always 0 at a maximum (for smooth functions) The gradient is also 0 at a minimum The gradient is also 0 at a saddle point, which is neither a maximum nor a minimum A saddle point is a max in at least one direction and a min in at least one direction

Max, Min, and Saddle Point For one-variable functions, the second derivative is negative at a maximum and positive at a minimum For functions of more than one variable, a zero of the gradient is a max if the second directional derivative is negative for every direction and is a min if the second directional derivative is positive for every direction

Positive Definiteness A matrix H is positive definite if x T Hx > 0 for every vector x Equivalently, every eigenvalue of H is positive λ is an eigenvalue of H with eigenvector x if Hx = λx -H is positive definite if every eigenvalue of H is negative

Max, Min, and Saddle Point. If the gradient ∇f of a function f is zero at a point x and the Hessian H is positive definite at that point, then x is a local min. If ∇f is zero at a point x and -H is positive definite at that point, then x is a local max. If ∇f is zero at a point x and neither H nor -H is positive definite at that point, then x is a saddle point. The determinant |H| helps only in dimension 1 or 2.

Steepest Ascent/Descent This is the simplest of the gradient-based methods From the current guess, compute the gradient Search along the gradient direction until a local max is reached of this onedimensional function Repeat until convergence

Eigenvalues Suppose the gradient is 0 and H is the Hessian H is positive definite if and only if all the eigenvalues of H are positive (minimum). -H is positive definite if and only if all the eigenvalues of H are negative (maximum). If the eigenvalues of H are not all of the same sign then we have a saddle point.

Practical Steepest Ascent In real examples, the maximum in the gradient direction cannot be calculated analytically Problem reduces to one dimensional optimization as a line search One can also use more primitive line searches that are fast but do not try to find the absolute optimum
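A minimal Python sketch of steepest ascent with a crude grid line search. The quadratic test function (maximum at (2, 1)), the candidate step lengths, and the iteration count are illustrative choices, not from the slides.

import numpy as np

def steepest_ascent(f, grad, x0, n_iter=50):
    # repeatedly search along the current gradient direction with a grid line search
    x = np.array(x0, dtype=float)
    for _ in range(n_iter):
        g = grad(x)
        lams = np.linspace(0.0, 1.0, 201)          # candidate step lengths (illustrative)
        vals = [f(x + lam * g) for lam in lams]
        x = x + lams[int(np.argmax(vals))] * g
    return x

# hypothetical quadratic test function with maximum at (2, 1)
f = lambda x: 2*x[0]*x[1] + 2*x[0] - x[0]**2 - 2*x[1]**2
grad = lambda x: np.array([2*x[1] + 2 - 2*x[0], 2*x[0] - 4*x[1]])
print(steepest_ascent(f, grad, [-1.0, 1.0]))   # approaches [2, 1]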

Newton s Method Steepest ascent can be quite slow Newton s method is faster, though it requires evaluation of the Hessian Function is modeled by a quadratic at a point using first and second derivatives The quadratic is solved exactly This is used as the next iterate

A second-order multivariate Taylor series expansion at the current iterate is
f(x) ≈ f(x_i) + ∇f(x_i)^T (x − x_i) + 0.5 (x − x_i)^T H_i (x − x_i)
At the optimum the gradient is 0, so
∇f(x) ≈ ∇f(x_i) + H_i (x − x_i) = 0
If H_i is invertible, then
x_{i+1} = x_i − H_i^{-1} ∇f(x_i)
In practice, solve the linear problem H_i x = H_i x_i − ∇f(x_i).
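A Python sketch of this Newton iteration, assuming the gradient and Hessian are supplied as functions; the quadratic test function is the same illustrative one used above, not from the slides.

import numpy as np

def newton_optimize(grad, hess, x0, tol=1e-10, max_iter=50):
    # at each iterate, model f by a quadratic and jump to its stationary point
    x = np.array(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        x = x + np.linalg.solve(hess(x), -g)   # solve H dx = -grad instead of inverting H
    return x

# f(x1, x2) = 2*x1*x2 + 2*x1 - x1^2 - 2*x2^2 has its maximum at (2, 1)
grad = lambda x: np.array([2*x[1] + 2 - 2*x[0], 2*x[0] - 4*x[1]])
hess = lambda x: np.array([[-2.0, 2.0], [2.0, -4.0]])
print(newton_optimize(grad, hess, [-1.0, 1.0]))   # [2. 1.] in one step for a quadratic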

Curve Fitting Given a set of n points (x i, y i ), find a fitted curve that provides a fitted value y = f(x) for each value of x in a range. The curve may interpolate the points (go through each one), either linearly or nonlinearly, or may approximate the points without going through each one, as in least-squares regression.

Simple Linear Regression We have a set of n data points, each of which has a measured predictor x and a measured response y. We wish to develop a prediction function f(x) for y. In the simplest case, we take f(x) to be a linear function of x, as in f(x) = a 0 + a 1 x

Criteria and Estimation. If we have one point, say (1,1), then many lines fit perfectly: f(x) = x, f(x) = 2x - 1, f(x) = -x + 2. If there are two points, say (1,1) and (2,3), then in general there is exactly one line going through the points: f(x) = 2x - 1.

If there are more than two points, then in general there is no straight line through all of them. These problems are, respectively, underdetermined, determined, and overdetermined. Reasonable criteria for choosing the coefficients a_0 and a_1 in f(x) = a_0 + a_1 x lie in minimizing the size of the residuals r_i = y_i - f(x_i) = y_i - (a_0 + a_1 x_i), but how should different residuals be combined?

The least-squares criterion minimizes
SS = Σ_{i=1}^n r_i^2 = Σ_{i=1}^n (y_i − f(x_i))^2
There are many other possible criteria. Use of the least-squares criterion does not imply any beliefs about the data. Use of the linear form for f(x) assumes that this straight-line relationship is reasonable. Assumptions are needed for inference about the predictions or about the relationship itself.

Minimize Sum of Residuals. Minimize Sum of Absolute Values of Residuals. Minimize Max Residual.

Computing the Least-Squares Solution We wish to minimize the sum of squares of deviations from the regression line by choosing the coefficients a 0 and a 1 accordingly Since this is a continuous, quadratic function of the coefficients, one can simply set the partial derivatives equal to zero

SS(a_0, a_1) = Σ_{i=1}^n r_i^2 = Σ_{i=1}^n (y_i − f(x_i))^2 = Σ_{i=1}^n (y_i − a_0 − a_1 x_i)^2
∂SS(a_0, a_1)/∂a_0 = 0 = −2 Σ_{i=1}^n (y_i − a_0 − a_1 x_i)
∂SS(a_0, a_1)/∂a_1 = 0 = −2 Σ_{i=1}^n (y_i − a_0 − a_1 x_i) x_i
which gives the normal equations
n a_0 + a_1 Σ x_i = Σ y_i
a_0 Σ x_i + a_1 Σ x_i^2 = Σ x_i y_i

These normal equations have a unique solution as two equations in two unknowns The straight line that is calculated in this way is used in practice to see if there is a relationship between x and y It is also used to predict y from x It can also be used to predict x from y by inverting the equation We now look at some practical uses of least squares
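A short Python sketch of solving these normal equations, using the fluorescein calibration data (concentration, intensity) that appears later in these slides.

x = [0, 2, 4, 6, 8, 10, 12]
y = [2.1, 5, 9, 12.6, 17.3, 21, 24.7]
n = len(x)
sx, sy = sum(x), sum(y)
sxx = sum(xi * xi for xi in x)
sxy = sum(xi * yi for xi, yi in zip(x, y))
a1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)   # slope
a0 = (sy - a1 * sx) / n                          # intercept
print(a0, a1)   # about 1.5179 and 1.9304, matching the regression output shown later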

Quantitative Prediction Regression analysis is the statistical name for the prediction of one quantitative variable (fasting blood glucose level) from another (body mass index) Items of interest include whether there is in fact a relationship and what the expected change is in one variable when the other changes

Assumptions Inference about whether there is a real relationship or not is dependent on a number of assumptions, many of which can be checked When these assumptions are substantially incorrect, alterations in method can rescue the analysis No assumption is ever exactly correct

Linearity. This is the most important assumption. If x is the predictor and y is the response, then we assume that the average response for a given value of x is a linear function of x: E(y) = a + bx, or y = a + bx + ε, where ε is the error or variability.

In general, it is important to get the model right, and the most important of these issues is that the mean function looks like it is specified If a linear function does not fit, various types of curves can be used, but what is used should fit the data Otherwise predictions are biased

Independence It is assumed that different observations are statistically independent If this is not the case inference and prediction can be completely wrong There may appear to be a relationship even though there is not Randomization and control prevents this in general

Note there is no real relationship between x and y. These data were generated as follows:
x_1 = 1, y_1 = 0
x_{i+1} = 0.95 x_i + ε_i
y_{i+1} = 0.95 y_i + η_i

Constant Variance. Constant variance, or homoscedasticity, means that the variability is the same in all parts of the prediction function. If this is not the case, the predictions may be on the average correct, but the uncertainties associated with the predictions will be wrong. Heteroscedasticity is non-constant variance.

Consequences of Heteroscedasticity. Predictions may be unbiased (correct on the average). Prediction uncertainties are not correct; too small sometimes, too large at other times. Inferences are incorrect (is there any relationship or is it random?).

Normality of Errors Mostly this is not particularly important Very large outliers can be problematic Graphing data often helps If in a gene expression array experiment, we do 40,000 regressions, graphical analysis is not possible Significant relationships should be examined in detail

Example Analysis Standard aqueous solutions of fluorescein (in pg/ml) are examined in a fluorescence spectrometer and the intensity (arbitrary units) is recorded What is the relationship of intensity to concentration? Use later to infer concentration of labeled analyte

concentration   intensity
0               2.1
2               5
4               9
6               12.6
8               17.3
10              21
12              24.7

[Scatterplot of intensity versus concentration]

. regress intensity concentration

      Source |       SS       df       MS            Number of obs =       7
-------------+------------------------------         F(  1,     5) = 2227.53
       Model |  417.343228     1  417.343228         Prob > F      =  0.0000
    Residual |  .936784731     5  .187356946         R-squared     =  0.9978
-------------+------------------------------         Adj R-squared =  0.9973
       Total |  418.280013     6  69.7133355         Root MSE      =  .43285

------------------------------------------------------------------------------
   intensity |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
concentrat~n |   1.930357   .0409002    47.20   0.000      1.82522    2.035495
       _cons |   1.517857   .2949358     5.15   0.004     .7597003    2.276014
------------------------------------------------------------------------------

In this output, the coefficient on concentration (1.930357) is the slope; the constant (_cons = 1.517857) is the intercept, the intensity at zero concentration; the Source/SS/df/MS block is the ANOVA table; F(1, 5) = 2227.53 with Prob > F = 0.0000 is the test of the overall model; and Root MSE = 0.43285 measures the variability around the regression line.

[Scatterplot of intensity versus concentration with the fitted regression line]

[Plot of residuals versus fitted values]

Use of the calibration curve:
ŷ = 1.52 + 1.93x, where ŷ is the predicted average intensity and x is the true concentration.
x̂ = (y − 1.52)/1.93, where y is the observed intensity and x̂ is the estimated concentration.

Measurement and Calibration Essentially all things we measure are indirect The thing we wish to measure produces an observed transduced value that is related to the quantity of interest but is not itself directly the quantity of interest Calibration takes known quantities, observes the transduced values, and uses the inferred relationship to quantitate unknowns

Measurement Examples Weight is observed via deflection of a spring (calibrated) Concentration of an analyte in mass spec is observed through the electrical current integrated over a peak (possibly calibrated) Gene expression is observed via fluorescence of a spot to which the analyte has bound (usually not calibrated)

Measuring Variation If we do not use any predictor, the variability of y is its variance, or mean square difference between y and the mean of all the y s. If we use a predictor, then the variability is the mean square difference between y and its prediction

MST = (n − 1)^{-1} Σ_{i=1}^n (y_i − ȳ)^2
MSE = (n − 2)^{-1} Σ_{i=1}^n (y_i − ŷ_i)^2 = (n − 2)^{-1} Σ_{i=1}^n (y_i − a_0 − a_1 x_i)^2
MSR = (SST − SSE)/1

[Scatterplot of intensity versus concentration with the fitted regression line, repeated]

Source     SS            df    MS
Model      417.343228    1     417.343228
Residual   0.936784731   5     0.187356946
Total      418.280013    6     69.7133355

Multiple Regression If we have more than one predictor, we can still fit the least-squares equations so long as we don t have more coefficients than data points This involves solving the normal equations as a matrix equation

y = xB + ε (a single observation), Y = XB + E (all observations)
n = number of data points, p = number of predictors including the constant
y is 1×1, x is 1×p, B is p×1, ε is 1×1
Y is n×1, X is n×p, E is n×1

Y = XB + E
n = number of data points, p = number of predictors including the constant
B is p×1, Y is n×1, X is n×p, E is n×1
(Y − XB̂) is n×1
(Y − XB̂)′(Y − XB̂) is 1×1, the SSE

(Y − XB̂)′(Y − XB̂) is 1×1, the SSE. To minimize this over choices of B, solve (X′X)B̂ = X′Y, so B̂ = (X′X)^{-1} X′Y.
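A numpy sketch of this matrix solution; the simulated data set (a constant column plus two predictors) is hypothetical.

import numpy as np

# fit Y = XB + E by solving the normal equations (X'X) B = X'Y
rng = np.random.default_rng(0)
n = 50
x1, x2 = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])                      # constant plus two predictors
Y = 1.5 + 2.0 * x1 - 0.7 * x2 + rng.normal(scale=0.1, size=n)
B_hat = np.linalg.solve(X.T @ X, X.T @ Y)
print(B_hat)   # roughly [1.5, 2.0, -0.7]
# np.linalg.lstsq(X, Y, rcond=None) is numerically preferable to forming X'X explicitly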

Linearization of Nonlinear Relationships We can fit a curved relationship with a polynomial The relationship f(x) = a 0 + a 1 x + a 2 x 2 can be treated as a problem with two predictors This can then be dealt with as any multiple regression problem

Sometimes a nonlinear relationship can be linearized by a transformation of the response and the predictor Often this involves logarithms, but there are many possibilities

y = α e^{βx}  ⟹  ln y = ln α + βx
y = d + (a − d)/(1 + (x/c)^b)  ⟹  (a − y)/(y − d) = (x/c)^b  ⟹  ln[(a − y)/(y − d)] = b ln x − b ln c

Intrinsic nonlinearity We can still solve the least-squares problem even if f(x) is not linear in the parameters We do this by approximate linearization at each step = Gauss-Newton There are other, more effective methods, but this is beyond our scope

y_i = f(x_i; a_0, a_1, ..., a_n) + ε_i
Linearizing about the current estimates a_{0,j}, a_{1,j}, ..., a_{n,j}:
y_i ≈ f(x_i; a_{0,j}, ..., a_{n,j}) + (∂f/∂a_0)(x_i)(a_0 − a_{0,j}) + (∂f/∂a_1)(x_i)(a_1 − a_{1,j}) + ... + ε_i
Example: y = f(x) + ε = a_0 (1 − e^{−a_1 x}) + ε
∂f/∂a_0 = 1 − e^{−a_1 x},  ∂f/∂a_1 = a_0 x e^{−a_1 x}
y_i − f(x_i) ≈ (∂f/∂a_0)(x_i) Δa_0 + (∂f/∂a_1)(x_i) Δa_1
Solve for Δa_0 and Δa_1 by linear least squares. Repeat until convergence.
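A Python sketch of Gauss-Newton for the model y = a_0(1 − e^{−a_1 x}); the data values and starting guesses are invented for illustration.

import numpy as np

def gauss_newton(x, y, a0, a1, n_iter=20):
    # linearize y = a0*(1 - exp(-a1*x)) about the current estimates and solve a
    # linear least-squares problem for the parameter increments at each step
    for _ in range(n_iter):
        pred = a0 * (1 - np.exp(-a1 * x))
        J = np.column_stack([1 - np.exp(-a1 * x),         # df/da0
                             a0 * x * np.exp(-a1 * x)])   # df/da1
        delta, *_ = np.linalg.lstsq(J, y - pred, rcond=None)
        a0, a1 = a0 + delta[0], a1 + delta[1]
    return a0, a1

# hypothetical data generated from a0 = 2, a1 = 0.5 with no noise
x = np.array([0.5, 1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * (1 - np.exp(-0.5 * x))
print(gauss_newton(x, y, a0=1.0, a1=1.0))   # converges to about (2.0, 0.5)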

Interpolation Given a set of points (x i, y i ), an interpolating function is one which is defined for all x in the range of the x i, and which satisfies f(x i ) = y i. Polynomials are a convenient class of functions to use for this purpose, though others such as splines are also used. There are different ways to express the same polynomial. Given n points, we can in general determine an n-1 degree polynomial that interpolates them.

Linear function: two points, degree one. Quadratic function: three points, degree two. Cubic function: four points, degree three.

Linear Interpolation

(f_1(x) − f(x_0))/(x − x_0) = (f(x_1) − f(x_0))/(x_1 − x_0)
f_1(x) = f(x_0) + [(f(x_1) − f(x_0))/(x_1 − x_0)] (x − x_0)
Example: ln(1) = 0, ln(4) = 1.386294
f_1(x) = 0 + [(1.386294 − 0)/(4 − 1)] (x − 1)
f_1(2) = (1.386294/3)(2 − 1) = 0.4620981, while ln(2) = 0.6931472

Quadratic Interpolation Three points determine a quadratic This should fit many functions better than linear interpolation We derive a general form for quadratic interpolation We then derive a method to estimate the three unknowns (coefficients) that determine a quadratic function

f_2(x) = b_0 + b_1(x − x_0) + b_2(x − x_0)(x − x_1)
       = b_0 + b_1 x − b_1 x_0 + b_2 x^2 + b_2 x_0 x_1 − b_2 x x_0 − b_2 x x_1
which is of the form a_0 + a_1 x + a_2 x^2 with
a_0 = b_0 − b_1 x_0 + b_2 x_0 x_1
a_1 = b_1 − b_2 x_0 − b_2 x_1
a_2 = b_2
which shows that either form is general.

f_2(x) = b_0 + b_1(x − x_0) + b_2(x − x_0)(x − x_1)
f(x_0) = b_0
f(x_1) = b_0 + b_1(x_1 − x_0) = f(x_0) + b_1(x_1 − x_0)
b_1 = (f(x_1) − f(x_0))/(x_1 − x_0)

f_2(x) = b_0 + b_1(x − x_0) + b_2(x − x_0)(x − x_1), with b_0 = f(x_0) and b_1 = (f(x_1) − f(x_0))/(x_1 − x_0)
f(x_2) = f(x_0) + [(f(x_1) − f(x_0))/(x_1 − x_0)](x_2 − x_0) + b_2(x_2 − x_0)(x_2 − x_1)
Solving for b_2:
b_2 = (f(x_2) − f(x_0))/((x_2 − x_0)(x_2 − x_1)) − (f(x_1) − f(x_0))/((x_1 − x_0)(x_2 − x_1))

b_2 = (f(x_2) − f(x_0))/((x_2 − x_0)(x_2 − x_1)) − (f(x_1) − f(x_0))/((x_1 − x_0)(x_2 − x_1))
Writing f(x_2) − f(x_0) = [f(x_2) − f(x_1)] + [f(x_1) − f(x_0)] and collecting the f(x_1) − f(x_0) terms:
b_2 = (f(x_2) − f(x_1))/((x_2 − x_0)(x_2 − x_1)) + (f(x_1) − f(x_0))[(x_1 − x_0) − (x_2 − x_0)]/((x_2 − x_0)(x_2 − x_1)(x_1 − x_0))
    = (f(x_2) − f(x_1))/((x_2 − x_0)(x_2 − x_1)) − (f(x_1) − f(x_0))/((x_2 − x_0)(x_1 − x_0))
    = [ (f(x_2) − f(x_1))/(x_2 − x_1) − (f(x_1) − f(x_0))/(x_1 − x_0) ] / (x_2 − x_0)

f_2(x) = b_0 + b_1(x − x_0) + b_2(x − x_0)(x − x_1)
b_0 = f(x_0)
b_1 = (f(x_1) − f(x_0))/(x_1 − x_0), which looks like a finite first divided difference
b_2 = [ (f(x_2) − f(x_1))/(x_2 − x_1) − (f(x_1) − f(x_0))/(x_1 − x_0) ] / (x_2 − x_0), which looks like a finite second divided difference

Approximate ln(2) = 0.6931472 by interpolating (1, 0), (4, 1.386294), (6, 1.791759):
b_0 = f(x_0) = 0
b_1 = (f(x_1) − f(x_0))/(x_1 − x_0) = (1.386294 − 0)/3 = 0.4620981
b_2 = [ (f(x_2) − f(x_1))/(x_2 − x_1) − (f(x_1) − f(x_0))/(x_1 − x_0) ] / (x_2 − x_0)
    = [ (1.791759 − 1.386294)/2 − 0.4620981 ] / 5 = −0.0518731
f_2(x) = b_0 + b_1(x − x_0) + b_2(x − x_0)(x − x_1)
f_2(2) = 0 + 0.4620981(2 − 1) − 0.0518731(2 − 1)(2 − 4) = 0.5658444

General Form of Newton s Divided Difference Interpolating Polynomials The order n polynomial interpolates n+1 points The coefficients are finite divided differences They can be calculated recursively

f_n(x) = b_0 + b_1(x − x_0) + b_2(x − x_0)(x − x_1) + ... + b_n(x − x_0)(x − x_1)···(x − x_{n−1})
b_0 = f(x_0)
b_1 = f[x_1, x_0]
b_2 = f[x_2, x_1, x_0]
...
b_n = f[x_n, x_{n−1}, ..., x_1, x_0]
f[x_i, x_{i−1}, ..., x_1, x_0] = ( f[x_i, x_{i−1}, ..., x_1] − f[x_{i−1}, x_{i−2}, ..., x_0] ) / (x_i − x_0)

x_i    f(x_i)      1st dd      2nd dd      3rd dd      4th dd
1      0           0.81093     -0.20007    0.051307    -0.01095
1.5    0.405465    0.510826    -0.09746    0.018459
2.5    0.916291    0.364643    -0.05131
3      1.098612    0.287682
4      1.386294
Evaluating at x = 2, the products (x − x_0), (x − x_0)(x − x_1), ... are 1, 0.5, −0.25, 0.25, and the accumulated sum is 0 + 0.81093 − 0.10003 − 0.01283 − 0.00274 = 0.695331; the true value is ln(2) = 0.693147, so the error is −0.00218.
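A Python sketch that rebuilds this divided-difference table and evaluates the Newton form at x = 2; the function names are ad hoc.

import numpy as np

def newton_divided_differences(x, y):
    # build the coefficients b_0, b_1, ..., b_n in place
    x = np.asarray(x, dtype=float)
    b = np.array(y, dtype=float)
    n = len(x)
    for j in range(1, n):
        b[j:] = (b[j:] - b[j-1:-1]) / (x[j:] - x[:n-j])
    return b

def newton_eval(x_nodes, b, t):
    # evaluate b_0 + b_1 (t - x_0) + b_2 (t - x_0)(t - x_1) + ...
    result, prod = b[0], 1.0
    for k in range(1, len(b)):
        prod *= t - x_nodes[k - 1]
        result += b[k] * prod
    return result

x = [1, 1.5, 2.5, 3, 4]
y = [0, 0.405465, 0.916291, 1.098612, 1.386294]   # ln(x) at the nodes
b = newton_divided_differences(x, y)
print(b)                      # about [0, 0.81093, -0.20007, 0.051307, -0.01095]
print(newton_eval(x, b, 2))   # about 0.69533; ln(2) = 0.693147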

Lagrange Interpolating Polynomial Given n+1 points and function values, there is only one degree-n polynomial going through the points The Lagrange formulation is thus equivalent, leading to the same interpolating polynomial It is easier to calculate

f_n(x) = Σ_{i=0}^n L_i(x) f(x_i)
L_i(x) = Π_{j=0, j≠i}^n (x − x_j)/(x_i − x_j)
This passes through each of the points because when x = x_k, all of the L_i(x) are 0 except for L_k(x), which is equal to 1.
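A small Python sketch of the Lagrange form, repeating the quadratic ln(2) example from the earlier slide.

def lagrange_interp(x_nodes, y_nodes, t):
    # f_n(t) = sum_i L_i(t) y_i with L_i(t) = prod_{j != i} (t - x_j)/(x_i - x_j)
    total = 0.0
    for i in range(len(x_nodes)):
        L = 1.0
        for j in range(len(x_nodes)):
            if j != i:
                L *= (t - x_nodes[j]) / (x_nodes[i] - x_nodes[j])
        total += L * y_nodes[i]
    return total

print(lagrange_interp([1, 4, 6], [0, 1.386294, 1.791759], 2))   # about 0.56584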

Numerical Integration Some functions of known form can be integrated analytically Others require numerical estimates because the form of the integrand yields no closed form solution Sometimes the function may not even be defined by an equation, but rather by a computer program

∫_1^2 x^3 dx = x^4/4 |_1^2 = 16/4 − 1/4 = 15/4
∫_0^{π/2} sin(x) dx = −cos(x) |_0^{π/2} = −cos(π/2) + cos(0) = 0 + 1 = 1
∫_1^4 e^{−x^2} dx = ?

The Definite Integral. Left and right Riemann sums and the midpoint rule give a definition, not a good computational method. They are exact only for constant functions (left and right) or linear functions (midpoint).
∫_a^b f(x) dx = lim_{n→∞} Σ_{i=0}^{n−1} f(a + i(b − a)/n) (b − a)/n          (left)
∫_a^b f(x) dx = lim_{n→∞} Σ_{i=1}^{n} f(a + i(b − a)/n) (b − a)/n            (right)
∫_a^b f(x) dx = lim_{n→∞} Σ_{i=0}^{n−1} f(a + i(b − a)/n + (b − a)/(2n)) (b − a)/n   (midpoint)

Example: f(x) = exp(−x^2), left Riemann sum, integrated from 0 to 2. Exact value is 0.882.
N     Sum
4     1.126
10    0.980
20    0.931
50    0.902
100   0.891

Trapezoidal Rule Simple Riemann sum approximates the function over each interval by a constant function We can use linear, quadratic, etc. instead for more accuracy Using a linear approximation over each interval results in the trapezoidal rule

Linear and Quadratic Approximations

Linear Approximations over Short Intervals

Closed and Open Rules

Trapezoidal Rule for an Interval. Using the two points (a, f(a)) and (b, f(b)):
f_1(x) = f(a) + [(f(b) − f(a))/(b − a)](x − a)
∫_a^b f_1(x) dx = [ f(a) x + (f(b) − f(a))(x − a)^2 / (2(b − a)) ]_a^b
= f(a)(b − a) + (f(b) − f(a))(b − a)/2
= (b − a)(f(b) + f(a))/2

Trapezoidal Rule for a Subdivided Interval Divide the interval [a, b] into n equal segments, each of width (b-a)/n Apply the trapezoidal rule to each segment Add up all the results This is much more accurate than the simple Riemann sum

h = (b − a)/n, x_i = a + ih, i = 0, 1, 2, ..., n, f_i = f(x_i)
0.5h(f_0 + f_1) + 0.5h(f_1 + f_2) + ... + 0.5h(f_{n−2} + f_{n−1}) + 0.5h(f_{n−1} + f_n)
= 0.5h [ f_0 + 2 Σ_{i=1}^{n−1} f_i + f_n ]
= (b − a) [ f_0 + 2 Σ_{i=1}^{n−1} f_i + f_n ] / (2n)
= (width)(average height)
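A Python sketch of the composite trapezoidal rule, checked against the slide's exp(−x^2) example.

import math

def trapezoid(f, a, b, n):
    # (b - a) * (f_0 + 2*sum(interior f_i) + f_n) / (2n)
    h = (b - a) / n
    s = f(a) + f(b) + 2 * sum(f(a + i * h) for i in range(1, n))
    return h * s / 2

f = lambda x: math.exp(-x * x)
print(trapezoid(f, 0, 2, 10))   # 0.8818388, as in the table below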

Example: f(x) = exp(−x^2), trapezoidal rule, integrated from 0 to 2. Exact value is 0.8820814.
N     Sum
4     0.8806186
10    0.8818388
20    0.8820204
50    0.8820716
100   0.8820789

Simpson's Rules

Simpson's Rules. Simpson's rules generalize the trapezoidal rule to use more than two points per interval, so we can use quadratic or cubic models instead of linear ones. We will mainly cover the quadratic model, or Simpson's 1/3 rule.

Quadratic Interpolation For a single interval, we will derive Simpson s 1/3 rule We will need to find the quadratic equation that goes through three points (x 1, f(x 1 )), (x 2, f(x 2 )), (x 3, f(x 3 )) We will then integrate the quadratic to obtain the estimate of the integral This also integrates cubics exactly

f_0 = f(x_0), f_1 = f(x_1), f_2 = f(x_2), h = x_2 − x_1 = x_1 − x_0
f(x) ≈ f_0 (x − x_1)(x − x_2)/[(x_0 − x_1)(x_0 − x_2)] + f_1 (x − x_0)(x − x_2)/[(x_1 − x_0)(x_1 − x_2)] + f_2 (x − x_0)(x − x_1)/[(x_2 − x_0)(x_2 − x_1)]
Substituting y = x − x_0 (so y runs from 0 to 2h) and integrating each basis polynomial:
∫_0^{2h} (y − h)(y − 2h) dy = [ y^3/3 − (3/2)h y^2 + 2h^2 y ]_0^{2h} = 8h^3/3 − 6h^3 + 4h^3 = (2/3)h^3
∫_0^{2h} y(y − 2h) dy = [ y^3/3 − h y^2 ]_0^{2h} = 8h^3/3 − 4h^3 = −(4/3)h^3
∫_0^{2h} y(y − h) dy = [ y^3/3 − (1/2)h y^2 ]_0^{2h} = 8h^3/3 − 2h^3 = (2/3)h^3
Dividing by the denominators 2h^2, −h^2, and 2h^2 and collecting terms:
∫_{x_0}^{x_2} f(x) dx ≈ (h/3)(f_0 + 4 f_1 + f_2) = 2h (f_0 + 4 f_1 + f_2)/6 = (width)(average height)

Simpson s 1/3 Rule for a Subdivided Interval Divide the interval [a, b] into n equal segments, each of width (b-a)/n Apply the Simpson s 1/3 rule to each pair of segments Add up all the results This is more accurate than the trapezoidal rule

(h/3)[(f_0 + 4f_1 + f_2) + (f_2 + 4f_3 + f_4) + ... + (f_{n−4} + 4f_{n−3} + f_{n−2}) + (f_{n−2} + 4f_{n−1} + f_n)]
= (h/3)[f_0 + 4f_1 + 2f_2 + 4f_3 + 2f_4 + ... + 2f_{n−4} + 4f_{n−3} + 2f_{n−2} + 4f_{n−1} + f_n], where n is even
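A Python sketch of the composite Simpson's 1/3 rule, again checked against the exp(−x^2) example.

import math

def simpson(f, a, b, n):
    # composite Simpson's 1/3 rule; n must be even
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + i * h) for i in range(1, n, 2))   # odd-indexed points
    s += 2 * sum(f(a + i * h) for i in range(2, n, 2))   # even-indexed interior points
    return h * s / 3

f = lambda x: math.exp(-x * x)
print(simpson(f, 0, 2, 10))   # 0.8820749, as in the table below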

Example: f(x) = exp(−x^2), Simpson's 1/3 rule, integrated from 0 to 2. Exact value is 0.8820814.
N     Sum
4     0.8818124
10    0.8820749
20    0.8820810
50    0.8820814
100   0.8820814

Simpson s 3/8 Rule Uses four points to fit a cubic polynomial Is not theoretically more accurate than the 1/3 rule, but can use an odd number of segments We can combine this with Simpson s 1/3 rule if the number of segments is odd With 15 intervals (16 points), this is 6 Simpson s 1/3 rule plus 1 of Simpson s 3/8 rule

(3h/8)(f_0 + 3f_1 + 3f_2 + f_3) = (b − a)(f_0 + 3f_1 + 3f_2 + f_3)/8 = (width)(average height)

Theoretical Errors of Newton-Cotes Methods. Left and right Riemann integral formulas have errors of O(h). In the case of a linear function y = c + dx integrated over the interval [a, b], each approximating rectangle is missing a triangular portion whose base is h and whose height is dh, and there are n such triangles (h is the length of the interval divided by n), so the total error is n d h^2/2 = d(b − a)h/2, which is proportional to h.

Improving Left and Right Riemann Sums We can eliminate these triangles in two ways We can use a central Riemann sum that uses points in the middle of the intervals (open rule). This fits straight lines exactly We can use the trapezoidal rule, which also fits straight lines exactly Both these have O(h 2 ) errors

Error in Simpson's Rule. The error in Simpson's 1/3 rule is O(h^4). Compare this to left and right Riemann sums with errors of O(h), and the central Riemann sum and trapezoidal rule with errors of O(h^2). This means that in general Simpson's rule is more accurate at a given value of n. It also gives information about how the errors change with n.

Absolute Errors of Three Integration Methods: f(x) = exp(−x^2), integrated from 0 to 2, exact value 0.8820814.
N     Riemann (left)   Trapezoid   Simpson
4     2×10^-1          1×10^-3     2×10^-4
10    1×10^-1          2×10^-4     6×10^-6
20    5×10^-2          6×10^-5     4×10^-7
50    2×10^-2          1×10^-5     1×10^-8
100   1×10^-2          2×10^-6     6×10^-10

Is the function available? The Newton-Cotes rules we have been looking at need a vector of function values The programs seen previously do not explicitly call a function; rather use a provided grid of values These methods can also be used in the form where a function is called In the case that any value can be called, other methods are available

Fixed Interval vs. Functional Integration The Newton-Cotes methods we have been describing all begin with a set of equally spaced function values. Sometimes this is all that is available, but we may be able to do better with some variation in the x s.

Richardson Extrapolation Given two estimates of an integral with known error properties, it is possible to derive a third estimate that is more accurate We will illustrate this with the trapezoidal rule, though the idea applies to any integration method with an error estimate

∫_a^b f(x) dx = I = I(h) + E(h)
For the subdivided-interval trapezoidal rule, E(h) = O(h^2):
E(h) = −[(b − a)/12] h^2 f''(ξ) for some ξ in [a, b]
I = I(h_1) + E(h_1) = I(h_2) + E(h_2)
E(h_1) ≈ −[(b − a)/12] h_1^2 f''(ξ_1) and E(h_2) ≈ −[(b − a)/12] h_2^2 f''(ξ_2), so E(h_1)/E(h_2) ≈ (h_1/h_2)^2
I(h_1) + (h_1/h_2)^2 E(h_2) ≈ I(h_2) + E(h_2)
E(h_2) ≈ [I(h_2) − I(h_1)] / [(h_1/h_2)^2 − 1]
I = I(h_2) + E(h_2) ≈ I(h_2) + [I(h_2) − I(h_1)] / [(h_1/h_2)^2 − 1], which has error O(h^4)

For the special case where h_2 = h_1/2:
I = I(h_2) + E(h_2) ≈ I(h_2) + [I(h_2) − I(h_1)]/[(h_1/h_2)^2 − 1] = I(h_2) + [I(h_2) − I(h_1)]/3 = (4/3) I(h_2) − (1/3) I(h_1)
Example: ∫_0^2 e^{−x^2} dx = 0.8820814
I(0.2) = 0.8818388 (n = 10)
I(0.1) = 0.8820204 (n = 20)
(4/3) I(0.1) − (1/3) I(0.2) = 0.8820810, comparable to Simpson's rule with n = 20

Repeated Richardson Extrapolation With two separate O(h 2 ) estimates, we can combine them to make an O(h 4 ) estimate With two separate O(h 4 ) estimates, we can combine them to make an O(h 6 ) estimate, etc. The weights will be different for these repeated extrapolations

Starting from trapezoid estimates I_10, I_20, I_40 (with 10, 20, and 40 intervals):
I_{20/10} = (4/3) I_20 − (1/3) I_10
I_{40/20} = (4/3) I_40 − (1/3) I_20
I_{40/20/10} = (16/15) I_{40/20} − (1/15) I_{20/10} = (64/45) I_40 − (20/45) I_20 + (1/45) I_10

Errors for Richardson Extrapolation from Trapezoidal Rule Estimates
n     Trapezoid   R1        R2
10    2×10^-4
20    6×10^-5     4×10^-7
40    2×10^-5     3×10^-8   5×10^-11

Romberg Integration Let I j,k represent an array of estimates of integrals k = 1 represents trapezoid rules O(h 2 ) k = 2 represents Richardson extrapolation from pairs of trapezoid rules O(h 4 ) k = 3 represents Richardson extrapolation from pairs of the previous step at O(h 6 ), etc.

If we double the number of points (halve the interval) at each step, then we only need to evaluate the function at the new points. For example, if the first step uses four intervals, it involves evaluation at five points; the second step uses eight intervals, evaluated at nine points, only four of which are new.
I_{j,k} = [ 4^{k−1} I_{j+1,k−1} − I_{j,k−1} ] / (4^{k−1} − 1)

Romberg starting with 2 intervals = 3 points:
0.8770373   0.8818124   0.8820824   0.8820814   0.8820814
0.8806186   0.8820655   0.8820814   0.8820814
0.8817038   0.8820804   0.8820814
0.8819862   0.8820813
0.8820576
True value is 0.8820814; this requires 17 function evaluations to achieve 7-digit accuracy. Simpson's rule requires 36 function evaluations, and the trapezoid rule requires 775!
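A Python sketch of Romberg integration producing the same triangular array (stored here with the trapezoid estimates in column 0); for simplicity it recomputes each trapezoid estimate rather than reusing previous function evaluations.

import math

def romberg(f, a, b, levels=5):
    # I[j][0] is the trapezoid estimate with 2^(j+1) intervals;
    # I[j][k] is the k-th Richardson extrapolation
    I = [[0.0] * levels for _ in range(levels)]
    n = 2                     # start with 2 intervals = 3 points
    for j in range(levels):
        h = (b - a) / n
        I[j][0] = h * (f(a) + f(b) + 2 * sum(f(a + i * h) for i in range(1, n))) / 2
        for k in range(1, j + 1):
            I[j][k] = (4**k * I[j][k - 1] - I[j - 1][k - 1]) / (4**k - 1)
        n *= 2
    return I

f = lambda x: math.exp(-x * x)
print(romberg(f, 0, 2)[4][4])   # about 0.8820814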

Exact Integration The trapezoidal rule integrates a linear function exactly using two points Simpson s 1/3 rule integrates a quadratic (and cubics also) exactly using three points It is possible to take n+1 evenly spaced points and choose the weights so that the rule integrates polynomials for degree n exactly (e.g., Simpson s 3/8 rule)

Gaussian Integration Consider a function f() on a closed interval [a, b] We assume f() is continuous We wish to choose n points in [a, b] and weights, so that the weighted sum of the function values at the n points is optimal Can be chosen to integrate polynomials of degree 2n-1 exactly

Two interior points can integrate more exactly than two end points

Two integrals that should be integrated exactly by the trapezoid rule Method of undetermined coefficients

∫_a^b f(x) dx ≈ (b − a)[f(a) + f(b)]/2 = c_0 f(a) + c_1 f(b)   (Trapezoid Rule)
f(x) = 1 and f(x) = x should be integrated exactly:
∫_a^b 1 dx = b − a = c_0 + c_1
∫_a^b x dx = (b^2 − a^2)/2 = c_0 a + c_1 b = c_0 a + (b − a − c_0) b
so c_0 (a − b) = (b^2 − a^2)/2 − b(b − a) = −(b − a)^2/2, giving c_0 = (b − a)/2 and c_1 = (b − a)/2

∫_a^b f(x) dx ≈ c_0 f_0 + c_1 f_1 + c_2 f_2, with nodes a, (a + b)/2, b
∫_a^b 1 dx = b − a = c_0 + c_1 + c_2
∫_a^b x dx = (b^2 − a^2)/2 = c_0 a + c_1 (a + b)/2 + c_2 b
∫_a^b x^2 dx = (b^3 − a^3)/3 = c_0 a^2 + c_1 (a + b)^2/4 + c_2 b^2
Solving, c_0 = c_2 = (b − a)/6 and c_1 = 4(b − a)/6, which is Simpson's rule

Gauss-Legendre Find n points in [-1, 1] and n weights so that the sum of the weighted function values at the chosen points integrates as high a degree polynomial as possible n points and n weights means 2n coefficients, which is the number in polynomials of degree 2n 1 We find the two-point Gauss-Legendre points and weights for [-1, 1]; other intervals follow by substitution

∫_{−1}^{1} f(x) dx ≈ c_0 f(x_0) + c_1 f(x_1)
∫_{−1}^{1} 1 dx = 2 = c_0 + c_1
∫_{−1}^{1} x dx = 0 = c_0 x_0 + c_1 x_1
∫_{−1}^{1} x^2 dx = 2/3 = c_0 x_0^2 + c_1 x_1^2
∫_{−1}^{1} x^3 dx = 0 = c_0 x_0^3 + c_1 x_1^3
Solution: c_0 = c_1 = 1, x_0 = −1/√3, x_1 = 1/√3
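A Python sketch of this two-point rule, mapped from [−1, 1] to a general interval [a, b]; the cubic test integral is illustrative.

import math

def gauss_legendre_2pt(f, a, b):
    # nodes +/- 1/sqrt(3) and weights 1 on [-1, 1], mapped by x = (a+b)/2 + (b-a)/2 * t
    mid, half = (a + b) / 2, (b - a) / 2
    t = 1 / math.sqrt(3)
    return half * (f(mid - half * t) + f(mid + half * t))

print(gauss_legendre_2pt(lambda x: x**3, 0, 2))   # 4.0, exact for cubics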

Gaussian Quadrature Gauss Legendre is highly accurate with a small number of points Suitable for continuous functions on closed intervals Gaussian quadrature also comes in other forms: Laguerre, Hermite, Chebychev, etc. for functions with infinite limits of integration, or which are not finite in the interval

With n points, Gauss-Laguerre exactly integrates functions that are multiples of w(x) = e^{−x} by polynomials of degree 2n − 1. w(x) is called the weight function. The weight function for Gauss-Legendre is w(x) = 1.

w(x) = (1 − x^2)^{−1/2}   Chebyshev, first kind
w(x) = (1 − x^2)^{1/2}    Chebyshev, second kind
w(x) = e^{−x}             Laguerre
w(x) = x^α e^{−x}         Generalized Laguerre
w(x) = e^{−x^2}           Hermite

Numerical Differentiation Previously we learned the forward, backward, and centered difference methods for numerical differentiation These use the first-order Taylor-series expansion These can be made more accurate by using higher order Taylor series expansions

First-Order Forward Difference
f(x + h) = f(x) + f'(x) h + [f''(x_0)/2] h^2 + O(h^3)
f'(x) h = f(x + h) − f(x) − [f''(x_0)/2] h^2 + O(h^3)
f'(x) = [f(x + h) − f(x)]/h − [f''(x_0)/2] h + O(h^2)
f'(x) = [f(x + h) − f(x)]/h + O(h)

First-Order Second Forward Difference
x_{j+1} − x_j = h; points x_0, x_1, ..., x_i, x_{i+1}, x_{i+2}, ...
f(x) = f(x_i) + f'(x_i) h + [f''(x_i)/2] h^2 + O(h^3)
f(x_{i+2}) − 2 f(x_{i+1}) + f(x_i) = [f(x_i) + 2h f'(x_i) + 2h^2 f''(x_i)] − 2[f(x_i) + h f'(x_i) + (h^2/2) f''(x_i)] + f(x_i) + O(h^3) = f''(x_i) h^2 + O(h^3)
f''(x_i) = [f(x_{i+2}) − 2 f(x_{i+1}) + f(x_i)]/h^2 + O(h)

Second-Order Forward Difference
f'(x_i) = [f(x_{i+1}) − f(x_i)]/h − [f''(x_i)/2] h + O(h^2)
f''(x_i) = [f(x_{i+2}) − 2 f(x_{i+1}) + f(x_i)]/h^2 + O(h)
f'(x_i) = [f(x_{i+1}) − f(x_i)]/h − [f(x_{i+2}) − 2 f(x_{i+1}) + f(x_i)]/(2h) + O(h^2)
       = [−f(x_{i+2}) + 4 f(x_{i+1}) − 3 f(x_i)]/(2h) + O(h^2)

f(x) = e^{−x^2}, x = 2, h = 0.2
f(2.0) = 0.0183156389, f(2.2) = 0.0079070541, f(2.4) = 0.0031511116
[f(2.2) − f(2.0)]/0.2 = −0.0520429
[−f(2.4) + 4 f(2.2) − 3 f(2.0)]/0.4 = −0.0661745
f'(x) = −2x e^{−x^2}, so f'(2.0) = −0.07326
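A few lines of Python that reproduce these forward-difference estimates.

import math

f = lambda x: math.exp(-x * x)
x, h = 2.0, 0.2
d1 = (f(x + h) - f(x)) / h                          # first-order forward difference, O(h)
d2 = (-f(x + 2*h) + 4*f(x + h) - 3*f(x)) / (2*h)    # second-order forward difference, O(h^2)
true = -2 * x * math.exp(-x * x)
print(d1, d2, true)   # about -0.05204, -0.06617, -0.07326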

Numerical error as a function of step size and method
h        E1 O(h)    E2 O(h^2)
0.2      0.021220   0.007088
0.1      0.011658   0.002096
0.05     0.006112   0.000567
0.025    0.003130   0.000147
0.0125   0.001584   0.000037

Factors affecting approximation accuracy First or second order method Forward or centered difference Step size All these affect the accuracy of the method

h        Forward 1 O(h)   Forward 2 O(h^2)   Center 1 O(h^2)   Center 2 O(h^4)
0.2      2.12E-02         7.09E-03           4.88E-03          2.96E-05
0.1      1.17E-02         2.10E-03           1.22E-03          1.20E-06
0.05     6.11E-03         5.67E-04           3.05E-04          6.46E-08
0.025    3.13E-03         1.47E-04           7.63E-05          3.87E-09
0.0125   1.58E-03         3.75E-05           1.91E-05          2.39E-10

Richardson Extrapolation Just as with numerical integration, estimates with different errors can be combined to reduce the error Can be applied iteratively to further reduce the error as in Romberg integration

D = D(h) + E(h), with E(h) = O(h^k)
D = D(h_1) + E(h_1) = D(h_2) + E(h_2)
E(h_1) ≈ K h_1^k and E(h_2) ≈ K h_2^k, so E(h_1)/E(h_2) ≈ (h_1/h_2)^k
D(h_1) + (h_1/h_2)^k E(h_2) ≈ D(h_2) + E(h_2)
E(h_2) ≈ [D(h_2) − D(h_1)] / [(h_1/h_2)^k − 1]
D = D(h_2) + E(h_2) ≈ D(h_2) + [D(h_2) − D(h_1)] / [(h_1/h_2)^k − 1], which has error O(h^{k+2})

For h_2 = h_1/2, (h_1/h_2)^k = 2^k and
D ≈ D(h_2) + [D(h_2) − D(h_1)] / (2^k − 1)
For k = 2 this is D ≈ D(h_2) + [D(h_2) − D(h_1)]/3 = [4 D(h_2) − D(h_1)]/3

Ordinary Differential Equations ODE: solve for functions of one variable Possibly multiple equations and multiple functions, but usually one equation in one variable Functions of more than one variable can appear in partial differential equations (PDE s) Some ODE s can be solved analytically, but most cannot

Initial/Boundary Value Problem An initial value problem is an ODE in which the specifications that make the solution unique occur at a single value of the independent variable x or t A boundary value problem specifies the conditions at a number of different x or t values

Consider an ODE of the form dy/dx = f(x, y) with initial conditions. We can trace out a solution starting at (x_0, y_0):
y_{i+1} = y_i + φh, where x_{i+1} − x_i = h and φ is an estimate of the slope dy/dx.

Runge-Kutta methods Euler s method is the simplest of these one-step methods Improved slope estimates can improve the result These methods are called in general Runge-Kutta or RK methods

dy/dx = f(x, y)
y_{i+1} = y_i + φh
φ = dy/dx evaluated at (x_i, y_i) = f(x_i, y_i)   (Euler's method)
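A Python sketch of Euler's method, applied to the example y' = 4e^{0.8x} − 0.5y, y(0) = 2 that is worked by hand on the following slides.

import math

def euler(f, x0, y0, h, n_steps):
    # y_{i+1} = y_i + f(x_i, y_i) * h
    x, y = x0, y0
    for _ in range(n_steps):
        y += f(x, y) * h
        x += h
    return y

f = lambda x, y: 4 * math.exp(0.8 * x) - 0.5 * y
print(euler(f, 0, 2, 1, 1))   # 5.0; the true value at x = 1 is 6.1946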

Errors in Euler's Method. Errors are local (within each step) and global (accumulated). Errors are caused by truncation (when h is large) and roundoff (when h is small and the number of steps is large).

Euler s Method Is simple to implement Can be sufficiently accurate for many practical tasks if the step size is small enough No step size will result in a highly accurate result Higher order methods are needed

Improvements in Euler s Method We could use a higher order Taylor expansion at the current iterate to reduce truncation error This results in more analytical complexity due to the need for more derivatives Mostly, alternative methods are used to make the extrapolation more accurate Extrapolation is a hazardous business!

Heun s Method One problem with Euler s method is that it uses the derivative at the beginning of the interval to predict the change within the interval Heun s method uses a better estimate of the change, which is closer to the average derivative in the interval, rather than the initial derivative It is one of a class of predictor-corrector methods

y'_i = f(x_i, y_i)
y^0_{i+1} = y_i + f(x_i, y_i) h   (Euler step: the predictor equation)
y'^0_{i+1} = f(x_{i+1}, y^0_{i+1})
ȳ' = [f(x_i, y_i) + f(x_{i+1}, y^0_{i+1})] / 2
y_{i+1} = y_i + h [f(x_i, y_i) + f(x_{i+1}, y^0_{i+1})] / 2   (the corrector equation, which can be iterated)
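A Python sketch of Heun's predictor-corrector step, with an optional number of corrector passes; it reproduces the hand calculations on the next slides.

import math

def heun(f, x0, y0, h, n_steps, n_corrector=1):
    x, y = x0, y0
    for _ in range(n_steps):
        slope0 = f(x, y)
        y_new = y + slope0 * h                                  # predictor (Euler step)
        for _ in range(n_corrector):
            y_new = y + (slope0 + f(x + h, y_new)) / 2 * h      # corrector
        x, y = x + h, y_new
    return y

f = lambda x, y: 4 * math.exp(0.8 * x) - 0.5 * y
print(heun(f, 0, 2, 1, 1))                   # about 6.701 (one corrector pass)
print(heun(f, 0, 2, 1, 1, n_corrector=2))    # about 6.276 (corrector applied twice)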

Integrate y' = 4e^{0.8x} − 0.5y from 0 to 4 with stepsize 1, and y = 2 at x = 0.
Analytical solution: try y = a e^{0.8x} + b e^{−0.5x}, so y' = 0.8a e^{0.8x} − 0.5b e^{−0.5x}
0 = y' − 4e^{0.8x} + 0.5y = 0.8a e^{0.8x} − 0.5b e^{−0.5x} − 4e^{0.8x} + 0.5a e^{0.8x} + 0.5b e^{−0.5x} = (1.3a − 4) e^{0.8x}
so a = 4/1.3 and, from y(0) = 2, b = 2 − 4/1.3:
y = (4/1.3)(e^{0.8x} − e^{−0.5x}) + 2 e^{−0.5x}

y' = 4e^{0.8x} − 0.5y from 0 to 4 with stepsize 1, y(0) = 2
x = 0: y_0 = 2, y'_0 = 4 − 1 = 3
Euler: y_1 = 2 + 3(1) = 5 (the true value at x = 1 is 6.1946); ε_t = |6.1946 − 5|/6.1946 = 0.193
Heun: the slope at x = 1 from the predictor is y'^0_1 = 4e^{0.8} − 0.5(5) = 6.4022
ȳ' = (3 + 6.4022)/2 = 4.7011
y_1 = 2 + 4.7011(1) = 6.7010; ε_t = |6.1946 − 6.7010|/6.1946 = 0.082

y' = 4e^{0.8x} − 0.5y from 0 to 4 with stepsize 1, y(0) = 2
x = 0: y_0 = 2, y'_0 = 3, Euler predictor y^0_1 = 5, first corrector y_1 = 6.7010
Iterating the corrector: y'_1 = 4e^{0.8} − 0.5(6.7010) = 5.5517
ȳ' = (3 + 5.5517)/2 = 4.2758
y_1 = 2 + 4.2758(1) = 6.2758; ε_t = |6.1946 − 6.2758|/6.1946 = 0.013

This will not, in general converge upon iteration to the true value of y i+1 This is because we are at best estimating the actual slope of the secant by the average of the slopes at the two ends, and even were the slopes at the two ends exact, this is not an identity

Integrate dy/dx = −2x^3 + 12x^2 − 20x + 8.5 between 0 and 4, with y = 1 at x = 0 and h = 0.5.
Euler's method: y'_0 = 8.5, so y_1 = 1 + 8.5(0.5) = 5.25
Heun's method: y'^0_1 = 1.25, ȳ' = (8.5 + 1.25)/2 = 4.875, y_1 = 1 + 4.875(0.5) = 3.4375
No iteration is needed because the derivative depends only on x. (The true value at x = 0.5 is 3.21875.)

Midpoint Method Euler s method approximates the slope of the secant between two points by the slope at the left end of the interval Heun s method approximates it by the average of the estimated slopes at the endpoints The midpoint method approximates it by the estimated slope at the average of the endpoints

Integrate y' = 4e^{0.8x} − 0.5y from 0 to 4 with stepsize 1, y(0) = 2, by the midpoint method.
x = 0: y_0 = 2, y'_0 = 4 − 1 = 3
y_{1/2} = 2 + 3(1/2) = 3.5 (the true value at x = 1/2 is 3.7515)
y'_{1/2} = 4e^{0.4} − 0.5(3.5) = 4.2173
y_1 = 2 + 4.2173(1) = 6.2173; ε_t = |6.1946 − 6.2173|/6.1946 = 0.0227/6.1946 = 0.0037

y'_i = f(x_i)
y^0_{i+1} = y_i + f(x_i) h   (Euler)
y'_{i+1} = f(x_{i+1})
ȳ' = [f(x_i) + f(x_{i+1})]/2
y_{i+1} = y_i + h [f(x_i) + f(x_{i+1})]/2   (Heun)
y_{i+1} − y_i = h [f(x_i) + f(x_{i+1})]/2   (Trapezoid Rule)

y'_i = f(x_i)
y^0_{i+1} = y_i + f(x_i) h   (Euler = left Riemann sum)
y'_{i+1/2} = f(x_{i+1/2})
y_{i+1} = y_i + f(x_{i+1/2}) h   (Midpoint = midpoint Riemann sum)
y_{i+1} − y_i = f(x_{i+1/2}) h

Integrate dy/dx = −2x^3 + 12x^2 − 20x + 8.5 between 0 and 4, with y = 1 at x = 0 and h = 0.5.
Euler's method: y'_0 = 8.5, y_1 = 1 + 8.5(0.5) = 5.25
Heun's method: y'^0_1 = 1.25, ȳ' = 4.875, y_1 = 1 + 4.875(0.5) = 3.4375
Midpoint method: y'_{1/2} = 4.21875, y_1 = 1 + 4.21875(0.5) = 3.109375
(The true value at x = 0.5 is 3.21875.)

Error Analysis Euler s method integrates exactly over an interval so long as the derivative at the beginning is the same as the slope of the secant line. This requires the derivative to be constant. y = f(x) = ax + b fulfills this requirement. The function must be linear. If f(x) is quadratic, then Heun s method and the midpoint method are exact

Let f(x) = ax^2 + bx + c. Then
f(x_1) − f(x_0) = (ax_1^2 + bx_1 + c) − (ax_0^2 + bx_0 + c) = a(x_1^2 − x_0^2) + b(x_1 − x_0) = (x_1 − x_0)[a(x_1 + x_0) + b]
so the slope of the secant is [f(x_1) − f(x_0)]/(x_1 − x_0) = a(x_1 + x_0) + b
[f'(x_1) + f'(x_0)]/2 = [2ax_1 + b + 2ax_0 + b]/2 = a(x_1 + x_0) + b
f'((x_1 + x_0)/2) = 2a(x_1 + x_0)/2 + b = a(x_1 + x_0) + b

Error Analysis. If the function f(x) is approximated by a Taylor series, then Euler's method is exact on the first-order term, so the local error is O(h^2). Heun's method and the midpoint method are exact on the second-order approximation, so the local error is O(h^3). Since the number of steps is O(1/h), the global error is O(h) for Euler and O(h^2) for Heun and the midpoint method.

General Runge-Kutta Methods Achieve accuracy of higher order Taylor series expansions without having to compute additional terms explicitly Use the same general formulation as Euler s method, Heun s method, and the midpoint method in which the next point is the previous point plus the stepsize times an estimate of the slope.

y' = f(x, y)
y_{i+1} = y_i + φ(x_i, y_i, h) h
φ = a_1 k_1 + a_2 k_2 + ... + a_n k_n
k_1 = f(x_i, y_i)
k_2 = f(x_i + p_1 h, y_i + q_{11} k_1 h)
k_3 = f(x_i + p_2 h, y_i + q_{21} k_1 h + q_{22} k_2 h)
...
k_n = f(x_i + p_{n−1} h, y_i + q_{n−1,1} k_1 h + ... + q_{n−1,n−1} k_{n−1} h)
n = 1: Euler's method
n = 2: Heun's method (p_1 = q_{11} = 1, a_1 = a_2 = 1/2) or the midpoint method (p_1 = q_{11} = 1/2, a_1 = 0, a_2 = 1)

Second-Order Runge-Kutta
y' = f(x, y), y_{i+1} = y_i + φ(x_i, y_i, h) h, φ = a_1 k_1 + a_2 k_2
k_1 = f(x_i, y_i), k_2 = f(x_i + p_1 h, y_i + q_{11} k_1 h)
y_{i+1} = y_i + (a_1 k_1 + a_2 k_2) h
Taylor series: y_{i+1} ≈ y_i + f(x_i, y_i) h + [df(x_i, y_i)/dx] h^2/2
where df(x, y)/dx = ∂f(x, y)/∂x + [∂f(x, y)/∂y] (dy/dx)
so y_{i+1} ≈ y_i + f(x_i, y_i) h + [∂f/∂x + (∂f/∂y)(dy/dx)] h^2/2

y_{i+1} = y_i + (a_1 k_1 + a_2 k_2) h
y_{i+1} ≈ y_i + f(x_i, y_i) h + [∂f/∂x + (∂f/∂y)(dy/dx)] h^2/2
k_1 = f(x_i, y_i)
k_2 = f(x_i + p_1 h, y_i + q_{11} k_1 h) ≈ f(x_i, y_i) + p_1 h (∂f/∂x) + q_{11} k_1 h (∂f/∂y)
y_{i+1} ≈ y_i + a_1 h f(x_i, y_i) + a_2 h f(x_i, y_i) + a_2 h^2 p_1 (∂f/∂x) + a_2 h^2 q_{11} f(x_i, y_i) (∂f/∂y)
Matching terms: 1 = a_1 + a_2, 1/2 = a_2 p_1, 1/2 = a_2 q_{11}
so a_1 = 1 − a_2 and p_1 = q_{11} = 1/(2 a_2)

y_{i+1} = y_i + (a_1 k_1 + a_2 k_2) h
k_1 = f(x_i, y_i), k_2 = f(x_i + p_1 h, y_i + q_{11} k_1 h)
a_1 = 1 − a_2, p_1 = q_{11} = 1/(2 a_2)
With a_2 = 1/2: a_1 = 1/2, p_1 = q_{11} = 1
y_{i+1} = y_i + h [f(x_i, y_i) + f(x_i + h, y_i + f(x_i, y_i) h)]/2   (Heun's method)

y_{i+1} = y_i + (a_1 k_1 + a_2 k_2) h
k_1 = f(x_i, y_i), k_2 = f(x_i + p_1 h, y_i + q_{11} k_1 h)
a_1 = 1 − a_2, p_1 = q_{11} = 1/(2 a_2)
With a_2 = 1: a_1 = 0, p_1 = q_{11} = 1/2
y_{i+1} = y_i + h f(x_i + h/2, y_i + f(x_i, y_i) h/2)   (Midpoint method)

y_{i+1} = y_i + (a_1 k_1 + a_2 k_2) h
k_1 = f(x_i, y_i), k_2 = f(x_i + p_1 h, y_i + q_{11} k_1 h)
a_1 = 1 − a_2, p_1 = q_{11} = 1/(2 a_2)
With a_2 = 2/3: a_1 = 1/3, p_1 = q_{11} = 3/4
y_{i+1} = y_i + h [f(x_i, y_i) + 2 f(x_i + 3h/4, y_i + 3 f(x_i, y_i) h/4)]/3   (Ralston's method)

Higher-Order Methods Euler s method is RK order 1 and has global error O(h) Second-order RK methods (Heun, Midpoint, Ralston) have global error O(h 2 ) Third-order RK methods have global error O(h 3 ) Fourth-order RK methods have global error O(h 4 )

Derivation of RK Methods Second-order RK methods have four constants, and three equations from comparing the Taylor series expansion to the iteration. There is one undetermined constant Third-order methods have six equations with eight undetermined constants, so two are arbitrary.

Third-Order Runge-Kutta
y' = f(x, y), y_{i+1} = y_i + φ(x_i, y_i, h) h
φ = a_1 k_1 + a_2 k_2 + a_3 k_3
k_1 = f(x_i, y_i)
k_2 = f(x_i + p_1 h, y_i + q_{11} k_1 h)
k_3 = f(x_i + p_2 h, y_i + q_{21} k_1 h + q_{22} k_2 h)
y_{i+1} = y_i + (a_1 k_1 + a_2 k_2 + a_3 k_3) h
Compare with the Taylor expansion y_{i+1} ≈ y_i + f(x_i, y_i) h + (1/2)[df(x_i, y_i)/dx] h^2 + (1/6)[d^2 f(x_i, y_i)/dx^2] h^3
where df/dx = ∂f/∂x + (∂f/∂y)(dy/dx) and
d^2 f/dx^2 = ∂^2 f/∂x^2 + 2 (∂^2 f/∂x∂y)(dy/dx) + (∂^2 f/∂y^2)(dy/dx)^2 + (∂f/∂y)(d^2 y/dx^2)

Common Third-Order Method
k_1 = f(x_i, y_i)
k_2 = f(x_i + h/2, y_i + k_1 h/2)
k_3 = f(x_i + h, y_i − k_1 h + 2 k_2 h)
y_{i+1} = y_i + (k_1 + 4 k_2 + k_3) h/6
Reduces to Simpson's rule when f depends only on x.

Standard Fourth-Order Method
k_1 = f(x_i, y_i)
k_2 = f(x_i + h/2, y_i + k_1 h/2)
k_3 = f(x_i + h/2, y_i + k_2 h/2)
k_4 = f(x_i + h, y_i + k_3 h)
y_{i+1} = y_i + (k_1 + 2 k_2 + 2 k_3 + k_4) h/6
Reduces to Simpson's rule when f depends only on x.
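A Python sketch of the classical fourth-order method; the two calls reproduce the full-step and half-step predictions used in the step-halving example later in these slides.

import math

def rk4(f, x0, y0, h, n_steps):
    x, y = x0, y0
    for _ in range(n_steps):
        k1 = f(x, y)
        k2 = f(x + h/2, y + k1*h/2)
        k3 = f(x + h/2, y + k2*h/2)
        k4 = f(x + h, y + k3*h)
        y += (k1 + 2*k2 + 2*k3 + k4) * h / 6
        x += h
    return y

f = lambda x, y: 4 * math.exp(0.8 * x) - 0.5 * y
print(rk4(f, 0, 2, 2, 1))   # about 15.10585 (one step of size 2)
print(rk4(f, 0, 2, 1, 2))   # about 14.86248 (two steps of size 1)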

Comparing RK Methods Accuracy depends on the step size and the order Computational effort is usually measured in function evaluations Up to order 4, an order-m RK method requires m(b-a)/h function evaluations Butcher s order 5 method requires 6(b-a)/h

Systems of ODEs. We track multiple responses y_1, y_2, ..., y_n, each of which depends on a single variable x and possibly on all of the other responses. We also need n initial conditions at x = x_0.

dy_1/dx = f_1(x, y_1, y_2, ..., y_n)
dy_2/dx = f_2(x, y_1, y_2, ..., y_n)
...
dy_n/dx = f_n(x, y_1, y_2, ..., y_n)
Euler's method: y_j^{(i+1)} = y_j^{(i)} + f_j(x^{(i)}, y_1^{(i)}, y_2^{(i)}, ..., y_n^{(i)}) h

dy_1/dx = −0.5 y_1, dy_2/dx = 4 − 0.3 y_2 − 0.1 y_1
Integrate from x = 0 to x = 2 with initial values y_1 = 4, y_2 = 6 and h = 0.5:
x      y_1       y_2       dy_1/dx    dy_2/dx
0.0    4         6         -2.0       1.8
0.5    3         6.9       -1.5       1.63
1.0    2.25      7.715     -1.125     1.4605
1.5    1.6875    8.4453    -0.8438    1.2977
2.0    1.2656    9.0941

RK Methods for ODE Systems. We describe the common order-4 method. First determine the slopes at the initial value for all variables; this gives a set of n k_1 values. Then use these to estimate a set of function values and slopes at the midpoint. Use these to get improved midpoint values and slopes. Use these to get an estimate of the value and slope at the end. Combine for the final projection.

dy_1/dx = −0.5 y_1, dy_2/dx = 4 − 0.3 y_2 − 0.1 y_1
Integrate from x = 0 to x = 2 with initial values y_1 = 4, y_2 = 6 and h = 0.5.
k_1 = f(x_i, y_i)
k_2 = f(x_i + h/2, y_i + k_1 h/2)
k_3 = f(x_i + h/2, y_i + k_2 h/2)
k_4 = f(x_i + h, y_i + k_3 h)
y_{i+1} = y_i + (k_1 + 2 k_2 + 2 k_3 + k_4) h/6

dy_1/dx = −0.5 y_1, dy_2/dx = 4 − 0.3 y_2 − 0.1 y_1; x = 0 to 2, y_1 = 4, y_2 = 6, h = 0.5
k_{1,1} = f_1(0, 4, 6) = −2
k_{1,2} = f_2(0, 4, 6) = 4 − 0.3(6) − 0.1(4) = 1.8
k_{2,1} = f_1(0.25, 4 + (−2)(0.5)/2, 6 + (1.8)(0.5)/2) = f_1(0.25, 3.5, 6.45) = −1.75
k_{2,2} = f_2(0.25, 3.5, 6.45) = 1.715

k_{1,1} = −2, k_{1,2} = 1.8, k_{2,1} = −1.75, k_{2,2} = 1.715
k_{3,1} = f_1(0.25, 4 + (−1.75)(0.5)/2, 6 + (1.715)(0.5)/2) = f_1(0.25, 3.5625, 6.42875) = −1.78125
k_{3,2} = f_2(0.25, 3.5625, 6.42875) = 1.715125
k_{4,1} = f_1(0.5, 4 + (−1.78125)(0.5), 6 + (1.715125)(0.5)) = f_1(0.5, 3.109375, 6.857563) = −1.554688
k_{4,2} = f_2(0.5, 3.109375, 6.857563) = 1.631794

k_{1,1} = −2, k_{1,2} = 1.8, k_{2,1} = −1.75, k_{2,2} = 1.715, k_{3,1} = −1.78125, k_{3,2} = 1.715125, k_{4,1} = −1.554688, k_{4,2} = 1.631794
y_1(0.5) = 4 + [−2 + 2(−1.75) + 2(−1.78125) + (−1.554688)](0.5)/6 = 3.115234
y_2(0.5) = 6 + [1.8 + 2(1.715) + 2(1.715125) + 1.631794](0.5)/6 = 6.857670
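A numpy sketch of RK4 applied to this system; one step of size 0.5 reproduces the values just computed by hand.

import numpy as np

def rk4_system(f, x0, y0, h, n_steps):
    # classical RK4 where y and f(x, y) are vectors
    x, y = x0, np.array(y0, dtype=float)
    for _ in range(n_steps):
        k1 = f(x, y)
        k2 = f(x + h/2, y + k1*h/2)
        k3 = f(x + h/2, y + k2*h/2)
        k4 = f(x + h, y + k3*h)
        y = y + (k1 + 2*k2 + 2*k3 + k4) * h / 6
        x += h
    return y

f = lambda x, y: np.array([-0.5*y[0], 4 - 0.3*y[1] - 0.1*y[0]])
print(rk4_system(f, 0, [4, 6], 0.5, 1))   # about [3.115234, 6.857670]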

Adaptive RK Methods A fixed step size may be overkill for some regions of a function and may be too large to be accurate for others Adaptive methods use different step sizes for different regions of the function Several methods for accomplishing this Use different step sizes but same order Use different orders

Adaptive RK or Step-Halving. Predict over the step with order-4 RK, obtaining prediction y_1. Predict with two steps of half the step size to obtain prediction y_2. The difference Δ = y_2 − y_1 is an estimate of the error that can be used to control step-size adjustment, and y_2* = y_2 + Δ/15 is fifth-order accurate.

y' = 4e^{0.8x} − 0.5y, x = 0 to 2, h = 2, y(0) = 2. The true value at x = 2 is 14.84392.
Full-step prediction:
k_1 = f(x_i, y_i) = f(0, 2) = 3
k_2 = f(x_i + h/2, y_i + k_1 h/2) = f(1, 5) = 6.402164
k_3 = f(x_i + h/2, y_i + k_2 h/2) = f(1, 8.402164) = 4.701082
k_4 = f(x_i + h, y_i + k_3 h) = f(2, 11.40216) = 14.11105
y_{i+1} = y_i + (k_1 + 2 k_2 + 2 k_3 + k_4) h/6 = 15.10585

y' = 4e^{0.8x} − 0.5y, x = 0 to 2, h = 2, y(0) = 2. The true value at 2 is 14.84392, and the full-step prediction is 15.10585. The half-step predictions are
y_{i+1} = 2 + [3 + 2(4.217299 + 3.912974) + 5.945677](1)/6 = 6.201037
y_{i+2} = 6.201037 + [5.801645 + 2(8.729538 + 7.997565) + 12.712829](1)/6 = 14.862484
E_a = Δ/15 = (14.862484 − 15.10585)/15 = −0.01622
E_t = 14.84392 − 14.862484 = −0.01857
y* = 14.862484 + (−0.01622) = 14.84627
E_t = 14.84392 − 14.84627 = −0.00235

Fehlberg/Cash-Karp RK Instead of using two different step sizes, we can use two different orders This may use too many function evaluations unless the two orders are coordinated Fehlberg RK uses a fifth order method using the same function evaluations as a fourth order method Coefficients due to Cash and Karp

y_{i+1}^{(4)} = y_i + [ (37/378) k_1 + (250/621) k_3 + (125/594) k_4 + (512/1771) k_6 ] h
y_{i+1}^{(5)} = y_i + [ (2825/27648) k_1 + (18575/48384) k_3 + (13525/55296) k_4 + (277/14336) k_5 + (1/4) k_6 ] h
k_1 = f(x_i, y_i)
k_2 = f(x_i + h/5, y_i + (1/5) k_1 h)
k_3 = f(x_i + 3h/10, y_i + (3/40) k_1 h + (9/40) k_2 h)
k_4 = f(x_i + 3h/5, y_i + (3/10) k_1 h − (9/10) k_2 h + (6/5) k_3 h)
k_5 = f(x_i + h, y_i − (11/54) k_1 h + (5/2) k_2 h − (70/27) k_3 h + (35/27) k_4 h)
k_6 = f(x_i + 7h/8, y_i + (1631/55296) k_1 h + (175/512) k_2 h + (575/13824) k_3 h + (44275/110592) k_4 h + (253/4096) k_5 h)

Values needed for RK Fehlberg for the example:
        x      y          f(x, y)
k_1     0      2          3
k_2     0.4    3.2        3.908511
k_3     0.6    4.20883    4.359883
k_4     1.2    7.228398   6.832587
k_5     2.0    15.42765   12.09831
k_6     1.75   12.17686   10.13237

y_{i+1}^{(4)} = 2 + [ (37/378)(3) + (250/621)(4.359883) + (125/594)(6.832587) + (512/1771)(10.13237) ](2) = 14.83192
y_{i+1}^{(5)} = 2 + [ (2825/27648)(3) + (18575/48384)(4.359883) + (13525/55296)(6.832587) + (277/14336)(12.09831) + (1/4)(10.13237) ](2) = 14.83677
E_a = 14.83677 − 14.83192 = 0.004842

Step Size Control First we specify desired accuracy Relative error can be a problem if the function is near 0 Absolute error takes no account of the scale of the function One method is to let the desired accuracy depend on a multiple of both the function and its derivative

Δ_new = ε y_scale
y_scale = |y| + |h dy/dx|
h_new = h_present |Δ_new / Δ_present|^α
α = 0.2 when the step size is increased; α = 0.25 when the step size is decreased.
This is one scheme of many for adaptive step size.

Example: dy/dx + 0.6y = 10 exp[ −(x − 2)^2 / (2(0.075)^2) ], with y(0) = 0.5.
General solution of the unforced equation: y = 0.5 exp(−0.6x).