Lecture 26: Richardson extrapolation

3.5 Richardson extrapolation, Romberg integration

Throughout numerical analysis, one encounters procedures that apply some simple approximation (e.g., linear interpolation) to construct some equally simple algorithm (e.g., differentiate the interpolant to get a finite difference formula (Section 1.7); integrate the interpolant to get the trapezoid rule (Section 3.2)). An unfortunate consequence is that such approximations often converge slowly, with errors decaying only like h or h², where h is some discretization parameter (e.g., the spacing between interpolation points).

In this lecture we describe a remarkable, fundamental tool of classical numerical analysis. Like alchemists who sought to convert lead into gold, so we will take a sequence of slowly convergent data and extract from it a highly accurate estimate of our solution. This procedure is Richardson extrapolation, an essential but easily overlooked technique that should be part of every numerical analyst's toolbox. When applied to quadrature rules, the procedure is called Romberg integration.

We begin in a general setting: Suppose we seek some abstract quantity, x ∈ IR, which could be the value of a definite integral, a derivative, the solution to a differential equation at a certain point, or something else entirely. Further suppose we cannot compute x exactly; we can only access numerical approximations to it, generated by some function (an algorithm) F that depends upon a mesh parameter h. We compute F(h) for several values of h, expecting that F(h) → F(0) = x as h → 0. To obtain good accuracy, one naturally seeks to evaluate F with increasingly smaller values of h. There are two reasons not to do so: often F becomes increasingly expensive to evaluate as h shrinks, and the numerical accuracy with which we can evaluate F may deteriorate as h gets small, due to rounding errors in floating point arithmetic. (For an example of the latter, try computing estimates of f′(a) using the formula f′(a) ≈ (f(a+h) − f(a))/h as h →
0.)

In the case of integration, you might prefer using a higher order method, like Clenshaw–Curtis or Gaussian quadrature. What we talk about here is an alternative to such approaches.

For example, computing F(h/2) often requires at least twice as much work as F(h). In some cases, F(h/2) could require 4, or even 8, times as much work as F(h), i.e., the expense of F could grow like 1/h or 1/h² or 1/h³, etc.

Assume that F is infinitely continuously differentiable as a function of h, thus allowing us to expand F(h) in the Taylor series

    F(h) = F(0) + h F′(0) + (h²/2) F″(0) + (h³/6) F‴(0) + ⋯.

The derivatives here may seem to complicate matters (e.g., what are the derivatives of a quadrature rule with respect to h?), but we shall not need to compute them: the key is that the function F behaves
smoothly in h. Recalling that F(0) = x, we can rewrite the Taylor series for F(h) as

    F(h) = x + c₁h + c₂h² + c₃h³ + ⋯

for some constants {c_j}_{j=1}^∞. (For example, c₁ = F′(0).) This expansion implies that taking F(h) as an approximation for x incurs an O(h) error. Halving the parameter h should roughly halve the error, according to the expansion

    F(h/2) = x + c₁(h/2) + c₂(h²/4) + c₃(h³/8) + ⋯.

Here comes the trick that is key to the whole lecture: Combine the expansions for F(h) and F(h/2) in such a way that eliminates the O(h) term. In particular, define

    Y(h) := 2F(h/2) − F(h)
          = 2[x + c₁(h/2) + c₂(h²/4) + c₃(h³/8) + ⋯] − [x + c₁h + c₂h² + c₃h³ + ⋯]
          = x − c₂(h²/2) − c₃(3h³/4) − ⋯.

Thus, Y(h) also approximates x = Y(0) = F(0), but with an O(h²) error, rather than the O(h) error that pollutes F(h). For small h, this O(h²) approximation will be considerably more accurate.

Why stop with Y(h)? Repeat the procedure, combining Y(h) and Y(h/2) to eliminate the O(h²) term. Since

    Y(h/2) = x − c₂(h²/8) − c₃(3h³/32) − ⋯,

we have

    Q(h) := (4Y(h/2) − Y(h))/3 = x + c₃(h³/8) + ⋯.

For the sake of clarity, let us discuss a concrete case, elaborated upon in Example 3.2 below. Suppose we wish to compute x = f′(a) using the finite difference formula F(h) = (f(a+h) − f(a))/h. The quotient rule gives

    F′(h) = (f(a) − f(a+h))/h² + f′(a+h)/h,

which will depend smoothly on h provided f is smooth near a. In particular, a Taylor expansion for f gives f(a+h) = f(a) + h f′(a) + (h²/2) f″(a) + (h³/6) f‴(ξ) for some ξ ∈ [a, a+h]. Substitute this formula into the equation for F′(h) and simplify to get

    F′(h) = (f′(a+h) − f′(a))/h − f″(a)/2 − (h/6) f‴(ξ).

Now this expression leads to a clean formula for the first coefficient of the Taylor series for F(h):

    c₁ = F′(0) = lim_{h→0} F′(h) = f″(a) − f″(a)/2 = f″(a)/2.

The moral of the example: while it might seem strange to take a Taylor series of the algorithm F, the quantities involved often have a very natural interpretation in terms of the underlying problem at hand.

To compute Q(h), we must have access to both Y(h) and Y(h/2). These, in turn, require F(h), F(h/2), and F(h/4). In many cases, F becomes increasingly expensive to compute
as the parameter h is reduced. Thus there is some practical limit to how small we can take h when evaluating F(h).

One could continue this procedure repeatedly, each time improving the accuracy by one order, at the cost of one additional F computation with a smaller h. To facilitate generalization and to avoid a further tangle of Greek characters, we adopt a new notation: Define

    R(j, 0) := F(h/2^j),   j ≥ 0;
    R(j, k) := [2^k R(j, k−1) − R(j−1, k−1)] / (2^k − 1),   j ≥ k > 0.
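This recurrence takes only a few lines of code. The following sketch (our own illustration, not code from the notes; the function name `richardson` is ours) builds the triangular table for the forward-difference estimate of f′(1) with f(x) = eˣ, the test problem of Example 3.2 below:

```python
import math

def richardson(F, h, levels):
    """Build the triangular table R[j][k] with R[j][0] = F(h / 2**j) and
    R[j][k] = (2**k * R[j][k-1] - R[j-1][k-1]) / (2**k - 1)   (the r = 1 case)."""
    R = [[F(h / 2 ** j)] for j in range(levels + 1)]
    for j in range(1, levels + 1):
        for k in range(1, j + 1):
            R[j].append((2 ** k * R[j][k - 1] - R[j - 1][k - 1]) / (2 ** k - 1))
    return R

# Forward-difference estimate of f'(1) for f(x) = exp(x); the exact answer is e.
F = lambda h: (math.exp(1.0 + h) - math.exp(1.0)) / h
R = richardson(F, 1.0, 4)
# R[4][0] is the crude O(h) estimate; R[4][4] is far more accurate.
```

Note that only the first column calls F; every other entry costs a single subtraction and division.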
Thus: R(0,0) = F(h), R(1,0) = F(h/2), and R(1,1) = Y(h). This procedure is called Richardson extrapolation after the British applied mathematician Lewis Fry Richardson, a pioneer of the numerical solution of partial differential equations, weather modeling, and mathematical models in political science. The numbers R(j,k) are arranged in a triangular extrapolation table:

    R(0,0)
    R(1,0)  R(1,1)
    R(2,0)  R(2,1)  R(2,2)
    R(3,0)  R(3,1)  R(3,2)  R(3,3)
      ↑       ↑       ↑       ↑
     O(h)   O(h²)   O(h³)   O(h⁴)

To compute any given element in the table, one must first determine the entries above and to the left. Note that only the first column will require significant work; the subsequent columns follow from easy arithmetic. The theory suggests that the bottom-right element in the table will be the most accurate approximation to x. Indeed this bottom-right entry will generally be the most accurate, provided the assumption that F is infinitely continuously differentiable holds. When floating point roundoff errors spoil what otherwise would have been an infinitely continuously differentiable procedure, the bottom-right entry will suffer acutely from this pollution. Such errors will be apparent in the forthcoming example.

Example 3.2 (Finite difference approximation of the first derivative). We seek x = f′(a) for some continuously differentiable function f. Recall from Section 1.7 the simple finite difference approximation to the first derivative that follows from differentiating the linear interpolant to f through the points x = a and x = a + h:

    f′(a) ≈ (f(a+h) − f(a))/h.

In fact, in Theorem 1.6 we quantified the error to be O(h) as h → 0:

    f′(a) = (f(a+h) − f(a))/h + O(h).

Thus we define

    F(h) = (f(a+h) − f(a))/h.

As a simple test problem, take f(x) = eˣ. We will use F and Richardson extrapolation to approximate f′(1) = e = 2.7182818284⋯. The simple finite difference method produces crude answers:
      h       F(h)          error
      1       4.670774270   1.95249 × 10⁰
     1/2      3.526814484   8.08533 × 10⁻¹
     1/4      3.088244516   3.69963 × 10⁻¹
     1/8      2.895480164   1.77198 × 10⁻¹
     1/16     2.805025851   8.67440 × 10⁻²
     1/32     2.761200889   4.29191 × 10⁻²
     1/64     2.739629446   2.13476 × 10⁻²
     1/128    2.728927823   1.06460 × 10⁻²
     1/256    2.723597892   5.31606 × 10⁻³
     1/512    2.720938130   2.65630 × 10⁻³

Even with h = 1/512 ≈ 0.00195 we fail to approximate f′(1) to even three correct digits. As we take h smaller and smaller, finite precision arithmetic eventually causes unacceptable errors; Figure 3.9 shows the error in F(h) as h → 0. (The red line shows what perfect O(h) convergence would look like.)

[Figure 3.9: Linear convergence of the estimate F(h) to f′(1) (blue line). As h gets small, rounding errors spoil the O(h) convergence (red line). An accuracy of about 10⁻⁸ seems to be the best we can do for this method and this problem.]

A few steps of Richardson extrapolation on the data in the table above reveals greatly improved solutions, five correct digits in R(4,4):

 j  R(j,0)            R(j,1)            R(j,2)            R(j,3)            R(j,4)
 0  4.67077427047160
 1  3.52681448375804  2.38285469704447
 2  3.08824451601118  2.64967454826433  2.73861449867095
 3  2.89548016367188  2.70271581133258  2.72039623235534  2.7177936228868
 4  2.80502585140344  2.71457153913500  2.71852344840247  2.71825590783778  2.71828672683485

The good performance of this method depends on f having sufficiently many smooth derivatives. If higher derivatives are not
smooth, then F(h) will not have smooth derivatives, and the accuracy breaks down. The accuracy also eventually degrades because of rounding errors that subtly pollute the initial column of data, as shown in Figure 3.10.

[Figure 3.10: The convergence of F(h) (blue) along with its first two Richardson refinements, Y(h) (green) and Q(h) (black). The red lines show O(h), O(h²), and O(h³) convergence. For these values of h, rounding errors are not apparent in the F(h) plot; however, they lurk in the later digits of F(h), enough to interfere with the Y(h) and Q(h) approximations. Before these errors take hold, Q(h) gives several additional orders of magnitude accuracy than was obtained by F(h) with much smaller h, in Figure 3.9.]

3.5.1 Extrapolation for higher order approximations

In many cases, the initial algorithm F(h) is better than O(h) accurate, and in this case the formula for R(j,k) should be adjusted to take advantage. Suppose that

    F(h) = x + c₁h^r + c₂h^{2r} + c₃h^{3r} + ⋯

for some integer r. (Notice that this structure is rather special: for example, if r = 2, then the Taylor series for F(h) must avoid all odd-order terms.) Then define R(j,0) := F(h/2^j) for j ≥ 0 and

    R(j, k) := [2^{rk} R(j, k−1) − R(j−1, k−1)] / (2^{rk} − 1)   for j ≥ k > 0.        (3.3)

In this case, the R(:, k) column will be O(h^{(k+1)r}) accurate.

3.5.2 Extrapolating the composite trapezoid rule: Romberg integration

Suppose f ∈ C^∞[a, b], and we wish to approximate ∫_a^b f(x) dx with the composite trapezoid rule,

    T(h) = (h/2) [ f(a) + 2 Σ_{j=1}^{n−1} f(a + jh) + f(b) ].

Notice that T(h) only makes sense (as the composite trapezoid rule) when h = (b−a)/n for some integer n. (If you find this restriction on h distracting, just define T(h) to be a sufficiently smooth interpolation between the values of T((b−a)/n) for n = 1, 2, ….) Notice that T((b−a)/n)
requires n + 1 evaluations of the function f, and so increasing n (decreasing h) increases the expense. One can show that for any f ∈ C^∞[a, b],

    T(h) = ∫_a^b f(x) dx + c₁h² + c₂h⁴ + c₃h⁶ + ⋯.

Now perform the generalized Richardson extrapolation (3.3) on T(h) with r = 2:

    R(j, 0) = T(h/2^j)   for j ≥ 0;
    R(j, k) = [4^k R(j, k−1) − R(j−1, k−1)] / (4^k − 1)   for j ≥ k > 0.

This procedure is called Romberg integration. In cases where f ∈ C^∞[a, b] (or f has many continuous derivatives), the Romberg table will converge to high accuracy, though it may be necessary to take h to be relatively small before this is observed. When f does not have many continuous derivatives, each column of the Romberg table will still converge to the true integral, but not at the ever-improving clip we expect for smoother functions.

This procedure's utility is best appreciated through an example.

Example 3.3. For purposes of demonstration, we should use an integral we know exactly, say

    ∫_0^π sin(x) dx = 2.

Start the table with h = π to generate R(0,0), requiring 2 evaluations of f(x). To build out the table, compute the composite trapezoid approximation based on an increasing number of function evaluations at each step. (Ideally, one would exploit the fact that some grid points used to compute T(h) are also required for T(h/2), etc., thus limiting the number of new function evaluations required at each step.) The final entry in the first column requires 65 function evaluations, and has four digits correct. This may not seem particularly impressive, but after refining these computations through a few steps of Romberg integration, we have an approximation that is accurate to full precision.

 j  R(j,0)          R(j,1)          R(j,2)          R(j,3)          R(j,4)          R(j,5)          R(j,6)
 0  0.000000000000
 1  1.570796326795  2.094395102393
 2  1.896118897937  2.004559754984  1.998570731824
 3  1.974231601946  2.000269169948  1.999983130946  2.000005549980
 4  1.993570343772  2.000016591048  1.999999752455  2.000000016288  1.999999994587
 5  1.998393360970  2.000001033369  1.999999996190  2.000000000060  1.999999999996  2.000000000001
 6  1.999598388640  2.000000064530  1.999999999941  2.000000000000  2.000000000000  2.000000000000  2.000000000000

Be warned that Romberg results are not always as clean as this example, but this procedure is an important tool to have at hand when high precision integrals are required. The general strategy of Richardson extrapolation can be applied to great effect in a wide variety of numerical settings.
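The whole of this section condenses into a short program. The sketch below (our own illustration, not code from the notes; the function name `romberg` is ours) builds the Romberg table for ∫_0^π sin(x) dx by combining the composite trapezoid rule with the r = 2 extrapolation formula (3.3). For simplicity it recomputes all function values at each level rather than reusing the nested grid points, so it performs more evaluations of f than the careful count given above.

```python
import math

def romberg(f, a, b, levels):
    """Build the Romberg table R[j][k]: column k = 0 holds composite trapezoid
    values T((b - a) / 2**j); later columns apply (3.3) with r = 2, i.e.
    R[j][k] = (4**k * R[j][k-1] - R[j-1][k-1]) / (4**k - 1)."""
    R = []
    for j in range(levels + 1):
        n = 2 ** j                        # number of subintervals at this level
        h = (b - a) / n
        T = (h / 2) * (f(a) + 2 * sum(f(a + i * h) for i in range(1, n)) + f(b))
        row = [T]
        for k in range(1, j + 1):
            row.append((4 ** k * row[k - 1] - R[j - 1][k - 1]) / (4 ** k - 1))
        R.append(row)
    return R

R = romberg(math.sin, 0.0, math.pi, 6)
print(R[6][0])   # raw trapezoid value: only about four digits correct
print(R[6][6])   # extrapolated value: accurate to near machine precision
```

Printing the full table reproduces the rapid column-by-column improvement seen in Example 3.3.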