Numerical Methods I
Olof Widlund
Transcribed by Ian Tobasco



Abstract. This is part one of a two semester course on numerical methods. The course was offered in Fall 2011 at the Courant Institute for Mathematical Sciences, a division of New York University. The primary text for the course will be Numerical Linear Algebra by L. Trefethen and D. Bau. Analysis of Numerical Methods by Isaacson and Keller may be helpful when we discuss orthogonal polynomials and Gaussian quadrature. There will be regular homeworks. The course website is http://www.cs.nyu.edu/courses/fall11/CSCI-GA.2420-001/index.html.

Contents

Chapter 1. Singular Value Decomposition and QR Factorization
  1. Orthogonality
  2. The Singular Value Decomposition
  3. Projection Operators
  4. The QR Factorization
  5. Least-Squares

Chapter 2. Interpolation by Polynomials
  1. Newton Interpolation
  2. Hermite Polynomials
  3. Interpolation Error
  4. Piecewise Interpolation
  5. Quadrature by Polynomial Interpolation
  6. Orthogonal Polynomials
  7. Gaussian Quadrature

Chapter 3. Solving Linear Equations

CHAPTER 1

Singular Value Decomposition and QR Factorization

What is this course really about? It's mainly about orthogonality and its uses. [Lecture 1, 9/8/11]

1. Orthogonality

First, some notation. We consider $\mathbb{R}^n$ to be the linear space of column vectors $x = (x_1, \dots, x_n)^T$. We'll denote the transpose of $x$ as $x^* = (x_1, \dots, x_n)$. The (canonical) inner product will be
$$(x, y) = x^* y = \sum_{k=1}^n x_k y_k,$$
and as usual the (Euclidean) norm of $x$ is $\|x\|_{l_2} = \sqrt{x^* x}$. Recall that vectors $v_1, \dots, v_k \in \mathbb{R}^n$ are orthogonal if $v_j^* v_k = 0$ for $j \neq k$, and that a vector $v$ is normal if $\|v\| = 1$.

Why are orthogonal vectors important to scientific computing? Computers cannot do exact math. But we'll see that certain algorithms do not amplify the round-off errors so much: these methods will be known as well-conditioned and stable. As it will turn out, orthogonality is key to producing well-conditioned methods. Orthogonality is also related to variational principles. A variational problem is one in which the answer is a maximizer/minimizer.

Proposition 1.1. Suppose $x \neq 0$. Then $\frac{d}{d\epsilon}\|x + \epsilon y\|\big|_{\epsilon=0} = 0$ iff $x^* y = 0$.

Proof. Set $f(\epsilon) = \|x + \epsilon y\|^2$. Then
$$f(\epsilon) = (x + \epsilon y)^*(x + \epsilon y) = x^* x + 2\epsilon\, x^* y + \epsilon^2\, y^* y,$$
so $f'(0) = 2\, x^* y$. Now $g(\epsilon) = \sqrt{f(\epsilon)} = \|x + \epsilon y\|$, so
$$g'(0) = \frac{1}{2}\, f(0)^{-1/2} f'(0) = \frac{x^* y}{\|x\|}.$$
Hence
$$\frac{d}{d\epsilon}\|x + \epsilon y\|\Big|_{\epsilon=0} = \frac{x^* y}{\|x\|},$$
which gives the result.

2. The Singular Value Decomposition

In this section we'll describe the SVD of a matrix. (In finance this is known as principal component analysis, or PCA for short.) Let $A : \mathbb{R}^n \to \mathbb{R}^m$ be an $m \times n$ matrix. The SVD looks for the important parts of $A$. Here is an algorithm for computing the SVD of $A$.

Step 1. Compute $\sigma_1 = \max_{\|v\|=1} \|Av\| = \|A\|_2$ and find $v_1$ such that $\|v_1\| = 1$ and $\|Av_1\| = \sigma_1$. ($\sigma_1$ represents the biggest part of $A$.) Also define $u_1$ via $Av_1 = \sigma_1 u_1$, with $\|u_1\| = 1$.

Proposition 2.1. If $v^* v_1 = 0$ and $\sigma_1 \neq 0$, then $(Av)^* u_1 = 0$.

Proof. By contradiction: suppose $v^* v_1 = 0$ but $w = Av$ has $u_1^* w \neq 0$. Then $v_1$ was not optimal. To see this, set
$$v(\epsilon) = \frac{v_1 + \epsilon v}{\|v_1 + \epsilon v\|}$$
and note that $\|v(\epsilon)\| = 1$ for all $\epsilon$. By hypothesis,
$$\frac{d}{d\epsilon}\|v_1 + \epsilon v\|\Big|_{\epsilon=0} = 0,$$
and thus by the chain rule
$$\frac{d}{d\epsilon}\frac{1}{\|v_1 + \epsilon v\|^2}\Big|_{\epsilon=0} = 0.$$
Now let
$$w(\epsilon) = \|Av(\epsilon)\|^2 = \frac{1}{\|v_1 + \epsilon v\|^2}\,\|Av_1 + \epsilon Av\|^2 = \frac{1}{\|v_1 + \epsilon v\|^2}\,\|\sigma_1 u_1 + \epsilon w\|^2,$$
where $w = Av$. Now observe that $\frac{d}{d\epsilon} w(\epsilon)\big|_{\epsilon=0} = 0$ if $v_1$ is optimal. However,
$$w(\epsilon) = \frac{1}{\|v_1 + \epsilon v\|^2}\left(\sigma_1^2\, u_1^* u_1 + 2\epsilon\sigma_1\, u_1^* w + \epsilon^2\, w^* w\right)$$
and so
$$\frac{d}{d\epsilon} w(\epsilon)\Big|_{\epsilon=0} = 2\sigma_1\, u_1^* w.$$
Thus $\frac{d}{d\epsilon} w(\epsilon)\big|_{\epsilon=0} = 0$ only if $u_1^* w = 0$ (or $\sigma_1 = 0$, but we've assumed otherwise).
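A small numerical check of Proposition 2.1 (my sketch in Python/NumPy, not part of the original notes; it uses NumPy's SVD to get the maximizer $v_1$):

    import numpy as np

    # Check Proposition 2.1: if v is orthogonal to the maximizer v_1,
    # then Av is orthogonal to u_1.
    rng = np.random.default_rng(0)
    A = rng.standard_normal((4, 3))
    U, s, Vt = np.linalg.svd(A)
    v1, u1 = Vt[0], U[:, 0]            # maximizer v_1 and its image direction u_1

    v = rng.standard_normal(3)
    v -= (v @ v1) * v1                 # project out the v_1 component
    print((A @ v) @ u1)                # ~ 1e-16, i.e. zero up to round-off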

Exercise 2.2. Find a non-calculus proof.

Step 2. Find the second most important vector. That is, compute
$$\sigma_2 = \max_{\|v\|=1,\ v^* v_1 = 0} \|Av\|.$$
Note that $\sigma_2 \leq \sigma_1$. If $v_2$ is a maximizer, then set $u_2$ with $\|u_2\| = 1$ via $Av_2 = \sigma_2 u_2$. Note that $\|v_1\| = \|v_2\| = 1$ by choice, and the proposition gives that $u_1^* u_2 = 0$ as $v_1^* v_2 = 0$.

Step k+1. Suppose we have orthonormal $v_1, \dots, v_k$ and $u_1, \dots, u_k$ along with $\sigma_1 \geq \sigma_2 \geq \dots \geq \sigma_k > 0$. Then compute
$$\sigma_{k+1} = \max_{\|v\|=1,\ v^* v_j = 0,\ j = 1, \dots, k} \|Av\|$$
and if $v_{k+1}$ is a maximizer, find $u_{k+1}$ with $\|u_{k+1}\| = 1$ via $Av_{k+1} = \sigma_{k+1} u_{k+1}$. Another way of writing this is as follows. Define $S_k = \mathrm{span}\{v_1, \dots, v_k\}$; then
$$\sigma_{k+1} = \max_{\|v\|=1,\ v \perp S_k} \|Av\|.$$
Again we have a proposition telling us how to proceed.

Proposition 2.3. Suppose $Av_{k+1} = \sigma_{k+1} u_{k+1}$ with $\|u_{k+1}\| = 1$ as in the setup above. Then $u_{k+1}^* u_j = 0$ for $j = 1, \dots, k$.

Proof sketch. If $u_{k+1}^* u_j \neq 0$ then the $v_j$ were not optimal. To see this, set
$$v_j(\epsilon) = \frac{v_j + \epsilon v_{k+1}}{\|v_j + \epsilon v_{k+1}\|}$$
for $j = 1, \dots, k$, and observe
$$\frac{d}{d\epsilon}\|Av_j(\epsilon)\|^2\Big|_{\epsilon=0} \neq 0$$
if $u_j^* u_{k+1} \neq 0$.

Note. It could happen that $Av = 0$ for all $v \perp S_k$. Then take $v_{k+1}, \dots, v_n$ to be arbitrary orthonormal vectors which are orthogonal to $S_k$, and set $\sigma_j = 0$ for $j > k$.

We've proved the following theorem.

Theorem 2.4 (SVD). Let $A$ be an $m \times n$ matrix. Then there are orthonormal vectors $v_1, \dots, v_n$ and $u_1, \dots, u_n$ along with numbers $\sigma_1 \geq \sigma_2 \geq \dots \geq \sigma_n \geq 0$ so that $Av_k = \sigma_k u_k$.

Exercise 2.5. Consider the integral transformation
$$f(x) = \int_0^L K(x, y)\, g(y)\, dy$$

with kernel $K(x, y) = \sin(xy)$. Set $\Delta x = L/n$ and $x_k = k\,\Delta x + \Delta x/2$. Define an $n \times n$ matrix with components $A_{kj} = \Delta x\, K(x_k, y_j)$. Compute its singular values, and in particular compute $\log_{10}(\sigma_j)$. Observe the rate at which $\sigma_j \to 0$.

The theorem can be interpreted as a statement about matrix decomposition.

Theorem 2.6 (SVD Decomposition). Given a matrix $A$ there are orthogonal matrices $V$ and $U$ along with a diagonal matrix $\Sigma$ such that $AV = U\Sigma$, i.e. such that $A = U \Sigma V^*$.

Proof. Find $u_k, v_k, \sigma_k$ as in the previous theorem. If $m > n$ then choose $u_{n+1}, \dots, u_m$ to be orthonormal and orthogonal to $\mathrm{span}\{u_1, \dots, u_n\}$. Set
$$V = \begin{pmatrix} v_1 & \cdots & v_n \end{pmatrix}, \qquad U = \begin{pmatrix} u_1 & \cdots & u_n & u_{n+1} & \cdots & u_m \end{pmatrix}, \qquad \Sigma = \begin{pmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_n \\ & & \end{pmatrix}.$$
The proof is similar for $m < n$.

What are the uses of SVD? Suppose $A = U \Sigma V^*$; then $A^* = V \Sigma^* U^*$ and hence
$$A^* A = V\, \Sigma^* \Sigma\, V^*, \qquad A A^* = U\, \Sigma \Sigma^*\, U^*.$$
So $V$ contains eigenvectors of $A^* A$ and $U$ contains eigenvectors of $A A^*$. And the $\sigma_k^2$ are the non-zero eigenvalues.

Exercise 2.7. Show that the non-zero eigenvalues of $A^* A$ and $A A^*$ are the same.
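A possible numerical sketch of Exercise 2.5 (my code; both the kernel $\sin(xy)$, as reconstructed above, and the choice $L = 1$ are assumptions):

    import numpy as np

    L, n = 1.0, 50
    dx = L / n
    x = (np.arange(n) + 0.5) * dx          # x_k = k*dx + dx/2, k = 0, ..., n-1
    A = dx * np.sin(np.outer(x, x))        # A_kj = dx * K(x_k, y_j)

    sigma = np.linalg.svd(A, compute_uv=False)
    print(np.log10(sigma[:10]))            # the singular values decay very fast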

Here is a second application: the low-rank approximation. Recall $\mathrm{rank}(A) = \dim(R(A))$. If $A$ is rank 1, then $A = x y^*$ for some $x \in \mathbb{R}^m$ and $y \in \mathbb{R}^n$. If $A$ is rank $k$, then $A = \sum_{j=1}^k x_j y_j^*$. The best rank-$k$ approximation to $A$ is the matrix $\sum_{j=1}^k \sigma_j u_j v_j^*$. That is, if $B$ has rank $k$, then
$$\|A - B\| \geq \Big\|A - \sum_{j=1}^k \sigma_j u_j v_j^*\Big\|.$$
There is a proof in the text. Low-rank approximation allows for matrix compression. Indeed,
$$\Big\|A - \sum_{j=1}^k \sigma_j u_j v_j^*\Big\| = \sigma_{k+1},$$
so when the $\sigma_k \to 0$ quickly, we can approximate the action of $A$ with fewer and fewer numbers. Fewer numbers means fewer multiplies to compute $Ax$ to a high degree of accuracy.

Note. The singular values and the eigenvalues of a matrix are very different. Indeed $\lambda_k(A^2) = (\lambda_k(A))^2$, but in general $\sigma_k(A^2) \neq (\sigma_k(A))^2$. To see how different they are, consider
$$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.$$
It's clear from this example that the singular values do not represent the dynamics (although the eigenvalues do).

3. Projection Operators

Definition 3.1. An $m \times m$ matrix $P$ is a projection if $P^2 = P$.

For an arbitrary vector $v$ we can write $v = Pv + (I - P)v$. This is a unique decomposition. Indeed, the subspaces $R(P)$ and $R(I - P)$ are complementary, in that $R(P) \cap R(I - P) = \{0\}$. (Suppose $Px = (I - P)y$; then applying $P$ gives $Px = 0$, and hence the result.) So $\mathbb{C}^n = R(P) \oplus R(I - P)$. Note also that $I - P$ is also a projection, for $(I - P)^2 = I - 2P + P^2 = I - P$ by definition. So we have decomposed the entire space into the direct sum of the range-spaces of (complementary) projections.

There is an important class of projections for which $R(P) \perp R(I - P)$. These are known as orthogonal projections.

Proposition 3.2. A projection $P$ is orthogonal iff $P = P^*$.

Proof. Observe that $((I - P)v)^* P v = v^* (I - P)^* P v$, so if $P = P^*$ then $P$ is an orthogonal projection. For the other direction, recall that if we have $S_1 \oplus S_2 = \mathbb{C}^n$ then there exist orthonormal bases $\{q_1, \dots, q_m\}$ and $\{q_{m+1}, \dots, q_n\}$ for $S_1$ and $S_2$. Now take $R(P) = S_1$ and $R(I - P) = S_2$. Write $v = \sum_{i=1}^n (q_i^* v)\, q_i$ and apply $P$ to get
$$P v = \sum_{i=1}^m (q_i^* v)\, q_i = \sum_{i=1}^m q_i q_i^*\, v.$$
This shows $P = \sum_{i=1}^m q_i q_i^*$ and hence $P = P^*$. [Lecture 2, 9/15/11]
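A small check of Proposition 3.2 (my sketch): $P = \sum_i q_i q_i^*$ over an orthonormal set is a projection with $P = P^*$, and its range is orthogonal to the range of $I - P$.

    import numpy as np

    rng = np.random.default_rng(1)
    Q, _ = np.linalg.qr(rng.standard_normal((6, 3)))   # orthonormal q_1, q_2, q_3
    P = Q @ Q.T                                        # sum of q_i q_i^*
    print(np.allclose(P @ P, P), np.allclose(P, P.T))  # True True
    v = rng.standard_normal(6)
    print((P @ v) @ ((np.eye(6) - P) @ v))             # ~ 0: the ranges are orthogonal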

We can build an orthogonal projector $P$ onto the columns of a given matrix. Suppose $A$ is $n \times m$ with full rank, and suppose $n > m$. Given a vector $v$, we want $(I - P)v \perp R(A)$. Call $Pv = Ax$ for some $x$; then this says $v - Ax \perp R(A)$. Thus
$$A^*(v - Ax) = 0,$$
or just $A^* v = A^* A x$. Now if $x^* A^* A x = 0$, then $\|Ax\|^2 = 0$, so $Ax = 0$ and hence $x = 0$ by full rank. So $A^* A$ is invertible, and hence $x = (A^* A)^{-1} A^* v$. So what should $Pv$ be? Applying $A$ we get
$$P v = A (A^* A)^{-1} A^*\, v.$$

4. The QR Factorization

Suppose $A$ is an $m \times n$ matrix. We look for a factorization $A = QR$ with unitary $Q$ and upper-triangular $R$. Such a factorization is called the QR factorization of $A$. Why should we care about QR? Suppose $A$ is square, and that we want to solve the system $Ax = b$. Given the QR, we can write
$$QRx = b \implies Rx = Q^* b,$$
which is a triangular system of equations. For example, consider a triangular system of equations
$$\begin{pmatrix} r_{11} & r_{12} \\ 0 & r_{22} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}.$$
Then
$$x_2 = \frac{1}{r_{22}}\, c_2, \qquad x_1 = \frac{1}{r_{11}}\,(c_1 - r_{12} x_2).$$
This process generalizes immediately to the $n \times n$ case. And the work to solve a triangular system is exactly proportional to the number of non-zero entries.
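A sketch of the column-space projector $P = A(A^*A)^{-1}A^*$ constructed above (my code; a linear solve is used instead of forming the inverse):

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.standard_normal((7, 3))                 # full rank with probability 1
    P = A @ np.linalg.solve(A.T @ A, A.T)           # P = A (A^T A)^{-1} A^T
    v = rng.standard_normal(7)
    print(np.allclose(A.T @ (v - P @ v), 0))        # v - Pv is perpendicular to R(A)
    print(np.allclose(P @ A, A))                    # P fixes the columns of A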

4.1. Gram-Schmidt. Let's do a computation. Suppose $q_1, q_2$ are orthogonal unit vectors. Then we can write a projection
$$I - q_1 q_1^* - q_2 q_2^*$$
which removes the components in the $q_1$- and $q_2$-directions. Now

(4.1)  $I - q_1 q_1^* - q_2 q_2^* = (I - q_2 q_2^*)(I - q_1 q_1^*)$.

This observation gives us two ways in which to compute the QR of a matrix, the first of which is Gram-Schmidt. Suppose $A$ is full rank and $n \times m$. Let the vectors $a_1, \dots, a_m$ be the columns of $A$. The Gram-Schmidt process from basic linear algebra is:

(1) Set $q_1 = a_1/\|a_1\|$ and define $r_{11}$ so that $a_1 = r_{11} q_1$.
(2) Set $r_{12} = q_1^* a_2$ and $r_{22} = \|a_2 - r_{12} q_1\|$. Then $a_2 = r_{12} q_1 + r_{22} q_2$.
(3) Continue.

So we arrive at the decomposition
$$\begin{pmatrix} a_1 & \cdots & a_m \end{pmatrix} = \begin{pmatrix} q_1 & \cdots & q_m \end{pmatrix} \begin{pmatrix} r_{11} & r_{12} & \cdots & r_{1m} \\ & r_{22} & \cdots & r_{2m} \\ & & \ddots & \vdots \\ & & & r_{mm} \end{pmatrix}.$$
Call
$$\hat{Q} = \begin{pmatrix} q_1 & \cdots & q_m \end{pmatrix}$$
and $\hat{R}$ the triangular factor above; then $A = \hat{Q}\hat{R}$ is called the reduced QR decomposition of $A$. Note that $\hat{Q}$ may not be unitary, as we could have $m < n$. But pick up $q_{m+1}, \dots, q_n$ which are orthogonal to $q_1, \dots, q_m$ and of unit norm. Then define the $n \times n$ matrix
$$Q = \begin{pmatrix} \hat{Q} & q_{m+1} & \cdots & q_n \end{pmatrix}$$
and the $n \times m$ matrix
$$R = \begin{pmatrix} \hat{R} \\ 0 \end{pmatrix};$$
then $A = QR$ is the full QR decomposition.

As it turns out, the algorithm prescribed above is numerically unstable. Instead, consider the following procedure, motivated by the second equality in equation (4.1):

(1) Set $q_1 = a_1/\|a_1\|$.
(2) Normalize $(I - q_1 q_1^*)\, a_2$ to obtain $q_2$.
(3) Normalize $(I - q_2 q_2^*)(I - q_1 q_1^*)\, a_3$ to obtain $q_3$.
(4) Continue.

This algorithm generates the same decomposition, but as it will turn out, it is numerically stable. We'll see why later.
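The two procedures can be compared numerically. Below is a minimal sketch (my code, not the notes'); on ill-conditioned columns the classical variant visibly loses orthogonality, while the modified variant does much better:

    import numpy as np

    def gram_schmidt(A, modified=True):
        A = A.astype(float).copy()
        n, m = A.shape
        Q, R = np.zeros((n, m)), np.zeros((m, m))
        for k in range(m):
            if modified:                   # orthogonalize against one q_j at a time
                for j in range(k):
                    R[j, k] = Q[:, j] @ A[:, k]
                    A[:, k] -= R[j, k] * Q[:, j]
            else:                          # classical: all projections off original a_k
                R[:k, k] = Q[:, :k].T @ A[:, k]
                A[:, k] -= Q[:, :k] @ R[:k, k]
            R[k, k] = np.linalg.norm(A[:, k])
            Q[:, k] = A[:, k] / R[k, k]
        return Q, R

    A = np.vander(np.linspace(0, 1, 12), 8, increasing=True)  # ill-conditioned columns
    for flag in (False, True):
        Q, _ = gram_schmidt(A, modified=flag)
        print(np.linalg.norm(Q.T @ Q - np.eye(8)))  # modified is markedly smaller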

4.2. Householder Transformations. There is a third way to find the QR factorization, via Householder transformations. Suppose $A$ is an $m \times n$ matrix with $m \geq n$. The idea is to produce unitary matrices $Q_i$ such that $Q_n \cdots Q_1 A$ is upper triangular. How will we achieve this? Fix a vector $v$ and consider the matrix
$$I - 2\,\frac{v v^*}{v^* v}.$$
This is not a projector, but observe
$$\left(I - 2\frac{v v^*}{v^* v}\right)^*\left(I - 2\frac{v v^*}{v^* v}\right) = I - 2\frac{v v^*}{v^* v} - 2\frac{v v^*}{v^* v} + 4\,\frac{v v^* v v^*}{(v^* v)^2} = I,$$
so it is unitary. Our goal is QR. So we should choose $v$ so that
$$\left(I - 2\frac{v v^*}{v^* v}\right) a_1 = \begin{pmatrix} \pm\|a_1\| \\ 0 \\ \vdots \\ 0 \end{pmatrix}.$$
Since $\frac{v v^*}{v^* v}$ is a projector, it's easy to see that $I - 2\frac{v v^*}{v^* v}$ reflects $a_1$ in the plane perpendicular to $v$ (think of it geometrically). So we define $Q_1$ to be the unitary matrix $I - 2\frac{v_1 v_1^*}{v_1^* v_1}$ with
$$v_1 = a_1 \mp \|a_1\|\, e_1 = \begin{pmatrix} a_{11} \mp \|a_1\| \\ a_{21} \\ a_{31} \\ \vdots \\ a_{m1} \end{pmatrix}.$$
We have yet to say how to choose the sign: it is best numerically to choose $+\|a_1\|$ if $a_{11} < 0$ and $-\|a_1\|$ if $a_{11} > 0$. (Subtracting two nearly equal numbers is numerically unstable.) So now we have
$$Q_1 A = \begin{pmatrix} * & * & \cdots & * \\ 0 & * & \cdots & * \\ \vdots & \vdots & & \vdots \\ 0 & * & \cdots & * \end{pmatrix}$$
with new variables in the trailing $(m-1) \times (n-1)$ block. We want
$$Q_2 Q_1 A = \begin{pmatrix} * & * & \cdots & * \\ 0 & * & \cdots & * \\ \vdots & 0 & * & * \\ 0 & \vdots & \vdots & \vdots \end{pmatrix},$$
so we set
$$Q_2 = \begin{pmatrix} 1 & 0 \\ 0 & I - 2\frac{v v^*}{v^* v} \end{pmatrix}$$
for an appropriate $v \in \mathbb{C}^{m-1}$. And so on. After producing all the $Q_i$ we have
$$Q_n \cdots Q_1 A = R,$$
and hence $A = Q_1^* \cdots Q_n^* R$.
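A minimal Householder QR sketch along these lines (my code; the sign choice follows the rule above):

    import numpy as np

    def householder_qr(A):
        A = A.astype(float).copy()
        m, n = A.shape
        vs = []
        for k in range(n):
            x = A[k:, k]
            v = x.copy()
            v[0] += np.sign(x[0]) * np.linalg.norm(x)   # sign choice avoids cancellation
            v /= np.linalg.norm(v)
            vs.append(v)
            A[k:, k:] -= 2.0 * np.outer(v, v @ A[k:, k:])
        return vs, np.triu(A)       # reflectors (implicit Q) and R

    rng = np.random.default_rng(3)
    A = rng.standard_normal((5, 3))
    vs, R = householder_qr(A)
    print(np.round(R, 10))          # entries below the diagonal are zero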

Finally, consider solving $Ax = b$ for $x$. We have $Rx = Q^* b$, so we only need to compute $Q^* b$. We could first compute $Q^* = Q_n \cdots Q_1$ and then compute $Q^* b$. Will this save space on a computer? In fact the matrix-matrix multiplies required here will waste space. Instead, we should compute $Q_1 b$, then $Q_2(Q_1 b)$, then $Q_3(Q_2(Q_1 b))$, and so on. This step-by-step matrix-vector multiplication is much more efficient.

5. Least-Squares

We can apply the methods of this chapter to the age-old least-squares problem. Suppose we want to solve $Ax = b$ with $A$ an $m \times n$ matrix, $m > n$. From linear algebra, we know the best solution $x$ will satisfy $A^*(Ax - b) = 0$, i.e.
$$A^* A x = A^* b.$$
If $A$ is of full rank, then $x = (A^* A)^{-1} A^* b$. Is there a more computationally efficient way to find $x$? [Lecture 3, 9/27/11]

One way is to use QR. We saw three ways to compute the QR decomposition of a matrix $A$. The third method involved the Householder transformation $I - 2\frac{v v^*}{v^* v}$, which reflects perpendicular to $v$. The first step was to set $x$ equal to the first column of $A$ and $v = \mathrm{sgn}(x_1)\,\|x\|\, e_1 + x$, and then apply $I - 2\frac{v v^*}{v^* v}$ to $A$. Why did we choose to write $\mathrm{sgn}(x_1)$ in the definition of $v$? This prevents the possibility of subtracting two close numbers, which (as we'll see) would introduce a large amount of round-off error. To compute QR we also need to compute $\|x\| = \sqrt{\sum x_i^2}$. Is it possible to do this accurately? Can we even compute $\sqrt{\alpha}$ accurately? Below, we discuss these issues; then, we'll solve least-squares.

5.1. Introduction to Error and Stability. Computation introduces errors, and an important measure of an algorithm is the amount of relative error it introduces. Why are we interested in relative error? It gives us an idea of the number of reliable digits. A number is represented in binary form on a computer, in a so-called floating point system. For example, a number in the IEEE double format is a sequence
$$\pm\ a_1 \dots a_{11}\ b_1 \dots b_{52}$$
where each entry is one or zero. (See the handout for details.) Now suppose two numbers are stored in a given floating point system. All of the usual operations $+, -, \times, \div, \sqrt{\;}$ can be performed on these numbers, but the result may not be in the system. Of course, if the result is contained in the system then it is stored as such; otherwise, the computer chooses one of the closest numbers in the system to represent the result. Another way of saying this is that, in floating point, all the usual operations round to the last digit. We need a way to analyze the errors which are inherent to such a setup. If $x$ is a number, we denote the floating point representation of $x$ as
$$(x)_{\mathrm{FL}} = x\,(1 + \epsilon)$$
where $|\epsilon| \lesssim 10^{-16}$.

So the operation $+$ has as its floating point analogue the relation
$$(x + y)_{\mathrm{FL}} = (x + y)(1 + \epsilon).$$
This carries over immediately to the rest of the usual operations. If $\epsilon = 0$ in this representation, then $x + y$ is in the floating point system. If $\epsilon \neq 0$, then the representation is not exact. This is exactly what leads to non-zero relative error: the relative error introduced by the operation of addition is the quantity
$$\text{relative error} = \frac{|(x+y)_{\mathrm{FL}} - (x+y)|}{|x+y|} = |\epsilon|.$$
What if we try to add three numbers? The relative error introduced would then be
$$\frac{|((x+y)(1+\epsilon_1) + z)(1+\epsilon_2) - (x+y+z)|}{|x+y+z|} = \frac{|(x+y)\epsilon_1 + (x+y+z)\epsilon_2 + (x+y)\epsilon_1\epsilon_2|}{|x+y+z|} \approx \frac{|(x+y)\epsilon_1 + (x+y+z)\epsilon_2|}{|x+y+z|},$$
since $\epsilon_1 \epsilon_2$ will be small. Now observe that if we do not know the signs of $x, y, z$, then we cannot produce a bound on the error. But if $x, y, z$ all share the same sign, the error will be at most $\epsilon_1 + \epsilon_2 + \epsilon_1 \epsilon_2$.

What about a general function $f$? Even if $x + \Delta x$ is a representation of $x$ with small error $\Delta x$, $f(x + \Delta x)$ may have a large error relative to the expected result $f(x)$. The amount by which $f$ magnifies the relative error in $x$ is
$$\text{error magnification} = \frac{\left|\frac{f(x+\Delta x) - f(x)}{f(x)}\right|}{\left|\frac{\Delta x}{x}\right|} \approx \frac{|f'(x)\,\Delta x|\,|x|}{|f(x)|\,|\Delta x|} = \left|\frac{f'(x)\, x}{f(x)}\right|.$$
The smaller this magnification is, the better the resulting computation will be. If the error magnification associated with applying $f$ is small, we say $f$ is numerically stable.

Examples 5.1.
(1) $f(x) = \sqrt{x}$ has $\left|\frac{f'(x)\,x}{f(x)}\right| = \frac{1}{2}$, which is good. So $\sqrt{x}$ preserves relative errors (it does not magnify them), and hence is stable.
(2) Similarly, $\|x\| = \sqrt{\sum x_i^2}$ is stable.
(3) $f(x) = e^x$ has $\left|\frac{f'(x)\,x}{f(x)}\right| = |x|$, which is not good for large $x$.
(4) $f(x) = 1 - x$ has $\left|\frac{f'(x)\,x}{f(x)}\right| = \left|\frac{x}{1-x}\right| \to \infty$ as $x \to 1$. This really shows that subtraction on a computer is unstable!
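A tiny numerical illustration of example (4) (my sketch; the numbers are arbitrary):

    # f(x) = 1 - x magnifies relative error by |x/(1-x)|; near x = 1 this is huge.
    x = 0.999999999999          # 1 - 1e-12
    dx = 2e-16 * x              # about one rounding error's worth of input perturbation
    rel_in = dx / x
    rel_out = abs((1 - (x + dx)) - (1 - x)) / abs(1 - x)
    print(rel_out / rel_in)     # roughly 1e12, matching |x/(1-x)|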

5.2. Solution of Least-Squares. Suppose $A$ is a full rank matrix. The problem is to solve $Ax = b$ as best as we can. If $b \in R(A)$ this is solvable exactly; but what if $b \notin R(A)$? Then the problem is to find $x$ which minimizes $\|Ax - b\|_2$. Such an $x$ is as close as we can get (in $\|\cdot\|_2$) to a solution of $Ax = b$.

Here is a first attempt at a solution method (the normal equations). Observe that $x$ is a minimizer iff $A^*(Ax - b) = 0$, since then $Ax - b$ will be perpendicular to the column space of $A$. Thus the minimizer is
$$x = (A^* A)^{-1} A^* b,$$
i.e. the unique solution to $A^* A x = A^* b$. The advantage of this approach from a computing standpoint is that $A^* A$ is a small matrix in general. However, computing with $A^* A$ can be problematic due to round-off error. That is, even though $A$ has full rank, the floating point version of $A^* A$ can be more singular.

There are other options, the most standard of which is via the QR factorization. Let $A$ be an $m \times n$ matrix with $m > n$, and suppose $A$ has full rank. Then write $A = QR$ where
$$R = \begin{pmatrix} \hat{R} \\ 0 \end{pmatrix}$$
with $\hat{R}$ a non-singular $n \times n$ matrix ($A$ is of full rank). Then
$$\|QRx - b\| = \|Rx - Q^* b\|,$$
and if we write
$$Q^* b = \begin{pmatrix} c \\ \hat{c} \end{pmatrix}$$
with $c \in \mathbb{R}^n$ and $\hat{c} \in \mathbb{R}^{m-n}$, then the minimizer is exactly the solution of $\hat{R} x = c$, an upper triangular (read: easily solvable) system of equations. We also get that the minimum error is
$$\|Rx - Q^* b\| = \|\hat{c}\|.$$
A clear advantage of this solution method is that it does not require computation of $A^* A$.

A second option is via the SVD factorization. Then we have $A = U \Sigma V^*$, and once we call $y = V^* x$,
$$\|Ax - b\| = \|U^*(U \Sigma V^* x - b)\| = \|\Sigma V^* x - U^* b\| = \|\Sigma y - U^* b\|.$$
Now if again
$$U^* b = \begin{pmatrix} c \\ \hat{c} \end{pmatrix}$$
with $c \in \mathbb{R}^n$ and $\hat{c} \in \mathbb{R}^{m-n}$, the minimizing $y$ is the solution of $\Sigma y = c$, which is trivial to solve, and then the minimizing $x$ is exactly $x = V y$. The SVD often offers a fairly stable way to solve problems numerically.
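Both stable routes can be sketched in a few lines (my code, with made-up data):

    import numpy as np

    rng = np.random.default_rng(4)
    A, b = rng.standard_normal((10, 4)), rng.standard_normal(10)

    # via QR: solve R_hat x = c, where c is the first n entries of Q^T b
    Q, R = np.linalg.qr(A)                 # reduced QR
    x_qr = np.linalg.solve(R, Q.T @ b)

    # via SVD: Sigma y = U^T b, then x = V y
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    x_svd = Vt.T @ ((U.T @ b) / s)

    print(np.allclose(x_qr, x_svd))        # both give the least-squares minimizer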

5.3. Data Fitting. Suppose we have data $(x_i, y_i)$ with $i = 1, \dots, m$. The problem is to find $k, l$ so that $y = kx + l$ best approximates the data. More generally, the problem is to find $a_0, a_1, \dots, a_n$ so that $y = a_0 + a_1 x + \dots + a_n x^n$ best approximates the data. If there are more measurements than unknowns, an exact fit is most likely impossible. But consider the system of equations
$$a_0 + a_1 x_1 + \dots + a_n x_1^n = y_1$$
$$\vdots$$
$$a_0 + a_1 x_m + \dots + a_n x_m^n = y_m.$$
This can be written in matrix-vector form as $Ax = y$ with
$$A = \begin{pmatrix} 1 & x_1 & \cdots & x_1^n \\ 1 & x_2 & \cdots & x_2^n \\ \vdots & & & \vdots \\ 1 & x_m & \cdots & x_m^n \end{pmatrix}, \qquad x = \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{pmatrix}, \qquad y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix}.$$
Although $A$ may not be invertible, it will be full-rank if $x_i \neq x_j$ for $i \neq j$. So the methods developed above will work to solve for the coefficients $a_0, \dots, a_n$ which give the best fit of the data.

Here is a related problem. Suppose we want to solve $Ax = b$ but $N(A)$ is non-trivial. If $A^*$ is of full rank, then we can solve
$$A A^* y = b$$
exactly. Then the general solution to $Ax = b$ will be of the form $x = A^* y + z$ for some $z \in N(A)$. And since $R(A^*) \perp N(A)$, we see that the solution set is
$$A^{-1}(\{b\}) = A^* y + N(A) \subset R(A^*) \oplus N(A).$$
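A sketch of such a polynomial fit (my code; the data and the degree are made up):

    import numpy as np

    # Fit y ~ a_0 + a_1 x + ... + a_n x^n by least squares, as described above.
    x = np.linspace(0, 1, 20)
    y = np.cos(3 * x) + 0.01 * np.random.default_rng(5).standard_normal(20)

    n = 3
    A = np.vander(x, n + 1, increasing=True)    # rows [1, x_i, ..., x_i^n]
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    print(coeffs)                               # a_0, ..., a_n of the best fit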

CHAPTER 2

Interpolation by Polynomials

An important observation in the development of numerical techniques is that many functions can be well-approximated by polynomials. The first step in this direction is polynomial interpolation. Given distinct points $x_0, x_1, \dots, x_n$ and values $y_0, y_1, \dots, y_n$, the problem is to identify an $n$th order polynomial $p_n(x) = a_0 + a_1 x + \dots + a_n x^n$ which interpolates, i.e. which has $p_n(x_i) = y_i$ for $i = 0, \dots, n$. This can be posed as a classical matrix problem: find $a_0, \dots, a_n$ which satisfy
$$\begin{pmatrix} 1 & x_0 & \cdots & x_0^n \\ \vdots & & & \vdots \\ 1 & x_n & \cdots & x_n^n \end{pmatrix} \begin{pmatrix} a_0 \\ \vdots \\ a_n \end{pmatrix} = \begin{pmatrix} y_0 \\ \vdots \\ y_n \end{pmatrix}.$$
Note that the matrix on the left is a special type of matrix known as a Vandermonde matrix. How can we show there is a solution? One way is to look at the Vandermonde determinant. Or we can solve it directly by writing
$$p_n(x) = \sum_{i=0}^n y_i\, l_i(x)$$
with $l_i(x_j) = \delta_{ij}$. The $l_i$'s here are the Lagrange polynomials,
$$l_i(x) = \frac{\Pi_{k \neq i}(x - x_k)}{\Pi_{k \neq i}(x_i - x_k)}.$$
This method indirectly shows the invertibility of the Vandermonde matrix. Thus there exists a solution. How should we find it numerically? We could try to invert the Vandermonde matrix, but this can be painful and often costs $O(n^3)$ time. Also, suppose we were to compute the interpolating $a_0, \dots, a_n$ corresponding to data $(x_i, y_i)_{i=0,\dots,n}$. If we add even a single data point, we would have to find all of the $a_i$'s over again. Can we solve this problem in a way which allows us to add additional data?

1. Newton Interpolation

Polynomials can always be rewritten around an arbitrary center point:
$$p_n(x) = a_0^{\mathrm{I}} + a_1^{\mathrm{I}}(x - c) + \dots + a_n^{\mathrm{I}}(x - c)^n.$$

This is just a shifted power series for $p_n$, but it is good to compute with when $x - c$ is small. In general, the Newton form of a polynomial is
$$p_n(x) = a_0^{\mathrm{II}} + a_1^{\mathrm{II}}(x - c_1) + a_2^{\mathrm{II}}(x - c_1)(x - c_2) + \dots + a_n^{\mathrm{II}}(x - c_1)\cdots(x - c_n)$$
$$= a_0^{\mathrm{II}} + (x - c_1)\left(a_1^{\mathrm{II}} + (x - c_2)\left(a_2^{\mathrm{II}} + a_3^{\mathrm{II}}(x - c_3) + \dots\right)\right).$$
This last line is the skeleton for fast polynomial evaluation. In this respect, the Newton form is superior to the Lagrange form of a polynomial. As we'll see next, the Newton form is also superior when it comes to interpolation.

Suppose we are given data $(x_i, f_i)_{i=0,\dots,n}$ and we want to find coefficients $A_0, A_1, \dots, A_n$ so that
$$p_n(x) = A_0 + A_1(x - x_0) + A_2(x - x_0)(x - x_1) + \dots + A_n(x - x_0)\cdots(x - x_{n-1})$$
interpolates. Plugging $x_0$ into the relation above yields $A_0 = f_0$; plugging in $x_1$ yields $A_1 = \frac{f_1 - f_0}{x_1 - x_0}$. In general we'll have

(1.1)  $A_k = f[x_0, \dots, x_k] = \frac{f[x_1, \dots, x_k] - f[x_0, \dots, x_{k-1}]}{x_k - x_0}$,

and so we call $A_k = f[x_0, \dots, x_k]$ the $k$th divided difference. Computing $A_k$ is an iterative process, as depicted in the table below.

    x_0   f[x_0]
                     f[x_0, x_1]
    x_1   f[x_1]                     f[x_0, x_1, x_2]
                     f[x_1, x_2]                         f[x_0, x_1, x_2, x_3]
    x_2   f[x_2]                     f[x_1, x_2, x_3]
                     f[x_2, x_3]                         f[x_1, x_2, x_3, x_4]
    x_3   f[x_3]                     f[x_2, x_3, x_4]
                     f[x_3, x_4]
    x_4   f[x_4]

Table 1. Newton's triangular table.

How do we know the $A_k$ satisfy (1.1)? Suppose $p_{k-1} \in P_{k-1}$ has $p_{k-1}(x_i) = f_i$ for $i = 0, \dots, k-1$, and $q_{k-1} \in P_{k-1}$ has $q_{k-1}(x_i) = f_i$ for $i = 1, \dots, k$. Then construct
$$p_k(x) = \frac{x - x_0}{x_k - x_0}\, q_{k-1}(x) + \frac{x_k - x}{x_k - x_0}\, p_{k-1}(x),$$
which satisfies $p_k(x_i) = f_i$ for $i = 0, \dots, k$. Now write
$$p_k(x) = A_0 + \dots + A_k x^k, \qquad p_{k-1}(x) = B_0 + \dots + B_{k-1} x^{k-1}, \qquad q_{k-1}(x) = C_0 + \dots + C_{k-1} x^{k-1},$$
and track the leading coefficients to conclude
$$A_k = \frac{C_{k-1} - B_{k-1}}{x_k - x_0},$$
which is the result we are after.
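A sketch of the divided-difference table (1.1) together with the nested evaluation above (my code, not the notes'):

    import numpy as np

    def newton_coeffs(x, f):
        a = np.array(f, dtype=float)           # column 0 of the triangular table
        n = len(x)
        for k in range(1, n):                  # a[i] becomes f[x_{i-k}, ..., x_i]
            a[k:] = (a[k:] - a[k - 1:-1]) / (x[k:] - x[:-k])
        return a                               # a[k] = f[x_0, ..., x_k]

    def newton_eval(a, x, t):
        p = a[-1]
        for k in range(len(a) - 2, -1, -1):    # the Horner-like skeleton above
            p = a[k] + (t - x[k]) * p
        return p

    x = np.array([0.0, 1.0, 2.0, 4.0])
    f = x**3 - x                               # a cubic is reproduced exactly
    a = newton_coeffs(x, f)
    print(newton_eval(a, x, 3.0), 3.0**3 - 3.0)   # both print 24.0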

We have just developed the following result:

Proposition 1.1. The interpolating polynomial of $n$th degree is
$$p_n(x) = \sum_{i=0}^n f[x_0, \dots, x_i]\, \Pi_{j=0}^{i-1}(x - x_j).$$
In particular, the linear interpolant is
$$p_1(x) = f_0 + \frac{f(x_1) - f(x_0)}{x_1 - x_0}\,(x - x_0).$$

By construction, the quadratic interpolant $p_2$ includes $p_1$ in its description. More generally, $p_n$ includes $p_i$ for all $i < n$ in its description. So the formula above is recursive: it gives a way to systematically build up higher order interpolants of given data. Also, note that the ordering of the points $x_0, \dots, x_n$ in the description above is irrelevant; interpolating polynomials are uniquely determined without regard to ordering of the data. This last observation can be easily turned into:

Proposition 1.2. Divided differences are invariant under permutation of the data. That is, given any permutation $\sigma$ on the set $\{0, \dots, n\}$,
$$f[x_0, \dots, x_n] = f[x_{\sigma(0)}, \dots, x_{\sigma(n)}].$$

2. Hermite Polynomials

Now suppose we are given data in the form $(x_i, f_i, f_i')$. Can we interpolate with a polynomial that matches both the desired function values and the corresponding derivatives? As a first example, consider the simple case $(x_i, f_i, f_i')_{i=0,1}$ where we are given only four pieces of data. For smooth enough $f$ and small $\epsilon$, this problem is close to the problem of interpolating the data set
$$\{(x_0, f(x_0)),\ (x_0 + \epsilon, f(x_0 + \epsilon)),\ (x_1, f(x_1)),\ (x_1 + \epsilon, f(x_1 + \epsilon))\},$$
which we would interpolate with
$$p_3^\epsilon(x) = A_0^\epsilon + A_1^\epsilon(x - x_0) + A_2^\epsilon(x - x_0)(x - (x_0 + \epsilon)) + A_3^\epsilon(x - x_0)(x - (x_0 + \epsilon))(x - x_1)$$
in the Newton scheme. The superscripts on the $A_i$'s indicate that the coefficients in the interpolation depend on $\epsilon$. We can represent this interpolation with the table below. Now we want to take $\epsilon \to 0$ and get an interpolant for the original data set $(x_i, f_i, f_i')_{i=0,1}$. Consider the divided difference
$$A_1(\epsilon) = f[x_0, x_0 + \epsilon] = \frac{f[x_0 + \epsilon] - f[x_0]}{\epsilon}.$$
As $\epsilon \to 0$, we recover $\lim_{\epsilon \to 0} A_1(\epsilon) = f_0'$. We can make the suggestive labeling
$$f[x_0, x_0] = \lim_{\epsilon \to 0} \frac{f[x_0 + \epsilon] - f[x_0]}{\epsilon};$$

    x_0       f[x_0]
                            f[x_0, x_0 + ε]
    x_0 + ε   f[x_0 + ε]                        f[x_0, x_0 + ε, x_1]
                            f[x_0 + ε, x_1]
    x_1       f[x_1]                            f[x_0 + ε, x_1, x_1 + ε]
                            f[x_1, x_1 + ε]
    x_1 + ε   f[x_1 + ε]

Table 2. Newton scheme with small ε.

then we have the relation $f[x_0, x_0] = f'(x_0)$. Similarly, $f[x_1, x_1] = f'(x_1)$. Now if we let $\epsilon \to 0$ in our interpolation, we recover

(2.1)  $p_3^0(x) = \lim_{\epsilon \to 0} p_3^\epsilon(x) = f[x_0] + f[x_0, x_0](x - x_0) + f[x_0, x_0, x_1](x - x_0)^2 + f[x_0, x_0, x_1, x_1](x - x_0)^2(x - x_1)$.

This is known as a Hermite polynomial and has the following table:

    x_0   f[x_0]
                     f[x_0, x_0]
    x_0   f[x_0]                     f[x_0, x_0, x_1]
                     f[x_0, x_1]                         f[x_0, x_0, x_1, x_1]
    x_1   f[x_1]                     f[x_0, x_1, x_1]
                     f[x_1, x_1]
    x_1   f[x_1]

Table 3. Cubic Hermite polynomial.

Finally, we ask: does $p_3^0$ interpolate the given data $(x_i, f_i, f_i')_{i=0,1}$? It's clear from the table that $p_3^0(x_i) = f(x_i)$ for $i = 0, 1$. But also, we have
$$\frac{d}{dx}\left(p_3^0\right)(x_0) = f[x_0, x_0] \qquad\text{and}\qquad \frac{d}{dx}\left(p_3^0\right)(x_1) = f[x_1, x_1].$$
(The first is clear from (2.1); the second follows once we rewrite $p_3^0$ around the points $x_1, x_1, x_0$, for then the coefficient on $(x - x_1)$ will be $f[x_1, x_1]$. Both are clear from the table.)

Since $f[x_0, x_0] = f_0'$ and $f[x_1, x_1] = f_1'$, $p_3^0$ is the desired interpolating polynomial.

The discussion above showed how to arrive at cubic Hermite interpolation as a limit of approximating Newton schemes. This process generalizes immediately to arbitrary data sets of the form $(x_i, f_i, f_i')$, which are interpolated by higher order Hermite polynomials. [Lecture 4, 9/29/11]

3. Interpolation Error

Recall the expansion
$$p_n(x) = \sum_{i=0}^n f[x_0, \dots, x_i]\, \Pi_{j=0}^{i-1}(x - x_j)$$
for the $n$th order interpolating polynomial. This almost looks like a Taylor series, in that the coefficients are almost derivatives. So to estimate interpolation error, we proceed in a similar way as for Taylor series (where the $n$th order error is bounded in terms of the $(n+1)$th derivative).

Suppose $p_n$ interpolates for $f$. Then the error is $e_n(x) = f(x) - p_n(x)$. Given $x \neq x_i$, consider the $(n+1)$th order interpolant
$$p_{n+1}(x') = p_n(x') + f[x_0, x_1, \dots, x_n, x]\, \Pi_{j=0}^n(x' - x_j).$$
In particular, this interpolates at $x$, so that $p_{n+1}(x) = f(x)$. Thus
$$e_n(x) = f(x) - p_n(x) = f[x_0, x_1, \dots, x_n, x]\, \Pi_{j=0}^n(x - x_j)$$
is an exact expression for the $n$th interpolation error. How can we tell if the error is big or small? As predicted above, the size of the $n$th divided difference for $f$ depends on the higher derivatives of $f$. But also, the error depends on the distribution of the points $x_i$. Of course we cannot always choose the $x_i$ in practice, but if we can, it would be advantageous to know the best points $x_i$ to use (the so-called Chebyshev points).

Proposition 3.1. The $n$th interpolation error satisfies
$$e_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!}\, \Pi_{j=0}^n(x - x_j)$$
for some $\xi$ between the interpolating points. This follows from:

Theorem 3.2. The $k$th divided difference satisfies
$$f[x_0, \dots, x_k] = \frac{f^{(k)}(\xi)}{k!}$$
for a $\xi$ between the smallest and largest of the points $x_i$.

Proof. For $k = 1$,
$$\frac{f(x_1) - f(x_0)}{x_1 - x_0} = f'(\xi)$$
for some $\xi$, by the mean value theorem. Now consider $e_k(x) = f(x) - p_k(x)$, which has $e_k(x_i) = 0$ for $i = 0, \dots, k$. So $e_k'(\xi) = 0$ between consecutive $x_i$'s, by Rolle's theorem. Thus $f'(x) - p_k'(x)$

vanishes at at least $k$ points. Similarly, $f''(x) - p_k''(x)$ vanishes at at least $k - 1$ points. Going forwards, we see there must exist $\xi$ so that
$$f^{(k)}(\xi) - p_k^{(k)}(\xi) = 0.$$
Now $\Pi_{j=0}^{k-1}(x - x_j) = x^k + \text{lower order terms}$, and thus
$$f^{(k)}(\xi) = k!\, f[x_0, \dots, x_k].$$
This proves the theorem.

4. Piecewise Interpolation

Thus far, we have only discussed interpolation of data with a single polynomial. Now, we ask: when does a single polynomial fail to be a satisfactory interpolant? In practice, high order approximations turn out to be quite bad. One silly example can be found in MATLAB: the authors computed a high order polynomial interpolant for a large amount of census data, and concluded a negative population. A more classic example is due to Runge.

Example 4.1 (Runge's phenomenon). Take $f(x) = \frac{1}{1 + x^2}$ and interpolate on $[-5, 5]$ at equidistant points. Now increase the number of points, and so the order of approximation. Observe how bad the approximation becomes!
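A possible numerical sketch of Example 4.1 (my code; np.polyfit is used here merely as a convenient interpolation routine, and may warn about conditioning at higher degrees):

    import numpy as np

    f = lambda x: 1.0 / (1.0 + x**2)
    t = np.linspace(-5, 5, 1001)
    for n in (5, 10, 15):
        x = np.linspace(-5, 5, n + 1)
        p = np.polyfit(x, f(x), n)           # degree-n interpolant through n+1 points
        print(n, np.max(np.abs(np.polyval(p, t) - f(t))))   # the error grows with n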

But what if an application demands interpolating at many points? Polynomials have the wonderful property of having infinitely many derivatives. But given a large data set, a better approach is to use piecewise polynomial interpolants. The simplest case is the piecewise linear, continuous approximation. This has the obvious disadvantage of kinks. In what follows, we'll develop a way to smooth out the kinks, via so-called cubic splines. As we'll see, these are piecewise cubic, $C^2$ approximations to the data. But first, recall the cubic Hermite polynomials, built to match function values and first derivatives:
$$p_3(x) = f_0 + f'(x_0)(x - x_0) + \frac{f[x_0, x_1] - f'(x_0)}{x_1 - x_0}(x - x_0)^2 + \frac{f'(x_0) - 2 f[x_0, x_1] + f'(x_1)}{(x_1 - x_0)^2}(x - x_0)^2(x - x_1).$$
This polynomial interpolates the data $(x_i, f(x_i), f'(x_i))_{i=0,1}$.

Now we build the cubic spline. If $x_i$ is an interpolation point, we are forced to match $f(x_i)$; however, we are free to choose the derivatives $s_i$ there. Since we desire a $C^2$ (piecewise) approximation, the obvious thing to do is to choose the $s_i$ so that second derivatives match at interpolation points. Explicitly, suppose we have two Hermite cubic polynomials which interpolate to the left and right of $x_i$. On the right we have
$$p_i(x) = f(x_i) + s_i(x - x_i) + \frac{f[x_i, x_{i+1}] - s_i}{x_{i+1} - x_i}(x - x_i)^2 + \frac{s_i - 2 f[x_i, x_{i+1}] + s_{i+1}}{(x_{i+1} - x_i)^2}(x - x_i)^2(x - x_{i+1})$$
and similarly on the left. Second derivative matching is $p_i''(x_i) = p_{i-1}''(x_i)$, and if we denote $\Delta x_i = x_{i+1} - x_i$ then (after some work) we arrive at the requirement
$$\Delta x_i\, s_{i-1} + 2(\Delta x_{i-1} + \Delta x_i)\, s_i + \Delta x_{i-1}\, s_{i+1} = 3\left(f[x_i, x_{i-1}]\,\Delta x_i + f[x_i, x_{i+1}]\,\Delta x_{i-1}\right).$$
Say $x_0, \dots, x_n$ are the interpolating points, arranged in increasing order. Then at $x_1, \dots, x_{n-1}$ we enforce the relation above. This is a system of $n - 1$ equations. But there are $n + 1$ unknowns, $s_0, s_1, \dots, s_n$. The missing information is at the endpoints. If $s_0, s_n$ are known a priori, then we'll have the same number of equations as unknowns and the problem is solvable. Observe that the problem is nonlocal: there is coupling between neighboring intervals. But as the coupling itself is local, the problem is not hard to solve. To see why, consider the matrix equation corresponding to the system: it is of the form
$$J \begin{pmatrix} s_1 \\ \vdots \\ s_{n-1} \end{pmatrix} = \begin{pmatrix} \cdot \\ \vdots \\ \cdot \end{pmatrix}.$$
This matrix $J$ is tridiagonal, and we claim it is always invertible. Suppose there exists $y \neq 0$ with $Jy = 0$. Then identify the largest component $y_k$ of $y$, i.e. with $|y_k| \geq |y_i|$ for all $i$. Then
$$2(\Delta x_i + \Delta x_{i-1})\, y_k = -\Delta x_i\, y_{k-1} - \Delta x_{i-1}\, y_{k+1},$$
but
$$|\Delta x_i\, y_{k-1} + \Delta x_{i-1}\, y_{k+1}| \leq (\Delta x_i + \Delta x_{i-1})\,|y_k|,$$
which yields a contradiction unless $y_k = 0$. Such a matrix $J$ is said to be diagonally dominant.

Now let's solve for the $s_i$. First, perform Gaussian elimination on $J$. The first step is
$$\begin{pmatrix} a_{11} & a_{12} & & \\ a_{21} & a_{22} & a_{23} & \\ & a_{32} & a_{33} & a_{34} \\ & & & \ddots \end{pmatrix} \longrightarrow \begin{pmatrix} a_{11} & a_{12} & & \\ 0 & \tilde{a}_{22} & a_{23} & \\ & a_{32} & a_{33} & a_{34} \\ & & & \ddots \end{pmatrix} \qquad\text{with } \tilde{a}_{22} = a_{22} - \frac{a_{21}}{a_{11}}\, a_{12}.$$
($a_{11} \neq 0$ for a diagonally dominant matrix.) From this it's clear that after Gaussian elimination (relatively inexpensive computationally) we'll end up with an upper triangular system, which is efficiently solvable; see the sketch after this section.

We have brushed over $s_0$ and $s_n$. Suppose, for example, that we don't know $s_0$. To determine $s_0$, we could require the spline to be $C^3$ (instead of just $C^2$) at the point $x_1$. This would yield $s_0$ in terms of $s_1$ and $s_2$. Or we could approximate $s_0$ with finite differences. Or if we know a priori $f''(x_0) = f''(x_n) = 0$ (a physically legitimate boundary condition in some problems) then $s_0$ is automatically determined by $s_1$.

What about higher order splines? Doable, but cubics are much nicer to deal with in practice. What about interpolation in higher dimensions? This is an important question, and its solution has engineering applications (e.g. computer-aided design).
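As promised, a sketch of the inexpensive tridiagonal elimination (my code; this is the classical Thomas algorithm, with a made-up diagonally dominant test system):

    import numpy as np

    def tridiag_solve(lower, diag, upper, rhs):
        d, b = diag.astype(float).copy(), rhs.astype(float).copy()
        n = len(d)
        for i in range(1, n):                  # forward elimination
            w = lower[i - 1] / d[i - 1]
            d[i] -= w * upper[i - 1]
            b[i] -= w * b[i - 1]
        x = np.empty(n)
        x[-1] = b[-1] / d[-1]
        for i in range(n - 2, -1, -1):         # back substitution
            x[i] = (b[i] - upper[i] * x[i + 1]) / d[i]
        return x

    lower = np.array([1.0, 1.0]); upper = np.array([1.0, 1.0])
    diag = np.array([4.0, 4.0, 4.0]); rhs = np.array([6.0, 12.0, 18.0])
    print(tridiag_solve(lower, diag, upper, rhs))   # matches np.linalg.solve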

5. Quadrature by Polynomial Interpolation

This is an important application of interpolating polynomials. The goal is to compute
$$\int_a^b f(x)\, dx$$
as accurately as possible. The idea is to interpolate with a polynomial at a certain number of points, and then integrate the polynomial. And now that we can interpolate in a piecewise fashion, it will be sensible to write the integral above as the sum of integrals taken over sub-intervals. How then should we choose the points at which to break up the interval? It turns out this can be automated and optimized; we'll discuss so-called adaptive quadrature later in this section. Our approximation is
$$\int_a^b f(x)\, dx \approx I(f) = \sum_{i=0}^k A_i\, f(x_i).$$
How can we determine the $A_i$'s? Let $p_k$ be the $k$th order interpolant at points $x_i$, $i = 0, \dots, k$; then
$$p_k(x) = \sum_{i=0}^k \Pi_{j=0, j \neq i}^k \frac{x - x_j}{x_i - x_j}\, f(x_i)$$
and so
$$A_i = \int_a^b \Pi_{j \neq i} \frac{x - x_j}{x_i - x_j}\, dx.$$
What error are we making in this approximation? We have
$$f(x) = p_k(x) + f[x_0, \dots, x_k, x]\, \psi_k(x)$$
with $\psi_k(x) = \Pi_{j=0}^k(x - x_j)$. So the error is
$$E(f) = \int_a^b f[x_0, x_1, \dots, x_k, x]\, \psi_k(x)\, dx.$$
Unfortunately, this is a quite complicated expression. Recall from calculus
$$f_{\min} \int_a^b g(x)\, dx \leq \int_a^b f(x)\, g(x)\, dx \leq f_{\max} \int_a^b g(x)\, dx$$
so long as $g$ does not change sign on $[a, b]$. Then we can use an intermediate value argument to bound the integral. And as $f[x_0, \dots, x_k, x] = f^{(k+1)}(\xi)/(k+1)!$ for a $\xi$ in $[a, b]$, the estimates will be explicit. In what follows, we present the results of this program for several different interpolants.

5.1. Trapezoid Rule. This first rule comes by interpolating $f$ linearly at the endpoints of $[a, b]$. Let $\psi_1(x) = (x - a)(x - b)$ and note $\psi_1$ has non-zero integral on $[a, b]$. Also compute
$$\int_a^b (x - a)(x - b)\, dx = -\frac{(b-a)^3}{6}$$
and
$$\int_a^b \left( f(a) + \frac{f(b) - f(a)}{b - a}(x - a) \right) dx = \frac{f(a) + f(b)}{2}\,(b - a).$$

Thus we have the trapezoid rule, with associated error:
$$\int_a^b f(x)\, dx \approx \frac{f(a) + f(b)}{2}\,(b - a), \qquad \text{error} = -\frac{f''(\eta)}{12}\,(b - a)^3.$$

5.2. Midpoint Rule. Now, interpolate $f$ linearly at the left endpoint and the midpoint of $[a, b]$. This yields the midpoint rule,
$$\int_a^b f(x)\, dx \approx f\!\left(\frac{a+b}{2}\right)(b - a).$$
Let's evaluate the error
$$\text{error} = \int_a^b f\!\left[\frac{a+b}{2}, x\right]\left(x - \frac{a+b}{2}\right) dx.$$
Since $x - (a+b)/2$ changes sign, we can't proceed directly as before. But note
$$\int_a^b \left(x - \frac{a+b}{2}\right) dx = 0,$$
and as
$$f\!\left[\frac{a+b}{2}, x\right] = f\!\left[\frac{a+b}{2}, \frac{a+b}{2}\right] + f\!\left[\frac{a+b}{2}, \frac{a+b}{2}, x\right]\left(x - \frac{a+b}{2}\right),$$
we find
$$\text{error} = \int_a^b f\!\left[\frac{a+b}{2}, \frac{a+b}{2}, x\right]\left(x - \frac{a+b}{2}\right)^2 dx.$$
Thus the error associated with the midpoint rule is
$$\text{error} = \frac{f''(\xi)}{24}\,(b - a)^3.$$
Note the midpoint rule is exact for both constant and linear functions. The next rule will be exact for quadratic functions, and also for cubic functions.

5.3. Simpson's Rule. Interpolate $f$ with a quadratic polynomial at the endpoints and midpoint of $[a, b]$. Here are the details. We expect
$$\int_a^b f(t)\, dt \approx A\, f(a) + B\, f\!\left(\frac{a+b}{2}\right) + C\, f(b);$$
then $A, B, C$ are determined via the canonical basis $\{1, t, t^2\}$ for the quadratics. Of course we can use any basis, e.g. the Lagrange basis, to find the coefficients. Some are easier to work with than others. Taking $f(t) \equiv 1$, we get
$$A + B + C = b - a.$$
Taking $f(t) = t - (a+b)/2$ we get $\int_a^b f = 0$, so
$$A\left(-\frac{b-a}{2}\right) + C\left(\frac{b-a}{2}\right) = 0.$$

And for $f(t) = (t - (a+b)/2)^2$ we get $\int_a^b f = \frac{(b-a)^3}{12}$, so
$$A\left(\frac{b-a}{2}\right)^2 + C\left(\frac{b-a}{2}\right)^2 = \frac{(b-a)^3}{12}.$$
Solving these equations gives
$$A = C = \frac{1}{6}(b - a), \qquad B = \frac{2}{3}(b - a). \qquad \text{[Lecture 5, 10/6/11]}$$
Thus Simpson's rule is
$$\int_a^b f(x)\, dx \approx \frac{b-a}{6}\left(f(a) + 4 f\!\left(\frac{a+b}{2}\right) + f(b)\right).$$
The associated error is
$$\text{error} = -\frac{f^{(4)}(\xi)}{90}\left(\frac{b-a}{2}\right)^5.$$
As remarked, Simpson's rule is exact up to cubic polynomials. This is an immediate consequence of the error estimate above; it's also easy to check each of $1$, $t - (a+b)/2$, $(t - (a+b)/2)^2$, $(t - (a+b)/2)^3$ are integrated exactly by the rule, and these form a basis for the cubic polynomials.

5.4. Corrected Trapezoid Rule. Interpolate with a Hermite cubic polynomial at the endpoints of $[a, b]$. This yields the corrected trapezoid rule,
$$\int_a^b f(x)\, dx \approx \frac{b-a}{2}\,(f(a) + f(b)) + \frac{(b-a)^2}{12}\,(f'(a) - f'(b)).$$
One could derive this by computation, or just check it is correct by checking exactness on cubic polynomials. The associated error is
$$\text{error} = \frac{f^{(4)}(\xi)}{720}\,(b - a)^5.$$
This rule can be useful when we have two subintervals $[x_i, x_{i+1}]$ and $[x_{i+1}, x_{i+2}]$ of the same width. The rule gives
$$\int_{x_{i+1}}^{x_{i+2}} f \approx \frac{x_{i+2} - x_{i+1}}{2}\,(f(x_{i+1}) + f(x_{i+2})) + \frac{(x_{i+2} - x_{i+1})^2}{12}\,(f'(x_{i+1}) - f'(x_{i+2})),$$
and observe that the derivative term at $x_{i+1}$ cancels upon adding this to the expression for $\int_{x_i}^{x_{i+1}} f$. So if evaluating $f'$ is hard at some point $\bar{x}$, by subdividing at $\bar{x}$ and employing the corrected trapezoid rule twice, we can completely avoid evaluating $f'(\bar{x})$.

5.5. Adaptive Simpson's Rule. The error estimate for Simpson's rule requires knowledge of the fourth derivative of $f$. But in practice, we don't know anything about the derivatives of $f$. So what good is this estimate? Suppose we have a function whose fourth derivative is well-behaved, and suppose we cut $[a, b]$ into subintervals and approximate the integral. If we then double the number of subintervals, how much better will the approximation become? This kind of question is immediately answerable via the error bound. (Suppose the fourth derivative is constant, then cut $[a, b]$ into two equal pieces and consider the error; divide those pieces further and consider the error again.)
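Before continuing, a quick numerical comparison of the rules derived so far (my sketch; the integrand is chosen arbitrarily):

    import numpy as np

    f, F = np.exp, np.exp                      # integrand and its antiderivative
    a, b = 0.0, 0.5
    exact = F(b) - F(a)
    mid = (b - a) * f((a + b) / 2)
    trap = (b - a) * (f(a) + f(b)) / 2
    simp = (b - a) / 6 * (f(a) + 4 * f((a + b) / 2) + f(b))
    for name, val in [("midpoint", mid), ("trapezoid", trap), ("Simpson", simp)]:
        print(name, abs(val - exact))          # Simpson is far more accurate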

This observation allows us to formulate a first algorithm for adaptive mesh refinement: the adaptive Simpson's rule. Consider the integral
$$I_i = \int_{x_i}^{x_{i+1}} f(x)\, dx$$
on a sub-interval $[x_i, x_{i+1}]$. Let $h = x_{i+1} - x_i$; then Simpson's rule says
$$I_i \approx S_i = \frac{h}{6}\left(f(x_i) + 4 f(x_{i+1/2}) + f(x_{i+1})\right).$$
Consider dividing the subinterval further to get
$$\tilde{S}_i = \frac{h}{12}\left(f(x_i) + 4 f(x_{i+1/4}) + 2 f(x_{i+1/2}) + 4 f(x_{i+3/4}) + f(x_{i+1})\right).$$
Have we improved the approximation? We know
$$I_i - S_i = -\frac{1}{90}\left(\frac{h}{2}\right)^5 f^{(4)}(\eta) \qquad\text{and}\qquad I_i - \tilde{S}_i = -\frac{2}{90}\left(\frac{h}{4}\right)^5 f^{(4)}(\tilde{\eta}).$$
(Think of $f^{(4)}(\tilde{\eta})$ as an average over the two sub-subintervals.) So if $f^{(4)}$ is approximately constant, then
$$S_i - \tilde{S}_i = f^{(4)}\,\frac{h^5}{90 \cdot 2^5}\left(1 - \frac{1}{2^4}\right)$$
and so
$$I - \tilde{S}_i = \frac{\tilde{S}_i - S_i}{15}.$$
Now suppose we want to evaluate the integral on the whole interval $[a, b]$ to within $\epsilon$. Then if, after refining, we get
$$|I - \tilde{S}_i| = \frac{|\tilde{S}_i - S_i|}{15} \leq \frac{\epsilon\, h}{b - a},$$
adding the sub-subintervals was enough, and we should not refine any further. Conversely, if even after refining this criterion is not met, we should keep the sub-subintervals and try to refine each of them further. This is a recursive process.

When will such a process fail? Consider the problem
$$I(\lambda) = \int_a^b f(x, \lambda)\, dx$$
where we need to compute $I$ as a function of $\lambda$. Adaptive mesh refinement may lead to vastly different meshes for different values of the parameter $\lambda$. But suppose $I$ is smooth in $\lambda$ a priori. Then this is a property we expect the computation to preserve. However, by introducing different meshes across different $\lambda$s, we will easily fail to reproduce $I$ as a smooth function after computing.
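A recursive sketch of this adaptive strategy (my code; the stopping rule is the $|S - \tilde{S}|/15$ estimate derived above):

    import numpy as np

    def simpson(f, a, b):
        return (b - a) / 6 * (f(a) + 4 * f((a + b) / 2) + f(b))

    def adaptive_simpson(f, a, b, eps):
        m = (a + b) / 2
        S = simpson(f, a, b)
        S2 = simpson(f, a, m) + simpson(f, m, b)
        if abs(S2 - S) <= 15 * eps:            # refinement changed little: accept
            return S2 + (S2 - S) / 15
        return (adaptive_simpson(f, a, m, eps / 2) +
                adaptive_simpson(f, m, b, eps / 2))

    print(adaptive_simpson(np.sin, 0.0, np.pi, 1e-10))   # exact value is 2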

6. Orthogonal Polynomials

We have already seen one way of approximating an arbitrary function by polynomials, via polynomial interpolation. One can take a more abstract approach and view the space of real-valued functions as a normed linear space. Then any orthonormal basis produces ready-made approximations in the given norm. In this section, we'll develop the basic theory of orthogonal polynomials and see some useful examples. In the next section, we'll apply the results to produce Gaussian quadrature.

Define a weighted inner product on the space of real-valued functions
$$(f, g)_\omega = \int_a^b \omega(x)\, f(x)\, g(x)\, dx$$
with the weight $\omega(x) \geq 0$ ($\omega$ must not vanish a.e.). As usual, $\|f\|_\omega^2 = (f, f)_\omega$. As we hope to approximate $f$, we look for polynomials which span this space; in particular we look for an orthonormal basis. Why do we care about finding such a basis? Given a function $f$, we hope to find the best possible approximation $p_n^* \in P_n$, satisfying
$$\|f - p_n^*\|_\omega \leq \|f - p_n\|_\omega \quad\text{for all } p_n \in P_n.$$
Suppose $\{\phi_i\}$ is an orthonormal basis; then all $p_n \in P_n$ are of the form $p_n = \sum_{i=0}^n \alpha_i \phi_i$. We can calculate explicitly
$$\Big\|f - \sum_{i=0}^n \alpha_i \phi_i\Big\|_\omega^2 = \|f\|_\omega^2 - 2 \sum_{i=0}^n \alpha_i\, (f, \phi_i)_\omega + \sum_{i=0}^n \alpha_i^2,$$
so to get the best approximation we should take $\alpha_i = (f, \phi_i)_\omega$. (One way to see this is to differentiate w.r.t. the $\alpha_i$'s and set the result equal to zero.) So an orthonormal set can be very convenient. The good news is there are plenty of accessible orthonormal bases: given any basis we can simply orthonormalize it via the Gram-Schmidt process.

Example 6.1. Take $\omega \equiv 1$, $a = -1$, $b = +1$. Start with the basis $\{1, x, x^2, \dots\}$, then orthonormalize with Gram-Schmidt. This produces the Legendre polynomials.

Proposition 6.2. Suppose $\{\phi_j\}$ is an orthonormal set of polynomials under some weight $\omega$ and on $[a, b]$. Suppose $\phi_i(x) \in P_i$ is the $i$th degree polynomial in this set. Then all the roots of $\phi_i$ lie in $(a, b)$.

Proof. Define $Q_r = (x - x_1)\cdots(x - x_r)$, where $x_1, x_2, \dots, x_r$ are the roots of $\phi_i$ in $(a, b)$. We want to prove $r = i$. Assume all roots are simple. Then
$$\int_a^b \omega(x)\, \phi_i(x)\, Q_r(x)\, dx \neq 0,$$
for $\phi_i Q_r$ is of one sign on $[a, b]$. But if $r < i$ then $(\phi_i, Q_r)_\omega = 0$, which is a contradiction. Can there be double roots? If $x_0$ is a double root then write $\phi_i(x) = (x - x_0)^2 \psi_i(x)$ and compute
$$0 = \int_a^b \omega(x)\, \phi_i(x)\, \psi_i(x)\, dx = \int_a^b \omega(x) \left[(x - x_0)\, \psi_i(x)\right]^2 dx \neq 0,$$
which is a contradiction. This completes the proof.

Although the Gram-Schmidt process is guaranteed to produce an orthonormal set out of any linearly independent set, the algorithm becomes less computationally tractable as the order of the polynomials goes up. Can we compute an orthonormal set $\{p_n\}$ in a more efficient manner? In fact, there is a recursive formula:
$$p_{n+1}(x) = (A_n x + B_n)\, p_n(x) - C_n\, p_{n-1}(x).$$
Of course, $p_{n+1}$ is always describable in terms of $p_0, \dots, p_n$, but what is interesting (and useful) here is that only $p_{n-1}$ and $p_n$ are needed. Let's prove this. For $k < n - 1$ we have
$$\int_a^b \omega\, (A_n x + B_n)\, p_n(x)\, p_k(x)\, dx - \int_a^b \omega\, C_n\, p_{n-1}(x)\, p_k(x)\, dx = 0,$$
since the $p_j$ are orthogonal for $j \leq n$ (note $x\, p_k \in P_{n-1}$ is orthogonal to $p_n$, and $p_{n-1} \perp p_k$). So there are no additional terms needed. The coefficients $A_n, B_n, C_n$ are determined in general by the system
$$\int_a^b \omega\, (A_n x + B_n)\, p_n(x)\, p_n(x)\, dx = 0, \qquad \int_a^b \omega\, (A_n x + B_n)\, p_n\, p_{n-1}\, dx = C_n, \qquad \|p_{n+1}\|_\omega = 1.$$
$A_n, B_n, C_n$ are known for all of the classical polynomials.

6.1. Chebyshev Polynomials. Take $\omega(x) = \frac{1}{\sqrt{1 - x^2}}$ and $[a, b] = [-1, +1]$, then orthonormalize $1, x, x^2, \dots$ by Gram-Schmidt. The result is the Chebyshev polynomials. The first polynomial is the constant $T_0(x) = \frac{1}{\sqrt{\pi}}$. In general,
$$T_n(x) = \sqrt{\frac{2}{\pi}}\, \cos(n \arccos x).$$
The recursion relation can be derived by trigonometry. From
$$\cos((n+1)\theta) + \cos((n-1)\theta) = 2 \cos\theta\, \cos(n\theta)$$
we find
$$T_{n+1}(x) = 2x\, T_n(x) - T_{n-1}(x).$$
(Strictly, this relation holds for the unnormalized polynomials $\cos(n \arccos x)$; the normalization constants can be attached afterwards.) Once we know $T_0 = \frac{1}{\sqrt{\pi}}$ and $T_1 = \sqrt{\frac{2}{\pi}}\, x$, we can produce the rest via this relation.

The Chebyshev polynomials are quite interesting. For one, their roots (called the Chebyshev points) are very important for accurate interpolation and quadrature. The roots collect at the edges of $[-1, 1]$. Now recall Runge's example. As it turns out, interpolating at the Chebyshev points is just the right thing to do to correct the error. And we'll see that the Chebyshev polynomials give the best approximation not only in the $\omega$-norm, but also in the $\infty$-norm.
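A quick check of the recursion (my sketch; normalization constants omitted, i.e. using the unnormalized $T_0 = 1$, $T_1 = x$):

    import numpy as np

    x = np.linspace(-1, 1, 7)
    T = [np.ones_like(x), x.copy()]
    for n in range(1, 6):
        T.append(2 * x * T[n] - T[n - 1])      # T_{n+1} = 2x T_n - T_{n-1}
    print(np.allclose(T[6], np.cos(6 * np.arccos(x))))   # True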

6.2. Legendre Polynomials. [Lecture 7] Now take $\omega \equiv 1$, $[a, b] = [-1, +1]$, and orthonormalize $1, x, x^2, \dots$ by Gram-Schmidt. This yields the Legendre polynomials, which are in general given by
$$L_n(x) = \sqrt{\frac{2n+1}{2}}\, \frac{1}{2^n\, n!}\, \frac{d^n}{dx^n}\left(x^2 - 1\right)^n.$$
Observe the degree of $L_n$ is $n$. And the $L_n$ are indeed orthogonal; to see this, write
$$(L_n, L_m) = \int_{-1}^1 L_n(x)\, L_m(x)\, dx = C \int_{-1}^1 \left[\frac{d^n}{dx^n}\left(x^2 - 1\right)^n\right] \left[\frac{d^m}{dx^m}\left(x^2 - 1\right)^m\right] dx$$
and integrate by parts repeatedly. They are also normalized, but we won't prove that here. A related set of polynomials (sometimes also called the Legendre polynomials) are given by
$$l_n(x) = \frac{1}{\sqrt{n + 1/2}}\, L_n(x).$$
These are not normalized, but instead satisfy $l_n(1) = 1$. The recurrence relation for the $l_n$ is
$$l_{n+1}(x) = \frac{2n+1}{n+1}\, x\, l_n(x) - \frac{n}{n+1}\, l_{n-1}(x).$$
Now observe that
$$\frac{d}{dx}\left((1 - x^2)\, \frac{d}{dx} l_n\right) = (1 - x^2)\, \frac{d^2}{dx^2} l_n(x) - 2x\, \frac{d}{dx} l_n(x) = -n(n+1)\, l_n(x).$$
So the $l_n$ are eigenvectors of the differential operator $L(\cdot) = \frac{d}{dx}\left((1 - x^2)\, \frac{d}{dx}(\cdot)\right)$. Since
$$(Lf, g) = -\int_{-1}^1 (1 - x^2)\, f'\, g'\, dx = (f, Lg)$$
(the boundary terms vanish because $1 - x^2$ vanishes at $\pm 1$), $L$ is self-adjoint. So its eigenvalues are all real; indeed, they are $-n(n+1)$.

We could go in the other direction, and start off by defining the operator $L$ on the space of polynomials. Then if we look for a polynomial eigenvector $a_0 + a_1 x + \dots + a_n x^n + a_{n+1} x^{n+1} + \dots$ with eigenvalue $-n(n+1)$, we'll find $a_{n+1} = a_{n+2} = \dots = 0$. So $-n(n+1)$ is exactly the right eigenvalue to demand to get an $n$th order polynomial eigenvector. Eventually, if we solve for the $a_0, \dots, a_n$, we'd end up at $l_n$.

6.3. Hermite Polynomials. What about unbounded domains? Take $a = -\infty$ and $b = +\infty$ and $\omega(x) = e^{-x^2}$. Orthonormalizing the standard basis now results in the Hermite polynomials. These take the form
$$p_n(x) = C_n\, \frac{1}{\omega(x)}\, \frac{d^n}{dx^n}\, e^{-x^2}$$
for some choice of constant $C_n$ depending only on $n$. And as before, one can write down a differential equation that the $p_n$ solve. One can do this for all the classical polynomials.
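A quick check of the recurrence for the $l_n$ above (my sketch; a crude trapezoid sum stands in for the exact integral):

    import numpy as np

    x = np.linspace(-1, 1, 2001)
    l = [np.ones_like(x), x.copy()]            # l_0 = 1, l_1 = x
    for n in range(1, 5):
        l.append(((2 * n + 1) * x * l[n] - n * l[n - 1]) / (n + 1))

    print(l[4][-1])                            # l_4(1) = 1
    dx = x[1] - x[0]                           # trapezoid rule for (l_2, l_4)
    inner = dx * (np.sum(l[2] * l[4]) - 0.5 * (l[2][0] * l[4][0] + l[2][-1] * l[4][-1]))
    print(inner)                               # ~ 0: orthogonality for n != m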

7. Gaussian Quadrature

Now we'll use the theory of orthogonal polynomials to approximate $\int_a^b f(x)\, dx$. The idea is still to approximate a function by a polynomial, and then to compute the integral of that polynomial. In other words, the goal is to find $A_i$ so that
$$\int_a^b f(x)\, dx \approx \sum_{i=0}^n A_i\, f(x_i).$$
Simpson's rule is a choice of $A_i$ which integrates polynomials of degree two exactly. And as it turned out, this rule also integrated polynomials of degree three exactly. Now we ask if it's possible to integrate higher degree polynomials exactly. We'll see that by selecting $A_0, \dots, A_n$ and $x_0, \dots, x_n$ in a clever way, we can integrate all of $P_{2n+1}$ exactly. This is almost intuitive: consider that the vector space $P_{2n+1}$ has dimension $2n + 2$, and that there are $n + 1$ points $x_i$ and $n + 1$ coefficients $A_i$.

7.1. Gauss-Legendre Quadrature. We start by finding the Hermite interpolant of $f$ at $x_0, \dots, x_n$, which will match the data $\{f(x_i), f'(x_i)\}_{i=0,\dots,n}$. As we are working towards quadrature, this may seem like a bad first step (we don't know the $f'(x_k)$ a priori). But later we'll see that it's possible to choose the $x_k$ so that the values $f'(x_k)$ are never needed. Now we guess that the Hermite interpolant takes the form
$$p(x) = \sum_{k=0}^n H_k(x)\, f(x_k) + \sum_{k=0}^n K_k(x)\, f'(x_k)$$
for some $H_k, K_k \in P_{2n+1}$. Observe that $p$ will be the required interpolant if
$$H_k(x_i) = \delta_{ik}, \qquad K_k(x_i) = 0, \qquad H_k'(x_i) = 0, \qquad K_k'(x_i) = \delta_{ik},$$
where $\delta_{ik}$ is the Kronecker delta. Recall the polynomials
$$L_k(x) = \frac{\Pi_{i=0, i \neq k}^n (x - x_i)}{\Pi_{i=0, i \neq k}^n (x_k - x_i)} \in P_n$$
used in Lagrange interpolation; these satisfy $L_k(x_i) = \delta_{ik}$. Define
$$H_k(x) = (L_k(x))^2\left(1 - 2\, L_k'(x_k)\,(x - x_k)\right), \qquad K_k(x) = (L_k(x))^2\,(x - x_k).$$
We have $H_k(x_i) = \delta_{ik}$ and
$$\frac{d}{dx} H_k(x) = 2\, L_k(x)\, L_k'(x)\left(1 - 2\, L_k'(x_k)\,(x - x_k)\right) - 2\, L_k'(x_k)\, (L_k(x))^2,$$
so that $H_k'(x_i) = 0$ for all $i$. Also $K_k(x_i) = 0$ and
$$\frac{d}{dx} K_k(x) = 2\, L_k(x)\, L_k'(x)\,(x - x_k) + (L_k(x))^2,$$
so that $K_k'(x_i) = \delta_{ik}$. Thus we've procured the Hermite interpolant of $f$. Now we'll integrate $p$ to get the weights $A_i$. In doing so, we'll see how to choose the $x_i$. First, suppose $[a, b] = [-1, +1]$. Then write
$$\int_{-1}^1 p(x)\, dx = \sum_{k=0}^n W_k\, f(x_k) + V_k\, f'(x_k)$$

with $W_k = \int_{-1}^1 H_k(x)\, dx$ and $V_k = \int_{-1}^1 K_k(x)\, dx$. If we can select $x_0, \dots, x_n$ so that $V_k = 0$ for $k = 0, \dots, n$, then the quadrature rule will read
$$\int_{-1}^1 f(x)\, dx \approx \sum_{k=0}^n W_k\, f(x_k).$$
So observe
$$V_k = \int_{-1}^1 (L_k(x))^2\,(x - x_k)\, dx = C \int_{-1}^1 L_k(x)\, \Pi_{i=0}^n (x - x_i)\, dx$$
for some constant $C \neq 0$. As $L_k \in P_n$, the idea is to choose $x_0, \dots, x_n$ to be the roots of the $(n+1)$th Legendre polynomial $l_{n+1} \in P_{n+1}$. Then $\Pi_{i=0}^n (x - x_i)$ will be a multiple of $l_{n+1}$ and hence $V_k = 0$, for $P_n \perp l_{n+1}$. (Recall the Legendre polynomials form an orthogonal basis on $[-1, 1]$ with $\omega \equiv 1$.) So we've found the desired interpolation points $x_i$ and corresponding weights $A_i = W_i$. Note that $W_k > 0$ for all $k$; to see this, compute
$$W_k = \int_{-1}^1 (L_k(x))^2\, dx - 2\, L_k'(x_k) \int_{-1}^1 (L_k(x))^2\,(x - x_k)\, dx$$
and observe that the second integral vanishes by our choice of the $x_i$.

By a linear change of variable we can relax the assumption $[a, b] = [-1, 1]$. Then the general quadrature rule reads
$$\int_a^b f(x)\, dx \approx \frac{b-a}{2} \sum_{i=0}^n A_i\, f(\ell(x_i)), \qquad \ell(x) = \frac{b-a}{2}\, x + \frac{a+b}{2}.$$
This is called Gauss-Legendre quadrature. The key idea was to approximate the integral of $f$ by the integral of its Hermite interpolant at $n + 1$ quadrature points. As every polynomial $p \in P_{2n+1}$ can be exactly written as a Hermite interpolant at $n + 1$ distinct points, we can now integrate all of $P_{2n+1}$ exactly.
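A numerical check of the exactness claim, using NumPy's built-in Gauss-Legendre nodes and weights (my sketch):

    import numpy as np

    # n + 1 points integrate all of P_{2n+1} exactly; here n = 3, so degree 7.
    n = 3
    x, w = np.polynomial.legendre.leggauss(n + 1)
    for deg in range(9):
        approx = w @ x**deg
        exact = (1 - (-1)**(deg + 1)) / (deg + 1)   # integral of x^deg over [-1, 1]
        print(deg, abs(approx - exact))    # zero through deg = 7, not at deg = 8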

7.2. Gauss-Lobatto Quadrature. Recall we proved that the roots of any polynomial belonging to an orthonormal set on $[-1, 1]$ must lie in the interior $(-1, 1)$. Thus the quadrature points in our scheme are in $(-1, 1)$. What if we require $x_0 = -1$, $x_n = +1$? In this case, there are $2n$ free parameters: $x_1, \dots, x_{n-1}$ and $A_0, \dots, A_n$. So we should expect to only integrate polynomials from $P_{2n-1}$ exactly. To derive the quadrature rule, we start by interpolating $f$ with
$$p(x) = \sum_{k=0}^n H_k(x)\, f(x_k) + \sum_{k=1}^{n-1} K_k(x)\, f'(x_k)$$
where $H_k, K_k$ are built as above, adapted to this interpolation problem. Note the second sum does not include the endpoints. Again, we compute
$$\int_{-1}^1 p(x)\, dx = \sum_{k=0}^n W_k\, f(x_k) + \sum_{k=1}^{n-1} V_k\, f'(x_k)$$
with $W_k = \int_{-1}^1 H_k(x)\, dx$ and $V_k = \int_{-1}^1 K_k(x)\, dx$. We hope to choose $x_1, \dots, x_{n-1}$ so that $V_k = 0$ for all $k$. So far, the derivation has not changed. Now, we have
$$V_k = C \int_{-1}^1 \tilde{L}_k(x)\,(1 - x^2)\, \Pi_{i=1}^{n-1}(x - x_i)\, dx$$
where $\tilde{L}_k \in P_{n-2}$ (the factor $1 - x^2$ appears because $K_k$ vanishes at the endpoints $\pm 1$). To ensure this was zero before, we chose $x_0, \dots, x_n$ to be the roots of the $(n+1)$th Legendre polynomial $l_{n+1} \in P_{n+1}$. But now observe that
$$\int_{-1}^1 \frac{d}{dx} l_n(x)\, \frac{d}{dx} l_m(x)\,(1 - x^2)\, dx = 0$$
for $n \neq m$, so that the derivatives $\frac{d}{dx} l_n$ of the Legendre polynomials form an orthogonal basis (up to normalization) with respect to the weight $\omega(x) = 1 - x^2$. With this in mind, we choose $x_1, \dots, x_{n-1}$ to be the roots of $\frac{d}{dx} l_n \in P_{n-1}$; then
$$V_k = C \int_{-1}^1 \tilde{L}_k(x)\, \frac{d}{dx} l_n(x)\, \omega(x)\, dx = 0,$$
as $\frac{d}{dx} l_n$ is orthogonal to $P_{n-2}$ with respect to $\omega$. This is what we wanted. To sum up, we have produced a quadrature rule at the points $-1 = x_0 < x_1 < \dots < x_{n-1} < x_n = 1$, where $x_1, \dots, x_{n-1}$ are the roots of the derivative of the $n$th Legendre polynomial. This is known as Gauss-Lobatto quadrature. What about $f$?

CHAPTER 3

Solving Linear Equations

Consider what happens to $p(x) = a_0 + a_1 x + \dots + a_n x^n$ when we replace each $a_i$ by $a_i + \delta a_i$. We ask: what happens to the roots? Why is this a relevant question? To get the eigenvalues of a matrix, we would solve for the roots of its characteristic polynomial. If we change the entries of the matrix, then the coefficients of the characteristic polynomial change accordingly. So the question is: how do the eigenvalues of a matrix change if its entries are perturbed?

Let's take $n = 20$, and say the roots are first $x_i = i$ with $i = 1, \dots, 20$. As we change the coefficients, the roots $x_i$ will change. We hope to understand this map. Recall we considered the ratio
$$\frac{|f(x + \delta x) - f(x)|/|f(x)|}{|\delta x|/|x|}$$
before, with $f(x) = \sqrt{x}$ and so on. Now we look at this ratio with $f$ the map from the coefficients to a particular root, the root $x_{15}$. Set $\tilde{p}(x) = \tilde{a}_0 + \tilde{a}_1 x + \dots$ with $\tilde{a}_i = a_i + \delta a_i$, and look at the equations $p(x_j) = 0$ and $\tilde{p}(x_j + \delta x_j) = 0$. In particular, suppose we perturb only the $i$th coefficient. Then we'll have $\tilde{p}(x) = p(x) + \delta a_i\, x^i$ and
$$\tilde{p}(x_j + \delta x_j) = p(x_j) + p'(x_j)\, \delta x_j + \delta a_i\, x_j^i = p'(x_j)\, \delta x_j + \delta a_i\, x_j^i = 0$$
to first order. Hence
$$\delta x_j = -\frac{\delta a_i\, x_j^i}{p'(x_j)}.$$
To get the denominator, write $p(x) = \Pi_{i=1}^{20}(x - i)$, so $p'(x) = \sum_{j=1}^{20} \Pi_{i=1, i \neq j}^{20}(x - i)$ and thus
$$p'(x_j) = \Pi_{i=1, i \neq j}^{20}(x_j - x_i).$$
It turns out the sensitivity is largest around $j = 15$, where $|p'(x_j)|$ is of order $4!\,15!$. Taking $i = 15$, the relative change in the root is
$$\left|\frac{\delta x_j}{x_j}\right| \Big/ \left|\frac{\delta a_{15}}{a_{15}}\right| = \frac{|a_{15}|\, x_j^{14}}{|p'(x_j)|} \approx \frac{15^{14}}{4!\,15!}\, |a_{15}|.$$
And $|a_{15}| \approx 1.16 \times 10^9$, so the magnification is on the order of $10^{12}$. This is disastrous. What have we learned? Solving the characteristic polynomial to find the eigenvalues of a matrix is a bad idea.
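A sketch of this perturbation experiment (my code; the choice of which coefficient to perturb and the relative size 1e-10 are arbitrary illustrations):

    import numpy as np

    p = np.poly(np.arange(1, 21))              # coefficients of prod (x - i), i = 1..20
    p_pert = p.copy()
    p_pert[5] *= 1 + 1e-10                     # perturb the coefficient of x^15
    roots = np.sort(np.roots(p_pert).real)
    print(np.max(np.abs(roots - np.arange(1, 21))))   # far, far larger than 1e-10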