Numerical Analysis Lecture 1 1


Mathematical Tripos Part IB: Easter 2006
Numerical Analysis, Lecture 1

1 LU factorization of matrices

1.1 Definition and applications

Let $A$ be a real $n \times n$ matrix. We say that the $n \times n$ matrices $L$ and $U$ are an LU factorization of $A$ if (1) $L$ is lower triangular (i.e., $L_{i,j} = 0$, $i < j$); (2) $U$ is upper triangular, $U_{i,j} = 0$, $i > j$; and (3) $A = LU$. Therefore the factorization takes the form

$$A = \begin{bmatrix} \times & & \\ \vdots & \ddots & \\ \times & \cdots & \times \end{bmatrix} \begin{bmatrix} \times & \cdots & \times \\ & \ddots & \vdots \\ & & \times \end{bmatrix}.$$

Application 1. Calculation of a determinant: $\det A = (\det L)(\det U) = \left(\prod_{k=1}^{n} L_{k,k}\right)\left(\prod_{k=1}^{n} U_{k,k}\right)$.

Application 2. Testing for nonsingularity: $A = LU$ is nonsingular iff all the diagonal elements of $L$ and $U$ are nonzero.

Application 3. Solution of linear systems: Let $A = LU$ and suppose we wish to solve $Ax = b$. This is the same as $L(Ux) = b$, which we decompose into $Ly = b$, $Ux = y$. Both latter systems are triangular and can be solved easily. Thus, $L_{1,1} y_1 = b_1$ gives $y_1$, next $L_{2,1} y_1 + L_{2,2} y_2 = b_2$ yields $y_2$, etc. Having found $y$, we solve for $x$ by reversing the order: $U_{n,n} x_n = y_n$ gives $x_n$, $U_{n-1,n-1} x_{n-1} + U_{n-1,n} x_n = y_{n-1}$ produces $x_{n-1}$, and so on. This requires $O(n^2)$ computational operations (usually we only bother to count multiplications/divisions).

Application 4. The inverse of $A$: It is straightforward to devise a direct way of calculating the inverse of triangular matrices, subsequently forming $A^{-1} = U^{-1} L^{-1}$.

Why not Cramer's rule? For the uninitiated, a recursive definition of a determinant may seem to be a good method for its calculation (and perhaps even for the solution of linear systems with Cramer's rule). Unfortunately, the number of operations increases like $n!$. Thus, on a $10^9$ flop/sec. computer, $n = 10 \Rightarrow 10^{-4}$ sec., $n = 20 \Rightarrow 17$ min, $n = 30 \Rightarrow 4 \times 10^5$ years.

1.2 The calculation of LU factorization

We denote the columns of $L$ by $l_1, l_2, \ldots, l_n$ and the rows of $U$ by $u_1^T, u_2^T, \ldots, u_n^T$. Hence

$$A = LU = \begin{bmatrix} l_1 & l_2 & \cdots & l_n \end{bmatrix} \begin{bmatrix} u_1^T \\ u_2^T \\ \vdots \\ u_n^T \end{bmatrix} = \sum_{k=1}^{n} l_k u_k^T. \qquad (1.1)$$

Since the first $k-1$ components of $l_k$ and $u_k$ are all zero, each rank-one matrix $l_k u_k^T$ has zeros in its first $k-1$ rows and columns.

Assume that the factorization exists and that $A$ is nonsingular (hence the diagonal elements of $L$ are nonzero).
Since $l_k u_k^T$ stays the same if we replace $l_k \to \alpha l_k$, $u_k \to \alpha^{-1} u_k$, where $\alpha \neq 0$, we may assume w.l.o.g. that all diagonal elements of $L$ equal one. In other words, the $k$th row of $l_k u_k^T$ is $u_k^T$ and its $k$th column is $U_{k,k}$ times $l_k$.

Please email all corrections and suggestions to these notes to A.Iserles@damtp.cam.ac.uk. All handouts are available on the WWW at the URL http://www.damtp.cam.ac.uk/user/na/partib/.
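Application 3 above (solving $Ax = b$ once $L$ and $U$ are known) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `forward_substitution`, `back_substitution` and `lu_solve` are hypothetical, matrices are plain lists of rows, and the diagonals of $L$ and $U$ are assumed nonzero.

```python
def forward_substitution(L, b):
    """Solve L y = b for a lower triangular L with nonzero diagonal."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        # Row i reads: L[i][0]*y[0] + ... + L[i][i]*y[i] = b[i]
        s = sum(L[i][j] * y[j] for j in range(i))
        y[i] = (b[i] - s) / L[i][i]
    return y


def back_substitution(U, y):
    """Solve U x = y for an upper triangular U with nonzero diagonal."""
    n = len(y)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        # Row i reads: U[i][i]*x[i] + ... + U[i][n-1]*x[n-1] = y[i]
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / U[i][i]
    return x


def lu_solve(L, U, b):
    """Solve (L U) x = b via L y = b followed by U x = y."""
    return back_substitution(U, forward_substitution(L, b))
```

Each of the two sweeps touches every stored entry of a triangular matrix once, which is where the $O(n^2)$ operation count comes from.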

We begin our calculation by extracting $l_1$ and $u_1$ from $A$, and then proceed similarly to extract $l_2$ and $u_2$, etc. First we note that since the leading $k-1$ elements of $l_k$ and $u_k$ are zero for $k \geq 2$, it follows from (1.1) that $u_1^T$ is the first row of $A$ and $l_1$ is the first column of $A$, divided by $A_{1,1}$ (so that $L_{1,1} = 1$). Next, having found $l_1$ and $u_1$, we form the matrix $A_1 = A - l_1 u_1^T = \sum_{k=2}^{n} l_k u_k^T$. The first row and column of $A_1$ are zero and it follows that $u_2^T$ is the second row of $A_1$, while $l_2$ is its second column, scaled so that $L_{2,2} = 1$.

The LU algorithm: Set $A_0 := A$. For all $k = 1, 2, \ldots, n$ set $u_k^T$ to the $k$th row of $A_{k-1}$ and $l_k$ to the $k$th column of $A_{k-1}$, scaled so that $L_{k,k} = 1$. Further, calculate $A_k := A_{k-1} - l_k u_k^T$ before incrementing $k$.

Note that all elements in the first $k$ rows and columns of $A_k$ are zero. Hence, we can use the storage of the original $A$ to accumulate $L$ and $U$. The full LU factorization requires $O(n^3)$ computational operations.

1.3 Relation to Gaussian elimination

The equation $A_k = A_{k-1} - l_k u_k^T$ has the property that the $j$th row of $A_k$ is the $j$th row of $A_{k-1}$ minus $L_{j,k}$ times $u_k^T$ (the $k$th row of $A_{k-1}$). Moreover, the multipliers $L_{k,k}, L_{k+1,k}, \ldots, L_{n,k}$ are chosen so that the outcome of this elementary row operation is that the $k$th column of $A_k$ is zero. This construction is analogous to Gaussian elimination for solving $Ax = b$. An important difference is that in LU we do not consider the right-hand side $b$ until the factorization is complete. This is useful e.g. when there are many right-hand sides, in particular if not all the $b$'s are known at the outset: in Gaussian elimination the solution for each new $b$ would require $O(n^3)$ computational operations, whereas with LU factorization $O(n^3)$ operations are required for the initial factorization, but then the solution for each new $b$ only requires $O(n^2)$ operations.

1.4 Pivoting

Naive LU factorization fails when, for example, $A_{1,1} = 0$. The remedy is to exchange rows of $A$, a technique called column pivoting. This is equivalent to picking a suitable equation for eliminating the first unknown in Gaussian elimination.
Specifically, column pivoting means that, having obtained $A_{k-1}$, we exchange two rows of $A_{k-1}$ so that the element of largest magnitude in the $k$th column is in the pivotal position $(k, k)$. In other words,

$$|(A_{k-1})_{k,k}| = \max\{ |(A_{k-1})_{j,k}| : j = 1, 2, \ldots, n \}.$$

Of course, the same exchange is required in the portion of $L$ that has been formed already (i.e., the first $k-1$ columns). Also, we need to record the permutation of rows to solve for the right-hand side and/or to compute the determinant. (The exchange of rows can be regarded as the pre-multiplication of the relevant matrix by a permutation matrix.)

Column pivoting copes with zeros at the pivot position, except when the whole $k$th column of $A_{k-1}$ is zero. In that case it is usual to let $l_k$ be the $k$th unit vector while, as before, choosing $u_k^T$ as the $k$th row of $A_{k-1}$. Such a choice preserves the condition that the matrix $l_k u_k^T$ has the same $k$th row and column as $A_{k-1}$. Thus $A_k := A_{k-1} - l_k u_k^T$ still has zeros in its $k$th row and column, as required.

An important advantage of column pivoting is that every element of $L$ has magnitude at most one. This not only avoids division by zero but also tends to reduce the chance of very large numbers occurring during the factorization, a phenomenon that might lead to ill conditioning and to accumulation of roundoff error.

In row pivoting one exchanges columns of $A_{k-1}$, rather than rows (sic!), whereas total pivoting corresponds to an exchange of both rows and columns, so that the modulus of the pivotal element $(A_{k-1})_{k,k}$ is maximised.
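The LU algorithm with column pivoting described above can be sketched as follows. This is a minimal illustration rather than a production routine: the function name `lu_column_pivoting` is hypothetical, matrices are lists of rows with 0-based indices, and the row exchanges are recorded in a list `perm` such that row `perm[i]` of the input equals row `i` of the product $LU$.

```python
def lu_column_pivoting(A):
    """Return (perm, L, U) such that row perm[i] of A equals row i of L*U."""
    n = len(A)
    W = [[float(v) for v in row] for row in A]  # working copy, plays the role of A_{k-1}
    L = [[0.0] * n for _ in range(n)]
    U = [[0.0] * n for _ in range(n)]
    perm = list(range(n))
    for k in range(n):
        # Column pivoting: bring the largest entry of column k into position (k, k).
        p = max(range(k, n), key=lambda i: abs(W[i][k]))
        if p != k:
            W[k], W[p] = W[p], W[k]
            L[k], L[p] = L[p], L[k]  # same exchange in the part of L formed so far
            perm[k], perm[p] = perm[p], perm[k]
        U[k][k:] = W[k][k:]          # u_k^T is the kth row of A_{k-1}
        pivot = W[k][k]
        L[k][k] = 1.0
        for i in range(k + 1, n):
            # l_k is the kth column scaled so that L[k][k] = 1;
            # if the whole column is zero, l_k is the kth unit vector.
            L[i][k] = W[i][k] / pivot if pivot != 0.0 else 0.0
            for j in range(k + 1, n):
                W[i][j] -= L[i][k] * U[k][j]  # A_k = A_{k-1} - l_k u_k^T
    return perm, L, U
```

For the matrix with rows (0, 1) and (2, 3), where the naive algorithm fails at once because $A_{1,1} = 0$, the sketch exchanges the two rows and returns $L = I$ and $U$ with rows (2, 3) and (0, 1).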

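Mathematical Tripos Part IB: Easter 2006
Numerical Analysis, Lecture 2

2 Factorization of structured matrices

2.1 Symmetric matrices

Let $A$ be an $n \times n$ symmetric matrix (i.e., $A_{k,l} = A_{l,k}$). An analogue of LU factorization takes advantage of symmetry: we express $A$ in the form of the product $LDL^T$, where $L$ is $n \times n$ lower triangular, with ones on its diagonal, whereas $D$ is a diagonal matrix. Subject to its existence, we can write this factorization as

$$A = \begin{bmatrix} l_1 & l_2 & \cdots & l_n \end{bmatrix} \begin{bmatrix} D_{1,1} & 0 & \cdots & 0 \\ 0 & D_{2,2} & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & D_{n,n} \end{bmatrix} \begin{bmatrix} l_1^T \\ l_2^T \\ \vdots \\ l_n^T \end{bmatrix} = \sum_{k=1}^{n} D_{k,k}\, l_k l_k^T,$$

where, as before, $l_k$ is the $k$th column of $L$. The analogy with the algorithm of Section 1.2 becomes obvious by letting $U = DL^T$, but the present form lends itself better to exploitation of symmetry.

Specifically, to compute this factorization, we let $A_0 = A$ and for $k = 1, 2, \ldots, n$ let $l_k$ be the multiple of the $k$th column of $A_{k-1}$ such that $L_{k,k} = 1$. Set $D_{k,k} = (A_{k-1})_{k,k}$ and form $A_k = A_{k-1} - D_{k,k}\, l_k l_k^T$.

Example. Let $A = A_0 = \begin{bmatrix} 2 & 4 \\ 4 & 11 \end{bmatrix}$. Hence $l_1 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$, $D_{1,1} = 2$ and

$$A_1 = A_0 - D_{1,1}\, l_1 l_1^T = \begin{bmatrix} 2 & 4 \\ 4 & 11 \end{bmatrix} - 2 \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 3 \end{bmatrix}.$$

We deduce that $l_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$, $D_{2,2} = 3$ and

$$A = \begin{bmatrix} 1 & 0 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}.$$

2.2 Symmetric positive definite matrices

Recall that $A$ is positive definite if $x^T A x > 0$ for all $x \neq 0$.

Theorem. Let $A$ be a real $n \times n$ symmetric matrix. It is positive definite if and only if it has an $LDL^T$ factorization in which the diagonal elements of $D$ are all positive.

Proof. Suppose that $A = LDL^T$ and let $x \in \mathbb{R}^n \setminus \{0\}$. Since $L$ is nonsingular, $y := L^T x \neq 0$. Then $x^T A x = y^T D y = \sum_{k=1}^{n} D_{k,k}\, y_k^2 > 0$, hence $A$ is positive definite.

Conversely, suppose that $A$ is positive definite. We wish to demonstrate that an $LDL^T$ factorization exists. We denote by $e_k \in \mathbb{R}^n$ the $k$th unit (a.k.a. coordinate) vector. Hence $e_1^T A e_1 = A_{1,1} > 0$ and $l_1$ and $D_{1,1}$ are well defined. We now show that $(A_{k-1})_{k,k} > 0$ for $k = 1, 2, \ldots$. The result is true for $k = 1$ and we continue by induction (hence may assume that $A_{k-1} = A - \sum_{j=1}^{k-1} D_{j,j}\, l_j l_j^T$ has been computed successfully). We define $x \in \mathbb{R}^n$ as follows. The bottom $n-k$ components are zero, $x_k = 1$ and $x_1, x_2, \ldots, x_{k-1}$ are calculated in reverse order, each $x_j$ being chosen so that $l_j^T x = 0$ for $j = k-1, k-2, \ldots, 1$.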
In other words, since $0 = l_j^T x = \sum_{i=1}^{n} L_{i,j} x_i = \sum_{i=j}^{k} L_{i,j} x_i$ and $L_{j,j} = 1$, we let $x_j = -\sum_{i=j+1}^{k} L_{i,j} x_i$, $j = k-1, k-2, \ldots, 1$.

Since the first $k-1$ rows and columns of $A_{k-1}$ vanish, our choice implies that $(A_{k-1})_{k,k} = x^T A_{k-1} x$. Thus, from the definition of $A_{k-1}$ and the choice of $x$,

$$(A_{k-1})_{k,k} = x^T A_{k-1} x = x^T \left( A - \sum_{j=1}^{k-1} D_{j,j}\, l_j l_j^T \right) x = x^T A x - \sum_{j=1}^{k-1} D_{j,j} (l_j^T x)^2 = x^T A x > 0,$$

as required. Hence $(A_{k-1})_{k,k} > 0$, $k = 1, 2, \ldots, n$, and the factorization exists.

Conclusion. It is possible to check whether a symmetric matrix is positive definite by trying to form its $LDL^T$ factorization.

Cholesky factorization. Define $D^{1/2}$ as the diagonal matrix whose $(k, k)$ element is $D_{k,k}^{1/2}$, hence $D^{1/2} D^{1/2} = D$. Then, $A$ being positive definite, we can write

$$A = (L D^{1/2})(D^{1/2} L^T) = (L D^{1/2})(L D^{1/2})^T.$$

In other words, letting $\tilde{L} := L D^{1/2}$, we obtain the Cholesky factorization $A = \tilde{L} \tilde{L}^T$.

2.3 Sparse matrices

Frequently it is required to solve very large systems $Ax = b$ ($n = 10^5$ is considered small in this context!) where nearly all the elements of $A$ are zero. Such a matrix is called sparse, and efficient solution of $Ax = b$ should exploit sparsity. In particular, we wish the matrices $L$ and $U$ to inherit as much as possible of the sparsity of $A$. The only tool at our disposal at the moment is the freedom to exchange rows and columns to minimise fill-in. To this end the following theorem is useful.

Theorem. Let $A = LU$ be an LU factorization (without pivoting) of a sparse matrix. Then all leading zeros in the rows of $A$ to the left of the diagonal are inherited by $L$ and all the leading zeros in the columns of $A$ above the diagonal are inherited by $U$.

Proof. Follows from a question on the Examples Sheet.

This theorem suggests that if one requires the factorization of a sparse matrix then one might try to reorder its rows and columns by a preliminary calculation so that many of the zero elements are leading zero elements in rows and columns. This will reduce the fill-in.

Example. The LU factorization of

$$\begin{bmatrix} -3 & 1 & 1 & 2 & 0 \\ 1 & -3 & 0 & 0 & 1 \\ 1 & 0 & 2 & 0 & 0 \\ 2 & 0 & 0 & 3 & 0 \\ 0 & 1 & 0 & 0 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ -\frac{1}{3} & 1 & 0 & 0 & 0 \\ -\frac{1}{3} & -\frac{1}{8} & 1 & 0 & 0 \\ -\frac{2}{3} & -\frac{1}{4} & \frac{6}{19} & 1 & 0 \\ 0 & -\frac{3}{8} & \frac{1}{19} & \frac{4}{81} & 1 \end{bmatrix} \begin{bmatrix} -3 & 1 & 1 & 2 & 0 \\ 0 & -\frac{8}{3} & \frac{1}{3} & \frac{2}{3} & 1 \\ 0 & 0 & \frac{19}{8} & \frac{3}{4} & \frac{1}{8} \\ 0 & 0 & 0 & \frac{81}{19} & \frac{4}{19} \\ 0 & 0 & 0 & 0 & \frac{272}{81} \end{bmatrix}$$

has significant fill-in.
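However, reordering (symmetrically) rows and columns $1 \leftrightarrow 3$, $2 \leftrightarrow 4$ and $4 \leftrightarrow 5$ yields

$$\begin{bmatrix} 2 & 0 & 1 & 0 & 0 \\ 0 & 3 & 2 & 0 & 0 \\ 1 & 2 & -3 & 0 & 1 \\ 0 & 0 & 0 & 3 & 1 \\ 0 & 0 & 1 & 1 & -3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ \frac{1}{2} & \frac{2}{3} & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & -\frac{6}{29} & \frac{1}{3} & 1 \end{bmatrix} \begin{bmatrix} 2 & 0 & 1 & 0 & 0 \\ 0 & 3 & 2 & 0 & 0 \\ 0 & 0 & -\frac{29}{6} & 0 & 1 \\ 0 & 0 & 0 & 3 & 1 \\ 0 & 0 & 0 & 0 & -\frac{272}{87} \end{bmatrix}.$$

Example. If the nonzeros of $A$ occur only on the diagonal, in one row and in one column, then the full row and column should be placed at the bottom and on the right of $A$, respectively.

A general treatment of orderings that minimise fill-in can be addressed using graph theory, but this is well outside the scope of an undergraduate course.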
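Returning to Sections 2.1 and 2.2, the $LDL^T$ algorithm and the resulting positive-definiteness test can be sketched as below. The names `ldlt`, `is_positive_definite` and `cholesky_from_ldlt` are hypothetical, no pivoting is attempted, and treating a zero pivot as a breakdown is a simplifying assumption of this sketch.

```python
import math


def ldlt(A):
    """Return (L, D) with A = L diag(D) L^T for a symmetric matrix A (no pivoting)."""
    n = len(A)
    W = [[float(v) for v in row] for row in A]  # working copy, plays the role of A_{k-1}
    L = [[0.0] * n for _ in range(n)]
    D = [0.0] * n
    for k in range(n):
        D[k] = W[k][k]                  # D_{k,k} = (A_{k-1})_{k,k}
        if D[k] == 0.0:
            raise ValueError("zero pivot: this simple sketch gives up")
        for i in range(k, n):
            L[i][k] = W[i][k] / D[k]    # kth column scaled so that L[k][k] = 1
        for i in range(k, n):
            for j in range(k, n):
                W[i][j] -= D[k] * L[i][k] * L[j][k]  # A_k = A_{k-1} - D_kk l_k l_k^T
    return L, D


def is_positive_definite(A):
    """Symmetric A is positive definite iff every D_{k,k} is positive."""
    try:
        _, D = ldlt(A)
    except ValueError:
        return False
    return all(d > 0.0 for d in D)


def cholesky_from_ldlt(L, D):
    """Return L D^{1/2}, the Cholesky factor, assuming every D_{k,k} > 0."""
    n = len(D)
    return [[L[i][k] * math.sqrt(D[k]) for k in range(n)] for i in range(n)]
```

For the matrix with rows (2, 4) and (4, 11), `ldlt` returns $L$ with rows (1, 0) and (2, 1) and $D = (2, 3)$, so that matrix is positive definite.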

Mathematical Tripos Part IB: Easter 2006
Numerical Analysis, Lecture 3

Band matrices

The matrix $A$ is a band matrix if there exists an integer $r < n$ such that $A_{i,j} = 0$ for $|i - j| > r$, $i, j = 1, 2, \ldots, n$. In other words, all the nonzero elements of $A$ reside in a band of width $2r + 1$ along the main diagonal. In that case, according to the theorem at the end of the last lecture, $A = LU$ implies that $L_{i,j} = U_{i,j} = 0$ for all $|i - j| > r$, and the sparsity structure is inherited by the factorization.

In general, the expense of calculating an LU factorization of an $n \times n$ dense matrix $A$ is $O(n^3)$ operations and the expense of solving $Ax = b$, provided that the factorization is known, is $O(n^2)$. However, in the case of a banded $A$, we need just $O(r^2 n)$ operations to factorize and $O(rn)$ operations to solve a linear system. If $r \ll n$ this represents a very substantial saving!

General sparse matrices are crucial to a wide range of applications, e.g. the solution of partial differential equations. There exists a wealth of methods for their solution. One approach is efficient factorization that minimizes fill-in. Yet another is to use iterative methods, our next topic. There also exists a substantial body of other, highly effective methods, e.g. Fast Fourier Transforms and multigrid techniques (cf. the Part II course in Numerical Analysis), fast multipole techniques and much more.

3 Iterative methods for linear systems

3.1 Basic iterative schemes

Solution of $Ax = b$ by factorization is frequently very expensive for large $n$, even if we exploit sparsity. An alternative is to use iterative methods. Such methods are very efficient and have been subjected to intensive attention in the last few decades.

An example of an iterative scheme is to write $A = B - C$, where (1) $B$ and $C$ are $n \times n$ matrices; (2) $B$ is nonsingular; (3) the system $Bx = c$ is easy to solve; and (4) the matrix $C$ is somehow small in comparison with $B$. We write the original system in the form $Bx = Cx + b$ and consider solving it by iteration. Choose an arbitrary $x_0 \in \mathbb{R}^n$ and define $x_{m+1}$, $m = 0, 1, \ldots$, by solving

$$B x_{m+1} = C x_m + b. \qquad (3.1)$$

Provided that $B$ is, for example, banded, the solution of (3.1)
is cheap (and the LU factorization of $B$ can be re-used: an example of why the LU formalism is superior to Gaussian elimination). Often the sequence $\{x_m\}_{m=0}^{\infty}$ converges to the solution of $Ax = b$.

The Jacobi iteration. We write $A = A_D - A_L - A_U$, where $A_L$ is strictly lower triangular, $A_D$ is diagonal and $A_U$ is strictly upper triangular. Suppose that no diagonal element of $A$ is zero. The Jacobi iteration is

$$A_D x_{m+1} = (A_L + A_U) x_m + b, \qquad m = 0, 1, \ldots. \qquad (3.2)$$

The Gauss-Seidel iteration. In the above notation, it takes the form

$$(A_D - A_L) x_{m+1} = A_U x_m + b, \qquad m = 0, 1, \ldots. \qquad (3.3)$$

Note that $A_D - A_L$ is lower triangular, hence the solution of (3.3) is cheap.
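A minimal sketch of the two iterations (3.2) and (3.3) follows. The names `jacobi_step`, `gauss_seidel_step` and `iterate`, the stopping rule (largest change between successive iterates below `tol`) and the iteration cap are assumptions made for illustration; the notes themselves do not prescribe a stopping criterion.

```python
def jacobi_step(A, b, x):
    """One Jacobi sweep: A_D x_new = (A_L + A_U) x + b, using only old values."""
    n = len(b)
    return [
        (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
        for i in range(n)
    ]


def gauss_seidel_step(A, b, x):
    """One Gauss-Seidel sweep: (A_D - A_L) x_new = A_U x + b.

    Updating in place means rows below already see the new components.
    """
    n = len(b)
    x = list(x)
    for i in range(n):
        s = sum(A[i][j] * x[j] for j in range(n) if j != i)
        x[i] = (b[i] - s) / A[i][i]
    return x


def iterate(step, A, b, x0, tol=1e-12, max_iter=10000):
    """Apply `step` repeatedly; return (x, number of sweeps performed)."""
    x = list(x0)
    for m in range(1, max_iter + 1):
        x_new = step(A, b, x)
        if max(abs(u - v) for u, v in zip(x_new, x)) < tol:
            return x_new, m
        x = x_new
    raise RuntimeError("no convergence within max_iter sweeps")
```

Both functions assume that no diagonal element of $A$ is zero, as the text requires. Whether the sequence converges at all depends on the iteration matrix, which is the subject of the next section.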

3.2 Necessary and sufficient conditions for convergence

Suppose that $A$ is nonsingular and denote by $x^*$ the solution of $Ax = b$. Having written $A = B - C$, we examine the iterative scheme (3.1). (Note that both (3.2) and (3.3) can be cast in this form.) Our goal is to identify conditions so that $x_m \to x^*$, regardless of the choice of $x_0 \in \mathbb{R}^n$.

Subtract $Bx^* = Cx^* + b$ from (3.1). This gives $B(x_{m+1} - x^*) = C(x_m - x^*)$, hence $B\varepsilon_{m+1} = C\varepsilon_m$, where $\varepsilon_m := x_m - x^*$ is the error in the $m$th iterate. Since $B$ is nonsingular (otherwise we cannot execute (3.1) in the first place), it follows that

$$\varepsilon_{m+1} = H\varepsilon_m = \cdots = H^{m+1}\varepsilon_0, \quad m = 0, 1, \ldots, \quad \text{where } H = B^{-1}C \text{ is the iteration matrix.} \qquad (3.4)$$

This indicates that the errors tend to zero as $m \to \infty$ (regardless of the choice of $x_0$) provided that $\lim_{m \to \infty} H^m = O$.

We employ the notation $\rho(P)$ for the magnitude of the largest (in absolute value) eigenvalue of the $n \times n$ matrix $P$. The quantity $\rho(P)$ is called the spectral radius of the matrix $P$. (Note: recall that, even if $P$ is real, its eigenvalues might be complex.)

Theorem. $\lim_{m \to \infty} x_m = x^*$ for all $x_0 \in \mathbb{R}^n$ if and only if $\rho(H) < 1$.

Proof. We commence with the case $\rho(H) \geq 1$ and wish to demonstrate that $\varepsilon_m$ need not tend to $0$. Let $\lambda$ be an eigenvalue of $H$ such that $|\lambda| = \rho(H)$ and let $w$ be a corresponding eigenvector, $Hw = \lambda w$. If $\lambda$ is real then $w$ may be taken real, and we choose $x_0 = x^* + w$, hence $\varepsilon_0 = w$. It follows at once by induction that $\varepsilon_m = \lambda^m w$, and this cannot tend to zero since $|\lambda| \geq 1$.

If $\lambda \in \mathbb{C} \setminus \mathbb{R}$ then $w$ is complex. Moreover, $\bar{\lambda} \neq \lambda$ is also an eigenvalue and $\bar{w}$ is its eigenvector (the bar denotes complex conjugation). Note that $w$ and $\bar{w}$ are linearly independent (otherwise they would have corresponded to the same eigenvalue). We denote the Euclidean length of $p \in \mathbb{C}^n$ by

$$\|p\| = \left( \sum_{k=1}^{n} |p_k|^2 \right)^{1/2}.$$

Note that $\|p\|$ is a continuous function of the components of $p$. Hence, $\|zw + \bar{z}\bar{w}\|$ is a continuous function of the complex variable $z$. It is a consequence of the linear independence of $w$ and $\bar{w}$, and of the theorem that a continuous function attains its minimum in a closed interval, that

$$\inf_{-\pi \leq \theta \leq \pi} \|e^{i\theta} w + e^{-i\theta} \bar{w}\| = \min_{-\pi \leq \theta \leq \pi} \|e^{i\theta} w + e^{-i\theta} \bar{w}\| = \nu,$$

say, is positive.
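($\nu = 0$ would have implied $e^{i\theta} w + e^{-i\theta} \bar{w} = 0$ for some $\theta$, contradicting linear independence.) By homogeneity, it is true for every $z \in \mathbb{C}$ that

$$\|zw + \bar{z}\bar{w}\| \geq \nu |z|. \qquad (3.5)$$

We let $x_0 = x^* + w + \bar{w}$, hence $\varepsilon_0 = w + \bar{w}$. (Note that everything in sight is real: this was precisely the purpose of our construction!) We have by induction, using $\varepsilon_{m+1} = H\varepsilon_m$, that $\varepsilon_m = \lambda^m w + \bar{\lambda}^m \bar{w}$, $m = 0, 1, \ldots$. Setting $z = \lambda^m$, (3.5) implies that $\|\varepsilon_m\| \geq \nu |\lambda|^m \geq \nu$. Hence the sequence $\{\varepsilon_m\}_{m=0}^{\infty}$ is bounded away from zero and $\varepsilon_m \not\to 0$. This completes the proof of the "only if" part of the theorem.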

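Mathematical Tripos Part IB: Easter 2006
Numerical Analysis, Lecture 4

Recap of the theorem: $\lim_{m \to \infty} x_m = x^*$ for all $x_0 \in \mathbb{R}^n$ if and only if $\rho(H) < 1$.

Proof (continued). We consider next the case $\rho(H) < 1$. Assume for simplicity that $H$ possesses $n$ linearly independent eigenvectors $w_1, w_2, \ldots, w_n$, say. Hence $Hw_j = \lambda_j w_j$, $|\lambda_j| < 1$, $j = 1, 2, \ldots, n$. Linear independence means that every $\varepsilon \in \mathbb{R}^n$ can be expressed as a linear combination of the eigenvectors. Therefore, given $x_0 \in \mathbb{R}^n$, there exist $\alpha_1, \alpha_2, \ldots, \alpha_n \in \mathbb{C}$ such that $\varepsilon_0 = x_0 - x^* = \sum_{j=1}^{n} \alpha_j w_j$. Thus,

$$\varepsilon_1 = H\varepsilon_0 = \sum_{j=1}^{n} \alpha_j \lambda_j w_j \quad \text{and, by induction,} \quad \varepsilon_m = \sum_{j=1}^{n} \alpha_j \lambda_j^m w_j \quad \text{for all } m = 0, 1, \ldots.$$

Since $\rho(H) < 1$, it follows that $\lim_{m \to \infty} \varepsilon_m = 0$, as required.

The missing case. Suppose that $\rho(H) < 1$ but that $H$ does not have $n$ linearly independent eigenvectors. This occurs, for example, for the matrix $H = \begin{bmatrix} a & b \\ 0 & a \end{bmatrix}$, where $b \neq 0$ and $|a| < 1$. The eigenvalues of $H$ are both $a$, but it is an easy exercise to verify that all eigenvectors are necessarily multiples of $e_1$. Moreover, $H^m = \begin{bmatrix} a^m & m a^{m-1} b \\ 0 & a^m \end{bmatrix}$ (prove!), therefore $|a| < 1$ implies $H^m \to O$.

4 QR factorization of matrices

4.1 Scalar products, norms and orthogonality

We first revise a few definitions. $\mathbb{R}^n$ is the linear space of all real $n$-tuples.

For all $u, v \in \mathbb{R}^n$ we define the scalar product $\langle u, v \rangle = \langle v, u \rangle = \sum_{j=1}^{n} u_j v_j = u^T v = v^T u$.

If $u, v, w \in \mathbb{R}^n$ and $\alpha, \beta \in \mathbb{R}$ then $\langle \alpha u + \beta w, v \rangle = \alpha \langle u, v \rangle + \beta \langle w, v \rangle$.

The norm (a.k.a. the Euclidean length) of $u \in \mathbb{R}^n$ is $\|u\| = \left( \sum_{j=1}^{n} u_j^2 \right)^{1/2} = \langle u, u \rangle^{1/2} \geq 0$. For $u \in \mathbb{R}^n$, $\|u\| = 0$ iff $u = 0$.

We say that $u \in \mathbb{R}^n$ and $v \in \mathbb{R}^n$ are orthogonal to each other if $\langle u, v \rangle = 0$.

The vectors $q_1, q_2, \ldots, q_m \in \mathbb{R}^n$ are orthonormal if

$$\langle q_k, q_l \rangle = \begin{cases} 1, & k = l, \\ 0, & k \neq l, \end{cases} \qquad k, l = 1, 2, \ldots, m.$$

An $n \times n$ real matrix $Q$ is orthogonal if all its columns are orthonormal. Since $(Q^T Q)_{k,l} = \langle q_k, q_l \rangle$, this implies that $Q^T Q = I$ ($I$ is the unit matrix). Hence $Q^{-1} = Q^T$ and $QQ^T = QQ^{-1} = I$. We conclude that the rows of an orthogonal matrix are also orthonormal, and that $Q^T$ is an orthogonal matrix. Further, $1 = \det I = \det(QQ^T) = \det Q \det Q^T = (\det Q)^2$, and thus we deduce that $\det Q = \pm 1$, and that an orthogonal matrix is nonsingular.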

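Proposition. If $P$, $Q$ are orthogonal then so is $PQ$.

Proof. Since $P^T P = Q^T Q = I$, we have $(PQ)^T (PQ) = (Q^T P^T)(PQ) = Q^T (P^T P) Q = Q^T Q = I$, hence $PQ$ is orthogonal.

Proposition. Let $q_1, q_2, \ldots, q_m \in \mathbb{R}^n$ be orthonormal. Then $m \leq n$.

Proof. We argue by contradiction. Suppose that $m \geq n + 1$ and let $Q$ be the orthogonal matrix whose columns are $q_1, q_2, \ldots, q_n$. Since $Q$ is nonsingular and $q_m \neq 0$, there exists a nonzero solution $a$ to the linear system $Qa = q_m$, hence $q_m = \sum_{j=1}^{n} a_j q_j$. But

$$0 = \langle q_l, q_m \rangle = \left\langle q_l, \sum_{j=1}^{n} a_j q_j \right\rangle = \sum_{j=1}^{n} a_j \langle q_l, q_j \rangle = a_l, \qquad l = 1, 2, \ldots, n,$$

hence $a = 0$, a contradiction. We deduce that $m \leq n$.

Lemma. Let $q_1, q_2, \ldots, q_m \in \mathbb{R}^n$ be orthonormal and $m \leq n - 1$. Then there exists $q_{m+1} \in \mathbb{R}^n$ such that $q_1, q_2, \ldots, q_{m+1}$ are orthonormal.

Proof. We construct $q_{m+1}$. Let $Q$ be the $n \times m$ matrix whose columns are $q_1, \ldots, q_m$. Since

$$\sum_{k=1}^{n} \sum_{j=1}^{m} Q_{k,j}^2 = \sum_{j=1}^{m} \|q_j\|^2 = m < n,$$

it follows that there exists $l \in \{1, 2, \ldots, n\}$ such that $\sum_{j=1}^{m} Q_{l,j}^2 < 1$. We let $w = e_l - \sum_{j=1}^{m} \langle q_j, e_l \rangle q_j$. Then for $i = 1, 2, \ldots, m$

$$\langle q_i, w \rangle = \langle q_i, e_l \rangle - \sum_{j=1}^{m} \langle q_j, e_l \rangle \langle q_i, q_j \rangle = 0,$$

i.e. by design $w$ is orthogonal to $q_1, \ldots, q_m$. Further, since $Q_{l,j} = \langle q_j, e_l \rangle$, we have

$$\|w\|^2 = \langle w, w \rangle = \langle e_l, e_l \rangle - 2 \sum_{j=1}^{m} \langle q_j, e_l \rangle \langle e_l, q_j \rangle + \sum_{j=1}^{m} \sum_{k=1}^{m} \langle q_j, e_l \rangle \langle q_k, e_l \rangle \langle q_j, q_k \rangle = 1 - \sum_{j=1}^{m} Q_{l,j}^2 > 0.$$

Thus we define $q_{m+1} = w / \|w\|$.

4.2 The QR factorization

The QR factorization of an $m \times n$ matrix $A$ has the form $A = QR$, where $Q$ is an $m \times m$ orthogonal matrix and $R$ is an $m \times n$ upper triangular matrix (i.e., $R_{i,j} = 0$ for $i > j$). We will demonstrate in the sequel that every matrix has a (non-unique) QR factorization.

An application. Let $m = n$ and $A$ be nonsingular. We can solve $Ax = b$ by calculating the QR factorization of $A$ and solving first $Qy = b$ (hence $y = Q^T b$) and then $Rx = y$ (a triangular system!).

Interpretation of the QR factorization. Let $m \geq n$ and denote the columns of $A$ and $Q$ by $a_1, a_2, \ldots, a_n$ and $q_1, q_2, \ldots, q_m$ respectively. Since

$$\begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix} = \begin{bmatrix} q_1 & q_2 & \cdots & q_m \end{bmatrix} \begin{bmatrix} R_{1,1} & R_{1,2} & \cdots & R_{1,n} \\ 0 & R_{2,2} & & \vdots \\ \vdots & \ddots & \ddots & \\ & & 0 & R_{n,n} \\ \vdots & & & 0 \\ 0 & \cdots & & 0 \end{bmatrix},$$

we have $a_k = \sum_{j=1}^{k} R_{j,k} q_j$, $k = 1, 2, \ldots, n$. In other words, $Q$ has the property that each $k$th column of $A$ can be expressed as a linear combination of the first $k$ columns of $Q$.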
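The application mentioned in Section 4.2 (solving $Ax = b$ from a known QR factorization) can be sketched as follows, assuming a square nonsingular $A$. The function name `qr_solve` is hypothetical and $Q$ and $R$ are taken as given; how to compute them is a separate question.

```python
def qr_solve(Q, R, b):
    """Solve (Q R) x = b for square Q (orthogonal) and R (upper triangular)."""
    n = len(b)
    # Step 1: Q y = b, hence y = Q^T b (no linear system to solve).
    y = [sum(Q[i][k] * b[i] for i in range(n)) for k in range(n)]
    # Step 2: R x = y by back substitution, last unknown first.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(R[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / R[i][i]
    return x
```

For example, with $Q$ having rows (0.6, -0.8) and (0.8, 0.6) and $R$ having rows (5, 2.2) and (0, 0.4), the product $QR$ has rows (3, 1) and (4, 2), and `qr_solve(Q, R, [4.0, 6.0])` recovers a solution close to (1, 1).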

Mathematical Tripos Part IB: Easter 2006
Numerical Analysis, Lecture 5

4.3 The Gram-Schmidt algorithm

Given a nonzero $m \times n$ matrix $A$ with the columns $a_1, a_2, \ldots, a_n \in \mathbb{R}^m$, we construct $Q$ and $R$ where $Q$ is orthogonal, $R$ upper triangular and $A = QR$: in other words,

$$\sum_{k=1}^{l} R_{k,l}\, q_k = a_l, \quad l = 1, 2, \ldots, n, \quad \text{where } A = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix}. \qquad (4.1)$$

Assuming $a_1 \neq 0$, we derive $q_1$ and $R_{1,1}$ from the equation (4.1) for $l = 1$. Since $\|q_1\| = 1$, we let $q_1 = a_1 / \|a_1\|$, $R_{1,1} = \|a_1\|$.

Next we form the vector $b = a_2 - \langle q_1, a_2 \rangle q_1$. It is orthogonal to $q_1$, since

$$\langle q_1, a_2 - \langle q_1, a_2 \rangle q_1 \rangle = \langle q_1, a_2 \rangle - \langle q_1, a_2 \rangle \langle q_1, q_1 \rangle = 0.$$

If $b \neq 0$, we set $q_2 = b / \|b\|$, hence $q_1$ and $q_2$ are orthonormal. Moreover,

$$\langle q_1, a_2 \rangle q_1 + \|b\| q_2 = \langle q_1, a_2 \rangle q_1 + b = a_2,$$

hence, to obey (4.1) for $l = 2$, we let $R_{1,2} = \langle q_1, a_2 \rangle$, $R_{2,2} = \|b\|$.

The Gram-Schmidt algorithm. The above idea can be extended to all columns of $A$.

Step 1. Set $k := 0$, $j := 0$ ($k$ is the number of columns of $Q$ that have been already formed and $j$ is the number of columns of $A$ that have been already considered; clearly $k \leq j$).

Step 2. Increase $j$ by 1. If $k = 0$ then set $b := a_j$, otherwise (i.e., when $k \geq 1$) set $R_{i,j} := \langle q_i, a_j \rangle$, $i = 1, 2, \ldots, k$, and $b := a_j - \sum_{i=1}^{k} \langle q_i, a_j \rangle q_i$. [Note: $b$ is orthogonal to $q_1, q_2, \ldots, q_k$.]

Step 3. If $b \neq 0$, increase $k$ by 1 and set $q_k := b / \|b\|$, $R_{k,j} := \|b\|$. In either case set $R_{i,j} := 0$ for $i \geq k + 1$. [Note: Hence, each column of $Q$ has unit length, as required, $a_j = \sum_{i=1}^{k} R_{i,j} q_i$ and $R$ is upper triangular, because $k \leq j$.]

Step 4. Terminate if $j = n$, otherwise go to Step 2.

From the previous lecture: since the columns of $Q$ are orthonormal, there are at most $m$ of them, i.e. the final value of $k$ cannot exceed $m$. If it is less than $m$ then the lemma of the previous lecture demonstrates that we can add columns so that $Q$ becomes $m \times m$ and orthogonal.

The disadvantage of Gram-Schmidt is its ill-conditioning. Since we are using finite arithmetic, even small imprecisions in the calculation of inner products rapidly lead to effective loss of orthogonality. Thus, errors accumulate fast and even for moderate values of $m$ it is no longer true that the computed off-diagonal elements of $Q^T Q$ are very small in magnitude.

On the other hand, orthogonality conditions are preserved well when one generates a new orthogonal matrix by computing the product of two given orthogonal matrices.
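Steps 1 to 4 of the Gram-Schmidt algorithm can be sketched in Python as below. Several choices here are assumptions of the sketch rather than part of the notes: the name `gram_schmidt`, the relative tolerance `tol` used to decide whether $b$ counts as zero in floating point, and the decision to return only the $k$ columns of $Q$ actually formed (as a list of column vectors) without completing $Q$ to an $m \times m$ matrix.

```python
import math


def gram_schmidt(A, tol=1e-12):
    """Classical Gram-Schmidt on the columns of the m x n matrix A.

    Returns (Q_cols, R): Q_cols is a list of k orthonormal column vectors and
    R is m x n upper triangular with a_j = sum_i R[i][j] * Q_cols[i].
    """
    m, n = len(A), len(A[0])
    columns = [[float(A[i][j]) for i in range(m)] for j in range(n)]
    Q_cols = []
    R = [[0.0] * n for _ in range(m)]
    for j, a in enumerate(columns):          # Step 2: consider the next column a_j
        b = list(a)
        for i, q in enumerate(Q_cols):
            R[i][j] = sum(qc * ac for qc, ac in zip(q, a))   # R_ij = <q_i, a_j>
            b = [bc - R[i][j] * qc for bc, qc in zip(b, q)]
        norm_a = math.sqrt(sum(ac * ac for ac in a))
        norm_b = math.sqrt(sum(bc * bc for bc in b))
        # Step 3: accept b as a new direction only if it is not (numerically) zero.
        if norm_b > tol * norm_a and len(Q_cols) < m:
            R[len(Q_cols)][j] = norm_b
            Q_cols.append([bc / norm_b for bc in b])
    return Q_cols, R
```

When a column of $A$ depends linearly on the previous ones, no new column of $Q$ is produced, so the returned list can be shorter than $n$. The loss of orthogonality described above is not cured by this sketch.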
Therefore algorithms that express $Q$ as a product of simple orthogonal matrices are highly useful. This suggests an alternative way forward.

4.4 Orthogonal transformations

Given a real $m \times n$ matrix $A_0 = A$, we seek a sequence $\Omega_1, \Omega_2, \ldots, \Omega_k$ of $m \times m$ orthogonal matrices such that the matrix $A_i := \Omega_i A_{i-1}$ has more zero elements below the main diagonal than $A_{i-1}$ for $i = 1, 2, \ldots, k$, and so that the manner of insertion of such zeros is such that $A_k$ is upper triangular. We then let $R = A_k$, therefore $\Omega_k \Omega_{k-1} \cdots \Omega_2 \Omega_1 A = R$

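and $Q = (\Omega_k \Omega_{k-1} \cdots \Omega_1)^{-1} = (\Omega_k \Omega_{k-1} \cdots \Omega_1)^T = \Omega_1^T \Omega_2^T \cdots \Omega_k^T$. Hence $A = QR$, where $Q$ is orthogonal and $R$ upper triangular.

4.5 Givens rotations

We say that an $m \times m$ orthogonal matrix $\Omega_j$ is a Givens rotation if it coincides with the unit matrix, except for four elements. Specifically, we use the notation $\Omega^{[p,q]}$, where $1 \leq p < q \leq m$, for a matrix such that

$$\Omega^{[p,q]}_{p,p} = \Omega^{[p,q]}_{q,q} = \cos\theta, \qquad \Omega^{[p,q]}_{p,q} = \sin\theta, \qquad \Omega^{[p,q]}_{q,p} = -\sin\theta$$

for some $\theta \in [-\pi, \pi]$. The remaining elements of $\Omega^{[p,q]}$ are those of a unit matrix. For example,

$$m = 4 \implies \Omega^{[1,2]} = \begin{bmatrix} \cos\theta & \sin\theta & 0 & 0 \\ -\sin\theta & \cos\theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \qquad \Omega^{[2,4]} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & 0 & \sin\theta \\ 0 & 0 & 1 & 0 \\ 0 & -\sin\theta & 0 & \cos\theta \end{bmatrix}.$$

Geometrically, such matrices correspond to the underlying coordinate system being rigidly rotated along a two-dimensional plane (in mechanics this is called an Euler rotation). It is trivial to confirm that they are orthogonal.

Theorem. Let $A$ be an $m \times n$ matrix. Then, for every $1 \leq p < q \leq m$, $i \in \{p, q\}$ and $1 \leq j \leq n$, there exists $\theta \in [-\pi, \pi]$ such that $(\Omega^{[p,q]} A)_{i,j} = 0$. Moreover, all the rows of $\Omega^{[p,q]} A$, except for the $p$th and the $q$th, are the same as the corresponding rows of $A$, whereas the $p$th and the $q$th rows are linear combinations of the old $p$th and $q$th rows.

Proof. Let $i = q$. If $A_{p,j} = A_{q,j} = 0$ then any $\theta$ will do, otherwise we let

$$\cos\theta := A_{p,j} \Big/ \sqrt{A_{p,j}^2 + A_{q,j}^2}, \qquad \sin\theta := A_{q,j} \Big/ \sqrt{A_{p,j}^2 + A_{q,j}^2}.$$

Hence

$$(\Omega^{[p,q]} A)_{q,k} = -(\sin\theta) A_{p,k} + (\cos\theta) A_{q,k}, \quad k = 1, 2, \ldots, n \implies (\Omega^{[p,q]} A)_{q,j} = 0.$$

Likewise, when $i = p$ we let $\cos\theta := A_{q,j} \big/ \sqrt{A_{p,j}^2 + A_{q,j}^2}$, $\sin\theta := -A_{p,j} \big/ \sqrt{A_{p,j}^2 + A_{q,j}^2}$. The last two statements of the theorem are an immediate consequence of the structure of $\Omega^{[p,q]}$.

An example: Suppose that $A$ is $3 \times 3$. We can force zeros underneath the main diagonal as follows.

First pick $\Omega^{[1,2]}$ so that $(\Omega^{[1,2]} A)_{2,1} = 0 \implies \Omega^{[1,2]} A = \begin{bmatrix} \times & \times & \times \\ 0 & \times & \times \\ \times & \times & \times \end{bmatrix}$.

Next pick $\Omega^{[1,3]}$ so that $(\Omega^{[1,3]} \Omega^{[1,2]} A)_{3,1} = 0$. Note that multiplication by $\Omega^{[1,3]}$ doesn't alter the second row, hence $(\Omega^{[1,3]} \Omega^{[1,2]} A)_{2,1}$ remains zero $\implies \Omega^{[1,3]} \Omega^{[1,2]} A = \begin{bmatrix} \times & \times & \times \\ 0 & \times & \times \\ 0 & \times & \times \end{bmatrix}$.

Finally, pick $\Omega^{[2,3]}$ so that $(\Omega^{[2,3]} \Omega^{[1,3]} \Omega^{[1,2]} A)_{3,2} = 0$.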
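Since both the second and the third row of $\Omega^{[1,3]} \Omega^{[1,2]} A$ have a leading zero, their linear combination preserves these zeros, hence also $(\Omega^{[2,3]} \Omega^{[1,3]} \Omega^{[1,2]} A)_{2,1} = (\Omega^{[2,3]} \Omega^{[1,3]} \Omega^{[1,2]} A)_{3,1} = 0$. It follows that $\Omega^{[2,3]} \Omega^{[1,3]} \Omega^{[1,2]} A$ is upper triangular. Therefore

$$R = \Omega^{[2,3]} \Omega^{[1,3]} \Omega^{[1,2]} A = \begin{bmatrix} \times & \times & \times \\ 0 & \times & \times \\ 0 & 0 & \times \end{bmatrix}, \qquad Q = (\Omega^{[2,3]} \Omega^{[1,3]} \Omega^{[1,2]})^T.$$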
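The $3 \times 3$ example generalises to any $m \times n$ matrix by sweeping the columns from left to right and, within column $j$, rotating row $j$ against each row below it. The sketch below follows that fixed order, which is one valid choice among many; the name `givens_qr` is hypothetical, indices are 0-based, and $Q$ is accumulated explicitly only for illustration.

```python
import math


def givens_qr(A):
    """QR factorization by Givens rotations; returns (Q, R) with A = Q R."""
    m, n = len(A), len(A[0])
    R = [[float(v) for v in row] for row in A]
    Omega = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for j in range(min(n, m - 1)):
        p = j
        for q in range(j + 1, m):
            r = math.hypot(R[p][j], R[q][j])
            if r == 0.0:
                continue  # both entries vanish: any angle will do
            c, s = R[p][j] / r, R[q][j] / r  # cos(theta), sin(theta)
            for M in (R, Omega):
                # Rows p and q are replaced by linear combinations of themselves.
                row_p = [c * x + s * y for x, y in zip(M[p], M[q])]
                row_q = [-s * x + c * y for x, y in zip(M[p], M[q])]
                M[p], M[q] = row_p, row_q
            R[q][j] = 0.0  # remove the rounding residue of the annihilated entry
    # Omega is the product of all rotations, hence Q = Omega^T.
    Q = [[Omega[j][i] for j in range(m)] for i in range(m)]
    return Q, R
```

For the matrix with rows (3, 1) and (4, 2) a single rotation with $\cos\theta = 0.6$, $\sin\theta = 0.8$ suffices and gives $R$ with rows (5, 2.2) and (0, 0.4), up to rounding.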

Mathematical Tripos Part IB: Easter 2006
Numerical Analysis, Lecture 6

The Givens algorithm. Given an $m \times n$ matrix $A$, let $l_i$ be the number of leading zeros in the $i$th row of $A$, $i = 1, 2, \ldots, m$.

Step 1. Stop if the (integer) sequence $\{l_1, l_2, \ldots, l_m\}$ increases monotonically, the increase being strictly monotone for $l_i \leq n - 1$.

Step 2. Pick any two integers $1 \leq p < q \leq m$ such that either $l_p > l_q$ or $l_p = l_q < n$.

Step 3. Replace $A$ by $\Omega^{[p,q]} A$, using the Givens rotation that annihilates the $(q, l_q + 1)$ element. Update the values of $l_p$ and $l_q$ and go to Step 1.

The final matrix $A$ is upper triangular and also has the property that the number of leading zeros in each row increases strictly monotonically until all the remaining rows of $A$ are zero: a matrix of this form is said to be in standard form. This end result, as we recall, is the required matrix $R$.

The cost. There are fewer than $mn$ rotations and each rotation replaces two rows by their linear combinations, hence the total cost is $O(mn^2)$.

If we wish to obtain explicitly an orthogonal $Q$ s.t. $A = QR$ then we commence by letting $\Omega$ be the $m \times m$ unit matrix and, each time $A$ is premultiplied by $\Omega^{[p,q]}$, we also premultiply $\Omega$ by the same rotation. Hence the final $\Omega$ is the product of all the rotations, in correct order, and we let $Q = \Omega^T$. The extra cost is $O(m^2 n)$. However, in most applications we don't need $Q$ but, instead, just the action of $Q^T$ on a given vector (recall: solution of linear systems!). This can be accomplished by multiplying the vector by successive rotations, the cost being $O(mn)$.

4.6 Householder transformations

Let $u \in \mathbb{R}^m \setminus \{0\}$. The $m \times m$ matrix $I - 2\dfrac{uu^T}{\|u\|^2}$ is called a Householder transformation (or Householder reflection). Each such matrix is symmetric and orthogonal, since

$$\left( I - 2\frac{uu^T}{\|u\|^2} \right)^T \left( I - 2\frac{uu^T}{\|u\|^2} \right) = \left( I - 2\frac{uu^T}{\|u\|^2} \right)^2 = I - 4\frac{uu^T}{\|u\|^2} + 4\frac{u(u^T u)u^T}{\|u\|^4} = I.$$

Householder transformations offer an alternative to Givens rotations in the calculation of a QR factorization.

Deriving the first column of $R$. Our goal is to multiply an $m \times n$ matrix $A$ by a sequence of Householder transformations so that each product induces zeros under the diagonal in an entire successive column.
To start with, we seek a reflection that transforms the first nonzero column of $A$ to a multiple of $e_1$. Let $a \in \mathbb{R}^m$ be the first nonzero column of $A$. We wish to choose $u \in \mathbb{R}^m$ s.t. the bottom $m - 1$ entries of
$$\left(I - 2\frac{uu^\top}{\|u\|^2}\right)a = a - 2\frac{u^\top a}{\|u\|^2}\,u$$
vanish and, in addition, we normalise $u$ so that $2u^\top a = \|u\|^2$ (recall that $a \ne 0$). Therefore $u_i = a_i$, $i = 2, \dots, m$, and the normalisation implies that
$$2u_1a_1 + 2\sum_{i=2}^m a_i^2 = u_1^2 + \sum_{i=2}^m a_i^2 \;\Longrightarrow\; u_1^2 - 2u_1a_1 + a_1^2 - \sum_{i=1}^m a_i^2 = 0 \;\Longrightarrow\; u_1 = a_1 \pm \|a\|.$$
It is usual to let the sign be the same as the sign of $a_1$, since otherwise $\|u\|$ might be tiny, leading to division by a tiny number and hence to numerical difficulties.

Corrections and suggestions to these notes should be emailed to A.Iserles@damtp.cam.ac.uk. All handouts are available on the WWW at the URL http://www.damtp.cam.ac.uk/user/na/partib/.

For large $m$ we do not execute explicit matrix multiplication. Instead, to calculate
$$\left(I - 2\frac{uu^\top}{\|u\|^2}\right)A = A - 2\frac{u(u^\top A)}{\|u\|^2},$$
we first evaluate $w^\top := u^\top A$, subsequently forming $A - \frac{2}{\|u\|^2}uw^\top$.

Subsequent columns of R. Suppose that $a$ is the first column of $A$ that isn't compatible with standard form (previous columns have been, presumably, already dealt with by Householder transformations) and that the standard form requires us to bring the $k+1, \dots, m$ components to zero. Hence, nonzero elements in previous columns must be confined to the first $k - 1$ rows and we want them to be unamended by the reflection. Thus, we let the first $k - 1$ components of $u$ be zero and choose
$$u_k = a_k \pm \Big(\sum_{i=k}^m a_i^2\Big)^{1/2} \quad\text{and}\quad u_i = a_i, \quad i = k+1, \dots, m.$$

The Householder method. We process columns of $A$ in sequence, in each stage premultiplying the current $A$ by the requisite Householder transformation. The end result is an upper triangular matrix $R$ in its standard form.

Example. A = 4 7 0 3 0 0 0 0 0 0 u = 0 0 5 $\left(I - 2\frac{uu^\top}{\|u\|^2}\right)A$ = 4 7 0 3 0 0 3 0 0 0 0 0 0

Calculation of Q. If the matrix $Q$ is required in an explicit form, set $\Omega = I$ initially and, for each successive reflection, replace $\Omega$ by
$$\left(I - 2\frac{uu^\top}{\|u\|^2}\right)\Omega = \Omega - \frac{2}{\|u\|^2}u(u^\top\Omega).$$
As in the case of Givens rotations, by the end of the computation, $Q = \Omega^\top$. However, if we require just the vector $c = Q^\top b$, say, rather than the matrix $Q$, then we set initially $c = b$ and in each stage replace $c$ by
$$\left(I - 2\frac{uu^\top}{\|u\|^2}\right)c = c - 2\frac{u^\top c}{\|u\|^2}\,u.$$

Deciding between Givens and Householder transformations. If $A$ is dense, it is in general more convenient to use Householder reflections. Givens rotations come into their own, however, when $A$ has many leading zeros in its rows.
In an extreme case, if an $n \times n$ matrix $A$ consists of zeros underneath the first subdiagonal, they can be rotated away in just $n - 1$ Givens rotations, at the cost of $O(n^2)$ operations!
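The Householder method described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `householder_qr`, the list-of-rows matrix representation and the test data are hypothetical and not part of the notes. Each reflection is applied as $A - \frac{2}{\|u\|^2}u(u^\top A)$, never as an explicit matrix product, and the same reflections are applied to $b$ so that $c = Q^\top b$ is obtained without forming $Q$.

```python
import math

def householder_qr(A, b):
    """Reduce A (m x n, list of rows) to upper triangular R by Householder
    reflections; apply the same reflections to b, giving c = Q^T b."""
    m, n = len(A), len(A[0])
    R = [row[:] for row in A]
    c = b[:]
    k = 0  # row that receives the next pivot
    for col in range(n):
        if k >= m:
            break
        norm = math.sqrt(sum(R[i][col] ** 2 for i in range(k, m)))
        if norm == 0.0:
            continue  # column is already compatible with standard form
        # u vanishes above row k; the sign matches a_k to avoid cancellation
        u = [0.0] * m
        sign = 1.0 if R[k][col] >= 0 else -1.0
        u[k] = R[k][col] + sign * norm
        for i in range(k + 1, m):
            u[i] = R[i][col]
        uu = sum(x * x for x in u)
        # A <- A - (2/||u||^2) u (u^T A), one column at a time
        for j in range(n):
            w = sum(u[i] * R[i][j] for i in range(k, m))
            for i in range(k, m):
                R[i][j] -= 2.0 * w / uu * u[i]
        # c <- c - 2 (u^T c / ||u||^2) u
        w = sum(u[i] * c[i] for i in range(k, m))
        for i in range(k, m):
            c[i] -= 2.0 * w / uu * u[i]
        k += 1
    return R, c
```

For instance, with first column $(3, 4, 0)^\top$ the sketch picks $u = (8, 4, 0)^\top$ and maps that column to $(-5, 0, 0)^\top$, in line with the sign rule above.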

Mathematical Tripos Part IB: Easter 2006
Numerical Analysis, Lecture 7

5 Linear least squares

5.1 Statement of the problem

Suppose that an $m \times n$ matrix $A$ and a vector $b \in \mathbb{R}^m$ are given. The equation $Ax = b$, where $x \in \mathbb{R}^n$ is unknown, has in general no solution (if $m > n$) or an infinity of solutions (if $m < n$). Problems of this form occur frequently when we collect $m$ observations (which, typically, are prone to measurement error) and wish to exploit them to form an $n$-variable linear model, where $n \ll m$. (In statistics, this is known as linear regression.) Bearing in mind the likely presence of errors in $A$ and $b$, we seek $x \in \mathbb{R}^n$ that minimises the Euclidean length $\|Ax - b\|$. This is the least squares problem.

Theorem. $x \in \mathbb{R}^n$ is a solution of the least squares problem iff $A^\top(Ax - b) = 0$.

Proof. If $x$ is a solution then it minimises
$$f(x) := \|Ax - b\|^2 = \langle Ax - b, Ax - b\rangle = x^\top A^\top A x - 2x^\top A^\top b + b^\top b.$$
Hence $\nabla f(x) = 0$. But $\tfrac12\nabla f(x) = A^\top A x - A^\top b$, hence $A^\top(Ax - b) = 0$. Conversely, suppose that $A^\top(Ax - b) = 0$ and let $u \in \mathbb{R}^n$. Hence, letting $y = u - x$,
$$\|Au - b\|^2 = \langle Ax + Ay - b, Ax + Ay - b\rangle = \langle Ax - b, Ax - b\rangle + 2y^\top A^\top(Ax - b) + \langle Ay, Ay\rangle = \|Ax - b\|^2 + \|Ay\|^2 \ge \|Ax - b\|^2$$
and $x$ is indeed optimal.

Corollary. Optimality of $x$ $\Longleftrightarrow$ the vector $Ax - b$ is orthogonal to all columns of $A$.

5.2 Normal equations

One way of finding an optimal $x$ is by solving the $n \times n$ linear system $A^\top A x = A^\top b$: the method of normal equations. This approach is popular in many applications. However, there are three disadvantages. Firstly, $A^\top A$ might be singular, secondly a sparse $A$ might be replaced by a dense $A^\top A$ and, finally, forming $A^\top A$ might lead to loss of accuracy. Thus, suppose that our computer works in the IEEE arithmetic standard ($\approx 15$ significant digits) and let
$$A = \begin{bmatrix} 10^8 & -10^8 \\ 1 & 1 \end{bmatrix} \;\Longrightarrow\; A^\top A = \begin{bmatrix} 10^{16} + 1 & -10^{16} + 1 \\ -10^{16} + 1 & 10^{16} + 1 \end{bmatrix} \approx 10^{16}\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}.$$
Given $b = [0, 2]^\top$ the solution of $Ax = b$ is $[1, 1]^\top$, as can be easily found by Gaussian elimination. However, our computer believes that $A^\top A$ is singular!

5.3 QR and least squares

Lemma. Let $A$ be any $m \times n$ matrix and let $b \in \mathbb{R}^m$. The vector $x \in \mathbb{R}^n$ minimises $\|Ax - b\|$ iff it minimises $\|\Omega A x - \Omega b\|$ for an arbitrary $m \times m$ orthogonal matrix $\Omega$.

Proof. Given an arbitrary vector $v \in \mathbb{R}^m$, we have
$$\|\Omega v\|^2 = v^\top\Omega^\top\Omega v = v^\top v = \|v\|^2.$$

In particular, $\|\Omega A x - \Omega b\| = \|Ax - b\|$.

An irrelevant, yet important remark. The property that orthogonal matrices leave the Euclidean distance intact is called isometry and it has many important ramifications throughout mathematics and mathematical physics.

Method of solution. Suppose that $A = QR$, a QR factorization with $R$ in standard form. Because of the lemma, letting $\Omega := Q^\top$,
$$\|Ax - b\| = \|Q^\top(Ax - b)\| = \|Rx - Q^\top b\|,$$
therefore we seek $x \in \mathbb{R}^n$ that minimises $\|Rx - Q^\top b\|$. In general ($m > n$) many rows of $R$ consist of zeros. Suppose for simplicity that $\operatorname{rank} R = \operatorname{rank} A = n$. Then the bottom $m - n$ rows of $R$ are zero. Therefore we find $x$ by solving the (nonsingular) linear system given by the first $n$ equations of $Rx = Q^\top b$. A similar (although more complicated) algorithm applies when $\operatorname{rank} R \le n - 1$. Note, recalling our former remark, that we don't require $Q$ explicitly, just to evaluate $Q^\top b$.

6 Polynomial interpolation

6.1 The interpolation problem

Given $n + 1$ distinct real points $x_0, x_1, \dots, x_n$ and real numbers $f_0, f_1, \dots, f_n$, we seek a function $p : \mathbb{R} \to \mathbb{R}$ such that $p(x_i) = f_i$, $i = 0, 1, \dots, n$. Such a function is called an interpolant. We denote by $P_n[x]$ the set of all real polynomials of degree at most $n$ and observe that each $p \in P_n[x]$ is uniquely defined by its $n + 1$ coefficients. In other words, we have $n + 1$ degrees of freedom, while interpolation at $x_0, x_1, \dots, x_n$ constitutes $n + 1$ conditions. This, intuitively, justifies seeking an interpolant from $P_n[x]$.

6.2 The Lagrange formula

Although, in principle, we may solve a linear problem with $n + 1$ unknowns to determine a polynomial interpolant, this can be accomplished more easily by using the explicit Lagrange formula. We claim that
$$p(x) = \sum_{k=0}^n f_k \prod_{\substack{l=0 \\ l \ne k}}^n \frac{x - x_l}{x_k - x_l}, \qquad x \in \mathbb{R}.$$
Note that $p \in P_n[x]$, as required. We wish to show that it interpolates the data. Define
$$L_k(x) := \prod_{\substack{l=0 \\ l \ne k}}^n \frac{x - x_l}{x_k - x_l}, \qquad k = 0, 1, \dots, n$$
(Lagrange cardinal polynomials). It is trivial to verify that $L_j(x_j) = 1$ and $L_j(x_k) = 0$ for $k \ne j$, hence
$$p(x_j) = \sum_{k=0}^n f_k L_k(x_j) = f_j, \qquad j = 0, 1, \dots, n,$$
and $p$ is an interpolant.

Uniqueness. Suppose that both $p \in P_n[x]$ and $q \in P_n[x]$ interpolate to the same $n + 1$ data.
Then the polynomial $p - q$, which is of degree at most $n$, vanishes at $n + 1$ distinct points. But the only polynomial of degree at most $n$ with $n + 1$ zeros is the zero polynomial. Therefore $p - q \equiv 0$ and the interpolating polynomial is unique.
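The Lagrange formula translates directly into code. The sketch below is illustrative only: the name `lagrange_interpolant` and the sample data are assumptions, not part of the notes, and exact rational arithmetic from the standard library is used so that the interpolation conditions can be checked without rounding error.

```python
from fractions import Fraction

def lagrange_interpolant(xs, fs):
    """Return p with p(x) = sum_k f_k L_k(x), L_k the cardinal polynomials."""
    def p(x):
        total = Fraction(0)
        for k, (xk, fk) in enumerate(zip(xs, fs)):
            Lk = Fraction(1)
            for l, xl in enumerate(xs):
                if l != k:
                    Lk *= Fraction(x - xl) / Fraction(xk - xl)
            total += fk * Lk
        return total
    return p

# Four data points sampled from a cubic: by uniqueness the interpolant
# in P_3[x] must be that cubic itself.
xs = [0, 1, 2, 4]
fs = [x ** 3 - 2 * x + 1 for x in xs]
p = lagrange_interpolant(xs, fs)
```

Evaluating `p` at a point that is not a node, say $x = 3$, returns the value of the cubic there, which illustrates the uniqueness argument.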

Mathematical Tripos Part IB: Easter 2006
Numerical Analysis, Lecture 8

6.3 The error of polynomial interpolation

Let $[a, b]$ be a closed interval of $\mathbb{R}$. We denote by $C[a, b]$ the space of all continuous functions from $[a, b]$ to $\mathbb{R}$ and let $C^s[a, b]$, where $s$ is a positive integer, stand for the linear space of all functions in $C[a, b]$ that possess $s$ continuous derivatives.

Theorem. Given $f \in C^{n+1}[a, b]$, let $p \in P_n[x]$ interpolate the values $f(x_i)$, $i = 0, 1, \dots, n$, where $x_0, \dots, x_n \in [a, b]$ are pairwise distinct. Then for every $x \in [a, b]$ there exists $\xi \in [a, b]$ such that
$$f(x) - p(x) = \frac{1}{(n+1)!}\, f^{(n+1)}(\xi) \prod_{i=0}^n (x - x_i). \qquad (6.1)$$

Proof. The formula (6.1) is true when $x = x_j$ for $j \in \{0, 1, \dots, n\}$, since both sides of the equation vanish. Let $x \in [a, b]$ be any other point and define
$$\varphi(t) := [f(t) - p(t)] \prod_{i=0}^n (x - x_i) - [f(x) - p(x)] \prod_{i=0}^n (t - x_i), \qquad t \in [a, b].$$
[Note: The variable in $\varphi$ is $t$, whereas $x$ is a fixed parameter.] Note that $\varphi(x_j) = 0$, $j = 0, 1, \dots, n$, and $\varphi(x) = 0$. Hence, $\varphi$ has at least $n + 2$ distinct zeros in $[a, b]$. Moreover, $\varphi \in C^{n+1}[a, b]$. We now apply the Rolle theorem: if the function $g \in C^1[a, b]$ vanishes at two distinct points in $[a, b]$ then its derivative vanishes at an intermediate point. We deduce that $\varphi'$ vanishes at (at least) $n + 1$ distinct points in $[a, b]$. Next, applying Rolle to $\varphi'$, we conclude that $\varphi''$ vanishes at $n$ points in $[a, b]$. In general, we prove by induction that $\varphi^{(s)}$ vanishes at $n + 2 - s$ distinct points of $[a, b]$ for $s = 0, 1, \dots, n + 1$. Letting $s = n + 1$, we have $\varphi^{(n+1)}(\xi) = 0$ for some $\xi \in [a, b]$. Hence
$$0 = \varphi^{(n+1)}(\xi) = [f^{(n+1)}(\xi) - p^{(n+1)}(\xi)] \prod_{i=0}^n (x - x_i) - [f(x) - p(x)]\, \frac{d^{n+1}}{dt^{n+1}} \prod_{i=0}^n (t - x_i)\Big|_{t = \xi}.$$
Since $p^{(n+1)} \equiv 0$ and $\frac{d^{n+1}}{dt^{n+1}} \prod_{i=0}^n (t - x_i) \equiv (n + 1)!$, we obtain (6.1).

Runge's example. We interpolate $f(x) = 1/(1 + x^2)$, $x \in [-5, 5]$, at the equally-spaced points $x_j = -5 + 10\frac{j}{n}$, $j = 0, 1, \dots, n$. Some of the errors are displayed below.
$$\begin{array}{ccc}
x & |f(x) - p(x)| & \prod_{i=0}^n (x - x_i) \\ \hline
0.75 & \approx 3 \times 10^{-3} & -2.5 \times 10^{6} \\
1.75 & 7.7 \times 10^{-3} & -6.6 \times 10^{6} \\
2.75 & 3.6 \times 10^{-2} & -4.1 \times 10^{7} \\
3.75 & \approx 5 \times 10^{-1} & -7.6 \times 10^{8} \\
4.75 & 4.0 \times 10^{+1} & -7.3 \times 10^{10}
\end{array}$$
Table: errors for $n = 20$.
[Figure: interpolation errors; the plot is not reproduced here.]

The growth in the error is explained by the product term in (6.1)
(the rightmost column of the table). Adding more interpolation points makes the largest error even worse. A remedy to this

state of affairs is to cluster points toward the ends of the range. A considerably smaller error is attained for $x_j = 5\cos\frac{(n - j)\pi}{n}$, $j = 0, 1, \dots, n$ (so-called Chebyshev points). It is possible to prove that points of this kind nearly minimise the magnitude of $\max_{x \in [-5, 5]} \left|\prod_{i=0}^n (x - x_i)\right|$ (the exact minimum is attained at the scaled zeros, rather than the extrema, of a Chebyshev polynomial).

6.4 Divided differences: definition

Given pairwise-distinct points $x_0, x_1, \dots, x_n \in [a, b]$, we let $p \in P_n[x]$ interpolate $f \in C[a, b]$ there. The coefficient of $x^n$ in $p$ is called the divided difference and denoted by $f[x_0, x_1, \dots, x_n]$. We say that this divided difference is of degree $n$. We can derive $f[x_0, \dots, x_n]$ from the Lagrange formula,
$$f[x_0, x_1, \dots, x_n] = \sum_{k=0}^n f(x_k) \prod_{\substack{l=0 \\ l \ne k}}^n \frac{1}{x_k - x_l}. \qquad (6.2)$$

Theorem. Let $[\bar a, \bar b]$ be the shortest interval that contains $x_0, x_1, \dots, x_n$ and let $f \in C^n[\bar a, \bar b]$. Then there exists $\xi \in [\bar a, \bar b]$ such that
$$f[x_0, x_1, \dots, x_n] = \frac{1}{n!} f^{(n)}(\xi). \qquad (6.3)$$

Proof. Let $p$ be the interpolating polynomial. The error function $f - p$ has at least $n + 1$ zeros in $[\bar a, \bar b]$ and, applying Rolle's theorem $n$ times, it follows that $f^{(n)} - p^{(n)}$ vanishes at some $\xi \in [\bar a, \bar b]$. But $p(x) = \frac{1}{n!} p^{(n)}(\zeta) x^n + {}$ lower order terms (for any $\zeta \in \mathbb{R}$), therefore, letting $\zeta = \xi$,
$$f[x_0, x_1, \dots, x_n] = \frac{1}{n!} p^{(n)}(\xi) = \frac{1}{n!} f^{(n)}(\xi)$$
and we deduce (6.3).

Application. It is a consequence of the theorem that divided differences can be used to approximate derivatives.

6.5 Recurrence relations for divided differences

Our next topic is a useful way to calculate divided differences (and, ultimately, to derive yet another means to construct an interpolating polynomial). We commence with the remark that $f[x_i]$ is the coefficient of $x^0$ in the polynomial of degree 0 (i.e., a constant) that interpolates $f(x_i)$, hence $f[x_i] = f(x_i)$.

Theorem. Suppose that $x_0, x_1, \dots, x_{k+1}$ are pairwise distinct, where $k \ge 0$. Then
$$f[x_0, x_1, \dots, x_{k+1}] = \frac{f[x_1, x_2, \dots, x_{k+1}] - f[x_0, x_1, \dots, x_k]}{x_{k+1} - x_0}. \qquad (6.4)$$

Proof. Let $p, q \in P_k[x]$ be the polynomials that interpolate $f$ at $\{x_0, x_1, \dots, x_k\}$ and $\{x_1, x_2, \dots, x_{k+1}\}$ respectively and define
$$r(x) := \frac{(x - x_0)q(x) + (x_{k+1} - x)p(x)}{x_{k+1} - x_0} \in P_{k+1}[x].$$
We readily verify that $r(x_i) = f(x_i)$, $i = 0, 1, \dots, k + 1$. Hence $r$ is the $(k+1)$-degree interpolating polynomial and $f[x_0, \dots, x_{k+1}]$ is the coefficient of $x^{k+1}$ therein. The recurrence (6.4) follows from the definition of divided differences.
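As a small check of the recurrence (6.4) against the explicit sum (6.2), here is a hedged Python sketch. The function names `divided_differences` and `divided_difference_explicit` are illustrative choices, not notation from the notes, and rational arithmetic is used so that both routes agree exactly.

```python
from fractions import Fraction

def divided_differences(xs, fs):
    """Return [f[x_0], f[x_0,x_1], ..., f[x_0,...,x_n]] via recurrence (6.4)."""
    col = [Fraction(f) for f in fs]          # order-0 differences f[x_i]
    top = [col[0]]
    for order in range(1, len(xs)):
        # col[i] becomes f[x_i, ..., x_{i+order}]
        col = [(col[i + 1] - col[i]) / Fraction(xs[i + order] - xs[i])
               for i in range(len(col) - 1)]
        top.append(col[0])
    return top

def divided_difference_explicit(xs, fs):
    """f[x_0, ..., x_n] computed from the explicit sum (6.2)."""
    total = Fraction(0)
    for k, xk in enumerate(xs):
        denom = Fraction(1)
        for l, xl in enumerate(xs):
            if l != k:
                denom *= Fraction(xk - xl)
        total += Fraction(fs[k]) / denom
    return total

xs = [0, 1, 3, 4]
fs = [x ** 3 for x in xs]
top = divided_differences(xs, fs)
```

For $f(x) = x^3$ the highest divided difference equals $f'''(\xi)/3! = 1$ for every $\xi$, in agreement with (6.3).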

Mathematical Tripos Part IB: Easter 2006
Numerical Analysis, Lecture 9

6.6 The Newton interpolation formula

Recalling that $f[x_i] = f(x_i)$, the recursive formula allows for fast evaluation of the divided difference table, in the following manner:
$$\begin{array}{llll}
f[x_0] & f[x_0, x_1] & f[x_0, x_1, x_2] & f[x_0, x_1, x_2, x_3] \;\cdots \\
f[x_1] & f[x_1, x_2] & f[x_1, x_2, x_3] \;\cdots & \\
\;\vdots & & & \\
f[x_n] & & &
\end{array}$$
This can be done in $O(n^2)$ operations and the outcome is the set of numbers $\{f[x_0, x_1, \dots, x_l]\}_{l=0}^{n}$.

We now provide an alternative representation of the interpolating polynomial. Again, $f(x_i)$, $i = 0, 1, \dots, k$, are given and we seek $p \in P_k[x]$ such that $p(x_i) = f(x_i)$, $i = 0, \dots, k$.

Theorem. Suppose that $x_0, x_1, \dots, x_k$ are pairwise distinct. The polynomial
$$p_k(x) := f[x_0] + f[x_0, x_1](x - x_0) + \cdots + f[x_0, x_1, \dots, x_k] \prod_{i=0}^{k-1} (x - x_i) \in P_k[x]$$
obeys $p_k(x_i) = f(x_i)$, $i = 0, 1, \dots, k$.

Proof. By induction on $k$. The statement is obvious for $k = 0$ and we suppose that it is true for $k$. Let $p_{k+1} \in P_{k+1}[x]$ denote the polynomial that interpolates $f$ at $x_0, \dots, x_{k+1}$. We now prove that $p_{k+1}(x) - p_k(x) = f[x_0, x_1, \dots, x_{k+1}] \prod_{i=0}^k (x - x_i)$. Clearly, $p_{k+1} - p_k \in P_{k+1}[x]$ and the coefficient of $x^{k+1}$ therein is, by definition, $f[x_0, \dots, x_{k+1}]$. Moreover, $p_{k+1}(x_i) - p_k(x_i) = 0$, $i = 0, 1, \dots, k$, hence it is a multiple of $\prod_{i=0}^k (x - x_i)$, and this proves the asserted form of $p_{k+1} - p_k$. The explicit form of $p_{k+1}$ follows by adding $p_{k+1} - p_k$ to $p_k$.

We have derived the Newton interpolation formula, which requires only the top row of the divided difference table. It has several advantages over Lagrange's. In particular, its evaluation at a given point $x$ (provided that divided differences are known) requires just $O(k)$ operations, as long as we do it by the Horner scheme
$$p_k(x) = \{\cdots\{\{f[x_0, \dots, x_k](x - x_{k-1}) + f[x_0, \dots, x_{k-1}]\}(x - x_{k-2}) + f[x_0, \dots, x_{k-2}]\}(x - x_{k-3}) + \cdots\}(x - x_0) + f[x_0].$$
On the other hand, the Lagrange formula is often better when we wish to manipulate the interpolation polynomial as part of a larger mathematical expression. We'll see an example in the section on Gaussian quadrature.

7 Orthogonal polynomials

7.1 Orthogonality in general linear spaces

We have already seen the scalar product $\langle x, y\rangle = \sum_{i=1}^n x_i y_i$, acting on $x, y \in \mathbb{R}^n$.
Likewise, given arbitrary weights $w_1, w_2, \dots, w_n > 0$, we may define $\langle x, y\rangle = \sum_{i=1}^n w_i x_i y_i$. In general, a scalar (or

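inner) product is any function $V \times V \to \mathbb{R}$, where $V$ is a vector space over the reals, subject to the following three axioms:

Symmetry: $\langle x, y\rangle = \langle y, x\rangle$ for all $x, y \in V$;
Nonnegativity: $\langle x, x\rangle \ge 0$ for all $x \in V$ and $\langle x, x\rangle = 0$ iff $x = 0$; and
Linearity: $\langle ax + by, z\rangle = a\langle x, z\rangle + b\langle y, z\rangle$ for all $x, y, z \in V$, $a, b \in \mathbb{R}$.

Given a scalar product, we may define orthogonality: $x, y \in V$ are orthogonal if $\langle x, y\rangle = 0$.

Let $V = C[a, b]$, let $w \in V$ be a fixed positive function and define $\langle f, g\rangle := \int_a^b w(x) f(x) g(x)\, dx$ for all $f, g \in V$. It is easy to verify all three axioms of the scalar product.

7.2 Orthogonal polynomials: definition, existence, uniqueness

Given a scalar product in $V = P_n[x]$, we say that $p_n \in P_n[x]$ is the $n$th orthogonal polynomial if $\langle p_n, p\rangle = 0$ for all $p \in P_{n-1}[x]$. [Note: different inner products lead to different orthogonal polynomials.] A polynomial in $P_n[x]$ is monic if the coefficient of $x^n$ therein equals one.

Theorem. For every $n \ge 0$ there exists a unique monic orthogonal polynomial $p_n$ of degree $n$. Moreover, any $p \in P_n[x]$ can be expanded as a linear combination of $p_0, p_1, \dots, p_n$.

Proof. We let $p_0(x) \equiv 1$ and prove the theorem by induction on $n$. Thus, suppose that $p_0, p_1, \dots, p_n$ have been already derived consistently with both assertions of the theorem and let $q(x) := x^{n+1} \in P_{n+1}[x]$. Motivated by the Gram-Schmidt algorithm, we choose
$$p_{n+1}(x) = q(x) - \sum_{k=0}^n \frac{\langle q, p_k\rangle}{\langle p_k, p_k\rangle}\, p_k(x), \qquad x \in \mathbb{R}. \qquad (7.1)$$
Clearly, $p_{n+1} \in P_{n+1}[x]$ and it is monic (since all the terms in the sum are of degree $\le n$). Let $m \in \{0, 1, \dots, n\}$. It follows from (7.1) and the induction hypothesis that
$$\langle p_{n+1}, p_m\rangle = \langle q, p_m\rangle - \sum_{k=0}^n \frac{\langle q, p_k\rangle}{\langle p_k, p_k\rangle} \langle p_k, p_m\rangle = \langle q, p_m\rangle - \frac{\langle q, p_m\rangle}{\langle p_m, p_m\rangle} \langle p_m, p_m\rangle = 0.$$
Hence, $p_{n+1}$ is orthogonal to $p_0, \dots, p_n$. Consequently, according to the second inductive assertion, it is orthogonal to all $p \in P_n[x]$.

To prove uniqueness, we suppose the existence of two monic orthogonal polynomials $p_{n+1}, \tilde p_{n+1} \in P_{n+1}[x]$. Let $p := p_{n+1} - \tilde p_{n+1} \in P_n[x]$, hence $\langle p_{n+1}, p\rangle = \langle \tilde p_{n+1}, p\rangle = 0$, and this implies
$$0 = \langle p_{n+1}, p\rangle - \langle \tilde p_{n+1}, p\rangle = \langle p_{n+1} - \tilde p_{n+1}, p\rangle = \langle p, p\rangle,$$
and we deduce $p \equiv 0$.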
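Finally, in order to prove that each $p \in P_{n+1}[x]$ is a linear combination of $p_0, \dots, p_{n+1}$, we note that we can always write it in the form $p = cp_{n+1} + q$, where $c$ is the coefficient of $x^{n+1}$ in $p$ and where $q \in P_n[x]$. According to the induction hypothesis, $q$ can be expanded as a linear combination of $p_0, p_1, \dots, p_n$, hence our assertion is true.

Well-known examples of orthogonal polynomials include
$$\begin{array}{llll}
\text{Name} & \text{Notation} & \text{Interval} & \text{Weight function} \\ \hline
\text{Legendre} & P_n & [-1, 1] & w(x) \equiv 1 \\
\text{Chebyshev} & T_n & [-1, 1] & w(x) = (1 - x^2)^{-1/2} \\
\text{Laguerre} & L_n & [0, \infty) & w(x) = e^{-x} \\
\text{Hermite} & H_n & (-\infty, \infty) & w(x) = e^{-x^2}
\end{array}$$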

Mathematical Tripos Part IB: Easter 2006
Numerical Analysis, Lecture 10

7.3 The three-term recurrence relation

How to construct orthogonal polynomials? (7.1) might help, but it suffers from the same problem as the Gram-Schmidt algorithm in Euclidean spaces: loss of accuracy due to imprecisions in the calculation of scalar products. A considerably better procedure follows from our next theorem.

Theorem. Monic orthogonal polynomials are given by the formula
$$p_{-1}(x) \equiv 0, \quad p_0(x) \equiv 1, \quad p_{n+1}(x) = (x - \alpha_n)p_n(x) - \beta_n p_{n-1}(x), \quad n = 0, 1, \dots, \qquad (7.2)$$
where
$$\alpha_n := \frac{\langle p_n, xp_n\rangle}{\langle p_n, p_n\rangle}, \qquad \beta_n = \frac{\langle p_n, p_n\rangle}{\langle p_{n-1}, p_{n-1}\rangle} > 0.$$

Proof. Pick $n \ge 0$ and let $\psi(x) := p_{n+1}(x) - (x - \alpha_n)p_n(x) + \beta_n p_{n-1}(x)$. Since $p_n$ and $p_{n+1}$ are monic, it follows that $\psi \in P_n[x]$. Moreover, because of orthogonality of $p_{n-1}, p_n, p_{n+1}$,
$$\langle \psi, p_l\rangle = \langle p_{n+1}, p_l\rangle - \langle p_n, (x - \alpha_n)p_l\rangle + \beta_n\langle p_{n-1}, p_l\rangle = 0, \qquad l = 0, 1, \dots, n - 2.$$
Because of monicity, $xp_{n-1} = p_n + q$, where $q \in P_{n-1}[x]$. Thus, from the definition of $\alpha_n$, $\beta_n$,
$$\langle \psi, p_{n-1}\rangle = -\langle p_n, xp_{n-1}\rangle + \beta_n\langle p_{n-1}, p_{n-1}\rangle = -\langle p_n, p_n\rangle + \beta_n\langle p_{n-1}, p_{n-1}\rangle = 0,$$
$$\langle \psi, p_n\rangle = -\langle xp_n, p_n\rangle + \alpha_n\langle p_n, p_n\rangle = 0.$$
Every $p \in P_n[x]$ that obeys $\langle p, p_l\rangle = 0$, $l = 0, 1, \dots, n$, must necessarily be the zero polynomial. For suppose that it is not so and let $x^s$ be the highest power of $x$ in $p$. Then $\langle p, p_s\rangle \ne 0$, which is impossible. We deduce that $\psi \equiv 0$, hence (7.2) is true.

Example: Chebyshev polynomials. We choose the scalar product
$$\langle f, g\rangle := \int_{-1}^1 f(x) g(x) \frac{dx}{\sqrt{1 - x^2}}, \qquad f, g \in C[-1, 1],$$
and define $T_n \in P_n[x]$ by the relation $T_n(\cos\theta) = \cos(n\theta)$. Hence $T_0(x) \equiv 1$, $T_1(x) = x$, $T_2(x) = 2x^2 - 1$ etc. Changing the integration variable,
$$\langle T_n, T_m\rangle = \int_{-1}^1 T_n(x) T_m(x) \frac{dx}{\sqrt{1 - x^2}} = \int_0^\pi \cos n\theta \cos m\theta\, d\theta = \frac12 \int_0^\pi [\cos(n + m)\theta + \cos(n - m)\theta]\, d\theta = 0$$
whenever $n \ne m$. The recurrence relation for Chebyshev polynomials is particularly simple,
$$T_{n+1}(x) = 2xT_n(x) - T_{n-1}(x),$$
as can be verified at once from the identity
$$\cos[(n + 1)\theta] + \cos[(n - 1)\theta] = 2\cos(\theta)\cos(n\theta).$$
Note that the $T_n$'s aren't monic, hence the inconsistency with (7.2). To obtain monic polynomials take $T_n(x)/2^{n-1}$, $n \ge 1$.
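The recurrence (7.2) is easy to try out when the scalar products can be evaluated exactly. The sketch below assumes, purely for illustration, the Legendre case $w \equiv 1$ on $[-1, 1]$; polynomials are stored as coefficient lists (lowest power first), and the helper names `poly_mul`, `inner` and `monic_orthogonal` are hypothetical.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def inner(p, q):
    """<p, q> = integral over [-1, 1] of p(x) q(x) dx (weight w = 1)."""
    # the integral of x^k over [-1, 1] is 2/(k+1) for even k, 0 for odd k
    return sum(c * Fraction(2, k + 1)
               for k, c in enumerate(poly_mul(p, q)) if k % 2 == 0)

def monic_orthogonal(n):
    """Return [p_0, ..., p_n] from p_{k+1} = (x - alpha_k) p_k - beta_k p_{k-1}."""
    X = [Fraction(0), Fraction(1)]           # the polynomial x
    polys = [[Fraction(1)]]                  # p_0 = 1
    prev = [Fraction(0)]                     # p_{-1} = 0
    for k in range(n):
        pk = polys[-1]
        alpha = inner(pk, poly_mul(X, pk)) / inner(pk, pk)
        # beta_0 multiplies p_{-1} = 0, so its value is irrelevant
        beta = inner(pk, pk) / inner(prev, prev) if k > 0 else Fraction(0)
        nxt = poly_mul(X, pk)                # x * p_k
        for i, c in enumerate(pk):
            nxt[i] -= alpha * c
        for i, c in enumerate(prev):
            nxt[i] -= beta * c
        prev = pk
        polys.append(nxt)
    return polys

polys = monic_orthogonal(3)
```

Under these assumptions the sketch yields $p_2(x) = x^2 - \frac13$ and $p_3(x) = x^3 - \frac35 x$, the monic multiples of the Legendre polynomials $P_2$ and $P_3$.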

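7.4 Least-squares polynomial fitting

Given $f \in C[a, b]$ and a scalar product $\langle g, h\rangle = \int_a^b w(x) g(x) h(x)\, dx$, we wish to pick $p \in P_n[x]$ so as to minimise $\langle f - p, f - p\rangle$. Again, we stipulate that $w(x) > 0$ for $x \in (a, b)$. Intuitively speaking, $p$ approximates $f$ and is an alternative to an interpolating polynomial. (The situation is similar to the one that we have already encountered in numerical linear algebra, least-squares fitting vs solving linear equations.)

Let $p_0, p_1, \dots, p_n$ be orthogonal polynomials w.r.t. the underlying inner product, $p_l \in P_l[x]$. They form a basis of $P_n[x]$, therefore for every $p \in P_n[x]$ there exist $c_0, c_1, \dots, c_n \in \mathbb{R}$ such that $p = \sum_{k=0}^n c_k p_k$. Because of orthogonality,
$$\langle f - p, f - p\rangle = \Big\langle f - \sum_{k=0}^n c_k p_k,\; f - \sum_{k=0}^n c_k p_k\Big\rangle = \langle f, f\rangle - 2\sum_{k=0}^n c_k\langle p_k, f\rangle + \sum_{k=0}^n c_k^2\langle p_k, p_k\rangle.$$
To derive optimal $c_0, c_1, \dots, c_n$ we seek to minimise the last expression. (Note that it is a quadratic function in the $c_i$'s.) Since
$$\frac12 \frac{\partial}{\partial c_k}\langle f - p, f - p\rangle = -\langle p_k, f\rangle + c_k\langle p_k, p_k\rangle, \qquad k = 0, 1, \dots, n,$$
setting the gradient to zero yields
$$p(x) = \sum_{k=0}^n \frac{\langle p_k, f\rangle}{\langle p_k, p_k\rangle}\, p_k(x). \qquad (7.3)$$
Note that
$$\langle f - p, f - p\rangle = \langle f, f\rangle - \sum_{k=0}^n \{2c_k\langle p_k, f\rangle - c_k^2\langle p_k, p_k\rangle\} = \langle f, f\rangle - \sum_{k=0}^n \frac{\langle p_k, f\rangle^2}{\langle p_k, p_k\rangle}. \qquad (7.4)$$
This identity can be rewritten as $\langle f - p, f - p\rangle + \langle p, p\rangle = \langle f, f\rangle$, reminiscent of the Pythagoras theorem.

How to choose n? Note that $c_k = \langle p_k, f\rangle / \langle p_k, p_k\rangle$ is independent of $n$. Thus, we can continue to add terms to (7.3) until $\langle f - p, f - p\rangle$ is below a specified tolerance $\varepsilon$. Because of (7.4), we need to pick $n$ so that
$$\langle f, f\rangle - \varepsilon < \sum_{k=0}^n \frac{\langle p_k, f\rangle^2}{\langle p_k, p_k\rangle}.$$

Theorem (The Parseval identity). Let $[a, b]$ be finite. Then
$$\sum_{k=0}^\infty \frac{\langle p_k, f\rangle^2}{\langle p_k, p_k\rangle} = \langle f, f\rangle. \qquad (7.5)$$

Incomplete proof. Let
$$\sigma_n := \sum_{k=0}^n \frac{\langle p_k, f\rangle^2}{\langle p_k, p_k\rangle}, \qquad n = 0, 1, \dots,$$
hence $\langle f - p, f - p\rangle = \langle f, f\rangle - \sigma_n \ge 0$. The sequence $\{\sigma_n\}_{n=0}^\infty$ increases monotonically and $\sigma_n \le \langle f, f\rangle$ implies that $\lim_{n\to\infty}\sigma_n$ exists. According to the Weierstrass theorem, any function in $C[a, b]$ can be approximated arbitrarily closely by a polynomial, hence $\lim_{n\to\infty}\langle f - p, f - p\rangle = 0$ and we deduce that $\sigma_n \to \langle f, f\rangle$ as $n \to \infty$ and (7.5) is true.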
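A worked instance of (7.3) and (7.4) may help. The sketch below assumes $w \equiv 1$ on $[-1, 1]$, takes $f(x) = x^4$ (a polynomial, so every scalar product is an exact rational number) and fits it from $P_2[x]$ using the monic orthogonal polynomials $1$, $x$, $x^2 - \frac13$. All names (`poly_mul`, `inner`, `least_squares_fit`) and the choice of $f$ are illustrative assumptions.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (low order first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def inner(p, q):
    """<p, q> = integral over [-1, 1] of p(x) q(x) dx (weight w = 1)."""
    return sum(c * Fraction(2, k + 1)
               for k, c in enumerate(poly_mul(p, q)) if k % 2 == 0)

def least_squares_fit(f, basis):
    """Coefficients c_k = <p_k, f>/<p_k, p_k> and the fit p = sum c_k p_k (7.3)."""
    coeffs = [inner(pk, f) / inner(pk, pk) for pk in basis]
    p = [Fraction(0)] * max(len(pk) for pk in basis)
    for c, pk in zip(coeffs, basis):
        for i, a in enumerate(pk):
            p[i] += c * a
    return coeffs, p

basis = [[1], [0, 1], [Fraction(-1, 3), 0, 1]]   # p_0, p_1, p_2
f = [0, 0, 0, 0, 1]                              # f(x) = x^4
coeffs, p = least_squares_fit(f, basis)

# squared error from (7.4): <f, f> - sum_k <p_k, f>^2 / <p_k, p_k>
error = inner(f, f) - sum(inner(pk, f) ** 2 / inner(pk, pk) for pk in basis)
```

Here the fit is $p(x) = \frac67 x^2 - \frac{3}{35}$, and the value of `error` coincides with $\langle f - p, f - p\rangle$ computed directly.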

Mathematical Tripos Part IB: Easter 2006
Numerical Analysis, Lecture 11

7.5 Least-squares fitting to discrete function values

Suppose that $m \ge n + 1$. We are given $m$ function values $f(x_1), f(x_2), \dots, f(x_m)$, where the $x_k$'s are pairwise distinct, and seek $p \in P_n[x]$ that minimises $\langle f - p, f - p\rangle$, where
$$\langle g, h\rangle := \sum_{k=1}^m g(x_k) h(x_k). \qquad (7.6)$$
One alternative is to express $p$ as $\sum_{l=0}^n c_l x^l$ and find optimal $c_0, \dots, c_n$ as a solution of a linear least squares problem similarly to Section 5, using QR factorization. Another alternative is to construct orthogonal polynomials w.r.t. the scalar product (7.6). The theory is identical to that of subsections 7.1 to 7.4, except that we have enough data to evaluate only $p_0, p_1, \dots, p_{m-1}$. However, we need just $p_0, p_1, \dots, p_n$ and $n \le m - 1$, and we have enough information to implement the algorithm. Thus:

1. Employ the three-term recurrence (7.2) to calculate $p_0, p_1, \dots, p_n$ (of course, using the scalar product (7.6));
2. Form $p(x) = \sum_{k=0}^n \frac{\langle p_k, f\rangle}{\langle p_k, p_k\rangle}\, p_k(x)$.

Since the work for each $k$ is bounded by a constant multiple of $m$, the complete cost is $O(mn)$, as compared with $O(n^2 m)$ if QR is used.

7.6 Gaussian quadrature

We are again in $C[a, b]$ and a scalar product is defined as in subsection 7.1, namely $\langle f, g\rangle = \int_a^b w(x) f(x) g(x)\, dx$, where $w(x) > 0$ for $x \in (a, b)$. Our goal is to approximate integrals by finite sums,
$$\int_a^b w(x) f(x)\, dx \approx \sum_{k=1}^\nu b_k f(c_k), \qquad f \in C[a, b].$$
The above is known as a quadrature formula. Here $\nu$ is given, whereas the points $b_1, \dots, b_\nu$ (the weights) and $c_1, \dots, c_\nu$ (the nodes) are independent of the choice of $f$.

A reasonable approach to achieving high accuracy is to require that the approximant is exact for all $f \in P_m[x]$, where $m$ is as large as possible. This results in Gaussian quadrature and we will demonstrate that $m = 2\nu - 1$ can be attained.

Firstly, we claim that $m = 2\nu$ is impossible. To prove this, choose arbitrary nodes $c_1, \dots, c_\nu$ and note that $p(x) := \prod_{k=1}^\nu (x - c_k)^2$ lives in $P_{2\nu}[x]$. But $\int_a^b w(x) p(x)\, dx > 0$, while $\sum_{k=1}^\nu b_k p(c_k) = 0$ for any choice of weights $b_1, \dots, b_\nu$. Hence the integral and the quadrature do not match.

Let $p_0, p_1, p_2, \dots$ denote, as before, the monic polynomials which are orthogonal w.r.t.
the underlying scalar product.

Theorem. Given $n \ge 1$, all the zeros of $p_n$ are real, distinct and lie in the interval $(a, b)$.

Proof. Recall that $p_0 \equiv 1$. Thus, by orthogonality,
$$\int_a^b w(x) p_n(x)\, dx = \int_a^b w(x) p_0(x) p_n(x)\, dx = \langle p_0, p_n\rangle = 0$$

and we deduce that $p_n$ changes sign at least once in $(a, b)$. Denote by $m \ge 1$ the number of the sign changes of $p_n$ in $(a, b)$ and assume that $m \le n - 1$. Denoting the points where a sign change occurs by $\xi_1, \xi_2, \dots, \xi_m$, we let $q(x) := \prod_{j=1}^m (x - \xi_j)$. Since $q \in P_m[x]$, $m \le n - 1$, it follows that $\langle q, p_n\rangle = 0$. On the other hand, it follows from our construction that $q(x) p_n(x)$ does not change sign throughout $[a, b]$ and vanishes at a finite number of points, hence
$$|\langle q, p_n\rangle| = \left|\int_a^b w(x) q(x) p_n(x)\, dx\right| = \int_a^b w(x)\, |q(x) p_n(x)|\, dx > 0,$$
a contradiction. It follows that $m = n$ and the proof is complete.

We commence our construction of Gaussian quadrature by choosing pairwise-distinct nodes $c_1, c_2, \dots, c_\nu \in [a, b]$ and defining the interpolatory weights
$$b_k := \int_a^b w(x) \prod_{\substack{j=1 \\ j \ne k}}^\nu \frac{x - c_j}{c_k - c_j}\, dx, \qquad k = 1, 2, \dots, \nu.$$

Theorem. The quadrature formula with the above choice is exact for all $f \in P_{\nu-1}[x]$. Moreover, if $c_1, c_2, \dots, c_\nu$ are the zeros of $p_\nu$ then it is exact for all $f \in P_{2\nu-1}[x]$.

Proof. Every $f \in P_{\nu-1}[x]$ is its own interpolating polynomial, hence by Lagrange's formula
$$f(x) = \sum_{k=1}^\nu f(c_k) \prod_{\substack{j=1 \\ j \ne k}}^\nu \frac{x - c_j}{c_k - c_j}. \qquad (7.7)$$
The quadrature is exact for all $f \in P_{\nu-1}[x]$ if $\int_a^b w(x) f(x)\, dx = \sum_{k=1}^\nu b_k f(c_k)$, and this, in tandem with the interpolating-polynomial representation, yields the stipulated form of $b_1, \dots, b_\nu$.

Let $c_1, \dots, c_\nu$ be the zeros of $p_\nu$. Given any $f \in P_{2\nu-1}[x]$, we can represent it uniquely as $f = qp_\nu + r$, where $q, r \in P_{\nu-1}[x]$. Thus, by orthogonality,
$$\int_a^b w(x) f(x)\, dx = \int_a^b w(x)[q(x) p_\nu(x) + r(x)]\, dx = \langle q, p_\nu\rangle + \int_a^b w(x) r(x)\, dx = \int_a^b w(x) r(x)\, dx.$$
On the other hand, the choice of quadrature knots gives
$$\sum_{k=1}^\nu b_k f(c_k) = \sum_{k=1}^\nu b_k [q(c_k) p_\nu(c_k) + r(c_k)] = \sum_{k=1}^\nu b_k r(c_k).$$
Hence the integral and its approximant coincide, because $r \in P_{\nu-1}[x]$ and the quadrature is exact for all polynomials in $P_{\nu-1}[x]$.

Example. Let $[a, b] = [-1, 1]$, $w(x) \equiv 1$. Then the underlying orthogonal polynomials are the Legendre polynomials:
$$P_0 \equiv 1, \quad P_1(x) = x, \quad P_2(x) = \tfrac32 x^2 - \tfrac12, \quad P_3(x) = \tfrac52 x^3 - \tfrac32 x, \quad P_4(x) = \tfrac{35}{8} x^4 - \tfrac{15}{4} x^2 + \tfrac38$$
(it is customary to use this, non-monic, normalisation).
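The nodes of Gaussian quadrature are

$n = 1$: $c_1 = 0$;
$n = 2$: $c_1 = -\frac{\sqrt{3}}{3}$, $c_2 = \frac{\sqrt{3}}{3}$;
$n = 3$: $c_1 = -\frac{\sqrt{15}}{5}$, $c_2 = 0$, $c_3 = \frac{\sqrt{15}}{5}$;
$n = 4$: $c_1 = -\sqrt{\frac37 + \frac{2}{35}\sqrt{30}}$, $c_2 = -\sqrt{\frac37 - \frac{2}{35}\sqrt{30}}$, $c_3 = \sqrt{\frac37 - \frac{2}{35}\sqrt{30}}$, $c_4 = \sqrt{\frac37 + \frac{2}{35}\sqrt{30}}$.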
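The exactness claim can be checked numerically for the three-node Legendre case. The sketch below is a floating-point illustration under stated assumptions: `interpolatory_weights` and `quadrature` are hypothetical helper names, $w \equiv 1$ on $[-1, 1]$, and each weight is obtained by integrating the corresponding Lagrange cardinal polynomial term by term.

```python
import math

def interpolatory_weights(nodes):
    """b_k = integral over [-1, 1] of the k-th Lagrange cardinal polynomial."""
    weights = []
    for k, ck in enumerate(nodes):
        coeffs = [1.0]                       # cardinal polynomial, low order first
        for j, cj in enumerate(nodes):
            if j != k:
                # multiply the current polynomial by (x - c_j) / (c_k - c_j)
                scale = 1.0 / (ck - cj)
                new = [0.0] * (len(coeffs) + 1)
                for i, a in enumerate(coeffs):
                    new[i] -= a * cj * scale
                    new[i + 1] += a * scale
                coeffs = new
        # the integral of x^i over [-1, 1] is 2/(i+1) for even i, 0 for odd i
        weights.append(sum(a * 2.0 / (i + 1)
                           for i, a in enumerate(coeffs) if i % 2 == 0))
    return weights

def quadrature(f, nodes, weights):
    """Evaluate the quadrature sum of b_k f(c_k)."""
    return sum(b * f(c) for b, c in zip(weights, nodes))

nodes = [-math.sqrt(15) / 5, 0.0, math.sqrt(15) / 5]   # zeros of P_3
weights = interpolatory_weights(nodes)
```

With these nodes the weights come out as $\frac59, \frac89, \frac59$ up to rounding, and the rule integrates $x^k$ exactly for $k \le 5 = 2\nu - 1$ but not for $k = 6$, consistent with the argument that $m = 2\nu$ is impossible.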