Numerical Linear Algebra Notes

Brian Bockelman

October 11, 2006

1 Linear Algebra Background

Definition 1.0.1. An inner product on $F^n$ ($\mathbb{R}^n$ or $\mathbb{C}^n$) is a function $\langle \cdot, \cdot \rangle : F^n \times F^n \to F$ satisfying, for all $u, v, w \in F^n$ and $c \in F$:

1. $\langle v, v \rangle \ge 0$, with equality iff $v = 0$
2. $\langle u, v + w \rangle = \langle u, v \rangle + \langle u, w \rangle$
3. $\langle u, v \rangle = \overline{\langle v, u \rangle}$
4. $\langle u, cv \rangle = c\,\langle u, v \rangle$

A norm on $F^n$ is a function $\|\cdot\| : F^n \to \mathbb{R}$ such that for all $u, v \in F^n$ and $c \in F$:

1. $\|u\| \ge 0$, with equality iff $u = 0$
2. $\|cu\| = |c|\,\|u\|$
3. $\|u + v\| \le \|u\| + \|v\|$

Theorem 1.0.2 (CBS inequality).
$$|\langle u, v \rangle|^2 \le \langle u, u \rangle\,\langle v, v \rangle$$

Examples:

1. (Inner products) Let $H \in \mathbb{C}^{n,n}$ be Hermitian ($H^* = H$) and positive definite (i.e., $v^* H v \ge 0$ for all $v \in \mathbb{C}^n$, with equality iff $v = 0$; this is equivalent to all eigenvalues of $H$ being positive). If $H$ is real, we call it symmetric positive definite (SPD). Define $\langle u, v \rangle = u^* H v$. It is a simple exercise to show that this satisfies the requirements of an inner product.

2. (Special example of the above) Let $A$ be a nonsingular (invertible) $n \times n$ matrix over $\mathbb{C}$. Define $H = A^* A$. Note
$$(A^* A)^* = A^* (A^*)^* = A^* A.$$
Also, if $v \ne 0$, $v \in \mathbb{C}^n$, then
$$v^* H v = v^* A^* A v = (Av)^* (Av) = \|Av\|_2^2.$$
Since $A$ is invertible, $N(A) = \{0\}$, so $v \ne 0 \Rightarrow Av \ne 0 \Rightarrow \|Av\|_2^2 > 0$.

3. Norms:

(a) Induced norm (from an inner product $\langle \cdot, \cdot \rangle$): $\|v\| = \sqrt{\langle v, v \rangle}$. Exercise: verify that the norm laws hold. This allows us to restate CBS as
$$|\langle u, v \rangle| \le \|u\|\,\|v\|,$$
where $\|\cdot\|$ is induced from the inner product.

(b) $p$-norms. Let $p \ge 1$ be real and $v = (v_1, \dots, v_n)$. Then
$$\|v\|_p = \big(|v_1|^p + \cdots + |v_n|^p\big)^{1/p}, \qquad \|v\|_\infty = \lim_{p \to \infty} \|v\|_p = \max\{|v_1|, \dots, |v_n|\}.$$
Important norms:
$p = 1$: $\|v\|_1 = |v_1| + \cdots + |v_n|$
$p = 2$: $\|v\|_2 = \big(|v_1|^2 + \cdots + |v_n|^2\big)^{1/2}$
$p = \infty$: $\|v\|_\infty = \max\{|v_1|, \dots, |v_n|\}$

Example: $v = (2,\, 3+i,\, 4,\, i) \in \mathbb{C}^4$. Then
$$\|v\|_1 = 7 + \sqrt{10}, \qquad \|v\|_2 = \sqrt{31}, \qquad \|v\|_\infty = 4.$$

(c) Matrix norms: $\mathbb{C}^{m,n}$ or $\mathbb{R}^{m,n}$ is a vector space in its own right, so anything satisfying the norm laws works, e.g. the Frobenius norm. $\mathrm{vec}(A)$ is the vector resulting from stacking the columns of $A$ in order, so any vector norm gives us a matrix norm. In particular,
$$\|A\|_F = \|\mathrm{vec}(A)\|_2.$$
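A quick numerical check of the example above (a minimal MATLAB sketch; the matrix used for the Frobenius check is arbitrary and only for illustration):

% Vector norms of v = (2, 3+i, 4, i); compare with the hand computation.
v = [2; 3+1i; 4; 1i];
norm(v, 1)                       % 7 + sqrt(10) ~= 10.1623
norm(v, 2)                       % sqrt(31)     ~=  5.5678
norm(v, inf)                     % 4

% Frobenius norm as the 2-norm of the stacked columns vec(A).
A = [1 2; 3 4];
norm(A, 'fro') - norm(A(:), 2)   % ~ 0: ||A||_F = ||vec(A)||_2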

Key Fact: Operator norm. Let $\|\cdot\|$ be a norm on the vector spaces $F^m$ and $F^n$, where $A$ is $m \times n$. Define the operator norm to be
$$\|A\| = \max_{x \ne 0} \frac{\|Ax\|}{\|x\|} = \max_{\|x\| = 1} \|Ax\|.$$
This is a norm (exercise).

Fact: Let $A$ be $m \times n$ and use the vector $p$-norms. Write $A = [a_1, \dots, a_n]$ (columns). Then
$$\|A\|_1 = \max\{\|a_1\|_1, \dots, \|a_n\|_1\}, \qquad \|A\|_\infty = \|A^T\|_1 = \max\{\|r_1\|_1, \dots, \|r_m\|_1\},$$
where $A^T = [r_1, \dots, r_m]$, i.e. the $r_i$ are the rows of $A$.

Key Property: A matrix norm is multiplicative if $\|AB\| \le \|A\|\,\|B\|$.

1. Every linearly independent set in a finite dimensional vector space can be enlarged to a basis.
2. Every orthonormal set in a finite dimensional inner product space can be enlarged to an orthonormal basis.
3. Let $A$ be $m \times n$, and let $U$, $V$ be unitary matrices, $m \times m$ and $n \times n$ respectively. Then
$$\|UAV\|_F = \|A\|_F, \qquad \|UAV\|_2 = \|A\|_2.$$
This needs the elementary fact that for $v \in \mathbb{C}^n$ and $V$ an $n \times n$ unitary matrix, $\|Vv\|_2 = \|v\|_2$.

2 Factorizations

2.1 Schur Factorization

Theorem 2.1.1 (Schur Triangularization Theorem). Let $A$ be an $n \times n$ complex matrix. Then there exists a unitary matrix $U$ such that
$$U^* A U = T$$
where $T$ is an upper triangular matrix.

Proof. Proceed by induction on $n$. The case $n = 1$ is trivial. Suppose the theorem is true for all sizes less than $n$. Let $A$ be an $n \times n$ matrix, $n > 1$. Compute an eigenvalue $\lambda$ of $A$ and a corresponding eigenvector $u$ of unit length. Next, use the key fact that the orthonormal set $\{u_1 = u\}$ can be expanded to an orthonormal basis $u_1, \dots, u_n$ of $\mathbb{C}^n$. Form
$$U_1 = [u_1, u_2, \dots, u_n].$$
This is unitary. Calculate
$$U_1^* A U_1 = \begin{bmatrix} u_1^* \\ \vdots \\ u_n^* \end{bmatrix} \begin{bmatrix} \lambda u_1 & \cdots \end{bmatrix} = \begin{bmatrix} u_1^* \lambda u_1 & * \\ \vdots & A_1 \\ u_n^* \lambda u_1 & \end{bmatrix} = \begin{bmatrix} \lambda & * \\ 0 & A_1 \end{bmatrix},$$
where the zeros in the first column come from the orthonormality of the $u_i$ and $A_1$ is $(n-1) \times (n-1)$. By induction, there exists a unitary matrix $U_2$ such that $U_2^* A_1 U_2$ is upper triangular. Then form
$$U_3 = \begin{bmatrix} 1 & 0 \\ 0 & U_2 \end{bmatrix}.$$
Finally, form $U = U_1 U_3$. Then $U^* A U$ is upper triangular. $\square$

Applications

Theorem 2.1.2 (Principal Axes Theorem). If $A$ is Hermitian, then there exists a unitary $U$ such that $U^* A U$ is diagonal, and the eigenvalues of $A$ are real. If $A$ is real, then $U$ can be chosen to be orthogonal.

Proof. Apply Schur: $U^* A U = T$ with $T$ upper triangular for some unitary $U$. But
$$T^* = (U^* A U)^* = U^* A^* U = U^* A U = T.$$

Thus $T^* = T$, so $T$ is both upper and lower triangular; hence $T$ is a diagonal matrix. Moreover,
$$T = \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix} = T^* = \begin{bmatrix} \bar{\lambda}_1 & & 0 \\ & \ddots & \\ 0 & & \bar{\lambda}_n \end{bmatrix},$$
so each $\lambda_i = \bar{\lambda}_i$ is real. $\square$

2.2 Singular Value Decomposition

Theorem 2.2.1 (Singular Value Decomposition (SVD)). Let $A$ be an $m \times n$ matrix, $A \in \mathbb{C}^{m,n}$. Then there exist unitary matrices $U$ and $V$ (which can be chosen real if $A$ is real) such that
$$U^* A V = \Sigma = \begin{bmatrix} \sigma_1 & & \\ & \sigma_2 & \\ & & \ddots \end{bmatrix},$$
where $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_p \ge 0$ with $p = \min\{m, n\}$. Moreover, $\sigma_1, \dots, \sigma_p$ are uniquely determined by $A$.

Notation: The $\sigma_i$'s are the singular values of $A$. The $u_i$'s are the left singular vectors, and the $v_i$'s are the right singular vectors of $A$.

Proof. Let $B = A^* A$. Then $B$ is Hermitian. We claim that $B$ is positive semidefinite:
$$x^* B x = x^* A^* A x = (Ax)^* (Ax) = \|Ax\|^2 \ge 0.$$
Hence the eigenvalues are also nonnegative: if $e \ne 0$ is an eigenvector for $\lambda$, then
$$0 \le e^* B e = e^* \lambda e = \lambda \|e\|_2^2, \qquad \|e\|_2^2 > 0.$$
Since the eigenvalues are nonnegative, write them as squares of reals and order them:
$$\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n \ge 0.$$
We can diagonalize $B$ (Principal Axes Theorem) to obtain, for some unitary $n \times n$ matrix $V$,
$$V^* B V = \begin{bmatrix} \sigma_1^2 & & 0 \\ & \ddots & \\ 0 & & \sigma_n^2 \end{bmatrix}.$$
Note that
$$\mathrm{rank}(A^* A) \le \mathrm{rank}\, A = r \le \min\{m, n\}.$$
Hence we conclude that $\mathrm{rank}\, B \le r$, and so $\mathrm{rank}(V^* B V) \le r$. But
$$V^* B V = \begin{bmatrix} \sigma_1^2 & & 0 \\ & \ddots & \\ 0 & & \sigma_n^2 \end{bmatrix},$$

so $\sigma_j = 0$ for $j > r$ and $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r \ge 0$. Now let
$$u_i = \frac{1}{\sigma_i} A v_i, \qquad i = 1, \dots, q,$$
where $q$ is the number of nonzero $\sigma_i$. Then
$$u_i^* u_j = \left(\frac{1}{\sigma_i} A v_i\right)^* \left(\frac{1}{\sigma_j} A v_j\right) = \frac{1}{\sigma_i \sigma_j} v_i^* B v_j = \frac{1}{\sigma_i \sigma_j} v_i^* \sigma_j^2 v_j = \frac{\sigma_j^2}{\sigma_i \sigma_j} v_i^* v_j = \begin{cases} 0, & i \ne j \\ 1, & i = j \end{cases}$$
Thus $u_1, \dots, u_q$ form an orthonormal set. Fill $u_1, \dots, u_q$ out to an orthonormal basis $u_1, \dots, u_m$ of $\mathbb{C}^m$ and set
$$U = [u_1, \dots, u_m].$$
Then $U$ is unitary. Moreover,
$$U^* A V = \begin{bmatrix} u_1^* \\ \vdots \\ u_m^* \end{bmatrix} [A v_1, A v_2, \dots, A v_n] = [u_i^* A v_j]_{m,n}.$$
But
$$u_i^* A v_j = \begin{cases} 0, & j > q \\ \sigma_i, & j \le q, \ i = j \\ 0, & j \le q, \ i \ne j \end{cases}$$
Reasons: If $j > q$, then $\sigma_j = 0$, so $B v_j = \sigma_j^2 v_j = 0$; therefore
$$v_j^* A^* A v_j = 0 \ \Rightarrow\ \|A v_j\|_2^2 = 0 \ \Rightarrow\ A v_j = 0.$$
If $j \le q$ and $i \ne j$, then $A v_j = \sigma_j u_j$ and orthonormality gives $u_i^* A v_j = \sigma_j u_i^* u_j = 0$. Finally, if $i = j \le q$, then
$$u_i^* A v_i = \frac{1}{\sigma_i} (A v_i)^* A v_i = \frac{1}{\sigma_i} v_i^* A^* A v_i = \frac{1}{\sigma_i} v_i^* \sigma_i^2 v_i = \sigma_i v_i^* v_i = \sigma_i.$$
Finally, $\mathrm{rank}(A) = \mathrm{rank}(U^* A V) = \mathrm{rank}(\Sigma) = q$, so $q = r$. $\square$
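The construction in the proof can be mirrored numerically. A minimal MATLAB sketch (the test matrix is arbitrary): the singular values are recovered as square roots of the eigenvalues of $B = A^*A$, and $U^* A V$ comes out (numerically) diagonal.

% An arbitrary 2x3 example, so m = 2, n = 3, rank 2.
A = [3 1 1; -1 3 1];

[V, D] = eig(A'*A);                 % B = A'*A is Hermitian positive semidefinite
[d, idx] = sort(diag(D), 'descend');
V = V(:, idx);
sigma = sqrt(max(d, 0));            % ordered sigma_1 >= sigma_2 >= ... (trailing ones vanish)

[U, S, W] = svd(A);                 % MATLAB's SVD, for comparison
diag(S)' - sigma(1:2)'              % ~ 0: the sigma_i are uniquely determined by A
norm(U'*A*W - S)                    % ~ eps: U'*A*W is (numerically) diagonal
norm(A, 2) - max(sigma)             % ~ 0: ||A||_2 = sigma_1 (cf. Section 2.3 below)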

Tuesday, September 5, 2006:

Note that $U^* A V = \Sigma$ means
$$A = U \Sigma V^* = [u_1, \dots, u_m] \begin{bmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_p \end{bmatrix} \begin{bmatrix} v_1^* \\ \vdots \\ v_n^* \end{bmatrix} = \sum_{j=1}^{p} \sigma_j u_j v_j^*.$$

If $m \ge n$, so that $n = p$, we have
$$A = [u_1, \dots, u_n]\, \Sigma_p \begin{bmatrix} v_1^* \\ \vdots \\ v_n^* \end{bmatrix} = \bar{U} \Sigma_p V^*$$
(note $\bar{U}$ is not unitary, as it is not square), where $\Sigma_p = \mathrm{diag}\{\sigma_1, \dots, \sigma_p\}$. This is the reduced form of the SVD.

Even further, if $r = \mathrm{rank}(A)$, then $r = \mathrm{rank}(U \Sigma V^*) = \mathrm{rank}(\Sigma)$. Hence $\sigma_j = 0$ for $j > r$, so we can write
$$A = [u_1, \dots, u_r]\, \Sigma_r \begin{bmatrix} v_1^* \\ \vdots \\ v_r^* \end{bmatrix} = \bar{U} \Sigma_r \bar{V}^*.$$
This is customarily called the compact form of the SVD.

2.3 Applications of SVD

We have the following results:

1. $\mathrm{rank}(A) = r$, where $\sigma_r \ne 0$ and $\sigma_{r+1} = 0$.

2. To solve $Ax = b$ stably: form
$$U^* A V (V^* x) = U^* b, \quad \text{i.e.} \quad \Sigma y = c,$$
and set $y_i = c_i / \sigma_i$ for $i = 1, \dots, r$ and $y_i = 0$ for $i > r$ (then $x = V y$). This gives a solution; if $\mathrm{rank}(A) = n$ it gives the unique solution.

3. A square matrix $A$ is invertible iff all singular values are nonzero.

4. $\|A\|_2 = \sigma_1$.

Proof.
$$\|A\|_2 = \|U \Sigma V^*\|_2 = \sup_{\|x\|_2 = 1} \|U(\Sigma V^* x)\|_2 = \sup_{\|x\|_2 = 1} \|\Sigma V^* x\|_2 = \sup_{\|x\|_2 = 1} \|\Sigma y\|_2, \qquad y := V^* x.$$
Note that $\|y\|_2 = \|V^* x\|_2 = \|x\|_2 = 1$. Then
$$\|A\|_2 = \sup_{\|y\|_2 = 1} \|\Sigma y\|_2 = \sigma_1. \qquad \square$$

5. $\|A^{-1}\|_2 = 1/\sigma_n$.

Proof. Note that if $A$ is square, $n \times n$, and invertible,
$$U^* A V = \Sigma = \begin{bmatrix} \sigma_1 & & 0 \\ & \ddots & \\ 0 & & \sigma_n \end{bmatrix}.$$
Then
$$\begin{bmatrix} \sigma_1^{-1} & & 0 \\ & \ddots & \\ 0 & & \sigma_n^{-1} \end{bmatrix} = (U^* A V)^{-1} = V^{-1} A^{-1} (U^*)^{-1} = V^* A^{-1} U.$$
Multiply by permutation matrices to get the $\sigma_j^{-1}$ in the desired (decreasing) order, and notice that a product of unitary matrices is unitary. So, by part 4, $\|A^{-1}\|_2 = 1/\sigma_n$. $\square$

Note: if $A$ is a square, invertible matrix, then
$$\mathrm{cond}_2(A) = \|A\|_2\,\|A^{-1}\|_2 = \frac{\sigma_1}{\sigma_n}.$$

6. $\mathrm{Range}\, A = C(A) = \mathrm{span}\{u_1, u_2, \dots, u_r\}$, where $r = \mathrm{rank}(A)$.

Proof. Remember
$$A = \sum_{k=1}^{r} \sigma_k u_k v_k^*.$$
Then
$$\mathrm{range}(A) = C(A) = \mathrm{span}(\{a_i\}) = \{Ax \mid x \in F^n\}.$$
Thus,
$$\mathrm{range}(A) = \left\{ \left( \sum_{k=1}^{r} \sigma_k u_k v_k^* \right) x \;\middle|\; x \in F^n \right\} = \left\{ \sum_{k=1}^{r} \sigma_k (v_k^* x)\, u_k \;\middle|\; x \in F^n \right\} = \left\{ \sum_{k=1}^{r} y_k u_k \;\middle|\; y \in F^r \right\} = \mathrm{span}\{u_1, \dots, u_r\}. \qquad \square$$

7. $\mathrm{null}(A) = N(A) = \mathrm{span}\{v_{r+1}, \dots, v_n\}$.

8. Let $A_k = \sum_{j=1}^{k} \sigma_j u_j v_j^*$. Then, for $k < r$,
$$\|A - A_k\|_2 = \inf_{B \in \mathbb{C}^{m \times n},\ \mathrm{rank}(B) = k} \|A - B\|_2 = \sigma_{k+1}.$$

3 QR Factorization and Least Squares

3.1 Motivation

We have three points of view.

1. Geometric point of view: (pictures).

2. Analytic view: Define a linear operator $T : \mathbb{C}^m \to \mathbb{C}^m$ and suppose $T$ has the projection property $T^2 = T$. Let
$$U = \mathrm{range}\, T = \{T(x) : x \in \mathbb{C}^m\}, \qquad V = \mathrm{null}(T) = \{x : T(x) = 0\}.$$
One shows that $U + V = \mathbb{C}^m$ and $U \cap V = \{0\}$.

Thursday, September 7, 2006:

3. Matrix view:

Definition 3.1.1. A projector is a $P \in \mathbb{C}^{m,m}$ such that $P^2 = P$.

Fact: Let $P$ be a projector. Let $U = \mathrm{range}\, P = C(P)$ and let $V = \mathrm{null}\, P = N(P)$. Then:
$I - P$ is a projector; $\mathrm{null}\, P = \mathrm{range}(I - P)$; and $\mathbb{C}^m = U \oplus V$.

3.2 Orthogonal Projections

Definition 3.2.1. A projector $P$ is an orthogonal projector if $v^* u = 0$ for all $v \in N(P)$, $u \in C(P)$.

Fact: A projector $P$ is orthogonal iff $P^* = P$.

Proof. If $P$ is orthogonal, then for all $v \in N(P) = C(I - P)$ and $u \in C(P)$ we have $v^* u = 0$. Write $u = Px$ and $v = (I - P)y$, where $x$ and $y$ are arbitrary. Then
$$v^* u = \big((I - P)y\big)^* P x = y^* (I - P)^* P x = 0.$$
So every entry of $(I - P)^* P$ is $0$, i.e. $(I - P)^* P = 0$, which gives $P = P^* P$. Therefore $P^* = (P^* P)^* = P^* P = P$.

Conversely, if $P^* = P$, then with $u = Px$, $v = (I - P)y$,
$$v^* u = y^* (I - P)^* P x = y^* (P - P^2) x = y^* \, 0 \, x = 0. \qquad \square$$

How to construct an orthogonal projection onto $U$ (the range) along $V$ (the nullspace):

Method 1 (derived from the normal equations): Let $U$ be a subspace of $\mathbb{C}^m$ and suppose $\dim U = n < m$. Let $a_1, a_2, \dots, a_n$ be a basis of $U$, and let $A$ be the full column rank matrix
$$A = [a_1, \dots, a_n].$$

(Remark: for any $Ax = b$, the normal equations $A^* A x = A^* b$ always have solutions.) In our case, we can show that $\mathrm{null}(A^* A) = \{0\}$. Hence $A^* A$ is invertible, so the unique solution to the normal equations is
$$x = (A^* A)^{-1} A^* b.$$
Form
$$P = A (A^* A)^{-1} A^*.$$
Then
$$P^2 = A (A^* A)^{-1} A^* A (A^* A)^{-1} A^* = A\, I\, (A^* A)^{-1} A^* = P,$$
and
$$P^* = \big(A (A^* A)^{-1} A^*\big)^* = A \big((A^* A)^*\big)^{-1} A^* = A (A^* A)^{-1} A^* = P.$$

Method 2: Supply an orthonormal basis $u_1, \dots, u_n$ of $U \subseteq \mathbb{C}^m$. Define
$$P = u_1 u_1^* + \cdots + u_n u_n^*.$$
Note that
$$u_i u_i^*\, u_j u_j^* = 0 \quad (i \ne j), \qquad u_i u_i^*\, u_i u_i^* = u_i \cdot 1 \cdot u_i^* = u_i u_i^*.$$
Thus, using these facts, it can be shown that $P^2 = P$. Finally, observe that $P^* = P$ and
$$P x = u_1 (u_1^* x) + \cdots + u_n (u_n^* x) = c_1 u_1 + \cdots + c_n u_n.$$
Take $x = u_i$ and get $P x = u_i$, so $\mathrm{range}(P) = \mathrm{span}(u_1, \dots, u_n)$.

Tuesday, September 19, 2006:

Definition 3.2.2. The Householder transform defined by $v \ne 0$ is
$$H_v := I - 2\,\frac{v v^*}{v^* v}.$$
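Before developing the Householder transform further, here is a minimal MATLAB sketch of the two projector constructions above (Method 1 via the normal equations, Method 2 via an orthonormal basis obtained here from an economy QR factorization). The random test matrix is arbitrary; it has full column rank with probability 1.

m = 6; n = 3;
A = randn(m, n);              % columns a_1,...,a_n: a basis of U

% Method 1: P = A (A*A)^{-1} A*  (without forming the inverse explicitly).
P1 = A / (A'*A) * A';

% Method 2: orthonormal basis u_1,...,u_n of U, then P = sum_i u_i u_i*.
[Q, ~] = qr(A, 0);            % economy QR: columns of Q span U orthonormally
P2 = Q * Q';

norm(P1 - P2)                 % ~ eps: the two constructions agree
norm(P1*P1 - P1)              % ~ eps: P^2 = P
norm(P1' - P1)                % ~ eps: P* = P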

Note that $H_v$ is a unitary Hermitian operator.

Key Idea: We want to map a vector $x$ to a vector $y$ stably. The best choice would be $Ux = y$ with $U$ unitary; however, we must then have $\|x\|_2 = \|y\|_2$. Try $H_v$, where $v = x - y$.

Fact: If $x, y$ are real and of the same length, then $(x - y) \perp (x + y)$.

In general,
$$H_v v = \left(I - 2\,\frac{v v^*}{v^* v}\right) v = v - \frac{2 v (v^* v)}{v^* v} = -v.$$
Hence $H_v(x - y) = y - x$. Also, in general, if $w \perp v$, then $H_v w = w$, so
$$H_v(x + y) = x + y.$$
Adding these together,
$$H_v x = y, \qquad H_v y = H_v^2 x = x.$$

Big Idea: Use the Householder transform to make as many zeros as possible:
$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \longmapsto \begin{bmatrix} \pm\|x\| \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \pm z.$$
We prefer $y = \big(-\mathrm{sign}(x_1)\,\|x\|,\ 0, \dots, 0\big)$, to avoid cancellation.

We can now take Householder transforms and use them to go along the columns of a matrix $A$ to make it upper triangular. For example, we can arrange
$$(H_{v_3} H_{v_2} H_{v_1}) A = R,$$
where $R$ is upper triangular. Then, taking $Q^* = H_{v_3} H_{v_2} H_{v_1}$ (an upper triangularization method):
$$Q^* A = R \quad \Longrightarrow \quad A = QR.$$

Algorithm 10.1. Input $A$, $m$, $n$. Let $R = A$, $p = \min\{m, n\}$.
for $k = 1 : p$
  $x = R_{k:m,\,k}$
  $x(1) = x(1) + \mathrm{sign}(x(1))\,\|x\|_2$

  $v_k = x / \|x\|_2$
  $R_{k:m,\,k:n} = R_{k:m,\,k:n} - 2 v_k (v_k^* R_{k:m,\,k:n})$
end
return $v_1, \dots, v_p$, $R$.

Flop count. The work per pass at $k = 1$ is about $2mn + mn + mn = 4mn$ (4 flops per entry). Total work:
$$4\big(mn + (m-1)(n-1) + \cdots + (m - p + 1)(n - p + 1)\big).$$
For $m \ge n$ (so $p = n$), this ends up being approximately $2mn^2 - \frac{2}{3}n^3$. Total work:
$$2mn^2 - \tfrac{2}{3}n^3, \ m > n; \qquad \tfrac{4}{3}n^3, \ m = n; \qquad 2m^2 n - \tfrac{2}{3}m^3, \ m < n.$$

Algorithm 10.2 (Implicit calculation of $Q^* b$). Input the $v_i$'s and $b$:
for $k = 1 : n$
  $b_{k:m} = b_{k:m} - 2 v_k (v_k^* b_{k:m})$
end
return $b$.

Algorithm 10.3 (Implicit calculation of $Q x$). Input the $v_i$'s and $x$:
for $k = n : -1 : 1$
  $x_{k:m} = x_{k:m} - 2 v_k (v_k^* x_{k:m})$
end
return $x$.

(A runnable MATLAB sketch of these algorithms appears below, after the comparison of least-squares methods.)

4 Least Squares

The problem: solve $Ax = b$ when there may not be a solution. We want a least squares solution that minimizes
$$\|b - Ax\|_2.$$
There is a solution. Let $A = [a_1, \dots, a_n]$. Force
$$a_i^* (b - Ax) = 0, \quad \text{i.e.} \quad A^* A x = A^* b;$$
these always have solutions. The solutions to these equations are called the least squares solutions to $Ax = b$.

Ex 1: Linear regression: variables $x$ and $y$ are theoretically related by the linear equation
$$y = ax + b.$$

We estimate $a$, $b$ given data pairs $(x_i, y_i)$. So we have the following system:
$$\begin{bmatrix} x_1 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}.$$

Example: Interpolating 11 data points by a polynomial using a Vandermonde matrix (in MATLAB, using backslash):

x = (-5:5), y = [0; 0; 0; 1; 1; 1; zeros(5, 1)], A = vander(x), c = A\y

Example (normal equations in general): Suppose we have an inner product $\langle \cdot, \cdot \rangle$ on a finite dimensional vector space $V$ with basis $v_1, \dots, v_n$. To approximate an element $v \in V$ using $v_1, \dots, v_k$, $k < n$, simply write
$$v = c_1 v_1 + \cdots + c_k v_k.$$
This may well have no solution. To find the best solution (minimize the residual $r$), force $\langle v_j, r \rangle = 0$ for $j = 1, \dots, k$. This leads to the following system:
$$\begin{aligned} \langle v_1, v_1 \rangle c_1 + \langle v_1, v_2 \rangle c_2 + \cdots + \langle v_1, v_k \rangle c_k &= \langle v_1, v \rangle \\ &\ \vdots \\ \langle v_k, v_1 \rangle c_1 + \langle v_k, v_2 \rangle c_2 + \cdots + \langle v_k, v_k \rangle c_k &= \langle v_k, v \rangle \end{aligned}$$
i.e. $A c = b$.

Example: $V = C[0,1]$, $v_i = x^i$, $i = 1, \dots, k$, using the inner product
$$\langle f(x), g(x) \rangle = \int_0^1 f(x) g(x)\, dx.$$
What results is the Hilbert matrix,
$$\langle x^i, x^j \rangle = \frac{1}{i + j + 1},$$
which has a very bad condition number.

Remark: In most practical problems, the $A$ of $Ax = b$ has full column rank. Thus $A$ is $m \times n$, $m \ge n$, and $\mathrm{rank}(A) = n$. One checks that $A^* A$ is an $n \times n$ matrix of rank $n$. So $A^* A$ is invertible and the normal equations have a unique solution. Do not use the inverse of $A^* A$!

Rather, use something like a Cholesky factorization. First, write
$$A^* A = R^* R,$$
where $R$ is upper triangular (possible since $A^* A$ is SPD). Now write
$$R^* R x = A^* b,$$
solve $R^* y = A^* b$ for $y$, then solve $R x = y$ for $x$. The cost here is $mn^2 + \frac{1}{3}n^3$. If we did plain old Gaussian elimination on the normal equations, the cost would be $mn^2 + \frac{2}{3}n^3$.

The next method is to use the QR factorization of $A$: $A = QR$. Now solve
$$R x = Q^* b$$
for $x$. The cost is $2mn^2 - \frac{2}{3}n^3$. The reason we should use this is:
$$\|b - Ax\|_2 = \|b - QRx\| = \|Q(Q^* b - Rx)\| = \|Q^* b - Rx\|,$$
and with
$$Q^* A = \begin{bmatrix} \hat{R} \\ 0 \end{bmatrix}, \qquad Q^* b = \begin{bmatrix} b_1 \\ y \end{bmatrix},$$
we get $\|Q^* b - Rx\|^2 = \|b_1 - \hat{R}x\|^2 + \|y\|^2$, which is minimized by solving $\hat{R} x = b_1$.

The last method is to use the reduced SVD: solve $\Sigma w = \hat{U}^* b$, then set $x = V w$. The cost is $2mn^2 + 11n^3$. The reason for using this is stability and the ability to solve rank-deficient problems. QR can't do this without serious modification, and GE / Cholesky can't handle rank deficiency at all.

Tuesday, September 26, 2006:

5 Conditioning and Condition Numbers

Fundamental question: Given a calculation of $f(x)$, how sensitive is $f(x)$ to changes in $x$? This is really a mathematical sensitivity. The problem is: rather than compute the intended $f(x)$, we might compute $f(x + \delta x)$.
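Before turning to conditioning, here is the MATLAB sketch promised after Algorithms 10.1-10.3: a bare-bones Householder QR used to solve a least-squares problem, compared against MATLAB's backslash. It is only an illustrative sketch under simple assumptions (no pivoting, no handling of rank deficiency), the random test problem is arbitrary, and the function names (demo_house_ls, house_qr, apply_Qstar, sign_) are ad hoc.

function demo_house_ls
% Least-squares solve via Householder QR (Algorithms 10.1 and 10.2).
m = 50; n = 5;
A = randn(m, n); b = randn(m, 1);

[V, R] = house_qr(A);              % Q stored implicitly via the reflectors v_k
c = apply_Qstar(V, b);             % c = Q'*b
x = R(1:n, 1:n) \ c(1:n);          % solve the top n-by-n triangular system
disp(norm(x - A\b))                % small (roundoff level): agrees with backslash
end

function [V, R] = house_qr(A)
% Algorithm 10.1: Householder triangularization; on return R = Q'*A.
[m, n] = size(A); p = min(m, n);
R = A; V = zeros(m, p);
for k = 1:p
    x = R(k:m, k);
    x(1) = x(1) + sign_(x(1)) * norm(x);   % reflect onto -sign(x_1)*||x||*e_1
    v = x / norm(x);
    R(k:m, k:n) = R(k:m, k:n) - 2 * v * (v' * R(k:m, k:n));
    V(k:m, k) = v;
end
end

function b = apply_Qstar(V, b)
% Algorithm 10.2: apply Q' to b implicitly, using the stored reflectors.
[m, p] = size(V);
for k = 1:p
    v = V(k:m, k);
    b(k:m) = b(k:m) - 2 * v * (v' * b(k:m));
end
end

function s = sign_(a)
% Sign convention with sign_(0) = 1, so the reflection never degenerates.
if a >= 0, s = 1; else, s = -1; end
end

Algorithm 10.3 (applying $Q$ rather than $Q^*$) is the same loop as apply_Qstar, run in the reverse order $k = p, \dots, 1$.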

5.1 Absolute condition number

One measure is the (absolute) condition number. Set $\delta f = f(x + \delta x) - f(x)$. Take
$$\hat{\kappa} = \hat{\kappa}(x) = \lim_{\delta \to 0} \sup_{\|\delta x\| \le \delta} \frac{\|\delta f\|}{\|\delta x\|}.$$

Example: $f(x) = 4x^2$.
$$\hat{\kappa} = \lim_{\delta \to 0} \sup_{|\delta x| \le \delta} \frac{|f(x + \delta x) - f(x)|}{|\delta x|} = |f'(x)| = 8|x|.$$
This is odd: we want $x^2$ to be stable.

Note: For smooth multivariate
$$f(x) = \begin{bmatrix} f_1(x_1, \dots, x_n) \\ \vdots \\ f_m(x_1, \dots, x_n) \end{bmatrix},$$
the Jacobian of $f$,
$$J_f(x) = \left[ \frac{\partial f_i}{\partial x_j} \right]_{m,n},$$
plays the part of the derivative, in the sense that $\delta f \approx J_f(x)\, \delta x$; i.e.,
$$\lim_{\|\delta x\| \to 0} \frac{\|\delta f - J_f(x)\, \delta x\|}{\|\delta x\|} = 0.$$
So,
$$\hat{\kappa} = \lim_{\delta \to 0} \sup_{\|\delta x\| \le \delta} \frac{\|J_f(x)\, \delta x\|}{\|\delta x\|} = \|J_f(x)\|.$$

Example 2: $f(x) = x_1 - x_2$. The condition number here is $2$, which suggests that subtraction is a stable operation (numerically, this is not true!). Obviously, we have the wrong idea.
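The "wrong idea" shows up immediately in floating point arithmetic: the absolute error of a subtraction is small, but the relative error of the result can be enormous when the operands nearly cancel. A small MATLAB illustration (the particular numbers are arbitrary):

% Subtracting nearly equal numbers: tiny absolute error, large relative error.
t = 1e-12;
d = (1 + t) - 1;              % computed difference; the exact answer is t
abs(d - t)                    % tiny: of order eps (~1e-16)
abs(d - t) / t                % large: roughly 1e-4 here, so about 12 digits are lost

This is exactly the effect that the relative condition number of the next subsection captures.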

5.2 Relative Condition Number

Need $x, f(x) \ne 0$:
$$\kappa := \lim_{\delta \to 0} \sup_{\|\delta x\| \le \delta} \frac{\|\delta f\| / \|f\|}{\|\delta x\| / \|x\|}.$$

Example 1: $f(x) = 4x^2$:
$$\kappa = \frac{|f'(x)|\,|x|}{|f|} = \frac{8|x| \cdot |x|}{4x^2} = 2.$$

Example 2: $f(x) = x_1 - x_2$:
$$\kappa = \frac{\|J_f(x)\|\,\|x\|}{\|f\|} = \frac{2 \max\{|x_1|, |x_2|\}}{|x_1 - x_2|},$$
which blows up when $x_1 \approx x_2$.

Heuristic: If the condition number of the problem is $\kappa$, expect to lose $\log_{10} \kappa$ digits of accuracy.

Reason: If $\frac{\|\delta x\|}{\|x\|} \approx 10^{-\beta}$, then, as $\delta x \to 0$,
$$\frac{\|\delta f\|}{\|f\|} \approx \kappa\, \frac{\|\delta x\|}{\|x\|} \approx 10^{\log_{10} \kappa}\, 10^{-\beta} = 10^{\log_{10} \kappa - \beta}.$$

5.3 Examples

5.3.1 Wilkinson's Polynomial

We define
$$p(x) = \prod_{j=1}^{20} (x - j) = a_0 + \cdots + a_{19} x^{19} + x^{20}.$$
The condition number of the root $\lambda = 15$ is $5.1 \times 10^{13}$. This is the "perfidious polynomial," as Wilkinson calls it.

Theorem 5.3.1. The condition number of computing $b = Ax = f(x)$ is
$$\kappa = \frac{\|A\|\,\|x\|}{\|b\|} \le \|A\|\,\|A^{-1}\| =: \kappa(A).$$

Proof.
$$\frac{\|f(x + \delta x) - f(x)\| / \|f(x)\|}{\|\delta x\| / \|x\|} = \frac{\|A(x + \delta x) - Ax\|}{\|Ax\|} \cdot \frac{\|x\|}{\|\delta x\|} = \frac{\|A \delta x\|}{\|\delta x\|} \cdot \frac{\|x\|}{\|b\|} \le \frac{\|A\|\,\|x\|}{\|b\|}.$$
Also, $Ax = b$ implies $x = A^{-1} b$, so
$$\|x\| = \|A^{-1} b\| \le \|A^{-1}\|\,\|b\|.$$
So,
$$\kappa \le \frac{\|A\|\,\|x\|}{\|b\|} \le \|A\|\,\|A^{-1}\|.$$
These inequalities are frequently nearly equalities; they can be exact equalities for certain choices of $b$ and $\delta x$. $\square$

Fact: $\kappa_2(A) = \dfrac{\sigma_1}{\sigma_n}$.

Theorem 5.3.2. Let $b$ be fixed, and let $x$ be a solution to $Ax = b$. Let $f(A) = A^{-1} b$. Then $\kappa_f \le \kappa(A)$.

Theorem 5.3.3 (Perturbation Theorem). Suppose an invertible matrix $A$ satisfies $Ax = b$. Suppose $\delta A$ and $\delta b$ are given and $\delta x$ satisfies
$$(A + \delta A)(x + \delta x) = b + \delta b.$$
Set $B = A^{-1} \delta A$ and suppose $\beta = \|B\| < 1$. Then
$$\frac{\|\delta x\|}{\|x\|} \le \frac{\kappa(A)}{1 - \beta} \left\{ \frac{\|\delta A\|}{\|A\|} + \frac{\|\delta b\|}{\|b\|} \right\}.$$
This provides an estimate of the relative error of the solution.

Proof. Subtract $Ax = b$ from $(A + \delta A)(x + \delta x) = b + \delta b$:
$$A \delta x + \delta A\, x + \delta A\, \delta x = \delta b$$
$$(A + \delta A)\, \delta x = \delta b - \delta A\, x$$
$$A(I + A^{-1} \delta A)\, \delta x = \delta b - \delta A\, x$$
$$A(I + B)\, \delta x = \delta b - \delta A\, x$$

By the Banach lemma (note $\|B\| < 1$, so $I + B$ is invertible),
$$\delta x = (I + B)^{-1} A^{-1} \{\delta b - \delta A\, x\},$$
$$\|\delta x\| \le \|(I + B)^{-1}\|\,\|A^{-1}\| \big\{ \|\delta b\| + \|\delta A\|\,\|x\| \big\} \le \frac{1}{1 - \beta}\,\|A^{-1}\|\,\|A\| \left\{ \frac{\|\delta b\|}{\|A\|\,\|x\|} + \frac{\|\delta A\|}{\|A\|} \right\} \|x\|,$$
so
$$\frac{\|\delta x\|}{\|x\|} \le \frac{\kappa(A)}{1 - \beta} \left\{ \frac{\|\delta b\|}{\|A\|\,\|x\|} + \frac{\|\delta A\|}{\|A\|} \right\} \le \frac{\kappa(A)}{1 - \beta} \left\{ \frac{\|\delta b\|}{\|b\|} + \frac{\|\delta A\|}{\|A\|} \right\}. \qquad \square$$

6 Floating Point Analysis

Thursday, September 28, 2006:

Ref: "What Every Computer Scientist Should Know About Floating-Point Arithmetic," David Goldberg, 1991, ACM Computing Surveys.

While we are used to working with $\mathbb{R}$ or $\mathbb{C}$, on a computer we are limited to an approximation of these. We say
$$x \mapsto \mathrm{fl}(x) = \pm \frac{m}{\beta^t}\, \beta^e.$$
Here $0 \le m < \beta^t$, $m \in \mathbb{Z}$, and $a \le e \le b$, where

$\beta$ — the base of our representation
$m/\beta^t$ — the mantissa of $x$
$e$ — the exponent of $x$
$t$ — the precision of our representation

Example: the IEEE double precision standard: 1 bit for the sign, 11 bits for the exponent, and 52 bits for the mantissa.

So the floating point universe is finite. We'll idealize our field a bit by removing the bounds on the exponents, so that we have a countably infinite, self-similar set of floating point numbers. This avoids overflow and underflow.

Machine Epsilon. The fundamental accuracy of a floating point approximation is measured by
$$\epsilon_{\text{machine}} = \tfrac{1}{2} \beta^{1-t}.$$

(This is eps in Matlab.) It is the measure of the gaps between floating point numbers. A reasonable expectation is that $\mathrm{fl}(x)$ approximates $x$ via rounding, i.e.
$$|x - \mathrm{fl}(x)| \le \epsilon_{\text{machine}}\, |x| \qquad \text{(Rounding Axiom, RA).}$$

Note: if we start with
$$x = 0.d_1 d_2 \cdots d_t d_{t+1} \cdots \times \beta^e,$$
then
$$\beta^{t-e} x = d_1 d_2 \cdots d_t \,.\, d_{t+1} \cdots$$
Thus
$$d_1 d_2 \cdots d_t \le \beta^{t-e} x \le d_1 d_2 \cdots d_t + 1.$$
Now, if we choose the left or the right hand side, whichever is closer to $\beta^{t-e} x$, we get
$$|x \beta^{t-e} - \mathrm{fl}(x) \beta^{t-e}| \le \tfrac{1}{2},$$
so
$$|x - \mathrm{fl}(x)| \le \tfrac{1}{2} \beta^{e-t} = \tfrac{1}{2} \beta^{1-t} \beta^{e-1}.$$
But $\beta|x| \ge \beta^e$, i.e. $|x| \ge \beta^{e-1}$, so
$$|x - \mathrm{fl}(x)| \le \tfrac{1}{2} \beta^{1-t} |x|.$$
Thus $|x - \mathrm{fl}(x)| \le \epsilon_{\text{machine}} |x|$.

Remark: Some machines have a different $\epsilon_{\text{machine}}$. In particular, if one deals with complex numbers, one has to enlarge the $\epsilon_{\text{machine}}$ of the rounding axiom by a factor of $2^{3/2}$. So, in base 2,
$$\epsilon_{\text{machine}} = \tfrac{1}{2} \beta^{1-t} = \beta^{-t} = 2^{-t}.$$

Fundamental Axiom of Floating Point Arithmetic (FAFPA): Let $x, y \in F$. Let $+$ stand for any of the four basic arithmetic operations, and let $\oplus$ be the corresponding machine operation. Require:
$$|(x + y) - (x \oplus y)| \le \epsilon_{\text{machine}}\, |x + y|$$

(considerations must be made for $x + y$ to be nonzero).

Problems that occur on real machines: Consider the system with $\beta = 10$, $t = 5$, $-70 \le e \le 70$.

1. $10^{-40} \otimes 10^{-40}$: this gives $10^{-80}$ and underflow, which is typically OK.

2. $10^{40} \otimes 10^{40} / 10^{60}$: evaluated left to right, this causes an overflow; right to left, it is OK.

3. $10^{40} \otimes 10^{40} \otimes 10^{-60}$: can overflow or not, depending on how it is grouped.

4. $x = 5/7$, $y = 0.71425$. Then $\mathrm{fl}(x) \ominus \mathrm{fl}(y) = 0.00003$, whereas the correct value is $0.34714 \times 10^{-4}$; the error is $0.4714 \times 10^{-5}$ and the relative error is about $0.136$. The error is larger than it should be; this is because we started with an $x$ which is not a floating point number. So we should avoid subtracting nearly equal real numbers.

Classic Example: Solve $x^2 + bx + c = 0$. The quadratic formula can cause catastrophic cancellation, so we reorganize the calculation. Since
$$x^2 + bx + c = (x - r_1)(x - r_2),$$
note $r_1 r_2 = c$. So calculate
$$r_1 = \frac{-b - \mathrm{sgn}(b)\sqrt{b^2 - 4c}}{2}, \qquad r_2 = \frac{c}{r_1}.$$

7 Stability

We have a problem: calculate $f(x)$; on the machine we actually compute $\hat{f}(x)$. We want
$$\frac{\|f(x) - \hat{f}(x)\|}{\|f(x)\|} = O(\epsilon_{\text{machine}}).$$
This is true independent of the norm used, as all norms on a finite dimensional space are equivalent. If we can prove this, we call the algorithm accurate. One example of this is approximating $x$ with $\mathrm{fl}(x)$: the Rounding Axiom tells us this is an accurate algorithm.

Thursday, October 5, 2006:
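As an aside before continuing, here is a minimal MATLAB sketch of the reorganized quadratic computation above, compared with the textbook formula on a polynomial with one tiny root (the coefficients b = 1e8, c = 1 are an arbitrary illustration):

% Roots of x^2 + b*x + c with b = 1e8, c = 1: approximately -1e8 and -1e-8.
b = 1e8; c = 1;

% Naive formula: the small root suffers catastrophic cancellation.
r_naive = (-b + sqrt(b^2 - 4*c)) / 2    % few (possibly zero) correct digits

% Reorganized computation: no cancellation.
r1 = (-b - sign(b)*sqrt(b^2 - 4*c)) / 2;
r2 = c / r1                             % ~ -1.0000e-08, accurate to roughly full precision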

Note that multiplication (of scalars) is accurate:
$$\hat{f}(x, y) = \mathrm{fl}(x) \otimes \mathrm{fl}(y) = \big(x(1 + \epsilon_1)\, y(1 + \epsilon_2)\big)(1 + \epsilon_3) = x y\,(1 + O(\epsilon_{\text{machine}})) = f(x, y)(1 + O(\epsilon)).$$
So
$$\hat{f}(x, y) - f(x, y) = f(x, y)\, O(\epsilon), \qquad \frac{|\hat{f}(x, y) - f(x, y)|}{|f(x, y)|} = O(\epsilon).$$

For the outer product, $f(x, y) = x y^*$, which is the matrix whose $(i, j)$ entry satisfies
$$x_i \bar{y}_j \longmapsto \mathrm{fl}(x_i) \otimes \mathrm{fl}(y_j) = x_i \bar{y}_j\, (1 + O(\epsilon)).$$
So, entrywise, the calculation is accurate. Using any desired norm, we can also show the matrix as a whole is accurate. But this is not backward stable: $\hat{f}(x, y)$ is just an outer product with random perturbations in each entry, and we cannot expect the result to be rank one, whereas $f(\hat{x}, \hat{y}) = \hat{x}\hat{y}^*$ is rank one.

Now consider inner products. Problem: $f(x, y) = x^* y$. Algorithm: compute $\hat{f}(x, y)$ on a computer satisfying RA and FAFPA. Here $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n)$, and
$$\hat{s}_1 = \mathrm{fl}(\bar{x}_1) \otimes \mathrm{fl}(y_1) = \bar{x}_1 y_1 (1 + \epsilon_1)(1 + \epsilon_2)(1 + \mu_1),$$
$$\hat{s}_2 = \hat{s}_1 \oplus \big(\mathrm{fl}(\bar{x}_2) \otimes \mathrm{fl}(y_2)\big) = \big(\bar{x}_1 y_1 (1 + e_{21}) + \bar{x}_2 y_2 (1 + e_{22})\big)(1 + \mu_2).$$
Eventually, you get
$$\hat{s}_n = \bar{x}_1 y_1 (1 + e_{n1}) + \bar{x}_2 y_2 (1 + e_{n2}) + \cdots + \bar{x}_n y_n (1 + e_{nn}).$$
Finally, set $\hat{x}_i = x_i$ (so $\hat{x} = x$), $\hat{y}_i = y_i (1 + e_{n,i})$, and $\hat{y} = [\hat{y}_1, \dots, \hat{y}_n]$. So the computed value is
$$\hat{f}(x, y) = \hat{s}_n = x^* \hat{y} = f(x, \hat{y}),$$
where $\|y - \hat{y}\| = \|y\|\, O(\epsilon)$. So we have backward stability. Unfortunately, this algorithm is not accurate:
$$\begin{bmatrix} 1 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = x_1 - x_2,$$
and subtraction is not accurate.

8 Stability of the Householder Triangularization

Caution regarding the stability of vector or matrix calculations, e.g. inner products: a term like $\bar{x}_1 \tilde{y}_1 (1 + \epsilon_{n1})$ arises where, along the way,
$$1 + \epsilon_{n,1} = (1 + \mu_1) \cdots (1 + \mu_n) = (1 + \mu)^n = 1 + n\mu + O(\mu^2).$$
So, in general, our order constants $C$ may be of order $n$.

Problem: Solve for $x$ in $Ax = b$. The condition number of this problem is $\kappa = \kappa(A)$.

Algorithm 16.1 (solve $Ax = b$ by QR):
1. Factor $A = QR$ into orthogonal $Q$ and upper triangular $R$. (We actually find $\tilde{Q}$ and $\tilde{R}$.)
2. Compute $y = Q^* b$. (Actually, compute $\tilde{y} = \mathrm{fl}(\tilde{Q}^* b)$.)
3. Solve $Rx = y$ to get the solution $x = R^{-1} y$. (Actually, compute $\tilde{x} = \mathrm{fl}(\tilde{R}^{-1} \tilde{y})$.)

The necessary backward stability facts:

1. The computed $\tilde{Q}$ and $\tilde{R}$ of $A = QR$ by Householder reflections satisfy
$$\tilde{Q}\tilde{R} = A + \delta A \quad \text{with} \quad \frac{\|\delta A\|}{\|A\|} = O(\epsilon).$$
This is exactly backward stability of $[Q, R] = f(A)$.

2. If $\tilde{Q}$ is orthogonal and $b$ is a vector, and we compute $y = \tilde{Q}^* b$, then there exists $\delta \tilde{Q}$ such that
$$(\tilde{Q} + \delta \tilde{Q})\, \tilde{y} = b \quad \text{where} \quad \frac{\|\delta \tilde{Q}\|}{\|\tilde{Q}\|} = O(\epsilon).$$
This is backward stability for $f(Q) = Q^* b$.

3. If $\tilde{R}$ is nonsingular and upper triangular, then the solution $\tilde{x} = \tilde{R}^{-1} \tilde{y}$ computed by back-substitution satisfies
$$(\tilde{R} + \delta \tilde{R})\, \tilde{x} = \tilde{y} \quad \text{for some } \delta \tilde{R} \text{ with } \frac{\|\delta \tilde{R}\|}{\|\tilde{R}\|} = O(\epsilon).$$
This is just backward stability of back-substitution.

Remark about Fact 1: Suppose $Q_1 = H_{v_1}$. Then
$$Q_1 A = H_{v_1} A = \left(I - 2\,\frac{v_1 v_1^*}{v_1^* v_1}\right) A = A - \frac{2 v_1 (v_1^* A)}{v_1^* v_1}.$$
We can use the backward stability of inner products, $\otimes$, and $\oplus$ to show that this algorithm is backward stable. We want the theorem:

Theorem 8.0.4. Algorithm 16.1 is backward stable, in the sense that the computed $\tilde{x}$ satisfies
$$(A + \Delta A)\tilde{x} = b \quad \text{for some } \Delta A \text{ with } \frac{\|\Delta A\|}{\|A\|} = O(\epsilon).$$

Proof. From Fact 2, $b = (\tilde{Q} + \delta \tilde{Q})\tilde{y}$. From Fact 3,
$$b = (\tilde{Q} + \delta \tilde{Q})(\tilde{R} + \delta \tilde{R})\tilde{x}.$$
From Fact 1, we get
$$b = (\tilde{Q}\tilde{R} + \delta \tilde{Q}\,\tilde{R} + \tilde{Q}\,\delta \tilde{R} + \delta \tilde{Q}\,\delta \tilde{R})\tilde{x} = \big(A + (\delta A + \text{stuff})\big)\tilde{x},$$
where
$$\Delta A = \delta A + \delta \tilde{Q}\,\tilde{R} + \tilde{Q}\,\delta \tilde{R} + \delta \tilde{Q}\,\delta \tilde{R}.$$
So we need to check, by the triangle inequality, that each part of $\Delta A$ is $O(\epsilon)\,\|A\|$. Certainly $\frac{\|\delta A\|}{\|A\|} = O(\epsilon)$ by Fact 1. Now, $\tilde{Q}\tilde{R} = A + \delta A$, so $\tilde{R} = \tilde{Q}^*(A + \delta A)$. Then
$$\frac{\|\tilde{R}\|}{\|A\|} \le \frac{\|\tilde{Q}^*\|\,\big(\|A\| + \|\delta A\|\big)}{\|A\|} \le 1 \cdot (1 + O(\epsilon)).$$
So, for the second term,
$$\frac{\|\delta \tilde{Q}\,\tilde{R}\|}{\|A\|} \le \frac{\|\tilde{R}\|}{\|A\|}\,\|\delta \tilde{Q}\| = (1 + O(\epsilon))\, O(\epsilon) = O(\epsilon).$$
For the third term,
$$\frac{\|\tilde{Q}\,\delta \tilde{R}\|}{\|A\|} \le \frac{\|\delta \tilde{R}\|}{\|\tilde{R}\|} \cdot \frac{\|\tilde{Q}\|\,\|\tilde{R}\|}{\|A\|} = O(\epsilon) \cdot 1 \cdot (1 + O(\epsilon)) = O(\epsilon).$$
Finally,
$$\frac{\|\delta \tilde{Q}\,\delta \tilde{R}\|}{\|A\|} \le \|\delta \tilde{Q}\| \cdot \frac{\|\delta \tilde{R}\|}{\|\tilde{R}\|} \cdot \frac{\|\tilde{R}\|}{\|A\|} = O(\epsilon)\, O(\epsilon)\, (1 + O(\epsilon)) = O(\epsilon).$$

Now we appeal to the forward error estimate theorem (Theorem 5.3.3) and the fact that the condition number of the problem $Ax = b$ is $\kappa(A)$ to obtain:

Theorem 8.0.5. The computed $\tilde{x}$ of Algorithm 16.1 satisfies
$$\frac{\|\tilde{x} - x\|}{\|x\|} = O(\kappa(A)\, \epsilon).$$

9 Stability of Backsolving

Problem: Given a nonsingular upper triangular $R = [r_{ij}]_{m,m}$ and $b = [b_i]_m$, solve for $x = [x_i]$ in $Rx = b$.

Algorithm 17.1:
for $j = m : -1 : 1$
  $x_j = \dfrac{1}{r_{jj}} \left( b_j - \displaystyle\sum_{k=j+1}^{m} r_{jk} x_k \right)$
end

(A runnable MATLAB version of this algorithm appears at the end of these notes.)

The flop count for this is
$$\sum_{k=1}^{m} \big[ 2 + (m - (k+1) + 1) \cdot 2 \big] \approx m^2.$$

Theorem 9.0.6. Algorithm 17.1 applied to a system of floating point numbers is backward stable, in the sense that the computed $\tilde{x}$ satisfies
$$(R + \delta R)\tilde{x} = b \quad \text{with} \quad \frac{\|\delta R\|}{\|R\|} = O(\epsilon).$$

Proof. We do this for $m = 3$:
$$\begin{aligned} r_{11} x_1 + r_{12} x_2 + r_{13} x_3 &= b_1 \\ r_{22} x_2 + r_{23} x_3 &= b_2 \\ r_{33} x_3 &= b_3 \end{aligned}$$
Ideally,
$$x_3 = \frac{b_3}{r_{33}}, \qquad x_2 = \frac{1}{r_{22}}\big(b_2 - r_{23} x_3\big), \qquad x_1 = \frac{1}{r_{11}}\big(b_1 - (r_{12} x_2 + r_{13} x_3)\big).$$

Instead, what happens is
$$\tilde{x}_3 = \mathrm{fl}\!\left(\frac{b_3}{r_{33}}\right) = \frac{b_3}{r_{33}}(1 + \epsilon), \qquad |\epsilon| \le \epsilon_{\text{machine}}.$$
Write $1 + \epsilon = \dfrac{1}{1 + \epsilon_1}$, which implies
$$\epsilon = \frac{1}{1 + \epsilon_1} - 1 = \frac{-\epsilon_1}{1 + \epsilon_1} = -\epsilon_1 (1 - \epsilon_1 + \epsilon_1^2 - \cdots) = -\epsilon_1 + O(\epsilon_1^2).$$
Here we can say $|\epsilon_1| \le \epsilon_{\text{machine}} + O(\epsilon_{\text{machine}}^2)$. Thus
$$\tilde{x}_3 = \frac{b_3}{r_{33}(1 + \epsilon_1)} = \frac{b_3}{\tilde{r}_{33}}, \qquad \frac{|\tilde{r}_{33} - r_{33}|}{|r_{33}|} \le \epsilon_{\text{machine}} + O(\epsilon_{\text{machine}}^2).$$
That is, $\tilde{r}_{33}\,\tilde{x}_3 = b_3$ exactly. For $\tilde{x}_2$, calculate
$$\tilde{x}_2 = \mathrm{fl}\!\left(\frac{b_2 - r_{23}\tilde{x}_3}{r_{22}}\right) = \frac{1}{r_{22}}\big(b_2 - r_{23}\tilde{x}_3 (1 + \epsilon_1)\big)(1 + \epsilon_2)(1 + \epsilon_3),$$
where $\epsilon_1$ comes from the multiplication, $\epsilon_2$ from the subtraction, and $\epsilon_3$ from the division. But
$$(1 + \epsilon_2)(1 + \epsilon_3) = 1 + \epsilon_2 + \epsilon_3 + O(\epsilon_{\text{machine}}^2) = 1 + \mu,$$
and we can write
$$1 + \mu = \frac{1}{1 + \epsilon_4}, \qquad |\epsilon_4| \le 2\epsilon_{\text{machine}} + O(\epsilon_{\text{machine}}^2).$$
So we have
$$\tilde{x}_2 = \frac{b_2 - r_{23}(1 + \epsilon_1)\tilde{x}_3}{r_{22}(1 + \epsilon_4)} = \frac{b_2 - \tilde{r}_{23}\tilde{x}_3}{\tilde{r}_{22}},$$
where
$$\frac{|\tilde{r}_{23} - r_{23}|}{|r_{23}|} \le \epsilon_{\text{machine}} + O(\epsilon_{\text{machine}}^2), \qquad \frac{|\tilde{r}_{22} - r_{22}|}{|r_{22}|} \le 2\epsilon_{\text{machine}} + O(\epsilon_{\text{machine}}^2).$$

We continue on like this to get the error terms for $\tilde{x}_1$ (and, in general, for an $m \times m$ system). So we have
$$\frac{\|\tilde{R} - R\|}{\|R\|} \le m\,\epsilon_{\text{machine}} + O(\epsilon_{\text{machine}}^2),$$
i.e. $(R + \delta R)\tilde{x} = b$ with $\|\delta R\| / \|R\| = O(\epsilon_{\text{machine}})$. This shows backward stability in any norm (a different norm just changes the order constants). $\square$
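To close, here is the minimal MATLAB sketch of Algorithm 17.1 promised above, together with a rough numerical check of the bounds just discussed. The triangular test system is an arbitrary random one, and the function names (backsolve_demo, backsolve) are ad hoc.

function backsolve_demo
% Back substitution (Algorithm 17.1) and a rough check of its stability.
m = 200;
R = triu(randn(m)) + m*eye(m);   % random, safely nonsingular upper triangular
x_true = randn(m, 1);
b = R * x_true;                  % right-hand side with (nearly) known solution

x = backsolve(R, b);

% Relative residual: backward stability suggests this is O(eps_machine).
disp(norm(b - R*x) / (norm(R) * norm(x)))

% Forward error: expected to be roughly of order cond(R)*eps
% (cf. the Perturbation Theorem 5.3.3).
disp(norm(x - x_true) / norm(x_true))
disp(cond(R) * eps)
end

function x = backsolve(R, b)
% Algorithm 17.1: solve R*x = b for nonsingular upper triangular R.
m = length(b);
x = zeros(m, 1);
for j = m:-1:1
    x(j) = (b(j) - R(j, j+1:m) * x(j+1:m)) / R(j, j);
end
end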