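Backward Error Analysis for Householder Reflectors

We want to show that multiplication by Householder reflectors is backward stable. In particular we wish to show
\[ \mathrm{fl}(PA) = \tilde P A = P(A + E), \]
where $P = I - 2vv^T$ is the exact Householder reflector, $\tilde P$ takes into account the rounding that occurs in floating point arithmetic, and $E$ is the backward error. This says that in floating point arithmetic we are applying the exact Householder reflector to a nearby matrix. If we can bound $\|\tilde P - P\|$ then we can do the following:
\[ \tilde P A = (\tilde P - P + P)A = P\bigl(I + P^T(\tilde P - P)\bigr)A = P\bigl(A + P^T(\tilde P - P)A\bigr). \]
Hence if we let $E = P^T(\tilde P - P)A$, we have that
\[ \|E\| \le \|P^T\|\,\|\tilde P - P\|\,\|A\| = \|\tilde P - P\|\,\|A\|. \]
So we just need to show that $\|\tilde P - P\|$ is $O(\varepsilon)$.

The backward error analysis is just a little more involved than the backward error analysis that we have done for Gaussian Elimination. We are going to think of our Householder reflector in the following form:
\[ P = I - 2\,\frac{\tilde v \tilde v^T}{\|\tilde v\|_2^2}. \]
We will bound the error in $\mathrm{fl}(PA)$ at each step. Starting with the first entry of $\tilde v$, computed from the entries $v_i$ of the vector being reflected, we have that
\[ \mathrm{fl}(\tilde v_1) = \mathrm{sgn}(v_1)\Bigl(|v_1| + \sqrt{\sum_{i=1}^{n} v_i^2\,(1 + \delta_i^{(1)})}\;(1 + \delta_{n+1}^{(1)})\Bigr)(1 + \delta_{n+2}^{(1)}), \]
where $|\delta_i^{(1)}| \le n\varepsilon$ and $|\delta_{n+1}^{(1)}|, |\delta_{n+2}^{(1)}| \le \varepsilon$. Pulling out all the $\delta^{(1)}$, we have that
\[ |\mathrm{fl}(\tilde v_1) - \tilde v_1| \le \Bigl(\frac{n}{2} + 2\Bigr)\varepsilon\,|\tilde v_1| + O(\varepsilon^2), \]
where we used $\sqrt{1 + \delta} = 1 + \tfrac{1}{2}\delta + O(\delta^2)$. So as a gross bound we have that
\[ \tilde v^{(1)} = \mathrm{fl}(\tilde v) = \tilde v + \delta\tilde v^{(1)}, \]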

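where
\[ |(\delta\tilde v^{(1)})_i| \le |(\tilde v)_i|\,\Bigl(\frac{n}{2} + 2\Bigr)\varepsilon. \]
Now considering the error in $\|\tilde v^{(1)}\|_2^2$,
\[ \mathrm{fl}(\|\tilde v^{(1)}\|_2^2) = \sum_{i=1}^{n} (\tilde v_i^{(1)})^2\,(1 + \delta_i^{(2)}), \]
where we have $|\delta_i^{(2)}| \le n\varepsilon + O(\varepsilon^2)$. Hence after taking into account the error in $\tilde v^{(1)}$,
\[ \bigl|\mathrm{fl}(\|\tilde v^{(1)}\|_2^2) - \|\tilde v\|_2^2\bigr| \le \|\tilde v\|_2^2\,(2n + 4)\varepsilon + O(\varepsilon^2). \]
Note that up to this point all the terms that we have been dealing with have been positive.

We will actually derive our result on a column by column basis,
\[ \Bigl(I - 2\,\frac{\tilde v^{(1)} (\tilde v^{(1)})^T}{\mathrm{fl}(\|\tilde v^{(1)}\|_2^2)}\Bigr) A_j. \]
Looking at the inner product of $\tilde v$ with $A_j$ we have
\[ \mathrm{fl}\bigl((\tilde v^{(1)})^T A_j\bigr) = \sum_{i=1}^{n} \tilde v_i^{(1)} A_{ij}\,(1 + \delta_i^{(3)}) = \sum_{i=1}^{n} \tilde v_i^{(2)} A_{ij}, \]
where
\[ (\tilde v^{(2)})_i = (\tilde v^{(1)})_i\,(1 + \delta_i^{(3)}) = (\tilde v)_i\,(1 + \delta_i^{(3)})(1 + \delta_i^{(1)}), \]
with $|\delta_i^{(3)}| \le n\varepsilon$ and $|\delta_i^{(1)}| \le (\tfrac{n}{2} + 2)\varepsilon + O(\varepsilon^2)$. Combining more of the operations and accounting for the rounding we have
\[ \Bigl(A_j - \frac{2\,\bigl(\tilde v^{(1)}\,((\tilde v^{(2)})^T A_j)(1 + \delta^{(4)})\bigr)(1 + \delta^{(5)})}{\mathrm{fl}(\|\tilde v^{(1)}\|_2^2)}\,(1 + \delta^{(6)})\Bigr)(1 + \delta^{(7)}), \]
where the $\delta^{(k)}$, for $k = 4, 5, 6, 7$, act entrywise and account for the multiplication by $\tilde v^{(1)}$, the multiplication by two, the division by the norm squared, and finally the subtraction. Pushing the errors around and off of $A_j$ we get
\[ \Bigl(\mathrm{diag}(1 + \delta^{(7)}) - 2\,\frac{\tilde v^{(3)} (\tilde v^{(2)})^T}{\|\tilde v\|_2^2}\Bigr) A_j, \]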

where $\mathrm{diag}(1 + \delta^{(7)})$ denotes a diagonal matrix with the $\delta_i^{(7)}$ along the diagonal, and $\tilde v^{(3)}$ absorbs the remaining rounding factors. Taking the difference of this matrix with the exact Householder reflector, we get entrywise, and hence columnwise, that
\[ |\tilde P_{ij} - P_{ij}| \le (n + 9)\varepsilon + O(\varepsilon^2), \qquad \|\tilde P - P\| \le (2n + 9)\,\|P\|\,\varepsilon. \]
Combining the result for all the columns and using the Frobenius norm, we would achieve the desired result that $\|E\| = O(\varepsilon)\,\|A\|$, where $O(\varepsilon)$ would include additional factors of $n$ that come from going from the Frobenius norm to the 2-norm. With this result we can then say that Householder QR is backward stable and that the computed $Q$ is close to being an orthogonal matrix. See Demmel's Theorem 3.5.

Perturbation Theory for the Least Squares Problem

First we begin by decomposing $b = b_R + b_N$, where $b_R \in \mathcal{R}(A)$ and $b_N \in \mathcal{N}(A^T)$. Starting with $Ax =_2 b$ (equality in the least squares sense, that is, $x$ minimizes $\|Ax - b\|_2$), we perturb $A$ and $b$ by $\delta A$ and $\delta b = \delta b_R + \delta b_N$, so that
\[ (A + \delta A)(x + \delta x) =_2 b + \delta b. \]
Now we are going to form the normal equations. However, in solving for $\delta x$ we are going to take advantage of what we have learned from the QR factorization to avoid squaring $\kappa(A)$ in some of the terms. Multiplying both sides by $(A + \delta A)^T$ and using the decomposition for $b$,

\[ (A + \delta A)^T (A + \delta A)(x + \delta x) = (A + \delta A)^T (b_R + \delta b_R + b_N + \delta b_N), \]
\[ \bigl(A^T A + \delta(A^T A)\bigr)(x + \delta x) = A^T b_R + \underbrace{A^T b_N}_{0} + \delta A^T b_R + \delta A^T b_N + A^T \delta b_R + \underbrace{A^T \delta b_N}_{0} + \underbrace{\delta A^T \delta b_R}_{\text{H.O.T.}} + \underbrace{\delta A^T \delta b_N}_{\text{H.O.T.}}, \]
\[ (A^T A)\bigl(I + (A^T A)^{-1}\delta(A^T A)\bigr)(x + \delta x) = A^T b_R + \delta A^T b_R + \delta A^T b_N + A^T \delta b_R + \text{H.O.T.}, \]
where $\delta(A^T A) = \delta A^T A + A^T \delta A + \delta A^T \delta A$, H.O.T. contains all the higher order terms, and we have eliminated all the terms that are zero due to orthogonality.

Similar to the perturbation theory for square systems, we must have that $\|(A^T A)^{-1} E\| < 1$, where $E = \delta(A^T A)$, so that the perturbation is small enough not to change the rank of $A$, thereby ensuring that the problem is solvable for any right hand side. Under this assumption we may use the Neumann series to approximate
\[ \bigl(I + (A^T A)^{-1}\delta(A^T A)\bigr)^{-1} = I - (A^T A)^{-1}\delta(A^T A) + \text{H.O.T.}, \]
but first we isolate $\delta x$ on the left hand side:
\[ (A^T A)\bigl(I + (A^T A)^{-1}\delta(A^T A)\bigr)\,\delta x = A^T b_R + \delta A^T b_R + \delta A^T b_N + A^T \delta b_R - \bigl(A^T A + \delta(A^T A)\bigr)x + \text{H.O.T.} \]
\[ = A^T b_R + \delta A^T b_R + \delta A^T b_N + A^T \delta b_R - A^T A x - \delta A^T A x - A^T \delta A x - \delta A^T \delta A x + \text{H.O.T.} \]
\[ = \underbrace{A^T b_R - A^T A x}_{0} + \underbrace{\delta A^T b_R - \delta A^T A x}_{0} + \delta A^T b_N + A^T \delta b_R - A^T \delta A x - \underbrace{\delta A^T \delta A x}_{\text{H.O.T.}} + \text{H.O.T.} \]
\[ = \delta A^T b_N + A^T \delta b_R - A^T \delta A x + \text{H.O.T.} \]
We note that all the terms on the right hand side are first order, hence after multiplying both sides by $(A^T A)^{-1}$ and then by the Neumann series

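approximation to the inverse of $\bigl(I + (A^T A)^{-1}\delta(A^T A)\bigr)$, to first order we have
\[ \delta x = (A^T A)^{-1}\delta A^T b_N + (A^T A)^{-1} A^T \delta b_R - (A^T A)^{-1} A^T \delta A\,x + \text{H.O.T.} \]
Now we recall that $A^+ = (A^T A)^{-1} A^T$ is the matrix that gives the solution to the normal equations. Using the QR factorization we showed that $A^+ = R^{-1} Q^T$, and hence $\|A^+\| = \|R^{-1}\|$. Taking advantage of these facts we have that
\[ \delta x = (R^T R)^{-1}\delta A^T b_N + R^{-1} Q^T \delta b_R - R^{-1} Q^T \delta A\,x + \text{H.O.T.} \]
Now we take the norm of both sides, apply the triangle inequality and the properties of the matrix norm, and divide by the norm of $x$:
\[ \frac{\|\delta x\|}{\|x\|} \le \|R^{-1}\|^2\,\frac{\|\delta A^T\|\,\|b_N\|}{\|x\|} + \|R^{-1} Q^T\|\,\frac{\|\delta b_R\|}{\|x\|} + \|R^{-1} Q^T\|\,\|\delta A\| + \text{H.O.T.} \]
\[ \le \|A^+\|^2\,\frac{\|\delta A^T\|\,\|b_N\|}{\|x\|} + \|A^+\|\,\frac{\|\delta b_R\|}{\|x\|} + \|A^+\|\,\|\delta A\| + \text{H.O.T.} \]
Now we will treat each term separately to achieve the desired bound. For the first term,
\[ \|A^+\|^2\,\frac{\|\delta A^T\|\,\|b_N\|}{\|x\|} = \|A^+\|^2\,\|\delta A^T\|\,\frac{\|b_N\|}{\|x\|}\,\frac{\|b\|\,\|b_R\|}{\|b\|\,\|b_R\|} \qquad (1) \]
\[ = \|A^+\|^2\,\|\delta A^T\|\,\frac{\|b_N\|}{\|b\|}\,\frac{\|b\|}{\|b_R\|}\,\frac{\|b_R\|}{\|x\|}. \qquad (2) \]
If we let $\theta$ denote the angle between $b_R$ and $b$, we have that
\[ \frac{\|b_N\|}{\|b\|} = \sin\theta \quad\text{and}\quad \frac{\|b_R\|}{\|b\|} = \cos\theta. \]
Since $Ax = b_R$ we have that $\|b_R\| \le \|A\|\,\|x\|$, thus this term reduces to
\[ \|A^+\|^2\,\frac{\|\delta A^T\|\,\|b_N\|}{\|x\|} \le \|A^+\|^2\,\|A\|^2\,\frac{\|\delta A\|}{\|A\|}\,\tan\theta \qquad (3) \]
\[ = \kappa(A)^2\,\frac{\|\delta A\|}{\|A\|}\,\tan\theta, \qquad (4) \]
where $\kappa(A) = \|A\|\,\|A^+\|$. So the first term depends on the condition number squared. If $\theta$ is small, that is, $b$ is close to being in the range of $A$, and $A$ is well conditioned, then this term will make little contribution to the error bound.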

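For the second term we have that
\[ \|A^+\|\,\frac{\|\delta b_R\|}{\|x\|} = \|A^+\|\,\frac{\|\delta b_R\|}{\|x\|}\,\frac{\|b\|\,\|b_R\|}{\|b\|\,\|b_R\|} = \|A^+\|\,\frac{\|b_R\|}{\|x\|}\,\frac{\|b\|}{\|b_R\|}\,\frac{\|\delta b_R\|}{\|b\|} \le \|A^+\|\,\|A\|\,\frac{1}{\cos\theta}\,\frac{\|\delta b\|}{\|b\|} = \frac{\kappa(A)}{\cos\theta}\,\frac{\|\delta b\|}{\|b\|}. \]
This term only depends on $\kappa(A)/\cos\theta$, hence if $\theta$ is small and $A$ is well conditioned this will be the leading order error term.

For the third term we have that
\[ \|A^+\|\,\|\delta A\| = \|A^+\|\,\|A\|\,\frac{\|\delta A\|}{\|A\|} = \kappa(A)\,\frac{\|\delta A\|}{\|A\|}. \]
This term depends on $\kappa(A)$.

Combining all the results, we have to first order that
\[ \frac{\|\delta x\|}{\|x\|} \le \kappa(A)\Bigl(\frac{1}{\cos\theta}\,\frac{\|\delta b\|}{\|b\|} + \frac{\|\delta A\|}{\|A\|}\Bigr) + \kappa(A)^2\,\tan\theta\,\frac{\|\delta A\|}{\|A\|}. \]
Hence if $\theta$ is small and $A$ is well conditioned, the leading term will be likely to be the dominant term. If $\theta$ is large, close to $\pi/2$, then $b$ cannot be well approximated with vectors in the range of $A$.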
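The first-order bound can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the 3-by-2 matrix, the right hand side, and the perturbations are made-up values, all helper names are hypothetical, and the least squares solution is computed from the 2-by-2 normal equations purely to keep the example free of dependencies (not because that is the recommended way to solve the problem).

```python
import math

def transpose(M):
    return [list(col) for col in zip(*M)]

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def norm2(v):
    return math.sqrt(sum(a * a for a in v))

def gram(M):
    """2x2 Gram matrix M^T M of an m-by-2 matrix M."""
    Mt = transpose(M)
    return [[sum(a * b for a, b in zip(Mt[i], Mt[j])) for j in range(2)]
            for i in range(2)]

def singular_values(M):
    """Both singular values of an m-by-2 matrix, from the eigenvalues of M^T M."""
    G = gram(M)
    half_trace = 0.5 * (G[0][0] + G[1][1])
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    lam_max = half_trace + math.sqrt(max(half_trace ** 2 - det, 0.0))
    lam_min = det / lam_max  # avoids cancellation in half_trace - sqrt(...)
    return math.sqrt(lam_max), math.sqrt(max(lam_min, 0.0))

def lstsq(M, rhs):
    """Least squares solution for an m-by-2 matrix via the normal equations."""
    G = gram(M)
    c = matvec(transpose(M), rhs)
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    return [(G[1][1] * c[0] - G[0][1] * c[1]) / det,
            (G[0][0] * c[1] - G[1][0] * c[0]) / det]

# Illustrative data (assumed, not from the notes).
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [1.0, 2.0, 4.0]
scale = 1e-6
dA = [[scale * 1.0, scale * -2.0],
      [scale * 0.5, scale * 1.0],
      [scale * -1.0, scale * 0.5]]
db = [scale * 1.0, scale * -1.0, scale * 2.0]

# Unperturbed and perturbed least squares solutions.
x = lstsq(A, b)
A_pert = [[a + d for a, d in zip(ra, rd)] for ra, rd in zip(A, dA)]
b_pert = [bi + di for bi, di in zip(b, db)]
x_pert = lstsq(A_pert, b_pert)

# Split b into b_R (in the range of A) and b_N (orthogonal to it).
b_R = matvec(A, x)
b_N = [bi - ri for bi, ri in zip(b, b_R)]
cos_theta = norm2(b_R) / norm2(b)
tan_theta = (norm2(b_N) / norm2(b)) / cos_theta

# Condition number and relative perturbation sizes in the 2-norm.
s_max, s_min = singular_values(A)
kappa = s_max / s_min
rel_dA = singular_values(dA)[0] / s_max
rel_db = norm2(db) / norm2(b)

rel_err = norm2([xp - xi for xp, xi in zip(x_pert, x)]) / norm2(x)
bound = kappa * (rel_db / cos_theta + rel_dA) + kappa ** 2 * tan_theta * rel_dA

print("relative error in x:", rel_err)
print("first-order bound  :", bound)
```

Because the bound is only first order, the comparison is meaningful when the perturbations are small, as with the scale chosen here. Increasing the residual component $b_N$ (larger $\theta$) makes the $\kappa(A)^2 \tan\theta$ term grow relative to the others.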