ERROR AND SENSITIVTY ANALYSIS FOR SYSTEMS OF LINEAR EQUATIONS. Perturbation analysis for linear systems (Ax = b)

ERROR AND SENSITIVTY ANALYSIS FOR SYSTEMS OF LINEAR EQUATIONS Conditioning of linear systems. Estimating errors for solutions of linear systems Backward error analysis Perturbation analysis for linear systems Ax = b) Question addressed by perturbation analysis: determine the variation of the solution x when the data, namely A and b, undergoes small variations. Problem is Ill-conditioned if small variations in data cause very large variation in the solution. Relative element-wise error analysis 5-2 TB: 12; AB: 1.2.7; GvL 3.5 PertA 5-1 5-2 Analysis I: Asymptotic First Order Analysis Let E, be an n n matrix and e b be an n-vector. Perturb A into Aɛ) = A + ɛe and b into b + ɛe b. Note: A + ɛe is nonsingular for ɛ small enough. Why? The solution xɛ) of the perturbed system is s.t. A + ɛe)xɛ) = b + ɛe b. Let δɛ) = xɛ) x. Then, A + ɛe)δɛ) = b + ɛe b ) A + ɛe)x = ɛ e b Ex) δɛ) = ɛ A + ɛe) 1 e b Ex). xɛ) is differentiable at ɛ = 0 and its derivative is x 0) = lim ɛ 0 δɛ) ɛ = A 1 e b Ex). A small variation [ɛe, ɛe b ] will cause the solution to vary by roughly ɛx 0) = ɛa 1 e b Ex). The relative variation is such that xɛ) x ɛ A 1 Since b A : xɛ) x ɛ A A 1 eb + E ) + Oɛ 2 ). ) eb b + E + Oɛ 2 ) A 5-3 TB: 12; AB: 1.2.7; GvL 3.5 PertA 5-4 TB: 12; AB: 1.2.7; GvL 3.5 PertA 5-3 5-4

The quantity κa) = A A 1 is called the condition number of the linear system with respect to the norm.. When using the p-norms we write: κ p A) = A p A 1 p Note: κ 2 A) = σ max A)/σ min A) = ratio of largest to smallest singular values of A. Allows to define κ 2 A) when A is not square. Determinant *is not* a good indication of sensitivity Small eigenvalues *do not* always give a good indication of poor conditioning. Example: Consider, for a large α, the n n matrix A = I + αe 1 e T n Inverse of A is : A 1 = I αe 1 e T n For the -norm we have A = A 1 = 1 + α so that κ A) = 1 + α ) 2. Can give a very large condition number for a large α but all the eigenvalues of A are equal to one. 5-5 TB: 12; AB: 1.2.7; GvL 3.5 PertA 5-6 TB: 12; AB: 1.2.7; GvL 3.5 PertA 5-5 5-6 Rigorous norm-based error bounds Previous bound is valid only when perturbation is small enough, where small is not precisely defined. New bound valid within an explicitly given neighborhood. THEOREM 1: Assume that A + E)y = b + e b and Ax = b and that A 1 E < 1. Then A + E is nonsingular and A 1 A E 1 A 1 E A + e ) b b Begin with simple case: LEMMA: If E < 1 then I E is nonsingular and I E) 1 1 1 E Proof is based on following 5 steps a) Show: If E < 1 then I E is nonsingular b) Show: I E)I + E + E 2 + + E k ) = I E k+1. c) From which we get: To prove, first need to show that A + E is nonsingular if A is nonsingular and E is small. I E) 1 = k E i + I E) 1 E k+1 5-7 TB: 12; AB: 1.2.7; GvL 3.5 PertA 5-8 TB: 12; AB: 1.2.7; GvL 3.5 PertA 5-7 5-8

d) I E) 1 k = lim k Ei. We write this as I E) 1 = E i e) Finally: I E) 1 = lim k lim k 1 1 E k E i k = lim E i k k k E i lim E i k Can generalize result: LEMMA: If A is nonsingular and A 1 E < 1 then A + E is non-singular and A + E) 1 A 1 1 A 1 E Proof is based on relation A + E = AI + A 1 E) and use of previous lemma. Now we can prove the theorem: THEOREM 1: Assume that A + E)y = b + e b and Ax = b and that A 1 E < 1. Then A + E is nonsingular and A 1 A E 1 A 1 E A + e ) b b 5-9 TB: 12; AB: 1.2.7; GvL 3.5 PertA 5-10 TB: 12; AB: 1.2.7; GvL 3.5 PertA 5-9 5-10 Proof: From A + E)y = b + e b and Ax = b we get A + E)y x) = e b Ex. Hence: y x = A + E) 1 e b Ex) Taking norms y x A + E) 1 [ e b + E ] Dividing by and using result of lemma y x A + E) 1 [ e b / + E ] A 1 1 A 1 E [ e b / + E ] [ A 1 A eb 1 A 1 E A + E ] A Result follows by using inequality A b... QED Simplification when e b = 0 : A 1 E 1 A 1 E Simplification when E = 0 : A 1 A e b b Slightly less general form: Assume that E / A δ and e b / b δ and δκa) < 1 then Show the above result 2δκA) 1 δκa) 5-11 TB: 12; AB: 1.2.7; GvL 3.5 PertA 5-12 TB: 12; AB: 1.2.7; GvL 3.5 PertA 5-11 5-12

Another common form: THEOREM 2: Let A + A)y = b + b and Ax = b where A ɛ E, b ɛ e b, and assume that ɛ A 1 E < 1. Then ɛ A 1 A eb 1 ɛ A 1 E b + E ) A Normwise backward error We solve Ax = b and find an approximate solution y Question: Find smallest perturbation to apply to A, b so that *exact* solution of perturbed system is y Results to be seen later are of this type. 5-13 TB: 12; AB: 1.2.7; GvL 3.5 PertA 5-14 TB: 12; AB: 1.2.7; GvL 3.5 PertA 5-13 5-14 Normwise backward error in just A or b Suppose we model entire perturbation in RHS b. Let r = b Ay be the residual. Then y satisfies Ay = b + b with b = r exactly. The relative perturbation to the RHS is r b. Suppose we model entire perturbation in matrix A. ) Then y satisfies A + ryt y = b y T y The relative perturbation to the matrix is ry T y T y / A 2 = r 2 2 A y 2 Normwise backward error in both A & b For a given y and given perturbation directions E, e b, we define the Normwise backward error: η E,eb y) = min{ɛ A + A)y = b + b; for all A, b satisfying: A ɛ E ; and b ɛ e b } In other words η E,eb y) is the smallest ɛ for which { A + A)y = b + b; 1) A ɛ E ; b ɛ e b 5-15 TB: 12; AB: 1.2.7; GvL 3.5 PertA 5-16 TB: 12; AB: 1.2.7; GvL 3.5 PertA 5-15 5-16

y is given a computed solution). E and e b to be selected most likely directions of perturbation for A and b ). Typical choice: E = A, e b = b Explain why this is not unreasonable Let r = b Ay. Then we have: THEOREM 3: η E,eb y) = r E y + e b Normwise backward error is for case E = A, e b = b: η A,b y) = r A y + b Show how this can be used in practice as a means to stop some iterative method which computes a sequence of approximate solutions to Ax = b. Consider the 6 6 Vandermonde system Ax = b where a ij = j 2i 1), b = A [1, 1,, 1] T. We perturb A by E, with E 10 10 A and b similarly and solve the system. Evaluate the backward error for this case. Evaluate the forward bound provided by Theorem 2. Comment on the results. 5-17 TB: 12; AB: 1.2.7; GvL 3.5 PertA 5-18 TB: 12; AB: 1.2.7; GvL 3.5 PertA 5-17 5-18 Proof of Theorem 3 Let D E y + e b and η η E,eb y). The theorem states that η = r /D. Proof in 2 steps. First: Any A, b pair satisfying 1) is such that ɛ r /D. Indeed from 1) we have recall that r = b Ay) Ay + Ay = b + b r = Ay b r A y + b ɛ E y + e b ) ɛ r Second: We need to show an instance where the minimum value of r /D is reached. Take the pair A, b: A = αrz T ; b = βr with α = E y D ; β = e b D D The vector z depends on the norm used - for the 2-norm: z = y/ y 2. Here: Proof only for 2-norm a) We need to verify that first part of 1) is satisfied: A + A)y = Ay + αr yt y 2y = b r + αr ) E y = b 1 α)r = b 1 r E y + e b = b e b D r = b + βr A + A)y = b + b The desired result 5-19 TB: 12; AB: 1.2.7; GvL 3.5 PertA 5-20 TB: 12; AB: 1.2.7; GvL 3.5 PertA 5-19 5-20

b) Finally: Must now verify that A = η E and b = η e b. Exercise: Show that uv T 2 = u 2 v 2 A = α = E y r y = η E y 2 ryt D y 2 b = β r = e b D r = η e b QED 5-21 TB: 12; AB: 1.2.7; GvL 3.5 PertA 5-21