AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 13: Conditioning of Least Squares Problems; Stability of Householder Triangularization Xiangmin Jiao Stony Brook University Xiangmin Jiao Numerical Analysis I 1 / 16
Outline 1 Conditioning of Least Squares Problems (NLA 18) 2 Stability of Householder Triangularization (NLA 16,19) Xiangmin Jiao Numerical Analysis I 2 / 16
Four Conditioning Problems Least squares problem: Given A R m n with full rank and b R m, min x R n b Ax Its solution is x = A + b. Another quantity is y = Ax = Pb, where P = AA + Consider A and b as input data, and x and y as output. We then have four conditioning problems: Input \ Output y x b κ b y κ b x A κ A y κ A x These conditioning problems are important and subtle. Xiangmin Jiao Numerical Analysis I 3 / 16
Some Prerequisites We focus on the second column, namely κ b x and κ A x However, understanding κ b y and κ A y is prerequisite Three quantities: (All in 2-norms) Condition number of A: κ(a) = A A + = σ 1 /σ n Angle between b and y: θ = arccos y b. (0 θ π/2) Orientation of y with range(a): η = A x y. (1 η κ(a)) Xiangmin Jiao Numerical Analysis I 4 / 16
Sensitivity of y to Perturbations in b Intuition: The larger θ is, the more sensitive y is in terms of relative error Analysis: y = Pb, so where P = 1 κ b y = P y / b = b y = 1 cosθ, Input \ Output y x 1 b cosθ A Question: When is the maximum attained for perturbation δb? Xiangmin Jiao Numerical Analysis I 5 / 16
Sensitivity of y to Perturbations in b Intuition: The larger θ is, the more sensitive y is in terms of relative error Analysis: y = Pb, so where P = 1 κ b y = P y / b = b y = 1 cosθ, Input \ Output y x 1 b cosθ A Question: When is the maximum attained for perturbation δb? Answer: When is δb in range(a) Xiangmin Jiao Numerical Analysis I 5 / 16
Sensitivity of x to Perturbations in b Intuition: It depends on how sensitive y is to b, and how y lies within range(a) Analysis: x = A + b, so κ b x = A+ x / b = A+ b y y x = A+ 1 A cosθ η = κ(a) η cosθ, where η = A x / y Input \ Output y x b 1 cosθ A κ(a) η cosθ Xiangmin Jiao Numerical Analysis I 6 / 16
Sensitivity of x to Perturbations in b Assume cosθ = O(1), κ b x = κ(a) O(κ(A))! η cosθ can lie anywhere between 1 and Question: When is the maximum attained for perturbation δb? Xiangmin Jiao Numerical Analysis I 7 / 16
Sensitivity of x to Perturbations in b Assume cosθ = O(1), κ b x = κ(a) O(κ(A))! η cosθ can lie anywhere between 1 and Question: When is the maximum attained for perturbation δb? Answer: When is δb in subspace spanned by left singular vectors corresponding to smallest singular values Question: What if A is a nonsingular matrix? Xiangmin Jiao Numerical Analysis I 7 / 16
Sensitivity of x to Perturbations in b Assume cosθ = O(1), κ b x = κ(a) O(κ(A))! η cosθ can lie anywhere between 1 and Question: When is the maximum attained for perturbation δb? Answer: When is δb in subspace spanned by left singular vectors corresponding to smallest singular values Question: What if A is a nonsingular matrix? Answer: κ b x can lie anywhere between 1 and κ(a)! Xiangmin Jiao Numerical Analysis I 7 / 16
Sensitivity of y and x to Perturbations in A The relationship are nonlinear, because range(a) changes due to δa Intuitions: The larger θ is, the more sensitive y is in terms of relative error. Tilting of range(a) depends on κ(a). For x, it depends where y lies within range(a) Input \ Output y x 1 b cosθ A κ(a) cosθ κ(a) η cosθ κ(a)+ κ(a)2 tanθ η For second row, bounds are not necessarily tight Assume cosθ = O(1), κ A x can lie anywhere between κ(a) and O(κ(A) 2 ) Xiangmin Jiao Numerical Analysis I 8 / 16
Condition Numbers of Linear Systems Linear system Ax = b for nonsingular A R m m is a special case of least squares problems, where y = b If m = n, then θ = 0, so cosθ = 1 and tanθ = 0. Input \ Output y x b 1 κ(a)/η A κ(a) κ(a) Xiangmin Jiao Numerical Analysis I 9 / 16
Outline 1 Conditioning of Least Squares Problems (NLA 18) 2 Stability of Householder Triangularization (NLA 16,19) Xiangmin Jiao Numerical Analysis I 10 / 16
Solution of Least Squares Problems An efficient and robust approach is to use QR factorization A = ˆQˆR b can be projected onto range(a) by P = ˆQˆQT, and therefore ˆQˆRx = ˆQˆQ T b Left-multiply by ˆQ T and we get ˆRx = ˆQ T b (note A + = ˆR 1ˆQ T ) Least squares via QR Factorization Compute reduced QR factorization A = ˆQˆR Compute vector c = ˆQ T b Solve upper-triangular system ˆRx = c for x Computation is dominated by QR factorization (2mn 2 2 3 n3 ) What about stability? Xiangmin Jiao Numerical Analysis I 11 / 16
Backward Stability of Householder Triangularization For a QR factorization A = QR computed by Householder triangularization, the factors Q and R satisfy Q R = A+δA, δa / A = O(ǫ machine ), i.e., exact QR factorization of a slightly perturbed A R is R computed by algorithm using floating points However, Q is product of exactly orthogonal reflectors Q = Q 1 Q2... Q n where Q k is given by computed ṽ k, since Q is not formed explicitly Xiangmin Jiao Numerical Analysis I 12 / 16
Backward Stability of Solving Ax = b with QR Algorithm: Solving Ax = b by QR Factorization Compute A = QR using Householder, represent Q by reflectors Compute vector y = Q T b implicitly using reflectors Solve upper-triangular system Rx = y for x All three steps are backward stable Overall, we can show that (A+ A) x = b, A / A = O(ǫ machine ) as we prove next Xiangmin Jiao Numerical Analysis I 13 / 16
Backward Stability of Solving Ax = b with Householder Triangularization Proof: Step 2 gives Step 3 gives Therefore, ( Q +δq)ỹ = b, δq = O(ǫ machine ) ( R +δr) x = ỹ, δr / R = O(ǫ machine ) b = ( Q +δq)( R +δr) x = [ Q R +(δq) R + Q(δR)+(δQ)(δR) ] x Step 1 gives where Q R = A+δA b = A+δA+(δQ) R + Q(δR)+(δQ)(δR) x }{{} A Xiangmin Jiao Numerical Analysis I 14 / 16
Proof of Backward Stability Cont d Q R = A+δA where δa / A = O(ǫ machine ), and therefore R A Q T A+δA = O(1) A Now show that each term in A is small Overall, (δq) R A Q(δR) A (δq)(δr) A (δq) R A = O(ǫ machine ) Q δr R R A = O(ǫ machine ) δq δr A = O(ǫ2 machine ) A A δa A + (δq) R + Q(δR) + (δq)(δr) A A A Since the algorithm is backward stable, it is also accurate. = O(ǫ machine ) Xiangmin Jiao Numerical Analysis I 15 / 16
Backward Stability of Householder Triangularization Theorem Let the full-rank least squares problem be solved using Householder triangularization on a computer satisfying the two axioms of floating point numbers. The algorithm is backward stable in the sense that the computed solution x has the property (A+δA) x b) = min, δa A = O(ǫ machine ) for some δa R m n. Backward stability of the algorithm is true whether ˆQ T b is computed via explicit formation of ˆQ or computed implicitly Backward stability also holds for Householder triangularization with arbitrary column pivoting AP = ˆQˆR Xiangmin Jiao Numerical Analysis I 16 / 16