Householder QR

Householder reflectors are matrices of the form $P = I - 2ww^T$, where $w$ is a unit vector (a vector of 2-norm unity).

Geometrically, $Px$ represents a mirror image of $x$ with respect to the hyperplane $\mathrm{span}\{w\}^\perp$.

[Figure: $x$ and its mirror image $Px$ across the hyperplane orthogonal to $w$]

TB: 10,19; AB: 2.3.3; GvL 5.1

A few simple properties:

For real $w$: $P$ is symmetric. It is also orthogonal ($P^T P = I$).
In the complex case $P = I - 2ww^H$ is Hermitian and unitary.
$P$ can be written as $P = I - \beta vv^T$ with $\beta = 2/\|v\|_2^2$, where $v$ is a multiple of $w$. [storage: $v$ and $\beta$]
$Px$ can be evaluated as $x - \beta (x^T v)\, v$ (op count?)
Similarly: $PA = A - vz^T$ where $z^T = \beta\, v^T A$

NOTE: we work in $\mathbb{R}^m$, so all vectors are of length $m$, $P$ is of size $m \times m$, etc.
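
The two evaluation formulas above can be sketched in a few lines. This is a minimal illustration in plain Python (lists instead of arrays); the helper names `apply_reflector` and `apply_reflector_to_matrix` are made up for this sketch and do not come from the notes.

```python
def apply_reflector(v, beta, x):
    """Return (I - beta*v*v^T) x without forming the m-by-m matrix P."""
    s = beta * sum(vi * xi for vi, xi in zip(v, x))   # beta * (v^T x)
    return [xi - s * vi for vi, xi in zip(v, x)]


def apply_reflector_to_matrix(v, beta, A):
    """Return P A = A - v z^T with z^T = beta * v^T A (A is a list of rows)."""
    m, n = len(A), len(A[0])
    z = [beta * sum(v[i] * A[i][j] for i in range(m)) for j in range(n)]
    return [[A[i][j] - v[i] * z[j] for j in range(n)] for i in range(m)]


v = [1.0, 2.0, 2.0]
beta = 2.0 / sum(vi * vi for vi in v)    # beta = 2 / ||v||_2^2
x = [3.0, 0.0, 4.0]
y = apply_reflector(v, beta, x)          # mirror image of x
yy = apply_reflector(v, beta, y)         # reflecting twice gives x back
```

Applying $P$ to a vector this way costs one inner product and one vector update, roughly $4m$ flops, instead of the $2m^2$ of a dense matrix-vector product.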

Problem 1: Given a vector $x \ne 0$, find $w$ such that

$$(I - 2ww^T)x = \alpha e_1,$$

where $\alpha$ is a (free) scalar.

Writing $(I - \beta vv^T)x = \alpha e_1$ yields $\beta (v^T x)\, v = x - \alpha e_1$.

The desired $w$ is a multiple of $x - \alpha e_1$, i.e., we can take $v = x - \alpha e_1$.

To determine $\alpha$ we just recall that $\|(I - 2ww^T)x\|_2 = \|x\|_2$.

As a result: $|\alpha| = \|x\|_2$, or $\alpha = \pm\|x\|_2$.
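
Should verify that both signs work, i.e., that in both cases we indeed get $Px = \alpha e_1$ [exercise].

Which sign is best? To reduce cancellation, the resulting $x - \alpha e_1$ should not be small. So, $\alpha = -\mathrm{sign}(\xi_1)\|x\|_2$, which gives

$$v = x + \mathrm{sign}(\xi_1)\|x\|_2\, e_1 \quad\text{and}\quad \beta = 2/\|v\|_2^2 .$$

$$v = \begin{pmatrix} \hat\xi_1 \\ \xi_2 \\ \vdots \\ \xi_{m-1} \\ \xi_m \end{pmatrix}
\quad\text{with}\quad
\hat\xi_1 = \begin{cases} \xi_1 + \|x\|_2 & \text{if } \xi_1 > 0 \\ \xi_1 - \|x\|_2 & \text{if } \xi_1 \le 0 \end{cases}$$

OK, but will yield a negative multiple of $e_1$ if $\xi_1 > 0$.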

Show that $(I - \beta vv^T)x = \alpha e_1$ when $v = x - \alpha e_1$ and $\alpha = \pm\|x\|_2$.

Solution: Equivalent to showing that

$$x - (\beta x^T v)\, v = \alpha e_1, \quad\text{i.e.,}\quad x - \alpha e_1 = (\beta x^T v)\, v .$$

But recall that $v = x - \alpha e_1$, so we need to show that

$$\beta x^T v = 1, \quad\text{i.e., that}\quad \frac{2 x^T v}{\|x - \alpha e_1\|_2^2} = 1 .$$

Denominator $= \|x\|_2^2 + \alpha^2 - 2\alpha e_1^T x = 2(\|x\|_2^2 - \alpha e_1^T x)$

Numerator $= 2x^T v = 2x^T(x - \alpha e_1) = 2(\|x\|_2^2 - \alpha x^T e_1)$

Numerator / Denominator $= 1$.
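
The claim can also be checked numerically for both signs. The following is a small sketch in plain Python; the function name `reflect_to_e1` and the test vector are chosen only for illustration.

```python
import math


def reflect_to_e1(x, sign):
    """Build v = x - alpha*e_1 with alpha = sign*||x||_2; return (alpha, P x).

    Assumes x is not already equal to alpha*e_1 (otherwise v = 0).
    """
    alpha = sign * math.sqrt(sum(t * t for t in x))
    v = list(x)
    v[0] -= alpha                               # v = x - alpha e_1
    beta = 2.0 / sum(t * t for t in v)          # beta = 2 / ||v||_2^2
    s = beta * sum(vi * xi for vi, xi in zip(v, x))
    return alpha, [xi - s * vi for vi, xi in zip(v, x)]


x = [3.0, 4.0, 12.0]                            # ||x||_2 = 13
alpha_plus, Px_plus = reflect_to_e1(x, +1.0)    # expected P x = +13 e_1
alpha_minus, Px_minus = reflect_to_e1(x, -1.0)  # expected P x = -13 e_1
```

In both cases the trailing components of $Px$ vanish up to rounding, and the first component equals $\alpha$.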

Alternative:

Define $\sigma = \sum_{i=2}^{m} \xi_i^2$.

Always set $\hat\xi_1 = \xi_1 - \|x\|_2$. This update is OK when $\xi_1 \le 0$.

When $\xi_1 > 0$ compute $\hat\xi_1$ as

$$\hat\xi_1 = \xi_1 - \|x\|_2 = \frac{\xi_1^2 - \|x\|_2^2}{\xi_1 + \|x\|_2} = \frac{-\sigma}{\xi_1 + \|x\|_2}$$

So:

$$\hat\xi_1 = \begin{cases} \dfrac{-\sigma}{\xi_1 + \|x\|_2} & \text{if } \xi_1 > 0 \\[2mm] \xi_1 - \|x\|_2 & \text{if } \xi_1 \le 0 \end{cases}$$

It is customary to compute a vector $v$ such that $v_1 = 1$. So $v$ is scaled by its first component.

If $\sigma = 0$, we will get $v = [1;\ x(2:m)]$ and $\beta = 0$.
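
To see why the rewritten formula matters, the two expressions for $\hat\xi_1$ can be compared in double precision on a vector whose trailing part is tiny. This is only an illustrative experiment; the test vector is an arbitrary choice.

```python
import math

x = [1.0, 1e-9]                       # xi_1 > 0 with a tiny trailing part
sigma = sum(t * t for t in x[1:])     # sigma = xi_2^2 + ... + xi_m^2
xnrm = math.sqrt(x[0] ** 2 + sigma)   # rounds to exactly 1.0 in double precision

naive = x[0] - xnrm                   # xi_1 - ||x||_2: every digit cancels
stable = -sigma / (x[0] + xnrm)       # -sigma / (xi_1 + ||x||_2): no subtraction
```

Here `naive` comes out as exactly zero, so the direction of $v$ would be lost, while `stable` retains the correct value of about $-5 \times 10^{-19}$.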

Matlab function:

    function [v,bet] = house (x)
    %% computes the householder vector for x
    %% returns v with v(1) = 1 and bet such that (I - bet*v*v')*x = norm(x)*e1
    m = length(x);
    v = [1 ; x(2:m)];
    sigma = v(2:m)' * v(2:m);        % squared norm of x(2:m)
    if (sigma == 0)
        bet = 0;                     % nothing to eliminate: P = I
    else
        xnrm = sqrt(x(1)^2 + sigma) ;
        if (x(1) <= 0)
            v(1) = x(1) - xnrm;
        else
            v(1) = -sigma / (x(1) + xnrm) ;   % avoids cancellation
        end
        bet = 2 / (1+sigma/v(1)^2);
        v = v/v(1) ;                 % scale so that v(1) = 1
    end
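
For readers who want to experiment outside Matlab, here is a direct transcription of `house` into plain Python. It is a sketch under the same conventions ($v_1 = 1$, $\alpha = +\|x\|_2$), using lists rather than arrays.

```python
import math


def house(x):
    """Return (v, beta) with v[0] == 1 so that (I - beta v v^T) x = ||x||_2 e_1.

    When x[1:] is zero, beta = 0 and P = I, so x is left unchanged.
    """
    sigma = sum(t * t for t in x[1:])          # squared norm of x(2:m)
    if sigma == 0.0:
        return [1.0] + list(x[1:]), 0.0
    xnrm = math.sqrt(x[0] ** 2 + sigma)
    if x[0] <= 0:
        v1 = x[0] - xnrm
    else:
        v1 = -sigma / (x[0] + xnrm)            # same value, no cancellation
    beta = 2.0 / (1.0 + sigma / v1 ** 2)
    return [1.0] + [t / v1 for t in x[1:]], beta


x = [3.0, 4.0]
v, beta = house(x)                                   # v = [1, -2], beta = 0.4
s = beta * sum(vi * xi for vi, xi in zip(v, x))
Px = [xi - s * vi for vi, xi in zip(v, x)]           # reflected vector, ||x||_2 e_1
```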

Problem 2: Generalization.

Given an $m \times n$ matrix $X$, find $w_1, w_2, \ldots, w_n$ such that

$$(I - 2w_n w_n^T) \cdots (I - 2w_2 w_2^T)(I - 2w_1 w_1^T)\, X = R$$

where $r_{ij} = 0$ for $i > j$.

First step is easy: select $w_1$ so that the first column of $X$ becomes $\alpha e_1$.

Second step: select $w_2$ so that $x_2$ has zeros below the 2nd component. Etc.

After $k-1$ steps: $X_k \equiv P_{k-1} \cdots P_1 X$ has the following shape:
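
$$X_k = \begin{pmatrix}
x_{11} & x_{12} & x_{13} & \cdots & \cdots & \cdots & x_{1n} \\
 & x_{22} & x_{23} & \cdots & \cdots & \cdots & x_{2n} \\
 & & x_{33} & \cdots & \cdots & \cdots & x_{3n} \\
 & & & \ddots & & & \vdots \\
 & & & & x_{kk} & \cdots & \vdots \\
 & & & & x_{k+1,k} & \cdots & x_{k+1,n} \\
 & & & & \vdots & & \vdots \\
 & & & & x_{m,k} & \cdots & x_{m,n}
\end{pmatrix}$$

To do: transform this matrix into one which is upper triangular up to the $k$-th column, while leaving the previous columns untouched.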

To leave the first $k-1$ columns unchanged, $w$ must have zeros in positions 1 through $k-1$.

$$P_k = I - 2w_k w_k^T, \qquad w_k = \frac{v}{\|v\|_2},$$

where the vector $v$ can be expressed as a Householder vector for a shorter vector using the matlab function house:

$$v = \begin{pmatrix} 0 \\ \mathrm{house}(X(k:m,\, k)) \end{pmatrix}$$

The result is that work is done on the $(k:m,\ k:n)$ submatrix.

ALGORITHM 1: Householder QR

    For k = 1 : n do
        [v, β] = house(X(k : m, k))
        X(k : m, k : n) = (I − β v v^T) X(k : m, k : n)
        If (k < m)
            X(k + 1 : m, k) = v(2 : m − k + 1)
        end
    end

In the end: $X_n = P_n P_{n-1} \cdots P_1 X$ = upper triangular.
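
The algorithm can be sketched in plain Python as follows. This is an illustrative transcription, not a tuned implementation: `house` is a Python port of the Matlab function from these notes, the matrix is a list of rows, and `householder_qr` is a name chosen for this sketch. It assumes $m \ge n$.

```python
import math


def house(x):
    """Householder vector with v[0] == 1 (Python port of the Matlab `house`)."""
    sigma = sum(t * t for t in x[1:])
    if sigma == 0.0:
        return [1.0] + list(x[1:]), 0.0
    xnrm = math.sqrt(x[0] ** 2 + sigma)
    v1 = x[0] - xnrm if x[0] <= 0 else -sigma / (x[0] + xnrm)
    return [1.0] + [t / v1 for t in x[1:]], 2.0 / (1.0 + sigma / v1 ** 2)


def householder_qr(X):
    """Return (A, betas): R in the upper triangle of A, v(2:end) stored below it."""
    A = [[float(t) for t in row] for row in X]      # work on a copy
    m, n = len(A), len(A[0])
    betas = []
    for k in range(min(m, n)):
        v, beta = house([A[i][k] for i in range(k, m)])
        for j in range(k, n):                       # apply I - beta v v^T to column j
            s = beta * sum(v[i - k] * A[i][j] for i in range(k, m))
            for i in range(k, m):
                A[i][j] -= s * v[i - k]
        for i in range(k + 1, m):                   # overwrite the zeros with v(2:end)
            A[i][k] = v[i - k]
        betas.append(beta)
    return A, betas


A, betas = householder_qr([[3, 1], [4, 2], [0, 2]])
R = [[A[i][j] if j >= i else 0.0 for j in range(2)] for i in range(2)]
```

Storing the essential part of each $v$ in the zeroed-out entries means the factorization needs only the original array plus the $n$ scalars $\beta_k$.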

Yields the factorization:

$$X = QR \quad\text{where}\quad Q = P_1 P_2 \cdots P_n \quad\text{and}\quad R = X_n$$

Example: Reduce the system of vectors:

$$X = [x_1, x_2, x_3] = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & -1 \\ 1 & 0 & 4 \end{pmatrix}$$

Answer:
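
$$x_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}, \quad \|x_1\|_2 = 2, \quad
v_1 = \begin{pmatrix} 1 + 2 \\ 1 \\ 1 \\ 1 \end{pmatrix}, \quad
w_1 = \frac{1}{2\sqrt{3}} \begin{pmatrix} 1 + 2 \\ 1 \\ 1 \\ 1 \end{pmatrix}$$

$$P_1 = I - 2w_1 w_1^T = \frac{1}{6} \begin{pmatrix} -3 & -3 & -3 & -3 \\ -3 & 5 & -1 & -1 \\ -3 & -1 & 5 & -1 \\ -3 & -1 & -1 & 5 \end{pmatrix}$$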
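
$$P_1 X = \begin{pmatrix} -2 & -1 & -2 \\ 0 & 1/3 & -1 \\ 0 & -2/3 & -2 \\ 0 & -2/3 & 3 \end{pmatrix}$$

Next stage:

$$x_2 = \begin{pmatrix} 0 \\ 1/3 \\ -2/3 \\ -2/3 \end{pmatrix}, \quad \|x_2\|_2 = 1, \quad
v_2 = \begin{pmatrix} 0 \\ 1/3 + 1 \\ -2/3 \\ -2/3 \end{pmatrix}$$

$$P_2 = I - \frac{2}{v_2^T v_2}\, v_2 v_2^T = \frac{1}{3} \begin{pmatrix} 3 & 0 & 0 & 0 \\ 0 & -1 & 2 & 2 \\ 0 & 2 & 2 & -1 \\ 0 & 2 & -1 & 2 \end{pmatrix}$$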
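
$$P_2 P_1 X = \begin{pmatrix} -2 & -1 & -2 \\ 0 & -1 & 1 \\ 0 & 0 & -3 \\ 0 & 0 & 2 \end{pmatrix}$$

Last stage:

$$x_3 = \begin{pmatrix} 0 \\ 0 \\ -3 \\ 2 \end{pmatrix}, \quad \|x_3\|_2 = \sqrt{13}, \quad
v_3 = \begin{pmatrix} 0 \\ 0 \\ -3 - \sqrt{13} \\ 2 \end{pmatrix}$$

$$P_3 = I - \frac{2}{v_3^T v_3}\, v_3 v_3^T = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -.83205 & .55470 \\ 0 & 0 & .55470 & .83205 \end{pmatrix}$$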
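
$$P_3 P_2 P_1 X = \begin{pmatrix} -2 & -1 & -2 \\ 0 & -1 & 1 \\ 0 & 0 & \sqrt{13} \\ 0 & 0 & 0 \end{pmatrix} = R$$

$$P_3 P_2 P_1 = \begin{pmatrix} -.50000 & -.50000 & -.50000 & -.50000 \\ -.50000 & -.50000 & .50000 & .50000 \\ .13868 & -.13868 & -.69338 & .69338 \\ -.69338 & .69338 & -.13868 & .13868 \end{pmatrix}$$

So we end up with the factorization

$$X = \underbrace{P_1 P_2 P_3}_{Q}\, R$$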
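
The three stages of the example can be replayed numerically by applying the reflectors defined by $v_1$, $v_2$, $v_3$ to $X$. The sketch below uses plain Python and a helper `reflect_columns` named only for this illustration. Note that the example chooses $\alpha = -\mathrm{sign}(\xi_1)\|x\|_2$, so the diagonal signs of $R$ differ from what the `house` function (which always uses $\alpha = +\|x\|_2$) would produce.

```python
import math


def reflect_columns(v, A):
    """Apply P = I - 2 v v^T / (v^T v) to every column of A (a list of rows)."""
    beta = 2.0 / sum(t * t for t in v)
    m, n = len(A), len(A[0])
    out = [row[:] for row in A]
    for j in range(n):
        s = beta * sum(v[i] * A[i][j] for i in range(m))
        for i in range(m):
            out[i][j] = A[i][j] - s * v[i]
    return out


X = [[1.0, 1.0, 1.0], [1.0, 1.0, 0.0], [1.0, 0.0, -1.0], [1.0, 0.0, 4.0]]
v1 = [3.0, 1.0, 1.0, 1.0]
v2 = [0.0, 4.0 / 3.0, -2.0 / 3.0, -2.0 / 3.0]
v3 = [0.0, 0.0, -3.0 - math.sqrt(13.0), 2.0]

# R = P3 P2 P1 X; up to rounding: rows (-2, -1, -2), (0, -1, 1), (0, 0, sqrt(13)), 0
R = reflect_columns(v3, reflect_columns(v2, reflect_columns(v1, X)))
```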

MAJOR difference with Gram-Schmidt: $Q$ is $m \times m$ and $R$ is $m \times n$ (same as $X$). The matrix $R$ has zeros below the $n$-th row.

Note also: this factorization always exists.

Cost of Householder QR? Compare with Gram-Schmidt.

Question: How to obtain $X = Q_1 R_1$ where $Q_1$ is the same size as $X$ and $R_1$ is $n \times n$ (as in MGS)?

Answer: simply use the partitioning

$$X = \begin{pmatrix} Q_1 & Q_2 \end{pmatrix} \begin{pmatrix} R_1 \\ 0 \end{pmatrix} \quad\Rightarrow\quad X = Q_1 R_1$$

Referred to as the thin QR factorization (or economy-size QR factorization in matlab).

How to solve a least-squares problem $Ax = b$ using the Householder factorization?

Answer: no need to compute $Q_1$. Just apply $Q^T$ to $b$. This entails applying the successive Householder reflections to $b$.
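
As a rough illustration of this recipe, the sketch below applies each reflector to the remaining columns of $A$ and to $b$ at the same time, then solves $R_1 x = (Q^T b)(1:n)$ by back substitution. It is plain Python with a function name (`lstsq_householder`) invented for this example, and it assumes $m \ge n$ and full column rank.

```python
import math


def lstsq_householder(A, b):
    """Least-squares solution of A x ~ b via Householder reflections."""
    R = [[float(t) for t in row] for row in A]
    c = [float(t) for t in b]
    m, n = len(R), len(R[0])
    for k in range(n):
        x = [R[i][k] for i in range(k, m)]
        norm = math.sqrt(sum(t * t for t in x))
        if norm == 0.0:
            continue                                # column already zero
        v = x[:]
        v[0] += math.copysign(norm, x[0])           # v = x + sign(x_1) ||x||_2 e_1
        beta = 2.0 / sum(t * t for t in v)
        for j in range(k, n):                       # reflect the columns of A
            s = beta * sum(v[i - k] * R[i][j] for i in range(k, m))
            for i in range(k, m):
                R[i][j] -= s * v[i - k]
        s = beta * sum(v[i - k] * c[i] for i in range(k, m))
        for i in range(k, m):                       # same reflection applied to b
            c[i] -= s * v[i - k]
    sol = [0.0] * n
    for i in range(n - 1, -1, -1):                  # back substitution with R_1
        sol[i] = (c[i] - sum(R[i][j] * sol[j] for j in range(i + 1, n))) / R[i][i]
    return sol


# Fit a line a + b*t through (0, 1), (1, 2), (2, 2).
coef = lstsq_householder([[1, 0], [1, 1], [1, 2]], [1, 2, 2])
```

For this small data set the normal equations give $a = 7/6$ and $b = 1/2$, which is what `coef` reproduces up to rounding. The matrix $Q$ is never formed.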

The rank-deficient case

Result of Householder QR: $Q_1$ and $R_1$ such that $Q_1 R_1 = X$. In the rank-deficient case, we can have $\mathrm{span}\{Q_1\} \ne \mathrm{span}\{X\}$ because $R_1$ may be singular.

Remedy: Householder QR with column pivoting. Result will be:

$$X \Pi = Q \begin{pmatrix} R_{11} & R_{12} \\ 0 & 0 \end{pmatrix}$$

$R_{11}$ is nonsingular. So $\mathrm{rank}(X)$ = size of $R_{11}$ = $\mathrm{rank}(Q_1)$, and $Q_1$ and $X$ span the same subspace.

$\Pi$ permutes columns of $X$.

Algorithm: At step $k$, the active matrix is $X(k:m,\ k:n)$. Swap the $k$-th column with the column of largest 2-norm in $X(k:m,\ k:n)$. If all the columns have zero norm, stop.

[Figure: the active block $X(k:m,\ k:n)$; the $k$-th column is swapped with the column of largest norm]

Practical Question: How to implement this?

Suppose you know the norms of each column of $X$ at the start. What happens to each of the norms of $X(2:m,\ j)$ for $j = 2, \ldots, n$? Generalize this to step $k$ and obtain a procedure to inexpensively compute the desired norms at each step.
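
One common way to answer this question, sketched here as an assumption about what is intended: a reflection preserves the 2-norm of every column of the active block, so after step $k$ the squared norm of the remaining part of column $j$ is the old squared norm minus $r_{kj}^2$. The plain-Python sketch below uses that downdate; the function name, the absolute tolerance `tol`, and the test matrix are illustrative choices.

```python
import math


def qr_column_pivoting(X, tol=1e-12):
    """Householder QR with column pivoting; returns (A, perm, rank)."""
    A = [[float(t) for t in row] for row in X]
    m, n = len(A), len(A[0])
    perm = list(range(n))
    norms2 = [sum(A[i][j] ** 2 for i in range(m)) for j in range(n)]  # computed once
    rank = 0
    for k in range(min(m, n)):
        p = max(range(k, n), key=lambda j: norms2[j])   # column of largest norm
        if norms2[p] <= tol:
            break                                       # active block is (numerically) zero
        for row in A:
            row[k], row[p] = row[p], row[k]
        norms2[k], norms2[p] = norms2[p], norms2[k]
        perm[k], perm[p] = perm[p], perm[k]
        x = [A[i][k] for i in range(k, m)]
        norm = math.sqrt(sum(t * t for t in x))
        if norm == 0.0:
            break
        v = x[:]
        v[0] += math.copysign(norm, x[0])
        beta = 2.0 / sum(t * t for t in v)
        for j in range(k, n):
            s = beta * sum(v[i - k] * A[i][j] for i in range(k, m))
            for i in range(k, m):
                A[i][j] -= s * v[i - k]
        for j in range(k + 1, n):
            norms2[j] -= A[k][j] ** 2                   # downdate: subtract r_kj^2
        rank += 1
    return A, perm, rank


# Third column is the sum of the first two, so the rank is 2.
A, perm, rank = qr_column_pivoting([[1, 2, 3], [2, 4, 6], [1, 0, 1], [0, 1, 1]])
```

The downdate costs $O(n)$ per step instead of recomputing every norm. It can lose accuracy through cancellation when a column is nearly eliminated, so a careful implementation would recompute the affected norm in that situation.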

Properties of the QR factorization

Consider the thin factorization $A = QR$ (size(Q) = [m,n] = size(A)). Assume $r_{ii} > 0$, $i = 1, \ldots, n$.

1. When $A$ is of full column rank this factorization exists and is unique.
2. It satisfies: $\mathrm{span}\{a_1, \ldots, a_k\} = \mathrm{span}\{q_1, \ldots, q_k\}$, $k = 1, \ldots, n$.
3. $R$ is identical with the Cholesky factor $G^T$ of $A^T A$.

When $A$ is rank-deficient and Householder with pivoting is used, then $\mathrm{Ran}\{Q_1\} = \mathrm{Ran}\{A\}$.
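
Property 3 can be checked on the $4 \times 3$ example matrix used earlier in these notes. The sketch below forms $A^T A$, computes its Cholesky factor with a small hand-written routine (`cholesky_lower`, a name chosen for this illustration), and transposes it.

```python
import math


def cholesky_lower(M):
    """Return lower-triangular G with G G^T = M (M symmetric positive definite)."""
    n = len(M)
    G = [[0.0] * n for _ in range(n)]
    for j in range(n):
        G[j][j] = math.sqrt(M[j][j] - sum(G[j][k] ** 2 for k in range(j)))
        for i in range(j + 1, n):
            G[i][j] = (M[i][j] - sum(G[i][k] * G[j][k] for k in range(j))) / G[j][j]
    return G


A = [[1.0, 1.0, 1.0], [1.0, 1.0, 0.0], [1.0, 0.0, -1.0], [1.0, 0.0, 4.0]]
AtA = [[sum(A[i][p] * A[i][q] for i in range(4)) for q in range(3)] for p in range(3)]
G = cholesky_lower(AtA)
R_chol = [[G[j][i] for j in range(3)] for i in range(3)]   # G^T, upper triangular
```

The result has rows $(2, 1, 2)$, $(0, 1, -1)$, $(0, 0, \sqrt{13})$. This is the $R$ of the worked example once the signs of its first two rows are flipped to make the diagonal positive, as the normalization $r_{ii} > 0$ requires.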
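
Givens Rotations

Matrices of the form

$$G(i, k, \theta) = \begin{pmatrix}
1 & \cdots & 0 & \cdots & 0 & \cdots & 0 \\
\vdots & \ddots & \vdots & & \vdots & & \vdots \\
0 & \cdots & c & \cdots & s & \cdots & 0 \\
\vdots & & \vdots & \ddots & \vdots & & \vdots \\
0 & \cdots & -s & \cdots & c & \cdots & 0 \\
\vdots & & \vdots & & \vdots & \ddots & \vdots \\
0 & \cdots & 0 & \cdots & 0 & \cdots & 1
\end{pmatrix}$$

(the entries $c$, $s$, $-s$, $c$ sit in rows and columns $i$ and $k$), with $c = \cos\theta$ and $s = \sin\theta$, represent a rotation in the span of $e_i$ and $e_k$.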

Main idea of Givens rotations: consider $y = Gx$, then

$$y_i = c\, x_i + s\, x_k, \qquad y_k = -s\, x_i + c\, x_k, \qquad y_j = x_j \ \text{ for } j \ne i, k$$

Can make $y_k = 0$ by selecting

$$s = x_k / t; \quad c = x_i / t; \quad t = \sqrt{x_i^2 + x_k^2}$$

This is used to introduce zeros in the first column of a matrix $A$ (for example $G(m-1, m)$, $G(m-2, m-1)$, etc., $G(1, 2)$). See text for details.
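
A short sketch of this procedure in plain Python: `givens` computes $c$ and $s$ from the formulas above, and `zero_first_column` applies the rotations on adjacent rows from the bottom up. Both names are illustrative, and the $t = 0$ case is handled by returning the identity rotation, which is a convention chosen here.

```python
import math


def givens(xi, xk):
    """Return (c, s) with c*xi + s*xk = t and -s*xi + c*xk = 0, t = sqrt(xi^2 + xk^2)."""
    t = math.hypot(xi, xk)
    if t == 0.0:
        return 1.0, 0.0            # nothing to rotate
    return xi / t, xk / t


def zero_first_column(A):
    """Zero A[1:, 0] using rotations on rows (m-2, m-1), ..., (0, 1)."""
    A = [[float(t) for t in row] for row in A]
    m, n = len(A), len(A[0])
    for i in range(m - 2, -1, -1):
        c, s = givens(A[i][0], A[i + 1][0])
        for j in range(n):
            a, b = A[i][j], A[i + 1][j]
            A[i][j] = c * a + s * b
            A[i + 1][j] = -s * a + c * b
    return A


c, s = givens(3.0, 4.0)                          # c = 0.6, s = 0.8
B = zero_first_column([[1, 2], [2, 0], [2, 1]])  # first column becomes (3, 0, 0)
```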

Orthogonal projectors and subspaces

Notation: Given a subspace $\mathcal{X}$ of $\mathbb{R}^m$ define

$$\mathcal{X}^\perp = \{ y \mid y \perp x, \ \forall x \in \mathcal{X} \}$$

Let $Q = [q_1, \ldots, q_r]$ be an orthonormal basis of $\mathcal{X}$.

How would you obtain such a basis?

Then define the orthogonal projector $P = QQ^T$.

AB: 2.4.4; GvL 5.4
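
Properties

(a) $P^2 = P$
(b) $(I - P)^2 = I - P$
(c) $\mathrm{Ran}(P) = \mathcal{X}$
(d) $\mathrm{Null}(P) = \mathcal{X}^\perp$
(e) $\mathrm{Ran}(I - P) = \mathrm{Null}(P) = \mathcal{X}^\perp$

Note that (b) means that $I - P$ is also a projector.

Proof. (a), (b) are trivial.

(c): Clearly $\mathrm{Ran}(P) = \{x \mid x = QQ^T y,\ y \in \mathbb{R}^m\} \subseteq \mathcal{X}$. Any $x \in \mathcal{X}$ is of the form $x = Qy$, $y \in \mathbb{R}^r$. Take $Px = QQ^T(Qy) = Qy = x$. Since $x = Px$, $x \in \mathrm{Ran}(P)$. So $\mathcal{X} \subseteq \mathrm{Ran}(P)$. In the end $\mathcal{X} = \mathrm{Ran}(P)$.

(d): $x \in \mathcal{X}^\perp \leftrightarrow (x, y) = 0,\ \forall y \in \mathcal{X} \leftrightarrow (x, Qz) = 0,\ \forall z \in \mathbb{R}^r \leftrightarrow (Q^T x, z) = 0,\ \forall z \in \mathbb{R}^r \leftrightarrow Q^T x = 0 \leftrightarrow QQ^T x = 0 \leftrightarrow Px = 0$.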
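
(e): Need to show inclusion both ways.

$x \in \mathrm{Null}(P) \leftrightarrow Px = 0 \leftrightarrow (I - P)x = x \rightarrow x \in \mathrm{Ran}(I - P)$

$x \in \mathrm{Ran}(I - P) \leftrightarrow \exists\, y \in \mathbb{R}^m \mid x = (I - P)y \rightarrow Px = P(I - P)y = 0 \rightarrow x \in \mathrm{Null}(P)$

Result: Any $x \in \mathbb{R}^m$ can be written in a unique way as

$$x = x_1 + x_2, \qquad x_1 \in \mathcal{X}, \quad x_2 \in \mathcal{X}^\perp$$

Proof: Just set $x_1 = Px$, $x_2 = (I - P)x$.

Called the Orthogonal Decomposition.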
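
A small numerical illustration of the projector and of the orthogonal decomposition, in plain Python. The function name `project`, the basis vectors, and the test vector are chosen for this sketch; $P$ is applied as $Q(Q^T x)$ without forming the matrix.

```python
def project(Q_cols, x):
    """Return P x = Q (Q^T x), where Q_cols is a list of orthonormal vectors."""
    coeffs = [sum(qi * xi for qi, xi in zip(q, x)) for q in Q_cols]      # Q^T x
    return [sum(c * q[i] for c, q in zip(coeffs, Q_cols)) for i in range(len(x))]


q1 = [0.5, 0.5, 0.5, 0.5]          # orthonormal basis of a 2-dimensional subspace
q2 = [0.5, 0.5, -0.5, -0.5]
x = [1.0, 2.0, 3.0, 4.0]

x1 = project([q1, q2], x)                       # component in the subspace
x2 = [a - b for a, b in zip(x, x1)]             # (I - P) x, component in the complement
x1_again = project([q1, q2], x1)                # P^2 x = P x
inner = sum(a * b for a, b in zip(x1, x2))      # (x1, x2) should vanish
```

Here $x_1 = (1.5, 1.5, 3.5, 3.5)$ and $x_2 = (-0.5, 0.5, -0.5, 0.5)$, and the two pieces are orthogonal.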
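
Orthogonal decomposition

In other words

$$\mathbb{R}^m = P\,\mathbb{R}^m \oplus (I - P)\,\mathbb{R}^m$$

or: $\mathbb{R}^m = \mathrm{Ran}(P) \oplus \mathrm{Ran}(I - P)$
or: $\mathbb{R}^m = \mathrm{Ran}(P) \oplus \mathrm{Null}(P)$
or: $\mathbb{R}^m = \mathrm{Ran}(P) \oplus \mathrm{Ran}(P)^\perp$

Can complete the basis $\{q_1, \ldots, q_r\}$ into an orthonormal basis of $\mathbb{R}^m$ with $q_{r+1}, \ldots, q_m$.

$\{q_{r+1}, \ldots, q_m\}$ = basis of $\mathcal{X}^\perp$. $\dim(\mathcal{X}^\perp) = m - r$.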

Four fundamental subspaces - URV decomposition

Let $A \in \mathbb{R}^{m \times n}$ and consider $\mathrm{Ran}(A)^\perp$.

Property 1: $\mathrm{Ran}(A)^\perp = \mathrm{Null}(A^T)$

Proof: $x \in \mathrm{Ran}(A)^\perp$ iff $(Ay, x) = 0$ for all $y$ iff $(y, A^T x) = 0$ for all $y$ ...

Property 2: $\mathrm{Ran}(A^T) = \mathrm{Null}(A)^\perp$

Take $\mathcal{X} = \mathrm{Ran}(A)$ in the orthogonal decomposition.

Result:

$$\mathbb{R}^m = \mathrm{Ran}(A) \oplus \mathrm{Null}(A^T)$$
$$\mathbb{R}^n = \mathrm{Ran}(A^T) \oplus \mathrm{Null}(A)$$

4 fundamental subspaces: $\mathrm{Ran}(A)$, $\mathrm{Null}(A)$, $\mathrm{Ran}(A^T)$, $\mathrm{Null}(A^T)$
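
Express the above with bases for $\mathbb{R}^m$:

$$[\underbrace{u_1, u_2, \cdots, u_r}_{\mathrm{Ran}(A)},\ \underbrace{u_{r+1}, u_{r+2}, \cdots, u_m}_{\mathrm{Null}(A^T)}]$$

and for $\mathbb{R}^n$:

$$[\underbrace{v_1, v_2, \cdots, v_r}_{\mathrm{Ran}(A^T)},\ \underbrace{v_{r+1}, v_{r+2}, \cdots, v_n}_{\mathrm{Null}(A)}]$$

Observe $u_i^T A v_j = 0$ for $i > r$ or $j > r$. Therefore

$$U^T A V = R = \begin{pmatrix} C & 0 \\ 0 & 0 \end{pmatrix}_{m \times n}, \quad C \in \mathbb{R}^{r \times r} \quad\Longrightarrow\quad A = U R V^T$$

General class of URV decompositions.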

Far from unique.

Show how you can get a decomposition in which $C$ is lower (or upper) triangular, from the above factorization.

Can select the decomposition so that $R$ is upper triangular: URV decomposition.
Can select the decomposition so that $R$ is lower triangular: ULV decomposition.
SVD = special case of URV where $R$ = diagonal.

How can you get the ULV decomposition by using only the Householder QR factorization (possibly with pivoting)? [Hint: you must use Householder twice]
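
One possible route following the hint, given here as a sketch rather than as the intended solution: apply Householder QR with column pivoting to $A$, then apply a second Householder QR to the transpose of the nonzero block $(R_{11}\ R_{12})$. The symbols $Z$ and $T$ are introduced only for this derivation; $r$ denotes the rank of $A$.

```latex
A \Pi = Q \begin{pmatrix} R_{11} & R_{12} \\ 0 & 0 \end{pmatrix},
\qquad
\begin{pmatrix} R_{11}^T \\ R_{12}^T \end{pmatrix} = Z \begin{pmatrix} T \\ 0 \end{pmatrix},
\quad Z \in \mathbb{R}^{n \times n} \text{ orthogonal},\ \
T \in \mathbb{R}^{r \times r} \text{ upper triangular}

\Longrightarrow\quad
\begin{pmatrix} R_{11} & R_{12} \end{pmatrix} = \begin{pmatrix} T^T & 0 \end{pmatrix} Z^T,
\qquad
A = Q \begin{pmatrix} T^T & 0 \\ 0 & 0 \end{pmatrix} (\Pi Z)^T = U L V^T
```

With $U = Q$, $V = \Pi Z$ and $C = T^T$ lower triangular, this has the ULV form. $V$ is orthogonal because it is the product of a permutation matrix and an orthogonal matrix.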