The QR Factorization How to Make Matrices Nicer Radu Trîmbiţaş Babeş-Bolyai University March 11, 2009 Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 1 / 25
Projectors A projector is a square matrix P that satisfies P 2 = P Not necessarily an orthogonal projector (more later) If v range(p) then Pv = v Since v = Px, Pv = P 2 x = Px = v Projection along the line Pv v null(p) Since P(Pv v) = P 2 v Pv = 0 Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 2 / 25
Complementary Projectors The matrix I P is the complementary projector to P I P projects on the nullspace of P: If Pv = 0, then (I P)v = v, so null(p) range(i P) But for any v, (I P)v = v Pv null(p), so range(i P) null(p) Therefore range(i P) = null(p) and null(i P) = range(p) Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 3 / 25
Complementary Subspaces For a projector P, null(i P) null(p) = {0} or range(p) null(p) = {0} A projector separates C m into two spaces S 1, S 2, with range(p) = S 1 and null(p) = S 2 P is the projector onto S 1 along S 2 Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 4 / 25
Orthogonal Projectors An orthogonal projector projects onto S 1 along S 2, with S 1, S 2 orthogonal Theorem A projector P is orthogonal P = P Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 5 / 25
Proof Proof. ( ) If P = P, then the inner product between a vector Px S 1 and a vector (I P)y S 2 is zero: x P (I P)y = x (P P 2 )y = 0. Thus the projector is orthogonal. ( )We shall use SVD. Suppose P projects onto S 1 along S 2, where S 1 S 2 and S 1 has dimension n. Then an SVD of P can be constructed as follows. Let {q 1, q 2,..., q m } be an orthonormal basis for C m, where {q 1,..., q n } is a basis for S 1 and {q n+1,..., q m } is a basis for S 2. For j n, we have Pq j = q j, and for j > n, we have Pq j = 0. Now let Q be the unitary matrix whose jth column is q j. Then we have PQ = q 1... q n 0..., Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 6 / 25
Proof - continued so that Q PQ = 1... 1 0... = Σ, a diagonal matrix with ones in the first n entries and zeros everywhere else. Thus we have constructed a singular value decomposition of P: P = QΣQ. (1) (note that this is also an eigenvalue decomposition.) Since P = (QΣQ ) = QΣ Q = QΣQ, it follows that P is hermitian. Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 7 / 25
Projection with Orthonormal Basis Reduced SVD gives projector for orthonormal columns ˆQ P = ˆQ ˆQ Complement I ˆQ ˆQ also orthogonal, projects onto space orthogonal to range( ˆQ) Special case 1: Rank-1 Orthogonal Projector (gives component in direction q) P q = qq Special case 2: Rank m - 1 Orthogonal Projector (eliminates component in direction q) P q = I qq Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 8 / 25
Projection with Arbitrary Basis Project v to y range(a). Then y v range(a), or a j (y v) = 0, j Set y = Ax: a j (Ax v) = 0, j A (Ax v) = 0 A Ax = A v A A is nonsingular, so x = (A A) 1 A v Finally, we are interested in the projection y = Ax = A(A A) 1 A v giving the orthogonal projector P = A(A A) 1 A Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 9 / 25
The QR Factorization -Main Idea Find orthonormal vectors that span the successive spaces spanned by the columns of A: This means that (for full rank A), a 1 a 1, a 2 a 1, a 2, a 3... q 1, q 2,..., q j = a 1, a 2,..., a j, for j = 1;... ; n Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 10 / 25
The QR Factorization -Matrix Form In matrix form, q 1, q 2,..., q j = a 1, a 2,..., a j becomes a 1 a 2... a n = q 1 q 2... q n or A = ˆQ ˆR This is the reduced QR factorization of A r 11 r 12... r 1n r 22... r 2n.... r nn Add orthogonal extension to ˆQ and add rows to ˆR to obtain the full QR factorization Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 11 / 25
Figure: Full and reduced QR factorization Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 12 / 25
Full and Reduced QR Factorization Let A be an m n matrix. The full QR factorization of A is the factorization A = QR, where Q is m m unitary and R is m n upper-triangular A more compact representation is the reduced QR factorization A = ˆQ ˆR, where (for m n) Q is m n and R is n n Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 13 / 25
Gram-Schmidt Orthogonalization Find new q j orthogonal to q 1,..., q j 1 by subtracting components along previous vectors v j = a j (q 1a j )q 1 (q 2a j )q 2 (q j 1a j )q j 1 Normalize to get q j = v j / v j We then obtain a reduced QR factorization A = ˆQ ˆR, with r ij = q i a j, (i = j) and rjj = a j j 1 i=1 r ij q i 2 Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 14 / 25
Classical Gram-Schmidt Straight-forward application of Gram-Schmidt orthogonalization Numerically unstable Algorithm: Classical Gram-Schmidt for j := 1 to n do v j := a j ; for i := 1 to j 1 do r ij := q i a j; v j := v j r ij q i ; r jj := v j 2 ; q j := v j /r jj ; Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 15 / 25
Existence and Uniqueness Every A C m n (m n) has a full QR factorization and a reduced QR factorization Proof. For full rank A, Gram-Schmidt proves existence of A = ˆQ ˆR. Otherwise, when v j = 0 choose arbitrary vector orthogonal to previous q i. For full QR, add orthogonal extension to Q and zero rows to R. Each A C m n (m n) of full rank has unique A = ˆQ ˆR with r jj > 0. Proof. Again Gram-Schmidt, r jj > 0 determines the sign. Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 16 / 25
Gram-Schmidt Projections The orthogonal vectors produced by Gram-Schmidt can be written in terms of projectors where q 1 = P 1a 1 P 1 a 1, q 2 = P 2a 2 P 2 a 2,..., q n = P na n P n a n P j = I ˆQ j 1 ˆQ j 1 with ˆQ j 1 = q 1 q 2 q j 1 Pj projects orthogonally onto the space orthogonal to q 1,..., q j 1, and rank(pj) = m (j 1) Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 17 / 25
The Modified Gram-Schmidt Algorithm The projection Pj can equivalently be written as P j = P qj 1... P q2 P q1 where P q = I qq P q projects orthogonally onto the space orthogonal to q, and rank(p q ) = m 1 The Classical Gram-Schmidt algorithm computes an orthogonal vector by v j = P j a j while the Modified Gram-Schmidt algorithm uses v j = P qj 1... P q2 P q1 a j Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 18 / 25
Classical vs. Modified Gram-Schmidt Small modification of classical G-S gives modified G-S (but see next slide) Modified G-S is numerically stable (less sensitive to rounding errors) Classical/Modified Gram-Schmidt for j := 1 to n do v j := a j ; for i := 1 to j 1 do r ij := qi a j; (CGS) r ij := qi v j; (MGS) v j := v j r ij q i ; r jj := v j 2 ; q j := v j /r jj ; Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 19 / 25
Implementation of Modified Gram-Schmidt In modified G-S, P qi can be applied to all v j as soon as q i is known Makes the inner loop iterations independent (like in classical G-S) CGS for j := 1 to n do v j := a j ; for i := 1 to j 1 do r ij := q i a j; v j := v j r ij q i ; r j j := v j 2 ; q j := v j /r jj ; MGS for i = 1 to n do v i := a i ; for i := 1 to n do r ii := v i 2 ; q j := v j /r jj ; for j := i + 1 to n do r ij := q i v j; v j := v j r ij q i ; Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 20 / 25
Operation Count Count number of floating points operations flops in an algorithm Each +, -, *, /, or counts as one flop No distinction between real and complex No consideration of memory accesses or other performance aspects Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 21 / 25
Operation Count -Modified G-S Example: Count all +, -, *, / in the Modified Gram-Schmidt algorithm (not just the leading term) 1 for i := 1 to n 2 v i := a i 3 for i := 1 to n 4 r ii := v i m *, m 1 + 5 q i := v i /r ii m / 6 for j := i + 1 to n 7 r ij := q i v j m *, m 1 + 8 v j := v j r ij q i m *, m - Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 22 / 25
Operation Count -Modified G-S The total for each operation is ( #+ = # = # = n i=1 m 1 + n j=i+1 (m 1) ) n(n 1)(m 1) = n(m 1) + 2 n i=1 n i=1 n j=i+1 ( m + m = n j=i+1 = n(m 1) + n i=1 = 1 n(n + 1)(m 1) 2 n m(n i) = 1 mn(n 1) i=1 2 ) 2m 2mn(n 1) = mn + 2 n #/ = m = mn i=1 = mn + = mn 2 n i=1 2m(n i) = (m 1)(n i) = Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 23 / 25
Operation Count -Modified G-S and the total flop count is 1 2 n(n + 1)(m 1) + 1 2 mn(n 1) + mn2 + mn = 2mn 2 + mn 1 2 n2 1 2 n 2mn2 The symbol indicates asymptotic value as m, n (leading term) Easier to find just the leading term: Most work done in lines (7) and (8), with 4m flops per iteration Including the loops, the total becomes n n n n 4m = 4m (n i) 4m i = 2mn 2 i=1 j=i+1 i=1 i=1 Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 24 / 25
Figure: Jórgen Pedersen Gram (1850-1916) Figure: Erhart Schmidt (1876-1959) Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 25 / 25