Lecture 4: Orthonormal vectors and QR factorization
EE263 Autumn 2004

- orthonormal vectors
- Gram-Schmidt procedure, QR factorization
- orthogonal decomposition induced by a matrix
Orthonormal set of vectors

a set of vectors u_1, ..., u_k ∈ R^n is

- normalized if ||u_i|| = 1, i = 1, ..., k (the u_i are called unit vectors or direction vectors)
- orthogonal if u_i ⊥ u_j for i ≠ j
- orthonormal if both

in terms of the matrix U = [u_1 ... u_k], orthonormal means U^T U = I_k

orthonormal vectors are independent (multiply α_1 u_1 + α_2 u_2 + ... + α_k u_k = 0 by u_i^T)

hence u_1, ..., u_k is an orthonormal basis for span(u_1, ..., u_k) = R(U)
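a quick numerical check of the characterization U^T U = I_k (a NumPy sketch; the orthonormal columns here are generated from a random matrix purely for illustration):

```python
import numpy as np

# build an orthonormal set u_1, ..., u_k in R^n
# (here obtained from the QR factorization of a random matrix)
rng = np.random.default_rng(0)
n, k = 5, 3
U, _ = np.linalg.qr(rng.standard_normal((n, k)))  # columns of U are orthonormal

# U^T U = I_k characterizes orthonormal columns
assert np.allclose(U.T @ U, np.eye(k))

# orthonormal vectors are independent, so U has full column rank
assert np.linalg.matrix_rank(U) == k
```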
Geometric properties

suppose the columns of U = [u_1 ... u_k] are orthonormal

if w = Uz, then ||w|| = ||z||

- multiplication by U does not change norm
- the mapping w = Uz is isometric: it preserves distances

simple derivation using matrices:

  ||w||^2 = ||Uz||^2 = (Uz)^T (Uz) = z^T U^T U z = z^T z = ||z||^2

inner products are also preserved: ⟨Uz, U z̃⟩ = ⟨z, z̃⟩

if w = Uz and w̃ = U z̃ then

  ⟨w, w̃⟩ = ⟨Uz, U z̃⟩ = (Uz)^T (U z̃) = z^T U^T U z̃ = ⟨z, z̃⟩

norms and inner products preserved, so angles are preserved: ∠(Uz, U z̃) = ∠(z, z̃)

i.e., multiplication by U preserves inner products, angles, and distances
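these three preservation properties can be verified numerically (a NumPy sketch with randomly generated U, z, z̃ for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.standard_normal((6, 3)))  # orthonormal columns
z = rng.standard_normal(3)
zt = rng.standard_normal(3)

w, wt = U @ z, U @ zt
assert np.isclose(np.linalg.norm(w), np.linalg.norm(z))  # norms preserved
assert np.isclose(w @ wt, z @ zt)                        # inner products preserved

# hence angles are preserved as well
ang = lambda a, b: np.arccos(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
assert np.isclose(ang(w, wt), ang(z, zt))
```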
Orthonormal basis for R^n

suppose u_1, ..., u_n is an orthonormal basis for R^n

then U = [u_1 ... u_n] is called orthogonal: it is square and satisfies U^T U = I

(you'd think such matrices would be called orthonormal, not orthogonal ...)

it follows that U^{-1} = U^T, and hence also U U^T = I, i.e.,

  ∑_{i=1}^n u_i u_i^T = I
Expansion in orthonormal basis

suppose U is orthogonal, so x = U U^T x, i.e.,

  x = ∑_{i=1}^n (u_i^T x) u_i

- u_i^T x is called the component of x in the direction u_i
- a = U^T x resolves x into the vector of its u_i components
- x = Ua reconstitutes x from its u_i components
- x = Ua = ∑_{i=1}^n a_i u_i is called the (u_i-) expansion of x

the identity I = U U^T = ∑_{i=1}^n u_i u_i^T is sometimes written as

  I = ∑_{i=1}^n |u_i⟩⟨u_i|

since x = ∑_{i=1}^n |u_i⟩⟨u_i|x⟩ (but we won't use this notation)
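resolve/reconstitute in code (a NumPy sketch; the orthogonal U and the vector x are randomly generated for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
U, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # orthogonal (square) U
x = rng.standard_normal(4)

a = U.T @ x        # resolve: a_i = u_i^T x, the u_i components of x
x_rec = U @ a      # reconstitute: x = sum_i a_i u_i
assert np.allclose(x_rec, x)

# equivalently, I = sum_i u_i u_i^T
I_sum = sum(np.outer(U[:, i], U[:, i]) for i in range(4))
assert np.allclose(I_sum, np.eye(4))
```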
Geometric interpretation

if U is orthogonal, then the transformation w = Uz

- preserves norms of vectors, i.e., ||Uz|| = ||z||
- preserves angles between vectors, i.e., ∠(Uz, U z̃) = ∠(z, z̃)

examples:

- rotations (about some axis)
- reflections (through some plane)

in fact, in R^2 the converse also holds: if U is orthogonal, then the mapping w = Uz is either a rotation or a reflection (in higher dimensions, a composition of rotations and reflections)
Example: rotation and reflection in R^2

rotation by θ is given by

  y = U_θ x,   U_θ = [ cos θ  -sin θ ]
                     [ sin θ   cos θ ]

(since e_1 → [cos θ  sin θ]^T, e_2 → [-sin θ  cos θ]^T)

reflection across the line x_2 = x_1 tan(θ/2) is given by

  y = R_θ x,   R_θ = [ cos θ   sin θ ]
                     [ sin θ  -cos θ ]

(since e_1 → [cos θ  sin θ]^T, e_2 → [sin θ  -cos θ]^T)

(figure: the images of the unit vectors e_1, e_2 under rotation and under reflection by angle θ in the (x_1, x_2) plane)

can check that U_θ and R_θ are orthogonal
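the "can check" step, done numerically (a NumPy sketch; it also checks the determinants, which distinguish rotations from reflections):

```python
import numpy as np

def U_rot(th):
    # rotation by th
    return np.array([[np.cos(th), -np.sin(th)],
                     [np.sin(th),  np.cos(th)]])

def R_ref(th):
    # reflection across the line x2 = x1 * tan(th/2)
    return np.array([[np.cos(th),  np.sin(th)],
                     [np.sin(th), -np.cos(th)]])

th = 0.7
for M in (U_rot(th), R_ref(th)):
    assert np.allclose(M.T @ M, np.eye(2))         # both are orthogonal
assert np.isclose(np.linalg.det(U_rot(th)), 1.0)   # rotation: det = +1
assert np.isclose(np.linalg.det(R_ref(th)), -1.0)  # reflection: det = -1
```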
Gram-Schmidt procedure

given independent vectors a_1, ..., a_k ∈ R^n, G-S finds orthonormal vectors q_1, ..., q_k s.t.

  span(a_1, ..., a_r) = span(q_1, ..., q_r)  for r ≤ k

one method: the q_i are given recursively as

- q̃_1 := a_1,  q_1 := q̃_1 / ||q̃_1|| (normalize)
- q̃_2 := a_2 - (q_1^T a_2) q_1 (remove q_1 component from a_2),  q_2 := q̃_2 / ||q̃_2|| (normalize)
- q̃_3 := a_3 - (q_1^T a_3) q_1 - (q_2^T a_3) q_2 (remove q_1, q_2 components from a_3),  q_3 := q̃_3 / ||q̃_3|| (normalize)
- etc.

(figure: a_2 decomposed into its component (q_1^T a_2) q_1 along q_1 = a_1/||a_1|| and the orthogonal remainder q̃_2 = a_2 - (q_1^T a_2) q_1)
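the recursion above, written out directly (a NumPy sketch of classical Gram-Schmidt, applied to a random matrix for illustration; as noted on the next slide, this is not how QR is computed in practice):

```python
import numpy as np

def gram_schmidt(A):
    """Classical Gram-Schmidt on the (independent) columns of A."""
    n, k = A.shape
    Q = np.zeros((n, k))
    for i in range(k):
        qt = A[:, i].copy()
        for j in range(i):                     # remove q_1, ..., q_{i-1} components
            qt -= (Q[:, j] @ A[:, i]) * Q[:, j]
        Q[:, i] = qt / np.linalg.norm(qt)      # normalize
    return Q

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3))                # independent columns (almost surely)
Q = gram_schmidt(A)
assert np.allclose(Q.T @ Q, np.eye(3))         # q_i are orthonormal
```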
for i = 1, 2, ..., k we have

  a_i = r_{1i} q_1 + r_{2i} q_2 + ... + r_{ii} q_i

(the r_{ij} are easily obtained from the procedure) with r_{ii} ≠ 0 (if not, a_i can be written as a linear combination of a_1, ..., a_{i-1})

written in matrix form: A = QR, where A = [a_1 a_2 ... a_k], Q = [q_1 q_2 ... q_k], and

  R = [ r_11  r_12  ...  r_1k ]
      [  0    r_22  ...  r_2k ]
      [  .     .    ...   .   ]
      [  0     0    ...  r_kk ]

- Q^T Q = I_k and R is upper triangular, invertible (full rank, det R = ∏_{i=1}^k r_ii ≠ 0)
- called the QR decomposition of A
- never computed using Gram-Schmidt (due to propagation of numerical error)
- columns of Q are an orthonormal basis for R(A)
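the factorization and its properties, checked with a library routine (a NumPy sketch; `np.linalg.qr` uses Householder reflections, which is the numerically stable alternative to Gram-Schmidt alluded to above):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 4))           # independent columns (almost surely)

Q, R = np.linalg.qr(A)                    # Householder-based QR, not Gram-Schmidt
assert np.allclose(A, Q @ R)              # A = QR
assert np.allclose(Q.T @ Q, np.eye(4))    # Q^T Q = I_k
assert np.allclose(R, np.triu(R))         # R is upper triangular
assert np.all(np.abs(np.diag(R)) > 0)     # r_ii != 0, so R is invertible
```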
General Gram-Schmidt procedure

in the procedure above we assumed a_1, ..., a_k are independent ...

if a_1, ..., a_k are dependent, we find q̃_j = 0 for some j, which means a_j is linearly dependent on a_1, ..., a_{j-1}

modified algorithm: when we encounter q̃_j = 0, skip to the next vector a_{j+1} and continue:

  r = 0;
  for i = 1, ..., k {
      ã = a_i - ∑_{j=1}^r q_j q_j^T a_i;
      if ã ≠ 0 { r = r + 1; q_r = ã/||ã||; }
  }

on exit:

- r = Rank(A)
- q_1, ..., q_r is an orthonormal basis for R(A)
- each a_i is a linear combination of previously generated q_j's
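the modified algorithm in code (a NumPy sketch; in floating point the test "ã ≠ 0" becomes a tolerance check, and the hypothetical `tol` threshold here is an illustrative choice):

```python
import numpy as np

def general_gram_schmidt(A, tol=1e-10):
    """General G-S: skips dependent columns; returns Q with Rank(A) columns."""
    qs = []
    for i in range(A.shape[1]):
        at = A[:, i] - sum((q @ A[:, i]) * q for q in qs)  # remove q_j components
        if np.linalg.norm(at) > tol:                       # "a-tilde != 0"
            qs.append(at / np.linalg.norm(at))             # else: skip this a_i
    return np.column_stack(qs)

rng = np.random.default_rng(5)
B = rng.standard_normal((5, 2))
A = np.column_stack([B[:, 0], B[:, 1], B @ [1.0, -2.0]])   # 3rd column dependent
Q = general_gram_schmidt(A)
assert Q.shape[1] == 2                     # on exit, r = Rank(A) = 2
assert np.allclose(Q.T @ Q, np.eye(2))     # q_1, ..., q_r orthonormal
```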
in matrix notation we have A = QR with Q^T Q = I_r and R ∈ R^{r×k} in upper staircase form:

(figure: staircase pattern in R; entries below the staircase are zero, entries above are possibly nonzero, and the corner entries of the staircase are nonzero)

- entries below the staircase are zero
- corner entries are nonzero
- hence R is full rank (r)

can permute the columns with corner entries to the front of the matrix:

  A = Q [R̃ S] P

where

- Q^T Q = I_r
- R̃ ∈ R^{r×r} is upper triangular and invertible
- P ∈ R^{k×k} is a permutation matrix
Applications

- directly yields an orthonormal basis for R(A)
- yields the factorization A = BC with B ∈ R^{n×r}, C ∈ R^{r×k}, r = Rank(A)
- to check if b ∈ span(a_1, ..., a_k): apply Gram-Schmidt to [a_1 ... a_k b]
- more generally, the staircase pattern in R shows which columns of A are dependent on previous ones
- works incrementally: one G-S procedure yields QR factorizations of [a_1 ... a_p] for p = 1, ..., k:

    [a_1 ... a_p] = [q_1 ... q_s] R_p

  where s = Rank([a_1 ... a_p]) and R_p is the leading s × p submatrix of R
Full QR factorization

with A = Q_1 R_1 the QR factorization as above, write

  A = [Q_1 Q_2] [ R_1 ]
                [  0  ]

where [Q_1 Q_2] is orthogonal, i.e., the columns of Q_2 ∈ R^{n×(n-r)} are orthonormal and orthogonal to Q_1

to find Q_2:

- find any matrix Ã s.t. [A Ã] is full rank (e.g., Ã = I)
- apply general Gram-Schmidt to [A Ã]
- Q_1 are the orthonormal vectors obtained from the columns of A
- Q_2 are the orthonormal vectors obtained from the extra columns (Ã)

i.e., any set of orthonormal vectors can be extended to an orthonormal basis for R^n

R(Q_1) and R(Q_2) are called complementary subspaces since

- they are orthogonal
- together they span R^n
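the full factorization in code (a NumPy sketch; `np.linalg.qr` with `mode="complete"` returns the square [Q_1 Q_2] and the padded [R_1; 0] directly, for a random full-column-rank A):

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((6, 3))           # full column rank (almost surely), r = 3

Qfull, Rfull = np.linalg.qr(A, mode="complete")
Q1, Q2 = Qfull[:, :3], Qfull[:, 3:]       # Q1: n x r, Q2: n x (n - r)

assert np.allclose(Qfull.T @ Qfull, np.eye(6))  # [Q1 Q2] is orthogonal
assert np.allclose(Rfull[3:, :], 0)             # bottom block of R is zero
assert np.allclose(Q1.T @ Q2, 0)                # Q2 is orthogonal to Q1
assert np.allclose(A, Q1 @ Rfull[:3, :])        # A = Q1 R1
```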
Orthogonal decomposition induced by A

interpretation of Q_2: from the full QR factorization,

  A^T = [ R_1^T  0 ] [ Q_1^T ]
                     [ Q_2^T ]

so

  z ∈ N(A^T)  ⟺  Q_1^T z = 0  ⟺  z ∈ R(Q_2)

i.e., the columns of Q_2 are an orthonormal basis for N(A^T)

(recall the columns of Q_1 are an orthonormal basis for R(A))

we conclude: R(A) and N(A^T) are complementary subspaces:

- they are orthogonal: R(A) ⊥ N(A^T)
- together they span R^n: sometimes this is written R(A) ⊕ N(A^T) = R^n
- every y ∈ R^n can be written uniquely as y = z + w, with z ∈ R(A), w ∈ N(A^T)
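the decomposition y = z + w, computed from Q_1 and Q_2 (a NumPy sketch with a random A and y for illustration):

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((5, 2))            # r = Rank(A) = 2 (almost surely)
Qfull, _ = np.linalg.qr(A, mode="complete")
Q1, Q2 = Qfull[:, :2], Qfull[:, 2:]        # bases for R(A) and N(A^T)

assert np.allclose(A.T @ Q2, 0)            # columns of Q2 lie in N(A^T)

# unique decomposition y = z + w with z in R(A), w in N(A^T)
y = rng.standard_normal(5)
z = Q1 @ (Q1.T @ y)                        # projection of y onto R(A)
w = Q2 @ (Q2.T @ y)                        # projection of y onto N(A^T)
assert np.allclose(z + w, y)               # together they give back y
assert np.isclose(z @ w, 0)                # and they are orthogonal
```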