Math 416, Spring 2010                                                       March 4, 2010

GRAM-SCHMIDT, THE QR-FACTORIZATION, ORTHOGONAL MATRICES

1 Recap

Yesterday we talked about several new, important concepts. The central theme connecting the ideas was orthogonality; more specifically, we covered

- orthogonal complements;
- a big theorem that let us write a vector as a sum of two vectors, one from a given vector space and the other from its orthogonal complement: for $V$ and $\vec{x}$ there is a unique representation $\vec{x} = \vec{x}^{\parallel} + \vec{x}^{\perp}$ with $\vec{x}^{\parallel} \in V$ and $\vec{x}^{\perp} \in V^{\perp}$;
- a formula for computing $\vec{x}^{\parallel}$ using dot products (once we have an orthonormal basis);
- formally defining the map $\operatorname{proj}_V$ by $\operatorname{proj}_V(\vec{x}) = \vec{x}^{\parallel}$; and
- writing down a matrix which gives the transformation $\operatorname{proj}_V$ (this means that $\operatorname{proj}_V$ is a linear transformation): specifically, we said that if $\vec{u}_1, \ldots, \vec{u}_s$ is an orthonormal basis for $V$, then
  \[
  \operatorname{proj}_V(\vec{x}) = \begin{pmatrix} \vec{u}_1 & \cdots & \vec{u}_s \end{pmatrix} \begin{pmatrix} \vec{u}_1 & \cdots & \vec{u}_s \end{pmatrix}^T \vec{x}
  \]
  (a code sketch of this formula appears after the first example below).

2 Making orthonormal bases: The Gram-Schmidt Algorithm

We saw yesterday that orthonormal bases are handy to use; for instance, they make writing the formula for $\operatorname{proj}_V$ simple. Our goal today is to produce orthonormal bases. More specifically, if someone gives us a basis $\vec{v}_1, \ldots, \vec{v}_m$ of a space $V$, we want to manufacture an orthonormal basis for $V$ from it.

Example. Given the linearly independent collection $\{\vec{v}_1\}$, construct an orthonormal basis for $\operatorname{Span}(\vec{v}_1)$.

Solution. First notice that $\operatorname{Span}(\vec{v}_1)$ is 1-dimensional, since $\{\vec{v}_1\}$ is a basis. Since we are looking for an orthonormal basis of a 1-dimensional space, we only have to find one vector, and orthonormality just means it should have unit length. So let's just scale the vector we've been given by its magnitude: our orthonormal basis for $\operatorname{Span}(\vec{v}_1)$ is the vector
\[
\vec{u}_1 = \frac{\vec{v}_1}{\|\vec{v}_1\|}.
\]
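To see the projection matrix from the recap in action, here is a minimal numerical sketch, assuming NumPy is available; the particular basis vectors and the test vector $\vec{x}$ are illustrative choices of mine, not data from the lecture.

```python
import numpy as np

# Illustrative orthonormal basis u1, u2 for a plane V inside R^3.
# (These particular vectors are assumptions made for this sketch.)
u1 = np.array([1.0, 0.0, 0.0])
u2 = np.array([0.0, 1.0, 0.0])

Q = np.column_stack([u1, u2])   # columns are the orthonormal basis of V
P = Q @ Q.T                     # the matrix representing proj_V

x = np.array([3.0, 4.0, 5.0])
x_par = P @ x                   # x^parallel, the part of x lying in V
x_perp = x - x_par              # x^perp, the part lying in V^perp

print(x_par)                                   # [3. 4. 0.]
print(np.dot(x_perp, u1), np.dot(x_perp, u2))  # both 0.0, as the theorem predicts
```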
Example. Given the linearly independent collection $\{\vec{v}_1, \vec{v}_2\}$, construct an orthonormal basis for $\operatorname{Span}(\vec{v}_1, \vec{v}_2)$.

Solution. Since we are working in a 2-dimensional space, we'll have to do a little more work than last time. We'll begin in the same way, by defining the first vector in our orthonormal basis by scaling the first vector in the given basis:
\[
\vec{u}_1 = \frac{\vec{v}_1}{\|\vec{v}_1\|}.
\]
The second vector in our orthonormal basis must be orthogonal to $\vec{u}_1$ and have unit length. To find a vector orthogonal to $\vec{u}_1$, we will use the decomposition $\vec{v}_2 = \vec{v}_2^{\parallel} + \vec{v}_2^{\perp}$, where the decomposition is relative to $\operatorname{Span}(\vec{u}_1)$. The vector $\vec{v}_2^{\perp}$ is orthogonal to $\vec{u}_1$ like we want, but it isn't quite unit length. If we scale it appropriately, though, it will be:
\[
\vec{u}_2 = \frac{\vec{v}_2^{\perp}}{\|\vec{v}_2^{\perp}\|} = \frac{\vec{v}_2 - \vec{v}_2^{\parallel}}{\|\vec{v}_2 - \vec{v}_2^{\parallel}\|} = \frac{\vec{v}_2 - (\vec{u}_1 \cdot \vec{v}_2)\,\vec{u}_1}{\|\vec{v}_2 - (\vec{u}_1 \cdot \vec{v}_2)\,\vec{u}_1\|}.
\]
The second equality I wrote follows from how we constructed $\vec{v}_2^{\perp}$ as the difference $\vec{v}_2 - \vec{v}_2^{\parallel}$. The third equality comes from expressing $\vec{v}_2^{\parallel}$ as a dot product in terms of the orthonormal basis $\{\vec{u}_1\}$ of $\operatorname{Span}(\vec{u}_1) = \operatorname{Span}(\vec{v}_1)$.

Example. Given the linearly independent collection $\{\vec{v}_1, \vec{v}_2, \vec{v}_3\}$, construct an orthonormal basis for $\operatorname{Span}(\vec{v}_1, \vec{v}_2, \vec{v}_3)$.

Solution. We're now in a 3-dimensional space, so we get to do a little more work. We'll begin as before, by defining the first vector in our orthonormal basis by scaling the first vector in the given basis:
\[
\vec{u}_1 = \frac{\vec{v}_1}{\|\vec{v}_1\|}.
\]
Also as before, we'll construct the second vector by writing $\vec{v}_2 = \vec{v}_2^{\parallel} + \vec{v}_2^{\perp}$ (the decomposition relative to $\operatorname{Span}(\vec{u}_1)$). Then we define
\[
\vec{u}_2 = \frac{\vec{v}_2^{\perp}}{\|\vec{v}_2^{\perp}\|} = \frac{\vec{v}_2 - \vec{v}_2^{\parallel}}{\|\vec{v}_2 - \vec{v}_2^{\parallel}\|} = \frac{\vec{v}_2 - (\vec{u}_1 \cdot \vec{v}_2)\,\vec{u}_1}{\|\vec{v}_2 - (\vec{u}_1 \cdot \vec{v}_2)\,\vec{u}_1\|}.
\]
Finally, for the third vector we need to find a vector orthogonal to $\operatorname{Span}(\vec{u}_1, \vec{u}_2) = \operatorname{Span}(\vec{v}_1, \vec{v}_2)$. For this we will write $\vec{v}_3 = \vec{v}_3^{\parallel} + \vec{v}_3^{\perp}$, the decomposition relative to the space $\operatorname{Span}(\vec{u}_1, \vec{u}_2) = \operatorname{Span}(\vec{v}_1, \vec{v}_2)$. Then we'll define
\[
\vec{u}_3 = \frac{\vec{v}_3^{\perp}}{\|\vec{v}_3^{\perp}\|} = \frac{\vec{v}_3 - \vec{v}_3^{\parallel}}{\|\vec{v}_3 - \vec{v}_3^{\parallel}\|} = \frac{\vec{v}_3 - (\vec{u}_1 \cdot \vec{v}_3)\,\vec{u}_1 - (\vec{u}_2 \cdot \vec{v}_3)\,\vec{u}_2}{\|\vec{v}_3 - (\vec{u}_1 \cdot \vec{v}_3)\,\vec{u}_1 - (\vec{u}_2 \cdot \vec{v}_3)\,\vec{u}_2\|}.
\]
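The pattern in these examples is the same at every step: subtract off the projection onto the span of the vectors already produced, then normalize what remains. Here is a minimal sketch of that procedure, assuming NumPy; the function name gram_schmidt and the three input vectors are my own illustrative choices.

```python
import numpy as np

def gram_schmidt(vectors):
    """Given linearly independent vectors, return an orthonormal basis
    for their span, following the procedure in the examples above."""
    us = []
    for v in vectors:
        # Subtract the projection of v onto Span(u_1, ..., u_{i-1});
        # what remains is v^perp.
        v_perp = v - sum(np.dot(u, v) * u for u in us)
        us.append(v_perp / np.linalg.norm(v_perp))  # scale to unit length
    return us

# A quick check on three illustrative vectors in R^3.
vs = [np.array([1.0, 1.0, 0.0]),
      np.array([0.0, 1.0, 1.0]),
      np.array([1.0, 0.0, 1.0])]
us = gram_schmidt(vs)
for i in range(len(us)):
    for j in range(i):
        print(np.dot(us[j], us[i]))  # ~0: the u's are pairwise orthogonal
    print(np.linalg.norm(us[i]))     # 1.0: each u has unit length
```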
Example. Let's compute a specific example. We'll try to construct an orthonormal basis for the span of two given vectors $\vec{v}_1$ and $\vec{v}_2$ in $\mathbb{R}^2$. From above we write
\[
\vec{u}_1 = \frac{\vec{v}_1}{\|\vec{v}_1\|}.
\]
For the second vector, we first compute
\[
\vec{v}_2^{\perp} = \vec{v}_2 - (\vec{u}_1 \cdot \vec{v}_2)\,\vec{u}_1,
\]
and then
\[
\vec{u}_2 = \frac{\text{the above junk}}{\|\text{above junk}\|} = \frac{\vec{v}_2 - (\vec{u}_1 \cdot \vec{v}_2)\,\vec{u}_1}{\|\vec{v}_2 - (\vec{u}_1 \cdot \vec{v}_2)\,\vec{u}_1\|}.
\]
The process we have outlined above can be followed for any number of initial vectors. In fact, by going through the cases we have, we should be able to see how the algorithm works generally. The process we have followed is called the Gram-Schmidt algorithm.

Theorem 2.1 (Gram-Schmidt Algorithm). Suppose we are given a collection $\{\vec{v}_1, \ldots, \vec{v}_s\}$ which is linearly independent. Then if we write $\vec{v}_i = \vec{v}_i^{\parallel} + \vec{v}_i^{\perp}$ (the decomposition relative to the space $\operatorname{Span}(\vec{v}_1, \ldots, \vec{v}_{i-1})$), the collection of vectors $\{\vec{u}_1, \ldots, \vec{u}_s\}$ defined by
\[
\vec{u}_i = \frac{\vec{v}_i^{\perp}}{\|\vec{v}_i^{\perp}\|}
\]
is an orthonormal basis for $\operatorname{Span}(\vec{v}_1, \ldots, \vec{v}_s)$. We can calculate the $\vec{u}_i$ iteratively as
\[
\vec{u}_1 = \frac{\vec{v}_1}{\|\vec{v}_1\|}
\quad\text{and}\quad
\vec{u}_i = \frac{\vec{v}_i - (\vec{u}_1 \cdot \vec{v}_i)\,\vec{u}_1 - \cdots - (\vec{u}_{i-1} \cdot \vec{v}_i)\,\vec{u}_{i-1}}{\|\vec{v}_i - (\vec{u}_1 \cdot \vec{v}_i)\,\vec{u}_1 - \cdots - (\vec{u}_{i-1} \cdot \vec{v}_i)\,\vec{u}_{i-1}\|}.
\]

The algorithm actually provides a factorization of the matrix whose columns are the initial linearly independent vectors. This factorization is called the QR factorization of the matrix.

Theorem 2.2 (QR factorization). Suppose that $A$ is a matrix whose columns $\vec{v}_1, \ldots, \vec{v}_s$ are linearly independent. Then $A = QR$, where $Q$ is the matrix
\[
Q = \begin{pmatrix} \vec{u}_1 & \cdots & \vec{u}_s \end{pmatrix}
\]
and $R$ is the matrix
\[
R = \begin{pmatrix}
\|\vec{v}_1^{\perp}\| & \vec{u}_1 \cdot \vec{v}_2 & \vec{u}_1 \cdot \vec{v}_3 & \cdots & \vec{u}_1 \cdot \vec{v}_s \\
0 & \|\vec{v}_2^{\perp}\| & \vec{u}_2 \cdot \vec{v}_3 & \cdots & \vec{u}_2 \cdot \vec{v}_s \\
0 & 0 & \|\vec{v}_3^{\perp}\| & \cdots & \vec{u}_3 \cdot \vec{v}_s \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & \|\vec{v}_s^{\perp}\|
\end{pmatrix}
= \begin{pmatrix}
\vec{u}_1 \cdot \vec{v}_1 & \vec{u}_1 \cdot \vec{v}_2 & \vec{u}_1 \cdot \vec{v}_3 & \cdots & \vec{u}_1 \cdot \vec{v}_s \\
\vec{u}_2 \cdot \vec{v}_1 & \vec{u}_2 \cdot \vec{v}_2 & \vec{u}_2 \cdot \vec{v}_3 & \cdots & \vec{u}_2 \cdot \vec{v}_s \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\vec{u}_s \cdot \vec{v}_1 & \vec{u}_s \cdot \vec{v}_2 & \vec{u}_s \cdot \vec{v}_3 & \cdots & \vec{u}_s \cdot \vec{v}_s
\end{pmatrix}.
\]

Proof. The fact that we can write the matrix $R$ in two ways just comes from the facts that $\vec{u}_i \cdot \vec{v}_i = \|\vec{v}_i^{\perp}\|$ (indeed, $\vec{u}_i \cdot \vec{v}_i = \vec{u}_i \cdot (\vec{v}_i^{\parallel} + \vec{v}_i^{\perp}) = \vec{u}_i \cdot \vec{v}_i^{\perp} = \|\vec{v}_i^{\perp}\|$, since $\vec{u}_i \perp \vec{v}_i^{\parallel}$) and that $\vec{u}_i \cdot \vec{v}_j = 0$ when $i > j$ (since $\vec{u}_i$ is orthogonal to $\operatorname{Span}(\vec{v}_1, \ldots, \vec{v}_{i-1})$ by construction).

Using the first expression for the matrix $R$, we'll check that the matrices on the left and right hand sides are the same column by column. For this, note that the $i$th column of the product (i.e., of the right hand side) is just
\[
\begin{pmatrix} \vec{u}_1 & \cdots & \vec{u}_s \end{pmatrix}
\begin{pmatrix} \vec{u}_1 \cdot \vec{v}_i \\ \vdots \\ \vec{u}_{i-1} \cdot \vec{v}_i \\ \|\vec{v}_i^{\perp}\| \\ 0 \\ \vdots \\ 0 \end{pmatrix}
= \underbrace{(\vec{u}_1 \cdot \vec{v}_i)\,\vec{u}_1 + \cdots + (\vec{u}_{i-1} \cdot \vec{v}_i)\,\vec{u}_{i-1}}_{\vec{v}_i^{\parallel}} + \underbrace{\|\vec{v}_i^{\perp}\|\,\vec{u}_i}_{\vec{v}_i^{\perp}} = \vec{v}_i.
\]
But this is just the $i$th column of the left hand side, so our matrices must be equal. □
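As a sanity check on the factorization, the sketch below builds $Q$ and $R$ exactly as in the theorem and confirms $A = QR$, again assuming NumPy; the input matrix is an illustrative choice of mine. (NumPy's np.linalg.qr computes the same kind of factorization, though its sign conventions on the columns of $Q$ may differ.)

```python
import numpy as np

# Illustrative matrix with linearly independent columns (my choice).
A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

# Run Gram-Schmidt on the columns of A, recording R as we go.
n = A.shape[1]
Q = np.zeros_like(A)
R = np.zeros((n, n))
for i in range(n):
    v = A[:, i]
    for j in range(i):
        R[j, i] = np.dot(Q[:, j], v)     # entry u_j . v_i of R
    v_perp = v - Q[:, :i] @ R[:i, i]     # subtract projection onto earlier u's
    R[i, i] = np.linalg.norm(v_perp)     # diagonal entry ||v_i^perp||
    Q[:, i] = v_perp / R[i, i]           # normalize to get u_i

print(np.allclose(A, Q @ R))             # True
print(np.allclose(Q.T @ Q, np.eye(n)))   # True: columns of Q are orthonormal
```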
3 Orthogonal Transformations and Matrices

Definition 3.1. A linear transformation $T : \mathbb{R}^n \to \mathbb{R}^n$ is called orthogonal if it preserves the length of vectors: $\|T(\vec{x})\| = \|\vec{x}\|$ for all $\vec{x} \in \mathbb{R}^n$. A matrix is called orthogonal if it corresponds to an orthogonal linear transformation.

Notice that we have used the adjective "orthogonal" before in the context of a pair of vectors. Saying that a matrix is orthogonal, though, means something different, so be careful to keep these two concepts separate in your mind!

Example. Any rotation in $\mathbb{R}^2$ is an orthogonal transformation, since
\[
\left\| \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \right\|^2
= (\cos\theta\, x_1 - \sin\theta\, x_2)^2 + (\sin\theta\, x_1 + \cos\theta\, x_2)^2
\]
\[
= \cos^2\theta\, x_1^2 - 2\sin\theta\cos\theta\, x_1 x_2 + \sin^2\theta\, x_2^2 + \sin^2\theta\, x_1^2 + 2\sin\theta\cos\theta\, x_1 x_2 + \cos^2\theta\, x_2^2
= x_1^2 + x_2^2 = \left\| \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \right\|^2.
\]
This isn't surprising, given our intuition of what rotations do to vectors in $\mathbb{R}^2$.

Example. If $V$ is a subspace of $\mathbb{R}^n$, then we define the reflection through $V$ by
\[
\operatorname{Ref}_V(\vec{x}) = \vec{x}^{\parallel} - \vec{x}^{\perp}.
\]
This is orthogonal because
\[
\|\operatorname{Ref}_V(\vec{x})\|^2 = \|\vec{x}^{\parallel} - \vec{x}^{\perp}\|^2 = \|\vec{x}^{\parallel}\|^2 + \|\vec{x}^{\perp}\|^2 = \|\vec{x}^{\parallel} + \vec{x}^{\perp}\|^2 = \|\vec{x}\|^2
\]
(the middle equalities use the Pythagorean theorem, which applies because $\vec{x}^{\parallel}$ is orthogonal to $\pm\vec{x}^{\perp}$).

Example. If $V$ is a subspace of $\mathbb{R}^n$ that isn't all of $\mathbb{R}^n$, then $\operatorname{proj}_V$ is not an orthogonal transformation. To see why this is true, choose a nonzero vector $\vec{x} \in V^{\perp}$ (such a vector exists since $V \neq \mathbb{R}^n$). Then
\[
\|\operatorname{proj}_V(\vec{x})\| = \|\vec{0}\| = 0 \neq \|\vec{x}\|.
\]
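Here is a quick numerical illustration of the first two examples, assuming NumPy; the angle, the subspace, and the test vector are arbitrary choices of mine. The reflection matrix uses the identity $\operatorname{Ref}_V(\vec{x}) = \vec{x}^{\parallel} - \vec{x}^{\perp} = 2\operatorname{proj}_V(\vec{x}) - \vec{x}$, which follows from $\vec{x}^{\perp} = \vec{x} - \vec{x}^{\parallel}$.

```python
import numpy as np

theta = 0.7                        # an arbitrary rotation angle
Rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])

x = np.array([3.0, -1.0])          # an arbitrary test vector
print(np.linalg.norm(Rot @ x), np.linalg.norm(x))  # equal: rotation preserves length

# Reflection through the line V = Span((1, 1)) in R^2, built from
# Ref_V(x) = 2 proj_V(x) - x.
u = np.array([1.0, 1.0]) / np.sqrt(2.0)   # orthonormal basis of V
P = np.outer(u, u)                        # projection matrix onto V
Ref = 2 * P - np.eye(2)                   # reflection matrix

print(np.linalg.norm(Ref @ x), np.linalg.norm(x))  # equal: reflection preserves length
```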
This example points out an important feature of orthogonal matrices: they must have trivial kernel. This gives the following result.

Theorem 3.1. An orthogonal matrix is invertible.

Proof. Since an orthogonal matrix is automatically square, we only have to check that $\ker(A) = \{\vec{0}\}$. So choose $\vec{b} \in \ker(A)$; we want to show $\vec{b} = \vec{0}$. Using the definition of the kernel we have $A\vec{b} = \vec{0}$, so that $\|A\vec{b}\| = \|\vec{0}\| = 0$. But $A$ is orthogonal, so that $\|A\vec{b}\| = \|\vec{b}\|$. Hence we have $\|\vec{b}\| = 0$, and so $\vec{b} = \vec{0}$. Hence $\ker(A) = \{\vec{0}\}$, as desired. □

This proof only tells us that $A$ is invertible, but it doesn't tell us what the inverse is. We'll compute the inverse of an orthogonal matrix by the end of the class period.

Theorem 3.2 (Operations Preserving Orthogonality in Matrices). The product of two orthogonal matrices is orthogonal. The inverse of an orthogonal matrix is orthogonal.

Proof. To prove the first statement, suppose that $A$ and $B$ are two orthogonal $n \times n$ matrices. Then for any $\vec{x} \in \mathbb{R}^n$ we have
\[
\|AB\vec{x}\| = \|A(B\vec{x})\| \overset{\star}{=} \|B\vec{x}\| \overset{\star}{=} \|\vec{x}\|
\]
(in the equalities labeled $\star$ I have used the orthogonality of $A$ and $B$, respectively). This means that $AB$ is orthogonal.

For the second statement, let $A$ be an orthogonal $n \times n$ matrix and choose a vector $\vec{x} \in \mathbb{R}^n$; our goal is to show $\|A^{-1}\vec{x}\| = \|\vec{x}\|$. To do so, notice
\[
\|A^{-1}\vec{x}\| \overset{\star}{=} \|A A^{-1} \vec{x}\| = \|\vec{x}\|,
\]
where the equality labeled $\star$ uses the orthogonality of $A$. □

In class there was a very reasonable question asked: why is an orthogonal matrix called orthogonal? One good answer to this question is the following.

Theorem 3.3. If $\vec{v}$ and $\vec{w}$ are orthogonal elements of $\mathbb{R}^n$, then for any orthogonal transformation $T : \mathbb{R}^n \to \mathbb{R}^n$ the vectors $T(\vec{v})$ and $T(\vec{w})$ are orthogonal.

We'll prove this result at the beginning of class next time. To do this, we'll need the following result (which you'll prove for homework):

Lemma 3.4. For two vectors $\vec{x}, \vec{y} \in \mathbb{R}^n$, we have
\[
\|\vec{x} + \vec{y}\|^2 = \|\vec{x}\|^2 + \|\vec{y}\|^2
\]
if and only if $\vec{x} \cdot \vec{y} = 0$.
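Although the proof waits until next time, Theorem 3.3 and Lemma 3.4 are easy to test numerically. A minimal sketch, assuming NumPy; the rotation angle and the vectors $\vec{v}$, $\vec{w}$ are arbitrary illustrative choices of mine.

```python
import numpy as np

theta = 1.2                       # arbitrary angle; rotations are orthogonal
T = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

v = np.array([2.0, 1.0])
w = np.array([-1.0, 2.0])         # chosen so that v . w = 0

# Theorem 3.3: an orthogonal T sends orthogonal vectors to orthogonal vectors.
print(np.dot(v, w), np.dot(T @ v, T @ w))    # both (numerically) 0

# Lemma 3.4: ||x + y||^2 = ||x||^2 + ||y||^2 exactly when x . y = 0.
lhs = np.linalg.norm(v + w) ** 2
rhs = np.linalg.norm(v) ** 2 + np.linalg.norm(w) ** 2
print(np.isclose(lhs, rhs))                   # True, since v . w = 0
```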