Math 108B                                                Professor: Padraic Bartlett
Lecture 4: Applications of Orthogonality: QR Decompositions
Week 4                                                                     UCSB 2014

In our last class, we described the following method for creating orthonormal bases, known as the Gram-Schmidt method:

Theorem. Suppose that $V$ is a $k$-dimensional space with a basis $B = \{\vec{b}_1, \ldots, \vec{b}_k\}$. The following process (called the Gram-Schmidt process) creates an orthonormal basis for $V$:

1. First, create the following vectors $\{\vec{u}_1, \ldots, \vec{u}_k\}$:
\[
\begin{aligned}
\vec{u}_1 &= \vec{b}_1\\
\vec{u}_2 &= \vec{b}_2 - \text{proj}(\vec{b}_2 \text{ onto } \vec{u}_1)\\
\vec{u}_3 &= \vec{b}_3 - \text{proj}(\vec{b}_3 \text{ onto } \vec{u}_1) - \text{proj}(\vec{b}_3 \text{ onto } \vec{u}_2)\\
\vec{u}_4 &= \vec{b}_4 - \text{proj}(\vec{b}_4 \text{ onto } \vec{u}_1) - \text{proj}(\vec{b}_4 \text{ onto } \vec{u}_2) - \text{proj}(\vec{b}_4 \text{ onto } \vec{u}_3)\\
&\ \ \vdots\\
\vec{u}_k &= \vec{b}_k - \text{proj}(\vec{b}_k \text{ onto } \vec{u}_1) - \cdots - \text{proj}(\vec{b}_k \text{ onto } \vec{u}_{k-1})
\end{aligned}
\]

2. Now, take each of the vectors $\vec{u}_i$ and rescale them so that they are unit length: i.e. redefine each $\vec{u}_i$ as the rescaled vector $\vec{u}_i / \|\vec{u}_i\|$.

In this class, I want to talk about a useful application of the Gram-Schmidt method: the QR decomposition! We define this here:

1  The QR decomposition

Definition. We say that an $n \times n$ matrix $Q$ is orthogonal if its columns form an orthonormal basis for $\mathbb{R}^n$.

As a side note: we've studied these matrices before! In class on Friday, we proved that for any such matrix, the relation $Q^TQ = I$ held; to see this, we just looked at the $(i,j)$-th entry of the product $Q^TQ$, which by definition is the $i$-th row of $Q^T$ dotted with the $j$-th column of $Q$. The fact that $Q$'s columns form an orthonormal basis was enough to tell us that this product must be the identity matrix, as claimed.
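For readers who like to see algorithms as code, the two steps above translate directly into a short program. The following is a minimal sketch in plain Python (no numerical libraries; the function name `gram_schmidt` and the list-of-lists vector format are my own illustrative choices), which folds the rescaling step into the main loop so each stored vector is already unit length:

```python
def gram_schmidt(basis):
    """Return an orthonormal basis spanning the same space as `basis`.

    Vectors are lists of floats. Assumes the inputs are linearly
    independent, so no u_i comes out as the zero vector.
    """
    def dot(x, y):
        return sum(a * b for a, b in zip(x, y))

    ortho = []
    for b in basis:
        # u_i = b_i - proj(b_i onto u_1) - ... - proj(b_i onto u_{i-1})
        u = list(b)
        for v in ortho:
            c = dot(b, v)  # v is already unit length, so proj = (b . v) v
            u = [ui - c * vi for ui, vi in zip(u, v)]
        # rescaling step: make u unit length before storing it
        norm = dot(u, u) ** 0.5
        ortho.append([ui / norm for ui in u])
    return ortho
```

Because each stored vector is normalized as it is produced, the projection coefficient simplifies to a single dot product; this is the "classical" Gram-Schmidt loop, chosen here to mirror the theorem statement as closely as possible.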
Definition. A QR-decomposition of an $n \times n$ matrix $A$ is an orthogonal matrix $Q$ and an upper-triangular matrix $R$, such that $A = QR$. (A matrix is called upper-triangular if all of its entries below the main diagonal are 0. For example, $\begin{bmatrix} 1 & 2 & 3\\ 0 & 3 & 2\\ 0 & 0 & 2 \end{bmatrix}$ is upper-triangular.)

Theorem. Every invertible matrix has a QR-decomposition, where $R$ is also invertible.

Proof. We prove this using the Gram-Schmidt process! Specifically, consider the following process: take the columns $\vec{a}_{c_1}, \ldots, \vec{a}_{c_n}$ of $A$. Because $A$ is invertible, its columns are linearly independent, and thus form a basis for $\mathbb{R}^n$. Therefore, running the Gram-Schmidt process on them will create an orthonormal basis for $\mathbb{R}^n$! Do this here: i.e. set
\[
\begin{aligned}
\vec{u}_1 &= \vec{a}_{c_1}\\
\vec{u}_2 &= \vec{a}_{c_2} - \text{proj}(\vec{a}_{c_2} \text{ onto } \vec{u}_1)\\
\vec{u}_3 &= \vec{a}_{c_3} - \text{proj}(\vec{a}_{c_3} \text{ onto } \vec{u}_1) - \text{proj}(\vec{a}_{c_3} \text{ onto } \vec{u}_2)\\
\vec{u}_4 &= \vec{a}_{c_4} - \text{proj}(\vec{a}_{c_4} \text{ onto } \vec{u}_1) - \text{proj}(\vec{a}_{c_4} \text{ onto } \vec{u}_2) - \text{proj}(\vec{a}_{c_4} \text{ onto } \vec{u}_3)\\
&\ \ \vdots\\
\vec{u}_n &= \vec{a}_{c_n} - \text{proj}(\vec{a}_{c_n} \text{ onto } \vec{u}_1) - \cdots - \text{proj}(\vec{a}_{c_n} \text{ onto } \vec{u}_{n-1})
\end{aligned}
\]
Skip the rescaling step for a second. If we take these equations and solve them for the columns of $A$, we get
\[
\begin{aligned}
\vec{a}_{c_1} &= \vec{u}_1\\
\vec{a}_{c_2} &= \vec{u}_2 + \text{proj}(\vec{a}_{c_2} \text{ onto } \vec{u}_1)\\
\vec{a}_{c_3} &= \vec{u}_3 + \text{proj}(\vec{a}_{c_3} \text{ onto } \vec{u}_1) + \text{proj}(\vec{a}_{c_3} \text{ onto } \vec{u}_2)\\
\vec{a}_{c_4} &= \vec{u}_4 + \text{proj}(\vec{a}_{c_4} \text{ onto } \vec{u}_1) + \text{proj}(\vec{a}_{c_4} \text{ onto } \vec{u}_2) + \text{proj}(\vec{a}_{c_4} \text{ onto } \vec{u}_3)\\
&\ \ \vdots\\
\vec{a}_{c_n} &= \vec{u}_n + \text{proj}(\vec{a}_{c_n} \text{ onto } \vec{u}_1) + \cdots + \text{proj}(\vec{a}_{c_n} \text{ onto } \vec{u}_{n-1})
\end{aligned}
\]
Now, notice that all of the $\text{proj}(\vec{a}_{c_i} \text{ onto } \vec{u}_j)$-terms are actually, by definition, just multiples of the vector $\vec{u}_j$. To make this more obvious, we could replace each of these terms with what they are precisely defined to be, i.e. $\frac{\vec{a}_{c_i} \cdot \vec{u}_j}{\|\vec{u}_j\|^2}\,\vec{u}_j$. However, that takes up a lot of space! Instead, for shorthand's sake, denote the constant $\frac{\vec{a}_{c_i} \cdot \vec{u}_j}{\|\vec{u}_j\|^2}$ as $p_{c_i,j}$. Then, we have the following:
\[
\vec{a}_{c_1} = \vec{u}_1
\]
\[
\begin{aligned}
\vec{a}_{c_2} &= \vec{u}_2 + p_{c_2,1}\vec{u}_1\\
\vec{a}_{c_3} &= \vec{u}_3 + p_{c_3,1}\vec{u}_1 + p_{c_3,2}\vec{u}_2\\
\vec{a}_{c_4} &= \vec{u}_4 + p_{c_4,1}\vec{u}_1 + p_{c_4,2}\vec{u}_2 + p_{c_4,3}\vec{u}_3\\
&\ \ \vdots\\
\vec{a}_{c_n} &= \vec{u}_n + p_{c_n,1}\vec{u}_1 + \cdots + p_{c_n,n-1}\vec{u}_{n-1}
\end{aligned}
\]
If we do this, then it is not too hard to see that we actually have the following identity:
\[
\begin{bmatrix}
\vert & \vert & \vert & & \vert\\
\vec{u}_1 & \vec{u}_2 & \vec{u}_3 & \cdots & \vec{u}_n\\
\vert & \vert & \vert & & \vert
\end{bmatrix}
\cdot
\begin{bmatrix}
1 & p_{c_2,1} & p_{c_3,1} & \cdots & p_{c_n,1}\\
0 & 1 & p_{c_3,2} & \cdots & p_{c_n,2}\\
0 & 0 & 1 & \cdots & p_{c_n,3}\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
0 & 0 & 0 & \cdots & 1
\end{bmatrix}
=
\begin{bmatrix}
\vert & \vert & & \vert\\
\vec{u}_1 & \vec{u}_2 + p_{c_2,1}\vec{u}_1 & \cdots & \vec{u}_n + \sum_{i=1}^{n-1} p_{c_n,i}\vec{u}_i\\
\vert & \vert & & \vert
\end{bmatrix}
= A.
\]
In other words, we have a QR-decomposition! Well, almost. The left-hand matrix's columns form an orthogonal basis for $\mathbb{R}^n$, but they are not yet all length 1. To fix this, simply scale the left matrix's columns so that they're all length 1, and then increase the scaling on the right-hand matrix's rows so that it cancels out: i.e.
\[
\begin{bmatrix}
\vert & \vert & & \vert\\
\frac{\vec{u}_1}{\|\vec{u}_1\|} & \frac{\vec{u}_2}{\|\vec{u}_2\|} & \cdots & \frac{\vec{u}_n}{\|\vec{u}_n\|}\\
\vert & \vert & & \vert
\end{bmatrix}
\cdot
\begin{bmatrix}
\|\vec{u}_1\| & p_{c_2,1}\|\vec{u}_1\| & p_{c_3,1}\|\vec{u}_1\| & \cdots & p_{c_n,1}\|\vec{u}_1\|\\
0 & \|\vec{u}_2\| & p_{c_3,2}\|\vec{u}_2\| & \cdots & p_{c_n,2}\|\vec{u}_2\|\\
0 & 0 & \|\vec{u}_3\| & \cdots & p_{c_n,3}\|\vec{u}_3\|\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
0 & 0 & 0 & \cdots & \|\vec{u}_n\|
\end{bmatrix}
=
\begin{bmatrix}
\vert & \vert & & \vert\\
\vec{u}_1 & \vec{u}_2 + p_{c_2,1}\vec{u}_1 & \cdots & \vec{u}_n + \sum_{i=1}^{n-1} p_{c_n,i}\vec{u}_i\\
\vert & \vert & & \vert
\end{bmatrix}
= A.
\]
A QR-decomposition!
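The proof above is constructive, and can be turned into code almost line by line: run Gram-Schmidt on the columns of $A$, storing the lengths $\|\vec{u}_j\|$ on $R$'s diagonal and the scaled coefficients $p_{c_j,i}\|\vec{u}_i\|$ above it. Here is a sketch in plain Python (the function name and the row-list matrix format are illustrative choices; it assumes $A$ is invertible, so no $\|\vec{u}_j\|$ is zero):

```python
def qr_decompose(A):
    """QR-decompose an invertible n x n matrix A via Gram-Schmidt.

    A is a list of n rows; returns (Q, R) in the same format, with
    Q orthogonal and R upper-triangular, so that A = QR. Assumes the
    columns of A are linearly independent.
    """
    n = len(A)

    def dot(x, y):
        return sum(a * b for a, b in zip(x, y))

    cols = [[A[i][j] for i in range(n)] for j in range(n)]  # columns of A
    us = []                                  # unscaled Gram-Schmidt vectors
    R = [[0.0] * n for _ in range(n)]
    for j, a in enumerate(cols):
        u = list(a)
        for i, ui in enumerate(us):
            p = dot(a, ui) / dot(ui, ui)     # p_{c_j, i}
            u = [uk - p * uik for uk, uik in zip(u, ui)]
            R[i][j] = p * dot(ui, ui) ** 0.5  # entry p_{c_j,i} * ||u_i||
        us.append(u)
        R[j][j] = dot(u, u) ** 0.5            # diagonal entry ||u_j||
    # Q's columns are the rescaled u_j's
    Q = [[us[j][i] / R[j][j] for j in range(n)] for i in range(n)]
    return Q, R
```

Note that the entries of $R$ are exactly the entries of the right-hand matrix in the proof, so no extra work is needed beyond the Gram-Schmidt pass itself.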
As a side note, it bears mentioning that this result holds even if the matrix is not invertible:

Theorem. Every matrix has a QR-decomposition, though $R$ may not always be invertible.

The proof is pretty much exactly the same as above, except you have to be careful when dealing with the $\vec{u}_i$'s, as you might be dividing by zero in the situations where $A$'s columns are linearly dependent. On your own, try to think about what you'd need to change in the above proof to make it work for a general matrix!

Instead, I want to focus on why this decomposition is nice: solving systems of linear equations!

2  Applications of QR decompositions: Solving Systems of Linear Equations

Suppose you have an invertible matrix $A$ and a vector $\vec{b}$. Consider the task of finding a vector $\vec{v}$ such that $A\vec{v} = \vec{b}$; in other words, solving the system of $n$ linear equations $\vec{a}_{r_i} \cdot \vec{v} = b_i$, $i = 1, \ldots, n$.

This is typically a doable if slightly tedious task, via Gaussian elimination (i.e. pivoting on entries in $A$). However, it takes time, and (from the perspective of implementing it on a computer) can be fairly sensitive to small changes or errors: i.e. if you're pivoting on an entry in $A$ that is nearly zero, it is easy for small rounding errors in a computer program to suddenly cause very big changes in what the entries in your matrix should be!

However, suppose that we have a QR decomposition for $A$: i.e. we can write $A = QR$, for some upper-triangular $R$ and orthogonal $Q$. Then, solving $A\vec{v} = \vec{b}$ is the same task as solving
\[
QR\vec{v} = \vec{b} \quad\Longrightarrow\quad Q^TQR\vec{v} = R\vec{v} = Q^T\vec{b};
\]
and this is suddenly much easier! In particular, because $R$ is upper-triangular, $R\vec{v}$ is just
\[
\begin{bmatrix}
r_{1,1}v_1 + r_{1,2}v_2 + r_{1,3}v_3 + \cdots + r_{1,n}v_n\\
r_{2,2}v_2 + r_{2,3}v_3 + \cdots + r_{2,n}v_n\\
\vdots\\
r_{n-1,n-1}v_{n-1} + r_{n-1,n}v_n\\
r_{n,n}v_n
\end{bmatrix}.
\]
And finding values of this to set equal to some fixed vector $Q^T\vec{b}$ is really easy! In particular, the last coordinate of $R\vec{v}$ has just one variable, $v_n$, so it's easy to solve for that variable. From here, the second-to-last coordinate of $R\vec{v}$ has just two variables, $v_{n-1}, v_n$, one of which we
know now! So it's equally easy to solve for $v_{n-1}$. Working our way up, we have the same situation for each variable: we never have to do any real work to solve for the variables $v_i$! To illustrate this, we work an example:

Example. Consider the matrix
\[
A = \begin{bmatrix}
2 & 1 & 3 & 3\\
2 & 1 & -1 & 1\\
2 & -1 & 3 & -3\\
2 & -1 & -1 & -1
\end{bmatrix}.
\]
First, find its QR decomposition. Then, use that QR decomposition to find a vector $\vec{v}$ such that
\[
A\vec{v} = (1, 2, 0, 1).
\]

Answer. We start by performing Gram-Schmidt on the columns of $A$:
\[
\begin{aligned}
\vec{u}_1 &= \vec{a}_{c_1} = (2,2,2,2)\\[4pt]
\vec{u}_2 &= \vec{a}_{c_2} - \text{proj}(\vec{a}_{c_2} \text{ onto } \vec{u}_1)\\
&= (1,1,-1,-1) - \frac{(1,1,-1,-1)\cdot(2,2,2,2)}{(2,2,2,2)\cdot(2,2,2,2)}(2,2,2,2)\\
&= (1,1,-1,-1) - \vec{0} = (1,1,-1,-1)\\[4pt]
\vec{u}_3 &= \vec{a}_{c_3} - \text{proj}(\vec{a}_{c_3} \text{ onto } \vec{u}_1) - \text{proj}(\vec{a}_{c_3} \text{ onto } \vec{u}_2)\\
&= (3,-1,3,-1) - \frac{(3,-1,3,-1)\cdot(2,2,2,2)}{(2,2,2,2)\cdot(2,2,2,2)}(2,2,2,2) - \frac{(3,-1,3,-1)\cdot(1,1,-1,-1)}{(1,1,-1,-1)\cdot(1,1,-1,-1)}(1,1,-1,-1)\\
&= (3,-1,3,-1) - \frac{8}{16}(2,2,2,2) - \vec{0} = (2,-2,2,-2)\\[4pt]
\vec{u}_4 &= \vec{a}_{c_4} - \text{proj}(\vec{a}_{c_4} \text{ onto } \vec{u}_1) - \text{proj}(\vec{a}_{c_4} \text{ onto } \vec{u}_2) - \text{proj}(\vec{a}_{c_4} \text{ onto } \vec{u}_3)\\
&= (3,1,-3,-1) - \frac{(3,1,-3,-1)\cdot(2,2,2,2)}{(2,2,2,2)\cdot(2,2,2,2)}(2,2,2,2) - \frac{(3,1,-3,-1)\cdot(1,1,-1,-1)}{(1,1,-1,-1)\cdot(1,1,-1,-1)}(1,1,-1,-1)\\
&\quad - \frac{(3,1,-3,-1)\cdot(2,-2,2,-2)}{(2,-2,2,-2)\cdot(2,-2,2,-2)}(2,-2,2,-2)\\
&= (3,1,-3,-1) - \vec{0} - \frac{8}{4}(1,1,-1,-1) - \vec{0} = (1,-1,-1,1)
\end{aligned}
\]
Using these four vectors, we construct a QR-decomposition as described earlier. Notice that we've already calculated the $\frac{\vec{a}_{c_i}\cdot\vec{u}_j}{\|\vec{u}_j\|^2} = p_{c_i,j}$'s above, and so we don't need to repeat our
work here:
\[
\begin{bmatrix}
\vert & \vert & \vert & \vert\\
\frac{\vec{u}_1}{\|\vec{u}_1\|} & \frac{\vec{u}_2}{\|\vec{u}_2\|} & \frac{\vec{u}_3}{\|\vec{u}_3\|} & \frac{\vec{u}_4}{\|\vec{u}_4\|}\\
\vert & \vert & \vert & \vert
\end{bmatrix}
\cdot
\begin{bmatrix}
\|\vec{u}_1\| & p_{c_2,1}\|\vec{u}_1\| & p_{c_3,1}\|\vec{u}_1\| & p_{c_4,1}\|\vec{u}_1\|\\
0 & \|\vec{u}_2\| & p_{c_3,2}\|\vec{u}_2\| & p_{c_4,2}\|\vec{u}_2\|\\
0 & 0 & \|\vec{u}_3\| & p_{c_4,3}\|\vec{u}_3\|\\
0 & 0 & 0 & \|\vec{u}_4\|
\end{bmatrix}
=
\frac{1}{2}\begin{bmatrix}
1 & 1 & 1 & 1\\
1 & 1 & -1 & -1\\
1 & -1 & 1 & -1\\
1 & -1 & -1 & 1
\end{bmatrix}
\cdot
\begin{bmatrix}
4 & 0 & 2 & 0\\
0 & 2 & 0 & 4\\
0 & 0 & 4 & 0\\
0 & 0 & 0 & 2
\end{bmatrix}
\]
From here, to solve the equation $A\vec{v} = (1,2,0,1)$, we can use our QR-decomposition to write $QR\vec{v} = (1,2,0,1)$, or equivalently $R\vec{v} = Q^T(1,2,0,1)$. We find the right-hand side here:
\[
\left(\frac{1}{2}\begin{bmatrix}
1 & 1 & 1 & 1\\
1 & 1 & -1 & -1\\
1 & -1 & 1 & -1\\
1 & -1 & -1 & 1
\end{bmatrix}\right)^T
\begin{bmatrix}1\\2\\0\\1\end{bmatrix}
=
\begin{bmatrix}2\\1\\-1\\0\end{bmatrix}
\]
So: we have the simple problem of finding $\vec{v} = (v_1, v_2, v_3, v_4)$ such that
\[
\begin{bmatrix}
4 & 0 & 2 & 0\\
0 & 2 & 0 & 4\\
0 & 0 & 4 & 0\\
0 & 0 & 0 & 2
\end{bmatrix}
\begin{bmatrix}v_1\\v_2\\v_3\\v_4\end{bmatrix}
=
\begin{bmatrix}2\\1\\-1\\0\end{bmatrix}
\]
Ordering from bottom to top, this problem is just the task of solving the four linear equations
\[
\begin{aligned}
2v_4 &= 0\\
4v_3 &= -1\\
2v_2 + 4v_4 &= 1\\
4v_1 + 2v_3 &= 2
\end{aligned}
\]
If we solve this list from the top down, it is trivial: we get $v_4 = 0$, $v_3 = -\frac{1}{4}$, $v_2 = \frac{1}{2}$, and $v_1 = \frac{5}{8}$. Thus, we've found a solution to our system of linear equations, and we're done!
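The bottom-to-top solving pattern in this example is exactly back-substitution, and it works for any upper-triangular system. A small sketch in plain Python (the function name is an illustrative choice), run on the $R$ and $Q^T\vec{b}$ from the example above:

```python
def solve_upper_triangular(R, c):
    """Solve R v = c for upper-triangular R by back-substitution."""
    n = len(R)
    v = [0.0] * n
    for i in range(n - 1, -1, -1):  # work from the bottom row up
        # everything to the right of the diagonal is already known
        s = sum(R[i][j] * v[j] for j in range(i + 1, n))
        v[i] = (c[i] - s) / R[i][i]
    return v

# The example's system: R v = Q^T b, with Q^T b = (2, 1, -1, 0).
R = [[4, 0, 2, 0],
     [0, 2, 0, 4],
     [0, 0, 4, 0],
     [0, 0, 0, 2]]
c = [2, 1, -1, 0]
v = solve_upper_triangular(R, c)  # v = [5/8, 1/2, -1/4, 0]
```

Each step uses one division and at most $n-1$ multiplications, so once the QR decomposition is in hand, solving the system takes only about $n^2$ arithmetic operations and no pivoting at all.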