MATH 2030: MATRICES

Matrix Algebra

As with vectors, we may use the algebra of matrices to simplify calculations. However, matrices have operations that vectors do not possess, and so it will be of interest to explore their properties.

Properties of Addition and Scalar Multiplication. All of the algebraic properties of addition and scalar multiplication for vectors apply to matrices as well. These are summarized as a theorem.

Theorem 0.1. Let A, B and C be matrices of the same size and let c and d be scalars. Then
(1) A + B = B + A (Commutativity)
(2) (A + B) + C = A + (B + C) (Associativity)
(3) A + O = A
(4) A + (-A) = O
(5) c(A + B) = cA + cB (Distributivity)
(6) (c + d)A = cA + dA (Distributivity)
(7) c(dA) = (cd)A
(8) 1A = A

These may be proven directly using the componentwise definitions of matrix addition and scalar multiplication.

These properties allow us to combine k m x n matrices A_1, ..., A_k with k scalars c_1, ..., c_k to produce a linear combination

c_1 A_1 + c_2 A_2 + ... + c_k A_k;

we say the c_i are the coefficients of the linear combination. In fact we can say when a collection of matrices is linearly dependent or independent, as in the case of vectors. If a matrix B is a linear combination of the k matrices A_1, ..., A_k, we say that B is in the span of these matrices and write B in Span(A_1, ..., A_k).

Example 0.2. Q: Define (writing matrices row by row, rows separated by semicolons)
A_1 = [0 1; -1 0], A_2 = [1 0; 0 1], A_3 = [1 1; 1 1].
Is B = [1 4; 2 1] in the span of A_1, A_2 and A_3?
Is C = [1 2; 3 4] in the span of A_1, A_2 and A_3?
A: We wish to find c_1, c_2 and c_3 such that
c_1 [0 1; -1 0] + c_2 [1 0; 0 1] + c_3 [1 1; 1 1] = [1 4; 2 1].
Using the algebraic properties of matrix addition and scalar multiplication, we may write the left-hand side as
[c_2 + c_3, c_1 + c_3; -c_1 + c_3, c_2 + c_3] = [1 4; 2 1].
This gives four linear equations in three variables, a system of linear equations with augmented matrix
[0 1 1 | 1; 1 0 1 | 4; -1 0 1 | 2; 0 1 1 | 1].
Gauss-Jordan elimination gives
[1 0 0 | 1; 0 1 0 | -2; 0 0 1 | 3; 0 0 0 | 0],
implying that c_1 = 1, c_2 = -2 and c_3 = 3, and so B = A_1 - 2A_2 + 3A_3.
For C we wish to find c_1, c_2 and c_3 such that
c_1 [0 1; -1 0] + c_2 [1 0; 0 1] + c_3 [1 1; 1 1] = [1 2; 3 4].
Listing the linear equations in the components of the resulting matrix:
c_2 + c_3 = 1, c_1 + c_3 = 2, -c_1 + c_3 = 3, c_2 + c_3 = 4.
Row reduction of the augmented matrix gives
[0 1 1 | 1; 1 0 1 | 2; -1 0 1 | 3; 0 0 0 | 3].
As this system is inconsistent, C is not in Span(A_1, A_2, A_3).

Asking whether O can be written as a nontrivial linear combination of A_1, A_2, ..., A_k is tantamount to asking whether these k matrices are linearly dependent or linearly independent. The collection is linearly dependent if
c_1 A_1 + c_2 A_2 + ... + c_k A_k = O
for some vector [c_1, ..., c_k] that is not the zero k-vector. If c_1 = c_2 = ... = c_k = 0 is the only solution, then the collection is linearly independent. As an example, the three matrices of the previous example are linearly independent: setting up the related linear system in c_1, c_2 and c_3 yields the only solution c_1 = c_2 = c_3 = 0.
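For readers who like to verify such calculations, here is a short sketch in plain Python (not part of the original notes; matrices are stored as lists of rows). It checks the conclusion of Example 0.2, that B = A_1 - 2A_2 + 3A_3 with coefficients c_1 = 1, c_2 = -2, c_3 = 3.

```python
# A sketch verifying Example 0.2: the linear combination
# 1*A1 + (-2)*A2 + 3*A3 reproduces B entrywise.

def add(X, Y):
    """Entrywise sum of two matrices stored as lists of rows."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(c, X):
    """Scalar multiple cX."""
    return [[c * x for x in row] for row in X]

A1 = [[0, 1], [-1, 0]]
A2 = [[1, 0], [0, 1]]
A3 = [[1, 1], [1, 1]]
B  = [[1, 4], [2, 1]]

combo = add(add(scale(1, A1), scale(-2, A2)), scale(3, A3))
print(combo == B)  # True: B lies in Span(A1, A2, A3)
```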
Properties of Matrix Multiplication. We have introduced a new operation for matrices that looks like multiplication of real numbers; however, this analogy only goes so far, and we must be careful about what the two operations have in common and how they differ.

Example 0.3. Consider the 2 x 2 matrices
A = [2 4; -1 -2] and B = [1 0; 1 1]
and the two products AB and BA. Calculating the matrix products:
AB = [2 4; -1 -2][1 0; 1 1] = [6 4; -3 -2],
BA = [1 0; 1 1][2 4; -1 -2] = [2 4; 1 2].
Notice that AB is not equal to BA: matrix multiplication is not commutative. This differs from multiplication of numbers, where ab = ba for any a, b in R.

Although matrix multiplication lacks commutativity, the operation still has some satisfying properties that will be of use.

Theorem 0.4. Let A, B and C be matrices of the appropriate sizes for each operation, and k a scalar. Then
(1) A(BC) = (AB)C (Associativity)
(2) A(B + C) = AB + AC (Left Distributivity)
(3) (A + B)C = AC + BC (Right Distributivity)
(4) k(AB) = (kA)B = A(kB)
(5) I_m A = A = A I_n, where A is an m x n matrix (Multiplicative identity).
These may be proven using the row-column representations of the matrices.

Example 0.5. Q: If A and B are square matrices (n x n), calculate (A + B)^2.
A: Using the properties of matrix multiplication we find (A + B)^2 = (A + B)(A + B); applying left distributivity and then right distributivity, this becomes
(A + B)A + (A + B)B = A^2 + BA + AB + B^2.
Notice the difference from the real numbers: (A + B)^2 = A^2 + 2AB + B^2 holds only when
A^2 + 2AB + B^2 = A^2 + AB + BA + B^2.
Removing the common terms, this requires AB + BA = 2AB, or, subtracting AB from both sides, BA = AB. This can only happen in the special case when A and B commute under matrix multiplication.
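The non-commutativity of Example 0.3 is easy to check by machine. The sketch below (plain Python, matrices as lists of rows) uses the matrices A = [2 4; -1 -2] and B = [1 0; 1 1] from that example.

```python
# A sketch of Example 0.3: matrix multiplication is not commutative.

def matmul(X, Y):
    """Product of matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[2, 4], [-1, -2]]
B = [[1, 0], [1, 1]]

AB = matmul(A, B)
BA = matmul(B, A)
print(AB)        # [[6, 4], [-3, -2]]
print(BA)        # [[2, 4], [1, 2]]
print(AB != BA)  # True: AB and BA differ
```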
Properties of the Transpose. Combining the properties we have for matrix addition, scalar multiplication and matrix multiplication with the transpose operation, we have the following list of helpful properties of the transpose.

Theorem 0.6. Let A and B be matrices of appropriate sizes and k be a scalar. Then
(1) (A^t)^t = A
(2) (A + B)^t = A^t + B^t
(3) (kA)^t = kA^t
(4) (AB)^t = B^t A^t
(5) (A^k)^t = (A^t)^k, where k is a non-negative integer.

Proof. Properties 1-3 are easily proven with the componentwise definition of A = [a_ij], while property 5 is a great exercise using mathematical induction. We will only prove property 4, as it illustrates the difference between multiplication of numbers and matrix multiplication. Supposing A = [a_ij] is m x n and B = [b_ij] is n x r, it is clear that B^t is an r x n matrix and A^t is an n x m matrix, and so B^t A^t is a well-defined matrix product. Furthermore (AB)^t is an r x m matrix, so it is the same size as B^t A^t; we just have to prove that their entries are equal. Denoting the i-th row of a matrix X by row_i(X) and the j-th column by col_j(X), we find
[(AB)^t]_ij = (AB)_ji = row_j(A) . col_i(B) = col_j(A^t) . row_i(B^t) = row_i(B^t) . col_j(A^t) = [B^t A^t]_ij,
where we have used the commutativity of the dot product in the second-last step.

The Inverse of a Matrix

Returning to the question of when a system of linear equations Ax = b has a solution, we would like to use matrix algebra to solve the system. Returning to the analogy with the real numbers, to solve the linear equation ax = b one computes the inverse of a, which is 1/a. Then, using this and simple algebra,
ax = b  =>  (1/a)(ax) = (1/a)b  =>  ((1/a)a)x = b/a  =>  1x = b/a,
and thus the solution is x = b/a. To reproduce this idea for matrices, we need to find some matrix A' such that A'A = I, the square identity matrix with the same number of columns as A.
If a matrix like this exists, the procedure for finding x is considerably simpler:
Ax = b  =>  A'Ax = A'b  =>  Ix = A'b  =>  x = A'b.
In this chapter we answer the question of when such a matrix exists.

Definition 0.7. If A is an n x n matrix, an inverse of A is an n x n matrix A' such that AA' = I and A'A = I, where I = I_n is the n x n identity matrix. If such an A' exists, we say that A is invertible.
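As a quick computational sketch of this definition (not part of the original notes), the pair below is chosen for illustration: a shear matrix and the opposite shear, which we verify are two-sided inverses of one another.

```python
# A minimal check of Definition 0.7: both products AA' and A'A equal I.

def matmul(X, Y):
    """Product of matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

I2     = [[1, 0], [0, 1]]
A      = [[1, 1], [0, 1]]   # a shear matrix (illustrative choice)
Aprime = [[1, -1], [0, 1]]  # candidate inverse: the opposite shear

print(matmul(A, Aprime) == I2)  # True
print(matmul(Aprime, A) == I2)  # True
```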
Example 0.8. If A = [2 5; 1 3] and A' = [3 -5; -1 2], computing the two products we see that A is invertible:
AA' = [2 5; 1 3][3 -5; -1 2] = [1 0; 0 1],
A'A = [3 -5; -1 2][2 5; 1 3] = [1 0; 0 1].

Example 0.9. Q: Show that the matrix B = [1 2; 2 4] is not invertible.
A: Suppose there were a matrix B' = [w x; y z] such that
[1 2; 2 4][w x; y z] = [1 0; 0 1].
This produces four linear equations:
w + 2y = 1, x + 2z = 0, 2w + 4y = 0, 2x + 4z = 1;
subtracting twice the first equation from the third produces 0 = -2, which is a contradiction, and so there can be no solution.

Notice that a matrix commutes with its inverse: AA' = A'A.

Uniqueness of the Inverse of a Matrix - if it exists. Although we cannot yet say when a matrix has an inverse or how it can be found, by positing the existence of an inverse for a matrix A we may prove that it is unique.

Theorem 0.10. If A is an invertible matrix, then its inverse is unique.

Proof. Suppose A' and A'' are (both potentially distinct) matrices such that AA' = I = A'A and AA'' = I = A''A. Then consider the chain of equalities
A'' = A''I = A''(AA') = (A''A)A' = IA' = A'.
We have shown that they are equal, and hence the inverse of A is unique.

We may update the definition of the inverse of a matrix:

Definition 0.11. If A has a matrix A' such that AA' = I = A'A, we denote it by A^{-1} and say it is the unique inverse of A.

Note: A^{-1} is not "1/A", as there is no operation of division by matrices.

With the inverse defined, we may complete the analogy with solving a simple linear equation in R.

Theorem 0.12. If A is an invertible n x n matrix, then the system of linear equations given by Ax = b has the unique solution x = A^{-1}b for any b in R^n.

Proof. Proving the existence of a solution is easily done with matrix algebra:
A(A^{-1}b) = (AA^{-1})b = Ib = b.
To prove uniqueness, suppose y is another solution, i.e. Ay = b; then
A^{-1}(Ay) = A^{-1}b  =>  (A^{-1}A)y = A^{-1}b  =>  Iy = A^{-1}b  =>  y = A^{-1}b.
Hence y = x, implying the solution is unique.
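Theorem 0.12 can be illustrated with the invertible matrix of Example 0.8. In the Python sketch below, the right-hand side b is an arbitrary choice made for illustration.

```python
# Illustrating Theorem 0.12: x = A^{-1} b solves Ax = b.
# A and its inverse are taken from Example 0.8; b is an arbitrary choice.

def matvec(M, v):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

A    = [[2, 5], [1, 3]]
Ainv = [[3, -5], [-1, 2]]
b    = [7, 4]

x = matvec(Ainv, b)
print(x)             # [1, 1]
print(matvec(A, x))  # [7, 4] -- applying A to x recovers b
```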
Conditions for the Existence of an Inverse for 2 x 2 Matrices. The above theorem is wonderful if one can calculate the inverse of the coefficient matrix for the system of linear equations. In the case n = 2 we have the following theorem.

Theorem 0.13. If A = [a b; c d], then A is invertible if ad - bc is nonzero, in which case its inverse is
A^{-1} = (1/(ad - bc)) [d -b; -c a].
If ad - bc = 0, the matrix A is not invertible. We call ad - bc the determinant of A and denote it det A.

Proof. Supposing ad - bc is nonzero, we verify that the matrices are inverses:
[a b; c d] ((1/(ad - bc)) [d -b; -c a]) = (1/(ad - bc)) [ad - bc, -ab + ba; cd - dc, -cb + da] = [1 0; 0 1].
Repeating the calculation with the matrix product in the other order produces the same result. We have proven that A is invertible and that the A^{-1} defined above is the unique inverse of A.
Instead, if we assume ad - bc = 0, we must consider two cases, depending on whether a vanishes. If a is nonzero then d = bc/a, and with k = c/a the matrix becomes
A = [a b; ka kb].
Supposing this had an inverse,
[a b; ka kb][w x; y z] = [1 0; 0 1]
would produce a linear system of four equations in the four variables w, x, y and z which is inconsistent: the second row of the product is k times the first, so it cannot equal [0 1] while the first row equals [1 0]. If a = 0, the vanishing of det(A) = ad - bc implies bc = 0, and so we have two further subcases: A is of the form
A = [0 0; c d] or A = [0 b; 0 d].
In the first case the first row of any product A[x y; z w] is [0 0]; in the second case both rows of the product are multiples of [z w]. In either case the product can never equal the identity matrix [1 0; 0 1].

Example 0.14. Q: Is the matrix B = [12 -15; 4 -5] invertible?
A: No, as the determinant det B = 12(-5) - (-15)(4) = -60 + 60 = 0.

Example 0.15. Q: Use the inverse of the coefficient matrix to solve the linear system
x + 2y = 3, 3x + 4y = -2.
A: The coefficient matrix is A = [1 2; 3 4], whose inverse is
A^{-1} = [-2 1; 3/2 -1/2].
Thus, by applying the inverse to both sides, we find that x = A^{-1}b, or
x = [-8; 11/2].
Invoking the relevant theorem, we note this is the unique solution to the linear system.

Properties of Invertible Matrices. Of course, the inverse of a matrix has some interesting algebraic properties.

Theorem 0.16.
(1) If A is an invertible matrix, then A^{-1} is invertible and (A^{-1})^{-1} = A.
(2) If A is an invertible matrix and c is a nonzero scalar, then cA is invertible and (cA)^{-1} = (1/c)A^{-1}.
(3) If A and B are invertible matrices, then (AB)^{-1} = B^{-1}A^{-1}. More generally, if A_1, ..., A_k is a collection of k invertible n x n matrices, then
(A_1 A_2 ... A_k)^{-1} = A_k^{-1} A_{k-1}^{-1} ... A_2^{-1} A_1^{-1}.
(4) If A is an invertible matrix, then A^t is invertible and (A^t)^{-1} = (A^{-1})^t.
(5) If A is an invertible matrix, then A^n is invertible for all nonnegative integers n and (A^n)^{-1} = (A^{-1})^n. We define A^{-n} = (A^n)^{-1}.

Proof. This is a great exercise in matrix algebra and the uniqueness of the inverse. Unfortunately, for brevity's sake we will omit it; if you are interested, I refer you to the proof in the textbook, pg. 174.

Example 0.17. Q: Solve the following matrix equation for X, assuming all matrices involved are invertible:
A^{-1}(BX)^{-1} = (A^{-1}B^3)^2.
A: Noticing that A^{-1}(BX)^{-1} = ((BX)A)^{-1} by Theorem (0.16)-3, we find
(BXA)^{-1} = (A^{-1}B^3)^2
BXA = [(A^{-1}B^3)^2]^{-1}                          (Theorem (0.16)-1)
BXA = [(A^{-1}B^3)(A^{-1}B^3)]^{-1} = (A^{-1}B^3)^{-1}(A^{-1}B^3)^{-1}   (Theorem (0.16)-3)
BXA = (B^{-3}A)(B^{-3}A)                            (Theorem (0.16)-3, 5 and 1)
B^{-1}BXAA^{-1} = B^{-1}(B^{-3}AB^{-3}A)A^{-1}
IXI = B^{-4}AB^{-3}I
X = B^{-4}AB^{-3}.

Elementary Matrices as Elementary Row Operations. Originally we defined elementary row operations on an augmented matrix in terms of operations done on the linear system, and then generalized these operations to matrices in general. We are now going to reinterpret the elementary row operations in terms of matrix multiplication.
Consider the matrices
E = [1 0 0; 0 0 1; 0 1 0], A = [5 7; 1 0; 8 3].
Calculating the product EA, we find that the second and third rows have been interchanged in the new matrix:
EA = [5 7; 8 3; 1 0].
So we have found a matrix whose product with A reproduces the row operation R2 <-> R3. We have a special name for matrices of this type.

Definition 0.18. An elementary matrix is any matrix that can be obtained by performing an elementary row operation on an identity matrix.

As there are three elementary row operations, we expect three types of elementary matrices.

Example 0.19. Define
E_1 = [1 0 0 0; 0 3 0 0; 0 0 1 0; 0 0 0 1],
E_2 = [0 0 1 0; 0 1 0 0; 1 0 0 0; 0 0 0 1],
E_3 = [1 0 0 0; 0 1 0 0; 0 0 1 0; 0 -2 0 1].
Following the definition of an elementary matrix, these are all elementary matrices arising from I_4 by applying a single elementary row operation. In E_1 we have multiplied the second row by 3, in E_2 we have interchanged the rows R1 <-> R3, and in E_3 we have performed R4 - 2R2. Notice that if we apply these to any 4 x 4 matrix, the same row operations are performed on it. That is, given
A = [a_11 a_12 a_13 a_14; a_21 a_22 a_23 a_24; a_31 a_32 a_33 a_34; a_41 a_42 a_43 a_44],
the products are simply
E_1 A = [a_11 a_12 a_13 a_14; 3a_21 3a_22 3a_23 3a_24; a_31 a_32 a_33 a_34; a_41 a_42 a_43 a_44],
E_2 A = [a_31 a_32 a_33 a_34; a_21 a_22 a_23 a_24; a_11 a_12 a_13 a_14; a_41 a_42 a_43 a_44],
and
E_3 A = [a_11 a_12 a_13 a_14; a_21 a_22 a_23 a_24; a_31 a_32 a_33 a_34; a_41 - 2a_21, a_42 - 2a_22, a_43 - 2a_23, a_44 - 2a_24].

Theorem 0.20. Let E be the elementary matrix obtained by performing an elementary row operation on I_n. If the same elementary row operation is performed on an n x r matrix A, the result is the same as the matrix EA.

Remember that an elementary row operation is reversible, and so we expect every elementary matrix to be invertible; the inverse is the matrix resulting from the reverse elementary row operation acting on I_n.
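Theorem 0.20 and the reversibility remark are easy to check by machine. The sketch below uses the swap matrix E from the start of this subsection: left-multiplying by E performs R2 <-> R3, and applying E twice undoes the swap, so E is its own inverse.

```python
# A sketch of Theorem 0.20: left-multiplication by an elementary matrix
# performs the corresponding row operation.

def matmul(X, Y):
    """Product of matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

E = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]  # I3 with rows 2 and 3 interchanged
A = [[5, 7], [1, 0], [8, 3]]           # the 3 x 2 matrix from the text

print(matmul(E, A))  # [[5, 7], [8, 3], [1, 0]] -- rows 2 and 3 swapped
print(matmul(E, E))  # the identity: swapping twice undoes the operation
```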
Example 0.21. Let
E_1 = [1 0 0; 0 0 1; 0 1 0], E_2 = [1 0 0; 0 4 0; 0 0 1], and E_3 = [1 0 0; 0 1 0; 0 -2 1].
As E_1 corresponds to R2 <-> R3, its inverse is itself: E_1^{-1} = E_1. As E_2 arises from the row operation 4R2, its inverse corresponds to (1/4)R2:
E_2^{-1} = [1 0 0; 0 1/4 0; 0 0 1].
And finally, the reverse of the row operation for E_3 is R3 + 2R2, so that
E_3^{-1} = [1 0 0; 0 1 0; 0 2 1].

We have a theorem generalizing these observations.

Theorem 0.22. Each elementary matrix is invertible, and its inverse is an elementary matrix of the same type.

The Fundamental Theorem of Invertible Matrices. With the definitions and theorems established in the previous subsections, we are now able to summarize many equivalent conditions for an invertible matrix in one theorem. Here equivalent means that, given a matrix A, the statements in the following theorem are all true or all false.

Theorem 0.23. The Fundamental Theorem of Invertible Matrices: Ver. 1. Let A be an n x n matrix. The following statements are equivalent (i.e. all true or all false):
(1) A is invertible.
(2) Ax = b has a unique solution for every b in R^n.
(3) Ax = 0 has only the trivial solution.
(4) The reduced row echelon form of A is the n x n identity matrix.
(5) A is a product of elementary matrices.

Proof. We will establish the theorem by proving the circular chain of implications
(1) => (2) => (3) => (4) => (5) => (1).
(1) => (2): As A is invertible, the system of linear equations has the solution x = A^{-1}b for any b in R^n, and by Theorem 0.12 this solution is unique.
(2) => (3): Since any linear system Ax = b has a unique solution, so does the system with b = 0. As x = 0 is always a solution of the homogeneous system, it must be the only solution.
(3) => (4): Suppose the homogeneous system Ax = 0 has only the trivial solution; the corresponding system of equations is
a_11 x_1 + a_12 x_2 + ... + a_1n x_n = 0
a_21 x_1 + a_22 x_2 + ... + a_2n x_n = 0
...
a_n1 x_1 + a_n2 x_2 + ... + a_nn x_n = 0
and this linear system is equivalent to the one where x_1 = 0, x_2 = 0, ..., x_n = 0. In other words, Gauss-Jordan elimination applied to the augmented matrix of the system gives
[A | 0] = [a_11 a_12 ... a_1n | 0; a_21 a_22 ... a_2n | 0; ...; a_n1 a_n2 ... a_nn | 0] -> [1 0 ... 0 | 0; 0 1 ... 0 | 0; ...; 0 0 ... 1 | 0] = [I_n | 0],
proving the required condition.
(4) => (5): Assuming the reduced row echelon form of A is the identity matrix I_n, A can be reduced to I_n using a finite sequence of elementary row operations. By Theorem (0.20), each of these elementary row operations corresponds to left-multiplication by an appropriate elementary matrix. Combining the sequence of elementary matrices E_1, E_2, ..., E_k, we find that
E_k ... E_2 E_1 A = I_n.
Furthermore, these matrices are invertible by Theorem (0.22), and so
A = (E_k ... E_2 E_1)^{-1} I_n = E_1^{-1} E_2^{-1} ... E_k^{-1},
a product of elementary matrices.
(5) => (1): If A is a product of elementary matrices, then A is invertible, since each elementary matrix is invertible and a product of invertible matrices is invertible.

Example 0.24. Q: Express A = [2 3; 1 3] as a product of elementary matrices.
A: Applying the row operations R1 <-> R2, R2 - 2R1, R1 + R2 and -(1/3)R2, we may reduce A to I_2. By the Fundamental Theorem of Invertible Matrices, as the reduced row echelon form of A is I_2, we know that A is invertible, and it may be written in terms of the elementary matrices related to the row operations used in Gauss-Jordan elimination:
E_1 = [0 1; 1 0], E_2 = [1 0; -2 1], E_3 = [1 1; 0 1], E_4 = [1 0; 0 -1/3].
Using the formula from Theorem (0.23) we find that
A = (E_4 E_3 E_2 E_1)^{-1} = E_1^{-1} E_2^{-1} E_3^{-1} E_4^{-1} = [0 1; 1 0][1 0; 2 1][1 -1; 0 1][1 0; 0 -3].

As another application of the Fundamental Theorem of Invertible Matrices, we will show that the inverse of a matrix A, defined as a B such that AB = I_n and BA = I_n, need only satisfy one of these conditions, as the other is derived from it.

Theorem 0.25. Let A be a square matrix.
If B is a square matrix such that either AB = I or BA = I, then A is invertible and B = A^{-1}.

Proof. Suppose BA = I. Consider the equation Ax = 0. Left-multiplying by B we have BAx = B0, implying that Ix = x = 0. Hence the system Ax = 0 has only the trivial solution x = 0. From the equivalence of (3) and (1) in Theorem (0.23),
A is invertible, so that A^{-1} exists and AA^{-1} = A^{-1}A = I. Right-multiplying both sides of BA = I by A^{-1}, we find the implications
BAA^{-1} = IA^{-1}  =>  BI = A^{-1}  =>  B = A^{-1}.
(The case AB = I follows by applying this argument with the roles of A and B exchanged: B is invertible with B^{-1} = A, whence A is invertible and B = A^{-1}.)

Another application of the Fundamental Theorem of Invertible Matrices gives a helpful result for the calculation of a matrix's inverse.

Theorem 0.26. Let A be a square matrix. If a sequence of elementary row operations reduces A to I, then the same sequence of elementary row operations transforms I into A^{-1}.

Proof. If A is row equivalent to I, then we can achieve the reduction by left-multiplication by the sequence of elementary matrices related to the row operations used in the Gauss-Jordan method. Thus if E_k ... E_2 E_1 A = I, we may set B = E_k ... E_2 E_1 and we find that BA = I. By the previous theorem, A is invertible and A^{-1} = B. Applying the same sequence of row operations to I is equivalent to left-multiplying I by E_k ... E_2 E_1 = B. Doing so, we find that
E_k ... E_2 E_1 I = BI = B = A^{-1},
proving that I is transformed into A^{-1} by the same sequence of elementary row operations.

The Gauss-Jordan Method for Computing the Inverse of a Matrix. The last theorem of the previous subsection suggests a powerful approach to determining the inverse of a matrix. The question is how one records the row operations used to reduce A to the identity matrix (if that is possible). To do this compactly, we consider the new matrix [A | I], which may be seen as a super-augmented matrix. Theorem (0.26) shows that if A is row equivalent to I, that is, if A is invertible, then applying row operations to this new matrix to bring A into reduced row echelon form yields
[A | I] -> [I | A^{-1}].
Furthermore, if A cannot be reduced to I using elementary row operations, the Fundamental Theorem of Invertible Matrices ensures that A is not invertible. In essence this procedure is merely Gauss-Jordan elimination performed on an n x 2n matrix instead of an n x (n + 1) augmented matrix.

Example 0.27.
Q: Find the inverse of the matrix A, if it exists:
A = [1 2; 2 2].
A: Applying Gauss-Jordan elimination, we find
[A | I] = [1 2 | 1 0; 2 2 | 0 1].
Applying R2 - 2R1, then -(1/2)R2, and then R1 - 2R2, we find
[1 2 | 1 0; 0 -2 | -2 1] -> [1 2 | 1 0; 0 1 | 1 -1/2] -> [1 0 | -1 1; 0 1 | 1 -1/2],
so A^{-1} = [-1 1; 1 -1/2]; this agrees with the 2 x 2 formula for the matrix inverse.
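The Gauss-Jordan method is mechanical enough to sketch in a few lines of Python (illustrative, not part of the original notes). Exact rational arithmetic via `fractions.Fraction` avoids floating-point round-off, and the routine returns `None` when the left block cannot reach I, which by the Fundamental Theorem means the matrix is not invertible.

```python
from fractions import Fraction

# A sketch of the Gauss-Jordan method: row reduce the super-augmented
# matrix [A | I]; if the left block reaches I, the right block is A^{-1}.

def invert(A):
    n = len(A)
    # build [A | I] with exact rational arithmetic
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None                      # A is not row equivalent to I
        M[col], M[pivot] = M[pivot], M[col]  # row interchange
        p = M[col][col]
        M[col] = [x / p for x in M[col]]     # scale the pivot row to 1
        for r in range(n):
            if r != col and M[r][col] != 0:  # clear the rest of the column
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]

# Example 0.27: A = [1 2; 2 2] has inverse [-1 1; 1 -1/2].
print(invert([[1, 2], [2, 2]]) == [[-1, 1], [1, Fraction(-1, 2)]])  # True
# Example 0.28 (next page): this matrix is not invertible.
print(invert([[2, 1, -4], [-4, -1, 6], [-2, 2, -2]]))  # None
```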
Let us see what happens when a matrix does not have an inverse.

Example 0.28. Q: Find the inverse of the matrix A, if it exists:
A = [2 1 -4; -4 -1 6; -2 2 -2].
A: Proceeding as in the previous example, we form the super-augmented matrix and row reduce:
[2 1 -4 | 1 0 0; -4 -1 6 | 0 1 0; -2 2 -2 | 0 0 1] -> [2 1 -4 | 1 0 0; 0 1 -2 | 2 1 0; 0 3 -6 | 1 0 1] -> [2 1 -4 | 1 0 0; 0 1 -2 | 2 1 0; 0 0 0 | -5 -3 1].
As A is not reducible to the identity matrix, Theorem (0.23) implies this matrix does not have an inverse.

The LU Factorization

We are going to explore the idea of expressing a matrix as a product of simpler matrices. In the case of the real numbers this is often helpful, say when one considers the prime decomposition of a number, e.g. 72 = 2^3 * 3^2. Any representation of a matrix as a product of two or more other matrices is called a matrix factorization. For example, the following is a matrix factorization:
[3 -1; 9 -5] = [1 0; 3 1][3 -1; 0 -2].
In this section we will explore a factorization that arises in the solution of systems of linear equations using Gaussian elimination. Consider a system of linear equations of the form Ax = b, where A is an n x n matrix. Our aim is to factor A into a product of matrices that simplifies the solution of the given system as much as possible.

Example 0.29. Q: Reduce the following matrix to upper triangular form:
A = [2 1 3; 4 -1 3; -2 5 5].
A: Applying the row operations R2 - 2R1, R3 + R1 and R3 + 2R2, we find the upper triangular matrix
U = [2 1 3; 0 -3 -3; 0 0 2].
The three elementary matrices E_1, E_2 and E_3 related to these row operations are
E_1 = [1 0 0; -2 1 0; 0 0 1], E_2 = [1 0 0; 0 1 0; 1 0 1], E_3 = [1 0 0; 0 1 0; 0 2 1].
Hence E_3 E_2 E_1 A = U, and solving for A produces
A = E_1^{-1} E_2^{-1} E_3^{-1} U = [1 0 0; 2 1 0; 0 0 1][1 0 0; 0 1 0; -1 0 1][1 0 0; 0 1 0; 0 -2 1] U = [1 0 0; 2 1 0; -1 -2 1] U = LU.
Thus A may be factored as A = LU, where U is an upper triangular matrix and L is a unit lower triangular matrix, i.e. a matrix of the form
L = [1 0 ... 0; * 1 ... 0; ...; * * ... 1],
where all of the entries above the diagonal are zero and the diagonal entries are all 1.

Definition 0.30. Let A be a square matrix. A factorization A = LU, where L is a unit lower triangular matrix and U is upper triangular, is called an LU factorization of A.

The LU factorization works well when no row interchanges are needed to reduce A to the upper triangular matrix U. In that case all of the resulting elementary matrices are unit lower triangular, ensuring that L is unit lower triangular as well (it is built from inverses of elementary matrices which are themselves unit lower triangular). The approach fails when a zero appears in a pivot position, as we must then apply a row interchange to bring the matrix into upper triangular form, and L would no longer be unit lower triangular.

Theorem 0.31. If A is a square matrix that can be reduced to row echelon form without using any row interchanges, then A has an LU factorization.

As a rough argument for why the LU factorization is useful, consider the linear system Ax = b where the coefficient matrix A has an LU factorization A = LU. We can rewrite Ax = b as LUx = b. If we define y = Ux, we can solve for x in two steps:
(1) Solve Ly = b for y.
(2) Solve Ux = y for x.
Both of these linear systems are easy to solve because their coefficient matrices are lower and upper triangular respectively.

Example 0.32. Q: Use an LU factorization of A = [2 1 3; 4 -1 3; -2 5 5] to solve Ax = b, where b = [1; -4; 9].
A: In the previous example we saw that
A = [1 0 0; 2 1 0; -1 -2 1][2 1 3; 0 -3 -3; 0 0 2] = LU,
and so we begin by solving Ly = b for y = [y_1; y_2; y_3]. This is the simple linear system
(1) y_1 = 1, 2y_1 + y_2 = -4, -y_1 - 2y_2 + y_3 = 9.
Using forward substitution, i.e., working from top to bottom, we find
y = [1; -6; -2].
With y known, we may solve the linear system Ux = y for x; in this case the linear system is
2x_1 + x_2 + 3x_3 = 1, -3x_2 - 3x_3 = -6, 2x_3 = -2,
and back substitution yields the solution
x = [1/2; 3; -1].

To Find an LU Factorization: An Approach. In the first example of this section we computed the matrix L as a product of elementary matrices. This was somewhat calculation intensive; however, it is possible to determine L merely by performing the row reduction. We should stress that this only works for those matrices which do not require row interchanges to reduce A to row echelon form, as the LU factorization only exists in these situations. If this is the case, the entire row reduction process can be done using only elementary row operations of the form R_i - kR_j; we call k the multiplier. In the previous example we saw that the row operations R2 - 2R1, R3 + R1 = R3 - (-1)R1 and R3 + 2R2 = R3 - (-2)R2 were used to transform A into the upper triangular matrix U, and the multipliers became the entries below the diagonal of L in Example (0.29):
L = [1 0 0; 2 1 0; -1 -2 1].
Here L_21 = 2, L_31 = -1 and L_32 = -2; notice the relationship between the entries L_ij of L and the row operations R_i - kR_j.

Example 0.33. Q: Find an LU factorization of
A = [3 1 3 -4; 6 4 8 -10; 3 2 5 -1; -9 5 -2 -4].
A: Applying, in order, the row operations R2 - 2R1, R3 - R1, R4 - (-3)R1, R3 - (1/2)R2, R4 - 4R2 and R4 - (-1)R3 yields the upper triangular matrix
U = [3 1 3 -4; 0 2 2 -2; 0 0 1 4; 0 0 0 -4].
Looking at these row operations, we may read off the components of L: for R2 - 2R1 the multiplier is the entry L_21 = 2 of L, and similarly the next two row operations give L_31 = 1 and L_41 = -3, so that L takes the form
L = [1 0 0 0; 2 1 0 0; 1 * 1 0; -3 * * 1],
where the asterisks indicate entries of L not yet identified. Continuing to record the row operations, we see that L_32 = 1/2 and L_42 = 4. The last row operation implies L_43 = -1; thus
L = [1 0 0 0; 2 1 0 0; 1 1/2 1 0; -3 4 -1 1].
With U and L defined as above, we see that A = LU by direct calculation.

From this example we note that the elementary row operations R_i - kR_j must be performed from top to bottom within each column, and column by column from left to right; otherwise the components of L may be mixed up and we will not produce an LU factorization.

When the row echelon form of a matrix was introduced, we emphasized that this form is not unique for a given matrix. In the case of invertible matrices, however, the LU factorization - when it exists - is unique.

Theorem 0.34. If A is an invertible matrix that has an LU factorization, then L and U are unique.

Proof. Suppose A = LU and A = L'U' are two factorizations of A, so that LU = L'U', where L and L' are unit lower triangular and U and U' are upper triangular. L and L' are invertible, being unit triangular. As A is invertible, its reduced row echelon form is the identity matrix I by Theorem (0.23); since U = L^{-1}A, U must row reduce to I as well, and hence U is invertible by a second application of Theorem (0.23). The same argument applies to U'.
Thus
L^{-1}(LU)U'^{-1} = L^{-1}(L'U')U'^{-1}  =>  (L^{-1}L)(UU'^{-1}) = (L^{-1}L')(U'U'^{-1}),
so that
I(UU'^{-1}) = (L^{-1}L')I  =>  UU'^{-1} = L^{-1}L'.
However, L^{-1}L' is a unit lower triangular matrix and UU'^{-1} is an upper triangular matrix; if these two matrix products are equal then they must be both upper and lower triangular at once. Thus
L^{-1}L' = I, UU'^{-1} = I,
and it follows that L' = L and U' = U.

The P^t LU Factorization. The LU factorization is very helpful; however, it cannot help with systems of linear equations Ax = b where A requires row interchanges to be reduced to upper triangular form. Consider the matrix
A = [1 2 -1; 3 6 2; -1 1 4].
Upon row reduction we find
A -> B = [1 2 -1; 0 0 5; 0 3 3].
This is not an upper triangular matrix, but it would become one if we could interchange rows; doing so, we would find
U = [1 2 -1; 0 3 3; 0 0 5].
The elementary matrix related to this row interchange is
P = [1 0 0; 0 0 1; 0 1 0].
If we denote by E the product of the elementary matrices used to reduce PA to U, so that E^{-1} = L is a unit lower triangular matrix, then EPA = U and so
A = (EP)^{-1}U = P^{-1}E^{-1}U = P^{-1}LU.
This works in the case of a single row interchange; in general we construct all of the elementary matrices related to row interchanges, denote them P_i, and P will be the product P = P_k ... P_2 P_1. Such a matrix is called a permutation matrix. Remember that any permutation matrix arises from permuting the rows of the identity matrix in some order. For example, the following are all permutation matrices:
[0 1; 1 0], [0 1 0; 0 0 1; 1 0 0], [0 0 0 1; 0 0 1 0; 1 0 0 0; 0 1 0 0].
The inverse of a permutation matrix is very easily calculated.

Theorem 0.35. If P is a permutation matrix, then P^{-1} = P^t.

Proof. To show that P^t P = I, note that the i-th row of P^t is the i-th column of P, and so both are equal to the same standard unit vector e. Thus
(P^t P)_ii = (i-th row of P^t) . (i-th column of P) = e^t e = e . e = 1.
This shows that the diagonal entries of P^t P are 1s. Alternatively, if j is not equal to i, then the j-th column of P is a standard unit vector e' different from e, and so for any off-diagonal entry of P^t P we find
(P^t P)_ij = (i-th row of P^t) . (j-th column of P) = e^t e' = e . e' = 0.
As the diagonal entries are 1s and the off-diagonal entries are 0s, we conclude that P^t P = I.

Definition 0.36. Let A be a square matrix. A factorization of A as A = P^t LU, where P is a permutation matrix, L is a unit lower triangular matrix and U is an upper triangular matrix, is called a P^t LU factorization of A.

Example 0.37. Q: Find a P^t LU factorization of the matrix
A = [0 0 6; 1 2 3; 2 1 4].
A: To reduce A to row echelon form we apply the row operations R1 <-> R2, R3 - 2R1 and R2 <-> R3. As we have used two row interchanges, the permutation matrix is
P = P_2 P_1 = [1 0 0; 0 0 1; 0 1 0][0 1 0; 1 0 0; 0 0 1] = [0 1 0; 0 0 1; 1 0 0].
We now find an LU factorization of PA:
PA = [0 1 0; 0 0 1; 1 0 0][0 0 6; 1 2 3; 2 1 4] = [1 2 3; 2 1 4; 0 0 6].
This is put into upper triangular form by the row operation R2 - 2R1, implying that L_21 = 2, and so
A = P^t LU = [0 0 1; 1 0 0; 0 1 0][1 0 0; 2 1 0; 0 0 1][1 2 3; 0 -3 -2; 0 0 6].

We end this section with a helpful theorem, stated without proof.

Theorem 0.38. Every square matrix has a P^t LU factorization.

References

[1] D. Poole, Linear Algebra: A Modern Introduction - 3rd Edition, Brooks/Cole (2012).