Vectors and Matrices Continued

Remember that our goal is to write a system of algebraic equations as a matrix equation. Suppose we have the $n$ linear algebraic equations

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2\\
&\;\;\vdots\\
a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n &= b_i\\
&\;\;\vdots\\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n
\end{aligned}$$

where the $a_{ij}$, $b_i$ are given and the unknowns are $x_1, x_2, \ldots, x_n$. Clearly the right hand side is represented by a vector $\vec b = (b_1, b_2, \ldots, b_n)^T$, so the left hand side must also be a vector. If $A$ is the $n \times n$ matrix containing the $n^2$ coefficients $a_{ij}$, $i,j = 1,\ldots,n$, $\vec x$ is the $n$-vector of unknowns and $\vec b$ is the right hand side, then we write the system in matrix form as $A\vec x = \vec b$, where

$$A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & & \vdots\\
a_{i1} & a_{i2} & \cdots & a_{in}\\
\vdots & \vdots & & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix}.$$

The $i$th equation is just $a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n = b_i$, so the $i$th row of $A$ contains the coefficients (in order) of this equation. The $i$th component of the vector $A\vec x$ (i.e., the left hand side vector) is

$$(A\vec x)_i = \sum_{j=1}^n a_{ij}x_j.$$

We can also view our system using the columns of $A$, which we denote by $\vec a_1, \vec a_2, \ldots, \vec a_n$. Then solving the linear system is equivalent to finding $\vec x$ such that

$$x_1\vec a_1 + x_2\vec a_2 + x_3\vec a_3 + \cdots + x_n\vec a_n = \vec b.$$

Example. Form $A\vec x$ where

$$A = \begin{pmatrix} 2 & 1 & 1\\ 4 & -6 & 0\\ -2 & 7 & 2 \end{pmatrix}, \qquad \vec x = \begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix}.$$

Using matrix multiplication we have that $A\vec x$ is the vector

$$A\vec x = \begin{pmatrix} 2x_1 + x_2 + x_3\\ 4x_1 - 6x_2\\ -2x_1 + 7x_2 + 2x_3 \end{pmatrix}.$$

Example. Use your result in the previous example to write the linear system

$$\begin{aligned}
2x_1 + x_2 + x_3 &= 1\\
4x_1 - 6x_2 &= -2\\
-2x_1 + 7x_2 + 2x_3 &= 3
\end{aligned}$$

as a matrix equation $A\vec x = \vec b$; then write the equation in column form. Clearly we have

$$\begin{pmatrix} 2 & 1 & 1\\ 4 & -6 & 0\\ -2 & 7 & 2 \end{pmatrix} \begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix} = \begin{pmatrix} 1\\ -2\\ 3 \end{pmatrix},$$

so now we can see the reason that matrix multiplication was defined in this way. We can also view the linear system as finding $\vec x$ such that

$$x_1 \begin{pmatrix} 2\\ 4\\ -2 \end{pmatrix} + x_2 \begin{pmatrix} 1\\ -6\\ 7 \end{pmatrix} + x_3 \begin{pmatrix} 1\\ 0\\ 2 \end{pmatrix} = \begin{pmatrix} 1\\ -2\\ 3 \end{pmatrix};$$

that is, we seek a combination of the columns of $A$ (which are themselves vectors) which forms the right hand side. This viewpoint is especially useful when we talk about the solvability of $A\vec x = \vec b$.

For an $n\times n$ matrix $A$ we may be able to define an $n\times n$ matrix $B$ which has the property that $AB = BA = I$, where $I$ is the $n\times n$ identity matrix. If this holds, then $B$ is called the inverse of $A$. If such a matrix exists, then we denote it by $A^{-1}$. Please realize that this is notation; it does not mean $1/A$, because that doesn't make sense. If $A^{-1}$ exists then $AA^{-1} = A^{-1}A = I$. Why is $A^{-1}$ important? Because

$$A\vec x = \vec b \implies A^{-1}A\vec x = A^{-1}\vec b \implies I\vec x = A^{-1}\vec b \implies \vec x = A^{-1}\vec b.$$

So analytically, if we have $A^{-1}$ then we can obtain the solution by performing a matrix times vector operation to get $A^{-1}\vec b$. It turns out that computationally this is NOT an efficient way to find the solution; we will address this later.
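Below is a quick numerical check of this example, written as a minimal sketch assuming NumPy is available; `np.linalg.solve` and `np.linalg.inv` are the library routines, while the variable names are ours.

```python
import numpy as np

# Coefficient matrix and right hand side from the example above.
A = np.array([[ 2.0,  1.0, 1.0],
              [ 4.0, -6.0, 0.0],
              [-2.0,  7.0, 2.0]])
b = np.array([1.0, -2.0, 3.0])

x = np.linalg.solve(A, b)     # solve A x = b

# Row view: the ith component of A x is sum_j a_ij x_j.
print(np.allclose(A @ x, b))                          # True

# Column view: b is the combination x_1 a_1 + x_2 a_2 + x_3 a_3.
combo = x[0] * A[:, 0] + x[1] * A[:, 1] + x[2] * A[:, 2]
print(np.allclose(combo, b))                          # True

# Solving via the explicit inverse gives the same x, but this is
# the inefficient route the notes warn about.
print(np.allclose(np.linalg.inv(A) @ b, x))           # True
```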

A matrix whose inverse is its transpose is called orthogonal; that is, if $AA^T = A^TA = I$, then $A^{-1} = A^T$ and $A$ is orthogonal. Of course a matrix must be square to be orthogonal. This tells us that if $A$ is orthogonal then the linear system $A\vec x = \vec b$ is readily solved by forming $\vec x = A^T\vec b$; i.e., we only have to perform a matrix times vector operation.

Example. Let

$$A = \begin{pmatrix} 2 & 4\\ 1 & 3 \end{pmatrix}, \quad A^{-1} = \frac12\begin{pmatrix} 3 & -4\\ -1 & 2 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & -1\\ 0 & 2 \end{pmatrix}, \quad B^{-1} = \frac12\begin{pmatrix} 2 & 1\\ 0 & 1 \end{pmatrix}.$$

Demonstrate that $AA^{-1} = I$ and $BB^{-1} = I$; then form $AB$, its inverse, and the product $B^{-1}A^{-1}$.

For a $2\times 2$ matrix there is a trick that can be used to find the inverse:

$$A = \begin{pmatrix} a & b\\ c & d \end{pmatrix} \implies A^{-1} = \frac{1}{ad-bc}\begin{pmatrix} d & -b\\ -c & a \end{pmatrix}.$$

Now $AB$ and its inverse are given by

$$AB = \begin{pmatrix} 2 & 6\\ 1 & 5 \end{pmatrix}, \qquad (AB)^{-1} = \frac14\begin{pmatrix} 5 & -6\\ -1 & 2 \end{pmatrix},$$

and $B^{-1}A^{-1}$ is

$$B^{-1}A^{-1} = \frac14\begin{pmatrix} 5 & -6\\ -1 & 2 \end{pmatrix} = (AB)^{-1}.$$

In the last example we saw that the inverse of the product of two specific matrices is the product of their inverses with the order of multiplication reversed. This is true in general: if $AA^{-1} = I$ and $BB^{-1} = I$, then $(AB)^{-1} = B^{-1}A^{-1}$.
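As a sanity check on the $2\times 2$ example and on the rule $(AB)^{-1} = B^{-1}A^{-1}$, here is a small sketch, assuming NumPy; the matrices are the ones above.

```python
import numpy as np

# The 2x2 matrices from the example above.
A = np.array([[2.0, 4.0], [1.0, 3.0]])
B = np.array([[1.0, -1.0], [0.0, 2.0]])

Ainv = np.linalg.inv(A)    # should equal (1/2) [[3, -4], [-1, 2]]
Binv = np.linalg.inv(B)    # should equal (1/2) [[2, 1], [0, 1]]

print(np.allclose(A @ Ainv, np.eye(2)))                 # True
print(np.allclose(B @ Binv, np.eye(2)))                 # True

# The inverse of a product is the product of the inverses, reversed.
print(np.allclose(np.linalg.inv(A @ B), Binv @ Ainv))   # True
```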

The Matrix Form of Gauss Elimination (GE)

Now that we have written our linear system of equations as a matrix problem $A\vec x = \vec b$, we want to describe our GE algorithm in terms of matrix operations. First, recall an example that we did in the previous lecture:

$$\begin{array}{rcl} 2x + y + z &=& 5\\ 4x - 6y &=& -2\\ -2x + 7y + 2z &=& 9 \end{array}
\quad\rightarrow\quad
\begin{array}{rcl} 2x + y + z &=& 5\\ 0 - 8y - 2z &=& -12\\ 0 + 8y + 3z &=& 14 \end{array}
\quad\rightarrow\quad
\begin{array}{rcl} 2x + y + z &=& 5\\ -8y - 2z &=& -12\\ z &=& 2 \end{array}$$

Let's concentrate on the left hand side for now and write the coefficient matrices for the three systems above:

$$\begin{pmatrix} 2 & 1 & 1\\ 4 & -6 & 0\\ -2 & 7 & 2 \end{pmatrix}
\quad\rightarrow\quad
\begin{pmatrix} 2 & 1 & 1\\ 0 & -8 & -2\\ 0 & 8 & 3 \end{pmatrix}
\quad\rightarrow\quad
\begin{pmatrix} 2 & 1 & 1\\ 0 & -8 & -2\\ 0 & 0 & 1 \end{pmatrix}.$$

We first recognize that the last matrix is upper triangular, so the goal in GE is to convert the original system to an equivalent upper triangular system, because we saw that this type of system is easy to solve. What we want to do now is take the process we used to eliminate $x$ from the second and third equations in the first step of GE above and write it as a matrix times $A$; e.g., find $M_1$ such that

$$M_1 \begin{pmatrix} 2 & 1 & 1\\ 4 & -6 & 0\\ -2 & 7 & 2 \end{pmatrix} = \begin{pmatrix} 2 & 1 & 1\\ 0 & -8 & -2\\ 0 & 8 & 3 \end{pmatrix}.$$

Now we want $M_1$ as simple as possible. If we return to a previous example, we can guess what $M_1$ should be. We have

$$\begin{pmatrix} 1 & 0 & 0\\ -2 & 1 & 0\\ 1 & 0 & 1 \end{pmatrix} \begin{pmatrix} 2 & 1 & 1\\ 4 & -6 & 0\\ -2 & 7 & 2 \end{pmatrix} = \begin{pmatrix} 2 & 1 & 1\\ 0 & -8 & -2\\ 0 & 8 & 3 \end{pmatrix}.$$

Our $M_1$ is just a modified identity matrix: in the first column we have included the factors we multiplied the first equation by so that, when we added it to another equation, it removed the first variable. Note that $M_1$ is unit lower triangular. Our next step is to find $M_2$ such that

$$M_2 \begin{pmatrix} 2 & 1 & 1\\ 0 & -8 & -2\\ 0 & 8 & 3 \end{pmatrix} = \begin{pmatrix} 2 & 1 & 1\\ 0 & -8 & -2\\ 0 & 0 & 1 \end{pmatrix}.$$

Now $M_2$ must have the property that the first two rows remain unchanged, and we know that multiplication by the identity matrix doesn't change anything, so the first two rows of $M_2$ must be those of the identity matrix. We then modify the (3,2) entry of $M_2$ so that we introduce a zero into the (3,2) entry of the product. We have

$$\begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} 2 & 1 & 1\\ 0 & -8 & -2\\ 0 & 8 & 3 \end{pmatrix} = \begin{pmatrix} 2 & 1 & 1\\ 0 & -8 & -2\\ 0 & 0 & 1 \end{pmatrix}$$

because our multiplier in the equation was one. We have converted the original matrix $A$ to

$$M_2 M_1 A = U$$

where $U$ is an upper triangular matrix and each $M_i$ is unit lower triangular. We multiplied the left hand side of our equation first by $M_1$ and then by $M_2$, so we have to do the same thing to both sides of the equation:

$$M_2 M_1 A \vec x = M_2 M_1 \vec b.$$

Notice that the order here is important because, in general, matrix multiplication is not commutative. Now in our example, if we multiply our vector $\vec b$ by $M_1$ and then by $M_2$, we should get $(5, -12, 2)^T$. You should verify this.

If we have an $n\times n$ system then $M_1$ has the form

$$M_1 = \begin{pmatrix}
1 & 0 & 0 & \cdots & 0\\
-m_{21} & 1 & 0 & \cdots & 0\\
-m_{31} & 0 & 1 & \cdots & 0\\
\vdots & & & \ddots & \\
-m_{n1} & 0 & \cdots & 0 & 1
\end{pmatrix}
\quad\text{where}\quad
m_{21} = \frac{a_{21}}{a_{11}}, \quad m_{31} = \frac{a_{31}}{a_{11}}, \quad \ldots, \quad m_{i1} = \frac{a_{i1}}{a_{11}}.$$

Recall that $a_{11}$ is called our first pivot and must be nonzero for the algorithm to work. $M_1$ is a unit lower triangular matrix, but moreover it is the identity matrix modified to have nonzero entries only in the first column below the diagonal. $M_2$ is a unit lower triangular matrix, but moreover it is the identity matrix modified to have nonzero entries only in the second column below the diagonal. Note that the entries of $M_2$ in the second column below the diagonal are determined by the pivot in the modified matrix $A_2 = M_1 A$.

Exercise. If $A$ is the $10\times 10$ coefficient matrix for a linear system (with ten unknowns) that has a unique solution, what is the maximum number of matrices $M_i$ that we must determine to transform $A$ into an upper triangular matrix?

The matrices $M_i$ are called elementary matrices or Gauss transformation matrices. Because they differ from the identity matrix in only one column, their inverses can be easily determined. From our example above,

$$M_1 = \begin{pmatrix} 1 & 0 & 0\\ -2 & 1 & 0\\ 1 & 0 & 1 \end{pmatrix}, \quad
M_1^{-1} = \begin{pmatrix} 1 & 0 & 0\\ 2 & 1 & 0\\ -1 & 0 & 1 \end{pmatrix}
\qquad\text{and}\qquad
M_2 = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 1 & 1 \end{pmatrix}, \quad
M_2^{-1} = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & -1 & 1 \end{pmatrix}.$$

You should verify this by multiplication. So we can immediately obtain the inverse of $M_k$ by multiplying the entries in the $k$th column below the diagonal by $-1$. So the inverse of an elementary matrix is also a unit lower triangular matrix, and it differs from the identity by the entries in one column below the diagonal.
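The verification suggested above, that $M_2 M_1 \vec b = (5, -12, 2)^T$, can be automated. A minimal sketch assuming NumPy; the construction of $M_1$ and $M_2$ follows the recipe just described.

```python
import numpy as np

# Example above: A x = b with b = (5, -2, 9)^T.
A = np.array([[ 2.0,  1.0, 1.0],
              [ 4.0, -6.0, 0.0],
              [-2.0,  7.0, 2.0]])
b = np.array([5.0, -2.0, 9.0])

# First Gauss transformation: identity with -a_i1/a_11 below the diagonal.
M1 = np.eye(3)
M1[1:, 0] = -A[1:, 0] / A[0, 0]      # entries -2 and 1

A2 = M1 @ A                          # zeros introduced in column 1

# Second Gauss transformation: zero out the (3,2) entry of A2.
M2 = np.eye(3)
M2[2, 1] = -A2[2, 1] / A2[1, 1]      # entry 1 here

U = M2 @ A2
print(U)                 # upper triangular [[2,1,1],[0,-8,-2],[0,0,1]]
print(M2 @ M1 @ b)       # transformed right hand side (5, -12, 2)
```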

Example. Find the Gauss transformation matrices $M_1$ and $M_2$ which convert the given matrix to an upper triangular matrix; give this upper triangular matrix. If the right hand side of the corresponding linear system is $\vec b = (1, 0, -2)^T$, what is the resulting upper triangular system that needs to be solved? Here

$$A = \begin{pmatrix} 1 & 1 & 0\\ -3 & 4 & 7\\ 2 & 0 & 1 \end{pmatrix}.$$

We have

$$M_1 A = \begin{pmatrix} 1 & 0 & 0\\ 3 & 1 & 0\\ -2 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 & 0\\ -3 & 4 & 7\\ 2 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 0\\ 0 & 7 & 7\\ 0 & -2 & 1 \end{pmatrix}$$

and

$$M_2(M_1 A) = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 2/7 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 & 0\\ 0 & 7 & 7\\ 0 & -2 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 0\\ 0 & 7 & 7\\ 0 & 0 & 3 \end{pmatrix} = U.$$

The resulting system is then

$$\begin{pmatrix} 1 & 1 & 0\\ 0 & 7 & 7\\ 0 & 0 & 3 \end{pmatrix} \vec x = M_2 M_1 \begin{pmatrix} 1\\ 0\\ -2 \end{pmatrix} = M_2 \begin{pmatrix} 1\\ 3\\ -4 \end{pmatrix} = \begin{pmatrix} 1\\ 3\\ -22/7 \end{pmatrix}.$$

So far, we have constructed Gauss transformation matrices $M_k$ such that

$$M_q M_{q-1} \cdots M_2 M_1 A = U,$$

where $U$ is an upper triangular matrix. Oftentimes the notation $A_k$ denotes the matrix at the $k$th step, i.e.,

$$A_1 = A, \qquad A_2 = M_1 A, \qquad A_3 = M_2 M_1 A, \quad \ldots$$

Our system $A\vec x = \vec b$ becomes

$$M_q M_{q-1} \cdots M_2 M_1 A \vec x = M_q M_{q-1} \cdots M_2 M_1 \vec b \implies U\vec x = M_q M_{q-1} \cdots M_2 M_1 \vec b.$$

Because we can immediately write down the inverse of each $M_k$, we can write

$$A = \left[ (M_1)^{-1} (M_2)^{-1} \cdots (M_{q-1})^{-1} (M_q)^{-1} \right] U,$$

which says that $A$ can be written as a product of unit lower triangular matrices times an upper triangular matrix. If we can show that the product of unit lower triangular matrices is itself a unit lower triangular matrix, then we have shown that GE is equivalent to writing $A = LU$, where $L$ is a unit lower triangular matrix and $U$ is upper triangular.
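A numerical check of the worked example, again assuming NumPy; it also confirms the sign-flip rule for inverting an elementary matrix.

```python
import numpy as np

# The example above: reduce A and transform b = (1, 0, -2)^T.
A = np.array([[ 1.0, 1.0, 0.0],
              [-3.0, 4.0, 7.0],
              [ 2.0, 0.0, 1.0]])
b = np.array([1.0, 0.0, -2.0])

M1 = np.array([[ 1.0, 0.0, 0.0],
               [ 3.0, 1.0, 0.0],
               [-2.0, 0.0, 1.0]])
M2 = np.array([[1.0, 0.0,       0.0],
               [0.0, 1.0,       0.0],
               [0.0, 2.0 / 7.0, 1.0]])

U = M2 @ (M1 @ A)
print(U)             # [[1, 1, 0], [0, 7, 7], [0, 0, 3]]
print(M2 @ M1 @ b)   # (1, 3, -22/7)

# The inverse of each M_k just flips the sign of its below-diagonal column.
M1inv = np.eye(3)
M1inv[1:, 0] = -M1[1:, 0]
print(np.allclose(M1 @ M1inv, np.eye(3)))   # True
```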

We will see later how this interpretation of GE can lead to an algorithm which is equivalent to GE on paper but whose implementation can be much more efficient in some situations.

Example. Perform the product $M_1^{-1} M_2^{-1}$, where these are the matrices above:

$$\begin{pmatrix} 1 & 0 & 0\\ 2 & 1 & 0\\ -1 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & -1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0\\ 2 & 1 & 0\\ -1 & -1 & 1 \end{pmatrix}.$$

We notice that after multiplying these two matrices we see a definite pattern to the product. Is this true in general? Let's do the product $M_1^{-1} M_2^{-1}$ for general entries:

$$\begin{pmatrix} 1 & 0 & 0\\ \mu_{21} & 1 & 0\\ \mu_{31} & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & \nu_{32} & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0\\ \mu_{21} & 1 & 0\\ \mu_{31} & \nu_{32} & 1 \end{pmatrix}.$$

Of course this doesn't prove that it is true in general, but you can convince yourselves that a product of lower triangular matrices is lower triangular, and a product of unit lower triangular matrices is unit lower triangular.

Of course, if in the process of constructing $M_k$ we find a zero pivot, i.e., $a^{(k)}_{kk} = 0$, then the method fails because we are dividing by zero in our formula for the entries of $M_k$. Previously, we found that we could interchange the equations to eliminate a zero (or small) pivot. We now want to determine how interchanging rows of our matrix can be done by premultiplying by some other matrix. If you recall, we had an example where we premultiplied a matrix by a matrix which looked like the identity matrix but with the last two rows interchanged. Premultiplying by this matrix had the effect of interchanging the last two rows.

Permutation matrices are $n\times n$ matrices which are formed by interchanging rows (or columns) of the identity matrix. For example,

$$P = \begin{pmatrix} 0 & 0 & 1\\ 0 & 1 & 0\\ 1 & 0 & 0 \end{pmatrix}, \qquad Q = \begin{pmatrix} 0 & 1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 1 \end{pmatrix}$$

are permutation matrices. Premultiplying a $3\times 3$ matrix by $P$ interchanges the first and third rows; premultiplying a $3\times 3$ matrix by $Q$ interchanges the first and second rows. What does post-multiplying a matrix by a permutation matrix do?

If we include permutation matrices, our GE process can be written as

$$M_q P_q M_{q-1} P_{q-1} \cdots M_2 P_2 M_1 P_1 A = U.$$

Of course, if no row interchange is needed at step $k$ then $P_k$ is just the identity matrix. Inverting the product gives

$$A = \left[ (P_1)^{-1}(M_1)^{-1}(P_2)^{-1}(M_2)^{-1} \cdots (P_q)^{-1}(M_q)^{-1} \right] U,$$

and one can demonstrate that the row interchanges can be collected into a single permutation matrix and the Gauss transformations into a unit lower triangular matrix, so that we still have an LU factorization, now of the row-permuted matrix: $PA = LU$.
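The effect of $P$ and $Q$ is easy to see numerically; a small sketch assuming NumPy (the test matrix is arbitrary). The last line answers the post-multiplication question above.

```python
import numpy as np

# P swaps rows 1 and 3 of the identity; Q swaps rows 1 and 2.
P = np.array([[0, 0, 1],
              [0, 1, 0],
              [1, 0, 0]])
Q = np.array([[0, 1, 0],
              [1, 0, 0],
              [0, 0, 1]])

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

print(P @ A)   # rows 1 and 3 of A interchanged
print(Q @ A)   # rows 1 and 2 of A interchanged
print(A @ P)   # post-multiplying permutes the COLUMNS instead
```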

Back Solving

We have seen that GE can be viewed as converting our original problem $A\vec x = \vec b$ into an equivalent upper triangular system $U\vec x = \vec c$. In our examples, we saw that once we have an upper triangular system we can easily solve it by starting with the last variable and solving until we get to the first. This process is called a back solve. We can easily describe the algorithm for a back solve. First, let's do a specific example to remind ourselves.

Example. Solve the upper triangular system

$$\begin{pmatrix} 2 & 1 & 1\\ 0 & -8 & -2\\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix} = \begin{pmatrix} 5\\ -12\\ 2 \end{pmatrix}.$$

Clearly $x_3 = 2$; because $-8x_2 - 2x_3 = -12$ we have $-8x_2 = -8$, which implies $x_2 = 1$. Lastly, substituting our values for $x_2, x_3$ into $2x_1 + x_2 + x_3 = 5$ gives $2x_1 = 5 - 1 - 2 = 2$, or $x_1 = 1$.

Now consider the general upper triangular system

$$\begin{pmatrix}
u_{11} & u_{12} & u_{13} & u_{14} & \cdots & u_{1n}\\
0 & u_{22} & u_{23} & u_{24} & \cdots & u_{2n}\\
0 & 0 & u_{33} & u_{34} & \cdots & u_{3n}\\
\vdots & & & \ddots & & \vdots\\
0 & 0 & \cdots & 0 & u_{n-1,n-1} & u_{n-1,n}\\
0 & 0 & \cdots & 0 & 0 & u_{nn}
\end{pmatrix}
\begin{pmatrix} x_1\\ x_2\\ x_3\\ \vdots\\ x_{n-1}\\ x_n \end{pmatrix}
=
\begin{pmatrix} b_1\\ b_2\\ b_3\\ \vdots\\ b_{n-1}\\ b_n \end{pmatrix}.$$

To write the equations we just equate entries of the vectors on the right and left sides of the equation; recall that $U\vec x$ is itself a vector. We have

$$x_n = \frac{b_n}{u_{nn}}.$$

Then to obtain $x_{n-1}$ we equate the $(n-1)$st components,

$$u_{n-1,n-1}x_{n-1} + u_{n-1,n}x_n = b_{n-1} \implies x_{n-1} = \frac{b_{n-1} - u_{n-1,n}x_n}{u_{n-1,n-1}},$$

and for $x_{n-2}$ we equate the $(n-2)$nd components,

$$u_{n-2,n-2}x_{n-2} + u_{n-2,n-1}x_{n-1} + u_{n-2,n}x_n = b_{n-2} \implies x_{n-2} = \frac{b_{n-2} - u_{n-2,n-1}x_{n-1} - u_{n-2,n}x_n}{u_{n-2,n-2}}.$$

In general, we can find the $i$th component of $\vec x$ for $i < n$ by

$$x_i = \frac{b_i - \sum_{j=i+1}^n u_{ij}x_j}{u_{ii}}.$$
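This formula translates directly into code; a minimal sketch assuming NumPy (the pseudo-code version of the same algorithm follows below).

```python
import numpy as np

def back_solve(U, b):
    """Solve U x = b for upper triangular U by back substitution,
    using x_i = (b_i - sum_{j>i} u_ij x_j) / u_ii."""
    n = len(b)
    x = np.zeros(n)
    x[n - 1] = b[n - 1] / U[n - 1, n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = (b[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

# The example above: solution should be (1, 1, 2).
U = np.array([[2.0,  1.0,  1.0],
              [0.0, -8.0, -2.0],
              [0.0,  0.0,  1.0]])
b = np.array([5.0, -12.0, 2.0])
print(back_solve(U, b))   # [1. 1. 2.]
```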

We can write the general algorithm in the following manner using pseudo-code. Note that we want to write it in such a way that another person could take the algorithm and code it in the language of their choice.

Given an $n\times n$ upper triangular matrix $U$ with entries $u_{ij}$ and an $n$-vector $\vec b$ with components $b_i$, the solution of $U\vec x = \vec b$ is given by the following algorithm:

Set $x_n = b_n / u_{nn}$.
For $i = n-1, n-2, \ldots, 1$
    $x_i = \dfrac{b_i - \sum_{j=i+1}^n u_{ij}x_j}{u_{ii}}$

Triangular Factorization

We have seen that the process of GE essentially factors a matrix $A$ into $LU$. Now we want to see how this factorization allows us to solve linear systems, and why in many cases it is the preferred algorithm compared with GE. Remember, on paper these methods are the same, but computationally they can be different. First, suppose we want to solve $A\vec x = \vec b$ and we are given the factorization $A = LU$. It turns out that the system $LU\vec x = \vec b$ is easy to solve because we first do a forward solve $L\vec y = \vec b$ and then a back solve $U\vec x = \vec y$. We have seen that we can easily implement the equations for the back solve, and for homework you will write out the equations for the forward solve.

Example. If

$$A = \begin{pmatrix} 2 & -1 & 2\\ 4 & 1 & 9\\ 8 & 5 & 24 \end{pmatrix} = LU = \begin{pmatrix} 1 & 0 & 0\\ 2 & 1 & 0\\ 4 & 3 & 1 \end{pmatrix} \begin{pmatrix} 2 & -1 & 2\\ 0 & 3 & 5\\ 0 & 0 & 1 \end{pmatrix},$$

solve the linear system $A\vec x = \vec b$ where $\vec b = (0, 5, 16)^T$.

We first solve $L\vec y = \vec b$: $y_1 = 0$; $2y_1 + y_2 = 5$ implies $y_2 = 5$; and $4y_1 + 3y_2 + y_3 = 16$ implies $y_3 = 1$. Now we solve $U\vec x = \vec y = (0, 5, 1)^T$. Back solving yields $x_3 = 1$; $3x_2 + 5x_3 = 5$ implies $x_2 = 0$; and finally $2x_1 - x_2 + 2x_3 = 0$ implies $x_1 = -1$, giving the solution $(-1, 0, 1)^T$.

If GE and LU factorization are equivalent on paper, why would one be computationally advantageous in some settings? Recall that when we solve $A\vec x = \vec b$ by GE we must also multiply the right hand side by the Gauss transformation matrices. Often in applications, we have to solve many linear systems where the coefficient matrix is the same but the right hand side vector changes. If we have all of the right hand side vectors at one time, then we can treat them as a rectangular matrix and multiply this by the Gauss transformation matrices. However, in many instances we solve a single linear system and use its solution to compute a new right hand side; i.e., we don't have all the right hand sides at once.
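Here is the two-triangular-solve strategy applied to the factored example above, as a self-contained sketch assuming NumPy; `forward_solve` and `back_solve` are our own helper names, not library routines.

```python
import numpy as np

def forward_solve(L, b):
    """Solve L y = b for unit lower triangular L (forward substitution)."""
    n = len(b)
    y = np.zeros(n)
    for i in range(n):
        y[i] = b[i] - L[i, :i] @ y[:i]    # diagonal of L is 1
    return y

def back_solve(U, y):
    """Solve U x = y for upper triangular U (back substitution)."""
    n = len(y)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

# The factored example above: A = L U, b = (0, 5, 16)^T.
L = np.array([[1.0, 0.0, 0.0],
              [2.0, 1.0, 0.0],
              [4.0, 3.0, 1.0]])
U = np.array([[2.0, -1.0, 2.0],
              [0.0,  3.0, 5.0],
              [0.0,  0.0, 1.0]])
b = np.array([0.0, 5.0, 16.0])

y = forward_solve(L, b)    # (0, 5, 1)
x = back_solve(U, y)       # (-1, 0, 1)
print(y, x)
```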

When we perform an LU factorization, we overwrite the factors onto $A$, and if the right hand side changes, we simply do another forward and back solve to find the solution.

One can easily derive the equations for an LU factorization by writing $A = LU$ and equating entries. Consider the matrix equation $A = LU$ written as

$$\begin{pmatrix}
a_{11} & a_{12} & a_{13} & \cdots & a_{1n}\\
a_{21} & a_{22} & a_{23} & \cdots & a_{2n}\\
a_{31} & a_{32} & a_{33} & \cdots & a_{3n}\\
\vdots & & & & \vdots\\
a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nn}
\end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & \cdots & 0\\
l_{21} & 1 & 0 & \cdots & 0\\
l_{31} & l_{32} & 1 & \cdots & 0\\
\vdots & & & \ddots & \\
l_{n1} & l_{n2} & l_{n3} & \cdots & 1
\end{pmatrix}
\begin{pmatrix}
u_{11} & u_{12} & u_{13} & \cdots & u_{1n}\\
0 & u_{22} & u_{23} & \cdots & u_{2n}\\
0 & 0 & u_{33} & \cdots & u_{3n}\\
\vdots & & & \ddots & \vdots\\
0 & 0 & 0 & \cdots & u_{nn}
\end{pmatrix}.$$

Now equating the (1,1) entry gives $a_{11} = 1\cdot u_{11}$, i.e., $u_{11} = a_{11}$. In fact, if we equate each entry of the first row of $A$, i.e., $a_{1j}$, we get $u_{1j} = a_{1j}$ for $j = 1, \ldots, n$.

Now we move to the second row and notice that we can't find $u_{22}$ until we find $l_{21}$, so we look at the first column of $L$. For the (2,1) entry we get $a_{21} = l_{21}u_{11}$, which implies $l_{21} = a_{21}/u_{11}$. We can determine the remaining terms in the first column of $L$ by $l_{i1} = a_{i1}/u_{11}$ for $i = 2, \ldots, n$.

We now find the second row of $U$. Equating the (2,2) entry gives $a_{22} = l_{21}u_{12} + u_{22}$, which implies $u_{22} = a_{22} - l_{21}u_{12}$. In general, $u_{2j} = a_{2j} - l_{21}u_{1j}$ for $j = 2, \ldots, n$. Continuing in this manner, we get the following algorithm.

Let $A$ be a given $n\times n$ matrix. Then if no pivoting is needed, the LU factorization of $A$ into a unit lower triangular matrix $L$ with entries $l_{ij}$ and an upper triangular matrix $U$ with entries $u_{ij}$ is given by the following algorithm:

Set $u_{1j} = a_{1j}$ for $j = 1, \ldots, n$, and $l_{i1} = a_{i1}/u_{11}$ for $i = 2, \ldots, n$.
For $k = 2, 3, \ldots, n$
    for $j = k, \ldots, n$: $\quad u_{kj} = a_{kj} - \sum_{m=1}^{k-1} l_{km}u_{mj}$

    for $i = k+1, \ldots, n$: $\quad l_{ik} = \dfrac{a_{ik} - \sum_{m=1}^{k-1} l_{im}u_{mk}}{u_{kk}}$

Note that this algorithm clearly demonstrates that you can NOT find all of $L$ and then all of $U$, or vice versa. One must determine a row of $U$, then a column of $L$, then a row of $U$, etc.

Example. Perform an LU decomposition of

$$A = \begin{pmatrix} 2 & -1 & 2\\ 4 & 1 & 9\\ 8 & 5 & 24 \end{pmatrix}.$$

The result is given in a previous example and can be found directly by equating the elements of $A$ with the corresponding elements of $LU$.
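The factorization algorithm above translates almost line for line into code. A minimal sketch assuming NumPy and no pivoting; `lu_factor` is our own name, not a library routine. Applied to the matrix from the example, it reproduces the $L$ and $U$ given earlier.

```python
import numpy as np

def lu_factor(A):
    """LU factorization without pivoting, following the algorithm above:
    alternate between computing a row of U and a column of L."""
    n = A.shape[0]
    L = np.eye(n)
    U = np.zeros((n, n))
    U[0, :] = A[0, :]                      # first row of U
    L[1:, 0] = A[1:, 0] / U[0, 0]          # first column of L
    for k in range(1, n):                  # k = 2, ..., n in 1-based terms
        for j in range(k, n):              # row k of U
            U[k, j] = A[k, j] - L[k, :k] @ U[:k, j]
        for i in range(k + 1, n):          # column k of L
            L[i, k] = (A[i, k] - L[i, :k] @ U[:k, k]) / U[k, k]
    return L, U

A = np.array([[2.0, -1.0,  2.0],
              [4.0,  1.0,  9.0],
              [8.0,  5.0, 24.0]])
L, U = lu_factor(A)
print(L)                        # [[1,0,0],[2,1,0],[4,3,1]]
print(U)                        # [[2,-1,2],[0,3,5],[0,0,1]]
print(np.allclose(L @ U, A))    # True
```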