VII Selected Topics

Matrix Operations, Linear Programming, Number-Theoretic Algorithms, Polynomials and the FFT, Approximation Algorithms

28 Matrix Operations

We focus on how to multiply matrices and how to solve sets of simultaneous linear equations. First we show how to solve a set of linear equations using LUP decompositions. Then, we explore the close relationship between multiplying and inverting matrices. Finally, we discuss the class of symmetric positive-definite matrices and show how we can use them to find a least-squares solution to an overdetermined set of linear equations.

MAT-72006 AA+DS, Fall 2014, 13-Nov-14
Matrix inverses and ranks

If a matrix has an inverse, it is called invertible, or nonsingular.

The vectors x_1, x_2, ..., x_n are linearly dependent if there exist coefficients c_1, c_2, ..., c_n, not all of which are zero, such that c_1 x_1 + c_2 x_2 + ... + c_n x_n = 0. E.g., the row vectors x_1 = (1 2 3), x_2 = (2 6 4), and x_3 = (4 11 9) are linearly dependent, since 2x_1 + 3x_2 - 2x_3 = 0.

The column rank of a nonzero matrix A is the size of the largest set of linearly independent columns of A. For any matrix A, its row rank = its column rank, so that we can simply refer to the rank of A.

A square n x n matrix has full rank if its rank is n. An m x n matrix has full column rank if its rank is n.

An n x n matrix A is positive-definite if x'Ax > 0 for all n-vectors x != 0. E.g., the identity matrix I_n is positive-definite, since for any nonzero vector x = (x_1 x_2 ... x_n)',

x' I_n x = x'x = x_1^2 + x_2^2 + ... + x_n^2 > 0.
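The dependence relation and the rank in the example above are easy to verify numerically. A quick NumPy sketch (not part of the slides):

```python
import numpy as np

# The three row vectors from the linear-dependence example.
x1 = np.array([1, 2, 3])
x2 = np.array([2, 6, 4])
x3 = np.array([4, 11, 9])

# They are linearly dependent: 2*x1 + 3*x2 - 2*x3 = 0.
combo = 2 * x1 + 3 * x2 - 2 * x3
print(combo)  # [0 0 0]

# Equivalently, the matrix with these rows does not have full rank.
A = np.vstack([x1, x2, x3])
print(np.linalg.matrix_rank(A))  # 2
```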
Matrices that arise in applications are often positive-definite due to the following theorem.

Theorem D.6
For any matrix A with full column rank, the matrix A'A is positive-definite.

28.1 Solving systems of linear equations

Numerous applications need to solve sets of simultaneous linear equations. We can formulate a linear system as a matrix equation in which each matrix or vector element belongs to a field, typically the real numbers. How to solve a system of linear equations using a method called LUP decomposition?

We start with a set of n linear equations in n unknowns x_1, x_2, ..., x_n:
a_11 x_1 + a_12 x_2 + ... + a_1n x_n = b_1
a_21 x_1 + a_22 x_2 + ... + a_2n x_n = b_2
    ...
a_n1 x_1 + a_n2 x_2 + ... + a_nn x_n = b_n

A solution to these equations is a set of values for x_1, x_2, ..., x_n that satisfies all of the equations simultaneously. We treat only the case in which there are exactly n equations in n unknowns.

We can rewrite the equations as the matrix-vector equation

Ax = b,

letting A = (a_ij), x = (x_i), and b = (b_i). If A is nonsingular (= invertible), it possesses an inverse A^-1, and

x = A^-1 b

is the solution vector.
x = A^-1 b is the unique solution to the equation Ax = b. Suppose there were two solutions, x and x'. Then Ax = Ax' = b and, letting I denote an identity matrix,

x = Ix = (A^-1 A)x = A^-1 (Ax) = A^-1 (Ax') = (A^-1 A)x' = x'.

We are predominantly concerned with the case in which A is nonsingular or, equivalently, the rank of A is equal to the number n of unknowns. If the number of equations is less than n or, more generally, if the rank of A is less than n, then the system is underdetermined. Such a system typically has infinitely many solutions, although it may have no solutions at all if the equations are inconsistent. If the number of equations exceeds the number n of unknowns, the system is overdetermined, and there may not exist any solutions.
Later we address the problem of finding good approximate solutions to overdetermined systems of linear equations.

Returning to the problem of solving the system Ax = b of n equations in n unknowns: we could compute A^-1 and then multiply b by it, yielding x = A^-1 b. This approach suffers in practice from numerical instability. Fortunately, another approach, LUP decomposition, is numerically stable and has the further advantage of being faster in practice.

Overview of LUP decomposition

The idea behind LUP decomposition is to find three n x n matrices L, U, and P such that

PA = LU,

where L is a unit lower-triangular matrix, U is an upper-triangular matrix, and P is a permutation matrix (it has exactly one 1 in each row and column). We call matrices L, U, and P satisfying this equation an LUP decomposition of the matrix A. Every nonsingular matrix possesses such a decomposition.
Computing an LUP decomposition for A has the advantage that we can more easily solve linear systems when they are triangular, as is the case for both matrices L and U. Once we have found an LUP decomposition for A, we can solve the equation Ax = b by solving only triangular linear systems, as follows. Multiplying both sides of Ax = b by P yields the equivalent equation PAx = Pb, which, by Exercise D.1-4, amounts to permuting the original linear equations. Using our decomposition, we obtain

LUx = Pb.

We can now solve this equation by solving two triangular linear systems. Let us define y = Ux, where x is the desired solution vector. First, we solve the lower-triangular system

Ly = Pb

for the unknown vector y by a method called forward substitution. Having solved for y, we then solve the upper-triangular system

Ux = y

for the unknown x by back substitution.
Because the permutation matrix P is invertible (Exercise D.2-3), multiplying both sides of PA = LU by P^-1 gives A = P^-1 LU, so that

Ax = P^-1 LUx = P^-1 Ly = P^-1 Pb = b.

Hence, the vector x is our solution to Ax = b. We still need to show how forward and back substitution work.

Forward and back substitution

Forward substitution can solve the lower-triangular system Ly = Pb in Theta(n^2) time, given L, P, and b. For convenience, we represent the permutation P compactly by an array pi[1..n]. The entry pi[i] indicates that P[i, pi[i]] = 1 and P[i, j] = 0 for j != pi[i]. Thus, PA has a_{pi[i], j} in row i and column j, and Pb has b_{pi[i]} as its ith element.
Since L is unit lower-triangular, we can rewrite Ly = Pb as

y_1                                 = b_{pi[1]}
l_21 y_1 + y_2                      = b_{pi[2]}
l_31 y_1 + l_32 y_2 + y_3           = b_{pi[3]}
    ...
l_n1 y_1 + l_n2 y_2 + ... + y_n     = b_{pi[n]}

The first equation tells us that y_1 = b_{pi[1]}. Knowing the value of y_1, we can substitute it into the second equation, yielding

y_2 = b_{pi[2]} - l_21 y_1,

and so forth. In general, we substitute y_1, y_2, ..., y_{i-1} forward into the ith equation to solve for y_i:

y_i = b_{pi[i]} - sum_{j=1}^{i-1} l_ij y_j.

Having solved for y, we solve for x in Ux = y using back substitution, which is similar to forward substitution. Here, we solve the nth equation first and work backward to the first equation. Like forward substitution, this process runs in Theta(n^2) time.
Given,,, and, LUP-SOLVE solves for by combining forward and back substitution Assume that the dimension appears in the attribute and that the permutation matrix is represented by the array LUP-SOLVE ) 1. 2. let be a new vector of length 3. for =1to 4. 5. for downto 1 6. )/ 7. return MAT-72006 AA+DS, Fall 2014 13-Nov-14 710 Consider the system of linear equations 1 2 0 3 = 3 4 4 = 7 5 6 3 8 We wish to solve for the unknown The LUP decomposition is 1 0 0 = 0.2 1 0 = 0.6 0.5 1 = 0 0 1 1 0 0 0 1 0 5 6 3 0 0.8 0.6 0 0 2.5 MAT-72006 AA+DS, Fall 2014 13-Nov-14 711 10
Let us verify that PA = LU:

     ( 0 0 1 ) ( 1 2 0 )   ( 5 6 3 )
PA = ( 1 0 0 ) ( 3 4 4 ) = ( 1 2 0 )
     ( 0 1 0 ) ( 5 6 3 )   ( 3 4 4 )

     ( 1   0   0 ) ( 5 6    3  )   ( 5 6 3 )
LU = ( 0.2 1   0 ) ( 0 0.8 -0.6 ) = ( 1 2 0 )
     ( 0.6 0.5 1 ) ( 0 0   2.5 )   ( 3 4 4 )

By forward substitution, we solve Ly = Pb for y:

( 1   0   0 )       ( 8 )
( 0.2 1   0 ) y  =  ( 3 )
( 0.6 0.5 1 )       ( 7 )

obtaining

    (  8  )
y = ( 1.4 )
    ( 1.5 )

by computing first y_1, then y_2, and finally y_3:

y_1 = 8
y_2 = 3 - 0.2 * 8 = 3 - 1.6 = 1.4
y_3 = 7 - (0.6 * 8 + 0.5 * 1.4) = 7 - (4.8 + 0.7) = 7 - 5.5 = 1.5
Using back substitution, we solve Ux = y for x:

( 5 6    3  )       (  8  )
( 0 0.8 -0.6 ) x =  ( 1.4 )
( 0 0   2.5 )       ( 1.5 )

thereby obtaining the desired answer

    ( -1.4 )
x = (  2.2 )
    (  0.6 )

by computing first x_3, then x_2, and finally x_1.

28.2 Inverting matrices

We do not generally use matrix inverses to solve systems of linear equations; rather, we use more numerically stable techniques such as LUP decomposition. We show how to use LUP decomposition to compute a matrix inverse. Matrix multiplication and computing the inverse are equivalently hard problems (subject to technical conditions): we can use an algorithm for one to solve the other in the same asymptotic running time.
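As a cross-check on the worked example (a NumPy sketch, not part of the slides), solving the original system directly gives the same answer:

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [3.0, 4.0, 4.0],
              [5.0, 6.0, 3.0]])
b = np.array([3.0, 7.0, 8.0])

x = np.linalg.solve(A, b)
print(x)  # approximately [-1.4  2.2  0.6]
```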
Computing a matrix inverse from an LUP decomposition

Suppose that we have an LUP decomposition of a matrix A: three matrices L, U, and P such that PA = LU. Using LUP-SOLVE, we can solve an equation of the form Ax = b in time Theta(n^2). Since the LUP decomposition depends on A but not on b, we can run LUP-SOLVE on a second set of equations of the form Ax = b' in additional time Theta(n^2). Once we have the LUP decomposition of A, we can solve, in time Theta(kn^2), k versions of the equation Ax = b that differ only in b.

Think of the equation

AX = I_n,

which defines X, the inverse of A, as a set of n distinct equations of the form Ax = b. Let X_i denote the ith column of X, and recall that the unit vector e_i is the ith column of I_n. We can then solve the equation AX = I_n for X by using the LUP decomposition for A to solve each equation

AX_i = e_i

separately for X_i. Once we have the LUP decomposition of A, we can compute X from it in time Theta(n^3). Hence we can compute the inverse of a matrix in time Theta(n^3).
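A minimal NumPy sketch of this column-by-column idea (np.linalg.solve stands in for LUP-SOLVE, since the factor-once-and-reuse machinery is not shown here):

```python
import numpy as np

def inverse_by_columns(A):
    """Compute A^-1 by solving A X_i = e_i for each column e_i of I_n."""
    n = A.shape[0]
    I = np.eye(n)
    # Each solve corresponds to one run of LUP-SOLVE in the slides' scheme.
    columns = [np.linalg.solve(A, I[:, i]) for i in range(n)]
    return np.column_stack(columns)

A = np.array([[1.0, 2.0, 0.0],
              [3.0, 4.0, 4.0],
              [5.0, 6.0, 3.0]])
X = inverse_by_columns(A)
print(np.allclose(A @ X, np.eye(3)))  # True
```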
Matrix multiplication and matrix inversion

The speedups obtained for matrix multiplication translate to speedups for matrix inversion. In fact, matrix inversion is equivalent to matrix multiplication, in the following sense. If M(n) denotes the time to multiply two n x n matrices, then we can invert a nonsingular n x n matrix in time O(M(n)). Moreover, if I(n) denotes the time to invert a nonsingular n x n matrix, then we can multiply two n x n matrices in time O(I(n)).

Theorem 28.1 (Multiplication is no harder than inversion)
If we can invert an n x n matrix in time I(n), where I(n) = Omega(n^2) and I(n) satisfies the regularity condition I(3n) = O(I(n)), then we can multiply two n x n matrices in time O(I(n)).

Proof: Let A and B be n x n matrices whose matrix product C we wish to compute. We define the 3n x 3n matrix D by

    ( I_n  A    0  )
D = ( 0    I_n  B  )
    ( 0    0   I_n )
The inverse of D is

       ( I_n  -A   AB )
D^-1 = ( 0    I_n  -B )
       ( 0    0   I_n )

and thus we can compute the product AB by taking the upper right n x n submatrix of D^-1. We can construct matrix D in Theta(n^2) time, which is O(I(n)) because we assume that I(n) = Omega(n^2), and we can invert D in O(I(3n)) = O(I(n)) time, by the regularity condition on I(n). We thus have M(n) = O(I(n)).

I(n) satisfies the regularity condition whenever I(n) = Theta(n^c lg^d n) for any constants c > 0 and d >= 0.

The following proof relies on some properties of symmetric positive-definite matrices proved later.

Theorem 28.2 (Inversion is no harder than multiplication)
Suppose we can multiply two n x n real matrices in time M(n), where M(n) = Omega(n^2) and M(n) satisfies the two regularity conditions M(n + k) = O(M(n)) for any k in the range 0 <= k <= n, and M(n/2) <= cM(n) for some constant c < 1/2. Then we can compute the inverse of any real nonsingular n x n matrix in time O(M(n)).
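The construction in the proof of Theorem 28.1 is easy to check numerically. This NumPy sketch builds D for random A and B and reads the product AB off the upper-right block of D^-1:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
I = np.eye(n)
Z = np.zeros((n, n))

# The 3n x 3n matrix D from the proof of Theorem 28.1.
D = np.block([[I, A, Z],
              [Z, I, B],
              [Z, Z, I]])

# Its inverse has -A, -B on the superdiagonal blocks and AB upper right.
D_inv = np.linalg.inv(D)
upper_right = D_inv[:n, 2 * n:]
print(np.allclose(upper_right, A @ B))  # True
```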
28.3 Symmetric positive-definite matrices and least-squares approximation

Lemma 28.3
Any positive-definite matrix is nonsingular.

Proof: Suppose that a matrix A is singular. Then by Corollary D.3, there exists a nonzero vector x such that Ax = 0. Hence, x'Ax = 0, and A cannot be positive-definite.

Least-squares approximation

One application of symmetric positive-definite matrices arises in fitting curves to given sets of data points. We are given a set of m data points

(x_1, y_1), (x_2, y_2), ..., (x_m, y_m),

where we know that the y_i are subject to measurement errors. We would like to determine a function F(x) such that the approximation errors

eta_i = F(x_i) - y_i

are small for i = 1, 2, ..., m.
The form of the function F depends on the problem at hand. Here, we assume that it has the form of a linearly weighted sum,

F(x) = sum_{j=1}^{n} c_j f_j(x),

where the number of summands n and the specific basis functions f_j are chosen based on knowledge of the problem at hand. A common choice is f_j(x) = x^{j-1}, which means that F is a polynomial of degree n - 1 in x.

Thus, given m data points (x_1, y_1), (x_2, y_2), ..., (x_m, y_m), we wish to calculate n coefficients c_1, c_2, ..., c_n that minimize the approximation errors eta_1, eta_2, ..., eta_m. Choosing n = m, we can calculate each y_i exactly. Such a high-degree F fits the noise as well as the data, however, and gives poor results when used to predict y for unseen values of x. We choose n << m and hope that by choosing the coefficients well, we can obtain a function F that finds the significant patterns in the data points without paying undue attention to the noise.
Once we choose a value of n, we end up with an overdetermined set of equations whose solution we wish to approximate. Let

A = (f_j(x_i))

denote the m x n matrix of values of the basis functions at the given points; that is, a_ij = f_j(x_i). The desired n-vector of coefficients is c = (c_1, c_2, ..., c_n)'.
This gives us a least-squares solution, since Because = = = we can minimize by differentiating with respect to each and then setting the result to 0: MAT-72006 AA+DS, Fall 2014 13-Nov-14 728 = 2 =0 The equations for = 1,2,, are equivalent to the single matrix equation =0 or, equivalently (using Exercise D.1-2), to =0 which implies In statistics, this is called the normal equation MAT-72006 AA+DS, Fall 2014 13-Nov-14 729 19
The matrix is symmetric by Exercise D.1-2, and if has full column rank, then by Theorem D.6, is positive-denite as well Hence, exists, and the solution to the previous equation is = where the matrix = is the pseudoinverse of the matrix The pseudoinverse naturally generalizes the notion of a matrix inverse to the case in which is not square MAT-72006 AA+DS, Fall 2014 13-Nov-14 730 Equation is the approximate solution to Whereas the solution is the exact solution to = MAT-72006 AA+DS, Fall 2014 13-Nov-14 731 20
As an example of producing a least-squares fit, suppose that we have five data points

(x_1, y_1) = (-1, 2)
(x_2, y_2) = (1, 1)
(x_3, y_3) = (2, 1)
(x_4, y_4) = (3, 0)
(x_5, y_5) = (5, 3),

shown as black dots in the next figure. We wish to fit these points with a quadratic polynomial

F(x) = c_1 + c_2 x + c_3 x^2.

[Figure: the five data points and the best-fit quadratic.]
We start with the matrix of basis-function values

    ( 1  x_1  x_1^2 )   ( 1 -1  1  )
    ( 1  x_2  x_2^2 )   ( 1  1  1  )
A = ( 1  x_3  x_3^2 ) = ( 1  2  4  )
    ( 1  x_4  x_4^2 )   ( 1  3  9  )
    ( 1  x_5  x_5^2 )   ( 1  5  25 )

whose pseudoinverse is

     (  0.500  0.300  0.200  0.100 -0.100 )
A+ = ( -0.388  0.093  0.190  0.193 -0.088 )
     (  0.060 -0.036 -0.048 -0.036  0.060 )

Multiplying y by A+, we obtain the coefficient vector

    (  1.200 )
c = ( -0.757 )
    (  0.214 )

which corresponds to the quadratic polynomial

F(x) = 1.200 - 0.757x + 0.214x^2

as the closest-fitting quadratic to the given data, in a least-squares sense.
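The worked example is easy to reproduce with NumPy. This sketch rebuilds A from the five data points and recovers the coefficients from the normal equation:

```python
import numpy as np

x = np.array([-1.0, 1.0, 2.0, 3.0, 5.0])
y = np.array([2.0, 1.0, 1.0, 0.0, 3.0])

# Basis-function matrix for a quadratic: columns 1, x, x^2.
A = np.column_stack([np.ones_like(x), x, x**2])

# c = A+ y via the normal equation.
c = np.linalg.inv(A.T @ A) @ A.T @ y
print(np.round(c, 3))  # approximately [ 1.2  -0.757  0.214]
```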