MA 0540 fall 2013, Row operations on matrices

December 2, 2013

This is all about $m$ by $n$ matrices with entries in $F$ (the real or complex numbers). If $A$ is such a matrix then $A$ corresponds to a linear map $F^n \to F^m$, which we will denote by $T_A$. The map is defined by matrix multiplication: $T_A(X) = AX$, where the vector $X \in F^n$ is thought of as an $n$ by $1$ matrix and the $m$ by $1$ matrix $AX$ is thought of as a vector in $F^m$.

At times I will view the matrix as a list of $m$ vectors $R_1, \dots, R_m$, all in $F^n$ (the rows). At other times I will view it as a list of $n$ vectors $C_1, \dots, C_n$, all in $F^m$ (the columns).

1 The rank of a matrix

1.1 The column space of a matrix

The subspace of $F^m$ spanned by the columns of $A$ is called the column space of $A$, and the dimension of this subspace is called the column rank of $A$. The column space is the same as the range of $T_A$, because $T_A$ takes the vector $(x_1, \dots, x_n)$ to $x_1 C_1 + \dots + x_n C_n$. Thus the column rank of the matrix $A$ is the dimension of the range of the linear map $T_A$. Of course, the column rank is no bigger than $m$, and it is equal to $m$ if and only if $T_A$ is surjective.

1.2 The row space of a matrix

The subspace of $F^n$ spanned by the rows of $A$ is called the row space of $A$, and its dimension is called the row rank of $A$.
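As a small numerical check of the statement that $T_A$ takes $(x_1, \dots, x_n)$ to $x_1 C_1 + \dots + x_n C_n$, here is a minimal Python sketch. The function names `apply_matrix` and `combine_columns` and the sample matrix are illustrative choices, not part of the notes; a matrix is assumed to be stored as a list of its rows.

```python
def apply_matrix(A, X):
    """Return AX, computed row by row, for an m-by-n matrix A and a vector X of length n."""
    return [sum(a * x for a, x in zip(row, X)) for row in A]

def combine_columns(A, X):
    """Return x_1 C_1 + ... + x_n C_n, where C_j is the j-th column of A."""
    m, n = len(A), len(A[0])
    result = [0] * m
    for j in range(n):          # add x_j times the j-th column
        for i in range(m):
            result[i] += X[j] * A[i][j]
    return result

A = [[1, 2, 0],
     [0, 1, 3]]                 # a 2-by-3 matrix, so T_A maps F^3 to F^2
X = [2, -1, 1]

print(apply_matrix(A, X))       # [0, 2]
print(combine_columns(A, X))    # [0, 2]
```

Both computations give the same vector, which is why the range of $T_A$ is exactly the span of the columns.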

1.3 Equality of row rank and column rank

In fact we know that the dimension of the column space of $A$ is always the same as the dimension of the row space. We proved this before by using the nullspace of the linear map $F^m \to F^n$ that corresponds to the transpose matrix. We will see a different proof below.

2 Row operations

How can we determine the rank of a matrix $A$? How can we find a basis for the nullspace of $T_A$? Row operations give answers to these and other questions.

2.1 The operations

For this discussion, think of an $m$ by $n$ matrix as a list of its rows $R_i$. There are three types of things we can do to a matrix that are called row operations.

1. Interchange $R_i$ with $R_j$ for some $i$ and $j \neq i$.
2. For some $i$, replace $R_i$ by $cR_i$ for some scalar $c \neq 0$. Leave the other rows unchanged.
3. For some $i$ and $j \neq i$, replace $R_i$ by $R_i + cR_j$ for some scalar $c$. Leave the rows other than $R_i$ unchanged.

We say that two $m$ by $n$ matrices $A$ and $B$ are row-equivalent if it is possible to change $A$ into $B$ by a sequence of row operations.

It is clear that row operations can be reversed: if some row operation turns $A$ into $B$ then some row operation turns $B$ into $A$. For example, if you get $B$ by replacing the $i$th row $R_i$ by $R_i' = R_i + cR_j$ then you can recover $A$ from $B$ by replacing the new $i$th row $R_i'$ by $R_i' + (-c)R_j = R_i$.

Here is one way to look at this: any row operation applies a certain linear operator to every column of the matrix. If the columns of the matrix $A$ are $C_1, \dots, C_n$ then the columns of the new matrix are $PC_1, \dots, PC_n$, where $P$ is a certain $m$ by $m$ matrix. (The new matrix is $PA$.) For any row operation of any of the three types, there is some $m$ by $m$ matrix $P$ such that this is true. The matrix $P$ is always invertible, because the row operation can be reversed.

2.2 The effect of row operations on the row space and column space of $A$

If $A$ and $B$ are row-equivalent, then they have the same row space.
In fact, when a matrix is altered by a row operation then every new row belongs to the span of the old rows, so the new row space is contained in the old row space; and since row operations can be reversed it is also true that the old row space is contained in the new row space. Therefore the new row space and the old row space are the same. It follows, of course, that the row rank of a matrix does not change when we perform row operations on it.

Row operations can change the column space of a matrix, but nevertheless they do not change its dimension. To see this, suppose that a row operation corresponds to the invertible matrix $P$ as above. The column space of $PA$ has the same dimension as the column space of $A$ because the one subspace of $F^m$ is obtained from the other by applying an invertible operator $T_P$ to it. So, just like the row rank, the column rank of $A$ is unaltered by row operations.

3 Echelon matrices

We call an $m$ by $n$ matrix $A$ an echelon matrix if it has the following form: for some number $r$ (an integer satisfying $0 \le r \le m$) there are numbers $c(1), \dots, c(r)$ satisfying $1 \le c(1) < \dots < c(r) \le n$ such that:

- When $i > r$ then $a_{i,j} = 0$ for all $j$.
- For every $i$ from $1$ to $r$ we have $a_{i,c(i)} = 1$.
- For every $i$ from $1$ to $r$ and every $j < c(i)$ we have $a_{i,j} = 0$.
- For every $i$ from $1$ to $r$ and every $k < i$ we have $a_{k,c(i)} = 0$.

In other words, $R_i = 0$ if $i > r$; each row after the first $r$ rows is entirely zero. If $i \le r$ then in the row $R_i$ the first nonzero entry is a $1$, and it occurs in column number $c(i)$, where $c(i)$ is an increasing function of $i$. In the column $C_{c(i)}$ everything above the $1$ in row $i$ is a zero.

Note that inside this matrix there is an $r$ by $r$ submatrix that is an identity matrix; it is in the rows numbered $1, 2, \dots, r$ and the columns numbered $c(1), c(2), \dots, c(r)$.
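The conditions in the definition of an echelon matrix can be checked mechanically. The following Python sketch does so for a matrix stored as a list of rows; the function name `is_echelon` and the convention of returning the list $c(1), \dots, c(r)$ are assumptions made for this illustration only.

```python
def is_echelon(A):
    """Check the echelon conditions for a matrix given as a list of rows.

    Returns the list [c(1), ..., c(r)] of columns holding the leading 1s
    (numbered from 1, as in the text) if A is an echelon matrix, else None.
    """
    pivots = []
    seen_zero_row = False
    for row in A:
        nonzero = [j for j, entry in enumerate(row) if entry != 0]
        if not nonzero:
            seen_zero_row = True           # every later row must also be zero
            continue
        if seen_zero_row:
            return None                    # a nonzero row below a zero row
        c = nonzero[0]                     # position of the first nonzero entry
        if row[c] != 1:
            return None                    # the first nonzero entry must be 1
        if pivots and c + 1 <= pivots[-1]:
            return None                    # c(i) must be strictly increasing
        if any(A[k][c] != 0 for k in range(len(pivots))):
            return None                    # entries above the leading 1 must be 0
        pivots.append(c + 1)
    return pivots

E = [[1, 2, 0, 3],
     [0, 0, 1, 4],
     [0, 0, 0, 0]]
print(is_echelon(E))                       # [1, 3]
print(is_echelon([[1, 2], [0, 1]]))        # None
```

In the second example the entry above the leading 1 of the second row is 2, so the last condition fails. Note that the zero matrix yields the empty list ($r = 0$), so the result should be compared with `None` rather than tested for truth.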

3.1 Rank of an echelon matrix

We now show directly that for an echelon matrix both the row rank and the column rank are equal to the number called $r$ above, the number of rows that are not entirely made of zeroes.

To see that the row rank is $r$, it suffices to show that the first $r$ rows are linearly independent. For this, suppose we have a linear relation $x_1 R_1 + \dots + x_r R_r = 0$, i.e. a linear relation

$$x_1 a_{1,j} + x_2 a_{2,j} + \dots + x_r a_{r,j} = 0$$

valid for every $j$ from $1$ to $n$. For any $i$ from $1$ to $r$ we may see that $x_i = 0$ by taking $j$ to be $c(i)$: this yields the equation

$$x_1 a_{1,c(i)} + x_2 a_{2,c(i)} + \dots + x_r a_{r,c(i)} = 0,$$

which tells us that $x_i = 0$, since the only one of $a_{1,c(i)}, \dots, a_{r,c(i)}$ that is not zero is $a_{i,c(i)} = 1$.

To see that the column rank of an echelon matrix is also $r$ we can observe that the column space is the $r$-dimensional subspace of $F^m$ consisting of all vectors $(x_1, \dots, x_m)$ such that for every $i > r$ the number $x_i$ is zero. It is contained in that subspace because each column $C_j$ is in that subspace (i.e. the numbers $a_{i,j}$ are all zero for $i > r$), and it is all of that subspace because the columns $C_{c(1)}, \dots, C_{c(r)}$ are the obvious basis of that subspace.

3.2 Every matrix is row-equivalent to some echelon matrix

Here is a procedure ("row reduction" of a matrix) for doing row operations on a matrix to put it in echelon form.

Look at the first column. If it is entirely zero, then ignore it and go to work on the remaining $m$ by $n-1$ matrix. If some row operations turn that submatrix into an echelon matrix, say $E$, then these same operations will also turn $A$ into an echelon matrix ($E$ with one extra column of zeroes on the left).

If the first column is not entirely zero, then arrange for the upper left entry $a_{1,1}$ to be not $0$, by interchanging the first row with some other row if necessary. Now that $a_{1,1} \neq 0$, multiply the first row by $1/a_{1,1}$. So in the new matrix $a_{1,1} = 1$. Now for each $i \neq 1$ do a row operation to make $a_{i,1}$ into $0$. (Replace $R_i$ by $R_i - a_{i,1} R_1$.) At this point the first column has $1$ at the top and all zeroes below.

If $m = 1$ then we are done: this $1$ by $n$ matrix is in echelon form. Also if $n = 1$ then we are done: this $m$ by $1$ matrix is in echelon form. So assume $m > 1$ and $n > 1$. Temporarily ignore the first column and the first row and look at the remaining $m-1$ by $n-1$ matrix. Do row operations on it until it is in echelon form. These same operations performed on the original matrix will not change the first row, and they will not affect the first column either (because $a_{i,1} = 0$ for $i > 1$).

At this point our $m$ by $n$ matrix is almost in echelon form. The only problem is in the first row: we need $a_{1,c(i)}$ to be zero for each $i > 1$. We can arrange this by subtracting a multiple of $R_i$ from $R_1$, for each $i$ from $2$ to $r$.

3.3 Row rank equals column rank

We have already seen that the row and column rank of a matrix do not change when it is subjected to row operations. So here is a new proof that column rank equals row rank: for any matrix $A$ there is a row-equivalent echelon matrix $E$, and we can say

$$\mathrm{rowrank}(A) = \mathrm{rowrank}(E) = \mathrm{columnrank}(E) = \mathrm{columnrank}(A).$$

4 The case of square matrices

The biggest possible rank of an $m$ by $m$ matrix is $m$, and the rank $m$ matrices are the invertible matrices. The only $m$ by $m$ echelon matrix of rank $m$ is the identity matrix. Therefore every invertible matrix is row-equivalent to the identity. One consequence of this is that the inverse of $A$ can be found by row operations, as follows.

4.1 Inverting a matrix by row operations

Write down $A$ next to $I$ as an $m$ by $2m$ matrix $(A, I)$. Perform row operations on this in such a way as to turn the left half ($A$) into $I$. Each row operation is left multiplication by some invertible matrix $P$. So for some sequence of invertible matrices $P_k$ we are getting

$$(A, I) \to (P_1 A, P_1) \to (P_2 P_1 A, P_2 P_1) \to \dots \to (P_r \cdots P_2 P_1 A, \; P_r \cdots P_2 P_1).$$

At the end, if $A$ has turned into the identity then $P_r \cdots P_2 P_1$ is the inverse of $A$; we end up with $(I, A^{-1})$.

In other words, if $A$ is in fact invertible then you can find its inverse by performing row operations on $A$ to turn it into $I$, and as you go along simply performing the same operations on $I$. If $A$ is not invertible then you will discover this when the echelon matrix obtained from $A$ turns out to have rank less than $m$.

4.2 Another way to think about row-equivalence

We have seen that if two $m$ by $n$ matrices $A$ and $B$ are row-equivalent then there exists an invertible $m$ by $m$ matrix $P$ such that $B = PA$.
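The procedure of Section 4.1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `invert` is hypothetical, matrices are lists of rows, exact arithmetic is done with the standard library's `Fraction`, and the loop works column by column rather than recursively. The right half of the reduced matrix is the product of the matrices $P_k$, that is, an invertible $P$ with $PA = I$.

```python
from fractions import Fraction

def invert(A):
    """Invert a square matrix by row operations on (A, I).

    Returns the inverse as a list of rows of Fractions, or None if A is not invertible.
    """
    m = len(A)
    # Build the m-by-2m matrix (A, I) with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(m)]
         for i, row in enumerate(A)]
    for col in range(m):
        # Look for a row, at or below the diagonal, with a nonzero entry in this column.
        pivot = next((i for i in range(col, m) if M[i][col] != 0), None)
        if pivot is None:
            return None                          # rank is less than m: not invertible
        M[col], M[pivot] = M[pivot], M[col]      # type 1: interchange two rows
        lead = M[col][col]
        M[col] = [x / lead for x in M[col]]      # type 2: scale to get a leading 1
        for i in range(m):
            if i != col and M[i][col] != 0:
                factor = M[i][col]               # type 3: subtract a multiple of a row
                M[i] = [x - factor * y for x, y in zip(M[i], M[col])]
    return [row[m:] for row in M]                # the right half is now the inverse

print(invert([[2, 1], [5, 3]]) == [[3, -1], [-5, 2]])   # True
print(invert([[1, 2], [2, 4]]))                          # None
```

The second matrix has rank 1, so the search for a nonzero entry fails in the second column and the function reports that no inverse exists.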

In fact, the converse is true: if $B = PA$ where $P$ is invertible, then $P$ can be obtained from the identity matrix by performing row operations, and therefore $PA$ can be obtained from $A$ by performing the same operations. Thus $B$ and $A$ have to be row-equivalent in this case.

5 Row operations and systems of equations

We have seen that the echelon matrix which is row-equivalent to $A$ tells us what the rank of $A$ is, and even gives us a basis for the row space. What else does it do for us?

5.1 Solving systems of homogeneous linear equations

Given a list of $m$ linear equations in $n$ variables:

$$a_{i,1} x_1 + a_{i,2} x_2 + \dots + a_{i,n} x_n = 0, \qquad 1 \le i \le m,$$

we can encode it as a matrix $A$. The system of equations can be read as a single vector equation $AX = 0$, and the problem of finding all solutions $(x_1, \dots, x_n)$ becomes the problem of describing the nullspace of the linear map $T_A$. Of course the dimension of the solution space is

$$\dim(F^n) - \dim(\mathrm{range}\, T_A) = n - r,$$

where $r$ is the rank of $A$.

When $E$ is an echelon matrix row-equivalent to $A$, then $E = PA$ for some invertible $P$, and it follows that the nullspace of $E$ is the same as that of $A$. ($EX = 0$ if and only if $PAX = 0$ if and only if $AX = 0$.) Thus the system of equations given by the rows of $E$ has precisely the same solutions as the original system.

If $A$ is an echelon matrix, then it gives you particularly useful equations. In fact, it gives you $r$ equations numbered $1$ through $r$, of which the equation numbered $i$ (when we leave out all the terms that are zero) says:

$$x_{c(i)} = -\sum_j a_{i,j} x_j,$$

with $j$ running through all values from $c(i) + 1$ up to $n$ excluding $c(i+1), \dots, c(r)$. The equations express the variables $x_{c(1)}, \dots, x_{c(r)}$ as linear combinations of all the other variables $x_j$, which can therefore be called independent variables.

We can also read off a basis for the nullspace. There is one basis vector for each $j$ that is not among the $c(i)$s. For each such $j$, choose $x_j$ to be $1$ and choose all the other independent variables to be $0$. Then $x_{c(i)}$ is $-a_{i,j}$ if $c(i) < j$ and $0$ if $c(i) > j$.

5.2 Inhomogeneous equations

A system of equations

$$a_{i,1} x_1 + a_{i,2} x_2 + \dots + a_{i,n} x_n = b_i, \qquad 1 \le i \le m,$$

corresponds to a single vector equation $AX = b$, where $A$ is $m$ by $n$ and $b$ is $m$ by $1$. In other words, it means specifying a vector $b$ belonging to $F^m$ and asking about the set of all vectors $v \in F^n$ such that $T_A(v) = b$. Write $T$ for $T_A$.

If $b = 0$ (the homogeneous case), there is at least one solution, namely $v = 0$, and the space of solutions is a vector subspace of $F^n$ (the nullspace of $T$) whose dimension is $n - r$, where $r$ is the rank of $A$. In general, there might not be any solutions. If there is at least one, say $v_p$, then the equation $Tv = b$ becomes $T(v - v_p) = 0$; the vectors $v$ such that $Tv = b$ are the vectors $v$ such that $v - v_p$ belongs to the nullspace of $T$, i.e. the vectors of the form $v_p + x$ where $x$ belongs to the nullspace.

How does row reduction help to work out an example like this? Make an $m$ by $n+1$ matrix by putting $b$ as one additional column: $(A, b)$. Do row operations to transform $A$ into an echelon matrix $E = PA$, and at the same time perform the same operations on the additional column $b$ to get a new column $c$, so that the matrix $(A, b)$ becomes $(E, c) = (PA, Pb)$. If the new matrix $(E, c)$ is interpreted as a system of inhomogeneous equations, the solutions of these are the same as the solutions of the original system. ($PAX = Pb$ if and only if $AX = b$.)

On the other hand, we can now tell at a glance whether there are any solutions. Say that the rank is $r$, so that $E$ has all zeroes after the first $r$ rows. If the extra column $c$ has anything non-zero after the first $r$ rows, i.e. if $c_i \neq 0$ for some $i > r$, then there can be no solution. On the other hand, if $c_i = 0$ for all $i > r$, then you have a list of $r$ equations with which you can solve for the $x_{c(i)}$ as in the homogeneous case. In particular by setting all the independent variables equal to zero (i.e. making $x_j = 0$ if $j$ is not any $c(i)$), we obtain a particular solution with $x_{c(i)} = c_i$ for $1 \le i \le r$ (here $c_i$ is the $i$th entry of the new column $c$, not the column index $c(i)$), from which the general solution can be obtained by adding solutions of the homogeneous system of equations.
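The whole method can be sketched in Python. The names `row_reduce` and `solve` are hypothetical, matrices are lists of rows, and exact arithmetic uses the standard library's `Fraction`. The reduction below is an iterative, column-by-column variant of row reduction (it clears each pivot column above and below in one pass) rather than the recursive description given in Section 3.2, and columns are numbered from 0 in the code.

```python
from fractions import Fraction

def row_reduce(M, ncols):
    """Row-reduce M so that its first ncols columns form an echelon matrix.

    Returns (E, pivots), where pivots lists the 0-based columns of the leading 1s.
    """
    E = [[Fraction(x) for x in row] for row in M]
    pivots = []
    for col in range(ncols):
        r = len(pivots)                       # rows 0..r-1 already hold leading 1s
        pivot = next((i for i in range(r, len(E)) if E[i][col] != 0), None)
        if pivot is None:
            continue                          # no leading 1 in this column
        E[r], E[pivot] = E[pivot], E[r]       # interchange two rows
        lead = E[r][col]
        E[r] = [x / lead for x in E[r]]       # make the leading entry 1
        for i in range(len(E)):
            if i != r and E[i][col] != 0:     # clear the rest of the column
                factor = E[i][col]
                E[i] = [x - factor * y for x, y in zip(E[i], E[r])]
        pivots.append(col)
    return E, pivots

def solve(A, b):
    """Describe the solutions of AX = b.

    Returns None if there is no solution; otherwise (particular, basis), where
    basis is a basis of the nullspace of A read off from the echelon matrix.
    """
    n = len(A[0])
    E, pivots = row_reduce([list(row) + [bi] for row, bi in zip(A, b)], n)
    r = len(pivots)
    if any(E[i][n] != 0 for i in range(r, len(E))):
        return None                           # a nonzero entry of c below row r
    particular = [Fraction(0)] * n
    for i, c in enumerate(pivots):
        particular[c] = E[i][n]               # independent variables are set to 0
    basis = []
    for j in range(n):
        if j in pivots:
            continue
        v = [Fraction(0)] * n
        v[j] = Fraction(1)                    # this independent variable is 1
        for i, c in enumerate(pivots):
            v[c] = -E[i][j]                   # the equation for the pivot variable
        basis.append(v)
    return particular, basis

A = [[1, 2, 1],
     [2, 4, 0]]
particular, basis = solve(A, [3, 2])
print([int(x) for x in particular])           # [1, 0, 2]
print([[int(x) for x in v] for v in basis])   # [[-2, 1, 0]]
print(solve([[1, 1], [2, 2]], [1, 3]))        # None
```

In the first example the rank is 2 and $n = 3$, so the nullspace has dimension $n - r = 1$, and every solution is the particular solution plus a multiple of the single basis vector. In the second example the reduced extra column has a nonzero entry below the first $r = 1$ rows, so there is no solution.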