Matrix Arithmetic

There is an arithmetic for matrices that can be viewed as extending the arithmetic we have developed for vectors to the more general setting of rectangular arrays: if $A$ and $B$ are $m \times n$ matrices and $r$ is a scalar, then we can define matrix addition via
$$A + B = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} + \begin{bmatrix} b_{11} & \cdots & b_{1n} \\ \vdots & & \vdots \\ b_{m1} & \cdots & b_{mn} \end{bmatrix} = \begin{bmatrix} a_{11} + b_{11} & \cdots & a_{1n} + b_{1n} \\ \vdots & & \vdots \\ a_{m1} + b_{m1} & \cdots & a_{mn} + b_{mn} \end{bmatrix}$$
and scalar multiplication for matrices via
$$rA = r \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} = \begin{bmatrix} ra_{11} & \cdots & ra_{1n} \\ \vdots & & \vdots \\ ra_{m1} & \cdots & ra_{mn} \end{bmatrix}.$$
That is, addition and scalar multiplication are performed entrywise. It is important to note that two matrices must have the same size if we wish to add them together (and their sum has the same size as well).
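For a concrete illustration (with small matrices chosen here for the example, not taken from the text):
$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} + \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix} = \begin{bmatrix} 6 & 8 \\ 10 & 12 \end{bmatrix}, \qquad 3 \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 3 & 6 \\ 9 & 12 \end{bmatrix}.$$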
As with vector addition and scalar multiplication, the same familiar arithmetical properties hold:

Theorem. If $A$, $B$, $C$ are matrices of the same size and $r$ and $s$ are scalars, then
1. $A + B = B + A$;
2. $(A + B) + C = A + (B + C)$;
3. $A + 0 = A = 0 + A$ (where $0$ represents the zero matrix, the matrix of the same size as $A$ all of whose entries are zero);
4. $r(A + B) = rA + rB$ and $(r + s)A = rA + sA$;
5. $r(sA) = (rs)A$. //

We can also extend the definition of the matrix-vector product we developed to include a much wider class of matrices. The motivation comes from a consideration of how we compose linear transformations. As we have seen, any linear transformation $S: \mathbb{R}^n \to \mathbb{R}^m$ has an associated standard $m \times n$ matrix $A$ for which $S(\mathbf{y}) = A\mathbf{y}$; here, $\mathbf{y}$ is an arbitrary vector in $\mathbb{R}^n$ and its image $S(\mathbf{y}) = A\mathbf{y}$ lies in $\mathbb{R}^m$. If we now consider a second linear transformation $T: \mathbb{R}^p \to \mathbb{R}^n$ that has associated standard $n \times p$ matrix $B$, then every vector $\mathbf{x}$ in $\mathbb{R}^p$ is taken by $T$ to the vector $T(\mathbf{x}) = B\mathbf{x}$ in $\mathbb{R}^n$.
Since the output values of $T$ can be used as input values for $S$, the composition map $S \circ T: \mathbb{R}^p \to \mathbb{R}^m$ makes sense: it carries a vector $\mathbf{x}$ in $\mathbb{R}^p$ to the vector
$$S \circ T(\mathbf{x}) = S(T(\mathbf{x})) = A(B\mathbf{x})$$
in $\mathbb{R}^m$. This composition map is a linear transformation (why?!), so it has a standard matrix representation. Therefore, we define the matrix product $AB$ to be the standard matrix of the composition map $S \circ T$. The first observation that must be made regarding this definition is that not every pair of matrices can be multiplied: $AB$ only makes sense when the number of columns of $A$ matches the number of rows of $B$.

Next, we want to see how to determine the matrix product $AB$. By expressing the matrix $B$ in terms of its columns as $B = [\,\mathbf{b}_1 \ \mathbf{b}_2 \ \cdots \ \mathbf{b}_p\,]$, recall that
$$B\mathbf{x} = x_1 \mathbf{b}_1 + x_2 \mathbf{b}_2 + \cdots + x_p \mathbf{b}_p$$
is the linear combination of the columns of $B$ using as weights the entries of $\mathbf{x}$. Thus,
$$A(B\mathbf{x}) = A(x_1 \mathbf{b}_1 + x_2 \mathbf{b}_2 + \cdots + x_p \mathbf{b}_p) = x_1 (A\mathbf{b}_1) + x_2 (A\mathbf{b}_2) + \cdots + x_p (A\mathbf{b}_p)$$
is the linear combination of the matrix-vector products of the columns of $B$ by $A$ using these same weights. This means that $A(B\mathbf{x}) = [\,A\mathbf{b}_1 \ A\mathbf{b}_2 \ \cdots \ A\mathbf{b}_p\,]\mathbf{x}$, so that the matrix of the composition map must be given by the relation
$$AB = [\,A\mathbf{b}_1 \ A\mathbf{b}_2 \ \cdots \ A\mathbf{b}_p\,].$$
This shows exactly what the matrix product $AB$ is. Because we know how to form the matrix-vector products $A\mathbf{b}_j$, we note that the columns of $AB$ are the linear combinations of the columns of $A$ whose weights come from the corresponding column of $B$. Equivalently, since the $i$th entry of the vector $A\mathbf{b}_j$ is the linear combination of the entries of the vector $\mathbf{b}_j$ using as weights the entries of the $i$th row of $A$, we can conclude that the $(i, j)$-entry of $AB$ is given by the formula
$$(AB)_{ij} = a_{i1} b_{1j} + a_{i2} b_{2j} + \cdots + a_{in} b_{nj}.$$
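As a computational sanity check, here is a minimal sketch (using NumPy, with small illustrative matrices not taken from the text) that builds $AB$ both column by column as $[\,A\mathbf{b}_1 \ \cdots \ A\mathbf{b}_p\,]$ and from the entrywise formula, and compares each against NumPy's built-in product:

```python
import numpy as np

# Illustrative sizes: A is 2x3 (m x n), B is 3x2 (n x p),
# so the product AB is 2x2 (m x p).
A = np.array([[1., 2., 3.],
              [4., 5., 6.]])
B = np.array([[7., 8.],
              [9., 10.],
              [11., 12.]])

# Column definition: the j-th column of AB is A times the j-th column of B.
AB_columns = np.column_stack([A @ B[:, j] for j in range(B.shape[1])])

# Entrywise formula: (AB)_ij = a_i1 b_1j + a_i2 b_2j + ... + a_in b_nj.
m, n = A.shape
p = B.shape[1]
AB_entries = np.array([[sum(A[i, k] * B[k, j] for k in range(n))
                        for j in range(p)]
                       for i in range(m)])

assert np.allclose(AB_columns, A @ B)
assert np.allclose(AB_entries, A @ B)
```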
This definition of matrix multiplication satisfies many of the familiar arithmetical properties we have come to expect (but it fails to satisfy some others!):

Theorem. Let $A$, $B$, $C$ be matrices having appropriate sizes to allow the products below to be defined. Then
1. $A(BC) = (AB)C$;
2. $A(B + C) = AB + AC$ and $(A + B)C = AC + BC$;
3. $r(AB) = (rA)B = A(rB)$ for any scalar $r$; and
4. if $A$ is $m \times n$, then $I_m A = A = A I_n$, where $I_m$ and $I_n$ are the appropriately sized identity matrices. //

Notice which properties do not appear in the above theorem. In general, matrices do not commute; that is, the relation $AB = BA$ can fail even when the matrix products on both sides of this equation make sense! (See Example 7, p. 114.) Next, if three matrices satisfy the relation $AB = AC$, it is not necessarily the case that $B = C$; that is, the cancellation law does not hold in general for matrix multiplication (see Exercise 10, p. 116). Finally, it may happen that $AB = 0$ without either $A$ or $B$ being equal to the zero matrix, e.g.,
$$\begin{bmatrix} 3 & -6 \\ 1 & -2 \end{bmatrix} \begin{bmatrix} 2 & 4 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}.$$

Another useful and common matrix operation is the matrix transpose. The transpose of the $m \times n$ matrix $A$ is the $n \times m$ matrix $A^T$, whose rows are the columns of $A$, or equivalently, whose columns are the rows of $A$. A simple check of properties shows that the transpose operation is compatible with matrix arithmetic:

Theorem. Let $A$, $B$ be matrices having appropriate sizes to allow the expressions below to be defined. Then
1. $(A^T)^T = A$;
2. $(A + B)^T = A^T + B^T$;
3. $(rA)^T = rA^T$ for any scalar $r$; and
4. $(AB)^T = B^T A^T$.

Proof. Properties 1-3 are straightforward. To see why property 4 holds, we note that the entries in the $i$th row of $(AB)^T$ are the entries of the $i$th column of $AB$, which is $A\mathbf{b}_i$; the $j$th entry in this column is the linear combination of the entries of $\mathbf{b}_i$ (the $i$th column of $B$) using weights taken from the $j$th row of $A$. That is, the $(i, j)$-entry of $(AB)^T$ is the quantity
$$a_{j1} b_{1i} + a_{j2} b_{2i} + \cdots + a_{jn} b_{ni}.$$
On the other hand, the $(i, j)$-entry of $B^T A^T$ is the linear combination of the entries in the $j$th column of $A^T$ using weights that come from the $i$th row of $B^T$, which is equivalently the linear combination of the entries in the $j$th row of $A$ using weights that come from the $i$th column of $B$. That is, the $(i, j)$-entry of $B^T A^T$ is the quantity
$$b_{1i} a_{j1} + b_{2i} a_{j2} + \cdots + b_{ni} a_{jn}.$$
Since this is identical to the computation in the previous paragraph, we must have that $(AB)^T = B^T A^T$. //
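The failures and successes above can all be checked numerically. Here is a minimal sketch (using NumPy; the zero-divisor matrices are the example above with the signs as reconstructed here) verifying that $AB = 0$ with $A, B \neq 0$, that $AB \neq BA$, and that $(AB)^T = B^T A^T$:

```python
import numpy as np

# AB = 0 even though neither factor is the zero matrix.
A = np.array([[3., -6.],
              [1., -2.]])
B = np.array([[2., 4.],
              [1., 2.]])
print(A @ B)                        # [[0. 0.], [0. 0.]]

# Matrix multiplication is not commutative: here BA is nonzero.
print(np.allclose(A @ B, B @ A))    # False

# The transpose reverses the order of a product.
print(np.allclose((A @ B).T, B.T @ A.T))   # True
```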