MODULE 8

Topics: Null space, range, column space, row space and rank of a matrix

Definition: Let $L : V_1 \to V_2$ be a linear operator. The null space $N(L)$ of $L$ is the subspace of $V_1$ defined by

$$N(L) = \{x \in V_1 : Lx = 0\}.$$

Note: The null space of $L$ is sometimes called the kernel of $L$.

Examples:

i) If $Lx \equiv Ax$ with
$$A = \begin{pmatrix} 1 & -1 \\ 1 & -1 \end{pmatrix},$$
then $N(A) = \operatorname{span}\{(1, 1)\} \subset \mathbb{R}^2$.

ii) If $(Lf)(t) \equiv f''(t)$ for $f \in C^2[a, b]$, then $N(L) = \operatorname{span}\{1, t\}$.

iii) If $L : C^0[-1, 1] \to \mathbb{R}$ is defined by $Lf \equiv \int_{-1}^{1} f(s)\,ds$, then $N(L)$ contains the subspace of all odd continuous functions on $[-1, 1]$ plus many other functions, such as $f(t) = t^2 - 1/3$.

We shall now restrict ourselves to $m \times n$ real matrices. We note that always $0 \in N(A)$. If this is the only vector in $N(A)$, i.e., if $N(A) = \{0\}$, then the null space is the trivial null space with dimension 0. We also know from $Ax = \sum_{j=1}^{n} x_j A_j$ that $R(A) = \operatorname{span}\{A_1, \dots, A_n\} \subset \mathbb{R}^m$. The range of $A$ is often called the column space of $A$, and the dimension of this space is called the rank of $A$, i.e.,

$$r(A) = \operatorname{rank}(A) = \dim R(A) = \dim(\text{column space of } A).$$

We note that $r(A) \le \min\{m, n\}$.

Example: Let $x$ and $y$ be two nonzero column vectors in $\mathbb{R}^n$. Then the $n \times n$ matrix

$$x y^T = (y_1 x \;\; y_2 x \;\; \cdots \;\; y_n x)$$

has rank 1, since every column is a multiple of $x$.
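This rank-1 structure is easy to verify numerically. A minimal sketch using NumPy (the library choice and the particular vectors are illustrative assumptions, not part of the notes):

    import numpy as np

    # Outer product x y^T: column j equals y_j * x, a multiple of x,
    # so for nonzero x and y the matrix has rank 1.
    x = np.array([1.0, 2.0, 3.0])
    y = np.array([4.0, 5.0, 6.0])
    M = np.outer(x, y)                # the 3x3 matrix x y^T

    print(np.linalg.matrix_rank(M))   # prints 1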
Theorem: Let $A$ be an $m \times n$ matrix. Then

$$\dim N(A) + \operatorname{rank}(A) = n.$$

Proof: Let $\{y_1, \dots, y_r\}$ be a basis of $R(A)$. Let $\{x_1, \dots, x_r\}$ be vectors which satisfy $Ax_j = y_j$ for $j = 1, \dots, r$. Let $\{z_1, \dots, z_p\}$ be a basis of $N(A)$. Then the vectors $\{x_1, \dots, x_r, z_1, \dots, z_p\}$ are linearly independent, because if

$$\sum_{j=1}^{r} \alpha_j x_j + \sum_{j=1}^{p} \beta_j z_j = 0,$$

then

$$A\left(\sum_{j=1}^{r} \alpha_j x_j + \sum_{j=1}^{p} \beta_j z_j\right) = \sum_{j=1}^{r} \alpha_j y_j = 0,$$

which implies that $\alpha_1 = \alpha_2 = \cdots = \alpha_r = 0$ since the $\{y_j\}$ are linearly independent. But then the linear independence of the $\{z_j\}$ implies that the $\{\beta_j\}$ also must vanish. Finally, let $x$ be arbitrary in $\mathbb{R}^n$. Then $Ax = \sum_{j=1}^{r} \gamma_j y_j$ for some $\{\gamma_j\}$. This implies that

$$A\left(x - \sum_{j=1}^{r} \gamma_j x_j\right) = 0,$$

so that $x - \sum_{j=1}^{r} \gamma_j x_j \in N(A)$, i.e.,

$$x - \sum_{j=1}^{r} \gamma_j x_j = \sum_{j=1}^{p} \beta_j z_j.$$

Hence the linearly independent vectors $\{x_1, \dots, x_r, z_1, \dots, z_p\}$ span $\mathbb{R}^n$, and

$$r + p = \operatorname{rank}(A) + \dim N(A) = n.$$

It follows immediately that if $A$ is an $m \times n$ matrix with $m < n$, then $\dim N(A) \ge 1$ because $\operatorname{rank}(A) \le \min\{m, n\} = m < n$. In particular, this implies that $Ax = 0$ has a non-zero solution, so that such a matrix cannot have an inverse.

So far we have looked at the columns of $A$ as $n$ column vectors in $\mathbb{R}^m$. Likewise, the $m$ rows of $A$ define a set of $m$ vectors in $\mathbb{R}^n$. What can we say about the number of linearly independent rows of $A$? We recall from the homework of Module 2 that if $\langle x, y \rangle$ denotes the dot product, then

$$\langle Ax, y \rangle = \langle x, A^T y \rangle$$

for $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^m$.
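Both the theorem and this adjoint identity can be spot-checked numerically. A minimal NumPy sketch (the random data and the SVD-based null space construction are assumptions of this illustration, not the notes' method):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((3, 5))   # a random 3x5 matrix, m < n
    x = rng.standard_normal(5)
    y = rng.standard_normal(3)

    # The identity <Ax, y> = <x, A^T y> for the dot product.
    print(np.isclose((A @ x) @ y, x @ (A.T @ y)))   # True

    # Rank-nullity: the rows of Vh beyond the rank span N(A),
    # so rank(A) + dim N(A) = n.
    rank = np.linalg.matrix_rank(A)
    _, _, Vh = np.linalg.svd(A)
    null_basis = Vh[rank:].T
    print(np.allclose(A @ null_basis, 0.0))           # True
    print(rank + null_basis.shape[1] == A.shape[1])   # True: 3 + 2 = 5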
Next, let $\{y_1, y_2, \dots, y_r\}$ be a basis of $R(A)$ and apply the Gram-Schmidt orthogonalization process to the vectors $\{y_1, y_2, \dots, y_r, \hat e_1, \hat e_2, \dots, \hat e_m\}$; then the first $r$ orthogonal vectors will be a basis of $R(A)$, and the remaining $m - r$ vectors $\{Y_1, Y_2, \dots, Y_{m-r}\}$ will be orthogonal to $R(A)$. Since $A^T Y_j \in \mathbb{R}^n$, it follows from

$$\langle A^T Y_j, A^T Y_j \rangle = \langle Y_j, A(A^T Y_j) \rangle = 0$$

that $A^T Y_j = 0$, so that $\dim N(A^T) \ge m - r$. Finally, we observe that if $Ax \ne 0$ then

$$\langle A^T(Ax), x \rangle = \langle Ax, Ax \rangle > 0,$$

so that $Ax$ cannot belong to $N(A^T)$. Hence $\dim N(A^T) = m - r$, so that

$$\operatorname{rank}(A^T) = \text{number of linearly independent rows of } A = m - (m - r) = r.$$

In other words, an $m \times n$ matrix has as many linearly independent rows as linearly independent columns.

Finally, we observe that if we add to any row of $A$ a linear combination of the remaining rows, we do not change the number of independent rows. Hence we can apply Gaussian elimination to the rows of $A$ and read off the number of independent rows from the final form of $A$, in which all elements below the diagonal are zero.

Implications for the solution of the linear system $Ax = b$, where $A$ is an $m \times n$ matrix:

1) We shall assume that $b \in R(A)$.

i) If the columns of $A$ are linearly independent, then $Ax = b$ has a unique solution regardless of the size of the system. In this case the inverse mapping exists for every element $y \in R(A)$.

ii) If the columns of $A$ are linearly dependent, then $\dim N(A) \ge 1$ and there are infinitely many solutions. One can then constrain the solution by asking, for example, for the minimum norm solution.

iii) If $m \ge n$ the columns of $A$ may or may not be linearly dependent. If $m < n$ then the columns of $A$ must be linearly dependent (see the sketch after this list).

iv) If $\operatorname{rank}(A) = m$, then $b \in R(A)$ for every $b \in \mathbb{R}^m$.

2) Regardless of the size of the system, if $b \notin R(A)$ there cannot be a solution. If $b \notin R(A)$, then Gaussian elimination will lead to inconsistent equations.
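The two facts used above, that row rank equals column rank and that $m < n$ forces dependent columns, can be illustrated numerically. A minimal NumPy sketch (the example matrix is an assumption of this illustration):

    import numpy as np

    # Row rank equals column rank: rank(A) == rank(A^T).
    A = np.array([[1., 2., 0., 1.],
                  [2., 4., 1., 3.],
                  [3., 6., 1., 4.]])   # row 3 = row 1 + row 2, so rank 2

    print(np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.T))  # True

    # m < n: a 3x4 matrix must have dependent columns, so Ax = 0
    # has a non-zero solution (read off here from the SVD).
    _, _, Vh = np.linalg.svd(A)
    z = Vh[-1]                        # a null space vector, z != 0
    print(np.allclose(A @ z, 0.0))    # True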
iv) If rank(a) = m then b R(A). 2) Regardless of the size of the system, if b R(A) there cannot be a solution. If b R(A) then Gaussian elimination will lead to inconsistent equations. Two points of view for finding an approximate solution of Ax = b when b R(A). I. The Least Squares Solution : When the system Ax = b is inconsistent then for any x R n the residual, defined as r(x) b Ax, cannot be zero. In this case it is common to try to minimize the residual (in some sense) over all x R n (or possibly over some specially chosen set of admissible x R n ). We shall consider here only the case of minimizing a norm of the residual which is obtained from an inner product. This means we need to find the minimum of the function f defined by f(x) r(x), r(x) = b Ax, b Ax. Let us assume now that we are dealing with real valued vectors. Then f is a function of the n real variables x 1,..., x n, and calculus tells us that a necessary condition for the minimum is that We find that f(x) = 0. f x j A j, b Ax + b Ax, A j = 0. Since in a real vector space the inner product is symmetric it follows that x must be a solution of A j, Ax = A j, b for j = 1,..., n. If the inner product is the dot product on R n then these n equations can be written in matrix form as A T Ax = A T b If the n n matrix A T A has rank n then dim N (A T A) = 0 and (A T A) 1 exists so that x = (A T A) 1 Ab. 50
This is the least squares solution of $Ax = b$ in Euclidean $n$-space. If $A$, and hence $A^T$, is square and has rank $n$, then $A^T$ is invertible and $x$ solves $Ax = b$.

II. We know that we can solve $Ax = \hat b$ for any $\hat b \in R(A)$, since Gaussian elimination will give the answer. One may now pose the problem: find the solution $\hat x$ of $A\hat x = \hat b$, where $\hat b$ is the vector in $R(A)$ which is closest in norm to $b$. As we saw in Module 4, the vector $\hat b$ is the orthogonal projection of $b$ onto $\operatorname{span}\{A_1, \dots, A_n\}$. Thus

$$\hat b = \sum_{j=1}^{n} \alpha_j A_j = A\alpha,$$

where $\alpha$ is computed from

$$\mathcal{A} \alpha = d \quad \text{with} \quad \mathcal{A}_{ij} = \langle A_j, A_i \rangle \quad \text{and} \quad d_i = \langle b, A_i \rangle.$$

It follows that $\mathcal{A}$ and $d$ can be written in matrix notation as

$$\mathcal{A} = A^T A, \qquad d = A^T b,$$

so that by inspection the solution of

$$A\hat x = \hat b = A\alpha = A(A^T A)^{-1} A^T b$$

is $\hat x = (A^T A)^{-1} A^T b$, provided $A$ has rank $n$. Hence the least squares solution is the exact solution of the closest linear system for which there is an exact solution.
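This second point of view can be checked with the same small system as before: the projection $\hat b = A\hat x$ lies in $R(A)$, and the residual $b - \hat b$ is orthogonal to every column of $A$. A minimal NumPy sketch (illustrative data):

    import numpy as np

    A = np.array([[1., 0.],
                  [1., 1.],
                  [1., 2.],
                  [1., 3.]])
    b = np.array([1., 2., 2., 4.])

    x_hat = np.linalg.solve(A.T @ A, A.T @ b)
    b_hat = A @ x_hat             # orthogonal projection of b onto R(A)

    # The residual is orthogonal to the columns of A, so A x = b_hat
    # is consistent and x_hat solves it exactly.
    print(np.allclose(A.T @ (b - b_hat), 0.0))   # True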
Module 8 - Homework

1) Let $V_1 = \{u : u \in C^0[-1, 1]\}$ and $V_2 = C^0[-1, 1]$, and define

$$(Lu)(t) = \int_{-1}^{t} s\, u(s)\, ds.$$

Show that $L$ is linear and find $N(L)$. Show that the range of $L$ is not all of $V_2$.

2) Let

$$A = \begin{pmatrix} 1 & 5 & 9 & 13 & 6 \\ 2 & 6 & 10 & 14 & 8 \\ 3 & 7 & 11 & 15 & 10 \\ 4 & 8 & 12 & 16 & 12 \end{pmatrix}.$$

What is the rank of $A$? Find an orthogonal (with respect to the dot product) basis of the null space and range of $A$.

3) Let $A$ be an $m \times n$ matrix. Assume that its columns are linearly independent.

i) Show that in this case $n \le m$.

ii) Show that one can find an $n \times m$ matrix $B$ such that $BA = I_n$, where $I_n$ is the $n \times n$ identity matrix.

4) Suppose the cost $C(t)$ of a process grows quadratically with time, i.e.,

$$C(t) = a_0 + a_1 t + a_2 t^2.$$

Company records contain the following data:

    time taken    measured cost
        .1            .911
        .2            .84
        .3            .788
        .4            .76
        .5            .747
        .6            .77

What would be your estimate of the cost of the process if it takes one unit of time?