Lecture 3: Linear Algebra Review, Part II
Brian Borchers
January 4

1 Linear Independence

Definition 1 The vectors v_1, v_2, ..., v_n are linearly independent if the system of equations

  c_1 v_1 + c_2 v_2 + ... + c_n v_n = 0

has only the trivial solution c_1 = c_2 = ... = c_n = 0. If there are other solutions, then the vectors are linearly dependent.

Determining whether or not a set of vectors is linearly independent is simple: just solve the above system of equations.

Example 1 Let

  A = [ 1 2 3 ]
      [ 4 5 6 ]
      [ 7 8 9 ]

Are the columns of A linearly independent vectors? Set up the system of equations Ax = 0 in an augmented matrix, and then find the RREF:

  [ 1 0 -1 | 0 ]
  [ 0 1  2 | 0 ]
  [ 0 0  0 | 0 ]

The solutions are

  x = x_3 (1, -2, 1)^T

We can set x_3 = 1 and obtain the nonzero solution x = (1, -2, 1)^T. Thus the columns of A are linearly dependent.

There are a number of important theoretical consequences of linear independence. For example, it can be shown that if the columns of an n by n matrix A are linearly independent, then A^(-1) exists, and the system of equations Ax = b has a unique solution for every right-hand side b.
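Numerically, the test above amounts to a rank computation: the columns of A are linearly independent exactly when the rank of A equals the number of columns. A short NumPy sketch, for illustration, using the matrix from Example 1:

```python
import numpy as np

# The matrix from Example 1.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])

# Columns are linearly independent iff rank(A) equals the number of columns.
independent = np.linalg.matrix_rank(A) == A.shape[1]
print(independent)  # False: the columns are linearly dependent

# The nonzero solution found above: 1*col1 - 2*col2 + 1*col3 = 0.
c = np.array([1.0, -2.0, 1.0])
print(np.allclose(A @ c, 0))  # True
```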
2 Subspaces of R^n

So far, we've worked with vectors of real numbers in the n-dimensional space R^n. There are a number of properties of R^n that make it convenient to work with vectors. First, the operation of vector addition always works: we can take any two vectors in R^n and add them together and get another vector in R^n. Second, we can multiply any vector in R^n by a scalar and obtain another vector in R^n. Finally, we have the zero vector 0, with the property that for any vector x, x + 0 = 0 + x = x.

Definition 2 A subspace W of R^n is a subset of R^n which satisfies the three properties:
1. If x and y are vectors in W, then x + y is also a vector in W.
2. If x is a vector in W and s is any scalar, then sx is also a vector in W.
3. The zero vector 0 is in W.

Example 2 In R^3, the plane P defined by the equation x_1 + x_2 + x_3 = 0 is a subspace of R^3. To see this, note that if we take any two vectors in the plane and add them together, we get another vector in the plane. If we take a vector in this plane and multiply it by any scalar, we get another vector in the plane. Finally, 0 is a vector in the plane.

Subspaces are important because they provide an environment within which all of the rules of matrix/vector algebra apply. The most important subspace of R^n that we will work with in this course is the null space of an m by n matrix.

Definition 3 Let A be an m by n matrix. The null space of A (written N(A)) is the set of all vectors x such that Ax = 0.

To show that N(A) is actually a subspace of R^n, we need to show three things:
1. If x and y are in N(A), then Ax = 0 and Ay = 0. By adding these equations, we get that A(x + y) = 0. Thus x + y is in N(A).
2. If x is in N(A) and s is any scalar, then Ax = 0. We can multiply this equation by s to get sAx = 0. Thus A(sx) = 0, and sx is in N(A).
3. A0 = 0, so 0 is in N(A).

Computationally, finding the null space of a matrix is as simple as solving the system of equations Ax = 0.

Example 3 Let

  A = [ 1 1 4 3 ]
      [ 0 1 1 2 ]
      [ 1 0 3 1 ]
In order to find the null space of A, we solve the system of equations Ax = 0. To solve the equations, we put the system of equations into an augmented matrix

  [ 1 1 4 3 | 0 ]
  [ 0 1 1 2 | 0 ]
  [ 1 0 3 1 | 0 ]

and find the RREF

  [ 1 0 3 1 | 0 ]
  [ 0 1 1 2 | 0 ]
  [ 0 0 0 0 | 0 ]

From the augmented matrix, we find that x_1 = -3x_3 - x_4 and x_2 = -x_3 - 2x_4, so

  x = x_3 (-3, -1, 1, 0)^T + x_4 (-1, -2, 0, 1)^T

Any vector in the null space can be written as a linear combination of the above vectors, so the null space is a two-dimensional plane within R^4.

Now, consider the problem of solving Ax = b, where

  b = (7, 3, 4)^T

It happens that one particular solution to this system of equations is

  p = (1, 2, 1, 0)^T

However, we can take any vector in the null space of A and add it to this solution to obtain another solution. Suppose that x is in N(A). Then

  A(x + p) = Ax + Ap
  A(x + p) = 0 + b
  A(x + p) = b

In the context of inverse problems, the null space is critical because the presence of a nontrivial null space leads to nonuniqueness in the solution to a linear system of equations.

Definition 4 A basis for a subspace W is a set of vectors v_1, ..., v_p such that
1. Any vector in W can be written as a linear combination of the basis vectors.
2. The basis vectors are linearly independent.

Any subspace W of R^n will have many different bases. For example, we can take any basis and multiply one of the basis vectors by 2 to obtain a new basis. However, it is possible to show that all bases for a subspace W have the same number of basis vectors.

Definition 5 Let W be a subspace of R^n with basis v_1, v_2, ..., v_p. Then the dimension of W is p.

Example 4 In the previous example, the vectors

  v_1 = (-3, -1, 1, 0)^T and v_2 = (-1, -2, 0, 1)^T

form a basis for the null space of A, because any vector in the null space can be written as a linear combination of v_1 and v_2, and because the vectors v_1 and v_2 are linearly independent. Since the basis has two vectors, the dimension of the null space of A is two.

It can be shown that the procedure used in the above example always produces a linearly independent basis for N(A).

There is a second important subspace associated with a matrix, called the column space of the matrix.

Definition 6 Let A be an m by n matrix. The column space or range of A (written R(A)) is the set of all vectors b such that Ax = b has at least one solution. In other words, the column space is the set of all vectors b that can be written as a linear combination of the columns of A.

The column space is important because it tells us for which vectors b we can solve Ax = b. To find the column space of a matrix, we consider what happens when we compute the RREF of the augmented matrix [A | b]. In the part of the augmented matrix corresponding to the left-hand side of the equations we always get the same result, namely the RREF of A. The solution to the system of equations may involve some free variables, but we can always set these free variables to 0. Thus when we can solve Ax = b, we can solve the system of equations by using only the variables corresponding to the pivot columns in the RREF of A. In other words, if we can solve Ax = b, then we can write b as a linear combination of the pivot columns of A.
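These null space computations are easy to check numerically; scipy.linalg.null_space returns an orthonormal basis for N(A), which differs from the RREF basis only by a change of basis. A sketch, using an illustrative 3 by 4 rank-two matrix like the one in Example 3:

```python
import numpy as np
from scipy.linalg import null_space

# A 3 by 4 matrix of rank two (illustrative numbers).
A = np.array([[1.0, 1.0, 4.0, 3.0],
              [0.0, 1.0, 1.0, 2.0],
              [1.0, 0.0, 3.0, 1.0]])

# null_space returns an orthonormal basis for N(A) as columns.
N = null_space(A)
dim_null = N.shape[1]                 # dimension of the null space
rank = np.linalg.matrix_rank(A)       # dimension of the column space

print(dim_null, rank)                 # 2 2
print(np.allclose(A @ N, 0))          # True: each basis vector satisfies Ax = 0
```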
Note that these are columns from the original matrix A, not columns from the RREF of A.

Example 5 As in the previous example, let

  A = [ 1 1 4 3 ]
      [ 0 1 1 2 ]
      [ 1 0 3 1 ]

We want to find the column space of A. We already know that the RREF of A
is

  [ 1 0 3 1 ]
  [ 0 1 1 2 ]
  [ 0 0 0 0 ]

Thus whenever we can solve Ax = b, we can find a solution in which x_3 and x_4 are 0. In other words, whenever there is a solution to Ax = b, we can write b as a linear combination of the first two columns of A:

  b = x_1 (1, 0, 1)^T + x_2 (1, 1, 0)^T

Since these two vectors are linearly independent and span R(A), they form a basis for R(A). The dimension of R(A) is two. In the context of inverse problems, the range of a matrix A is important because it tells us when we can solve Ax = b.

In finding the null space and range of a matrix A, we found that the basis vectors for N(A) corresponded to nonpivot columns of A, while the basis vectors for R(A) corresponded to pivot columns of A. Since the matrix A had n columns, we obtain the equation

  dim N(A) + dim R(A) = n

In addition to the null space and range of a matrix A, we will often work with the null space and range of the transpose of A. Since the columns of A^T are rows of A, the column space of A^T is also called the row space of A. An important theorem of linear algebra is that the dimension of the row space of A equals the dimension of the column space of A. The dimension of the column space of A, which is also the dimension of the row space of A, is called the rank of A.

3 Orthogonality and the Dot Product

Definition 7 Let x and y be two vectors in R^n. The dot product of x and y is

  x · y = x^T y = x_1 y_1 + x_2 y_2 + ... + x_n y_n

Definition 8 Let x be a vector in R^n. The length or norm of x is

  ||x|| = sqrt(x^T x) = sqrt(x_1^2 + x_2^2 + ... + x_n^2)

You may be familiar with an alternative definition of the dot product in which x · y = ||x|| ||y|| cos(θ), where θ is the angle between the two vectors. The two definitions are equivalent. To see this, consider a triangle with sides x, y, and x - y. See Figure 1. The angle between sides x and y is θ. By the law of cosines,

  ||x - y||^2 = ||x||^2 + ||y||^2 - 2 ||x|| ||y|| cos(θ).
Figure 1: Relationship between the dot product and the angle between two vectors.

Expanding the left-hand side,

  (x - y)^T (x - y) = x^T x + y^T y - 2 ||x|| ||y|| cos(θ)
  x^T x - 2 x^T y + y^T y = x^T x + y^T y - 2 ||x|| ||y|| cos(θ)
  -2 x^T y = -2 ||x|| ||y|| cos(θ)
  x^T y = ||x|| ||y|| cos(θ).

Note that we can also use this formula to compute the angle between two vectors:

  θ = cos^(-1)( x^T y / (||x|| ||y||) )

Definition 9 Two vectors x and y in R^n are orthogonal or perpendicular if x^T y = 0.

Definition 10 A set of vectors v_1, v_2, ..., v_p is orthogonal if each pair of vectors in the set is orthogonal.

Definition 11 Two subspaces V and W of R^n are orthogonal or perpendicular if every vector in V is perpendicular to every vector in W.

It can be shown that N(A) is perpendicular to R(A^T). To see this, note that if x is in N(A), then Ax = 0. Since each element of the product Ax is obtained by taking the dot product of a row of A and x, x is perpendicular to each row of A. Since the rows of A are the columns of A^T, x is perpendicular to all of the columns of A^T, so it is perpendicular to R(A^T).

Definition 12 A basis in which the basis vectors are orthogonal is an orthogonal basis.

Definition 13 An n by n matrix Q is orthogonal if the columns of Q are orthogonal and each column of Q has length one.

Orthogonal matrices have a number of useful properties:
1. Q^T Q = I. In other words, Q^(-1) = Q^T.
2. For any vector x in R^n, ||Qx|| = ||x||.
3. For any two vectors x and y in R^n, x^T y = (Qx)^T (Qy).
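These definitions are easy to experiment with numerically. The following NumPy sketch (the vectors and matrix are chosen purely for illustration) checks the angle formula and the three properties of orthogonal matrices listed above:

```python
import numpy as np

# Check the angle formula theta = arccos(x^T y / (||x|| ||y||)).
x = np.array([1.0, 0.0, 1.0])
y = np.array([0.0, 1.0, 1.0])
theta = np.arccos((x @ y) / (np.linalg.norm(x) * np.linalg.norm(y)))
print(np.isclose(np.degrees(theta), 60.0))  # True

# Build an orthogonal matrix Q by QR factorizing an invertible matrix,
# then verify the three properties.
M = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
Q, _ = np.linalg.qr(M)
print(np.allclose(Q.T @ Q, np.eye(3)))                       # property 1
print(np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))  # property 2
print(np.isclose(x @ y, (Q @ x) @ (Q @ y)))                  # property 3
```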
Figure 2: The orthogonal projection of x onto y.

Definition 14 Let A be an m by n matrix. A can be written as A = QR, where Q is an m by m orthogonal matrix, and R is an m by n matrix with R_ij = 0 whenever i > j. This is called the QR factorization of A.

The QR factorization can be easily computed using techniques from numerical linear algebra. If the columns of A are linearly independent, the first n columns of Q are an orthogonal basis for R(A). We can use this approach to find an orthogonal basis for any subspace W of R^n. First, we find a basis for W. We put the basis vectors into a matrix, and find its QR factorization. The first columns of the Q matrix, as many as there were basis vectors, form an orthogonal basis for W.

Definition 15 Let x and y be two vectors in R^n. The orthogonal projection of x onto y is

  p = proj_y x = (x^T y / y^T y) y

See Figure 2.

Definition 16 Let W be a subspace of R^n, with an orthogonal basis v_1, v_2, ..., v_p. Let x be a vector in R^n. Then the orthogonal projection of x onto W is

  p = proj_W x = (x^T v_1 / v_1^T v_1) v_1 + (x^T v_2 / v_2^T v_2) v_2 + ... + (x^T v_p / v_p^T v_p) v_p

An important property of the orthogonal projection is that the projection of x onto W is the point in W which is closest to x. In the special case that x is in W, the projection of x onto W is x itself. This provides a convenient way to write a vector in W as a linear combination of the orthogonal basis vectors.

Example 6 In this example, we will find the point on the plane

  x_1 + x_2 + x_3 = 0
which is closest to the point x_1 = 1, x_2 = 2, x_3 = 3. First, we must find an orthogonal basis for our subspace. Our subspace is the null space of the matrix

  A = [ 1 1 1 ]

This null space has the basis

  u_1 = (-1, 1, 0)^T and u_2 = (-1, 0, 1)^T

Unfortunately, this basis is not orthogonal. Using the QR factorization, we obtain the orthogonal basis

  w_1 = (-0.7071, 0.7071, 0)^T and w_2 = (-0.4082, -0.4082, 0.8165)^T

Using this orthogonal basis, we compute the projection of x onto the plane:

  p = (x^T w_1 / w_1^T w_1) w_1 + (x^T w_2 / w_2^T w_2) w_2

  p = (-1, 0, 1)^T

Given an inconsistent system of equations Ax = b, it is often desirable to find an approximate solution. A natural measure of the quality of an approximate solution is the distance from Ax to b, ||Ax - b||. Such a solution is called a least squares solution, because it minimizes the sum of squares of the errors. The least squares solution can be obtained by projecting b onto the range of A. This calculation requires us to first find an orthogonal basis for R(A).

There is an alternative approach which does not require the orthogonal basis. Let Ax = p = proj_R(A) b. Then Ax - b is perpendicular to R(A). In particular, each of the columns of A is orthogonal to Ax - b. Thus

  A^T (Ax - b) = 0
  A^T A x = A^T b.

This last system of equations is referred to as the normal equations for the least squares problem. It can be shown that if the columns of A are linearly independent, then the normal equations have exactly one solution. This solution minimizes the sum of squared errors.
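The projection in Example 6 can be reproduced in a few lines of NumPy; the reduced QR factorization supplies the orthonormal basis for the plane (its columns may differ in sign from the hand computation, but the projection is the same):

```python
import numpy as np

# Basis for the plane x1 + x2 + x3 = 0 (the null space of [1 1 1]).
u1 = np.array([-1.0, 1.0, 0.0])
u2 = np.array([-1.0, 0.0, 1.0])

# Reduced QR of the 3 by 2 basis matrix: W's columns are an
# orthonormal basis for the plane.
W, _ = np.linalg.qr(np.column_stack([u1, u2]))

# Project x = (1, 2, 3) onto the plane: p = W W^T x.
x = np.array([1.0, 2.0, 3.0])
p = W @ (W.T @ x)
print(np.allclose(p, [-1.0, 0.0, 1.0]))  # True, as in Example 6
```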
Example 7 Let

  A = [ 1 1 ]
      [ 1 2 ]
      [ 1 3 ]

and

  b = (2, 1, 3)^T

It's easy to see that the system of equations Ax = b is inconsistent. We'll find the least squares solution by solving the normal equations.

  A^T A = [ 3  6 ]
          [ 6 14 ]

  A^T b = (6, 13)^T

The solution to A^T A x = A^T b is

  x = (1, 0.5)^T
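The normal equations can be solved directly, although in practice np.linalg.lstsq is preferred because it minimizes ||Ax - b|| without forming A^T A explicitly. A sketch using the numbers from Example 7:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([2.0, 1.0, 3.0])

# Solve the normal equations A^T A x = A^T b.
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# np.linalg.lstsq minimizes ||Ax - b|| directly and gives the same answer
# when the columns of A are linearly independent.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

print(np.allclose(x_normal, x_lstsq))       # True
print(np.allclose(x_normal, [1.0, 0.5]))    # True: the solution from Example 7
```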