Dimension and Structure


Chapter 7  Dimension and Structure

7.1 Basis and Dimensions

Bases for Subspaces

Definition 7.1.1. A set of vectors in a subspace V of R^n is said to be a basis for V if it is linearly independent and spans V. The set {e_1, e_2, ..., e_n} is called the standard basis for R^n.

Theorem 7.1.2. If S = {v_1, ..., v_k} is a set of two or more nonzero vectors in R^n, then S is linearly dependent if and only if some vector in S is a linear combination of its predecessors.

Example 7.1.3. The vectors

v_1 = (0,1,0),  v_2 = (1,1,0),  v_3 = (0,1,3)

are linearly independent, since none of them is a linear combination of its predecessors.

Example 7.1.4. The nonzero row vectors of a matrix in row echelon form are linearly independent; for instance, in

[ 1  2  1  0 ]        [ 0  1  2  0 ]
[ 0  1  0  2 ]   or   [ 0  0  1  0 ]
[ 0  0  1  1 ]        [ 0  0  0  1 ]
                      [ 0  0  0  0 ]

each nonzero row has its leading 1 in a column where every lower row has a zero, so no row is a linear combination of the rows below it.

Theorem 7.1.5 (Existence of a Basis). If V is a nonzero subspace of R^n, then there exists a basis for V that has at most n vectors.
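
The independence test behind Example 7.1.3 is easy to check numerically: k vectors are linearly independent exactly when the matrix having them as rows has rank k. A minimal numpy sketch (illustrative, not part of the notes):

```python
import numpy as np

# Example 7.1.3: the vectors are independent iff the matrix having
# them as rows has rank equal to the number of vectors.
V = np.array([[0, 1, 0],
              [1, 1, 0],
              [0, 1, 3]], dtype=float)

print(np.linalg.matrix_rank(V))  # 3 -> linearly independent
```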

Theorem 7.1.6. All bases of a nonzero subspace of R^n have the same number of vectors.

Proof. Let V be a nonzero subspace of R^n, and suppose B_1 = {v_1, ..., v_k} and B_2 = {w_1, ..., w_m} are bases for V. We have to show m = k. Suppose k < m. Since B_1 spans V, we can express each w_i (i = 1, 2, ..., m) in terms of v_1, ..., v_k:

w_1 = a_11 v_1 + a_21 v_2 + ... + a_k1 v_k
w_2 = a_12 v_1 + a_22 v_2 + ... + a_k2 v_k
...
w_m = a_1m v_1 + a_2m v_2 + ... + a_km v_k    (7.1)

Consider the homogeneous system

[ a_11  a_12  ...  a_1m ] [ c_1 ]   [ 0 ]
[ a_21  a_22  ...  a_2m ] [ c_2 ] = [ 0 ]
[ ...                   ] [ ... ]   [ . ]
[ a_k1  a_k2  ...  a_km ] [ c_m ]   [ 0 ]

of k equations in m unknowns. Since k < m, it has a nontrivial solution; thus there exist numbers c_1, c_2, ..., c_m, not all zero, such that

c_1 a_11 + c_2 a_12 + ... + c_m a_1m = 0
c_1 a_21 + c_2 a_22 + ... + c_m a_2m = 0
...
c_1 a_k1 + c_2 a_k2 + ... + c_m a_km = 0    (7.2)

Now c_1 w_1 + ... + c_m w_m equals

c_1 (a_11 v_1 + a_21 v_2 + ... + a_k1 v_k)
+ c_2 (a_12 v_1 + a_22 v_2 + ... + a_k2 v_k)
...
+ c_m (a_1m v_1 + a_2m v_2 + ... + a_km v_k)    (7.3)

Collecting terms, the coefficient of each v_i is one of the sums in (7.2), hence zero. So c_1 w_1 + ... + c_m w_m = 0 with the c_i not all zero, which contradicts the linear independence of B_2 = {w_1, ..., w_m}.

Definition 7.1.7. If V is a nonzero subspace of R^n, then the dimension of V, written dim(V), is the number of vectors in a basis for V.

Dimension of a Solution Space

The solution of a homogeneous linear system Ax = 0, as produced by Gauss-Jordan elimination, has the form

x = t_1 v_1 + ... + t_s v_s,

where v_1, ..., v_s are linearly independent (see Section 3.5). These vectors are called canonical solutions, and the set {v_1, ..., v_s} is called a canonical basis for the solution space.

Example 7.1.8. Find the canonical basis for the solution space of the homogeneous linear system

x_1 + 3x_2 - 2x_3 + 2x_5 = 0
2x_1 + 6x_2 - 5x_3 - 2x_4 + 4x_5 - 3x_6 = 0
2x_1 + 6x_2 + 8x_4 + 4x_5 + 18x_6 = 0
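
As a quick computational check of Example 7.1.8 (with the coefficients as reconstructed above), sympy's nullspace() returns a basis of the solution space of Ax = 0 obtained by Gauss-Jordan elimination, which is exactly the canonical basis. A minimal sketch:

```python
from sympy import Matrix

# Coefficient matrix of the homogeneous system in Example 7.1.8.
A = Matrix([[1, 3, -2,  0, 2,  0],
            [2, 6, -5, -2, 4, -3],
            [2, 6,  0,  8, 4, 18]])

# nullspace() returns a basis of the solution space of Ax = 0,
# i.e. the canonical basis produced by Gauss-Jordan elimination.
for v in A.nullspace():
    print(v.T)
```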

Dimension of a Hyperplane

Example 7.1.9. If a = (a_1, ..., a_n) is a nonzero vector in R^n, then the hyperplane a^⊥ is defined by the equation

a_1 x_1 + ... + a_n x_n = 0.

Theorem 7.1.10. If a is a nonzero vector in R^n, then dim(a^⊥) = n - 1.

7.2 Properties of Bases

Properties of Bases

Theorem 7.2.1. If S = {v_1, ..., v_k} is a basis for a subspace V of R^n, then every vector in V can be expressed in exactly one way as a linear combination of the vectors in S.

Theorem 7.2.2. Let S = {v_1, ..., v_k} be a finite set of vectors in a nonzero subspace V of R^n.

(1) If S spans V but is not a basis for V, then a basis for V can be obtained by removing appropriate vectors from S.
(2) If S is linearly independent but is not a basis for V, then a basis for V can be obtained by adding appropriate vectors to S.

Theorem 7.2.3. If V is a nonzero subspace of R^n, then dim(V) is the maximum number of linearly independent vectors in V.

Subspaces of Subspaces

Theorem 7.2.4. If V and W are subspaces of R^n and V is a subspace of W, then:

(1) 0 ≤ dim(V) ≤ dim(W) ≤ n.
(2) V = W if and only if dim(V) = dim(W).

Theorem 7.2.5. Let S = {v_1, ..., v_k} be a nonempty set of vectors in R^n, and let S' be a set that results from adding additional vectors in R^n to S.

(1) If the additional vectors are in span(S), then span(S') = span(S).
(2) If span(S') = span(S), then the additional vectors are in span(S).
(3) If span(S') and span(S) have the same dimension, then the additional vectors are in span(S) and span(S') = span(S).

Spanning and Linear Independence

Theorem 7.2.6.
(1) A set of k linearly independent vectors in a k-dimensional subspace of R^n is a basis for that subspace.
(2) A set of k vectors that spans a k-dimensional subspace of R^n is a basis for that subspace.
(3) A set of fewer than k vectors in a k-dimensional subspace of R^n cannot span that subspace.
(4) A set of more than k vectors in a k-dimensional subspace of R^n is linearly dependent.

Unifying Theorem

Theorem 7.2.7. If A is an n×n matrix and T_A is the linear operator on R^n with standard matrix A, then the following statements are equivalent.

(1) The reduced row echelon form of A is I_n.
(2) A is expressible as a product of elementary matrices.
(3) A is invertible.
(4) Ax = 0 has only the trivial solution.
(5) Ax = b is consistent for every b ∈ R^n.
(6) Ax = b has exactly one solution for every b ∈ R^n.
(7) det(A) ≠ 0.
(8) λ = 0 is not an eigenvalue of A.
(9) T_A is one-to-one.
(10) The column vectors of A are linearly independent.
(11) The row vectors of A are linearly independent.
(12) The column vectors of A span R^n.
(13) The row vectors of A span R^n.
(14) The column vectors of A form a basis for R^n.
(15) The row vectors of A form a basis for R^n.

7.3 Fundamental Spaces of a Matrix

Rank of a Matrix

If A is an m×n matrix, then there are three important spaces associated with A.

(1) The row space of A, denoted row(A), is the subspace of R^n spanned by the rows of A.
(2) The column space of A, denoted col(A), is the subspace of R^m spanned by the columns of A.
(3) The null space of A, denoted null(A), is the subspace of R^n consisting of the solutions of Ax = 0.

Considering A^T, we obtain one more space, null(A^T). These four subspaces are called the fundamental spaces of A.

Definition 7.3.1. The dimension of the row space of a matrix A is called the rank of A, denoted rank(A), and the dimension of the null space of A is called the nullity of A, denoted nullity(A).

Orthogonal Complements

Definition 7.3.2. If S is a nonempty set in R^n, then the orthogonal complement of S, denoted S^⊥, is the set of all vectors in R^n that are orthogonal to every vector in S.

Example 7.3.3.
(1) If L is a line through the origin of R^3, then L^⊥ is the plane through the origin that is perpendicular to L.
(2) If S is the set of row vectors of an m×n matrix A, then S^⊥ is the solution space of Ax = 0.

Theorem 7.3.4. If S is a nonempty set in R^n, then S^⊥ is a subspace of R^n.

Example 7.3.5.
(1) Find the orthogonal complement of the vectors v_1 = (1,1,0), v_2 = (0,1,3) in R^3.
(2) Find the orthogonal complement of the same vectors in R^4.

Properties of Orthogonal Complements

Theorem 7.3.6.
(1) If W is a subspace of R^n, then W ∩ W^⊥ = {0}.
(2) If S is a nonempty set in R^n, then S^⊥ = (span(S))^⊥.

(3) If W is a subspace of R^n, then (W^⊥)^⊥ = W.

Theorem 7.3.7. If A is an m×n matrix, then the row space of A and the null space of A are orthogonal complements.

Proof. If x is in the null space of A, then Ax = 0; in other words, x is orthogonal to every row of A, hence to the row space of A. The converse also holds.

If we apply this theorem to A^T we obtain the following.

Theorem 7.3.8. If A is an m×n matrix, then the column space of A and the null space of A^T are orthogonal complements.

The two theorems can be summarized as follows:

row(A)^⊥ = null(A),    null(A)^⊥ = row(A),
col(A)^⊥ = null(A^T),  null(A^T)^⊥ = col(A).    (7.4)

Theorem 7.3.9.
(1) Elementary row operations do not change the row space of a matrix.
(2) Elementary row operations do not change the null space of a matrix.
(3) The nonzero row vectors in any row echelon form of a matrix form a basis for the row space of the matrix.

Theorem 7.3.10. Let A and B be matrices with the same number of columns. Then the following statements are equivalent.
(1) A and B have the same row space.
(2) A and B have the same null space.
(3) The row vectors of A are linear combinations of the row vectors of B, and conversely.

Proof of (1) ⇔ (2). The row space and the null space of a matrix are orthogonal complements of each other. Hence if A and B have the same row space, they must have the same null space, and conversely.

Finding a Basis by Row Reduction

Problem: find a basis for a subspace W of R^n that is spanned by the vectors S = {v_1, ..., v_k}.

Example 7.3.11.
(1) Find a basis for the subspace W of R^5 spanned by the vectors

v_1 = (1,0,0,0,2), v_2 = (2,1,-3,-2,4), v_3 = (0,5,-14,-9,0), v_4 = (2,10,-28,-18,4).

(2) Find a basis for W^⊥.

sol. (1) Let

A = [  1   0    0    0   2 ]
    [  2   1   -3   -2   4 ]
    [  0   5  -14   -9   0 ]
    [  2  10  -28  -18   4 ]    (7.5)

Reducing to row echelon form gives

U = [ 1  0   0   0  2 ]
    [ 0  1  -3  -2  0 ]
    [ 0  0   1   1  0 ]
    [ 0  0   0   0  0 ]    (7.6)

Extracting the nonzero rows, we obtain the basis

w_1 = (1,0,0,0,2),  w_2 = (0,1,-3,-2,0),  w_3 = (0,0,1,1,0).

Or, continuing to the reduced row echelon form

R = [ 1  0  0  0  2 ]
    [ 0  1  0  1  0 ]
    [ 0  0  1  1  0 ]
    [ 0  0  0  0  0 ]    (7.7)

we obtain another basis:

w_1 = (1,0,0,0,2),  w_2 = (0,1,0,1,0),  w_3 = (0,0,1,1,0).

(2) Note that row(A) = W. Hence W^⊥ = row(A)^⊥ = null(A). Thus we need to compute the null space of A.

But Ax = 0 is equivalent to Rx = 0, where R is given in (7.7). Thus

x_1 + 2x_5 = 0,  x_2 + x_4 = 0,  x_3 + x_4 = 0,

from which we take the two free variables s = x_5, t = x_4. So

(x_1, x_2, x_3, x_4, x_5) = (-2s, -t, -t, t, s) = s(-2,0,0,0,1) + t(0,-1,-1,1,0).    (7.8)

Thus the following vectors form a basis for W^⊥:

v_1' = (-2,0,0,0,1),  v_2' = (0,-1,-1,1,0).

Determining Whether a Vector is in a Given Space

We consider the following problems:

(1) Given a set of vectors S = {v_1, ..., v_n} in R^m, find conditions under which a vector b = (b_1, b_2, ..., b_m) will lie in span(S).
(2) Given an m×n matrix A, find conditions under which b = (b_1, b_2, ..., b_m) will lie in col(A).
(3) Given a linear transformation T : R^n → R^m, find conditions under which b = (b_1, b_2, ..., b_m) will lie in ran(T).

You can check that these problems are equivalent!

Example 7.3.12. Find conditions under which the vector b = (b_1, b_2, ..., b_5) will lie in the span of the vectors v_1, ..., v_4 of Example 7.3.11.

sol. A direct way is to determine when b can be written as a linear combination of v_1, ..., v_4, i.e., when we can find numbers x_1, ..., x_4 such that

x_1 v_1 + ... + x_4 v_4 = b.    (7.9)

This is a system of the form Cx = b, where the successive columns of C are v_1, ..., v_4. Thus the augmented matrix is

[ 1   2    0    2  | b_1 ]
[ 0   1    5   10  | b_2 ]
[ 0  -3  -14  -28  | b_3 ]    (7.10)
[ 0  -2   -9  -18  | b_4 ]
[ 2   4    0    4  | b_5 ]

Elimination gives

[ 1  2  0   2  | b_1             ]
[ 0  1  5  10  | b_2             ]
[ 0  0  1   2  | b_3 + 3b_2      ]
[ 0  0  0   0  | b_4 - b_3 - b_2 ]
[ 0  0  0   0  | b_5 - 2b_1      ]

The consistency conditions are

b_4 - b_3 - b_2 = 0,   b_5 - 2b_1 = 0.

Solution 2 (focusing on rows rather than columns). Recall Theorem 7.2.5. The vector b lies in span{v_1, ..., v_4} if and only if this space has the same dimension as span{v_1, ..., v_4, b}, that is, if and only if the matrix A with row vectors v_1, ..., v_4 has the same rank as the matrix with row vectors v_1, ..., v_4, b. Thus adjoining the vector b to A yields

[  1    0     0     0    2  ]
[  2    1    -3    -2    4  ]
[  0    5   -14    -9    0  ]    (7.11)
[  2   10   -28   -18    4  ]
[ b_1  b_2   b_3   b_4  b_5 ]

Reducing the first four rows as before and then eliminating the last row against them, we get

[ 1  0  0        0         2        ]
[ 0  1  0        1         0        ]
[ 0  0  1        1         0        ]    (7.12)
[ 0  0  0        0         0        ]
[ 0  0  0  b_4 - b_3 - b_2  b_5 - 2b_1 ]

For this matrix to have rank 3 we must have b_4 - b_3 - b_2 = 0 and b_5 - 2b_1 = 0, which is the same condition as before.

Solution 3. Note that b lies in the subspace W = span{v_1, ..., v_4} if and only if b is orthogonal to every vector in W^⊥. A basis for W^⊥ was found in Example 7.3.11:

u_1 = (-2,0,0,0,1),  u_2 = (0,-1,-1,1,0).

Since b must be orthogonal to u_1 and u_2, we require b·u_1 = 0 and b·u_2 = 0, hence

-2b_1 + b_5 = 0   and   -b_2 - b_3 + b_4 = 0,

which is the same condition as before.

Example 7.3.13. Determine which of the vectors

b_1 = (7,-2,5,3,14),  b_2 = (7,-2,5,3,6),  b_3 = (0,-1,3,-2,0)

lie in the subspace of R^5 spanned by the vectors v_1, ..., v_4 of Example 7.3.11.

Method 1. One way is to check the conditions found earlier: -2b_1 + b_5 = 0 and -b_2 - b_3 + b_4 = 0.

Method 2. We form the systems Cx = b_1, Cx = b_2, Cx = b_3 and see whether these systems have solutions. Consider the combined augmented matrix [C | b_1 b_2 b_3]:

[ 1   2    0    2  |  7   7   0 ]
[ 0   1    5   10  | -2  -2  -1 ]
[ 0  -3  -14  -28  |  5   5   3 ]    (7.13)
[ 0  -2   -9  -18  |  3   3  -2 ]
[ 2   4    0    4  | 14   6   0 ]

Elimination (row echelon form) gives

[ 1  2  0   2  |  7   7   0 ]
[ 0  1  5  10  | -2  -2  -1 ]
[ 0  0  1   2  | -1  -1   0 ]
[ 0  0  0   0  |  0   0  -4 ]
[ 0  0  0   0  |  0  -8   0 ]

We see that only the vector b_1 = (7,-2,5,3,14) lies in the subspace spanned by the vectors v_1, ..., v_4.
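
Both methods above amount to a rank comparison, so they are easy to check by machine. A minimal numpy sketch for Example 7.3.13 (with the entries as reconstructed above):

```python
import numpy as np

# Example 7.3.13: b lies in span{v1,...,v4} exactly when adjoining b
# as an extra column does not raise the rank of C (columns v1,...,v4).
C = np.array([[1,  2,   0,   2],
              [0,  1,   5,  10],
              [0, -3, -14, -28],
              [0, -2,  -9, -18],
              [2,  4,   0,   4]], dtype=float)

for b in ([7, -2, 5, 3, 14], [7, -2, 5, 3, 6], [0, -1, 3, -2, 0]):
    b = np.array(b, dtype=float).reshape(-1, 1)
    in_span = (np.linalg.matrix_rank(np.hstack([C, b]))
               == np.linalg.matrix_rank(C))
    print(b.ravel(), "in span:", in_span)   # True, False, False
```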

7.4 Dimension Theorem and Its Implications

Dimension Theorem for Matrices

Let us recall: if Ax = 0 is a homogeneous linear system with n unknowns, and if the reduced row echelon form of the augmented matrix has r nonzero rows, then the system has n - r free variables. This is called the dimension theorem for homogeneous linear systems. Since, for a homogeneous system, the augmented matrix (augmented with a zero right-hand side) and the coefficient matrix have the same number of nonzero rows in reduced row echelon form, we can restate the dimension theorem as

number of free variables = n - rank(A),

or

rank(A) + number of free variables = number of columns.    (7.14)

But the number of free variables is the same as the nullity of A. Hence we have

Theorem 7.4.1 (Dimension Theorem for Matrices). If A is an m×n matrix, then

rank(A) + nullity(A) = n.    (7.15)

Example 7.4.2. For the matrix of Example 7.3.11,

A = [  1   0    0    0   2 ]
    [  2   1   -3   -2   4 ]
    [  0   5  -14   -9   0 ]
    [  2  10  -28  -18   4 ]

we have rank(A) + nullity(A) = 3 + 2 = 5.    (7.16)

Extending a Linearly Independent Set to a Basis

Given a linearly independent set of vectors {v_1, v_2, ..., v_k}, we would like to extend it to a basis of R^n. One way is to form a matrix A having v_1, v_2, ..., v_k

as rows and consider the system Ax = 0. Solving this system, we find a basis w_{k+1}, w_{k+2}, ..., w_n of the null space of A (the dimension of null(A) is n - k). Each w_i is orthogonal to every v_j, since null(A) and row(A) are orthogonal. Hence the set

{v_1, v_2, ..., v_k, w_{k+1}, w_{k+2}, ..., w_n}

is linearly independent and hence forms a basis of R^n.

Example 7.4.3. Given the linearly independent vectors v_1 = (2,0,4,0) and v_2 = (1,2,-1,0), extend them to a basis for R^4.

Form the matrix having these vectors as rows:

A = [ 2  0   4  0 ]
    [ 1  2  -1  0 ]    (7.17)

Find the null space of A by solving Ax = 0. A row echelon form is

R = [ 1  0   2  0 ]
    [ 0  2  -3  0 ]

Thus x_1 + 2x_3 = 0 and 2x_2 - 3x_3 = 0, from which we get

x = (-2s, (3/2)s, s, t) = s(-2, 3/2, 1, 0) + t(0, 0, 0, 1).

Thus the vectors

v_1 = (2,0,4,0),  v_2 = (1,2,-1,0),  w_3 = (-2, 3/2, 1, 0),  w_4 = (0,0,0,1)

form a basis for R^4.
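
The extension procedure of Example 7.4.3 can be carried out with sympy, whose nullspace() supplies the vectors w_3, w_4 orthogonal to the rows of A. A minimal sketch:

```python
from sympy import Matrix

# Example 7.4.3: extend {v1, v2} to a basis of R^4 by adjoining a basis
# of null(A), where A has v1 and v2 as its rows.
A = Matrix([[2, 0,  4, 0],
            [1, 2, -1, 0]])

extension = A.nullspace()                 # basis of null(A) = row(A)^perp
basis = [A.row(0).T, A.row(1).T] + extension
B = Matrix.hstack(*basis)
print(B.rank())                           # 4 -> the four vectors form a basis
```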

Consequences of the Dimension Theorem for Matrices

Theorem 7.4.4. If an m×n matrix A has rank k, then
(1) A has nullity n - k.
(2) Every row echelon form of A has k nonzero rows.
(3) Every row echelon form of A has m - k zero rows.
(4) The homogeneous linear system Ax = 0 has k pivot (leading) variables and n - k free variables.

Theorem 7.4.5 (Dimension Theorem for Subspaces). If W is a subspace of R^n, then

dim(W) + dim(W^⊥) = n.    (7.18)

Proof. We may assume W ≠ {0}. Choose a basis for W and let A be the matrix having these basis vectors as rows. Obviously, the matrix A has n columns. Its row space is W and its null space is W^⊥, so from the dimension theorem (Theorem 7.4.1) we see

dim(W) + dim(W^⊥) = rank(A) + nullity(A) = n.

Theorem 7.4.6. If A is an n×n matrix and T_A is the linear operator on R^n with standard matrix A, then the following statements are equivalent.
(1) The reduced row echelon form of A is I_n.
(2) A is expressible as a product of elementary matrices.
(3) A is invertible.
(4) Ax = 0 has only the trivial solution.
(5) Ax = b is consistent for every b ∈ R^n.
(6) Ax = b has exactly one solution for every b ∈ R^n.
(7) det(A) ≠ 0.
(8) λ = 0 is not an eigenvalue of A.
(9) T_A is one-to-one.
(10) T_A is onto.
(11) The column vectors of A are linearly independent.
(12) The row vectors of A are linearly independent.

(13) The column vectors of A span R^n.
(14) The row vectors of A span R^n.
(15) The column vectors of A form a basis for R^n.
(16) The row vectors of A form a basis for R^n.
(17) rank(A) = n.
(18) nullity(A) = 0.

More on Hyperplanes

Theorem 7.4.7. If W is a subspace of R^n with dimension n - 1, then there is a nonzero vector a for which W = a^⊥; that is, W is a hyperplane through the origin in R^n.

Proof. From the dimension theorem, it follows that dim(W^⊥) = 1, and thus W^⊥ is the span of some nonzero vector, say a, such that W^⊥ = span{a}. Also, we see

W = (W^⊥)^⊥ = (span{a})^⊥ = a^⊥.

Theorem 7.4.8. The orthogonal complement of a hyperplane through the origin in R^n is a line through the origin in R^n, and the orthogonal complement of a line through the origin in R^n is a hyperplane through the origin in R^n. Specifically, if a is a nonzero vector in R^n, then the line span{a} and the hyperplane a^⊥ are orthogonal complements of one another.

Rank One Matrices

Fact about rank one matrices: if rank(A) = 1, then the row space of A is spanned by some nonzero vector a, all the row vectors are scalar multiples of a, and the null space of A is a^⊥.

An example of a rank one matrix: the outer product of two vectors u and v,

uv^T = [ u_1 v_1  u_1 v_2  u_1 v_3  ...  u_1 v_n ]
       [ u_2 v_1  u_2 v_2  u_2 v_3  ...  u_2 v_n ]
       [   ...                                   ]
       [ u_m v_1  u_m v_2  u_m v_3  ...  u_m v_n ]

All the rows of a rank one matrix are multiples of a single vector, and all the columns of a rank one matrix are multiples of a single vector. For example, in

[ 2  4  6  0 ]
[ 3  6  9  0 ]

every row is a multiple of (1, 2, 3, 0) and every column is a multiple of (2, 3).

Theorem 7.4.9. If u is an m×1 vector and v is an n×1 vector, then the outer product A = uv^T is a rank one matrix. Conversely, if A is an m×n rank one matrix, then A can be written as an outer product of two vectors.

Proof. (⇐) Let A be an m×n rank one matrix; then all the rows of A are multiples of a single row, say v^T. Then

A = [ u_1 v^T ]     [ u_1 ]
    [ u_2 v^T ]  =  [ u_2 ]  v^T  =  uv^T.
    [   ...   ]     [ ... ]
    [ u_m v^T ]     [ u_m ]

A Symmetric Rank One Matrix

An example of a symmetric rank one matrix:

uu^T = [ u_1 u_1  u_1 u_2  u_1 u_3  ...  u_1 u_n ]
       [ u_2 u_1  u_2 u_2  u_2 u_3  ...  u_2 u_n ]
       [   ...                                   ]
       [ u_n u_1  u_n u_2  u_n u_3  ...  u_n u_n ]
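
A small numpy illustration of the outer-product construction, using the example matrix above:

```python
import numpy as np

# A rank one matrix as an outer product u v^T (Theorem 7.4.9).
u = np.array([2.0, 3.0])
v = np.array([1.0, 2.0, 3.0, 0.0])

A = np.outer(u, v)                 # [[2, 4, 6, 0], [3, 6, 9, 0]]
print(A)
print(np.linalg.matrix_rank(A))    # 1
```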

Theorem 7.4.10. If u is an n×1 column vector, then the outer product uu^T is a symmetric rank one matrix. Conversely, if A is an n×n symmetric matrix of rank one, then A can be written as uu^T or -uu^T for some column vector u.

Proof. (⇐) Let A be an n×n symmetric matrix of rank one; then by the theorem above A = uv^T for some vectors u and v. Since A is symmetric, we have (uv^T)^T = vu^T = uv^T. Hence every row of A is a multiple of the vector u^T as well as a multiple of the vector v^T. Thus u = ±k^2 v for some number k, and we see

A = ±k^2 vv^T = ±(kv)(kv)^T.

Hence A is of the form uu^T or -uu^T.

7.5 The Rank Theorem and Its Implications

The Rank Theorem

First we recall the following theorem.

Theorem 7.5.1. The row space and the column space of a matrix have the same dimension.

Example 7.5.2. Consider

A = [ 1  2  0  0  1 ]
    [ 0  1  1  0  2 ]
    [ 0  1  5  0  1 ]

Reducing, we get

[ 1  2  0  0   1 ]
[ 0  1  1  0   2 ]
[ 0  0  4  0  -1 ]

so the row rank is 3. Meanwhile, transposing and computing the column rank: reducing

A^T = [ 1  0  0 ]
      [ 2  1  1 ]
      [ 0  1  5 ]
      [ 0  0  0 ]
      [ 1  2  1 ]

to row echelon form also leaves three nonzero rows, so the column rank is 3 as well.

Theorem 7.5.3. If A is an m×n matrix, then

rank(A) = rank(A^T).    (7.19)

Recall Theorem 7.4.1 (Dimension Theorem for Matrices): if A is an m×n matrix, then rank(A) + nullity(A) = n. Applying this theorem to A^T, we obtain

rank(A^T) + nullity(A^T) = m.

Since rank(A^T) = rank(A), we can rewrite this as

rank(A) + nullity(A^T) = m.

If A is an m×n matrix of rank k, then the dimensions of the four fundamental spaces satisfy

dim(row(A)) = k,   dim(null(A))   = n - k,
dim(col(A)) = k,   dim(null(A^T)) = m - k.    (7.20)

Example 7.5.5. Find the dimensions of the fundamental spaces of the matrix A of Example 7.5.2. Since A is 3×5 with rank 3,

dim(row(A)) = dim(col(A)) = 3,  dim(null(A)) = 5 - 3 = 2,  dim(null(A^T)) = 3 - 3 = 0.
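
All four dimensions in (7.20) follow from the single number rank(A). A minimal numpy sketch for the matrix of Example 7.5.5 (with the entries as reconstructed above):

```python
import numpy as np

# Dimensions of the four fundamental spaces of the matrix in Example 7.5.5.
A = np.array([[1, 2, 0, 0, 1],
              [0, 1, 1, 0, 2],
              [0, 1, 5, 0, 1]], dtype=float)

m, n = A.shape
k = np.linalg.matrix_rank(A)
print("dim row(A) = dim col(A) =", k)    # 3
print("dim null(A)   =", n - k)          # 2
print("dim null(A^T) =", m - k)          # 0
```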

(Figure 7.1: Rank and nullity of A. The matrix A maps R^n to R^m; in R^n, row(A) has dimension k and null(A) has dimension n - k, while in R^m, col(A) has dimension k and null(A^T) has dimension m - k.)

Consistency and Rank

Theorem 7.5.6 (Consistency Theorem). Let Ax = b be an m×n system of linear equations. Then the following statements are equivalent.
(1) Ax = b is consistent.
(2) b is in the column space of A.
(3) The coefficient matrix A and the augmented matrix [A | b] have the same rank.

Example 7.5.7. Find (if any) the solutions of Ax = b, where

A = [ 1  0  1 ]
    [ 1  1  1 ]   and   b = (1, 2, 1).
    [ 0  1  3 ]

Definition 7.5.8. An m×n matrix A is said to have full column rank if its column vectors are linearly independent, and full row rank if its row vectors are linearly independent.

Theorem 7.5.9. Let A be an m×n matrix.
(1) A has full column rank if and only if the column vectors of A form a basis for the column space, i.e., rank(A) = n.
(2) A has full row rank if and only if the row vectors of A form a basis for the row space, i.e., rank(A) = m.

Theorem 7.5.10. Let A be an m×n matrix. Then the following statements are equivalent.
(1) Ax = 0 has only the trivial solution.
(2) Ax = b has at most one solution for every b ∈ R^m.
(3) A has full column rank.

Proof. The equivalence of (1) and (2) is the content of Theorem 3.5.3. For (1) ⇔ (3), let a_1, ..., a_n be the column vectors of A, and write Ax = 0 in the form

x_1 a_1 + ... + x_n a_n = 0.    (7.21)

Then Ax = 0 has only the trivial solution if and only if the vectors a_1, ..., a_n are linearly independent.

Overdetermined and Underdetermined Linear Systems

Theorem 7.5.11. Let A be an m×n matrix.
(1) (Overdetermined) If m > n, then the system Ax = b is inconsistent for some vector b ∈ R^m.
(2) (Underdetermined) If m < n, then for every vector b ∈ R^m the system Ax = b is either inconsistent or has infinitely many solutions.

Matrices of the Form A^T A and AA^T

Let A be a matrix with column vectors a_1, a_2, ..., a_n. Then

A^T A = [ a_1·a_1  a_1·a_2  ...  a_1·a_n ]
        [ a_2·a_1  a_2·a_2  ...  a_2·a_n ]
        [   ...                          ]    (7.22)
        [ a_n·a_1  a_n·a_2  ...  a_n·a_n ]

On the other hand, if r_1, r_2, ..., r_m are the row vectors of A, then

AA^T = [ r_1·r_1  r_1·r_2  ...  r_1·r_m ]
       [ r_2·r_1  r_2·r_2  ...  r_2·r_m ]
       [   ...                          ]    (7.23)
       [ r_m·r_1  r_m·r_2  ...  r_m·r_m ]

Theorem 7.5.12. Let A be an m×n matrix.
(1) A and A^T A have the same null space.
(2) A and A^T A have the same row space.
(3) A^T and A^T A have the same column space.

(4) A and A^T A have the same rank.

Proof (sketch). If Ax = 0 then A^T Ax = 0; conversely, if A^T Ax = 0 then x^T A^T Ax = ||Ax||^2 = 0, so Ax = 0. Hence A and A^T A have the same null space, and therefore (by the dimension theorem and orthogonality of row space and null space) the same row space and the same rank.

Theorem 7.5.13. Let A be an m×n matrix.
(1) A^T and AA^T have the same null space.
(2) A^T and AA^T have the same row space.
(3) A and AA^T have the same column space.
(4) A and AA^T have the same rank.

Some Unifying Theorems

Theorem 7.5.14. Let A be an m×n matrix. Then the following statements are equivalent.
(1) Ax = 0 has only the trivial solution.
(2) Ax = b has at most one solution for every b ∈ R^m.
(3) A has full column rank, i.e., rank(A) = n.
(4) The n×n matrix A^T A is invertible.

Theorem 7.5.15. Let A be an m×n matrix. Then the following statements are equivalent.
(1) A^T x = 0 has only the trivial solution.
(2) A^T x = b has at most one solution for every b ∈ R^n.
(3) A has full row rank.
(4) The m×m matrix AA^T is invertible.

Example 7.5.16. Verify these statements for

A = [ 1  0 ]
    [ 1  2 ]
    [ 3  1 ]

7.6 Pivot Theorem and Its Implications

Basis Problem Revisited

Consider finding a basis for a subspace W spanned by a set of vectors S = {v_1, ..., v_s}. There are two possibilities:

(1) Find any basis for W.
(2) Find a basis for W consisting of vectors in S.

The first basis problem may be solved by forming a matrix whose rows are the vectors of S, reducing it to a row echelon form, and extracting the nonzero row vectors. One way to solve the second basis problem is to form a matrix A whose columns are the vectors of S and find a basis for the column space of A.

Some remarks: we know that row operations do not change the row space. However, row operations do change the column space.

Example 7.6.1. Let

A = [ 1   1  2 ]
    [ 2  -2  0 ]
    [ 3  -1  2 ]

and let B = E_31(-3) E_21(-2) A:

B = [ 1   1   2 ]
    [ 0  -4  -4 ]
    [ 0  -4  -4 ]

If we let A = [c_1, c_2, c_3] and B = [c_1', c_2', c_3'], then we see that

c_1 + c_2 - c_3 = 0   and   c_1' + c_2' - c_3' = 0.

In general, if B is row equivalent to A (i.e., EA = B for some product of elementary matrices E), then the solutions of Ax = 0 and Bx = 0 are the same. Hence the following relation holds:

x_1 c_1 + x_2 c_2 + ... + x_n c_n = 0   if and only if   x_1 c_1' + x_2 c_2' + ... + x_n c_n' = 0.

Theorem 7.6.2. Let A and B be row equivalent matrices.
(1) If some subset of the column vectors of A is linearly independent, then the corresponding column vectors of B are linearly independent, and vice versa.

(2) If some subset of the column vectors of A is linearly dependent, then the corresponding column vectors of B are linearly dependent, and vice versa. Moreover, the column vectors in the two matrices have the same dependency relationships.

Example 7.6.3. Find a basis for the column space consisting of column vectors of

A = [  1   3   4  -2   5   4 ]
    [  2   6   9  -1   8   2 ]
    [  2   6   9  -1   9   7 ]
    [ -1  -3  -4   2  -5  -4 ]

sol. A row echelon form is

U = [ 1  3  4  -2   5   4 ]
    [ 0  0  1   3  -2  -6 ]
    [ 0  0  0   0   1   5 ]
    [ 0  0  0   0   0   0 ]

We see rank(A) = 3. Hence it suffices to choose three linearly independent columns. Since linear independence of columns is not changed by row operations, we may choose the columns corresponding to the leading 1's in the echelon form. Thus columns 1, 3, and 5 of A suffice.

Definition 7.6.4. The columns chosen above (corresponding to the leading 1's in a row echelon form) are called pivot columns.

Theorem 7.6.5 (Pivot Theorem). The pivot columns of a nonzero matrix A form a basis for the column space of A.

Algorithm 1 (Finding a Basis). If W is a subspace spanned by the vectors S = {v_1, ..., v_s}, then the following procedure produces a basis for W from S. Steps 4 and 5 give a way to express the vectors in S not in the basis as linear combinations of basis vectors.

Step 1. Form the matrix A that has v_1, ..., v_s as successive column vectors.
Step 2. Reduce A to a row echelon form U and identify the pivot columns.
Step 3. Extract the pivot columns of A as a basis for W.

Step 4. If it is desired to express the vectors that are not in the basis as linear combinations of basis vectors, continue reducing U to the reduced row echelon form R.
Step 5. Find the relations between the columns of R by inspection; the same relations hold between the columns of A.

Example 7.6.6. (a) Find a basis for the column space consisting of column vectors of

A = [  1   3   4  -2   5   4 ]
    [  2   6   9  -1   8   2 ]
    [  2   6   9  -1   9   7 ]
    [ -1  -3  -4   2  -5  -4 ]

and (b) express the column vectors that are not in the basis as linear combinations of basis vectors.

sol. (a) Row reduction gives

U = [ 1  3  4  -2   5   4 ]
    [ 0  0  1   3  -2  -6 ]
    [ 0  0  0   0   1   5 ]
    [ 0  0  0   0   0   0 ]

We see that columns 1, 3, and 5 of A form a basis.

(b) We need the relations between the columns. The reduced row echelon form is

R = [ 1  3  0  -14  0  -37 ]
    [ 0  0  1    3  0    4 ]
    [ 0  0  0    0  1    5 ]
    [ 0  0  0    0  0    0 ]

If we denote the columns of the reduced row echelon form by c_1', c_2', ..., c_6', we see by inspection that

c_2' = 3c_1',   c_4' = -14c_1' + 3c_3',   c_6' = -37c_1' + 4c_3' + 5c_5'.

Since the dependency relations between columns are not changed by row operations (Theorem 7.6.2), the same relations hold for the columns of A:

c_2 = 3c_1,   c_4 = -14c_1 + 3c_3,   c_6 = -37c_1 + 4c_3 + 5c_5.
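
Algorithm 1 is exactly what sympy's rref() exposes: it returns the reduced row echelon form together with the pivot column indices. A minimal sketch for Example 7.6.6:

```python
from sympy import Matrix

# Example 7.6.6 via Algorithm 1: rref() gives the reduced row echelon
# form and the (0-based) indices of the pivot columns.
A = Matrix([[ 1,  3,  4, -2,  5,  4],
            [ 2,  6,  9, -1,  8,  2],
            [ 2,  6,  9, -1,  9,  7],
            [-1, -3, -4,  2, -5, -4]])

R, pivots = A.rref()
print(pivots)                        # (0, 2, 4) -> columns 1, 3, 5 of A
basis = [A.col(j) for j in pivots]   # basis for col(A) drawn from A itself
```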

Bases for Fundamental Spaces

Let U be a row echelon form of a matrix A. We have seen how to find bases for three of the fundamental spaces of A by reducing to row echelon form:

(1) The nonzero rows of U form a basis for row(A).
(2) The columns of U with leading 1's identify the pivot columns of A, and these form a basis for col(A).
(3) The canonical solutions of Ax = 0 form a basis for null(A).

How do we find a basis for null(A^T)? An obvious answer is to apply row reduction to A^T and find the solutions of A^T x = 0. However, it would be desirable to find such a basis by applying row reduction to A itself. But how? First note that the dimension of null(A^T) is m - k, and that A^T x = 0 is the same as x^T A = 0.

Algorithm 2. If A is an m×n matrix with rank k, and if k < m, then we can find a basis for null(A^T) by the following procedure.

Step 1. Adjoin the m×m identity matrix I_m to the right-hand side of A to form [A | I_m].
Step 2. Apply row operations to [A | I_m] until A is in row echelon form; denote the result by [U | E].
Step 3. Repartition [U | E] by separating out the zero rows of U:

[ V | E_1 ]   (k rows, with V the k×n nonzero part of U)
[ 0 | E_2 ]   (m - k rows)

Step 4. The row vectors of E_2 form a basis for null(A^T).
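
A minimal sympy sketch of Algorithm 2, assuming echelon_form() on the augmented block [A | I_m] behaves like the hand computation (pivots are produced left to right, so rows whose A-part vanishes collect at the bottom with their bookkeeping rows in the identity part):

```python
from sympy import Matrix, eye

# Algorithm 2: reduce [A | I_m]; rows whose A-part is zero carry, in
# their I-part, a basis for null(A^T).
A = Matrix([[ 1,  3,  4, -2,  5,  4],
            [ 2,  6,  9, -1,  8,  2],
            [ 2,  6,  9, -1,  9,  7],
            [-1, -3, -4,  2, -5, -4]])
m, n = A.shape

R = A.row_join(eye(m)).echelon_form()
E2 = [R[i, n:] for i in range(m)
      if all(R[i, j] == 0 for j in range(n))]
for y in E2:
    print(y, y * A)   # y * A = 0, so y^T lies in null(A^T)
```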

Optional proof. A vector y ∈ R^m lies in null(A^T) if and only if y^T A = 0. Applying elementary row operations to [A | I_m] we get [EA | E] = [U | E]. Now partition

[U | E] = [ V | E_1 ]
          [ 0 | E_2 ]

where V is the k×n matrix of nonzero rows of U. Hence

[ V ]         [ E_1 ]       [ E_1 A ]
[ 0 ]  = U  = [ E_2 ] A  =  [ E_2 A ]

From this we see E_2 A = 0, where E_2 is (m-k)×m. Thus the row vectors of E_2 are orthogonal to col(A) and hence belong to null(A^T). It remains to show that the row vectors of E_2 form a basis: the m - k rows of E_2 are rows of the invertible matrix E, so they are clearly linearly independent, and since the dimension of null(A^T) is m - k, they span the null space of A^T.

Example 7.6.7. Find a basis for null(A^T) using the procedure above, where

A = [  1   3   4  -2   5   4 ]
    [  2   6   9  -1   8   2 ]
    [  2   6   9  -1   9   7 ]
    [ -1  -3  -4   2  -5  -4 ]

Column-Row Factorization

Theorem 7.6.8 (Column-Row Factorization). If A is an m×n matrix with rank k, then A can be factored as

A = CR,    (7.24)

where C is the m×k matrix whose column vectors are the pivot columns of A and R is the k×n matrix whose row vectors are the nonzero rows in the reduced row echelon form of A.

Proof. Applying elementary row operations to [A | I_m], we get the reduced row echelon form [EA | E] = [R_0 | E]. Now partition R_0 and E^{-1} as

R_0 = [ R ]       and   E^{-1} = [ C  D ],
      [ 0 ]

where R consists of the nonzero rows of R_0, C consists of the first k columns of E^{-1}, and D consists of the last m - k columns of E^{-1}. Hence

A = E^{-1} R_0 = [ C  D ] [ R ] = CR + D·0 = CR.    (7.25)
                          [ 0 ]

Here the successive pivot columns of A correspond to the standard basis vectors e_1, e_2, ..., e_k appearing as columns of R, so the j-th pivot column of A is C e_j, which is the j-th column of C.

Example 7.6.9. Find a column-row factorization of

A = [ 1  2   8 ]
    [ 1  1   5 ]
    [ 2  5  19 ]

using the reduced row echelon form

[ 1  0  2 ]
[ 0  1  3 ]
[ 0  0  0 ]

The first two columns of A are its pivot columns, and the corresponding nonzero rows of the reduced row echelon form are (1,0,2) and (0,1,3). Hence we have

A = [ 1  2 ]  [ 1  0  2 ]
    [ 1  1 ]  [ 0  1  3 ]  = CR.
    [ 2  5 ]
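
The factorization of Theorem 7.6.8 can be assembled directly from rref(). A minimal sympy sketch for Example 7.6.9:

```python
from sympy import Matrix

# Column-row factorization A = CR (Theorem 7.6.8) for Example 7.6.9.
A = Matrix([[1, 2,  8],
            [1, 1,  5],
            [2, 5, 19]])

R0, pivots = A.rref()
C = A[:, list(pivots)]         # pivot columns of A
R = R0[:len(pivots), :]        # nonzero rows of the rref
assert C * R == A
print(C, R)
```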

Column-Row Expansion

We have seen in Chapter 3 that a matrix product can be expressed as a sum of outer products of columns from the first factor and rows from the second factor. Thus the previous factorization has the following interpretation.

Theorem 7.6.10 (Column-Row Expansion). If A is an m×n matrix with rank k, then A can be expanded as

A = c_1 r_1 + c_2 r_2 + ... + c_k r_k,    (7.26)

where c_1, c_2, ..., c_k are the successive pivot columns of A and r_1, r_2, ..., r_k are the successive nonzero rows in the reduced row echelon form of A.

Example 7.6.11. In the example above k = 2, so A = c_1 r_1 + c_2 r_2, i.e.,

[ 1  2   8 ]   [ 1 ]             [ 2 ]             [ 1  0  2 ]   [ 0  2   6 ]
[ 1  1   5 ] = [ 1 ] [1 0 2]  +  [ 1 ] [0 1 3]  =  [ 1  0  2 ] + [ 0  1   3 ]
[ 2  5  19 ]   [ 2 ]             [ 5 ]             [ 2  0  4 ]   [ 0  5  15 ]

7.7 Projection Theorem and Its Implications

Orthogonal Projection onto Lines

Example 7.7.1. Let a be any nonzero vector and consider the orthogonal projection of a vector x onto the line W = span{a}. If x_1 is the orthogonal projection of x, then we have

x = x_1 + x_2,   x_1 = ka for some scalar k,   x_2 ⊥ a.

Since 0 = x_2·a = (x - ka)·a, we see k = (x·a)/(a·a). Thus the orthogonal projection of any vector x onto the line W = span{a} is given by

Proj_a x = ((x·a)/(a·a)) a.    (7.27)

Here x - Proj_a x is the component of x orthogonal to a.

The orthogonal projection onto the line through the origin making angle θ with the x-axis has the matrix representation

P_θ = [ (1/2)(1 + cos 2θ)   (1/2) sin 2θ       ]   [ cos²θ       sinθ cosθ ]
      [ (1/2) sin 2θ        (1/2)(1 - cos 2θ)  ] = [ sinθ cosθ   sin²θ     ]    (7.28)

Deriving the matrix representation (7.28) from (7.27). Let us derive the matrix representation (7.28) again. Let

u = a/||a|| := (cosθ, sinθ).

Compute

Proj_u e_1 = (e_1·u)u = (cosθ)u,   Proj_u e_2 = (e_2·u)u = (sinθ)u.

Thus we obtain

P_θ = [ Proj_u e_1  Proj_u e_2 ] = [ cos²θ       sinθ cosθ ]
                                   [ sinθ cosθ   sin²θ     ]

Projection Operators on R^n

In operator language, the orthogonal projection operator T : R^n → R^n is defined by

T(x) = Proj_a x = ((x·a)/(a·a)) a.    (7.29)

Theorem 7.7.2. The standard matrix for the operator T(x) = Proj_a x is

P = (1/(a^T a)) aa^T.    (7.30)

Normalizing a to u = a/||a||, we get

P = uu^T.    (7.31)

Proof. It suffices to compute the columns of T. With a_j the j-th entry of a,

T(e_j) = ((e_j·a)/(a·a)) a = (a_j/(a·a)) a.

Hence

P = [ (a_1/(a·a)) a, (a_2/(a·a)) a, ..., (a_n/(a·a)) a ] = (1/(a·a)) a [a_1, a_2, ..., a_n] = (1/(a^T a)) aa^T.    (7.32)

Example 7.7.3. Find the standard matrix P when a = (2, -1, 1).

sol. a^T a = 4 + 1 + 1 = 6 and

aa^T = [  2 ]              [  4  -2   2 ]
       [ -1 ] [2 -1 1]  =  [ -2   1  -1 ]
       [  1 ]              [  2  -1   1 ]

Hence

P = (1/6) [  4  -2   2 ]
          [ -2   1  -1 ]
          [  2  -1   1 ]

(Figure 7.2: Projection of x onto W = span{a}. The vector x decomposes as x_1 along a plus x_2 orthogonal to a, with θ the angle between x and a.)

Orthogonal Projection onto General Subspaces

Theorem 7.7.4 (Projection Theorem for Subspaces). If W is a subspace of R^n, then every vector x in R^n can be expressed in exactly one way as

x = x_1 + x_2,    (7.33)

where x_1 is in W and x_2 is in W^⊥.

Proof. We may assume W ≠ {0}, and let {w_1, ..., w_k} be a basis of W (note that k ≤ n). Let M be the matrix having the w_i as columns. Then W is the column space of M, and the orthogonal complement of W is the null space of M^T. If x can be written in the form

x = x_1 + x_2,    (7.34)

where x_1 is in W and x_2 is in W^⊥, then x_1 = Mv for some v ∈ R^k and M^T x_2 = 0. But then we have M^T(x - x_1) = 0, i.e., M^T(x - Mv) = 0. Now consider

M^T(x - Mv) = 0,   or   M^T M v = M^T x.    (7.35)

The matrix M has full column rank, hence M^T M is invertible, so we have the unique solution

v = (M^T M)^{-1} M^T x.    (7.36)

In the special case where W is a line through the origin, the vectors x_1 and x_2 are the same as in the previous example. So we have actually shown that the expression in (7.34) is possible in exactly one way.

The vector x_1 is called the orthogonal projection of x onto W (written Proj_W x) and x_2 is called the orthogonal projection of x onto W^⊥ (written Proj_{W^⊥} x):

x = Proj_W x + Proj_{W^⊥} x.    (7.37)

The relations Proj_W x = x_1 = Mv and (7.36) are rewritten in the following theorem.

Theorem 7.7.5. If W is a nonzero subspace of R^n, and if M is any matrix whose column vectors form a basis for W, then

Proj_W x = M(M^T M)^{-1} M^T x    (7.38)

for x ∈ R^n. The standard matrix corresponding to the projection is

P = M(M^T M)^{-1} M^T.    (7.39)

The action of M^T is to eliminate the component orthogonal to col(M), and the M on the left maps back into col(M). One way to check this formula

is to verify that Px_1 = PMv = M(M^T M)^{-1} M^T Mv = Mv and Px_2 = M(M^T M)^{-1} M^T x_2 = 0.

Example 7.7.6. (1) Find the standard matrix P for the orthogonal projection of R^3 onto the plane x - 3y - 4z = 0. (2) Use the matrix P to find the orthogonal projection of the vector (1, -2, -1).

First we find a basis for the plane and then form a matrix M to find P. Writing x = 3y + 4z, the vectors in the plane are given by

(x, y, z) = s(3, 1, 0) + t(4, 0, 1).

Thus the set {(3,1,0), (4,0,1)} is a basis. Hence

M = [ 3  4 ]              M^T M = [ 3  1  0 ] [ 3  4 ]   [ 10  12 ]
    [ 1  0 ]     and              [ 4  0  1 ] [ 1  0 ] = [ 12  17 ]
    [ 0  1 ]                                  [ 0  1 ]

and

(M^T M)^{-1} = (1/26) [ 17  -12 ]
                      [ -12  10 ]

Therefore the projection matrix P = M(M^T M)^{-1} M^T is given by

P = (1/26) [ 25    3    4 ]
           [  3   17  -12 ]
           [  4  -12   10 ]
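
A minimal numpy check of Example 7.7.6, using the formula P = M(M^T M)^{-1} M^T; the vector (1, -2, -1) is as reconstructed above:

```python
import numpy as np

# Example 7.7.6: orthogonal projection of R^3 onto the plane x - 3y - 4z = 0.
M = np.array([[3.0, 4.0],
              [1.0, 0.0],
              [0.0, 1.0]])

P = M @ np.linalg.inv(M.T @ M) @ M.T
print(np.round(26 * P))     # [[25, 3, 4], [3, 17, -12], [4, -12, 10]]
print(P @ np.array([1.0, -2.0, -1.0]))   # projection of (1, -2, -1)
```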

When Does a Matrix Represent an Orthogonal Projection?

From the previous discussion we know that P = M(M^T M)^{-1} M^T is the standard matrix for the projection operator onto the space W spanned by the columns of M. We observe:

(1) P^T = P.
(2) P² = P.

Suppose we have an orthogonal projection P onto a k-dimensional subspace W of R^n. Then

(1) the column space of P must be k-dimensional;
(2) P is symmetric;
(3) moreover P² = P (P is idempotent).

These properties exactly characterize an orthogonal projection. In fact, we have

Theorem 7.7.7 (Projection Matrix). An n×n matrix P is the standard matrix for an orthogonal projection of R^n onto a k-dimensional subspace of R^n if and only if P is symmetric and idempotent, having rank k. The subspace W is the column space of P.

Example 7.7.8. Show that A is the standard matrix for an orthogonal projection of R^3 onto a line through the origin:

A = (1/9) [ 1  2  2 ]
          [ 2  4  4 ]
          [ 2  4  4 ]

We see that A is symmetric, idempotent, and has rank 1. Hence it is an orthogonal projection onto a line. The first column, proportional to (1, 2, 2), is a basis for the image space W.

Strang Diagram

Consider the system Ax = b, where A is an m×n matrix. Let W = row(A), so W^⊥ = null(A). Recall

x = Proj_W x + Proj_{W^⊥} x.    (7.40)

Applying this to W = row(A) and W^⊥ = null(A), we get

x = x_row(A) + x_null(A).    (7.41)

Similarly, applying it to W = col(A) and W^⊥ = null(A^T), any vector b ∈ R^m decomposes as

b = b_col(A) + b_null(A^T).    (7.42)

Also note the following relations:

dim(row(A)) + dim(null(A)) = n,    (7.43)
dim(col(A)) + dim(null(A^T)) = m.    (7.44)

The system Ax = b is consistent if and only if b is in the column space of A, if and only if b_null(A^T) = 0.

Full Column Rank and Consistency of a Linear System

Theorem 7.7.9. Let A be an m×n matrix and let b be in the column space of A.
(1) If A has full column rank, then the system Ax = b has a unique solution, and that solution is in the row space of A.
(2) If A does not have full column rank, then the system Ax = b has infinitely many solutions, but there is a unique solution in the row space of A. Moreover, among all the solutions, the solution in the row space of A has the smallest norm.

Proof. (1) If A has full column rank, then by Theorem 7.5.10 (7.5.6 of the book), the system Ax = b is either inconsistent or has a unique solution. But since b is in the column space of A, it must be consistent, and there must exist a unique solution.

(2) Since A does not have full column rank, the system Ax = 0 has infinitely many solutions, and hence Ax = b has infinitely many solutions. We recall

x = x_row(A) + x_null(A),    (7.45)

so that

b = A(x_row(A) + x_null(A)) = A x_row(A).    (7.46)

So there is at least one solution in the row space of A. (This also proves the second part of (1).) To see the uniqueness of the solution in the row space for case (2), suppose x_r and x_r' are two solutions in the row space. Then A(x_r - x_r') = 0, so x_r - x_r' ∈ null(A). However, x_r - x_r' is also in the row space of A, and since row(A)^⊥ = null(A), we must have

x_r - x_r' ∈ row(A) ∩ null(A) = {0}.

Finally, any solution satisfies

||x||² = ||x_row(A)||² + ||x_null(A)||² ≥ ||x_row(A)||².

Theorem 7.7.10. If W is a subspace of R^n, then (W^⊥)^⊥ = W.

Orthogonal Projection onto W^⊥

If W is a nonzero subspace of R^n, and if M is any matrix whose column vectors form a basis for W, then

Proj_{W^⊥} x = x - M(M^T M)^{-1} M^T x = (I - M(M^T M)^{-1} M^T) x    (7.47)

for x ∈ R^n. The standard matrix corresponding to this orthogonal projection is

I - P = I - M(M^T M)^{-1} M^T.    (7.48)

Example 7.7.11. Find the standard matrix corresponding to the orthogonal projection onto the orthogonal complement of the plane x - 3y - 4z = 0. Since

P = M(M^T M)^{-1} M^T = (1/26) [ 25    3    4 ]
                               [  3   17  -12 ]
                               [  4  -12   10 ],

we get

I - P = (1/26) [  1   -3   -4 ]
               [ -3    9   12 ]
               [ -4   12   16 ]

7.8 Best Approximation and Least Squares

Minimum Distance Problems

Given a subspace W and a vector b ∈ R^n, consider the problem of finding a vector ŵ ∈ W that is closest to b, i.e., find ŵ ∈ W such that

||b - ŵ|| ≤ ||b - w||   for all w ∈ W.

Such a vector, if it exists, is called a best approximation to b from W. Recall that

b = Proj_W b + Proj_{W^⊥} b.

(Figure 7.3: Projection. The projection ŵ = Proj_W b of b onto W is the point of W closest to b.)

Theorem 7.8.1 (Best Approximation Theorem). If W is a subspace and b is a vector in R^n, there is a unique best approximation to b from W, namely ŵ = Proj_W b.

Proof. For every vector w ∈ W we have

b - w = (b - Proj_W b) + (Proj_W b - w).

Since the two terms are orthogonal (the first vector is in W^⊥ and the second vector is in W), we have

||b - w||² = ||b - Proj_W b||² + ||Proj_W b - w||²,

and hence ||b - Proj_W b|| ≤ ||b - w||.

Here we see that

d = ||b - Proj_W b|| = ||Proj_{W^⊥} b||    (7.49)

is the distance from b to W.

Example 7.8.2. Find the distance from a point b = (b_1, ..., b_n) to the hyperplane a_1 x_1 + ... + a_n x_n = 0.

Denote the hyperplane by W. Then W^⊥ = span{a}. We see the distance to the space W is

||Proj_{W^⊥} b|| = ||Proj_a b|| = |a·b| / ||a|| = |a_1 b_1 + ... + a_n b_n| / √(a_1² + ... + a_n²).

Definition 7.8.3. If A is an m×n matrix and b is a vector in R^m, then a vector x̂ in R^n is a best approximate solution or least squares solution of Ax = b if

||b - Ax̂|| ≤ ||b - Ax||    (7.50)

for all x in R^n. The quantity ||b - Ax̂|| is called the least squares error.

Finding Least Squares Solutions

How do we find the least squares solutions of Ax = b? Noting that Ax is in the column space of A, we decompose b as

b = Proj_col(A) b + Proj_{col(A)^⊥} b.

Then the following is an orthogonal decomposition:

Ax - b = (Ax - Proj_col(A) b) - Proj_{col(A)^⊥} b  ∈  col(A) + col(A)^⊥.

The minimum is attained when we can find an x such that

Ax = Proj_col(A) b,    (7.51)

and

min_{x ∈ R^n} ||b - Ax|| = ||Proj_{col(A)^⊥} b||.

In practice, one rarely solves (7.51) directly to compute the least squares solution. Instead, rewriting (7.51) as

b - Ax = b - Proj_col(A) b    (7.52)

and multiplying by A^T, we see (since col(A)^⊥ is equal to the null space of A^T, and b - Proj_col(A) b lies in col(A)^⊥) that

A^T(b - Ax) = A^T(b - Proj_col(A) b) = 0.

This is equivalent to

A^T A x = A^T b,    (7.53)

which is called the normal equation associated with Ax = b.

Theorem 7.8.4.
(1) The least squares solutions of Ax = b are exactly the solutions of the normal equation

A^T A x = A^T b.    (7.54)

(2) If A has full column rank, the normal equation has a unique solution, namely

x̂ = (A^T A)^{-1} A^T b.    (7.55)

(3) If A does not have full column rank, the normal equation has infinitely many solutions, but there is a unique solution in the row space of A. Moreover, among all the solutions of the normal equation, the solution in the row space of A has the smallest norm.

Example 7.8.5. Find the least squares solution of the system

x_1 - x_2 = 4
3x_1 + 2x_2 = 1
-2x_1 + 4x_2 = 3

sol. Compute

x̂ = (A^T A)^{-1} A^T b.    (7.56)

Orthogonality of the Least Squares Error

Note again the orthogonal decomposition

Ax - b = (Ax - Proj_col(A) b) - Proj_{col(A)^⊥} b  ∈  col(A) + col(A)^⊥.

The least squares solution x̂ satisfies

Proj_col(A) b - Ax̂ = 0.    (7.57)

Hence x̂ is a least squares solution if and only if b - Ax̂ = Proj_{null(A^T)} b. Thus

least squares error vector = b - Ax̂ = Proj_{null(A^T)} b.    (7.58)
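
A minimal numpy sketch for Example 7.8.5, assuming the system reads as reconstructed above; it solves the normal equation (7.54) and confirms the orthogonality property (7.58):

```python
import numpy as np

# Example 7.8.5: least squares solution via the normal equations.
A = np.array([[ 1.0, -1.0],
              [ 3.0,  2.0],
              [-2.0,  4.0]])
b = np.array([4.0, 1.0, 3.0])

x_hat = np.linalg.solve(A.T @ A, A.T @ b)
print(x_hat)                       # least squares solution
print(A.T @ (b - A @ x_hat))       # ~0: error is orthogonal to col(A)
```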

Theorem 7.8.6. A vector x̂ is a least squares solution of Ax = b if and only if the error b - Ax̂ is orthogonal to the column space of A.

More Applications of Least Squares Solutions: Curve Fitting

Given a set of data points (x_1, y_1), (x_2, y_2), ..., (x_n, y_n) in the xy-plane, one would like to find a line (or a curve) that fits these data best in some sense. Assume we use the line y = a + bx. Then we must have

y_1 = a + bx_1
y_2 = a + bx_2    (7.59)
...
y_n = a + bx_n

that is,

[ 1  x_1 ]           [ y_1 ]
[ 1  x_2 ]  [ a ]    [ y_2 ]
[  ...   ]  [ b ]  = [ ... ]    (7.60)
[ 1  x_n ]           [ y_n ]

or

Mv = y.    (7.61)

Its least squares solution is obtained by solving

M^T M v = M^T y.    (7.62)

Least Squares Solution with a Higher-Degree Polynomial

Given data (x_1, y_1), (x_2, y_2), ..., (x_n, y_n), one would like to find a curve (or a line) that fits best in some sense. Assume we try a polynomial y = a_0 + a_1 x + ... + a_m x^m. Then we must

have

y_1 = a_0 + a_1 x_1 + ... + a_m x_1^m
y_2 = a_0 + a_1 x_2 + ... + a_m x_2^m    (7.63)
...
y_n = a_0 + a_1 x_n + ... + a_m x_n^m

that is, Mv = y with

[ 1  x_1  x_1²  ...  x_1^m ]  [ a_0 ]   [ y_1 ]
[ 1  x_2  x_2²  ...  x_2^m ]  [ a_1 ] = [ y_2 ]    (7.64)
[  ...                     ]  [ ... ]   [ ... ]
[ 1  x_n  x_n²  ...  x_n^m ]  [ a_m ]   [ y_n ]

Its least squares solution is obtained by

v = (M^T M)^{-1} M^T y.    (7.65)

(Figure 7.4: Fitting data by a linear function using the least squares method.)
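
A minimal numpy sketch of the straight-line fit (7.59)-(7.62); the data points here are made up for illustration:

```python
import numpy as np

# Fitting y = a + b x to data by least squares (the setup of (7.59)-(7.62)).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])

M = np.column_stack([np.ones_like(x), x])   # the matrix M of (7.60)
a, b = np.linalg.solve(M.T @ M, M.T @ y)    # normal equations (7.62)
print(a, b)                                 # intercept and slope
```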

7.9 Orthonormal Bases and Gram-Schmidt

Orthogonal and Orthonormal Bases

Orthogonal Projection Using Orthonormal Bases

Recall the following: if W is a nonzero subspace of R^n, and if M is any matrix whose column vectors form a basis for W, then

Proj_W x = M(M^T M)^{-1} M^T x    (7.66)

for x ∈ R^n. If the column vectors of M are orthonormal, we have M^T M = I and

Proj_W x = MM^T x.    (7.67)

The standard matrix corresponding to the projection is

P = MM^T.    (7.68)

Equation (7.67) can be restated in the following form.

Theorem 7.9.1. If {v_1, ..., v_k} is an orthonormal basis for a subspace W of R^n, then the orthogonal projection of x ∈ R^n onto W is given by

Proj_W x = (x·v_1)v_1 + ... + (x·v_k)v_k.    (7.69)

Example 7.9.2. Given an orthonormal basis {v_1, v_2} of a plane W, the orthogonal projection of a vector x onto W is computed directly as Proj_W x = (x·v_1)v_1 + (x·v_2)v_2.

Theorem 7.9.3. If {v_1, ..., v_k} is an orthonormal basis for a subspace W of R^n, then the orthogonal projection onto W can be expressed as

Proj_W x = (x·v_1)v_1 + ... + (x·v_k)v_k.    (7.70)

Proof. Let M = [v_1 v_2 ... v_k]. Then Proj_W x = MM^T x, and (7.70) is just a restatement of this.

Trace and Orthogonal Projections

Theorem 7.9.4. If P is the standard matrix for an orthogonal projection of R^n onto a subspace W, then trace(P) = rank(P).

Proof. First note that

P = MM^T = [v_1 v_2 ... v_k] [ v_1^T ]
                             [ v_2^T ]  =  v_1 v_1^T + v_2 v_2^T + ... + v_k v_k^T.    (7.71)
                             [  ...  ]
                             [ v_k^T ]

Direct computation shows that trace(P) = 1 + 1 + ... + 1 = k, since trace(v_i v_i^T) = ||v_i||² = 1.

Linear Combinations of Orthonormal Basis Vectors

Theorem 7.9.5. If {v_1, ..., v_k} is an orthonormal basis for a subspace W of R^n, and if w is a vector in W, then

w = (w·v_1)v_1 + ... + (w·v_k)v_k.    (7.72)

Finding Orthonormal Bases

Theorem 7.9.6. Every nonzero subspace of R^n has an orthonormal basis.

Proof. Let W be a nonzero subspace of R^n, and let {w_1, ..., w_k} be any basis for W. It suffices to show that we can construct an orthogonal basis (we normalize it to get an orthonormal basis). Let W_i = span{w_1, ..., w_i}, i = 1, 2, ..., k, and proceed as follows:

Step 1. Let v_1 = w_1.
Step 2. Construct a vector orthogonal to v_1 by computing the orthogonal projection of w_2 onto W_1 and subtracting it from w_2:

v_2 = w_2 - Proj_{W_1} w_2 = w_2 - ((w_2·v_1)/(v_1·v_1)) v_1.

Step 3. Similarly,

v_3 = w_3 - Proj_{W_2} w_3 = w_3 - ((w_3·v_1)/(v_1·v_1)) v_1 - ((w_3·v_2)/(v_2·v_2)) v_2.

Step 4. In general, we have

v_j = w_j - Proj_{W_{j-1}} w_j = w_j - Σ_{i=1}^{j-1} ((w_j·v_i)/(v_i·v_i)) v_i,   j = 2, ..., k.

This process is called the Gram-Schmidt process.

Example 7.9.7. Use the Gram-Schmidt process to construct an orthonormal basis for the plane x + y + z = 0 in R^3.

sol. We need any two linearly independent vectors from the plane. Writing the equation of the plane in parametric form, we introduce y = t_1, z = t_2, so that

x = -t_1 - t_2,  y = t_1,  z = t_2.

Choosing t_1 = 1, t_2 = 0 and then t_1 = 0, t_2 = 1, the resulting vectors are w_1 = (-1, 1, 0) and w_2 = (-1, 0, 1). Now use the Gram-Schmidt process:

v_1 = w_1 = (-1, 1, 0)
v_2 = w_2 - ((w_2·v_1)/(v_1·v_1)) v_1 = (-1, 0, 1) - (1/2)(-1, 1, 0) = (-1/2, -1/2, 1).

Normalizing,

q_1 = (-1/√2, 1/√2, 0),   q_2 = (-1/√6, -1/√6, 2/√6).

A Property of the Gram-Schmidt Process

Theorem 7.9.8. If S = {w_1, ..., w_k} is a basis for a nonzero subspace of R^n, and if S' = {v_1, ..., v_k} is the corresponding orthogonal basis produced by the Gram-Schmidt process, then

(1) {v_1, ..., v_j} is an orthogonal basis for span{w_1, ..., w_j} at the j-th step;
(2) v_j is orthogonal to span{w_1, ..., w_{j-1}} at the j-th step (j ≥ 2).
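
A minimal Python implementation of the Gram-Schmidt process of Theorem 7.9.6, applied to Example 7.9.7:

```python
import numpy as np

def gram_schmidt(ws):
    """Orthogonalize a list of independent vectors, then normalize."""
    vs = []
    for w in ws:
        # subtract the projection of w onto the span of the previous v's
        v = w - sum((w @ v0) / (v0 @ v0) * v0 for v0 in vs)
        vs.append(v)
    return [v / np.linalg.norm(v) for v in vs]

# Example 7.9.7: the plane x + y + z = 0 with basis w1, w2.
q1, q2 = gram_schmidt([np.array([-1.0, 1.0, 0.0]),
                       np.array([-1.0, 0.0, 1.0])])
print(q1, q2)   # (-1/sqrt2, 1/sqrt2, 0) and (-1/sqrt6, -1/sqrt6, 2/sqrt6)
```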

Extending Orthonormal Sets to Orthonormal Bases

Theorem 7.9.9. If W is a nonzero subspace of R^n, then
(1) every orthogonal set of nonzero vectors in W can be extended to an orthogonal basis for W;
(2) every orthonormal set in W can be extended to an orthonormal basis for W.

7.10 QR-Decomposition; Householder Transformation

QR-Decomposition

Suppose A is an m×k matrix with full column rank (this requires m ≥ k) whose successive column vectors are w_1, ..., w_k. If the Gram-Schmidt process is applied to these vectors to produce an orthonormal basis {q_1, ..., q_k} for the column space of A, and Q is the matrix whose column vectors are q_1, ..., q_k in order, what is the relationship between A and Q?

Let A and Q be the matrices having the w_i and q_i as columns, i.e.,

A = [w_1, w_2, ..., w_k],   Q = [q_1, q_2, ..., q_k].

We can express each vector w_i in terms of the orthonormal column vectors of Q as

w_i = Σ_{j=1}^{k} c_ij q_j.

By the orthonormal property of the q_j, we see c_ij = w_i·q_j, and hence

w_1 = (w_1·q_1)q_1 + (w_1·q_2)q_2 + ... + (w_1·q_k)q_k
w_2 = (w_2·q_1)q_1 + (w_2·q_2)q_2 + ... + (w_2·q_k)q_k
...
w_k = (w_k·q_1)q_1 + (w_k·q_2)q_2 + ... + (w_k·q_k)q_k.

By Theorem 7.9.8, q_j is orthogonal to w_i when i < j. Hence we have

w_1 = (w_1·q_1)q_1
w_2 = (w_2·q_1)q_1 + (w_2·q_2)q_2
...
w_k = (w_k·q_1)q_1 + (w_k·q_2)q_2 + ... + (w_k·q_k)q_k.

Let us form the upper triangular matrix

R = [ w_1·q_1  w_2·q_1  ...  w_k·q_1 ]
    [ 0        w_2·q_2  ...  w_k·q_2 ]
    [ ...                            ]    (7.73)
    [ 0        0        ...  w_k·q_k ]

Then we can see that A = QR, i.e.,

[w_1, w_2, ..., w_k] = [q_1, q_2, ..., q_k] [ w_1·q_1  w_2·q_1  ...  w_k·q_1 ]
                                            [ 0        w_2·q_2  ...  w_k·q_2 ]
                                            [ ...                            ]    (7.74)
                                            [ 0        0        ...  w_k·q_k ]

Theorem 7.10.1. If A is an m×k (m ≥ k) matrix with full column rank, then A can be factored as

A = QR,    (7.75)

where Q is an m×k matrix whose column vectors form an orthonormal basis for the column space of A, and R is a k×k invertible upper triangular matrix.

In general, a matrix factorization of the form A = QR, where the column vectors of Q are orthonormal and R is invertible and upper triangular, is called a QR-decomposition. The QR-decomposition is not unique! (It is unique if the diagonal entries satisfy r_ii > 0.) Note that in the construction above R is generated column by column; if we change the order to row-wise, we get the modified Gram-Schmidt process.

Other Ways to Obtain a QR-Decomposition: (Modified) Gram-Schmidt, Householder, and Rotation

One method of finding the QR-decomposition of a matrix with full column rank is to apply the Gram-Schmidt process to the column vectors of A, with R given by (7.73). Unfortunately, this produces large round-off error numerically, so it is not recommended for numerical purposes. There are other methods. One is to rearrange the order of orthogonalization (called modified Gram-Schmidt). Another method is to use Householder transformations. Still another method is to use Givens rotations.

Example 7.10.2. Find a QR-decomposition of the following matrix using the Gram-Schmidt process.

A = [ 1  -1  0 ]
    [ 0   1  1 ]
    [ 1   1  1 ]

sol. w_1 = (1,0,1)^T, w_2 = (-1,1,1)^T, w_3 = (0,1,1)^T.

v_1 = w_1 = (1,0,1)^T
v_2 = w_2 - ((w_2·v_1)/(v_1·v_1)) v_1 = (-1,1,1)^T - 0 = (-1,1,1)^T
v_3 = w_3 - ((w_3·v_1)/(v_1·v_1)) v_1 - ((w_3·v_2)/(v_2·v_2)) v_2
    = (0,1,1)^T - (1/2)(1,0,1)^T - (2/3)(-1,1,1)^T = (1/6, 1/3, -1/6)^T

q_1 = (1/√2)(1,0,1)^T,   q_2 = (1/√3)(-1,1,1)^T,   q_3 = (1/√6)(1,2,-1)^T

R = [ w_1·q_1  w_2·q_1  w_3·q_1 ]   [ √2  0   1/√2 ]
    [ 0        w_2·q_2  w_3·q_2 ] = [ 0   √3  2/√3 ]
    [ 0        0        w_3·q_3 ]   [ 0   0   1/√6 ]

Example 7.10.3. Find a QR-decomposition of the matrix

[ 1  -1   4 ]
[ 1   4  -2 ]
[ 1   4   2 ]
[ 1  -1   0 ]
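
For comparison, numpy computes a QR-decomposition via Householder reflections, one of the alternatives mentioned above; applied to the matrix of Example 7.10.2, the result agrees with the hand computation up to the signs of the columns of Q and rows of R:

```python
import numpy as np

# QR-decomposition of the matrix of Example 7.10.2.
A = np.array([[1.0, -1.0, 0.0],
              [0.0,  1.0, 1.0],
              [1.0,  1.0, 1.0]])

Q, R = np.linalg.qr(A)
print(Q)             # orthonormal columns
print(R)             # upper triangular
print(Q @ R - A)     # ~0
```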