
Math 2040 Chapter 6: Orthogonality and Least Squares

6.1 and some of 6.7: Inner Product, Length and Orthogonality.

Definition: If x, y ∈ R^n, then x · y = x_1 y_1 + ... + x_n y_n is the dot product of x and y.

Typical Problem: Compute [2, 4]^T · [5, 3]^T.

Important: The dot product of x and y is a scalar, NOT a vector.

Remark: The dot product is mysterious and beautiful and powerful.

The matrix product form of the dot product: When x is viewed as an n × 1 matrix, we get x · y = x^T y.

Theorem 1 page 376: The dot product is a commutative, bilinear, positive definite operation. That is,

1. (Commutativity) x · y = y · x,
2. (Bilinearity) (αx + βy) · z = α(x · z) + β(y · z) and x · (αy + βz) = α(x · y) + β(x · z), and
3. (Positive definiteness) x · x > 0 if x ≠ 0.

Remark: x · x = x_1^2 + ... + x_n^2, so, always, x · x ≥ 0. Also, x · x = 0 implies x_1^2 + ... + x_n^2 = 0, in which case x = 0. Thus, the positive definite property is equivalent to the property that always x · x ≥ 0, and x · x = 0 iff x = 0. Finally, it is worth noting that, in the presence of symmetry, the bilinearity property is equivalent to the two properties

(Distributivity of dot product over addition) x · (y + z) = x · y + x · z, and
(Scalars float) (αx) · y = α(x · y) = x · (αy).

Abstraction: The abstraction of the above properties yields the idea of an inner product on a vector space. Thus, we define an inner product space to be a vector space V together with the assignment to each u, v ∈ V of a scalar ⟨u, v⟩, called the inner product of u and v, such that for all u, v, w ∈ V and α, β ∈ R, the following conditions hold.

1. (Commutativity) ⟨u, v⟩ = ⟨v, u⟩,
2. (Bilinearity) ⟨αu + βv, w⟩ = α⟨u, w⟩ + β⟨v, w⟩ and ⟨u, αv + βw⟩ = α⟨u, v⟩ + β⟨u, w⟩, and
3. (Positive definiteness) ⟨u, u⟩ is positive if u ≠ 0.

Remark: The positive definiteness property is equivalent to the condition that ⟨u, u⟩ is nonnegative, and ⟨u, u⟩ = 0 iff u = 0. In the presence of inner product symmetry, the bilinearity property of the inner product is equivalent to the two properties

(Distributivity of inner product over addition) ⟨u, v + w⟩ = ⟨u, v⟩ + ⟨u, w⟩, and
(Scalars float) ⟨αu, v⟩ = α⟨u, v⟩ = ⟨u, αv⟩.
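
Code Sketch (Python/NumPy): a minimal check of the typical problem above and of the matrix product form x · y = x^T y. The vectors are the ones from the example; the variable names are invented for this sketch.

    import numpy as np

    x = np.array([2.0, 4.0])
    y = np.array([5.0, 3.0])

    # Dot product as a sum of coordinate-wise products: x . y = x1*y1 + ... + xn*yn
    dot_xy = np.dot(x, y)                                    # 2*5 + 4*3 = 22, a scalar, not a vector

    # Matrix product form: viewing x as an n x 1 matrix, x . y = x^T y
    dot_matrix_form = x.reshape(-1, 1).T @ y.reshape(-1, 1)  # a 1 x 1 matrix containing 22

    # Commutativity and positive definiteness can be spot-checked numerically
    assert np.isclose(np.dot(x, y), np.dot(y, x))
    assert np.dot(x, x) > 0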

Typical Problems: Establish each of the following.

1. R^n together with the dot product is an example of an inner product space.

2. Any polynomial p(t) of degree at most n (that is, any p(t) ∈ P_n) is completely determined by its values p(t_0), p(t_1), ..., p(t_n) at n + 1 distinct points t_0, t_1, ..., t_n in R. Given this, it is an easy matter to show that ⟨p(t), q(t)⟩ = p(t_0)q(t_0) + p(t_1)q(t_1) + ... + p(t_n)q(t_n) is an inner product on P_n whenever t_0, t_1, ..., t_n are n + 1 distinct points in R.

3. Recall that C[a, b] denotes the vector space of real valued continuous functions defined on the interval [a, b] = {x : a ≤ x ≤ b} in R. C[a, b] becomes an inner product space when the inner product is given by ⟨f, g⟩ = ∫_a^b f(x)g(x) dx for each f, g ∈ C[a, b].

Convention: Throughout what follows, unless otherwise specified, R^n is considered to be an inner product space with the inner product equal to the dot product.

Distance: The (Euclidean) distance from x to y in R^n is the nonnegative square root of (x_1 − y_1)^2 + ... + (x_n − y_n)^2, that is, +√((x_1 − y_1)^2 + ... + (x_n − y_n)^2).

Typical Problem: With x = [2, 4]^T and y = [5, 3]^T, find the distance from x to y in R^2.

Norm: The norm or length of x ∈ R^n is the distance from x to 0. Thus, the norm of x ∈ R^n is √(x_1^2 + x_2^2 + ... + x_n^2).

Typical Problem: With x = [2, 4]^T in R^2, find the length (or norm) of x.

Proposition: The distance from x to y in R^n is √((x − y) · (x − y)) and the norm of x is √(x · x).

Notation: The distance from x to y is denoted ‖x − y‖. The norm of x is denoted ‖x‖.

Unit Vectors: A vector x is a unit vector if it has length 1. If x ≠ 0, then x/‖x‖ is a unit vector in the direction of x.

Typical Problem: With x = [2, 4]^T in R^2, find a unit vector in the direction of x.

Abstraction: The distance in an inner product space V from u to v is defined by ‖u − v‖ = √⟨u − v, u − v⟩, and the norm of u ∈ V is defined to be ‖u‖ = √⟨u, u⟩.
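
Code Sketch (Python/NumPy): the distance, norm, and unit vector computations above, again with x = [2, 4]^T and y = [5, 3]^T; only the variable names are invented.

    import numpy as np

    x = np.array([2.0, 4.0])
    y = np.array([5.0, 3.0])

    # Distance from x to y: the nonnegative square root of (x1-y1)^2 + ... + (xn-yn)^2
    dist = np.linalg.norm(x - y)          # sqrt((2-5)^2 + (4-3)^2) = sqrt(10)

    # Norm (length) of x: the distance from x to the origin
    length = np.linalg.norm(x)            # sqrt(2^2 + 4^2) = sqrt(20)

    # Unit vector in the direction of x (requires x != 0)
    unit_x = x / np.linalg.norm(x)
    assert np.isclose(np.linalg.norm(unit_x), 1.0)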

Proposition: For any vectors u, v, and w in an inner product space V, each of the following properties holds.

1. (Symmetry) The distance from u to v is the distance from v to u; that is, ‖u − v‖ = ‖v − u‖.

2. (Degeneracy) The distance from u to v is zero if and only if u = v; that is, ‖u − v‖ = 0 iff u = v.

3. (Triangle Inequality) The distance from u to w is less than or equal to the distance from u to v plus the distance from v to w; that is, ‖u − w‖ ≤ ‖u − v‖ + ‖v − w‖.

Remark: The properties above allow further abstraction to more general spaces, giving us the important ideas of metrics, metric spaces, normed spaces, normed linear spaces, Banach spaces, etc.

Typical Problems: On P_2, consider the inner product ⟨p(t), q(t)⟩ = p(−1)q(−1) + p(0)q(0) + p(1)q(1) for each p(t), q(t) ∈ P_2. Let p(t) = 1 + t^2 and q(t) = 2 − 2t + t^2.

1. Compute ⟨p(t), q(t)⟩.
2. What is the length of p(t)?
3. What is the distance from p(t) to q(t)?
4. Find a unit vector in the direction of p(t).

Proposition: Let u and v be vectors in an inner product space V. Then

1. ‖u − v‖^2 = ⟨u − v, u − v⟩,
2. ‖u‖ = ‖u − 0‖ (i.e., the norm of u is its distance from the origin),
3. ‖u‖^2 = ⟨u, u⟩,
4. ‖u − v‖^2 = ‖u‖^2 − 2⟨u, v⟩ + ‖v‖^2, and
5. ‖u + v‖^2 = ‖u‖^2 + 2⟨u, v⟩ + ‖v‖^2.

Pythagoras' Theorem: In any inner product space V, ⟨u, v⟩ = 0 if and only if ‖u + v‖^2 = ‖u‖^2 + ‖v‖^2.

Proof: Examine item 5 in the last proposition above.

Definition: In an inner product space, vectors u and v are called orthogonal or perpendicular vectors if and only if ⟨u, v⟩ = 0. When u and v are orthogonal, we write u ⊥ v.

Typical problems:

1. Determine whether [1, 1, 1]^T ⊥ [1, 1, −2]^T in R^3. What about [1, 1, 1]^T ⊥ [1, 2, 2]^T?

2. On P_2, consider the inner product ⟨p(t), q(t)⟩ = p(−1)q(−1) + p(0)q(0) + p(1)q(1) for each p(t), q(t) ∈ P_2. Let p(t) = 1 + t^2 and q(t) = 2 − 2t + t^2. Determine whether p(t) ⊥ q(t).

3. Prove 0 ⊥ u for any vector u in an inner product space V.
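
Code Sketch (Python/NumPy): a small numerical evaluation of the P_2 inner product used in the typical problems above, with p(t) = 1 + t^2 and q(t) = 2 − 2t + t^2. The helper functions are invented for this sketch.

    import numpy as np

    # Evaluation points defining the inner product <p, q> = p(-1)q(-1) + p(0)q(0) + p(1)q(1)
    pts = np.array([-1.0, 0.0, 1.0])

    def p(t):
        return 1 + t**2

    def q(t):
        return 2 - 2*t + t**2

    def inner(f, g):
        return np.sum(f(pts) * g(pts))

    ip = inner(p, q)                        # <p, q>
    norm_p = np.sqrt(inner(p, p))           # length of p(t)
    dist = np.sqrt(inner(lambda t: p(t) - q(t), lambda t: p(t) - q(t)))  # distance from p to q
    # A unit vector in the direction of p(t) is p(t)/norm_p; p and q are orthogonal iff <p, q> = 0
    print(ip, norm_p, dist, np.isclose(ip, 0.0))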

Proposition: If W is a subspace of an inner product space V, then {u : u ⊥ w for all w ∈ W} is also a subspace of V.

Definition: If W is a subspace of an inner product space V, then {u : u ⊥ w for all w ∈ W} is denoted by W^⊥ (spoken "W perp"), and called the orthogonal complement of W.

Proposition: If W is a subspace of an inner product space V, then W ∩ W^⊥ = {0}.

Proposition: If W = Span{w_1, ..., w_k} in an inner product space V, then u ∈ W^⊥ if and only if u ⊥ w_i for all i = 1, ..., k.

Theorem 3 page 381 Generalized: Let A be an m × n matrix. Then

(1) Col(A^T) and Nul(A) are each other's orthogonal complements in R^n, and
(2) Col(A) and Nul(A^T) are each other's orthogonal complements in R^m.

Proof: See the Appendix below.

Sections 6.2 through 6.4 via 6.7: Orthogonality and Gram-Schmidt.

Definition: A set of vectors which are pairwise orthogonal is called an orthogonal set. An orthogonal set which is also a basis of an inner product space is called an orthogonal basis. If all of the vectors in an orthogonal set are unit vectors, then the set is called orthonormal.

Warning: A matrix whose columns form an orthonormal set is called an orthogonal matrix, NOT an orthonormal matrix. There is no such thing as an orthonormal matrix. This is a historical accident. Also, it is more usual to speak of n × n orthogonal matrices than of n × m ones.

Linear Independence of Orthogonal Sets: If {b_1, ..., b_p} is an orthogonal set of nonzero vectors in an inner product space V, then it is linearly independent. This is because 0 = α_1 b_1 + ... + α_i b_i + ... + α_p b_p implies

⟨b_i, 0⟩ = ⟨b_i, α_1 b_1 + ... + α_i b_i + ... + α_p b_p⟩ (since 0 = α_1 b_1 + ... + α_p b_p)
= α_1⟨b_i, b_1⟩ + ... + α_i⟨b_i, b_i⟩ + ... + α_p⟨b_i, b_p⟩ (by bilinearity)
= α_i⟨b_i, b_i⟩ (since ⟨b_i, b_j⟩ = 0 for all i ≠ j and ⟨b_i, b_i⟩ ≠ 0),

so α_i = ⟨b_i, 0⟩/⟨b_i, b_i⟩ = 0 (since ⟨b_i, 0⟩ = 0) for all i.
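
Code Sketch (Python/NumPy): a numerical sanity check of the generalized Theorem 3 stated above; it is a spot check, not a proof. The 3 × 4 matrix is made up for illustration, and a basis of Nul(A) is read off from the singular value decomposition.

    import numpy as np

    # A made-up 3 x 4 matrix (rank 2, so Nul(A) is nontrivial)
    A = np.array([[1., 2., 0., 1.],
                  [0., 1., 1., 2.],
                  [1., 3., 1., 3.]])

    # Basis for Nul(A): right singular vectors belonging to (numerically) zero singular values
    _, s, Vt = np.linalg.svd(A)
    rank = np.sum(s > 1e-10)
    null_basis = Vt[rank:].T                 # columns span Nul(A)

    # Every row of A (i.e., every column of A^T) is orthogonal to every null space vector
    print(np.allclose(A @ null_basis, 0))    # True: Col(A^T) sits inside (Nul(A))-perp

    # Dimensions also match: rank(A^T) = rank(A) and dim Nul(A) = n - rank(A)
    print(rank + null_basis.shape[1] == A.shape[1])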

Representations Using an Orthogonal Basis: An orthogonal basis provides a useful representation of an inner product space because coordinates are easily computed. To see this, let B = {b_1, ..., b_n} be an orthogonal basis of an inner product space V and let x ∈ V. Generally, [x]_B = P_B^{−1}(x) is difficult to compute, but not in this case. Write x = α_1 b_1 + ... + α_i b_i + ... + α_n b_n. Then, as just above (but using x instead of 0), ⟨b_i, x⟩ = α_i⟨b_i, b_i⟩, so α_i = ⟨b_i, x⟩/⟨b_i, b_i⟩ for each i. Therefore,

[x]_B = P_B^{−1}(x) = [⟨b_1, x⟩/⟨b_1, b_1⟩, ..., ⟨b_i, x⟩/⟨b_i, b_i⟩, ..., ⟨b_n, x⟩/⟨b_n, b_n⟩]^T.

Testing Orthogonality in R^n: In R^n, {v_1, ..., v_p} is an orthogonal set if and only if A^T A is a diagonal matrix, where A = [v_1 ... v_p], because if D = A^T A then d_{i,j} = v_i^T v_j = v_i · v_j for each i and j. Also, in R^n, {v_1, ..., v_p} is an orthonormal set if and only if A^T A = I_p. It is worth noting here, for the sake of reducing the number of computations, that regardless of whether D = A^T A is diagonal or not, it is symmetric (that is, D^T = D), so to determine whether D is diagonal, one only needs to check the entries strictly below its diagonal (if any are nonzero, then orthogonality fails). For orthonormality, one must also check that the diagonal entries equal 1. Finally, after repeating the warning that there is no such thing as an orthonormal matrix, we notice that an n × p matrix A is orthogonal iff A^T A = I_p, in which case we see that an n × n matrix A is orthogonal iff A^T = A^{−1}. Thus, {v_1, ..., v_n} is an orthonormal basis of R^n iff A = [v_1 ... v_n] is an orthogonal matrix.

Typical problems:

1. Show the standard basis {e_1, ..., e_n} of R^n is orthonormal.
2. Is {[1, 2, 2, 3]^T, [2, −1, 6, −4]^T} an orthogonal set in R^4? Orthonormal?
3. Is {[1, 2, 2, 3]^T, [2, −1, 6, −4]^T, [−2, 3, 1, 2]^T} an orthogonal set in R^4?
4. Let P_1 have the inner product defined by evaluation at t_1 = −1 and t_2 = 1 (that is, ⟨p(t), q(t)⟩ = p(−1)q(−1) + p(1)q(1)). Is B = {1 + t, 1 − t} orthogonal? An orthogonal basis? An orthonormal basis? How about S = {1, t}?
5. Repeat the last item but with t_1 = 0 and t_2 = 1.
6. Compute [2 + 3t]_B for B = {1 + t, 1 − t} in P_1 with the inner product given by evaluation at t_1 = −1 and t_2 = 1.
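
Code Sketch (Python/NumPy): the A^T A test for orthogonality and the coordinate formula α_i = ⟨b_i, x⟩/⟨b_i, b_i⟩, applied to a made-up orthogonal (not orthonormal) set in R^3; the vectors and names are chosen only for illustration.

    import numpy as np

    v1 = np.array([1., 1., 0.])
    v2 = np.array([1., -1., 0.])
    v3 = np.array([0., 0., 2.])
    A = np.column_stack([v1, v2, v3])

    # {v1, v2, v3} is orthogonal iff A^T A is diagonal; orthonormal iff A^T A = I
    D = A.T @ A
    print(np.allclose(D, np.diag(np.diag(D))))   # True: an orthogonal set
    print(np.allclose(D, np.eye(3)))             # False: not orthonormal (lengths are not 1)

    # Coordinates of x relative to the orthogonal basis B = {v1, v2, v3}:
    # alpha_i = <b_i, x> / <b_i, b_i>, with no matrix inversion needed
    x = np.array([3., 1., 4.])
    coords = np.array([np.dot(b, x) / np.dot(b, b) for b in (v1, v2, v3)])
    print(np.allclose(A @ coords, x))            # reconstruction check: x = sum of alpha_i b_i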

The Orthogonal Projection Theorem: Let B = {b_1, ..., b_p} be an orthogonal set of nonzero vectors in an inner product space V. Then for any y ∈ V, there is a unique vector

ŷ = (⟨y, b_1⟩/⟨b_1, b_1⟩) b_1 + ... + (⟨y, b_p⟩/⟨b_p, b_p⟩) b_p

in W = Span(B) (called the orthogonal projection of y onto W) and a unique vector z = y − ŷ in W^⊥ such that y = ŷ + z.

Proof: Obviously, the conclusion y = ŷ + z follows from the equation z = y − ŷ defining z. The existence and uniqueness of z are guaranteed by the existence and uniqueness of ŷ, respectively, because z = y − ŷ. The vector ŷ exists because it is given by a computable formula, and that ŷ ∈ W is obvious. The uniqueness of ŷ and the fact that z ∈ W^⊥ will be established in class.

Typical problems:

1. Find the orthogonal projection of [2, 0, 1, 2]^T onto the subspace of R^4 spanned by {[1, 2, 2, 3]^T, [2, −1, 6, −4]^T}.
2. Use an orthogonal projection to extend the orthogonal set {[1, 2, 2]^T, [0, 1, −1]^T} to an orthogonal basis of R^3.
3. Let P_1 have the inner product defined by evaluation at t_1 = −1 and t_2 = 1. Find the orthogonal projection of 2 + 3t onto Span{p(t)} in P_1, where p(t) = 1 − t.
4. Repeat the last item except use t_1 = 2 and t_2 = 1.

Lecture Target: This is the optimal spot to end the lecture of Thursday, August 3rd.

Cauchy-Schwarz Inequality: For any vectors u and v in an inner product space, |⟨u, v⟩| ≤ ‖u‖ ‖v‖. The elegant but tricky proof of this inequality is given in the textbook on page 432. This inequality is the key to the metric triangle property mentioned above (also, see page 433 of the text).

Alternative Notation: The orthogonal projection ŷ of y onto the subspace W is written on occasion as Proj_W y; that is, ŷ = Proj_W y.

The Best Approximation Theorem: For any subspace W = Span(B) and y ∈ V in a finite dimensional inner product space, Proj_W y is the best approximation to y by a vector in W; that is, ‖y − Proj_W y‖ ≤ ‖y − v‖ for all v ∈ W.

Proof: To be done in class.

Typical Problems:

1. Find the best approximation to [2, 0, 1, 2]^T in W = Span{[1, 2, 2, 3]^T, [2, −1, 6, −4]^T}.
2. Let P_1 have the inner product defined by evaluation at t_1 = −1 and t_2 = 1. Find the best approximation to 2 + 3t in Span{p(t)} in P_1, where p(t) = 1 − t.

The Gram-Schmidt Process: Let {b_1, ..., b_p} be a basis of a subspace W in an inner product space V. Let v_1 = b_1 and, for 1 < i ≤ p, let v_i = b_i − Proj_{W_i} b_i, where W_i = Span{v_1, ..., v_{i−1}}. Then {v_1, ..., v_p} is an orthogonal basis of W. Upon normalizing each v_i, we get an orthonormal basis {v_1/‖v_1‖, ..., v_p/‖v_p‖} of W. A short computational sketch follows the typical problems below.

Typical problems:

1. With B = {[0, 1, 1, 1]^T, [1, 1, 1, 0]^T, [1, 0, 1, 1]^T}, find the orthogonal projection of [0, 1, 1, 0]^T onto Span(B) in R^4.
2. Let P_2 have the inner product defined by evaluation at t_1 = −1, t_2 = 0 and t_3 = 1. Find an orthonormal basis for P_2.
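
Code Sketch (Python/NumPy): a compact implementation of the Gram-Schmidt process as described above, applied to the basis B of typical problem 1; the function name gram_schmidt is invented for this sketch.

    import numpy as np

    def gram_schmidt(vectors):
        """Return an orthogonal basis for Span(vectors), assuming the input list is linearly independent."""
        basis = []
        for b in vectors:
            # Subtract the orthogonal projection of b onto the span of the vectors built so far
            v = b - sum((np.dot(b, u) / np.dot(u, u)) * u for u in basis)
            basis.append(v)
        return basis

    B = [np.array([0., 1., 1., 1.]),
         np.array([1., 1., 1., 0.]),
         np.array([1., 0., 1., 1.])]

    V = gram_schmidt(B)
    # Normalizing each v_i gives an orthonormal basis of Span(B)
    Q = [v / np.linalg.norm(v) for v in V]
    # The new vectors are pairwise orthogonal
    print(np.allclose([np.dot(V[0], V[1]), np.dot(V[0], V[2]), np.dot(V[1], V[2])], 0))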

The QR Factorization: Let A be an m × n matrix with linearly independent columns. Let Q be the matrix whose columns are formed by applying the Gram-Schmidt process to the columns of A and normalizing each of the resulting columns. Then Q is an orthogonal matrix. Let R = Q^T A. Since Q is orthogonal, we have A = QR. Moreover, R is n × n, upper triangular, invertible, and has positive entries on its diagonal.

Typical Problem: Find the QR factorization of A = [a_1 a_2], where a_1 = [1, 2, 1]^T and a_2 = [1, 1, 1]^T.

6.5 Least Squares: If A is an m × n matrix and b ∈ R^m, then a least squares solution of Ax = b is a vector x̂ ∈ R^n such that ‖b − Ax̂‖ ≤ ‖b − Ax‖ for all x ∈ R^n. Clearly, the set of least squares solutions of Ax = b is the set of solutions of Ax̂ = b̂, where b̂ = Proj_{Col(A)} b. Now b − b̂ = b − Ax̂, so b − Ax̂ lies in the orthogonal complement of Col(A). This orthogonal complement is Nul(A^T), so A^T(b − Ax̂) = 0. Thus, we seek x̂ such that

A^T A x̂ = A^T b.

The system of equations A^T A x̂ = A^T b is called the system of normal equations for x̂ in the least squares problem Ax = b. When A has linearly independent columns, A^T A is invertible and x̂ = (A^T A)^{−1} A^T b. From this we get a matrix product description of the orthogonal projection, for b̂ = Ax̂ = A(A^T A)^{−1} A^T b.

The invertibility of A^T A mentioned above is not obvious. That A^T A is invertible when A has linearly independent columns proceeds by noticing Col(A^T) = R^n (since Nul(A) = {0}, and since, as is shown in the Appendix, Col(A^T) is the orthogonal complement of Nul(A)). Therefore, A^T is onto. It will follow from this that A^T A is also onto. To see this, let x ∈ R^n. Since A^T is onto, there is y ∈ R^m such that A^T y = x. Let ŷ be the orthogonal projection of y onto Col(A) in R^m and let z = y − ŷ. Then z ∈ Nul(A^T) because, again by the Appendix below, Nul(A^T) is the orthogonal complement of Col(A) in R^m. Therefore A^T z = 0 and y = z + ŷ. Now, since ŷ ∈ Col(A), there is a vector v ∈ R^n such that Av = ŷ.
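
Code Sketch (Python/NumPy): the QR factorization of the matrix from the typical problem above via the built-in routine, and a least squares solution via the normal equations; the right-hand side b is made up for illustration. Note that the built-in QR may differ from the Gram-Schmidt construction by signs on the diagonal of R.

    import numpy as np

    # The matrix from the QR typical problem above (3 x 2, linearly independent columns)
    A = np.array([[1., 1.],
                  [2., 1.],
                  [1., 1.]])

    # QR factorization: Q has orthonormal columns, R = Q^T A is upper triangular
    Q, R = np.linalg.qr(A)
    print(np.allclose(Q.T @ Q, np.eye(2)))   # Q^T Q = I
    print(np.allclose(Q @ R, A))             # A = QR

    # Least squares via the normal equations A^T A x_hat = A^T b, with a made-up right-hand side
    b = np.array([1., 0., 2.])
    x_hat = np.linalg.solve(A.T @ A, A.T @ b)
    print(np.allclose(A.T @ (b - A @ x_hat), 0))                      # residual lies in Nul(A^T)
    print(np.allclose(x_hat, np.linalg.lstsq(A, b, rcond=None)[0]))   # agrees with the library solver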

We have

A^T A v = A^T ŷ = 0 + A^T ŷ = A^T z + A^T ŷ = A^T(z + ŷ) = A^T y = x.

Therefore, as claimed, A^T A is an onto linear transformation. But A^T A is an n × n matrix, so, by the Invertible Matrix Theorem, A^T A is invertible (because it is onto). This argument provides an inverse statement to the one given in the assigned exercise 6.5.20.

Typical Problem: Let A have columns a_1 = [0, 1, 1, 1]^T, a_2 = [1, 1, 1, 0]^T, and a_3 = [1, 0, 1, 1]^T. Use least squares to find the orthogonal projection of [0, 1, 1, 0]^T onto Col(A) in R^4.

6.6 Least Squares Lines: In R^n, let 1 = [1, 1, ..., 1]^T be the column vector all of whose coordinates are 1. Given x, y ∈ R^n, the least squares line is given by β_0, β_1 ∈ R such that ŷ = β_0 1 + β_1 x, where ŷ is the best approximation of y in Col([1 x]). Thus, we seek β = [β_0, β_1]^T ∈ R^2 such that [1 x]β = ŷ. The normal equations provide the solution:

[1 x]^T [1 x] β = [1 x]^T y.

Typical Problem: Find the equation y = β_0 + β_1 x of the least squares line that best fits the data points (−1, 0), (0, 1), (1, 2), (2, 4).
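
Code Sketch (Python/NumPy): the least squares line for the data points of the typical problem above, computed from the normal equations [1 x]^T [1 x] β = [1 x]^T y; only the variable names are invented.

    import numpy as np

    # Data points from the typical problem above
    x = np.array([-1., 0., 1., 2.])
    y = np.array([ 0., 1., 2., 4.])

    # Design matrix [1 x]: first column all ones, second column the x-values
    X = np.column_stack([np.ones_like(x), x])

    # Normal equations: [1 x]^T [1 x] beta = [1 x]^T y
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    beta0, beta1 = beta
    print(beta0, beta1)                       # intercept and slope of the least squares line

    # Same answer from the library least squares routine
    print(np.allclose(beta, np.linalg.lstsq(X, y, rcond=None)[0]))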

Appendix. Above, we generalized Theorem 3 of page 381. Here, we repeat the statement of the generalization and then go on to prove it.

Theorem 3 page 381 Generalized: Let A be an m × n matrix. Then

(1) Col(A^T) and Nul(A) are each other's orthogonal complements in R^n, and
(2) Col(A) and Nul(A^T) are each other's orthogonal complements in R^m.

Proof. The technical meaning of the two statements is

(1) Row(A) = Col(A^T) = (Nul(A))^⊥ and Nul(A) = (Col(A^T))^⊥ = (Row(A))^⊥, and
(2) Col(A) = (Nul(A^T))^⊥ and Nul(A^T) = (Col(A))^⊥.

(2) follows from (1) upon replacing A, A^T, m, and n throughout (1) by A^T, A, n, and m, respectively. The second part of (1), that Nul(A) = (Col(A^T))^⊥ = (Row(A))^⊥, is easy and is given in the textbook on page 381. The first part of (1), that Row(A) = Col(A^T) = (Nul(A))^⊥, is more difficult and requires an exploitation of the rank of A. We will prove Row(A) = Col(A^T) = (Nul(A))^⊥.

That Row(A) = Col(A^T) follows from the observation that transposition is an isomorphism sending rows to columns and columns to rows. Now we turn our attention to proving Col(A^T) = (Nul(A))^⊥ in R^n.

A brief meditation on the relationships between the rows of A, the columns of A^T, and the solutions of Ax = 0 will allow the reader to see that each column of A^T is orthogonal to any x ∈ Nul(A). Therefore, each column of A^T is in (Nul(A))^⊥. Thus,

Col(A^T) is a subspace of (Nul(A))^⊥, (*)

because (Nul(A))^⊥ is a vector space (the vector space structure being inherited from R^n).

Let r = Rank(A). We know that dim(Col(A^T)) = Rank(A^T) = Rank(A) = r. Let p = dim((Nul(A))^⊥). Then r ≤ p by (*), so if we can show

p ≤ r,

then we will have

dim(Col(A^T)) = r = p = dim((Nul(A))^⊥). (**)

Once p ≤ r is established, the desired result, that Col(A^T) = (Nul(A))^⊥, follows immediately from (*) and (**), because a subspace (namely, Col(A^T)) with dimension equal to the dimension of its parent space (namely, (Nul(A))^⊥) must be all of the parent space.

Let B = {v_1, v_2, ..., v_p} be a basis of (Nul(A))^⊥. Let q = dim(Nul(A)), so that q = n − r by the Rank Theorem. Let C = {w_1, w_2, ..., w_q} be a basis of Nul(A), and let D = {v_1, v_2, ..., v_p, w_1, w_2, ..., w_q} be the union of B and C. We will show D is linearly independent. Why will this help? Well, if D is shown to be linearly independent, then since p + q is the number of elements in D, it follows that p + q ≤ n by the Basis Theorem, in which case we have p + (n − r) = p + q ≤ n, so p − r ≤ 0; whence, p ≤ r, as required. Thus, once we have established that D is linearly independent, the proof will be complete. We do this next.

Suppose (α_1 v_1 + ... + α_p v_p) + (β_1 w_1 + ... + β_q w_q) = 0 for some scalars α_1, ..., α_p, β_1, ..., β_q, where the bracketing has no use other than to suggest what follows next. Let x = α_1 v_1 + ... + α_p v_p and y = β_1 w_1 + ... + β_q w_q. We have x + y = 0, so x = −y. Now, x ∈ (Nul(A))^⊥ and y ∈ Nul(A). But y ∈ Nul(A) and x = −y imply x ∈ Nul(A). Therefore, x ∈ (Nul(A))^⊥ ∩ Nul(A). But, far above, we noted this means x = 0. Also, y = 0, because x = −y and x = 0. Now we have it,

because x = 0 and y = 0 give us 0 = α_1 v_1 + ... + α_p v_p and 0 = β_1 w_1 + ... + β_q w_q; whence, all of the α-scalars and β-scalars are 0 by the linear independence of the bases B and C, respectively. Therefore, D is linearly independent as claimed, and this completes the proof.