Lecture Notes on Linear Algebra
Arbind K Lal and Sukant Pati
July 10, 2016 (DRAFT)


Contents

1 Introduction to Matrices
  1.1 Definition of a Matrix
    1.1.A Special Matrices
  1.2 Operations on Matrices
    1.2.A Multiplication of Matrices
    1.2.B Inverse of a Matrix
  1.3 Some More Special Matrices
    1.3.A Sub-matrix of a Matrix
  1.4 Summary

2 System of Linear Equations over R
  2.1 Introduction
    2.1.A Elementary Row Operations
  2.2 System of Linear Equations
    2.2.A Elementary Matrices and the Row-Reduced Echelon Form (RREF)
    2.2.B Rank of a Matrix
    2.2.C Gauss-Jordan Elimination and System of Linear Equations
  2.3 Square Matrices and System of Linear Equations
    2.3.A Determinant
    2.3.B Adjugate (classically Adjoint) of a Matrix
    2.3.C Cramer's Rule
  2.4 Miscellaneous Exercises
  2.5 Summary

3 Vector Spaces
  3.1 Vector Spaces: Definition and Examples
    3.1.A Subspaces
    3.1.B Linear Span
    3.1.C Fundamental Subspaces Associated with a Matrix
  3.2 Linear Independence
    3.2.A Basic Results related to Linear Independence
    3.2.B Application to Matrices
    3.2.C Linear Independence and Uniqueness of Linear Combination
  3.3 Basis of a Vector Space
    3.3.A Main Results associated with Bases
    3.3.B Constructing a Basis of a Finite Dimensional Vector Space
  3.4 Application to the Subspaces of C^n
  3.5 Ordered Bases
  3.6 Summary

4 Linear Transformations
  4.1 Definitions and Basic Properties
  4.2 Rank-Nullity Theorem
    4.2.A Algebra of Linear Transformations
  4.3 Matrix of a Linear Transformation
    4.3.A Dual Space*
  4.4 Similarity of Matrices
  4.5 Summary

5 Inner Product Spaces
  5.1 Definition and Basic Properties
    5.1.A Cauchy-Schwarz Inequality
    5.1.B Angle between two Vectors
    5.1.C Normed Linear Space
    5.1.D Application to Fundamental Spaces
    5.1.E Properties of Orthonormal Vectors
  5.2 Gram-Schmidt Orthogonalization Process
  5.3 Orthogonal Operator and Rigid Motion
  5.4 Orthogonal Projections and Applications
    5.4.A Orthogonal Projections as Self-Adjoint Operators*
  5.5 QR Decomposition
  5.6 Summary

6 Eigenvalues, Eigenvectors and Diagonalization
  6.1 Introduction and Definitions
    6.1.A Spectrum of an Eigenvalue
  6.2 Diagonalization
    6.2.A Schur's Unitary Triangularization
    6.2.B Diagonalizability of some Special Matrices
    6.2.C Cayley-Hamilton Theorem
  6.3 Quadratic Forms
    6.3.A Sylvester's Law of Inertia

7 Appendix
  7.1 Permutation/Symmetric Groups
  7.2 Properties of Determinant
  7.3 Uniqueness of RREF
  7.4 Dimension of W1 + W2
  7.5 When does Norm imply Inner Product

Index


Chapter 1
Introduction to Matrices

1.1 Definition of a Matrix

Definition [Matrix] A rectangular array of numbers is called a matrix. The horizontal arrays of a matrix are called its rows and the vertical arrays are called its columns. Let A be a matrix having m rows and n columns. Then, A is said to have order m × n (or is called a matrix of size m × n) and can be represented in either of the following forms:
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \quad \text{or} \quad A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix},$$
where $a_{ij}$ is the entry at the intersection of the $i$-th row and $j$-th column. To be concise, one writes $A_{m \times n} = [a_{ij}]$ or, in short, $A = [a_{ij}]$. We will also use $A[i,:]$ to denote the $i$-th row of A and $A[:,j]$ to denote the $j$-th column of A. We shall mostly be concerned with matrices having complex numbers as entries. For example, if $A = \begin{bmatrix} 1 & 3+i & 7 \\ 2 & 5 & 6-5i \end{bmatrix}$ then $A[1,:] = [1 \;\; 3+i \;\; 7]$, $A[:,3] = \begin{bmatrix} 7 \\ 6-5i \end{bmatrix}$ and $a_{22} = 5$. In general, commas are inserted in a row vector to differentiate between entries. Thus, $A[1,:] = [1, 3+i, 7]$.

A matrix having only one column is called a column vector and a matrix with only one row is called a row vector. All our vectors will be column vectors and will be represented by bold letters. Thus, $A[1,:]$ is a row vector and $A[:,3]$ is a column vector.

Definition [Equality of two Matrices] Two matrices $A = [a_{ij}]$ and $B = [b_{ij}]$ having the same order m × n are equal if $a_{ij} = b_{ij}$, for each $i = 1,2,\ldots,m$ and $j = 1,2,\ldots,n$. In other words, two matrices are said to be equal if they have the same order and their corresponding entries are equal.

Example The linear system of equations $2x + 3y = 5$ and $3x + 2y = 5$ can be identified with the matrix $A = \begin{bmatrix} 2 & 3 & 5 \\ 3 & 2 & 5 \end{bmatrix}$. Note that x and y are unknowns with the understanding that x is associated with $A[:,1]$ and y is associated with $A[:,2]$.
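For readers who want to experiment, the $A[i,:]$ and $A[:,j]$ notation maps directly onto array slicing. Below is a minimal NumPy sketch (an assumption on my part; the notes themselves use no software), using a matrix like the one in the example above. Note that NumPy indexes rows and columns from 0 rather than 1.

```python
import numpy as np

# A 2 x 3 complex matrix, as in the example above.
A = np.array([[1, 3 + 1j, 7],
              [2, 5, 6 - 5j]])

print(A.shape)   # (2, 3): order m x n with m = 2, n = 3
print(A[0, :])   # the notes' A[1,:], a row vector
print(A[:, 2])   # the notes' A[:,3], a column vector
print(A[1, 1])   # the entry a_22, here (5+0j)
```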

1.1.A Special Matrices

Definition
1. A matrix in which each entry is zero is called a zero-matrix, denoted 0. For example,
$$0_{2\times 2} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \quad \text{and} \quad 0_{2\times 3} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$
2. A matrix that has the same number of rows as the number of columns is called a square matrix. A square matrix is said to have order n if it is an n × n matrix.
3. Let $A = [a_{ij}]$ be an n × n square matrix.
(a) Then the entries $a_{11}, a_{22}, \ldots, a_{nn}$ are called the diagonal entries; they constitute the principal diagonal of A.
(b) Then A is said to be a diagonal matrix if $a_{ij} = 0$ for $i \neq j$, denoted $\mathrm{diag}[a_{11},\ldots,a_{nn}]$. For example, the zero matrix $0_n$ and $\begin{bmatrix} 4 & 0 \\ 0 & 1 \end{bmatrix}$ are a few diagonal matrices.
(c) If $A = \mathrm{diag}[a_{11},\ldots,a_{nn}]$ and $a_{ii} = d$ for all $i = 1,\ldots,n$ then the diagonal matrix A is called a scalar matrix.
(d) Then $A = \mathrm{diag}[1,\ldots,1]$ is called the identity matrix, denoted $I_n$, or in short I. For example, $I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ and $I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$.
(e) Then A is said to be an upper triangular matrix if $a_{ij} = 0$ for $i > j$.
(f) Then A is said to be a lower triangular matrix if $a_{ij} = 0$ for $i < j$.
(g) Then A is said to be triangular if it is an upper or a lower triangular matrix. For example, $\begin{bmatrix} 1 & 2 & 3 \\ 0 & 4 & 5 \\ 0 & 0 & 6 \end{bmatrix}$ is upper triangular and $\begin{bmatrix} 1 & 0 & 0 \\ 2 & 3 & 0 \\ 4 & 5 & 6 \end{bmatrix}$ is lower triangular.
4. An m × n matrix $A = [a_{ij}]$ is said to have an upper triangular form if $a_{ij} = 0$ for all $i > j$. For example, the matrix
$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ 0 & a_{22} & \cdots & a_{2n} \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & a_{nn} \end{bmatrix}$$
has an upper triangular form, as does any m × n matrix (square or not) whose entries below the positions (i, i) vanish.

1.2 Operations on Matrices

Definition [Transpose of a Matrix] Let $A = [a_{ij}]$ be an m × n matrix with real entries. Then the transpose of A, denoted $A^T = [b_{ij}]$, is an n × m matrix with $b_{ij} = a_{ji}$, for all i, j.

Definition [Conjugate Transpose of a Matrix] Let $A = [a_{ij}]$ be an m × n matrix with complex entries. Then the conjugate transpose of A, denoted $A^*$, is an n × m matrix with $(A^*)_{ij} = \overline{a_{ji}}$, for all i, j, where for $a \in \mathbb{C}$, $\overline{a}$ denotes the complex conjugate of a. Thus, if x is a column vector then $x^T$ and $x^*$ are row vectors, and vice-versa. For example, if $A = \begin{bmatrix} 1 & 0 \\ i & 1+i \end{bmatrix}$ then $A^* = \begin{bmatrix} 1 & -i \\ 0 & 1-i \end{bmatrix}$, whereas $A^T = \begin{bmatrix} 1 & i \\ 0 & 1+i \end{bmatrix}$; note that $A^* \neq A^T$. For a matrix with real entries, $A^* = A^T$.

Theorem For any matrix A, $(A^*)^* = A$. Thus, $(A^T)^T = A$.
Proof. Let $A = [a_{ij}]$, $A^* = [b_{ij}]$ and $(A^*)^* = [c_{ij}]$. Clearly, the order of A and $(A^*)^*$ is the same. Also, by definition $c_{ij} = \overline{b_{ji}} = \overline{\overline{a_{ij}}} = a_{ij}$ for all i, j and hence the result follows.

Remark Note that the transpose is studied whenever the entries of the matrix are real. Since we are allowing the matrix entries to be complex numbers, we will state and prove the results for the conjugate transpose. The readers should separately verify that similar results hold for the transpose whenever the matrix has real entries.

Definition [Addition of Matrices] Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be two m × n matrices. Then, the sum of A and B, denoted A + B, is defined to be the matrix $C = [c_{ij}]$ with $c_{ij} = a_{ij} + b_{ij}$.

Definition [Multiplying a Scalar to a Matrix] Let $A = [a_{ij}]$ be an m × n matrix. Then the product of $k \in \mathbb{C}$ with A, denoted kA, is defined as $kA = [k a_{ij}]$. For example, if $A = \begin{bmatrix} 1 & 2 \\ 4 & 5 \end{bmatrix}$ then $5A = \begin{bmatrix} 5 & 10 \\ 20 & 25 \end{bmatrix}$ and $(2+i)A = \begin{bmatrix} 2+i & 4+2i \\ 8+4i & 10+5i \end{bmatrix}$.

Theorem Let A, B and C be matrices of order m × n, and let $k, l \in \mathbb{C}$. Then
1. A + B = B + A (commutativity).
2. (A + B) + C = A + (B + C) (associativity).
3. k(lA) = (kl)A.
4. (k + l)A = kA + lA.
Proof. Part 1. Let $A = [a_{ij}]$ and $B = [b_{ij}]$. Then, by definition,
$$A + B = [a_{ij}] + [b_{ij}] = [a_{ij} + b_{ij}] = [b_{ij} + a_{ij}] = [b_{ij}] + [a_{ij}] = B + A,$$
as complex numbers commute. The reader is required to prove the other parts, as all the results follow from the properties of complex numbers.
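The distinction between $A^T$ and $A^*$ is easy to see numerically. A minimal NumPy sketch (again an assumption, not part of the notes):

```python
import numpy as np

A = np.array([[1, 0],
              [1j, 1 + 1j]])

print(A.T)          # transpose A^T: entries are moved but not conjugated
print(A.conj().T)   # conjugate transpose A^*
# For a real matrix the two coincide; for this complex A they differ:
print(np.array_equal(A.T, A.conj().T))   # False
```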

Definition [Additive Inverse] Let A be an m × n matrix.
1. Then there exists a matrix B with A + B = 0. This matrix B is called the additive inverse of A and is denoted by $-A = (-1)A$.
2. Also, for the matrix $0_{m \times n}$, A + 0 = 0 + A = A. Hence, the matrix $0_{m \times n}$ is called the additive identity.

Exercise
1. Find a 3 × 3 non-zero matrix A with real entries satisfying (a) $A^T = A$; (b) $A^T = -A$.
2. Find a 3 × 3 non-zero matrix A with complex entries satisfying (a) $A^* = A$; (b) $A^* = -A$.
3. Find the 3 × 3 matrix $A = [a_{ij}]$ satisfying $a_{ij} = 1$ if $i \neq j$ and 2 otherwise.
4. Find the 3 × 3 matrix $A = [a_{ij}]$ satisfying $a_{ij} = 1$ if $|i - j| \leq 1$ and 0 otherwise.
5. Find the 4 × 4 matrix $A = [a_{ij}]$ satisfying $a_{ij} = i + j$.
6. Find the 4 × 4 matrix $A = [a_{ij}]$ satisfying $a_{ij} = 2^{i+j}$.
7. Suppose A + B = A. Then show that B = 0.
8. Suppose A + B = 0. Then show that $B = (-1)A = [-a_{ij}]$.
9. Let $A = \begin{bmatrix} 1+i & 1 \\ 1 & 1-i \end{bmatrix}$ and $B = \begin{bmatrix} 2 & i \\ i & 2 \end{bmatrix}$. Compute A + B and B + A.

1.2.A Multiplication of Matrices

Definition [Matrix Multiplication / Product] Let $A = [a_{ij}]$ be an m × n matrix and $B = [b_{ij}]$ be an n × r matrix. The product of A and B, denoted AB, is a matrix $C = [c_{ij}]$ of order m × r with
$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj}, \quad 1 \leq i \leq m,\ 1 \leq j \leq r.$$
Thus, AB is defined if and only if the number of columns of A equals the number of rows of B. For example, if $A = \begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix}$ and $B = \begin{bmatrix} \alpha & \beta & \gamma & \delta \\ x & y & z & t \\ u & v & w & s \end{bmatrix}$ then
$$AB = \begin{bmatrix} a\alpha + bx + cu & a\beta + by + cv & a\gamma + bz + cw & a\delta + bt + cs \\ d\alpha + ex + fu & d\beta + ey + fv & d\gamma + ez + fw & d\delta + et + fs \end{bmatrix}. \qquad (*)$$

Thus, note that the rows of the matrix AB can be written directly as
$$(AB)[1,:] = a[\alpha,\beta,\gamma,\delta] + b[x,y,z,t] + c[u,v,w,s] = a\,B[1,:] + b\,B[2,:] + c\,B[3,:],$$
$$(AB)[2,:] = d\,B[1,:] + e\,B[2,:] + f\,B[3,:], \qquad (**)$$
and similarly, the columns of the matrix AB can be written directly as
$$(AB)[:,1] = \begin{bmatrix} a\alpha + bx + cu \\ d\alpha + ex + fu \end{bmatrix} = \alpha\,A[:,1] + x\,A[:,2] + u\,A[:,3], \qquad (***)$$
$$(AB)[:,2] = \beta\,A[:,1] + y\,A[:,2] + v\,A[:,3], \ \ldots,\ (AB)[:,4] = \delta\,A[:,1] + t\,A[:,2] + s\,A[:,3].$$

Remark Observe the following:
1. In this example, while AB is defined, the product BA is not defined. However, for square matrices A and B of the same order, both the products AB and BA are defined.
2. The product AB corresponds to operating on the rows of the matrix B (see Equation (**)). This is the row method for calculating the matrix product.
3. The product AB also corresponds to operating on the columns of the matrix A (see Equation (***)). This is the column method for calculating the matrix product.
4. Let A and B be two matrices such that the product AB is defined. Then verify that
$$AB = \begin{bmatrix} A[1,:]B \\ A[2,:]B \\ \vdots \\ A[m,:]B \end{bmatrix} = \big[A\,B[:,1],\ A\,B[:,2],\ \ldots,\ A\,B[:,r]\big].$$

Example Let $A = \begin{bmatrix} 1 & 2 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 & -1 \\ 0 & 0 & 1 \\ 0 & -1 & 1 \end{bmatrix}$. Use the row/column method of matrix multiplication to
1. find the second row of the matrix AB.
Solution: By the remark above, $(AB)[2,:] = A[2,:]\,B$ and hence
$$(AB)[2,:] = 1\cdot[1,0,-1] + 0\cdot[0,0,1] + 1\cdot[0,-1,1] = [1,-1,0].$$
2. find the third column of the matrix AB.
Solution: Again, by the remark above, $(AB)[:,3] = A\,B[:,3]$ and hence
$$(AB)[:,3] = -1\cdot\begin{bmatrix}1\\1\\0\end{bmatrix} + 1\cdot\begin{bmatrix}2\\0\\1\end{bmatrix} + 1\cdot\begin{bmatrix}0\\1\\1\end{bmatrix} = \begin{bmatrix}1\\0\\2\end{bmatrix}.$$
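The row and column methods can be replayed directly in code. A short sketch (assuming NumPy, and using the matrices of the example above):

```python
import numpy as np

A = np.array([[1, 2, 0],
              [1, 0, 1],
              [0, 1, 1]])
B = np.array([[1, 0, -1],
              [0, 0, 1],
              [0, -1, 1]])

# Row method: row i of AB is a combination of the rows of B,
# weighted by the entries of row i of A.
row2 = sum(A[1, k] * B[k, :] for k in range(3))
print(row2)            # [ 1 -1  0]
print((A @ B)[1, :])   # same

# Column method: column j of AB is a combination of the columns of A,
# weighted by the entries of column j of B.
col3 = sum(B[k, 2] * A[:, k] for k in range(3))
print(col3)            # [1 0 2]
print((A @ B)[:, 2])   # same
```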

Definition [Commutativity of Matrix Product] Two square matrices A and B are said to commute if AB = BA.

Remark Note that if A is a square matrix of order n and B is a scalar matrix of order n then AB = BA. In general, the matrix product is not commutative. For example, consider $A = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix}$. Then verify that $AB = \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix} \neq \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = BA$.

Theorem Suppose that the matrices A, B and C are so chosen that the matrix multiplications are defined.
1. Then (AB)C = A(BC). That is, matrix multiplication is associative.
2. For any $k \in \mathbb{R}$, (kA)B = k(AB) = A(kB).
3. Then A(B + C) = AB + AC. That is, multiplication distributes over addition.
4. If A is an n × n matrix then $AI_n = I_n A = A$.
5. Now let A be a square matrix of order n and $D = \mathrm{diag}[d_1, d_2, \ldots, d_n]$. Then $(DA)[i,:] = d_i A[i,:]$, for 1 ≤ i ≤ n, and $(AD)[:,j] = d_j A[:,j]$, for 1 ≤ j ≤ n. (A numerical check of this part appears after the exercise set below.)

Proof. Part 1. Let $A = [a_{ij}]_{m \times n}$, $B = [b_{ij}]_{n \times p}$ and $C = [c_{ij}]_{p \times q}$. Then
$$(BC)_{kj} = \sum_{l=1}^{p} b_{kl} c_{lj} \quad \text{and} \quad (AB)_{il} = \sum_{k=1}^{n} a_{ik} b_{kl}.$$
Therefore,
$$\big(A(BC)\big)_{ij} = \sum_{k=1}^{n} a_{ik} (BC)_{kj} = \sum_{k=1}^{n} a_{ik}\Big(\sum_{l=1}^{p} b_{kl} c_{lj}\Big) = \sum_{k=1}^{n}\sum_{l=1}^{p} a_{ik} b_{kl} c_{lj} = \sum_{l=1}^{p}\Big(\sum_{k=1}^{n} a_{ik} b_{kl}\Big) c_{lj} = \sum_{l=1}^{p} (AB)_{il} c_{lj} = \big((AB)C\big)_{ij}.$$
Part 5. As D is a diagonal matrix, using the row method (see the remark above), we have
$$(DA)[i,:] = D[i,:]\,A = \sum_{j \neq i} 0\cdot A[j,:] + d_i A[i,:] = d_i A[i,:].$$
Using a similar argument, the statement about AD follows. The other parts are left for the reader.

Exercise
1. Find a 2 × 2 non-zero matrix A satisfying $A^2 = 0$.
2. Find a 2 × 2 non-zero matrix A satisfying $A^2 = A$ and $A \neq I_2$.
3. Find 2 × 2 non-zero matrices A, B and C satisfying AB = AC but B ≠ C. That is, the cancellation law doesn't hold.

4. Let $A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}$. Compute $A^2$ and $A^3$. Is $A^3 = I$? Determine $aA^3 + bA + cA^2$.
5. Let A and B be two m × n matrices. Then prove that $(A + B)^* = A^* + B^*$.
6. Let A be a 1 × n matrix and B be an n × 1 matrix. Then verify that AB is a 1 × 1 matrix, whereas BA has order n × n.
7. Let A and B be two matrices such that the matrix product AB is defined.
(a) Prove that $(AB)^* = B^* A^*$.
(b) If A[1,:] = 0 then (AB)[1,:] = 0.
(c) If B[:,1] = 0 then (AB)[:,1] = 0.
(d) If A[i,:] = A[j,:] for some i and j then (AB)[i,:] = (AB)[j,:].
(e) If B[:,i] = B[:,j] for some i and j then (AB)[:,i] = (AB)[:,j].
8. Let $A = \begin{bmatrix} 1 & 1+i & 0 \\ 1 & 2 & i \\ i & 1 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 0 & 1 \\ i & i \\ 1 & 1 \end{bmatrix}$. Compute
(a) $A^*A$, $A + A^*$, $(3AB)^* - 4B^*A^*$ and $3A - 2A^*$.
(b) (AB)[1,:], (AB)[3,:], (AB)[:,1] and (AB)[:,2].
(c) $(B^*A^*)[:,1]$, $(B^*A^*)[:,3]$, $(B^*A^*)[1,:]$ and $(B^*A^*)[2,:]$.
9. Let $A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}$. Guess a formula for $A^n$ and $B^n$ and prove it.
10. With A and B as in the previous exercise and $C = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$: is it true that $A^2 - 2A + I = 0$? What is $B^3 - 3B^2 + 3B - I$? Is $C^3 = 3C^2$?
11. Construct matrices A and B satisfying the following statements.
(a) The product AB is defined but BA is not defined.
(b) The products AB and BA are defined but they have different orders.
(c) The products AB and BA are defined and have the same order but AB ≠ BA.
12. Let a, b and c be indeterminates. Then, can we find A with complex entries satisfying $A\begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} a+b \\ b-c \\ 3a-5b+c \end{bmatrix}$? What if $A\begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} ab \\ a \end{bmatrix}$? Give reasons for your answer.
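Part 5 of the theorem above says that multiplying by a diagonal matrix on the left scales rows, and on the right scales columns. A minimal NumPy check (any 3 × 3 matrix A works):

```python
import numpy as np

D = np.diag([2.0, 3.0, 5.0])
A = np.arange(9.0).reshape(3, 3)

# (DA)[i,:] = d_i * A[i,:] : left multiplication scales the rows ...
print(np.allclose(D @ A, D.diagonal()[:, None] * A))   # True
# ... and (AD)[:,j] = d_j * A[:,j] : right multiplication scales the columns.
print(np.allclose(A @ D, A * D.diagonal()[None, :]))   # True
```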

1.2.B Inverse of a Matrix

Definition [Inverse of a Matrix] Let A be a square matrix of order n.
1. A square matrix B is said to be a left inverse of A if $BA = I_n$.
2. A square matrix C is called a right inverse of A if $AC = I_n$.
3. A matrix A is said to be invertible (or is said to have an inverse) if there exists a matrix B such that $AB = BA = I_n$.

Lemma Let A be an n × n matrix. Suppose that there exist n × n matrices B and C such that $AB = I_n$ and $CA = I_n$. Then B = C.
Proof. Note that $C = CI_n = C(AB) = (CA)B = I_n B = B$.

Remark
1. The lemma above implies that whenever A is invertible, the inverse is unique.
2. Therefore the inverse of A is denoted by $A^{-1}$. That is, $AA^{-1} = A^{-1}A = I$.

Example
1. Let $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$.
(a) If $ad - bc \neq 0$ then verify that $A^{-1} = \frac{1}{ad-bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$.
(b) In particular, the inverse of $\begin{bmatrix} 2 & 7 \\ 1 & 4 \end{bmatrix}$ equals $\begin{bmatrix} 4 & -7 \\ -1 & 2 \end{bmatrix}$.
(c) If ad − bc = 0 then prove that either A[1,:] = 0 or A[:,1] = 0 or A[2,:] = αA[1,:] or A[:,2] = αA[:,1] for some α ∈ C. Hence, prove that A is not invertible.
(d) Singular matrices such as $\begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}$, $\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$ and $\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$ do not have inverses.
2. Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 4 \\ 0 & 0 & 1 \end{bmatrix}$. Then $A^{-1} = \begin{bmatrix} 1 & -2 & 5 \\ 0 & 1 & -4 \\ 0 & 0 & 1 \end{bmatrix}$.
3. Prove that a matrix A with A[1,:] = A[2,:] and a matrix B with B[:,3] = B[:,1] + B[:,2] are not invertible.
Solution: Suppose there exists C such that CA = AC = I. Then, using the matrix product, A[1,:]C = (AC)[1,:] = I[1,:] = [1,0,0] and A[2,:]C = (AC)[2,:] = I[2,:] = [0,1,0]. But A[1,:] = A[2,:] and thus [1,0,0] = [0,1,0], a contradiction. Similarly, if there exists D such that BD = DB = I then D B[:,1] = (DB)[:,1] = I[:,1], D B[:,2] = (DB)[:,2] = I[:,2] and D B[:,3] = I[:,3]. But B[:,3] = B[:,1] + B[:,2] and hence I[:,3] = I[:,1] + I[:,2], a contradiction.
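The 2 × 2 inverse formula of part 1a above is short enough to implement and check directly. A sketch (assuming NumPy; the test matrix is the one from part 1b):

```python
import numpy as np

def inv2(A):
    """Inverse of a 2 x 2 matrix via A^{-1} = (1/(ad-bc)) [[d, -b], [-c, a]]."""
    a, b, c, d = A.ravel()
    det = a * d - b * c
    if det == 0:
        raise ValueError("ad - bc = 0, so A is not invertible")
    return np.array([[d, -b], [-c, a]]) / det

A = np.array([[2.0, 7.0], [1.0, 4.0]])
print(inv2(A))                              # [[ 4. -7.] [-1.  2.]]
print(np.allclose(inv2(A) @ A, np.eye(2)))  # True
```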

Theorem Let A and B be two invertible matrices. Then
1. $(A^{-1})^{-1} = A$.
2. $(AB)^{-1} = B^{-1}A^{-1}$.
3. $(A^*)^{-1} = (A^{-1})^*$.
Proof. Part 1. Let $B = A^{-1}$ be the inverse of A. Then AB = BA = I. Thus, by definition, B is invertible and $B^{-1} = A$. Or equivalently, $(A^{-1})^{-1} = A$.
Part 2. By associativity, $(AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = I = (B^{-1}A^{-1})(AB)$.
Part 3. As $AA^{-1} = A^{-1}A = I$, we get $(AA^{-1})^* = (A^{-1}A)^* = I$. Or equivalently, $(A^{-1})^* A^* = A^* (A^{-1})^* = I$. Thus, by definition, $(A^*)^{-1} = (A^{-1})^*$.

We will again come back to the study of invertible matrices in Sections 2.2 and 2.3.A.

Exercise
1. Let A be an invertible matrix. Then $(A^{-1})^r = A^{-r}$ for all integers r.
2. Find the inverse of $\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$ and $\begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix}$.
3. Let $A_1, \ldots, A_r$ be invertible matrices. Then the matrix $B = A_1 A_2 \cdots A_r$ is also invertible.
4. Let $x = [1+i, 2, 3]^T$ and $y = [2, 1+i, 4]^T$. Prove that $y^* x$ is invertible but $x y^*$ is not invertible.
5. Let A be an n × n invertible matrix. Then prove that
(a) $A[i,:] \neq 0^T$ for any i.
(b) $A[:,j] \neq 0$ for any j.
(c) $A[i,:] \neq A[j,:]$ whenever $i \neq j$.
(d) $A[:,i] \neq A[:,j]$ whenever $i \neq j$.
(e) $A[3,:] \neq \alpha A[1,:] + \beta A[2,:]$ for any $\alpha, \beta \in \mathbb{C}$, whenever n ≥ 3.
(f) $A[:,3] \neq \alpha A[:,1] + \beta A[:,2]$ for any $\alpha, \beta \in \mathbb{C}$, whenever n ≥ 3.
6. Determine A that satisfies $(I + 3A)^{-1} = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$.
7. Determine A that satisfies $(I - A)^{-1} = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 4 \\ 0 & 0 & 1 \end{bmatrix}$. [See the example above.]
8. Let A be a square matrix satisfying $A^3 + A - 2I = 0$. Prove that $A^{-1} = \frac{1}{2}(A^2 + I)$.
9. Let $A = [a_{ij}]$ be an invertible matrix. If $B = [p^{i-j} a_{ij}]$ for some $p \in \mathbb{C}$, p ≠ 0, then determine $B^{-1}$.

1.3 Some More Special Matrices

Definition
1. Let A be a square matrix with real entries. Then, A is called
(a) symmetric if $A^T = A$. For example, $A = \begin{bmatrix} 1 & 3 \\ 3 & 2 \end{bmatrix}$.
(b) skew-symmetric if $A^T = -A$. For example, $A = \begin{bmatrix} 0 & 3 \\ -3 & 0 \end{bmatrix}$.
(c) orthogonal if $AA^T = A^T A = I$. For example, $A = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$.
2. Let A be a square matrix with complex entries. Then, A is called
(a) normal if $A^*A = AA^*$. For example, $\begin{bmatrix} 1 & i \\ i & 1 \end{bmatrix}$ is a normal matrix.
(b) Hermitian if $A^* = A$. For example, $A = \begin{bmatrix} 1 & 1+i \\ 1-i & 2 \end{bmatrix}$.
(c) skew-Hermitian if $A^* = -A$. For example, $A = \begin{bmatrix} 0 & 1+i \\ -1+i & 0 \end{bmatrix}$.
(d) unitary if $AA^* = A^*A = I$. For example, $A = \frac{1}{\sqrt{3}}\begin{bmatrix} 1+i & 1 \\ -1 & 1-i \end{bmatrix}$.
3. A matrix A is said to be idempotent if $A^2 = A$. For example, $A = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$ is idempotent.
4. A matrix that is symmetric and idempotent is called a projection matrix (see the sketch after this definition). For example, let $u \in \mathbb{R}^n$ be a column vector with $u^T u = 1$; then $A = uu^T$ is an idempotent matrix. Moreover, A is symmetric and hence is a projection matrix. In particular, let $u = \frac{1}{\sqrt{5}}(1,2)^T$ and $A = uu^T$. Then $u^T u = 1$ and for any vector $x = (x_1, x_2)^T \in \mathbb{R}^2$ note that
$$Ax = (uu^T)x = u(u^T x) = \frac{x_1 + 2x_2}{\sqrt{5}}\,u = \Big[\frac{x_1 + 2x_2}{5},\ \frac{2x_1 + 4x_2}{5}\Big]^T.$$
Thus, Ax is the foot of the perpendicular from the point x onto the vector $[1\ 2]^T$.
5. A square matrix A is said to be nilpotent if there exists a positive integer n such that $A^n = 0$. The least positive integer k for which $A^k = 0$ is called the order of nilpotency. For example, if $A = [a_{ij}]$ is an n × n matrix with $a_{ij}$ equal to 1 if $i - j = 1$ and 0 otherwise, then $A^n = 0$ and $A^l \neq 0$ for $1 \leq l \leq n-1$.
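The projection matrix $uu^T$ of item 4 can be checked numerically. A minimal sketch (assuming NumPy, with the unit vector from the example):

```python
import numpy as np

u = np.array([1.0, 2.0]) / np.sqrt(5)   # unit vector: u^T u = 1
A = np.outer(u, u)                      # A = u u^T

print(np.allclose(A @ A, A))   # idempotent: A^2 = A
print(np.allclose(A, A.T))     # symmetric, hence a projection matrix

x = np.array([3.0, 1.0])
print(A @ x)                   # [1. 2.]: the foot of the perpendicular from x
print((x @ u) * u)             # same, computed as (u^T x) u
```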

Exercise
1. Let A be a complex square matrix. Then $S_1 = \frac{1}{2}(A + A^*)$ is Hermitian, $S_2 = \frac{1}{2}(A - A^*)$ is skew-Hermitian, and $A = S_1 + S_2$.
2. Let A and B be two lower triangular matrices. Then prove that AB is a lower triangular matrix. A similar statement holds for upper triangular matrices.
3. Let A and B be Hermitian matrices. Then prove that AB is Hermitian if and only if AB = BA.
4. Show that the diagonal entries of a skew-Hermitian matrix are zero or purely imaginary.
5. Let A, B be skew-Hermitian matrices with AB = BA. Is the matrix AB Hermitian or skew-Hermitian?
6. Let A be a Hermitian matrix of order n with $A^2 = 0$. Is it necessarily true that A = 0?
7. Let A be a nilpotent matrix. Prove that there exists a matrix B such that B(I + A) = I = (I + A)B. [If $A^k = 0$ then look at $I - A + A^2 - \cdots + (-1)^{k-1}A^{k-1}$.]
8. Are the matrices $\begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{bmatrix}$ and $\begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & \sin\theta & -\cos\theta \end{bmatrix}$ orthogonal, for θ ∈ [0, 2π]?
9. Let $\{u_1, u_2, u_3\}$ be three vectors in $\mathbb{R}^3$ such that $u_i^* u_i = 1$, for 1 ≤ i ≤ 3, and $u_i^* u_j = 0$ whenever $i \neq j$. Then prove that the 3 × 3 matrix
(a) $U = [u_1\ u_2\ u_3]$ satisfies $U^*U = I$. Thus, $UU^* = I$.
(b) $A = u_i u_i^*$, for 1 ≤ i ≤ 3, satisfies $A^2 = A$. Is A symmetric?
(c) $A = u_i u_i^* + u_j u_j^*$, for $i \neq j$, satisfies $A^2 = A$. Is A symmetric?
10. Verify that the matrices in parts 9b and 9c are projection matrices.
11. Let A and B be two n × n orthogonal matrices. Then prove that AB is also an orthogonal matrix.

1.3.A Sub-matrix of a Matrix

Definition A matrix obtained by deleting some of the rows and/or columns of a matrix is said to be a sub-matrix of the given matrix. For example, if $A = \begin{bmatrix} 1 & 4 & 5 \\ 0 & 1 & 2 \end{bmatrix}$ then $[1]$, $[2]$, $[1\ 5]$, $\begin{bmatrix} 1 & 5 \\ 0 & 2 \end{bmatrix}$ and A itself are a few sub-matrices of A. But matrices such as $\begin{bmatrix} 1 & 4 \\ 1 & 0 \end{bmatrix}$ and $\begin{bmatrix} 1 & 1 \\ 0 & 4 \end{bmatrix}$ are not sub-matrices of A (the reader is advised to give reasons).

Let A be an n × m matrix and B be an m × p matrix. Suppose r < m. Then, we can decompose the matrices A and B as $A = [P\ Q]$ and $B = \begin{bmatrix} H \\ K \end{bmatrix}$, where P has order n × r and H has order r × p. That is, the matrices P and Q are sub-matrices of A, and P consists of the first r columns of A while Q consists of the last m − r columns of A. Similarly, H and K are sub-matrices of B, and H consists of the first r rows of B while K consists of the last m − r rows of B.

We now prove the following important theorem.

Theorem Let $A = [a_{ij}] = [P\ Q]$ and $B = [b_{ij}] = \begin{bmatrix} H \\ K \end{bmatrix}$ be defined as above. Then AB = PH + QK.
Proof. The matrix products PH and QK are valid as the orders of the matrices P, H, Q and K are, respectively, n × r, r × p, n × (m − r) and (m − r) × p. Also, the matrices PH and QK are of the same order and hence their sum is justified. Now, let $P = [p_{ij}]$, $Q = [q_{ij}]$, $H = [h_{ij}]$ and $K = [k_{ij}]$. Then, for 1 ≤ i ≤ n and 1 ≤ j ≤ p, we have
$$(AB)_{ij} = \sum_{k=1}^{m} a_{ik} b_{kj} = \sum_{k=1}^{r} a_{ik} b_{kj} + \sum_{k=r+1}^{m} a_{ik} b_{kj} = \sum_{k=1}^{r} p_{ik} h_{kj} + \sum_{k=r+1}^{m} q_{i,k-r} k_{k-r,j} = (PH)_{ij} + (QK)_{ij} = (PH + QK)_{ij}.$$
Thus, the required result follows.

Remark The theorem above is very useful due to the following reasons:
1. The orders of the matrices P, Q, H and K are smaller than those of A or B.
2. The matrices P, Q, H and K can be further partitioned so as to form blocks that are either identity or zero or matrices that have nice forms. This partition may be quite useful during different matrix operations.
3. If we want to prove results using induction then, after proving the initial step, one assumes the result for all r × r sub-matrices and then tries to prove it for (r+1) × (r+1) sub-matrices.
For example, if $A = \begin{bmatrix} a & b \\ c & d \\ e & f \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 2 \\ 2 & 5 \end{bmatrix}$ then $AB = \begin{bmatrix} a + 2b & 2a + 5b \\ c + 2d & 2c + 5d \\ e + 2f & 2e + 5f \end{bmatrix}$.

Suppose $A = \begin{bmatrix} P & Q \\ R & S \end{bmatrix}$ and $B = \begin{bmatrix} E & F \\ G & H \end{bmatrix}$, where the rows of A are partitioned into blocks of sizes $n_1$ and $n_2$, the columns of A into blocks of sizes $m_1$ and $m_2$, the rows of B into blocks of sizes $r_1$ and $r_2$, and the columns of B into blocks of sizes $s_1$ and $s_2$. Then the matrices P, Q, R, S and E, F, G, H are called the blocks of the matrices A and B, respectively. Note that even if A + B is defined, the orders of P and E need not be the same. But, if the block sums are defined then $A + B = \begin{bmatrix} P+E & Q+F \\ R+G & S+H \end{bmatrix}$. Similarly, even if the product AB is defined, the product PE may not be defined. Again, if the block products are defined, one can verify that $AB = \begin{bmatrix} PE + QG & PF + QH \\ RE + SG & RF + SH \end{bmatrix}$. That is, once a partition of A is fixed, the partition of B has to be properly chosen for purposes of block addition or multiplication.
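The identity AB = PH + QK is easy to verify numerically for a random split. A sketch (assuming NumPy; the sizes and the split point r are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.integers(-3, 4, size=(3, 5))
B = rng.integers(-3, 4, size=(5, 4))

r = 2
P, Q = A[:, :r], A[:, r:]   # P: first r columns of A, Q: the rest
H, K = B[:r, :], B[r:, :]   # H: first r rows of B,   K: the rest

print(np.array_equal(A @ B, P @ H + Q @ K))   # True
```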

Exercise
1. Complete the proofs of the theorems on matrix multiplication and block multiplication above.
2. Let $A = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}$, $B = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix}$ and $C = \begin{bmatrix} 1 & 2 & 1 & 4 \\ 0 & 1 & 1 & 2 \\ 1 & 0 & 1 & 2 \end{bmatrix}$. Compute
(a) (AC)[1,:];
(b) (B(AC))[1,:], (B(AC))[2,:] and (B(AC))[3,:];
(c) note that (B(AC))[:,1] + (B(AC))[:,2] + (B(AC))[:,3] − (B(AC))[:,4] = 0;
(d) let $x^T = [1,1,1,-1]$. Use the previous result to prove that (B(AC))x = 0.
3. Let $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$, $y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$, $A = \begin{bmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{bmatrix}$ and $B = \begin{bmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{bmatrix}$. Then
(a) prove that y = Ax gives the counter-clockwise rotation through an angle α;
(b) prove that y = Bx gives the reflection about the line y = tan(θ)x;
(c) compute y = (AB)x and y = (BA)x. Do they correspond to reflections? If yes, then about which lines?
(d) furthermore, if y = Cx gives the counter-clockwise rotation through β and y = Dx gives the reflection about the line y = tan(δ)x, then prove that
i. AC = CA and y = (AC)x gives the counter-clockwise rotation through α + β;
ii. y = (BD)x and y = (DB)x give rotations. Which angles do they represent?
4. Fix a unit vector $a \in \mathbb{R}^n$ and define $f : \mathbb{R}^n \to \mathbb{R}^n$ by $f(y) = 2(a^T y)a - y$. Does this function give a reflection about the line that contains the points 0 and a?
5. Consider the two coordinate transformations
$$x_1 = a_{11}y_1 + a_{12}y_2,\quad x_2 = a_{21}y_1 + a_{22}y_2 \qquad \text{and} \qquad y_1 = b_{11}z_1 + b_{12}z_2,\quad y_2 = b_{21}z_1 + b_{22}z_2.$$
(a) Compose the two transformations to express $x_1, x_2$ in terms of $z_1, z_2$.
(b) If $x^T = [x_1, x_2]$, $y^T = [y_1, y_2]$ and $z^T = [z_1, z_2]$ then find matrices A, B and C such that x = Ay, y = Bz and x = Cz.
(c) Is C = AB? Give reasons for your answer.
6. For $A_{n \times n} = [a_{ij}]$, the trace of A, denoted Tr(A), is defined by $\mathrm{Tr}(A) = a_{11} + a_{22} + \cdots + a_{nn}$.
(a) Compute Tr(A) for $A = \begin{bmatrix} 3 & 2 \\ 2 & 2 \end{bmatrix}$ and $A = \begin{bmatrix} 4 & -3 \\ -5 & 1 \end{bmatrix}$.
(b) Let A be a 2 × 2 matrix with $A\begin{bmatrix}1\\0\end{bmatrix} = 2\begin{bmatrix}1\\0\end{bmatrix}$ and $A\begin{bmatrix}1\\1\end{bmatrix} = 3\begin{bmatrix}1\\1\end{bmatrix}$. If $B = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$ then compute Tr(AB).
(c) Let A and B be two square matrices of the same order. Then prove that
i. Tr(A + B) = Tr(A) + Tr(B);
ii. Tr(AB) = Tr(BA).
(d) Prove that one cannot find matrices A and B such that AB − BA = cI for any c ≠ 0.

7. Let A and B be two m × n matrices with complex entries. Then prove that
(a) Ax = 0 for all n × 1 vectors x implies that A = 0, the zero matrix;
(b) Ax = Bx for all n × 1 vectors x implies that A = B.
8. Let A be an n × n matrix such that AB = BA for all n × n matrices B. Then prove that A is a scalar matrix. That is, A = αI for some α ∈ C.
9. Let $A = \begin{bmatrix} 1 & 2 & 2 \\ 1 & 2 & 1 \end{bmatrix}$.
(a) Find a matrix B such that $AB = I_2$.
(b) What can you say about the number of such matrices? Give reasons for your answer.
(c) Does there exist a matrix C such that $CA = I_3$? Give reasons for your answer.
10. Let $A = \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{bmatrix}$. Compute the matrix product AB using block matrix multiplication.
11. Let $A = \begin{bmatrix} P & Q \\ Q^* & R \end{bmatrix}$. If P, Q and R are Hermitian, is the matrix A Hermitian?
12. Let $A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & c \end{bmatrix}$, where $A_{11}$ is an n × n invertible matrix and c ∈ C.
(a) If $p = c - A_{21}A_{11}^{-1}A_{12}$ is non-zero, prove that
$$B = \begin{bmatrix} A_{11}^{-1} & 0 \\ 0 & 0 \end{bmatrix} + \frac{1}{p}\begin{bmatrix} A_{11}^{-1}A_{12} \\ -1 \end{bmatrix}\begin{bmatrix} A_{21}A_{11}^{-1} & -1 \end{bmatrix}$$
is the inverse of A.
(b) Use the above to find the inverse of $\begin{bmatrix} 1 & 2 & 1 \\ 2 & 5 & 3 \\ 1 & 1 & 2 \end{bmatrix}$ and $\begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix}$.
13. Let x be an n × 1 vector with real entries satisfying $x^T x = 1$.
(a) Define $A = I_n - 2xx^T$. Prove that A is symmetric and $A^2 = I$. The matrix A is commonly known as the Householder matrix.
(b) Let α ≠ 1 be a real number and define $A = I_n - \alpha xx^T$. Prove that A is symmetric and invertible. [The inverse is also of the form $I_n + \beta xx^T$ for some value of β.]

14. Let A be an invertible matrix of order n and let x and y be two n × 1 vectors with real entries. Also, let β be a real number such that $\alpha = 1 + \beta y^T A^{-1} x \neq 0$. Then prove the famous Sherman-Morrison formula
$$(A + \beta xy^T)^{-1} = A^{-1} - \frac{\beta}{\alpha}\,A^{-1}xy^T A^{-1}.$$
This formula gives the inverse when an invertible matrix is modified by a rank-one matrix. (A numerical check appears after this exercise set.)
15. Suppose the matrices B and C are invertible and the involved partitioned products are defined. Then prove that
$$\begin{bmatrix} A & B \\ C & 0 \end{bmatrix}^{-1} = \begin{bmatrix} 0 & C^{-1} \\ B^{-1} & -B^{-1}AC^{-1} \end{bmatrix}.$$
16. Let J be an n × n matrix having each entry 1.
(a) Prove that $J^2 = nJ$.
(b) Let $\alpha_1, \alpha_2, \beta_1, \beta_2 \in \mathbb{R}$. Prove that there exist $\alpha_3, \beta_3 \in \mathbb{R}$ such that $(\alpha_1 I_n + \beta_1 J)(\alpha_2 I_n + \beta_2 J) = \alpha_3 I_n + \beta_3 J$.
(c) Let α, β ∈ R with α ≠ 0 and α + nβ ≠ 0, and define $A = \alpha I_n + \beta J$. Prove that A is invertible.
17. Let A be an upper triangular matrix. If $A^*A = AA^*$ then prove that A is a diagonal matrix. The same holds for a lower triangular matrix.
18. Let A be an m × n matrix. Then a matrix G of order n × m is called a generalized inverse of A if AGA = A. For example, a generalized inverse of the matrix A = [1, 2] is the matrix $G = \begin{bmatrix} 1 - 2\alpha \\ \alpha \end{bmatrix}$, for all α ∈ R. A generalized inverse G is called a pseudo inverse or a Moore-Penrose inverse if GAG = G and the matrices AG and GA are symmetric. Check that for α = 2/5 the matrix G is a pseudo inverse of A.
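Exercise 14 above (the Sherman-Morrison formula) can be sanity-checked numerically. A sketch (assuming NumPy; the matrix and vectors are random choices, with A shifted to be comfortably invertible so that α ≠ 0 in practice):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n)) + n * np.eye(n)   # comfortably invertible
x = rng.standard_normal((n, 1))
y = rng.standard_normal((n, 1))
beta = 0.7

Ainv = np.linalg.inv(A)
alpha = 1 + beta * (y.T @ Ainv @ x).item()        # assumed non-zero here
lhs = np.linalg.inv(A + beta * x @ y.T)
rhs = Ainv - (beta / alpha) * (Ainv @ x) @ (y.T @ Ainv)
print(np.allclose(lhs, rhs))   # True
```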

1.4 Summary

In this chapter, we started with the definition of a matrix and came across lots of examples. In particular, the following examples were important:
1. The zero matrix of size m × n, denoted $0_{m \times n}$ or 0.
2. The identity matrix of size n × n, denoted $I_n$ or I.
3. Triangular matrices.
4. Hermitian/symmetric matrices.
5. Skew-Hermitian/skew-symmetric matrices.
6. Unitary/orthogonal matrices.
7. Idempotent matrices.
8. Nilpotent matrices.
We also learnt the product of two matrices. Even though it seemed complicated, it basically tells us that multiplying by a matrix on the
1. left of a matrix A is the same as operating on the rows of A;
2. right of a matrix A is the same as operating on the columns of A.

Chapter 2
System of Linear Equations over R

2.1 Introduction

Let us look at some examples of linear systems.
1. Suppose a, b ∈ R. Consider the system ax = b in the unknown x. If
(a) a ≠ 0 then the system has the unique solution x = b/a.
(b) a = 0 and
i. b ≠ 0 then the system has no solution.
ii. b = 0 then the system has an infinite number of solutions, namely all x ∈ R.
2. Consider a linear system with 2 equations in 2 unknowns. The equation ax + by = c in the unknowns x and y represents a line in $\mathbb{R}^2$ if either a ≠ 0 or b ≠ 0. Thus the solution set of the system $a_1x + b_1y = c_1$, $a_2x + b_2y = c_2$ is given by the points of intersection of the two lines. The different cases are illustrated by the following examples.
[Figure 1, omitted here, shows the two lines intersecting in a single point, coinciding, and being parallel.]
(a) Unique Solution: x + 2y = 1 and x + 3y = 1. The unique solution is $[x,y]^T = [1,0]^T$. Observe that in this case $a_1b_2 - a_2b_1 \neq 0$.
(b) Infinite Number of Solutions: x + 2y = 1 and 2x + 4y = 2. As both equations represent the same line, the solution set is $[x,y]^T = [1-2y, y]^T = [1,0]^T + y[-2,1]^T$ with y arbitrary. Observe that
i. $a_1b_2 - a_2b_1 = 0$, $a_1c_2 - a_2c_1 = 0$ and $b_1c_2 - b_2c_1 = 0$;
ii. the vector $[1,0]^T$ corresponds to the solution x = 1, y = 0 of the given system;
iii. the vector $[-2,1]^T$ corresponds to the solution x = −2, y = 1 of the system x + 2y = 0, 2x + 4y = 0.

(c) No Solution: x + 2y = 1 and 2x + 4y = 3. The equations represent a pair of parallel lines and hence there is no point of intersection. Observe that in this case $a_1b_2 - a_2b_1 = 0$ but $a_1c_2 - a_2c_1 \neq 0$.
3. As a last example, consider 3 equations in 3 unknowns. A linear equation ax + by + cz = d represents a plane in $\mathbb{R}^3$ provided [a, b, c] ≠ [0, 0, 0]. Here, we have to look at the points of intersection of the three given planes.
(a) Unique Solution: Consider the system x + y + z = 3, x + 4y + 2z = 7 and 4x + 10y − z = 13. The unique solution to this system is $[x,y,z]^T = [1,1,1]^T$; i.e., the three planes intersect at a point.
(b) Infinite Number of Solutions: Consider the system x + y + z = 3, x + 2y + 2z = 5 and 3x + 4y + 4z = 11. The solution set is $[x,y,z]^T = [1, 2-z, z]^T = [1,2,0]^T + z[0,-1,1]^T$, with z arbitrary. Observe the following:
i. Here, the three planes intersect in a line.
ii. The vector $[1,2,0]^T$ corresponds to the solution x = 1, y = 2 and z = 0 of the linear system x + y + z = 3, x + 2y + 2z = 5 and 3x + 4y + 4z = 11. Also, the vector $[0,-1,1]^T$ corresponds to the solution x = 0, y = −1 and z = 1 of the linear system x + y + z = 0, x + 2y + 2z = 0 and 3x + 4y + 4z = 0.
(c) No Solution: The system x + y + z = 3, 2x + 2y + 2z = 5 and 3x + 3y + 3z = 3 has no solution. In this case, we have three parallel planes. The readers are advised to supply the proof.

Definition [Linear System] A system of m linear equations in n unknowns $x_1, x_2, \ldots, x_n$ is a set of equations of the form
$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\ &\ \,\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m \end{aligned}$$
where $a_{ij}, b_i \in \mathbb{R}$ for 1 ≤ i ≤ m and 1 ≤ j ≤ n. The linear system is called homogeneous if $b_1 = b_2 = \cdots = b_m = 0$ and non-homogeneous otherwise.

Let $A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$, $x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$ and $b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}$. Then the system can be re-written as Ax = b.

In this setup, the matrix A is called the coefficient matrix and the block matrix [A b] is called the augmented matrix of the linear system.

Remark Consider the augmented matrix [A b] of the linear system Ax = b, where A is an m × n matrix and b and x are column vectors of appropriate size. If $x^T = [x_1, \ldots, x_n]$ then it is important to note that
1. the unknown $x_1$ corresponds to the column ([A b])[:,1];
2. in general, for j = 1, 2, ..., n, the unknown $x_j$ corresponds to the column ([A b])[:,j];
3. the vector b = ([A b])[:, n+1];
4. for i = 1, 2, ..., m, the i-th equation corresponds to the row ([A b])[i,:].

Definition [Consistent, Inconsistent] A linear system is called consistent if it admits a solution and is called inconsistent if it admits no solution. For example, the homogeneous system Ax = 0 is always consistent as 0 is a solution, whereas the system x + y = 2, 2x + 2y = 1 is inconsistent.

Definition Consider the linear system Ax = b. Then the corresponding linear system Ax = 0 is called the associated homogeneous system. As mentioned in the previous paragraph, the associated homogeneous system is always consistent.

Definition [Solution of a Linear System] A solution of Ax = b is a vector y such that Ay indeed equals b. The set of all solutions is called the solution set of the system. For example, the solution set of Ax = b, with $A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 4 & 2 \\ 4 & 10 & -1 \end{bmatrix}$ and $b = \begin{bmatrix} 3 \\ 7 \\ 13 \end{bmatrix}$, equals $\left\{\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}\right\}$.

The readers are advised to supply the proof of the next theorem, which gives information about the solution set of a homogeneous system.

Theorem Consider the homogeneous linear system Ax = 0.
1. Then x = 0, the zero vector, is always a solution.
2. Let u ≠ 0 be a solution of Ax = 0. Then y = cu is also a solution for all c ∈ R.
3. Let $u_1, \ldots, u_k$ be solutions of Ax = 0. Then $\sum_{i=1}^{k} a_i u_i$ is also a solution of Ax = 0, for all $a_i \in \mathbb{R}$, 1 ≤ i ≤ k.

Remark Consider the homogeneous system Ax = 0. Then
1. the vector 0 is called the trivial solution;
2. a non-zero solution is called a non-trivial solution. For example, for the system Ax = 0 with $A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$, the vector $x = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$ is a non-trivial solution.

3. Thus, by the theorem above, the existence of a non-trivial solution of Ax = 0 is equivalent to having an infinite number of solutions for the system Ax = 0.
4. Let u, v be two distinct solutions of the non-homogeneous system Ax = b. Then $x_h = u - v$ is a solution of the homogeneous system Ax = 0. That is, any two solutions of Ax = b differ by a solution of the associated homogeneous system Ax = 0. Or equivalently, the solution set of Ax = b is of the form $\{x_0 + x_h\}$, where $x_0$ is a particular solution of Ax = b and $x_h$ is a solution of the associated homogeneous system Ax = 0.

Exercise
1. Consider a system of 2 equations in 3 unknowns. If this system is consistent then how many solutions does it have?
2. Define a linear system of 3 equations in 2 unknowns such that the system is inconsistent.
3. Define a linear system of 4 equations in 3 unknowns such that the system is inconsistent whereas it has three equations which form a consistent system.
4. Let Ax = b be a system of m equations in n unknowns. Then
(a) determine the possible solution set if m ≥ 3 and n = 2;
(b) determine the possible solution set if m ≥ 4 and n = 3;
(c) can this system have exactly two distinct solutions?
(d) can it have only finitely many (greater than 1) solutions?

2.1.A Elementary Row Operations

Example Solve the linear system y + z = 2, 2x + 3z = 5, x + y + z = 3.
Solution: Let $B_0 = [A\ b]$ be the augmented matrix. Then
$$B_0 = \begin{bmatrix} 0 & 1 & 1 & 2 \\ 2 & 0 & 3 & 5 \\ 1 & 1 & 1 & 3 \end{bmatrix}.$$
We now systematically proceed to get the solution.
1. Interchange the 1st and 2nd equations (interchange $B_0[1,:]$ and $B_0[2,:]$ to get $B_1$).
$$\begin{aligned} 2x + 3z &= 5 \\ y + z &= 2 \\ x + y + z &= 3 \end{aligned} \qquad B_1 = \begin{bmatrix} 2 & 0 & 3 & 5 \\ 0 & 1 & 1 & 2 \\ 1 & 1 & 1 & 3 \end{bmatrix}.$$
2. In the new system, multiply the 1st equation by $\frac{1}{2}$ (multiply $B_1[1,:]$ by $\frac{1}{2}$ to get $B_2$).
$$\begin{aligned} x + \tfrac{3}{2}z &= \tfrac{5}{2} \\ y + z &= 2 \\ x + y + z &= 3 \end{aligned} \qquad B_2 = \begin{bmatrix} 1 & 0 & \tfrac{3}{2} & \tfrac{5}{2} \\ 0 & 1 & 1 & 2 \\ 1 & 1 & 1 & 3 \end{bmatrix}.$$

3. In the new system, replace the 3rd equation by the 3rd equation minus the 1st equation (replace $B_2[3,:]$ by $B_2[3,:] - B_2[1,:]$ to get $B_3$).
$$\begin{aligned} x + \tfrac{3}{2}z &= \tfrac{5}{2} \\ y + z &= 2 \\ y - \tfrac{1}{2}z &= \tfrac{1}{2} \end{aligned} \qquad B_3 = \begin{bmatrix} 1 & 0 & \tfrac{3}{2} & \tfrac{5}{2} \\ 0 & 1 & 1 & 2 \\ 0 & 1 & -\tfrac{1}{2} & \tfrac{1}{2} \end{bmatrix}.$$
4. In the new system, replace the 3rd equation by the 3rd equation minus the 2nd equation (replace $B_3[3,:]$ by $B_3[3,:] - B_3[2,:]$ to get $B_4$).
$$\begin{aligned} x + \tfrac{3}{2}z &= \tfrac{5}{2} \\ y + z &= 2 \\ -\tfrac{3}{2}z &= -\tfrac{3}{2} \end{aligned} \qquad B_4 = \begin{bmatrix} 1 & 0 & \tfrac{3}{2} & \tfrac{5}{2} \\ 0 & 1 & 1 & 2 \\ 0 & 0 & -\tfrac{3}{2} & -\tfrac{3}{2} \end{bmatrix}.$$
5. In the new system, multiply the 3rd equation by $-\frac{2}{3}$ (multiply $B_4[3,:]$ by $-\frac{2}{3}$ to get $B_5$).
$$\begin{aligned} x + \tfrac{3}{2}z &= \tfrac{5}{2} \\ y + z &= 2 \\ z &= 1 \end{aligned} \qquad B_5 = \begin{bmatrix} 1 & 0 & \tfrac{3}{2} & \tfrac{5}{2} \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 1 & 1 \end{bmatrix}.$$
The last equation gives z = 1. Using this, the second equation gives y = 1. Finally, the first equation gives x = 1. Hence, the solution set is $\{[x,y,z]^T \mid [x,y,z] = [1,1,1]\}$, a unique solution.

In the example above, observe that for each operation on the system of linear equations there is a corresponding operation on the rows of the augmented matrix. We use this idea to define elementary row operations and the equivalence of two linear systems.

Definition [Elementary Row Operations] Let A be an m × n matrix. Then the elementary row operations are
1. $E_{ij}$: interchange of A[i,:] and A[j,:];
2. $E_k(c)$ for c ≠ 0: multiply A[k,:] by c;
3. $E_{ij}(c)$ for c ≠ 0: replace A[i,:] by A[i,:] + cA[j,:].

Definition [Equivalent Linear Systems] Consider the linear systems Ax = b and Cx = d with corresponding augmented matrices [A b] and [C d], respectively. Then the two linear systems are said to be equivalent if [C d] can be obtained from [A b] by application of a finite number of elementary row operations.

Definition [Equivalent Matrices] Two matrices are said to be equivalent if one can be obtained from the other by a finite number of elementary row operations.

Thus, note that the linear systems at each step in the example above are equivalent to each other.
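The row operations of the worked example can be replayed mechanically on the augmented matrix. A short sketch (assuming NumPy; in-place row updates mirror the five steps above):

```python
import numpy as np

# Augmented matrix of y + z = 2, 2x + 3z = 5, x + y + z = 3.
B = np.array([[0., 1., 1., 2.],
              [2., 0., 3., 5.],
              [1., 1., 1., 3.]])

B[[0, 1]] = B[[1, 0]]   # E_12 : interchange rows 1 and 2
B[0] *= 1 / 2           # E_1(1/2)
B[2] -= B[0]            # E_31(-1) : row 3 <- row 3 - row 1
B[2] -= B[1]            # E_32(-1) : row 3 <- row 3 - row 2
B[2] *= -2 / 3          # E_3(-2/3)
print(B)                # last row reads z = 1; back substitution gives y = x = 1
```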

We now prove that the solution sets of two equivalent linear systems are the same.

Lemma Let Cx = d be the linear system obtained from Ax = b by application of a single elementary row operation. Then Ax = b and Cx = d have the same solution set.
Proof. We prove the result for the elementary row operation $E_{jk}(c)$ with c ≠ 0. The reader is advised to prove the result for the other two elementary operations. In this case, the systems Ax = b and Cx = d vary only in the j-th equation. So, we need to show that y satisfies the j-th equation of Ax = b if and only if y satisfies the j-th equation of Cx = d. So, let $y^T = [\alpha_1, \ldots, \alpha_n]$. Then the j-th and k-th equations of Ax = b are $a_{j1}\alpha_1 + \cdots + a_{jn}\alpha_n = b_j$ and $a_{k1}\alpha_1 + \cdots + a_{kn}\alpha_n = b_k$. Therefore, we see that the $\alpha_i$'s satisfy
$$(a_{j1} + ca_{k1})\alpha_1 + \cdots + (a_{jn} + ca_{kn})\alpha_n = b_j + cb_k. \qquad (*)$$
Also, by definition the j-th equation of Cx = d equals
$$(a_{j1} + ca_{k1})x_1 + \cdots + (a_{jn} + ca_{kn})x_n = b_j + cb_k. \qquad (**)$$
Therefore, using Equation (*), we see that $y^T = [\alpha_1, \ldots, \alpha_n]$ is also a solution of Equation (**). Now, use a similar argument to show that if $z^T = [\beta_1, \ldots, \beta_n]$ is a solution of Cx = d then it is also a solution of Ax = b. Hence, the required result follows.

The readers are advised to use the lemma above as an induction step to prove the main result of this subsection, which is stated next.

Theorem Let Ax = b and Cx = d be two equivalent linear systems. Then they have the same solution set.

2.2 System of Linear Equations

In the previous section, we saw that if one linear system can be obtained from another by a repeated application of elementary row operations then the two linear systems have the same solution set. Sometimes it helps to imagine an elementary row operation as multiplication on the left by an elementary matrix. In this section, we will try to understand this relationship and use it to first obtain results for systems of linear equations and then apply them to the theory of square matrices.

2.2.A Elementary Matrices and the Row-Reduced Echelon Form (RREF)

Definition A square matrix E of order n is called an elementary matrix if it is obtained by applying exactly one elementary row operation to the identity matrix $I_n$.

Remark The elementary matrices are of three types and they correspond to elementary row operations.
1. $E_{ij}$: matrix obtained by applying the elementary row operation $E_{ij}$ to $I_n$.
2. $E_k(c)$ for c ≠ 0: matrix obtained by applying the elementary row operation $E_k(c)$ to $I_n$.

3. $E_{ij}(c)$ for c ≠ 0: matrix obtained by applying the elementary row operation $E_{ij}(c)$ to $I_n$.

Example
1. In particular, for n = 3 and a real number c ≠ 0, one has
$$E_{23} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}, \quad E_1(c) = \begin{bmatrix} c & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad E_{31}(c) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ c & 0 & 1 \end{bmatrix} \quad \text{and} \quad E_{23}(c) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & c \\ 0 & 0 & 1 \end{bmatrix}.$$
2. Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 0 & 3 \\ 3 & 4 & 5 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 2 & 3 \\ 3 & 4 & 5 \\ 2 & 0 & 3 \end{bmatrix}$. Then verify that $E_{23}A = B$: left multiplication by $E_{23}$ interchanges the 2nd and 3rd rows of A.
3. Products of elementary matrices apply the corresponding row operations in succession. For example, $E_{21}(-1)E_{32}(-2)A$ first replaces A[3,:] by A[3,:] − 2A[2,:] and then replaces A[2,:] by A[2,:] − A[1,:]; similarly, $E_{31}(-2)E_{13}E_{31}(-1)A$ applies $E_{31}(-1)$, then $E_{13}$, then $E_{31}(-2)$, in that order.

Exercise
1. Which of the following matrices are elementary?
$$\begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} \tfrac{1}{2} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 & 0 \\ 5 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}.$$
2. Let $A = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}$. Find elementary matrices $E_1, \ldots, E_k$ such that $E_k \cdots E_1 A = I_2$.
3. Let $B = \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix}$. Determine elementary matrices $F_1, \ldots, F_k$ such that $F_k \cdots F_1 B = I_3$.

Remark Observe that
1. $(E_{ij})^{-1} = E_{ij}$ as $E_{ij}E_{ij} = I = E_{ij}E_{ij}$.
2. Let c ≠ 0. Then $(E_k(c))^{-1} = E_k(1/c)$ as $E_k(c)E_k(1/c) = I = E_k(1/c)E_k(c)$.
3. Let c ≠ 0. Then $(E_{ij}(c))^{-1} = E_{ij}(-c)$ as $E_{ij}(c)E_{ij}(-c) = I = E_{ij}(-c)E_{ij}(c)$.
Thus, each elementary matrix is invertible and the inverse is also an elementary matrix.
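Elementary matrices and their inverses are easy to build and check in code. A sketch (assuming NumPy; the three constructors follow the remark above):

```python
import numpy as np

n = 3
E12 = np.eye(n); E12[[0, 1]] = E12[[1, 0]]   # E_12 : interchange rows 1 and 2
E2c = np.eye(n); E2c[1, 1] = 5.0             # E_2(5) : scale row 2 by 5
E31c = np.eye(n); E31c[2, 0] = -1.0          # E_31(-1) : row 3 <- row 3 - row 1

A = np.arange(9.0).reshape(3, 3)
# Left multiplication performs the row operation:
print(np.array_equal(E12 @ A, A[[1, 0, 2]]))   # True

# (E_31(-1))^{-1} = E_31(1), as in item 3 of the remark:
E31inv = np.eye(n); E31inv[2, 0] = 1.0
print(np.allclose(np.linalg.inv(E31c), E31inv))   # True
```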

Based on the above observation and the fact that a product of invertible matrices is invertible, the readers are advised to prove the next result.

Lemma Applying an elementary row operation is equivalent to multiplying on the left by the corresponding elementary matrix, and vice-versa.

Proposition Let A and B be two equivalent matrices. Then there exist elementary matrices $E_1, \ldots, E_k$ such that $B = E_1 \cdots E_k A$.
Proof. By definition of equivalence, the matrix B can be obtained from A by a finite number of elementary row operations. But by the lemma above, each elementary row operation on A corresponds to multiplying on the left of A by an elementary matrix. Thus, the required result follows.

We now give a direct proof of the theorem on equivalent linear systems. To do so, we state the theorem once again.

Theorem Let Ax = b and Cx = d be two equivalent linear systems. Then they have the same solution set.
Proof. Let $E_1, \ldots, E_k$ be the elementary matrices such that $E_1 \cdots E_k [A\ b] = [C\ d]$. Then, by the definition of the matrix product and the remark above,
$$E_1 \cdots E_k A = C, \quad E_1 \cdots E_k b = d \quad \text{and} \quad A = (E_1 \cdots E_k)^{-1} C, \quad b = (E_1 \cdots E_k)^{-1} d. \qquad (*)$$
Now assume that Ay = b holds. Then, by Equation (*),
$$Cy = E_1 \cdots E_k A y = E_1 \cdots E_k b = d. \qquad (**)$$
On the other hand, if Cz = d holds then, using Equation (*), we have
$$Az = (E_1 \cdots E_k)^{-1} C z = (E_1 \cdots E_k)^{-1} d = b. \qquad (***)$$
Therefore, using Equations (**) and (***), the required result follows.

As an immediate corollary, the readers are advised to prove the following result.

Corollary Let A and B be two equivalent matrices. Then the systems Ax = 0 and Bx = 0 have the same solution set.

Example Are the matrices $A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 & a \\ 0 & 1 & b \end{bmatrix}$ equivalent, when (a, b) ≠ (0, 0)?
Solution: No, as $\begin{bmatrix} -a \\ -b \\ 1 \end{bmatrix}$ is a solution of Bx = 0 but it isn't a solution of Ax = 0.

Definition [Pivot/Leading Term] Let A be a non-zero matrix. Then, a pivot/leading term is the first (from the left) non-zero element of a non-zero row in A, and the column containing the pivot term is called the pivot column. If $a_{ij}$ is a pivot then we denote it by $\boxed{a_{ij}}$. For example, the entries $a_{12}$ and $a_{23}$ are pivots in $A = \begin{bmatrix} 0 & \boxed{3} & 4 & 2 \\ 0 & 0 & \boxed{2} & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$. Thus, columns 2 and 3 are the pivot columns.

Definition [Echelon Form] A matrix A is in echelon form (EF) (ladder-like) if
1. the pivot of the (i+1)-th row comes to the right of the pivot of the i-th row;
2. the entries below a pivot in a pivotal column are 0;
3. the zero rows are at the bottom.

Example
1. The following matrices are in echelon form:
$$\begin{bmatrix} \boxed{1} & 2 & 0 & 5 \\ 0 & 0 & \boxed{3} & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} \boxed{2} & 1 & 0 \\ 0 & \boxed{1} & 4 \\ 0 & 0 & \boxed{2} \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} \boxed{1} & 1 & 0 & 3 \\ 0 & 0 & \boxed{1} & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
2. The following matrices are not in echelon form (determine the rule(s) that fail):
$$\begin{bmatrix} 0 & 0 & 0 & 0 \\ 1 & 2 & 0 & 5 \\ 0 & 0 & 3 & 1 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 1 & 2 & 0 \\ 0 & 0 & 3 \\ 0 & 1 & 0 \end{bmatrix}.$$

Definition [Row-Reduced Echelon Form] A matrix C is said to be in row-reduced echelon form (RREF) if
1. C is already in echelon form;
2. the pivot of each non-zero row is 1;
3. every other entry in each pivotal column is zero.
A matrix in RREF is also called a row-reduced echelon matrix.

Example
1. The following matrices are in RREF:
$$\begin{bmatrix} 1 & 0 & 0 & 5 \\ 0 & 1 & 0 & 6 \\ 0 & 0 & 1 & 2 \end{bmatrix}, \quad \begin{bmatrix} 1 & 4 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 0 & 1 & 3 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
2. The following matrices are not in RREF (determine the rule(s) that fail):
$$\begin{bmatrix} 1 & 0 & 3 \\ 0 & 2 & 0 \\ 0 & 0 & 0 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 1 & 3 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

The proof of the next result is beyond the scope of this book and hence is omitted.

Theorem Let A and B be two matrices in RREF. If they are equivalent then A = B.

As an immediate corollary, we obtain the following important result.

Corollary The RREF of a matrix is unique.
Proof. Suppose there exists a matrix A with two different RREFs, say B and C. As RREFs are obtained by multiplication by elementary matrices, there exist elementary matrices $E_1, \ldots, E_k$ and $F_1, \ldots, F_l$ such that $B = E_1 \cdots E_k A$ and $C = F_1 \cdots F_l A$. Thus,
$$B = E_1 \cdots E_k A = E_1 \cdots E_k (F_1 \cdots F_l)^{-1} C = E_1 \cdots E_k F_l^{-1} \cdots F_1^{-1} C.$$

As inverses of elementary matrices are elementary matrices, we see that the matrices B and C are equivalent. As B and C are in RREF, using the theorem above, we see that B = C.

Theorem Let A be an m × n matrix. If B consists of the first s columns of A then RREF(B) equals the first s columns of RREF(A).
Proof. Let us write F = RREF(A). By definition of RREF, there exist elementary matrices $E_1, \ldots, E_k$ such that $E_1 \cdots E_k A = F$. Then, by matrix multiplication,
$$E_1 \cdots E_k A = [E_1 \cdots E_k A[:,1], \ldots, E_1 \cdots E_k A[:,n]] = [F[:,1], \ldots, F[:,n]].$$
Thus, $E_1 \cdots E_k B = [E_1 \cdots E_k A[:,1], \ldots, E_1 \cdots E_k A[:,s]] = [F[:,1], \ldots, F[:,s]]$. Since the matrix F is in RREF, by definition its first s columns are also in RREF. Hence, by the corollary above we see that RREF(B) = [F[:,1], ..., F[:,s]]. Thus, the required result follows.

Let A be an m × n matrix. Then by the corollary above, its RREF is unique. We use it to define our next object.

2.2.B Rank of a Matrix

Definition [Row-Rank of a Matrix] Let A be an m × n matrix and let r be the number of pivots (the number of non-zero rows) in its RREF. Then the row-rank of A, denoted row-rank(A), equals r. For example, row-rank($I_n$) = n and row-rank(0) = 0.

Remark
1. Even though row-rank is defined using the RREF of a matrix, we just need to compute an echelon form, as the number of non-zero rows/pivots does not change when we proceed from the echelon form to the RREF.
2. Let A be an m × n matrix. Then by the definition of RREF, the number of pivots, say r, satisfies r ≤ min{m, n}. Thus, row-rank(A) ≤ min{m, n}.

Example Determine the row-rank of the following matrices.
1. diag($d_1, \ldots, d_n$).
Solution: Let $S = \{i \mid 1 \leq i \leq n,\ d_i \neq 0\}$. Then verify that the row-rank equals the number of elements in S.
2. $A = \begin{bmatrix} 1 & 2 & 1 & 1 \\ 2 & 3 & 1 & 2 \\ 1 & 1 & 2 & 1 \end{bmatrix}$.
Solution: The echelon form of A is obtained as follows:
$$\begin{bmatrix} 1 & 2 & 1 & 1 \\ 2 & 3 & 1 & 2 \\ 1 & 1 & 2 & 1 \end{bmatrix} \xrightarrow{E_{31}(-1),\ E_{21}(-2)} \begin{bmatrix} 1 & 2 & 1 & 1 \\ 0 & -1 & -1 & 0 \\ 0 & -1 & 1 & 0 \end{bmatrix} \xrightarrow{E_{32}(-1)} \begin{bmatrix} 1 & 2 & 1 & 1 \\ 0 & -1 & -1 & 0 \\ 0 & 0 & 2 & 0 \end{bmatrix}.$$
As the echelon form of A has 3 non-zero rows, row-rank(A) = 3.

3. $A = \begin{bmatrix} 1 & 2 & 1 & 1 \\ 2 & 3 & 1 & 2 \\ 1 & 1 & 0 & 1 \end{bmatrix}$.
Solution: row-rank(A) = 2 as the echelon form of A (given below) has two non-zero rows:
$$\begin{bmatrix} 1 & 2 & 1 & 1 \\ 2 & 3 & 1 & 2 \\ 1 & 1 & 0 & 1 \end{bmatrix} \xrightarrow{E_{31}(-1),\ E_{21}(-2)} \begin{bmatrix} 1 & 2 & 1 & 1 \\ 0 & -1 & -1 & 0 \\ 0 & -1 & -1 & 0 \end{bmatrix} \xrightarrow{E_{32}(-1)} \begin{bmatrix} 1 & 2 & 1 & 1 \\ 0 & -1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$

Remark Let Ax = b be a linear system with m equations in n unknowns, and let RREF([A b]) = [C d], row-rank(A) = r and row-rank([A b]) = $r_a$.
1. Then, using the theorem on the first s columns of the RREF, conclude that $r \leq r_a$.
2. If $r < r_a$ then, again using that theorem, note that $r_a = r + 1$ and ([C d])[:, n+1] has a pivot at the (r+1)-th place. Hence, by the definition of RREF, $([C\ d])[r+1,:] = [0^T, 1]$.
3. If $r = r_a$ then ([C d])[:, n+1] has no pivot. Thus, $[0^T, 1]$ is not a row of [C d].

Now, consider an m × n matrix A and an elementary matrix E of order n. Then the product AE corresponds to applying a column transformation to the matrix A. Therefore, for each elementary matrix, there is a corresponding column transformation as well. We summarize these ideas as follows.

Definition The column transformations obtained by right multiplication by elementary matrices are called column operations. For example, if $A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 2 & 0 & 3 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}$ then $AE_{23} = \begin{bmatrix} 1 & 3 & 2 & 4 \\ 2 & 3 & 0 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}$ and $AE_{14}(-1) = \begin{bmatrix} 1 & 2 & 3 & 3 \\ 2 & 0 & 3 & -1 \\ 1 & 1 & 1 & 0 \end{bmatrix}$.

Remark (Rank of a Matrix)
1. The idea of row-rank was based on the RREF, and the RREF was the result of systematically applying a finite number of elementary row operations. So, starting with a matrix A, we can systematically apply a finite number of elementary column operations (see the definition above) to get a matrix which in some sense looks similar to the RREF, call it B, and then use the non-zero columns in B to define the column-rank. Note that B will have the following form:
(a) The zero columns appear after the non-zero columns.
(b) The first non-zero entry of a non-zero column is 1, called the pivot.
(c) The pivots in non-zero columns move down in successive columns.
2. It will be proved later that row-rank(A) = column-rank(A). Thus, we just talk of the rank, denoted Rank(A).
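The rank found by hand above can be cross-checked numerically. A one-line sketch (assuming NumPy; matrix_rank computes the rank via the SVD, which for this small integer matrix agrees with the echelon computation):

```python
import numpy as np

A = np.array([[1, 2, 1, 1],
              [2, 3, 1, 2],
              [1, 1, 0, 1]])
print(np.linalg.matrix_rank(A))   # 2, matching the echelon form above
```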

We are now ready to prove a few results associated with the rank of a matrix.

Theorem Let A be a matrix of rank r. Then there exist a finite number of elementary matrices $E_1, \ldots, E_s$ and $F_1, \ldots, F_l$ such that
$$E_1 \cdots E_s\, A\, F_1 \cdots F_l = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}.$$
Proof. Let C = RREF(A). As Rank(A) = r, by the definition of RREF there exist elementary matrices $E_1, \ldots, E_s$ such that $C = E_1 \cdots E_s A$. Note that C has r pivots and they appear in columns, say $i_1 < i_2 < \cdots < i_r$. Now, let D be the matrix obtained from C by successively multiplying the elementary matrices $E_{j i_j}$, for 1 ≤ j ≤ r, on the right of C. Then observe that $D = \begin{bmatrix} I_r & B \\ 0 & 0 \end{bmatrix}$, where B is a matrix of an appropriate size. As the (1,1) block of D is an identity matrix, the (1,2) block can be made the zero matrix by applying elementary column operations to D. Thus, the required result follows.

Exercise
1. Let $A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$. Find invertible matrices P and Q such that B = PAQ.
2. Let A be a matrix of rank r. Then prove that there exist invertible matrices $B_i, C_i$ such that
$$B_1 A = \begin{bmatrix} R_1 & R_2 \\ 0 & 0 \end{bmatrix}, \quad AC_1 = \begin{bmatrix} S_1 & 0 \\ S_3 & 0 \end{bmatrix}, \quad B_2 A C_2 = \begin{bmatrix} A_1 & 0 \\ 0 & 0 \end{bmatrix} \quad \text{and} \quad B_3 A C_3 = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix},$$
where the (1,1) block of each matrix is of size r × r. Also, prove that $A_1$ is an invertible matrix.
3. Let A be an m × n matrix of rank r. Then prove that A can be written as A = BC, where both B and C have rank r and B is of size m × r and C is of size r × n.
4. Prove that if the product AB is defined and Rank(A) = Rank(AB) then A = ABX for some matrix X. Similarly, if BA is defined and Rank(A) = Rank(BA) then A = YBA for some matrix Y. [Hint: Choose invertible matrices P, Q satisfying $PAQ = \begin{bmatrix} A_1 & 0 \\ 0 & 0 \end{bmatrix}$ and $P(AB) = (PAQ)(Q^{-1}B) = \begin{bmatrix} A_2 & A_3 \\ 0 & 0 \end{bmatrix}$. Now find an invertible matrix R such that $P(AB)R = \begin{bmatrix} C & 0 \\ 0 & 0 \end{bmatrix}$. Now, define $X = R\begin{bmatrix} C^{-1}A_1 & 0 \\ 0 & 0 \end{bmatrix}Q^{-1}$ to get the required result.]
5. Prove that if AB is defined then Rank(AB) ≤ Rank(A) and Rank(AB) ≤ Rank(B).
6. Let P and Q be invertible matrices such that the matrix product PAQ is defined. Then prove that Rank(PAQ) = Rank(A).
7. Prove that if A + B is defined then Rank(A + B) ≤ Rank(A) + Rank(B).

2.2.C Gauss-Jordan Elimination and System of Linear Equations

Let A be an m × n matrix. We now present an algorithm, commonly known as the Gauss-Jordan Elimination (GJE) method, to compute the RREF of A.

1. Input: A.
2. Output: a matrix B in RREF such that A is row equivalent to B.
3. Step 1: Put Region = A.
4. Step 2: If all entries in the Region are 0, STOP. Else, in the Region, find the leftmost non-zero column and find its topmost non-zero entry. Suppose this non-zero entry is $a_{ij}$. Box it. This is a pivot.
5. Step 3: Interchange the row containing the pivot with the top row of the Region. Also, make the pivot entry 1. Use this pivot to make the other entries in the pivotal column 0.
6. Step 4: Put Region = the sub-matrix below and to the right of the current pivot. Now, go to Step 2.
Important: The process will stop, as we can get at most min{m, n} pivots.

Example Apply GJE to $A = \begin{bmatrix} 0 & 2 & 3 & 7 \\ 1 & 1 & 1 & 1 \\ 1 & 3 & 4 & 8 \\ 0 & 0 & 0 & 1 \end{bmatrix}$.
1. Region = A as $A \neq 0$. The leftmost non-zero column is the first; its topmost non-zero entry is $a_{21}$. Then $E_{12}A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 2 & 3 & 7 \\ 1 & 3 & 4 & 8 \\ 0 & 0 & 0 & 1 \end{bmatrix}$. Also, $E_{31}(-1)E_{12}A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 2 & 3 & 7 \\ 0 & 2 & 3 & 7 \\ 0 & 0 & 0 & 1 \end{bmatrix} = B$ (say).
2. Now, Region is the sub-matrix of B below and to the right of the first pivot. Let $C = \begin{bmatrix} 2 & 3 & 7 \\ 2 & 3 & 7 \\ 0 & 0 & 1 \end{bmatrix}$. Then $E_1(\tfrac{1}{2})C = \begin{bmatrix} 1 & \tfrac{3}{2} & \tfrac{7}{2} \\ 2 & 3 & 7 \\ 0 & 0 & 1 \end{bmatrix}$ and $E_{21}(-2)E_1(\tfrac{1}{2})C = \begin{bmatrix} 1 & \tfrac{3}{2} & \tfrac{7}{2} \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$. Or equivalently, $E_{32}(-2)E_2(\tfrac{1}{2})B = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & \tfrac{3}{2} & \tfrac{7}{2} \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$.
3. The next pivot lies in the fourth column; interchanging the third and fourth rows gives $E_{34}E_{32}(-2)E_2(\tfrac{1}{2})E_{31}(-1)E_{12}A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & \tfrac{3}{2} & \tfrac{7}{2} \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$.
4. Hence, clearing the entries above the pivots,
$$E_{12}(-1)\,E_{13}(-1)\,E_{23}(-\tfrac{7}{2})\,E_{34}\,E_{32}(-2)\,E_2(\tfrac{1}{2})\,E_{31}(-1)\,E_{12}\,A = \begin{bmatrix} 1 & 0 & -\tfrac{1}{2} & 0 \\ 0 & 1 & \tfrac{3}{2} & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
As the matrix A has been multiplied by elementary matrices on the left, the resulting RREF matrix is equivalent to A.
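The GJE algorithm above translates almost line for line into code. Below is a plain, unoptimized sketch (assuming NumPy; no care is taken about floating-point stability, and the test matrix is the one from the example):

```python
import numpy as np

def rref(A, tol=1e-12):
    """Gauss-Jordan elimination as described in Steps 1-4 above."""
    R = A.astype(float).copy()
    m, n = R.shape
    piv = 0                                    # top row of the current Region
    for col in range(n):
        # Step 2: topmost non-zero entry of the leftmost non-zero column.
        rows = [r for r in range(piv, m) if abs(R[r, col]) > tol]
        if not rows:
            continue
        R[[piv, rows[0]]] = R[[rows[0], piv]]  # Step 3: bring the pivot row up,
        R[piv] /= R[piv, col]                  # make the pivot 1,
        for i in range(m):                     # and clear the rest of its column.
            if i != piv:
                R[i] -= R[i, col] * R[piv]
        piv += 1                               # Step 4: shrink the Region.
        if piv == m:
            break
    return R

A = np.array([[0, 2, 3, 7],
              [1, 1, 1, 1],
              [1, 3, 4, 8],
              [0, 0, 0, 1]])
print(rref(A))   # matches the RREF computed in the example above
```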

Remark (Gauss Elimination (GE)) GE is similar to GJE except that
1. the pivots need not be made 1, and
2. the entries above the pivots need not be made 0.
For example, if $A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 1 & 4 & 9 \end{bmatrix}$ then GE gives $E_{32}(-3)E_{21}(-1)E_{31}(-1)A = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & 0 & 2 \end{bmatrix}$.
Thus, Gauss-Jordan Elimination may be viewed as an extension of Gauss Elimination.

Example Consider the system Ax = b with A a matrix of order 3 × 3 and A[:,1] ≠ 0. If [C d] = RREF([A b]) then the possible choices for [C d] are given below:
1. $\begin{bmatrix} 1 & 0 & 0 & d_1 \\ 0 & 1 & 0 & d_2 \\ 0 & 0 & 1 & d_3 \end{bmatrix}$. Here, Ax = b is consistent with the unique solution $[x,y,z]^T = [d_1, d_2, d_3]^T$.
2. $\begin{bmatrix} 1 & 0 & \alpha & 0 \\ 0 & 1 & \beta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$, $\begin{bmatrix} 1 & \alpha & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$ or $\begin{bmatrix} 1 & \alpha & \beta & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$. Here, Ax = b is inconsistent for any choice of α, β.
3. $\begin{bmatrix} 1 & 0 & \alpha & d_1 \\ 0 & 1 & \beta & d_2 \\ 0 & 0 & 0 & 0 \end{bmatrix}$, $\begin{bmatrix} 1 & \alpha & 0 & d_1 \\ 0 & 0 & 1 & d_2 \\ 0 & 0 & 0 & 0 \end{bmatrix}$ or $\begin{bmatrix} 1 & \alpha & \beta & d_1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$. Here, Ax = b is consistent and has an infinite number of solutions for every choice of α, β.

Exercise
1. Let Ax = b be a linear system of m equations in 2 unknowns. What are the possible choices for RREF([A b]) if m ≥ 1?
2. Find the row-reduced echelon form of the following matrices:
$$\begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 3 \\ 3 & 0 & 7 \end{bmatrix}, \quad \begin{bmatrix} 0 & 1 & 1 & 3 \\ 0 & 0 & 1 & 3 \\ 1 & 1 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & -1 & 1 \\ -2 & 0 & 3 \\ -5 & 1 & 0 \end{bmatrix}, \quad \begin{bmatrix} 1 & -1 & 2 \\ -1 & 1 & -2 \\ 2 & -2 & 4 \end{bmatrix}.$$
3. Find the solutions of the following linear system of equations using the Gauss-Jordan method:
$$\begin{aligned} x + y - 2u + v &= 2 \\ z + u + 2v &= 3 \\ v + w &= 3 \\ v + 2w &= 5. \end{aligned}$$

Now, using the proposition on equivalent matrices, the theorem on the first s columns of the RREF and the definition of the RREF of a matrix, we obtain the following remark.

Remark Let RREF([A b]) = [C d]. Then

1. there exist elementary matrices, say $E_1, \ldots, E_k$, such that $E_1 \cdots E_k [A\ b] = [C\ d]$. Thus, GJE (or GE) is equivalent to multiplying by a finite number of elementary matrices on the left of [A b];
2. RREF(A) = C.

Definition [Basic, Free Variables] Consider the linear system Ax = b. If RREF([A b]) = [C d] then the unknowns
1. corresponding to the pivotal columns of C are called the basic variables;
2. that are not basic are called free variables.

Example
1. Let $\mathrm{RREF}([A\ b]) = \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix}$. Then Ax = b has
$$\{[x,y,z]^T \mid [x,y,z] = [1, 2-z, z] = [1,2,0] + z[0,-1,1],\ \text{with } z \text{ arbitrary}\}$$
as its solution set. Note that x and y are basic variables and z is the free variable.
2. Let $\mathrm{RREF}([A\ b]) = \begin{bmatrix} 1 & 0 & 2 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$. Then the system Ax = b has no solution as RREF([A b])[3,:] = [0, 0, 0, 1], which corresponds to the equation $0\cdot x + 0\cdot y + 0\cdot z = 1$.
3. Suppose the system Ax = b is consistent and RREF(A) has r non-zero rows. Then the system has r basic variables and n − r free variables.

We now prove the main result in the theory of linear systems (recall the remark on r and $r_a$ above).

Theorem Let A be an m × n matrix, and let RREF([A b]) = [C d], Rank(A) = r and Rank([A b]) = $r_a$. Then Ax = b
1. is inconsistent if $r < r_a$;
2. is consistent if $r = r_a$. Furthermore, Ax = b has
(a) a unique solution if r = n;
(b) an infinite number of solutions if r < n. In this case, the solution set equals
$$\{x_0 + k_1u_1 + k_2u_2 + \cdots + k_{n-r}u_{n-r} \mid k_i \in \mathbb{R},\ 1 \leq i \leq n-r\},$$
where $x_0, u_1, \ldots, u_{n-r} \in \mathbb{R}^n$ with $Ax_0 = b$ and $Au_i = 0$, for 1 ≤ i ≤ n − r.
Proof. Part 1: As $r < r_a$, by the remark above, $([C\ d])[r+1,:] = [0^T, 1]$. Then this row corresponds to the linear equation $0x_1 + 0x_2 + \cdots + 0x_n = 1$, which clearly has no solution. Thus, by definition and the theorem on equivalent systems, Ax = b is inconsistent.

Part 2: As $r = r_a$, by the remark above, [C d] doesn't have a row of the form $[0^T, 1]$, and there are r pivots in C. Suppose the pivots appear in columns $i_1, \ldots, i_r$ with $1 \leq i_1 < \cdots < i_r \leq n$. Thus, the unknowns $x_{i_j}$, for 1 ≤ j ≤ r, are basic variables and the remaining n − r variables, say $x_{t_1}, \ldots, x_{t_{n-r}}$, are free variables with $t_1 < \cdots < t_{n-r}$. Since C is in RREF, in terms of the free and basic variables, the l-th row of [C d], for 1 ≤ l ≤ r, corresponds to the equation
$$x_{i_l} + \sum_{k=1}^{n-r} c_{l t_k} x_{t_k} = d_l \quad \Longleftrightarrow \quad x_{i_l} = d_l - \sum_{k=1}^{n-r} c_{l t_k} x_{t_k}.$$
Hence, the solution set of the system Cx = d is given by
$$\begin{bmatrix} x_{i_1} \\ \vdots \\ x_{i_r} \\ x_{t_1} \\ x_{t_2} \\ \vdots \\ x_{t_{n-r}} \end{bmatrix} = \begin{bmatrix} d_1 - \sum_k c_{1 t_k} x_{t_k} \\ \vdots \\ d_r - \sum_k c_{r t_k} x_{t_k} \\ x_{t_1} \\ x_{t_2} \\ \vdots \\ x_{t_{n-r}} \end{bmatrix} = \begin{bmatrix} d_1 \\ \vdots \\ d_r \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} + x_{t_1}\begin{bmatrix} -c_{1t_1} \\ \vdots \\ -c_{rt_1} \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} + x_{t_2}\begin{bmatrix} -c_{1t_2} \\ \vdots \\ -c_{rt_2} \\ 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix} + \cdots + x_{t_{n-r}}\begin{bmatrix} -c_{1t_{n-r}} \\ \vdots \\ -c_{rt_{n-r}} \\ 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}. \qquad (*)$$
(Here the coordinates are listed with the basic variables first; permuting them back to the natural order $x_1, \ldots, x_n$ gives the vectors $x_0, u_1, \ldots, u_{n-r}$.) Thus, by the theorem on equivalent systems, the system Ax = b is consistent. In case of Part 2a, r = n and hence there are no free variables. Thus, the unique solution equals $x_i = d_i$, for 1 ≤ i ≤ n. In case of Part 2b, define $x_0$ as the first vector on the right of Equation (*) and $u_1, \ldots, u_{n-r}$ as the vectors multiplying $x_{t_1}, \ldots, x_{t_{n-r}}$. Then it can be easily verified that $Ax_0 = b$ and, for 1 ≤ i ≤ n − r, $Au_i = 0$, and by Equation (*) the solution set has indeed the required form, where $k_i$ corresponds to the free variable $x_{t_i}$. As there is at least one free variable, the system has an infinite number of solutions. Thus, the proof of the theorem is complete.

Let A be an m × n matrix. Then by the remark on row-rank, Rank(A) ≤ m, and hence using the theorem above the next result follows.

Corollary Let A be a matrix of order m × n and consider Ax = 0. If
1. Rank(A) = r < min{m, n} then Ax = 0 has infinitely many solutions;
2. m < n, then Ax = 0 has infinitely many solutions.
Thus, in either case, the homogeneous system Ax = 0 has at least one non-trivial solution.

Remark Let A be an m × n matrix. Then the theorem above implies that
1. the linear system Ax = b is consistent if and only if Rank(A) = Rank([A b]);

2. the vectors associated to the free variables in Equation (*) are solutions of the associated homogeneous system Ax = 0.

We end this subsection with some applications.

Example
1. Determine the equation of the line/circle that passes through the points (−1,4), (0,1) and (1,4).
Solution: The general equation of a line/circle in the Euclidean plane is given by $a(x^2 + y^2) + bx + cy + d = 0$, where a, b, c and d are unknowns. Since this curve passes through the given points, we get a homogeneous system of 3 equations in 4 unknowns, namely
$$\begin{bmatrix} (-1)^2 + 4^2 & -1 & 4 & 1 \\ 0^2 + 1^2 & 0 & 1 & 1 \\ 1^2 + 4^2 & 1 & 4 & 1 \end{bmatrix}\begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} = 0.$$
Solving this system, we get $[a,b,c,d] = [\frac{3}{13}d, 0, -\frac{16}{13}d, d]$. Hence, taking d = 13, the equation of the required circle is $3(x^2 + y^2) - 16y + 13 = 0$.
2. Determine the equation of the plane that contains the points (1,1,1), (1,3,2) and (2,−1,2).
Solution: The general equation of a plane in space is given by ax + by + cz + d = 0, where a, b, c and d are unknowns. Since this plane passes through the 3 given points, we get a homogeneous system of 3 equations in 4 unknowns. So, it has a non-trivial solution, namely $[a,b,c,d] = [-\frac{4}{3}d, -\frac{d}{3}, \frac{2}{3}d, d]$. Hence, taking d = 3, the equation of the required plane is $-4x - y + 2z + 3 = 0$.
3. Let $A = \begin{bmatrix} 2 & 3 & 4 \\ 0 & -1 & 0 \\ 0 & -3 & 4 \end{bmatrix}$.
(a) Find a non-zero $x \in \mathbb{R}^3$ such that Ax = 2x.
(b) Does there exist a non-zero vector $y \in \mathbb{R}^3$ such that Ay = 4y?
Solution of Part 3a: Solving Ax = 2x is equivalent to solving (A − 2I)x = 0, whose augmented matrix equals $\begin{bmatrix} 0 & 3 & 4 & 0 \\ 0 & -3 & 0 & 0 \\ 0 & -3 & 2 & 0 \end{bmatrix}$. Verify that $x^T = [1, 0, 0]$ is a non-zero solution.
Part 3b: As above, Ay = 4y is equivalent to solving (A − 4I)y = 0, whose augmented matrix equals $\begin{bmatrix} -2 & 3 & 4 & 0 \\ 0 & -5 & 0 & 0 \\ 0 & -3 & 0 & 0 \end{bmatrix}$. Now verify that $y^T = [2, 0, 1]$ is a non-zero solution.
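The circle-fitting computation of item 1 can be reproduced numerically by finding a non-trivial null-space vector of the 3 × 4 coefficient matrix. A sketch (assuming NumPy; reading the null space off the SVD is one standard way, not the notes' method):

```python
import numpy as np

# a(x^2 + y^2) + bx + cy + d = 0 through (-1,4), (0,1), (1,4):
# one homogeneous equation per point, i.e. 3 equations in 4 unknowns.
pts = [(-1, 4), (0, 1), (1, 4)]
M = np.array([[x * x + y * y, x, y, 1.0] for x, y in pts])

# A non-trivial solution spans the (1-dimensional) null space.
_, _, Vt = np.linalg.svd(M)
sol = Vt[-1]
print(np.allclose(M @ sol, 0))   # True
print(sol / sol[0] * 3)          # proportional to [3, 0, -16, 13]
```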

Exercise
1. In the first part of this chapter, 3 figures (see Figure ??) were given to illustrate the different cases in the Euclidean plane (2 equations in 2 unknowns). It is well known that, in the case of Euclidean space (3 equations in 3 unknowns), there
(a) is a way to place the 3 planes so that the system has a unique solution.
(b) are 4 distinct ways to place the 3 planes so that the system has no solution.
(c) are 3 distinct ways to place the 3 planes so that the system has an infinite number of solutions.
Determine the positions of the planes by drawing diagrams to explain the above cases. Do these diagrams and the RREF matrices that appear in Example have any relationship? Justify your answer.
2. Determine the equation of the curve $y = ax^2 + bx + c$ that passes through the points (−1,4), (0,1) and (1,4).
3. Solve the following linear systems.
(a) $x + y + z + w = 0$, $x - y + z + w = 0$ and $x + y + 3z + 3w = 0$.
(b) $x + 2y = 1$, $x + y + z = 4$ and $3y + 2z = 1$.
(c) $x + y + z = 3$, $x + y - z = 1$ and $x + y + 7z = 6$.
(d) $x + y + z = 3$, $x + y - z = 1$ and $x + y + 4z = 6$.
(e) $x + y + z = 3$, $x + y - z = 1$, $x + y + 4z = 6$ and $x + y - 4z = 1$.
4. For what values of c and k do the following systems have (i) no solution, (ii) a unique solution and (iii) an infinite number of solutions?
(a) $x + y + z = 3$, $x + 2y + cz = 4$, $2x + 3y + 2cz = k$.
(b) $x + y + z = 3$, $x + y + 2cz = 7$, $x + 2y + 3cz = k$.
(c) $x + y + 2z = 3$, $x + 2y + cz = 5$, $x + 2y + 4z = k$.
(d) $kx + y + z = 1$, $x + ky + z = 1$, $x + y + kz = 1$.
(e) $x + 2y - z = 1$, $2x + 3y + kz = 3$, $x + ky + 3z = 2$.
(f) $x - 2y = 1$, $x - y + kz = 1$, $ky + 4z = 6$.
5. For what values of a do the following systems have (i) no solution, (ii) a unique solution and (iii) an infinite number of solutions?
(a) $x + 2y + 3z = 4$, $2x + 5y + 5z = 6$, $2x + (a^2 - 6)z = a + 20$.
(b) $x + y + z = 3$, $2x + 5y + 4z = a$, $3x + (a^2 - 8)z = 12$.
6. Find the condition(s) on x, y, z so that each system of linear equations given below (in the unknowns a, b and c) is consistent.
(a) $a + 2b - 3c = x$, $2a + 6b - 11c = y$, $a - 2b + 7c = z$.
(b) $a + b + 5c = x$, $a + 3c = y$, $2a - b + 4c = z$.
(c) $a + 2b + 3c = x$, $2a + 4b + 6c = y$, $3a + 6b + 9c = z$.
7. Let A be an n×n matrix. If the system $A^2x = \mathbf{0}$ has a non-trivial solution, then show that $Ax = \mathbf{0}$ also has a non-trivial solution.

8. Prove that 5 distinct points are needed to specify a general conic in the Euclidean plane.
9. Let $u = (1,1,-2)^T$ and $v = (-1,2,3)^T$. Find conditions on x, y and z such that the system $cu + dv = (x,y,z)^T$ in the unknowns c and d is consistent.
10. Consider the linear system Ax = b in m equations and 3 unknowns. Then, for each of the given solution sets, determine the possible choices of m. Further, for each choice of m, determine a choice of A and b.
(a) $(1,1,1)^T$ is the only solution.
(b) $(0,0,0)^T$ is the only solution.
(c) $\{(1,1,1)^T + c(1,2,1)^T \mid c \in \mathbb{R}\}$ as the solution set.
(d) $\{c(1,2,1)^T \mid c \in \mathbb{R}\}$ as the solution set.
(e) $\{(1,1,1)^T + c(1,2,1)^T + d(2,2,-1)^T \mid c,d \in \mathbb{R}\}$ as the solution set.
(f) $\{c(1,2,1)^T + d(2,2,-1)^T \mid c,d \in \mathbb{R}\}$ as the solution set.

2.3 Square Matrices and System of Linear Equations

In this section the coefficient matrix of the linear system Ax = b will be a square matrix. We start by proving a few equivalent conditions that relate different ideas.

Theorem Let A be a square matrix of order n. Then the following statements are equivalent.
1. A is invertible.
2. The homogeneous system Ax = 0 has only the trivial solution.
3. Rank(A) = n.
4. The RREF of A is $I_n$.
5. A is a product of elementary matrices.

Proof. 1 ⟹ 2: As A is invertible, $A^{-1}$ exists and $A^{-1}A = I_n$. So, if $x_0$ is any solution of the homogeneous system Ax = 0, then
$$x_0 = I_nx_0 = (A^{-1}A)x_0 = A^{-1}(Ax_0) = A^{-1}\mathbf{0} = \mathbf{0}.$$
Hence, $\mathbf{0}$ is the only solution of the homogeneous system Ax = 0.
2 ⟹ 3: Suppose, if possible, that Rank(A) = r < n. Then, by Corollary, the homogeneous system Ax = 0 has infinitely many solutions, a contradiction. Thus, A has full rank.
3 ⟹ 4: Suppose Rank(A) = n and let B = RREF(A). As $B = [b_{ij}]$ is a square matrix of order n, each column of B contains a pivot. Since the pivots move to the right as we go down the rows, the i-th pivot must be at position $b_{ii}$, for $1 \le i \le n$. Thus, $B = I_n$ and hence the RREF of A is $I_n$.

4 ⟹ 5: Suppose RREF(A) = $I_n$. Then, using Proposition, there exist elementary matrices $E_1, \ldots, E_k$ such that $E_1 \cdots E_kA = I_n$. Or equivalently,
$$A = (E_1 \cdots E_k)^{-1} = E_k^{-1} \cdots E_1^{-1}, \qquad (\star\star)$$
which gives the desired result, as by Remark we know that the inverse of an elementary matrix is also an elementary matrix.
5 ⟹ 1: Suppose $A = E_1 \cdots E_k$ for some elementary matrices $E_1, \ldots, E_k$. As elementary matrices are invertible (see Remark) and the product of invertible matrices is also invertible, we get the required result.

As an immediate consequence of Theorem, we have the following important result, which implies that one needs to compute either a left or a right inverse to prove invertibility.

Corollary Let A be a square matrix of order n. Suppose there exists a matrix
1. C such that $CA = I_n$. Then $A^{-1}$ exists.
2. B such that $AB = I_n$. Then $A^{-1}$ exists.

Proof. Part 1: Let $CA = I_n$ for some matrix C and let $x_0$ be a solution of the homogeneous system Ax = 0. Then $Ax_0 = \mathbf{0}$ and
$$x_0 = I_nx_0 = (CA)x_0 = C(Ax_0) = C\mathbf{0} = \mathbf{0}.$$
Thus, the homogeneous system Ax = 0 has only the trivial solution. Hence, using Theorem, the matrix A is invertible.
Part 2: Using the first part, B is invertible. Hence, $B^{-1} = A$, or equivalently $A^{-1} = B$, and thus A is invertible as well.

Another important consequence of Theorem is stated next; it uses Equation $(\star\star)$ to get the required result. This result is used to compute the inverse of a matrix using the Gauss-Jordan Elimination.

Corollary Let A be an invertible matrix of order n. Suppose there exist elementary matrices $E_1, \ldots, E_k$ such that $E_1 \cdots E_kA = I_n$. Then $A^{-1} = E_1 \cdots E_k$.

Remark Let A be an n×n matrix. Apply GJE to $[A\ I_n]$ and let RREF($[A\ I_n]$) = $[B\ C]$. If $B = I_n$, then $A^{-1} = C$; otherwise A is not invertible.
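A sketch of the Remark (assuming Python with sympy; the matrix A below is an arbitrary invertible example of ours, not the one from the Example that follows):

```python
# Row-reduce [A | I_n] and read off the inverse, as in the Remark above.
import sympy as sp

A = sp.Matrix([[2, 1, 1], [1, 2, 1], [1, 1, 2]])
n = A.shape[0]

aug = A.row_join(sp.eye(n))       # the augmented matrix [A | I_n]
R, _ = aug.rref()                 # RREF via Gauss-Jordan elimination
B, C = R[:, :n], R[:, n:]

if B == sp.eye(n):
    print("A^{-1} =", C)          # here C equals A.inv()
    assert C == A.inv()
else:
    print("A is not invertible")
```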

Example Use GJE to find the inverse of a given 3×3 matrix A.
Solution: Apply GJE to $[A\ I_3]$: a sequence of elementary row operations (for instance $E_{12}(-1)$, then $E_{13}(-1)$ and $E_{23}(-2)$, and so on) reduces $[A\ I_3]$ to $[I_3\ C]$, and then $A^{-1} = C$.

Exercise
1. Find the inverses of the following matrices using GJE: (i) ⋯ (ii) ⋯ (iii) ⋯ (iv) ⋯.
2. Let A and B be two matrices having positive entries and of order 1×2 and 2×1, respectively. Which of BA or AB is invertible? Give reasons.
3. Let A be an n×m matrix and B be an m×n matrix. Prove that
(a) $I - BA$ is invertible if and only if $I - AB$ is invertible [use Theorem].
(b) if $I - AB$ is invertible, then $(I - BA)^{-1} = I + B(I - AB)^{-1}A$.
(c) if $I - AB$ is invertible, then $(I - BA)^{-1}B = B(I - AB)^{-1}$.
(d) if A, B and A + B are invertible, then $(A^{-1} + B^{-1})^{-1} = A(A + B)^{-1}B$.
4. Let A be a square matrix. Then prove that
(a) A is invertible if and only if $A^TA$ is invertible.
(b) A is invertible if and only if $AA^T$ is invertible.
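Identities such as those in Exercise 3 can be sanity-checked numerically before one attempts a proof. A sketch (assuming Python with numpy; the matrices are random Gaussian, so $I - AB$ is invertible with probability one):

```python
# A numeric sanity check (not a proof) of Exercise 3(b) and 3(c).
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 3
A = rng.standard_normal((n, m))       # A is n x m
B = rng.standard_normal((m, n))       # B is m x n
I_n, I_m = np.eye(n), np.eye(m)

lhs = np.linalg.inv(I_m - B @ A)
rhs = I_m + B @ np.linalg.inv(I_n - A @ B) @ A
assert np.allclose(lhs, rhs)                                   # (I-BA)^{-1} = I + B(I-AB)^{-1}A
assert np.allclose(lhs @ B, B @ np.linalg.inv(I_n - A @ B))    # (I-BA)^{-1}B = B(I-AB)^{-1}
```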

We end this section by giving two more equivalent conditions for a matrix to be invertible.

Theorem The following statements are equivalent for an n×n matrix A.
1. A is invertible.
2. The system Ax = b has a unique solution for every b.
3. The system Ax = b is consistent for every b.

Proof. 1 ⟹ 2: Note that $x_0 = A^{-1}b$ is the unique solution of Ax = b.
2 ⟹ 3: The system is consistent, as Ax = b has a solution.
3 ⟹ 1: For $1 \le i \le n$, define $e_i^T = I_n[i,:]$. By assumption, the linear system $Ax = e_i$ has a solution, say $x_i$, for $1 \le i \le n$. Define the matrix $B = [x_1, \ldots, x_n]$. Then
$$AB = A[x_1, x_2, \ldots, x_n] = [Ax_1, Ax_2, \ldots, Ax_n] = [e_1, e_2, \ldots, e_n] = I_n.$$
Therefore, by Corollary, the matrix A is invertible.

We now give an immediate application of Theorem and Theorem without proof.

Theorem The following two statements cannot hold together for an n×n matrix A.
1. The system Ax = b has a unique solution for every b.
2. The system Ax = 0 has a non-trivial solution.

Exercise
1. Let A and B be square matrices of order n with B = PA for an invertible matrix P. Then prove that A is invertible if and only if B is invertible.
2. Let A and B be two m×n matrices. Then prove that A and B are equivalent if and only if B = PA, where P is a product of elementary matrices. When is this P unique?
3. Let $b^T = [1, 2, -1, -2]$. Suppose A is a 4×4 matrix such that the linear system Ax = b has no solution. Mark each of the statements given below as true or false.
(a) The homogeneous system Ax = 0 has only the trivial solution.
(b) The matrix A is invertible.
(c) Let $c^T = [-1, -2, 1, 2]$. Then the system Ax = c has no solution.
(d) Let B = RREF(A). Then
i. B[4,:] = [0,0,0,0].
ii. B[4,:] = [0,0,0,1].
iii. B[3,:] = [0,0,0,0].
iv. B[3,:] = [0,0,0,1].
v. B[3,:] = [0,0,1,α], where α is any real number.

2.3.A Determinant

In this section, we associate a number with each square matrix. To start with, let A be an n×n matrix. Then, for $1 \le i \le k$, $1 \le j \le l$ and $1 \le \alpha_i, \beta_j \le n$, by $A(\alpha_1, \ldots, \alpha_k \mid \beta_1, \ldots, \beta_l)$ we mean the submatrix of A obtained by deleting the rows corresponding to the $\alpha_i$'s and the columns corresponding to the $\beta_j$'s of A.

Example For a square matrix $A = [a_{ij}]$ of order 3, $A(1 \mid 2)$ denotes the 2×2 submatrix of A obtained by deleting the first row and the second column, and $A(1,2 \mid 1,3) = [a_{32}]$, a 1×1 matrix.

With the notation as above, we are ready to give an inductive definition of the determinant of a square matrix. Advanced students can find the actual definition of the determinant in the Appendix, where it is proved that the definition given below corresponds to the expansion of the determinant along the first row.

Definition [Determinant] Let A be a square matrix of order n. Then the determinant of A, denoted det(A) (or |A|), is defined by
$$\det(A) = \begin{cases} a, & \text{if } A = [a]\ (n = 1),\\[1ex] \sum\limits_{j=1}^{n} (-1)^{1+j}\, a_{1j}\, \det\big(A(1 \mid j)\big), & \text{otherwise.}\end{cases}$$

Example
1. Let A = [−2]. Then det(A) = |A| = −2.
2. Let $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. Then $\det(A) = |A| = a\det(A(1|1)) - b\det(A(1|2)) = ad - bc$.
3. Let $A = [a_{ij}]$ be a 3×3 matrix. Then
$$\det(A) = |A| = a_{11}\det(A(1|1)) - a_{12}\det(A(1|2)) + a_{13}\det(A(1|3))$$
$$= a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33}\end{vmatrix} - a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33}\end{vmatrix} + a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32}\end{vmatrix}$$
$$= a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{31}a_{23}) + a_{13}(a_{21}a_{32} - a_{31}a_{22}).$$
For $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ 1 & 2 & 2 \end{bmatrix}$, $\ |A| = 1\begin{vmatrix} 3 & 1 \\ 2 & 2 \end{vmatrix} - 2\begin{vmatrix} 2 & 1 \\ 1 & 2\end{vmatrix} + 3\begin{vmatrix} 2 & 3 \\ 1 & 2\end{vmatrix} = 4 - 2(3) + 3(1) = 1.$

Exercise Find the determinants of the following matrices: (i) ⋯ (ii) ⋯ (iii) $\begin{bmatrix} 1 & a & a^2 \\ 1 & b & b^2 \\ 1 & c & c^2 \end{bmatrix}$.

Definition [Singular, Non-Singular] A matrix A is said to be singular if det(A) = 0 and is called non-singular if det(A) ≠ 0.
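A sketch (ours, in Python) implementing the inductive definition literally, that is, expansion along the first row:

```python
def det(A):
    """Determinant via expansion along the first row (the inductive definition)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j+1:] for row in A[1:]]  # the submatrix A(1|j)
        total += (-1) ** j * A[0][j] * det(minor)       # (-1)**j equals (-1)^{1+j} in 0-based indexing
    return total

A = [[1, 2, 3], [2, 3, 1], [1, 2, 2]]
print(det(A))   # 1, as computed in the example above
```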

The next result relates the determinant with row operations. For the proof, see Appendix 7.2.

Theorem Let A be an n×n matrix. If
1. $B = E_{ij}A$, for $1 \le i \ne j \le n$, then det(B) = −det(A).
2. $B = E_i(c)A$, for $c \ne 0$, $1 \le i \le n$, then det(B) = c·det(A).
3. $B = E_{ij}(c)A$, for $c \ne 0$ and $1 \le i \ne j \le n$, then det(B) = det(A).
4. $A[i,:]^T = \mathbf{0}$, for some i, $1 \le i \le n$, then det(A) = 0.
5. A[i,:] = A[j,:], for some $1 \le i \ne j \le n$, then det(A) = 0.
6. A is a triangular matrix with $d_1, \ldots, d_n$ on the diagonal, then $\det(A) = d_1 \cdots d_n$.

Example
1. Let A be a 3×3 matrix which the operations $E_1(\frac12)$, $E_{21}(-1)$ and $E_{31}(-1)$ reduce to an upper triangular matrix with diagonal entries 1, 2 and −1. Then, using Theorem, $\det(A) = 2\,(1 \cdot 2 \cdot (-1)) = -4$, where the first factor 2 appears because of the elementary matrix $E_1(\frac12)$.
2. Similarly, for a 4×4 matrix A, first apply $E_1(\frac12)$, then successively $E_{21}(-1)$, $E_{31}(-1)$ and $E_{41}(-3)$, and then $E_{32}$ and $E_{43}(-4)$, to reach a triangular matrix whose diagonal entries have product 8. Thus, by Theorem, $\det(A) = 2 \cdot (-1) \cdot 8 = -16$, where the 2 gets contributed due to $E_1(\frac12)$ and the $-1$ due to the row interchange $E_{32}$.

Exercise Use Theorem to arrive at the answers.
1. Let $A = \begin{bmatrix} a & b & c \\ e & f & g \\ h & j & l \end{bmatrix}$, $B = \begin{bmatrix} a & b & c \\ e & f & g \\ \alpha h & \alpha j & \alpha l \end{bmatrix}$ and $C^T = \begin{bmatrix} a & e & \alpha a + \beta e + h \\ b & f & \alpha b + \beta f + j \\ c & g & \alpha c + \beta g + l \end{bmatrix}$ for some complex numbers α and β. Prove that det(B) = α det(A) and det(C) = det(A).
2. Prove that 3 divides ⋯ and that ⋯ = 0.

Observe that, by Theorem, $\det(I_n) = 1$.
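A quick check of Parts 1-3 of the Theorem (assuming Python with sympy; A is the running 3×3 example):

```python
# The effect of each row operation on the determinant, checked on one matrix.
import sympy as sp

A = sp.Matrix([[1, 2, 3], [2, 3, 1], [1, 2, 2]])
d = A.det()                                      # = 1

B1 = A.copy()
B1[0, :] = A[1, :]; B1[1, :] = A[0, :]           # interchange rows 1 and 2 (E_12)
B2 = A.copy()
B2[0, :] = 5 * A[0, :]                           # multiply row 1 by 5 (E_1(5))
B3 = A.copy()
B3[1, :] = A[1, :] - 2 * A[0, :]                 # add -2 * row 1 to row 2 (E_21(-2))

assert B1.det() == -d       # interchange: the determinant changes sign
assert B2.det() == 5 * d    # scaling a row by c: the determinant scales by c
assert B3.det() == d        # adding a multiple of another row: unchanged
```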

The next result, about the determinants of the elementary matrices, is an immediate consequence of Theorem, and hence the proof is omitted.

Corollary Fix a positive integer n. Then
1. $\det(E_{ij}) = -1$.
2. For $c \ne 0$, $\det(E_k(c)) = c$.
3. For $c \ne 0$, $\det(E_{ij}(c)) = 1$.

Remark Theorem implies that the determinant can be calculated by expanding along any row. Hence, the readers are advised to verify that
$$\det(A) = \sum_{j=1}^{n} (-1)^{k+j}\, a_{kj}\, \det(A(k \mid j)), \quad \text{for any } 1 \le k \le n.$$

Example Using Remark, one can compute a determinant by expanding along any row; the reader should verify, on a 3×3 example, that expanding along the first, the second or the third row gives the same value.

Remark
1. Let $u^T = [u_1, u_2]$, $v^T = [v_1, v_2] \in \mathbb{R}^2$. Consider the parallelogram on the vertices $P = [0,0]^T$, $Q = u$, $R = u + v$ and $S = v$ (see Figure 3). Then $\text{Area}(PQRS) = |u_1v_2 - u_2v_1|$, the absolute value of $\begin{vmatrix} u_1 & u_2 \\ v_1 & v_2 \end{vmatrix}$.

[Figure 3: Parallelepiped with vertices P, Q, R and S as base.]

Recall that the dot product of u and v is $u \bullet v = u_1v_1 + u_2v_2$, the length of u is $\ell(u) = \sqrt{u_1^2 + u_2^2}$, and $\cos\theta = \frac{u \bullet v}{\ell(u)\ell(v)}$, where θ is the angle between u and v. Therefore,
$$\text{Area}(PQRS) = \ell(u)\ell(v)\sin\theta = \ell(u)\ell(v)\sqrt{1 - \Big(\frac{u \bullet v}{\ell(u)\ell(v)}\Big)^2} = \sqrt{\ell(u)^2\ell(v)^2 - (u \bullet v)^2} = \sqrt{(u_1v_2 - u_2v_1)^2} = |u_1v_2 - u_2v_1|.$$
That is, in $\mathbb{R}^2$, the determinant is ± times the area of the parallelogram.

2. Consider Figure 3 again. Let $u = [u_1,u_2,u_3]^T$, $v = [v_1,v_2,v_3]^T$, $w = [w_1,w_2,w_3]^T \in \mathbb{R}^3$. Then $u \bullet v = u_1v_1 + u_2v_2 + u_3v_3$, and the cross product of u and v is
$$u \times v = (u_2v_3 - u_3v_2,\ u_3v_1 - u_1v_3,\ u_1v_2 - u_2v_1).$$
Also, the vector $u \times v$ is perpendicular to the plane containing both u and v. So, if $u_3 = v_3 = 0$, then one can think of u and v as vectors in the XY-plane, and in this case $\ell(u \times v) = |u_1v_2 - u_2v_1| = \text{Area}(PQRS)$. Hence, if γ is the angle between the vector w and the vector $u \times v$, then
$$\text{volume}(P) = \text{Area}(PQRS) \cdot \text{height} = |w \bullet (u \times v)| = \pm\begin{vmatrix} w_1 & w_2 & w_3 \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix}.$$
3. In general, for an n×n matrix A, det(A) satisfies all the properties associated with the volume of the n-dimensional parallelepiped. The actual proof is beyond the scope of this book, but one can observe the following:
(a) The volume of the n-dimensional unit cube is $1 = \det(I_n)$.
(b) If one vector is multiplied by $c \ne 0$, then the volume either shrinks or expands by the factor |c|; the determinant correspondingly gets multiplied by c.
(c) Recall that if the height and base of a parallelogram are not changed, then the area remains the same. This corresponds to replacing one of the side vectors by itself plus c times the base vector, for different choices of $c \ne 0$.

2.3.B Adjugate (classically Adjoint) of a Matrix

Definition [Minor, Cofactor] Let A be an n×n matrix. Then
1. the $(i,j)$-th minor of A is $A_{ij} = \det(A(i \mid j))$, for $1 \le i,j \le n$;
2. the $(i,j)$-th cofactor of A is $C_{ij} = (-1)^{i+j}A_{ij}$;
3. the Adjugate (classically Adjoint) of A is $\operatorname{Adj}(A) = [b_{ij}]$, with $b_{ij} = C_{ji}$ for $1 \le i,j \le n$.

For example, for $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ 1 & 2 & 2 \end{bmatrix}$, $A_{11} = \det(A(1|1)) = \begin{vmatrix} 3 & 1 \\ 2 & 2\end{vmatrix} = 4$, $A_{12} = \begin{vmatrix} 2 & 1 \\ 1 & 2\end{vmatrix} = 3$, $A_{13} = \begin{vmatrix} 2 & 3 \\ 1 & 2\end{vmatrix} = 1, \ \ldots,\ A_{32} = \begin{vmatrix} 1 & 3 \\ 2 & 1\end{vmatrix} = -5$ and $A_{33} = \begin{vmatrix} 1 & 2 \\ 2 & 3\end{vmatrix} = -1$. So,
$$\operatorname{Adj}(A) = \begin{bmatrix} (-1)^{1+1}A_{11} & (-1)^{2+1}A_{21} & (-1)^{3+1}A_{31} \\ (-1)^{1+2}A_{12} & (-1)^{2+2}A_{22} & (-1)^{3+2}A_{32} \\ (-1)^{1+3}A_{13} & (-1)^{2+3}A_{23} & (-1)^{3+3}A_{33} \end{bmatrix} = \begin{bmatrix} 4 & 2 & -7 \\ -3 & -1 & 5 \\ 1 & 0 & -1 \end{bmatrix}.$$
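A sketch (assuming Python with sympy) that builds Adj(A) from cofactors for the matrix above and verifies the identity $A\operatorname{Adj}(A) = \det(A)I_n$ proved in the theorem that follows:

```python
# Adjugate from cofactors; checks A * Adj(A) = det(A) * I and the inverse formula.
import sympy as sp

A = sp.Matrix([[1, 2, 3], [2, 3, 1], [1, 2, 2]])
n = A.shape[0]

def cofactor(M, i, j):
    minor = M.copy()
    minor.row_del(i); minor.col_del(j)          # this is M(i|j)
    return (-1) ** (i + j) * minor.det()

adj = sp.Matrix(n, n, lambda i, j: cofactor(A, j, i))   # note the transpose: b_ij = C_ji

assert adj == sp.Matrix([[4, 2, -7], [-3, -1, 5], [1, 0, -1]])
assert A * adj == A.det() * sp.eye(n)
assert A.inv() == adj / A.det()                 # here det(A) = 1
```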

We now prove a very important result that relates the adjugate matrix with the inverse.

Theorem Let A be an n×n matrix. Then
1. for $1 \le i \le n$, $\quad \sum\limits_{j=1}^{n} a_{ij}C_{ij} = \sum\limits_{j=1}^{n} a_{ij}(-1)^{i+j}A_{ij} = \det(A)$;
2. for $i \ne l$, $\quad \sum\limits_{j=1}^{n} a_{ij}C_{lj} = \sum\limits_{j=1}^{n} a_{ij}(-1)^{l+j}A_{lj} = 0$; and
3. $A(\operatorname{Adj}(A)) = \det(A)I_n$. Thus, whenever $\det(A) \ne 0$, one has
$$A^{-1} = \frac{1}{\det(A)}\operatorname{Adj}(A). \qquad (\dagger)$$

Proof. Part 1: It follows directly from Remark and the definition of the cofactor.
Part 2: Fix positive integers i, l with $1 \le i \ne l \le n$, and let $B = [b_{ij}]$ be the square matrix with B[l,:] = A[i,:] and B[t,:] = A[t,:] for $t \ne l$. As $l \ne i$, B[l,:] = B[i,:] and thus, by Theorem, det(B) = 0. As $A(l \mid j) = B(l \mid j)$ for $1 \le j \le n$, using Remark,
$$0 = \det(B) = \sum_{j=1}^{n}(-1)^{l+j}b_{lj}\det\big(B(l \mid j)\big) = \sum_{j=1}^{n}(-1)^{l+j}a_{ij}\det\big(A(l \mid j)\big) = \sum_{j=1}^{n}a_{ij}C_{lj}.$$
This completes the proof of Part 2.
Part 3: Using Parts 1 and 2 and the definition of the adjugate, observe that
$$\big[A\big(\operatorname{Adj}(A)\big)\big]_{ij} = \sum_{k=1}^{n}a_{ik}\big(\operatorname{Adj}(A)\big)_{kj} = \sum_{k=1}^{n}a_{ik}C_{jk} = \begin{cases} 0, & \text{if } i \ne j,\\ \det(A), & \text{if } i = j.\end{cases}$$
Thus, $A(\operatorname{Adj}(A)) = \det(A)I_n$. Therefore, if $\det(A) \ne 0$, then $A\Big(\frac{1}{\det(A)}\operatorname{Adj}(A)\Big) = I_n$. Hence, by Corollary, $A^{-1} = \frac{1}{\det(A)}\operatorname{Adj}(A)$.

Example For an invertible 3×3 matrix A with det(A) = 2, Theorem gives $A^{-1} = \frac{1}{2}\operatorname{Adj}(A)$.

Let A be a non-singular matrix. Then, by Theorem, $A^{-1} = \frac{1}{\det(A)}\operatorname{Adj}(A)$. Thus $A\big(\operatorname{Adj}(A)\big) = \big(\operatorname{Adj}(A)\big)A = \det(A)I_n$, and this completes the proof of the next result.

Corollary Let A be a non-singular matrix. Then
$$\sum_{i=1}^{n} a_{ij}C_{ik} = \begin{cases} \det(A), & \text{if } j = k,\\ 0, & \text{if } j \ne k.\end{cases}$$

The next result gives another equivalent condition for a square matrix to be invertible.

Theorem A square matrix A is non-singular if and only if A is invertible.

Proof. Let A be non-singular. Then $\det(A) \ne 0$, and hence $A^{-1} = \frac{1}{\det(A)}\operatorname{Adj}(A)$ exists.
Now assume that A is invertible. Then, using Theorem, $A = E_1 \cdots E_k$, a product of elementary matrices. Also, by Corollary, $\det(E_i) \ne 0$ for $1 \le i \le k$. Thus, a repeated application of Parts 1, 2 and 3 of Theorem gives $\det(A) \ne 0$.

The next result relates the determinant of a product of two matrices with their determinants.

Theorem Let A and B be square matrices of order n. Then
$$\det(AB) = \det(A)\det(B) = \det(BA).$$

Proof. Step 1. Let A be non-singular. Then, by Theorem, A is invertible, and by Theorem, $A = E_1 \cdots E_k$, a product of elementary matrices. Then a repeated application of Parts 1, 2 and 3 of Theorem gives the desired result, as
$$\det(AB) = \det(E_1 \cdots E_kB) = \det(E_1)\det(E_2 \cdots E_kB) = \det(E_1)\det(E_2)\det(E_3 \cdots E_kB)$$
$$= \cdots = \det(E_1)\cdots\det(E_k)\det(B) = \cdots = \det(E_1E_2\cdots E_k)\det(B) = \det(A)\det(B).$$
Step 2. Let A be singular. Then, by Theorem, A is not invertible. So, by Theorem and Exercise, there exists an invertible matrix P such that $PA = C$, where $C = \begin{bmatrix} C_1 \\ 0 \end{bmatrix}$. So, $A = P^{-1}\begin{bmatrix} C_1 \\ 0\end{bmatrix}$. As P is invertible, using Step 1, we have
$$\det(AB) = \det\Big(\Big(P^{-1}\begin{bmatrix} C_1 \\ 0 \end{bmatrix}\Big)B\Big) = \det\Big(P^{-1}\begin{bmatrix} C_1B \\ 0\end{bmatrix}\Big) = \det(P^{-1})\,\det\Big(\begin{bmatrix} C_1B \\ 0 \end{bmatrix}\Big) = \det(P^{-1}) \cdot 0 = 0 = 0 \cdot \det(B) = \det(A)\det(B).$$
Thus, the proof of the theorem is complete.

The next result relates the determinant of a matrix with the determinant of its transpose. As a consequence, the determinant can be computed by expanding along any column as well.

Theorem Let A be a square matrix. Then $\det(A) = \det(A^T)$.

Proof. If A is non-singular, Corollary gives $\det(A) = \det(A^T)$. If A is singular, then, by Theorem, A is not invertible. So, $A^T$ is also not invertible, and hence, by Theorem, $\det(A^T) = 0 = \det(A)$.
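A numeric spot check (not a proof; assuming Python with numpy) of the two theorems just proved, together with the det(A) = ±1 property of orthogonal matrices that opens the next example:

```python
# det(AB) = det(A)det(B) = det(BA), det(A) = det(A^T), det(orthogonal) = +-1.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

assert np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B))
assert np.isclose(np.linalg.det(A @ B), np.linalg.det(B @ A))
assert np.isclose(np.linalg.det(A), np.linalg.det(A.T))

Q, _ = np.linalg.qr(A)                 # Q is orthogonal: Q Q^T = I
assert np.isclose(abs(np.linalg.det(Q)), 1.0)
```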

Example Let A be an orthogonal matrix. Then, by definition, $AA^T = I$. Thus, by the theorems above,
1. $1 = \det(I) = \det(AA^T) = \det(A)\det(A^T) = \det(A)\det(A) = (\det(A))^2$. Hence $\det(A) = \pm 1$.
2. In particular, if $A = \begin{bmatrix} a & b \\ c & d\end{bmatrix}$, then $I = AA^T = \begin{bmatrix} a^2 + b^2 & ac + bd \\ ac + bd & c^2 + d^2 \end{bmatrix}$. Thus, $a^2 + b^2 = 1$, and hence there exists $\theta \in [0, 2\pi]$ such that $a = \cos\theta$ and $b = \sin\theta$. As $ac + bd = 0$, we get $c = -r\sin\theta$ and $d = r\cos\theta$, for some $r \in \mathbb{R}$. But $c^2 + d^2 = 1$ implies that either $c = -\sin\theta$ and $d = \cos\theta$, or $c = \sin\theta$ and $d = -\cos\theta$.
3. Thus, $A = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}$ or $A = \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix}$.
4. For $A = \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix}$, $\det(A) = -1$, and A represents a reflection across the line y = mx. Determine m (see Exercise 3.3b).
5. For $A = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}$, $\det(A) = 1$, and A represents a rotation through the angle α. Determine α (see Exercise 3.3a).

Exercise
1. Let A be an n×n upper triangular matrix with non-zero entries on the diagonal. Then prove that $A^{-1}$ is also an upper triangular matrix.
2. (a) Prove that $\begin{vmatrix} a & b & c \\ e & f & g \\ h & j & l \end{vmatrix} = \begin{vmatrix} a & e & \alpha a + \beta e + h \\ b & f & \alpha b + \beta f + j \\ c & g & \alpha c + \beta g + l \end{vmatrix}$ for any $\alpha, \beta \in \mathbb{C}$.
(b) Prove that 17 divides ⋯.
3. Let A be an n×n matrix. Then det(A) = 0 if either
(a) $A[i,:]^T = \mathbf{0}$ or $A[:,i] = \mathbf{0}$, for some i, $1 \le i \le n$, or
(b) A[i,:] = cA[j,:] or A[:,i] = cA[:,j], for some $c \in \mathbb{C}$ and some $i \ne j$.

2.3.C Cramer's Rule

Let A be a square matrix. Then, combining Theorem and Theorem, one has the following result.

Corollary Let A be a square matrix. Then the following statements are equivalent:
1. A is invertible.
2. The linear system Ax = b has a unique solution for every b.
3. det(A) ≠ 0.
Thus, Ax = b has a unique solution for every b if and only if det(A) ≠ 0.

The next theorem gives a direct method for finding the solution of the linear system Ax = b when det(A) ≠ 0.

Theorem (Cramer's Rule). Let A be an n×n non-singular matrix. Then the unique solution of the linear system Ax = b with $x^T = [x_1, \ldots, x_n]$ is given by
$$x_j = \frac{\det(A_j)}{\det(A)}, \quad \text{for } j = 1, 2, \ldots, n,$$
where $A_j$ is the matrix obtained from A by replacing A[:, j] by b.

Proof. Since det(A) ≠ 0, A is invertible. Thus, there exists an invertible matrix P such that $PA = I_n$ and $P[A\ b] = [I\ Pb]$. Let $d = Pb$. Then Ax = b has the unique solution $x_j = d_j$, for $1 \le j \le n$. Also, $[e_1, \ldots, e_n] = I = PA = [PA[:,1], \ldots, PA[:,n]]$. Thus,
$$PA_j = P[A[:,1], \ldots, A[:,j-1],\ b,\ A[:,j+1], \ldots, A[:,n]]$$
$$= [PA[:,1], \ldots, PA[:,j-1],\ Pb,\ PA[:,j+1], \ldots, PA[:,n]] = [e_1, \ldots, e_{j-1},\ d,\ e_{j+1}, \ldots, e_n].$$
Thus, $\det(PA_j) = d_j$, for $1 \le j \le n$. Also,
$$d_j = \frac{\det(PA_j)}{\det(PA)} = \frac{\det(P)\det(A_j)}{\det(P)\det(A)} = \frac{\det(A_j)}{\det(A)}.$$
Hence, $x_j = \frac{\det(A_j)}{\det(A)}$ and the required result follows.

Example Solve Ax = b using Cramer's rule, where $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ 1 & 2 & 2 \end{bmatrix}$ and $b = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$.
Solution: Check that det(A) = 1 and $x^T = [-1, 1, 0]$, as
$$x_1 = \begin{vmatrix} 1 & 2 & 3 \\ 1 & 3 & 1 \\ 1 & 2 & 2 \end{vmatrix} = -1, \quad x_2 = \begin{vmatrix} 1 & 1 & 3 \\ 2 & 1 & 1 \\ 1 & 1 & 2 \end{vmatrix} = 1, \quad x_3 = \begin{vmatrix} 1 & 2 & 1 \\ 2 & 3 & 1 \\ 1 & 2 & 1 \end{vmatrix} = 0.$$
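A sketch of Cramer's rule (assuming Python with sympy, and the data A, b of the example above):

```python
# Cramer's rule: x_j = det(A_j) / det(A), with A_j = A with column j replaced by b.
import sympy as sp

def cramer(A, b):
    d = A.det()
    if d == 0:
        raise ValueError("Cramer's rule needs det(A) != 0")
    x = sp.zeros(A.shape[0], 1)
    for j in range(A.shape[1]):
        Aj = A.copy()
        Aj[:, j] = b               # replace A[:, j] by b
        x[j] = Aj.det() / d
    return x

A = sp.Matrix([[1, 2, 3], [2, 3, 1], [1, 2, 2]])
b = sp.Matrix([1, 1, 1])
print(cramer(A, b).T)              # [-1, 1, 0]
assert A * cramer(A, b) == b
```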

2.4 Miscellaneous Exercises

Exercise
1. Suppose $A^{-1} = B$ with $A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$ and $B = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}$. Also, assume that $A_{11}$ is invertible and define $P = A_{22} - A_{21}A_{11}^{-1}A_{12}$. Then prove that
(a) $\begin{bmatrix} I & 0 \\ -A_{21}A_{11}^{-1} & I \end{bmatrix}\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} = \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} - A_{21}A_{11}^{-1}A_{12} \end{bmatrix}$,
(b) P is invertible and $B = \begin{bmatrix} A_{11}^{-1} + (A_{11}^{-1}A_{12})P^{-1}(A_{21}A_{11}^{-1}) & -(A_{11}^{-1}A_{12})P^{-1} \\ -P^{-1}(A_{21}A_{11}^{-1}) & P^{-1} \end{bmatrix}$.
2. Determine necessary and sufficient condition(s) for a triangular matrix to be invertible.
3. Let A be a unitary matrix. What can you say about det(A)?
4. Let A be a 2×2 matrix with Tr(A) = 0 and det(A) = 0. Then A is a nilpotent matrix.
5. Let A and B be two non-singular matrices. Are the matrices A + B and A − B non-singular? Justify your answer.
6. Let A be an n×n matrix. Prove that the following statements are equivalent:
(a) A is not invertible.
(b) Rank(A) < n.
(c) det(A) = 0.
(d) A is not row-equivalent to $I_n$.
(e) The homogeneous system Ax = 0 has a non-trivial solution.
(f) The system Ax = b is either inconsistent or has an infinite number of solutions.
(g) A is not a product of elementary matrices.
7. For what value(s) of λ do the following systems have non-trivial solutions? Also, for each value of λ, determine a non-trivial solution.
(a) $(\lambda - 2)x + y = 0$, $x + (\lambda + 2)y = 0$.
(b) $\lambda x + 3y = 0$, $(\lambda + 6)y = 0$.
8. Let $a_1, \ldots, a_n \in \mathbb{C}$ and define $A = [a_{ij}]_{n\times n}$ with $a_{ij} = a_i^{j-1}$. Prove that $\det(A) = \prod\limits_{1 \le i < j \le n}(a_j - a_i)$. This matrix is usually called the Vandermonde matrix.
9. Let $A = [a_{ij}]_{n\times n}$ with $a_{ij} = \max\{i, j\}$. Prove that $\det A = (-1)^{n-1}n$.
10. Solve the following systems of equations by Cramer's rule.
i) $x + y + z - w = 1$, $x + y - z + w = 2$, $2x + y + z - w = 7$, $x + y + z + w = 3$.
ii) $x - y + z - w = 1$, $x + y - z + w = 2$, $2x + y - z - w = 7$, $x - y - z + w = 3$.
11. Let $p \in \mathbb{C}$, $p \ne 0$. Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be two n×n matrices with $b_{ij} = p^{i-j}a_{ij}$, for $1 \le i,j \le n$. Then compute det(B) in terms of det(A).
12. The position of an element $a_{ij}$ of a determinant is called even or odd according as $i + j$ is even or odd. Prove that if all the entries in
(a) odd positions are multiplied by −1, then the value of the determinant does not change.
(b) even positions are multiplied by −1, then the value of the determinant
i. does not change if the matrix is of even order.
ii. is multiplied by −1 if the matrix is of odd order.
13. Let A be a Hermitian matrix. Prove that det A is a real number.
14. Let A be an n×n matrix. Then A is invertible if and only if Adj(A) is invertible.
15. Let A and B be invertible matrices. Prove that Adj(AB) = Adj(B)Adj(A).
16. Let A be an invertible n×n matrix and let $P = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$. Then show that Rank(P) = n if and only if $D = CA^{-1}B$.

2.5 Summary

In this chapter, we started with a system of m linear equations in n unknowns, wrote it formally as Ax = b, and passed to the augmented matrix [A b]. The basic operations on equations then corresponded to multiplication by elementary matrices on the left of [A b], giving us the RREF, which in turn gave us the rank of a matrix. If Rank(A) = r and Rank([A b]) = $r_a$, then:
1. if r < $r_a$, the linear system Ax = b is inconsistent;
2. if r = $r_a$, the linear system Ax = b is consistent. Furthermore,
(a) if r = n, the system Ax = b has a unique solution;
(b) if r < n, the system Ax = b has an infinite number of solutions.

We have also seen that the following conditions are equivalent for an n×n matrix A.
1. A is invertible.
2. The homogeneous system Ax = 0 has only the trivial solution.
3. The row-reduced echelon form of A is I.
4. A is a product of elementary matrices.
5. The system Ax = b has a unique solution for every b.
6. The system Ax = b has a solution for every b.
7. Rank(A) = n.
8. det(A) ≠ 0.

So, overall, we have learnt to solve the following types of problems:
1. Solving the linear system Ax = b. This leads to the question: is the vector b a linear combination of the columns of A?
2. Solving the linear system Ax = 0. This leads to the question: are the columns of A linearly independent/dependent? In particular, if Ax = 0 has
(a) a unique solution, then the columns of A are linearly independent;
(b) a non-trivial solution, then the columns of A are linearly dependent.

Chapter 3 Vector Spaces

In this chapter, we will mainly be concerned with finite dimensional vector spaces over R or C. The last section consists of results about infinite dimensional vector spaces that are similar to, but subtly different from, the finite dimensional case. We give many examples of vector spaces that are infinite dimensional or that are vector spaces over fields different from R and C. See the appendix for some ideas about fields other than R and C.

3.1 Vector Spaces: Definition and Examples

In this chapter, F denotes either R, the set of real numbers, or C, the set of complex numbers. Let A be an m×n complex matrix and let V denote the solution set of the homogeneous system Ax = 0. Then, by Theorem, V satisfies:
1. $\mathbf{0} \in V$, as $A\mathbf{0} = \mathbf{0}$.
2. if $x \in V$, then $\alpha x \in V$ for all $\alpha \in \mathbb{C}$. In particular, for $\alpha = -1$, $-x \in V$.
3. if $x, y \in V$, then, for any $\alpha, \beta \in \mathbb{C}$, $\alpha x + \beta y \in V$.
4. if $x, y, z \in V$, then $(x + y) + z = x + (y + z)$.
That is, the solution set of a homogeneous linear system satisfies some nice properties. The Euclidean plane $\mathbb{R}^2$ and the Euclidean space $\mathbb{R}^3$ also satisfy the above properties. In this chapter, our aim is to understand sets that satisfy such properties. We start with the following definition.

Definition [Vector Space] A vector space V over F, denoted V(F) or in short V (if the field F is clear from the context), is a non-empty set satisfying the following axioms:
1. Vector Addition: To every pair $u, v \in V$ there corresponds a unique element $u \oplus v \in V$ (called the addition of vectors) such that
(a) $u \oplus v = v \oplus u$ (commutative law).
(b) $(u \oplus v) \oplus w = u \oplus (v \oplus w)$ (associative law).

(c) V has a unique element, denoted $\mathbf{0}$ and called the zero vector, that satisfies $u \oplus \mathbf{0} = u$ for every $u \in V$ (the additive identity).
(d) for every $u \in V$ there is a unique element $-u \in V$ that satisfies $u \oplus (-u) = \mathbf{0}$ (the additive inverse).
2. Scalar Multiplication: For each $u \in V$ and $\alpha \in F$, there corresponds a unique element $\alpha \odot u \in V$ (called the scalar multiplication) such that
(a) $\alpha \odot (\beta \odot u) = (\alpha\beta) \odot u$ for every $\alpha, \beta \in F$ and $u \in V$ (here $\alpha\beta$ is multiplication in F).
(b) $1 \odot u = u$ for every $u \in V$, where $1 \in F$.
3. Distributive Laws (relating vector addition with scalar multiplication): For any $\alpha, \beta \in F$ and $u, v \in V$, the following hold:
(a) $\alpha \odot (u \oplus v) = (\alpha \odot u) \oplus (\alpha \odot v)$.
(b) $(\alpha + \beta) \odot u = (\alpha \odot u) \oplus (\beta \odot u)$ (here + is addition in F).

Definition
1. The number $0 \in F$, whereas $\mathbf{0} \in V$.
2. The elements of F are called scalars.
3. The elements of V are called vectors.
4. If F = R, then V is called a real vector space.
5. If F = C, then V is called a complex vector space.
6. In general, a vector space over R or C is called a linear space.

Some interesting consequences of Definition are stated next. Intuitively, these results seem obvious, but for a better understanding of the axioms it is desirable to go through the proofs.

Theorem Let V be a vector space over F. Then
1. $u \oplus v = u$ implies $v = \mathbf{0}$.
2. $\alpha \odot u = \mathbf{0}$ if and only if either $u = \mathbf{0}$ or $\alpha = 0$.
3. $(-1) \odot u = -u$, for every $u \in V$.

Proof. Part 1: By Axiom 1d, for each $u \in V$ there exists $-u \in V$ such that $-u \oplus u = \mathbf{0}$. Hence, $u \oplus v = u$ is equivalent to
$$-u \oplus (u \oplus v) = -u \oplus u \implies (-u \oplus u) \oplus v = \mathbf{0} \implies \mathbf{0} \oplus v = \mathbf{0} \implies v = \mathbf{0}.$$
Part 2: As $\mathbf{0} = \mathbf{0} \oplus \mathbf{0}$, using Axiom 3a we have
$$\alpha \odot \mathbf{0} = \alpha \odot (\mathbf{0} \oplus \mathbf{0}) = (\alpha \odot \mathbf{0}) \oplus (\alpha \odot \mathbf{0}).$$
Thus, using Part 1, $\alpha \odot \mathbf{0} = \mathbf{0}$ for any $\alpha \in F$. In the same way, using Axiom 3b,
$$0 \odot u = (0 + 0) \odot u = (0 \odot u) \oplus (0 \odot u).$$

Hence, using Part 1, one has $0 \odot u = \mathbf{0}$ for any $u \in V$. Now suppose $\alpha \odot u = \mathbf{0}$. If $\alpha = 0$, then the proof is over. Therefore, assume that $\alpha \ne 0$, $\alpha \in F$. Then $\alpha^{-1} \in F$ and
$$\mathbf{0} = \alpha^{-1} \odot \mathbf{0} = \alpha^{-1} \odot (\alpha \odot u) = (\alpha^{-1}\alpha) \odot u = 1 \odot u = u,$$
as $1 \odot u = u$ for every vector $u \in V$ (see Axiom 2b). Thus, if $\alpha \ne 0$ and $\alpha \odot u = \mathbf{0}$, then $u = \mathbf{0}$.
Part 3: As $\mathbf{0} = 0 \odot u = (1 + (-1)) \odot u = u \oplus (-1)\odot u$, one has $(-1)\odot u = -u$.

Example The readers are advised to justify the statements given below.
1. Let A be an m×n matrix with complex entries and Rank(A) = r ≤ n. Then, using Theorem, $V = \{x \mid Ax = \mathbf{0}\}$ is a vector space.
2. Consider R with the usual addition and multiplication. Then R forms a real vector space.
3. Let $\mathbb{R}^2 = \{(x_1, x_2)^T \mid x_1, x_2 \in \mathbb{R}\}$. Then, for $x_1, x_2, y_1, y_2 \in \mathbb{R}$ and $\alpha \in \mathbb{R}$, define
$$(x_1, x_2)^T \oplus (y_1, y_2)^T = (x_1 + y_1,\ x_2 + y_2)^T \quad\text{and}\quad \alpha \odot (x_1, x_2)^T = (\alpha x_1, \alpha x_2)^T.$$
Verify that $\mathbb{R}^2$ is a real vector space.
4. Let $\mathbb{R}^n = \{(a_1, \ldots, a_n)^T \mid a_i \in \mathbb{R},\ 1 \le i \le n\}$. For $u = (a_1, \ldots, a_n)^T$, $v = (b_1, \ldots, b_n)^T \in V$ and $\alpha \in \mathbb{R}$, define
$$u \oplus v = (a_1 + b_1, \ldots, a_n + b_n)^T \quad\text{and}\quad \alpha \odot u = (\alpha a_1, \ldots, \alpha a_n)^T$$
(called componentwise operations). Then V is a real vector space. The vector space $\mathbb{R}^n$ is called the real vector space of n-tuples. Recall that the symbol i represents the complex number $\sqrt{-1}$.
5. Consider $\mathbb{C} = \{x + iy \mid x, y \in \mathbb{R}\}$, the set of complex numbers. Let $z_1 = x_1 + iy_1$ and $z_2 = x_2 + iy_2$, and define $z_1 \oplus z_2 = (x_1 + x_2) + i(y_1 + y_2)$. For scalar multiplication,
(a) let $\alpha \in \mathbb{R}$. Then $\alpha \odot z_1 = (\alpha x_1) + i(\alpha y_1)$, and we call C the real vector space.
(b) let $\alpha + i\beta \in \mathbb{C}$. Then $(\alpha + i\beta)\odot(x_1 + iy_1) = (\alpha x_1 - \beta y_1) + i(\alpha y_1 + \beta x_1)$, and we call C the complex vector space.
6. Let $\mathbb{C}^n = \{(z_1, \ldots, z_n)^T \mid z_i \in \mathbb{C},\ 1 \le i \le n\}$. For $z = (z_1, \ldots, z_n)^T$, $w = (w_1, \ldots, w_n)^T \in \mathbb{C}^n$ and $\alpha \in F$, define
$$z \oplus w = (z_1 + w_1, \ldots, z_n + w_n)^T \quad\text{and}\quad \alpha \odot z = (\alpha z_1, \ldots, \alpha z_n)^T.$$
Then, verify that $\mathbb{C}^n$ forms a vector space over C (called the complex vector space) as well as over R (called the real vector space). In general, we assume $\mathbb{C}^n$ to be a complex vector space.

Remark If F = C, then $i(1,0) = (i,0)$ is allowed. Whereas, if F = R, then $i(1,0)$ doesn't make sense, as $i \notin \mathbb{R}$.
7. Fix $m, n \in \mathbb{N}$ and let $M_{m,n}(\mathbb{C}) = \{A_{m\times n} = [a_{ij}] \mid a_{ij} \in \mathbb{C}\}$. For $A, B \in M_{m,n}(\mathbb{C})$ and $\alpha \in \mathbb{C}$, define $(A + \alpha B)_{ij} = a_{ij} + \alpha b_{ij}$. Then $M_{m,n}(\mathbb{C})$ is a complex vector space. If m = n, the vector space $M_{m,n}(\mathbb{C})$ is denoted by $M_n(\mathbb{C})$.
8. Let S be a non-empty set and let $\mathbb{R}^S = \{f \mid f \text{ is a function from } S \text{ to } \mathbb{R}\}$. For $f, g \in \mathbb{R}^S$ and $\alpha \in \mathbb{R}$, define $(f + \alpha g)(x) = f(x) + \alpha g(x)$, for all $x \in S$. Then $\mathbb{R}^S$ is a real vector space. In particular,
(a) for S = N, observe that $\mathbb{R}^{\mathbb{N}}$, consisting of all real sequences, forms a real vector space.
(b) Let V be the set of all bounded real sequences. Then V is a real vector space.
(c) Let V be the set of all real sequences that converge to 0. Then V is a real vector space.
(d) Let S be the set of all real sequences that converge to 1. Then check that S is not a vector space. Determine which conditions fail.
9. Fix $a, b \in \mathbb{R}$ with a < b and let $C([a,b],\mathbb{R}) = \{f : [a,b]\to\mathbb{R} \mid f \text{ is continuous}\}$. Then $C([a,b],\mathbb{R})$ with $(f + \alpha g)(x) = f(x) + \alpha g(x)$, for all $x \in [a,b]$, is a real vector space.
10. Let $C(\mathbb{R},\mathbb{R}) = \{f : \mathbb{R}\to\mathbb{R} \mid f \text{ is continuous}\}$. Then $C(\mathbb{R},\mathbb{R})$ with $(f + \alpha g)(x) = f(x) + \alpha g(x)$, for all $x \in \mathbb{R}$, is a real vector space.
11. Fix $a < b \in \mathbb{R}$ and let $C^2((a,b),\mathbb{R}) = \{f : (a,b)\to\mathbb{R} \mid f'' \text{ exists and is continuous}\}$. Then $C^2((a,b),\mathbb{R})$ with $(f + \alpha g)(x) = f(x) + \alpha g(x)$, for all $x \in (a,b)$, is a real vector space.
12. Fix $a < b \in \mathbb{R}$ and let $C^\infty((a,b),\mathbb{R}) = \{f : (a,b)\to\mathbb{R} \mid f \text{ is infinitely differentiable}\}$. Then $C^\infty((a,b),\mathbb{R})$ with $(f + \alpha g)(x) = f(x) + \alpha g(x)$, for all $x \in (a,b)$, is a real vector space.
13. Fix $a < b \in \mathbb{R}$. Then $V = \{f : (a,b)\to\mathbb{R} \mid f'' + f' + 2f = 0\}$ is a real vector space.
14. Let $\mathbb{R}[x] = \{p(x) \mid p(x) \text{ is a real polynomial in } x\}$. Then the usual addition of polynomials, together with $\alpha(a_0 + a_1x + \cdots + a_nx^n) = (\alpha a_0) + \cdots + (\alpha a_n)x^n$ for $\alpha \in \mathbb{R}$, gives $\mathbb{R}[x]$ a real vector space structure.
15. Fix $n \in \mathbb{N}$ and let $\mathbb{R}[x;n] = \{p(x) \in \mathbb{R}[x] \mid p(x) \text{ has degree at most } n\}$. Then the usual addition of polynomials, with the scalar multiplication above, gives $\mathbb{R}[x;n]$ a real vector space structure.
16. Let $\mathbb{C}[x] = \{p(x) \mid p(x) \text{ is a complex polynomial in } x\}$. Then the usual addition of polynomials, together with $\alpha(a_0 + a_1x + \cdots + a_nx^n) = (\alpha a_0) + \cdots + (\alpha a_n)x^n$ for $\alpha \in \mathbb{C}$, gives $\mathbb{C}[x]$ a complex vector space structure.
17. Let $V = \{\mathbf{0}\}$. Then V is a real as well as a complex vector space.
18. Let $\mathbb{R}^+ = \{x \in \mathbb{R} \mid x > 0\}$. Then

(a) $\mathbb{R}^+$ is not a vector space under the usual operations of addition and scalar multiplication.
(b) $\mathbb{R}^+$ is a real vector space with 1 as the additive identity if we define $u \oplus v = uv$ and $\alpha \odot u = u^\alpha$ for all $u, v \in \mathbb{R}^+$ and $\alpha \in \mathbb{R}$.
19. For any $\alpha \in \mathbb{R}$ and $x = (x_1,x_2)^T$, $y = (y_1,y_2)^T \in \mathbb{R}^2$, define
$$x \oplus y = (x_1 + y_1 + 1,\ x_2 + y_2 - 3)^T \quad\text{and}\quad \alpha \odot x = (\alpha x_1 + \alpha - 1,\ \alpha x_2 - 3\alpha + 3)^T.$$
Then $\mathbb{R}^2$ is a real vector space with $(-1, 3)^T$ as the additive identity.
20. Let $V = \{A = [a_{ij}] \in M_n(\mathbb{C}) \mid a_{11} = 0\}$. Then V is a complex vector space.
21. Let $V = \{A = [a_{ij}] \in M_n(\mathbb{C}) \mid A = A^*\}$. Then V is a real vector space but not a complex vector space.
22. Let V and W be vector spaces over F, with operations $(+, \bullet)$ and $(\oplus, \odot)$, respectively. Let $V \times W = \{(v, w) \mid v \in V, w \in W\}$. Then $V \times W$ forms a vector space over F if, for every $(v_1,w_1), (v_2,w_2) \in V\times W$ and $\alpha \in F$, we define
$$(v_1, w_1) \boxplus (v_2, w_2) = (v_1 + v_2,\ w_1 \oplus w_2) \quad\text{and}\quad \alpha \boxdot (v_1, w_1) = (\alpha \bullet v_1,\ \alpha \odot w_1).$$
Here $v_1 + v_2$ and $w_1 \oplus w_2$ on the right hand side mean vector addition in V and W, respectively; similarly, $\alpha \bullet v_1$ and $\alpha \odot w_1$ correspond to scalar multiplication in V and W, respectively.
23. Let Q be the set of scalars. Then R is a vector space over Q. As $e, \pi, \sqrt{2} \notin \mathbb{Q}$, these real numbers are vectors, but not scalars, in this space.
24. Similarly, C is a vector space over Q. Since $e\pi$, $i + \sqrt{2}$, $i \notin \mathbb{Q}$, these complex numbers are vectors, but not scalars, in this space.
25. Let $\mathbb{Z}_5 = \{0,1,2,3,4\}$, with addition and multiplication taken modulo 5. Then $V = \{(a,b) \mid a,b \in \mathbb{Z}_5\}$ is a vector space over $\mathbb{Z}_5$ having 25 vectors.
Note that all our vector spaces, except the last three, are linear spaces.
From now on, we will write u + v for $u \oplus v$, and αu or α·u for $\alpha \odot u$.
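Exotic examples such as 18b can be probed numerically. A sketch (ours, in Python; it checks a few of the axioms at random points, which can refute an axiom but never prove it):

```python
# Checking some axioms for Example 18b: R+ with u (+) v = uv, a (.) u = u^a,
# where the real number 1 plays the role of the zero vector.
import math, random

def vadd(u, v): return u * v
def smul(a, u): return u ** a

random.seed(0)
for _ in range(100):
    u, v = random.uniform(0.1, 10), random.uniform(0.1, 10)
    a, b = random.uniform(-3, 3), random.uniform(-3, 3)
    assert math.isclose(vadd(u, v), vadd(v, u))                        # commutativity
    assert math.isclose(vadd(u, 1.0), u)                               # 1 acts as the zero vector
    assert math.isclose(vadd(u, 1.0 / u), 1.0)                         # additive inverse of u is 1/u
    assert math.isclose(smul(a + b, u), vadd(smul(a, u), smul(b, u)))  # u^(a+b) = u^a * u^b
    assert math.isclose(smul(a, vadd(u, v)), vadd(smul(a, u), smul(a, v)))
```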

Exercise
1. Verify the axioms for the vector spaces that appear in Example.
2. Does the set V given below form a real vector space, a complex vector space, or both? Give reasons for your answer.
(a) For $x = (x_1,x_2)^T$, $y = (y_1,y_2)^T \in \mathbb{R}^2$, define $x + y = (x_1 + y_1, x_2 + y_2)^T$ and $\alpha x = (\alpha x_1, 0)^T$ for all $\alpha \in \mathbb{R}$.
(b) Let $V = \Big\{\begin{bmatrix} a & b \\ c & d \end{bmatrix} \,\Big|\, a,b,c,d \in \mathbb{C},\ a + c = 0\Big\}$.
(c) Let $V = \Big\{\begin{bmatrix} a & b \\ c & d \end{bmatrix} \,\Big|\, a = \overline{b},\ a,b,c,d \in \mathbb{C}\Big\}$.
(d) Let $V = \{(x,y,z)^T \mid x + y + z = 1\}$.
(e) Let $V = \{(x,y)^T \in \mathbb{R}^2 \mid xy = 0\}$.
(f) Let $V = \{(x,y)^T \in \mathbb{R}^2 \mid x = y^2\}$.
(g) Let $V = \{\alpha(1,1,1)^T + \beta(1,1,-1)^T \mid \alpha, \beta \in \mathbb{R}\}$.
(h) Let $V = \mathbb{R}$ with $x \oplus y = x - y$ and $\alpha \odot x = \alpha x$, for all $x, y \in V$ and $\alpha \in \mathbb{R}$.
(i) Let $V = \mathbb{R}^2$. Define $(x_1,y_1)^T \oplus (x_2,y_2)^T = (x_1 + x_2, 0)^T$ and $\alpha \odot (x_1,y_1)^T = (\alpha x_1, 0)^T$, for $\alpha, x_1, x_2, y_1, y_2 \in \mathbb{R}$.

3.1.A Subspaces

Definition [Vector Subspace] Let V be a vector space over F. Then a non-empty subset S of V is called a subspace of V if S is also a vector space with vector addition and scalar multiplication inherited from V.

Example
1. If V is a vector space, then V and {0} are subspaces, called the trivial subspaces.
2. The real vector space R has no non-trivial subspace.
3. $W = \{x \in \mathbb{R}^3 \mid [1, 2, -1]x = 0\}$ is a plane in $\mathbb{R}^3$ containing $\mathbf{0}$ (so a subspace).
4. $W = \{x \in \mathbb{R}^3 \mid Ax = \mathbf{0}\}$, where A is a 2×3 matrix of rank 2, is a line in $\mathbb{R}^3$ containing $\mathbf{0}$ (so a subspace).
5. The vector space $\mathbb{R}[x;n]$ is a subspace of $\mathbb{R}[x]$.
6. Prove that $C^2((a,b),\mathbb{R})$ is a subspace of $C((a,b),\mathbb{R})$.
7. Prove that $W = \{(x,0)^T \in \mathbb{R}^2 \mid x \in \mathbb{R}\}$ is a subspace of $\mathbb{R}^2$.
8. Is the set of sequences converging to 0 a subspace of the set of all bounded sequences?
9. Let V be the vector space of Example 19. Then
(a) $S = \{(x,0)^T \mid x \in \mathbb{R}\}$ is not a subspace of V, as $(x,0)^T \oplus (y,0)^T = (x+y+1, -3)^T \notin S$.
(b) $W = \{(x,3)^T \mid x \in \mathbb{R}\}$ is a subspace of V.
10. The vector space $\mathbb{R}^+$ defined in Example 18 is not a subspace of R.
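The subspace examples above can be probed with a randomized version of the one-condition criterion proved in the theorem that follows. A sketch (assuming Python with numpy; the sets W and U below are our own illustrative choices):

```python
# Randomized closure test: does a*u + b*v stay in the set? (refutes, never proves)
import numpy as np

rng = np.random.default_rng(2)

def closed_under_combinations(sample, contains, trials=500):
    for _ in range(trials):
        u, v = sample(), sample()
        a, b = rng.standard_normal(2)
        if not contains(a * u + b * v):
            return False
    return True

normal = np.array([1.0, 2.0, -1.0])

def sample_plane():                    # W = {x in R^3 : x1 + 2x2 - x3 = 0}, a subspace
    y = rng.standard_normal(3)
    return y - (y @ normal) / (normal @ normal) * normal   # project onto the plane

def sample_halfspace():                # U = {x in R^3 : x1 >= 0}, not a subspace
    x = rng.standard_normal(3)
    x[0] = abs(x[0])
    return x

print(closed_under_combinations(sample_plane, lambda x: abs(x @ normal) < 1e-9))  # True
print(closed_under_combinations(sample_halfspace, lambda x: x[0] >= 0))           # False
```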

Let V(F) be a vector space and $W \subseteq V$, $W \ne \emptyset$. We now prove a result which implies that, to check that W is a subspace, we need to verify only one condition.

Theorem Let V(F) be a vector space and $W \subseteq V$, $W \ne \emptyset$. Then W is a subspace of V if and only if $\alpha u + \beta v \in W$ whenever $\alpha, \beta \in F$ and $u, v \in W$.

Proof. Let W be a subspace of V and let $u, v \in W$. Then, for every $\alpha, \beta \in F$, $\alpha u, \beta v \in W$ and hence $\alpha u + \beta v \in W$.
Now we assume that $\alpha u + \beta v \in W$ whenever $\alpha, \beta \in F$ and $u, v \in W$. To show that W is a subspace of V:
1. Taking α = 1 and β = 1, we see that $u + v \in W$ for every $u, v \in W$.
2. Taking α = 0 and β = 0, we see that $\mathbf{0} \in W$.
3. Taking β = 0, we see that $\alpha u \in W$ for every $\alpha \in F$ and $u \in W$. Hence, using Theorem, $-u = (-1)u \in W$ as well.
4. The commutative and associative laws of vector addition hold, as they hold in V.
5. The axioms related to scalar multiplication and the distributive laws also hold, as they hold in V.
Thus, one obtains the required result.

Exercise
1. Determine all the subspaces of R and $\mathbb{R}^2$.
2. Prove that a line in $\mathbb{R}^2$ is a subspace if and only if it passes through $(0,0) \in \mathbb{R}^2$.
3. Are all the sets given below subspaces of $C([-1,1])$?
(a) $W = \{f \in C([-1,1]) \mid f(1/2) = 0\}$.
(b) $W = \{f \in C([-1,1]) \mid f(-1/2) = 0,\ f(1/2) = 0\}$.
(c) $W = \{f \in C([-1,1]) \mid f'(\tfrac14) \text{ exists}\}$.
4. Are all the sets given below subspaces of $\mathbb{R}[x]$?
(a) $W = \{f(x) \in \mathbb{R}[x] \mid \deg(f(x)) = 3\}$.
(b) $W = \{f(x) \in \mathbb{R}[x] \mid \deg(f(x)) = 0\}$.
(c) $W = \{f(x) \in \mathbb{R}[x] \mid f(1) = 0\}$.
(d) $W = \{f(x) \in \mathbb{R}[x] \mid f(0) = 0,\ f(1/2) = 0\}$.
5. Which of the following are subspaces of $\mathbb{R}^n(\mathbb{R})$?
(a) $\{(x_1, x_2, \ldots, x_n)^T \mid x_1 \ge 0\}$.
(b) $\{(x_1, x_2, \ldots, x_n)^T \mid x_1 \text{ is rational}\}$.
(c) $\{(x_1, x_2, \ldots, x_n)^T \mid x_1 \le 1\}$.

6. Among the following, determine the subspaces of the complex vector space $\mathbb{C}^n$.
(a) $\{(z_1, z_2, \ldots, z_n)^T \mid z_1 \text{ is real}\}$.
(b) $\{(z_1, z_2, \ldots, z_n)^T \mid z_1 + z_2 = z_3\}$.
(c) $\{(z_1, z_2, \ldots, z_n)^T \mid z_1 = \overline{z_2}\}$.
7. Fix $n \in \mathbb{N}$. Then, is W a subspace of $M_n(\mathbb{R})$, where
(a) $W = \{A \in M_n(\mathbb{R}) \mid A \text{ is upper triangular}\}$?
(b) $W = \{A \in M_n(\mathbb{R}) \mid A \text{ is symmetric}\}$?
(c) $W = \{A \in M_n(\mathbb{R}) \mid A \text{ is skew-symmetric}\}$?
(d) $W = \{A \mid A \text{ is a diagonal matrix}\}$?
(e) $W = \{A \mid \operatorname{trace}(A) = 0\}$?
(f) $W = \{A \in M_n(\mathbb{R}) \mid A^T = 2A\}$?
(g) $W = \{A = [a_{ij}] \mid a_{11} + a_{21} + a_{34} = 0\}$ (for $n \ge 4$)?
8. Fix $n \in \mathbb{N}$. Then, is $W = \{A = [a_{ij}] \mid a_{11} + a_{22} = 0\}$ a subspace of the complex vector space $M_n(\mathbb{C})$? What if $M_n(\mathbb{C})$ is treated as a real vector space?
9. Prove that the following sets are not subspaces of $M_n(\mathbb{R})$.
(a) $G = \{A \in M_n(\mathbb{R}) \mid \det(A) = 0\}$.
(b) $G = \{A \in M_n(\mathbb{R}) \mid \det(A) \ne 0\}$.
(c) $G = \{A \in M_n(\mathbb{R}) \mid \det(A) = 1\}$.

3.1.B Linear Span

Definition [Linear Combination] Let V be a vector space over F and let $u_1, \ldots, u_n \in V$. Then a vector $u \in V$ is said to be a linear combination of $u_1, \ldots, u_n$ if we can find scalars $\alpha_1, \ldots, \alpha_n \in F$ such that $u = \alpha_1 u_1 + \cdots + \alpha_n u_n$.

Example
1. $(3,4,3) = 2(1,1,1) + (1,2,1)$, but we cannot find $a, b \in \mathbb{R}$ such that $(3,4,5) = a(1,1,1) + b(1,2,1)$.
2. Is (4,5,5) a linear combination of (1,0,0), (2,1,0) and (3,3,1)?
Solution: (4,5,5) is a linear combination if the linear system
$$a(1,0,0) + b(2,1,0) + c(3,3,1) = (4,5,5) \qquad (\ast)$$
in the unknowns $a, b, c \in \mathbb{R}$ has a solution. Clearly, Equation $(\ast)$ has the solution $a = 9$, $b = -10$ and $c = 5$.

3. Is (4,5,5) a linear combination of the vectors (1,2,3), (−1,1,4) and (3,3,2)?
Solution: The vector (4,5,5) is a linear combination if the linear system
$$a(1,2,3) + b(-1,1,4) + c(3,3,2) = (4,5,5) \qquad (\ast\ast)$$
in the unknowns $a, b, c \in \mathbb{R}$ has a solution. The RREF of the corresponding augmented matrix shows that the system has an infinite number of solutions. For example, $(4,5,5) = 3(1,2,3) - (-1,1,4)$.
4. Is (4,5,5) a linear combination of the vectors (1,2,1), (1,0,−1) and (1,1,0)?
Solution: The vector (4,5,5) is a linear combination if the linear system
$$a(1,2,1) + b(1,0,-1) + c(1,1,0) = (4,5,5) \qquad (\ast{\ast}\ast)$$
in the unknowns $a, b, c \in \mathbb{R}$ has a solution. The RREF of the corresponding augmented matrix equals
$$\begin{bmatrix} 1 & 0 & 1/2 & 5/2 \\ 0 & 1 & 1/2 & 3/2 \\ 0 & 0 & 0 & 1 \end{bmatrix},$$
so Equation $(\ast{\ast}\ast)$ has no solution. Thus, (4,5,5) is not a linear combination of the given vectors.

Exercise
1. Let $x \in \mathbb{R}^3$. Prove that $x^T$ is a linear combination of (1,0,0), (2,1,0) and (3,3,1). Is this linear combination unique? That is, does there exist $(a,b,c) \ne (e,f,g)$ with
$$x^T = a(1,0,0) + b(2,1,0) + c(3,3,1) = e(1,0,0) + f(2,1,0) + g(3,3,1)?$$
2. Find condition(s) on $x, y, z \in \mathbb{R}$ such that (x,y,z) is a linear combination of
(a) (1,2,3), (−1,1,4) and (3,3,2).
(b) (1,2,1), (1,0,−1) and (1,1,0).
(c) (1,1,1), (1,1,0) and (1,−1,0).

Definition [Linear Span] Let V be a vector space over F and $S = \{u_1, \ldots, u_n\} \subseteq V$. Then the linear span of S, denoted LS(S), equals
$$LS(S) = \{\alpha_1u_1 + \cdots + \alpha_nu_n \mid \alpha_i \in F,\ 1 \le i \le n\}.$$
If S is an empty set, we define LS(S) = {0}.

Example For the set S given below, determine LS(S).
1. $S = \{(1,0)^T, (0,1)^T\} \subseteq \mathbb{R}^2$.
Solution: $LS(S) = \{a(1,0)^T + b(0,1)^T \mid a,b \in \mathbb{R}\} = \{(a,b)^T \mid a,b \in \mathbb{R}\} = \mathbb{R}^2$.
2. $S = \{(1,1,1)^T, (2,1,3)^T\}$. What is the geometrical representation of LS(S)?
Solution: $LS(S) = \{a(1,1,1)^T + b(2,1,3)^T \mid a,b\in\mathbb{R}\} = \{(a+2b,\ a+b,\ a+3b)^T \mid a,b \in \mathbb{R}\}$. For the geometrical representation, we need to find conditions on x, y and z such that

$(a+2b, a+b, a+3b) = (x,y,z)$. Or equivalently, $a + 2b = x$, $a + b = y$, $a + 3b = z$ has a solution for a and b. Check that the RREF of the augmented matrix equals
$$\begin{bmatrix} 1 & 0 & 2y - x \\ 0 & 1 & x - y \\ 0 & 0 & z + y - 2x \end{bmatrix}.$$
Thus, we need $2x - y - z = 0$. Hence, LS(S) is the plane given by
$$LS(S) = \{a(1,1,1)^T + b(2,1,3)^T \mid a,b\in\mathbb{R}\} = \{(x,y,z)^T \in \mathbb{R}^3 \mid 2x - y - z = 0\}.$$
3. $S = \{(1,2,1)^T, (1,0,-1)^T, (1,1,0)^T\}$. What is the geometrical representation of LS(S)?
Solution: As above, we need to find condition(s) on x, y, z such that the linear system
$$a(1,2,1) + b(1,0,-1) + c(1,1,0) = (x,y,z)$$
in the unknowns a, b, c is consistent. An application of GJE to the augmented matrix gives the condition $x - y + z = 0$. Thus,
$$LS(S) = \{(x,y,z)^T \in \mathbb{R}^3 \mid x - y + z = 0\}.$$
4. $S = \{(1,2,3)^T, (-1,1,4)^T, (3,3,2)^T\}$.
Solution: As above, we need to find condition(s) on x, y, z such that the linear system
$$a(1,2,3) + b(-1,1,4) + c(3,3,2) = (x,y,z)$$
in the unknowns a, b, c is consistent. An application of the GJE method gives $5x - 7y + 3z = 0$ as the required condition. Thus,
$$LS(S) = \{(x,y,z)^T \in \mathbb{R}^3 \mid 5x - 7y + 3z = 0\}.$$
5. $S = \{(1,2,3,4)^T, (-1,1,4,5)^T, (3,3,2,3)^T\} \subseteq \mathbb{R}^4$.
Solution: The readers are advised to show that
$$LS(S) = \{(x,y,z,w)^T \in \mathbb{R}^4 \mid 2x - 3y + w = 0,\ 5x - 7y + 3z = 0\}.$$

Exercise For each S, determine the geometric representation of LS(S).
1. $S = \{-1\} \subseteq \mathbb{R}$.
2. $S = \{\pi\} \subseteq \mathbb{R}$.
3. $S = \{(1,0,1)^T, (0,1,0)^T, (3,0,3)^T\} \subseteq \mathbb{R}^3$.
4. $S = \{(1,2,1)^T, (2,0,1)^T, (1,1,1)^T\} \subseteq \mathbb{R}^3$.
5. $S = \{(1,0,1,1)^T, (0,1,0,1)^T, (3,0,3,1)^T\} \subseteq \mathbb{R}^4$.
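Both computations in the examples above can be automated. A sketch (assuming Python with sympy): membership $b \in LS(S)$ is a rank comparison, and the defining equations of LS(S) come from the null space of $A^T$, where A has the vectors of S as its columns.

```python
# (i) is b in LS(S)?  (ii) what equations cut out LS(S)?
import sympy as sp

S = [sp.Matrix([1, 2, 3]), sp.Matrix([-1, 1, 4]), sp.Matrix([3, 3, 2])]
A = sp.Matrix.hstack(*S)                       # columns are the vectors of S

b = sp.Matrix([4, 5, 5])
in_span = A.rank() == A.row_join(b).rank()     # True: (4,5,5) lies in LS(S)
print(in_span)

# each null-space vector of A^T gives one linear equation satisfied by LS(S)
xyz = sp.Matrix(sp.symbols('x y z'))
for n in A.T.nullspace():
    print(sp.Eq((n.T * xyz)[0], 0))            # 5x - 7y + 3z = 0, up to scaling
```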

Definition [Finite Dimensional Vector Space] Let V be a vector space over F. Then V is called finite dimensional if there exists $S \subseteq V$ such that S has a finite number of elements and V = LS(S). If no such S exists, then V is called infinite dimensional.

Example
1. $\{(1,2)^T, (2,1)^T\}$ spans $\mathbb{R}^2$. Thus, $\mathbb{R}^2$ is finite dimensional.
2. $\{1,\ 1+x,\ 1-x+x^2,\ x^3,\ x^4,\ x^5\}$ spans $\mathbb{C}[x;5]$. Thus, $\mathbb{C}[x;5]$ is finite dimensional.
3. Fix $n \in \mathbb{N}$. Then $\mathbb{R}[x;n]$ is finite dimensional, as $\mathbb{R}[x;n] = LS(\{1, x, x^2, \ldots, x^n\})$.
4. $\mathbb{C}[x]$ is not finite dimensional, as the degree of a polynomial can be any large positive integer. Indeed, verify that $\mathbb{C}[x] = LS(\{1, x, x^2, \ldots, x^n, \ldots\})$.
5. The vector space R over Q is infinite dimensional.
6. The vector space C over Q is infinite dimensional.

Lemma (Linear Span is a Subspace). Let V be a vector space over F and $S \subseteq V$. Then LS(S) is a subspace of V.

Proof. By definition, $\mathbf{0} \in LS(S)$, so LS(S) is non-empty. Let $u, v \in LS(S)$. We show that $au + bv \in LS(S)$ for all $a, b \in F$. As $u, v \in LS(S)$, there exist $n \in \mathbb{N}$, vectors $w_i \in S$ and scalars $\alpha_i, \beta_i \in F$ such that $u = \alpha_1w_1 + \cdots + \alpha_nw_n$ and $v = \beta_1w_1 + \cdots + \beta_nw_n$. Hence,
$$au + bv = (a\alpha_1 + b\beta_1)w_1 + \cdots + (a\alpha_n + b\beta_n)w_n \in LS(S),$$
as $a\alpha_i + b\beta_i \in F$ for $1 \le i \le n$. Thus, by Theorem, LS(S) is a vector subspace.

Remark Let V be a vector space over F. If W is a subspace of V and $S \subseteq W$, then LS(S) is a subspace of W as well.

Theorem Let V be a vector space over F and $S \subseteq V$. Then LS(S) is the smallest subspace of V containing S.

Proof. For every $u \in S$, $u = 1 \cdot u \in LS(S)$. Thus, $S \subseteq LS(S)$. It remains to show that LS(S) is the smallest subspace of V containing S. So, let W be any subspace of V containing S. Then, by Remark, $LS(S) \subseteq W$, and hence the result follows.

Definition Let V be a vector space over F.
1. Let S and T be two subsets of V. Then the sum of S and T, denoted S + T, equals $\{s + t \mid s \in S,\ t \in T\}$. For example,
(a) if V = R, S = {0,1,2,3,4,5,6} and T = {5,10,15}, then S + T = {5,6,...,21};
(b) if $V = \mathbb{R}^2$ and S, T are singletons $\{u\}$ and $\{v\}$, then $S + T = \{u + v\}$;
(c) if $V = \mathbb{R}^2$, $S = \{u\}$ and $T = LS(v)$, then $S + T = \{u + cv \mid c \in \mathbb{R}\}$.
2. Let P and Q be two subspaces of V. Then we define their sum, denoted P + Q, as $P + Q = \{u + v \mid u \in P,\ v \in Q\}$. For example, $P + Q = \mathbb{R}^2$ if

(a) $P = \{(x,0)^T \mid x \in \mathbb{R}\}$ and $Q = \{(0,x)^T \mid x \in \mathbb{R}\}$, as $(x,y) = (x,0) + (0,y)$;
(b) $P = \{(x,0)^T \mid x \in \mathbb{R}\}$ and $Q = \{(x,x)^T \mid x \in \mathbb{R}\}$, as $(x,y) = (x-y, 0) + (y,y)$;
(c) $P = LS((1,2)^T)$ and $Q = LS((2,1)^T)$, as $(x,y) = \frac{2y-x}{3}(1,2) + \frac{2x-y}{3}(2,1)$.

We leave the proof of the next result to the readers.

Lemma Let V be a vector space over F and let P and Q be two subspaces of V. Then P + Q is the smallest subspace of V containing both P and Q.

Exercise
1. Let $a \in \mathbb{R}^2$, $a \ne \mathbf{0}$. Then show that $\{x \in \mathbb{R}^2 \mid a^Tx = 0\}$ is a non-trivial subspace of $\mathbb{R}^2$. Geometrically, what does this set represent in $\mathbb{R}^2$?
2. Find all subspaces of $\mathbb{R}^3$.
3. Prove that $\{(x,y,z)^T \in \mathbb{R}^3 \mid ax + by + cz = d\}$ is a subspace of $\mathbb{R}^3$ if and only if d = 0.
4. Let $W = \{f(x) \in \mathbb{R}[x] \mid \deg(f(x)) = 5\}$. Prove that W is not a subspace of $\mathbb{R}[x]$.
5. Determine all subspaces of the vector space in Example 19.
6. Let $U = \Big\{\begin{bmatrix} a & b \\ c & 0 \end{bmatrix} \,\Big|\, a,b,c \in \mathbb{R}\Big\}$ and $W = \Big\{\begin{bmatrix} a & 0 \\ 0 & d\end{bmatrix} \,\Big|\, a,d \in \mathbb{R}\Big\}$ be subspaces of $M_2(\mathbb{R})$. Determine $U \cap W$. Is $M_2(\mathbb{R}) = U \cup W$? What is U + W?
7. Let W and U be two subspaces of a vector space V over F.
(a) Prove that $W \cap U$ is a subspace of V.
(b) Give examples of W and U such that $W \cup U$ is not a subspace of V.
(c) Determine conditions on W and U such that $W \cup U$ is a subspace of V.
(d) Prove that $LS(W \cup U) = W + U$.
8. Let $S = \{x_1, x_2, x_3, x_4\}$, where $x_1 = (1,0,0)^T$, $x_2 = (1,1,0)^T$, $x_3 = (1,2,0)^T$ and $x_4 = (1,1,1)^T$. Then determine all $x_i$ such that $LS(S) = LS(S \setminus \{x_i\})$.
9. Let $W = LS((1,0,0)^T, (1,1,0)^T)$ and $U = LS((1,1,1)^T)$. Prove that $W + U = \mathbb{R}^3$ and $W \cap U = \{\mathbf{0}\}$. If $u \in \mathbb{R}^3$, determine $u_W \in W$ and $u_U \in U$ such that $u = u_W + u_U$. Is it necessary that $u_W$ and $u_U$ are unique?
10. Let $W = LS((1,-1,0), (1,1,0))$ and $U = LS((1,1,1), (1,2,1))$. Prove that $W + U = \mathbb{R}^3$ and $W \cap U \ne \{\mathbf{0}\}$. Find $u \in \mathbb{R}^3$ such that when we write $u = u_W + u_U$, with $u_W \in W$ and $u_U \in U$, the vectors $u_W$ and $u_U$ are not unique.

3.1.C Fundamental Subspaces Associated with a Matrix

Definition Let $A \in M_{m,n}(\mathbb{C})$. Then we use the functions $P : \mathbb{C}^n \to \mathbb{C}^m$ and $Q : \mathbb{C}^m \to \mathbb{C}^n$, defined by $P(x) = Ax$ and $Q(y) = A^*y$, respectively, for $x \in \mathbb{C}^n$ and $y \in \mathbb{C}^m$, to get the four fundamental subspaces associated with A, namely,

1. $\operatorname{Col}(A) = \{Ax \mid x \in \mathbb{C}^n\} = \operatorname{Rng}(P)$, a subspace of $\mathbb{C}^m$, called the column/range space. Observe that Col(A) is the linear span of the columns of A.
2. $\operatorname{Null}(A) = \{x \in \mathbb{C}^n \mid Ax = \mathbf{0}\} = \operatorname{Null}(P)$, a subspace of $\mathbb{C}^n$, called the null space.
3. $\operatorname{Col}(A^*) = \{A^*y \mid y \in \mathbb{C}^m\} = \operatorname{Rng}(Q)$. If $A \in M_{m,n}(\mathbb{R})$, then $\operatorname{Col}(A^*)$ reduces to $\operatorname{Row}(A) = \{x^TA \mid x \in \mathbb{R}^m\}$, the row space of A, the linear span of the rows of $A = [a_{ij}]$.
4. $\operatorname{Null}(A^*) = \{x \in \mathbb{C}^m \mid A^*x = \mathbf{0}\} = \operatorname{Null}(Q)$. If $A \in M_{m,n}(\mathbb{R})$, then $\operatorname{Null}(A^*)$ reduces to $\operatorname{Null}(A^T) = \{x \in \mathbb{R}^m \mid x^TA = \mathbf{0}^T\}$.

Example
1. Let A = ⋯ (a 3×3 real matrix of rank 2). Then
(a) $\operatorname{Col}(A) = \{x = (x_1,x_2,x_3)^T \in \mathbb{R}^3 \mid x_1 - x_2 + x_3 = 0\}$.
(b) $\operatorname{Row}(A) = \{x = (x_1,x_2,x_3)^T \in \mathbb{R}^3 \mid x_1 + x_2 - 2x_3 = 0\}$.
(c) $\operatorname{Null}(A) = LS((1,1,-2)^T)$.
(d) $\operatorname{Null}(A^T) = LS((1,-1,1)^T)$.
2. Let A = ⋯ (a 3×4 real matrix of rank 2). Then
(a) $\operatorname{Col}(A) = \{x = (x_1,x_2,x_3)^T \in \mathbb{R}^3 \mid x_1 + x_2 - x_3 = 0\}$.
(b) $\operatorname{Row}(A) = \{x = (x_1,x_2,x_3,x_4)^T \in \mathbb{R}^4 \mid x_1 - x_2 - 2x_3 = 0,\ x_1 - 3x_2 - 2x_4 = 0\}$.
(c) $\operatorname{Null}(A) = LS(\{(1,-1,-2,0)^T, (1,-3,0,-2)^T\})$.
(d) $\operatorname{Null}(A^T) = LS((1,1,-1)^T)$.
3. Let $A \in M_3(\mathbb{C})$ with $A[1,:] = [1,\ i,\ 2i]$ and $A[2,:] = [-i,\ 2,\ 3]$. Then
(a) $\operatorname{Col}(A) = \{(x_1,x_2,x_3)^T \in \mathbb{C}^3 \mid (2+i)x_1 - (1-i)x_2 - x_3 = 0\}$.
(b) $\operatorname{Col}(A^*) = \{(x_1,x_2,x_3)^T \in \mathbb{C}^3 \mid ix_1 - x_2 + x_3 = 0\}$.
(c) $\operatorname{Null}(A) = LS((i,1,-1)^T)$.
(d) $\operatorname{Null}(A^*) = LS((-2+i,\ 1+i,\ 1)^T)$.

Remark Let $A \in M_{m,n}(\mathbb{R})$. Then, in Example, observe that the direction ratios of Col(A) match the vector(s) in $\operatorname{Null}(A^T)$. Similarly, the direction ratios of Row(A) match the vectors in Null(A). What are the relationships in case $A \in M_{m,n}(\mathbb{C})$?
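All four subspaces can be computed mechanically. A sketch (assuming Python with sympy; the matrix below is an arbitrary real example of ours, not one of the matrices above), which also checks the orthogonality suggested in the Remark:

```python
# The four fundamental subspaces of a real matrix; for real A, A* is just A^T.
import sympy as sp

A = sp.Matrix([[1, 1, 1], [2, 2, 2]])   # rank 1, for illustration

col   = A.columnspace()     # basis of Col(A),    a subspace of R^2
null  = A.nullspace()       # basis of Null(A),   a subspace of R^3
row   = A.T.columnspace()   # basis of Row(A) = Col(A^T)
lnull = A.T.nullspace()     # basis of Null(A^T)

for r in row:
    for v in null:
        assert (r.T * v)[0] == 0        # Row(A) is orthogonal to Null(A)
for c in col:
    for v in lnull:
        assert (c.T * v)[0] == 0        # Col(A) is orthogonal to Null(A^T)
```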

We will come back to these spaces again and again. Let V be a vector space over either R or C. Then we have learnt that:
1. for any $S \subseteq V$, LS(S) is again a vector space; moreover, LS(S) is the smallest subspace containing S.
2. unless $S = \emptyset$ or $S = \{\mathbf{0}\}$, LS(S) has an infinite number of vectors.
Therefore, the following questions arise:
(a) Are there conditions under which $LS(S_1) = LS(S_2)$ for $S_1 \ne S_2$?
(b) Is it always possible to find S so that LS(S) = V?
(c) Suppose we have found $S \subseteq V$ such that LS(S) = V. Can we find the minimum number of vectors in S?
We try to answer these questions in the subsequent sections.

3.2 Linear Independence

Definition [Linear Independence and Dependence] Let $S = \{u_1, \ldots, u_m\}$ be a non-empty subset of a vector space V over F. Then the set S is said to be linearly independent if the system of linear equations
$$\alpha_1u_1 + \alpha_2u_2 + \cdots + \alpha_mu_m = \mathbf{0}, \qquad (\ddagger)$$
in the unknowns $\alpha_i$, $1 \le i \le m$, has only the trivial solution. If the system $(\ddagger)$ has a non-trivial solution, then the set S is said to be linearly dependent.
If the set S has infinitely many vectors, then S is said to be linearly independent if every finite subset T of S is linearly independent; otherwise, S is linearly dependent.

Let V be a vector space over F and $S = \{u_1, \ldots, u_m\} \subseteq V$ with $S \ne \emptyset$. Then one needs to solve the linear system $\alpha_1u_1 + \cdots + \alpha_mu_m = \mathbf{0}$ in the unknowns $\alpha_1, \ldots, \alpha_m \in F$. If $\alpha_1 = \cdots = \alpha_m = 0$ is the only solution of this system, then S is a linearly independent subset of V; otherwise, S is a linearly dependent subset of V. Since one is solving a linear system over F, linear independence and dependence depend on F, the set of scalars.

Example Is the set S a linearly independent set? Give reasons.
1. Let $S = \{(1,2,1)^T, (2,1,4)^T, (3,3,5)^T\}$.
Solution: Consider the system $a(1,2,1) + b(2,1,4) + c(3,3,5) = (0,0,0)$ in the unknowns a, b and c. As the rank of the coefficient matrix is 2 < 3, the number of unknowns, the system has a non-trivial solution. Thus, S is a linearly dependent subset of $\mathbb{R}^3$.
2. Let $S = \{(1,1,1)^T, (1,1,0)^T, (1,0,1)^T\}$.
Solution: Consider the system $a(1,1,1) + b(1,1,0) + c(1,0,1) = (0,0,0)$ in the unknowns a, b and c. As the rank of the coefficient matrix is 3, the number of unknowns, the system has only the trivial solution. Hence, S is a linearly independent subset of $\mathbb{R}^3$.

3. Consider C as a complex vector space and let S = {1, i}.
Solution: Since C is a complex vector space, $i \cdot 1 + (-1) \cdot i = i - i = 0$. So, S is a linearly dependent subset of the complex vector space C.
4. Consider C as a real vector space and let S = {1, i}.
Solution: Consider the linear system $a \cdot 1 + b \cdot i = 0$ in the unknowns $a, b \in \mathbb{R}$. Since $a, b \in \mathbb{R}$, equating real and imaginary parts, we get a = b = 0. So, S is a linearly independent subset of the real vector space C.
5. Let $A \in M_{m,n}(\mathbb{C})$. If Rank(A) < min{m, n}, then the rows of A are linearly dependent.
Solution: Let B = RREF(A). Then there exists an invertible matrix $P = [p_{ij}]$ such that B = PA. Since Rank(A) < min{m, n}, $B[m,:] = \mathbf{0}^T$. Thus,
$$\mathbf{0}^T = B[m,:] = \sum_{i=1}^{m} p_{mi}A[i,:].$$
As P is invertible, at least one $p_{mi} \ne 0$. Thus, the required result follows.

3.2.A Basic Results related to Linear Independence

The reader is expected to supply the proofs of the parts that are not given.

Proposition Let V be a vector space over F. Then
1. $\mathbf{0}$, the zero vector, cannot belong to a linearly independent set.
2. every non-empty subset of a linearly independent set in V is also linearly independent.
3. a set containing a linearly dependent set of V is also linearly dependent.

Proof. Let $S = \{u_1 = \mathbf{0}, u_2, \ldots, u_n\}$. Then $1 \cdot u_1 + 0 \cdot u_2 + \cdots + 0 \cdot u_n = \mathbf{0}$. Hence, the system $\alpha_1u_1 + \cdots + \alpha_nu_n = \mathbf{0}$ has the non-trivial solution $[\alpha_1, \alpha_2, \ldots, \alpha_n] = [1, 0, \ldots, 0]$. Thus, the set S is linearly dependent.

Theorem Let V be a vector space over F. Let $S = \{u_1, \ldots, u_k\} \subseteq V$ with $S \ne \emptyset$. If $T \subseteq LS(S)$ with $m = |T| > k$, then T is a linearly dependent set.

Proof. Let $T = \{w_1, \ldots, w_m\}$. As $w_i \in LS(S)$, there exist $a_{ij} \in F$ such that $w_i = a_{i1}u_1 + \cdots + a_{ik}u_k$, for $1 \le i \le m$. So,
$$\begin{bmatrix} w_1 \\ \vdots \\ w_m \end{bmatrix} = \begin{bmatrix} a_{11}u_1 + \cdots + a_{1k}u_k \\ \vdots \\ a_{m1}u_1 + \cdots + a_{mk}u_k \end{bmatrix} = A\begin{bmatrix} u_1 \\ \vdots \\ u_k \end{bmatrix}, \quad\text{where } A = [a_{ij}]_{m\times k}.$$
As m > k, using Corollary, the system $A^Tx = \mathbf{0}$ has a non-trivial solution, say $x^T = [\alpha_1, \ldots, \alpha_m] \ne \mathbf{0}^T$. That is, $\sum\limits_{i=1}^m \alpha_iA^T[:,i] = \mathbf{0}$, or equivalently $\sum\limits_{i=1}^m \alpha_iA[i,:] = \mathbf{0}^T$. Thus,
$$\sum_{i=1}^m \alpha_iw_i = \sum_{i=1}^m \alpha_i\Big(A[i,:]\begin{bmatrix} u_1 \\ \vdots \\ u_k\end{bmatrix}\Big) = \Big(\sum_{i=1}^m \alpha_iA[i,:]\Big)\begin{bmatrix} u_1 \\ \vdots \\ u_k \end{bmatrix} = \mathbf{0}.$$
Thus, the set T is linearly dependent.
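A sketch (assuming Python with sympy) of the rank test used in Examples 1 and 2 above: a finite set of column vectors is linearly independent exactly when the matrix formed from them has rank equal to the number of vectors.

```python
# Linear independence as a rank condition.
import sympy as sp

def independent(vectors):
    A = sp.Matrix.hstack(*[sp.Matrix(v) for v in vectors])
    return A.rank() == len(vectors)

print(independent([(1, 2, 1), (2, 1, 4), (3, 3, 5)]))   # False: rank 2 < 3
print(independent([(1, 1, 1), (1, 1, 0), (1, 0, 1)]))   # True:  rank 3
```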

Corollary Fix $n \in \mathbb{N}$. Then any set $S \subseteq \mathbb{R}^n$ with $|S| \ge n + 1$ is linearly dependent.

Proof. Observe that $\mathbb{R}^n = LS(\{e_1, \ldots, e_n\})$, where $e_i = I_n[:,i]$, the i-th column of $I_n$. Hence, the required result follows using Theorem.

Theorem Let V be a vector space over F and let S be a linearly independent subset of V. Suppose $v \in V$. Then $S \cup \{v\}$ is linearly dependent if and only if $v \in LS(S)$.

Proof. Let us assume that $S \cup \{v\}$ is linearly dependent. Then there exist $v_i \in S$ such that the system $\alpha_1v_1 + \cdots + \alpha_pv_p + \alpha_{p+1}v = \mathbf{0}$ has a non-trivial solution, say $\alpha_i = c_i$ for $1 \le i \le p + 1$. As the solution is non-trivial, one of the $c_i$'s is non-zero. We claim that $c_{p+1} \ne 0$. For, if $c_{p+1} = 0$, then the system $\alpha_1v_1 + \cdots + \alpha_pv_p = \mathbf{0}$ in the unknowns $\alpha_1, \ldots, \alpha_p$ has the non-trivial solution $[c_1, \ldots, c_p]$. This contradicts Proposition, as $\{v_1, \ldots, v_p\}$ is a subset of the linearly independent set S. Thus, $c_{p+1} \ne 0$, and we get
$$v = -\frac{1}{c_{p+1}}(c_1v_1 + \cdots + c_pv_p) \in LS(v_1, \ldots, v_p),$$
as $-\frac{c_i}{c_{p+1}} \in F$ for $1 \le i \le p$. Hence, v is a linear combination of $v_1, \ldots, v_p$.
Now assume that $v \in LS(S)$. Then there exist $c_i \in F$ and $v_i \in S$ such that $v = \sum\limits_{i=1}^p c_iv_i$. Thus, the system $\alpha_1v_1 + \cdots + \alpha_pv_p + \alpha_{p+1}v = \mathbf{0}$ in the unknowns $\alpha_i$ has the non-trivial solution $[c_1, \ldots, c_p, -1]$. Hence, $S \cup \{v\}$ is linearly dependent.

We now state a very important corollary of Theorem without proof.

Corollary Let V be a vector space over F and let $S = \{u_1, \ldots, u_n\} \subseteq V$ with $u_1 \ne \mathbf{0}$. If S is
1. linearly dependent, then there exists k, $2 \le k \le n$, with $LS(u_1, \ldots, u_k) = LS(u_1, \ldots, u_{k-1})$.
2. linearly independent, then $v \in V \setminus LS(S)$ if and only if $S \cup \{v\} = \{u_1, \ldots, u_n, v\}$ is also a linearly independent subset of V.
3. linearly independent, then LS(S) = V if and only if each proper superset of S is linearly dependent.

3.2.B Application to Matrices

We leave the proof of the next result to the readers.

Theorem The following statements are equivalent for $A \in M_n(\mathbb{C})$.
1. A is invertible.
2. The columns of A are linearly independent.
3. The rows of A are linearly independent.

A generalization of Theorem is stated next.

Theorem Let $A \in M_{m,n}(\mathbb{C})$ with B = RREF(A). Then the rows of A corresponding to the pivotal rows of B are linearly independent. Also, the columns of A corresponding to the pivotal columns of B are linearly independent.

Proof. The pivotal rows of B are linearly independent due to the pivotal 1's. Now, let $B_1$ be the submatrix of B consisting of the pivotal rows of B, and let $A_1$ be the submatrix of A which gives $B_1$. As the RREF of a matrix is unique (see Corollary), there exists an invertible matrix Q such that $QA_1 = B_1$. So, if there exists $c \ne \mathbf{0}$ such that $c^TA_1 = \mathbf{0}^T$, then
$$\mathbf{0}^T = c^TA_1 = c^T(Q^{-1}B_1) = (c^TQ^{-1})B_1 = d^TB_1,$$
with $d^T = c^TQ^{-1} \ne \mathbf{0}^T$, as Q is an invertible matrix (see Theorem). This contradicts the linear independence of the pivotal rows of B.
Now let $B[:,i_1], \ldots, B[:,i_r]$ be the pivotal columns of B, which are linearly independent due to the pivotal 1's. If B = PA, then the corresponding columns of A satisfy
$$[A[:,i_1], \ldots, A[:,i_r]] = P^{-1}[B[:,i_1], \ldots, B[:,i_r]].$$
Hence, the required result follows, as P is an invertible matrix.

We give an example for better understanding.

Example Let A = ⋯ with RREF(A) = B = ⋯, where $B[:,3] = B[:,1] + 2B[:,2]$. Then $A[:,3] = A[:,1] + 2A[:,2]$. As the 1st, 2nd and 4th columns of B are linearly independent, the set {A[:,1], A[:,2], A[:,4]} is linearly independent. Also, note that during the application of GJE to get the RREF, we interchanged the 3rd and 4th rows. Hence, the rows A[1,:], A[2,:] and A[4,:] are linearly independent.

3.2.C Linear Independence and Uniqueness of Linear Combination

We end this section with a result stating that a linear combination with respect to a linearly independent set is unique.

Lemma Let S be a linearly independent set in a vector space V over F. Then each $v \in LS(S)$ is a unique linear combination of vectors from S.

Proof. Suppose, on the contrary, that there exists $v \in LS(S)$ such that $v = \alpha_1v_1 + \cdots + \alpha_pv_p$ and $v = \beta_1v_1 + \cdots + \beta_pv_p$, for $\alpha_i, \beta_i \in F$ and $v_i \in S$, $1 \le i \le p$. Equating the two expressions for v gives
$$(\alpha_1 - \beta_1)v_1 + \cdots + (\alpha_p - \beta_p)v_p = \mathbf{0}. \qquad (\S)$$
As $\{v_1, \ldots, v_p\} \subseteq S$ is a linearly independent subset of LS(S), the system $c_1v_1 + \cdots + c_pv_p = \mathbf{0}$ in the unknowns $c_1, \ldots, c_p$ has only the trivial solution. Thus, each of the scalars $\alpha_i - \beta_i$ appearing in Equation $(\S)$ must equal 0. That is, $\alpha_i - \beta_i = 0$ for $1 \le i \le p$. Thus, for $1 \le i \le p$, $\alpha_i = \beta_i$, and the result follows.
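A sketch of the Theorem and Example of Section 3.2.B above (assuming Python with sympy; the matrix A is our own example, not the one elided in the Example): the pivot positions of RREF(A) select linearly independent columns of A.

```python
# Pivot columns of RREF(A) point to an independent set of columns of A itself.
import sympy as sp

A = sp.Matrix([[1, 2, 4, 3],
               [2, 4, 8, 7],
               [1, 2, 4, 4]])

B, pivots = A.rref()
print(pivots)                      # (0, 3): the 1st and 4th columns
C = A[:, list(pivots)]             # the corresponding columns of A
assert C.rank() == len(pivots)     # they are linearly independent
```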

Exercise
1. Consider the Euclidean plane $\mathbb{R}^2$. Let $u_1 = (1,0)^T$. Determine a condition on $u_2$ such that $\{u_1, u_2\}$ is a linearly independent subset of $\mathbb{R}^2$.
2. Let $S = \{(1,1,1,1)^T, (1,-1,1,2)^T, (1,1,-1,1)^T\} \subseteq \mathbb{R}^4$. Does $(1,1,2,1)^T \in LS(S)$? Furthermore, determine conditions on x, y, z and u such that $(x,y,z,u)^T \in LS(S)$.
3. Show that $S = \{(1,2,3)^T, (-2,1,1)^T, (8,6,10)^T\} \subseteq \mathbb{R}^3$ is linearly dependent.
4. Prove that $\{u_1, \ldots, u_n\} \subseteq \mathbb{C}^n$ is linearly independent if and only if $\{Au_1, \ldots, Au_n\}$ is linearly independent for every invertible matrix A.
5. Let V be a complex vector space and let $A \in M_n(\mathbb{C})$ be invertible. Then $\{u_1, \ldots, u_n\} \subseteq V$ is linearly independent if and only if $\{w_1, \ldots, w_n\} \subseteq V$ is linearly independent, where $w_i = \sum\limits_{j=1}^{n} a_{ij}u_j$, for $1 \le i \le n$.
6. Find $u, v, w \in \mathbb{R}^4$ such that {u, v, w} is linearly dependent whereas {u,v}, {u,w} and {v,w} are linearly independent.
7. Is $\{(1,0)^T, (i,0)^T\}$ a linearly independent subset of the real vector space $\mathbb{C}^2$?
8. Suppose V is a collection of vectors such that V is a real as well as a complex vector space. Then prove that $\{u_1, \ldots, u_k, iu_1, \ldots, iu_k\}$ is a linearly independent subset of V over R if and only if $\{u_1, \ldots, u_k\}$ is a linearly independent subset of V over C.
9. Let V be a vector space and M a subspace of V. For $u, v \in V \setminus M$, define K = LS(M, u) and H = LS(M, v). Then prove that $v \in K$ if and only if $u \in H$.
10. Let $A \in M_n(\mathbb{R})$. Suppose $x, y \in \mathbb{R}^n \setminus \{\mathbf{0}\}$ are such that Ax = 3x and Ay = 2y. Then prove that x and y are linearly independent.
11. Let A = ⋯ be a 3×3 matrix. Determine $x, y, z \in \mathbb{R}^3 \setminus \{\mathbf{0}\}$ such that Ax = 6x, Ay = 2y and Az = 2z. Use the vectors x, y and z obtained above to prove the following.
(a) $A^2v = 4v$, where v = cy + dz for any $c, d \in \mathbb{R}$.
(b) The set {x, y, z} is linearly independent.
(c) Let P = [x, y, z] be a 3×3 matrix. Then P is invertible.
(d) Let D = diag(6, 2, 2). Then AP = PD.
12. Prove that the rows/columns of
(a) $A \in M_n(\mathbb{C})$ are linearly independent if and only if $\det(A) \ne 0$.
(b) $A \in M_n(\mathbb{C})$ span $\mathbb{C}^n$ if and only if A is an invertible matrix.
(c) a skew-symmetric matrix A of odd order are linearly dependent.

(c) a skew-symmetric matrix A of odd order are linearly dependent.
13. Let P and Q be subspaces of R^n such that P + Q = R^n and P ∩ Q = {0}. Prove that each u ∈ R^n is uniquely expressible as u = u_P + u_Q, where u_P ∈ P and u_Q ∈ Q.

3.3 Basis of a Vector Space

Definition Let S be a subset of a set T. Then S is said to be a maximal subset of T having property P if
1. S has property P and
2. no proper superset S′ of S contained in T has property P.

Example Let T = {2,3,4,7,8,10,12,13,14,15}. Then a maximal subset of T of consecutive integers is S = {2,3,4}. Other maximal subsets are {7,8}, {10} and {12,13,14,15}. Note that {12,13} is not maximal. Why?

Definition Let V be a vector space over F. Then S is called a maximal linearly independent subset of V if
1. S is linearly independent and
2. no proper superset S′ of S in V is linearly independent.

Example 1. In R^3, the set S = {e_1, e_2} is linearly independent but not maximal as S ∪ {(1,1,1)^T} is a linearly independent set containing S.
2. In R^3, S = {(1,0,0)^T, (1,1,0)^T, (1,1,−1)^T} is a maximal linearly independent set as any collection of 4 or more vectors from R^3 is linearly dependent (see Corollary ).
3. Let S = {v_1,...,v_k} ⊆ R^n. Now, form the matrix A = [v_1,...,v_k] and let B = RREF(A). Then, using Theorem , we see that if B[:,i_1],...,B[:,i_r] are the pivotal columns of B then {v_{i_1},...,v_{i_r}} is a maximal linearly independent subset of S.

Theorem Let V be a vector space over F and S a linearly independent set in V. Then S is maximal linearly independent if and only if LS(S) = V.

Proof. Let v ∈ V. As S is linearly independent, using Corollary , the set S ∪ {v} is linearly independent if and only if v ∈ V \ LS(S). Thus, S is maximal if and only if no such v exists, that is, if and only if LS(S) = V.

Let V = LS(S) for some set S with |S| = k. Then, using Theorem , we see that if T ⊆ V is linearly independent then |T| ≤ k. Hence, a maximal linearly independent subset of V can have at most k vectors. Thus, we arrive at the following important result.

Theorem Let V be a vector space over F and let S and T be two finite maximal linearly independent subsets of V. Then |S| = |T|.

Proof. By Theorem , S and T are maximal linearly independent if and only if LS(S) = V = LS(T). Now, use the previous paragraph twice, with the roles of S and T interchanged, to get the required result.

Definition Let V be a vector space over F with V ≠ {0}. Suppose V has a finite maximal linearly independent set S. Then the number of elements in S is called the dimension of V, denoted dim(V). By convention, dim({0}) = 0.

Example 1. As {π} is a maximal linearly independent subset of R, dim(R) = 1.
2. As {(1,0,1)^T, (0,1,1)^T, (1,1,0)^T} ⊆ R^3 is maximal linearly independent, dim(R^3) = 3.
3. As {e_1,...,e_n} is a maximal linearly independent set in R^n, dim(R^n) = n.
4. As {e_1,...,e_n} is a maximal linearly independent subset of the complex vector space C^n, dim(C^n) = n.
5. Using Exercise , {e_1,...,e_n, ie_1,...,ie_n} is a maximal linearly independent subset of the real vector space C^n. Thus, as a real vector space, dim(C^n) = 2n.
6. Let S = {v_1,...,v_k} ⊆ R^n. Define A = [v_1,...,v_k]. Then, using Example , we see that dim(LS(S)) = Rank(A).

Definition [Basis of a Vector Space] Let V be a vector space over F with V ≠ {0}. Then a maximal linearly independent subset of V is called a basis of V. The vectors in a basis are called basis vectors. Note that a basis of {0} is either not defined or is taken to be the empty set.

Definition Let V be a vector space over F with V ≠ {0}. Then a set S ⊆ V is called minimal spanning if LS(S) = V and no proper subset of S spans V.

Example [Standard Basis] Fix n ∈ N and let e_i = I_n[:,i], the i-th column of the identity matrix. Then B = {e_1,...,e_n} is called the standard basis of R^n or C^n. In particular,
1. B = {e_1} = {1} is the standard basis of R over R.
2. B = {e_1, e_2} with e_1 = (1,0)^T and e_2 = (0,1)^T is the standard basis of R^2.
3. B = {e_1, e_2, e_3} = {(1,0,0)^T, (0,1,0)^T, (0,0,1)^T} is the standard basis of R^3.
4. {1} is a basis of C over C.
5. {e_1,...,e_n, ie_1,...,ie_n} is a basis of C^n over R. So, {1, i} is a basis of C over R.

Example 1. Note that {−2} is a basis and a minimal spanning set of R.
2. Let u_1, u_2, u_3 ∈ R^2. Then {u_1, u_2, u_3} can neither be a basis nor a minimal spanning subset of R^2.
3. {(1,1,−1)^T, (1,−1,1)^T, (−1,1,1)^T} is a basis and a minimal spanning subset of R^3.
4. Let V = {(x,y,0)^T | x, y ∈ R} ⊆ R^3. Then B = {(1,0,0)^T, (1,3,0)^T} is a basis of V.
5. Let V = {(x,y,z)^T ∈ R^3 | x+y−z = 0} ⊆ R^3. As each element (x,y,z)^T ∈ V satisfies x+y−z = 0, or equivalently z = x+y, we see that (x,y,z) = (x,y,x+y) = (x,0,x) + (0,y,y) = x(1,0,1) + y(0,1,1). Hence, {(1,0,1)^T, (0,1,1)^T} forms a basis of V.
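The observation dim(LS(S)) = Rank(A) above reduces the computation of a dimension to the computation of a matrix rank, which is easy to try numerically. A small sketch in Python; the vectors are illustrative choices, not from the text.

```python
import numpy as np

# S = {v1, v2, v3} in R^3; dim(LS(S)) = Rank(A) for A = [v1, v2, v3].
v1 = np.array([1., 1., 1.])
v2 = np.array([1., 2., 3.])
v3 = v1 + v2                       # deliberately dependent on v1 and v2
A = np.column_stack([v1, v2, v3])

print(np.linalg.matrix_rank(A))    # 2, so dim(LS(S)) = 2
```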

6. Let S = {1,2,...,n} and consider the vector space R^S (see Example ). For 1 ≤ i ≤ n, define e_i(j) = 1 if j = i and e_i(j) = 0 if j ≠ i. Prove that B = {e_1,...,e_n} is a linearly independent set. Is it a basis of R^S?
7. Let S = R^n and consider the vector space R^S (see Example ). For 1 ≤ i ≤ n, define the functions e_i(x) = e_i((x_1,...,x_n)) = x_i. Then, verify that {e_1,...,e_n} is a linearly independent set. Is it a basis of R^S?
8. Let S = {v_1,...,v_k} ⊆ R^n. Define A = [v_1,...,v_k]. Then, using Theorem , the columns of A corresponding to the pivotal columns in RREF(A) form a basis as well as a minimal spanning subset of LS(S).
9. Let S = {a_1,...,a_n}. Then recall that R^S is a real vector space (see Example 8). For 1 ≤ i ≤ n, define f_i(a_j) = 1 if j = i and 0 otherwise. Then, verify that {f_1,...,f_n} is a basis of R^S. What can you say if S is a countable set?
10. Recall the vector space C[a,b], where a < b ∈ R. For each α ∈ [a,b] ∩ Q, define f_α(x) = x^α, for all x ∈ [a,b]. Is the set {f_α | α ∈ [a,b] ∩ Q} linearly independent? What if α ∈ [a,b]? Can we write any function in C[a,b] as a finite linear combination of the f_α's? Give reasons for your answer.

3.3.A Main Results associated with Bases

Theorem Let V ≠ {0} be a vector space over F. Then the following statements are equivalent.
1. B is a basis (maximal linearly independent subset) of V.
2. B is linearly independent and it spans V.
3. B is a minimal spanning set of V.

Proof. 1 ⇒ 2: By definition, every basis is a maximal linearly independent subset of V. Thus, using Corollary , we see that B spans V.
2 ⇒ 3: As B is linearly independent, for any x ∈ B, we have x ∉ LS(B \ {x}). Hence LS(B \ {x}) ≠ V, so no proper subset of B spans V; that is, B is a minimal spanning set.
3 ⇒ 1: If B is linearly dependent then, using Corollary , B is not minimal spanning, a contradiction. Hence, B is linearly independent. We now need to show that B is a maximal linearly independent set. Since B spans V, for any x ∈ V \ B, the set B ∪ {x} is linearly dependent. That is, every proper superset of B is linearly dependent. Hence, the required result follows.

Now, using Lemma , we get the following result.

Remark Let B be a basis of a vector space V over F. Then, for each v ∈ V, there exist unique u_i ∈ B and unique α_i ∈ F, for 1 ≤ i ≤ n, such that v = Σ_{i=1}^{n} α_i u_i.

The next result is generally stated as: in a finite dimensional vector space, every linearly independent set can be extended to form a basis.

Theorem Let V be a vector space over F with dim(V) = n. If S is a linearly independent subset of V then there exists a basis T of V such that S ⊆ T.

Proof. If LS(S) = V, we are done. Else, choose u_1 ∈ V \ LS(S). Thus, by Corollary , the set S ∪ {u_1} is linearly independent. We repeat this process until we get n vectors, as dim(V) = n. By Theorem , the resulting set T is indeed a required basis.

3.3.B Constructing a Basis of a Finite Dimensional Vector Space

We end this section with an algorithm based on the proof of the previous theorem (a computational sketch is given below).

Step 1: Let v_1 ∈ V with v_1 ≠ 0. Then {v_1} is linearly independent.
Step 2: If V = LS(v_1), we have got a basis of V. Else, pick v_2 ∈ V \ LS(v_1). Then, by Corollary , {v_1, v_2} is linearly independent.
Step i: Either V = LS(v_1,...,v_i) or LS(v_1,...,v_i) ≠ V. In the first case, {v_1,...,v_i} is a basis of V. Else, pick v_{i+1} ∈ V \ LS(v_1,...,v_i). Then, by Corollary , the set {v_1,...,v_{i+1}} is linearly independent.
This process must terminate, as V is a finite dimensional vector space.

Exercise 1. Let B = {u_1,...,u_n} be a basis of a vector space V over F. Then, does the condition Σ_{i=1}^{n} α_i u_i = 0 in the α_i's imply that α_i = 0, for 1 ≤ i ≤ n?
2. Find a basis of R^3 containing the vector (1,1,−2)^T.
3. Find a basis of R^3 containing the vectors (1,1,−2)^T and (1,2,−1)^T.
4. Is it possible to find a basis of R^4 containing the vectors (1,1,1,−2)^T, (1,2,−1,1)^T and (1,−2,7,−11)^T?
5. Let S = {v_1,...,v_p} be a subset of a vector space V over F. Suppose LS(S) = V but S is not a linearly independent set. Then does this imply that each v ∈ V is expressible in more than one way as a linear combination of vectors from S?
6. Show that B = {(1,0,1)^T, (1,i,0)^T, (1,1,1−i)^T} is a basis of C^3 over C.
7. Find a basis of C^3 over R containing the basis B given in Example .
8. Determine a basis and dimension of W = {(x,y,z,w)^T ∈ R^4 | x+y−z+w = 0}.
9. Find a basis of V = {(x,y,z,u) ∈ R^4 | x−y−z = 0, x+z−u = 0}.
10. Let A = . Find a basis of V = {x ∈ R^5 | Ax = 0}.
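Before the remaining exercises, here is the promised computational sketch of the algorithm in Section 3.3.B, specialized to subspaces of R^n. The starting vector is the one from Exercise 2 above; all other choices are illustrative. A candidate vector is added exactly when it lies outside the current linear span, which is detected by a rank increase.

```python
import numpy as np

def extend_to_basis(indep, candidates):
    """Greedily extend the linearly independent list `indep` by vectors
    from `candidates`: keep v only if it increases the rank, i.e. only
    if v lies outside the span of the vectors collected so far."""
    basis = list(indep)
    for v in candidates:
        trial = np.column_stack(basis + [v])
        if np.linalg.matrix_rank(trial) == len(basis) + 1:
            basis.append(v)
    return basis

S = [np.array([1., 1., -2.])]                 # as in Exercise 2 above
E = [np.eye(3)[:, i] for i in range(3)]       # candidate vectors: e1, e2, e3
B = extend_to_basis(S, E)
print(len(B))                                 # 3: a basis of R^3 containing S
```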

11. Prove that B = {1, x, ..., x^n, ...} is a basis of R[x]. B is called the standard basis of R[x].
12. Let u^T = (1,1,−2), v^T = (−1,2,3) and w^T = (1,10,1). Find a basis of LS(u,v,w). Determine a geometrical representation of LS(u,v,w).
13. Let V be a vector space of dimension n. Then any set
(a) consisting of n linearly independent vectors forms a basis of V.
(b) S in V having n vectors with LS(S) = V forms a basis of V.
14. Let {v_1,...,v_n} be a basis of C^n. Then prove that the two matrices B = [v_1,...,v_n] (with the v_i's as columns) and C (with the v_i^T's as rows) are invertible.
15. Let W_1 and W_2 be two subspaces of a vector space V such that W_1 ⊆ W_2. Show that W_1 = W_2 if and only if dim(W_1) = dim(W_2).
16. Consider the vector space C([−π, π]) over R. For each n ∈ N, define e_n(x) = sin(nx). Then prove that S = {e_n | n ∈ N} is linearly independent. [Hint: We need to show that every finite subset of S is linearly independent. So, on the contrary, assume that there exist l ∈ N and functions e_{k_1},...,e_{k_l} such that α_1 e_{k_1} + ... + α_l e_{k_l} = 0 for some scalars α_1,...,α_l, not all zero. The above is equivalent to α_1 sin(k_1 x) + ... + α_l sin(k_l x) = 0 for all x ∈ [−π, π]. Now, in the integral ∫_{−π}^{π} sin(mx)(α_1 sin(k_1 x) + ... + α_l sin(k_l x)) dx, replace m with the k_i's to show that α_i = 0, for all i, 1 ≤ i ≤ l, to get the required contradiction.]
17. Let V be a vector space over F with dim(V) = n. If W_1 is a k-dimensional subspace of V then prove that there exists a subspace W_2 of V such that W_1 ∩ W_2 = {0}, W_1 + W_2 = V and dim(W_2) = n − k. Also, prove that for each v ∈ V there exist unique vectors w_1 ∈ W_1 and w_2 ∈ W_2 such that v = w_1 + w_2. The subspace W_2 is called the complementary subspace of W_1 in V.
18. Is the set W = {p(x) ∈ R[x;4] | p(−1) = p(1) = 0} a subspace of R[x;4]? If yes, find its dimension.

3.4 Application to the subspaces of C^n

In this subsection, we study results that are intrinsic to the understanding of linear algebra from the point of view of matrices, especially the fundamental subspaces (see Definition ) associated with matrices. We start with an example.

Example 1. Compute the fundamental subspaces for A = .
Solution: Verify the following.

(a) Col(A*) = Row(A) = {(x,y,z,u)^T ∈ C^4 | 3x−2y = z, 5x−3y+u = 0}.
(b) Col(A) = Row(A*) = {(x,y,z)^T ∈ C^3 | 4x−3y−z = 0}.
(c) Null(A) = {(x,y,z,u)^T ∈ C^4 | x+3z−5u = 0, y−2z+3u = 0}.
(d) Null(A*) = {(x,y,z)^T ∈ C^3 | x+4z = 0, y−3z = 0}.

2. Let A = . Find a basis and dimension of Null(A).
Solution: Writing the basic variables x_1, x_3 and x_6 in terms of the free variables x_2, x_4, x_5 and x_7, we get x_1 = x_7 − x_2 − x_4 − x_5, x_3 = 2x_7 − 2x_4 − 3x_5 and x_6 = −x_7. Hence, the solution set has the form

(x_1, x_2, x_3, x_4, x_5, x_6, x_7)^T = x_2(−1,1,0,0,0,0,0)^T + x_4(−1,0,−2,1,0,0,0)^T + x_5(−1,0,−3,0,1,0,0)^T + x_7(1,0,2,0,0,−1,1)^T. ( )

Now, let u_1^T = [−1,1,0,0,0,0,0], u_2^T = [−1,0,−2,1,0,0,0], u_3^T = [−1,0,−3,0,1,0,0] and u_4^T = [1,0,2,0,0,−1,1]. Then S = {u_1, u_2, u_3, u_4} is a basis of Null(A). The reasons for S to be a basis are as follows:
(a) By Equation ( ), Null(A) = LS(S).
(b) For linear independence, the homogeneous system c_1 u_1 + c_2 u_2 + c_3 u_3 + c_4 u_4 = 0 in the unknowns c_1, c_2, c_3 and c_4 has only the trivial solution as
i. u_4 is the only vector with a non-zero entry at the 7-th place (u_4 corresponds to x_7) and hence c_4 = 0.
ii. u_3 is the only vector with a non-zero entry at the 5-th place (u_3 corresponds to x_5) and hence c_3 = 0.
iii. Similar arguments hold for the unknowns c_2 and c_1.

Exercise 1. Let A = and B = . Find RREF(A) and RREF(B).
2. Find invertible matrices P_1 and P_2 such that P_1 A = RREF(A) and P_2 B = RREF(B).
3. Find bases for Col(A), Col(A*), Col(B) and Col(B*).
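The hand computation in item 2 above is exactly what a computer algebra system performs when asked for a null space. A sketch in Python's sympy, assuming a stand-in matrix whose rows encode the same relations as the example; the original matrix was not reproduced in these notes, so this choice is only a guess consistent with the stated solution.

```python
import sympy as sp

# Assumed stand-in for the example's matrix: its rows encode the relations
# x1 + x2 + x4 + x5 - x7 = 0, x3 + 2x4 + 3x5 - 2x7 = 0, x6 + x7 = 0.
A = sp.Matrix([[1, 1, 0, 1, 1, 0, -1],
               [0, 0, 1, 2, 3, 0, -2],
               [0, 0, 0, 0, 0, 1, 1]])

# nullspace() solves for the basic variables in terms of the free ones
# and returns one basis vector per free variable (here: x2, x4, x5, x7).
for u in A.nullspace():
    print(u.T)      # reproduces u1, u2, u3, u4 of the example
```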

4. Find bases of Null(A), Null(A*), Null(B) and Null(B*).
5. Find the dimensions of all the vector subspaces so obtained.
6. Does there exist a relationship between the bases of the different spaces?

Lemma Let A ∈ M_{m×n}(C) and let E be an elementary matrix. If
1. B = EA then Col(A*) = Col(B*) and Col(A^T) = Col(B^T). Hence dim(Col(A*)) = dim(Col(B*)) and dim(Col(A^T)) = dim(Col(B^T)).
2. B = AE then Col(A) = Col(B). Hence dim(Col(A)) = dim(Col(B)).

Proof. Note that B = EA if and only if B* = A*E*. As E is invertible, A and B are equivalent and hence they have the same RREF. Also, E* is invertible and hence A* and B* have the same RREF. Now, use Theorem to get the required result. For the second part, note that B* = E*A* and E* is invertible. Hence, using the first part, Col((A*)*) = Col((B*)*), or equivalently, Col(A) = Col(B).

Let A ∈ M_{m×n}(C) and let B = RREF(A). Then, as an immediate application of Lemma , we get dim(Col(A*)) = Row rank(A). Hence, dim(Col(A)) = Column rank(A) as well. We now prove that Row rank(A) = Column rank(A).

Theorem Let A ∈ M_{m×n}(C). Then Row rank(A) = Column rank(A).

Proof. Let Row rank(A) = r = dim(Col(A^T)). Then there exist indices i_1,...,i_r such that {A[i_1,:],...,A[i_r,:]} forms a basis of Col(A^T). Let B be the r × n matrix whose rows are A[i_1,:],...,A[i_r,:]; its rows are then a basis of Col(A^T). Therefore, there exist α_{ij} ∈ C, 1 ≤ i ≤ m, 1 ≤ j ≤ r, such that A[t,:] = [α_{t1},...,α_{tr}]B, for 1 ≤ t ≤ m. So, using matrix multiplication (see Remark ), stacking these rows gives A = CB, where C = [α_{ij}] is an m × r matrix. Thus, matrix multiplication implies that each column of A is a linear combination of the r columns of C. Hence, Column rank(A) = dim(Col(A)) ≤ r = Row rank(A). A similar argument gives Row rank(A) ≤ Column rank(A). Hence, we have the required result.

Remark The proof also shows that for every A ∈ M_{m×n}(C) of rank r there exist matrices B (of size r × n) and C (of size m × r), each of rank r, such that A = CB.

Let W_1 and W_2 be two subspaces of a vector space V over F. Then recall that (see Exercise d) W_1 + W_2 = {u + v | u ∈ W_1, v ∈ W_2} = LS(W_1 ∪ W_2) is the smallest subspace of V containing both W_1 and W_2. We now state a result analogous to the counting identity |A| + |B| = |A ∪ B| + |A ∩ B| for finite sets A and B (for a proof, see Appendix ).
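The factorization A = CB of the remark can be produced explicitly from the RREF: take C to be the pivotal columns of A and B the nonzero rows of RREF(A). A minimal sympy sketch with a made-up rank-2 matrix:

```python
import sympy as sp

A = sp.Matrix([[1, 2, 3],
               [2, 4, 6],
               [1, 1, 1]])          # made-up example of rank 2

R, pivots = A.rref()
r = len(pivots)
C = A[:, list(pivots)]              # m x r, rank r: pivotal columns of A
B = R[:r, :]                        # r x n, rank r: nonzero rows of RREF(A)

print(A == C * B)                   # True: A = CB
print(C.rank(), B.rank())           # 2 2: row rank = column rank
```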

Theorem Let V be a finite dimensional vector space over F. If W_1 and W_2 are two subspaces of V then

dim(W_1) + dim(W_2) = dim(W_1 + W_2) + dim(W_1 ∩ W_2). ( )

For better understanding, we give an example for finite subsets of R^n. The example uses Theorem to obtain bases of LS(S), for different choices of S. The readers are advised to see Example before proceeding further.

Example Let V and W be two subspaces of R^5 with V = {(v,w,x,y,z)^T ∈ R^5 | v+x+z = 3y} and W = {(v,w,x,y,z)^T ∈ R^5 | w−x = z, v = y}. Find bases of V and W containing a basis of V ∩ W.
Solution: (v,w,x,y,z)^T ∈ V ∩ W if v, w, x, y and z satisfy v+x−3y+z = 0, w−x−z = 0 and v = y. The solution of the system is given by

(v,w,x,y,z)^T = (y, 2y, x, y, 2y−x)^T = y(1,2,0,1,2)^T + x(0,0,1,0,−1)^T.

Thus, B = {(1,2,0,1,2)^T, (0,0,1,0,−1)^T} is a basis of V ∩ W. Similarly, a basis of V is given by C = {(−1,0,1,0,0)^T, (0,1,0,0,0)^T, (3,0,0,1,0)^T, (−1,0,0,0,1)^T} and that of W is given by D = {(1,0,0,1,0)^T, (0,1,1,0,0)^T, (0,1,0,0,1)^T}. To find the required basis, form a matrix whose rows are the vectors in B, C and D (see Equation ( )) and apply row operations other than E_{ij}. Then, after a few row operations, we get the matrix in ( ). Thus, a required basis of V is {(1,2,0,1,2)^T, (0,0,1,0,−1)^T, (0,1,0,0,0)^T, (0,0,0,1,3)^T}. Similarly, a required basis of W is {(1,2,0,1,2)^T, (0,0,1,0,−1)^T, (0,1,0,0,1)^T}.

Exercise 1. Give an example to show that if A and B are equivalent then Col(A) need not equal Col(B).
2. Let V = {(x,y,z,w)^T ∈ R^4 | x+y−z+w = 0, x+y+z+w = 0, x+2y = 0} and W = {(x,y,z,w)^T ∈ R^4 | x−y−z+w = 0, x+2y−w = 0} be two subspaces of R^4. Find bases and dimensions of V, W, V ∩ W and V + W.
3. Let W_1 and W_2 be 4-dimensional subspaces of a vector space V of dimension 7. Then prove that dim(W_1 ∩ W_2) ≥ 1.
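Before the remaining exercises, note that the dimension formula just proved gives a painless way to compute dim(W_1 ∩ W_2) numerically: dim(W_1 + W_2) is the rank of the matrix obtained by placing basis vectors of W_1 and W_2 side by side. A sketch with made-up subspaces of R^4:

```python
import numpy as np

e = np.eye(4)
B1 = np.column_stack([e[:, 0], e[:, 1]])   # W1 = LS(e1, e2)
B2 = np.column_stack([e[:, 1], e[:, 2]])   # W2 = LS(e2, e3)

d1 = np.linalg.matrix_rank(B1)             # dim(W1) = 2
d2 = np.linalg.matrix_rank(B2)             # dim(W2) = 2
d12 = np.linalg.matrix_rank(np.hstack([B1, B2]))   # dim(W1 + W2) = 3

print(d1 + d2 - d12)                       # dim(W1 ∩ W2) = 1
```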

4. Let W_1 and W_2 be two subspaces of a vector space V. If dim(W_1) + dim(W_2) > dim(V), then prove that dim(W_1 ∩ W_2) ≥ 1.
5. Let A ∈ M_{m×n}(C) with m < n. Prove that the columns of A are linearly dependent.

We now prove the rank-nullity theorem and give some of its consequences.

Theorem (Rank-Nullity Theorem). Let A ∈ M_{m×n}(C). Then

dim(Col(A)) + dim(Null(A)) = n. ( )

Proof. Let dim(Null(A)) = r ≤ n and let B = {u_1,...,u_r} be a basis of Null(A). Since B is a linearly independent set in C^n, extend it to get a basis {u_1,...,u_n} of C^n. Then,

Col(A) = LS(Au_1,...,Au_n) = LS(0,...,0, Au_{r+1},...,Au_n) = LS(Au_{r+1},...,Au_n).

So, C = {Au_{r+1},...,Au_n} spans Col(A). We further need to show that C is linearly independent. So, consider the linear system

α_1 Au_{r+1} + ... + α_{n−r} Au_n = 0, i.e., A(α_1 u_{r+1} + ... + α_{n−r} u_n) = 0, ( )

in the unknowns α_1,...,α_{n−r}. Thus, α_1 u_{r+1} + ... + α_{n−r} u_n ∈ Null(A) = LS(B). Therefore, there exist scalars β_j, 1 ≤ j ≤ r, such that Σ_{i=1}^{n−r} α_i u_{r+i} = Σ_{j=1}^{r} β_j u_j. Or equivalently,

β_1 u_1 + ... + β_r u_r − α_1 u_{r+1} − ... − α_{n−r} u_n = 0. ( )

As {u_1,...,u_n} is a linearly independent set, the only solution of Equation ( ) is α_i = 0, for 1 ≤ i ≤ n−r, and β_j = 0, for 1 ≤ j ≤ r. In other words, we have shown that the only solution of Equation ( ) is the trivial solution. Hence, {Au_{r+1},...,Au_n} is a basis of Col(A). Thus, the required result follows.

Theorem is part of what is known as the fundamental theorem of linear algebra (see Theorem ). The following are some of the consequences of the rank-nullity theorem. The proof is left as an exercise for the reader.

Exercise 1. Let A ∈ M_{m,n}(C). If
(a) n > m then the system Ax = 0 has infinitely many solutions,
(b) n < m then there exists b ∈ C^m \ {0} such that Ax = b is inconsistent.
2. The following statements are equivalent for an m × n matrix A.
(a) Rank(A) = k.
(b) There exists a set of k rows of A that are linearly independent.
(c) There exists a set of k columns of A that are linearly independent.

(d) dim(Col(A)) = k.
(e) There exists a k × k submatrix B of A with det(B) ≠ 0. Further, the determinant of every (k+1) × (k+1) submatrix of A is zero.
(f) There exists a linearly independent subset {b_1,...,b_k} of R^m such that the system Ax = b_i, for 1 ≤ i ≤ k, is consistent.
(g) dim(Null(A)) = n − k.

3.5 Ordered Bases

Let V be a vector subspace of C^n for some n ∈ N with dim(V) = k. Then, a basis of V may not look like a standard basis, and our problem may force us to look for some other basis. In such a case, it is always helpful to fix the vectors in a particular order and then concentrate only on the coefficients of the vectors, as was done for systems of linear equations, where we did not worry about the unknowns. It may also happen that k is very small compared to n, in which case it is better to work with coordinate vectors of the smaller size k.

Definition [Ordered Basis, Basis Matrix] Let V be a vector space over F with a basis B = {u_1,...,u_n}. Then an ordered basis for V is a basis B together with a one-to-one correspondence between B and {1,2,...,n}. Since there is an order among the elements of B, we write B = [u_1,...,u_n]. The matrix B = [u_1,...,u_n] is called the basis matrix. Thus, B = [u_1, u_2,...,u_n] is different from C = [u_2, u_3,...,u_n, u_1], and both of them are different from D = [u_n, u_{n−1},...,u_2, u_1], even though they have the same set of vectors as elements.

We now define the notion of coordinates of a vector with respect to an ordered basis.

Definition [Coordinates of a Vector] Let B = [v_1,...,v_n] be the basis matrix corresponding to an ordered basis B of V. Since B is a basis of V, for each v ∈ V, there exist β_i, 1 ≤ i ≤ n, such that v = Σ_{i=1}^{n} β_i v_i = B(β_1,...,β_n)^T. The column vector (β_1,...,β_n)^T is called the coordinates of v with respect to B, denoted [v]_B. Thus, using this notation, v = B[v]_B.

Example 1. Let B = [(1,1)^T, (1,2)^T] be an ordered basis of R^2, with basis matrix B = [(1,1)^T, (1,2)^T]. Then (π, e)^T = B[(π,e)^T]_B, and thus [(π,e)^T]_B = B^{-1}(π,e)^T.
2. Consider the vector space R[x;2] with basis {1−x, 1+x, x^2}. Then an ordered basis can either be B = [1−x, 1+x, x^2] or C = [1+x, 1−x, x^2] or ... . Note that there are 3! different ordered bases. Also, for a_0 + a_1 x + a_2 x^2 ∈ R[x;2], one has

a_0 + a_1 x + a_2 x^2 = ((a_0 − a_1)/2)(1−x) + ((a_0 + a_1)/2)(1+x) + a_2 x^2.

Thus, [a_0 + a_1 x + a_2 x^2]_B = ((a_0 − a_1)/2, (a_0 + a_1)/2, a_2)^T, whereas [a_0 + a_1 x + a_2 x^2]_C = ((a_0 + a_1)/2, (a_0 − a_1)/2, a_2)^T.
3. Let V = {(x,y,z)^T | x+y = z}. If B = [(−1,1,0)^T, (1,0,1)^T] is an ordered basis of V then [(x,y,x+y)^T]_B = (y, x+y)^T, and hence [(3,4,7)^T]_B = (4,7)^T.
4. Let V = {(v,w,x,y,z)^T ∈ R^5 | w−x = z, v = y, v+x+z = 3y}. So, if B = [(1,2,0,1,2)^T, (0,0,1,0,−1)^T] is an ordered basis then [(3,6,5,3,1)^T]_B = (3,5)^T.

Let V be a vector space over F with dim(V) = n. Let A = [v_1,...,v_n] and B = [u_1,...,u_n] be basis matrices corresponding to the ordered bases B and C, respectively. So, using the notation in Definition , we have

A = [v_1,...,v_n] = [B[v_1]_C,...,B[v_n]_C] = B[[v_1]_C,...,[v_n]_C].

So, the matrix [A]_C = [[v_1]_C,...,[v_n]_C], denoted [B]_C, is called the matrix of B with respect to the ordered basis C, or the change of basis matrix from B to C. We now summarize the above discussion.

Theorem Let V be a vector space over F with dim(V) = n. Let B = [v_1,...,v_n] and C = [u_1,...,u_n] be two ordered bases of V.
1. Then, the matrix [B]_C is invertible and [w]_C = [B]_C [w]_B, for all w ∈ V.
2. Similarly, the matrix [C]_B is invertible and [w]_B = [C]_B [w]_C, for all w ∈ V.
3. Furthermore, ([B]_C)^{-1} = [C]_B.

Proof. We prove all the parts together. Let A = [v_1,...,v_n], B = [u_1,...,u_n], C = [B]_C and D = [C]_B. Then, by the previous paragraph, A = BC. Similarly, B = [u_1,...,u_n] = [A[u_1]_B,...,A[u_n]_B] = A[[u_1]_B,...,[u_n]_B] = AD. But, by Exercise , A and B are invertible, and thus C = B^{-1}A and D = A^{-1}B are invertible as well. Clearly, C^{-1} = (B^{-1}A)^{-1} = A^{-1}B = D, which proves the third part. For the first two parts, note that for any w ∈ V, w = A[w]_B and w = B[w]_C. Hence, B[w]_C = w = A[w]_B = BC[w]_B = B[B]_C[w]_B, and thus [w]_C = [B]_C[w]_B. Similarly, [w]_B = [C]_B[w]_C, and the required result follows.

Example 1. Note that if C = [e_1,...,e_n] then [A]_C = [[v_1]_C,...,[v_n]_C] = [v_1,...,v_n] = A.
2. Suppose B = [(1,0,0)^T, (1,1,0)^T, (1,1,1)^T] and C = [(1,1,1)^T, (1,−1,1)^T, (1,1,0)^T] are two ordered bases of R^3. Then, verify the statements in the previous result.
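The verification asked for in item 2 can also be carried out numerically before working it out by hand. A sketch in Python using the two bases above; the columns of the arrays B and C are the basis vectors, and a coordinate vector [w]_B is obtained by solving B[w]_B = w.

```python
import numpy as np

B = np.array([[1., 1., 1.],
              [0., 1., 1.],
              [0., 0., 1.]])        # columns: (1,0,0), (1,1,0), (1,1,1)
C = np.array([[1., 1., 1.],
              [1., -1., 1.],
              [1., 1., 0.]])        # columns: (1,1,1), (1,-1,1), (1,1,0)

B_to_C = np.linalg.solve(C, B)      # [B]_C = C^{-1} B
C_to_B = np.linalg.solve(B, C)      # [C]_B = B^{-1} C

w = np.array([1., 2., 3.])          # an arbitrary w in standard coordinates
w_B = np.linalg.solve(B, w)         # [w]_B
w_C = np.linalg.solve(C, w)         # [w]_C

print(np.allclose(B_to_C @ w_B, w_C))            # [w]_C = [B]_C [w]_B
print(np.allclose(B_to_C @ C_to_B, np.eye(3)))   # ([B]_C)^{-1} = [C]_B
```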

Working by hand:
(a) For v = (x,y,z)^T, [v]_B = (x−y, y−z, z)^T.
(b) Similarly, [v]_C = (1/2)(−x+y+2z, x−y, 2x−2z)^T.
(c) Verify that [B]_C has rows (−1/2, 0, 1), (1/2, 0, 0), (1, 1, 0) and [C]_B has rows (0, 2, 0), (0, −2, 1), (1, 1, 0), and that these two matrices are inverses of each other.

Remark Let V be a vector space over F with B = [v_1,...,v_n] as an ordered basis. Then, by Theorem , [v]_B is an element of F^n, for each v ∈ V. Therefore, if
1. F = R then the elements of V look like the elements of R^n.
2. F = C then the elements of V look like the elements of C^n.

Exercise Let B = [(1,2,0)^T, (1,3,2)^T, (0,1,3)^T] and C = [(1,2,1)^T, (0,1,2)^T, (1,4,6)^T] be two ordered bases of R^3. Find the change of basis matrix
1. P from B to C.
2. Q from C to B.
3. from the standard basis of R^3 to B.
What do you notice? Is it true that PQ = I = QP? Give reasons for your answer.

3.6 Summary

In this chapter, we defined vector spaces over F, where the set F was either R or C. To define a vector space, we start with a non-empty set V of vectors and the set F of scalars. We then need to do the following:
1. first, define vector addition and scalar multiplication, and
2. then verify the axioms in Definition .
If all the axioms in Definition are satisfied then V is a vector space over F. If W is a non-empty subset of a vector space V over F then, for W to be a subspace, we only need to check whether the vector addition and scalar multiplication inherited from V hold in W.

We then learnt about linear combinations of vectors and the linear span of a set of vectors. It was also shown that the linear span of a subset S of a vector space V is the smallest subspace of V containing S. Also, to check whether a given vector v is a linear combination of u_1,...,u_n, we need to solve the linear system c_1 u_1 + ... + c_n u_n = v in the unknowns c_1,...,c_n. Or equivalently, the system Ax = b, where A[:,i] = u_i, 1 ≤ i ≤ n, x^T = [c_1,...,c_n] and b = v. It was also shown that finding the geometrical representation of the linear span of S = {u_1,...,u_n} is equivalent to finding conditions on the entries of b such that Ax = b is consistent.

Then, we learnt about linear independence and dependence. A set S = {u_1,...,u_n} is a linearly independent set in the vector space V over F if the homogeneous system Ax = 0 has only the trivial solution in F, where, as before, the columns of A correspond to the vectors u_i. Else, S is linearly dependent.

We then talked about the maximal linearly independent set (coming from the homogeneous system) and the minimal spanning set (coming from the non-homogeneous system), culminating in the notion of the basis of a finite dimensional vector space V over F. The following important results were proved.
1. A linearly independent set can be extended to form a basis of V.
2. Any two bases of V have the same number of elements. This number was defined as the dimension of V, denoted dim(V).

Now let A ∈ M_n(R). Then, combining a few results from the previous chapter, we have the following equivalent conditions.
1. A is invertible.
2. The homogeneous system Ax = 0 has only the trivial solution.
3. RREF(A) = I_n.
4. A is a product of elementary matrices.
5. The system Ax = b has a unique solution for every b.
6. The system Ax = b has a solution for every b.
7. Rank(A) = n.
8. det(A) ≠ 0.
9. Col(A^T) = Row(A) = R^n.
10. The rows of A form a basis of R^n.
11. Col(A) = R^n.
12. The columns of A form a basis of R^n.
13. Null(A) = {0}.
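Several of the thirteen conditions above can be tested mechanically for a given matrix; when A is invertible they should all hold together. A small sanity check in Python's sympy with a made-up matrix:

```python
import sympy as sp

A = sp.Matrix([[2, 1, 0],
               [1, 3, 1],
               [0, 1, 1]])                 # made-up 3 x 3 matrix

n = A.rows
checks = [
    A.det() != 0,                          # condition 8
    A.rank() == n,                         # condition 7
    A.rref()[0] == sp.eye(n),              # condition 3
    A.nullspace() == [],                   # condition 13
]
print(all(checks))                         # True: A is invertible
```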


Chapter 4

Linear Transformations

4.1 Definitions and Basic Properties

In the previous chapter, it was shown that if V is a real vector space with dim(V) = n then, with respect to an ordered basis, the elements of V are column vectors of size n. So, in some sense, the vectors in V look like elements of R^n. In this chapter, we concretize this idea. We also show that matrices give rise to functions between two finite dimensional vector spaces. To do so, we start with the definition of functions over vector spaces that commute with the operations of vector addition and scalar multiplication.

Definition [Linear Transformation, Linear Operator] Let V and W be vector spaces over F. A function (map) T : V → W is called a linear transformation if, for all α ∈ F and u, v ∈ V,

T(α u) = α T(u) and T(u + v) = T(u) + T(v),

where the vector addition and scalar multiplication on the left hand sides are the operations of V, and those on the right hand sides are the operations of W. By L(V,W), we denote the set of all linear transformations from V to W. In particular, if W = V then the linear transformation T is called a linear operator, and the corresponding set of linear operators is denoted by L(V).

Definition [Equality of Linear Transformations] Let S, T ∈ L(V,W). Then, S and T are said to be equal if T(x) = S(x), for all x ∈ V.

We now give examples of linear transformations.

Example 1. Let V be a vector space. Then, the maps Id, 0 ∈ L(V), where
(a) Id(v) = v, for all v ∈ V, is commonly called the identity operator.
(b) 0(v) = 0, for all v ∈ V, is commonly called the zero operator.
2. Let V and W be two vector spaces over F. Then, 0 ∈ L(V,W), where 0(v) = 0, for all v ∈ V, is commonly called the zero transformation.

88 88 CHAPTER 4. LINEAR TRANSFORMATIONS 3. The map T(x) = x, for all x R, is an element of L(R) as T(ax) = ax = at(x) and T(x+y) = x+y = T(x)+T(y). 4. Themap T(x) = (x,3x) T, forall x R, is anelement of L(R,R 2 ) as T(λx) = (λx,3λx) T = λ(x,3x) T = λt(x) and T(x+y) = (x+y,3(x+y) T = (x,3x) T +(y,3y) T = T(x)+T(y). 5. Let V,W and Z be vector spaces over F. Then, for any T L(V,W) and S L(W,Z), the map S T L(V,Z), where(s T)(v) = S ( T(v) ), forall v V, is called the composition of maps. The readers should verify that S T, in short ST, is an element of L(V,Z). 6. Fix a R n and define T(x) = x,a, for all x R n. Then T L(R n,r). For example, if (a) a = (1,...,1) T then T(x) = n x i, for all x R n. (b) a = e i, for a fixed i, 1 i n, then T i (x) = x i, for all x R n. 7. Define T : R 2 R 3 by T ( (x,y) T) = (x+y,2x y,x+3y) T. Then T L(R 2,R 3 ) with T(e 1 ) = (1,2,1) T and T(e 2 ) = (1, 1,3) T. 8. Let A M m n (C). Define T A (x) = Ax, for every x C n. Then, T A L(C n,c m ). Thus, for each A M m,n (C), there exists a map T A L(C n,c m ). 9. Define T : R n+1 R[x;n] by T ( (a 1,...,a n+1 ) T) = a 1 + a 2 x + + a n+1 x n, for (a 1,...,a n+1 ) R n+1. Then T is a linear transformation. 10. Fix A M n (C). Then T A : M n (C) M n (C) and S A : M n (C) C are both linear transformations, where T A (B) = BA and S A (B) = Tr(BA ), for every B M n (C). 11. The map T : R[x;n] R[x;n] defined by T(f(x)) = d dx f(x) = f (x), for all f(x) R[x;n] is a linear transformation. 12. The maps T,S : R[x] R[x] defined by T(f(x)) = d dx f(x) and S(f(x)) = x f(x) R[x] are linear transformations. Is it true that TS = Id? What about ST? 0 f(t)dt, for all 13. Recall the vector space R N in Example Now, define maps T,S : R N R N by T({a 1,a 2,...}) = {0,a 1,a 2,...} and S({a 1,a 2,...}) = {a 2,a 3,...}. Then, T and S, commonly called the shift operators, are linear operators with exactly one of ST or TS as the Id map. 14. Recall the vector space C(R,R) (see Example ). Then, the map g = T(f), for x each f C(R,R), defined by g(x) = f(t)dt is an element of L(C(R,R)). For example, (T(sin))(x) = 0 x sin(t)dt = 1 cos(x). So, (T(sin))(x) = 1 cos(x), for all x R. 0 We now prove that any linear transformation sends the zero vector to a zero vector. Proposition Let T L(V,W). Suppose that 0 V is the zero vector in V and 0 W is the zero vector of W. Then T(0 V ) = 0 W.

Proof. Since 0_V = 0_V + 0_V, we have T(0_V) = T(0_V + 0_V) = T(0_V) + T(0_V). Thus, T(0_V) = 0_W as T(0_V) ∈ W.

From now on, 0 will be used as the zero vector of both the domain and the codomain. The next result states that a linear transformation is known if we know its images on a basis of the domain space.

Example Does there exist a linear transformation
1. T : V → W such that T(v) ≠ 0, for all v ∈ V?
Solution: No, as T(0) = 0 (see Proposition ).
2. T : R → R such that T(x) = x^2, for all x ∈ R?
Solution: No, as T(ax) = (ax)^2 = a^2 x^2 = a^2 T(x) ≠ aT(x), unless a = 0, 1.
3. T : R → R such that T(x) = √x, for all x ∈ R?
Solution: No, as T(ax) = √(ax) = √a √x ≠ a √x = aT(x), unless a = 0, 1.
4. T : R → R such that T(x) = sin(x), for all x ∈ R?
Solution: No, as T(ax) = sin(ax) ≠ a sin(x) = aT(x), in general.
5. T : R → R such that T(5) = 10 and T(10) = 5?
Solution: No, as T(10) = T(5+5) = T(5) + T(5) = 10 + 10 = 20 ≠ 5.
6. T : R → R such that T(5) = π and T(e) = π?
Solution: No, as 5T(1) = T(5) = π implies that T(1) = π/5. So, T(e) = eT(1) = eπ/5 ≠ π.
7. T : R^2 → R^2 such that T((x,y)^T) = (x+y, 2)^T?
Solution: No, as T(0) ≠ 0.
8. T : R^2 → R^2 such that T((x,y)^T) = (x+y, xy)^T?
Solution: No, as T((2,2)^T) = (4,4)^T ≠ (4,2)^T = 2T((1,1)^T).

Theorem Let V and W be two vector spaces over F and let T ∈ L(V,W). Then T is completely determined by the images of T on the basis vectors of V.

Proof. Let {u_1,...,u_n} be a basis of V and let v ∈ V. Then, there exist c_1,...,c_n ∈ F such that v = Σ_{i=1}^{n} c_i u_i = [u_1,...,u_n](c_1,...,c_n)^T. Hence, by linearity,

T(v) = T(c_1 u_1 + ... + c_n u_n) = c_1 T(u_1) + ... + c_n T(u_n) = [T(u_1),...,T(u_n)](c_1,...,c_n)^T.

Thus, the required result follows.

As a direct application, we have the following result.

Corollary (Riesz Representation Theorem). Let T ∈ L(R^n, R). Then, there exists a ∈ R^n such that T(x) = a^T x.

Proof. By Theorem , T is determined by its images on {e_1,...,e_n}, the standard basis of R^n. As T is given, T(e_i) = a_i for some a_i ∈ R, 1 ≤ i ≤ n. So, let a = (a_1,...,a_n)^T. Then, for x = (x_1,...,x_n)^T ∈ R^n,

T(x) = T(Σ_{i=1}^{n} x_i e_i) = Σ_{i=1}^{n} x_i T(e_i) = Σ_{i=1}^{n} x_i a_i = a^T x.

Thus, the required result follows.

Example Does there exist a linear transformation
1. T : R^2 → R^2 such that T((1,1)^T) = (1,2)^T and T((1,−1)^T) = (5,10)^T?
Solution: Yes, as {(1,1)^T, (1,−1)^T} is a basis of R^2 and so T is determined by the prescribed images. Writing (x,y)^T = ((x+y)/2)(1,1)^T + ((x−y)/2)(1,−1)^T, we get

T((x,y)^T) = ((x+y)/2)(1,2)^T + ((x−y)/2)(5,10)^T = (3x − 2y, 6x − 4y)^T.

2. T : R^2 → R^2 such that T((1,1)^T) = (1,2)^T and T((5,5)^T) = (5,10)^T?
Solution: Yes, as (5,10)^T = T((5,5)^T) = 5T((1,1)^T) = 5(1,2)^T = (5,10)^T is consistent. To construct one such linear transformation, let {(1,1)^T, u} be a basis of R^2 and define T(u) = v = (v_1, v_2)^T, for some v ∈ R^2. For example, if u = (1,0)^T then T((x,y)^T) = y(1,2)^T + (x−y)v.
3. T : R^2 → R^2 such that Rng(T) = {T(x) | x ∈ R^2} = LS{(1,π)^T}?
Solution: Yes. Define T(e_1) = (1,π)^T and T(e_2) = 0, or T(e_1) = (1,π)^T and T(e_2) = a(1,π)^T, for some a ∈ R.
4. T : R^2 → R^2 such that Rng(T) = {T(x) | x ∈ R^2} = R^2?
Solution: Yes. Define T(e_1) = (1,π)^T and T(e_2) = (π,e)^T. Or, let {u,v} be a basis of R^2 and define T(e_1) = u and T(e_2) = v.
5. T : R^2 → R^2 such that Rng(T) = {T(x) | x ∈ R^2} = {0}?
Solution: Yes. Define T(e_1) = 0 and T(e_2) = 0.
6. T : R^2 → R^2 such that Null(T) = {x ∈ R^2 | T(x) = 0} = LS{(1,π)^T}?
Solution: Yes. Take the basis {(1,π)^T, (1,0)^T} of R^2 and define T((1,π)^T) = 0 and T((1,0)^T) = u, for some u ≠ 0.

Exercise 1. Let V be a vector space and let a ∈ V. Then the map T_a : V → V defined by T_a(x) = x + a, for all x ∈ V, is called the translation map. Prove that T_a ∈ L(V) if and only if a = 0.
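Returning to item 1 of the example above: the computation can be automated. If V has the basis vectors as columns and W the prescribed images as columns, then the matrix of T in the standard basis is W V^{-1}, since T(v_i) = w_i forces MV = W. A sketch in Python reproducing the formula obtained above:

```python
import numpy as np

V = np.array([[1., 1.],
              [1., -1.]])       # columns: the basis (1,1)^T and (1,-1)^T
W = np.array([[1., 5.],
              [2., 10.]])       # columns: the images (1,2)^T and (5,10)^T

M = W @ np.linalg.inv(V)        # matrix of T in the standard basis
print(M)                        # [[ 3. -2.], [ 6. -4.]]: T((x,y)) = (3x-2y, 6x-4y)
```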

91 4.1. DEFINITIONS AND BASIC PROPERTIES Are the maps T : V W given below, linear transformations? (a) Let V = R 2 and W = R 3 with T((x,y) T ) = (x+y +1,2x y,x+3y) T. (b) Let V = W = R 2 with T((x,y) T ) = (x y,x 2 y 2 ) T. (c) Let V = W = R 2 with T((x,y) T ) = (x y, x ) T. (d) Let V = R 2 and W = R 4 with T((x,y) T ) = (x+y,x y,2x+y,3x 4y) T. (e) Let V = W = R 4 with T((x,y,z,w) T ) = (z,x,w,y) T. 3. Which of the following maps T : M 2 (R) M 2 (R) are linear operators? (a) T(A) = A T (b) T(A) = I +A (c) T(A) = A 2 (d) T(A) = BAB 1, where B is a fixed 2 2 matrix. 4. Prove that a map T : R R is a linear transformation if and only if there exists a unique c R such that T(x) = cx, for every x R. 5. Let A M n (C) and define T A : C n C n by T A (x) = Ax for every x C n. Prove that for any positive integer k, TA k(x) = Ak x. 6. Use matrices to give examples of linear operators T,S : R 3 R 3 that satisfy: (a) T 0, T 2 0, T 3 = 0. (b) T 0, S 0, S T 0, T S = 0. (c) S 2 = T 2, S T. (d) T 2 = I, T I. 7. Let T : R n R n be a linear operator with T 0 and T 2 = 0. Prove that there exists a vector x R n such that the set {x,t(x)} is linearly independent. 8. Fix a positive integer p and let T : R n R n be a linear operator with T k 0 for 1 k p and T p+1 = 0. Then prove that there exists a vector x R n such that the set {x,t(x),...,t p (x)} is linearly independent. 9. Let T : R n R m be a linear transformation with T(x 0 ) = y 0 for some x 0 R n and y 0 R m. Define T 1 (y 0 ) = {x R n : T(x) = y 0 }. Then prove that for every x T 1 (y 0 ) there exists z T 1 (0) such that x = x 0 + z. Also, prove that T 1 (y 0 ) is a subspace of R n if and only if 0 T 1 (y 0 ). 10. Define a map T : C C by T(z) = z, the complex conjugate of z. Is T a linear transformation over the real vector space C? 11. Prove that there exists infinitely many linear transformations T : R 3 R 2 such that T((1, 1,1) T ) = (1,2) T and T(( 1,1,2) T ) = (1,0) T? 12. Does there exist a linear transformation T : R 3 R 2 such that

92 92 CHAPTER 4. LINEAR TRANSFORMATIONS (a) T((1,0,1) T ) = (1,2) T, T((0,1,1) T ) = (1,0) T and T((1,1,1) T ) = (2,3) T? (b) T((1,0,1) T ) = (1,2) T, T((0,1,1) T ) = (1,0) T and T((1,1,2) T ) = (2,3) T? 13. Let T : R 3 R 3 be defined by T((x,y,z) T ) = (2x+3y+4z,x+y+z,x+y+3z) T. Find the value of k for which there exists a vector x R 3 such that T(x) = (9,3,k) T. 14. Let T : R 3 R 3 be defined by T((x,y,z) T ) = (2x 2y+2z, 2x+5y+2z,8x+y+4z) T. Find x R 3 such that T(x) = (1,1, 1) T. 15. Let T : R 3 R 3 be defined by T((x,y,z) T ) = (2x +y +3z,4x y +3z,3x 2y +5z) T. Determine x,y,z R 3 \{0} such that T(x) = 6x, T(y) = 2y and T(z) = 2z. Is the set {x, y, z} linearly independent? 16. Let T : R 3 R 3 be defined by T((x,y,z) T ) = (2x+3y +4z, y, 3y +4z) T. Determine x,y,z R 3 \ {0} such that T(x) = 2x, T(y) = 4y and T(z) = z. Is the set {x,y,z} linearly independent? 17. Let n N. Does there exist a linear transformation T : R 3 R n such that T((1,1, 2) T ) = x, T(( 1,2,3) T ) = y and T((1,10,1) T ) = z (a) with z = x+y? (b) with z = cx+dy, for some c,d R? 18. For each matrix A given below, define T L(R 2 ) by T(x) = Ax. What do these linear operators signify geometrically? { [ [ ] (a) A 1 1 ],, 1 [ [ ] ( cos 2π ) ( ],,[ 3 sin 2π )]} sin ( ) 2π 3 cos ( ) 2π. 3 {[ ] [ ] (b) A,, 1 [ ] 1 1, 1 [ ] [ ] [ ]} ,, { [ ] [ ] (c) A ,, 1 [ ],[ ( 1 3 cos 2π ) 3 sin ( ) ]} 2π sin ( ) 2π 3 cos ( ) 2π Find all functions f : R 2 R 2 that fixes the line y = x and sends (x 1,y 1 ) for x 1 y 1 to its mirror image along the line y = x. Or equivalently, f satisfies (a) f(x,x) = (x,x) and (b) f(x,y) = (y,x) for all (x,y) R Consider the space C 3 over C. If f L(C 3 ) with f(x) = x,f(y) = (1+i)y and f(z) = (2+3i)z, for x,y,z C 3 \{0} then prove that {x,y,z} form a basis of C Rank-Nullity Theorem The readers are advised to see Theorem on Page 81 for clarity and similarity with the results in this section. We start with the following result.

Theorem Let V and W be two vector spaces over F and let T ∈ L(V,W). If S ⊆ V is linearly dependent then T(S) = {T(v) | v ∈ S} is linearly dependent.

Proof. As S is linearly dependent, there exist k ∈ N and v_i ∈ S, for 1 ≤ i ≤ k, such that the system x_1 v_1 + ... + x_k v_k = 0, in the unknowns x_i, has a non-trivial solution, say x_i = a_i ∈ F, 1 ≤ i ≤ k. Thus, Σ_{i=1}^{k} a_i v_i = 0. Now, consider the system y_1 T(v_1) + ... + y_k T(v_k) = 0, in the unknowns y_i. Then,

Σ_{i=1}^{k} a_i T(v_i) = Σ_{i=1}^{k} T(a_i v_i) = T(Σ_{i=1}^{k} a_i v_i) = T(0) = 0.

Thus, the a_i's give a non-trivial solution of Σ_{i=1}^{k} y_i T(v_i) = 0 and hence the required result follows.

As an immediate corollary, we get the following result.

Remark Let V and W be two vector spaces over F and let T ∈ L(V,W). If S ⊆ V is such that T(S) is linearly independent then S is linearly independent.

We now give some important definitions.

Definition [Range Space and Null Space] Let V and W be two vector spaces over F and let T ∈ L(V,W). Then, we define
1. Rng(T) = {T(x) | x ∈ V}, called the range space of T, and
2. Null(T) = {x ∈ V | T(x) = 0}, called the null space of T.

Example Determine Rng(T) and Null(T) for T ∈ L(R^3, R^4), where T((x,y,z)^T) = (x−y+z, y−z, x, 2x−5y+5z)^T.
Solution: Consider the standard basis {e_1, e_2, e_3} of R^3. Then

Rng(T) = LS(T(e_1), T(e_2), T(e_3)) = LS((1,0,1,2)^T, (−1,1,0,−5)^T, (1,−1,0,5)^T)
= LS((1,0,1,2)^T, (1,−1,0,5)^T)
= {λ(1,0,1,2)^T + β(1,−1,0,5)^T | λ, β ∈ R}
= {(λ+β, −β, λ, 2λ+5β)^T | λ, β ∈ R}
= {(x,y,z,w)^T ∈ R^4 | x+y−z = 0, 5y−2z+w = 0}

and

Null(T) = {(x,y,z)^T ∈ R^3 | T((x,y,z)^T) = 0}
= {(x,y,z)^T ∈ R^3 | x−y+z = 0, y−z = 0, x = 0, 2x−5y+5z = 0}
= {(x,y,z)^T ∈ R^3 | y−z = 0, x = 0}
= {(0,y,y)^T ∈ R^3 | y ∈ R} = LS((0,1,1)^T).

Exercise 1. Let V and W be two vector spaces over F and let T ∈ L(V,W). Then

94 94 CHAPTER 4. LINEAR TRANSFORMATIONS (a) Rng(T) is a subspace of W. (b) Null(T) is a subspace of V. Furthermore, if V is finite dimensional then (a) dim(null(t)) dim(v). (b) dim(rng(t)) is finite and whenever dim(w) is finite dim(rng(t)) dim(w). 2. Describe Null(D) and Rng(D), where D L(R[x;n]) and is defined by D(f(x)) = f (x). Note that Rng(D) R[x;n 1]. 3. Define T L(R 3 ) by T(e 1 ) = e 1 +e 3, T(e 2 ) = e 2 +e 3 and T(e 3 ) = e 3. Then (a) determine T((x,y,z) T ), for x,y,z R. (b) determine Null(T) and Rng(T). (c) is it true that T 3 = T? 4. Find T L(R 3 ) for which Rng(T) = LS ( (1,2,0) T,(0,1,1) T,(1,3,1) T). 5. Let V and W be two vector spaces over F. If {v 1,...,v n } is a V and w 1,...,w n W then prove that there exists a unique T L(V,W) such that T(v i ) = w i, for i = 1,...,n. Definition [Rank and Nullity] Let V and W be two vector spaces over F. If T L(V,W) and dim(v) is finite then we define Rank(T) = dim(rng(t)) and Nullity(T) = dim(null(t)). We now prove the rank-nullity Theorem. The proof of this result is similar to the proof of Theorem We give it again for the sake of completeness. Theorem (Rank-Nullity Theorem). Let V and W be two vector spaces over F. If T L(V,W) and dim(v) is finite then Rank(T)+Nullity(T) = dim(rng(t))+dim(null(t)) = dim(v). Proof. By Exercise a, dim(null(t)) dim(v). Let B be a basis of Null(T). We extend it toform abasisc of V. So, by definitionrng(t) = LS({T(v) v C}) = LS({T(v) v C \B}). We claim that {T(v) v C \B} is linearly independent subset of W. Let if possible the claim be false. Then, there exists v 1,...,v k C \B and a = [a 1,...,a k ] T such that a 0 and k a i T(v i ) = 0. Thus, we see that That is, ( k ) k T a i v i = a i T(v i ) = 0. k a i v i Null(T). Hence, there exists b 1,...,b l F and u 1,...,u l B such that k a i v i = k b j u j. Or equivalently, the system k x i v i + k y j u j = 0, in the unknowns x i s j=1 j=1

and the y_j's, has a non-trivial solution [a_1,...,a_k, −b_1,...,−b_l]^T (non-trivial as a ≠ 0). Hence, S = {v_1,...,v_k, u_1,...,u_l} is linearly dependent in V, a contradiction to S ⊆ C. That is, dim(Rng(T)) + dim(Null(T)) = |C \ B| + |B| = |C| = dim(V). Thus, we have proved the required result.

As an immediate corollary, we have the following result.

Corollary Let V be a vector space over F with dim(V) = n. If S, T ∈ L(V) then
1. Nullity(T) + Nullity(S) ≥ Nullity(ST) ≥ max{Nullity(T), Nullity(S)}.
2. min{Rank(S), Rank(T)} ≥ Rank(ST) ≥ Rank(S) + Rank(T) − n.

Proof. The proof of Part 2 is omitted as it directly follows from Part 1 and Theorem .
Part 1: We first prove the second inequality. Suppose v ∈ Null(T). Then (ST)(v) = S(T(v)) = S(0) = 0 gives Null(T) ⊆ Null(ST). Thus, Nullity(T) ≤ Nullity(ST). By Theorem , Nullity(S) ≤ Nullity(ST) is equivalent to Rng(ST) ⊆ Rng(S). And this holds, as Rng(T) ⊆ V implies Rng(ST) = S(Rng(T)) ⊆ S(V) = Rng(S).
To prove the first inequality, let {v_1,...,v_k} be a basis of Null(T). Then {v_1,...,v_k} ⊆ Null(ST). So, let us extend it to get a basis {v_1,...,v_k, u_1,...,u_l} of Null(ST).
Claim: {T(u_1), T(u_2),...,T(u_l)} is a linearly independent subset of Null(S).
Clearly, {T(u_1),...,T(u_l)} ⊆ Null(S). Now, consider the system c_1 T(u_1) + ... + c_l T(u_l) = 0 in the unknowns c_1,...,c_l. As T ∈ L(V), we get T(Σ_{i=1}^{l} c_i u_i) = 0. Thus, Σ_{i=1}^{l} c_i u_i ∈ Null(T). Hence, Σ_{i=1}^{l} c_i u_i is a unique linear combination of v_1,...,v_k. Therefore,

c_1 u_1 + ... + c_l u_l = α_1 v_1 + ... + α_k v_k ( )

for some scalars α_1,...,α_k. But, by assumption, {v_1,...,v_k, u_1,...,u_l} is a basis of Null(ST) and hence linearly independent. Therefore, the only solution of Equation ( ) is given by c_i = 0 for 1 ≤ i ≤ l and α_j = 0 for 1 ≤ j ≤ k. Thus, we have proved the claim. Hence, Nullity(S) ≥ l and Nullity(ST) = k + l ≤ Nullity(T) + Nullity(S).

Exercise 1. Let A ∈ M_n(R) with A^2 = A. Define T ∈ L(R^n, R^n) by T(v) = Av, for all v ∈ R^n. Then prove that
(a) T^2 = T. Equivalently, T(Id − T) = 0.
(b) Null(T) ∩ Rng(T) = {0}.
(c) R^n = Rng(T) + Null(T). [Hint: x = T(x) + (Id − T)(x).]
2. Define T by T((x,y,z)^T) = (x−y+z, x+2z)^T. Find a basis and the dimension of Rng(T) and Null(T).
3. Let z_i ∈ C, for 1 ≤ i ≤ k. Define T ∈ L(C[x;n], C^k) by T(P(z)) = (P(z_1),...,P(z_k)). If the z_i's are distinct then, for each k ≥ 1, determine Rank(T).
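For maps of the form T_A(x) = Ax, the rank-nullity theorem can be checked directly, since Rank(T_A) = Rank(A) and Nullity(T_A) = dim(Null(A)). A quick sympy check with a made-up matrix:

```python
import sympy as sp

A = sp.Matrix([[1, -1, 1, 0],
               [0, 1, -1, 0],
               [1, 0, 0, 2]])       # made-up 3 x 4 matrix: T_A maps R^4 -> R^3

rank = A.rank()                     # dim(Rng(T_A))
nullity = len(A.nullspace())        # dim(Null(T_A))
print(rank, nullity)                # 3 1
print(rank + nullity == A.cols)     # True: Rank(T) + Nullity(T) = dim(V) = 4
```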

96 96 CHAPTER 4. LINEAR TRANSFORMATIONS 4.2.A Algebra of Linear Transformation We start with the following definition. Definition [Sum and Scalar Multiplication of Linear Transformations] Let V, W be vector spaces over F and let S,T L(V,W). Then, we define the point-wise 1. sum of S and T, denoted S +T, by (S +T)(v = S(v)+T(v), for all v V. 2. scalar multiplication, denoted ct for c F, by (ct)(v = c(t(v)), for all v V. Theorem Let V and W be vector spaces over F. Then L(V,W) is a vector space over F. Furthermore, if dimv = n and dimw = m, then diml(v,w) = mn. Proof. It can be easily verified that for S,T L(V,W), if we define (S+αT)(v) = S(v)+αT(v) (point-wise addition and scalar multiplication) then L(V, W) is indeed a vector space over F. We now prove the other part. So, let us assume that B = {v 1,...,v n } and C = {w 1,...,w m } are bases of V and W, respectively. For 1 i n,1 j m, we define the functions f ij on the basis vectors of V by { wj, if k = i f ij (v k ) = 0, k i. For other vectors of V, we extend the definition by linearity. That is, if v = n α s v s then ( n ) f ij (v) = f ij α s v s = s=1 Thus, f ij L(V,W). s=1 n α s f ij (v s ) = α i f ij (v i ) = α i w j. ( ) s=1 Claim: {f ij 1 i n,1 j m} is a basis of L(V,W). So, let us consider the linear system n m c ij f ij = 0, in the unknowns c ij s for 1 i n,1 j=1 j m. Using the point-wise addition and scalar multiplication, we get n m n m m 0 = 0(v k ) = c ij f ij (v k ) = c ij f ij (v k ) = c kj w j. j=1 j=1 j=1 But, the set {w 1,...,w m } is linearly independent and hence the only solution equals c kj = 0, for 1 j m. Now, as we vary k from 1 to n, we see that c ij = 0, for 1 j m and 1 i n. Thus, we have proved the linear independence. Now, let us prove that LS({f ij 1 i n,1 j m}) = L(V,W). So, let f L(V,W). Then, f(v s ) W and hence there exists β st s such that f(v s ) = m β st w t, for 1 s n. So, if v = n α s v s V then, using Equation ( ), we get s=1 ( n ) f(v) = f α s v s = s=1 s=1 s=1 t=1 s=1 t=1 t=1 ( n n m ) n m α s f(v s ) = α s β st w t = β st (α s w t ) ( n m n m = β st f st (v s ) = β st f st )(v). s=1 t=1 s=1 t=1

Since the above is true for every v ∈ V, LS({f_{ij} | 1 ≤ i ≤ n, 1 ≤ j ≤ m}) = L(V,W), and thus the required result follows.

Definition Let f : S → T be any function. Then
1. a function g : T → S is called a left inverse of f if (g ∘ f)(x) = x, for all x ∈ S. That is, g ∘ f = Id, the identity function on S.
2. a function h : T → S is called a right inverse of f if (f ∘ h)(y) = y, for all y ∈ T. That is, f ∘ h = Id, the identity function on T.
3. f is said to be invertible if it has a right inverse and a left inverse.

Remark Let f : S → T be invertible. Then, it can easily be shown that any right inverse and any left inverse are the same. Thus, the inverse function is unique and is denoted by f^{-1}. The reader should prove that f is invertible if and only if f is both one-one and onto.

Theorem Let V and W be two vector spaces over F and let T ∈ L(V,W). Also assume that T is one-one and onto. Then
1. for each w ∈ W, the set T^{-1}(w) = {v ∈ V | T(v) = w} has exactly one element.
2. the map T^{-1} ∈ L(W,V), where one defines T^{-1}(w) = v whenever T(v) = w.

Proof. Part 1. As T is onto, for each w ∈ W there exists v ∈ V such that T(v) = w. So, T^{-1}(w) ≠ ∅. Now, assume that there exist vectors v_1, v_2 ∈ V such that T(v_1) = T(v_2). Then, T being one-one implies v_1 = v_2. Hence, |T^{-1}(w)| = 1. This completes the proof of Part 1.
Part 2. We need to show that T^{-1}(α_1 w_1 + α_2 w_2) = α_1 T^{-1}(w_1) + α_2 T^{-1}(w_2), for all α_1, α_2 ∈ F and w_1, w_2 ∈ W. Note that by Part 1, there exist unique vectors v_1, v_2 ∈ V such that T^{-1}(w_1) = v_1 and T^{-1}(w_2) = v_2. Or equivalently, T(v_1) = w_1 and T(v_2) = w_2. So, T(α_1 v_1 + α_2 v_2) = α_1 w_1 + α_2 w_2, for all α_1, α_2 ∈ F. Hence, by the definition of T^{-1}, for all α_1, α_2 ∈ F, we get

T^{-1}(α_1 w_1 + α_2 w_2) = α_1 v_1 + α_2 v_2 = α_1 T^{-1}(w_1) + α_2 T^{-1}(w_2).

Thus, the proof of Part 2 is complete.

Definition [Inverse Linear Transformation] Let V and W be two vector spaces over F and let T ∈ L(V,W). If T is one-one and onto then T^{-1} ∈ L(W,V), where T^{-1}(w) = v whenever T(v) = w. The map T^{-1} is called the inverse of the linear transformation T.

Example 1. Let T : R^2 → R^2 be defined by T((x,y)^T) = (x+y, x−y)^T. Then T^{-1}((x,y)^T) = ((x+y)/2, (x−y)/2)^T, as

(T ∘ T^{-1})((x,y)^T) = T(((x+y)/2, (x−y)/2)^T) = ((x+y)/2 + (x−y)/2, (x+y)/2 − (x−y)/2)^T = (x,y)^T.

Thus, the map T^{-1} is indeed the inverse of T.
2. Define T ∈ L(R^{n+1}, R[x;n]) by T((a_1,...,a_{n+1})^T) = a_1 + a_2 x + ... + a_{n+1} x^n, for (a_1,...,a_{n+1})^T ∈ R^{n+1}. Then, one defines T^{-1}(a_1 + a_2 x + ... + a_{n+1} x^n) = (a_1,...,a_{n+1})^T, for all a_1 + a_2 x + ... + a_{n+1} x^n ∈ R[x;n]. Verify that T^{-1} ∈ L(R[x;n], R^{n+1}).
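For Example 1 above, both T and T^{-1} are given by matrices in the standard basis, and the matrix of T^{-1} is simply the inverse matrix. A quick numerical check; the formula for T is from the example, the rest is illustrative:

```python
import numpy as np

T = np.array([[1., 1.],
              [1., -1.]])                  # T((x,y)) = (x+y, x-y)
T_inv = np.linalg.inv(T)                   # matrix of T^{-1}

print(T_inv)                               # [[0.5, 0.5], [0.5, -0.5]]: ((x+y)/2, (x-y)/2)
print(np.allclose(T_inv @ T, np.eye(2)))   # True: T^{-1} T = Id
```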

Definition Let V and W be two vector spaces over F and let T ∈ L(V,W). Then, T is said to be singular if there exists v ∈ V such that v ≠ 0 but T(v) = 0. If no such v ∈ V exists then T is called non-singular.

Example Let T ∈ L(R^2, R^3) be defined by T((x,y)^T) = (x, y, 0)^T. Then, verify that T is non-singular. Is T invertible?

We now prove a result that relates non-singularity with linear independence.

Theorem Let V and W be two vector spaces over F and let T ∈ L(V,W). Then the following statements are equivalent.
1. T is one-one.
2. T is non-singular.
3. Whenever S ⊆ V is linearly independent, T(S) is necessarily linearly independent.

Proof. 1 ⇒ 2: Let T be singular. Then, there exists v ≠ 0 such that T(v) = 0 = T(0). This implies that T is not one-one, a contradiction.
2 ⇒ 3: Let S ⊆ V be linearly independent and suppose, if possible, that T(S) is linearly dependent. Then, there exist v_1,...,v_k ∈ S and α = (α_1,...,α_k)^T ≠ 0 such that Σ_{i=1}^{k} α_i T(v_i) = 0. Thus, T(Σ_{i=1}^{k} α_i v_i) = 0. But T is non-singular, and hence we get Σ_{i=1}^{k} α_i v_i = 0 with α ≠ 0, a contradiction to S being a linearly independent set.
3 ⇒ 1: Suppose that T is not one-one. Then, there exist x, y ∈ V such that x ≠ y but T(x) = T(y). Thus, we have obtained S = {x − y}, a linearly independent subset of V with T(S) = {0}, a linearly dependent set, a contradiction. Thus, the required result follows.

Definition Let V and W be two vector spaces over F and let T ∈ L(V,W). Then, T is said to be an isomorphism if T is one-one and onto. The vector spaces V and W are said to be isomorphic, denoted V ≅ W, if there is an isomorphism from V to W.

We now give a formal proof of the statement in Remark .

Theorem Let V be an n-dimensional vector space over F. Then V ≅ F^n.

Proof. Let {v_1,...,v_n} be a basis of V and {e_1,...,e_n} the standard basis of F^n. Now define T(v_i) = e_i, for 1 ≤ i ≤ n, and T(Σ_{i=1}^{n} α_i v_i) = Σ_{i=1}^{n} α_i e_i, for α_1,...,α_n ∈ F. Then, it is easy to observe that T ∈ L(V, F^n) and that T is one-one and onto. Hence, T is an isomorphism.

We now summarize the different notions related with a linear operator on a finite dimensional vector space. The proof basically uses the rank-nullity theorem.

Theorem Let V be a vector space over F with dim(V) = n. Then the following statements are equivalent for T ∈ L(V).

99 4.3. MATRIX OF A LINEAR TRANSFORMATION T is one-one. 2. Null(T) = {0}. 3. Rank(T) = n. 4. T is onto. 5. T is an isomorphism. 6. If {v 1,...,v n } is a basis for V then so is {T(v 1 ),...,T(v n )}. 7. T is non-singular. 8. T is invertible. Proof. 1 2 Let x Null(T). Then T(x) = 0 = T(0). So, T is one-one implies x = 0. Thus Null(T) = {0}. 2 3 As Null(T) = {0}, Nullity(T) = 0 and hence by Theorem Rank(T) = n. 3 4 As Rank(T) = n, Rng(T) V and dim(v) = n, we get Rng(T) = V. Thus T is onto. 4 1 As T is onto, dim(rng(t)) = n. So, by Theorem Null(T) = {0}. Now, let x,y V such that T(x) = T(y). Or equivalently, x y Null(T) = {0}. Thus x = y and T is one-one. The equivalence of 1 and 2 gives the equivalence with 5. Also, using Theorem , one has the equivalence of 1, 6 and 7. Further note that the equivalence of 1 and 2 with Theorem implies that T is invertible. For the other way implication, note that by definition T is invertible implies that T is one-one and onto. Thus, all the statements are equivalent. Exercise Let V and W be two vector spaces over F and let T L(V,W). If dim(v) is finite then prove that 1. T cannot be onto if dim(v) < dim(w). 2. T cannot be one-one if dim(v) > dim(w). 4.3 Matrix of a linear transformation In Example , we saw that for each A M m n (C) there exists a linear transformation T L(C n,c m ) given by T(x) = Ax, for each x C n. In this section, we prove that if V and W are vector spaces over F with dimensions n and m, respectively, then any T L(V,W) corresponds to an m n matrix. Before proceeding further, the readers should recall the results on ordered basis (see Section 3.5). So, let B = [v 1,...,v n ] and C = [w 1,...,w m ] be ordered bases of V and W, respectively. Then, recall that if A = [v 1,...,v n ] and B = [w 1,...,w m ] then v = A[v] B and w = B[w] C, for all v V and w W. So, if T L(V,W) then, for any v V, B[T(v)] C = T(v) = T(A[v] B ) = T(A)[v] B = [T(v 1 ),...,T(v n )][v] B = [B[T(v 1 )] C,...,B[T(v n )] C ][v] B = B[[T(v 1 )] C,...,[T(v n )] C ][v] B.

As B is invertible, we get [T(v)]_C = [[T(v_1)]_C,...,[T(v_n)]_C][v]_B. Note that the matrix [[T(v_1)]_C,...,[T(v_n)]_C], denoted T[B,C], is an m × n matrix and is unique, as the i-th column equals [T(v_i)]_C, for 1 ≤ i ≤ n. So, we immediately have the following definition and result.

Definition [Matrix of a Linear Transformation] Let B = [v_1,...,v_n] and C = [w_1,...,w_m] be ordered bases of V and W, respectively. If T ∈ L(V,W) then the matrix T[B,C] is called the coordinate matrix of T, or the matrix of the linear transformation T with respect to the bases B and C. When there is no mention of bases, we take the standard bases and denote the matrix by [T].

Theorem Let B = [v_1,...,v_n] and C = [w_1,...,w_m] be ordered bases of V and W, respectively. If T ∈ L(V,W) then there exists a matrix A ∈ M_{m×n}(F) with A = T[B,C] = [[T(v_1)]_C,...,[T(v_n)]_C] and [T(x)]_C = A[x]_B, for all x ∈ V.

We now give a few examples to understand the above discussion and Theorem .

[Figure 4.1: Counterclockwise rotation by an angle θ. The point P = (1,0) maps to P′ = (cos θ, sin θ), the point Q = (0,1) maps to Q′ = (−sin θ, cos θ), and a general point P = (x,y) at angle α maps to P′ = (x′, y′).]

Example 1. Let T ∈ L(R^2) represent a counterclockwise rotation by an angle θ, 0 ≤ θ < 2π. Then, using Figure 4.1 with x = OP cos α and y = OP sin α, verify that

x′ = OP cos(α + θ) = OP(cos α cos θ − sin α sin θ) = x cos θ − y sin θ,
y′ = OP sin(α + θ) = OP(sin α cos θ + cos α sin θ) = x sin θ + y cos θ.

Or equivalently, the matrix in the standard ordered basis of R^2 has rows (cos θ, −sin θ) and (sin θ, cos θ):

[T] = [T(e_1), T(e_2)] = [[cos θ, −sin θ], [sin θ, cos θ]]. ( )

2. Let T ∈ L(R^2) with T((x,y)^T) = (x+y, x−y)^T.
(a) Then [T] = [[T(e_1)], [T(e_2)]] has rows (1, 1) and (1, −1).
(b) On the image space, take the ordered basis C = [(1,0)^T, (1,1)^T]. Then, with B the standard ordered basis of the domain, T[B,C] = [[T(e_1)]_C, [T(e_2)]_C] has columns (0,1)^T and (2,−1)^T.
(c) In the above, let the ordered basis of the domain space be B = [(1,1)^T, (3,1)^T]. Then T[B,C] = [[T((1,1)^T)]_C, [T((3,1)^T)]_C] has columns (2,0)^T and (2,2)^T.
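The rotation matrix of item 1 is easy to sanity-check numerically; the angles below are illustrative choices:

```python
import numpy as np

def rotation(theta):
    """Matrix, in the standard ordered basis, of the counterclockwise
    rotation of R^2 by the angle theta."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

R = rotation(np.pi / 2)
print(np.round(R @ np.array([1., 0.]), 12))    # e1 -> (0, 1)^T
print(np.allclose(rotation(0.7) @ rotation(-0.7),
                  np.eye(2)))                  # rotating back gives Id
```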

101 4.3. MATRIX OF A LINEAR TRANSFORMATION Let B = [e 1,e 2 ] and C = [e 1 + e 2,e 1 e 2 ] be two ordered bases of R 2. Then Compute T[B,B] and T[C,C], where T((x,y) [ T ) = (x+y,x 2y) ] T. 1 1 Solution: Let A = Id 2 and B =. Then, A = Id 2 and B 1 = 1 [ ] 1 1. So, as T[B,B] = T[C,C] = [[ T [[ T ([ ])] [ 1, T 0 B ])] ([ 1 1 [ ] [ ] 2 2 = B 1 1 and 1 C C ([ ])] ] 0 = 1 B ])] ] [ ([ 1, T 1 [ ] [ ] 0 0 = B C C [[ ] [ ] ] [ , = 1 2 B B 1 2 ] ] ] = [[ 2 1, C [ 0 3 C = [ 1 2 ] ] and 4. Let T L(R 3,R 2 ) be defined by T((x,y,z) T ) = (x+y z,x+z) T. Determine [T]. Solution: By definition [T] = [[T(e 1 )],[T(e 2 )],[T(e 3 )]] = [[ ] 1, 1 [ ] 1, 0 [ ]] [ 1 = 1 ] Define T L(C 3 ) by T(x) = x, for all x C 3. Determine the coordinates with respect to the ordered basis B = [ e 1,e 2,e 3 ] and C = [ (1,0,0),(1,1,0),(1,1,1) ]. Solution: By definition, verify that T[B,C] = [[T(e 1 )] C,[T(e 2 )] C,[T(e 3 )] C ] = 0, 1, 0 = C C C and T[C,B] = 0, 1, 1 = B B B Thus, verify that T[C,B] 1 = T[B,C] and T[B,B] = T[C,C] = I 3 as the given map is indeed the identity map. 6. Fix A M n (C) and define T L(C n ) by T(x) = Ax, for all x C n. If B is the standard basis of C n then [T][:,i] = [T(e i )] B = [A(e i )] B = [A[:,i]] B = A[:,i]. 7. Fix A M m,n (C) and define T L(C n,c m ) by T(x) = Ax, for all x C n. Let B and C be the standard ordered bases of C n and C m, respectively. Then T[B,C] = A as (T[B,C])[:,i] = [T(e i )] C = [A(e i )] C = [A[:,i]] C = A[:,i].

102 102 CHAPTER 4. LINEAR TRANSFORMATIONS 8. Fix A M n (C) and define T L(C n ) by T(x) = Ax, for all x C n. Let B = [v 1,...,v n ] and C = [u 1,...,u n ] be two ordered basses of C n with respective matrices B and C. Then T[B,C] = [[T(v 1 )] C,...,[T(v 1 )] C ] = [ C 1 T(v 1 ),...,C 1 T(v 1 ) ] = [ C 1 Av 1,...,C 1 ] Av 1 = C 1 A[v 1,...,v n ] = C 1 AB. In particular, if (a) B = C then T[B,B] = B 1 AB. (b) A = I n so that T = Id then Id[B,C] = C 1 B, an invertible matrix. Similarly, Id[C,B] = B 1 C. So, Id[C,B] Id[B,C] = (B 1 C)(C 1 B) = I n. (c) A = I n so that T = Id and B = C then Id[B,B] = I n. Exercise Let T L(R 2 ) represent the reflection about the line y = mx. Find its matrix with respect to the standard ordered basis of R Let T L(R 3 ) represent the reflection about the X-axis. Find its matrix with respect to the standard ordered basis of R Let T L(R 3 ) represent the counterclockwise rotation around the positive Z-axis by an angle θ,0 θ < 2π. Find its matrix with respect to the standard ordered basis of R 3. cosθ sinθ 0 [Hint: Is sinθ cosθ 0 the required matrix?] Define a function D L(R[x;n]) by D(f(x)) = f (x). Find the matrix of D with respect to the standard ordered basis of R[x;n]. Observe that Rng(D) R[x;n 1]. 4.3.A Dual Space* Definition Let V be a vector space over F. Then a map T L(V,F) is called a linear functional on V. Example The following linear transformations are linear functionals. 1. T(A) = trace(a) for T L(M n (R),R). 2. T(f) = 3. T(f) = b a b a f(t)dt for T L(C([a,b],R),R). t 2 f(t)dt for T L(C([a,b],R),R). Exercise Let V be a vector space. Suppose there exists v V such that f(v) = 0, for all f V. Then prove that v = 0. Definition Let V be a vector space over F. Then L(V,F) is called the dual space of V and is denoted by V. The double dual space of V, denoted V, is the dual space of V. We first give an immediate corollary of Theorem

4.3.A Dual Space*

Definition Let V be a vector space over F. Then a map T ∈ L(V,F) is called a linear functional on V.

Example The following linear transformations are linear functionals.
1. T(A) = trace(A), for T ∈ L(Mₙ(R), R).
2. T(f) = ∫ₐᵇ f(t) dt, for T ∈ L(C([a,b],R), R).
3. T(f) = ∫ₐᵇ t² f(t) dt, for T ∈ L(C([a,b],R), R).

Exercise Let V be a vector space. Suppose there exists v ∈ V such that f(v) = 0, for all f ∈ V*. Then prove that v = 0.

Definition Let V be a vector space over F. Then L(V,F) is called the dual space of V and is denoted by V*. The double dual space of V, denoted V**, is the dual space of V*.

We first give an immediate corollary of the theorem above.

Corollary Let V and W be vector spaces over F with dim V = n and dim W = m.
1. Then L(V,W) ≅ F^{mn}. Moreover, {f_{ij} : 1 ≤ i ≤ n, 1 ≤ j ≤ m} is a basis of L(V,W).
2. In particular, if W = F then L(V,F) = V* ≅ Fⁿ. Moreover, if {v₁,…,vₙ} is a basis of V then the set {fᵢ : 1 ≤ i ≤ n} is a basis of V*, where fᵢ(v_k) = 1 if k = i, and fᵢ(v_k) = 0 if k ≠ i.

So, we see that V* can be understood through a basis of V. Thus, one can understand V** again via a basis of V*. But the question arises: can we understand it directly via the vector space V itself? We answer this in the affirmative by giving a canonical isomorphism from V to V**. To do so, for each v ∈ V, we define a map L_v : V* → F by L_v(f) = f(v), for each f ∈ V*. Then L_v is a linear functional as
L_v(αf + g) = (αf + g)(v) = αf(v) + g(v) = αL_v(f) + L_v(g).
So, for each v ∈ V, we have obtained a linear functional L_v ∈ V**. We use it to give the required canonical isomorphism.

Theorem Let V be a vector space over F. If dim(V) = n then the canonical map T : V → V** defined by T(v) = L_v is an isomorphism.
Proof. Note that the map T satisfies the following:
1. For each f ∈ V*, note that L_{αv+u}(f) = f(αv+u) = αf(v) + f(u) = αL_v(f) + L_u(f) = (αL_v + L_u)(f). Thus, L_{αv+u} = αL_v + L_u. Hence, T(αv+u) = αT(v) + T(u) and T is a linear transformation.
2. We now show that T is one-one. So, suppose that T(v) = T(u), for some u, v ∈ V. Then L_v = L_u. That is, L_v(f) = L_u(f), for all f ∈ V*. Or equivalently, f(v−u) = 0, for all f ∈ V*. Hence, by the exercise above, v − u = 0. So, v = u and T is one-one.
Thus, T gives an inclusion map from V to V**. Further, applying the corollary above to V* gives dim(V**) = dim(V*) = n. Hence, the required result follows.

We now give a few immediate consequences of the above theorem.
Corollary Let V be a vector space of dimension n with basis B = {v₁,…,vₙ}.
1. Then D = {L_{v₁},…,L_{vₙ}} is a basis of V**, the double dual of V. Thus, for each T ∈ V** there exists α ∈ V such that T(f) = f(α), for all f ∈ V*.
2. If C = {f₁,…,fₙ} is the dual basis of V* defined using the basis B (see the corollary above) then D is indeed the dual basis of V** obtained using the basis C of V*. Thus, each basis of V* is the dual basis of some basis of V.

Proof. Part 1 is direct as T : V → V** is a canonical inclusion map. For Part 2, we need to show that L_{vᵢ}(f_j) = 1 if j = i and 0 if j ≠ i, which indeed holds true as f_j(vᵢ) = 1 if j = i and 0 if j ≠ i (see the corollary above).

Let V be a finite dimensional vector space. Then the above corollary implies that the spaces V and V** are naturally dual to each other.

We are now ready to prove the main result of this subsection. To start with, let T ∈ L(V,W). Then we want to define a map T̂ : W* → V*. As g ∈ W*, T̂(g) ∈ V* is a linear functional, so we need to evaluate it at an element of V. Thus, we define T̂(g)(v) = g(T(v)), for all v ∈ V. Note that T̂ ∈ L(W*, V*) as, for every g, h ∈ W*,
(T̂(αg+h))(v) = (αg+h)(T(v)) = αg(T(v)) + h(T(v)) = (αT̂(g) + T̂(h))(v), for all v ∈ V,
implies that T̂(αg+h) = αT̂(g) + T̂(h).

Theorem Let V and W be two vector spaces over F with ordered bases B = [v₁,…,vₙ] and C = [w₁,…,wₘ], respectively. Also, let B* = [f₁,…,fₙ] and C* = [g₁,…,gₘ] be the corresponding ordered bases of the dual spaces V* and W*, respectively. Then T̂[C*, B*] = (T[B,C])ᵀ, the transpose of the coordinate matrix of T.

Proof. Note that we need to compute T̂[C*, B*] = [[T̂(g₁)]_{B*}, …, [T̂(gₘ)]_{B*}] and prove that it equals the transpose of the matrix T[B,C]. So, let T[B,C] = [[T(v₁)]_C, …, [T(vₙ)]_C] = [a_{ij}], an m × n matrix. Thus, to prove the required result, we need to show that, for 1 ≤ j ≤ m,
[T̂(g_j)]_{B*} = (a_{j1}, a_{j2}, …, a_{jn})ᵀ, or equivalently, T̂(g_j) = Σ_{k=1}^{n} a_{jk} f_k.   (★)
Now, recall that the functionals fᵢ's and g_j's satisfy (Σ_{k=1}^{n} α_k f_k)(v_t) = Σ_{k=1}^{n} α_k f_k(v_t) = α_t, for 1 ≤ t ≤ n, and [g_j(w₁), …, g_j(wₘ)] = e_jᵀ, the row vector with 1 at the j-th place and 0 elsewhere. So, writing C = [w₁,…,wₘ] and evaluating T̂(g_j) at the elements v_t of B, we get
(T̂(g_j))(v_t) = g_j(T(v_t)) = g_j(C[T(v_t)]_C) = [g_j(w₁),…,g_j(wₘ)] [T(v_t)]_C = e_jᵀ (a_{1t},…,a_{mt})ᵀ = a_{jt} = (Σ_{k=1}^{n} a_{jk} f_k)(v_t).

Thus, the linear functionals T̂(g_j) and Σ_{k=1}^{n} a_{jk} f_k agree at v_t, for 1 ≤ t ≤ n, the basis vectors of V. Hence T̂(g_j) = Σ_{k=1}^{n} a_{jk} f_k, which gives Equation (★).

Remark The proof of the above theorem also shows the following.
1. For each T ∈ L(V,W) there exists a unique map T̂ ∈ L(W*, V*) such that (T̂(g))(v) = g(T(v)), for each g ∈ W*.
2. The coordinate matrices T[B,C] and T̂[C*, B*] are transposes of each other, where the ordered bases B* of V* and C* of W* correspond, respectively, to the ordered bases B of V and C of W.
3. Thus, results on matrices and their transposes can be rewritten in the language of a vector space and its dual space.

4.4 Similarity of Matrices

Let V be a vector space over F with dim(V) = n and ordered basis B. Then any T ∈ L(V) corresponds to a matrix in Mₙ(F). What happens if the ordered basis is changed? We answer this in this subsection.

[Figure 4.2: Composition of linear transformations: (V, B, n) → (W, C, m) → (Z, D, p), with matrices T[B,C] of size m × n and S[C,D] of size p × m, and (ST)[B,D] of size p × n equal to S[C,D] · T[B,C].]

Theorem (Composition of Linear Transformations). Let V, W and Z be finite dimensional vector spaces over F with ordered bases B, C and D, respectively. Also, let T ∈ L(V,W) and S ∈ L(W,Z). Then S ∘ T = ST ∈ L(V,Z) (see Figure 4.2) and (ST)[B,D] = S[C,D] · T[B,C].
Proof. Let B = [u₁,…,uₙ], C = [v₁,…,vₘ] and D = [w₁,…,w_p] be the ordered bases of V, W and Z, respectively. Then, using the previous theorem, we have
(ST)[B,D] = [[ST(u₁)]_D, …, [ST(uₙ)]_D] = [[S(T(u₁))]_D, …, [S(T(uₙ))]_D] = [S[C,D][T(u₁)]_C, …, S[C,D][T(uₙ)]_C] = S[C,D] [[T(u₁)]_C, …, [T(uₙ)]_C] = S[C,D] · T[B,C].
Hence, the proof of the theorem is complete.
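The composition rule can be spot-checked numerically. Below is a small sketch (illustrative, not from the notes): M and N are the standard matrices of T and S, the columns of B, C, D are randomly chosen bases, and the coordinate matrices are computed via Example 8's formula C⁻¹MB.

```python
import numpy as np

rng = np.random.default_rng(7)
M = rng.standard_normal((2, 3))      # T(x) = M x : R^3 -> R^2
N = rng.standard_normal((4, 2))      # S(y) = N y : R^2 -> R^4
B = rng.standard_normal((3, 3))      # columns: an ordered basis of R^3
C = rng.standard_normal((2, 2))      # columns: an ordered basis of R^2
D = rng.standard_normal((4, 4))      # columns: an ordered basis of R^4

T_BC = np.linalg.solve(C, M @ B)             # T[B,C] = C^{-1} M B
S_CD = np.linalg.solve(D, N @ C)             # S[C,D] = D^{-1} N C
ST_BD = np.linalg.solve(D, (N @ M) @ B)      # (ST)[B,D] = D^{-1} (N M) B
print(np.allclose(ST_BD, S_CD @ T_BC))       # True: (ST)[B,D] = S[C,D] T[B,C]
```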

As an immediate corollary of the composition theorem we have the following result.

Theorem (Inverse of a Linear Transformation). Let V be a vector space with dim(V) = n. If T ∈ L(V) is invertible then, for any ordered basis B, (T[B,B])⁻¹ = T⁻¹[B,B]. That is, the coordinate matrix of T is invertible.
Proof. As T is invertible, TT⁻¹ = Id. Thus, Example 8c and the composition theorem imply
Iₙ = Id[B,B] = (TT⁻¹)[B,B] = T[B,B] · T⁻¹[B,B].
Hence, by the definition of inverse, T⁻¹[B,B] = (T[B,B])⁻¹ and the required result follows.

Exercise Find the matrices of the linear transformations given below.
1. Define T ∈ L(R³) by T(x₁) = x₂, T(x₂) = x₃ and T(x₃) = x₁, where B = [x₁, x₂, x₃] is an ordered basis of R³. Find T[B,B]. Is T invertible?
2. Let B = [1, x, x², x³] be an ordered basis of R[x;3] and define T ∈ L(R[x;3]) by T(1) = 1, T(x) = 1+x, T(x²) = (1+x)² and T(x³) = (1+x)³. Prove that T is invertible. Also, find T[B,B] and T⁻¹[B,B].

Let V be a finite dimensional vector space. Then, the next result answers the question: what happens to the matrix T[B,B] if the ordered basis B changes to C?

[Figure 4.3: Commutative diagram for similarity of matrices: T maps (V,B) to (V,B) with matrix T[B,B] and (V,C) to (V,C) with matrix T[C,C]; the two rows are connected by the identity map with matrix Id[B,C].]

Theorem Let B = [u₁,…,uₙ] and C = [v₁,…,vₙ] be two ordered bases of V and Id the identity operator. Then, for any linear operator T ∈ L(V),
T[C,C] = (Id[C,B])⁻¹ · T[B,B] · Id[C,B].   (★★)
Proof. The proof uses the composition theorem to represent T[B,C] as (Id ∘ T)[B,C] and as (T ∘ Id)[B,C] (see Figure 4.3 for clarity). Indeed,
T[B,C] = (Id ∘ T)[B,C] = Id[B,C] · T[B,B] and T[B,C] = (T ∘ Id)[B,C] = T[C,C] · Id[B,C].
Hence, T[C,C] · Id[B,C] = Id[B,C] · T[B,B], and therefore T[C,C] = (Id[C,B])⁻¹ · T[B,B] · Id[C,B], as (Id[B,C])⁻¹ = Id[C,B]. The result follows.

Let V be a vector space and let T ∈ L(V). If dim(V) = n then every ordered basis B of V gives an n × n matrix T[B,B]. So, as we change the ordered basis, the coordinate matrix of T changes. The above theorem tells us that all these matrices are related by an invertible matrix.
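Equation (★★) can be verified concretely. Here is a minimal sketch (not part of the notes) using the data of Example 3 above: T((x,y)ᵀ) = (x+y, x−2y)ᵀ, B the standard basis, and C = [e₁+e₂, e₁−e₂], so that Id[C,B] has the vectors of C as its columns.

```python
import numpy as np

A = np.array([[1., 1.], [1., -2.]])   # T[B,B] in the standard basis
P = np.array([[1., 1.], [1., -1.]])   # Id[C,B]: columns are e1+e2 and e1-e2
TCC = np.linalg.inv(P) @ A @ P        # (Id[C,B])^{-1} T[B,B] Id[C,B]
print(TCC)                            # 0.5 * [[1, 3], [3, -3]], as in Example 3
```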

Thus, we are led to the following remark and definition.

Remark As T[C,C] = Id[B,C] · T[B,B] · Id[C,B], the matrix Id[B,C] is called the B : C change of basis matrix.

Definition [Similar Matrices] Let B, C ∈ Mₙ(C). Then B and C are said to be similar if there exists a non-singular matrix P such that P⁻¹BP = C, or equivalently, BP = PC.

Example Let B = [1+x, 1+2x+x², 2+x] and C = [1, 1+x, 1+x+x²] be ordered bases of R[x;2]. Then, for the identity map Id(a+bx+cx²) = a+bx+cx², verify that
Id[C,B] = [[1]_B, [1+x]_B, [1+x+x²]_B] = [−1, 1, −2; 0, 0, 1; 1, 0, 1] and
Id[B,C] = [[1+x]_C, [1+2x+x²]_C, [2+x]_C] = [0, −1, 1; 1, 1, 1; 0, 1, 0],
and that Id[B,C]⁻¹ = Id[C,B].

Exercise 1. Define T ∈ L(R³) by T((x,y,z)ᵀ) = (x+y+2z, x−y−3z, 2x+3y+z)ᵀ. Let B be the standard ordered basis and C = [(1,1,1)ᵀ, (1,−1,1)ᵀ, (1,1,2)ᵀ] another ordered basis of R³. Then find
(a) the matrices T[B,B] and T[C,C].
(b) the matrix P such that P⁻¹ · T[B,B] · P = T[C,C].

2. Let V be a vector space with dim(V) = n. Let T ∈ L(V) satisfy Tⁿ⁻¹ ≠ 0 but Tⁿ = 0.
(a) Prove that there exists u ∈ V such that {u, T(u), …, Tⁿ⁻¹(u)} is a basis of V.
(b) If B = [u, T(u), …, Tⁿ⁻¹(u)] then T[B,B] is the n × n matrix with 1's on the first sub-diagonal and 0's elsewhere (the i-th column equals e_{i+1} for i < n, and the last column is 0).
(c) Let A be an n × n matrix satisfying Aⁿ⁻¹ ≠ 0 but Aⁿ = 0. Then prove that A is similar to the matrix given in Part (b).

3. Let V, W be vector spaces over F with dim(V) = n, dim(W) = m and ordered bases B and C, respectively. Define I_{B,C} : L(V,W) → M_{m,n}(F) by I_{B,C}(T) = T[B,C]. Show that I_{B,C} is an isomorphism. Thus, once bases are fixed, linear transformations correspond bijectively to m × n matrices.

4.5 Summary


Chapter 5 Inner Product Spaces

5.1 Definition and Basic Properties

Recall the dot product in R² and R³. The dot product helped us compute the length of vectors and the angle between vectors. This enabled us to rephrase geometrical problems in R² and R³ in the language of vectors. We generalize the idea of the dot product to achieve similar goals for a general vector space.

Definition [Inner Product] Let V be a vector space over F. An inner product over V, denoted by ⟨·,·⟩, is a map from V × V to F satisfying
1. ⟨au+bv, w⟩ = a⟨u,w⟩ + b⟨v,w⟩, for all u, v, w ∈ V and a, b ∈ F,
2. ⟨u,v⟩ = conj(⟨v,u⟩), the complex conjugate of ⟨v,u⟩, for all u, v ∈ V, and
3. ⟨u,u⟩ ≥ 0 for all u ∈ V. Furthermore, equality holds if and only if u = 0.

Remark Using the definition of inner product, we immediately observe the following.
1. ⟨v, αw⟩ = conj(⟨αw, v⟩) = conj(α) conj(⟨w,v⟩) = conj(α) ⟨v,w⟩, for all α ∈ F and v, w ∈ V.
2. If ⟨u,v⟩ = 0 for all v ∈ V then, in particular, ⟨u,u⟩ = 0. Hence, u = 0.

Definition [Inner Product Space] Let V be a vector space with an inner product ⟨·,·⟩. Then (V, ⟨·,·⟩) is called an inner product space (in short, ips).

Example Examples 1 and 2 below are called the standard inner product, or the dot product, on Rⁿ and Cⁿ, respectively. Whenever an inner product is not clearly mentioned, it will be assumed to be the standard inner product.
1. For u = (u₁,…,uₙ)ᵀ, v = (v₁,…,vₙ)ᵀ ∈ Rⁿ define ⟨u,v⟩ = u₁v₁ + ⋯ + uₙvₙ = vᵀu. Then ⟨·,·⟩ is indeed an inner product and hence (Rⁿ, ⟨·,·⟩) is an ips.
2. For u = (u₁,…,uₙ)ᵀ, v = (v₁,…,vₙ)ᵀ ∈ Cⁿ define ⟨u,v⟩ = u₁ conj(v₁) + ⋯ + uₙ conj(vₙ) = v*u. Then (Cⁿ, ⟨·,·⟩) is an ips.

3. For x = (x₁,x₂)ᵀ, y = (y₁,y₂)ᵀ ∈ R² and A = [4, −1; −1, 2], define ⟨x,y⟩ = yᵀAx. Then ⟨·,·⟩ is an inner product as ⟨x,x⟩ = (x₁−x₂)² + 3x₁² + x₂².

4. Fix A = [a, b; b, c] with a, c > 0 and ac > b². Then ⟨x,y⟩ = yᵀAx is an inner product on R², as ⟨x,x⟩ = ax₁² + 2bx₁x₂ + cx₂² = a(x₁ + (b/a)x₂)² + ((ac−b²)/a) x₂².

5. Verify that for x = (x₁,x₂,x₃)ᵀ, y = (y₁,y₂,y₃)ᵀ ∈ R³, ⟨x,y⟩ = 10x₁y₁ + 3x₁y₂ + 3x₂y₁ + 2x₂y₂ + x₂y₃ + x₃y₂ + x₃y₃ defines an inner product.

6. For x = (x₁,x₂)ᵀ, y = (y₁,y₂)ᵀ ∈ R², we define three maps that satisfy at least one condition out of the three conditions for an inner product. Determine the condition which is not satisfied. Give reasons for your answer.
(a) ⟨x,y⟩ = x₁y₁.
(b) ⟨x,y⟩ = x₁² + y₁² + x₂² + y₂².
(c) ⟨x,y⟩ = x₁y₁³ + x₂y₂³.

7. Let A ∈ Mₙ(C) be a Hermitian matrix. Then, for x, y ∈ Cⁿ, define ⟨x,y⟩ = y*Ax. Then ⟨·,·⟩ satisfies ⟨x,y⟩ = conj(⟨y,x⟩) and ⟨x+αz, y⟩ = ⟨x,y⟩ + α⟨z,y⟩, for all x, y, z ∈ Cⁿ and α ∈ C. Do there exist conditions on A such that ⟨x,x⟩ ≥ 0 for all x ∈ Cⁿ? This will be answered in the affirmative in the chapter on eigenvalues and eigenvectors.

8. For A, B ∈ Mₙ(R), define ⟨A,B⟩ = tr(BᵀA). Then
⟨A+B, C⟩ = tr(Cᵀ(A+B)) = tr(CᵀA) + tr(CᵀB) = ⟨A,C⟩ + ⟨B,C⟩, and ⟨A,B⟩ = tr(BᵀA) = tr((BᵀA)ᵀ) = tr(AᵀB) = ⟨B,A⟩.
If A = [a_{ij}] then ⟨A,A⟩ = tr(AᵀA) = Σᵢ (AᵀA)ᵢᵢ = Σ_{i,j} a_{ij}² > 0 for every non-zero matrix A.

9. Consider the complex vector space C[−1,1] and define ⟨f,g⟩ = ∫₋₁¹ f(x) conj(g(x)) dx. Then
(a) ⟨f,f⟩ = ∫₋₁¹ |f(x)|² dx ≥ 0, as |f(x)|² ≥ 0, and this integral is 0 if and only if f ≡ 0, as f is continuous.
(b) ⟨g,f⟩ = ∫₋₁¹ g(x) conj(f(x)) dx = conj(∫₋₁¹ f(x) conj(g(x)) dx) = conj(⟨f,g⟩).
(c) ⟨f+g, h⟩ = ∫₋₁¹ (f+g)(x) conj(h(x)) dx = ∫₋₁¹ [f(x) conj(h(x)) + g(x) conj(h(x))] dx = ⟨f,h⟩ + ⟨g,h⟩.
(d) ⟨αf, g⟩ = ∫₋₁¹ αf(x) conj(g(x)) dx = α⟨f,g⟩.

10. Fix an ordered basis B = [u₁,…,uₙ] of a complex vector space V. Then, for any u, v ∈ V with [u]_B = (a₁,…,aₙ)ᵀ and [v]_B = (b₁,…,bₙ)ᵀ, define ⟨u,v⟩ = Σᵢ aᵢ conj(bᵢ). Then ⟨·,·⟩ is indeed an inner product on V. So, every finite dimensional vector space can be endowed with an inner product.
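Example 4's recipe is easy to experiment with. The sketch below (illustrative, not from the notes) picks a = 4, b = 1, c = 2, so that ac − b² = 7 > 0, and spot-checks positivity of ⟨x,x⟩ = xᵀAx on random vectors.

```python
import numpy as np

A = np.array([[4., 1.], [1., 2.]])       # a = 4, b = 1, c = 2: a, c > 0, ac > b^2

def ip(x, y):
    return y @ A @ x                     # <x, y> = y^T A x

rng = np.random.default_rng(0)
for _ in range(5):                       # positivity on a few random vectors
    x = rng.standard_normal(2)
    assert ip(x, x) > 0
print(ip(np.array([1., 0.]), np.array([0., 1.])))   # <e1, e2> = 1 here, not 0
```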

5.1.A Cauchy Schwartz Inequality

As ⟨u,u⟩ > 0, for all u ≠ 0, we use the inner product to define the length of a vector.

Definition [Length/Norm of a Vector] Let V be a vector space over F. Then, for any vector u ∈ V, we define the length (norm) of u, denoted ‖u‖ = √⟨u,u⟩, the positive square root. A vector of norm 1 is called a unit vector. Thus, u/‖u‖ is the unit vector in the direction of u.

Example 1. Let V be an ips and u ∈ V. Then, for any scalar α, ‖αu‖ = |α| ‖u‖.
2. Let u = (1,−1,2,−3)ᵀ ∈ R⁴. Then ‖u‖ = √(1+1+4+9) = √15. Thus, (1/√15)u and −(1/√15)u are vectors of norm 1. Moreover, (1/√15)u is the unit vector in the direction of u.

Exercise 1. Let u = (−1,1,2,3,7)ᵀ ∈ R⁵. Find all α ∈ R such that ‖αu‖ = 1.
2. Let u = (−1,1,2,3,7)ᵀ ∈ C⁵. Find all α ∈ C such that ‖αu‖ = 1.
3. Prove that ‖x+y‖² + ‖x−y‖² = 2(‖x‖² + ‖y‖²), for all x, y ∈ Rⁿ. This equality is called the Parallelogram Law, as in a parallelogram the sum of the squares of the lengths of the diagonals equals twice the sum of the squares of the lengths of the sides.
4. Apollonius' Identity: Let the lengths of the sides of a triangle be a, b, c ∈ R and let d ∈ R be the length of the median drawn on the side of length a. Then prove that b² + c² = 2(d² + (a/2)²).
5. Let A ∈ Mₙ(C) satisfy ‖Ax‖ ≤ ‖x‖ for all x ∈ Cⁿ. Then prove that if α ∈ C with |α| > 1 then A − αI is invertible.
6. Let u = (1,2)ᵀ, v = (2,−1)ᵀ ∈ R². Does there exist an inner product on R² such that ‖u‖ = 1, ‖v‖ = 1 and ⟨u,v⟩ = 0? [Hint: Let A = [a, b; b, c] and define ⟨x,y⟩ = yᵀAx. Use the given conditions to get a linear system of 3 equations in the unknowns a, b, c.]
7. Let x = (x₁,x₂)ᵀ, y = (y₁,y₂)ᵀ ∈ R². Then ⟨x,y⟩ = 3x₁y₁ − x₁y₂ − x₂y₁ + x₂y₂ defines an inner product. Use this inner product to find
(a) the angle between e₁ = (1,0)ᵀ and e₂ = (0,1)ᵀ.
(b) v ∈ R² such that ⟨v, e₁⟩ = 0.
(c) x, y ∈ R² such that ‖x‖ = ‖y‖ = 1 and ⟨x,y⟩ = 0.

A very useful and fundamental inequality concerning the inner product, commonly called the Cauchy-Schwartz inequality, is proved next.

Theorem (Cauchy-Bunyakovskii-Schwartz inequality). Let V be an inner product space over F. Then, for any u, v ∈ V,
|⟨u,v⟩| ≤ ‖u‖ ‖v‖.   (★)
Moreover, equality holds in Inequality (★) if and only if u and v are linearly dependent. Furthermore, if u ≠ 0 then equality forces v = ⟨v, u/‖u‖⟩ (u/‖u‖).

Proof. If u = 0 then Inequality (★) holds. Hence, let u ≠ 0. Then, by the definition of an inner product, ⟨λu+v, λu+v⟩ ≥ 0 for all λ ∈ F and v ∈ V. In particular, for λ = −⟨v,u⟩/‖u‖²,
0 ≤ ⟨λu+v, λu+v⟩ = λ conj(λ) ‖u‖² + λ⟨u,v⟩ + conj(λ)⟨v,u⟩ + ‖v‖² = ‖v‖² − |⟨v,u⟩|²/‖u‖².
Or, in other words, |⟨v,u⟩|² ≤ ‖u‖² ‖v‖², and the proof of the inequality is over.
Now, note that equality holds in Inequality (★) if and only if ⟨λu+v, λu+v⟩ = 0, or equivalently, λu+v = 0. Hence, u and v are linearly dependent. Moreover, in that case,
0 = ⟨0, u⟩ = ⟨λu+v, u⟩ = λ⟨u,u⟩ + ⟨v,u⟩
implies that v = −λu = (⟨v,u⟩/‖u‖²) u = ⟨v, u/‖u‖⟩ (u/‖u‖).

Corollary Let x, y ∈ Rⁿ. Then (Σᵢ xᵢyᵢ)² ≤ (Σᵢ xᵢ²)(Σᵢ yᵢ²).
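A quick numerical sanity check of the inequality and its equality case, for the standard inner product on Rⁿ (a minimal sketch, not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(1)
u, v = rng.standard_normal(5), rng.standard_normal(5)
print(abs(u @ v) <= np.linalg.norm(u) * np.linalg.norm(v))            # always True
w = -2.5 * u                                                          # dependent pair
print(np.isclose(abs(u @ w), np.linalg.norm(u) * np.linalg.norm(w)))  # equality case
```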

5.1.B Angle between two Vectors

Let V be a real vector space. Then, for u, v ∈ V \ {0}, the Cauchy-Schwartz inequality implies that −1 ≤ ⟨u,v⟩/(‖u‖‖v‖) ≤ 1. We use this, together with the properties of the cosine function, to define the angle between two vectors in an inner product space.

Definition [Angle between two vectors] Let V be a real vector space. If θ ∈ [0, π] is the angle between u, v ∈ V \ {0} then we define cos θ = ⟨u,v⟩/(‖u‖‖v‖).

Example 1. Take (1,0)ᵀ, (1,1)ᵀ ∈ R². Then cos θ = 1/√2. So θ = π/4.
2. Take (1,1,0)ᵀ, (1,1,1)ᵀ ∈ R³. Then the angle between them, say β, equals cos⁻¹(2/√6).
3. The angle depends on the inner product. Take ⟨x,y⟩ = 2x₁y₁ + x₁y₂ + x₂y₁ + x₂y₂ on R². Then the angle between (1,0)ᵀ and (1,1)ᵀ equals cos⁻¹(3/√10).
4. As ⟨x,y⟩ = ⟨y,x⟩ for any real vector space, the angle between x and y is the same as the angle between y and x.
5. Let a, b ∈ R with a, b > 0. Then prove that (a+b)(1/a + 1/b) ≥ 4.
6. For 1 ≤ i ≤ n, let aᵢ ∈ R with aᵢ > 0. Then use the corollary above to show that (Σᵢ aᵢ)(Σᵢ 1/aᵢ) ≥ n².
7. Prove that |z₁ + ⋯ + zₙ|² ≤ n(|z₁|² + ⋯ + |zₙ|²), for z₁,…,zₙ ∈ C. When does equality hold?
8. Let V be an ips. If u, v ∈ V with ‖u‖ = 1, ‖v‖ = 1 and ⟨u,v⟩ = 1 then prove that u = αv for some α ∈ F. Is α = 1?

[Figure 5.1.B: A triangle with vertices A, B and C; the sides opposite to A, B and C have lengths a, b and c, respectively.]

We will now prove that if A, B and C are the vertices of a triangle (see Figure 5.1.B) and a, b and c, respectively, are the lengths of the corresponding sides then cos(A) = (b²+c²−a²)/(2bc). This in turn implies that the angle between vectors has been rightly defined.

Lemma Let A, B and C be the vertices of a triangle (see Figure 5.1.B) with corresponding side lengths a, b and c, respectively, in a real inner product space V. Then cos(A) = (b²+c²−a²)/(2bc).
Proof. Let 0, u and v be the coordinates of the vertices A, B and C, respectively, of the triangle ABC. Then AB = u, AC = v and BC = v−u, so that c = ‖u‖, b = ‖v‖ and a = ‖v−u‖. Thus, we need to prove that
cos(A) = (‖v‖² + ‖u‖² − ‖v−u‖²)/(2‖v‖‖u‖), or equivalently, ‖v‖² + ‖u‖² − ‖v−u‖² = 2‖v‖‖u‖ cos(A).
Now, by definition, ‖v−u‖² = ‖v‖² + ‖u‖² − 2⟨v,u⟩ and hence ‖v‖² + ‖u‖² − ‖v−u‖² = 2⟨u,v⟩. As ⟨v,u⟩ = ‖v‖‖u‖ cos(A), the required result follows.

Definition [Orthogonality] Let V be an inner product space over R. Then
1. the vectors u, v ∈ V are called orthogonal/perpendicular if ⟨u,v⟩ = 0.
2. for S ⊆ V, the orthogonal complement of S in V, denoted S⊥, equals S⊥ = {v ∈ V : ⟨v,w⟩ = 0, for all w ∈ S}.

Example 1. 0 is orthogonal to every vector as ⟨0,x⟩ = 0 for all x ∈ V.
2. If V is a vector space over R or C then 0 is the only vector that is orthogonal to itself.
3. Let V = R.

(a) S = {0}. Then S⊥ = R.
(b) S = R. Then S⊥ = {0}.
(c) Let S be any subset of R containing a non-zero real number. Then S⊥ = {0}.

4. Let u = (1,2)ᵀ. What is u⊥ in R²?
Solution: u⊥ = {(x,y)ᵀ ∈ R² : x+2y = 0}. Is this Null(uᵀ)? Note that (2,−1)ᵀ is a basis of u⊥ and, for any vector x ∈ R²,
x = (⟨x,u⟩/‖u‖²) u + (x − (⟨x,u⟩/‖u‖²) u) = ((x₁+2x₂)/5)(1,2)ᵀ + ((2x₁−x₂)/5)(2,−1)ᵀ
is a decomposition of x into two vectors, one parallel to u and the other parallel to u⊥.

5. Fix u = (1,1,1,1)ᵀ, v = (1,1,−1,0)ᵀ ∈ R⁴. Determine z, w ∈ R⁴ such that u = z+w, with the condition that z is parallel to v and w is orthogonal to v.
Solution: As z is parallel to v, z = kv = (k,k,−k,0)ᵀ, for some k ∈ R. Since w is orthogonal to v, the vector w = (a,b,c,d)ᵀ satisfies a+b−c = 0. Thus, c = a+b and
(1,1,1,1)ᵀ = u = z+w = (k,k,−k,0)ᵀ + (a,b,a+b,d)ᵀ.
Comparing the corresponding coordinates gives the linear system d = 1, a+k = 1, b+k = 1 and a+b−k = 1 in the unknowns a, b, d and k. Solving for a, b, d and k gives z = (1/3)(1,1,−1,0)ᵀ and w = (1/3)(2,2,4,3)ᵀ.

6. Let x, y ∈ Rⁿ. Then prove that
(a) ⟨x,y⟩ = 0 ⟺ ‖x−y‖² = ‖x‖² + ‖y‖² (Pythagoras Theorem).
Solution: Use ‖x−y‖² = ‖x‖² + ‖y‖² − 2⟨x,y⟩ to get the required result.
(b) ‖x‖ = ‖y‖ ⟺ ⟨x+y, x−y⟩ = 0 (x and y form adjacent sides of a rhombus as the diagonals x+y and x−y are orthogonal).
Solution: Use ⟨x+y, x−y⟩ = ‖x‖² − ‖y‖² to get the required result.
(c) 4⟨x,y⟩ = ‖x+y‖² − ‖x−y‖² (polarization identity in Rⁿ).
Solution: Just expand the right hand side to get the required result.
(d) ‖x+y‖² + ‖x−y‖² = 2‖x‖² + 2‖y‖² (parallelogram law: the sum of squares of the diagonals of a parallelogram equals twice the sum of squares of its sides).
Solution: Just expand the left hand side to get the required result.

7. Let P = (1,1,1)ᵀ, Q = (2,1,3)ᵀ and R = (−1,1,2)ᵀ be three vertices of a triangle in R³. Compute the angle between the sides PQ and PR.
Solution: Method 1: Note that PQ = (2,1,3)ᵀ − (1,1,1)ᵀ = (1,0,2)ᵀ, PR = (−2,0,1)ᵀ and QR = (−3,0,−1)ᵀ. As ⟨PQ, PR⟩ = 0, the angle between the sides PQ and PR is π/2.
Method 2: ‖PQ‖ = √5, ‖PR‖ = √5 and ‖QR‖ = √10. As ‖QR‖² = ‖PQ‖² + ‖PR‖², by the Pythagoras theorem, the angle between the sides PQ and PR is π/2.
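Example 5's decomposition into a parallel and an orthogonal part is a one-liner in code. A minimal sketch (not from the notes), using z = (⟨u,v⟩/⟨v,v⟩) v:

```python
import numpy as np

u = np.array([1., 1., 1., 1.])
v = np.array([1., 1., -1., 0.])
z = (u @ v) / (v @ v) * v            # parallel part: (1/3)(1, 1, -1, 0)
w = u - z                            # orthogonal part: (1/3)(2, 2, 4, 3)
print(z, w, np.isclose(w @ v, 0.0))  # w is indeed orthogonal to v
```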

Exercise 1. Let V be an ips.
(a) If S ⊆ V then S⊥ is a subspace of V and S⊥ = (LS(S))⊥.
(b) Furthermore, if V is finite dimensional then S⊥ and LS(S) are complementary. That is, V = LS(S) + S⊥. Equivalently, ⟨u,w⟩ = 0, for all u ∈ LS(S) and w ∈ S⊥.

2. Consider R³ with the standard inner product. Find
(a) S⊥ for S = {(1,1,1)ᵀ, (0,1,−1)ᵀ} and for S = LS((1,1,1)ᵀ, (0,1,−1)ᵀ).
(b) vectors v, w ∈ R³ such that v, w and u = (1,1,1)ᵀ are mutually orthogonal.
(c) the line passing through (1,1,−1)ᵀ and parallel to (a,b,c) ≠ 0.
(d) the plane containing (1,1,−1)ᵀ with (a,b,c) ≠ 0 as the normal vector.
(e) the area of the parallelogram with three vertices 0ᵀ, (1,2,−2)ᵀ and (2,3,0)ᵀ.
(f) the area of the parallelogram when ‖x‖ = 5, ‖x−y‖ = 8 and ‖x+y‖ = 14.
(g) the plane containing (2,−2,1)ᵀ and perpendicular to the line with parametric equation x = t−1, y = 3t+2, z = t+1.
(h) the plane containing the lines (1,2,−2) + t(1,1,0) and (1,2,−2) + t(0,1,2).
(i) k such that the angle between u = (1,−1,1)ᵀ and v = (1,k,1)ᵀ equals π/3.
(j) the plane containing (1,1,2)ᵀ and orthogonal to the line with parametric equation x = 2+t, y = 3 and z = 1−t.
(k) a parametric equation of the line containing (1,−2,1)ᵀ and orthogonal to the plane x+3y+2z = 1.

3. Let P = (3,0,2)ᵀ, Q = (1,2,−1)ᵀ and R = (2,−1,1)ᵀ be three points in R³. Then,
(a) find the area of the triangle with vertices P, Q and R.
(b) find the area of the parallelogram built on the vectors PQ and QR.
(c) find a nonzero vector orthogonal to the plane of the above triangle.
(d) find all vectors x orthogonal to PQ and QR with ‖x‖ = 2.
(e) find the volume of the parallelepiped built on the vectors PQ, QR and x, where x is one of the vectors found in Part (d). Do you think the volume would be different if you choose the other vector x?

4. Let p₁ be the plane containing A = (1,2,3)ᵀ and having (2,−1,1)ᵀ as its normal vector. Then
(a) find the equation of the plane p₂ that is parallel to p₁ and contains (−1,2,−3)ᵀ.
(b) calculate the distance between the planes p₁ and p₂.

5. In the parallelogram ABCD, with AB ∥ DC and AD ∥ BC, let A = (−2,1,3)ᵀ, B = (−1,2,2)ᵀ and C = (−3,1,5)ᵀ. Find the
(a) coordinates of the point D,
(b) cosine of the angle BCD,
(c) area of the triangle ABC,
(d) volume of the parallelepiped determined by AB, AD and (0,0,−7)ᵀ.

6. Let W = {(x,y,z,w)ᵀ ∈ R⁴ : x+y+z−w = 0}. Find a basis of W⊥.
7. Recall the ips Mₙ(R) with the trace inner product (see the example above). If W = {A ∈ Mₙ(R) : Aᵀ = A} then what is W⊥?

5.1.C Normed Linear Space

To proceed further, recall that a vector space over R or C was called a linear space.

Definition Let V be a linear space.
1. A norm on V is a function f(x) = ‖x‖ from V to R such that
(a) ‖x‖ ≥ 0 for all x ∈ V, and ‖x‖ = 0 implies x = 0.
(b) ‖αx‖ = |α| ‖x‖ for all α ∈ F and x ∈ V.
(c) ‖x+y‖ ≤ ‖x‖ + ‖y‖ for all x, y ∈ V (triangle inequality).
2. A linear space with a norm on it is called a normed linear space (nls).

Theorem Let V be a normed linear space and x, y ∈ V. Then |‖x‖ − ‖y‖| ≤ ‖x−y‖.
Proof. As ‖x‖ = ‖x−y+y‖ ≤ ‖x−y‖ + ‖y‖, one has ‖x‖ − ‖y‖ ≤ ‖x−y‖. Similarly, one obtains ‖y‖ − ‖x‖ ≤ ‖y−x‖ = ‖x−y‖. Combining the two, the required result follows.

Example 1. On R³, ‖x‖ = √(x₁² + x₂² + x₃²) is a norm. Also, observe that this norm corresponds to √⟨x,x⟩, where ⟨·,·⟩ is the standard inner product.
2. Let V be an ips. Is it true that f(x) = √⟨x,x⟩ is a norm?
Solution: Yes. The reader should verify the first two conditions. For the third condition, recalling the Cauchy-Schwartz inequality, we get
f(x+y)² = ⟨x+y, x+y⟩ = ⟨x,x⟩ + ⟨x,y⟩ + ⟨y,x⟩ + ⟨y,y⟩ ≤ ‖x‖² + ‖x‖‖y‖ + ‖x‖‖y‖ + ‖y‖² = (f(x)+f(y))².
Thus, ‖x‖ = √⟨x,x⟩ is a norm, called the norm induced by the inner product ⟨·,·⟩.

Exercise 1. Let V be an ips. Then 4⟨x,y⟩ = ‖x+y‖² − ‖x−y‖² + i‖x+iy‖² − i‖x−iy‖² (Polarization Identity).
2. Consider the complex vector space Cⁿ. If x, y ∈ Cⁿ then prove that
(a) If x ≠ 0 then ‖x+ix‖² = ‖x‖² + ‖ix‖², even though ⟨x, ix⟩ ≠ 0.
(b) ⟨x,y⟩ = 0 whenever ‖x+y‖² = ‖x‖² + ‖y‖² and ‖x+iy‖² = ‖x‖² + ‖iy‖².

The next result is stated without proof as the proof is beyond the scope of this book.

Theorem Let ‖·‖ be a norm on a nls V. Then ‖·‖ is induced by some inner product if and only if ‖·‖ satisfies the parallelogram law: ‖x+y‖² + ‖x−y‖² = 2‖x‖² + 2‖y‖².

Example 1. For x = (x₁,x₂)ᵀ ∈ R², define ‖x‖₁ = |x₁| + |x₂|. Verify that ‖x‖₁ is indeed a norm. But, for x = e₁ and y = e₂, 2(‖x‖₁² + ‖y‖₁²) = 4, whereas
‖x+y‖₁² + ‖x−y‖₁² = ‖(1,1)‖₁² + ‖(1,−1)‖₁² = (1+1)² + (1+1)² = 8.
So, the parallelogram law fails. Thus, ‖·‖₁ is not induced by any inner product on R².

2. Does there exist an inner product on R² such that ‖x‖ = max{|x₁|, |x₂|}?
3. If ‖·‖ is a norm on V then d(x,y) = ‖x−y‖, for x, y ∈ V, defines a distance function as
(a) d(x,x) = 0, for each x ∈ V.
(b) using the triangle inequality, for any z ∈ V, we have d(x,y) = ‖x−y‖ = ‖(x−z) + (z−y)‖ ≤ ‖x−z‖ + ‖z−y‖ = d(x,z) + d(z,y).

5.1.D Application to Fundamental Spaces

We end this section by proving the fundamental theorem of linear algebra. So, the readers are advised to recall the four fundamental subspaces and also to go through the rank-nullity theorem for matrices. We start with the following result.

Lemma Let A ∈ M_{m,n}(R). Then Null(A) = Null(AᵀA).
Proof. Let x ∈ Null(A). Then Ax = 0. So, (AᵀA)x = Aᵀ(Ax) = Aᵀ0 = 0. Thus, x ∈ Null(AᵀA). That is, Null(A) ⊆ Null(AᵀA).
Suppose that x ∈ Null(AᵀA). Then (AᵀA)x = 0 and 0 = xᵀ0 = xᵀ(AᵀA)x = (Ax)ᵀ(Ax) = ‖Ax‖². Thus, Ax = 0 and the required result follows.

Theorem (Fundamental Theorem of Linear Algebra). Let A ∈ Mₙ(C). Then
1. dim(Null(A)) + dim(Col(A)) = n.
2. Null(A) = (Col(A*))⊥ and Null(A*) = (Col(A))⊥.
3. dim(Col(A)) = dim(Col(A*)).
Proof. Part 1: Proved in the rank-nullity theorem for matrices.
Part 2: We first prove that Null(A) ⊆ (Col(A*))⊥. Let x ∈ Null(A). Then Ax = 0 and
0 = ⟨0, u⟩ = ⟨Ax, u⟩ = u*Ax = (A*u)*x = ⟨x, A*u⟩, for all u ∈ Cⁿ.
But Col(A*) = {A*u : u ∈ Cⁿ}. Thus, x ∈ (Col(A*))⊥ and Null(A) ⊆ (Col(A*))⊥.
We now prove that (Col(A*))⊥ ⊆ Null(A). Let x ∈ (Col(A*))⊥. Then, for every y ∈ Cⁿ,
0 = ⟨x, A*y⟩ = (A*y)*x = y*(A*)*x = y*Ax = ⟨Ax, y⟩.
In particular, for y = Ax ∈ Cⁿ, we get ‖Ax‖² = 0. Hence Ax = 0. That is, x ∈ Null(A). Thus, the proof of the first equality in Part 2 is over. We omit the second equality as it proceeds on the same lines as above.
Part 3: Use the first two parts to get the required result.
Hence, the proof of the fundamental theorem is complete.

We now give some implications of the above theorem.
Corollary Let A ∈ Mₙ(R). Then the function T : Col(Aᵀ) → Col(A) defined by T(x) = Ax is one-one and onto.

118 118 CHAPTER 5. INNER PRODUCT SPACES Proof. In view of Theorem , we just need to show that the map is one-one. So, let us assume that there exist x,y Col(A T ) such that T(x) = T(y). Or equivalently, Ax = Ay. Thus, x y Null(A) = (Col(A T )) (by Theorem ). Therefore, x y (Col(A T )) Col(A T ) = {0} (by Example 2). Thus, x = y and hence the map is one-one. Remark Let A M m,n (R). 1. Then the spaces Col(A) and Null(A T ) are not only orthogonal but are orthogonal complement of each other. 2. Further if Rank(A) = r then, using Corollary , we observe the following: (a) If i 1,...,i r are the pivot rows of A then {A(A[i 1,:] T ),...,A(A[i r,:] T )} form a basis of Col(A). (b) Similarly, if j 1,...,j r are the pivot columns of A then {A T (A[:,j 1 ]),...,A T (A[:,j r ])} form a basis of Col(A T ). (c) So, if we choose the rows and columns corresponding to the pivot entries then the corresponding r r submatrix of A is invertible. The readers should look at Example and Remark We give one more example Example Let A = Then verify that {(0,1,1) t,(1,1,2) T } is a basis of Col(A). 2. {(1,1, 1) T } is a basis of Null(A T ). 3. Null(A T ) = (Col(A)). Exercise Find distinct subspaces W 1 and W 2 (a) in R 2 such that W 1 and W 2 are orthogonal but not orthogonal complement. (b) in R 3 such that W 1 {0} and W 2 {0} are orthogonal, but not orthogonal complement. 2. Let A M m,n (C). Then Null(A) = Null(A A). 3. Let A M m,n (R). Then Col(A) = Col(A T A). 4. Let A M m,n (R). Then Rank(A) = n if and only if Rank(A T A) = n. 5. Let A M m,n (C). Then, for every (a) x R n, x = u+v, where u Col(A T ) and v Null(A) are unique. (b) y R m, y = w+z, where w Col(A) and z Null(A T ) are unique. For more information related with the fundamental theorem of linear algebra the interested readers are advised to see the article The Fundamental Theorem of Linear Algebra, Gilbert Strang, The American Mathematical Monthly, Vol. 100, No. 9, Nov., 1993, pp
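The orthogonality statements of the fundamental theorem are easy to see numerically. The sketch below (illustrative, not part of the notes) computes a basis of Null(A) from the SVD, one standard way to do this in NumPy, and checks that it is orthogonal to Col(Aᵀ) and that the rank-nullity count works out.

```python
import numpy as np

A = np.array([[1., 2., 1.], [2., 4., 2.]])     # rank 1, n = 3
_, s, Vt = np.linalg.svd(A)
r = int((s > 1e-10).sum())                     # numerical rank
N = Vt[r:].T                                   # columns: a basis of Null(A)
print(np.allclose(A @ N, 0))                   # N really lies in Null(A)
print(np.allclose(N.T @ A.T, 0))               # Null(A) is orthogonal to Col(A^T)
print(N.shape[1] + r == A.shape[1])            # rank-nullity: dim Null + rank = n
```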

5.1.E Properties of Orthonormal Vectors

At the end of the previous subsection, we saw that Col(A) is orthogonal to Null(Aᵀ). So, in this section, we try to understand orthonormal vectors.

Definition [Orthonormal Set] Let V be an ips. Then a non-empty set S = {v₁,…,vₙ} ⊆ V is called an orthonormal set if
1. ⟨vᵢ, v_j⟩ = 0, for all 1 ≤ i ≠ j ≤ n. That is, vᵢ and v_j are mutually orthogonal, for 1 ≤ i ≠ j ≤ n.
2. ‖vᵢ‖ = 1, for 1 ≤ i ≤ n.
If S is also a basis of V then S is called an orthonormal basis of V.

Example 1. A few orthonormal sets in R² are {(1,0)ᵀ, (0,1)ᵀ}, {(1/√2)(1,1)ᵀ, (1/√2)(1,−1)ᵀ} and {(1/√5)(2,1)ᵀ, (1/√5)(1,−2)ᵀ}.
2. Let S = {e₁,…,eₙ} be the standard basis of Rⁿ. Then S is an orthonormal set as
(a) ‖eᵢ‖ = 1, for 1 ≤ i ≤ n.
(b) ⟨eᵢ, e_j⟩ = 0, for 1 ≤ i ≠ j ≤ n.
3. The set {(1/√3)(1,1,1)ᵀ, (1/√2)(0,1,−1)ᵀ, (1/√6)(2,−1,−1)ᵀ} is an orthonormal set in R³.
4. Recall that ⟨f(x), g(x)⟩ = ∫₋π^π f(x)g(x) dx defines the standard inner product on C[−π, π]. Consider S = {1} ∪ {e_m : m ≥ 1} ∪ {f_n : n ≥ 1}, where 1(x) = 1, e_m(x) = cos(mx) and f_n(x) = sin(nx), for all m, n ≥ 1 and for all x ∈ [−π, π]. Then
(a) S is a linearly independent set.
(b) ‖1‖² = 2π, ‖e_m‖² = π and ‖f_n‖² = π.
(c) the functions in S are mutually orthogonal.
Hence, {1/√(2π)} ∪ {(1/√π) e_m : m ≥ 1} ∪ {(1/√π) f_n : n ≥ 1} is an orthonormal set in C[−π, π].

Theorem Let V be an ips with {u₁,…,uₙ} a set of mutually orthogonal non-zero vectors.
1. Then the set {u₁,…,uₙ} is linearly independent.
2. If v = Σᵢ αᵢuᵢ ∈ V then ‖v‖² = ‖Σᵢ αᵢuᵢ‖² = Σᵢ |αᵢ|² ‖uᵢ‖².
3. Let v = Σᵢ αᵢuᵢ ∈ V. So, for 1 ≤ i ≤ n, if ‖uᵢ‖ = 1 then αᵢ = ⟨v, uᵢ⟩. That is, v = Σᵢ ⟨v,uᵢ⟩uᵢ and ‖v‖² = Σᵢ |⟨v,uᵢ⟩|².
4. Let dim(V) = n. Then ⟨v, uᵢ⟩ = 0 for all i = 1, 2, …, n if and only if v = 0.

Proof. Part 1: Consider the linear system c₁u₁ + ⋯ + cₙuₙ = 0 in the unknowns c₁,…,cₙ. As ⟨0,u⟩ = 0 and ⟨u_j, uᵢ⟩ = 0, for all j ≠ i, we have
0 = ⟨0, uᵢ⟩ = ⟨c₁u₁ + ⋯ + cₙuₙ, uᵢ⟩ = Σ_{j=1}^{n} c_j ⟨u_j, uᵢ⟩ = cᵢ ⟨uᵢ, uᵢ⟩.
As uᵢ ≠ 0, ⟨uᵢ,uᵢ⟩ ≠ 0 and therefore cᵢ = 0, for 1 ≤ i ≤ n. Thus, the above linear system has only the trivial solution. This completes the proof of Part 1.
Part 2: A similar argument gives
‖Σᵢ αᵢuᵢ‖² = ⟨Σᵢ αᵢuᵢ, Σ_j α_j u_j⟩ = Σᵢ Σ_j αᵢ conj(α_j) ⟨uᵢ, u_j⟩ = Σᵢ αᵢ conj(αᵢ) ⟨uᵢ, uᵢ⟩ = Σᵢ |αᵢ|² ‖uᵢ‖².
Part 3: If ‖uᵢ‖ = 1, for 1 ≤ i ≤ n, then ⟨v, uᵢ⟩ = ⟨Σ_j α_j u_j, uᵢ⟩ = Σ_j α_j ⟨u_j, uᵢ⟩ = αᵢ.
Part 4: Follows directly using Part 3 as {u₁,…,uₙ} is a basis of V.

Remark Using the above theorem, we see that if B = [v₁,…,vₙ] is an ordered orthonormal basis of an ips V then, for each u ∈ V, [u]_B = (⟨u,v₁⟩, …, ⟨u,vₙ⟩)ᵀ. Thus, in place of solving a linear system to get the coordinates of a vector, we just need to compute the inner products with the basis vectors.

Exercise 1. Find v, w ∈ R³ such that v, w and (1,−1,−2)ᵀ are mutually orthogonal.
2. Let B = [(1/√5)(1,2)ᵀ, (1/√5)(2,−1)ᵀ] be an ordered basis of R². Then verify that [(x,y)ᵀ]_B = ((x+2y)/√5, (2x−y)/√5)ᵀ.
3. For the ordered basis B = [(1/√3)(1,1,1)ᵀ, (1/√2)(0,1,−1)ᵀ, (1/√6)(2,−1,−1)ᵀ] of R³, verify that [(2,3,1)ᵀ]_B = (2√3, √2, 0)ᵀ.
4. Let S = {u₁,…,uₙ} ⊆ Rⁿ and define A = [u₁,…,uₙ]. Then prove that A is an orthogonal matrix if and only if S is an orthonormal basis of Rⁿ.
5. Let A be an n × n orthogonal matrix. Then prove that
(a) the rows/columns of A form an orthonormal basis of Rⁿ.
(b) for any two vectors x, y ∈ Rⁿ, ⟨Ax, Ay⟩ = ⟨x,y⟩. Orthogonal matrices preserve angles.
(c) for any vector x ∈ Rⁿ, ‖Ax‖ = ‖x‖. Orthogonal matrices preserve length.
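The remark above, coordinates as inner products, is a one-line computation with an orthonormal basis. A minimal sketch (not from the notes), using the basis of Example 3:

```python
import numpy as np

B = np.column_stack([np.array([1., 1., 1.]) / np.sqrt(3),
                     np.array([0., 1., -1.]) / np.sqrt(2),
                     np.array([2., -1., -1.]) / np.sqrt(6)])
u = np.array([2., 3., 1.])
coords = B.T @ u                   # ( <u, v_1>, <u, v_2>, <u, v_3> ): no solve needed
print(coords)                      # (2*sqrt(3), sqrt(2), 0)
print(np.allclose(B @ coords, u))  # the coordinates reconstruct u
```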

6. Let A be an n × n unitary matrix. Then prove that
(a) the rows/columns of A form an orthonormal basis of the complex vector space Cⁿ.
(b) for any two vectors x, y ∈ Cⁿ, ⟨Ax, Ay⟩ = ⟨x,y⟩. Unitary matrices preserve angles.
(c) for any vector x ∈ Cⁿ, ‖Ax‖ = ‖x‖. Unitary matrices preserve length.
7. Let A, B ∈ Mₙ(C) be two unitary matrices. Then prove that AB and BA are unitary matrices.
8. If A = [a_{ij}] and B = [b_{ij}] are unitarily equivalent then prove that Σ_{ij} |a_{ij}|² = Σ_{ij} |b_{ij}|².
9. Let A be an n × n upper triangular matrix. If A is also an orthogonal matrix then A is a diagonal matrix with diagonal entries ±1.

5.2 Gram-Schmidt Orthogonalization Process

In view of the importance of the theorem above, we inquire into the question of extracting an orthonormal basis from a given basis. The process of extracting an orthonormal basis from a finite linearly independent set is called the Gram-Schmidt Orthogonalization process. We first consider a few examples.

Example Which point on a plane P through 0 is closest to a given point Q?
Solution: Let y be the foot of the perpendicular from Q on P. By the Pythagoras theorem, this point is unique. So, the question arises: how do we find y? Note that the vector from y to Q is normal to the plane P, and y = Q − yQ. So, we need a way to compute the component of a vector perpendicular to a given direction.
Thus, we see that, given u, v ∈ V \ {0}, we need to find two vectors, say y and z, such that y is parallel to u and z is perpendicular to u. Here ‖y‖ = ‖v‖ cos(θ) and ‖z‖ = ‖v‖ sin(θ), where θ is the angle between u and v.

[Figure: Decomposition of the vector v = OP into OQ = ⟨v, u/‖u‖⟩ (u/‖u‖), the component along u, and OR = v − ⟨v, u/‖u‖⟩ (u/‖u‖), the component perpendicular to u.]

We do this as follows (see the figure). Let û = u/‖u‖ be the unit vector in the direction of u. Then, using trigonometry, cos(θ) = |OQ|/|OP|. Hence |OQ| = |OP| cos(θ). Now, using

the definition of the angle, |OQ| = ‖v‖ |⟨v,u⟩|/(‖v‖‖u‖) = |⟨v, u/‖u‖⟩|, where the absolute value is taken as length/norm is a positive quantity. Thus,
OQ = |OQ| û = ⟨v, u/‖u‖⟩ (u/‖u‖), and z = v − ⟨v, u/‖u‖⟩ (u/‖u‖).
In the literature, the vector y = OQ is called the orthogonal projection of v on u, denoted Proj_u(v). Thus,
Proj_u(v) = ⟨v, u/‖u‖⟩ (u/‖u‖) = (⟨v,u⟩/‖u‖²) u and ‖Proj_u(v)‖ = |OQ| = |⟨v,u⟩|/‖u‖.   (★)
Moreover, the distance of the point P (the tip of v) from the line containing u equals ‖OR‖ = ‖v − ⟨v, u/‖u‖⟩ (u/‖u‖)‖.

Example 1. Determine the foot of the perpendicular from the point (1, 2, 3) on the XY-plane.
Solution: Verify that the required point is (1,2,0).

2. Determine the foot of the perpendicular from the point Q = (1,2,3,4) on the plane generated by (1,1,0,0), (1,0,1,0) and (0,1,1,1).
Answer: (x,y,z,w) lies on the plane if and only if x − y − z + 2w = 0, that is, ⟨(1,−1,−1,2), (x,y,z,w)⟩ = 0. So, the required point equals
(1,2,3,4) − ⟨(1,2,3,4), (1/√7)(1,−1,−1,2)⟩ (1/√7)(1,−1,−1,2) = (1,2,3,4) − (4/7)(1,−1,−1,2) = (1/7)(3,18,25,20).

3. Determine the projection of v = (1,1,1,1)ᵀ on u = (1,1,−1,0)ᵀ.
Solution: By Equation (★), we have Proj_u(v) = (⟨v,u⟩/‖u‖²) u = (1/3)(1,1,−1,0)ᵀ, and w = (1,1,1,1)ᵀ − Proj_u(v) = (1/3)(2,2,4,3)ᵀ is orthogonal to u.

4. Let u = (1,1,1,1)ᵀ, v = (1,1,−1,0)ᵀ, w = (1,1,0,−1)ᵀ ∈ R⁴. Write v = v₁ + v₂, where v₁ is parallel to u and v₂ is orthogonal to u. Also, write w = w₁ + w₂ + w₃ such that w₁ is parallel to u, w₂ is parallel to v₂ and w₃ is orthogonal to both u and v₂.
Solution: Note that
(a) v₁ = Proj_u(v) = (⟨v,u⟩/‖u‖²) u = (1/4) u = (1/4)(1,1,1,1)ᵀ is parallel to u.
(b) v₂ = v − (1/4)u = (1/4)(3,3,−5,−1)ᵀ is orthogonal to u.
Note that Proj_u(w) is parallel to u and Proj_{v₂}(w) is parallel to v₂. Hence, we have
(a) w₁ = Proj_u(w) = (⟨w,u⟩/‖u‖²) u = (1/4) u = (1/4)(1,1,1,1)ᵀ is parallel to u,
(b) w₂ = Proj_{v₂}(w) = (⟨w,v₂⟩/‖v₂‖²) v₂ = (7/44)(3,3,−5,−1)ᵀ is parallel to v₂, and
(c) w₃ = w − w₁ − w₂ = (3/11)(1,1,2,−4)ᵀ... more precisely, w₃ = (3/11)(1,1,2,−4)ᵀ is orthogonal to both u and v₂.

That is, from the given vector we subtract all its orthogonal projections on the previously obtained vectors. If the new vector is non-zero then it is orthogonal to the previous ones. This idea is generalized to give the Gram-Schmidt Orthogonalization process.

Theorem (Gram-Schmidt Orthogonalization Process). Let V be an ips. If {v₁,…,vₙ} is a set of linearly independent vectors in V then there exists an orthonormal set {w₁,…,wₙ} in V. Furthermore, LS(w₁,…,wᵢ) = LS(v₁,…,vᵢ), for 1 ≤ i ≤ n.
Proof. Note that for orthonormality we need ‖wᵢ‖ = 1, for 1 ≤ i ≤ n, and ⟨wᵢ, w_j⟩ = 0, for 1 ≤ i ≠ j ≤ n. Also, vᵢ ∉ LS(v₁,…,v_{i−1}), for 2 ≤ i ≤ n, as {v₁,…,vₙ} is a linearly independent set. We are now ready to prove the result by induction.
Step 1: Define w₁ = v₁/‖v₁‖; then LS(v₁) = LS(w₁).
Step 2: Define u₂ = v₂ − ⟨v₂,w₁⟩w₁. Then u₂ ≠ 0 as v₂ ∉ LS(v₁). So, let w₂ = u₂/‖u₂‖. Note that {w₁, w₂} is orthonormal and LS(w₁,w₂) = LS(v₁,v₂).
Step 3: For induction, assume that we have obtained an orthonormal set {w₁,…,w_{k−1}} such that LS(v₁,…,v_{k−1}) = LS(w₁,…,w_{k−1}). Now, note that
u_k = v_k − Σ_{i=1}^{k−1} ⟨v_k, wᵢ⟩wᵢ = v_k − Σ_{i=1}^{k−1} Proj_{wᵢ}(v_k) ≠ 0,
as v_k ∉ LS(v₁,…,v_{k−1}). So, let us put w_k = u_k/‖u_k‖. Then {w₁,…,w_k} is orthonormal as ‖w_k‖ = 1 and
⟨w_k, w₁⟩ = (1/‖u_k‖)⟨u_k, w₁⟩ = (1/‖u_k‖)(⟨v_k, w₁⟩ − Σ_{i=1}^{k−1} ⟨v_k,wᵢ⟩⟨wᵢ, w₁⟩) = (1/‖u_k‖)(⟨v_k, w₁⟩ − ⟨v_k, w₁⟩) = 0.
Similarly, ⟨w_k, wᵢ⟩ = 0, for 2 ≤ i ≤ k−1. Clearly, w_k = u_k/‖u_k‖ ∈ LS(w₁,…,w_{k−1}, v_k). So, w_k ∈ LS(v₁,…,v_k). As v_k = ‖u_k‖ w_k + Σ_{i=1}^{k−1} ⟨v_k,wᵢ⟩wᵢ, we get v_k ∈ LS(w₁,…,w_k). Hence, by the principle of mathematical induction, LS(w₁,…,w_k) = LS(v₁,…,v_k) and the required result follows.

We now illustrate the Gram-Schmidt process with a few examples.
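Before the worked examples, the procedure itself fits in a few lines of code. A minimal sketch (not part of the notes); dependent inputs, where u_k = 0, are simply skipped, as in Example 2 below:

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize a list of vectors; a (numerically) zero u_k is skipped."""
    basis = []
    for v in vectors:
        u = v - sum((v @ w) * w for w in basis)   # subtract all projections
        norm = np.linalg.norm(u)
        if norm > tol:                            # u = 0 means v was dependent
            basis.append(u / norm)
    return basis

S = [np.array([1., 0., 1., 0.]),
     np.array([0., 1., 0., 1.]),
     np.array([1., -1., 1., 1.])]
for w in gram_schmidt(S):
    print(np.round(w, 4))   # (1,0,1,0)/sqrt2, (0,1,0,1)/sqrt2, (0,-1,0,1)/sqrt2
```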

Example 1. Let S = {(1,−1,1,1), (1,0,1,0), (0,1,0,1)} ⊆ R⁴. Find an orthonormal set T such that LS(S) = LS(T).
Solution: Let v₁ = (1,0,1,0)ᵀ, v₂ = (0,1,0,1)ᵀ and v₃ = (1,−1,1,1)ᵀ. Then w₁ = (1/√2)(1,0,1,0)ᵀ. As ⟨v₂, w₁⟩ = 0, we get w₂ = (1/√2)(0,1,0,1)ᵀ. For the third vector, let u₃ = v₃ − ⟨v₃,w₁⟩w₁ − ⟨v₃,w₂⟩w₂ = (0,−1,0,1)ᵀ. Thus, w₃ = (1/√2)(0,−1,0,1)ᵀ.

2. Let S = {v₁ = (1,0,0)ᵀ, v₂ = (1,1,0)ᵀ, v₃ = (1,2,0)ᵀ, v₄ = (1,1,1)ᵀ}. Find an orthonormal set T such that LS(S) = LS(T).
Solution: Take w₁ = v₁/‖v₁‖ = (1,0,0)ᵀ = e₁. For the second vector, consider u₂ = v₂ − ⟨v₂,w₁⟩w₁ = (0,1,0)ᵀ. So, put w₂ = u₂/‖u₂‖ = (0,1,0)ᵀ = e₂. For the third vector, let u₃ = v₃ − Σ_{i=1}^{2} ⟨v₃,wᵢ⟩wᵢ = (0,0,0)ᵀ. So, v₃ ∈ LS(w₁,w₂). Or equivalently, the set {v₁,v₂,v₃} is linearly dependent. So, for again computing the third vector, define u₄ = v₄ − Σ_{i=1}^{2} ⟨v₄,wᵢ⟩wᵢ. Then u₄ = v₄ − w₁ − w₂ = e₃. So w₄ = e₃. Hence, T = {w₁, w₂, w₄} = {e₁, e₂, e₃}.

3. Find an orthonormal set in R³ containing (1,2,1)ᵀ.
Solution: Let (x,y,z)ᵀ ∈ R³ with ⟨(1,2,1), (x,y,z)⟩ = 0. Thus, (x,y,z) = (−2y−z, y, z) = y(−2,1,0) + z(−1,0,1). Observe that (−2,1,0) and (−1,0,1) are orthogonal to (1,2,1) but are not orthogonal to each other.
Method 1: Apply the Gram-Schmidt process to {(1/√6)(1,2,1)ᵀ, (−2,1,0)ᵀ, (−1,0,1)ᵀ} ⊆ R³.
Method 2: Valid only in R³, using the cross product of two vectors.
In either case, verify that {(1/√6)(1,2,1), (1/√5)(2,−1,0), (1/√30)(1,2,−5)} is the required set.

We now state two immediate corollaries without proof.
Corollary Let V ≠ {0} be an ips. If
1. V is finite dimensional then V has an orthonormal basis.
2. S is a non-empty orthonormal set and dim(V) is finite then S can be extended to form an orthonormal basis of V.

Remark Let S = {v₁,…,vₙ} ≠ {0} be a non-empty subset of a finite dimensional vector space V. If we apply the Gram-Schmidt process to
1. S, then we obtain an orthonormal basis of LS(v₁,…,vₙ).
2. a re-arrangement of the elements of S, then we may obtain a different orthonormal basis of LS(v₁,…,vₙ). But observe that the two bases have the same size.

Exercise 1. Let V be an ips with B = {v₁,…,vₙ} as a basis. Then prove that B is orthonormal if and only if, for each x ∈ V, x = Σᵢ ⟨x,vᵢ⟩vᵢ. [Hint: Since B is a basis, each x ∈ V has a unique linear combination in terms of the vᵢ's.]
2. Let S be a subset of V having 101 elements. Suppose that the application of the Gram-Schmidt process yields u₅ = 0. Does it imply that LS(v₁,…,v₅) = LS(v₁,…,v₄)? Give reasons for your answer.
3. Let B = {v₁,…,vₙ} be an orthonormal set in Rⁿ. For 1 ≤ k ≤ n, define A_k = Σ_{i=1}^{k} vᵢvᵢᵀ. Then prove that A_kᵀ = A_k and A_k² = A_k. Thus, the A_k's are projection matrices.
4. Determine an orthonormal basis of R⁴ containing (1,−2,1,3)ᵀ and (2,1,−3,1)ᵀ.
5. Let x ∈ Rⁿ with ‖x‖ = 1.

(a) Then prove that {x} can be extended to form an orthonormal basis of Rⁿ.
(b) Let the extended basis be {x, x₂,…,xₙ} and B = [e₁,…,eₙ] the standard ordered basis of Rⁿ. Prove that A = [[x]_B, [x₂]_B, …, [xₙ]_B] is an orthogonal matrix.
6. Let v, w ∈ Rⁿ, n ≥ 1, with ‖v‖ = ‖w‖ = 1. Prove that there exists an orthogonal matrix A such that Av = w. Prove also that A can be chosen such that det(A) = 1.
7. Let (V, ⟨·,·⟩) be an n-dimensional ips. If u ∈ V with ‖u‖ = 1 then give reasons for the following statements.
(a) Let S⊥ = {v ∈ V : ⟨v,u⟩ = 0}. Then dim(S⊥) = n−1.
(b) Let 0 ≠ β ∈ F. Then S = {v ∈ V : ⟨v,u⟩ = β} is not a subspace of V.
(c) Let v ∈ V. Then v = v₀ + ⟨v,u⟩u for a vector v₀ ∈ S⊥. That is, V = LS(u, S⊥).

5.3 Orthogonal Operator and Rigid Motion

We now give the definition and a few properties of an orthogonal operator.

Definition [Orthogonal Operator] Let V be a vector space. Then a linear operator T : V → V is said to be an orthogonal operator if ‖T(x)‖ = ‖x‖, for all x ∈ V.

Example Each T ∈ L(V) given below is an orthogonal operator.
1. Fix a unit vector a ∈ V and define T(x) = 2⟨a,x⟩a − x, for all x ∈ V.
Solution: Note that Proj_a(x) = ⟨a,x⟩a. So, ⟨⟨a,x⟩a, x − ⟨a,x⟩a⟩ = 0. Also, by the Pythagoras theorem, ‖x − ⟨a,x⟩a‖² = ‖x‖² − (⟨a,x⟩)². Thus,
‖T(x)‖² = ‖⟨a,x⟩a + (⟨a,x⟩a − x)‖² = ‖⟨a,x⟩a‖² + ‖x − ⟨a,x⟩a‖² = ‖x‖².
2. Let V = R² and 0 ≤ θ < 2π. Now define T(x) = [cos θ, −sin θ; sin θ, cos θ] [x; y].

We now show that an operator is orthogonal if and only if it preserves the inner product, and hence the angle.

Theorem Let T ∈ L(V). Then the following statements are equivalent.
1. T is an orthogonal operator.
2. ⟨T(x), T(y)⟩ = ⟨x,y⟩, for all x, y ∈ V. That is, T preserves the inner product.
Proof. 1 ⟹ 2: Let T be an orthogonal operator. Then ‖T(x+y)‖² = ‖x+y‖². So,
‖T(x)‖² + ‖T(y)‖² + 2⟨T(x),T(y)⟩ = ‖T(x)+T(y)‖² = ‖T(x+y)‖² = ‖x‖² + ‖y‖² + 2⟨x,y⟩.
Thus, using the definition again, ⟨T(x),T(y)⟩ = ⟨x,y⟩.
2 ⟹ 1: If ⟨T(x),T(y)⟩ = ⟨x,y⟩, for all x, y ∈ V, then T is an orthogonal operator as ‖T(x)‖² = ⟨T(x),T(x)⟩ = ⟨x,x⟩ = ‖x‖².
As an immediate corollary, we obtain the following result.

126 126 CHAPTER 5. INNER PRODUCT SPACES Corollary Let T L(V). Then T is an orthogonal operator if and only if for every orthonormal basis {u 1,...,u n } of V, {T(u 1 ),...,T(u n )} is an orthonormal basis of V. Thus, if B is an orthonormal ordered basis of V then T [B,B] is an orthogonal matrix. Definition [Isometry, Rigid Motion] Let V be a vector space. Then, a map T : V V is said to be an isometry or a rigid motion if T(x) T(y) = x y, for all x,y V. That is, an isometry is distance preserving. Observe that if T and S are two rigid motions then ST is also a rigid motion. Furthermore, it is clear from the definition that every rigid motion is invertible. Example The maps given below are rigid motions/isometry. 1. Let V be a linear space with norm. If a V then the translation map T a : V V (see Exercise 1), defined by T a (x) = x+a for all x V, is an isometry/rigid motion as T a (x) T a (y) = (x+a) (y+a) = x y. 2. Let V be an ips. Then, using Theorem , we see that every orthogonal operator is an isometry. We now prove that every rigid motion that fixes origin is an orthogonal operator. Theorem Let V be a real ips. Then, the following statements are equivalent for any map T : V V. 1. T is a rigid motion that fixes origin. 2. T is linear and T(x),T(y) = x,y, for all x,y V (preserves inner product). 3. T is an orthogonal operator. Proof. We have already seen the equivalence of Part 2 and Part 3 in Theorem Let us now prove the equivalence of Part 1 and Part 2/Part 3. If T is an orthogonal operator then T(0) = 0 and T(x) T(y) = T(x y) = x y. This proves Part 3 implies Part 1. We now prove Part 1 implies Part 2. So, let T be a rigid motion that fixes 0. Thus, T(0) = 0 and T(x) T(y) = x y, for all x,y V. Hence, in particular for y = 0, we have T(x) = x, for all x V. So, T(x) 2 + T(y) 2 2 T(x),T(y) = T(x) T(y),T(x) T(y) = T(x) T(y) 2 = x y 2 = x y,x y = x 2 + y 2 2 x,y.

Thus, using ‖T(x)‖ = ‖x‖, for all x ∈ V, we get ⟨T(x),T(y)⟩ = ⟨x,y⟩, for all x, y ∈ V. Now, to prove that T is linear, we use ⟨T(x),T(y)⟩ = ⟨x,y⟩ in the 3rd and 4th lines below to get
‖T(x+y) − (T(x)+T(y))‖² = ⟨T(x+y) − (T(x)+T(y)), T(x+y) − (T(x)+T(y))⟩
= ⟨T(x+y), T(x+y)⟩ − 2⟨T(x+y), T(x)⟩ − 2⟨T(x+y), T(y)⟩ + ⟨T(x)+T(y), T(x)+T(y)⟩
= ⟨x+y, x+y⟩ − 2⟨x+y, x⟩ − 2⟨x+y, y⟩ + (⟨T(x),T(x)⟩ + 2⟨T(x),T(y)⟩ + ⟨T(y),T(y)⟩)
= −⟨x+y, x+y⟩ + ⟨x,x⟩ + 2⟨x,y⟩ + ⟨y,y⟩ = 0.
Thus, ‖T(x+y) − (T(x)+T(y))‖ = 0 and hence T(x+y) = T(x)+T(y). A similar calculation gives T(αx) = αT(x) and hence T is linear.

5.4 Orthogonal Projections and Applications

Till now, our main interest was to understand the linear system Ax = b from different points of view. But, in most practical situations, the system has no solution. So, we try to find x such that the vector err = b − Ax has minimum norm. The next result gives the existence of an orthogonal decomposition with respect to a finite dimensional subspace of an inner product space.

Theorem (Decomposition). Let V be an ips having W as a finite dimensional subspace. Suppose {f₁,…,f_k} is an orthonormal basis of W. Then, for each b ∈ V, y = Σᵢ ⟨b,fᵢ⟩fᵢ is the unique closest point in W to b. Furthermore, b − y ∈ W⊥.
Proof. Clearly, y = Σᵢ ⟨b,fᵢ⟩fᵢ ∈ W. As the closest point is the foot of the perpendicular, we need to show that b − y ∈ W⊥. To do so, we verify that ⟨b−y, fᵢ⟩ = 0, for 1 ≤ i ≤ k:
⟨b−y, fᵢ⟩ = ⟨b, fᵢ⟩ − Σ_{j=1}^{k} ⟨b,f_j⟩⟨f_j, fᵢ⟩ = ⟨b, fᵢ⟩ − ⟨b, fᵢ⟩ = 0.
Also, note that for each w ∈ W, y − w ∈ W and hence
‖b−w‖² = ‖b−y+y−w‖² = ‖b−y‖² + ‖y−w‖² ≥ ‖b−y‖².
Thus, y is the closest point in W to b. Now, use the Pythagoras theorem to conclude that y is unique. Thus, the required result follows.

We now give a definition and then an implication of the above theorem.

Definition [Orthogonal Projection] Let W be a finite dimensional subspace of an ips V. Then, by the above theorem, for each v ∈ V there exist unique vectors w ∈ W and u ∈ W⊥ with v = w+u. We thus define the orthogonal projection of V onto W, denoted P_W : V → V, by P_W(v) = w. The vector w is called the projection of v on W.

Remark Let A ∈ M_{m,n}(R). Then, to find the orthogonal projection y of a vector b on Col(A), we can use either of the following ideas:
1. Determine an orthonormal basis {f₁,…,f_k} of Col(A) and get y = Σᵢ ⟨b,fᵢ⟩fᵢ.
2. By the earlier remark, the spaces Col(A) and Null(Aᵀ) are orthogonal complements of each other. Hence, every b ∈ Rᵐ equals b = u+v for unique u ∈ Col(A) and v ∈ Null(Aᵀ). Thus, using the definition and the decomposition theorem, y = u.

Before proceeding to projections, we give an application of the decomposition theorem to linear systems.

Corollary Let A ∈ M_{m,n}(R) and b ∈ Rᵐ. Then every least squares solution of Ax = b is a solution of the system AᵀAx = Aᵀb. Conversely, every solution of AᵀAx = Aᵀb is a least squares solution of Ax = b.
Proof. Let W = Col(A). Then, by the above remark, b = y+v, where y ∈ W, v ∈ Null(Aᵀ) and min{‖b−w‖ : w ∈ W} = ‖b−y‖. As y ∈ W, there exists x₀ ∈ Rⁿ such that Ax₀ = y. That is, x₀ is a least squares solution of Ax = b. Hence,
(AᵀA)x₀ = Aᵀ(Ax₀) = Aᵀy = Aᵀ(b−v) = Aᵀb − 0 = Aᵀb.
Conversely, we need to show that min{‖b−Ax‖ : x ∈ Rⁿ} = ‖b−Ax₁‖, where x₁ ∈ Rⁿ is a solution of AᵀAx = Aᵀb. Thus, Aᵀ(Ax₁ − b) = 0. Hence, for any x ∈ Rⁿ,
⟨b−Ax₁, A(x−x₁)⟩ = (x−x₁)ᵀAᵀ(b−Ax₁) = (x−x₁)ᵀ0 = 0.
Thus, ‖b−Ax‖² = ‖b−Ax₁+Ax₁−Ax‖² = ‖b−Ax₁‖² + ‖Ax₁−Ax‖² ≥ ‖b−Ax₁‖². Hence, the required result follows.

The above corollary gives the following result.

Corollary Let A ∈ M_{m,n}(R) and b ∈ Rᵐ. If
1. AᵀA is invertible then the least squares solution of Ax = b equals x = (AᵀA)⁻¹Aᵀb.
2. AᵀA is not invertible then a least squares solution of Ax = b equals x = (AᵀA)⁻Aᵀb, where (AᵀA)⁻ is the pseudo-inverse of AᵀA.
Proof. Part 1 directly follows from the previous corollary. For Part 2, let W = Col(A). Then, by the remark above, b = y+v, where y ∈ W and v ∈ Null(Aᵀ). So, Aᵀb = Aᵀ(y+v) = Aᵀy. Since y ∈ W, there exists x₀ ∈ Rⁿ such that Ax₀ = y. Thus, Aᵀb = AᵀAx₀. Now, using the definition of the pseudo-inverse, we see that
(AᵀA)((AᵀA)⁻Aᵀb) = (AᵀA)(AᵀA)⁻(AᵀA)x₀ = (AᵀA)x₀ = Aᵀb.
Thus, we see that (AᵀA)⁻Aᵀb is a solution of the system AᵀAx = Aᵀb. Hence, by the previous corollary, the required result follows.

We now give a few examples to understand projections.
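As a quick aside before those examples, the least squares corollary is easy to confirm numerically: for a full column rank A, the solution of the normal equations AᵀAx = Aᵀb agrees with the library's least squares routine. A minimal sketch (illustrative data, not from the notes):

```python
import numpy as np

A = np.array([[1., 0.], [1., 1.], [1., 2.]])
b = np.array([6., 0., 0.])
x_normal = np.linalg.solve(A.T @ A, A.T @ b)     # solve A^T A x = A^T b
x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]   # library least squares
print(np.allclose(x_normal, x_lstsq))            # True
print(np.allclose(A.T @ (b - A @ x_normal), 0))  # residual lies in Null(A^T)
```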

129 5.4. ORTHOGONAL PROJECTIONS AND APPLICATIONS 129 Example Use the fundamental theorem of linear algebra to compute the vector of the orthogonal projection. 1. Determine the projection of (1,1,1,1,1) T on Null([1, 1,1, 1,1]). Solution: Here A = [1, 1,1, 1,1]. So, a basis of Col(A T ) equals {(1, 1,1, 1,1) T } and that of Null(A) equals {(1,1,0,0,0) T,(1,0, 1,0,0) T,(1,0,0,1,0) T,(1,0,0,0, 1) T }. Then, the solution of the linear system Bx = 1 1, where B = equals x = Thus, the projection is ( 6(1,1,0,0,0) T 4(1,0, 1,0,0) T +6(1,0,0,1,0) T 4(1,0,0,0, 1) T) = (2,3,2,3,2)T. 2. Determine the projection of (1,1,1) T on Null([1,1, 1]). Solution: Here A = [1,1, 1]. So, a basis of Null(A) equals {(1, 1,0) T,(1,0,1) T } and that of Col(A T ) equals {(1,1, 1) T }. Then, the solution of the linear system Bx = 1, where B = equals x = Thus, the projection is ( ( 2)(1, 1,0) T +4(1,0,1) T) = (1,1,2)T. 3. Determine the projection of (1,1,1) T on Col ( [1,2,1] T). Solution: Here, A T = [1,2,1], abasis of Col(A) equals {(1,2,1) T }and that of Null(A T ) equals {(1,0, 1) T,(2, 1,0) T }. Then, using the solution of the linear system Bx = 1, where B = gives 2 3 (1,2,1)T as the required vector To use the first idea in Remark , we prove the following result. Theorem [Matrix of Orthogonal Projection] Let W be a subspace of an ips V with dim(w) <. If {f 1,...,f k } is an orthonormal basis of W then P W = k f i fi T. ( k ) Proof. Let v V. Then P W v = f i fi T v = k ( f i f T i v ) = k v,f i f i. As P W v indeed gives the only closet point (see Theorem ), the required result follows. Example In each of the following, determine the matrix of the orthogonal projection. Also, verify that P W + P W = I. What can you say about Rank(P W ) and Rank(P W )? Also, verify the orthogonal projection vectors obtained in Example W = {(x 1,...,x 5 ) T R 5 x 1 x 2 +x 3 x 4 +x 5 = 0} = Null([1, 1,1, 1,1]) Solution: An orthonormal basis of W is , , , Thus,

P_W = Σ_{i=1}^{4} fᵢfᵢᵀ = I₅ − (1/5)uuᵀ and P_{W⊥} = (1/5)uuᵀ, where u = (1,−1,1,−1,1)ᵀ.

2. W = {(x,y,z)ᵀ ∈ R³ : x+y−z = 0} = Null([1,1,−1]).
Solution: Note that {(1,1,−1)} is a basis of W⊥ and {(1/√2)(1,−1,0), (1/√6)(1,1,2)} an orthonormal basis of W. So,
P_W = (1/2)(1,−1,0)(1,−1,0)ᵀ + (1/6)(1,1,2)(1,1,2)ᵀ = (1/3)[2, −1, 1; −1, 2, 1; 1, 1, 2] and P_{W⊥} = (1/3)[1, 1, −1; 1, 1, −1; −1, −1, 1].
Verify that P_W + P_{W⊥} = I₃, Rank(P_W) = 2 and Rank(P_{W⊥}) = 1.

3. W = LS((1,2,1)) = Col([1,2,1]ᵀ) ⊆ R³.
Solution: Using the earlier example and Equation (★), W⊥ = LS({(−2,1,0), (−1,0,1)}) = LS({(−2,1,0), (1,2,−5)}). So,
P_W = (1/6)[1, 2, 1; 2, 4, 2; 1, 2, 1] and P_{W⊥} = I₃ − P_W = (1/6)[5, −2, −1; −2, 2, −2; −1, −2, 5].

We advise the readers to give a proof of the next result.

Theorem Let {f₁,…,f_k} be an orthonormal basis of a subspace W of Rⁿ. If {f₁,…,fₙ} is an extended orthonormal basis of Rⁿ, P_W = Σ_{i=1}^{k} fᵢfᵢᵀ and P_{W⊥} = Σ_{i=k+1}^{n} fᵢfᵢᵀ, then prove that
1. Iₙ − P_W = P_{W⊥}.
2. (P_W)ᵀ = P_W and (P_{W⊥})ᵀ = P_{W⊥}. That is, P_W and P_{W⊥} are symmetric.
3. (P_W)² = P_W and (P_{W⊥})² = P_{W⊥}. That is, P_W and P_{W⊥} are idempotent.
4. P_W P_{W⊥} = P_{W⊥} P_W = 0.
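These properties can be checked numerically. A minimal sketch (not part of the notes) for W = Null([1, 1, −1]) from Example 2, with the orthonormal basis given there:

```python
import numpy as np

f1 = np.array([1., -1., 0.]) / np.sqrt(2)
f2 = np.array([1., 1., 2.]) / np.sqrt(6)
f3 = np.array([1., 1., -1.]) / np.sqrt(3)      # extends the basis to all of R^3
PW = np.outer(f1, f1) + np.outer(f2, f2)       # P_W = sum of f_i f_i^T over W
PWp = np.outer(f3, f3)                         # P_{W-perp}
print(np.allclose(PW + PWp, np.eye(3)))        # I - P_W = P_{W-perp}
print(np.allclose(PW, PW.T), np.allclose(PW @ PW, PW))   # symmetric, idempotent
print(np.allclose(PW @ PWp, 0))                # P_W P_{W-perp} = 0
```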

Exercise 1. Let W = {(x,y,z,w) ∈ R⁴ : x = y, z = w} be a subspace of R⁴. Determine the matrix of the orthogonal projection onto W.
2. Let P_{W₁} and P_{W₂} be the orthogonal projections of R² onto W₁ = {(x,0) : x ∈ R} and W₂ = {(x,x) : x ∈ R}, respectively. Note that P_{W₁} ∘ P_{W₂} is a projection onto W₁, but it is not an orthogonal projection. Hence or otherwise, conclude that the composition of two orthogonal projections need not be an orthogonal projection.
3. Let A = [1, 0; 1, 0]. Then A is idempotent but not symmetric. Now, define P : R² → R² by P(v) = Av, for all v ∈ R². Then
(a) P is idempotent.
(b) Null(P) ∩ Rng(P) = Null(A) ∩ Col(A) = {0}.
(c) R² = Null(P) + Rng(P). But (Rng(P))⊥ = (Col(A))⊥ ≠ Null(A).
(d) Since (Col(A))⊥ ≠ Null(A), the map P is not an orthogonal projection. In this case, P is called the projection of R² onto Rng(P) along Null(P).
4. Find all 2 × 2 real matrices A such that A² = A. Hence, or otherwise, determine all projection operators of R².
5. Let W be an (n−1)-dimensional subspace of Rⁿ with ordered basis B_W = [f₁,…,f_{n−1}]. Suppose B = [f₁,…,f_{n−1},fₙ] is an orthogonal ordered basis of Rⁿ obtained by extending B_W. Now, define a function Q : Rⁿ → Rⁿ by Q(v) = −⟨v,fₙ⟩fₙ + Σ_{i=1}^{n−1} ⟨v,fᵢ⟩fᵢ. Then
(a) Q fixes every vector in W.
(b) Q sends every vector w ∈ W⊥ to −w.
(c) Q ∘ Q = Iₙ.
The function Q is called the reflection operator with respect to W.

Theorem (Bessel's Inequality). Let V be an ips with {v₁,…,vₙ} as an orthogonal set. Then
Σ_{k=1}^{n} |⟨u,v_k⟩|²/‖v_k‖² ≤ ‖u‖², for each u ∈ V. Equality holds if and only if u = Σ_{k=1}^{n} (⟨u,v_k⟩/‖v_k‖²) v_k.
Proof. For 1 ≤ k ≤ n, define w_k = v_k/‖v_k‖ and u₀ = Σ_{k=1}^{n} ⟨u,w_k⟩w_k. Then, by the decomposition theorem, u₀ is the vector nearest to u in LS(v₁,…,vₙ). Also, ⟨u−u₀, u₀⟩ = 0. So,
‖u‖² = ‖u−u₀+u₀‖² = ‖u−u₀‖² + ‖u₀‖² ≥ ‖u₀‖².
Thus, we have obtained the required inequality. Equality is attained if and only if u − u₀ = 0, or equivalently, u = u₀.

We now give a generalization of the Pythagoras theorem. The proof is left as an exercise for the reader.

Theorem (Parseval's Formula). Let V be an ips with {v₁,…,vₙ} as an orthonormal basis of V. Then, for each x, y ∈ V, ⟨x,y⟩ = Σᵢ ⟨x,vᵢ⟩ conj(⟨y,vᵢ⟩). Furthermore, if x = y then ‖x‖² = Σᵢ |⟨x,vᵢ⟩|², giving us a generalization of the Pythagoras theorem.

Exercise Let A ∈ M_{m,n}(R). Then there exists a unique B such that ⟨Ax, y⟩ = ⟨x, By⟩, for all x ∈ Rⁿ, y ∈ Rᵐ. In fact, B = Aᵀ.

5.4.A Orthogonal Projections as Self-Adjoint Operators*

The theorem above implies that the matrix of the projection operator is symmetric. We use this idea to proceed further.

132 132 CHAPTER 5. INNER PRODUCT SPACES Definition [Self-Adjoint Operator] Let V be an ips with inner product,. A linear operator P : V V is called self-adjoint if P(v),u = v,p(u), for every u,v V. A careful understanding of the examples given below shows that self-adjoint operators and Hermitian matrices are related. It also shows that the vector spaces C n and R n can be decomposed in terms of the null space and column space of Hermitian matrices. They also follow directly from the fundamental theorem of linear algebra. Example Let A be an n n real symmetric matrix. If P : R n R n is defined by P(x) = Ax, for every x R n then (a) P is a self adjoint operator as A = A T, for every x,y R n, implies P(x),y = (y T )Ax = (y T )A T x = (Ay) T x = x,ay = x,p(y). (b) Null(P) = (Rng(P)) as A = A T. Thus, R n = Null(P) Rng(P). 2. Let A be an n n Hermitian matrix. If P : C n C n is defined by P(z) = Az, for all z C n then using similar arguments (see Example ) prove the following: (a) P is a self-adjoint operator. (b) Null(P) = (Rng(P)) as A = A. Thus, C n = Null(P) Rng(P). We now state and prove the main result related with orthogonal projection operators. Theorem Let V be a finite dimensional ips. If V = W W then the orthogonal projectors P W : V V on W and P W : V V on W satisfy 1. Null(P W ) = {v V : P W (v) = 0} = W = Rng(P W ). 2. Rng(P W ) = {P W (v) : v V} = W = Null(P W ). 3. P W P W = P W, P W P W = P W (Idempotent). 4. P W P W = 0 V and P W P W = 0 V, where 0 V (v) = 0, for all v V 5. P W +P W = I V, where I V (v) = v, for all v V. 6. The operators P W and P W are self-adjoint. Proof. Part 1: As V = W W, for each u W, one uniquely writes u = 0+u, where 0 W and u W. Hence, by definition, P W (u) = 0 and P W (u) = u. Thus, W Null(P W ) and W Rng(P W ). Now suppose that v Null(P W ). So, P W (v) = 0. As V = W W, v = w +u, for unique w W and unique u W. So, by definition, P W (v) = w. Thus, w = P W (v) = 0. That is, v = 0+u = u W. Thus, Null(P W ) W. A similar argument implies Rng(P W ) W and thus completing the proof of the first part. Part 2: Use an argument similar to the proof of Part 1.

133 5.4. ORTHOGONAL PROJECTIONS AND APPLICATIONS 133 Part 3, Part 4 and Part 5: Let v V. Then v = w +u, for unique w W and unique u W. Thus, by definition, ( (P W P W )(v) = P W PW (v) ) = P W (w) = w and P W (v) = w (P W P W )(v) = P W ( PW (v) ) = P W (w) = 0 and (P W P W )(v) = P W (v)+p W (v) = w+u = v = I V (v). Hence, P W P W = P W, P W P W = 0 V and I V = P W P W. Part 6: Let u = w 1 +x 1 and v = w 2 +x 2, for unique w 1,w 2 W and unique x 1,x 2 W. Then, by definition, w i,x j = 0 for 1 i,j 2. Thus, P W (u),v = w 1,v = w 1,w 2 = u,w 2 = u,p W (v) and the proof of the theorem is complete. Remark Theorem gives us the following: 1. The orthogonal projectors P W and P W are idempotent and self-adjoint. 2. Let v V. Then v P W (v) = (I V P W )(v) = P W (v) W. Thus, v P W (v),w = 0, for every v V and w W. 3. As P W (v) w W, for each v V and w W, we have v w 2 = v P W (v)+p W (v) w 2 = v P W (v) 2 + P W (v) w 2 +2 v P W (v),p W (v) w = v P W (v) 2 + P W (v) w 2. Therefore, v w v P W (v) and equality holds if and only if w = P W (v). Since P W (v) W, we see that d(v,w) = inf { v w : w W} = v P W (v). That is, P W (v) is the vector nearest to v W. This can also be stated as: the vector P W (v) solves the following minimization problem: inf v w = v P W(v). w W The next theorem is a generalization of Theorem We omit the proof as the arguments are similar and uses the following: Let V be a finite dimensional ips with V = W 1 W k, for certain subspaces W i s of V. Then, for each v V there exist unique vectors v 1,...,v k such that 1. v i W i, for 1 i k, 2. v i,v j = 0 for each v i W i,v j W j,1 i j k and

The next theorem is a generalization of the result above. We omit the proof, as the arguments are similar; it uses the following fact: if V is a finite dimensional ips with V = W₁ ⊕ ⋯ ⊕ W_k, for certain subspaces Wᵢ of V, then for each v ∈ V there exist unique vectors v₁, ..., v_k such that

1. vᵢ ∈ Wᵢ, for 1 ≤ i ≤ k,
2. ⟨vᵢ, vⱼ⟩ = 0 whenever vᵢ ∈ Wᵢ and vⱼ ∈ Wⱼ, for 1 ≤ i ≠ j ≤ k, and
3. v = v₁ + ⋯ + v_k.

Theorem. Let V be a finite dimensional ips with subspaces W₁, ..., W_k such that V = W₁ ⊕ ⋯ ⊕ W_k. Then, for 1 ≤ i ≤ k, there exist orthogonal projectors P_{Wᵢ} : V → V of V onto Wᵢ satisfying the following:

1. Null(P_{Wᵢ}) = Wᵢ⊥ = W₁ ⊕ W₂ ⊕ ⋯ ⊕ W_{i−1} ⊕ W_{i+1} ⊕ ⋯ ⊕ W_k.
2. Rng(P_{Wᵢ}) = Wᵢ.
3. P_{Wᵢ} ∘ P_{Wᵢ} = P_{Wᵢ}.
4. P_{Wᵢ} ∘ P_{Wⱼ} = 0_V, for 1 ≤ i ≠ j ≤ k.
5. P_{Wᵢ} is a self-adjoint operator, and
6. I_V = P_{W₁} + P_{W₂} + ⋯ + P_{W_k}.

5.5 QR Decomposition

The next result gives the proof of the QR decomposition for real matrices. The readers are advised to prove similar results for matrices with complex entries. This decomposition and its generalizations are helpful in the numerical calculations related to eigenvalue problems (see Chapter 6).

Theorem (QR Decomposition). Let A ∈ Mₙ(R) be invertible. Then there exist matrices Q and R such that Q is orthogonal and R is upper triangular with A = QR. Furthermore, the diagonal entries of R can be chosen to be positive; in this case, the decomposition is unique.
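Before the proof, a quick computational illustration (numpy assumed; the matrix A is an arbitrary illustrative choice). numpy's qr returns some orthogonal Q and upper triangular R; flipping signs row by row enforces the positive diagonal that makes the factorization unique.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))          # invertible with probability 1

Q, R = np.linalg.qr(A)
D = np.diag(np.sign(np.diag(R)))         # diagonal matrix of +-1, its own inverse
Q, R = Q @ D, D @ R                      # still A = QR, now with diag(R) > 0

print(np.allclose(A, Q @ R))             # True
print(np.allclose(Q.T @ Q, np.eye(4)))   # True: Q is orthogonal
print(np.all(np.diag(R) > 0))            # True: the unique choice of R
```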

Proof. As A is invertible, its columns form a basis of Rⁿ. So, an application of the Gram-Schmidt orthonormalization process to {A[:,1], ..., A[:,n]} gives an orthonormal basis {v₁, ..., vₙ} of Rⁿ satisfying

LS(A[:,1], ..., A[:,i]) = LS(v₁, ..., vᵢ), for 1 ≤ i ≤ n.

Since A[:,i] ∈ LS(v₁, ..., vᵢ), for 1 ≤ i ≤ n, there exist α_{ji} ∈ R, 1 ≤ j ≤ i, such that A[:,i] = [v₁, ..., vᵢ](α_{1i}, ..., α_{ii})ᵀ. Thus, if Q = [v₁, ..., vₙ] and R is the n × n upper triangular matrix whose (j, i)-th entry equals α_{ji}, for j ≤ i, then

1. Q is an orthogonal matrix,
2. R is an upper triangular matrix, and
3. A = QR.

This completes the proof of the first part. Note that

1. α_{ii} ≠ 0, for 1 ≤ i ≤ n, as A[:,1] ≠ 0 and A[:,i] ∉ LS(v₁, ..., v_{i−1});
2. if α_{ii} < 0, for some i, 1 ≤ i ≤ n, then we can replace vᵢ in Q by −vᵢ to get new matrices Q and R in which the diagonal entries of R are positive.

Uniqueness: suppose Q₁R₁ = Q₂R₂ for orthogonal matrices Qᵢ and upper triangular matrices Rᵢ with positive diagonal entries. As the Qᵢ's and Rᵢ's are invertible, we get Q₂⁻¹Q₁ = R₂R₁⁻¹. Now, by the earlier exercises,

1. the matrix R₂R₁⁻¹ is an upper triangular matrix, and
2. the matrix Q₂⁻¹Q₁ is an orthogonal matrix.

So, R₂R₁⁻¹ is an orthogonal upper triangular matrix with positive diagonal entries and hence R₂R₁⁻¹ = Iₙ. So, R₂ = R₁ and therefore Q₂ = Q₁.

Now, let A be an n × k matrix with Rank(A) = r. Then an application of the Gram-Schmidt orthonormalization process to the columns of A yields an orthonormal set {v₁, ..., v_r} ⊆ Rⁿ such that, for each j with 1 ≤ j ≤ k,

LS(A[:,1], ..., A[:,j]) = LS(v₁, ..., vᵢ), for some i ≤ j.

Hence, proceeding on the lines of the above theorem, we have the following result.

Theorem (Generalized QR Decomposition). Let A be an n × k matrix of rank r. Then A = QR, where

1. Q = [v₁, ..., v_r] is an n × r matrix with QᵀQ = I_r,
2. LS(A[:,1], ..., A[:,j]) = LS(v₁, ..., vᵢ), for some i ≤ j, for each j with 1 ≤ j ≤ k, and
3. R is an r × k matrix with Rank(R) = r.

Example 1. Let A be the 4 × 4 matrix whose first three columns are those orthonormalized in the earlier Gram-Schmidt example and whose fourth column is v₄ = (2, 1, 1, 1)ᵀ. Find an orthogonal matrix Q and an upper triangular matrix R such that A = QR.

Solution: From the earlier example, we know that w₁ = (1/√2)(1, 0, 1, 0)ᵀ, w₂ = (1/√2)(0, 1, 0, 1)ᵀ and w₃ = (1/√2)(0, −1, 0, 1)ᵀ. We now compute w₄. With v₄ = (2, 1, 1, 1)ᵀ,

u₄ = v₄ − ⟨v₄, w₁⟩w₁ − ⟨v₄, w₂⟩w₂ − ⟨v₄, w₃⟩w₃ = (1/2)(1, 0, −1, 0)ᵀ.

Thus, w₄ = (1/√2)(1, 0, −1, 0)ᵀ. Hence, A = QR with Q = [w₁, w₂, w₃, w₄] and R the upper triangular matrix with entries R[j, i] = ⟨wⱼ, A[:,i]⟩.
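A minimal Gram-Schmidt sketch of the generalized QR decomposition (numpy assumed; this is one possible implementation, not the notes' algorithm verbatim): columns that project to numerically zero are skipped, so Q ends up n × r with QᵀQ = I_r. The test matrix is the 4 × 4, rank-3 matrix reconstructed in the next example.

```python
import numpy as np

def generalized_qr(A, tol=1e-10):
    n, k = A.shape
    vs = []                                  # orthonormal vectors v_1, ..., v_r
    for j in range(k):
        u = A[:, j].copy()
        for v in vs:
            u -= (v @ A[:, j]) * v           # subtract projections on earlier v's
        if np.linalg.norm(u) > tol:          # skip linearly dependent columns
            vs.append(u / np.linalg.norm(u))
    Q = np.column_stack(vs)
    R = Q.T @ A                              # r x k matrix of rank r
    return Q, R

A = np.array([[1., 1, 1, 0], [-1, 0, -2, 1], [1, 1, 1, 0], [1, 0, 2, 1]])
Q, R = generalized_qr(A)
print(np.allclose(A, Q @ R), Q.shape)        # True (4, 3)
```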

Example 2. Let A = [1, 1, 1, 0; −1, 0, −2, 1; 1, 1, 1, 0; 1, 0, 2, 1], the 4 × 4 matrix with columns v₁ = (1, −1, 1, 1)ᵀ, v₂ = (1, 0, 1, 0)ᵀ, v₃ = (1, −2, 1, 2)ᵀ and v₄ = (0, 1, 0, 1)ᵀ. Find a 4 × 3 matrix Q satisfying QᵀQ = I₃ and an upper triangular matrix R such that A = QR.

Solution: Let us apply the Gram-Schmidt orthonormalization process to the columns of A. As v₁ = (1, −1, 1, 1)ᵀ with ‖v₁‖ = 2, we get w₁ = (1/2)v₁. For v₂ = (1, 0, 1, 0)ᵀ,

u₂ = v₂ − ⟨v₂, w₁⟩w₁ = v₂ − w₁ = (1/2)(1, 1, 1, −1)ᵀ.

Hence, w₂ = (1/2)(1, 1, 1, −1)ᵀ. For v₃ = (1, −2, 1, 2)ᵀ,

u₃ = v₃ − ⟨v₃, w₁⟩w₁ − ⟨v₃, w₂⟩w₂ = v₃ − 3w₁ + w₂ = 0.

So, we move on to the next column and take v₄ = (0, 1, 0, 1)ᵀ. Then

u₄ = v₄ − ⟨v₄, w₁⟩w₁ − ⟨v₄, w₂⟩w₂ = v₄ − 0·w₁ − 0·w₂ = v₄.

So, w₃ = (1/√2)(0, 1, 0, 1)ᵀ. Hence,

Q = [w₁, w₂, w₃] = [1/2, 1/2, 0; −1/2, 1/2, 1/√2; 1/2, 1/2, 0; 1/2, −1/2, 1/√2] and R = [2, 1, 3, 0; 0, 1, −1, 0; 0, 0, 0, √2].

The readers are advised to check the following:
(a) Rank(A) = 3,
(b) A = QR with QᵀQ = I₃, and
(c) R is a 3 × 4 upper triangular matrix with Rank(R) = 3.

Remark. Let A ∈ M_{m,n}(R).

1. If A = QR with Q = [v₁, ..., vₙ] then R is the upper triangular matrix whose (i, j)-th entry equals ⟨vᵢ, A[:,j]⟩, for i ≤ j, i.e.,

R = [⟨v₁, A[:,1]⟩, ⟨v₁, A[:,2]⟩, ⟨v₁, A[:,3]⟩, ⋯; 0, ⟨v₂, A[:,2]⟩, ⟨v₂, A[:,3]⟩, ⋯; 0, 0, ⟨v₃, A[:,3]⟩, ⋯; ⋯].

In case Rank(A) < n, a slight modification gives the matrix R.

2. Further, let Rank(A) = n.

(a) Then AᵀA is invertible (see the earlier exercise).
(b) By the generalized QR decomposition, A = QR with Q a matrix of size m × n and R an upper triangular matrix of size n × n. Also, QᵀQ = Iₙ and Rank(R) = n.
(c) Thus, AᵀA = RᵀQᵀQR = RᵀR. As AᵀA is invertible, the matrix RᵀR is invertible. Since R is a square matrix, R itself is invertible. Hence, (RᵀR)⁻¹ = R⁻¹(Rᵀ)⁻¹.

(d) So, if Q = [v₁, ..., vₙ] then

A(AᵀA)⁻¹Aᵀ = QR(RᵀR)⁻¹RᵀQᵀ = (QR)(R⁻¹(Rᵀ)⁻¹)RᵀQᵀ = QQᵀ.

(e) Hence, using the theorem on orthogonal projections, we see that the matrix

P = A(AᵀA)⁻¹Aᵀ = QQᵀ = [v₁, ..., vₙ][v₁ᵀ; ⋮; vₙᵀ] = Σ over i of vᵢvᵢᵀ

is the orthogonal projection matrix on Col(A).

3. Further, let Rank(A) = r < n. If j₁, ..., j_r are the pivot columns of A then Col(A) = Col(B), where B = [A[:,j₁], ..., A[:,j_r]] is an m × r matrix with Rank(B) = r. So, using Part 2e, we see that B(BᵀB)⁻¹Bᵀ is the orthogonal projection matrix on Col(A). In practice, compute the RREF of A and choose the columns of A corresponding to the pivot columns.

5.6 Summary

In the previous chapter, we learnt that if V is a vector space over F with dim(V) = n then V basically looks like Fⁿ. Also, any subspace of Fⁿ is either Col(A) or Null(A) or both, for some matrix A with entries from F.

So, we started this chapter with the inner product, a generalization of the dot product in R³ or Rⁿ. We used the inner product to define the length/norm of a vector. The norm has the property that the norm of a vector is zero if and only if the vector itself is the zero vector. We then proved the Cauchy-Bunyakovskii-Schwartz inequality, which helped us in defining the angle between two vectors. Thus, one can pose geometrical problems in Rⁿ and prove geometrical results. We then independently defined the notion of a norm in Rⁿ and showed that a norm is induced by an inner product if and only if the norm satisfies the parallelogram law (the sum of squares of the diagonals equals twice the sum of squares of the two non-parallel sides).

The next subsection dealt with the fundamental theorem of linear algebra, where we showed that if A ∈ M_{m,n}(C) then

1. dim(Null(A)) + dim(Col(A)) = n.
2. Null(A) = (Col(A*))⊥ and Null(A*) = (Col(A))⊥.
3. dim(Col(A)) = dim(Col(A*)).

We then saw that having an orthonormal basis is an asset, as determining the

1. coordinates of a vector boils down to computing inner products.
2. projection of a vector on a subspace boils down to finding an orthonormal basis of the subspace and then summing the corresponding rank 1 matrices, as sketched below.
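The following sketch (numpy assumed; the matrices B and A are illustrative, not from the notes) checks that the two formulas of the remark above agree: the pivot-column formula B(BᵀB)⁻¹Bᵀ and the rank-one sum QQᵀ.

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.standard_normal((6, 3))                            # full column rank
A = np.column_stack([B, B @ rng.standard_normal((3, 2))])  # 6 x 5 of rank 3

P1 = B @ np.linalg.inv(B.T @ B) @ B.T        # B(B^T B)^{-1} B^T
Q, _ = np.linalg.qr(B)
P2 = Q @ Q.T                                 # sum of rank-one matrices v_i v_i^T

print(np.allclose(P1, P2))                   # True: both are the projector
print(np.allclose(P1 @ A, A))                # True: P fixes Col(A)
```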

So, the question arises: how do we compute an orthonormal basis? This is where we came across the Gram-Schmidt orthonormalization process. This algorithm helps us determine an orthonormal basis of LS(S) for any finite subset S of a vector space. It also led to the QR decomposition of a matrix. Thus, we observe the following about the linear system Ax = b. If

1. b ∈ Col(A) then we can use the Gauss-Jordan method to get a solution.
2. b ∉ Col(A) then in most cases we need a vector x such that the least-square error between b and Ax is minimum. We saw that this minimum is attained by the projection of b on Col(A). Also, this vector can be obtained either using the fundamental theorem of linear algebra or by computing the matrix B(BᵀB)⁻¹Bᵀ, where the columns of B are either the pivot columns of A or a basis of Col(A).
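A least-squares sketch of this last point (numpy assumed; A and b are arbitrary illustrative data): the least-squares solution x* satisfies Ax* = projection of b on Col(A).

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((8, 3))
b = rng.standard_normal(8)                      # almost surely not in Col(A)

x_star, *_ = np.linalg.lstsq(A, b, rcond=None)  # minimizes ||Ax - b||
proj_b = A @ np.linalg.inv(A.T @ A) @ A.T @ b   # projection of b on Col(A)

print(np.allclose(A @ x_star, proj_b))          # True
```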

Chapter 6

Eigenvalues, Eigenvectors and Diagonalization

6.1 Introduction and Definitions

In this chapter, every matrix is an element of Mₙ(C) and x = (x₁, ..., xₙ)ᵀ ∈ Cⁿ, for some n ∈ N. We start with a few examples to motivate this chapter.

Example 1. Let A = [1, 2; 2, 1], B = [3, 2; 2, 3] and x = (x, y)ᵀ. Then,

(a) A magnifies the nonzero vector (1, 1)ᵀ three times, as A(1, 1)ᵀ = 3(1, 1)ᵀ. Verify that B(1, 1)ᵀ = 5(1, 1)ᵀ, and hence B magnifies (1, 1)ᵀ five times.

(b) A changes the direction of (1, −1)ᵀ, as A(1, −1)ᵀ = −1·(1, −1)ᵀ, whereas B fixes it, as B(1, −1)ᵀ = (1, −1)ᵀ.

(c) xᵀAx = 3(x+y)²/2 − (x−y)²/2 and xᵀBx = 5(x+y)²/2 + (x−y)²/2. So, the maximum and minimum displacements occur along the lines x + y = 0 and x − y = 0, where x + y = (x, y)(1, 1)ᵀ and x − y = (x, y)(1, −1)ᵀ.

(d) the curve xᵀAx = 1 represents a hyperbola, whereas the curve xᵀBx = 1 represents an ellipse (see Figure 6.1, drawn using the package MATHEMATICA).

2. Let C = [1, 2; 1, 3], a non-symmetric matrix. Does there exist a nonzero x ∈ C² which gets magnified by C? That is, we need x ≠ 0 and α ∈ C such that Cx = αx, i.e., [αI₂ − C]x = 0. As x ≠ 0, [αI₂ − C]x = 0 has a solution if and only if det[αI − C] = 0. But,

det[αI − C] = det([α − 1, −2; −1, α − 3]) = α² − 4α + 1.

So, α = 2 ± √3. For α = 2 + √3, verify that the nonzero x that satisfies [C − (2 + √3)I₂]x = 0

Figure 6.1: A hyperbola and two ellipses (the first with orthogonal axes).

equals x = (√3 − 1, 1)ᵀ. Similarly, for α = 2 − √3, the vector x = (√3 + 1, −1)ᵀ satisfies [C − (2 − √3)I₂]x = 0. In this example,

(a) we still have magnifications, in the directions (√3 − 1, 1)ᵀ and (√3 + 1, −1)ᵀ.
(b) the maximum/minimum displacements do not occur along the lines (√3 − 1)x + y = 0 and (√3 + 1)x − y = 0 (see the third curve in Figure 6.1).
(c) the lines (√3 − 1)x + y = 0 and (√3 + 1)x − y = 0 are not orthogonal.

3. Let A be a real symmetric matrix. Consider the following problem: Maximize (Minimize) xᵀAx such that x ∈ Rⁿ and xᵀx = 1. To solve this, consider the Lagrangian

L(x, λ) = xᵀAx − λ(xᵀx − 1) = Σᵢ Σⱼ a_{ij} xᵢ xⱼ − λ(Σᵢ xᵢ² − 1).

Partially differentiating L(x, λ) with respect to xᵢ, for 1 ≤ i ≤ n, we get

∂L/∂x₁ = 2a₁₁x₁ + 2a₁₂x₂ + ⋯ + 2a₁ₙxₙ − 2λx₁,
⋮
∂L/∂xₙ = 2aₙ₁x₁ + 2aₙ₂x₂ + ⋯ + 2aₙₙxₙ − 2λxₙ.

Therefore, to get the points of extremum, we solve for

0ᵀ = (∂L/∂x₁, ∂L/∂x₂, ..., ∂L/∂xₙ)ᵀ = ∂L/∂x = 2(Ax − λx).

Thus, to solve the extremal problem, we need λ ∈ R and x ∈ Rⁿ such that x ≠ 0 and Ax = λx.

We observe the following about the matrices A, B and C that appear in Example 1 above:

1. det(A) = −3 = 3 × (−1), det(B) = 5 = 5 × 1 and det(C) = 1 = (2 + √3) × (2 − √3).

2. Tr(A) = 2 = 3 + (−1), Tr(B) = 6 = 5 + 1 and Tr(C) = 4 = (2 + √3) + (2 − √3).

3. Both the sets {(1, 1)ᵀ, (1, −1)ᵀ} and {(√3 − 1, 1)ᵀ, (√3 + 1, −1)ᵀ} are linearly independent.

4. If v₁ = (1, 1)ᵀ and v₂ = (1, −1)ᵀ and S = [v₁, v₂] then

(a) AS = [Av₁, Av₂] = [3v₁, −v₂] = S diag(3, −1), so that S⁻¹AS = diag(3, −1).
(b) BS = [Bv₁, Bv₂] = [5v₁, v₂] = S diag(5, 1), so that S⁻¹BS = diag(5, 1).
(c) Let u₁ = v₁/√2 and u₂ = v₂/√2. Then u₁ and u₂ are orthonormal unit vectors. That is, if U = [u₁, u₂] then I = UU* = u₁u₁* + u₂u₂* and
 i. A = 3u₁u₁* − u₂u₂*.
 ii. B = 5u₁u₁* + u₂u₂*.

5. If v₁ = (√3 − 1, 1)ᵀ and v₂ = (√3 + 1, −1)ᵀ and S = [v₁, v₂] then S⁻¹CS = diag(2 + √3, 2 − √3).

Thus, we see that given A ∈ Mₙ(C), the numbers λ ∈ C and vectors x ∈ Cⁿ, x ≠ 0, satisfying Ax = λx have certain nice properties. For example, there exists a basis of C² in which the matrices A, B and C behave like diagonal matrices. To understand the ideas better, we start with the following definitions.

Definition. Let A ∈ Mₙ(C). Then,

1. the equation Ax = λx, equivalently (A − λIₙ)x = 0, (∗) is called the eigen-condition.
2. an α ∈ C is called a characteristic value/root or eigenvalue or latent root of A if there exists a non-zero vector x satisfying Ax = αx.
3. a non-zero vector x satisfying Equation (∗) is called a characteristic vector or eigenvector or invariant/latent vector of A corresponding to λ.
4. the tuple (α, x), with x ≠ 0 and Ax = αx, is called an eigen-pair or characteristic-pair.
5. for an eigenvalue α ∈ C, Null(A − αI) = {x ∈ Cⁿ : Ax = αx} is called the eigen-space or characteristic vector space of A corresponding to α.

Theorem. Let A ∈ Mₙ(C) and α ∈ C. Then the following statements are equivalent.

1. α is an eigenvalue of A.
2. det(A − αIₙ) = 0.

Proof. We know that α is an eigenvalue of A if and only if the system (A − αIₙ)x = 0 has a non-trivial solution. By an earlier theorem, this holds if and only if det(A − αI) = 0.
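A quick numerical check of the eigen-pairs of A, B and C from the opening example (numpy assumed; the matrix entries are as reconstructed above):

```python
import numpy as np

A = np.array([[1., 2], [2, 1]])
B = np.array([[3., 2], [2, 3]])
C = np.array([[1., 2], [1, 3]])

for M in (A, B, C):
    vals, vecs = np.linalg.eig(M)
    print(np.sort(vals))          # [-1 3], [1 5], [2-sqrt(3) 2+sqrt(3)]
    # each column of vecs satisfies M v = lambda v
    assert np.allclose(M @ vecs, vecs * vals)
```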

Definition. Let A ∈ Mₙ(C). Then,

1. det(A − λI) is a polynomial of degree n in λ and is called the characteristic polynomial of A, denoted p_A(λ), or in short p(λ).
2. the equation p_A(λ) = 0 is called the characteristic equation of A.

We thus observe the following.

Remark. Let A ∈ Mₙ(C).

1. If α ∈ C is a root of p_A(λ) = 0 then α is an eigenvalue. As Null(A − αI) is a subspace of Cⁿ, the following statements hold.

(a) (α, x) is an eigen-pair of A if and only if (α, cx) is an eigen-pair of A, for every c ∈ C \ {0}.
(b) If x₁, ..., x_r are linearly independent eigenvectors of A for the eigenvalue α then Σ cᵢxᵢ, with at least one cᵢ ≠ 0, is also an eigenvector of A for α. Hence, whenever we consider a collection S of eigenvectors, we require S to be linearly independent.

2. A − αI is singular. Therefore, if Rank(A − αI) = r then r < n. Hence, by an earlier theorem, the system (A − αI)x = 0 has n − r linearly independent solutions.

Almost all books in mathematics differentiate between characteristic value and eigenvalue, as the ideas change when one moves from the complex numbers to another scalar field. We give the following example for clarity.

Remark. Let A ∈ M₂(F). Then A induces a map T ∈ L(F²) defined by T(x) = Ax, for all x ∈ F². We use this idea to understand the difference.

1. Let A = [0, 1; −1, 0]. Then p_A(λ) = λ² + 1. So, ±i are the roots of p(λ) = 0 in C. Hence,
(a) A has (i, (1, i)ᵀ) and (−i, (i, 1)ᵀ) as eigen-pairs or characteristic-pairs.
(b) A has no characteristic value over R.

2. Let A = [1, 2; 1, 3]. Then 2 ± √3 are the roots of the characteristic equation. Hence,
(a) A has characteristic values or eigenvalues over R.
(b) A has no characteristic value over Q.

Let us look at some more examples.

Example 1. Let A = diag(d₁, ..., dₙ), with dᵢ ∈ C for 1 ≤ i ≤ n. Then p(λ) = ∏ᵢ (λ − dᵢ), and thus verify that (d₁, e₁), ..., (dₙ, eₙ) are the eigen-pairs.

2. Let A = [1, 1; 0, 1]. Then p(λ) = (1 − λ)². Hence, 1 is a repeated eigenvalue. But the complete solution of the system (A − I₂)x = 0 equals x = ce₁, for c ∈ C. Hence, using the remark above, e₁ is essentially the only eigenvector. Therefore, 1 is a repeated eigenvalue whereas there is only one eigenvector.

3. Let A = I₂. Then 1 is a repeated eigenvalue of A. In this case, (A − I₂)x = 0 holds for every x ∈ C². Hence, any two linearly independent vectors x, y ∈ C² give (1, x) and (1, y) as two eigen-pairs of A. In general, if S = {x₁, ..., xₙ} is a basis of Cⁿ then (1, x₁), ..., (1, xₙ) are eigen-pairs of Iₙ, the identity matrix.

4. Let A = [3, −2; 2, 3]. Then (3 + 2i, (1, −i)ᵀ) and (3 − 2i, (1, i)ᵀ) are the eigen-pairs of A.

5. Let A = [1, 1; −1, 1]. Then (1 + i, (−i, 1)ᵀ) and (1 − i, (i, 1)ᵀ) are the eigen-pairs of A.

6. Verify that A = [0, 1, 0; 0, 0, 1; 0, 0, 0] has the eigenvalue 0 repeated 3 times, with e₁ as the only eigenvector, as Ax = 0 with x = (x₁, x₂, x₃)ᵀ implies x₂ = 0 = x₃.

7. Verify that A = [0, 1, 0, 0, 0; 0, 0, 1, 0, 0; 0, 0, 0, 0, 0; 0, 0, 0, 0, 1; 0, 0, 0, 0, 0] has the eigenvalue 0 repeated 5 times, with e₁ and e₄ as the only eigenvectors, as Ax = 0 with x = (x₁, ..., x₅)ᵀ implies x₂ = 0 = x₃ = x₅. Note that the diagonal blocks of A are nilpotent matrices.

Exercise 1. Let A ∈ Mₙ(R). Then prove that
(a) if α ∈ σ(A) then αᵏ ∈ σ(Aᵏ), for all k ∈ N.
(b) if A is invertible and α ∈ σ(A) then αᵏ ∈ σ(Aᵏ), for all k ∈ Z.

2. Find eigen-pairs over C for each of the following matrices:
[1, 1+i; 1−i, 1], [i, 1+i; −1+i, i], [cos θ, −sin θ; sin θ, cos θ] and [cos θ, sin θ; sin θ, −cos θ].

3. Let A = [a_{ij}] ∈ Mₙ(C) with Σⱼ a_{ij} = a, for all 1 ≤ i ≤ n. Then prove that a is an eigenvalue of A. What is the corresponding eigenvector?

4. Prove that the matrices A and Aᵀ have the same set of eigenvalues. Construct a 2 × 2 matrix A such that the eigenvectors of A and Aᵀ are different.

5. Let A be an idempotent matrix. Then prove that its eigenvalues are either 0 or 1 or both.

6. Let A be a nilpotent matrix. Then prove that its eigenvalues are all 0.

Theorem. Let λ₁, ..., λₙ, not necessarily distinct, be the eigenvalues of A = [a_{ij}] ∈ Mₙ(C). Then det(A) = ∏ᵢ λᵢ and Tr(A) = Σᵢ a_{ii} = Σᵢ λᵢ.

Proof. Since λ₁, ..., λₙ are the eigenvalues of A, by definition,

det(A − xIₙ) = (−1)ⁿ ∏ᵢ (x − λᵢ) (∗)

is an identity in x as polynomials. Therefore, by substituting x = 0 in Equation (∗), we get det(A) = (−1)ⁿ(−1)ⁿ ∏ᵢ λᵢ = ∏ᵢ λᵢ. Also, expanding the determinant,

det(A − xIₙ) = det[a₁₁ − x, a₁₂, ..., a₁ₙ; a₂₁, a₂₂ − x, ..., a₂ₙ; ⋯; aₙ₁, aₙ₂, ..., aₙₙ − x]
= a₀ − x a₁ + ⋯ + (−1)ⁿ⁻¹ xⁿ⁻¹ aₙ₋₁ + (−1)ⁿ xⁿ (∗∗)

for some a₀, a₁, ..., aₙ₋₁ ∈ C. Now, aₙ₋₁, the coefficient of (−1)ⁿ⁻¹xⁿ⁻¹, comes only from the product (a₁₁ − x)(a₂₂ − x)⋯(aₙₙ − x). So, aₙ₋₁ = Σᵢ a_{ii} = Tr(A), the trace of A. Also, from Equations (∗) and (∗∗), we have

a₀ − x a₁ + ⋯ + (−1)ⁿ⁻¹ xⁿ⁻¹ aₙ₋₁ + (−1)ⁿ xⁿ = (−1)ⁿ ∏ᵢ (x − λᵢ).

Therefore, comparing the coefficients of (−1)ⁿ⁻¹xⁿ⁻¹ on both sides, we have

Tr(A) = aₙ₋₁ = (−1)(−1) Σᵢ λᵢ = Σᵢ λᵢ.

Hence, we get the required result.

Exercise 1. Let A be a 3 × 3 orthogonal matrix (AAᵀ = I). If det(A) = 1, then prove that there exists v ∈ R³ \ {0} such that Av = v.

2. Let A ∈ M_{2n+1}(R) with Aᵀ = −A. Then prove that 0 is an eigenvalue of A.

3. Let A ∈ Mₙ(C). Then A is invertible if and only if 0 is not an eigenvalue of A.

6.1.A Spectrum of an eigenvalue

Definition. Let A ∈ Mₙ(C). Then,

1. the collection of eigenvalues of A, counted with multiplicities, is called the spectrum of A, denoted σ(A).
2. the multiplicity of α ∈ σ(A) as a root of the characteristic equation is called the algebraic multiplicity of α, denoted Alg.Mul_α(A).
3. for α ∈ σ(A), dim(Null(A − αI)) is called the geometric multiplicity of α, denoted Geo.Mul_α(A).

We now state the following observations.

Remark. Let A ∈ Mₙ(C).

1. For each α ∈ σ(A), an earlier theorem gives dim(Null(A − αI)) ≥ 1. So, we have at least one eigenvector for each eigenvalue.

2. If the algebraic multiplicity of α ∈ σ(A) is r ≥ 2 then the examples above show that we need not have r linearly independent eigenvectors.

Theorem. Let A and B be two similar matrices. Then

1. α ∈ σ(A) if and only if α ∈ σ(B).
2. for each α ∈ σ(A), Alg.Mul_α(A) = Alg.Mul_α(B) and Geo.Mul_α(A) = Geo.Mul_α(B).

Proof. Since A and B are similar, there exists an invertible matrix S such that A = SBS⁻¹. So, α ∈ σ(A) if and only if α ∈ σ(B), as

det(A − xI) = det(SBS⁻¹ − xI) = det(S(B − xI)S⁻¹) = det(S) det(B − xI) det(S⁻¹) = det(B − xI). (∗)

Note that Equation (∗) also implies that Alg.Mul_α(A) = Alg.Mul_α(B). We now show that Geo.Mul_α(A) = Geo.Mul_α(B). So, let Q₁ = {v₁, ..., v_k} be a basis of Null(A − αI). Since B = S⁻¹AS, we get Q₂ = {S⁻¹v₁, ..., S⁻¹v_k} ⊆ Null(B − αI). Since Q₁ is linearly independent and S is invertible, Q₂ is linearly independent. So, Geo.Mul_α(A) ≤ Geo.Mul_α(B). Now, starting with eigenvectors of B and using similar arguments, we get Geo.Mul_α(B) ≤ Geo.Mul_α(A), and hence the required result follows.

Remark. Let A ∈ Mₙ(C). Then, for any invertible matrix B, the matrices AB and BA = B(AB)B⁻¹ are similar. Hence, in this case the matrices AB and BA have

1. the same set of eigenvalues.
2. Alg.Mul_α(AB) = Alg.Mul_α(BA), for each α ∈ σ(AB).
3. Geo.Mul_α(AB) = Geo.Mul_α(BA), for each α ∈ σ(AB).

We now give a relation between the geometric multiplicity and the algebraic multiplicity.

Theorem. Let A ∈ Mₙ(C). Then, for each α ∈ σ(A), Geo.Mul_α(A) ≤ Alg.Mul_α(A).

Proof. Let Geo.Mul_α(A) = k. Suppose Q₁ = {v₁, ..., v_k} is an orthonormal basis of Null(A − αI). Extend Q₁ to an orthonormal basis {v₁, ..., v_k, v_{k+1}, ..., vₙ} of Cⁿ and put P = [v₁, ..., vₙ]. Then P* = P⁻¹ and

P*AP = P*[Av₁, ..., Avₙ] = [v₁*; ⋮; vₙ*][αv₁, ..., αv_k, Av_{k+1}, ..., Avₙ] = [αI_k, ∗; 0, D],

a block matrix whose top-left k × k block is αI_k.

Now, denoting the lower right submatrix by D, we get

p_A(x) = det(A − xI) = det(P*AP − xI) = (α − x)ᵏ det(D − xI).

So, Alg.Mul_α(A) = Alg.Mul_α(P*AP) ≥ k = Geo.Mul_α(A).

Exercise 1. Let A ∈ M_{m,n}(R) and B ∈ M_{n,m}(R).

(a) If α ∈ σ(AB) and α ≠ 0 then
 i. α ∈ σ(BA).
 ii. Alg.Mul_α(AB) = Alg.Mul_α(BA).
 iii. Geo.Mul_α(AB) = Geo.Mul_α(BA).
(b) If 0 ∈ σ(AB) and n = m then Alg.Mul₀(AB) = Alg.Mul₀(BA), as there are n eigenvalues, counted with multiplicity.
(c) Give an example to show that Geo.Mul₀(AB) need not equal Geo.Mul₀(BA), even when n = m.

2. Let A ∈ Mₙ(R) be an invertible matrix and let x, y ∈ Rⁿ with x ≠ 0 and yᵀA⁻¹x ≠ 0. Define B = xyᵀA⁻¹. Then prove that

(a) λ₀ = yᵀA⁻¹x is an eigenvalue of B of multiplicity 1.
(b) 0 is an eigenvalue of B of multiplicity n − 1 [Hint: Use Exercise 1a].
(c) 1 + αλ₀ is an eigenvalue of I + αB of multiplicity 1, for any α ∈ R.
(d) 1 is an eigenvalue of I + αB of multiplicity n − 1, for any α ∈ R.
(e) det(A + αxyᵀ) equals (1 + αλ₀) det(A), for any α ∈ R. This result is known as the Sherman-Morrison formula for the determinant.

3. Let A, B ∈ M₂(R) be such that det(A) = det(B) and Tr(A) = Tr(B).
(a) Do A and B have the same set of eigenvalues?
(b) Give examples to show that the matrices A and B need not be similar.

4. Let A, B ∈ Mₙ(R). Also, let (λ₁, u) and (λ₂, v) be eigen-pairs of A and B, respectively.
(a) If u = αv for some α ∈ R then (λ₁ + λ₂, u) is an eigen-pair of A + B.
(b) Give an example to show that if u and v are linearly independent then λ₁ + λ₂ need not be an eigenvalue of A + B.

5. Let A ∈ Mₙ(R) be an invertible matrix with eigen-pairs (λ₁, u₁), ..., (λₙ, uₙ). Then prove that B = {u₁, ..., uₙ} forms a basis of Rⁿ. If [b]_B = (c₁, ..., cₙ)ᵀ then the system Ax = b has the unique solution

x = (c₁/λ₁)u₁ + (c₂/λ₂)u₂ + ⋯ + (cₙ/λₙ)uₙ.
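Before moving on, a numerical sketch (numpy assumed; the random matrix and the Jordan-type matrix J are illustrative choices) of two facts from this section: det and trace against the eigenvalues, and Geo.Mul ≤ Alg.Mul.

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 4))
vals = np.linalg.eigvals(A)
print(np.isclose(np.prod(vals), np.linalg.det(A)))   # True: det = product
print(np.isclose(np.sum(vals), np.trace(A)))         # True: trace = sum

J = np.array([[2., 1, 0], [0, 2, 1], [0, 0, 2]])     # Alg.Mul_2(J) = 3
geo = 3 - np.linalg.matrix_rank(J - 2 * np.eye(3))   # dim Null(J - 2I)
print(geo)                                           # 1, which is <= 3
```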

6.2 Diagonalization

Let A ∈ Mₙ(C) and let T ∈ L(Cⁿ) be defined by T(x) = Ax, for all x ∈ Cⁿ. In this section, we first find conditions under which one can obtain a basis B of Cⁿ such that T[B, B] is a diagonal matrix. It is then shown that normal matrices satisfy these conditions. To start with, we have the following definition.

Definition [Matrix Diagonalization] A matrix A is said to be diagonalizable if A is similar to a diagonal matrix. Equivalently, P⁻¹AP = D, i.e., AP = PD, for some diagonal matrix D and invertible matrix P.

Example 1. Let A = [0, 1; 0, 0]. Then A cannot be diagonalized.

Solution: Suppose A is diagonalizable. Then A is similar to D = diag(d₁, d₂). Thus, by the theorem on similar matrices, {d₁, d₂} = σ(D) = σ(A) = {0, 0}. Hence, D = 0 and therefore A = SDS⁻¹ = 0, a contradiction.

2. Let A = [2, 1, 0; 0, 2, 1; 0, 0, 2]. Then A cannot be diagonalized.

Solution: Suppose A is diagonalizable. Then A is similar to D = diag(d₁, d₂, d₃). Thus, by the theorem on similar matrices, {d₁, d₂, d₃} = σ(D) = σ(A) = {2, 2, 2}. Hence, D = 2I₃ and therefore A = SDS⁻¹ = 2I₃, a contradiction.

3. Let A = [0, 1; −1, 0]. Then (i, (1, i)ᵀ) and (−i, (i, 1)ᵀ) are two eigen-pairs of A. Define U = (1/√2)[1, i; i, 1]. Then U*U = I₂ = UU* and U*AU = diag(i, −i).

Theorem. Let A ∈ Mₙ(C).

1. If S is an invertible matrix such that S⁻¹AS = diag(d₁, ..., dₙ) then, for 1 ≤ i ≤ n, the i-th column of S is an eigenvector of A corresponding to dᵢ.
2. A is diagonalizable if and only if A has n linearly independent eigenvectors.

Proof. Let S = [u₁, ..., uₙ]. Then AS = SD gives

[Au₁, ..., Auₙ] = AS = SD = S diag(d₁, ..., dₙ) = [d₁u₁, ..., dₙuₙ].

Or equivalently, Auᵢ = dᵢuᵢ, for 1 ≤ i ≤ n. As S is invertible, {u₁, ..., uₙ} is linearly independent. Hence, (dᵢ, uᵢ), for 1 ≤ i ≤ n, are eigen-pairs of A. This proves Part 1 and the "only if" part of Part 2.

Conversely, let {u₁, ..., uₙ} be n linearly independent eigenvectors of A corresponding to the eigenvalues α₁, ..., αₙ. Then S = [u₁, ..., uₙ] is non-singular and

AS = [Au₁, ..., Auₙ] = [α₁u₁, ..., αₙuₙ] = [u₁, ..., uₙ] diag(α₁, ..., αₙ) = SD,

where D = diag(α₁, ..., αₙ). Therefore, S⁻¹AS = D and hence A is diagonalizable.
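A diagonalization sketch following the theorem (numpy assumed; the matrix is the symmetric matrix A from the opening example of this chapter): stacking the eigenvectors as the columns of S makes S⁻¹AS diagonal.

```python
import numpy as np

A = np.array([[1., 2], [2, 1]])              # distinct eigenvalues 3 and -1
vals, S = np.linalg.eig(A)                   # columns of S are eigenvectors
D = np.linalg.inv(S) @ A @ S
print(np.allclose(D, np.diag(vals)))         # True: S^{-1} A S is diagonal
print(np.allclose(A, S @ np.diag(vals) @ np.linalg.inv(S)))  # True
```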

Theorem. Let (α₁, v₁), ..., (α_k, v_k) be k eigen-pairs of A ∈ Mₙ(C) with the αᵢ's distinct. Then {v₁, ..., v_k} is linearly independent.

Proof. Suppose {v₁, ..., v_k} is linearly dependent. Then there exist a smallest l ∈ {1, ..., k−1} and scalars β₁, ..., β_l, with β_l ≠ 0, such that v_{l+1} = β₁v₁ + ⋯ + β_l v_l. So,

α_{l+1} v_{l+1} = α_{l+1}β₁v₁ + ⋯ + α_{l+1}β_l v_l (∗)

and

α_{l+1} v_{l+1} = A v_{l+1} = A(β₁v₁ + ⋯ + β_l v_l) = α₁β₁v₁ + ⋯ + α_l β_l v_l. (∗∗)

Now, subtracting Equation (∗∗) from Equation (∗), we get

0 = (α_{l+1} − α₁)β₁v₁ + ⋯ + (α_{l+1} − α_l)β_l v_l,

with (α_{l+1} − α_l)β_l ≠ 0 as the αᵢ's are distinct. So, v_l ∈ LS(v₁, ..., v_{l−1}), a contradiction to the choice of l. Thus, the required result follows.

An immediate corollary of the two previous theorems is stated next without proof.

Corollary. Let A ∈ Mₙ(C) have n distinct eigenvalues. Then A is diagonalizable.

The converse of this corollary is not true, as Iₙ has n linearly independent eigenvectors corresponding to the eigenvalue 1, repeated n times.

Corollary. Let α₁, ..., α_k be the k distinct eigenvalues of A ∈ Mₙ(C). Also, for 1 ≤ i ≤ k, let dim(Null(A − αᵢIₙ)) = nᵢ. Then A has Σᵢ nᵢ linearly independent eigenvectors.

Proof. For 1 ≤ i ≤ k, let Sᵢ = {u_{i1}, ..., u_{inᵢ}} be a basis of Null(A − αᵢIₙ). Then we need to prove that the union of the Sᵢ's is linearly independent. To do so, define

p_j(A) = ∏ over i ≠ j of (A − αᵢIₙ), for 1 ≤ j ≤ k.

Then p_j(A) is a polynomial in A of degree k − 1 and

p_j(A)u = 0, if u ∈ Null(A − αᵢIₙ) for some i ≠ j, and
p_j(A)u = (∏ over i ≠ j of (α_j − αᵢ)) u, if u ∈ Null(A − α_jIₙ). (∗)

So, to prove that the union of the Sᵢ's is linearly independent, consider the linear system

c₁₁u₁₁ + ⋯ + c_{1n₁}u_{1n₁} + ⋯ + c_{k1}u_{k1} + ⋯ + c_{kn_k}u_{kn_k} = 0

in the unknowns c_{ij}. Applying the matrix p_j(A) and using Equation (∗), we get

(∏ over i ≠ j of (α_j − αᵢ)) (c_{j1}u_{j1} + ⋯ + c_{jn_j}u_{jn_j}) = 0.

But ∏ over i ≠ j of (α_j − αᵢ) ≠ 0, as the αᵢ's are distinct. Hence, c_{j1}u_{j1} + ⋯ + c_{jn_j}u_{jn_j} = 0. As S_j is a basis of Null(A − α_jIₙ), we get c_{jt} = 0, for 1 ≤ t ≤ n_j. Thus, the required result follows.

Corollary. Let A ∈ Mₙ(C) with distinct eigenvalues α₁, ..., α_k. Then A is diagonalizable if and only if Geo.Mul_{αᵢ}(A) = Alg.Mul_{αᵢ}(A), for each 1 ≤ i ≤ k.

Proof. Let Alg.Mul_{αᵢ}(A) = mᵢ. Then Σᵢ mᵢ = n. Let Geo.Mul_{αᵢ}(A) = nᵢ, for 1 ≤ i ≤ k. Then, by the previous corollary, A has Σᵢ nᵢ linearly independent eigenvectors, and by the theorem relating the two multiplicities, nᵢ ≤ mᵢ, for 1 ≤ i ≤ k.

Now, let A be diagonalizable. Then A has n linearly independent eigenvectors, so n = Σᵢ nᵢ. As nᵢ ≤ mᵢ and Σᵢ mᵢ = n, we get nᵢ = mᵢ.

Conversely, assume that Geo.Mul_{αᵢ}(A) = Alg.Mul_{αᵢ}(A), for 1 ≤ i ≤ k. Then A has Σᵢ nᵢ = Σᵢ mᵢ = n linearly independent eigenvectors. Hence, by the theorem above, A is diagonalizable.

Example. Let A = [1, 0, 0; 0, 2, 1; 0, 0, 2]. Then (1, e₁) and (2, e₂) are, up to scaling, the only eigen-pairs. Hence, by the theorem above, A is not diagonalizable.

Exercise 1. Is the matrix A = [⋯] diagonalizable?

2. Let A ∈ Mₙ(R) and B ∈ Mₘ(R). Suppose C = [A, 0; 0, B]. Then prove that C is diagonalizable if and only if both A and B are diagonalizable.

3. Let Jₙ be the n × n matrix with all entries 1. Prove that Geo.Mul_n(Jₙ) = Alg.Mul_n(Jₙ) = 1 and Geo.Mul₀(Jₙ) = Alg.Mul₀(Jₙ) = n − 1.

4. Let A = [a_{ij}] ∈ Mₙ(R), where a_{ij} = a if i = j and a_{ij} = b otherwise. Then verify that A = (a − b)Iₙ + bJₙ. Hence, or otherwise, determine the eigenvalues and eigenvectors of A. Is A diagonalizable?

5. Let T : R⁵ → R⁵ be a linear operator with Rank(T − I) = 3 and

Null(T) = {(x₁, x₂, x₃, x₄, x₅) ∈ R⁵ : x₁ + x₄ + x₅ = 0, x₂ + x₃ = 0}.

(a) Determine the eigenvalues of T.
(b) For each distinct eigenvalue α of T, determine Geo.Mul_α(T).
(c) Is T diagonalizable? Justify your answer.

6. Let A ∈ Mₙ(R) with A ≠ 0 but A² = 0. Prove that A cannot be diagonalized.

7. Are the following matrices diagonalizable?
i) [⋯], ii) [⋯], iii) [⋯] and iv) [2, i; i, 0].

6.2.A Schur's Unitary Triangularization

We now prove one of the most important results in diagonalization, called Schur's Lemma or Schur's unitary triangularization.

Lemma (Schur's Unitary Triangularization (SUT)). Let A ∈ Mₙ(C). Then there exists a unitary matrix U such that U*AU is an upper triangular matrix. Further, if A ∈ Mₙ(R) and σ(A) has real entries then U can be chosen to be a real orthogonal matrix.

Proof. We prove the result by induction on n. The result is clearly true for n = 1. So, let n > 1 and assume the result to be true for k < n; we prove it for n.

Let (λ₁, x₁) be an eigen-pair of A with ‖x₁‖ = 1. Now, extend it to an orthonormal basis {x₁, x₂, ..., xₙ} of Cⁿ and define X = [x₁, x₂, ..., xₙ]. Then X is a unitary matrix and

X*AX = X*[Ax₁, Ax₂, ..., Axₙ] = [x₁*; x₂*; ⋮; xₙ*][λ₁x₁, Ax₂, ..., Axₙ] = [λ₁, ∗; 0, B],

where B ∈ M_{n−1}(C). By the induction hypothesis there exists a unitary matrix U ∈ M_{n−1}(C) such that U*BU = T is an upper triangular matrix. Define Û = X[1, 0; 0, U]. Then, using Exercise 7.7, the matrix Û is unitary and

Û*AÛ = [1, 0; 0, U*](X*AX)[1, 0; 0, U] = [1, 0; 0, U*][λ₁, ∗; 0, B][1, 0; 0, U] = [λ₁, ∗U; 0, U*BU] = [λ₁, ∗; 0, T].

Since T is upper triangular, [λ₁, ∗; 0, T] is upper triangular.

Further, if A ∈ Mₙ(R) and σ(A) has real entries then x₁ ∈ Rⁿ with Ax₁ = λ₁x₁, and one uses induction once again to get the required result.

Remark. Let A ∈ Mₙ(C). Then, by Schur's Lemma there exists a unitary matrix U such that U*AU = T = [t_{ij}], a triangular matrix. Thus,

{α₁, ..., αₙ} = σ(A) = σ(U*AU) = {t₁₁, ..., tₙₙ}.

Furthermore, we can get the αᵢ's in the diagonal of T in any prescribed order.

Definition [Unitary Equivalence] Let A, B ∈ Mₙ(C). Then A and B are said to be unitarily equivalent/similar if there exists a unitary matrix U such that A = U*BU.

Exercise. Use the exercises given below to conclude that the upper triangular matrix obtained in Schur's Lemma need not be unique.

151 6.2. DIAGONALIZATION Prove that B = and C = are unitarily equivalent Prove that D = and E = are unitarily equivalent Let A 1 = and A 2 = Then prove that (a) A 1 and D are unitarily equivalent. (b) A 2 and B are unitarily equivalent. (c) Do the above results contradict Exercise c? Give reasons for your answer Prove that A = and B = are unitarily equivalent Let A be a normal matrix. If all the eigenvalues of A are 0 then prove that A = 0. What happens if all the eigenvalues of A are 1? 6. Let A M n (C). Then Prove that if x Ax = 0, for all x C n, then A = 0. Do these results hold for arbitrary matrices? [ ] [ ] Show that the matrices A = and B = are similar. Is it possible to find a unitary matrix U such that A = U BU? Remark We know that if two matrices are unitarily equivalent then they are necessarily similar as U = U 1, for every unitary matrix U. But, similarity doesn t imply unitary equivalence (see Exercise ). In numerical calculations, unitary transformations are preferred as compared to similarity transformations due to the following main reasons: 1. Exercise c? implies that Ax = x, whenever A is a normal matrix. This need not be true under a similarity change of basis. 2. As U 1 = U, for a unitary matrix, unitary equivalence is computationally simpler. 3. Also, computation of conjugate transpose doesn t create round-off error in calculation. We use Lemma to give another proof of Theorem Corollary Let A M n (C). If σ(a) = {α 1,...,α n } then det(a) = n α i and Tr(A) = n α i.

Proof. By Schur's Lemma there exists a unitary matrix U such that U*AU = T = [t_{ij}], a triangular matrix. By the remark above, σ(A) = σ(T). Hence,

det(A) = det(T) = ∏ᵢ t_{ii} = ∏ᵢ αᵢ

and

Tr(A) = Tr(A(UU*)) = Tr(U*(AU)) = Tr(T) = Σᵢ t_{ii} = Σᵢ αᵢ.

6.2.B Diagonalizability of some Special Matrices

We now use Schur's unitary triangularization lemma to state the main theorem of this subsection. Also, recall that A is said to be a normal matrix if AA* = A*A.

Theorem (Spectral Theorem for Normal Matrices). Let A ∈ Mₙ(C). If A is a normal matrix then there exists a unitary matrix U such that U*AU = diag(α₁, ..., αₙ).

Proof. By Schur's Lemma there exists a unitary matrix U such that U*AU = T = [t_{ij}], an upper triangular matrix. Since A is normal, so is T, as

T*T = (U*AU)*(U*AU) = U*A*AU = U*AA*U = (U*AU)(U*AU)* = TT*.

Thus, T is an upper triangular matrix with T*T = TT*. Hence, by an earlier exercise, T is a diagonal matrix, and this completes the proof.

Exercise. Let A ∈ Mₙ(C). If A is either a Hermitian, skew-Hermitian or unitary matrix, then A is a normal matrix.

We rewrite the spectral theorem in another form to indicate that A can be decomposed into a linear combination of orthogonal projectors onto its eigen-spaces. Thus, the decomposition is independent of the choice of eigenvectors.

Remark. Let A ∈ Mₙ(C) be a normal matrix with eigenvalues α₁, ..., αₙ.

1. Then there exists a unitary matrix U = [u₁, ..., uₙ] such that
(a) Iₙ = u₁u₁* + ⋯ + uₙuₙ*.
(b) the columns of U form a set of orthonormal eigenvectors of A (use the theorem on diagonalization).
(c) A = A Iₙ = A(u₁u₁* + ⋯ + uₙuₙ*) = α₁u₁u₁* + ⋯ + αₙuₙuₙ*.

2. Let α₁, ..., α_k be the distinct eigenvalues of A and let Wᵢ = Null(A − αᵢIₙ), for 1 ≤ i ≤ k, be the corresponding eigen-spaces.
(a) Then we can group the uᵢ's so that they form orthonormal bases of the Wᵢ's, for 1 ≤ i ≤ k. Hence, Cⁿ = W₁ ⊕ ⋯ ⊕ W_k.
(b) If P_{αᵢ} is the orthogonal projector onto Wᵢ, for 1 ≤ i ≤ k, then A = α₁P_{α₁} + ⋯ + α_kP_{α_k}. Thus, A depends only on its eigen-spaces and not on the computed eigenvectors.

We now give the spectral theorem for Hermitian matrices.

Theorem. Let A ∈ Mₙ(C) be a Hermitian matrix. Then

1. the eigenvalues αᵢ, for 1 ≤ i ≤ n, of A are real, and

2. there exists a unitary matrix U such that U*AU = D, where D = diag(α₁, ..., αₙ).

Proof. The second part is immediate from the spectral theorem for normal matrices, and hence its proof is omitted. For Part 1, let (α, x) be an eigen-pair, so that Ax = αx. As A is Hermitian, A = A*. Thus,

x*A = x*A* = (Ax)* = (αx)* = conj(α) x*.

Hence, using x*A = conj(α)x*, we get

conj(α) x*x = (x*A)x = x*(Ax) = x*(αx) = α x*x.

As x is an eigenvector, x ≠ 0, so x*x = ‖x‖² ≠ 0. Thus conj(α) = α, that is, α ∈ R.

As an immediate corollary of the above theorem and the second part of Schur's Lemma, we give the following result without proof.

Corollary. Let A ∈ Mₙ(R) be symmetric. Then A = U diag(α₁, ..., αₙ) Uᵀ, where

1. the αᵢ's are all real,
2. the columns of U can be chosen to have real entries, and
3. the eigenvectors that correspond to the columns of U form an orthonormal basis of Rⁿ.

Exercise 1. Let A be a skew-symmetric matrix. Then the eigenvalues of A are either zero or purely imaginary, and A is unitarily diagonalizable.

2. Let A be a skew-Hermitian matrix. Then A is unitarily diagonalizable.

3. Let A be a normal matrix with (λ, x) as an eigen-pair. Then
(a) (A*)ᵏx, for k ∈ Z⁺, is also an eigenvector of A corresponding to λ.
(b) (conj(λ), x) is an eigen-pair of A*. [Hint: Verify that ‖A*x − conj(λ)x‖² = ‖Ax − λx‖².]

4. Let A be an n × n unitary matrix. Then
(a) |λ| = 1 for any eigenvalue λ of A.
(b) the eigenvectors x, y corresponding to distinct eigenvalues are orthogonal.

5. Let A be a 2 × 2 orthogonal matrix. Then prove the following:
(a) if det(A) = 1 then A = [cos θ, −sin θ; sin θ, cos θ], for some θ, 0 ≤ θ < 2π. That is, A counterclockwise rotates every point in R² by an angle θ.
(b) if det(A) = −1 then A = [cos θ, sin θ; sin θ, −cos θ], for some θ, 0 ≤ θ < 2π. That is, A reflects every point in R² about a line passing through the origin. Determine this line. Or equivalently, there exists a non-singular matrix P such that P⁻¹AP = [1, 0; 0, −1].

6. Let A be a 3 × 3 orthogonal matrix. Then prove the following:

(a) if det(A) = 1 then A is a rotation about a fixed axis, in the sense that A has an eigen-pair (1, x) such that the restriction of A to the plane x⊥ is a two dimensional rotation of x⊥.
(b) if det(A) = −1 then A corresponds to a reflection through a plane P, followed by a rotation about the line through the origin that is orthogonal to P.

7. Let A be a normal matrix. Then prove that Rank(A) equals the number of non-zero eigenvalues of A, counted with multiplicity.

6.2.C Cayley Hamilton Theorem

Let A ∈ Mₙ(C). Then, in the theorem on the characteristic polynomial, we saw that

p_A(x) = det(A − xI) = (−1)ⁿ (xⁿ − aₙ₋₁xⁿ⁻¹ + aₙ₋₂xⁿ⁻² − ⋯ + (−1)ⁿ⁻¹a₁x + (−1)ⁿa₀)

for certain aᵢ ∈ C, 0 ≤ i ≤ n − 1. Also, if α is an eigenvalue of A then p_A(α) = 0. So, the scalar equation

xⁿ − aₙ₋₁xⁿ⁻¹ + aₙ₋₂xⁿ⁻² − ⋯ + (−1)ⁿ⁻¹a₁x + (−1)ⁿa₀ = 0

is satisfied by the n eigenvalues of A. It turns out that the corresponding matrix expression

Aⁿ − aₙ₋₁Aⁿ⁻¹ + aₙ₋₂Aⁿ⁻² − ⋯ + (−1)ⁿ⁻¹a₁A + (−1)ⁿa₀I = 0

holds true as a matrix identity. This celebrated result is called the Cayley Hamilton Theorem. We give a proof using Schur's unitary triangularization. To do so, we first look at multiplication of certain upper triangular matrices.

Lemma. Let A₁, ..., Aₙ ∈ Mₙ(C) be upper triangular matrices such that the (i, i)-th entry of Aᵢ equals 0, for 1 ≤ i ≤ n. Then A₁A₂⋯Aₙ = 0.

Proof. We use induction to prove that the first k columns of A₁A₂⋯A_k are zero, for 1 ≤ k ≤ n. The result is clearly true for k = 1, as the first column of A₁ is zero. For clarity, we show that the first two columns of A₁A₂ are zero. Let B = A₁A₂. Then, by matrix multiplication,

B[:,i] = A₁[:,1](A₂)₁ᵢ + A₁[:,2](A₂)₂ᵢ + ⋯ + A₁[:,n](A₂)ₙᵢ = 0, for i = 1, 2,

as A₁[:,1] = 0 and (A₂)ⱼᵢ = 0 whenever i ≤ 2 and j ≥ 2.

So, assume that the first n − 1 columns of C = A₁⋯Aₙ₋₁ are zero and let B = CAₙ. Then, for 1 ≤ i ≤ n, we see that

B[:,i] = C[:,1](Aₙ)₁ᵢ + C[:,2](Aₙ)₂ᵢ + ⋯ + C[:,n](Aₙ)ₙᵢ = 0,

as C[:,j] = 0, for 1 ≤ j ≤ n − 1, and (Aₙ)ₙᵢ = 0, for 1 ≤ i ≤ n. Thus, by induction, the required result follows.

We now prove the Cayley Hamilton Theorem using Schur's unitary triangularization.

Theorem (Cayley Hamilton Theorem). Let A ∈ Mₙ(C). Then A satisfies its characteristic equation. That is, if

p_A(x) = det(A − xIₙ) = a₀ − xa₁ + ⋯ + (−1)ⁿ⁻¹aₙ₋₁xⁿ⁻¹ + (−1)ⁿxⁿ

then

Aⁿ − aₙ₋₁Aⁿ⁻¹ + aₙ₋₂Aⁿ⁻² − ⋯ + (−1)ⁿ⁻¹a₁A + (−1)ⁿa₀I = 0

holds true as a matrix identity.
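Before turning to the proof, a quick numerical check (numpy assumed; the matrix is an arbitrary illustrative choice): evaluating the characteristic polynomial at A itself gives the zero matrix.

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 4))
coeffs = np.poly(A)        # coefficients of det(xI - A), highest power first

p_of_A = np.zeros_like(A)
for c in coeffs:           # Horner's scheme: p(A) = ((1*A + c1)A + c2)A + ...
    p_of_A = p_of_A @ A + c * np.eye(4)
print(np.allclose(p_of_A, 0))               # True: A satisfies p_A
```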

Proof. Let σ(A) = {α₁, ..., αₙ}, so that p_A(x) = (−1)ⁿ ∏ᵢ (x − αᵢ). By Schur's unitary triangularization there exists a unitary matrix U such that U*AU = T, an upper triangular matrix with t_{ii} = αᵢ, for 1 ≤ i ≤ n. Now, observe that the matrices Aᵢ = T − αᵢI satisfy the conditions of the previous lemma. Hence,

(T − α₁I)⋯(T − αₙI) = 0.

Therefore, up to the sign (−1)ⁿ,

p_A(A) = ∏ᵢ (A − αᵢI) = ∏ᵢ (UTU* − αᵢUU*) = U[(T − α₁I)⋯(T − αₙI)]U* = U 0 U* = 0.

Thus, the required result follows.

We now give some examples and implications of the Cayley Hamilton Theorem.

Remark 1. Let A = [1, 2; 1, −3]. Then p_A(x) = x² + 2x − 5. Hence, verify that

A² + 2A − 5I₂ = [3, −4; −2, 11] + [2, 4; 2, −6] − [5, 0; 0, 5] = [0, 0; 0, 0].

Further, verify that A⁻¹ = (1/5)(A + 2I₂) = (1/5)[3, 2; 1, −1].

2. Let A = [0, 1; 0, 0]. Then p_A(x) = x². So, even though A ≠ 0, the Cayley Hamilton Theorem gives A² = 0.

3. For A = [0, 0, 1; 0, 0, 0; 0, 0, 0], p_A(x) = −x³. Thus, by the Cayley Hamilton Theorem, A³ = 0. But it turns out that already A² = 0.

4. Let A ∈ Mₙ(C) with p_A(x) = a₀ − xa₁ + ⋯ + (−1)ⁿ⁻¹aₙ₋₁xⁿ⁻¹ + (−1)ⁿxⁿ.

(a) Then, for any l ∈ N, the division algorithm gives α₀, α₁, ..., αₙ₋₁ ∈ C and a polynomial f(x) with coefficients from C such that

xˡ = f(x) p_A(x) + α₀ + α₁x + ⋯ + αₙ₋₁xⁿ⁻¹.

Hence, by the Cayley Hamilton Theorem, Aˡ = α₀I + α₁A + ⋯ + αₙ₋₁Aⁿ⁻¹.

i. Thus, to compute any power of A, one needs only to apply the division algorithm to get the αᵢ's and to know Aⁱ, for 1 ≤ i ≤ n − 1. This is quite helpful in numerical computation, as computing powers directly takes much more time than division.

ii. Note that LS{I, A, A², ...} is a subspace of Mₙ(C). Also, dim(Mₙ(C)) = n². But the above argument implies that dim(LS{I, A, A², ...}) ≤ n.

iii. In the language of graph theory, it says the following: let G be a graph on n vertices and A its adjacency matrix. Suppose there is no path of length n − 1 or less from a vertex v to a vertex u in G. Then G doesn't have a path from v to u of any length. That is, the graph G is disconnected, and v and u are in different components of G.

(b) Suppose A is non-singular. Then, by definition, a₀ = det(A) ≠ 0. Hence,

A⁻¹ = (1/a₀)[a₁I − a₂A + ⋯ + (−1)ⁿ⁻²aₙ₋₁Aⁿ⁻² + (−1)ⁿ⁻¹Aⁿ⁻¹].

This matrix identity can be used to calculate the inverse.

(c) The above also implies that if A is invertible then A⁻¹ ∈ LS{I, A, A², ...}. That is, A⁻¹ is a linear combination of the matrices I, A, ..., Aⁿ⁻¹.

Exercise. Find the inverses of the matrices [⋯], [⋯] and [⋯] by the Cayley Hamilton Theorem.

Exercise (Miscellaneous Exercises):

1. Let B be an m × n matrix and A = [0, B; Bᵀ, 0]. Then prove that (λ, (x, y)ᵀ) is an eigen-pair of A if and only if (−λ, (x, −y)ᵀ) is an eigen-pair of A.

2. Let B and C be n × n matrices and A = [B, C; −C, B]. Then prove the following:

(a) if s is a real eigenvalue of A with corresponding eigenvector (x, y)ᵀ then s is also an eigenvalue corresponding to the eigenvector (y, −x)ᵀ.
(b) if s + it is a complex eigenvalue of A with corresponding eigenvector (x + iy, −y + ix)ᵀ then s − it is also an eigenvalue of A with corresponding eigenvector (x − iy, −y − ix)ᵀ.
(c) (s + it, x + iy) is an eigen-pair of B + iC if and only if (s − it, x − iy) is an eigen-pair of B − iC.
(d) (s + it, (x + iy, −y + ix)ᵀ) is an eigen-pair of A if and only if (s + it, x + iy) is an eigen-pair of B + iC.
(e) det(A) = |det(B + iC)|².

We end this chapter with an application to the study of conic sections in analytic geometry.

6.3 Quadratic Forms

Definition. Let A ∈ Mₙ(C). Then A is said to be

1. positive semi-definite (psd) if x*Ax ≥ 0, for all x ∈ Cⁿ.
2. positive definite (pd) if x*Ax > 0, for all x ∈ Cⁿ \ {0}.
3. negative semi-definite (nsd) if x*Ax ≤ 0, for all x ∈ Cⁿ.
4. negative definite (nd) if x*Ax < 0, for all x ∈ Cⁿ \ {0}.
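These definitions can be tested computationally. A hedged sketch (numpy assumed; it relies on the eigenvalue characterizations proved just below): for a Hermitian A, positive (semi-)definiteness can be read off the eigenvalues, and positive definiteness also via an attempted Cholesky factorization.

```python
import numpy as np

A = np.array([[2., 1], [1, 2]])              # the first example below
vals = np.linalg.eigvalsh(A)                 # real, since A is symmetric
print(np.all(vals > 0))                      # True: positive definite

try:
    np.linalg.cholesky(A)                    # succeeds iff A is pd
    print("A = B* B for a non-singular B")   # matches part 3 of the theorem
except np.linalg.LinAlgError:
    print("not positive definite")
```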

Remark. Let A = [a_{ij}] ∈ Mₙ(C) be a positive semi-definite (positive definite, negative semi-definite or negative definite) matrix. Then A is Hermitian.

Solution: By definition, a_{ii} = eᵢ*Aeᵢ ∈ R. Also,

a_{ii} + a_{jj} + a_{ij} + a_{ji} = (eᵢ + eⱼ)*A(eᵢ + eⱼ) ∈ R,

so that Im(a_{ij}) = −Im(a_{ji}). Similarly,

a_{ii} + a_{jj} + i a_{ij} − i a_{ji} = (eᵢ + ieⱼ)*A(eᵢ + ieⱼ) ∈ R

implies that Re(a_{ij}) = Re(a_{ji}). Hence, a_{ji} = conj(a_{ij}), i.e., A = A*.

Example 1. Let A = [2, 1; 1, 2]. Then A is positive definite.
2. Let A = [1, 1; 1, 1]. Then A is positive semi-definite but not positive definite.
3. Let A = [−2, 1; 1, −2]. Then A is negative definite.
4. Let A = [−1, 1; 1, −1]. Then A is negative semi-definite.
5. Let A = [1, 1; 1, −1]. Then A is neither positive semi-definite nor positive definite nor negative semi-definite nor negative definite.

Theorem. Let A ∈ Mₙ(C). Then the following statements are equivalent.

1. A is positive semi-definite.
2. A = A* and each eigenvalue of A is non-negative.
3. A = B*B, for some B ∈ Mₙ(C).

Proof. 1 ⇒ 2: Let A be positive semi-definite. Then, by the remark above, A is Hermitian. If (α, v) is an eigen-pair of A then α‖v‖² = v*Av ≥ 0. So α ≥ 0.

2 ⇒ 3: Let σ(A) = {α₁, ..., αₙ}. Then, by the spectral theorem, there exists a unitary matrix U such that U*AU = D with D = diag(α₁, ..., αₙ). As αᵢ ≥ 0, for 1 ≤ i ≤ n, define D^{1/2} = diag(√α₁, ..., √αₙ). Then A = UD^{1/2}[D^{1/2}U*] = B*B, with B = D^{1/2}U*.

3 ⇒ 1: Let A = B*B. Then, for x ∈ Cⁿ, x*Ax = x*B*Bx = ‖Bx‖² ≥ 0. Thus, the required result follows.

A similar argument gives the next result and hence its proof is omitted.

Theorem. Let A ∈ Mₙ(C). Then the following statements are equivalent.

1. A is positive definite.
2. A = A* and each eigenvalue of A is positive.
3. A = B*B, for a non-singular matrix B ∈ Mₙ(C).

Definition. Let V be a vector space over F. Then,

1. for a fixed m ∈ N, a function f : V^m → F is called an m-multilinear function if f is linear in each component. That is,

f(v₁, ..., v_{i−1}, vᵢ + αu, v_{i+1}, ..., v_m) = f(v₁, ..., v_{i−1}, vᵢ, v_{i+1}, ..., v_m) + α f(v₁, ..., v_{i−1}, u, v_{i+1}, ..., v_m),

for α ∈ F, u ∈ V and vᵢ ∈ V, for 1 ≤ i ≤ m.

2. An m-multilinear form is also called an m-form.
3. A 2-form is called a bilinear form.

Definition [Sesquilinear, Hermitian and Quadratic Forms] Let A = [a_{ij}] ∈ Mₙ(C) be a Hermitian matrix and let x, y ∈ Cⁿ. Then a sesquilinear form in x, y ∈ Cⁿ is defined as H(x, y) = y*Ax. In particular, H(x, x), denoted H(x), is called the Hermitian form. In case A ∈ Mₙ(R), H(x) is called the quadratic form.

Remark. Observe that

1. if A = Iₙ then the bilinear/sesquilinear form reduces to the standard inner product.
2. H(x, y) is linear in the first component and conjugate linear in the second component.
3. the Hermitian form H(x) is a real number. Hence, for α ∈ R, the equation H(x) = α represents a conic in Cⁿ.

Example 1. Let vᵢ ∈ Cⁿ, for 1 ≤ i ≤ n. Then f(v₁, ..., vₙ) = det([v₁, ..., vₙ]) is an n-form on Cⁿ.

2. Let A ∈ Mₙ(R). Then f(x, y) = yᵀAx, for x, y ∈ Rⁿ, is a bilinear form on Rⁿ.

3. Let A = [1, 2 − i; 2 + i, 2]. Then A = A* and, for x = (x, y)ᵀ, verify that

H(x) = x*Ax = |x|² + 2|y|² + 2 Re((2 − i) conj(x) y),

where Re denotes the real part of a complex number, is a Hermitian form.

6.3.A Sylvester's law of inertia

The main idea of this section is to express H(x) as a sum or difference of absolute squares. Since H(x) is quadratic in x, replacing x by cx, for c ∈ C, just multiplies H(x) by |c|². Hence, one needs to study only the normalized vectors. Also, in the opening example of this chapter we expressed

xᵀAx = 3(x + y)²/2 − (x − y)²/2 and xᵀBx = 5(x + y)²/2 + (x − y)²/2.

But we can also express them as

xᵀAx = 2(x + y)² − (x² + y²) and xᵀBx = 2(x + y)² + (x² + y²).

Note that the first pair of expressions clearly gives the directions of maximum and minimum displacement, i.e., the axes of the curves that they represent, whereas such deductions cannot be made from the second pair. So, in this subsection, we proceed to clarify these ideas.

Let A ∈ Mₙ(C) be Hermitian. Then, by the spectral theorem, σ(A) = {λ₁, ..., λₙ} ⊆ R and there exists a unitary matrix U such that U*AU = D = diag(λ₁, ..., λₙ). Let x = Uz, with ‖x‖ = 1. As U is unitary, ‖z‖ = 1. If z = (z₁, ..., zₙ)ᵀ then

H(x) = z*U*AUz = z*Dz = Σᵢ λᵢ|zᵢ|² = Σ from i = 1 to p of λᵢ|zᵢ|² − Σ from i = p + 1 to r of |λᵢ||zᵢ|², (∗)

where the eigenvalues are ordered so that λ₁, ..., λ_p > 0 and λ_{p+1}, ..., λ_r < 0 are the non-zero eigenvalues of A. Thus, the possible values of H(x) depend only on the eigenvalues of A. Since U is an invertible matrix, the components zᵢ of z = U*x are commonly known as linearly independent linear

forms. Note that each zᵢ is a linear expression in the components of x. Also, note that in Equation (∗), p corresponds to the number of positive eigenvalues and r − p to the number of negative eigenvalues. As our next result shows, in any expression of H(x) as a sum or difference of n absolute squares of linearly independent linear forms, the number p (respectively, r − p) of positive (respectively, negative) squares equals the number of positive (respectively, negative) eigenvalues of A. This is popularly known as Sylvester's law of inertia.

Lemma (Sylvester's law of inertia). Let A ∈ Mₙ(C) be a Hermitian matrix and let x ∈ Cⁿ. Then every Hermitian form H(x) = x*Ax, in n variables, can be written as

H(x) = |y₁|² + ⋯ + |y_p|² − |y_{p+1}|² − ⋯ − |y_r|²,

where y₁, ..., y_r are linearly independent linear forms in the components of x, and the integers p and r, satisfying 0 ≤ p ≤ r ≤ n, depend only on A.

Proof. Equation (∗) implies that H(x) has the required form. We only need to show that p and r are uniquely determined by A. On the contrary, assume that there exist p, q, r, s ∈ N with p > q such that

H(x) = |y₁|² + ⋯ + |y_p|² − |y_{p+1}|² − ⋯ − |y_r|² = |z₁|² + ⋯ + |z_q|² − |z_{q+1}|² − ⋯ − |z_s|², (∗∗)

where y = (y₁, ..., yₙ)ᵀ = Mx and z = (z₁, ..., zₙ)ᵀ = Nx, for some invertible matrices M and N. Hence, z = By, for the invertible matrix B = NM⁻¹. Write Y₁ = (y₁, ..., y_p)ᵀ, Z₁ = (z₁, ..., z_q)ᵀ and B = [B₁, B₂; B₃, B₄], where B₁ is a q × p matrix. As p > q, the homogeneous linear system B₁Y₁ = 0 has a non-zero solution, say Ỹ₁ = (ỹ₁, ..., ỹ_p)ᵀ, and let ỹ = (Ỹ₁, 0)ᵀ. Then, for this choice, Z₁ = 0 and thus, using Equation (∗∗), we have

H(ỹ) = |ỹ₁|² + ⋯ + |ỹ_p|² = −(|z_{q+1}|² + ⋯ + |z_s|²) ≤ 0.

Now, this can hold only if ỹ₁ = ⋯ = ỹ_p = 0, a contradiction. Hence p = q. Similarly, the case r > s can be resolved. Thus, the proof of the lemma is complete.

Remark. Since A is Hermitian, Rank(A) equals the number of non-zero eigenvalues of A. Hence, Rank(A) = r. The number r is called the rank and the number 2p − r is called the inertial degree of the Hermitian form H(x).

Definition [Associated Quadratic Form] Let f(x, y) = ax² + 2hxy + by² + 2fx + 2gy + c be a general quadratic in x and y, with coefficients from R. Then,

H(x) = xᵀAx = [x, y][a, h; h, b][x; y] = ax² + 2hxy + by²

is called the associated quadratic form of the conic f(x, y) = 0.
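The integers p and r − p in Sylvester's law can be computed directly from the spectrum. A minimal sketch (numpy assumed; the tolerance and example matrix are illustrative choices):

```python
import numpy as np

def inertia(A, tol=1e-10):
    vals = np.linalg.eigvalsh(A)             # A assumed Hermitian
    p = int(np.sum(vals > tol))              # number of positive squares
    m = int(np.sum(vals < -tol))             # number of negative squares
    return p, m, len(vals) - p - m           # (p, r - p, zero eigenvalues)

A = np.array([[1., 2], [2, 1]])              # x^T A x = x^2 + 4xy + y^2
print(inertia(A))                            # (1, 1, 0): one + and one - square
```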

We now obtain conditions on the eigenvalues of A, corresponding to the associated quadratic form, to characterize the conic sections in R², with respect to the standard inner product.

Proposition. Consider the general quadratic f(x, y), with a, b, c, g, f, h ∈ R. Then f(x, y) = 0 represents

1. an ellipse or a circle if ab − h² > 0,
2. a parabola or a pair of parallel lines if ab − h² = 0,
3. a hyperbola or a pair of intersecting lines if ab − h² < 0.

Proof. As A is symmetric, the corollary to the spectral theorem gives A = U diag(α₁, α₂)Uᵀ, where U = [u₁, u₂] is an orthogonal matrix, with (α₁, u₁) and (α₂, u₂) as eigen-pairs of A. Let (u, v) = xᵀU. As u₁ and u₂ are orthogonal, u and v represent orthogonal lines passing through the origin in the (x, y)-plane. In most cases, these lines form the principal axes of the conic.

We also have xᵀAx = α₁u² + α₂v², and hence f(x, y) = 0 reduces to

α₁u² + α₂v² + 2g₁u + 2f₁v + c = 0, (∗)

for some g₁, f₁ ∈ R. Now, we consider the different cases depending on the values of α₁ and α₂:

1. If α₁ = 0 = α₂ then A = 0 and Equation (∗) gives the straight line 2gx + 2fy + c = 0.

2. If α₁ = 0 and α₂ ≠ 0 then ab − h² = det(A) = α₁α₂ = 0. So, after dividing by α₂, Equation (∗) reduces to

(v + d₁)² = d₂u + d₃, for some d₁, d₂, d₃ ∈ R.

Hence, let us look at the possible subcases:

(a) Let d₂ = d₃ = 0. Then v + d₁ = 0 is a pair of coincident lines.
(b) Let d₂ = 0, d₃ ≠ 0.
 i. If d₃ > 0, then we get the pair of parallel lines v = −d₁ ± √d₃.
 ii. If d₃ < 0, the solution set of the corresponding conic is the empty set.
(c) If d₂ ≠ 0 then the given equation is of the form Y² = 4aX, for some translates X = x + α and Y = y + β, and thus represents a parabola.

For example, let H(x) = x² + 4y² + 4xy be the associated quadratic form for a class of curves. Then A = [1, 2; 2, 4], α₁ = 0, α₂ = 5 and √5 v = x + 2y. Now, let d₁ = 3 and vary d₂ and d₃ to get different curves (see Figure 6.2, drawn using the package MATHEMATICA).

3. Let α₁ > 0 and α₂ < 0. Then ab − h² = det(A) = α₁α₂ < 0. Writing α₂ = −β², with β² > 0, Equation (∗) reduces to

α₁(u + d₁)² − β²(v + d₂)² = d₃, for some d₁, d₂, d₃ ∈ R, (∗∗)

whose understanding requires the following subcases:

(a) If d₃ = 0 then Equation (∗∗) equals

(√α₁ (u + d₁) + β(v + d₂))(√α₁ (u + d₁) − β(v + d₂)) = 0,

or equivalently, a pair of intersecting straight lines in the (u, v)-plane.

Figure 6.2: Curves for d₂ = 0 = d₃; d₂ = 0, d₃ = 1; and d₂ = 1, d₃ = 1.

(b) Without loss of generality, let d₃ > 0. Then Equation (∗∗) equals

(α₁/d₃)(u + d₁)² − (β²/d₃)(v + d₂)² = 1,

or equivalently, a hyperbola with orthogonal principal axes u + d₁ = 0 and v + d₂ = 0.

For example, let H(x) = 10x² − 5y² + 20xy be the associated quadratic form for a class of curves. Then A = [10, 10; 10, −5], α₁ = 15, α₂ = −10, √5 u = 2x + y and √5 v = x − 2y. Now, let d₁ = 1/√5 and d₂ = −1/√5 to get

3(2x + y + 1)² − 2(x − 2y − 1)² = d₃.

Now vary d₃ to get different curves (see Figure 6.3, drawn using the package MATHEMATICA).

4. Let α₁, α₂ > 0. Then ab − h² = det(A) = α₁α₂ > 0 and Equation (∗) reduces to

α₁(u + d₁)² + α₂(v + d₂)² = d₃, for some d₁, d₂, d₃ ∈ R. (∗∗∗)

We consider the following three subcases to understand this.

(a) If d₃ = 0 then the solution set is the single point of intersection of the orthogonal lines u + d₁ = 0 and v + d₂ = 0.
(b) If d₃ < 0 then the solution set of Equation (∗∗∗) is the empty set.
(c) If d₃ > 0 then Equation (∗∗∗) reduces to (α₁/d₃)(u + d₁)² + (α₂/d₃)(v + d₂)² = 1, an ellipse or circle with u + d₁ = 0 and v + d₂ = 0 as the orthogonal principal axes.

For example, let H(x) = 6x² + 9y² + 4xy be the associated quadratic form for a class of curves. Then A = [6, 2; 2, 9], α₁ = 10, α₂ = 5, √5 u = x + 2y and √5 v = 2x − y. Now, let d₁ = 1/√5 and d₂ = −1/√5 to get

2(x + 2y + 1)² + (2x − y − 1)² = d₃.

Now vary d₃ to get different curves (see Figure 6.4, drawn using the package MATHEMATICA).

Thus, we have considered all the possible cases and the required result follows.
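A small sketch of the classification rule just proved (plain Python; the three test forms are the ones used in this section, and degenerate sub-cases are not distinguished here):

```python
def conic_type(a, h, b):
    """Classify a x^2 + 2 h x y + b y^2 + ... = 0 by the sign of det([[a,h],[h,b]])."""
    d = a * b - h * h
    if d > 0:
        return "ellipse or circle"
    if d == 0:
        return "parabola or pair of parallel lines"
    return "hyperbola or pair of intersecting lines"

print(conic_type(1, 2, 4))     # det = 0:  parabola case (x^2 + 4xy + 4y^2 + ...)
print(conic_type(10, 10, -5))  # det < 0:  hyperbola case
print(conic_type(6, 2, 9))     # det > 0:  ellipse case
```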

Figure 6.3: Curves for d₃ = 0, d₃ = 1 and d₃ = −1.

Figure 6.4: Curves for d₃ = 0 and d₃ = 5.

Remark. Observe that the condition (x, y)ᵀ = [u₁, u₂](u, v)ᵀ implies that the principal axes of the conic are functions of the eigenvectors u₁ and u₂.

Exercise. Sketch the graphs of the following surfaces:

1. x² + 2xy + y² − 6x − 10y = 3.
2. 2x² + 6xy + 3y² − 12x − 6y = 5.
3. 4x² − 4xy + 2y² + 12x − 8y = 10.

4. x² − 6xy + 5y² − 10x + 4y = 7.

As a last application, we consider a quadratic in 3 variables, namely x, y and z. To do so, let

A = [a, d, e; d, b, f; e, f, c], x = (x, y, z)ᵀ, b = (l, m, n)ᵀ and y = (y₁, y₂, y₃)ᵀ,

with

f(x, y, z) = xᵀAx + 2bᵀx + q
= ax² + by² + cz² + 2dxy + 2exz + 2fyz + 2lx + 2my + 2nz + q.

We now observe the following:

1. As A is symmetric, PᵀAP = diag(α₁, α₂, α₃), where P = [u₁, u₂, u₃] is an orthogonal matrix and (αᵢ, uᵢ), for i = 1, 2, 3, are eigen-pairs of A.

2. Let y = Pᵀx. Then f(x, y, z) reduces to

g(y₁, y₂, y₃) = α₁y₁² + α₂y₂² + α₃y₃² + 2l₁y₁ + 2l₂y₂ + 2l₃y₃ + q.

3. Depending on the values of the αᵢ's, rewrite g(y₁, y₂, y₃) to determine the centre and the planes of symmetry of f(x, y, z) = 0.

Example. Determine the quadrics f(x, y, z) = 0, where

1. f(x, y, z) = 2x² + 2y² + 2z² + 2xy + 2xz + 2yz + 4x + 2y + 4z + 2.
2. f(x, y, z) = 3x² − y² + z² + 10.
3. f(x, y, z) = 3x² − y² + z² − 10.
4. f(x, y, z) = 3x² − y² + z.

Solution: Part 1. Here, A = [2, 1, 1; 1, 2, 1; 1, 1, 2], b = (2, 1, 2)ᵀ and q = 2. The eigen-pairs of A are (4, (1, 1, 1)ᵀ/√3), (1, (1, −1, 0)ᵀ/√2) and (1, (1, 1, −2)ᵀ/√6), so that PᵀAP = diag(4, 1, 1) for the orthogonal matrix P = [u₁, u₂, u₃]. Hence, f(x, y, z) = 0 reduces to

4(y₁ + 5/(4√3))² + (y₂ + 1/√2)² + (y₃ − 1/√6)² = 9/12.

So, the standard form of the quadric is 4z₁² + z₂² + z₃² = 9/12, where (z₁, z₂, z₃) are the above translates of (y₁, y₂, y₃). This is an ellipsoid with centre (x, y, z)ᵀ = P(−5/(4√3), −1/√2, 1/√6)ᵀ and the planes x + y + z = 0, x − y = 0 and x + y − 2z = 0 as the principal axes.

Part 2. Here f(x, y, z) = 0 reduces to

y²/10 − 3x²/10 − z²/10 = 1,

which is the equation of a hyperboloid of two sheets with centre 0 and the x, y and z axes as the principal axes.

Part 3. Here f(x, y, z) = 0 reduces to

3x²/10 − y²/10 + z²/10 = 1,

which is the equation of a hyperboloid of one sheet with centre 0 and the x, y and z axes as the principal axes.

Part 4. Here f(x, y, z) = 0 reduces to z = y² − 3x², which is the equation of a hyperbolic paraboloid.

The four quadrics are shown in Figure 6.5, drawn using the package MATHEMATICA.

Figure 6.5: Ellipsoid, hyperboloid of two sheets, hyperboloid of one sheet, and hyperbolic paraboloid.


Linear Algebra. Matrices Operations. Consider, for example, a system of equations such as x + 2y z + 4w = 0, 3x 4y + 2z 6w = 0, x 3y 2z + w = 0. Matrices Operations Linear Algebra Consider, for example, a system of equations such as x + 2y z + 4w = 0, 3x 4y + 2z 6w = 0, x 3y 2z + w = 0 The rectangular array 1 2 1 4 3 4 2 6 1 3 2 1 in which the

More information

a 11 x 1 + a 12 x a 1n x n = b 1 a 21 x 1 + a 22 x a 2n x n = b 2.

a 11 x 1 + a 12 x a 1n x n = b 1 a 21 x 1 + a 22 x a 2n x n = b 2. Chapter 1 LINEAR EQUATIONS 11 Introduction to linear equations A linear equation in n unknowns x 1, x,, x n is an equation of the form a 1 x 1 + a x + + a n x n = b, where a 1, a,, a n, b are given real

More information

Problem Set (T) If A is an m n matrix, B is an n p matrix and D is a p s matrix, then show

Problem Set (T) If A is an m n matrix, B is an n p matrix and D is a p s matrix, then show MTH 0: Linear Algebra Department of Mathematics and Statistics Indian Institute of Technology - Kanpur Problem Set Problems marked (T) are for discussions in Tutorial sessions (T) If A is an m n matrix,

More information

NOTES on LINEAR ALGEBRA 1

NOTES on LINEAR ALGEBRA 1 School of Economics, Management and Statistics University of Bologna Academic Year 207/8 NOTES on LINEAR ALGEBRA for the students of Stats and Maths This is a modified version of the notes by Prof Laura

More information

A FIRST COURSE IN LINEAR ALGEBRA. An Open Text by Ken Kuttler. Matrix Arithmetic

A FIRST COURSE IN LINEAR ALGEBRA. An Open Text by Ken Kuttler. Matrix Arithmetic A FIRST COURSE IN LINEAR ALGEBRA An Open Text by Ken Kuttler Matrix Arithmetic Lecture Notes by Karen Seyffarth Adapted by LYRYX SERVICE COURSE SOLUTION Attribution-NonCommercial-ShareAlike (CC BY-NC-SA)

More information

ELEMENTARY LINEAR ALGEBRA WITH APPLICATIONS. 1. Linear Equations and Matrices

ELEMENTARY LINEAR ALGEBRA WITH APPLICATIONS. 1. Linear Equations and Matrices ELEMENTARY LINEAR ALGEBRA WITH APPLICATIONS KOLMAN & HILL NOTES BY OTTO MUTZBAUER 11 Systems of Linear Equations 1 Linear Equations and Matrices Numbers in our context are either real numbers or complex

More information

ELEMENTARY LINEAR ALGEBRA

ELEMENTARY LINEAR ALGEBRA ELEMENTARY LINEAR ALGEBRA K R MATTHEWS DEPARTMENT OF MATHEMATICS UNIVERSITY OF QUEENSLAND Second Online Version, December 998 Comments to the author at krm@mathsuqeduau All contents copyright c 99 Keith

More information

ELEMENTARY LINEAR ALGEBRA

ELEMENTARY LINEAR ALGEBRA ELEMENTARY LINEAR ALGEBRA K. R. MATTHEWS DEPARTMENT OF MATHEMATICS UNIVERSITY OF QUEENSLAND Second Online Version, December 1998 Comments to the author at krm@maths.uq.edu.au Contents 1 LINEAR EQUATIONS

More information

ELEMENTARY LINEAR ALGEBRA

ELEMENTARY LINEAR ALGEBRA ELEMENTARY LINEAR ALGEBRA K. R. MATTHEWS DEPARTMENT OF MATHEMATICS UNIVERSITY OF QUEENSLAND Corrected Version, 7th April 013 Comments to the author at keithmatt@gmail.com Chapter 1 LINEAR EQUATIONS 1.1

More information

SAMPLE OF THE STUDY MATERIAL PART OF CHAPTER 1 Introduction to Linear Algebra

SAMPLE OF THE STUDY MATERIAL PART OF CHAPTER 1 Introduction to Linear Algebra SAMPLE OF THE STUDY MATERIAL PART OF CHAPTER 1 Introduction to 1.1. Introduction Linear algebra is a specific branch of mathematics dealing with the study of vectors, vector spaces with functions that

More information

A matrix is a rectangular array of. objects arranged in rows and columns. The objects are called the entries. is called the size of the matrix, and

A matrix is a rectangular array of. objects arranged in rows and columns. The objects are called the entries. is called the size of the matrix, and Section 5.5. Matrices and Vectors A matrix is a rectangular array of objects arranged in rows and columns. The objects are called the entries. A matrix with m rows and n columns is called an m n matrix.

More information

A matrix is a rectangular array of. objects arranged in rows and columns. The objects are called the entries. is called the size of the matrix, and

A matrix is a rectangular array of. objects arranged in rows and columns. The objects are called the entries. is called the size of the matrix, and Section 5.5. Matrices and Vectors A matrix is a rectangular array of objects arranged in rows and columns. The objects are called the entries. A matrix with m rows and n columns is called an m n matrix.

More information

MATH 240 Spring, Chapter 1: Linear Equations and Matrices

MATH 240 Spring, Chapter 1: Linear Equations and Matrices MATH 240 Spring, 2006 Chapter Summaries for Kolman / Hill, Elementary Linear Algebra, 8th Ed. Sections 1.1 1.6, 2.1 2.2, 3.2 3.8, 4.3 4.5, 5.1 5.3, 5.5, 6.1 6.5, 7.1 7.2, 7.4 DEFINITIONS Chapter 1: Linear

More information

Foundations of Matrix Analysis

Foundations of Matrix Analysis 1 Foundations of Matrix Analysis In this chapter we recall the basic elements of linear algebra which will be employed in the remainder of the text For most of the proofs as well as for the details, the

More information

Math Camp II. Basic Linear Algebra. Yiqing Xu. Aug 26, 2014 MIT

Math Camp II. Basic Linear Algebra. Yiqing Xu. Aug 26, 2014 MIT Math Camp II Basic Linear Algebra Yiqing Xu MIT Aug 26, 2014 1 Solving Systems of Linear Equations 2 Vectors and Vector Spaces 3 Matrices 4 Least Squares Systems of Linear Equations Definition A linear

More information

Chapter 2: Matrices and Linear Systems

Chapter 2: Matrices and Linear Systems Chapter 2: Matrices and Linear Systems Paul Pearson Outline Matrices Linear systems Row operations Inverses Determinants Matrices Definition An m n matrix A = (a ij ) is a rectangular array of real numbers

More information

Linear Systems and Matrices

Linear Systems and Matrices Department of Mathematics The Chinese University of Hong Kong 1 System of m linear equations in n unknowns (linear system) a 11 x 1 + a 12 x 2 + + a 1n x n = b 1 a 21 x 1 + a 22 x 2 + + a 2n x n = b 2.......

More information

Math Linear Algebra Final Exam Review Sheet

Math Linear Algebra Final Exam Review Sheet Math 15-1 Linear Algebra Final Exam Review Sheet Vector Operations Vector addition is a component-wise operation. Two vectors v and w may be added together as long as they contain the same number n of

More information

Digital Workbook for GRA 6035 Mathematics

Digital Workbook for GRA 6035 Mathematics Eivind Eriksen Digital Workbook for GRA 6035 Mathematics November 10, 2014 BI Norwegian Business School Contents Part I Lectures in GRA6035 Mathematics 1 Linear Systems and Gaussian Elimination........................

More information

ANSWERS (5 points) Let A be a 2 2 matrix such that A =. Compute A. 2

ANSWERS (5 points) Let A be a 2 2 matrix such that A =. Compute A. 2 MATH 7- Final Exam Sample Problems Spring 7 ANSWERS ) ) ). 5 points) Let A be a matrix such that A =. Compute A. ) A = A ) = ) = ). 5 points) State ) the definition of norm, ) the Cauchy-Schwartz inequality

More information

Fundamentals of Linear Algebra. Marcel B. Finan Arkansas Tech University c All Rights Reserved

Fundamentals of Linear Algebra. Marcel B. Finan Arkansas Tech University c All Rights Reserved Fundamentals of Linear Algebra Marcel B. Finan Arkansas Tech University c All Rights Reserved October 9, 200 2 PREFACE Linear algebra has evolved as a branch of mathematics with wide range of applications

More information

Linear Algebra. Workbook

Linear Algebra. Workbook Linear Algebra Workbook Paul Yiu Department of Mathematics Florida Atlantic University Last Update: November 21 Student: Fall 2011 Checklist Name: A B C D E F F G H I J 1 2 3 4 5 6 7 8 9 10 xxx xxx xxx

More information

Steven J. Leon University of Massachusetts, Dartmouth

Steven J. Leon University of Massachusetts, Dartmouth INSTRUCTOR S SOLUTIONS MANUAL LINEAR ALGEBRA WITH APPLICATIONS NINTH EDITION Steven J. Leon University of Massachusetts, Dartmouth Boston Columbus Indianapolis New York San Francisco Amsterdam Cape Town

More information

Matrix & Linear Algebra

Matrix & Linear Algebra Matrix & Linear Algebra Jamie Monogan University of Georgia For more information: http://monogan.myweb.uga.edu/teaching/mm/ Jamie Monogan (UGA) Matrix & Linear Algebra 1 / 84 Vectors Vectors Vector: A

More information

Introduction to Matrix Algebra

Introduction to Matrix Algebra Introduction to Matrix Algebra August 18, 2010 1 Vectors 1.1 Notations A p-dimensional vector is p numbers put together. Written as x 1 x =. x p. When p = 1, this represents a point in the line. When p

More information

Appendix A: Matrices

Appendix A: Matrices Appendix A: Matrices A matrix is a rectangular array of numbers Such arrays have rows and columns The numbers of rows and columns are referred to as the dimensions of a matrix A matrix with, say, 5 rows

More information

3 Matrix Algebra. 3.1 Operations on matrices

3 Matrix Algebra. 3.1 Operations on matrices 3 Matrix Algebra A matrix is a rectangular array of numbers; it is of size m n if it has m rows and n columns. A 1 n matrix is a row vector; an m 1 matrix is a column vector. For example: 1 5 3 5 3 5 8

More information

Math 4A Notes. Written by Victoria Kala Last updated June 11, 2017

Math 4A Notes. Written by Victoria Kala Last updated June 11, 2017 Math 4A Notes Written by Victoria Kala vtkala@math.ucsb.edu Last updated June 11, 2017 Systems of Linear Equations A linear equation is an equation that can be written in the form a 1 x 1 + a 2 x 2 +...

More information

Fundamentals of Linear Algebra. Marcel B. Finan Arkansas Tech University c All Rights Reserved

Fundamentals of Linear Algebra. Marcel B. Finan Arkansas Tech University c All Rights Reserved Fundamentals of Linear Algebra Marcel B. Finan Arkansas Tech University c All Rights Reserved 2 PREFACE Linear algebra has evolved as a branch of mathematics with wide range of applications to the natural

More information

Matrices A brief introduction

Matrices A brief introduction Matrices A brief introduction Basilio Bona DAUIN Politecnico di Torino Semester 1, 2014-15 B. Bona (DAUIN) Matrices Semester 1, 2014-15 1 / 41 Definitions Definition A matrix is a set of N real or complex

More information

Matrices and systems of linear equations

Matrices and systems of linear equations Matrices and systems of linear equations Samy Tindel Purdue University Differential equations and linear algebra - MA 262 Taken from Differential equations and linear algebra by Goode and Annin Samy T.

More information

Extra Problems for Math 2050 Linear Algebra I

Extra Problems for Math 2050 Linear Algebra I Extra Problems for Math 5 Linear Algebra I Find the vector AB and illustrate with a picture if A = (,) and B = (,4) Find B, given A = (,4) and [ AB = A = (,4) and [ AB = 8 If possible, express x = 7 as

More information

Matrix Arithmetic. j=1

Matrix Arithmetic. j=1 An m n matrix is an array A = Matrix Arithmetic a 11 a 12 a 1n a 21 a 22 a 2n a m1 a m2 a mn of real numbers a ij An m n matrix has m rows and n columns a ij is the entry in the i-th row and j-th column

More information

Math Camp Lecture 4: Linear Algebra. Xiao Yu Wang. Aug 2010 MIT. Xiao Yu Wang (MIT) Math Camp /10 1 / 88

Math Camp Lecture 4: Linear Algebra. Xiao Yu Wang. Aug 2010 MIT. Xiao Yu Wang (MIT) Math Camp /10 1 / 88 Math Camp 2010 Lecture 4: Linear Algebra Xiao Yu Wang MIT Aug 2010 Xiao Yu Wang (MIT) Math Camp 2010 08/10 1 / 88 Linear Algebra Game Plan Vector Spaces Linear Transformations and Matrices Determinant

More information

Math 321: Linear Algebra

Math 321: Linear Algebra Math 32: Linear Algebra T. Kapitula Department of Mathematics and Statistics University of New Mexico September 8, 24 Textbook: Linear Algebra,by J. Hefferon E-mail: kapitula@math.unm.edu Prof. Kapitula,

More information

Chapter 1: Systems of Linear Equations

Chapter 1: Systems of Linear Equations Chapter : Systems of Linear Equations February, 9 Systems of linear equations Linear systems Lecture A linear equation in variables x, x,, x n is an equation of the form a x + a x + + a n x n = b, where

More information

Lecture notes on Quantum Computing. Chapter 1 Mathematical Background

Lecture notes on Quantum Computing. Chapter 1 Mathematical Background Lecture notes on Quantum Computing Chapter 1 Mathematical Background Vector states of a quantum system with n physical states are represented by unique vectors in C n, the set of n 1 column vectors 1 For

More information

Mathematics. EC / EE / IN / ME / CE. for

Mathematics.   EC / EE / IN / ME / CE. for Mathematics for EC / EE / IN / ME / CE By www.thegateacademy.com Syllabus Syllabus for Mathematics Linear Algebra: Matrix Algebra, Systems of Linear Equations, Eigenvalues and Eigenvectors. Probability

More information

Matrices. Chapter Definitions and Notations

Matrices. Chapter Definitions and Notations Chapter 3 Matrices 3. Definitions and Notations Matrices are yet another mathematical object. Learning about matrices means learning what they are, how they are represented, the types of operations which

More information

MAT Linear Algebra Collection of sample exams

MAT Linear Algebra Collection of sample exams MAT 342 - Linear Algebra Collection of sample exams A-x. (0 pts Give the precise definition of the row echelon form. 2. ( 0 pts After performing row reductions on the augmented matrix for a certain system

More information

Matrix Theory. A.Holst, V.Ufnarovski

Matrix Theory. A.Holst, V.Ufnarovski Matrix Theory AHolst, VUfnarovski 55 HINTS AND ANSWERS 9 55 Hints and answers There are two different approaches In the first one write A as a block of rows and note that in B = E ij A all rows different

More information

Knowledge Discovery and Data Mining 1 (VO) ( )

Knowledge Discovery and Data Mining 1 (VO) ( ) Knowledge Discovery and Data Mining 1 (VO) (707.003) Review of Linear Algebra Denis Helic KTI, TU Graz Oct 9, 2014 Denis Helic (KTI, TU Graz) KDDM1 Oct 9, 2014 1 / 74 Big picture: KDDM Probability Theory

More information

HOMEWORK PROBLEMS FROM STRANG S LINEAR ALGEBRA AND ITS APPLICATIONS (4TH EDITION)

HOMEWORK PROBLEMS FROM STRANG S LINEAR ALGEBRA AND ITS APPLICATIONS (4TH EDITION) HOMEWORK PROBLEMS FROM STRANG S LINEAR ALGEBRA AND ITS APPLICATIONS (4TH EDITION) PROFESSOR STEVEN MILLER: BROWN UNIVERSITY: SPRING 2007 1. CHAPTER 1: MATRICES AND GAUSSIAN ELIMINATION Page 9, # 3: Describe

More information

Matrices and Linear Algebra

Matrices and Linear Algebra Contents Quantitative methods for Economics and Business University of Ferrara Academic year 2017-2018 Contents 1 Basics 2 3 4 5 Contents 1 Basics 2 3 4 5 Contents 1 Basics 2 3 4 5 Contents 1 Basics 2

More information

1 Determinants. 1.1 Determinant

1 Determinants. 1.1 Determinant 1 Determinants [SB], Chapter 9, p.188-196. [SB], Chapter 26, p.719-739. Bellow w ll study the central question: which additional conditions must satisfy a quadratic matrix A to be invertible, that is to

More information

Massachusetts Institute of Technology Department of Economics Statistics. Lecture Notes on Matrix Algebra

Massachusetts Institute of Technology Department of Economics Statistics. Lecture Notes on Matrix Algebra Massachusetts Institute of Technology Department of Economics 14.381 Statistics Guido Kuersteiner Lecture Notes on Matrix Algebra These lecture notes summarize some basic results on matrix algebra used

More information

We could express the left side as a sum of vectors and obtain the Vector Form of a Linear System: a 12 a x n. a m2

We could express the left side as a sum of vectors and obtain the Vector Form of a Linear System: a 12 a x n. a m2 Week 22 Equations, Matrices and Transformations Coefficient Matrix and Vector Forms of a Linear System Suppose we have a system of m linear equations in n unknowns a 11 x 1 + a 12 x 2 + + a 1n x n b 1

More information

5.6. PSEUDOINVERSES 101. A H w.

5.6. PSEUDOINVERSES 101. A H w. 5.6. PSEUDOINVERSES 0 Corollary 5.6.4. If A is a matrix such that A H A is invertible, then the least-squares solution to Av = w is v = A H A ) A H w. The matrix A H A ) A H is the left inverse of A and

More information

GAUSSIAN ELIMINATION AND LU DECOMPOSITION (SUPPLEMENT FOR MA511)

GAUSSIAN ELIMINATION AND LU DECOMPOSITION (SUPPLEMENT FOR MA511) GAUSSIAN ELIMINATION AND LU DECOMPOSITION (SUPPLEMENT FOR MA511) D. ARAPURA Gaussian elimination is the go to method for all basic linear classes including this one. We go summarize the main ideas. 1.

More information

Mathematical Foundations of Applied Statistics: Matrix Algebra

Mathematical Foundations of Applied Statistics: Matrix Algebra Mathematical Foundations of Applied Statistics: Matrix Algebra Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/105 Literature Seber, G.

More information

Linear Algebra: Matrix Eigenvalue Problems

Linear Algebra: Matrix Eigenvalue Problems CHAPTER8 Linear Algebra: Matrix Eigenvalue Problems Chapter 8 p1 A matrix eigenvalue problem considers the vector equation (1) Ax = λx. 8.0 Linear Algebra: Matrix Eigenvalue Problems Here A is a given

More information

Math Linear Algebra II. 1. Inner Products and Norms

Math Linear Algebra II. 1. Inner Products and Norms Math 342 - Linear Algebra II Notes 1. Inner Products and Norms One knows from a basic introduction to vectors in R n Math 254 at OSU) that the length of a vector x = x 1 x 2... x n ) T R n, denoted x,

More information

1 Matrices and Systems of Linear Equations

1 Matrices and Systems of Linear Equations March 3, 203 6-6. Systems of Linear Equations Matrices and Systems of Linear Equations An m n matrix is an array A = a ij of the form a a n a 2 a 2n... a m a mn where each a ij is a real or complex number.

More information

MATRIX ALGEBRA. or x = (x 1,..., x n ) R n. y 1 y 2. x 2. x m. y m. y = cos θ 1 = x 1 L x. sin θ 1 = x 2. cos θ 2 = y 1 L y.

MATRIX ALGEBRA. or x = (x 1,..., x n ) R n. y 1 y 2. x 2. x m. y m. y = cos θ 1 = x 1 L x. sin θ 1 = x 2. cos θ 2 = y 1 L y. as Basics Vectors MATRIX ALGEBRA An array of n real numbers x, x,, x n is called a vector and it is written x = x x n or x = x,, x n R n prime operation=transposing a column to a row Basic vector operations

More information

Vectors and matrices: matrices (Version 2) This is a very brief summary of my lecture notes.

Vectors and matrices: matrices (Version 2) This is a very brief summary of my lecture notes. Vectors and matrices: matrices (Version 2) This is a very brief summary of my lecture notes Matrices and linear equations A matrix is an m-by-n array of numbers A = a 11 a 12 a 13 a 1n a 21 a 22 a 23 a

More information

2. Linear algebra. matrices and vectors. linear equations. range and nullspace of matrices. function of vectors, gradient and Hessian

2. Linear algebra. matrices and vectors. linear equations. range and nullspace of matrices. function of vectors, gradient and Hessian FE661 - Statistical Methods for Financial Engineering 2. Linear algebra Jitkomut Songsiri matrices and vectors linear equations range and nullspace of matrices function of vectors, gradient and Hessian

More information

a s 1.3 Matrix Multiplication. Know how to multiply two matrices and be able to write down the formula

a s 1.3 Matrix Multiplication. Know how to multiply two matrices and be able to write down the formula Syllabus for Math 308, Paul Smith Book: Kolman-Hill Chapter 1. Linear Equations and Matrices 1.1 Systems of Linear Equations Definition of a linear equation and a solution to a linear equations. Meaning

More information

Chapter 2: Matrix Algebra

Chapter 2: Matrix Algebra Chapter 2: Matrix Algebra (Last Updated: October 12, 2016) These notes are derived primarily from Linear Algebra and its applications by David Lay (4ed). Write A = 1. Matrix operations [a 1 a n. Then entry

More information

Math Bootcamp An p-dimensional vector is p numbers put together. Written as. x 1 x =. x p

Math Bootcamp An p-dimensional vector is p numbers put together. Written as. x 1 x =. x p Math Bootcamp 2012 1 Review of matrix algebra 1.1 Vectors and rules of operations An p-dimensional vector is p numbers put together. Written as x 1 x =. x p. When p = 1, this represents a point in the

More information

1 9/5 Matrices, vectors, and their applications

1 9/5 Matrices, vectors, and their applications 1 9/5 Matrices, vectors, and their applications Algebra: study of objects and operations on them. Linear algebra: object: matrices and vectors. operations: addition, multiplication etc. Algorithms/Geometric

More information

Chapter 1 Matrices and Systems of Equations

Chapter 1 Matrices and Systems of Equations Chapter 1 Matrices and Systems of Equations System of Linear Equations 1. A linear equation in n unknowns is an equation of the form n i=1 a i x i = b where a 1,..., a n, b R and x 1,..., x n are variables.

More information

Chapter Two Elements of Linear Algebra

Chapter Two Elements of Linear Algebra Chapter Two Elements of Linear Algebra Previously, in chapter one, we have considered single first order differential equations involving a single unknown function. In the next chapter we will begin to

More information

Linear Algebra M1 - FIB. Contents: 5. Matrices, systems of linear equations and determinants 6. Vector space 7. Linear maps 8.

Linear Algebra M1 - FIB. Contents: 5. Matrices, systems of linear equations and determinants 6. Vector space 7. Linear maps 8. Linear Algebra M1 - FIB Contents: 5 Matrices, systems of linear equations and determinants 6 Vector space 7 Linear maps 8 Diagonalization Anna de Mier Montserrat Maureso Dept Matemàtica Aplicada II Translation:

More information

Review of Matrices and Block Structures

Review of Matrices and Block Structures CHAPTER 2 Review of Matrices and Block Structures Numerical linear algebra lies at the heart of modern scientific computing and computational science. Today it is not uncommon to perform numerical computations

More information

An Introduction To Linear Algebra. Kuttler

An Introduction To Linear Algebra. Kuttler An Introduction To Linear Algebra Kuttler April, 7 Contents Introduction 7 F n 9 Outcomes 9 Algebra in F n Systems Of Equations Outcomes Systems Of Equations, Geometric Interpretations Systems Of Equations,

More information

Linear Algebra Homework and Study Guide

Linear Algebra Homework and Study Guide Linear Algebra Homework and Study Guide Phil R. Smith, Ph.D. February 28, 20 Homework Problem Sets Organized by Learning Outcomes Test I: Systems of Linear Equations; Matrices Lesson. Give examples of

More information

INSTITIÚID TEICNEOLAÍOCHTA CHEATHARLACH INSTITUTE OF TECHNOLOGY CARLOW MATRICES

INSTITIÚID TEICNEOLAÍOCHTA CHEATHARLACH INSTITUTE OF TECHNOLOGY CARLOW MATRICES 1 CHAPTER 4 MATRICES 1 INSTITIÚID TEICNEOLAÍOCHTA CHEATHARLACH INSTITUTE OF TECHNOLOGY CARLOW MATRICES 1 Matrices Matrices are of fundamental importance in 2-dimensional and 3-dimensional graphics programming

More information

Final Review Sheet. B = (1, 1 + 3x, 1 + x 2 ) then 2 + 3x + 6x 2

Final Review Sheet. B = (1, 1 + 3x, 1 + x 2 ) then 2 + 3x + 6x 2 Final Review Sheet The final will cover Sections Chapters 1,2,3 and 4, as well as sections 5.1-5.4, 6.1-6.2 and 7.1-7.3 from chapters 5,6 and 7. This is essentially all material covered this term. Watch

More information

Mathematical Methods wk 2: Linear Operators

Mathematical Methods wk 2: Linear Operators John Magorrian, magog@thphysoxacuk These are work-in-progress notes for the second-year course on mathematical methods The most up-to-date version is available from http://www-thphysphysicsoxacuk/people/johnmagorrian/mm

More information

Linear Algebra Lecture Notes-II

Linear Algebra Lecture Notes-II Linear Algebra Lecture Notes-II Vikas Bist Department of Mathematics Panjab University, Chandigarh-64 email: bistvikas@gmail.com Last revised on March 5, 8 This text is based on the lectures delivered

More information

MATRICES. knowledge on matrices Knowledge on matrix operations. Matrix as a tool of solving linear equations with two or three unknowns.

MATRICES. knowledge on matrices Knowledge on matrix operations. Matrix as a tool of solving linear equations with two or three unknowns. MATRICES After studying this chapter you will acquire the skills in knowledge on matrices Knowledge on matrix operations. Matrix as a tool of solving linear equations with two or three unknowns. List of

More information

Elementary Row Operations on Matrices

Elementary Row Operations on Matrices King Saud University September 17, 018 Table of contents 1 Definition A real matrix is a rectangular array whose entries are real numbers. These numbers are organized on rows and columns. An m n matrix

More information

Matrices. Chapter What is a Matrix? We review the basic matrix operations. An array of numbers a a 1n A = a m1...

Matrices. Chapter What is a Matrix? We review the basic matrix operations. An array of numbers a a 1n A = a m1... Chapter Matrices We review the basic matrix operations What is a Matrix? An array of numbers a a n A = a m a mn with m rows and n columns is a m n matrix Element a ij in located in position (i, j The elements

More information

Linear Algebra: Lecture Notes. Dr Rachel Quinlan School of Mathematics, Statistics and Applied Mathematics NUI Galway

Linear Algebra: Lecture Notes. Dr Rachel Quinlan School of Mathematics, Statistics and Applied Mathematics NUI Galway Linear Algebra: Lecture Notes Dr Rachel Quinlan School of Mathematics, Statistics and Applied Mathematics NUI Galway November 6, 23 Contents Systems of Linear Equations 2 Introduction 2 2 Elementary Row

More information

1. General Vector Spaces

1. General Vector Spaces 1.1. Vector space axioms. 1. General Vector Spaces Definition 1.1. Let V be a nonempty set of objects on which the operations of addition and scalar multiplication are defined. By addition we mean a rule

More information

MTH Linear Algebra. Study Guide. Dr. Tony Yee Department of Mathematics and Information Technology The Hong Kong Institute of Education

MTH Linear Algebra. Study Guide. Dr. Tony Yee Department of Mathematics and Information Technology The Hong Kong Institute of Education MTH 3 Linear Algebra Study Guide Dr. Tony Yee Department of Mathematics and Information Technology The Hong Kong Institute of Education June 3, ii Contents Table of Contents iii Matrix Algebra. Real Life

More information

Elementary linear algebra

Elementary linear algebra Chapter 1 Elementary linear algebra 1.1 Vector spaces Vector spaces owe their importance to the fact that so many models arising in the solutions of specific problems turn out to be vector spaces. The

More information

Elementary Linear Algebra

Elementary Linear Algebra Matrices J MUSCAT Elementary Linear Algebra Matrices Definition Dr J Muscat 2002 A matrix is a rectangular array of numbers, arranged in rows and columns a a 2 a 3 a n a 2 a 22 a 23 a 2n A = a m a mn We

More information

Math 102, Winter Final Exam Review. Chapter 1. Matrices and Gaussian Elimination

Math 102, Winter Final Exam Review. Chapter 1. Matrices and Gaussian Elimination Math 0, Winter 07 Final Exam Review Chapter. Matrices and Gaussian Elimination { x + x =,. Different forms of a system of linear equations. Example: The x + 4x = 4. [ ] [ ] [ ] vector form (or the column

More information

A = 3 B = A 1 1 matrix is the same as a number or scalar, 3 = [3].

A = 3 B = A 1 1 matrix is the same as a number or scalar, 3 = [3]. Appendix : A Very Brief Linear ALgebra Review Introduction Linear Algebra, also known as matrix theory, is an important element of all branches of mathematics Very often in this course we study the shapes

More information