NOTES ON LINEAR ALGEBRA. 1. Determinants


1. Determinants

In this section we study determinants of matrices: their properties, methods of computation, and some applications. We are familiar with the following formulas for determinants:
$$\det[a] = a, \qquad \det\begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc,$$
and
$$\det\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} = aei - afh - bdi + bfg + cdh - ceg.$$
There is an explicit formula for the determinant, $\det A$, of any $n \times n$ matrix $A$; it involves $n!$ terms. Our approach to determinants is via their properties, which makes their study more elegant.

Let $A$ be a $3 \times 3$ matrix with real entries, and let $A_i$ be the $i$-th row of $A$, $i = 1, 2, 3$. We write $A = (A_1, A_2, A_3)$ and think of $A_1, A_2, A_3$ as vectors in $\mathbb{R}^3$. Then
$$\det A = \det(A_1, A_2, A_3) = A_1 \cdot (A_2 \times A_3).$$
Thus $|\det A|$ is the volume of the parallelepiped formed by the row vectors of $A$. Hence $\det A = 0$ if and only if $A_1, A_2, A_3$ are coplanar. The following properties of $\det A$ follow from those of the triple scalar product:

(1) Multilinearity: let $B_2, C_2 \in \mathbb{R}^3$ and $\alpha, \beta \in \mathbb{R}$. Then
$$\det(A_1, \alpha B_2 + \beta C_2, A_3) = \alpha \det(A_1, B_2, A_3) + \beta \det(A_1, C_2, A_3).$$
Similarly, linearity holds in the other rows.

(2) Alternating: $\det A = 0$ when two rows of $A$ are equal. In this case the rows generate either a plane or a line, whose volume is zero.

(3) Normalization: $\det(e_1, e_2, e_3) = 1$, where $e_1, e_2, e_3$ are the unit coordinate vectors in the directions of the positive $x$-, $y$- and $z$-axes respectively.

These properties motivate the axioms for determinant-like functions on $n \times n$ matrices. We will write an $n \times n$ matrix $A$ with row vectors $A_1, A_2, \dots, A_n$ as $A = (A_1, A_2, \dots, A_n)$. Let $d(A_1, A_2, \dots, A_n)$ be a function defined on the rows $A_1, A_2, \dots, A_n$ of a matrix $A$. We allow the entries of $A$ to lie in any field $F$.

Definition 1.1. (i) $d(A_1, A_2, \dots, A_n)$ is called multilinear if for each $k = 1, 2, \dots, n$, all scalars $\alpha, \beta$ and every vector $C \in F^n$,
$$d(A_1, \dots, \alpha A_k + \beta C, \dots, A_n) = \alpha\, d(A_1, \dots, A_k, \dots, A_n) + \beta\, d(A_1, \dots, C, \dots, A_n).$$
(ii) $d(A_1, A_2, \dots, A_n)$ is called alternating if $d(A_1, A_2, \dots, A_n) = 0$ whenever $A_i = A_j$ for some $i \neq j$.

(iii) $d(A_1, A_2, \dots, A_n)$ is called normalized if $d(e_1, e_2, \dots, e_n) = 1$, where $e_i = (0, \dots, 0, 1, 0, \dots, 0)$ is the $i$-th unit coordinate vector, with the 1 in the $i$-th place.
(iv) A normalized, alternating and multilinear function $d(A_1, A_2, \dots, A_n)$ on $n \times n$ matrices $A = (A_1, A_2, \dots, A_n)$ is called a determinant function of order $n$.

Our immediate objective is to show that there is only one determinant function of order $n$. This fact is very useful for finding formulas for determinants, or for proving that a given formula computes the determinant of a matrix: we simply show that the formula defines an alternating, multilinear and normalized function on the rows of $n \times n$ matrices.

Lemma 1.2. Suppose $d(A_1, A_2, \dots, A_n)$ is a multilinear alternating function on the rows of $n \times n$ matrices. Then:
(1) If some $A_k = 0$ then $d(A_1, A_2, \dots, A_n) = 0$.
(2) $d(A_1, \dots, A_i, \dots, A_j, \dots, A_n) = -\,d(A_1, \dots, A_j, \dots, A_i, \dots, A_n)$.

Proof. (1) If $A_k = 0$ then by multilinearity
$$d(A_1, \dots, 0 \cdot A_k, \dots, A_n) = 0 \cdot d(A_1, \dots, A_k, \dots, A_n) = 0.$$
(2) Put $A_i = B$, $A_j = C$. Then by the alternating property of $d$,
$$0 = d(\dots, B + C, \dots, B + C, \dots) = d(\dots, B, \dots, B + C, \dots) + d(\dots, C, \dots, B + C, \dots)$$
$$= d(\dots, B, \dots, C, \dots) + d(\dots, C, \dots, B, \dots),$$
since the terms $d(\dots, B, \dots, B, \dots)$ and $d(\dots, C, \dots, C, \dots)$ vanish. Hence
$$d(\dots, B, \dots, C, \dots) = -\,d(\dots, C, \dots, B, \dots). \qquad\Box$$

Computation of determinants. We now derive the familiar formula for the determinant of $2 \times 2$ matrices. Suppose $d(A_1, A_2)$ is an alternating multilinear normalized function on $2 \times 2$ matrices $A = (A_1, A_2)$. Then
$$d\begin{bmatrix} x & y \\ z & u \end{bmatrix} = xu - yz.$$
To derive this formula, write the first row as $A_1 = xe_1 + ye_2$ and the second row as $A_2 = ze_1 + ue_2$. Then
$$d(A_1, A_2) = d(xe_1 + ye_2, ze_1 + ue_2) = d(xe_1 + ye_2, ze_1) + d(xe_1 + ye_2, ue_2)$$
$$= d(xe_1, ze_1) + d(ye_2, ze_1) + d(xe_1, ue_2) + d(ye_2, ue_2)$$
$$= yz\, d(e_2, e_1) + xu\, d(e_1, e_2) = (xu - yz)\, d(e_1, e_2) = xu - yz,$$
where $d(xe_1, ze_1) = xz\, d(e_1, e_1) = 0$ and $d(ye_2, ue_2) = yu\, d(e_2, e_2) = 0$ by the alternating property.
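As a quick sanity check (a sketch of mine, not part of the notes), the $2 \times 2$ formula just derived and the $3 \times 3$ formula from the introduction can be verified numerically, along with the triple-product identity $\det A = A_1 \cdot (A_2 \times A_3)$:

```python
# Spot checks of the 2x2 and 3x3 determinant formulas (illustration only).

def det2(a, b, c, d):
    """det of [[a, b], [c, d]] = ad - bc."""
    return a * d - b * c

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def det3(A):
    """First-row expansion: aei - afh - bdi + bfg + cdh - ceg."""
    (a, b, c), (d, e, f), (g, h, i) = A
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

assert det2(1, 2, 3, 4) == -2
A = [(1, 2, 3), (0, 1, 4), (5, 6, 0)]
assert det3(A) == dot(A[0], cross(A[1], A[2]))  # triple product identity
# coplanar rows (third row = first + second) give determinant 0
assert det3([(1, 2, 3), (0, 1, 4), (1, 3, 7)]) == 0
```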

Exercise 1.3. Let $d$ be an alternating, multilinear, and normalized function on the rows of $n \times n$ matrices. If the rows of $A$ are linearly dependent, show that $d(A) = 0$.

Exercise 1.4. Let $U$ be an upper triangular matrix and let $d$ be an alternating multilinear normalized function on the rows of $n \times n$ matrices. Show that $d(U)$ equals the product of the diagonal entries of $U$. Show the same result for lower triangular matrices.

Computation by the Gauss elimination method. This is one of the most efficient ways to calculate determinant functions. Let $A$ be an $n \times n$ matrix. Let
$E$ = the $n \times n$ elementary matrix for the row operation $A_i \to A_i + cA_j$,
$F$ = the $n \times n$ elementary matrix for the row operation $A_i \leftrightarrow A_j$,
$G$ = the $n \times n$ elementary matrix for the row operation $A_i \to cA_i$.
Suppose $U$ is the row-echelon form of $A$. If $c_1, c_2, \dots, c_p$ are the multipliers used for the row operations $A_i \to cA_i$ and $r$ row exchanges have been used to get $U$ from $A$, then for any alternating multilinear function $d$,
$$d(A) = (-1)^r (c_1 c_2 \cdots c_p)^{-1}\, d(U).$$
To see this, simply note that $d(FA) = -d(A)$, $d(EA) = d(A)$ and $d(GA) = c\, d(A)$. If $u_{11}, u_{22}, \dots, u_{nn}$ are the diagonal entries of $U$, then
$$d(A) = (-1)^r (c_1 c_2 \cdots c_p)^{-1}\, u_{11} u_{22} \cdots u_{nn}\, d(e_1, e_2, \dots, e_n).$$

Existence and uniqueness of the determinant function.

Theorem 1.5 (Uniqueness of the determinant function). Let $f$ be an alternating multilinear function of order $n$ and $d$ a determinant function of order $n$. Then for all $n \times n$ matrices $A = (A_1, A_2, \dots, A_n)$,
$$f(A_1, A_2, \dots, A_n) = d(A_1, A_2, \dots, A_n)\, f(e_1, e_2, \dots, e_n).$$
In particular, if $f$ is also a determinant function then $f(A_1, A_2, \dots, A_n) = d(A_1, A_2, \dots, A_n)$.

Proof. Consider the function
$$g(A_1, A_2, \dots, A_n) = f(A_1, A_2, \dots, A_n) - d(A_1, A_2, \dots, A_n)\, f(e_1, e_2, \dots, e_n).$$
We show that $g \equiv 0$. Since $f$ and $d$ are alternating and multilinear, so is $g$, and thus, by the Gauss elimination computation above,
$$g(A_1, A_2, \dots, A_n) = c\, g(e_1, e_2, \dots, e_n),$$
where $c$ depends only on $A = (A_1, A_2, \dots, A_n)$. But
$$g(e_1, e_2, \dots, e_n) = f(e_1, e_2, \dots, e_n) - d(e_1, e_2, \dots, e_n)\, f(e_1, \dots, e_n) = 0,$$
since $d(e_1, \dots, e_n) = 1$. Hence $g \equiv 0$. $\Box$
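The Gauss elimination evaluation can be sketched in code (an illustration of mine, not from the notes). This variant uses only row swaps and row replacements $A_i \to A_i + cA_j$, so no multipliers $c_i$ appear and the sign comes entirely from the $r$ swaps:

```python
# Determinant by row reduction to upper triangular form, tracking the
# sign from row exchanges; exact arithmetic via fractions.
from fractions import Fraction

def det_gauss(A):
    A = [[Fraction(x) for x in row] for row in A]
    n, sign = len(A), 1
    for k in range(n):
        # find a pivot in column k at or below row k
        pivot = next((i for i in range(k, n) if A[i][k] != 0), None)
        if pivot is None:
            return Fraction(0)          # zero column: rows are dependent
        if pivot != k:
            A[k], A[pivot] = A[pivot], A[k]
            sign = -sign                # each swap flips the sign
        for i in range(k + 1, n):
            c = A[i][k] / A[k][k]
            A[i] = [a - c * b for a, b in zip(A[i], A[k])]  # A_i -> A_i - c A_k
    prod = Fraction(1)
    for k in range(n):
        prod *= A[k][k]                 # product of diagonal entries of U
    return sign * prod

assert det_gauss([[1, 2], [3, 4]]) == -2
assert det_gauss([[2, 0, 0], [0, 3, 0], [0, 0, 4]]) == 24
assert det_gauss([[1, 2, 3], [2, 4, 6], [0, 1, 5]]) == 0  # dependent rows
```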

We shall use $\det A$ or $|A|$ for $d(A_1, \dots, A_n)$, since $d$ is unique. We have proved uniqueness of the determinant function of order $n$; it remains to show existence. We set $\det[a] = a$ by definition, and we have already derived the formula for $\det A$ for $2 \times 2$ matrices. The determinant of an $n \times n$ matrix $A$ can be computed in terms of certain $(n-1) \times (n-1)$ determinants by a process called expansion by minors. Let $A_{ij}$ be the $(n-1) \times (n-1)$ matrix obtained from $A$ by deleting the $i$-th row and $j$-th column of $A$.

Theorem 1.6. Let $A = (a_{ij})$ be an $n \times n$ matrix. Then, for $1 \le k \le n$,
$$\det A = (-1)^{k+1}\bigl(a_{1k}\det A_{1k} - a_{2k}\det A_{2k} + \cdots + (-1)^{n+1} a_{nk}\det A_{nk}\bigr).$$

Proof. We prove the case $k = 1$; the other cases are similar. Denote the right-hand side of the above equation by $f(A_1, A_2, \dots, A_n)$. We show by induction on $n$ that $f(A_1, A_2, \dots, A_n)$ is a determinant function. This is easily checked for $n = 1$ and $n = 2$. Suppose that the rows $A_j$ and $A_{j+1}$ of $A$ are equal. Then $A_{i1}$ has two equal rows except when $i = j$ or $i = j+1$, so by induction $\det A_{i1} = 0$ for $i \neq j, j+1$. Thus
$$f(A_1, \dots, A_n) = (-1)^{j+1} a_{j1}\det A_{j1} + (-1)^{j+2} a_{j+1,1}\det A_{j+1,1}.$$
Since $A_j = A_{j+1}$, we have $a_{j1} = a_{j+1,1}$ and $A_{j1} = A_{j+1,1}$, so the two terms cancel and $f(A_1, \dots, A_n) = 0$. Therefore $f(A_1, A_2, \dots, A_n)$ is alternating. If $A = (e_1, e_2, \dots, e_n)$ then by induction
$$f(A) = 1 \cdot \det A_{11} = \det(e_1, e_2, \dots, e_{n-1}) = 1,$$
so $f$ is normalized. We leave the multilinearity of $f(A_1, \dots, A_n)$ as an exercise for the reader. $\Box$

Determinants and invertibility.

Theorem 1.7. Let $A$, $B$ be two $n \times n$ matrices. Then $\det(AB) = \det A \det B$.

Proof. Let $D_i$ denote the $i$-th row of a matrix $D$. Then $(AB)_i = A_iB$. Therefore we need to prove
$$\det(A_1B, A_2B, \dots, A_nB) = \det(A_1, A_2, \dots, A_n)\det(B_1, \dots, B_n) = \det(A_1, A_2, \dots, A_n)\det(e_1B, e_2B, \dots, e_nB),$$
noting that $B_i = e_iB$. Keep $B$ fixed and define $f(A_1, A_2, \dots, A_n) = \det(A_1B, A_2B, \dots, A_nB)$. We show that $f$ is alternating and multilinear. Let $C \in F^n$. Then

$$f(A_1, \dots, A_i, \dots, A_i, \dots, A_n) = \det(A_1B, \dots, A_iB, \dots, A_iB, \dots, A_nB) = 0,$$
so $f$ is alternating, and
$$f(A_1, \dots, \alpha A_k + \beta C, \dots, A_n) = \det(A_1B, \dots, (\alpha A_k + \beta C)B, \dots, A_nB)$$
$$= \det(A_1B, \dots, \alpha A_kB + \beta CB, \dots, A_nB)$$
$$= \alpha\det(A_1B, \dots, A_kB, \dots, A_nB) + \beta\det(A_1B, \dots, CB, \dots, A_nB)$$
$$= \alpha f(A_1, \dots, A_k, \dots, A_n) + \beta f(A_1, \dots, C, \dots, A_n),$$
so $f$ is multilinear. By Theorem 1.5,
$$f(A_1, A_2, \dots, A_n) = \det(A_1, \dots, A_n)\, f(e_1, e_2, \dots, e_n) = \det(A_1, \dots, A_n)\det(B_1, B_2, \dots, B_n).$$
Hence $\det(AB) = \det A \det B$. $\Box$

Lemma 1.8. $A$ is an invertible matrix if and only if $\det A \neq 0$. In this case,
$$\det A^{-1} = \frac{1}{\det A}.$$

Proof. Suppose $A$ is invertible. Then $AA^{-1} = I$, so $\det A \det A^{-1} = \det I = 1$. Thus $\det A \neq 0$ and $\det A^{-1} = 1/\det A$. Conversely, let $\det A \neq 0$. By Exercise 1.3, we have $\operatorname{rank} A = n$. Thus the standard basis vectors $e_1, e_2, \dots, e_n \in F^n$ can be expressed in terms of the column vectors of $A$: for $i = 1, 2, \dots, n$ write
$$e_i = b_{1i}A^1 + b_{2i}A^2 + \cdots + b_{ni}A^n$$
for uniquely determined scalars $b_{ij}$, where $A^j$ denotes the $j$-th column of $A$. Let $B = (b_{ij})$. Then $AB = I$. Hence $A$ is invertible. $\Box$

Theorem 1.9. For any $n \times n$ matrix $A$, $\det A = \det A^t$.

Proof. If $\operatorname{rank} A < n$ then $A$ is not invertible and $\det A = 0$. Since the row rank and the column rank are equal, it follows that $\det A^t = 0$ as well. So we may assume that $A$ is invertible. By Gauss elimination $A$ can be reduced to the identity matrix by elementary row operations, so $A$ is a product of elementary matrices. Now, each elementary matrix is easily checked to be either symmetric, lower triangular, or upper triangular, and by Exercise 1.4 the determinant of a lower or upper triangular matrix equals the determinant of its transpose. Let $A = E_1E_2\cdots E_r$ for some elementary matrices $E_1, E_2, \dots, E_r$. Hence
$$\det A = \prod_{i=1}^r \det E_i = \prod_{i=1}^r \det E_i^t = \det\bigl(E_r^t E_{r-1}^t \cdots E_1^t\bigr) = \det A^t,$$
since $A^t = E_r^t E_{r-1}^t \cdots E_1^t$. $\Box$
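The two results just proved, $\det(AB) = \det A \det B$ and $\det A = \det A^t$, can be spot-checked numerically (a sketch of mine with made-up matrices; the determinant is computed by expansion along the first column, the $k = 1$ case of Theorem 1.6):

```python
# Numeric spot checks of the product rule and the transpose rule.

def det(A):
    """Expansion along the first column (Theorem 1.6 with k = 1)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** i * A[i][0] * det([r[1:] for k, r in enumerate(A) if k != i])
               for i in range(len(A)))

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
B = [[1, 2, 0], [0, 1, 1], [3, 0, 1]]
assert det(A) == 18 and det(B) == 7
assert det(matmul(A, B)) == det(A) * det(B)
assert det(transpose(A)) == det(A)
```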

It follows from the theorem above that the determinant function is multilinear, alternating, and normalized with respect to the columns as well. We also have the row-expansion analogue of Theorem 1.6.

Theorem 1.10. Let $B = (b_{ij})$ be an $n \times n$ matrix. Then for $k = 1, 2, \dots, n$,
$$\det B = (-1)^{k+1}\bigl(b_{k1}\det B_{k1} - b_{k2}\det B_{k2} + \cdots + (-1)^{n+1} b_{kn}\det B_{kn}\bigr).$$

The cofactor matrix and a formula for $A^{-1}$.

Definition. Let $A = (a_{ij})$ be an $n \times n$ matrix. The cofactor of $a_{ij}$, denoted $\operatorname{cof} a_{ij}$, is defined as
$$\operatorname{cof} a_{ij} = (-1)^{i+j}\det A_{ij}.$$
The cofactor matrix of $A$, denoted $\operatorname{cof} A$, is the matrix $\operatorname{cof} A = (\operatorname{cof} a_{ij})$.

Theorem 1.11. For any $n \times n$ matrix $A$ with $n \ge 2$,
$$A(\operatorname{cof} A)^t = (\det A)I = (\operatorname{cof} A)^t A.$$
In particular, if $\det A$ is nonzero then $A$ is invertible and
$$A^{-1} = \frac{1}{\det A}(\operatorname{cof} A)^t.$$

Proof. The $(i, j)$ entry of $(\operatorname{cof} A)^t A$ is
$$a_{1j}\operatorname{cof} a_{1i} + a_{2j}\operatorname{cof} a_{2i} + \cdots + a_{nj}\operatorname{cof} a_{ni}.$$
If $i = j$, this is the expansion of $\det A$ along the $i$-th column, so it equals $\det A$. When $i \neq j$, consider the matrix $B$ obtained by replacing the $i$-th column of $A$ by the $j$-th column of $A$; the sum above is then the expansion of $\det B$ along the $i$-th column. Since $B$ has a repeated column, $\det B = 0$. To get the other equation, substitute $A^t$ for $A$ in the equation just proved, take transposes, and observe that $\operatorname{cof} A^t = (\operatorname{cof} A)^t$. $\Box$

Theorem 1.12 (Cramer's rule). Suppose
$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}$$
is a system of $n$ linear equations in $n$ unknowns $x_1, x_2, \dots, x_n$, and suppose the coefficient matrix $A = (a_{ij})$ is invertible. Let $C_j$ be the matrix obtained from $A$ by replacing the $j$-th column of $A$ by $b = (b_1, b_2, \dots, b_n)^t$. Then for $j = 1, 2, \dots, n$,
$$x_j = \frac{\det C_j}{\det A}.$$

Proof. We have $b = x_1A^1 + x_2A^2 + \cdots + x_nA^n$, where $A^j$ is the $j$-th column of $A$. Using multilinearity of the determinant in the columns of $C_j$, and the fact that a determinant with two equal columns vanishes, we see that $\det C_j = x_j\det A$. $\Box$
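Theorem 1.11 and Cramer's rule translate directly into code (an illustrative sketch of mine, not part of the notes; exact arithmetic via fractions):

```python
# Inverse via the transposed cofactor matrix, and Cramer's rule.
from fractions import Fraction

def det(A):
    """Expansion along the first column."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** i * A[i][0] * det([r[1:] for k, r in enumerate(A) if k != i])
               for i in range(len(A)))

def inverse(A):
    n, d = len(A), Fraction(det(A))
    # (i, j) entry of A^{-1} is cof(a_ji) / det A  (Theorem 1.11)
    return [[(-1) ** (i + j) *
             det([[A[r][c] for c in range(n) if c != i]
                  for r in range(n) if r != j]) / d
             for j in range(n)] for i in range(n)]

def cramer(A, b):
    """Solve Ax = b via x_j = det(C_j) / det(A)  (Theorem 1.12)."""
    d, n = Fraction(det(A)), len(A)
    return [det([row[:j] + [bi] + row[j + 1:] for row, bi in zip(A, b)]) / d
            for j in range(n)]

A = [[2, 1], [1, 2]]
assert inverse(A) == [[Fraction(2, 3), Fraction(-1, 3)],
                      [Fraction(-1, 3), Fraction(2, 3)]]
assert cramer(A, [1, 4]) == [Fraction(-2, 3), Fraction(7, 3)]
```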

Determinants and permutations. We now derive a formula for the determinant of a matrix by means of permutations. Recall that a permutation of the set $[n] = \{1, 2, \dots, n\}$ is a bijection of $[n]$. Let $\sigma$ be a permutation of $[n]$. We write this as
$$\sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma(1) & \sigma(2) & \cdots & \sigma(n) \end{pmatrix}.$$
Let the set of permutations of $[n]$ be denoted by $S_n$. For any $\sigma \in S_n$ we define the permutation matrix $A_\sigma$ by
$$A_\sigma = [e_{\sigma(1)}, e_{\sigma(2)}, \dots, e_{\sigma(n)}].$$
If $\tau \in S_n$ then for all $k = 1, 2, \dots, n$,
$$A_{\sigma\tau}(e_k) = e_{\sigma\tau(k)} \quad\text{and}\quad A_\sigma A_\tau(e_k) = A_\sigma e_{\tau(k)} = e_{\sigma\tau(k)}.$$
Hence $A_{\sigma\tau} = A_\sigma A_\tau$.

Definition 1.13. The signature $\epsilon(\sigma)$ of a permutation $\sigma \in S_n$ is defined by $\epsilon(\sigma) = \det A_\sigma$. Since $A_\sigma$ is obtained from the identity matrix by permuting the rows, $\epsilon(\sigma) = \pm 1$.

Lemma 1.14. For all $\sigma, \tau \in S_n$, $\epsilon(\sigma\tau) = \epsilon(\sigma)\epsilon(\tau)$.

Proof. $\epsilon(\sigma\tau) = \det A_{\sigma\tau} = \det A_\sigma \det A_\tau = \epsilon(\sigma)\epsilon(\tau)$. $\Box$

Definition 1.15. A permutation $\sigma$ is called a transposition if there exist $i \neq j$ in $[n]$ such that $\sigma(i) = j$, $\sigma(j) = i$ and $\sigma(k) = k$ for all $k \in [n] \setminus \{i, j\}$. In this case we write $\sigma = (i\,j)$.

Theorem 1.16. Every permutation in $S_n$ is a product of transpositions.

Proof. Let $\sigma \in S_n$. Suppose that $\sigma(n) = n$. Then $\sigma$ restricts to a permutation of $[n-1]$, which by induction on $n$ is a product of transpositions. If $\sigma(n) = i \neq n$, then $(i\,n)\sigma(n) = n$, so the permutation $(i\,n)\sigma$ is a product of transpositions, and hence so is $\sigma = (i\,n)(i\,n)\sigma$. $\Box$

Definition 1.17. A permutation $\sigma$ is called even (resp. odd) if $\epsilon(\sigma) = 1$ (resp. $-1$). Since $\epsilon(\sigma\tau) = \epsilon(\sigma)\epsilon(\tau)$, a product of even permutations is even, a product of an odd and an even permutation is odd, and a product of two odd permutations is even. A transposition is an odd permutation. Hence a permutation is even if and only if it is a product of an even number of transpositions, and odd if and only if it is a product of an odd number of transpositions.

Theorem 1.18. Let $F$ be a field and $A = (a_{ij}) \in F^{n \times n}$. Then
$$\det A = \sum_{\sigma \in S_n} \epsilon(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)}.$$
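Before proving the theorem, here is a numerical check (a sketch of mine, not from the notes). It uses the standard fact, not proved in these notes, that $\epsilon(\sigma) = (-1)^{\mathrm{inv}(\sigma)}$, where $\mathrm{inv}(\sigma)$ is the number of inversions of $\sigma$:

```python
# The permutation-sum formula of Theorem 1.18, checked against
# cofactor expansion on a small integer matrix.
from itertools import permutations

def sign(sigma):
    """Signature via inversion count: (-1)^(number of inversions)."""
    n = len(sigma)
    inv = sum(1 for i in range(n) for j in range(i + 1, n) if sigma[i] > sigma[j])
    return -1 if inv % 2 else 1

def prod_over(A, s):
    """Product a_{1,s(1)} a_{2,s(2)} ... a_{n,s(n)} (0-based indices)."""
    p = 1
    for i, si in enumerate(s):
        p *= A[i][si]
    return p

def det_leibniz(A):
    n = len(A)
    return sum(sign(s) * prod_over(A, s) for s in permutations(range(n)))

def det_cofactor(A):
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** i * A[i][0] *
               det_cofactor([r[1:] for k, r in enumerate(A) if k != i])
               for i in range(len(A)))

A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
assert det_leibniz(A) == det_cofactor(A) == 18
```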

Proof. Let $A_i$ denote the $i$-th row vector of $A$, so $A_i = \sum_{j=1}^n a_{ij}e_j$. Then
$$\det(A_1, A_2, \dots, A_n) = \det\Bigl(\sum_{j_1=1}^n a_{1j_1}e_{j_1}, A_2, A_3, \dots, A_n\Bigr) = \sum_{j_1=1}^n a_{1j_1}\det(e_{j_1}, A_2, A_3, \dots, A_n)$$
$$= \sum_{j_1=1}^n \sum_{j_2=1}^n a_{1j_1}a_{2j_2}\det(e_{j_1}, e_{j_2}, A_3, \dots, A_n)$$
$$= \cdots = \sum_{j_1=1}^n \cdots \sum_{j_n=1}^n a_{1j_1}a_{2j_2}\cdots a_{nj_n}\det(e_{j_1}, e_{j_2}, \dots, e_{j_n}).$$
By the alternating property, $\det(e_{j_1}, \dots, e_{j_n}) = 0$ whenever two of the indices coincide, so only the terms in which $(j_1, \dots, j_n)$ is a permutation $\sigma$ of $[n]$ survive, and $\det(e_{\sigma(1)}, \dots, e_{\sigma(n)}) = \epsilon(\sigma)$. Hence
$$\det A = \sum_{\sigma \in S_n} \epsilon(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)}. \qquad\Box$$

Determinants and the rank of a matrix. Finally, we discuss the rank of a matrix in terms of determinants. An $r \times r$ minor of an $m \times n$ matrix $A$ is the determinant of a submatrix obtained by deleting $m - r$ rows and $n - r$ columns of $A$.

Lemma 1.19. Let $F$ be a field and $A \in F^{n \times n}$. Then $\operatorname{rank}(A) = n$ if and only if $\det A \neq 0$.

Proof. Let $\operatorname{rank}(A) = n$. Then by row and column operations $A$ can be reduced to the identity matrix. Since the nonvanishing of $\det A$ remains unchanged under elementary row and column operations, we conclude that $\det A \neq 0$. Conversely, let $\det A \neq 0$. If $\operatorname{rank}(A) < n$ then the column vectors of $A$ are linearly dependent: $b_1A^1 + b_2A^2 + \cdots + b_nA^n = 0$ for some scalars $b_1, b_2, \dots, b_n$, not all zero. Then $Ab = 0$ where $b = (b_1, b_2, \dots, b_n)^t$. Since $\det A \neq 0$, $A$ is invertible, hence $b = 0$, which is a contradiction. $\Box$

Theorem 1.20. The rank of an $m \times n$ matrix $A$ is $r$ if and only if $A$ has a nonzero $r \times r$ minor and all $(r+k) \times (r+k)$ minors of $A$ are zero for $k = 1, 2, \dots$.

Proof. Say that $\operatorname{det\text{-}rank} A = r$ if the condition on the minors stated in the theorem is satisfied. Let $\operatorname{rank}(A) = r$. Then there exist $r$ linearly independent rows of $A$, and any $s$ rows of $A$ with $s > r$ are linearly dependent; in particular every $(r+k) \times (r+k)$ minor of $A$ has dependent rows and hence vanishes. We may form an $r \times n$ submatrix $B$ of $A$ whose rows are linearly independent. Then $\operatorname{rank}(B) = r$, so we can find $r$ linearly independent columns of $B$. These give an $r \times r$ submatrix with nonzero determinant, i.e., a nonzero $r \times r$ minor of $A$. Hence $\operatorname{det\text{-}rank}(A) \ge \operatorname{rank}(A)$.

Conversely, let $\operatorname{det\text{-}rank}(A) = r$. Then there exists an $r \times r$ submatrix of $A$ with nonzero determinant. Hence the rows of $A$ that contain this submatrix are linearly independent. Therefore $\operatorname{rank}(A) \ge \operatorname{det\text{-}rank}(A)$. $\Box$

Determinant of a linear map.

Definition 1.21. Let $T : V \to V$ be a linear operator on an $n$-dimensional vector space $V$, and let $B$ be a basis of $V$. The determinant $\det T$ of $T$ is defined to be $\det(M_B^B(T))$, where $M_B^B(T)$ is the matrix of $T$ with respect to $B$.

Note that the determinant of a linear operator on $V$ is well defined. Indeed, if $C$ is another basis of $V$, then we know that $M_C^C(T) = P^{-1}AP$ for an invertible matrix $P$, where $A = M_B^B(T)$. Hence
$$\det M_C^C(T) = \det(P^{-1})\det A \det P = \det A.$$

Theorem 1.22. Let $T$ be a linear operator on an $n$-dimensional vector space $V$. Then $T$ is invertible if and only if $\det T \neq 0$.

Proof. Let $T$ be invertible. Then there is a linear map $S : V \to V$ such that $TS = ST = I$. Let $M(T)$ denote the matrix of $T$ with respect to a basis $B$ of $V$. Then $M(T)M(S) = I$, so $\det M(T)\det M(S) = 1$. Hence $\det T = \det M(T) \neq 0$. Conversely, let $\det T \neq 0$. Then $\det M(T) \neq 0$, so $\operatorname{rank} M(T) = \operatorname{rank} T = n$. Thus $T$ is onto, and since $V$ is finite-dimensional, $T$ is invertible. $\Box$

2. Orthogonal projections, best approximations and least squares

Let $V$ be a finite-dimensional inner product space. We have seen how to project a vector onto a nonzero vector; we now discuss the orthogonal projection of a vector onto a subspace. Let $W$ be a nonzero subspace of $V$. The orthogonal complement $W^\perp$ of $W$ is defined as
$$W^\perp = \{u \in V : u \perp w \text{ for all } w \in W\}.$$

Theorem 2.1. Every $v \in V$ can be written uniquely as $v = x + y$, where $x \in W$ and $y \in W^\perp$.

Proof. (Existence) Let $\{v_1, v_2, \dots, v_k\}$ be an orthonormal basis of $W$. Set
$$x = \langle v, v_1\rangle v_1 + \langle v, v_2\rangle v_2 + \cdots + \langle v, v_k\rangle v_k$$

and put $y = v - x$. Clearly $v = x + y$ and $x \in W$. We now check that $y \in W^\perp$. For $i = 1, 2, \dots, k$ we have
$$\langle y, v_i\rangle = \langle v - x, v_i\rangle = \langle v, v_i\rangle - \langle x, v_i\rangle = \langle v, v_i\rangle - \sum_{j=1}^k \langle v, v_j\rangle\langle v_j, v_i\rangle = \langle v, v_i\rangle - \langle v, v_i\rangle = 0$$
(by orthonormality). It follows that $y \in W^\perp$.
(Uniqueness) Let $v = x + y = x' + y'$, where $x, x' \in W$ and $y, y' \in W^\perp$. Then $x - x' = y' - y \in W \cap W^\perp$. But $W \cap W^\perp = \{0\}$. Hence $x = x'$ and $y = y'$. $\Box$

Exercise 2.2. Show that $\dim W + \dim W^\perp = \dim V$.

Definition 2.3. For a subspace $W$, we define a function $p_W : V \to W$ as follows: given $v \in V$, express $v$ uniquely as $v = x + y$, where $x \in W$ and $y \in W^\perp$, and define $p_W(v) = x$. We call $p_W(v)$ the orthogonal projection of $v$ onto $W$. Note that $v - p_W(v) \in W^\perp$.

Definition 2.4. Let $W$ be a subspace of $V$ and let $v \in V$. A best approximation to $v$ by vectors in $W$ is a vector $u \in W$ such that
$$\|v - u\| \le \|v - w\| \quad\text{for all } w \in W.$$

The next result shows that the orthogonal projection gives the unique best approximation.

Theorem 2.5. Let $v \in V$ and let $W$ be a subspace of $V$. Let $w \in W$. Then the following are equivalent:
(1) $w$ is a best approximation to $v$ by vectors in $W$;
(2) $w = p_W(v)$;
(3) $v - w \in W^\perp$.

Proof. We have
$$\|v - w\|^2 = \|v - p_W(v) + p_W(v) - w\|^2 = \|v - p_W(v)\|^2 + \|p_W(v) - w\|^2,$$
where the second equality follows from the Pythagoras theorem on noting that $p_W(v) - w \in W$ and $v - p_W(v) \in W^\perp$. It follows that (1) and (2) are equivalent. To see the equivalence of (1) and (3), write $v = w + (v - w)$ and apply Theorem 2.1. $\Box$

Example 2.6. Let $V = C[0, 2\pi] = \{f : [0, 2\pi] \to \mathbb{R} : f \text{ is continuous}\}$. Consider the trigonometric functions
$$u_0 = 1,\qquad u_{2n-1} = \cos nx,\qquad u_{2n} = \sin nx \quad\text{for } n = 1, 2, \dots$$

The functions in the subspace $T$ generated by $u_n$, $n = 0, 1, 2, \dots$ are called trigonometric polynomials. The vector space $V$ is an inner product space with the inner product
$$\langle f(x), g(x)\rangle = \int_0^{2\pi} f(x)g(x)\,dx.$$
It is easy to check that $u_m \perp u_n$ for all $m \neq n$. Check that
$$\langle u_0, u_0\rangle = \int_0^{2\pi} dx = 2\pi,\quad \langle \cos nx, \cos nx\rangle = \int_0^{2\pi}\cos^2 nx\,dx = \pi,\quad \langle \sin nx, \sin nx\rangle = \int_0^{2\pi}\sin^2 nx\,dx = \pi.$$
Thus the set
$$B_m = \Bigl\{\phi_0 = \frac{1}{\sqrt{2\pi}},\ \phi_{2n-1}(x) = \frac{\cos nx}{\sqrt{\pi}},\ \phi_{2n}(x) = \frac{\sin nx}{\sqrt{\pi}},\quad n = 1, 2, \dots, m\Bigr\}$$
is an orthonormal basis of the subspace $W = L(B_m)$. For a function $f \in V$ the best approximation by trigonometric polynomials in $W$ is given by
$$p_W(f) = \sum_{k=0}^{2m}\langle f, \phi_k\rangle\phi_k.$$
The real numbers $\langle f, \phi_k\rangle$ for $k = 0, 1, \dots$ are called the Fourier coefficients, in honour of the French mathematician Joseph Fourier (1768-1830), who was led to these in his study of the heat equation.

Projection of a vector onto the column space of a matrix. Let us now consider projection from the matrix point of view. Consider $\mathbb{R}^n$ with the standard inner product. Let $A$ be an $n \times m$ ($m \le n$) matrix and let $b \in \mathbb{R}^n$. We want to project $b$ onto the column space of $A$. The projection of $b$ onto the column space of $A$ will be a vector of the form $p = Ax$ for some $x \in \mathbb{R}^m$. By Theorem 2.5, $p$ is the projection if and only if $b - Ax$ is orthogonal to every column of $A$. In other words, $x$ should satisfy the equations
$$A^t(b - Ax) = 0, \quad\text{or}\quad A^tAx = A^tb.$$
The above equations are called normal equations in Gauss-Markov theory in statistics.

Lemma 2.7. $\operatorname{rank}(A^tA) = \operatorname{rank} A$ and $\operatorname{nullity}(A^tA) = \operatorname{nullity}(A)$. In particular, if the columns of $A$ are linearly independent, then $A^tA$ is an invertible matrix.

Proof. Let $A^tAz = 0$ for $z \in \mathbb{R}^m$. Then $A^tw = 0$, where $w = Az$. Now $w$ is in the column space of $A$ and is orthogonal to every column of $A$. This implies that $w = 0$. Thus $N(A) = N(A^tA)$, where $N(B)$ denotes the nullspace of a matrix $B$. Let $n(B)$ denote the nullity of $B$. By the rank-nullity theorem,
$$\operatorname{rank}(A^tA) + n(A^tA) = m = \operatorname{rank} A + n(A).$$
Hence $\operatorname{rank}(A^tA) = \operatorname{rank}(A)$. Thus if the columns of $A$ are linearly independent, then the $m \times m$ matrix $A^tA$ has rank $m$, hence it is invertible. $\Box$

If the columns of $A$ are linearly independent, the (unique) solution to the normal equations is $(A^tA)^{-1}A^tb$ and the projection of $b$ onto the column space of $A$ is $A(A^tA)^{-1}A^tb$. Note that the normal equations always have a solution (why?), although the solution will not be unique in case the columns of $A$ are linearly dependent.
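The normal equations can be solved and checked in a few lines (an illustration of mine with a small made-up system, not part of the notes; the residual $b - Ax$ must come out orthogonal to the columns of $A$):

```python
# Solve A^t A x = A^t b exactly and verify the orthogonality of the residual.
from fractions import Fraction

def transpose(A):
    return [list(col) for col in zip(*A)]

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def solve2(M, c):
    """Cramer's rule for a 2x2 system M x = c."""
    d = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [Fraction(c[0] * M[1][1] - c[1] * M[0][1], d),
            Fraction(M[0][0] * c[1] - M[1][0] * c[0], d)]

A = [[1, 1], [1, 0], [0, 1]]            # columns are linearly independent
b = [-1, 0, 5]
At = transpose(A)
x = solve2(matmul(At, A), matvec(At, b))
residual = [bi - pi for bi, pi in zip(b, matvec(A, x))]
assert x == [Fraction(-2), Fraction(3)]
assert matvec(At, residual) == [0, 0]   # residual is orthogonal to the columns
```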

Example 2.8. Let
$$A = \begin{bmatrix} 1 & 1 \\ 1 & 0 \\ 0 & 1 \end{bmatrix} \quad\text{and}\quad b = \begin{bmatrix} -1 \\ 0 \\ 5 \end{bmatrix}.$$
Then
$$A^tA = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \quad\text{and}\quad A^tb = (-1, 4)^t.$$
The unique solution to the normal equations is $x = (-2, 3)^t$, and $b - Ax = (-2, 2, 2)^t$ (note that this vector is orthogonal to the columns of $A$).

Now let
$$B = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 0 & 1/2 \\ 0 & 1 & 1/2 \end{bmatrix}.$$
We have
$$B^tB = \begin{bmatrix} 2 & 1 & 3/2 \\ 1 & 2 & 3/2 \\ 3/2 & 3/2 & 3/2 \end{bmatrix} \quad\text{and}\quad B^tb = (-1, 4, 3/2)^t.$$
Note that $A$ and $B$ have the same column spaces (the third column of $B$ is the average of the first two columns), so the projection of $b$ onto the column space of $B$ will be the same as before. However, the normal equations do not have a unique solution in this case. Check that $x = (-2, 3, 0)^t$ and $x = (-3, 2, 2)^t$ are both solutions of the normal equations $B^tBx = B^tb$.

The Gauss least squares method. Suppose we have a large number of data points $(x_i, y_i)$, $i = 1, 2, \dots, n$, collected from some experiment. Frequently there is reason to believe that these points should lie on a straight line, so we want a linear function $y(x) = s + tx$ such that $y(x_i) = y_i$, $i = 1, \dots, n$. Due to uncertainty in the data and experimental error, in practice the points will deviate somewhat from a straight line, and so it is impossible to find a linear $y(x)$ that passes through all of them. So we seek a line that fits the data well, in the sense that the errors are made as small as possible. A natural question that arises now is: how do we define the error?

Consider the following system of linear equations, in the variables $s$ and $t$, with known coefficients $x_i, y_i$, $i = 1, \dots, n$:
$$s + x_1t = y_1,\quad s + x_2t = y_2,\quad \dots,\quad s + x_nt = y_n.$$
Note that typically $n$ would be much greater than 2. If we can find $s$ and $t$ to satisfy all these equations, then we have solved our problem. However, for the reasons mentioned above, this is not always possible. For given values of $s$ and $t$, the error in the $i$-th equation is $y_i - s - x_it$. There are several ways of combining the errors in the individual equations to get a measure of the total error. The following are three examples:
$$\sum_{i=1}^n (y_i - s - x_it)^2,\qquad \sum_{i=1}^n |y_i - s - x_it|,\qquad \max_{1\le i\le n} |y_i - s - x_it|.$$
Both analytically and computationally, a nice theory exists for the first of these choices, and this is what we shall study. The problem of finding $s, t$ so as to minimize $\sum_{i=1}^n (y_i - s - x_it)^2$ is called a least squares problem.

Let
$$A = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix},\qquad b = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix},\qquad x = \begin{bmatrix} s \\ t \end{bmatrix}.$$
The least squares problem is to find an $x$ such that $\|b - Ax\|$ is minimized, i.e., to find an $x$ such that $Ax$ is the best approximation to $b$ in the column space of $A$. This is precisely the problem of projecting $b$ onto the column space of $A$.

A straight line can be considered as a polynomial of degree 1. We can also try to fit an $m$-th degree polynomial $y(x) = s_0 + s_1x + s_2x^2 + \cdots + s_mx^m$ to the data points $(x_i, y_i)$, $i = 1, \dots, n$, so as to minimize the error (in the least squares sense). In this case $s_0, s_1, \dots, s_m$ are the variables and we have
$$A = \begin{bmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^m \\ 1 & x_2 & x_2^2 & \cdots & x_2^m \\ \vdots & & & & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^m \end{bmatrix},\qquad b = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix},\qquad x = \begin{bmatrix} s_0 \\ s_1 \\ \vdots \\ s_m \end{bmatrix}.$$

Example 2.9. Find $C, D$ such that the straight line $b = C + Dt$ best fits the following data in the least squares sense: $b = 1$ at $t = -1$, $b = 1$ at $t = 1$, $b = 3$ at $t = 2$. We want to project $b = (1, 1, 3)^t$ onto the column space of
$$A = \begin{bmatrix} 1 & -1 \\ 1 & 1 \\ 1 & 2 \end{bmatrix}.$$
Now
$$A^tA = \begin{bmatrix} 3 & 2 \\ 2 & 6 \end{bmatrix} \quad\text{and}\quad A^tb = (5, 6)^t.$$
The normal equations are
$$\begin{bmatrix} 3 & 2 \\ 2 & 6 \end{bmatrix}\begin{bmatrix} C \\ D \end{bmatrix} = \begin{bmatrix} 5 \\ 6 \end{bmatrix}.$$
The solution is $C = 9/7$, $D = 4/7$, and the best line is
$$b = \frac{9}{7} + \frac{4}{7}t.$$
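Example 2.9 can be reproduced in a few lines (a sketch of mine, not part of the notes), solving the normal equations exactly:

```python
# Least squares fit of a line b = C + D t to the data of Example 2.9.
from fractions import Fraction

ts = [-1, 1, 2]
bs = [1, 1, 3]
n = len(ts)

# entries of A^t A and A^t b for A = [[1, t_i]]
m00, m01, m11 = n, sum(ts), sum(t * t for t in ts)
c0 = sum(bs)
c1 = sum(t * b for t, b in zip(ts, bs))

# solve the 2x2 normal equations by Cramer's rule
d = m00 * m11 - m01 * m01
C = Fraction(c0 * m11 - c1 * m01, d)
D = Fraction(m00 * c1 - m01 * c0, d)

assert (m00, m01, m11) == (3, 2, 6)      # A^t A = [[3, 2], [2, 6]]
assert (c0, c1) == (5, 6)                # A^t b = (5, 6)
assert (C, D) == (Fraction(9, 7), Fraction(4, 7))
```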