NOTES ON LINEAR ALGEBRA

1. Determinants

In this section we study determinants of matrices: their properties, methods of computation, and some applications. We are familiar with the following formulas for determinants:
$$\det[a] = a, \qquad \det\begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc,$$
and
$$\det\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} = aei - afh - bdi + bfg + cdh - ceg.$$
There is an explicit formula for the determinant $\det A$ of any $n \times n$ matrix $A$; it involves $n!$ terms. Our approach to determinants is via their properties, which makes their study more elegant.

Let $A$ be a $3 \times 3$ matrix with real entries, and let $A_i$ be the $i$th row of $A$, $i = 1, 2, 3$. We write $A = (A_1, A_2, A_3)$ and think of $A_1, A_2, A_3$ as vectors in $\mathbb{R}^3$. Then
$$\det A = \det(A_1, A_2, A_3) = A_1 \cdot (A_2 \times A_3).$$
Thus $|\det A|$ is the volume of the parallelepiped formed by the row vectors of $A$. Hence $\det A = 0$ if and only if $A_1, A_2, A_3$ are coplanar. The following properties of $\det A$ follow from those of the scalar triple product:

(1) Multilinearity: let $B_2, C_2 \in \mathbb{R}^3$ and $\alpha, \beta \in \mathbb{R}$. Then
$$\det(A_1, \alpha B_2 + \beta C_2, A_3) = \alpha \det(A_1, B_2, A_3) + \beta \det(A_1, C_2, A_3).$$
Similarly, linearity holds in the other rows.

(2) Alternating: $\det A = 0$ when two rows of $A$ are equal. In this case the rows generate either a plane or a line, whose volume is zero.

(3) Normalization: $\det(e_1, e_2, e_3) = 1$, where $e_1, e_2, e_3$ are the unit coordinate vectors in the directions of the positive $x$-, $y$- and $z$-axes, respectively.

These properties motivate the axioms for determinant-like functions on $n \times n$ matrices. We will write an $n \times n$ matrix $A$ with row vectors $A_1, A_2, \dots, A_n$ as $A = (A_1, A_2, \dots, A_n)$. Let $d(A_1, A_2, \dots, A_n)$ be a function defined on the rows $A_1, \dots, A_n$ of a matrix $A$. We allow the entries of $A$ to lie in any field $F$.

Definition 1.1. (i) $d(A_1, A_2, \dots, A_n)$ is called multilinear if for each $k = 1, 2, \dots, n$, scalars $\alpha, \beta$, and a vector $C \in F^n$,
$$d(A_1, \dots, \alpha A_k + \beta C, \dots, A_n) = \alpha\, d(A_1, \dots, A_k, \dots, A_n) + \beta\, d(A_1, \dots, C, \dots, A_n).$$
(ii) $d(A_1, A_2, \dots, A_n)$ is called alternating if $d(A_1, A_2, \dots, A_n) = 0$ whenever $A_i = A_j$ for some $i \neq j$.
(iii) $d(A_1, A_2, \dots, A_n)$ is called normalized if $d(e_1, e_2, \dots, e_n) = 1$, where $e_i = (0, \dots, 0, 1, 0, \dots, 0)$ is the $i$th unit coordinate vector, with $1$ in the $i$th place.
(iv) A normalized, alternating, multilinear function $d(A_1, A_2, \dots, A_n)$ on $n \times n$ matrices $A = (A_1, A_2, \dots, A_n)$ is called a determinant function of order $n$.

Our immediate objective is to show that there is only one determinant function of order $n$. This fact is very useful for finding formulas for determinants, or for proving that a given formula computes the determinant of a matrix: we simply show that the formula defines an alternating, multilinear and normalized function on the rows of $n \times n$ matrices.

Lemma 1.2. Suppose $d(A_1, A_2, \dots, A_n)$ is a multilinear alternating function on the rows of $n \times n$ matrices. Then:
(1) If some $A_k = 0$ then $d(A_1, A_2, \dots, A_n) = 0$.
(2) $d(A_1, \dots, A_i, \dots, A_j, \dots, A_n) = -\,d(A_1, \dots, A_j, \dots, A_i, \dots, A_n)$.

Proof. (1) If $A_k = 0$ then by multilinearity
$$d(A_1, \dots, 0 \cdot A_k, \dots, A_n) = 0 \cdot d(A_1, \dots, A_k, \dots, A_n) = 0.$$
(2) Put $A_i = B$ and $A_j = C$. By the alternating property of $d$,
$$0 = d(A_1, \dots, B + C, \dots, B + C, \dots, A_n)$$
$$= d(A_1, \dots, B, \dots, B + C, \dots, A_n) + d(A_1, \dots, C, \dots, B + C, \dots, A_n)$$
$$= d(A_1, \dots, B, \dots, C, \dots, A_n) + d(A_1, \dots, C, \dots, B, \dots, A_n).$$
Hence $d(A_1, \dots, B, \dots, C, \dots, A_n) = -\,d(A_1, \dots, C, \dots, B, \dots, A_n)$. $\Box$

Computation of determinants. We now derive the familiar formula for the determinant of $2 \times 2$ matrices. Suppose $d(A_1, A_2)$ is an alternating multilinear normalized function on $2 \times 2$ matrices $A = (A_1, A_2)$. Then
$$d\begin{bmatrix} x & y \\ z & u \end{bmatrix} = xu - yz.$$
To derive this formula, write the first row as $A_1 = xe_1 + ye_2$ and the second row as $A_2 = ze_1 + ue_2$. Then
$$d(A_1, A_2) = d(xe_1 + ye_2, ze_1 + ue_2) = d(xe_1 + ye_2, ze_1) + d(xe_1 + ye_2, ue_2)$$
$$= d(xe_1, ze_1) + d(ye_2, ze_1) + d(xe_1, ue_2) + d(ye_2, ue_2)$$
$$= yz\, d(e_2, e_1) + xu\, d(e_1, e_2) = (xu - yz)\, d(e_1, e_2) = xu - yz.$$
Exercise 1.3. Let $d$ be an alternating, multilinear, normalized function on the rows of $n \times n$ matrices. If the rows of $A$ are linearly dependent, show that $d(A) = 0$.

Exercise 1.4. Let $U$ be an upper triangular matrix and let $d$ be an alternating multilinear normalized function on the rows of $n \times n$ matrices. Show that $d(U)$ equals the product of the diagonal entries of $U$. Prove the same result for lower triangular matrices.

Computation by the Gauss elimination method. This is one of the most efficient ways to calculate determinants. Let $A$ be an $n \times n$ matrix, and let
$E$ = the $n \times n$ elementary matrix for the row operation $A_i \mapsto cA_j + A_i$,
$F$ = the $n \times n$ elementary matrix for the row operation $A_i \leftrightarrow A_j$,
$G$ = the $n \times n$ elementary matrix for the row operation $A_i \mapsto cA_i$.
Suppose $U$ is the row-echelon form of $A$. If $c_1, c_2, \dots, c_p$ are the multipliers used for the row operations $A_i \mapsto cA_i$, and $r$ row exchanges have been used to obtain $U$ from $A$, then for any alternating multilinear function $d$,
$$d(A) = (-1)^r (c_1 c_2 \cdots c_p)^{-1}\, d(U).$$
To see this, simply note that $d(FA) = -d(A)$, $d(EA) = d(A)$ and $d(GA) = c\, d(A)$. If $u_{11}, u_{22}, \dots, u_{nn}$ are the diagonal entries of $U$, then
$$d(A) = (-1)^r (c_1 c_2 \cdots c_p)^{-1}\, u_{11} u_{22} \cdots u_{nn}\, d(e_1, e_2, \dots, e_n).$$

Existence and uniqueness of the determinant function.

Theorem 1.5 (Uniqueness of the determinant function). Let $f$ be an alternating multilinear function of order $n$ and $d$ a determinant function of order $n$. Then for all $n \times n$ matrices $A = (A_1, A_2, \dots, A_n)$,
$$f(A_1, A_2, \dots, A_n) = d(A_1, A_2, \dots, A_n)\, f(e_1, e_2, \dots, e_n).$$
In particular, if $f$ is also a determinant function then $f(A_1, A_2, \dots, A_n) = d(A_1, A_2, \dots, A_n)$.

Proof. Consider the function
$$g(A_1, A_2, \dots, A_n) = f(A_1, A_2, \dots, A_n) - d(A_1, A_2, \dots, A_n)\, f(e_1, e_2, \dots, e_n).$$
We show that $g \equiv 0$. Since $f$ and $d$ are alternating and multilinear, so is $g$, and thus $g(A_1, A_2, \dots, A_n) = c\, g(e_1, e_2, \dots, e_n)$, where $c$ depends only on $A = (A_1, A_2, \dots, A_n)$. But
$$g(e_1, e_2, \dots, e_n) = f(e_1, e_2, \dots, e_n) - d(e_1, e_2, \dots, e_n)\, f(e_1, \dots, e_n) = 0.$$
Hence $g \equiv 0$. $\Box$
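The elimination procedure above translates directly into code. The following is a minimal sketch in Python (the function name is ours); it tracks the sign change from each row exchange and uses exact rational arithmetic via `fractions` to avoid round-off:

```python
from fractions import Fraction

def det_gauss(rows):
    """Determinant by Gauss elimination: a row exchange flips the sign,
    while adding a multiple of one row to another leaves d unchanged."""
    a = [[Fraction(x) for x in row] for row in rows]
    n = len(a)
    sign = 1
    for k in range(n):
        # find a pivot in column k, exchanging rows if necessary
        p = next((i for i in range(k, n) if a[i][k] != 0), None)
        if p is None:
            return Fraction(0)      # no pivot: rows are dependent, det = 0
        if p != k:
            a[k], a[p] = a[p], a[k]
            sign = -sign            # d(FA) = -d(A)
        for i in range(k + 1, n):
            c = a[i][k] / a[k][k]
            a[i] = [a[i][j] - c * a[k][j] for j in range(n)]  # d(EA) = d(A)
    prod = Fraction(sign)
    for k in range(n):
        prod *= a[k][k]             # det of triangular U = product of diagonal
    return prod
```

By Exercise 1.4 the triangular matrix left at the end has determinant equal to the product of its diagonal entries, which is exactly what the last loop computes.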
We shall write $\det A$ or $|A|$ for $d(A_1, \dots, A_n)$, since $d$ is unique. We have proved the uniqueness of the determinant function of order $n$; it remains to show existence. We set $\det[a] = a$ by definition, and we have already proved the formula for $\det A$ for $2 \times 2$ matrices. The determinant of an $n \times n$ matrix $A$ can be computed in terms of certain $(n-1) \times (n-1)$ determinants by a process called expansion by minors. Let $A_{ij}$ be the $(n-1) \times (n-1)$ matrix obtained from $A$ by deleting the $i$th row and $j$th column of $A$.

Theorem 1.6. Let $A = (a_{ij})$ be an $n \times n$ matrix. Then, for $1 \le k \le n$,
$$\det A = (-1)^{k+1}\left(a_{1k} \det A_{1k} - a_{2k} \det A_{2k} + \cdots + (-1)^{n+1} a_{nk} \det A_{nk}\right).$$

Proof. We prove the case $k = 1$; the other cases are similar. Denote the right-hand side of the above equation by $f(A_1, A_2, \dots, A_n)$. We show that $f$ is a determinant function by induction on $n$. This is easily checked for $n = 1$ and $n = 2$. Suppose the rows $A_j$ and $A_{j+1}$ of $A$ are equal. Then $A_{i1}$ has two equal rows except when $i = j$ or $i = j + 1$. By induction, $\det A_{i1} = 0$ for $i \neq j, j+1$. Thus
$$\det A = a_{j1} (-1)^{j+1} \det A_{j1} + a_{j+1,1} (-1)^{j+2} \det A_{j+1,1}.$$
Since $A_j = A_{j+1}$, we have $a_{j1} = a_{j+1,1}$ and $A_{j1} = A_{j+1,1}$. Thus $\det A = 0$, so $f(A_1, A_2, \dots, A_n)$ is alternating. If $A = (e_1, e_2, \dots, e_n)$ then by induction
$$\det A = 1 \cdot \det A_{11} = \det(e_1, e_2, \dots, e_{n-1}) = 1.$$
We leave the multilinearity of $f(A_1, \dots, A_n)$ as an exercise for the reader. $\Box$

Determinant and invertibility.

Theorem 1.7. Let $A, B$ be two $n \times n$ matrices. Then $\det(AB) = \det A \det B$.

Proof. Let $D_i$ denote the $i$th row of a matrix $D$. Then $(AB)_i = A_i B$. Therefore we need to prove:
$$\det(A_1 B, A_2 B, \dots, A_n B) = \det(A_1, A_2, \dots, A_n) \det(B_1, \dots, B_n) = \det(A_1, A_2, \dots, A_n) \det(e_1 B, e_2 B, \dots, e_n B).$$
Keep $B$ fixed and define $f(A_1, A_2, \dots, A_n) = \det(A_1 B, A_2 B, \dots, A_n B)$. We show that $f$ is alternating and multilinear. Let $C \in F^n$. Then
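The expansion in Theorem 1.6 also gives a (much slower, but very direct) recursive algorithm. A small sketch of the $k = 1$ case, with naming of our own choosing:

```python
def det_minors(a):
    """Determinant by expansion along the first column (Theorem 1.6, k = 1):
    det A = sum_i (-1)^(i+1) * a_{i1} * det A_{i1}, here with 0-based rows."""
    n = len(a)
    if n == 1:
        return a[0][0]
    total = 0
    for i in range(n):
        # A_{i1}: delete row i and the first column
        minor = [row[1:] for r, row in enumerate(a) if r != i]
        total += (-1) ** i * a[i][0] * det_minors(minor)
    return total
```

Because each call spawns $n$ calls on $(n-1) \times (n-1)$ minors, this takes on the order of $n!$ operations, which is why Gauss elimination is preferred in practice.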
$$f(A_1, \dots, A_i, \dots, A_i, \dots, A_n) = \det(A_1 B, \dots, A_i B, \dots, A_i B, \dots, A_n B) = 0,$$
so $f$ is alternating. For multilinearity,
$$f(A_1, \dots, \alpha A_k + \beta C, \dots, A_n) = \det(A_1 B, \dots, (\alpha A_k + \beta C)B, \dots, A_n B)$$
$$= \det(A_1 B, \dots, \alpha A_k B + \beta C B, \dots, A_n B)$$
$$= \alpha \det(A_1 B, \dots, A_k B, \dots, A_n B) + \beta \det(A_1 B, \dots, CB, \dots, A_n B)$$
$$= \alpha f(A_1, \dots, A_n) + \beta f(A_1, \dots, C, \dots, A_n).$$
By Theorem 1.5,
$$f(A_1, A_2, \dots, A_n) = \det(A_1, \dots, A_n)\, f(e_1, e_2, \dots, e_n) = \det(A_1, \dots, A_n) \det(B_1, B_2, \dots, B_n).$$
Hence $\det(AB) = \det A \det B$. $\Box$

Lemma 1.8. $A$ is an invertible matrix if and only if $\det A \neq 0$. In this case,
$$\det A^{-1} = \frac{1}{\det A}.$$

Proof. Suppose $A$ is invertible. Then $AA^{-1} = I$, so $\det A^{-1} \det A = \det I = 1$. Hence $\det A \neq 0$ and $\det A^{-1} = 1/\det A$. Conversely, let $\det A \neq 0$. By Exercise 1.3, we have rank $A = n$. Thus the standard basis vectors $e_1, e_2, \dots, e_n \in F^n$ can be expressed in terms of the column vectors $A^1, \dots, A^n$ of $A$. For $i = 1, 2, \dots, n$ write
$$e_i = b_{1i} A^1 + b_{2i} A^2 + \cdots + b_{ni} A^n$$
for uniquely determined scalars $b_{ij}$. Let $B = (b_{ij})$. Then $AB = I$. Hence $A$ is invertible. $\Box$

Theorem 1.9. For any $n \times n$ matrix $A$, $\det A = \det A^t$.

Proof. If rank $A < n$ then $A$ is not invertible and $\det A = 0$. Since the row rank and the column rank are equal, rank $A^t < n$, so $\det A^t = 0$ as well. So we may assume that $A$ is invertible. By Gauss elimination, $A$ can be reduced to the identity matrix by elementary row operations; thus $A$ is a product of elementary matrices. Now, it is easily checked that each elementary matrix is either symmetric, lower triangular, or upper triangular, and by Exercise 1.4 the determinant of a lower or upper triangular matrix equals the determinant of its transpose. Let $A = E_1 E_2 \cdots E_r$ for some elementary matrices $E_1, E_2, \dots, E_r$. Hence
$$\det A = \prod_{i=1}^{r} \det E_i = \prod_{i=1}^{r} \det E_i^t = \det(E_r^t \cdots E_2^t E_1^t) = \det A^t. \qquad \Box$$
It follows from the theorem above that the determinant function is multilinear, alternating, and normalized with respect to the columns as well. We also have the row-expansion analogue of Theorem 1.6.

Theorem 1.10. Let $B$ be an $n \times n$ matrix. Then for $k = 1, 2, \dots, n$,
$$\det B = (-1)^{k+1}\left(b_{k1} \det B_{k1} - b_{k2} \det B_{k2} + \cdots + (-1)^{n+1} b_{kn} \det B_{kn}\right).$$

The cofactor matrix and a formula for $A^{-1}$.

Definition: Let $A = (a_{ij})$ be an $n \times n$ matrix. The cofactor of $a_{ij}$, denoted $\operatorname{cof} a_{ij}$, is defined as $\operatorname{cof} a_{ij} = (-1)^{i+j} \det A_{ij}$. The cofactor matrix of $A$, denoted $\operatorname{cof} A$, is the matrix $\operatorname{cof} A = (\operatorname{cof} a_{ij})$.

Theorem 1.11. For any $n \times n$ matrix $A$ with $n \ge 2$,
$$A (\operatorname{cof} A)^t = (\det A) I = (\operatorname{cof} A)^t A.$$
In particular, if $\det A \neq 0$ then $A$ is invertible and
$$A^{-1} = \frac{1}{\det A} (\operatorname{cof} A)^t.$$

Proof. The $(i, j)$ entry of $(\operatorname{cof} A)^t A$ is
$$a_{1j} \operatorname{cof} a_{1i} + a_{2j} \operatorname{cof} a_{2i} + \cdots + a_{nj} \operatorname{cof} a_{ni}.$$
If $i = j$, this is the expansion of $\det A$ along the $i$th column, so it equals $\det A$. When $i \neq j$, consider the matrix $B$ obtained by replacing the $i$th column of $A$ by the $j$th column of $A$; the sum above is then the expansion of $\det B$ along the $i$th column. Since $B$ has a repeated column, $\det B = 0$. To get the other equation, substitute $A^t$ for $A$ in the equation just proved, take transposes, and observe that $\operatorname{cof} A^t = (\operatorname{cof} A)^t$. $\Box$

Theorem 1.12 (Cramer's rule). Suppose
$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}$$
is a system of $n$ linear equations in $n$ unknowns $x_1, x_2, \dots, x_n$, and that the coefficient matrix $A = (a_{ij})$ is invertible. Let $C_j$ be the matrix obtained from $A$ by replacing the $j$th column of $A$ by $b = (b_1, b_2, \dots, b_n)^t$. Then for $j = 1, 2, \dots, n$,
$$x_j = \frac{\det C_j}{\det A}.$$

Proof. We have $b = x_1 A^1 + x_2 A^2 + \cdots + x_n A^n$, where $A^j$ is the $j$th column of $A$. Expanding $\det C_j$ by multilinearity in its $j$th column, every term except the one containing $x_j A^j$ has a repeated column and vanishes, so $\det C_j = x_j \det A$. $\Box$
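Cramer's rule is easy to implement once a determinant routine is available. A self-contained sketch (names are ours; the internal `det` is a simple first-column minor expansion, adequate for small systems):

```python
from fractions import Fraction

def det(a):
    # first-column minor expansion; fine for the small systems used here
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** i * row[0] * det([r[1:] for k, r in enumerate(a) if k != i])
               for i, row in enumerate(a))

def cramer_solve(a, b):
    """Solve Ax = b by Cramer's rule (Theorem 1.12): x_j = det(C_j) / det(A),
    where C_j is A with its j-th column replaced by b."""
    d = det(a)
    if d == 0:
        raise ValueError("coefficient matrix is not invertible")
    n = len(a)
    return [Fraction(det([row[:j] + [b[i]] + row[j + 1:]
                          for i, row in enumerate(a)]), d)
            for j in range(n)]
```

For instance, the system $2x + y = 1$, $x + 2y = 4$ has $\det A = 3$, and the rule returns $x = -2/3$, $y = 7/3$.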
Determinants and permutations. We now derive a formula for the determinant of a matrix by means of permutations. Recall that a permutation of the set $[n] = \{1, 2, \dots, n\}$ is a bijection of $[n]$. Let $\sigma$ be a permutation of $[n]$. We write this as
$$\sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma(1) & \sigma(2) & \cdots & \sigma(n) \end{pmatrix}.$$
Let the set of permutations of $[n]$ be denoted by $S_n$. For any $\sigma \in S_n$ we define the permutation matrix
$$A_\sigma = [e_{\sigma(1)}, e_{\sigma(2)}, \dots, e_{\sigma(n)}].$$
If $\tau \in S_n$ then for all $k = 1, 2, \dots, n$,
$$A_{\sigma\tau}(e_k) = e_{\sigma\tau(k)} \quad \text{and} \quad A_\sigma A_\tau (e_k) = A_\sigma e_{\tau(k)} = e_{\sigma\tau(k)}.$$
Hence $A_{\sigma\tau} = A_\sigma A_\tau$.

Definition 1.13. The signature $\epsilon(\sigma)$ of a permutation $\sigma \in S_n$ is defined by $\epsilon(\sigma) = \det A_\sigma$. Since $A_\sigma$ is obtained from the identity matrix by permuting the rows, $\epsilon(\sigma) = \pm 1$.

Lemma 1.14. For all $\sigma, \tau \in S_n$, $\epsilon(\sigma\tau) = \epsilon(\sigma)\epsilon(\tau)$.

Proof. $\epsilon(\sigma\tau) = \det A_{\sigma\tau} = \det A_\sigma \det A_\tau = \epsilon(\sigma)\epsilon(\tau)$. $\Box$

Definition 1.15. A permutation $\sigma$ is called a transposition if there exist $i \neq j$ in $[n]$ such that $\sigma(i) = j$, $\sigma(j) = i$ and $\sigma(k) = k$ for all $k \in [n] \setminus \{i, j\}$. In this case we write $\sigma = (ij)$.

Theorem 1.16. Every permutation in $S_n$ is a product of transpositions.

Proof. Let $\sigma \in S_n$. Suppose $\sigma(n) = n$. Then $\sigma$ restricts to a permutation of $[n-1]$, which by induction on $n$ is a product of transpositions. If $\sigma(n) = i \neq n$, then $(in)\sigma(n) = n$. Hence by the previous case $(in)\sigma$ is a product of transpositions, and therefore so is $\sigma = (in)(in)\sigma$. $\Box$

Definition 1.17. A permutation $\sigma$ is called even (resp. odd) if $\epsilon(\sigma) = 1$ (resp. $-1$). Since $\epsilon(\sigma\tau) = \epsilon(\sigma)\epsilon(\tau)$, a product of even permutations is even, a product of an odd and an even permutation is odd, and a product of two odd permutations is even. A transposition is an odd permutation. Hence a permutation is even if and only if it is a product of an even number of transpositions, and odd if and only if it is a product of an odd number of transpositions.

Theorem 1.18. Let $F$ be a field and $A = (a_{ij}) \in F^{n \times n}$. Then
$$\det A = \sum_{\sigma \in S_n} \epsilon(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)}.$$
Proof. Let $a_i$ denote the $i$th row vector of $A$. Then
$$\det(a_1, a_2, \dots, a_n) = \det\Big(\sum_{j=1}^{n} a_{1j} e_j,\, a_2, a_3, \dots, a_n\Big) = \sum_{j=1}^{n} a_{1j} \det(e_j, a_2, a_3, \dots, a_n)$$
$$= \sum_{j_1=1}^{n} \sum_{j_2=1}^{n} a_{1j_1} a_{2j_2} \det(e_{j_1}, e_{j_2}, a_3, \dots, a_n)$$
$$= \cdots = \sum_{j_1=1}^{n} \sum_{j_2=1}^{n} \cdots \sum_{j_n=1}^{n} a_{1j_1} a_{2j_2} \cdots a_{nj_n} \det(e_{j_1}, e_{j_2}, \dots, e_{j_n}).$$
A term on the right vanishes unless $j_1, j_2, \dots, j_n$ are distinct, that is, unless $j_k = \sigma(k)$ for some $\sigma \in S_n$, in which case $\det(e_{\sigma(1)}, \dots, e_{\sigma(n)}) = \epsilon(\sigma)$. Hence
$$\det A = \sum_{\sigma \in S_n} \epsilon(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)}. \qquad \Box$$

Determinant and rank of a matrix. Finally we discuss the rank of a matrix in terms of determinants. An $r \times r$ minor of an $m \times n$ matrix $A$ is the determinant of a submatrix obtained by deleting $m - r$ rows and $n - r$ columns of $A$.

Lemma 1.19. Let $F$ be a field and $A \in F^{n \times n}$. Then rank$(A) = n$ if and only if $\det A \neq 0$.

Proof. Let rank$(A) = n$. Then by row and column operations $A$ can be reduced to the identity matrix. Since the nonvanishing of $\det A$ is unchanged under elementary row and column operations, we conclude that $\det A \neq 0$. Conversely, let $\det A \neq 0$. If rank$(A) < n$ then the column vectors of $A$ are linearly dependent: $b_1 A^1 + b_2 A^2 + \cdots + b_n A^n = 0$ for some scalars $b_1, b_2, \dots, b_n$, not all zero. Then $Ab = 0$ where $b = (b_1, b_2, \dots, b_n)^t$. Since $\det A \neq 0$, $A$ is invertible, hence $b = 0$, which is a contradiction. $\Box$

Theorem 1.20. The rank of an $m \times n$ matrix $A$ is $r$ if and only if $A$ has a nonzero $r \times r$ minor and all $(r+k) \times (r+k)$ minors of $A$ are zero for $k = 1, 2, \dots$.

Proof. We say that det-rank$(A) = r$ if the condition on the minors stated in the theorem is satisfied. Let rank$(A) = r$. Then there exist $r$ linearly independent rows of $A$, and any $s$ rows of $A$ with $s > r$ are linearly dependent. We may then form an $r \times n$ submatrix $B$ of $A$ whose rows are linearly independent. Hence rank$(B) = r$, so we can find $r$ linearly independent columns of $B$. Hence there is a nonzero $r \times r$ minor of $A$. Hence det-rank$(A) \ge r = $ rank$(A)$.
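Theorem 1.18 can also be checked computationally. In the sketch below (our naming), the signature is computed by counting inversions, using the fact from Definition 1.17 that parity of the number of transpositions determines $\epsilon(\sigma)$; permutations are 0-based tuples:

```python
from itertools import permutations

def signature(sigma):
    """epsilon(sigma): (-1)^(number of inversions), since each inversion
    can be removed by one transposition of adjacent entries."""
    n = len(sigma)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if sigma[i] > sigma[j])
    return -1 if inversions % 2 else 1

def det_leibniz(a):
    """Theorem 1.18: det A = sum over sigma in S_n of
    eps(sigma) * a_{1,sigma(1)} * ... * a_{n,sigma(n)} (0-based here)."""
    n = len(a)
    total = 0
    for sigma in permutations(range(n)):
        term = signature(sigma)
        for i in range(n):
            term *= a[i][sigma[i]]
        total += term
    return total
```

Like expansion by minors, this sums $n!$ terms, so it is a definition made executable rather than a practical algorithm.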
Conversely, let det-rank$(A) = r$. Then there exists an $r \times r$ submatrix of $A$ having nonzero determinant. Hence the rows of $A$ that contain this submatrix are linearly independent. Therefore rank$(A) \ge $ det-rank$(A)$. $\Box$

Determinant of a linear map.

Definition 1.21. Let $T : V \to V$ be a linear operator on an $n$-dimensional vector space $V$. Let $B$ be a basis of $V$. Then the determinant $\det T$ of $T$ is defined to be $\det(M_B^B(T))$.

Note that the determinant of a linear operator on $V$ is well-defined. Indeed, if $C$ is another basis of $V$, then we know that $M_C^C(T) = P^{-1} A P$ for an invertible matrix $P$, where $A = M_B^B(T)$. Hence
$$\det M_C^C(T) = \det(P^{-1}) \det A \det P = \det A.$$

Theorem 1.22. Let $T$ be a linear operator on an $n$-dimensional vector space $V$. Then $T$ is invertible if and only if $\det T \neq 0$.

Proof. Let $T$ be invertible. Then there is a linear map $S : V \to V$ such that $TS = ST = I$. Let $M(T)$ denote the matrix of $T$ with respect to a basis $B$ of $V$. Then $M(T) M(S) = I$, so $\det M(T) \det M(S) = 1$. Hence $\det T = \det M(T) \neq 0$. Conversely, let $\det T \neq 0$. Then $\det M(T) \neq 0$, so rank $M(T) = $ rank $T = n$. Thus $T$ is onto. Since $V$ is finite-dimensional, $T$ is invertible. $\Box$

2. Orthogonal projections, best approximations and least squares

Let $V$ be a finite-dimensional inner product space. We have seen how to project a vector onto a nonzero vector. We now discuss the orthogonal projection of a vector onto a subspace. Let $W$ be a nonzero subspace of $V$. The orthogonal complement $W^\perp$ of $W$ is defined as
$$W^\perp = \{u \in V : u \perp w \text{ for all } w \in W\}.$$

Theorem 2.1. Every $v \in V$ can be written uniquely as $v = x + y$, where $x \in W$ and $y \in W^\perp$.

Proof. (Existence) Let $\{v_1, v_2, \dots, v_k\}$ be an orthonormal basis of $W$. Set
$$x = \langle v, v_1 \rangle v_1 + \langle v, v_2 \rangle v_2 + \cdots + \langle v, v_k \rangle v_k$$
and put $y = v - x$. Clearly $v = x + y$ and $x \in W$. We now check that $y \in W^\perp$. For $i = 1, 2, \dots, k$ we have
$$\langle y, v_i \rangle = \langle v - x, v_i \rangle = \langle v, v_i \rangle - \langle x, v_i \rangle = \langle v, v_i \rangle - \sum_{j=1}^{k} \langle v, v_j \rangle \langle v_j, v_i \rangle = \langle v, v_i \rangle - \langle v, v_i \rangle = 0,$$
by orthonormality. It follows that $y \in W^\perp$.
(Uniqueness) Let $v = x + y = x' + y'$, where $x, x' \in W$ and $y, y' \in W^\perp$. Then $x - x' = y' - y \in W \cap W^\perp$. But $W \cap W^\perp = \{0\}$. Hence $x = x'$ and $y = y'$. $\Box$

Exercise 2.2. Show that $\dim W + \dim W^\perp = \dim V$.

Definition 2.3. For a subspace $W$, we define a function $p_W : V \to W$ as follows: given $v \in V$, express $v$ uniquely as $v = x + y$, where $x \in W$ and $y \in W^\perp$, and define $p_W(v) = x$. We call $p_W(v)$ the orthogonal projection of $v$ onto $W$. Note that $v - p_W(v) \in W^\perp$.

Definition 2.4. Let $W$ be a subspace of $V$ and let $v \in V$. A best approximation to $v$ by vectors in $W$ is a vector $u \in W$ such that
$$\|v - u\| \le \|v - w\| \quad \text{for all } w \in W.$$

The next result shows that the orthogonal projection gives the unique best approximation.

Theorem 2.5. Let $v \in V$ and let $W$ be a subspace of $V$. Let $w \in W$. Then the following are equivalent:
(1) $w$ is a best approximation to $v$ by vectors in $W$;
(2) $w = p_W(v)$;
(3) $v - w \in W^\perp$.

Proof. We have
$$\|v - w\|^2 = \|v - p_W(v) + p_W(v) - w\|^2 = \|v - p_W(v)\|^2 + \|p_W(v) - w\|^2,$$
where the second equality follows from the Pythagoras theorem, on noting that $p_W(v) - w \in W$ and $v - p_W(v) \in W^\perp$. It follows that (1) and (2) are equivalent. To see the equivalence of (2) and (3), write $v = w + (v - w)$ and apply the uniqueness part of Theorem 2.1. $\Box$

Example 2.6. Let $V = C[0, 2\pi] = \{f : [0, 2\pi] \to \mathbb{R} : f \text{ is continuous}\}$. Consider the trigonometric functions
$$u_0 = 1, \qquad u_{2n-1} = \cos nx, \qquad u_{2n} = \sin nx, \qquad n = 1, 2, \dots
$$
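The existence proof of Theorem 2.1 is constructive once an orthonormal basis of $W$ is in hand, and Gram-Schmidt supplies one from any independent spanning set. A minimal floating-point sketch in $\mathbb{R}^n$ with the standard inner product (function names are ours; the spanning set is assumed linearly independent):

```python
import math

def gram_schmidt(vectors):
    """Orthonormalize a linearly independent list of R^n vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for u in basis:
            c = sum(wi * ui for wi, ui in zip(w, u))   # <w, u>
            w = [wi - c * ui for wi, ui in zip(w, u)]  # subtract the projection
        norm = math.sqrt(sum(wi * wi for wi in w))
        basis.append([wi / norm for wi in w])
    return basis

def project(v, spanning):
    """p_W(v) = sum_i <v, v_i> v_i over an orthonormal basis of W (Theorem 2.1)."""
    x = [0.0] * len(v)
    for u in gram_schmidt(spanning):
        c = sum(vi * ui for vi, ui in zip(v, u))
        x = [xi + c * ui for xi, ui in zip(x, u)]
    return x
```

By Theorem 2.5, the residual `v - project(v, spanning)` is orthogonal to every vector of the spanning set.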
The functions in the subspace $T$ generated by the $u_n$, $n = 0, 1, 2, \dots$, are called trigonometric polynomials. The vector space $V$ is an inner product space with the inner product
$$\langle f(x), g(x) \rangle = \int_0^{2\pi} f(x) g(x)\, dx.$$
It is easy to check that $u_m \perp u_n$ for all $m \neq n$. Check that
$$\langle u_0, u_0 \rangle = \int_0^{2\pi} dx = 2\pi, \qquad \langle \cos nx, \cos nx \rangle = \int_0^{2\pi} \cos^2 nx\, dx = \pi, \qquad \langle \sin nx, \sin nx \rangle = \int_0^{2\pi} \sin^2 nx\, dx = \pi.$$
Thus the set
$$B_m = \left\{ \phi_0 = \frac{1}{\sqrt{2\pi}},\ \phi_{2n-1}(x) = \frac{\cos nx}{\sqrt{\pi}},\ \phi_{2n}(x) = \frac{\sin nx}{\sqrt{\pi}},\ n = 1, 2, \dots, m \right\}$$
is an orthonormal basis of the subspace $W = L(B_m)$. For a function $f \in V$ the best approximation by trigonometric polynomials in $W$ is given by
$$p_W(f) = \sum_{k=0}^{2m} \langle f, \phi_k \rangle \phi_k.$$
The real numbers $\langle f, \phi_k \rangle$, $k = 0, 1, \dots$, are called the Fourier coefficients in honour of the French mathematician Joseph Fourier (1768-1830), who was led to them in his study of the heat equation.

Projection of a vector onto the column space of a matrix. Let us now consider projection from the matrix point of view. Consider $\mathbb{R}^n$ with the standard inner product. Let $A$ be an $n \times m$ ($m \le n$) matrix and let $b \in \mathbb{R}^n$. We want to project $b$ onto the column space of $A$. The projection of $b$ onto the column space of $A$ will be a vector of the form $p = Ax$ for some $x \in \mathbb{R}^m$. From Theorem 2.5, $p$ is the projection if and only if $b - Ax$ is orthogonal to every column of $A$. In other words, $x$ should satisfy the equations
$$A^t(b - Ax) = 0, \qquad \text{or} \qquad A^t A x = A^t b.$$
The above equations are called normal equations in Gauss-Markov theory in statistics.

Lemma 2.7. rank$(A^t A) = $ rank $A$ and nullity$(A^t A) = $ nullity$(A)$. In particular, if the columns of $A$ are linearly independent, then $A^t A$ is an invertible matrix.

Proof. Let $A^t A z = 0$ for $z \in \mathbb{R}^m$. Then $A^t w = 0$, where $w = Az$. Now $w$ is in the column space of $A$ and is orthogonal to every column of $A$. This implies that $w = 0$. Thus $N(A) = N(A^t A)$, where $N(B)$ denotes the nullspace of a matrix $B$. Let $n(B)$ denote the nullity of $B$. By the rank-nullity theorem,
$$\operatorname{rank}(A^t A) + n(A^t A) = m = \operatorname{rank}(A) + n(A).$$
Hence rank$(A^t A) = $ rank$(A)$. Thus if the columns of $A$ are linearly independent then
the $m \times m$ matrix $A^t A$ has rank $m$; hence it is invertible. $\Box$

If the columns of $A$ are linearly independent, the (unique) solution of the normal equations is $x = (A^t A)^{-1} A^t b$, and the projection of $b$ onto the column space of $A$ is $A(A^t A)^{-1} A^t b$. Note that the normal equations always have a solution (why?), although the solution will not be unique in case the columns of $A$ are linearly dependent.
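Setting up and solving the normal equations is mechanical. Here is a small sketch (our naming; exact arithmetic via `fractions`, and a $2 \times 2$ solver by Cramer's rule since the examples in these notes have two independent columns):

```python
from fractions import Fraction

def normal_equations(a, b):
    """Return (A^t A, A^t b) for the normal equations A^t A x = A^t b."""
    n, m = len(a), len(a[0])
    ata = [[sum(Fraction(a[i][p]) * a[i][q] for i in range(n)) for q in range(m)]
           for p in range(m)]
    atb = [sum(Fraction(a[i][p]) * b[i] for i in range(n)) for p in range(m)]
    return ata, atb

def solve2(m, r):
    """Solve a 2x2 system by Cramer's rule; assumes the matrix is invertible."""
    d = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(r[0] * m[1][1] - m[0][1] * r[1]) / d,
            (m[0][0] * r[1] - r[0] * m[1][0]) / d]
```

Whatever the data, the computed residual $b - Ax$ should be orthogonal to every column of $A$, which is an easy sanity check on any implementation.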
Example 2.8. Let
$$A = \begin{bmatrix} 1 & 1 \\ 1 & 0 \\ 0 & -1 \end{bmatrix} \qquad \text{and} \qquad b = \begin{bmatrix} 1 \\ 0 \\ 5 \end{bmatrix}.$$
Then
$$A^t A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \qquad \text{and} \qquad A^t b = (1, -4)^t.$$
The unique solution of the normal equations is $x = (2, -3)^t$, and $b - Ax = (2, -2, 2)^t$ (note that this vector is orthogonal to the columns of $A$).

Now let
$$B = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 0 & 1/2 \\ 0 & -1 & -1/2 \end{bmatrix}.$$
We have
$$B^t B = \begin{bmatrix} 2 & 1 & 3/2 \\ 1 & 2 & 3/2 \\ 3/2 & 3/2 & 3/2 \end{bmatrix} \qquad \text{and} \qquad B^t b = (1, -4, -3/2)^t.$$
Note that $A$ and $B$ have the same column space (the third column of $B$ is the average of the first two columns). So the projection of $b$ onto the column space of $B$ is the same as before. However, the normal equations do not have a unique solution in this case: check that $x = (2, -3, 0)^t$ and $x = (3, -2, -2)^t$ are both solutions of the normal equations $B^t B x = B^t b$.

Gauss's least squares method. Suppose we have a large number of data points $(x_i, y_i)$, $i = 1, 2, \dots, n$, collected from some experiment. Frequently there is reason to believe that these points should lie on a straight line, so we want a linear function $y(x) = s + tx$ such that $y(x_i) = y_i$, $i = 1, \dots, n$. Due to uncertainty in the data and experimental error, in practice the points will deviate somewhat from a straight line, and so it is impossible to find a linear $y(x)$ that passes through all of them. So we seek a line that fits the data well, in the sense that the errors are made as small as possible. A natural question that arises now is: how do we define the error?
Consider the following system of linear equations in the variables $s$ and $t$, with known coefficients $x_i, y_i$, $i = 1, \dots, n$:
$$s + x_1 t = y_1, \quad s + x_2 t = y_2, \quad \dots, \quad s + x_n t = y_n.$$
Note that typically $n$ would be much greater than $2$. If we can find $s$ and $t$ satisfying all these equations, then we have solved our problem. However, for the reasons mentioned above, this is not always possible. For given values of $s$ and $t$, the error in the $i$th equation is $y_i - s - x_i t$. There are several ways of combining the errors in the individual equations to get a measure of the total error. The following are three examples:
$$\sum_{i=1}^{n} (y_i - s - x_i t)^2, \qquad \sum_{i=1}^{n} |y_i - s - x_i t|, \qquad \max_{1 \le i \le n} |y_i - s - x_i t|.$$
Both analytically and computationally, a nice theory exists for the first of these choices, and this is what we shall study. The problem of finding $s, t$ so as to minimize $\sum_{i=1}^{n} (y_i - s - x_i t)^2$ is called a least squares problem.
Let
$$A = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}, \qquad b = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \qquad x = \begin{bmatrix} s \\ t \end{bmatrix}.$$
The least squares problem is to find an $x$ such that $\|b - Ax\|$ is minimized, i.e., to find an $x$ such that $Ax$ is the best approximation to $b$ in the column space of $A$. This is precisely the problem of projecting $b$ onto the column space of $A$.

A straight line can be considered as a polynomial of degree $1$. We can also try to fit an $m$th degree polynomial $y(x) = s_0 + s_1 x + s_2 x^2 + \cdots + s_m x^m$ to the data points $(x_i, y_i)$, $i = 1, \dots, n$, so as to minimize the error (in the least squares sense). In this case $s_0, s_1, \dots, s_m$ are the variables and we have
$$A = \begin{bmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^m \\ 1 & x_2 & x_2^2 & \cdots & x_2^m \\ \vdots & & & & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^m \end{bmatrix}, \qquad b = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \qquad x = \begin{bmatrix} s_0 \\ s_1 \\ \vdots \\ s_m \end{bmatrix}.$$

Example 2.9. Find $C, D$ such that the straight line $b = C + Dt$ best fits the following data in the least squares sense: $b = 1$ at $t = -1$, $b = 1$ at $t = 1$, $b = 3$ at $t = 2$. We want to project $b = (1, 1, 3)^t$ onto the column space of
$$A = \begin{bmatrix} 1 & -1 \\ 1 & 1 \\ 1 & 2 \end{bmatrix}.$$
Now
$$A^t A = \begin{bmatrix} 3 & 2 \\ 2 & 6 \end{bmatrix} \qquad \text{and} \qquad A^t b = (5, 6)^t.$$
The normal equations are
$$\begin{bmatrix} 3 & 2 \\ 2 & 6 \end{bmatrix} \begin{bmatrix} C \\ D \end{bmatrix} = \begin{bmatrix} 5 \\ 6 \end{bmatrix}.$$
The solution is $C = 9/7$, $D = 4/7$, and the best line is $b = \frac{9}{7} + \frac{4}{7} t$.
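The whole line-fitting recipe fits in a few lines of code. A sketch (our naming), which forms $A^t A$ and $A^t b$ for the design matrix with rows $(1, t_i)$ and solves the resulting $2 \times 2$ normal equations exactly:

```python
from fractions import Fraction

def fit_line(ts, bs):
    """Least squares line b = C + D*t via the normal equations
    A^t A x = A^t b, where A has rows (1, t_i)."""
    n = len(ts)
    s_t = sum(ts)                       # sum of t_i
    s_tt = sum(t * t for t in ts)       # sum of t_i^2
    s_b = sum(bs)                       # sum of b_i
    s_tb = sum(t * b for t, b in zip(ts, bs))
    # A^t A = [[n, s_t], [s_t, s_tt]], A^t b = (s_b, s_tb); solve by Cramer
    det = Fraction(n * s_tt - s_t * s_t)
    c = (s_b * s_tt - s_t * s_tb) / det
    d = (n * s_tb - s_t * s_b) / det
    return c, d
```

Running it on the data of Example 2.9 reproduces $C = 9/7$ and $D = 4/7$.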