Introduction to systems of linear equations


1 Introduction to systems of linear equations

These slides are based on Section 1.1 in Linear Algebra and Its Applications by David C. Lay.

Definition 1. A linear equation in the variables x_1, ..., x_n is an equation that can be written as

    a_1 x_1 + a_2 x_2 + ... + a_n x_n = b.

Example 2. Which of the following equations are linear?

    4x_1 - 5x_2 + 2 = x_1             linear: 3x_1 - 5x_2 = -2
    x_2 = 2(sqrt(6) - x_1) + x_3      linear: 2x_1 + x_2 - x_3 = 2 sqrt(6)
    4x_1 - 6x_2 = x_1 x_2             not linear: x_1 x_2
    x_2 = 2 sqrt(x_1) - 7             not linear: sqrt(x_1)

Definition 3. A system of linear equations (or a linear system) is a collection of one or more linear equations involving the same set of variables, say, x_1, x_2, ..., x_n. A solution of a linear system is a list (s_1, s_2, ..., s_n) of numbers that makes each equation in the system true when the values s_1, s_2, ..., s_n are substituted for x_1, x_2, ..., x_n, respectively.

Example 4. (Two equations in two variables) In each case, sketch the set of all solutions. Three things can happen: the two lines intersect in a single point (a unique solution), the two lines are parallel (no solution), or both equations describe the same line (infinitely many solutions).

Theorem 5. A linear system has either no solution, or one unique solution, or infinitely many solutions.

Definition 6. A system is consistent if a solution exists.

2 How to solve systems of linear equations

Strategy: replace the system with an equivalent system which is easier to solve.

Definition 7. Linear systems are equivalent if they have the same set of solutions.

Example 8. To solve the first system from the previous example, we eliminate x_1 from the second equation by a replacement operation R2 -> R2 + cR1 for a suitable c. Once the system is in this triangular form, we find the solutions by back-substitution: the last equation determines x_2, and substituting that value into the first equation determines x_1.

Example 9. The same approach works for more complicated systems.

     x_1 - 2x_2 +  x_3 =  0
           2x_2 - 8x_3 =  8
    -4x_1 + 5x_2 + 9x_3 = -9

R3 -> R3 + 4R1:

     x_1 - 2x_2 +  x_3 =  0
           2x_2 - 8x_3 =  8
          -3x_2 + 13x_3 = -9

R3 -> R3 + (3/2)R2:

     x_1 - 2x_2 + x_3 = 0
           2x_2 - 8x_3 = 8
                   x_3 = 3

By back-substitution: x_3 = 3, x_2 = 16, x_1 = 29. It is always a good idea to check our answer. Let us check that (29, 16, 3) indeed solves the original system:

     29 - 2(16) + 3 = 0
          2(16) - 8(3) = 8
    -4(29) + 5(16) + 9(3) = -9

Matrix notation: the essential information of a system is recorded in its coefficient matrix (the coefficients of the variables, row by row) and its augmented matrix (the coefficient matrix with the right-hand sides appended as an extra column).
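The elimination-plus-back-substitution procedure above can be sketched in code. This is a minimal sketch (not the course's official algorithm): it assumes integer data and that no row exchanges are needed (every pivot is nonzero), and uses exact fractions to avoid rounding.

```python
from fractions import Fraction

def gaussian_solve(A, b):
    """Solve Ax = b by forward elimination and back-substitution.
    Assumes every pivot encountered is nonzero (no row exchanges)."""
    n = len(A)
    # Work on the augmented matrix [A | b] with exact fractions.
    M = [[Fraction(A[i][j]) for j in range(n)] + [Fraction(b[i])] for i in range(n)]
    # Forward elimination: create zeros below each pivot.
    for k in range(n):
        for i in range(k + 1, n):
            factor = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= factor * M[k][j]   # R_i -> R_i - factor * R_k
    # Back-substitution: solve for the unknowns from the bottom up.
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

# The system from Example 9, whose solution is (29, 16, 3).
A = [[1, -2, 1], [0, 2, -8], [-4, 5, 9]]
b = [0, 8, -9]
print(gaussian_solve(A, b))  # [Fraction(29, 1), Fraction(16, 1), Fraction(3, 1)]
```

The two inner loops mirror the two phases on the slide: the replacement operations R3 -> R3 + 4R1 and R3 -> R3 + (3/2)R2 are exactly what the `factor` updates perform.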

3 Definition 10. An elementary row operation is one of the following:

    (replacement) Replace one row by the sum of itself and a multiple of another row.
    (interchange) Interchange two rows.
    (scaling) Multiply all entries in a row by a nonzero constant.

Definition 11. Two matrices are row equivalent if one matrix can be transformed into the other matrix by a sequence of elementary row operations.

Theorem 12. If the augmented matrices of two linear systems are row equivalent, then the two systems have the same solution set.

Example 13. Here is the previous example in matrix notation.

     x_1 - 2x_2 +  x_3 =  0        [  1 -2  1 |  0 ]
           2x_2 - 8x_3 =  8        [  0  2 -8 |  8 ]
    -4x_1 + 5x_2 + 9x_3 = -9       [ -4  5  9 | -9 ]

R3 -> R3 + 4R1:

     x_1 - 2x_2 +  x_3 = 0         [  1 -2  1 |  0 ]
           2x_2 - 8x_3 = 8         [  0  2 -8 |  8 ]
          -3x_2 + 13x_3 = -9       [  0 -3 13 | -9 ]

R3 -> R3 + (3/2)R2:

     x_1 - 2x_2 + x_3 = 0          [  1 -2  1 | 0 ]
           2x_2 - 8x_3 = 8         [  0  2 -8 | 8 ]
                   x_3 = 3         [  0  0  1 | 3 ]

Instead of back-substitution, we can continue with row operations. After R2 -> R2 + 8R3 and R1 -> R1 - R3, we obtain:

    x_1 - 2x_2 = -3
          2x_2 = 32
           x_3 = 3

Finally, R1 -> R1 + R2 and R2 -> (1/2)R2 result in:

    x_1 = 29
    x_2 = 16
    x_3 = 3

We again find the solution (x_1, x_2, x_3) = (29, 16, 3).

4 Row reduction and echelon forms

Definition 14. A matrix is in echelon form (or row echelon form) if:

(1) All nonzero rows are above any rows of all zeros.
(2) Each leading entry (i.e. leftmost nonzero entry) of a row is in a column to the right of the leading entry of the row above it.
(3) All entries in a column below a leading entry are zero.

Example 15. Here is a representative matrix in echelon form (* stands for any value, and # for any nonzero value):

    [ # * * * * ]
    [ 0 # * * * ]
    [ 0 0 0 # * ]
    [ 0 0 0 0 0 ]

Example 16. Are the following matrices in echelon form?

(a) YES
(b) NOPE (but it is after exchanging the first two rows)
(c) YES
(d) NO

Related and extra material

In our textbook: parts of the corresponding sections (just pages 78 and 79). However, I would suggest waiting a bit before reading through these parts (say, until we have covered things like matrix multiplication in class). Suggested practice exercises: see the exercises from that section.

5 Pre-lecture trivia

Who are these four? Artur Avila, Manjul Bhargava, Martin Hairer, Maryam Mirzakhani. They just won the Fields Medal!

- the analog of the Nobel prize in mathematics
- awarded every four years
- winners have to be younger than 40
- cash prize: 15,000 C$

6 Review

Each linear system corresponds to an augmented matrix: its rows list the coefficients of each equation, with the right-hand sides appended as a final column.

To solve a system, we perform row reduction on its augmented matrix (replacement operations of the form R_i -> R_i + cR_j) until we reach an echelon form.

Echelon form in general (* stands for any value, # for a nonzero value):

    [ # * * * * ]
    [ 0 # * * * ]
    [ 0 0 0 # * ]
    [ 0 0 0 0 0 ]

The leading terms in each row are the pivots.

7 Row reduction and echelon forms, continued

Definition 1. A matrix is in reduced echelon form if, in addition to being in echelon form, it also satisfies:

- Each pivot is 1.
- Each pivot is the only nonzero entry in its column.

Example 2. Our initial matrix in echelon form put into reduced echelon form: note that, to be in reduced echelon form, the pivots also have to be scaled to 1.

Example 3. Are the following matrices in reduced echelon form?

(a) YES
(b) NO
(c) NO

Theorem 4. (Uniqueness of the reduced echelon form) Each matrix is row equivalent to one and only one reduced echelon matrix.

Question. Is the same statement true for the echelon form? Clearly not; for instance,

    [ 1 0 ]        [ 1 1 ]
    [ 0 1 ]        [ 0 1 ]

are different row equivalent echelon forms.

8 Example 5. Row reduce to echelon form (often called Gaussian elimination) and then to reduced echelon form (often called Gauss-Jordan elimination).

Solution. We first interchange two rows to bring a usable pivot to the top (another row exchange would be an option; try it!). Then, replacement operations of the form R_i -> R_i + cR_j produce zeros below the first pivot, and one more replacement below the second pivot produces the echelon form.

To get the reduced echelon form, we first scale all rows so that each pivot becomes 1. Then, working from right to left, replacement operations clear the entries above each pivot; the final replacement produces the reduced echelon form.
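The full Gauss-Jordan pass (pivot search, interchange, scaling, clearing above and below) can be sketched as a short routine. This is a minimal sketch using exact fractions, not the textbook's pseudocode; the sample input is the augmented matrix from the earlier three-equation example.

```python
from fractions import Fraction

def rref(M):
    """Row reduce M to reduced echelon form (Gauss-Jordan elimination)."""
    M = [[Fraction(v) for v in row] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0  # next pivot row
    for c in range(cols):
        # Find a pivot in column c, at or below row r.
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue                              # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]           # interchange
        M[r] = [v / M[r][c] for v in M[r]]        # scale the pivot to 1
        for i in range(rows):                     # clear the rest of the column
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        r += 1
    return M

# Augmented matrix of the system solved earlier; RREF reveals (29, 16, 3).
M = rref([[1, -2, 1, 0], [0, 2, -8, 8], [-4, 5, 9, -9]])
print(M)  # [[1, 0, 0, 29], [0, 1, 0, 16], [0, 0, 1, 3]] (entries as Fractions)
```

Because the reduced echelon form is unique (Theorem 4 above), this routine returns the same matrix no matter in which order equivalent row operations would have been applied.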

9 Solution of linear systems via row reduction

After row reduction to echelon form, we can easily solve a linear system (especially after reduction to reduced echelon form).

Example 6. Consider a system in x_1, ..., x_5 whose augmented matrix is already in reduced echelon form, with pivots located in columns 1, 3, 5; say,

    x_1 + 6x_2        + 3x_4 = 0
               x_3 - 8x_4 = 5
                        x_5 = 7

The variables x_1, x_3, x_5 corresponding to pivot columns are called pivot variables (or basic variables). The remaining variables x_2, x_4 are called free variables. We can solve each equation for the pivot variables in terms of the free variables (if any). Here, we get:

    x_1 = -6x_2 - 3x_4
    x_2 free
    x_3 = 5 + 8x_4
    x_4 free
    x_5 = 7

This is the general solution of this system. The solution is in parametric form, with parameters given by the free variables. Just to make sure: Is the above system consistent? Does it have a unique solution?

Example 7. Find a parametric description of the solution set of:

    3x_2 - 6x_3 + 6x_4 + 4x_5 = -5
    3x_1 - 7x_2 + 8x_3 - 5x_4 + 8x_5 = 9
    3x_1 - 9x_2 + 12x_3 - 9x_4 + 6x_5 = 15

Solution. We determined earlier that the reduced echelon form of the augmented matrix is

    [ 1 0 -2 3 0 | -24 ]
    [ 0 1 -2 2 0 |  -7 ]
    [ 0 0  0 0 1 |   4 ]

10 The pivot variables are x_1, x_2, x_5. The free variables are x_3, x_4. Hence, we find the general solution as:

    x_1 = -24 + 2x_3 - 3x_4
    x_2 = -7 + 2x_3 - 2x_4
    x_3 free
    x_4 free
    x_5 = 4

Related and extra material

In our textbook: still, parts of the corresponding sections (just pages 78 and 79). As before, I would suggest waiting a bit before reading through these parts (say, until we have covered things like matrix multiplication in class). Suggested practice exercises: see the corresponding sections (only reduce the matrices A, B to echelon form).

11 Review

We have a standardized recipe to find all solutions of systems such as:

    3x_2 - 6x_3 + 6x_4 + 4x_5 = -5
    3x_1 - 7x_2 + 8x_3 - 5x_4 + 8x_5 = 9
    3x_1 - 9x_2 + 12x_3 - 9x_4 + 6x_5 = 15

The computational part is to start with the augmented matrix and to calculate its reduced echelon form (which is unique!). Here:

    [ 1 0 -2 3 0 | -24 ]
    [ 0 1 -2 2 0 |  -7 ]
    [ 0 0  0 0 1 |   4 ]

Pivot variables (or basic variables): x_1, x_2, x_5. Free variables: x_3, x_4. Solving each equation for the pivot variables in terms of the free variables:

    x_1 - 2x_3 + 3x_4 = -24        x_1 = -24 + 2x_3 - 3x_4
    x_2 - 2x_3 + 2x_4 = -7         x_2 = -7 + 2x_3 - 2x_4
    x_5 = 4                        x_3, x_4 free,  x_5 = 4
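A quick way to convince ourselves that the parametric solution above is right is to substitute it back into the original three equations for arbitrary values of the free variables. A minimal sketch (using the reconstructed coefficients from the review above):

```python
def check(x3, x4):
    """Substitute the general solution, with free parameters x3 and x4,
    into the three original equations and verify each right-hand side."""
    x1 = -24 + 2 * x3 - 3 * x4
    x2 = -7 + 2 * x3 - 2 * x4
    x5 = 4
    eq1 = 3 * x2 - 6 * x3 + 6 * x4 + 4 * x5
    eq2 = 3 * x1 - 7 * x2 + 8 * x3 - 5 * x4 + 8 * x5
    eq3 = 3 * x1 - 9 * x2 + 12 * x3 - 9 * x4 + 6 * x5
    return (eq1, eq2, eq3) == (-5, 9, 15)

# Every choice of the free variables must give a solution.
print(all(check(s, t) for s in range(-3, 4) for t in range(-3, 4)))  # True
```

The free variables really are free: every pair (x3, x4) produces a valid solution, which is exactly what "infinitely many solutions" means here.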

12 Questions of existence and uniqueness

The question whether a system has a solution, and whether it is unique, is easier to answer than determining the solution set. All we need is an echelon form of the augmented matrix.

Example. Is the following system consistent? If so, does it have a unique solution?

    3x_2 - 6x_3 + 6x_4 + 4x_5 = -5
    3x_1 - 7x_2 + 8x_3 - 5x_4 + 8x_5 = 9
    3x_1 - 9x_2 + 12x_3 - 9x_4 + 6x_5 = 15

Solution. In the course of an earlier example, we obtained the echelon form:

    [ 3 -9 12 -9 6 | 15 ]
    [ 0  2 -4  4 2 | -6 ]
    [ 0  0  0  0 1 |  4 ]

Hence, it is consistent (imagine doing back-substitution to get a solution). It does not have a unique solution, because there are free variables.

Theorem. (Existence and uniqueness theorem) A linear system is consistent if and only if an echelon form of the augmented matrix has no row of the form

    [ 0 ... 0 | b ],    where b is nonzero.

If a linear system is consistent, then the solutions consist of either a unique solution (when there are no free variables) or infinitely many solutions (when there is at least one free variable).

Example. For what values of h will the following system be consistent?

     3x_1 - 9x_2 = 4
    -2x_1 + 6x_2 = h

Solution. We perform row reduction to find an echelon form:

    [  3 -9 | 4 ]   R2 -> R2 + (2/3)R1    [ 3 -9 |       4 ]
    [ -2  6 | h ]   ----------------->    [ 0  0 | h + 8/3 ]

The system is consistent if and only if h = -8/3.

13 Brief summary of what we learned so far

Each linear system corresponds to an augmented matrix. Using Gaussian elimination (i.e. row reduction to echelon form) on the augmented matrix of a linear system, we can

- read off whether the system has no, one, or infinitely many solutions;
- find all solutions by back-substitution.

We can continue row reduction to the reduced echelon form. Solutions to the linear system can now be just read off. This form is unique!

Note. Besides solving linear systems, Gaussian elimination has other important uses, such as computing determinants or inverses of matrices.

A recipe to solve linear systems (Gauss-Jordan elimination)

(1) Write the augmented matrix of the system.
(2) Row reduce to obtain an equivalent augmented matrix in echelon form. Decide whether the system is consistent. If not, stop; otherwise go to the next step.
(3) Continue row reduction to obtain the reduced echelon form.
(4) Express this final matrix as a system of equations.
(5) Declare the free variables and state the solution in terms of these.

Questions to check our understanding

- On an exam, you are asked to find all solutions to a system of linear equations. You find exactly two solutions. Should you be worried? Yes, because if there is more than one solution, there have to be infinitely many solutions. Can you see how, given two solutions, one can construct infinitely many more?
- True or false? There is no more than one pivot in any row. True, because a pivot is the first nonzero entry in a row.
- True or false? There is no more than one pivot in any column. True, because in echelon form (that's where pivots are defined) the entries below a pivot have to be zero.
- True or false? There cannot be more free variables than pivot variables. False; consider, for instance, a single equation in three variables such as x_1 + x_2 + x_3 = 0, which has one pivot variable and two free variables.

14 The geometry of linear equations

Adding and scaling vectors

Example 4. We have already encountered matrices whose columns are what we call (column) vectors. In this example, each column vector has 3 entries and so lies in R^3.

Example 5. A fundamental property of vectors is that vectors of the same kind can be added and scaled:

    [x_1]   [y_1]   [x_1 + y_1]          [x_1]   [7x_1]
    [x_2] + [y_2] = [x_2 + y_2],       7 [x_2] = [7x_2]
    [x_3]   [y_3]   [x_3 + y_3]          [x_3]   [7x_3]

Example 6. (Geometric description of R^2) A vector [x_1; x_2] represents the point (x_1, x_2) in the plane. Given two concrete vectors x and y, graph x, y, x + y, and a scalar multiple of y.

15 Adding and scaling vectors: the most general thing we can do is take a linear combination.

Definition 7. Given vectors v_1, v_2, ..., v_m in R^n and scalars c_1, c_2, ..., c_m, the vector

    c_1 v_1 + c_2 v_2 + ... + c_m v_m

is a linear combination of v_1, v_2, ..., v_m. The scalars c_1, ..., c_m are the coefficients or weights.

Example 8. Linear combinations of v_1, v_2, v_3 include, for instance, 3v_1 - v_2 + 7v_3, v_2 + v_3, and the zero vector 0.

Example 9. To express a vector b as a linear combination of vectors a_1 and a_2, we have to find weights c_1 and c_2 such that

    c_1 a_1 + c_2 a_2 = b.

Writing this out componentwise, this is the same as a linear system in the unknowns c_1, c_2, which we solve as usual. Note that the augmented matrix of that linear system is [a_1 a_2 b], and that this example provides a new way of thinking about linear systems.

16 The row and column picture

Example. We can think of a linear system of two equations in two variables in two different geometric ways.

Row picture. Each equation defines a line in R^2. Which points lie on the intersection of these lines? If the system has a unique solution, that solution is the (only) intersection of the two lines.

Column picture. The system can be written as x a_1 + y a_2 = b, where a_1 and a_2 are the columns of the coefficient matrix. Which linear combinations of a_1 and a_2 produce b? If the system has a unique solution (x, y), then x a_1 + y a_2 is the (only) linear combination producing b.

17 Pre-lecture: the shocking state of our ignorance

Q: How fast can we solve N linear equations in N unknowns?

Estimated cost of Gaussian elimination:

- to create the zeros below one pivot: on the order of N^2 operations
- there are N pivots: on the order of N * N^2 = N^3 operations

A more careful count places the cost at about (1/3)N^3 operations. For large N, it is only the N^3 that matters: it says that if N increases tenfold, then we have to work 1000 times as hard.

That's not optimal! We can do better than Gaussian elimination:

- Strassen algorithm (1969): N^(log2 7) = N^2.807
- Coppersmith-Winograd algorithm (1990): N^2.375
- Stothers, Williams, Le Gall (2014): N^2.37

Is N^2 possible? We have no idea! (Better than N^2 is impossible; why? We have to look at all N^2 entries.)

Good news for applications: matrices typically have lots of structure and zeros (we will see an example soon), which makes solving so much faster.
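Strassen's trick is that a 2x2 block product can be formed with 7 block multiplications instead of 8, and recursing gives the N^(log2 7) bound. A minimal sketch (assuming N is a power of 2; this is an illustration, not a practical implementation):

```python
def strassen(A, B):
    """Multiply n x n matrices (n a power of 2) with Strassen's recursion."""
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    m = n // 2
    def quad(M):  # split M into four m x m blocks
        return ([r[:m] for r in M[:m]], [r[m:] for r in M[:m]],
                [r[:m] for r in M[m:]], [r[m:] for r in M[m:]])
    def add(X, Y):
        return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
    def sub(X, Y):
        return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
    A11, A12, A21, A22 = quad(A)
    B11, B12, B21, B22 = quad(B)
    # The seven block products (instead of the naive eight).
    M1 = strassen(add(A11, A22), add(B11, B22))
    M2 = strassen(add(A21, A22), B11)
    M3 = strassen(A11, sub(B12, B22))
    M4 = strassen(A22, sub(B21, B11))
    M5 = strassen(add(A11, A12), B22)
    M6 = strassen(sub(A21, A11), add(B11, B12))
    M7 = strassen(sub(A12, A22), add(B21, B22))
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(sub(add(M1, M3), M2), M6)
    return ([r1 + r2 for r1, r2 in zip(C11, C12)] +
            [r1 + r2 for r1, r2 in zip(C21, C22)])

print(strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

The extra additions are why the method only pays off for large N; the asymptotically faster algorithms listed above are, so far, purely theoretical.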

18 Organizational

Help sessions in AH: MW 4-6pm, TR 5-7pm.

Review

A system of two linear equations in two unknowns x, y can be written in vector form as x a_1 + y a_2 = b. The left-hand side is a linear combination of the columns a_1 and a_2 of the coefficient matrix.

The row and column picture

Example. We can think of such a linear system in two different geometric ways. Here, there is a unique solution.

Row picture. Each equation defines a line in R^2. Which points lie on the intersection of these lines? The solution is the (only) intersection of the two lines.

19 Column picture. The system can be written as x a_1 + y a_2 = b. Which linear combinations of a_1 and a_2 produce b? The solution (x, y) gives the coefficients of the (only) such linear combination.

Example. Consider vectors a_1, a_2, a_3 and b in R^3. Determine if b is a linear combination of a_1, a_2, a_3.

Solution. Vector b is a linear combination of a_1, a_2, a_3 if we can find weights x_1, x_2, x_3 such that

    x_1 a_1 + x_2 a_2 + x_3 a_3 = b.

This vector equation corresponds to a linear system in x_1, x_2, x_3 with augmented matrix [a_1 a_2 a_3 b]. Note that we are looking for a linear combination of the first three columns which

20 produces the last column. Such a combination exists if and only if the system is consistent.

Row reduction to an echelon form shows that there is no row of the form [0 0 0 | b] with b nonzero. Since this system is consistent, b is a linear combination of a_1, a_2, a_3.

Example. In the previous example, express b as a linear combination of a_1, a_2, a_3.

Solution. We compute the reduced echelon form of the augmented matrix and read off the solution (x_1, x_2, x_3), which yields x_1 a_1 + x_2 a_2 + x_3 a_3 = b.

Summary

A vector equation

    x_1 a_1 + x_2 a_2 + ... + x_m a_m = b

has the same solution set as the linear system with augmented matrix [a_1 a_2 ... a_m b]. In particular, b can be generated by a linear combination of a_1, a_2, ..., a_m if and only if this linear system is consistent.

21 The span of a set of vectors

Definition. The span of vectors v_1, v_2, ..., v_m is the set of all their linear combinations. We denote it by span{v_1, v_2, ..., v_m}. In other words, span{v_1, v_2, ..., v_m} is the set of all vectors of the form

    c_1 v_1 + c_2 v_2 + ... + c_m v_m,

where c_1, c_2, ..., c_m are scalars.

Example 5.

(a) Describe span{v} of a single nonzero vector v in R^2 geometrically. The span consists of all vectors of the form alpha v. As points in R^2, this is a line through the origin.

(b) Describe the span of two non-parallel vectors v_1, v_2 in R^2 geometrically. The span is all of R^2, a plane. That's because any vector in R^2 can be written as x_1 v_1 + x_2 v_2. We can also see this without relying on our geometric intuition: for an arbitrary vector b = [b_1; b_2], row reduction shows that the system with augmented matrix [v_1 v_2 b] is consistent, so every b is in the span.

(c) If we add to the vector in (a) a second vector that is a linear combination of it, the span does not grow: it is the same line as in (a). Again, we can see this after row reduction: for an arbitrary b = [b_1; b_2], the system is not consistent for all b; it is consistent only when b_2 is the appropriate multiple of b_1, i.e. precisely for the points b on that line.
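The consistency test used in (b) and (c) can be automated: b lies in span{v_1, ..., v_m} exactly when appending b as an extra column does not increase the rank (i.e. the number of pivots) of the matrix [v_1 ... v_m]. A minimal sketch with hypothetical sample vectors of my own choosing:

```python
from fractions import Fraction

def rank(M):
    """Number of pivots, via plain row reduction to echelon form."""
    M = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def in_span(vectors, b):
    """b is in span{v1, ..., vm} iff appending b does not create a new pivot."""
    A = [list(row) for row in zip(*vectors)]        # the vectors as columns
    Ab = [row + [bi] for row, bi in zip(A, b)]      # augmented matrix [A | b]
    return rank(A) == rank(Ab)

v1, v2 = [1, 2, 3], [1, 0, 1]
print(in_span([v1, v2], [3, 2, 5]))   # True: [3,2,5] = 1*v1 + 2*v2
print(in_span([v1, v2], [0, 0, 1]))   # False: not on the plane spanned
```

Equal ranks mean no row [0 ... 0 | nonzero] appears, which is precisely the consistency criterion from the existence and uniqueness theorem.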

22 A single (nonzero) vector always spans a line; two vectors v_1, v_2 usually span a plane, but it could also be just a line (if v_2 = alpha v_1). We will come back to this when we discuss dimension and linear independence.

Example 6. Is the span of two given vectors in R^3 a line or a plane?

Solution. The span is a plane unless v_2 = alpha v_1 for some alpha. Comparing the first entries forces one value of alpha, but that value does not work for the third entries. Hence, there is no such alpha. The span is a plane.

Example 7. Given a 3x2 matrix A and a vector b in R^3: is b in the plane spanned by the columns of A?

Solution. b is in the plane spanned by the columns of A if and only if the system with augmented matrix [A b] is consistent. To find out, we row reduce to an echelon form. From the last row, of the form [0 0 | c] with c nonzero, we see that the system is inconsistent. Hence, b is not in the plane spanned by the columns of A.

23 Conclusion and summary

The span of vectors a_1, a_2, ..., a_m is the set of all their linear combinations. Some vector b is in span{a_1, a_2, ..., a_m} if and only if there is a solution to the linear system with augmented matrix [a_1 a_2 ... a_m b]. Each solution corresponds to the weights in a linear combination of the a_1, a_2, ..., a_m which gives b.

This gives a second geometric way to think of linear systems!

24 Pre-lecture: the goal for today

We wish to write linear systems simply as Ax = b. For instance, a system of two equations in x_1, x_2 with right-hand sides b_1, b_2 becomes a matrix A times the vector x = [x_1; x_2], equaling b = [b_1; b_2].

Why? It's concise. The compactness also sparks associations and ideas! For instance:

- Can we solve by dividing by A? x = A^(-1) b?
- If Ax = b and Ay = 0, then A(x + y) = b.
- Leads to matrix calculus and deeper understanding: multiplying, inverting, or factoring matrices.

Matrix operations

Basic notation

We will use the following notations for an m x n matrix A (m rows, n columns). In terms of the columns of A:

    A = [a_1 a_2 ... a_n]

In terms of the entries of A:

    A = [ a_{1,1} a_{1,2} ... a_{1,n} ]
        [ a_{2,1} a_{2,2} ... a_{2,n} ]
        [   ...     ...   ...   ...   ]
        [ a_{m,1} a_{m,2} ... a_{m,n} ],      a_{i,j} = entry in i-th row, j-th column.

Matrices, just like vectors, are added and scaled componentwise.

Example. (a) Addition: (A + B)_{i,j} = a_{i,j} + b_{i,j}.

25 (b) Scaling: multiplying a matrix by 7 multiplies each entry by 7, i.e. (7A)_{i,j} = 7 a_{i,j}.

Matrix times vector

Recall that (x_1, x_2, ..., x_n) solves the linear system with augmented matrix

    [A b] = [a_1 a_2 ... a_n b]

if and only if

    x_1 a_1 + x_2 a_2 + ... + x_n a_n = b.

It is therefore natural to define the product of matrix times vector as

    Ax = x_1 a_1 + x_2 a_2 + ... + x_n a_n,      x = [x_1; ...; x_n].

The system of linear equations with augmented matrix [A b] can then be written in matrix form compactly as Ax = b. The product of a matrix A with a vector x is a linear combination of the columns of A with weights given by the entries of x.

Example. For a 2x2 matrix,

    [ a b ] [x_1]        [ a ]        [ b ]   [ a x_1 + b x_2 ]
    [ c d ] [x_2]  = x_1 [ c ]  + x_2 [ d ] = [ c x_1 + d x_2 ]

This illustrates that linear systems can be simply expressed as Ax = b:

    a x_1 + b x_2 = b_1              [ a b ] [x_1]   [b_1]
    c x_1 + d x_2 = b_2    <=>       [ c d ] [x_2] = [b_2]
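The "linear combination of columns" definition of Ax translates directly into code. A minimal sketch with a hypothetical 3x2 example of my own choosing:

```python
def matvec(A, x):
    """Compute Ax as x_1*a_1 + x_2*a_2 + ... + x_n*a_n (a_j = columns of A)."""
    m, n = len(A), len(A[0])
    result = [0] * m
    for j in range(n):            # for each column of A ...
        for i in range(m):        # ... add x_j times that column
            result[i] += x[j] * A[i][j]
    return result

A = [[1, 2], [3, 4], [5, 6]]
# 10 * (column 1) + 1 * (column 2):
print(matvec(A, [10, 1]))  # [12, 34, 56]
```

Summing column by column (rather than row by row) is exactly the definition on this slide; both orders of summation give the same entries.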

26 Example. Suppose A is m n and x is in R p. Under which condition does Ax make sense? We need n=p. Matrix times matrix If B has just one column b, i.e. B= b, then AB= Ab. In general, the product of matrix times matrix is given by (Go through the definition of Ax to m ake sure you see why!) AB= Ab Ab Ab p, B= b b b p. Example. (a) 5 because 5 and 5 (b) 5 = = = = = =. Each column of AB is a linear combination of the columns of A with weights given by the corresponding column of B. Remark 5. The definition of the matrix product is inevitable from the multiplication of matrix times vector and the fact that we want AB to be defined such that (AB)x= A(Bx). A(Bx) =A(x b +x b + ) =x Ab +x Ab + =(AB)x if the columns of AB are Ab,Ab, Example 6. Suppose A is m n and B is p q. (a) Under which condition does AB make sense? We need n=p. (b) What are the dimensions of AB in that case? AB is a m q matrix. (Go through the boxed characterization of AB to m ake sure you see why!)

27 Basic properties

Example 7.

(a) [1 0; 0 1][a; b] = [a; b]
(b) [1 0; 0 1][a b; c d] = [a b; c d]

This is the identity matrix.

Theorem 8. Let A, B, C be matrices of appropriate size. Then:

    A(BC) = (AB)C            (associative)
    A(B + C) = AB + AC       (left-distributive)
    (A + B)C = AC + BC       (right-distributive)

Example 9. However, matrix multiplication is not commutative! Already for 2x2 matrices, computing a product in both orders usually gives two different results.

Example 10. Also, a product can be zero even though none of the factors is; for instance,

    [ 0 1 ] [ 0 1 ]   [ 0 0 ]
    [ 0 0 ] [ 0 0 ] = [ 0 0 ]

Transpose of a matrix

Definition 11. The transpose A^T of a matrix A is the matrix whose columns are formed from the corresponding rows of A. (rows <-> columns)

Example 12.

(a) The transpose of an m x n matrix is an n x m matrix.
(b) For a column vector x = [x_1; x_2; x_3], the transpose is the row vector x^T = [x_1 x_2 x_3].

A matrix A is called symmetric if A = A^T.

28 Practice problems

True or false?

- AB has as many columns as B.
- AB has as many rows as B.

The following practice problem illustrates the rule (AB)^T = B^T A^T.

Example 13. Consider two 2x2 matrices A and B. Compute:

(a) AB
(b) (AB)^T
(c) B^T A^T
(d) A^T B^T

Comparing (b), (c), and (d): what's that fishy smell?

29 Review: matrix multiplication

Ax is a linear combination of the columns of A with weights given by the entries of x.

Ax = b is the matrix form of the linear system with augmented matrix [A b].

Each column of AB is a linear combination of the columns of A with weights given by the corresponding column of B.

Matrix multiplication is not commutative: usually, AB != BA.

A comment on lecture notes

My personal suggestion:

- before lecture: have a quick look (5 min or so) at the pre-lecture notes to see where things are going
- during lecture: take a minimal amount of notes (everything on the screens will be in the post-lecture notes) and focus on the ideas
- after lecture: go through the pre-lecture notes again and fill in all the blanks by yourself; then compare with the post-lecture notes

Since I am writing the pre-lecture notes a week ahead of time, there are usually some minor differences to the post-lecture notes.

30 Transpose of a matrix

Definition 1. The transpose A^T of a matrix A is the matrix whose columns are formed from the corresponding rows of A. (rows <-> columns)

Example 2. For a column vector x = [x_1; x_2; x_3], the transpose is the row vector x^T = [x_1 x_2 x_3].

A matrix A is called symmetric if A = A^T.

Theorem 3. Let A, B be matrices of appropriate size. Then:

    (A^T)^T = A
    (A + B)^T = A^T + B^T
    (AB)^T = B^T A^T        (illustrated by the last practice problems)

Example 4. Deduce that (ABC)^T = C^T B^T A^T.

Solution. (ABC)^T = ((AB)C)^T = C^T (AB)^T = C^T B^T A^T

31 Back to matrix multiplication

Review. Each column of AB is a linear combination of the columns of A with weights given by the corresponding column of B.

Two more ways to look at matrix multiplication

Example 5. What is the entry (AB)_{i,j} at row i and column j? The j-th column of AB is the vector A(col j of B). Entry i of that vector is (row i of A) . (col j of B). In other words:

    (AB)_{i,j} = (row i of A) . (col j of B)

Use this row-column rule to compute a small product by hand. Can you see the rule (AB)^T = B^T A^T from here? Observe the symmetry between rows and columns in this rule!

It follows that the interpretation "each column of AB is a linear combination of the columns of A with weights given by the corresponding column of B" has the counterpart "each row of AB is a linear combination of the rows of B with weights given by the corresponding row of A."

Example 6. Compute a product both ways to check this.

32 LU decomposition

Elementary matrices

Example 7.

(a) [1 0; 0 1][a b; c d] = [a b; c d]
(b) [0 1; 1 0][a b; c d] = [c d; a b]
(c) [1 0 0; 0 1 0; 0 0 1][a b c; d e f; g h i] = [a b c; d e f; g h i]
(d) [1 0 0; 0 1 0; 1 0 1][a b c; d e f; g h i] = [a b c; d e f; a+g b+h c+i]

Definition 8. An elementary matrix is one that is obtained by performing a single elementary row operation on an identity matrix.

The result of an elementary row operation on A is EA, where E is an elementary matrix (namely, the one obtained by performing the same row operation on the appropriate identity matrix).

Example 9. Elementary matrices are invertible because elementary row operations are reversible. For instance, the elementary matrix in (d) adds row 1 to row 3; it is undone by [1 0 0; 0 1 0; -1 0 1], which subtracts row 1 from row 3. (More on inverses soon.)
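The fact that "row operation on A" equals "E times A" is easy to check numerically. A minimal sketch for the replacement operation R3 -> R3 + 2R1 (the multiplier 2 and the sample matrix are my own choices):

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Elementary matrix for the row operation R3 -> R3 + 2*R1:
# perform that operation on the identity matrix.
E = identity(3)
E[2][0] = 2

A = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
print(matmul(E, A))  # [[1, 1, 1], [2, 2, 2], [5, 5, 5]]

# The inverse elementary matrix undoes it: R3 -> R3 - 2*R1.
Einv = identity(3)
Einv[2][0] = -2
print(matmul(Einv, matmul(E, A)) == A)  # True
```

The inverse of a replacement matrix simply flips the sign of the off-diagonal entry, which is the reversibility claim of Example 9 in matrix form.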

33 Practice problems

Example 10. Choose either the column or the row interpretation of matrix multiplication to see the result of the following products (two small matrix products to work out by hand).

34 Review

Example 1. Elementary matrices in action:

(a) [0 1 0; 0 0 1; 1 0 0][a b c; d e f; g h i] = [d e f; g h i; a b c]
(b) [1 0 0; 0 1 0; 0 0 7][a b c; d e f; g h i] = [a b c; d e f; 7g 7h 7i]
(c) [1 0 0; 0 1 0; 1 0 1][a b c; d e f; g h i] = [a b c; d e f; a+g b+h c+i]
(d) [a b c; d e f; g h i][1 0 0; 0 1 0; 1 0 1] = [a+c b c; d+f e f; g+i h i]

Note the difference between (c) and (d): multiplying by an elementary matrix on the left performs a row operation, while multiplying by it on the right performs the corresponding column operation.

35 LU decomposition, continued

Gaussian elimination revisited

Example 2. Keeping track of the elementary matrices during Gaussian elimination on A: each replacement step R_i -> R_i + cR_j is multiplication by an elementary matrix E, so after elimination we have EA = U with U upper triangular, and hence

    A = E^(-1) U.

Since E^(-1) is lower triangular (it undoes the replacement), we factored A as the product of a lower and an upper triangular matrix! We say that A has a triangular factorization. A = LU is known as the LU decomposition of A: L is lower triangular, U is upper triangular.

Definition 3.

    lower triangular: all entries above the diagonal are 0
    upper triangular: all entries below the diagonal are 0

(Missing entries are 0.)

36 Example. Factor A= 6 as A=LU. 7 Solution. We begin with R R R followed by R R+R: E A= 6 7 E (E A)= 8 7 E E E A= 8 = 8 8 =U The factor L is given by: note that E E E A=U A=E E E U L =E E E = = = In conclusion, we found the following LU decomposition of A: 6 7 = 8 Note: The extra steps to compute L were unnecessary! The entries in L are precisely the negatives of the ones in the elementary matrices during elimination. Can you see it?

37 Once we have A = LU, it is simple to solve Ax = b:

    Ax = b    <=>    L(Ux) = b    <=>    Lc = b and Ux = c.

Both of the final systems are triangular and hence easily solved: Lc = b by forward substitution to find c, and then Ux = c by backward substitution to find x.

Important practical point: this can be quickly repeated for many different b.

Example 5. To solve Ax = b using the factorization A = LU from the previous example: forward substitution solves Lc = b for c, and backward substitution then solves Ux = c for x. It's always a good idea to do a quick check by computing Ax.
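The two triangular solves can be sketched directly; the L, U, b below are my own sample values (L and U factor the matrix [[2,1,1],[4,-6,0],[-2,7,2]] used in the LU sketch above):

```python
def forward_sub(L, b):
    """Solve Lc = b for lower triangular L, top row first."""
    c = []
    for i in range(len(b)):
        c.append((b[i] - sum(L[i][j] * c[j] for j in range(i))) / L[i][i])
    return c

def back_sub(U, c):
    """Solve Ux = c for upper triangular U, bottom row first."""
    n = len(c)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (c[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

L = [[1, 0, 0], [2, 1, 0], [-1, -1, 1]]
U = [[2, 1, 1], [0, -8, -2], [0, 0, 1]]
b = [5, -2, 9]
c = forward_sub(L, b)
x = back_sub(U, c)
print(x)  # [1.0, 1.0, 2.0]
```

For a new right-hand side b, only these two cheap substitutions are repeated; the factorization itself is computed once.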

38 Triangular factors for any matrix

Can we factor any matrix A as A = LU? Yes, almost!

Think about the process of Gaussian elimination. In each step, we use a pivot to produce zeros below it. The corresponding elementary matrices are lower triangular! The only other thing we might have to do is a row exchange, namely, if we run into a zero in the position of the pivot. All of these row exchanges can be done at the beginning!

Definition 6. A permutation matrix is one that is obtained by performing row exchanges on an identity matrix.

Example 7.

    E = [ 1 0 0 ]
        [ 0 0 1 ]
        [ 0 1 0 ]

is a permutation matrix. EA is the matrix obtained from A by permuting (here: interchanging) the last two rows of A.

Theorem 8. For any matrix A there is a permutation matrix P such that PA = LU.

In other words, it might not be possible to write A as A = LU, but we only need to permute the rows of A, and the resulting matrix PA then has an LU decomposition: PA = LU.

39 Practice problems

- Is the given matrix upper triangular? Lower triangular? (Consider, for instance, a diagonal matrix.)
- True or false? A permutation matrix is one that is obtained by performing column exchanges on an identity matrix.
- Why do we care about LU decomposition if we already have Gaussian elimination?

Example 9. Solve Ax = b once more, for a new right-hand side b, using the factorization A = LU we already have.

Example 10. A matrix A with a zero in the first pivot position cannot be written as A = LU (so it doesn't have an LU decomposition). But there is a permutation matrix P such that PA has an LU decomposition: exchanging rows moves a nonzero entry into the pivot position, and PA can then be factored as PA = LU. Do it! (By the way, another choice of P would work as well.)

40 Review

- Elementary matrices perform row operations: EA is the result of applying to A the row operation that produces E from the identity matrix.
- Gaussian elimination on A gives an LU decomposition A = LU: U is the echelon form, and L records the inverses of the row operations we did.
- The LU decomposition allows us to solve Ax = b quickly for many different b.

Inverting is easy for some special matrices: a diagonal matrix is inverted by inverting each diagonal entry. Already not so clear: triangular matrices, or a general matrix.

Goal for today: invert these and any other matrices (if possible).

41 The inverse of a matrix

Example 1. The inverse of a real number a is denoted as a^(-1). For instance, 7^(-1) = 1/7 and 7 * 7^(-1) = 7^(-1) * 7 = 1.

In the context of n x n matrix multiplication, the role of 1 is taken by the n x n identity matrix

    I_n = [ 1         ]
          [   1       ]
          [     ...   ]
          [         1 ]      (missing entries are 0).

Definition 2. An n x n matrix A is invertible if there is a matrix B such that

    AB = BA = I_n.

In that case, B is the inverse of A, and we write A^(-1) = B.

Example 3. We already saw that elementary matrices are invertible: the inverse is the elementary matrix undoing the same row operation.

42 Note. The inverse of a matrix is unique. Why? (So A^(-1) is well-defined.) Assume B and C are both inverses of A. Then:

    C = C I_n = C(AB) = (CA)B = I_n B = B.

Do not write A/B. Why? Because it is unclear whether it should mean AB^(-1) or B^(-1)A.

If AB = I, then BA = I (and so A^(-1) = B). Not easy to show at this stage.

Example 4. Not every nonzero matrix is invertible. For instance, a 2x2 matrix A whose second column is a multiple of the first is not invertible. Why?

Solution. For any matrix [a b; c d], the second column of [a b; c d]A is that same multiple of its first column; so the product can never be the identity matrix.

Example 5. If A = [a b; c d], then

    A^(-1) = 1/(ad - bc) [  d -b ]
                         [ -c  a ]

provided that ad - bc != 0. Let's check that:

    1/(ad - bc) [  d -b ] [ a b ]               [ ad - bc     0     ]
                [ -c  a ] [ c d ] = 1/(ad - bc) [    0     -cb + ad ] = I

Note.

- A 1x1 matrix [a] is invertible if and only if a != 0.
- A 2x2 matrix [a b; c d] is invertible if and only if ad - bc != 0.

We will encounter the quantities on the right again when we discuss determinants.

43 Solving systems using matrix inverse

Theorem 6. Let A be invertible. Then the system Ax = b has the unique solution x = A^(-1) b.

Proof. Multiply both sides of Ax = b with A^(-1) (from the left!).

Example 7. Solve

    7x_1 + 4x_2 = b_1
    5x_1 + 3x_2 = b_2

using matrix inversion.

Solution. In matrix form Ax = b, this system is

    [ 7 4 ] [x_1]   [b_1]
    [ 5 3 ] [x_2] = [b_2]

Computing the inverse (recall that [a b; c d]^(-1) = 1/(ad - bc) [d -b; -c a]):

    [ 7 4 ]^(-1)        1      [ 3 -4 ]   [  3 -4 ]
    [ 5 3 ]       = ---------  [ -5 7 ] = [ -5  7 ]
                    21 - 20

Hence, the solution is:

    x = A^(-1) b = [  3 -4 ] [b_1]   [  3b_1 - 4b_2 ]
                   [ -5  7 ] [b_2] = [ -5b_1 + 7b_2 ]

44 Recipe for computing the inverse

To solve Ax = b, we do row reduction on [A b]. To solve AX = I, we do row reduction on [A I].

To compute A^(-1):

- Form the augmented matrix [A I].
- Compute its reduced echelon form.
- If A is invertible, the result is of the form [I A^(-1)].

Example 8. Find the inverse of a given matrix A, if it exists.

Solution. By row reduction (the Gauss-Jordan method, i.e. Gauss-Jordan elimination):

    [A I]  ->  ...  ->  [I A^(-1)]

Example 9. Doing the previous example step by step: each arrow is one elementary row operation (a replacement such as R2 -> R2 + 2R1, a scaling, or an interchange) applied to the full augmented matrix [A I].
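The recipe [A I] -> [I A^(-1)] can be sketched in a few lines by reusing Gauss-Jordan elimination on the doubled-width matrix. The 2x2 test matrix [7 4; 5 3] is the (reconstructed) one from Example 7 above; its inverse is [3 -4; -5 7] since ad - bc = 1.

```python
from fractions import Fraction

def inverse(A):
    """Invert A by row reducing [A | I] to [I | A^-1] (Gauss-Jordan).
    Raises StopIteration if A is singular (no pivot can be found)."""
    n = len(A)
    M = [[Fraction(A[i][j]) for j in range(n)] +
         [Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for c in range(n):
        piv = next(i for i in range(c, n) if M[i][c] != 0)
        M[c], M[piv] = M[piv], M[c]               # interchange
        M[c] = [v / M[c][c] for v in M[c]]        # scale the pivot to 1
        for i in range(n):                        # clear the column
            if i != c and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[c])]
    return [row[n:] for row in M]                 # the right half is A^-1

print(inverse([[7, 4], [5, 3]]))  # [[3, -4], [-5, 7]] (as Fractions)
```

Because the same row operations are applied to both halves, the right half ends up recording the matrix F with FA = I, which is exactly the argument on the next slide.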

45 Note. Here is another way to see why this algorithm works: each row reduction corresponds to multiplying with an elementary matrix E:

    [A I]  ->  [E_1 A   E_1]  ->  [E_2 E_1 A   E_2 E_1]  ->  ...

So at each step:

    [A I]  ->  [FA  F],    with F = E_r ... E_2 E_1.

If we manage to reduce [A I] to [I F], this means FA = I and hence A^(-1) = F.

Some properties of matrix inverses

Theorem 10. Suppose A and B are invertible. Then:

- A^(-1) is invertible and (A^(-1))^(-1) = A.
  Why? Because A A^(-1) = I.
- A^T is invertible and (A^T)^(-1) = (A^(-1))^T.
  Why? Because (A^(-1))^T A^T = (A A^(-1))^T = I^T = I. (Recall that (AB)^T = B^T A^T.)
- AB is invertible and (AB)^(-1) = B^(-1) A^(-1).
  Why? Because (B^(-1) A^(-1))(AB) = B^(-1) I B = B^(-1) B = I.

46 Review

The inverse A^(-1) of a matrix A is, if it exists, characterized by

    A A^(-1) = A^(-1) A = I_n,      [ a b ]^(-1)        1      [  d -b ]
                                    [ c d ]      =  ---------  [ -c  a ]
                                                     ad - bc

If A is invertible, then the system Ax = b has the unique solution x = A^(-1) b.

Gauss-Jordan method to compute A^(-1): bring [A I] to RREF [I A^(-1)].

    (A^(-1))^(-1) = A,   (A^T)^(-1) = (A^(-1))^T,   (AB)^(-1) = B^(-1) A^(-1).

Why the last one? Because (B^(-1) A^(-1))(AB) = B^(-1) I B = B^(-1) B = I.

Further properties of matrix inverses

Theorem 1. Let A be an n x n matrix. Then the following statements are equivalent (i.e., for a given A, they are either all true or all false):

(a) A is invertible.
(b) A is row equivalent to I_n.
(c) A has n pivots.
(d) For every b, the system Ax = b has a unique solution. (Namely, x = A^(-1) b. Easy to check!)
(e) There is a matrix B such that AB = I_n. (A has a right inverse.)
(f) There is a matrix C such that CA = I_n. (A has a left inverse.)

Note. Matrices that are not invertible are often called singular. The book uses "singular" for n x n matrices that do not have n pivots. As we just saw, it doesn't make a difference.

Example 2. We now see at once that a matrix such as the all-ones matrix is not invertible. Why? Because it has only one pivot.

47 Application: finite differences

Let us apply linear algebra to the boundary value problem (BVP)

    -d^2u/dx^2 = f(x),    0 <= x <= 1,    u(0) = u(1) = 0.

f(x) is given, and the goal is to find u(x).

Physical interpretation: models the steady-state temperature distribution in a bar (u(x) is the temperature at point x) under the influence of an external heat source f(x) and with the ends fixed at 0 (ice cubes at the ends?).

Remark. Note that this simple BVP can be solved by integrating f(x) twice. We get two constants of integration, and so we see that the boundary condition u(0) = u(1) = 0 makes the solution u(x) unique. Of course, in real applications the BVP would be harder. Also, f(x) might only be known at some points, so we cannot use calculus to integrate it.

We will approximate this problem as follows:

- replace u(x) by its values at equally spaced points in (0, 1):

      u_0 = u(0) = 0,  u_1 = u(h),  u_2 = u(2h),  u_3 = u(3h),  ...,  u_n = u(nh),  u_{n+1} = u(1) = 0,

  where h is the spacing, with (n+1)h = 1;
- approximate d^2u/dx^2 at these points (finite differences);
- replace the differential equation with a linear equation at each point;
- solve the linear problem using Gaussian elimination.

48 Finite differences

Finite differences for the first derivative:

    du/dx  ~  (u(x+h) - u(x)) / h
           or (u(x) - u(x-h)) / h
           or (u(x+h) - u(x-h)) / (2h)      (symmetric and most accurate)

Note. Recall that you can always use L'Hospital's rule to determine the limit of such quantities (especially more complicated ones) as h -> 0.

Finite difference for the second derivative:

    d^2u/dx^2  ~  (u(x+h) - 2u(x) + u(x-h)) / h^2

This is the only symmetric choice involving only u(x), u(x +- h).

Question. Why does this approximate d^2u/dx^2 as h -> 0?

Solution.

    d^2u/dx^2  ~  [ du/dx(x+h) - du/dx(x) ] / h
               ~  [ (u(x+h) - u(x))/h  -  (u(x) - u(x-h))/h ] / h
               =  (u(x+h) - 2u(x) + u(x-h)) / h^2
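The symmetric difference quotient really does converge to the second derivative, and quickly. A minimal numerical check with u(x) = sin(x), for which u''(x) = -sin(x) exactly:

```python
import math

def second_diff(u, x, h):
    """Symmetric finite-difference approximation of u''(x)."""
    return (u(x + h) - 2 * u(x) + u(x - h)) / h**2

x = 0.7
exact = -math.sin(x)          # u'' for u = sin
for h in [0.1, 0.01, 0.001]:
    approx = second_diff(math.sin, x, h)
    print(h, abs(approx - exact))  # the error shrinks roughly like h^2
```

Each tenfold decrease of h reduces the error by roughly a factor of 100, which reflects the second-order accuracy of the symmetric choice.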

49 Setting up the linear equations

    -d^2u/dx^2 = f(x),   0 <= x <= 1,   u(0) = u(1) = 0.

Using d^2u/dx^2 ~ (u(x+h) - 2u(x) + u(x-h)) / h^2, we get:

at x = h:    -(u(2h) - 2u(h) + u(0)) / h^2 = f(h)     =>    2u_1 - u_2 = h^2 f(h)              (1)
at x = 2h:   -(u(3h) - 2u(2h) + u(h)) / h^2 = f(2h)   =>   -u_1 + 2u_2 - u_3 = h^2 f(2h)       (2)
at x = 3h:                                            =>   -u_2 + 2u_3 - u_4 = h^2 f(3h)       (3)
...
at x = nh:   -(u((n+1)h) - 2u(nh) + u((n-1)h)) / h^2 = f(nh)   =>   -u_{n-1} + 2u_n = h^2 f(nh)   (n)

Example 5. In the case of six divisions (n = 5, h = 1/6), we get Ax = b with:

    [  2 -1  0  0  0 ] [u_1]   [ h^2 f(h)  ]
    [ -1  2 -1  0  0 ] [u_2]   [ h^2 f(2h) ]
    [  0 -1  2 -1  0 ] [u_3] = [ h^2 f(3h) ]
    [  0  0 -1  2 -1 ] [u_4]   [ h^2 f(4h) ]
    [  0  0  0 -1  2 ] [u_5]   [ h^2 f(5h) ]

50 Such a matrix is called a band matrix. As we will see next, such matrices always have a particularly simple LU decomposition.

Gaussian elimination: R2 → R2 + (1/2)R1, then R3 → R3 + (2/3)R2, then R4 → R4 + (3/4)R3, then R5 → R5 + (4/5)R4 produces the upper triangular matrix

    U = [ 2  -1    0    0    0  ]
        [ 0  3/2  -1    0    0  ]
        [ 0   0   4/3  -1    0  ]
        [ 0   0    0   5/4  -1  ]
        [ 0   0    0    0   6/5 ]

In conclusion, we have the LU decomposition A = LU with

    L = [  1     0     0     0    0 ]
        [ -1/2   1     0     0    0 ]
        [  0   -2/3    1     0    0 ]
        [  0     0   -3/4    1    0 ]
        [  0     0     0   -4/5   1 ]

That's how the LU decomposition of band matrices always looks: L and U are triangular band matrices.
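The elimination above can be replayed mechanically. A sketch of Doolittle-style LU without pivoting (no pivoting is needed for this matrix), reproducing the L and U shown:

```python
import numpy as np

def lu_no_pivot(A):
    """Gaussian elimination storing multipliers: returns L, U with A = L @ U."""
    A = A.astype(float)
    n = A.shape[0]
    L = np.eye(n)
    U = A.copy()
    for j in range(n - 1):
        for i in range(j + 1, n):
            L[i, j] = U[i, j] / U[j, j]      # multiplier l_ij
            U[i, :] -= L[i, j] * U[j, :]     # row replacement R_i -> R_i - l_ij R_j
    return L, U

A = 2 * np.eye(5) - np.eye(5, k=1) - np.eye(5, k=-1)
L, U = lu_no_pivot(A)
print(np.diag(U))          # [2, 3/2, 4/3, 5/4, 6/5]
print(np.diag(L, k=-1))    # [-1/2, -2/3, -3/4, -4/5]
```

Note how L and U inherit the band structure: one nonzero subdiagonal in L, one superdiagonal in U.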

51 Review

Goal: solve for u(x) in the boundary value problem (BVP)

    -d²u/dx² = f(x),  0 ≤ x ≤ 1,  u(0) = u(1) = 0.

- replace u(x) by its values at equally spaced points in [0, 1]:
      u_0 = u(0) = 0, u_1 = u(h), u_2 = u(2h), ..., u_n = u(nh), u_{n+1} = u(1) = 0
- d²u/dx² ≈ [u(x+h) - 2u(x) + u(x-h)] / h² at these points (finite differences)
- get a linear equation at each point x = h, 2h, ..., nh; for n = 5, h = 1/6:

    [  2 -1  0  0  0 ] [u_1]   [h² f(h) ]
    [ -1  2 -1  0  0 ] [u_2]   [h² f(2h)]
    [  0 -1  2 -1  0 ] [u_3] = [h² f(3h)]
    [  0  0 -1  2 -1 ] [u_4]   [h² f(4h)]
    [  0  0  0 -1  2 ] [u_5]   [h² f(5h)]
            A            x          b

- compute the LU decomposition A = LU; for band matrices such as this one, L and U are triangular band matrices.

52 LU decomposition vs matrix inverse

In many applications, we don't just solve Ax = b for a single b, but for many different b (think: millions). Note, for instance, that in our example of the steady-state temperature distribution in a bar, the matrix A is always the same (it only depends on the kind of problem), whereas the vector b models the external heat (and thus changes for each specific instance).

That's why the LU decomposition saves us from repeating lots of computation, in comparison with Gaussian elimination on [A b].

What about computing A⁻¹? We are going to see that this is a bad idea. (It usually is.)

Example. When using the LU decomposition to solve Ax = b, we employ forward and backward substitution:

    Ax = b   ⟺ (using A = LU)   Lc = b and Ux = c

Here, for each b, we solve Lc = b by forward substitution (giving c), and then Ux = c by backward substitution (giving x).

How many operations (additions and multiplications) are needed in the n×n case? Since L and U are banded, about 2(n-1) operations for Lc = b and about 3n - 2 for Ux = c. So, roughly, a total of 5n operations.

On the other hand,

    A⁻¹ = (1/6) [ 5 4 3 2 1 ]
                [ 4 8 6 4 2 ]
                [ 3 6 9 6 3 ]
                [ 2 4 6 8 4 ]
                [ 1 2 3 4 5 ]

is a dense matrix: the band structure is destroyed. How many operations are needed to compute A⁻¹b? This time, we need roughly n² additions and multiplications.
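The cost gap can be seen in practice: scipy's banded solver does the O(n) forward/back substitution directly on the three stored diagonals, while the dense inverse needs all n² entries. A sketch (the right-hand side b of all ones is a hypothetical choice):

```python
import numpy as np
from scipy.linalg import solve_banded

n = 1000

# banded storage: row 0 = superdiagonal, row 1 = diagonal, row 2 = subdiagonal
ab = np.zeros((3, n))
ab[0, 1:] = -1.0
ab[1, :] = 2.0
ab[2, :-1] = -1.0

b = np.ones(n)
u = solve_banded((1, 1), ab, b)        # banded LU solve: O(n) work and storage

A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
u_dense = np.linalg.inv(A) @ b         # forms the dense inverse first: O(n^3), then O(n^2)
print(np.allclose(u, u_dense))         # same answer, very different cost
```

For a million unknowns, the banded solve stores 3n numbers; the inverse would need n² of them.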

53 Conclusions

- Large matrices met in applications usually are not random but have some structure (such as band matrices).
- When solving linear equations, we do not (try to) compute A⁻¹.
  - It destroys structure in practical problems. As a result, it can be orders of magnitude slower, and require orders of magnitude more memory.
  - It is also numerically unstable.
  - LU decomposition can be adjusted to not have these drawbacks.

A practice problem

Example. Above, we computed the LU decomposition for n = 5. For comparison, here are the details for computing the inverse when n = 3. Do it for n = 5, and appreciate just how much computation has to be done.

Invert

    A = [  2 -1  0 ]
        [ -1  2 -1 ]
        [  0 -1  2 ].

Solution. Gauss-Jordan elimination on [A | I]: first R2 → R2 + (1/2)R1 and R3 → R3 + (2/3)R2 to reach an echelon form, then eliminate upwards and scale each row to reach the RREF [I | A⁻¹]. The result is

    A⁻¹ = (1/4) [ 3 2 1 ]
                [ 2 4 2 ]
                [ 1 2 3 ].

54 Vector spaces and subspaces

We have already encountered vectors in Rⁿ. Now, we discuss the general concept of vectors. In place of the space Rⁿ, we think of general vector spaces.

Definition. A vector space is a nonempty set V of elements, called vectors, which may be added and scaled (multiplied with real numbers). The two operations of addition and scalar multiplication must satisfy the following axioms for all u, v, w in V, and all scalars c, d.

(a) u + v is in V
(b) u + v = v + u
(c) (u + v) + w = u + (v + w)
(d) there is a vector 0 (called the zero vector) in V such that u + 0 = u for all u in V
(e) for each u there is a vector -u such that u + (-u) = 0
(f) cu is in V
(g) c(u + v) = cu + cv
(h) (c + d)u = cu + du
(i) (cd)u = c(du)
(j) 1u = u

tl;dr: A vector space is a collection of vectors which can be added and scaled (without leaving the space!), subject to the usual rules you would hope for, namely: associativity, commutativity, distributivity.

Example. Convince yourself that M_{2×2} = { [a b; c d] : a, b, c, d in R } is a vector space.

Solution. In this context, the zero vector is 0 = [0 0; 0 0].

Addition is componentwise:
    [a b; c d] + [e f; g h] = [a+e b+f; c+g d+h]

Scaling is componentwise:
    r [a b; c d] = [ra rb; rc rd]

55 Addition and scaling satisfy the axioms of a vector space because they are defined componentwise and because ordinary addition and multiplication are associative, commutative, distributive and what not.

Important note: we do not use matrix multiplication here!

Note: as a vector space, M_{2×2} behaves precisely like R⁴; we can translate between the two via

    [a b; c d]  ⟷  (a, b, c, d).

A fancy person would say that these two vector spaces are isomorphic.

Example. Let P_n be the set of all polynomials of degree at most n. Is P_n a vector space?

Solution. Members of P_n are of the form

    p(t) = a_0 + a_1 t + ... + a_n tⁿ,

where a_0, a_1, ..., a_n are in R and t is a variable. P_n is a vector space.

Adding two polynomials:
    (a_0 + a_1 t + ... + a_n tⁿ) + (b_0 + b_1 t + ... + b_n tⁿ)
    = (a_0 + b_0) + (a_1 + b_1)t + ... + (a_n + b_n)tⁿ
So addition works componentwise again.

Scaling a polynomial:
    r(a_0 + a_1 t + ... + a_n tⁿ) = (ra_0) + (ra_1)t + ... + (ra_n)tⁿ
Scaling works componentwise as well.

Again: the vector space axioms are satisfied because addition and scaling are defined componentwise.

As in the previous example, we see that P_n is isomorphic to Rⁿ⁺¹:

    a_0 + a_1 t + ... + a_n tⁿ  ⟷  (a_0, a_1, ..., a_n)

56 Example 6. Let V be the set of all polynomials of degree exactly 3. Is V a vector space?

Solution. No, because V does not contain the zero polynomial p(t) = 0.

Every vector space has to have a zero vector; this is an easy necessary (but not sufficient) criterion when thinking about whether a set is a vector space.

More generally, the sum of elements in V might not be in V. For instance, the sum of the degree-3 polynomials 1 + 2t + t³ and 3t + 5t² - t³ is 1 + 5t + 5t², which has degree 2.

57 Review

A vector space is a set of vectors which can be added and scaled (without leaving the space!), subject to the usual rules.

The set of all polynomials of degree up to 2 is a vector space:

    (a_0 + a_1 t + a_2 t²) + (b_0 + b_1 t + b_2 t²) = (a_0 + b_0) + (a_1 + b_1)t + (a_2 + b_2)t²
    r(a_0 + a_1 t + a_2 t²) = (ra_0) + (ra_1)t + (ra_2)t²

Note how it works just like R³.

The set of all polynomials of degree exactly 2 is not a vector space:

    (1 + t + t²) + (t - t²) = 1 + 2t
     degree 2       degree 2   NOT degree 2

An easy test that often works is to check whether the set contains the zero vector. (Works in the previous case.)

Example. Let V be the set of all functions f: R → R. Is V a vector space?

Solution. Yes!

Addition of functions f and g:
    (f + g)(x) = f(x) + g(x)
Note that, once more, this definition is componentwise. Likewise for scalar multiplication: (cf)(x) = c f(x).

58 Subspaces

Definition. A subset W of a vector space V is a subspace if W is itself a vector space.

Since rules like associativity, commutativity and distributivity still hold in W, we only need to check the following:

W ⊆ V is a subspace of V if
- W contains the zero vector,
- W is closed under addition (i.e. if u, v ∈ W then u + v ∈ W),
- W is closed under scaling (i.e. if u ∈ W and c ∈ R then cu ∈ W).

Note that 0 ∈ W (first condition) follows from W being closed under scaling (third condition, with c = 0). But it is crucial and easy to check, so it deserves its own bullet point.

Example. Is W = span{ [1; 1] } a subspace of R²?

Solution. Yes!
- W contains 0 = 0 [1; 1].
- a[1; 1] + b[1; 1] = (a + b)[1; 1] is in W.
- c(a[1; 1]) = (ca)[1; 1] is in W.

Example. Is W = { [a; b; 0] : a, b in R } a subspace of R³?

Solution. Yes!
- W contains the zero vector [0; 0; 0].
- [a_1; b_1; 0] + [a_2; b_2; 0] = [a_1+a_2; b_1+b_2; 0] is in W.
- c[a; b; 0] = [ca; cb; 0] is in W.

The subspace W is isomorphic to R² (translation: [a; b; 0] ⟷ [a; b]), but they are not the same!

Example. Is W = { [a; b; 1] : a, b in R } a subspace of R³?

Solution. No! The zero vector is missing.

59 Note: W = { [a; b; 1] : a, b in R } is close to a vector space. Geometrically, it is a plane, but it does not contain the origin.

Example. Is W = { [0; 0] } a subspace of R²?

Solution. Yes!
- W contains the zero vector.
- 0 + 0 = 0 is in W.
- c 0 = 0 is in W.

Example. Is W = { [x; x+1] : x in R } a subspace of R²?

Solution. No! W does not contain the zero vector [0; 0].

If 0 is missing, some other things always go wrong as well. For instance, scalings such as 2[0; 1] = [0; 2], or sums such as [0; 1] + [1; 2] = [1; 3], are not in W.

Example. Is W = { [x; x+1] : x in R } together with the zero vector [0; 0] a subspace of R²? In other words, W is the set from the previous example plus the zero vector.

Solution. No! For instance, 2[0; 1] = [0; 2] is not in W.

Spans of vectors are subspaces

Review. The span of vectors v_1, v_2, ..., v_m is the set of all their linear combinations. We denote it by span{v_1, v_2, ..., v_m}. In other words, span{v_1, v_2, ..., v_m} is the set of all vectors of the form

    c_1 v_1 + c_2 v_2 + ... + c_m v_m,

where c_1, c_2, ..., c_m are scalars.

Theorem. If v_1, ..., v_m are in a vector space V, then span{v_1, ..., v_m} is a subspace of V.

Why?
- 0 = 0v_1 + ... + 0v_m is in span{v_1, ..., v_m}.
- (c_1 v_1 + ... + c_m v_m) + (d_1 v_1 + ... + d_m v_m) = (c_1 + d_1)v_1 + ... + (c_m + d_m)v_m
- r(c_1 v_1 + ... + c_m v_m) = (rc_1)v_1 + ... + (rc_m)v_m

60 Example. Is W = { [a+b; a-b] : a, b in R } a subspace of R²?

Solution. Write vectors in W in the form

    [a+b; a-b] = a[1; 1] + b[1; -1]

to see that

    W = span{ [1; 1], [1; -1] }.

By the theorem, W is a vector space. Actually, W = R².

Example. Is W = { [a, b; a+b, a] : a, b in R } a subspace of M_{2×2}, the space of 2×2 matrices?

Solution. Write vectors in W in the form

    [a, b; a+b, a] = a[1, 0; 1, 1] + b[0, 1; 1, 0]

to see that

    W = span{ [1, 0; 1, 1], [0, 1; 1, 0] }.

By the theorem, W is a vector space.

61 Practice problems

Example. Are the following sets vector spaces?

(a) W_1 = { [a, b; c, d] : a + b = 1, a - c = 0 }
    No, W_1 does not contain the zero matrix (a + b = 1 rules it out).

(b) W_2 = { [a+c, b; b+c, c] : a, b, c in R }
    Yes, W_2 = span{ [1, 0; 0, 0], [0, 1; 1, 0], [1, 0; 1, 1] }.
    Hence, W_2 is a subspace of the vector space M_{2×2} of all 2×2 matrices.

(c) W_3 = { [a; b] : ab ≥ 0 }
    No. For instance, [1; 0] + [0; -1] = [1; -1] is not in W_3.

(d) W_4 is the set of all polynomials p(t) such that p′(1) = 1.
    No. W_4 does not contain the zero polynomial.

(e) W_5 is the set of all polynomials p(t) such that p′(1) = 0.
    Yes. If p′(1) = 0 and q′(1) = 0, then (p + q)′(1) = 0. Likewise for scaling.
    Hence, W_5 is a subspace of the vector space of all polynomials.

62 Midterm!

Midterm 1: Thursday, 7-8:15pm
- in Psych if your last name starts with A or B
- in Foellinger Auditorium if your last name starts with C, D, ..., Z
- bring a picture ID and show it when turning in the exam

Review

A vector space is a set V of vectors which can be added and scaled (without leaving the space!), subject to the usual rules.

W ⊆ V is a subspace of V if it is a vector space itself; that is,
- W contains the zero vector,
- W is closed under addition (i.e. if u, v ∈ W then u + v ∈ W),
- W is closed under scaling (i.e. if u ∈ W and c ∈ R then cu ∈ W).

span{v_1, ..., v_m} is always a subspace of V (where v_1, ..., v_m are vectors in V).

Example. Is W = { [a, b; b, 1] : a, b in R } a subspace of M_{2×2}, the space of 2×2 matrices?

Solution. No, W does not contain the zero vector.

Example. Is W = { [a, b; b, a] : a, b in R } a subspace of M_{2×2}, the space of 2×2 matrices?

Solution. Write vectors in W in the form

    [a, b; b, a] = a[1, 0; 0, 1] + b[0, 1; 1, 0]

to see that

    W = span{ [1, 0; 0, 1], [0, 1; 1, 0] }.

Like any span, W is a vector space.

63 Example. Are the following sets vector spaces?

(a) W_1 = { [a, b; c, d] : a + b = 1, a - c = 0 }
    No, W_1 does not contain the zero matrix.

(b) W_2 = { [a+c, b; b+c, c] : a, b, c in R }
    Yes, W_2 = span{ [1, 0; 0, 0], [0, 1; 1, 0], [1, 0; 1, 1] }.
    Hence, W_2 is a subspace of the vector space M_{2×2} of all 2×2 matrices.

(c) W_3 = { [a+c, b; b+c, c+7] : a, b, c in R }
    We still have W_3 = [0, 0; 0, 7] + span{ [1, 0; 0, 0], [0, 1; 1, 0], [1, 0; 1, 1] }.
    Hence, W_3 is a subspace if and only if [0, 0; 0, 7] is in the span (more complicated).
    Equivalently (why?!), we have to check whether the zero matrix is in W_3. (We can answer such questions!) That is, whether

        [a+c, b; b+c, c+7] = [0, 0; 0, 0]

    has solutions a, b, c. There is no solution (b = 0, then b + c = 0 implies c = 0; this contradicts c + 7 = 0). So W_3 is not a vector space.

(d) W_4 = { [a; b] : ab ≥ 0 }
    No. For instance, [1; 0] + [0; -1] = [1; -1] is not in W_4.

(e) W_5 is the set of all polynomials p(t) such that p′(1) = 1.
    No. W_5 does not contain the zero polynomial.

(f) W_6 is the set of all polynomials p(t) such that p′(1) = 0.
    Yes. If p′(1) = 0 and q′(1) = 0, then (p + q)′(1) = p′(1) + q′(1) = 0. Likewise for scaling.
    Hence, W_6 is a subspace of the vector space of all polynomials.

64 What we learned before vector spaces

Linear systems

Systems of equations can be written as Ax = b. Sometimes, we represent the system by its augmented matrix [A b].

A linear system has either
- no solution (such a system is called inconsistent):
  an echelon form contains a row [0 ... 0 | b] with b ≠ 0;
- one unique solution:
  the system is consistent and has no free variables;
- infinitely many solutions:
  the system is consistent and has at least one free variable.

We know different techniques for solving systems Ax = b:
- Gaussian elimination on [A b]
- LU decomposition A = LU
- using the matrix inverse, x = A⁻¹b

65 Matrices and vectors

A linear combination of v_1, v_2, ..., v_m is of the form c_1 v_1 + c_2 v_2 + ... + c_m v_m.

span{v_1, v_2, ..., v_m} is the set of all such linear combinations. Spans are always vector spaces. For instance, a span in R³ can be {0}, a line, a plane, or R³.

The transpose Aᵀ of a matrix A has rows and columns flipped.

    (A + B)ᵀ = Aᵀ + Bᵀ        (AB)ᵀ = BᵀAᵀ

An m×n matrix A has m rows and n columns.

The product Ax of matrix times vector is

    [a_1 a_2 ... a_n] x = x_1 a_1 + x_2 a_2 + ... + x_n a_n.

Different interpretations of the product of matrix times matrix:
- column interpretation: each column of AB is A times the corresponding column of B
- row interpretation: each row of AB is the corresponding row of A times B
- row-column rule: (AB)_{i,j} = (row i of A) · (col j of B)

66 The inverse A⁻¹ of A is characterized by A⁻¹A = I (or AA⁻¹ = I).

    [a b; c d]⁻¹ = 1/(ad - bc) [d -b; -c a]

We can compute A⁻¹ using the Gauss-Jordan method:

    [A | I]  --RREF-->  [I | A⁻¹]

    (Aᵀ)⁻¹ = (A⁻¹)ᵀ        (AB)⁻¹ = B⁻¹A⁻¹

An n×n matrix A is invertible
⟺ A has n pivots
⟺ Ax = b has a unique solution (if true for one b, then true for all b)

Gaussian elimination

Gaussian elimination can bring any matrix into an echelon form. It proceeds by elementary row operations:
- (replacement) Add a multiple of one row to another row.
- (interchange) Interchange two rows.
- (scaling) Multiply all entries in a row by a nonzero constant.

Each elementary row operation can be encoded as multiplication with an elementary matrix.

We can continue row reduction to obtain the (unique) RREF.

67 Using Gaussian elimination

Gaussian elimination and row reduction allow us to:

- solve systems of linear equations (writing the solutions in parametric form, with the free variables as parameters)
- compute the LU decomposition A = LU
- compute the inverse of a matrix: to find A⁻¹, we use Gauss-Jordan elimination on [A | I]
- determine whether a vector b is a linear combination of vectors a_1, ..., a_n:
  b is a linear combination of a_1, ..., a_n if and only if the system corresponding to [a_1 ... a_n | b] is consistent.
  (Each solution x gives a linear combination b = x_1 a_1 + ... + x_n a_n.)

68 Organizational

Interested in joining the class committee?
- meet a few times to discuss ideas you may have for improving the class

Next: bases, dimension and such

69 Solving Ax = 0 and Ax = b

Column spaces

Definition. The column space Col(A) of a matrix A is the span of the columns of A. If A = [a_1 ... a_n], then

    Col(A) = span{a_1, ..., a_n}.

In other words, b is in Col(A) if and only if Ax = b has a solution.
Why? Because Ax = x_1 a_1 + ... + x_n a_n is the linear combination of the columns of A with coefficients given by x.

If A is m×n, then Col(A) is a subspace of Rᵐ.
Why? Because any span is a subspace.

Example. Find a matrix A such that W = Col(A) where

    W = { [x - y; y; 7x + y] : x, y in R }.

Solution. Note that

    [x - y; y; 7x + y] = x[1; 0; 7] + y[-1; 1; 1].

Hence,

    W = span{ [1; 0; 7], [-1; 1; 1] } = Col( [1 -1; 0 1; 7 1] ).

70 Null spaces

Definition. The null space of a matrix A is

    Nul(A) = { x : Ax = 0 }.

In other words, if A is m×n, then its null space consists of those vectors x ∈ Rⁿ which solve the homogeneous equation Ax = 0.

Theorem. If A is m×n, then Nul(A) is a subspace of Rⁿ.

Proof. We check that Nul(A) satisfies the conditions of a subspace:
- Nul(A) contains 0, because A0 = 0.
- If Ax = 0 and Ay = 0, then A(x + y) = Ax + Ay = 0. Hence, Nul(A) is closed under addition.
- If Ax = 0, then A(cx) = cAx = 0. Hence, Nul(A) is closed under scalar multiplication.

71 Solving Ax = 0 yields an explicit description of Nul(A). By that we mean a description as the span of some vectors.

Example. Find an explicit description of Nul(A) for a given matrix A.

Solution. Row reduce A to its RREF. From the RREF we read off a parametric description of the solutions x of Ax = 0: the free variables are the parameters, and each pivot variable is expressed in terms of the free variables. Writing the general solution as a linear combination of fixed vectors with the free variables as coefficients, say (with x_2, x_4, x_5 free)

    x = x_2 v_1 + x_4 v_2 + x_5 v_3,

we obtain

    Nul(A) = span{v_1, v_2, v_3}.

Note. The number of vectors in the spanning set for Nul(A) as derived above (which is as small as possible) equals the number of free variables of Ax = 0.
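This recipe can be checked by machine. A sketch using sympy, with a hypothetical matrix already in RREF (pivots in columns 1, 3, 5; x_2 and x_4 free), so the spanning vectors can be read off directly:

```python
import sympy as sp

A = sp.Matrix([[1, -2, 0, 3, 0],
               [0,  0, 1, 4, 0],
               [0,  0, 0, 0, 1]])   # hypothetical RREF; x_2 and x_4 are free

basis = A.nullspace()               # one spanning vector per free variable
for v in basis:
    print(v.T)
print(A * basis[0])                 # each vector solves Ax = 0
```

Note that `nullspace()` returns exactly as many vectors as there are free variables, matching the count in the Note above.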

72 Another look at solutions of Ax = b

Theorem. Let x_p be a solution of the equation Ax = b. Then every solution of Ax = b is of the form x = x_p + x_n, where x_n is a solution of the homogeneous equation Ax = 0. In other words,

    { x : Ax = b } = x_p + Nul(A).

We often call x_p a particular solution. The theorem then says that every solution of Ax = b is the sum of a fixed chosen particular solution and some solution of Ax = 0.

Proof. Let x be another solution of Ax = b. We need to show that x_n = x - x_p is in Nul(A):

    A(x - x_p) = Ax - Ax_p = b - b = 0.

Example. Given A and b, use the RREF of [A | b] to find a parametric description of the solutions of Ax = b: every solution is of the form

    x = x_p + x_2 v_1 + x_4 v_2,

where x_2, x_4 are the free variables and v_1, v_2 are elements of Nul(A).
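The decomposition x = x_p + (element of Nul A) can be verified numerically. A sketch with a hypothetical consistent 2×3 system, where lstsq supplies one particular solution and the SVD supplies a null space direction:

```python
import numpy as np

A = np.array([[1., 2., 1.],
              [2., 4., 0.]])        # hypothetical matrix, rank 2
b = np.array([4., 6.])

x_p, *_ = np.linalg.lstsq(A, b, rcond=None)   # a particular solution (min-norm)
_, _, Vt = np.linalg.svd(A)
n_vec = Vt[-1]                                 # spans Nul(A): rank 2, three columns

for t in (0.0, 1.0, -3.5):
    x = x_p + t * n_vec
    print(np.allclose(A @ x, b))               # every such x solves Ax = b
```

Different values of t sweep out the whole solution set, as the theorem predicts.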

We can see nicely how every solution is the sum of a particular solution x_p and solutions of Ax = 0.

Note. A convenient way to find just a particular solution is to set all free variables to zero. Of course, any other choice for the free variables also results in a particular solution.

Practice problems

True or false?
- The solutions of the equation Ax = b form a vector space.
  No, with the only exception of b = 0.
- The solutions of the equation Ax = 0 form a vector space.
  Yes. This is the null space Nul(A).

Example. Is the given set W a vector space? If possible, express W as the column or null space of some matrix A.
(a) W_1 = { [x; y; z] : 5x = y + z }
(b) W_2 = { [x; y; z] : 5x - 1 = y + z }
(c) W_3 = { [x - y; x + y] : x, y in R }

Example. Find an explicit description of Nul(A) for a given matrix A by solving Ax = 0.

74 Review

Every solution of Ax = b is the sum of a fixed chosen particular solution and some solution of Ax = 0:

    x = x_p + x_2 v_1 + x_4 v_2,

with x_2, x_4 the free variables and v_1, v_2 elements of Nul(A).

Linear independence

Review. span{v_1, v_2, ..., v_m} is the set of all linear combinations

    c_1 v_1 + c_2 v_2 + ... + c_m v_m.

span{v_1, v_2, ..., v_m} is a vector space.

Example. Is span{v_1, v_2, v_3} equal to R³, for three given vectors in R³?

Solution. Recall that the span is the set of all vectors Ax, where A = [v_1 v_2 v_3] and x ranges over R³. Hence, the span is equal to R³ if and only if the system with augmented matrix [A | b] is consistent for every b = (b_1, b_2, b_3).

75 Gaussian elimination of [A | b] produces an echelon form whose last row is zero except for the augmented entry, which is a combination such as b_1 - b_2 + b_3. The system is only consistent if b_1 - b_2 + b_3 = 0. Hence, the span does not equal all of R³.

What went wrong? Well, the three vectors in the span satisfy

    v_3 = v_1 + v_2.

Hence,

    span{v_1, v_2, v_3} = span{v_1, v_2}.

We are going to say that the three vectors are linearly dependent because they satisfy

    v_1 + v_2 - v_3 = 0.

Definition. Vectors v_1, ..., v_p are said to be linearly independent if the equation

    x_1 v_1 + x_2 v_2 + ... + x_p v_p = 0

has only the trivial solution (namely, x_1 = x_2 = ... = x_p = 0).

Likewise, v_1, ..., v_p are said to be linearly dependent if there exist coefficients x_1, ..., x_p, not all zero, such that

    x_1 v_1 + x_2 v_2 + ... + x_p v_p = 0.

76 Example. Are the vectors v_1, v_2, v_3 independent? If possible, find a linear dependence relation among them.

Solution. We need to check whether the equation

    x_1 v_1 + x_2 v_2 + x_3 v_3 = 0

has more than the trivial solution. In other words, the three vectors are independent if and only if the system Ax = 0, with A = [v_1 v_2 v_3], has no free variables.

To find out, we reduce the matrix A to an echelon form. If there is a column without a pivot, we do have a free variable, and hence the three vectors are not linearly independent.

To find a linear dependence relation, we solve the system Ax = 0. With x_3 free, back-substitution expresses x_1 and x_2 in terms of x_3; hence, for any x_3,

    x_1 v_1 + x_2 v_2 + x_3 v_3 = 0.

Since we are only interested in one linear combination, we can set, say, x_3 = 1 and read off a specific relation.

77 Linear independence of matrix columns

Note that a linear dependence relation, such as v_1 + v_2 - v_3 = 0, can be written in matrix form as

    [v_1 v_2 v_3] [1; 1; -1] = 0.

Hence, each linear dependence relation among the columns of a matrix A corresponds to a nontrivial solution of Ax = 0.

Theorem. Let A be an m×n matrix. The columns of A are linearly independent
⟺ Ax = 0 has only the solution x = 0
⟺ Nul(A) = {0}
⟺ A has n pivots (one in each column).

Example. Are three given vectors independent?

Solution. Put the vectors in a matrix, and produce an echelon form. If each column contains a pivot, the three vectors are independent.

Example. (once again, short version) If, instead, the last column of the echelon form does not contain a pivot, the three vectors are linearly dependent.
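The pivot criterion can be phrased via the rank: the columns of A are independent exactly when the number of pivots (the rank) equals the number of columns. A sketch, using hypothetical vectors chosen so that v_3 = v_1 + v_2:

```python
import numpy as np

def columns_independent(A):
    """Columns independent <=> a pivot in every column <=> rank equals column count."""
    return np.linalg.matrix_rank(A) == A.shape[1]

v1, v2, v3 = [1, 0, 0], [1, 1, 0], [2, 1, 0]   # hypothetical; v3 = v1 + v2
A = np.column_stack([v1, v2, v3])

print(columns_independent(A))           # False: there is a dependence relation
print(columns_independent(A[:, :2]))    # True: v1, v2 alone are independent
```

Dropping the dependent column restores independence, mirroring the discussion of shrinking spanning sets later in the notes.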

78 Special cases

- A set of a single nonzero vector {v_1} is always linearly independent.
  Why? Because x_1 v_1 = 0 only for x_1 = 0.

- A set of two vectors {v_1, v_2} is linearly independent if and only if neither of the vectors is a multiple of the other.
  Why? Because if x_1 v_1 + x_2 v_2 = 0 with, say, x_1 ≠ 0, then v_1 = -(x_2/x_1) v_2.

- A set of vectors {v_1, ..., v_p} containing the zero vector is linearly dependent.
  Why? Because if, say, v_1 = 0, then 1 v_1 + 0 v_2 + ... + 0 v_p = 0.

- If a set contains more vectors than there are entries in each vector, then the set is linearly dependent. In other words:
  any set {v_1, ..., v_p} of vectors in Rⁿ is linearly dependent if p > n.
  Why? Let A be the matrix with columns v_1, ..., v_p. This is an n×p matrix. The columns are linearly independent if and only if each column contains a pivot. If p > n, then the matrix can have at most n pivots. Thus not all p columns can contain a pivot. In other words, the columns have to be linearly dependent.

Example. With the least amount of work possible, decide which of the following sets of vectors are linearly independent.

(a) two vectors, neither a multiple of the other:
    linearly independent.
(b) a single nonzero vector:
    linearly independent.
(c) the columns of a wide matrix (more columns than rows):
    linearly dependent, because these are more vectors than entries in each vector.
(d) a set of vectors that includes the zero vector:
    linearly dependent.

79 Review

Vectors v_1, ..., v_p are linearly dependent if

    x_1 v_1 + x_2 v_2 + ... + x_p v_p = 0

with not all of the coefficients equal to zero.

The columns of A are linearly independent
⟺ each column of A contains a pivot.

Any set of more than n vectors in Rⁿ is linearly dependent.

A basis of a vector space

Definition. A set of vectors {v_1, ..., v_p} in V is a basis of V if
- V = span{v_1, ..., v_p}, and
- the vectors v_1, ..., v_p are linearly independent.

In other words, {v_1, ..., v_p} in V is a basis of V if and only if every vector w in V can be uniquely expressed as

    w = c_1 v_1 + ... + c_p v_p.

Example. Let

    e_1 = [1; 0; 0],  e_2 = [0; 1; 0],  e_3 = [0; 0; 1].

Show that {e_1, e_2, e_3} is a basis of R³. It is called the standard basis.

Solution.
- Clearly, span{e_1, e_2, e_3} = R³.
- {e_1, e_2, e_3} are independent, because the matrix [e_1 e_2 e_3] = I has a pivot in each column.

Definition. V is said to have dimension p if it has a basis consisting of p vectors.

80 This definition makes sense because if V has a basis of p vectors, then every basis of V has p vectors. Why? (Think of V = R³.)
- A basis of R³ cannot have more than 3 vectors, because any set of 4 or more vectors in R³ is linearly dependent.
- A basis of R³ cannot have fewer than 3 vectors, because 2 vectors span at most a plane (challenge: can you think of an argument that is more rigorous?).

Example. R³ has dimension 3. Indeed, the standard basis {e_1, e_2, e_3} has three elements. Likewise, Rⁿ has dimension n.

Example. Not all vector spaces have a finite basis. For instance, the vector space of all polynomials has infinite dimension. Its standard basis is

    1, t, t², t³, ...

This is indeed a basis, because any polynomial can be written as a unique linear combination:

    p(t) = a_0 + a_1 t + ... + a_n tⁿ for some n.

Recall that vectors in V form a basis of V if they span V and if they are linearly independent. If we know the dimension of V, we only need to check one of these two conditions:

Theorem. Suppose that V has dimension d.
- A set of d vectors in V is a basis if it spans V.
- A set of d vectors in V is a basis if it is linearly independent.

Why? If the d vectors were not independent, then d - 1 of them would still span V; in the end, we would find a basis of fewer than d vectors. If the d vectors did not span V, then we could add another vector to the set and have d + 1 independent ones.

Example. Are the following sets a basis for R³?
(a) a set of 2 vectors: no, the set has fewer than 3 elements.
(b) a set of 4 vectors: no, the set has more than 3 elements.

81 (c) a set of 3 vectors in R³: the set has 3 elements, hence it is a basis if and only if the vectors are independent. Row reduce the matrix with these vectors as columns; if each column contains a pivot, the three vectors are independent, and the set is a basis of R³.

Example. Let P_2 be the space of polynomials of degree at most 2.
- What is the dimension of P_2?
- Is {t, 1 - t, 1 + t - t²} a basis of P_2?

Solution.
- The standard basis for P_2 is {1, t, t²}. This is indeed a basis because every polynomial a_0 + a_1 t + a_2 t² can clearly be written as a linear combination of 1, t, t² in a unique way. Hence, P_2 has dimension 3.
- The set {t, 1 - t, 1 + t - t²} has 3 elements. Hence, it is a basis if and only if the three polynomials are linearly independent. We need to check whether

      x_1 t + x_2 (1 - t) + x_3 (1 + t - t²) = (x_2 + x_3) + (x_1 - x_2 + x_3)t - x_3 t² = 0

  has only the trivial solution x_1 = x_2 = x_3 = 0. We get the equations

      x_2 + x_3 = 0
      x_1 - x_2 + x_3 = 0
      -x_3 = 0

  which clearly have only the trivial solution. (If you don't see it, solve the system!) Hence, {t, 1 - t, 1 + t - t²} is a basis of P_2.
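The polynomial check reduces to a rank computation on coordinate vectors, since P_2 is isomorphic to R³ via the standard basis {1, t, t²}. A sketch:

```python
import numpy as np

# columns = coordinates of t, 1 - t, 1 + t - t^2 with respect to {1, t, t^2}
A = np.array([[0,  1,  1],
              [1, -1,  1],
              [0,  0, -1]])

print(np.linalg.matrix_rank(A))   # 3: a pivot in each column, so a basis of P_2
```

The same translation to coordinate vectors works for any finite-dimensional space with a chosen basis.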

82 Shrinking and expanding sets of vectors

We can find a basis for V = span{v_1, ..., v_p} by discarding, if necessary, some of the vectors in the spanning set.

Example. Produce a basis of R² from three given vectors v_1, v_2, v_3 in R².

Solution. Three vectors in R² have to be linearly dependent. Here, we notice that v_3 is a multiple of v_1. The remaining vectors {v_1, v_2} are a basis of R², because these two vectors are clearly independent.

Checking our understanding

Example. Subspaces of R³ can have dimension 0, 1, 2, 3.
- The only 0-dimensional subspace is {0}.
- A 1-dimensional subspace is of the form span{v} where v ≠ 0. These subspaces are lines through the origin.
- A 2-dimensional subspace is of the form span{v, w} where v and w are not multiples of each other. These subspaces are planes through the origin.
- The only 3-dimensional subspace is R³ itself.

True or false?
- Suppose that V has dimension n. Then any set in V containing more than n vectors must be linearly dependent.
  That's correct.
- The space P_n of polynomials of degree at most n has dimension n + 1.
  True, as well. A basis is {1, t, t², ..., tⁿ}.
- The vector space of functions f: R → R is infinite-dimensional.
  Yes. A still-infinite-dimensional subspace is the space of polynomials.
- Consider V = span{v_1, ..., v_p}. If one of the vectors, say v_k, in the spanning set is a linear combination of the remaining ones, then the remaining vectors still span V.
  True, v_k is not adding anything new.

83 Review

- {v_1, ..., v_p} is a basis of V if the vectors span V and are independent.
- The dimension of V is the number of elements in a basis.
- The columns of A are linearly independent ⟺ each column of A contains a pivot.

Warmup

Example. Find a basis and the dimension of a subspace W of R⁴ given parametrically in terms of a, b, c, d.

Solution. First, collect the coefficient vectors of a, b, c, d to write W = span{v_1, v_2, v_3, v_4}.

Is dim W = 4? No, because (in this example) the third vector is the sum of the first two. Suppose we did not notice: put the vectors into a matrix A = [v_1 v_2 v_3 v_4] and row reduce. There is not a pivot in every column; hence, the vectors are dependent.

84 Not necessary here, but: to get a relation, solve Ax = 0. Set the free variable x_3 = 1; back-substitution then gives the values of the remaining variables, and the resulting relation is

    v_1 + v_2 - v_3 = 0,

precisely what we noticed to begin with.

Hence, a basis for W is {v_1, v_2, v_4}, and dim W = 3. It follows from the echelon form that these vectors are independent.

Every set of linearly independent vectors can be extended to a basis. In other words, let {v_1, ..., v_p} be linearly independent vectors in V. If V has dimension d, then we can find vectors v_{p+1}, ..., v_d such that {v_1, ..., v_d} is a basis of V.

Example. Consider H = span{w_1, w_2}, the span of two independent vectors in R³.
- Give a basis for H. What is the dimension of H?
- Extend the basis of H to a basis of R³.

Solution.
- The vectors w_1, w_2 are independent. By definition, they span H. Therefore, {w_1, w_2} is a basis for H. In particular, dim H = 2.
- {w_1, w_2} is not a basis for R³. Why? Because a basis for R³ needs to contain 3 vectors. Or, because some vector w_3 of R³ is not in H. So: just add this (or any other) missing vector! By construction, {w_1, w_2, w_3} is independent. Hence, this automatically is a basis of R³.

85 Bases for column and null spaces

Bases for null spaces

To find a basis for Nul(A):
- find the parametric form of the solutions of Ax = 0,
- express the solutions x as a linear combination of fixed vectors with the free variables as coefficients;
- these vectors form a basis of Nul(A).

Example. Find a basis for Nul(A) for a given matrix A.

Solution. Row reduce A and read off the parametric form of the solutions of Ax = 0, say (with x_2, x_4, x_5 free)

    x = x_2 v_1 + x_4 v_2 + x_5 v_3.

Hence, Nul(A) = span{v_1, v_2, v_3}.

These vectors are independent. If you don't see it, compute an echelon form! Better yet: note that v_1 corresponds to the solution with x_2 = 1 and the other free variables x_4 = 0, x_5 = 0; v_2 corresponds to the solution with x_4 = 1 and x_2 = 0, x_5 = 0; and so on. In particular, each vector has a 1 in a position where the others have a 0, so no nontrivial combination of them can vanish.

Hence, {v_1, v_2, v_3} is a basis for Nul(A).

86 Bases for column spaces

Recall that the columns of A are independent
⟺ Ax = 0 has only the trivial solution (namely, x = 0)
⟺ A has no free variables.

A basis for Col(A) is given by the pivot columns of A.

Example. Find a basis for Col(A) for a given matrix A.

Solution. Row reduce A to an echelon form to locate the pivots. If, say, the pivot columns are the first and third, then a basis for Col(A) consists of the first and third columns of A.

Warning: For the basis of Col(A), you have to take the columns of A itself, not the columns of an echelon form. Row operations do not preserve the column space. For instance, interchanging the two rows of [0; 1] produces [1; 0], and the two matrices have different column spaces (of the same dimension).
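sympy's `rref` reports the pivot columns, from which the basis of Col(A) is read off (taking columns of A itself, per the warning). A sketch with a hypothetical matrix in which column 2 is twice column 1:

```python
import sympy as sp

A = sp.Matrix([[1, 2, 0, 1],
               [2, 4, 1, 4],
               [3, 6, 1, 5]])      # hypothetical: col 2 = 2 * col 1, col 4 = col 1 + 2 * col 3

_, pivot_cols = A.rref()
print(pivot_cols)                   # (0, 2): first and third columns carry pivots
basis = [A.col(j) for j in pivot_cols]   # columns of A, not of the RREF
```

Note how the dependent columns (second and fourth) are skipped automatically.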

87 Review

- {v_1, ..., v_p} is a basis of V if the vectors span V and are independent.
- To obtain a basis for Nul(A), solve Ax = 0: writing the solutions as x = x_2 v_1 + x_4 v_2 + ... with the free variables as coefficients, the vectors v_1, v_2, ... form a basis for Nul(A).
- To obtain a basis for Col(A), take the pivot columns of A (not of an echelon form!).

Row operations do not preserve the column space: for instance, interchanging the rows of [0; 1] gives [1; 0]. On the other hand, row operations do preserve the null space. Why? Recall that we operate on rows precisely in order to solve systems like Ax = 0!

Dimension of Col(A) and Nul(A)

Definition. The rank of a matrix A is the number of its pivots.

Theorem. Let A be an m×n matrix of rank r. Then:
- dim Col(A) = r
  Why? A basis for Col(A) is given by the r pivot columns of A.
- dim Nul(A) = n - r   (the number of free variables of A)
  Why? In our recipe for a basis of Nul(A), each free variable corresponds to an element of the basis.
- dim Col(A) + dim Nul(A) = n
  Why? Each of the n columns either contains a pivot or corresponds to a free variable.
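The relation dim Col(A) + dim Nul(A) = n can be verified numerically: `matrix_rank` gives dim Col(A), and scipy's `null_space` returns an orthonormal basis of Nul(A). A sketch with a hypothetical 3×4 matrix of rank 2:

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1., 2., 3., 4.],
              [2., 4., 6., 8.],     # twice the first row
              [0., 1., 1., 0.]])    # hypothetical example

r = np.linalg.matrix_rank(A)        # dim Col(A)
N = null_space(A)                    # shape (4, dim Nul(A))
print(r, N.shape[1], r + N.shape[1])   # the two dimensions sum to n = 4
```

Here each free variable contributes one column to N, exactly as in the hand recipe.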

88 The four fundamental subspaces

Row space and left null space

Definition.
- The row space of A is the column space of Aᵀ.
  (Col(Aᵀ) is spanned by the columns of Aᵀ, and these are the rows of A.)
- The left null space of A is the null space of Aᵀ.

Why "left"? A vector x is in Nul(Aᵀ) if and only if Aᵀx = 0. Note that

    Aᵀx = 0  ⟺  (Aᵀx)ᵀ = xᵀA = 0ᵀ.

Hence, x is in Nul(Aᵀ) if and only if xᵀA = 0.

Example. Find a basis for Col(A) and Col(Aᵀ) for a given matrix A.

Solution. We know what to do for Col(A) from an echelon form of A, and we could likewise handle Col(Aᵀ) from an echelon form of Aᵀ. But wait! Instead of doing twice the work, we only need an echelon form of A:

Row reducing A to an echelon form B, we can read off the rank r of A (the number of pivots). A basis for Col(A) is given by the pivot columns of A.

Recall that, in general, Col(A) ≠ Col(B); that's because we performed row operations. However, the row spaces are the same!

    Col(Aᵀ) = Col(Bᵀ)

The row space is preserved by elementary row operations. In particular: a basis for Col(Aᵀ) is given by the nonzero rows of the echelon form B.

Theorem. (Fundamental Theorem of Linear Algebra, Part I)
Let A be an m×n matrix of rank r. Then:
- dim Col(A) = r         (subspace of Rᵐ)
- dim Col(Aᵀ) = r        (subspace of Rⁿ)
- dim Nul(A) = n - r     (subspace of Rⁿ; the number of free variables of A)
- dim Nul(Aᵀ) = m - r    (subspace of Rᵐ)

In particular: the column space and the row space always have the same dimension! In other words, A and Aᵀ have the same rank (i.e. the same number of pivots). Easy to see for a matrix in echelon form, but not obvious for a random matrix.
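The equality rank(A) = rank(Aᵀ) is indeed not obvious for a random matrix, but easy to spot-check numerically. A minimal sketch with a randomly generated matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 7)) @ rng.standard_normal((7, 6))  # random 4x6 matrix

r = np.linalg.matrix_rank(A)
r_T = np.linalg.matrix_rank(A.T)
print(r, r_T)          # the same number, as the theorem guarantees
```

Any matrix works here; changing the seed or the shape never breaks the equality.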

90 Linear transformations

Throughout, V and W are vector spaces.

Definition. A map T: V → W is a linear transformation if

    T(cx + dy) = cT(x) + dT(y)  for all x, y in V and all c, d in R.

In other words, a linear transformation respects addition and scaling:

    T(x + y) = T(x) + T(y)
    T(cx) = cT(x)

It also sends the zero vector in V to the zero vector in W:

    T(0) = 0, because T(0) = T(0·0) = 0·T(0) = 0.

Example. Let A be an m×n matrix. Then the map T(x) = Ax is a linear transformation T: Rⁿ → Rᵐ.
Why? Because matrix multiplication is linear:

    A(cx + dy) = cAx + dAy.

The LHS is T(cx + dy) and the RHS is cT(x) + dT(y).

Example. Let P_n be the vector space of all polynomials of degree at most n. Consider the map T: P_n → P_n given by

    T(p(t)) = d/dt p(t).

This map is linear! Why? Because differentiation is linear:

    d/dt [ap(t) + bq(t)] = a d/dt p(t) + b d/dt q(t).

The LHS is T(ap(t) + bq(t)) and the RHS is aT(p(t)) + bT(q(t)).

Representing linear maps by matrices

Let x_1, ..., x_n be a basis for V. A linear map T: V → W is determined by the values T(x_1), ..., T(x_n).

Why? Take any v in V. It can be written as

    v = c_1 x_1 + ... + c_n x_n,

because {x_1, ..., x_n} is a basis and hence spans V. Hence, by the linearity of T,

    T(v) = T(c_1 x_1 + ... + c_n x_n) = c_1 T(x_1) + ... + c_n T(x_n).

Definition. (From linear maps to matrices)
Let x_1, ..., x_n be a basis for V, and y_1, ..., y_m a basis for W. The matrix representing T with respect to these bases
- has n columns (one for each of the x_j),
- the j-th column has m entries a_{1,j}, ..., a_{m,j} determined by

    T(x_j) = a_{1,j} y_1 + ... + a_{m,j} y_m.

Example. Let V = R² and W = R³, and let T be a linear map whose values T(x_1), T(x_2) on the standard basis x_1, x_2 of R² are given. What is the matrix A representing T with respect to the standard bases?

Solution. Write each value in terms of the standard basis y_1, y_2, y_3 of R³, say

    T(x_1) = a_{1,1} y_1 + a_{2,1} y_2 + a_{3,1} y_3,
    T(x_2) = a_{1,2} y_1 + a_{2,2} y_2 + a_{3,2} y_3.

Then A is the 3×2 matrix with these coefficients as columns; with respect to the standard bases, the columns of A are simply the vectors T(x_1) and T(x_2).
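A concrete instance of "coefficients of T(x_j) fill column j": the differentiation map T(p) = p′ on P_3 with respect to the basis {1, t, t², t³}. Since T(tʲ) = j tʲ⁻¹, column j has a single entry j in row j - 1. A sketch (the polynomial 2 + 3t - t³ is a hypothetical test case):

```python
import numpy as np

n = 3
D = np.zeros((n + 1, n + 1))
for j in range(1, n + 1):
    D[j - 1, j] = j                 # d/dt t^j = j t^(j-1)

# differentiate p(t) = 2 + 3t - t^3 via its coordinate vector in {1, t, t^2, t^3}
p = np.array([2., 3., 0., -1.])
print(D @ p)                        # coordinates of p'(t) = 3 - 3t^2
```

Applying the matrix to a coordinate vector performs the calculus operation, which is exactly what "representing a linear map by a matrix" means.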

92 (We did not have time yet to discuss the next example in class, but it will be helpful if your discussion section already meets Tuesdays.) Example. As in the previous example, let V = R and W = R. Let T be the (same) linear map such that ( T ) = ( ), T =. 7 What is the matrix B representing T with respect to the following bases?, x x for R, y, y, y for R. Solution. This time: ( ) T(x ) =T + 7 = =5 B = 5 5 ( ) T(x ) =T = ( =T =7 B = = 5 ) ( ) +T +5 ( = T ) = 7 + ( +T ) can you see it? otherwise: do it! Tedious, even in this simple example! (But we can certainly do it.) A matrix representing T encodes in column j the coefficients of T(x j ) expressed as a linear combination of y,,y m. 6

93 Practice problems

Example. Suppose A is the matrix given in class. Find the dimensions and a basis for all four fundamental subspaces of A.

Example. Suppose A is a 5×5 matrix, and that v is a vector in R^5 which is not a linear combination of the columns of A. What can you say about the number of solutions to Ax = 0?

Solution. Stop reading, unless you have thought about the problem!
The existence of such a v means that the 5 columns of A do not span R^5. Hence, the columns are not independent. In other words, A has at most 4 pivots. So, there is at least one free variable. Which means that Ax = 0 has infinitely many solutions.

94 Linear transformations A map T:V W between vector spaces is linear if T(x+y)=T(x)+T(y) T(cx)=cT(x) Let A be an m n matrix. T: R n R m defined by T(x)=Ax is linear. T: P n P n defined by T(p(t))=p (t) is linear. The only linear maps T: R R are T(x)=αx. Recall that T()= for linear maps. Linear maps T: R R are of the form T For instance, T(x,y)=xy is not linear: T ( x y) =αx+βy. ( ) x T(x,y) y Example. Let V = R and W = R. Let T be the linear map such that What is T( T ( = ) =T )? + ( + ( T ) =, T ( ) ( ) ( ) =T +T ) =. = 8 + = 8 Let x,,x n be a basis for V. A linear map T:V W is determined by the values T(x ),,T(x n ). Why? Take any v in V. Write v=c x + +c n x n. (Possible, because {x,,x n } spans V.) By linearity of T, T(v)=T(c x + +c n x) = c T(x )+ +c n T(x n ).

95 Important geometric examples

We consider some linear maps R^2 → R^2, which are defined by matrix multiplication, that is, by x ↦ Ax. In fact: all linear maps R^n → R^m are given by x ↦ Ax, for some matrix A.

Example. The matrix A = [c 0; 0 c] gives the map x ↦ cx, i.e. it stretches every vector in R^2 by the same factor c.

Example. The matrix A = [0 1; 1 0] gives the map [x; y] ↦ [y; x], i.e. it reflects every vector in R^2 through the line y = x.

Example. The matrix A = [1 0; 0 0] gives the map [x; y] ↦ [x; 0], i.e. it projects every vector in R^2 onto the x-axis.
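These three maps can be tried out directly. A minimal Python sketch (not from the notes; the matrices follow the examples above):

```python
def apply2(A, v):
    """Multiply a 2x2 matrix A by a vector v in R^2."""
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]

stretch = [[2, 0], [0, 2]]   # x -> 2x
reflect = [[0, 1], [1, 0]]   # (x, y) -> (y, x): reflection through y = x
project = [[1, 0], [0, 0]]   # (x, y) -> (x, 0): projection onto the x-axis

v = [3, 4]
assert apply2(stretch, v) == [6, 8]
assert apply2(reflect, v) == [4, 3]
assert apply2(project, v) == [3, 0]
```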

96 Example 5. The matrix A = [0 −1; 1 0] gives the map [x; y] ↦ [−y; x], i.e. it rotates every vector in R^2 counter-clockwise by 90°.

Representing linear maps by matrices

Definition 6. (From linear maps to matrices) Let x_1, …, x_n be a basis for V, and y_1, …, y_m a basis for W. The matrix representing T with respect to these bases
has n columns (one for each of the x_j);
the j-th column has m entries a_{1,j}, …, a_{m,j} determined by
T(x_j) = a_{1,j} y_1 + … + a_{m,j} y_m.

Example 7. Recall the map T given by [x; y] ↦ [y; x] (it reflects every vector in R^2 through the line y = x).
Which matrix A represents T with respect to the standard bases?
Which matrix B represents T with respect to the basis [1; 1], [1; −1]?

Solution. T([1; 0]) = [0; 1] and T([0; 1]) = [1; 0]. Hence, A = [0 1; 1 0].

If a linear map T: R^n → R^m is represented by the matrix A with respect to the standard bases, then T(x) = Ax.

97 Matrix multiplication corresponds to function composition! That is, if T, T are represented by A, A, then T (T (x))=(a A )x. T( T( ) = ) = = = +. Hence, B=.. Hence, B=. Example 8. Let T: R R be the linear map such that ( ) T = ( ), T =. 7 What is the matrix B representing T with respect to the following bases?, x x for R, y, y, y for R. Solution. This time: ( ) T(x ) =T + 7 = =5 B = 5 5 ( ) T(x ) =T = ( =T =7 B = = 5 ) ( ) +T +5 ( = T ) = 7 + ( +T ) can you see it? otherwise: do it! Tedious, even in this simple example! (But we can certainly do it.) A matrix representing T encodes in column j the coefficients of T(x j ) expressed as a linear combination of y,,y m.
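The fact that matrix multiplication corresponds to composition can be checked numerically. A Python sketch (an illustration added here, not from the notes): composing two rotations gives the rotation by the sum of the angles.

```python
import math

def matmul2(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rotation(theta):
    """Matrix of counter-clockwise rotation of R^2 by theta."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta),  math.cos(theta)]]

# T2(T1(x)) = (A2 A1) x: rotating by a, then by b, rotates by a + b.
a, b = 0.3, 1.1
C1 = matmul2(rotation(b), rotation(a))
C2 = rotation(a + b)
assert all(abs(C1[i][j] - C2[i][j]) < 1e-12
           for i in range(2) for j in range(2))
```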

98 Practice problems Example 9. Let T: R R be the map which rotates a vector counter-clockwise by angle θ. Which matrix A represents T with respect to the standard bases? Verify that T(x)=Ax. Solution. Only keep reading if you need a hint! The first basis vector gets send to cosθ sinθ Hence, the first column of A is. 5

99 Review A linear map T:V W satisfies T(cx+dy)=cT(x)+dT(y). T: R n R m defined by T(x)=Ax is linear. (A an m n m atrix) A is the matrix representing T w.r.t. the standard bases For instance: T(e )=Ae = st column of A e = Let x,,x n be a basis for V, and y,,y m a basis for W. The matrix representing T w.r.t. these bases encodes in column j the coefficients of T(x j ) expressed as a linear combination of y,,y m. For instance: let T: R R be reflection through the x-y-plane, that is, (x,y,z) (x,y, z). The matrix representing T w.r.t. the basis,, is. T T ( ( ) ) = = =, T ( + ) = Example. Let T: P P be the linear map given by T(p(t))= d dt p(t). What is the matrix A representing T with respect to the standard bases? Solution. The bases are The matrix A has columns and rows., t, t, t for P,, t, t for P.

100 The first column encodes T()= and hence is For the second column, T(t)= and hence it is For the third column, T(t )=t and hence it is For the last column, T(t )=t and hence it is In conclusion, the matrix representing T is.... A=. Note: By the way, what is the null space of A? The null space has basis. The corresponding polynomial is p(t)=. No surprise here: differentation kills precisely the constant polynomials. Note: Let us differentiate 7t t+ using the matrix A. First: 7t t+ w.r.t. standard basis: 7 = in the standard basis is +t. 7.

101 Orthogonality

The inner product and distances

Definition. The inner product (or dot product) of v, w in R^n:
v · w = v^T w = v_1 w_1 + … + v_n w_n.
Because we can think of this as a special case of the matrix product, it satisfies the basic rules like associativity and distributivity. In addition: v · w = w · v.

Example. For instance, = = 6= 7.

Definition. The norm (or length) of a vector v in R^n is
‖v‖ = √(v · v) = √(v_1² + … + v_n²).
This is the distance to the origin. The distance between points v and w in R^n is
dist(v, w) = ‖v − w‖.

Example 5. For instance, in R^2,
dist([x_1; y_1], [x_2; y_2]) = ‖[x_1 − x_2; y_1 − y_2]‖ = √((x_1 − x_2)² + (y_1 − y_2)²).

102 Orthogonal vectors

Definition 6. v and w in R^n are orthogonal if v · w = 0.

How is this related to our understanding of right angles? Pythagoras:
v and w are orthogonal
⇔ ‖v‖² + ‖w‖² = ‖v − w‖²
⇔ v · v + w · w = (v − w) · (v − w) = v · v − 2 v · w + w · w
⇔ v · w = 0

Example 7. Are the following vectors orthogonal?
(a) The dot product works out to 0. So, yes, they are orthogonal.
(b) The dot product is nonzero. So not orthogonal.
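The inner product, norm, distance, and the orthogonality criterion above translate into a few lines of Python (a sketch added for experimentation; the sample vectors are hypothetical, not the ones from class):

```python
import math

def dot(v, w):
    """Inner product v . w = v1*w1 + ... + vn*wn."""
    return sum(x * y for x, y in zip(v, w))

def norm(v):
    """Length of v: sqrt(v . v)."""
    return math.sqrt(dot(v, v))

def dist(v, w):
    """Distance between points v and w: |v - w|."""
    return norm([x - y for x, y in zip(v, w)])

assert dot([1, 2, 3], [4, 5, 6]) == 32
assert norm([3, 4]) == 5.0
assert dist([1, 1], [4, 5]) == 5.0

# Pythagoras: for orthogonal v, w we have |v|^2 + |w|^2 = |v - w|^2.
v, w = [1, 2], [-2, 1]
assert dot(v, w) == 0
assert abs(norm(v) ** 2 + norm(w) ** 2 - dist(v, w) ** 2) < 1e-12
```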

103 Review v w=v T w=v w + +v n w n, the inner product of v, w in R n Length of v: v = v v = v + +v n Distance between points v and w: v w v and w in R n are orthogonal if v w=. This simple criterion is equivalent to Pythagoras theorem. Example. The vectors, are orthogonal to each other, and have length., We are going to call such a basis orthonormal soon. Theorem. Suppose thatv,,v n are nonzero and (pairwise) orthogonal. Thenv,, v n are independent. Proof. Suppose that c v + +c n v n =. Take the dot product of v with both sides: =v (c v + +c n v n ) =c v v +c v v + +c n v v n =c v v =c v But v and hence c =. Likewise, we find c =,, c n =. Hence, the vectors are independent. Example. Let us consider A= 6 Find Nul(A) and Col(A T ). Observe!. Solution. Nul(A)=span{ Col(A T )=span{ } } The two basis vectors are orthogonal! = can you see it? if not, do it!

104 Example. Repeat for A= Solution. Nul(A) = span Col(A T )=span { { 6 }, 6 } Again, the vectors are orthogonal! Note: Because in the row space.. > RREF The vectors form a basis. =, =. is orthogonal to both basis vectors, it is orthogonal to every vector Vectors in Nul(A) are orthogonal to vectors in Col(A T ). The fundamental theorem, second act Definition 5. Let W be a subspace of R n, and v in R n. v is orthogonal to W, if v w= for all w in W. ( v is orthogonal to each vector in a basis of W ) Another subspace V is orthogonal to W, if every vector in V is orthogonal to W. The orthogonal complement of W is the space W of all vectors that are orthogonal to W. Exercise: show that the orthogonal complement is indeed a vector space. Example 6. In the previous example, A= We found that 6. Nul(A) = span, Col(AT )=span,

105 are orthogonal subspaces. Indeed, Nul(A) and Col(A T ) are orthogonal complements. Why? Because,, are orthogonal, hence independent, and hence a basis of all of R. Remark 7. Recall that, for an m n matrix A, Nul(A) lives in R n and Col(A) lives in R m. Hence, they cannot be related in a similar way. In the previous example, they happen to be both subspaces of R : Nul(A) = span, Col(A)=span, But these spaces are not orthogonal: Theorem 8. (Fundamental Theorem of Linear Algebra, Part I) Let A be an m n matrix of rank r. dim Col(A)=r (subspace of R m ) dim Col(A T )=r (subspace of R n ) dim Nul(A)=n r (subspace of R n ) dim Nul(A T )=m r (subspace of R m ) Theorem 9. (Fundamental Theorem of Linear Algebra, Part II) Nul(A) is orthogonal to Col(A T ). (both subspaces of R n ) Note that dim Nul(A)+dim Col(A T )=n. Hence, the two spaces are orthogonal complements in R n. Nul(A T ) is orthogonal to Col(A). Again, the two spaces are orthogonal complements.

106 Review v and w in R n are orthogonal if v w=v w + +v n w n =. This simple criterion is equivalent to Pythagoras theorem. Nonzero orthogonal vectors are independent. Nul { } = span, Col { } T = span 6 6 Fundamental Theorem of Linear Algebra: (A an m n matrix) Nul(A) is orthogonal to Col(A T ). (both subspaces of R n ) Moreover, dim Col(A T ) = r (rank of A) + dim Nul(A) =n = n r Hence, they are orthogonal complements in R n. Nul(A T ) and Col(A) are orthogonal complements. (in R m ) Nul(A) is orthogonal to Col(A T ). Why? Suppose that x is in Nul(A). That is, Ax=. But think about what Ax= means (row-column rule). It means that the inner product of every row with x is zero. But that implies that x is orthogonal to the row space. Example. Find all vectors orthogonal to and. Solution. (FTLA, no thinking) In other words: ) find the orthogonal complement of Col ) FTLA: this is Nul which has basis: } span { (. T ( = Nul( are the vectors orthogonal to ),. and.

107 Solution. (a little thinking) The FTLA is not magic! x= and x= x= ( ) x in Nul This is the same null space we obtained from the FTLA. Example. Let V = { a b c } : a+b=c. Find a basis for the orthogonal complement of V. Solution. (FTLA, no thinking) We note that V = Nul( ). FTLA: the orthogonal complement is Col( T ). Basis for the orthogonal complement: Solution. (a little thinking) a+b=c a b c =. So: V is actually defined as the orthogonal complement of span { }. A new perspective on Ax=b Ax=b is solvable b is in Col(A) b is orthogonal to Nul(A T ) ( direct approach) ( indirect approach) The indirect approach means: if y T A= then y T b=.

108 Example. Let A= Solution. (old) b b 5 b 5. For which b does Ax=b have a solution? b 5 b +b 5 b So, Ax=b is consistent b +b +b =. b 5 b +b b +b +b Solution. (new) Ax=b solvable b orthogonal to Nul(A T ) to find Nul(A T ): We conclude that Nul(A T ) has basis Ax=b is solvable b 5. =. As above! RREF > Motivation Example. Not all linear systems have solutions. In fact, for many applications, data needs to be fitted and there is no hope for a perfect match. For instance, Ax=b with has no solution: x= is not in Col(A)=span { Instead of giving up, we want the x which makes Ax and b as close as possible. } b Ax Such x is characterized by Ax being orthogonal to the error b Ax (see picture!)

109 Application: directed graphs Graphs appear in network analysis (e.g. internet) or circuit analysis. arrow indicates direction of flow no edges from a node to itself at most one edge between nodes 5 Definition 5. Let G be a graph with m edges and n nodes. The edge-node incidence matrix of G is the m n matrix A with, if edge i leaves node j, A i,j = +, if edge i enters node j,, otherwise. Example 6. Give the edge-node incidence matrix of our graph. Solution. 5 A= each column represents a node each row represents an edge
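Building the edge-node incidence matrix is mechanical, and a short Python sketch makes the definition concrete (this code and the 4-node example graph are illustrative additions, not the graph from the slides):

```python
def incidence_matrix(edges, n):
    """Edge-node incidence matrix of a directed graph with nodes 1..n.
    Row i has -1 at the tail of edge i and +1 at its head."""
    A = []
    for tail, head in edges:
        row = [0] * n
        row[tail - 1] = -1
        row[head - 1] = 1
        A.append(row)
    return A

# A hypothetical 4-node graph with 5 directed edges:
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
A = incidence_matrix(edges, 4)
assert len(A) == 5 and len(A[0]) == 4     # one row per edge, one column per node
assert all(sum(row) == 0 for row in A)    # each row has exactly one -1 and one +1
```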

110 Review A graph G can be encoded by the edge-node incidence matrix: A= each column represents a node 5 each row represents an edge If G has m edges and n nodes, then A is the m n matrix with, if edge i leaves node j, A i,j = +, if edge i enters node j,, otherwise. Meaning of the null space The x in Ax is assigning values to each node. You may think of assigning potentials to each node. x x x x = x +x x +x x +x x +x x +x So: Ax= nodes connected by an edge are assigned the same value 5

111 For our graph: Nul(A) has basis This always happens as long as the graph is connected. Example. Give a basis for Nul(A) for the following graph.. Solution. If Ax = then x = x (connected by edge) and x = x (connected by edge). Nul(A) has the basis:, Just to make sure: the edge-node incidence matrix is: A= In general: dim Nul(A) is the number of connected subgraphs. For large graphs, disconnection may not be apparent visually.. But we can always find out by computing dim Nul(A) using Gaussian elimination! Meaning of the left null space The y in y T A is assigning values to each edge. You may think of assigning currents to each edge. A=, AT = y y y y y 5 = y y y y y y +y y 5 y +y 5 5

112 So: A T y= at each node, (directed) values assigned to edges add to zero When thinking of currents, this is Kirchhoff s first law. (at each node, incoming and outgoing currents balance) What is the simplest way to balance current? Assign the current in a loop! Here, we have two loops: edge, edge, edge and edge, edge 5, edge. Correspondingly, and are in Nul(AT ). Check! Example. Suppose we did not see this. Let us solve A T y= for our graph: > RREF The parametric solution is So, a basis for Nul(A T ) is y y 5 y +y 5 y y 5 y 5, Observe that these two basis vectors correspond to loops. Note that we get the simpler loop as In general: dim Nul(A T ) is the number of (independent) loops. For large graphs, we now have a nice way to computationally find all loops.
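The two counts just discussed, dim Nul(A) = number of connected pieces and dim Nul(A^T) = number of independent loops, can be computed from the edge list alone. A Python sketch (an added illustration with a hypothetical graph; union-find is one standard way to count components, and the loop count then follows from the rank formula m − r = m − (n − #components)):

```python
def components(edges, n):
    """Number of connected pieces of the graph on nodes 1..n (edges taken
    as undirected), via union-find. Equals dim Nul(A)."""
    parent = list(range(n + 1))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for tail, head in edges:
        parent[find(tail)] = find(head)
    return len({find(i) for i in range(1, n + 1)})

def independent_loops(edges, n):
    """dim Nul(A^T) = m - rank(A) = m - (n - #components)."""
    m = len(edges)
    return m - (n - components(edges, n))

# Hypothetical connected graph: 4 nodes, 5 edges, so 5 - (4 - 1) = 2 loops.
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
assert components(edges, 4) == 1
assert independent_loops(edges, 4) == 2
```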

113 Practice problems Example. Give a basis for Nul(A T ) for the following graph. Example. Consider the graph with edge-node incidence matrix A=. (a) Draw the corresponding directed graph with numbered edges and nodes. (b) Give a basis for Nul(A) and Nul(A T ) using properties of the graph.

114 Solutions to practice problems Example 5. Give a basis for Nul(A T ) for the following graph. Solution. This graph contains no loops, so Nul(A T )={}. Nul(A T ) has the empty set as basis (no basis vectors needed). For comparison: the edge-node incidence matrix A= indeed has Nul(A T )={}. Example 6. Consider the graph with edge-node incidence matrix A=. Give a basis for Nul(A) and Nul(A T ). Solution. If Ax=, then x =x =x =x (all connected by edges). Nul(A) has the basis:. The graph is connected, so only connected subgraph and dim Nul(A)=. The graph has one loop: edge, edge, edge Assign values y =, y =, y = along the edges of that loop. Nul(A T ) has the basis:. The graph has loop, so dim Nul(A T )=. 5

115 Review for Midterm As of yet unconfirmed: final exam on Friday, December, 7 pm conflict exam on Monday, December 5, 7 pm Directed graphs Go from directed graph to edge-node incidence matrix A and vice versa. Basis for Nul(A) from connected subgraphs. For each connected subgraph, get a basis vector x that assigns to all nodes in that subgraph, and to all other nodes. Basis for Nul(A T ) from (independent) loops. For each (independent) loop, get a basis vector y that assigns and (depending on direction) to the edges in that loop, and to all other edges. Example. Basis for Nul(A): Basis for Nul(A T ):, 5 Fundamental notions Vectors v,,v n are independent if the only linear relation c v + +c n v n = is the one with c =c = =c n =. How to check for independence? The columns of a matrix A are independent Nul(A)={}. Vectors v,,v n in V are a basis for V if they span V, that is V = span{v,,v n }, and they are independent. In that case, V has dimension n. Vectors v,w in R m are orthogonal if v w=v w + +v m w m =.

116 Subspaces From an echelon form of A, we get bases for: Nul(A) by solving Ax= Col(A) by taking the pivot columns of A Col(A T ) by taking the nonzero rows of the echelon form Example. A= > RREF 5 Basis for Col(A): Basis for Col(A T ): Basis for Nul(A):,,, Dimension of Nul(A T ): 5 5 The solutions to Ax=b are given by x p + Nul(A). The fundamental theorem states that Nul(A) and Col(A T ) are orthogonal complements So: dim Nul(A)+dim Col(A T )=n (number of columns of A) Nul(A T ) and Col(A) are orthogonal complements So: dim Nul(A T )+dim Col(A)=m (number of rows of A) In particular, if r= rank(a) (nr of pivots): dim Col(A)=r dim Col(A T )=r dim Nul(A)=n r dim Nul(A T )=m r Example. Consider the following subspaces of R : (a) V = : a+b=, a+b+d= a b c d

117 (b) V = a+b c b a+c c : a,b,c in R In each case, give a basis for V and its orthogonal complement. Try to immediately get an idea what the dimensions are going to be! Solution. First step: express these subspaces as one of the four subspaces of a matrix. ) (a) V = Nul( (b) V = Col Give a basis for each. (a) row reductions: basis for V : (b) row reductions: basis for V :, 5 (no need to continue; we already see that the columns are independent),, Use the fundamental theorem to find bases for the orthogonal complements. (a) V = Col( T ) note the two rows are clearly independent. basis for V : (b) V = Nul row reductions: basis for V :, /5 /5 /5 T = Nul ( 5 ) /5 /5 /5

118 Example. What does it mean for Ax=b if Nul(A)={}? Solution. It means that if there is a solution, then it is unique. That s because all solutions to Ax=b are given by x p + Nul(A). Linear transformations Example 5. Let T: R R be the linear map represented by the matrix with respect to the bases ) (a) What is T(?, of R and,, of R. (b) Which matrix represents T with respect to the standard bases? Solution. The matrix tells us that: T ( ( T ) ) = = = 5 = (a) Note that Hence, T( (b) Note that Hence, T( = + ) =T( = + ) =T( We already know that T( So, T is represented by. ) +T(. ) +T( 6 5 ) = 5 ) =. ) = = = with respect to the standard bases..

119 Check your understanding Think about why each of these statements is true! Ax=b has a solution x if and only if b is in Col(A). That s because Ax are linear combinations of the columns of A. A and A T have the same rank. Recall that the rank of A (number of pivots of A) equals dim Col(A). So this is another way of saying that dim Col(A)=dim Col(A T ). The columns of an n n matrix are independent if and only if the rows are. Let r be the rank of A, and let A be m n for now. The columns are independent r=n (so that dim Nul(A)=). But also: the rows are independent r=m. In the case m=n, these two conditions are equivalent. Ax=b has a solution x if and only if b is orthogonal to Nul(A T ). This follows from Ax=b has a solution x if and only if b is in Col(A) together with the fundamental theorem, which says that Col(A) is the orthogonal complement of Nul(A T ). The rows of A are independent if and only if Nul(A T )={}. Recall that elements of Nul(A) correspond to linear relations between the columns of A. Likewise, elements of Nul(A T ) correspond to linear relations between the rows of A. 5

120 We can deal with complicated linear systems, but what to do if there is no solutions and we want a best approximate solution? This is important for many applications, including fitting data. Suppose Ax=b has no solution. This means b is not in Col(A). Idea: find best approximate solution by replacing b with its projection onto Col(A). Recall: if v,,v n are (pairwise) orthogonal: v (c v + +c n v n ) = c v v Implies: the v,,v n are independent (unless one is the zero vector) Orthogonal bases Definition. A basis v,, v n of a vector space V is an orthogonal basis if the vectors are (pairwise) orthogonal. Example. The standard basis Example. Are the vectors Solution.,,,, is an orthogonal basis for R. an orthogonal basis for R?

121 = = So this is an orthogonal basis. = Note that we do not need to check that the three vectors are independent. That follows from their orthogonality. Example. Suppose v,,v n is an orthogonal basis of V, and that w is in V. Find c,,c n such that w=c v + +c n v n. Solution. Take the dot product of v with both sides: v w =v (c v + +c n v n ) =c v v +c v v + +c n v v n =c v v Hence, c = v w v v. In general, c j = v j w v j v j. If v,,v n is an orthogonal basis of V, and w is in V, then Example 5. Express w=c v + +c n v n with c j = w v j v j v j. 7 in terms of the basis,,. Solution. 7 =c 7 = = +c +c

122 Definition 6. A basis v,, v n of a vector space V is an orthonormal basis if the vectors are orthogonal and have length. Example 7. The standard basis,, is an orthonormal basis for R. If v,,v n is an orthonormal basis of V, and w is in V, then Example 8. Express w=c v + +c n v n with c j =v j w. 7 Solution. That s trivial, of course: in terms of the basis 7 But note that the coefficients are 7 Example 9. Is the basis = =, 7 to produce an orthonormal basis. Solution. has length has length has length, The corresponding orthonormal basis is, +7, + =7, 7,. =. orthonormal? If not, normalize the vectors = normalized: = normalized: = is already normalized:,,.

123 Example. Express the vector w from before in terms of the orthonormal basis of the previous example.
Solution. Since the basis is orthonormal, the coefficients are simply c_j = v_j · w, and we obtain the same expansion as in Example 5.

Orthogonal projections

[Figure: a vector x, its projection x̂ onto y, and the error x⊥ = x − x̂.]

Definition. The orthogonal projection of a vector x onto a vector y is
x̂ = (x · y)/(y · y) y.
The vector x̂ is the closest vector to x which is in span{y}.
It is characterized by: the error x⊥ = x − x̂ is orthogonal to span{y}.
To find the formula for x̂, start with x̂ = cy. Then
(x − x̂) · y = (x − cy) · y = x · y − c y · y = 0.
It follows that c = (x · y)/(y · y), as wanted.
x⊥ is also called the component of x orthogonal to y.

Example. What is the orthogonal projection of the vector x (first entry 8) onto y?
Solution. x̂ = (x · y)/(y · y) y, computed with the numbers from class. The component of x orthogonal to y is x − x̂. (Note that, indeed, x̂ and x − x̂ are orthogonal.)
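The projection formula is two lines of Python. A sketch (added here; the vectors x and y below are hypothetical stand-ins for the class example):

```python
def dot(v, w):
    return sum(x * y for x, y in zip(v, w))

def project_onto(x, y):
    """Orthogonal projection of x onto the line spanned by y:
    xhat = (x.y)/(y.y) * y."""
    c = dot(x, y) / dot(y, y)
    return [c * yi for yi in y]

# Hypothetical vectors:
x, y = [8, 4], [2, 0]
xhat = project_onto(x, y)
err = [a - b for a, b in zip(x, xhat)]   # component of x orthogonal to y
assert xhat == [8.0, 0.0]
assert dot(err, y) == 0                  # the error is orthogonal to y
```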

124 Example. What are the orthogonal projections of,,? Solution. on on on : + ( )+ +( ) + = : = : Note that these sum up to = + + = That s because the three vectors are an orthogonal basis for R.! onto each of the vectors Recall: If v,,v n is an orthogonal basis of V, and w is in V, then w=c v + +c n v n with c j = w v j v j v j. w decomposes as the sum of its projections onto each basis vector 5

125 Comments on midterm Suppose V is a vector space, and you are asked to give a basis. CORRECT: V has basis CORRECT: V has basis OK: V = span {, {, }, (but you really should point out that the two vectors are independent) } INCORRECT: V = { INCORRECT: basis = Review, } x x ˆx y Orthogonal projection of x onto y: xˆ= x y y y y. Error x =x xˆ is orthogonal to y. If y,, y n is an orthogonal basis of V, and x is in V, then x=c y + +c n y n with c j = x y j y j y j. x decomposes as the sum of its projections onto each vector in the orthogonal basis. Example. Express x in terms of the basis y, y, Solution. Note that y,y,y is an orthogonal basis of R. y.

126 =c = +c projection of x onto y = + +c + + projection of x onto y + projection of x onto y Orthogonal projection on subspaces Theorem. LetW be a subspace of R n. Then, eachxin R n can be uniquely written as x= xˆ in W + x. in W x x v v ˆx xˆ is the orthogonal projection of x onto W. xˆ is the point in W closest to x. For any other y in W, dist(x,xˆ)<dist(x,y). If v,,v m is an orthogonal basis of W, then Once xˆ is determined, x =x xˆ. ( ) ( ) x v x vm xˆ= v + + v m. v v v m v m (This is also the orthogonal projection of x onto W.) Example. Let W = span {, }, and x= Find the orthogonal projection of x onto W. (or: find the vector in W which is closest to x) Write x as a vector in W plus a vector orthogonal to W..

127 Solution. Note that w = and w = are an orthogonal basis for W. We will soon learn how to construct orthogonal bases ourselves. Hence, the orthogonal projection of x onto W is: xˆ= x w w w w + x w w w w = = + + xˆ is the vector in W which best approximates x. = Orthogonal projection of x onto the orthogonal complement of W : x = Note: Indeed, 9 = 9. Hence, x= is orthogonal to w = = in W +. 9 in W and w =. Definition. Let v,,v m be an orthogonal basis of W, a subspace of R n. Note that the projection map π W : R n R n, given by ( ) ( ) x x v x vm xˆ= v + + v m v v v m v m is linear. The matrix P representing π W with respect to the standard basis is the corresponding projection matrix. Example 5. Find the projection } matrix P which corresponds to orthogonal projection onto W = span {, Solution. Standard basis : in R.,, The first column of P encodes the projection of + =.. Hence P = : 9.

128 The second column of P encodes the projection of + =. Hence P = The third column of P encodes the projection of + Example 6. (again) = Find the orthogonal projection of x = Solution. xˆ=px= 9. Hence P = = : 9. : 9 onto W = span. {, }, as in the previous example. Example 7. Compute P for the projection matrix we just found. Explain!. Solution. 9 9 = 9 Projecting a second time does not change anything anymore. Practice problems Example 8. Find the closest point to x in span{v,v }, where x=, v =, v = Solution. This is the orthogonal projection of x onto span{v,v }..
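Building P column by column, as above, is the same as summing the rank-one pieces v v^T/(v · v) over the orthogonal basis. A Python sketch (an added illustration; the basis below is a hypothetical orthogonal basis of a coordinate plane, not the W from the example) that also confirms P² = P:

```python
def proj_matrix(basis):
    """Projection matrix P = sum over the orthogonal basis of v v^T / (v.v);
    then P x is the orthogonal projection of x onto W = span(basis)."""
    n = len(basis[0])
    P = [[0.0] * n for _ in range(n)]
    for v in basis:
        vv = sum(x * x for x in v)
        for i in range(n):
            for j in range(n):
                P[i][j] += v[i] * v[j] / vv
    return P

def matvec(P, x):
    return [sum(P[i][j] * x[j] for j in range(len(x))) for i in range(len(x))]

# Hypothetical orthogonal basis of a plane W in R^3:
P = proj_matrix([[1, 0, 0], [0, 1, 0]])
assert matvec(P, [3, 4, 5]) == [3.0, 4.0, 0.0]

# Projecting a second time changes nothing: P^2 = P.
PP = [[sum(P[i][k] * P[k][j] for k in range(3)) for j in range(3)]
      for i in range(3)]
assert all(abs(PP[i][j] - P[i][j]) < 1e-12 for i in range(3) for j in range(3))
```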

129 Review

Let W be a subspace of R^n, and x in R^n (but maybe not in W).
[Figure: x, its projection x̂ onto W, and the error x − x̂.]
Let x̂ be the orthogonal projection of x onto W. (the vector in W as close as possible to x)
If v_1, …, v_m is an orthogonal basis of W, then
x̂ = (x · v_1)/(v_1 · v_1) v_1 + … + (x · v_m)/(v_m · v_m) v_m
(the sum of the projections of x onto each v_j). The decomposition
x = x̂ [in W] + (x − x̂) [in W⊥]
is unique.

Least squares

Definition. x̂ is a least squares solution of the system Ax = b if x̂ is such that ‖Ax̂ − b‖ is as small as possible.
If Ax = b is consistent, then a least squares solution x̂ is just an ordinary solution. (In that case, Ax̂ − b = 0.)
Interesting case: Ax = b is inconsistent. (In other words: the system is overdetermined.)
Idea. Ax = b is consistent ⇔ b is in Col(A).
So, if Ax = b is inconsistent, we replace b with its projection b̂ onto Col(A), and solve Ax̂ = b̂. (consistent by construction!)
[Figure: b, Col(A), and the error b − Ax̂.]

130 Example. Find the least squares solution to Ax = b for the matrix A and vector b from class.
Solution. Note that the columns of A are orthogonal. (Otherwise, we could not proceed in the same way.) Hence, the projection b̂ of b onto Col(A) is the sum of the projections of b onto the two columns. We have already solved Ax̂ = b̂ in the process: the coefficients of those projections are the entries of x̂.

The normal equations

The following result provides a straightforward recipe (thanks to the FTLA) to find least squares solutions for any matrix. The previous example was only simple because the columns of A were orthogonal.

Theorem. x̂ is a least squares solution of Ax = b
⇔ A^T A x̂ = A^T b (the normal equations)

Proof. x̂ is a least squares solution of Ax = b
⇔ ‖Ax̂ − b‖ is as small as possible
⇔ Ax̂ − b is orthogonal to Col(A)
⇔ Ax̂ − b is in Nul(A^T) [by the FTLA]
⇔ A^T (Ax̂ − b) = 0
⇔ A^T A x̂ = A^T b

131 Example. (again) Find the least squares solution to Ax=b, where A=, b=. Solution. A T A= A T b= = = The normal equations A T Axˆ=A T b are Solving, we find (again) xˆ= / /. xˆ=. Example 5. Find the least squares solution to Ax=b, where What is the projection of b onto Col(A)? A=, b=. Solution. A T A= A T b= 7 = 5 9 = The normal equations A T Axˆ=A T b are xˆ=. Solving, we find xˆ= The projection of b onto Col(A) is Axˆ=. =. Just to make sure: why is Axˆ the projection of b onto Col(A)? Because, for a least squares solution xˆ, Axˆ b is as small as possible.
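For a matrix with two columns, the normal equations are a 2×2 system, which Cramer's rule solves directly. A Python sketch (an added illustration; the system below is a hypothetical inconsistent example, not the one from class):

```python
def lstsq_2col(A, b):
    """Least squares solution of Ax = b for a matrix A with two columns,
    via the normal equations A^T A xhat = A^T b (solved by Cramer's rule)."""
    col1 = [row[0] for row in A]
    col2 = [row[1] for row in A]
    dot = lambda u, v: sum(x * y for x, y in zip(u, v))
    # entries of A^T A and A^T b
    a11, a12, a22 = dot(col1, col1), dot(col1, col2), dot(col2, col2)
    b1, b2 = dot(col1, b), dot(col2, b)
    det = a11 * a22 - a12 * a12
    return [(b1 * a22 - b2 * a12) / det, (a11 * b2 - a12 * b1) / det]

# Hypothetical inconsistent system:
A = [[1, 0], [0, 1], [1, 1]]
b = [1, 1, 1]
xhat = lstsq_2col(A, b)
assert all(abs(c - 2 / 3) < 1e-12 for c in xhat)
```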

132 The projection b̂ of b onto Col(A) is b̂ = Ax̂, with x̂ such that A^T A x̂ = A^T b.
If A has full column rank, this is b̂ = A (A^T A)^(-1) A^T b.
Hence, the projection matrix for projecting onto Col(A) is
P = A (A^T A)^(-1) A^T. (columns of A independent)

Application: least squares lines

Experimental data: (x_i, y_i). Wanted: parameters β_0, β_1 such that y_i ≈ β_0 + β_1 x_i for all i.
[Plot: data points and a fitted line.]
This approximation should be such that
SS_res = Σ_i (y_i − (β_0 + β_1 x_i))²
is as small as possible. (residual sum of squares)

Example 6. Find β_0, β_1 such that the line y = β_0 + β_1 x best fits the data points (2,1), (5,2), (7,3), (8,3).
Solution. The equations y_i = β_0 + β_1 x_i in matrix form:
[1 x_1; 1 x_2; 1 x_3; 1 x_4] (design matrix X) · [β_0; β_1] = [y_1; y_2; y_3; y_4] (observation vector y)

133 Here, we need to find a least squares solution to Xβ = y, where
X = [1 2; 1 5; 1 7; 1 8], y = [1; 2; 3; 3].
X^T X = [4 22; 22 142], X^T y = [9; 57].
Solving X^T X β̂ = X^T y, that is [4 22; 22 142] [β_0; β_1] = [9; 57], we find
β̂ = [β_0; β_1] = [2/7; 5/14].
Hence, the least squares line is y = 2/7 + (5/14) x.
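Because the design matrix has columns (1, …, 1) and (x_1, …, x_n), the normal equations reduce to sums of x_i, y_i, x_i², x_i y_i. A Python sketch of the fit (an added illustration; the test data below lies exactly on a hypothetical line y = 1 + 2x rather than being the class data):

```python
def fit_line(points):
    """Fit y = b0 + b1*x by least squares: solve the normal equations
    X^T X [b0, b1] = X^T y for the design matrix X with rows [1, x_i]."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    det = n * sxx - sx * sx
    b0 = (sy * sxx - sx * sxy) / det
    b1 = (n * sxy - sx * sy) / det
    return b0, b1

# Hypothetical data lying exactly on y = 1 + 2x:
b0, b1 = fit_line([(0, 1), (1, 3), (2, 5)])
assert abs(b0 - 1) < 1e-12 and abs(b1 - 2) < 1e-12
```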

134 [Figure: a circle of perimeter π inside a square of perimeter 4; folding in the corners repeatedly gives staircase curves, each still of perimeter 4, that appear to approach the circle, suggesting the bogus conclusion π = 4.]
Happy Halloween!

135 Review

x̂ is a least squares solution of the system Ax = b
⇔ x̂ is such that ‖Ax̂ − b‖ is as small as possible
⇔ A^T A x̂ = A^T b (the normal equations) [via the FTLA]

Application: least squares lines

Example. Find β_0, β_1 such that the line y = β_0 + β_1 x best fits the data points (2,1), (5,2), (7,3), (8,3).
[Plot: the four data points and the least squares line.]
Comment. As usual in practice, we are minimizing the (sum of squares of the) vertical offsets.
Solution. The equations y_i = β_0 + β_1 x_i in matrix form:

136 [1 x_1; 1 x_2; 1 x_3; 1 x_4] (design matrix X) · [β_0; β_1] = [y_1; y_2; y_3; y_4] (observation vector y)
Here, we need to find a least squares solution to Xβ = y, where
X = [1 2; 1 5; 1 7; 1 8], y = [1; 2; 3; 3].
X^T X = [4 22; 22 142], X^T y = [9; 57].
Solving X^T X β̂ = X^T y, we find
β̂ = [β_0; β_1] = [2/7; 5/14].
Hence, the least squares line is y = 2/7 + (5/14) x.
[Plot: the data points and the line y = 2/7 + (5/14) x.]
How well does the line fit the data (2,1), (5,2), (7,3), (8,3)? How small is the sum of squares of the vertical offsets?

137 residual sum of squares: SS_res = Σ_i (y_i − (β_0 + β_1 x_i))², summing the squared error at each (x_i, y_i).
The choice of β_0, β_1 from least squares makes SS_res as small as possible.
total sum of squares: SS_tot = Σ_i (y_i − ȳ)², where ȳ = (1/n) Σ y_i is the mean of the observed data.
coefficient of determination: R² = 1 − SS_res/SS_tot.
General rule: the closer R² is to 1, the better the regression line fits the data.
Here, for the data (2,1), (5,2), (7,3), (8,3), we have ȳ = 9/4 and
R² = 1 − [(1 − 1)² + (2 − 29/14)² + (3 − 39/14)² + (3 − 22/7)²] / [(1 − 9/4)² + (2 − 9/4)² + (3 − 9/4)² + (3 − 9/4)²]
= 1 − (1/14)/(11/4) = 75/77 ≈ 0.97,
very close to 1: good fit.

Other curves

We can also fit the experimental data (x_i, y_i) using other curves.
Example. y_i ≈ β_0 + β_1 x_i + β_2 x_i² with parameters β_0, β_1, β_2. The equations y_i = β_0 + β_1 x_i + β_2 x_i² in matrix form:
[1 x_1 x_1²; 1 x_2 x_2²; …] (design matrix X) · [β_0; β_1; β_2] = [y_1; y_2; …] (observation vector y)
Given data (x_i, y_i), we then find the least squares solution to Xβ = y.

Multiple linear regression

In statistics, linear regression is an approach for modeling the relationship between a scalar dependent variable and one or more explanatory variables. The case of one explanatory variable is called simple linear regression. For more than one explanatory variable, the process is called multiple linear regression.
The experimental data might be of the form (v_i, w_i, y_i), where now the dependent variable y_i depends on two explanatory variables v_i, w_i (instead of just one x_i).
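The quantities SS_res, SS_tot and R² are easy to compute once a line has been fitted. A Python sketch (an added illustration with hypothetical data; a perfect fit yields R² = 1, a flat line through the mean yields R² = 0):

```python
def r_squared(points, b0, b1):
    """Coefficient of determination R^2 = 1 - SS_res / SS_tot for the
    fitted line y = b0 + b1*x."""
    ybar = sum(y for _, y in points) / len(points)
    ss_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in points)
    ss_tot = sum((y - ybar) ** 2 for _, y in points)
    return 1 - ss_res / ss_tot

# Hypothetical data lying exactly on y = 1 + 2x: a perfect fit has R^2 = 1.
assert r_squared([(0, 1), (1, 3), (2, 5)], 1, 2) == 1.0
```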

138 Example. Fitting a linear relationship y_i ≈ β_0 + β_1 v_i + β_2 w_i, we get:
[1 v_1 w_1; 1 v_2 w_2; 1 v_3 w_3; …] (design matrix) · [β_0; β_1; β_2] = [y_1; y_2; y_3; …] (observation vector)
And we again proceed by finding a least squares solution.

Review

[Figure: x, its projection x̂ onto the subspace W spanned by v_1, v_2.]
Suppose v_1, …, v_m is an orthonormal basis of W. The orthogonal projection of x onto W is:
x̂ = ⟨x, v_1⟩ v_1 + … + ⟨x, v_m⟩ v_m
(the sum of the projections of x onto each basis vector). (To stay agile, we are writing ⟨x, v⟩ = x · v for the inner product.)

Gram-Schmidt

Example 4. Find an orthonormal basis for V = span{ , , }.

Recipe. (Gram-Schmidt orthonormalization) Given a basis a_1, …, a_n, produce an orthonormal basis q_1, …, q_n.
b_1 = a_1, q_1 = b_1/‖b_1‖
b_2 = a_2 − ⟨a_2, q_1⟩ q_1, q_2 = b_2/‖b_2‖
b_3 = a_3 − ⟨a_3, q_1⟩ q_1 − ⟨a_3, q_2⟩ q_2, q_3 = b_3/‖b_3‖
…
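The recipe above translates directly into code. A Python sketch of Gram-Schmidt (an added illustration; each b_j is a_j minus its projections onto the earlier q's, then normalized, exactly as in the recipe):

```python
import math

def gram_schmidt(vectors):
    """Orthonormalize a list of independent vectors: b_j = a_j minus its
    projections onto the earlier q's, then q_j = b_j / |b_j|."""
    qs = []
    for a in vectors:
        b = list(a)
        for q in qs:
            c = sum(x * y for x, y in zip(a, q))        # <a, q>
            b = [bi - c * qi for bi, qi in zip(b, q)]   # subtract projection
        nb = math.sqrt(sum(x * x for x in b))
        qs.append([x / nb for x in b])
    return qs

q1, q2 = gram_schmidt([[1, 0], [1, 1]])
assert q1 == [1.0, 0.0] and q2 == [0.0, 1.0]
```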

139 Example 5. Find an orthonormal basis for V = span,,. Solution. b =, q = b =,q q =, q = b =,q q,q q =, q = We have obtained an orthonormal basis for V :,,. 6

140 Review

Vectors q_1, …, q_n are orthonormal if
q_i^T q_j = 0 if i ≠ j, and 1 if i = j.

Gram-Schmidt orthonormalization:
Input: basis a_1, …, a_n for V.
Output: orthonormal basis q_1, …, q_n for V.
b_1 = a_1, q_1 = b_1/‖b_1‖
b_2 = a_2 − ⟨a_2, q_1⟩ q_1, q_2 = b_2/‖b_2‖
b_3 = a_3 − ⟨a_3, q_1⟩ q_1 − ⟨a_3, q_2⟩ q_2, q_3 = b_3/‖b_3‖

Example 2. Apply Gram-Schmidt to the three vectors given in class.
Solution. Proceed exactly as in the recipe: normalize a_1 to get q_1; subtract from a_2 its projection onto q_1 and normalize to get q_2; subtract from a_3 its projections onto q_1 and q_2 and normalize to get q_3. This produces the three orthonormal vectors.

Theorem. The columns of an m×n matrix Q are orthonormal
⇔ Q^T Q = I (the n×n identity)

Proof. Let q_1, …, q_n be the columns of Q. They are orthonormal if and only if
q_i^T q_j = 0 for i ≠ j, and q_i^T q_i = 1.

141 All these inner products are packaged in Q^T Q = I: the (i, j) entry of Q^T Q is precisely q_i^T q_j.
Definition 3. An orthogonal matrix is a square matrix Q with orthonormal columns.
It is historical convention to restrict to square matrices, and to say "orthogonal matrix" even though "orthonormal matrix" might be better.
An n × n matrix Q is orthogonal ⟺ Q^T Q = I. In other words, Q^{−1} = Q^T.
Example 4. A permutation matrix such as P = [0 1 0; 0 0 1; 1 0 0] is orthogonal. In general, all permutation matrices P are orthogonal. Why? Because their columns are a permutation of the standard basis. And so we always have P^T P = I.
Example 5. Q = [cos θ, −sin θ; sin θ, cos θ] is orthogonal because [cos θ; sin θ], [−sin θ; cos θ] is an orthonormal basis of R².
Just to make sure: why length 1? Because cos² θ + sin² θ = 1.
Alternatively: Q^T Q = [cos θ, sin θ; −sin θ, cos θ] [cos θ, −sin θ; sin θ, cos θ] = I.
Example 6. Is H = [1 1; 1 −1] orthogonal?
No, the columns are orthogonal but not normalized. But (1/√2) [1 1; 1 −1] is an orthogonal matrix.
Just for fun: an n × n matrix with entries ±1 whose columns are orthogonal is called a Hadamard matrix of size n. A size 4 example: with H_2 = [1 1; 1 −1],
H_4 = [H_2 H_2; H_2 −H_2].
Continuing this construction, we get examples of size 8, 16, 32, …
It is believed that Hadamard matrices exist for all sizes n divisible by 4. But no example of size 668 is known yet.
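The doubling construction for Hadamard matrices can be sketched in a few lines (illustrative code, not from the slides):

```python
def hadamard(n):
    """Hadamard matrix of size n (n a power of 2) via the doubling construction
    H_{2k} = [[H_k, H_k], [H_k, -H_k]], starting from H_1 = [1]."""
    H = [[1]]
    while len(H) < n:
        # Double the size: each row r becomes [r, r] on top and [r, -r] below.
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H

H4 = hadamard(4)
```

`hadamard(4)` reproduces the size-4 example built from H_2; its columns are pairwise orthogonal with squared length 4.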

142 The QR decomposition (flashed at you)
Gaussian elimination in terms of matrices: A = LU
Gram-Schmidt in terms of matrices: A = QR
Let A be an m × n matrix of rank n (columns independent). Then we have the QR decomposition A = QR, where Q is m × n and has orthonormal columns, and R is upper triangular, n × n and invertible.
Idea: Gram-Schmidt on the columns of A, to get the columns of Q.
Example 7. Find the QR decomposition of a given 3 × 3 matrix A.
Solution. We apply Gram-Schmidt to the columns of A; the resulting orthonormal vectors q_1, q_2, q_3 are the columns of Q. To find R in A = QR, note that Q^T A = Q^T Q R = R.
Recipe. In general, to obtain A = QR: Gram-Schmidt on (columns of) A, to get (columns of) Q. Then, R = Q^T A.
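The recipe (Gram-Schmidt for Q, then R = Q^T A) can be sketched as follows, in plain Python with an illustrative matrix:

```python
import math

def qr_decompose(A):
    """QR of a matrix (list of rows) with independent columns:
    Gram-Schmidt on the columns gives Q; then R = Q^T A."""
    m, n = len(A), len(A[0])
    cols = [[A[i][j] for i in range(m)] for j in range(n)]
    qcols = []
    for a in cols:
        b = a[:]
        for q in qcols:
            dot = sum(ai * qi for ai, qi in zip(a, q))
            b = [bi - dot * qi for bi, qi in zip(b, q)]
        norm = math.sqrt(sum(bi * bi for bi in b))
        qcols.append([bi / norm for bi in b])
    Q = [[qcols[j][i] for j in range(n)] for i in range(m)]
    # R = Q^T A comes out upper triangular.
    R = [[sum(Q[k][i] * A[k][j] for k in range(m)) for j in range(n)]
         for i in range(n)]
    return Q, R

A = [[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
Q, R = qr_decompose(A)
```

One can check that QR reproduces A and that the entry of R below the diagonal vanishes.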

143 The resulting R is indeed upper triangular, and we get:
[a_1 a_2 a_3] = [q_1 q_2 q_3] [q_1^T a_1, q_1^T a_2, q_1^T a_3; 0, q_2^T a_2, q_2^T a_3; 0, 0, q_3^T a_3]
It should be noted that, actually, no extra work is needed for computing R: all the inner products in R have been computed during Gram-Schmidt. (Just like the LU decomposition encodes the steps of Gaussian elimination, the QR decomposition encodes the steps of Gram-Schmidt.)
Practice problems
Example 8. Complete the given vectors to an orthonormal basis of R³.
(a) by using the FTLA to determine the orthogonal complement of the span you already have
(b) by using Gram-Schmidt after throwing in an independent vector such as a standard basis vector
Example 9. Find the QR decomposition of the given matrix A.

144 Review
Let A be an m × n matrix of rank n (columns independent). Then we have the QR decomposition A = QR, where Q is m × n with orthonormal columns, and R is upper triangular and invertible.
To obtain it: Gram-Schmidt on (columns of) A, to get (columns of) Q. Then, R = Q^T A (actually unnecessary: the entries of R were already computed during Gram-Schmidt!), and
[a_1 a_2 a_3] = [q_1 q_2 q_3] [q_1^T a_1, q_1^T a_2, q_1^T a_3; 0, q_2^T a_2, q_2^T a_3; 0, 0, q_3^T a_3].
Example 1. The QR decomposition is also used to solve systems of linear equations. (We assume A is n × n, and A^{−1} exists.)
Ax = b ⟺ QRx = b ⟺ Rx = Q^T b
The last system is triangular and is solved by back substitution.
QR is a little slower than LU but makes up in numerical stability.
If A is not n × n and invertible, then Rx = Q^T b gives the least squares solutions!
Example 2. The QR decomposition is very useful for solving least squares problems:
A^T A x̂ = A^T b ⟺ (QR)^T QR x̂ = (QR)^T b ⟺ R^T R x̂ = R^T Q^T b ⟺ R x̂ = Q^T b
(using Q^T Q = I and that R^T is invertible). Again, the last system is triangular and is solved by back substitution.
x̂ is a least squares solution of Ax = b ⟺ R x̂ = Q^T b (where A = QR)
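The final triangular solve Rx = Q^T b is done by back substitution. A minimal sketch with a hand-checkable system (the numbers are illustrative, not from the slides):

```python
def back_substitute(R, c):
    """Solve R x = c for an upper triangular, invertible R (list of rows)."""
    n = len(R)
    x = [0.0] * n
    # Work from the last equation upward, plugging in known values.
    for i in range(n - 1, -1, -1):
        s = sum(R[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (c[i] - s) / R[i][i]
    return x

# Example: R = [[2,1],[0,3]], c = [4,6]; the solution is x = [1, 2].
x = back_substitute([[2.0, 1.0], [0.0, 3.0]], [4.0, 6.0])
```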

145 Application: Fourier series
Review. Given an orthogonal basis v_1, v_2, …, we express a vector x as:
x = c_1 v_1 + c_2 v_2 + …, where c_i v_i = (⟨x, v_i⟩ / ⟨v_i, v_i⟩) v_i is the projection of x onto v_i.
A Fourier series of a function f(x) is an infinite expansion:
f(x) = a_0 + a_1 cos(x) + b_1 sin(x) + a_2 cos(2x) + b_2 sin(2x) + …
Example 1. [plots of sin(x), sin(2x), sin(3x) over (−2π, 2π)]

146 Example 2. (just a preview) [plot] blue function = (4/π) (sin(x) + (1/3) sin(3x) + (1/5) sin(5x) + (1/7) sin(7x) + …)
We are working in the vector space of functions R → R. More precisely, nice (say, piecewise continuous) functions that have period 2π. These are infinite dimensional vector spaces.
The functions 1, cos(x), sin(x), cos(2x), sin(2x), … are a basis of this space. In fact, an orthogonal basis! That's the reason for the success of Fourier series.
But what is the inner product on the space of functions?
Vectors in R^n: ⟨v, w⟩ = v_1 w_1 + … + v_n w_n
Functions: ⟨f, g⟩ = ∫_{−π}^{π} f(x) g(x) dx
Why these limits? Because our functions have period 2π.
Example 5. Show that cos(x) and sin(x) are orthogonal.
Solution. ⟨cos(x), sin(x)⟩ = ∫_{−π}^{π} cos(x) sin(x) dx = [(1/2) sin²(x)]_{−π}^{π} = 0
More generally, 1, cos(x), sin(x), cos(2x), sin(2x), … are all orthogonal to each other.
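The orthogonality computations can be sanity-checked by numerical integration. A sketch using the midpoint rule for the inner product above (step count is an illustrative choice):

```python
import math

def inner(f, g, steps=20000):
    """Approximate <f,g> = integral of f(x) g(x) over (-pi, pi), midpoint rule."""
    h = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        x = -math.pi + (k + 0.5) * h
        total += f(x) * g(x)
    return total * h

ortho = inner(math.cos, math.sin)    # should be close to 0
norm_sq = inner(math.cos, math.cos)  # should be close to pi
```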

147 Example 6. What is the norm of cos(x)?
Solution. ⟨cos(x), cos(x)⟩ = ∫_{−π}^{π} cos(x) cos(x) dx = π
Why? There are many ways to evaluate this integral. For instance, you could use integration by parts, or you could use a trig identity. Here's a simple way:
∫_{−π}^{π} cos²(x) dx = ∫_{−π}^{π} sin²(x) dx (cos and sin are just a shift apart)
cos²(x) + sin²(x) = 1
So: 2 ∫_{−π}^{π} cos²(x) dx = ∫_{−π}^{π} 1 dx = 2π.
Hence, cos(x) is not normalized. It has norm ‖cos(x)‖ = √π.
Example 7. The same calculation shows that cos(kx) and sin(kx) have norm √π as well.
Fourier series of f(x): f(x) = a_0 + a_1 cos(x) + b_1 sin(x) + a_2 cos(2x) + b_2 sin(2x) + …
Example 8. How do we find a_1? Or: how much cosine is in a function f(x)?
Solution. a_1 = ⟨f(x), cos(x)⟩ / ⟨cos(x), cos(x)⟩ = (1/π) ∫_{−π}^{π} f(x) cos(x) dx
f(x) has the Fourier series f(x) = a_0 + a_1 cos(x) + b_1 sin(x) + a_2 cos(2x) + b_2 sin(2x) + …, where
a_k = ⟨f(x), cos(kx)⟩ / ⟨cos(kx), cos(kx)⟩ = (1/π) ∫_{−π}^{π} f(x) cos(kx) dx,
b_k = ⟨f(x), sin(kx)⟩ / ⟨sin(kx), sin(kx)⟩ = (1/π) ∫_{−π}^{π} f(x) sin(kx) dx,
a_0 = ⟨f(x), 1⟩ / ⟨1, 1⟩ = (1/(2π)) ∫_{−π}^{π} f(x) dx.

148 Example 9. Find the Fourier series of the 2π-periodic function f(x) defined by f(x) = −1 for x ∈ (−π, 0) and f(x) = +1 for x ∈ (0, π). [plot of the square wave]
Solution.
a_0 = (1/(2π)) ∫_{−π}^{π} f(x) dx = 0
a_n = (1/π) [ −∫_{−π}^{0} cos(nx) dx + ∫_{0}^{π} cos(nx) dx ] = 0
b_n = (1/π) [ −∫_{−π}^{0} sin(nx) dx + ∫_{0}^{π} sin(nx) dx ]
Note that the two contributions are the same here (why?!), so
b_n = (2/π) ∫_{0}^{π} sin(nx) dx = (2/π) [ −cos(nx)/n ]_{0}^{π} = (2/(πn)) (1 − cos(nπ)) = (2/(πn)) (1 − (−1)^n)
which is 4/(πn) if n is odd, and 0 if n is even.
In conclusion, f(x) = (4/π) (sin(x) + (1/3) sin(3x) + (1/5) sin(5x) + (1/7) sin(7x) + …).
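The square wave coefficients can be verified numerically. A sketch (the quadrature and step count are illustrative choices):

```python
import math

def square_wave(x):
    # -1 on (-pi, 0), +1 on (0, pi), as in the example.
    return 1.0 if x > 0 else -1.0

def b_coeff(n, steps=20000):
    """b_n = (1/pi) * integral of f(x) sin(nx) over (-pi, pi), midpoint rule."""
    h = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        x = -math.pi + (k + 0.5) * h
        total += square_wave(x) * math.sin(n * x)
    return total * h / math.pi
```

For odd n the result should be close to 4/(πn), and for even n close to 0.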

149 Review
An inner product for 2π-periodic functions: ⟨f, g⟩ = ∫_{−π}^{π} f(x) g(x) dx (in R^n: ⟨v, w⟩ = v_1 w_1 + … + v_n w_n)
1, cos(x), sin(x), cos(2x), sin(2x), … are orthogonal. [plots of these functions]
An expansion in that basis is a Fourier series:
f(x) = a_0 + a_1 cos(x) + b_1 sin(x) + a_2 cos(2x) + b_2 sin(2x) + …
where
a_k = ⟨f(x), cos(kx)⟩ / ⟨cos(kx), cos(kx)⟩ = (1/π) ∫_{−π}^{π} f(x) cos(kx) dx,
b_k = ⟨f(x), sin(kx)⟩ / ⟨sin(kx), sin(kx)⟩ = (1/π) ∫_{−π}^{π} f(x) sin(kx) dx,
a_0 = ⟨f(x), 1⟩ / ⟨1, 1⟩ = (1/(2π)) ∫_{−π}^{π} f(x) dx.
Example 1. [plot] blue function = (4/π) (sin(x) + (1/3) sin(3x) + (1/5) sin(5x) + (1/7) sin(7x) + …)

150 Example 2. Find the Fourier series of the 2π-periodic function f(x) defined by f(x) = −1 for x ∈ (−π, 0) and f(x) = +1 for x ∈ (0, π). [plot of the square wave]
Solution.
a_0 = (1/(2π)) ∫_{−π}^{π} f(x) dx = 0
a_n = (1/π) [ −∫_{−π}^{0} cos(nx) dx + ∫_{0}^{π} cos(nx) dx ] = 0
b_n = (1/π) [ −∫_{−π}^{0} sin(nx) dx + ∫_{0}^{π} sin(nx) dx ]
Note that the two contributions are the same here (why?!), so
b_n = (2/π) ∫_{0}^{π} sin(nx) dx = (2/π) [ −cos(nx)/n ]_{0}^{π} = (2/(πn)) (1 − cos(nπ)) = (2/(πn)) (1 − (−1)^n)
which is 4/(πn) if n is odd, and 0 if n is even.
In conclusion, f(x) = (4/π) (sin(x) + (1/3) sin(3x) + (1/5) sin(5x) + (1/7) sin(7x) + …).

151 Determinants
For the next few lectures, all matrices are square!
Recall that the inverse of A = [a b; c d] is A^{−1} = (1/(ad − bc)) [d −b; −c a].
The determinant of a 1 × 1 matrix is det(a) = a, and the determinant of a 2 × 2 matrix is det [a b; c d] = ad − bc.
Goal: A is invertible ⟺ det(A) ≠ 0
We will write both det [a b; c d] and |a b; c d| for the determinant.
Definition 3. The determinant is characterized by:
the normalization det I = 1,
and how it is affected by elementary row operations:
(replacement) Add a multiple of one row to another row. Does not change the determinant.
(interchange) Interchange two rows. Reverses the sign of the determinant.
(scaling) Multiply all entries in a row by s. Multiplies the determinant by s.
Examples 4 and 5. Determinants are computed by applying these row operations to bring the matrix to triangular form, keeping track of sign changes from interchanges and of any scalings along the way.

152 The determinant of a triangular matrix is the product of the diagonal entries.
Example 6. A determinant is computed by row reducing to triangular form and then multiplying the diagonal entries, with a sign change for every row interchange.
Example 7. Discover the formula for |a b; c d|.
Solution. Replacing R_2 by R_2 − (c/a) R_1:
|a b; c d| = |a b; 0 d − (c/a) b| = a (d − (c/a) b) = ad − bc
Example 8. Another determinant computed the same way: row reduce, then multiply the diagonal entries.
The following important properties follow from the behaviour under row operations.
det(A) = 0 ⟺ A is not invertible
Why? Because det(A) = 0 if and only if, in an echelon form, a diagonal entry is zero (that is, a pivot is missing).
det(AB) = det(A) det(B)
det(A^{−1}) = 1 / det(A)
det(A^T) = det(A)
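Row reduction turns into an efficient determinant algorithm: eliminate to triangular form, track interchanges, multiply the diagonal. A sketch in Python (partial pivoting is an added practical detail, not from the slides):

```python
def det(A):
    """Determinant by Gaussian elimination.
    Row replacements keep the determinant; each interchange flips its sign."""
    M = [row[:] for row in A]
    n = len(M)
    sign = 1.0
    for c in range(n):
        # Choose the largest pivot in column c (partial pivoting, for stability).
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if M[p][c] == 0:
            return 0.0  # no pivot in this column: the matrix is singular
        if p != c:
            M[c], M[p] = M[p], M[c]
            sign = -sign  # an interchange reverses the sign
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            M[r] = [a - factor * b for a, b in zip(M[r], M[c])]
    prod = sign
    for i in range(n):
        prod *= M[i][i]  # triangular: product of the diagonal entries
    return prod
```

For instance, det([[1, 2], [3, 4]]) evaluates to −2, matching ad − bc.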

153 Example 9. Recall that if AB = 0, then it does not follow that A = 0 or B = 0. However, show that det(A) = 0 or det(B) = 0.
Solution. Follows from det(AB) = det(0) = 0, and det(AB) = det(A) det(B).
A bad way to compute determinants
Example 10. Compute a 3 × 3 determinant by cofactor expansion.
Each term in the cofactor expansion is ±1 times an entry times a smaller determinant (row and column of the entry deleted). The ±1 is assigned to each entry according to the checkerboard pattern [+ − +; − + −; + − +].
One may expand along any row or column, for instance along the first row, the second column, or the third column; each choice produces the same value.
Practice problems
Problem 1. Compute a determinant by row reduction and check your answer.

154 Review
The determinant is characterized by det I = 1 and the effect of row op's:
replacement: does not change the determinant
interchange: reverses the sign of the determinant
scaling a row by s: multiplies the determinant by s
det(A) = 0 ⟺ A is not invertible
det(AB) = det(A) det(B)
det(A^{−1}) = 1 / det(A)
det(A^T) = det(A)
What's wrong?! For A = [a b; c d], the faulty calculation
det(A^{−1}) = det( (1/(ad − bc)) [d −b; −c a] ) = (1/(ad − bc)) (da − (−b)(−c)) = 1
factors the scalar out of the determinant only once. Scaling each of the 2 rows by a factor s multiplies the determinant by s², so the corrected calculation is:
det( (1/(ad − bc)) [d −b; −c a] ) = (1/(ad − bc))² (da − (−b)(−c)) = 1/(ad − bc).
This is compatible with det(A^{−1}) = 1/det(A).
Example 1. Suppose A is a 3 × 3 matrix with det(A) = 5. What is det(2A)?
Solution. 2A is obtained from A by multiplying each of its three rows by 2. Hence, det(2A) = 2³ det(A) = 40.

155 A bad way to compute determinants
Example 2. A 3 × 3 determinant computed by cofactor expansion, whether along the second column or along the third column, gives the same value as expansion along the first row.
Why is the method of cofactor expansion not practical? Because to compute a large n × n determinant, one reduces to n determinants of size (n−1) × (n−1), then n(n−1) determinants of size (n−2) × (n−2), and so on. In the end, we have n! = n(n−1) ⋯ 2 · 1 many numbers to add. WAY TOO MUCH WORK! Already 25! ≈ 1.55 × 10²⁵.
Context: today's fastest computer, Tianhe-2, runs at about 34 pflops (3.4 × 10¹⁶ operations per second).
By the way: "fastest" is measured by computed LU decompositions!
Example 3. First off, say hello to a new friend: i, the imaginary unit. It is infamous for i² = −1.
Compute the determinants of the tridiagonal matrices with 1's on the diagonal and i's directly above and below it. Expanding along the first row, each determinant equals the previous one plus the one before that; the values begin 1, 2, 3, 5, …
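The cofactor expansion is easy to state recursively, which also makes the n! growth plain. A sketch, fine for tiny matrices only:

```python
def det_cofactor(A):
    """Determinant by cofactor expansion along the first row.
    Runs in O(n!) time: the point of this example, not a practical method."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # Minor: delete row 0 and column j; sign follows the checkerboard pattern.
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det_cofactor(minor)
    return total
```

Even this tiny routine would take on the order of 25! multiplications for a 25 × 25 matrix, which is the count quoted above; `math.factorial(25)` confirms it.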

156 The next determinant in the family is 5 + 3 = 8. The Fibonacci numbers!
Do you know about the connection of Fibonacci numbers and rabbits?
Eigenvectors and eigenvalues
Throughout, A will be an n × n matrix.
Definition 4. An eigenvector of A is a nonzero x such that Ax = λx for some scalar λ. The scalar λ is the corresponding eigenvalue.
In words: eigenvectors are those x, for which Ax is parallel to x.
Example 5. Verify that a given x is an eigenvector of a given matrix A.
Solution. Compute Ax and observe that the result is λx for some scalar λ. Hence, x is an eigenvector of A with eigenvalue λ.

157 Example 6. Use your geometric understanding to find the eigenvectors and eigenvalues of A = [0 1; 1 0].
Solution. A [x; y] = [y; x], i.e. multiplication with A is reflection through the line y = x.
A [1; 1] = [1; 1]. So: x = [1; 1] is an eigenvector with eigenvalue λ = 1.
A [1; −1] = [−1; 1] = −[1; −1]. So: x = [1; −1] is an eigenvector with eigenvalue λ = −1.
Practice problems
Problem 1. Let A be an n × n matrix. Express the following in terms of det(A): det(A²) and det(2A).
Hint: (unless n = 1) det(2A) is not just 2 det(A).

158 Review
If Ax = λx, then x is an eigenvector of A with eigenvalue λ.
EG: multiplication with A = [0 1; 1 0] is reflection through the line y = x.
A [1; 1] = [1; 1]. So: x = [1; 1] is an eigenvector with eigenvalue λ = 1.
A [1; −1] = −[1; −1]. So: x = [1; −1] is an eigenvector with eigenvalue λ = −1.
Example 2. Use your geometric understanding to find the eigenvectors and eigenvalues of A = [1 0; 0 0].
Solution. A [x; y] = [x; 0], i.e. multiplication with A is projection onto the x-axis.
A [1; 0] = [1; 0]. So: x = [1; 0] is an eigenvector with eigenvalue λ = 1.
A [0; 1] = [0; 0]. So: x = [0; 1] is an eigenvector with eigenvalue λ = 0.

159 Example 3. Let P be the projection matrix corresponding to orthogonal projection onto the subspace V. What are the eigenvalues and eigenvectors of P?
Solution. For every vector x in V, Px = x. These are the eigenvectors with eigenvalue 1.
For every vector x orthogonal to V, Px = 0. These are the eigenvectors with eigenvalue 0.
Definition 4. Given λ, the set of all eigenvectors with eigenvalue λ (together with 0) is called the eigenspace of A corresponding to λ.
Example. (continued) We saw that the projection matrix P has the two eigenvalues λ = 0, 1. The eigenspace of λ = 1 is V. The eigenspace of λ = 0 is the orthogonal complement of V.
How to solve Ax = λx
Key observation: Ax = λx ⟺ Ax − λx = 0 ⟺ (A − λI)x = 0.
This has a nonzero solution ⟺ det(A − λI) = 0.
Recipe. To find the eigenvectors and eigenvalues of A:
First, find the eigenvalues λ using: λ is an eigenvalue of A ⟺ det(A − λI) = 0.
Then, for each eigenvalue λ, find corresponding eigenvectors by solving (A − λI)x = 0.
Example 5. Find the eigenvectors and eigenvalues of A = [3 1; 1 3].

160 Solution. A − λI = [3−λ 1; 1 3−λ]
det(A − λI) = (3 − λ)² − 1 = λ² − 6λ + 8 = 0 gives λ_1 = 2, λ_2 = 4.
This is the characteristic polynomial of A. Its roots are the eigenvalues of A.
Find eigenvectors with eigenvalue λ_1 = 2: A − 2I = [1 1; 1 1].
Solutions to [1 1; 1 1] x = 0 have basis [1; −1].
So: x_1 = [1; −1] is an eigenvector with eigenvalue λ_1 = 2. All other eigenvectors with λ = 2 are multiples of x_1. span{[1; −1]} is the eigenspace for eigenvalue λ = 2.
Find eigenvectors with eigenvalue λ_2 = 4: A − 4I = [−1 1; 1 −1].
Solutions to [−1 1; 1 −1] x = 0 have basis [1; 1].
So: x_2 = [1; 1] is an eigenvector with eigenvalue λ_2 = 4. The eigenspace for eigenvalue λ = 4 is span{[1; 1]}.
Example 6. Find the eigenvectors and the eigenvalues of an upper triangular matrix A.
Solution. The characteristic polynomial det(A − λI) is the product of the diagonal entries of A − λI. The eigenvalues of a triangular matrix are therefore its diagonal entries. For each eigenvalue λ, eigenvectors are then found by solving (A − λI)x = 0.
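For a 2 × 2 matrix the recipe boils down to solving a quadratic. A sketch, using the matrix [3 1; 1 3] with characteristic polynomial λ² − 6λ + 8 as in the example (restricting to real roots is a simplification):

```python
import math

def eig2(A):
    """Eigenvalues of a 2x2 matrix from det(A - tI) = t^2 - tr(A) t + det(A) = 0,
    assuming the roots are real."""
    (a, b), (c, d) = A
    tr = a + d
    det = a * d - b * c
    disc = math.sqrt(tr * tr - 4 * det)
    return (tr - disc) / 2, (tr + disc) / 2

lam1, lam2 = eig2([[3.0, 1.0], [1.0, 3.0]])  # roots of t^2 - 6t + 8
```

One checks the eigenvector too: A [1; −1] = [2; −2] = 2 · [1; −1], matching λ_1 = 2.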

161 Solving (A − λ_2 I)x = 0 and (A − λ_3 I)x = 0 likewise gives eigenvectors x_2 and x_3 for the remaining eigenvalues.
In summary, A has three eigenvalues (its diagonal entries) with corresponding eigenvectors x_1, x_2, x_3. These three vectors are independent. By the next result, this is always so.
Theorem 7. If x_1, …, x_m are eigenvectors of A corresponding to different eigenvalues, then they are independent.
Why? Suppose, for contradiction, that x_1, …, x_m are dependent. By kicking out some of the vectors, we may assume that there is (up to multiples) only one linear relation: c_1 x_1 + … + c_m x_m = 0.
Multiply this relation with A: A(c_1 x_1 + … + c_m x_m) = c_1 λ_1 x_1 + … + c_m λ_m x_m = 0.
This is a second independent relation! Contradiction.
Practice problems
Example 8. Find the eigenvectors and eigenvalues of the given 2 × 2 matrix A.
Example 9. What are the eigenvalues of the given triangular matrix A? No calculations!

162 Review
If Ax = λx, then x is an eigenvector of A with eigenvalue λ. All eigenvectors (plus 0) with eigenvalue λ form the eigenspace of λ.
λ is an eigenvalue of A ⟺ det(A − λI) = 0. (The left side is the characteristic polynomial of A.)
Why? Because Ax = λx ⟺ (A − λI)x = 0.
By the way: this means that the eigenspace of λ is just Nul(A − λI).
E.g., for a triangular matrix A, det(A − λI) is the product of the diagonal entries of A − λI, so the eigenvalues are the diagonal entries of A.
Eigenvectors x_1, …, x_m of A corresponding to different eigenvalues are independent.
By the way: product of eigenvalues = determinant, and sum of eigenvalues = trace (sum of diagonal entries).
Example 1. Find the eigenvalues of A as well as a basis for the corresponding eigenspaces.
Solution. The characteristic polynomial det(A − λI) factors with a double root: say λ_1 is a root of multiplicity 2, and λ_2 is a simple root. Since λ_1 is a double root, it has (algebraic) multiplicity 2. Solving (A − λ_1 I)x = 0 gives the eigenvectors for λ_1.

163 Two independent solutions x_1 and x_2: in other words, the eigenspace for the double eigenvalue is span{x_1, x_2}.
For the simple eigenvalue, solving (A − λ_2 I)x = 0 gives one more eigenvector x_3.
In summary, A has two eigenvalues: the eigenspace for the double one has basis x_1, x_2, and the eigenspace for the simple one has basis x_3.
An n × n matrix A has up to n different eigenvalues. Namely, the roots of the degree n characteristic polynomial det(A − λI).
For each eigenvalue λ, A has at least one eigenvector. That's because Nul(A − λI) has dimension at least 1.
If λ has multiplicity m, then A has up to m (independent) eigenvectors for λ.
Ideally, we would like to find a total of n (independent) eigenvectors of A. Why can there be no more than n eigenvectors?!
Two sources of trouble: eigenvalues can be complex numbers (that is, not enough real roots), or there can be repeated roots of the characteristic polynomial.
Example 2. Find the eigenvectors and eigenvalues of A = [0 −1; 1 0]. Geometrically, what is the trouble?
Solution. A [x; y] = [−y; x], i.e. multiplication with A is rotation by 90° (counter-clockwise).
Which vector is parallel after rotation by 90°? Trouble.

164 Fix: work with complex numbers!
det(A − λI) = |−λ −1; 1 −λ| = λ² + 1
So, the eigenvalues are λ_1 = i and λ_2 = −i.
λ_1 = i: (A − iI)x = [−i −1; 1 −i] x = 0 has solution x_1 = [1; −i].
Let us check: A [1; −i] = [i; 1] = i [1; −i].
λ_2 = −i: (A + iI)x = [i −1; 1 i] x = 0 has solution x_2 = [1; i].
Example 3. Find the eigenvectors and eigenvalues of A = [1 1; 0 1].
Solution. det(A − λI) = |1−λ 1; 0 1−λ| = (1 − λ)²
So: λ = 1 is the only eigenvalue (it has multiplicity 2).
(A − I)x = [0 1; 0 0] x = 0 has solution x = [1; 0].
So: the eigenspace is span{[1; 0]}. Only dimension 1! Trouble: only 1 independent eigenvector for a 2 × 2 matrix.
This kind of trouble cannot really be fixed. We have to lower our expectations and look for generalized eigenvectors. These are solutions to (A − λI)² x = 0, (A − λI)³ x = 0, …
Practice problems
Example 4. Find the eigenvectors and eigenvalues of the given matrix A. What is the trouble?

165 Review
Eigenvector equation: Ax = λx ⟺ (A − λI)x = 0
λ is an eigenvalue of A ⟺ det(A − λI) = 0. (characteristic polynomial)
An n × n matrix A has up to n different eigenvalues λ. The eigenspace of λ is Nul(A − λI). That is, all eigenvectors of A with eigenvalue λ (plus 0).
If λ has multiplicity m, then A has up to m eigenvectors for λ. At least one eigenvector is guaranteed (because det(A − λI) = 0).
Test yourself! What are the eigenvalues and eigenvectors of the displayed matrices? (In the first two cases there is a single eigenvalue of full multiplicity, with eigenspace all of R³; in the third, the eigenspace is only the span of a single vector.)
Diagonalization
Diagonal matrices are very easy to work with.
Example 1. For instance, it is easy to compute their powers: if D is diagonal with diagonal entries d_1, …, d_n, then D^k is diagonal with entries d_1^k, …, d_n^k.
Example 2. For a matrix A that is not diagonal, how do we compute a high power A^k?
Solution. Find the eigenvalues from the characteristic polynomial det(A − λI); here it factors into two linear factors, giving eigenvalues λ_1, λ_2 with eigenvectors v_1, v_2.
Key observation: A^k v_1 = λ_1^k v_1 and A^k v_2 = λ_2^k v_2. For A^k, we need A^k e_1 and A^k e_2.

166 Write e_1 = c_1 v_1 + c_2 v_2. Then the first column of A^k is A^k e_1 = c_1 λ_1^k v_1 + c_2 λ_2^k v_2.
We find the second column of A^k likewise. Left as exercise!
The key idea of the previous example was to work with respect to a basis given by the eigenvectors.
Put the eigenvectors x_1, …, x_n as columns into a matrix P. Then A x_i = λ_i x_i gives
A [x_1 … x_n] = [λ_1 x_1 … λ_n x_n] = [x_1 … x_n] D,
where D is the diagonal matrix with λ_1, …, λ_n on the diagonal. In summary: AP = PD.
Suppose that A is n × n and has n independent eigenvectors v_1, …, v_n. Then A can be diagonalized as A = PDP^{−1}:
the columns of P are the eigenvectors,
the diagonal matrix D has the eigenvalues on the diagonal.
Such a diagonalization is possible if and only if A has enough eigenvectors.

167 Example 3. Fibonacci numbers: 1, 1, 2, 3, 5, 8, 13, 21, 34, …
By the way: "not a universal law but only a fascinatingly prevalent tendency" (Coxeter)
Did you notice: 13/8 = 1.625, 21/13 ≈ 1.615, 34/21 ≈ 1.619, …
The golden ratio ϕ ≈ 1.618. Where's that coming from?
By the way, this ϕ is the most irrational number (in a precise sense).
[F_{n+2}; F_{n+1}] = [1 1; 1 0] [F_{n+1}; F_n]   (F_{n+2} = F_{n+1} + F_n)
Hence: [F_{n+1}; F_n] = [1 1; 1 0]^n [F_1; F_0]. But we know how to compute powers of a matrix!
Solution. (Exercise to fill in all details!) The characteristic polynomial of A = [1 1; 1 0] is λ² − λ − 1. The eigenvalues are λ_1 = (1 + √5)/2 (the golden mean!) and λ_2 = (1 − √5)/2.
Corresponding eigenvectors: v_1 = [λ_1; 1], v_2 = [λ_2; 1].
Write [F_1; F_0] = [1; 0] = c_1 v_1 + c_2 v_2. (c_1 = 1/√5, c_2 = −1/√5)
[F_{n+1}; F_n] = A^n [1; 0] = λ_1^n c_1 v_1 + λ_2^n c_2 v_2
Hence, F_n = c_1 λ_1^n + c_2 λ_2^n = (1/√5) [ ((1 + √5)/2)^n − ((1 − √5)/2)^n ].
That's Binet's formula.
But |λ_2| < 1, and so F_n ≈ c_1 λ_1^n = (1/√5) ((1 + √5)/2)^n.
In fact, F_n = round( (1/√5) ((1 + √5)/2)^n ). Don't you feel powerful!?
Practice problems
Problem 1. Find, if possible, the diagonalization of the given matrix A.
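Binet's formula is easy to test against the recurrence. A sketch:

```python
import math

def fib_binet(n):
    """F_n via Binet's formula, rounded to the nearest integer."""
    phi = (1 + math.sqrt(5)) / 2
    return round(phi ** n / math.sqrt(5))

def fib_recurrence(n):
    """F_n via the defining recurrence F_{n+2} = F_{n+1} + F_n."""
    a, b = 0, 1  # F_0, F_1
    for _ in range(n):
        a, b = b, a + b
    return a
```

For instance, both functions return 55 for n = 10; floating point keeps the rounded formula exact for all small n.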



More information

MTH501- Linear Algebra MCQS MIDTERM EXAMINATION ~ LIBRIANSMINE ~

MTH501- Linear Algebra MCQS MIDTERM EXAMINATION ~ LIBRIANSMINE ~ MTH501- Linear Algebra MCQS MIDTERM EXAMINATION ~ LIBRIANSMINE ~ Question No: 1 (Marks: 1) If for a linear transformation the equation T(x) =0 has only the trivial solution then T is One-to-one Onto Question

More information

2 Systems of Linear Equations

2 Systems of Linear Equations 2 Systems of Linear Equations A system of equations of the form or is called a system of linear equations. x + 2y = 7 2x y = 4 5p 6q + r = 4 2p + 3q 5r = 7 6p q + 4r = 2 Definition. An equation involving

More information

Linear Algebra and Matrix Inversion

Linear Algebra and Matrix Inversion Jim Lambers MAT 46/56 Spring Semester 29- Lecture 2 Notes These notes correspond to Section 63 in the text Linear Algebra and Matrix Inversion Vector Spaces and Linear Transformations Matrices are much

More information

LECTURES 4/5: SYSTEMS OF LINEAR EQUATIONS

LECTURES 4/5: SYSTEMS OF LINEAR EQUATIONS LECTURES 4/5: SYSTEMS OF LINEAR EQUATIONS MA1111: LINEAR ALGEBRA I, MICHAELMAS 2016 1 Linear equations We now switch gears to discuss the topic of solving linear equations, and more interestingly, systems

More information

Example: 2x y + 3z = 1 5y 6z = 0 x + 4z = 7. Definition: Elementary Row Operations. Example: Type I swap rows 1 and 3

Example: 2x y + 3z = 1 5y 6z = 0 x + 4z = 7. Definition: Elementary Row Operations. Example: Type I swap rows 1 and 3 Math 0 Row Reduced Echelon Form Techniques for solving systems of linear equations lie at the heart of linear algebra. In high school we learn to solve systems with or variables using elimination and substitution

More information

a 11 x 1 + a 12 x a 1n x n = b 1 a 21 x 1 + a 22 x a 2n x n = b 2.

a 11 x 1 + a 12 x a 1n x n = b 1 a 21 x 1 + a 22 x a 2n x n = b 2. Chapter 1 LINEAR EQUATIONS 11 Introduction to linear equations A linear equation in n unknowns x 1, x,, x n is an equation of the form a 1 x 1 + a x + + a n x n = b, where a 1, a,, a n, b are given real

More information

Math 220 F11 Lecture Notes

Math 220 F11 Lecture Notes Math 22 F Lecture Notes William Chen November 4, 2. Lecture. Firstly, lets just get some notation out of the way. Notation. R, Q, C, Z, N,,,, {},, A B. Everyone in high school should have studied equations

More information

Linear Equations in Linear Algebra

Linear Equations in Linear Algebra 1 Linear Equations in Linear Algebra 1.1 SYSTEMS OF LINEAR EQUATIONS LINEAR EQUATION x 1,, x n A linear equation in the variables equation that can be written in the form a 1 x 1 + a 2 x 2 + + a n x n

More information

Finite Mathematics Chapter 2. where a, b, c, d, h, and k are real numbers and neither a and b nor c and d are both zero.

Finite Mathematics Chapter 2. where a, b, c, d, h, and k are real numbers and neither a and b nor c and d are both zero. Finite Mathematics Chapter 2 Section 2.1 Systems of Linear Equations: An Introduction Systems of Equations Recall that a system of two linear equations in two variables may be written in the general form

More information

LINEAR SYSTEMS, MATRICES, AND VECTORS

LINEAR SYSTEMS, MATRICES, AND VECTORS ELEMENTARY LINEAR ALGEBRA WORKBOOK CREATED BY SHANNON MARTIN MYERS LINEAR SYSTEMS, MATRICES, AND VECTORS Now that I ve been teaching Linear Algebra for a few years, I thought it would be great to integrate

More information

INVERSE OF A MATRIX [2.2]

INVERSE OF A MATRIX [2.2] INVERSE OF A MATRIX [2.2] The inverse of a matrix: Introduction We have a mapping from R n to R n represented by a matrix A. Can we invert this mapping? i.e. can we find a matrix (call it B for now) such

More information

Announcements Monday, September 18

Announcements Monday, September 18 Announcements Monday, September 18 WeBWorK 1.4, 1.5 are due on Wednesday at 11:59pm. The first midterm is on this Friday, September 22. Midterms happen during recitation. The exam covers through 1.5. About

More information

Chapter 1: Systems of Linear Equations

Chapter 1: Systems of Linear Equations Chapter : Systems of Linear Equations February, 9 Systems of linear equations Linear systems Lecture A linear equation in variables x, x,, x n is an equation of the form a x + a x + + a n x n = b, where

More information

Matrices and systems of linear equations

Matrices and systems of linear equations Matrices and systems of linear equations Samy Tindel Purdue University Differential equations and linear algebra - MA 262 Taken from Differential equations and linear algebra by Goode and Annin Samy T.

More information

Extra Problems: Chapter 1

Extra Problems: Chapter 1 MA131 (Section 750002): Prepared by Asst.Prof.Dr.Archara Pacheenburawana 1 Extra Problems: Chapter 1 1. In each of the following answer true if the statement is always true and false otherwise in the space

More information

Matrix & Linear Algebra

Matrix & Linear Algebra Matrix & Linear Algebra Jamie Monogan University of Georgia For more information: http://monogan.myweb.uga.edu/teaching/mm/ Jamie Monogan (UGA) Matrix & Linear Algebra 1 / 84 Vectors Vectors Vector: A

More information

Lecture Notes in Linear Algebra

Lecture Notes in Linear Algebra Lecture Notes in Linear Algebra Dr. Abdullah Al-Azemi Mathematics Department Kuwait University February 4, 2017 Contents 1 Linear Equations and Matrices 1 1.2 Matrices............................................

More information

Linear Algebra. Solving Linear Systems. Copyright 2005, W.R. Winfrey

Linear Algebra. Solving Linear Systems. Copyright 2005, W.R. Winfrey Copyright 2005, W.R. Winfrey Topics Preliminaries Echelon Form of a Matrix Elementary Matrices; Finding A -1 Equivalent Matrices LU-Factorization Topics Preliminaries Echelon Form of a Matrix Elementary

More information

Linear Algebra (Math-324) Lecture Notes

Linear Algebra (Math-324) Lecture Notes Linear Algebra (Math-324) Lecture Notes Dr. Ali Koam and Dr. Azeem Haider September 24, 2017 c 2017,, Jazan All Rights Reserved 1 Contents 1 Real Vector Spaces 6 2 Subspaces 11 3 Linear Combination and

More information

1.1 SOLUTIONS. Replace R2 by R2 + (2)R1 and obtain: 2. Scale R2 by 1/3: Replace R1 by R1 + ( 5)R2:

1.1 SOLUTIONS. Replace R2 by R2 + (2)R1 and obtain: 2. Scale R2 by 1/3: Replace R1 by R1 + ( 5)R2: SOLUIONS Notes: he key exercises are 7 (or or ), 9, and 5 For brevity, the symbols R, R,, stand for row (or equation ), row (or equation ), and so on Additional notes are at the end of the section x +

More information

Row Reduced Echelon Form

Row Reduced Echelon Form Math 40 Row Reduced Echelon Form Solving systems of linear equations lies at the heart of linear algebra. In high school we learn to solve systems in or variables using elimination and substitution of

More information

Math 320, spring 2011 before the first midterm

Math 320, spring 2011 before the first midterm Math 320, spring 2011 before the first midterm Typical Exam Problems 1 Consider the linear system of equations 2x 1 + 3x 2 2x 3 + x 4 = y 1 x 1 + 3x 2 2x 3 + 2x 4 = y 2 x 1 + 2x 3 x 4 = y 3 where x 1,,

More information

Appendix C Vector and matrix algebra

Appendix C Vector and matrix algebra Appendix C Vector and matrix algebra Concepts Scalars Vectors, rows and columns, matrices Adding and subtracting vectors and matrices Multiplying them by scalars Products of vectors and matrices, scalar

More information

1 9/5 Matrices, vectors, and their applications

1 9/5 Matrices, vectors, and their applications 1 9/5 Matrices, vectors, and their applications Algebra: study of objects and operations on them. Linear algebra: object: matrices and vectors. operations: addition, multiplication etc. Algorithms/Geometric

More information

1 Last time: determinants

1 Last time: determinants 1 Last time: determinants Let n be a positive integer If A is an n n matrix, then its determinant is the number det A = Π(X, A)( 1) inv(x) X S n where S n is the set of n n permutation matrices Π(X, A)

More information

4 Elementary matrices, continued

4 Elementary matrices, continued 4 Elementary matrices, continued We have identified 3 types of row operations and their corresponding elementary matrices. If you check the previous examples, you ll find that these matrices are constructed

More information

(v, w) = arccos( < v, w >

(v, w) = arccos( < v, w > MA322 F all206 Notes on Inner Products Notes on Chapter 6 Inner product. Given a real vector space V, an inner product is defined to be a bilinear map F : V V R such that the following holds: Commutativity:

More information

Linear Equation: a 1 x 1 + a 2 x a n x n = b. x 1, x 2,..., x n : variables or unknowns

Linear Equation: a 1 x 1 + a 2 x a n x n = b. x 1, x 2,..., x n : variables or unknowns Linear Equation: a x + a 2 x 2 +... + a n x n = b. x, x 2,..., x n : variables or unknowns a, a 2,..., a n : coefficients b: constant term Examples: x + 4 2 y + (2 5)z = is linear. x 2 + y + yz = 2 is

More information

Linear Algebra. Chapter Linear Equations

Linear Algebra. Chapter Linear Equations Chapter 3 Linear Algebra Dixit algorizmi. Or, So said al-khwarizmi, being the opening words of a 12 th century Latin translation of a work on arithmetic by al-khwarizmi (ca. 78 84). 3.1 Linear Equations

More information

A matrix over a field F is a rectangular array of elements from F. The symbol

A matrix over a field F is a rectangular array of elements from F. The symbol Chapter MATRICES Matrix arithmetic A matrix over a field F is a rectangular array of elements from F The symbol M m n (F ) denotes the collection of all m n matrices over F Matrices will usually be denoted

More information

Linear Algebra Math 221

Linear Algebra Math 221 Linear Algebra Math 221 Open Book Exam 1 Open Notes 3 Sept, 24 Calculators Permitted Show all work (except #4) 1 2 3 4 2 1. (25 pts) Given A 1 2 1, b 2 and c 4. 1 a) (7 pts) Bring matrix A to echelon form.

More information

MATH 167: APPLIED LINEAR ALGEBRA Chapter 2

MATH 167: APPLIED LINEAR ALGEBRA Chapter 2 MATH 167: APPLIED LINEAR ALGEBRA Chapter 2 Jesús De Loera, UC Davis February 1, 2012 General Linear Systems of Equations (2.2). Given a system of m equations and n unknowns. Now m n is OK! Apply elementary

More information

1300 Linear Algebra and Vector Geometry Week 2: Jan , Gauss-Jordan, homogeneous matrices, intro matrix arithmetic

1300 Linear Algebra and Vector Geometry Week 2: Jan , Gauss-Jordan, homogeneous matrices, intro matrix arithmetic 1300 Linear Algebra and Vector Geometry Week 2: Jan 14 18 1.2, 1.3... Gauss-Jordan, homogeneous matrices, intro matrix arithmetic R. Craigen Office: MH 523 Email: craigenr@umanitoba.ca Winter 2019 What

More information

Fundamentals of Linear Algebra. Marcel B. Finan Arkansas Tech University c All Rights Reserved

Fundamentals of Linear Algebra. Marcel B. Finan Arkansas Tech University c All Rights Reserved Fundamentals of Linear Algebra Marcel B. Finan Arkansas Tech University c All Rights Reserved 2 PREFACE Linear algebra has evolved as a branch of mathematics with wide range of applications to the natural

More information

Elementary maths for GMT

Elementary maths for GMT Elementary maths for GMT Linear Algebra Part 2: Matrices, Elimination and Determinant m n matrices The system of m linear equations in n variables x 1, x 2,, x n a 11 x 1 + a 12 x 2 + + a 1n x n = b 1

More information

Chapter 1: Linear Equations

Chapter 1: Linear Equations Chapter : Linear Equations (Last Updated: September, 6) The material for these notes is derived primarily from Linear Algebra and its applications by David Lay (4ed).. Systems of Linear Equations Before

More information

Linear algebra and differential equations (Math 54): Lecture 10

Linear algebra and differential equations (Math 54): Lecture 10 Linear algebra and differential equations (Math 54): Lecture 10 Vivek Shende February 24, 2016 Hello and welcome to class! As you may have observed, your usual professor isn t here today. He ll be back

More information

Methods for Solving Linear Systems Part 2

Methods for Solving Linear Systems Part 2 Methods for Solving Linear Systems Part 2 We have studied the properties of matrices and found out that there are more ways that we can solve Linear Systems. In Section 7.3, we learned that we can use

More information

INVERSE OF A MATRIX [2.2] 8-1

INVERSE OF A MATRIX [2.2] 8-1 INVERSE OF A MATRIX [2.2] 8-1 The inverse of a matrix: Introduction We have a mapping from R n to R n represented by a matrix A. Can we invert this mapping? i.e. can we find a matrix (call it B for now)

More information

Chapter 4. Solving Systems of Equations. Chapter 4

Chapter 4. Solving Systems of Equations. Chapter 4 Solving Systems of Equations 3 Scenarios for Solutions There are three general situations we may find ourselves in when attempting to solve systems of equations: 1 The system could have one unique solution.

More information

Introduction - Motivation. Many phenomena (physical, chemical, biological, etc.) are model by differential equations. f f(x + h) f(x) (x) = lim

Introduction - Motivation. Many phenomena (physical, chemical, biological, etc.) are model by differential equations. f f(x + h) f(x) (x) = lim Introduction - Motivation Many phenomena (physical, chemical, biological, etc.) are model by differential equations. Recall the definition of the derivative of f(x) f f(x + h) f(x) (x) = lim. h 0 h Its

More information

MTH Linear Algebra. Study Guide. Dr. Tony Yee Department of Mathematics and Information Technology The Hong Kong Institute of Education

MTH Linear Algebra. Study Guide. Dr. Tony Yee Department of Mathematics and Information Technology The Hong Kong Institute of Education MTH 3 Linear Algebra Study Guide Dr. Tony Yee Department of Mathematics and Information Technology The Hong Kong Institute of Education June 3, ii Contents Table of Contents iii Matrix Algebra. Real Life

More information

Elementary Linear Algebra

Elementary Linear Algebra Matrices J MUSCAT Elementary Linear Algebra Matrices Definition Dr J Muscat 2002 A matrix is a rectangular array of numbers, arranged in rows and columns a a 2 a 3 a n a 2 a 22 a 23 a 2n A = a m a mn We

More information

Chapter 1: Linear Equations

Chapter 1: Linear Equations Chapter : Linear Equations (Last Updated: September, 7) The material for these notes is derived primarily from Linear Algebra and its applications by David Lay (4ed).. Systems of Linear Equations Before

More information

Math 1314 Week #14 Notes

Math 1314 Week #14 Notes Math 3 Week # Notes Section 5.: A system of equations consists of two or more equations. A solution to a system of equations is a point that satisfies all the equations in the system. In this chapter,

More information

Linear Algebra Homework and Study Guide

Linear Algebra Homework and Study Guide Linear Algebra Homework and Study Guide Phil R. Smith, Ph.D. February 28, 20 Homework Problem Sets Organized by Learning Outcomes Test I: Systems of Linear Equations; Matrices Lesson. Give examples of

More information

Review Notes for Midterm #2

Review Notes for Midterm #2 Review Notes for Midterm #2 Joris Vankerschaver This version: Nov. 2, 200 Abstract This is a summary of the basic definitions and results that we discussed during class. Whenever a proof is provided, I

More information

Linear Algebra Highlights

Linear Algebra Highlights Linear Algebra Highlights Chapter 1 A linear equation in n variables is of the form a 1 x 1 + a 2 x 2 + + a n x n. We can have m equations in n variables, a system of linear equations, which we want to

More information

MODEL ANSWERS TO THE THIRD HOMEWORK

MODEL ANSWERS TO THE THIRD HOMEWORK MODEL ANSWERS TO THE THIRD HOMEWORK 1 (i) We apply Gaussian elimination to A First note that the second row is a multiple of the first row So we need to swap the second and third rows 1 3 2 1 2 6 5 7 3

More information

A Brief Outline of Math 355

A Brief Outline of Math 355 A Brief Outline of Math 355 Lecture 1 The geometry of linear equations; elimination with matrices A system of m linear equations with n unknowns can be thought of geometrically as m hyperplanes intersecting

More information

MTH 102A - Linear Algebra II Semester

MTH 102A - Linear Algebra II Semester MTH 0A - Linear Algebra - 05-6-II Semester Arbind Kumar Lal P Field A field F is a set from which we choose our coefficients and scalars Expected properties are ) a+b and a b should be defined in it )

More information

x + 2y + 3z = 8 x + 3y = 7 x + 2z = 3

x + 2y + 3z = 8 x + 3y = 7 x + 2z = 3 Chapter 2: Solving Linear Equations 23 Elimination Using Matrices As we saw in the presentation, we can use elimination to make a system of linear equations into an upper triangular system that is easy

More information

MH1200 Final 2014/2015

MH1200 Final 2014/2015 MH200 Final 204/205 November 22, 204 QUESTION. (20 marks) Let where a R. A = 2 3 4, B = 2 3 4, 3 6 a 3 6 0. For what values of a is A singular? 2. What is the minimum value of the rank of A over all a

More information

Equality: Two matrices A and B are equal, i.e., A = B if A and B have the same order and the entries of A and B are the same.

Equality: Two matrices A and B are equal, i.e., A = B if A and B have the same order and the entries of A and B are the same. Introduction Matrix Operations Matrix: An m n matrix A is an m-by-n array of scalars from a field (for example real numbers) of the form a a a n a a a n A a m a m a mn The order (or size) of A is m n (read

More information

Next topics: Solving systems of linear equations

Next topics: Solving systems of linear equations Next topics: Solving systems of linear equations 1 Gaussian elimination (today) 2 Gaussian elimination with partial pivoting (Week 9) 3 The method of LU-decomposition (Week 10) 4 Iterative techniques:

More information

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 1 x 2. x n 8 (4) 3 4 2

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 1 x 2. x n 8 (4) 3 4 2 MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS SYSTEMS OF EQUATIONS AND MATRICES Representation of a linear system The general system of m equations in n unknowns can be written a x + a 2 x 2 + + a n x n b a

More information

Math Camp II. Basic Linear Algebra. Yiqing Xu. Aug 26, 2014 MIT

Math Camp II. Basic Linear Algebra. Yiqing Xu. Aug 26, 2014 MIT Math Camp II Basic Linear Algebra Yiqing Xu MIT Aug 26, 2014 1 Solving Systems of Linear Equations 2 Vectors and Vector Spaces 3 Matrices 4 Least Squares Systems of Linear Equations Definition A linear

More information

Mon Feb Matrix algebra and matrix inverses. Announcements: Warm-up Exercise:

Mon Feb Matrix algebra and matrix inverses. Announcements: Warm-up Exercise: Math 2270-004 Week 5 notes We will not necessarily finish the material from a given day's notes on that day We may also add or subtract some material as the week progresses, but these notes represent an

More information