Linear Algebra Highlights



Chapter 1

A linear equation in n variables has the form a_1 x_1 + a_2 x_2 + ... + a_n x_n = b. We can have m equations in n variables (a system of linear equations), which we want to solve simultaneously.

Consistent system - has at least one solution; so it has either exactly 1 or infinitely many solutions. If it has infinitely many solutions, write them parametrically. Inconsistent system - has no solution.

Two systems are equivalent if they have the same solution set. Thus, our goal is to transform a system into a simpler but equivalent form using the allowable operations.

An m x n matrix is a rectangular array of real numbers with m rows and n columns. Entries are labeled a_ij, where i is the row and j the column the entry is in. We can represent a system of equations in matrix form: Ax = b. This is easier to work with, plus matrices are interesting in their own right.

Elementary Row Operations (EROs):
1. interchange two rows: R_i <-> R_j
2. multiply a row by a nonzero constant: cR_i
3. add a multiple of one row to another: R_i + cR_j (replaces row i).

Two matrices are row equivalent if you can obtain one from the other using EROs. Row equivalence is reflexive (A is row equivalent to itself - do nothing), symmetric (if A is equivalent to B, then B is equivalent to A - undo the operations to get back to the original), and transitive (if A is equivalent to B and B to C, then A is equivalent to C - since you are using EROs, all intermediary steps are also row equivalent). In particular, the REF and RREF of a matrix are row equivalent to the original, so we can use those forms to answer questions about the original matrix, such as finding solutions to systems of equations.

Row Echelon Form (REF) - all rows consisting entirely of 0's are at the bottom of the matrix, the first nonzero entry in each row is a 1, and the leading 1 in a higher row is farther left than the leading 1 in any lower row.
If you put the matrix in REF, use back substitution to find the answer (solving a system this way is called Gaussian elimination with back substitution). Reduced Row Echelon Form (RREF) - REF with the additional requirement that every column containing a leading 1 has 0's in the rest of the column. You can read off the answer in this form (usually preferred); putting a matrix into RREF is called Gauss-Jordan elimination. We can always put a matrix into these forms. The RREF of a matrix is unique (the REF is not).

When solving a system of equations using an augmented matrix: a row (or rows) of all zeros means there are infinitely many solutions. The column(s) missing a leading 1 correspond to the free variable(s) (use a different parameter letter for each free variable). If you have a row of zeros equal to a nonzero constant term, there are no solutions. If the coefficient matrix is row equivalent to the identity matrix, there is exactly 1 solution.

Homogeneous system - all constant terms are 0. Always consistent: x_1 = ... = x_n = 0 is a solution.
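Gauss-Jordan elimination as described above can be sketched in plain Python. This is a minimal illustration, not a production solver; the tolerance and the partial-pivot search are implementation choices, not from the text.

```python
# Gauss-Jordan elimination on an augmented matrix [A | b] using the three EROs.
def gauss_jordan(aug):
    m = [row[:] for row in aug]          # work on a copy
    rows, cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(cols - 1):          # last column holds the constants
        # find a row with a nonzero entry in this column
        pr = next((r for r in range(pivot_row, rows) if abs(m[r][col]) > 1e-12), None)
        if pr is None:
            continue                     # no leading 1 here: free variable
        m[pivot_row], m[pr] = m[pr], m[pivot_row]      # ERO 1: swap rows
        p = m[pivot_row][col]
        m[pivot_row] = [x / p for x in m[pivot_row]]   # ERO 2: scale to a leading 1
        for r in range(rows):
            if r != pivot_row and abs(m[r][col]) > 1e-12:
                f = m[r][col]                          # ERO 3: clear the column
                m[r] = [a - f * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return m

# x + y = 3, x - y = 1  has the unique solution x = 2, y = 1
rref = gauss_jordan([[1, 1, 3], [1, -1, 1]])
```

With a unique solution, the last column of the RREF is exactly the solution vector.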

An underdetermined system has more variables than equations; it usually (but not always) has infinitely many solutions (it never has exactly 1). An overdetermined system has more equations than variables; it is usually (but not always) inconsistent.

Chapter 2

Let A = [a_ij], B = [b_ij], C = [c_ij], and let c be a constant. Let I be the identity matrix (1's on the diagonal, 0's elsewhere) and [0] be the matrix with all entries equal to 0.

A + B - add matrices by adding corresponding entries: A + B = [a_ij + b_ij]. Matrices must be the same size in order to add them.

cA - scalar multiplication - multiply every entry of the matrix by the scalar: cA = [c a_ij].

AB - the number of columns in the first matrix must equal the number of rows in the second; if A has size m x n and B has size n x p, then AB has size m x p. The ij-th entry of AB is the dot product of the i-th row of A with the j-th column of B: c_ij = sum_{k=1}^{n} a_ik b_kj. In general, matrix multiplication is not commutative: AB ≠ BA. In fact, AB = BA for all B if and only if A = cI.

Diagonal matrix - a square matrix where all non-diagonal entries are 0: a_ij = 0 whenever i ≠ j.

Trace - the trace of a square matrix is the sum of its diagonal entries: Tr(A) = a_11 + a_22 + ... + a_nn. It is a linear function: Tr(A + B) = Tr(A) + Tr(B) and Tr(cA) = c Tr(A).

The set of all n x n matrices over the real numbers is an algebra. The following properties hold:
1. there is an additive identity: A + [0] = [0] + A = A
2. every matrix has an additive inverse: A + (-A) = (-A) + A = [0]
3. additive associativity: A + (B + C) = (A + B) + C
4. additive commutativity: A + B = B + A
5. a scalar distributes: c(A + B) = cA + cB
6. a matrix distributes over scalars: (c + d)A = cA + dA
7. scalar associativity: (cd)A = c(dA)
8. scalar multiplicative identity: 1A = A
9. c(AB) = (cA)B = A(cB)
10. distributivity: A(B + C) = AB + AC and (A + B)C = AC + BC
11. multiplicative associativity: A(BC) = (AB)C
12. multiplicative identity: AI = IA = A.
Note: all these properties hold for matrices of any size, as long as the sizes are such that the sums and products are defined.
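The entrywise definitions above translate directly into code. A small sketch in plain Python, illustrating the dot-product rule for the ij-th entry, the linearity of the trace, and non-commutativity:

```python
def mat_add(A, B):
    # entrywise sum: requires A and B to be the same size
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_mul(A, B):
    # (i, j) entry is the dot product of row i of A with column j of B
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
# trace is linear: Tr(A + B) = Tr(A) + Tr(B)
assert trace(mat_add(A, B)) == trace(A) + trace(B)
# matrix multiplication is generally not commutative
assert mat_mul(A, B) != mat_mul(B, A)
```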

Be careful of the following:
- AB does not necessarily equal BA (see above).
- We do not in general have the cancellation law: AC = BC does not imply A = B. However, AC = BC or CA = CB does imply A = B if C is invertible.
- AB = [0] does not imply that A = [0] or B = [0].
- The laws of exponents hold for square matrices.

Transpose - the transpose of a matrix A switches the columns to rows and the rows to columns. If A is size m x n, then A^T is n x m. Form A^T by sending a_ij to a_ji for all i, j. Notice that the diagonal entries remain the same. Properties of the transpose: (A^T)^T = A, (A + B)^T = A^T + B^T, (cA)^T = c(A^T), (AB)^T = B^T A^T. Especially note the last one.

A is symmetric if A = A^T. Thus A must be square, with a_ij = a_ji for all i, j. A is skew-symmetric if A = -A^T. AA^T is symmetric for every matrix A.

An n x n matrix A is invertible (nonsingular) if there exists a matrix B such that AB = I = BA. The inverse of a matrix is unique; denote it by A^{-1}. Not all matrices have inverses. Find the inverse of a matrix by adjoining the identity matrix and using Gauss-Jordan elimination to turn A into the identity matrix; the right side will then be the inverse: [A | I] -> [I | A^{-1}]. If you cannot use EROs to make the left side the identity, the matrix is not invertible (it is singular). Check your answer.

For a 2 x 2 matrix we have a shortcut formula (we will look at extending this formula later). If A = [a b; c d], then A^{-1} = (1/(ad - bc)) [d -b; -c a]. Note that this means a 2 x 2 matrix is invertible iff ad - bc ≠ 0.

Properties of inverses:
1. (A^{-1})^{-1} = A
2. (A^k)^{-1} = (A^{-1})^k
3. (cA)^{-1} = (1/c) A^{-1}
4. (A^T)^{-1} = (A^{-1})^T
5. (AB)^{-1} = B^{-1} A^{-1}.
Note that 1 and 5 parallel the corresponding properties of the transpose.

Note: the product of two invertible matrices is invertible; however, the sum of two invertible matrices is not necessarily invertible.

If A is invertible, then Ax = b has a unique solution: x = A^{-1} b. It is about the same amount of work to solve a system this way.
The advantage comes when solving several systems with the same coefficient matrix but different constant terms: we only have to find A^{-1} once.

Idempotent - a matrix A such that A^2 = A. Nilpotent - a matrix A such that A^k = [0] for some natural number k.

Elementary matrix - a matrix which can be obtained from the identity matrix by performing one ERO. Its inverse exists and is also an elementary matrix. Uses: we can use matrix multiplication to perform EROs, some results are easier to prove with elementary matrices, and they are used to find the LU-factorization. Performing an ERO on a matrix A is the same as multiplying on the left by the elementary matrix corresponding to that ERO. Example: applying R_2 - 2R_1 to A is the same as forming E by applying R_2 - 2R_1 to the identity matrix and then multiplying EA.
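The 2 x 2 shortcut formula above, and solving Ax = b as x = A^{-1}b, can be sketched as follows (the example matrix is made up for illustration):

```python
def inverse_2x2(A):
    # shortcut formula: A^{-1} = (1/(ad - bc)) [d -b; -c a]
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")   # invertible iff ad - bc != 0
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_vec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]

A = [[2, 1], [5, 3]]           # ad - bc = 6 - 5 = 1, so A is invertible
A_inv = inverse_2x2(A)
# the unique solution of A x = b is x = A^{-1} b
x = mat_vec(A_inv, [3, 8])
```

Once `A_inv` is computed, solving for a different right-hand side is just another `mat_vec` call, which is the advantage noted above.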

B is row-equivalent to A if there exist elementary matrices E_1, ..., E_k such that B = E_k ··· E_1 A. Thus, B is invertible if and only if it can be written as a product of elementary matrices (take A = I above).

Summary of invertibility equivalences: if A is an n x n matrix, then the following are equivalent (all true or all false):
1. A is invertible
2. Ax = b has a unique solution for every column matrix b
3. Ax = [0] has only the trivial solution
4. A is row-equivalent to I_n
5. A can be written as a product of elementary matrices.

Lower triangular matrix - all entries above the main diagonal are zero. Strictly lower triangular - all entries on and above the main diagonal are zero. Upper triangular matrix - all entries below the main diagonal are zero. Strictly upper triangular - all entries on and below the main diagonal are zero.

LU-factorization - writing a square matrix as a product of a lower and an upper triangular matrix. Used in an efficient algorithm for solving systems of linear equations. To find the LU-factorization, use EROs to put A into upper triangular form: U = E_k ··· E_1 A. Then L = E_1^{-1} ··· E_k^{-1}. If we can use only the operation of adding multiples of one row to another, A definitely has an LU-factorization and it is easy to get L: L will have 1's on the diagonal and, in the corresponding position, the negative of each multiplier used to obtain U. To solve a system Ax = b using LU, solve Ly = b and then solve Ux = y. Each of these is relatively easy since the matrices are triangular.

Chapter 4

Let V be a set on which two operations (vector addition and scalar multiplication) are defined. V is a vector space if the following axioms hold for all u, v, w in V and all scalars c, d.
1. closure under vector addition: u + v is in V
2. vector addition is commutative: u + v = v + u
3. vector addition is associative: (u + v) + w = u + (v + w)
4. there exists an additive identity (zero vector): u + 0 = u
5. every vector has an additive inverse: u + (-u) = 0
6. closure under scalar multiplication: cu is in V
7. a scalar distributes over vectors: c(u + v) = cu + cv
8. a vector distributes over scalars: (c + d)u = cu + du
9.
scalar multiplication is associative: (cd)u = c(du)
10. there is a scalar identity: 1u = u

Standard examples of vector spaces: R^n for n ≥ 1, the m x n matrices, P_n (polynomials of degree less than or equal to n), real-valued continuous functions.

Non-examples of vector spaces: the integers, polynomials of exactly degree n, R^2 with scalar multiplication defined by c(x, y) = (cx, 0).
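The last non-example can be checked mechanically: with scalar multiplication defined as c(x, y) = (cx, 0), the scalar identity axiom (1u = u) fails. A quick sketch:

```python
# Non-example: R^2 with scalar multiplication c * (x, y) = (cx, 0).
def scalar_mul(c, v):
    x, y = v
    return (c * x, 0)          # second component is discarded

# Axiom 10 (scalar identity) requires 1 * u == u, but here it fails:
u = (3, 4)
result = scalar_mul(1, u)      # (3, 0), not (3, 4)
```

One failed axiom is enough to rule the structure out as a vector space, exactly as noted above.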

We need to state what the set of vectors, the set of scalars, vector addition, and scalar multiplication are. To prove something is a vector space, we need to show all 10 axioms hold. To show something is not a vector space, you just need to show one axiom fails.

The additive identity is unique, and each vector has a unique additive inverse.

A nonempty subset W of a vector space V is a subspace of V if W is a vector space under the same operations as defined on V. It is enough to check: 1. W is nonempty 2. W is closed under vector addition 3. W is closed under scalar multiplication. Note that the additive identity must be in W and will be the same one as in V.

Every vector space has at least two subspaces: {0} (the trivial subspace of just the additive identity) and V (the entire vector space). Subspaces of R^2: the two trivial subspaces and any line through (0, 0). Subspaces of R^3: the two trivial subspaces, any line through (0, 0), and any plane through (0, 0). Subspaces of M_n: the symmetric matrices, the triangular matrices, and many more. Look through the exercises in the book for many examples and non-examples of vector spaces.

The intersection of two subspaces of a vector space U is a subspace (it contains at least the additive identity). The union of two subspaces of a vector space U is not necessarily a subspace.

Some important examples of subspaces of a vector space U:
- the trivial subspace {0} and U itself
- for subspaces V, W: the intersection V ∩ W
- the sum of two subspaces: V + W = {v + w : v in V, w in W}
- the set of all linear combinations of a set of vectors in U: {c_1 u_1 + c_2 u_2 + ... + c_k u_k} (the span).

Let V be a vector space. v in V is a linear combination of vectors u_1, u_2, ..., u_k in V if v = c_1 u_1 + c_2 u_2 + ... + c_k u_k for some scalars. We can write this as a system of equations to solve for the scalars - the vectors go in as the columns of a matrix. v can be written as a linear combination as long as the system is consistent.

Let S = {v_1, ..., v_k} be a subset of a vector space V. S is a spanning set of V if every vector in V can be written as a linear combination of vectors in S. We say S spans V (span(S) = V).
A vector space can have many spanning sets.
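Writing v as a linear combination of given vectors means solving a linear system with those vectors as columns. For two vectors in R^2, Cramer's rule gives the scalars directly; the function name and example vectors below are made up for illustration:

```python
# Is v a linear combination of u1 and u2 in R^2?  Put u1, u2 as the
# columns of a 2x2 matrix and solve for the scalars c1, c2.
def combo_coeffs(u1, u2, v):
    a, c = u1            # first column of the coefficient matrix
    b, d = u2            # second column
    det = a * d - b * c
    if det == 0:
        return None      # columns dependent: system may be inconsistent
    c1 = (v[0] * d - b * v[1]) / det    # Cramer's rule
    c2 = (a * v[1] - v[0] * c) / det
    return c1, c2

c1, c2 = combo_coeffs((1, 2), (3, 1), (7, 9))
# check: c1*(1, 2) + c2*(3, 1) should equal (7, 9)
```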

Let S = {v_1, ..., v_k} be a set of vectors in a vector space V. span(S) = {c_1 v_1 + c_2 v_2 + ... + c_k v_k} - the set of all linear combinations of the vectors in S. span(S) is a subspace of V - essentially we take the vectors in S and put in all other vectors necessary to make a subspace. It is the smallest subspace of V which contains S.

Let S = {v_1, ..., v_k} be a set of vectors in a vector space V. The set is linearly independent if the equation c_1 v_1 + c_2 v_2 + ... + c_k v_k = 0 has only the trivial solution c_1 = ... = c_k = 0. Otherwise it is linearly dependent. The set is linearly dependent if and only if one of the vectors v_i can be written as a linear combination of the others. To test for linear independence: write the system of equations in a matrix and solve. Example: for n vectors in R^n, put the vectors as columns in a matrix; they will be linearly independent iff the matrix is invertible.

A set of vectors S = {v_1, ..., v_n} in a vector space V is a basis if 1. S spans V and 2. S is linearly independent. You can think of a basis as a minimal spanning set. Bases are not unique, and S does not have to be a finite set. Standard bases: R^3: {(1,0,0), (0,1,0), (0,0,1)}; P_2: {1, x, x^2}; M_2: the four matrices [1 0; 0 0], [0 1; 0 0], [0 0; 1 0], [0 0; 0 1].

While bases are not unique, every basis for a given vector space V has the same number of vectors. So if a set of n vectors is a basis for V, every basis for V has n vectors. This number, the number of vectors in a basis for V, is called the dimension of the vector space. Examples: the trivial vector space has dimension 0, R^n has dimension n, dim(P_n) = n + 1, dim(M_{m,n}) = mn.

Every vector in V can be written uniquely as a linear combination of the vectors in a basis for V.

Let V be a vector space of dimension n. Then 1. if S is a linearly independent set of n vectors, then S is a basis; 2. if S is a spanning set of n vectors, then S is a basis.
In other words, if we already know the dimension of a vector space, we only have to check that we have the correct number of vectors and verify one of the 2 conditions for a basis.

Let A be an m x n matrix. The row space of A is the subspace of R^n spanned by the row vectors of A. The column space of A is the subspace of R^m spanned by the column vectors of A. If two matrices are row-equivalent, then their row spaces are equal. Thus, we can put a matrix A into row-echelon form (or RREF), and the nonzero rows will form a basis for the row space of A. To find a basis for a subspace of R^n spanned by a given set of vectors, we can place the vectors as rows in a matrix and find a basis for the row space.
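The "nonzero rows of a REF form a basis for the row space" recipe can be sketched as follows; the count of nonzero rows is the dimension of the row space (the rank). The tolerance is an implementation choice, not from the text.

```python
def ref(A):
    # reduce to row echelon form with EROs (sketch; swaps + scaling + elimination)
    m = [row[:] for row in A]
    rows, cols = len(m), len(m[0])
    r = 0
    for c in range(cols):
        pr = next((i for i in range(r, rows) if abs(m[i][c]) > 1e-12), None)
        if pr is None:
            continue                     # no pivot in this column
        m[r], m[pr] = m[pr], m[r]
        m[r] = [x / m[r][c] for x in m[r]]                 # leading 1
        for i in range(r + 1, rows):
            f = m[i][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]  # clear below
        r += 1
    return m, r          # r = number of nonzero rows = rank

A = [[1, 2, 3],
     [2, 4, 6],          # a multiple of row 1, so it contributes nothing
     [1, 0, 1]]
echelon, rank = ref(A)   # the first `rank` rows of `echelon` span the row space
```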

To find a basis for the column space of A, use one of the following methods: 1. the column space of A equals the row space of A^T; 2. row operations change the column space but not the linear dependencies among the columns, so choose the columns of A corresponding to the leading 1's in RREF(A); 3. use elementary column operations.

The row space and column space of A have the same dimension; this is called the rank of A, rank(A). It is the number of nonzero rows in REF(A).

Let A be a fixed m x n matrix. Then {x in R^n : Ax = 0} is a subspace of R^n, called the nullspace of A. The dimension of the nullspace of A is called the nullity of A. To find the nullspace of a matrix A, find RREF(A) and solve the homogeneous system of equations. The number of free variables is the nullity of A. To find a basis, write the solutions parametrically as sx + ty, etc.; the vectors x, y are basis vectors. Thus, the nullity of an invertible matrix is 0.

Rank-Nullity Theorem: if A is an m x n matrix, then rank(A) + nullity(A) = n.

The system of linear equations Ax = b is consistent if and only if b is in the column space of A.

Let B = {v_1, ..., v_n} be an ordered basis (the order in which the vectors are written matters) for a vector space V, and let x be a vector in V such that x = c_1 v_1 + ... + c_n v_n. The scalars c_1, ..., c_n are called the coordinates of x relative to the basis B. The coordinate matrix/vector of x relative to B is the column matrix whose components are those coordinates.

Change of basis - given the coordinates of a vector relative to one basis, find the coordinates relative to another basis. To change basis from B to B', find [x]_{B'} = P[x]_B, where P is the transition matrix. To find P, row reduce [B' : B] -> [I : P].

Chapter 5

Let u, v, w be vectors in a vector space V and let c be a scalar. An inner product on V is a function that associates a real number ⟨u, v⟩ with each pair of vectors and satisfies:
1. ⟨u, v⟩ = ⟨v, u⟩
2. ⟨cu, v⟩ = c⟨u, v⟩ = ⟨u, cv⟩
3. ⟨u, v + w⟩ = ⟨u, v⟩ + ⟨u, w⟩
4. ⟨u, u⟩ ≥ 0, with ⟨u, u⟩ = 0 iff u = 0.
NB: for R^n, the standard inner product is the dot product. NB: we need to check 5 things: that the function outputs a real number, and that it satisfies 1-4. A vector space with an inner product is called an inner product space. Weighted inner products on R^n have the form ⟨u, v⟩ = c_1 u_1 v_1 + ... + c_n u_n v_n, where all of the c_i > 0.

Let u, v be in an inner product space V. The norm of u is ‖u‖ = √⟨u, u⟩. The distance between u and v is d(u, v) = ‖u − v‖. The angle between u and v satisfies cos θ = ⟨u, v⟩ / (‖u‖ ‖v‖).
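Under the standard dot product on R^n, the norm, distance, and angle formulas above compute directly; a small sketch using only the standard library:

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))          # ||u|| = sqrt(<u, u>)

def distance(u, v):
    return norm([a - b for a, b in zip(u, v)])   # d(u, v) = ||u - v||

def angle(u, v):
    # cos(theta) = <u, v> / (||u|| ||v||)
    return math.acos(dot(u, v) / (norm(u) * norm(v)))

theta = angle([1, 0], [1, 1])    # 45 degrees = pi/4
d = distance([0, 0], [3, 4])     # the 3-4-5 triangle
```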

Two vectors are orthogonal if their inner product is 0. A unit vector has norm equal to 1. The unit vector in the same direction as v is u = v/‖v‖.

Let u, v be vectors in an inner product space V.
Cauchy-Schwarz Inequality: |⟨u, v⟩| ≤ ‖u‖ ‖v‖
Triangle Inequality: ‖u + v‖ ≤ ‖u‖ + ‖v‖
Pythagorean Theorem: u and v are orthogonal if and only if ‖u + v‖^2 = ‖u‖^2 + ‖v‖^2

Let u and v be vectors in an inner product space with v ≠ 0. Then the orthogonal projection of u onto v is proj_v u = (⟨u, v⟩ / ⟨v, v⟩) v.

A set S of vectors in an inner product space V is called orthogonal if every pair of vectors in S is orthogonal. If in addition each vector is a unit vector, then S is orthonormal. We really like orthonormal bases because they are easier to work with. The standard bases are orthonormal. B = {(cos θ, sin θ, 0), (−sin θ, cos θ, 0), (0, 0, 1)} is also an orthonormal basis for R^3.

An orthogonal set of nonzero vectors in an inner product space is linearly independent. If w is orthogonal to every vector in a set S, then it is orthogonal to every linear combination of vectors in S.

If B = {v_1, v_2, ..., v_n} is an orthonormal basis for an inner product space V, then the coordinate representation of a vector w with respect to B is: w = ⟨w, v_1⟩v_1 + ⟨w, v_2⟩v_2 + ... + ⟨w, v_n⟩v_n. Hence it is easier to find coordinates with respect to an orthonormal basis than with respect to one which is not.

Gram-Schmidt Orthonormalization Process: turns a basis into an orthonormal basis. It does so by using projections of the vectors to get an orthogonal set, then making them unit vectors. You can make each vector a unit vector as you go along, or make them all unit vectors at the very end. Let B = {v_1, v_2, ..., v_n} be a basis for an inner product space V. Define
w_1 = v_1
w_2 = v_2 − (⟨v_2, w_1⟩ / ⟨w_1, w_1⟩) w_1 = v_2 − proj_{w_1} v_2
...
w_n = v_n − proj_{w_1} v_n − proj_{w_2} v_n − ... − proj_{w_{n−1}} v_n
Then {w_1, ..., w_n} is an orthogonal basis. Make it an orthonormal basis by taking u_i = w_i/‖w_i‖.

An invertible n x n matrix P is an orthogonal matrix if P^{-1} = P^T.
In other words, P P^T = I. The row (column) vectors of P form an orthonormal basis for R^n.

Let W be a subspace of the inner product space V (W is also an inner product space). Then the orthogonal complement of W, written W^⊥ (read "W perp"), is also a subspace of V: W^⊥ = {v in V : ⟨v, w⟩ = 0 for all w in W}. It is the set of all vectors orthogonal to all the vectors in W. (W^⊥)^⊥ = W.
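The Gram-Schmidt process above, normalizing at the very end, as a plain-Python sketch (the two-vector example is made up for illustration):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(basis):
    # orthogonalize first, then normalize at the end (one of the two orders noted above)
    ortho = []
    for v in basis:
        w = list(v)
        for u in ortho:
            c = dot(v, u) / dot(u, u)            # projection coefficient <v, u>/<u, u>
            w = [wi - c * ui for wi, ui in zip(w, u)]   # subtract proj_u v
        ortho.append(w)
    # normalize: u_i = w_i / ||w_i||
    return [[x / dot(w, w) ** 0.5 for x in w] for w in ortho]

q = gram_schmidt([[3.0, 1.0], [2.0, 2.0]])
# q[0], q[1] should now be orthonormal: unit length and mutually orthogonal
```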

Let U, W be subspaces of a vector space V. If V = U + W and U ∩ W = {0}, then V is the direct sum of U and W: V = U ⊕ W. Furthermore, every vector in V then has a unique representation of the form u + w. V = W ⊕ W^⊥. Thus, dim(W) + dim(W^⊥) = dim(V).

To find the orthogonal complement of a subspace S of R^n, set up the homogeneous system of linear equations given by the inner product of each basis vector with a generic vector (v_1, v_2, ..., v_n). Then S^⊥ is the nullspace of that system. For the dot product on R^n: put the basis vectors of S as the rows of a matrix and find its nullspace.

Important examples: let A be an m x n matrix. Then R^m = col(A) ⊕ null(A^T) and R^n = row(A) ⊕ null(A).

Chapter 6

Let T: V → W be a function that maps a vector space V to a vector space W. V is the domain, W is the codomain, and T(V) is the range. Note that codomain and range have different meanings. If T(v) = w, then w is the image of v under T. The range is the set of all images of vectors in V. {v : T(v) = w} is the preimage of w.

T: V → W is a linear transformation of V into W if the following hold for all u, v in V and all scalars c:
1. T(u + v) = T(u) + T(v) (the addition on the left is in V, on the right in W)
2. T(cu) = cT(u).
So it is a function which preserves vector addition and scalar multiplication.

We always have two linear transformations: the zero transformation, which sends everything to the additive identity (T(u) = 0 for all u), and the identity transformation T: V → V, T(u) = u for all u.

Properties of linear transformations:
1. T(0_V) = 0_W - the additive identity goes to the additive identity
2. additive inverses go to additive inverses (take c = -1 above)
3. T(c_1 v_1 + ... + c_n v_n) = c_1 T(v_1) + ... + c_n T(v_n) - linearity extends to linear combinations.
These properties can help spot functions which are not linear transformations. Also, the 3rd property tells us that a linear transformation is determined by its action on a basis: we just have to know what it does to a basis of V to know what it does to every vector in V.

Let A be an m x n matrix.
T(v) = Av is a linear transformation from R^n to R^m. Every linear transformation from R^n to R^m can be represented in this way. Examples of linear transformations: rotation (use sin and cos), projection (a column of 0's), the transpose map T(A) = A^T, differentiation, a line through the origin.

A fixed point is a vector which gets mapped to itself under a linear transformation T: V → V: T(u) = u. Note that in the identity transformation all points are fixed, and the additive identity is always fixed.

The kernel of T: V → W is ker(T) = {v in V : T(v) = 0_W}, the set of all vectors in V which go to the additive identity in W. The kernel is never empty, since the additive identity of V is always in it. Note: this is an analogue of the nullspace, since for T(v) = Av we have ker(T) = null(A). The kernel of the zero transformation is all of V, and the kernel of the identity transformation is the trivial subspace. ker(T) is a subspace of V; the dimension of ker(T) is the nullity of T.

range(T) = {T(v) : v in V} is a subspace of W. The dimension of range(T) is the rank of T. For T(v) = Av, range(T) = col(A). We have the Rank-Nullity Theorem for linear transformations: dim(range) + dim(kernel) = dim(domain).

A function is one-to-one (1-1) if and only if T(u) = T(v) implies u = v (two different elements do NOT map to the same thing). T: V → W is one-to-one if and only if ker(T) is trivial (contains only the additive identity). A function is onto if and only if for every w in W there exists v in V such that T(v) = w; in other words, the range is all of W, and thus rank(T) = dim(W). If V and W have the same dimension, then T is 1-1 iff T is onto.

A linear transformation T: V → W which is 1-1 and onto is called an isomorphism. V and W are isomorphic if there exists an isomorphism T: V → W. Isomorphism says that two spaces are essentially the same: they have the same dimension, the same rank, isomorphic subspaces, etc. If we understand one vector space, we understand all the vector spaces isomorphic to it.

Let V and W be finite-dimensional vector spaces. Then V and W are isomorphic if and only if they have the same dimension. A linear transformation which realizes this takes v_i to w_i for each basis vector.
Note: this is only for finite-dimensional spaces and is special to vector spaces (so don't try to do it with groups in Abstract Algebra ;)).

Every linear transformation can be represented by a matrix: T(v) = Av for some matrix A. Where the basis vectors get mapped determines the matrix. Let T: R^n → R^m, where both vector spaces have the standard bases, and let e_i be the standard basis vector with a 1 in the i-th position and 0's elsewhere. Then T can be defined by T(v) = Av for an m x n matrix A: if T(e_i) = [a_1i a_2i ... a_mi]^T for all i, then A = [a_ij]. Call A the standard matrix. In other words, determine where each standard basis vector goes, and place those image vectors as the columns of your matrix A.
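Building a standard matrix by placing the images of the e_i as columns can be sketched as follows; the 90-degree rotation is an illustrative choice (it sends e_1 to (0, 1) and e_2 to (−1, 0)):

```python
# Build the standard matrix of T: R^n -> R^m from where T sends each e_i.
def standard_matrix(images):
    # images[i] = T(e_i); place them as the COLUMNS of A
    n = len(images)        # number of basis vectors (columns)
    m = len(images[0])     # dimension of the codomain (rows)
    return [[images[j][i] for j in range(n)] for i in range(m)]

def apply(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# rotation by 90 degrees: T(e1) = (0, 1), T(e2) = (-1, 0)
A = standard_matrix([(0, 1), (-1, 0)])
w = apply(A, [1, 2])       # rotate the vector (1, 2)
```

Rotating (1, 2) by 90 degrees counterclockwise should give (−2, 1), which is what `apply` returns.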

Let T_1: R^n → R^m and T_2: R^m → R^p have standard matrices A_1 and A_2. The composition T_2(T_1(v)) has standard matrix A_2 A_1.

T: V → V is invertible if there exists a linear transformation T^{-1}: V → V such that T(T^{-1}(v)) = T^{-1}(T(v)) = v. We can get back to the element we started with; the transformations compose to the identity transformation. If the standard matrix for T is A, then the standard matrix for T^{-1} is A^{-1}. Let T: V → V have standard matrix A. The following are equivalent: 1. T is invertible 2. T is an isomorphism 3. A is invertible.

If you are not using the standard bases, then you need to find T(v_i) and then do a change of basis using the technique from Chapter 4. Let T: V → W, where V has basis B = {v_1, ..., v_n} and W has basis B'. Find [T(v_i)]_{B'}; those vectors are the columns of the matrix for T relative to the bases B and B'.

Let A, B be n x n matrices. A is similar to B if there exists an invertible matrix P such that A = P^{-1}BP. Write A ~ B. Similarity is reflexive (A is similar to A), symmetric (if A is similar to B, then B is similar to A), and transitive (if A is similar to B and B to C, then A is similar to C).

Chapter 3

The determinant is a map from square matrices to the real numbers: det: M_{n,n} → R. It is defined recursively. If A is 2 x 2, det(A) = a_11 a_22 − a_21 a_12. To find the determinant of a larger matrix, pick a row or column. For each element in that row/column, multiply the entry by the determinant of the smaller matrix obtained by covering up the row and column of that element; signs alternate.

The minor M_ij of the entry a_ij is the determinant of the matrix obtained by deleting the i-th row and j-th column of A. The cofactor is C_ij = (−1)^{i+j} M_ij. det(A) = |A| = sum_{j=1}^{n} a_ij C_ij for any fixed i. You can also switch the i's and j's and expand along one column instead. Choose which row/column you use wisely.

The determinant of a diagonal or triangular matrix is the product of the diagonal entries. If any row or column is all 0's, the determinant is 0.
Likewise, if any 2 rows are the same or are multiples of each other, the determinant is 0.

You can use elementary row operations to find the determinant of a matrix. It is easier this way, since the more 0's you have, the easier the determinant, and EROs are easier than the recursive definition. If you put the matrix in (upper triangular) row echelon form, the determinant will be the product of the diagonal entries, adjusted as follows:
1. Interchanging two rows changes the sign of the determinant.

2. Adding a multiple of one row to another does not change the determinant.
3. Multiplying a row by a nonzero constant multiplies the determinant by that constant.
Typically, just using (2) simplifies the matrix enough to find the determinant easily; we do not really have to get 1's on the diagonal. You can also do column operations, which make the same changes as above.

If A and B are n x n matrices, then det(AB) = det(A) det(B) and det(cA) = c^n det(A). A square matrix is invertible if and only if det(A) ≠ 0. det(A^{-1}) = 1/det(A). det(A) = det(A^T). The determinant of an orthogonal matrix is ±1. If A ~ B, then det(A) = det(B).

The adjoint of A is the transpose of the matrix of cofactors: the (i, j) entry of the adjoint matrix is the cofactor C_ji. A^{-1} = (1/det(A)) adj(A).

Cramer's Rule - if a system of n equations in n variables has a coefficient matrix with nonzero determinant, then the solution to the system is x_1 = |A_1|/|A|, ..., x_n = |A_n|/|A|, where A_i is the matrix A with the i-th column replaced by the column of constants.

Chapter 7

Let A be a square matrix. The scalar λ is an eigenvalue of A if there exists a nonzero vector x such that Ax = λx; x is an eigenvector of A corresponding to λ. The eigenspace of λ is the set of all eigenvectors of λ together with the zero vector. It is a subspace of R^n.

Ax = λx iff Ax − λx = 0 iff (A − λI)x = 0. This system has nonzero solutions (remember, eigenvectors cannot be the zero vector) if and only if the coefficient matrix A − λI is not invertible, if and only if det(A − λI) = 0. We find the eigenvalues by setting the polynomial det(A − λI) equal to 0 (it has degree n, so there are at most n eigenvalues). We find each eigenvalue's corresponding eigenvectors by solving the system (A − λI)x = 0. Finding a basis for the solution space gives us a basis for the eigenspace. det(A − λI) is called the characteristic polynomial of A, and det(A − λI) = 0 is the characteristic equation.
If λ is a repeated root of order k of the characteristic polynomial, then we say it has multiplicity k. If A is a triangular matrix, then its eigenvalues are the entries on the diagonal. This is because the determinant of a triangular matrix is the product of its diagonal entries.
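For a 2 x 2 matrix the characteristic polynomial is λ^2 − Tr(A)λ + det(A), so the eigenvalues come straight from the quadratic formula. A minimal sketch, which assumes the eigenvalues are real (as for the symmetric example used here):

```python
# Eigenvalues of a 2x2 matrix from the characteristic polynomial
# det(A - lambda*I) = lambda^2 - Tr(A)*lambda + det(A) = 0.
def eigenvalues_2x2(A):
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4 * det         # assumes disc >= 0 (real eigenvalues)
    root = disc ** 0.5
    return (tr + root) / 2, (tr - root) / 2

lams = eigenvalues_2x2([[2, 1], [1, 2]])   # symmetric, so eigenvalues are real
```

Note that the two eigenvalues sum to Tr(A) and multiply to det(A), consistent with the facts stated below.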

We can also talk about the eigenvalues/eigenvectors of a linear transformation (Ax = λx is just a special case). An eigenvalue of a linear transformation T is a scalar λ such that there exists a nonzero vector x with T(x) = λx. Eigenvectors and eigenspaces are defined likewise.

trace(A) equals the sum of the eigenvalues. det(A) equals the product of the eigenvalues.

Cayley-Hamilton Theorem - a matrix satisfies its characteristic equation; i.e., if p(λ) = 0 is the characteristic equation, then p(A) = [0].

A square matrix A is diagonalizable iff A is similar to a diagonal matrix. If A and B are similar matrices, then they have the same eigenvalues. Thus, if A is diagonalizable, it is similar to a diagonal matrix with A's eigenvalues on the diagonal. A is diagonalizable if and only if it has n linearly independent eigenvectors; in other words, the dimensions of all the eigenspaces add up to n. A is diagonalizable if and only if there exists a matrix P such that A = PDP^{-1}, where D is a diagonal matrix with the eigenvalues of A on the diagonal and the columns of P are n linearly independent eigenvectors of A (the basis vectors for all the eigenspaces). So we get P using the eigenvectors and D using the eigenvalues.

Distinct eigenvalues have linearly independent eigenvectors, so if A has n distinct eigenvalues, it is diagonalizable.

By using the standard matrix of a linear transformation, we can find a basis B for V such that the matrix of T relative to B is diagonal: the basis consists of the eigenvectors of the standard matrix.

If A is not diagonalizable, it is still similar to an upper triangular matrix with the eigenvalues on the diagonal and 1's on the superdiagonal in positions corresponding to eigenspaces that do not have full dimension.

Real Spectral Theorem - if A is a symmetric matrix, then 1. A is diagonalizable 2. all of its eigenvalues are real 3. if λ has multiplicity k, then its eigenspace has dimension k.
The matrix P that diagonalizes a symmetric matrix (D = PAP^{-1}) can be chosen to be an orthogonal matrix. A matrix A is orthogonally diagonalizable if there exists an orthogonal matrix P such that D = PAP^{-1} is diagonal.

Fundamental Theorem of Symmetric Matrices - A is orthogonally diagonalizable (and has real eigenvalues) if and only if A is symmetric.

The End!
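As a concrete check of A = PDP^{-1}, take the symmetric matrix [[2, 1], [1, 2]], which has eigenvalues 3 and 1 with eigenvectors (1, 1) and (1, −1); the example below is a plain-Python sketch using these made-up-for-illustration values.

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[2, 1], [1, 2]]
P = [[1, 1], [1, -1]]                 # columns are the eigenvectors (1,1), (1,-1)
D = [[3, 0], [0, 1]]                  # eigenvalues on the diagonal, same order
P_inv = [[0.5, 0.5], [0.5, -0.5]]     # inverse of P by the 2x2 shortcut formula

# A should equal P D P^{-1}
reconstructed = mat_mul(mat_mul(P, D), P_inv)
```

Normalizing the columns of P (each has length √2) would give an orthogonal P, matching the spectral theorem for this symmetric A.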