Proofs for Quizzes

1 Linear Equations

2 Linear Transformations

Theorem 1 (2.1.3, linearity criterion). A function $T\colon \mathbb{R}^m \to \mathbb{R}^n$ is a linear transformation if and only if

a) $T(v + w) = T(v) + T(w)$, for all vectors $v$ and $w$ in $\mathbb{R}^m$, and

b) $T(kv) = kT(v)$, for all vectors $v$ in $\mathbb{R}^m$ and all scalars $k$.

Suppose $T$ is a linear transformation, and let $A$ be a matrix such that $T(x) = Ax$ for all $x \in \mathbb{R}^m$. Then
\[
T(v + w) = A(v + w) = Av + Aw = T(v) + T(w), \qquad
T(kv) = A(kv) = k(Av) = kT(v).
\]
To prove the converse, suppose that a function $T\colon \mathbb{R}^m \to \mathbb{R}^n$ satisfies (a) and (b). Then for all $x \in \mathbb{R}^m$,
\begin{align*}
T(x) = T\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix}
&= T(x_1 e_1 + x_2 e_2 + \cdots + x_m e_m) \\
&= T(x_1 e_1) + T(x_2 e_2) + \cdots + T(x_m e_m) \\
&= x_1 T(e_1) + x_2 T(e_2) + \cdots + x_m T(e_m) \\
&= \begin{bmatrix} T(e_1) & T(e_2) & \cdots & T(e_m) \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix},
\end{align*}
so $T(x) = Ax$ for the matrix $A = [\, T(e_1) \ T(e_2) \ \cdots \ T(e_m) \,]$, and $T$ is a linear transformation.
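For example, the function $T\colon \mathbb{R}^2 \to \mathbb{R}^2$ defined by $T(x_1, x_2) = (x_1 + 2x_2,\, 3x_1)$ satisfies (a) and (b), and the proof above shows how to read off its matrix from the images of the standard basis vectors:
\[
T(e_1) = \begin{bmatrix} 1 \\ 3 \end{bmatrix}, \qquad
T(e_2) = \begin{bmatrix} 2 \\ 0 \end{bmatrix}, \qquad
T(x) = \begin{bmatrix} 1 & 2 \\ 3 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}.
\]
By contrast, $T(x_1, x_2) = (x_1 + 1,\, x_2)$ fails (b): taking $k = 0$ gives $T(0v) = T(0) = (1, 0) \neq 0 = 0\,T(v)$.

Theorem 2. If $T\colon \mathbb{R}^m \to \mathbb{R}^p$ and $S\colon \mathbb{R}^p \to \mathbb{R}^n$ are linear transformations, then their composition $S \circ T\colon \mathbb{R}^m \to \mathbb{R}^n$ given by $(S \circ T)(x) = S(T(x))$ is also a linear transformation.

We show that if $T$ and $S$ satisfy the linearity criterion, then so does $S \circ T$. Let $v, w \in \mathbb{R}^m$ and $k \in \mathbb{R}$. Then
\[
(S \circ T)(v + w) = S(T(v + w)) = S(T(v) + T(w)) = S(T(v)) + S(T(w)) = (S \circ T)(v) + (S \circ T)(w),
\]
\[
(S \circ T)(kv) = S(T(kv)) = S(kT(v)) = kS(T(v)) = k(S \circ T)(v).
\]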
Theorem 3 (2.4.7, inverse of a product of matrices). If $A$ and $B$ are invertible $n \times n$ matrices, then $AB$ is invertible as well, and $(AB)^{-1} = B^{-1}A^{-1}$.

To show that $B^{-1}A^{-1}$ is the inverse of $AB$, we check that their product in either order is the identity matrix:
\[
(AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = A I_n A^{-1} = AA^{-1} = I_n,
\]
\[
(B^{-1}A^{-1})(AB) = B^{-1}(A^{-1}A)B = B^{-1} I_n B = B^{-1}B = I_n.
\]

3 Subspaces of $\mathbb{R}^n$ and Their Dimensions

Theorem 4 (3.1.4, the image is a subspace). The image of a linear transformation $T\colon \mathbb{R}^m \to \mathbb{R}^n$ has the following properties:

a) contains zero vector: $0 \in \operatorname{im}(T)$.

b) closed under addition: If $y_1, y_2 \in \operatorname{im}(T)$, then $y_1 + y_2 \in \operatorname{im}(T)$.

c) closed under scalar multiplication: If $y \in \operatorname{im}(T)$ and $k \in \mathbb{R}$, then $ky \in \operatorname{im}(T)$.

As we will see in the next section, these three properties mean that $\operatorname{im}(T)$ is a subspace. Let $A$ be the matrix of $T$.

a) $0 = A0 = T(0) \in \operatorname{im}(T)$.

b) There exist vectors $x_1, x_2 \in \mathbb{R}^m$ such that $T(x_1) = y_1$ and $T(x_2) = y_2$. Since $T$ is linear, $y_1 + y_2 = T(x_1) + T(x_2) = T(x_1 + x_2) \in \operatorname{im}(T)$.

c) There exists a vector $x \in \mathbb{R}^m$ such that $T(x) = y$. Since $T$ is linear, $ky = kT(x) = T(kx) \in \operatorname{im}(T)$.

Theorem 5 (3.1.4, the kernel is a subspace). The kernel of a linear transformation $T\colon \mathbb{R}^m \to \mathbb{R}^n$ has the following properties:

a) contains zero vector: $0 \in \ker(T)$.

b) closed under addition: If $x_1, x_2 \in \ker(T)$, then $x_1 + x_2 \in \ker(T)$.

c) closed under scalar multiplication: If $x \in \ker(T)$ and $k \in \mathbb{R}$, then $kx \in \ker(T)$.

As we will see in the next section, these three properties mean that $\ker(T)$ is a subspace. Again let $A$ be the matrix of $T$.

a) $T(0) = A0 = 0$, so $0 \in \ker(T)$.

b) Since $T$ is linear, $T(x_1 + x_2) = T(x_1) + T(x_2) = 0 + 0 = 0$, so $x_1 + x_2 \in \ker(T)$.

c) Since $T$ is linear, $T(kx) = kT(x) = k0 = 0$, so $kx \in \ker(T)$.
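As a concrete instance of Theorems 4 and 5, take $T(x) = Ax$ with
\[
A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}.
\]
The columns are parallel, so $\operatorname{im}(T) = \operatorname{span}\left\{\begin{bmatrix} 1 \\ 2 \end{bmatrix}\right\}$, and solving $Ax = 0$ gives $\ker(T) = \operatorname{span}\left\{\begin{bmatrix} -2 \\ 1 \end{bmatrix}\right\}$. Both are lines through the origin: each contains $0$ and is closed under addition and scalar multiplication.

Theorem 6 (3.2.10, bases and unique representation). The vectors $v_1, \ldots, v_m$ form a basis of a subspace $V$ of $\mathbb{R}^n$ if and only if every vector $v \in V$ can be expressed uniquely as a linear combination
\[
v = c_1 v_1 + \cdots + c_m v_m.
\]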
Suppose $v_1, \ldots, v_m$ is a basis of $V \subseteq \mathbb{R}^n$ and let $v$ be any vector in $V$. Since $v_1, \ldots, v_m$ span $V$, $v$ can be expressed as a linear combination of $v_1, \ldots, v_m$. Suppose there are two such representations,
\[
v = c_1 v_1 + \cdots + c_m v_m, \qquad v = d_1 v_1 + \cdots + d_m v_m.
\]
Subtracting the equations yields the linear relation
\[
0 = (c_1 - d_1)v_1 + \cdots + (c_m - d_m)v_m.
\]
Since $v_1, \ldots, v_m$ are linearly independent, this relation is trivial, meaning that $c_1 - d_1 = \cdots = c_m - d_m = 0$, so $c_i = d_i$ for all $i$. Thus any two representations of $v$ as a linear combination of the basis vectors are in fact identical, so the representation is unique.

Conversely, suppose every vector $v \in V$ can be expressed uniquely as a linear combination of $v_1, \ldots, v_m$. Applying this statement with $v = 0$, we see that $0v_1 + \cdots + 0v_m = 0$ is the only linear relation among $v_1, \ldots, v_m$, so these vectors are linearly independent. Since each $v \in V$ is a linear combination of $v_1, \ldots, v_m$, these vectors span $V$. We conclude that $v_1, \ldots, v_m$ form a basis for $V$.

Theorem 7 (3.4.6). Similarity is an equivalence relation, which means that it satisfies the following three properties for any $n \times n$ matrices $A$, $B$, and $C$:

a) reflexivity: $A \sim A$.

b) symmetry: If $A \sim B$, then $B \sim A$.

c) transitivity: If $A \sim B$ and $B \sim C$, then $A \sim C$.

a) $A = IAI = I^{-1}AI$, so $A \sim A$.

b) If $A \sim B$, then there exists an invertible matrix $S$ such that $B = S^{-1}AS$. Multiplying on the left of each side by $S$ and on the right of each side by $S^{-1}$, we get $SBS^{-1} = A$, or
\[
A = SBS^{-1} = (S^{-1})^{-1} B (S^{-1}),
\]
which shows that $B \sim A$.

c) If $A \sim B$ and $B \sim C$, then there exist invertible matrices $S$ and $T$ such that $B = S^{-1}AS$ and $C = T^{-1}BT$. Substituting for $B$ in the second equation yields
\[
C = T^{-1}(S^{-1}AS)T = (ST)^{-1}A(ST),
\]
which shows that $A \sim C$.
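For instance, with $S = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$, so that $S^{-1} = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}$, we have
\[
S^{-1} \begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix} S = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix},
\]
so $\begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}$; by symmetry and transitivity, any matrix similar to one of these is similar to the other.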
4 Linear Spaces

Theorem 8 (4.2.4, properties of isomorphisms). Let $T\colon V \to W$ be a linear transformation.

a) $T$ is an isomorphism if and only if $\ker(T) = \{0\}$ and $\operatorname{im}(T) = W$.

a) Suppose $T$ is an isomorphism. If $T(f) = 0$ for an element $f \in V$, then we can apply $T^{-1}$ to each side to obtain $T^{-1}(T(f)) = T^{-1}(0)$, or $f = 0$, so $\ker(T) = \{0\}$. To see that $\operatorname{im}(T) = W$, note that any $g$ in $W$ can be written as $g = T(T^{-1}(g)) \in \operatorname{im}(T)$.

Now suppose $\ker(T) = \{0\}$ and $\operatorname{im}(T) = W$. To show that $T$ is invertible, we must show that $T(f) = g$ has a unique solution $f$ for each $g$. Since $\operatorname{im}(T) = W$, there is at least one solution. If $f_1$ and $f_2$ are two solutions, with $T(f_1) = g$ and $T(f_2) = g$, then
\[
T(f_1 - f_2) = T(f_1) - T(f_2) = g - g = 0,
\]
so $f_1 - f_2$ is in the kernel of $T$. Since $\ker(T) = \{0\}$, we have $f_1 - f_2 = 0$ and thus $f_1 = f_2$.

Theorem 9 (isomorphism is an equivalence relation). Isomorphism of linear spaces is an equivalence relation, which means that it satisfies the following three properties for any linear spaces $V$, $W$, and $U$:

a) reflexivity: $V \cong V$.

b) symmetry: If $V \cong W$, then $W \cong V$.

c) transitivity: If $V \cong W$ and $W \cong U$, then $V \cong U$.

a) Any linear space $V$ is isomorphic to itself via the identity transformation $I\colon V \to V$ defined by $I(f) = f$, which is its own inverse.

b) If $V \cong W$, then there exists an invertible linear transformation $T\colon V \to W$. The inverse transformation $T^{-1}\colon W \to V$ is then an isomorphism from $W$ to $V$, so $W \cong V$.

c) If $V \cong W$ and $W \cong U$, then there exist invertible linear transformations $T\colon V \to W$ and $S\colon W \to U$. Composing these transformations, we obtain $S \circ T\colon V \to U$, with inverse transformation $(S \circ T)^{-1} = T^{-1} \circ S^{-1}$. Thus $S \circ T$ is an isomorphism and $V \cong U$.

5 Orthogonality and Least Squares

Theorem 10. A vector $x \in \mathbb{R}^n$ is orthogonal to a subspace $V \subseteq \mathbb{R}^n$ with basis $v_1, \ldots, v_m$ if and only if $x$ is orthogonal to all of the basis vectors $v_1, \ldots, v_m$.

If $x$ is orthogonal to $V$, then $x$ is orthogonal to $v_1, \ldots, v_m$ by definition. Conversely, if $x$ is orthogonal to $v_1, \ldots, v_m$, then any $v \in V$ can be written as a linear combination $v = c_1 v_1 + \cdots + c_m v_m$ of basis vectors, from which it follows that
\[
x \cdot v = x \cdot (c_1 v_1 + \cdots + c_m v_m) = x \cdot (c_1 v_1) + \cdots + x \cdot (c_m v_m) = c_1(x \cdot v_1) + \cdots + c_m(x \cdot v_m) = c_1(0) + \cdots + c_m(0) = 0,
\]
so $x$ is orthogonal to $v$.
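For example, to check that $x = \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}$ is orthogonal to the plane $V = \operatorname{span}\{v_1, v_2\} \subseteq \mathbb{R}^3$ with $v_1 = (1, 0, 1)$ and $v_2 = (0, 1, 1)$, the theorem says it suffices to check the two basis vectors:
\[
x \cdot v_1 = 1 + 0 - 1 = 0, \qquad x \cdot v_2 = 0 + 1 - 1 = 0,
\]
so $x$ is orthogonal to every vector in $V$.

Theorem 11. Orthonormal vectors $u_1, \ldots, u_m$ are linearly independent.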
Consider a relation
\[
c_1 u_1 + \cdots + c_i u_i + \cdots + c_m u_m = 0.
\]
Taking the dot product of each side with $u_i$, we get
\[
(c_1 u_1 + \cdots + c_i u_i + \cdots + c_m u_m) \cdot u_i = 0 \cdot u_i = 0,
\]
which simplifies to
\[
c_1(u_1 \cdot u_i) + c_2(u_2 \cdot u_i) + \cdots + c_i(u_i \cdot u_i) + \cdots + c_m(u_m \cdot u_i) = 0.
\]
Since all of the dot products are $0$ except for $u_i \cdot u_i = 1$, we have $c_i = 0$. This is true for all $i = 1, 2, \ldots, m$, so the relation is trivial and $u_1, \ldots, u_m$ are linearly independent.

Theorem 12 (orthogonal transformations preserve the dot product). A linear transformation $T\colon \mathbb{R}^n \to \mathbb{R}^n$ is orthogonal if and only if $T$ preserves the dot product: $v \cdot w = T(v) \cdot T(w)$ for all $v, w \in \mathbb{R}^n$.

Suppose $T$ is orthogonal. Then $T$ preserves the length of $v + w$, so
\begin{align*}
\|T(v + w)\|^2 &= \|v + w\|^2 \\
(T(v) + T(w)) \cdot (T(v) + T(w)) &= (v + w) \cdot (v + w) \\
T(v) \cdot T(v) + 2\,T(v) \cdot T(w) + T(w) \cdot T(w) &= v \cdot v + 2\,v \cdot w + w \cdot w \\
\|T(v)\|^2 + 2\,T(v) \cdot T(w) + \|T(w)\|^2 &= \|v\|^2 + 2\,v \cdot w + \|w\|^2 \\
2\,T(v) \cdot T(w) &= 2\,v \cdot w \\
T(v) \cdot T(w) &= v \cdot w,
\end{align*}
where we have used that $\|T(v)\| = \|v\|$ and $\|T(w)\| = \|w\|$. Conversely, suppose $T$ preserves the dot product. Then
\[
\|T(v)\|^2 = T(v) \cdot T(v) = v \cdot v = \|v\|^2,
\]
so $\|T(v)\| = \|v\|$, and $T$ is orthogonal.

6 Determinants

Theorem 13 (6.2.7, determinants of similar matrices). If $A$ is similar to $B$, then $\det A = \det B$.

By definition, there exists an invertible matrix $S$ such that $B = S^{-1}AS$, or equivalently $AS = SB$. By the preceding theorem,
\[
(\det A)(\det S) = \det(AS) = \det(SB) = (\det S)(\det B).
\]
Since $S$ is invertible, $\det S \neq 0$, so we can divide each side by it to obtain $\det A = \det B$.
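The similar pair from the example after Theorem 7 illustrates this:
\[
\det \begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix} = 2 = \det \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}.
\]

Theorem 14 (6.3.8, Cramer's Rule). Given a linear system $Ax = b$, with $A$ invertible, define $A_{b,j}$ to be the matrix obtained by replacing the $j$th column of $A$ by $b$. Then the components $x_j$ of the unique solution vector $x$ are
\[
x_j = \frac{\det(A_{b,j})}{\det A}.
\]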
Write $A$ in terms of its columns, as $A = [\, v_1 \ \cdots \ v_j \ \cdots \ v_n \,]$. If $x$ is the solution of the system $Ax = b$, then
\begin{align*}
\det(A_{b,j}) &= \det [\, v_1 \ \cdots \ b \ \cdots \ v_n \,] \\
&= \det [\, v_1 \ \cdots \ Ax \ \cdots \ v_n \,] \\
&= \det [\, v_1 \ \cdots \ (x_1 v_1 + \cdots + x_j v_j + \cdots + x_n v_n) \ \cdots \ v_n \,] \\
&= \det [\, v_1 \ \cdots \ x_j v_j \ \cdots \ v_n \,] \\
&= x_j \det [\, v_1 \ \cdots \ v_j \ \cdots \ v_n \,] \\
&= x_j \det A,
\end{align*}
where the fourth equality holds because the determinant is linear in the $j$th column and vanishes whenever two columns are equal, so every term $x_i v_i$ with $i \neq j$ drops out. Dividing by $\det A \neq 0$ gives the formula.
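As a worked instance of the formula, consider the system
\[
\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 5 \\ 6 \end{bmatrix},
\qquad \det A = -2.
\]
Then
\[
x_1 = \frac{\det \begin{bmatrix} 5 & 2 \\ 6 & 4 \end{bmatrix}}{-2} = \frac{8}{-2} = -4,
\qquad
x_2 = \frac{\det \begin{bmatrix} 1 & 5 \\ 3 & 6 \end{bmatrix}}{-2} = \frac{-9}{-2} = \frac{9}{2},
\]
and indeed $(-4) + 2 \cdot \tfrac{9}{2} = 5$ and $3(-4) + 4 \cdot \tfrac{9}{2} = 6$.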
7 Eigenvalues and Eigenvectors

Theorem 15 (7.2.1, finding eigenvalues). A scalar $\lambda$ is an eigenvalue of an $n \times n$ matrix $A$ if and only if $\det(A - \lambda I_n) = 0$. The expression $f_A(\lambda) = \det(A - \lambda I_n)$ is called the characteristic polynomial of $A$.

Note that
\[
Av = \lambda v \iff Av - \lambda v = 0 \iff Av - \lambda(I_n v) = 0 \iff (A - \lambda I_n)v = 0,
\]
so that we have the following chain of equivalent statements:
\begin{align*}
&\text{$\lambda$ is an eigenvalue of $A$} \\
\iff\ & \text{there exists $v \neq 0$ such that } Av = \lambda v \\
\iff\ & \text{there exists $v \neq 0$ such that } (A - \lambda I_n)v = 0 \\
\iff\ & \ker(A - \lambda I_n) \neq \{0\} \\
\iff\ & A - \lambda I_n \text{ is not invertible} \\
\iff\ & \det(A - \lambda I_n) = 0.
\end{align*}

Theorem 16 (eigenvectors with distinct eigenvalues are linearly independent). Let $A$ be a square matrix. If $v_1, v_2, \ldots, v_s$ are eigenvectors of $A$ with distinct eigenvalues, then $v_1, v_2, \ldots, v_s$ are linearly independent.

We use proof by contradiction. Suppose $v_1, \ldots, v_s$ are linearly dependent, and let $v_m$ be the first redundant vector in this list, with
\[
v_m = c_1 v_1 + \cdots + c_{m-1} v_{m-1}.
\]
Suppose $Av_i = \lambda_i v_i$. Since the eigenvector $v_m$ is not $0$, there must be some nonzero coefficient $c_k$. Multiplying the equation $v_m = c_1 v_1 + \cdots + c_k v_k + \cdots + c_{m-1} v_{m-1}$ by $A$, we get
\begin{align*}
Av_m &= A(c_1 v_1 + \cdots + c_k v_k + \cdots + c_{m-1} v_{m-1}) \\
Av_m &= c_1 Av_1 + \cdots + c_k Av_k + \cdots + c_{m-1} Av_{m-1} \\
\lambda_m v_m &= c_1 \lambda_1 v_1 + \cdots + c_k \lambda_k v_k + \cdots + c_{m-1} \lambda_{m-1} v_{m-1}.
\end{align*}
Multiplying the same equation instead by $\lambda_m$, we get
\[
\lambda_m v_m = c_1 \lambda_m v_1 + \cdots + c_k \lambda_m v_k + \cdots + c_{m-1} \lambda_m v_{m-1},
\]
which, when subtracted from our result above, yields
\[
0 = (\lambda_m - \lambda_m) v_m = c_1(\lambda_1 - \lambda_m) v_1 + \cdots + c_k(\lambda_k - \lambda_m) v_k + \cdots + c_{m-1}(\lambda_{m-1} - \lambda_m) v_{m-1}.
\]
Since $c_k$ and $\lambda_k - \lambda_m$ are nonzero, we have a nontrivial linear relation among the vectors $v_1, \ldots, v_{m-1}$, contradicting the minimality of $m$.

Theorem 17 (7.3.6, eigenvalues of similar matrices). Suppose $A$ is similar to $B$. Then

a) $f_A(\lambda) = f_B(\lambda)$.

a) If $B = S^{-1}AS$ and $A$, $B$ are $n \times n$ matrices, then
\begin{align*}
f_B(\lambda) &= \det(B - \lambda I_n) = \det(S^{-1}AS - \lambda S^{-1} I_n S) \\
&= \det(S^{-1}(A - \lambda I_n)S) = (\det S^{-1})(\det(A - \lambda I_n))(\det S) \\
&= (\det S)^{-1}(\det S)(\det(A - \lambda I_n)) = \det(A - \lambda I_n) = f_A(\lambda).
\end{align*}
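The similar matrices $A = \begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}$ from the earlier examples tie these results together:
\[
f_A(\lambda) = \det \begin{bmatrix} 1 - \lambda & 1 \\ 0 & 2 - \lambda \end{bmatrix} = (1 - \lambda)(2 - \lambda) = f_B(\lambda),
\]
so both matrices have eigenvalues $1$ and $2$, as Theorem 17 requires. The corresponding eigenvectors of $A$, namely $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ for $\lambda = 1$ and $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ for $\lambda = 2$, have distinct eigenvalues and are linearly independent, as Theorem 16 guarantees.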