Linear Algebra¹ — M. T. Nair, Department of Mathematics, IIT Madras

1 Eigenvalues and Eigenvectors

1.1 Definition and Examples

Definition 1.1. Let V be a vector space (over a field F) and T : V → V be a linear operator. A scalar λ is called an eigenvalue of T if there exists a non-zero x ∈ V such that T x = λx, and in that case x is called an eigenvector of T corresponding to the eigenvalue λ. The set of all eigenvalues of T is called the eigen-spectrum or point spectrum of T, and we denote it by σ_eig(T).

Let T : V → V be a linear operator and λ ∈ F. Observe:

• λ ∈ σ_eig(T) ⟺ T − λI is not one-one.

• A non-zero x ∈ V is an eigenvector of T corresponding to λ ∈ σ_eig(T) ⟺ x ∈ N(T − λI) \ {0}.

• The set of all eigenvectors of T corresponding to λ ∈ σ_eig(T) is the set N(T − λI) \ {0}.

Definition 1.2. Let T : V → V be a linear operator and λ be an eigenvalue of T.

1. The subspace N(T − λI) of V is called the eigenspace of T corresponding to the eigenvalue λ.

2. dim[N(T − λI)] is called the geometric multiplicity of λ.

Remark 1.3. If V is the zero space, then the zero operator is the only operator on V, and it does not have any eigenvalue, as there is no non-zero vector in V.

Example 1.4. Let A ∈ R^{n×n}, and consider it as a linear operator from R^n to itself. We know that A is not one-one if and only if the columns of A are linearly dependent, if and only if det(A) = 0. Thus,

λ ∈ σ_eig(A) ⟺ det(A − λI) = 0.

¹Lectures for the course MA5310, July–November 2012.
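The equivalence in Example 1.4 can be checked numerically. A minimal sketch in NumPy (the matrix below is an illustrative choice, not from the notes):

```python
import numpy as np

# Illustrative symmetric 2x2 matrix whose eigenvalues are 1 and 3
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Eigenvalues of A, i.e., the zeros of det(A - lambda I)
eigvals = np.linalg.eigvals(A)

# For each eigenvalue, A - lambda I is singular (not one-one)
for lam in eigvals:
    assert abs(np.linalg.det(A - lam * np.eye(2))) < 1e-9

assert np.allclose(sorted(eigvals.real), [1.0, 3.0])
```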
1.2 Existence of eigenvalues

Note that for a given A ∈ R^{n×n}, there need not exist λ ∈ R such that det(A − λI) = 0. For example, consider n = 2 and

A = [ 0  −1 ]
    [ 1   0 ].

This matrix has no real eigenvalues! However, if A ∈ C^{n×n}, then, by the fundamental theorem of algebra, there exists λ ∈ C such that det(A − λI) = 0. Thus, in this case σ_eig(A) ≠ ∅.

Now, recall that if V is a finite dimensional vector space, say of dimension n, and {u_1, ..., u_n} is a basis E of V, and if T : V → V is a linear transformation, then T is one-one ⟺ the columns of [T]_EE are linearly independent, and hence, in this case,

λ ∈ σ_eig(T) ⟺ det([T]_EE − λI) = 0.

Note that the above equivalence is true for any basis E of V. Hence, the eigenvalues of a linear operator T can be found by finding the zeros of the polynomial det([T]_EE − λI) in F. This also shows that:

THEOREM 1.5. If V is a finite dimensional vector space over an algebraically closed field F, then every linear operator on V has at least one eigenvalue.

Recall from algebra that C is an algebraically closed field, whereas R and Q are not algebraically closed. We shall give a proof of the above theorem without relying on the concept of determinant. Before that, let us observe that the conclusion of the above theorem need not hold if the space is infinite dimensional.

Example 1.6. (i) Let V = P, the space of all polynomials over F, which is either R or C. Let T p(t) = t p(t), p(t) ∈ P. Note that for λ ∈ F and p(t) ∈ P,

T p(t) = λ p(t) ⟹ t p(t) = λ p(t) ⟹ p(t) = 0.

Hence, σ_eig(T) = ∅.

(ii) Let V = c_00 and T be the right shift operator on V, i.e.,

T(α_1, α_2, ...) = (0, α_1, α_2, ...).

Then we see that σ_eig(T) = ∅.
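The two situations above can be seen side by side in a small NumPy sketch: the rotation-like matrix of the example has no real eigenvalues, but over C the fundamental theorem of algebra produces them.

```python
import numpy as np

# The matrix from the example: characteristic polynomial t^2 + 1
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])

eigvals = np.linalg.eigvals(A)  # computed over C

# Over C the eigenvalues exist: they are +i and -i
assert np.allclose(sorted(eigvals, key=lambda z: z.imag), [-1j, 1j])

# None of them is real, so sigma_eig(A) is empty over R
assert all(abs(lam.imag) > 1e-9 for lam in eigvals)
```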
Proof of Theorem 1.5 independent of determinant. Let V be an n dimensional vector space over an algebraically closed field F. Let x be a non-zero vector in V. If T x = 0, then 0 is an eigenvalue. Assume that T x ≠ 0. Then we know that {x, T x, ..., T^n x} is linearly dependent, so that there exist α_0, α_1, ..., α_k in F with k ∈ {1, ..., n} such that α_k ≠ 0 and

α_0 x + α_1 T x + ⋯ + α_k T^k x = 0,

i.e.,

(α_0 I + α_1 T + ⋯ + α_k T^k)x = 0.

Thus, p(T)x = 0, where p(t) := α_0 + α_1 t + ⋯ + α_k t^k. By the fundamental theorem of algebra, there exist λ_1, ..., λ_k in F such that

p(t) = α_k (t − λ_1) ⋯ (t − λ_k).

Since p(T)x = 0, we have α_k (T − λ_1 I) ⋯ (T − λ_k I)x = 0. This shows that at least one of T − λ_1 I, ..., T − λ_k I is not one-one. Thus, at least one of λ_1, ..., λ_k is an eigenvalue of T, and hence, σ_eig(T) ≠ ∅.

Can we show existence of an eigenvalue by imposing more conditions on the space V and the operator? Here is an answer in this respect.

THEOREM 1.7. Let V be a non-zero finite dimensional inner product space over F which is either R or C, and T be a self adjoint operator on V. Then σ_eig(T) ≠ ∅ and σ_eig(T) ⊆ R.

Proof. Let x be a non-zero vector in V. As in the proof of Theorem 1.5, let p(t) := α_0 + α_1 t + ⋯ + α_k t^k be such that α_k ≠ 0 and p(T)x = 0. Let λ_1, ..., λ_k in C be such that

p(t) = α_k (t − λ_1) ⋯ (t − λ_k).

If λ_j ∉ R for some j, then we know that λ̄_j is also a zero of p(t). So, there is l such that λ_l = λ̄_j. Writing λ_j = α_j + iβ_j with α_j, β_j ∈ R and β_j ≠ 0, we have

(t − λ_j)(t − λ_l) = [t − (α_j + iβ_j)][t − (α_j − iβ_j)] = (t − α_j)² + β_j².

Since p(T)x = 0, it follows that either there exists some m such that λ_m ∈ R and T − λ_m I is not one-one, or there exists some j such that λ_j ∉ R and (T − α_j I)² + β_j² I is not one-one. In the first case, λ_m ∈ R is an eigenvalue. In the latter case, there exists u ≠ 0 in V such that

[(T − α_j I)² + β_j² I]u = 0.
Now, using the self adjointness of T,

0 = ⟨[(T − α_j I)² + β_j² I]u, u⟩ = ⟨(T − α_j I)²u, u⟩ + β_j² ⟨u, u⟩ = ⟨(T − α_j I)u, (T − α_j I)u⟩ + β_j² ⟨u, u⟩.

Since u ≠ 0, it follows that β_j = 0, contradicting β_j ≠ 0. Hence the latter case cannot occur, and T has a real eigenvalue.

Next, suppose that λ ∈ σ_eig(T). If x is an eigenvector corresponding to λ, then we have

λ ⟨x, x⟩ = ⟨λx, x⟩ = ⟨T x, x⟩ = ⟨x, T x⟩ = ⟨x, λx⟩ = λ̄ ⟨x, x⟩.

Hence, λ ∈ R.

THEOREM 1.8. Eigenvectors corresponding to distinct eigenvalues of a linear operator are linearly independent.

Proof. Let λ_1, ..., λ_n be distinct eigenvalues of a linear operator T : V → V and let u_1, ..., u_n be eigenvectors corresponding to λ_1, ..., λ_n, respectively. We prove the result by induction.

Let n = 2, and let α_1, α_2 be such that α_1 u_1 + α_2 u_2 = 0. Applying T to this equation, and multiplying it by λ_2, we obtain

α_1 λ_1 u_1 + α_2 λ_2 u_2 = 0,  (i)
α_1 λ_2 u_1 + α_2 λ_2 u_2 = 0.  (ii)

Hence, (ii) − (i) implies α_1 (λ_2 − λ_1) u_1 = 0. Since λ_2 ≠ λ_1, we have α_1 = 0. Hence, from the equation α_1 u_1 + α_2 u_2 = 0, we obtain α_2 = 0.

Next, assume that the result is true for n = k for some k with 2 ≤ k < n. Let α_1, ..., α_{k+1} be such that

α_1 u_1 + ⋯ + α_{k+1} u_{k+1} = 0.  (iii)

Applying T to (iii), and multiplying (iii) by λ_{k+1}, we have

α_1 λ_1 u_1 + ⋯ + α_{k+1} λ_{k+1} u_{k+1} = 0,  (iv)
α_1 λ_{k+1} u_1 + ⋯ + α_{k+1} λ_{k+1} u_{k+1} = 0.  (v)

Hence, (v) − (iv) implies

α_1 (λ_{k+1} − λ_1) u_1 + ⋯ + α_k (λ_{k+1} − λ_k) u_k = 0.

By the induction assumption, u_1, ..., u_k are linearly independent. Since λ_1, ..., λ_k, λ_{k+1} are distinct, it follows that α_1 = α_2 = ⋯ = α_k = 0. Hence, from (iii), α_{k+1} = 0 as well. This completes the proof.
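Theorem 1.8 can be illustrated numerically: eigenvectors for distinct eigenvalues assemble into a matrix of full rank. A small sketch, with an illustrative upper triangular matrix:

```python
import numpy as np

# Illustrative matrix with three distinct eigenvalues 1, 2, 3
A = np.diag([1.0, 2.0, 3.0]) + np.triu(np.ones((3, 3)), k=1)

vals, vecs = np.linalg.eig(A)
assert len(set(np.round(vals.real, 6))) == 3  # eigenvalues are distinct

# Columns of `vecs` are eigenvectors; distinct eigenvalues force independence
assert np.linalg.matrix_rank(vecs) == 3
```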
LEMMA 1.9. Let V be a non-zero finite dimensional inner product space over F which is either R or C, and A be a normal operator on V. Let λ ∈ F and x ∈ V. Then

Ax = λx ⟺ A*x = λ̄x.

Proof. Since A is normal, i.e., A*A = AA*, it can be seen that A − λI is also a normal operator. Indeed,

(A − λI)(A − λI)* = AA* − λ̄A − λA* + |λ|²I = A*A − λ̄A − λA* + |λ|²I = (A − λI)*(A − λI).

Thus,

‖(A − λI)x‖² = ⟨(A − λI)x, (A − λI)x⟩ = ⟨(A − λI)*(A − λI)x, x⟩ = ⟨(A − λI)(A − λI)*x, x⟩ = ⟨(A − λI)*x, (A − λI)*x⟩ = ‖(A* − λ̄I)x‖².

Hence, Ax = λx ⟺ A*x = λ̄x.

THEOREM 1.10. Let V be a non-zero finite dimensional inner product space over F which is either R or C, and T be a normal operator on V. Then eigenvectors associated with distinct eigenvalues are orthogonal. In particular,

λ ≠ µ ⟹ N(T − λI) ⊥ N(T − µI).

Proof. Let T be a normal operator and let λ and µ be distinct eigenvalues of T with corresponding eigenvectors x and y, respectively. Then, using Lemma 1.9,

λ ⟨x, y⟩ = ⟨λx, y⟩ = ⟨T x, y⟩ = ⟨x, T*y⟩ = ⟨x, µ̄y⟩ = µ ⟨x, y⟩,

so that (λ − µ) ⟨x, y⟩ = 0. Since λ ≠ µ, we have ⟨x, y⟩ = 0.

1.3 Diagonalizability

We observe: If V is a finite dimensional vector space and T is a linear operator on V such that there is a basis E for V consisting of eigenvectors of T, then [T]_EE is a diagonal matrix.

In view of the above observation we have the following definition.
Definition 1.11. Let V be a finite dimensional vector space and T be a linear operator on V. Then T is said to be diagonalizable if there is a basis E for V consisting of eigenvectors of T, so that [T]_EE is a diagonal matrix.

THEOREM 1.12. Let V be a finite dimensional vector space and T be a linear operator on V. Then T is diagonalizable if and only if there are distinct λ_1, ..., λ_k in F such that

V = N(T − λ_1 I) + ⋯ + N(T − λ_k I).

Look at the following example.

Example 1.13. Consider the matrix

A = [ 0  1 ]
    [ 0  0 ].

We observe that A as a linear operator on R² has only one eigenvalue, namely 0, and its geometric multiplicity is 1. Hence there is no basis for R² consisting of eigenvectors of A, and the above operator is not diagonalizable.

Remark 1.14. Let V be an n-dimensional vector space and T be a linear operator on V. Suppose T is diagonalizable. Let {u_1, ..., u_n} be a basis of V consisting of eigenvectors of T, and let λ_j ∈ F be such that T u_j = λ_j u_j for j = 1, ..., n. Let us use the notation U := [u_1, ..., u_n] for the map from F^n to V defined by

[u_1, ..., u_n](α_1, ..., α_n)^T = α_1 u_1 + ⋯ + α_n u_n.

Then we have

T U = T[u_1, ..., u_n] = [T u_1, ..., T u_n] = [λ_1 u_1, ..., λ_n u_n].

Thus, using the standard basis {e_1, ..., e_n} of F^n, we have T U e_j = λ_j U e_j, j = 1, ..., n. Thus, equivalently,

T U = UΛ, i.e., U^{−1} T U = Λ,

where Λ := diag(λ_1, ..., λ_n), the diagonal matrix with diagonal entries λ_1, ..., λ_n. If T itself is an n×n matrix, then the above relation shows that T is similar to a diagonal matrix.

Under what conditions on the space V and the operator T can we say that T is diagonalizable?

THEOREM 1.15. Let V be a finite dimensional vector space, say dim(V) = n, and T be a linear operator on V.
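The relation U^{−1}TU = Λ of Remark 1.14 can be checked numerically. A minimal NumPy sketch (the matrix is an illustrative choice with two distinct eigenvalues):

```python
import numpy as np

# Illustrative diagonalizable matrix (eigenvalues 5 and 2)
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# Columns of U are eigenvectors of A, so A U = U Lambda
vals, U = np.linalg.eig(A)

# U^{-1} A U is the diagonal matrix of eigenvalues
Lam = np.linalg.inv(U) @ A @ U
assert np.allclose(Lam, np.diag(vals))
```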
(i) If T has n distinct eigenvalues, then T is diagonalizable.

(ii) If T has an eigenvalue λ such that N(T − λI) is a proper subspace of N(T − λI)², then T is not diagonalizable.

Proof. (i) Follows from Theorem 1.8.

(ii) Assume for a moment that T is diagonalizable. Then, by Theorem 1.12, there are distinct λ_1, ..., λ_k in F such that V = N(T − λ_1 I) + ⋯ + N(T − λ_k I). Let x ∈ N(T − λ_1 I)², and let x_j ∈ N(T − λ_j I) be such that x = x_1 + ⋯ + x_k. Then

(T − λ_1 I)x = (T − λ_1 I)x_1 + ⋯ + (T − λ_1 I)x_k = (T − λ_1 I)x_2 + ⋯ + (T − λ_1 I)x_k.

We observe that (T − λ_1 I)x ∈ N(T − λ_1 I) and (T − λ_1 I)x_j ∈ N(T − λ_j I) for j = 2, ..., k. Since eigenvectors corresponding to distinct eigenvalues are linearly independent (Theorem 1.8), N(T − λ_1 I) intersects N(T − λ_2 I) + ⋯ + N(T − λ_k I) only in {0}. Hence, (T − λ_1 I)(x − x_1) = 0. Consequently, x ∈ N(T − λ_1 I). Since N(T − λ_1 I) ⊆ N(T − λ_1 I)², we obtain N(T − λ_1 I)² = N(T − λ_1 I). Similarly, we have N(T − λ_j I)² = N(T − λ_j I) for j = 1, ..., k. This contradicts the assumption on λ. Hence, T is not diagonalizable.

In view of the above theorem, we introduce the following definition.

Definition 1.16. An eigenvalue λ of a linear operator T : V → V is said to be defective if N(T − λI) is a proper subspace of N(T − λI)².

THEOREM 1.17. Let T be a self-adjoint operator on an inner product space V. Then every eigenvalue of T is non-defective.

Proof. By Theorem 1.7, λ ∈ R, so that T − λI is self-adjoint. Hence, for x ∈ V,

⟨(T − λI)²x, x⟩ = ⟨(T − λI)x, (T − λI)x⟩ = ‖(T − λI)x‖².

Thus, (T − λI)²x = 0 implies (T − λI)x = 0, and hence N(T − λI)² = N(T − λI).

Still it is not clear from whatever we have proved whether a self-adjoint operator on a finite dimensional space is diagonalizable or not. We shall take up this issue in the next section. Before that, let us observe some facts: For any linear operator T : V → V,

{0} ⊆ N(T) ⊆ N(T²) ⊆ ⋯ ⊆ N(T^n) ⊆ ⋯.
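A defective eigenvalue can be detected numerically by comparing null space dimensions. A minimal sketch, using the matrix of Example 1.13, whose eigenvalue 0 is defective:

```python
import numpy as np

# Eigenvalue 0 of this matrix is defective: N(A) is a proper subspace
# of N(A^2) (here A^2 = 0, so N(A^2) is all of R^2).
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])

# Null space dimension via rank-nullity
dim_null = lambda M: M.shape[1] - np.linalg.matrix_rank(M)

assert dim_null(A) == 1       # geometric multiplicity of 0 is 1
assert dim_null(A @ A) == 2   # N(A^2) = R^2, strictly larger
```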
If there exists k ∈ N such that N(T^k) = N(T^{k+1}), then

N(T^k) = N(T^{k+j}) ∀ j ∈ N.

If V is finite dimensional and N(T) ≠ {0}, then there exists k ∈ N such that

N(T^{k−1}) ⊊ N(T^k) = N(T^{k+j}) ∀ j ∈ N.

Definition 1.18. Let V be a finite dimensional space and λ be an eigenvalue of T. Then the number

l := min{k : N(T − λI)^{k−1} ⊊ N(T − λI)^k = N(T − λI)^{k+1}}

is called the ascent or index of λ.

Note that: If l is the ascent of an eigenvalue λ, then

N(T − λI)^l = ⋃_{k=1}^∞ N(T − λI)^k.

Definition 1.19. Let V be a finite dimensional space and λ be an eigenvalue of T with ascent l. Then the space N(T − λI)^l is called the generalized eigen-space of T corresponding to the eigenvalue λ. Members of a generalized eigen-space are called generalized eigenvectors.

1.4 Spectral representation of self adjoint operators

A natural question is whether every self-adjoint operator on a finite dimensional inner product space is diagonalizable. The answer is in the affirmative. In order to prove this, we shall make use of a definition and a preparatory lemma.

Definition 1.20. Let V be a vector space and T be a linear operator on V. A subspace V_0 of V is said to be invariant under T if T(V_0) ⊆ V_0, that is, for every x ∈ V,

x ∈ V_0 ⟹ T x ∈ V_0,

and in that case, we say that V_0 is an invariant subspace of T.

LEMMA 1.21. Let T be a self-adjoint operator on an inner product space V. Let V_0 be an invariant subspace of T. Then

(i) V_0^⊥ is invariant under T,

(ii) T_0 := T|_{V_0} : V_0 → V_0, the restriction of T to V_0, is self-adjoint.
Proof. (i) Suppose V_0 is invariant under T. Then for every x ∈ V_0^⊥ and u ∈ V_0, we have T u ∈ V_0, and hence

⟨T x, u⟩ = ⟨x, T u⟩ = 0,

so that T x ∈ V_0^⊥.

(ii) For every x, y ∈ V_0, we have

⟨T_0 x, y⟩ = ⟨T x, y⟩ = ⟨x, T y⟩ = ⟨x, T_0 y⟩.

This completes the proof.

THEOREM 1.22. (Spectral representation) Let T be a self-adjoint operator on a finite dimensional inner product space V, say of dimension n. Let λ_1, ..., λ_k be the distinct eigenvalues of T. Then

V = N(T − λ_1 I) + ⋯ + N(T − λ_k I).

Further, there exists a linear operator U : F^n → V such that U*U = I_n, UU* = I_V, and [T]_EE = U*TU is a diagonal matrix with diagonal entries λ_1, ..., λ_k, where λ_j is repeated n_j := dim N(T − λ_j I) times for j = 1, ..., k.

Proof. Let V_0 = N(T − λ_1 I) + ⋯ + N(T − λ_k I). By the Projection Theorem, V = V_0 + V_0^⊥. It is enough to show that V_0^⊥ = {0}. Suppose V_0^⊥ ≠ {0}. By Lemma 1.21, V_0^⊥ is invariant under T and the operator T_1 := T|_{V_0^⊥} : V_0^⊥ → V_0^⊥, the restriction of T to V_0^⊥, is self-adjoint. By Theorem 1.7, T_1 has an eigenvalue λ ∈ R. Let x ∈ V_0^⊥ be a corresponding eigenvector. Now, since λx = T_1 x = T x, λ ∈ {λ_1, ..., λ_k}. Without loss of generality, assume that λ = λ_1. Then x ∈ N(T − λ_1 I) ⊆ V_0. Thus, x ∈ V_0 ∩ V_0^⊥ = {0}, a contradiction. Hence, V_0^⊥ = {0}, and

V = N(T − λ_1 I) + ⋯ + N(T − λ_k I).

To see the remaining part, for each j ∈ {1, ..., k}, let {u_{j1}, ..., u_{jn_j}} be an ordered orthonormal basis of N(T − λ_j I). Then we see that

E = {u_{11}, ..., u_{1n_1}, u_{21}, ..., u_{2n_2}, ..., u_{k1}, ..., u_{kn_k}}

is an ordered orthonormal basis for V. To simplify the notation, let us write the above ordered E as {u_1, ..., u_n}, and take µ_1, ..., µ_n such that µ_{n_{j−1}+i} = λ_j for i = 1, ..., n_j, with n_0 = 0 and j = 1, ..., k. Let J : V → F^n be the canonical isomorphism defined by

J(x) = [x]_E, x ∈ V.

Then we have J* = J^{−1}, and U := J* satisfies

U*U = JJ^{−1} = I_n, UU* = J^{−1}J = I_V, U*TU = JTJ^{−1} = A := [T]_EE.
Further,

A e_j = J T J^{−1} e_j = J T u_j = J(µ_j u_j) = µ_j J u_j = µ_j e_j.

Thus, A := [T]_EE is a diagonal matrix with diagonal entries µ_1, ..., µ_n.

Remark 1.23. Recall that the U introduced in the proof of Theorem 1.22 is the same as the operator introduced in Remark 1.14, namely,

U = [u_{11}, ..., u_{1n_1}, u_{21}, ..., u_{2n_2}, ..., u_{k1}, ..., u_{kn_k}].

COROLLARY 1.24. (Spectral representation) Let T be a self-adjoint operator on a finite dimensional inner product space V, say of dimension n. Let λ_1, ..., λ_k be the distinct eigenvalues of T. For each i, let {u_{i1}, ..., u_{in_i}} be an ordered orthonormal basis of N(T − λ_i I). Then

T x = Σ_{i=1}^k Σ_{j=1}^{n_i} λ_i ⟨x, u_{ij}⟩ u_{ij}, x ∈ V.

COROLLARY 1.25. (Spectral representation) Let T be a self-adjoint operator on a finite dimensional inner product space V, say of dimension n. Let λ_1, ..., λ_k be the distinct eigenvalues of T. For each i ∈ {1, ..., k}, let P_i be the orthogonal projection onto N(T − λ_i I). Then

T = Σ_{i=1}^k λ_i P_i.

COROLLARY 1.26. (Diagonal representation) Let A ∈ F^{n×n} be a self adjoint matrix (i.e., hermitian if F = C and symmetric if F = R). Then there exists a unitary matrix U ∈ F^{n×n} such that U*AU is a diagonal matrix.

1.5 Singular value representation

Let T be a linear operator on a finite dimensional inner product space V. Then we know that T*T is a self adjoint operator. By the spectral theorem, we know that V has an orthonormal basis E = {u_1, ..., u_n} consisting of eigenvectors of T*T, and if T*T u_j = λ_j u_j for j = 1, ..., n (where the λ_j need not be distinct), then

T*T x = Σ_{j=1}^n λ_j ⟨x, u_j⟩ u_j, x ∈ V.

Note that

λ_j = λ_j ⟨u_j, u_j⟩ = ⟨λ_j u_j, u_j⟩ = ⟨T*T u_j, u_j⟩ = ⟨T u_j, T u_j⟩ = ‖T u_j‖² ≥ 0.

Let λ_1, ..., λ_k be the nonzero (positive) numbers among λ_1, ..., λ_n. For j ∈ {1, ..., k}, let us write λ_j = s_j², where s_j is the positive square-root of λ_j. Thus, writing v_j = T u_j / s_j, we obtain

T u_j = s_j v_j, T* v_j = s_j u_j.
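Corollary 1.25 can be checked numerically for a real symmetric matrix: the rank-one projections onto the (one-dimensional) eigenspaces resolve the identity and reconstruct the operator. A minimal sketch with an illustrative matrix:

```python
import numpy as np

# Illustrative real symmetric matrix with distinct eigenvalues
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

vals, Q = np.linalg.eigh(A)  # columns of Q: orthonormal eigenvectors

# A = sum_i lambda_i P_i with P_i = u_i u_i^T (orthogonal projections)
recon = sum(lam * np.outer(Q[:, i], Q[:, i]) for i, lam in enumerate(vals))
assert np.allclose(recon, A)

# The projections resolve the identity: P_1 + P_2 = I
assert np.allclose(sum(np.outer(Q[:, i], Q[:, i]) for i in range(2)), np.eye(2))
```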
Further, since x = Σ_{j=1}^n ⟨x, u_j⟩ u_j and T u_j = 0 for j > k, we have

T x = Σ_{j=1}^n ⟨x, u_j⟩ T u_j = Σ_{j=1}^k s_j ⟨x, u_j⟩ v_j.  (1)

Also,

⟨x, T*y⟩ = ⟨T x, y⟩ = Σ_{j=1}^k s_j ⟨x, u_j⟩ ⟨v_j, y⟩ = ⟨x, Σ_{j=1}^k s_j ⟨y, v_j⟩ u_j⟩.

Hence,

T*y = Σ_{j=1}^k s_j ⟨y, v_j⟩ u_j.  (2)

Observe that

s_j ⟨v_i, v_j⟩ = ⟨v_i, s_j v_j⟩ = ⟨v_i, T u_j⟩ = ⟨T* v_i, u_j⟩ = ⟨s_i u_i, u_j⟩ = s_i ⟨u_i, u_j⟩.

Therefore, {v_j : j = 1, ..., k} is an orthonormal set. From the representations (1) and (2), it can be seen that {u_1, ..., u_k} is an orthonormal basis of N(T)^⊥, and {v_1, ..., v_k} is an orthonormal basis of R(T).

Definition 1.27. The numbers s_1, ..., s_n (with s_j = 0 for j > k) are called the singular values of T, and the set {(s_j, u_j, v_j) : j = 1, ..., n} is called the singular system for T. The representations (1) and (2) above are called the singular value representations of T and T*, respectively.

If we write U_0 = [u_1, ..., u_k], V_0 = [v_1, ..., v_k] as the operators on F^k defined as in Remark 1.14, then, in view of the relations T u_j = s_j v_j and T* v_j = s_j u_j, we have

T U_0 = V_0 S_0, T* V_0 = U_0 S_0, where S_0 = diag(s_1, ..., s_k).

Suppose n > k. If we extend the orthonormal sets {u_1, ..., u_k} and {v_1, ..., v_k} to orthonormal bases {u_1, ..., u_n} and {v_1, ..., v_n}, then for j = k+1, ..., n, u_j ∈ N(T) and v_j ∈ R(T)^⊥, so that, since R(T)^⊥ = N(T*), we obtain

T U = V S, T* V = U S,

where U = [u_1, ..., u_n], V = [v_1, ..., v_n], S = diag(s_1, ..., s_n), with s_j = 0 for j > k. Thus, we have

V* T U = S, U* T* V = S.
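The singular system relations T u_j = s_j v_j and T* v_j = s_j u_j can be checked with NumPy's SVD. Note that NumPy returns A = W diag(S) Vh, so the notes' u_j are the rows of Vh and the notes' v_j are the columns of W; the matrix is an illustrative choice:

```python
import numpy as np

# Illustrative invertible matrix
A = np.array([[3.0, 0.0],
              [4.0, 5.0]])

W, S, Vh = np.linalg.svd(A)  # A = W @ np.diag(S) @ Vh

for j in range(2):
    u, v, s = Vh[j], W[:, j], S[j]
    assert np.allclose(A @ u, s * v)      # T u_j = s_j v_j
    assert np.allclose(A.T @ v, s * u)    # T* v_j = s_j u_j

# Singular values squared are the eigenvalues of A^T A
assert np.allclose(sorted(S**2), sorted(np.linalg.eigvals(A.T @ A).real))
```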
1.6 Spectral decomposition

Throughout this section we assume that V is a finite dimensional space over C and T : V → V is a linear operator. In the following, if V_1 and V_2 are subspaces of V, then by V_1 ⊕ V_2 we mean V_1 + V_2 whenever V_1 ∩ V_2 = {0}.

The main theorem in this section is the following.

THEOREM 1.28. Let λ_1, ..., λ_k be the distinct eigenvalues of T and let l_1, ..., l_k be the ascents of λ_1, ..., λ_k, respectively. Then

V = N(T − λ_1 I)^{l_1} ⊕ ⋯ ⊕ N(T − λ_k I)^{l_k},

where each N(T − λ_j I)^{l_j} is invariant under T. In particular, T is diagonalizable if and only if the ascent of each eigenvalue of T is 1.

Since the ascent of each eigenvalue of a self adjoint operator on an inner product space is 1, an immediate corollary of the above theorem is Theorem 1.22. For proving Theorem 1.28, we shall make use of the following lemma.

LEMMA 1.29. Let V be a finite dimensional vector space and T : V → V be a linear operator. Let λ be an eigenvalue of T with ascent l. Then the following hold.

1. For every j ∈ N, N(T − λI)^j and R(T − λI)^j are invariant under T.

2. V = N(T − λI)^l ⊕ R(T − λI)^l.

3. λ is an eigenvalue of T_0 := T|_{N(T − λI)^l}, and λ is the only eigenvalue of T_0.

4. If µ ≠ λ, then for each j ∈ N, N(T − µI)^j ∩ N(T − λI)^l = {0}.

Proof. 1. Let j ∈ N and x ∈ N(T − λI)^j. Then

(T − λI)^j T x = T(T − λI)^j x = 0,

so that T x ∈ N(T − λI)^j. Let y ∈ R(T − λI)^j. Then there exists x ∈ V such that (T − λI)^j x = y. Hence,

T y = T(T − λI)^j x = (T − λI)^j T x ∈ R(T − λI)^j.

2. Since dim(V) < ∞ and dim[N(T − λI)^l] + dim[R(T − λI)^l] = dim(V), it is enough to show that N(T − λI)^l ∩ R(T − λI)^l = {0}.
Suppose x ∈ N(T − λI)^l ∩ R(T − λI)^l. Then (T − λI)^l x = 0 and there exists u ∈ V such that x = (T − λI)^l u. Then (T − λI)^{2l} u = (T − λI)^l x = 0, so that u ∈ N(T − λI)^{2l} = N(T − λI)^l. Thus, x = (T − λI)^l u = 0.

3. Note that, if 0 ≠ x ∈ N(T − λI), then x ∈ N(T − λI)^l and hence λx = T x = T_0 x, so that λ is an eigenvalue of T_0. Next suppose that µ ∈ C is such that µ ≠ λ and µ is an eigenvalue of T_0 with a corresponding eigenvector y ∈ N(T − λI)^l. Then we have

0 = (T − λI)^l y = (µ − λ)^l y,

which is a contradiction, since λ ≠ µ and y ≠ 0. Thus, λ is the only eigenvalue of T_0.

4. By (2), it is enough to show that N(T − µI)^j ⊆ R(T − λI)^l. We shall prove this by induction. Let j = 1 and x ∈ N(T − µI). By (2), there exist u ∈ N(T − λI)^l and v ∈ R(T − λI)^l such that x = u + v. Then

0 = (T − µI)x = (T − µI)u + (T − µI)v.

Since (T − µI)u ∈ N(T − λI)^l and (T − µI)v ∈ R(T − λI)^l, by (2) we have (T − µI)u = 0. Now, if u ≠ 0, then it follows that µ is also an eigenvalue of T_0, which is a contradiction, due to (3). Thus, u = 0 and x = v ∈ R(T − λI)^l.

Next assume that N(T − µI)^j ⊆ R(T − λI)^l for some j ≥ 1. We have to show that N(T − µI)^{j+1} ⊆ R(T − λI)^l. So let x ∈ N(T − µI)^{j+1}. By (2), there exist u ∈ N(T − λI)^l and v ∈ R(T − λI)^l such that x = u + v. Then

0 = (T − µI)^{j+1} x = (T − µI)^{j+1} u + (T − µI)^{j+1} v.

Since (T − µI)^{j+1} u ∈ N(T − λI)^l and (T − µI)^{j+1} v ∈ R(T − λI)^l, by (2) we have (T − µI)^{j+1} u = 0, i.e., (T − µI)u ∈ N(T − µI)^j. But, by the induction hypothesis, N(T − µI)^j ⊆ R(T − λI)^l, and (T − µI)u ∈ N(T − λI)^l as well. Thus,

(T − µI)u ∈ N(T − λI)^l ∩ R(T − λI)^l = {0}.

Thus, if u ≠ 0, then µ is also an eigenvalue of T_0, which is a contradiction, due to (3). Thus, u = 0 and x = v ∈ R(T − λI)^l.

Proof of Theorem 1.28. In view of Lemma 1.29, it is enough to prove that V is spanned by generalized eigenvectors of T. We shall prove this by induction on the dimension of V. The case dim(V) = 1 is obvious, for in this case V is spanned by the eigenspace of T: there is only one eigenvalue, and the generalized eigenspace corresponding to it is the eigenspace, which is the whole space.
Next assume that the result is true for all vector spaces of dimension less than n, and let dim(V) = n. Let λ be an eigenvalue of T with ascent l. Then, by Lemma 1.29,

V = N(T − λI)^l ⊕ R(T − λI)^l,

where dim[R(T − λI)^l] < n. Let T̃ := T|_{R(T − λI)^l}. By the induction assumption, R(T − λI)^l is spanned by the generalized eigenvectors of T̃. But generalized eigenvectors of T̃ are generalized eigenvectors of T as well. Thus, both N(T − λI)^l and R(T − λI)^l are spanned by the generalized eigenvectors of T. This completes the proof.

THEOREM 1.30. Let λ_1, ..., λ_k be the distinct eigenvalues of T with ascents l_1, ..., l_k, respectively. Let

p(t) = (t − λ_1)^{l_1} ⋯ (t − λ_k)^{l_k}.
Then, p(t ) = 0. Further, if q(t) is a polynomial satisfying q(t ) = 0, then p(t) divides q(t). Proof. Since (T λ r I) lr and (T λ s I) ls commute, it follows that p(t )u = 0 for every u N(T λ i I) li, i = 1,..., k. Hence, by Theorem 1.28, p(t )x = 0 for every x V. Consequently, p(t ) = 0. Next, let q(t) be a polynomial such that q(t ) = 0. Let µ 1,..., µ r be the distinct zeros of q(t) so that q(t) = a(t µ 1 ) n1 (t µ r ) nr for some 0 a C. Since q(t ) = 0, for each j {1,..., k}, we have a(t µ 1 I) n1 (T µ r I) nr u = 0 u N(T λ j I) lj. ( ) Now, if µ i λ j, then we know that (T µ i I) ni there exists i such that µ i = λ j such that is one-one on N(T λ j I) lj. Hence, it follows that (T λ j I) ni u = 0 u N(T λ j I) lj. Taking u N(T λ j I) lj \ N(T λ j I) lj 1, it follows that n i l j. Thus, {λ 1,..., λ k } {µ 1,..., µ r }. Without loss of generality, we can assume that m j = λ j so that n j l j for j = 1,..., k. Thus, p(t) divides q(t). Definition 1.31. A monic polynomial p(t) is called a minimal polynomial for T if p(t ) = 0 and for any polynomial q(t) with q(t ) = 0, p(t) divides q(t). Theorem 1.30 shows that if λ 1,..., λ k are the distinct eigenvalues of T with ascents of λ 1,..., λ k, respectively, then p(t) := (t λ 1 ) l1 (t λ k ) lk is the minimal polynomial of T. For the next definition we recall the concept of matrix representation: Let V be a finite dimensional vector space, and let E 1 := {u 1,..., u n } and E 2 := {v 1,..., v n } be bases of V. Let T : V V be a linear operator. Let Then [T ] E1E 1 = [J 1 ] E2E 1 [T ] E2E 2 [J] E1E 2 = [J] 1 E 2E 1 [T ] E2E 2 [J] E1E 2, where J : V V is the isomorphism defined by J(α 1 u 1 +... + α n u n ) = α 1 v 1 +... + α n v n. 14
Hence, we have det[t ] E1E 1 = det[t ] E2E 2. Thus, determinant of the matrix representation of an operator is independent of the basis with respect to which it is represented. Definition 1.32. Let E be a basis of V. The monic polynomial q T (t) := det[ti T ] EE is called the characteristic polynomial of T, where E is any basis of V. We know that eigenvalues of T are the zeros of the characteristic polynomial q T (t). Thus, λ 1,..., λ k are the distinct eigenvalues of T if and only if q T (t) = (t λ 1 ) n1 (t λ k ) n k with n 1,..., n k in N such that n 1 + + n k = n := dim(v ). THEOREM 1.33. (Cayley Hamilton theorem) q T (T ) = 0. Proof. Recall that for operators T, T 1, T 2 : V V and α C, [T 1 + T 2 ] EE = [T 1 ] EE + [T 2 ] EE, [αt ] EE = α[t ] EE. Hence, if q T (t) = t n + a 1 t n 1 +... + a n 1 t + a n, then [q T (T )] EE = [T ] n EE + a 1 [T ] n 1 EE = q T ([T ] EE ). +... + a n 1[T ] EE + a n [I] EE Recall that, by the Cayley Hamilton theorem for matrices, we have q T ([T ] EE ) = 0. [q T (T )] EE = 0 so that q T (T ) = 0. Therefore, Definition 1.34. Let λ be an eigenvalue of T and λ be an eigenvalue of T. Then the order of λ as a zero of the characteristic polynomial q T (t) is called the algebraic multiplicity of λ. THEOREM 1.35. Let λ be an eigenvalue of T with ascent l. Then m := dim[n(t λi) l ] is the algebraic multiplicity of λ. In order to prove the above theorem we make use of the following observation. PROPOSITION 1.36. Suppose V 1 and V 2 are invariant subspaces of a linear operator T : V V such that V = V 1 V 2. Let T 1 = T V1 and T 2 = T V2. Then det(t ) = det(t 1 ) det(t 2 ). 15
Proof. Writing x ∈ V as x = x_1 + x_2 with x_1 ∈ V_1, x_2 ∈ V_2, we have

T x = T_1 x_1 + T_2 x_2.

Define T̃_1, T̃_2 : V → V by

T̃_1 x = T_1 x_1 + x_2, T̃_2 x = x_1 + T_2 x_2.

Then we have

T̃_1 T̃_2 x = T̃_1(x_1 + T_2 x_2) = T_1 x_1 + T_2 x_2 = T x.

Thus, with respect to any basis E of V, we have [T]_EE = [T̃_1]_EE [T̃_2]_EE, and hence

det(T) = det(T̃_1) det(T̃_2).

Next we show that det(T̃_1) = det(T_1) and det(T̃_2) = det(T_2). For this, let E_1 = {u_1, ..., u_r} and E_2 = {u_{r+1}, ..., u_n} be bases of V_1 and V_2, respectively. Consider the basis E = E_1 ∪ E_2 for V. Then, we have

T̃_1 u_j = { T_1 u_j, j = 1, ..., r,      and  T̃_2 u_j = { u_j, j = 1, ..., r,
           { u_j, j = r+1, ..., n,                      { T_2 u_j, j = r+1, ..., n.

Hence, we obtain det(T̃_1) = det(T_1) and det(T̃_2) = det(T_2). This completes the proof.

Proof of Theorem 1.35. Let K = N(T − λI)^l and R = R(T − λI)^l. We know that K and R are invariant under T and V = K ⊕ R. Let T_1 := T|_K and T_2 := T|_R. We know that λ is the only eigenvalue of T_1. Also, observe that λ is not an eigenvalue of T_2. Indeed, if x ∈ R is such that T_2 x = λx, then x ∈ N(T − λI) ⊆ K, so that x = 0. By Proposition 1.36,

det(tI − T) = det(tI_1 − T_1) det(tI_2 − T_2),

where I_1 and I_2 are the identity operators on K and R, respectively. Since det(λI_2 − T_2) ≠ 0, it is clear that the algebraic multiplicity of λ as an eigenvalue of T is the same as the algebraic multiplicity of λ as an eigenvalue of T_1. Since λ is the only eigenvalue of T_1, we obtain that m := dim(K) is the algebraic multiplicity of λ.

Remark 1.37. Recall that if T is a self-adjoint operator on a finite dimensional inner product space, then we have

T = Σ_{i=1}^k λ_i P_i,
where λ_1, ..., λ_k are the distinct eigenvalues of T and P_1, ..., P_k are the orthogonal projections onto the eigenspaces N(T − λ_1 I), ..., N(T − λ_k I), respectively.

Next suppose that V is a finite dimensional vector space and T is a diagonalisable operator. Again let λ_1, ..., λ_k be the distinct eigenvalues of T. We know that

V = N(T − λ_1 I) ⊕ ⋯ ⊕ N(T − λ_k I).

Hence, every x ∈ V can be written uniquely as x = x_1 + ⋯ + x_k with x_i ∈ N(T − λ_i I). For i = 1, ..., k, let P_i : V → V be defined by

P_i x = x_i, x ∈ V.

Then, it can be easily seen that P_i² = P_i, so that P_i is a projection onto N(T − λ_i I). Hence,

I = P_1 + ⋯ + P_k

and

T = T P_1 + ⋯ + T P_k = Σ_{i=1}^k λ_i P_i.

Next, consider any linear operator T on a finite dimensional vector space over C, and let λ_1, ..., λ_k be the distinct eigenvalues of T with ascents l_1, ..., l_k, respectively. Then, by the spectral decomposition theorem, we have

V = N(T − λ_1 I)^{l_1} ⊕ ⋯ ⊕ N(T − λ_k I)^{l_k}.

Hence, every x ∈ V can be written uniquely as

x = x_1 + ⋯ + x_k with x_i ∈ N(T − λ_i I)^{l_i}.

Again, for i = 1, ..., k, let P_i : V → V be defined by

P_i x = x_i, x ∈ V.

Then we have

I = P_1 + ⋯ + P_k

and

T = T P_1 + ⋯ + T P_k = Σ_{i=1}^k λ_i P_i + Σ_{i=1}^k (T − λ_i I) P_i.

Let D_i = (T − λ_i I) P_i. Then we see that

D_i^{l_i} = 0 and D_i^{l_i − 1} ≠ 0.

Thus, D_i is a nilpotent operator of index l_i, for i = 1, ..., k.
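The decomposition T = Σ λ_i P_i + Σ D_i with nilpotent D_i can be seen concretely for a single Jordan block, where there is one generalized eigenspace (so P_1 = I) and D = T − λI is nilpotent of index equal to the ascent. A minimal NumPy sketch with an illustrative 3×3 block:

```python
import numpy as np

# Illustrative 3x3 Jordan block with single eigenvalue lambda = 2, ascent 3
lam = 2.0
T = lam * np.eye(3) + np.eye(3, k=1)  # 2 on diagonal, 1 on superdiagonal

P = np.eye(3)                  # only one generalized eigenspace: P_1 = I
D = (T - lam * np.eye(3)) @ P  # nilpotent part D_1 = (T - lambda I) P_1

assert np.allclose(T, lam * P + D)                        # T = lambda P + D
assert np.allclose(np.linalg.matrix_power(D, 3), 0)       # D^l = 0
assert not np.allclose(np.linalg.matrix_power(D, 2), 0)   # D^(l-1) != 0
```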
1.7 Triangulization and Jordan representation

As in the last section, we assume that V is a finite dimensional space over C and T : V → V is a linear operator.

THEOREM 1.38. (Triangulization) There exists a basis E for V such that [T]_EE is a triangular matrix.

Proof. First let us assume that T has only one eigenvalue λ, with ascent l. Then V = N(T − λI)^l. If l = 1, then the result is obvious; in fact, in this case T is diagonalizable. So, assume that l > 1. Let K_j = N(T − λI)^j and g_j = dim(K_j), j = 1, ..., l. Then we have K_l = V, and K_j is a proper subspace of K_{j+1} for j = 1, ..., l−1. Let E = {u_1, ..., u_n} be a basis of V such that {u_1, ..., u_{g_j}} is a basis of K_j for j = 1, ..., l. Then {u_1, ..., u_{g_1}} is a basis of K_1 := N(T − λI), and

{u_{g_j+1}, ..., u_{g_{j+1}}} ⊆ K_{j+1} \ K_j, j ∈ {1, ..., l−1}.

Further, span({u_{g_j+1}, ..., u_{g_{j+1}}}) ∩ K_j = {0}. Note that for each k ∈ {1, ..., n},

T u_k = λu_k + (T − λI)u_k.

Clearly, T u_k = λu_k for k = 1, ..., g_1. If k ∈ {g_1+1, ..., n}, then there exists j ∈ {1, ..., l−1} such that k ∈ {g_j+1, ..., g_{j+1}}, i.e., k is such that u_k ∈ {u_{g_j+1}, ..., u_{g_{j+1}}}. Then we have (T − λI)u_k ∈ K_j, so that T u_k takes the form

T u_k = λu_k + Σ_{i=1}^{g_j} α_i^{(k)} u_i.

Thus, [T]_EE is a triangular matrix with every diagonal entry λ.

Next assume that the distinct eigenvalues of T are λ_1, ..., λ_r with ascents l_1, ..., l_r, respectively. Let

V_j := N(T − λ_j I)^{l_j}, j = 1, ..., r.

Let T_j : V_j → V_j be the restriction of T to V_j. Then we know that λ_j is the only eigenvalue of T_j. Let E_j be a basis for V_j such that A_j := [T_j]_{E_jE_j} is a triangular matrix with diagonal entries λ_j. Now, taking E = ⋃_{j=1}^r E_j, it follows that E is a basis of V and [T]_EE has block diagonal form with blocks A_1, ..., A_r.

THEOREM 1.39. (Jordan form) There exists a basis E such that [T]_E = (a_ij), where

a_ii ∈ {λ_1, ..., λ_k},  a_ij = 0 or 1 if j = i+1,  a_ij = 0 if j < i or j > i+1.
Proof. In view of the fact that each N(T − λ_j I)^{l_j} is invariant under T and the spectral decomposition theorem (Theorem 1.28), it is enough to consider the case of T having only one eigenvalue. So, let λ be the only eigenvalue of T, with ascent l. Then V = N(T − λI)^l. If l = 1, then we are done; in fact, in this case T is diagonalizable. So, assume that l > 1, and let K_j = N(T − λI)^j and g_j := dim(K_j) for j ∈ {1, ..., l}. Then for j ∈ {1, ..., l−1}, K_j is a proper subspace of K_{j+1}; let Y_{j+1} be a subspace such that K_{j+1} = K_j ⊕ Y_{j+1}. Let h_1 = g_1 and, for j = 1, ..., l−1, let h_{j+1} = g_{j+1} − g_j. Thus,

h_{j+1} = dim(Y_{j+1}), j = 1, ..., l−1, and h_1 + ⋯ + h_l = g_l = dim(V).

The idea is to identify linearly independent vectors u_j^{(i)}, j = 1, ..., h_i, in K_i \ K_{i−1} for each i = 1, ..., l, so that their union is a basis of V with respect to which T has the required form.

Now, let u_1^{(l)}, ..., u_{h_l}^{(l)} be a basis of Y_l. Let us observe the following:

1. (T − λI)u_1^{(l)}, ..., (T − λI)u_{h_l}^{(l)} are linearly independent, and

2. (T − λI)u_j^{(l)} ∈ K_{l−1} \ K_{l−2} for j = 1, ..., h_l, whenever l > 2.

Let α_1, ..., α_{h_l} ∈ C be such that Σ_{i=1}^{h_l} α_i (T − λI)u_i^{(l)} = 0. Then Σ_{i=1}^{h_l} α_i u_i^{(l)} ∈ N(T − λI) ⊆ K_{l−1}. Hence, Σ_{i=1}^{h_l} α_i u_i^{(l)} ∈ K_{l−1} ∩ Y_l = {0}, so that α_i = 0 for i = 1, ..., h_l. Thus, (1) is proved. To see (2), first we observe that (T − λI)u_j^{(l)} ∈ K_{l−1}. Suppose (T − λI)u_j^{(l)} ∈ K_{l−2} for some j. Then u_j^{(l)} ∈ K_{l−1}, so that u_j^{(l)} ∈ K_{l−1} ∩ Y_l = {0}, which is not possible. This proves (2).

Now, let us denote u_j^{(l−1)} = (T − λI)u_j^{(l)}, j = 1, ..., h_l. Find u_j^{(l−1)} ∈ K_{l−1} \ K_{l−2} for j = h_l + 1, ..., h_{l−1}, so that u_j^{(l−1)}, j = 1, ..., h_{l−1}, are linearly independent. Continuing this procedure to the next level downwards, we obtain a basis for V as

E = E_l ∪ E_{l−1} ∪ ⋯ ∪ E_1, E_i := {u_j^{(i)} : j = 1, ..., h_i}.  (∗)

Note that h_1 + h_2 + ⋯ + h_l = g_1 + (g_2 − g_1) + ⋯ + (g_l − g_{l−1}) = g_l. Also,

T u_j^{(1)} = λu_j^{(1)}, j = 1, ..., h_1 = g_1,

and for i > 1,

T u_j^{(i)} = λu_j^{(i)} + (T − λI)u_j^{(i)} = λu_j^{(i)} + u_j^{(i−1)}, j = 1, ..., h_i.
Reordering the basis vectors in E appropriately, we obtain the required form of the matrix representation of T. Note that on the upper off-diagonal of [T]_E there are g_1 − 1 entries equal to 0 and g_l − g_1 entries equal to 1.
1.8 Problems

In the following, V is a vector space over F which is either R or C, and T : V → V is a linear operator.

1. Let A ∈ R^{n×n}, and consider it as a linear operator from R^n to itself. Prove that λ ∈ σ_eig(A) ⟺ det(A − λI) = 0.

2. Show that σ_eig(T) = ∅ in the following cases:

(a) V = P, the space of all polynomials over F, and T p(t) = t p(t), p(t) ∈ P.

(b) V = c_00 and T the right shift operator on V.

3. Find the eigenvalues and some corresponding eigenvectors for the following cases:

(a) V = P and T f = f′.

(b) V = C¹(R) and T f = f′.

4. Let V = P_2. Using a matrix representation of T, find the eigenvalues of T_1 f = f′ and T_2 f = f″.

5. Find the eigenspectrum of T if T² = T.

6. Prove that eigenvectors corresponding to distinct eigenvalues of T are linearly independent.

7. Prove that, for every polynomial p(t), λ ∈ F and x ∈ V,

T x = λx ⟹ p(T)x = p(λ)x.

8. Suppose V is an inner product space and T is a normal operator, i.e., T*T = TT*. Prove that a vector x is an eigenvector of T corresponding to an eigenvalue λ if and only if x is an eigenvector of T* corresponding to the eigenvalue λ̄.

9. Prove that, if V is a finite dimensional inner product space and T is a self adjoint operator, then σ_eig(T) ≠ ∅.

10. Let V be a finite dimensional vector space.

(a) Prove that T is diagonalizable if and only if there are distinct λ_1, ..., λ_k in F such that V = N(T − λ_1 I) + ⋯ + N(T − λ_k I).

(b) Prove that, if T has an eigenvalue λ such that N(T − λI) is a proper subspace of N(T − λI)², then T is not diagonalizable. Is the converse true?

(c) Give an example of a non-diagonalizable operator on a finite dimensional vector space.

11. Let V be a finite dimensional vector space and T be diagonalizable. If p(t) is a polynomial which vanishes at the eigenvalues of T, then prove that p(T) = 0.

12. Let V be a finite dimensional vector space.

(a) Let λ ≠ µ. Prove that N(T − λI)^i ∩ N(T − µI)^j = {0} for every i, j ∈ N.
(b) Prove that generalized eigenvectors associated with distinct eigenvalues are linearly independent.

(c) Prove the Cayley–Hamilton theorem for operators.

13. Let V be finite dimensional over C and λ be an eigenvalue of T with ascent l. Prove that m := dim[N(T − λI)^l] is the algebraic multiplicity of λ.

14. Let V be finite dimensional, k ∈ N be such that {0} ≠ N(T^k) ≠ N(T^{k+1}), and let Y_k be a subspace of N(T^{k+1}) such that N(T^{k+1}) = N(T^k) ⊕ Y_k. Prove that dim(Y_k) ≤ dim[N(T^k)].

15. Let V be a finite dimensional vector space and T be diagonalizable. Let u_1, ..., u_n be eigenvectors of T which form a basis of V, and let λ_1, ..., λ_n be such that T u_j = λ_j u_j, j = 1, ..., n. Let f be an F-valued function defined on an open set Ω ⊆ F such that Ω ⊇ σ_eig(T). For x = Σ_{j=1}^n α_j u_j ∈ V, define

f(T)x = Σ_{j=1}^n α_j f(λ_j) u_j.

Prove that there is a polynomial p(t) such that f(T) = p(T). [Hint: Lagrange interpolation.]