GENERALIZED EIGENVECTORS, MINIMAL POLYNOMIALS AND THE THEOREM OF CAYLEY-HAMILTON

FRANZ LUEF

Abstract. Our exposition is inspired by S. Axler's approach to linear algebra and largely follows his exposition in "Down with Determinants"; see also the book Linear Algebra Done Right by S. Axler [1]. These are the lecture notes for the course Lineare Algebra 2 of Prof. H.G. Feichtinger from 15.11.2006.

Before we introduce generalized eigenvectors of a linear transformation we recall some basic facts about eigenvalues and eigenvectors of a linear transformation.

Let $V$ be an $n$-dimensional complex vector space. Recall that a complex number $\lambda$ is called an eigenvalue of a linear operator $T$ on $V$ if $T - \lambda I$ is not injective, i.e. $\ker(T - \lambda I) \neq \{0\}$. The main result about eigenvalues is that every linear operator on a finite-dimensional complex vector space has an eigenvalue! Furthermore we call a vector $v \in V$ an eigenvector of $T$ if $Tv = \lambda v$ for some eigenvalue $\lambda$. The central result on eigenvectors is that non-zero eigenvectors corresponding to distinct eigenvalues of a linear transformation on $V$ are linearly independent. Consequently the number of distinct eigenvalues of $T$ cannot exceed the dimension of $V$.

Unfortunately the eigenvectors of $T$ need not span $V$. For example, the linear transformation on $\mathbb{C}^4$ whose matrix is
\[
T = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}
\]
has only the eigenvalue $0$, and its eigenvectors form a one-dimensional subspace of $\mathbb{C}^4$. Observe that $T, T^2, T^3 \neq 0$ but $T^4 = 0$. More generally, a linear operator $T$ such that $T, T^2, \dots, T^{p-1} \neq 0$ and $T^p = 0$ is called nilpotent of index $p$.

Now let $T$ be an arbitrary linear operator on $V$. Since the space of all linear operators on $V$ is finite-dimensional (actually of dimension $n^2$), there exists a smallest positive integer $k$ such that $I, T, T^2, \dots, T^k$ are not linearly independent. In other words, there exist unique complex numbers $a_0, a_1, \dots, a_{k-1}$ such that
\[
a_0 I + a_1 T + \cdots + a_{k-1} T^{k-1} + T^k = 0.
\]
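The powers of the $4 \times 4$ matrix $T$ from the example above can be checked directly. The following is a minimal sketch in plain Python; the helper names `mat_mul`, `is_zero` and `nilpotency_index` are illustrative and not part of the notes. It searches for the smallest $p$ with $T^p = 0$, which for this matrix is also the smallest $k$ for which $I, T, \dots, T^k$ become linearly dependent.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_zero(a):
    """Return True if every entry of the matrix is zero."""
    return all(x == 0 for row in a for x in row)


def nilpotency_index(t):
    """Smallest p >= 1 with t**p == 0, or None if t is not nilpotent.

    For an n x n nilpotent matrix the index is at most n, so it is
    enough to test the powers t, t**2, ..., t**n.
    """
    power = t
    for p in range(1, len(t) + 1):
        if is_zero(power):
            return p
        power = mat_mul(power, t)
    return None


# The 4 x 4 matrix from the example in the text.
T = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 0]]

T2 = mat_mul(T, T)
T3 = mat_mul(T2, T)
T4 = mat_mul(T3, T)
index = nilpotency_index(T)

print("T^3 =", T3)      # still has a single non-zero entry
print("T^4 == 0:", is_zero(T4))
print("nilpotency index:", index)
```

For this matrix the sketch reports index 4, so the relation of smallest degree satisfied by $T$ is $T^4 = 0$.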
The polynomial
\[
m(x) = a_0 + a_1 x + \cdots + a_{k-1} x^{k-1} + x^k
\]
is called the minimal polynomial of $T$. It is the monic polynomial of smallest degree such that $m(T) = 0$. A polynomial $q$ such that $q(T) = 0$ is a so-called annihilating polynomial. The Fundamental Theorem of Algebra yields that
\[
m(x) = (x - \lambda_1)^{\alpha_1} (x - \lambda_2)^{\alpha_2} \cdots (x - \lambda_m)^{\alpha_m},
\]
where $\alpha_j$ is the multiplicity of $\lambda_j$ as a root of $m$. Since
\[
m(T) = (T - \lambda_1 I)^{\alpha_1} (T - \lambda_2 I)^{\alpha_2} \cdots (T - \lambda_m I)^{\alpha_m} = 0,
\]
the operator $(T - \lambda_j I)^{\alpha_j}$ is not injective for some $j$, i.e. $\ker (T - \lambda_j I)^{\alpha_j} \neq \{0\}$. What is the structure of the subspace $\ker(T - \lambda_j I)$?

First of all we call a vector $v \in V$ a generalized eigenvector of $T$ if $(T - \lambda I)^k v = 0$ for some positive integer $k$ and some eigenvalue $\lambda$ of $T$. The elements of the spaces $\ker (T - \lambda I)^k$, $k \geq 1$, are then the generalized eigenvectors of $T$ corresponding to the eigenvalue $\lambda$.

Lemma 0.1. The set of generalized eigenvectors of $T$ on an $n$-dimensional complex vector space corresponding to an eigenvalue $\lambda$ equals $\ker (T - \lambda I)^n$.

Proof. Obviously, every element of $\ker (T - \lambda I)^n$ is a generalized eigenvector of $T$ corresponding to $\lambda$. Let us show the other inclusion. If $v \neq 0$ is a generalized eigenvector of $T$ corresponding to $\lambda$, then we need to prove that $(T - \lambda I)^n v = 0$. By assumption there is a smallest positive integer $k$ such that $(T - \lambda I)^k v = 0$. We are done if we show that $k \leq n$. To this end we prove that
\[
v, (T - \lambda I)v, \dots, (T - \lambda I)^{k-1} v
\]
are linearly independent vectors. Then we have $k$ linearly independent elements in an $n$-dimensional vector space, which implies that $k \leq n$. Let $a_0, a_1, \dots, a_{k-1}$ be complex numbers such that
\[
a_0 v + a_1 (T - \lambda I)v + \cdots + a_{k-1} (T - \lambda I)^{k-1} v = 0.
\]
Apply $(T - \lambda I)^{k-1}$ to both sides of the equation above, getting $a_0 (T - \lambda I)^{k-1} v = 0$, which yields $a_0 = 0$ because $(T - \lambda I)^{k-1} v \neq 0$ by the minimality of $k$. Now apply $(T - \lambda I)^{k-2}$ to both sides of the equation, getting $a_1 (T - \lambda I)^{k-1} v = 0$, which implies $a_1 = 0$. Continuing in this fashion, we see that $a_j = 0$ for each $j$, as desired.

Following the basic pattern of the proof that non-zero eigenvectors corresponding to distinct eigenvalues of $T$ are linearly independent, we obtain:

Proposition 0.2. Non-zero generalized eigenvectors corresponding to distinct eigenvalues of $T$ are linearly independent.

Proof.
Suppose that $v_1, \dots, v_m$ are non-zero generalized eigenvectors of $T$ corresponding to distinct eigenvalues $\lambda_1, \dots, \lambda_m$. We assume that there are complex numbers $a_1, \dots, a_m$ such that
\[
a_1 v_1 + a_2 v_2 + \cdots + a_m v_m = 0.
\]
Then we have to show that $a_1 = a_2 = \cdots = a_m = 0$. Let $k$ be the smallest positive integer such that $(T - \lambda_1 I)^k v_1 = 0$. Then apply the linear operator
\[
(T - \lambda_1 I)^{k-1} (T - \lambda_2 I)^n \cdots (T - \lambda_m I)^n
\]
to both sides of the previous equation. Since these factors commute and $(T - \lambda_j I)^n v_j = 0$ for $j \geq 2$ by Lemma 0.1, we get
\[
a_1 (T - \lambda_1 I)^{k-1} (T - \lambda_2 I)^n \cdots (T - \lambda_m I)^n v_1 = 0.
\]
We rewrite $(T - \lambda_2 I)^n \cdots (T - \lambda_m I)^n$ as
\[
\bigl((T - \lambda_1 I) + (\lambda_1 - \lambda_2) I\bigr)^n \cdots \bigl((T - \lambda_1 I) + (\lambda_1 - \lambda_m) I\bigr)^n.
\]
An application of the binomial theorem gives a sum of terms which, when combined with $(T - \lambda_1 I)^{k-1}$ on the left and applied to $v_1$, give $0$, except for one term. What remains is
\[
a_1 (\lambda_1 - \lambda_2)^n \cdots (\lambda_1 - \lambda_m)^n (T - \lambda_1 I)^{k-1} v_1 = 0.
\]
Since the eigenvalues are distinct and $(T - \lambda_1 I)^{k-1} v_1 \neq 0$, we conclude $a_1 = 0$. Continuing in a similar fashion, we get $a_j = 0$ for each $j$, as desired.

The central fact about generalized eigenvectors is that they span $V$.

Theorem 0.3. Let $V$ be an $n$-dimensional complex vector space and let $\lambda$ be an eigenvalue of $T$. Then
\[
V = \ker (T - \lambda I)^n \oplus \operatorname{im} (T - \lambda I)^n.
\]

Proof. The proof will be an induction on $n$, the dimension of $V$, and shows at the same time that the generalized eigenvectors of $T$ span $V$. The result holds for $n = 1$. Suppose that $n > 1$ and that the result holds for all vector spaces of dimension less than $n$. Let $\lambda$ be any eigenvalue of $T$. Then we want to show that
\[
V = \ker (T - \lambda I)^n \oplus \operatorname{im} (T - \lambda I)^n =: V_1 \oplus V_2.
\]
Let $v \in V_1 \cap V_2$. Then $(T - \lambda I)^n v = 0$ and there exists a $u \in V$ such that $(T - \lambda I)^n u = v$. Applying $(T - \lambda I)^n$ to both sides of the last equation, we have that $(T - \lambda I)^{2n} u = 0$. Consequently, $(T - \lambda I)^n u = 0$ by Lemma 0.1, i.e. $v = 0$. Thus $V_1 \cap V_2 = \{0\}$. Since $V_1$ and $V_2$ are the kernel and the image of the same linear operator on $V$, we have $\dim V = \dim V_1 + \dim V_2$, and together with $V_1 \cap V_2 = \{0\}$ this gives $V = V_1 \oplus V_2$.

Note that $V_1 \neq \{0\}$, because $\lambda$ is an eigenvalue of $T$, thus $\dim V_2 < n$. Furthermore $T$ maps $V_2$ into $V_2$ since $T$ commutes with $(T - \lambda I)^n$. By our induction hypothesis, $V_2$ is spanned by the generalized eigenvectors of $T|_{V_2}$, each of which is also a generalized eigenvector of $T$. Everything in $V_1$ is a generalized eigenvector of $T$, which gives the desired result.

Corollary 0.4. If $0$ is the only eigenvalue of a linear operator $T$ on $V$, then $T$ is nilpotent.

Proof. By assumption $0$ is the only eigenvalue of $T$. Since the generalized eigenvectors of $T$ span $V$, every vector $v$ in $V$ is a generalized eigenvector of $T$ corresponding to the eigenvalue $\lambda = 0$. By Lemma 0.1 this means $T^n v = 0$ for every $v \in V$. Consequently $T^p = 0$ for some $p \leq n$.
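The dimension count behind Lemma 0.1 and Theorem 0.3 can be tried out on a small matrix. The sketch below uses only the Python standard library and exact rational arithmetic; the $3 \times 3$ matrix `A` and all helper names are hypothetical illustrations, not taken from the notes. It computes $\dim \ker (A - \lambda I)^n$ as $n$ minus the rank of $(A - \lambda I)^n$, and checks $\ker N \cap \operatorname{im} N = \{0\}$ for $N = (A - \lambda I)^n$ through the equivalent condition $\operatorname{rank} N^2 = \operatorname{rank} N$.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(a, e):
    """Return a**e for an integer e >= 1."""
    result = a
    for _ in range(e - 1):
        result = mat_mul(result, a)
    return result


def shift(a, lam):
    """Return a - lam * I."""
    n = len(a)
    return [[a[i][j] - (lam if i == j else 0) for j in range(n)]
            for i in range(n)]


def rank(a):
    """Rank of a matrix via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in a]
    rows, cols = len(m), len(m[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, rows):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == rows:
            break
    return r


def generalized_eigenspace_dim(a, lam):
    """dim ker (a - lam I)^n, computed as n - rank((a - lam I)^n)."""
    n = len(a)
    return n - rank(mat_pow(shift(a, lam), n))


# Hypothetical example: eigenvalue 2 with a non-trivial Jordan block,
# and a simple eigenvalue 3.
A = [[2, 1, 0],
     [0, 2, 0],
     [0, 0, 3]]

N = mat_pow(shift(A, 2), len(A))          # (A - 2I)^3
dim_kernel = len(A) - rank(N)             # dim V_1
dim_image = rank(N)                       # dim V_2
trivial_intersection = rank(mat_mul(N, N)) == rank(N)

print(dim_kernel, dim_image, trivial_intersection)
print(generalized_eigenspace_dim(A, 2), generalized_eigenspace_dim(A, 3))
```

In this example the kernel and image of $(A - 2I)^3$ have dimensions $2$ and $1$, and the generalized eigenvectors for the eigenvalues $2$ and $3$ span spaces of dimensions $2$ and $1$, which add up to $n = 3$.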
As a consequence we get the following structure theorem for linear transformations.

Theorem 0.5. Let $\lambda_1, \dots, \lambda_m$ be the distinct eigenvalues of $T$, with $E_1, \dots, E_m$ denoting the corresponding sets of generalized eigenvectors. Then
(1) $V = E_1 \oplus E_2 \oplus \cdots \oplus E_m$;
(2) $T$ maps each $E_j$ into itself;
(3) each $(T - \lambda_j I)|_{E_j}$ is nilpotent;
(4) each $T|_{E_j}$ has only one eigenvalue, namely $\lambda_j$.
Proof. (1) follows from the linear independence of generalized eigenvectors corresponding to distinct eigenvalues (Proposition 0.2) and the fact that the generalized eigenvectors of $T$ span $V$.
(2) Suppose $v \in E_j$. Then $(T - \lambda_j I)^k v = 0$ for some positive integer $k$. Furthermore we have
\[
(T - \lambda_j I)^k T v = T (T - \lambda_j I)^k v = T(0) = 0,
\]
i.e. $Tv \in E_j$.
(3) is a reformulation of the definition of a generalized eigenvector.
(4) Let $\lambda$ be an eigenvalue of $T|_{E_j}$, with corresponding non-zero eigenvector $v \in E_j$. Then $(T - \lambda_j I)v = (\lambda - \lambda_j)v$, and hence
\[
(T - \lambda_j I)^k v = (\lambda - \lambda_j)^k v
\]
for each positive integer $k$. Since $v$ is a generalized eigenvector of $T$ corresponding to $\lambda_j$, the left hand side of the equation is $0$ for some $k$. As $v \neq 0$, this forces $\lambda = \lambda_j$.

The next theorem connects the minimal polynomial of $T$ to the decomposition of $V$ as a direct sum of spaces of generalized eigenvectors.

Theorem 0.6. Let $\lambda_1, \dots, \lambda_m$ be the distinct eigenvalues of $T$, let $E_j$ denote the set of the generalized eigenvectors corresponding to $\lambda_j$, and let $\alpha_j$ be the smallest positive integer such that $(T - \lambda_j I)^{\alpha_j} v = 0$ for every $v \in E_j$. Let
\[
m(x) = (x - \lambda_1)^{\alpha_1} (x - \lambda_2)^{\alpha_2} \cdots (x - \lambda_m)^{\alpha_m}.
\]
Then
(1) $m$ has degree at most $\dim(V)$;
(2) if $p$ is another annihilating polynomial of $T$, then $p$ is a polynomial multiple of $m$;
(3) $m$ is the minimal polynomial of $T$.

Proof. To prove (1), note that each $\alpha_j$ is at most the dimension of $E_j$, and $V = E_1 \oplus \cdots \oplus E_m$ gives that the $\alpha_j$'s add up to at most $n$.

To prove (2), let $p$ be a non-zero polynomial such that $p(T) = 0$. We show that $p$ is a polynomial multiple of each $(x - \lambda_j)^{\alpha_j}$. We now fix $j$. Then $p$ has to be of the form
\[
p(x) = a (x - r_1)^{\delta_1} (x - r_2)^{\delta_2} \cdots (x - r_M)^{\delta_M} (x - \lambda_j)^{\delta},
\]
where $a$ is a non-zero complex number, the $r_k$'s are complex numbers all different from $\lambda_j$, the $\delta_k$'s are positive integers, and $\delta$ is a non-negative integer. Suppose $v \in E_j$. Then $(T - \lambda_j I)^{\delta} v$ is also in $E_j$. Now
\[
a (T - r_1 I)^{\delta_1} (T - r_2 I)^{\delta_2} \cdots (T - r_M I)^{\delta_M} (T - \lambda_j I)^{\delta} v = p(T)v = 0,
\]
and $(T - r_1 I)^{\delta_1} (T - r_2 I)^{\delta_2} \cdots (T - r_M I)^{\delta_M}$ is injective on $E_j$, because $\lambda_j$ is the only eigenvalue of $T|_{E_j}$ by Theorem 0.5. Thus $(T - \lambda_j I)^{\delta} v = 0$.
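Since $v$ was an arbitrary element of $E_j$, this implies $\alpha_j \leq \delta$, i.e. $p$ is a polynomial multiple of $(x - \lambda_j)^{\alpha_j}$. As this holds for every $j$ and the factors $(x - \lambda_j)^{\alpha_j}$ have pairwise distinct roots, $p$ is a polynomial multiple of $m$.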
To prove (3), suppose $v$ is a vector in some $E_j$. Then $m(T)v = 0$, because the factors of $m(T)$ commute and $(T - \lambda_j I)^{\alpha_j} v = 0$. Because $E_1, \dots, E_m$ span $V$, we conclude that $m(T) = 0$. From (2) we know that no monic polynomial of lower degree has this property; thus $m$ must be the minimal polynomial of $T$.

Let $\lambda$ be an eigenvalue of $T$. The multiplicity of $\lambda$ (usually called its algebraic multiplicity) is defined as the dimension of the set of generalized eigenvectors of $T$ corresponding to $\lambda$. By Theorem 0.5 the sum of the multiplicities of all eigenvalues of $T$ equals $n$. Let $\lambda_1, \dots, \lambda_m$ be the distinct eigenvalues of $T$, with corresponding multiplicities $\beta_1, \dots, \beta_m$. Then the polynomial
\[
c(x) = (x - \lambda_1)^{\beta_1} \cdots (x - \lambda_m)^{\beta_m}
\]
is called the characteristic polynomial of $T$.

Theorem 0.7 (Cayley-Hamilton). Let $c$ be the characteristic polynomial of $T$. Then $c(T) = 0$.

Note that $\alpha_j \leq \beta_j = \dim E_j$, i.e. $c(x)$ is a polynomial multiple of $m(x)$, and therefore $c(T) = 0$ follows from $m(T) = 0$.

A linear operator on $V$ is called diagonalizable if its eigenvectors span $V$. In terms of our approach this may be expressed as follows: a linear operator $T$ is diagonalizable if and only if all generalized eigenvectors are actually eigenvectors. It turns out that, when $V$ carries an inner product, the linear operators that are diagonalizable with respect to an orthonormal basis are exactly those satisfying $T T^* = T^* T$, the so-called normal operators.

References

[1] S. Axler. Linear Algebra Done Right. 2nd ed. Springer, New York, NY, 1997.

Fakultät für Mathematik, Nordbergstrasse 15, 1090 Wien, Austria
E-mail address: franz.luef@univie.ac.at