The Cyclic Decomposition Theorem

Math 481/525, Fall 2009

Let $V$ be a finite-dimensional $F$-vector space, and let $T : V \to V$ be a linear transformation. In this note we prove that $V$ is a direct sum of cyclic $T$-invariant subspaces. More specifically, we prove that $V$ is a direct sum of cyclic $T$-invariant subspaces whose annihilators are generated by powers of irreducible polynomials, and that the collection of these polynomials is uniquely determined.

Recall that by using $T$ we can make $V$ into an $F[x]$-module by defining scalar multiplication by $f(x) \cdot v = f(T)(v)$ for all $f(x) \in F[x]$. We also recall that a $T$-invariant subspace of $V$ is nothing more than an $F[x]$-submodule of $V$. We use module language in this note.

1 Decomposition into Cyclic Submodules

Definition 1.1. Let $v \in V$. Then $\operatorname{ann}(v) = \{f(x) \in F[x] : f(x)v = 0\}$ and
$$\operatorname{ann}(V) = \bigcap_{v \in V} \operatorname{ann}(v) = \{f(x) \in F[x] : f(x)v = 0 \text{ for all } v \in V\}.$$

Lemma 1.2. Let $p(x)$ be the minimal polynomial of $T$. Then $\operatorname{ann}(V) = (p(x))$.

Proof. Let $v \in V$. Then $p(T)(v) = 0$ since $p(T)$ is the zero linear transformation. Thus, $p(x) \in \operatorname{ann}(V)$, which implies that $(p(x)) \subseteq \operatorname{ann}(V)$. Conversely, let $f(x) \in \operatorname{ann}(V)$. Then $f(x)w = 0$ for all $w \in V$, that is, $f(T)(w) = 0$ for all $w \in V$. This means the linear transformation $f(T) : V \to V$ sends every vector to $0$. Thus, $f(T) = 0$. By the properties of the minimal polynomial, $p(x)$ divides $f(x)$, so $f(x) \in (p(x))$. Therefore, $(p(x)) = \operatorname{ann}(V)$.

The following lemma gives two facts which appear in Project 2.

Lemma 1.3. Let $V = F[x]v$ be a cyclic $F[x]$-module.
1. $\operatorname{ann}(V) = \operatorname{ann}(v)$.
2. If $\operatorname{ann}(V) = (f(x))$, then $\dim(V) = \deg(f(x))$.
3. If $V = W_1 \oplus \cdots \oplus W_r$ with each $W_i$ an $F[x]$-submodule of $V$, and if $\operatorname{ann}(W_i) = (f_i(x))$ for some $f_i(x) \in F[x]$, then $\operatorname{ann}(V)$ is generated by $\operatorname{lcm}(f_1(x), \ldots, f_r(x))$.

Proof. We only prove the third statement since the first two are from the project. Let $f(x) = \operatorname{lcm}(f_1(x), \ldots, f_r(x))$. By definition, if $w \in W_i$, then $f_i(x)w = 0$. Since $f_i(x)$ divides $f(x)$, we see that $f(x)w = 0$. Consequently, if $v \in V$, we may write $v = w_1 + \cdots + w_r$ with each $w_i \in W_i$. Then $f(x)v = f(x)w_1 + \cdots + f(x)w_r = 0$. Therefore, $f(x) \in \operatorname{ann}(V)$. Conversely, suppose that $g(x) \in \operatorname{ann}(V)$. Then $g(x)v = 0$ for all $v \in V$. In particular, $g(x)w_i = 0$ for each $w_i \in W_i$. Therefore, $g(x) \in \operatorname{ann}(W_i) = (f_i(x))$. Thus, $f_i(x)$ divides $g(x)$ for each $i$. This implies that $f(x) = \operatorname{lcm}(f_1(x), \ldots, f_r(x))$ divides $g(x)$, so $g(x) \in (f(x))$. Thus, $\operatorname{ann}(V) = (f(x))$.

Lemma 1.4. Let $V$ be a cyclic $F[x]$-module. If $\operatorname{ann}(V) = (f(x))$, then for each divisor $g(x)$ of $f(x)$, there is a unique $F[x]$-submodule of $V$ with annihilator $(g(x))$.

Proof. Write $V = F[x]v$. Given a divisor $g(x)$ of $f(x)$, write $f(x) = g(x)h(x)$ for some $h(x) \in F[x]$. Set $w = h(x)v$ and $W = F[x]w$. Then $W$ is a submodule of $V$, and $\operatorname{ann}(W) = \operatorname{ann}(w)$. Now, $g(x)w = g(x)(h(x)v) = f(x)v = 0$. Thus, $g(x) \in \operatorname{ann}(w)$. On the other hand, if $\operatorname{ann}(w) = (k(x))$, then $0 = k(x)w = k(x)(h(x)v) = (k(x)h(x))v$. Therefore, $k(x)h(x) \in \operatorname{ann}(v) = (f(x))$, so $f(x) = g(x)h(x)$ divides $k(x)h(x)$. Consequently, $g(x)$ divides $k(x)$, and so $k(x) \in (g(x))$. Thus, $\operatorname{ann}(w) = (g(x))$. This proves that $V$ has a submodule with annihilator $(g(x))$.

To prove uniqueness, suppose $U = F[x]u$ is a submodule with annihilator $(g(x))$ (a submodule of a cyclic module over the principal ideal domain $F[x]$ is again cyclic, so every submodule has this form). Write $u = l(x)v$ for some $l(x) \in F[x]$. We have $0 = g(x)u = (g(x)l(x))v$, so $f(x)$ divides $g(x)l(x)$. This implies $h(x)$ divides $l(x)$, so $u \in F[x](h(x)v) = W$. Thus, $U \subseteq W$. To prove the reverse inclusion, since $h(x)$ divides $l(x)$, we may write $l(x) = h(x)m(x)$ for some $m(x) \in F[x]$. Let $d(x) = \gcd(m(x), g(x))$. If $d(x) \neq 1$, then write $g(x) = g_1(x)d(x)$ for some $g_1(x) \in F[x]$. Then $g_1(x)u = g_1(x)l(x)v = g_1(x)m(x)h(x)v$.
But, since $d(x)$ divides $m(x)$, $g(x) = g_1(x)d(x)$ divides $g_1(x)m(x)$. Consequently, $f(x) = g(x)h(x)$ divides $g_1(x)m(x)h(x)$, so $g_1(x)u = 0$. Since $g_1(x)$ is a proper divisor of $g(x)$, this contradicts $\operatorname{ann}(u) = (g(x))$. Thus, $\gcd(m(x), g(x)) = 1$. We may then write $1 = m(x)\alpha(x) + g(x)\beta(x)$ for some $\alpha(x), \beta(x) \in F[x]$. Then
$$\alpha u = \alpha m h v = (1 - g\beta)hv = hv - (\beta g h)v = hv = w.$$
This proves $w \in U$, which combined with the inclusion $U \subseteq W = F[x]w$ yields $U = W$. Thus, $V$ contains a unique submodule with annihilator $(g(x))$.

Because of the primary decomposition theorem, to prove that $V$ is a direct sum of cyclic invariant subspaces, it suffices to prove this for vector spaces whose annihilator is generated by a power of an irreducible polynomial. Thus, for the remainder of this section, we assume that $\operatorname{ann}(V) = (p(x)^r)$ for some irreducible polynomial $p(x)$. The proofs we give are virtually those given in [?].

Lemma 1.5. Suppose $V$ is an $F[x]$-module, and let $p(x)$ be an irreducible factor of a generator of $\operatorname{ann}(V)$. Then $V$ contains an $F[x]$-submodule with annihilator $(p(x))$.
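
As a small illustration of the statement of Lemma 1.5, here is a sketch with a hypothetical choice of $V$ and $T$ that does not appear in these notes: take $V = F^2$ with standard basis $e_1, e_2$, and let $T$ be the nilpotent map with $Te_1 = 0$ and $Te_2 = e_1$, so that $p(x) = x$.

```latex
% Hypothetical example (not from the notes): V = F^2, T e_1 = 0, T e_2 = e_1.
\[
T = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
T \neq 0, \quad T^2 = 0
\;\Longrightarrow\; \operatorname{ann}(V) = (x^2).
\]
% The line spanned by e_1 is a nonzero submodule killed by p(x) = x.
\[
x \cdot e_1 = T e_1 = 0, \qquad
W = F[x]e_1 = Fe_1, \qquad
\operatorname{ann}(W) = (x).
\]
```

Here $e_1 = x \cdot e_2$, so $W$ is generated by a vector of the form $q(x)v$ with $q(x) = x$ and $v = e_2$.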

Proof. Let $\operatorname{ann}(V) = (f(x))$. We may write $f(x) = p(x)q(x)$ for some $q(x) \in F[x]$. Since $\operatorname{ann}(V) = (f(x))$ and $q(x) \notin (f(x))$, there is some $v \in V$ with $w := q(x)v \neq 0$. Moreover, $p(x)w = p(x)(q(x)v) = f(x)v = 0$. So, $W := F[x]w$ is a nonzero $F[x]$-submodule, and since $p(x)w = 0$, we see that $\operatorname{ann}(W) = (p(x))$ because $p(x)$ is irreducible.

Lemma 1.6. Let $V$ be an $F[x]$-module with annihilator generated by a power of $p(x)$. If $V$ has a unique submodule with annihilator $(p(x))$, then $V$ is a cyclic $F[x]$-module.

Proof. We argue by induction on $\dim(V)$; the case $\dim(V) = 1$ is trivial since $V = F[x]v$ for any nonzero $v \in V$. So, suppose $\dim(V) > 1$ and the result holds for $F[x]$-modules of smaller dimension. Let $\operatorname{ann}(V) = (p(x)^r)$, and let $S = p(T)$. Set $K = \ker(S)$. Note that $K = \{v \in V : p(x)v = 0\}$, and so $\operatorname{ann}(K) = (p(x))$. If $K = V$, then by hypothesis on $V$, $K$ has a unique submodule with annihilator $(p(x))$. But, every nonzero submodule of $K$ has annihilator $(p(x))$. Thus, $V = K = F[x]w$ for any nonzero $w \in K$, and so $V$ is cyclic.

So, suppose that $K$ is a proper submodule of $V$. Then $V/K$ is an $F[x]$-module with $\dim(V/K) < \dim(V)$. Moreover, $V/K$ is isomorphic to $\operatorname{im}(S)$, a submodule of $V$. Since submodules of $\operatorname{im}(S)$ with annihilator $(p(x))$ are submodules of $V$ with annihilator $(p(x))$, we see that $\operatorname{im}(S)$ has a unique submodule with annihilator $(p(x))$ by the hypothesis on $V$. By induction, $\operatorname{im}(S)$ is cyclic. Therefore, $V/K$ is cyclic, and so there is $v \in V$ such that $V/K = F[x](v + K)$. From this equality and the definition of the quotient space, we have $V = K + F[x]v$. Since $F[x]v$ is cyclic, it has a unique submodule with annihilator $(p(x))$ by Lemma 1.4, and this must be the unique submodule with annihilator $(p(x))$ in $V$. Since $K$ itself has annihilator $(p(x))$, this submodule is $K$, and so $K \subseteq F[x]v$. Therefore, $V = F[x]v$ is cyclic.

Lemma 1.7. Let $V$ be an $F[x]$-module with annihilator generated by a power of $p(x)$. If $W$ is a cyclic $F[x]$-submodule of maximum dimension, then there is an $F[x]$-submodule $U$ of $V$ with $V = W \oplus U$.

Proof. We prove this by induction on $\dim(V)$; the case $\dim(V) = 1$ is obvious.
Suppose $\dim(V) > 1$ and that the result holds for $F[x]$-modules of smaller dimension. We may assume that $V$ is not cyclic, since $W = V$ in that case and $V = V \oplus 0$. By the previous lemma, $V$ has at least two submodules with annihilator $(p(x))$, but $W$ has exactly one. Therefore, $V$ has a cyclic submodule $K$ with annihilator $(p(x))$ not contained in $W$. Then $K \cap W$ is a submodule of $K$, and has annihilator $(p(x))$ or $F[x]$. Since $K$ has no proper submodule with annihilator $(p(x))$, if $K \cap W$ has annihilator $(p(x))$, then $K \cap W = K$, which implies $K \subseteq W$, which is false. Thus, $K \cap W = \{0\}$.

Now, we claim that $(W + K)/K$ is a cyclic submodule of maximum dimension inside $V/K$. First, it is cyclic since if $W = F[x]w$, then $(W + K)/K = F[x](w + K)$. Second, if $F[x](v + K)$ is a cyclic submodule of $V/K$, then $F[x]v$ is a cyclic submodule of $V$ with $(F[x]v + K)/K = F[x](v + K)$. But, $(F[x]v + K)/K$ has dimension at most $\dim(F[x]v)$ since there is a surjective linear transformation $F[x]v \to (F[x]v + K)/K$ given by sending $g(x)v$ to $g(x)(v + K)$. Moreover, $\dim(F[x]v) \leq \dim(W)$ by hypothesis on $W$. Consequently, $\dim(F[x](v + K)) \leq \dim(W) = \dim((W + K)/K)$, where the last equality holds because $W \cap K = \{0\}$. So $(W + K)/K$ is indeed a cyclic submodule of $V/K$ of maximum dimension. By induction, there is a submodule
$\overline{U}$ of $V/K$ with $V/K = (W + K)/K \oplus \overline{U}$. Let $U = \{v \in V : v + K \in \overline{U}\}$. Then $U$ is a submodule of $V$ containing $K$ with $U/K = \overline{U}$. From $V/K = (W + K)/K \oplus U/K$ we get $V = (W + K) + U = W + (K + U) = W + U$. If $w \in W \cap U$, then $w + K \in (W + K)/K \cap U/K = 0$, so $w \in K$. Thus, $w \in W \cap K = 0$, so $w = 0$. Thus, $W \cap U = 0$, and so $V = W \oplus U$.

Theorem 1.8. Let $V$ be a finite-dimensional $F$-vector space, and let $T \in \hom_F(V, V)$. Then $V$ is a direct sum of cyclic $T$-invariant subspaces.

Proof. By the primary decomposition result, it is enough to prove this for vector spaces whose annihilator is generated by a power of an irreducible polynomial $p(x)$. We prove the result by induction on $\dim(V)$. Let $W$ be a cyclic $F[x]$-submodule of $V$ of maximum dimension. By the previous lemma, $V = W \oplus U$ for some $F[x]$-submodule $U$. Since $\dim(W) > 0$, we see that $\dim(U) < \dim(V)$. So, by induction, $U$ is a direct sum of cyclic $F[x]$-submodules. Then, since $W$ is cyclic, $V$ is a direct sum of cyclic submodules.

2 Uniqueness

In this section we prove a uniqueness result about the decomposition obtained in the previous section. As before, let $V$ be a finite-dimensional $F$-vector space, and let $T : V \to V$ be a linear transformation. We make $V$ into an $F[x]$-module via $f(x) \cdot v = f(T)(v)$. Theorem 1.8 implies that we can write $V = W_1 \oplus \cdots \oplus W_n$ with each $W_i$ a direct sum of cyclic submodules, each of whose annihilators is generated by a power of an irreducible polynomial. We want to prove that the set of polynomials generating the various annihilators is uniquely determined. We break the problem into two parts, much like how we proved the decomposition result above.

Lemma 2.1. Suppose that $V = W_1 \oplus \cdots \oplus W_n = U_1 \oplus \cdots \oplus U_m$ such that, for each $i$, $W_i$ and $U_i$ are submodules of $V$ with annihilator generated by a power of an irreducible polynomial $p_i(x)$, with the $p_i(x)$ distinct. Then $n = m$ and $W_i = U_i$ for each $i$.

Theorem 2.2. Suppose that $\operatorname{ann}(V)$ is generated by a power of an irreducible polynomial $p(x)$.
If $V$ is a direct sum of cyclic submodules with annihilators generated by $p(x)^{r_1}, \ldots, p(x)^{r_n}$ with $r_1 \geq r_2 \geq \cdots \geq r_n$, and also a direct sum of cyclic submodules with annihilators generated by $p(x)^{s_1}, \ldots, p(x)^{s_m}$ with $s_1 \geq s_2 \geq \cdots \geq s_m$, then $n = m$ and $r_i = s_i$ for each $i$.

Proof. We use induction on $\dim(V)$. If $\dim(V) = 1$, then necessarily $n = m = 1$ and $r_1 = s_1 = 1$ (and $p(x)$ has degree 1). Now, suppose $\dim(V) > 1$ and that the result holds for vector spaces of smaller dimension. Say $V = U_1 \oplus \cdots \oplus U_n$ with $U_i$ cyclic and annihilator generated by $p(x)^{r_i}$. Also suppose that $V = W_1 \oplus \cdots \oplus W_m$ with $W_i$ cyclic and annihilator generated by $p(x)^{s_i}$. Consider the submodule $p(x)V$. Then $p(x)V$ is a proper subspace of $V$: if $\operatorname{ann}(V) = (p(x)^t)$, then there is $v \in V$ with $p(x)^{t-1}v \neq 0$, and $v \notin p(x)V$, since if $v = p(x)w$ for some $w$, then $p(x)^{t-1}v = p(x)^t w = 0$. Now, the two direct sum decompositions for $V$ yield two for $p(x)V$:
$$p(x)V = p(x)U_1 \oplus \cdots \oplus p(x)U_n = p(x)W_1 \oplus \cdots \oplus p(x)W_m,$$
and the annihilator of $p(x)U_i$ (resp. $p(x)W_i$) is generated by $p(x)^{r_i - 1}$ (resp. $p(x)^{s_i - 1}$). Note that if $U_i$ has annihilator $(p(x))$, then $p(x)U_i = 0$; a similar statement holds for the $W_i$. Let $n' \leq n$ be such that $r_i > 1$ exactly when $i \leq n'$, and $m' \leq m$ such that $s_i > 1$ exactly when $i \leq m'$. Then
$$p(x)V = p(x)U_1 \oplus \cdots \oplus p(x)U_{n'} = p(x)W_1 \oplus \cdots \oplus p(x)W_{m'},$$
and both direct sums involve only nonzero spaces. By the induction hypothesis applied to $p(x)V$, we conclude that $n' = m'$ and, for each $i$ with $1 \leq i \leq n'$, $r_i - 1 = s_i - 1$. Thus, $r_i = s_i$ for each such $i$. Thus, the $r_i$ consist of $r_1, \ldots, r_{n'}$ and $n - n'$ ones. Similarly, the $s_i$ consist of $r_1, \ldots, r_{n'}$ and $m - n'$ ones. To finish the proof, we note that
$$\dim(V) = \deg(p(x)) \sum_{i=1}^{n} r_i = \deg(p(x)) \sum_{i=1}^{m} s_i.$$
Comparing the two sums then yields $n = m$. Thus, $r_i = s_i$ for each $i$.

If $V$ is a direct sum of cyclic submodules each of which has annihilator generated by a power of an irreducible polynomial, then the collection of the powers of these irreducible polynomials is the set of elementary divisors of $V$.

Let $f(x) = x^m + a_{m-1}x^{m-1} + \cdots + a_1 x + a_0 \in F[x]$. If $V = F[x]v$ is a cyclic $F[x]$-module with $\operatorname{ann}(v) = (f(x))$, then, by an earlier lemma, $V$ has an ordered basis $[v, xv, \ldots, x^{m-1}v]$, where $m = \deg(f(x)) = \dim(V)$. If we represent $T$ by a matrix with respect to this basis, we see that the matrix is equal to
$$C(f(x)) = \begin{pmatrix} 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & \cdots & 0 & -a_1 \\ 0 & 1 & \cdots & 0 & -a_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & -a_{m-1} \end{pmatrix}.$$
The matrix $C(f(x))$ is called the companion matrix for $f(x)$.

Theorem 2.3 (Cayley-Hamilton). Let $g_1(x), \ldots, g_t(x)$ be the elementary divisors of $T$. Then the minimal polynomial of $T$ is $\operatorname{lcm}(g_1(x), \ldots, g_t(x))$ and the characteristic polynomial of $T$ is $g_1(x) \cdots g_t(x)$. Thus, the minimal polynomial of $T$ divides the characteristic polynomial of $T$, and the two have the same irreducible factors.

Proof. That the minimal polynomial of $T$ is the least common multiple of the elementary divisors follows from the definition of elementary divisors together with the first two lemmas above (Lemmas 1.2 and 1.3).
Next, we may represent $T$ with a block diagonal matrix
$$A = \begin{pmatrix} C_1 & & & \\ & C_2 & & \\ & & \ddots & \\ & & & C_t \end{pmatrix},$$
where $C_i$ is the companion matrix for $g_i(x)$. Thus, the characteristic polynomial of $T$ is $\det(xI - A)$, which is equal to the product of all $\det(xI - C_i) = g_i(x)$. This description of the characteristic polynomial shows that it divides some power of the minimal polynomial, which together with what we have just shown implies that the two have the same irreducible factors.

Example 2.4. Let $T : V \to V$ be a nilpotent linear transformation. That means $T^r = 0$ for some integer $r$. If $f(x)$ is the minimal polynomial of $T$, then $f(x)$ divides $x^r$, which means $f(x) = x^s$ for some $s \leq r$. Thus, $\operatorname{ann}(V) = (x^s)$ is generated by a power of an irreducible polynomial. Since the minimal polynomial $x^s$ is the least common multiple of the elementary divisors, each elementary divisor is a power of $x$ (and one must be equal to $x^s$). The companion matrix to $x^m$ is the $m \times m$ matrix
$$C(x^m) = \begin{pmatrix} 0 & 0 & \cdots & 0 & 0 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{pmatrix},$$
and then $T$ can be represented by a matrix in block diagonal form, each of whose blocks has this form.

Lemma 2.5. Suppose that $V$ is a cyclic $F[x]$-module. Then $V \cong F[x]/\operatorname{ann}(V)$ as $F[x]$-modules.

Proof. Write $V = F[x]v$, so that $\operatorname{ann}(V) = \operatorname{ann}(v)$ by Lemma 1.3, and define $\varphi : V \to F[x]/\operatorname{ann}(v)$ by $\varphi(g(x)v) = g(x) + \operatorname{ann}(v)$. We first check that $\varphi$ is well defined. Suppose that $g(x)v = h(x)v$ for some $g(x), h(x) \in F[x]$. Then $(g(x) - h(x))v = 0$. Thus, $g(x) - h(x) \in \operatorname{ann}(v)$. Therefore, $g(x) + \operatorname{ann}(v) = h(x) + \operatorname{ann}(v)$, as desired. A straightforward argument shows that $\varphi$ is an $F[x]$-module homomorphism. It is onto, since if $g(x) + \operatorname{ann}(v) \in F[x]/\operatorname{ann}(v)$, then $g(x) + \operatorname{ann}(v) = \varphi(g(x)v)$. It is 1-1 since if $g(x)v \in \ker(\varphi)$, then $g(x) + \operatorname{ann}(v) = 0$, which is equivalent to $g(x) \in \operatorname{ann}(v)$, which is equivalent to $g(x)v = 0$. Thus, $\ker(\varphi) = 0$, so $\varphi$ is 1-1. Thus, $\varphi$ is an $F[x]$-module isomorphism.

Proposition 2.6 (Chinese Remainder Theorem). Let $f_1(x), \ldots, f_r(x)$ be pairwise relatively prime polynomials, and let $f(x) = f_1(x) \cdots f_r(x)$. Then
$$F[x]/(f(x)) \cong F[x]/(f_1(x)) \oplus \cdots \oplus F[x]/(f_r(x))$$
as $F[x]$-modules.

Proof.
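Define $\varphi : F[x] \to F[x]/(f_1(x)) \oplus \cdots \oplus F[x]/(f_r(x))$ by $\varphi(g(x)) = (g(x) + (f_1(x)), \ldots, g(x) + (f_r(x)))$. One checks directly that $\varphi$ is an $F[x]$-module homomorphism. We have
$$\begin{aligned} \ker(\varphi) &= \{g(x) \in F[x] : g(x) + (f_i(x)) = 0 \ \forall i\} \\ &= \{g(x) \in F[x] : g(x) \in (f_i(x)) \ \forall i\} \\ &= \{g(x) \in F[x] : f_i(x) \text{ divides } g(x) \ \forall i\} \\ &= (f(x)), \end{aligned}$$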
the last equality holding because $f(x) = f_1(x) \cdots f_r(x)$ and, since the $f_i(x)$ are pairwise relatively prime, each divides a polynomial $g(x)$ if and only if their product divides $g(x)$. Therefore, there is an induced $F[x]$-module homomorphism $\overline{\varphi} : F[x]/(f(x)) \to F[x]/(f_1(x)) \oplus \cdots \oplus F[x]/(f_r(x))$ sending $g(x) + (f(x))$ to $\varphi(g(x))$. Moreover, this map is 1-1.

We prove that $\overline{\varphi}$ is onto in the case $r = 2$; the general case will follow by an induction argument. Let $h_1(x), h_2(x) \in F[x]$. We wish to find $g(x) \in F[x]$ with $g(x) + (f_i(x)) = h_i(x) + (f_i(x))$ for $i = 1, 2$. Since $\gcd(f_1(x), f_2(x)) = 1$, we can write $1 = a(x)f_1(x) + b(x)f_2(x)$ for some $a(x), b(x) \in F[x]$. Multiplying this equation by $h_1(x) - h_2(x)$ yields an equation of the form $h_1(x) - h_2(x) = c(x)f_1(x) + d(x)f_2(x)$ for appropriate $c(x), d(x)$. Then $h_1(x) - c(x)f_1(x) = h_2(x) + d(x)f_2(x)$. Call this polynomial $g(x)$. Then $g(x) - h_1(x) \in (f_1(x))$ and $g(x) - h_2(x) \in (f_2(x))$. Therefore, $\overline{\varphi}(g(x) + (f(x))) = (h_1(x) + (f_1(x)), h_2(x) + (f_2(x)))$. Thus, $\overline{\varphi}$ is onto.

Let $p_1(x), \ldots, p_k(x)$ be the distinct irreducible factors of the minimal polynomial $f(x)$ of $T$. Then each elementary divisor of $T$ is a power of one of the $p_i(x)$. We may then write the elementary divisors in the form
$$\begin{array}{ccc} p_1(x)^{e_{11}} & \cdots & p_1(x)^{e_{1n}} \\ \vdots & & \vdots \\ p_k(x)^{e_{k1}} & \cdots & p_k(x)^{e_{kn}} \end{array}$$
with $e_{i1} \geq \cdots \geq e_{in} \geq 0$ for each $i$. Note that in order to make it appear that we have $n$ elementary divisors which are powers of $p_i(x)$ for each $i$, we must allow 0 exponents. Let $f_j(x) = p_1(x)^{e_{1j}} \cdots p_k(x)^{e_{kj}}$ for $1 \leq j \leq n$. From the condition on the $e_{ij}$, we see that $f_{j+1}(x)$ divides $f_j(x)$ for each $j$ with $1 \leq j < n$. By Theorem XYZ, we know that $V \cong \bigoplus_{i,j} F[x]/(p_i(x)^{e_{ij}})$, and by repeated use of the Chinese Remainder Theorem, we conclude that
$$V \cong F[x]/(f_1(x)) \oplus \cdots \oplus F[x]/(f_n(x)).$$
The polynomials $f_1(x), \ldots, f_n(x)$ are called the invariant factors of $T$.

Theorem 2.7. The invariant factors of $T$ are uniquely determined.
That is, if we have two decompositions $V = F[x]v_1 \oplus \cdots \oplus F[x]v_n$ and $V = F[x]w_1 \oplus \cdots \oplus F[x]w_m$, where $\operatorname{ann}(v_i) = (f_i(x))$ and $\operatorname{ann}(w_i) = (g_i(x))$ for each $i$, and such that $f_n(x) \mid \cdots \mid f_1(x)$ and $g_m(x) \mid \cdots \mid g_1(x)$, then $n = m$ and $g_i(x) = f_i(x)$ for each $i$.

Proof. By factoring the $f_j(x)$ and $g_j(x)$ into irreducible polynomials, we may write $f_j(x) = p_1(x)^{e_{1j}} \cdots p_k(x)^{e_{kj}}$ and $g_j(x) = p_1(x)^{f_{1j}} \cdots p_k(x)^{f_{kj}}$ for each $j$; we can use the same irreducible polynomials $p_1(x), \ldots, p_k(x)$ in all cases by allowing exponents to be equal to 0. By the Chinese Remainder Theorem, we may decompose
$$F[x]v_j \cong F[x]/(f_j(x)) \cong F[x]/(p_1(x)^{e_{1j}}) \oplus \cdots \oplus F[x]/(p_k(x)^{e_{kj}}).$$
Doing this for all $j$ and using the uniqueness of elementary divisors, we see that the set $\{p_i(x)^{e_{ij}}\}$ is the set of elementary divisors for $T$. Similarly, we see that $\{p_i(x)^{f_{ij}}\}$ is the set of elementary divisors. The divisibility relations imply that $e_{i1} \geq e_{i2} \geq \cdots \geq e_{in}$ for each $i$, and $f_{i1} \geq \cdots \geq f_{im}$ for each $i$. This forces $e_{ij} = f_{ij}$ for each $i, j$, and so $f_j(x) = g_j(x)$ for each $j$.
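
The passage from elementary divisors to invariant factors used above (sort the exponents of each irreducible in decreasing order, pad with zeros, and multiply column by column) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `invariant_factors` and the convention of labelling each irreducible polynomial by a string are assumptions made here, not notation from the notes, and no polynomial arithmetic is performed.

```python
def invariant_factors(elementary_divisors):
    """Group elementary divisors into invariant factors.

    `elementary_divisors` maps a label for each irreducible p_i (a
    hypothetical string such as "x-1") to the list of exponents e for
    which p_i^e is an elementary divisor.  The result is a list
    [f_1, f_2, ...]; each f_j is a dict {label: exponent} describing
    f_j = prod_i p_i^(e_ij), and f_(j+1) divides f_j.
    """
    # Sort each exponent list in decreasing order: e_i1 >= e_i2 >= ...
    sorted_exps = {p: sorted(es, reverse=True)
                   for p, es in elementary_divisors.items()}
    # n is the largest number of elementary divisors for a single p_i.
    n = max((len(es) for es in sorted_exps.values()), default=0)
    factors = []
    for j in range(n):
        f_j = {}
        for p, es in sorted_exps.items():
            # Pad with exponent 0 when p has fewer than n divisors.
            e = es[j] if j < len(es) else 0
            if e > 0:
                f_j[p] = e
        factors.append(f_j)
    return factors


# Elementary divisors (x-1), (x-1)^2, (x-2)^3.
example = {"x-1": [1, 2], "x-2": [3]}
print(invariant_factors(example))
# [{'x-1': 2, 'x-2': 3}, {'x-1': 1}]
```

In this example the invariant factors are $f_1(x) = (x-1)^2(x-2)^3$ and $f_2(x) = x - 1$, and $f_2(x)$ divides $f_1(x)$ as required.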