SUPPLEMENT TO CHAPTERS VII/VIII

The characteristic polynomial of an operator

Let $A \in M_{n,n}(F)$ be an $n \times n$ matrix. Then the characteristic polynomial of $A$ is defined by
\[ C_A(x) = \det(xI - A), \]
where $I$ denotes the $n \times n$ identity matrix.

If $\tau \in L(V)$ is a linear operator on a finite dimensional vector space $V$, the characteristic polynomial of $\tau$ is defined as follows. Let $B = (b_1, \dots, b_n)$ be an ordered basis of $V$ and let
\[ M = [\tau]_B = \big( [\tau(b_1)]_B \ \cdots \ [\tau(b_n)]_B \big) \]
denote the matrix which represents $\tau$ with respect to the ordered basis $B$. Then the characteristic polynomial of $\tau$ is defined as
\[ C_\tau(x) = \det(xI - M). \]

This definition only makes sense if it is independent of the choice of the ordered basis $B$. Let $B' = (b'_1, \dots, b'_n)$ be another ordered basis of $V$. The matrix of $\tau$ with respect to $B'$,
\[ M' = [\tau]_{B'} = \big( [\tau(b'_1)]_{B'} \ \cdots \ [\tau(b'_n)]_{B'} \big), \]
relates to $[\tau]_B$ by
\[ [\tau]_{B'} = M_{B,B'} \, [\tau]_B \, M_{B,B'}^{-1}, \]
that is, $[\tau]_B$ and $[\tau]_{B'}$ are similar matrices. The next proposition shows that similar matrices have the same characteristic polynomial.

Proposition A. Similar matrices have the same characteristic polynomial.

Proof. Let $A, B, P \in M_{n,n}(F)$ with $P$ invertible (nonsingular) so that
\[ B = PAP^{-1}. \]
Then
\begin{align*}
C_B(x) &= \det(xI - B) = \det(xI - PAP^{-1}) \\
&= \det\big(P(xI)P^{-1} - PAP^{-1}\big) = \det\big(P(xI - A)P^{-1}\big) \\
&= \det(P)\det(xI - A)\det(P)^{-1} = \det(xI - A) = C_A(x).
\end{align*}
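As a quick check of Proposition A, the following worked example conjugates a $2 \times 2$ matrix and compares the two characteristic polynomials. The matrices $A$ and $P$ are illustrative choices only and do not come from the text; any invertible $P$ would serve.

```latex
% Illustrative choices (assumptions): A and an invertible P over the rationals.
\[
A = \begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix}, \qquad
P = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}, \qquad
P^{-1} = \begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix}.
\]
% Conjugate: first PA, then (PA)P^{-1}.
\[
B = PAP^{-1}
  = \begin{pmatrix} 1 & 2 \\ 1 & 5 \end{pmatrix}
    \begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix}
  = \begin{pmatrix} -1 & 2 \\ -4 & 5 \end{pmatrix}.
\]
% Both characteristic polynomials agree.
\[
C_A(x) = (x-1)(x-3) = x^2 - 4x + 3, \qquad
C_B(x) = \det\begin{pmatrix} x+1 & -2 \\ 4 & x-5 \end{pmatrix}
       = (x+1)(x-5) + 8 = x^2 - 4x + 3.
\]
```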

The companion matrix

Let $V$ be a finite dimensional vector space and $\tau \in L(V)$ a linear operator. Suppose that $V_\tau$ decomposes as follows into a direct sum of cyclic $F[x]$-modules:
\[ V_\tau \cong F[x]/(f_1(x)) \oplus \dots \oplus F[x]/(f_r(x)). \]
This means that
\[ V_\tau = U_1 \oplus \dots \oplus U_r, \]
where the $U_i$ are submodules with $U_i \cong F[x]/(f_i(x))$.

Case 1: First assume that $V_\tau$ is a cyclic $F[x]$-module, that is, $V_\tau \cong F[x]/(f(x))$ where
\[ f(x) = x^n + a_{n-1}x^{n-1} + \dots + a_1 x + a_0 \]
is a monic polynomial. Note that $V_\tau$ and $F[x]/(f(x))$ are isomorphic as $F[x]$-modules and as vector spaces over $F$.

Proposition B. $[1], [x], \dots, [x^{n-1}]$ is a basis of the $F$-vector space $F[x]/(f(x))$.

Proof. Let $[g(x)] \in F[x]/(f(x))$. We make use of the division algorithm in $F[x]$ and write
\[ g(x) = q(x)f(x) + r(x), \]
where $r(x) = 0$ or $\deg(r(x)) < \deg(f(x))$. Suppose that $r(x) = b_{n-1}x^{n-1} + \dots + b_1 x + b_0$. Since $g(x) \equiv r(x) \bmod f(x)$:
\[ [g(x)] = [r(x)] = b_{n-1}[x^{n-1}] + \dots + b_1[x] + b_0[1]. \]
This shows that $[1], [x], \dots, [x^{n-1}]$ spans $F[x]/(f(x))$. In order to show that $[1], [x], \dots, [x^{n-1}]$ are linearly independent, suppose that
\[ r_0[1] + r_1[x] + \dots + r_{n-1}[x^{n-1}] = [r_0 + r_1 x + \dots + r_{n-1}x^{n-1}] = [0] = 0, \]
where $r_i \in F$. Then $f(x)$ divides the polynomial $r_0 + r_1 x + \dots + r_{n-1}x^{n-1}$, and for degree reasons $r_i = 0$ for all $0 \le i \le n-1$.

Since $V_\tau$ and $F[x]/(f(x))$ are isomorphic, there is an isomorphism of $F[x]$-modules:
\[ \Gamma : V_\tau \to F[x]/(f(x)). \]
Note that $\Gamma$ is also an isomorphism of $F$-vector spaces. There is a vector $v \in V$ with $\Gamma(v) = [1]$. Since $\Gamma$ is an isomorphism of $F[x]$-modules we have that
\[ \Gamma(\tau(v)) = \Gamma(xv) = x\Gamma(v) = x[1] = [x]. \]
Similarly, for $i \in \mathbb{N}$:
\[ \Gamma(\tau^i(v)) = \Gamma(x^i v) = x^i\Gamma(v) = x^i[1] = [x^i]. \]
This proves that $B = (v, \tau(v), \dots, \tau^{n-1}(v))$ is a basis of $V$, since $B$ is mapped under the isomorphism $\Gamma$ onto a basis of the $F$-vector space $F[x]/(f(x))$.
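A small worked instance may make Proposition B concrete. The field $F = \mathbb{Q}$, the polynomial $f(x) = x^2 + 1$ and the polynomial $g(x)$ below are assumptions chosen purely for illustration.

```latex
% Assumptions for illustration: F = \mathbb{Q}, f(x) = x^2 + 1, so n = 2
% and the claimed basis of F[x]/(f(x)) is [1], [x].
% Division algorithm: g = q f + r with deg r < 2.
\[ g(x) = x^3 + 2x + 1 = x\,(x^2 + 1) + (x + 1). \]
% Hence the class of g is a combination of [1] and [x]:
\[ [g(x)] = [x + 1] = 1\cdot[x] + 1\cdot[1]. \]
% Multiplication by x on the basis, using x^2 \equiv -1 \pmod{f(x)}:
\[ x[1] = [x], \qquad x[x] = [x^2] = -[1]. \]
```

Under the isomorphism of $F[x]$-modules, the last line describes how $\tau$ acts on the basis $(v, \tau(v))$ in this two-dimensional case.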

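We want to find the matrix of $\tau$ with respect to the basis $B = (v, \tau(v), \dots, \tau^{n-1}(v))$, where $v \in V$ is such that $\Gamma(v) = [1]$. Set $b_i = \tau^{i-1}(v)$ for $1 \le i \le n$. Then $\tau(b_i) = b_{i+1}$ for all $1 \le i \le n-1$. It remains to compute $\tau(b_n)$. Since
\[ x^n \equiv -a_{n-1}x^{n-1} - \dots - a_1 x - a_0 \pmod{f(x)} \]
we have that
\[ [x^n] = [-a_{n-1}x^{n-1} - \dots - a_1 x - a_0] = -a_{n-1}[x^{n-1}] - \dots - a_1[x] - a_0[1]. \]
Thus
\begin{align*}
\Gamma(\tau(\tau^{n-1}(v))) &= \Gamma(x(\tau^{n-1}(v))) \\
&= x\Gamma(\tau^{n-1}(v)) = x[x^{n-1}] = [x^n] \\
&= -a_{n-1}[x^{n-1}] - \dots - a_1[x] - a_0[1] \\
&= -a_{n-1}\Gamma(\tau^{n-1}(v)) - \dots - a_1\Gamma(\tau(v)) - a_0\Gamma(v) \\
&= \Gamma\big(-a_{n-1}\tau^{n-1}(v) - \dots - a_1\tau(v) - a_0 v\big).
\end{align*}
Since $\Gamma$ is an isomorphism,
\begin{align*}
\tau(b_n) = \tau(\tau^{n-1}(v)) &= -a_{n-1}\tau^{n-1}(v) - \dots - a_1\tau(v) - a_0 v \\
&= -a_{n-1}b_n - \dots - a_1 b_2 - a_0 b_1.
\end{align*}
This shows that
\[
[\tau]_B = \begin{pmatrix}
0 & 0 & \cdots & 0 & -a_0 \\
1 & 0 & \cdots & 0 & -a_1 \\
0 & 1 & \cdots & 0 & -a_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & -a_{n-1}
\end{pmatrix}.
\]

Proposition C. If $V_\tau$ is a cyclic $F[x]$-module with $V_\tau \cong F[x]/(f(x))$, then $C_\tau(x) = f(x)$.

Proof. Using the basis $B = (v, \tau(v), \dots, \tau^{n-1}(v))$ from above, we see that
\[
C_\tau(x) = \det\begin{pmatrix}
x & 0 & \cdots & 0 & a_0 \\
-1 & x & \cdots & 0 & a_1 \\
0 & -1 & \cdots & 0 & a_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & -1 & x + a_{n-1}
\end{pmatrix}.
\]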

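The value of the determinant is not changed if we add a multiple of one row to another row. Add

$x \cdot$ (2nd row) to the first row,
$x^2 \cdot$ (3rd row) to the first row,
$\ \vdots$
$x^{n-1} \cdot$ (last row) to the first row

to obtain:
\[
C_\tau(x) = \det\begin{pmatrix}
0 & 0 & \cdots & 0 & f(x) \\
-1 & x & \cdots & 0 & a_1 \\
0 & -1 & \cdots & 0 & a_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & -1 & x + a_{n-1}
\end{pmatrix}.
\]
By successively switching the first row with each of the rows below it ($n-1$ switches in total), the determinant is at each step multiplied by $-1$, and
\[
C_\tau(x) = (-1)^{n-1}\det\begin{pmatrix}
-1 & x & 0 & \cdots & 0 & a_1 \\
0 & -1 & x & \cdots & 0 & a_2 \\
0 & 0 & -1 & \cdots & 0 & a_3 \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & -1 & x + a_{n-1} \\
0 & 0 & 0 & \cdots & 0 & f(x)
\end{pmatrix}.
\]
The matrix on the right is upper triangular with diagonal entries $-1$ ($n-1$ times) and $f(x)$. Thus
\[ C_\tau(x) = (-1)^{n-1}(-1)^{n-1}f(x) = f(x), \]
which proves the assertion.

Theorem D. If
\[ V_\tau \cong F[x]/(f_1(x)) \oplus \dots \oplus F[x]/(f_r(x)) \]
then
\[ C_\tau(x) = f_1(x) \cdots f_r(x). \]

Proof. Under the assumption $V_\tau$ decomposes into a direct sum of submodules:
\[ V_\tau = U_1 \oplus \dots \oplus U_r, \]
where $U_i \cong F[x]/(f_i(x))$. Each $U_i$ is a $\tau$-invariant subspace of $V$, thus $\tau_i = \tau|_{U_i}$ is a linear operator on $U_i$. By the first case there is a vector $v_i \in U_i$ so that
\[ B_i = (v_i, \tau(v_i), \dots, \tau^{s_i - 1}(v_i)) \]
is an ordered basis of $U_i$, where $s_i = \dim(U_i)$. Then
\[ B = B_1 \cup \dots \cup B_r \]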
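is a basis of $V$ and
\[
[\tau]_B = \begin{pmatrix}
[\tau|_{U_1}]_{B_1} & 0 & \cdots & 0 \\
0 & [\tau|_{U_2}]_{B_2} & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & [\tau|_{U_r}]_{B_r}
\end{pmatrix},
\tag{$*$}
\]
that is, $[\tau]_B$ is a block matrix. Thus
\[
xI - [\tau]_B = \begin{pmatrix}
xI_1 - [\tau|_{U_1}]_{B_1} & 0 & \cdots & 0 \\
0 & xI_2 - [\tau|_{U_2}]_{B_2} & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & xI_r - [\tau|_{U_r}]_{B_r}
\end{pmatrix}
\]
is again a block matrix, where $I_i$ denotes the $s_i \times s_i$ identity matrix, and by Proposition C
\[ C_\tau(x) = \det(xI - [\tau]_B) = \prod_{i=1}^{r} \det\big(xI_i - [\tau|_{U_i}]_{B_i}\big) = \prod_{i=1}^{r} f_i(x). \]

Definition. The matrix $(*)$ is called the companion matrix of $\tau$.

Eigenvalues and eigenvectors

Let $V$ be a finite dimensional vector space over a field $F$ and $\tau \in L(V)$ a linear operator. Suppose that $M = [\tau]_B$ is the matrix representing $\tau$ with respect to the ordered basis $B$. The characteristic polynomial of $\tau$ is given by:
\[ C_\tau(x) = \det(xI - M). \]
A root $\lambda \in F$ of $C_\tau(x)$ is called an eigenvalue of $\tau$. In particular, if $\lambda$ is an eigenvalue of $\tau$, then $\det(\lambda I - M) = 0$ and the matrix $\lambda I - M$ has a nontrivial null space. Every nonzero vector $a \in \operatorname{null}(\lambda I - M)$ satisfies $Ma = \lambda a$. Since every nonzero $a \in \operatorname{null}(\lambda I - M)$ corresponds to a nonzero vector $v \in \ker(\lambda\iota - \tau)$, where $\iota$ denotes the identity operator, for every eigenvalue $\lambda$ of $\tau$ there is a nonzero vector $v \in V$ with $\tau(v) = \lambda v$. Such a nonzero vector $v$ is called an eigenvector of $\tau$ belonging to $\lambda$. More generally, we define the eigenspace belonging to $\lambda$ by:
\[ E_\lambda = \ker(\lambda\iota - \tau) = \{ v \in V \mid \tau(v) = \lambda v \}. \]
Note that $E_\lambda$ is the set of eigenvectors for $\lambda$ together with the zero vector.

Theorem E (Cayley-Hamilton). Let $V$ be a finite dimensional vector space, $\tau \in L(V)$, and $C_\tau(x)$ the characteristic polynomial of $\tau$. Then $C_\tau(\tau) = 0$.

Proof. The $F[x]$-module $V_\tau$ is isomorphic to:
\[ V_\tau \cong F[x]/(p_1^{e_1}) \oplus \dots \oplus F[x]/(p_r^{e_r}) \]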
where $p_1^{e_1}, \dots, p_r^{e_r}$ are the elementary divisors of $V_\tau$. We have seen in the last section that
\[ C_\tau = \prod_{i=1}^{r} p_i^{e_i}. \]
Moreover, the minimal polynomial $m_\tau(x)$ of $\tau$ is the least common multiple of $p_1^{e_1}, \dots, p_r^{e_r}$. This implies that $m_\tau(x)$ divides $C_\tau(x)$ and that $C_\tau(\tau) = 0$.

Theorem 8.4. Suppose that $\lambda_1, \dots, \lambda_k$ are distinct eigenvalues of the linear operator $\tau \in L(V)$.
(1) If $i \neq j$ then $E_{\lambda_i} \cap E_{\lambda_j} = \{0\}$.
(2) If $v_i \in E_{\lambda_i} \setminus \{0\}$ for all $1 \le i \le k$, then $v_1, \dots, v_k$ are linearly independent.

Proof. (1) Let $v \in E_{\lambda_i} \cap E_{\lambda_j}$. Then $\tau(v) = \lambda_i v = \lambda_j v$, so $(\lambda_i - \lambda_j)v = 0$. Since $\lambda_i \neq \lambda_j$ and since every nonzero vector is linearly independent, it follows that $v = 0$.

(2) The proof is by induction on $k$. The case $k = 1$ is trivial. For the induction step assume that $v_1, \dots, v_{k-1}$ are linearly independent and that $v_1, \dots, v_k$ are linearly dependent. Then there are scalars $r_i \in F$, not all $r_i = 0$, so that
\[ r_1 v_1 + r_2 v_2 + \dots + r_k v_k = 0. \tag{$*$} \]
Note that since $v_1, \dots, v_{k-1}$ are linearly independent we have that $r_k \neq 0$, and since $v_k \neq 0$, that $r_i \neq 0$ for at least one $1 \le i \le k-1$. Applying $\tau$ to $(*)$ yields:
\[ r_1\lambda_1 v_1 + r_2\lambda_2 v_2 + \dots + r_k\lambda_k v_k = 0. \tag{1} \]
Multiplying $(*)$ by $\lambda_k$ gives:
\[ r_1\lambda_k v_1 + r_2\lambda_k v_2 + \dots + r_k\lambda_k v_k = 0. \tag{2} \]
Subtract (2) from (1):
\[ r_1(\lambda_1 - \lambda_k)v_1 + r_2(\lambda_2 - \lambda_k)v_2 + \dots + r_{k-1}(\lambda_{k-1} - \lambda_k)v_{k-1} = 0. \tag{$**$} \]
By assumption $\lambda_i - \lambda_k \neq 0$ for all $1 \le i \le k-1$. Moreover, $r_i \neq 0$ for at least one $1 \le i \le k-1$. Thus $(**)$ is a nontrivial linear combination representing the zero vector, and $v_1, \dots, v_{k-1}$ are linearly dependent, a contradiction.

Jordan forms

Let $V$ be a finite dimensional vector space and $\tau \in L(V)$ a linear operator on $V$. Throughout this section suppose that the characteristic polynomial splits completely into linear factors:
\[ C_\tau(x) = \prod_{i=1}^{r} (x - \lambda_i)^{l_i}, \]
where $\lambda_i \neq \lambda_j$ if $i \neq j$. This implies that
\[ V_\tau \cong F[x]/((x - \mu_1)^{e_1}) \oplus \dots \oplus F[x]/((x - \mu_t)^{e_t}) \]
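where $(x - \mu_1)^{e_1}, \dots, (x - \mu_t)^{e_t}$ are the elementary divisors of $V_\tau$. Note that $\{\mu_1, \dots, \mu_t\} = \{\lambda_1, \dots, \lambda_r\}$ and possibly $\mu_i = \mu_j$ if $i \neq j$. As in the second section, $V_\tau$ can be written as
\[ V_\tau = U_1 \oplus \dots \oplus U_t, \]
where $U_i$ is a submodule of $V_\tau$ with $U_i \cong F[x]/((x - \mu_i)^{e_i})$.

Case 1: $V_\tau \cong F[x]/((x - \mu)^e)$.

Let
\[ \Phi : V_\tau \to F[x]/((x - \mu)^e) \]
be an isomorphism of $F[x]$-modules. Note that $\Phi$ is also an isomorphism of $F$-vector spaces. The $F$-vector space $F[x]/((x - \mu)^e)$ has basis
\[ [(x - \mu)^{e-1}],\ [(x - \mu)^{e-2}],\ \dots,\ [x - \mu],\ [1]. \]
Let $b_1, \dots, b_e \in V$ so that
\[ \Phi(b_1) = [(x - \mu)^{e-1}],\quad \Phi(b_2) = [(x - \mu)^{e-2}],\quad \dots,\quad \Phi(b_{e-1}) = [x - \mu],\quad \Phi(b_e) = [1]. \]
Since $\Phi$ is also an isomorphism of $F$-vector spaces, $B = (b_1, \dots, b_e)$ is an ordered basis of $V$. We want to find $\tau(b_i)$. Note that
\[ \Phi((\tau - \mu\iota)(b_1)) = \Phi((x - \mu)b_1) = (x - \mu)\Phi(b_1) = (x - \mu)[(x - \mu)^{e-1}] = [(x - \mu)^e] = 0. \]
Thus $(\tau - \mu\iota)(b_1) = 0$ and $\tau(b_1) = \mu b_1$. Similarly, for $2 \le i \le e$:
\[ \Phi((\tau - \mu\iota)(b_i)) = \Phi((x - \mu)b_i) = (x - \mu)\Phi(b_i) = (x - \mu)[(x - \mu)^{e-i}] = [(x - \mu)^{e-(i-1)}] = \Phi(b_{i-1}). \]
Thus $(\tau - \mu\iota)(b_i) = b_{i-1}$ and $\tau(b_i) = b_{i-1} + \mu b_i$. This implies that the matrix of $\tau$ with respect to the basis $B$ has the form
\[
[\tau]_B = \begin{pmatrix}
\mu & 1 & 0 & \cdots & 0 & 0 \\
0 & \mu & 1 & \cdots & 0 & 0 \\
0 & 0 & \mu & \cdots & 0 & 0 \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & \mu & 1 \\
0 & 0 & 0 & \cdots & 0 & \mu
\end{pmatrix}.
\]
A matrix of the above form is called a Jordan block.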
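To see how the columns of a Jordan block arise, it may help to write out the smallest nontrivial case in full. The choice $e = 3$ is an assumption made only for illustration; $b_1, b_2, b_3$ are the basis vectors corresponding to $[(x-\mu)^2]$, $[x-\mu]$, $[1]$.

```latex
% Illustration with e = 3 (an assumed size, chosen for readability).
% Action of \tau on the basis, from \tau(b_1) = \mu b_1 and
% \tau(b_i) = b_{i-1} + \mu b_i for i = 2, 3:
\[
\tau(b_1) = \mu b_1, \qquad
\tau(b_2) = b_1 + \mu b_2, \qquad
\tau(b_3) = b_2 + \mu b_3.
\]
% The coordinate vectors of these images are the columns of the matrix:
\[
[\tau]_B = \big( [\tau(b_1)]_B \ [\tau(b_2)]_B \ [\tau(b_3)]_B \big)
= \begin{pmatrix} \mu & 1 & 0 \\ 0 & \mu & 1 \\ 0 & 0 & \mu \end{pmatrix}.
\]
% Consistency check: N = [\tau]_B - \mu I satisfies N^2 \neq 0 and N^3 = 0,
% matching the single elementary divisor (x - \mu)^3.
\[
N = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \qquad
N^2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad
N^3 = 0.
\]
```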

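General case: Suppose now that
\[ V_\tau \cong F[x]/((x - \mu_1)^{e_1}) \oplus \dots \oplus F[x]/((x - \mu_t)^{e_t}) \]
and that
\[ V_\tau = U_1 \oplus \dots \oplus U_t, \]
where $U_i$ is a submodule of $V_\tau$ with $U_i \cong F[x]/((x - \mu_i)^{e_i})$. Note that each $U_i$ is a $\tau$-invariant subspace of $V$ and $\tau|_{U_i}$ is an operator on $U_i$. By the first case, for each $1 \le i \le t$ there is a basis $B_i$ of $U_i$ so that the matrix of $\tau|_{U_i}$ with respect to $B_i$ is a Jordan block:
\[
[\tau|_{U_i}]_{B_i} = \begin{pmatrix}
\mu_i & 1 & 0 & \cdots & 0 & 0 \\
0 & \mu_i & 1 & \cdots & 0 & 0 \\
0 & 0 & \mu_i & \cdots & 0 & 0 \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & \mu_i & 1 \\
0 & 0 & 0 & \cdots & 0 & \mu_i
\end{pmatrix}.
\]
Then $B = B_1 \cup \dots \cup B_t$ is an ordered basis of $V$ and the matrix of $\tau$ with respect to $B$ is a block matrix consisting of Jordan blocks:
\[
[\tau]_B = \begin{pmatrix}
[\tau|_{U_1}]_{B_1} & 0 & \cdots & 0 \\
0 & [\tau|_{U_2}]_{B_2} & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & [\tau|_{U_t}]_{B_t}
\end{pmatrix}.
\tag{$*$}
\]
The matrix $[\tau]_B$ is called a Jordan form of $\tau$. Note that $[\tau]_B$ is a triangular matrix.

Theorem E. Let $A, B \in M_{n,n}(F)$ so that the characteristic polynomials of $A$ and $B$ split completely into linear factors. Then $A$ and $B$ are similar if and only if $A$ and $B$ have the same Jordan form (up to order of the Jordan blocks).

Remark. (a) The Jordan form of a matrix $A$ (a linear operator $\tau$, respectively) only exists if the characteristic polynomial splits into linear factors. For example, if $F = \mathbb{C}$ is the field of complex numbers, then every linear operator on a finite dimensional complex vector space has a Jordan form.
(b) Suppose that $\tau \in L(V)$ is a linear operator and $B$ is an ordered basis of $V$ so that $[\tau]_B$ is a Jordan form. Then one can read off the elementary divisors of $V_\tau$ from $[\tau]_B$.

Computation of the Jordan form

Definition. A matrix $A \in M_{n,n}(F)$ is called nilpotent if $A^m = 0$ for some $m \in \mathbb{N}$.

Remark 1. Consider the $n \times n$ matrix
\[
A = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & 1 & \cdots & 0 & 0 \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & 0 & 1 \\
0 & 0 & 0 & \cdots & 0 & 0
\end{pmatrix}.
\]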
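$A$ is a nilpotent matrix of rank $n - 1$. In particular, $\operatorname{rk}(A^i) = n - i$ for $1 \le i \le n$, $A^{n-1} \neq 0$, and $A^n = 0$.

Remark 2. If $A, B \in M_{n,n}(F)$ are similar matrices, then so are the matrices $\lambda I - A^m$ and $\lambda I - B^m$.

Proof. Suppose that $B = PAP^{-1}$ where $P \in M_{n,n}(F)$ is an invertible matrix. Then
\begin{align*}
\lambda I - B^m &= \lambda I - (PAP^{-1})^m = \lambda I - (PAP^{-1})(PAP^{-1}) \cdots (PAP^{-1}) \\
&= \lambda I - PA^mP^{-1} = P(\lambda I - A^m)P^{-1}.
\end{align*}
In the same way, $\lambda I - B = P(\lambda I - A)P^{-1}$ and therefore $(\lambda I - B)^m = P(\lambda I - A)^m P^{-1}$, so the matrices $(\lambda I - A)^m$ and $(\lambda I - B)^m$ are similar as well.

Let $A \in M_{n,n}(F)$ be a matrix. In order to find the Jordan form of $A$ (if it exists), we start by computing the characteristic polynomial $C_A(x)$ of $A$. Then we need to split the characteristic polynomial into linear factors. If this is not possible, the Jordan form fails to exist. Suppose now that $C_A(x)$ splits into linear factors:
\[ C_A(x) = \prod_{i=1}^{r} (x - \lambda_i)^{t_i}, \]
where $\lambda_i \neq \lambda_j$ if $i \neq j$.

Case 1: $C_A(x) = (x - \lambda)^n$. In this case the Jordan form of $A$ is a block matrix whose diagonal blocks are Jordan blocks with diagonal entry $\lambda$:
\[
J_A = \begin{pmatrix}
J_1 & & \\
& \ddots & \\
& & J_s
\end{pmatrix},
\qquad
J_j = \begin{pmatrix}
\lambda & 1 & & \\
& \lambda & \ddots & \\
& & \ddots & 1 \\
& & & \lambda
\end{pmatrix}.
\]
We need to find the number and size of the Jordan blocks in $J_A$. $J_A$ is an $n \times n$ matrix of rank $n$ if $\lambda \neq 0$ and of rank $n - s$ if $\lambda = 0$, where $s$ is the number of Jordan blocks in $J_A$. In $\lambda I - J_A$ every diagonal entry is zero, so by Remark 1 the rank of each Jordan block is one less than its size; in particular, if $\lambda \neq 0$, the rank in each Jordan block is lowered by 1. Hence
\[ \sigma_1 = n - \operatorname{rk}(\lambda I - J_A) \]
is the number of Jordan blocks in $J_A$. By taking $(\lambda I - J_A)^2$ the rank of each Jordan block of size $\ge 2$ is lowered by 1 once more, and
\[ \sigma_2 = \operatorname{rk}(\lambda I - J_A) - \operatorname{rk}\big((\lambda I - J_A)^2\big) \]
is the number of Jordan blocks of size $\ge 2$. Continuing like this we see that
\[ \sigma_k = \operatorname{rk}\big((\lambda I - J_A)^{k-1}\big) - \operatorname{rk}\big((\lambda I - J_A)^k\big) \]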
is the number of Jordan blocks of size $\ge k$. From the $\sigma_i$ we can determine the number of Jordan blocks of size exactly $k$, namely $\sigma_k - \sigma_{k+1}$.

General case: Suppose now that
\[ C_A(x) = \prod_{i=1}^{r} (x - \lambda_i)^{t_i}, \]
where $\lambda_i \neq \lambda_j$ if $i \neq j$. Note that every eigenvalue $\lambda_i$ occurs exactly $t_i$ times on the diagonal of $J_A$ and that $n = \sum_{i=1}^{r} t_i$. In the matrix $\lambda_i I - J_A$ the rank of every Jordan block belonging to $\lambda_j$ for $j \neq i$ is maximal, while the rank of each Jordan block belonging to $\lambda_i$ is one less than its size (so, if $\lambda_i \neq 0$, it is lowered by 1). Hence again
\[ \sigma_1^{(i)} = n - \operatorname{rk}(\lambda_i I - J_A) \]
is the number of Jordan blocks belonging to $\lambda_i$. Similarly,
\[ \sigma_k^{(i)} = \operatorname{rk}\big((\lambda_i I - J_A)^{k-1}\big) - \operatorname{rk}\big((\lambda_i I - J_A)^k\big) \]
is the number of Jordan blocks belonging to $\lambda_i$ of size $\ge k$.

Of course, we want a formula to compute the Jordan form which does not involve the (unknown) Jordan form of $A$. Notice that, by the argument of Remark 2,
\[ \operatorname{rk}\big((\lambda I - A)^i\big) = \operatorname{rk}\big((\lambda I - J_A)^i\big), \]
since $(\lambda I - A)^i$ and $(\lambda I - J_A)^i$ are similar matrices. Thus, in order to compute the Jordan form of a given $n \times n$ matrix $A$, we follow these steps:

Step 1: Find the characteristic polynomial $C_A(x)$ of $A$.

Step 2: Find the roots of $C_A(x)$ and decide if $C_A(x)$ splits into linear factors. If $C_A(x)$ fails to split into linear factors, the Jordan form of $A$ does not exist. So, suppose that $C_A(x)$ splits into linear factors:
\[ C_A(x) = \prod_{i=1}^{r} (x - \lambda_i)^{t_i}, \]
where $\lambda_i \neq \lambda_j$ if $i \neq j$.

Step 3: Let $\sigma_k^{(i)}$ denote the number of Jordan blocks belonging to $\lambda_i$ of size $\ge k$. Then
\[ \sigma_1^{(i)} = n - \operatorname{rk}(\lambda_i I - A) \]
and
\[ \sigma_k^{(i)} = \operatorname{rk}\big((\lambda_i I - A)^{k-1}\big) - \operatorname{rk}\big((\lambda_i I - A)^k\big) \quad \text{if } k > 1. \]
Then $\sigma_k^{(i)} - \sigma_{k+1}^{(i)}$ is the number of Jordan blocks belonging to $\lambda_i$ of size $k$.

Triangularization and diagonalization

Definition. An $n \times n$ matrix $A$ is called triangularizable if $A$ is similar to a triangular matrix. Similarly, $A$ is diagonalizable if $A$ is similar to a diagonal matrix.
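Steps 1 to 3 for computing the Jordan form can be traced on a small example, which also produces a triangular matrix similar to the given one in the sense of the definition above. The matrix $A$ over $\mathbb{Q}$ is an illustrative choice and does not come from the text; the ranks are read off directly from the displayed matrices.

```latex
% Illustrative matrix (an assumption), with n = 3 and F = \mathbb{Q}.
\[
A = \begin{pmatrix} 3 & 1 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 5 \end{pmatrix}.
\]
% Step 1: characteristic polynomial (block structure of xI - A).
\[
C_A(x) = \big((x-3)(x-1) + 1\big)(x-5) = (x-2)^2 (x-5).
\]
% Step 2: C_A splits, with \lambda_1 = 2 (t_1 = 2) and \lambda_2 = 5 (t_2 = 1).
% Step 3 for \lambda_1 = 2:
\[
2I - A = \begin{pmatrix} -1 & -1 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & -3 \end{pmatrix},
\quad
(2I - A)^2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 9 \end{pmatrix},
\quad
(2I - A)^3 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -27 \end{pmatrix},
\]
\[
\sigma^{(1)}_1 = 3 - 2 = 1, \qquad
\sigma^{(1)}_2 = 2 - 1 = 1, \qquad
\sigma^{(1)}_3 = 1 - 1 = 0,
\]
% so \sigma^{(1)}_1 - \sigma^{(1)}_2 = 0 blocks of size 1 and
% \sigma^{(1)}_2 - \sigma^{(1)}_3 = 1 block of size 2 belong to \lambda_1 = 2.
% Step 3 for \lambda_2 = 5 (the upper left 2 x 2 block has determinant 9):
\[
5I - A = \begin{pmatrix} 2 & -1 & 0 \\ 1 & 4 & 0 \\ 0 & 0 & 0 \end{pmatrix},
\qquad
\sigma^{(2)}_1 = 3 - 2 = 1, \qquad
\sigma^{(2)}_2 = 2 - 2 = 0,
\]
% so exactly one block of size 1 belongs to \lambda_2 = 5.
\[
J_A = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 5 \end{pmatrix}.
\]
```

Since $J_A$ is triangular, this particular $A$ is triangularizable; the block of size 2 is what keeps $J_A$ from being diagonal.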

Theorem F. Let $A$ be an $n \times n$ matrix with characteristic polynomial $C_A(x)$ and minimal polynomial $m_A(x)$.
(a) $A$ is triangularizable if and only if $C_A(x)$ splits into linear factors.
(b) $A$ is diagonalizable if and only if $m_A(x)$ splits into linear factors and has only simple roots.

Proof. (a) If $A$ is similar to a triangular matrix $B$, then $A$ and $B$ have the same characteristic polynomial. The characteristic polynomial of $B$ is a product of linear factors, since $xI - B$ is triangular and its determinant is the product of its diagonal entries. Conversely, if $C_A(x)$ splits into linear factors, the Jordan form of $A$ exists, which is a triangular matrix.

(b) Suppose that $A$ is similar to a diagonal matrix $B$. Similar matrices have the same minimal polynomial, since their associated $F[x]$-modules are isomorphic. The minimal polynomial of a diagonal matrix is the product of the factors $x - \lambda$, taken over the distinct diagonal entries $\lambda$, so it splits into linear factors and has only simple roots. Conversely, if $m_A(x)$ splits into linear factors and has only simple roots, then $C_A(x)$ splits into linear factors and $A$ has a Jordan form. Moreover, each elementary divisor of $A$ is a prime power dividing $m_A(x)$, so the elementary divisors of $A$ are linear polynomials. Thus all Jordan blocks of $J_A$ have size 1 and $J_A$ is a diagonal matrix.
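Three small matrices may help separate the cases of Theorem F. All three are illustrative choices that do not appear in the text, and the base field is assumed to be $\mathbb{Q}$ unless stated otherwise.

```latex
% (i) Diagonalizable: the minimal polynomial splits with simple roots.
\[
A_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
C_{A_1}(x) = x^2 - 1 = (x-1)(x+1) = m_{A_1}(x).
\]
% A_1 is not a scalar matrix, so m_{A_1} is not linear; A_1 is similar
% to the diagonal matrix with entries 1 and -1.

% (ii) Triangularizable but not diagonalizable: a double root in m_A.
\[
A_2 = \begin{pmatrix} 3 & 1 \\ -1 & 1 \end{pmatrix}, \qquad
C_{A_2}(x) = (x-3)(x-1) + 1 = (x-2)^2,
\]
\[
A_2 - 2I = \begin{pmatrix} 1 & 1 \\ -1 & -1 \end{pmatrix} \neq 0, \qquad
(A_2 - 2I)^2 = 0, \qquad
m_{A_2}(x) = (x-2)^2.
\]
% By (a), A_2 is similar to the triangular matrix
% \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}; by (b), it is not diagonalizable.

% (iii) Not triangularizable over \mathbb{Q}: C_A does not split.
\[
A_3 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad
C_{A_3}(x) = x^2 + 1.
\]
% x^2 + 1 has no root in \mathbb{Q}. Over \mathbb{C} it factors as
% (x - i)(x + i) with simple roots, so A_3 is diagonalizable over \mathbb{C}.
```

The third example shows that both properties depend on the field $F$ over which the matrix is considered.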