
CHAPTER 1. Matrices

1.1. Matrix Algebra

Fields. A field is a set F equipped with two operations + and · such that (F, +) and (F^*, ·) are abelian groups, where F^* = F \ {0}, and a(b + c) = ab + ac for all a, b, c ∈ F.

Examples of fields: Q, R, C, Z_p (p prime). If F is a field and f(x) is an irreducible polynomial in F[x], the quotient ring F[x]/(f) is a field containing F as a subfield. E.g., C = R[x]/(x^2 + 1); Z_3[x]/(x^3 − x + 1) is a field with 3^3 elements containing Z_3. If R is an integral domain (a commutative ring without zero divisors), then all fractions p/q (p, q ∈ R, q ≠ 0) form the fraction field of R, which contains R.

Matrices. Let F be a field. M_{m×n}(F) = the set of all m×n matrices with entries in F; M_n(F) = M_{n×n}(F). For A = [a_ij], B = [b_ij] ∈ M_{m×n}(F), C = [c_jk] ∈ M_{n×p}(F) and α ∈ F,

    A + B := [a_ij + b_ij] ∈ M_{m×n}(F),
    αA := [α a_ij] ∈ M_{m×n}(F),
    AC := [d_ik] ∈ M_{m×p}(F),  where d_ik = Σ_{j=1}^n a_ij c_jk.

For A ∈ M_{m×n}(F), B ∈ M_{n×p}(F), C ∈ M_{p×q}(F), (AB)C = A(BC).

(M_n(F), +, ·) is a ring with identity I_n. GL(n, F) = the set of invertible matrices in M_n(F); (GL(n, F), ·) is the multiplicative group of M_n(F), called the general linear group of degree n over F.

Multiplication by blocks. Let

    A = [A_11 ... A_1n; ...; A_m1 ... A_mn],   B = [B_11 ... B_1p; ...; B_n1 ... B_np],

where A_ij ∈ M_{m_i×n_j}(F) and B_jk ∈ M_{n_j×p_k}(F). Then

    AB = [C_11 ... C_1p; ...; C_m1 ... C_mp],  where C_ik = Σ_{j=1}^n A_ij B_jk.

Transpose. The transpose of

    A = [a_11 ... a_1n; ...; a_m1 ... a_mn]   is   A^T = [a_11 ... a_m1; ...; a_1n ... a_mn].

Properties of the transpose:
(i) (αA + βB)^T = αA^T + βB^T.
(ii) (AB)^T = B^T A^T.
(iii) (A^{-1})^T = (A^T)^{-1} for invertible A.
If A = [A_11 ... A_1n; ...; A_m1 ... A_mn] is a block matrix, then A^T = [A_11^T ... A_m1^T; ...; A_1n^T ... A_mn^T].

Elementary operations and elementary matrices. See Table 1.1. To perform an elementary row (column) operation on a matrix A is to multiply A by the corresponding elementary matrix from the left (right).

Note. The inverse of an elementary matrix is an elementary matrix of the same type.

Proposition 1.1. Every A ∈ GL(n, F) is a product of elementary matrices.

Proof. Use induction on n. A can be transformed into

    [1 0; 0 A_1]

through suitable elementary row and column operations, i.e., there are elementary matrices P_1, ..., P_k, Q_1, ..., Q_l such that

    P_1 ··· P_k A Q_1 ··· Q_l = [1 0; 0 A_1],

where A_1 ∈ GL(n − 1, F). By the induction hypothesis, A_1 is a product of elementary matrices. Thus [1 0; 0 A_1] is a product of elementary matrices, and so is

    A = P_k^{-1} ··· P_1^{-1} [1 0; 0 A_1] Q_l^{-1} ··· Q_1^{-1}.

Table 1.1. Elementary row operations and elementary matrices.

    Type I: multiply the ith row by α ∈ F^*. Elementary matrix: I with the (i, i) entry replaced by α.
    Type II: swap the ith and jth rows. Elementary matrix: I with the ith and jth rows swapped.
    Type III: add β times the jth row to the ith row, where i ≠ j, β ∈ F. Elementary matrix: I with an extra entry β in position (i, j).

Equivalence. Let A, B ∈ M_{m×n}(F). We say that A is row equivalent to B, denoted A ~_r B, if there exists P ∈ GL(m, F) such that A = PB; A is column equivalent to B, denoted A ~_c B, if there exists Q ∈ GL(n, F) such that A = BQ; A is equivalent to B, denoted A ~ B, if there exist P ∈ GL(m, F) and Q ∈ GL(n, F) such that A = PBQ. ~_r, ~_c and ~ are equivalence relations on M_{m×n}(F).

Reduced row echelon forms. A matrix A ∈ M_{m×n}(F) is called a reduced row echelon form (rref) if
(i) in each nonzero row of A, the first nonzero entry is 1; such an entry is called a pivot of A;
(ii) if a column of A contains a pivot, then all other entries in that column are 0;
(iii) if a row contains a pivot, then every row above it contains a pivot further to the left.
A reduced column echelon form (rcef) is defined similarly.
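
As a quick computational illustration (a minimal Python/sympy sketch, not part of the notes themselves): left multiplication by an elementary matrix performs the corresponding row operation, and sympy's rref() produces exactly the reduced row echelon form described above.

```python
# Illustrative sketch: elementary matrices act by left multiplication; rref.
from sympy import Matrix, eye

A = Matrix([[1, 2, 1],
            [2, 4, 0],
            [3, 6, 1]])

def elem_I(n, i, alpha):          # type I: multiply row i by alpha (alpha != 0)
    E = eye(n)
    E[i, i] = alpha
    return E

def elem_II(n, i, j):             # type II: swap rows i and j
    E = eye(n)
    E[i, i] = E[j, j] = 0
    E[i, j] = E[j, i] = 1
    return E

def elem_III(n, i, j, beta):      # type III: add beta * (row j) to row i, i != j
    E = eye(n)
    E[i, j] = beta
    return E

# Left multiplication performs the row operation (rows are 0-indexed here).
B = A.copy()
B[1, :] = B[1, :] - 2 * B[0, :]
assert elem_III(3, 1, 0, -2) * A == B

# Reduced row echelon form: pivots are 1, pivot columns have no other nonzero entries.
R, pivot_cols = A.rref()
print(R)            # Matrix([[1, 2, 0], [0, 0, 1], [0, 0, 0]])
print(pivot_cols)   # (0, 2)
```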

Proposition 1.2. Every A ∈ M_{m×n}(F) is row (column) equivalent to a unique rref (rcef).

Proof. Existence of the rref: induction on the size of A. Uniqueness of the rref: use induction on m. Let A, B ∈ M_{m×n}(F) be rrefs such that A = PB for some P ∈ GL(m, F). We want to show that A = B. We may assume B ≠ 0. Assume that the first nonzero column of B is the jth column. Then the first nonzero column of A = PB is also the jth column. Write

    A = [0 1 a_1; 0 0 A_1],   B = [0 1 b_1; 0 0 B_1]   (the first j − 1 columns are zero),

where A_1, B_1 ∈ M_{(m−1)×(n−j)}(F) are rrefs. Then

    [1 a_1; 0 A_1] = P [1 b_1; 0 B_1].

It follows that

    P = [1 p; 0 P_1],   P_1 ∈ GL(m − 1, F),

and

    [1 a_1; 0 A_1] = [1 b_1 + pB_1; 0 P_1 B_1].

Since A_1 = P_1 B_1, by the induction hypothesis, A_1 = B_1. Let I be the set of indices of the pivot columns of B_1. Since A, B are rrefs, all components of a_1 and b_1 with indices in I are 0. Since pB_1 = a_1 − b_1, all components of pB_1 with indices in I are 0. Write B_1 = [b_{1,1}, ..., b_{1,n−j}] (columns). Then p b_{1,i} = 0 for all i ∈ I. Note that every column of B_1 is a linear combination of the pivot columns b_{1,i}, i ∈ I. So pB_1 = 0. Therefore a_1 = b_1. So A = B. □

Proposition 1.3. Every A ∈ M_{m×n}(F) is equivalent to

    [I_r 0; 0 0],

where 0 ≤ r ≤ min{m, n} is uniquely determined by A. Moreover, r = the number of pivots in the rref (rcef) of A. r is called the rank of A.

Proof. We only have to show the uniqueness of r; the other claims are obvious. Assume to the contrary that [I_r 0; 0 0] ~ [I_s 0; 0 0] with r < s. Then there exist P ∈ GL(m, F) and Q ∈ GL(n, F) such that

    P [I_r 0; 0 0] = [I_s 0; 0 0] Q.

Write P = [P_1 P_2], Q = [Q_1; Q_2], where P_1 ∈ M_{m×r}(F), Q_1 ∈ M_{s×n}(F). Then [P_1 0] = [Q_1; 0]. Hence Q_1 = [Q_{11} 0], where Q_{11} ∈ M_{s×r}(F). Since s > r, there exists 0 ≠ x ∈ M_{1×s}(F) such that xQ_{11} = 0. Then

    [x 0] Q = [x 0] [Q_1; Q_2] = xQ_1 = [xQ_{11} 0] = 0,

which is a contradiction since Q is invertible. □

Easy fact. Let A ∈ M_n(F). Then the following are equivalent:
(i) A is invertible;
(ii) rref(A) = I_n;
(iii) rcef(A) = I_n;
(iv) rank A = n.

Finding A^{-1}. Let A ∈ M_n(F). Perform elementary row operations: [A I_n] → [rref(A) B]. If rref(A) = I_n, then A^{-1} = B; if rref(A) ≠ I_n, A is not invertible.

For A ∈ M_{m×n}(F), let ker_r(A) = {x ∈ M_{1×m}(F) : xA = 0} and ker_c(A) = {y ∈ M_{n×1}(F) : Ay = 0}.

Facts. Let A, B ∈ M_n(F).
(i) A ∈ GL(n, F) ⇔ ker_r(A) = {0} ⇔ ker_c(A) = {0}.
(ii) If AB ∈ GL(n, F), then A, B ∈ GL(n, F). In particular, if AB = I_n, then B = A^{-1} and BA = I_n.

Proof. (i) To see that ker_c(A) = {0} ⇒ A ∈ GL(n, F), note that if rref(A) ≠ I_n, then ker_c(A) ≠ {0}.
(ii) ker_c(B) ⊆ ker_c(AB) = {0}. So B ∈ GL(n, F), and then A = (AB)B^{-1} ∈ GL(n, F). □

Congruence and similarity. Let A, B ∈ M_n(F). We say that A is congruent to B, denoted A ≃ B, if there exists P ∈ GL(n, F) such that A = P^T BP; A is similar to B, denoted A ≈ B, if there exists P ∈ GL(n, F) such that A = P^{-1}BP. Canonical forms of symmetric matrices under congruence will be discussed in a later chapter; canonical forms of matrices under similarity will be discussed in Chapter 4. Given P ∈ GL(n, F), the map φ : M_n(F) → M_n(F) defined by φ(A) = P^{-1}AP is an algebra isomorphism, i.e., φ preserves addition, multiplication and scalar multiplication.
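
The [A I_n] recipe for A^{-1} is easy to mirror in code. Here is a minimal illustrative sketch (Python/sympy, with an example matrix of my own choosing) that row reduces the augmented matrix and compares the result with the built-in inverse.

```python
# Illustrative sketch: find A^{-1} by row reducing [A | I_n].
from sympy import Matrix, eye

A = Matrix([[2, 1, 0],
            [1, 1, 1],
            [0, 1, 3]])
n = A.rows

aug = A.row_join(eye(n))          # [A | I_n]
R, _ = aug.rref()                 # -> [rref(A) | B]
left, right = R[:, :n], R[:, n:]

if left == eye(n):                # rref(A) = I_n  <=>  A is invertible
    assert right == A.inv()       # and then B = A^{-1}
    print(right)
else:
    print("A is not invertible; rank A =", A.rank())
```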

Exercises

1.1. Let A ∈ M_{m×n}(F) with rank A = r and let p > 0. Prove that there exists B ∈ M_{n×p}(F) such that rank B = min{n − r, p} and AB = 0.

1.2. For 1 ≤ i ≤ n, let e_i = [0 ··· 0 1 0 ··· 0]^T ∈ F^n (the 1 in position i).
(i) Let σ be a permutation of {1, ..., n} and let P_σ = [e_{σ(1)} ··· e_{σ(n)}]. P_σ is called the permutation matrix of σ. Prove that P_σ^{-1} = P_σ^T.
(ii) Let A = [a_1, ..., a_n] ∈ M_{m×n}(F) (columns a_i) and B = [b_1; ...; b_n] ∈ M_{n×p}(F) (rows b_i). Prove that

    A P_σ = [a_{σ(1)}, ..., a_{σ(n)}],   P_σ B = [b_{σ^{-1}(1)}; ...; b_{σ^{-1}(n)}].

Hence, multiplication of a matrix X by a permutation matrix from the left (right) permutes the rows (columns) of X. In particular, P_{στ} = P_σ P_τ if τ is another permutation of {1, ..., n}.

1.3. Let A = [a_ij] ∈ M_{m×n}(F) and B = [b_kl] ∈ M_{p×q}(F). Define

    A ⊗ B = [a_11 B ... a_1n B; ...; a_m1 B ... a_mn B] ∈ M_{mp×nq}(F).

(i) Prove that (A ⊗ B)^T = A^T ⊗ B^T.
(ii) Let C ∈ M_{n×r}(F) and D ∈ M_{q×s}(F). Prove that (A ⊗ B)(C ⊗ D) = AC ⊗ BD.
(iii) Let C = [c_uv] ∈ M_{r×s}(F). Prove that A ⊗ (B ⊗ C) = (A ⊗ B) ⊗ C.
(iv) Let σ be the permutation of {1, ..., mp} defined by σ^{-1}((i − 1)p + k) = (k − 1)m + i for 1 ≤ i ≤ m, 1 ≤ k ≤ p, and let τ be the permutation of {1, ..., nq} defined by τ^{-1}((j − 1)q + l) = (l − 1)n + j for 1 ≤ j ≤ n, 1 ≤ l ≤ q. Show that the (σ(u), τ(v))-entry of A ⊗ B is the (u, v)-entry of B ⊗ A; namely,

    P_σ^T (A ⊗ B) P_τ = B ⊗ A.

(Note: if m = n and p = q, then σ = τ.)
(v) Prove that rank(A ⊗ B) = (rank A)(rank B).
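
Parts of Exercise 1.3 (the Kronecker-type product A ⊗ B) can be spot-checked numerically. The sketch below (illustrative only; a few random instances are evidence, not a proof) builds A ⊗ B directly from the block definition and tests (ii) and (v).

```python
# Illustrative sketch: A ⊗ B from the block definition in Exercise 1.3,
# with spot checks of (A ⊗ B)(C ⊗ D) = AC ⊗ BD and rank(A ⊗ B) = rank(A) rank(B).
from sympy import Matrix, randMatrix

def kron(A, B):
    blocks = [[A[i, j] * B for j in range(A.cols)] for i in range(A.rows)]
    return Matrix.vstack(*[Matrix.hstack(*row) for row in blocks])

A = randMatrix(3, 4, min=-3, max=3)
B = randMatrix(2, 5, min=-3, max=3)
C = randMatrix(4, 2, min=-3, max=3)
D = randMatrix(5, 3, min=-3, max=3)

assert kron(A, B) * kron(C, D) == kron(A * C, B * D)   # part (ii)
assert kron(A, B).rank() == A.rank() * B.rank()        # part (v)
print("checked one random instance of (ii) and (v)")
```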

CHAPTER 2. The Determinant

2.1. Definition, Properties and Formulas

Let S_n be the set (group) of all permutations of {1, ..., n}. A permutation σ ∈ S_n is denoted by

    σ = (1 2 ··· n; σ(1) σ(2) ··· σ(n)).

A transposition is a swap of i, j ∈ {1, ..., n} (i ≠ j) and is denoted by (i, j). Every σ ∈ S_n is a product of s transpositions. The number s is not uniquely determined by σ, but s (mod 2) is. Define sign(σ) = (−1)^s; σ is called an even (odd) permutation if sign(σ) = 1 (−1).

Definition 2.1. Let A = [a_ij] ∈ M_n(F). The determinant of A, denoted det A, is defined to be

    det A = Σ_{σ ∈ S_n} sign(σ) a_{1σ(1)} ··· a_{nσ(n)}.

Easy facts.
(i) det A^T = det A.
(ii) det A is an F-linear function of every row and every column of A.
(iii) If A has two identical rows (columns), then det A = 0.

Proof. (i) det A^T = Σ_{σ∈S_n} sign(σ) a_{σ(1),1} ··· a_{σ(n),n} = Σ_{σ∈S_n} sign(σ^{-1}) a_{1,σ^{-1}(1)} ··· a_{n,σ^{-1}(n)} = det A.

(iii) Assume that the first two rows of A are identical. Let C be a set of representatives of the left cosets of ⟨(1, 2)⟩ in S_n. Then

    det A = Σ_{σ∈C} sign(σ) a_{1σ(1)} ··· a_{nσ(n)} + Σ_{σ∈C} sign(σ(1, 2)) a_{1,σ(1,2)(1)} ··· a_{n,σ(1,2)(n)} = 0,

since for each σ ∈ C the two summands have equal products (rows 1 and 2 of A are equal) and opposite signs. □
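
Definition 2.1 can be implemented verbatim for small n, which gives a cheap sanity check of the easy facts; the following sketch (illustrative, Python/sympy) sums sign(σ) a_{1σ(1)} ··· a_{nσ(n)} over all permutations.

```python
# Illustrative sketch: det A straight from Definition 2.1 (feasible only for small n).
from itertools import permutations
from sympy import randMatrix

def sign(perm):
    # parity of the number of inversions = parity of the number of transpositions
    inversions = sum(1 for i in range(len(perm))
                       for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det_by_definition(A):
    n = A.rows
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for i in range(n):
            term *= A[i, perm[i]]          # a_{i, sigma(i)} (0-indexed)
        total += term
    return total

A = randMatrix(4, 4, min=-5, max=5)
assert det_by_definition(A) == A.det()
assert det_by_definition(A.T) == A.det()   # easy fact (i): det A^T = det A
print(A.det())
```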

Effect of elementary row and column operations on the determinant:

    det[ ··· αv_i ··· ] = α det[ ··· v_i ··· ],
    det[ ··· v_i ··· v_j ··· ] = −det[ ··· v_j ··· v_i ··· ],
    det[ ··· v_i ··· v_j + αv_i ··· ] = det[ ··· v_i ··· v_j ··· ].

Theorem 2.2 (The Laplace expansion). Let A ∈ M_n(F). For I, J ⊆ {1, ..., n}, let A(I, J) denote the submatrix of A with row indices in I and column indices in J. Fix I ⊆ {1, ..., n} with |I| = k. We have

    det A = Σ_{J ⊆ {1,...,n}, |J| = k} (−1)^{Σ_{i∈I} i + Σ_{j∈J} j} det A(I, J) det A(I^c, J^c),

where I^c = {1, ..., n} \ I.

Lemma 2.3. Let

    σ = (1 ··· k, k+1 ··· n; i_1 ··· i_k, i'_1 ··· i'_{n−k}) ∈ S_n,

where i_1 < ··· < i_k and i'_1 < ··· < i'_{n−k}. Then sign(σ) = (−1)^{i_1 + ··· + i_k + k(k+1)/2}.

Proof. We count the number of transpositions needed to permute i_1, ..., i_k, i'_1, ..., i'_{n−k} into 1, ..., n. There are i_k − k integers in {i'_1, ..., i'_{n−k}} that are < i_k. Thus, i_k − k transpositions are needed to move i_k to the right place. In general, i_t − t transpositions are needed to move i_t to the right place. So sign(σ) = (−1)^{Σ_{t=1}^k (i_t − t)} = (−1)^{i_1 + ··· + i_k + k(k+1)/2}. □

Corollary 2.4. Let

    σ = (i_1 ··· i_k, i'_1 ··· i'_{n−k}; j_1 ··· j_k, j'_1 ··· j'_{n−k}) ∈ S_n,

where i_1 < ··· < i_k, i'_1 < ··· < i'_{n−k}, j_1 < ··· < j_k, j'_1 < ··· < j'_{n−k}. Then sign(σ) = (−1)^{i_1 + ··· + i_k + j_1 + ··· + j_k}.

Proof of Theorem 2.2. We have

    det A = Σ_{σ∈S_n} sign(σ) a_{1σ(1)} ··· a_{nσ(n)} = Σ_{J ⊆ {1,...,n}, |J|=k} Σ_{σ∈S_n, σ(I)=J} sign(σ) a_{1σ(1)} ··· a_{nσ(n)}.

To compute the inner sum, let I = {i_1, ..., i_k}, I^c = {i'_1, ..., i'_{n−k}}, J = {j_1, ..., j_k}, J^c = {j'_1, ..., j'_{n−k}}, where i_1 < ··· < i_k, i'_1 < ··· < i'_{n−k}, j_1 < ··· < j_k, j'_1 < ··· < j'_{n−k}, and

    σ = (i_1 ··· i_k, i'_1 ··· i'_{n−k}; j_{α(1)} ··· j_{α(k)}, j'_{β(1)} ··· j'_{β(n−k)}),

where α ∈ S_k and β ∈ S_{n−k}. Then by Corollary 2.4, sign(σ) = sign(α) sign(β) (−1)^{i_1+···+i_k+j_1+···+j_k}.

Therefore,

    Σ_{σ∈S_n, σ(I)=J} sign(σ) a_{1σ(1)} ··· a_{nσ(n)}
      = (−1)^{i_1+···+i_k+j_1+···+j_k} (Σ_{α∈S_k} sign(α) a_{i_1 j_{α(1)}} ··· a_{i_k j_{α(k)}}) (Σ_{β∈S_{n−k}} sign(β) a_{i'_1 j'_{β(1)}} ··· a_{i'_{n−k} j'_{β(n−k)}})
      = (−1)^{i_1+···+i_k+j_1+···+j_k} det A(I, J) det A(I^c, J^c).

Hence the theorem. □

Corollary 2.5. Let A = [a_ij] ∈ M_n(F). We have

    det A = Σ_{j=1}^n (−1)^{i+j} a_ij det A_ij,   1 ≤ i ≤ n,
    det A = Σ_{i=1}^n (−1)^{i+j} a_ij det A_ij,   1 ≤ j ≤ n,

where A_ij is the submatrix of A obtained by deleting the ith row and the jth column.

Proposition 2.6. Let e_j = [0 ··· 0 1 0 ··· 0]^T ∈ F^m (the 1 in position j). Let f : M_{m×n}(F) → F be such that
(i) f(A) is F-linear in every column of A;
(ii) f(A) = 0 whenever A has two identical columns;
(iii) f([e_{j_1} ··· e_{j_n}]) = 0 for all 1 ≤ j_1 < ··· < j_n ≤ m (this condition becomes null when m < n).
Then f(A) = 0 for all A ∈ M_{m×n}(F).

Proof. 1° f([v_1 ··· v_i ··· v_j ··· v_n]) = −f([v_1 ··· v_j ··· v_i ··· v_n]). In fact,

    0 = f([ ··· v_i + v_j ··· v_i + v_j ··· ])
      = f([ ··· v_i ··· v_i ··· ]) + f([ ··· v_i ··· v_j ··· ]) + f([ ··· v_j ··· v_i ··· ]) + f([ ··· v_j ··· v_j ··· ])
      = f([ ··· v_i ··· v_j ··· ]) + f([ ··· v_j ··· v_i ··· ]).

2° Each column of A is a linear combination of e_1, ..., e_m. By (i), f(A) is a linear combination of f([e_{j_1} ··· e_{j_n}]), where j_1, ..., j_n ∈ {1, ..., m}. Thus, it suffices to show f([e_{j_1} ··· e_{j_n}]) = 0. If j_1, ..., j_n are not all distinct, then by (ii), f([e_{j_1} ··· e_{j_n}]) = 0. If j_1, ..., j_n are all distinct, then by 1°, we may assume 1 ≤ j_1 < ··· < j_n ≤ m, and by (iii), f([e_{j_1} ··· e_{j_n}]) = 0. □

Corollary 2.7. det : M_n(F) → F is the unique function such that
(i) det A is F-linear in every column of A;
(ii) det A = 0 whenever A has two identical columns;
(iii) det I_n = 1.
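
Corollary 2.5 (cofactor expansion) is equally easy to verify mechanically; a short illustrative sketch:

```python
# Illustrative sketch: cofactor expansion of det A along each row (Corollary 2.5).
from sympy import randMatrix

A = randMatrix(5, 5, min=-4, max=4)
n = A.rows

for i in range(n):   # 0-indexed; (-1)^(i+j) has the same parity as in the 1-indexed statement
    expansion = sum((-1) ** (i + j) * A[i, j] * A.minor_submatrix(i, j).det()
                    for j in range(n))
    assert expansion == A.det()
print("cofactor expansion agrees with det A along every row")
```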

Theorem 2.8 (Cauchy-Binet). Let A ∈ M_{n×m}(F) and B ∈ M_{m×n}(F). Let I = {1, ..., n}. Then

(2.1)    det(AB) = Σ_{J ⊆ {1,...,m}, |J| = n} det A(I, J) det B(J, I).

In particular, det(AB) = 0 if n > m, and det(AB) = (det A)(det B) if n = m.

Proof. Fix A ∈ M_{n×m}(F) and let f(B) be the difference of the two sides of (2.1). Then f : M_{m×n}(F) → F satisfies (i) – (iii) in Proposition 2.6. □

Proposition 2.9 (The adjoint matrix). For A ∈ M_n(F), define

    adj(A) = [(−1)^{i+j} det A_ij]^T ∈ M_n(F).

We have A adj(A) = adj(A) A = (det A) I_n. Moreover, A is invertible ⇔ det A ≠ 0. When det A ≠ 0, A^{-1} = (det A)^{-1} adj(A).

Proof. Let A = [a_ij] = [v_1, ..., v_n]. Then the (i, j) entry of adj(A) A is

    Σ_{k=1}^n (−1)^{i+k} (det A_ki) a_kj = det[v_1, ..., v_{i−1}, v_j, v_{i+1}, ..., v_n] = det A if i = j, and 0 if i ≠ j.

So adj(A) A = (det A) I_n. □

2.2. Techniques for Computing Determinants

Example 2.10 (The Vandermonde determinant). For a_1, ..., a_n ∈ F, let

    V(a_1, ..., a_n) = det [1 1 ··· 1; a_1 a_2 ··· a_n; a_1^2 a_2^2 ··· a_n^2; ···; a_1^{n−1} a_2^{n−1} ··· a_n^{n−1}].

Then V(a_1, ..., a_n) = ∏_{1 ≤ i < j ≤ n} (a_j − a_i).

Proof. Method 1. Subtract a_1·(row n−1) from row n, ..., a_1·(row 1) from row 2:

    V(a_1, ..., a_n) = det [1 1 ··· 1; 0 a_2 − a_1 ··· a_n − a_1; 0 a_2(a_2 − a_1) ··· a_n(a_n − a_1); ···; 0 a_2^{n−2}(a_2 − a_1) ··· a_n^{n−2}(a_n − a_1)]
      = V(a_2, ..., a_n) ∏_{j=2}^n (a_j − a_1) = ∏_{1≤i<j≤n} (a_j − a_i)   (by induction).

Method 2. Assume a_1, ..., a_n are all distinct. V(a_1, ..., a_n, x) is a polynomial of degree n in x with leading coefficient V(a_1, ..., a_n) and with a_1, ..., a_n as roots. So V(a_1, ..., a_n, x) = V(a_1, ..., a_n) ∏_{j=1}^n (x − a_j). Use induction. □

Example 2.11. Let a_1, ..., a_n, b_1, ..., b_n ∈ F be such that a_i + b_j ≠ 0 for all i, j. Then

    det [1/(a_i + b_j)]_{1≤i,j≤n} = ∏_{i<j} (a_i − a_j)(b_i − b_j) / ∏_{i,j} (a_i + b_j).

Proof. We may assume that a_1, ..., a_n are all distinct and so are b_1, ..., b_n. Denote the determinant by f(a_1, ..., a_n; b_1, ..., b_n). Let x be an indeterminate. Then f(x, a_2, ..., a_n; b_1, ..., b_n) ∏_{j=1}^n (x + b_j) is a polynomial in x of degree n − 1 with leading coefficient

    g(a_2, ..., a_n; b_1, ..., b_n) := det [1 ··· 1; 1/(a_2 + b_1) ··· 1/(a_2 + b_n); ···; 1/(a_n + b_1) ··· 1/(a_n + b_n)]

and with a_2, ..., a_n as roots. So

(2.2)    f(x, a_2, ..., a_n; b_1, ..., b_n) ∏_{j=1}^n (x + b_j) = g(a_2, ..., a_n; b_1, ..., b_n) ∏_{i=2}^n (x − a_i).

Similarly, g(a_2, ..., a_n; x, b_2, ..., b_n) ∏_{i=2}^n (a_i + x) is a polynomial in x of degree n − 1 with leading coefficient f(a_2, ..., a_n; b_2, ..., b_n) and with b_2, ..., b_n as roots. So

(2.3)    g(a_2, ..., a_n; x, b_2, ..., b_n) ∏_{i=2}^n (a_i + x) = f(a_2, ..., a_n; b_2, ..., b_n) ∏_{j=2}^n (x − b_j).

By (2.2) (with x = a_1) and (2.3) (with x = b_1), we have

    f(a_1, ..., a_n; b_1, ..., b_n) ∏_{i=1 or j=1} (a_i + b_j) = f(a_2, ..., a_n; b_2, ..., b_n) ∏_{j=2}^n (a_1 − a_j)(b_1 − b_j).

The conclusion follows by induction. □
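
Both evaluations can be confirmed symbolically for small n; the sketch below (illustrative) checks Example 2.10 and Example 2.11 with sympy for n = 3.

```python
# Illustrative sketch: symbolic check of the Vandermonde (Example 2.10)
# and Cauchy-type (Example 2.11) determinant formulas for n = 3.
from sympy import Matrix, cancel, expand, prod, symbols

n = 3
a = symbols('a1:%d' % (n + 1))
b = symbols('b1:%d' % (n + 1))

# Vandermonde: row i is (a_1^i, ..., a_n^i), i = 0, ..., n-1.
V = Matrix(n, n, lambda i, j: a[j] ** i)
rhs_V = prod(a[j] - a[i] for i in range(n) for j in range(i + 1, n))
assert expand(V.det() - rhs_V) == 0

# Cauchy-type determinant: [1/(a_i + b_j)].
C = Matrix(n, n, lambda i, j: 1 / (a[i] + b[j]))
rhs_C = (prod((a[i] - a[j]) * (b[i] - b[j]) for i in range(n) for j in range(i + 1, n))
         / prod(a[i] + b[j] for i in range(n) for j in range(n)))
assert cancel(C.det() - rhs_C) == 0
print("both formulas verified for n =", n)
```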

Example 2.12 (Circulant matrix). Let a_0, ..., a_{n−1} ∈ C and

    C(a_0, ..., a_{n−1}) = [a_0 a_1 ··· a_{n−1}; a_{n−1} a_0 a_1 ···; a_{n−2} a_{n−1} a_0 ···; ···; a_1 a_2 ··· a_0]

(each row is the previous row shifted one step to the right). Put

    A = [0 1 0 ··· 0; 0 0 1 ··· 0; ···; 0 0 ··· 0 1; 1 0 ··· 0 0].

Then C(a_0, ..., a_{n−1}) = a_0 A^0 + a_1 A + ··· + a_{n−1} A^{n−1}.

Let ε = e^{2πi/n} and let W = [ε^{jk}]_{0 ≤ j,k ≤ n−1}. Then

    AW = W diag(ε^0, ε^1, ..., ε^{n−1}),

and W is invertible (a Vandermonde matrix with distinct nodes). Thus A ≈ diag(1, ε, ..., ε^{n−1}), and

    C(a_0, ..., a_{n−1}) ≈ diag(Σ_{i=0}^{n−1} a_i ε^{0·i}, Σ_{i=0}^{n−1} a_i ε^{1·i}, ..., Σ_{i=0}^{n−1} a_i ε^{(n−1)i}).

So

    det C(a_0, ..., a_{n−1}) = ∏_{j=0}^{n−1} Σ_{i=0}^{n−1} a_i ε^{ji}.
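
A quick numerical check of the circulant formula (illustrative sketch with numpy; floating point, so the comparison is approximate):

```python
# Illustrative sketch: det C(a_0, ..., a_{n-1}) = prod_j sum_i a_i eps^{ji}, eps = e^{2 pi i/n}.
import numpy as np

rng = np.random.default_rng(0)
n = 6
a = rng.integers(-5, 6, size=n)

# Row k of C(a_0, ..., a_{n-1}) is a shifted cyclically k steps to the right.
C = np.array([np.roll(a, k) for k in range(n)])

eps = np.exp(2j * np.pi / n)
rhs = np.prod([sum(a[i] * eps ** (j * i) for i in range(n)) for j in range(n)])

assert np.isclose(np.linalg.det(C), rhs)
print(np.linalg.det(C), rhs.real)
```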

Exercises

2.1. Compute the (2n) × (2n) determinant whose (i, i) entry is a_i and whose (i, 2n + 1 − i) entry is b_i for 1 ≤ i ≤ 2n, all other entries being 0.

2.2 (Tridiagonal determinant). Let a, b, c ∈ C and let D_n be the determinant of the n × n matrix with a on the diagonal, b on the superdiagonal and c on the subdiagonal, n ≥ 1.
(i) Prove that D_n = a D_{n−1} − bc D_{n−2} for n ≥ 3.
(ii) Prove that

    D_n = (α^{n+1} − β^{n+1}) / (α − β)   if a^2 − 4bc ≠ 0,
    D_n = (a/2)^n (n + 1)                 if a^2 − 4bc = 0,

where α = (a + √(a^2 − 4bc))/2, β = (a − √(a^2 − 4bc))/2.

2.3. Use Example 2.11 to compute the determinant of the Hilbert matrix H_n = [1/(i + j − 1)]_{1 ≤ i, j ≤ n}.

2.4. Prove that

    det [1  sin x_j  cos x_j  sin 2x_j  cos 2x_j  ···  sin nx_j  cos nx_j]_{1 ≤ j ≤ 2n+1} = (−1)^n 2^{2n^2} ∏_{1 ≤ j < k ≤ 2n+1} sin((x_k − x_j)/2).

4 2 THE DETERMINANT 25 Prove that sin x cos x sin 2x cos 2x sin nx cos nx sin x 2 cos x 2 sin 2x 2 cos 2x 2 sin nx 2 cos nx 2 sin x 2n cos x 2n sin 2x 2n cos 2x 2n sin nx 2n cos nx 2n = ( n 2 ( 2n2 sin x k x j n 2n ( xj sin n + 2 2 π n + s j<k 2n s=0 j= 26 M Let A M m n (F and B M p q (F, where mp = nq Prove that { (det A p (det B m if m = n and p = q, det(a B = 0 otherwise 27 (Maillet s determinant Let p be an odd prime For each i, j {,, p 2 }, let m(i, j {,, p } such that j m(i, j i (mod p (When viewed as an element of Z p, m(i, j = i/j Let For example, D p = det[m(i, j] 4 5 D 7 = 2 3 3 5 Compute D p for p 9 using a computer Make a conjecture about D p Then compute D 23
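
Exercise 2.7 explicitly calls for a machine computation. A minimal illustrative sketch (Python/sympy, using the defining congruence j·m(i, j) ≡ i (mod p)) that prints D_p for the odd primes p ≤ 19:

```python
# Illustrative sketch for Exercise 2.7: Maillet's determinant D_p = det[m(i, j)],
# where 1 <= i, j <= (p - 1)/2 and m(i, j) in {1, ..., p - 1} satisfies j*m(i, j) = i (mod p).
from sympy import Matrix, primerange

def maillet_det(p):
    size = (p - 1) // 2
    M = Matrix(size, size, lambda i, j: ((i + 1) * pow(j + 1, -1, p)) % p)
    return M.det()

for p in primerange(3, 20):
    print(p, maillet_det(p))   # e.g. D_7 = 49, consistent with the 3x3 example in the exercise
```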

CHAPTER 3. Vector Spaces and Linear Transformations

3.1. Basic Definitions

Definition 3.1. A vector space over a field F is an abelian group (V, +) equipped with a scalar multiplication F × V → V, (α, x) ↦ αx, such that for all x, y ∈ V and α, β ∈ F,
(i) α(x + y) = αx + αy;
(ii) (α + β)x = αx + βx;
(iii) α(βx) = (αβ)x;
(iv) 1x = x.

Examples of vector spaces. F^n, where F is a field. More generally, let V be a vector space and X any set; then V^X = the set of all functions from X to V is a vector space over F. If F is a subfield of K, then K is a vector space over F. M_{m×n}(F), F[x], etc. The solution set of a linear system, a linear difference equation, a linear differential equation, etc. For p > 0, l^p = { {a_n}_{n=1}^∞ : a_n ∈ C, Σ_{n=1}^∞ |a_n|^p < ∞ } (note |a_n + b_n|^p ≤ (2 max{|a_n|, |b_n|})^p = 2^p max{|a_n|^p, |b_n|^p} ≤ 2^p (|a_n|^p + |b_n|^p)).

Subspaces. Let V be a vector space over F. A subset W ⊆ V is called a subspace of V if W is a vector space over F under the same addition and scalar multiplication of V. W is a subspace of V ⇔ W ≠ ∅ and W is closed under addition and scalar multiplication.

Linear transformations. Let V and W be vector spaces over F. A function f : V → W is called a linear transformation (or an F-map) if for all x, y ∈ V and α ∈ F, f(x + y) = f(x) + f(y) and f(αx) = αf(x). A bijective F-map is called an isomorphism. If there is an isomorphism f : V → W, we say that V is isomorphic to W and write V ≅ W; in this case, f^{-1} : W → V is also an isomorphism. An injective F-map f : V → W is called an embedding. Hom_F(V, W) = the set of all F-maps from V to W; it is a subspace of W^V. An F-map f : V → V is also called a linear operator of V. Hom_F(V, V) is denoted by End_F(V).

Easy fact. Let f : V → W be a linear transformation. Then f(V) is a subspace of W. If W_1 is a subspace of W, then f^{-1}(W_1) is a subspace of V. In particular, ker f := f^{-1}(0) is a subspace of V. f is 1-1 ⇔ ker f = {0}.

Easy fact. Let V be a vector space over F and {V_i : i ∈ I} a family of subspaces of V.
(i) ∩_{i∈I} V_i is a subspace of V.

6 3 VECTOR SPACES AND LINEAR TRANSFORMATIONS (ii Define { } V i = u i : u i V i, u i 0 for only finitely many i I i I i I Then i I V i is the smallest subspace of V containing i I V i Direct product and external direct sum Let {V i : i I} be a family of vector spaces over F Let V i = { (u i i I : u i V i, i I } (the cartesian product of {V i : i I} i I Then i I V i is a vector space over F with addition and scalar multiplication defined component wise; i I V i is called the direct product of {V i : i I} ext { V i := (u i } V i : a i = 0 for all but finitely many i i I i I is a subspace of i I V i ext i I V i is called the external direct sum of {V i : i I} When I <, i I V i = ext i I V i Internal direct sum Let V be a vector space over F and {V i : i I} a family of subspaces of V If ( V i V j = {0} for all i I, j I j i then i I V i is called an internal direct sum and is denoted by i I V i Easy facts (i i I V i is an internal direct sum every u i I V i has a unique representation u = i I u i, where u i V i and u i = 0 for all but finitely many i (ii i I V i = ext i I V i (For this reason, we usually do not distinguish internal and external direct sums ext is also denoted by Spans, Spanning Sets and Linearly Independent Sets Let V be a vector space over F and let S V The span of S, denoted by S or span S, is S = span S := {a u + + a n u n : n 0, u i V, a i F } S is the smallest subspace of V containing S If V = S, S is called a spanning set of V A subset S V is called a linearly independent set if for any u,, u n S (distinct and any a,, a n F not all zero, a u + + a n u n 0 Theorem 32 Let V be a vector space over F and S V Then the following statements are equivalent (i S is a maximal linearly independent set of V (ii S is a minimal spanning set of V (iii S is a linearly independent spanning set of V (iv Every element of V is a unique linear combination of elements in S

3 BASIC DEFINITIONS 7 Proof (i (iii (ii (iii (iv (iii By Zorn s lemma, maximal linearly independent sets of V exist S V satisfying one of (i (iv in Theorem 32 is called a basis of V A subset Proposition 33 Let V and W be vector spaces over F and let X be a basis of V Then every function f : X W can be extended to a unique F -map f : V W Proof Define f : V W x X a xx x X a xf(x Corollary 34 Let V and W be vector spaces over F Let S be a subspace of V and f : S W an F -map The f can be extended to an F -map g : V W Proof Let X be a basis of S Extend X to a basis of Y of V Extend f X to a function f : Y W By Proposition 33, f can be extended to an F -map g : V W Theorem 35 Any two bases of a vector space have the same cardinality Proof Let V be a vector space over F and let X, Y be two bases of V Assume that X < and Y < Write X = {x,, x n } and Y = {y,, y n } Assume to the contrary that n > m Then x = A x n y y m, y y m = B for some matrix A M n m (F and B M m n (F It follows that AB = I n There exists C GL(n, F such that CA = [ 0 0 ] Thus (0,, 0, C = (0,, 0, CAB = 0, 2 Assume X = We claim that Y = (Otherwise, X is spanned by Y which is spanned by a finite subset of X So, X is spanned by a finite subset of X, For each x X, a finite subset {y,, y n } Y such that x = a y + + a n y n, a i F Define f(x = {y,, y n } We claim that x X f(x = Y (Otherwise, X is spanned by Y := x X f(x Y ; hence Y is spanned by Y, Now, Y = f(x X ℵ 0 = X x X By symmetry, X Y So, X = Y Dimension Let V be a vector space over F with a basis X Define dim V (or dim F V = X We have V = s X x x n F x = ext F = F X x X

8 3 VECTOR SPACES AND LINEAR TRANSFORMATIONS Caution Let F be a field and X a set F X is the direct sum of X copies of F, ie, F X = x X F However, F X is the F -vector space of all functions from X to F, ie, F X = x X F Examples dim F n = n dim F [x] = ℵ 0 Let S n (F be the set of all n n symmetric matrices over F and U n (F the set of all n n upper triangular matrices over F Then dim S n (F = dim U n (F = 2n(n + Example 36 If V is a vector space over F such that V = and V > F Then dim V = V (Eg, dim Q R = ℵ Proof Let X be a basis of V Clearly, X = (If X <, since F < V =, we have V = F X < V, Let P 0 (X be the set of all finite subsets of X Then V = S S P 0(X P 0 (X max{ F, ℵ 0 } (since S = F S max{ F, ℵ 0 } = X max{ F, ℵ 0 } = max{ X, F } Since V F, we must have V X Example Let F be a subfield of K and V a vector space over K Then V is naturally a vector space over F Moreover, dim F V = dim K V dim F K Proof Let X be a basis of V over K and Y a basis over K over F Then as (y, x runs over Y X, yx are all distinct; Y X = {yx : y Y, x X} is a basis of V over F Easy facts (i Two vector spaces V and W over F are isomorphic iff dim V = dim W (ii dim i I V i = i I dim V i Example Let A M m n (F The row (column space of A, denoted by R(A (C(A, is the subspace of F n (F m spanned by the rows (columns of A The nonzero rows (columns of rref(a (rcef(a form a basis of R(A (C(A; dim R(A = dim C(A = rank A 32 Quotient Spaces and Isomorphism Theorems The quotient space Let S V be a vector space over F Recall that the quotient abelian group V/S = {u + S : u V } and the addition in V/S is defined by (u + S + (v + S = (u + v + S Define a scalar multiplication in V/S similarly For u + S V/S and α F, let α(u + S = αu + S The scalar multiplication is well defined and V/S becomes a vector space over F V/S is called the quotient space of V by S The map π : V V/S u u + S is an onto F -map with ker π = S π is called the canonical projection from V to V/S

32 QUOTIENT SPACES AND ISOMORPHISM THEOREMS 9 Proposition 37 Let S V be vector spaces over F Let {ɛ i : i I} be a basis of S and {δ j + S : j J} a basis of V/S Then {ɛ i : i I} {δ j : j J} is a basis of V So, V = S V/S and dim V = dim S + dim V/S If dim V <, then dim V/S = dim V dim S Easy fact (The correspondence theorem Let S V be vector spaces over F Let A be the set of all subspaces of V containing S and B the set of all subspaces of V/S Then A B W W/S is a bijection Theorem 38 (The universal mapping property of the quotient space Let S V be vector spaces over F Let W be another vector space over F and f : V W an F -map such that ker f S Then! F -map f : V/S W such that f = f π Moreover, f(v = f(v and ker f = ker f/s f V f π V/S Proof Define f : V/S W, u + S f(u Theorem 39 (The first isomorphism theorem Let f : V W be an F -map Then V/ ker f = f(v Proof By Theorem 38, an F -map f : V/ ker f W such that f = f π, where π : V V/ ker f is the canonical projection, and f(v = f(v, ker f = ker f/ ker f = {0 + ker f} Theorem 30 (The second isomorphism theorem Let V be a vector space over F and S, T subspaces of V Then (S + T /T = S/S T Proof Define an F -map W f : S (S + T /T s s + T f is onto with ker f = S T Use the first isomorphism theorem Theorem 3 (The third isomorphism theorem Let S T V be vector space over F Then (V/S / (T/S = V/T Proof Define an F -map f : V/S V/T, v + S v + T Then f is onto and ker f = T/S Corollary 32 (i If f : V W is an F -map, then dim V = null f + rank f, where null f := dim(ker f and rank f := dim f(v

(ii) Let S, T be subspaces of V. Then dim S + dim T = dim(S + T) + dim(S ∩ T).

Proof. (ii) Define an F-map f : S ⊕ T → S + T, (s, t) ↦ s + t. Then f is onto and ker f = {(s, −s) : s ∈ S ∩ T} ≅ S ∩ T. Hence dim S + dim T = dim(S ⊕ T) = dim(S + T) + dim(S ∩ T). □

3.3. Finite Dimensional Vector Spaces

Facts.
(i) If S ⊆ V are vector spaces over F such that dim S = dim V < ∞, then S = V.
(ii) Let f : V → W be an F-map, where dim V = dim W < ∞. Then f is 1-1 ⇔ f is onto.

Proof. (i) dim V/S = 0 ⇒ V = S. □

Note. When dim V = ∞, both (i) and (ii) are false.

Let V be an n-dimensional vector space over F with an (ordered) basis E = (ε_1, ..., ε_n) and W an m-dimensional vector space over F with an (ordered) basis (δ_1, ..., δ_m). Let f : V → W be an F-map. Then

    (f(ε_1), ..., f(ε_n)) = (δ_1, ..., δ_m) A

for some A ∈ M_{m×n}(F). The map f ↦ A is an isomorphism Hom_F(V, W) → M_{m×n}(F). We have rank f = rank A and null f = null A (null A := dim{x ∈ F^n : Ax = 0}).

If f ∈ End_F(V) (= Hom_F(V, V)), we have (f(ε_1), ..., f(ε_n)) = (ε_1, ..., ε_n) A for some A ∈ M_n(F). A is called the E-matrix of f. If E' = (ε'_1, ..., ε'_n) is another (ordered) basis of V and B is the E'-matrix of f, then B = P^{-1}AP, where P ∈ GL(n, F) is defined by (ε'_1, ..., ε'_n) = (ε_1, ..., ε_n)P. (Proof: (f(ε'_1), ..., f(ε'_n)) = f((ε_1, ..., ε_n)P) = (f(ε_1), ..., f(ε_n))P = (ε_1, ..., ε_n)AP = (ε'_1, ..., ε'_n)P^{-1}AP.)

The map End_F(V) → M_n(F), f ↦ A, is not only an F-isomorphism but also preserves multiplication; the map is an algebra isomorphism.

Facts about ranks of matrices. Let A, B ∈ M_{m×n}(F) and C ∈ M_{n×p}(F).
(i) rank A = max{r : A has an r×r invertible submatrix}.
(ii) rank(A + B) ≤ rank A + rank B.
(iii) rank A + rank C − n ≤ rank AC ≤ min{rank A, rank C}.
(iv) If P ∈ GL(m, F) and Q ∈ GL(n, F), then rank(PAQ) = rank A.
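
The correspondence f ↦ A and the change-of-basis rule B = P^{-1}AP can be made concrete with a small example. In the sketch below (illustrative; the space, operator and bases are my own choices, not from the notes) V is the space of polynomials over Q of degree ≤ 3, f(p) = x p' (the Euler operator), E = (1, x, x^2, x^3) and E' = (1, 1 + x, (1 + x)^2, (1 + x)^3).

```python
# Illustrative sketch: the E-matrix of f and the change-of-basis rule B = P^{-1} A P,
# following the convention above: (f(e_1), ..., f(e_n)) = (e_1, ..., e_n) A.
from sympy import Matrix, Poly, diff, linsolve, symbols

x = symbols('x')
E  = [1, x, x**2, x**3]                    # ordered basis E of V = {polys of degree <= 3}
E2 = [1, 1 + x, (1 + x)**2, (1 + x)**3]    # another ordered basis E'
n = len(E)

def coords(p, basis):
    # Coordinate column of p in `basis`: solve p = sum_k c_k basis[k].
    cs = symbols('c0:%d' % n)
    eqs = Poly(p - sum(c * v for c, v in zip(cs, basis)), x).all_coeffs()
    sol, = linsolve(eqs, cs)
    return Matrix(sol)

def matrix_of(f, basis):
    # Column j = coordinates of f(basis[j]) in `basis`.
    return Matrix.hstack(*[coords(f(v), basis) for v in basis])

f = lambda p: x * diff(p, x)               # Euler operator, maps V into V

A = matrix_of(f, E)                        # E-matrix of f
B = matrix_of(f, E2)                       # E'-matrix of f
P = Matrix.hstack(*[coords(v, E) for v in E2])   # (e'_1, ..., e'_n) = (e_1, ..., e_n) P

assert B == P.inv() * A * P
print(A)   # diag(0, 1, 2, 3)
print(B)
```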

33 FINITE DIMENSIONAL VECTOR SPACES 2 Proof (iii Method Define f : F n /C(C C(A/C(AC x + (C Ax + C(AC Then f is a well defined onto F -map So, dim ( F n /C(C dim ( C(A/C(AC Hence the result Method 2 May assume A = [ ] [ I r 0 0 0, where r = rank A Write C = C ] C 2, where C is of size r p Then rank AC = rank C and rank C + n r rank C Hence the result Homogeneous linear ordinary differential equations (ODE Let F = R or C Let I R be an open interval Let A : I M n (F be a continuous function Let x(t F n denote an unknown function of a real variable t For each t 0 I and x 0 F n, the initial value problem (3 { x (t = A(tx(t x(t 0 = x 0 has a unique solution x(t defined on I (This is a special case of existence and uniqueness theorems in ODE Let D(I be F-vector space of all differentiable functions from I to F n and let (F n I be the F-vector space of all functions from I to F n Then L : D(I (F n I x(t x (t A(tx(t is an F-map The homogeneous linear ODE x (t = A(tx(t becomes L(x = 0; its solution set is ker L The existence and uniqueness of the solution of (3 is equivalent to the following statement The F-map (32 ker L F n x x(t 0 is an isomorphism Therfore dim F ker L = n Let x,, x n ker L φ(t = det[x (t,, x n (t] is called the Wronskian of x,, x n By the isomorphism (32, x,, x n is a basis of ker L φ(t 0 0 Since t 0 I is arbitrary, φ(t 0 0 φ(t 0 t I The Wronskian φ(t is explicitly given by its initial value φ(t 0 : ( t (33 φ(t = φ(t 0 exp Tr(A(τdτ t 0

22 3 VECTOR SPACES AND LINEAR TRANSFORMATIONS Proof of (33 We have φ (t = d ( det[x (t,, x n (t] dt n = det[x (t,, x i (t, x i(t, x i+ (t, x n (t] = i= n det[x (t,, x i (t, A(tx i (t, x i+ (t, x n (t] i= = Tr A(t det[x (t,, x n (t] (by the next lemma = Tr A(t φ(t It follows that φ(t = φ(t 0 exp ( t t 0 Tr(A(τdτ (the product rule Let a 0 (t,, a n (t F be continuous functions of t I and x(t F an unknown function Then the nth order linear ODE (34 x (n (t + a n (tx (n (t + + a 0 (tx(t = 0 is equivalent to where y(t = x( x (t x (n (t, A(t = y (t = A(ty(t, Let S be the solution set of (34 Then for each t 0 I, 0 0 0 a 0 (t a (t a n (t S F n x(t (x(t 0, x (t 0,, x (n (t 0 T is an isomorphism If x,, x n S, their Wronskian is x (t x n (t x φ(t = (t x n(t x (n (t x (n n (t We have ( t φ(t = φ(t 0 exp a n (τdτ t 0 x,, x n form a basis of S φ(t 0 0 φ(t 0 t I Lemma 33 Let A, B M n (F and write B = [b, b n ] Then n (35 det[b,, b i, Ab i, b i+,, b n ] = TrA det B i=

34 THE DUAL SPACE 23 Proof Fix A = [a ij ] and let f(b be the difference of the two sides of (35 We only have to show that f satisfies (i (iii of Proposition 26 (i is obvious (ii Assume b = b 2 Then (iii f(b = det[ab, b 2, b 3,, b n ] + det[b, Ab 2, b 3,, b n ] = 0 f([e,, e n ] = = n det[e,, e i, Ae i, e i+,, e n ] i= i= a i n det[e,, e i, a ni 34 The Dual Space, e i+,, e n ] = n a ii = TrA Let V be a vector space over F Hom F (V, F is called the dual space of V and is denoted by V Let B be a basis of V For each v B,!v V such that { v if u = v, (u = 0 if u B \ {u} It is easy to see that {v : v B} are linearly independent in V Thus, B V, v v extends to an embedding V V (Note This embedding depends on the choice of the basis of B If dim V = n <, then dim V = n (Recall that Hom F (F n, F = M n (F So, the above embedding V V is an isomorphism {v : v B} is a basis of V and is called the dual basis of B Theorem 34 Let V be a vector space over F such that dim V = Then dim V = V = F dim V Proof Let B be a basis of V Then V = F B = F dim V Case Assume F dim V > F By Example 36, dim V = V = F dim V Case 2 Assume F dim V = F Let b 0, b, B be distinct For each a F, choose f a V such that f a (b j = a j, j 0 We claim that {f a : a F } is linearly independent This is quite obvious Let a,, a n F be distinct Then the n ℵ 0 matrix [ fai (b j ] = [a j i ] i= has linearly independent rows Therefore, dim V {f a : a F } = F Examples Let F = Q, V = Q ℵ0 Then dim V = ℵ ℵ0 0 = ℵ Let F = R, V = R ℵ0 Then dim V = ℵ ℵ0 = ℵ The pairing between V and V Define a map, : V V F by f, v = f(v, is bilinear, ie, af +bg, v = a f, v +b g, v and f, au+bv = a f, u +b f, v For any S V and A V, S := {f V : f, v = 0 v S} is a subspace of V and A := {v V : f, v = 0 f A} is a subspace of V

24 3 VECTOR SPACES AND LINEAR TRANSFORMATIONS Proposition 35 Let S, T be subspaces of V and A, B subspaces of V (i S T S T ; A B A B (ii S = S ; A A (iii φ : S (V/S (iv is an isomorphism, where is an embedding, where f f, f, : V/S F v + S f, v ψ : A (V /A v, v, v : V /A F f + A f, v (v If dim V = n <, then dim S + dim S = n, dim A + dim A = n, A = A, and the embedding ψ in (iv is an isomorphism Proof (ii Clearly, S S If u V \ S, then f V such that f(s = 0 but f(u 0 So, f S but f, u 0 Hence u / S So, S S (iii Proof that φ is onto Let π : V V/S be the natural projection g (V/S, we have g π S and g = φ(g π (v Note that dim(v/s = dim(v/s Thus by (iii, dim S = dim(v/s = n dim S Let A = {0} in (iv We have V = {0} V, v, v is an embedding Since dim V = dim V, this embedding is also onto Thus every α V is of the form, v for some v V It follows that the map ψ in (iv is onto (Let ρ : V V /A be the natural projection β (V /A, we have β ρ V ; hence β ρ =, v for some v V Clearly v A and ψ(v = β Consequently, dim A = dim(v /A = dim(v /A = n dim A Since A A and dim A = n dim A = dim A, we have A = A Note (i The embedding V V, v, v, is called the canonical embedding of V into V ; it does not depends on any bases of V and V (For comparison, note that the embedding V V at the beginning of this section depends on the choice of the bases of V and V When dim V <, the canonical embedding is an isomorphism (ii Statements (iii and (iv of Proposition 35 can be made a little more general See Exercise 34 (iii When dim V =, the claims in (v of Proposition 35 are false See the following counterexamples Let S = {0} V Then dim S = dim V > dim V ; hence dim S + dim S > dim V Let A = V Then dim A + dim A > dim V

EXERCISES 25 Since dim V > dim V, the canonical embedding V V is not onto Assume V has a countable basis ɛ, ɛ 2, Let A = {f V : f(ɛ n = 0 when n is large enough} Then A = {0} (If 0 v V, then v = a ɛ + + a N ɛ N for some N > 0 and a,, a N F Choose f V such that f(v = and f(ɛ n = 0 for all n > N Then f A but f, v 0, so v / A Therefore, A = {0} = V A When dim V = n <, the paring between V and V can be made more explicit Let v,, v n be a basis of V and v,, v n the dual basis of V Define isomorphisms α : F n V, (a,, a n a v + + a n v n, β : F n V, (b,, b n b v + + b n v n For v V and f V, write v = a v + + a n v n and f = b v + + b n v n Then f, v = b v + + b n v n, a v + + a n v n = b a + + b n a n = (b,, b n (a,, a n T = β (f α (v T Let S be a subspace of V and A a subspace of V Let ɛ,, ɛ k be a basis of α (S and δ,, δ l a basis of β (A Then β (S = ker r [ɛ T,, ɛ T k ], α (A = ker r [δ T,, δ T l ] Proposition 36 Let f : V W be an F -map (i Define f : W V, α α f Then f Hom F (W, V Moreover, ( : Hom F (V, W Hom F (W, V is an F -map (ii If g : W X is another F -map, then (g f = f g (iii Let θ V : V V and θ W : W W be the canonical embeddings Then the following diagram is commutative f V W θ V θ W V f W Proof Exercise Exercises 3 Let V be a vector space over F and let A, B, A be subspaces of V such that A A Prove that A (B + A = (A B + A

26 3 VECTOR SPACES AND LINEAR TRANSFORMATIONS 32 Let V be a vector space over F and let f be a linear transformation of V A subspace W V is called f-invariant if f(w W Define V = {a V : f k (a = 0 for some integer k > 0}, V 2 = f k (V k= (i Prove that V and V 2 are both f-invariant subspaces of V (ii If dim V <, prove that V = V V 2 (iii Give an example of a linear transformation f of an infinite dimensional vector space V such that V = V 2 = {0} 33 Let L = {f(x, y R[x, y] : deg x f n, deg y f n} Let = 2 x 2 Prove that D : L L f(x, y ( (x 2 + y 2 f(x, y (x 2 + y 2 f(x, y + 2 y 2 is a linear transformation Find the matrix of D relative to the basis {x i y j : 0 i, j n} of L 34 Let V be a vector space over F Let S T be subspaces of V and A B subspaces of V (i Define where φ : S /T (T/S f + T f, f, T/S F u + S f, u Prove that φ is a well defined isomorphism (ii Define where ψ : A /B (B/A u + B, u, u B/A F f + A f, u Prove that ψ is a well defined injective F -map When dim V <, ψ is an isomorphism 35 Prove Proposition 36 36 Let A = [ B D ] C, E where B M m n (F with rank B = r and E M p q (F largest possible values of rank A? What is the

EXERCISES 27 37 Let A M m n (F, B M n p (F, C M p q (F Prove that rank AB + rank BC rank B + rank ABC 38 (i Let V and W be vector spaces over Q and f : V W a function such that f(x + y = f(x + f(y for all x, y V Prove that f is a Q-linear map (ii Let f : R n R m be a continuous function such that f(x + y = f(x + f(y for all x, y R n Prove that f is an R-linear map (Note (ii is false if f is not continuous 39 Let X be a subspace of M n (F with dim X > n(n Prove that X contains an invertible matrix 30 Let F q be a finite field with q elements (i Prove that GL(n, F q = (q n (q n q (q n q n = q 2 n(n n i= (q i (ii Let 0 k n and let [ ] n k be the number of k-dimensional subspaces q in F n q Prove that [ ] n = (qn (q n q (q n q k k k q (q k (q k q (q k q k = q n k+i q i i= ( [ ] n k is called the gaussian coefficient q 3 Let n 0 and V = {f F [x] : deg f n} For each i n +, define L i V by L i (f = + 0 f(xe ix dx, f V Find a basis f,, f n+ of V such that L,, L n+ is its dual basis

CHAPTER 4. Rational Canonical Forms and Jordan Canonical Forms

4.1. A Criterion for Matrix Similarity

The main purpose of this chapter is to determine when two matrices in M_n(F) are similar and to determine a canonical form in each similarity class. Let V be an n-dimensional vector space over F. Then two matrices in M_n(F) are similar iff they are the matrices of some T ∈ End(V) relative to two suitable bases. Therefore, to know canonical forms of the similarity classes of M_n(F) is to know canonical forms of linear transformations of V relative to suitable bases.

Matrices over F[x]. Let F[x] be the polynomial ring over F. M_{m×n}(F[x]) is the set of all m×n matrices with entries in F[x]; M_n(F[x]) := M_{n×n}(F[x]); GL(n, F[x]) is the set of all invertible matrices in M_n(F[x]).

Fact. A ∈ M_n(F[x]) is invertible ⇔ det A ∈ F^* (= F \ {0}).

Proof. (⇒) 1 = det(AA^{-1}) = (det A)(det A^{-1}) with det A, det A^{-1} ∈ F[x], i.e., det A is invertible in F[x], so det A ∈ F^*. (⇐) A^{-1} = (det A)^{-1} adj A. □

Equivalence in M_{m×n}(F[x]). Two matrices A, B ∈ M_{m×n}(F[x]) are called equivalent, denoted A ~ B, if there exist P ∈ GL(m, F[x]) and Q ∈ GL(n, F[x]) such that A = PBQ.

Elementary operations and elementary matrices in M_n(F[x]). Elementary operations and elementary matrices in M_n(F[x]) are almost the same as those in M_n(F), cf. Table 1.1. For type I, we still require that α ∈ F^* (requiring that 0 ≠ α ∈ F[x] is not enough). For type III, β ∈ F[x]. Elementary matrices in M_n(F[x]) are invertible and every matrix in GL(n, F[x]) is a product of elementary matrices.

Theorem 4.1. Let A, B ∈ M_n(F). Then A and B are similar in M_n(F) ⇔ xI − A and xI − B are equivalent in M_n(F[x]).

Proof. (⇒) There exists P ∈ GL(n, F) such that A = P^{-1}BP. Note that P ∈ GL(n, F[x]) and P^{-1}(xI − B)P = xI − A.

(⇐) There exist P, Q ∈ GL(n, F[x]) such that P(xI − A) = (xI − B)Q. Write P = P_0 + xP_1 + ··· + x^s P_s, where P_i ∈ M_n(F). Divide P by xI − B from the left: we have P = (xI − B)S + T for some S ∈ M_n(F[x]) and T ∈ M_n(F). Divide

30 4 RATIONAL CANONICAL FORMS AND JORDAN CANONICAL FORMS Q by xi A from the right We have Q = S (xi A + T for some S M n (F [x] and T M n (F Thus ie, [(xi BS + T ](xi A = (xi B[S (xi A + T ], (4 (xi B(S S (xi A = (xi BT T (xi A We claim that S S = 0 (Otherwise, S S = S 0 + xs + + x k S k, S i M n (F, S k 0 Then (xi B(S S (xi A = x k+2 S k + terms of lower degree in x while the highest power of x at the RHS of (4 is x, Thus (xi BT T (xi A = 0, which implies that T = T and BT = T A It remains to show that T GL(n, F (Then B = T AT Write P = (xi AX + Y, where X M n (F [x] and Y M n (F Then (42 I = P P = [(xi BS + T ][(xi AX + Y ] = (xi BS [ (xi AX + Y ] + T (xi A + T Y = (xi BS [ (xi AX + Y ] + (xi BT + T Y ( T A = BT = (xi BZ + T Y for some Z M n (F [x] Compare the degrees of x at both sides of (42 We must have T Y = I and the proof is complete Now, the question is to determine when xi A is equivalent to xi B 42 The Smith Normal Form For two matrices A, B of any size, define A B = [ A B ] Theorem 42 Let A M m n (F [x] Then P GL(m, F [x] and Q GL(n, F [x] such that d d 2 (43 P AQ = 0, where d,, d r F [x] are monic (with leading coefficient and d d 2 d r The polynomials d,, d r F [x] are uniquely determined by A and are called the invariant factors of A The integer r is called the rank of A The matrix at the RHS of (43 is called the Smith normal form of A Proof Existence of the Smith normal form For 0 A = [a ij ] M m n (F [x], define δ(a = min{deg a ij : a ij 0} Use induction on min(m, n First assume min(m, n =, say m = Assume A 0 Among all matrices equivalent to A, choose B such that δ(b is as small as possible Write B = [b,, b n ] and, without loss of generality, assume deg b = δ(b Then b b ij for all 2 j n (If b b 2, then b 2 = qb + r for some q, r F [x] with 0 deg r < deg b Then d r

42 THE SMITH NORMAL FORM 3 B = [b, b 2 qb, b 3,, b n ] = [b, r, b 3,, b n ], which contradicts the minimality of δ(b Thus, suitable elementary column operations of type III transform B into [b, 0,, 0] We can make b monic using a type I elementary operation Now assume min(m, n > and A 0 Among all matrices equivalent to A, choose B such that δ(b is as small as possible Let B = [b ij ] and assume deg b = δ(b By the argument in the case m = we have b b j for all 2 j n and b b i for all 2 i m Then suitable type III elementary operations transform B into b 0 0 0 c 22 c 2n C = 0 c m2 c mn We claim that b c ij for all 2 i m and 2 j n (Since b c i2 c in 0 c 22 c 2n C, 0 c m2 c mn from the above we have b c ij for all 2 j n Therefore, C = [b ] b C, where C M (m (n (F [x] Apply the induction hypothesis to C Uniqueness of the Smith normal form For A M m n (F [x] and k min(m, n, define k (A = gcd{det X : X is a k k submatrix of A} ( k (A is called the kth determinantal divisor of A Also define 0 (A = We claim that if A, B M m n (F [x] are equivalent, then k (A = k (B for all 0 k min(m, n Assume B = P AQ, where P GL(m, F [x], Q GL(n, F [x] By Cauchy-Binet, for I {,, m} and J {,, n} with I = J = k, det B(I, J = det P (I, K det A(K, L det Q(L, J K {,,m} L {,,n} K = L =k Since k (A det A(K, L for all K, L, k (A det B(I, J for all I, J k (A k (B By symmetry, k (B k (A So, k (A = k (B Now, if d d 2 A 0, then (44 k (A = d r { d d k if 0 k r, 0 if k > r So,

32 4 RATIONAL CANONICAL FORMS AND JORDAN CANONICAL FORMS So, r is uniquely determined by A and (45 d k = k(a k (A, k r, are also uniquely determined by A Elementary divisors Let A M m n (F [x] and let d,, d r be the nonconstant invariant factors of A Write d i = p ei i pei,s i i, where p i,, p i,si F [x] are distinct monic irreducible polynomials and e i,, e i,si Z + Then p ei i,, pei,s i i, i r, are called the elementary divisors of A Corollary 43 Let A, B M m n (F [x] The following statements are equivalent (i A, B are equivalent (ii A, B have the same invariant factors (iii A, B have the same rank and same elementary divisors (iv A, B have the same determinantal divisors Proof By Theorem 42, (i (ii By (44 and (45, (ii (iv Obviously, (ii (iii (iii (ii It suffices to show that the invariant factors of a matrix A M m n (F [x] are determined by its rank and its elementary divisors Let rank A = r Let the elementary divisors of A be p e,, pe,s, p et t,, p et,s t t, where p,, p t F [x] are distinct monic irreducibles and 0 < e i e i,si, i t Then the last invariant factor of A is d r = p e,s p et,s t t The other invariant factors of A are determined by the remaining elementary divisors p e,, pe,s, p et t,, p et,s t t the same way Therefore, the invariant factors of A are determined by its rank and its elementary divisors Proposition 44 Let A, B be two matrices over F [x] Then the elementary divisor list of A B is the union of the elementary divisor lists of A and B Proof We may assume that A and B are Smith normal forms: A = f f s 0, g g t 0 Let p F [x] be any monic irreducible Write f i = p ai f i, g j = p bj g j, where p f i, p g j, and a a s, b b t Let c c s+t be a rearangement of a,, a s, b,, b t Then for k s + t, k (A B = p c+ +c k h k, h k F [x], p h k

42 THE SMITH NORMAL FORM 33 (Note that k (A B = 0 for k > s + t Hence, the kth invariant factor of A B is k (A B k (A B = pc k h k, h k F [x], p h k Therefore, the powers of p appearing in the elementary divisor list of A B are p c k, c k > 0 These are precisely the powers of p appearing in the union of the elementary divisor lists of A and B Example Let A M 5 4 (R[x] be given below 0 2x + 2 6x + 6 5x 4 + 0x 3 + 5x 2 + 8x + 8 2 2x + 2 2x 4 2x 3 4x 2 + 6x + 6 x 4 2x 3 3x 2 6x 6 A = x x 2 5x 5 + 6x 2 + 2x x 5 + 3x 4 + 5x 3 + 6x 2 + 5x + 4 0 x 4 + x 3 + 2x 2 3x 4 + 6x 3 + 9x 2 + 2x + 7 2 2x + 2 2x 4 2x 3 4x 2 + 6x + 6 2x 4 4x 3 6x 2 0x 8 We use elementary operations to bring A to its Smith normal form: A r r4 r ( 0 x 4 x 3 2x 2 3x 4 6x 3 9x 2 2x 7 2 2x + 2 2x 4 2x 3 4x 2 + 6x + 6 x 4 2x 3 3x 2 6x 6 x x 2 5x 5 + 6x 2 + 2x x 5 + 3x 4 + 5x 3 + 6x 2 + 5x + 4 0 2x + 2 6x + 6 5x 4 + 0x 3 + 5x 2 + 8x + 8 2 2x + 2 2x 4 2x 3 4x 2 + 6x + 6 2x 4 4x 3 6x 2 0x 8 r2 2 r r3 (x r r5 2 r 0 x 4 x 3 2x 2 3x 4 6x 3 9x 2 2x 7 0 2x + 2 6x + 6 5x 4 + 0x 3 + 5x 2 + 8x + 8 0 x 2 x 3 + 4x 2 + 2x 4x 5 + 6x 4 + 8x 3 + 9x 2 3 0 2x + 2 6x + 6 5x 4 + 0x 3 + 5x 2 + 8x + 8 0 2x + 2 6x + 6 4x 4 + 8x 3 + 2x 2 + 4x + 6 0 0 0 0 2x + 2 6x + 6 5x 4 + 0x 3 + 5x 2 + 8x + 8 0 x 2 x 3 + 4x 2 + 2x 4x 5 + 6x 4 + 8x 3 + 9x 2 3 0 2x + 2 6x + 6 5x 4 + 0x 3 + 5x 2 + 8x + 8 0 2x + 2 6x + 6 4x 4 + 8x 3 + 2x 2 + 4x + 6 2 6 5x 3 + 5x 2 + 0x + 8 x x 2 + 3x 4x 4 + 2x 3 + 6x 2 + 3x 3 = [] (x + 2 6 5x 3 + 5x 2 + 0x + 8, 2 6 4x 3 + 4x 2 + 8x + 6

34 4 RATIONAL CANONICAL FORMS AND JORDAN CANONICAL FORMS where 2 6 5x 3 + 5x 2 + 0x + 8 x x 2 + 3x 4x 4 + 2x 3 + 6x 2 + 3x 3 2 6 5x 3 + 5x 2 + 0x + 8 x 2 + 2 (x + (x 2 + 2 2 6 4x 3 + 4x 2 + 8x + 6 0 0 0 So, x + A (x + (x 2 + 2 (x + 2 (x 2 + 2 0 0 0 0 We have (A =, 2 (A = x +, 3 (A = (x + 2 (x 2 + 2, 4 (A = (x + 3 (x 2 + 2 2 The elementary divisors of A are x +, x +, (x + 2, x 2 + 2, x 2 + 2 43 Rational Canonical Forms Let A M n (F Since det(xi A 0 (in F [x], the Smith normal form of xi A has no 0 s on the diagonal So, the invariant factors of xi A are completely determined by the nonconstant invariant factors of xi A For this reason, when we speak of the invariant factors of xi A, we usually mean the nonconstant ones The invariant factors, elementary divisors and determinantal divisors of xi A are also called those of A Theorem 45 Let A, B M n (F Then the following statements are equivalent (i A B (ii A, B have the same invariant factors (iii A, B have the same elementary divisors (iv A, B have the same determinantal divisors Proof Immediate from Theorem 4 and Corollary 43 Corollary 46 For every A M n (F, A A T Proof xi A and xi A T have the same determinantal divisors
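
Formulas (4.4) and (4.5) give a direct, if inefficient, way to compute invariant factors: Δ_k = gcd of all k × k minors, d_k = Δ_k/Δ_{k−1}. The sketch below (illustrative; the 4 × 4 example matrix is my own, not from the notes) applies this to xI − A with sympy.

```python
# Illustrative sketch: invariant factors of xI - A via determinantal divisors,
#   Delta_k = gcd of all k x k minors of xI - A,   d_k = Delta_k / Delta_{k-1}.
from itertools import combinations
from sympy import Matrix, eye, factor, gcd, prod, simplify, symbols

x = symbols('x')
A = Matrix([[2, 1, 0, 0],
            [0, 2, 0, 0],
            [0, 0, 2, 0],
            [0, 0, 0, 3]])
n = A.rows
xIA = x * eye(n) - A

def Delta(k):
    g = 0
    for rows in combinations(range(n), k):
        for cols in combinations(range(n), k):
            g = gcd(g, xIA[list(rows), list(cols)].det())
    return g.as_poly(x).monic().as_expr()      # normalize to a monic polynomial

d, prev = [], 1
for k in range(1, n + 1):
    Dk = Delta(k)
    d.append(simplify(Dk / prev))              # d_k = Delta_k / Delta_{k-1}
    prev = Dk

print([factor(dk) for dk in d])                # [1, 1, x - 2, (x - 2)**2*(x - 3)]
assert simplify(prod(d) - xIA.det()) == 0      # d_1 ... d_n = Delta_n = det(xI - A) = c_A(x)
```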

43 RATIONAL CANONICAL FORMS 35 The companion matrix Let f(x = x n + a n x n + + a 0 F [x] The companion matrix of f, denoted by M(f, is defined to be 0 0 M(f = 0 a 0 a a n 2 a n f(x is the only invariant factor of M(f In fact, x x n (M(f = = f(x, x a 0 a a n 2 x + a n n (M(f = Theorem 47 Let A (M n (F have invariant factors d,, d r and elementary divisors e,, e s Then A M(d M(d r M(e M(e s M(d M(d r and M(e M(e s are called the rational canonical forms (in terms of invariant factors/elementary divisors Proof The invariant factors of xi M(d M(d r are d,, d r The elementary divisors of M(e M(e s are e,, e s The characteristic polynomial Let A M n (F c A (x := det(xi A is called the characteristic polynomial of A Theorem 48 (Cayley-Hamilton Let A M n (F have characteristic c A (x = x n + a n x n + + a 0 Then c A (A = 0, ie, Proof We have A n + a n A n + + a 0 I = 0 (46 c A (xi = x n I + a n x n I + + a 0 I c A (A + c A (A = (xi Ap + c A (A for some p M n (F [x] We also have (47 c A (xi = det(xi A I = (xi A adj(xi A = (xi Aq, where q = adj(xi A M n (F [x] By (46 and (47, (xi A(p q = c A (A A comparison of degrees in x implies that q = p; hence c A (A = 0 The minimal polynomial Let A M n (F Let I = {f F [x] : f(a = 0} Then I = since c A I Let m I be monic and of the smallest degree Then every f I is a multiple of m (Write f = qm + r, where r = 0 or deg r < deg m Then 0 = f(a = r(a By the minimality of deg m, we have r = 0 Hence m is

36 4 RATIONAL CANONICAL FORMS AND JORDAN CANONICAL FORMS unique in I; it is called the minimal polynomial of A, denoted by m A We have m A c A Easy fact If A B, then c A (x = c B (x and m A (x = m B (x Proposition 49 M2 Let f(x = x n + a n x n + + a 0 F [x] Then the minimal polynomial of M(f is f(x Proof Let A = M(f Only have to show that A 0, A,, A n are linearly independent (Thus, g F [x] with deg g n such that g(a = 0 Using induction, we have 0 A i n i, 0 i n 0 = 0 0 Hence A i [, 0,, 0] T, 0 i n, are linearly independent So, A i, 0 i n, are linearly independent Proposition 40 Let A M n (F have invariant factors d,, d r, (d d 2 d r Then m A (x = d r (x Proof May assume A = M(d M(d r Then d r (A = d r ( M(d d r ( M(dr = 0 So, m A d r On the other hand, since m A (A = 0, m A (M(d r = 0 By Proposition 49, d r m A Example Let 4 3 8 6 0 8 0 A = 4 7 20 2 M 4(R 6 4 8 6 Then x 4 3 8 x + 3 0 x + 2 6 x 8 0 r+r2 0 x 8 6 xi A = 4 7 x + 20 2 c c4 2 7 x + 20 4 6 4 8 x 6 x 6 4 8 6 0 0 0 0 x + 30 8 0x + 26 0 2x + 56 x + 20 2x + 56 0 x 2 + 3x + 4 8 x 2 + 4x + 8

43 RATIONAL CANONICAL FORMS 37 c2 c3 8 x + 30 x 4 c4 c3 [] 8(x + 20 8(2x + 56 0 r3 8 8 x 2 + 3x + 4 x + 4 8 0 0 [] 0 x 2 82x 52x (x + 4(x + 20 0 x 2 8x 6 2x + 8 [ ] x 38 x + 20 [] [] (x + 4 x 4 2 [ ] 0 [] [] (x + 4 0 x 2 + 2x + 4 So, the invariant factors of A are x+4, (x+4(x 2 +2x+4; the elementary divisors are x + 4, x + 4, x 2 + 2x + 4 The rational canonical form of A is [ ] 0 [ 4] [ 4] 4 2 Eigenvalues, eigenvectors and eigenspaces Let A M n (F If 0 x F n and λ F such that Ax = λx, λ is called an eigenvalue of A and x is called an eigenvector of A (with eigenvalue λ Eigenvalues of A are the roots of the characteristic polynomial c A (x If λ is an eigenvalue of A, E λ (A := {x F n : Ax = λx} = ker c (A λi is called the eigenspace of A with eigenvalue λ dim E A (λ = null(a λi is called the geometric multiplicity of λ The multiplicity of λ as a root of c A (x is called the algebraic multiplicity of λ Similar matrices have the same eigenvalues together with their algebraic and geometric multiplicities Fact If A = M(f M(f k, where f i F [x] is monic and λ is an eigenvalue of A Then the geometric multiplicity of λ is {i : f i (λ = 0} In particular, geomult(λ algmult(λ Proof We have null(a λi = i null(m(f i λi, where null(m(f i λi = { 0 if f i (λ 0, if f i (λ = 0 Fact Let λ,, λ k F be distinct eigenvalues of A M n (F Then E A (λ + + E A (λ k = E A (λ E A (λ k

Proof. We want to show that

    E_A(λ_i) ∩ (E_A(λ_1) + ··· + E_A(λ_{i−1}) + E_A(λ_{i+1}) + ··· + E_A(λ_k)) = {0},   1 ≤ i ≤ k.

Without loss of generality, assume i = 1. Let x ∈ E_A(λ_1) ∩ (E_A(λ_2) + ··· + E_A(λ_k)). Then x = a_2 x_2 + ··· + a_k x_k, x_i ∈ E_A(λ_i), a_i ∈ F. So,

    [∏_{i=2}^k (λ_1 − λ_i)] x = [∏_{i=2}^k (A − λ_i I)] x = [∏_{i=2}^k (A − λ_i I)] (a_2 x_2 + ··· + a_k x_k) = 0.

Hence x = 0. □

Diagonalizable matrices. A ∈ M_n(F) is called diagonalizable (or diagonable) if A is similar to a diagonal matrix.

Proposition 4.11. Let A ∈ M_n(F) and let λ_1, ..., λ_k be all the eigenvalues of A in F. The following statements are equivalent.
(i) A is diagonalizable.
(ii) All elementary divisors of A are of degree 1.
(iii) F^n = E_A(λ_1) ⊕ ··· ⊕ E_A(λ_k).
(iv) Σ_{i=1}^k geomult(λ_i) = n.

Simultaneous diagonalization.

Proposition 4.12. Let A_1, ..., A_k ∈ M_n(F) be such that each A_i is diagonalizable and A_i A_j = A_j A_i for all 1 ≤ i, j ≤ k. Then there exists P ∈ GL(n, F) such that P^{-1} A_i P is diagonal for all 1 ≤ i ≤ k.

Proof. Use induction on k. Since A_1 is diagonalizable, we may assume A_1 = a_1 I_{n_1} ⊕ ··· ⊕ a_s I_{n_s}, where a_1, ..., a_s ∈ F are distinct and n_1 + ··· + n_s = n. For each 2 ≤ i ≤ k, since A_i commutes with A_1, we must have A_i = A_{i1} ⊕ ··· ⊕ A_{is}, A_{ij} ∈ M_{n_j}(F). Since A_i is diagonalizable, each A_{ij} is diagonalizable (think of the elementary divisors). Since A_2, ..., A_k are pairwise commutative, for each 1 ≤ j ≤ s, A_{2j}, ..., A_{kj} are pairwise commutative. By the induction hypothesis, there exists P_j ∈ GL(n_j, F) such that P_j^{-1} A_{ij} P_j is diagonal for all 2 ≤ i ≤ k. Let P = P_1 ⊕ ··· ⊕ P_s. Then P^{-1} A_i P is diagonal for all 1 ≤ i ≤ k. □

The equation AX = XB. Let A ∈ M_m(F) and B ∈ M_n(F). We compute dim{X ∈ M_{m×n}(F) : AX = XB}.

Lemma 4.13. Let A ∈ M_n(F) be such that c_A(x) = m_A(x). Then for any g ∈ F[x], rank g(A) = n − deg(g, c_A).

Proof. Let h = (g, c_A). Since h | g, we have rank g(A) ≤ rank h(A). Write h = ag + bc_A for some a, b ∈ F[x]. Then h(A) = a(A)g(A). So rank h(A) ≤ rank g(A). Hence rank g(A) = rank h(A).
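
Lemma 4.13 is stated for matrices with c_A = m_A; by Proposition 4.9 the companion matrix M(f) is such a matrix, with c = m = f. The sketch below (illustrative; the polynomial f and the test polynomials g are my own choices) checks rank g(A) = n − deg gcd(g, c_A) on a companion matrix.

```python
# Illustrative sketch: check rank g(A) = n - deg gcd(g, c_A) (Lemma 4.13) for A = M(f),
# the companion matrix of f, which satisfies c_A = m_A = f (Proposition 4.9).
from sympy import Poly, eye, gcd, symbols, zeros

x = symbols('x')
f = Poly(x**4 - x**3 - x + 1, x)        # f = (x - 1)^2 (x^2 + x + 1)
n = f.degree()

# Companion matrix M(f): 1's on the subdiagonal, last column -(a_0, ..., a_{n-1})^T.
a = list(reversed(f.all_coeffs()))       # [a_0, a_1, ..., a_{n-1}, 1]
M = zeros(n, n)
for i in range(1, n):
    M[i, i - 1] = 1
for i in range(n):
    M[i, n - 1] = -a[i]

assert Poly((x * eye(n) - M).det(), x) == f        # c_A = f

def poly_at_matrix(g, A):
    out = zeros(n, n)
    for c in g.all_coeffs():                       # Horner's rule
        out = out * A + c * eye(n)
    return out

tests = [Poly(x - 1, x), Poly(x**2 + x + 1, x), Poly((x - 1) * (x + 2), x), Poly(x**3 - 1, x)]
for g in tests:
    deg_gcd = Poly(gcd(g.as_expr(), f.as_expr()), x).degree()
    assert poly_at_matrix(g, M).rank() == n - deg_gcd
print("Lemma 4.13 holds in these test cases")
```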