LECTURE VII: THE JORDAN CANONICAL FORM
MAT 204 - FALL 2006
PRINCETON UNIVERSITY
ALFONSO SORRENTINO

[See also Appendix B in the book]

1 Introduction

In Lecture IV we introduced the concepts of eigenvalue and eigenvector of a linear operator on a real vector space (or, equivalently, of a square matrix).

Definition. Let $A \in M_n(\mathbb{R})$ and $\lambda \in \mathbb{R}$; we say that $\lambda$ is an eigenvalue of $A$ if there exists a non-zero vector $x \in \mathbb{R}^n$ such that $Ax = \lambda x$. Such a vector is called an eigenvector associated to $\lambda$. The set of eigenvalues of $A$ is called the spectrum of $A$ and is usually denoted by $\Lambda(A)$. We call eigenspace of $\lambda$ the vector subspace $\mathrm{Null}(A - \lambda I_n)$ (Null denotes the nullspace). This subspace is denoted by $E_\lambda$ [or $E_\lambda(A)$]. We have:
\[ E_\lambda = \{ x \in \mathbb{R}^n : Ax = \lambda x \}. \]
The dimension of $E_\lambda$ is called the geometric multiplicity of $\lambda$ and will be denoted by $d_\lambda$.

We also showed a way of computing eigenvalues:

Proposition 1. Let $A \in M_n(\mathbb{R})$. Then $\lambda$ is an eigenvalue of $A$ if and only if $\lambda$ is a root of the characteristic polynomial $P(\lambda) = \det(A - \lambda I_n)$.

The multiplicity of $\lambda$ as a root of the characteristic polynomial $P$ of $A$ is called the algebraic multiplicity of $\lambda$ and will be denoted by $h_\lambda$ (or $h(\lambda)$). In particular:

Proposition 2. Let $A \in M_n(\mathbb{R})$. Then $A$ has at most $n$ distinct eigenvalues. If we denote by $\lambda_1, \dots, \lambda_s \in \mathbb{R}$ these eigenvalues (with $s \le n$), let $E_{\lambda_1}, \dots, E_{\lambda_s}$ be the corresponding eigenspaces and $d_{\lambda_1}, \dots, d_{\lambda_s}$ the geometric multiplicities. We have:
i) $s \le \sum_{i=1}^{s} d_{\lambda_i} \le n$;
ii) $\sum_{i=1}^{s} d_{\lambda_i} = \dim \left( \bigoplus_{i=1}^{s} E_{\lambda_i} \right)$.

Definition. Let $A \in M_n(\mathbb{R})$. We say that $A$ is diagonalizable if it is similar to a diagonal matrix; i.e., there exist $C \in GL_n(\mathbb{R})$ and a diagonal matrix $D$ of order $n$ such that:
\[ A = C^{-1} D C. \]

The importance of these concepts derives from the following result, i.e., from the possibility of obtaining a simpler (diagonal) representation of our matrix/linear operator (up to similarity) in the special case in which the matrix has $n$ linearly independent eigenvectors (i.e., a basis of eigenvectors).

Proposition 3. Let $A \in M_n(\mathbb{R})$. If $A$ has $n$ linearly independent eigenvectors, then it is diagonalizable. In particular, if we denote by $\lambda_1, \dots, \lambda_s \in \mathbb{R}$ its eigenvalues (with $s \le n$) and by $d_{\lambda_1}, \dots, d_{\lambda_s}$ their geometric multiplicities, then:
\[ A \text{ is diagonalizable} \iff \sum_{i=1}^{s} d_{\lambda_i} = n. \]
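As a concrete illustration of Proposition 3, the following worked computation checks the criterion on a small symmetric matrix. The matrix $A$ and the symbol $P$ for the matrix of eigenvectors are our own choices for this example and do not come from the lecture.

```latex
\begin{align*}
A &= \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, &
\det(A - \lambda I_2) &= (2-\lambda)^2 - 1 = (\lambda - 1)(\lambda - 3), \\
% one eigenvector for each of the two distinct eigenvalues 1 and 3
E_1 &= \left\langle \begin{pmatrix} 1 \\ -1 \end{pmatrix} \right\rangle, &
E_3 &= \left\langle \begin{pmatrix} 1 \\ 1 \end{pmatrix} \right\rangle, \\
% the geometric multiplicities add up to n = 2, so A is diagonalizable
d_1 + d_3 &= 1 + 1 = 2 = n, & & \\
P &= \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}, &
P^{-1} A P &= \begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix} = D.
\end{align*}
```

Setting $C = P^{-1}$ recovers the form $A = C^{-1} D C$ used in the definition of diagonalizability.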
Unfortunately, it is not always possible to find $n$ linearly independent eigenvectors. For instance, consider:
\[ A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}. \]
This has only one eigenvalue $\lambda = 1$, with eigenspace $E_1 = \left\langle \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right\rangle$ [Exercise]; hence, it cannot be diagonalized.

Even though a diagonal form is not possible, one would like to find a simpler form for the matrix, as close as possible to a diagonal one. The Jordan canonical form provides this and is essentially the simplest possible representation after a similarity transformation. Moreover, this form gives a complete description of the eigenstructure of the matrix and has important implications in the study of dynamical systems and differential equations (see Lecture VIII).

2 Complex eigenvalues and eigenvectors

Note: We refer to §5.5 in the book (or any other complex analysis textbook) for a review of the complex numbers.

Let $A \in M_n(\mathbb{R})$ and consider its characteristic polynomial $P(\lambda) = \det(A - \lambda I_n)$. The main problem we pointed out at the end of the last section is that, in general, this polynomial might not have enough roots in $\mathbb{R}$ (for instance, $x^2 + 1$ has NO roots over the real numbers!).
Things change completely when we consider complex solutions (i.e., in $\mathbb{C}$). The Fundamental Theorem of Algebra states that any polynomial can be factorized into linear factors over $\mathbb{C}$, i.e., it has as many roots (each counted with its multiplicity) as its degree [this is sometimes expressed by saying that $\mathbb{C}$ is algebraically closed].

Let us consider not only the real roots of our characteristic polynomial, but also its complex ones. We need to notice that, since $P$ has real coefficients, if $\mu \in \mathbb{C}$ is a root, then $\overline{\mu}$ is also a root. In fact (using the properties of conjugation and the fact that the coefficients of $P$ are real, so that the polynomial $\overline{P}$ with conjugated coefficients coincides with $P$):
\[ 0 = P(\mu) = \overline{P(\mu)} = \overline{P}(\overline{\mu}) = P(\overline{\mu}). \]
Therefore, we can divide its roots into two categories:
$\lambda_1, \dots, \lambda_s$: real roots (possibly repeated);
$\mu_1, \overline{\mu}_1, \dots, \mu_r, \overline{\mu}_r$: non-real (pairs of) roots (possibly repeated);
with $n = s + 2r$.

We can consider $A$ as acting on $\mathbb{C}^n = \mathbb{R}^n \oplus i\mathbb{R}^n$ (instead of $\mathbb{R}^n$) and define complex eigenvectors analogously.

Definition. Let $A \in M_n(\mathbb{R})$ and $\mu \in \mathbb{C}$; we say that $\mu$ is an eigenvalue of $A$ if there exists a non-zero vector $z = x + iy \in \mathbb{C}^n$ such that $Az = \mu z$.

Observe that the eigenvectors corresponding to a pair of eigenvalues $\mu, \overline{\mu} \in \mathbb{C}$ are complex conjugate. In fact, if $Az = \mu z$, since $A$ has real entries, then $\overline{A} = A$ and:
\[ A\overline{z} = \overline{A}\,\overline{z} = \overline{Az} = \overline{\mu z} = \overline{\mu}\,\overline{z}. \]

Let us start by considering the following lemma.
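Lemma 1. Let $A \in M_2(\mathbb{R})$ have two non-real eigenvalues $\mu, \overline{\mu}$, where $\mu = a + ib$. Then there exists $C \in GL_2(\mathbb{R})$ such that
\[ C^{-1} A C = \begin{pmatrix} a & -b \\ b & a \end{pmatrix}. \]

Proof. Consider the complex (conjugate) eigenvectors $\varphi, \overline{\varphi} \in \mathbb{C}^2$, where $\varphi = u + iv$ with $u, v \in \mathbb{R}^2$. First of all, observe that $\varphi$ and $\overline{\varphi}$ are linearly independent in $\mathbb{C}^2$ (since they are eigenvectors with distinct eigenvalues; see Lecture IV) and consequently $u$ and $v$ are also linearly independent (as vectors of $\mathbb{R}^2$) [Exercise]. We know that:
\[ Au + iAv = A\varphi = \mu\varphi = (a + ib)(u + iv) = (au - bv) + i(av + bu), \]
and this implies:
\[ Au = (v \;\; u) \begin{pmatrix} -b \\ a \end{pmatrix}, \qquad Av = (v \;\; u) \begin{pmatrix} a \\ b \end{pmatrix}. \]
Consider the matrix $C = (v \;\; u)$ [with $v$ and $u$ placed in the columns]; $C \in GL_2(\mathbb{R})$ [since $v, u$ form a basis] and, from what is written above:
\[ AC = A(v \;\; u) = (Av \;\; Au) = (v \;\; u) \begin{pmatrix} a & -b \\ b & a \end{pmatrix} = C \begin{pmatrix} a & -b \\ b & a \end{pmatrix}, \]
or equivalently:
\[ C^{-1} A C = \begin{pmatrix} a & -b \\ b & a \end{pmatrix}. \]

This result immediately implies a generalization of Proposition 3.

Proposition 4. Let $A \in M_n(\mathbb{R})$. If $A$ has $n = r + 2s$ distinct eigenvalues $\lambda_1, \dots, \lambda_r$ (real ones) and $\mu_1, \overline{\mu}_1, \dots, \mu_s, \overline{\mu}_s$ (non-real ones) [with $\mu_j = a_j + i b_j$], then $A$ is similar to a matrix of the form (the empty entries are zeros):
\[
\begin{pmatrix}
\lambda_1 & & & & & & & \\
& \ddots & & & & & & \\
& & \lambda_r & & & & & \\
& & & a_1 & -b_1 & & & \\
& & & b_1 & a_1 & & & \\
& & & & & \ddots & & \\
& & & & & & a_s & -b_s \\
& & & & & & b_s & a_s
\end{pmatrix}.
\]

Proof. We proceed as in the previous lemma. Let $x_1, \dots, x_r \in \mathbb{R}^n$ be eigenvectors corresponding to $\lambda_1, \dots, \lambda_r$ (the real ones) and $\varphi_1, \overline{\varphi}_1, \dots, \varphi_s, \overline{\varphi}_s \in \mathbb{C}^n$ eigenvectors corresponding to $\mu_1, \overline{\mu}_1, \dots, \mu_s, \overline{\mu}_s$ (the non-real ones), with $\varphi_i = u_i + i v_i$ ($u_i, v_i \in \mathbb{R}^n$). It is sufficient to consider the matrix
\[ C = (x_1 \; \dots \; x_r \;\; v_1 \;\; u_1 \; \dots \; v_s \;\; u_s) \in GL_n(\mathbb{R}) \]
and show that $C^{-1} A C$ is in the above form.

Obviously, everything generalizes to the case in which the eigenvalues of $A$ are not all distinct, but one can nevertheless find a basis of eigenvectors (both for the real and for the non-real eigenspaces).

Our next goal is to find a canonical form for any matrix $A \in M_n(\mathbb{R})$.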
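To see the construction of Lemma 1 at work, here is a worked example on a matrix of our own choosing (it is not taken from the lecture). It follows the proof step by step: compute a complex eigenvector $\varphi = u + iv$, then use $v$ and $u$, in this order, as the columns of $C$.

```latex
\begin{align*}
A &= \begin{pmatrix} 1 & -2 \\ 1 & 3 \end{pmatrix}, \qquad
\det(A - \lambda I_2) = \lambda^2 - 4\lambda + 5, \qquad
\mu = 2 + i \;\; (a = 2,\; b = 1), \\
% complex eigenvector for \mu, split into real and imaginary parts
\varphi &= \begin{pmatrix} -1 + i \\ 1 \end{pmatrix}
= \underbrace{\begin{pmatrix} -1 \\ 1 \end{pmatrix}}_{u}
+ i \underbrace{\begin{pmatrix} 1 \\ 0 \end{pmatrix}}_{v}, \\
% check of the two real identities used in the proof
Au &= \begin{pmatrix} -3 \\ 2 \end{pmatrix} = au - bv, \qquad
Av = \begin{pmatrix} 1 \\ 1 \end{pmatrix} = av + bu, \\
% columns of C are v and u, in this order
C &= (v \;\; u) = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}, \qquad
C^{-1} = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \\
C^{-1} A C &= \begin{pmatrix} 2 & -1 \\ 1 & 2 \end{pmatrix}
= \begin{pmatrix} a & -b \\ b & a \end{pmatrix}.
\end{align*}
```

Swapping the columns of $C$ (taking $(u \;\; v)$ instead) would move the minus sign to the lower-left entry; the order $(v \;\; u)$ is the one that matches the statement of the lemma.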
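3 The Jordan Canonical Form

The main result of this section (stated without proof) provides a canonical form for any real square matrix.

Definition [Elementary Jordan block]. Let $\lambda \in \mathbb{C}$. The matrix
\[
J_{\lambda,r} =
\begin{pmatrix}
\lambda & 1 & 0 & \cdots & 0 \\
0 & \lambda & 1 & \cdots & 0 \\
\vdots & & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & \lambda & 1 \\
0 & 0 & \cdots & 0 & \lambda
\end{pmatrix}
\in M_r(\mathbb{C})
\]
is called the elementary Jordan block of order $r$ corresponding to the eigenvalue $\lambda$. If $r = 1$, then $J_{\lambda,1} = \lambda$.

Definition. A matrix $A \in M_n(\mathbb{C})$ is said to be in complex Jordan canonical form if it is made of elementary Jordan blocks (along the diagonal). Namely,
\[
\begin{pmatrix}
J_{\lambda_1,r_1} & & & \\
& J_{\lambda_2,r_2} & & \\
& & \ddots & \\
& & & J_{\lambda_s,r_s}
\end{pmatrix}.
\]
One example might be, for instance:
\[
\begin{pmatrix}
\lambda_1 & 1 & 0 \\
0 & \lambda_1 & 0 \\
0 & 0 & \lambda_2
\end{pmatrix};
\]
it is made of two blocks, $J_{\lambda_1,2}$ and $J_{\lambda_2,1}$.

One can prove the following theorem.

Theorem 1 (Complex Jordan canonical form). Let $A \in M_n(\mathbb{C})$. Then $A$ is similar to a matrix in Jordan canonical form, such that the $\lambda$'s that appear in the elementary Jordan blocks are the eigenvalues of $A$.

When we work with a real matrix, we would like to have blocks made of real numbers. In the spirit of Lemma 1, one can prove what follows.

Lemma 2. Let $\mu = a + ib$ (with $b \ne 0$) be an eigenvalue of a real matrix $A$. We know that $\overline{\mu} = a - ib$ is also an eigenvalue. Then $A$ is similar to a matrix in Jordan canonical form, where the blocks corresponding to the pair $\mu, \overline{\mu}$ have the form:
\[
\hat{J}_{\mu,s} =
\begin{pmatrix}
D & I & O & \cdots & O \\
O & D & I & \cdots & O \\
\vdots & & \ddots & \ddots & \vdots \\
O & O & \cdots & D & I \\
O & O & \cdots & O & D
\end{pmatrix}
\in M_{2s}(\mathbb{R}),
\]
where
\[
D = \begin{pmatrix} a & -b \\ b & a \end{pmatrix}, \qquad
I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \qquad \text{and} \qquad
O = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}.
\]
If $s = 1$, then $\hat{J}_{\mu,1} = D$.

Finally, we can state our main result for real matrices.

Definition. A matrix $A \in M_n(\mathbb{R})$ is said to be in real Jordan canonical form if it is made of elementary Jordan blocks of the forms $J_{\lambda,r}$, with $\lambda \in \mathbb{R}$, and $\hat{J}_{\mu,s}$, with $\mu = a + ib \in \mathbb{C}$ ($b \ne 0$).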
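For concreteness, the real block $\hat{J}_{\mu,s}$ of Lemma 2 can be written out entry by entry in the smallest non-trivial case $s = 2$. The numerical instance $\mu = 1 + 2i$ is our own choice for illustration.

```latex
\begin{align*}
% general shape for s = 2: two copies of D on the diagonal, one identity block above
\hat{J}_{\mu,2} &=
\begin{pmatrix} D & I \\ O & D \end{pmatrix} =
\begin{pmatrix}
a & -b & 1 & 0 \\
b & a & 0 & 1 \\
0 & 0 & a & -b \\
0 & 0 & b & a
\end{pmatrix} \in M_4(\mathbb{R}), \\
% instance with a = 1, b = 2
\hat{J}_{1+2i,\,2} &=
\begin{pmatrix}
1 & -2 & 1 & 0 \\
2 & 1 & 0 & 1 \\
0 & 0 & 1 & -2 \\
0 & 0 & 2 & 1
\end{pmatrix}, \\
% block upper triangular, so the determinant is the product of the diagonal blocks
\det\bigl(\hat{J}_{\mu,2} - \lambda I_4\bigr) &= \bigl((a - \lambda)^2 + b^2\bigr)^2 .
\end{align*}
```

The last line shows that $\mu$ and $\overline{\mu}$ each appear with algebraic multiplicity 2, which is why a block of order $s$ for the pair occupies $2s$ rows and columns.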
Theorem 2 (Real Jordan canonical form). Let $A \in M_n(\mathbb{R})$. Then $A$ is similar to a matrix in real Jordan canonical form, made of elementary Jordan blocks of the form $J_{\lambda,r}$, corresponding to real eigenvalues $\lambda$, and of elementary Jordan blocks of the form $\hat{J}_{\mu,r}$, corresponding to (non-real) pairs of complex conjugate eigenvalues $\mu, \overline{\mu}$.

The proof of these results goes beyond the scope of this course. However, what one definitely needs is a description of the algorithm for finding the matrix $C \in GL_n(\mathbb{R})$ such that $C^{-1} A C$ is in this form. Unfortunately, the general algorithm might sound quite complicated (more than it actually is) because of the notation, and it would require some preparatory lemmata. We will try to provide a complete description of it, restricting ourselves to the two-dimensional and three-dimensional cases (i.e., $n = 2$ and $n = 3$). This will be enough to get an idea of how it works and for future applications to (planar) dynamical systems (see Lecture VIII).

Case n=2 (two-by-two real matrices)

Let us consider $A \in M_2(\mathbb{R})$ and its characteristic polynomial $p(\lambda) = \det(A - \lambda I_2)$. This is a quadratic polynomial with real coefficients. We already know that there are always two solutions (possibly coinciding) in $\mathbb{C}$ and that, if there is one non-real solution, then its conjugate is also a solution. Therefore the only possibilities are:
a) two distinct real roots: $\lambda_1 \ne \lambda_2 \in \mathbb{R}$;
b) two complex conjugate roots: $\mu, \overline{\mu} \in \mathbb{C}$, where $\mu = a + ib$ ($b \ne 0$);
c) only one real solution (with algebraic multiplicity 2): $\lambda \in \mathbb{R}$.
There are no other possibilities (this is a peculiarity of the two-dimensional case). Let us analyze these cases one by one.

a) In this case, we already know that the matrix is diagonalizable. In fact, we can find two eigenvectors $v_1$ and $v_2$ (corresponding respectively to $\lambda_1$ and $\lambda_2$):
\[ Av_1 = \lambda_1 v_1 \quad \text{and} \quad Av_2 = \lambda_2 v_2; \]
we have already proved in Lecture IV that they are linearly independent. Consider the matrix $C = (v_1 \;\; v_2) \in GL_2(\mathbb{R})$ (this is invertible because $v_1, v_2$ form a basis); one has:
\[ AC = A(v_1 \;\; v_2) = (Av_1 \;\; Av_2) = (\lambda_1 v_1 \;\; \lambda_2 v_2) = (v_1 \;\; v_2) \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} = C \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}. \]
The corresponding Jordan form is:
\[ \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}, \]
made of two elementary Jordan blocks $J_{\lambda_1,1}$ and $J_{\lambda_2,1}$.

b) This is exactly the content of Lemma 1. We can find two complex (conjugate) eigenvectors $\varphi, \overline{\varphi} \in \mathbb{C}^2$, where $\varphi = u + iv$ with $u, v \in \mathbb{R}^2$. As already observed, $u$ and $v$ are linearly independent, and if we consider the matrix $C = (v \;\; u)$, then $C \in GL_2(\mathbb{R})$ [since $v, u$ form a basis] and
\[ C^{-1} A C = \begin{pmatrix} a & -b \\ b & a \end{pmatrix}. \]
In this case, the Jordan form is:
\[ \begin{pmatrix} a & -b \\ b & a \end{pmatrix}, \]
made of one elementary Jordan block $\hat{J}_{\mu,1}$.

c) In this case we have only one eigenvalue $\lambda$, with algebraic multiplicity $h_\lambda = 2$. We can consider the associated eigenspace $E_\lambda = \{ x \in \mathbb{R}^2 : Ax = \lambda x \}$. We need to distinguish two cases, according to the dimension $d_\lambda$ of this eigenspace (namely, the geometric multiplicity of $\lambda$) [$1 \le d_\lambda \le 2$].

c1) If $d_\lambda = 2$, then one can find two linearly independent eigenvectors $v_1$ and $v_2$, which form a basis, and proceed as in case a) [with $\lambda_1 = \lambda_2 = \lambda$]. Therefore, there exists $C = (v_1 \;\; v_2) \in GL_2(\mathbb{R})$ such that:
\[ C^{-1} A C = \begin{pmatrix} \lambda & 0 \\ 0 & \lambda \end{pmatrix}. \]
The corresponding Jordan form is:
\[ \begin{pmatrix} \lambda & 0 \\ 0 & \lambda \end{pmatrix}, \]
made of two equal elementary Jordan blocks $J_{\lambda,1}$.

c2) If $d_\lambda = 1$, then $E_\lambda$ is spanned by only one vector $v_1$. Therefore, it is not possible to find a basis of eigenvectors. Remember that $E_\lambda = \mathrm{Null}(A - \lambda I_2)$; instead, one can consider $\mathrm{Null}\bigl((A - \lambda I_2)^2\bigr)$ (it is called the generalized eigenspace). It is possible to show that (the inclusion is easy, while the equality is less trivial):
\[ E_\lambda \subsetneq \mathrm{Null}\bigl((A - \lambda I_2)^2\bigr) = \mathbb{R}^2. \]
Therefore, one can find a basis for this space, rather than for $E_\lambda$. One vector of this basis will be $v_1$ (the eigenvector) and the other (which we denote by $v_2$) will be chosen in order to satisfy:
\[ (A - \lambda I_2) v_2 = v_1. \]
Let us show that such a vector always exists. In fact, take any vector $v \notin E_\lambda$ (i.e., $(A - \lambda I_2)v \ne 0$). We know from above that $(A - \lambda I_2)^2 v = 0$, therefore:
\[ 0 = (A - \lambda I_2)^2 v = (A - \lambda I_2)(Av - \lambda v) \;\Longrightarrow\; (Av - \lambda v) \in E_\lambda. \]
Hence, $Av - \lambda v = c v_1$, where $c \ne 0$. If we take $v_2 = \frac{1}{c} v$, we have found the desired vector!

Now, if we define the matrix $C = (v_1 \;\; v_2)$, then $C \in GL_2(\mathbb{R})$ [since $v_1, v_2$ form a basis]; moreover:
\[ AC = A(v_1 \;\; v_2) = (Av_1 \;\; Av_2) = (\lambda v_1 \;\; v_1 + \lambda v_2) = (v_1 \;\; v_2) \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix} = C \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}. \]
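The procedure of case c2) can be traced on a concrete matrix. The example below is our own (not from the lecture); it follows the steps in the text: find the eigenvector $v_1$, pick any $v \notin E_\lambda$, read off the constant $c$, and assemble $C$.

```latex
\begin{align*}
A &= \begin{pmatrix} 3 & 1 \\ -1 & 1 \end{pmatrix}, \qquad
\det(A - \lambda I_2) = \lambda^2 - 4\lambda + 4 = (\lambda - 2)^2, \qquad \lambda = 2, \\
% the eigenspace is one-dimensional, so d_\lambda = 1
A - 2 I_2 &= \begin{pmatrix} 1 & 1 \\ -1 & -1 \end{pmatrix}, \qquad
E_2 = \left\langle v_1 \right\rangle, \quad v_1 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \\
% pick v outside E_2 and apply (A - 2I): the result is c v_1 with c = 1
v &= \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad
(A - 2 I_2) v = \begin{pmatrix} 1 \\ -1 \end{pmatrix} = 1 \cdot v_1
\;\Longrightarrow\; c = 1, \quad v_2 = v, \\
C &= (v_1 \;\; v_2) = \begin{pmatrix} 1 & 1 \\ -1 & 0 \end{pmatrix}, \qquad
C^{-1} = \begin{pmatrix} 0 & -1 \\ 1 & 1 \end{pmatrix}, \\
C^{-1} A C &= \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix} = J_{2,2}.
\end{align*}
```

A different choice of $v$ would change $c$ and $v_2$, but the resulting block $J_{2,2}$ would be the same.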
In case c2), the corresponding Jordan form consists of only one elementary Jordan block $J_{\lambda,2}$:
\[ \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}. \]

Case n=3 (three-by-three real matrices)

Let us consider $A \in M_3(\mathbb{R})$ and its characteristic polynomial [Exercise]:
\[ p(\lambda) = \det(A - \lambda I_3) = -\lambda^3 + \mathrm{Tr}(A)\lambda^2 - \alpha\lambda + \det(A), \]
where $\alpha \in \mathbb{R}$. Usually $\mathrm{Tr}(A) := \mathrm{Trace}(A)$ and $\det(A)$ are not difficult to find; therefore, one only needs to determine $\alpha$. This can be done, for instance, by choosing any $c \ne 0$ and computing $\det(A - cI_3)$, which gives an equation in $\alpha$:
\[ -\alpha c + \bigl(-c^3 + \mathrm{Tr}(A)c^2 + \det(A)\bigr) = \det(A - cI_3). \]

Now that we know how to compute this polynomial (efficiently), we can start our analysis. This is a cubic polynomial with real coefficients. We already know that there are always three solutions (possibly coinciding) in $\mathbb{C}$ and that, if there is one non-real solution, then its conjugate is also a solution. Moreover, as a peculiarity of the cubic case, there is always at least one real solution. This is quite easy to explain. In fact, if one regards the polynomial $P(\lambda)$ as a function on $\mathbb{R}$, then
\[ \lim_{\lambda \to -\infty} P(\lambda) = +\infty \quad \text{and} \quad \lim_{\lambda \to +\infty} P(\lambda) = -\infty \]
[the signs depend on the sign of the leading term; in our specific case, $-\lambda^3$]. By continuity (or by the Intermediate Value Theorem), there must be a point $\lambda_0$ where the function crosses the real axis, i.e., a solution of $P(\lambda) = 0$. Another plausible explanation is that there cannot be three non-real roots, since every time there is a non-real root, its complex conjugate is also a root; therefore, the number of non-real solutions is always even.

We need to distinguish several cases, depending on whether there are three real solutions, two real solutions or one real solution:
a) three distinct real roots: $\lambda_1, \lambda_2, \lambda_3 \in \mathbb{R}$;
b) two distinct real roots: $\lambda_1, \lambda_2 \in \mathbb{R}$;
c) only one real solution (with algebraic multiplicity 3): $\lambda_0 \in \mathbb{R}$;
d) only one real solution and a pair of non-real conjugate ones: $\lambda_0 \in \mathbb{R}$ and $\mu, \overline{\mu} \in \mathbb{C}$, where $\mu = a + ib$ ($b \ne 0$).
Let us analyze these cases one by one.

a) In this case, we already know that the matrix is diagonalizable. In fact, we can find three eigenvectors $v_1$, $v_2$ and $v_3$ (corresponding respectively to $\lambda_1$, $\lambda_2$ and $\lambda_3$):
\[ Av_1 = \lambda_1 v_1, \quad Av_2 = \lambda_2 v_2 \quad \text{and} \quad Av_3 = \lambda_3 v_3; \]
we have already proved in Lecture IV that they are linearly independent. Consider the matrix $C = (v_1 \;\; v_2 \;\; v_3) \in GL_3(\mathbb{R})$ (this is invertible because $v_1, v_2, v_3$ form a basis); one has:
\[
AC = A(v_1 \;\; v_2 \;\; v_3) = (Av_1 \;\; Av_2 \;\; Av_3) = (\lambda_1 v_1 \;\; \lambda_2 v_2 \;\; \lambda_3 v_3)
= (v_1 \;\; v_2 \;\; v_3)
\begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix}
= C \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix}.
\]
The corresponding Jordan form is:
\[ \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix}, \]
made of three elementary Jordan blocks: $J_{\lambda_1,1}$, $J_{\lambda_2,1}$ and $J_{\lambda_3,1}$.

b) Since there are only two distinct real roots, necessarily one of them must have algebraic multiplicity equal to 2 (assume, without loss of generality, that this root is $\lambda_1$); namely, the characteristic polynomial can be factorized as follows:
\[ P(\lambda) = -(\lambda - \lambda_1)^2 (\lambda - \lambda_2). \]
Let us denote by $E_{\lambda_1}$ and $E_{\lambda_2}$ the two eigenspaces. We can easily deduce that $\dim E_{\lambda_2} = 1$ (since the algebraic multiplicity is 1); let us call $v_2$ an eigenvector that spans this space (it forms a basis): $E_{\lambda_2} = \langle v_2 \rangle$. What can we say about $E_{\lambda_1}$? We need to distinguish two cases, according to whether its dimension $d_{\lambda_1}$ is 1 or 2.

b1) If $d_{\lambda_1} = 2$, then we can find two linearly independent eigenvectors $u_1$ and $u_2$, which form a basis for $E_{\lambda_1}$, and proceed as in case a) [with the eigenvalues $\lambda_1, \lambda_1, \lambda_2$ in place of $\lambda_1, \lambda_2, \lambda_3$]. Therefore, there exists $C = (u_1 \;\; u_2 \;\; v_2) \in GL_3(\mathbb{R})$ such that:
\[ C^{-1} A C = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_1 & 0 \\ 0 & 0 & \lambda_2 \end{pmatrix}. \]
The corresponding Jordan form is:
\[ \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_1 & 0 \\ 0 & 0 & \lambda_2 \end{pmatrix}, \]
made of three elementary Jordan blocks: $J_{\lambda_1,1}$, $J_{\lambda_1,1}$ and $J_{\lambda_2,1}$.

b2) If $d_{\lambda_1} = 1$, then $E_{\lambda_1}$ is spanned by only one vector $u_1$. Therefore, it is not possible to find a basis of eigenvectors. We proceed as in case c2) of $n = 2$. Remember that $E_{\lambda_1} = \mathrm{Null}(A - \lambda_1 I_3)$; instead, one can consider $\mathrm{Null}\bigl((A - \lambda_1 I_3)^2\bigr)$ (it is called the generalized eigenspace). It is possible to show that this generalized eigenspace has dimension 2 and:
\[ E_{\lambda_1} \subsetneq \mathrm{Null}\bigl((A - \lambda_1 I_3)^2\bigr). \]
Therefore, one can find a basis for this space, rather than for $E_{\lambda_1}$. One vector of this basis will be $u_1$ (the eigenvector) and the other (which we denote by $u_2$) will be chosen in order to satisfy:
\[ (A - \lambda_1 I_3) u_2 = u_1. \]
Let us show that such a vector always exists. In fact, take any vector $v$ in the generalized eigenspace with $v \notin E_{\lambda_1}$ (i.e., $(A - \lambda_1 I_3)v \ne 0$). Since $(A - \lambda_1 I_3)^2 v = 0$, we have:
\[ 0 = (A - \lambda_1 I_3)^2 v = (A - \lambda_1 I_3)(Av - \lambda_1 v) \;\Longrightarrow\; (Av - \lambda_1 v) \in E_{\lambda_1}. \]
Hence, $Av - \lambda_1 v = c u_1$, where $c \ne 0$. If we take $u_2 = \frac{1}{c} v$, we have found the desired vector!
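Now, if we define the matrix $C = (u_1 \;\; u_2 \;\; v_2)$, then $C \in GL_3(\mathbb{R})$ [since $u_1, u_2, v_2$ form a basis]; moreover:
\[
AC = A(u_1 \;\; u_2 \;\; v_2) = (Au_1 \;\; Au_2 \;\; Av_2) = (\lambda_1 u_1 \;\; u_1 + \lambda_1 u_2 \;\; \lambda_2 v_2)
= (u_1 \;\; u_2 \;\; v_2)
\begin{pmatrix} \lambda_1 & 1 & 0 \\ 0 & \lambda_1 & 0 \\ 0 & 0 & \lambda_2 \end{pmatrix}
= C \begin{pmatrix} \lambda_1 & 1 & 0 \\ 0 & \lambda_1 & 0 \\ 0 & 0 & \lambda_2 \end{pmatrix}.
\]
The corresponding Jordan form,
\[ \begin{pmatrix} \lambda_1 & 1 & 0 \\ 0 & \lambda_1 & 0 \\ 0 & 0 & \lambda_2 \end{pmatrix}, \]
consists of two elementary Jordan blocks $J_{\lambda_1,2}$ and $J_{\lambda_2,1}$.

c) In this case, the characteristic polynomial can be factorized as follows:
\[ P(\lambda) = -(\lambda - \lambda_0)^3. \]
Let us denote by $E_{\lambda_0}$ the eigenspace and by $d_{\lambda_0}$ its dimension. From what we have already seen (see Lecture IV), $1 \le d_{\lambda_0} \le 3$. We need to distinguish three cases, according to whether $d_{\lambda_0}$ is 1, 2 or 3.

c1) If $d_{\lambda_0} = 3$, then we can find three linearly independent eigenvectors $v_1, v_2, v_3$, which form a basis for $E_{\lambda_0}$, and proceed as in case a) [with $\lambda_1 = \lambda_2 = \lambda_3 = \lambda_0$]. Therefore, there exists $C = (v_1 \;\; v_2 \;\; v_3) \in GL_3(\mathbb{R})$ such that:
\[ C^{-1} A C = \begin{pmatrix} \lambda_0 & 0 & 0 \\ 0 & \lambda_0 & 0 \\ 0 & 0 & \lambda_0 \end{pmatrix}. \]
The corresponding Jordan form is:
\[ \begin{pmatrix} \lambda_0 & 0 & 0 \\ 0 & \lambda_0 & 0 \\ 0 & 0 & \lambda_0 \end{pmatrix}, \]
made of three elementary Jordan blocks: $J_{\lambda_0,1}$, $J_{\lambda_0,1}$ and $J_{\lambda_0,1}$.

c2) If $d_{\lambda_0} = 2$, let us denote by $\{w_1, w_2\}$ a basis for $E_{\lambda_0}$. We would like to complete it to a basis for $\mathbb{R}^3$; but obviously, this cannot be done using eigenvectors only. Therefore, we have to consider, in the same spirit as in case b2), generalized eigenvectors. It is possible to show that (the inclusion is easy, the equality less trivial):
\[ E_{\lambda_0} \subsetneq \mathrm{Null}\bigl((A - \lambda_0 I_3)^2\bigr) = \mathbb{R}^3. \]
Therefore, one can find a basis for this space, rather than for $E_{\lambda_0}$. Let us proceed as follows. Choose a vector $v \notin E_{\lambda_0}$. Since $\mathrm{Null}\bigl((A - \lambda_0 I_3)^2\bigr) = \mathbb{R}^3$, we have $v \in \mathrm{Null}\bigl((A - \lambda_0 I_3)^2\bigr)$. Hence,
\[ 0 = (A - \lambda_0 I_3)^2 v = (A - \lambda_0 I_3)(Av - \lambda_0 v). \]
We denote $v_2 = Av - \lambda_0 v$ (hence, $Av = v_2 + \lambda_0 v$). It is easy to verify that $v_2 \in E_{\lambda_0}$ and that $v_2 \ne 0$ (because $v \notin E_{\lambda_0}$). We can now take any other vector $v_1$ in $E_{\lambda_0}$ that is linearly independent of $v_2$ (since $d_{\lambda_0} = 2$, this is always possible).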
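The three vectors $\{v_1, v_2, v\}$ form a basis of $\mathbb{R}^3$ (they are linearly independent by construction). Now, if we define the matrix $C = (v_1 \;\; v_2 \;\; v)$, then $C \in GL_3(\mathbb{R})$ [since $v_1, v_2, v$ form a basis]; moreover:
\[
AC = A(v_1 \;\; v_2 \;\; v) = (Av_1 \;\; Av_2 \;\; Av) = (\lambda_0 v_1 \;\; \lambda_0 v_2 \;\; v_2 + \lambda_0 v)
= (v_1 \;\; v_2 \;\; v)
\begin{pmatrix} \lambda_0 & 0 & 0 \\ 0 & \lambda_0 & 1 \\ 0 & 0 & \lambda_0 \end{pmatrix}
= C \begin{pmatrix} \lambda_0 & 0 & 0 \\ 0 & \lambda_0 & 1 \\ 0 & 0 & \lambda_0 \end{pmatrix}.
\]
The corresponding Jordan form,
\[ \begin{pmatrix} \lambda_0 & 0 & 0 \\ 0 & \lambda_0 & 1 \\ 0 & 0 & \lambda_0 \end{pmatrix}, \]
consists of two elementary Jordan blocks $J_{\lambda_0,1}$ and $J_{\lambda_0,2}$.

c3) If $d_{\lambda_0} = 1$, then $E_{\lambda_0}$ has dimension 1; suppose that it is spanned by a vector $v_1$ (it is a basis). We would like to complete it to a basis for $\mathbb{R}^3$; but obviously, this cannot be done using eigenvectors only. Therefore, we have to consider, in the same spirit as in case b2), generalized eigenvectors. It is possible to show that (the two inclusions are easy, the final equality less trivial):
\[ E_{\lambda_0} \subsetneq \mathrm{Null}\bigl((A - \lambda_0 I_3)^2\bigr) \subsetneq \mathrm{Null}\bigl((A - \lambda_0 I_3)^3\bigr) = \mathbb{R}^3. \]
In particular, one can show that $\dim \mathrm{Null}\bigl((A - \lambda_0 I_3)^2\bigr) = 2$. Let us choose a vector $v_3 \notin \mathrm{Null}\bigl((A - \lambda_0 I_3)^2\bigr)$. Define (redefining $v_1$, which is again an eigenvector):
\begin{align*}
& v_3 \notin \mathrm{Null}\bigl((A - \lambda_0 I_3)^2\bigr), \\
& v_2 = (A - \lambda_0 I_3) v_3, \\
& v_1 = (A - \lambda_0 I_3)^2 v_3 = (A - \lambda_0 I_3) v_2.
\end{align*}
In particular, these three vectors are linearly independent and (using that $(A - \lambda_0 I_3) v_1 = (A - \lambda_0 I_3)^3 v_3 = 0$):
\begin{align*}
(A - \lambda_0 I_3) v_3 = v_2 &\;\Longrightarrow\; Av_3 = v_2 + \lambda_0 v_3, \\
(A - \lambda_0 I_3) v_2 = v_1 &\;\Longrightarrow\; Av_2 = v_1 + \lambda_0 v_2, \\
(A - \lambda_0 I_3) v_1 = 0 &\;\Longrightarrow\; Av_1 = \lambda_0 v_1 \quad (v_1 \in E_{\lambda_0}).
\end{align*}
The three vectors $\{v_1, v_2, v_3\}$ form a basis of $\mathbb{R}^3$ and, if we define the matrix $C = (v_1 \;\; v_2 \;\; v_3)$, then $C \in GL_3(\mathbb{R})$ [since $v_1, v_2, v_3$ form a basis]; moreover:
\[
AC = A(v_1 \;\; v_2 \;\; v_3) = (Av_1 \;\; Av_2 \;\; Av_3) = (\lambda_0 v_1 \;\; v_1 + \lambda_0 v_2 \;\; v_2 + \lambda_0 v_3)
= (v_1 \;\; v_2 \;\; v_3)
\begin{pmatrix} \lambda_0 & 1 & 0 \\ 0 & \lambda_0 & 1 \\ 0 & 0 & \lambda_0 \end{pmatrix}
= C \begin{pmatrix} \lambda_0 & 1 & 0 \\ 0 & \lambda_0 & 1 \\ 0 & 0 & \lambda_0 \end{pmatrix}.
\]
The corresponding Jordan form,
\[ \begin{pmatrix} \lambda_0 & 1 & 0 \\ 0 & \lambda_0 & 1 \\ 0 & 0 & \lambda_0 \end{pmatrix}, \]
consists of only one elementary Jordan block: $J_{\lambda_0,3}$.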
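The chain construction of case c3) can be checked on a small example. The matrix below is our own choice (it is not from the lecture), and the shorthand $N = A - \lambda_0 I_3$ is introduced only to keep the formulas short.

```latex
\begin{align*}
A &= \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
\lambda_0 = 1, \qquad
N = A - I_3 = \begin{pmatrix} 0 & 1 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \\
% powers of N: N^2 is non-zero, N^3 vanishes
N^2 &= \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad N^3 = 0, \\
% nested null spaces of dimensions 1, 2, 3
E_1 &= \mathrm{Null}(N) = \langle e_1 \rangle, \qquad
\mathrm{Null}(N^2) = \langle e_1, e_2 \rangle, \qquad
\mathrm{Null}(N^3) = \mathbb{R}^3, \\
% start the chain from a vector outside Null(N^2)
v_3 &= \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}, \qquad
v_2 = N v_3 = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}, \qquad
v_1 = N v_2 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \\
C &= (v_1 \;\; v_2 \;\; v_3) = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
C^{-1} A C = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix} = J_{1,3}.
\end{align*}
```

One can verify the last identity column by column: $Av_1 = v_1$, $Av_2 = v_1 + v_2$ and $Av_3 = v_2 + v_3$, exactly as in the general computation.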
d) In this case, the characteristic polynomial can be factorized as follows:
\[ P(\lambda) = -(\lambda - \lambda_0) Q(\lambda), \]
where $Q(\lambda)$ is a quadratic polynomial with no real roots (i.e., its discriminant is negative) [Exercise: one can check that $Q(\lambda) = \lambda^2 - 2a\lambda + (a^2 + b^2)$]. Let us denote by $E_{\lambda_0}$ the eigenspace corresponding to the real eigenvalue $\lambda_0$; its dimension is $d_{\lambda_0} = 1$, therefore it is spanned by only one vector $v_1$.

Now, let us consider the non-real part. We proceed as in case b) [$n = 2$] (see also Lemma 1). We can find two complex (conjugate) eigenvectors $\varphi, \overline{\varphi} \in \mathbb{C}^3$, where $\varphi = u + iv$ with $u, v \in \mathbb{R}^3$. As already observed, $u$ and $v$ are linearly independent. Moreover, $v_1, v, u$ are also linearly independent [Exercise] and they form a basis for $\mathbb{R}^3$.

Consider the matrix $C = (v_1 \;\; v \;\; u)$; then $C \in GL_3(\mathbb{R})$ [since $v_1, v, u$ form a basis]. One can check that
\[ C^{-1} A C = \begin{pmatrix} \lambda_0 & 0 & 0 \\ 0 & a & -b \\ 0 & b & a \end{pmatrix} \]
(this is essentially due to the fact that $A$ leaves invariant both the subspace corresponding to the non-real eigenvectors and the one corresponding to the real ones).

In this case, the Jordan form is:
\[ \begin{pmatrix} \lambda_0 & 0 & 0 \\ 0 & a & -b \\ 0 & b & a \end{pmatrix}, \]
made of two elementary Jordan blocks $J_{\lambda_0,1}$ and $\hat{J}_{\mu,1}$.

Department of Mathematics, Princeton University
E-mail address: asorrent@math.princeton.edu