Cayley-Hamilton Theorem
Massoud Malek

In all that follows, the n × n identity matrix is denoted by I_n, the n × n zero matrix by Z_n, and the zero vector by θ_n. Let A be an n × n matrix. Although det(λ I_n − A) is the proper definition of the characteristic polynomial of A, it is sometimes convenient to define it as

    K_A(λ) = det(A − λ I_n) = p_0 + p_1 λ + p_2 λ² + ··· + p_{n−1} λ^{n−1} + p_n λ^n.

Cayley-Hamilton Theorem. For any n × n matrix A, the matrix polynomial

    K_A(A) = p_0 I_n + p_1 A + p_2 A² + ··· + p_{n−1} A^{n−1} + p_n A^n = Z_n.

Proof. The (i, j) minor of A is the determinant of the matrix obtained from A by deleting its i-th row and j-th column; it is denoted by A(i, j). The adjoint (adjugate) of the matrix A = (a_ij) is the n × n matrix

    B = [ +A(1,1)              −A(2,1)              ···   (−1)^{1+n} A(n,1)
          −A(1,2)              +A(2,2)              ···   (−1)^{2+n} A(n,2)
             ⋮                    ⋮                            ⋮
          (−1)^{n+1} A(1,n)    (−1)^{n+2} A(2,n)    ···   +A(n,n)           ].

The matrix B satisfies the condition B A = det(A) I_n. The adjoint of the matrix

    A_λ = A − λ I_n = [ a_11 − λ   a_12       ···   a_1n
                        a_21       a_22 − λ   ···   a_2n
                          ⋮           ⋮                ⋮
                        a_n1       a_n2       ···   a_nn − λ ]

is

    B(λ) = [ +A_λ(1,1)              −A_λ(2,1)     ···   (−1)^{1+n} A_λ(n,1)
             −A_λ(1,2)              +A_λ(2,2)     ···   (−1)^{2+n} A_λ(n,2)
                ⋮                      ⋮                      ⋮
             (−1)^{n+1} A_λ(1,n)       ···              +A_λ(n,n)           ],

where each entry is a polynomial of degree less than or equal to n − 1. Thus

    B(λ) = B_0 + B_1 λ + B_2 λ² + ··· + B_{n−1} λ^{n−1},

where each B_k (k = 0, 1, …, n − 1) is an n × n matrix with constant entries.
By the identity B A = det(A) I_n applied to A_λ, we have

    B(λ)(A − λ I_n) = det(A − λ I_n) I_n = (p_0 + p_1 λ + p_2 λ² + ··· + p_{n−1} λ^{n−1} + p_n λ^n) I_n.

On the other hand, we have

    B(λ)(A − λ I_n) = B_0 A + λ(B_1 A − B_0) + λ²(B_2 A − B_1) + ··· + λ^{n−1}(B_{n−1} A − B_{n−2}) − λ^n B_{n−1}.

Comparing the coefficients of each power of λ, we obtain

    B_0 A                 = p_0 I_n,
    B_1 A − B_0           = p_1 I_n,
    B_2 A − B_1           = p_2 I_n,
       ⋮
    B_{n−1} A − B_{n−2}   = p_{n−1} I_n,
    −B_{n−1}              = p_n I_n.

Multiplying the k-th equation (k = 0, 1, …, n) on the right by A^k, we obtain:

    B_0 A                       = p_0 I_n,
    B_1 A² − B_0 A              = p_1 A,
    B_2 A³ − B_1 A²             = p_2 A²,
       ⋮
    B_{n−1} A^n − B_{n−2} A^{n−1} = p_{n−1} A^{n−1},
    −B_{n−1} A^n                = p_n A^n.

By adding all the equations, the left side telescopes to the zero matrix, and the right side is K_A(A). Hence

    K_A(A) = p_0 I_n + p_1 A + p_2 A² + ··· + p_{n−1} A^{n−1} + p_n A^n = Z_n.

Note. An n × n matrix A is diagonalizable if it has n linearly independent eigenvectors; in particular, if A has n different eigenvalues, then it is diagonalizable. A real matrix A is said to be normal if Aᵗ A = A Aᵗ; in the case of complex matrices, normality means A* A = A A*. Examples of real normal matrices are symmetric (Aᵗ = A) and skew-symmetric (Aᵗ = −A) matrices; in the complex case, hermitian (A* = A) and skew-hermitian (A* = −A) matrices are normal. Every normal matrix is diagonalizable, and so is any matrix similar to a normal matrix.

There is a much shorter proof of the Cayley-Hamilton theorem if A is diagonalizable, i.e., if A has n linearly independent eigenvectors.

Cayley-Hamilton Theorem for Diagonalizable Matrices. If the n × n matrix A has n linearly independent eigenvectors, then

    K_A(A) = p_0 I_n + p_1 A + p_2 A² + ··· + p_{n−1} A^{n−1} + p_n A^n = Z_n.

Proof. Let {u_1, u_2, …, u_n} be n linearly independent eigenvectors of A associated with the eigenvalues λ_1, λ_2, …, λ_n, and let v = α_1 u_1 + α_2 u_2 + ··· + α_n u_n. Since the λ_i are roots of the characteristic polynomial K_A(λ), it follows that K_A(λ_i) = 0 for all i = 1, 2, …, n. Hence

    K_A(A) v = Σ_{i=1}^n α_i K_A(A) u_i = Σ_{i=1}^n α_i K_A(λ_i) u_i = Σ_{i=1}^n α_i (0 · u_i) = θ_n.
Since any vector v may be expressed as a linear combination of the eigenvectors u_i, we conclude that K_A(A) = Z_n.

Corollary. Let K_A(λ) = λ^n + p_1 λ^{n−1} + ··· + p_{n−1} λ + p_n be the characteristic polynomial of the n × n invertible matrix A. Then

    A^{−1} = −(1/p_n) [ A^{n−1} + p_1 A^{n−2} + ··· + p_{n−2} A + p_{n−1} I_n ].

Proof. According to the Cayley-Hamilton theorem, we have

    A [ A^{n−1} + p_1 A^{n−2} + ··· + p_{n−1} I_n ] = −p_n I_n.

Since A is nonsingular, p_n = (−1)^n det(A) ≠ 0; thus the result follows.

Inverse of Triangular Matrices

Given a lower triangular matrix L = (l_ij), we denote by D the diagonal part of L and by L̂ its strictly lower triangular part. Hence

    L = D + L̂,   where D = diag(l_11, l_22, …, l_nn) and L̂ has entries l_ij for i > j and zeros elsewhere.

If L is nonsingular, then

    D^{−1} = diag(1/l_11, 1/l_22, …, 1/l_nn),

and the matrix

    L_1 = D^{−1} L = D^{−1}[D + L̂] = I_n + D^{−1} L̂ = [ 1           0    ···           0
                                                         l_21/l_22   1    ···           0
                                                            ⋮                           ⋮
                                                         l_n1/l_nn   ···  l_{n,n−1}/l_nn  1 ].

Note that L = D L_1; hence L^{−1} = L_1^{−1} D^{−1}. Since all the eigenvalues of the matrix L_1 are equal to 1, the characteristic polynomial of L_1 is

    K_{L_1}(λ) = (λ − 1)^n = Σ_{k=0}^n (−1)^k C(n,k) λ^{n−k}
               = λ^n − C(n,1) λ^{n−1} + C(n,2) λ^{n−2} − ··· + (−1)^k C(n,k) λ^{n−k} + ··· + (−1)^{n−1} C(n,n−1) λ + (−1)^n,

where C(n,k) denotes the binomial coefficient n!/[k!(n−k)!].
Now according to the Cayley-Hamilton theorem, K_{L_1}(L_1) = Z_n; multiplying through by L_1^{−1}, we obtain

    L_1^{−1} = (−1)^{n+1} [ L_1^{n−1} − C(n,1) L_1^{n−2} + ··· + (−1)^k C(n,k) L_1^{n−1−k} + ··· + (−1)^{n−1} C(n,n−1) I_n ].

Similarly, we can find the inverse of an upper triangular matrix U = D + Û by first finding

    U_1 = D^{−1} U = I_n + D^{−1} Û

and then using the following formula:

    U_1^{−1} = (−1)^{n+1} [ U_1^{n−1} − C(n,1) U_1^{n−2} + ··· + (−1)^k C(n,k) U_1^{n−1−k} + ··· + (−1)^{n−1} C(n,n−1) I_n ],

with U^{−1} = U_1^{−1} D^{−1}.

Example 1. Consider an invertible 6 × 6 lower triangular matrix L whose diagonal entries are all equal to 1. To find L^{−1}, we need to find the powers L², L³, L⁴, and L⁵. The characteristic polynomial of L is

    K_L(λ) = (λ − 1)⁶ = λ⁶ − 6λ⁵ + 15λ⁴ − 20λ³ + 15λ² − 6λ + 1,

and since

    K_L(L) = L⁶ − 6L⁵ + 15L⁴ − 20L³ + 15L² − 6L + I_6 = Z_6,

we conclude that

    L^{−1} = −[ L⁵ − 6L⁴ + 15L³ − 20L² + 15L − 6 I_6 ].
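The computation in Example 1 can be checked numerically. The following Python sketch (a plain-Python stand-in for the MATLAB used later in these notes) applies the formula L^{−1} = −[L⁵ − 6L⁴ + 15L³ − 20L² + 15L − 6I₆] to a hypothetical 6 × 6 unit lower triangular matrix; its subdiagonal entries are chosen only for illustration and are not the ones in the worked example.

```python
# Sketch of Example 1's method: for a 6x6 lower triangular L with unit diagonal,
# (L - I)^6 = Z_6, so L^{-1} = -[L^5 - 6 L^4 + 15 L^3 - 20 L^2 + 15 L - 6 I_6].
# The entries of L are hypothetical, chosen only for illustration.
n = 6
I = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
L = [[1 if i == j else (i + j if i > j else 0) for j in range(n)] for i in range(n)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

powers = {1: L}
for k in range(2, 6):
    powers[k] = matmul(powers[k - 1], L)

# -[L^5 - 6 L^4 + 15 L^3 - 20 L^2 + 15 L] + 6 I, in exact integer arithmetic
coef = {5: -1, 4: 6, 3: -15, 2: 20, 1: -15}
Linv = [[sum(c * powers[k][i][j] for k, c in coef.items()) + 6 * I[i][j]
         for j in range(n)] for i in range(n)]

print(matmul(Linv, L) == I)  # → True
```

Because L has integer entries and unit diagonal, the check is exact: no floating-point arithmetic is involved.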
Reducing the Order of Matrix Polynomials

Consider the n × n matrix A with the characteristic polynomial K_A(λ), and let P(x) be a polynomial of degree greater than or equal to n. Then

    P(λ) = Q(λ) K_A(λ) + R(λ),

where Q(λ) is found by long division, and the remainder polynomial R(λ) is of degree less than or equal to n − 1. Now consider the corresponding matrix polynomial

    P(A) = Q(A) K_A(A) + R(A).

But the Cayley-Hamilton theorem states that K_A(A) = Z_n; therefore P(A) = R(A).

Example 2. Use the characteristic polynomial of a 3 × 3 matrix A with

    det(λ I_3 − A) = λ³ − 5λ² − λ + 5 = (λ + 1)(λ − 1)(λ − 5)

to reduce the order of the matrix polynomial

    P(A) = A⁶ + 3A⁵ + 4A³ − 5A² + 12A + I_3.

Solution: From the characteristic polynomial of A we obtain the eigenvalues λ_1 = −1, λ_2 = 1, and λ_3 = 5. By using long division, we obtain

    P(λ) = (λ³ + 8λ² + 41λ + 212) K_A(λ) + (1056λ² + 19λ − 1059)
         = Q(λ) det(λ I_3 − A) + R(λ).

According to the Cayley-Hamilton theorem, K_A(A) = Z_3. Thus

    P(A) = R(A) = 1056 A² + 19 A − 1059 I_3.

Analytic Functions of Matrices

Assume that a scalar function f(x) is analytic in a region of the complex plane. Then in that region f(x) may be expressed as

    f(x) = Σ_{k=0}^∞ c_k x^k.

Case 1. Let A be an n × n matrix with characteristic polynomial K_A(λ) and n different eigenvalues λ_1, λ_2, …, λ_n. Then

    f(A) = Q(A) K_A(A) + R(A),   where   R(x) = Σ_{k=0}^{n−1} α_k x^k.

Since the λ_i are the roots of K_A(λ), we have:

    f(λ_i) = R(λ_i) = Σ_{k=0}^{n−1} α_k λ_i^k.
Since the λ_i (i = 1, 2, …, n) are known, the values f(λ_i) define a set of simultaneous linear equations that generate the coefficients α_0, α_1, …, α_{n−1}. From the Cayley-Hamilton theorem, we conclude that the matrix function f(A) equals its remainder polynomial evaluated at A, that is,

    f(A) = R(A) = Σ_{k=0}^{n−1} α_k A^k.

Thus an analytic function of an n × n matrix A may be expressed as a polynomial of degree n − 1 or less.

Example 3. Let f(x) = sin x and let A = [ 2 2; 1 3 ]; then find sin(A).

Solution: From the characteristic polynomial of A,

    det(λ I_2 − A) = λ² − 5λ + 4 = (λ − 1)(λ − 4),

we obtain the eigenvalues λ_1 = 1 and λ_2 = 4. Since n = 2, R(λ) must be of degree one or less. From f(A) = R(A), we conclude that for i = 1, 2,

    f(λ_i) = R(λ_i) = a + b λ_i   and   f(A) = R(A) = a I_2 + b A.

The Maclaurin series associated with sin λ is

    sin λ = λ − λ³/3! + λ⁵/5! − ··· + (−1)^n λ^{2n+1}/(2n+1)! + ···,

so the corresponding series for the matrix A is

    sin(A) = A − A³/3! + A⁵/5! − ··· + (−1)^n A^{2n+1}/(2n+1)! + ···.

Now from f(λ_i) = R(λ_i) = a + b λ_i, we obtain

    sin(1) = a + b
    sin(4) = a + 4b,

so that

    a = [4 sin(1) − sin(4)]/3   and   b = [sin(4) − sin(1)]/3.

Hence

    sin(A) = a I_2 + b A = (1/3) [ 2 sin(1) + sin(4)    2 sin(4) − 2 sin(1)
                                   sin(4) − sin(1)      2 sin(4) + sin(1)  ]

           ≈ [  0.3087   −1.0655
               −0.5328   −0.2240 ].

The exponential of the n × n matrix A, denoted by e^A or exp(A), is the n × n matrix given by the power series

    e^A = I_n + A + (1/2!) A² + (1/3!) A³ + ··· + (1/k!) A^k + ··· = Σ_{k=0}^∞ A^k / k!.

The above series always converges, so the exponential of A is well defined.

Example 4. Let A = [ 0 2; −1 3 ]; then find e^A.

Solution: The characteristic equation of A is λ² − 3λ + 2 = 0, and the eigenvalues are λ_1 = 1 and λ_2 = 2. So, for i = 1, 2,

    e^{λ_i} = R(λ_i) = a + b λ_i   and   exp(A) = R(A) = a I_2 + b A.
Hence

    e  = a + b
    e² = a + 2b,

so that a = 2e − e² and b = e² − e. Thus

    exp(A) = a I_2 + b A = [ 2e − e²    2e² − 2e
                             e − e²     2e² − e  ] ≈ [ −1.9525    9.3415
                                                       −4.6708   12.0598 ].

Example 5. Let A = [ 0 1; −1 0 ]; then find e^A.

Solution: The characteristic equation of A is λ² + 1 = 0, and the eigenvalues are λ_1 = i and λ_2 = −i. So, for i = 1, 2,

    e^{λ_i} = R(λ_i) = a + b λ_i   and   exp(A) = R(A) = a I_2 + b A.

Hence

    a + i b = e^{i}  = cos(1) + i sin(1)
    a − i b = e^{−i} = cos(1) − i sin(1),

so that a = cos(1) and b = sin(1). Thus

    exp(A) = [ cos(1)    sin(1)
               −sin(1)   cos(1) ] ≈ [  0.5403   0.8415
                                      −0.8415   0.5403 ].

Case 2. Suppose the characteristic polynomial of A has multiple roots, i.e.,

    K_A(λ) = Π_{k=1}^m (λ − λ_k)^{r_k}.

If f(λ) = Q(λ) K_A(λ) + R(λ), then f(λ_k) = Q(λ_k) · 0 + R(λ_k) = R(λ_k); moreover, since (λ − λ_k)^{r_k} divides K_A(λ), the first r_k − 1 derivatives of Q(λ) K_A(λ) also vanish at λ_k. We conclude that

    (d^j/dλ^j) f(λ) |_{λ = λ_k} = (d^j/dλ^j) R(λ) |_{λ = λ_k},   j = 0, 1, …, r_k − 1.

Example 6. Let f(x) = sin x and let A = [ 2 0 0; 1 2 0; 2 3 2 ]; then find sin(A).

Solution: From the characteristic polynomial K_A(λ) = (λ − 2)³ of A, we conclude that

    sin(A) = R(A),   where R(λ) = a + b λ + c λ² for some real numbers a, b, and c.

We also have:

    (d/dλ) sin(λ) = cos(λ)   and   (d²/dλ²) sin(λ) = −sin(λ).

By using λ_1 = λ_2 = λ_3 = 2, we obtain

    sin(2) = a + 2b + 4c,   cos(2) = b + 4c,   and   −sin(2) = 2c.

The solution to this linear system is:

    a = −sin(2) − 2 cos(2),   b = cos(2) + 2 sin(2),   and   c = −sin(2)/2.

Hence

    sin(A) = [−sin(2) − 2 cos(2)] I_3 + [cos(2) + 2 sin(2)] A − (sin(2)/2) A²

           = [−sin(2) − 2 cos(2)] [ 1 0 0          [ 2 0 0                     [  4  0  0
                                     0 1 0    + [cos(2) + 2 sin(2)]  1 2 0   − (sin(2)/2)  4  4  0
                                     0 0 1 ]         2 3 2 ]                    11 12  4 ]

           ≈ [  0.9093    0         0
               −0.4161    0.9093    0
               −2.1963   −1.2484    0.9093 ].
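The Case 2 computation can be sketched in code. The following Python snippet (a stand-in for the MATLAB checks on the next page) fits R(λ) = a + bλ + cλ² to sin, sin′, and sin″ at the triple eigenvalue λ = 2, evaluates R(A) for a 3 × 3 matrix of the kind used in Example 6, and compares the result against a truncated Maclaurin series of sin evaluated at A. Treat the matrix entries as illustrative.

```python
import math

# Sketch: a 3x3 matrix with the triple eigenvalue 2 (entries illustrative).
# Fit R(l) = a + b*l + c*l^2 to sin, sin', sin'' at l = 2, then compare
# R(A) with the truncated Maclaurin series sin(A) = A - A^3/3! + A^5/5! - ...
A = [[2, 0, 0], [1, 2, 0], [2, 3, 2]]
s2, c2 = math.sin(2), math.cos(2)
c = -s2 / 2            # from  -sin(2) = 2c
b = c2 + 2 * s2        # from   cos(2) = b + 4c
a = -s2 - 2 * c2       # from   sin(2) = a + 2b + 4c

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

A2 = matmul(A, A)
S = [[a * (i == j) + b * A[i][j] + c * A2[i][j] for j in range(3)] for i in range(3)]

# Truncated Maclaurin series of sin evaluated at A
T, term = [[0.0] * 3 for _ in range(3)], [row[:] for row in A]
for k in range(20):
    f = (-1) ** k / math.factorial(2 * k + 1)
    T = [[T[i][j] + f * term[i][j] for j in range(3)] for i in range(3)]
    term = matmul(term, A2)   # next odd power A^(2k+3)

err = max(abs(S[i][j] - T[i][j]) for i in range(3) for j in range(3))
print(err < 1e-9)  # → True
```

The diagonal entries of S come out to sin(2) ≈ 0.9093, matching the Hermite-interpolation conditions exactly.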
You may verify the validity of the results in Example 3 and Example 6 by using the Maclaurin series of sin(A) with the following MATLAB code:

>> SA = A; for k = 1:100, SA = SA + (-1)^k * A^(2*k+1) / factorial(2*k+1); end; SA

or the MATLAB command:

>> funm(A, @sin)

You may verify the validity of the results in Example 4 and Example 5 by using the Maclaurin series of exp(A) with the following MATLAB code:

>> EA = eye(2); for k = 1:100, EA = EA + A^k / factorial(k); end, EA

or the MATLAB command:

>> funm(A, @exp)
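If MATLAB is not available, the same series check can be run in plain Python. The sketch below sums the exponential series for the matrix of Example 5 and compares it with the closed form [ cos(1) sin(1); −sin(1) cos(1) ]; only the standard library is assumed.

```python
import math

# Exponential series e^A = I + A + A^2/2! + ... for A = [0 1; -1 0],
# compared with the closed form [[cos 1, sin 1], [-sin 1, cos 1]] (Example 5).
A = [[0.0, 1.0], [-1.0, 0.0]]
EA = [[1.0, 0.0], [0.0, 1.0]]      # running sum, starts at I_2
term = [[1.0, 0.0], [0.0, 1.0]]    # running term A^k / k!
for k in range(1, 30):
    # term <- (term * A) / k  gives A^k / k!
    term = [[sum(term[i][m] * A[m][j] for m in range(2)) / k for j in range(2)]
            for i in range(2)]
    EA = [[EA[i][j] + term[i][j] for j in range(2)] for i in range(2)]

expected = [[math.cos(1), math.sin(1)], [-math.sin(1), math.cos(1)]]
err = max(abs(EA[i][j] - expected[i][j]) for i in range(2) for j in range(2))
print(err < 1e-12)  # → True
```

Thirty terms are far more than enough here: the k-th term is bounded by 1/k!, so the truncation error is negligible next to floating-point round-off.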