Linear Algebra Massoud Malek Characteristic Polynomial Preleminary Results Let A = (a ij ) be an n n matrix If Au = λu, then λ and u are called the eigenvalue and eigenvector of A, respectively The eigenvalues of A are the roots of the characteristic polynomial K A (λ) = det (λi n A) The eigenvectors are the solutions to the Homogeneous system (λi n A) X = θ Note that K A (λ) is a monic polynomial (ie, the leading coefficient is one) Cayley-Hamilton Theorem If K A (λ) = λ n + p λ n + + p n λ + p n is the characteristic polynomial of the n n matrix A, then where Z n is the n n zero matrix K A (A) = A n + p A n + + p n A + p n I n = Z n, Corollary Let K A (λ) = λ n + p λ n + + p n λ + p n be the characteristic polynomial of the n n invertible matrix A Then A = p n [ A n + p A n 2 + + p n 2 A + p n I n ] Proof According to the Cayley Hamilton s theorem we have A [ A n + p A n 2 + + p n I n ] = pn I n, Since A is nonsingular, p n = ( ) n det(a) ; thus the result follows Newton s Identity Let λ, λ 2,, λ n be the roots of the polynomial If s k = λ k + λ k 2 + + λ k n, then P (λ) = λ n + c λ n + c 2 λ n 2 + + c n λ + c n c k = k (s k + s k c + s k 2 c 2 + + s 2 c k 2 c + s c k ) Proof From P (λ) = (λ λ )(λ λ 2 ) (λ λ n )(λ λ n ) and the use of logarithmic differentiation, we obtain P (λ) P (λ) = nλn + (n )c λ n 2 + + 2c n 2 λ + c n n λ n + c λ n + c 2 λ n 2 = + + c n λ + c n (λ λ i ) By using the geometric series for Hence n i= (λ λ i ) i= and choosing λ > max i n λ i, we obtain (λ λ i ) = n λ + s λ 2 + s 2 λ 3 + nλ n + (n )c λ n 2 + + c n = ( λ n + c λ n + c 2 λ n 2 + + c n ) ( n λ + s λ 2 + s 2 λ 3 + ) By equating both sides of the above equality we may obtain the Newton s identities
Massoud Malek The Method of Direct Expansion Page 2 The Method of Direct Expansion The characteristic polynomial of an n n matrix A = (a ij ) is defined as: λ a a 2 a n a K A (λ)=det(λi n A)= 2 λ a 22 a 2n a n a n2 λ a nn where σ = n a ii = trace(a) i= is the sum of all first-order diagonal minors of A, σ 2 = a ii a ji i<j a ij a jj is the sum of all second-order diagonal minors of A, σ 3 = i<j<k a ii a ij a ik a ji a jj a jk a ki a kj a kk =λ n σ λ n +σ 2 λ n 2 + ( ) n σ n, is the sum of all third-order diagonal minors of A, and so forth Finally, σ n = det(a) There are ( n k) diagonal minors of order k in A From this we find that the direct computation of the coefficients of the characteristic polynomial of an n n matrix is equivalent to computing ( ) n + ( ) n + 2 ( ) n = 2 n n determinants of various orders, which, generally speaking, is a major task This has given rise to special methods for expanding characteristic polynomial We shall explain some of these methods Example Compute the characteristic polynomial of A = 2 3 2 4 2 We have: σ = + + 2 = 4, σ 2 = 2 2 + 4 2 + 3 2 = ( 3) + (2) + ( ) = 2, 2 3 and σ 3 = det(a) = 2 4 2 = 7 Thus K A (λ) = det(λi 3 A) = λ 3 σ λ 2 + σ 2 λ σ 3 = λ 3 4λ 2λ + 7
Massoud Malek Leverrier s Algorithm Page 3 Leverrier s Algorithm This method allows us to find the characteristic polynomial of any n n matrix A using the trace of the matrix A k, where k =, 2, n Let σ(a) = {λ, λ 2,, λ n } be the set of all eigenvalues of A which is also called the spectrum of A Note that Let s k = trace(a k ) = n λ k i, for all k =, 2,, n i= K A (λ) = det(λi n A) = λ n + p λ n + + p n λ + p n be the characteristic polynomial of the matrix A, then for k n, the Newton s identities hold true: p k = k [s k + p s k + + p k s ] (k =, 2,, n) 2 2 Example Let A = Then 2 3 4 5 4 8 4 7 6 3 9 A 2 9 9 = A 3 2 5 8 3 42 28 8 23 = 43 9 6 22 5 2 6 7 9 3 7 So s = 4, s 2 = 2, s 3 = 44, and s 4 = 36 Hence p = s = 4, Therefore p 2 = 2 (s 2 + p s ) = (2 + ( 4)4) = 2, 2 A 4 = p 3 = 3 (s 3 + p s 2 + p 2 s ) = ( 44 + ( 4)2 + 2(4)) = 28, 3 25 48 6 4 22 23 22 46 9 4 4 2 66 2 7 p 4 = 4 (s 4 + p s 3 + p 2 s 2 + p 3 s ) = (36 + ( 4)( 44) + 2(2) + 28(4)) = 87 4 K A (λ) = λ 4 4λ 3 + 2λ 2 + 28λ 87 and A = [ A 3 4A 2 ] + 2A + 28I 4 = 87 7 6 3 9 8 4 2 42 28 8 23 9 9 2 4 +2 +28 87 43 9 6 22 3 2 5 8 2 3 9 3 7 5 2 6 7 4 5 4 A = 87 43 22 7 8 4 6 5 4 4 33 27 2 9
Massoud Malek The Method of Souriau Page 4 The Method of Souriau (or Fadeev and Frame) This is an elegant modification of the Leverrier s method Let A be an n n matrix, then define Theorem B n = Z n, and A = A, q = trace(a ), B = A + q I n A 2 = AB, q 2 = 2 trace(a 2), B 2 = A 2 + q 2 I n A n = AB n, q n = n trace(a n), B n = A n + q n I n K A (λ) = det(λi n A) = λ n + q λ n + + q n λ + q n If A is nonsingular, then A = q n B n Proof Suppose the characteristic polynomial of A is K A (λ) = det(λi n A) = λ n + p λ n + + p n λ + p n, where p ks are defined in the Leverrier s method Clearly p = trace(a) = trace(a ) = q, and now suppose that we have proved that Then by the hypothesis we have q = p, q 2 = p 2,, q k = p k A k = AB k = A(A k + q k I n ) = AA k + q k A = A[A(A k 2 + q k 2 I n )] + q k A = A 2 A k + q k 2 A 2 + q k A = Let s i = trace(a i ) (i =, 2,, k), then by Newton s identities = A k + q A k + + q k A kq k = trace(a k ) = trace(a k ) + q trace(a k ) + + q k trace(a) = s k + q s k + + q k s = s k + p s k + + p k s = kp k showing that p k = q k Hence this relation holds for all k and so By the Cayley-Hamilton theorem, B n = A n + q A n + + q n A + q n I n = Z n B n = A n + q n I n = Z n ; A n = AB n = q n I n If A is nonsingular, then det(a) = ( ) n K A () = ( ) n q n, and thus A = q n B n
Massoud Malek The Method of Souriau Page 5 Example Find the characteristic polynomial and if possible the inverse of the matrix 2 2 A = 2 3 4 5 4 For k =, 2, 3, 4, compute A k = AB k 2 2 A = 2 3 4 5 4 3 4 5 9 5 A 2 = 5 6 9 4 q k = k trace(a k), B k = A k + q k I 4, q = 4, B = 3 2 4 2 2 5 3 4 5 4 5 9 5, q 2 = 2, B 2 = ; 5 6 4 8 6 9 8 6 7 5 22 7 43 22 7 8 24 6 8 4 6 A 3 =, q 5 4 38 4 3 = 28, B 3 = ; 5 4 4 33 27 2 37 33 27 2 9 87 87 A 4 =, q 87 4 = 87, B 4 = 87 Therefore the characteristic polynomial of A is: K A (λ) = λ 4 4λ 3 + 2λ 2 + 28λ 87 Note that A 4 is a diagonal matrix, so we only need to multiply the first row of A by the first column of B 3 to obtain 87 Since q 4 = 87, the matrix A has an inverse 43 22 7 A = B 3 = 8 4 6 q 4 87 5 4 4 33 27 2 9 Matlab Program A = input( Enter a square matrix : ) m = size(a); n = m(); q = zeros(, n); B = A; AB = A; In = eye(n); for k = : n, q(k) = (/k) trace(ab)b = AB + q(k) In; AB = A B; end C = B; q(n) = (/n) trace(ab); Q = [ q]; disp( T he Characteristic polynomial looks like : ) disp( K A (x) = x n + q()x (n ) + + q(n )x + q(n) ), disp( ), disp( T he coefficients list c(k) is : ), disp( ), disp(q), disp( ) if q(n) ==, disp( T he matrix is singular ); else, disp( T he matrix has an inverse ), disp( ) C = (/q(n)) B; disp( T he inverse of A is : ), disp( ), disp(c) end ;
Massoud Malek The Method of Undetermined Coefficients Page 6 The Method of Undetermined Coefficients If one has to expand large numbers of characteristic polynomials of the same order, then the method of undetermined coefficients may be used to produce characteristic polynomials of those matrices Let A be an n n matrix and K A (λ) = det(λi n A) = λ n + p λ n + + p n λ + p n be its characteristic polynomial In order to find the coefficients p is of K A (λ) we evaluate D j = K A (j) = det(ji n A) j =,, 2,, n and obtain the following system of linear equations: Which can be changed into: p n = D n + p n + + p n = D 2 n + p 2 n + + p n = D 2 (n ) n + p (n ) n + + p n = D n n n 2 2 S n P = n 2 n 2 2 (n ) n (n ) n 2 n The system may be solved as follows: P = S n D p p 2 p n = D D n D 2 D 2 n D n D (n ) n = D Since the (n ) (n ) matrix S n depends only on the order of A, we may store R n, the inverse of S n beforehand and use it to find the coefficients of characteristic polynomial of various n n matrices Examples Compute the characteristic polynomials of the 4 4 matrices 3 4 2 2 3 3 2 A = and B = 2 2 2 3 3 2 4 5 4 First we find S = 8 4 2 and R = S = 2 27 9 3 Then for the matrix A we obtain D = det( A) = 48, D = det(i 4 A) = 72, 6 6 2 3 24 6 36 8 4 D 2 = det(2i 4 A) = 28 and D 3 = det(3i 4 A) = 8 D = D D 4 D 2 D 2 4 = 25 96 D 3 D 3 4 23
Massoud Malek The Method of Undetermined Coefficients Page 7 Hence Thus Hence Thus P = p p 2 = 2 p 3 For the matrix B we have Matlab Program 6 6 2 3 24 6 36 8 4 25 96 23 K A (λ) = λ 4 23λ 2 2λ 48 D = det( B) = 87, D = det(i 4 B) = 6, = 23 2 D 2 = det(2i 4 B) = 39 and D 3 = det(3i 4 B) = 2 P = p p 2 = 2 p 3 D = D D 4 D 2 D 2 4 = 26 32 D 3 D 3 4 6 6 6 2 3 24 6 36 8 4 26 32 = 6 K B (λ) = λ 4 4λ 3 + 2λ 2 + 28λ 87 4 2 28 N = input( Enter the size of your square matrix : ); n = N ; In = eye(n); S = zeros(n); R = zeros(n); D = zeros(, n); DSP = [ F or any, int2str(n), square matrix, you need S = ]; DSP 2 = [ Do you want to try with another, int2str(n), square matrix? (Y es = /No = ) ]; %DEF INING S for i = : n, for j = : n, S(i, j) = i (N j); end; end; disp( ), disp(dsp ), disp( ), disp(s), R = inv(s); ok = ; while ok == ; A = input([ Enteran, int2str(n), x, int2str(n), matrix A : ); disp( ) D = det(a); for k = : n; D(k) = det(k In A); end; for i = : n; DD(i) = D(i) D i N; end; P = R DD ; disp( T he Characteristic polynomial looks like : ) disp( K A (x) = x n + p()x (n ) + + p(n )x + p(n) ), disp( ), disp( T he coefficients list p(k) is : ), disp( ), disp([ P D]), disp( ), disp(dsp 2), disp( ), ok = input(dsp 2); end
Massoud Malek The Method of Danilevsky Page 8 The Method of Danilevsky Consider an n n matrix A and let K A (λ) = det(λi n A) = λ n + p λ n + + p n λ + p n be its characteristic polynomial Then the companion matrix of K A (λ) F [A] = p p 2 p 3 p n p n is similar to A and is called the Frobenius form of A The method of Danilevsky (937) applies the Gauss-Jordan method to obtain the Frobenius form of an n n matrix According to this method the transition from the matrix A to F [A] is done by means of n similarity transformations which successively transform the rows of A, beginning with the last, into corresponding rows of F [A] Let us illustrate the beginning of the process Our purpose is to carry the nth row of a a 2 a 3 a,n a n a 2 a 22 a 23 a 2,n a 2n a 3 a 32 a 33 a 3,n a 3n A = a n a n2 a n3 a n,n a nn into the row ( ) Assuming that a n,n, we replace the (n )th row of the n n identity matrix with the nth row of A and obtained the matrix The inverse of U n is U n = a n a n2 a n3 a n,n a nn V n = v n, v n,2 v n,3 v n,n v n,n where a ni v n,i = for i n a n,n and v n,n = a n,n
Massoud Malek The Method of Danilevsky Page 9 Multiplying the right side of A by V n, we obtain b b 2 b 3 b,n b n b 2 b 22 b 23 b 2,n b 2n AV n = B = b n, b n,2 b n,3 b n,n b n,n However the matrix B = AM n is not similar to A To have a similarity transformation, it is necessary to multiply the left side of B by U n = Vn Let C = U n AV n, then C is similar to A and is of the form c c 2 c 3 c,n c n b 2 b 22 b 23 b 2,n b 2n C = c n, c n,2 c n,3 c n,n c n,n Now, if c n,n, then similar operations are performed on matrix C by taking its (n 2)th row as the principal one We then obtain the matrix D = U n 2 CV n 2 = U n 2 U n AV n V n 2 with two reduced rows We continue the same way until we finally obtain the Frobenius form F [A] = U U 2 U n 2 U n AV n V n 2 V 2 V if, of course, all the n intermediate transformations are possible Exceptional case in the Danilevsky method Suppose that in the transformation of the matrix A into its Frobenius form F [A] we arrived, after a few steps, at a matrix of the form r r 2 r k r,n r n r 2 r 22 r 2k r 2,n r 2n r R = k r k2 r kk r k,n r kn and it was found that r k,k = or r k,k is very small It is then possible to continue the transformation by the Danilevsky method Two cases are possible here Case Suppose for some j =, 2,, k 2, r kj Then by permuting the jth row and (k )th row and the jth column and (k )th column of R we obtain a matrix R = (r ij ) similar to R with r k,k
Massoud Malek The Method of Danilevsky Page Case 2 Suppose now that r kj = for all j =, 2,, k 2 Then R is in the form r r 2 r,k r,k r,k+ r n r 2 r 22 r 2,k r 2,k r 2,k+ r 2n [ ] R R r R = 2 k, r k,2 r k,k r k,k r k,k+ r k,n = R 3 r kk r k,k+ r kn In this case the characteristic polynomial of R breaks up into two determinants: det(λi n R) = det(λi k R ) det(λi n k+ R 3 ) Here, the matrix R 3 is already reduced to the Frobenius form It remains to apply the Danilevsky s method to the matrix R Note Since U k A k only changes the kth row of A k, it is more efficient to multiply first A k by its (k + )th row and then multiply on the right side the resulting matrix by V k The next result shows that once we transform A into its Frobenius form ; we may obtain the eigenvectors with the help of the matrices V i s Theorem Let A be an n n matrix and let F [A] be its Frobenius form If λ is an eigenvalue of A, then v = λ n λ n 2 λ and are the eigenvectors of F [A] and A respectively Proof Since we have w = V n V n 2 V 2 V v det(λi n A) = det(λi n F [A] ) = λ n + p λ n + + p n λ + p n, λ p p 2 p 3 p n p n λ (λi n F [A])v = λ λ λ n λ n 2 λ n 3 λ = Since F [A] = V V 2 V n 2 V n AV n V n 2 V 2 V and F [A]v = λv, we conclude that λw = V n 2 V 2 V (λv) = (V n 2 V 2 V ) F [A]v = A (V n V n 2 V 2 V v) = Aw Note For expanding characteristic polynomials of matrices of order higher than fifth, the method of Danilevsky requires less multiplications and additions than other methods
Massoud Malek The Method of Danilevsky Page Example Reduce the matrix to its Frobenius form 3 4 2 2 A = 2 The matrix B = A 3 = U 3 AV 3 is as follows: 3 4 3 2 2 2 2 B = = 2 Since b 32 =, we need the permutation matrix J = ; thus 3 2 2 2 2 3 C = JBJ = = Next we obtain the matrix D = A 2 = U 2 CV 2 2 2 2 2 3 3 2 D = = Finally the Frobenius form F [A] = A = U DV, 2 2 2 3 2 4 2 3 2 F [A] = = Thus the Characteristic polynomial of A is : Matlab Program K A (λ) = x 4 x 3 4x 2 2x 3 A = input( Enter the square matrix A : ); m = size(a); N = m(); b = []; B = zeros(n); i = ; while i < N, J = eye(n); h = A(N i +, N i) while h == ; c = A(N i +, : N i); z = norm(c, inf); if z = ; k = ; r = ; while r == & k < N i; r = r + c(n i k); k = k + ; J(N i, N i) = ; J(N i, N i k + ) = ; J(N i k +, N i k + ) = ; J(N i k +, N i) =,
Massoud Malek The Method of Krylov Page 2 A = J A J; k = k + ; end else b = conv(b, [ A(N i +, N i + : N)]); B = A( : N i, : N i); A = B, N = N i; end h = A(N i +, N i); end U = eye(n); V = eye(n); U(N i, :) = A(N i +, :); V (N i, :) = A(N i +, :)/A(N i +, N i); V (N i, N i) = /A(N i +, N i); A = U A V ; i = i + ; end; b = conv(b, [ A(N i +, N i + : N)]); disp( ), disp( T he Characteristic polynomial looks like : ), disp( ), disp([ K A (x) = x n + c()x (n ), + + c(n )x + c(n) ]), disp( ), disp( T he coefficients list c(k) is : ), disp( ), disp(b), disp( ) The Method of Krylov Let A be an n n matrix For any n-dimensional nonzero column vector v we associate its successive transforms v k = A k v (k =,, 2, ), this sequence of vectors is called the Krylov sequence associated to the matrix A and the vector v At most n vectors of the sequence v, v, v 2, will be linearly independent Suppose for some r = r(v) n, the vectors v, v, v 2,, v r are linearly independent and the vector v r+ is a linear combination of the preceding ones Hence there exists a monic polynomial such that φ(λ) = c + c λ + c 2 λ 2 + + c r λ r + λ r φ(a)v = (c I n + c A + + c r A r + A r )v = c v + c v + + c r v r + c r v r+ = θ The polynomial φ(λ) is said to annihilate v and to be minimal for v If ω(λ) is another monic polynomial which annihilates v, ω(a)v =, then φ(λ) divides ω(λ) To show that; suppose ω(λ) = γ(λ)φ(λ) + ρ(λ), where ρ(λ) is the remainder after dividing ω by φ, hence of degree strictly less than r, it follows that ρ(a)v = But φ(λ) is minimal for v, hence ρ(λ) = Now of all vectors v there is at least one vector for which the degree v is maximal, since for any vector v, r(v) n We call such vector a maximal vector A monic polynomial µ A (λ) is said to be the minimal polynomial for A, if µ A (λ) is monic and of minimum degree satisfying φ(a) = Z n
Massoud Malek The Method of Krylov Page 3 Theorem Let A be an n n matrix and let φ(λ) be a minimal polynomial for a maximal vector v Then φ(λ) is the minimal polynomial for A Proof Consider any vector u such that u and v are linearly independent Let ψ(λ) be its minimal polynomial If ω is the lowest common multiple of φ and ψ, then ω annihilates every vector in the plane of u and v, since ω(a)(αu + βv) = αω(a)u + βω(a)v = θ Hence ω contains as a divisor the minimal polynomial of every vector in the plane But ω is of degree 2n at most, hence has only finitely many divisors Since there are infinitely many pairs of linearly independent vectors in the plane and finitely many divisors of ω, there is a pair of linearly independent vectors x and y in this plane with the same minimal polynomial This polynomial also annihilates v since v is on this plane Therefore φ is minimal for every vectors in the plane of u and v, and since u was any vector whatever, other than v, φ annihilates every n-dimensional vector Since φ annihilates every vector, it annihilates in particular every vector e i, hence φ(a)i = φ(a) = Z n Thus φ(λ) = µ A (λ) is the minimal polynomial for A If the minimal polynomial and the characteristic polynomial of a matrix are equal, then they may be found by the use of Krylov s sequence To produce the characteristic polynomial of A by Krylov method, first choose an arbitrary n-dimensional nonzero column vector v such as e, then use the Krylov sequence to define the matrix V = [v, Av, A 2 v,, An 2v, A n v] = [v, v, v 2, v n 2, v n ] If the matrix V has rank n, then the system V c = v n has a unique solution The monic polynomial c t = (c, c, c 2,, c n ) φ(λ) = c + c λ + c 2 λ 2 + + c n λ n + λ n which annihilates v is the characteristic polynomial of A If the system V c = v n does not have a unique solution, then change the initial vector and try for example with e 2 Examples Compute the characteristic polynomials of the following matrices: 2 2 A = and B = 2 3 4 5 4 2 3 2 3 3 2 Choosing the initial vector v = for both matrices, we obtain 7 V A = [v, Av, A 2 v, A 3 9 42 v] = and V 2 3 43 B = [v, Bv, B 2 v, B 3 v] = 4 5 9
Massoud Malek The Method of Krylov Page 4 The matrix V A is nonsingular, hence c A = V A A4 v = characteristic polynomial of A which is 87 28 From c 2 A we obtain the 4 K A (λ) = 87 + 28λ + 2λ 2 4λ 3 + λ 4 The matrix V B is singular, so we need another initial vector such as v = The new 2 9 8 matrix V B = is invertible, so c 3 5 2 B = V 2 B A4 v = From the 8 2 solution c B we obtain the characteristic polynomial of B which is K B (λ) = 9 2λ λ 2 + 2λ 3 + λ 4 Remark The minimal polynomial and the characteristic polynomial of the matrix 2 3 4 2 3 4 A = are m A (λ) = λ 3 3λ 2 7λ and K A (λ) = λ 4 3λ 3 7λ 2, respectively Therefore by choosing any initial vector v, the matrix V A = [v, Av, A 2 v, A 3 v] will always be singular This means that the Krylov sequence will never produce the characteristic polynomial K A (λ) Matlab Program A = input( Enter a square matrix A : ); m = size(a); n = m(); V = zeros(n, n); DL = [ Enter an initial, int2str(n), dimensional row vector v = ]; v = input(dl); z = ; k = ; while z == & k < 5 w = v; V (:, ) = w; for i = 2 : n, w = A w; V (:, i) = w; end, if det(v ) = ; k = 8; c = inv(v ) A w; else while k < 5 v = input( T he matrix V is singular, please enter another initial row vector v : ); k = k + ; end; end; z = det(v ); end; if k == 5; disp( Sorry, the Krylov method is not suited for this matrix ), disp( ), else; disp( T he Characteristic polynomial looks like : ), disp( ), disp([ K A (x) = c() + C()x + c(2)x 2 + + c(n )x (n ) + x n ]), disp( ), disp( T he coefficients list c(k) is : ), disp( ), disp([c, ]) end;