REVUE D'ANALYSE NUMÉRIQUE ET DE THÉORIE DE L'APPROXIMATION
Tome 25, Nos 1-2, 1996, pp. 43-56

ON THE CHEBYSHEV METHOD FOR APPROXIMATING THE EIGENVALUES OF LINEAR OPERATORS

EMIL CĂTINAŞ and ION PĂVĂLOIU
(Cluj-Napoca)

1. INTRODUCTION

The problem of approximating the eigenvalues of linear operators by the Newton method has been addressed in a series of papers ([1], [3], [4], [5]). There is a special interest in using the Newton method because the operator equation to be solved has a special form, as we shall see below.

In the following we study the convergence of the Chebyshev method attached to this problem, and we apply the results obtained to the approximation of an eigenvalue and of a corresponding eigenvector of a matrix of real or complex numbers. It is known that the convergence order of the Newton method is usually 2, while that of the Chebyshev method is usually 3. Taking into account also the number of operations performed at each step, we obtain that the Chebyshev method is more efficient than the Newton method.

Let $E$ be a Banach space over $\mathbb{K}$, where $\mathbb{K}=\mathbb{R}$ or $\mathbb{K}=\mathbb{C}$, and $T:E\to E$ a linear operator. It is well known that the scalar $\lambda$ is an eigenvalue of $T$ if the equation

(1.1)  $Tx - \lambda x = \theta$

has at least one solution $x \ne \theta$, where $\theta$ is the null element of the space $E$. The elements $x \ne \theta$ satisfying equation (1.1) are called eigenvectors of the operator $T$ corresponding to the eigenvalue $\lambda$.

For the simultaneous determination of the eigenvalues and eigenvectors of $T$ we can proceed in the following way. We attach to equation (1.1) an equation of the form

(1.2)  $Gx = 1$,

where $G : E \to \mathbb{K}$ is a linear functional.
Consider the Banach space $F = E \times \mathbb{K}$, with the norm given by

(1.3)  $\|u\| = \max\{\|x\|, |\lambda|\}$,  $u \in F$, $u = (x, \lambda)$, with $x \in E$ and $\lambda \in \mathbb{K}$.

In this space we consider the operator $f : F \to F$ given by

(1.4)  $f(u) = f(x, \lambda) = (Tx - \lambda x,\; Gx - 1)$.

If we denote by $\theta_1 = (\theta, 0)$ the null element of the space $F$, then the eigenvalues and the corresponding eigenvectors of the operator $T$ are solutions of the equation

(1.5)  $f(u) = \theta_1$.

Obviously, $f$ is not a linear operator. It can be easily seen that the first order Fréchet derivative of $f$ has the following form [4]:

(1.6)  $f'(u_0)h = (Th_1 - \lambda_0 h_1 - \lambda_1 x_0,\; Gh_1)$,

where $u_0 = (x_0, \lambda_0)$ and $h = (h_1, \lambda_1)$. For the second order derivative of $f$ we obtain the expression

(1.7)  $f''(u_0)hk = (-\lambda_2 h_1 - \lambda_1 h_2,\; 0)$,

where $k = (h_2, \lambda_2)$. The Fréchet derivatives of $f$ of order higher than 2 are null.

Considering the above forms of the Fréchet derivatives of $f$, in the following we shall study the convergence of the Chebyshev method for operators whose third order Fréchet derivative is the null operator.

2. THE CONVERGENCE OF THE CHEBYSHEV METHOD

The iterative Chebyshev method for solving equation (1.5) consists in the successive construction of the elements of the sequence $(u_n)_{n \ge 0}$ given by

(2.1)  $u_{n+1} = u_n - \Gamma_n f(u_n) - \tfrac12 \Gamma_n f''(u_n)\bigl(\Gamma_n f(u_n)\bigr)^2$,  $n = 0, 1, \dots$,  $u_0 \in F$,

where $\Gamma_n = [f'(u_n)]^{-1}$.
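To make the operator concrete, the quantities in (1.4), (1.6) and (1.7) can be evaluated directly when $E = \mathbb{R}^p$, $T$ is a $p \times p$ matrix and $Gx = x_{i_0}$. The following sketch is ours, not from the paper; all function names are illustrative:

```python
# Sketch: evaluating f, f'(u0)h and f''(u0)hk from (1.4), (1.6), (1.7)
# for E = R^p, T a p x p matrix (list of lists), and G x = x[i0].
# A point u packs the vector part in u[:-1] and lambda in u[-1].

def f(T, u, i0=0):
    x, lam = u[:-1], u[-1]
    p = len(x)
    top = [sum(T[i][j] * x[j] for j in range(p)) - lam * x[i] for i in range(p)]
    return top + [x[i0] - 1.0]          # (Tx - lam*x, Gx - 1)

def df(T, u0, h, i0=0):
    x0, lam0 = u0[:-1], u0[-1]
    h1, lam1 = h[:-1], h[-1]
    p = len(x0)
    top = [sum(T[i][j] * h1[j] for j in range(p)) - lam0 * h1[i] - lam1 * x0[i]
           for i in range(p)]
    return top + [h1[i0]]               # (Th1 - lam0*h1 - lam1*x0, Gh1)

def d2f(h, k):
    # f'' does not depend on u0: a constant symmetric bilinear operator
    h1, lam1 = h[:-1], h[-1]
    h2, lam2 = k[:-1], k[-1]
    top = [-lam2 * h1[i] - lam1 * h2[i] for i in range(len(h1))]
    return top + [0.0]                  # (-lam2*h1 - lam1*h2, 0)
```

Note that $f''$ is constant and symmetric in its two arguments, and all higher derivatives vanish, which is exactly the setting studied below.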
Let $u_0 \in F$ and let $\delta > 0$, $b > 0$ be two real numbers. Denote $S = \{u \in F : \|u - u_0\| \le \delta\}$. If

$m_2 = \sup_{u \in S} \|f''(u)\|$,

then

$\sup_{u \in S} \|f'(u)\| \le \|f'(u_0)\| + m_2 \delta$,
$\sup_{u \in S} \|f(u)\| \le \|f(u_0)\| + \delta \|f'(u_0)\| + m_2 \delta^2 =: m_0$.

Consider the numbers

(2.2)  $\mu = \tfrac12 m_2^2 b^4 \bigl(1 + \tfrac14 m_2 m_0 b^2\bigr)$,  $\nu = b\bigl(1 + \tfrac12 m_2 m_0 b^2\bigr)$.

With the above notations, the following theorem holds:

THEOREM 1. If the operator $f$ is three times differentiable with $f'''(u) = \theta_3$ for all $u \in S$ ($\theta_3$ being the 3-linear null operator) and if, moreover, there exist $u_0 \in F$, $\delta > 0$, $b > 0$ such that the following relations hold:

i. the operator $f'(u)$ has a bounded inverse for all $u \in S$, and $\|f'(u)^{-1}\| \le b$;

ii. the numbers $\mu$ and $\nu$ given by (2.2) satisfy the relations $\rho_0 = \sqrt{\mu}\,\|f(u_0)\| < 1$ and $\dfrac{\nu \rho_0}{\sqrt{\mu}\,(1 - \rho_0)} \le \delta$,

then the following properties hold:

j. the sequence $(u_n)_{n \ge 0}$ given by (2.1) is convergent;

jj. if $\bar u = \lim_{n \to \infty} u_n$, then $\bar u \in S$ and $f(\bar u) = \theta_1$;

jjj. $\|u_{n+1} - u_n\| \le \dfrac{\nu \rho_0^{3^n}}{\sqrt{\mu}}$,  $n = 0, 1, \dots$;

jv. $\|\bar u - u_n\| \le \dfrac{\nu \rho_0^{3^n}}{\sqrt{\mu}\,\bigl(1 - \rho_0^{3^n}\bigr)}$,  $n = 0, 1, \dots$

Proof. Denote by $g : S \to F$ the following mapping:

(2.3)  $g(u) = -\Gamma(u) f(u) - \tfrac12 \Gamma(u) f''(u) \bigl(\Gamma(u) f(u)\bigr)^2$,

where $\Gamma(u) = f'(u)^{-1}$.
It can be easily seen that for all $u \in S$ the following identity holds:

$f(u) + f'(u) g(u) + \tfrac12 f''(u) g^2(u) = \tfrac12 f''(u)\bigl[f'(u)^{-1} f(u),\; f'(u)^{-1} f''(u)\bigl(f'(u)^{-1} f(u)\bigr)^2\bigr] + \tfrac18 f''(u)\bigl[f'(u)^{-1} f''(u)\bigl(f'(u)^{-1} f(u)\bigr)^2\bigr]^2$,

whence we obtain

(2.4)  $\bigl\|f(u) + f'(u) g(u) + \tfrac12 f''(u) g^2(u)\bigr\| \le \tfrac12 m_2^2 b^4 \bigl(1 + \tfrac14 m_0 m_2 b^2\bigr) \|f(u)\|^3$,

or

(2.5)  $\bigl\|f(u) + f'(u) g(u) + \tfrac12 f''(u) g^2(u)\bigr\| \le \mu \|f(u)\|^3$,  for all $u \in S$.

Similarly, by (2.3) and taking into account the notations we made, we get

(2.6)  $\|g(u)\| \le \nu \|f(u)\|$,  for all $u \in S$.

Using the hypotheses of the theorem, the inequality (2.5) and the fact that $f'''(u) = \theta_3$, we obtain the following inequality:

$\|f(u_1)\| \le \bigl\|f(u_1) - f(u_0) - f'(u_0) g(u_0) - \tfrac12 f''(u_0) g^2(u_0)\bigr\| + \bigl\|f(u_0) + f'(u_0) g(u_0) + \tfrac12 f''(u_0) g^2(u_0)\bigr\| \le \mu \|f(u_0)\|^3$,

the first norm on the right-hand side vanishing, since $u_1 - u_0 = g(u_0)$ and the second order Taylor expansion of $f$ at $u_0$ is exact.

Since $u_1 - u_0 = g(u_0)$, by (2.6) we have

$\|u_1 - u_0\| \le \nu \|f(u_0)\| = \dfrac{\nu}{\sqrt{\mu}}\,\sqrt{\mu}\,\|f(u_0)\| = \dfrac{\nu \rho_0}{\sqrt{\mu}} < \dfrac{\nu \rho_0}{\sqrt{\mu}\,(1 - \rho_0)} \le \delta$,

whence it follows that $u_1 \in S$. Suppose now that the following properties hold:

a) $u_i \in S$, $i = 0, \dots, k$;

b) $\|f(u_i)\| \le \mu \|f(u_{i-1})\|^3$, $i = 1, \dots, k$.

By the fact that $u_k \in S$, using (2.5) it follows that

(2.7)  $\|f(u_{k+1})\| \le \mu \|f(u_k)\|^3$,

and from the relation $u_{k+1} - u_k = g(u_k)$,

(2.8)  $\|u_{k+1} - u_k\| \le \nu \|f(u_k)\|$.

The inequalities b) and (2.7) lead us to

(2.9)  $\|f(u_i)\| \le \dfrac{1}{\sqrt{\mu}} \bigl(\sqrt{\mu}\,\|f(u_0)\|\bigr)^{3^i} = \dfrac{\rho_0^{3^i}}{\sqrt{\mu}}$,  $i = 1, \dots, k+1$.
We have that $u_{k+1} \in S$:

(2.10)  $\|u_{k+1} - u_0\| \le \sum_{i=1}^{k+1} \|u_i - u_{i-1}\| \le \nu \sum_{i=1}^{k+1} \|f(u_{i-1})\| \le \dfrac{\nu}{\sqrt{\mu}} \sum_{i=1}^{k+1} \rho_0^{3^{i-1}} \le \dfrac{\nu \rho_0}{\sqrt{\mu}\,(1 - \rho_0)} \le \delta$.

Now we shall prove that the sequence $(u_n)_{n \ge 0}$ is Cauchy. Indeed, for all $m, n \in \mathbb{N}$ we have

(2.11)  $\|u_{n+m} - u_n\| \le \sum_{i=0}^{m-1} \|u_{n+i+1} - u_{n+i}\| \le \nu \sum_{i=0}^{m-1} \|f(u_{n+i})\| \le \dfrac{\nu}{\sqrt{\mu}} \sum_{i=0}^{m-1} \rho_0^{3^{n+i}} = \dfrac{\nu}{\sqrt{\mu}}\, \rho_0^{3^n} \sum_{i=0}^{m-1} \rho_0^{3^{n+i} - 3^n} \le \dfrac{\nu \rho_0^{3^n}}{\sqrt{\mu}\,\bigl(1 - \rho_0^{3^n}\bigr)}$,

whence, taking into account that $\rho_0 < 1$, it follows that $(u_n)_{n \ge 0}$ converges. Let $\bar u = \lim_{n \to \infty} u_n$. By (2.10), $\bar u \in S$, and by (2.9) and the continuity of $f$ we get $f(\bar u) = \theta_1$, i.e. jj. Letting $m \to \infty$ in (2.11), jv follows. The property jjj follows from (2.8) and (2.9). □

3. THE APPROXIMATION OF THE EIGENVALUES AND EIGENVECTORS OF MATRICES

In the following we shall apply the previously studied method to the approximation of the eigenvalues and eigenvectors of matrices with real or complex elements. Let $p \in \mathbb{N}$ and the matrix $A = (a_{ij})_{i,j=1,\dots,p}$, where $a_{ij} \in \mathbb{K}$. Using the above notations, we consider $E = \mathbb{K}^p$ and $F = \mathbb{K}^p \times \mathbb{K}$. Any solution of the equation

(3.1)  $f(x, \lambda) := (Ax - \lambda x,\; x_{i_0} - 1) = (\theta, 0)$,  $i_0 \in \{1, \dots, p\}$ being fixed,

where $x = (x_1, \dots, x_p) \in \mathbb{K}^p$ and $\theta = (0, \dots, 0) \in \mathbb{K}^p$, will lead us to an eigenvalue of $A$ and to a corresponding eigenvector. For the sake of simplicity denote $\lambda = x_{p+1}$, so that equation (3.1) is represented by the
system

(3.2)  $f_i(x_1, \dots, x_p, x_{p+1}) = 0$,  $i = 1, \dots, p+1$,

where

(3.3)  $f_i(x_1, \dots, x_{p+1}) = a_{i1} x_1 + \dots + (a_{ii} - x_{p+1}) x_i + \dots + a_{ip} x_p$,  $i = 1, \dots, p$,

and

(3.4)  $f_{p+1}(x) = x_{i_0} - 1$.

Denote by $P$ the mapping $P : \mathbb{K}^{p+1} \to \mathbb{K}^{p+1}$ defined by the relations (3.3) and (3.4). Let $x^n = (x_1^n, \dots, x_{p+1}^n) \in \mathbb{K}^{p+1}$. Then the first order Fréchet derivative of the operator $P$ at $x^n$ has the following form:

(3.5)  $P'(x^n) h = \begin{pmatrix} a_{11} - x_{p+1}^n & a_{12} & \cdots & a_{1 i_0} & \cdots & a_{1p} & -x_1^n \\ a_{21} & a_{22} - x_{p+1}^n & \cdots & a_{2 i_0} & \cdots & a_{2p} & -x_2^n \\ \vdots & & & & & & \vdots \\ a_{p1} & a_{p2} & \cdots & a_{p i_0} & \cdots & a_{pp} - x_{p+1}^n & -x_p^n \\ 0 & 0 & \cdots & 1 & \cdots & 0 & 0 \end{pmatrix} \begin{pmatrix} h_1 \\ h_2 \\ \vdots \\ h_p \\ h_{p+1} \end{pmatrix}$,

where $h = (h_1, \dots, h_{p+1}) \in \mathbb{K}^{p+1}$ and the entry 1 of the last row lies in column $i_0$. If we write $k = (k_1, \dots, k_{p+1}) \in \mathbb{K}^{p+1}$, then for the second order Fréchet derivative we get

(3.6)  $P''(x^n) k h = \begin{pmatrix} -k_{p+1} & 0 & \cdots & 0 & -k_1 \\ 0 & -k_{p+1} & \cdots & 0 & -k_2 \\ \vdots & & & & \vdots \\ 0 & 0 & \cdots & -k_{p+1} & -k_p \\ 0 & 0 & \cdots & 0 & 0 \end{pmatrix} \begin{pmatrix} h_1 \\ h_2 \\ \vdots \\ h_p \\ h_{p+1} \end{pmatrix}$.

Denote by $\Gamma(x^n)$ the inverse of the matrix attached to the operator $P'(x^n)$, and $u_n = \Gamma(x^n) P(x^n) = (u_1^n, u_2^n, \dots, u_{p+1}^n)$. Let $v_n = P''(x^n) \bigl(\Gamma(x^n) P(x^n)\bigr)^2 = P''(x^n) u_n^2$. We obtain the following representation:

(3.7)  $v_n = P''(x^n) u_n^2 = \begin{pmatrix} -u_{p+1}^n & 0 & \cdots & 0 & -u_1^n \\ 0 & -u_{p+1}^n & \cdots & 0 & -u_2^n \\ \vdots & & & & \vdots \\ 0 & 0 & \cdots & -u_{p+1}^n & -u_p^n \\ 0 & 0 & \cdots & 0 & 0 \end{pmatrix} \begin{pmatrix} u_1^n \\ u_2^n \\ \vdots \\ u_p^n \\ u_{p+1}^n \end{pmatrix}$.
From this relation we can easily deduce the equalities

(3.8)  $v_i^n = -2 u_{p+1}^n u_i^n$,  $i = 1, \dots, p$;  $v_{p+1}^n = 0$.

Denoting $w_n = (w_1^n, w_2^n, \dots, w_{p+1}^n) = \Gamma(x^n) v_n$ and supposing that $x^n$ is an approximation of the solution of system (3.2), the next approximation $x^{n+1}$ given by the method (2.1) is obtained by

(3.9)  $x^{n+1} = x^n - u_n - \tfrac12 w_n$,  $n = 0, 1, \dots$

Consider $\mathbb{K}^p$ with the norm of an element $x = (x_1, \dots, x_p)$ given by the equality

(3.10)  $\|x\| = \max_{1 \le i \le p} |x_i|$,

and consequently

(3.11)  $\|A\| = \max_{1 \le i \le p} \sum_{j=1}^{p} |a_{ij}|$.

It can be easily seen that $\|P''(x^n)\| = 2$ for all $x^n \in \mathbb{K}^{p+1}$.

Let $x^0 \in \mathbb{K}^{p+1}$ be an initial approximation of the solution of system (3.2). Consider a real number $r > 0$ and the set $S = \{x \in \mathbb{K}^{p+1} : \|x - x^0\| \le r\}$. Denote

$m_0 = \|P(x^0)\| + r \|P'(x^0)\| + 2 r^2$,
$\mu = 2 b^4 \bigl(1 + \tfrac12 m_0 b^2\bigr)$,
$\nu = b \bigl(1 + m_0 b^2\bigr)$,

where $b = \sup_{x \in S} \|\Gamma(x)\|$, $\Gamma(x)$ being the inverse of the matrix attached to the operator $P'(x)$. Taking into account the results obtained in Section 2, the following consequence holds:

COROLLARY 2. If $x^0 \in \mathbb{K}^{p+1}$ and $r \in \mathbb{R}$, $r > 0$, are chosen such that the matrix attached to the operator $P'(x)$ is nonsingular for all $x \in S$, and the following inequalities hold:

$\rho_0 = \sqrt{\mu}\,\|P(x^0)\| < 1$,  $\dfrac{\nu \rho_0}{\sqrt{\mu}\,(1 - \rho_0)} \le r$,

then the following properties are true:
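One step of (3.9), with the Jacobian (3.5) and the shortcut (3.8), can be sketched in code. The following is our own illustration, not from the paper: the function names are invented, and the linear systems are solved by a naive Gaussian elimination with partial pivoting rather than the economical scheme discussed in Section 4.

```python
# Sketch of one Chebyshev step (3.9) for the eigenproblem P(x) = 0.
# x packs the eigenvector approximation in x[:p] and lambda in x[p].

def solve(M, b):
    # naive Gaussian elimination with partial pivoting; returns M^{-1} b
    n = len(b)
    M = [row[:] for row in M]; b = b[:]
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[piv] = M[piv], M[k]; b[k], b[piv] = b[piv], b[k]
        for i in range(k + 1, n):
            alpha = -M[i][k] / M[k][k]
            for j in range(k, n):
                M[i][j] += alpha * M[k][j]
            b[i] += alpha * b[k]
    y = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (b[i] - s) / M[i][i]
    return y

def chebyshev_step(A, x, i0=0):
    p = len(A)
    lam = x[p]
    # P(x) from (3.2)-(3.4)
    Px = [sum(A[i][j] * x[j] for j in range(p)) - lam * x[i] for i in range(p)]
    Px.append(x[i0] - 1.0)
    # Jacobian P'(x) from (3.5)
    J = [[A[i][j] - (lam if i == j else 0.0) for j in range(p)] + [-x[i]]
         for i in range(p)]
    J.append([1.0 if j == i0 else 0.0 for j in range(p)] + [0.0])
    u = solve(J, Px)                                       # u_n = Gamma(x) P(x)
    v = [-2.0 * u[p] * u[i] for i in range(p)] + [0.0]     # (3.8)
    w = solve(J, v)                                        # w_n = Gamma(x) v_n
    return [x[i] - u[i] - 0.5 * w[i] for i in range(p + 1)]  # (3.9)
```

Started close enough to an eigenpair, iterating `chebyshev_step` exhibits the cubic convergence established by Corollary 2.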
j$_1$. the sequence $(x^n)_{n \ge 0}$ generated by (3.9) is convergent;

jj$_1$. if $\bar x = \lim_{n \to \infty} x^n$, then $P(\bar x) = \theta_1 = (0, \dots, 0) \in \mathbb{K}^{p+1}$;

jjj$_1$. $\|x^{n+1} - x^n\| \le \dfrac{\nu \rho_0^{3^n}}{\sqrt{\mu}}$,  $n = 0, 1, \dots$;

jv$_1$. $\|\bar x - x^n\| \le \dfrac{\nu \rho_0^{3^n}}{\sqrt{\mu}\,\bigl(1 - \rho_0^{3^n}\bigr)}$,  $n = 0, 1, \dots$

Remark. If the radius $r$ of the ball $S$ is given, $P'(x^0)^{-1}$ exists and $2 r \|P'(x^0)^{-1}\| < 1$, then

$\|P'(x)^{-1}\| \le \dfrac{\|P'(x^0)^{-1}\|}{1 - 2 r \|P'(x^0)^{-1}\|}$,  for all $x \in S$,

and in the above Corollary, taking into account the proof of Theorem 1, we can take

$b = \dfrac{\|P'(x^0)^{-1}\|}{1 - 2 r \|P'(x^0)^{-1}\|}$.

4. A COMPARISON TO THE NEWTON METHOD

Note that if in (3.9) we neglect the term $\tfrac12 w_n$, then we get the Newton method:

$x^{n+1} = x^n - u_n = x^n - \Gamma(x^n) P(x^n)$,  $n = 0, 1, \dots$

In order to compare the Newton method and the Chebyshev method, we shall consider that the linear systems appearing in both methods are solved by the Gauss method. While the Newton method requires at each step the solution of one linear system $Ax = b$, the Chebyshev method requires the solution of two linear systems $Ax = b$, $Ay = c$ with the same matrix $A$, the vector $c$ depending on the solution $x$ of the first system. So we shall adapt the Gauss method in order to perform as few multiplications and divisions as possible. When comparing the two methods we shall neglect the number of additions and subtractions.
The solution of a given linear system $Ax = b$, $A \in \mathcal{M}_m(\mathbb{K})$, $b, x \in \mathbb{K}^m$ (where we have denoted $m = p + 1$), by the Gauss method consists in two stages. In the first stage, the given system is transformed into an equivalent one whose coefficient matrix is upper triangular. In the second stage the unknowns $(x_i)_{i=1,\dots,m}$ are determined by backward substitution.

The first stage. There are performed $m - 1$ steps; at each step the elements of one column below the main diagonal are annihilated. We write the initial system in the form

$\begin{pmatrix} a_{11}^1 & \cdots & a_{1m}^1 \\ \vdots & & \vdots \\ a_{m1}^1 & \cdots & a_{mm}^1 \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_m \end{pmatrix} = \begin{pmatrix} b_1^1 \\ \vdots \\ b_m^1 \end{pmatrix}$.

Suppose that $a_{11}^1 \ne 0$, $a_{11}^1$ being called the first pivot. The first line of the system is multiplied by $\alpha = -a_{21}^1 / a_{11}^1$ and added to the second one, which becomes $0, a_{22}^2, a_{23}^2, \dots, a_{2m}^2, b_2^2$ after performing $m + 1$ multiplication or division (M/D) operations. After $m - 1$ such transformations the system becomes

$\begin{pmatrix} a_{11}^1 & a_{12}^1 & \cdots & a_{1m}^1 \\ 0 & a_{22}^2 & \cdots & a_{2m}^2 \\ \vdots & \vdots & & \vdots \\ 0 & a_{m2}^2 & \cdots & a_{mm}^2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{pmatrix} = \begin{pmatrix} b_1^1 \\ b_2^2 \\ \vdots \\ b_m^2 \end{pmatrix}$.

Hence at the first step there were performed $(m-1)(m+1)$ M/D operations. In the same manner, at the $k$-th step we have the system

$\begin{pmatrix} a_{11}^1 & a_{12}^1 & \cdots & a_{1k}^1 & a_{1,k+1}^1 & \cdots & a_{1m}^1 \\ 0 & a_{22}^2 & \cdots & a_{2k}^2 & a_{2,k+1}^2 & \cdots & a_{2m}^2 \\ \vdots & & & & & & \vdots \\ 0 & 0 & \cdots & a_{kk}^k & a_{k,k+1}^k & \cdots & a_{km}^k \\ 0 & 0 & \cdots & a_{k+1,k}^k & a_{k+1,k+1}^k & \cdots & a_{k+1,m}^k \\ \vdots & & & & & & \vdots \\ 0 & 0 & \cdots & a_{mk}^k & a_{m,k+1}^k & \cdots & a_{mm}^k \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_k \\ x_{k+1} \\ \vdots \\ x_m \end{pmatrix} = \begin{pmatrix} b_1^1 \\ b_2^2 \\ \vdots \\ b_k^k \\ b_{k+1}^k \\ \vdots \\ b_m^k \end{pmatrix}$.
Supposing the $k$-th pivot $a_{kk}^k \ne 0$ and performing $(m-k)(m-k+2)$ M/D operations, we get

$\begin{pmatrix} a_{11}^1 & a_{12}^1 & \cdots & a_{1k}^1 & a_{1,k+1}^1 & \cdots & a_{1m}^1 \\ 0 & a_{22}^2 & \cdots & a_{2k}^2 & a_{2,k+1}^2 & \cdots & a_{2m}^2 \\ \vdots & & & & & & \vdots \\ 0 & 0 & \cdots & a_{kk}^k & a_{k,k+1}^k & \cdots & a_{km}^k \\ 0 & 0 & \cdots & 0 & a_{k+1,k+1}^{k+1} & \cdots & a_{k+1,m}^{k+1} \\ \vdots & & & & & & \vdots \\ 0 & 0 & \cdots & 0 & a_{m,k+1}^{k+1} & \cdots & a_{mm}^{k+1} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_k \\ x_{k+1} \\ \vdots \\ x_m \end{pmatrix} = \begin{pmatrix} b_1^1 \\ b_2^2 \\ \vdots \\ b_k^k \\ b_{k+1}^{k+1} \\ \vdots \\ b_m^{k+1} \end{pmatrix}$.

At each step $k$, the elements below the $k$-th pivot vanish, so they are no longer needed in the solution of the system. The corresponding memory locations are used to store the coefficients

$-\dfrac{a_{k+1,k}^k}{a_{kk}^k}, \dots, -\dfrac{a_{mk}^k}{a_{kk}^k}$,

which, of course, will be needed only for solving another system $Ay = c$, with $c$ depending on $x$, the solution of $Ax = b$. At the first stage there are performed

$(m-1)(m+1) + \dots + 1 \cdot 3 = \dfrac{2m^3 + 3m^2 - 5m}{6}$  M/D operations.

The second stage. Given the system

$\begin{pmatrix} a_{11}^1 & a_{12}^1 & \cdots & a_{1m}^1 \\ 0 & a_{22}^2 & \cdots & a_{2m}^2 \\ \vdots & & & \vdots \\ 0 & 0 & \cdots & a_{mm}^m \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{pmatrix} = \begin{pmatrix} b_1^1 \\ b_2^2 \\ \vdots \\ b_m^m \end{pmatrix}$,

the solution $x$ is computed in the following way:

$x_m = b_m^m / a_{mm}^m$,
$x_k = \bigl(b_k^k - (a_{k,k+1}^k x_{k+1} + \dots + a_{km}^k x_m)\bigr) / a_{kk}^k$,
$x_1 = \bigl(b_1^1 - (a_{12}^1 x_2 + \dots + a_{1m}^1 x_m)\bigr) / a_{11}^1$.
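The closed-form count of the first stage can be verified by summing the per-step costs $(m-k)(m-k+2)$; a quick check (our own sketch, not from the paper):

```python
# Verify: sum_{k=1}^{m-1} (m-k)(m-k+2) = (2m^3 + 3m^2 - 5m)/6,
# the M/D operation count of the triangularization stage.
def first_stage_ops(m):
    return sum((m - k) * (m - k + 2) for k in range(1, m))

assert all(first_stage_ops(m) == (2*m**3 + 3*m**2 - 5*m) // 6
           for m in range(2, 60))
```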
At this stage there are performed $1 + 2 + \dots + m = \dfrac{m(m+1)}{2}$ M/D operations. In the two stages there are performed altogether

$\dfrac{m^3}{3} + m^2 - \dfrac{m}{3}$  M/D operations.

In the case when we solve the systems $Ax = b$, $Ay = c$, where the vector $c$ depends on the solution $x$, we first apply the Gauss method to the system $Ax = b$, and at the first stage we keep below the main diagonal the coefficients by which the pivot lines were multiplied. Then we apply to the vector $c$ the transformations performed on the vector $b$ when solving $Ax = b$. Denote $c = (c_i^1)_{i=1,\dots,m}$ and denote by $a_{ik}$ the stored coefficients.

At the first step:
$c_2^2 := a_{21} c_1^1 + c_2^1, \;\dots,\; c_m^2 := a_{m1} c_1^1 + c_m^1$.

At the $k$-th step:
$c_{k+1}^{k+1} := a_{k+1,k} c_k^k + c_{k+1}^k, \;\dots,\; c_m^{k+1} := a_{mk} c_k^k + c_m^k$.

At the last step:
$c_m^m := a_{m,m-1} c_{m-1}^{m-1} + c_m^{m-1}$.

There were performed $(m-1) + (m-2) + \dots + 1 = \dfrac{m(m-1)}{2}$ M/D operations. Now the second stage of the Gauss method is applied to

$\begin{pmatrix} a_{11}^1 & \cdots & a_{1m}^1 \\ & \ddots & \vdots \\ 0 & & a_{mm}^m \end{pmatrix} \begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix} = \begin{pmatrix} c_1^1 \\ \vdots \\ c_m^m \end{pmatrix}$.

In addition to the case of a single linear system, there were performed $\dfrac{m(m-1)}{2}$ M/D operations, getting

$\dfrac{m^3}{3} + \dfrac{3}{2} m^2 - \dfrac{5}{6} m$  M/D operations,

and taking into account (3.8) we add $m - 1$ more M/D operations, obtaining

$\dfrac{m^3}{3} + \dfrac{3}{2} m^2 + \dfrac{m}{6} - 1$  M/D operations.
Remark. At the first stage, if for some $k$ we have $a_{kk}^k = 0$, then an element $a_{i_0 k}^k \ne 0$, $i_0 \in \{k+1, \dots, m\}$, must be found, and the lines $i_0$ and $k$ in $A$ and $b$ swapped. In order to avoid the accumulation of errors, a partial or total pivoting strategy is recommended even if $a_{kk}^k \ne 0$ for $k = 1, \dots, m-1$. For partial pivoting, the pivot is chosen such that

$|a_{i_0 k}^k| = \max_{i = k, \dots, m} |a_{ik}^k|$.

The interchange of lines can be avoided by using a permutation vector $p = (p_i)_{i=1,\dots,m}$, which is first initialized so that $p_i = i$. The elements of $A$ and $b$ are then referred to as $a_{ij} := a_{p(i),j}$ and $b_i := b_{p(i)}$, and swapping the lines $k$ and $i_0$ is done by swapping the $k$-th and $i_0$-th elements of $p$.

For the Chebyshev method, the use of the vector $p$ cannot be avoided by actually interchanging the lines, because we must keep track of the permutations made, in order to apply them in the same order to the vector $c$. Moreover, we need two extra vectors $t$ and $q$. In $t$ we store the transpositions applied to the lines of $Ax = b$, which are then successively applied to $q$. At the first stage of the Gauss method, when the $k$-th pivot is $a_{i_0 k}^k$ and $i_0 \ne k$, the $k$-th and $i_0$-th elements of $p$ are swapped, and we assign $t_k := i_0$ to indicate that at the $k$-th step we applied to $p$ the transposition $(k, i_0)$.

After computing the solution of $Ax = b$, we initialize the vector $c$ by (3.7), the permutation vector $q$ by $q_i := i$, $i = 1, \dots, m$, and then we successively apply the transformations operated on $b$, taking into account the eventual transpositions. The algorithm for solving the second linear system in the Chebyshev method is as follows:

for k := 1 to m-1 do
begin
  if t[k] <> k then            {at the k-th step the transposition}
  begin                        {(k, t[k]) has been applied to p}
    auxi := q[k]; q[k] := q[t[k]]; q[t[k]] := auxi;
  end;
  for i := k+1 to m do
    c[q[i]] := c[q[i]] + a[q[i],k]*c[q[k]]
end;
for i := m downto 1 do         {the solution y is now computed}
begin
  sum := 0;
  for j := i+1 to m do
    sum := sum + a[p[i],j]*y[j];   {now p = q}
  y[i] := (c[p[i]] - sum)/a[p[i],i];
end;

We adopt as the efficiency measure of an iterative method $M$ the number $E(M) = \dfrac{\ln q}{s}$, where $q$ is the convergence order and $s$ is the number of M/D operations needed at each step. We obtain

$E(N) = \dfrac{3 \ln 2}{m^3 + 3m^2 - m}$

for the Newton method and

$E(C) = \dfrac{6 \ln 3}{2m^3 + 9m^2 + m - 6}$

for the Chebyshev method. It can be easily seen that we have $E(C) > E(N)$ for $m \ge 2$, i.e. the Chebyshev method is more efficient than the Newton method.

5. NUMERICAL EXAMPLE

Consider the real matrix

$A = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix}$,

which has the following eigenvalues and eigenvectors:

$\lambda_{1,2,3} = 2$,  $x_1 = (1, 1, 0, 0)$, $x_2 = (1, 0, 1, 0)$, $x_3 = (1, 0, 0, 1)$,

and

$\lambda_4 = -2$,  $x_4 = (1, -1, -1, -1)$.

Taking the initial value $x^0 = (1, -1.5, -2, -1.5, -1)$ and applying the two methods, we obtain the following results:
Newton method

n   x1    x2              x3              x4              x5 = lambda
0   1.0   -1.5000000000   -2.0000000000   -1.5000000000   -1.0000000000
1   1.0   -0.9000000000   -0.8000000000   -0.9000000000   -1.6000000000
2   1.0   -1.0125000000   -1.0250000000   -1.0125000000   -2.0500000000
3   1.0   -1.0001524390   -1.0003048780   -1.0001524390   -2.0006097561
4   1.0   -1.0000000232   -1.0000000465   -1.0000000232   -2.0000000929
5   1.0   -1.0000000000   -1.0000000000   -1.0000000000   -2.0000000000

Chebyshev method

n   x1    x2              x3              x4              x5 = lambda
0   1.0   -1.5000000000   -2.0000000000   -1.5000000000   -1.0000000000
1   1.0   -0.9720000000   -0.9440000000   -0.9720000000   -1.8880000000
2   1.0   -0.99995000189  -0.99990000377  -0.99995000189  -1.9998000075
3   1.0   -1.0000000000   -1.0000000000   -1.0000000000   -2.0000000000

REFERENCES

[1] P. M. Anselone, L. B. Rall, The solution of characteristic value-vector problems by Newton's method, Numer. Math. 11 (1968), pp. 38-45.

[2] P. G. Ciarlet, Introduction à l'analyse numérique matricielle et à l'optimisation, Masson, Paris, 1990.

[3] F. Chatelin, Valeurs propres de matrices, Masson, Paris, 1988.

[4] L. Collatz, Funktionalanalysis und Numerische Mathematik, Springer-Verlag, Berlin, 1964.

[5] V. S. Kartîşov, F. L. Iuhno, O nekotorîh modifikaţiah metoda Niutona dlea reşenia nelineinoi spektralnoi zadaci, J. Vîcisl. matem. i matem. fiz. (33) (1973) 9, pp. 1403-1409.

[6] I. Păvăloiu, Sur les procédés itératifs à un ordre élevé de convergence, Mathematica (Cluj) 12 (35) (1970) 2, pp. 309-324.

[7] J. F. Traub, Iterative Methods for the Solution of Equations, Prentice-Hall Inc., Englewood Cliffs, N.J., 1964.

Received March 7, 1996

"T. Popoviciu" Institute of Numerical Analysis
Romanian Academy
P.O. Box 68
Cluj-Napoca 1
RO-3400 Romania