An iterative least-square method suitable for solving large sparse matrices

By I. M. Khabaza

Downloaded from https://academic.oup.com/comjnl/article-abstract/6/2/202/364798 on 24 December 2017

The purpose of this paper is to report on the results of numerical experiments with an iterative least-square method suitable for solving linear systems of equations with large sparse matrices. The method of this paper gives satisfactory results for a wide variety of matrices and is not restricted to real or symmetric or definite matrices.

We consider a system of n equations:

$$Ax = b. \tag{1}$$

As a library program the method requires from the user an auxiliary sequence which, given x, produces y = Ax. Thus no knowledge of the elements of A is required. A need not be a matrix; it could be any linear operator.

Given an approximate solution x we calculate the residual vector r:

$$r = b - Ax. \tag{2}$$

We seek a better approximation x' of the form

$$x' = x + f(A)r \tag{3}$$

where f(A) is a matrix polynomial of the form

$$f(A) = c_1 I + c_2 A + c_3 A^2 + \dots + c_m A^{m-1}. \tag{4}$$

The true correction in (3) is $A^{-1}r$, and from the Cayley-Hamilton theorem, $A^{-1}$ may be represented as a polynomial in A of degree n - 1 in general. Thus f(A)r is an approximation to $A^{-1}r$, but f(A) is of order m - 1, where m is a small integer, about 3, whereas n, the dimension of the vector x, may be large. The polynomial f is determined by the vector c of dimension m:

$$c = (c_1, c_2, \dots, c_m)^T. \tag{5}$$

The superfix T denotes transposition, so that c is a column vector. We define the polynomial g(λ) as follows:

$$g(\lambda) = 1 - \lambda f(\lambda) = 1 - c_1\lambda - c_2\lambda^2 - \dots - c_m\lambda^m. \tag{6}$$

The coefficients $c_1, c_2, \dots, c_m$ are determined by the method of least-squares so that $\|g(A)r\|$ is a minimum; the double strokes here denote the usual norm, i.e. the square root of the sum of the squares of the moduli of the components of a vector. We take, as an example, m = 3. We define $r_1 = Ar$, $r_2 = Ar_1$, $r_3 = Ar_2$. Then we determine $c_1, c_2, c_3$ from the system of equations

$$\begin{aligned}
a_{11}c_1 + a_{12}c_2 + a_{13}c_3 &= b_1 \\
a_{21}c_1 + a_{22}c_2 + a_{23}c_3 &= b_2 \\
a_{31}c_1 + a_{32}c_2 + a_{33}c_3 &= b_3.
\end{aligned} \tag{7}$$

Here $a_{ij} = r_i^T r_j$ and $b_i = r_i^T r$.

Let the eigensystem of the matrix A be

$$a = \lambda_1, \lambda_2, \dots, \lambda_n = b; \qquad \xi_1, \xi_2, \dots, \xi_n,$$

where $\xi_i$ is the eigenvector corresponding to the eigenvalue $\lambda_i$. In general we can express the residual vector r of equation (2) in the form

$$r = \alpha_1\xi_1 + \alpha_2\xi_2 + \dots + \alpha_n\xi_n \tag{8}$$

where $\alpha_i$ is the component of r along $\xi_i$. If we apply the iteration (3) s times the residual becomes

$$\alpha_1 g_1^s \xi_1 + \alpha_2 g_2^s \xi_2 + \dots + \alpha_n g_n^s \xi_n \tag{9}$$

where $g_i = g(\lambda_i)$. Thus convergence occurs only if $|g_i| < 1$ for all i. The graph of g(λ) in the range (a, b) will indicate the rate of convergence. The iteration (3) will rapidly eliminate the components corresponding to roots where $g(\lambda) \approx 0$; it will blow up components corresponding to roots where $|g(\lambda)| > 1$. We note that g(0) = 1.

The program

We give here a description of the program developed for testing the method of this paper. Starting with an approximation x we calculate the residual r; if no approximation is known we can take x = 0. To obtain the coefficients $c_1, c_2, \dots, c_m$ of equation (4) we require m multiplications by the matrix A and an extra multiplication to find the new residual r'; the new approximation x' is determined by using the same multiplications by A used for determining c. For each further approximation, obtained by iterating with a previously calculated set of coefficients, we require m - 1 multiplications by A to calculate x' and a further multiplication to calculate r'. Let

$$v = \|r\|; \qquad v' = \|r'\|. \tag{10}$$

After each iteration we compare v with v'. Iteration with the same set c is continued as long as v' < Cv, where C is a convergence control parameter; otherwise a new set of coefficients is calculated. The parameter C is given a small value such as C = 0.2 if we expect rapid convergence; for slow convergence we take C = 0.8 or 0.9. If v' > v, as may happen if g(A) blows up some components of the residual, it may be thought that it would be better to discard the latest approximation x' and its residual r' and calculate a new set c from r. In fact this turned out to be wrong. This phenomenon is similar to over-relaxation. Indeed it often paid to iterate once more although the residual is increasing. The program contains further parameters, D and F (e.g. D = 2, F = 10), which determine when to stop iterating when the residual is increasing (e.g. when $v' > Dv_0$, where $v_0$ is the smallest previous residual), and when to reject the latest approximation (e.g. when $v' > Fv_0$). The required solution is obtained when v' < E, where E is another parameter (e.g. E = 0.000001). Finally, another parameter specifies the value of m, which determines the dimension of c, or the order of the polynomial g(λ) and of the system of equations (7). Various small values of m were tried; in most cases m = 3 seemed to be the most suitable value.

Numerical examples

We give the numerical results for five examples. In each of the first four examples the initial approximation x was taken to be 0; a known approximation, however rough, would of course be better. To measure the efficiency of the method we count the number of multiplications by A, and this number is compared with the corresponding number required to obtain the same accuracy by existing methods. The existing methods tried were the method of Gauss-Seidel with over-relaxation (see Martin and Tee, 1961, equations 2.5 and 2.12) and the method of Conjugate Gradients (op. cit., equations 7.10 and 7.11).

In the first three examples the right-hand side b of equation (1) was taken to be $e = (1, 1, \dots, 1)^T$; n = 20; A = [1, W], the codiagonal matrix for which $A_{ii} = 1$, $A_{i,i-1} = A_{i,i+1} = W$, otherwise $A_{ij} = 0$. We considered the values W = 0.25, 0.5, 0.6, respectively. The roots $\lambda_i$ of A are given by the formula

$$\lambda_i = 1 + 2W\cos\left(\frac{i\pi}{n+1}\right).$$

This result is used to determine the range (a, b) of the roots of A and to draw the graph of g(λ) in this range. In each example we give a table.
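The iteration (3), the least-squares system (7) and the control by the parameter C described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the Mercury program: the names `solve`, `solve_small`, `codiagonal`, `apply_a` and `max_mults` are hypothetical, the arithmetic is double precision rather than single length, and the parameters D and F are omitted, so every new approximation is accepted.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def solve_small(a, b):
    """Solve the small m-by-m system (7) by elimination with partial pivoting."""
    m = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(m)]
    for k in range(m):
        p = max(range(k, m), key=lambda i: abs(aug[i][k]))
        aug[k], aug[p] = aug[p], aug[k]
        for i in range(k + 1, m):
            f = aug[i][k] / aug[k][k]
            for j in range(k, m + 1):
                aug[i][j] -= f * aug[k][j]
    c = [0.0] * m
    for i in range(m - 1, -1, -1):
        s = sum(aug[i][j] * c[j] for j in range(i + 1, m))
        c[i] = (aug[i][m] - s) / aug[i][i]
    return c

def solve(apply_a, b, m=3, C=0.2, E=1e-6, max_mults=500):
    """Iterate x' = x + f(A) r, computing a new set c whenever v' >= C * v."""
    x = [0.0] * len(b)           # no approximation known: start from x = 0
    r = list(b)                  # residual r = b - A x for x = 0
    v = norm(r)
    c = None                     # no coefficient set calculated yet
    mults = 0                    # number of multiplications by A
    while v >= E and mults < max_mults:
        fresh = c is None
        powers = [r]             # r, A r, A^2 r, ...
        for _ in range(m if fresh else m - 1):
            powers.append(apply_a(powers[-1]))
            mults += 1
        if fresh:
            rs = powers[1:]      # r_1, ..., r_m
            a = [[dot(ri, rj) for rj in rs] for ri in rs]
            rhs = [dot(ri, r) for ri in rs]
            c = solve_small(a, rhs)
        # Equation (3): x' = x + c_1 r + c_2 A r + ... + c_m A^(m-1) r
        x = [xi + sum(c[k] * powers[k][i] for k in range(m))
             for i, xi in enumerate(x)]
        r = [bi - yi for bi, yi in zip(b, apply_a(x))]
        mults += 1               # the extra multiplication giving r'
        v_new = norm(r)
        if v_new >= C * v:
            c = None             # convergence too slow: new set next time
        v = v_new
    return x, v, mults

def codiagonal(w):
    """Return the operator y = A x for A = [1, W] without storing A."""
    def apply_a(x):
        n = len(x)
        return [x[i] + w * ((x[i - 1] if i > 0 else 0.0)
                            + (x[i + 1] if i < n - 1 else 0.0))
                for i in range(n)]
    return apply_a

# First example of the paper: n = 20, W = 0.25, b = e.
x, v, mults = solve(codiagonal(0.25), [1.0] * 20)
print(v, mults)
```

A new set costs m + 1 multiplications and a repeated iteration costs m, as in the description of the program. Because the sketch omits D and F and uses different arithmetic, its multiplication counts need not match those reported for the Mercury program.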
In each table, the c-column gives the coefficients $c_1, c_2, c_3$. The v-column following it gives the values of the residuals v for approximations obtained by repeated iteration with c. The computation was carried out on the Ferranti Mercury Computer of the University of London Computer Unit. This is a machine with a single-length, floating-point accumulator consisting of 10 binary digits for the exponent and 30 binary digits for the argument. All calculations were carried out in single-length arithmetic.

Well-conditioned matrices

We consider the case when W = 0.25. The roots of A lie between 0.5 and 1.5. The graph of g(λ) in Fig. 1 corresponds to the first set c of Table 1. The final solution obtained has eight figure accuracy; this required a total of 14 multiplications by A. To obtain the same accuracy by the method of Conjugate Gradients required 16 multiplications by A, and by the method of over-relaxation required 14 multiplications by A, counting each iteration as equivalent to one multiplication, and using the optimum over-relaxation parameter ω = 1.07. Since in practical problems the optimum ω is not known, this suggests that the method of this paper may be faster.

[Fig. 1: graph of g(λ) for the first set c of Table 1.]

Table 1

     c        v             c        v
   3.08     0.035         3.91     0.000009
  -2.96     0.003        -4.76     0.0000002
   0.90                   1.81

Ill-conditioned matrices

We consider the case W = 0.5; this is the one-dimensional Laplace operator. The roots of A lie between 0.01 and 1.99, giving a condition number of about 200. Fig. 2 gives the graph of g(λ) corresponding to the second and third sets c of Table 2. The graph of the larger set is not drawn to scale; it oscillates between 3 and 82 in the range. The final solution obtained has eight figure accuracy; this required a total of 48 multiplications by A. The same equation was solved by the method of over-relaxation using the optimum over-relaxation parameter ω = 1.74; to get the same accuracy required the equivalent of 67 multiplications by A (counting each iteration as equivalent to one multiplication).
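The ranges of the roots quoted for the codiagonal matrix [1, W] follow from the formula $\lambda_i = 1 + 2W\cos(i\pi/(n+1))$. The short Python check below is only an illustration; the function names `roots` and `condition_number` are hypothetical, and the condition number is taken here as the ratio of the largest to the smallest modulus of the roots, which is an assumption about how the quoted figures were obtained.

```python
import math

def roots(n, w):
    """Roots of the codiagonal matrix [1, W]: 1 + 2W cos(i*pi/(n+1)), i = 1..n."""
    return [1.0 + 2.0 * w * math.cos(i * math.pi / (n + 1))
            for i in range(1, n + 1)]

def condition_number(n, w):
    """Ratio of the largest to the smallest modulus of the roots."""
    moduli = [abs(t) for t in roots(n, w)]
    return max(moduli) / min(moduli)

# The three values of W used in the first three examples, with n = 20.
for w in (0.25, 0.5, 0.6):
    lam = roots(20, w)
    print(w, min(lam), max(lam), condition_number(20, w))
```

Evaluating the formula directly avoids forming the matrix, and it also shows where the origin falls inside the range of the roots, which is the situation in which g(0) = 1 limits how small g(λ) can be made.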
The method of conjugate gradients broke down in the case W = 0.5; this is because, for ill-conditioned matrices, some of the intermediate calculation requires accumulation of double-length products, which is not easy on a single-length accumulator.

We observe that the vectors c fall into two categories, small and large, which occur alternately. We could make use of this "alternate directions" phenomenon to economize on the calculations of new sets. Thus by applying the second and third sets alternately it was possible to achieve the same accuracy, but it required a total of 57 multiplications by A.

[Fig. 2: graph of g(λ) for the second and third sets c of Table 2. Table 2: coefficient sets c and residuals v for W = 0.5. The tabulated values are not recoverable from the source.]

Ill-conditioned non-definite matrices

We consider the case W = 0.6. The roots of A lie between -0.19 and 2.19, and one of the roots is equal to 0.0085, giving a condition number of about 250. Fig. 3 gives the graph of g(λ) corresponding to two consecutive sets c. The final solution has eight figure accuracy; this required a total of 98 iterations. This large number is partly due to the fact that g(λ) cannot be numerically small throughout the range (a, b), since the origin is within the range and g(0) = 1; but it is probably mostly due to the ill-condition of the matrix. The method was also tried for the case W = 10; convergence was a little faster because the matrix is better conditioned. In this case the roots are about equally spread on both sides of the origin; in such cases there is some advantage in taking m even, e.g. m = 4.

[Fig. 3: graph of g(λ) for two consecutive sets c. Table 3: coefficient sets c and residuals v for W = 0.6. The tabulated values are not recoverable from the source.]

The two-dimensional Laplace operator

In this case the components of the vector x correspond to the values x(i, j) of a function of two variables which satisfies the Laplace equation in discrete form at the 81 internal points of a 10 by 10 square mesh. The internal points correspond to i, j = 1, 2, ..., 9. The boundary points correspond to i = 0 or 10, or j = 0 or 10; at these points x was specified by the formula

$$x(i, j) = i^3 - 3ij^2. \tag{11}$$

We denote x(i, j), x(i, j-1), x(i-1, j), x(i, j+1), x(i+1, j) by $x_0, x_1, x_2, x_3, x_4$, respectively. Then equation (1) takes the form

$$x_0 - \tfrac{1}{4}(x_1 + x_2 + x_3 + x_4) = 0, \qquad i, j = 1, 2, \dots, 9. \tag{12}$$

The operation y = Ax is obtained by setting x = 0 at the 40 boundary points and applying the substitution

$$y_0 = x_0 - \tfrac{1}{4}(x_1 + x_2 + x_3 + x_4) \tag{13}$$

at the 81 internal points. The residual r = b - Ax is obtained by first setting the boundary values according to (11), and using the substitution

$$r_0 = \tfrac{1}{4}(x_1 + x_2 + x_3 + x_4) - x_0 \tag{14}$$

at the 81 internal points. Equations (12) are equivalent to the equation Ax = b where A has the form I - L - U (Martin & Tee, 1961, equation 1.2). To speed up convergence (A, b) are replaced by (A', b') where $A' = I - (I - L)^{-1}U$, $b' = (I - L)^{-1}b$. This is done by replacing the substitutions (13) and (14) by the following:

$$u_0 = \tfrac{1}{4}(u_1 + u_2 + x_3 + x_4), \qquad y_0 = x_0 - u_0, \qquad i, j = 1, 2, \dots, 9 \tag{13'}$$

$$u_0 = \tfrac{1}{4}(u_1 + u_2 + x_3 + x_4), \qquad r_0 = u_0 - x_0, \qquad i, j = 1, 2, \dots, 9. \tag{14'}$$

In (13') as in (13) the values of u and x are set to 0 at the boundary points. In (14') as in (14) the values of u and x are set at the boundary points according to equation (11). The values of u obtained in (14') form the solution one would obtain after a single Gauss-Seidel iteration (op. cit., equation 2.2), and $r_0$ is the corresponding displacement. The initial approximation in Table 4 was x = 0.

[Table 4: coefficient sets c and residuals v for the two-dimensional Laplace operator. The tabulated values are not recoverable from the source.]

The final solution obtained has eight figure accuracy; this required a total of 55 multiplications by A'. The same equations solved by the method of over-relaxation, using the over-relaxation parameter ω = 1.518, required the equivalent of 44 multiplications. In this case the method of this paper requires more multiplications than the method of over-relaxation. In practical problems, however, one has only a guess at the optimum ω; thus taking ω = 1.4 requires more than 70 iterations to obtain the same accuracy. It would appear that in such cases the method of this paper still has an advantage over the method of over-relaxation.

Complex matrices

The method of this paper was also tried on an 11 by 11 complex matrix of the type which arises in Power Systems analysis in Electrical Engineering (Laughton & Humphrey Davies). Iterative methods in this field are particularly suitable because a good approximation is usually known; because the matrices are very sparse and can be very large; because the solution is usually required only to about four or five significant figures; and finally because the solution itself is part of an iterative procedure which may involve altering the coefficients of the matrix after each iteration. The matrix studied was typical: it is symmetric, not Hermitian; in any row the sum of the elements is 0, or nearly 0, and the diagonal element is the largest in modulus; the smallest diagonal element is 0 - 7.69i and the largest is 7.960 - 44.277i.

The coefficients $a_{ij}$ and $b_i$ of equation (7) are defined as follows:

$$a_{ij} = r_i^H r_j, \qquad b_i = r_i^H r,$$

where the superfix H denotes the complex conjugate transpose.

[Table 5: complex coefficient sets c and residuals v for the 11 by 11 complex matrix. The tabulated values are not recoverable from the source.]

The last approximation has a maximum error of one in the fifth significant figure; this accuracy required 55 multiplications by A. To obtain the same accuracy by the method of Gauss-Seidel required much more than 100 multiplications.

Concluding remarks

The method of this paper seems to be an efficient way of solving large sparse matrices or linear operators. It compares favourably with existing iterative methods. It also copes with complex or non-definite or unsymmetric matrices, whereas existing methods usually require the matrix to be symmetric definite.
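As an illustration of a linear operator supplied as a program rather than as a stored matrix, the substitutions (13), (14) and (13') of the two-dimensional Laplace example might be coded as below. This is a sketch under stated assumptions: the names `embed`, `neighbour_mean`, `apply_a`, `residual` and `apply_a_prime` are hypothetical, and the 81 internal values are assumed to be stored row by row in a flat list.

```python
N = 10  # the mesh has (N + 1) x (N + 1) points; internal points are 1..N-1

def boundary_value(i, j):
    """Equation (11): x(i, j) = i^3 - 3 i j^2 at the boundary points."""
    return float(i ** 3 - 3 * i * j ** 2)

def embed(x, use_boundary):
    """Put the 81 internal values on the full mesh; boundary is (11) or zero."""
    g = [[0.0] * (N + 1) for _ in range(N + 1)]
    for i in range(N + 1):
        for j in range(N + 1):
            if 0 < i < N and 0 < j < N:
                g[i][j] = x[(i - 1) * (N - 1) + (j - 1)]
            elif use_boundary:
                g[i][j] = boundary_value(i, j)
    return g

def neighbour_mean(g, i, j):
    """(x1 + x2 + x3 + x4) / 4 for the four neighbours of (i, j)."""
    return 0.25 * (g[i][j - 1] + g[i - 1][j] + g[i][j + 1] + g[i + 1][j])

def apply_a(x):
    """Substitution (13): y0 = x0 - (x1 + x2 + x3 + x4)/4, zero boundary."""
    g = embed(x, use_boundary=False)
    return [g[i][j] - neighbour_mean(g, i, j)
            for i in range(1, N) for j in range(1, N)]

def residual(x):
    """Substitution (14): r0 = (x1 + x2 + x3 + x4)/4 - x0, boundary from (11)."""
    g = embed(x, use_boundary=True)
    return [neighbour_mean(g, i, j) - g[i][j]
            for i in range(1, N) for j in range(1, N)]

def apply_a_prime(x):
    """Substitution (13'): u0 = (u1 + u2 + x3 + x4)/4, y0 = x0 - u0."""
    g = embed(x, use_boundary=False)
    u = [[0.0] * (N + 1) for _ in range(N + 1)]
    y = []
    for i in range(1, N):
        for j in range(1, N):
            # u at (i, j-1) and (i-1, j) has already been updated in this sweep
            u[i][j] = 0.25 * (u[i][j - 1] + u[i - 1][j]
                              + g[i][j + 1] + g[i + 1][j])
            y.append(g[i][j] - u[i][j])
    return y

# The function (11) satisfies the discrete Laplace equation at every internal
# point, so its residual should vanish.
exact = [boundary_value(i, j) for i in range(1, N) for j in range(1, N)]
print(max(abs(t) for t in residual(exact)))
```

Only the mesh values are stored; the operator A never appears as a matrix, which is the storage advantage claimed for sparse problems.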
As for other iterative methods, its advantage over direct methods is that it requires few iterations if an approximation is already known, or if only a few figures of accuracy are required, or if the matrix is well conditioned; it requires little storage if the matrix is sparse, and indeed A may be stored not as a matrix but as a program to apply a linear operator. The method is being tried on Atlas for matrices of order several hundreds which arise in Electrical and Structural Engineering; this was not possible before because of the relatively small size of the Mercury fast store; I hope to report on the results in the near future.

I am indebted to the Director of the University of London Computer Unit, Dr. R. A. Buckingham, for encouragement, and to my colleague, Dr. M. J. M. Bernal, for frequent discussions.

References

MARTIN, D. W., and TEE, G. J. (1961). "Iterative methods for linear equations with symmetric positive definite matrix", The Computer Journal, Vol. 4, p. 242.
LAUGHTON, M. A., and HUMPHREY DAVIES, M. W. "Numerical methods for Power System Load Flow studies". To be published in the Proceedings of the Institution of Electrical Engineers.

Note on the numerical solution of linear differential equations with constant coefficients

By R. E. Scraton and J. W. Searl

The numerical solution of the differential equation

$$ay'' + by' + cy = f(x) \tag{1}$$

where a, b, c are constants and f(x) is a numerically specified function, can be obtained by a variety of methods. For automatic computation the Runge-Kutta method is normally used, but this may be unstable. The procedure described below provides an alternative method which has been found satisfactory where the Runge-Kutta method has failed.

Suppose that f(x) is tabulated at interval h. In the usual notation, let $f_p$ denote $f(x_0 + ph)$, so that equation (1) may be written

$$a\frac{d^2y}{dp^2} + bh\frac{dy}{dp} + ch^2y = h^2f_p. \tag{2}$$

It is assumed that $y_0$ and $y_0'$ are known, so that a procedure for determining $y_1$ and $y_1'$ makes it possible to tabulate both y and y' in a step-by-step manner. Let the sequence $\lambda_r$ be defined by the equations

$$a\lambda_{r+2} + bh\lambda_{r+1} + ch^2\lambda_r = 0, \qquad r \geq 0 \tag{3}$$

and let $U_m(p)$ be defined by equation (4), a series in powers of p with denominators m!, (m+1)!, (m+2)!, ...

[The starting values of $\lambda_r$ in (3), equation (4), and equation (5), which the text says is easily verified, are not recoverable from the source.]

If, therefore, $f_p$ can be expanded in the form

$$f_p = A_0 + A_1p + A_2p^2 + A_3p^3 + \dots \tag{6}$$

it can be shown that the solution of (2) satisfying the required initial conditions is

$$y_p = y_0[1 - ch^2U_2(p)] + ahy_0'U_1(p) + h^2[A_0U_2(p) + A_1\,1!\,U_3(p) + A_2\,2!\,U_4(p) + \dots]. \tag{7}$$

Thus $y_1$ and $hy_1'$ are given by equation (8) in terms of $u_m = U_m(1)$.

[Equation (8) is not recoverable from the source.]

The coefficients $A_r$ of equation (6) may be taken from any polynomial interpolation formula. For automatic computation, Lagrangian formulae are appropriate, and a four-point formula will be used as an illustration, viz:

[Equation (9) is not recoverable from the source.]

Equations (8) and (9) may then be written

[Equation (10) is not recoverable from the source.]