Scaling and structural condition numbers

Arnold Neumaier
Institut für Mathematik, Universität Wien
Strudlhofgasse 4, A-1090 Wien, Austria
email: neum@cma.univie.ac.at

revised, August 1996

Abstract. We introduce structural condition numbers and discuss their significance for the proper scaling of nonsymmetric and symmetric matrices.

1 Introduction

Direct methods for solving a system of linear equations $Ax = b$ with a square nonsingular coefficient matrix $A$ are generally based on matrix factorizations. Unless the matrix is positive definite, the factorization always involves some form of pivoting in order to have satisfactory numerical stability properties. However, to get good pivot choices, the matrices must be properly scaled.

A common suggestion is to scale such that the scaled matrix is equilibrated, i.e., all its rows have $l_1$-norm 1. However, this can be disastrous for the subsequent factorization when the entries in some row of $A$ have widely differing magnitudes, and it is particularly unsatisfactory for sparse matrices (Curtis & Reid [3]).

One can improve the behavior by also enforcing that the transposed matrix is equilibrated, i.e., that all columns have norm 1. Ignoring the signs, the scaled matrix is then doubly stochastic, i.e., the entries of each row and column sum to 1. Now the scaling matrices must satisfy nonlinear equations that are not easy to solve. Parlett & Landis describe in [8] (expensive) iterative procedures for scaling nonnegative matrices to doubly stochastic
form, and they present results of tests comparing the new algorithms with other methods. Recently, Olschowka & Neumaier [7] introduced another idea for choosing scaling matrices $D_1$ and $D_2$ in such a way that $D_1AD_2$ has the structure of a permuted I-matrix.

In this paper we relate scaling problems to questions of the closeness of a given matrix to structurally singular matrices, measured by a structural condition number. Among other results, we show that both doubly stochastic equilibration and scaling to permuted I-matrix form lead to bounded structural condition numbers, thus justifying their observed good performance in the context of Gaussian elimination.

We also show that the permuted I-matrix scaling is naturally related to rank one bounds for the absolute value of $A$. Such bounds are required in Neumaier [6] for the construction of good scales for modified Cholesky factorizations of indefinite symmetric matrices.

2 Structural singularity and structural condition numbers

A square matrix $A \in \mathbb{R}^{n\times n}$ is called structurally singular if every matrix $A'$ that has zeros $A'_{ik} = 0$ whenever $A_{ik} = 0$ is singular, and structurally nonsingular otherwise. A transversal of $A$ is a permutation $\pi$ of $\{1,\dots,n\}$ such that $A_{i,\pi i} \ne 0$ for $i = 1,\dots,n$. Every nonsingular matrix has a transversal. More generally, since the determinant is the sum of signed products of entries on transversals, a square matrix has a transversal iff it is structurally nonsingular.

The structural condition number of $A \in \mathbb{R}^{n\times n}$ is defined as
\[
q(A) = \begin{cases}
\infty & \text{when $A$ is structurally singular,}\\[1ex]
\dfrac{\max_{i,k} |A_{ik}|}{\sup\{\varepsilon \mid A_\varepsilon \text{ structurally nonsingular}\}} & \text{otherwise.}
\end{cases}
\]
Here, $A_\varepsilon$ denotes the matrix with entries
\[
(A_\varepsilon)_{ik} = \begin{cases} A_{ik} & \text{if } |A_{ik}| \ge \varepsilon,\\ 0 & \text{otherwise.} \end{cases}
\]
The structural condition number is a measure of how close $A$ is to a structurally singular matrix. Its relevance lies in the fact that a nonsingular matrix $A$ with large $q(A)$ contains in some of its small entries significant information about its inverse.

Since $A_\varepsilon = 0$ when $\varepsilon > \max_{i,k} |A_{ik}|$, we always have $q(A) \ge 1$. Therefore, the structurally best behaved matrices are those with $q(A) = 1$. If this holds, the matrix $A_\varepsilon$ with $0 < \varepsilon = \max_{i,k} |A_{ik}|$ has all its entries in $\{0, \varepsilon, -\varepsilon\}$ and is structurally nonsingular, hence has a transversal. After permuting this transversal to the diagonal and dividing the $i$th row of the permuted matrix $PA$ by $(PA)_{ii} \in \{\varepsilon, -\varepsilon\}$, we obtain an I-matrix, i.e., a matrix $B$ with
\[
B_{ii} = 1 \ge |B_{ik}| \quad \text{for all } i, k.
\]
Conversely, every permuted I-matrix $B$ has $q(B) = 1$ and is structurally optimally conditioned.

The main result of Olschowka & Neumaier [7] is that one can scale every structurally nonsingular matrix to a permuted I-matrix, cf. Section 3 below. As we shall show in Section 4, structurally nonsingular symmetric matrices can be scaled to a symmetric permuted I-matrix. Therefore, for both nonsymmetric and symmetric matrices, large structural condition numbers are a reflection of bad scaling.

For relative perturbations we have the following `structural' analogue of the standard result
\[
\|A^{-1}\|\,\|B - A\| \le \varepsilon \;\Rightarrow\; \frac{1-\varepsilon}{1+\varepsilon}\,\mathrm{cond}(A) \le \mathrm{cond}(B) \le \frac{1+\varepsilon}{1-\varepsilon}\,\mathrm{cond}(A);
\]
in the following, absolute values are taken componentwise.

2.1. Proposition.
\[
|B - A| \le \varepsilon |A| \;\Rightarrow\; \frac{1-\varepsilon}{1+\varepsilon}\,q(A) \le q(B) \le \frac{1+\varepsilon}{1-\varepsilon}\,q(A).
\]
Proof. Straightforward. □
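The definition of $q(A)$ can be evaluated directly for small dense matrices. The following is a minimal sketch, not taken from the paper: it assumes the matrix is given as a list of lists, uses the fact stated above that a matrix is structurally nonsingular iff it has a transversal, and tests for a transversal with a simple augmenting-path bipartite matching. The names `has_transversal` and `structural_condition_number` are hypothetical. Since $A_\varepsilon$ only changes when $\varepsilon$ passes the magnitude of an entry, it suffices to try the distinct nonzero magnitudes as thresholds, largest first.

```python
def has_transversal(pattern):
    """Return True if the boolean n x n pattern admits a transversal,
    i.e. a perfect matching between rows and columns."""
    n = len(pattern)
    match_col = [-1] * n  # match_col[k] = row currently matched to column k

    def augment(i, seen):
        # Try to match row i, re-routing earlier rows if necessary.
        for k in range(n):
            if pattern[i][k] and not seen[k]:
                seen[k] = True
                if match_col[k] == -1 or augment(match_col[k], seen):
                    match_col[k] = i
                    return True
        return False

    return all(augment(i, [False] * n) for i in range(n))


def structural_condition_number(A):
    """q(A) = max|A_ik| / sup{eps : A_eps structurally nonsingular},
    or infinity if A itself is structurally singular."""
    magnitudes = sorted({abs(a) for row in A for a in row if a != 0},
                        reverse=True)
    for eps in magnitudes:  # largest admissible threshold wins
        pattern = [[abs(a) >= eps for a in row] for row in A]
        if has_transversal(pattern):
            return magnitudes[0] / eps
    return float("inf")


# A permuted I-matrix is structurally optimally conditioned.
print(structural_condition_number([[0.5, 1.0], [1.0, -0.3]]))
# A badly scaled diagonal matrix has a large structural condition number.
print(structural_condition_number([[1.0, 0.0], [0.0, 0.25]]))
```

The recursive matching is adequate for illustration only; for large sparse matrices one would use a dedicated matching code instead.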
We have the following characterization of structural condition numbers in terms of submatrices with small entries.

2.2. Proposition. $A \in \mathbb{R}^{n\times n}$ has structural condition number $q(A) \ge \max_{i,k} |A_{ik}| / \varepsilon$ iff there is a submatrix $A_{JK}$ with $|J| + |K| > n$ and $\max\{|A_{jk}| \mid j \in J,\ k \in K\} \le \varepsilon$.

Proof. Proposition 2.3 of [7] states that $A$ is structurally nonsingular iff $m_1 + m_2 \le n$ for all $m_1 \times m_2$ submatrices containing zeros only. Thus $A_\varepsilon$ is structurally singular iff some $A_{JK}$ with $|J| + |K| > n$ satisfies $\max\{|A_{jk}| \mid j \in J,\ k \in K\} < \varepsilon$. This implies the assertion. □

Parlett & Landis [8] suggest scaling matrices to doubly stochastic form. That this scaling procedure is reasonable from the point of view of structural condition follows from our next theorem.

2.3. Theorem. Every doubly stochastic matrix $A \in \mathbb{R}^{n\times n}$ satisfies
\[
q(A) \le n(n+1).
\]
Proof. This is the case $\alpha = \beta = \gamma = 1$ of the following result. □

2.4. Proposition. Suppose $A \in \mathbb{R}^{n\times n}$ satisfies
\[
|A|e \ge \alpha e, \quad |A^T|e \le \beta e, \quad |A| \le \gamma ee^T,
\]
where $e = (1,\dots,1)^T$; i.e., $|A|$ has row sums $\ge \alpha$, column sums $\le \beta$, and entries $\le \gamma$. Then
\[
\frac{\alpha}{\beta} \ge \frac{n}{n+1} \;\Rightarrow\; q(A) \le \frac{\gamma}{\beta}\,n(n+1).
\]
Proof. We consider a submatrix $A_{JK}$ of the form in Proposition 2.2. Let $r = |J|$, $s = |K|$ and denote by $c$ and $d$ the sum of all $|A_{ik}|$ with $i \in J$ and $k \notin K$ or $k \in K$, respectively. Looking at column sums, we find $c \le \beta(n - s)$; looking at row sums, we find $c + d \ge \alpha r$. Hence $d \ge \alpha r - \beta(n - s)$. Since
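$\alpha/\beta \ge n/(n+1)$ by assumption, and $r + s \ge n + 1$, there is some $A_{ik}$ with $i \in J$, $k \in K$ satisfying
\[
|A_{ik}| \ge \frac{d}{rs} \ge \frac{\alpha r - \beta(n-s)}{rs} = \frac{\alpha}{s} - \frac{\beta(n-s)}{rs}
\ge \frac{\beta n}{(n+1)s} - \frac{\beta(n-s)}{(n+1-s)s} = \frac{\beta}{(n+1)(n+1-s)} \ge \frac{\beta}{(n+1)n}.
\]
By Proposition 2.2, applied with $\varepsilon = \max_{i,k}|A_{ik}|/q(A) \le \gamma\,q(A)^{-1}$, the submatrix can be chosen such that $|A_{ik}| \le \gamma\,q(A)^{-1}$, and the result follows. □

3 Optimal rank one bounds

Let $A \in \mathbb{R}^{n\times n}$. Suppose we have (diagonal and nonsingular) scaling matrices $D_1$ and $D_2$ such that all entries of $D_1AD_2$ have absolute value $\le 1$. Then the rank one matrix
\[
B = (\alpha_i\beta_k)_{i,k=1}^n \tag{1}
\]
with $\alpha_i = (D_1)_{ii}^{-1}$, $\beta_k = (D_2)_{kk}^{-1}$ is a componentwise upper bound for $|A|$, and we may pose the scaling problem as one of finding small $\alpha_i, \beta_k > 0$ such that
\[
|A_{ik}| \le \alpha_i\beta_k \quad \text{for all } i, k. \tag{2}
\]

3.1. Theorem. Let $A$ be structurally nonsingular. Among all rank one matrices (1) satisfying $|A| \le B$, the minimal value of $\prod_i (\alpha_i\beta_i)$ is attained for values such that (2) holds with equality for $k = \pi i$ ($i = 1,\dots,n$), for some transversal $\pi$.

Proof. (2) implies
\[
\prod_i |A_{i,\pi i}| \le \prod_i (\alpha_i\beta_{\pi i}) = \prod_i (\alpha_i\beta_i)
\]
for any transversal $\pi$, so that
\[
\prod_i (\alpha_i\beta_i) \ge \max_\pi \prod_i |A_{i,\pi i}|.
\]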
The right hand side is positive since $A$ is structurally nonsingular. Equality holds iff $\alpha_i\beta_{\pi i} = |A_{i,\pi i}|$ ($i = 1,\dots,n$) for some transversal $\pi$, i.e., $D_1AD_2$ is a permuted I-matrix, where $D_1 = \mathrm{Diag}(\alpha_i^{-1} \mid i = 1,\dots,n)$ and $D_2 = \mathrm{Diag}(\beta_i^{-1} \mid i = 1,\dots,n)$. But, according to [7], such a transversal exists. □

If we write $i \to k$ if $A_{ik} \ne 0$ and introduce
\[
\gamma_{ik} = \log |A_{ik}| \ \text{ if } i \to k, \qquad y_i = \log \alpha_i, \qquad z_k = \log \beta_k,
\]
the problem to minimize $\prod_i (\alpha_i\beta_i)$ subject to (2) can be rewritten as the linear program
\[
\begin{array}{ll}
\min & \sum_k y_k + \sum_k z_k \\
\text{s.t.} & y_i + z_k \ge \gamma_{ik} \ \text{ for } i \to k.
\end{array} \tag{3}
\]
The dual of the linear program (3) is a linear program associated with the weighted matching problem for the directed graph with adjacency relation $\to$ and weights $\gamma_{ik}$ (for details see, e.g., [7]); thus (3) can be solved by algorithms for the weighted bipartite matching problem [1, 2, 4, 5, 7].

4 Optimal symmetric scaling

For symmetric matrices $A \in \mathbb{R}^{n\times n}$ one generally wants to find a scaling procedure that preserves symmetry and hence uses the same scaling matrix $D$ on the left and on the right. Equivalently, we want to find a good bound for $|A|$ by a rank one matrix
\[
B = (\alpha_i\alpha_k)_{i,k=1}^n. \tag{4}
\]
Thus we want to find small $\alpha_k > 0$ such that
\[
|A_{ik}| \le \alpha_i\alpha_k \quad \text{for all } i, k. \tag{5}
\]
Numbers satisfying (5) are required in Neumaier [6] for the construction of good scales for modified Cholesky factorizations of indefinite symmetric matrices.
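The connection between the linear program (3) and the assignment problem can be sketched in code. The following is an illustrative sketch under stated assumptions, not one of the codes of [1, 2, 4] and not the implementation of [7]: it uses a textbook shortest-augmenting-path (Hungarian-type) assignment solver that maintains row and column potentials, applied to the costs $-\log|A_{ik}|$. The potentials are feasible for the dual of the assignment problem, so their negatives are feasible for (3) and tight on the optimal transversal. The function name `rank_one_scaling` and the large finite cost `BIG` used for zero entries are assumptions made for this sketch.

```python
import math


def rank_one_scaling(A):
    """Return (alpha, beta, perm) with |A[i][k]| <= alpha[i]*beta[k] for all
    i, k and equality for k = perm[i]; perm maximizes prod |A[i][perm[i]]|.
    Assumes A is a square, structurally nonsingular list of lists."""
    n = len(A)
    BIG = 1e12  # assumed stand-in cost for structurally zero entries
    cost = [[-math.log(abs(a)) if a != 0 else BIG for a in row] for row in A]
    u = [0.0] * (n + 1)   # row potentials (1-based)
    v = [0.0] * (n + 1)   # column potentials (1-based, index 0 is a dummy)
    p = [0] * (n + 1)     # p[j] = row matched to column j, 0 if unmatched
    way = [0] * (n + 1)   # predecessor column on the augmenting path
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv = [math.inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], math.inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:  # flip the matching along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    perm = [0] * n
    for j in range(1, n + 1):
        perm[p[j] - 1] = j - 1
    if any(A[i][perm[i]] == 0 for i in range(n)):
        raise ValueError("matrix is structurally singular")
    # u_i + v_k <= -log|A_ik|, hence y_i = -u_i and z_k = -v_k satisfy (3).
    alpha = [math.exp(-u[i]) for i in range(1, n + 1)]
    beta = [math.exp(-v[j]) for j in range(1, n + 1)]
    return alpha, beta, perm


A = [[2.0, 1.0], [1.0, 8.0]]
alpha, beta, perm = rank_one_scaling(A)
scaled = [[A[i][k] / (alpha[i] * beta[k]) for k in range(2)] for i in range(2)]
print(perm, scaled)  # scaled entries are at most 1 in absolute value
```

Dividing row $i$ by `alpha[i]` and column $k$ by `beta[k]` corresponds to $D_1 = \mathrm{Diag}(\alpha_i^{-1})$ and $D_2 = \mathrm{Diag}(\beta_k^{-1})$ in Theorem 3.1. The dense $O(n^3)$ loop is chosen for brevity; it does not exploit sparsity.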
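4.1. Theorem. Let the symmetric matrix $A$ be structurally nonsingular. Among all rank one matrices (4) with $|A| \le B$, the minimal value of $\prod_i \alpha_i$ is attained for values such that (5) holds with equality for $k = \pi i$ ($i = 1,\dots,n$), for some transversal $\pi$.

Proof. (5) implies
\[
\prod_i |A_{i,\pi i}| \le \prod_i (\alpha_i\alpha_{\pi i}) = \prod_i \alpha_i^2
\]
for any transversal $\pi$, so that
\[
\prod_i \alpha_i^2 \ge \max_\pi \prod_i |A_{i,\pi i}|. \tag{6}
\]
Now suppose that we have a solution of the problem to minimize $\prod_i (\alpha_i\beta_i)$ subject to (2). Then the $\alpha'_i = \sqrt{\alpha_i\beta_i}$ satisfy
\[
\alpha'_i\alpha'_k = \sqrt{\alpha_i\beta_i\,\alpha_k\beta_k} = \sqrt{\alpha_i\beta_k\,\alpha_k\beta_i} \ge \sqrt{|A_{ik}|\,|A_{ki}|} = |A_{ik}|
\]
and
\[
\prod_i (\alpha'_i)^2 = \prod_i (\alpha_i\beta_i) = \max_\pi \prod_i |A_{i,\pi i}|
\]
by Theorem 3.1. A comparison with (6) shows the optimality of $\alpha'$. Hence equality holds in the argument leading to (6), and for any maximizing $\pi$, (5) holds with equality for $k = \pi i$ ($i = 1,\dots,n$). □

If we write $i \sim k$ if $A_{ik} \ne 0$ and introduce
\[
\gamma_{ik} = \gamma_{ki} = \log |A_{ik}| \ \text{ if } i \sim k, \qquad x_k = \log \alpha_k,
\]
the problem to minimize $\prod_i \alpha_i^2$ subject to (5) can be rewritten as the linear program
\[
\begin{array}{ll}
\min & \sum_k x_k \\
\text{s.t.} & x_i + x_k \ge \gamma_{ik} \ \text{ for } i \sim k.
\end{array} \tag{7}
\]
From a solution of the nonsymmetric problem (3) we can find a solution of (7) by setting $x_k = \frac{1}{2}(y_k + z_k)$. So we can solve (7), too, using matching algorithms. But perhaps some work can be saved there by exploiting symmetry and/or looking only for involutory paths, cf. the following remark.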
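The step from a nonsymmetric solution to a symmetric one, $x_k = \frac{1}{2}(y_k + z_k)$, i.e. $\alpha'_k = \sqrt{\alpha_k\beta_k}$, is short enough to check numerically. The sketch below is illustrative only; the helper names `symmetric_scale` and `is_symmetric_bound` and the hand-picked example data are assumptions, with $\alpha = (4, 2)$, $\beta = (2, 1)$ chosen as an optimal solution of (2) for the symmetric matrix shown.

```python
import math


def symmetric_scale(alpha, beta):
    """Combine a nonsymmetric rank one bound alpha_i * beta_k into symmetric
    scale factors alpha'_i = sqrt(alpha_i * beta_i)."""
    return [math.sqrt(a * b) for a, b in zip(alpha, beta)]


def is_symmetric_bound(A, s, tol=1e-12):
    """Check condition (5): |A_ik| <= s_i * s_k for all i, k."""
    n = len(A)
    return all(abs(A[i][k]) <= s[i] * s[k] * (1 + tol)
               for i in range(n) for k in range(n))


# Symmetric example whose dominant transversal is off-diagonal (product 16).
A = [[1.0, 4.0], [4.0, 1.0]]
alpha, beta = [4.0, 2.0], [2.0, 1.0]  # satisfies (2) with prod(alpha*beta) = 16
s = symmetric_scale(alpha, beta)
print(s, is_symmetric_bound(A, s))
print(s[0] * s[1])  # equals |A_12| up to rounding: (5) is tight on the transversal
```

Here $\prod_i s_i^2 = 16$ coincides with the maximal transversal product, in line with (6).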
4.2. Remark. When the dominant transversal $\pi$ is unique (and this holds with probability one under natural stochastic assumptions) then we must have $\pi^{-1} = \pi$ since $\pi^{-1}$ is again a transversal. Thus $\pi^2$ is the identity. Therefore the active constraints
\[
x_i + x_{\pi i} = \gamma_{i,\pi i}
\]
fix $x_i$ when $\pi i = i$, but if $\pi i \ne i$ we only get
\[
x_i = \tfrac{1}{2}\gamma_{i,\pi i} + w_i \quad \text{with } w_i + w_{\pi i} = 0.
\]
In particular, if the dominant transversal is diagonal ($\pi$ = identity, which can be checked directly by testing (5) with $\alpha_i = |A_{ii}|^{1/2}$), the solution of (7) is unique. Otherwise, there may be multiple solutions and we want to select a small one. Since $x_i^2 + x_{\pi i}^2 = 2\left(\tfrac{1}{2}\gamma_{i,\pi i}\right)^2 + 2w_i^2$, we get the $x$ with smallest Euclidean norm by solving the convex quadratic program
\[
\begin{array}{ll}
\min & \sum_i w_i^2 \\
\text{s.t.} & w_i + w_k \ge \gamma'_{ik} := \gamma_{ik} - \tfrac{1}{2}\gamma_{i,\pi i} - \tfrac{1}{2}\gamma_{k,\pi k} \ \text{ for } i \sim k, \\
 & w_i + w_{\pi i} = 0.
\end{array}
\]
The involution $\pi$ and a starting point are available from solving the matching problem (3). This gives small $|w_i|$, hence small $x_i$ and $\alpha_i$.

References

1. R. E. Burkard; U. Derigs. Assignment and Matching Problems: Solution Methods with FORTRAN-Programs. Lecture Notes in Econ. and Math. Systems, Springer, Berlin 1980.
2. G. Carpaneto; P. Toth. Solution of the assignment problem (Algorithm 548). ACM Trans. Math. Softw. (1980), 104-111.
3. A. R. Curtis; J. K. Reid. On the automatic scaling of matrices for Gaussian elimination. J. Inst. Math. Appl. 10 (1972), 118-124.
4. U. Derigs; A. Metz. An efficient labeling technique for solving sparse assignment problems. Computing 36 (1986), 301-311.
5. H. W. Kuhn. The Hungarian method for the assignment problem. Naval Research Logistics Quarterly 2 (1955), 83.
6. A. Neumaier. On satisfying second-order optimality conditions by modified Cholesky factorizations. Submitted.
7. M. Olschowka; A. Neumaier. A new pivoting strategy for Gaussian elimination. Linear Algebra Appl. 240 (1996), 131-151.
8. B. N. Parlett; T. L. Landis. Methods for scaling to double stochastic form. Linear Algebra Appl. 48 (1982), 53-79.