ON THE BEHAVIOR OF THE CONJUGATE-GRADIENT METHOD ON ILL-CONDITIONED PROBLEMS


Anders FORSGREN*

Technical Report TRITA-MAT-2006-OS
Department of Mathematics
Royal Institute of Technology
January 2006

Abstract

We study the behavior of the conjugate-gradient method for solving a set of linear equations, where the matrix is symmetric and positive definite with one set of eigenvalues that are large and the remaining are small. We characterize the behavior of the residuals associated with the large eigenvalues throughout the iterations, and also characterize the behavior of the residuals associated with the small eigenvalues for the early iterations. Our results show that the residuals associated with the large eigenvalues are made small first, without changing very much the residuals associated with the small eigenvalues. A conclusion is that the ill-conditioning of the matrix is not reflected in the conjugate-gradient iterations until the residuals associated with the large eigenvalues have been made small.

Key words. conjugate-gradient method, symmetric positive-definite matrix, ill-conditioning

AMS subject classifications. 65F10, 65F22, 65K05.

1. Introduction

A fundamental problem in linear algebra is the solution of a system of linear equations of the form Ax = b, where A is an n x n symmetric positive definite matrix and b is an n-dimensional vector. From an optimization perspective, an equivalent problem may be formulated as

    minimize_{x in R^n}  (1/2) x^T H x + c^T x,                                   (1.1)

where H is an n x n symmetric positive definite matrix and c is an n-dimensional vector. The unique solution to (1.1) is given by Hx = -c. Hence, by identifying A = H and b = -c, the problems are equivalent. If x* = -H^{-1} c denotes the optimal solution to (1.1), we may write

    (1/2) x^T H x + c^T x = (1/2) (x - x*)^T H (x - x*) - (1/2) (x*)^T H x*.

If we let xi = x - x* and consider a rotation

----
* Optimization and Systems Theory, Department of Mathematics, Royal Institute of Technology, SE-100 44 Stockholm, Sweden (andersf@kth.se). Research supported by the Swedish Research Council (VR).

of the variables so that the Hessian becomes diagonal, i.e., H = diag(lambda), where lambda is the vector of eigenvalues of H, (1.1) may equivalently be rewritten as

    minimize_{xi in R^n}  (1/2) Sum_{i=1}^n lambda_i xi_i^2,                      (1.2)

where the constant term -(1/2) (x*)^T H x* has been ignored. Of course, x* is not known, and hence xi is not known. However, we will discuss properties of xi, and the construction is therefore convenient. We will throughout let the eigenvalues of H be ordered such that lambda_1 >= lambda_2 >= ... >= lambda_n > 0.

The conjugate-gradient method is a well-known iterative method for solving (1.2). At iteration k, it computes an approximate solution xi^(k) to (1.2) as the minimizer of the objective function of (1.2) subject to the constraint that xi - xi^(0) belongs to the Krylov subspace spanned by H xi^(0), ..., H^k xi^(0). There is a convenient recurrence formula for computing xi^(k+1) from xi^(k). For a thorough discussion of the conjugate-gradient method, see, e.g., Golub and Van Loan [10, Sections 10.2-10.3] or Luenberger [16, Chapter 8]. The method was originally proposed by Hestenes and Stiefel [15]. There is a rich literature on conjugate-gradient methods, see, e.g., Hestenes [13], Axelsson and Barker [2], Golub and O'Leary [9], Axelsson [1] and Saad [18]. Further references are Faddeev and Faddeeva [6], Dahlquist, Eisenstat and Golub [4], Hestenes [12], and Hestenes and Stein [14].

We are particularly interested in the situation where H has r large eigenvalues and the remaining n - r eigenvalues are small. Our motivation is twofold: first, interior methods [7, 8], where infinitely ill-conditioned matrices arise, and second, radiation therapy optimization [3], where ill-conditioned systems arising from discretized Fredholm equations of the first kind appear. The conjugate-gradient method is known to behave in a regularizing manner on such ill-conditioned linear equations, see, e.g., Squire [19], Hanke [11] and Vogel [20]. This is also related to partial least-squares methods, see, e.g., Wold et al. [21] and Elden [5].
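Since the analysis is easiest to follow with a concrete computation at hand, the sketch below runs the standard conjugate-gradient recurrence on A = diag(lambda) with b = 0, so that the iterate x_k is exactly the residual vector xi^(k) in the notation above. The eigenvalues and starting vector are illustrative choices (three large eigenvalues, two small ones); this is a minimal pure-Python sketch, not the Matlab code behind the paper's tables.

```python
def cg_iterates(lam, xi0, num_iters):
    """Conjugate-gradient method applied to A xi = 0 with A = diag(lam),
    started at xi0.  Since b = 0, the iterate x_k equals the residual
    vector xi^(k): xi^(k) = P_k(Lambda) xi^(0) with P_k(0) = 1.
    Returns [xi^(0), xi^(1), ..., xi^(num_iters)]."""
    n = len(lam)
    x = list(xi0)
    r = [-lam[i] * x[i] for i in range(n)]   # residual b - A x with b = 0
    p = list(r)
    history = [list(x)]
    for _ in range(num_iters):
        rr = sum(v * v for v in r)
        if rr == 0.0:                        # exact termination reached
            history.append(list(x))
            continue
        Ap = [lam[i] * p[i] for i in range(n)]
        alpha = rr / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        beta = sum(v * v for v in r) / rr
        p = [r[i] + beta * p[i] for i in range(n)]
        history.append(list(x))
    return history

# Three large eigenvalues (2, 1.5, 1) and two small ones (0.1, 0.01).
hist = cg_iterates([2.0, 1.5, 1.0, 0.1, 0.01], [1.0] * 5, 5)
for k, xi in enumerate(hist):
    print(k, ["%+.4f" % v for v in xi])
```

The printout shows the behavior discussed above: the first three iterations essentially eliminate the components associated with the three large eigenvalues while the last two components stay close to their initial values, and termination occurs (up to rounding) at iteration five.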
We show that the components of the iterates associated with the large eigenvalues, xi_i^(k), i = 1, ..., r, are close to the iterates that are obtained if the conjugate-gradient method is applied to the r-dimensional problem where only the residuals associated with the r large eigenvalues are considered. In addition, we show that the components of the early iterates associated with the small eigenvalues, xi_i^(k), i = r+1, ..., n, are close to the corresponding initial residual xi_i^(0), i = r+1, ..., n. Broadly speaking, this means that the path of iterates is close to first satisfying the partial least-squares problem associated with the large residuals without significantly moving the small residuals, and then moving to reducing the small residuals. An implication of this result is that if the large eigenvalues are of comparable magnitude, the ill-conditioning of the problem caused by the small eigenvalues does not appear in the early iterations. Although there is a rich literature on conjugate-gradient methods, we are not aware of an analysis along the lines presented in our paper. We allow the initial residuals to appear in our expressions, but consider exact polynomials. This can be contrasted to classical bounds, e.g., the bounds given in Axelsson and Barker [2, Chapter 1], based on Chebyshev polynomials.

The paper is organized as follows: In Section 2 we give a brief background on the conjugate-gradient method. Section 3 contains a review of relevant properties of the polynomials associated with the conjugate-gradient method. In Section 4, we define the conjugate-gradient problem associated with the large eigenvalues only. Section 5 contains the main results of the paper, the characterization of the iterates of the conjugate-gradient method. In Section 6 we give a brief relation to the steepest-descent method, and finally a summary is given in Section 7.

2. Background

After k steps of the conjugate-gradient method, we obtain xi^(k) and alpha^(k) as the optimal solution of

    minimize_{xi in R^n, alpha in R^k}  (1/2) Sum_{i=1}^n lambda_i xi_i^2
    subject to  xi_i = xi_i^(0) + Sum_{l=1}^k lambda_i^l xi_i^(0) alpha_l,  i = 1, ..., n.      (2.1)

This formulation is a convex quadratic program, in which the Krylov vectors xi^(0), H xi^(0), ..., H^k xi^(0) appear explicitly. Alternatively, the constraints xi_i = xi_i^(0) + Sum_{l=1}^k lambda_i^l xi_i^(0) alpha_l, i = 1, ..., n, may be viewed as saying that xi_i = P_k(lambda_i) xi_i^(0), where P_k(lambda) is a kth-degree polynomial in lambda such that P_k(0) = 1. This polynomial may be characterized in terms of its zeros zeta in R^k as Q_k(lambda, zeta), with

    Q_k(lambda, zeta) = Prod_{l=1}^k (1 - lambda/zeta_l),                                       (2.2)

where we will assume that zeta is ordered such that zeta_1 >= zeta_2 >= ... >= zeta_k. Then, (2.1) may equivalently be rewritten as

    minimize_{xi in R^n, zeta in R^k}  (1/2) Sum_{i=1}^n lambda_i xi_i^2
    subject to  xi_i = Q_k(lambda_i, zeta) xi_i^(0),  i = 1, ..., n,                             (2.3)

where the optimal solution is denoted by xi^(k) and zeta^(k). Note that the formulations (2.1) and (2.3) are equivalent, and we will make use of both.

We will denote by n_min the number of iterations it takes for the conjugate-gradient method to solve (1.2). The number n_min equals the number of distinct eigenvalues of H with nonzero initial residuals. Without loss of generality, the reader may consider n_min = n, but we use n_min for the sake of completeness. Problems (2.1) and (2.3) have unique solutions for k <= n_min, with the ordering of zeta in (2.3) such that zeta_1 >= zeta_2 >= ... >= zeta_k.

As mentioned in the introduction, we are interested in the situation when H has r large eigenvalues and n - r small eigenvalues. We will throughout quantify this situation by a scalar epsilon, epsilon in (0, 1], such that lambda_{r+1} <= epsilon lambda_r.
We will be interested in the case when epsilon is small, and our analysis applies when epsilon r is small.
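The equivalence between the recurrence form of the method and the quadratic program (2.1) can be checked directly for small k. The sketch below (an illustration with arbitrary data, not code from the paper) runs two textbook CG steps and, independently, solves the k = 2 instance of (2.1) by writing down the 2 x 2 stationarity (normal-equation) system in the coefficients alpha; the two iterates agree to rounding error.

```python
def cg_steps(lam, xi0, k):
    """k textbook CG steps on A xi = 0, A = diag(lam), starting from xi0."""
    n = len(lam)
    x, r = list(xi0), [-lam[i] * xi0[i] for i in range(n)]
    p = list(r)
    for _ in range(k):
        rr = sum(v * v for v in r)
        Ap = [lam[i] * p[i] for i in range(n)]
        a = rr / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + a * p[i] for i in range(n)]
        r = [r[i] - a * Ap[i] for i in range(n)]
        b = sum(v * v for v in r) / rr
        p = [r[i] + b * p[i] for i in range(n)]
    return x

def qp_two_steps(lam, xi0):
    """Solve (2.1) for k = 2 directly.  With s_q = sum_i lam_i^q (xi_i^0)^2,
    stationarity in (alpha_1, alpha_2) reads
        s_3 alpha_1 + s_4 alpha_2 = -s_2,
        s_4 alpha_1 + s_5 alpha_2 = -s_3,
    solved here by Cramer's rule."""
    def s(q):
        return sum(l ** q * v * v for l, v in zip(lam, xi0))
    det = s(3) * s(5) - s(4) ** 2
    a1 = (s(3) * s(4) - s(2) * s(5)) / det
    a2 = (s(2) * s(4) - s(3) ** 2) / det
    return [v * (1.0 + l * a1 + l * l * a2) for l, v in zip(lam, xi0)]

lam, xi0 = [2.0, 1.5, 1.0, 0.1, 0.01], [1.0] * 5
print(max(abs(u - v) for u, v in zip(cg_steps(lam, xi0, 2), qp_two_steps(lam, xi0))))
```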

    i   lambda_i   xi_i^(0)   xi_i^(1)   xi_i^(2)   xi_i^(3)   xi_i^(4)   xi_i^(5)
    1   2.0000     1.0000     -0.1733     0.07      -0.0049     0.0000     0.0000
    2   1.5000     1.0000      0.1200    -0.070      0.036     -0.0000    -0.0000
    3   1.0000     1.0000      0.4134     0.063     -0.043      0.000     -0.0000
    4   0.1000     1.0000      0.9413     0.8659     0.7647    -0.008      0.0000
    5   0.0100     1.0000      0.9941     0.986      0.9747     0.8793    -0.0000

Table 1: Iterates xi^(k), k = 0, ..., 5, for problem with lambda = (2, 1.5, 1, 0.1, 0.01)^T and xi^(0) = (1, 1, 1, 1, 1)^T.

    i   lambda_i   xi_i^(0)   xi_i^(1)   xi_i^(2)   xi_i^(3)   xi_i^(4)   xi_i^(5)
    1   2.0000     1.0000     -0.1717     0.08      -0.000      0.0000     0.0000
    2   1.5000     1.0000      0.1212    -0.064      0.0003    -0.0000     0.0000
    3   1.0000     1.0000      0.4141     0.070     -0.0006     0.0000     0.0000
    4   0.0100     1.0000      0.9941     0.9864     0.9784    -0.000      0.0000
    5   0.0001     1.0000      0.9999     0.9999     0.9998     0.9898    -0.0000

Table 2: Iterates xi^(k), k = 0, ..., 5, for problem with lambda = (2, 1.5, 1, 0.01, 0.0001)^T and xi^(0) = (1, 1, 1, 1, 1)^T.

As a small illustrative example, consider the case where lambda = (2, 1.5, 1, 0.1, 0.01)^T and xi^(0) = (1, 1, 1, 1, 1)^T. Here, we have three large eigenvalues, and then a gap between eigenvalues 3 and 4, and similarly a gap between eigenvalues 4 and 5. Table 1 shows the iterates xi^(k), k = 0, ..., 5. The numerical results of this table, as well as those presented in other tables and figures of this paper, have been obtained in Matlab using double-precision arithmetic. We notice that the first three iterations are spent making the residuals associated with the three large eigenvalues small, whereas the residuals associated with the small eigenvalues are not changed very much. There is also a hierarchy here, in that we may consider four eigenvalues large compared to the fifth one. Hence, the fourth iteration is spent making the fourth residual small, without decreasing the fifth residual much.

If the gap between the large and the small eigenvalues is increased, the behavior observed above is manifested more clearly. The iterates for the case when lambda = (2, 1.5, 1, 0.01, 0.0001)^T and xi^(0) = (1, 1, 1, 1, 1)^T are given in Table 2. We notice that

    i   lambda_i   xi_i^(0)   xi_i^(1)   xi_i^(2)   xi_i^(3)
    1   2.0000     1.0000     -0.1717     0.080      0.0000
    2   1.5000     1.0000      0.1212    -0.064      0.0000
    3   1.0000     1.0000      0.4141     0.07      -0.0000

Table 3: Iterates xi^(k), k = 0, ..., 3, for problem with lambda = (2, 1.5, 1)^T and xi^(0) = (1, 1, 1)^T.
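The k = 1 column of iterates in these tables can be computed in closed form: the first conjugate-gradient step coincides with a steepest-descent step, with the single polynomial zero zeta^(1) = Sum_j lambda_j^3 (xi_j^(0))^2 / Sum_j lambda_j^2 (xi_j^(0))^2 (this is the explicit expression established in Corollary 3.1 below), giving xi_i^(1) = (1 - lambda_i/zeta^(1)) xi_i^(0). A small sketch using the data of Table 1 (illustrative, not the paper's Matlab code):

```python
def first_cg_iterate(lam, xi0):
    """First CG iterate in the diagonalized setting (a steepest-descent step):
    zeta is the single zero of the degree-one residual polynomial Q_1."""
    num = sum(l ** 3 * v * v for l, v in zip(lam, xi0))
    den = sum(l ** 2 * v * v for l, v in zip(lam, xi0))
    zeta = num / den
    return zeta, [(1.0 - l / zeta) * v for l, v in zip(lam, xi0)]

zeta, xi1 = first_cg_iterate([2.0, 1.5, 1.0, 0.1, 0.01], [1.0] * 5)
print(zeta)                          # lies in [lambda_n, lambda_1]
print(["%+.4f" % v for v in xi1])    # compare with the xi^(1) column of Table 1
```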

the first three components of the residuals, corresponding to large eigenvalues, in Tables 1 and 2 are similar. The tendency not to reduce the residuals corresponding to the small eigenvalues is increased in Table 2, where the gap has been increased.

The case when the residuals associated with the small eigenvalues are ignored entirely may be viewed as the limiting case when there is an infinite gap. If we in the above example ignore the two smallest eigenvalues, we obtain a three-dimensional problem with lambda = (2, 1.5, 1)^T and xi^(0) = (1, 1, 1)^T. The iterates for this problem are given in Table 3. Note that during the first three iterations, the first three components of the iterates in Tables 1 and 2 are close to the iterates of Table 3, whereas the last two components of the iterates in Tables 1 and 2 are not reduced very much.

    i   lambda_i   xi_i^(0)    xi_i^(1)   xi_i^(2)   xi_i^(3)   xi_i^(4)   xi_i^(5)
    1   2.0000      1.0000     -0.3242     0.36      -0.0964     0.0003     0.0000
    2   1.5000      1.0000      0.0068    -0.5540     0.4646    -0.000      0.0000
    3   1.0000      1.0000      0.3379    -0.7300    -0.83       0.005     -0.0000
    4   0.1000     10.0000      9.3379     7.007       .44      -0.078     -0.0000
    5   0.0100     10.0000      9.9338     9.6896     9.0454     8.797      0.0000

Table 4: Iterates xi^(k), k = 0, ..., 5, for problem with lambda = (2, 1.5, 1, 0.1, 0.01)^T and xi^(0) = (1, 1, 1, 10, 10)^T.

    i   lambda_i   xi_i^(0)    xi_i^(1)   xi_i^(2)   xi_i^(3)   xi_i^(4)   xi_i^(5)
    1   2.0000      1.0000     -0.1733     0.05      -0.007      0.0000    -0.0000
    2   1.5000      1.0000      0.1200    -0.07       0.034     -0.0000    -0.0000
    3   1.0000      1.0000      0.4133     0.0605    -0.0577     0.0000     0.0000
    4   0.0100     10.0000      9.9413     9.864      9.736     -0.000     -0.0000
    5   0.0001     10.0000      9.9994     9.9986     9.9973     9.8978     0.0000

Table 5: Iterates xi^(k), k = 0, ..., 5, for problem with lambda = (2, 1.5, 1, 0.01, 0.0001)^T and xi^(0) = (1, 1, 1, 10, 10)^T.

Note that the data for problem (1.2) is the eigenvalue vector lambda and the initial residual vector xi^(0). Hence, xi^(k) depends on lambda as well as on xi^(0). The impact of a small eigenvalue lambda_i, i = r+1, ..., n, is also affected by the size of the initial residual xi_i^(0). In the examples above, we have chosen all residuals equal. If the initial residual associated with a small eigenvalue is large, compared to the residuals associated with the large eigenvalues, we envisage the impact of such a small eigenvalue/initial residual pair to be larger.
This is illustrated in Table 4, where the data is identical to that of Table 1, except that the residuals associated with the two smallest eigenvalues are made ten times larger. We see that the iterates in Table 4 do not have the features of those of Table 1. The first three components of the iterates of Table 4 are not particularly close to those of Table 3, and the last two components of the iterates

of Table 4 are reduced significantly also in the early iterations. If the gap between the large and small eigenvalues is increased, the behavior observed in Tables 1 and 2 is restored. This is demonstrated in Table 5, where the data is identical to that of Table 2, except that the residuals associated with the two smallest eigenvalues are made ten times larger. The purpose of this paper is to quantify this meaning of closeness and non-reduction.

Rather than looking at a table of residuals, we may view the iterates in terms of the polynomials Q_k(lambda, zeta^(k)). Figure 1 shows to the left the polynomials Q_k(lambda, zeta^(k)), k = 1, ..., 5, for the example problem with lambda = (2, 1.5, 1, 0.1, 0.01)^T and xi^(0) = (1, 1, 1, 1, 1)^T. The right part of Figure 1 shows the polynomials Q_k(lambda, zeta^(k)), k = 1, ..., 3, for the problem with lambda = (2, 1.5, 1)^T and xi^(0) = (1, 1, 1)^T.

[Figure 1. Left: Polynomials Q_k(lambda, zeta^(k)), k = 1, ..., 5, as a function of lambda, for problem with lambda = (2, 1.5, 1, 0.1, 0.01)^T and xi^(0) = (1, 1, 1, 1, 1)^T. Right: Polynomials Q_k(lambda, zeta^(k)), k = 1, ..., 3, as a function of lambda, for problem with lambda = (2, 1.5, 1)^T and xi^(0) = (1, 1, 1)^T.]

Note that the small eigenvalues have little effect on the first three polynomials in the five-dimensional example, whereas polynomials four and five have increasing oscillations and amplitude. The fifth polynomial does not even fit in the window. As we would expect from the discussion above, the first three polynomials to the left and to the right are similar. Note that the ill-conditioning of the five-dimensional problem does not appear in the first three iterations, since the iterates are close to those of the well-conditioned three-dimensional example problem.

3. Properties of the polynomials

In this section, we review some well-known properties of the polynomials Q_k(lambda, zeta) and their zeros zeta^(k) that will be useful in our analysis. The first lemma shows how zeta_l^(k), l = 1, ..., k, may be expressed as a convex combination of the eigenvalues lambda_i, i = 1, ..., n.

Lemma 3.1. Let xi^(k) and zeta^(k) denote the optimal solution to (2.3). Then, for all k, it holds that

    zeta_l^(k) = Sum_{i=1}^n [ (lambda_i xi_i^(0))^2 (Q_{k\l}(lambda_i, zeta^(k)))^2 / Sum_{j=1}^n (lambda_j xi_j^(0))^2 (Q_{k\l}(lambda_j, zeta^(k)))^2 ] lambda_i,  l = 1, ..., k,

where

    Q_{k\l}(lambda, zeta) = Prod_{m=1, m != l}^k (1 - lambda/zeta_m).

In particular, lambda_n <= zeta_l^(k) <= lambda_1, l = 1, ..., k.

Proof. We may eliminate xi from (2.3) and write the objective function as

    f(zeta) = (1/2) Sum_{i=1}^n lambda_i (Q_k(lambda_i, zeta))^2 (xi_i^(0))^2.             (3.1)

Since zeta^(k) is the global minimizer, which is guaranteed to exist by the equivalence to the quadratic program (2.1), it must hold that df(zeta^(k))/dzeta_l = 0, l = 1, ..., k. Differentiation of (3.1) gives

    df(zeta)/dzeta_l = (1/zeta_l^2) Sum_{i=1}^n lambda_i^2 Q_k(lambda_i, zeta) Q_{k\l}(lambda_i, zeta) (xi_i^(0))^2.    (3.2)

Hence, by the condition df(zeta^(k))/dzeta_l = 0, (3.2) gives

    zeta_l^(k) = Sum_{i=1}^n [ (lambda_i xi_i^(0))^2 (Q_{k\l}(lambda_i, zeta^(k)))^2 / Sum_{j=1}^n (lambda_j xi_j^(0))^2 (Q_{k\l}(lambda_j, zeta^(k)))^2 ] lambda_i,  l = 1, ..., k,   (3.3)

giving the required expression for zeta_l^(k). Since (3.3) gives zeta_l^(k) as a convex combination of the lambda_i's, it follows that lambda_n <= zeta_l^(k) <= lambda_1.

The convex combination provided by Lemma 3.1 is not very helpful in general, since the weights for one zero involve the other zeros. However, we get an explicit expression for zeta^(1).

Corollary 3.1. Let xi^(1) and zeta^(1) denote the optimal solution to (2.3) for k = 1. Then,

    zeta^(1) = Sum_{j=1}^n lambda_j^3 (xi_j^(0))^2 / Sum_{j=1}^n lambda_j^2 (xi_j^(0))^2,  and
    xi_i^(1) = (1 - lambda_i Sum_{j=1}^n lambda_j^2 (xi_j^(0))^2 / Sum_{j=1}^n lambda_j^3 (xi_j^(0))^2) xi_i^(0),  i = 1, ..., n.

The behavior of Q_k(lambda, zeta) as a function of lambda, when lambda is smaller than all the zeros zeta_l^(k), l = 1, ..., k, is of fundamental importance in our analysis. The following lemma gives the required properties, decreasing and convex. See Figure 1 for illustrative examples of polynomials.
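The decreasing and convexity properties referred to here yield two simple bounds on the residual polynomial for lambda below all zeros: 1 - lambda Sum_l 1/zeta_l <= Q_k(lambda, zeta) <= 1 - lambda/zeta_k on [0, zeta_k] (these are the bounds established in Lemma 3.2 below). A quick numerical check on a sample grid, with illustrative zeros:

```python
def Q(lam, zeros):
    """Residual polynomial Q_k(lam, zeta) = prod_l (1 - lam/zeta_l); cf. (2.2)."""
    out = 1.0
    for z in zeros:
        out *= 1.0 - lam / z
    return out

zeros = [2.0, 1.5, 1.0]              # zeta_1 >= ... >= zeta_k > 0 (illustrative)
zk = zeros[-1]
inv_sum = sum(1.0 / z for z in zeros)
for j in range(101):
    lam = zk * j / 100.0             # sample [0, zeta_k]
    q = Q(lam, zeros)
    assert 1.0 - lam * inv_sum <= q + 1e-12   # tangent at 0 (lower bound)
    assert q <= 1.0 - lam / zk + 1e-12        # chord to zeta_k (upper bound)
print("bounds of Lemma 3.2 hold on the sample grid")
```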

Lemma 3.2. For a fixed zeta in R^k, with zeta_1 >= zeta_2 >= ... >= zeta_k > 0, let Q_k(lambda, zeta) be defined by (2.2). Then, Q_k(lambda, zeta) is convex and decreasing as a function of lambda for lambda in [0, zeta_k]. In particular, for lambda in [0, zeta_k],

    1 - lambda Sum_{l=1}^k 1/zeta_l <= Q_k(lambda, zeta) <= 1 - lambda/zeta_k.

Proof. We have

    dQ_k(lambda, zeta)/dlambda = -Sum_{l=1}^k (1/zeta_l) Prod_{m=1, m != l}^k (1 - lambda/zeta_m),                          (3.4a)
    d^2 Q_k(lambda, zeta)/dlambda^2 = Sum_{l=1}^k Sum_{m=1, m != l}^k (1/(zeta_l zeta_m)) Prod_{p=1, p != l,m}^k (1 - lambda/zeta_p).   (3.4b)

It follows from (3.4a) that dQ_k(lambda, zeta)/dlambda is nonpositive for lambda in [0, zeta_k], so that Q_k(lambda, zeta) is decreasing there, and from (3.4b) that d^2 Q_k(lambda, zeta)/dlambda^2 is nonnegative for lambda in [0, zeta_k], and hence Q_k(lambda, zeta) is convex as a function of lambda for lambda in [0, zeta_k]. For lambda in [0, zeta_k], this convexity implies that

    Q_k(lambda, zeta) = Q_k((1 - lambda/zeta_k) 0 + (lambda/zeta_k) zeta_k, zeta) <= (1 - lambda/zeta_k) Q_k(0, zeta) + (lambda/zeta_k) Q_k(zeta_k, zeta) = 1 - lambda/zeta_k,

where the identities Q_k(0, zeta) = 1 and Q_k(zeta_k, zeta) = 0 have been used, thereby verifying the upper bound on Q_k(lambda, zeta) for lambda in [0, zeta_k]. In addition, the convexity implies that

    Q_k(lambda, zeta) >= Q_k(0, zeta) + (dQ_k(0, zeta)/dlambda) lambda = 1 - lambda Sum_{l=1}^k 1/zeta_l,

giving the required lower bound on Q_k(lambda, zeta) for lambda in [0, zeta_k].

4. A relaxed problem for the early iterations

The basis for our analysis is to consider the conjugate-gradient problem that arises when only the r large eigenvalues are considered and eigenvalues r+1 through n are disregarded. For iteration k, this means considering the optimization problem

    minimize_{xi in R^r, alpha in R^k}  (1/2) Sum_{i=1}^r lambda_i xi_i^2
    subject to  xi_i = xi_i^(0) + Sum_{l=1}^k lambda_i^l xi_i^(0) alpha_l,  i = 1, ..., r,      (4.1)

where we denote the optimal solution by bar-xi_i^(k), i = 1, ..., r, and bar-alpha^(k). We will denote by n_r the first iteration k for which (4.1) has optimal value zero. We have previously

talked about early iterations. This can now be made precise, and we will refer to iterations k, for which 0 <= k <= n_r, as early iterations.

Given bar-xi_i^(k), i = 1, ..., r, and bar-alpha^(k), that solve (4.1) for a given k, 0 <= k <= n_r, we may define

    bar-xi_i^(k) = xi_i^(0) + Sum_{l=1}^k lambda_i^l xi_i^(0) bar-alpha_l^(k),  i = r+1, ..., n,      (4.2)

so as to obtain bar-xi^(k) as an n-dimensional vector. Equivalently, we may consider the optimization problem

    minimize_{xi in R^n, alpha in R^k}  (1/2) Sum_{i=1}^r lambda_i xi_i^2
    subject to  xi_i = xi_i^(0) + Sum_{l=1}^k lambda_i^l xi_i^(0) alpha_l,  i = 1, ..., n,            (4.3)

which is equivalent to solving (4.1) and then using (4.2). Hence, we obtain bar-xi^(k) and bar-alpha^(k) as the optimal solution of (4.3). Equivalently, we may write

    minimize_{xi in R^n, zeta in R^k}  (1/2) Sum_{i=1}^r lambda_i xi_i^2
    subject to  xi_i = Q_k(lambda_i, zeta) xi_i^(0),  i = 1, ..., n,                                   (4.4)

where we analogously denote the optimal solution by bar-xi^(k) and bar-zeta^(k). We prefer the n-dimensional formulation given by (4.3), rather than combining (4.1) and (4.2), since (2.1) and (4.3) have the same feasible sets. For a given feasible point to (2.1) and (4.3), the objective function value of (2.1) is at least as large as the objective function value of (4.3), and this means that (4.3) is a relaxation of (2.1).

In order to quantify how close the iterates of the initial problem (2.1) are to the iterates of the relaxed problem (4.3), we start by showing that the difference between the residuals associated with the small eigenvalues, bar-xi_i^(k), and the initial residual xi_i^(0), i = r+1, ..., n, is small. This is a consequence of only considering the large eigenvalues in the minimization problem. The following lemma shows that if lambda_{r+1} <= epsilon lambda_r, then the difference between bar-xi_i^(k) and xi_i^(0) is bounded by k epsilon for k = 1, ..., n_r and i = r+1, ..., n. This is a key result for showing properties of xi^(k) later in the paper.

Lemma 4.1. Assume that lambda_{r+1} <= epsilon lambda_r, and let bar-xi^(k) together with bar-zeta^(k) be optimal solution to (4.4). Then, for k = 1, ..., n_r and i = r+1, ..., n,

    (1 - k epsilon) xi_i^(0) <= bar-xi_i^(k) <= xi_i^(0),   if xi_i^(0) >= 0,
    xi_i^(0) <= bar-xi_i^(k) <= (1 - k epsilon) xi_i^(0),   if xi_i^(0) <= 0.

Proof. Let k be an iteration such that 1 <= k <= n_r. Since (4.4) is a conjugate-gradient problem that only concerns eigenvalues lambda_i, i = 1, ..., r, and k <= n_r, it follows that

bar-zeta^(k) is unique, given the ordering bar-zeta_1^(k) >= bar-zeta_2^(k) >= ... >= bar-zeta_k^(k), and Lemma 3.1 applied to (4.4) shows that lambda_r <= bar-zeta_l^(k) <= lambda_1, l = 1, ..., k. Moreover,

    bar-xi_i^(k) = Q_k(lambda_i, bar-zeta^(k)) xi_i^(0),  i = 1, ..., n.

In particular, for i = r+1, ..., n, we have lambda_i <= epsilon lambda_r <= bar-zeta_k^(k). Then, Lemma 3.2 gives

    1 - lambda_i Sum_{l=1}^k 1/bar-zeta_l^(k) <= Q_k(lambda_i, bar-zeta^(k)) <= 1 - lambda_i/bar-zeta_k^(k) <= 1,

and since lambda_i/bar-zeta_l^(k) <= epsilon lambda_r/lambda_r = epsilon for each l, it follows that

    1 - k epsilon <= Q_k(lambda_i, bar-zeta^(k)) <= 1.                                        (4.5)

Consequently, since bar-xi_i^(k) = Q_k(lambda_i, bar-zeta^(k)) xi_i^(0), (4.5) gives (1 - k epsilon) xi_i^(0) <= bar-xi_i^(k) <= xi_i^(0) if xi_i^(0) >= 0, and xi_i^(0) <= bar-xi_i^(k) <= (1 - k epsilon) xi_i^(0) if xi_i^(0) <= 0, as required.

5. A characterization of the iterates

This characterization of the residuals bar-xi_i^(k) associated with the small eigenvalue indices i = r+1, ..., n allows us to give an explicit bound on the difference between xi_i^(k) and bar-xi_i^(k) for the large eigenvalue indices i = 1, ..., r. This shows how the iterates xi_i^(k) follow the iterates bar-xi_i^(k), i = 1, ..., r, for the early iterations, and then remain small for the remaining iterations.

Theorem 5.1. Assume that lambda_{r+1} <= epsilon lambda_r and epsilon n_r <= 1. Let xi^(k) together with zeta^(k) be optimal solution to (2.3) and, for k <= n_r, let bar-xi^(k) together with bar-zeta^(k) be optimal solution to (4.4). Then,

    Sum_{i=1}^r (xi_i^(k) - bar-xi_i^(k))^2 <= epsilon Sum_{i=r+1}^n (xi_i^(0))^2,   k = 1, ..., n_r,
    Sum_{i=1}^r (xi_i^(k))^2 <= epsilon Sum_{i=r+1}^n (xi_i^(0))^2,                  k = n_r + 1, ..., n_min.

Proof. Note initially that xi_i^(k), i = 1, ..., n, is uniquely determined for k <= n_min, since it is the solution to a conjugate-gradient quadratic program. For the same reason, bar-xi_i^(k), i = 1, ..., r, is uniquely determined for k <= n_r.

First let k <= n_r. Note that (4.3) is a relaxation of (2.1), and that (2.1) and (4.3) have the same feasible sets. Hence, since xi^(k) is optimal to (2.1) and bar-xi^(k) is optimal to (4.3), we conclude that

    Sum_{i=1}^r lambda_i (bar-xi_i^(k))^2 <= Sum_{i=1}^r lambda_i (xi_i^(k))^2 <= Sum_{i=1}^n lambda_i (xi_i^(k))^2 <= Sum_{i=1}^n lambda_i (bar-xi_i^(k))^2.    (5.1)

Consequently, Lemma 4.1 in conjunction with lambda_i <= epsilon lambda_r, i = r+1, ..., n, applied to (5.1), gives

    Sum_{i=1}^n lambda_i (xi_i^(k))^2 <= Sum_{i=1}^r lambda_i (bar-xi_i^(k))^2 + epsilon lambda_r Sum_{i=r+1}^n (xi_i^(0))^2.    (5.2)

A Taylor-series expansion of the objective function of (4.3) around bar-xi^(k) gives

    (1/2) Sum_{i=1}^r lambda_i (xi_i^(k))^2 = (1/2) Sum_{i=1}^r lambda_i (bar-xi_i^(k))^2 + Sum_{i=1}^r lambda_i bar-xi_i^(k) (xi_i^(k) - bar-xi_i^(k)) + (1/2) Sum_{i=1}^r lambda_i (xi_i^(k) - bar-xi_i^(k))^2.    (5.3)

Since (4.3) is an equality-constrained quadratic program to which bar-xi^(k) is optimal and xi^(k) is feasible, we conclude that

    Sum_{i=1}^r lambda_i bar-xi_i^(k) (xi_i^(k) - bar-xi_i^(k)) = 0.                              (5.4)

Consequently, a combination of (5.2), (5.3) and (5.4) gives

    Sum_{i=1}^r lambda_i (xi_i^(k) - bar-xi_i^(k))^2 <= epsilon lambda_r Sum_{i=r+1}^n (xi_i^(0))^2.    (5.5)

Since lambda_i >= lambda_r, i = 1, ..., r, it follows from (5.5) that

    Sum_{i=1}^r (xi_i^(k) - bar-xi_i^(k))^2 <= epsilon Sum_{i=r+1}^n (xi_i^(0))^2,                 (5.6)

as required.

Now let n_r + 1 <= k <= n_min. Then, upon observing that the conjugate-gradient method yields decreasing values of the objective function and that bar-xi_i^(n_r) = 0, i = 1, ..., r, we obtain the inequalities analogous to (5.1) as

    Sum_{i=1}^r lambda_i (xi_i^(k))^2 <= Sum_{i=1}^n lambda_i (xi_i^(k))^2 <= Sum_{i=1}^n lambda_i (xi_i^(n_r))^2 <= Sum_{i=r+1}^n lambda_i (bar-xi_i^(n_r))^2.    (5.7)

In addition, we apply Lemma 4.1 in conjunction with lambda_i >= lambda_r, i = 1, ..., r, and lambda_i <= epsilon lambda_r, i = r+1, ..., n, to (5.7), which gives

    lambda_r Sum_{i=1}^r (xi_i^(k))^2 <= epsilon lambda_r Sum_{i=r+1}^n (xi_i^(0))^2.              (5.8)

Consequently, (5.8) gives

    Sum_{i=1}^r (xi_i^(k))^2 <= epsilon Sum_{i=r+1}^n (xi_i^(0))^2,                                 (5.9)

as required. The complete result is now given by (5.6) and (5.9).

Similarly, we may now obtain a result that bounds the difference xi_i^(k) - bar-xi_i^(k) also for the small eigenvalue indices i = r+1, ..., n during the early iterations. This bound, however, is not as explicit as the bound for the large eigenvalue indices.
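Lemma 4.1 and Theorem 5.1 can be probed numerically by solving both (2.1) and its relaxation (4.3) directly in the coefficients alpha: the stationarity conditions form a small k x k linear system built from weighted power sums of the eigenvalues. The sketch below does this with Gaussian elimination for the illustrative data used earlier (r = 3 large eigenvalues, epsilon = lambda_4/lambda_3 = 0.1); the asserted bounds are exactly the ones the lemma and theorem guarantee.

```python
def solve(M, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    A = [row[:] + [b[i]] for i, row in enumerate(M)]
    for c in range(n):
        piv = max(range(c, n), key=lambda row: abs(A[row][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for j in range(c, n + 1):
                A[r][j] -= f * A[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (A[r][n] - sum(A[r][j] * x[j] for j in range(r + 1, n))) / A[r][r]
    return x

def krylov_qp(lam, xi0, k, idx):
    """Minimize (1/2) sum_{i in idx} lam_i xi_i^2 over the Krylov
    parametrization xi_i = xi_i^0 (1 + sum_l lam_i^l alpha_l); this is
    (2.1) when idx covers all indices and (4.3) when idx = {1, ..., r}."""
    def s(q):
        return sum(lam[i] ** q * xi0[i] ** 2 for i in idx)
    M = [[s(1 + m + l) for l in range(1, k + 1)] for m in range(1, k + 1)]
    g = [-s(1 + m) for m in range(1, k + 1)]
    a = solve(M, g)
    return [xi0[i] * (1.0 + sum(lam[i] ** (l + 1) * a[l] for l in range(k)))
            for i in range(len(lam))]

lam = [2.0, 1.5, 1.0, 0.1, 0.01]     # r = 3 large, 2 small (illustrative)
xi0 = [1.0] * 5
r, eps = 3, lam[3] / lam[2]          # eps = 0.1
tail = sum(xi0[i] ** 2 for i in range(r, 5))
for k in (1, 2, 3):
    full = krylov_qp(lam, xi0, k, range(5))   # xi^(k) from (2.1)
    rel = krylov_qp(lam, xi0, k, range(r))    # relaxed iterate from (4.3)
    for i in range(r, 5):                     # Lemma 4.1 bounds
        assert (1 - k * eps) * xi0[i] - 1e-10 <= rel[i] <= xi0[i] + 1e-10
    gap = sum((full[i] - rel[i]) ** 2 for i in range(r))
    assert gap <= eps * tail + 1e-10          # Theorem 5.1 bound
    print(k, "relaxed small components:", ["%.4f" % rel[i] for i in range(r, 5)])
```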

Theorem 5.2. Assume that lambda_{r+1} <= epsilon lambda_r and epsilon n_r <= 1. Let xi^(k) together with zeta^(k) be optimal solution to (2.3) and, for k <= n_r, let bar-xi^(k) together with bar-zeta^(k) be optimal solution to (4.4). Further, let Xi_1^(0) = diag(xi_1^(0), ..., xi_r^(0)), let xi_2^(0) = (xi_{r+1}^(0), ..., xi_n^(0))^T, and let V_1^(k) be the r x k matrix with element ij given by

    (V_1^(k))_{ij} = (lambda_i/lambda_1)^j.

Then, for k = 1, ..., n_r,

    |xi_i^(k) - bar-xi_i^(k)| <= (k^{1/2} epsilon^{3/2} ||xi_2^(0)|| / sigma_k(Xi_1^(0) V_1^(k))) |xi_i^(0)|,  i = r+1, ..., n,

where sigma_k(Xi_1^(0) V_1^(k)) denotes the kth singular value of the matrix Xi_1^(0) V_1^(k).

Proof. A combination of (2.1) and (4.3) gives

    xi_i^(k) - bar-xi_i^(k) = xi_i^(0) Sum_{l=1}^k lambda_i^l (alpha_l^(k) - bar-alpha_l^(k)),  i = 1, ..., n.      (5.10)

We may normalize (5.10) so that

    xi_i^(k) - bar-xi_i^(k) = xi_i^(0) Sum_{j=1}^k (lambda_i/lambda_1)^j (lambda_1^j alpha_j^(k) - lambda_1^j bar-alpha_j^(k)),  i = 1, ..., n.   (5.11)

If we let beta^(k) in R^k have components beta_j^(k) = lambda_1^j alpha_j^(k) - lambda_1^j bar-alpha_j^(k), j = 1, ..., k, we may rewrite (5.11) as

    xi_i^(k) - bar-xi_i^(k) = xi_i^(0) Sum_{j=1}^k (lambda_i/lambda_1)^j beta_j^(k),  i = 1, ..., n.                (5.12)

Written in block form for components i = 1, ..., r, (5.12) takes the form

    xi_1^(k) - bar-xi_1^(k) = Xi_1^(0) V_1^(k) beta^(k),                                                             (5.13)

where xi_1^(k) is the r-dimensional vector with components xi_i^(k), i = 1, ..., r. Taking norms in (5.13) gives

    ||xi_1^(k) - bar-xi_1^(k)|| >= sigma_k(Xi_1^(0) V_1^(k)) ||beta^(k)||.                                           (5.14)

Note that sigma_k(Xi_1^(0) V_1^(k)) > 0, since k <= n_r, and hence (5.14) gives an upper bound for ||beta^(k)||. Furthermore, if we use (5.14) for i = r+1, ..., n in (5.12), upon observing that lambda_i/lambda_1 <= epsilon, taking norms and using the Cauchy-Schwarz inequality, we obtain

    |xi_i^(k) - bar-xi_i^(k)| <= |xi_i^(0)| k^{1/2} epsilon ||beta^(k)|| <= (k^{1/2} epsilon ||xi_1^(k) - bar-xi_1^(k)|| / sigma_k(Xi_1^(0) V_1^(k))) |xi_i^(0)|.   (5.15)

Theorem 5.1 gives ||xi_1^(k) - bar-xi_1^(k)|| <= epsilon^{1/2} ||xi_2^(0)||, which inserted into (5.15) gives

    |xi_i^(k) - bar-xi_i^(k)| <= (k^{1/2} epsilon^{3/2} ||xi_2^(0)|| / sigma_k(Xi_1^(0) V_1^(k))) |xi_i^(0)|,

as required.

Note that the matrix Xi_1^(0) V_1^(k) of Theorem 5.2 has full column rank for k <= n_r, but since V_1^(k) has Vandermonde structure, we expect the smallest singular value of Xi_1^(0) V_1^(k) to become small as k increases.

We may now combine Lemma 4.1 and Theorem 5.2, to show that xi_i^(k) is close to xi_i^(0) for k = 1, ..., n_r and i = r+1, ..., n for epsilon sufficiently small.

Corollary 5.1. Assume that lambda_{r+1} <= epsilon lambda_r and epsilon n_r <= 1. Let xi^(k) together with zeta^(k) be optimal solution to (2.3). Further, let Xi_1^(0) = diag(xi_1^(0), ..., xi_r^(0)), let xi_2^(0) = (xi_{r+1}^(0), ..., xi_n^(0))^T, and let V_1^(k) be the r x k matrix with element ij given by (V_1^(k))_{ij} = (lambda_i/lambda_1)^j. Then, for k = 1, ..., n_r and i = r+1, ..., n,

    (1 - k epsilon - k^{1/2} epsilon^{3/2} ||xi_2^(0)|| / sigma_k(Xi_1^(0) V_1^(k))) xi_i^(0) <= xi_i^(k) <= (1 + k^{1/2} epsilon^{3/2} ||xi_2^(0)|| / sigma_k(Xi_1^(0) V_1^(k))) xi_i^(0),  if xi_i^(0) >= 0,

    (1 + k^{1/2} epsilon^{3/2} ||xi_2^(0)|| / sigma_k(Xi_1^(0) V_1^(k))) xi_i^(0) <= xi_i^(k) <= (1 - k epsilon - k^{1/2} epsilon^{3/2} ||xi_2^(0)|| / sigma_k(Xi_1^(0) V_1^(k))) xi_i^(0),  if xi_i^(0) <= 0,

where sigma_k(Xi_1^(0) V_1^(k)) denotes the kth singular value of the matrix Xi_1^(0) V_1^(k).

This means that we have characterized xi_1^(k) as close to bar-xi_1^(k) for k = 1, ..., n_r, and xi_1^(k) as close to zero for k = n_r + 1, ..., n_min, where xi_1^(k) = (xi_1^(k), ..., xi_r^(k))^T and bar-xi_1^(k) = (bar-xi_1^(k), ..., bar-xi_r^(k))^T. In addition, we have characterized xi_i^(k) as close to xi_i^(0) for i = r+1, ..., n and k = 1, ..., n_r. This gives the desired characterization of initially minimizing the residuals associated with the large eigenvalues while not decreasing much the residuals associated with the small eigenvalues.

6. Relationship to the steepest-descent method

As a remark, we also briefly review the steepest-descent method in a polynomial framework. Steepest descent uses a more greedy approach, in that a minimization over a one-dimensional subspace is carried out at each iteration. The steepest-descent method may be viewed as a conjugate-gradient method which is restarted every iteration. Here we obtain xi^(k+1) and zeta_{k+1} as the optimal solution to the problem

    minimize_{xi in R^n, zeta in R}  (1/2) Sum_{i=1}^n lambda_i xi_i^2
    subject to  xi_i = (1 - lambda_i/zeta) xi_i^(k),  i = 1, ..., n,                   (6.1)

which has the closed-form solution

    zeta_{k+1} = Sum_{j=1}^n lambda_j^3 (xi_j^(k))^2 / Sum_{j=1}^n lambda_j^2 (xi_j^(k))^2,                        (6.2a)
    xi_i^(k+1) = (1 - lambda_i/zeta_{k+1}) xi_i^(k) = Prod_{l=1}^{k+1} (1 - lambda_i/zeta_l) xi_i^(0),  i = 1, ..., n,   (6.2b)

see Corollary 3.1. This means that steepest descent forms polynomials by adding one zero at a time, and not changing the zeros that have already been obtained. Hence, (6.1) generates polynomials in the same fashion as the conjugate-gradient method, but they are only optimal in this greedy sense. Consequently, finite termination is not obtained, in general. Figure 2 shows the first ten polynomials generated by steepest descent for the example problem with lambda = (2, 1.5, 1, 0.1, 0.01)^T and xi^(0) = (1, 1, 1, 1, 1)^T.

[Figure 2: First ten polynomials generated by the steepest-descent method for problem with lambda = (2, 1.5, 1, 0.1, 0.01)^T and xi^(0) = (1, 1, 1, 1, 1)^T.]

7. Summary and discussion

We have characterized the path of iterates for the conjugate-gradient method applied to a system of linear equations when the n x n positive-definite symmetric matrix involved is ill-conditioned in the sense that it has r large eigenvalues and the remaining n - r eigenvalues small. The components associated with the large eigenvalues, xi_i^(k), i = 1, ..., r, are close to the iterates that are obtained if the conjugate-gradient method is applied to the r-dimensional problem where only the residuals associated with the r large eigenvalues are considered. In addition, we have shown that the components of the early iterates associated with the small eigenvalues, xi_i^(k), i = r+1, ..., n, are close to the corresponding initial residual xi_i^(0), i = r+1, ..., n. An implication of this result is that if the large eigenvalues are

of comparable magnitude, the ill-conditioning of the problem caused by the small eigenvalues does not appear in the early iterations.

Further research would be directed towards solving ill-conditioned systems arising in interior methods using preconditioned conjugate-gradient methods. Also, we are interested in quasi-Newton methods for nonlinear optimization problems with ill-conditioned Hessians. The research presented in this paper is of interest for quasi-Newton methods, since quasi-Newton methods are equivalent to a conjugate-gradient method when solving (1.1) if exact linesearch is used [17].

Acknowledgement

I thank Fredrik Carlsson, Lars Elden, Philip Gill, Axel Ruhe and Anders Szepessy for helpful discussions on conjugate-gradient methods.

References

[1] O. Axelsson. Iterative solution methods. Cambridge University Press, Cambridge, 1994.
[2] O. Axelsson and V. A. Barker. Finite element solution of boundary value problems, volume 35 of Classics in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2001.
[3] F. Carlsson and A. Forsgren. Iterative regularization in intensity-modulated radiation therapy optimization. Med. Phys., 33(1):225-234, 2006.
[4] G. Dahlquist, S. C. Eisenstat, and G. H. Golub. Bounds for the error of linear systems of equations using the theory of moments. J. Math. Anal. Appl., 37:151-166, 1972.
[5] L. Elden. Partial least-squares vs. Lanczos bidiagonalization. I. Analysis of a projection method for multiple regression. Comput. Statist. Data Anal., 46(1):11-31, 2004.
[6] D. K. Faddeev and V. N. Faddeeva. Computational methods of linear algebra. Translated by Robert C. Williams. W. H. Freeman and Co., San Francisco, 1963.
[7] A. Forsgren, P. E. Gill, and J. D. Griffin. Iterative solution of augmented systems arising in interior methods. Report TRITA-MAT-2005-OS3, Department of Mathematics, Royal Institute of Technology, Stockholm, Sweden, 2005.
[8] A. Forsgren, P. E. Gill, and M. H. Wright. Interior methods for nonlinear optimization. SIAM Rev., 44(4):525-597, 2002.
[9] G. H. Golub and D. P. O'Leary. Some history of the conjugate gradient and Lanczos algorithms: 1948-1976. SIAM Rev., 31(1):50-102, 1989.
[10] G. H. Golub and C. F.
Van Loan. Matrix Computations. The Johns Hopkins University Press, Baltimore, Maryland, third edition, 1996. ISBN 0-8018-5414-8.
[11] M. Hanke. Conjugate gradient type methods for ill-posed problems, volume 327 of Pitman Research Notes in Mathematics Series. Longman Scientific & Technical, Harlow, 1995.
[12] M. R. Hestenes. Iterative methods for solving linear equations. J. Optimization Theory Appl., 11:323-334, 1973.
[13] M. R. Hestenes. Conjugate-Direction Methods in Optimization. Springer-Verlag, Berlin, Heidelberg and New York, 1980.
[14] M. R. Hestenes and M. L. Stein. The solution of linear equations by minimization. J. Optimization Theory Appl., 11:335-359, 1973.
[15] M. R. Hestenes and E. Stiefel. Methods of conjugate gradients for solving linear systems. J. Research Nat. Bur. Standards, 49:409-436, 1952.
[16] D. G. Luenberger. Linear and Nonlinear Programming. Addison-Wesley Publishing Company, Reading, second edition, 1984. ISBN 0-201-15794-2.
[17] L. Nazareth. A relationship between the BFGS and conjugate gradient algorithms and its implications for new algorithms. SIAM J. Numer. Anal., 16:794-800, 1979.
[18] Y. Saad. Iterative methods for sparse linear systems. Society for Industrial and Applied Mathematics, Philadelphia, PA, 2003.
[19] W. Squire. The solution of ill-conditioned linear systems arising from Fredholm equations of the first kind by steepest descents and conjugate gradients. Internat. J. Numer. Methods Engrg., 10(3):607-617, 1976.
[20] C. R. Vogel. Computational methods for inverse problems, volume 23 of Frontiers in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2002.
[21] S. Wold, A. Ruhe, H. Wold, and W. J. Dunn III. The collinearity problem in linear regression. The partial least squares (PLS) approach to generalized inverses. SIAM J. Sci. Stat. Comput., 5:735-743, 1984.

6 References [6] D. G. uenberger. near and Nonnear Programmng. Addson-Wesey Pubshng Company, Readng, second edton, 984. IBN 0-0-5794-. [7]. Nazareth. A reatonshp between the BFG and conjugate gradent agorthms and ts mpcatons for new agorthms. IAM J. Numer. Ana., 6:794 800, 979. [8] Y. aad. Iteratve methods for sparse near systems. ocety for Industra and Apped Mathematcs, Phadepha, PA, 003. [9] W. qure. The souton of -condtoned near systems arsng from Fredhom equatons of the frst knd by steepest descents and conjugate gradents. Internat. J. Numer. Methods Engrg., 0(3:607 67, 976. [0] C. R. Voge. Computatona methods for nverse probems, voume 3 of Fronters n Apped Mathematcs. ocety for Industra and Apped Mathematcs (IAM, Phadepha, PA, 00. []. Wod, A. Ruhe, H. Wod, and W. J. Dunn III. The conearty probem n near regresson. The parta east squares (P approach to generazed nverses. IAM J. c. tat. Comp, 5:735 743, 984.