Journal of Mathematics and Statistics 6 (1): 63-67, 2010
ISSN 1549-3644
© 2010 Science Publications

An Alternative Scaling Factor in Broyden's Class Methods for Unconstrained Optimization

Muhammad Fauzi bin Embong, Mustafa bin Mamat, Mohd Rivaie and Ismail bin Mohd
Department of Computer Science and Mathematics, University Technology MARA, Kuala Terengganu Campus, 21080 Kuala Terengganu, Malaysia
Department of Mathematics, Faculty of Science and Technology, University Malaysia Terengganu, 21030 Kuala Terengganu, Malaysia

Abstract: Problem statement: In order to calculate the step size, a suitable line search method can be employed. Since the step size is usually not exact, error is unavoidable, and a step size error as small as 0.1 percent can radically affect a quasi-Newton method. Approach: A suitable scaling factor has to be introduced to overcome this inferiority. Self-Scaling Variable Metric algorithms (SSVM's) are a commonly used approach, in which a parameter is introduced that alters Broyden's single-parameter class of approximations to the inverse Hessian into a double-parameter class. This study proposes an alternative scaling factor for these algorithms. Results: The alternative scaling factor was tried on several commonly used test functions, and the numerical results show that the new scaled algorithm gives a significant improvement over the standard Broyden's class methods. Conclusion: The new algorithm's performance is comparable to that of the algorithm whose inverse Hessian approximation is initially scaled by the step size. An improvement over the unscaled BFGS is achieved since, for most of the cases, the number of iterations is reduced.

Key words: Broyden's class, eigenvalues, Hessian, scaling factor, self-scaling variable metric

INTRODUCTION

The quasi-Newton methods are very popular and efficient methods for solving the unconstrained optimization problem:

min f(x), x ∈ R^n   (1)

where f: R^n → R is a twice continuously differentiable function. There are a large number of quasi-Newton methods, but the Broyden's class of updates is the most popular.
As with other quasi-Newton methods, the Broyden's class methods are iterative, whereby at the (k+1)-th iteration:

x_{k+1} = x_k + α_k d_k   (2)

where d_k denotes the search direction and α_k is its step size. The search direction d_k is calculated by using:

d_k = −H_k g_k   (3)

where H_k is the inverse Hessian approximation and g_k is the gradient of f evaluated at the current iterate x_k. The step size α_k is a positive step length chosen by a line search so that at each iteration either:

f(x_k + α_k d_k) ≤ f(x_k) + η_1 α_k g_k^T d_k   (4)

or:

f(x_k + α_k d_k) ≤ f(x_k) − η_2 (g_k^T d_k)^2 / ‖d_k‖^2   (5)

where η_1 and η_2 are positive constants. Note that conditions (4) and (5) are the assumptions used in Byrd and Nocedal (1989); they cover a large class of line search strategies under suitable conditions. If the gradient of f is Lipschitz continuous, then several well-known line searches satisfy the Wolfe conditions:

f(x_k + α_k d_k) − f(x_k) ≤ δ_1 α_k g_k^T d_k   (6)

g(x_k + α_k d_k)^T d_k ≥ δ_2 g_k^T d_k   (7)

where 0 < δ_1 < 1 and δ_1 < δ_2 < 1.

Corresponding Author: Muhammad Fauzi bin Embong, Department of Computer Science and Mathematics, University Technology MARA, Kuala Terengganu Campus, 21080 Kuala Terengganu, Malaysia
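For readers who wish to experiment, the Wolfe conditions (6) and (7) can be checked numerically as follows. This Python/NumPy sketch is illustrative and is not part of the paper's implementation; the tolerances δ_1 = 10^-4 and δ_2 = 0.9 are common textbook defaults, not values specified here.

```python
import numpy as np

def wolfe_satisfied(f, grad, x, d, alpha, delta1=1e-4, delta2=0.9):
    """Check the Wolfe conditions at step length alpha (0 < delta1 < delta2 < 1)."""
    # Sufficient-decrease part, condition (6).
    armijo = f(x + alpha * d) <= f(x) + delta1 * alpha * (grad(x) @ d)
    # Curvature part, condition (7).
    curvature = grad(x + alpha * d) @ d >= delta2 * (grad(x) @ d)
    return armijo and curvature
```

For example, on f(x) = ‖x‖² with x = (1) and the steepest-descent direction d = (−2), the step α = 0.25 satisfies both conditions while the full step α = 1 violates the sufficient-decrease part.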
Byrd and Nocedal (1989) prove that if the ratio between successive trial values of α is bounded away from zero, then the new iterate produced by a backtracking line search satisfies (4) and (5). The inverse Hessian approximation is then updated by:

H_{k+1} = H_k − (H_k y_k y_k^T H_k)/(y_k^T H_k y_k) + (s_k s_k^T)/(s_k^T y_k) + φ_k v_k v_k^T   (8)

where:

s_k = x_{k+1} − x_k   (9)

y_k = g_{k+1} − g_k   (10)

v_k = (y_k^T H_k y_k)^{1/2} [s_k/(s_k^T y_k) − H_k y_k/(y_k^T H_k y_k)]   (11)

and φ_k is a parameter that may take any real value. According to Dennis and Moré (1977), two well-known update formulae are contained in the Broyden's class method, namely the DFP update if the parameter φ_k = 0 and the BFGS update if φ_k = 1. Consequently, (8) may be written as:

H_{k+1}^φ = (1 − φ_k) H_{k+1}^{DFP} + φ_k H_{k+1}^{BFGS}   (12)

If we let φ_k ∈ [0, 1], (8) is called the Broyden convex family. Meanwhile, if φ_k is restricted to a closed subinterval of [0, 1] that excludes the DFP update, (8) is called the restricted Broyden's class method (Byrd et al., 1987).

MATERIALS AND METHODS

Scaling the Broyden's class method:

Self-Scaling Variable Metric (SSVM) method: Many modifications have been applied to quasi-Newton methods in an attempt to improve their efficiency. Here, the discussion will be on the self-scaling variable metric algorithms developed by Oren (1973) and Oren and Luenberger (1974). Multiplying H_k by γ_k and then replacing H_k by γ_k H_k in (8), the Broyden's class formula can be written as:

H_{k+1} = [H_k − (H_k y_k y_k^T H_k)/(y_k^T H_k y_k) + φ_k v_k v_k^T] γ_k + (s_k s_k^T)/(s_k^T y_k)   (13)

where γ_k is a self-scaling parameter. Formula (13) is known as the self-scaling variable metric (SSVM) formula. Clearly, when γ_k = 1, formula (13) reduces to the Broyden's class update (8).

Choices of the scaling factor: The choice of a suitable scaling factor can be determined by the following theorem.

Theorem (Oren and Luenberger, 1974): Let φ ∈ [0, 1] and γ_k > 0. Let H_k be the inverse Hessian approximation and let H_{k+1} be defined by (13). Let λ_1 ≤ λ_2 ≤ … ≤ λ_n and μ_1 ≤ μ_2 ≤ … ≤ μ_n be the eigenvalues of H_k and H_{k+1} respectively. Then the following statements hold:

1. If γ_k λ_1 ≥ 1, then μ_1 = 1 and γ_k λ_i ≤ μ_{i+1} ≤ γ_k λ_{i+1}, i = 1, 2, …, n−1.

2. If γ_k λ_n ≤ 1, then μ_n = 1 and γ_k λ_{i−1} ≤ μ_{i−1} ≤ γ_k λ_i, i = 2, 3, …, n.

3. If γ_k λ_1 ≤ 1 ≤ γ_k λ_n and i_0 is an index with γ_k λ_{i_0} ≤ 1 ≤ γ_k λ_{i_0+1}, then γ_k λ_1 ≤ μ_1 ≤ γ_k λ_2 ≤ μ_2 ≤ … ≤ γ_k λ_{i_0} ≤ μ_{i_0} ≤ μ_{i_0+1} ≤ γ_k λ_{i_0+1} ≤ … ≤ μ_n ≤ γ_k λ_n, and at least one of the eigenvalues μ_{i_0} and μ_{i_0+1} equals 1.
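The self-scaling update (13) can be written compactly in code. The sketch below is a minimal NumPy rendering of the standard formula (with γ_k = 1 it reduces to the Broyden's class update (8)); the function name is illustrative and this is not the authors' implementation. A useful sanity check is that the secant equation H_{k+1} y_k = s_k holds for any φ and γ, since v_k^T y_k = 0.

```python
import numpy as np

def ssvm_update(H, s, y, phi=1.0, gamma=1.0):
    """Self-scaling Broyden-class update (13); gamma=1 gives (8).

    Convention assumed here: phi=0 is DFP, phi=1 is BFGS.
    Requires s @ y > 0 so the update preserves positive definiteness.
    """
    Hy = H @ y
    yHy = float(y @ Hy)
    sy = float(s @ y)
    v = np.sqrt(yHy) * (s / sy - Hy / yHy)
    core = H - np.outer(Hy, Hy) / yHy + phi * np.outer(v, v)
    return gamma * core + np.outer(s, s) / sy

# The update satisfies the secant equation H_{k+1} y_k = s_k:
H_next = ssvm_update(np.eye(2), np.array([1.0, 0.5]), np.array([0.8, 0.3]))
```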
Readers who are interested in the proof of the above theorem may refer to Oren and Luenberger (1974), or Sun and Yuan (2006). From the above theorem, it can be shown that:

γ_k = (s_k^T y_k)/(y_k^T H_k y_k)   (14)

is a suitable scaling factor (Sun and Yuan, 2006).

Shanno and Phua (1978) suggested a simple initial scaling which requires no additional information about the objective function beyond that routinely required by a variable metric algorithm. Initially, H_0 = I may be used to determine x_1, where α_0 is chosen according to some step length or linear search criterion to assure sufficient reduction in the function f. Once x_1 has been chosen, but before H_1 is calculated, H_0 is scaled by:

Ĥ_0 = α_0 H_0   (15)

and:

H_1 = Ĥ_0 − (Ĥ_0 y_0 y_0^T Ĥ_0)/(y_0^T Ĥ_0 y_0) + (s_0 s_0^T)/(s_0^T y_0) + φ v̂_0 v̂_0^T   (16)

v̂_0 = (y_0^T Ĥ_0 y_0)^{1/2} [s_0/(s_0^T y_0) − Ĥ_0 y_0/(y_0^T Ĥ_0 y_0)]   (17)
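The algebra behind the Shanno-Phua initial scaling can be verified numerically: applying the ordinary update (16) to the scaled matrix Ĥ_0 = α_0 I gives the same matrix as multiplying only the H_0-dependent part of the update by α_0 (the factored form derived in the next section). The vectors s_0, y_0 and the value α_0 = 0.7 below are made-up illustrative data with s_0^T y_0 > 0; the check itself holds for any such data.

```python
import numpy as np

def broyden_update(H, s, y, phi=1.0):
    # Broyden-class update (8) for a general inverse Hessian approximation H.
    Hy = H @ y
    yHy = float(y @ Hy)
    sy = float(s @ y)
    v = np.sqrt(yHy) * (s / sy - Hy / yHy)
    return H - np.outer(Hy, Hy) / yHy + np.outer(s, s) / sy + phi * np.outer(v, v)

s0 = np.array([1.0, -0.5, 0.3, 0.8])
y0 = np.array([0.9, -0.4, 0.2, 1.0])   # chosen so that s0 @ y0 > 0
alpha0, phi, n = 0.7, 1.0, 4

# Route 1: scale H0 = I first as in (15), then apply the ordinary update (16).
H1_scaled_first = broyden_update(alpha0 * np.eye(n), s0, y0, phi)

# Route 2: factored form -- alpha0 multiplies only the H0-dependent part.
v0 = np.sqrt(y0 @ y0) * (s0 / (s0 @ y0) - y0 / (y0 @ y0))
H1_factored = alpha0 * (np.eye(n) - np.outer(y0, y0) / (y0 @ y0)
                        + phi * np.outer(v0, v0)) + np.outer(s0, s0) / (s0 @ y0)

assert np.allclose(H1_scaled_first, H1_factored)
```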
Substitution of (15) into (16) yields:

H_1 = α_0 [H_0 − (H_0 y_0 y_0^T H_0)/(y_0^T H_0 y_0) + φ v_0 v_0^T] + (s_0 s_0^T)/(s_0^T y_0)   (18)

After the initial scaling of H_0 by an appropriate step size α_0, the inverse Hessian approximation is never rescaled. Numerical experiments show that this initial scaling is simple and effective for a lot of problems in which the curvature changes smoothly (Sun and Yuan, 2006).

An alternative scale factor: In this article, the smallest eigenvalue of the inverse Hessian approximation is proposed as an alternative to the initial scaling of H_0 in (15). Replacing the step size α_0 in (18) with the smallest eigenvalue λ_1 of the inverse Hessian approximation yields:

H_1 = λ_1 [H_0 − (H_0 y_0 y_0^T H_0)/(y_0^T H_0 y_0) + φ v_0 v_0^T] + (s_0 s_0^T)/(s_0^T y_0)   (19)

As with the update proposed by Shanno and Phua (1978), this is an initial scaling of the inverse Hessian approximation: after the initial iteration, the approximation is never rescaled. The following algorithm is proposed with the smallest eigenvalue λ_1 as the scaling factor.

Eigenvalue scaled algorithm: For simplicity, we let φ = 1, thus a modification of the BFGS method is obtained. This algorithm is also applicable to the other members of the Broyden's class methods.

Step 1: Initialization. Given x_0, set k = 0 and H_0 = I.
Step 2: Computing the search direction, d_k = −H_k g_k. If g_k = 0, then stop.
Step 3: Computing the step size α_k.
Step 4: Updating the new point, x_{k+1} = x_k + α_k d_k.
Step 5: Updating the inverse Hessian approximation H_{k+1}. For k = 1, use (19); else, use (8).
Step 6: Convergence test and stopping criteria. If f(x_{k+1}) < f(x_k) and ‖g_k‖ ≤ ε, then stop. Otherwise go to Step 2 with k = k+1.

RESULTS

A MAPLE subroutine was programmed to test three algorithms: the BFGS algorithm without scaling (BFGS), with the eigenvalue scaling (19) (denoted as ES-BFGS) and with the initial scaling (18) (denoted as S-BFGS). The three algorithms were applied to eight commonly tested functions, consisting of four two-variable (n = 2) functions and four four-variable (n = 4) functions:

Rosenbrock function with n = 2:
f(x) = 100(x_2 − x_1^2)^2 + (1 − x_1)^2

Cube function with n = 2:
f(x) = 100(x_2 − x_1^3)^2 + (1 − x_1)^2

Shallow function with n = 2:
f(x) = (x_2 − x_1^2)^2 + (1 − x_1)^2

Strait function with n = 2:
f(x) = (x_1^2 − x_2)^2 + 100(1 − x_1)^2

Rosenbrock function with n = 4:
f(x) = 100(x_2 − x_1^2)^2 + (1 − x_1)^2 + 100(x_4 − x_3^2)^2 + (1 − x_3)^2

Cube function with n = 4:
f(x) = 100(x_2 − x_1^3)^2 + (1 − x_1)^2 + 100(x_4 − x_3^3)^2 + (1 − x_3)^2

Shallow function with n = 4:
f(x) = (x_2 − x_1^2)^2 + (1 − x_1)^2 + (x_4 − x_3^2)^2 + (1 − x_3)^2

Wood function with n = 4:
f(x) = 100(x_2 − x_1^2)^2 + (1 − x_1)^2 + 90(x_4 − x_3^2)^2 + (1 − x_3)^2 + 10(x_2 + x_4 − 2)^2 + 0.1(x_2 − x_4)^2

The numerical results produced by implementing the three algorithms on the test functions are presented in the Table. The efficiency of the algorithms is measured by the number of iterations needed to reach the minimum value of the function; the algorithm with fewer iterations is considered more efficient. A tolerance of ε = 10^-6 is set as the stopping criterion.
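The eigenvalue-scaled algorithm above can be sketched in a few dozen lines. The sketch below is not the authors' MAPLE subroutine: the line search is a simple Armijo backtracking (the paper does not specify its line search), φ = 1 (BFGS) throughout, and the first update is rescaled by the smallest eigenvalue of the unscaled first approximation, which is one plausible reading of (19). All names and parameter values are illustrative.

```python
import numpy as np

def rosenbrock(x):
    return 100.0 * (x[1] - x[0] ** 2) ** 2 + (1.0 - x[0]) ** 2

def rosenbrock_grad(x):
    return np.array([
        -400.0 * x[0] * (x[1] - x[0] ** 2) - 2.0 * (1.0 - x[0]),
        200.0 * (x[1] - x[0] ** 2),
    ])

def bfgs_update(H, s, y, scale=1.0):
    # Update (8) with phi = 1; `scale` multiplies only the H-dependent
    # part, as in (18)/(19).
    Hy = H @ y
    yHy = float(y @ Hy)
    sy = float(s @ y)
    v = np.sqrt(yHy) * (s / sy - Hy / yHy)
    core = H - np.outer(Hy, Hy) / yHy + np.outer(v, v)
    return scale * core + np.outer(s, s) / sy

def es_bfgs(f, grad, x0, tol=1e-6, max_iter=1000):
    """Sketch of the ES-BFGS iteration (Steps 1-6), simplified stopping test."""
    x, H = x0.astype(float), np.eye(x0.size)
    g = grad(x)
    for k in range(max_iter):
        if np.linalg.norm(g) <= tol:
            return x, k
        d = -H @ g                       # Step 2: search direction
        alpha = 1.0                      # Step 3: Armijo backtracking
        while alpha > 1e-10 and f(x + alpha * d) > f(x) + 1e-4 * alpha * (g @ d):
            alpha *= 0.5
        x_new = x + alpha * d            # Step 4
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        if s @ y > 1e-12:                # Step 5: update (skip if curvature fails)
            scale = 1.0
            if k == 0:                   # eigenvalue scaling on the first update only
                lam_min = np.linalg.eigvalsh(bfgs_update(H, s, y)).min()
                scale = max(lam_min, 1e-8)
            H = bfgs_update(H, s, y, scale)
        x, g = x_new, g_new
    return x, max_iter

# Hypothetical starting point; iteration counts will not match the Table,
# since the line search differs from the one used in the paper.
x_star, iters = es_bfgs(rosenbrock, rosenbrock_grad, np.array([-1.0, -1.0]))
```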
Table 1: Numerical results produced by the three tested algorithms: BFGS without scaling (BFGS), with eigenvalue scaling (ES-BFGS) and with step size scaling (S-BFGS)

                                          Number of iterations
                                          ----------------------------
Test function      Initial point          BFGS    ES-BFGS    S-BFGS
Rosenbrock (n = 2) (-,-)                  7       5          6
                   (-00,00)               30      8          8
                   (0 000, 0 000)         33      3          -
Cube (n = 2)       (-.,.6)                7       8          8
                   (.5,-50)               6       54         63
                   (00, 50)               44      44         35
Shallow (n = 2)    (5, 5)                 9       9          9
                   (-00,00)               5       3          3
                   (000,-5000)            9       9          -
Strait (n = 2)     (, )                   5       5          5
                   (00,00)                9       3          9
                   (000, 000)             3       3          -
Rosenbrock (n = 4) (-,-,-,-)              6       8          7
                   (-00,00,00,00)         30      9          8
                   (00, 00, 00,.5)        56      0          37
Cube (n = 4)       (.5,-.5,.5,-.5)        4       40         39
                   (0,-0, 0,-0)           8       7          8
                   (5,-5, 5,-5)           45      8          4
Shallow (n = 4)    (, 4,, 4)              7       8          7
                   (-00,400,00,400)       36      3          3
                   (000, 000, 000, 000)   78      74         99
Wood (n = 4)       (,-,,-)                9       4          -
                   (00,-5, 00,-5)         43      9          30
                   (000, 000, 000, 000)   80      56         58

DISCUSSION

From the calculations using the MAPLE software, an interesting relation between the step size and the eigenvalues is observed. At the initial iteration, the value of the step size and the smallest eigenvalue of the Hessian approximation are almost identical, but after a number of iterations the difference increases accordingly. This explains why ES-BFGS is effective as an initial scaling only, a similarity with the scaled algorithm proposed by Shanno and Phua (1978). When the scaling of the Hessian approximation was done at every iteration, the performance of the scaled BFGS deteriorated, or the method even failed altogether. The numerical results show that the performance of the new algorithm (ES-BFGS) is comparable to that of the algorithm with initial scaling (S-BFGS).

CONCLUSION

An improvement over the unscaled BFGS is achieved since, for most of the cases (18 out of 24), the number of iterations is reduced. Further investigation will be carried out using the alternative scaling factor λ_1 on other types of quasi-Newton methods.
The relationship between the smallest eigenvalue of the Hessian approximation and the optimal step size is also of interest for future research, raising the possibility of using the eigenvalue as a new step size in quasi-Newton methods.

For every test function, the minimum value is equal to zero; for the functions as given above, it is attained at (1, 1) for the two-variable functions and at (1, 1, 1, 1) for the four-variable functions.

REFERENCES

Byrd, R.H., J. Nocedal and Y.X. Yuan, 1987. Global convergence of a class of quasi-Newton methods on convex problems. SIAM J. Numer. Anal., 24: 1171-1189. DOI: 10.1137/0724077

Byrd, R.H. and J. Nocedal, 1989. A tool for the analysis of quasi-Newton methods with application to unconstrained minimization. SIAM J. Numer. Anal., 26: 727-739. DOI: 10.1137/0726042

Dennis, J.E. and J.J. Moré, 1977. Quasi-Newton methods, motivation and theory. SIAM Rev., 19: 46-89. DOI: 10.1137/1019005

Oren, S.S. and D.G. Luenberger, 1974. Self-Scaling Variable Metric (SSVM) algorithms. Manage. Sci., 20: 845-862.

Oren, S.S., 1973. Self-scaling variable metric algorithms without line search for unconstrained minimization. Math. Comput., 27: 863-874.
Sun, W. and Y.X. Yuan, 2006. Optimization Theory and Methods. Springer Optimization and Its Applications, 1st Edn., Springer, New York, USA. ISBN-10: 0-387-24975-3.

Shanno, D.F. and K.H. Phua, 1978. Matrix conditioning and nonlinear optimization. Math. Programm., 14: 149-160. DOI: 10.1007/BF01588962