New Accelerated Conjugate Gradient Algorithms for Unconstrained Optimization

Neculai Andrei

Research Institute for Informatics, Center for Advanced Modeling and Optimization, 8-10, Averescu Avenue, Bucharest, Romania, and Academy of Romanian Scientists, 54, Splaiul Independentei, Bucharest 5, Romania

Abstract. New accelerated nonlinear conjugate gradient algorithms, which are mainly modifications of the Dai and Yuan computational scheme for unconstrained optimization, are proposed. Under the exact line search the algorithms reduce to the Dai and Yuan conjugate gradient computational scheme. For inexact line search the algorithms satisfy the sufficient descent condition. Since the step lengths in conjugate gradient algorithms may differ from 1 by two orders of magnitude and tend to vary in a very unpredictable manner, we suggest an acceleration scheme able to improve the efficiency of the algorithms. A global convergence result is proved when the strong Wolfe line search conditions are used. Computational results for a set of 750 unconstrained optimization test problems show that these new conjugate gradient algorithms substantially outperform the Dai-Yuan conjugate gradient algorithm and its hybrid variants.

Keywords: Unconstrained optimization, conjugate gradient method, sufficient descent condition, conjugacy condition, Newton direction, numerical comparisons.

AMS 2000 Mathematics Subject Classification: 49M07, 49M10, 90C06, 65K.

Dedicated to the memory of Professor Gene H. Golub (1932-2007)

1. Introduction

Conjugate gradient methods represent an important class of unconstrained optimization algorithms with strong local and global convergence properties and modest memory requirements. A history of these algorithms has been given by Golub and O'Leary [28], as well as by O'Leary [38]. A survey of the development of different versions of nonlinear conjugate gradient methods, with special attention to their global convergence properties, is presented by Hager and Zhang [31]. A survey of their definitions, covering 40 conjugate gradient algorithms for unconstrained optimization, is given by Andrei [7]. The conjugate gradient algorithms were introduced in the early 1950s by Cornelius Lanczos [34, 35] and by Magnus Hestenes, with the cooperation of J.B. Rosser, G.E. Forsythe, L. Paige, M. Stein, R. Hayes and U. Hochstrasser at the Institute for Numerical Analysis, a part of the National Bureau of Standards in Los Angeles, and by Eduard Stiefel at the Eidgenössische Technische Hochschule in Zürich (see Hestenes and Stiefel [32]). Even though the conjugate gradient methods are now over 50 years old, they continue to be of considerable interest, particularly due to their convergence properties and to the very easy implementation of the corresponding algorithms in computer programs. Many researchers in this area have brought significant contributions, clarifying many theoretical and computational aspects of this class of algorithms. In particular, for the development of effective algorithms for solving basic linear algebra problems, Golub and Kahan [27] discussed the use of the Lanczos algorithm in computing the singular value decomposition of a matrix, and Concus, Golub and O'Leary [15] considered preconditioning of conjugate gradient algorithms.

An important extension of the conjugate gradient algorithm was given by Concus and Golub [16], where the Hermitian part of the matrix is taken as a preconditioner. A development of a block form of the Lanczos algorithm has been proposed by Golub, Underwood and Wilkinson [29]. Fletcher and Reeves [25] extended the conjugate gradient algorithm to the minimization of non-quadratic functions, thus opening an important area of research and applications. Developments of this algorithm have been considered, inter alia, by Polak and Ribière [41], Polyak [42], Perry [40], Shanno [44, 45], Gilbert and Nocedal [26], Liu and Storey [37], Hu and Storey [33] and Touati-Ahmed and Storey [46]. Recently, the papers of Nocedal [39], Dai and Yuan [20, 21], Hager and Zhang [30] and Andrei [1-6, 8-12] present important theoretical and computational contributions to the development of this class of algorithms.

In this paper we suggest new nonlinear conjugate gradient algorithms which are mainly modifications of the Dai and Yuan [20] conjugate gradient computational scheme. In these algorithms the direction $d_{k+1}$ is computed as a linear combination of $-g_{k+1}$ and $s_k$, i.e. $d_{k+1} = -\theta_{k+1} g_{k+1} + \beta_k s_k$, where $s_k = x_{k+1} - x_k$. The parameter $\theta_{k+1}$ is computed in such a way that the direction $d_{k+1}$ is the Newton direction or satisfies the conjugacy condition. On the other hand, $\beta_k$ is a proper modification of the Dai and Yuan computational scheme such that the direction $d_{k+1}$ satisfies the sufficient descent condition. For the exact line search the proposed algorithms reduce to the Dai and Yuan conjugate gradient computational scheme.

The paper has the following structure. In Section 2 we present the development of the conjugate gradient algorithms with sufficient descent condition, while in Section 3 we prove the global convergence of the algorithms under the strong Wolfe line search conditions. In Section 4 we present an acceleration of the algorithm, and in Sections 5 and 6 we describe the resulting algorithms and compare their computational performance against the Dai and Yuan method and its hybrid variants [21], using 750 unconstrained optimization test problems from the CUTE [14] library along with some other large-scale unconstrained optimization problems presented in [12]. Using the Dolan and Moré performance profiles [23] we show that these new accelerated conjugate gradient algorithms outperform the Dai-Yuan algorithm as well as its hybrid variants.

2. Modifications of the Dai-Yuan conjugate gradient algorithm

For solving the unconstrained optimization problem

$$\min \{ f(x) : x \in R^n \},$$ (1)

where $f : R^n \to R$ is continuously differentiable, we consider a nonlinear conjugate gradient algorithm

$$x_{k+1} = x_k + \alpha_k d_k,$$ (2)

where the stepsize $\alpha_k$ is positive and the directions are computed by the rule

$$d_{k+1} = -\theta_{k+1} g_{k+1} + \beta_k s_k, \quad d_0 = -g_0,$$ (3)

where

$$\beta_k = \frac{\|g_{k+1}\|^2}{y_k^T s_k} - \frac{(s_k^T g_{k+1})\,\|g_{k+1}\|^2}{(y_k^T s_k)^2},$$ (4)

and $\theta_{k+1}$ is a parameter which is to be determined. Here $g_k = \nabla f(x_k)$, $y_k = g_{k+1} - g_k$ and $s_k = x_{k+1} - x_k$. Observe that if $f$ is a quadratic function and $\alpha_k$ is selected to achieve the exact minimum of $f$ in the direction $d_k$, then $s_k^T g_{k+1} = 0$ and the formula (4) for $\beta_k$ reduces to the Dai and Yuan computational scheme [20]. However, in this paper we refer to general nonlinear functions and inexact line search.
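To make the computational scheme (3)-(4) concrete, here is a small sketch (not taken from the paper; it assumes NumPy vectors and uses function names of my own choosing) that evaluates $\beta_k$ and the direction $d_{k+1}$. For an exact line search $s_k^T g_{k+1} = 0$, so the first routine returns exactly the Dai-Yuan value $\|g_{k+1}\|^2/(y_k^T s_k)$.

```python
import numpy as np

def beta_modified_dy(g_next, s, y):
    """beta_k of (4): a modification of the Dai-Yuan scheme.
    g_next = g_{k+1}, s = s_k = x_{k+1} - x_k, y = y_k = g_{k+1} - g_k."""
    ys = y @ s                # y_k^T s_k, positive under the Wolfe conditions
    gg = g_next @ g_next      # ||g_{k+1}||^2
    sg = s @ g_next           # s_k^T g_{k+1}, zero for an exact line search
    return gg / ys - sg * gg / ys**2

def new_direction(theta_next, g_next, s, y):
    """d_{k+1} of (3): d_{k+1} = -theta_{k+1} g_{k+1} + beta_k s_k."""
    return -theta_next * g_next + beta_modified_dy(g_next, s, y) * s
```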

We were led to this computational scheme by modifying the Dai and Yuan algorithm

$$\beta_k^{DY} = \frac{\|g_{k+1}\|^2}{y_k^T s_k}$$

in order to conserve the sufficient descent condition and to have some other properties of an efficient conjugate gradient algorithm. Using a standard Wolfe line search, the Dai and Yuan method always generates descent directions and, under the Lipschitz assumption, it is globally convergent. In [17] Dai established a remarkable property relating the descent directions to the sufficient descent condition, showing that if there exist constants $\gamma_1$ and $\gamma_2$ such that $\gamma_1 \le \|g_k\| \le \gamma_2$ for all $k$, then for any $p \in (0,1)$ there exists a constant $c > 0$ such that the sufficient descent condition $g_i^T d_i \le -c\|g_i\|^2$ holds for at least $\lfloor pk \rfloor$ indices $i \in [0, k]$, where $\lfloor j \rfloor$ denotes the largest integer not exceeding $j$. In our algorithm the parameter $\beta_k$ is selected in such a manner that the sufficient descent condition is satisfied at every iteration. As we know, despite the strong convergence theory that has been developed for the Dai and Yuan method, it is susceptible to jamming, that is, it may begin to take small steps without making significant progress to the minimum. When the iterates jam, $y_k$ becomes tiny while $\|g_k\|$ is bounded away from zero. Therefore, $\beta_k$ is a proper modification of $\beta_k^{DY}$.

Theorem 1. If $\theta_{k+1} \ge 1/4$, then the direction $d_{k+1} = -\theta_{k+1} g_{k+1} + \beta_k s_k$ ($d_0 = -g_0$), where $\beta_k$ is given by (4), satisfies the sufficient descent condition

$$g_{k+1}^T d_{k+1} \le -\left(\theta_{k+1} - \frac{1}{4}\right)\|g_{k+1}\|^2.$$ (5)

Proof. Since $d_0 = -g_0$, we have $g_0^T d_0 = -\|g_0\|^2$, which satisfies (5). Multiplying (3) by $g_{k+1}^T$, we have

$$g_{k+1}^T d_{k+1} = -\theta_{k+1}\|g_{k+1}\|^2 + \frac{\|g_{k+1}\|^2 (g_{k+1}^T s_k)}{y_k^T s_k} - \frac{(s_k^T g_{k+1})^2 \|g_{k+1}\|^2}{(y_k^T s_k)^2}.$$ (6)

Now, using the inequality $u^T v \le \frac{1}{2}(\|u\|^2 + \|v\|^2)$ with $u = \frac{1}{\sqrt{2}}(y_k^T s_k) g_{k+1}$ and $v = \sqrt{2}\,(g_{k+1}^T s_k) g_{k+1}$, we have

$$\frac{\|g_{k+1}\|^2 (g_{k+1}^T s_k)}{y_k^T s_k} = \frac{[(y_k^T s_k) g_{k+1}]^T [(g_{k+1}^T s_k) g_{k+1}]}{(y_k^T s_k)^2} \le \frac{\frac{1}{4}(y_k^T s_k)^2 \|g_{k+1}\|^2 + (g_{k+1}^T s_k)^2 \|g_{k+1}\|^2}{(y_k^T s_k)^2} = \frac{1}{4}\|g_{k+1}\|^2 + \frac{(g_{k+1}^T s_k)^2 \|g_{k+1}\|^2}{(y_k^T s_k)^2}.$$ (7)

Using (7) in (6) we get (5). ∎

To conclude the sufficient descent condition from (5), the quantity $\theta_{k+1} - 1/4$ is required to be nonnegative. Supposing that $\theta_{k+1} - 1/4 > 0$, the direction given by (3) and (4) is a descent direction. Dai and Yuan [20, 21] present conjugate gradient schemes with the property that $g_k^T d_k < 0$ when $y_k^T s_k > 0$. If $f$ is strongly convex or the line search satisfies the Wolfe conditions, then $y_k^T s_k > 0$ and the Dai and Yuan scheme yields descent.
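The inequality (5) can also be checked numerically. The following toy check (illustrative only, under the stated assumption $y_k^T s_k > 0$; vectors and seeds are arbitrary choices of mine) builds the direction from (3)-(4) for random data and any $\theta_{k+1} \ge 1/4$ and asserts the sufficient descent bound.

```python
import numpy as np

rng = np.random.default_rng(0)
for _ in range(1000):
    g, s, y = rng.normal(size=(3, 10))     # stand-ins for g_{k+1}, s_k, y_k
    ys = y @ s
    if ys <= 0.1:                          # the analysis assumes y_k^T s_k > 0
        continue
    theta = 0.25 + rng.uniform(0.0, 2.0)   # any theta_{k+1} >= 1/4
    beta = (g @ g) / ys - (s @ g) * (g @ g) / ys**2   # beta_k of (4)
    d = -theta * g + beta * s              # d_{k+1} of (3)
    assert g @ d <= -(theta - 0.25) * (g @ g) + 1e-9  # inequality (5)
```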

In our algorithm, observe that if $\theta_{k+1} \ge 1/4$ for all $k$ and the line search satisfies the Wolfe conditions, then for all $k$ the search directions given by (3) and (4) satisfy the sufficient descent condition. Note that in (5) we bound $g_{k+1}^T d_{k+1}$ by $-(\theta_{k+1} - 1/4)\|g_{k+1}\|^2$, while for the scheme of Dai and Yuan only the negativity of $g_{k+1}^T d_{k+1}$ is established.

To determine the parameter $\theta_{k+1}$ in (3) we suggest the following two procedures.

A) Our motivation to get a good algorithm for solving (1) is to choose the parameter $\theta_{k+1}$ in such a way that for every $k$ the direction $d_{k+1}$ given by (3) is the Newton direction. Therefore, from the equation

$$-\nabla^2 f(x_{k+1})^{-1} g_{k+1} = -\theta_{k+1} g_{k+1} + \beta_k s_k,$$ (8)

after some algebra we get

$$\theta_{k+1} = \frac{1}{s_k^T \nabla^2 f(x_{k+1}) g_{k+1}} \left[ \left( \frac{\|g_{k+1}\|^2}{y_k^T s_k} - \frac{(s_k^T g_{k+1})\,\|g_{k+1}\|^2}{(y_k^T s_k)^2} \right) s_k^T \nabla^2 f(x_{k+1}) s_k + s_k^T g_{k+1} \right].$$ (9)

The salient point in this formula for $\theta_{k+1}$ is the presence of the Hessian. For large-scale problems, choices for the update parameter that do not require the evaluation of the Hessian matrix are often preferred in practice to methods that require the Hessian at each iteration. Therefore, in order to have an algorithm for solving large-scale problems, we assume that in (8) we use an approximation $B_{k+1}$ of the true Hessian $\nabla^2 f(x_{k+1})$ and let $B_{k+1}$ satisfy the quasi-Newton equation $B_{k+1} s_k = y_k$. This leads us to

$$\theta_{k+1} = \frac{1}{y_k^T g_{k+1}} \left[ \|g_{k+1}\|^2 - \frac{(s_k^T g_{k+1})\,\|g_{k+1}\|^2}{y_k^T s_k} + s_k^T g_{k+1} \right].$$ (10)

Observe that if $\theta_{k+1}$ given by (10) is greater than or equal to $1/4$, then according to Theorem 1 the direction (3) satisfies the sufficient descent condition (5). On the other hand, if in (10) $\theta_{k+1} < 1/4$, then we take ex abrupto $\theta_{k+1} = 1$ in (3).

B) The second procedure is based on the conjugacy condition. Dai and Liao [18] introduced the conjugacy condition $y_k^T d_{k+1} = -t\, s_k^T g_{k+1}$, where $t \ge 0$ is a scalar. This is indeed very reasonable since in real computation the inexact line search is generally used. However, this condition is very dependent on the nonnegative parameter $t$, for which we do not know any formula to choose it in an optimal manner. Therefore, even if in our developments we use the inexact line search, we adopt here a more conservative approach and consider the conjugacy condition $y_k^T d_{k+1} = 0$. This leads us to

$$\theta_{k+1} = \frac{1}{y_k^T g_{k+1}} \left[ \|g_{k+1}\|^2 - \frac{(s_k^T g_{k+1})\,\|g_{k+1}\|^2}{y_k^T s_k} \right].$$ (11)

Again, if $\theta_{k+1}$ given by (11) is greater than or equal to $1/4$, then according to Theorem 1 the direction (3) satisfies the sufficient descent condition (5). On the other hand, if in (11) $\theta_{k+1} < 1/4$, then we take $\theta_{k+1} = 1$ in (3).
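The two choices of $\theta_{k+1}$ can be summarized in a few lines. The sketch below is an illustration under my own naming conventions (it is not the paper's Fortran code, and it omits safeguards against $y_k^T g_{k+1} = 0$ or $y_k^T s_k = 0$); inputs are NumPy vectors. It implements (10) for the quasi-Newton based choice and (11) for the conjugacy based one, with the fallback $\theta_{k+1} = 1$ whenever the computed value drops below $1/4$.

```python
def theta_next(g_next, s, y, conjugacy=False):
    """theta_{k+1} from (10) (quasi-Newton choice, B_{k+1} s_k = y_k) or,
    when conjugacy=True, from (11) (conjugacy condition y_k^T d_{k+1} = 0).
    Returns 1 whenever the computed value falls below 1/4."""
    ys = y @ s
    yg = y @ g_next
    gg = g_next @ g_next
    sg = s @ g_next
    theta = (gg - sg * gg / ys) / yg       # common part of (10) and (11)
    if not conjugacy:
        theta += sg / yg                   # extra s_k^T g_{k+1} term in (10)
    return theta if theta >= 0.25 else 1.0
```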

The line search in the conjugate gradient algorithms for the computation of $\alpha_k$ is often based on the standard Wolfe conditions:

$$f(x_k + \alpha_k d_k) - f(x_k) \le \rho\, \alpha_k\, g_k^T d_k,$$ (12)

$$g_{k+1}^T d_k \ge \sigma\, g_k^T d_k,$$ (13)

where $d_k$ is a descent direction and $0 < \rho \le \sigma < 1$. In [21] Dai and Yuan proved the global convergence of a conjugate gradient algorithm for which $\beta_k = \beta_k^{DY} t_k$, where $t_k \in [-c, 1]$ with $c = (1-\sigma)/(1+\sigma)$. Our algorithm is a modification of the Dai and Yuan scheme with the following property. Observe that

$$\beta_k = \frac{\|g_{k+1}\|^2}{y_k^T s_k} - \frac{(s_k^T g_{k+1})\,\|g_{k+1}\|^2}{(y_k^T s_k)^2} = \beta_k^{DY} r_k,$$ (14)

where

$$r_k = 1 - \frac{s_k^T g_{k+1}}{y_k^T s_k}.$$ (15)

From the second Wolfe condition (13) it follows that $s_k^T g_{k+1} \ge \sigma s_k^T g_k = \sigma (s_k^T g_{k+1} - y_k^T s_k)$, i.e. $(1-\sigma) s_k^T g_{k+1} \ge -\sigma y_k^T s_k$. Since by the Wolfe conditions $y_k^T s_k > 0$, it follows that $s_k^T g_{k+1}/(y_k^T s_k) \ge -\sigma/(1-\sigma)$. Hence $r_k \le 1/(1-\sigma)$. Note also that $r_k = -g_k^T s_k/(y_k^T s_k) \ge 0$, so that $\beta_k \ge 0$. Therefore $\beta_k \le \beta_k^{DY}/(1-\sigma)$.

3. Convergence analysis

In this section we analyze the convergence of the algorithm (2), (3), (4) and (10) or (11), where $d_0 = -g_0$. In the following we consider that $g_k \ne 0$ for all $k$, otherwise a stationary point has been obtained. Assume that:

(i) The level set $S = \{x \in R^n : f(x) \le f(x_0)\}$ is bounded.

(ii) In a neighborhood $N$ of $S$, the function $f$ is continuously differentiable and its gradient is Lipschitz continuous, i.e. there exists a constant $L > 0$ such that $\|\nabla f(x) - \nabla f(y)\| \le L\|x - y\|$ for all $x, y \in N$.

Under these assumptions on $f$ there exists a constant $\Gamma \ge 0$ such that $\|\nabla f(x)\| \le \Gamma$ for all $x \in S$. In order to prove the global convergence, we assume that the step size $\alpha_k$ in (2) is obtained by the strong Wolfe line search, that is,

$$f(x_k + \alpha_k d_k) - f(x_k) \le \rho\, \alpha_k\, g_k^T d_k,$$ (3.1)

$$|g_{k+1}^T d_k| \le -\sigma\, g_k^T d_k,$$ (3.2)

where $\rho$ and $\sigma$ are positive constants such that $0 < \rho \le \sigma < 1$. Dai et al. [22] proved that for any conjugate gradient method with strong Wolfe line search the following general result holds.

Lemma 3.1. Suppose that the assumptions (i) and (ii) hold and consider any conjugate gradient method (2) and (3), where $d_k$ is a descent direction and $\alpha_k$ is obtained by the strong Wolfe line search (3.1) and (3.2). If

$$\sum_{k \ge 0} \frac{1}{\|d_k\|^2} = \infty,$$ (3.3)

then

$$\liminf_{k \to \infty} \|g_k\| = 0.$$ (3.4)

Theorem 3.2. Suppose that the assumptions (i) and (ii) hold and consider the algorithm (2), (3), (4) and (10) or (11), where $d_{k+1}$ is a descent direction and $\alpha_k$ is obtained by the strong Wolfe line search (3.1) and (3.2). If there exists a constant $\gamma \ge 0$ such that $\|\nabla f(x)\| \ge \gamma$, if $1/4 \le \theta_k \le \tau$, where $\tau$ is a positive constant, and if the angle $\varphi_k$ between $-g_k$ and $d_k$ is bounded, i.e. $\cos \varphi_k \ge \xi > 0$ for all $k = 0, 1, \ldots$, then the algorithm satisfies $\liminf_{k \to \infty} \|g_k\| = 0$.

Proof. Observe that

$$y_k^T s_k = g_{k+1}^T s_k - g_k^T s_k \ge (\sigma - 1) g_k^T s_k = (1-\sigma)(-g_k^T s_k).$$

But $-g_k^T s_k = \|g_k\|\,\|s_k\| \cos \varphi_k$. Since $d_k$ is a descent direction, it follows that $-g_k^T s_k \ge \gamma \xi \|s_k\| \ge 0$ for all $k = 0, 1, \ldots$, i.e.

$$y_k^T s_k \ge (1-\sigma)\,\gamma \xi\, \|s_k\|.$$

With these, and since $\|g_{k+1}\| \le \Gamma$,

$$0 \le \beta_k \le \frac{\beta_k^{DY}}{1-\sigma} = \frac{\|g_{k+1}\|^2}{(1-\sigma)\, y_k^T s_k} \le \frac{\Gamma^2}{(1-\sigma)^2 \xi \gamma \|s_k\|} = \frac{\eta}{\|s_k\|},$$

where

$$\eta = \frac{\Gamma^2}{(1-\sigma)^2 \xi \gamma}.$$

Therefore

$$\|d_{k+1}\| \le \theta_{k+1} \|g_{k+1}\| + \beta_k \|s_k\| \le \tau \Gamma + \eta.$$

This relation shows that

$$\sum_{k \ge 0} \frac{1}{\|d_k\|^2} \ge \sum_{k \ge 0} \frac{1}{(\tau \Gamma + \eta)^2} = \infty.$$

Hence, from Lemma 3.1 it follows that $\liminf_{k \to \infty} \|g_k\| = 0$. ∎

4. Acceleration of the algorithm

In conjugate gradient algorithms the search directions tend to be poorly scaled and, as a consequence, the line search must perform more function evaluations in order to obtain a suitable steplength $\alpha_k$. In order to improve the performance of conjugate gradient algorithms, efforts were directed to design procedures for direction computation based on second order information; for example, CONMIN [43] and SCALCG [2-5] take this idea of BFGS preconditioning. In this section we focus on the step length modification. In the context of the gradient descent algorithm with backtracking, the step length modification was considered for the first time in [6]. Nocedal [39] pointed out that in conjugate gradient methods the step lengths may differ from 1 in a very unpredictable manner: they can be larger or smaller than 1 depending on how the problem is scaled. Numerical comparisons between conjugate gradient methods and the limited memory quasi-Newton method, by Liu and Nocedal [36], show that the latter is more successful [8]. One explanation of the efficiency of the limited memory quasi-Newton method is given by its ability to accept unit step lengths along the iterations. In this section we take advantage of this behavior of conjugate gradient algorithms and present an acceleration scheme. Basically, it modifies the step length in a multiplicative manner to improve the reduction of the function values along the iterations.

Line search. For using the algorithm (2) one of the crucial elements is the stepsize computation. For the sake of generality, in the following we consider line searches that satisfy either the Goldstein conditions [24]:

$$\rho_1 \alpha_k g_k^T d_k \le f(x_k + \alpha_k d_k) - f(x_k) \le \rho_2 \alpha_k g_k^T d_k,$$ (4.1)

where $0 < \rho_2 < 1/2 < \rho_1 < 1$ and $\alpha_k > 0$, or the Wolfe conditions (12) and (13).

Proposition 4.1. Assume that $d_k$ is a descent direction and that $\nabla f$ satisfies the Lipschitz condition $\|\nabla f(x) - \nabla f(x_k)\| \le L\|x - x_k\|$ for all $x$ on the line segment connecting $x_k$ and $x_{k+1}$, where $L$ is a positive constant. If the line search satisfies the Goldstein conditions (4.1), then

$$\alpha_k \ge \frac{(1-\rho_1)\,|g_k^T d_k|}{L\,\|d_k\|^2}.$$ (4.2)

If the line search satisfies the Wolfe conditions (12) and (13), then

$$\alpha_k \ge \frac{(1-\sigma)\,|g_k^T d_k|}{L\,\|d_k\|^2}.$$ (4.3)

Proof. If the Goldstein conditions are satisfied, then using the mean value theorem from (4.1) we get

$$\rho_1 \alpha_k g_k^T d_k \le f(x_k + \alpha_k d_k) - f(x_k) = \alpha_k \nabla f(x_k + \xi_k d_k)^T d_k \le \alpha_k g_k^T d_k + L \alpha_k^2 \|d_k\|^2,$$

where $\xi_k \in [0, \alpha_k]$ and the last inequality follows from the Lipschitz condition. From this inequality we immediately get (4.2). Now, to prove (4.3), subtract $g_k^T d_k$ from both sides of (13); using the Lipschitz condition we get

$$(\sigma - 1)\, g_k^T d_k \le (g_{k+1} - g_k)^T d_k \le L \alpha_k \|d_k\|^2.$$ (4.4)

But $d_k$ is a descent direction and $\sigma < 1$, so we immediately get (4.3). ∎

Therefore, under the Goldstein or the Wolfe line search conditions, $\alpha_k$ is bounded away from zero, i.e. there exists a positive constant $\omega$ such that $\alpha_k \ge \omega$.

Acceleration scheme [6]. Given the initial point $x_0$ we can compute $f_0 = f(x_0)$ and $g_0 = \nabla f(x_0)$, and by the Wolfe line search conditions (12) and (13) the steplength $\alpha_0$ is determined. With these, the next iterate is computed as $x_1 = x_0 + \alpha_0 d_0$ ($d_0 = -g_0$), where $f_1$ and $g_1$ are immediately determined and the direction $d_1$ can be computed as $d_1 = -\theta_1 g_1 + \beta_0 s_0$, where the conjugate gradient parameter $\beta_0$ is computed as in (4) and the scaling factor $\theta_1$ is computed as in (10) or (11). Therefore, at iteration $k = 1, 2, \ldots$ we know $x_k$, $f_k$, $g_k$ and $d_k = -\theta_k g_k + \beta_{k-1} s_{k-1}$. Suppose that $d_k$ is a descent direction (i.e. $\theta_k \ge 1/4$). By the Wolfe line search (12) and (13) we can compute $\alpha_k$, with which the point $z = x_k + \alpha_k d_k$ is determined. The first Wolfe condition (12) shows that the steplength $\alpha_k > 0$ satisfies

$$f(z) = f(x_k + \alpha_k d_k) \le f(x_k) + \rho\, \alpha_k\, g_k^T d_k.$$

With these, let us introduce the accelerated conjugate gradient algorithm by means of the following iterative scheme:

$$x_{k+1} = x_k + \gamma_k \alpha_k d_k,$$ (4.5)

where $\gamma_k > 0$ is a parameter which is to be determined in such a manner as to improve the behavior of the algorithm. Now, we have:

$$f(x_k + \alpha_k d_k) = f(x_k) + \alpha_k g_k^T d_k + \tfrac{1}{2} \alpha_k^2 d_k^T \nabla^2 f(x_k) d_k + o(\|\alpha_k d_k\|^2).$$ (4.6)

On the other hand, for $\gamma > 0$ we have:

$$f(x_k + \gamma \alpha_k d_k) = f(x_k) + \gamma \alpha_k g_k^T d_k + \tfrac{1}{2} \gamma^2 \alpha_k^2 d_k^T \nabla^2 f(x_k) d_k + o(\|\gamma \alpha_k d_k\|^2).$$ (4.7)

With these we can write

$$f(x_k + \gamma \alpha_k d_k) = f(x_k + \alpha_k d_k) + \Psi_k(\gamma),$$ (4.8)

where

$$\Psi_k(\gamma) = \tfrac{1}{2}(\gamma^2 - 1)\, \alpha_k^2 d_k^T \nabla^2 f(x_k) d_k + (\gamma - 1)\, \alpha_k g_k^T d_k + \gamma^2 o(\|\alpha_k d_k\|^2) - o(\|\alpha_k d_k\|^2).$$ (4.9)

Let us denote

$$a_k = \alpha_k g_k^T d_k, \quad b_k = \alpha_k^2 d_k^T \nabla^2 f(x_k) d_k, \quad \varepsilon_k = o(\|\alpha_k d_k\|^2).$$ (4.10)

Observe that $a_k \le 0$, since $d_k$ is a descent direction, and that for convex functions $b_k \ge 0$. Therefore

$$\Psi_k(\gamma) = \tfrac{1}{2}(\gamma^2 - 1) b_k + (\gamma - 1) a_k + (\gamma^2 - 1)\varepsilon_k.$$

Now, we see that $\Psi_k'(\gamma) = (b_k + 2\varepsilon_k)\gamma + a_k$ and $\Psi_k'(\gamma_m) = 0$, where

$$\gamma_m = \frac{-a_k}{b_k + 2\varepsilon_k}.$$ (4.11)

Observe that $\Psi_k'(0) = a_k \le 0$. Therefore, assuming that $b_k + 2\varepsilon_k > 0$, $\Psi_k(\gamma)$ is a convex quadratic function with minimum attained at $\gamma_m$ and

$$\Psi_k(\gamma_m) = -\frac{(a_k + (b_k + 2\varepsilon_k))^2}{2(b_k + 2\varepsilon_k)} \le 0.$$

Considering $\gamma = \gamma_m$ in (4.8) and since $b_k \ge 0$, we see that for every $k$

$$f(x_k + \gamma_m \alpha_k d_k) = f(x_k + \alpha_k d_k) - \frac{(a_k + (b_k + 2\varepsilon_k))^2}{2(b_k + 2\varepsilon_k)} \le f(x_k + \alpha_k d_k),$$

which is a possible improvement of the values of the function $f$ (when $a_k + (b_k + 2\varepsilon_k) \ne 0$). Therefore, using this simple multiplicative modification of the stepsize $\alpha_k$ as $\gamma_k \alpha_k$, where $\gamma_k = \gamma_m = -a_k/(b_k + 2\varepsilon_k)$, we get

$$f(x_{k+1}) = f(x_k + \gamma_k \alpha_k d_k) \le f(x_k) + \rho \alpha_k g_k^T d_k - \frac{(a_k + (b_k + 2\varepsilon_k))^2}{2(b_k + 2\varepsilon_k)} = f(x_k) + \rho a_k - \frac{(a_k + (b_k + 2\varepsilon_k))^2}{2(b_k + 2\varepsilon_k)} \le f(x_k),$$ (4.12)

since $a_k \le 0$ ($d_k$ is a descent direction).
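For a convex quadratic the remainder $\varepsilon_k$ vanishes and $\gamma_m = -a_k/b_k$ minimizes $\Psi_k$ exactly. The following toy check (illustrative only; the matrix, seed and trial steplength are arbitrary choices, not data from the paper) confirms that the accelerated point never has a larger function value than the trial point $x_k + \alpha_k d_k$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
A = rng.normal(size=(n, n))
A = A @ A.T + n * np.eye(n)       # positive definite Hessian: f is a convex quadratic
c = rng.normal(size=n)
f = lambda x: 0.5 * x @ A @ x + c @ x
x = rng.normal(size=n)
g = A @ x + c                     # gradient at x_k
d = -g                            # a descent direction
alpha = 0.3                       # an arbitrary trial steplength
a = alpha * (g @ d)               # a_k <= 0
b = alpha**2 * (d @ A @ d)        # b_k = alpha_k^2 d_k^T Hessian d_k > 0
gamma_m = -a / b                  # (4.11) with epsilon_k = 0
assert f(x + gamma_m * alpha * d) <= f(x + alpha * d) + 1e-12
```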

Observe that if $d_k$ is a descent direction and the contribution of $\varepsilon_k$ is neglected, then

$$\frac{(a_k + (b_k + 2\varepsilon_k))^2}{2(b_k + 2\varepsilon_k)} \approx \frac{(a_k + b_k)^2}{2 b_k},$$

and from (4.12) we get

$$f(x_{k+1}) \le f(x_k) + \rho a_k - \frac{(a_k + b_k)^2}{2 b_k} \le f(x_k).$$

Therefore, neglecting the contribution of $\varepsilon_k$, we still get an improvement of the function values.

Now, in order to get the algorithm, we have to determine a way to compute $b_k$. For this, at the point $z = x_k + \alpha_k d_k$ we have

$$f(z) = f(x_k + \alpha_k d_k) = f(x_k) + \alpha_k g_k^T d_k + \tfrac{1}{2} \alpha_k^2 d_k^T \nabla^2 f(\bar{x}_k) d_k,$$

where $\bar{x}_k$ is a point on the line segment connecting $x_k$ and $z$. On the other hand, at the point $x_k = z - \alpha_k d_k$ we have

$$f(x_k) = f(z - \alpha_k d_k) = f(z) - \alpha_k g_z^T d_k + \tfrac{1}{2} \alpha_k^2 d_k^T \nabla^2 f(\tilde{x}_k) d_k,$$

where $g_z = \nabla f(z)$ and $\tilde{x}_k$ is a point on the line segment connecting $x_k$ and $z$. Having in view the local character of the search and that the distance between $x_k$ and $z$ is small enough, we can consider $\bar{x}_k = \tilde{x}_k = x_k$. So, adding the above equalities we get

$$b_k = -\alpha_k\, \bar{y}_k^T d_k,$$ (4.13)

where $\bar{y}_k = g_k - g_z$.

Observe that if $|a_k| > b_k$, then $\gamma_k > 1$. In this case $\gamma_k \alpha_k > \alpha_k$, and it is also possible that $\gamma_k \alpha_k \le 1$ or $\gamma_k \alpha_k > 1$; hence the steplength $\gamma_k \alpha_k$ can be greater than 1. On the other hand, if $|a_k| \le b_k$, then $\gamma_k \le 1$; in this case $\gamma_k \alpha_k \le \alpha_k$, so the steplength $\gamma_k \alpha_k$ is reduced. Therefore, if $|a_k| \ne b_k$, then $\gamma_k \ne 1$ and the steplength $\alpha_k$ computed by the Wolfe conditions will be modified, by increasing or by reducing it through the factor $\gamma_k$.

Neglecting $\varepsilon_k$ in (4.10), we see that $\Psi_k(1) = 0$, and if $|a_k| \le b_k/2$, then $\Psi_k(0) = -a_k - b_k/2 \le 0$ and $\gamma_m < 1$. Therefore, for any $\gamma \in [0,1]$, $\Psi_k(\gamma) \le 0$. As a consequence, for any $\gamma \in (0,1)$ it follows that $f(x_k + \gamma \alpha_k d_k) \le f(x_k + \alpha_k d_k)$. In this case, for any $\gamma \in [0,1]$, $\gamma \alpha_k \le \alpha_k$. However, in our algorithm we selected $\gamma_k = \gamma_m$ as the point achieving the minimum value of $\Psi_k(\gamma)$.

5. AMDY and AMDYC Algorithms

Considering the above definitions of $g_k$, $s_k$, $y_k$ and $\bar{y}_k$, we present the following conjugate gradient algorithms, which are accelerated, modified versions of the Dai and Yuan algorithm with Newton direction (AMDY) or with conjugacy condition (AMDYC).

AMDY and AMDYC Algorithms

Step 1. Initialization. Select $x_0 \in R^n$ and the parameters $0 < \rho < \sigma < 1$. Compute $f(x_0)$ and $g_0$. Consider $d_0 = -g_0$ and $\alpha_0 = 1/\|g_0\|$. Set $k = 0$.

Step 2. Test for continuation of iterations. If $\|g_k\|_\infty \le 10^{-6}$, then stop.

Step 3. Line search. Compute $\alpha_k$ satisfying the Wolfe line search conditions (12) and (13).

Step 4. Compute $z = x_k + \alpha_k d_k$, $g_z = \nabla f(z)$ and $\bar{y}_k = g_k - g_z$.

Step 5. Compute $a_k = \alpha_k g_k^T d_k$ and $b_k = -\alpha_k \bar{y}_k^T d_k$.

Step 6. If $b_k > 0$, then compute $\gamma_k = -a_k/b_k$ and update the variables as $x_{k+1} = x_k + \gamma_k \alpha_k d_k$; otherwise update the variables as $x_{k+1} = x_k + \alpha_k d_k$. Compute $f_{k+1}$ and $g_{k+1}$. Compute $y_k = g_{k+1} - g_k$ and $s_k = x_{k+1} - x_k$.

Step 7. $\theta_{k+1}$ computation. For the algorithm AMDY, $\theta_{k+1}$ is computed as in (10). For the algorithm AMDYC, $\theta_{k+1}$ is computed as in (11). If $\theta_{k+1} < 1/4$, then set $\theta_{k+1} = 1$.

Step 8. Direction computation. Compute $d = -\theta_{k+1} g_{k+1} + \beta_k s_k$, where $\beta_k$ is computed as in (4). If

$$g_{k+1}^T d \le -10^{-3} \|d\|_2\, \|g_{k+1}\|_2,$$ (5.1)

then define $d_{k+1} = d$, otherwise set $d_{k+1} = -g_{k+1}$. Compute the initial guess $\alpha_k = \alpha_{k-1} \|d_{k-1}\| / \|d_k\|$, set $k = k+1$ and continue with Step 2.

It is well known that if $f$ is bounded along the direction $d_k$, then there exists a stepsize $\alpha_k$ satisfying the Wolfe line search conditions (12) and (13). In our algorithm, when the angle between $d$ and $-g_{k+1}$ is not acute enough, we restart the algorithm with the negative gradient $-g_{k+1}$. More sophisticated reasons for restarting the algorithms have been proposed in the literature, but we are interested in the performance of a conjugate gradient algorithm that uses this restart criterion, associated with a direction satisfying the sufficient descent condition. Under reasonable assumptions, conditions (12), (13) and (5.1) are sufficient to prove the global convergence of the algorithm. The initial selection of the step length crucially affects the practical behaviour of the algorithm. At every iteration the starting guess for the step $\alpha_k$ in the line search is computed as $\alpha_{k-1} \|d_{k-1}\| / \|d_k\|$. This selection was considered for the first time by Shanno and Phua in CONMIN [43]; it is also used in the packages SCG by Birgin and Martínez [13] and SCALCG by Andrei [2-5].
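To show how Steps 1-8 fit together, here is a compact Python sketch of the whole iteration. It is only an illustration of the structure of the algorithm, not the author's Fortran implementation: SciPy's `line_search` is used as a stand-in for the Wolfe line search of Step 3, the usual safeguards (e.g. against $y_k^T s_k \approx 0$ or a failed line search) are reduced to a minimum, and all names are my own.

```python
import numpy as np
from scipy.optimize import line_search    # Wolfe line search, used here as a stand-in

def accelerated_modified_dy(f, grad, x0, variant="AMDY",
                            eps=1e-6, max_iter=10000, rho=1e-4, sigma=0.9):
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                 # Step 1
    alpha = 1.0 / np.linalg.norm(g)
    for _ in range(max_iter):
        if np.max(np.abs(g)) <= eps:       # Step 2: ||g_k||_inf <= 1e-6
            break
        ls = line_search(f, grad, x, d, gfk=g, c1=rho, c2=sigma)   # Step 3
        if ls[0] is not None:
            alpha = ls[0]                  # otherwise keep the previous value
        z = x + alpha * d                  # Step 4
        gz = grad(z)
        a = alpha * (g @ d)                # Step 5: a_k = alpha_k g_k^T d_k
        b = -alpha * ((g - gz) @ d)        #         b_k = -alpha_k (g_k - g_z)^T d_k
        gamma = -a / b if b > 0 else 1.0   # Step 6: acceleration factor
        x_new = x + gamma * alpha * d
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        ys, yg = y @ s, y @ g_new
        gg, sg = g_new @ g_new, s @ g_new
        theta = (gg - sg * gg / ys) / yg   # Step 7: (11); AMDY adds the extra term of (10)
        if variant == "AMDY":
            theta += sg / yg
        if theta < 0.25:
            theta = 1.0
        beta = gg / ys - sg * gg / ys**2   # beta_k of (4)
        d_new = -theta * g_new + beta * s  # Step 8
        if g_new @ d_new > -1e-3 * np.linalg.norm(d_new) * np.linalg.norm(g_new):
            d_new = -g_new                 # restart when the test (5.1) fails
        alpha *= np.linalg.norm(d) / np.linalg.norm(d_new)  # Step 8 guess (SciPy ignores it)
        x, g, d = x_new, g_new, d_new
    return x
```

For example, `accelerated_modified_dy(f, grad, x0, variant="AMDYC")` runs the conjugacy-based variant; a production code would of course add the missing safeguards.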

Proposition 5.1. Suppose that $f$ is a uniformly convex function on the level set $S = \{x : f(x) \le f(x_0)\}$ and that the direction $d_k$ satisfies the sufficient descent condition $g_k^T d_k \le -c_1 \|g_k\|^2$, where $c_1 > 0$, as well as $\|d_k\| \le c_2 \|g_k\|$, where $c_2 > 0$. Then the sequence generated by AMDY or AMDYC converges linearly to $x^*$, the solution of problem (1).

Proof. From (4.12) we have $f(x_{k+1}) \le f(x_k)$ for all $k$. Since $f$ is bounded below, it follows that $\lim_{k \to \infty} (f(x_k) - f(x_{k+1})) = 0$. Now, since $f$ is uniformly convex, there exist positive constants $m$ and $M$ such that $m I \preceq \nabla^2 f(x) \preceq M I$ on $S$. Suppose that $x_k + \alpha d_k \in S$ and $x_k + \gamma_m \alpha d_k \in S$ for all $\alpha > 0$. We have

$$f(x_k + \gamma_m \alpha d_k) \le f(x_k + \alpha d_k) - \frac{(a_k + b_k)^2}{2 b_k}.$$

But, from uniform convexity we have the following quadratic upper bound on $f(x_k + \alpha d_k)$:

$$f(x_k + \alpha d_k) \le f(x_k) + \alpha g_k^T d_k + \tfrac{1}{2} M \alpha^2 \|d_k\|^2.$$

Therefore,

$$f(x_k + \alpha d_k) \le f(x_k) - \alpha c_1 \|g_k\|^2 + \tfrac{1}{2} M c_2^2 \alpha^2 \|g_k\|^2 = f(x_k) + \left(-c_1 \alpha + \tfrac{1}{2} M c_2^2 \alpha^2\right) \|g_k\|^2.$$

Observe that for $0 \le \alpha \le c_1/(M c_2^2)$ we have $-c_1 \alpha + \tfrac{1}{2} M c_2^2 \alpha^2 \le -\tfrac{1}{2} c_1 \alpha$, which follows from the convexity of $-c_1 \alpha + (M c_2^2/2)\alpha^2$. Using this result we get

$$f(x_k + \alpha d_k) \le f(x_k) - \tfrac{1}{2} c_1 \alpha \|g_k\|^2 \le f(x_k) - \rho c_1 \alpha \|g_k\|^2,$$

since $\rho < 1/2$. From Proposition 4.1 the Wolfe line search terminates with a value $\alpha_k \ge \omega > 0$. Therefore, for $0 \le \alpha_k \le c_1/(M c_2^2)$, this provides a lower bound on the decrease of the function $f$, i.e.

$$f(x_k + \alpha_k d_k) \le f(x_k) - \rho c_1 \omega \|g_k\|^2.$$ (5.2)

On the other hand,

$$\frac{(a_k + b_k)^2}{2 b_k} \ge \frac{(\alpha_k^2 M c_2^2 - \alpha_k c_1)^2 \|g_k\|^4}{2 M \alpha_k^2 c_2^2 \|g_k\|^2} = \frac{(\alpha_k M c_2^2 - c_1)^2}{2 M c_2^2} \|g_k\|^2 \ge \frac{(\omega M c_2^2 - c_1)^2}{2 M c_2^2} \|g_k\|^2.$$ (5.3)

Considering (5.2) and (5.3) we get

$$f(x_k + \gamma_m \alpha_k d_k) \le f(x_k) - \left( \rho c_1 \omega + \frac{(\omega M c_2^2 - c_1)^2}{2 M c_2^2} \right) \|g_k\|^2.$$ (5.4)

Therefore,

$$f(x_k) - f(x_k + \gamma_m \alpha_k d_k) \ge \left( \rho c_1 \omega + \frac{(\omega M c_2^2 - c_1)^2}{2 M c_2^2} \right) \|g_k\|^2.$$

But $f(x_k) - f(x_{k+1}) \to 0$ and, as a consequence, $\|g_k\|$ goes to zero, i.e. $x_k$ converges to $x^*$. Having in view that $f(x_k)$ is a nonincreasing sequence, it follows that $f(x_k)$ converges to $f^* = f(x^*)$. From (5.4) we see that

$$f(x_{k+1}) \le f(x_k) - \left( \rho c_1 \omega + \frac{(\omega M c_2^2 - c_1)^2}{2 M c_2^2} \right) \|g_k\|^2.$$

Combining this with $\|g_k\|^2 \ge 2m (f(x_k) - f^*)$ and subtracting $f^*$ from both sides, we conclude that

$$f(x_{k+1}) - f^* \le \bar{c}\, (f(x_k) - f^*),$$ (5.5)

where

$$\bar{c} = 1 - 2m \left( \rho c_1 \omega + \frac{(\omega M c_2^2 - c_1)^2}{2 M c_2^2} \right) < 1.$$

Therefore, $f(x_k)$ converges to $f^*$ at least as fast as a geometric series with a factor that depends on the parameter $\rho$ in the first Wolfe condition and on the bounds $m$ and $M$ of the Hessian. So, the convergence of the accelerated algorithm is at least linear. ∎

6. Numerical results and comparisons

In this section we present the computational performance of a Fortran implementation of the AMDY and AMDYC algorithms on a set of 750 unconstrained optimization test problems. The test problems are the unconstrained problems in the CUTE [14] library, along with other large-scale optimization problems presented in [12]. We selected 75 large-scale unconstrained optimization problems in extended or generalized form; for each function we considered ten numerical experiments with an increasing number of variables, n = 1000, 2000, ..., 10000. All algorithms implement the Wolfe line search conditions with $\rho = 0.0001$ and $\sigma = 0.9$, and the same stopping criterion $\|g_k\|_\infty \le 10^{-6}$, where $\|\cdot\|_\infty$ is the maximum absolute component of a vector.

The comparisons of the algorithms are given in the following context. Let $f_i^{ALG1}$ and $f_i^{ALG2}$ be the optimal values found by ALG1 and ALG2 for problem $i = 1, \ldots, 750$, respectively. We say that, in the particular problem $i$, the performance of ALG1 was better than the performance of ALG2 if

$$|f_i^{ALG1} - f_i^{ALG2}| < 10^{-3}$$ (6.1)

and the number of iterations, or the number of function-gradient evaluations, or the CPU time of ALG1 was less than the number of iterations, the number of function-gradient evaluations, or the CPU time corresponding to ALG2, respectively. All codes are written in double precision Fortran and compiled with f77 (default compiler settings) on an Intel Pentium 4, 1.8 GHz workstation. All these codes are authored by Andrei.

In the first set of numerical experiments we compare AMDY versus AMDYC. Table 1 reports the number of problems solved by these two algorithms with a minimum number of iterations (#iter), a minimum number of function and gradient evaluations (#fg) and a minimum CPU time.

Table 1. Performance of AMDY versus AMDYC (750 problems): for each of the metrics #iter, #fg and CPU time, the number of problems in which AMDY was better, AMDYC was better, or the two algorithms tied.

Both algorithms have similar performances; however, subject to the CPU time metric, AMDY proves to be slightly better. In the second set of numerical experiments we compare the AMDY and AMDYC algorithms with the Dai and Yuan (DY) algorithm. Figures 1 and 2 present the Dolan-Moré performance profiles of these algorithms subject to the CPU time metric. We see that both AMDY and AMDYC are top performers, being more successful and more robust than the Dai and Yuan algorithm. When comparing AMDY with the Dai and Yuan algorithm (Figure 1), subject to the number of iterations, we see that AMDY was better in 69 problems (i.e. it achieved the minimum number of iterations in 69 problems), DY was better in 7 problems, and they achieved the same number of iterations in 60 problems, etc. Out of 750 problems, only for 706 of them does the criterion (6.1) hold.
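The comparison rule of (6.1) amounts to a simple counting procedure. The sketch below is illustrative (the function name and record keys are assumptions of mine, not the author's code): a problem contributes to the counts only when the two solvers reach essentially the same optimal value.

```python
def compare(results_a, results_b, tol=1e-3):
    """Per-metric win counts in the sense of (6.1): a problem is compared only
    when the two optimal values agree to within tol, and a win means a strictly
    smaller metric value (#iter, #fg or CPU time)."""
    wins_a = {"iter": 0, "fg": 0, "cpu": 0}
    wins_b = {"iter": 0, "fg": 0, "cpu": 0}
    ties = {"iter": 0, "fg": 0, "cpu": 0}
    for ra, rb in zip(results_a, results_b):
        if abs(ra["fopt"] - rb["fopt"]) >= tol:
            continue                       # criterion (6.1) not satisfied
        for m in wins_a:
            if ra[m] < rb[m]:
                wins_a[m] += 1
            elif rb[m] < ra[m]:
                wins_b[m] += 1
            else:
                ties[m] += 1
    return wins_a, wins_b, ties
```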

Fig. 1. Performance profile of AMDY versus DY.

Fig. 2. Performance profile of AMDYC versus DY.

Dai and Yuan [21] studied hybrid conjugate gradient algorithms and proposed the following two hybrid methods:

$$\beta_k^{hDY} = \max\left\{ -\frac{1-\sigma}{1+\sigma}\, \beta_k^{DY},\; \min\{\beta_k^{HS}, \beta_k^{DY}\} \right\},$$ (6.2)

$$\beta_k^{hDYz} = \max\left\{ 0,\; \min\{\beta_k^{HS}, \beta_k^{DY}\} \right\},$$ (6.3)

where $\beta_k^{HS} = g_{k+1}^T y_k / (y_k^T s_k)$, showing their global convergence when the Lipschitz assumption holds and the standard Wolfe line search is used. The numerical experiments of Dai and Ni [19] proved that the second hybrid method (hDYz) is the better one, outperforming the Polak-Ribière [41] and Polyak [42] method. In the third set of numerical experiments we compare the Dolan-Moré performance profiles of AMDY and AMDYC versus the Dai-Yuan hybrid conjugate gradient $\beta_k^{hDY}$, subject to the CPU time metric, as in Figures 3 and 4.

Fig. 3. Performance profile of AMDY versus hDY.

Fig. 4. Performance profile of AMDYC versus hDY.
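For reference, the two hybrid choices (6.2)-(6.3) can be written as a small clipping rule. The sketch below is illustrative (the function name and flag are mine; the inputs are NumPy vectors):

```python
def beta_hybrid_dy(g_next, s, y, sigma=0.9, zero_floor=False):
    """Dai-Yuan hybrids: (6.2) clips beta_HS to the interval
    [-(1-sigma)/(1+sigma)*beta_DY, beta_DY]; zero_floor=True replaces the
    lower bound by 0, which gives the hDYz variant (6.3)."""
    ys = y @ s
    beta_dy = (g_next @ g_next) / ys       # Dai-Yuan
    beta_hs = (g_next @ y) / ys            # Hestenes-Stiefel
    lower = 0.0 if zero_floor else -(1.0 - sigma) / (1.0 + sigma) * beta_dy
    return max(lower, min(beta_hs, beta_dy))
```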

In the fourth set of numerical experiments, in Figures 5 and 6, we compare the Dolan-Moré performance profiles of AMDY and AMDYC versus the Dai-Yuan hybrid conjugate gradient $\beta_k^{hDYz}$, subject to the CPU time metric.

Fig. 5. Performance profile of AMDY versus hDYz.

Fig. 6. Performance profile of AMDYC versus hDYz.

7. Conclusion

We have presented a new conjugate gradient algorithm for solving large-scale unconstrained optimization problems. The parameter $\beta_k$ is a modification of the Dai and Yuan computational scheme in such a manner that the direction $d_k$ generated by the algorithm satisfies the sufficient descent condition, independently of the line search. Under the strong Wolfe line search conditions we proved the global convergence of the algorithm. We presented computational evidence that the performance of our algorithms AMDY and AMDYC was higher than that of the Dai and Yuan conjugate gradient algorithm and its hybrid variants, for a set of 750 problems.

References

[1] Andrei, N.: Conjugate gradient algorithms for large scale unconstrained optimization. ICI Technical Report, January 2005.
[2] Andrei, N.: Scaled conjugate gradient algorithms for unconstrained optimization. Computational Optimization and Applications, 38 (2007).
[3] Andrei, N.: Scaled memoryless BFGS preconditioned conjugate gradient algorithm for unconstrained optimization. Optimization Methods and Software, 22 (2007).
[4] Andrei, N.: A scaled BFGS preconditioned conjugate gradient algorithm for unconstrained optimization. Applied Mathematics Letters, 20 (2007).
[5] Andrei, N.: A scaled nonlinear conjugate gradient algorithm for unconstrained optimization. Optimization, 57 (2008).
[6] Andrei, N.: An acceleration of gradient descent algorithm with backtracking for unconstrained optimization. Numerical Algorithms, 42 (2006), pp. 63-73.
[7] Andrei, N.: 40 conjugate gradient algorithms for unconstrained optimization. A survey on their definition. ICI Technical Report No. 13/08, March 2008.
[8] Andrei, N.: Numerical comparison of conjugate gradient algorithms for unconstrained optimization. Studies in Informatics and Control, 16 (2007).
[9] Andrei, N.: A Dai-Yuan conjugate gradient algorithm with sufficient descent and conjugacy conditions for unconstrained optimization. Applied Mathematics Letters, 21 (2008), pp. 165-171.
[10] Andrei, N.: Another hybrid conjugate gradient algorithm for unconstrained optimization. Numerical Algorithms, 47 (2008), pp. 143-156.
[11] Andrei, N.: Performance profiles of conjugate gradient algorithms for unconstrained optimization. In: C.A. Floudas and P.M. Pardalos (Eds.), Encyclopedia of Optimization, 2nd edition, 2008, entry 76, in press.
[12] Andrei, N.: An unconstrained optimization test functions collection. Advanced Modeling and Optimization. An Electronic International Journal, 10 (2008), pp. 147-161.
[13] Birgin, E., Martínez, J.M.: A spectral conjugate gradient method for unconstrained optimization. Applied Mathematics and Optimization, 43 (2001), pp. 117-128.
[14] Bongartz, I., Conn, A.R., Gould, N.I.M., Toint, Ph.L.: CUTE: constrained and unconstrained testing environments. ACM Transactions on Mathematical Software, 21 (1995), pp. 123-160.
[15] Concus, P., Golub, G.H., O'Leary, D.P.: A generalized conjugate gradient method for the numerical solution of elliptic partial differential equations. In: J.R. Bunch and D.J. Rose (Eds.), Sparse Matrix Computations, Academic Press, New York, 1976.
[16] Concus, P., Golub, G.H.: A generalized conjugate gradient method for nonsymmetric systems of linear equations. In: R. Glowinski and J.L. Lions (Eds.), Computing Methods in Applied Sciences and Engineering (2nd International Symposium, Versailles 1975), Part I, Springer Lecture Notes in Economics and Mathematical Systems 134, New York, 1976, pp. 56-65.
[17] Dai, Y.H.: New properties of a nonlinear conjugate gradient method. Numerische Mathematik, 89 (2001).

[18] Dai, Y.H., Liao, L.Z.: New conjugacy conditions and related nonlinear conjugate gradient methods. Applied Mathematics and Optimization, 43 (2001), pp. 87-101.
[19] Dai, Y.H., Ni, Q.: Testing different conjugate gradient methods for large-scale unconstrained optimization. Journal of Computational Mathematics, 21 (2003), pp. 311-320.
[20] Dai, Y.H., Yuan, Y.: A nonlinear conjugate gradient method with a strong global convergence property. SIAM Journal on Optimization, 10 (1999), pp. 177-182.
[21] Dai, Y.H., Yuan, Y.: An efficient hybrid conjugate gradient method for unconstrained optimization. Annals of Operations Research, 103 (2001), pp. 33-47.
[22] Dai, Y.H., Han, J.Y., Liu, G.H., Sun, D.F., Yin, X., Yuan, Y.: Convergence properties of nonlinear conjugate gradient methods. SIAM Journal on Optimization, 10 (1999).
[23] Dolan, E.D., Moré, J.J.: Benchmarking optimization software with performance profiles. Mathematical Programming, 91 (2002), pp. 201-213.
[24] Goldstein, A.A.: On steepest descent. SIAM Journal on Control, 3 (1965), pp. 147-151.
[25] Fletcher, R., Reeves, C.M.: Function minimization by conjugate gradients. Computer Journal, 7 (1964), pp. 149-154.
[26] Gilbert, J.C., Nocedal, J.: Global convergence properties of conjugate gradient methods for optimization. SIAM Journal on Optimization, 2 (1992), pp. 21-42.
[27] Golub, G.H., Kahan, W.: Calculating the singular values and pseudo-inverse of a matrix. SIAM Journal on Numerical Analysis, 2 (Series B) (1965), pp. 205-224.
[28] Golub, G.H., O'Leary, D.P.: Some history of the conjugate gradient and Lanczos algorithms: 1948-1976. SIAM Review, 31 (1989), pp. 50-102.
[29] Golub, G.H., Underwood, R., Wilkinson, J.H.: The Lanczos algorithm for the symmetric Ax = λBx problem. Technical Report, Stanford University Computer Science Department, Stanford, California, 1972.
[30] Hager, W.W., Zhang, H.: A new conjugate gradient method with guaranteed descent and an efficient line search. SIAM Journal on Optimization, 16 (2005), pp. 170-192.
[31] Hager, W.W., Zhang, H.: A survey of nonlinear conjugate gradient methods. Pacific Journal of Optimization, 2 (2006), pp. 35-58.
[32] Hestenes, M.R., Stiefel, E.: Methods of conjugate gradients for solving linear systems. Journal of Research of the National Bureau of Standards, Sec. B, 48 (1952), pp. 409-436.
[33] Hu, Y.F., Storey, C.: Global convergence result for conjugate gradient methods. Journal of Optimization Theory and Applications, 71 (1991).
[34] Lanczos, C.: An iteration method for the solution of the eigenvalue problem of linear differential and integral operators. Journal of Research of the National Bureau of Standards, 45 (1950), pp. 255-282.
[35] Lanczos, C.: Solution of systems of linear equations by minimized iterations. Journal of Research of the National Bureau of Standards, 49 (1952), pp. 33-53.
[36] Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Mathematical Programming, 45 (1989), pp. 503-528.
[37] Liu, Y., Storey, C.: Efficient generalized conjugate gradient algorithms, Part 1: Theory. Journal of Optimization Theory and Applications, 69 (1991), pp. 129-137.
[38] O'Leary, D.P.: Conjugate gradients and related KMP algorithms: the beginnings. In: L. Adams and J.L. Nazareth (Eds.), Linear and Nonlinear Conjugate Gradient-Related Methods, SIAM, Philadelphia, 1996, pp. 1-8.
[39] Nocedal, J.: Conjugate gradient methods and nonlinear optimization. In: L. Adams and J.L. Nazareth (Eds.), Linear and Nonlinear Conjugate Gradient-Related Methods, SIAM, Philadelphia, 1996, pp. 9-23.
[40] Perry, J.M.: A class of conjugate gradient algorithms with a two-step variable-metric memory. Discussion Paper 269, Center for Mathematical Studies in Economics and Management Science, Northwestern University, Evanston, Illinois, 1977.
[41] Polak, E., Ribière, G.: Note sur la convergence de méthodes de directions conjuguées. Revue Française d'Informatique et de Recherche Opérationnelle, 3e Année, 16 (1969), pp. 35-43.
[42] Polyak, B.T.: The conjugate gradient method in extreme problems. USSR Computational Mathematics and Mathematical Physics, 9 (1969), pp. 94-112.

[43] Shanno, D.F., Phua, K.H.: Algorithm 500: Minimization of unconstrained multivariate functions. ACM Transactions on Mathematical Software, 2 (1976), pp. 87-94.
[44] Shanno, D.F.: On the convergence of a new conjugate gradient algorithm. SIAM Journal on Numerical Analysis, 15 (1978), pp. 1247-1257.
[45] Shanno, D.F.: Conjugate gradient methods with inexact searches. Mathematics of Operations Research, 3 (1978), pp. 244-256.
[46] Touati-Ahmed, D., Storey, C.: Efficient hybrid conjugate gradient techniques. Journal of Optimization Theory and Applications, 64 (1990), pp. 379-397.

June 8, 2008


More information

Iterative Methods for Solving A x = b

Iterative Methods for Solving A x = b Iterative Methods for Solving A x = b A good (free) online source for iterative methods for solving A x = b is given in the description of a set of iterative solvers called templates found at netlib: http

More information

Numerical Optimization of Partial Differential Equations

Numerical Optimization of Partial Differential Equations Numerical Optimization of Partial Differential Equations Part I: basic optimization concepts in R n Bartosz Protas Department of Mathematics & Statistics McMaster University, Hamilton, Ontario, Canada

More information

FALL 2018 MATH 4211/6211 Optimization Homework 4

FALL 2018 MATH 4211/6211 Optimization Homework 4 FALL 2018 MATH 4211/6211 Optimization Homework 4 This homework assignment is open to textbook, reference books, slides, and online resources, excluding any direct solution to the problem (such as solution

More information

and P RP k = gt k (g k? g k? ) kg k? k ; (.5) where kk is the Euclidean norm. This paper deals with another conjugate gradient method, the method of s

and P RP k = gt k (g k? g k? ) kg k? k ; (.5) where kk is the Euclidean norm. This paper deals with another conjugate gradient method, the method of s Global Convergence of the Method of Shortest Residuals Yu-hong Dai and Ya-xiang Yuan State Key Laboratory of Scientic and Engineering Computing, Institute of Computational Mathematics and Scientic/Engineering

More information

Gradient method based on epsilon algorithm for large-scale nonlinearoptimization

Gradient method based on epsilon algorithm for large-scale nonlinearoptimization ISSN 1746-7233, England, UK World Journal of Modelling and Simulation Vol. 4 (2008) No. 1, pp. 64-68 Gradient method based on epsilon algorithm for large-scale nonlinearoptimization Jianliang Li, Lian

More information

Methods for Unconstrained Optimization Numerical Optimization Lectures 1-2

Methods for Unconstrained Optimization Numerical Optimization Lectures 1-2 Methods for Unconstrained Optimization Numerical Optimization Lectures 1-2 Coralia Cartis, University of Oxford INFOMM CDT: Modelling, Analysis and Computation of Continuous Real-World Problems Methods

More information

CONVERGENCE PROPERTIES OF COMBINED RELAXATION METHODS

CONVERGENCE PROPERTIES OF COMBINED RELAXATION METHODS CONVERGENCE PROPERTIES OF COMBINED RELAXATION METHODS Igor V. Konnov Department of Applied Mathematics, Kazan University Kazan 420008, Russia Preprint, March 2002 ISBN 951-42-6687-0 AMS classification:

More information

Chapter 10 Conjugate Direction Methods

Chapter 10 Conjugate Direction Methods Chapter 10 Conjugate Direction Methods An Introduction to Optimization Spring, 2012 1 Wei-Ta Chu 2012/4/13 Introduction Conjugate direction methods can be viewed as being intermediate between the method

More information

Extra-Updates Criterion for the Limited Memory BFGS Algorithm for Large Scale Nonlinear Optimization 1

Extra-Updates Criterion for the Limited Memory BFGS Algorithm for Large Scale Nonlinear Optimization 1 journal of complexity 18, 557 572 (2002) doi:10.1006/jcom.2001.0623 Extra-Updates Criterion for the Limited Memory BFGS Algorithm for Large Scale Nonlinear Optimization 1 M. Al-Baali Department of Mathematics

More information

MS&E 318 (CME 338) Large-Scale Numerical Optimization

MS&E 318 (CME 338) Large-Scale Numerical Optimization Stanford University, Management Science & Engineering (and ICME) MS&E 318 (CME 338) Large-Scale Numerical Optimization 1 Origins Instructor: Michael Saunders Spring 2015 Notes 9: Augmented Lagrangian Methods

More information

Notes on Some Methods for Solving Linear Systems

Notes on Some Methods for Solving Linear Systems Notes on Some Methods for Solving Linear Systems Dianne P. O Leary, 1983 and 1999 and 2007 September 25, 2007 When the matrix A is symmetric and positive definite, we have a whole new class of algorithms

More information

Algorithms for Nonsmooth Optimization

Algorithms for Nonsmooth Optimization Algorithms for Nonsmooth Optimization Frank E. Curtis, Lehigh University presented at Center for Optimization and Statistical Learning, Northwestern University 2 March 2018 Algorithms for Nonsmooth Optimization

More information

HT14 A COMPARISON OF DIFFERENT PARAMETER ESTIMATION TECHNIQUES FOR THE IDENTIFICATION OF THERMAL CONDUCTIVITY COMPONENTS OF ORTHOTROPIC SOLIDS

HT14 A COMPARISON OF DIFFERENT PARAMETER ESTIMATION TECHNIQUES FOR THE IDENTIFICATION OF THERMAL CONDUCTIVITY COMPONENTS OF ORTHOTROPIC SOLIDS Inverse Problems in Engineering: heory and Practice rd Int. Conference on Inverse Problems in Engineering June -8, 999, Port Ludlow, WA, USA H4 A COMPARISON OF DIFFEREN PARAMEER ESIMAION ECHNIQUES FOR

More information

Lec10p1, ORF363/COS323

Lec10p1, ORF363/COS323 Lec10 Page 1 Lec10p1, ORF363/COS323 This lecture: Conjugate direction methods Conjugate directions Conjugate Gram-Schmidt The conjugate gradient (CG) algorithm Solving linear systems Leontief input-output

More information

A line-search algorithm inspired by the adaptive cubic regularization framework with a worst-case complexity O(ɛ 3/2 )

A line-search algorithm inspired by the adaptive cubic regularization framework with a worst-case complexity O(ɛ 3/2 ) A line-search algorithm inspired by the adaptive cubic regularization framewor with a worst-case complexity Oɛ 3/ E. Bergou Y. Diouane S. Gratton December 4, 017 Abstract Adaptive regularized framewor

More information

The semi-convergence of GSI method for singular saddle point problems

The semi-convergence of GSI method for singular saddle point problems Bull. Math. Soc. Sci. Math. Roumanie Tome 57(05 No., 04, 93 00 The semi-convergence of GSI method for singular saddle point problems by Shu-Xin Miao Abstract Recently, Miao Wang considered the GSI method

More information

RESIDUAL SMOOTHING AND PEAK/PLATEAU BEHAVIOR IN KRYLOV SUBSPACE METHODS

RESIDUAL SMOOTHING AND PEAK/PLATEAU BEHAVIOR IN KRYLOV SUBSPACE METHODS RESIDUAL SMOOTHING AND PEAK/PLATEAU BEHAVIOR IN KRYLOV SUBSPACE METHODS HOMER F. WALKER Abstract. Recent results on residual smoothing are reviewed, and it is observed that certain of these are equivalent

More information

Matrix Derivatives and Descent Optimization Methods

Matrix Derivatives and Descent Optimization Methods Matrix Derivatives and Descent Optimization Methods 1 Qiang Ning Department of Electrical and Computer Engineering Beckman Institute for Advanced Science and Techonology University of Illinois at Urbana-Champaign

More information

A Regularized Directional Derivative-Based Newton Method for Inverse Singular Value Problems

A Regularized Directional Derivative-Based Newton Method for Inverse Singular Value Problems A Regularized Directional Derivative-Based Newton Method for Inverse Singular Value Problems Wei Ma Zheng-Jian Bai September 18, 2012 Abstract In this paper, we give a regularized directional derivative-based

More information

Conjugate Gradient (CG) Method

Conjugate Gradient (CG) Method Conjugate Gradient (CG) Method by K. Ozawa 1 Introduction In the series of this lecture, I will introduce the conjugate gradient method, which solves efficiently large scale sparse linear simultaneous

More information

Modification of the Wolfe line search rules to satisfy the descent condition in the Polak-Ribière-Polyak conjugate gradient method

Modification of the Wolfe line search rules to satisfy the descent condition in the Polak-Ribière-Polyak conjugate gradient method Laboratoire d Arithmétique, Calcul formel et d Optimisation UMR CNRS 6090 Modification of the Wolfe line search rules to satisfy the descent condition in the Polak-Ribière-Polyak conjugate gradient method

More information

A new initial search direction for nonlinear conjugate gradient method

A new initial search direction for nonlinear conjugate gradient method International Journal of Mathematis Researh. ISSN 0976-5840 Volume 6, Number 2 (2014), pp. 183 190 International Researh Publiation House http://www.irphouse.om A new initial searh diretion for nonlinear

More information

Introduction to Nonlinear Optimization Paul J. Atzberger

Introduction to Nonlinear Optimization Paul J. Atzberger Introduction to Nonlinear Optimization Paul J. Atzberger Comments should be sent to: atzberg@math.ucsb.edu Introduction We shall discuss in these notes a brief introduction to nonlinear optimization concepts,

More information

A class of Smoothing Method for Linear Second-Order Cone Programming

A class of Smoothing Method for Linear Second-Order Cone Programming Columbia International Publishing Journal of Advanced Computing (13) 1: 9-4 doi:1776/jac1313 Research Article A class of Smoothing Method for Linear Second-Order Cone Programming Zhuqing Gui *, Zhibin

More information

A PROJECTED HESSIAN GAUSS-NEWTON ALGORITHM FOR SOLVING SYSTEMS OF NONLINEAR EQUATIONS AND INEQUALITIES

A PROJECTED HESSIAN GAUSS-NEWTON ALGORITHM FOR SOLVING SYSTEMS OF NONLINEAR EQUATIONS AND INEQUALITIES IJMMS 25:6 2001) 397 409 PII. S0161171201002290 http://ijmms.hindawi.com Hindawi Publishing Corp. A PROJECTED HESSIAN GAUSS-NEWTON ALGORITHM FOR SOLVING SYSTEMS OF NONLINEAR EQUATIONS AND INEQUALITIES

More information

Numerisches Rechnen. (für Informatiker) M. Grepl P. Esser & G. Welper & L. Zhang. Institut für Geometrie und Praktische Mathematik RWTH Aachen

Numerisches Rechnen. (für Informatiker) M. Grepl P. Esser & G. Welper & L. Zhang. Institut für Geometrie und Praktische Mathematik RWTH Aachen Numerisches Rechnen (für Informatiker) M. Grepl P. Esser & G. Welper & L. Zhang Institut für Geometrie und Praktische Mathematik RWTH Aachen Wintersemester 2011/12 IGPM, RWTH Aachen Numerisches Rechnen

More information

2010 Mathematics Subject Classification: 90C30. , (1.2) where t. 0 is a step size, received from the line search, and the directions d

2010 Mathematics Subject Classification: 90C30. , (1.2) where t. 0 is a step size, received from the line search, and the directions d Journal of Applie Mathematics an Computation (JAMC), 8, (9), 366-378 http://wwwhillpublisheror/journal/jamc ISSN Online:576-645 ISSN Print:576-653 New Hbri Conjuate Graient Metho as A Convex Combination

More information

On spectral properties of steepest descent methods

On spectral properties of steepest descent methods ON SPECTRAL PROPERTIES OF STEEPEST DESCENT METHODS of 20 On spectral properties of steepest descent methods ROBERTA DE ASMUNDIS Department of Statistical Sciences, University of Rome La Sapienza, Piazzale

More information

A COMBINED CLASS OF SELF-SCALING AND MODIFIED QUASI-NEWTON METHODS

A COMBINED CLASS OF SELF-SCALING AND MODIFIED QUASI-NEWTON METHODS A COMBINED CLASS OF SELF-SCALING AND MODIFIED QUASI-NEWTON METHODS MEHIDDIN AL-BAALI AND HUMAID KHALFAN Abstract. Techniques for obtaining safely positive definite Hessian approximations with selfscaling

More information

Contraction Methods for Convex Optimization and Monotone Variational Inequalities No.16

Contraction Methods for Convex Optimization and Monotone Variational Inequalities No.16 XVI - 1 Contraction Methods for Convex Optimization and Monotone Variational Inequalities No.16 A slightly changed ADMM for convex optimization with three separable operators Bingsheng He Department of

More information

Nonmonotonic back-tracking trust region interior point algorithm for linear constrained optimization

Nonmonotonic back-tracking trust region interior point algorithm for linear constrained optimization Journal of Computational and Applied Mathematics 155 (2003) 285 305 www.elsevier.com/locate/cam Nonmonotonic bac-tracing trust region interior point algorithm for linear constrained optimization Detong

More information

Cubic regularization of Newton s method for convex problems with constraints

Cubic regularization of Newton s method for convex problems with constraints CORE DISCUSSION PAPER 006/39 Cubic regularization of Newton s method for convex problems with constraints Yu. Nesterov March 31, 006 Abstract In this paper we derive efficiency estimates of the regularized

More information

Optimisation non convexe avec garanties de complexité via Newton+gradient conjugué

Optimisation non convexe avec garanties de complexité via Newton+gradient conjugué Optimisation non convexe avec garanties de complexité via Newton+gradient conjugué Clément Royer (Université du Wisconsin-Madison, États-Unis) Toulouse, 8 janvier 2019 Nonconvex optimization via Newton-CG

More information