New Accelerated Conjugate Gradient Algorithms for Unconstrained Optimization
Neculai Andrei
Research Institute for Informatics, Center for Advanced Modeling and Optimization, 8-10 Averescu Avenue, Bucharest 1, Romania, and Academy of Romanian Scientists, 54 Splaiul Independentei, Bucharest 5, Romania

Abstract. New accelerated nonlinear conjugate gradient algorithms, which are mainly modifications of the Dai and Yuan scheme, are proposed for unconstrained optimization. Under an exact line search the algorithms reduce to the Dai and Yuan conjugate gradient computational scheme; under an inexact line search they satisfy the sufficient descent condition. Since the step lengths in conjugate gradient algorithms may differ from 1 by two orders of magnitude and tend to vary in a very unpredictable manner, we suggest an acceleration scheme able to improve the efficiency of the algorithms. A global convergence result is proved when the strong Wolfe line search conditions are used. Computational results for a set consisting of 750 unconstrained optimization test problems show that these new conjugate gradient algorithms substantially outperform the Dai-Yuan conjugate gradient algorithm and its hybrid variants.

Keywords: unconstrained optimization, conjugate gradient method, sufficient descent condition, conjugacy condition, Newton direction, numerical comparisons.
AMS 2000 Mathematics Subject Classification: 49M07, 49M10, 90C06, 65K.
Dedicated to the memory of Professor Gene H. Golub (1932-2007).

1 Introduction

Conjugate gradient methods represent an important class of unconstrained optimization algorithms with strong local and global convergence properties and modest memory requirements. A history of these algorithms has been given by Golub and O'Leary [28], as well as by O'Leary [38]. A survey of the development of different versions of nonlinear conjugate gradient methods, with special attention to global convergence properties, is presented by Hager and Zhang [31]. A survey of their definitions, covering
40 conjugate gradient algorithms for unconstrained optimization, is given by Andrei [7]. The conjugate gradient algorithms were introduced as early as 1951 by Cornelius Lanczos [34, 35] and by Magnus Hestenes, with the cooperation of J.B. Rosser, G.E. Forsythe, L. Paige, M. Stein, R. Hayes and U. Hochstrasser, at the Institute for Numerical Analysis, a part of the National Bureau of Standards in Los Angeles, and by Eduard Stiefel at the Eidgenössische Technische Hochschule in Zürich (see Hestenes and Stiefel [32]). Even though the conjugate gradient methods are now over 50 years old, they continue to be of considerable interest, particularly due to their convergence properties and the very easy implementation of the corresponding algorithms in computer programs. Many researchers in this area have brought significant contributions, clarifying many theoretical and computational aspects of this class of algorithms, particularly in the development of effective algorithms for solving basic linear algebra problems. Golub and Kahan [27] discussed the use of the Lanczos algorithm in computing the singular value decomposition of a matrix. Concus, Golub and O'Leary [15] considered preconditioning of conjugate gradient algorithms. An important extension of the conjugate
gradient algorithm was given by Concus and Golub [16], where the Hermitian part of the matrix is taken as a preconditioner. A development of a block form of the Lanczos algorithm has been proposed by Golub, Underwood and Wilkinson [29]. Fletcher and Reeves extended the conjugate gradient algorithm to the minimization of non-quadratic functions [25], thus opening an important area of research and applications. Developments of this algorithm have been considered, inter alia, by Polak and Ribière [41], Polyak [42], Perry [40], Shanno [44, 45], Gilbert and Nocedal [26], Liu and Storey [37], Hu and Storey [33], and Touati-Ahmed and Storey [46]. Recently, the papers of Nocedal [39], Dai and Yuan [20, 21], Hager and Zhang [30] and Andrei [1-6, 8-12] presented important theoretical and computational contributions to the development of this class of algorithms.

In this paper we suggest new nonlinear conjugate gradient algorithms which are mainly modifications of the Dai and Yuan [20] conjugate gradient computational scheme. In these algorithms the direction $d_{k+1}$ is computed as a linear combination of $g_{k+1}$ and $s_k$, i.e. $d_{k+1} = -\theta_{k+1} g_{k+1} + \beta_k s_k$, where $s_k = x_{k+1} - x_k$. The parameter $\theta_{k+1}$ is computed in such a way that the direction $d_{k+1}$ is the Newton direction or satisfies the conjugacy condition. On the other hand, $\beta_k$ is a proper modification of the Dai and Yuan computational scheme such that the direction $d_{k+1}$ satisfies the sufficient descent condition. For the exact line search the proposed algorithms reduce to the Dai and Yuan conjugate gradient computational scheme.

The paper has the following structure. In Section 2 we present the development of the conjugate gradient algorithms with the sufficient descent condition, while in Section 3 we prove the global convergence of the algorithms under the strong Wolfe line search conditions. In Section 4 we present an acceleration of the algorithm, while in Section 5 we compare the computational performance of the new conjugate gradient schemes against the Dai and Yuan method and its hybrid
variants [21], using 750 unconstrained optimization test problems from the CUTE [14] library along with some other large-scale unconstrained optimization problems presented in [12]. Using the Dolan and Moré performance profiles [23], we show that these new accelerated conjugate gradient algorithms outperform the Dai-Yuan algorithm as well as its hybrid variants.

2 Modifications of the Dai-Yuan conjugate gradient algorithm

For solving the unconstrained optimization problem

$\min \{ f(x) : x \in R^n \},$  (1)

where $f : R^n \to R$ is continuously differentiable, we consider a nonlinear conjugate gradient algorithm:

$x_{k+1} = x_k + \alpha_k d_k,$  (2)

where the stepsize $\alpha_k$ is positive and the directions are computed by the rule

$d_{k+1} = -\theta_{k+1} g_{k+1} + \beta_k s_k, \quad d_0 = -g_0,$  (3)

where

$\beta_k = \dfrac{g_{k+1}^T g_{k+1}}{y_k^T s_k} - \dfrac{(s_k^T g_{k+1})\, g_{k+1}^T g_{k+1}}{(y_k^T s_k)^2},$  (4)

and $\theta_{k+1}$ is a parameter to be determined. Here $g_k = \nabla f(x_k)$, $y_k = g_{k+1} - g_k$ and $s_k = x_{k+1} - x_k$. Observe that if $f$ is a quadratic function and $\alpha_k$ is selected to achieve the exact minimum of $f$ in the direction $d_k$, then $s_k^T g_{k+1} = 0$ and formula (4) for $\beta_k$ reduces to the Dai and Yuan computational scheme [20]. In this paper, however, we refer to general nonlinear functions and inexact line search. We were led to this computational scheme by modifying the Dai and Yuan choice
$\beta_k^{DY} = \dfrac{g_{k+1}^T g_{k+1}}{y_k^T s_k},$

in order to conserve the sufficient descent condition and to obtain some other properties of an efficient conjugate gradient algorithm. Using a standard Wolfe line search, the Dai and Yuan method always generates descent directions, and under the Lipschitz assumption it is globally convergent. In [17] Dai established a remarkable property relating the descent directions to the sufficient descent condition, showing that if there exist constants $\gamma_1$ and $\gamma_2$ such that $\gamma_1 \le \|g_k\| \le \gamma_2$ for all $k$, then for any $p \in (0,1)$ there exists a constant $c > 0$ such that the sufficient descent condition $g_i^T d_i \le -c\|g_i\|^2$ holds for at least $\lfloor pk \rfloor$ indices $i \in [0, k]$, where $\lfloor j \rfloor$ denotes the largest integer $\le j$. In our algorithm the parameter $\beta_k$ is selected in such a manner that the sufficient descent condition is satisfied at every iteration. As we know, despite the strong convergence theory that has been developed for the Dai and Yuan method, it is susceptible to jamming, that is, it may begin to take small steps without making significant progress to the minimum. When the iterates jam, $y_k$ becomes tiny while $\|g_k\|$ is bounded away from zero. Therefore $\beta_k$ is a proper modification of $\beta_k^{DY}$.

Theorem 1. If $\theta_{k+1} \ge 1/4$, then the direction $d_{k+1} = -\theta_{k+1} g_{k+1} + \beta_k s_k$ ($d_0 = -g_0$), where $\beta_k$ is given by (4), satisfies the sufficient descent condition

$g_{k+1}^T d_{k+1} \le -\left(\theta_{k+1} - \tfrac{1}{4}\right)\|g_{k+1}\|^2.$  (5)

Proof. Since $d_0 = -g_0$, we have $g_0^T d_0 = -\|g_0\|^2$, which satisfies (5). Multiplying (3) by $g_{k+1}^T$, we have

$g_{k+1}^T d_{k+1} = -\theta_{k+1}\|g_{k+1}\|^2 + \dfrac{\|g_{k+1}\|^2 (s_k^T g_{k+1})}{y_k^T s_k} - \dfrac{(s_k^T g_{k+1})^2 \|g_{k+1}\|^2}{(y_k^T s_k)^2}.$  (6)

Now, using the inequality $u^T v \le \tfrac{1}{2}(\|u\|^2 + \|v\|^2)$ with $u = \tfrac{1}{\sqrt{2}}(y_k^T s_k)\, g_{k+1}$ and $v = \sqrt{2}\,(s_k^T g_{k+1})\, g_{k+1}$, we have

$\dfrac{\|g_{k+1}\|^2 (s_k^T g_{k+1})}{y_k^T s_k} \le \dfrac{1}{4}\|g_{k+1}\|^2 + \dfrac{(s_k^T g_{k+1})^2 \|g_{k+1}\|^2}{(y_k^T s_k)^2}.$  (7)

Using (7) in (6) we get (5).

To conclude, for the sufficient descent condition in (5) the quantity $\theta_{k+1} - 1/4$ is required to be nonnegative. Supposing that $\theta_{k+1} - 1/4 > 0$, the direction given by (3) and (4) is a descent direction. Dai and Yuan [20, 21] present conjugate gradient schemes with the property that $g_k^T d_k < 0$ when $y_k^T s_k > 0$. If $f$ is strongly convex or the line search satisfies the Wolfe conditions, then $y_k^T s_k > 0$ and the Dai and Yuan schemes yield descent. In our
algorithm, observe that if $\theta_{k+1} \ge 1/4$ for all $k$ and the line search satisfies the Wolfe conditions, then the search direction given by (3) and (4) satisfies the sufficient descent condition for all $k$. Note that in (5) we bound $g_{k+1}^T d_{k+1}$ by $-(\theta_{k+1} - 1/4)\|g_{k+1}\|^2$, while for the scheme of Dai and Yuan only the negativity of $g_{k+1}^T d_{k+1}$ is established. To determine the parameter $\theta_{k+1}$ in (3) we suggest the following two procedures.

A) Our motivation for a good algorithm for solving (1) is to choose the parameter $\theta_{k+1}$ in such a way that for every $k$ the direction $d_{k+1}$ given by (3) is the Newton direction. Therefore, from the equation

$-\nabla^2 f(x_{k+1})^{-1} g_{k+1} = -\theta_{k+1} g_{k+1} + \beta_k s_k,$  (8)

after some algebra we get

$\theta_{k+1} = \dfrac{s_k^T g_{k+1} + \beta_k\, s_k^T \nabla^2 f(x_{k+1}) s_k}{s_k^T \nabla^2 f(x_{k+1}) g_{k+1}},$  (9)

with $\beta_k$ given by (4). The salient point in this formula for $\theta_{k+1}$ is the presence of the Hessian. For large-scale problems, choices of the update parameter that do not require the evaluation of the Hessian matrix are often preferred in practice to methods that require the Hessian at each iteration. Therefore, in order to have an algorithm for solving large-scale problems, we assume that in (9) we use an approximation $B_{k+1}$ of the true Hessian $\nabla^2 f(x_{k+1})$, and let $B_{k+1}$ satisfy the quasi-Newton equation $B_{k+1} s_k = y_k$. This leads us to:

$\theta_{k+1} = \dfrac{1}{y_k^T g_{k+1}}\left( \|g_{k+1}\|^2 + s_k^T g_{k+1} - \dfrac{(s_k^T g_{k+1})\|g_{k+1}\|^2}{y_k^T s_k} \right).$  (10)

Observe that if $\theta_{k+1}$ given by (10) is greater than or equal to $1/4$, then according to Theorem 1 the direction (3) satisfies the sufficient descent condition (5). On the other hand, if in (10) $\theta_{k+1} < 1/4$, then we take ex abrupto $\theta_{k+1} = 1$ in (3).

B) The second procedure is based on the conjugacy condition. Dai and Liao [18] introduced the conjugacy condition $y_k^T d_{k+1} = -t\, s_k^T g_{k+1}$, where $t \ge 0$ is a scalar. This is indeed very reasonable, since in real computation the inexact line search is generally used. However, this condition depends strongly on the nonnegative parameter $t$, for which we do not know any formula giving an optimal choice. Therefore, even though in our developments we use the inexact line search, we adopt here a more conservative approach and consider the conjugacy condition $y_k^T d_{k+1} = 0$. This leads us to:

$\theta_{k+1} = \dfrac{1}{y_k^T g_{k+1}}\left( \|g_{k+1}\|^2 - \dfrac{(s_k^T g_{k+1})\|g_{k+1}\|^2}{y_k^T s_k} \right).$  (11)

Again, if $\theta_{k+1}$ given by (11) is greater than or equal to $1/4$, then according to Theorem 1 the direction (3) satisfies the sufficient descent condition (5). On the other hand, if in (11) $\theta_{k+1} < 1/4$, then we take $\theta_{k+1} = 1$ in (3).

The line search in the conjugate gradient algorithms for the computation of $\alpha_k$ is often based on the standard Wolfe conditions:

$f(x_k + \alpha_k d_k) - f(x_k) \le \rho \alpha_k g_k^T d_k,$  (12)

$g_{k+1}^T d_k \ge \sigma g_k^T d_k,$  (13)
where $d_k$ is a descent direction and $0 < \rho \le \sigma < 1$. In [21] Dai and Yuan proved the global convergence of a conjugate gradient algorithm for which $\beta_k = \beta_k^{DY} t_k$, where $t_k \in [-c, 1]$ with $c = (1-\sigma)/(1+\sigma)$. Our algorithm is a modification of the Dai and Yuan scheme with the following property. Observe that

$\beta_k = \dfrac{g_{k+1}^T g_{k+1}}{y_k^T s_k}\left(1 - \dfrac{s_k^T g_{k+1}}{y_k^T s_k}\right) = \beta_k^{DY} r_k,$  (14)

where

$r_k = 1 - \dfrac{s_k^T g_{k+1}}{y_k^T s_k} = \dfrac{-s_k^T g_k}{y_k^T s_k}.$  (15)

From the second Wolfe condition (13) it follows that $y_k^T s_k = s_k^T g_{k+1} - s_k^T g_k \ge (1-\sigma)(-s_k^T g_k) > 0$, so that $r_k \le 1/(1-\sigma)$; under the strong Wolfe condition $|g_{k+1}^T d_k| \le -\sigma g_k^T d_k$ we also have $y_k^T s_k \le (1+\sigma)(-s_k^T g_k)$, i.e. $r_k \ge 1/(1+\sigma)$. Hence $1/(1+\sigma) \le r_k \le 1/(1-\sigma)$, and therefore $\beta_k \le \beta_k^{DY}/(1-\sigma)$.

3 Convergence analysis

In this section we analyze the convergence of the algorithm (2), (3), (4) and (10) or (11), where $d_0 = -g_0$. In the following we consider that $g_k \ne 0$ for all $k$; otherwise a stationary point is obtained. Assume that:

(i) The level set $S = \{ x \in R^n : f(x) \le f(x_0) \}$ is bounded.

(ii) In a neighborhood $N$ of $S$, the function $f$ is continuously differentiable and its gradient is Lipschitz continuous, i.e. there exists a constant $L > 0$ such that $\|\nabla f(x) - \nabla f(y)\| \le L\|x - y\|$ for all $x, y \in N$.

Under these assumptions on $f$ there exists a constant $\Gamma \ge 0$ such that $\|\nabla f(x)\| \le \Gamma$ for all $x \in S$. In order to prove the global convergence, we assume that the step size $\alpha_k$ in (2) is obtained by the strong Wolfe line search, that is,

$f(x_k + \alpha_k d_k) - f(x_k) \le \rho \alpha_k g_k^T d_k,$  (3.1)

$|g_{k+1}^T d_k| \le -\sigma g_k^T d_k,$  (3.2)

where $\rho$ and $\sigma$ are positive constants such that $0 < \rho \le \sigma < 1$. Dai et al. [22] proved that for any conjugate gradient method with the strong Wolfe line search the following general result holds:

Lemma 3.1. Suppose that the assumptions (i) and (ii) hold, and consider any conjugate gradient method (2) and (3), where $d_k$ is a descent direction and $\alpha_k$ is obtained by the strong Wolfe line search (3.1) and (3.2). If

$\sum_{k \ge 1} \dfrac{1}{\|d_k\|^2} = \infty,$  (3.3)

then
$\liminf_{k \to \infty} \|g_k\| = 0.$  (3.4)

Theorem 3.1. Suppose that the assumptions (i) and (ii) hold, and consider the algorithm (2), (3), (4) and (10) or (11), where $d_{k+1}$ is a descent direction and $\alpha_k$ is obtained by the strong Wolfe line search (3.1) and (3.2). If there exists a constant $\gamma \ge 0$ such that $\|\nabla f(x)\| \ge \gamma$, if $1/4 \le \theta_k \le \tau$, where $\tau$ is a positive constant, and if the angle $\varphi_k$ between $g_k$ and $d_k$ is bounded, i.e. $\cos \varphi_k \ge \xi > 0$ for all $k = 0, 1, \ldots$, then the algorithm satisfies $\liminf_{k \to \infty} \|g_k\| = 0$.

Proof. Observe that $y_k^T s_k = g_{k+1}^T s_k - g_k^T s_k \ge (\sigma - 1) g_k^T s_k$. But $-g_k^T s_k = \|g_k\| \|s_k\| \cos \varphi_k \ge \xi \gamma \|s_k\|$ for all $k = 0, 1, \ldots$. With these,

$y_k^T s_k \ge (1 - \sigma)\, \xi \gamma \|s_k\|.$

Therefore, since $\|g_{k+1}\| \le \Gamma$ and $r_k \le 1/(1-\sigma)$,

$\beta_k = \beta_k^{DY} r_k \le \dfrac{\Gamma^2}{(1-\sigma)^2 \xi \gamma \|s_k\|} = \dfrac{\eta}{\|s_k\|}, \quad \text{where } \eta = \dfrac{\Gamma^2}{(1-\sigma)^2 \xi \gamma}.$

This relation shows that

$\|d_{k+1}\| \le \theta_{k+1} \|g_{k+1}\| + \beta_k \|s_k\| \le \tau \Gamma + \eta.$

Hence $\sum_{k \ge 1} 1/\|d_k\|^2 = \infty$, and from Lemma 3.1 it follows that $\liminf_{k \to \infty} \|g_k\| = 0$.

4 Acceleration of the algorithm

In conjugate gradient algorithms the search directions tend to be poorly scaled, and as a consequence the line search must perform more function evaluations in order to obtain a suitable steplength $\alpha_k$. In order to improve the performance of conjugate gradient algorithms, efforts have been directed toward designing procedures for direction computation based on second order information. For example, CONMIN [43] and SCALCG [1-5] take this idea of BFGS preconditioning. In this section we focus on the step length modification. In the context of the gradient descent algorithm with backtracking, the step length modification was considered for the first time in [6]. Nocedal [39] pointed out that in conjugate gradient methods the step lengths may differ from 1 in a very unpredictable manner: they can be larger or smaller than 1 depending on how the problem is scaled. Numerical comparisons between conjugate gradient methods and the limited memory quasi-Newton method, by Liu and Nocedal [36], show that the latter is more successful [8]. One explanation of the efficiency of the limited memory quasi-Newton method is given by its ability to accept unit step lengths along the iterations. In this section we take advantage of this behavior of conjugate gradient algorithms and present an acceleration scheme. Basically, it modifies the step length in a multiplicative manner to improve the reduction of the function values along the iterations.
Line search. For using the algorithm (2), one of the crucial elements is the stepsize computation. For the sake of generality, in the following we consider line searches that satisfy either the Goldstein conditions [24]:

$\rho_2 \alpha_k g_k^T d_k \le f(x_k + \alpha_k d_k) - f(x_k) \le \rho_1 \alpha_k g_k^T d_k,$  (4.1)

where $0 < \rho_1 < 1/2 < \rho_2 < 1$ and $\alpha_k > 0$, or the Wolfe conditions (12) and (13).

Proposition 4.1. Assume that $d_k$ is a descent direction and $\nabla f$ satisfies the Lipschitz condition $\|\nabla f(x) - \nabla f(x_k)\| \le L\|x - x_k\|$ for all $x$ on the line segment connecting $x_k$ and $x_{k+1}$, where $L$ is a positive constant. If the line search satisfies the Goldstein conditions (4.1), then

$\alpha_k \ge \dfrac{(1-\rho_2)\, |g_k^T d_k|}{L \|d_k\|^2}.$  (4.2)

If the line search satisfies the Wolfe conditions (12) and (13), then

$\alpha_k \ge \dfrac{(1-\sigma)\, |g_k^T d_k|}{L \|d_k\|^2}.$  (4.3)

Proof. If the Goldstein conditions are satisfied, then using the mean value theorem, from (4.1) we get

$\rho_2 \alpha_k g_k^T d_k \le f(x_k + \alpha_k d_k) - f(x_k) = \alpha_k \nabla f(x_k + \xi d_k)^T d_k \le \alpha_k g_k^T d_k + L \alpha_k^2 \|d_k\|^2,$

where $\xi \in [0, \alpha_k]$. From this inequality we immediately get (4.2). Now, to prove (4.3), subtract $g_k^T d_k$ from both sides of (13) and, using the Lipschitz condition, we get

$(\sigma - 1)\, g_k^T d_k \le (g_{k+1} - g_k)^T d_k \le L \alpha_k \|d_k\|^2.$  (4.4)

But $d_k$ is a descent direction, and since $\sigma < 1$, we immediately get (4.3). Therefore, when satisfying the Goldstein or the Wolfe line search conditions, $\alpha_k$ is bounded away from zero, i.e. there exists a positive constant $\omega$ such that $\alpha_k \ge \omega$.

Acceleration scheme [6]. Given the initial point $x_0$, we can compute $f_0 = f(x_0)$ and $g_0 = \nabla f(x_0)$, and by the Wolfe line search conditions (12) and (13) the steplength $\alpha_0$ is determined. With these, the next iterate is computed as $x_1 = x_0 + \alpha_0 d_0$ ($d_0 = -g_0$), where $f_1$ and $g_1$ are immediately determined, and the direction $d_1$ can be computed as $d_1 = -\theta_1 g_1 + \beta_0 s_0$, where the conjugate gradient parameter $\beta_0$ is computed as in (4) and the scaling factor $\theta_1$ is computed as in (10) or (11). Therefore, at iteration $k = 1, 2, \ldots$ we know $x_k$, $f_k$, $g_k$ and $d_k = -\theta_k g_k + \beta_{k-1} s_{k-1}$. Suppose that $d_k$ is a descent direction (i.e. $\theta_k \ge 1/4$). By the Wolfe line search (12) and (13) we can compute $\alpha_k$, with which the point $z = x_k + \alpha_k d_k$ is determined. The first Wolfe condition (12) shows that the steplength $\alpha_k > 0$ satisfies

$f(z) = f(x_k + \alpha_k d_k) \le f(x_k) + \rho \alpha_k g_k^T d_k.$
With these, let us introduce the accelerated conjugate gradient algorithm by means of the following iterative scheme:

$x_{k+1} = x_k + \gamma_k \alpha_k d_k,$  (4.5)

where $\gamma_k > 0$ is a parameter to be determined in such a manner as to improve the behavior of the algorithm. Now, we have

$f(x_k + \alpha_k d_k) = f(x_k) + \alpha_k g_k^T d_k + \tfrac{1}{2}\alpha_k^2 d_k^T \nabla^2 f(x_k) d_k + o(\|\alpha_k d_k\|^2).$  (4.6)

On the other hand, for $\gamma > 0$ we have

$f(x_k + \gamma \alpha_k d_k) = f(x_k) + \gamma \alpha_k g_k^T d_k + \tfrac{1}{2}\gamma^2 \alpha_k^2 d_k^T \nabla^2 f(x_k) d_k + o(\|\gamma \alpha_k d_k\|^2).$  (4.7)

With these we can write

$f(x_k + \gamma \alpha_k d_k) = f(x_k + \alpha_k d_k) + \Psi_k(\gamma).$  (4.8)

Let us denote

$a_k = \alpha_k g_k^T d_k, \quad b_k = \alpha_k^2 d_k^T \nabla^2 f(x_k) d_k, \quad \varepsilon_k = o(\|\alpha_k d_k\|^2).$  (4.9)

Observe that $a_k \le 0$, since $d_k$ is a descent direction, and for convex functions $b_k \ge 0$. Collecting the remainder terms into $\varepsilon_k$,

$\Psi_k(\gamma) = (\gamma - 1)\, a_k + \tfrac{1}{2}(\gamma^2 - 1)(b_k + \alpha_k \varepsilon_k).$

Now, we see that

$\Psi_k'(\gamma) = (b_k + \alpha_k \varepsilon_k)\gamma + a_k$  (4.10)

and $\Psi_k'(\gamma_{km}) = 0$, where

$\gamma_{km} = \dfrac{-a_k}{b_k + \alpha_k \varepsilon_k}.$  (4.11)

Observe that $\Psi_k'(0) = a_k \le 0$. Therefore, assuming that $b_k + \alpha_k \varepsilon_k > 0$, $\Psi_k(\gamma)$ is a convex quadratic function with minimum at $\gamma_{km}$, and

$\Psi_k(\gamma_{km}) = -\dfrac{\left(a_k + (b_k + \alpha_k \varepsilon_k)\right)^2}{2(b_k + \alpha_k \varepsilon_k)} \le 0.$

Considering $\gamma = \gamma_{km}$ in (4.8), and since $b_k \ge 0$, we see that for every $k$

$f(x_k + \gamma_{km} \alpha_k d_k) = f(x_k + \alpha_k d_k) - \dfrac{\left(a_k + (b_k + \alpha_k \varepsilon_k)\right)^2}{2(b_k + \alpha_k \varepsilon_k)} \le f(x_k + \alpha_k d_k),$

which is a possible improvement of the values of the function $f$ (when $a_k + (b_k + \alpha_k \varepsilon_k) \ne 0$). Therefore, using this simple multiplicative modification of the stepsize $\alpha_k$ to $\gamma_k \alpha_k$, where $\gamma_k = \gamma_{km} = -a_k/(b_k + \alpha_k \varepsilon_k)$, we get

$f(x_{k+1}) = f(x_k + \gamma_k \alpha_k d_k) \le f(x_k) + \rho a_k - \dfrac{\left(a_k + (b_k + \alpha_k \varepsilon_k)\right)^2}{2(b_k + \alpha_k \varepsilon_k)} \le f(x_k),$  (4.12)

since $a_k \le 0$ ($d_k$ is a descent direction). Therefore, neglecting the contribution of $\varepsilon_k$, we still get an improvement of the function values.

Now, in order to get the algorithm, we have to determine a way to compute $b_k$. For this, at the point $z = x_k + \alpha_k d_k$ we have

$f(z) = f(x_k + \alpha_k d_k) = f(x_k) + \alpha_k g_k^T d_k + \tfrac{1}{2}\alpha_k^2 d_k^T \nabla^2 f(\bar{x}) d_k,$

where $\bar{x}$ is a point on the line segment connecting $x_k$ and $z$. On the other hand, at the point $x_k = z - \alpha_k d_k$ we have

$f(x_k) = f(z - \alpha_k d_k) = f(z) - \alpha_k g_z^T d_k + \tfrac{1}{2}\alpha_k^2 d_k^T \nabla^2 f(\tilde{x}) d_k,$

where $g_z = \nabla f(z)$ and $\tilde{x}$ is a point on the line segment connecting $x_k$ and $z$. Having in view the local character of the search, and that the distance between $x_k$ and $z$ is small enough, we can consider $\bar{x} = \tilde{x} = x_k$. So, adding the above equalities, we get

$b_k = -\alpha_k y_k^T d_k, \quad \text{where } y_k = g_k - g_z.$  (4.13)

Observe that if $-a_k > b_k$, then $\gamma_k > 1$. In this case $\gamma_k \alpha_k > \alpha_k$, and it is possible that $\gamma_k \alpha_k \le 1$ or $\gamma_k \alpha_k > 1$; hence the steplength $\gamma_k \alpha_k$ can be greater than 1. On the other hand, if $-a_k \le b_k$, then $\gamma_k \le 1$. In this case $\gamma_k \alpha_k \le \alpha_k$, so the steplength is reduced. Therefore the steplength $\alpha_k$ computed by the Wolfe conditions is modified, by increase or reduction, through the factor $\gamma_k$. Neglecting $\varepsilon_k$, we see that $\Psi_k(1) = 0$, and if $-a_k \le b_k/2$, then $\Psi_k(0) = -a_k - b_k/2 \le 0$ and $\gamma_{km} \le 1/2 < 1$. Therefore, for any $\gamma \in [0,1]$, $\Psi_k(\gamma) \le 0$. As a consequence, for any $\gamma \in (0,1)$ it follows that $f(x_k + \gamma \alpha_k d_k) \le f(x_k + \alpha_k d_k) < f(x_k)$, and for any $\gamma \in [0,1]$, $\gamma \alpha_k \le \alpha_k$. However, in our algorithm we selected $\gamma_k = \gamma_{km}$, the point achieving the minimum value of $\Psi_k(\gamma)$.
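As a quick illustration of the scheme above, the computation of the factor $\gamma_k = -a_k/b_k$ from one gradient evaluation at the trial point can be sketched as follows. This is a minimal Python sketch, not the author's Fortran code; the quadratic test function and the steplength below are hypothetical choices for illustration only.

```python
import numpy as np

def accelerate_step(x, d, alpha, grad, eps=1e-16):
    """One acceleration step: gamma_k = -a_k / b_k with
    a_k = alpha_k g_k^T d_k and b_k = -alpha_k y_k^T d_k, y_k = g_k - g_z."""
    g = grad(x)
    gz = grad(x + alpha * d)
    a = alpha * (g @ d)            # a_k <= 0 for a descent direction
    b = -alpha * ((g - gz) @ d)    # b_k ~ alpha_k^2 d_k^T Hess d_k >= 0 if f convex
    if b > eps:
        gamma = -a / b
        return x + gamma * alpha * d, gamma
    return x + alpha * d, 1.0      # no curvature information: keep alpha_k

# Hypothetical strongly convex quadratic f(x) = x^T A x / 2; on a quadratic
# the accelerated point is the exact minimizer of f along d, so it cannot
# be worse than the unaccelerated point x + alpha*d.
A = np.diag([1.0, 10.0, 100.0])
f = lambda x: 0.5 * x @ (A @ x)
grad = lambda x: A @ x

x = np.array([1.0, 1.0, 1.0])
d = -grad(x)                       # steepest descent direction
alpha = 0.005                      # deliberately too short a steplength
x_acc, gamma = accelerate_step(x, d, alpha, grad)
```

Here the Wolfe step is too short, so the scheme returns $\gamma > 1$ and lengthens it; with a too-long step it would return $\gamma < 1$ and shrink it, exactly the behavior discussed above.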
5 AMDY and AMDYC Algorithms

Considering the definitions of $g_k$, $s_k$ and $y_k$, we present the following conjugate gradient algorithms, which are accelerated, modified versions of the Dai and Yuan algorithm with the Newton direction (AMDY) or with the conjugacy condition (AMDYC).

Step 1. Initialization. Select $x_0 \in R^n$ and the parameters $0 < \rho \le \sigma < 1$. Compute $f(x_0)$ and $g_0$. Consider $d_0 = -g_0$ and $\alpha_0 = 1/\|g_0\|$. Set $k = 0$.
Step 2. Test for continuation of iterations. If $\|g_k\|_\infty \le 10^{-6}$, then stop.
Step 3. Line search. Compute $\alpha_k$ satisfying the Wolfe line search conditions (12) and (13).
Step 4. Compute $z = x_k + \alpha_k d_k$, $g_z = \nabla f(z)$ and $y_k = g_k - g_z$.
Step 5. Compute $a_k = \alpha_k g_k^T d_k$ and $b_k = -\alpha_k y_k^T d_k$.
Step 6. If $b_k \ne 0$, then compute $\gamma_k = -a_k/b_k$ and update the variables as $x_{k+1} = x_k + \gamma_k \alpha_k d_k$; otherwise update the variables as $x_{k+1} = x_k + \alpha_k d_k$. Compute $f_{k+1}$ and $g_{k+1}$. Compute $y_k = g_{k+1} - g_k$ and $s_k = x_{k+1} - x_k$.
Step 7. $\theta_{k+1}$ computation. For the algorithm AMDY, $\theta_{k+1}$ is computed as in (10). For the algorithm AMDYC, $\theta_{k+1}$ is computed as in (11). If $\theta_{k+1} < 1/4$, then we set $\theta_{k+1} = 1$.
Step 8. Direction computation. Compute $d = -\theta_{k+1} g_{k+1} + \beta_k s_k$, where $\beta_k$ is computed as in (4). If

$g_{k+1}^T d \le -10^{-3} \|d\| \|g_{k+1}\|,$  (5.1)

then define $d_{k+1} = d$; otherwise set $d_{k+1} = -g_{k+1}$. Compute the initial guess $\alpha_k = \alpha_{k-1} \|d_{k-1}\| / \|d_k\|$, set $k = k + 1$ and continue with Step 2.

It is well known that if $f$ is bounded along the direction $d_k$, then there exists a stepsize $\alpha_k$ satisfying the Wolfe line search conditions (12) and (13). In our algorithm, when the angle between $d$ and $-g_{k+1}$ is not acute enough, we restart the algorithm with the negative gradient $-g_{k+1}$. More sophisticated reasons for restarting the algorithms have been proposed in the literature, but we are interested in the performance of a conjugate gradient algorithm that uses this restart criterion, associated with a direction satisfying the sufficient descent condition. Under reasonable assumptions, conditions (12), (13) and (5.1) are sufficient to prove the global convergence of the algorithm. The initial selection of the step length crucially affects the practical behaviour of the algorithm. At every iteration $k \ge 1$ the starting guess for the step $\alpha_k$ in the line search is computed as $\alpha_{k-1} \|d_{k-1}\| / \|d_k\|$. This selection was considered for the first time by Shanno and Phua in CONMIN [43]. It is also considered in the packages SCG by Birgin and Martínez [13] and SCALCG by Andrei [1-5, 11].

Proposition 5.1. Suppose that $f$ is a uniformly convex function on the level set $S = \{x : f(x) \le f(x_0)\}$, and that $d_k$ satisfies the sufficient descent condition $g_k^T d_k \le -c_1 \|g_k\|^2$, where $c_1 > 0$, and $\|d_k\| \le c_2 \|g_k\|$, where $c_2 > 0$. Then the sequence generated by AMDY or AMDYC converges linearly to $x^*$, the solution to the problem (1).

Proof. From (4.12) we have $f(x_{k+1}) \le f(x_k)$ for all $k$. Since $f$ is bounded below, it follows that

$\lim_{k \to \infty} \left( f(x_k) - f(x_{k+1}) \right) = 0.$
Now, since $f$ is uniformly convex, there exist positive constants $m$ and $M$ such that $mI \preceq \nabla^2 f(x) \preceq MI$ on $S$. Suppose that $x_k + \alpha d_k \in S$ and $x_k + \gamma_{km} \alpha d_k \in S$ for all $\alpha > 0$. We have

$f(x_k + \gamma_{km} \alpha d_k) \le f(x_k + \alpha d_k) - \dfrac{(a_k + b_k)^2}{2 b_k}.$

But from uniform convexity we have the following quadratic upper bound on $f(x_k + \alpha d_k)$:

$f(x_k + \alpha d_k) \le f(x_k) + \alpha g_k^T d_k + \tfrac{1}{2} M \alpha^2 \|d_k\|^2.$

Therefore,

$f(x_k + \alpha d_k) \le f(x_k) - \alpha c_1 \|g_k\|^2 + \tfrac{1}{2} M c_2^2 \alpha^2 \|g_k\|^2 = f(x_k) + \left(-c_1 \alpha + \tfrac{1}{2} M c_2^2 \alpha^2\right) \|g_k\|^2.$

Observe that for $0 \le \alpha \le c_1/(M c_2^2)$ we have $-c_1 \alpha + \tfrac{1}{2} M c_2^2 \alpha^2 \le -\tfrac{1}{2} c_1 \alpha$, which follows from the convexity of $-c_1 \alpha + \tfrac{1}{2} M c_2^2 \alpha^2$. Using this result we get

$f(x_k + \alpha d_k) \le f(x_k) - \tfrac{1}{2} c_1 \alpha \|g_k\|^2 \le f(x_k) - \rho c_1 \alpha \|g_k\|^2,$

since $\rho < 1/2$. From Proposition 4.1, the Wolfe line search terminates with a value $\alpha_k \ge \omega > 0$. Therefore, for $0 \le \alpha \le c_1/(M c_2^2)$, this provides a lower bound on the decrease in the function $f$, i.e.

$f(x_k + \alpha_k d_k) \le f(x_k) - \rho c_1 \omega \|g_k\|^2.$  (5.2)

On the other hand,

$\dfrac{(a_k + b_k)^2}{2 b_k} \ge \dfrac{(\omega M c_2^2 - c_1)^2}{2 M c_2^2} \|g_k\|^2.$  (5.3)

Considering (5.2) and (5.3) we get

$f(x_k) - f(x_k + \gamma_{km} \alpha_k d_k) \ge \left(\rho c_1 \omega + \dfrac{(\omega M c_2^2 - c_1)^2}{2 M c_2^2}\right) \|g_k\|^2.$  (5.4)

But $f(x_k) - f(x_{k+1}) \to 0$, and as a consequence $\|g_k\|$ goes to zero, i.e. $x_k$ converges to $x^*$. Having in view that $f(x_k)$ is a nonincreasing sequence, it follows that $f(x_k)$ converges to $f(x^*) = f^*$. From (5.4) we see that

$f(x_{k+1}) \le f(x_k) - \left(\rho c_1 \omega + \dfrac{(\omega M c_2^2 - c_1)^2}{2 M c_2^2}\right) \|g_k\|^2.$  (5.5)

Combining this with $\|g_k\|^2 \ge 2m \left( f(x_k) - f^* \right)$ and subtracting $f^*$ from both sides of (5.5), we conclude

$f(x_{k+1}) - f^* \le c_3 \left( f(x_k) - f^* \right),$

where

$c_3 = 1 - 2m \left(\rho c_1 \omega + \dfrac{(\omega M c_2^2 - c_1)^2}{2 M c_2^2}\right) < 1.$

Therefore $f(x_k)$ converges to $f^*$ at least as fast as a geometric series with a factor that depends on the parameter $\rho$ in the first Wolfe condition and the bounds $m$ and $M$ on the Hessian. So, the convergence of the accelerated scheme is at least linear.

6 Numerical results and comparisons

In this section we present the computational performance of a Fortran implementation of the AMDY and AMDYC algorithms on a set of 750 unconstrained optimization test problems. The test problems are the unconstrained problems in the CUTE [14] library, along with other large-scale optimization problems presented in [12]. We selected 75 large-scale unconstrained optimization problems in extended or generalized form; for each function we considered ten numerical experiments with an increasing number of variables, $n = 1000, 2000, \ldots, 10000$. All algorithms implement the Wolfe line search conditions with $\rho = 0.0001$ and $\sigma = 0.9$, and the same stopping criterion $\|g_k\|_\infty \le 10^{-6}$, where $\|\cdot\|_\infty$ is the maximum absolute component of a vector. The comparisons of the algorithms are given in the following context. Let $f_i^{ALG1}$ and $f_i^{ALG2}$ be the optimal values found by ALG1 and ALG2 for problem $i = 1, \ldots, 750$, respectively. We say that, for the particular problem $i$, the performance of ALG1 was better than the performance of ALG2 if

$|f_i^{ALG1} - f_i^{ALG2}| < 10^{-3}$  (6.1)

and the number of iterations, or the number of function-gradient evaluations, or the CPU time of ALG1 was less than the number of iterations, the number of function-gradient evaluations, or the CPU time corresponding to ALG2, respectively. All codes are written in double precision Fortran and compiled with f77 (default compiler settings) on an Intel Pentium 4, 1.8 GHz workstation. All these codes are authored by Andrei. In the first set of numerical experiments we compare AMDY versus AMDYC. In Table 1 we present the number of problems solved by these two algorithms with a minimum number of iterations (#iter), a minimum number of function and gradient evaluations (#fg), and the
minimum CPU time.

Table 1. Performance of AMDY versus AMDYC (750 problems): #iter, #fg, CPU.

Both algorithms have similar performance. However, subject to the CPU time metric, AMDY proves to be slightly better. In the second set of numerical experiments we compare the AMDY and AMDYC algorithms with the Dai and Yuan (DY) algorithm. Figures 1 and 2 present the Dolan-Moré performance profiles for these algorithms subject to the CPU time metric. We see that both AMDY and AMDYC are top performers, being more successful and more robust than the Dai and Yuan algorithm. When comparing AMDY with the Dai and Yuan algorithm (Figure 1), subject to the number of iterations, we see that AMDY was better in 619 problems (i.e. it achieved the minimum number of iterations in 619 problems), DY was better in 27 problems, and they achieved the same number of iterations in 60 problems. Out of 750 problems, criterion (6.1) holds for only 706 of them.
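The Dolan-Moré performance profiles [23] used for all the comparisons in this section can be sketched as follows. This is a minimal Python sketch with hypothetical timing data for illustration; it does not reproduce the paper's measurements.

```python
import numpy as np

def performance_profile(T, taus):
    """Dolan-More performance profile: T[i, s] is the cost (e.g. CPU time)
    of solver s on problem i (np.inf marks a failure). For each tau, return
    the fraction of problems on which each solver is within a factor tau
    of the best solver for that problem."""
    T = np.asarray(T, dtype=float)
    best = T.min(axis=1, keepdims=True)   # best cost per problem
    ratios = T / best                     # performance ratios r_{i,s} >= 1
    # rho_s(tau) = (# problems with r_{i,s} <= tau) / (# problems)
    return np.array([[np.mean(ratios[:, s] <= tau)
                      for s in range(T.shape[1])] for tau in taus])

# Hypothetical CPU times: 4 problems x 2 solvers (illustration only).
T = [[1.0, 2.0],
     [3.0, 3.0],
     [10.0, 5.0],
     [np.inf, 4.0]]                       # solver 0 fails on the last problem
rho = performance_profile(T, taus=[1.0, 2.0, 4.0])
```

The value $\rho_s(1)$ is the fraction of problems on which solver $s$ is the fastest, and $\rho_s(\tau)$ for large $\tau$ measures robustness, which is how the figures in this section are read.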
Fig. 1. Performance profile of AMDY versus DY.

Fig. 2. Performance profile of AMDYC versus DY.

Dai and Yuan [21] studied the hybrid conjugate gradient algorithms and proposed the following two hybrid methods:

$\beta_k^{hDY} = \max\left\{ -\dfrac{1-\sigma}{1+\sigma}\, \beta_k^{DY},\ \min\{\beta_k^{HS}, \beta_k^{DY}\} \right\},$  (6.2)

$\beta_k^{hDYz} = \max\left\{ 0,\ \min\{\beta_k^{HS}, \beta_k^{DY}\} \right\},$  (6.3)

where $\beta_k^{HS} = y_k^T g_{k+1} / (y_k^T s_k)$, showing their global convergence when the Lipschitz assumption holds and the standard Wolfe line search is used. The numerical experiments of Dai and Ni [19] proved that the second hybrid method (hDYz) is the better one, outperforming the Polak-Ribière [41] and Polyak [42] method. In the third set of numerical experiments we compare the Dolan-Moré performance profiles of AMDY and AMDYC versus the Dai-Yuan hybrid conjugate gradient $\beta_k^{hDY}$, subject to the CPU time metric, as in Figures 3 and 4.

Fig. 3. Performance profile of AMDY versus hDY.

Fig. 4. Performance profile of AMDYC versus hDY.
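The two Dai-Yuan hybrid choices above (hDY and hDYz) are simple clipping rules on the Hestenes-Stiefel parameter, and can be sketched as follows. The numerical values in this minimal Python sketch are hypothetical, for illustration only.

```python
import numpy as np

def hybrid_dy_betas(g_next, y, s, sigma=0.9):
    """Dai-Yuan hybrid parameters: beta^hDY clips beta^HS to the interval
    [-(1-sigma)/(1+sigma) * beta^DY, beta^DY]; beta^hDYz clips it to
    [0, beta^DY]."""
    ys = np.dot(y, s)                       # y_k^T s_k > 0 under Wolfe
    beta_dy = np.dot(g_next, g_next) / ys   # Dai-Yuan
    beta_hs = np.dot(y, g_next) / ys        # Hestenes-Stiefel
    c = (1.0 - sigma) / (1.0 + sigma)
    beta_hdy = max(-c * beta_dy, min(beta_hs, beta_dy))
    beta_hdyz = max(0.0, min(beta_hs, beta_dy))
    return beta_hdy, beta_hdyz

# Hypothetical iteration data (illustration only).
g_next = np.array([0.5, -1.0])
y = np.array([1.0, 1.0])
s = np.array([0.2, 0.1])
bh, bhz = hybrid_dy_betas(g_next, y, s)
```

In this example the Hestenes-Stiefel value is negative, so hDYz truncates it to zero (a restart-like step), while hDY allows a small controlled negative value bounded by $-((1-\sigma)/(1+\sigma))\beta_k^{DY}$.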
In the fourth set of numerical experiments, in Figures 5 and 6, we compare the Dolan-Moré performance profiles of AMDY and AMDYC versus the Dai-Yuan hybrid conjugate gradient $\beta_k^{hDYz}$, subject to the CPU time metric.

Fig. 5. Performance profile of AMDY versus hDYz.

Fig. 6. Performance profile of AMDYC versus hDYz.
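Putting Sections 2, 4 and 5 together, the overall AMDY/AMDYC iteration can be sketched compactly. This is a simplified Python sketch, not the author's Fortran implementation: an Armijo backtracking search stands in for the Wolfe line search of Step 3, safeguards are minimal, and the quadratic test problem is hypothetical.

```python
import numpy as np

def amdy(f, grad, x0, variant="AMDYC", tol=1e-6, max_iter=500):
    """Sketch of Steps 1-8 of the AMDY/AMDYC algorithms (Section 5)."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g
    alpha = 1.0 / np.linalg.norm(g)
    for _ in range(max_iter):
        if np.max(np.abs(g)) <= tol:               # Step 2 stopping test
            break
        a = alpha                                   # Step 3 (Armijo stand-in)
        while f(x + a * d) > f(x) + 1e-4 * a * (g @ d):
            a *= 0.5
        z = x + a * d                               # Steps 4-6: acceleration
        gz = grad(z)
        ak = a * (g @ d)                            # a_k <= 0
        bk = -a * ((g - gz) @ d)                    # b_k = -alpha_k y_k^T d_k
        x_new = x + (-ak / bk) * a * d if bk > 1e-16 else z
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        ys, yg = y @ s, y @ g_new
        if abs(ys) < 1e-16 or abs(yg) < 1e-16:
            x, g, d = x_new, g_new, -g_new          # fall back to steepest descent
            continue
        gg, sg = g_new @ g_new, s @ g_new
        beta = gg / ys - sg * gg / ys**2            # cf. formula (4)
        if variant == "AMDY":
            theta = (gg + sg - sg * gg / ys) / yg   # Newton-based scaling
        else:
            theta = (gg - sg * gg / ys) / yg        # conjugacy-based scaling
        if theta < 0.25:                            # Step 7 safeguard
            theta = 1.0
        d_new = -theta * g_new + beta * s           # Step 8
        if g_new @ d_new > -1e-3 * np.linalg.norm(d_new) * np.linalg.norm(g_new):
            d_new = -g_new                          # restart test
        alpha = a * np.linalg.norm(d) / max(np.linalg.norm(d_new), 1e-16)
        x, g, d = x_new, g_new, d_new
    return x, g

# Hypothetical ill-conditioned convex quadratic, for illustration only.
A = np.diag([1.0, 10.0, 100.0])
fq = lambda x: 0.5 * x @ (A @ x)
gq = lambda x: A @ x
x_opt, g_opt = amdy(fq, gq, np.array([1.0, 1.0, 1.0]))
```

On a convex quadratic the acceleration step reduces each iteration to an exact line minimization, so this sketch behaves like linear conjugate gradient and stops after a handful of iterations.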
7 Conclusion

We have presented new conjugate gradient algorithms for solving large-scale unconstrained optimization problems. The parameter $\beta_k$ is a modification of the Dai and Yuan computational scheme in such a manner that the direction $d_k$ generated by the algorithm satisfies the sufficient descent condition, independent of the line search. Under the strong Wolfe line search conditions we proved the global convergence of the algorithm. We presented computational evidence that the performance of our algorithms AMDY and AMDYC was higher than that of the Dai and Yuan conjugate gradient algorithm and its hybrid variants, for a set consisting of 750 problems.

References
[1] Andrei, N.: Conjugate gradient algorithms for large scale unconstrained optimization. ICI Technical Report, January 2005.
[2] Andrei, N.: Scaled conjugate gradient algorithms for unconstrained optimization. Computational Optimization and Applications, 38 (2007), pp. 401-416.
[3] Andrei, N.: Scaled memoryless BFGS preconditioned conjugate gradient algorithm for unconstrained optimization. Optimization Methods and Software, 22 (2007), pp. 561-571.
[4] Andrei, N.: A scaled BFGS preconditioned conjugate gradient algorithm for unconstrained optimization. Applied Mathematics Letters, 20 (2007), pp. 645-650.
[5] Andrei, N.: A scaled nonlinear conjugate gradient algorithm for unconstrained optimization. Optimization, 57 (2008).
[6] Andrei, N.: An acceleration of gradient descent algorithm with backtracking for unconstrained optimization. Numerical Algorithms, 42 (2006), pp. 63-73.
[7] Andrei, N.: 40 conjugate gradient algorithms for unconstrained optimization. A survey on their definition. ICI Technical Report No. 13/08, March 14, 2008.
[8] Andrei, N.: Numerical comparison of conjugate gradient algorithms for unconstrained optimization. Studies in Informatics and Control, 16 (2007).
[9] Andrei, N.: A Dai-Yuan conjugate gradient algorithm with sufficient descent and conjugacy conditions for unconstrained optimization. Applied Mathematics Letters, 21 (2008), pp. 165-171.
[10] Andrei, N.: Another hybrid conjugate gradient algorithm for unconstrained optimization. Numerical Algorithms, 47 (2008), pp. 143-156.
[11] Andrei, N.: Performance profiles of conjugate gradient algorithms for unconstrained optimization. Encyclopedia of Optimization, 2nd edition, 2008, C.A. Floudas and P.M. Pardalos (Eds.), entry 76, in press.
[12] Andrei, N.: An unconstrained optimization test functions collection. Advanced Modeling and Optimization. An Electronic International Journal, 10 (2008), pp. 147-161.
[13] Birgin, E., Martínez, J.M.: A spectral conjugate gradient method for unconstrained optimization. Applied Mathematics and Optimization, 43 (2001), pp. 117-128.
[14] Bongartz, I., Conn, A.R., Gould, N.I.M., Toint, Ph.L.: CUTE: constrained and unconstrained testing environments. ACM Trans. Math. Software, 21 (1995), pp. 123-160.
[15] Concus, P., Golub, G.H., O'Leary, D.P.: A generalized conjugate gradient method for the numerical solution of elliptic partial differential equations. In: J.R. Bunch and D.J. Rose (Eds.), Sparse Matrix Computations, Academic Press, New York, 1976, pp. 309-332.
[16] Concus, P., Golub, G.H.: A generalized conjugate gradient method for nonsymmetric systems of linear equations. In: R. Glowinski and J.L. Lions (Eds.), Computing Methods in Applied Sciences and Engineering (2nd International Symposium, Versailles 1975), Part I, Springer Lecture Notes in Economics and Mathematical Systems 134, New York, 1976, pp. 56-65.
[17] Dai, Y.H.: New properties of a nonlinear conjugate gradient method. Numer. Math., 89 (2001), pp. 83-98.
[18] Dai, Y.H., Liao, L.Z.: New conjugacy conditions and related nonlinear conjugate gradient methods. Appl. Math. Optim., 43 (2001), pp. 87-101.
[19] Dai, Y.H., Ni, Q.: Testing different conjugate gradient methods for large-scale unconstrained optimization. J. Comput. Math., 21 (2003), pp. 311-320.
[20] Dai, Y.H., Yuan, Y.: A nonlinear conjugate gradient method with a strong global convergence property. SIAM J. Optim., 10 (1999), pp. 177-182.
[21] Dai, Y.H., Yuan, Y.: An efficient hybrid conjugate gradient method for unconstrained optimization. Annals of Operations Research, 103 (2001), pp. 33-47.
[22] Dai, Y.H., Han, J.Y., Liu, G.H., Sun, D.F., Yin, X., Yuan, Y.: Convergence properties of nonlinear conjugate gradient methods. SIAM Journal on Optimization, 10 (1999), pp. 345-358.
[23] Dolan, E.D., Moré, J.J.: Benchmarking optimization software with performance profiles. Math. Programming, 91 (2002), pp. 201-213.
[24] Goldstein, A.A.: On steepest descent. SIAM J. Control, 3 (1965), pp. 147-151.
[25] Fletcher, R., Reeves, C.M.: Function minimization by conjugate gradients. Computer Journal, 7 (1964), pp. 149-154.
[26] Gilbert, J.C., Nocedal, J.: Global convergence properties of conjugate gradient methods for optimization. SIAM J. Optim., 2 (1992), pp. 21-42.
[27] Golub, G.H., Kahan, W.: Calculating the singular values and pseudo-inverse of a matrix. SIAM J. Numer. Anal., 2 (Series B) (1965), pp. 205-224.
[28] Golub, G.H., O'Leary, D.P.: Some history of the conjugate gradient and Lanczos algorithms: 1948-1976. SIAM Review, 31 (1989), pp. 50-102.
[29] Golub, G.H., Underwood, R., Wilkinson, J.H.: The Lanczos algorithm for the symmetric Ax = λBx problem. Technical Report, Stanford University Computer Science Department, Stanford, California, 1972.
[30] Hager, W.W., Zhang, H.: A new conjugate gradient method with guaranteed descent and an efficient line search. SIAM Journal on Optimization, 16 (2005), pp. 170-192.
[31] Hager, W.W., Zhang, H.: A survey of nonlinear conjugate gradient methods. Pacific Journal of Optimization, 2 (2006), pp. 35-58.
[32] Hestenes, M.R., Stiefel, E.: Methods of conjugate gradients for solving linear systems. J. Research Nat. Bur. Standards, 49 (1952), pp. 409-436.
[33] Hu,
YF, Storey, C: Global convergence result for conjugate gradient methods JOA, 7, (99), pp [34] Lanczos, C: An iteration method for the solution of the eigenvalue problem of linear differential and integral operators, J Res at Bur Standards, 45 (950), pp55-8 [35] Lanczos, C: Solution of systems of linear equations by minimized iterations, J Res at Bur Standards, 49 (95), pp33-53 [36] Liu, DC, ocedal, J: On the limited memory BFGS method for large scale optimization methods Mathematical Programming, 45 (989), pp [37] Liu, Y, Storey, C: Efficient generalized conjugate gradient algorithms, Part : heory JOA, 69 (99), pp9-37 [38] O Leary, DP: Conjugate gradients and related KMP algorithms: he beginnings In L Adams and JL azareth (Eds) Linear and onlinear Conjugate Gradient Related Methods SIAM, Philadelphia, 996, pp-8 [39] ocedal, J: Conjugate gradient methods and nonlinear optimization, in L Adams and JL azareth (Eds) Linear and onlinear Conjugate Gradient Related Methods, SIAM, Philadelphia, 996, pp9-3 [40] Perry, JM: A class of conjugate gradient algorithms with a two-step variable-metric memory, Discussion Paper 69, Center for Mathematical Studies in Economic and Management Sciences, orthwestern University, Evanston, Illinois, 977 [4] Pola, E, Ribière, G : ote sur la convergence de méthodes de directions conjuguée, Revue Francaise Informat Recherche Opérationnelle, 3e Année 6 (969), pp35-43 [4] Polya, B: he conjugate gradient method in extreme problems,ussr Comp Math Math Phys, 9 (969), pp94-7
18 [43] Shanno, DF, Phua, V: Algorithm 500, Minimization of unconstrained multivariate functions, ACM rans on Math Soft,, pp87-94, 976 [44] Shanno, DF: On the convergence of a new conjugate gradient algorithm,siam J umer Anal, 5 (978), pp47-57 [45] Shanno, DF: Conjugate gradient methods with inexact searches Math Oper Res, 3 (978), pp44-56 [46] ouati-ahmed, D, Storey, C: Efficient hybrid conjugate gradient techniques, Journal Of Optimization heory and Applications, 64 (990), pp June 8, 008 8