Research Article A Descent Dai-Liao Conjugate Gradient Method Based on a Modified Secant Equation and Its Global Convergence

International Scholarly Research Network
ISRN Computational Mathematics
Volume 2012, Article ID 435495, 8 pages
doi:10.5402/2012/435495

Research Article
A Descent Dai-Liao Conjugate Gradient Method Based on a Modified Secant Equation and Its Global Convergence

Ioannis E. Livieris and Panagiotis Pintelas
Department of Mathematics, University of Patras, 265-00 Patras, Greece

Correspondence should be addressed to Ioannis E. Livieris, livieris@upatras.gr

Received 31 August 2011; Accepted 20 October 2011
Academic Editors: K. Eom and R. Joan-Arinyo

Copyright 2012 I. E. Livieris and P. Pintelas. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

We propose a conjugate gradient method which is based on the study of the Dai-Liao conjugate gradient method. An important property of our proposed method is that it ensures sufficient descent independent of the accuracy of the line search. Moreover, it achieves high-order accuracy in approximating the second-order curvature information of the objective function by utilizing the modified secant condition proposed by Babaie-Kafaki et al. (2010). Under mild conditions, we establish that the proposed method is globally convergent for general functions provided that the line search satisfies the Wolfe conditions. Numerical experiments are also presented.

1. Introduction

We consider the unconstrained optimization problem

    \min f(x), \quad x \in \mathbb{R}^n,    (1)

where $f: \mathbb{R}^n \to \mathbb{R}$ is a continuously differentiable function. Conjugate gradient methods are probably the most famous iterative methods for solving problem (1), especially when the dimension is large, since they are characterized by the simplicity of their iteration and their low memory requirements. These methods generate a sequence of points $\{x_k\}$, starting from an initial point $x_0 \in \mathbb{R}^n$, using the iterative formula

    x_{k+1} = x_k + \alpha_k d_k, \quad k = 0, 1, \ldots,    (2)

where $\alpha_k > 0$ is the stepsize obtained by some line search and $d_k$ is the search direction defined by

    d_k = \begin{cases} -g_k, & \text{if } k = 0, \\ -g_k + \beta_k d_{k-1}, & \text{otherwise}, \end{cases}    (3)

where $g_k$ is the gradient of $f$ at $x_k$ and $\beta_k$ is a scalar. Well-known formulas for $\beta_k$ include the Hestenes-Stiefel (HS) [1], the Fletcher-Reeves (FR) [2], the Polak-Ribière (PR) [3], the Liu-Storey (LS) [4], the Dai-Yuan (DY) [5], and the conjugate descent (CD) [6] formulas. They are specified by

    \beta_k^{HS} = \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}}, \quad \beta_k^{FR} = \frac{\|g_k\|^2}{\|g_{k-1}\|^2}, \quad \beta_k^{PR} = \frac{g_k^T y_{k-1}}{\|g_{k-1}\|^2},
    \beta_k^{LS} = -\frac{g_k^T y_{k-1}}{d_{k-1}^T g_{k-1}}, \quad \beta_k^{DY} = \frac{\|g_k\|^2}{d_{k-1}^T y_{k-1}}, \quad \beta_k^{CD} = -\frac{\|g_k\|^2}{d_{k-1}^T g_{k-1}},    (4)

respectively, where $s_{k-1} = x_k - x_{k-1}$, $y_{k-1} = g_k - g_{k-1}$, and $\|\cdot\|$ denotes the Euclidean norm. If $f$ is a strictly convex quadratic function and the performed line search is exact, all these methods are equivalent; for a general function, however, different choices of $\beta_k$ give rise to distinct conjugate gradient methods with quite different computational efficiency and convergence properties. We refer to the books [7, 8], the survey paper [9], and the references therein for the numerical performance and the convergence properties of conjugate gradient methods. During the last decade, much effort has been devoted to developing new conjugate gradient methods which are not only globally convergent for general

functions but also computationally superior to classical methods; these methods can be classified into two classes.

The first class utilizes second-order information of the objective function to improve the efficiency and robustness of conjugate gradient methods. Dai and Liao [10] proposed a new conjugate gradient method by exploiting a new conjugacy condition based on the standard secant equation, in which $\beta_k$ in (3) is defined by

    \beta_k^{DL} = \frac{g_k^T (y_{k-1} - t s_{k-1})}{d_{k-1}^T y_{k-1}},    (5)

where $t \geq 0$ is a scalar. Moreover, from the viewpoint of global convergence for general functions, Dai and Liao also suggested a modification of (5) obtained by restricting the first term to be nonnegative, namely,

    \beta_k^{DL+} = \max\left\{ \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}}, 0 \right\} - t \frac{g_k^T s_{k-1}}{d_{k-1}^T y_{k-1}}.    (6)

Along this line, many researchers [11-15] proposed variants of the Dai-Liao method based on modified secant conditions with higher orders of accuracy in the approximation of the curvature. Under proper conditions, these methods are globally convergent and sometimes competitive with classical conjugate gradient methods. However, these methods are not guaranteed to generate descent directions; therefore, the descent condition is usually assumed in their analysis and implementations.

The second class focuses on conjugate gradient methods which ensure sufficient descent independent of the accuracy of the line search. On the basis of this idea, Hager and Zhang [16] modified the parameter $\beta_k$ in (3) and proposed a new conjugate gradient method, called CG-DESCENT, in which the update parameter is defined by

    \beta_k^{HZ} = \max\{\beta_k^N, \eta_k\},    (7)

where

    \beta_k^N = \frac{1}{d_{k-1}^T y_{k-1}} \left( y_{k-1} - 2 d_{k-1} \frac{\|y_{k-1}\|^2}{d_{k-1}^T y_{k-1}} \right)^T g_k, \quad \eta_k = \frac{-1}{\|d_{k-1}\| \min\{\eta, \|g_{k-1}\|\}},    (8)

and $\eta > 0$ is a constant. An important feature of the CG-DESCENT method is that the generated direction satisfies $d_k^T g_k \leq -(7/8)\|g_k\|^2$. Moreover, Hager and Zhang [16] established that CG-DESCENT is globally convergent for general functions under the Wolfe line search conditions.

Quite recently, Zhang et al. [17] considered a different approach, modifying the search direction so that the generated direction satisfies $g_k^T d_k = -\|g_k\|^2$ independently of the line search used. More analytically, they proposed a modified FR method in which the search direction is given by

    d_k = -\left(1 + \beta_k^{FR} \frac{g_k^T d_{k-1}}{\|g_k\|^2}\right) g_k + \beta_k^{FR} d_{k-1}.    (9)

This method reduces to the standard FR method when the performed line search is exact. Furthermore, when $\beta_k$ is specified by another existing conjugate gradient formula, the property $g_k^T d_k = -\|g_k\|^2$ is still satisfied. Along this line, many related conjugate gradient methods with strong convergence properties and good average performance have been studied extensively [17-24].

In this work, we propose a new conjugate gradient method which has the characteristics of both of the previously discussed classes. More analytically, our method ensures sufficient descent independent of the accuracy of the line search and achieves high-order accuracy in approximating the second-order curvature information of the objective function by utilizing the modified secant condition proposed by Babaie-Kafaki et al. [11]. Under mild conditions, we establish the global convergence of our proposed method. The numerical experiments indicate that the proposed method is promising.

The remainder of this paper is organized as follows. In Section 2, we present our motivation and our proposed conjugate gradient method. In Section 3, we present the global convergence analysis of our method. In Section 4, numerical experiments are reported using the performance profiles of Dolan and Moré [25]. Finally, Section 5 presents our concluding remarks and our proposals for future research. Throughout this paper, we denote $\nabla f(x_k)$ and $\nabla^2 f(x_k)$ by $g_k$ and $G_k$, respectively.

2. Algorithm

Firstly, we recall that in quasi-Newton methods, an approximation matrix $B_k$ to the Hessian $\nabla^2 f$ is updated so that the new matrix $B_{k+1}$ satisfies the secant condition

    B_{k+1} s_k = y_k.    (10)

Zhang et al. [26] and Zhang and Xu [27] extended this condition and derived a class of modified secant conditions with a vector parameter, of the form

    B_{k+1} s_k = \tilde{y}_k, \quad \tilde{y}_k = y_k + \frac{\theta_k}{s_k^T u_k} u_k,    (11)

where $u_k$ is any vector satisfying $s_k^T u_k > 0$ and $\theta_k$ is defined by

    \theta_k = 6(f_k - f_{k+1}) + 3(g_k + g_{k+1})^T s_k.    (12)

Observe that this new quasi-Newton equation contains not only gradient information but also function value information at the present and the previous step. Moreover, in [26], Zhang et al. proved that if $\|s_k\|$ is sufficiently small, then

    s_k^T G_{k+1} s_k - s_k^T y_k = O(\|s_k\|^3), \quad s_k^T G_{k+1} s_k - s_k^T \tilde{y}_k = O(\|s_k\|^4).    (13)
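As a quick numerical illustration of (11)-(13) — a sketch of ours, with an arbitrarily chosen smooth test function, point, and step, not taken from the paper — one can compare how accurately $s_k^T y_k$ and $s_k^T \tilde{y}_k$ approximate the curvature term $s_k^T G_{k+1} s_k$ for a small step:

```python
import numpy as np

# Illustrative smooth test function with closed-form gradient and Hessian.
f = lambda x: np.exp(x[0]) + np.sin(x[1])
grad = lambda x: np.array([np.exp(x[0]), np.cos(x[1])])
hess = lambda x: np.diag([np.exp(x[0]), -np.sin(x[1])])

x_k = np.array([0.3, 0.7])
s = 1e-2 * np.array([1.0, -0.5])          # a small step s_k
x_next = x_k + s
g_k, g_next = grad(x_k), grad(x_next)

y = g_next - g_k                          # standard secant vector y_k
theta = 6.0 * (f(x_k) - f(x_next)) + 3.0 * (g_k + g_next) @ s    # (12)
u = s                                     # any u with s^T u > 0; here u = s
y_tilde = y + theta / (s @ u) * u         # modified secant vector (11)

curv = s @ hess(x_next) @ s               # exact curvature s_k^T G_{k+1} s_k
err_standard = abs(curv - s @ y)          # O(||s_k||^3) residual, per (13)
err_modified = abs(curv - s @ y_tilde)    # O(||s_k||^4) residual, per (13)
```

For a step of this length, `err_modified` comes out markedly smaller than `err_standard`, consistent with (13).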

Clearly, these equations imply that the quantity $s_k^T \tilde{y}_k$ approximates the second-order curvature $s_k^T G_{k+1} s_k$ with higher precision than the quantity $s_k^T y_k$ does. However, for steps of length greater than one (i.e., $\|s_k\| > 1$), the standard secant equation (10) is expected to be more accurate than the modified secant equation (11). To overcome this difficulty, Babaie-Kafaki et al. [11] recently considered an extension of the modified secant condition (11) of the form

    B_{k+1} s_k = \tilde{y}_k, \quad \tilde{y}_k = y_k + \rho_k \frac{\max\{\theta_k, 0\}}{s_k^T u_k} u_k,    (14)

where the parameter $\rho_k$ is restricted to the values $\{0, 1\}$ and adaptively switches between the standard secant equation (10) and the modified secant equation (14) by setting $\rho_k = 1$ if $\|s_k\| \leq 1$ and $\rho_k = 0$ otherwise. In the same way as Dai and Liao [10], they obtained an expression for $\beta_k$ of the form

    \beta_k = \frac{g_k^T (\tilde{y}_{k-1} - t s_{k-1})}{d_{k-1}^T \tilde{y}_{k-1}},    (15)

where $t \geq 0$ and $\tilde{y}_{k-1}$ is defined by (14). Furthermore, following Dai and Liao's approach, in order to ensure global convergence for general functions, they modified formula (15) as follows:

    \beta_k^+ = \max\left\{ \frac{g_k^T \tilde{y}_{k-1}}{d_{k-1}^T \tilde{y}_{k-1}}, 0 \right\} - t \frac{g_k^T s_{k-1}}{d_{k-1}^T \tilde{y}_{k-1}}.    (16)

Motivated by the theoretical advantages of the modified secant equation (14) and the technique of the modified FR method [17], we propose a new conjugate gradient method as follows. Let the search direction be defined by

    d_k = -\left(1 + \beta_k \frac{g_k^T d_{k-1}}{\|g_k\|^2}\right) g_k + \beta_k d_{k-1},    (17)

where $\beta_k$ is defined by (15). It is easy to see that the sufficient descent condition

    g_k^T d_k = -\|g_k\|^2    (18)

holds for any line search. Moreover, if $f$ is a convex quadratic function and the performed line search is exact, then $\theta_{k-1} = 0$, $\tilde{y}_{k-1} = y_{k-1}$, and $g_k^T s_{k-1} = 0$; hence, the conjugate gradient method (2)-(17) reduces to the standard conjugate gradient method. Now, based on the above discussion, we present our proposed algorithm, called the modified Dai-Liao conjugate gradient algorithm (MDL-CG).

Algorithm 1 (MDL-CG):
1. Choose an initial point $x_0 \in \mathbb{R}^n$; set $k = 0$.
2. If $\|g_k\| = 0$, then terminate; otherwise, go to the next step.
3. Compute the descent direction $d_k$ by (15) and (17).
4. Determine a stepsize $\alpha_k$ by some line search rule.
5. Let $x_{k+1} = x_k + \alpha_k d_k$.
6. Set $k = k + 1$ and go to Step 2.

3. Convergence Analysis

In order to establish the global convergence analysis, we make the following assumptions on the objective function $f$.

Assumption 1. The level set $\mathcal{L} = \{x \in \mathbb{R}^n \mid f(x) \leq f(x_0)\}$ is bounded; namely, there exists a positive constant $B > 0$ such that

    \|x\| \leq B, \quad \forall x \in \mathcal{L}.    (19)

Assumption 2. In some neighborhood $\mathcal{N} \supseteq \mathcal{L}$, $f$ is differentiable and its gradient $g$ is Lipschitz continuous; namely, there exists a positive constant $L > 0$ such that

    \|g(x) - g(y)\| \leq L \|x - y\|, \quad \forall x, y \in \mathcal{N}.    (20)

It follows directly from Assumptions 1 and 2 that there exists a positive constant $M > 0$ such that

    \|g(x)\| \leq M, \quad \forall x \in \mathcal{L}.    (21)

In order to guarantee the global convergence of Algorithm 1, we require the steplength $\alpha_k$ to satisfy the Armijo condition or the Wolfe conditions. The Armijo line search finds a steplength $\alpha_k = \max\{\lambda^j \mid j = 0, 1, 2, \ldots\}$ such that

    f(x_k + \alpha_k d_k) - f(x_k) \leq \delta \alpha_k g_k^T d_k,    (22)

where $\delta, \lambda \in (0, 1)$ are constants. In the Wolfe line search, the steplength $\alpha_k$ satisfies

    f(x_k + \alpha_k d_k) - f(x_k) \leq \sigma_1 \alpha_k g_k^T d_k,    (23)
    g(x_k + \alpha_k d_k)^T d_k \geq \sigma_2 g_k^T d_k,    (24)

with $0 < \sigma_1 < \sigma_2 < 1$.

Next, we present some lemmas which are very important for the global convergence analysis.

Lemma 1 (see [11]). Suppose that Assumptions 1 and 2 hold. For $\theta_k$ and $\tilde{y}_k$ defined by (12) and (14), respectively, one has

    |\theta_k| \leq 3L \|s_k\|^2, \quad \|\tilde{y}_k\| \leq 4L \|s_k\|.    (25)

Lemma 2. Suppose that Assumptions 1 and 2 hold. Let $\{x_k\}$ be generated by Algorithm MDL-CG, where the line search satisfies the Armijo condition (22). Then there exists a positive constant $c > 0$ such that

    \alpha_k \geq c \frac{\|g_k\|^2}{\|d_k\|^2},    (26)

for all $k \geq 0$.
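For concreteness, the Wolfe conditions (23) and (24) can be checked for a trial stepsize as follows. This is a minimal sketch of ours; the test function, point, and the default values of sigma1 and sigma2 are illustrative choices, not the paper's settings:

```python
import numpy as np

def satisfies_wolfe(f, grad, x, d, alpha, sigma1=1e-4, sigma2=0.9):
    """Return True if the trial stepsize alpha satisfies the Wolfe
    conditions (23)-(24) at x along the direction d."""
    g_d = grad(x) @ d
    decrease = f(x + alpha * d) - f(x) <= sigma1 * alpha * g_d   # (23)
    curvature = grad(x + alpha * d) @ d >= sigma2 * g_d          # (24)
    return bool(decrease and curvature)

# usage: f(x) = ||x||^2, steepest-descent direction from x = (1, 1)
f = lambda x: x @ x
grad = lambda x: 2.0 * x
x = np.array([1.0, 1.0])
d = -grad(x)
ok = satisfies_wolfe(f, grad, x, d, alpha=0.25)
```

A full line search would bracket and refine alpha until such a check succeeds; in the experiments of Section 4, the line search of Hager and Zhang [16] is used.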

Proof. From the Armijo condition (22) and Assumptions 1 and 2, we have

    \sum_{k \geq 0} -\delta \alpha_k d_k^T g_k < +\infty.    (27)

Using this together with inequality (18) implies that

    \sum_{k \geq 0} \alpha_k \|g_k\|^2 = \sum_{k \geq 0} -\alpha_k d_k^T g_k < +\infty.    (28)

We now prove (26) by considering the following cases.

Case 1. If $\alpha_k = 1$, it follows from (18) that $\|d_k\| \geq \|g_k\|$. In this case, inequality (26) is satisfied with $c = 1$.

Case 2. If $\alpha_k < 1$, then by the line search, $\lambda^{-1}\alpha_k$ does not satisfy (22). This implies

    f(x_k + \lambda^{-1}\alpha_k d_k) - f(x_k) > \delta \lambda^{-1}\alpha_k g_k^T d_k.    (29)

By the mean-value theorem and Assumptions 1 and 2, we get

    f(x_k + \lambda^{-1}\alpha_k d_k) - f(x_k) \leq \lambda^{-1}\alpha_k g_k^T d_k + L \lambda^{-2}\alpha_k^2 \|d_k\|^2.    (30)

Using this inequality with (18) and (29), we have

    \alpha_k \geq \frac{(1-\delta)\lambda}{L} \frac{-g_k^T d_k}{\|d_k\|^2} = \frac{(1-\delta)\lambda}{L} \frac{\|g_k\|^2}{\|d_k\|^2}.    (31)

Letting $c = \min\{1, (1-\delta)\lambda/L\}$, we get (26), which completes the proof.

From inequalities (26) and (28), we can easily obtain the following lemma.

Lemma 3. Suppose that Assumptions 1 and 2 hold. Let $\{x_k\}$ be generated by Algorithm MDL-CG, where the line search satisfies the Armijo condition (22). Then

    \sum_{k \geq 0} \frac{\|g_k\|^4}{\|d_k\|^2} < +\infty.    (32)

Next, we establish the global convergence theorem for Algorithm MDL-CG for uniformly convex functions.

Theorem 1. Suppose that Assumptions 1 and 2 hold and $f$ is uniformly convex on $\mathcal{L}$; namely, there exists a positive constant $\gamma > 0$ such that

    \gamma \|x - y\|^2 \leq (g(x) - g(y))^T (x - y), \quad \forall x, y \in \mathcal{L}.    (33)

Let $\{x_k\}$ and $\{d_k\}$ be generated by Algorithm MDL-CG and let $\alpha_k$ satisfy the Armijo condition (22). Then one has either $g_k = 0$ for some $k$ or

    \lim_{k \to \infty} \|g_k\| = 0.    (34)

Proof. Suppose that $g_k \neq 0$ for all $k \geq 0$. By the convexity assumption (33), we have

    d_{k-1}^T \tilde{y}_{k-1} \geq d_{k-1}^T y_{k-1} \geq \gamma \alpha_{k-1}^{-1} \|s_{k-1}\|^2.    (35)

Combining the previous relation with Lemma 1, we obtain

    |\beta_k| = \frac{|g_k^T (\tilde{y}_{k-1} - t s_{k-1})|}{d_{k-1}^T \tilde{y}_{k-1}} \leq \frac{\|g_k\| (\|\tilde{y}_{k-1}\| + t \|s_{k-1}\|)}{d_{k-1}^T \tilde{y}_{k-1}} \leq \frac{4L + t}{\gamma} \frac{\|g_k\|}{\|d_{k-1}\|}.    (36)

Therefore, by the definition of the search direction $d_k$ in (17) together with the previous inequality, we obtain an upper bound for $\|d_k\|$:

    \|d_k\| \leq \|g_k\| + |\beta_k| \frac{|g_k^T d_{k-1}|}{\|g_k\|^2} \|g_k\| + |\beta_k| \|d_{k-1}\| \leq \|g_k\| + 2 |\beta_k| \|d_{k-1}\| \leq \left(1 + 2 \frac{4L + t}{\gamma}\right) \|g_k\|.    (37)

Inserting this upper bound for $\|d_k\|$ into (32) yields $\sum_{k \geq 0} \|g_k\|^2 < +\infty$, which completes the proof.

For simplicity, when the update parameter $\beta_k$ in Algorithm 1 is computed by (16), we refer to the method as Algorithm MDL+-CG. In the following, we show that Algorithm MDL+-CG is globally convergent for general nonlinear functions under the Wolfe line search conditions (23) and (24). In the rest of this section, we assume that convergence does not occur, which implies that there exists a positive constant $\mu > 0$ such that

    \|g_k\| \geq \mu, \quad \forall k \geq 0.    (38)

Lemma 4. Suppose that Assumptions 1 and 2 hold. Let $\{x_k\}$ and $\{d_k\}$ be generated by Algorithm MDL+-CG and let $\alpha_k$ be obtained by the Wolfe line search (23) and (24). Then there exist positive constants $C_1$ and $C_2$ such that, for all $k \geq 1$,

    |\beta_k| \leq C_1 \|s_{k-1}\|,    (39)
    |\beta_k| \frac{|g_k^T d_{k-1}|}{\|g_k\|^2} \leq C_2 \|s_{k-1}\|.    (40)

Proof. From (14), (18), and (24), we have

    d_{k-1}^T \tilde{y}_{k-1} \geq d_{k-1}^T y_{k-1} \geq (\sigma_2 - 1) g_{k-1}^T d_{k-1} = (1 - \sigma_2) \|g_{k-1}\|^2.    (41)

Utilizing this with Lemma 1, Assumption 2, and relations (21) and (38), we have

    |\beta_k| \leq \frac{|g_k^T \tilde{y}_{k-1}|}{d_{k-1}^T \tilde{y}_{k-1}} + t \frac{|g_k^T s_{k-1}|}{d_{k-1}^T \tilde{y}_{k-1}} \leq \frac{4ML \|s_{k-1}\|}{(1 - \sigma_2)\mu^2} + \frac{tM \|s_{k-1}\|}{(1 - \sigma_2)\mu^2} = \frac{(4L + t)M}{(1 - \sigma_2)\mu^2} \|s_{k-1}\|.    (42)

Letting $C_1 = (4L + t)M / ((1 - \sigma_2)\mu^2)$, (39) is satisfied. Furthermore, by the Wolfe condition (24), we have

    g_k^T d_{k-1} \geq \sigma_2 g_{k-1}^T d_{k-1} \geq -\sigma_2 d_{k-1}^T \tilde{y}_{k-1} + \sigma_2 g_k^T d_{k-1}.    (43)

Also, observe that

    g_k^T d_{k-1} = d_{k-1}^T y_{k-1} + g_{k-1}^T d_{k-1} \leq d_{k-1}^T \tilde{y}_{k-1}.    (44)

By rearranging inequality (43), we obtain $g_k^T d_{k-1} \geq -(\sigma_2/(1 - \sigma_2)) d_{k-1}^T \tilde{y}_{k-1}$, and together with (44), we obtain

    |g_k^T d_{k-1}| \leq \max\left\{ \frac{\sigma_2}{1 - \sigma_2}, 1 \right\} d_{k-1}^T \tilde{y}_{k-1}.    (45)

It follows from Assumption 1 and (18), (24), (38), and (39) that

    |\beta_k| \frac{|g_k^T d_{k-1}|}{\|g_k\|^2} \leq \frac{\|g_k\| (\|\tilde{y}_{k-1}\| + t \|s_{k-1}\|)}{\|g_k\|^2} \cdot \frac{|g_k^T d_{k-1}|}{d_{k-1}^T \tilde{y}_{k-1}} \leq \frac{4L + t}{\mu} \max\left\{ \frac{\sigma_2}{1 - \sigma_2}, 1 \right\} \|s_{k-1}\|.    (46)

Letting $C_2 = (4L + t) \max\{\sigma_2/(1 - \sigma_2), 1\}/\mu$, we obtain (40).

Next, we present a lemma which shows that, asymptotically, the search directions change slowly.

Lemma 5. Suppose that Assumptions 1 and 2 hold. Let $\{x_k\}$ and $\{d_k\}$ be generated by Algorithm MDL+-CG and let $\alpha_k$ be obtained by the Wolfe line search (23) and (24). Then $d_k \neq 0$ and

    \sum_{k \geq 1} \|w_k - w_{k-1}\|^2 < +\infty,    (47)

where $w_k = d_k / \|d_k\|$.

Proof. Firstly, note that $d_k \neq 0$, for otherwise (18) would imply $g_k = 0$; therefore, $w_k$ is well defined. Next, we divide the formula for $\beta_k$ into two parts as follows:

    \beta_k = \beta_k^{(1)} + \beta_k^{(2)},    (48)

where

    \beta_k^{(1)} = \max\left\{ \frac{g_k^T \tilde{y}_{k-1}}{d_{k-1}^T \tilde{y}_{k-1}}, 0 \right\}, \quad \beta_k^{(2)} = -t \frac{g_k^T s_{k-1}}{d_{k-1}^T \tilde{y}_{k-1}}.    (49)

Moreover, let us define a vector $r_k$ and a scalar $\delta_k$ by

    r_k := \frac{\upsilon_k}{\|d_k\|}, \quad \delta_k := \beta_k^{(1)} \frac{\|d_{k-1}\|}{\|d_k\|},    (50)

where

    \upsilon_k = -\left(1 + \beta_k \frac{g_k^T d_{k-1}}{\|g_k\|^2}\right) g_k + \beta_k^{(2)} d_{k-1}.    (51)

Therefore, from (17), for $k \geq 1$, we obtain

    w_k = r_k + \delta_k w_{k-1}.    (52)

Using this relation together with the identity $\|w_k\| = \|w_{k-1}\| = 1$, we have

    \|r_k\| = \|w_k - \delta_k w_{k-1}\| = \|w_{k-1} - \delta_k w_k\|.    (53)

In addition, using this with the condition $\delta_k \geq 0$ and the triangle inequality, we get

    \|w_k - w_{k-1}\| \leq (1 + \delta_k) \|w_k - w_{k-1}\| \leq \|w_k - \delta_k w_{k-1}\| + \|w_{k-1} - \delta_k w_k\| = 2 \|r_k\|.    (54)

Now, we evaluate the quantity $\|\upsilon_k\|$. It follows from the definition of $\upsilon_k$ in (51), together with (21), (39), (40), and (45), and the bound $\|s_{k-1}\| \leq 2B$ implied by Assumption 1, that there exists a positive constant $D > 0$ such that

    \|\upsilon_k\| \leq \left(1 + |\beta_k| \frac{|g_k^T d_{k-1}|}{\|g_k\|^2}\right) \|g_k\| + t \frac{|g_k^T s_{k-1}|}{d_{k-1}^T \tilde{y}_{k-1}} \|d_{k-1}\| \leq (1 + 2BC_2)M + 2Bt \max\left\{ \frac{\sigma_2}{1 - \sigma_2}, 1 \right\} =: D.    (55)

From the previous relation, relation (38), and Lemma 3, we obtain

    \sum_{k \geq 0} \|r_k\|^2 = \sum_{k \geq 0} \frac{\|\upsilon_k\|^2}{\|d_k\|^2} \leq \frac{D^2}{\mu^4} \sum_{k \geq 0} \frac{\|g_k\|^4}{\|d_k\|^2} < +\infty.    (56)

Therefore, using this with (54), we complete the proof.
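Before proceeding, note that the sufficient descent identity (18), used repeatedly in the proofs above, holds for the direction (17) by construction, whatever the data. This can be checked numerically; the sketch below uses our own (hypothetical) variable names and values and takes $u = s$, as in the experiments of Section 4:

```python
import numpy as np

def mdl_direction(g, g_prev, d_prev, s, f_val, f_prev, t=0.07):
    """Sketch of the search direction (17) with beta_k from (15) and the
    adaptive modified secant vector (14); u = s, rho per the switch rule."""
    y = g - g_prev
    theta = 6.0 * (f_prev - f_val) + 3.0 * (g_prev + g) @ s        # (12)
    rho = 1.0 if np.linalg.norm(s) <= 1.0 else 0.0                 # switch in (14)
    y_tilde = y + rho * max(theta, 0.0) / (s @ s) * s              # (14), u = s
    beta = g @ (y_tilde - t * s) / (d_prev @ y_tilde)              # (15)
    return -(1.0 + beta * (g @ d_prev) / (g @ g)) * g + beta * d_prev  # (17)

# arbitrary illustrative data (hypothetical values, not from the paper)
g = np.array([1.0, 2.0, -1.0])
g_prev = np.array([0.5, 1.0, 0.0])
d_prev = np.array([-0.5, -1.0, 0.0])
s = np.array([0.1, -0.2, 0.3])
d = mdl_direction(g, g_prev, d_prev, s, f_val=1.0, f_prev=1.5)
# g @ d equals -(g @ g) up to rounding, i.e., identity (18)
```

Whatever the inputs, $g_k^T d_k = -\|g_k\|^2$ holds exactly up to rounding, since the coefficient of $g_k$ in (17) cancels the $\beta_k g_k^T d_{k-1}$ term.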

Let $\mathbb{Z}^+$ denote the set of positive integers. For $\lambda > 0$ and a positive integer $\Delta$, we define the set

    K_k^{\lambda,\Delta} := \{ i \in \mathbb{Z}^+ \mid k \leq i \leq k + \Delta - 1, \ \|s_{i-1}\| > \lambda \}.    (57)

Let $|K_k^{\lambda,\Delta}|$ denote the number of elements of $K_k^{\lambda,\Delta}$. The following lemma shows that if the gradients are bounded away from zero and Lemma 4 holds, then a certain fraction of the steps cannot be too small. This lemma is equivalent to Lemma 3.5 in [10] and Lemma 4.2 in [28].

Lemma 6. Suppose that all assumptions of Lemma 5 hold. Then there exists a constant $\lambda > 0$ such that, for any $\Delta \in \mathbb{Z}^+$ and any index $k_0$, there exists an index $k \geq k_0$ such that

    |K_k^{\lambda,\Delta}| > \frac{\Delta}{2}.    (58)

Next, making use of Lemmas 4, 5, and 6, we can establish the global convergence theorem for Algorithm MDL+-CG under the Wolfe line search for general functions. Its proof is similar to that of Theorem 3.6 in [10] and Theorem 4.3 in [28]; thus, we omit it.

Theorem 2. Suppose that Assumptions 1 and 2 hold. If $\{x_k\}$ is generated by Algorithm MDL+-CG and $\alpha_k$ is obtained by the Wolfe line search (23) and (24), then one has either $g_k = 0$ for some $k$ or

    \liminf_{k \to \infty} \|g_k\| = 0.    (59)

4. Numerical Experiments

In this section, we report numerical experiments performed on a set of 73 unconstrained optimization problems. These test problems, with the given initial points, can be found on Andrei Neculai's web site (http://camo.ici.ro/neculai/scalcg/testuo.pdf). Each test function was run with 1000, 5000, and 10000 variables, respectively. We compare the performance of our proposed conjugate gradient method MDL+-CG with that of the CG-DESCENT method [16]. The CG-DESCENT code, coauthored by Hager and Zhang, was obtained from Hager's web page (http://www.math.ufl.edu/~hager/papers/cg/). The implementation code was written in Fortran and compiled with the Intel Fortran compiler ifort (compiler settings: -O2 -double-size 128) on a PC (2.66 GHz Quad-Core processor, 4 Gbyte RAM) running the Linux operating system. All algorithms were implemented with the Wolfe line search proposed by Hager and Zhang [16], with the parameters set to their default values. In our experiments, the termination criterion is $\|g_k\| \leq 10^{-6}$, and we set $u_k = s_k$ as in [11]. In the sequel, we focus our interest on the experimental analysis of the best value of the parameter $t$; hence, we tested values of $t$ ranging from 0 to 1 in steps of 0.005. The detailed numerical results can be found at http://www.math.upatras.gr/~livieris/results/mdl_results.zip.

Figure 1 presents the percentage of the test problems that were successfully solved by Algorithm MDL+-CG for each choice of the parameter $t$, and Figure 2 presents the multigraph of means with respect to function and gradient evaluations.

Figure 1: Percentage of successfully solved problems by Algorithm MDL+-CG for each value of parameter t.

Figure 2: Multigraph of means with respect to the function evaluations (blue line) and gradient evaluations (red line).

Algorithm MDL+-CG reports the best results relative to the success rate for choices of $t$ in the interval [0.785, 0.825]. Moreover, for $t = 0.82$, Algorithm MDL+-CG achieves the highest success rate (98.64%), successfully solving 216 out of 219 of the test problems. Figures 1 and 2 also show that the influence of the parameter $t$ on the computational cost is more pronounced; hence, we focus our attention on the function and gradient evaluation metrics. Based on the experimental results on this limited test set, we conjecture that the optimal parameter $t$ with respect to the computational cost belongs to the interval [0.02, 0.115]. Notice that for $t = 0.07$, Algorithm MDL+-CG reports the least mean number of function and gradient evaluations. Furthermore, for values of $t$ in the intervals [0.32, 0.42], [0.61, 0.70], and [0.95, 1], Algorithm MDL+-CG exhibits its worst performance with respect to the computational cost. It is worth noticing that

the choice $t = 0.995$ exhibits the worst performance in terms of both computational cost and success rate.

We conclude our analysis by considering the performance profiles of Dolan and Moré [25] for the worst and the best choices of the parameter $t$. Performance profiles provide a wealth of information, such as solver efficiency, robustness, and probability of success, in compact form, and they eliminate both the influence of a small number of problems on the benchmarking process and the sensitivity of the results to the ranking of solvers [25]. The performance profile plots the fraction $P$ of problems for which any given method is within a factor $\tau$ of the best method. The left end of each curve gives the percentage of the test problems for which a method is the fastest (efficiency), while the right end gives the percentage of the test problems that were successfully solved by each method (robustness). The curves in the figures have the following meaning:

(i) CG-DESCENT stands for the CG-DESCENT method;
(ii) MDL+1 stands for Algorithm MDL+-CG with t = 0.07;
(iii) MDL+2 stands for Algorithm MDL+-CG with t = 0.82;
(iv) MDL+3 stands for Algorithm MDL+-CG with t = 0.995.

Figure 3: Performance profiles of CG-DESCENT, MDL+1, MDL+2, and MDL+3 based on function evaluations.

Figure 4: Performance profiles of CG-DESCENT, MDL+1, MDL+2, and MDL+3 based on gradient evaluations.

Figure 5: Performance profiles of CG-DESCENT, MDL+1, MDL+2, and MDL+3 based on CPU time.

Figures 3-5 present the performance profiles of CG-DESCENT, MDL+1, MDL+2, and MDL+3 relative to function evaluations, gradient evaluations, and CPU time (in seconds), respectively. Obviously, MDL+1 exhibits the best overall performance, significantly outperforming all the other conjugate gradient methods relative to all performance metrics. More analytically, MDL+1 solves about 64.4% and 66.2% of the test problems with the least number of function and gradient evaluations, respectively, while CG-DESCENT solves about 48.8% and 47% in the same situations. Moreover, MDL+2 is more efficient than CG-DESCENT, since it solves 55.3% and 53% of the test problems with the least number of function and gradient evaluations, respectively. As regards the CPU time metric, Figure 5 shows that MDL+1 reports the best performance, followed by MDL+2. More specifically, MDL+1 solves 68.4% of the test problems with the least CPU time, while MDL+2 solves about 58% of the test problems. In terms of robustness, MDL+2 and CG-DESCENT exhibit the best performance, successfully solving 216 out of 219 of the test problems. MDL+3 presents the worst performance, since its curves lie under the curves of the other conjugate gradient methods

regarding all performance metrics. In summary, based on the performance of MDL+1, MDL+2, and MDL+3, we point out that the choice of the parameter t crucially affects the efficiency of Algorithm MDL+-CG.

5. Conclusions and Future Research

In this paper, we proposed a conjugate gradient method that is a modification of the Dai-Liao method. An important property of the proposed method is that it ensures sufficient descent independently of the accuracy of the line search. Moreover, it achieves high-order accuracy in approximating the second-order curvature information of the objective function by utilizing the modified secant condition proposed by Babaie-Kafaki et al. [11]. Under mild conditions, we establish that the proposed method is globally convergent for general functions under the Wolfe line search conditions. The preliminary numerical results show that, with a good choice of the parameter t, the proposed algorithm performs very well. However, we have not yet theoretically established an optimal value of the parameter t, which constitutes our motivation for future research. Moreover, an interesting idea is to apply the proposed method to a variety of challenging real-world problems, such as protein folding problems [29].

References

[1] M. R. Hestenes and E. Stiefel, "Methods of conjugate gradients for solving linear systems," Journal of Research of the National Bureau of Standards, vol. 49, pp. 409-436, 1952.
[2] R. Fletcher and C. M. Reeves, "Function minimization by conjugate gradients," The Computer Journal, vol. 7, pp. 149-154, 1964.
[3] E. Polak and G. Ribière, "Note sur la convergence de méthodes de directions conjuguées," Revue Française d'Informatique et de Recherche Opérationnelle, vol. 3, no. 16, pp. 35-43, 1969.
[4] Y. Liu and C. Storey, "Efficient generalized conjugate gradient algorithms. I. Theory," Journal of Optimization Theory and Applications, vol. 69, no. 1, pp. 129-137, 1991.
[5] Y. H. Dai and Y. Yuan, "A nonlinear conjugate gradient method with a strong global convergence property," SIAM Journal on Optimization, vol. 10, no. 1, pp. 177-182, 1999.
[6] R. Fletcher, Practical Methods of Optimization, John Wiley & Sons, New York, NY, USA, 2nd edition, 1987.
[7] Y. H. Dai and Y. X. Yuan, Nonlinear Conjugate Gradient Methods, Shanghai Scientific and Technical, Shanghai, China, 2000.
[8] J. Nocedal and S. J. Wright, Numerical Optimization, Springer, New York, NY, USA, 1999.
[9] W. W. Hager and H. Zhang, "A survey of nonlinear conjugate gradient methods," Pacific Journal of Optimization, vol. 2, no. 1, pp. 35-58, 2006.
[10] Y. H. Dai and L. Z. Liao, "New conjugacy conditions and related nonlinear conjugate gradient methods," Applied Mathematics and Optimization, vol. 43, no. 1, pp. 87-101, 2001.
[11] S. Babaie-Kafaki, R. Ghanbari, and N. Mahdavi-Amiri, "Two new conjugate gradient methods based on modified secant equations," Journal of Computational and Applied Mathematics, vol. 234, no. 5, pp. 1374-1386, 2010.
[12] J. A. Ford, Y. Narushima, and H. Yabe, "Multi-step nonlinear conjugate gradient methods for unconstrained minimization," Computational Optimization and Applications, vol. 40, no. 2, pp. 191-216, 2008.
[13] G. Li, C. Tang, and Z. Wei, "New conjugacy condition and related new conjugate gradient methods for unconstrained optimization," Journal of Computational and Applied Mathematics, vol. 202, no. 2, pp. 523-539, 2007.
[14] W. Zhou and L. Zhang, "A nonlinear conjugate gradient method based on the MBFGS secant condition," Optimization Methods & Software, vol. 21, no. 5, pp. 707-714, 2006.
[15] H. Yabe and M. Takano, "Global convergence properties of nonlinear conjugate gradient methods with modified secant condition," Computational Optimization and Applications, vol. 28, no. 2, pp. 203-225, 2004.
[16] W. W. Hager and H. Zhang, "A new conjugate gradient method with guaranteed descent and an efficient line search," SIAM Journal on Optimization, vol. 16, no. 1, pp. 170-192, 2005.
[17] L. Zhang, W. Zhou, and D. Li, "Global convergence of a modified Fletcher-Reeves conjugate gradient method with Armijo-type line search," Numerische Mathematik, vol. 104, no. 4, pp. 561-572, 2006.
[18] W. Cheng and Q. Liu, "Sufficient descent nonlinear conjugate gradient methods with conjugacy condition," Numerical Algorithms, vol. 53, no. 1, pp. 113-131, 2010.
[19] Z. Dai and B. S. Tian, "Global convergence of some modified PRP nonlinear conjugate gradient methods," Optimization Letters, vol. 5, no. 4, pp. 615-630, 2011.
[20] S. Q. Du and Y. Y. Chen, "Global convergence of a modified spectral FR conjugate gradient method," Applied Mathematics and Computation, vol. 202, no. 2, pp. 766-770, 2008.
[21] A. Lu, H. Liu, X. Zheng, and W. Cong, "A variant spectral-type FR conjugate gradient method and its global convergence," Applied Mathematics and Computation, vol. 217, no. 12, pp. 5547-5552, 2011.
[22] L. Zhang, "Two modified Dai-Yuan nonlinear conjugate gradient methods," Numerical Algorithms, vol. 50, no. 1, pp. 1-16, 2009.
[23] L. Zhang, "New versions of the Hestenes-Stiefel nonlinear conjugate gradient method based on the secant condition for optimization," Computational & Applied Mathematics, vol. 28, no. 1, pp. 111-133, 2009.
[24] L. Zhang and W. Zhou, "Two descent hybrid conjugate gradient methods for optimization," Journal of Computational and Applied Mathematics, vol. 216, no. 1, pp. 251-264, 2008.
[25] E. D. Dolan and J. J. Moré, "Benchmarking optimization software with performance profiles," Mathematical Programming, vol. 91, no. 2, pp. 201-213, 2002.
[26] J. Z. Zhang, N. Y. Deng, and L. H. Chen, "New quasi-Newton equation and related methods for unconstrained optimization," Journal of Optimization Theory and Applications, vol. 102, no. 1, pp. 147-167, 1999.
[27] J. Zhang and C. Xu, "Properties and numerical performance of quasi-Newton methods with modified quasi-Newton equations," Journal of Computational and Applied Mathematics, vol. 137, no. 2, pp. 269-278, 2001.
[28] J. C. Gilbert and J. Nocedal, "Global convergence properties of conjugate gradient methods for optimization," SIAM Journal on Optimization, vol. 2, no. 1, pp. 21-42, 1992.
[29] J. D. Bryngelson, J. N. Onuchic, N. D. Socci, and P. G. Wolynes, "Funnels, pathways, and the energy landscape of protein folding: a synthesis," Proteins, vol. 21, no. 3, pp. 167-195, 1995.
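Remark. The numerical comparisons in Section 4 rely on the performance profiles of Dolan and Moré [25]. As a brief illustration of the metric (not of the paper's actual data), the following Python sketch computes a profile value: for each solver s and problem p, the performance ratio is r(p, s) = t(p, s) / min over solvers of t(p, ·), and the profile rho_s(tau) is the fraction of problems on which solver s is within a factor tau of the best solver. The cost matrix below is hypothetical.

```python
import math


def performance_profile(costs, tau):
    """Dolan-More performance profile value rho_s(tau) for each solver.

    costs[p][s] is the cost (e.g., CPU time or function evaluations)
    of solver s on problem p; math.inf marks a failure.
    """
    n_probs = len(costs)
    n_solvers = len(costs[0])
    rho = []
    for s in range(n_solvers):
        count = 0
        for p in range(n_probs):
            best = min(costs[p])  # cost of the best solver on problem p
            # solver s "wins within factor tau" if its ratio r(p, s) <= tau
            if costs[p][s] <= tau * best:
                count += 1
        rho.append(count / n_probs)
    return rho


# Hypothetical data: 4 problems, 2 solvers; solver 0 fails on the last one.
costs = [[1.0, 2.0],
         [3.0, 1.5],
         [2.0, 2.0],
         [math.inf, 4.0]]

print(performance_profile(costs, 1.0))  # fraction solved with the least cost
print(performance_profile(costs, 2.0))  # fraction solved within twice the best
```

At tau = 1 the profile gives each solver's share of outright wins (the percentages quoted above for MDL+1, MDL+2, and CG-DESCENT), while rho_s(tau) for large tau measures robustness, that is, the fraction of problems a solver eventually solves at all.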
