A derivative-free nonmonotone line search and its application to the spectral residual method

IMA Journal of Numerical Analysis (2009) 29, 814-825
doi:10.1093/imanum/drn019
Advance Access publication on November 14, 2008

WANYOU CHENG (corresponding author), College of Software, Dongguan University of Technology, Dongguan 523000, China. Email: chengwanyou421@yahoo.com.cn

AND

DONG-HUI LI, College of Mathematics and Econometrics, Hunan University, Changsha 410082, China. Email: dhli@hnu.cn

[Received on 31 May 2007; revised on 23 February 2008]

© The author 2008. Published by Oxford University Press on behalf of the Institute of Mathematics and its Applications. All rights reserved.

In this paper we propose a derivative-free nonmonotone line search for solving large-scale nonlinear systems of equations. Under appropriate conditions, we show that the spectral residual method with this line search is globally convergent. We also present some numerical experiments. The results show that the spectral residual method with the new nonmonotone line search is promising.

Keywords: large-scale nonlinear systems; spectral residual method; nonmonotone line search.

1. Introduction

We consider the nonlinear system of equations

    F(x) = 0,   (1.1)

where F is a continuously differentiable mapping from R^n into itself. We are interested in large-scale systems for which the Jacobian of F(x) is not available. In this case, to solve the problem, we need to use derivative-free methods. Derivative-free methods for solving (1.1) include the well-known quasi-Newton methods (Dennis & Moré, 1977; Martínez, 1990, 1992, 2000; Li & Fukushima, 1999; Zhou & Li, 2007, 2008) and the recently developed spectral residual method (La Cruz & Raydan, 2003; La Cruz et al., 2006; Zhang & Zhou, 2006).

In a derivative-free method, a derivative-free line search technique is necessary. To the authors' knowledge, the earliest derivative-free line search is due to Griewank (1986). A favourable property of the line search in Griewank (1986) is that it produces an iterative method possessing a norm descent property. However, when the gradient of the function ‖F(x)‖^2 at x_k is orthogonal to F(x_k), that line search may fail. The first well-defined derivative-free line search for solving nonlinear systems of equations was proposed by Li & Fukushima (2000). The line search in Li & Fukushima (2000) provides the iterative method with an approximate norm descent property. It was shown that the Broyden-like quasi-Newton method with this line search is globally and superlinearly convergent (Li & Fukushima, 2000). Birgin et al. (2003) proposed another derivative-free line search where the amount of norm reduction required is proportional to the residual norm. Based on this line search, they proposed an inexact quasi-Newton method and established its global convergence.

Solodov & Svaiter (1999) proposed a nice derivative-free line search and developed an inexact Newton method for solving monotone equations. We refer to a recent review paper (Li & Cheng, 2007) for a summary of derivative-free line searches.

Quite recently, La Cruz et al. (2006) proposed a new derivative-free line search in which the stepsize α_k is determined by the following inequality:

    f(x_k + α_k d_k) ≤ max_{0 ≤ j ≤ min{k, M-1}} f(x_{k-j}) + θ_k - γ_1 α_k^2 f(x_k),   (1.2)

where f is a merit function such that f(x) = 0 if and only if F(x) = 0, d_k is the direction generated by some iterative method, M is a positive integer, γ_1 ∈ (0, 1) and the positive sequence {θ_k} satisfies ∑_{k=0}^∞ θ_k < ∞. The line search (1.2) is a combination of the nonmonotone line search in Grippo et al. (1986) and the derivative-free line search in Li & Fukushima (2000). The term max_{0 ≤ j ≤ min{k, M-1}} f(x_{k-j}) comes from the well-known nonmonotone line search in Grippo et al. (1986) for solving unconstrained optimization problems. The purpose of this term is to enlarge the possibly small stepsize generated by the line search in Li & Fukushima (2000). With this line search, the spectral residual method is globally convergent (La Cruz et al., 2006). The numerical results reported in La Cruz et al. (2006) showed that this line search works very well.

As pointed out by Zhang & Hager (2004), although the nonmonotone technique in Grippo et al. (1986) works well in many cases, it has some drawbacks. First, a good function value generated in any iteration is essentially discarded because of the max term. Second, in some cases the numerical performance is very dependent on the choice of M (Grippo et al., 1986; Raydan, 1997). Furthermore, Dai (2002) gave an example showing that, although an iterative method may generate R-linearly convergent iterates for a strongly convex function, the iterates may not satisfy the nonmonotone line search of Grippo et al. (1986) for k sufficiently large, for any fixed bound M on the memory. We refer to Dai (2002) and Zhang & Hager (2004) for details. To overcome these drawbacks, Zhang & Hager (2004) recently proposed a new nonmonotone line search that requires an average of the successive function values to decrease. Under appropriate conditions, they established the global convergence and R-linear convergence of several iterative methods, including the limited memory BFGS method, for solving strongly convex minimization problems. Their reported numerical results showed that the new nonmonotone line search was superior to the traditional nonmonotone line search in Grippo et al. (1986).

The purpose of this paper is to extend the nonmonotone line search proposed by Zhang & Hager (2004) to the spectral residual method for solving nonlinear systems of equations. The remainder of the paper is organized as follows. In Section 2 we propose the derivative-free nonmonotone line search and the algorithm. In Section 3 we establish the global convergence of the algorithm. We report some numerical results in Section 4. Throughout the paper we use J(x) to denote the Jacobian matrix of F(x). We use ‖·‖ to denote the Euclidean norm of vectors. We denote by N the set of positive integers.
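Before turning to the new line search, a minimal Python sketch of the acceptance test (1.2) may help fix ideas. The function name and argument layout are ours and are not part of any of the codes discussed in this paper; the caller supplies the merit values and the summable sequence {θ_k}.

```python
def accepts_max_type(f_new, f_hist, f_k, alpha, theta_k, gamma1, M=10):
    """Check the nonmonotone condition (1.2) of La Cruz et al. (2006):
    f(x_k + alpha*d_k) <= max_{0<=j<=min(k,M-1)} f(x_{k-j}) + theta_k - gamma1*alpha**2*f(x_k).
    f_hist is the list [f(x_0), ..., f(x_k)] of merit-function values computed so far."""
    f_ref = max(f_hist[-M:])  # the largest of the last min(k+1, M) merit values
    return f_new <= f_ref + theta_k - gamma1 * alpha ** 2 * f_k
```

In a backtracking loop, alpha would be reduced and f_new recomputed until this test returns True; since θ_k > 0, the loop terminates after finitely many reductions.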
2. The derivative-free nonmonotone line search and the algorithm

In this section, based on the nonmonotone line search proposed by Zhang & Hager (2004), we propose a derivative-free nonmonotone line search and apply it to the spectral residual method proposed by La Cruz et al. (2006). Let us briefly recall the Zhang-Hager nonmonotone line search technique for solving the unconstrained optimization problem

    min f(x), x ∈ R^n,

where f: R^n → R is continuously differentiable. Suppose that d_k ∈ R^n is a descent direction of f at x_k, i.e. ∇f(x_k)^T d_k < 0, where ∇f(x_k) denotes the gradient of f at x_k. In the Zhang-Hager line search the stepsize α_k satisfies the following Armijo-type condition:

    f(x_k + α_k d_k) ≤ C_k + β α_k ∇f(x_k)^T d_k,

where β ∈ (0, 1), C_0 = f(x_0) and C_k is updated by the following rules:

    Q_{k+1} = η_k Q_k + 1,   C_{k+1} = (η_k Q_k C_k + f_{k+1}) / Q_{k+1},

with Q_0 = 1 and η_k ∈ [0, 1], where f_{k+1} is an abbreviation of f(x_{k+1}). This line search strategy ensures that an average of the successive function values decreases. The choice of η_k controls the degree of nonmonotonicity. In fact, if η_k = 0 for all k, then the line search is the usual monotone Armijo line search. As η_k → 1 the line search becomes more nonmonotone, treating all the previous function values with equal weight when we compute C_k.
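As a quick illustration of the recursion above, here is a one-step update in Python; the function name is made up, and the docstring restates the two limiting cases mentioned in the text.

```python
def zhang_hager_update(Q_k, C_k, f_next, eta_k):
    """One step of the Zhang-Hager reference-value recursion:
        Q_{k+1} = eta_k * Q_k + 1,
        C_{k+1} = (eta_k * Q_k * C_k + f_{k+1}) / Q_{k+1}.
    eta_k = 0 gives C_{k+1} = f_{k+1} (the monotone Armijo rule), while eta_k = 1 for all k
    makes C_{k+1} the plain average of f_0, ..., f_{k+1}."""
    Q_next = eta_k * Q_k + 1.0
    C_next = (eta_k * Q_k * C_k + f_next) / Q_next
    return Q_next, C_next
```

For intermediate values of η_k, C_k is a convex combination of all merit values computed so far, which is why it is a gentler reference value than the max term in (1.2).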

In what follows we extend this idea to develop a derivative-free nonmonotone line search for nonlinear systems of equations. Let α_k satisfy the following condition:

    f(x_k + α_k d_k) ≤ C_k + ε_k - γ α_k^2 f(x_k),   (2.1)

where γ ∈ (0, 1), C_0 = f_0, the positive sequence {ε_k} satisfies ∑_{k=0}^∞ ε_k < ∞, f is a merit function such that f(x) = 0 if and only if F(x) = 0, and C_k is updated by the following rules:

    Q_{k+1} = η_k Q_k + 1,   C_{k+1} = (η_k Q_k (C_k + ε_k) + f_{k+1}) / Q_{k+1},

with Q_0 = 1 and η_k ∈ [0, 1]. The line search condition (2.1) is a combination of the nonmonotone line search in Zhang & Hager (2004) and the nonmonotone derivative-free line search in Li & Fukushima (2000). If ε_k = 0 then the update rule for C_k is the same as that in Zhang & Hager (2004).

We will apply the derivative-free nonmonotone line search (2.1) to the spectral residual method (La Cruz et al., 2006). In the remainder of the paper we let f(x) = (1/2)‖F(x)‖^2. We now state the steps of the spectral residual method with the derivative-free line search (2.1) for solving nonlinear systems of equations.

ALGORITHM 2.1 (N-DF-SANE)

Step 1: Given the starting point x_0 and constants ε > 0, 0 ≤ η_min ≤ η_max ≤ 1, 0 < ρ_min < ρ_max < 1, 0 < σ_min < σ_max and γ ∈ (0, 1), choose a positive sequence {ε_k} satisfying

    ∑_{k=0}^∞ ε_k ≤ ε.   (2.2)

Set C_0 = f(x_0), Q_0 = 1, σ_0 = 1 and k = 0.

Step 2: If ‖F_k‖ ≤ ε then stop.

Step 3: Compute d_k = -σ_k F(x_k), where σ_k ∈ [σ_min, σ_max] (the spectral coefficient). Set α_+ = 1 and α_- = 1.

Step 4: Nonmonotone line search. If

    f(x_k + α_+ d_k) ≤ C_k + ε_k - γ α_+^2 f(x_k),   (2.3)

then set α_k = α_+ and x_{k+1} = x_k + α_k d_k. Else, if

    f(x_k - α_- d_k) ≤ C_k + ε_k - γ α_-^2 f(x_k),

then set α_k = α_-, d_k = -d_k and x_{k+1} = x_k + α_k d_k. Else choose α_{+new} ∈ [ρ_min α_+, ρ_max α_+] and α_{-new} ∈ [ρ_min α_-, ρ_max α_-], replace α_+ by α_{+new} and α_- by α_{-new}, and go to Step 4.

Step 5: Choose η_k ∈ [η_min, η_max] and compute

    Q_{k+1} = η_k Q_k + 1,   C_{k+1} = (η_k Q_k (C_k + ε_k) + f_{k+1}) / Q_{k+1}.   (2.4)

Set k = k + 1 and go to Step 2.

The following lemma shows that, for any choice of η_k ∈ [0, 1], C_k lies between f_k and

    A_k = (1/(k + 1)) ∑_{i=0}^k (f_i + i ε_{i-1}),   (2.5)

where ε_{-1} = 0. This implies that the line search process is well defined.

LEMMA 2.2 The iterates generated by Algorithm 2.1 satisfy f_k ≤ C_k ≤ A_k for all k ≥ 0, where A_k is defined by (2.5). Moreover, the sequence {C_k} satisfies

    C_k ≤ C_{k-1} + ε_{k-1}.   (2.6)

Proof. First, by Step 4 of Algorithm 2.1, we have

    f_k ≤ C_{k-1} + ε_{k-1}.   (2.7)

By (2.4) and (2.7), we get

    C_k = (η_{k-1} Q_{k-1} (C_{k-1} + ε_{k-1}) + f_k) / Q_k ≥ f_k.

This establishes the lower bound for C_k. We also have from (2.4) and (2.7) that C_k ≤ C_{k-1} + ε_{k-1}. We now derive C_k ≤ A_k by induction. Since C_0 = f_0 and ε_{-1} = 0, we obviously have C_0 = A_0. Suppose that C_j ≤ A_j for all 0 ≤ j < k. Since η_k ∈ [0, 1] and Q_0 = 1, we have

    Q_{j+1} = η_j Q_j + 1 ≤ Q_j + 1 ≤ j + 2.   (2.8)

Define h_k(t): R_+ → R_+ by

    h_k(t) = (t (C_{k-1} + ε_{k-1}) + f_k) / (t + 1).

It is easy to see from (2.7) that h_k is monotonically nondecreasing in t. So we have

    C_k = h_k(η_{k-1} Q_{k-1}) = h_k(Q_k - 1) ≤ h_k(k).   (2.9)

By the inductive assumption, we obtain

    h_k(k) = (k (C_{k-1} + ε_{k-1}) + f_k) / (k + 1) ≤ (k A_{k-1} + k ε_{k-1} + f_k) / (k + 1) = A_k.

The last inequality together with (2.9) implies that C_k ≤ A_k.
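The two bounds of Lemma 2.2 can also be checked numerically. The short Python script below simulates the update (2.4) with synthetic merit values that respect (2.7) and randomly chosen η_{k-1} ∈ [0, 1]; the random model and all names are ours, purely for illustration.

```python
import random

random.seed(0)
eps = [1.0 / (1 + i) ** 2 for i in range(200)]   # a positive summable sequence, as required by (2.2)
f_0 = 5.0
C, Q = f_0, 1.0                                  # C_0 = f_0, Q_0 = 1
A_sum = f_0                                      # running sum of f_i + i*eps_{i-1}, with eps_{-1} = 0
for k in range(1, 200):
    f_k = random.uniform(0.0, C + eps[k - 1])    # any accepted f_k satisfies (2.7): f_k <= C_{k-1} + eps_{k-1}
    eta = random.random()                        # eta_{k-1} drawn from [0, 1]
    Q_new = eta * Q + 1.0                        # update (2.4)
    C = (eta * Q * (C + eps[k - 1]) + f_k) / Q_new
    Q = Q_new
    A_sum += f_k + k * eps[k - 1]
    # Lemma 2.2: f_k <= C_k <= A_k, with A_k from (2.5)
    assert f_k <= C + 1e-12 and C <= A_sum / (k + 1) + 1e-12
print("Lemma 2.2 bounds hold for every simulated iteration.")
```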

REMARK 2.3 Since ε_k > 0, after a finite number of reductions of α_+ the condition

    f(x_k + α_+ d_k) ≤ f_k + ε_k - γ α_+^2 f(x_k)

necessarily holds. From Lemma 2.2 we know that f_k ≤ C_k. So the line search process, i.e. Step 4 of Algorithm 2.1, is well defined.

3. Global convergence

This section is devoted to the global convergence of Algorithm 2.1. Let Ω be the level set defined by

    Ω = {x ∈ R^n : f(x) ≤ f(x_0) + ε},

where ε is a positive constant satisfying (2.2). We first prove the following two lemmas.

LEMMA 3.1 The sequence {x_k} generated by Algorithm 2.1 is contained in Ω.

Proof. From Step 4 of Algorithm 2.1 we have, for each k,

    f(x_{k+1}) ≤ C_k + ε_k.

It then follows from (2.6) that

    f(x_{k+1}) ≤ C_{k-1} + ε_{k-1} + ε_k ≤ ··· ≤ f(x_0) + ∑_{i=0}^k ε_i ≤ f(x_0) + ε.

The proof is complete.

LEMMA 3.2 Let the sequence {x_k} be generated by Algorithm 2.1. Then there exists an infinite index set K ⊂ N such that

    lim_{k→∞, k∈K} α_k^2 f(x_k) = 0.   (3.1)

Moreover, if η_max < 1 then

    lim_{k→∞} α_k^2 f(x_k) = 0.   (3.2)

Proof. From Step 4 of Algorithm 2.1 we have

    f(x_{k+1}) ≤ C_k + ε_k - γ α_k^2 f_k.

Together with (2.4), this implies that

    C_{k+1} = (η_k Q_k (C_k + ε_k) + f_{k+1}) / Q_{k+1}
            ≤ ((η_k Q_k + 1)(C_k + ε_k) - γ α_k^2 f_k) / Q_{k+1}
            = C_k + ε_k - γ α_k^2 f_k / Q_{k+1}.   (3.3)

So we get from (2.2) that

    ∑_{k=0}^∞ α_k^2 f_k / Q_{k+1} < ∞.   (3.4)

If lim inf_{k→∞} α_k^2 f_k ≠ 0 then (3.4) would be violated, since Q_{k+1} ≤ k + 2 by (2.8). Hence (3.1) holds. If η_max < 1 then

    Q_{k+1} = 1 + ∑_{j=0}^k ∏_{i=0}^j η_{k-i} ≤ 1 + ∑_{j=0}^k η_max^{j+1} ≤ ∑_{j=0}^∞ η_max^j = 1/(1 - η_max).

Consequently, (3.2) follows immediately from (3.4).

The following theorem establishes the global convergence of Algorithm 2.1. It is similar to, but slightly stronger than, Theorem 1 in La Cruz et al. (2006).

THEOREM 3.3 Let the sequence {x_k} be generated by Algorithm 2.1 and let η_max < 1. Then every limit point x* of {x_k} satisfies

    F(x*)^T J(x*) F(x*) = 0.   (3.5)

In particular, if F is strict, namely F or -F is strictly monotone, then the whole sequence {x_k} converges to the unique solution of (1.1).

Proof. Let x* be an arbitrary limit point of {x_k}. Then there exists an infinite index set K_1 ⊂ N such that lim_{k→∞, k∈K_1} x_k = x*. By (3.2), we have lim_{k→∞, k∈K_1} α_k^2 f_k = 0.

Case I: If lim sup_{k→∞, k∈K_1} α_k ≠ 0 then there exists an infinite index set K_2 ⊂ K_1 such that {α_k}_{k∈K_2} is bounded away from zero. By (3.2), we have lim_{k→∞, k∈K_2} f(x_k) = 0. This implies (3.5).

Case II: If

    lim_{k→∞, k∈K_1} α_k = 0   (3.6)

then there exists an index k_0 ∈ K_1 such that α_k < 1 for all k ≥ k_0 with k ∈ K_1. Let m_k denote the number of inner iterations in Step 4 (i.e. the inequalities in Step 4 were violated m_k times), and let α_k^+ and α_k^- be the values of α_+ and α_-, respectively, in the last unsuccessful line search step at iteration k of N-DF-SANE.

Then we have

    α_k ≥ ρ_min^{m_k},   k > k_0, k ∈ K_1.

From the choice of α_{+new} and α_{-new} we have

    α_k^+ ≤ ρ_max^{m_k - 1}  and  α_k^- ≤ ρ_max^{m_k - 1}.

Since ρ_max < 1 and lim_{k→∞, k∈K_1} m_k = ∞, we get

    lim_{k→∞, k∈K_1} α_k^+ = lim_{k→∞, k∈K_1} α_k^- = 0.

By the line search rule, we obtain, for all k ∈ K_1 with k ≥ k_0,

    f(x_k + α_k^+ d_k) > C_k + ε_k - γ (α_k^+)^2 f_k   (3.7)

and

    f(x_k - α_k^- d_k) > C_k + ε_k - γ (α_k^-)^2 f_k.

Since C_k ≥ f_k ≥ 0, inequality (3.7) implies that

    f(x_k + α_k^+ d_k) > f_k - γ (α_k^+)^2 f_k.

From Lemma 3.1 we know that f_k ≤ c := f(x_0) + ε. So we have

    f(x_k + α_k^+ d_k) - f_k > -c γ (α_k^+)^2.

In a way similar to the proof of Theorem 1 in La Cruz et al. (2006), repeating the above process, we can prove (3.5).

4. Numerical experiments

In this section we test Algorithm 2.1, which we call the N-DF-SANE method, and compare it with the DF-SANE method (La Cruz et al., 2006). The set of test problems was described in La Cruz et al. (2004). The N-DF-SANE code was written in Fortran 77 in double-precision arithmetic. The experiments were carried out on a PC (1.6 GHz CPU, 256 Mb memory) with a Windows operating system.

We implemented N-DF-SANE with the following parameters: η_k = 0.85, σ_min = 10^{-10}, σ_max = 10^{10}, ρ_min = 0.1, ρ_max = 0.5, γ = 10^{-4} and ε_k = ‖F(x_0)‖/(1 + k)^2 for all k ≥ 0. For each test problem we used the same termination criterion as that in La Cruz et al. (2006). Specifically, we stopped the iteration if the following inequality was satisfied:

    ‖F(x_k)‖/√n ≤ e_a + e_r ‖F(x_0)‖/√n,   (4.1)

where e_a = 10^{-5} and e_r = 10^{-4}.
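For readers who want to experiment, the following Python sketch puts Algorithm 2.1 together with the parameter choices and the stopping rule (4.1) described above. It is only a sketch, not a reproduction of the Fortran codes used in the experiments: the function name is ours, the backtracking factor is fixed at the midpoint of [ρ_min, ρ_max], and the spectral coefficient is a safeguarded s^T s / s^T y choice in the spirit of La Cruz et al. (2006).

```python
import numpy as np

def n_df_sane(F, x0, eta=0.85, gamma=1e-4, rho=0.3, sigma_min=1e-10,
              sigma_max=1e10, e_a=1e-5, e_r=1e-4, max_iter=1000):
    """Illustrative sketch of Algorithm 2.1 (N-DF-SANE) with f(x) = 0.5*||F(x)||^2."""
    x = np.asarray(x0, dtype=float)
    Fx = F(x)
    sqrt_n = np.sqrt(x.size)
    norm_F0 = np.linalg.norm(Fx)
    f = 0.5 * norm_F0 ** 2
    C, Q, sigma = f, 1.0, 1.0
    for k in range(max_iter):
        if np.linalg.norm(Fx) / sqrt_n <= e_a + e_r * norm_F0 / sqrt_n:  # stopping rule (4.1)
            break
        eps_k = norm_F0 / (1.0 + k) ** 2          # epsilon_k = ||F(x_0)|| / (1 + k)^2
        d = -sigma * Fx                           # spectral residual direction (Step 3)
        alpha_p = alpha_m = 1.0
        while True:                               # derivative-free nonmonotone line search (Step 4)
            x_new = x + alpha_p * d
            F_new = F(x_new)
            f_new = 0.5 * np.dot(F_new, F_new)
            if f_new <= C + eps_k - gamma * alpha_p ** 2 * f:   # condition (2.3)
                break
            x_new = x - alpha_m * d
            F_new = F(x_new)
            f_new = 0.5 * np.dot(F_new, F_new)
            if f_new <= C + eps_k - gamma * alpha_m ** 2 * f:
                break
            alpha_p *= rho                        # fixed factor 0.3, the midpoint of [rho_min, rho_max] = [0.1, 0.5]
            alpha_m *= rho
        s, y = x_new - x, F_new - Fx
        x, Fx, f = x_new, F_new, f_new
        sty = float(np.dot(s, y))                 # safeguarded spectral coefficient for the next iteration
        sigma = float(np.dot(s, s)) / sty if abs(sty) > 1e-30 else sigma_max
        sigma = min(max(abs(sigma), sigma_min), sigma_max)
        Q_new = eta * Q + 1.0                     # update (2.4)
        C = (eta * Q * (C + eps_k) + f) / Q_new
        Q = Q_new
    return x, k

# e.g. x, iters = n_df_sane(lambda x: x**3 - 1.0, np.full(1000, 2.0))
```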

TABLE 1

              N-DF-SANE                DF-SANE
Pro   Dim        IT    NF       T        IT    NF       T
1     1000        5     5       0         5     5       0
1     10000       2     2       0         2     2       0
2     1000      195   203  0.0781       210   218  0.0625
2     5000       37    55  0.0938        38    52  0.1406
3     1000       13    20       0        14    21       0
3     10000      62   152  0.5469        48    90  0.2500
4     1000       27   100       0       243  1188  0.0781
4     10000      27   100  0.1406         *     *       *
5     50        798  2701  0.0313       573  2259  0.0156
5     100         *     *       *         *     *       *
6     100         1     1       0         3     3       0
6     10000       2     4       0         3     3       0
7     100       257   306       0        23    29       0
7     500       225   263  0.0312        23    29       0
8     1000        1     1       0         1     1       0
8     10000       1     1       0         1     1       0
9     100         3     3       0         6     6       0
9     1000        4     5  0.2188         6     6  0.2969
10    1000        2    12       0         2    12       0
10    10000       2    12  0.0313         2    12  0.0313
11    1000       17    49  0.0156        17    49  0.0156
11    5000       17    49  0.0469        17    49  0.0469
12    1000        6     7  0.0156        30    62  0.0313
12    10000       4     4  0.0469        23    59  0.3438
13    100         3     7       0         3     7       0
13    1000        4     8       0         4     8       0
14    1000       12    18       0        12    18  0.0156
14    10000      12    20  0.0313        12    20  0.0156
15    1000        5     5       0         5     5       0
15    10000       5     5       0         5     5       0
16    1000       13    15       0        14    16       0
16    5000       16    16  0.0156        17    17  0.0156
17    1000        7     9       0         7     9       0
17    10000       7     9  0.0313         7     9  0.0313
18    50         10    12       0        19    21       0
18    100         *     *       *         *     *       *
19    1000        5     5       0         5     5       0
19    10000       5     5  0.0156         5     5  0.0156
20    100        29    41       0        40    42       0
20    1000       95   193  0.0781        44    62  0.0313
21    399         4     6       0         5     7       0
21    9999        5     7  0.0156         5     7  0.0156
22    1000        1     2       0         1     2       0
22    10000       1     2       0         1     2       0

TABLE 2

              N-DF-SANE                DF-SANE
Pro   Dim        IT    NF       T        IT    NF       T
23    500         2    18       0         2    18       0
23    1000        2    20       0         2    20       0
24    100        27    34       0        27    34       0
24    500        33    78       0        59    99  0.0156
25    100         2     6       0         2     6       0
25    500         3     9       0         3     9       0
26    1000        1     1       0         1     1       0
26    10000       1     1       0         1     1       0
27    50         10    10  0.0156        10    10  0.0313
27    100        11    11  0.1250        11    11  0.1250
28    100         1     1       0         1     1       0
28    1000        1     1       0         1     1       0
29    100         7    11  0.1406         1     5       0
29    1000        7    11       0         1     5       0
30    99         12    21       0        11    16       0
30    9999       12    21  0.0781        11    16  0.0469
31    1000        6     6       0         6     6       0
31    5000        6     6  0.0156         6     6  0.0156
32    500         6     7       0         6     7       0
32    1000        6     7       0         6     7       0
33    1000        7    21       0        22    46  0.0156
33    5000        4    18  0.0156         4    16  0.0156
34    1000       37    85  0.0156        63   113  0.0156
34    5000       12    18  0.0156        12    18  0.0156
35    1000        4     8       0        21    27       0
35    5000       22    32  0.0469        38    48  0.0156
36    100        29    43       0        26    34       0
36    500        35    49       0        41    64  0.0156
37    1000       61   187  0.0313        21    27  0.0156
37    5000       56   178  0.1406        21    27       0
38    1000       13    27       0        25    30  0.0156
38    5000       22    36  0.0156        25    30  0.0313
39    1000       17    29       0        14    20       0
39    5000       17    29  0.0313        14    20  0.0156
40    1000        1     1       0         1     1       0
40    5000        1     1       0         1     1       0
41    500         9     9       0         7     9       0
41    1000        3     3       0         3     3       0
42    1000       19    27       0       102   198  0.0156
42    5000       19    27  0.0313       102   198  0.0781
43    100        26    44       0        86   108  0.0156
43    500        39    71       0       482   926  0.1094
44    1000        2     2       0         4     4       0
44    5000        3     3  0.0156         3     3  0.0156

A limit of 1000 iterations was also imposed. We chose α_{+new}, α_{-new} and the spectral stepsize σ_k in the same way as in La Cruz et al. (2006). We implemented the DF-SANE algorithm with the following parameters: nexp = 2, σ_min = 10^{-10}, σ_max = 10^{10}, σ_0 = 1, τ_min = 0.1, τ_max = 0.5, γ = 10^{-4}, M = 10 and η_k = ‖F(x_0)‖/(1 + k)^2 for all k ∈ N. The DF-SANE code was provided by Prof. Raydan.

In Tables 1 and 2 we report the dimension of each test problem (Dim), the number of iterations (IT), the number of function evaluations (NF) and the CPU time in seconds (T). In the tables, Pro denotes the number of the test problem as it appears in La Cruz et al. (2004, 2006), and the symbol * indicates that the related algorithm failed. We see from Tables 1 and 2 that in many cases the numbers of iterations, the numbers of function evaluations and the CPU times of the two algorithms are identical.

TABLE 3

                     DF-SANE
              M = 5                M = 10               M = 20
Pro   Dim       IT   NF       T      IT   NF       T      IT   NF       T
2     1000      40   54  0.0156     210  218  0.0625     153  157  0.0469
2     5000      37   55  0.1506      38   52  0.1406     339  345  0.4688
4     1000     131  247  0.0256     243 1188  0.0781     130  326  0.0313
4     10000      *    *       *       *    *       *       *    *       *
7     100       52   71       0      23   29       0      23   29       0
7     500       75   97  0.0156      23   29       0      23   29  0.0156
20    100       30   42       0      40   42       0      40   42       0
20    1000      56   96  0.0313      44   62  0.0313      49   57  0.0156
34    1000      13   25       0      63  113  0.0156      28   38       0
34    5000      12   18       0      12   18  0.0156      12   18  0.0156
42    1000      20   26       0     102  198  0.0156      44   52       0
42    5000      20   26  0.0156     102  198  0.0781      44   52  0.0313

TABLE 4

                     N-DF-SANE
              η_k = 0.1            η_k = 0.5            η_k = 0.9
Pro   Dim       IT   NF       T      IT   NF       T      IT   NF       T
2     1000      41   55  0.0156     195  203  0.0781     195  203  0.0781
2     5000     225  245  0.4688      37   55  0.0938      37   55  0.0938
4     1000      27  100  0.0156      27  100  0.0156      27  100  0.0156
4     10000     27  100  0.1562       *    *       *       *    *       *
7     100      233  303       0     369  436  0.0156     359  450  0.0156
7     500      198  278  0.0594     262  339  0.0625     225  263  0.0469
20    100       30   42       0      30   42       0      30   42       0
20    1000     117  263  0.0781      73  133  0.0469      62  116  0.0469
34    1000      16   36       0      40  101  0.0156      37   85  0.0156
34    5000      10   18  0.0156      10   18  0.0156      12   18  0.0312
42    1000      42  106  0.0156      19   27  0.0156      19   27  0.0156
42    5000      42  106  0.0938      19   27  0.0312      19   27  0.0312

In summary we observed the following:
    30 problems where N-DF-SANE was superior to DF-SANE in IT;

    26 problems where N-DF-SANE was superior to DF-SANE in NF;
    17 problems where N-DF-SANE was superior to DF-SANE in CPU time;
    15 problems where DF-SANE was superior to N-DF-SANE in IT;
    18 problems where DF-SANE was superior to N-DF-SANE in NF;
    12 problems where DF-SANE was superior to N-DF-SANE in CPU time.

The results in Tables 1 and 2 show that the proposed method is computationally efficient.

We then tested the sensitivity of the algorithms to the parameters M and η_k. According to an anonymous referee's suggestion, we tested the two algorithms on problems 2, 4, 7, 20, 34 and 42 with different parameters. First, we tested DF-SANE with four values of M: 5, 10, 20 and 40. The performance of DF-SANE with M = 40 is almost the same as that with M = 20, so in Table 3 we only list the results of DF-SANE with M = 5, 10 and 20. We observe from Table 3 that the behaviour of DF-SANE is sensitive to the choice of M. Second, we tested N-DF-SANE with three values of η_k: 0.1, 0.5 and 0.9. The results are listed in Table 4. We see from Table 4 that the performance of N-DF-SANE with η_k = 0.1 is inferior to that with η_k = 0.5 and 0.9. One possible reason is that the choice of η_k controls the degree of nonmonotonicity: as η_k → 0 the line search in (2.1) comes closer to the approximate norm descent line search in Li & Fukushima (2000). We also see from Tables 1, 2 and 4 that when η_k ≥ 0.5 the numerical behaviour of N-DF-SANE is not very sensitive to the choice of η_k.

Acknowledgements

The authors would like to thank the two anonymous referees for their valuable suggestions and comments, which improved this paper greatly. We are grateful to Prof. M. Raydan for providing us with the test problems and the DF-SANE codes.

Funding

The National Development Project on Key Basic Research (2004CB719402); National Science Foundation project of China (10771057).

REFERENCES

BIRGIN, E. G., KREJIĆ, N. & MARTÍNEZ, J. M. (2003) Globally convergent inexact quasi-Newton methods for solving nonlinear systems. Numer. Algorithms, 32, 249-260.
DAI, Y. H. (2002) On the nonmonotone line search. J. Optim. Theory Appl., 112, 315-330.
DENNIS, J. E. & MORÉ, J. J. (1977) Quasi-Newton methods, motivation and theory. SIAM Rev., 19, 46-89.
GRIEWANK, A. (1986) The global convergence of Broyden-like methods with suitable line search. J. Aust. Math. Soc. Ser. B, 28, 75-92.
GRIPPO, L., LAMPARIELLO, F. & LUCIDI, S. (1986) A nonmonotone line search technique for Newton's method. SIAM J. Numer. Anal., 23, 707-716.
LA CRUZ, W., MARTÍNEZ, J. M. & RAYDAN, M. (2004) Spectral residual method without gradient information for solving large-scale nonlinear systems: theory and experiments. Technical Report RT-04-08. Departamento de Computación, UCV.
LA CRUZ, W., MARTÍNEZ, J. M. & RAYDAN, M. (2006) Spectral residual method without gradient information for solving large-scale nonlinear systems of equations. Math. Comput., 75, 1429-1448.

LA CRUZ, W. & RAYDAN, M. (2003) Nonmonotone spectral methods for large-scale nonlinear systems. Optim. Methods Softw., 18, 583-599.
LI, D. H. & CHENG, W. Y. (2007) Recent progress in the global convergence of quasi-Newton methods for nonlinear equations. Hokkaido Math. J., 36, 729-743.
LI, D. H. & FUKUSHIMA, M. (1999) A globally and superlinearly convergent Gauss-Newton-based BFGS method for symmetric nonlinear equations. SIAM J. Numer. Anal., 37, 152-172.
LI, D. H. & FUKUSHIMA, M. (2000) A derivative-free line search and global convergence of Broyden-like method for nonlinear equations. Optim. Methods Softw., 13, 181-201.
MARTÍNEZ, J. M. (1990) A family of quasi-Newton methods for nonlinear equations with direct secant updates of matrix factorizations. SIAM J. Numer. Anal., 27, 1034-1049.
MARTÍNEZ, J. M. (1992) Fixed-point quasi-Newton methods. SIAM J. Numer. Anal., 29, 1413-1434.
MARTÍNEZ, J. M. (2000) Practical quasi-Newton methods for solving nonlinear systems. J. Comput. Appl. Math., 124, 97-121.
RAYDAN, M. (1997) The Barzilai and Borwein gradient method for the large scale unconstrained minimization problem. SIAM J. Optim., 7, 26-33.
SOLODOV, M. V. & SVAITER, B. F. (1999) A globally convergent inexact Newton method for systems of monotone equations. Reformulation: Nonsmooth, Piecewise Smooth, Semismooth and Smoothing Methods (M. Fukushima & L. Qi eds). Kluwer, pp. 355-369.
ZHANG, H. & HAGER, W. W. (2004) A nonmonotone line search technique and its application to unconstrained optimization. SIAM J. Optim., 14, 1043-1056.
ZHANG, L. & ZHOU, W. J. (2006) Spectral gradient projection method for solving nonlinear monotone equations. J. Comput. Appl. Math., 196, 478-484.
ZHOU, W. J. & LI, D. H. (2007) Limited memory BFGS method for nonlinear monotone equations. J. Comput. Math., 25, 89-96.
ZHOU, W. J. & LI, D. H. (2008) A globally convergent BFGS method for nonlinear monotone equations without any merit functions. Math. Comput., 77, 2231-2240.