Gradient method based on epsilon algorithm for large-scale nonlinear optimization

ISSN 1746-7233, England, UK
World Journal of Modelling and Simulation, Vol. 4 (2008) No. 1, pp. 64-68

Gradient method based on epsilon algorithm for large-scale nonlinear optimization

Jianliang Li, Lian Lian, Jun Xu
School of Science, Nanjing University of Science & Technology, Nanjing 210094, P. R. China
Corresponding author. Tel.: +86-25-85408876; E-mail address: ljl6006@hotmail.com.

(Received October 22, 2007; Accepted December 2, 2007)

Abstract. Based on epsilon vector-sequence extrapolation, this paper applies the sequence-acceleration principle to the gradient method for large-scale nonlinear optimization and presents several new methods for accelerating the sequences generated by the gradient method. Finally, experimental results are reported comparing the new accelerated-sequence algorithm with the conjugate gradient method.

Keywords: gradient method, vector sequence extrapolation, epsilon algorithm, numerical experiment

1 Introduction

Nonlinear optimization problems

\min_{x \in D} f(x), \qquad D \subseteq \mathbb{R}^n, \qquad (1)

arise widely in scientific research and engineering. Generally, algorithms with superlinear convergence, for example quasi-Newton methods based on Newton's method or the collinear scaling method, are effective for solving problem (1) [4, 13]. For large-scale optimization problems, and for complicated applications such as artificial neural networks or portfolio investment, the conjugate gradient method, which is superlinearly convergent, is the first choice because it does not require computing an approximation of the Hessian matrix.

The conjugate gradient method is a kind of generalization of the steepest descent method. Its essence is to use an appropriate linear combination of two adjacent negative gradient directions as the new search direction. Consequently, this class of algorithms is superlinearly convergent and is commonly used in scientific research and engineering. In addition, the sequence extrapolation principle involves linear or nonlinear combinations of a series of approximations, and it is applied in many branches of numerical computation because it can reach the required accuracy at relatively low computational cost. Since a superlinear convergence property can be obtained from combinations of adjacent search directions of the steepest descent method, we can also expect to accelerate convergence by applying extrapolation algorithms to the vector sequences generated by the steepest descent method.
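As a concrete, deliberately simple illustration of the extrapolation principle referred to above (this example is not part of the original paper), Aitken's delta-squared process combines three consecutive terms of a linearly convergent scalar sequence into a far more accurate estimate of its limit. A minimal Python sketch:

```python
# Illustration only (not from the paper): Aitken's delta-squared process applied to a
# linearly convergent scalar sequence s_n = 1 + 0.5**n, whose limit is 1.
def aitken(s):
    """Return the Aitken-accelerated sequence t_n = s_n - (delta s_n)^2 / (delta^2 s_n)."""
    t = []
    for n in range(len(s) - 2):
        d1 = s[n + 1] - s[n]
        d2 = s[n + 2] - 2 * s[n + 1] + s[n]
        # If the second difference vanishes, the sequence has effectively converged.
        t.append(s[n] - d1 * d1 / d2 if d2 != 0 else s[n + 2])
    return t

s = [1 + 0.5 ** n for n in range(12)]
print(abs(s[-1] - 1))           # raw error of the last term, about 5e-4
print(abs(aitken(s)[-1] - 1))   # accelerated error: zero up to rounding for this model sequence
```

For a sequence whose error is exactly geometric, as here, the transform recovers the limit exactly; the epsilon algorithm discussed in the next section generalizes this idea to higher orders and to vector sequences.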

2 Epsilon extrapolation for the gradient method

2.1 Gradient method based on the epsilon algorithm

The gradient method is the simplest optimization algorithm. Its iteration is

x_{k+1} = x_k + \alpha_k d_k, \qquad d_k = -\nabla f(x_k), \qquad \alpha_k = \arg\min_{\alpha > 0} f(x_k + \alpha d_k), \qquad k = 0, 1, \ldots \qquad (2)

Though this algorithm has a simple form, the generated vector sequence {x_k} is only linearly convergent, because two adjacent search directions are orthogonal. This drawback limits its application. Many extrapolation algorithms have been studied; appropriate combinations of the iterates can speed up the convergence of a sequence, and one can even obtain a convergent result from a divergent iterative sequence by using algorithms such as Aitken extrapolation [3, 5, 12]. The epsilon algorithm is the most commonly used sequence extrapolation method and is a more general form of Aitken extrapolation. It has been proved in theory that the epsilon algorithm can accelerate general linearly convergent sequences [15], and the sequences generated by the steepest descent method are exactly linearly convergent. The (scalar) epsilon algorithm is defined by

\varepsilon_{-1}^{(n)} = 0, \quad \varepsilon_0^{(n)} = s_n, \quad \varepsilon_{k+1}^{(n)} = \varepsilon_{k-1}^{(n+1)} + \left[\varepsilon_k^{(n+1)} - \varepsilon_k^{(n)}\right]^{-1}, \quad n, k = 0, 1, \ldots \qquad (3)

The vector epsilon algorithm is a vector-form generalization of the epsilon algorithm. The key to generalizing the ε-algorithm to the vector case is how to define the inverse of a vector; different definitions of the inverse give different generalizations. For a vector sequence {x_n}, one can apply the scalar epsilon algorithm to each component of x_n; this is called the scalar epsilon algorithm (SEA). If we choose the vector inverse to be the Samelson inverse

z^{-1} = z / \|z\|_2^2, \qquad z \in \mathbb{R}^N, \qquad (4)

we obtain the vector epsilon algorithm (VEA), defined by

\varepsilon_{-1}^{(n)} = 0, \quad \varepsilon_0^{(n)} = x_n, \quad \varepsilon_{k+1}^{(n)} = \varepsilon_{k-1}^{(n+1)} + \frac{\Delta\varepsilon_k^{(n)}}{\|\Delta\varepsilon_k^{(n)}\|_2^2}, \quad n, k = 0, 1, \ldots \qquad (5)

Another possibility is to use the inverses

\left[\Delta\varepsilon_{2k}^{(n)}\right]^{-1} = \frac{y}{y^T \Delta\varepsilon_{2k}^{(n)}}, \qquad \left[\Delta\varepsilon_{2k+1}^{(n)}\right]^{-1} = \frac{\Delta\varepsilon_{2k}^{(n)}}{\left(\Delta\varepsilon_{2k}^{(n)}\right)^T \Delta\varepsilon_{2k+1}^{(n)}}, \qquad (6)

where y is any non-zero vector of the n-dimensional space and \Delta\varepsilon_k^{(n)} = \varepsilon_k^{(n+1)} - \varepsilon_k^{(n)}. This leads to the topological ε-algorithm (TEA):

\varepsilon_{-1}^{(n)} = 0, \quad \varepsilon_0^{(n)} = x_n, \quad
\varepsilon_{2k+1}^{(n)} = \varepsilon_{2k-1}^{(n+1)} + \frac{y}{y^T \Delta\varepsilon_{2k}^{(n)}}, \quad
\varepsilon_{2k+2}^{(n)} = \varepsilon_{2k}^{(n+1)} + \frac{\Delta\varepsilon_{2k}^{(n)}}{\left(\Delta\varepsilon_{2k}^{(n)}\right)^T \Delta\varepsilon_{2k+1}^{(n)}}, \quad n, k = 0, 1, \ldots \qquad (7)
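To make (5) concrete, the following is a minimal sketch of the vector ε-algorithm (one possible reading of the recursion with the Samelson inverse, not the authors' C++ implementation); it builds the ε-table column by column from the iterates x_0, ..., x_{2K} and returns the extrapolated vector ε_{2K}^{(0)}, the quantity used in Step 4 of the algorithm stated below. NumPy is assumed.

```python
import numpy as np

def vector_epsilon(xs, K):
    """Vector epsilon-algorithm (VEA), eq. (5): given iterates xs = [x_0, ..., x_{2K}]
    (or more), build the epsilon table with the Samelson inverse z^{-1} = z / ||z||_2^2
    and return the extrapolated vector eps_{2K}^{(0)}."""
    xs = [np.asarray(x, dtype=float) for x in xs]
    if len(xs) < 2 * K + 1:
        raise ValueError("VEA with parameter K needs at least 2K + 1 iterates")
    col_km1 = [np.zeros_like(xs[0]) for _ in range(len(xs) + 1)]  # column k = -1 (all zeros)
    col_k = list(xs)                                              # column k = 0: the iterates
    for _ in range(2 * K):
        nxt = []
        for n in range(len(col_k) - 1):
            diff = col_k[n + 1] - col_k[n]                        # Delta eps_k^{(n)}
            denom = float(diff @ diff)
            if denom == 0.0:                                      # iterates already converged
                return col_k[n + 1]
            nxt.append(col_km1[n + 1] + diff / denom)             # eq. (5) with Samelson inverse
        col_km1, col_k = col_k, nxt
    return col_k[0]                                               # eps_{2K}^{(0)}
```

Only the even-indexed columns ε_{2k} of the table are meaningful approximations of the limit; the odd-indexed columns are intermediate quantities of the recursion.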

In this way, for the vector sequence {x_n} generated by the steepest descent method, we obtain the ε-extrapolation gradient algorithm below, in which the extrapolation is carried out by one of the vector algorithms above.

Algorithm (ε-extrapolation gradient algorithm)
Step 1. Choose ε, e and the step size α;
Step 2. Iterate by formula (2) until x̃_0 satisfies ||g(x̃_0)|| ≤ e; let k = 0;
Step 3. Let x_0 = x̃_k and update 2K times by formula (2) to obtain the vector sequence {x_l}, l = 0, ..., 2K; if ||g(x_l)|| ≤ ε during this iteration, go to Step 5;
Step 4. For the vector sequence {x_l}, l = 0, ..., 2K, apply vector extrapolation (SEA, VEA or TEA) to obtain x̃_{k+1} = ε_{2K}^{(0)}. If ||g(x̃_{k+1})|| ≤ ε, go to Step 6;
Step 5. Set k = k + 1 and go to Step 2;
Step 6. Stop.

2.2 Convergence of the accelerated algorithms

Assume that the initial point is x_0 and let x_1, x_2, ... be the vector sequence generated from x_0 by

x_{j+1} = F(x_j),

where F is a vector-valued function, defined on an open and connected domain D of the n-dimensional space, that has a Lipschitz continuous derivative. If x* = F(x*) is a fixed point in D and F'(x*) is the Jacobian matrix of F at x*, then for all x ∈ D

F(x) - x^* = F'(x^*)(x - x^*) + O(\|x - x^*\|^2). \qquad (8)

We also assume that F'(x*) does not have 1 as an eigenvalue.

Lemma 1. [2, 14] For F and x* as above, if on the i-th cycle K_i is chosen to be the degree of the minimal polynomial of F'(x*) with respect to x_i - x*, each of the algorithms above making use of 2K_i + 1 vectors, and if x_0 is sufficiently close to x*, then

\|x_{i+1} - x^*\| = O(\|x_i - x^*\|^2). \qquad (9)

Theorem 1. For the nonlinear optimization problem, suppose x* is a strict minimum point and f(x) has a continuous second-order derivative G(x) that is Lipschitz continuous. If x_0 is sufficiently close to x* and on every cycle K_i is chosen to be the degree of the minimal polynomial of G(x*) with respect to x_i - x*, each of the algorithms above (SEA, VEA or TEA) making use of 2K_i + 1 vectors, then the vector sequence generated by the vector extrapolation algorithm converges quadratically:

\|x_{i+1} - x^*\| = O(\|x_i - x^*\|^2). \qquad (10)

3 Experiments and numerical analysis

In order to show the efficiency of the accelerating algorithm, we test it on example problems and compare the results with those obtained by the conjugate gradient method. We use the gradient norm E(x) = ||∇f(x)||_2 as the error function and an error below 1e-6 as the stopping criterion. All experiments were implemented in C++ and run on a Pentium 4 CPU at 2.40 GHz. Since the programs ran under Windows XP and there were only 100 variables in the test functions, the small difference in running time before and after applying the accelerated algorithm could not be reliably distinguished once the CPU time consumed by the operating system is taken into account; because the work per iteration varies little, we therefore compare only iteration counts in the numerical experiments. Moreover, the different vector ε-extrapolation gradient algorithms differ only in how the vector inverse is computed, so in the following comparisons we use only the vector ε-algorithm (VEA) as an example.
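Before turning to the test problems, here is an illustration of how the cycle in Steps 1-6 above can be organized with VEA (a sketch under stated assumptions, not the authors' C++ implementation). It reuses the vector_epsilon sketch from Section 2.1; a backtracking Armijo line search is substituted for the exact one-dimensional search of (2), and the names f, grad, tol and max_cycles are illustrative.

```python
import numpy as np

# vector_epsilon(xs, K) is the VEA sketch given in Section 2.1 above.

def epsilon_gradient(f, grad, x0, K=2, tol=1e-6, max_cycles=10000):
    """Sketch of the epsilon-extrapolation gradient cycle: repeat 2K steepest-descent
    steps (Step 3), extrapolate with VEA (Step 4), and restart from the extrapolated
    point (Step 5) until the gradient norm drops below tol."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_cycles):
        xs = [x.copy()]
        for _ in range(2 * K):                          # Step 3: 2K gradient iterations
            g = grad(x)
            if np.linalg.norm(g) <= tol:
                return x
            d, fx = -g, f(x)
            alpha = 1.0                                 # backtracking Armijo search in place
            while alpha > 1e-14 and f(x + alpha * d) > fx - 1e-4 * alpha * (g @ g):
                alpha *= 0.5                            # of the exact 1-D search in eq. (2)
            x = x + alpha * d
            xs.append(x.copy())
        x_new = vector_epsilon(xs, K)                   # Step 4: x_{k+1} = eps_{2K}^{(0)}
        if np.linalg.norm(grad(x_new)) <= tol:
            return x_new                                # Step 6: stop
        x = x_new                                       # Step 5: start the next cycle
    return x
```

For example, one could call epsilon_gradient with f and grad implementing the extended Rosenbrock function described below and K = 2 to mimic the setting of the first experiment.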

Test problem 1 (extended Rosenbrock function [10]):

f(x) = \sum_{i=1}^{n} f_i^2(x), \qquad x \in \mathbb{R}^n \ (n \text{ even}),

where

f_{2i-1}(x) = 10\,(x_{2i} - x_{2i-1}^2), \qquad f_{2i}(x) = 1 - x_{2i-1}.

We chose x_0 = (-1.2, 1, -1.2, 1, ..., -1.2, 1)^T as the initial point; the minimum of f is 0. We chose n = 100 and required the error of the one-dimensional search to be less than 1e-4. We started the extrapolation when the error was less than 2e-5 and chose K = 2. The numerical results are shown in Tab. 1.

Table 1. Comparison of VEA and the conjugate gradient method for test problem 1

Algorithm                      Iterations   Error
gradient method                11039        9.99984e-7
vector ε-algorithm             123          6.70704e-7
FR conjugate gradient method   10600        9.91765e-7

Test problem 2 (penalty function [10]):

f(x) = \sum_{i=1}^{n+1} f_i^2(x), \qquad x \in \mathbb{R}^n,

where

f_i(x) = \alpha^{1/2}(x_i - 1), \ 1 \le i \le n, \qquad f_{n+1}(x) = \Big(\sum_{j=1}^{n} x_j^2\Big) - \frac{1}{4},

and α = 10^{-5}. We chose x_0 = (1, 2, 3, ..., n)^T as the initial point and n = 100, used an error below 1e-6 as the stopping criterion and an error of the one-dimensional search below 1e-3, started the extrapolation when the error was less than 1e-1, and chose K = 10. The numerical results are shown in Tab. 2.

Table 2. Comparison of VEA and the conjugate gradient method for test problem 2

Algorithm                      Iterations   Error
gradient method                2759         9.99718e-7
vector ε-algorithm             15           7.46054e-7
FR conjugate gradient method   200          3.82106e-7

The numerical results in Tab. 1 and Tab. 2 show that the gradient method based on vector extrapolation greatly reduces the number of iterations and hence the CPU time of the computation, and the final error is also reduced. Moreover, comparing the accelerated sequence algorithm with the conjugate gradient method, we find that the former can obtain even better numerical results than the latter, a superlinearly convergent method that is often applied to large-scale optimization.

4 Conclusion

In this paper we discussed the accelerated convergence of the vector sequences generated by the gradient method for large-scale nonlinear optimization. The new algorithms based on the epsilon algorithm require no complicated calculations, accelerate the gradient method significantly, and are clearly faster than the conjugate gradient method. The numerical experiments show that they also reduce the CPU time of the computation.

References

[1] C. Brezinski. Convergence acceleration during the 20th century. J. Comput. Appl. Math., 2000, 122: 1-21.
[2] C. Brezinski. Généralisations de la transformation de Shanks, de la table de Padé et de l'ε-algorithme. Calcolo, 1975, 12: 317-360.
[3] S. Cabay, L. W. Jackson. A polynomial extrapolation method for finding limits and antilimits of vector sequences. SIAM J. Numer. Anal., 1976, 13: 734-752.
[4] J. E. Dennis, J. J. Moré. Quasi-Newton methods, motivation and theory. SIAM Rev., 1977, 19: 46-89.
[5] R. P. Eddy. Extrapolating to the limit of a vector sequence. In: P. C. C. Wang (ed.), Information Linkage Between Applied Mathematics and Industry, Academic Press, New York, 1979, 387-396.
[6] R. Fletcher. Practical Methods of Optimization, Vol. 1: Unconstrained Optimization. John Wiley & Sons, 1980.
[7] E. Gekeler. On the solution of systems of equations by the epsilon algorithm of Wynn. Math. Comp., 1972, 26: 427-436.
[8] K. Jbilou, H. Sadok. Some results about vector extrapolation methods and related fixed-point iterations. J. Comput. Appl. Math., 1991, 36: 385-398.
[9] K. Jbilou, H. Sadok. Vector extrapolation methods. Applications and numerical comparison. J. Comput. Appl. Math., 2000, 122: 149-165.
[10] J. J. Moré, et al. Testing unconstrained optimization software. ACM Trans. Math. Software, 1981, 7(1): 17-41.
[11] A. Sidi. Convergence and stability properties of minimal polynomial and reduced rank extrapolation algorithms. SIAM J. Numer. Anal., 1986, 23: 197-209.
[12] A. Sidi, et al. Acceleration of convergence of vector sequences. SIAM J. Numer. Anal., 1986, 23: 178-196.
[13] D. C. Sorensen. The q-superlinear convergence of a collinear scaling algorithm for unconstrained optimization. SIAM J. Numer. Anal., 1980, 17: 84-114.
[14] P. Wynn. On a device for computing the e_m(S_n) transformation. Math. Tables Aids Comput., 1956, 10: 91-96.
[15] P. Wynn. Acceleration techniques for iterated vector and matrix problems. Math. Comp., 1962, 16: 301-322.
[16] Y. Yuan. A modified BFGS algorithm for unconstrained optimization. IMA J. Numer. Anal., 1991, 11: 325-332.