Global Convergence of the Method of Shortest Residuals

Yu-hong Dai and Ya-xiang Yuan

State Key Laboratory of Scientific and Engineering Computing, Institute of Computational Mathematics and Scientific/Engineering Computing, Chinese Academy of Sciences, P. O. Box 2719, Beijing 100080, P. R. China. (This work was supported by the National Science Foundation of China, grant 19525101.)

Abstract. The method of shortest residuals (SR) was presented by Hestenes and studied by Pytlak. If the function is quadratic and the line search is exact, the SR method reduces to the linear conjugate gradient method. In this paper, we give the formulation of the SR method when the line search is inexact. We prove that, if the stepsizes satisfy the strong Wolfe conditions, both the Fletcher-Reeves and Polak-Ribiere-Polyak versions of the SR method converge globally. When the Wolfe conditions are used, the two versions are also convergent provided that the stepsizes are uniformly bounded; if the stepsizes are not bounded, an example is constructed to show that they need not converge. Numerical results show that the SR method is a promising alternative to the standard conjugate gradient method.

Mathematics Subject Classification: 65K, 90C.

1. Introduction

The conjugate gradient method is particularly useful for minimizing functions of many variables,

$\min_{x \in \Re^n} f(x)$,  (1.1)

because it does not require the storage of any matrices. It has the form

$d_k = -g_k$ for $k = 1$; $d_k = -g_k + \beta_k d_{k-1}$ for $k \ge 2$,  (1.2)

$x_{k+1} = x_k + \alpha_k d_k$,  (1.3)

where $g_k = \nabla f(x_k)$, $\beta_k$ is a scalar, and $\alpha_k$ is a stepsize obtained by a line search. Well-known formulas for $\beta_k$ are the Fletcher-Reeves (FR) and Polak-Ribiere-Polyak (PRP) formulas (see [1, 8, 9]). They are given by

$\beta_k^{FR} = \|g_k\|^2 / \|g_{k-1}\|^2$  (1.4)

and

$\beta_k^{PRP} = g_k^T (g_k - g_{k-1}) / \|g_{k-1}\|^2$,  (1.5)

where $\|\cdot\|$ is the Euclidean norm.
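For readers who prefer code, the following is a minimal NumPy sketch of the search-direction update (1.2) with the FR and PRP choices (1.4)-(1.5) of $\beta_k$; the function and variable names are ours, not the paper's.

```python
import numpy as np

def beta_fr(g_new, g_old):
    # Fletcher-Reeves formula (1.4): ||g_k||^2 / ||g_{k-1}||^2
    return np.dot(g_new, g_new) / np.dot(g_old, g_old)

def beta_prp(g_new, g_old):
    # Polak-Ribiere-Polyak formula (1.5): g_k^T (g_k - g_{k-1}) / ||g_{k-1}||^2
    return np.dot(g_new, g_new - g_old) / np.dot(g_old, g_old)

def cg_direction(g_new, g_old=None, d_old=None, beta_rule=beta_fr):
    # Search direction (1.2): steepest descent at the first iteration,
    # otherwise d_k = -g_k + beta_k * d_{k-1}.
    if g_old is None or d_old is None:
        return -g_new
    return -g_new + beta_rule(g_new, g_old) * d_old
```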

This paper deals with another conjugate gradient method, the method of shortest residuals (SR). The SR method was presented by Hestenes in his monograph [2] on conjugate direction methods. In the SR method the search direction $d_k$ is taken as the shortest vector of the form

$d_k = \dfrac{-g_k + \gamma_k d_{k-1}}{1 + \gamma_k}$, $0 < \gamma_k < \infty$,  (1.6)

where $d_1 = -g_1$. It follows that $d_k$ is orthogonal to $g_k + d_{k-1}$, and hence its parameter satisfies the relation

$(-g_k + \gamma_k d_{k-1})^T (g_k + d_{k-1}) = 0$.  (1.7)

Assuming that the line search is exact, Hestenes obtains the solution of (1.7):

$\gamma_k = \|g_k\|^2 / \|d_{k-1}\|^2$.  (1.8)

If the function is quadratic, and if the line search is exact, the vector $d_{k+1}$ can be proved to be also the shortest residual in the $k$-simplex whose vertices are $-g_1, \ldots, -g_{k+1}$ (see [2]). The SR method can be viewed as a special case of the conjugate subgradient method developed by Wolfe [14] and Lemarechal [4] for minimizing a convex function which may be nondifferentiable. Based on this fact, Pytlak [7] also called the above method the Wolfe-Lemarechal method. To investigate the equivalent form of the SR method for general functions and to construct other conjugate gradient methods, Pytlak [7] introduces the family of methods

$d_k = -\mathrm{Nr}\{g_k, -\beta_k d_{k-1}\}$,  (1.9)

where $\mathrm{Nr}\{a, b\}$ is defined as the point of the line segment spanned by the vectors $a$ and $b$ which has the smallest norm, namely,

$\|\mathrm{Nr}\{a, b\}\| = \min\{\|\lambda a + (1-\lambda) b\| : 0 \le \lambda \le 1\}$.  (1.10)

If $g_k^T d_{k-1} = 0$, one can solve (1.9) analytically and obtain

$d_k = -(1-\lambda_k) g_k + \lambda_k \beta_k d_{k-1}$  (1.11)

and

$\lambda_k = \dfrac{\|g_k\|^2}{\|g_k\|^2 + \beta_k^2 \|d_{k-1}\|^2}$.  (1.12)

In this case, it is easy to see that the direction $d_k$ defined by (1.6) and (1.8) corresponds to (1.11)-(1.12) with

$\beta_k = 1$.  (1.13)

Pytlak points out in [7] that if the line search is exact, and if $\beta_k \equiv 1$, the SR method is equivalent to the FR method, and the corresponding formula of $\beta_k$ for the PRP method is

$\beta_k = \dfrac{\|g_k\|^2}{|g_k^T (g_k - g_{k-1})|}$.  (1.14)
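The operator $\mathrm{Nr}\{a,b\}$ in (1.9)-(1.10) admits a simple closed form: the minimizing $\lambda$ is obtained by projecting the origin onto the segment between $a$ and $b$. The following sketch is ours and purely illustrative.

```python
import numpy as np

def nr(a, b):
    # Nr{a, b} from (1.10): the point of smallest Euclidean norm on the
    # segment {lam*a + (1-lam)*b : 0 <= lam <= 1}.
    diff = a - b
    denom = np.dot(diff, diff)
    if denom == 0.0:                      # a == b: the segment is a single point
        return a.copy()
    lam = np.clip(np.dot(b, b - a) / denom, 0.0, 1.0)
    return lam * a + (1.0 - lam) * b

def sr_direction(g_k, d_prev, beta_k):
    # SR search direction from (1.9): d_k = -Nr{g_k, -beta_k * d_{k-1}}.
    return -nr(g_k, -beta_k * d_prev)
```

With an exact line search ($g_k^T d_{k-1} = 0$), `sr_direction` reproduces the analytic solution (1.11)-(1.12).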

One may argue that the corresponding formula of $\beta_k$ for the PRP method should be

$\beta_k = \dfrac{\|g_k\|^2}{g_k^T (g_k - g_{k-1})}$.  (1.15)

Note that if the line search is exact, the method obtained with (1.15) produces the same iterations as the PRP method. Powell [11]'s examples can then also be used to show that the method may cycle without approaching any solution point even if the line search is exact. Thus we take (1.14) instead of (1.15). For clarity, we now call the method (1.3), (1.11) and (1.12) the SR method provided that the scalar $\beta_k$ is such that the reduced method is the linear conjugate gradient method when the function is quadratic and the line search is exact. At the same time, we call formulae (1.13) and (1.14) for $\beta_k$ the FR and PRP versions of the SR method, and abbreviate them by FRSR and PRPSR respectively.

In this paper, we investigate the convergence properties of the FRSR and PRPSR methods with the stepsize satisfying the Wolfe conditions:

$f(x_k) - f(x_k + \alpha_k d_k) \ge -\delta \alpha_k g_k^T d_k$,  (1.16)

$g(x_k + \alpha_k d_k)^T d_k \ge \sigma g_k^T d_k$,  (1.17)

or the strong Wolfe conditions, namely, (1.16) and

$|g(x_k + \alpha_k d_k)^T d_k| \le \sigma |g_k^T d_k|$,  (1.18)

where $0 < \delta < \sigma < 1$. In [7], Pytlak suggests the following line search conditions:

$f(x_k) - f(x_k + \alpha_k d_k) \ge \delta \alpha_k \|d_k\|^2$,  (1.19)

$g(x_k + \alpha_k d_k)^T d_k \ge -\sigma \|d_k\|^2$,  (1.20)

where $0 < \delta < \sigma < 1$, and he concludes that there exists a procedure which finds $\alpha_k > 0$ satisfying (1.19)-(1.20) in a finite number of operations (see Lemma 1 in [7]). However, since it is possible that $\hat\alpha = 0$ in his proof, the statement is not true. Consider

$f(x) = \tau x$, $x \in \Re^1$,  (1.21)

where $\tau > 0$ is a constant. Suppose that at the $k$-th iteration $x_k = 1$ and $d_k = -1$ are obtained. Then for any $\alpha > 0$,

$f(x_k) - f(x_k + \alpha d_k) = \tau - \tau(1 - \alpha) = \tau\alpha$.  (1.22)

The above implies that (1.19) does not hold for any $\alpha > 0$, provided that $\tau < \delta$. It is worth noting that, for the SR method, we often have $g_k^T d_k = -\|d_k\|^2$, which implies that (1.19)-(1.20) are equivalent to (1.16)-(1.17).

This paper is organized as follows. In the next section, we give the formula of the SR method under inexact line searches and provide an algorithm in which some safeguards are employed. In Section 3, we prove that the FRSR and PRPSR algorithms converge with the strong Wolfe conditions (1.16) and (1.18). If the Wolfe conditions (1.16)-(1.17) are used, convergence is also guaranteed provided that the stepsizes are uniformly bounded. One direct corollary is that the FRSR and PRPSR algorithms with (1.16)-(1.17) converge for strictly convex functions. In Section 4, a function is constructed to show that, if we do not restrict $\{\alpha_k\}$, the Wolfe conditions can guarantee neither algorithm to converge.
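As an illustration of the line search requirements, here is a sketch (ours) of routines that test the Wolfe conditions (1.16)-(1.17) and the strong Wolfe conditions (1.16) and (1.18) for a trial stepsize; the default values 0.01 and 0.1 mirror the parameters used later in the experiments of Section 5.

```python
import numpy as np

def wolfe_ok(f, grad, x, d, alpha, delta=0.01, sigma=0.1):
    # Standard Wolfe conditions (1.16)-(1.17) with 0 < delta < sigma < 1.
    gTd = np.dot(grad(x), d)
    x_new = x + alpha * d
    sufficient_decrease = f(x) - f(x_new) >= -delta * alpha * gTd   # (1.16)
    curvature = np.dot(grad(x_new), d) >= sigma * gTd               # (1.17)
    return sufficient_decrease and curvature

def strong_wolfe_ok(f, grad, x, d, alpha, delta=0.01, sigma=0.1):
    # Strong Wolfe conditions: (1.16) together with (1.18).
    gTd = np.dot(grad(x), d)
    x_new = x + alpha * d
    sufficient_decrease = f(x) - f(x_new) >= -delta * alpha * gTd   # (1.16)
    curvature = abs(np.dot(grad(x_new), d)) <= sigma * abs(gTd)     # (1.18)
    return sufficient_decrease and curvature
```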

The example applies to both the FRSR and PRPSR algorithms. Numerical results are reported in Section 5, which show that the SR method is a promising alternative to the standard conjugate gradient method.

2. Algorithm

Expression (1.12) is deduced in the case of the exact line search. If the line search is inexact, it is not difficult to deduce from (1.9) that the scalar $\lambda_k$ in (1.11) is given by

$\lambda_k = \dfrac{\|g_k\|^2 + \beta_k g_k^T d_{k-1}}{\|g_k + \beta_k d_{k-1}\|^2}$,  (2.1)

if we do not restrict $\lambda_k \in [0, 1]$. By direct calculations, we can obtain

$g_k^T d_k = -\|d_k\|^2$  (2.2)

and

$\|d_k\|^2 = \dfrac{\beta_k^2 \left( \|g_k\|^2 \|d_{k-1}\|^2 - (g_k^T d_{k-1})^2 \right)}{\|g_k + \beta_k d_{k-1}\|^2}$.  (2.3)

It follows from (2.2) that $d_k$ is a descent direction whenever $d_k \ne 0$. On the other hand, from (2.3) we see that $d_k = 0$ if and only if $g_k$ and $d_{k-1}$ are collinear. In the case when $g_k$ and $d_{k-1}$ are collinear, assuming that $\beta_k d_{k-1} = t g_k$, we can solve $d_k$ from (1.9) and (1.10) as follows:

$d_k = -g_k$ if $t < -1$; $d_k = t g_k$ if $-1 \le t \le 0$; $d_k = 0$ if $t > 0$.  (2.4)

The direction $d_k$ vanishes if $t > 0$. The following example shows this possibility. Consider

$f(x, y) = \dfrac{1+\tau}{2} (x^2 + y^2)$.  (2.5)

Given the starting point $x_1 = (1, 1)^T$, one can verify that the unit stepsize satisfies (1.16) and (1.18) if the parameters $\delta$, $\sigma$ and $\tau$ are such that $\tau \le \sigma$ and $(1+\tau)(1-\tau) \ge 2\delta(1+\tau)$. It follows that $x_2 = (-\tau, -\tau)^T$, and hence that $t = 1/\tau$ for the FRSR method and $t = 1/(1+\tau)$ for the PRPSR method. Thus we have $t > 0$ for both methods. In practice, to avoid numerical overflows we restart the algorithm with

$d_k = -g_k$  (2.6)

whenever the following condition is violated:

$|g_k^T d_{k-1}| \le b_1 \|g_k\| \|d_{k-1}\|$,  (2.7)

where $b_1$ is a positive number. For the PRPSR method, it is obvious that the denominator of $\beta_k$ may vanish. Thus, to establish convergence results for the PRPSR method, we must assume that $|g_k^T (g_k - g_{k-1})| > 0$. In practice, we test the following condition at every iteration:

$|g_k^T (g_k - g_{k-1})| > b_2 \|g_k\|^2$,  (2.8)

where $0 \le b_2 < 1$.
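The formulas (2.1), (1.11) and the collinear case (2.4) can be combined into a single direction routine. The sketch below is ours and only illustrative; the tolerance used to detect collinearity is an assumption, not a quantity from the paper.

```python
import numpy as np

def sr_direction_inexact(g_k, d_prev, beta_k, tol=1e-12):
    # lambda_k from (2.1) and d_k from (1.11); when g_k and d_{k-1} are
    # (numerically) collinear, fall back to the explicit solution (2.4).
    v = beta_k * d_prev
    gg, vv, gv = np.dot(g_k, g_k), np.dot(v, v), np.dot(g_k, v)
    if gg * vv - gv ** 2 <= tol * gg * vv:      # collinearity test (assumed tolerance)
        t = gv / gg                             # beta_k * d_{k-1} = t * g_k
        if t < -1.0:
            return -g_k
        if t <= 0.0:
            return t * g_k
        return np.zeros_like(g_k)
    s = g_k + v
    lam = (gg + gv) / np.dot(s, s)              # (2.1)
    return -(1.0 - lam) * g_k + lam * v         # (1.11)
```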

Now we give a general algorithm as follows.

Algorithm 2.1. Given a starting point $x_1$. Choose $I \in \{0, 1\}$, and numbers $0 < b_1 \le 1$, $0 \le b_2 < 1$, $\epsilon \ge 0$.
Step 1. Compute $\|g_1\|$. If $\|g_1\| \le \epsilon$, stop; otherwise, let $k = 1$.
Step 2. Compute $d_k = -g_k$.
Step 3. Find $\alpha_k > 0$ satisfying certain line search conditions.
Step 4. Compute $x_{k+1}$ by (1.3) and $g_{k+1}$. Let $k = k + 1$.
Step 5. Compute $\|g_k\|$. If $\|g_k\| \le \epsilon$, stop.
Step 6. Test (2.7): if (2.7) does not hold, go to Step 2. If $I = 1$, go to Step 8.
Step 7. Compute $\beta_k$ by (1.13), go to Step 9.
Step 8. Test (2.8): if (2.8) does not hold, go to Step 2; otherwise compute $\beta_k$ by (1.14).
Step 9. Compute $\lambda_k$ by (2.1) and $d_k$ by (1.11), go to Step 3.

We call the above algorithm with $I = 0$ and with $I = 1$ the FRSR and the PRPSR algorithm, respectively.
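A possible realization of Algorithm 2.1 in Python is sketched below. It is our reading of the algorithm, not the authors' code: `line_search` stands for any routine returning a stepsize that fulfils the chosen line search conditions, and the default values of `b1`, `b2`, `eps` and `max_iter` are illustrative assumptions.

```python
import numpy as np

def sr_algorithm(f, grad, x1, line_search, I=0, b1=0.9, b2=0.1,
                 eps=1e-6, max_iter=10000):
    # Sketch of Algorithm 2.1: I = 0 gives the FRSR algorithm, I = 1 the PRPSR
    # algorithm.  line_search(f, grad, x, d) must return alpha_k > 0 satisfying
    # the chosen line search conditions (e.g. the strong Wolfe conditions).
    x = np.asarray(x1, dtype=float)
    g = grad(x)
    if np.linalg.norm(g) <= eps:                          # Step 1
        return x
    d = -g                                                # Step 2 (also used for restarts)
    for _ in range(max_iter):
        alpha = line_search(f, grad, x, d)                # Step 3
        x = x + alpha * d                                 # Step 4, cf. (1.3)
        g_old, g = g, grad(x)
        if np.linalg.norm(g) <= eps:                      # Step 5
            return x
        d_old = d
        # Step 6: restart with d = -g if the safeguard (2.7) is violated
        if abs(np.dot(g, d_old)) > b1 * np.linalg.norm(g) * np.linalg.norm(d_old):
            d = -g
            continue
        if I == 0:
            beta = 1.0                                    # Step 7: FR version, (1.13)
        else:
            denom = abs(np.dot(g, g - g_old))
            if denom <= b2 * np.dot(g, g):                # Step 8: test (2.8), restart on failure
                d = -g
                continue
            beta = np.dot(g, g) / denom                   # PRP version, (1.14)
        v = beta * d_old
        lam = (np.dot(g, g) + np.dot(g, v)) / np.dot(g + v, g + v)   # Step 9, (2.1)
        d = -(1.0 - lam) * g + lam * v                    # (1.11)
    return x
```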

3. Global Convergence

Throughout this section we make the following assumption.

Assumption 3.1 (1) $f$ is bounded below in $\Re^n$ and is continuously differentiable in a neighborhood $\mathcal{N}$ of the level set $\mathcal{L} = \{x \in \Re^n : f(x) \le f(x_1)\}$; (2) the gradient $\nabla f(x)$ is Lipschitz continuous in $\mathcal{N}$, namely, there exists a constant $L > 0$ such that

$\|\nabla f(x) - \nabla f(y)\| \le L \|x - y\|$, for any $x, y \in \mathcal{N}$.  (3.1)

For some of our results, we also formulate the following assumption.

Assumption 3.2 The level set $\mathcal{L} = \{x \in \Re^n : f(x) \le f(x_1)\}$ is bounded.

Note that Assumptions 3.1 and 3.2 imply that there is a constant $\bar\gamma$ such that

$\|g(x)\| \le \bar\gamma$, for all $x \in \mathcal{L}$.  (3.2)

We also formulate the following assumption.

Assumption 3.3 The function $f(x)$ is twice continuously differentiable in $\mathcal{N}$, and there are numbers $0 < \mu_1 \le \mu_2$ such that

$\mu_1 \|y\|^2 \le y^T H(x) y \le \mu_2 \|y\|^2$, for all $x \in \mathcal{N}$ and $y \in \Re^n$,  (3.3)

where $H(x)$ denotes the Hessian matrix of $f$ at $x$.

Under Assumption 3.1, we state a useful result which was essentially proved by Zoutendijk [15] and Wolfe [12, 13].

Theorem 3.4 Let $x_1$ be a starting point for which Assumption 3.1 is satisfied. Consider any iteration of the form (1.3), where $d_k$ is a descent direction and $\alpha_k$ satisfies the Wolfe conditions (1.16)-(1.17). Then

$\sum_{k \ge 1} \dfrac{(g_k^T d_k)^2}{\|d_k\|^2} < \infty$.  (3.4)

The above theorem gives the following result.

Corollary 3.5 Let $x_1$ be a starting point for which Assumption 3.1 is satisfied. Consider Algorithm 2.1 where $b_1 = 1$, $b_2 = 0$, $\epsilon = 0$, and where the line search conditions are (1.16)-(1.17). Then

$\sum_{k \ge 1} \|d_k\|^2 < \infty$.  (3.5)

Proof. If the algorithm restarts, we also have (2.2) due to (2.6). So Theorem 3.4 and relation (2.2) give the corollary.

If Algorithm 2.1 restarts at the $k$-th iteration, we have $\|g_k\|^2 = \|d_k\|^2$. Thus (3.5) implies $\liminf_{k\to\infty} \|g_k\| = 0$ if Algorithm 2.1 restarts infinitely many times. Thus we suppose without loss of generality that (2.7) and (2.8) always hold in Algorithm 2.1. In addition, we also suppose that $g_k \ne 0$ for all $k$, since otherwise a stationary point has been found. First, we have the following theorem for the FRSR method.

Theorem 3.6 Let $x_1$ be a starting point for which Assumption 3.1 is satisfied. Consider Algorithm 2.1 with $I = 0$, $b_1 = 1$ and $\epsilon = 0$. Then we have that

$\liminf_{k\to\infty} \|g_k\| = 0$,  (3.6)

if the line search satisfies one of the following conditions: (i) $g_k^T d_{k-1} = 0$; (ii) the strong Wolfe conditions (1.16) and (1.18); (iii) the Wolfe conditions (1.16)-(1.17), and there exists $M < \infty$ such that

$\alpha_k \le M$, for all $k$.  (3.7)

Proof. We proceed by contradiction and assume that

$\liminf_{k\to\infty} \|g_k\| \ne 0$.  (3.8)

Then there exists a constant $\gamma > 0$ such that

$\|g_k\| \ge \gamma$, for all $k \ge 1$.  (3.9)

From (1.13) and (2.3), we obtain

$\dfrac{1}{\|d_k\|^2} = \dfrac{1}{\|d_{k-1}\|^2} (1 + r_k)$,  (3.10)

where

$r_k = \dfrac{\|d_{k-1}\|^2 + 2 g_k^T d_{k-1} + (g_k^T d_{k-1})^2/\|d_{k-1}\|^2}{\|g_k\|^2 - (g_k^T d_{k-1})^2/\|d_{k-1}\|^2}$.  (3.11)

Using (3.10) recursively, we get that

$\dfrac{1}{\|d_k\|^2} = \dfrac{1}{\|d_1\|^2} \prod_{i=2}^{k} (1 + r_i)$.  (3.12)

Noting that $r_k > 0$, we deduce from (3.5) and (3.12) that

$\sum_{k \ge 2} r_k = \infty$,  (3.13)

because otherwise $1/\|d_k\|^2$ would converge, which contradicts (3.5). For (i), since $g_k^T d_{k-1} = 0$, $r_k$ can be rewritten as

$r_k = \dfrac{\|d_{k-1}\|^2}{\|g_k\|^2}$.  (3.14)

From this and (3.9), we obtain

$\sum_{k \ge 2} r_k \le \gamma^{-2} \sum_{k \ge 2} \|d_{k-1}\|^2 < \infty$,  (3.15)

which contradicts (3.13). For (ii) and (iii), we conclude that there exists a positive number $c_1$ such that for all $k \ge 2$,

$|g_k^T d_{k-1}| \le c_1 \|d_{k-1}\|^2$.  (3.16)

In fact, for (ii), we see from (1.18) and (2.2) that (3.16) holds with $c_1 = \sigma$. For (iii), we have from (3.1), (2.2) and (3.7) that

$|g_{k+1}^T d_k| = |(g_{k+1} - g_k)^T d_k + g_k^T d_k| \le \|g_{k+1} - g_k\| \|d_k\| + |g_k^T d_k| \le \alpha_k L \|d_k\|^2 + \|d_k\|^2 \le (1 + LM) \|d_k\|^2$.  (3.17)

So (3.16) holds with $c_1 = 1 + LM$. Due to (3.5), we see that

$\|d_k\| \to 0$, as $k \to \infty$,  (3.18)

which implies that there exists an integer $k_1$ such that

$\|d_k\| \le \dfrac{\gamma}{\sqrt{2}\, c_1}$, for all $k \ge k_1$.  (3.19)

Applying (3.16) and this in (3.11), we obtain for all $k > k_1$

$r_k \le c_2 \|d_{k-1}\|^2$,  (3.20)

where $c_2 = 2(1 + c_1)^2 / \gamma^2$. Therefore we also have that $\sum_{k} r_k < \infty$, which contradicts (3.13). The contradiction shows the truth of (3.6).

Corollary 3.7 Let $x_1$ be a starting point for which Assumption 3.3 is satisfied. Consider Algorithm 2.1 with $I = 0$, $b_1 = 1$ and $\epsilon = 0$. Then if the line search satisfies (1.16)-(1.17), we have (3.6).

Proof. From (iii) of the above theorem, it suffices to show (3.7). By Taylor's theorem, we get that

$f(x_k + \alpha_k d_k) = f(x_k) + \alpha_k g_k^T d_k + \tfrac{1}{2}\alpha_k^2 d_k^T H(\xi_k) d_k$,  (3.21)

for some $\xi_k$. This, (1.16), (3.3) and (2.2) imply that (3.7) holds with $M = 2(1-\delta)/\mu_1$, which completes the proof.

For the PRPSR method, we first have the following theorem.

Theorem 3.8 Let $x_1$ be a starting point for which Assumption 3.1 is satisfied. Consider Algorithm 2.1 with $I = 1$, $b_1 = 1$, $b_2 = 0$ and $\epsilon = 0$. Then we have (3.6) if the line search satisfies condition (iii).

Proof. We still proceed by contradiction and assume that (3.9) holds. It is obvious that if the line search satisfies (iii), then (3.16) still holds. From (1.14), (3.1), (3.7) and (3.9), we obtain

$\beta_k \|d_{k-1}\| = \dfrac{\|g_k\|^2 \|d_{k-1}\|}{|g_k^T (g_k - g_{k-1})|} \ge \dfrac{\|g_k\|^2 \|d_{k-1}\|}{\|g_k\| L \alpha_{k-1} \|d_{k-1}\|} \ge c_3$,  (3.22)

where $c_3 = \gamma/(LM)$. Using this and (3.9) in (2.3), we have the following estimate of $1/\|d_k\|^2$ for all $k \ge 2$:

$\dfrac{1}{\|d_k\|^2} = \left[ \dfrac{1}{\beta_k^2\|d_{k-1}\|^2} + \dfrac{2 g_k^T d_{k-1}}{\beta_k \|g_k\|^2 \|d_{k-1}\|^2} + \dfrac{1}{\|g_k\|^2} \right] \left( 1 - \dfrac{(g_k^T d_{k-1})^2}{\|g_k\|^2\|d_{k-1}\|^2} \right)^{-1} \le (c_3^{-1} + \gamma^{-1})^2 \left( 1 - \dfrac{(g_k^T d_{k-1})^2}{\|g_k\|^2 \|d_{k-1}\|^2} \right)^{-1}$.  (3.23)

So

$\sum_{k \ge 2} \left( 1 - \dfrac{(g_k^T d_{k-1})^2}{\|g_k\|^2 \|d_{k-1}\|^2} \right) < \infty$,  (3.24)

which implies

$\lim_{k\to\infty} \dfrac{(g_k^T d_{k-1})^2}{\|g_k\|^2 \|d_{k-1}\|^2} = 1$.  (3.25)

Note that

$g_{k-1} = g_k - y_{k-1}$, where $y_{k-1} = g_k - g_{k-1}$,  (3.26)

and

$\liminf_{k\to\infty} \|y_{k-1}\| \le L \liminf_{k\to\infty} \alpha_{k-1} \|d_{k-1}\| = 0$.  (3.27)

(3.25), (3.26) and (3.27) imply that

$\liminf_{k\to\infty} \dfrac{(g_{k-1}^T d_{k-1})^2}{\|g_{k-1}\|^2 \|d_{k-1}\|^2} = 1$.  (3.28)

On the other hand, we have from (2.2), (3.9) and (3.18) that

$\liminf_{k\to\infty} \dfrac{(g_{k-1}^T d_{k-1})^2}{\|g_{k-1}\|^2 \|d_{k-1}\|^2} = \liminf_{k\to\infty} \dfrac{\|d_{k-1}\|^2}{\|g_{k-1}\|^2} = 0$.  (3.29)

(3.28) and (3.29) give a contradiction. The contradiction shows the truth of (3.6).

From Theorem 3.8, one can see that the PRPSR method also converges for strictly convex functions provided that the stepsizes satisfy (1.16)-(1.17).

Corollary 3.9 Let $x_1$ be a starting point for which Assumption 3.3 is satisfied. Consider Algorithm 2.1 with $I = 1$, $b_1 = 1$, $b_2 = 0$ and $\epsilon = 0$. Then if the line search satisfies (1.16)-(1.17), we have (3.6).

Theorem 3.10 Let $x_1$ be a starting point for which Assumptions 3.1 and 3.2 are satisfied. Consider Algorithm 2.1 with $I = 1$, $b_1 = 1$, $b_2 = 0$ and $\epsilon = 0$. Then we have (3.6) if the line search satisfies (1.16) and (1.18).

Proof. We proceed by contradiction and assume that (3.9) holds. At first, we conclude that there must exist a number $c_4$ such that

$\beta_k \|d_{k-1}\| \le c_4$, for all $k \ge 2$.  (3.30)

This is because otherwise there exist a subsequence $\{k_i\}$ and a constant, say also $c_3$, such that

$\beta_{k_i} \|d_{k_i-1}\| \ge c_3$, for all $i \ge 1$.  (3.31)

Then we can prove that (3.23) holds for the subsequence $\{k_i\}$, which together with (3.5) implies that (3.25) still holds. Then, similarly to the proof of Theorem 3.8, we can obtain (3.28) and (3.29), which contradict each other. Therefore (3.30) holds. Since, by (1.18) and (2.2), $(g_k^T d_{k-1})^2 \le \sigma^2 \|d_{k-1}\|^4 \le \tfrac{1}{2}\|g_k\|^2\|d_{k-1}\|^2$ for all $k$ larger than some index $k_1$, we can get from (2.3), (3.9), (3.2) and (3.30) that for all $k \ge k_1$

$\|d_k\|^2 \ge \dfrac{\beta_k^2 \|g_k\|^2 \|d_{k-1}\|^2}{2\,\|g_k + \beta_k d_{k-1}\|^2} \ge \dfrac{\beta_k^2 \|g_k\|^2 \|d_{k-1}\|^2}{4(\|g_k\|^2 + \beta_k^2 \|d_{k-1}\|^2)} \ge c_5 \beta_k^2 \|d_{k-1}\|^2$,  (3.32)

where $c_5 = \gamma^2 / \left( 4(\bar\gamma^2 + c_4^2) \right)$. Taking the inner product of both sides of (1.11) with $d_k$ and applying (2.2), we can get

$\beta_k d_{k-1}^T d_k = \|d_k\|^2$.  (3.33)

Define $u_k = d_k/\|d_k\|$. Then from (3.33), $\beta_k > 0$, (2.3), (3.16) and (3.32), we obtain

$\|u_k - u_{k-1}\|^2 = 2 - \dfrac{2 d_{k-1}^T d_k}{\|d_{k-1}\| \|d_k\|} = 2 - \dfrac{2\|d_k\|}{\beta_k \|d_{k-1}\|} \le \dfrac{2(\beta_k^2\|d_{k-1}\|^2 - \|d_k\|^2)}{\beta_k^2 \|d_{k-1}\|^2} = \dfrac{2(\beta_k \|d_{k-1}\|^2 + g_k^T d_{k-1})^2}{\|d_{k-1}\|^2 \|g_k + \beta_k d_{k-1}\|^2} \le \dfrac{4\left(\beta_k^2 \|d_{k-1}\|^4 + (g_k^T d_{k-1})^2\right)}{\|d_{k-1}\|^2 \|g_k + \beta_k d_{k-1}\|^2} \le \dfrac{4\left(c_5^{-1} \|d_k\|^2 + c_1^2 \|d_{k-1}\|^2\right)}{\|g_k + \beta_k d_{k-1}\|^2}$.  (3.34)

Besides, from (1.18), (2.2) and (3.30),

$\|g_k + \beta_k d_{k-1}\|^2 \ge \|g_k\|^2 + 2\beta_k g_k^T d_{k-1} \ge \|g_k\|^2 - 2\sigma \beta_k \|d_{k-1}\|^2 \ge \|g_k\|^2 - 2\sigma c_4 \|d_{k-1}\|$.  (3.35)

Noting (3.5) and (3.9), we can deduce from (3.34) and (3.35) that

$\sum_{k \ge 2} \|u_k - u_{k-1}\|^2 < \infty$.  (3.36)

Let $|\mathcal{K}_{k,\Delta}|$ denote the number of elements of

$\mathcal{K}_{k,\Delta} = \{ i : k \le i \le k + \Delta - 1,\ \|s_{i-1}\| = \|x_i - x_{i-1}\| > \bar\lambda \}$.  (3.37)

Using (3.36), we conclude that for any $\bar\lambda > 0$ there exist $\Delta \in \mathcal{N}$ and $k_0$ such that

$|\mathcal{K}_{k,\Delta}| \le \dfrac{\Delta}{2}$, for any $k \ge k_0$,  (3.38)

for otherwise a contradiction can be obtained similarly to the proof of Theorem 4.3 in [3]. We choose $\bar b = \gamma/(2\bar\gamma)$ and $\bar\lambda = \gamma c_5 \bar b / L$. Then we have from (3.2) and (3.9) that

$\beta_k \ge \dfrac{\|g_k\|}{\|g_k\| + \|g_{k-1}\|} \ge \dfrac{\gamma}{2\bar\gamma} = \bar b$,  (3.39)

and when $\|s_{k-1}\| \le \bar\lambda$, we have from (3.1) that

$\beta_k \ge \dfrac{\|g_k\|}{L \|s_{k-1}\|} \ge \dfrac{\gamma}{L\bar\lambda} = \dfrac{1}{c_5 \bar b}$.  (3.40)

For this $\bar\lambda$, let $\Delta$ and $k_0$ be so given that (3.38) holds. Denote $q = [(k - \bar k + 1)/2]$ and $t = k - \bar k + 1 - 2q$. It is obvious that $k - \bar k + 1 = 2q + t$ and $0 \le t < 2$. Thus for any $k \ge \bar k = \max\{k_1, k_0\}$, we get from (3.32) and (3.38)-(3.40) that

$\|d_k\|^2 \ge \|d_{\bar k}\|^2 \prod_{i=\bar k+1}^{k} (c_5 \beta_i^2) \ge \|d_{\bar k}\|^2 (c_5 \bar b^2)^{q+t} (c_5 \bar b^2)^{-q} = \|d_{\bar k}\|^2 (c_5 \bar b^2)^{t} \ge \|d_{\bar k}\|^2 \min\{c_5 \bar b^2, (c_5 \bar b^2)^2\}$,  (3.41)

which contradicts (3.18). The contradiction shows the truth of (3.6).

4. A counter-example

In the preceding section, we proved that the FRSR and PRPSR algorithms converge with the Wolfe conditions (1.16)-(1.17) provided that $\{\alpha_k\}$ is uniformly bounded. However, if we do not restrict the size of $\alpha_k$, they need not converge. This will be shown in the following example. The example applies to both algorithms.

Example 4.1 Consider the function

$f(x, y) = 0$ if $(x, y) = (0, 0)$, and otherwise $f(x, y) = \dfrac{1}{4}\varphi\!\left(\dfrac{x}{\sqrt{x^2+y^2}}\right)(x+y)^2 + \dfrac{1}{4}\varphi\!\left(\dfrac{y}{\sqrt{x^2+y^2}}\right)(x-y)^2$,  (4.1)

where $\varphi$ is defined by

$\varphi(t) = 1$ for $|t| > \dfrac{\sqrt{3}}{2}$; $\varphi(t) = \dfrac{1}{2} + \dfrac{1}{2}\sin(b_1 |t| + b_2)$ for $\dfrac{\sqrt{2}}{2} \le |t| \le \dfrac{\sqrt{3}}{2}$; $\varphi(t) = 0$ for $|t| < \dfrac{\sqrt{2}}{2}$.  (4.2)

In (4.2), $b_1 = 2\pi(\sqrt{3} + \sqrt{2})$ and $b_2 = -(5/2 + \sqrt{6})\pi$. It is easy to see that the function $f$ defined by (4.1) and (4.2) satisfies Assumption 3.1. We construct an example such that for all $k \ge 1$,

$x_k = \|x_k\| \left( \cos\theta_{k-1},\ \sin\theta_{k-1} \right)^T$,  (4.3)

$\|x_k\| = \|x_{k-1}\| \tan\!\left(\dfrac{\pi}{4} - \theta_{k-1}\right)$, $\theta_{k-1} \in \left(0, \dfrac{\pi}{8}\right)$,  (4.4)

$g_k = \|x_k\| \left( \cos\!\left(\theta_{k-1} + \dfrac{\pi}{4}\right),\ \sin\!\left(\theta_{k-1} + \dfrac{\pi}{4}\right) \right)^T$,  (4.5)

$d_k = \|g_k\| \sin\theta_k \left( \cos\!\left(\theta_{k-1} + \dfrac{\pi}{4} + \theta_k\right),\ \sin\!\left(\theta_{k-1} + \dfrac{\pi}{4} + \theta_k\right) \right)^T$.  (4.6)

Because we can use, for example, spline fitting, we can choose the starting point and assume that (4.3)-(4.6) hold for $k = 1$. Suppose that (4.3)-(4.6) hold for $k$. From (4.3), (4.6) and the definition of $f$, we can choose $\alpha_k > 0$ such that

$x_{k+1} = \|x_{k+1}\| \left( \cos\theta_k,\ \sin\theta_k \right)^T$,  (4.7)

$\|x_{k+1}\| = \|x_k\| \tan\!\left(\dfrac{\pi}{4} - \theta_k\right)$.  (4.8)

Thus we have that

$g_{k+1} = \|x_{k+1}\| \left( \cos\!\left(\theta_k + \dfrac{\pi}{4}\right),\ \sin\!\left(\theta_k + \dfrac{\pi}{4}\right) \right)^T$.  (4.9)

Since $g_{k+1}$ is orthogonal to $g_k$, we have $g_{k+1}^T g_k = 0$ and hence $\beta_{k+1} = 1$ for both the FRSR and PRPSR algorithms. Defining $\mu_{k+1} = \lambda_{k+1}/(1 - \lambda_{k+1})$, where $\lambda_{k+1}$ is given by (2.1), direct calculations show that

$g_{k+1}^T d_k = \|g_{k+1}\| \|d_k\| \cos\theta_k$,  (4.10)

$\|g_{k+1}\| = \|g_k\| \tan\!\left(\dfrac{\pi}{4} - \theta_k\right)$,  (4.11)

$\|d_k\| = \|g_k\| \sin\theta_k$,  (4.12)

and consequently

$\mu_{k+1} = \dfrac{\|g_{k+1}\|^2 + g_{k+1}^T d_k}{\|d_k\|^2 + g_{k+1}^T d_k} = \dfrac{\tan^2(\frac{\pi}{4}-\theta_k) + \cos\theta_k \tan(\frac{\pi}{4}-\theta_k)\sin\theta_k}{\cos\theta_k \tan(\frac{\pi}{4}-\theta_k)\sin\theta_k + \sin^2\theta_k} = \dfrac{\tan(\frac{\pi}{4}-\theta_k)}{\sin\theta_k}\,\nu_k$,  (4.13)

where

$\nu_k = \cos\theta_k - \sin^3\theta_k + \cos\theta_k \sin^2\theta_k$.  (4.14)

Therefore

$d_{k+1} = (1 - \lambda_{k+1})(-g_{k+1} + \mu_{k+1} d_k) = \eta_k \left[ -\left( \cos\!\left(\theta_k + \tfrac{\pi}{4}\right),\ \sin\!\left(\theta_k + \tfrac{\pi}{4}\right) \right)^T + \nu_k \left( \cos\!\left(\theta_{k-1} + \tfrac{\pi}{4} + \theta_k\right),\ \sin\!\left(\theta_{k-1} + \tfrac{\pi}{4} + \theta_k\right) \right)^T \right]$,  (4.15)

where $\eta_k = (1 - \lambda_{k+1}) \tan(\frac{\pi}{4}-\theta_k)\|g_k\|$. Noting that $\nu_k \ge 1$ for all $\theta_k \in (0, \frac{\pi}{8})$, the above relation indicates that

$\dfrac{d_{k+1}}{\|d_{k+1}\|} = \left( \cos\!\left(\theta_k + \dfrac{\pi}{4} + \theta_{k+1}\right),\ \sin\!\left(\theta_k + \dfrac{\pi}{4} + \theta_{k+1}\right) \right)^T$  (4.16)

with

$-\dfrac{\pi}{4} + \theta_k \le \theta_{k+1} \le \theta_k$.  (4.17)

Because $d_{k+1}^T g_{k+1} < 0$, we have that

$0 < \theta_{k+1} \le \theta_k$.  (4.18)

Again we have $\theta_{k+1} \in (0, \frac{\pi}{8})$. Because $d_{k+1}^T g_{k+1} = -\|d_{k+1}\|^2$, we have

$d_{k+1} = \|g_{k+1}\| \sin\theta_{k+1} \left( \cos\!\left(\theta_k + \dfrac{\pi}{4} + \theta_{k+1}\right),\ \sin\!\left(\theta_k + \dfrac{\pi}{4} + \theta_{k+1}\right) \right)^T$.  (4.19)

By induction, (4.3)-(4.6) hold for all $k \ge 1$. Now we test whether (1.16)-(1.17) hold. First, since $g_{k+1}^T d_k > 0$, (1.17) obviously holds. Further, from (2.2), (4.3) and (4.5), we have

$-\alpha_k g_k^T d_k = \alpha_k \|d_k\|^2 = \|d_k\| \|x_{k+1} - x_k\| = \dfrac{\|g_k\| \|d_k\|}{\cos\theta_k + \sin\theta_k}$.  (4.20)

By the definition of $f$, (4.11), (4.12) and (4.20),

$f_k - f_{k+1} = \dfrac{1}{2}\left(\|g_k\|^2 - \|g_{k+1}\|^2\right) = \dfrac{1}{2}\|g_k\|^2 \left[ 1 - \tan^2\!\left(\dfrac{\pi}{4}-\theta_k\right) \right] = \dfrac{2\cos\theta_k \sin\theta_k}{(\cos\theta_k + \sin\theta_k)^2}\,\|g_k\|^2 = \dfrac{2\cos\theta_k\,\|g_k\|\|d_k\|}{(\cos\theta_k + \sin\theta_k)^2} = -\omega_k \alpha_k g_k^T d_k$,  (4.21)

where

$\omega_k = \dfrac{2\cos\theta_k}{\cos\theta_k + \sin\theta_k}$.  (4.22)

Note that (4.18) implies that

$\theta_k \to 0$, as $k \to \infty$.  (4.23)

The above two relations indicate that for any positive number $\delta < 1$, $\omega_k > \delta$ for large $k$. Therefore, if we choose a suitable starting point, (1.16) and (1.17) hold for any $0 < \delta < \sigma < 1$.

However, from (4.4) and (4.8), one can prove that

$\|x_k\| = \|x_1\| \prod_{i=1}^{k-1} \tan\!\left(\dfrac{\pi}{4} - \theta_i\right) \to c_6 \|x_1\|$, as $k \to \infty$,  (4.24)

where $c_6 = \prod_{i=1}^{\infty} \tan(\frac{\pi}{4} - \theta_i) > 0$. Thus any cluster point of $\{x_k\}$ is a non-stationary point. In this example, we have from (4.20) that

$\alpha_k = \dfrac{\|g_k\|}{(\cos\theta_k + \sin\theta_k)\|d_k\|} = \dfrac{1}{(\cos\theta_k + \sin\theta_k)\sin\theta_k}$,  (4.25)

which, together with (4.23), implies that

$\alpha_k \to \infty$, as $k \to \infty$.  (4.26)

The above example shows that if we only impose (1.16)-(1.17) on every line search, it is possible that $\{\alpha_k\}$ is not bounded, and the FRSR and PRPSR algorithms fail.

5. Numerical results

We tested the FRSR and PRPSR algorithms on SGI Indigo workstations. Our line search subroutine computes $\alpha_k$ such that (1.16) and (1.18) hold for $\delta = 0.01$ and $\sigma = 0.1$. The initial value of $\alpha_k$ is always set to 1. Although in this case the convergence results, i.e., Theorems 3.6 and 3.10, hold for any numbers $0 < b_1 \le 1$ and $0 \le b_2 < 1$, we choose $b_1 = 0.9$ and a small positive $b_2$ to avoid numerical overflows. We compared the numerical results of our algorithms with the Fletcher-Reeves method and the Polak-Ribiere-Polyak method. For the PRP algorithm, we restart it by setting $d_k = -g_k$ whenever a downhill search direction is not produced. We tested the algorithms on the 18 examples given by More, Garbow and Hillstrom [6]. The results are reported in Table 4.1. The column "P" denotes the number of the problem, and "N" the number of variables. The numerical results are given in the form I/F/G, where I, F, G are the numbers of iterations, function evaluations, and gradient evaluations respectively. The stopping condition is

$\|g_k\| \le 10^{-6}$.  (5.1)

The algorithms are also terminated if the number of function evaluations exceeds a prescribed limit. We also terminate the calculation if the function value improvement is too small. More exactly, the algorithms are terminated whenever

$\dfrac{f(x_k) - f(x_{k+1})}{1 + |f(x_k)|} \le 10^{-6}$.  (5.2)
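The termination rules of this section can be collected in a single test, as in the sketch below; it is ours, and the evaluation budget and the tolerance defaults are assumptions for illustration only.

```python
def should_stop(g_norm, f_old, f_new, n_feval,
                g_tol=1e-6, f_tol=1e-6, max_feval=5000):
    # Termination tests as we read Section 5: the gradient test (5.1),
    # a cap on function evaluations (assumed value), and the relative
    # improvement test (5.2).
    if g_norm <= g_tol:                                   # (5.1)
        return True
    if n_feval > max_feval:                               # evaluation budget (assumption)
        return True
    if (f_old - f_new) / (1.0 + abs(f_old)) <= f_tol:     # (5.2)
        return True
    return False
```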

Table 4.1

P N FRSR PRPSR FR PRP
3 90/66/6 53/68/78 06/358/33 6/4/ /768/743 58/395/375 37/86/434 6/65/ /7/5 3/7/5 3/7/5 3/7/5 4 >5000 Failed >5000 Failed 5 3 7/7/63 9/3/6 >5000 3/39/ /6/7 3/6/7 3/6/7 3/5/ /077/899 > /43/456 > //3 30/30/8 8/8/56 Failed 9 3 9/69/46 /9/9 5/40/4 0/8/8 0 Failed 4/56/ Failed Failed 4 36/67/5 Failed Failed 3/5/5 3 Failed 84/78/30 > /687/ >5000 6/5/5 > /06/ /3/6 44/53/9 5/583/40 /04/ /30/899 4/9/65 > /569/47 6 /59/40 9/53/36 3/9/45 9/8/6 7 4 > /69/86 36/4737/306 40/589/ /64/09 47/5/7 63/04/78 8/8/39

In the table, a superscript "*" indicates that the algorithm terminated because (5.2) was satisfied while (5.1) was not, and "Failed" means that $\|d_k\|$ is so small that a numerical overflow occurs while the algorithm tries to compute $f(x_k + \alpha d_k)$. From Table 4.1, we found that the FRSR and PRPSR algorithms perform better than the Fletcher-Reeves and Polak-Ribiere-Polyak algorithms, respectively. Therefore, the SR method is a promising alternative to the standard conjugate gradient method.

It is known that if a very small step is produced by the FR method, then this behavior may continue for a very large number of iterations unless the method is restarted. This behavior was observed and explained by Powell [10] and by Gilbert and Nocedal [3]. In fact, the statement is also true for the FRSR method. Let $\theta_k$ denote the angle between $-g_k$ and $d_k$. It follows from the definition of $\theta_k$ and (2.2) that

$\cos\theta_k = \dfrac{-g_k^T d_k}{\|g_k\| \|d_k\|} = \dfrac{\|d_k\|}{\|g_k\|}$.  (5.3)

Suppose that at the $k$-th iteration a "bad" search direction is generated, such that $\cos\theta_k \approx 0$, and that $x_{k+1} \approx x_k$. Thus $\|g_{k+1}\| \approx \|g_k\|$, and by (5.3),

$\|d_k\| \ll \|g_k\| \approx \|g_{k+1}\|$.  (5.4)

From (2.1), (1.13), (1.18) and this, we see that

$\lambda_{k+1} \approx 1$,  (5.5)

which with (1.11) implies that

$d_{k+1} \approx d_k$.  (5.6)

Hence, by this and (5.4),

$\|d_{k+1}\| \ll \|g_{k+1}\|$,  (5.7)

which together with (5.3) shows that $\cos\theta_{k+1} \approx 0$. The argument can therefore start all over again.

Acknowledgements. The authors are very grateful to the referees for their valuable comments and suggestions.

References

[1] Fletcher, R. and Reeves, C. (1964), Function minimization by conjugate gradients, Comput. J. 7.
[2] Hestenes, M. R. (1980), Conjugate Direction Methods in Optimization, Springer-Verlag, New York, Heidelberg, Berlin.
[3] Gilbert, J. C. and Nocedal, J. (1992), Global convergence properties of conjugate gradient methods for optimization, SIAM J. Optimization 2(1), 21-42.
[4] Lemarechal, C. (1975), An extension of Davidon methods to nondifferentiable problems, Mathematical Programming Study 3.
[5] McCormick, G. P. and Ritter, K. (1975), Alternative proofs of the convergence properties of the conjugate-gradient method, JOTA, Vol. 13, No. 5.
[6] More, J. J., Garbow, B. S. and Hillstrom, K. E. (1981), Testing unconstrained optimization software, ACM Transactions on Mathematical Software 7, 17-41.
[7] Pytlak, R. (1989), On the convergence of conjugate gradient algorithms, IMA Journal of Numerical Analysis 14.
[8] Polak, E. and Ribiere, G. (1969), Note sur la convergence de methodes de directions conjuguees, Rev. Francaise Informat. Recherche Operationnelle, 3e Annee 16.
[9] Polyak, B. T. (1969), The conjugate gradient method in extremal problems, USSR Comp. Math. and Math. Phys. 9, 94-112.
[10] Powell, M. J. D. (1977), Restart procedures for the conjugate gradient method, Math. Programming 12.
[11] Powell, M. J. D. (1984), Nonconvex minimization calculations and the conjugate gradient method, in: Lecture Notes in Mathematics, Vol. 1066, Springer-Verlag, Berlin, pp. 122-141.
[12] Wolfe, P. (1969), Convergence conditions for ascent methods, SIAM Review 11.
[13] Wolfe, P. (1971), Convergence conditions for ascent methods. II: Some corrections, SIAM Review 13.
[14] Wolfe, P. (1975), A method of conjugate subgradients for minimizing nondifferentiable functions, Mathematical Programming Study 3.
[15] Zoutendijk, G. (1970), Nonlinear programming, computational methods, in: Integer and Nonlinear Programming (J. Abadie, ed.), North-Holland, Amsterdam.


More information

Determination of Feasible Directions by Successive Quadratic Programming and Zoutendijk Algorithms: A Comparative Study

Determination of Feasible Directions by Successive Quadratic Programming and Zoutendijk Algorithms: A Comparative Study International Journal of Mathematics And Its Applications Vol.2 No.4 (2014), pp.47-56. ISSN: 2347-1557(online) Determination of Feasible Directions by Successive Quadratic Programming and Zoutendijk Algorithms:

More information

The Conjugate Gradient Method

The Conjugate Gradient Method The Conjugate Gradient Method Lecture 5, Continuous Optimisation Oxford University Computing Laboratory, HT 2006 Notes by Dr Raphael Hauser (hauser@comlab.ox.ac.uk) The notion of complexity (per iteration)

More information

QUADRATICALLY AND SUPERLINEARLY CONVERGENT ALGORITHMS FOR THE SOLUTION OF INEQUALITY CONSTRAINED MINIMIZATION PROBLEMS 1

QUADRATICALLY AND SUPERLINEARLY CONVERGENT ALGORITHMS FOR THE SOLUTION OF INEQUALITY CONSTRAINED MINIMIZATION PROBLEMS 1 QUADRATICALLY AND SUPERLINEARLY CONVERGENT ALGORITHMS FOR THE SOLUTION OF INEQUALITY CONSTRAINED MINIMIZATION PROBLEMS 1 F. FACCHINEI 2 AND S. LUCIDI 3 Communicated by L.C.W. Dixon 1 This research was

More information

N.G.Bean, D.A.Green and P.G.Taylor. University of Adelaide. Adelaide. Abstract. process of an MMPP/M/1 queue is not a MAP unless the queue is a

N.G.Bean, D.A.Green and P.G.Taylor. University of Adelaide. Adelaide. Abstract. process of an MMPP/M/1 queue is not a MAP unless the queue is a WHEN IS A MAP POISSON N.G.Bean, D.A.Green and P.G.Taylor Department of Applied Mathematics University of Adelaide Adelaide 55 Abstract In a recent paper, Olivier and Walrand (994) claimed that the departure

More information

Stochastic Optimization Algorithms Beyond SG

Stochastic Optimization Algorithms Beyond SG Stochastic Optimization Algorithms Beyond SG Frank E. Curtis 1, Lehigh University involving joint work with Léon Bottou, Facebook AI Research Jorge Nocedal, Northwestern University Optimization Methods

More information

5 Overview of algorithms for unconstrained optimization

5 Overview of algorithms for unconstrained optimization IOE 59: NLP, Winter 22 c Marina A. Epelman 9 5 Overview of algorithms for unconstrained optimization 5. General optimization algorithm Recall: we are attempting to solve the problem (P) min f(x) s.t. x

More information

Optimization Methods. Lecture 18: Optimality Conditions and. Gradient Methods. for Unconstrained Optimization

Optimization Methods. Lecture 18: Optimality Conditions and. Gradient Methods. for Unconstrained Optimization 5.93 Optimization Methods Lecture 8: Optimality Conditions and Gradient Methods for Unconstrained Optimization Outline. Necessary and sucient optimality conditions Slide. Gradient m e t h o d s 3. The

More information

FINE TUNING NESTEROV S STEEPEST DESCENT ALGORITHM FOR DIFFERENTIABLE CONVEX PROGRAMMING. 1. Introduction. We study the nonlinear programming problem

FINE TUNING NESTEROV S STEEPEST DESCENT ALGORITHM FOR DIFFERENTIABLE CONVEX PROGRAMMING. 1. Introduction. We study the nonlinear programming problem FINE TUNING NESTEROV S STEEPEST DESCENT ALGORITHM FOR DIFFERENTIABLE CONVEX PROGRAMMING CLÓVIS C. GONZAGA AND ELIZABETH W. KARAS Abstract. We modify the first order algorithm for convex programming described

More information

Spectral gradient projection method for solving nonlinear monotone equations

Spectral gradient projection method for solving nonlinear monotone equations Journal of Computational and Applied Mathematics 196 (2006) 478 484 www.elsevier.com/locate/cam Spectral gradient projection method for solving nonlinear monotone equations Li Zhang, Weijun Zhou Department

More information