New Nonsmooth Trust Region Method for Unconstrained Locally Lipschitz Optimization Problems

Z. Akbari 1, R. Yousefpour 2, M. R. Peyghami 3

1 Department of Mathematics, K.N. Toosi University of Technology, Tehran, Iran; z_akbari@dena.kntu.ac.ir
2 Department of Mathematical Sciences, University of Mazandaran, Babolsar, Iran; yousefpour@umz.ac.ir
3 Department of Mathematics, K.N. Toosi University of Technology, Tehran, Iran; peyghami@kntu.ac.ir

Abstract

In this paper, a local model is presented for locally Lipschitz functions. The model is built from an approximation of the steepest descent direction, which is the minimal-norm element of the ε-subdifferential. In effect, the gradient in the classical quadratic model is replaced by this approximate steepest descent direction, and the classical trust region method is applied to the resulting model. We prove that the algorithm converges provided the positive definite matrices in the model remain bounded; these matrices are updated at each iteration by the BFGS formula. Finally, the algorithm is implemented in MATLAB code.

Keywords: Trust region, Lipschitz functions, Local model, Steihaug method

Introduction

Nonsmooth unconstrained minimization problems arise in many applications. Even in the smooth setting, exact penalty and Lagrangian functions lead to nonsmooth optimization problems, and such problems also appear in optimal control. Efficient methods for solving them have therefore received considerable attention. The trust region (TR) method is an iterative method in which the objective function is approximated by a local model that is trusted within an adequate region; at each iteration, the model, rather than the objective function itself, is reduced over this region.

If f : R^n → R is continuously differentiable, then the local model is defined as

    m(x_k, B_k)(p) = f(x_k) + ∇f(x_k)^T p + (1/2) p^T B_k p,    (1)

where B_k is a suitably chosen matrix. If f is twice continuously differentiable, then B_k can be taken as the Hessian matrix; in other methods, B_k is updated by quasi-Newton formulas. For general locally Lipschitz functions, however, a local model that can be implemented in practice has not been available.

In this paper, we use the steepest descent direction to construct the local model. For locally Lipschitz functions, the steepest descent direction is the minimal-norm element of the Goldstein ε-subdifferential. Several bundle algorithms have been developed around methods that approximate this direction [1-6], and their efficiency depends on the accuracy of the approximation. To improve an algorithm's efficiency, a larger number of subgradients must be computed in order to approximate the Goldstein subdifferential accurately, which is time consuming. For example, in [6] the steepest descent direction is approximated by sampling gradients; this approximation is suitable, but it is very expensive for large-scale problems. In [4], the steepest descent direction is approximated iteratively, and a good approximation is obtained with fewer subgradients. Numerical results showed that this algorithm is more efficient than other bundle methods.

Using an approximation of the steepest descent direction, we propose a quadratic model for locally Lipschitz functions. To solve the trust region subproblem approximately, we combine the Cauchy point and the CG-Steihaug method [7]; the numerical results show that the TR algorithm behaves better with this combination (a sketch of the Steihaug solver is given below). We implement the algorithm in MATLAB and compare its efficiency with that of other methods.
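For reference, the following is a minimal MATLAB sketch of the standard CG-Steihaug (truncated conjugate gradient) method of [7] for the subproblem min_p v^T p + (1/2) p^T B p subject to ‖p‖ ≤ Δ. The function name, tolerance, and iteration limit are illustrative, and the exact way the Cauchy point is combined with this solver in our implementation is not shown here.

    function p = cg_steihaug(v, B, Delta, tol, maxit)
    % Approximately solve  min  v'*p + 0.5*p'*B*p  s.t.  norm(p) <= Delta
    % by the CG-Steihaug (truncated conjugate gradient) method [7].
        n = numel(v);
        p = zeros(n, 1);
        r = v;                     % gradient of the model at p = 0
        d = -r;                    % first search direction
        if norm(r) < tol
            return;                % v is (numerically) zero: take the zero step
        end
        for j = 1:maxit
            Bd  = B*d;
            dBd = d'*Bd;
            if dBd <= 0
                p = boundary_step(p, d, Delta);   % negative curvature: stop on the boundary
                return;
            end
            alpha = (r'*r)/dBd;
            pnew  = p + alpha*d;
            if norm(pnew) >= Delta
                p = boundary_step(p, d, Delta);   % step leaves the region: stop on the boundary
                return;
            end
            rnew = r + alpha*Bd;
            if norm(rnew) < tol
                p = pnew;                         % model gradient small enough: accept
                return;
            end
            beta = (rnew'*rnew)/(r'*r);
            d = -rnew + beta*d;
            p = pnew;  r = rnew;
        end
    end

    function p = boundary_step(p, d, Delta)
    % Return p + tau*d with tau >= 0 such that norm(p + tau*d) = Delta.
        a = d'*d;  b = 2*(p'*d);  c = p'*p - Delta^2;
        tau = (-b + sqrt(b^2 - 4*a*c))/(2*a);
        p = p + tau*d;
    end

In Algorithm 1 below, p_k is obtained from such a solver with v = v_k, B = B_k and Δ = Δ_k.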

The nonsmooth trust region algorithm and its convergence

In [8], a local model for locally Lipschitz functions is given as

    m(x, p) = f(x) + φ(x, p) + (1/2) p^T B p.    (2)

Under some assumptions on φ(·,·), global convergence of the TR method was proved. The authors proposed the function φ(x, p) = max_{v ∈ ∂f(x)} ⟨v, p⟩, but with this choice minimizing the local model is impractical. In this paper, we give another local model for locally Lipschitz functions.

To construct the local model, we replace the gradient in (1) by a suitable element of ∂_ε f(x). Let ε > 0. The steepest descent direction is computed using ∂_ε f(x): consider

    v_0 = argmin_{v ∈ ∂_ε f(x)} ‖v‖,    (3)

and let d_0 = -v_0/‖v_0‖. By Lebourg's mean value theorem, there exists ξ ∈ ∂_ε f(x) such that

    f(x + ε d_0) - f(x) = ε ξ^T d_0 ≤ -ε v_0^T v_0/‖v_0‖ = -ε ‖v_0‖,

where the inequality uses the fact that ⟨ξ, v_0⟩ ≥ ‖v_0‖^2 for every ξ ∈ ∂_ε f(x), since v_0 is the minimal-norm element of this convex set. In fact, d_0 is the steepest descent direction. Solving (3) exactly is impractical, however, so ∂_ε f(x) is approximated by the convex hull of a finite subset: if W ⊂ ∂_ε f(x), then conv W is taken as an approximation of ∂_ε f(x). Consider

    v_w = argmin_{v ∈ conv W} ‖v‖,

and let d = -v_w/‖v_w‖. If f(x + ε d) - f(x) ≤ -c ε ‖v_w‖ for some c ∈ (0, 1), then d can serve as an approximation of a steepest descent direction; otherwise, the approximation of ∂_ε f(x) is improved by adding a new element of ∂_ε f(x) to W. How such a subset is constructed is described in [4].

Suppose that W_k ⊂ ∂_ε f(x_k) and that conv W_k is an approximation of ∂_ε f(x_k). We consider

    v_k = argmin_{v ∈ conv W_k} ‖v‖,

and suppose that f(x_k - ε v_k/‖v_k‖) - f(x_k) ≤ -c ε ‖v_k‖, where c ∈ (0, 1). An algorithm for finding W_k and v_k is presented in [4]. Based on this subgradient, v_k ∈ ∂_ε f(x_k), we define the following quadratic model:

    m(x_k, p) = f(x_k) + v_k^T p + (1/2) p^T B_k p,

where B_k is a positive definite matrix.
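Computing v_k is a small quadratic program: collecting the elements of W_k as the columns of a matrix W and writing v = Wλ with λ ≥ 0 and Σ_i λ_i = 1, one minimizes ‖Wλ‖^2. The following is a minimal MATLAB sketch using quadprog (requires the Optimization Toolbox); the solver choice is an assumption made here for illustration, since [4] uses its own procedure for this step.

    function [v, lambda] = min_norm_element(W)
    % Minimal-norm element of conv{columns of W}:
    %   min 0.5*lambda'*(W'*W)*lambda   s.t.  lambda >= 0, sum(lambda) = 1.
    % W is an n-by-m matrix whose columns are the collected subgradients.
        m = size(W, 2);
        H = W'*W;                       % Gram matrix of the subgradients
        f = zeros(m, 1);
        Aeq = ones(1, m);  beq = 1;     % convex-combination constraint
        lb = zeros(m, 1);
        opts = optimoptions('quadprog', 'Display', 'off');
        lambda = quadprog(H, f, [], [], Aeq, beq, lb, [], [], opts);
        v = W*lambda;                   % minimal-norm element; d = -v/norm(v)
    end

The search direction is then d = -v/‖v‖, and the descent test f(x_k + εd) - f(x_k) ≤ -cε‖v‖ decides whether d is accepted or W_k must be enlarged.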

Based on this quadratic model, the trust region method is presented as follows.

Algorithm 1 (The nonsmooth trust region algorithm)

Step 0: Choose Δ_0, Δ_1 > 0; θ, δ_1, θ_δ ∈ (0, 1); x_1 ∈ R^n; ξ_1 ∈ ∂f(x_1); c_1, c_2, c_3 ∈ (0, 1); c_4 > 1; set B_1 = I and k = 1.

Step 1: Apply Algorithm 2 of [4] at the point x_k with parameters ε = Δ_k, δ = δ_k and c = c_1. Suppose it returns a proper approximation conv W_k of ∂_ε f(x_k) and an adequate subgradient v_k such that v_k = argmin_{v ∈ conv W_k} ‖v‖.

Step 2: If ‖v_k‖ = 0, then stop. Else, if ‖v_k‖ ≤ δ_k, then set Δ_{k+1} = θ Δ_k, δ_{k+1} = θ_δ δ_k, x_{k+1} = x_k, k = k + 1 and go to Step 1. Otherwise set δ_{k+1} = δ_k and go to Step 3.

Step 3: Solve the quadratic subproblem

    min_{p ∈ R^n}  m(x_k, p) = f(x_k) + v_k^T p + (1/2) p^T B_k p
    s.t.  ‖p‖ ≤ Δ_k,

and let p_k be its solution.

Step 4: If f(x_k + p_k) - f(x_k) ≤ c_1 v_k^T p_k, then set x_{k+1} = x_k + p_k and go to Step 5; else set Δ_{k+1} = θ Δ_k, x_{k+1} = x_k, k = k + 1 and go to Step 1.

Step 5: Define the ratio

    ρ_k = (f(x_k + p_k) - f(x_k)) / (m(x_k, p_k) - m(x_k, 0)).

If ρ_k ≥ c_3 and ‖p_k‖ = Δ_k, then set Δ_{k+1} = min{Δ_0, c_4 Δ_k}; if ρ_k ≤ c_2, then set Δ_{k+1} = θ Δ_k; otherwise set Δ_{k+1} = Δ_k.

Step 6: Select a subgradient ξ_{k+1} ∈ ∂f(x_{k+1}) and update B_k by the BFGS formula. Set k = k + 1 and go to Step 1.
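For concreteness, the radius update of Step 5 and the BFGS update of Step 6 can be sketched in MATLAB as follows; here Delta_max plays the role of Δ_0. The tolerance used to detect a boundary step and the skip rule that protects positive definiteness are our assumptions, since the paper only requires the matrices B_k to remain bounded and positive definite.

    function Delta_next = update_radius(rho, norm_p, Delta, Delta_max, theta, c2, c3, c4)
    % Step 5 of Algorithm 1: enlarge the radius after a very successful step
    % that hits the boundary, shrink it when the model predicts poorly,
    % otherwise keep it unchanged.
        if rho >= c3 && abs(norm_p - Delta) <= 1e-12*max(1, Delta)
            Delta_next = min(Delta_max, c4*Delta);
        elseif rho <= c2
            Delta_next = theta*Delta;
        else
            Delta_next = Delta;
        end
    end

    function B = bfgs_update(B, s, y)
    % Step 6 of Algorithm 1: BFGS update of the model matrix B with
    % s = x_{k+1} - x_k and y = xi_{k+1} - xi_k (difference of subgradients).
    % The update is skipped when the curvature condition s'*y > 0 fails,
    % so that B stays positive definite (a safeguard assumed here).
        sy = s'*y;
        if sy > 1e-10*norm(s)*norm(y)
            Bs = B*s;
            B  = B - (Bs*Bs')/(s'*Bs) + (y*y')/sy;
        end
    end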

The following theorem establishes the convergence of the algorithm.

Theorem 1. Let f : R^n → R be a locally Lipschitz function. If the level set L := {x : f(x) ≤ f(x_1)} is bounded, then either Algorithm 1 terminates finitely at some k_0 with v_{k_0} = 0, or the sequence {x_k} generated by Algorithm 1 is convergent. If x* = lim_{k→∞} x_k, then 0 ∈ ∂f(x*).

REFERENCES

1. A. A. Goldstein. Optimization of Lipschitz continuous functions, Mathematical Programming, 13:14–22, 1977.
2. D. P. Bertsekas and S. K. Mitter. A descent numerical method for optimization problems with nondifferentiable cost functionals, SIAM Journal on Control, 11:637–652, 1973.
3. M. Gaudioso and M. F. Monaco. A bundle type approach to the unconstrained minimization of convex nonsmooth functions, Mathematical Programming, 23(2):216–226, 1982.
4. N. Mahdavi-Amiri and R. Yousefpour. An effective nonsmooth optimization algorithm for locally Lipschitz functions, Journal of Optimization Theory and Applications (accepted).
5. P. Wolfe. A method of conjugate subgradients for minimizing nondifferentiable functions, in: Nondifferentiable Optimization, M. Balinski and P. Wolfe, eds., Mathematical Programming Study 3, North-Holland, Amsterdam, pp. 145–173, 1975.
6. J. V. Burke, A. S. Lewis, and M. L. Overton. A robust gradient sampling algorithm for nonsmooth, nonconvex optimization, SIAM Journal on Optimization, 15:751–779, 2005.
7. J. Nocedal and S. J. Wright. Numerical Optimization, Springer, 1999.
8. L. Qi and J. Sun. A trust region algorithm for minimization of locally Lipschitzian functions, Mathematical Programming, 66:25–43, 1994.