Complexity Analysis of Interior Point Algorithms for Non-Lipschitz and Nonconvex Minimization

Mathematical Programming manuscript No. (will be inserted by the editor)

Complexity Analysis of Interior Point Algorithms for Non-Lipschitz and Nonconvex Minimization

Wei Bian · Xiaojun Chen · Yinyu Ye

July 5, 2012. Received: 8 March 2013; 8 October 2013 / Accepted: date

Abstract We propose a first order interior point algorithm for a class of non-Lipschitz and nonconvex minimization problems with box constraints, which arise from applications in variable selection and regularized optimization. The objective functions of these problems are continuously differentiable typically at interior points of the feasible set. Our first order algorithm is easy to implement and the objective function value is reduced monotonically along the iteration points. We show that the worst-case iteration complexity for finding an ε scaled first order stationary point is O(ε^{-2}). Furthermore, we develop a second order interior point algorithm using the Hessian matrix, and solve a quadratic program with a ball constraint at each iteration. Although the second order interior point algorithm costs more computational time than the first order algorithm in each iteration, its worst-case iteration complexity for finding an ε scaled second order stationary point is reduced to O(ε^{-3/2}). Note that an ε scaled second order stationary point must also be an ε scaled first order stationary point.

Keywords constrained non-Lipschitz optimization · complexity analysis · interior point method · first order algorithm · second order algorithm

This work was supported partly by Hong Kong Research Council Grant PolyU5003/10p, The Hong Kong Polytechnic University Postdoctoral Fellowship Scheme, the NSF foundation of China, and a US AFOSR grant.

Wei Bian, Department of Mathematics, Harbin Institute of Technology, Harbin, China. bianweilvse50@163.com
Xiaojun Chen, Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong, China. maxjchen@polyu.edu.hk
Yinyu Ye, Department of Management Science and Engineering, Stanford University, Stanford, CA. yinyu-ye@stanford.edu

Mathematics Subject Classification (2000) 90C30 · 90C26 · 65K05 · 49M37

1 Introduction

In this paper, we consider the following optimization problem:

min  f(x) = H(x) + λ Σ_{i=1}^n φ(x_i^p)
s.t. x ∈ Ω = {x : 0 ≤ x ≤ b},   (1)

where H : R^n → R is continuously differentiable, φ : [0, +∞) → [0, +∞) is continuous and concave, λ > 0, 0 < p < 1, b = (b_1, b_2, ..., b_n)^T with b_i ∈ (0, +∞) ∪ {+∞}, and 0 ≤ x ≤ b means that x_i ∈ [0, b_i] if b_i < +∞ and x_i ≥ 0 otherwise, i = 1, 2, ..., n. Moreover, φ is continuously differentiable in (0, +∞) and φ(0) = 0. Without loss of generality, we assume that a minimizer of (1) exists and min_Ω f(x) ≥ 0.

Problem (1) is nonsmooth, nonconvex and non-Lipschitz, and has been used extensively in image restoration, signal processing and variable selection; see, e.g., [2,10,14,15,19,20,24,31]. The function H(x) is often used as a data fitting term, while the function Σ_{i=1}^n φ(x_i^p) is used as a regularization term. The feasible set Ω of (1) includes R^n_+ = {x : x ≥ 0} as a special case. Moreover, the unconstrained problem

min_{x ∈ R^n}  H(x) + λ Σ_{i=1}^n φ(|x_i|^p),   (2)

where the non-Lipschitz points are in the interior of the feasible region, can be equivalently reformulated as the constrained problem

min  H(x⁺ − x⁻) + λ Σ_{i=1}^n φ((x_i⁺)^p) + λ Σ_{i=1}^n φ((x_i⁻)^p)
s.t. x⁺ ≥ 0, x⁻ ≥ 0   (3)

by using the variable splitting x = x⁺ − x⁻. In Section 3, we show that the scaled first and second order stationary points of problems (2) and (3) are in one-to-one correspondence. The advantage of (3) is that the nonsmooth points are only at the boundary of the feasible set, which allows us to use the gradient and Hessian of the objective function in interior point methods.

Numerical algorithms for nonconvex optimization problems have been studied extensively. However, little theoretical complexity or convergence speed analysis of such algorithms is known, in contrast to the complexity study of convex optimization in the past thirty years. We now review a few results on complexity analysis of nonconvex optimization problems.

Smooth, nonconvex. Using an interior point algorithm, Ye [29] proved that an ε KKT or first order stationary point of a general quadratic program

min ½ xᵀQx + cᵀx  s.t. Ax = q, x ≥ 0   (4)

can be computed in O(ε^{-1} log ε^{-1}) iterations, where each iteration solves a ball-constrained or trust-region quadratic program that is equivalent to a simple convex minimization problem. Here Q ∈ R^{n×n}, c ∈ R^n, A ∈ R^{m×n}, q ∈ R^m. Ye [29] also proved that, as ε → 0, the iterative sequence converges to a point satisfying the second order necessary optimality condition; more precisely, the least eigenvalue of Q on the null space of all active constraints is greater than −ε. For general unconstrained nonconvex optimization, it was shown in [17,21] that the standard steepest descent method with line search or trust region can find an ε first order stationary point in O(ε^{-2}) iterations. In [22], Nesterov and Polyak showed that a Newton-type method based on cubic regularization requires at most O(ε^{-3/2}) iterations to find an ε first order stationary point. Instead of using the Hessian and its global Lipschitz constant, Cartis, Gould and Toint [4-6] further proposed an adaptive regularization with cubics (ARC) method in which approximations of the Hessian and its Lipschitz constant are updated at each iteration. They also showed that the ARC method takes at most O(ε^{-3/2}) iterations to find an ε first order stationary point. Then, an algorithm with one-dimensional global optimization of the cubic model is given and the sharpness of the complexity bound O(ε^{-3/2}) is derived in [7].

Lipschitz continuous, nonconvex. Cartis, Gould and Toint [3] estimated the worst-case complexity of a first order trust-region or quadratic regularization method for solving the following unconstrained nonsmooth, nonconvex minimization problem

min Φ_h(x) := H(x) + h(c(x)),   (5)

where h : R^m → R is convex but may be nonsmooth and c : R^n → R^m is continuously differentiable. Their method takes at most O(ε^{-2}) iterations to reduce the size of a first order criticality measure below ε, which is of the same order as the worst-case complexity of steepest-descent methods applied to unconstrained smooth nonconvex optimization. Garmanjani and Vicente [15] proposed a class of smoothing direct-search methods for general unconstrained nonsmooth optimization by applying a direct-search method to a smoothing function f̃ of the objective function f [8]. Such an approach can be considered a zero order method because only function values are used. When f is locally Lipschitz, the smoothing direct-search method [15] takes at most O(ε^{-3} log ε^{-1}) iterations to find an x such that ‖∇f̃(x, μ)‖ ≤ ε and μ ≤ ε, where μ is the smoothing parameter. When μ ↓ 0, f̃(x, μ) → f(x) and ∇f̃(x, μ) → v with v ∈ ∂f(x).

Non-Lipschitz, nonconvex. Ge, Jiang and Ye [16] extended the complexity result of [29] to the following concave non-Lipschitz minimization

min Σ_{i=1}^n x_i^p  s.t. Ax = q, x ≥ 0   (6)

and showed that finding an ε scaled first order stationary point or global minimizer requires at most O(ε^{-1} log ε^{-1}) iterations. Recently, Bian and Chen [1] proposed a smoothing quadratic regularization (SQR) algorithm for solving the unconstrained non-Lipschitz minimization problem (2). At each iteration, the SQR algorithm solves a strongly convex quadratic minimization problem with a diagonal Hessian matrix, which has a simple closed form solution. The SQR algorithm is easy to implement and its worst-case complexity for reaching an ε scaled first order stationary point is O(ε^{-2}). To overcome the nonsmoothness of the second term in the objective function of (2), smoothing methods were used in the SQR algorithm.

In this paper, we propose a first order interior point method and a second order interior point method for solving the constrained non-Lipschitz nonconvex optimization problem (1), using the smoothness of f in {x : 0 < x ≤ b} and keeping all iterates in this set. The former uses the gradient of f to derive a quadratic overestimation and has worst-case complexity O(ε^{-2}) for finding an ε scaled first order stationary point of (1). The latter uses the gradient and the Hessian to derive a cubic overestimation and has worst-case iteration complexity O(ε^{-3/2}) for finding an ε scaled second order stationary point of the following special version of (1):

min_{x ≥ 0}  H(x) + λ Σ_{i=1}^n x_i^p.   (7)

To the best of our knowledge, these two methods are the first methods with state of the art iteration complexity bounds for constrained, non-Lipschitz and nonconvex optimization. In particular, the second order interior point algorithm is the first method to find an ε scaled second order stationary point or an ε global minimizer of the non-Lipschitz, nonconvex optimization problem (7) in no more than O(ε^{-3/2}) iterations. The above results are summarized in Table 1.

Our original goal was to produce the O(ε^{-1} log ε^{-1}) worst-case complexity bound for problem (1), as it was established for problems (4) in [29] and (6) in [16]. But we failed even when H(x) is quadratic and we leave this as an open problem. In developing the O(ε^{-1} log ε^{-1}) bound, potential reduction techniques were used, where the quadratic objective or the concavity of Σ_{i=1}^n x_i^p plays a key role. In either case, a quadratic overestimation of the potential function can be constructed and minimized, which makes the analysis considerably simpler and the convergence rate better. However, when the two objectives in (4) and (6) are added together, which is a special case of (1), namely

min_{x ≥ 0} ½ xᵀQx + cᵀx + λ Σ_{i=1}^n x_i^p,

a quadratic overestimation of the potential function is no longer achievable.

Table 1: Worst-case complexity results for nonconvex optimization

                       Smooth                                  Lipschitz continuous       Non-Lipschitz
O(ε^{-1} log ε^{-1})   [Ye 1998] (4)                                                      [Ge et al 2011] (6)
O(ε^{-3/2})            [Nesterov et al 2006]; [Cartis et al 2011]                         this paper for (7)
O(ε^{-2})              [Nesterov 2004]; [Gratton et al 2008]   [Cartis et al 2011] (5)    [Bian et al 2012] (2); this paper for (1)
O(ε^{-3} log ε^{-1})                                           [Garmanjani et al 2012]

Our paper is organized as follows. In Section 2, a first order interior point algorithm is proposed for solving (1), which only uses ∇f and a Lipschitz constant of ∇H on Ω and is easy to implement. Every iterate x^k > 0 belongs to Ω and the objective function is monotonically decreasing along the generated sequence {x^k}. Moreover, the algorithm produces an ε scaled first order stationary point of (1) in at most O(ε^{-2}) iterations. In Section 3, a second order interior point algorithm is given to solve (7), which can generate an ε scaled second order stationary point in at most O(ε^{-3/2}) iterations. Since our problem has constraints, the ε scaled first order and second order stationary points resemble the complementarity conditions for inequality constrained optimization problems. In Section 4, we present numerical results to illustrate the efficiency of the algorithms and the complexity bound.

Throughout this paper, K = {0, 1, 2, ...}, I = {1, 2, ..., n}, I_b = {i ∈ {1, 2, ..., n} : b_i < +∞} and e_n = (1, 1, ..., 1)^T ∈ R^n. For x ∈ R^n, A = (a_{ij}) ∈ R^{m×n} and p > 0, ‖x‖ = ‖x‖_2, diag(x) = diag(x_1, x_2, ..., x_n), and |A|^p = (|a_{ij}|^p)_{m×n}. For two matrices A, B ∈ R^{n×n}, we write A ⪰ B when A − B is positive semi-definite.

2 First Order Method

In this section, we propose a first order interior point algorithm for solving (1), which uses a first order reduction technique and keeps all iterates x^k > 0 in the feasible set Ω. We show that the objective function value f(x^k) is monotonically decreasing along the sequence {x^k} generated by the algorithm, and that the worst-case complexity of the algorithm for generating an ε global minimizer or ε scaled first order stationary point of (1) is O(ε^{-2}), which is of the same order as the worst-case complexity of steepest-descent methods for smooth nonconvex optimization problems. Moreover, it is worth noting that the proposed first order interior point algorithm is easy to implement, and the computational cost of each iteration is small.

Throughout this section, we need the following assumptions.

Assumption 2.1: ∇H is globally Lipschitz on Ω with a Lipschitz constant β ≥ 1. In particular, when I_b ≠ ∅ we choose β such that β ≥ max_{i ∈ I_b} b_i^{-1}.

Assumption 2.2: For any given x⁰ ∈ Ω, there is R ≥ 1 such that sup{‖x‖ : f(x) ≤ f(x⁰), x ∈ Ω} ≤ R.

When H(x) = ½‖Ax − q‖², Assumption 2.1 holds with β = ‖AᵀA‖. Assumption 2.2 holds if Ω is bounded, or if H(x) ≥ 0 for all x ∈ Ω and φ(s) → ∞ as s → ∞.

2.1 First Order Necessary Condition

Note that for problem (1), when 0 < p < 1, the Clarke generalized gradient of φ(s^p) does not exist at 0. Inspired by the first and second order necessary conditions for local minimizers of unconstrained non-Lipschitz optimization in [11,12], we give the scaled first and second order necessary conditions for local minimizers of the constrained non-Lipschitz optimization problem (1) in this section and the next section, respectively. Then, for any ε ∈ (0, 1], the definitions of ε scaled first order and second order stationary points of (1) can be deduced directly.

First, for ε > 0, an ε global minimizer of (1) is defined as a feasible solution 0 ≤ x_ε ≤ b with f(x_ε) − min_{0 ≤ x ≤ b} f(x) ≤ ε. It has been proved in [9] that finding a global minimizer of the unconstrained l₂-l_p minimization problem is strongly NP hard.

For any fixed x ∈ R^n, denote X = diag(x). Any local minimizer x of the unconstrained l₂-l_p optimization modeled by (2) with H(x) = ½‖Ax − q‖² and φ(s) = s satisfies the first order necessary condition [12]

X Aᵀ(Ax − q) + λp |x|^p = 0.

Similarly, using X as a scaling matrix, if x is a local minimizer of (1), then x ∈ Ω satisfies the first order necessary condition

[X∇f(x)]_i = x_i [∇H(x)]_i + λp φ'(s)|_{s = x_i^p} x_i^p = 0   if x_i < b_i;   (8a)
[∇f(x)]_i = [∇H(x)]_i + λp φ'(s)|_{s = x_i^p} x_i^{p−1} ≤ 0   if x_i = b_i.   (8b)

For (1), if x at which f is differentiable satisfies the first order necessary condition given above, then x ∈ Ω satisfies (i) [∇f(x)]_i = 0 if x_i < b_i; (ii) [∇f(x)]_i ≤ 0 if x_i = b_i. Moreover, although [∇f(x)]_i does not exist when x_i = 0, one can see that, as x_i → 0+, [∇f(x)]_i → +∞. Now we can define an ε scaled first order stationary point of (1).

Definition 1 For a given ε ∈ (0, 1], we call x ∈ Ω an ε scaled first order stationary point of (1) if
(i) |[X∇f(x)]_i| ≤ ε if x_i < (1 − ½ε) b_i;
(ii) [∇f(x)]_i ≤ ε if x_i ≥ (1 − ½ε) b_i.
Definition 1 is consistent with the first order necessary conditions in (8a)-(8b) with ε = 0.

2.2 First Order Interior Point Algorithm

Note that for any x, x⁺ ∈ (0, b], Assumption 2.1 implies that

H(x⁺) ≤ H(x) + ⟨∇H(x), x⁺ − x⟩ + (β/2)‖x⁺ − x‖².

Since φ is concave on [0, +∞), for any s, t ∈ (0, +∞),

φ(t) ≤ φ(s) + φ'(s)(t − s).

Thus, for any x, x⁺ ∈ (0, b], we obtain

f(x⁺) ≤ f(x) + ⟨∇f(x), x⁺ − x⟩ + (β/2)‖x⁺ − x‖².

Let x⁺ = x + X d_x. We obtain

f(x⁺) ≤ f(x) + ⟨X∇f(x), d_x⟩ + (β/2)‖X d_x‖².   (9)

To achieve a reduction of the objective function at each iteration, when x > 0 we minimize a quadratic function subject to a box constraint:

min ⟨X∇f(x), d_x⟩ + (β/2) d_xᵀ X² d_x
s.t. −½ e_n ≤ d_x ≤ X^{-1}(b − x).   (10)

For any fixed x ∈ (0, b], the objective function in (10) is strongly convex and separable in the components of d_x, and thus the unique solution of (10) has the closed form

d_x = P_{D_x}[−(1/β) X^{-1}∇f(x)],

where D_x = [−½ e_n, X^{-1}(b − x)] and P_{D_x} is the orthogonal projection operator onto the box D_x. Denote X_k = diag(x^k) and use d^k and D_k to denote d_{x^k} and D_{x^k}, respectively.
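Before stating the algorithm formally, the following minimal MATLAB sketch illustrates this closed-form projected step and the resulting iteration (11a)-(11b) given below; the gradient handle gradf, the bound vector b (entries may be Inf) and the constant β of Assumption 2.1 are assumed to be supplied by the user, and the stopping test is only a simplified surrogate for Definition 1.

    % Minimal sketch of the first order interior point iteration (11a)-(11b),
    % assuming gradf(x) returns the gradient of f at an interior point x > 0.
    function x = foipa(x0, b, gradf, beta, epsilon, maxit)
      x = x0;                                   % requires 0 < x0 <= b
      for k = 1:maxit
        g = gradf(x);
        if norm(x .* g, inf) <= epsilon         % simplified scaled first order test
          break;
        end
        d = -(1/beta) * (g ./ x);               % unconstrained minimizer of (10)
        d = min(max(d, -0.5), (b - x) ./ x);    % project onto D_x = [-e_n/2, X^{-1}(b-x)]
        x = x + x .* d;                         % (11b): x^{k+1} = x^k + X_k d^k, stays in (0,b]
      end
    end

Each iteration costs one gradient evaluation plus O(n) arithmetic, consistent with the small per-iteration cost claimed above.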

First Order Interior Point Algorithm
Given ε ∈ (0, 1], choose x⁰ ∈ (0, b]. For k ≥ 0, set

d^k = P_{D_k}[−(1/β) X_k^{-1}∇f(x^k)],   (11a)
x^{k+1} = x^k + X_k d^k.   (11b)

The first order interior point algorithm presented in the current paper is significantly different from the classical interior point methods [28]. The current method is based on a simple gradient projection into the interior of the nonnegative orthant, while the classical one follows the central path of the logarithmic barrier function via the Newton method.

Lemma 1 The proposed First Order Interior Point Algorithm is well defined, that is, 0 < x^k ≤ b for all k ∈ K.

Proof We only need to prove that if 0 < x^k ≤ b, then 0 < x^{k+1} ≤ b. On the one hand, by d^k ≤ X_k^{-1}(b − x^k), we have x^{k+1} = x^k + X_k d^k ≤ x^k + (b − x^k) = b. On the other hand, using d^k ≥ −½ e_n, we obtain x^{k+1} = x^k + X_k d^k ≥ x^k − ½ x^k = ½ x^k > 0. Hence, 0 < x^{k+1} ≤ b.

Lemma 2 Let {x^k} be the sequence generated by the First Order Interior Point Algorithm. Then the sequence {f(x^k)} is monotonically decreasing and satisfies

f(x^{k+1}) − f(x^k) ≤ −(β/2)‖X_k d^k‖² = −(β/2)‖x^{k+1} − x^k‖².

Moreover, we have ‖x^k‖ ≤ R.

Proof From the KKT conditions of (10), the solution d^k of the quadratic program (10) satisfies the following necessary and sufficient conditions:

βX_k² d^k + X_k∇f(x^k) − μ^k + ν^k = 0,
M_k(d^k + ½ e_n) = 0,
N_k(d^k − X_k^{-1}(b − x^k)) = 0,
−½ e_n ≤ d^k ≤ X_k^{-1}(b − x^k),   (12)

where M_k = diag(μ^k) and N_k = diag(ν^k) with μ^k, ν^k ≥ 0.

Moreover, from (9), we obtain

f(x^{k+1}) − f(x^k) ≤ ⟨X_k∇f(x^k), d^k⟩ + (β/2)‖X_k d^k‖²
 = ⟨−βX_k² d^k + μ^k − ν^k, d^k⟩ + (β/2)‖X_k d^k‖²
 = −(β/2)‖X_k d^k‖² + (μ^k)ᵀ d^k − (ν^k)ᵀ d^k   (13)
 = −(β/2)‖X_k d^k‖² − ½ (μ^k)ᵀ e_n − (ν^k)ᵀ X_k^{-1}(b − x^k).

By μ^k, ν^k ≥ 0 and 0 < x^k ≤ b, we obtain the inequality given in this lemma, which implies that f(x^k) ≤ f(x⁰), k ∈ K. From Assumption 2.2, we obtain ‖x^k‖ ≤ R.

Different from some other potential reduction methods, the objective function is monotonically decreasing along the sequence generated by the First Order Interior Point Algorithm.

Theorem 1 For any ε ∈ (0, 1], the First Order Interior Point Algorithm obtains an ε scaled first order stationary point or ε global minimizer of (1) in no more than O(ε^{-2}) iterations.

Proof Let {x^k} be the sequence generated by the proposed First Order Interior Point Algorithm. Then x^k ∈ (0, b] and ‖x^k‖ ≤ R, k ∈ K. Without loss of generality, we suppose that R ≥ 1. In the following, we consider four cases.

Case 1: ‖X_k d^k‖ ≥ ε/(4βR). From Lemma 2, we obtain

f(x^{k+1}) − f(x^k) ≤ −(β/2)(ε/(4Rβ))² = −ε²/(32R²β).

Case 2: (μ^k)ᵀ e_n ≥ ε²/(4β). From (13), we get

f(x^{k+1}) − f(x^k) ≤ −ε²/(8β).

Case 3: (ν^k)ᵀ X_k^{-1}(b − x^k) ≥ ¼ε². From (13), we get

f(x^{k+1}) − f(x^k) ≤ −¼ε².

Case 4: ‖X_k d^k‖ < ε/(4βR), (μ^k)ᵀ e_n < ε²/(4β) and (ν^k)ᵀ X_k^{-1}(b − x^k) < ¼ε². From the first condition in (12), we obtain

βX_k d^k + ∇f(x^k) − X_k^{-1}μ^k + X_k^{-1}ν^k = 0,   (14)

which implies that

X_k∇f(x^k) = −βX_k² d^k + μ^k − ν^k.   (15)

Then, we obtain

|[X_k∇f(x^k)]_i| ≤ β‖X_k‖‖X_k d^k‖ + ‖μ^k‖ + [ν^k]_i ≤ ½ε + [ν^k]_i.   (16)

If i ∉ I_b, then [ν^k]_i = 0 and we have |[X_k∇f(x^k)]_i| ≤ ½ε ≤ ε. Fix i ∈ I_b. We consider two subcases.

Case 4.1: x_i^k < b_i − (b_i/2)ε. Then (b_i − x_i^k)/x_i^k ≥ (b_i ε/2)/b_i = ε/2, which gives [ν^k]_i ≤ ε/2 from (ν^k)ᵀX_k^{-1}(b − x^k) < ¼ε². Then (16) gives |[X_k∇f(x^k)]_i| ≤ ε.

Case 4.2: x_i^k ≥ b_i − (b_i/2)ε. Then x_i^k ≥ b_i/2. By ‖μ^k‖ ≤ ε²/(4β), we have [X_k^{-1}μ^k]_i ≤ ε²/(2βb_i) ≤ ½ε, using β ≥ b_i^{-1} from Assumption 2.1. Since [X_k^{-1}ν^k]_i ≥ 0, we obtain from (14)

[∇f(x^k)]_i = [−βX_k d^k + X_k^{-1}μ^k − X_k^{-1}ν^k]_i ≤ β‖X_k d^k‖ + [X_k^{-1}μ^k]_i ≤ ¼ε + ½ε ≤ ε.

Therefore, from the analysis in Case 4, x^k is an ε scaled first order stationary point of (1). Based on the analysis in Cases 1-3, at least one of the following two facts holds at the k-th iteration: (i) f(x^{k+1}) − f(x^k) ≤ −ε²/(32R²β); (ii) x^k is an ε scaled first order stationary point of (1). Therefore, we produce an ε global minimizer or an ε scaled first order stationary point of (1) in at most 32f(x⁰)R²βε^{-2} iterations.

Remark 1 When λ = 0 or p = 1 in (1), the KKT conditions of (1) can be written as

[∇f(x)]_i ≥ 0 if x_i = 0,  [∇f(x)]_i = 0 if 0 < x_i < b_i,  [∇f(x)]_i ≤ 0 if x_i = b_i,   (17)

which are sufficient but not necessary for the conditions in (8). Denote ε̄ = min{1, min_{i ∈ I_b} b_i}. Similarly to the analysis in Theorem 1, from (14), for ε ∈ (0, ε̄], if ‖X_k d^k‖ < ε/(4βR), (μ^k)ᵀ e_n < ε²/(4β) and (ν^k)ᵀ X_k^{-1}(b − x^k) < ¼ε², we obtain the following estimates:

[∇f(x^k)]_i ≥ −½ε if 0 ≤ x_i^k ≤ ε,
|[∇f(x^k)]_i| ≤ ε if ε < x_i^k < (1 − ½ε)b_i,
[∇f(x^k)]_i ≤ ε if x_i^k ≥ (1 − ½ε)b_i.   (18)

When ε = 0, the conditions in (18) are consistent with the KKT conditions of (1) in (17). Thus, the First Order Interior Point Algorithm can be extended to (1) with λ = 0 or p = 1 for finding an ε KKT point with worst-case complexity O(ε^{-2}), which recovers the complexity bound of the steepest descent method for constrained nonconvex optimization.

3 Second Order Interior Point Algorithm

In this section, we consider problem (7), a special case of (1) with Ω = {x : x ≥ 0}, under Assumption 2.2 and the following assumption on H.

Assumption 3.1: H is twice continuously differentiable and ∇²H is globally Lipschitz on Ω with Lipschitz constant γ.

Using the cubic overestimation idea [4-6,18,22,23], a second order interior point algorithm is proposed for solving (7), which uses the Hessian of H. We show that the worst-case complexity of the second order interior point algorithm for finding an ε scaled second order stationary point of (7) is O(ε^{-3/2}). Compared with the first order interior point algorithm proposed in Section 2, the worst-case complexity of the second order interior point algorithm is better and the generated point satisfies stronger optimality conditions. However, a quadratic program with a ball constraint has to be solved at each iteration.

3.1 Second Order Necessary Condition for (7)

Based on the scaled second order necessary condition for unconstrained non-Lipschitz optimization in [11,12], we know that if x is a local minimizer of (7), then x ∈ Ω satisfies

X∇²f(x)X = X∇²H(x)X + λp(p − 1)X^p ⪰ 0.   (19)

If x > 0, then f is twice differentiable at x and (19) implies that ∇²f(x) ⪰ 0. Now we give the definition of an ε scaled second order stationary point of (7) as follows.

Definition 2 For a given ε ∈ (0, 1], we call x ∈ Ω an ε scaled second order stationary point of (7) if

‖X∇f(x)‖ ≤ ε  and  X∇²f(x)X ⪰ −√ε I_n.

Definition 2 is consistent with the scaled first and second order necessary conditions given above when ε = 0.
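As a concrete reading of Definition 2, the following hypothetical MATLAB fragment tests the two conditions at an interior point x > 0; gradf and hessf are assumed user-supplied handles returning ∇f(x) and ∇²f(x), which exist on the open orthant.

    % Hypothetical check of Definition 2 at a point x > 0.
    X = diag(x);
    cond1 = norm(X * gradf(x)) <= epsilon;                 % ||X grad f(x)|| <= epsilon
    cond2 = min(eig(X * hessf(x) * X)) >= -sqrt(epsilon);  % X hess f(x) X + sqrt(epsilon) I psd
    is_eps_scaled_2nd_order = cond1 && cond2;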

3.2 Second Order Interior Point Algorithm

From the cubic overestimation idea, for any x, x⁺ ∈ Ω, Assumption 3.1 implies that

H(x⁺) ≤ H(x) + ⟨∇H(x), x⁺ − x⟩ + ½⟨∇²H(x)(x⁺ − x), x⁺ − x⟩ + (γ/6)‖x⁺ − x‖³.

Similarly, for any t, s > 0,

t^p ≤ s^p + p s^{p−1}(t − s) + (p(p−1)/2) s^{p−2}(t − s)² + (p(p−1)(p−2)/6) s^{p−3}(t − s)³.

Thus, for any x, x⁺ > 0, we obtain

f(x⁺) ≤ f(x) + ⟨∇f(x), x⁺ − x⟩ + ½⟨∇²f(x)(x⁺ − x), x⁺ − x⟩ + (γ/6)‖x⁺ − x‖³ + λ(p(p−1)(p−2)/6) Σ_{i=1}^n x_i^{p−3}(x_i⁺ − x_i)³,

which, with x⁺ = x + X d_x, can also be expressed as

f(x⁺) ≤ f(x) + ⟨X∇f(x), d_x⟩ + ½⟨X∇²f(x)X d_x, d_x⟩ + (γ/6)‖X d_x‖³ + λ(p(p−1)(p−2)/6) Σ_{i=1}^n x_i^p [d_x]_i³.   (20)

Since p(1 − p) ≤ ¼, we have p(1 − p)(2 − p) ≤ ½. Combining this estimate with Σ_{i=1}^n |[d_x]_i|³ ≤ ‖d_x‖³, if ‖x‖ ≤ R, then (20) implies

f(x⁺) ≤ f(x) + ⟨X∇f(x), d_x⟩ + ½⟨X∇²f(x)X d_x, d_x⟩ + (1/6)(γR³ + ½λR^p)‖d_x‖³.   (21)

At x > 0, we minimize a quadratic function subject to a ball constraint to achieve a sufficient reduction. For a given ε ∈ (0, 1], we solve the following problem:

min q(d_x) = ⟨X∇f(x), d_x⟩ + ½⟨X∇²f(x)X d_x, d_x⟩
s.t. ‖d_x‖ ≤ ϑ√ε,   (22)

where ϑ = ½ min{1/(γR³ + ½λR^p), 1} and R is the constant in Assumption 2.2. To solve (22), we consider two cases.

Case 1: X∇²f(x)X is positive semi-definite.

From the KKT conditions of (22), the solution d_x of (22) satisfies the following necessary and sufficient conditions:

X∇²f(x)X d_x + X∇f(x) + ρ_x d_x = 0,   (23a)
ρ_x ≥ 0, ‖d_x‖ ≤ ϑ√ε, ρ_x(‖d_x‖ − ϑ√ε) = 0.   (23b)

In this case, (22) is a convex quadratic program with a ball constraint, which can be solved effectively in polynomial time; see [27] and the references therein.

Case 2: X∇²f(x)X has at least one negative eigenvalue.
From [26], the solution d_x of (22) satisfies the following necessary and sufficient conditions:

X∇²f(x)X d_x + X∇f(x) + ρ_x d_x = 0,   (24a)
‖d_x‖ = ϑ√ε, X∇²f(x)X + ρ_x I_n is positive semi-definite.   (24b)

In this case, (22) is a nonconvex quadratic program with a ball constraint. However, by the results in [29,30], (24) can be solved effectively in polynomial time with worst-case complexity O(log(log(ε^{-1}))). The double log is established via a globally convergent Newton method: in the first phase the method selects a point from at most log log(ε^{-1}) candidate points using the bisection method, and it is then proved that, starting from the selected point, the quadratic convergence of the Newton method is guaranteed.

From (23) and (24), if d_x solves (22), then

q(d_x) = ⟨X∇f(x), d_x⟩ + ½⟨X∇²f(x)X d_x, d_x⟩
 = ⟨−X∇²f(x)X d_x − ρ_x d_x, d_x⟩ + ½⟨X∇²f(x)X d_x, d_x⟩   (25)
 = −ρ_x‖d_x‖² − ½ d_xᵀ X∇²f(x)X d_x.

Since X∇²f(x)X + ρ_x I_n is always positive semi-definite, d_xᵀX∇²f(x)Xd_x ≥ −ρ_x‖d_x‖², and thus q(d_x) ≤ −½ρ_x‖d_x‖². Therefore, from (21), we have

f(x⁺) ≤ f(x) − ½ρ_x‖d_x‖² + (1/6)(γR³ + ½λR^p)‖d_x‖³.   (26)

Second Order Interior Point Algorithm
Choose x⁰ ∈ int(Ω) and ε ∈ (0, 1]. For k ≥ 0,

solve (22) with x = x^k for d^k,   (27a)
x^{k+1} = x^k + X_k d^k.   (27b)
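Subproblem (22) is a trust-region (ball-constrained) quadratic program, which the analysis above solves by the polynomial-time schemes of [26,27,29,30]. Purely as an illustration, the following hypothetical dense-matrix MATLAB sketch computes a step from conditions (23)-(24) via the eigendecomposition of the scaled Hessian and a bisection on the multiplier ρ; it ignores the so-called hard case and is not the algorithm referenced in the text.

    % Illustrative solver for min <g,d> + 0.5*d'*B*d s.t. ||d|| <= Delta,
    % with g = X*grad f(x), B = X*hess f(x)*X, Delta = vartheta*sqrt(epsilon).
    function d = ball_qp(g, B, Delta)
      [V, L] = eig((B + B')/2);  lam = diag(L);
      dfun = @(rho) -(V' * g) ./ (lam + rho);    % step in the eigenbasis
      nfun = @(rho) norm(dfun(rho));
      rho_lo = max(0, -min(lam)) + 1e-12;
      if min(lam) >= 0 && nfun(rho_lo) <= Delta  % interior solution of a convex subproblem
        d = V * dfun(rho_lo);  return;
      end
      rho_hi = rho_lo + norm(g)/Delta + 1;       % ||d(rho)|| <= Delta once rho is large enough
      while nfun(rho_hi) > Delta, rho_hi = 2*rho_hi; end
      for it = 1:60                              % bisection on ||d(rho)|| = Delta
        rho = (rho_lo + rho_hi)/2;
        if nfun(rho) > Delta, rho_lo = rho; else rho_hi = rho; end
      end
      d = V * dfun(rho_hi);                      % feasible step; hard case not handled
    end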

The Second Order Interior Point Algorithm is related to the classical interior point methods [28]. The computational work of each iteration is identical to that of the algorithm in [29]. However, in [29] the objective is a quadratic overestimation of the Karmarkar-type potential function, while in the current paper the objective is just the quadratic Taylor expansion of the original objective function. The main differences are: 1) the current paper deals with a more general objective function, while [29] deals only with quadratic functions, which is why different convergence rates are established; for quadratic functions there are no third-order errors, so the analysis is considerably simpler and a better convergence rate was achieved in [29] (in fact, it is a surprise to us that we are able to establish the current convergence rate for such general objectives); 2) [29] needs an a priori known lower bound for the objective function in order to construct a valid potential function, while the current method does not need such information.

From the definition of ϑ in (22), ‖d^k‖ ≤ ½; similarly to the analysis in Lemma 1, the Second Order Interior Point Algorithm is also well defined. Let {x^k} be the sequence generated by it; then x^k > 0. In what follows, we prove that there is κ > 0 such that either f(x^{k+1}) − f(x^k) ≤ −κε^{3/2} or x^{k+1} is an ε scaled second order stationary point of (7).

Lemma 3 If ρ_k > (2/(9ϑ))‖d^k‖ holds for all k ∈ K, then the Second Order Interior Point Algorithm produces an ε global minimizer of (7) in at most O(ε^{-3/2}) iterations. Moreover, ‖x^k‖ ≤ R, k ∈ K.

Proof If ρ_k > (2/(9ϑ))‖d^k‖, then ρ_k > (4/9)(γR³ + ½λR^p)‖d^k‖, so that

(1/6)(γR³ + ½λR^p)‖d^k‖ < (3/8)ρ_k,   (28)

and from (23b) and (24b), we obtain ‖d^k‖ = ϑ√ε. From (26) and (28), we have

f(x^{k+1}) − f(x^k) ≤ −½ρ_k‖d^k‖² + (1/6)(γR³ + ½λR^p)‖d^k‖³ < −½ρ_k‖d^k‖² + (3/8)ρ_k‖d^k‖² = −(1/8)ρ_k‖d^k‖²,

which means f(x^{k+1}) < f(x^k). If this always holds, then {f(x^k)} is strictly monotonically decreasing and, by Assumption 2.2, ‖x^k‖ ≤ R, k ∈ K. Moreover,

(1/8)ρ_k‖d^k‖² > (1/8)(2/(9ϑ))‖d^k‖³ = (1/36)ϑ²ε^{3/2},

which implies that we produce an ε global minimizer of (7) in at most 36f(x⁰)ϑ^{-2}ε^{-3/2} iterations.

In what follows, we prove that x^{k+1} is an ε scaled second order stationary point of (7) when ρ_k ≤ (2/(9ϑ))‖d^k‖ for some k.

Lemma 4 If there is k ∈ K such that ρ_k ≤ (2/(9ϑ))‖d^k‖, then x^{k+1} satisfies ‖X_{k+1}∇f(x^{k+1})‖ ≤ ε.

Proof From (23a) and (24a), the following relation always holds:

−ρ_k d^k = X_k∇²f(x^k)X_k d^k + X_k∇f(x^k)
 = X_k∇²H(x^k)X_k d^k + λp(p−1)X_k^{p−2}X_k·X_k d^k·X_k^{-1}·X_k + X_k∇H(x^k) + λpX_k(x^k)^{p−1}
 = X_k(∇²H(x^k)X_k d^k + λp(p−1)X_k^{p−2}X_k d^k + ∇H(x^k) + λp(x^k)^{p−1}),

which implies

∇²H(x^k)X_k d^k + λp(p−1)X_k^{p−2}X_k d^k + ∇H(x^k) + λp(x^k)^{p−1} + ρ_k X_k^{-1}d^k = 0.

Thus, there is τ ∈ [0, 1] such that

∇H(x^{k+1}) + λp(x^{k+1})^{p−1}
 = ∇H(x^{k+1}) + λp(x^{k+1})^{p−1} − ∇²H(x^k)X_k d^k − λp(p−1)X_k^{p−2}X_k d^k − ∇H(x^k) − λp(x^k)^{p−1} − ρ_k X_k^{-1}d^k
 = [∇²H(τx^k + (1−τ)x^{k+1}) − ∇²H(x^k)]X_k d^k + λp(x^{k+1})^{p−1} − λp(p−1)X_k^{p−2}X_k d^k − λp(x^k)^{p−1} − ρ_k X_k^{-1}d^k.

Therefore, we have

‖X_{k+1}(∇H(x^{k+1}) + λp(x^{k+1})^{p−1})‖
 ≤ ‖X_{k+1}[∇²H(τx^k + (1−τ)x^{k+1}) − ∇²H(x^k)]X_k d^k‖
 + ‖X_{k+1}(λp(x^{k+1})^{p−1} − λp(p−1)X_k^{p−2}X_k d^k − λp(x^k)^{p−1})‖ + ρ_k‖X_{k+1}X_k^{-1}d^k‖.   (29)

From Assumption 3.1, we estimate the first term on the right-hand side of (29) as follows:

‖X_{k+1}[∇²H(τx^k + (1−τ)x^{k+1}) − ∇²H(x^k)]X_k d^k‖ ≤ γ(1−τ)‖X_{k+1}‖‖X_k d^k‖² ≤ γ‖X_{k+1}‖‖X_k d^k‖² ≤ γR³‖d^k‖².   (30)

From X_k D_k = X_{k+1} − X_k with D_k = diag(d^k), we have

X_k^{-1}X_{k+1} = D_k + I_n,   (31)

which implies X_k^{-1}x^{k+1} = d^k + e_n.

Then, we consider the second term in (29):

X_{k+1}(λp(x^{k+1})^{p−1} − λp(p−1)X_k^{p−2}X_k d^k − λp(x^k)^{p−1})
 = λpX_k^p(X_k^{-p}(x^{k+1})^p + (1−p)X_k^{-1}X_{k+1}d^k − X_k^{-1}x^{k+1})
 = λpX_k^p((d^k + e_n)^p + (1−p)(D_k + I_n)d^k − (d^k + e_n))
 = λpX_k^p((d^k + e_n)^p + (1−p)(d^k)² − pd^k − e_n),   (32)

where powers of vectors are taken componentwise. Using the Taylor expansion 1 + pt − (p(1−p)/2)t² ≤ (1 + t)^p ≤ 1 + pt, we obtain

e_n + pd^k − (p(1−p)/2)(d^k)² ≤ (d^k + e_n)^p ≤ e_n + pd^k.   (33)

Adding (1−p)(d^k)² − pd^k − e_n to (33), we have

0 ≤ (d^k + e_n)^p + (1−p)(d^k)² − pd^k − e_n ≤ (1−p)(d^k)².

Thus,

‖(d^k + e_n)^p + (1−p)(d^k)² − pd^k − e_n‖ ≤ (1−p)‖d^k‖².   (34)

Then, from (32) and (34), we get

‖X_{k+1}(λp(x^{k+1})^{p−1} − λp(p−1)X_k^{p−2}X_k d^k − λp(x^k)^{p−1})‖ ≤ λp(1−p)‖X_k^p‖‖d^k‖² ≤ ½λR^p‖d^k‖².   (35)

From ‖d^k‖ ≤ ½ and (31), we have

ρ_k‖X_{k+1}X_k^{-1}d^k‖ ≤ ρ_k‖d^k‖(1 + ‖d^k‖) ≤ (3/2)ρ_k‖d^k‖.   (36)

Therefore, from (29), (30), (35) and (36), we obtain

‖X_{k+1}(∇H(x^{k+1}) + λp(x^{k+1})^{p−1})‖ ≤ (γR³ + ½λR^p)‖d^k‖² + (3/2)ρ_k‖d^k‖ ≤ (ϑ/2)ε + (ϑ/3)ε ≤ ε.

Lemma 5 Under the assumptions in Lemma 4, x^{k+1} satisfies

X_{k+1}∇²f(x^{k+1})X_{k+1} ⪰ −√ε I_n.

Proof From (23b) and (24b), we know that X_k∇²f(x^k)X_k + ρ_k I_n is positive semi-definite. Then,

∇²H(x^k) + λp(p−1)X_k^{p−2} ⪰ −ρ_k X_k^{-2}.   (37)

From Assumption 3.1, we obtain

‖∇²H(x^{k+1}) − ∇²H(x^k)‖ ≤ γ‖x^k − x^{k+1}‖ = γ‖X_k d^k‖ ≤ γR‖d^k‖.   (38)

Since ∇²H(x^{k+1}) and ∇²H(x^k) are symmetric, (38) gives

∇²H(x^{k+1}) ⪰ ∇²H(x^k) − γR‖d^k‖I_n.   (39)

Adding (37) and (39), we get

∇²H(x^{k+1}) ⪰ −λp(p−1)X_k^{p−2} − ρ_k X_k^{-2} − γR‖d^k‖I_n.   (40)

Multiplying (40) on both sides by X_{k+1} and adding λp(p−1)X_{k+1}^p to both sides, we obtain

X_{k+1}∇²H(x^{k+1})X_{k+1} + λp(p−1)X_{k+1}^p ⪰ λp(p−1)X_{k+1}^p − λp(p−1)X_{k+1}X_k^{p−2}X_{k+1} − ρ_k X_{k+1}X_k^{-2}X_{k+1} − γR‖d^k‖X_{k+1}².   (41)

Using (31) again, we get

ρ_k X_{k+1}X_k^{-2}X_{k+1} = ρ_k(D_k + I_n)² ⪯ (9/4)ρ_k I_n.   (42)

On the other hand, using (31), we have

X_{k+1}^p − X_{k+1}X_k^{p−2}X_{k+1} = X_{k+1}^p − X_{k+1}²X_k^{p−2} = X_{k+1}^p(I_n − X_{k+1}^{2−p}X_k^{p−2}) = X_{k+1}^p(I_n − (I_n + D_k)^{2−p}).   (43)

From the Taylor expansion (1 + t)^{2−p} ≥ 1 + (2−p)t − ((2−p)(1−p)/2)t², we get

(I_n + D_k)^{2−p} ⪰ I_n + (2−p)D_k − ((2−p)(1−p)/2)D_k².   (44)

Applying (44), D_k ⪰ −½I_n and 0 < p < 1 to (43), we derive

λp(1−p)(X_{k+1}^p − X_{k+1}X_k^{p−2}X_{k+1}) = λp(1−p)X_{k+1}^p(I_n − (I_n + D_k)^{2−p})
 ⪯ λp(1−p)X_{k+1}^p(−(2−p)D_k + ((2−p)(1−p)/2)D_k²)
 ⪯ λp(1−p)‖x^{k+1}‖^p((2−p)‖d^k‖ + ((2−p)(1−p)/2)‖d^k‖²)I_n
 ⪯ λp(1−p)(2−p)R^p(1 + (1−p)/2)‖d^k‖I_n
 = (p(1−p)(2−p)(3−p)/2)λR^p‖d^k‖I_n ⪯ ½λR^p‖d^k‖I_n.   (45)

From (41), (42), (43), (45), and ρ_k ≤ (2/(9ϑ))‖d^k‖, we obtain

X_{k+1}∇²f(x^{k+1})X_{k+1} = X_{k+1}∇²H(x^{k+1})X_{k+1} + λp(p−1)X_{k+1}^p
 ⪰ −((9/4)ρ_k + γR³‖d^k‖ + ½λR^p‖d^k‖)I_n
 ⪰ −(1/(2ϑ) + γR³ + ½λR^p)‖d^k‖I_n
 ⪰ −(1/ϑ)‖d^k‖I_n ⪰ −√ε I_n.

According to Lemmas 3-5, we obtain the worst-case complexity of the Second Order Interior Point Algorithm for finding an ε scaled second order stationary point of (7).

Theorem 2 For any ε ∈ (0, 1], the proposed Second Order Interior Point Algorithm obtains an ε scaled second order stationary point or ε global minimizer of (7) in no more than O(ε^{-3/2}) iterations.

Remark 2 When H(x) = ½‖Ax − q‖² in (7), then γ = 0 and ϑ in (22) becomes ϑ = 1/max{2, λR^p}. The proposed Second Order Interior Point Algorithm obtains an ε scaled second order stationary point or ε global minimizer of (7) in no more than 36f(x⁰)max{4, λ²R^{2p}}ε^{-3/2} iterations.

Remark 3 Though the worst-case complexity result in Theorem 2 can be extended to (7) with λ = 0 or p = 1, the complexity bound is for finding a scaled second order stationary point of (7). When λ = 0 or p = 1, the general second order optimality condition of (7) is defined as x ≥ 0, ∇f(x) ≥ 0, xᵀ∇f(x) = 0 and ∇²H(x) ⪰ 0; accordingly, we say that x satisfies the ε second order optimality condition of (7) if

x ≥ 0, ∇f(x) ≥ −εe_n, ‖X∇f(x)‖ ≤ ε and ∇²H(x) ⪰ −√ε I_n.   (46)

Due to the second inequality in (46) and the scaling in the second inequality of Definition 2, the conditions given in Definition 2 are necessary but not sufficient for the ε second order optimality conditions in (46). Moreover, the second order interior point algorithm and the complexity bound given in this section can be extended to

min_{x ≥ 0} H(x) + λ Σ_{i=1}^n φ(x_i^p),

where φ is twice continuously differentiable in (0, +∞) and φ'(t) > 0, for instance, φ₂ and φ₃ in Section 5.

We end this section by showing that our interior point algorithms can be applied to solve the unconstrained problem (2). In particular, we show that the scaled first and second order stationary points of (2) and (3) are in one-to-one correspondence. Denote

F^n = {x : x is a scaled first order stationary point of (2)},
F^n_+ = {(x⁺, x⁻) : (x⁺, x⁻) is a scaled first order stationary point of (3)},
S^n = {x : x is a scaled second order stationary point of (2)},
S^n_+ = {(x⁺, x⁻) : (x⁺, x⁻) is a scaled second order stationary point of (3)}.

Theorem 3 (i) Suppose φ'(s) > 0 for s > 0. Then
(x⁺, x⁻) ∈ F^n_+ ⟹ x ∈ F^n with x = x⁺ − x⁻;
x ∈ F^n ⟹ (x⁺, x⁻) ∈ F^n_+ with x⁺ = max(0, x) and x⁻ = max(0, −x).   (47)
(ii) Suppose φ(s) = s and H is twice continuously differentiable. Then
(x⁺, x⁻) ∈ S^n_+ ⟹ x ∈ S^n with x = x⁺ − x⁻;
x ∈ S^n ⟹ (x⁺, x⁻) ∈ S^n_+ with (47).

Proof (i) Suppose (x⁺, x⁻) ∈ F^n_+; then x⁺, x⁻ ≥ 0 and

X⁺∇H(x⁺ − x⁻) + λp[φ'(s)|_{s=(x_i⁺)^p}(x_i⁺)^p]_{i=1}^n = 0,
−X⁻∇H(x⁺ − x⁻) + λp[φ'(s)|_{s=(x_i⁻)^p}(x_i⁻)^p]_{i=1}^n = 0,   (48)

where X⁺ = diag(x⁺) and X⁻ = diag(x⁻). Thus,

X⁻X⁺∇H(x⁺ − x⁻) + λp[φ'(s)|_{s=(x_i⁺)^p}(x_i⁺)^p x_i⁻]_{i=1}^n = 0,
−X⁺X⁻∇H(x⁺ − x⁻) + λp[φ'(s)|_{s=(x_i⁻)^p}(x_i⁻)^p x_i⁺]_{i=1}^n = 0,

which, by adding, gives

φ'(s)|_{s=(x_i⁺)^p} x_i⁻(x_i⁺)^p + φ'(s)|_{s=(x_i⁻)^p} x_i⁺(x_i⁻)^p = 0, i ∈ I.

By φ'(s) > 0 for s > 0 and the above equation, we obtain x_i⁺ x_i⁻ = 0, i ∈ I. Thus, adding the two equations in (48) gives

(X⁺ − X⁻)∇H(x⁺ − x⁻) + λp[φ'(s)|_{s=|x_i⁺−x_i⁻|^p}|x_i⁺ − x_i⁻|^p]_{i=1}^n = 0,

which means that x = x⁺ − x⁻ ∈ F^n. On the other hand, suppose x ∈ F^n; then (x⁺)ᵀx⁻ = 0, where x⁺ and x⁻ are of the form in (47). Thus, (48) holds, which implies (x⁺, x⁻) ∈ F^n_+.

(ii) Suppose (x⁺, x⁻) ∈ S^n_+ and denote x = x⁺ − x⁻. Then

[X⁺ 0; 0 X⁻][∇²H(x) −∇²H(x); −∇²H(x) ∇²H(x)][X⁺ 0; 0 X⁻] + λp(p−1)[(X⁺)^p 0; 0 (X⁻)^p] ⪰ 0,   (49)

which, pre- and post-multiplying by (I_n, I_n) and (I_n, I_n)ᵀ, gives

(X⁺ − X⁻)∇²H(x)(X⁺ − X⁻) + λp(p−1)(X⁺)^p + λp(p−1)(X⁻)^p ⪰ 0.   (50)

From (X⁺)^p + (X⁻)^p ⪰ |X⁺ − X⁻|^p, λp(p−1) < 0 and (50), we obtain

(X⁺ − X⁻)∇²H(x)(X⁺ − X⁻) + λp(p−1)|X⁺ − X⁻|^p ⪰ 0,   (51)

which implies that x ∈ S^n with x = x⁺ − x⁻. On the other hand, suppose x ∈ S^n and (x⁺, x⁻) is of the form (47). Then (51) holds, which gives

[D_n⁺; D_n⁻](X⁺ − X⁻)∇²H(x)(X⁺ − X⁻)[D_n⁺, D_n⁻] + λp(p−1)[D_n⁺; D_n⁻]|X⁺ − X⁻|^p[D_n⁺, D_n⁻] ⪰ 0,   (52)

where D_n⁺ = diag(sign(x⁺)) and D_n⁻ = diag(sign(x⁻)). By the definitions in (47), we obtain

D_n⁺(X⁺ − X⁻) = X⁺, D_n⁻(X⁺ − X⁻) = −X⁻,
D_n⁺|X⁺ − X⁻|^p D_n⁻ = D_n⁻|X⁺ − X⁻|^p D_n⁺ = 0_{n×n},
D_n⁺|X⁺ − X⁻|^p D_n⁺ = (X⁺)^p, D_n⁻|X⁺ − X⁻|^p D_n⁻ = (X⁻)^p.   (53)

From (52) and (53), we obtain (49), which implies that (x⁺, x⁻) ∈ S^n_+.

4 Numerical Experiments

In this section, we present three examples to show the good performance and the worst-case complexity of the First Order Interior Point Algorithm (FOIPA) proposed in this paper for solving (1). The numerical testing is carried out on a Lenovo PC (3.00GHz, 2.00GB of RAM) using Matlab 7.4. In the following three examples, "Iteration number" denotes the number of iterations needed to obtain an ε scaled first order stationary point.

Table 2: Example 1: variable selection by the FOIPA, Lasso and Best subset methods, reporting for each method the values of (λ, p), the estimated coefficients x₁ (lcavol), x₂ (lweight), x₃ (age), x₄ (lbph), x₅ (svi), x₆ (lcp), x₇ (gleason), x₈ (pgg45), the number of nonzero coefficients, the prediction error, and the iteration number.

Example 1 (Prostate Cancer) The prostate cancer data set is downloaded from the web site edu/tibs/elemstatlearn/data.html. It consists of the medical records of 97 men who were about to receive a radical prostatectomy, divided into a training set with 67 observations and a test set with 30 observations. For more details of the data set, see [8,11,12,19]. We use the following constrained l₂-l_p model to solve this problem:

min_{x ≥ 0} ‖Ax − q‖² + λ Σ_{i=1}^8 x_i^p,

where A ∈ R^{67×8} and q ∈ R^{67} are built from the training set. Let x⁰ = 0.1e₈. The numerical results of the FOIPA with β = ‖AᵀA‖ are given in Table 2, in which two well-known methods (Lasso, Best subset) from Table 3.3 in [19] are also listed. Table 2 indicates that the FOIPA with 0 < p < 1 can find fewer main factors with smaller prediction error than the other two methods.

Example 2 (Nonnegative Compressed Sensing) In this example, we test the FOIPA on a compressive sensing problem, where the goal is to reconstruct a length-n nonnegative sparse signal from m observations, with m < n. The purpose of this example is to show the worst-case complexity result and the good performance of the FOIPA. We use the following Matlab code to generate the original signal x_s ∈ R^n, the sensing matrix A ∈ R^{m×n} and the observation signal q ∈ R^m for given positive integers n and T:

    m = n/4; x_s = zeros(n,1); P = randperm(n);
    A = randn(m,n);
    x_s(P(1:T)) = abs(randn(T,1));
    A = orth(A')';
    q = A*x_s;
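To connect the generated data with the first order method, the following hypothetical fragment applies the foipa sketch from Section 2 to the l₂-l_p model with φ = φ₁ and b_i = +∞; the value of λ and the iteration limit are illustrative assumptions, not taken from the paper.

    % Hypothetical driver for min_{x>=0} ||Ax-q||^2 + lambda*sum(x.^p), phi = phi_1.
    n = 4096; T = 40; p = 0.5; epsilon = 1e-4;
    lambda = 1e-3;                                     % illustrative value only
    % ... generate x_s, A and q with the code above ...
    gradf = @(x) 2*A'*(A*x - q) + lambda*p*x.^(p-1);   % gradient of f for x > 0
    beta  = 2;                                         % 2*||A'A||, and ||A'A|| = 1 after orth(A')'
    x     = foipa(0.1*ones(n,1), Inf(n,1), gradf, beta, epsilon, 1e6);
    mse   = norm(x - x_s)^2 / n;                       % mean squared error relative to x_s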

Fig. 1: Example 2: FOIPA for finding an ε scaled first order stationary point of (54): (a) iteration number, (b) CPU time (seconds), versus ε, for φ₁ with p = 0.5, φ₂ with p = 0.3, λ = 0.08, α = 0.2, and φ₃ with p = 0.1, λ = 0.11, α = 0.2, where φ₁, φ₂ and φ₃ are given in Section 5.

In this example, we consider the following constrained optimization problem:

min_{x ≥ 0} ‖Ax − q‖² + Σ_{i=1}^n φ(x_i^p).   (54)

Set n = 4096 and T = 40 in the Matlab code to generate A, q and x_s. Choose x⁰ = 0.1e_n. For different choices of φ and p, the iteration number and CPU time for obtaining an ε scaled first order stationary point of (54) are illustrated in Fig. 1. From this figure, we can see that the total number of iterations for obtaining an ε scaled first order stationary point is much smaller than the worst-case estimate 32f(x⁰)R²βε^{-2} given in the proof of Theorem 1. For the case φ := φ₁ with p = 0.5, the original signal x_s and the ε scaled first order stationary point x_ε with ε = 10⁻⁴ are pictured in Fig. 2(a). Meanwhile, the convergence of x^k, f(x^k) and MSE(x^k) is also illustrated in Fig. 2(b)-(d), where MSE(x^k) is the mean squared error of x^k with respect to the original signal x_s, defined by MSE(x) = ‖x − x_s‖²/n.

Example 3 (Unconstrained Compressed Sensing) In order to support the theoretical analysis in Theorem 3, we test the FOIPA on a typical compressive sensing problem, where the goal is to reconstruct a length-n sparse signal (possibly containing positive and negative entries) from m observations, with m < n. The original signal x_s is generated randomly by the code x_s(P(1:T)) = randn(T,1); A and q are generated as in Example 2 with n = 1024 and T = 10. To solve this problem, we reformulate it as the following constrained l₂-l_p optimization problem:

min_{x⁺ ≥ 0, x⁻ ≥ 0} ‖A(x⁺ − x⁻) − q‖² + λ‖x⁺‖_p^p + λ‖x⁻‖_p^p,   (55)

which can be solved by the algorithms proposed in this paper. With initial point x^{+,0} = x^{−,0} = 0.1e_n and β = 4‖AᵀA‖ = 4, the FOIPA finds an ε (= 10⁻³) scaled first order stationary point in 131 iterations.

Fig. 2: Example 2: (a) original signal x_s and ε scaled first order stationary point x_ε with ε = 10⁻⁴, (b) convergence of the iterates x^k, (c) convergence of the function values f(x^k), (d) convergence of the mean squared errors MSE(x^k).

Fig. 3: Example 3: original and reconstructed signal by the FOIPA with ε = 10⁻³ and x = x⁺ − x⁻: (a) original signal, (b) reconstructed signal.

The original signal and the ε scaled first order stationary point are shown in Fig. 3. The convergence of (x^{+,k}, x^{−,k}) and (x^{+,k})ᵀx^{−,k} is shown in Fig. 4. From Fig. 3 and Fig. 4, we can see that x^k = x^{+,k} − x^{−,k} converges to the original signal.
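As an illustration of how the splitting of Theorem 3 is used in practice, the following hypothetical fragment sets up (55) in the stacked nonnegative variable z = (x⁺; x⁻) and reuses the foipa sketch from Section 2; the values of λ and p are illustrative assumptions.

    % Hypothetical setup for (55): z = [x_plus; x_minus] >= 0 and E = [A, -A],
    % so that A*(x_plus - x_minus) = E*z.
    n = 1024; epsilon = 1e-3;
    lambda = 1e-3; p = 0.5;                            % illustrative values only
    E     = [A, -A];
    gradf = @(z) 2*E'*(E*z - q) + lambda*p*z.^(p-1);   % gradient of (55) for z > 0
    beta  = 4*norm(A'*A);                              % the constant used in the text
    z     = foipa(0.1*ones(2*n,1), Inf(2*n,1), gradf, beta, epsilon, 1e6);
    x     = z(1:n) - z(n+1:end);                       % recover x = x_plus - x_minus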

Fig. 4: Example 3: (a) convergence of (x^{+,k}, x^{−,k}), (b) convergence of (x^{+,k})ᵀx^{−,k}.

5 Final Remarks

This paper proposes two interior point methods for solving constrained non-Lipschitz, nonconvex optimization problems arising in many important applications. The first order interior point method is easy to implement and its worst-case complexity is O(ε^{-2}), which is of the same order as the worst-case complexity of steepest-descent methods applied to unconstrained, nonconvex smooth optimization, and of the trust region and SQR methods applied to unconstrained, nonconvex nonsmooth optimization [1,3]. The second order interior point method has a better complexity order O(ε^{-3/2}) for finding an ε scaled second order stationary point. It is not answered in this paper whether the complexity bounds are sharp for the first and second order methods, which gives us an interesting topic for further research.

The assumptions in this paper are standard and applicable to many regularization models in practice. For example, H(x) = ‖Ax − q‖² and φ is one of the following six penalty functions:
i) soft thresholding penalty function [20,25]: φ₁(s) = s;
ii) logistic penalty function [24]: φ₂(s) = log(1 + αs);
iii) fraction penalty function [13,24]: φ₃(s) = αs/(1 + αs);
iv) hard thresholding penalty function [14]: φ₄(s) = λ − (λ − s)₊²/λ;
v) smoothly clipped absolute deviation penalty function [14]: φ₅(s) = ∫₀ˢ min{1, (α − t/λ)₊/(α − 1)} dt;
vi) minimax concave penalty function [31]: φ₆(s) = ∫₀ˢ (1 − t/(αλ))₊ dt.
Here α and λ are two positive parameters; in particular, α > 2 in φ₅ and α > 1 in φ₆. These six penalty functions are concave on [0, ∞) and continuously differentiable on (0, ∞), and are often used in statistics and sparse reconstruction.
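For reference, the first three penalties above (the ones used in Fig. 1) can be restated as hypothetical MATLAB handles, together with the derivatives needed in the FOIPA gradient; alpha > 0 is the parameter from the text.

    % Hypothetical handles for phi_1, phi_2, phi_3 and two of their derivatives.
    phi1  = @(s) s;                                % soft thresholding
    phi2  = @(s, alpha) log(1 + alpha*s);          % logistic
    phi3  = @(s, alpha) alpha*s ./ (1 + alpha*s);  % fraction
    dphi2 = @(s, alpha) alpha ./ (1 + alpha*s);
    dphi3 = @(s, alpha) alpha ./ (1 + alpha*s).^2;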

Acknowledgements We would like to thank Prof. Sven Leyffer, the anonymous Associate Editor and the referees for their insightful and constructive comments, which helped us to enrich the content and improve the presentation of the results in this paper.

References
1. W. Bian and X. Chen, Worst-case complexity of smoothing quadratic regularization methods for non-Lipschitzian optimization, SIAM J. Optim., 23 (2013).
2. A.M. Bruckstein, D.L. Donoho and M. Elad, From sparse solutions of systems of equations to sparse modeling of signals and images, SIAM Review, 51 (2009).
3. C. Cartis, N.I.M. Gould and Ph.L. Toint, On the evaluation complexity of composite function minimization with applications to nonconvex nonlinear programming, SIAM J. Optim., 21 (2011).
4. C. Cartis, N.I.M. Gould and Ph.L. Toint, Adaptive cubic regularisation methods for unconstrained optimization, Part I: motivation, convergence and numerical results, Math. Program., 127 (2011).
5. C. Cartis, N.I.M. Gould and Ph.L. Toint, Adaptive cubic regularisation methods for unconstrained optimization, Part II: worst-case function- and derivative-evaluation complexity, Math. Program., 130 (2011).
6. C. Cartis, N.I.M. Gould and Ph.L. Toint, An adaptive cubic regularization algorithm for nonconvex optimization with convex constraints and its function-evaluation complexity, IMA J. Numer. Anal., doi:10.1093/imanum/drr035 (2012).
7. C. Cartis, N.I.M. Gould and Ph.L. Toint, Complexity bounds for second-order optimality in unconstrained optimization, J. Complexity, 28 (2012).
8. X. Chen, Smoothing methods for nonsmooth, nonconvex minimization, Math. Program., 134 (2012).
9. X. Chen, D. Ge, Z. Wang and Y. Ye, Complexity of unconstrained L₂-L_p minimization, Math. Program. (2012).
10. X. Chen, M. Ng and C. Zhang, Nonconvex l_p regularization and box constrained model for image restoration, IEEE Trans. Image Processing, 21 (2012).
11. X. Chen, L. Niu and Y. Yuan, Optimality conditions and smoothing trust region Newton method for non-Lipschitz optimization, Preprint.
12. X. Chen, F. Xu and Y. Ye, Lower bound theory of nonzero entries in solutions of l₂-l_p minimization, SIAM J. Sci. Comput., 32 (2010).
13. X. Chen and W. Zhou, Smoothing nonlinear conjugate gradient method for image restoration using nonsmooth nonconvex minimization, SIAM J. Imaging Sci., 3 (2010).
14. J. Fan, Comments on "Wavelets in statistics: a review" by A. Antoniadis, Stat. Method. Appl., 6 (1997).
15. R. Garmanjani and L.N. Vicente, Smoothing and worst case complexity for direct-search methods in nonsmooth optimization, IMA J. Numer. Anal., doi:10.1093/imanum/drs027 (2012).
16. D. Ge, X. Jiang and Y. Ye, A note on the complexity of L_p minimization, Math. Program., 129 (2011).
17. S. Gratton, A. Sartenaer and Ph.L. Toint, Recursive trust-region methods for multiscale nonlinear optimization, SIAM J. Optim., 19 (2008).
18. A. Griewank, The modification of Newton's method for unconstrained optimization by bounding cubic terms, Technical Report NA/12, Department of Applied Mathematics and Theoretical Physics, University of Cambridge, United Kingdom, 1981.

19. T. Hastie, R. Tibshirani and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer, New York (2009).
20. J. Huang, J.L. Horowitz and S. Ma, Asymptotic properties of bridge estimators in sparse high-dimensional regression models, Ann. Statist., 36 (2008).
21. Yu. Nesterov, Introductory Lectures on Convex Optimization, Applied Optimization, Kluwer Academic Publishers, Dordrecht, The Netherlands, 2004.
22. Yu. Nesterov and B.T. Polyak, Cubic regularization of Newton's method and its global performance, Math. Program., 108 (2006).
23. Yu. Nesterov, Accelerating the cubic regularization of Newton's method on convex problems, Math. Program., 112 (2008).
24. M. Nikolova, M.K. Ng, S. Zhang and W.-K. Ching, Efficient reconstruction of piecewise constant images using nonsmooth nonconvex minimization, SIAM J. Imaging Sci., 1 (2008).
25. R. Tibshirani, Shrinkage and selection via the Lasso, J. Roy. Statist. Soc. Ser. B, 58 (1996).
26. S.A. Vavasis and R. Zippel, Proving polynomial time for sphere-constrained quadratic programming, Technical Report, Department of Computer Science, Cornell University, Ithaca, NY, 1990.
27. S.A. Vavasis, Nonlinear Optimization: Complexity Issues, Oxford Science Publications, New York, 1991.
28. Y. Ye, Interior Point Algorithms: Theory and Analysis, John Wiley & Sons, Inc., New York, 1997.
29. Y. Ye, On the complexity of approximating a KKT point of quadratic programming, Math. Program., 80 (1998).
30. Y. Ye, A new complexity result on minimization of a quadratic function with a sphere constraint, in Recent Advances in Global Optimization, C. Floudas and P.M. Pardalos, eds., Princeton University Press, Princeton, NJ (1992).
31. C.-H. Zhang, Nearly unbiased variable selection under minimax concave penalty, Ann. Statist., 38 (2010).


More information

A full-newton step infeasible interior-point algorithm for linear programming based on a kernel function

A full-newton step infeasible interior-point algorithm for linear programming based on a kernel function A full-newton step infeasible interior-point algorithm for linear programming based on a kernel function Zhongyi Liu, Wenyu Sun Abstract This paper proposes an infeasible interior-point algorithm with

More information

A PREDICTOR-CORRECTOR PATH-FOLLOWING ALGORITHM FOR SYMMETRIC OPTIMIZATION BASED ON DARVAY'S TECHNIQUE

A PREDICTOR-CORRECTOR PATH-FOLLOWING ALGORITHM FOR SYMMETRIC OPTIMIZATION BASED ON DARVAY'S TECHNIQUE Yugoslav Journal of Operations Research 24 (2014) Number 1, 35-51 DOI: 10.2298/YJOR120904016K A PREDICTOR-CORRECTOR PATH-FOLLOWING ALGORITHM FOR SYMMETRIC OPTIMIZATION BASED ON DARVAY'S TECHNIQUE BEHROUZ

More information

On Nesterov s Random Coordinate Descent Algorithms - Continued

On Nesterov s Random Coordinate Descent Algorithms - Continued On Nesterov s Random Coordinate Descent Algorithms - Continued Zheng Xu University of Texas At Arlington February 20, 2015 1 Revisit Random Coordinate Descent The Random Coordinate Descent Upper and Lower

More information

Numerical Experience with a Class of Trust-Region Algorithms May 23, for 2016 Unconstrained 1 / 30 S. Smooth Optimization

Numerical Experience with a Class of Trust-Region Algorithms May 23, for 2016 Unconstrained 1 / 30 S. Smooth Optimization Numerical Experience with a Class of Trust-Region Algorithms for Unconstrained Smooth Optimization XI Brazilian Workshop on Continuous Optimization Universidade Federal do Paraná Geovani Nunes Grapiglia

More information

Global Quadratic Minimization over Bivalent Constraints: Necessary and Sufficient Global Optimality Condition

Global Quadratic Minimization over Bivalent Constraints: Necessary and Sufficient Global Optimality Condition Global Quadratic Minimization over Bivalent Constraints: Necessary and Sufficient Global Optimality Condition Guoyin Li Communicated by X.Q. Yang Abstract In this paper, we establish global optimality

More information

1 Strict local optimality in unconstrained optimization

1 Strict local optimality in unconstrained optimization ORF 53 Lecture 14 Spring 016, Princeton University Instructor: A.A. Ahmadi Scribe: G. Hall Thursday, April 14, 016 When in doubt on the accuracy of these notes, please cross check with the instructor s

More information

1. Introduction. We consider the numerical solution of the unconstrained (possibly nonconvex) optimization problem

1. Introduction. We consider the numerical solution of the unconstrained (possibly nonconvex) optimization problem SIAM J. OPTIM. Vol. 2, No. 6, pp. 2833 2852 c 2 Society for Industrial and Applied Mathematics ON THE COMPLEXITY OF STEEPEST DESCENT, NEWTON S AND REGULARIZED NEWTON S METHODS FOR NONCONVEX UNCONSTRAINED

More information

Apolynomialtimeinteriorpointmethodforproblemswith nonconvex constraints

Apolynomialtimeinteriorpointmethodforproblemswith nonconvex constraints Apolynomialtimeinteriorpointmethodforproblemswith nonconvex constraints Oliver Hinder, Yinyu Ye Department of Management Science and Engineering Stanford University June 28, 2018 The problem I Consider

More information

CORE 50 YEARS OF DISCUSSION PAPERS. Globally Convergent Second-order Schemes for Minimizing Twicedifferentiable 2016/28

CORE 50 YEARS OF DISCUSSION PAPERS. Globally Convergent Second-order Schemes for Minimizing Twicedifferentiable 2016/28 26/28 Globally Convergent Second-order Schemes for Minimizing Twicedifferentiable Functions YURII NESTEROV AND GEOVANI NUNES GRAPIGLIA 5 YEARS OF CORE DISCUSSION PAPERS CORE Voie du Roman Pays 4, L Tel

More information

GENERALIZED second-order cone complementarity

GENERALIZED second-order cone complementarity Stochastic Generalized Complementarity Problems in Second-Order Cone: Box-Constrained Minimization Reformulation and Solving Methods Mei-Ju Luo and Yan Zhang Abstract In this paper, we reformulate the

More information

Numerisches Rechnen. (für Informatiker) M. Grepl P. Esser & G. Welper & L. Zhang. Institut für Geometrie und Praktische Mathematik RWTH Aachen

Numerisches Rechnen. (für Informatiker) M. Grepl P. Esser & G. Welper & L. Zhang. Institut für Geometrie und Praktische Mathematik RWTH Aachen Numerisches Rechnen (für Informatiker) M. Grepl P. Esser & G. Welper & L. Zhang Institut für Geometrie und Praktische Mathematik RWTH Aachen Wintersemester 2011/12 IGPM, RWTH Aachen Numerisches Rechnen

More information

4TE3/6TE3. Algorithms for. Continuous Optimization

4TE3/6TE3. Algorithms for. Continuous Optimization 4TE3/6TE3 Algorithms for Continuous Optimization (Algorithms for Constrained Nonlinear Optimization Problems) Tamás TERLAKY Computing and Software McMaster University Hamilton, November 2005 terlaky@mcmaster.ca

More information

ISM206 Lecture Optimization of Nonlinear Objective with Linear Constraints

ISM206 Lecture Optimization of Nonlinear Objective with Linear Constraints ISM206 Lecture Optimization of Nonlinear Objective with Linear Constraints Instructor: Prof. Kevin Ross Scribe: Nitish John October 18, 2011 1 The Basic Goal The main idea is to transform a given constrained

More information

Optimization methods

Optimization methods Lecture notes 3 February 8, 016 1 Introduction Optimization methods In these notes we provide an overview of a selection of optimization methods. We focus on methods which rely on first-order information,

More information

Convex Quadratic Approximation

Convex Quadratic Approximation Convex Quadratic Approximation J. Ben Rosen 1 and Roummel F. Marcia 2 Abstract. For some applications it is desired to approximate a set of m data points in IR n with a convex quadratic function. Furthermore,

More information

1. Introduction. We analyze a trust region version of Newton s method for the optimization problem

1. Introduction. We analyze a trust region version of Newton s method for the optimization problem SIAM J. OPTIM. Vol. 9, No. 4, pp. 1100 1127 c 1999 Society for Industrial and Applied Mathematics NEWTON S METHOD FOR LARGE BOUND-CONSTRAINED OPTIMIZATION PROBLEMS CHIH-JEN LIN AND JORGE J. MORÉ To John

More information

A new ane scaling interior point algorithm for nonlinear optimization subject to linear equality and inequality constraints

A new ane scaling interior point algorithm for nonlinear optimization subject to linear equality and inequality constraints Journal of Computational and Applied Mathematics 161 (003) 1 5 www.elsevier.com/locate/cam A new ane scaling interior point algorithm for nonlinear optimization subject to linear equality and inequality

More information

Nonlinear Optimization: What s important?

Nonlinear Optimization: What s important? Nonlinear Optimization: What s important? Julian Hall 10th May 2012 Convexity: convex problems A local minimizer is a global minimizer A solution of f (x) = 0 (stationary point) is a minimizer A global

More information

Suppose that the approximate solutions of Eq. (1) satisfy the condition (3). Then (1) if η = 0 in the algorithm Trust Region, then lim inf.

Suppose that the approximate solutions of Eq. (1) satisfy the condition (3). Then (1) if η = 0 in the algorithm Trust Region, then lim inf. Maria Cameron 1. Trust Region Methods At every iteration the trust region methods generate a model m k (p), choose a trust region, and solve the constraint optimization problem of finding the minimum of

More information

About Split Proximal Algorithms for the Q-Lasso

About Split Proximal Algorithms for the Q-Lasso Thai Journal of Mathematics Volume 5 (207) Number : 7 http://thaijmath.in.cmu.ac.th ISSN 686-0209 About Split Proximal Algorithms for the Q-Lasso Abdellatif Moudafi Aix Marseille Université, CNRS-L.S.I.S

More information

Lecture 15 Newton Method and Self-Concordance. October 23, 2008

Lecture 15 Newton Method and Self-Concordance. October 23, 2008 Newton Method and Self-Concordance October 23, 2008 Outline Lecture 15 Self-concordance Notion Self-concordant Functions Operations Preserving Self-concordance Properties of Self-concordant Functions Implications

More information

Optimization. Escuela de Ingeniería Informática de Oviedo. (Dpto. de Matemáticas-UniOvi) Numerical Computation Optimization 1 / 30

Optimization. Escuela de Ingeniería Informática de Oviedo. (Dpto. de Matemáticas-UniOvi) Numerical Computation Optimization 1 / 30 Optimization Escuela de Ingeniería Informática de Oviedo (Dpto. de Matemáticas-UniOvi) Numerical Computation Optimization 1 / 30 Unconstrained optimization Outline 1 Unconstrained optimization 2 Constrained

More information

Cubic-regularization counterpart of a variable-norm trust-region method for unconstrained minimization

Cubic-regularization counterpart of a variable-norm trust-region method for unconstrained minimization Cubic-regularization counterpart of a variable-norm trust-region method for unconstrained minimization J. M. Martínez M. Raydan November 15, 2015 Abstract In a recent paper we introduced a trust-region

More information

McMaster University. Advanced Optimization Laboratory. Title: A Proximal Method for Identifying Active Manifolds. Authors: Warren L.

McMaster University. Advanced Optimization Laboratory. Title: A Proximal Method for Identifying Active Manifolds. Authors: Warren L. McMaster University Advanced Optimization Laboratory Title: A Proximal Method for Identifying Active Manifolds Authors: Warren L. Hare AdvOl-Report No. 2006/07 April 2006, Hamilton, Ontario, Canada A Proximal

More information

An Even Order Symmetric B Tensor is Positive Definite

An Even Order Symmetric B Tensor is Positive Definite An Even Order Symmetric B Tensor is Positive Definite Liqun Qi, Yisheng Song arxiv:1404.0452v4 [math.sp] 14 May 2014 October 17, 2018 Abstract It is easily checkable if a given tensor is a B tensor, or

More information

AN EIGENVALUE STUDY ON THE SUFFICIENT DESCENT PROPERTY OF A MODIFIED POLAK-RIBIÈRE-POLYAK CONJUGATE GRADIENT METHOD S.

AN EIGENVALUE STUDY ON THE SUFFICIENT DESCENT PROPERTY OF A MODIFIED POLAK-RIBIÈRE-POLYAK CONJUGATE GRADIENT METHOD S. Bull. Iranian Math. Soc. Vol. 40 (2014), No. 1, pp. 235 242 Online ISSN: 1735-8515 AN EIGENVALUE STUDY ON THE SUFFICIENT DESCENT PROPERTY OF A MODIFIED POLAK-RIBIÈRE-POLYAK CONJUGATE GRADIENT METHOD S.

More information

On the acceleration of augmented Lagrangian method for linearly constrained optimization

On the acceleration of augmented Lagrangian method for linearly constrained optimization On the acceleration of augmented Lagrangian method for linearly constrained optimization Bingsheng He and Xiaoming Yuan October, 2 Abstract. The classical augmented Lagrangian method (ALM plays a fundamental

More information

Worst-Case Complexity Guarantees and Nonconvex Smooth Optimization

Worst-Case Complexity Guarantees and Nonconvex Smooth Optimization Worst-Case Complexity Guarantees and Nonconvex Smooth Optimization Frank E. Curtis, Lehigh University Beyond Convexity Workshop, Oaxaca, Mexico 26 October 2017 Worst-Case Complexity Guarantees and Nonconvex

More information

On the complexity of an Inexact Restoration method for constrained optimization

On the complexity of an Inexact Restoration method for constrained optimization On the complexity of an Inexact Restoration method for constrained optimization L. F. Bueno J. M. Martínez September 18, 2018 Abstract Recent papers indicate that some algorithms for constrained optimization

More information

A Power Penalty Method for a Bounded Nonlinear Complementarity Problem

A Power Penalty Method for a Bounded Nonlinear Complementarity Problem A Power Penalty Method for a Bounded Nonlinear Complementarity Problem Song Wang 1 and Xiaoqi Yang 2 1 Department of Mathematics & Statistics Curtin University, GPO Box U1987, Perth WA 6845, Australia

More information

Image restoration: numerical optimisation

Image restoration: numerical optimisation Image restoration: numerical optimisation Short and partial presentation Jean-François Giovannelli Groupe Signal Image Laboratoire de l Intégration du Matériau au Système Univ. Bordeaux CNRS BINP / 6 Context

More information

Research Article Finding Global Minima with a Filled Function Approach for Non-Smooth Global Optimization

Research Article Finding Global Minima with a Filled Function Approach for Non-Smooth Global Optimization Hindawi Publishing Corporation Discrete Dynamics in Nature and Society Volume 00, Article ID 843609, 0 pages doi:0.55/00/843609 Research Article Finding Global Minima with a Filled Function Approach for

More information

Convex Optimization. Newton s method. ENSAE: Optimisation 1/44

Convex Optimization. Newton s method. ENSAE: Optimisation 1/44 Convex Optimization Newton s method ENSAE: Optimisation 1/44 Unconstrained minimization minimize f(x) f convex, twice continuously differentiable (hence dom f open) we assume optimal value p = inf x f(x)

More information

Unconstrained minimization of smooth functions

Unconstrained minimization of smooth functions Unconstrained minimization of smooth functions We want to solve min x R N f(x), where f is convex. In this section, we will assume that f is differentiable (so its gradient exists at every point), and

More information

Sparsity Regularization

Sparsity Regularization Sparsity Regularization Bangti Jin Course Inverse Problems & Imaging 1 / 41 Outline 1 Motivation: sparsity? 2 Mathematical preliminaries 3 l 1 solvers 2 / 41 problem setup finite-dimensional formulation

More information

Step lengths in BFGS method for monotone gradients

Step lengths in BFGS method for monotone gradients Noname manuscript No. (will be inserted by the editor) Step lengths in BFGS method for monotone gradients Yunda Dong Received: date / Accepted: date Abstract In this paper, we consider how to directly

More information

On the Local Quadratic Convergence of the Primal-Dual Augmented Lagrangian Method

On the Local Quadratic Convergence of the Primal-Dual Augmented Lagrangian Method Optimization Methods and Software Vol. 00, No. 00, Month 200x, 1 11 On the Local Quadratic Convergence of the Primal-Dual Augmented Lagrangian Method ROMAN A. POLYAK Department of SEOR and Mathematical

More information

Accelerated Block-Coordinate Relaxation for Regularized Optimization

Accelerated Block-Coordinate Relaxation for Regularized Optimization Accelerated Block-Coordinate Relaxation for Regularized Optimization Stephen J. Wright Computer Sciences University of Wisconsin, Madison October 09, 2012 Problem descriptions Consider where f is smooth

More information

5 Handling Constraints

5 Handling Constraints 5 Handling Constraints Engineering design optimization problems are very rarely unconstrained. Moreover, the constraints that appear in these problems are typically nonlinear. This motivates our interest

More information

A Second-Order Path-Following Algorithm for Unconstrained Convex Optimization

A Second-Order Path-Following Algorithm for Unconstrained Convex Optimization A Second-Order Path-Following Algorithm for Unconstrained Convex Optimization Yinyu Ye Department is Management Science & Engineering and Institute of Computational & Mathematical Engineering Stanford

More information

Lecture 3. Optimization Problems and Iterative Algorithms

Lecture 3. Optimization Problems and Iterative Algorithms Lecture 3 Optimization Problems and Iterative Algorithms January 13, 2016 This material was jointly developed with Angelia Nedić at UIUC for IE 598ns Outline Special Functions: Linear, Quadratic, Convex

More information

8 Numerical methods for unconstrained problems

8 Numerical methods for unconstrained problems 8 Numerical methods for unconstrained problems Optimization is one of the important fields in numerical computation, beside solving differential equations and linear systems. We can see that these fields

More information

Advanced Mathematical Programming IE417. Lecture 24. Dr. Ted Ralphs

Advanced Mathematical Programming IE417. Lecture 24. Dr. Ted Ralphs Advanced Mathematical Programming IE417 Lecture 24 Dr. Ted Ralphs IE417 Lecture 24 1 Reading for This Lecture Sections 11.2-11.2 IE417 Lecture 24 2 The Linear Complementarity Problem Given M R p p and

More information

12. Interior-point methods

12. Interior-point methods 12. Interior-point methods Convex Optimization Boyd & Vandenberghe inequality constrained minimization logarithmic barrier function and central path barrier method feasibility and phase I methods complexity

More information

Optimization. Benjamin Recht University of California, Berkeley Stephen Wright University of Wisconsin-Madison

Optimization. Benjamin Recht University of California, Berkeley Stephen Wright University of Wisconsin-Madison Optimization Benjamin Recht University of California, Berkeley Stephen Wright University of Wisconsin-Madison optimization () cost constraints might be too much to cover in 3 hours optimization (for big

More information

SOR- and Jacobi-type Iterative Methods for Solving l 1 -l 2 Problems by Way of Fenchel Duality 1

SOR- and Jacobi-type Iterative Methods for Solving l 1 -l 2 Problems by Way of Fenchel Duality 1 SOR- and Jacobi-type Iterative Methods for Solving l 1 -l 2 Problems by Way of Fenchel Duality 1 Masao Fukushima 2 July 17 2010; revised February 4 2011 Abstract We present an SOR-type algorithm and a

More information

Research Article Identifying a Global Optimizer with Filled Function for Nonlinear Integer Programming

Research Article Identifying a Global Optimizer with Filled Function for Nonlinear Integer Programming Discrete Dynamics in Nature and Society Volume 20, Article ID 7697, pages doi:0.55/20/7697 Research Article Identifying a Global Optimizer with Filled Function for Nonlinear Integer Programming Wei-Xiang

More information

Convex Optimization Lecture 16

Convex Optimization Lecture 16 Convex Optimization Lecture 16 Today: Projected Gradient Descent Conditional Gradient Descent Stochastic Gradient Descent Random Coordinate Descent Recall: Gradient Descent (Steepest Descent w.r.t Euclidean

More information

Convex Optimization and l 1 -minimization

Convex Optimization and l 1 -minimization Convex Optimization and l 1 -minimization Sangwoon Yun Computational Sciences Korea Institute for Advanced Study December 11, 2009 2009 NIMS Thematic Winter School Outline I. Convex Optimization II. l

More information

A class of Smoothing Method for Linear Second-Order Cone Programming

A class of Smoothing Method for Linear Second-Order Cone Programming Columbia International Publishing Journal of Advanced Computing (13) 1: 9-4 doi:1776/jac1313 Research Article A class of Smoothing Method for Linear Second-Order Cone Programming Zhuqing Gui *, Zhibin

More information

A New Trust Region Algorithm Using Radial Basis Function Models

A New Trust Region Algorithm Using Radial Basis Function Models A New Trust Region Algorithm Using Radial Basis Function Models Seppo Pulkkinen University of Turku Department of Mathematics July 14, 2010 Outline 1 Introduction 2 Background Taylor series approximations

More information

Gradient Methods Using Momentum and Memory

Gradient Methods Using Momentum and Memory Chapter 3 Gradient Methods Using Momentum and Memory The steepest descent method described in Chapter always steps in the negative gradient direction, which is orthogonal to the boundary of the level set

More information

Coordinate descent. Geoff Gordon & Ryan Tibshirani Optimization /

Coordinate descent. Geoff Gordon & Ryan Tibshirani Optimization / Coordinate descent Geoff Gordon & Ryan Tibshirani Optimization 10-725 / 36-725 1 Adding to the toolbox, with stats and ML in mind We ve seen several general and useful minimization tools First-order methods

More information

Research Article Modified T-F Function Method for Finding Global Minimizer on Unconstrained Optimization

Research Article Modified T-F Function Method for Finding Global Minimizer on Unconstrained Optimization Mathematical Problems in Engineering Volume 2010, Article ID 602831, 11 pages doi:10.1155/2010/602831 Research Article Modified T-F Function Method for Finding Global Minimizer on Unconstrained Optimization

More information

THE solution of the absolute value equation (AVE) of

THE solution of the absolute value equation (AVE) of The nonlinear HSS-like iterative method for absolute value equations Mu-Zheng Zhu Member, IAENG, and Ya-E Qi arxiv:1403.7013v4 [math.na] 2 Jan 2018 Abstract Salkuyeh proposed the Picard-HSS iteration method

More information

Parameter Optimization in the Nonlinear Stepsize Control Framework for Trust-Region July 12, 2017 Methods 1 / 39

Parameter Optimization in the Nonlinear Stepsize Control Framework for Trust-Region July 12, 2017 Methods 1 / 39 Parameter Optimization in the Nonlinear Stepsize Control Framework for Trust-Region Methods EUROPT 2017 Federal University of Paraná - Curitiba/PR - Brazil Geovani Nunes Grapiglia Federal University of

More information

Exact Low-rank Matrix Recovery via Nonconvex M p -Minimization

Exact Low-rank Matrix Recovery via Nonconvex M p -Minimization Exact Low-rank Matrix Recovery via Nonconvex M p -Minimization Lingchen Kong and Naihua Xiu Department of Applied Mathematics, Beijing Jiaotong University, Beijing, 100044, People s Republic of China E-mail:

More information

Optimisation non convexe avec garanties de complexité via Newton+gradient conjugué

Optimisation non convexe avec garanties de complexité via Newton+gradient conjugué Optimisation non convexe avec garanties de complexité via Newton+gradient conjugué Clément Royer (Université du Wisconsin-Madison, États-Unis) Toulouse, 8 janvier 2019 Nonconvex optimization via Newton-CG

More information