Log-Sigmoid Multipliers Method in Constrained Optimization

Annals of Operations Research 101, Kluwer Academic Publishers. Manufactured in The Netherlands.

Log-Sigmoid Multipliers Method in Constrained Optimization

ROMAN A. POLYAK
Systems Engineering and Operations Research Department and Mathematical Sciences Department, George Mason University, Fairfax, VA 22030, USA

Abstract. In this paper we introduce and analyze the Log-Sigmoid (LS) multipliers method for constrained optimization. The LS method is to the recently developed smoothing technique what the augmented Lagrangian is to the penalty method, or the modified barrier to classical barrier methods. At the same time the LS method has some specific properties which make it substantially different from other nonquadratic augmented Lagrangian techniques. We establish convergence of the LS-type penalty method under very mild assumptions on the input data and estimate the rate of convergence of the LS multipliers method under the standard second order optimality condition for both exact and nonexact minimization. Some important properties of the dual function and the dual problem, which are based on the LS Lagrangian, are established, and the primal-dual LS method is introduced.

Keywords: log-sigmoid, multipliers method, duality, smoothing technique

1. Introduction

Recently Chen and Mangasarian used the integral of the scaled sigmoid function $S(t,k) = (1 + \exp(-kt))^{-1}$ as an approximation of $x_+ = \max\{0,x\}$ to develop the smoothing technique for solving convex systems of inequalities and linear complementarity problems [6]. Later Auslender et al. analyzed the smoothing technique for constrained optimization [1].

The smoothing method for constrained optimization employs a smooth approximation of $x_+$ to transform a constrained optimization problem into a sequence of unconstrained optimization problems. The convergence of the corresponding sequence of unconstrained minimizers to the primal solution is due to the unbounded increase of the scaling parameter. So the smoothing technique is in fact a penalty-type method with a smooth penalty function and can be considered as a particular case of SUMT [7]. There are a few well-known difficulties associated with the penalty-type approach: rather slow convergence, the Hessian of the penalty function becomes ill-conditioned, and the area where the Newton method is well defined shrinks to a point as the scaling parameter $k \to \infty$.

This paper is dedicated to Professor Anthony V. Fiacco on the occasion of his 70th birthday. Partially supported by NSF under Grant DMS.
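To make the smoothing idea concrete, the sketch below (illustrative Python, not from the paper; the function name and sample points are our own choices) compares $x_+ = \max\{0,x\}$ with the integral of the scaled sigmoid, $p(x,k) = k^{-1}\ln(1+e^{kx})$, which approaches $x_+$ from above as $k$ grows.

```python
import numpy as np

def sigmoid_int(x, k):
    """Integral of the scaled sigmoid S(t,k) = (1 + exp(-k t))^(-1):
    p(x,k) = k^(-1) * ln(1 + exp(k x)), a smooth upper approximation of max(0, x).
    np.logaddexp is used for numerical stability at large |k x|."""
    return np.logaddexp(0.0, k * x) / k

xs = np.linspace(-2.0, 2.0, 9)
for k in (1.0, 10.0, 100.0):
    gap = np.max(sigmoid_int(xs, k) - np.maximum(xs, 0.0))
    print(f"k = {k:6.1f}   max gap to x_+ = {gap:.2e}")   # gap = ln(2)/k, attained at x = 0
```

The maximal gap is $\ln 2/k$, so the approximation error vanishes only as $k \to \infty$, which is the source of the ill-conditioning discussed above.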

It motivates an alternative approach. We use the Log-Sigmoid (LS) function $\psi(t) = 2\ln(2S(t,1))$ to transform the constraints of a given constrained optimization problem into an equivalent one. The transformation $\psi(t)$ is parametrized by a positive scaling parameter. Simultaneously we transform the objective function with a log-sigmoid type transformation. The classical Lagrangian for the equivalent problem, the Log-Sigmoid Lagrangian (LSL), is our basic instrument.

There are three basic reasons for using the LS transformation and the corresponding Lagrangian:
1) $\psi \in C^\infty$ on $(-\infty,\infty)$;
2) the LSL is as smooth as the initial functions in the entire primal space;
3) $\psi'$ and $\psi''$ are bounded on $(-\infty,\infty)$.

Sequential unconstrained minimization of the LSL in primal space followed by an explicit formula for the Lagrange multipliers update forms the LS multipliers method.

Our first contribution is the convergence proof of the LS multipliers method. It is proven that for an inequality constrained optimization problem which satisfies the standard second order optimality conditions, the LS method converges with a Q-linear rate for any fixed but large enough scaling parameter. If one changes the scaling parameter from step to step, as is done in the smoothing methods, then the rate of convergence is Q-superlinear. It is worth mentioning that such a substantial improvement of the rate of convergence is achieved without increasing the computational effort per step as compared with the smoothing technique.

Our second contribution is the proof that a particular modification of the LS method retains the up to Q-superlinear rate of convergence if instead of the exact primal minimizer one uses its approximation. This makes the LS multipliers method practical and, together with the properties 1)-3) of the transformation $\psi$, increases the efficiency of the Newton method for constrained optimization.

We also discovered that the dual function and the dual problem which are based on the LSL have some extra important properties on top of those which are typical for the classical dual function and the corresponding dual problem. The new properties of the dual function allow one to use Newton-type methods for solving the dual problem, which leads to second order multipliers methods with up to quadratic rate of convergence.

Finally we introduce the primal-dual LS method, which has been tested numerically on a number of LP and NLP problems. The numerical results obtained clearly indicate that the primal-dual LS method can be very efficient in the final phase of the computational process.

The paper is organized as follows. The problem formulation and the basic assumptions are given in the next section. In section 3 we consider the LS transformation and its properties. In section 4 we consider the equivalent problem and the corresponding

Lagrangian. The LS multipliers method is introduced in section 5. In section 6 we establish the convergence and estimate the rate of convergence of the LS multipliers method. In section 7 we consider the modification of the LS method and show that the rate of convergence of the LS method can be retained for inexact minimization. The primal-dual LS method is introduced in section 8. Duality issues related to the LSL are considered in section 9. We conclude the paper with some remarks related to future research.

2. Statement of the problem and basic assumptions

Let $f : \mathbb{R}^n \to \mathbb{R}^1$ be convex and all $c_i : \mathbb{R}^n \to \mathbb{R}^1$, $i = 1,\dots,p$, be concave and smooth functions. We consider the convex set $\Omega = \{x \in \mathbb{R}^n_+ : c_i(x) \ge 0,\ i = 1,\dots,p\}$ and the following convex optimization problem

$X^* = \operatorname{Arg\,min}\{f(x) \mid x \in \Omega\}.$  (P)

We will assume that:

A) The optimal set $X^*$ is not empty and bounded.

B) Slater's condition holds, i.e., there exists $\hat{x} \in \mathbb{R}^n_{++}$ such that $c_i(\hat{x}) > 0$, $i = 1,\dots,p$.

To simplify considerations we will include the nonnegativity constraints $x_i \ge 0$, $i = 1,\dots,n$, into the set $c_i(x) \ge 0$, i.e.,

$\Omega = \{x \in \mathbb{R}^n : c_i(x) \ge 0,\ i = 1,\dots,p,\ c_{p+1}(x) = x_1 \ge 0,\dots,c_{p+n}(x) = x_n \ge 0\} = \{x \in \mathbb{R}^n : c_i(x) \ge 0,\ i = 1,\dots,m\},\quad m = p + n.$

If B) holds and $f(x)$, $c_i(x)$, $i = 1,\dots,m$, are smooth, then the Karush-Kuhn-Tucker (KKT) conditions hold true, i.e., there exists a nonnegative vector $\lambda^* = (\lambda^*_1,\dots,\lambda^*_m)$ such that

$\nabla_x L(x^*,\lambda^*) = \nabla f(x^*) - \sum_{i=1}^m \lambda^*_i \nabla c_i(x^*) = 0,$  (2.1)

$\lambda^*_i c_i(x^*) = 0,\quad i = 1,\dots,m,$  (2.2)

where $L(x,\lambda) = f(x) - \sum_{i=1}^m \lambda_i c_i(x)$ is the Lagrangian for the primal problem (P). Also due to B) the optimal dual set

$L^* = \{\lambda \in \mathbb{R}^m_+ : \nabla f(x^*) - \sum_{i=1}^m \lambda_i\nabla c_i(x^*) = 0,\ x^* \in X^*\}$  (2.3)

is bounded. Along with the primal problem (P) we consider the dual problem

$L^* = \operatorname{Arg\,max}\{d(\lambda) \mid \lambda \in \mathbb{R}^m_+\},$  (D)

where $d(\lambda) = \inf_x L(x,\lambda)$ is the dual function.

Later we will use the standard second order optimality conditions. Let us assume that the active constraint set at $x^*$ is $I^* = \{i : c_i(x^*) = 0\} = \{1,\dots,r\}$. We consider the vector-functions $c^T(x) = (c_1(x),\dots,c_m(x))$, $c^T_{(r)}(x) = (c_1(x),\dots,c_r(x))$ and their Jacobians

$\nabla c(x) = J(c(x)) = \begin{pmatrix}\nabla c_1(x)\\ \vdots\\ \nabla c_m(x)\end{pmatrix},\qquad \nabla c_{(r)}(x) = J(c_{(r)}(x)) = \begin{pmatrix}\nabla c_1(x)\\ \vdots\\ \nabla c_r(x)\end{pmatrix}.$

The sufficient regularity conditions

$\operatorname{rank}\nabla c_{(r)}(x^*) = r,\qquad \lambda^*_i > 0,\ i \in I^*,$  (2.4)

together with the sufficient condition for the minimum $x^*$ to be isolated

$\big(\nabla^2_{xx}L(x^*,\lambda^*)y,y\big) \ge \rho(y,y),\quad \rho > 0,\ \forall y \ne 0 :\ \nabla c_{(r)}(x^*)y = 0$  (2.5)

comprise the standard second order optimality sufficient conditions.

We conclude the section with an assertion which will be used later. The following assertion is a slight modification of the Debreu theorem (see, for example, [11]).

Assertion 2.1. Let $A$ be a symmetric $n \times n$ matrix, $B$ an $r \times n$ matrix, $\Lambda = \operatorname{diag}(\lambda_i)_{i=1}^r$ with $\lambda_i > 0$. If

$(Ay,y) \ge \rho(y,y),\quad \rho > 0,\ \forall y :\ By = 0,$  (2.6)

then there exists $k_0 > 0$ large enough such that for any $0 < \mu < \rho$ we have

$\big((A + kB^T\Lambda B)x,x\big) \ge \mu(x,x),\quad \forall x \in \mathbb{R}^n,$  (2.7)

whenever $k \ge k_0$.

3. Log-sigmoid transformation

The Log-Sigmoid Transformation (LST) $\psi : \mathbb{R} \to (-\infty,2\ln 2)$ we define by the formula

$\psi(t) = 2\ln(2S(t,1)) = 2\ln\big(2(1+e^{-t})^{-1}\big) = 2\big(\ln 2 + t - \ln(1+e^t)\big).$  (3.1)

For the scaled log-sigmoid transformation we have

$k^{-1}\psi(kt) = 2k^{-1}\ln(2S(t,k)) = 2k^{-1}\big(\ln 2 - \ln(1+e^{-kt})\big),\quad k > 0.$

Let us consider the following function:

$v(t,k) = \begin{cases}2t + 2k^{-1}\ln 2, & t \le 0,\\ 2k^{-1}\ln 2, & t \ge 0.\end{cases}$

It is easy to see that

$v(t,k) - k^{-1}\psi(kt) = \begin{cases}2k^{-1}\ln(1+e^{kt}), & t \le 0,\\ 2k^{-1}\ln(1+e^{-kt}), & t \ge 0.\end{cases}$

Therefore the following estimate takes place:

$0 \le v(t,k) - k^{-1}\psi(kt) \le 2k^{-1}\ln 2,\quad -\infty < t < \infty.$

The assertion below states the basic LST properties.

Assertion 3.1. The LST $\psi$ has the following properties:

A1) $\psi(0) = 0$;
A2) $\psi'(t) = 2(1+e^t)^{-1} > 0$, $t \in (-\infty,+\infty)$, and $\psi'(0) = 1$;
A3) $\psi''(t) = -2e^t(1+e^t)^{-2} < 0$, $t \in (-\infty,+\infty)$, and $\psi''(0) = -1/2$;
A4) $\lim_{t\to\infty}\psi'(t) = \lim_{t\to\infty}2(1+e^t)^{-1} = 0$;
A5) a) $0 < \psi'(t) < 2$; b) $-0.5 \le \psi''(t) < 0$, $-\infty < t < \infty$.

One can check properties A1)-A5) directly.

The substantial difference between $\psi(t)$ and the shifted log-barrier function, which leads to the MBF theory and methods [11], is that $\psi(t)$ is defined on $(-\infty,+\infty)$ together with its derivatives of any order. The properties A5) distinguish $\psi(t)$ not only from the shifted barrier and the exponential transformation (see [9,17]), but also from the classes of nonquadratic augmented Lagrangians $P_I$ and $\bar P_I$ (see [4, p. 309]) as well as from transformations which have been considered lately (see [3,8,13,16]).

The properties A5) have a substantial impact on both the global and the local behavior of the LS multipliers method as well as on its dual equivalent, the interior prox method with entropy-like $\varphi$-divergence distance. The entropy-like $\varphi$-divergence distance function and the corresponding interior prox method for the dual problem have been considered in [15]. The LS transformation and the corresponding LS multipliers method, which we consider in section 5, is equivalent to a prox method with an entropy-like $\varphi$-divergence distance for the dual problem. The $\varphi$-divergence distance is based on a Fermi-Dirac kernel, because the Fenchel conjugate of the LS transformation,

$\psi^*(s) = \inf\{st - \psi(t) \mid t \in \mathbb{R}\} = (s-2)\ln(2-s) - s\ln s,$

is in fact a Fermi-Dirac entropy type function. The issues related to the LS multipliers method and its dual equivalent we are going to consider in an upcoming paper.
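As a quick sanity check on Assertion 3.1, the short script below (an illustrative sketch, not from the paper; the helper names are ours) evaluates $\psi$, $\psi'$, $\psi''$ from the closed forms above and confirms A1)-A5) on a grid.

```python
import numpy as np

def psi(t):
    # psi(t) = 2(ln 2 + t - ln(1 + e^t)), written via logaddexp for stability
    return 2.0 * (np.log(2.0) + t - np.logaddexp(0.0, t))

def dpsi(t):      # psi'(t) = 2 / (1 + e^t)
    return 2.0 / (1.0 + np.exp(t))

def d2psi(t):     # psi''(t) = -2 e^t / (1 + e^t)^2
    return -2.0 * np.exp(t) / (1.0 + np.exp(t)) ** 2

t = np.linspace(-30.0, 30.0, 2001)
assert abs(psi(0.0)) < 1e-12 and abs(dpsi(0.0) - 1.0) < 1e-12      # A1), A2)
assert abs(d2psi(0.0) + 0.5) < 1e-12                               # A3)
assert np.all((dpsi(t) > 0.0) & (dpsi(t) < 2.0))                   # A5a)
assert np.all((d2psi(t) >= -0.5) & (d2psi(t) < 0.0))               # A5b)
print("psi'(30) =", dpsi(30.0), " -> tends to 0 (A4)")
```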

4. Equivalent problem and log-sigmoid Lagrangian

We use $\ln(1+e^t)$ to transform the objective function and $\psi(t)$ to transform the constraints. For the objective function we obtain

$f(x) := \ln\big(1+e^{f(x)}\big) > 0.$  (4.1)

The constraints transformation is scaled by the parameter $k > 0$, i.e.,

$c_i(x) \ge 0 \;\Leftrightarrow\; 2k^{-1}\big(\ln 2 - \ln(1+e^{-kc_i(x)})\big) \ge 0,\quad i = 1,2,\dots,m.$

Therefore for any given $k > 0$ the problem

$X^* = \operatorname{Arg\,min}\big\{f(x) \mid 2k^{-1}\big(\ln 2 - \ln(1+e^{-kc_i(x)})\big) \ge 0,\ i = 1,\dots,m\big\}$  (4.2)

is equivalent to the original problem (P). The boundedness of $f(x)$ from below is important for our further considerations.

The Lagrangian for the equivalent problem (4.2), the log-sigmoid Lagrangian, is the main tool in our analysis:

$L(x,\lambda,k) = f(x) + 2k^{-1}\sum_{i=1}^m\lambda_i\ln\big(1+e^{-kc_i(x)}\big) - 2k^{-1}\Big(\sum_{i=1}^m\lambda_i\Big)\ln 2.$  (4.3)

The LSL can be rewritten as follows:

$L(x,\lambda,k) = f(x) - \sum_{i=1}^m\lambda_i c_i(x) + 2k^{-1}\sum_{i=1}^m\lambda_i\ln\frac{e^{kc_i(x)/2}+e^{-kc_i(x)/2}}{2} = L(x,\lambda) + 2k^{-1}\sum_{i=1}^m\lambda_i\ln\cosh\frac{kc_i(x)}{2}.$

The following lemma establishes the basic LSL properties at any KKT pair $(x^*,\lambda^*)$.

Lemma 4.1. For any KKT pair $(x^*,\lambda^*)$ the following LSL properties take place for any $k > 0$:

1°) $L(x^*,\lambda^*,k) = f(x^*)$;
2°) $\nabla_x L(x^*,\lambda^*,k) = \nabla_x L(x^*,\lambda^*) = \nabla f(x^*) - \sum_{i=1}^m\lambda^*_i\nabla c_i(x^*) = 0$;
3°) $\nabla^2_{xx}L(x^*,\lambda^*,k) = \nabla^2_{xx}L(x^*,\lambda^*) + 0.5k\,\nabla c(x^*)^T\Lambda^*\nabla c(x^*)$.
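The rewriting of (4.3) via $\ln\cosh$ is easy to verify numerically; the following sketch (illustrative, with made-up data) checks the identity $L(x,\lambda,k) = L(x,\lambda) + 2k^{-1}\sum_i\lambda_i\ln\cosh(kc_i(x)/2)$ for random constraint values.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 5.0
lam = rng.uniform(0.1, 2.0, size=6)       # positive multipliers (made up)
c = rng.uniform(-1.0, 1.0, size=6)        # constraint values c_i(x) (made up)
f_bar = 3.7                               # value of the transformed objective (made up)

# LSL as defined in (4.3)
lsl = f_bar + (2.0 / k) * np.sum(lam * (np.logaddexp(0.0, -k * c) - np.log(2.0)))

# classical Lagrangian value plus the ln cosh correction term
lagr = f_bar - np.dot(lam, c)
lsl_cosh = lagr + (2.0 / k) * np.sum(lam * np.log(np.cosh(k * c / 2.0)))

print(abs(lsl - lsl_cosh))   # agrees up to round-off
```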

Proof. In view of the complementarity condition (2.2) we have

$L(x^*,\lambda^*,k) = f(x^*) - 2k^{-1}\sum_{i=1}^m\lambda^*_i\big(\ln 2 - \ln(1+e^{-kc_i(x^*)})\big) = f(x^*)$

for any $k > 0$. For the LSL gradient in $x$ we have

$\nabla_x L(x,\lambda,k) = \nabla f(x) - \sum_{i=1}^m 2\lambda_i\big(1+e^{kc_i(x)}\big)^{-1}\nabla c_i(x).$

Again due to (2.2) we obtain

$\nabla_x L(x^*,\lambda^*,k) = \nabla_x L(x^*,\lambda^*) = \nabla f(x^*) - \sum_{i=1}^m\lambda^*_i\nabla c_i(x^*) = 0.$

For the LSL Hessian in $x$ we obtain

$\nabla^2_{xx}L(x,\lambda,k) = \nabla^2 f(x) - \sum_{i=1}^m 2\lambda_i\big(1+e^{kc_i(x)}\big)^{-1}\nabla^2 c_i(x) + 2k\,\nabla c(x)^T\Lambda\,e^{-kc(x)}\big(I+e^{-kc(x)}\big)^{-2}\nabla c(x),$

where $e^{-kc(x)} = \operatorname{diag}(e^{-kc_i(x)})_{i=1}^m$, $\Lambda = \operatorname{diag}(\lambda_i)_{i=1}^m$ and $I$ is the identity matrix in $\mathbb{R}^m$. Again due to (2.2), for any $k > 0$ we have

$\nabla^2_{xx}L(x^*,\lambda^*,k) = \nabla^2_{xx}L(x^*,\lambda^*) + 0.5k\,\nabla c(x^*)^T\Lambda^*\nabla c(x^*).$  (4.4)

The local LSL properties 1°)-3°) are similar to those of the logarithmic MBF function [11]. Globally the LSL has some extra important features due to the properties A5). These features substantially affect the behavior of the corresponding multipliers method, which we consider in the next section.

The following lemma characterizes the convexity properties of the LSL.

Lemma 4.2. If $f(x)$ and all $c_i(x) \in C^2$, then for any fixed $\lambda \in \mathbb{R}^m_{++}$ and $k > 0$ the LSL Hessian is positive definite for any $x \in \mathbb{R}^n$, i.e., $L(x,\lambda,k)$ is strictly convex in $x \in \mathbb{R}^n$ and strongly convex on any bounded set in $\mathbb{R}^n$.

Proof. The proof follows directly from the formula for the LSL Hessian

$\nabla^2_{xx}L(x,\lambda,k) = \nabla^2 f(x) - \sum_{i=1}^m 2\lambda_i\big(1+e^{kc_i(x)}\big)^{-1}\nabla^2 c_i(x) + 2k\,\nabla c_{(p)}(x)^T\Lambda_{(p)}\,e^{-kc_{(p)}(x)}\big(I_p+e^{-kc_{(p)}(x)}\big)^{-2}\nabla c_{(p)}(x) + 2k\,\Lambda_{(n)}\,e^{-kx}\big(I_n+e^{-kx}\big)^{-2},$

the convexity of $f(x)$ and the concavity of all $c_i(x)$.

The following lemma 4.3 is a consequence of property 3°) and assertion 2.1.

Lemma 4.3. If conditions (2.4), (2.5) are satisfied then there exist $k_0 > 0$ and $M_0 > \mu_0 > 0$ such that the following estimate

$M_0 k(y,y) \ge \big(\nabla^2_{xx}L(x^*,\lambda^*,k)y,y\big) \ge \mu_0(y,y),\quad \forall y \in \mathbb{R}^n,$  (4.5)

takes place for any fixed $k \ge k_0$.

Proof. We obtain the right inequality in (4.5) as a consequence of (2.5), (2.7) and 3°) by taking $A = \nabla^2_{xx}L(x^*,\lambda^*)$ and $B = \nabla c_{(r)}(x^*)$. The left inequality follows from formula (4.4) for $\nabla^2_{xx}L(x^*,\lambda^*,k)$ if $k \ge k_0$ and $k_0 > 0$ is large enough.

Corollary. If $f(x)$ and all $c_i(x)$ are twice continuously differentiable and $\varepsilon > 0$ is small enough, then for any fixed $k \ge k_0$ there exists a pair $M > \mu > 0$ such that for any primal-dual pair $w = (x,\lambda) \in S(w^*,\varepsilon) = \{w : \|w - w^*\| \le \varepsilon\}$ the following inequalities hold true:

$\mu(y,y) \le \big(\nabla^2_{xx}L(x,\lambda,k)y,y\big) \le M(y,y),\quad \forall y \in \mathbb{R}^n.$  (4.6)

In other words, in the neighborhood of the KKT pair $(x^*,\lambda^*)$ the condition number $\operatorname{cond}\nabla^2_{xx}L(x,\lambda,k) \ge \mu M^{-1}$ is stable for any fixed $k \ge k_0$.

Remark 4.1. Lemma 4.3 is true whether $f(x)$ and all $c_i(x)$, $i = 1,\dots,m$, are convex or not.

Lemma 4.4. If $X^*$ is bounded, then for any $\lambda \in \mathbb{R}^m_{++}$ and $k > 0$ there exists

$\hat x = \hat x(\lambda,k) = \arg\min\{L(x,\lambda,k) \mid x \in \mathbb{R}^n\}.$

Proof. If $X^*$ is bounded then by adding one extra constraint $c_{m+1}(x) = -f(x) + M \ge 0$, where $M > 0$ is large enough, we obtain, due to corollary 20 (see [7, p. 94]), that the feasible set $\Omega$ is bounded. Therefore without restriction of generality we assume from the very beginning that $\Omega$ is bounded.

We start by establishing that the LSL $L(x,\lambda,k)$ has no direction of recession in $x$, i.e., for any nontrivial direction $z \in \mathbb{R}^n$

$\lim_{t\to\infty}L(x+tz,\lambda,k) = \infty,\quad \forall\lambda \in \mathbb{R}^m_{++},\ k > 0.$

Let $x \in \operatorname{int}\Omega$, i.e., $c_i(x) > 0$. Due to the boundedness of $\Omega$, for any $z \ne 0$ one can find $i_0$ and $\bar t > 0$ such that $c_{i_0}(x+\bar t z) = 0$; in fact, if $c_i(x+tz) > 0$ for all $t > 0$, $i = 1,\dots,m$, then $\Omega$ is unbounded.

Let $\bar x = x + \bar t z$; using the concavity of $c_{i_0}(x)$ we obtain

$c_{i_0}(x) \le c_{i_0}(\bar x) + \big(\nabla c_{i_0}(\bar x), x - \bar x\big),$

i.e.,

$0 < \alpha = c_{i_0}(x) \le -\big(\nabla c_{i_0}(\bar x), z\big)\bar t,\qquad \big(\nabla c_{i_0}(\bar x), z\big) \le -\alpha\bar t^{-1} = \beta < 0.$  (4.7)

Again using the concavity of $c_{i_0}(x)$ we obtain

$c_{i_0}(x+tz) \le c_{i_0}(\bar x) + \big(\nabla c_{i_0}(\bar x), z\big)(t-\bar t),$

hence

$c_{i_0}(x+tz) \le \beta(t-\bar t),\quad \forall t > \bar t.$  (4.8)

Hence, in view of (4.8), we obtain

$L(x+tz,\lambda,k) = f(x+tz) + 2k^{-1}\sum_{i=1}^m\lambda_i\ln\big(1+e^{-kc_i(x+tz)}\big) - 2k^{-1}\sum_{i=1}^m\lambda_i\ln 2$
$\ge f(x+tz) - 2k^{-1}\sum_{i=1}^m\lambda_i\ln 2 + 2k^{-1}\lambda_{i_0}\ln\big(1+e^{-kc_{i_0}(x+tz)}\big)$
$\ge f(x+tz) - 2k^{-1}\sum_{i=1}^m\lambda_i\ln 2 - 2\lambda_{i_0}c_{i_0}(x+tz)$
$\ge f(x+tz) - 2k^{-1}\sum_{i=1}^m\lambda_i\ln 2 - 2\beta\lambda_{i_0}(t-\bar t).$

Taking into account (4.1) and (4.8) we obtain

$\lim_{t\to\infty}L(x+tz,\lambda,k) = +\infty,\quad \forall z \in \mathbb{R}^n,$

so the set

$X(\lambda,k) = \big\{\hat x : L(\hat x,\lambda,k) = \inf_{x\in\mathbb{R}^n}L(x,\lambda,k)\big\}$

is not empty and bounded (Rockafellar [4, theorem 27.1d]). Moreover, for $(\lambda,k) \in \mathbb{R}^{m+1}_{++}$, due to lemma 4.2 the set $X(\lambda,k)$ contains only one point $\hat x(\lambda,k) = \arg\min\{L(x,\lambda,k) \mid x \in \mathbb{R}^n\}$.

The uniqueness of $\hat x(\lambda,k)$ means that, in contrast to the dual function $d(\lambda) = \inf\{L(x,\lambda) \mid x \in \mathbb{R}^n\}$, which is based on the Lagrangian for the initial problem (P), the dual function $d_k(\lambda) = \min\{L(x,\lambda,k) \mid x \in \mathbb{R}^n\}$, which is based on the LSL, is as smooth as the initial functions for any $\lambda \in \mathbb{R}^m_{++}$.

Remark 4.2. Due to A5a) we have $\lim_{t\to-\infty}\psi'(t) = 2 < \infty$; therefore the existence of $\hat x(\lambda,k)$ does not follow from standard considerations [4, p. 329], see also [1]. In fact, let us consider the following LP: $\min\{3x \mid x \ge 0\}$. We have $X^* = \{0\}$ and $L^* = \{3\}$. For $\lambda = 1$ and $k = 1$ the LSL built on the original objective is $L(x,1,1) = 3x + 2\ln(1+e^{-x}) - 2\ln 2$ and $\inf_x L(x,1,1) = -\infty$. The transformation of the objective function is critical for the existence of $\hat x(\lambda,k)$.
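Remark 4.2 can be seen numerically: for the LP $\min\{3x \mid x \ge 0\}$ the LSL built on the untransformed objective decreases without bound as $x \to -\infty$, while the version with the transformed objective $\ln(1+e^{3x})$ from (4.1) stays bounded below. The sketch below is illustrative only.

```python
import numpy as np

k, lam = 1.0, 1.0
xs = np.array([-1.0, -10.0, -100.0, -1000.0])

# LSL for min{3x | x >= 0} with the ORIGINAL objective 3x: unbounded below
L_plain = 3.0 * xs + (2.0 / k) * lam * (np.logaddexp(0.0, -k * xs) - np.log(2.0))

# LSL with the transformed objective ln(1 + e^{3x}) as in (4.1): bounded below
L_transf = np.logaddexp(0.0, 3.0 * xs) + (2.0 / k) * lam * (np.logaddexp(0.0, -k * xs) - np.log(2.0))

for x, a, b in zip(xs, L_plain, L_transf):
    print(f"x = {x:8.1f}   plain objective: {a:12.3f}   transformed: {b:10.3f}")
```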

Remark 4.3. The LSL $L(x,\lambda,k)$, convex in $x \in \mathbb{R}^n$, has a bounded level set in $x$ for any $\lambda \in \mathbb{R}^m_{++}$ and $k > 0$. This does not follow from lemma 12 (see [7, p. 95]), because $\psi(t)$ does not satisfy the assumption (a) there.

5. Log-sigmoid multipliers method

We consider the following method. For a chosen $\lambda^0 \in \mathbb{R}^m_{++}$ and $k > 0$ we generate iteratively the sequences $\{x^s\}$ and $\{\lambda^s\}$ according to the following formulas:

$x^{s+1} = \arg\min\{L(x,\lambda^s,k) \mid x \in \mathbb{R}^n\},$  (5.1)

$\lambda^{s+1}_i = \lambda^s_i\,\psi'\big(kc_i(x^{s+1})\big) = 2\lambda^s_i\big(1+e^{kc_i(x^{s+1})}\big)^{-1},\quad i = 1,\dots,m.$  (5.2)

Along with the multipliers method (5.1), (5.2) we consider a version of this method in which the parameter $k > 0$ is not fixed but changes from step to step. For a given positive sequence $\{k_s\}$ with $k_{s+1} > k_s$ and $\lim_{s\to\infty}k_s = \infty$, we find the primal sequence $\{x^s\}$ and the dual sequence $\{\lambda^s\}$ by the formulas

$x^{s+1} = \arg\min\{L(x,\lambda^s,k_s) \mid x \in \mathbb{R}^n\},$  (5.3)

$\lambda^{s+1}_i = \lambda^s_i\,\psi'\big(k_s c_i(x^{s+1})\big) = 2\lambda^s_i\big(1+e^{k_s c_i(x^{s+1})}\big)^{-1},\quad i = 1,\dots,m.$  (5.4)

First of all we have to guarantee that the multipliers method (5.1), (5.2) is well defined, i.e., that $x^{s+1}$ exists for any given $\lambda^s \in \mathbb{R}^m_{++}$ and $k > 0$. Due to lemma 4.4, for any $\lambda^s \in \mathbb{R}^m_{++}$ there exists $\hat x(\lambda^s,k) = \arg\min\{L(x,\lambda^s,k) \mid x \in \mathbb{R}^n\}$, and due to the formulas (5.2) and (5.4) we have $\lambda^s \in \mathbb{R}^m_{++} \Rightarrow \lambda^{s+1} \in \mathbb{R}^m_{++}$. Therefore, if the starting vector of Lagrange multipliers $\lambda^0 \in \mathbb{R}^m_{++}$, then all vectors $\lambda^s$, $s = 1,2,\dots$, remain positive, so the LS method is executable.

The critical part of any multipliers method is the formula for the Lagrange multipliers update. It follows from (5.2) and (5.4) that $\lambda^{s+1}_i > \lambda^s_i$ if $c_i(x^{s+1}) < 0$ and $\lambda^{s+1}_i < \lambda^s_i$ if $c_i(x^{s+1}) > 0$. In this respect the LS method is similar to other multipliers methods; however, due to A5a) the LS method has some very specific properties. In particular, a Lagrange multiplier cannot be increased more than twice, independent of the constraint violation and the value of the scaling parameter $k > 0$. It means, for instance, that if $\lambda^0 = e = (1,\dots,1) \in \mathbb{R}^m$ is the starting Lagrange multipliers vector, then for any $k > 0$ large enough and any constraint violation the new Lagrange multipliers cannot exceed two. Therefore, in contrast to the exponential [17] or MBF [11] methods, it is impossible to find an approximation close enough to $\lambda^*$ by using

$L(x,e,k) = f(x) + 2k^{-1}\sum_{i=1}^m\big(\ln(1+e^{-kc_i(x)}) - \ln 2\big),$

no matter how large a $k > 0$ we are ready to use. Therefore, to guarantee convergence when $k \to \infty$, we modify $L(x,e,k)$. The convergence and the rate of convergence of the LS multipliers method we consider in the next section.
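The following sketch shows one way to organize the LS multipliers method (5.1)-(5.2) for a small smooth convex problem, using scipy for the inner unconstrained minimization; the test problem, scaling parameter and helper names are our own illustrative choices, not from the paper.

```python
import numpy as np
from scipy.optimize import minimize

# illustrative problem: min (x1-1)^2 + (x2-2)^2  s.t.  c1(x) = 1 - x1 - x2 >= 0
f = lambda x: (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2
c = lambda x: np.array([1.0 - x[0] - x[1]])

def lsl(x, lam, k):
    """Log-sigmoid Lagrangian (4.3); the objective transformation (4.1) keeps it bounded below."""
    f_bar = np.logaddexp(0.0, f(x))
    return f_bar + (2.0 / k) * np.sum(lam * (np.logaddexp(0.0, -k * c(x)) - np.log(2.0)))

def ls_multipliers(x, lam, k=10.0, iters=20):
    for _ in range(iters):
        x = minimize(lsl, x, args=(lam, k), method="BFGS").x       # step (5.1)
        lam = 2.0 * lam / (1.0 + np.exp(k * c(x)))                 # step (5.2): lam_i * psi'(k c_i)
    return x, lam

x, lam = ls_multipliers(np.zeros(2), np.ones(1))
print("x ≈", x, "  lambda ≈", lam)   # expect x ≈ (0, 1); lam is the multiplier of the transformed problem (4.1)-(4.2)
```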

6. Convergence and rate of convergence

We start with a modification of $L(x,e,k)$. For a chosen $0 < \alpha < 1$ we define the penalty LS function $P : \mathbb{R}^n\times\mathbb{R}_{++} \to \mathbb{R}_{++}$ by the formula

$P(x,k) = f(x) + 2k^{-1+\alpha}\sum_{i=1}^m\big(\ln(1+e^{-kc_i(x)}) - \ln 2\big).$  (6.1)

The existence and uniqueness of the minimizer

$x(\cdot) = x(k) = \arg\min\{P(x,k) \mid x \in \mathbb{R}^n\}$

follows from lemmas 4.2 and 4.4. For the minimizer $x(\cdot)$ we have

$\nabla_x P(x(\cdot),\cdot) = \nabla f(x(\cdot)) - \sum_{i=1}^m 2k^{\alpha}\big(1+e^{kc_i(x(\cdot))}\big)^{-1}\nabla c_i(x(\cdot)) = 0.$  (6.2)

By introducing

$\lambda_i(\cdot) = \lambda_i(k) = 2k^{\alpha}\big(1+e^{kc_i(x(\cdot))}\big)^{-1},\quad i = 1,\dots,m,$  (6.3)

we obtain

$\nabla_x P(x(\cdot),\cdot) = \nabla f(x(\cdot)) - \sum_{i=1}^m\lambda_i(\cdot)\nabla c_i(x(\cdot)) = \nabla_x L(x(\cdot),\lambda(\cdot)) = 0.$  (6.4)

The following theorem establishes the convergence of $\{x(k)\}_{k>0}$ and $\{\lambda(k)\}_{k>0}$ to $X^*$ and $L^*$.

Theorem 6.1.
1) If conditions A) and B) hold then the primal and dual trajectories $\{x(k)\}_{k>0}$ and $\{\lambda(k)\}_{k>0}$ are bounded and their limit points belong to $X^*$ and $L^*$.
2) If the standard second order optimality conditions (2.4), (2.5) are satisfied then $\lim_{k\to\infty}x(k) = x^*$ and $\lim_{k\to\infty}\lambda(k) = \lambda^*$. If, in addition, $f(x)$ and all $c_i(x) \in C^2$, then the following bound

$0 \le f(x^*) - f(x(k)) \le \big(0.5k^{-\alpha}f(x^*) + k^{-1}m\ln 2\big)\sum_{i=1}^m\lambda^*_i$  (6.5)

holds true for any $k \ge k_0$, where $k_0 > 0$ is large enough.

Proof. 1) As we mentioned in the proof of lemma 4.4, if $X^*$ is bounded then by adding an extra constraint we can assume that the feasible set $\Omega$ is bounded. Also, due to lemma 4.2 the minimizer $x(k)$ is unique, therefore $\lambda(k)$ is uniquely defined by (6.3).
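A minimal path-following sketch based on the penalty LS function (6.1) and the multiplier estimate (6.3) might look as follows (illustrative only; the problem data, the value of $\alpha$ and the $k$-schedule are our own choices).

```python
import numpy as np
from scipy.optimize import minimize

# same toy problem as before: min (x1-1)^2 + (x2-2)^2  s.t.  1 - x1 - x2 >= 0
f = lambda x: (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2
c = lambda x: np.array([1.0 - x[0] - x[1]])
alpha = 0.5

def P(x, k):
    # penalty LS function (6.1) with the transformed objective (4.1)
    f_bar = np.logaddexp(0.0, f(x))
    return f_bar + 2.0 * k ** (alpha - 1.0) * np.sum(np.logaddexp(0.0, -k * c(x)) - np.log(2.0))

x = np.zeros(2)
for k in (10.0, 100.0, 1000.0):
    x = minimize(P, x, args=(k,), method="BFGS").x           # x(k), the minimizer of (6.1)
    lam = 2.0 * k ** alpha / (1.0 + np.exp(k * c(x)))         # lambda(k) from (6.3)
    print(f"k = {k:7.1f}   x = {x}   lambda = {lam}")
```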

For the vector $x(k)$ we define two sets of indices: $I^+ = I^+(k) = \{i : c_i(x(k)) \ge 0\}$ and $I^- = I^-(k) = \{i : c_i(x(k)) < 0\}$. Then

$P(x(k),k) = f(x(k)) + 2k^{-1+\alpha}\Big[\sum_{i\in I^+}\ln\big(1+e^{-kc_i(x(k))}\big) + \sum_{i\in I^-}\ln\big(1+e^{-kc_i(x(k))}\big) - m\ln 2\Big].$

Therefore

$P(x(k),k) \ge f(x(k)) + 2k^{\alpha}\sum_{i\in I^-}\big|c_i(x(k))\big| - 2k^{-1+\alpha}m\ln 2.$

On the other hand,

$P(x(k),k) \le P(x^*,k) = f(x^*) + 2k^{-1+\alpha}\sum_{i=1}^m\ln\frac{1+e^{-kc_i(x^*)}}{2}.$

In view of $(1+e^{-kc_i(x^*)})/2 \le 1$, $i = 1,\dots,m$, we have $P(x^*,k) \le f(x^*)$. Therefore, keeping in mind $f(x(k)) > 0$, we obtain

$2k^{\alpha}\sum_{i\in I^-}\big|c_i(x(k))\big| \le f(x^*) + 2k^{-1+\alpha}m\ln 2,\qquad \sum_{i\in I^-}\big|c_i(x(k))\big| \le 0.5k^{-\alpha}f(x^*) + k^{-1}m\ln 2.$

In other words, for the maximum constraint violation at $x(k)$ we have

$\max_{i\in I^-}\big|c_i(x(k))\big| \le 0.5k^{-\alpha}f(x^*) + k^{-1}m\ln 2 = v(k).$  (6.6)

Therefore, due to corollary 20 (see [7, p. 94]), the boundedness of $\{x(k)\}_{k>0}$ follows from the boundedness of $\Omega$. The boundedness of the dual trajectory $\{\lambda(k)\}_{k>0}$ follows from Slater's condition B), (6.4) and the boundedness of the primal trajectory $\{x(k)\}_{k>0}$.

Let $\{x(k_s)\}_{s=1}^\infty$ and $\{\lambda(k_s)\}_{s=1}^\infty$ be converging primal and dual subsequences and $\bar x = \lim_{s\to\infty}x(k_s)$, $\bar\lambda = \lim_{s\to\infty}\lambda(k_s)$; then, by passing to the limit in (6.4), we obtain

$\nabla_x L(\bar x,\bar\lambda) = \nabla f(\bar x) - \sum_{i=1}^m\bar\lambda_i\nabla c_i(\bar x) = 0,$

and from (6.3) and (6.6) we have $\bar\lambda \in \mathbb{R}^m_+$ and

$c_i(\bar x) \ge 0,\quad i = 1,\dots,m,\qquad \bar\lambda_i = 0,\ i \notin I(\bar x) = \{i : c_i(\bar x) = 0\};$

hence $(\bar x,\bar\lambda)$ is a KKT pair, i.e., $\bar x \in X^*$, $\bar\lambda \in L^*$.

2) If the standard second order optimality conditions (2.4), (2.5) are satisfied, then the pair $(x^*,\lambda^*)$ is unique, therefore $x^* = \lim_{k\to\infty}x(k)$ and $\lambda^* = \lim_{k\to\infty}\lambda(k)$.

To obtain the bound (6.5) we consider the enlarged feasible set $\Omega(k) = \{x : c_i(x) \ge -v(k),\ i = 1,\dots,m\}$, which is bounded because $\Omega$ is bounded. Therefore $x_k = \arg\min\{f(x) \mid x \in \Omega(k)\}$ exists and $f(x_k) \le f(x(k))$, hence $f(x^*) - f(x(k)) \le f(x^*) - f(x_k)$. Due to the conditions (2.4), (2.5), and keeping in mind that $f(x), c_i(x) \in C^2$, we can use theorem 6 (see [7, p. 34]) to estimate $f(x^*) - f(x_k)$: for any $k \ge k_0$ with $k_0 > 0$ large enough we obtain

$f(x^*) - f(x(k)) \le f(x^*) - f(x_k) \le v(k)\sum_{i=1}^m\lambda^*_i.$

Using (6.6) we obtain the bound (6.5).

Convergence results for general classes of smoothing methods have been considered in [1]. Before we establish the rate of convergence for the LS multipliers method we would like to discuss one intrinsic property of the smoothing methods. Let us consider the Hessian of the penalty LS function. We have

$H(x(\cdot),\cdot) = \nabla^2_{xx}P(x(\cdot),\cdot) = \nabla^2_{xx}L(x(\cdot),\lambda(\cdot)) + 2k\,\nabla c(x(\cdot))^T\Lambda(\cdot)\,e^{-kc(x(\cdot))}\big(I+e^{-kc(x(\cdot))}\big)^{-2}\nabla c(x(\cdot)),$

where $\Lambda(\cdot) = \operatorname{diag}(\lambda_i(\cdot))_{i=1}^m$ and $e^{-kc(x(\cdot))} = \operatorname{diag}(e^{-kc_i(x(\cdot))})_{i=1}^m$. In view of (6.5), for $k > 0$ large enough the pair $(x(\cdot),\lambda(\cdot))$ is close to $(x^*,\lambda^*)$, therefore

$H(x(\cdot),\cdot) \approx \nabla^2_{xx}L(x^*,\lambda^*) + 0.5k\,\nabla c(x^*)^T\Lambda^*\nabla c(x^*).$

Due to assertion 2.1, for $k > 0$ large enough the minimal eigenvalue of $H(x(\cdot),\cdot)$ is $\mu > 0$, while the maximal eigenvalue of $H(x(\cdot),\cdot)$ is $Mk$, $M > 0$. Therefore

$\operatorname{cond}H(x(\cdot),\cdot) = \mu(Mk)^{-1} = O(k^{-1}).$

Hence $\operatorname{cond}H(x(\cdot),\cdot)$ converges to zero faster than $f(x(k))$ converges to $f(x^*)$. The infinite increase of the scaling parameter $k > 0$ is the only way to ensure the convergence of the smoothing method. Therefore from some point on the smooth unconstrained minimization methods, and in particular the Newton method, might lose their efficiency.

The method (5.1), (5.2) allows one to speed up the rate of convergence substantially and at the same time keeps the condition number of the LSL Hessian stable. Now we will prove that under the standard second order optimality conditions the primal-dual sequence $\{x^s,\lambda^s\}$ generated by the LS multipliers method (5.1), (5.2) converges to the primal-dual solution with a Q-linear rate for a fixed but large enough scaling parameter $k > 0$.

In our analysis we follow the scheme of [11], in which the quadratic augmented Lagrangian proof (see [4, p. 109]) for equality constraints has been generalized for nonquadratic augmented Lagrangians applied to inequality constrained optimization. In the course of the analysis we will estimate the threshold $k_0 > 0$ for the scaling parameter at which the Q-linear rate occurs.

First, we specify the extended dual feasible domain in $\mathbb{R}^m_+\times[k_0,\infty)$ where the Q-linear convergence takes place. Let $\|x\| = \|x\|_\infty = \max_{1\le i\le n}|x_i|$; we choose a small enough $0 < \delta < \min_{1\le i\le r}\lambda^*_i$. We split the dual optimal vector $\lambda^*$ into the active $\lambda^*_{(r)} = (\lambda^*_1,\dots,\lambda^*_r) \in \mathbb{R}^r_{++}$ and passive $\lambda^*_{(m-r)} = (\lambda^*_{r+1},\dots,\lambda^*_m) = 0$ parts. The neighborhood of $\lambda^*$ we define as follows:

$D(\cdot) \equiv D(\lambda^*,k_0,\delta) = \big\{(\lambda,k) \in \mathbb{R}^{m+1}_{++} : \lambda_i \ge \delta > 0,\ |\lambda_i - \lambda^*_i| \le \delta k,\ i = 1,\dots,r;\ 0 \le \lambda_i \le \delta k,\ i = r+1,\dots,m;\ k \ge k_0\big\}.$

By introducing the vector $t = (t_i,\ i = 1,\dots,m) = (t_{(r)},t_{(m-r)})$ with $t_i = (\lambda_i - \lambda^*_i)k^{-1}$, $k \ge k_0$, $i = 1,\dots,m$, we transform the dual set $D(\cdot)$ into the neighborhood of the origin in the extended dual space

$S(\delta,k_0) = \big\{(t,k) = (t_1,\dots,t_m;k) : t_i \ge (\delta-\lambda^*_i)k^{-1},\ i = 1,\dots,r;\ t_i \ge 0,\ i = r+1,\dots,m;\ \|t\| \le \delta,\ k \ge k_0\big\}.$

Then for the LSL we obtain

$L(x,t,k) = f(x) + 2k^{-1}\sum_{i=1}^m\big(kt_i + \lambda^*_i\big)\Big(\ln\big(1+e^{-kc_i(x)}\big) - \ln 2\Big).$

For each $\lambda \in D(\cdot)$ and $k \ge k_0$ we can find the corresponding $(t,k) \in S(\delta,k_0)$, the minimizer

$\hat x = \hat x(t,k) = \arg\min\{L(x,t,k) \mid x \in \mathbb{R}^n\}$

and the new vector of Lagrange multipliers

$\hat\lambda = \hat\lambda(t,k) = \big(\hat\lambda_i(t,k) = 2(kt_i+\lambda^*_i)\big(1+e^{kc_i(\hat x)}\big)^{-1},\ i = 1,\dots,m\big).$

Let us split the vector $\hat\lambda = (\hat\lambda_{(r)},\hat\lambda_{(m-r)})$ into the active $\hat\lambda_{(r)} = (\hat\lambda_i,\ i = 1,\dots,r)$ and the passive

$\hat\lambda_{(m-r)} = \hat\lambda_{(m-r)}(\hat x,t,k) = \big(\hat\lambda_i(\hat x,t,k) = 2kt_i\big(1+e^{kc_i(\hat x)}\big)^{-1},\ i = r+1,\dots,m\big)$

parts. We consider the vector-function

$h(x,t,k) = \sum_{i=r+1}^m\hat\lambda_i(x,t,k)\,\nabla c_i(x),$

which corresponds to the passive set of constraints.

Let $a \in \mathbb{R}^n$, $b \in \mathbb{R}^n$, $\theta(\tau) : \mathbb{R}\to\mathbb{R}$, $\theta(b) = (\theta(b_1),\dots,\theta(b_n))$, $a\theta(b) = (a_1\theta(b_1),\dots,a_n\theta(b_n))$, $a+\theta(b) = (a_1+\theta(b_1),\dots,a_n+\theta(b_n))$.

Now we are ready for the basic statement of this section. We would like to emphasize that the results of the following theorem remain true even if neither $f(x)$ nor $c_i(x)$, $i = 1,\dots,m$, are convex.

Theorem 6.2. If $f(x)$ and all $c_i(x) \in C^2$ and the conditions (2.4) and (2.5) hold, then there exist a small enough $\delta > 0$ and a large enough $k_0 > 0$ such that for any $\lambda \in D(\cdot)$ and $k \ge k_0$:

1) there exists $\hat x = \hat x(\lambda,k) = \arg\min\{L(x,\lambda,k) \mid x \in \mathbb{R}^n\}$: $\nabla_x L(\hat x,\lambda,k) = 0$ and $\hat\lambda = \hat\lambda(\lambda,k) = 2\lambda\big(1+e^{kc(\hat x)}\big)^{-1}$;

2) for the pair $(\hat x,\hat\lambda)$ the estimate

$\max\big\{\|\hat x - x^*\|,\ \|\hat\lambda - \lambda^*\|\big\} \le ck^{-1}\|\lambda - \lambda^*\|$  (6.7)

holds and $c > 0$ is independent of $k \ge k_0$;

3) the LS function $L(x,\lambda,k)$ is strongly convex in a neighborhood of $\hat x$.

Proof. For any $x \in \mathbb{R}^n$, any $k > 0$ and $t \in \mathbb{R}^m$ the vector-function $h(x,t,k)$ is smooth in $x$; also

$h(x,0,k) = 0 \in \mathbb{R}^n,\qquad \nabla_x h(x,0,k) = 0_{n,n},\qquad \nabla_{\lambda_{(r)}}h(x,0,k) = 0_{n,r},$

where $0_{p,q}$ is a $p\times q$ matrix with zero elements. Let $\sigma = \min\{c_i(x^*) \mid i = r+1,\dots,m\} > 0$. We consider the following map $\Phi : \mathbb{R}^{n+r+m+1}\to\mathbb{R}^{n+r}$, which is defined by

$\Phi\big(x,\hat\lambda_{(r)},t,k\big) = \begin{bmatrix}\nabla f(x) - \sum_{i=1}^r\hat\lambda_i\nabla c_i(x) - h(x,t,k)\\[2pt] 2k^{-1}\big(kt_{(r)}+\lambda^*_{(r)}\big)\big(1+e^{kc_{(r)}(x)}\big)^{-1} - k^{-1}\hat\lambda_{(r)}\end{bmatrix}.$  (6.8)

Taking into account (2.1), (2.2) we obtain

$\Phi\big(x^*,\lambda^*_{(r)},0,k\big) = 0 \in \mathbb{R}^{n+r},\quad \forall k > 0.$  (6.9)

Let $\Phi_k = \nabla_{(x,\hat\lambda_{(r)})}\Phi(x^*,\lambda^*_{(r)},0,k)$, $I_r$ the identity matrix in $\mathbb{R}^r$, $\Lambda^*_{(r)} = \operatorname{diag}(\lambda^*_i)_{i=1}^r$, $\nabla^2L^* = \nabla^2_{xx}L(x^*,\lambda^*)$, $\nabla c^* = \nabla c(x^*)$, $\nabla c^*_{(r)} = \nabla c_{(r)}(x^*)$. In view of $\nabla_x h(x^*,0,k) = 0_{n,n}$ and $\nabla_{\hat\lambda_{(r)}}h(x^*,0,k) = 0_{n,r}$ we obtain

$\Phi_k = \begin{bmatrix}\nabla^2L^* & -\nabla c^{*T}_{(r)}\\[2pt] -\tfrac12\Lambda^*_{(r)}\nabla c^*_{(r)} & -k^{-1}I_r\end{bmatrix}.$

Using reasoning similar to that in [11] we find that $\Phi_k^{-1}$ exists and there is a number $\varkappa > 0$, independent of $k \ge k_0$, such that

$\big\|\Phi_k^{-1}\big\| \le \varkappa.$  (6.10)

By applying the second implicit function theorem (see [3, p. 12]) to the map (6.8) we find that on the set

$S(\delta,k_0,k_1) = \big\{(t,k) : t_i \ge (\delta-\lambda^*_i)k^{-1},\ i = 1,\dots,r;\ t_i \ge 0,\ i = r+1,\dots,m;\ \|t\| \le \delta,\ k_0 \le k \le k_1\big\}$

there exist two vector-functions

$x(\cdot) = x(t,k) = \big(x_1(t,k),\dots,x_n(t,k)\big)\quad\text{and}\quad \hat\lambda_{(r)}(\cdot) = \hat\lambda_{(r)}(t,k) = \big(\hat\lambda_1(t,k),\dots,\hat\lambda_r(t,k)\big)$

such that

$\Phi\big(x(t,k),\hat\lambda_{(r)}(t,k),t,k\big) \equiv 0.$  (6.11)

One can rewrite the system (6.11) as follows:

$\nabla f\big(x(\cdot)\big) - \sum_{i=1}^r\hat\lambda_i(\cdot)\nabla c_i\big(x(\cdot)\big) - h\big(x(\cdot),\cdot\big) = 0,$  (6.12)

$\hat\lambda_i(\cdot) = 2\big(kt_i+\lambda^*_i\big)\big(1+e^{kc_i(x(\cdot))}\big)^{-1},\quad i = 1,\dots,r.$  (6.13)

We also have $\hat\lambda_i(\cdot) = 2kt_i\big(1+e^{kc_i(x(\cdot))}\big)^{-1}$, $i = r+1,\dots,m$.

Recalling that $\lambda^*_{(m-r)} = (\lambda^*_{r+1},\dots,\lambda^*_m) = 0 \in \mathbb{R}^{m-r}$, we first estimate $\|\hat\lambda_{(m-r)} - \lambda^*_{(m-r)}\|$. For any small $\varepsilon > 0$ we can find $\delta > 0$ small enough that $\|x(t,k) - x(0,k)\| = \|x(\cdot) - x^*\| \le \varepsilon$ for any $(t,k) \in S(\delta,k_0)$. Taking into account $c_i(x^*) \ge \sigma > 0$ we obtain $c_i(x(t,k)) \ge \sigma/2$, $i = r+1,\dots,m$, for $(t,k) \in S(\delta,k_0)$. Therefore, in view of $e^x \ge x+1$, we obtain

$0 < \hat\lambda_i(t,k) \le 2\lambda_i\big(1+e^{0.5k\sigma}\big)^{-1} \le \frac{4}{\sigma}\lambda_i k^{-1}.$  (6.14)

Therefore

$\big\|\hat\lambda_{(m-r)} - \lambda^*_{(m-r)}\big\| \le \frac{4}{\sigma}k^{-1}\big\|\lambda_{(m-r)} - \lambda^*_{(m-r)}\big\|.$  (6.15)

Now we consider the vector-functions $x(t,k) = x(\cdot)$ and $\hat\lambda_{(r)}(t,k) = \hat\lambda_{(r)}(\cdot)$. By differentiating (6.12) and (6.13) in $t$ we find the Jacobians $\nabla_t x(\cdot) = \nabla_t x(t,k)$ and $\nabla_t\hat\lambda_{(r)}(\cdot) = \nabla_t\hat\lambda_{(r)}(t,k)$ from the following system:

$\begin{bmatrix}\nabla_t x(\cdot)\\ \nabla_t\hat\lambda_{(r)}(\cdot)\end{bmatrix} = \Phi^{-1}(\cdot)\begin{bmatrix}\nabla_t h\big(x(\cdot),\cdot\big)\\ -2\operatorname{diag}\big((1+e^{kc_i(x(\cdot))})^{-1}\big)_{i=1}^r\,[I_r\,;\,0_{r,m-r}]\end{bmatrix}.$  (6.16)

Considering the system (6.16) for $t = 0 \in \mathbb{R}^m$, we obtain

$\begin{bmatrix}\nabla_t x(0,k)\\ \nabla_t\hat\lambda_{(r)}(0,k)\end{bmatrix} = \Phi_k^{-1}\begin{bmatrix}\nabla_t h\big(x(0,k),0,k\big)\\ -[I_r\,;\,0_{r,m-r}]\end{bmatrix}.$

In view of (6.10) and the estimate

$\big\|\nabla_t h(x,0,k)\big\| \le 2k\big(1+e^{k\sigma/2}\big)^{-1}\big\|\nabla c_{(m-r)}(x)\big\| \le 4\sigma^{-1}\big\|\nabla c_{(m-r)}(x)\big\|,$

which holds true for any $k \ge k_0$, we obtain

$\max\big\{\big\|\nabla_t x(0,k)\big\|,\ \big\|\nabla_t\hat\lambda_{(r)}(0,k)\big\|\big\} \le \varkappa\big(4\sigma^{-1}\big\|\nabla c_{(m-r)}(x^*)\big\| + 1\big) = c_0.$  (6.17)

Therefore

$\max\big\{\big\|\nabla_t x(t,k)\big\|,\ \big\|\nabla_t\hat\lambda_{(r)}(t,k)\big\|\big\} \le 2c_0$  (6.18)

for any $(t,k) \in S(\delta,k_0)$ and $\delta > 0$ small enough. Keeping in mind that $x(0,k) = x^*$ and $\hat\lambda_{(r)}(0,k) = \lambda^*_{(r)}$ and using arguments similar to those in [11], we obtain

$\max\big\{\big\|x(t,k) - x^*\big\|,\ \big\|\hat\lambda_{(r)}(t,k) - \lambda^*_{(r)}\big\|\big\} \le 2c_0 k^{-1}\|\lambda - \lambda^*\|.$  (6.19)

Let

$\hat x(\lambda,k) = x\Big(\frac{\lambda-\lambda^*}{k},k\Big),\qquad \hat\lambda(\lambda,k) = \Big(\hat\lambda_{(r)}\Big(\frac{\lambda-\lambda^*}{k},k\Big),\ \hat\lambda_{(m-r)}\Big(\frac{\lambda-\lambda^*}{k},k\Big)\Big);$

then, taking $c = \max\{2c_0,4/\sigma\}$, from (6.15) and (6.19) we obtain (6.7).

To prove the final part of the theorem we consider the LSL Hessian $\nabla^2_{xx}L(x,\lambda,k)$ at the point $\hat x = \hat x(\lambda,k)$. We have

$\nabla_x L(x,\lambda,k) = \nabla f(x) - \sum_{i=1}^m 2\lambda_i\big(1+e^{kc_i(x)}\big)^{-1}\nabla c_i(x),$

and for the Hessian $\nabla^2_{xx}L(x,\lambda,k)$ we obtain

$\nabla^2_{xx}L(x,\lambda,k) = \nabla^2 f(x) - \sum_{i=1}^m 2\lambda_i\big(1+e^{kc_i(x)}\big)^{-1}\nabla^2 c_i(x) + 2k\,\nabla c(x)^T\Lambda\,e^{kc(x)}\big(I+e^{kc(x)}\big)^{-2}\nabla c(x),$

where $I$ is the identity matrix in $\mathbb{R}^m$, $e^{kc(x)} = \operatorname{diag}(e^{kc_i(x)})_{i=1}^m$ and $\Lambda = \operatorname{diag}(\lambda_i)_{i=1}^m$. Therefore

$\nabla^2_{xx}L(\hat x,\lambda,k) = \nabla^2_{xx}L(\hat x,\hat\lambda) + k\,\nabla c(\hat x)^T\hat\Lambda\big(I+e^{-kc(\hat x)}\big)^{-1}\nabla c(\hat x).$

Using the estimate (6.7), for any $(\lambda,k) \in D(\cdot)$ we obtain

$\nabla^2_{xx}L(\hat x,\lambda,k) \approx \nabla^2_{xx}L(x^*,\lambda^*) + k\,\nabla c(x^*)^T\Lambda^*\big(I+e^{-kc(x^*)}\big)^{-1}\nabla c(x^*) = \nabla^2_{xx}L(x^*,\lambda^*) + 0.5k\,\nabla c(x^*)^T\Lambda^*\nabla c(x^*).$

The strong convexity of the LSL $L(x,\lambda,k)$ in $x$ in the neighborhood of $\hat x$ follows from the continuity of $\nabla^2_{xx}L(x,\lambda,k)$ in $x$ and assertion 2.1. The proof of the theorem is completed.

Corollary 6.1. The Q-linear rate of convergence of the method (5.1), (5.2) and the Q-superlinear convergence of (5.3), (5.4) follow directly from the estimate (6.7), because $c > 0$ is independent of $k \ge k_0$.

7. Modification of the LS method

The LS method (5.1), (5.2) requires solving an unconstrained optimization problem at each step. To make the method practical we have to replace the unconstrained minimizer by an approximation that retains the convergence and the rate of convergence of the LS method. In this section we establish the conditions for the approximation and prove that such an approximation allows the modified LS method to retain the rate of convergence (6.7).

For a given positive Lagrange multipliers vector $\lambda \in \mathbb{R}^m_{++}$, a large enough penalty parameter $k > 0$ and a positive scalar $\tau > 0$ we find an approximation $\bar x$ for the primal minimizer $\hat x$ from the inequality

$\bar x \in \mathbb{R}^n :\quad \big\|\nabla_x L(\bar x,\lambda,k)\big\| \le \tau k^{-1}\big\|2\big(I+e^{kc(\bar x)}\big)^{-1}\lambda - \lambda\big\|$  (7.1)

and the approximation for the Lagrange multipliers by the formula

$\bar\lambda = 2\big(I+e^{kc(\bar x)}\big)^{-1}\lambda.$  (7.2)

This leads to the following modification of the LS multipliers method (5.1), (5.2). We define the modified primal-dual sequence by the formulas

$x^{s+1} \in \mathbb{R}^n :\quad \big\|\nabla_x L(x^{s+1},\lambda^s,k)\big\| \le \tau k^{-1}\big\|2\big(I+e^{kc(x^{s+1})}\big)^{-1}\lambda^s - \lambda^s\big\|,$  (7.3)

$\lambda^{s+1} = 2\big(I+e^{kc(x^{s+1})}\big)^{-1}\lambda^s.$  (7.4)

It turns out that the modification (7.3), (7.4) of the LS method (5.1), (5.2) keeps the basic property of the LS method, namely the Q-linear rate of convergence, as soon as the second order optimality conditions hold and the functions $f(x)$ and $c_i(x)$, $i = 1,\dots,m$, are smooth enough.

Theorem 7.1. If the standard second order optimality conditions (2.4), (2.5) hold and the Hessians $\nabla^2 f(x)$ and $\nabla^2 c_i(x)$, $i = 1,\dots,m$, satisfy the Lipschitz conditions

$\big\|\nabla^2 f(x_1) - \nabla^2 f(x_2)\big\| \le L_0\|x_1-x_2\|,\qquad \big\|\nabla^2 c_i(x_1) - \nabla^2 c_i(x_2)\big\| \le L_i\|x_1-x_2\|,$  (7.5)

then there is $k_0 > 0$ such that for any $\lambda \in D(\cdot)$ and $k \ge k_0$ the following bound holds true:

$\max\big\{\|\bar x - x^*\|,\ \|\bar\lambda - \lambda^*\|\big\} \le c(5+\tau)k^{-1}\|\lambda - \lambda^*\|$  (7.6)

and $c > 0$ is independent of $k \ge k_0$.

Proof. Let us assume that $\varepsilon > 0$ is small enough and

$\bar x \in S(x^*,\varepsilon) = \{x \in \mathbb{R}^n : \|x - x^*\| \le \varepsilon\},\qquad \bar\lambda = 2\big(I+e^{kc(\bar x)}\big)^{-1}\lambda \in S(\lambda^*,\varepsilon) = \{\lambda \in \mathbb{R}^m_{++} : \|\lambda - \lambda^*\| \le \varepsilon\}.$

We consider the vectors $\Delta x = \bar x - x^*$, $\Delta\lambda = \bar\lambda - \lambda^* = (\Delta\lambda_{(r)},\Delta\lambda_{(m-r)})$, $\Delta\lambda_{(r)} = \bar\lambda_{(r)} - \lambda^*_{(r)}$, $\Delta\lambda_{(m-r)} = \bar\lambda_{(m-r)} - \lambda^*_{(m-r)} = \bar\lambda_{(m-r)}$, $y_{(r)} = (\Delta x,\Delta\lambda_{(r)})$ and $y = (\Delta x,\Delta\lambda)$. Due to (7.5) we have

$\nabla f(\bar x) = \nabla f(x^*) + \nabla^2 f(x^*)\Delta x + r_0(\Delta x),$  (7.7)

$\nabla c_i(\bar x) = \nabla c_i(x^*) + \nabla^2 c_i(x^*)\Delta x + r_i(\Delta x),\quad i = 1,\dots,m,$  (7.8)

with $r_0(0) = 0$, $r_i(0) = 0$ and $\|r_0(\Delta x)\| \le L_0\|\Delta x\|^2$, $\|r_i(\Delta x)\| \le L_i\|\Delta x\|^2$. Then

$\nabla_x L(\bar x,\lambda,k) = \nabla f(\bar x) - \sum_{i=1}^m 2\lambda_i\big(1+e^{kc_i(\bar x)}\big)^{-1}\nabla c_i(\bar x) = \nabla f(\bar x) - \sum_{i=1}^r\big(\lambda^*_i + \Delta\lambda_i\big)\nabla c_i(\bar x) - h\big(\bar x,\lambda_{(m-r)},k\big),$

where $h(\bar x,\lambda_{(m-r)},k) = \sum_{i=r+1}^m 2\lambda_i\big(1+e^{kc_i(\bar x)}\big)^{-1}\nabla c_i(\bar x)$. Using (7.7), (7.8) we obtain

$\nabla_x L(\bar x,\lambda,k) = \nabla f(x^*) + \nabla^2 f(x^*)\Delta x + r_0(\Delta x) - \sum_{i=1}^r\big(\lambda^*_i + \Delta\lambda_i\big)\big(\nabla c_i(x^*) + \nabla^2 c_i(x^*)\Delta x + r_i(\Delta x)\big) - h\big(\bar x,\lambda_{(m-r)},k\big)$

$= \nabla f(x^*) - \sum_{i=1}^r\lambda^*_i\nabla c_i(x^*) + \Big(\nabla^2 f(x^*) - \sum_{i=1}^r\lambda^*_i\nabla^2 c_i(x^*)\Big)\Delta x - \nabla c_{(r)}(x^*)^T\Delta\lambda_{(r)} + r_0(\Delta x) - \sum_{i=1}^r\Delta\lambda_i\nabla^2 c_i(x^*)\Delta x - \sum_{i=1}^r\big(\lambda^*_i+\Delta\lambda_i\big)r_i(\Delta x) - h\big(\bar x,\lambda_{(m-r)},k\big).$

Let

$r^{(1)}(y) = r_0(\Delta x) - \sum_{i=1}^r\Delta\lambda_i\nabla^2 c_i(x^*)\Delta x - \sum_{i=1}^r\big(\lambda^*_i+\Delta\lambda_i\big)r_i(\Delta x);$

then, keeping in mind the KKT condition (2.1), we can rewrite the expression above as

$\nabla_x L(\bar x,\lambda,k) = \nabla^2_{xx}L(x^*,\lambda^*)\Delta x - \nabla c_{(r)}(x^*)^T\Delta\lambda_{(r)} - h\big(\bar x,\lambda_{(m-r)},k\big) + r^{(1)}(y),$  (7.9)

where $r^{(1)}(0) = 0$ and there is $L^{(1)} > 0$ such that $\|r^{(1)}(y)\| \le L^{(1)}\|y\|^2$.

Further, $\lambda_i - \lambda^*_i = (\lambda_i - \bar\lambda_i) + (\bar\lambda_i - \lambda^*_i)$, i.e., $\lambda_i e_i(\bar x,k) + \Delta\lambda_i = \lambda_i - \lambda^*_i$, $i = 1,\dots,r$, or

$\Lambda^*_{(r)}e_{(r)}(\bar x,k) + \Delta\lambda_{(r)} = \big(I_r - E_{(r)}(\bar x,k)\big)\big(\lambda_{(r)} - \lambda^*_{(r)}\big),$  (7.10)

where $e^T_{(r)}(x,k) = (e_1(x,k),\dots,e_r(x,k))$, $e_i(x,k) = \big(1-e^{-kc_i(x)}\big)\big(1+e^{-kc_i(x)}\big)^{-1}$, $i = 1,\dots,r$, and $E_{(r)}(\bar x,k) = \operatorname{diag}(e_i(\bar x,k))_{i=1}^r$. Further,

$e_i(\bar x,k) = e_i(x^*,k) + \big(\nabla e_i(x^*,k),\Delta x\big) + r^e_i(\Delta x),\quad i = 1,\dots,r,$

with $r^e_i(0) = 0$, $\|r^e_i(\Delta x)\| \le L^e_i\|\Delta x\|^2$. In view of $e_i(x^*,k) = 0$ and

$\nabla e_i(x^*,k) = 2k\,e^{-kc_i(x^*)}\big(1+e^{-kc_i(x^*)}\big)^{-2}\nabla c_i(x^*) = \tfrac{k}{2}\nabla c_i(x^*),\quad i = 1,\dots,r,$

we have

$e_i(\bar x,k) = \tfrac12 k\big(\nabla c_i(x^*),\Delta x\big) + r^e_i(\Delta x),\quad i = 1,\dots,r.$  (7.11)

Using (7.11) we obtain

$\tfrac12\Lambda^*_{(r)}\nabla c_{(r)}(x^*)\Delta x + k^{-1}\Delta\lambda_{(r)} = k^{-1}\big(I_r - E_{(r)}(\bar x,k)\big)\big(\lambda_{(r)} - \lambda^*_{(r)}\big) - k^{-1}\Lambda^*_{(r)}r^e_{(r)}(\Delta x)$

or

$\Lambda^*_{(r)}\nabla c_{(r)}(x^*)\Delta x + 2k^{-1}\Delta\lambda_{(r)} = 2k^{-1}\big(I_r - E_{(r)}(\bar x,k)\big)\big(\lambda_{(r)} - \lambda^*_{(r)}\big) - r^{(2)}(\Delta x),$  (7.12)

where $r^{(2)}(\Delta x) = 2k^{-1}\Lambda^*_{(r)}r^e_{(r)}(\Delta x)$, $r^e_{(r)}(\Delta x)^T = (r^e_1(\Delta x),\dots,r^e_r(\Delta x))$, $r^{(2)}(0) = 0$, and there is $L^{(2)} > 0$ such that $\|r^{(2)}(\Delta x)\| \le L^{(2)}\|\Delta x\|^2$. Combining (7.9) and (7.12) we obtain

$\nabla^2L^*\Delta x - \nabla c^{*T}_{(r)}\Delta\lambda_{(r)} = \nabla_x L(\bar x,\lambda,k) + h\big(\bar x,\lambda_{(m-r)},k\big) - r^{(1)}(y),$
$\Lambda^*_{(r)}\nabla c^*_{(r)}\Delta x + 2k^{-1}\Delta\lambda_{(r)} = 2k^{-1}\big(I_r - E_{(r)}(\bar x,k)\big)\big(\lambda_{(r)} - \lambda^*_{(r)}\big) - r^{(2)}(\Delta x),$

or

$\varphi_k\,y_{(r)} = a(\bar x,\lambda,k) + b\big(\bar x,\lambda_{(r)},k\big) + r(y_{(r)}),$  (7.13)

where

$\varphi_k = \begin{bmatrix}\nabla^2L^* & -\nabla c^{*T}_{(r)}\\[2pt] \Lambda^*_{(r)}\nabla c^*_{(r)} & 2k^{-1}I_r\end{bmatrix},\qquad a(\bar x,\lambda,k) = \begin{bmatrix}\nabla_x L(\bar x,\lambda,k) + h(\bar x,\lambda_{(m-r)},k)\\ 0\end{bmatrix},$

$b\big(\bar x,\lambda_{(r)},k\big) = \begin{bmatrix}0\\ 2k^{-1}\big(I_r - E_{(r)}(\bar x,k)\big)\big(\lambda_{(r)} - \lambda^*_{(r)}\big)\end{bmatrix},\qquad r(y_{(r)}) = \begin{bmatrix}-r^{(1)}(y)\\ -r^{(2)}(\Delta x)\end{bmatrix},$

with $r(0) = 0$ and $L > 0$ such that $\|r(y_{(r)})\| \le L\|y_{(r)}\|^2$.

As we already know, for $k_0 > 0$ large enough and any $k \ge k_0$ the inverse matrix $\varphi_k^{-1}$ exists and there is $\bar\varkappa > 0$ independent of $k \ge k_0$ such that $\|\varphi_k^{-1}\| \le \bar\varkappa$. Therefore we can solve the system (7.13) for $y_{(r)}$:

$y_{(r)} = \varphi_k^{-1}\big[a(\bar x,\lambda,k) + b(\bar x,\lambda_{(r)},k) + r(y_{(r)})\big] = \varphi_k^{-1}\big[a(\cdot) + b(\cdot) + r(y_{(r)})\big] = C(y_{(r)}).$  (7.14)

Therefore

$\nabla C(y_{(r)}) = \varphi_k^{-1}\nabla r(y_{(r)})\quad\text{and}\quad \big\|\nabla C(y_{(r)})\big\| \le \big\|\varphi_k^{-1}\big\|\,\big\|\nabla r(y_{(r)})\big\| \le \bar\varkappa L\|y_{(r)}\|.$

So, for $\|y_{(r)}\|$ small enough we have $\|\nabla C(y_{(r)})\| \le q < 1$. In other words, the operator $C(y_{(r)})$ is a contractive operator for $\|y_{(r)}\|$ small enough.

Let us examine the contractibility of the operator $C(y_{(r)})$ in more detail. First of all we estimate $\|a(\cdot)\|$ and $\|b(\cdot)\|$. We have

$\|a(\cdot)\| \le \big\|\nabla_x L(\bar x,\lambda,k)\big\| + \big\|h(\bar x,\lambda_{(m-r)},k)\big\|.$

Note that for $\varepsilon > 0$ small enough and $\bar x \in S(x^*,\varepsilon)$ we obtain

$\bar\lambda_i = 2\lambda_i\big(1+e^{kc_i(\bar x)}\big)^{-1} \le 4\lambda_i\big(1+e^{k\sigma/2}\big)^{-1},\quad i = r+1,\dots,m.$

Hence for $k_0 > 0$ large enough and $k \ge k_0$ one can find a small enough $\eta > 0$ such that

$\bar\lambda_i \le \eta k^{-1}\lambda_i = \eta k^{-1}\big(\lambda_i - \lambda^*_i\big),\quad i = r+1,\dots,m,$

i.e.,

$\big\|\bar\lambda_{(m-r)} - \lambda^*_{(m-r)}\big\| = \big\|\bar\lambda_{(m-r)}\big\| \le \eta k^{-1}\big\|\lambda_{(m-r)} - \lambda^*_{(m-r)}\big\|,$  (7.15)

and

$\big\|h(\bar x,\lambda_{(m-r)},k)\big\| \le \sum_{i=r+1}^m 2\lambda_i\big(1+e^{kc_i(\bar x)}\big)^{-1}\big\|\nabla c_i(\bar x)\big\| \le \sum_{i=r+1}^m 8\lambda_i\big(1+e^{k\sigma/2}\big)^{-1}\big\|\nabla c_i(\bar x)\big\|.$

For $k_0 > 0$ large enough and any $k \ge k_0$ we have $4\big(1+e^{k\sigma/2}\big)^{-1}\|\nabla c_i(\bar x)\| \le 2k^{-1}$, $i = r+1,\dots,m$. Therefore

$\big\|h(\bar x,\lambda_{(m-r)},k)\big\| \le 2k^{-1}\big\|\lambda_{(m-r)} - \lambda^*_{(m-r)}\big\| \le 2k^{-1}\|\lambda - \lambda^*\|.$

Further, from (7.1) we obtain

$\big\|\nabla_x L(\bar x,\lambda,k)\big\| \le \tau k^{-1}\|\bar\lambda - \lambda\| \le \tau k^{-1}\|\bar\lambda - \lambda^*\| + \tau k^{-1}\|\lambda - \lambda^*\|.$  (7.16)

Therefore

$\|a(\cdot)\| \le \tau k^{-1}\|\bar\lambda - \lambda^*\| + (2+\tau)k^{-1}\|\lambda - \lambda^*\|.$  (7.17)

Further,

$I_r - E_{(r)}(\bar x,k) = \operatorname{diag}\big(1 - e_i(\bar x,k)\big)_{i=1}^r = \operatorname{diag}\big(2e^{-kc_i(\bar x)}(1+e^{-kc_i(\bar x)})^{-1}\big)_{i=1}^r.$

Hence

$\|b(\cdot)\| \le 2k^{-1}\big\|\lambda_{(r)} - \lambda^*_{(r)}\big\| \le 2k^{-1}\|\lambda - \lambda^*\|.$  (7.18)

From (7.14), (7.15) we obtain

$\|y_{(r)}\| \le \big\|\varphi_k^{-1}\big\|\big[\|a(\cdot)\| + \|b(\cdot)\| + \|r(y_{(r)})\|\big] \le \bar\varkappa\Big(\tau k^{-1}\big\|\bar\lambda_{(r)} - \lambda^*_{(r)}\big\| + \tau k^{-1}\big\|\bar\lambda_{(m-r)}\big\| + (4+\tau)k^{-1}\|\lambda - \lambda^*\| + \big\|r(y_{(r)})\big\|\Big).$

Then, in view of $\|\bar\lambda_{(r)} - \lambda^*_{(r)}\| \le \|y_{(r)}\|$, $\|\bar\lambda_{(m-r)}\| \le \eta k^{-1}\|\lambda_{(m-r)} - \lambda^*_{(m-r)}\| \le \eta k^{-1}\|\lambda - \lambda^*\|$ and $\|r(y_{(r)})\| \le \frac{L}{2}\|y_{(r)}\|^2$, we obtain

$\|y_{(r)}\| \le \bar\varkappa\Big(\tau k^{-1}\|y_{(r)}\| + (5+\tau)k^{-1}\|\lambda - \lambda^*\| + \frac{L}{2}\|y_{(r)}\|^2\Big),$

or

$\frac{\bar\varkappa L}{2}\|y_{(r)}\|^2 - \big(1 - \bar\varkappa\tau k^{-1}\big)\|y_{(r)}\| + (5+\tau)\bar\varkappa k^{-1}\|\lambda - \lambda^*\| \ge 0,$

and

$\|y_{(r)}\| \le \frac{1}{\bar\varkappa L}\bigg[\Big(1 - \frac{\bar\varkappa\tau}{k}\Big) - \Big(\Big(1 - \frac{\bar\varkappa\tau}{k}\Big)^2 - \frac{2L\bar\varkappa^2(5+\tau)\|\lambda - \lambda^*\|}{k}\Big)^{1/2}\bigg].$

If $k_0 > 0$ is large enough then for any $k \ge k_0$ we have

$\Big(\Big(1 - \frac{\bar\varkappa\tau}{k}\Big)^2 - \frac{2L\bar\varkappa^2(5+\tau)\|\lambda - \lambda^*\|}{k}\Big)^{1/2} \ge \Big(1 - \frac{\bar\varkappa\tau}{k}\Big) - \frac{2L\bar\varkappa^2(5+\tau)\|\lambda - \lambda^*\|}{k}.$

Therefore

$\|y_{(r)}\| \le \frac{2\bar\varkappa(5+\tau)}{k}\|\lambda - \lambda^*\|.$

So, in view of (7.15), for $c = \max\{2\bar\varkappa,\eta\}$ we have

$\max\big\{\|\bar x - x^*\|,\ \|\bar\lambda - \lambda^*\|\big\} \le \frac{c(5+\tau)}{k}\|\lambda - \lambda^*\|.$

The proof is completed.

Remark 7.1. The results of theorem 7.1 remain true whether $f(x)$ and all $c_i(x)$ are convex or not.

8. Primal-dual LS method

The numerical realization of the LS method (5.1), (5.2) leads to finding an approximation $\bar x$ from (7.1) and updating the Lagrange multipliers by formula (7.2). To find $\bar x$ one can use the Newton method. The Newton LS method has been described in [12].

In this section we consider another approach to the numerical realization of the LS multipliers method (5.1), (5.2). Instead of using the Newton method to find $\bar x$ and then updating the Lagrange multipliers, we will use the Newton method for solving the following primal-dual system:

$\nabla_x L(\hat x,\hat\lambda) = \nabla f(\hat x) - \sum_{i=1}^m\hat\lambda_i\nabla c_i(\hat x) = 0,$  (8.1)

$\hat\lambda = \psi'\big(kc(\hat x)\big)\lambda$  (8.2)

for $\hat x$ and $\hat\lambda$ under the fixed $k > 0$ and $\lambda \in \mathbb{R}^m_{++}$, where $\psi'(kc(\hat x)) = \operatorname{diag}\big(\psi'(kc_i(\hat x))\big)_{i=1}^m$. After finding an approximation $(\bar x,\bar\lambda)$ for the primal-dual pair $(\hat x,\hat\lambda)$ we replace $\lambda$ by $\bar\lambda$ and take $(\bar x,\bar\lambda)$ as a starting point for the new system. We apply the Newton method for solving (8.1), (8.2) using $(\bar x,\bar\lambda)$ as a starting point. By linearizing (8.1), (8.2) we obtain the following system for the Newton direction $(\Delta x,\Delta\lambda)$:

$\nabla f(\bar x) + \nabla^2 f(\bar x)\Delta x - \sum_{i=1}^m\big(\bar\lambda_i + \Delta\lambda_i\big)\big(\nabla c_i(\bar x) + \nabla^2 c_i(\bar x)\Delta x\big) = 0,$  (8.3)

$\bar\lambda + \Delta\lambda = \psi'\big(k(c(\bar x) + \nabla c(\bar x)\Delta x)\big)\lambda.$  (8.4)

The system (8.3) we can rewrite as follows:

$\nabla^2_{xx}L(\bar x,\bar\lambda)\Delta x - \nabla c(\bar x)^T\Delta\lambda + \nabla_x L(\bar x,\bar\lambda) - \sum_{i=1}^m\Delta\lambda_i\nabla^2 c_i(\bar x)\Delta x = 0.$  (8.5)

By ignoring the last term we obtain

$\nabla^2_{xx}L(\bar x,\bar\lambda)\Delta x - \nabla c(\bar x)^T\Delta\lambda = -\nabla_x L(\bar x,\bar\lambda).$  (8.6)

By ignoring the second order components in the representation of $\psi'(k(c(\bar x) + \nabla c(\bar x)\Delta x))$ we obtain

$\psi'\big(k(c(\bar x) + \nabla c(\bar x)\Delta x)\big) = \psi'\big(kc(\bar x)\big) + k\,\psi''\big(kc(\bar x)\big)\nabla c(\bar x)\Delta x.$

So we can rewrite the system (8.4) as follows:

$-k\,\psi''\big(kc(\bar x)\big)\Lambda\,\nabla c(\bar x)\Delta x + \Delta\lambda = \tilde\lambda - \bar\lambda,$  (8.7)

where $\psi''(kc(\bar x)) = \operatorname{diag}\big(\psi''(kc_i(\bar x))\big)_{i=1}^m$, $\Lambda = \operatorname{diag}(\lambda_i)_{i=1}^m$ and $\tilde\lambda = \psi'(kc(\bar x))\lambda$. Combining (8.6), (8.7) we obtain

$\nabla^2_{xx}L(\bar x,\bar\lambda)\Delta x - \nabla c(\bar x)^T\Delta\lambda = -\nabla_x L(\bar x,\bar\lambda),$  (8.8)

$\nabla c(\bar x)\Delta x - \big(k\,\psi''(kc(\bar x))\Lambda\big)^{-1}\Delta\lambda = -\big(k\,\psi''(kc(\bar x))\Lambda\big)^{-1}\big(\tilde\lambda - \bar\lambda\big).$  (8.9)

We find $\Delta\lambda$ from (8.9) and substitute it into (8.8). We obtain the following system for $\Delta x$:

$M(\bar x,\bar\lambda)\Delta x = -\nabla_x L(\bar x,\tilde\lambda),$  (8.10)

where

$M(\bar x,\bar\lambda) = \nabla^2_{xx}L(\bar x,\bar\lambda) - k\,\nabla c(\bar x)^T\psi''\big(kc(\bar x)\big)\Lambda\,\nabla c(\bar x)$

is a symmetric positive definite matrix. Moreover, $\operatorname{cond}M(\bar x,\bar\lambda,k)$ is stable in the neighborhood of $(x^*,\lambda^*)$ for any fixed $k \ge k_0$, see lemma 4.3. By solving the system (8.10) for $\Delta x$ we find the primal direction. The next primal approximation for $\hat x$ we obtain as

$\hat{\bar x} = \bar x + \Delta x.$  (8.11)
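A compact sketch of one primal-dual Newton step based on (8.6)-(8.11) and the dual update described next might look as follows; gradients and Hessians are supplied by the caller, and all names and signatures are our own assumptions, not the paper's code.

```python
import numpy as np

def pd_ls_step(x, lam_bar, lam, k, grad_f, hess_f, c, jac_c, hess_c):
    """One primal-dual LS Newton step.
    grad_f(x)->(n,), hess_f(x)->(n,n), c(x)->(m,), jac_c(x)->(m,n),
    hess_c(x)->(m,n,n); lam is the 'outer' multiplier vector from (8.2)."""
    J, cv = jac_c(x), c(x)
    dpsi = 2.0 / (1.0 + np.exp(k * cv))                              # psi'(k c_i)
    d2psi = -2.0 * np.exp(k * cv) / (1.0 + np.exp(k * cv)) ** 2      # psi''(k c_i)
    lam_tilde = dpsi * lam                                           # dual predictor psi'(k c(x)) * lam
    H = hess_f(x) - np.einsum("i,ijk->jk", lam_bar, hess_c(x))       # Hessian of L(x, lam_bar)
    M = H - k * J.T @ np.diag(d2psi * lam) @ J                       # matrix in (8.10); -k psi'' Lam >= 0
    grad_L_tilde = grad_f(x) - J.T @ lam_tilde
    dx = np.linalg.solve(M, -grad_L_tilde)                           # primal direction from (8.10)
    dlam = lam_tilde - lam_bar + k * (d2psi * lam) * (J @ dx)        # dual corrector from (8.7)
    return x + dx, lam_bar + dlam                                    # primal and dual updates
```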

The next dual approximation for $\hat\lambda$ we find by the formula

$\hat{\bar\lambda} = \bar\lambda + \Delta\lambda = \tilde\lambda + k\,\psi''\big(kc(\bar x)\big)\Lambda\,\nabla c(\bar x)\Delta x.$  (8.12)

So the method (8.11), (8.12) can be viewed as a dual-primal predictor-corrector. The vector $\tilde\lambda = \psi'(kc(\bar x))\lambda$ is the predictor for $\hat\lambda$. Using this dual predictor we find the primal direction $\Delta x$ from (8.10) and use it to find the dual corrector $\Delta\lambda$ by the formula $\Delta\lambda = \tilde\lambda - \bar\lambda + k\,\psi''(kc(\bar x))\Lambda\,\nabla c(\bar x)\Delta x$. The next dual approximation $\hat{\bar\lambda}$ for $\hat\lambda$ we find by (8.12).

The primal-dual LS method is fast and numerically stable in the neighborhood of $(x^*,\lambda^*)$. To make the LS method converge globally one can combine the primal-dual method with the Newton LS method using the scheme of [10]. Such an approach produced very encouraging results on a number of LP and NLP problems [12]. We will show some recently obtained results in section 10.

9. Log-sigmoid Lagrangian and duality

We have already seen that the LS Lagrangian $L(x,\lambda,k)$ has some important properties which the classical Lagrangian $L(x,\lambda)$ does not possess. Therefore one can expect that the dual function

$d_k(\lambda) = \inf\{L(x,\lambda,k) \mid x \in \mathbb{R}^n\}$  (9.1)

and the dual problem

$\lambda^* = \arg\max\{d_k(\lambda) \mid \lambda \in \mathbb{R}^m_+\}$  (9.2)

might have some extra properties as compared to the dual function $d(\lambda)$ and the dual problem (D), which are based on $L(x,\lambda)$.

First of all, due to lemmas 4.2 and 4.4, for any $\lambda \in \mathbb{R}^m_{++}$ and any $k > 0$ there exists a unique minimizer

$\hat x = \hat x(\lambda,k) = \arg\min\{L(x,\lambda,k) \mid x \in \mathbb{R}^n\}$

for any convex programming problem with a bounded optimal set. The uniqueness of the minimizer $\hat x(\lambda,k)$, together with the smoothness of $f(x)$ and $c_i(x)$, provides smoothness of the dual function $d_k(\lambda)$, which is always concave whether the primal problem (P) is convex or not. So the dual function $d_k(\lambda)$ is smooth under reasonable assumptions about the primal problem (P). Also the dual problem (9.2) is always convex.

Let us consider the properties of the dual function and the dual problem (9.2) in more detail. Assuming smoothness of $f(x)$ and $c_i(x)$ and uniqueness of $\hat x(\lambda,k)$ we can compute the gradient $\nabla d_k(\lambda)$ and the Hessian $\nabla^2 d_k(\lambda)$. For the gradient $\nabla d_k(\lambda)$ we obtain

$\nabla d_k(\lambda) = \nabla_x L\big(\hat x,\lambda,k\big)\,\nabla_\lambda\hat x(\lambda,k) + \nabla_\lambda L\big(\hat x,\lambda,k\big),$

where $\nabla_\lambda\hat x(\lambda,k) = J_\lambda(\hat x(\lambda,k))$ is the Jacobian of the vector-function $\hat x(\lambda,k)$.

In view of $\nabla_x L(\hat x,\lambda,k) = 0$ we have

$\nabla d_k(\lambda) = \nabla_\lambda L\big(\hat x(\lambda,k),\lambda,k\big) = \nabla_\lambda L\big(\hat x(\cdot),\cdot\big) = 2k^{-1}\Big(\ln\frac{1+e^{-kc_1(\hat x)}}{2},\dots,\ln\frac{1+e^{-kc_m(\hat x)}}{2}\Big)^T = 2k^{-1}\ln\frac{1+e^{-kc(\hat x)}}{2},$  (9.3)

where $\ln\frac{1+e^{-kc(x)}}{2}$ is a column vector with components $\ln\frac{1+e^{-kc_i(x)}}{2}$, $i = 1,\dots,m$.

Since $\nabla^2_{xx}L(\hat x(\lambda,k),\lambda,k)$ is positive definite, the system $\nabla_x L(x,\lambda,k) = 0$ yields a unique vector-function $\hat x(\lambda,k)$ such that $\hat x(\lambda,k) = \hat x$ and

$\nabla_x L\big(\hat x(\lambda,k),\lambda,k\big) \equiv \nabla_x L\big(\hat x(\cdot),\cdot\big) \equiv 0.$  (9.4)

By differentiating (9.4) in $\lambda$ we obtain

$\nabla^2_{xx}L\big(\hat x(\cdot),\cdot\big)\nabla_\lambda\hat x(\cdot) + \nabla^2_{x\lambda}L\big(\hat x(\cdot),\cdot\big) = 0,$

therefore

$\nabla_\lambda\hat x(\lambda,k) = \nabla_\lambda\hat x(\cdot) = -\big(\nabla^2_{xx}L(\hat x(\cdot),\cdot)\big)^{-1}\nabla^2_{x\lambda}L\big(\hat x(\cdot),\cdot\big).$  (9.5)

Let us consider the Hessian of the dual function. Using (9.3) and (9.5) we obtain

$\nabla^2 d_k(\lambda) = \nabla_\lambda\big(\nabla_\lambda d_k(\lambda)\big) = 2k^{-1}\nabla_\lambda\ln\frac{1+e^{-kc(\hat x(\lambda,k))}}{2} = \nabla^2_{\lambda x}L\big(\hat x,\lambda,k\big)\nabla_\lambda\hat x(\lambda,k) = -\nabla^2_{\lambda x}L\big(\hat x(\cdot),\cdot\big)\big(\nabla^2_{xx}L(\hat x(\cdot),\cdot)\big)^{-1}\nabla^2_{x\lambda}L\big(\hat x(\cdot),\cdot\big).$  (9.6)

To compute $\nabla^2_{\lambda x}L(\hat x(\cdot),\cdot)$ let us consider

$\nabla_\lambda L\big(\hat x(\cdot),\cdot\big) = 2k^{-1}\begin{pmatrix}\ln\big(1+e^{-kc_1(\hat x(\cdot))}\big) - \ln 2\\ \vdots\\ \ln\big(1+e^{-kc_m(\hat x(\cdot))}\big) - \ln 2\end{pmatrix}.$

Then for the Jacobian $\nabla_x\big(\nabla_\lambda L(\hat x(\cdot),\cdot)\big) = \nabla^2_{\lambda x}L(\hat x(\cdot),\cdot)$ we obtain

$\nabla^2_{\lambda x}L\big(\hat x(\cdot),\cdot\big) = -2\begin{pmatrix}\big(1+e^{kc_1(\hat x(\cdot))}\big)^{-1}\nabla c_1\big(\hat x(\cdot)\big)\\ \vdots\\ \big(1+e^{kc_m(\hat x(\cdot))}\big)^{-1}\nabla c_m\big(\hat x(\cdot)\big)\end{pmatrix} = -\psi'\big(kc(\hat x(\cdot))\big)\nabla c\big(\hat x(\cdot)\big),$

where $\psi'(kc(\hat x(\cdot))) = 2\operatorname{diag}\big[(1+e^{kc_i(\hat x(\cdot))})^{-1}\big]_{i=1}^m$. Therefore

$\nabla^2_{x\lambda}L\big(\hat x(\cdot),\cdot\big) = \big(\nabla^2_{\lambda x}L(\hat x(\cdot),\cdot)\big)^T = -\nabla c\big(\hat x(\cdot)\big)^T\psi'\big(kc(\hat x(\cdot))\big).$

By substituting $\nabla^2_{\lambda x}L(\hat x(\cdot),\cdot)$ and $\nabla^2_{x\lambda}L(\hat x(\cdot),\cdot)$ into (9.6) we obtain the following formula for the Hessian of the dual function:

$\nabla^2 d_k(\lambda) = -\psi'\big(kc(\hat x(\cdot))\big)\nabla c\big(\hat x(\cdot)\big)\big(\nabla^2_{xx}L(\hat x(\cdot),\cdot)\big)^{-1}\nabla c\big(\hat x(\cdot)\big)^T\psi'\big(kc(\hat x(\cdot))\big).$  (9.7)
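The gradient formula (9.3) is easy to check numerically against finite differences of $d_k$ on a toy problem; the sketch below (illustrative assumptions throughout) reuses the toy problem from section 5.

```python
import numpy as np
from scipy.optimize import minimize, approx_fprime

f = lambda x: (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2
c = lambda x: np.array([1.0 - x[0] - x[1]])
k = 10.0

def lsl(x, lam):
    # log-sigmoid Lagrangian (4.3) with the transformed objective (4.1)
    return np.logaddexp(0.0, f(x)) + (2.0 / k) * np.sum(lam * (np.logaddexp(0.0, -k * c(x)) - np.log(2.0)))

def d_k(lam):
    # dual function (9.1): value of the unconstrained minimum of the LSL in x
    return minimize(lsl, np.zeros(2), args=(lam,), method="BFGS").fun

lam = np.array([0.7])
x_hat = minimize(lsl, np.zeros(2), args=(lam,), method="BFGS").x
grad_formula = (2.0 / k) * (np.logaddexp(0.0, -k * c(x_hat)) - np.log(2.0))   # formula (9.3)
grad_fd = approx_fprime(lam, d_k, 1e-6)
print("formula:", grad_formula, "  finite difference:", grad_fd)   # should agree to a few digits
```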

Note that

$\nabla^2 d_k(\lambda^*) = -\psi'\big(kc(x^*)\big)\nabla c(x^*)\big(\nabla^2_{xx}L(x^*,\lambda^*,k)\big)^{-1}\nabla c(x^*)^T\psi'\big(kc(x^*)\big).$  (9.8)

We proved the following lemma.

Lemma 9.1. If $f(x)$ and all $c_i(x) \in C^2$ then:
1) if (P) is a convex programming problem and A) is true, then the dual function $d_k(\lambda) \in C^2$ for any $\lambda \in \mathbb{R}^m_{++}$ and any $k > 0$;
2) if (P) is a convex programming problem and $f(x)$ is strongly convex, then the dual function $d_k(\lambda) \in C^2$ for any $\lambda \in \mathbb{R}^m_+$ and any $k > 0$;
3) if the standard second order optimality conditions (2.4), (2.5) are satisfied, then $d_k(\lambda) \in C^2$ for any pair $(\lambda,k) \in D(\cdot)$, $k \ge k_0$, whether the problem (P) is convex or not.
The gradient $\nabla d_k(\lambda)$ is given by (9.3) and the Hessian $\nabla^2 d_k(\lambda)$ is given by (9.7).

Theorem 9.1 (Duality).
1) If the Slater condition B) holds, then the existence of the primal solution implies the existence of the dual solution and

$f(x^*) = d_k(\lambda^*)$  (9.9)

for any $k > 0$.
2) If $f(x)$ is strictly convex, $f(x)$ and all $c_i(x)$ are smooth and the dual solution exists, then the primal solution exists and (9.9) holds for any $k > 0$.
3) If $f(x)$ and all $c_i(x) \in C^2$ and (2.4), (2.5) are satisfied, then the second order optimality conditions hold true for the dual problem for any $k \ge k_0$ if $k_0$ is large enough.

Proof. 1) The primal solution $x^*$ is at the same time a solution of the equivalent problem (4.2). Therefore, keeping in mind the Slater condition B), we obtain a vector $\lambda^* \in \mathbb{R}^m_+$ such that

$\lambda^*_i c_i(x^*) = 0,\quad i = 1,\dots,m,\qquad L(x^*,\lambda^*,k) \le L(x,\lambda^*,k),\quad \forall x \in \mathbb{R}^n,\ k > 0.$

Therefore

$d_k(\lambda^*) = \min_{x\in\mathbb{R}^n}L(x,\lambda^*,k) = L(x^*,\lambda^*,k) = f(x^*) \ge L(x^*,\lambda,k) \ge \min_{x\in\mathbb{R}^n}L(x,\lambda,k) = d_k(\lambda),\quad \forall\lambda \in \mathbb{R}^m_+.$

Hence $\lambda^* \in \mathbb{R}^m_+$ is the solution of the dual problem and (9.9) holds.

2) Let us assume that $\bar\lambda \in \mathbb{R}^m_+$ is the solution of the dual problem. If $f(x)$ is strictly convex then the function $L(x,\bar\lambda,k)$ is strictly convex in $x \in \mathbb{R}^n$ too. Therefore the gradient $\nabla d_k(\bar\lambda)$ exists. Consider the optimality conditions for the dual problem:

$\bar\lambda_i = 0 \;\Rightarrow\; \nabla_{\lambda_i}d_k(\bar\lambda) \le 0,$  (9.10)

$\bar\lambda_i > 0 \;\Rightarrow\; \nabla_{\lambda_i}d_k(\bar\lambda) = 0.$  (9.11)

Let $\bar x = \arg\min\{L(x,\bar\lambda,k) \mid x \in \mathbb{R}^n\}$; then

$\nabla_{\lambda_i}d_k(\bar\lambda) = 2k^{-1}\ln\frac{1+e^{-kc_i(\bar x)}}{2}.$

From (9.10) we obtain

$\bar\lambda_i = 0 \;\Rightarrow\; 2k^{-1}\ln\frac{1+e^{-kc_i(\bar x)}}{2} \le 0 \;\Rightarrow\; e^{-kc_i(\bar x)} \le 1 \;\Rightarrow\; c_i(\bar x) \ge 0.$

From (9.11) we have

$\bar\lambda_i > 0 \;\Rightarrow\; \ln\frac{1+e^{-kc_i(\bar x)}}{2} = 0 \;\Rightarrow\; e^{-kc_i(\bar x)} = 1 \;\Rightarrow\; c_i(\bar x) = 0.$

Therefore $(\bar x,\bar\lambda)$ is a primal-dual feasible pair for which the complementarity conditions hold, i.e., $\bar x = x^*$, $\bar\lambda = \lambda^*$.

3) To prove that for the dual problem the standard second order optimality conditions hold true, we consider the Lagrangian for the dual problem

$\lambda^* = \arg\max\{d_k(\lambda) \mid \lambda_i \ge 0,\ i = 1,\dots,m\}:\qquad \mathcal{L}(\lambda,v,k) = d_k(\lambda) + \sum_{i=1}^m v_i\lambda_i.$

We have

$\nabla^2_{\lambda\lambda}\mathcal{L}(\lambda,v,k) = \nabla^2_{\lambda\lambda}d_k(\lambda).$  (9.12)

The active dual constraints are $\lambda_i = 0$, $i = r+1,\dots,m$, and the vectors $e_i = (0,\dots,0,1,0,\dots,0)$, $i = r+1,\dots,m$, are the gradients of the active dual constraints. Therefore the tangent subspace to the dual active set at the point $\lambda^*$ is

$Y = \{y \in \mathbb{R}^m : (y,e_i) = 0,\ i = r+1,\dots,m\} = \{y \in \mathbb{R}^m : y = (y_1,\dots,y_r,0,\dots,0)\}.$

It is clear that the gradients $e_i$, $i = r+1,\dots,m$, of the active dual constraints are linearly independent. So, to prove that the second order optimality conditions hold true for the dual problem we have to show

$-\big(\nabla^2_{\lambda\lambda}\mathcal{L}(\lambda^*,v^*,k)y,y\big) \ge \bar\mu\|y\|^2_2,\quad \bar\mu > 0,\ \forall y \in Y.$

Using (9.8) and (9.12) we obtain

$-\big(\nabla^2_{\lambda\lambda}\mathcal{L}(\lambda^*,v^*,k)y,y\big) = -\big(\nabla^2_{\lambda\lambda}d_k(\lambda^*)y,y\big) = \Big(\psi'\big(kc(x^*)\big)\nabla c(x^*)\big(\nabla^2_{xx}L(x^*,\lambda^*,k)\big)^{-1}\nabla c(x^*)^T\psi'\big(kc(x^*)\big)y,\ y\Big) = \Big(\big(\nabla^2_{xx}L(x^*,\lambda^*,k)\big)^{-1}\nabla c(x^*)^T\bar y,\ \nabla c(x^*)^T\bar y\Big),$

where

$\bar y = \psi'\big(kc(x^*)\big)y = \big(\psi'(kc_1(x^*))y_1,\dots,\psi'(kc_r(x^*))y_r,0,\dots,0\big) = (y_1,\dots,y_r,0,\dots,0) = (y_{(r)},0) = y,$

because $\psi'(kc_i(x^*)) = \psi'(0) = 1$ for $i = 1,\dots,r$. In other words,

$-\big(\nabla^2_{\lambda\lambda}\mathcal{L}(\lambda^*,v^*,k)y,y\big) = \Big(\big(\nabla^2_{xx}L(x^*,\lambda^*,k)\big)^{-1}\nabla c(x^*)^Ty,\ \nabla c(x^*)^Ty\Big).$

Using (4.4) we obtain

$(M_0k)^{-1}(y,y) \le \Big(\big(\nabla^2_{xx}L(x^*,\lambda^*,k)\big)^{-1}y,y\Big) \le \mu_0^{-1}(y,y),$

hence

$-\big(\nabla^2_{\lambda\lambda}\mathcal{L}(\lambda^*,v^*,k)y,y\big) \ge (M_0k)^{-1}\big(\nabla c_{(r)}(x^*)^Ty_{(r)},\ \nabla c_{(r)}(x^*)^Ty_{(r)}\big) = (M_0k)^{-1}\big(\nabla c_{(r)}(x^*)\nabla c_{(r)}(x^*)^Ty_{(r)},\ y_{(r)}\big).$  (9.13)

Due to (2.4) the Gram matrix $\nabla c_{(r)}(x^*)\nabla c_{(r)}(x^*)^T$ is positive definite, therefore there is $\mu^* > 0$ such that

$\big(\nabla c_{(r)}(x^*)\nabla c_{(r)}(x^*)^Ty_{(r)},\ y_{(r)}\big) \ge \mu^*\|y_{(r)}\|^2_2.$

Therefore, in view of (9.13), we obtain

$-\big(\nabla^2_{\lambda\lambda}\mathcal{L}(\lambda^*,v^*,k)y,y\big) \ge \bar\mu\|y\|^2_2,\quad \forall y \in Y,$  (9.14)

where $\bar\mu = (M_0k)^{-1}\mu^*$. So the standard second order optimality conditions hold true for the dual problem.

Corollary 9.1. If (2.4), (2.5) hold and $k_0 > 0$ is large enough, then for any $k \ge k_0$ the restriction

$d_k\big(\lambda_{(r)}\big) = d_k(\lambda)\big|_{\lambda_{r+1}=0,\dots,\lambda_m=0}$

of the dual function to the manifold of the dual active constraints is strongly concave.

The properties of $d_k(\lambda_{(r)})$ allow one to use smooth unconstrained optimization techniques, in particular the Newton method, for solving the dual problem. This leads to the second order LS multipliers method.

Remark 9.1. Part 3) of theorem 9.1 holds true even for nonconvex optimization problems. It is not true if instead of $L(x,\lambda,k)$ one uses the classical Lagrangian $L(x,\lambda)$.

10. Numerical results

The primal-dual LS method we described in section 8, generally speaking, does not converge globally. However, locally it converges very fast. Therefore in the first stage of the computation we used a path-following type approach with the LS penalty function (6.1), i.e., we find an approximation for $x(k)$ and increase $k > 0$ from step to step. For the unconstrained minimization of $P(x,k)$ in $x$ we used the Newton method with step-length. When the duality gap becomes reasonably small we use the primal approximation for $x(k)$ to compute the approximation $\lambda$ for the Lagrange multipliers $\lambda(k)$, and then the primal-dual vector $(x,\lambda)$ is used as a starting point in the primal-dual method (8.11), (8.12). The first stage consumes most of the computational time, while the primal-dual method (8.11), (8.12) requires only a few steps to reduce the duality gap and the infeasibility. For all problems which have been solved we observed the "hot start" phenomenon (see [10,11]), when few, and from some point on only one, Newton step is required for finding a high-accuracy approximation to the solution of the primal-dual system (8.1), (8.2).

In the following tables we show numerical results obtained by using the NR multipliers method for several problems which we downloaded from Dr. R. Vanderbei's webpage. The columns report the iteration count, the objective value f, the gradient norm g, the duality gap, the infeasibility and the step-length (it, f, g, gap, inf, step), first for the path-following stage and then for the primal-dual algorithm.

Name: catenary; n = 198, m = 298, p = 100.
Objective: linear. Constraints: convex quadratic. Feasible set: convex. This model finds the shape of a hanging chain; the solution is known to be y = cosh(a*x) + b for appropriate a and b.

Path following... (it, f, g, gap, inf, step)
Primal-Dual algorithm...

Name: esfl_socp; n = 1002, m = 2002, p = 1000.

Path following... (it, f, g, gap, inf, step)
Primal-Dual algorithm...

Name: fekete; n = 150, m = 200, p = 50.
Objective: nonconvex nonlinear. Constraints: convex quadratic.

Path following... (it, f, g, gap, inf, step)
Primal-Dual algorithm...

Name: fir_socp; n = 12, m = 319, p = 307.

Path following... (it, f, g, gap, inf, step)
Primal-Dual algorithm...

Name: hydrothermal; n = 46, m = 55, p = 9.
Objective: nonconvex nonlinear. Constraints: nonconvex nonlinear.

Path following... (it, f, g, gap, inf, step)
Primal-Dual algorithm...


More information

The Kuhn-Tucker and Envelope Theorems

The Kuhn-Tucker and Envelope Theorems The Kuhn-Tucker and Envelope Theorems Peter Ireland ECON 77200 - Math for Economists Boston College, Department of Economics Fall 207 The Kuhn-Tucker and envelope theorems can be used to characterize the

More information

Duality Uses and Correspondences. Ryan Tibshirani Convex Optimization

Duality Uses and Correspondences. Ryan Tibshirani Convex Optimization Duality Uses and Correspondences Ryan Tibshirani Conve Optimization 10-725 Recall that for the problem Last time: KKT conditions subject to f() h i () 0, i = 1,... m l j () = 0, j = 1,... r the KKT conditions

More information

Support Vector Machines: Maximum Margin Classifiers

Support Vector Machines: Maximum Margin Classifiers Support Vector Machines: Maximum Margin Classifiers Machine Learning and Pattern Recognition: September 16, 2008 Piotr Mirowski Based on slides by Sumit Chopra and Fu-Jie Huang 1 Outline What is behind

More information

Optimisation in Higher Dimensions

Optimisation in Higher Dimensions CHAPTER 6 Optimisation in Higher Dimensions Beyond optimisation in 1D, we will study two directions. First, the equivalent in nth dimension, x R n such that f(x ) f(x) for all x R n. Second, constrained

More information

Structural and Multidisciplinary Optimization. P. Duysinx and P. Tossings

Structural and Multidisciplinary Optimization. P. Duysinx and P. Tossings Structural and Multidisciplinary Optimization P. Duysinx and P. Tossings 2018-2019 CONTACTS Pierre Duysinx Institut de Mécanique et du Génie Civil (B52/3) Phone number: 04/366.91.94 Email: P.Duysinx@uliege.be

More information

A SHIFTED PRIMAL-DUAL PENALTY-BARRIER METHOD FOR NONLINEAR OPTIMIZATION

A SHIFTED PRIMAL-DUAL PENALTY-BARRIER METHOD FOR NONLINEAR OPTIMIZATION A SHIFTED PRIMAL-DUAL PENALTY-BARRIER METHOD FOR NONLINEAR OPTIMIZATION Philip E. Gill Vyacheslav Kungurtsev Daniel P. Robinson UCSD Center for Computational Mathematics Technical Report CCoM-19-3 March

More information

2.3 Linear Programming

2.3 Linear Programming 2.3 Linear Programming Linear Programming (LP) is the term used to define a wide range of optimization problems in which the objective function is linear in the unknown variables and the constraints are

More information

Review of Optimization Basics

Review of Optimization Basics Review of Optimization Basics. Introduction Electricity markets throughout the US are said to have a two-settlement structure. The reason for this is that the structure includes two different markets:

More information

UNCONSTRAINED OPTIMIZATION PAUL SCHRIMPF OCTOBER 24, 2013

UNCONSTRAINED OPTIMIZATION PAUL SCHRIMPF OCTOBER 24, 2013 PAUL SCHRIMPF OCTOBER 24, 213 UNIVERSITY OF BRITISH COLUMBIA ECONOMICS 26 Today s lecture is about unconstrained optimization. If you re following along in the syllabus, you ll notice that we ve skipped

More information

Lecture 13: Constrained optimization

Lecture 13: Constrained optimization 2010-12-03 Basic ideas A nonlinearly constrained problem must somehow be converted relaxed into a problem which we can solve (a linear/quadratic or unconstrained problem) We solve a sequence of such problems

More information

The Kuhn-Tucker and Envelope Theorems

The Kuhn-Tucker and Envelope Theorems The Kuhn-Tucker and Envelope Theorems Peter Ireland EC720.01 - Math for Economists Boston College, Department of Economics Fall 2010 The Kuhn-Tucker and envelope theorems can be used to characterize the

More information

Lecture 5. Theorems of Alternatives and Self-Dual Embedding

Lecture 5. Theorems of Alternatives and Self-Dual Embedding IE 8534 1 Lecture 5. Theorems of Alternatives and Self-Dual Embedding IE 8534 2 A system of linear equations may not have a solution. It is well known that either Ax = c has a solution, or A T y = 0, c

More information

Optimization Problems with Constraints - introduction to theory, numerical Methods and applications

Optimization Problems with Constraints - introduction to theory, numerical Methods and applications Optimization Problems with Constraints - introduction to theory, numerical Methods and applications Dr. Abebe Geletu Ilmenau University of Technology Department of Simulation and Optimal Processes (SOP)

More information

Lagrangian Duality for Dummies

Lagrangian Duality for Dummies Lagrangian Duality for Dummies David Knowles November 13, 2010 We want to solve the following optimisation problem: f 0 () (1) such that f i () 0 i 1,..., m (2) For now we do not need to assume conveity.

More information

Algorithms for Constrained Optimization

Algorithms for Constrained Optimization 1 / 42 Algorithms for Constrained Optimization ME598/494 Lecture Max Yi Ren Department of Mechanical Engineering, Arizona State University April 19, 2015 2 / 42 Outline 1. Convergence 2. Sequential quadratic

More information

too, of course, but is perhaps overkill here.

too, of course, but is perhaps overkill here. LUNDS TEKNISKA HÖGSKOLA MATEMATIK LÖSNINGAR OPTIMERING 018-01-11 kl 08-13 1. a) CQ points are shown as A and B below. Graphically: they are the only points that share the same tangent line for both active

More information

Duality revisited. Javier Peña Convex Optimization /36-725

Duality revisited. Javier Peña Convex Optimization /36-725 Duality revisited Javier Peña Conve Optimization 10-725/36-725 1 Last time: barrier method Main idea: approimate the problem f() + I C () with the barrier problem f() + 1 t φ() tf() + φ() where t > 0 and

More information

Stochastic Optimization with Inequality Constraints Using Simultaneous Perturbations and Penalty Functions

Stochastic Optimization with Inequality Constraints Using Simultaneous Perturbations and Penalty Functions International Journal of Control Vol. 00, No. 00, January 2007, 1 10 Stochastic Optimization with Inequality Constraints Using Simultaneous Perturbations and Penalty Functions I-JENG WANG and JAMES C.

More information

Lecture 23: November 19

Lecture 23: November 19 10-725/36-725: Conve Optimization Fall 2018 Lecturer: Ryan Tibshirani Lecture 23: November 19 Scribes: Charvi Rastogi, George Stoica, Shuo Li Charvi Rastogi: 23.1-23.4.2, George Stoica: 23.4.3-23.8, Shuo

More information

Lectures 9 and 10: Constrained optimization problems and their optimality conditions

Lectures 9 and 10: Constrained optimization problems and their optimality conditions Lectures 9 and 10: Constrained optimization problems and their optimality conditions Coralia Cartis, Mathematical Institute, University of Oxford C6.2/B2: Continuous Optimization Lectures 9 and 10: Constrained

More information

Constrained Optimization and Lagrangian Duality

Constrained Optimization and Lagrangian Duality CIS 520: Machine Learning Oct 02, 2017 Constrained Optimization and Lagrangian Duality Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the lecture. They may or may

More information

Extreme Abridgment of Boyd and Vandenberghe s Convex Optimization

Extreme Abridgment of Boyd and Vandenberghe s Convex Optimization Extreme Abridgment of Boyd and Vandenberghe s Convex Optimization Compiled by David Rosenberg Abstract Boyd and Vandenberghe s Convex Optimization book is very well-written and a pleasure to read. The

More information

CONVERGENCE ANALYSIS OF AN INTERIOR-POINT METHOD FOR NONCONVEX NONLINEAR PROGRAMMING

CONVERGENCE ANALYSIS OF AN INTERIOR-POINT METHOD FOR NONCONVEX NONLINEAR PROGRAMMING CONVERGENCE ANALYSIS OF AN INTERIOR-POINT METHOD FOR NONCONVEX NONLINEAR PROGRAMMING HANDE Y. BENSON, ARUN SEN, AND DAVID F. SHANNO Abstract. In this paper, we present global and local convergence results

More information

Solving Dual Problems

Solving Dual Problems Lecture 20 Solving Dual Problems We consider a constrained problem where, in addition to the constraint set X, there are also inequality and linear equality constraints. Specifically the minimization problem

More information

CHAPTER 1-2: SHADOW PRICES

CHAPTER 1-2: SHADOW PRICES Essential Microeconomics -- CHAPTER -: SHADOW PRICES An intuitive approach: profit maimizing firm with a fied supply of an input Shadow prices 5 Concave maimization problem 7 Constraint qualifications

More information

MODIFYING SQP FOR DEGENERATE PROBLEMS

MODIFYING SQP FOR DEGENERATE PROBLEMS PREPRINT ANL/MCS-P699-1097, OCTOBER, 1997, (REVISED JUNE, 2000; MARCH, 2002), MATHEMATICS AND COMPUTER SCIENCE DIVISION, ARGONNE NATIONAL LABORATORY MODIFYING SQP FOR DEGENERATE PROBLEMS STEPHEN J. WRIGHT

More information

Lecture: Duality of LP, SOCP and SDP

Lecture: Duality of LP, SOCP and SDP 1/33 Lecture: Duality of LP, SOCP and SDP Zaiwen Wen Beijing International Center For Mathematical Research Peking University http://bicmr.pku.edu.cn/~wenzw/bigdata2017.html wenzw@pku.edu.cn Acknowledgement:

More information

Lagrangian Transformation and Interior Ellipsoid Methods in Convex Optimization

Lagrangian Transformation and Interior Ellipsoid Methods in Convex Optimization 1 Lagrangian Transformation and Interior Ellipsoid Methods in Convex Optimization ROMAN A. POLYAK Department of SEOR and Mathematical Sciences Department George Mason University 4400 University Dr, Fairfax

More information

Optimization and Root Finding. Kurt Hornik

Optimization and Root Finding. Kurt Hornik Optimization and Root Finding Kurt Hornik Basics Root finding and unconstrained smooth optimization are closely related: Solving ƒ () = 0 can be accomplished via minimizing ƒ () 2 Slide 2 Basics Root finding

More information

9. Interpretations, Lifting, SOS and Moments

9. Interpretations, Lifting, SOS and Moments 9-1 Interpretations, Lifting, SOS and Moments P. Parrilo and S. Lall, CDC 2003 2003.12.07.04 9. Interpretations, Lifting, SOS and Moments Polynomial nonnegativity Sum of squares (SOS) decomposition Eample

More information

CONSTRAINED NONLINEAR PROGRAMMING

CONSTRAINED NONLINEAR PROGRAMMING 149 CONSTRAINED NONLINEAR PROGRAMMING We now turn to methods for general constrained nonlinear programming. These may be broadly classified into two categories: 1. TRANSFORMATION METHODS: In this approach

More information

A null-space primal-dual interior-point algorithm for nonlinear optimization with nice convergence properties

A null-space primal-dual interior-point algorithm for nonlinear optimization with nice convergence properties A null-space primal-dual interior-point algorithm for nonlinear optimization with nice convergence properties Xinwei Liu and Yaxiang Yuan Abstract. We present a null-space primal-dual interior-point algorithm

More information

Nonlinear Optimization: What s important?

Nonlinear Optimization: What s important? Nonlinear Optimization: What s important? Julian Hall 10th May 2012 Convexity: convex problems A local minimizer is a global minimizer A solution of f (x) = 0 (stationary point) is a minimizer A global

More information

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Instructor: Moritz Hardt Email: hardt+ee227c@berkeley.edu Graduate Instructor: Max Simchowitz Email: msimchow+ee227c@berkeley.edu

More information

Numerical Optimization

Numerical Optimization Constrained Optimization - Algorithms Computer Science and Automation Indian Institute of Science Bangalore 560 012, India. NPTEL Course on Consider the problem: Barrier and Penalty Methods x X where X

More information

Part 2: NLP Constrained Optimization

Part 2: NLP Constrained Optimization Part 2: NLP Constrained Optimization James G. Shanahan 2 Independent Consultant and Lecturer UC Santa Cruz EMAIL: James_DOT_Shanahan_AT_gmail_DOT_com WIFI: SSID Student USERname ucsc-guest Password EnrollNow!

More information

Interior Point Algorithms for Constrained Convex Optimization

Interior Point Algorithms for Constrained Convex Optimization Interior Point Algorithms for Constrained Convex Optimization Chee Wei Tan CS 8292 : Advanced Topics in Convex Optimization and its Applications Fall 2010 Outline Inequality constrained minimization problems

More information

Convex Optimization. Newton s method. ENSAE: Optimisation 1/44

Convex Optimization. Newton s method. ENSAE: Optimisation 1/44 Convex Optimization Newton s method ENSAE: Optimisation 1/44 Unconstrained minimization minimize f(x) f convex, twice continuously differentiable (hence dom f open) we assume optimal value p = inf x f(x)

More information

A Unified Analysis of Nonconvex Optimization Duality and Penalty Methods with General Augmenting Functions

A Unified Analysis of Nonconvex Optimization Duality and Penalty Methods with General Augmenting Functions A Unified Analysis of Nonconvex Optimization Duality and Penalty Methods with General Augmenting Functions Angelia Nedić and Asuman Ozdaglar April 16, 2006 Abstract In this paper, we study a unifying framework

More information

Constrained Optimization

Constrained Optimization 1 / 22 Constrained Optimization ME598/494 Lecture Max Yi Ren Department of Mechanical Engineering, Arizona State University March 30, 2015 2 / 22 1. Equality constraints only 1.1 Reduced gradient 1.2 Lagrange

More information

In view of (31), the second of these is equal to the identity I on E m, while this, in view of (30), implies that the first can be written

In view of (31), the second of these is equal to the identity I on E m, while this, in view of (30), implies that the first can be written 11.8 Inequality Constraints 341 Because by assumption x is a regular point and L x is positive definite on M, it follows that this matrix is nonsingular (see Exercise 11). Thus, by the Implicit Function

More information

Inexact Solution of NLP Subproblems in MINLP

Inexact Solution of NLP Subproblems in MINLP Ineact Solution of NLP Subproblems in MINLP M. Li L. N. Vicente April 4, 2011 Abstract In the contet of conve mied-integer nonlinear programming (MINLP, we investigate how the outer approimation method

More information

Duality Theory of Constrained Optimization

Duality Theory of Constrained Optimization Duality Theory of Constrained Optimization Robert M. Freund April, 2014 c 2014 Massachusetts Institute of Technology. All rights reserved. 1 2 1 The Practical Importance of Duality Duality is pervasive

More information

Interior Point Methods. We ll discuss linear programming first, followed by three nonlinear problems. Algorithms for Linear Programming Problems

Interior Point Methods. We ll discuss linear programming first, followed by three nonlinear problems. Algorithms for Linear Programming Problems AMSC 607 / CMSC 764 Advanced Numerical Optimization Fall 2008 UNIT 3: Constrained Optimization PART 4: Introduction to Interior Point Methods Dianne P. O Leary c 2008 Interior Point Methods We ll discuss

More information

Lecture 18: Optimization Programming

Lecture 18: Optimization Programming Fall, 2016 Outline Unconstrained Optimization 1 Unconstrained Optimization 2 Equality-constrained Optimization Inequality-constrained Optimization Mixture-constrained Optimization 3 Quadratic Programming

More information

On well definedness of the Central Path

On well definedness of the Central Path On well definedness of the Central Path L.M.Graña Drummond B. F. Svaiter IMPA-Instituto de Matemática Pura e Aplicada Estrada Dona Castorina 110, Jardim Botânico, Rio de Janeiro-RJ CEP 22460-320 Brasil

More information

4TE3/6TE3. Algorithms for. Continuous Optimization

4TE3/6TE3. Algorithms for. Continuous Optimization 4TE3/6TE3 Algorithms for Continuous Optimization (Algorithms for Constrained Nonlinear Optimization Problems) Tamás TERLAKY Computing and Software McMaster University Hamilton, November 2005 terlaky@mcmaster.ca

More information

INTERIOR-POINT METHODS FOR NONCONVEX NONLINEAR PROGRAMMING: CONVERGENCE ANALYSIS AND COMPUTATIONAL PERFORMANCE

INTERIOR-POINT METHODS FOR NONCONVEX NONLINEAR PROGRAMMING: CONVERGENCE ANALYSIS AND COMPUTATIONAL PERFORMANCE INTERIOR-POINT METHODS FOR NONCONVEX NONLINEAR PROGRAMMING: CONVERGENCE ANALYSIS AND COMPUTATIONAL PERFORMANCE HANDE Y. BENSON, ARUN SEN, AND DAVID F. SHANNO Abstract. In this paper, we present global

More information

12. Interior-point methods

12. Interior-point methods 12. Interior-point methods Convex Optimization Boyd & Vandenberghe inequality constrained minimization logarithmic barrier function and central path barrier method feasibility and phase I methods complexity

More information

Math 273a: Optimization Lagrange Duality

Math 273a: Optimization Lagrange Duality Math 273a: Optimization Lagrange Duality Instructor: Wotao Yin Department of Mathematics, UCLA Winter 2015 online discussions on piazza.com Gradient descent / forward Euler assume function f is proper

More information

4TE3/6TE3. Algorithms for. Continuous Optimization

4TE3/6TE3. Algorithms for. Continuous Optimization 4TE3/6TE3 Algorithms for Continuous Optimization (Duality in Nonlinear Optimization ) Tamás TERLAKY Computing and Software McMaster University Hamilton, January 2004 terlaky@mcmaster.ca Tel: 27780 Optimality

More information

Primal/Dual Decomposition Methods

Primal/Dual Decomposition Methods Primal/Dual Decomposition Methods Daniel P. Palomar Hong Kong University of Science and Technology (HKUST) ELEC5470 - Convex Optimization Fall 2018-19, HKUST, Hong Kong Outline of Lecture Subgradients

More information

1.4 FOUNDATIONS OF CONSTRAINED OPTIMIZATION

1.4 FOUNDATIONS OF CONSTRAINED OPTIMIZATION Essential Microeconomics -- 4 FOUNDATIONS OF CONSTRAINED OPTIMIZATION Fundamental Theorem of linear Programming 3 Non-linear optimization problems 6 Kuhn-Tucker necessary conditions Sufficient conditions

More information

Continuous Optimisation, Chpt 6: Solution methods for Constrained Optimisation

Continuous Optimisation, Chpt 6: Solution methods for Constrained Optimisation Continuous Optimisation, Chpt 6: Solution methods for Constrained Optimisation Peter J.C. Dickinson DMMP, University of Twente p.j.c.dickinson@utwente.nl http://dickinson.website/teaching/2017co.html version:

More information

Penalty, Barrier and Augmented Lagrangian Methods

Penalty, Barrier and Augmented Lagrangian Methods Penalty, Barrier and Augmented Lagrangian Methods Jesús Omar Ocegueda González Abstract Infeasible-Interior-Point methods shown in previous homeworks are well behaved when the number of constraints are

More information

LECTURE 25: REVIEW/EPILOGUE LECTURE OUTLINE

LECTURE 25: REVIEW/EPILOGUE LECTURE OUTLINE LECTURE 25: REVIEW/EPILOGUE LECTURE OUTLINE CONVEX ANALYSIS AND DUALITY Basic concepts of convex analysis Basic concepts of convex optimization Geometric duality framework - MC/MC Constrained optimization

More information

I.3. LMI DUALITY. Didier HENRION EECI Graduate School on Control Supélec - Spring 2010

I.3. LMI DUALITY. Didier HENRION EECI Graduate School on Control Supélec - Spring 2010 I.3. LMI DUALITY Didier HENRION henrion@laas.fr EECI Graduate School on Control Supélec - Spring 2010 Primal and dual For primal problem p = inf x g 0 (x) s.t. g i (x) 0 define Lagrangian L(x, z) = g 0

More information

Part 5: Penalty and augmented Lagrangian methods for equality constrained optimization. Nick Gould (RAL)

Part 5: Penalty and augmented Lagrangian methods for equality constrained optimization. Nick Gould (RAL) Part 5: Penalty and augmented Lagrangian methods for equality constrained optimization Nick Gould (RAL) x IR n f(x) subject to c(x) = Part C course on continuoue optimization CONSTRAINED MINIMIZATION x

More information

Lecture V. Numerical Optimization

Lecture V. Numerical Optimization Lecture V Numerical Optimization Gianluca Violante New York University Quantitative Macroeconomics G. Violante, Numerical Optimization p. 1 /19 Isomorphism I We describe minimization problems: to maximize

More information

CS-E4830 Kernel Methods in Machine Learning

CS-E4830 Kernel Methods in Machine Learning CS-E4830 Kernel Methods in Machine Learning Lecture 3: Convex optimization and duality Juho Rousu 27. September, 2017 Juho Rousu 27. September, 2017 1 / 45 Convex optimization Convex optimisation This

More information

Applications of Linear Programming

Applications of Linear Programming Applications of Linear Programming lecturer: András London University of Szeged Institute of Informatics Department of Computational Optimization Lecture 9 Non-linear programming In case of LP, the goal

More information

5.6 Penalty method and augmented Lagrangian method

5.6 Penalty method and augmented Lagrangian method 5.6 Penalty method and augmented Lagrangian method Consider a generic NLP problem min f (x) s.t. c i (x) 0 i I c i (x) = 0 i E (1) x R n where f and the c i s are of class C 1 or C 2, and I and E are the

More information

Optimality Conditions for Constrained Optimization

Optimality Conditions for Constrained Optimization 72 CHAPTER 7 Optimality Conditions for Constrained Optimization 1. First Order Conditions In this section we consider first order optimality conditions for the constrained problem P : minimize f 0 (x)

More information

Pessimistic Bi-Level Optimization

Pessimistic Bi-Level Optimization Pessimistic Bi-Level Optimization Wolfram Wiesemann 1, Angelos Tsoukalas,2, Polyeni-Margarita Kleniati 3, and Berç Rustem 1 1 Department of Computing, Imperial College London, UK 2 Department of Mechanical

More information