MMA and GCMMA - two methods for nonlinear optimization

Krister Svanberg
Optimization and Systems Theory, KTH, Stockholm, Sweden.
krille@math.kth.se

This note describes the algorithms used in the author's 2007 implementations of MMA and GCMMA in Matlab. The first versions of these methods were published in [1] and [2].

1. Considered optimization problem

Throughout this note, optimization problems of the following form are considered, where the variables are $x = (x_1, \dots, x_n)^T \in \mathbb{R}^n$, $y = (y_1, \dots, y_m)^T \in \mathbb{R}^m$, and $z \in \mathbb{R}$.

$$
\begin{array}{ll}
\text{minimize} & f_0(x) + a_0 z + \displaystyle\sum_{i=1}^{m} \left( c_i y_i + \tfrac{1}{2} d_i y_i^2 \right) \\[2mm]
\text{subject to} & f_i(x) - a_i z - y_i \le 0, \quad i = 1, \dots, m \\
& x \in X, \quad y \ge 0, \quad z \ge 0.
\end{array}
\tag{1.1}
$$

Here, $X = \{ x \in \mathbb{R}^n \mid x_j^{\min} \le x_j \le x_j^{\max},\ j = 1, \dots, n \}$, where $x_j^{\min}$ and $x_j^{\max}$ are given real numbers which satisfy $x_j^{\min} < x_j^{\max}$ for all $j$; $f_0, f_1, \dots, f_m$ are given, continuously differentiable, real-valued functions on $X$; and $a_0$, $a_i$, $c_i$ and $d_i$ are given real numbers which satisfy $a_0 > 0$, $a_i \ge 0$, $c_i \ge 0$, $d_i \ge 0$ and $c_i + d_i > 0$ for all $i$, and also $a_i c_i > a_0$ for all $i$ with $a_i > 0$.

In (1.1), the "natural" optimization variables are $x_1, \dots, x_n$, while $y_1, \dots, y_m$ and $z$ are "artificial" optimization variables which should make it easier for the user to formulate and solve certain subclasses of problems, like least squares problems and min-max problems.

As a first example, assume that the user wants to solve a problem of the following "standard" form for nonlinear programming.

$$
\begin{array}{ll}
\text{minimize} & f_0(x) \\
\text{subject to} & f_i(x) \le 0, \quad i = 1, \dots, m \\
& x \in X,
\end{array}
\tag{1.2}
$$

where $f_0, f_1, \dots, f_m$ are given differentiable functions and $X$ is as above. To make problem (1.1) (almost) equivalent to this problem (1.2), first let $a_0 = 1$ and $a_i = 0$ for all $i > 0$. Then $z = 0$ in any optimal solution of (1.1), since $z$ then only adds the cost $a_0 z$ to the objective. Further, for each $i$, let $d_i = 1$ and $c_i =$ "a large number", so that the variables $y_i$ become "expensive". Then typically $y = 0$ in any optimal solution of (1.1), and the corresponding $x$ is an optimal solution of (1.2). It should be noted
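that the problem (1.1) always has feasible solutions, and in fact also at least one optimal solution. This holds even if the user's problem (1.2) does not have any feasible solutions, in which case some $y_i > 0$ in the optimal solution of (1.1).

As a second example, assume that the user wants to solve a "min-max" problem of the form

$$
\begin{array}{ll}
\text{minimize} & \displaystyle\max_{i = 1, \dots, p} \{ h_i(x) \} \\[2mm]
\text{subject to} & g_i(x) \le 0, \quad i = 1, \dots, q \\
& x \in X,
\end{array}
\tag{1.3}
$$

where $h_i$ and $g_i$ are given differentiable functions and $X$ is as above. It is assumed that the functions $h_i$ are non-negative on $X$ (possibly after adding a sufficiently large constant $C$ to each of them, the same $C$ for all $h_i$). For each given $x \in X$, the value of the objective function in problem (1.3) is the largest of the $p$ real numbers $h_1(x), \dots, h_p(x)$. Problem (1.3) may equivalently be written in the following form with variables $x \in \mathbb{R}^n$ and $z \in \mathbb{R}$:

$$
\begin{array}{ll}
\text{minimize} & z \\
\text{subject to} & z \ge h_i(x), \quad i = 1, \dots, p \\
& g_i(x) \le 0, \quad i = 1, \dots, q \\
& x \in X, \quad z \ge 0.
\end{array}
\tag{1.4}
$$

To make problem (1.1) (almost) equivalent to this problem (1.4), let
$m = p + q$,
$f_0(x) = 0$,
$f_i(x) = h_i(x)$ for $i = 1, \dots, p$,
$f_{p+i}(x) = g_i(x)$ for $i = 1, \dots, q$,
$a_0 = a_1 = \dots = a_p = 1$,
$a_{p+1} = \dots = a_{p+q} = 0$,
$d_1 = \dots = d_m = 1$,
$c_1 = \dots = c_m =$ "a large number".

As a third example, assume that the user wants to solve a constrained least squares problem of the form

$$
\begin{array}{ll}
\text{minimize} & \tfrac{1}{2} \displaystyle\sum_{i=1}^{p} \left( h_i(x) \right)^2 \\[2mm]
\text{subject to} & g_i(x) \le 0, \quad i = 1, \dots, q \\
& x \in X,
\end{array}
\tag{1.5}
$$

where $h_i$ and $g_i$ are given differentiable functions and $X$ is as above. Problem (1.5) may equivalently be written in the following form with variables $x \in \mathbb{R}^n$ and $y_1, \dots, y_{2p} \in \mathbb{R}$:

$$
\begin{array}{ll}
\text{minimize} & \tfrac{1}{2} \displaystyle\sum_{i=1}^{p} \left( y_i^2 + y_{p+i}^2 \right) \\[2mm]
\text{subject to} & y_i \ge h_i(x), \quad i = 1, \dots, p \\
& y_{p+i} \ge -h_i(x), \quad i = 1, \dots, p \\
& g_i(x) \le 0, \quad i = 1, \dots, q \\
& y_i \ge 0 \text{ and } y_{p+i} \ge 0, \quad i = 1, \dots, p \\
& x \in X.
\end{array}
\tag{1.6}
$$

To make problem (1.1) (almost) equivalent to this problem (1.6), let
$m = 2p + q$,
$f_0(x) = 0$,
$f_i(x) = h_i(x)$ for $i = 1, \dots, p$,
$f_{p+i}(x) = -h_i(x)$ for $i = 1, \dots, p$,
$f_{2p+i}(x) = g_i(x)$ for $i = 1, \dots, q$,
$a_0 = 1$,
$a_1 = \dots = a_m = 0$,
$d_1 = \dots = d_m = 1$,
$c_1 = \dots = c_{2p} = 0$,
$c_{2p+1} = \dots = c_{2p+q} =$ "a large number".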
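The coefficient choices for the min-max example (1.3)-(1.4) can be written down mechanically. The following Python sketch is only an illustration: the function names and the default `big_c = 1000.0` (standing in for "a large number") are assumptions of this example, not part of the author's Matlab code. The second function checks the sign conditions that (1.1) imposes on the coefficients.

```python
def minmax_coefficients(p, q, big_c=1000.0):
    """Return (a0, a, c, d) that cast the min-max problem (1.3) into form (1.1).

    p: number of functions h_i inside the max, q: number of constraints g_i.
    big_c is a hypothetical stand-in for the text's "a large number".
    """
    m = p + q
    a0 = 1.0
    a = [1.0] * p + [0.0] * q   # z bounds the h_i but not the g_i
    c = [big_c] * m             # makes the artificial variables y_i expensive
    d = [1.0] * m
    return a0, a, c, d


def coefficients_are_admissible(a0, a, c, d):
    """Check the conditions on a0, a_i, c_i, d_i stated below (1.1)."""
    if a0 <= 0:
        return False
    for ai, ci, di in zip(a, c, d):
        if ai < 0 or ci < 0 or di < 0 or ci + di <= 0:
            return False
        if ai > 0 and ai * ci <= a0:   # requires a_i * c_i > a0 whenever a_i > 0
            return False
    return True
```

For instance, `minmax_coefficients(2, 1)` gives `a = [1.0, 1.0, 0.0]`, and the result passes the admissibility check because each `c_i` is well above `a0`.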
2. Some practical considerations

In many applications, the constraints are of the form $g_i(x) \le g_i^{\max}$, where $g_i(x)$ stands for e.g. a certain stress, while $g_i^{\max}$ is the largest permitted value of this stress. This means that $f_i(x) = g_i(x) - g_i^{\max}$ (in (1.1) as well as in (1.2)). The user should then preferably scale the constraints in such a way that $1 \le g_i^{\max} \le 100$ for each $i$ (and not $g_i^{\max} = 10^{10}$). The objective function $f_0(x)$ should preferably be scaled such that $1 \le f_0(x) \le 100$ for reasonable values of the variables. The variables $x_j$ should preferably be scaled such that $0.1 \le x_j^{\max} - x_j^{\min} \le 100$, for all $j$.

Concerning the "large numbers" on the coefficients $c_i$ (mentioned above), the user should for numerical reasons try to avoid "extremely large" values of these coefficients (like $10^{10}$). It is better to start with "reasonably large" values and then, if it turns out that not all $y_i = 0$ in the optimal solution of (1.1), increase the corresponding values of $c_i$ by e.g. a factor 100 and solve the problem again, etc. If the functions and the variables have been scaled as described above, then "reasonably large" values of the parameters $c_i$ could be, say, $c_i = 1000$ or $10000$.

Finally, concerning the simple bound constraints $x_j^{\min} \le x_j \le x_j^{\max}$, it may sometimes be the case that some variables $x_j$ do not have any prescribed upper and/or lower bounds. In that case, it is in practice always possible to choose "artificial" bounds $x_j^{\min}$ and $x_j^{\max}$ such that every realistic solution $x$ satisfies the corresponding bound constraints. The user should then preferably avoid choosing $x_j^{\max} - x_j^{\min}$ unnecessarily large. It is better to try some reasonable bounds and then, if it turns out that some variable $x_j$ becomes equal to such an "artificial" bound in the optimal solution of (1.1), change this bound and solve the problem again (starting from the recently obtained solution), etc.

3. The ordinary MMA

MMA is a method for solving problems of the form (1.1), using the following approach: In each iteration, the current iteration point $(x^{(k)}, y^{(k)}, z^{(k)})$ is given.
Then an approximating subproblem, in which the functions $f_i(x)$ are replaced by certain convex functions $\tilde f_i^{(k)}(x)$, is generated. The choice of these approximating functions is based mainly on gradient information at the current iteration point, but also on some parameters $u_j^{(k)}$ and $l_j^{(k)}$ ("moving asymptotes") which are updated in each iteration based on information from previous iteration points. The subproblem is solved, and the unique optimal solution becomes the next iteration point $(x^{(k+1)}, y^{(k+1)}, z^{(k+1)})$. Then a new subproblem is generated, etc.
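The MMA subproblem looks as follows:

$$
\begin{array}{ll}
\text{minimize} & \tilde f_0^{(k)}(x) + a_0 z + \displaystyle\sum_{i=1}^{m} \left( c_i y_i + \tfrac{1}{2} d_i y_i^2 \right) \\[2mm]
\text{subject to} & \tilde f_i^{(k)}(x) - a_i z - y_i \le 0, \quad i = 1, \dots, m \\
& \alpha_j^{(k)} \le x_j \le \beta_j^{(k)}, \quad j = 1, \dots, n \\
& y_i \ge 0, \quad i = 1, \dots, m \\
& z \ge 0.
\end{array}
\tag{3.1}
$$

In this subproblem (3.1), the approximating functions $\tilde f_i^{(k)}(x)$ are chosen as

$$
\tilde f_i^{(k)}(x) = \sum_{j=1}^{n} \left( \frac{p_{ij}^{(k)}}{u_j^{(k)} - x_j} + \frac{q_{ij}^{(k)}}{x_j - l_j^{(k)}} \right) + r_i^{(k)}, \quad i = 0, 1, \dots, m,
\tag{3.2}
$$

where

$$
p_{ij}^{(k)} = \left( u_j^{(k)} - x_j^{(k)} \right)^2 \left( 1.001 \left( \frac{\partial f_i}{\partial x_j}(x^{(k)}) \right)^{+} + 0.001 \left( \frac{\partial f_i}{\partial x_j}(x^{(k)}) \right)^{-} + \frac{10^{-5}}{x_j^{\max} - x_j^{\min}} \right),
\tag{3.3}
$$

$$
q_{ij}^{(k)} = \left( x_j^{(k)} - l_j^{(k)} \right)^2 \left( 0.001 \left( \frac{\partial f_i}{\partial x_j}(x^{(k)}) \right)^{+} + 1.001 \left( \frac{\partial f_i}{\partial x_j}(x^{(k)}) \right)^{-} + \frac{10^{-5}}{x_j^{\max} - x_j^{\min}} \right),
\tag{3.4}
$$

$$
r_i^{(k)} = f_i(x^{(k)}) - \sum_{j=1}^{n} \left( \frac{p_{ij}^{(k)}}{u_j^{(k)} - x_j^{(k)}} + \frac{q_{ij}^{(k)}}{x_j^{(k)} - l_j^{(k)}} \right).
\tag{3.5}
$$

Here, $\left( \frac{\partial f_i}{\partial x_j}(x^{(k)}) \right)^{+}$ denotes the largest of the two numbers $\frac{\partial f_i}{\partial x_j}(x^{(k)})$ and $0$, while $\left( \frac{\partial f_i}{\partial x_j}(x^{(k)}) \right)^{-}$ denotes the largest of the two numbers $-\frac{\partial f_i}{\partial x_j}(x^{(k)})$ and $0$.

The bounds $\alpha_j^{(k)}$ and $\beta_j^{(k)}$ in (3.1) and (5.1) are chosen as

$$
\alpha_j^{(k)} = \max\left\{ x_j^{\min},\ l_j^{(k)} + 0.1 \left( x_j^{(k)} - l_j^{(k)} \right),\ x_j^{(k)} - 0.5 \left( x_j^{\max} - x_j^{\min} \right) \right\},
\tag{3.6}
$$

$$
\beta_j^{(k)} = \min\left\{ x_j^{\max},\ u_j^{(k)} - 0.1 \left( u_j^{(k)} - x_j^{(k)} \right),\ x_j^{(k)} + 0.5 \left( x_j^{\max} - x_j^{\min} \right) \right\},
\tag{3.7}
$$

which means that the constraints $\alpha_j^{(k)} \le x_j \le \beta_j^{(k)}$ are equivalent to the following three sets of constraints:

$$
x_j^{\min} \le x_j \le x_j^{\max},
\tag{3.8}
$$

$$
-0.9 \left( x_j^{(k)} - l_j^{(k)} \right) \le x_j - x_j^{(k)} \le 0.9 \left( u_j^{(k)} - x_j^{(k)} \right),
\tag{3.9}
$$

$$
-0.5 \left( x_j^{\max} - x_j^{\min} \right) \le x_j - x_j^{(k)} \le 0.5 \left( x_j^{\max} - x_j^{\min} \right).
\tag{3.10}
$$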
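To make (3.2)-(3.7) concrete, here is a small pure-Python sketch for a single function $f_i$. The function names are hypothetical; the keyword names `raa0`, `albefa` and `move` follow the parameter names mentioned later in this note. It is meant as an illustration under these assumptions, not as a reproduction of the author's implementation.

```python
def mma_coefficients(fval, grad, xk, low, upp, xmin, xmax, raa0=1e-5):
    """Return (p, q, r) of (3.3)-(3.5) for one function f_i at the point xk."""
    p, q = [], []
    r = fval
    for g, x, l, u, lo, hi in zip(grad, xk, low, upp, xmin, xmax):
        gpos = max(g, 0.0)     # largest of the derivative and 0
        gneg = max(-g, 0.0)    # largest of minus the derivative and 0
        reg = raa0 / (hi - lo)
        pj = (u - x) ** 2 * (1.001 * gpos + 0.001 * gneg + reg)
        qj = (x - l) ** 2 * (0.001 * gpos + 1.001 * gneg + reg)
        p.append(pj)
        q.append(qj)
        r -= pj / (u - x) + qj / (x - l)
    return p, q, r


def mma_value(x, p, q, r, low, upp):
    """Evaluate the approximation (3.2) at a point x with low < x < upp."""
    return r + sum(pj / (u - xj) + qj / (xj - l)
                   for pj, qj, xj, l, u in zip(p, q, x, low, upp))


def move_limits(xk, low, upp, xmin, xmax, albefa=0.1, move=0.5):
    """Return the bounds (alpha, beta) of (3.6)-(3.7)."""
    alpha = [max(lo, l + albefa * (x - l), x - move * (hi - lo))
             for x, l, lo, hi in zip(xk, low, xmin, xmax)]
    beta = [min(hi, u - albefa * (u - x), x + move * (hi - lo))
            for x, u, lo, hi in zip(xk, upp, xmin, xmax)]
    return alpha, beta
```

By construction of $r_i^{(k)}$ in (3.5), `mma_value(xk, p, q, r, low, upp)` reproduces `fval` up to rounding, which is a convenient sanity check when experimenting with the formulas.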
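The default rules for updating the lower asymptotes $l_j^{(k)}$ and the upper asymptotes $u_j^{(k)}$ are as follows. In the first two iterations, when $k = 1$ and $k = 2$,

$$
l_j^{(k)} = x_j^{(k)} - 0.5 \left( x_j^{\max} - x_j^{\min} \right), \qquad
u_j^{(k)} = x_j^{(k)} + 0.5 \left( x_j^{\max} - x_j^{\min} \right).
\tag{3.11}
$$

In later iterations, when $k \ge 3$,

$$
l_j^{(k)} = x_j^{(k)} - \gamma_j^{(k)} \left( x_j^{(k-1)} - l_j^{(k-1)} \right), \qquad
u_j^{(k)} = x_j^{(k)} + \gamma_j^{(k)} \left( u_j^{(k-1)} - x_j^{(k-1)} \right),
\tag{3.12}
$$

where

$$
\gamma_j^{(k)} =
\begin{cases}
0.7 & \text{if } \left( x_j^{(k)} - x_j^{(k-1)} \right) \left( x_j^{(k-1)} - x_j^{(k-2)} \right) < 0, \\
1.2 & \text{if } \left( x_j^{(k)} - x_j^{(k-1)} \right) \left( x_j^{(k-1)} - x_j^{(k-2)} \right) > 0, \\
1 & \text{if } \left( x_j^{(k)} - x_j^{(k-1)} \right) \left( x_j^{(k-1)} - x_j^{(k-2)} \right) = 0,
\end{cases}
\tag{3.13}
$$

provided that this leads to values that satisfy

$$
\begin{array}{l}
l_j^{(k)} \le x_j^{(k)} - 0.01 \left( x_j^{\max} - x_j^{\min} \right), \\
l_j^{(k)} \ge x_j^{(k)} - 10 \left( x_j^{\max} - x_j^{\min} \right), \\
u_j^{(k)} \ge x_j^{(k)} + 0.01 \left( x_j^{\max} - x_j^{\min} \right), \\
u_j^{(k)} \le x_j^{(k)} + 10 \left( x_j^{\max} - x_j^{\min} \right).
\end{array}
\tag{3.14}
$$

If any of these bounds is violated, the corresponding $l_j^{(k)}$ or $u_j^{(k)}$ is set to the right hand side of the violated inequality.

Note that most of the explicit numbers in the above expressions are just default values of different parameters in the Fortran code. More precisely:

The number $10^{-5}$ in (3.3) and (3.4) is the default value of the parameter raa0.
The number 0.1 in (3.6) and (3.7) is the default value of the parameter albefa.
The number 0.5 in (3.6) and (3.7) is the default value of the parameter move.
The number 0.5 in (3.11) is the default value of the parameter asyinit.
The number 0.7 in (3.13) is the default value of the parameter asydecr.
The number 1.2 in (3.13) is the default value of the parameter asyincr.

All these values can be carefully changed by the user. As an example, a more conservative method is obtained by decreasing move and/or asyinit and/or asyincr.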
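The asymptote rules (3.11)-(3.14) translate almost line by line into code. The sketch below is a minimal pure-Python version; the function name and argument layout are assumptions of this example, while the keyword defaults `asyinit`, `asydecr` and `asyincr` are the parameter names listed above.

```python
def update_asymptotes(k, x, x1, x2, low1, upp1, xmin, xmax,
                      asyinit=0.5, asydecr=0.7, asyincr=1.2):
    """Return (low, upp) for iteration k (1-based), following (3.11)-(3.14).

    x, x1, x2 are the iterates x^(k), x^(k-1), x^(k-2); low1 and upp1 are the
    previous asymptotes. For k <= 2 the history arguments are not used.
    """
    low, upp = [], []
    for j in range(len(x)):
        width = xmax[j] - xmin[j]
        if k <= 2:
            l = x[j] - asyinit * width
            u = x[j] + asyinit * width
        else:
            # Sign of the product tells whether variable j oscillates.
            sign = (x[j] - x1[j]) * (x1[j] - x2[j])
            gamma = asydecr if sign < 0 else asyincr if sign > 0 else 1.0
            l = x[j] - gamma * (x1[j] - low1[j])
            u = x[j] + gamma * (upp1[j] - x1[j])
            # Safeguards (3.14): move a violating value onto its bound.
            l = min(max(l, x[j] - 10.0 * width), x[j] - 0.01 * width)
            u = max(min(u, x[j] + 10.0 * width), x[j] + 0.01 * width)
        low.append(l)
        upp.append(u)
    return low, upp
```

An oscillating variable (sign change between consecutive steps) pulls its asymptotes closer by the factor 0.7, which makes the next approximation more curved and hence more cautious; a monotonically moving variable relaxes them by the factor 1.2.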
4. GCMMA - the globally convergent version of MMA

The globally convergent version of MMA, from now on called GCMMA, for solving problems of the form (1.1) consists of "outer" and "inner" iterations. The index $k$ is used to denote the outer iteration number, while the index $\nu$ is used to denote the inner iteration number. Within each outer iteration, there may be zero, one, or several inner iterations. The double index $(k, \nu)$ is used to denote the $\nu$:th inner iteration within the $k$:th outer iteration.

The first iteration point is obtained by first choosing $x^{(1)} \in X$ and then choosing $y^{(1)}$ and $z^{(1)}$ such that $(x^{(1)}, y^{(1)}, z^{(1)})$ becomes a feasible solution of (1.1). This is easy.

An outer iteration of the method, going from the $k$:th iteration point $(x^{(k)}, y^{(k)}, z^{(k)})$ to the $(k+1)$:th iteration point $(x^{(k+1)}, y^{(k+1)}, z^{(k+1)})$, can be described as follows:

Given $(x^{(k)}, y^{(k)}, z^{(k)})$, an approximating subproblem is generated and solved. In this subproblem, the functions $f_i(x)$ are replaced by certain convex functions $\tilde f_i^{(k,0)}(x)$. The optimal solution of this subproblem is denoted $(\hat x^{(k,0)}, \hat y^{(k,0)}, \hat z^{(k,0)})$. If $\tilde f_i^{(k,0)}(\hat x^{(k,0)}) \ge f_i(\hat x^{(k,0)})$ for all $i = 0, 1, \dots, m$, the next iteration point becomes $(x^{(k+1)}, y^{(k+1)}, z^{(k+1)}) = (\hat x^{(k,0)}, \hat y^{(k,0)}, \hat z^{(k,0)})$, and the outer iteration is completed (without any inner iterations needed).

Otherwise, an inner iteration is made, which means that a new subproblem is generated and solved at $x^{(k)}$, with new approximating functions $\tilde f_i^{(k,1)}(x)$ which are more conservative than $\tilde f_i^{(k,0)}(x)$ for those indices $i$ for which the above inequality was violated. (By this we mean that $\tilde f_i^{(k,1)}(x) > \tilde f_i^{(k,0)}(x)$ for all $x \in X^{(k)}$, except for $x = x^{(k)}$ where $\tilde f_i^{(k,1)}(x^{(k)}) = \tilde f_i^{(k,0)}(x^{(k)})$.) The optimal solution of this new subproblem is denoted $(\hat x^{(k,1)}, \hat y^{(k,1)}, \hat z^{(k,1)})$. If $\tilde f_i^{(k,1)}(\hat x^{(k,1)}) \ge f_i(\hat x^{(k,1)})$ for all $i = 0, 1, \dots, m$, the next iteration point becomes $(x^{(k+1)}, y^{(k+1)}, z^{(k+1)}) = (\hat x^{(k,1)}, \hat y^{(k,1)}, \hat z^{(k,1)})$, and the outer iteration is completed (with one inner iteration needed).

Otherwise, another inner iteration is made, which means that a new subproblem is generated and solved at $x^{(k)}$, with new approximating functions $\tilde f_i^{(k,2)}(x)$, etc.
These inner iterations are repeated until $\tilde f_i^{(k,\nu)}(\hat x^{(k,\nu)}) \ge f_i(\hat x^{(k,\nu)})$ for all $i = 0, 1, \dots, m$, which always happens after a finite (usually small) number of inner iterations. Then the next iteration point becomes $(x^{(k+1)}, y^{(k+1)}, z^{(k+1)}) = (\hat x^{(k,\nu)}, \hat y^{(k,\nu)}, \hat z^{(k,\nu)})$, and the outer iteration is completed (with $\nu$ inner iterations needed).

It should be noted that in each inner iteration, there is no need to recalculate the gradients $\nabla f_i(x^{(k)})$, since $x^{(k)}$ has not changed. Gradients of the original functions $f_i$ are calculated only once in each outer iteration. This is an important note since the calculation of gradients is typically the most time consuming part in structural optimization.
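The GCMMA subproblem looks as follows, for $k \in \{1, 2, 3, \dots\}$ and $\nu \in \{0, 1, 2, \dots\}$:

$$
\begin{array}{ll}
\text{minimize} & \tilde f_0^{(k,\nu)}(x) + a_0 z + \displaystyle\sum_{i=1}^{m} \left( c_i y_i + \tfrac{1}{2} d_i y_i^2 \right) \\[2mm]
\text{subject to} & \tilde f_i^{(k,\nu)}(x) - a_i z - y_i \le 0, \quad i = 1, \dots, m \\
& \alpha_j^{(k)} \le x_j \le \beta_j^{(k)}, \quad j = 1, \dots, n \\
& y_i \ge 0, \quad i = 1, \dots, m \\
& z \ge 0.
\end{array}
\tag{4.1}
$$

In this subproblem (4.1), the approximating functions $\tilde f_i^{(k,\nu)}(x)$ are chosen as

$$
\tilde f_i^{(k,\nu)}(x) = \sum_{j=1}^{n} \left( \frac{p_{ij}^{(k,\nu)}}{u_j^{(k)} - x_j} + \frac{q_{ij}^{(k,\nu)}}{x_j - l_j^{(k)}} \right) + r_i^{(k,\nu)}, \quad i = 0, 1, \dots, m,
\tag{4.2}
$$

where

$$
p_{ij}^{(k,\nu)} = \left( u_j^{(k)} - x_j^{(k)} \right)^2 \left( 1.001 \left( \frac{\partial f_i}{\partial x_j}(x^{(k)}) \right)^{+} + 0.001 \left( \frac{\partial f_i}{\partial x_j}(x^{(k)}) \right)^{-} + \frac{\rho_i^{(k,\nu)}}{x_j^{\max} - x_j^{\min}} \right),
\tag{4.3}
$$

$$
q_{ij}^{(k,\nu)} = \left( x_j^{(k)} - l_j^{(k)} \right)^2 \left( 0.001 \left( \frac{\partial f_i}{\partial x_j}(x^{(k)}) \right)^{+} + 1.001 \left( \frac{\partial f_i}{\partial x_j}(x^{(k)}) \right)^{-} + \frac{\rho_i^{(k,\nu)}}{x_j^{\max} - x_j^{\min}} \right),
\tag{4.4}
$$

$$
r_i^{(k,\nu)} = f_i(x^{(k)}) - \sum_{j=1}^{n} \left( \frac{p_{ij}^{(k,\nu)}}{u_j^{(k)} - x_j^{(k)}} + \frac{q_{ij}^{(k,\nu)}}{x_j^{(k)} - l_j^{(k)}} \right).
\tag{4.5}
$$

Note: If $\rho_i^{(k,\nu+1)} > \rho_i^{(k,\nu)}$ then $\tilde f_i^{(k,\nu+1)}(x)$ is a more conservative approximation than $\tilde f_i^{(k,\nu)}(x)$.

Between each outer iteration, the bounds $\alpha_j^{(k)}$ and $\beta_j^{(k)}$ and the asymptotes $l_j^{(k)}$ and $u_j^{(k)}$ are updated as in the original MMA; the formulas (3.6)-(3.14) still hold.

The parameters $\rho_i^{(k,\nu)}$ in (4.3) and (4.4) are strictly positive and updated as described below. Within a given outer iteration $k$, the only differences between two inner iterations are the values of some of these parameters. In the beginning of each outer iteration, when $\nu = 0$, the following default values are used:

$$
\rho_i^{(k,0)} = \frac{0.1}{n} \sum_{j=1}^{n} \left| \frac{\partial f_i}{\partial x_j}(x^{(k)}) \right| \left( x_j^{\max} - x_j^{\min} \right), \quad \text{for } i = 0, 1, \dots, m.
\tag{4.6}
$$

If any of the right hand sides in (4.6) is $< 10^{-6}$ then the corresponding $\rho_i^{(k,0)}$ is set to $10^{-6}$.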
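The default value (4.6), including the floor at $10^{-6}$, amounts to a one-line computation per function $f_i$. A minimal sketch follows; the function name and the `floor` keyword are assumptions of this example.

```python
def initial_rho(grad, xmin, xmax, floor=1e-6):
    """Return rho_i^(k,0) of (4.6) for one function f_i.

    grad holds the partial derivatives of f_i at x^(k); xmin and xmax are the
    variable bounds. Values below `floor` are raised to `floor`.
    """
    n = len(grad)
    total = sum(abs(g) * (hi - lo) for g, lo, hi in zip(grad, xmin, xmax))
    return max(0.1 * total / n, floor)
```

For a gradient `[2.0, -3.0]` with bounds `[0, 1]` and `[0, 2]`, the sum is `2*1 + 3*2 = 8`, so the result is `0.1 * 8 / 2 = 0.4`; a zero gradient yields the floor value.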
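In each new inner iteration, the updating of $\rho_i^{(k,\nu)}$ is based on the solution to the most recent subproblem. Note that $\tilde f_i^{(k,\nu)}(x)$ may be written in the form

$$
\tilde f_i^{(k,\nu)}(x) = h_i^{(k)}(x) + \rho_i^{(k,\nu)} d^{(k)}(x),
$$

where $h_i^{(k)}(x)$ and $d^{(k)}(x)$ do not depend on $\rho_i^{(k,\nu)}$. Some calculations give that

$$
d^{(k)}(x) = \sum_{j=1}^{n} \frac{\left( u_j^{(k)} - l_j^{(k)} \right) \left( x_j - x_j^{(k)} \right)^2}{\left( u_j^{(k)} - x_j \right) \left( x_j - l_j^{(k)} \right) \left( x_j^{\max} - x_j^{\min} \right)}.
\tag{4.7}
$$

Now, let

$$
\delta_i^{(k,\nu)} = \frac{f_i(\hat x^{(k,\nu)}) - \tilde f_i^{(k,\nu)}(\hat x^{(k,\nu)})}{d^{(k)}(\hat x^{(k,\nu)})}.
\tag{4.8}
$$

Then $h_i^{(k)}(\hat x^{(k,\nu)}) + \left( \rho_i^{(k,\nu)} + \delta_i^{(k,\nu)} \right) d^{(k)}(\hat x^{(k,\nu)}) = f_i(\hat x^{(k,\nu)})$, which shows that $\rho_i^{(k,\nu)} + \delta_i^{(k,\nu)}$ might be a natural value of $\rho_i^{(k,\nu+1)}$. In order to get a globally convergent method, this natural value is modified as follows.

$$
\begin{array}{ll}
\rho_i^{(k,\nu+1)} = \min\left\{ 1.1 \left( \rho_i^{(k,\nu)} + \delta_i^{(k,\nu)} \right),\ 10 \rho_i^{(k,\nu)} \right\} & \text{if } \delta_i^{(k,\nu)} > 0, \\[1mm]
\rho_i^{(k,\nu+1)} = \rho_i^{(k,\nu)} & \text{if } \delta_i^{(k,\nu)} \le 0.
\end{array}
\tag{4.9}
$$

It follows from the formulas (4.2)-(4.5) that the functions $\tilde f_i^{(k,\nu)}$ are always first order approximations of the original functions $f_i$ at the current iteration point, i.e.

$$
\tilde f_i^{(k,\nu)}(x^{(k)}) = f_i(x^{(k)}) \quad \text{and} \quad \frac{\partial \tilde f_i^{(k,\nu)}}{\partial x_j}(x^{(k)}) = \frac{\partial f_i}{\partial x_j}(x^{(k)}).
\tag{4.10}
$$

Since the parameters $\rho_i^{(k,\nu)}$ are always strictly positive, the functions $\tilde f_i^{(k,\nu)}$ are strictly convex. This implies that there is always a unique optimal solution of the GCMMA subproblem.

There are at least two approaches for solving the subproblems in MMA and in GCMMA, the "dual approach" and the "primal-dual interior-point approach". In the primal-dual interior-point approach, a sequence of relaxed KKT conditions are solved by Newton's method. We have implemented this approach in Matlab, since all the required calculations are most naturally carried out on a matrix and vector level. This approach for solving the subproblem is described next.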
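The update (4.7)-(4.9) of $\rho_i$ after a non-conservative subproblem solution can be sketched as follows. The function names are hypothetical, and the sketch assumes that $\hat x^{(k,\nu)} \ne x^{(k)}$, so that $d^{(k)}(\hat x^{(k,\nu)}) > 0$ and the division in (4.8) is well defined.

```python
def d_value(x, xk, low, upp, xmin, xmax):
    """Evaluate d^(k)(x) from (4.7) for low < x < upp."""
    return sum((u - l) * (xj - xkj) ** 2 / ((u - xj) * (xj - l) * (hi - lo))
               for xj, xkj, l, u, lo, hi in zip(x, xk, low, upp, xmin, xmax))


def update_rho(rho, f_hat, approx_hat, d_hat):
    """Apply (4.8)-(4.9) for one function f_i.

    f_hat is f_i at the subproblem solution, approx_hat is the value of the
    current approximation there, and d_hat > 0 is d^(k) at the same point.
    """
    delta = (f_hat - approx_hat) / d_hat
    if delta > 0:
        # Approximation was too optimistic: raise rho, but at most tenfold.
        return min(1.1 * (rho + delta), 10.0 * rho)
    return rho
```

The factor 1.1 makes the new approximation strictly larger than $f_i$ at the old subproblem solution, while the cap at $10\rho$ keeps a single inner iteration from making the approximation excessively conservative.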
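5. A primal-dual method for solving the subproblems in MMA and GCMMA

To simplify the notation, the iteration indices $k$ and $\nu$ are now removed in the subproblem. Further, we let $b_i = -r_i^{(k,\nu)}$, and drop the constant $r_0^{(k,\nu)}$ from the objective function. Then the MMA/GCMMA subproblem becomes

$$
\begin{array}{ll}
\text{minimize} & g_0(x) + a_0 z + \displaystyle\sum_{i=1}^{m} \left( c_i y_i + \tfrac{1}{2} d_i y_i^2 \right) \\[2mm]
\text{subject to} & g_i(x) - a_i z - y_i \le b_i, \quad i = 1, \dots, m \\
& \alpha_j \le x_j \le \beta_j, \quad j = 1, \dots, n \\
& z \ge 0, \quad y_i \ge 0, \quad i = 1, \dots, m
\end{array}
\tag{5.1}
$$

where

$$
g_i(x) = \sum_{j=1}^{n} \left( \frac{p_{ij}}{u_j - x_j} + \frac{q_{ij}}{x_j - l_j} \right), \quad i = 0, 1, \dots, m,
\tag{5.2}
$$

and $l_j < \alpha_j < \beta_j < u_j$ for all $j$.

5.1. Optimality conditions for the MMA/GCMMA subproblem

Since the subproblem (5.1) is a regular convex problem, the KKT optimality conditions are both necessary and sufficient for an optimal solution to (5.1). In order to state these conditions, we first form the Lagrange function corresponding to (5.1).

$$
L = g_0(x) + a_0 z + \sum_{i=1}^{m} \left( c_i y_i + \tfrac{1}{2} d_i y_i^2 \right) + \sum_{i=1}^{m} \lambda_i \left( g_i(x) - a_i z - y_i - b_i \right) + \sum_{j=1}^{n} \left( \xi_j (\alpha_j - x_j) + \eta_j (x_j - \beta_j) \right) - \sum_{i=1}^{m} \mu_i y_i - \zeta z,
\tag{5.3}
$$

where $\lambda = (\lambda_1, \dots, \lambda_m)^T$, $\xi = (\xi_1, \dots, \xi_n)^T$, $\eta = (\eta_1, \dots, \eta_n)^T$, $\mu = (\mu_1, \dots, \mu_m)^T$ and $\zeta$ are non-negative Lagrange multipliers for the different constraints in (5.1). Let

$$
\psi(x, \lambda) = g_0(x) + \sum_{i=1}^{m} \lambda_i g_i(x) = \sum_{j=1}^{n} \left( \frac{p_j(\lambda)}{u_j - x_j} + \frac{q_j(\lambda)}{x_j - l_j} \right),
\tag{5.4}
$$

where

$$
p_j(\lambda) = p_{0j} + \sum_{i=1}^{m} \lambda_i p_{ij} \quad \text{and} \quad q_j(\lambda) = q_{0j} + \sum_{i=1}^{m} \lambda_i q_{ij}.
\tag{5.5}
$$

Then the Lagrange function can be written

$$
L = \psi(x, \lambda) + (a_0 - \zeta) z + \sum_{j=1}^{n} \left( \xi_j (\alpha_j - x_j) + \eta_j (x_j - \beta_j) \right) + \sum_{i=1}^{m} \left( c_i y_i + \tfrac{1}{2} d_i y_i^2 - \lambda_i a_i z - \lambda_i y_i - \lambda_i b_i - \mu_i y_i \right),
\tag{5.6}
$$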
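and then the KKT optimality conditions for the subproblem (5.1) become as follows.

$$
\begin{array}{llll}
\dfrac{\partial \psi}{\partial x_j} - \xi_j + \eta_j = 0, & j = 1, \dots, n & (\partial L / \partial x_j = 0) & \text{(5.7a)} \\[2mm]
c_i + d_i y_i - \lambda_i - \mu_i = 0, & i = 1, \dots, m & (\partial L / \partial y_i = 0) & \text{(5.7b)} \\
a_0 - \zeta - \lambda^T a = 0, & & (\partial L / \partial z = 0) & \text{(5.7c)} \\
g_i(x) - a_i z - y_i - b_i \le 0, & i = 1, \dots, m & \text{(primal feasibility)} & \text{(5.7d)} \\
\lambda_i \left( g_i(x) - a_i z - y_i - b_i \right) = 0, & i = 1, \dots, m & \text{(compl. slackness)} & \text{(5.7e)} \\
\xi_j (\alpha_j - x_j) = 0, & j = 1, \dots, n & \text{(compl. slackness)} & \text{(5.7f)} \\
\eta_j (x_j - \beta_j) = 0, & j = 1, \dots, n & \text{(compl. slackness)} & \text{(5.7g)} \\
-\mu_i y_i = 0, & i = 1, \dots, m & \text{(compl. slackness)} & \text{(5.7h)} \\
-\zeta z = 0, & & \text{(compl. slackness)} & \text{(5.7i)} \\
\alpha_j \le x_j \le \beta_j, & j = 1, \dots, n & \text{(primal feasibility)} & \text{(5.7j)} \\
-z \le 0 \text{ and } -y_i \le 0, & i = 1, \dots, m & \text{(primal feasibility)} & \text{(5.7k)} \\
\xi_j \ge 0 \text{ and } \eta_j \ge 0, & j = 1, \dots, n & \text{(dual feasibility)} & \text{(5.7l)} \\
\zeta \ge 0 \text{ and } \mu_i \ge 0, & i = 1, \dots, m & \text{(dual feasibility)} & \text{(5.7m)} \\
\lambda_i \ge 0, & i = 1, \dots, m & \text{(dual feasibility)} & \text{(5.7n)}
\end{array}
$$

where

$$
\frac{\partial \psi}{\partial x_j} = \frac{p_j(\lambda)}{(u_j - x_j)^2} - \frac{q_j(\lambda)}{(x_j - l_j)^2} \quad \text{and} \quad \lambda^T a = \sum_{i=1}^{m} \lambda_i a_i.
\tag{5.8}
$$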
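The separable structure of $\psi$ in (5.4) makes the derivative formula (5.8) easy to verify numerically. The following pure-Python sketch uses hypothetical names; `P` and `Q` are assumed to be lists of rows, with row 0 holding $p_{0j}$, $q_{0j}$ and row $i$ holding $p_{ij}$, $q_{ij}$.

```python
def combine(M, lam, j):
    """Return M[0][j] + sum_i lam[i] * M[i+1][j], as in (5.5)."""
    return M[0][j] + sum(li * M[i + 1][j] for i, li in enumerate(lam))


def psi(x, lam, P, Q, low, upp):
    """Evaluate psi(x, lambda) of (5.4) for low < x < upp."""
    return sum(combine(P, lam, j) / (upp[j] - xj)
               + combine(Q, lam, j) / (xj - low[j])
               for j, xj in enumerate(x))


def dpsi_dx(x, lam, P, Q, low, upp):
    """Return the partial derivatives of psi with respect to x, following (5.8)."""
    return [combine(P, lam, j) / (upp[j] - xj) ** 2
            - combine(Q, lam, j) / (xj - low[j]) ** 2
            for j, xj in enumerate(x)]
```

Comparing `dpsi_dx` with a central finite difference of `psi` is a cheap way to catch sign errors when implementing the stationarity condition (5.7a).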
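5.2. The relaxed optimality conditions for the MMA/GCMMA subproblem

When a primal-dual interior-point method is used for solving the subproblem (5.1), the zeros in the right hand sides of the complementary slackness conditions (5.7e)-(5.7i) are replaced by the negative of a small parameter $\varepsilon > 0$. Further, slack variables $s_i$ are introduced for the constraints (5.7d). The relaxed optimality conditions then become as follows.

$$
\begin{array}{lll}
\dfrac{\partial \psi}{\partial x_j} - \xi_j + \eta_j = 0, & j = 1, \dots, n & \text{(5.9a)} \\[2mm]
c_i + d_i y_i - \lambda_i - \mu_i = 0, & i = 1, \dots, m & \text{(5.9b)} \\
a_0 - \zeta - \lambda^T a = 0, & & \text{(5.9c)} \\
g_i(x) - a_i z - y_i + s_i - b_i = 0, & i = 1, \dots, m & \text{(5.9d)} \\
\xi_j (x_j - \alpha_j) - \varepsilon = 0, & j = 1, \dots, n & \text{(5.9e)} \\
\eta_j (\beta_j - x_j) - \varepsilon = 0, & j = 1, \dots, n & \text{(5.9f)} \\
\mu_i y_i - \varepsilon = 0, & i = 1, \dots, m & \text{(5.9g)} \\
\zeta z - \varepsilon = 0, & & \text{(5.9h)} \\
\lambda_i s_i - \varepsilon = 0, & i = 1, \dots, m & \text{(5.9i)} \\
x_j - \alpha_j > 0 \text{ and } \xi_j > 0, & j = 1, \dots, n & \text{(5.9j)} \\
\beta_j - x_j > 0 \text{ and } \eta_j > 0, & j = 1, \dots, n & \text{(5.9k)} \\
y_i > 0 \text{ and } \mu_i > 0, & i = 1, \dots, m & \text{(5.9l)} \\
z > 0 \text{ and } \zeta > 0, & & \text{(5.9m)} \\
s_i > 0 \text{ and } \lambda_i > 0, & i = 1, \dots, m & \text{(5.9n)}
\end{array}
$$

For each fixed $\varepsilon > 0$, there exists a unique solution $(x, y, z, \lambda, \xi, \eta, \mu, \zeta, s)$ of these conditions. This follows because (5.9a)-(5.9n) are mathematically (but not numerically) equivalent to the KKT conditions of the following strictly convex problem in the variables $x$, $y$, $z$ and $s$.

$$
\begin{array}{ll}
\text{minimize} & g_0(x) + a_0 z + \displaystyle\sum_{i=1}^{m} \left( c_i y_i + \tfrac{1}{2} d_i y_i^2 \right) - \varepsilon \sum_{j=1}^{n} \log(x_j - \alpha_j) - \varepsilon \sum_{j=1}^{n} \log(\beta_j - x_j) \\[2mm]
& \quad - \varepsilon \displaystyle\sum_{i=1}^{m} \log y_i - \varepsilon \sum_{i=1}^{m} \log s_i - \varepsilon \log z \\[2mm]
\text{subject to} & g_i(x) - a_i z - y_i + s_i \le b_i, \quad i = 1, \dots, m \\
& (\alpha_j < x_j < \beta_j, \quad y_i > 0, \quad z > 0, \quad s_i > 0)
\end{array}
\tag{5.10}
$$

where the strict inequalities within parentheses will be satisfied automatically because of the logarithm terms in the objective function.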
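5.3. A Newton direction for the relaxed optimality conditions

Assume that a point $w = (x, y, z, \lambda, \xi, \eta, \mu, \zeta, s)$ which satisfies (5.9j)-(5.9n) is given, and assume that, starting from this point, Newton's method should be applied to (5.9a)-(5.9i). Then the following system of linear equations in $\Delta w = (\Delta x, \Delta y, \Delta z, \Delta\lambda, \Delta\xi, \Delta\eta, \Delta\mu, \Delta\zeta, \Delta s)$ should be generated and solved.

$$
\begin{pmatrix}
\Psi & & & G^T & -\langle e \rangle & \langle e \rangle & & & \\
& \langle d \rangle & & -\langle e \rangle & & & -\langle e \rangle & & \\
& & & -a^T & & & & -1 & \\
G & -\langle e \rangle & -a & & & & & & \langle e \rangle \\
\langle \xi \rangle & & & & \langle x - \alpha \rangle & & & & \\
-\langle \eta \rangle & & & & & \langle \beta - x \rangle & & & \\
& \langle \mu \rangle & & & & & \langle y \rangle & & \\
& & \zeta & & & & & z & \\
& & & \langle s \rangle & & & & & \langle \lambda \rangle
\end{pmatrix}
\begin{pmatrix}
\Delta x \\ \Delta y \\ \Delta z \\ \Delta\lambda \\ \Delta\xi \\ \Delta\eta \\ \Delta\mu \\ \Delta\zeta \\ \Delta s
\end{pmatrix}
= -
\begin{pmatrix}
\delta_x \\ \delta_y \\ \delta_z \\ \delta_\lambda \\ \delta_\xi \\ \delta_\eta \\ \delta_\mu \\ \delta_\zeta \\ \delta_s
\end{pmatrix}
$$

where $\delta_x, \dots, \delta_s$ are defined by the left hand sides in (5.9a)-(5.9i), $\Psi$ is an $n \times n$ diagonal matrix with

$$
(\Psi)_{jj} = \frac{\partial^2 \psi}{\partial x_j^2} = \frac{2 p_j(\lambda)}{(u_j - x_j)^3} + \frac{2 q_j(\lambda)}{(x_j - l_j)^3},
\tag{5.11}
$$

$G$ is an $m \times n$ matrix with

$$
(G)_{ij} = \frac{\partial g_i}{\partial x_j} = \frac{p_{ij}}{(u_j - x_j)^2} - \frac{q_{ij}}{(x_j - l_j)^2},
\tag{5.12}
$$

$\langle d \rangle$ is a diagonal matrix with the vector $d = (d_1, \dots, d_m)^T$ on the diagonal, $\langle x - \alpha \rangle$ is a diagonal matrix with the vector $x - \alpha$ on the diagonal, etc. Finally, $e$ is a vector $(1, \dots, 1)^T$, with dimension apparent from the context, so that $\langle e \rangle$ is a unit matrix, with dimension apparent from the context.

In the above Newton system, $\Delta\xi$, $\Delta\eta$, $\Delta\mu$, $\Delta\zeta$ and $\Delta s$ can be eliminated through

$$
\begin{array}{ll}
\Delta\xi = -\langle x - \alpha \rangle^{-1} \langle \xi \rangle \Delta x - \xi + \varepsilon \langle x - \alpha \rangle^{-1} e, & \text{(5.13a)} \\
\Delta\eta = \langle \beta - x \rangle^{-1} \langle \eta \rangle \Delta x - \eta + \varepsilon \langle \beta - x \rangle^{-1} e, & \text{(5.13b)} \\
\Delta\mu = -\langle y \rangle^{-1} \langle \mu \rangle \Delta y - \mu + \varepsilon \langle y \rangle^{-1} e, & \text{(5.13c)} \\
\Delta\zeta = -(\zeta / z) \Delta z - \zeta + \varepsilon / z, & \text{(5.13d)} \\
\Delta s = -\langle \lambda \rangle^{-1} \langle s \rangle \Delta\lambda - s + \varepsilon \langle \lambda \rangle^{-1} e. & \text{(5.13e)}
\end{array}
$$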
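Then the following system in $(\Delta x, \Delta y, \Delta z, \Delta\lambda)$ is obtained.

$$
\begin{pmatrix}
D_x & & & G^T \\
& D_y & & -\langle e \rangle \\
& & \zeta / z & -a^T \\
G & -\langle e \rangle & -a & -D_\lambda
\end{pmatrix}
\begin{pmatrix}
\Delta x \\ \Delta y \\ \Delta z \\ \Delta\lambda
\end{pmatrix}
= -
\begin{pmatrix}
\tilde\delta_x \\ \tilde\delta_y \\ \tilde\delta_z \\ \tilde\delta_\lambda
\end{pmatrix}
\tag{5.14}
$$

where

$$
\begin{array}{lll}
D_x = \Psi + \langle x - \alpha \rangle^{-1} \langle \xi \rangle + \langle \beta - x \rangle^{-1} \langle \eta \rangle, & \text{(a diagonal matrix)} & \text{(5.15a)} \\
D_y = \langle d \rangle + \langle y \rangle^{-1} \langle \mu \rangle, & \text{(a diagonal matrix)} & \text{(5.15b)} \\
D_\lambda = \langle \lambda \rangle^{-1} \langle s \rangle, & \text{(a diagonal matrix)} & \text{(5.15c)} \\
\tilde\delta_x = \dfrac{\partial \psi}{\partial x} - \varepsilon \langle x - \alpha \rangle^{-1} e + \varepsilon \langle \beta - x \rangle^{-1} e, & & \text{(5.15d)} \\[2mm]
\tilde\delta_y = c + \langle d \rangle y - \lambda - \varepsilon \langle y \rangle^{-1} e, & & \text{(5.15e)} \\
\tilde\delta_z = a_0 - \lambda^T a - \varepsilon / z, & \text{(a scalar)} & \text{(5.15f)} \\
\tilde\delta_\lambda = g(x) - a z - y - b + \varepsilon \langle \lambda \rangle^{-1} e. & & \text{(5.15g)}
\end{array}
$$

In (5.15d), $\frac{\partial \psi}{\partial x}$ is a vector with the $n$ components $\frac{\partial \psi}{\partial x_j} = \frac{p_j(\lambda)}{(u_j - x_j)^2} - \frac{q_j(\lambda)}{(x_j - l_j)^2}$.

Next, $\Delta y$ can be eliminated from the system (5.14) through

$$
\Delta y = D_y^{-1} \Delta\lambda - D_y^{-1} \tilde\delta_y.
\tag{5.16}
$$

Then the following system in $\Delta x$, $\Delta z$ and $\Delta\lambda$ is obtained.

$$
\begin{pmatrix}
D_x & & G^T \\
& \zeta / z & -a^T \\
G & -a & -D_{\lambda y}
\end{pmatrix}
\begin{pmatrix}
\Delta x \\ \Delta z \\ \Delta\lambda
\end{pmatrix}
= -
\begin{pmatrix}
\tilde\delta_x \\ \tilde\delta_z \\ \tilde\delta_{\lambda y}
\end{pmatrix}
\tag{5.17}
$$

where

$$
\begin{array}{lll}
D_{\lambda y} = D_\lambda + D_y^{-1}, & \text{(a diagonal matrix)} & \text{(5.18a)} \\
\tilde\delta_{\lambda y} = \tilde\delta_\lambda + D_y^{-1} \tilde\delta_y. & & \text{(5.18b)}
\end{array}
$$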
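In this system (5.17), either $\Delta x$ or $\Delta\lambda$ can be eliminated. If $\Delta x$ is eliminated through

$$
\Delta x = -D_x^{-1} G^T \Delta\lambda - D_x^{-1} \tilde\delta_x,
\tag{5.19}
$$

the following system in $\Delta\lambda$ and $\Delta z$ is obtained.

$$
\begin{pmatrix}
D_{\lambda y} + G D_x^{-1} G^T & a \\
a^T & -\zeta / z
\end{pmatrix}
\begin{pmatrix}
\Delta\lambda \\ \Delta z
\end{pmatrix}
=
\begin{pmatrix}
\tilde\delta_{\lambda y} - G D_x^{-1} \tilde\delta_x \\
\tilde\delta_z
\end{pmatrix}
\tag{5.20}
$$

If, instead, $\Delta\lambda$ is eliminated from the system (5.17) through

$$
\Delta\lambda = D_{\lambda y}^{-1} G \Delta x - D_{\lambda y}^{-1} a \Delta z + D_{\lambda y}^{-1} \tilde\delta_{\lambda y},
\tag{5.21}
$$

the following system in $\Delta x$ and $\Delta z$ is obtained.

$$
\begin{pmatrix}
D_x + G^T D_{\lambda y}^{-1} G & -G^T D_{\lambda y}^{-1} a \\
-a^T D_{\lambda y}^{-1} G & \zeta / z + a^T D_{\lambda y}^{-1} a
\end{pmatrix}
\begin{pmatrix}
\Delta x \\ \Delta z
\end{pmatrix}
=
\begin{pmatrix}
-\tilde\delta_x - G^T D_{\lambda y}^{-1} \tilde\delta_{\lambda y} \\
-\tilde\delta_z + a^T D_{\lambda y}^{-1} \tilde\delta_{\lambda y}
\end{pmatrix}
\tag{5.22}
$$

Note that the size of the system (5.20) is $(m+1) \times (m+1)$, while the size of the system (5.22) is $(n+1) \times (n+1)$. Therefore, typically, the system (5.20) should be preferred if $n > m$, while the system (5.22) should be preferred if $m > n$.

In both the systems (5.20) and (5.22), it is of course possible to eliminate also the variable $\Delta z$ (and reduce the systems to $m \times m$ and $n \times n$, respectively), but for numerical reasons it is not wise to do that. As an example, if $a$ is dense (many non-zeros) then the system obtained after eliminating $\Delta z$ will be dense, even if the original system (5.20) or (5.22) is sparse.

5.4. Line search in the Newton direction

When the Newton direction $\Delta w = (\Delta x, \Delta y, \Delta z, \Delta\lambda, \Delta\xi, \Delta\eta, \Delta\mu, \Delta\zeta, \Delta s)$ has been calculated in the given point $w = (x, y, z, \lambda, \xi, \eta, \mu, \zeta, s)$, a step should be taken in that direction. A pure Newton step would be to take a step equal to the calculated direction, but such a step might lead to a point where some of the inequalities (5.9j)-(5.9n) are violated. Therefore, we first let $t$ be the largest number such that

$t \le 1$,
$x_j + t \Delta x_j - \alpha_j \ge 0.01 (x_j - \alpha_j)$ for all $j$,
$\beta_j - (x_j + t \Delta x_j) \ge 0.01 (\beta_j - x_j)$ for all $j$, and
$(y, z, \lambda, \xi, \eta, \mu, \zeta, s) + t \cdot (\Delta y, \Delta z, \Delta\lambda, \Delta\xi, \Delta\eta, \Delta\mu, \Delta\zeta, \Delta s) \ge 0.01 \cdot (y, z, \lambda, \xi, \eta, \mu, \zeta, s)$.

Further, the new point should also be in some sense better than the previous. Therefore, we let $\tau$ be the largest of $t, t/2, t/4, t/8, \dots$ such that

$$
\| \delta(w + \tau \Delta w) \| < \| \delta(w) \|,
$$

where $\delta(w)$ is the residual vector defined by the left hand sides in the relaxed KKT conditions (5.9a)-(5.9i), and $\| \cdot \|$ is the ordinary Euclidean norm. This is always possible to obtain since the Newton direction is a descent direction for $\| \delta(w) \|$.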
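The two stages of the line search can be sketched in pure Python as below. The names are hypothetical. For `max_step`, the caller is assumed to stack all quantities that must stay positive into one list `v` (that is, $x - \alpha$, $\beta - x$, $y$, $z$, $\lambda$, $\xi$, $\eta$, $\mu$, $\zeta$, $s$) with the matching changes in `dv` (so $\Delta x$ for $x - \alpha$ and $-\Delta x$ for $\beta - x$). The `max_halvings` guard is an addition of this sketch; the text itself guarantees that a suitable $\tau$ exists.

```python
import math


def max_step(v, dv, keep=0.01):
    """Largest t <= 1 with v + t*dv >= keep*v, for a list v of positive numbers."""
    t = 1.0
    for vi, dvi in zip(v, dv):
        if dvi < 0:
            # v_i + t*dv_i >= keep*v_i  <=>  t <= (1 - keep) * v_i / (-dv_i)
            t = min(t, (1.0 - keep) * vi / (-dvi))
    return t


def backtrack(residual, w, dw, t, max_halvings=50):
    """Return the first tau in t, t/2, t/4, ... that reduces the residual norm.

    residual(w) must return the residual vector delta(w) as a list of floats.
    """
    def norm(vec):
        return math.sqrt(sum(c * c for c in vec))

    base = norm(residual(w))
    tau = t
    for _ in range(max_halvings):
        trial = [wi + tau * dwi for wi, dwi in zip(w, dw)]
        if norm(residual(trial)) < base:
            return tau
        tau *= 0.5
    raise RuntimeError("no residual decrease found within max_halvings")
```

As a small illustration, for `v = [1.0, 2.0]` and `dv = [-2.0, 1.0]` only the first component restricts the step, and `max_step` returns `0.99 * 1.0 / 2.0 = 0.495`, which leaves exactly one percent of the original distance to the boundary.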
5.5. The complete primal-dual algorithm

First of all, $\varepsilon^{(1)}$ and a starting point $w^{(1)} = (x^{(1)}, y^{(1)}, z^{(1)}, \lambda^{(1)}, \xi^{(1)}, \eta^{(1)}, \mu^{(1)}, \zeta^{(1)}, s^{(1)})$ which satisfies (5.9j)-(5.9n) are chosen. The following is a simple but reasonable choice.

$\varepsilon^{(1)} = 1$,
$x_j^{(1)} = \tfrac{1}{2} (\alpha_j + \beta_j)$,
$y_i^{(1)} = 1$, $z^{(1)} = 1$, $\zeta^{(1)} = 1$, $\lambda_i^{(1)} = 1$, $s_i^{(1)} = 1$,
$\xi_j^{(1)} = \max\{ 1,\ 1 / (x_j^{(1)} - \alpha_j) \}$,
$\eta_j^{(1)} = \max\{ 1,\ 1 / (\beta_j - x_j^{(1)}) \}$,
$\mu_i^{(1)} = \max\{ 1,\ c_i / 2 \}$.

A typical iteration, leading from the $l$:th iteration point $w^{(l)}$ to the $(l+1)$:th iteration point $w^{(l+1)}$, consists of the following steps.

Step 1: For given $\varepsilon^{(l)}$ and $w^{(l)}$ which satisfy (5.9j)-(5.9n), calculate $\Delta w^{(l)}$ as described in section 5.3 above.

Step 2: Calculate a steplength $\tau^{(l)}$ as described in section 5.4 above.

Step 3: Let $w^{(l+1)} = w^{(l)} + \tau^{(l)} \Delta w^{(l)}$.

Step 4: If $\| \delta(w^{(l+1)}) \|_\infty < 0.9\, \varepsilon^{(l)}$, let $\varepsilon^{(l+1)} = 0.1\, \varepsilon^{(l)}$. Otherwise, let $\varepsilon^{(l+1)} = \varepsilon^{(l)}$. Increase $l$ by 1 and go to Step 1.

The algorithm is terminated when $\varepsilon^{(l)}$ has become sufficiently small, say $\varepsilon^{(l)} \le 10^{-7}$, and $\| \delta(w^{(l+1)}) \|_\infty < 0.9\, \varepsilon^{(l)}$.

6. References

[1] K. Svanberg, The method of moving asymptotes - a new method for structural optimization, International Journal for Numerical Methods in Engineering, 1987, 24, 359-373.

[2] K. Svanberg, A class of globally convergent optimization methods based on conservative convex separable approximations, SIAM Journal on Optimization, 2002, 12, 555-573.