Some modelling aspects for the Matlab implementation of MMA

Krister Svanberg
krille@math.kth.se
Optimization and Systems Theory
Department of Mathematics
KTH, SE-10044 Stockholm
September 2004

1. Considered optimization problem.

The Matlab version of the author's MMA code is based on the assumption that the user's optimization problem is written in the following form, where the optimization variables are $x = (x_1, \ldots, x_n)^T$, $y = (y_1, \ldots, y_m)^T$ and $z$.

$$
\begin{aligned}
\text{minimize}\quad & f_0(x) + a_0 z + \sum_{i=1}^{m} \left( c_i y_i + \tfrac{1}{2} d_i y_i^2 \right) \\
\text{subject to}\quad & f_i(x) - a_i z - y_i \le 0, && i = 1, \ldots, m \\
& x_j^{\min} \le x_j \le x_j^{\max}, && j = 1, \ldots, n \\
& y_i \ge 0, && i = 1, \ldots, m \\
& z \ge 0.
\end{aligned}
\tag{1.1}
$$

Here, $x_1, \ldots, x_n$ are the true optimization variables, while $y_1, \ldots, y_m$ and $z$ are artificial optimization variables which will be motivated below.
$f_0, f_1, \ldots, f_m$ are given, continuously differentiable, real-valued functions.
$x_j^{\min}$ and $x_j^{\max}$ are given real numbers which satisfy $x_j^{\min} < x_j^{\max}$.
$a_0$ and $a_i$ are given real numbers which satisfy $a_0 > 0$ and $a_i \ge 0$.
$c_i$ and $d_i$ are given real numbers which satisfy $c_i \ge 0$, $d_i \ge 0$ and $c_i + d_i > 0$.

2. Ordinary NLP problems.

Assume that the user wants to solve a problem in the following standard form for nonlinear programming.

$$
\begin{aligned}
\text{minimize}\quad & f_0(x) \\
\text{subject to}\quad & f_i(x) \le 0, && i = 1, \ldots, m \\
& x_j^{\min} \le x_j \le x_j^{\max}, && j = 1, \ldots, n
\end{aligned}
\tag{2.1}
$$

To bring (1.1) into this form (2.1), first let $a_0 = 1$ and $a_i = 0$ for all $i$. Then $z = 0$ in any optimal solution of (1.1). Further, for each $i$, let $d_i = 0$ and $c_i$ = "a large number", so that the variables $y_i$ become "expensive". Then typically $y = 0$ in any optimal solution of (1.1), and the corresponding $x$ is an optimal solution of (2.1).
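The parameter choice just described can be illustrated with a minimal sketch. It is written in Python rather than Matlab and is not part of the author's code; the function names `nlp_to_mma_params` and `extended_objective` and the default `c_large = 1000.0` are assumptions made only for this illustration. The second function uses the fact that, for fixed $x$ with $a_i = 0$, the cheapest feasible choice in (1.1) is $z = 0$ and $y_i = \max\{0, f_i(x)\}$.

```python
def nlp_to_mma_params(m, c_large=1000.0):
    """Parameters of (1.1) for an ordinary NLP (2.1) with m constraints.

    c_large is an illustrative default, not a value prescribed by the note.
    """
    a0 = 1.0            # a_0 = 1
    a = [0.0] * m       # a_i = 0, so z = 0 in any optimal solution
    c = [c_large] * m   # c_i = "a large number"
    d = [0.0] * m       # d_i = 0
    return a0, a, c, d


def extended_objective(f0_val, f_vals, a0, a, c, d):
    """Objective of (1.1) at a fixed x, using z = 0 and the smallest feasible y.

    Assumes a_i = 0, so feasibility requires y_i >= f_i(x) and y_i >= 0.
    """
    y = [max(0.0, fi) for fi in f_vals]
    z = 0.0
    penalty = sum(ci * yi + 0.5 * di * yi * yi for ci, di, yi in zip(c, d, y))
    return f0_val + a0 * z + penalty, y


# Hypothetical function values at some x: f_0 = 3, f_1 = -1 (satisfied), f_2 = 0.5 (violated).
a0, a, c, d = nlp_to_mma_params(2)
value, y = extended_objective(3.0, [-1.0, 0.5], a0, a, c, d)
print(value, y)
```

A violated constraint shows up as a positive $y_i$ multiplied by the large coefficient $c_i$, which is why the $y_i$ are driven to zero whenever (2.1) has feasible solutions.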
It should be noted that the problem (1.1) always has feasible solutions, and in fact also at least one optimal solution. This holds even if the user's problem (2.1) does not have any feasible solutions, in which case some $y_i > 0$ in the optimal solution of (1.1). This is just one advantage of the formulation (1.1) compared to the formulation (2.1).

Now some practical considerations and recommendations.

In many applications, the constraints are of the form $\sigma_i(x) \le \sigma_i^{\max}$, where $\sigma_i(x)$ stands for e.g. a certain stress, while $\sigma_i^{\max}$ is the largest permitted value of this stress. This means that $f_i(x) = \sigma_i(x) - \sigma_i^{\max}$ (in (1.1) as well as in (2.1)). The user should then preferably scale the constraints in such a way that $1 \le \sigma_i^{\max} \le 100$ for each $i$ (and not $\sigma_i^{\max} = 10^{10}$).

The objective function $f_0(x)$ should preferably be scaled such that $1 \le f_0(x) \le 100$ for reasonable values of the variables.

The variables $x_j$ should preferably be scaled such that $0.1 \le x_j^{\max} - x_j^{\min} \le 100$, for all $j$.

Concerning the "large numbers" for the coefficients $c_i$ (mentioned above), the user should for numerical reasons try to avoid extremely large values of these coefficients (like $10^{10}$). It is better to start with "reasonably large" values and then, if it turns out that not all $y_i = 0$ in the optimal solution of (1.1), increase the corresponding values of $c_i$ by e.g. a factor 100 and solve the problem again, etc. If the functions and the variables have been scaled as described above, then "reasonably large" values of the parameters $c_i$ could be, say, $c_i = 1000$ or $10000$.

Finally, concerning the simple bound constraints $x_j^{\min} \le x_j \le x_j^{\max}$, it may sometimes be the case that some variables $x_j$ do not have any prescribed upper and/or lower bounds. In that case, it is in practice always possible to choose "artificial" bounds $x_j^{\min}$ and $x_j^{\max}$ such that every realistic solution $x$ satisfies the corresponding bound constraints. The user should then preferably avoid choosing $x_j^{\max} - x_j^{\min}$ unnecessarily large.
It is better to try some reasonable bounds and then, if it turns out that some variable $x_j$ becomes equal to such an artificial bound in the optimal solution of (1.1), change this bound and solve the problem again (starting from the recently obtained solution), etc.

3. Least squares problems. (Minimum 2-norm problems.)

Assume that the user wants to solve a constrained least squares problem of the form

$$
\begin{aligned}
\text{minimize}\quad & \sum_{i=1}^{p} \left( h_i(x) \right)^2 \\
\text{subject to}\quad & g_i(x) \le 0, && i = 1, \ldots, q \\
& x_j^{\min} \le x_j \le x_j^{\max}, && j = 1, \ldots, n
\end{aligned}
\tag{3.1}
$$

where $h_i$ and $g_i$ are given differentiable functions.
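The functions $f_i$ and the parameters $a_i$, $c_i$ and $d_i$ should then be chosen as follows in problem (1.1).

$$
\begin{aligned}
& m = 2p + q, \\
& f_0(x) = 0, \\
& f_i(x) = h_i(x), && i = 1, \ldots, p \\
& f_{p+i}(x) = -h_i(x), && i = 1, \ldots, p \\
& f_{2p+i}(x) = g_i(x), && i = 1, \ldots, q \\
& a_0 = 1, \\
& a_i = 0, && i = 1, \ldots, m \\
& d_i = 2, && i = 1, \ldots, 2p \\
& d_{2p+i} = 0, && i = 1, \ldots, q \\
& c_i = 0, && i = 1, \ldots, 2p \\
& c_{2p+i} = \text{large number}, && i = 1, \ldots, q
\end{aligned}
$$

With this choice, $y_i = \max\{0, h_i(x)\}$ and $y_{p+i} = \max\{0, -h_i(x)\}$ in any optimal solution, so that $\tfrac{1}{2} d_i y_i^2 + \tfrac{1}{2} d_{p+i} y_{p+i}^2 = (h_i(x))^2$.

4. Minimum 1-norm problems.

Assume that the user wants to solve a minimum 1-norm problem of the form

$$
\begin{aligned}
\text{minimize}\quad & \sum_{i=1}^{p} |h_i(x)| \\
\text{subject to}\quad & g_i(x) \le 0, && i = 1, \ldots, q \\
& x_j^{\min} \le x_j \le x_j^{\max}, && j = 1, \ldots, n
\end{aligned}
\tag{4.1}
$$

where $h_i$ and $g_i$ are given differentiable functions.

The functions $f_i$ and the parameters $a_i$, $c_i$ and $d_i$ should then be chosen as follows in problem (1.1).

$$
\begin{aligned}
& m = 2p + q, \\
& f_0(x) = 0, \\
& f_i(x) = h_i(x), && i = 1, \ldots, p \\
& f_{p+i}(x) = -h_i(x), && i = 1, \ldots, p \\
& f_{2p+i}(x) = g_i(x), && i = 1, \ldots, q \\
& a_0 = 1, \\
& a_i = 0, && i = 1, \ldots, m \\
& d_i = 0, && i = 1, \ldots, m \\
& c_i = 1, && i = 1, \ldots, 2p \\
& c_{2p+i} = \text{large number}, && i = 1, \ldots, q
\end{aligned}
$$

5. Minimax problems. (Minimum $\infty$-norm problems.)

Assume that the user wants to solve a minimax problem of the form

$$
\begin{aligned}
\text{minimize}\quad & \max_{i = 1, \ldots, p} \{ |h_i(x)| \} \\
\text{subject to}\quad & g_i(x) \le 0, && i = 1, \ldots, q \\
& x_j^{\min} \le x_j \le x_j^{\max}, && j = 1, \ldots, n
\end{aligned}
\tag{5.1}
$$

where $h_i$ and $g_i$ are given differentiable functions.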
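The functions $f_i$ and the parameters $a_i$, $c_i$ and $d_i$ should then be chosen as follows in problem (1.1).

$$
\begin{aligned}
& m = 2p + q, \\
& f_0(x) = 0, \\
& f_i(x) = h_i(x), && i = 1, \ldots, p \\
& f_{p+i}(x) = -h_i(x), && i = 1, \ldots, p \\
& f_{2p+i}(x) = g_i(x), && i = 1, \ldots, q \\
& a_0 = 1, \\
& a_i = 1, && i = 1, \ldots, 2p \\
& a_{2p+i} = 0, && i = 1, \ldots, q \\
& d_i = 0, && i = 1, \ldots, m \\
& c_i = \text{large number}, && i = 1, \ldots, m
\end{aligned}
$$

6. The MMA subproblem.

MMA is a method for solving problems of the form (1.1), using the following approach: In each iteration, the current iteration point $(x^{(k)}, y^{(k)}, z^{(k)})$ is given. Then an approximating explicit subproblem is generated. In this subproblem, the functions $f_i(x)$ are replaced by approximating convex functions $\tilde f_i^{(k)}(x)$. These approximations are based mainly on gradient information at the current iteration point, but also (implicitly) on information from previous iteration points. The subproblem is solved, and the unique optimal solution becomes the next iteration point $(x^{(k+1)}, y^{(k+1)}, z^{(k+1)})$. Then a new subproblem is generated, etc.

The subproblem mentioned above looks as follows.

$$
\begin{aligned}
\text{minimize}\quad & \tilde f_0^{(k)}(x) + a_0 z + \sum_{i=1}^{m} \left( c_i y_i + \tfrac{1}{2} d_i y_i^2 \right) \\
\text{subject to}\quad & \tilde f_i^{(k)}(x) - a_i z - y_i \le 0, && i = 1, \ldots, m \\
& \alpha_j^{(k)} \le x_j \le \beta_j^{(k)}, && j = 1, \ldots, n \\
& y_i \ge 0, && i = 1, \ldots, m \\
& z \ge 0.
\end{aligned}
\tag{6.1}
$$

The approximating functions $\tilde f_i^{(k)}(x)$ are chosen as

$$
\tilde f_i^{(k)}(x) = \sum_{j=1}^{n} \left( \frac{p_{ij}^{(k)}}{u_j^{(k)} - x_j} + \frac{q_{ij}^{(k)}}{x_j - l_j^{(k)}} \right) + r_i^{(k)}, \qquad i = 0, 1, \ldots, m,
$$

where

$$
\begin{aligned}
p_{ij}^{(k)} &= \left( u_j^{(k)} - x_j^{(k)} \right)^2 \left( \left( \frac{\partial f_i}{\partial x_j}(x^{(k)}) \right)^{+} + \kappa_{ij}^{(k)} \right), \\
q_{ij}^{(k)} &= \left( x_j^{(k)} - l_j^{(k)} \right)^2 \left( \left( \frac{\partial f_i}{\partial x_j}(x^{(k)}) \right)^{-} + \kappa_{ij}^{(k)} \right),
\end{aligned}
$$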
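$$
\begin{aligned}
r_i^{(k)} &= f_i(x^{(k)}) - \sum_{j=1}^{n} \left( \frac{p_{ij}^{(k)}}{u_j^{(k)} - x_j^{(k)}} + \frac{q_{ij}^{(k)}}{x_j^{(k)} - l_j^{(k)}} \right), \\
\alpha_j^{(k)} &= \max\{ x_j^{\min},\; 0.9\, l_j^{(k)} + 0.1\, x_j^{(k)} \}, \\
\beta_j^{(k)} &= \min\{ x_j^{\max},\; 0.9\, u_j^{(k)} + 0.1\, x_j^{(k)} \}.
\end{aligned}
$$

Here,

$$
\left( \frac{\partial f_i}{\partial x_j}(x^{(k)}) \right)^{+} = \max\left\{ 0,\; \frac{\partial f_i}{\partial x_j}(x^{(k)}) \right\}
\quad\text{and}\quad
\left( \frac{\partial f_i}{\partial x_j}(x^{(k)}) \right)^{-} = \max\left\{ 0,\; -\frac{\partial f_i}{\partial x_j}(x^{(k)}) \right\}.
$$

The default rules for updating the lower asymptotes $l_j^{(k)}$ and the upper asymptotes $u_j^{(k)}$ are as follows.

The first two iterations, when $k = 1$ and $k = 2$,

$$
\begin{aligned}
l_j^{(k)} &= x_j^{(k)} - 0.5\,(x_j^{\max} - x_j^{\min}), \\
u_j^{(k)} &= x_j^{(k)} + 0.5\,(x_j^{\max} - x_j^{\min}).
\end{aligned}
$$

In later iterations, when $k \ge 3$,

$$
\begin{aligned}
l_j^{(k)} &= x_j^{(k)} - \gamma_j^{(k)} \left( x_j^{(k-1)} - l_j^{(k-1)} \right), \\
u_j^{(k)} &= x_j^{(k)} + \gamma_j^{(k)} \left( u_j^{(k-1)} - x_j^{(k-1)} \right),
\end{aligned}
$$

where

$$
\gamma_j^{(k)} =
\begin{cases}
0.7 & \text{if } (x_j^{(k)} - x_j^{(k-1)})(x_j^{(k-1)} - x_j^{(k-2)}) < 0, \\
1.2 & \text{if } (x_j^{(k)} - x_j^{(k-1)})(x_j^{(k-1)} - x_j^{(k-2)}) > 0, \\
1 & \text{if } (x_j^{(k)} - x_j^{(k-1)})(x_j^{(k-1)} - x_j^{(k-2)}) = 0.
\end{cases}
$$

The default values of the parameters $\kappa_{ij}^{(k)}$ are

$$
\kappa_{ij}^{(k)} = 10^{-3} \left| \frac{\partial f_i}{\partial x_j}(x^{(k)}) \right| + \frac{10^{-6}}{u_j^{(k)} - l_j^{(k)}}, \qquad \text{for } i = 0, 1, \ldots, m \text{ and } j = 1, \ldots, n.
\tag{6.2}
$$

This implies that all the approximating functions are strictly convex, which in turn implies that there is always a unique optimal solution of the MMA subproblem.

Regardless of the values of the parameters $\kappa_{ij}^{(k)}$, the functions $\tilde f_i^{(k)}$ are always first order approximations of the original functions $f_i$ at the current iteration point, i.e.

$$
\tilde f_i^{(k)}(x^{(k)}) = f_i(x^{(k)}) \quad\text{and}\quad \frac{\partial \tilde f_i^{(k)}}{\partial x_j}(x^{(k)}) = \frac{\partial f_i}{\partial x_j}(x^{(k)}).
\tag{6.3}
$$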
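The formulas of this section can be checked numerically with a small sketch. It is written in Python with the standard library only, it is not the author's Matlab code, and all function names (`initial_asymptotes`, `updated_asymptotes`, `move_limits`, `mma_coefficients`, `mma_value`) as well as the test function $f(x) = x_1^2 - 3x_2$ are assumptions introduced for illustration. It builds the approximation of a single function $f_i$ at a point $x^{(k)}$ with the default $\kappa_{ij}^{(k)}$ of (6.2), so that property (6.3) can be verified by evaluation.

```python
def initial_asymptotes(xk, xmin, xmax):
    """Default asymptotes l_j, u_j for the first two iterations (k = 1, 2)."""
    low = [x - 0.5 * (hi - lo) for x, lo, hi in zip(xk, xmin, xmax)]
    upp = [x + 0.5 * (hi - lo) for x, lo, hi in zip(xk, xmin, xmax)]
    return low, upp


def updated_asymptotes(xk, xk1, xk2, low1, upp1):
    """Default asymptotes for k >= 3; xk1 and xk2 are x^(k-1) and x^(k-2)."""
    low, upp = [], []
    for x, x1, x2, l1, u1 in zip(xk, xk1, xk2, low1, upp1):
        sign = (x - x1) * (x1 - x2)
        gamma = 0.7 if sign < 0 else (1.2 if sign > 0 else 1.0)
        low.append(x - gamma * (x1 - l1))
        upp.append(x + gamma * (u1 - x1))
    return low, upp


def move_limits(xk, xmin, xmax, low, upp):
    """Bounds alpha_j, beta_j used in subproblem (6.1)."""
    alpha = [max(lo, 0.9 * l + 0.1 * x) for x, lo, l in zip(xk, xmin, low)]
    beta = [min(hi, 0.9 * u + 0.1 * x) for x, hi, u in zip(xk, xmax, upp)]
    return alpha, beta


def mma_coefficients(fval, grad, xk, low, upp):
    """Coefficients p_j, q_j and r of the approximation of one function f_i."""
    p, q, r = [], [], fval
    for g, x, l, u in zip(grad, xk, low, upp):
        kappa = 1e-3 * abs(g) + 1e-6 / (u - l)        # default value (6.2)
        pj = (u - x) ** 2 * (max(0.0, g) + kappa)     # uses the positive part of the derivative
        qj = (x - l) ** 2 * (max(0.0, -g) + kappa)    # uses the negative part of the derivative
        r -= pj / (u - x) + qj / (x - l)
        p.append(pj)
        q.append(qj)
    return p, q, r


def mma_value(x, p, q, r, low, upp):
    """Value of the approximating function at x, for l_j < x_j < u_j."""
    return r + sum(pj / (u - xj) + qj / (xj - l)
                   for pj, qj, xj, l, u in zip(p, q, x, low, upp))


# Hypothetical example: f(x) = x_1^2 - 3 x_2 at x^(k) = (1, 2) with bounds [0, 4].
xk, xmin, xmax = [1.0, 2.0], [0.0, 0.0], [4.0, 4.0]
fval, grad = xk[0] ** 2 - 3.0 * xk[1], [2.0 * xk[0], -3.0]
low, upp = initial_asymptotes(xk, xmin, xmax)
alpha, beta = move_limits(xk, xmin, xmax, low, upp)
p, q, r = mma_coefficients(fval, grad, xk, low, upp)
print(mma_value(xk, p, q, r, low, upp), fval)  # the two values should agree, cf. (6.3)
```

Evaluating `mma_value` slightly to either side of `xk` and forming a central difference gives a numerical check of the derivative condition in (6.3). Solving the subproblem (6.1) itself is outside the scope of this sketch.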