SF2822 Applied Nonlinear Optimization
Lecture 10: Interior methods
Anders Forsgren
SF2822 Applied Nonlinear Optimization, KTH, Lecture 10, 2017/2018

Preparatory question

1. Try to solve theory question 7.
Interior methods

The term interior methods is used as a common name for methods of penalty and barrier type for nonlinear optimization. We have previously considered interior methods for quadratic programming; we now consider general nonlinear programming.

Penalty and barrier methods in primal form date from the 1960s. They have some less desirable properties due to ill-conditioning. The methods were revived in 1984 for linear programming. Primal-dual interior methods are methods of the 1990s; they have better behavior.

For the sake of simplicity, we look at barrier and penalty methods separately.

Barrier function for general nonlinear problem

Consider an inequality-constrained problem

(P)   minimize $f(x)$ subject to $g(x) \geq 0$,

where $f, g \in C^2$, $g : \mathbb{R}^n \to \mathbb{R}^m$. We assume $\{x \in \mathbb{R}^n : g(x) > 0\} \neq \emptyset$ and require $g(x) > 0$ implicitly.

For a positive parameter $\mu$, form the logarithmic barrier function

$B_\mu(x) = f(x) - \mu \sum_{i=1}^m \ln g_i(x)$.

A necessary condition for a minimizer of $B_\mu(x)$ is $\nabla B_\mu(x) = 0$, where

$\nabla B_\mu(x) = \nabla f(x) - \mu \sum_{i=1}^m \frac{1}{g_i(x)} \nabla g_i(x) = \nabla f(x) - \mu A(x)^T G(x)^{-1} e$,

with $A(x)$ the Jacobian of $g$, $G(x) = \mathrm{diag}(g(x))$ and $e = (1\ 1\ \cdots\ 1)^T$.
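As a small illustration, the barrier function and its gradient can be evaluated as follows. This sketch is ours (the helper name `barrier` and its signature are not from the course); the example problem is the one used later in the lecture.

```python
import numpy as np

def barrier(x, f, grad_f, g, jac_g, mu):
    """Evaluate B_mu(x) = f(x) - mu * sum_i ln g_i(x) and its gradient
    grad B_mu(x) = grad f(x) - mu * A(x)^T G(x)^{-1} e  (helper name is ours)."""
    gx = g(x)
    if np.any(gx <= 0):
        raise ValueError("x is not strictly feasible: g(x) > 0 is required")
    B = f(x) - mu * np.sum(np.log(gx))
    gradB = grad_f(x) - mu * jac_g(x).T @ (1.0 / gx)
    return B, gradB

# Lecture example: min (1/2)||x||^2  s.t.  x1 + x2 + x3 - 3 >= 0
f = lambda x: 0.5 * x @ x
grad_f = lambda x: x
g = lambda x: np.array([x[0] + x[1] + x[2] - 3.0])
jac_g = lambda x: np.ones((1, 3))

B, gradB = barrier(np.array([2.0, 2.0, 2.0]), f, grad_f, g, jac_g, mu=0.1)
```

Note that the barrier term only makes sense on the strictly feasible set, which is why the helper rejects points with $g_i(x) \leq 0$.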
Barrier function for general nonlinear problem, cont.

If $x(\mu)$ is a local minimizer of $\min_{x : g(x) > 0} B_\mu(x)$, it holds that

$\nabla f(x(\mu)) - \mu A(x(\mu))^T G(x(\mu))^{-1} e = 0$.

Proposition. Let $x(\mu)$ be a local minimizer of $\min_{x : g(x) > 0} B_\mu(x)$. Under suitable conditions, it holds that

$\lim_{\mu \to 0} x(\mu) = x^*$ and $\lim_{\mu \to 0} \mu G(x(\mu))^{-1} e = \lambda^*$,

where $x^*$ is a local minimizer of (P) and $\lambda^*$ is the associated Lagrange multiplier vector.

Note! It holds that $g(x(\mu)) > 0$.

Barrier function for general nonlinear problem, cont.

Let $\lambda(\mu) = \mu G(x(\mu))^{-1} e$, i.e., $\lambda_i(\mu) = \mu / g_i(x(\mu))$, $i = 1, \dots, m$. Then

$\nabla B_\mu(x(\mu)) = 0 \iff \nabla f(x(\mu)) - A(x(\mu))^T \lambda(\mu) = 0$.

This means that $x(\mu)$ and $\lambda(\mu)$ solve the nonlinear equations

$\nabla f(x) - A(x)^T \lambda = 0$,
$\lambda_i - \mu / g_i(x) = 0$, $i = 1, \dots, m$,

where we in addition require $g(x) > 0$ and $\lambda > 0$. If the second block of equations is multiplied by $G(x)$ we obtain

$g_i(x) \lambda_i - \mu = 0$, $i = 1, \dots, m$,

a perturbation of the first-order necessary optimality conditions.
Barrier function, example

minimize $\frac{1}{2}(x_1^2 + x_2^2 + x_3^2)$ subject to $x_1 + x_2 + x_3 - 3 \geq 0$.

It holds that $x^* = (1\ 1\ 1)^T$ with $\lambda^* = 1$.

$B_\mu(x) = \frac{1}{2}(x_1^2 + x_2^2 + x_3^2) - \mu \ln(x_1 + x_2 + x_3 - 3)$.

$\nabla B_\mu(x) = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} - \frac{\mu}{x_1 + x_2 + x_3 - 3} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$.

$\nabla^2 B_\mu(x) \succ 0$ if $x_1 + x_2 + x_3 - 3 > 0$.
Barrier function, example, cont.

Since $\nabla^2 B_\mu(x) \succ 0$, $B_\mu$ has a unique minimizer for all $\mu > 0$. The requirement $\nabla B_\mu(x(\mu)) = 0$ gives

$x(\mu) = \left( \frac{1}{2} + \sqrt{\frac{1}{4} + \frac{\mu}{3}} \right) e$.

Hence, $\lambda(\mu) = \frac{\mu}{g(x(\mu))} = \frac{1}{2} + \sqrt{\frac{1}{4} + \frac{\mu}{3}}$.

Insertion gives $\lim_{\mu \to 0} x(\mu) = e = x^*$ and $\lim_{\mu \to 0} \lambda(\mu) = 1 = \lambda^*$.

Barrier function method

A barrier function method approximately finds $x(\mu)$, $\lambda(\mu)$ for decreasing values of $\mu$. A primal-dual method takes Newton iterations on the primal-dual nonlinear equations

$\nabla f(x) - A(x)^T \lambda = 0$,
$G(x) \lambda - \mu e = 0$.

The Newton step $(\Delta x, \Delta \lambda)$ is given by

$\begin{pmatrix} \nabla^2_{xx} L(x,\lambda) & -A(x)^T \\ \Lambda A(x) & G(x) \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta \lambda \end{pmatrix} = - \begin{pmatrix} \nabla f(x) - A(x)^T \lambda \\ G(x) \lambda - \mu e \end{pmatrix}$,

where $\Lambda = \mathrm{diag}(\lambda)$.

An iteration in a primal-dual barrier function method

An iteration in a primal-dual barrier function method takes the following form, given $\mu > 0$, $x$ such that $g(x) > 0$, and $\lambda > 0$.

1. Compute $\Delta x$, $\Delta \lambda$ from the Newton system above.
2. Choose a suitable steplength $\alpha$ such that $g(x + \alpha \Delta x) > 0$ and $\lambda + \alpha \Delta \lambda > 0$.
3. $x \leftarrow x + \alpha \Delta x$, $\lambda \leftarrow \lambda + \alpha \Delta \lambda$.
4. If $(x, \lambda)$ is sufficiently close to $(x(\mu), \lambda(\mu))$, reduce $\mu$.
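The primal-dual iteration can be sketched for the example problem. Everything below is our own scaffolding (the helper `solve_subproblem`, the simple halving steplength, and the starting point are not from the course); the system being solved is the primal-dual Newton system for min $\frac12\|x\|^2$ s.t. $x_1+x_2+x_3-3 \geq 0$.

```python
import numpy as np

def solve_subproblem(mu, x, lam, tol=1e-10, max_iter=50):
    """Newton iterations on the primal-dual equations for the example
    min (1/2)||x||^2  s.t.  x1 + x2 + x3 - 3 >= 0 (helper name is ours)."""
    n = 3
    A = np.ones((1, n))                      # constraint Jacobian
    for _ in range(max_iter):
        gx = np.array([x.sum() - 3.0])
        r = np.concatenate([x - A.T @ lam, gx * lam - mu])
        if np.linalg.norm(r) < tol:
            break
        # primal-dual system [H  -A^T; Lam A  G](dx, dlam) = -r, with H = I here
        K = np.block([[np.eye(n), -A.T],
                      [lam * A, np.diag(gx)]])
        d = np.linalg.solve(K, -r)
        dx, dlam = d[:n], d[n:]
        # steplength keeping g(x) > 0 and lambda > 0 (step 2 of the iteration)
        alpha = 1.0
        while (x + alpha * dx).sum() - 3.0 <= 0.0 or lam[0] + alpha * dlam[0] <= 0.0:
            alpha *= 0.5
        x, lam = x + alpha * dx, lam + alpha * dlam
    return x, lam

mu = 1e-8
x, lam = solve_subproblem(mu, np.array([2.0, 2.0, 2.0]), np.array([1.5]))
# converges toward the closed form x(mu) = (1/2 + sqrt(1/4 + mu/3)) e
```

A production code would use a fraction-to-boundary rule rather than plain halving, but the structure is the same.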
Barrier function method, cont.

Note that $\Delta x$ solves

$\left( \nabla^2_{xx} L(x,\lambda) + A(x)^T \Lambda G(x)^{-1} A(x) \right) \Delta x = - \left( \nabla f(x) - \mu A(x)^T G(x)^{-1} e \right)$.

If $\nabla^2_{xx} L(x,\lambda) + A(x)^T \Lambda G(x)^{-1} A(x) \succ 0$, then $\Delta x$ is a descent direction for $B_\mu$ at $x$. If $\nabla^2_{xx} L(x,\lambda) + A(x)^T \Lambda G(x)^{-1} A(x)$ is not positive definite, we may replace $\nabla^2_{xx} L(x,\lambda)$ by $B$ such that $B + A(x)^T \Lambda G(x)^{-1} A(x) \succ 0$, and solve

$\begin{pmatrix} B & -A(x)^T \\ \Lambda A(x) & G(x) \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta \lambda \end{pmatrix} = - \begin{pmatrix} \nabla f(x) - A(x)^T \lambda \\ G(x) \lambda - \mu e \end{pmatrix}$.

Barrier function method, cont.

The system of equations may be symmetrized as

$\begin{pmatrix} \nabla^2_{xx} L(x,\lambda) & A(x)^T \\ A(x) & -G(x) \Lambda^{-1} \end{pmatrix} \begin{pmatrix} \Delta x \\ -\Delta \lambda \end{pmatrix} = - \begin{pmatrix} \nabla f(x) - A(x)^T \lambda \\ g(x) - \mu \Lambda^{-1} e \end{pmatrix}$.

Such a symmetrization allows modification of $\nabla^2_{xx} L(x,\lambda)$ during the factorization. The matrix is here symmetric but becomes infinitely ill-conditioned as the solution is approached: typically $g_i(x)/\lambda_i \to 0$ or $g_i(x)/\lambda_i \to \infty$. The ill-conditioning is in general benign.
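The ill-conditioning needs ratios $g_i/\lambda_i$ of both kinds to show up. The following sketch is ours: it adds a second, inactive constraint $x_1 \geq -1$ to the lecture's example (so that one ratio tends to 0 and the other to infinity) and evaluates the condition number of the symmetrized matrix near the solution.

```python
import numpy as np

def symmetrized_kkt(mu):
    """Symmetric form [H, A^T; A, -G Lam^{-1}] near the solution of
    min (1/2)||x||^2 s.t. x1 + x2 + x3 >= 3 and x1 >= -1; the second
    (inactive) constraint is our own addition to the lecture's example."""
    H = np.eye(3)
    A = np.array([[1.0, 1.0, 1.0],     # active at x* = (1, 1, 1)
                  [1.0, 0.0, 0.0]])    # inactive: g_2(x*) = 2
    g = np.array([mu, 2.0])            # g_i(x) lambda_i ~ mu on the central path
    lam = np.array([1.0, mu / 2.0])
    return np.block([[H, A.T],
                     [A, -np.diag(g / lam)]])

conds = [np.linalg.cond(symmetrized_kkt(mu)) for mu in (1e-2, 1e-4, 1e-6)]
# g_1/lambda_1 -> 0 while g_2/lambda_2 -> infinity, so the condition
# number grows roughly like 1/mu as mu -> 0
```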
Interior properties

As the barrier method has been stated above, we assume that $\{x \in \mathbb{R}^n : g(x) > 0\} \neq \emptyset$, and in particular that the initial point satisfies $g(x) > 0$. An alternative is to rewrite (P) as

minimize $f(x)$ subject to $g(x) - s = 0$, $s \geq 0$.

We may then initially choose $s > 0$, but we do not initially require $g(x) - s = 0$. This is how the barrier method for (IQP) was described.

Interior properties, cont.

If the constraint $g(x) - s = 0$ is included as a nonlinear equation, the resulting primal-dual nonlinear equations may be written as

$\nabla f(x) - A(x)^T \lambda = 0$,
$g(x) - s = 0$,
$S \lambda - \mu e = 0$.

If $s$ is eliminated by letting $g(x) = s$, the original equations are obtained. During the iterations, we do not require $g(x) = s$, but only $s > 0$.
Interior properties, cont.

The Newton step $(\Delta x, \Delta s, \Delta \lambda)$ is given by

$\begin{pmatrix} \nabla^2_{xx} L(x,\lambda) & 0 & -A(x)^T \\ A(x) & -I & 0 \\ 0 & \Lambda & S \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta s \\ \Delta \lambda \end{pmatrix} = - \begin{pmatrix} \nabla f(x) - A(x)^T \lambda \\ g(x) - s \\ S \lambda - \mu e \end{pmatrix}$.

One may eliminate $\Delta s$ to obtain

$\begin{pmatrix} \nabla^2_{xx} L(x,\lambda) & -A(x)^T \\ \Lambda A(x) & S \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta \lambda \end{pmatrix} = - \begin{pmatrix} \nabla f(x) - A(x)^T \lambda \\ G(x) \lambda - \mu e \end{pmatrix}$,

which has the same structure as the original system for (P).

Treatment of equalities

Consider the problem

(P)   minimize $f(x)$ subject to $g_i(x) \geq 0$, $i \in \mathcal{I}$, and $g_i(x) = 0$, $i \in \mathcal{E}$, $x \in \mathbb{R}^n$,

where $f, g \in C^2$, $g : \mathbb{R}^n \to \mathbb{R}^m$. The inequality constraints $g_i(x) \geq 0$, $i \in \mathcal{I}$, may be dealt with using a barrier transformation, as outlined above.
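The elimination of $\Delta s$ can be checked numerically; everything below (names, dimensions, the synthetic data) is our own scaffolding, and the check uses $S\lambda = \Lambda s$ so that the condensed right-hand side becomes $G(x)\lambda - \mu e$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2
H = np.eye(n) + 0.1 * np.ones((n, n))   # stand-in for the Hessian of the Lagrangian
A = rng.standard_normal((m, n))         # constraint Jacobian A(x)
grad_f = rng.standard_normal(n)
g = rng.uniform(0.5, 2.0, m)            # current g(x); need not equal s yet
s = rng.uniform(0.5, 2.0, m)
lam = rng.uniform(0.5, 2.0, m)
mu = 0.1
S, Lam = np.diag(s), np.diag(lam)

# full Newton system in (dx, ds, dlam)
K_full = np.block([[H, np.zeros((n, m)), -A.T],
                   [A, -np.eye(m), np.zeros((m, m))],
                   [np.zeros((m, n)), Lam, S]])
rhs_full = -np.concatenate([grad_f - A.T @ lam, g - s, s * lam - mu])
d = np.linalg.solve(K_full, rhs_full)
dx_full, dlam_full = d[:n], d[n + m:]

# condensed system after eliminating ds = A dx + (g - s)
K_c = np.block([[H, -A.T], [Lam @ A, S]])
rhs_c = -np.concatenate([grad_f - A.T @ lam, g * lam - mu])
dc = np.linalg.solve(K_c, rhs_c)
# dx and dlam agree with the full system
```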
Treatment of equalities, cont.

An equality constraint $g_i(x) = 0$ may either be left "as is", which means approximate solution of

minimize $f(x) - \mu \sum_{i \in \mathcal{I}} \ln g_i(x)$ subject to $g_i(x) = 0$, $i \in \mathcal{E}$,

or be handled by a quadratic penalty transformation, which means approximate solution of

minimize $f(x) - \mu \sum_{i \in \mathcal{I}} \ln g_i(x) + \frac{1}{2\mu} \sum_{i \in \mathcal{E}} g_i(x)^2$.

A penalty transformation is "similar" to a barrier transformation in that it can be viewed as a perturbation of the optimality conditions in a primal-dual setting.

Treatment of equalities, cont.

A barrier transformation gives the nonlinear equations

$g_i(x) \lambda_i - \mu = 0$, $i \in \mathcal{I}$,   $g_i(x) = 0$, $i \in \mathcal{E}$,

whereas a penalty-barrier transformation gives

$g_i(x) \lambda_i - \mu = 0$, $i \in \mathcal{I}$,   $g_i(x) + \mu \lambda_i = 0$, $i \in \mathcal{E}$.

The linear algebraic procedures are similar. The methods are referred to as interior methods, since they stay interior with respect to the inequalities.
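The penalty-barrier system above is just a residual to drive to zero; a minimal sketch of evaluating it (all names and the index-set encoding are ours) makes the structure concrete. At $\mu = 0$ and a KKT point the residual vanishes.

```python
import numpy as np

def perturbed_residual(x, lam, g, jac_g, grad_f, ineq, eq, mu):
    """Residual of the perturbed optimality conditions: barrier on the
    inequalities, quadratic penalty on the equalities (names are ours):
        grad f(x) - A(x)^T lam             = 0
        g_i(x) lam_i - mu    (i in ineq)   = 0
        g_i(x) + mu lam_i    (i in eq)     = 0
    """
    gx, A = g(x), jac_g(x)
    r_dual = grad_f(x) - A.T @ lam
    r_ineq = gx[ineq] * lam[ineq] - mu
    r_eq = gx[eq] + mu * lam[eq]
    return np.concatenate([r_dual, r_ineq, r_eq])
```

For instance, for min $\frac12\|x\|^2$ with $x_1 + x_2 = 2$ (equality) and $x_3 \geq -1$ (inactive inequality), the KKT point is $x^* = (1, 1, 0)$, $\lambda^* = (1, 0)$, and the residual at $\mu = 0$ is zero.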
Penalty function

Consider an equality-constrained nonlinear program

(P=)   minimize $f(x)$ subject to $g(x) = 0$,

where $f, g \in C^2$, $g : \mathbb{R}^n \to \mathbb{R}^m$. For a positive parameter $\mu$, form the quadratic penalty function

$P_\mu(x) = f(x) + \frac{1}{2\mu} g(x)^T g(x)$.

A necessary condition for a local minimizer of $P_\mu(x)$ is $\nabla P_\mu(x) = 0$, where

$\nabla P_\mu(x) = \nabla f(x) + \frac{1}{\mu} A(x)^T g(x)$,

with $A(x)^T = (\nabla g_1(x)\ \cdots\ \nabla g_m(x))$.

Penalty function, cont.

If $x(\mu)$ is a local minimizer of $\min P_\mu(x)$, then $\nabla f(x(\mu)) + \frac{1}{\mu} A(x(\mu))^T g(x(\mu)) = 0$.

Proposition. Let $x(\mu)$ be a local minimizer of $\min P_\mu(x)$. Under suitable conditions, it holds that

$\lim_{\mu \to 0} x(\mu) = x^*$ and $\lim_{\mu \to 0} \left( -\frac{g(x(\mu))}{\mu} \right) = \lambda^*$,

where $x^*$ is a local minimizer of (P=) and $\lambda^*$ is the corresponding Lagrange multiplier vector.
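A small sketch of evaluating $P_\mu$ and its gradient (the helper name and signature are ours), using the equality-constrained example from this lecture, $-x_1x_2 - x_1x_3 - x_2x_3$ with $x_1 + x_2 + x_3 = 3$:

```python
import numpy as np

def quadratic_penalty(x, f, grad_f, g, jac_g, mu):
    """P_mu(x) = f(x) + (1/(2 mu)) g(x)^T g(x) and its gradient
    grad P_mu(x) = grad f(x) + (1/mu) A(x)^T g(x)  (helper name is ours)."""
    gx = g(x)
    P = f(x) + gx @ gx / (2.0 * mu)
    gradP = grad_f(x) + jac_g(x).T @ gx / mu
    return P, gradP

# lecture example: min -x1 x2 - x1 x3 - x2 x3  s.t.  x1 + x2 + x3 - 3 = 0
f = lambda x: -(x[0] * x[1] + x[0] * x[2] + x[1] * x[2])
grad_f = lambda x: -np.array([x[1] + x[2], x[0] + x[2], x[0] + x[1]])
g = lambda x: np.array([x[0] + x[1] + x[2] - 3.0])
jac_g = lambda x: np.ones((1, 3))

P, gradP = quadratic_penalty(np.array([2.0, 2.0, 2.0]), f, grad_f, g, jac_g, mu=0.5)
```

Unlike the barrier function, $P_\mu$ is defined everywhere; no strict feasibility is required.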
Penalty function, cont.

Let $\lambda(\mu) = -\frac{g(x(\mu))}{\mu}$. Then

$\nabla P_\mu(x(\mu)) = 0 \iff \nabla f(x(\mu)) - A(x(\mu))^T \lambda(\mu) = 0$.

This means that $x(\mu)$ and $\lambda(\mu)$ solve the system of equations

$\nabla f(x) - A(x)^T \lambda = 0$,
$\frac{g(x)}{\mu} + \lambda = 0$.

If the second block of equations is multiplied by $\mu$ we obtain the equivalent system

$\nabla f(x) - A(x)^T \lambda = 0$,
$g(x) + \mu \lambda = 0$.

This is a perturbation of the first-order necessary optimality conditions, which result when $\mu = 0$.

Penalty function example

minimize $-x_1 x_2 - x_1 x_3 - x_2 x_3$ subject to $x_1 + x_2 + x_3 - 3 = 0$.

It holds that $x^* = (1\ 1\ 1)^T$ with $\lambda^* = -2$.

$P_\mu(x) = -x_1 x_2 - x_1 x_3 - x_2 x_3 + \frac{1}{2\mu}(x_1 + x_2 + x_3 - 3)^2$.

$\nabla P_\mu(x) = - \begin{pmatrix} x_2 + x_3 \\ x_1 + x_3 \\ x_1 + x_2 \end{pmatrix} + \frac{1}{\mu}(x_1 + x_2 + x_3 - 3) \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$.

$\nabla^2 P_\mu(x) = \left( \frac{1}{\mu} - 1 \right) e e^T + I$, with $e = (1\ 1\ 1)^T$.
Penalty function example, cont.

We obtain $\nabla^2 P_\mu(x) \succ 0$ if $\mu < 3/2$. Hence, $P_\mu$ has a unique minimizer if $\mu < 3/2$; $P_\mu$ lacks a minimizer for $\mu > 3/2$. The requirement $\nabla P_\mu(x(\mu)) = 0$ gives

$x(\mu) = \frac{3}{3 - 2\mu} e$.

Hence, $\lambda(\mu) = -\frac{g(x(\mu))}{\mu} = -\frac{6}{3 - 2\mu}$.

Insertion gives $\lim_{\mu \to 0} x(\mu) = e = x^*$ and $\lim_{\mu \to 0} \lambda(\mu) = -2 = \lambda^*$.

Interior methods

Interior methods are often efficient on convex problems. Then, typically, no penalty transformation is put on the equality constraints, since they are linear. For linear programming and convex quadratic programming, the number of iterations is typically of the order 10-20. Each iteration becomes more costly as the problem size increases. For nonconvex problems the situation is significantly more complicated; we do not discuss this in detail here. It is unknown how to "warm start" interior methods efficiently.
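Returning to the penalty example: since $P_\mu$ is quadratic there, its minimizer can be computed by solving one linear system, which lets us check the closed form numerically. The sketch below is ours.

```python
import numpy as np

def penalty_minimizer(mu):
    """Minimizer of P_mu for the example min -x1 x2 - x1 x3 - x2 x3 with
    x1 + x2 + x3 = 3: solve grad P_mu(x) = 0 directly, which is valid
    only for mu < 3/2, where the Hessian is positive definite."""
    e = np.ones(3)
    H = (1.0 / mu - 1.0) * np.outer(e, e) + np.eye(3)   # Hessian of P_mu
    rhs = 3.0 / mu * e                                  # -grad P_mu(0)
    return np.linalg.solve(H, rhs)

for mu in (0.5, 0.1, 0.01):
    x = penalty_minimizer(mu)
    lam = -(x.sum() - 3.0) / mu      # lambda(mu) = -g(x(mu))/mu
    # x -> (1, 1, 1) and lam -> -2 as mu -> 0
```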