Algorithms for nonlinear programming problems II

Algorithms for nonlinear programming problems II

Martin Branda
Charles University in Prague, Faculty of Mathematics and Physics, Department of Probability and Mathematical Statistics
Computational Aspects of Optimization
03-05-2016

Algorithm convergence

Definition. Let $X \subset \mathbb{R}^p$, $Y \subset \mathbb{R}^q$ be nonempty closed sets and let $F: X \rightrightarrows Y$ be a set-valued mapping. The map $F$ is said to be closed at $x \in X$ if for any sequences $\{x_k\} \subset X$ and $\{y_k\}$ satisfying $x_k \to x$, $y_k \in F(x_k)$, $y_k \to y$, we have $y \in F(x)$. The map $F$ is said to be closed on $Z \subset X$ if it is closed at each point of $Z$.

Algorithm convergence: Zangwill's theorem

Bazaraa et al. (2006), Theorem 7.2.3. Assume that

A1. $X \subset \mathbb{R}^p$ is a nonempty closed set,
A2. $\hat{X} \subset X$ is a nonempty solution set,
A3. $F: X \rightrightarrows X$ is a set-valued mapping that is closed over the complement of $\hat{X}$,
A4. given $x_1 \in X$, the sequence $\{x_k\}$ is generated iteratively as follows: if $x_k \in \hat{X}$, then STOP; otherwise, let $x_{k+1} \in F(x_k)$ and repeat,
A5. the sequence $x_1, x_2, \ldots$ is contained in a compact subset of $X$,
A6. there exists a continuous descent function $\alpha$ (e.g. $\alpha(x) = f(x)$ or $\alpha(x) = \|\nabla f(x)\|$) such that $\alpha(y) < \alpha(x)$ whenever $x \notin \hat{X}$ and $y \in F(x)$.

Then either the algorithm stops in a finite number of steps with a point in $\hat{X}$, or it generates an infinite sequence $\{x_k\}$ such that all accumulation points belong to $\hat{X}$ and $\alpha(x_k) \to \alpha(x)$ for some $x \in \hat{X}$.

Algorithm convergence: Newton's method

Let $\bar{x}$ be an optimal solution and set
$$F(x) = x - \left(\nabla^2 f(x)\right)^{-1} \nabla f(x), \qquad \alpha(x) = \|x - \bar{x}\|.$$
More details: Bazaraa et al. (2006), Theorem 8.6.5.
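To make the algorithmic-map view concrete, here is a minimal Python sketch that iterates the map $F(x) = x - (\nabla^2 f(x))^{-1} \nabla f(x)$; the test function and starting point are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def newton_map(grad, hess, x):
    # One application of the algorithmic map F(x) = x - (hess f(x))^{-1} grad f(x)
    return x - np.linalg.solve(hess(x), grad(x))

# Assumed strictly convex test function f(x) = x1^2 + exp(x1) + x2^2
grad = lambda x: np.array([2 * x[0] + np.exp(x[0]), 2 * x[1]])
hess = lambda x: np.array([[2 + np.exp(x[0]), 0.0], [0.0, 2.0]])

x = np.array([2.0, 1.0])
for _ in range(10):
    x = newton_map(grad, hess, x)
print(x)  # approaches the unique minimizer, approximately (-0.352, 0.0)
```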

Algorithm classification

Order of derivatives used: derivative-free, first-order (gradient), second-order (Newton); if possible, deliver the derivatives.
Feasibility of the constructed points: interior and exterior point methods.
Deterministic/randomized.
Local/global.

Barrier method

Let $f: \mathbb{R}^n \to \mathbb{R}$, $g_j: \mathbb{R}^n \to \mathbb{R}$, and consider
$$\min_x f(x) \quad \text{s.t.} \quad g_j(x) \le 0, \; j = 1, \ldots, m.$$
(The extension including equality constraints is straightforward.) Assumption: $\{x : g(x) < 0\} \neq \emptyset$.

Barrier method

Barrier functions are continuous functions with the following properties:
$$B(y) \ge 0 \text{ for } y < 0, \qquad \lim_{y \to 0^-} B(y) = \infty.$$
Examples: $-\frac{1}{y}$, $-\log \min\{1, -y\}$. Maybe the most popular barrier function is
$$B(y) = -\log(-y).$$
Set
$$\mathcal{B}(x) = \sum_{j=1}^m B\left(g_j(x)\right)$$
and solve
$$\min_x f(x) + \mu \mathcal{B}(x), \tag{1}$$
where $\mu > 0$ is a parameter.
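A minimal sketch of scheme (1) in Python, assuming the log-barrier $B(y) = -\log(-y)$, an off-the-shelf unconstrained solver from SciPy, and a small illustrative problem of our own choosing (the problem, the starting point, and the $\mu$-reduction factor are assumptions):

```python
import numpy as np
from scipy.optimize import minimize

# Assumed illustrative problem: min (x1-2)^2 + (x2-1)^2  s.t.  g(x) = x1 + x2 - 1 <= 0
f = lambda x: (x[0] - 2) ** 2 + (x[1] - 1) ** 2
g = [lambda x: x[0] + x[1] - 1]

def barrier_objective(x, mu):
    # f(x) + mu * sum_j B(g_j(x)) with B(y) = -log(-y); +inf outside the strict interior
    vals = np.array([gj(x) for gj in g])
    if np.any(vals >= 0):
        return np.inf
    return f(x) + mu * np.sum(-np.log(-vals))

x = np.array([0.0, 0.0])   # strictly feasible starting point
mu = 1.0
for _ in range(30):
    x = minimize(lambda y: barrier_objective(y, mu), x, method="Nelder-Mead").x
    mu *= 0.5              # drive the barrier parameter towards zero
print(x)  # approaches the constrained minimizer, approximately (1, 0)
```

As $\mu \to 0$, the unconstrained minimizers of (1) approach the solution of the constrained problem while staying in the strict interior of the feasible set.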

Polynomial-time interior point methods for LP

$$\min_x \; c^T x - \mu \sum_{j=1}^n \log x_j \quad \text{s.t.} \quad Ax = b. \tag{2}$$

KKT conditions...
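The slide stops at the KKT conditions; written out (a standard derivation stated here for completeness, not copied from the slides), with $X = \operatorname{diag}\{x_1, \ldots, x_n\}$, $Z = \operatorname{diag}\{z_1, \ldots, z_n\}$, $e = (1, \ldots, 1)^T$ and dual variables $y$ for $Ax = b$, stationarity of (2) gives $c - \mu X^{-1} e - A^T y = 0$. Introducing $z := \mu X^{-1} e$ yields the central-path system
$$A^T y + z = c, \qquad Ax = b, \qquad XZe = \mu e, \qquad x > 0, \; z > 0,$$
which reduces to the LP optimality conditions as $\mu \to 0$.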

Interior point methods

Let $f: \mathbb{R}^n \to \mathbb{R}$, $g_j: \mathbb{R}^n \to \mathbb{R}$, and consider
$$\min_x f(x) \quad \text{s.t.} \quad g_j(x) \le 0, \; j = 1, \ldots, m.$$
(The extension including equality constraints is straightforward.) Introduce new slack decision variables $s \in \mathbb{R}^m$:
$$\min_{x,s} f(x) \quad \text{s.t.} \quad g_j(x) + s_j = 0, \; j = 1, \ldots, m, \quad s_j \ge 0. \tag{3}$$

Interior point methods

Barrier problem:
$$\min_{x,s} f(x) - \mu \sum_{j=1}^m \log s_j \quad \text{s.t.} \quad g(x) + s = 0. \tag{4}$$
The barrier term prevents the components of $s$ from becoming too close to zero. Lagrangian function:
$$L(x, s, z) = f(x) - \mu \sum_{j=1}^m \log s_j - \sum_{j=1}^m z_j \left(g_j(x) + s_j\right).$$

Interior point methods

KKT conditions for the barrier problem (matrix notation):
$$\nabla f(x) - \nabla g^T(x) z = 0, \qquad \mu S^{-1} e - z = 0, \qquad g(x) + s = 0,$$
where $S = \operatorname{diag}\{s_1, \ldots, s_m\}$, $Z = \operatorname{diag}\{z_1, \ldots, z_m\}$, $e = (1, \ldots, 1)^T$, and $\nabla g(x)$ is the Jacobian matrix of $g$ (the gradients $\nabla g_j(x)^T$ as its rows). Multiplying the second equality by $S$:
$$\nabla f(x) - \nabla g^T(x) z = 0, \qquad SZe = \mu e, \qquad g(x) + s = 0,$$
a nonlinear system of equalities $\Rightarrow$ Newton's method.

Newton's method

Linearize the first-order condition,
$$\nabla f(x + v) \approx \nabla f(x) + \nabla^2 f(x) v = 0,$$
with the solution (under $\nabla^2 f(x) \succ 0$)
$$v = -\left(\nabla^2 f(x)\right)^{-1} \nabla f(x).$$

Interior point methods

Use Newton's method to obtain a step $(\Delta x, \Delta s, \Delta z)$:
$$\begin{pmatrix} H(x,z) & 0 & -\nabla g^T(x) \\ 0 & Z & S \\ \nabla g(x) & I & 0 \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta s \\ \Delta z \end{pmatrix} = \begin{pmatrix} -\nabla f(x) + \nabla g^T(x) z \\ -SZe + \mu e \\ -g(x) - s \end{pmatrix},$$
where
$$H(x,z) = \nabla^2 f(x) - \sum_{j=1}^m z_j \nabla^2 g_j(x)$$
and $\nabla^2 f$ denotes the Hessian matrix.

Interior point methods

Stopping criterion:
$$E = \max\left\{ \left\|\nabla f(x) - \nabla g^T(x) z\right\|, \; \left\|SZe - \mu e\right\|, \; \left\|g(x) + s\right\| \right\} \le \varepsilon,$$
with $\varepsilon > 0$ small.
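For reference, a small helper evaluating $E$ for the general problem; this is a sketch, and the use of the infinity norm is an assumption (the slide does not fix the norm):

```python
import numpy as np

def kkt_error(grad_f, jac_g, g, x, s, z, mu):
    # E = max( ||grad f(x) - Jg(x)^T z||, ||S Z e - mu e||, ||g(x) + s|| )
    r_dual = grad_f(x) - jac_g(x).T @ z
    r_comp = s * z - mu
    r_feas = g(x) + s
    return max(np.linalg.norm(r_dual, np.inf),
               np.linalg.norm(r_comp, np.inf),
               np.linalg.norm(r_feas, np.inf))
```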

Interior point methods

ALGORITHM:
0. Choose $x_0$ and $s_0 > 0$, and compute initial values for the multipliers $z_0 > 0$. Select an initial barrier parameter $\mu_0 > 0$ and a parameter $\sigma \in (0, 1)$, set $k = 1$.
1. Repeat until a stopping test for the original nonlinear program is satisfied: solve the nonlinear system of equalities using Newton's method to obtain $(x_k, s_k, z_k)$; decrease the barrier parameter, $\mu_{k+1} = \sigma \mu_k$; set $k = k + 1$.

Interior point methods

Convergence of the method (Nocedal and Wright (2006), Theorem 19.1): if $f$ and $g_j$ are continuously differentiable and LICQ holds at any limit point, then the limit points are stationary points of the original problem.

Interior point method: Example

$$\min_x \; (x_1 - 2)^4 + (x_1 - 2x_2)^2 \quad \text{s.t.} \quad x_1^2 - x_2 \le 0. \tag{5}$$
$$\min_{x,s} \; (x_1 - 2)^4 + (x_1 - 2x_2)^2 \quad \text{s.t.} \quad x_1^2 - x_2 + s = 0, \; s \ge 0. \tag{6}$$
$$\min_{x,s} \; (x_1 - 2)^4 + (x_1 - 2x_2)^2 - \mu \log s \quad \text{s.t.} \quad x_1^2 - x_2 + s = 0. \tag{7}$$

Interior point method: Example

Lagrange function:
$$L(x_1, x_2, s, z) = (x_1 - 2)^4 + (x_1 - 2x_2)^2 - \mu \log s - z(x_1^2 - x_2 + s).$$
Optimality conditions together with feasibility:
$$\begin{aligned}
\frac{\partial L}{\partial x_1} &= 4(x_1 - 2)^3 + 2(x_1 - 2x_2) - 2zx_1 = 0, \\
\frac{\partial L}{\partial x_2} &= -4(x_1 - 2x_2) + z = 0, \\
\frac{\partial L}{\partial s} &= -\frac{\mu}{s} - z = 0, \\
\frac{\partial L}{\partial z} &= -(x_1^2 - x_2 + s) = 0.
\end{aligned} \tag{8}$$
We have obtained 4 equations in 4 variables...

Interior point method: Example

Slight modification:
$$\begin{aligned}
4(x_1 - 2)^3 + 2(x_1 - 2x_2) - 2zx_1 &= 0, \qquad (9) \\
-4(x_1 - 2x_2) + z &= 0, \qquad (10) \\
sz - \mu &= 0, \qquad (11) \\
x_1^2 - x_2 + s &= 0. \qquad (12)
\end{aligned}$$
Necessary derivatives:
$$H(x_1, x_2, z) = \begin{pmatrix} 12(x_1 - 2)^2 + 2 - 2z & -4 \\ -4 & 8 \end{pmatrix}, \qquad (13)$$
$$\nabla g(x) = \begin{pmatrix} 2x_1 & -1 \end{pmatrix}. \qquad (14)$$

Interior point method: Example

System of linear equations for Newton's step:
$$\begin{pmatrix} 12(x_1 - 2)^2 + 2 - 2z & -4 & 0 & -2x_1 \\ -4 & 8 & 0 & 1 \\ 0 & 0 & z & s \\ 2x_1 & -1 & 1 & 0 \end{pmatrix} \begin{pmatrix} \Delta x_1 \\ \Delta x_2 \\ \Delta s \\ \Delta z \end{pmatrix} = \begin{pmatrix} -4(x_1 - 2)^3 - 2(x_1 - 2x_2) + 2zx_1 \\ 4(x_1 - 2x_2) - z \\ -sz + \mu \\ -x_1^2 + x_2 - s \end{pmatrix}$$

Interior point method: Example

Starting point $x^0 = (0, 1)$, $z^0 = 1$, $s^0 = 1$, $\mu > 0$; then the step solves
$$\begin{pmatrix} 48 & -4 & 0 & 0 \\ -4 & 8 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & -1 & 1 & 0 \end{pmatrix} \begin{pmatrix} \Delta x_1 \\ \Delta x_2 \\ \Delta s \\ \Delta z \end{pmatrix} = \begin{pmatrix} 36 \\ -9 \\ -1 + \mu \\ 0 \end{pmatrix}$$
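As a quick numerical check, the following Python snippet assembles and solves this system at the starting point; $\mu = 0.1$ is an assumed value for illustration (the slides leave $\mu$ unspecified):

```python
import numpy as np

# Starting point from the slides and an assumed barrier parameter
x1, x2, s, z = 0.0, 1.0, 1.0, 1.0
mu = 0.1  # assumption: any mu > 0 can be used here

# Jacobian of equations (9)-(12) with respect to (x1, x2, s, z)
J = np.array([
    [12 * (x1 - 2) ** 2 + 2 - 2 * z, -4.0, 0.0, -2 * x1],
    [-4.0,                            8.0, 0.0,  1.0],
    [0.0,                             0.0,    z,    s],
    [2 * x1,                         -1.0, 1.0,  0.0],
])

# Right-hand side: negative residuals of (9)-(12)
rhs = np.array([
    -(4 * (x1 - 2) ** 3 + 2 * (x1 - 2 * x2) - 2 * z * x1),
    -(-4 * (x1 - 2 * x2) + z),
    -(s * z - mu),
    -(x1 ** 2 - x2 + s),
])

print(J)    # matches the slide: [[48, -4, 0, 0], [-4, 8, 0, 1], [0, 0, 1, 1], [0, -1, 1, 0]]
print(rhs)  # [36, -9, -1 + mu, 0]
print(np.linalg.solve(J, rhs))  # the Newton step (dx1, dx2, ds, dz)
```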

Interior point method: Example

[Figure: plot illustrating the example, shown over roughly $x_1 \in [0, 1.4]$ and $x_2 \in [0, 2]$.]

Sequential Quadratic Programming

Nocedal and Wright (2006): Let $f, h_i: \mathbb{R}^n \to \mathbb{R}$ be smooth functions,
$$\min_x f(x) \quad \text{s.t.} \quad h_i(x) = 0, \; i = 1, \ldots, l. \tag{15}$$
Lagrange function
$$L(x, v) = f(x) + \sum_{i=1}^l v_i h_i(x)$$
and KKT optimality conditions
$$\nabla_x L(x, v) = \nabla_x f(x) + A(x)^T v = 0, \qquad h(x) = 0, \tag{16}$$
where $h(x)^T = (h_1(x), \ldots, h_l(x))$ and $A(x)^T = [\nabla h_1(x), \nabla h_2(x), \ldots, \nabla h_l(x)]$ denotes the Jacobian matrix.

Sequential Quadratic Programming

We have a system of $n + l$ equations in the $n + l$ unknowns $x$ and $v$:
$$F(x, v) = \begin{bmatrix} \nabla_x f(x) + A(x)^T v \\ h(x) \end{bmatrix} = 0. \tag{17}$$
ASS. (LICQ) The Jacobian matrix $A(x)$ has full row rank.
The Jacobian of $F$ is given by
$$\nabla F(x, v) = \begin{bmatrix} \nabla^2_{xx} L(x, v) & A(x)^T \\ A(x) & 0 \end{bmatrix}. \tag{18}$$
We can use the Newton algorithm...

Sequential Quadratic Programming

Setting $\nabla f_k = \nabla f(x_k)$, $\nabla^2_{xx} L_k = \nabla^2_{xx} L(x_k, v_k)$, $A_k = A(x_k)$, $h_k = h(x_k)$, we obtain the Newton step by solving the system
$$\begin{bmatrix} \nabla^2_{xx} L_k & A_k^T \\ A_k & 0 \end{bmatrix} \begin{bmatrix} p_x \\ p_v \end{bmatrix} = \begin{bmatrix} -\nabla f_k - A_k^T v_k \\ -h_k \end{bmatrix}. \tag{19}$$
Then we set $x_{k+1} = x_k + p_x$ and $v_{k+1} = v_k + p_v$.

Sequential Quadratic Programming

ASS. (SOSC) For all $d \in \{d \neq 0 : A(x) d = 0\}$, it holds that $d^T \nabla^2_{xx} L(x, v) d > 0$.

Sequential Quadratic Programming

An important alternative way to see the Newton iterations: consider the quadratic program
$$\min_p \; f_k + p^T \nabla f_k + \tfrac{1}{2} p^T \nabla^2_{xx} L_k \, p \quad \text{s.t.} \quad h_k + A_k p = 0. \tag{20}$$
KKT optimality conditions:
$$\nabla^2_{xx} L_k \, p + \nabla f_k + A_k^T \tilde{v} = 0, \qquad h_k + A_k p = 0, \tag{21}$$
i.e.
$$\begin{bmatrix} \nabla^2_{xx} L_k & A_k^T \\ A_k & 0 \end{bmatrix} \begin{bmatrix} p_x \\ p_{\tilde{v}} \end{bmatrix} = \begin{bmatrix} -\nabla f_k \\ -h_k \end{bmatrix}, \tag{22}$$
which is the same as the Newton system (19) if we add $A_k^T v_k$ to the first equation. Then we set $x_{k+1} = x_k + p_x$ and $v_{k+1} = p_{\tilde{v}}$.

Sequential Quadratic Programming

Algorithm: start with an initial solution $(x_0, v_0)$ and iterate until a convergence criterion is met:
1. Evaluate $\nabla f_k = \nabla f(x_k)$, $h_k = h(x_k)$, $A_k = A(x_k)$, $\nabla^2_{xx} L_k = \nabla^2_{xx} L(x_k, v_k)$.
2. Solve the Newton equations (19) OR the quadratic problem (20) to obtain the new iterate $(x_{k+1}, v_{k+1})$.
If possible, deliver explicit formulas for the first- and second-order derivatives.
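A compact sketch of this loop in Python on a small equality-constrained problem; the problem, starting point, and iteration count are assumptions chosen for the demonstration:

```python
import numpy as np

# Assumed illustrative problem: min (x1-1)^2 + (x2-2)^2  s.t.  h(x) = x1^2 + x2^2 - 1 = 0
f_grad   = lambda x: np.array([2 * (x[0] - 1), 2 * (x[1] - 2)])
h        = lambda x: np.array([x[0] ** 2 + x[1] ** 2 - 1])
A        = lambda x: np.array([[2 * x[0], 2 * x[1]]])          # Jacobian of h (1 x 2)
lag_hess = lambda x, v: (2.0 + 2.0 * v[0]) * np.eye(2)         # Hessian of L = f + v * h

x, v = np.array([1.0, 1.0]), np.array([0.0])
for k in range(20):
    Hk, Ak = lag_hess(x, v), A(x)
    # Newton/SQP system (19): [[H, A^T], [A, 0]] (p_x, p_v) = -(grad f + A^T v, h)
    KKT = np.block([[Hk, Ak.T], [Ak, np.zeros((1, 1))]])
    rhs = -np.concatenate([f_grad(x) + Ak.T @ v, h(x)])
    p = np.linalg.solve(KKT, rhs)
    x, v = x + p[:2], v + p[2:]
print(x, v)  # converges to the point of the unit circle closest to (1, 2), approximately (0.447, 0.894)
```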

Sequential Quadratic Programming Sequential Quadratic Programming f, g j, h i : R n R are smooth. We use a quadratic approximation of the objective function and linearize the constraints, p R p min p f (x k ) + p T x f (x k ) + 1 2 pt 2 xxl(x k, u k, v k )p s.t. g j (x k ) + p T x g j (x k ) 0, j = 1,..., m, h i (x k ) + p T x h i (x k ) = 0, i = 1,..., l. (23) Use an algorithm for quadratic programming to solve the problem and set x k+1 = x k + p k, where u k+1, v k+1 are Lagrange multipliers of the quadratic problem which are used to compute new 2 xxl. Convergence: Nocedal and Wright (2006), Theorem 18.1 Martin Branda (KPMS MFF UK) 03-05-2016 29 / 31


Literature

Bazaraa, M.S., Sherali, H.D., Shetty, C.M. (2006). Nonlinear Programming: Theory and Algorithms, 3rd edition. Wiley, Singapore.
Boyd, S., Vandenberghe, L. (2004). Convex Optimization. Cambridge University Press, Cambridge.
Kall, P., Mayer, J. (2005). Stochastic Linear Programming: Models, Theory, and Computation. Springer.
Nocedal, J., Wright, S.J. (2006). Numerical Optimization, 2nd edition. Springer, New York.