Constrained optimization: direct methods (cont.)
Jussi Hakanen, post-doctoral researcher, jussi.hakanen@jyu.fi
Direct methods

Also known as methods of feasible directions.
- Idea: at a point x^h, generate a feasible search direction along which the objective function value can be improved, then use a line search to obtain x^{h+1}.
- Methods differ in how the feasible direction is chosen and in what is assumed of the constraints (linear/nonlinear, equality/inequality).
Examples of direct methods
- Projected gradient method
- Active set method
- Sequential Quadratic Programming (SQP) method
Active set method

Consider a problem

  min f(x)  s.t.  Ax ≤ b,

where A is an m×n matrix and b ∈ R^m. Then x ∈ S if and only if Ax ≤ b; the ith constraint is a_i^T x ≤ b_i.

Idea:
- In x^h, the set of constraints is divided into active (i ∈ I) and inactive constraints.
- Inactive constraints are not taken into account when the search direction d^h is determined.
- Inactive constraints matter only when computing the optimal step length α^h.
Feasible directions

For i ∈ I, a_i^T x^h = b_i.
If d^h is feasible in x^h, then x^h + αd^h ∈ S for some α > 0:

  a_i^T (x^h + αd^h) = a_i^T x^h + α a_i^T d^h ≤ b_i

⇒ a_i^T d^h ≤ 0 for a feasible d^h, and the constraint remains active if a_i^T d^h = 0.
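As a small numerical illustration of the test a_i^T d ≤ 0 over the active constraints (a sketch with invented constraint data, not part of the original slides):

```python
import numpy as np

# Assumed data: two constraints a_i^T x <= b_i, both active at x = (1, 1).
A = np.array([[1.0, 0.0],   # x1 <= 1
              [0.0, 1.0]])  # x2 <= 1
b = np.array([1.0, 1.0])
x = np.array([1.0, 1.0])

active = np.isclose(A @ x, b)          # both constraints are active here

def is_feasible_direction(d):
    # At active constraints, a_i^T d <= 0 keeps x + alpha*d feasible
    # for small alpha > 0; inactive constraints impose no local limit.
    return bool(np.all(A[active] @ d <= 1e-12))

print(is_feasible_direction(np.array([-1.0, -1.0])))  # True: moves into the interior
print(is_feasible_direction(np.array([1.0, 0.0])))    # False: would violate x1 <= 1
```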
Optimality conditions

Assume that t constraints are active in x^h (I = {i_1, …, i_t}), denote by Â the t×n matrix consisting of the active constraints, and let Z be a matrix whose columns span the null space {d : Âd = 0}.

Necessary: If x* ∈ S (with Âx* = b̂ = (b_{i1}, …, b_{it})) is a local minimizer, then
1) Z^T ∇f(x*) = 0, that is, ∇f(x*) + Â^T μ* = 0,
2) μ_i = μ_i* ≥ 0 for all i ∈ I, and
3) the matrix Z^T H(x*) Z is positive semidefinite (H is the Hessian of f).

Sufficient: If in x* ∈ S
1) Z^T ∇f(x*) = 0,
2) μ_i = μ_i* > 0 for all i ∈ I, and
3) the matrix Z^T H(x*) Z is positive definite,
then x* is a local minimizer.
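The three conditions can be checked numerically on a small assumed example: min x1² + x2² with the single active constraint −x1 − x2 ≤ −2 (i.e. x1 + x2 ≥ 2), whose minimizer is x* = (1, 1). The null-space basis Z is obtained here via an SVD (a sketch, not the slides' method):

```python
import numpy as np

A_hat = np.array([[-1.0, -1.0]])      # active-constraint matrix (t x n)
x_star = np.array([1.0, 1.0])         # candidate point
grad = 2.0 * x_star                   # ∇f(x*) for f(x) = x1^2 + x2^2
H = 2.0 * np.eye(2)                   # Hessian of f

# Columns of Z span the null space {d : A_hat d = 0}
_, s, Vt = np.linalg.svd(A_hat)
Z = Vt[1:].T                          # rows of Vt beyond rank(A_hat)

print(Z.T @ grad)                     # ≈ 0: reduced gradient vanishes
mu = -np.linalg.solve(A_hat @ A_hat.T, A_hat @ grad)
print(mu)                             # ≈ 2 > 0: multiplier non-negative
print(np.linalg.eigvalsh(Z.T @ H @ Z))  # > 0: reduced Hessian positive definite
```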
On active constraints

An optimization problem with inequality constraints is more difficult than one with equality constraints, since the active set at a local minimizer is not known in advance. If it were known, it would be enough to solve the corresponding equality constrained problem. In that case, if the remaining constraints were satisfied at the solution and all the Lagrange multipliers were non-negative, then the solution would also be a solution to the original problem.
Using the active set

At each iteration, a working set is considered, consisting of the constraints active in x^h. The direction d^h is determined so that it is a descent direction within the working set. E.g. Rosen's projected gradient method can be used.
Algorithm

1) Choose a starting point x^1. Determine the active set I^1 and the corresponding matrices Â^1 and Z^1 in x^1. Set h = 1.
2) Compute a feasible direction d^h in the subspace defined by the active constraints (e.g. by using the projected gradient).
3) If d^h = 0, go to 6). Otherwise, find α_max = max{α : x^h + αd^h ∈ S} and solve min f(x^h + αd^h) s.t. 0 ≤ α ≤ α_max. Let the solution be α^h. Set x^{h+1} = x^h + α^h d^h.
4) If α^h < α_max, go to 7). Otherwise the active set has changed; continue to 5).
5) Addition to the active set: add the constraint that first becomes active along d^h to I^h. Update Â^h and Z^h correspondingly to get I^{h+1}, Â^{h+1} and Z^{h+1}. Go to 7).
6) Removal from the active set: compute μ_i, i ∈ I^h. If μ_i ≥ 0 for all i ∈ I^h, x^h is optimal; stop. If μ_i < 0 for some i, remove from the active set the corresponding constraint (the one with the smallest μ_i). Update Â^h and Z^h correspondingly to get I^{h+1}, Â^{h+1} and Z^{h+1}. Set x^{h+1} = x^h.
7) Set h = h + 1 and go to 2).
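Steps 2) and 3) can be sketched with NumPy, assuming linear constraints Ax ≤ b: the direction is Rosen's projection of −∇f onto the null space of the active constraints, and α_max is the largest step that stays feasible. The data in the usage example is invented for illustration:

```python
import numpy as np

def projected_gradient_direction(A_active, grad_f):
    """Rosen's projection: d = -(I - A^T (A A^T)^{-1} A) grad_f.
    Projects steepest descent onto the null space of the active
    constraints, so the active constraints stay active along d."""
    n = grad_f.size
    if A_active.shape[0] == 0:          # no active constraints: plain descent
        return -grad_f
    P = np.eye(n) - A_active.T @ np.linalg.solve(A_active @ A_active.T, A_active)
    return -P @ grad_f

def max_step(A, b, x, d, tol=1e-12):
    """Largest alpha with A(x + alpha*d) <= b (alpha_max in step 3)."""
    Ad = A @ d
    slack = b - A @ x
    ratios = [s / ad for s, ad in zip(slack, Ad) if ad > tol]
    return min(ratios) if ratios else np.inf

# Usage on assumed data: f(x) = -x1 - x2 over x1 <= 2, x2 <= 2, at x = (0, 0)
A = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.array([2.0, 2.0])
x = np.array([0.0, 0.0])
grad = np.array([-1.0, -1.0])                             # ∇f(x)
d = projected_gradient_direction(np.empty((0, 2)), grad)  # empty active set
print(d, max_step(A, b, x, d))                            # [1. 1.] 2.0
```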
How to compute μ

At each iteration, a problem min f(x) s.t. Âx = b̂ is solved.
Lagrangian function: L(x, μ) = f(x) + μ^T(Âx − b̂).
In a minimizer x*, ∇L(x*, μ*) = ∇f(x*) + Â^T μ* = 0
⇒ ∇f(x*) = −Â^T μ*
⇒ Â ∇f(x*) = −(ÂÂ^T) μ*
⇒ μ* = −(ÂÂ^T)^{-1} Â ∇f(x*).
This formula is used in x^h to approximate μ.
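The formula μ = −(ÂÂ^T)^{-1} Â ∇f(x^h) is the least-squares solution of Â^T μ = −∇f(x^h); a small sketch with invented data for Â and ∇f(x^h):

```python
import numpy as np

# Assumed data: two active constraints in R^3 and a gradient at x^h.
A_hat = np.array([[1.0, 0.0, 1.0],
                  [0.0, 1.0, 1.0]])
grad_f = np.array([2.0, -1.0, 3.0])    # ∇f(x^h)

# mu = -(A A^T)^{-1} A ∇f(x^h)
mu = -np.linalg.solve(A_hat @ A_hat.T, A_hat @ grad_f)
print(mu)

# Equivalently, the least-squares solution of A^T mu = -∇f(x^h)
mu_ls, *_ = np.linalg.lstsq(A_hat.T, -grad_f, rcond=None)
print(np.allclose(mu, mu_ls))  # True
```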
Example: Active set method + Rosen's projected gradient (worked example from Miettinen: Nonlinear optimization, 2007, in Finnish).
SQP method

Sequential Quadratic Programming.
- Idea: generate a sequence of quadratic optimization problems whose solutions approach the solution of the original problem.
- The quadratic problems are based on applying the KKT conditions to the original problem.
- Minimize a quadratic approximation of the Lagrangian function subject to a linear approximation of the constraints.
- Also referred to as the projected Lagrangian method.
Approximation of the constraints

Consider a problem min f(x) s.t. h_i(x) = 0, i = 1, …, l, where all the functions are twice continuously differentiable.
Taylor series (d = x − x^h):

  h_i(x^h + d) ≈ h_i(x^h) + ∇h_i(x^h)^T d

Since h_i(x) = 0 for all i, this gives the linearized constraint

  ∇h(x^h) d = −h(x^h),

where ∇h(x^h) is the Jacobian of h.
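The linearized constraint can be sanity-checked numerically; the constraint h(x) = x1² + x2² − 2 = 0 and the point x^h below are an assumed example:

```python
import numpy as np

x_h = np.array([1.2, 0.8])          # current, slightly infeasible iterate
h = x_h @ x_h - 2.0                 # h(x^h) = x1^2 + x2^2 - 2
grad_h = 2.0 * x_h                  # ∇h(x^h)

# Minimum-norm d satisfying the linearized constraint ∇h(x^h)^T d = -h(x^h)
d = -h * grad_h / (grad_h @ grad_h)
print(grad_h @ d + h)                    # ≈ 0: d satisfies the linearization
print(abs((x_h + d) @ (x_h + d) - 2.0))  # small: nearly feasible after the step
```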
Approximation of the Lagrangian

  L(x^h + d, ν) ≈ L(x^h, ν) + d^T ∇_x L(x^h, ν) + (1/2) d^T ∇²_xx L(x^h, ν) d

A quadratic subproblem:

  min_d  d^T ∇f(x^h) + (1/2) d^T E(x^h, ν^h) d
  s.t.   ∇h(x^h) d = −h(x^h),

where E(x^h, ν^h) is either the Hessian of the Lagrangian or some approximation of it (approximate when second derivatives are not available). It can be shown (under some assumptions) that the solutions of the subproblems approach x* and the Lagrange multiplier estimates approach ν*.
Algorithm

1) Choose a starting point x^1. Compute E^1 and set h = 1.
2) Solve the quadratic subproblem min_d d^T ∇f(x^h) + (1/2) d^T E^h d s.t. ∇h(x^h) d = −h(x^h). Let the solution be d^h.
3) Set x^{h+1} = x^h + d^h. Stop if x^{h+1} satisfies the optimality conditions.
4) Compute an estimate ν^{h+1} and compute E^{h+1} = E(x^{h+1}, ν^{h+1}).
5) Set h = h + 1 and go to 2).
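A compact sketch of the algorithm for an assumed example, min x1² + x2² s.t. x1·x2 = 1, whose local solution is x* = (1, 1) with multiplier ν* = −2. Each iteration solves the quadratic subproblem through its KKT linear system, using the exact Hessian of the Lagrangian as E:

```python
import numpy as np

def sqp(x, nu, iters=10):
    """SQP iteration for min x1^2 + x2^2 s.t. x1*x2 - 1 = 0 (assumed example)."""
    for _ in range(iters):
        grad_f = 2.0 * x                          # ∇f(x^h)
        h = np.array([x[0] * x[1] - 1.0])         # h(x^h)
        grad_h = np.array([[x[1], x[0]]])         # ∇h(x^h), 1 x n Jacobian
        E = np.array([[2.0, nu], [nu, 2.0]])      # ∇²_xx L(x^h, ν^h), exact here
        # KKT system of the subproblem:
        #   E d + ∇h(x^h)^T ν = -∇f(x^h),   ∇h(x^h) d = -h(x^h)
        K = np.block([[E, grad_h.T], [grad_h, np.zeros((1, 1))]])
        sol = np.linalg.solve(K, np.concatenate([-grad_f, -h]))
        x = x + sol[:2]                           # step 3: x^{h+1} = x^h + d^h
        nu = sol[2]                               # step 4: new multiplier estimate
    return x, nu

x_opt, nu_opt = sqp(np.array([1.2, 0.9]), 0.0)    # start near the solution
print(x_opt, nu_opt)                              # ≈ [1. 1.]  -2.0
```

Note the starting point is chosen close to x*, in line with the local convergence of the method.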
Notes

- The quadratic subproblems are typically not solved as optimization problems but converted into the KKT system of linear equations

    E^h d + ∇h(x^h)^T ν = −∇f(x^h),   ∇h(x^h) d = −h(x^h),

  which is solved with suitable methods.
- If E = ∇²_xx L, then (x^h, ν^h) → (x*, ν*) quadratically.
- If E is an approximation, the convergence is superlinear.
- Only local convergence, that is, the starting point must be close to the optimum.
- Inequality constraints are handled with active set strategies or explicitly as ∇g_i(x^h)^T d ≤ −g_i(x^h).