Introduction to unconstrained optimization - direct search methods
Jussi Hakanen, Post-doctoral researcher, jussi.hakanen@jyu.fi
Structure of optimization methods
Typically, constraint handling converts the problem to (a series of) unconstrained problems. In unconstrained optimization, a search direction is determined at each iteration, and the best solution in the search direction is found with a line search.
(Diagram: constraint handling method, containing unconstrained optimization, containing line search.)
Group discussion
1. What kinds of optimality conditions exist for unconstrained optimization ($x \in \mathbb{R}^n$)?
2. List methods for unconstrained optimization. What are their general ideas?
Discuss in small groups (3-4) for 15-20 minutes. Each group has a secretary who writes down the group's answers. At the end, we summarize what each group found.
Reminder: gradient and Hessian
Definition: If a function $f: \mathbb{R}^n \to \mathbb{R}$ is differentiable, then the gradient $\nabla f(x)$ consists of the partial derivatives of $f$, i.e.
$$\nabla f(x) = \left( \frac{\partial f(x)}{\partial x_1}, \dots, \frac{\partial f(x)}{\partial x_n} \right)^T.$$
Definition: If $f$ is twice differentiable, then the matrix
$$H(x) = \begin{pmatrix} \frac{\partial^2 f(x)}{\partial x_1 \partial x_1} & \cdots & \frac{\partial^2 f(x)}{\partial x_1 \partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 f(x)}{\partial x_n \partial x_1} & \cdots & \frac{\partial^2 f(x)}{\partial x_n \partial x_n} \end{pmatrix}$$
is called the Hessian (matrix) of $f$ at $x$.
Result: If $f$ is twice continuously differentiable, then $\frac{\partial^2 f(x)}{\partial x_i \partial x_j} = \frac{\partial^2 f(x)}{\partial x_j \partial x_i}$.
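As a concrete illustration, here is a minimal sketch (assuming NumPy) that approximates the gradient and Hessian by central finite differences; the test function and step sizes are illustrative choices. The near-symmetry of the computed Hessian reflects the Result above.

```python
import numpy as np

def gradient_fd(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def hessian_fd(f, x, h=1e-4):
    """Central-difference approximation of the Hessian of f at x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * h**2)
    return H

# Example: f(x) = x_1^2 + 3*x_1*x_2, so grad f = (2x_1 + 3x_2, 3x_1)
f = lambda x: x[0]**2 + 3 * x[0] * x[1]
print(gradient_fd(f, [1.0, 2.0]))  # approx [8., 3.]
print(hessian_fd(f, [1.0, 2.0]))   # approx [[2., 3.], [3., 0.]]
```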
Reminder: Definite Matrices
Definition: A symmetric $n \times n$ matrix $H$ is positive semidefinite if $x^T H x \ge 0$ for all $x \in \mathbb{R}^n$.
Definition: A symmetric $n \times n$ matrix $H$ is positive definite if $x^T H x > 0$ for all $x \ne 0$, $x \in \mathbb{R}^n$.
Note: If $\ge$ ($>$) is replaced by $\le$ ($<$), then $H$ is negative semidefinite (definite). If $H$ is neither positive nor negative semidefinite, then it is indefinite.
Result: Let $S \subset \mathbb{R}^n$ be an open convex set and $f: S \to \mathbb{R}$ twice differentiable in $S$. The function $f$ is convex if and only if $H(x)$ is positive semidefinite for all $x \in S$.
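For a symmetric matrix, definiteness is equivalent to the signs of its eigenvalues (all positive: positive definite; all nonnegative: positive semidefinite; and so on). A minimal sketch, assuming NumPy; the helper name classify_definiteness is illustrative:

```python
import numpy as np

def classify_definiteness(H, tol=1e-10):
    """Classify a symmetric matrix via the signs of its eigenvalues."""
    eig = np.linalg.eigvalsh(H)  # eigenvalues of a symmetric matrix
    if np.all(eig > tol):
        return "positive definite"
    if np.all(eig >= -tol):
        return "positive semidefinite"
    if np.all(eig < -tol):
        return "negative definite"
    if np.all(eig <= tol):
        return "negative semidefinite"
    return "indefinite"

print(classify_definiteness(np.array([[2.0, 0.0], [0.0, 3.0]])))   # positive definite
print(classify_definiteness(np.array([[1.0, 0.0], [0.0, -1.0]])))  # indefinite
```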
Unconstrained problem
$$\min f(x) \quad \text{s.t. } x \in \mathbb{R}^n$$
Necessary conditions: Let $f$ be twice differentiable at $x^*$. If $x^*$ is a local minimizer, then $\nabla f(x^*) = 0$ (that is, $x^*$ is a critical point of $f$) and $H(x^*)$ is positive semidefinite.
Sufficient conditions: Let $f$ be twice differentiable at $x^*$. If $\nabla f(x^*) = 0$ and $H(x^*)$ is positive definite, then $x^*$ is a strict local minimizer.
Result: Let $f: \mathbb{R}^n \to \mathbb{R}$ be twice differentiable at $x^*$. If $\nabla f(x^*) = 0$ and $H(x^*)$ is indefinite, then $x^*$ is a saddle point.
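These conditions can be checked numerically at a candidate point. A minimal sketch, assuming NumPy and callables for the analytic gradient and Hessian (names and tolerance are illustrative):

```python
import numpy as np

def classify_critical_point(grad, hess, x, tol=1e-8):
    """Check the second-order optimality conditions at a candidate point x."""
    if np.linalg.norm(grad(x)) > tol:
        return "not a critical point"
    eig = np.linalg.eigvalsh(np.asarray(hess(x), dtype=float))
    if np.all(eig > tol):
        return "strict local minimizer (sufficient condition holds)"
    if np.all(eig < -tol):
        return "strict local maximizer"
    if np.any(eig > tol) and np.any(eig < -tol):
        return "saddle point (Hessian indefinite)"
    return "inconclusive (Hessian only semidefinite)"

# f(x) = x_1^2 - x_2^2 has a saddle point at the origin
grad = lambda x: np.array([2.0 * x[0], -2.0 * x[1]])
hess = lambda x: np.array([[2.0, 0.0], [0.0, -2.0]])
print(classify_critical_point(grad, hess, np.zeros(2)))  # saddle point
```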
Unconstrained problem
(Figure adapted from Prof. L.T. Biegler, Carnegie Mellon University.)
Descent direction
Definition: Let $f: \mathbb{R}^n \to \mathbb{R}$. A vector $d \in \mathbb{R}^n$ is a descent direction for $f$ at $x^* \in \mathbb{R}^n$ if there exists $\delta > 0$ such that $f(x^* + \lambda d) < f(x^*)$ for all $\lambda \in (0, \delta]$.
Result: Let $f: \mathbb{R}^n \to \mathbb{R}$ be differentiable at $x^*$. If there exists $d \in \mathbb{R}^n$ such that $\nabla f(x^*)^T d < 0$, then $d$ is a descent direction for $f$ at $x^*$.
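The Result gives a cheap sufficient test for a descent direction. A minimal sketch, assuming NumPy:

```python
import numpy as np

def is_descent_direction(grad, x, d):
    """Sufficient test: d is a descent direction at x if grad(x)^T d < 0."""
    return float(np.dot(grad(x), d)) < 0.0

# f(x) = x_1^2 + x_2^2, so grad f(x) = 2x; test at x = (1, 1)
grad = lambda x: 2 * np.asarray(x, dtype=float)
x = np.array([1.0, 1.0])
print(is_descent_direction(grad, x, -grad(x)))            # True: negative gradient
print(is_descent_direction(grad, x, np.array([1.0, 1.0])))  # False: uphill
```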
Model algorithm for unconstrained minimization
Let $x^h$ be the current estimate for $x^*$.
1) [Test for convergence.] If the conditions are satisfied, stop. The solution is $x^h$.
2) [Compute a search direction.] Compute a nonzero vector $d^h \in \mathbb{R}^n$, the search direction.
3) [Compute a step length.] Compute $\alpha^h > 0$, the step length, for which $f(x^h + \alpha^h d^h) < f(x^h)$.
4) [Update the estimate for the minimum.] Set $x^{h+1} = x^h + \alpha^h d^h$, $h = h + 1$, and go to step 1.
From Gill et al., Practical Optimization, 1981, Academic Press.
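A minimal sketch of the model algorithm, assuming NumPy. Steepest descent for step 2 and backtracking (Armijo) for step 3 are illustrative choices, not part of the model algorithm itself:

```python
import numpy as np

def minimize_descent(f, grad, x0, tol=1e-6, max_iter=1000):
    """Model algorithm: descent direction + step length + update, in a loop."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        # 1) Test for convergence.
        if np.linalg.norm(g) < tol:
            break
        # 2) Compute a search direction (here: steepest descent).
        d = -g
        # 3) Compute a step length by backtracking until f decreases enough.
        alpha = 1.0
        while f(x + alpha * d) > f(x) + 1e-4 * alpha * np.dot(g, d):
            alpha *= 0.5
        # 4) Update the estimate for the minimum.
        x = x + alpha * d
    return x

f = lambda x: (x[0] - 2) ** 2 + (x[1] + 1) ** 2
grad = lambda x: np.array([2 * (x[0] - 2), 2 * (x[1] + 1)])
print(minimize_descent(f, grad, [0.0, 0.0]))  # approx [2., -1.]
```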
On convergence
Iterative method: a sequence $\{x^h\}$ such that $x^h \to x^*$ when $h \to \infty$.
Definition: A method converges
- linearly if there exist $\alpha \in [0,1)$ and $M \ge 0$ such that for all $h \ge M$, $\|x^{h+1} - x^*\| \le \alpha \|x^h - x^*\|$,
- superlinearly if there exist $M \ge 0$ and a sequence $\alpha_h \to 0$ such that for all $h \ge M$, $\|x^{h+1} - x^*\| \le \alpha_h \|x^h - x^*\|$,
- with degree $p$ if there exist $\alpha \ge 0$, $p > 0$ and $M \ge 0$ such that for all $h \ge M$, $\|x^{h+1} - x^*\| \le \alpha \|x^h - x^*\|^p$. If $p = 2$ ($p = 3$), the convergence is quadratic (cubic).
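The degree $p$ can be estimated from observed errors: from $e_{h+1} \approx \alpha e_h^p$ it follows that $p \approx \log(e_{h+1}/e_h) / \log(e_h/e_{h-1})$. A minimal sketch, assuming NumPy, using Newton's method on a scalar equation as a known quadratically convergent example:

```python
import numpy as np

def estimate_order(errors):
    """Estimate p in e_{h+1} ~ a * e_h^p from consecutive error ratios."""
    e = np.asarray(errors, dtype=float)
    return np.log(e[2:] / e[1:-1]) / np.log(e[1:-1] / e[:-2])

# Newton's method on g(x) = x^2 - 2 (root sqrt(2)) converges quadratically.
x, errors = 1.0, []
for _ in range(4):
    x = x - (x**2 - 2) / (2 * x)  # Newton step
    errors.append(abs(x - np.sqrt(2)))
print(estimate_order(errors))  # estimates approach p = 2
```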
Summary of group discussion for methods
1. Newton's method: utilizes the tangent
2. Golden section method: for line search
3. Downhill simplex
4. Cyclic coordinate method: one coordinate at a time
5. Polytope search (Nelder-Mead): idea based on geometry
6. Gradient descent (steepest descent): based on gradient information
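Since the golden section method appears above as a line search, here is a minimal sketch of it (assuming NumPy; the bracket $[a, b]$ and the test function are illustrative). It repeatedly shrinks an interval known to contain the minimizer of a unimodal function:

```python
import numpy as np

def golden_section(phi, a, b, tol=1e-6):
    """Golden section search for a minimum of a unimodal phi on [a, b]."""
    inv_phi = (np.sqrt(5) - 1) / 2  # 1/(golden ratio), about 0.618
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    while b - a > tol:
        if phi(c) < phi(d):
            b, d = d, c               # minimum lies in [a, d]
            c = b - inv_phi * (b - a)
        else:
            a, c = c, d               # minimum lies in [c, b]
            d = a + inv_phi * (b - a)
    return (a + b) / 2

print(golden_section(lambda t: (t - 0.7) ** 2, 0.0, 2.0))  # ~0.7
```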
Direct search methods
- Univariate search, coordinate descent, cyclic coordinate search
- Hooke and Jeeves
- Powell's method
Coordinate descent
Example: $f(x) = 2x_1^2 + 2x_1 x_2 + x_2^2 + x_1 - x_2$
(Figure from Miettinen: Nonlinear Optimization, 2007, in Finnish.)
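A minimal sketch of cyclic coordinate descent (assuming NumPy), run on the example function as reconstructed above; the grid-based line search is a crude illustrative stand-in for a proper one-dimensional search such as golden section:

```python
import numpy as np

def cyclic_coordinate_descent(f, x0, tol=1e-6, max_iter=200):
    """Minimize f along one coordinate direction at a time (derivative-free)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(x.size):
            e = np.zeros_like(x)
            e[i] = 1.0
            # Crude line search: best step from a grid along +/- e_i.
            ts = np.linspace(-1.0, 1.0, 201)
            x = x + ts[np.argmin([f(x + t * e) for t in ts])] * e
        if np.linalg.norm(x - x_old) < tol:
            break
    return x

f = lambda x: 2*x[0]**2 + 2*x[0]*x[1] + x[1]**2 + x[0] - x[1]
print(cyclic_coordinate_descent(f, [0.0, 0.0]))  # approx [-1., 1.5]
```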
Idea of pattern search
(Figure from Miettinen: Nonlinear Optimization, 2007, in Finnish.)
Hooke and Jeeves
Example: $f(x) = (x_1 - 2)^4 + (x_1 - 2x_2)^2$
(Figure from Miettinen: Nonlinear Optimization, 2007, in Finnish.)
Hooke and Jeeves with fixed step length
Example: $f(x) = (x_1 - 2)^4 + (x_1 - 2x_2)^2$
(Figure from Miettinen: Nonlinear Optimization, 2007, in Finnish.)
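A simplified sketch of Hooke and Jeeves (assuming NumPy): exploratory moves along the coordinate directions, a pattern move through the improved point, and a step length that shrinks when exploration fails. Parameter values are illustrative, and the bookkeeping is reduced compared to the classical statement of the method:

```python
import numpy as np

def hooke_jeeves(f, x0, step=0.5, shrink=0.5, tol=1e-6, max_iter=1000):
    """Simplified Hooke-Jeeves pattern search with a shrinking step length."""
    def explore(x, h):
        # Exploratory search: try +/- h along each coordinate, keep improvements.
        x = x.copy()
        for i in range(x.size):
            for s in (h, -h):
                trial = x.copy()
                trial[i] += s
                if f(trial) < f(x):
                    x = trial
                    break
        return x

    base = np.asarray(x0, dtype=float)
    h = step
    for _ in range(max_iter):
        if h < tol:
            break
        new = explore(base, h)
        if f(new) < f(base):
            # Pattern move: explore around new + (new - base).
            pattern = explore(new + (new - base), h)
            base = pattern if f(pattern) < f(new) else new
        else:
            h *= shrink  # no improvement: shrink the step length
    return base

# Slide example: f(x) = (x_1 - 2)^4 + (x_1 - 2x_2)^2, minimum at (2, 1)
f = lambda x: (x[0] - 2) ** 4 + (x[0] - 2 * x[1]) ** 2
print(hooke_jeeves(f, [0.0, 3.0]))  # approx [2., 1.]
```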
Powell's method
The most efficient pattern search method. It differs from Hooke and Jeeves in that, in each pattern search step, one of the coordinate directions is replaced with the previous pattern search direction.
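A minimal sketch of basic Powell (assuming NumPy), run on the same example function as the Hooke and Jeeves slides. The grid-based line search and the rule of always discarding the oldest direction are simplifications of practical implementations:

```python
import numpy as np

def line_min(f, x, d):
    """Crude line search: minimize f(x + t*d) over a grid of t values."""
    ts = np.linspace(-2.0, 2.0, 401)
    return x + ts[np.argmin([f(x + t * d) for t in ts])] * d

def powell(f, x0, tol=1e-8, max_iter=100):
    """Basic Powell: line searches along n directions, then replace the
    oldest direction with the overall displacement of the cycle."""
    x = np.asarray(x0, dtype=float)
    dirs = list(np.eye(x.size))  # start from the coordinate directions
    for _ in range(max_iter):
        x_start = x.copy()
        for d in dirs:
            x = line_min(f, x, d)
        new_dir = x - x_start
        if np.linalg.norm(new_dir) < tol:
            break
        dirs = dirs[1:] + [new_dir / np.linalg.norm(new_dir)]
        x = line_min(f, x, dirs[-1])  # extra search along the new direction
    return x

f = lambda x: (x[0] - 2) ** 4 + (x[0] - 2 * x[1]) ** 2
print(powell(f, [0.0, 3.0]))  # approx [2., 1.]
```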