MATH529 Fundamentals of Optimization Unconstrained Optimization II

Marco A. Montes de Oca, Mathematical Sciences, University of Delaware, USA

Recap

Example: Find the local and global minimizers and maximizers on $\mathbb{R}$ of $f(x) = 3x^4 - 4x^3 + 1$.

[Figure: graph of $f(x) = 3x^4 - 4x^3 + 1$.]

Two theorems summarize the basic facts about global optimization of one-variable functions.

Theorem (1st order condition (Necessary, but not sufficient)): Suppose that $f(x)$ is a differentiable function on $\mathbb{R}$ (or on the function's domain $I$). If $x^*$ is a global minimizer of $f(x)$, then $f'(x^*) = 0$.

Theorem (2nd order condition (Sufficient, but not necessary)): Suppose that $f(x)$, $f'(x)$, and $f''(x)$ are all continuous on $\mathbb{R}$ (or $I$) and that $x^*$ is a critical point of $f(x)$.
a) If $f''(x) \ge 0$ for all $x \in \mathbb{R}$ (or $I$), then $x^*$ is a global minimizer of $f(x)$ on $\mathbb{R}$ (or $I$).
b) If $f''(x) > 0$ for all $x \in \mathbb{R}$ (or $I$) such that $x \ne x^*$, then $x^*$ is a strict global minimizer of $f(x)$ on $\mathbb{R}$ (or $I$).
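As a worked illustration (added here, not taken from the slides), the recap function $f(x) = 3x^4 - 4x^3 + 1$ shows why neither theorem is an "if and only if" statement. Its derivatives and a factorization are:

```latex
\begin{align*}
f'(x)  &= 12x^3 - 12x^2 = 12x^2 (x - 1), \\
f''(x) &= 36x^2 - 24x   = 12x (3x - 2), \\
f(x)   &= (x - 1)^2 \, (3x^2 + 2x + 1).
\end{align*}
```

The critical points are $x = 0$ and $x = 1$. Since $f'(x) < 0$ for every $x < 1$ with $x \ne 0$, $f$ is strictly decreasing through $x = 0$, so $x = 0$ is a critical point that is not a minimizer: the first-order condition is not sufficient. The factor $3x^2 + 2x + 1$ has discriminant $-8$ and is therefore positive everywhere, so $f(x) \ge 0 = f(1)$ with equality only at $x = 1$. Hence $x = 1$ is a strict global minimizer even though $f''(x) < 0$ on $(0, 2/3)$: the second-order condition is not necessary.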

Local optimization is easier to verify.

Theorem (1st order condition (Necessary, but not sufficient)): Suppose that $f(x)$ is a differentiable function on $\mathbb{R}$ (or $I$). If $x^*$ is a local minimizer of $f(x)$, then $f'(x^*) = 0$.

Theorem (2nd order condition (Sufficient, but not necessary)): Suppose that $f(x)$, $f'(x)$, and $f''(x)$ are all continuous on $\mathbb{R}$ (or $I$) and that $x^*$ is a critical point of $f(x)$. If $f''(x^*) > 0$, then $x^*$ is a strict local minimizer of $f(x)$.
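The local test can be sketched in a few lines of Python. This is only an illustration for the recap function $f(x) = 3x^4 - 4x^3 + 1$, with derivatives computed by hand; the helper name `classify` is hypothetical, and the "strict local maximizer" branch uses the analogous statement for $f''(x^*) < 0$.

```python
def f(x):
    return 3 * x**4 - 4 * x**3 + 1


def df(x):
    # First derivative, computed by hand: f'(x) = 12x^3 - 12x^2
    return 12 * x**3 - 12 * x**2


def d2f(x):
    # Second derivative, computed by hand: f''(x) = 36x^2 - 24x
    return 36 * x**2 - 24 * x


def classify(x_star):
    """Apply the second-order test at a candidate point x_star."""
    if df(x_star) != 0:
        return "not a critical point"
    curvature = d2f(x_star)
    if curvature > 0:
        return "strict local minimizer"
    if curvature < 0:
        return "strict local maximizer"
    # f''(x_star) == 0: the test gives no information
    return "inconclusive"


if __name__ == "__main__":
    for x_star in (0, 1):
        print(x_star, f(x_star), classify(x_star))
```

At $x^* = 0$ the test is inconclusive because $f''(0) = 0$, which is why a sufficient condition alone cannot classify every critical point.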

Exercise: Find the local and global minimizers and maximizers on $I = (-1, 1)$ of $f(x) = \ln(1 - x^2)$.


What about functions of many variables? Extend the theorems that allow us to identify and classify local minimizers of one-variable functions to the multivariable case.

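Notation: A vector in $\mathbb{R}^n$ is an ordered $n$-tuple
\[
\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_n \end{pmatrix}
\]
of real numbers called components of $\mathbf{x}$. If $\mathbf{x}$ and $\mathbf{y}$ are vectors in $\mathbb{R}^n$, then their dot product or inner product is defined by
\[
\mathbf{x} \cdot \mathbf{y} = \mathbf{x}^T \mathbf{y} = (x_1, x_2, x_3, \ldots, x_n) \begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_n \end{pmatrix} = \sum_{i=1}^{n} x_i y_i,
\]
where $\mathbf{x}^T$ is the transpose of $\mathbf{x}$.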
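The dot product definition translates directly into code. The following is a minimal sketch using plain Python lists as vectors; the function name `dot` is just an illustrative choice.

```python
def dot(x, y):
    """Dot product x^T y of two vectors given as equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    # Sum of componentwise products x_i * y_i
    return sum(xi * yi for xi, yi in zip(x, y))


if __name__ == "__main__":
    print(dot([1, 2, 3], [4, 5, 6]))
```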

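Notation: If $f(\mathbf{x})$ is a function of $n$ variables with continuous first and second partial derivatives on $\mathbb{R}^n$, then the gradient of $f(\mathbf{x})$ is the vector
\[
\nabla f = \begin{pmatrix} \dfrac{\partial f}{\partial x_1} \\[2mm] \dfrac{\partial f}{\partial x_2} \\[2mm] \dfrac{\partial f}{\partial x_3} \\ \vdots \\ \dfrac{\partial f}{\partial x_n} \end{pmatrix}.
\]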
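To make the gradient concrete, here is a hedged sketch that approximates it with central differences. The step size `h` and the test function `example_f` are assumptions chosen for illustration; they do not come from the slides.

```python
def numerical_gradient(f, x, h=1e-6):
    """Approximate the gradient of f at x using central differences."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        # Central difference approximation of the i-th partial derivative
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad


def example_f(x):
    # Hypothetical test function: f(x1, x2) = x1^2 + 3*x1*x2 + 2*x2^2
    return x[0] ** 2 + 3 * x[0] * x[1] + 2 * x[1] ** 2


if __name__ == "__main__":
    # The exact gradient is (2*x1 + 3*x2, 3*x1 + 4*x2)
    print(numerical_gradient(example_f, [1.0, 2.0]))
```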

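Notation: The Hessian of $f(\mathbf{x})$, denoted by $\nabla^2 f$ or $Hf$, is the symmetric $n \times n$ matrix
\[
\nabla^2 f = Hf = \begin{pmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_3} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\[2mm]
\dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \dfrac{\partial^2 f}{\partial x_2 \partial x_3} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\[2mm]
\dfrac{\partial^2 f}{\partial x_3 \partial x_1} & \dfrac{\partial^2 f}{\partial x_3 \partial x_2} & \dfrac{\partial^2 f}{\partial x_3^2} & \cdots & \dfrac{\partial^2 f}{\partial x_3 \partial x_n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \dfrac{\partial^2 f}{\partial x_n \partial x_3} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2}
\end{pmatrix}.
\]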
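The Hessian can be approximated in the same spirit with second-order central differences. This is a sketch under assumptions: the step size `h` and the test function `example_f` are illustrative choices, not part of the slides.

```python
def numerical_hessian(f, x, h=1e-4):
    """Approximate the Hessian of f at x with central differences."""
    n = len(x)
    hessian = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):

            def shifted(di, dj):
                # Evaluate f after shifting component i by di and component j by dj
                p = list(x)
                p[i] += di
                p[j] += dj
                return f(p)

            # When i == j this reduces to a second difference with step 2h
            hessian[i][j] = (
                shifted(h, h) - shifted(h, -h) - shifted(-h, h) + shifted(-h, -h)
            ) / (4 * h * h)
    return hessian


def example_f(x):
    # Hypothetical test function: f(x1, x2) = x1^3 + x1*x2^2
    return x[0] ** 3 + x[0] * x[1] ** 2


if __name__ == "__main__":
    # The exact Hessian is [[6*x1, 2*x2], [2*x2, 2*x1]]
    for row in numerical_hessian(example_f, [1.0, 2.0]):
        print(row)
```

The computed matrix is symmetric up to rounding, as expected for a function with continuous second partial derivatives.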


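Definition: Suppose $f(\mathbf{x})$ is a real-valued function defined on a subset $D$ of $\mathbb{R}^n$. A point $\mathbf{x}^*$ in $D$ is:
A global minimizer for $f(\mathbf{x})$ on $D$ if $f(\mathbf{x}^*) \le f(\mathbf{x})$ for all $\mathbf{x} \in D$;
A strict global minimizer for $f(\mathbf{x})$ on $D$ if $f(\mathbf{x}^*) < f(\mathbf{x})$ for all $\mathbf{x} \in D$ such that $\mathbf{x} \ne \mathbf{x}^*$;
A local minimizer for $f(\mathbf{x})$ if there is a positive number $\delta$ such that $f(\mathbf{x}^*) \le f(\mathbf{x})$ for all $\mathbf{x} \in D$ for which $\mathbf{x} \in B(\mathbf{x}^*, \delta)$, the open ball of radius $\delta$ centered at $\mathbf{x}^*$;
A strict local minimizer for $f(\mathbf{x})$ if there is a positive number $\delta$ such that $f(\mathbf{x}^*) < f(\mathbf{x})$ for all $\mathbf{x} \in D$ for which $\mathbf{x} \in B(\mathbf{x}^*, \delta)$ and $\mathbf{x} \ne \mathbf{x}^*$.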

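Theorem (Multivariable Taylor's formula): Suppose that $\mathbf{x}^*$, $\mathbf{x}$ are points in $\mathbb{R}^n$ and that $f(\mathbf{x})$ is a real-valued function of $n$ variables with continuous first and second partial derivatives on some open set containing the line segment
\[
[\mathbf{x}^*, \mathbf{x}] = \{\mathbf{w} \in \mathbb{R}^n : \mathbf{w} = \mathbf{x}^* + t(\mathbf{x} - \mathbf{x}^*),\ 0 \le t \le 1\}
\]
joining $\mathbf{x}^*$ and $\mathbf{x}$. Then there exists a $\mathbf{z} \in [\mathbf{x}^*, \mathbf{x}]$ such that
\[
f(\mathbf{x}) = f(\mathbf{x}^*) + \nabla f(\mathbf{x}^*)^T (\mathbf{x} - \mathbf{x}^*) + \frac{1}{2} (\mathbf{x} - \mathbf{x}^*)^T Hf(\mathbf{z}) (\mathbf{x} - \mathbf{x}^*).
\]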
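A quick numerical check can make Taylor's formula less abstract. The sketch below assumes a hypothetical quadratic function, for which the Hessian is the same constant matrix at every point $\mathbf{z}$, so the formula holds exactly no matter which $\mathbf{z}$ on the segment is used.

```python
def f(x):
    # Hypothetical quadratic: f(x1, x2) = x1^2 + x1*x2 + x2^2
    return x[0] ** 2 + x[0] * x[1] + x[1] ** 2


def grad_f(x):
    # Gradient computed by hand
    return [2 * x[0] + x[1], x[0] + 2 * x[1]]


# Hessian computed by hand; constant because f is quadratic
HESSIAN = [[2, 1], [1, 2]]


def taylor_rhs(x_star, x):
    """Right-hand side of Taylor's formula expanded around x_star."""
    d = [xi - si for xi, si in zip(x, x_star)]
    linear = sum(g * di for g, di in zip(grad_f(x_star), d))
    quadratic = sum(
        d[i] * HESSIAN[i][j] * d[j] for i in range(2) for j in range(2)
    )
    return f(x_star) + linear + 0.5 * quadratic


if __name__ == "__main__":
    x_star, x = [1, -1], [3, 2]
    print(f(x), taylor_rhs(x_star, x))
```

For a non-quadratic function the Hessian varies along the segment, and the theorem only guarantees that some $\mathbf{z}$ makes the equality hold.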

Theorem (Local minimizer identification): Suppose that $f(\mathbf{x})$ is a real-valued function for which all first partial derivatives of $f(\mathbf{x})$ exist on a subset $D \subset \mathbb{R}^n$. If $\mathbf{x}^*$ is an interior point of $D$ that is a local minimizer of $f(\mathbf{x})$, then $\nabla f(\mathbf{x}^*) = \mathbf{0}$.

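Theorem (Classification of minimizers (maximizers)): Suppose that $\mathbf{x}^*$ is a critical point of a function $f(\mathbf{x})$ with continuous first and second partial derivatives on $\mathbb{R}^n$. Then:
$\mathbf{x}^*$ is a global minimizer of $f(\mathbf{x})$ if $(\mathbf{x} - \mathbf{x}^*)^T Hf(\mathbf{z}) (\mathbf{x} - \mathbf{x}^*) \ge 0$ for all $\mathbf{x} \in \mathbb{R}^n$ and all $\mathbf{z} \in [\mathbf{x}^*, \mathbf{x}]$;
$\mathbf{x}^*$ is a strict global minimizer of $f(\mathbf{x})$ if $(\mathbf{x} - \mathbf{x}^*)^T Hf(\mathbf{z}) (\mathbf{x} - \mathbf{x}^*) > 0$ for all $\mathbf{x} \in \mathbb{R}^n$ such that $\mathbf{x} \ne \mathbf{x}^*$ and for all $\mathbf{z} \in [\mathbf{x}^*, \mathbf{x}]$;
$\mathbf{x}^*$ is a global maximizer of $f(\mathbf{x})$ if $(\mathbf{x} - \mathbf{x}^*)^T Hf(\mathbf{z}) (\mathbf{x} - \mathbf{x}^*) \le 0$ for all $\mathbf{x} \in \mathbb{R}^n$ and all $\mathbf{z} \in [\mathbf{x}^*, \mathbf{x}]$;
$\mathbf{x}^*$ is a strict global maximizer of $f(\mathbf{x})$ if $(\mathbf{x} - \mathbf{x}^*)^T Hf(\mathbf{z}) (\mathbf{x} - \mathbf{x}^*) < 0$ for all $\mathbf{x} \in \mathbb{R}^n$ such that $\mathbf{x} \ne \mathbf{x}^*$ and for all $\mathbf{z} \in [\mathbf{x}^*, \mathbf{x}]$.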
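A small worked example may help show how the classification theorem is applied. The function below is a hypothetical choice made for illustration; it is not taken from the slides.

```latex
\begin{align*}
f(x_1, x_2) &= x_1^2 + x_2^2 - 2x_1, \\
\nabla f(\mathbf{x}) &= \begin{pmatrix} 2x_1 - 2 \\ 2x_2 \end{pmatrix} = \mathbf{0}
  \quad\Longrightarrow\quad \mathbf{x}^* = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \\
Hf(\mathbf{z}) &= \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}
  \quad \text{for every } \mathbf{z} \in \mathbb{R}^2, \\
(\mathbf{x} - \mathbf{x}^*)^T Hf(\mathbf{z}) (\mathbf{x} - \mathbf{x}^*)
  &= 2\left[(x_1 - 1)^2 + x_2^2\right] > 0
  \quad \text{whenever } \mathbf{x} \ne \mathbf{x}^*.
\end{align*}
```

The strict inequality holds for all $\mathbf{x} \ne \mathbf{x}^*$ and all $\mathbf{z}$, so by the second item of the theorem $\mathbf{x}^* = (1, 0)^T$ is a strict global minimizer, with $f(\mathbf{x}^*) = -1$.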

Practical ways to use the previous theorem: Conditions that involve the form $(\mathbf{x} - \mathbf{x}^*)^T Hf(\mathbf{z}) (\mathbf{x} - \mathbf{x}^*)$, or in general $\mathbf{v}^T A \mathbf{v}$, where $A$ is a symmetric square matrix, call for methods to identify whether $A$ (in our case the Hessian of the objective function) is positive or negative (semi)definite.

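Quadratic forms: Let
\[
A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix}.
\]
The quadratic form associated with $A$ is
\[
Q_A(\mathbf{x}) = \mathbf{x}^T A \mathbf{x} = a_{11} x_1^2 + a_{12} x_1 x_2 + a_{13} x_1 x_3 + \cdots + a_{ij} x_i x_j + \cdots + a_{ii} x_i^2 + \cdots + a_{nn} x_n^2 = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j.
\]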
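The double-sum definition of $Q_A(\mathbf{x})$ can be evaluated directly. This is a minimal sketch with matrices stored as nested lists; the function name `quadratic_form` and the sample matrix are illustrative assumptions.

```python
def quadratic_form(A, x):
    """Evaluate Q_A(x) = x^T A x as the double sum of a_ij * x_i * x_j."""
    n = len(x)
    if len(A) != n or any(len(row) != n for row in A):
        raise ValueError("A must be an n x n matrix matching the length of x")
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


if __name__ == "__main__":
    # Hypothetical symmetric matrix and vector
    A = [[2, 1], [1, 3]]
    print(quadratic_form(A, [1, 2]))
```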

Example: Write the quadratic form associated with the following matrix:
\[
A = \begin{pmatrix}
1 & 0 & 2 & 3 \\
0 & 2 & 1/2 & 1 \\
2 & 1/2 & 0 & 4 \\
3 & 1 & 4 & 5
\end{pmatrix}.
\]
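One way to work this example (a sketch added here, not the in-class solution) is to read the squared terms off the diagonal and, because $a_{ij} = a_{ji}$, merge each pair of cross terms into $2 a_{ij} x_i x_j$ for $i < j$:

```latex
\begin{align*}
Q_A(\mathbf{x}) &= \sum_{i=1}^{4} a_{ii} x_i^2 + 2 \sum_{i < j} a_{ij} x_i x_j \\
&= x_1^2 + 2x_2^2 + 5x_4^2 + 4x_1 x_3 + 6x_1 x_4 + x_2 x_3 + 2x_2 x_4 + 8x_3 x_4.
\end{align*}
```

There is no $x_3^2$ term because $a_{33} = 0$, and no $x_1 x_2$ term because $a_{12} = a_{21} = 0$.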

Determining whether a quadratic form $Q_A(\mathbf{x}) > 0$ for all $\mathbf{x} \in \mathbb{R}^n$ with $\mathbf{x} \ne \mathbf{0}$. Example in class....
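The slides leave the method to an in-class example. One standard test, which may or may not be the one used in class, is Sylvester's criterion: a symmetric matrix $A$ satisfies $Q_A(\mathbf{x}) > 0$ for all $\mathbf{x} \ne \mathbf{0}$ exactly when all of its leading principal minors are positive. The sketch below applies it to the $4 \times 4$ matrix from the example slide; all function names are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def det(M):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        # Delete row 0 and column j to form the minor
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total


def leading_principal_minors(A):
    """Determinants of the top-left k x k submatrices for k = 1, ..., n."""
    return [det([row[:k] for row in A[:k]]) for k in range(1, len(A) + 1)]


def is_positive_definite(A):
    """Sylvester's criterion; assumes A is symmetric."""
    return all(minor > 0 for minor in leading_principal_minors(A))


HALF = Fraction(1, 2)
# Matrix from the example slide
A = [
    [1, 0, 2, 3],
    [0, 2, HALF, 1],
    [2, HALF, 0, 4],
    [3, 1, 4, 5],
]

if __name__ == "__main__":
    print(leading_principal_minors(A))
    print(is_positive_definite(A))
```

For this matrix the third leading principal minor is $-33/4$, so $Q_A$ is not positive definite. This agrees with a direct check: $a_{33} = 0$ gives $Q_A(\mathbf{e}_3) = 0$ for the third standard basis vector $\mathbf{e}_3$.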