Lecture 2 - Unconstrained Optimization
Amir Beck, Introduction to Nonlinear Optimization, Lecture Slides - Unconstrained Optimization

Definition [Global Minimum and Maximum]. Let f : S → R be defined on a set S ⊆ R^n. Then
1. x* ∈ S is a global minimum point of f over S if f(x) ≥ f(x*) for any x ∈ S.
2. x* ∈ S is a strict global minimum point of f over S if f(x) > f(x*) for any x ∈ S, x ≠ x*.
3. x* ∈ S is a global maximum point of f over S if f(x) ≤ f(x*) for any x ∈ S.
4. x* ∈ S is a strict global maximum point of f over S if f(x) < f(x*) for any x ∈ S, x ≠ x*.

A global optimum is a global minimum or maximum.
Maximal value of f over S: sup{f(x) : x ∈ S}.
Minimal value of f over S: inf{f(x) : x ∈ S}.
The minimal and maximal values are always unique; minimum and maximum points need not be.

Example 1: Find the global minimum and maximum points of f(x, y) = x + y over the unit ball S = B[0, 1] = {(x, y)^T : x² + y² ≤ 1}. In class.
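
One standard argument (my addition, not spelled out on the slide): by the Cauchy-Schwarz inequality, x + y ≤ √2 · √(x² + y²) ≤ √2 on S, with equality at (1/√2, 1/√2), so this point is the global maximum point; by symmetry, (−1/√2, −1/√2) is the global minimum point, with value −√2.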

Example 2:
min { f(x, y) = (x + y)/(x² + y² + 1) : x, y ∈ R }
(1/√2, 1/√2) - global maximizer.
(−1/√2, −1/√2) - global minimizer.
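
As a quick numerical sanity check (my addition, not from the slides; assumes NumPy is available), one can scan a grid and compare the best grid values against the claimed optimizers:

    import numpy as np

    # Sketch only: compare the best values of f on a coarse grid with the
    # values at the claimed maximizer/minimizer (+-1/sqrt(2), +-1/sqrt(2)).
    def f(x, y):
        return (x + y) / (x**2 + y**2 + 1)

    xs = np.linspace(-3.0, 3.0, 601)
    X, Y = np.meshgrid(xs, xs)
    F = f(X, Y)
    print(F.max(), f(1/np.sqrt(2), 1/np.sqrt(2)))    # both approx +0.7071 = 1/sqrt(2)
    print(F.min(), f(-1/np.sqrt(2), -1/np.sqrt(2)))  # both approx -0.7071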

Local Minima and Maxima

Definition. Let f : S → R be defined on a set S ⊆ R^n. Then
1. x* ∈ S is a local minimum of f over S if there exists r > 0 for which f(x*) ≤ f(x) for any x ∈ S ∩ B(x*, r).
2. x* ∈ S is a strict local minimum of f over S if there exists r > 0 for which f(x*) < f(x) for any x ∈ S ∩ B(x*, r), x ≠ x*.
3. x* ∈ S is a local maximum of f over S if there exists r > 0 for which f(x*) ≥ f(x) for any x ∈ S ∩ B(x*, r).
4. x* ∈ S is a strict local maximum of f over S if there exists r > 0 for which f(x*) > f(x) for any x ∈ S ∩ B(x*, r), x ≠ x*.

Of course, a global minimum (maximum) point is also a local minimum (maximum) point.

Example: the function f shown in class is defined over [−1, 8] (its plot is not reproduced in this transcription). Classify each of the points x = −1, 1, 2, 3, 5, 6.5, 8 as strict/non-strict global/local minimum/maximum points. In class.

Fermat's Theorem - First Order Optimality Condition

Theorem. Let f : U → R be a function defined on a set U ⊆ R^n. Suppose that x* ∈ int(U) is a local optimum point and that all the partial derivatives of f exist at x*. Then ∇f(x*) = 0.

Proof. Let i ∈ {1, 2, ..., n} and consider the one-dimensional function g(t) = f(x* + t·e_i). Since x* is a local optimum point of f, t = 0 is a local optimum of g, and hence g′(0) = 0. Thus, ∂f/∂x_i(x*) = g′(0) = 0.

Stationary Points

Definition. Let f : U → R be a function defined on a set U ⊆ R^n. Suppose that x* ∈ int(U) and that all the partial derivatives of f are defined at x*. Then x* is called a stationary point of f if ∇f(x*) = 0.

Example:
min { f(x, y) = (x + y)/(x² + y² + 1) : x, y ∈ R }

∇f(x, y) = 1/(x² + y² + 1)² · ( (x² + y² + 1) − 2(x + y)x , (x² + y² + 1) − 2(x + y)y )^T.

Stationary points are those satisfying:
x² + 2xy − y² = 1,
y² + 2xy − x² = 1.
Adding the two equations gives 4xy = 2, and subtracting them gives x² = y²; hence the stationary points are (1/√2, 1/√2) and (−1/√2, −1/√2).
(1/√2, 1/√2) - global maximum, (−1/√2, −1/√2) - global minimum.
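
The same system can be solved symbolically (my addition; a sketch assuming SymPy is available):

    import sympy as sp

    # Solve the stationarity system grad f = 0 for this example.
    x, y = sp.symbols('x y', real=True)
    f = (x + y) / (x**2 + y**2 + 1)
    grad = [sp.diff(f, v) for v in (x, y)]
    print(sp.solve(grad, [x, y]))  # the two stationary points (+-1/sqrt(2), +-1/sqrt(2))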

Classification of Matrices - Positive Definiteness

1. A symmetric matrix A ∈ R^{n×n} is called positive semidefinite, denoted by A ⪰ 0, if x^T A x ≥ 0 for every x ∈ R^n.
2. A symmetric matrix A ∈ R^{n×n} is called positive definite, denoted by A ≻ 0, if x^T A x > 0 for every x ∈ R^n, x ≠ 0.

Example 1: In class
A = [ 2  1
      1  1 ].

Example 2: In class
B = [ 1  2
      2  1 ].

Lemma: Let A be a positive definite (semidefinite) matrix. Then the diagonal elements of A are positive (nonnegative).
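
A direct way to probe these definitions (my addition; the full classification is done in class) is to evaluate the quadratic form x^T A x at a few vectors; for B a single vector already exposes a negative value:

    import numpy as np

    A = np.array([[2.0, 1.0], [1.0, 1.0]])
    B = np.array([[1.0, 2.0], [2.0, 1.0]])

    def quad(M, v):
        # quadratic form v^T M v
        return v @ M @ v

    print(quad(A, np.array([1.0, -1.0])))  # 2 - 2 + 1 = 1 > 0
    print(quad(B, np.array([1.0, 1.0])))   # 1 + 4 + 1 = 6 > 0
    print(quad(B, np.array([1.0, -1.0])))  # 1 - 4 + 1 = -2 < 0, so B is indefinite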

Negative (Semi)Definiteness, Indefiniteness

1. A symmetric matrix A ∈ R^{n×n} is called negative semidefinite, denoted by A ⪯ 0, if x^T A x ≤ 0 for every x ∈ R^n.
2. A symmetric matrix A ∈ R^{n×n} is called negative definite, denoted by A ≺ 0, if x^T A x < 0 for every x ∈ R^n, x ≠ 0.
3. A symmetric matrix A ∈ R^{n×n} is called indefinite if there exist x, y ∈ R^n such that x^T A x > 0 and y^T A y < 0.

Remarks:
- A is negative (semi)definite if and only if −A is positive (semi)definite.
- A matrix is indefinite if and only if it is neither positive semidefinite nor negative semidefinite.

Eigenvalue Characterization

Theorem. Let A be a symmetric n × n matrix. Then
(a) A is positive definite iff all its eigenvalues are positive.
(b) A is positive semidefinite iff all its eigenvalues are nonnegative.
(c) A is negative definite iff all its eigenvalues are negative.
(d) A is negative semidefinite iff all its eigenvalues are nonpositive.
(e) A is indefinite iff it has at least one positive eigenvalue and at least one negative eigenvalue.
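
The theorem translates directly into a small numerical test (my addition; uses NumPy's eigvalsh, which is intended for symmetric matrices):

    import numpy as np

    def classify(A, tol=1e-10):
        # Classify a symmetric matrix by the signs of its eigenvalues.
        eig = np.linalg.eigvalsh(A)
        if np.all(eig > tol):
            return "positive definite"
        if np.all(eig >= -tol):
            return "positive semidefinite"
        if np.all(eig < -tol):
            return "negative definite"
        if np.all(eig <= tol):
            return "negative semidefinite"
        return "indefinite"

    print(classify(np.array([[2.0, 1.0], [1.0, 1.0]])))  # positive definite
    print(classify(np.array([[1.0, 2.0], [2.0, 1.0]])))  # indefinite (eigenvalues 3, -1)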

Eigenvalue Characterization Proof

Proof of part (a) (the other parts follow immediately or by similar arguments):
There exists an orthogonal U ∈ R^{n×n} such that
U^T A U = D = diag(d_1, d_2, ..., d_n),
where d_i = λ_i(A). Making the linear change of variables x = Uy, we have
x^T A x = y^T U^T A U y = y^T D y = Σ_{i=1}^n d_i y_i².
Therefore,
x^T A x > 0 for all x ≠ 0 iff Σ_{i=1}^n d_i y_i² > 0 for any y ≠ 0. (1)
(1) holds iff d_i > 0 for all i (why?)

Trace and Determinant

Corollary. Let A be a positive semidefinite (definite) matrix. Then Tr(A) and det(A) are nonnegative (positive).
Proof. In class

Proposition. Let A be a symmetric 2 × 2 matrix. Then A is positive semidefinite (definite) if and only if Tr(A) ≥ 0 and det(A) ≥ 0 (Tr(A) > 0 and det(A) > 0).
Proof. In class
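
A worked check (my addition): for A = [ 2 1 ; 1 1 ] from the earlier example, Tr(A) = 3 > 0 and det(A) = 2·1 − 1·1 = 1 > 0, so the proposition confirms that A is positive definite; for B = [ 1 2 ; 2 1 ], det(B) = 1 − 4 = −3 < 0, so B is not positive semidefinite.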

Example

Classify the matrices
A = [ 4  1
      1  3 ],   B = [ 1  1  1
                      1  1  1
                      1  1  0.1 ].
In class

The Principal Minors Criteria

Definition. Given an n × n matrix A, the determinant of the upper-left k × k submatrix is called the k-th principal minor and is denoted by D_k(A).

Example. For
A = [ a11  a12  a13
      a21  a22  a23
      a31  a32  a33 ],
D_1(A) = a11, D_2(A) = det [ a11  a12 ; a21  a22 ], D_3(A) = det(A).

Principal Minors Criteria. Let A be an n × n symmetric matrix. Then A is positive definite if and only if D_1(A) > 0, D_2(A) > 0, ..., D_n(A) > 0.
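
A sketch of the criterion in code (my addition; assumes NumPy, and uses the matrix A from the next example):

    import numpy as np

    def leading_principal_minors(A):
        # D_k(A) = det of the upper-left k x k submatrix, k = 1, ..., n
        n = A.shape[0]
        return [np.linalg.det(A[:k, :k]) for k in range(1, n + 1)]

    A = np.array([[4.0, 2.0, 3.0], [2.0, 3.0, 2.0], [3.0, 2.0, 4.0]])
    print(leading_principal_minors(A))  # approx [4.0, 8.0, 13.0]: all positive, so A is positive definite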

Examples

Classify the matrices
A = [ 4  2  3
      2  3  2
      3  2  4 ],

B = [ 2  2  2
      2  2  2
      2  2  1 ],

C = [ 4  1  1
      1  4  1
      1  1  4 ].
In class

Diagonal Dominance

Definition (diagonally dominant matrices). Let A be a symmetric n × n matrix.
(a) A is called diagonally dominant if |A_ii| ≥ Σ_{j≠i} |A_ij| for all i = 1, 2, ..., n.
(b) A is called strictly diagonally dominant if |A_ii| > Σ_{j≠i} |A_ij| for all i = 1, 2, ..., n.

Theorem (positive (semi)definiteness of diagonally dominant matrices).
(a) If A is symmetric, diagonally dominant with nonnegative diagonal elements, then A is positive semidefinite.
(b) If A is symmetric, strictly diagonally dominant with positive diagonal elements, then A is positive definite.
See the proof of Theorem 2.25 on pages 22-23.
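
The definition is a simple row-wise test (my addition; a sketch assuming NumPy, applied here to the matrix C of the previous example):

    import numpy as np

    def is_diagonally_dominant(A, strict=False):
        # Compare |A_ii| with the sum of |A_ij|, j != i, row by row.
        d = np.abs(np.diag(A))
        off = np.abs(A).sum(axis=1) - d
        return bool(np.all(d > off)) if strict else bool(np.all(d >= off))

    C = np.array([[4.0, 1.0, 1.0], [1.0, 4.0, 1.0], [1.0, 1.0, 4.0]])
    print(is_diagonally_dominant(C, strict=True))  # True: 4 > 1 + 1 in every row,
    # so by part (b) of the theorem C is positive definite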

Necessary Second Order Optimality Conditions

Theorem. Let f : U → R be a function defined on an open set U ⊆ R^n. Suppose that f is twice continuously differentiable over U and that x* is a stationary point. Then
1. if x* is a local minimum point, then ∇²f(x*) ⪰ 0.
2. if x* is a local maximum point, then ∇²f(x*) ⪯ 0.

Proof of 1: There exists a ball B(x*, r) ⊆ U for which f(x) ≥ f(x*) for all x ∈ B(x*, r). Let d ∈ R^n be a nonzero vector. For any 0 < α < r/‖d‖, we have x_α := x* + αd ∈ B(x*, r), and hence, for any such α, f(x_α) ≥ f(x*). On the other hand, there exists a vector z_α ∈ [x*, x_α] such that
f(x_α) − f(x*) = (α²/2) d^T ∇²f(z_α) d. (2)
Hence, for any α ∈ (0, r/‖d‖) the inequality d^T ∇²f(z_α) d ≥ 0 holds. Since z_α → x* as α → 0⁺, we obtain that d^T ∇²f(x*) d ≥ 0, i.e., ∇²f(x*) ⪰ 0.

Sufficient Second Order Optimality Conditions

Theorem. Let f : U → R be a function defined on an open set U ⊆ R^n. Suppose that f is twice continuously differentiable over U and that x* is a stationary point. Then
1. if ∇²f(x*) ≻ 0, then x* is a strict local minimum point of f over U.
2. if ∇²f(x*) ≺ 0, then x* is a strict local maximum point of f over U.

Proof of 1 (2 follows directly): By continuity of the Hessian, there exists a ball B(x*, r) ⊆ U for which ∇²f(x) ≻ 0 for any x ∈ B(x*, r). By the linear approximation theorem, for any x ∈ B(x*, r) there exists a vector z_x ∈ [x*, x] (and hence z_x ∈ B(x*, r)) for which
f(x) − f(x*) = (1/2)(x − x*)^T ∇²f(z_x)(x − x*).
Since ∇²f(z_x) ≻ 0, for any x ∈ B(x*, r) with x ≠ x* the inequality f(x) > f(x*) holds, implying that x* is a strict local minimum point of f over U.
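
In computations, these conditions become an eigenvalue test on the Hessian at a stationary point (my addition; a sketch assuming the Hessian is available as a symmetric matrix, with the saddle case justified by the indefiniteness criterion below):

    import numpy as np

    def classify_stationary_point(H, tol=1e-10):
        # H: Hessian of f at a stationary point x*, given as a symmetric matrix.
        eig = np.linalg.eigvalsh(H)
        if np.all(eig > tol):
            return "strict local minimum"
        if np.all(eig < -tol):
            return "strict local maximum"
        if eig.min() < -tol and eig.max() > tol:
            return "saddle point"
        return "inconclusive (semidefinite Hessian)"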

Saddle Points

Definition. Let f : U → R be a continuously differentiable function defined on an open set U ⊆ R^n. A stationary point x* ∈ U is called a saddle point of f over U if it is neither a local minimum point nor a local maximum point of f over U.

Sufficient Condition for Saddle Points

Theorem. Let f : U → R be a function defined on an open set U ⊆ R^n. Suppose that f is twice continuously differentiable over U and that x* is a stationary point. If ∇²f(x*) is an indefinite matrix, then x* is a saddle point of f over U.

Proof. ∇²f(x*) has at least one positive eigenvalue λ > 0, corresponding to a normalized eigenvector denoted by v. Since U is open, there exists r > 0 such that x* + αv ∈ U for any α ∈ (0, r). By the quadratic approximation theorem, there exists a function g : R₊₊ → R satisfying
g(t)/t → 0 as t → 0,
such that for any α ∈ (0, r),
f(x* + αv) = f(x*) + (λα²/2)‖v‖² + g(‖v‖²α²).

Proof Contd.

Since ‖v‖ = 1,
f(x* + αv) = f(x*) + (λ/2)α² + g(α²).
Since g(t)/t → 0 as t → 0, there exists ε₁ ∈ (0, r) such that g(α²) > −(λ/2)α² for all α ∈ (0, ε₁). Hence f(x* + αv) > f(x*) for all α ∈ (0, ε₁), so x* cannot be a local maximum point of f over U. Similarly, using a negative eigenvalue, x* cannot be a local minimum point of f over U; it is therefore a saddle point.

Attainment of Minimal/Maximal Points

Weierstrass Theorem. Let f be a continuous function defined over a nonempty compact set C ⊆ R^n. Then there exists a global minimum point of f over C and a global maximum point of f over C.

When the underlying set is not compact, the Weierstrass theorem does not guarantee the attainment of the solution, but certain properties of the function f can imply attainment of the solution.

Definition. Let f : R^n → R be a continuous function over R^n. f is called coercive if
lim_{‖x‖→∞} f(x) = ∞.

Theorem [Attainment of Global Optima Points for Coercive Functions]. Let f : R^n → R be a continuous and coercive function and let S ⊆ R^n be a nonempty closed set. Then f attains a global minimum point on S.
Proof. In class
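
For instance (my addition): f(x) = ‖x‖² is coercive, so it attains a minimum on every nonempty closed set, whereas f(x) = e^{x₁} is continuous and bounded below on R^n but not coercive, and it attains no minimum over the closed set S = R^n.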

Example

Classify the stationary points of the function f(x₁, x₂) = −2x₁² + x₁x₂² + 4x₁⁴.

∇f(x) = ( −4x₁ + x₂² + 16x₁³ , 2x₁x₂ )^T.

Stationary points are the solutions to
−4x₁ + x₂² + 16x₁³ = 0, 2x₁x₂ = 0.
The stationary points are (0, 0), (0.5, 0), (−0.5, 0).

∇²f(x₁, x₂) = [ −4 + 48x₁²  2x₂
                2x₂          2x₁ ].

∇²f(0.5, 0) = [ 8  0 ; 0  1 ] ≻ 0 → strict local minimum.
∇²f(−0.5, 0) = [ 8  0 ; 0  −1 ] indefinite → saddle point.
∇²f(0, 0) = [ −4  0 ; 0  0 ] → saddle point (why?)
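
Running the three Hessians through the eigenvalue test sketched earlier reproduces this classification (my addition; note that for (0, 0) the eigenvalue test alone is inconclusive, which is exactly why the slide asks "why?"):

    import numpy as np

    for H in ([[8.0, 0.0], [0.0, 1.0]],
              [[8.0, 0.0], [0.0, -1.0]],
              [[-4.0, 0.0], [0.0, 0.0]]):
        print(np.linalg.eigvalsh(np.array(H)))
    # [1. 8.]   -> positive definite: strict local minimum
    # [-1. 8.]  -> indefinite: saddle point
    # [-4. 0.]  -> only negative semidefinite: second order tests are inconclusive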

Illustration (surface plot of f; figure not reproduced in this transcription).

Global Optimality Conditions

No free meals...

Theorem. Let f be a twice continuously differentiable function defined over R^n. Suppose that ∇²f(x) ⪰ 0 for any x ∈ R^n. Let x* ∈ R^n be a stationary point of f. Then x* is a global minimum point of f.
Proof. In class

Example

f(x) = x₁² + x₂² + x₃² + x₁x₂ + x₁x₃ + x₂x₃ + (x₁² + x₂² + x₃²)².

∇f(x) = ( 2x₁ + x₂ + x₃ + 4x₁(x₁² + x₂² + x₃²)
          2x₂ + x₁ + x₃ + 4x₂(x₁² + x₂² + x₃²)
          2x₃ + x₁ + x₂ + 4x₃(x₁² + x₂² + x₃²) ).

∇²f(x) = [ 2 + 4(x₁²+x₂²+x₃²) + 8x₁²   1 + 8x₁x₂                     1 + 8x₁x₃
           1 + 8x₁x₂                    2 + 4(x₁²+x₂²+x₃²) + 8x₂²    1 + 8x₂x₃
           1 + 8x₁x₃                    1 + 8x₂x₃                     2 + 4(x₁²+x₂²+x₃²) + 8x₃² ].

x = 0 is a stationary point, and ∇²f(x) ⪰ 0 for all x. (why?)
Consequence: x = 0 is the global minimum point.
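
One can probe the "(why?)" numerically (my addition; a random-sample check, not a proof; the clean argument writes the Hessian as a sum of positive semidefinite matrices):

    import numpy as np

    # Sample random points and track the smallest Hessian eigenvalue.
    rng = np.random.default_rng(0)

    def hessian(x):
        s = x @ x  # x1^2 + x2^2 + x3^2
        return (np.array([[2.0, 1.0, 1.0],
                          [1.0, 2.0, 1.0],
                          [1.0, 1.0, 2.0]])
                + 4.0 * s * np.eye(3) + 8.0 * np.outer(x, x))

    print(min(np.linalg.eigvalsh(hessian(rng.normal(size=3))).min()
              for _ in range(1000)))  # stays >= 1, consistent with PSD for all x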

Quadratic Functions

A quadratic function over R^n is a function of the form
f(x) = x^T A x + 2b^T x + c,
where A ∈ R^{n×n} is symmetric, b ∈ R^n and c ∈ R.
∇f(x) = 2Ax + 2b, ∇²f(x) = 2A.

Consequently,
Lemma. Let f(x) = x^T A x + 2b^T x + c (A ∈ R^{n×n} symmetric, b ∈ R^n, c ∈ R).
1. x is a stationary point of f iff Ax = −b.
2. if A ⪰ 0, then x is a global minimum point of f iff Ax = −b.
3. if A ≻ 0, then x = −A⁻¹b is a strict global minimum point of f.
Proof. In class
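
A minimal sketch of point 3 (my addition; the matrix and vector here are made up for illustration):

    import numpy as np

    # Hypothetical data: A is positive definite, so f has a unique minimizer.
    A = np.array([[2.0, 1.0], [1.0, 2.0]])
    b = np.array([1.0, -1.0])

    x_star = np.linalg.solve(A, -b)  # x* = -A^{-1} b, without forming the inverse
    grad = 2 * A @ x_star + 2 * b    # gradient at x*, should vanish
    print(x_star, grad)              # x* = [-1.  1.], grad = [0.  0.]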

Two Important Theorems on Quadratic Functions

Lemma [coerciveness of quadratic functions]. Let f(x) = x^T A x + 2b^T x + c, where A ∈ R^{n×n} is symmetric, b ∈ R^n and c ∈ R. Then f is coercive if and only if A ≻ 0.
Lemma 2.42 in the book (proof on page 33).

Theorem [characterization of the nonnegativity of quadratic functions]. Let f(x) = x^T A x + 2b^T x + c, where A ∈ R^{n×n} is symmetric, b ∈ R^n and c ∈ R. Then the following two claims are equivalent:
(i) f(x) ≡ x^T A x + 2b^T x + c ≥ 0 for all x ∈ R^n.
(ii) [ A    b
      b^T  c ] ⪰ 0.
Theorem 2.43 in the book (proof on pages 33-34).
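
The theorem gives a one-line numerical test for nonnegativity of a quadratic (my addition; the data is made up for illustration):

    import numpy as np

    # Test f(x) = x^T A x + 2 b^T x + c >= 0 for all x via the
    # (n+1) x (n+1) block matrix [[A, b], [b^T, c]].
    A = np.array([[2.0, 0.0], [0.0, 1.0]])
    b = np.array([1.0, 0.0])
    c = 1.0

    M = np.block([[A, b[:, None]], [b[None, :], np.array([[c]])]])
    print(np.linalg.eigvalsh(M).min() >= -1e-12)  # True: f is nonnegative everywhere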