Math (P)refresher Lecture 8: Unconstrained Optimization

September 2006

Today's Topics: Quadratic Forms; Definiteness of Quadratic Forms; Maxima and Minima in $\mathbb{R}^n$; First Order Conditions; Second Order Conditions; Global Maxima and Minima

1 Quadratic Forms

Quadratic forms are important because:
1. They approximate the local curvature around a point, e.g., to identify whether a critical point is a max, min, or saddle point.
2. They are simple, and hence easy to deal with.
3. They have a matrix representation.

Quadratic Form: A polynomial in which each term is a monomial of degree 2 (the sum of exponents is 2 for every summand):
\[ Q(x_1, \dots, x_n) = \sum_{i \le j} a_{ij} x_i x_j \]
which can be written in matrix terms as
\[ Q(\mathbf{x}) = \begin{pmatrix} x_1 & x_2 & \cdots & x_n \end{pmatrix}
\begin{pmatrix}
a_{11} & \tfrac{1}{2}a_{12} & \cdots & \tfrac{1}{2}a_{1n} \\
\tfrac{1}{2}a_{12} & a_{22} & \cdots & \tfrac{1}{2}a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
\tfrac{1}{2}a_{1n} & \tfrac{1}{2}a_{2n} & \cdots & a_{nn}
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \]
or, compactly, $Q(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}$.

Examples:
1. Quadratic on $\mathbb{R}^2$:
\[ Q(x_1, x_2) = \begin{pmatrix} x_1 & x_2 \end{pmatrix}
\begin{pmatrix} a_{11} & \tfrac{1}{2}a_{12} \\ \tfrac{1}{2}a_{12} & a_{22} \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
= a_{11}x_1^2 + a_{12}x_1x_2 + a_{22}x_2^2 \]
2. Quadratic on $\mathbb{R}^3$:
\[ Q(x_1, x_2, x_3) = \begin{pmatrix} x_1 & x_2 & x_3 \end{pmatrix}
\begin{pmatrix}
a_{11} & \tfrac{1}{2}a_{12} & \tfrac{1}{2}a_{13} \\
\tfrac{1}{2}a_{12} & a_{22} & \tfrac{1}{2}a_{23} \\
\tfrac{1}{2}a_{13} & \tfrac{1}{2}a_{23} & a_{33}
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
= a_{11}x_1^2 + a_{22}x_2^2 + a_{33}x_3^2 + a_{12}x_1x_2 + a_{13}x_1x_3 + a_{23}x_2x_3 \]

(Much of the material and examples for this lecture are taken from Simon & Blume (1994), Mathematics for Economists, and Ecker & Kupferschmid (1988), Introduction to Operations Research.)
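The matrix representation is easy to sanity-check numerically. Here is a minimal NumPy sketch (not part of the original notes; the helper name and example coefficients are my own) that builds $A$ from the coefficients $a_{ij}$ and confirms that $\mathbf{x}^T A \mathbf{x}$ agrees with the expanded polynomial:

```python
import numpy as np

def quadratic_form_matrix(coef):
    """Build the symmetric matrix A of Q(x) = x^T A x from polynomial
    coefficients coef[(i, j)] = a_ij (0-indexed, i <= j): a_ii goes on
    the diagonal, a_ij (i < j) is split as a_ij/2 off the diagonal."""
    n = 1 + max(j for _, j in coef)
    A = np.zeros((n, n))
    for (i, j), a in coef.items():
        if i == j:
            A[i, i] = a
        else:
            A[i, j] = A[j, i] = a / 2.0
    return A

# Q(x1, x2) = 1*x1^2 + 4*x1*x2 + 3*x2^2 on R^2
A = quadratic_form_matrix({(0, 0): 1.0, (0, 1): 4.0, (1, 1): 3.0})
x = np.array([2.0, -1.0])
print(x @ A @ x)                            # matrix form:   -1.0
print(1*x[0]**2 + 4*x[0]*x[1] + 3*x[1]**2)  # expanded form: -1.0
```

Splitting each cross coefficient $a_{ij}$ in half is what makes $A$ symmetric, which the definiteness tests below assume.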

2 Definiteness of Quadratic Forms

Definiteness helps identify the curvature of $Q(\mathbf{x})$ at $\mathbf{x}$.

Definiteness: By definition, $Q(\mathbf{x}) = 0$ at $\mathbf{x} = 0$. The definiteness of the matrix $A$ is determined by whether the quadratic form $Q(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}$ is greater than zero, less than zero, or sometimes both over all $\mathbf{x} \neq 0$:

1. Positive Definite: $\mathbf{x}^T A \mathbf{x} > 0$, $\forall \mathbf{x} \neq 0$ (Min)
2. Positive Semidefinite: $\mathbf{x}^T A \mathbf{x} \geq 0$, $\forall \mathbf{x} \neq 0$
3. Negative Definite: $\mathbf{x}^T A \mathbf{x} < 0$, $\forall \mathbf{x} \neq 0$ (Max)
4. Negative Semidefinite: $\mathbf{x}^T A \mathbf{x} \leq 0$, $\forall \mathbf{x} \neq 0$
5. Indefinite: $\mathbf{x}^T A \mathbf{x} > 0$ for some $\mathbf{x} \neq 0$ and $\mathbf{x}^T A \mathbf{x} < 0$ for other $\mathbf{x} \neq 0$ (Neither)

Examples:
1. Positive Definite:
\[ Q(\mathbf{x}) = \mathbf{x}^T \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \mathbf{x} = x_1^2 + x_2^2 \]
2. Positive Semidefinite:
\[ Q(\mathbf{x}) = \mathbf{x}^T \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} \mathbf{x} = (x_1 - x_2)^2 \]
3. Indefinite:
\[ Q(\mathbf{x}) = \mathbf{x}^T \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \mathbf{x} = x_1^2 - x_2^2 \]
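The next section gives the notes' principal-minor test; an equivalent characterization for a symmetric matrix, convenient to check numerically, reads the signs of its eigenvalues (all positive means positive definite, and so on). A sketch, with my own function name and tolerance:

```python
import numpy as np

def definiteness(A, tol=1e-10):
    """Classify a symmetric matrix by the signs of its eigenvalues."""
    eig = np.linalg.eigvalsh(A)   # eigenvalues of a symmetric matrix
    if np.all(eig > tol):
        return "positive definite"
    if np.all(eig >= -tol):
        return "positive semidefinite"
    if np.all(eig < -tol):
        return "negative definite"
    if np.all(eig <= tol):
        return "negative semidefinite"
    return "indefinite"

# The three examples from the notes:
print(definiteness(np.array([[1.0, 0.0], [0.0, 1.0]])))    # positive definite
print(definiteness(np.array([[1.0, -1.0], [-1.0, 1.0]])))  # positive semidefinite
print(definiteness(np.array([[1.0, 0.0], [0.0, -1.0]])))   # indefinite
```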

3 Test for Definiteness Using Principal Minors

Given an $n \times n$ matrix $A$, the $k$th order principal minors are the determinants of the $k \times k$ submatrices along the diagonal obtained by deleting $n-k$ columns and the same $n-k$ rows from $A$.

Example: For a $3 \times 3$ matrix $A$,
1. First order principal minors: $a_{11}$, $a_{22}$, $a_{33}$
2. Second order principal minors:
\[ \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}, \qquad \begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix}, \qquad \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} \]
3. Third order principal minor: $|A|$

Define the $k$th leading principal minor $M_k$ as the determinant of the $k \times k$ submatrix obtained by deleting the last $n-k$ rows and columns from $A$.

Example: For a $3 \times 3$ matrix $A$, the three leading principal minors are
\[ M_1 = a_{11}, \qquad M_2 = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}, \qquad M_3 = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} \]

Algorithm: If $A$ is an $n \times n$ symmetric matrix, then
1. $M_k > 0$ for all $k = 1, \dots, n$ $\implies$ Positive Definite
2. $M_k < 0$ for odd $k$ and $M_k > 0$ for even $k$ $\implies$ Negative Definite
3. $M_k \neq 0$ for all $k = 1, \dots, n$, but the signs fit neither pattern 1 nor pattern 2 $\implies$ Indefinite

If some leading principal minor is zero, but all others fit the pattern of conditions 1 or 2, then check every principal minor, not just the leading ones:
1. Every principal minor $\geq 0$ $\implies$ Positive Semidefinite
2. Every principal minor of odd order $\leq 0$ and every principal minor of even order $\geq 0$ $\implies$ Negative Semidefinite
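A minimal NumPy sketch of the strict part of this algorithm (names are mine; it deliberately returns None when some $M_k$ is zero, since the semidefinite cases require checking all principal minors, not just the leading ones):

```python
import numpy as np

def leading_principal_minors(A):
    """M_k = det of the top-left k x k submatrix, k = 1..n."""
    n = A.shape[0]
    return [np.linalg.det(A[:k, :k]) for k in range(1, n + 1)]

def classify_by_minors(A, tol=1e-10):
    """Strict-definiteness part of the leading-minor test."""
    M = leading_principal_minors(A)
    if any(abs(m) < tol for m in M):
        return None  # a zero minor: fall back to all principal minors
    if all(m > 0 for m in M):
        return "positive definite"
    # negative definite pattern: odd-order minors < 0, even-order > 0
    if all((m < 0) if (k % 2 == 1) else (m > 0)
           for k, m in enumerate(M, start=1)):
        return "negative definite"
    return "indefinite"

print(classify_by_minors(np.array([[2.0, 0.0], [0.0, 2.0]])))   # positive definite
print(classify_by_minors(np.array([[1.0, 0.0], [0.0, -1.0]])))  # indefinite
```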

4 Maxima and Minima in R^n

Conditions for Extrema: The conditions for extrema are similar to those for functions on $\mathbb{R}$. Let $f(\mathbf{x})$ be a function of $n$ variables, and let $B(\mathbf{x}^*, \epsilon)$ be the $\epsilon$-ball about the point $\mathbf{x}^*$. Then
1. $f(\mathbf{x}^*) > f(\mathbf{x})$, $\forall \mathbf{x} \in B(\mathbf{x}^*, \epsilon)$, $\mathbf{x} \neq \mathbf{x}^*$ $\implies$ Strict Local Max
2. $f(\mathbf{x}^*) \geq f(\mathbf{x})$, $\forall \mathbf{x} \in B(\mathbf{x}^*, \epsilon)$ $\implies$ Local Max
3. $f(\mathbf{x}^*) < f(\mathbf{x})$, $\forall \mathbf{x} \in B(\mathbf{x}^*, \epsilon)$, $\mathbf{x} \neq \mathbf{x}^*$ $\implies$ Strict Local Min
4. $f(\mathbf{x}^*) \leq f(\mathbf{x})$, $\forall \mathbf{x} \in B(\mathbf{x}^*, \epsilon)$ $\implies$ Local Min

5 First Order Conditions

When we examined functions of one variable $x$, we found critical points by taking the first derivative, setting it to zero, and solving for $x$. For functions of $n$ variables, the critical points are found in much the same way, except now we set all of the partial derivatives equal to zero.

Given a function $f(\mathbf{x})$ in $n$ variables, the gradient $\nabla f(\mathbf{x})$ is a column vector whose $i$th element is the partial derivative of $f(\mathbf{x})$ with respect to $x_i$:
\[ \nabla f(\mathbf{x}) = \begin{pmatrix} \dfrac{\partial f(\mathbf{x})}{\partial x_1} \\ \dfrac{\partial f(\mathbf{x})}{\partial x_2} \\ \vdots \\ \dfrac{\partial f(\mathbf{x})}{\partial x_n} \end{pmatrix} \]
$\mathbf{x}^*$ is a critical point iff $\nabla f(\mathbf{x}^*) = 0$. (We will only consider critical points on the interior of a function's domain.)

Example: Find the critical points of $f(\mathbf{x}) = (x_1 - 1)^2 + x_2^2 + 1$. The partial derivatives of $f(\mathbf{x})$ are
\[ \frac{\partial f}{\partial x_1} = 2(x_1 - 1), \qquad \frac{\partial f}{\partial x_2} = 2x_2 \]
Setting each partial equal to zero and solving for $x_1$ and $x_2$, we find that there is a critical point at $\mathbf{x}^* = (1, 0)$.

6 Second Order Conditions

When we found a critical point for a function of one variable, we used the second derivative as an indicator of the curvature at the point in order to determine whether the point was a min, max, or saddle. For functions of $n$ variables, we use the second order partial derivatives as an indicator of curvature.

Given a function $f(\mathbf{x})$ of $n$ variables, the Hessian $H(\mathbf{x})$ is an $n \times n$ matrix whose $(i, j)$th element is the second order partial derivative of $f(\mathbf{x})$ with respect to $x_i$ and $x_j$:
\[ H(\mathbf{x}) = \begin{pmatrix}
\dfrac{\partial^2 f(\mathbf{x})}{\partial x_1^2} & \dfrac{\partial^2 f(\mathbf{x})}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f(\mathbf{x})}{\partial x_1 \partial x_n} \\
\dfrac{\partial^2 f(\mathbf{x})}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f(\mathbf{x})}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f(\mathbf{x})}{\partial x_2 \partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial^2 f(\mathbf{x})}{\partial x_n \partial x_1} & \dfrac{\partial^2 f(\mathbf{x})}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f(\mathbf{x})}{\partial x_n^2}
\end{pmatrix} \]
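To make the gradient and Hessian computations concrete, here is a short SymPy sketch (a sketch assuming SymPy is available, not part of the original notes); it reproduces the Section 5 example and the Hessian used in the example on the next page:

```python
import sympy as sp

x1, x2 = sp.symbols("x1 x2")
f = (x1 - 1)**2 + x2**2 + 1

grad = [sp.diff(f, v) for v in (x1, x2)]    # gradient components
print(grad)                                 # [2*x1 - 2, 2*x2]
print(sp.solve(grad, (x1, x2), dict=True))  # [{x1: 1, x2: 0}], i.e. x* = (1, 0)
print(sp.hessian(f, (x1, x2)))              # Matrix([[2, 0], [0, 2]])
```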

Curvature and the Taylor Polynomial as a Quadratic Form: The Hessian is used in a Taylor polynomial approximation to $f(\mathbf{x})$ and provides information about the curvature of $f(\mathbf{x})$ at $\mathbf{x}$, which, for example, tells us whether a critical point $\mathbf{x}^*$ is a min, max, or saddle point.

1. The second order Taylor polynomial about the critical point $\mathbf{x}^*$ is
\[ f(\mathbf{x}^* + \mathbf{h}) = f(\mathbf{x}^*) + \nabla f(\mathbf{x}^*)^T \mathbf{h} + \frac{1}{2} \mathbf{h}^T H(\mathbf{x}^*) \mathbf{h} + R(\mathbf{h}) \]
2. Since we're looking at a critical point, $\nabla f(\mathbf{x}^*) = 0$; and for small $\mathbf{h}$, $R(\mathbf{h})$ is negligible. Rearranging, we get
\[ f(\mathbf{x}^* + \mathbf{h}) - f(\mathbf{x}^*) \approx \frac{1}{2} \mathbf{h}^T H(\mathbf{x}^*) \mathbf{h} \]
3. The RHS is a quadratic form, so we can determine the definiteness of $H(\mathbf{x}^*)$:
(a) If $H(\mathbf{x}^*)$ is positive definite, then the RHS is positive for all small $\mathbf{h} \neq 0$:
\[ f(\mathbf{x}^* + \mathbf{h}) - f(\mathbf{x}^*) > 0 \implies f(\mathbf{x}^* + \mathbf{h}) > f(\mathbf{x}^*) \]
i.e., $f(\mathbf{x}^*) < f(\mathbf{x})$ for all $\mathbf{x} \neq \mathbf{x}^*$ in $B(\mathbf{x}^*, \epsilon)$, so $\mathbf{x}^*$ is a strict local min.
(b) Conversely, if $H(\mathbf{x}^*)$ is negative definite, then the RHS is negative for all small $\mathbf{h} \neq 0$:
\[ f(\mathbf{x}^* + \mathbf{h}) - f(\mathbf{x}^*) < 0 \implies f(\mathbf{x}^* + \mathbf{h}) < f(\mathbf{x}^*) \]
i.e., $f(\mathbf{x}^*) > f(\mathbf{x})$ for all $\mathbf{x} \neq \mathbf{x}^*$ in $B(\mathbf{x}^*, \epsilon)$, so $\mathbf{x}^*$ is a strict local max.

Summary of Second Order Conditions: Given a function $f(\mathbf{x})$ and a point $\mathbf{x}^*$ such that $\nabla f(\mathbf{x}^*) = 0$,
1. $H(\mathbf{x}^*)$ Positive Definite $\implies$ Strict Local Min
2. $H(\mathbf{x})$ Positive Semidefinite $\forall \mathbf{x} \in B(\mathbf{x}^*, \epsilon)$ $\implies$ Local Min
3. $H(\mathbf{x}^*)$ Negative Definite $\implies$ Strict Local Max
4. $H(\mathbf{x})$ Negative Semidefinite $\forall \mathbf{x} \in B(\mathbf{x}^*, \epsilon)$ $\implies$ Local Max
5. $H(\mathbf{x}^*)$ Indefinite $\implies$ Saddle Point

Example: We found that the only critical point of $f(\mathbf{x}) = (x_1 - 1)^2 + x_2^2 + 1$ is at $\mathbf{x}^* = (1, 0)$. Is it a min, max, or saddle point? Recall that the gradient of $f(\mathbf{x})$ is
\[ \nabla f(\mathbf{x}) = \begin{pmatrix} 2(x_1 - 1) \\ 2x_2 \end{pmatrix} \]
Then the Hessian is
\[ H(\mathbf{x}) = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} \]
To check the definiteness of $H(\mathbf{x}^*)$, we could use either of two methods:
(a) Determine whether $\mathbf{x}^T H(\mathbf{x}^*) \mathbf{x}$ is greater or less than zero for all $\mathbf{x} \neq 0$:
\[ \mathbf{x}^T H(\mathbf{x}^*) \mathbf{x} = \begin{pmatrix} x_1 & x_2 \end{pmatrix} \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = 2x_1^2 + 2x_2^2 \]
For any $\mathbf{x} \neq 0$, $2(x_1^2 + x_2^2) > 0$, so the Hessian is positive definite and $\mathbf{x}^*$ is a strict local minimum.
(b) Using the method of leading principal minors, we see that $M_1 = 2$ and $M_2 = 4$. Since both are positive, the Hessian is positive definite and $\mathbf{x}^*$ is a strict local minimum.
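A quick numeric illustration (mine, not from the notes) of step 2's approximation $f(\mathbf{x}^* + \mathbf{h}) - f(\mathbf{x}^*) \approx \frac{1}{2}\mathbf{h}^T H(\mathbf{x}^*)\mathbf{h}$, using the example just worked; for this particular $f$ the two sides agree exactly, since $f$ is itself quadratic and $R(\mathbf{h}) = 0$:

```python
import numpy as np

# f(x) = (x1 - 1)^2 + x2^2 + 1, critical point x* = (1, 0), H = 2I
f = lambda x: (x[0] - 1)**2 + x[1]**2 + 1
x_star = np.array([1.0, 0.0])
H = np.array([[2.0, 0.0], [0.0, 2.0]])

h = np.array([1e-3, -2e-3])      # a small step away from x*
lhs = f(x_star + h) - f(x_star)  # actual change in f
rhs = 0.5 * h @ H @ h            # quadratic-form prediction
print(lhs, rhs)                  # 5e-06 5e-06
```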

7 Global Maxima and Minima

To determine whether a critical point is a global min or max, we can check the concavity of the function over its entire domain. Here again we use the definiteness of the Hessian to determine whether a function is globally concave or convex:
1. $H(\mathbf{x})$ Positive Semidefinite $\forall \mathbf{x}$ $\implies$ Globally Convex
2. $H(\mathbf{x})$ Negative Semidefinite $\forall \mathbf{x}$ $\implies$ Globally Concave
Notice that the definiteness conditions must be satisfied over the entire domain.

Given a function $f(\mathbf{x})$ and a point $\mathbf{x}^*$ such that $\nabla f(\mathbf{x}^*) = 0$,
1. $f(\mathbf{x})$ Globally Convex $\implies$ Global Min
2. $f(\mathbf{x})$ Globally Concave $\implies$ Global Max

Note that showing that $H(\mathbf{x}^*)$ is negative semidefinite is not enough to guarantee that $\mathbf{x}^*$ is a local max. However, showing that $H(\mathbf{x})$ is negative semidefinite for all $\mathbf{x}$ guarantees that $\mathbf{x}^*$ is a global max. (The same goes for positive semidefinite and minima.)

Example: Take $f_1(x) = x^4$ and $f_2(x) = -x^4$. Both have $x = 0$ as a critical point. Unfortunately, $f_1''(0) = 0$ and $f_2''(0) = 0$, so we can't tell from the second derivative at the critical point whether $x = 0$ is a min or max for either. However, $f_1''(x) = 12x^2$ and $f_2''(x) = -12x^2$. For all $x$, $f_1''(x) \geq 0$ and $f_2''(x) \leq 0$, i.e., $f_1(x)$ is globally convex and $f_2(x)$ is globally concave. So $x = 0$ is a global min of $f_1(x)$ and a global max of $f_2(x)$.
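The $x^4$ example is easy to replicate symbolically; a small SymPy sketch (mine) of the same reasoning:

```python
import sympy as sp

x = sp.symbols("x", real=True)
f1, f2 = x**4, -x**4

# At the critical point the second derivative is uninformative ...
print(sp.diff(f1, x, 2).subs(x, 0))  # 0
print(sp.diff(f2, x, 2).subs(x, 0))  # 0
# ... but globally it is signed, which settles the question:
print(sp.diff(f1, x, 2))             # 12*x**2  >= 0 everywhere -> globally convex
print(sp.diff(f2, x, 2))             # -12*x**2 <= 0 everywhere -> globally concave
```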

8 One More Example

Given $f(\mathbf{x}) = x_1^3 - x_2^3 + 9x_1x_2$, find any maxima or minima.

1. First order conditions. Set the gradient equal to zero and solve for $x_1$ and $x_2$:
\[ \frac{\partial f}{\partial x_1} = 3x_1^2 + 9x_2 = 0, \qquad \frac{\partial f}{\partial x_2} = -3x_2^2 + 9x_1 = 0 \]
We have two equations in two unknowns. Solving for $x_1$ and $x_2$, we get two critical points: $\mathbf{x}_1^* = (0, 0)$ and $\mathbf{x}_2^* = (3, -3)$.

2. Second order conditions. Determine whether the Hessian is positive or negative definite at each critical point. The Hessian is
\[ H(\mathbf{x}) = \begin{pmatrix} 6x_1 & 9 \\ 9 & -6x_2 \end{pmatrix} \]
Evaluated at $\mathbf{x}_1^* = (0, 0)$,
\[ H(\mathbf{x}_1^*) = \begin{pmatrix} 0 & 9 \\ 9 & 0 \end{pmatrix} \]
The two leading principal minors are $M_1 = 0$ and $M_2 = -81$, so $H(\mathbf{x}_1^*)$ is indefinite and $\mathbf{x}_1^* = (0, 0)$ is a saddle point.
Evaluated at $\mathbf{x}_2^* = (3, -3)$,
\[ H(\mathbf{x}_2^*) = \begin{pmatrix} 18 & 9 \\ 9 & 18 \end{pmatrix} \]
The two leading principal minors are $M_1 = 18$ and $M_2 = 243$. Since both are positive, $H(\mathbf{x}_2^*)$ is positive definite and $\mathbf{x}_2^* = (3, -3)$ is a strict local min.

3. Global concavity/convexity. In evaluating the Hessians for $\mathbf{x}_1^*$ and $\mathbf{x}_2^*$ we saw that the Hessian is not everywhere positive semidefinite. Hence, we can't infer that $\mathbf{x}_2^* = (3, -3)$ is a global minimum. In fact, if we set $x_1 = 0$, then $f(\mathbf{x}) = -x_2^3$, which goes to $-\infty$ as $x_2 \to \infty$, so $f$ has no global minimum.
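To wrap up, the whole example can be reproduced in a few lines of SymPy (a sketch, assuming SymPy is available; the comments restate the hand computation above):

```python
import sympy as sp

x1, x2 = sp.symbols("x1 x2", real=True)
f = x1**3 - x2**3 + 9*x1*x2

# First order conditions: solve grad f = 0 over the reals.
grad = [sp.diff(f, v) for v in (x1, x2)]
crits = sp.solve(grad, (x1, x2), dict=True)
print(crits)  # the real solutions {x1: 0, x2: 0} and {x1: 3, x2: -3}

# Second order conditions: leading principal minors of the Hessian.
H = sp.hessian(f, (x1, x2))
for c in crits:
    Hc = H.subs(c)
    print(c, Hc[0, 0], Hc.det())  # (0,0):  M1 = 0,  M2 = -81 -> saddle
                                  # (3,-3): M1 = 18, M2 = 243 -> strict local min

# Not a global min: along x1 = 0, f = -x2^3 is unbounded below.
print(f.subs({x1: 0, x2: 100}))   # -1000000
```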