Optimization Theory. Lectures 4-6

Unconstrained Maximization Problem: Maximize a function $f:\mathbb{R}^n \to \mathbb{R}$ within a set $A \subseteq \mathbb{R}^n$. Typically, $A$ is $\mathbb{R}^n$ or the non-negative orthant $\{x \in \mathbb{R}^n \mid x \ge 0\}$.

Existence of a maximum: Theorem. If $A$ is compact (i.e., closed and bounded) and $f$ is continuous, then a maximum exists. Proof: For each $x \in A$, the set $\{z \in A \mid f(z) \ge f(x)\}$ is a closed (by continuity of $f$) subset of a compact set, hence compact. The intersection of any finite number of these sets is non-empty, since it contains whichever of the corresponding points $x$ has the largest value of $f$. Therefore, by the finite intersection property, the whole family has a non-empty intersection, and any point in that intersection is a maximizer. ∎
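A minimal numerical illustration (not part of the original notes; the function and interval are arbitrary choices): a continuous $f$ on a compact interval attains its maximum, which a fine grid search can approximate.

```python
import numpy as np

# Weierstrass existence, illustrated: f(x) = x * exp(-x) is continuous on
# the compact set A = [0, 2], so its maximum is attained (here at x = 1).
A = np.linspace(0.0, 2.0, 100001)   # fine grid over the compact set A
f = A * np.exp(-A)                  # continuous objective
i = np.argmax(f)
print(f"approximate maximizer x = {A[i]:.4f}, max value f(x) = {f[i]:.6f}")
```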

Uniqueness of a maximum: Def: A function $f$ is strictly concave on $A$ if $x, y \in A$, $x \ne y$, and $0 < \theta < 1$ imply $f(\theta x + (1-\theta)y) > \theta f(x) + (1-\theta)f(y)$. Theorem. If $A$ is convex, $f$ is strictly concave, and a maximum exists, then it is unique. Proof: If $x \ne y$ are both maxima, then $(x+y)/2 \in A$, and by strict concavity $f((x+y)/2)$ exceeds the common maximum value, a contradiction. ∎
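A one-line check of strict concavity (an added illustration, for $f(x) = -x^2$ on any convex $A \subseteq \mathbb{R}$):

```latex
\[
f(\theta x + (1-\theta)y) - \bigl[\theta f(x) + (1-\theta)f(y)\bigr]
  = \theta(1-\theta)(x-y)^2 > 0
  \quad\text{for } x \neq y,\; 0 < \theta < 1,
\]
```

so any maximum of $f$ on a convex set, e.g. $x = 0$ on $A = [-1, 2]$, is unique.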

A vector $y \in \mathbb{R}^n$ points into $A$ (from $x^o \in A$) if $x^o + \theta y \in A$ for all sufficiently small positive scalars $\theta$. Assume that $f$ is twice continuously differentiable, and let $f_x$ denote its vector of first derivatives and $f_{xx}$ its array of second derivatives. Assume that $x^o$ achieves a maximum of $f$ on $A$. Then a Taylor's expansion gives
\[
f(x^o) \ge f(x^o + \theta y) = f(x^o) + \theta f_x(x^o)\cdot y + (\theta^2/2)\, y' f_{xx}(x^o)\, y + R(\theta^2)
\]
for all $y$ that point into $A$ and small scalars $\theta > 0$, where $R(\varepsilon)$ is a remainder satisfying $\lim_{\varepsilon \to 0} R(\varepsilon)/\varepsilon = 0$.

First-Order Condition (FOC): $f_x(x^o)\cdot y \le 0$ for all $y$ that point into $A$ (this implies $f_x(x^o)\cdot y = 0$ when both $y$ and $-y$ point into $A$, and $f_x(x^o) = 0$ when $x^o$ is interior to $A$, so that all $y$ point into $A$). When $A$ is the nonnegative orthant, the FOC is $\partial f(x^o)/\partial x_i \le 0$, with $\partial f(x^o)/\partial x_i = 0$ if $x_i^o > 0$, for $i = 1,\dots,n$. Proof: In the Taylor's expansion, take $\theta$ sufficiently small that the quadratic term is negligible. ∎
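A small worked illustration (added; the functions are my own choices) of the FOC on $A = \{x \ge 0\}$ with $n = 1$, covering an interior and a corner solution:

```latex
\[
f(x) = \log(1+x) - \tfrac{x}{2}: \quad
f'(x) = \tfrac{1}{1+x} - \tfrac12 = 0 \;\Rightarrow\; x^o = 1 > 0
\quad\text{(interior: FOC holds with equality),}
\]
\[
f(x) = -x: \quad f'(0) = -1 \le 0,\; x^o = 0
\quad\text{(corner: FOC holds with strict inequality).}
\]
```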

Second-Order Condition (SOC): $y' f_{xx}(x^o)\, y \le 0$ for all $y$ pointing into $A$ with $f_x(x^o)\cdot y = 0$ (this implies $y' f_{xx}(x^o)\, y \le 0$ for all $y$ when $x^o$ is interior to $A$). The FOC and SOC are necessary at a maximum. If the FOC holds and a strict form of the SOC holds, $y' f_{xx}(x^o)\, y < 0$ for all $y \ne 0$ pointing into $A$ with $f_x(x^o)\cdot y = 0$, then $x^o$ is the unique maximum within some neighborhood of $x^o$. Proof: In the Taylor's expansion, take $\theta$ sufficiently small that the remainder term is negligible. ∎
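These conditions are easy to verify numerically. The sketch below (an added illustration; the quadratic objective is an arbitrary choice) checks the FOC and the strict SOC at an interior maximum by central differences:

```python
import numpy as np

# f(x1, x2) = -(x1 - 1)^2 - 2*(x2 + 0.5)^2 has its maximum at x^o = (1, -0.5).
def f(x):
    return -(x[0] - 1.0)**2 - 2.0 * (x[1] + 0.5)**2

xo = np.array([1.0, -0.5])
h = 1e-4

# Central-difference gradient: approximately zero at x^o (FOC).
grad = np.array([(f(xo + h * e) - f(xo - h * e)) / (2 * h) for e in np.eye(2)])

# Central-difference Hessian: negative definite at x^o (strict SOC).
hess = np.array([
    [(f(xo + h*ei + h*ej) - f(xo + h*ei - h*ej)
      - f(xo - h*ei + h*ej) + f(xo - h*ei - h*ej)) / (4 * h * h)
     for ej in np.eye(2)]
    for ei in np.eye(2)
])

print("gradient at x^o:", grad)                           # ~ [0, 0]
print("Hessian eigenvalues:", np.linalg.eigvalsh(hess))   # all < 0
```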

Inequality-Constrained Maximization
Suppose $g: A \times D \to \mathbb{R}$ and $h: A \times D \to \mathbb{R}^m$ are continuous functions on a convex set $A$, and define $B(y) = \{x \in A \mid h(x,y) \ge 0\}$ for each $y \in D$. Typically, $A$ is the nonnegative orthant of $\mathbb{R}^n$. Maximization in $x$ of $g(x,y)$ subject to the constraint $h(x,y) \ge 0$ is called a nonlinear mathematical programming problem.

Define $r(y) = \max_{x \in B(y)} g(x,y)$ and $f(y) = \operatorname{argmax}_{x \in B(y)} g(x,y)$. If $B(y)$ is non-empty and bounded, then it is also closed (by continuity of $h$, provided $A$ is closed), hence compact, guaranteeing that $r(y)$ and $f(y)$ exist.
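A numerical sketch of $r(y)$ and $f(y)$ (added illustration; $g$ and $h$ are arbitrary choices), with $A = \{x \ge 0\}$, $g(x,y) = x(2-x)$, and $h(x,y) = y - x$, so that $B(y) = [0, y]$:

```python
import numpy as np

# Value function r(y) = max_{x in B(y)} g(x, y) and argmax f(y) by grid search.
def r_and_f(y, npts=10001):
    xs = np.linspace(0.0, y, npts)      # grid over the compact set B(y) = [0, y]
    gs = xs * (2.0 - xs)                # g(x, y) = x*(2 - x)
    i = np.argmax(gs)
    return gs[i], xs[i]

for y in (0.5, 1.0, 2.0):
    r, f = r_and_f(y)
    print(f"y = {y}: r(y) = {r:.4f}, f(y) = {f:.4f}")
# For y < 1 the constraint binds, so f(y) = y; for y >= 1, f(y) = 1 and r(y) = 1.
```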

Define a Lagrangian $L(x,p,y) = g(x,y) + p \cdot h(x,y)$. A vector $(x^0,p^0,y)$ with $x^0 \in A$ and $p^0 \ge 0$ is a (global) Lagrangian Critical Point (LCP) at $y \in D$ if
\[
L(x,p^0,y) \le L(x^0,p^0,y) \le L(x^0,p,y) \quad \text{for all } x \in A \text{ and } p \ge 0.
\]
Note that an LCP is a saddle-point of the Lagrangian, which is maximized in $x$ at $x^0$ given $p^0$, and minimized in $p$ at $p^0$ given $x^0$. The variables in the vector $p$ are called Lagrangian multipliers or shadow prices.
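A worked saddle point (added illustration; the problem is my own choice): maximize $g(x) = 2\sqrt{x}$ subject to $h(x,y) = y - x \ge 0$ on $A = \{x \ge 0\}$, with $y > 0$ fixed.

```latex
\[
L(x,p,y) = 2\sqrt{x} + p\,(y - x), \qquad
\partial L/\partial x = 1/\sqrt{x} - p = 0 .
\]
\[
\text{Take } x^0 = y,\; p^0 = 1/\sqrt{y}: \quad
L(x,p^0,y) = 2\sqrt{x} + \frac{y-x}{\sqrt{y}} \le 2\sqrt{y}
  = L(x^0,p^0,y) = L(x^0,p,y)
\]
```

for all $x \ge 0$ and $p \ge 0$ (the first inequality by concavity of $L$ in $x$, the last equality because the constraint binds, so $h(x^0,y) = 0$). Thus $(x^0,p^0,y)$ is an LCP, and $p^0$ is the shadow price of the resource $y$.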

Example: Suppose $x$ is a vector of policy variables available to a firm, $g(x)$ is the firm's profit, and excess inventory of inputs is $h(x,y) = y - q(x)$, where $q(x)$ specifies the vector of input requirements for $x$. The firm must operate under the constraint that excess inventory is non-negative. The Lagrangian $L(x,p,y) = g(x) + p\cdot(y - q(x))$ can then be interpreted as the overall profit from operating the firm and selling off excess inventory at prices $p$. In this interpretation, an LCP determines the firm's implicit reservation prices for the inputs, the opportunity cost of selling inventory rather than using it in production.

The problem of minimizing in $p$ a function $t(p,y)$ subject to $v(p,y) \ge 0$ is the same as that of maximizing $-t(p,y)$ subject to this constraint, with the associated Lagrangian $-t(p,y) + x\cdot v(p,y)$ and shadow prices $x$. Defining $L(x,p,y) = t(p,y) - x\cdot v(p,y)$, an LCP for this constrained minimization problem is then $(x^0,p^0,y)$ such that $L(x,p^0,y) \le L(x^0,p^0,y) \le L(x^0,p,y)$ for all $x \ge 0$ and $p \ge 0$. Note that this is the same string of inequalities that defined an LCP for a constrained maximization problem. The definition of $L(x,p,y)$ is in general different in the two cases, but we next consider a problem where they coincide.

2.16.2. An important specialization of the nonlinear programming problem is linear programming, where one seeks to maximize $g(x,y) = b\cdot x$ subject to the constraints $h(x,y) = y - Sx \ge 0$ and $x \ge 0$, with $S$ an $m \times n$ matrix. Associated with this problem, called the primal problem, is a second linear programming problem, called its dual, where one seeks to minimize $y\cdot p$ subject to $S'p - b \ge 0$ and $p \ge 0$.

The Lagrangian for the primal problem is $L(x,p,y) = g(x,y) + p\cdot h(x,y) = b\cdot x + p\cdot y - p'Sx$. The Lagrangian for the dual problem is $L(x,p,y) = y\cdot p - x\cdot(S'p - b) = b\cdot x + p\cdot y - p'Sx$, which is exactly the same expression as the primal Lagrangian. Therefore, if the primal problem has an LCP, then the dual problem has exactly the same LCP.
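A quick numerical sketch of this primal/dual pair (added illustration; the data $S$, $y$, $b$ are arbitrary), using scipy's linear programming routine. Both problems are solved directly, and the optimal values coincide, as the shared LCP implies:

```python
import numpy as np
from scipy.optimize import linprog

# Primal:  max b.x  s.t.  S x <= y, x >= 0
# Dual:    min y.p  s.t.  S'p >= b, p >= 0
S = np.array([[1.0, 2.0],
              [3.0, 1.0]])
y = np.array([4.0, 6.0])
b = np.array([3.0, 2.0])

# linprog minimizes, so negate b for the primal.
primal = linprog(c=-b, A_ub=S, b_ub=y, bounds=[(0, None)] * 2)
dual   = linprog(c=y, A_ub=-S.T, b_ub=-b, bounds=[(0, None)] * 2)

print("primal value:", -primal.fun)   # 7.2
print("dual value:  ", dual.fun)      # 7.2 -- the two values coincide
print("x0 =", primal.x, " p0 =", dual.x)
```

Here both constraints bind at the optimum and both shadow prices are positive, consistent with the complementary slackness conditions derived next.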

2.16.3. Theorem. If $(x^0,p^0,y)$ is an LCP, then $x^0$ maximizes $g(x,y)$ in $x$ subject to $h(x,y) \ge 0$ for the given $y \in D$. Proof: The inequality $L(x^0,p^0,y) \le L(x^0,p,y)$ gives $(p^0 - p)\cdot h(x^0,y) \le 0$ for all $p \ge 0$. Taking $p = 0$ implies $p^0\cdot h(x^0,y) \le 0$, while taking $p$ to be $p^0$ plus the various unit vectors implies $h(x^0,y) \ge 0$, and hence $p^0\cdot h(x^0,y) = 0$. These are called the complementary slackness conditions, and state that if a constraint is not binding, then its Lagrangian multiplier is zero. In particular, $x^0$ satisfies the constraints. The inequality $L(x,p^0,y) \le L(x^0,p^0,y)$ then gives $g(x,y) + p^0\cdot h(x,y) \le g(x^0,y)$. Any other $x$ that also satisfies the constraints has $p^0\cdot h(x,y) \ge 0$, so that $g(x,y) \le g(x^0,y)$. Then $x^0$ solves the constrained maximization problem. ∎

2.16.4. Theorem. Suppose $g: A \times D \to \mathbb{R}$ and $h: A \times D \to \mathbb{R}^m$ are continuous concave functions on a convex set $A$. Suppose $x^0$ maximizes $g(x,y)$ subject to the constraint $h(x,y) \ge 0$. Suppose the constraint qualification holds that for each vector $p$ satisfying $p \ge 0$ and $p \ne 0$, there exists $x \in A$ such that $p\cdot h(x,y) > 0$. [A sufficient condition for the constraint qualification is that there exist $x \in A$ at which $h(x,y)$ is strictly positive.] Then there exists $p^0 \ge 0$ such that $(x^0,p^0,y)$ is an LCP.

Proof: Define the sets $C_1 = \{(\lambda,z) \in \mathbb{R}\times\mathbb{R}^m \mid \lambda \le g(x,y) \text{ and } z \le h(x,y) \text{ for some } x \in A\}$ and $C_2 = \{(\lambda,z) \in \mathbb{R}\times\mathbb{R}^m \mid \lambda > g(x^0,y) \text{ and } z > 0\}$. Show as an exercise that $C_1$ and $C_2$ are convex and disjoint. Since $C_2$ is open, they are strictly separated by a hyperplane with normal $(\mu,p)$; i.e., $\mu\lambda' + p\cdot z' > \mu\lambda'' + p\cdot z''$ for all $(\lambda',z') \in C_2$ and $(\lambda'',z'') \in C_1$. This inequality and the definition of $C_2$ imply that $(\mu,p) \ge 0$. If $\mu = 0$, then $p \ne 0$, and taking $z' \to 0$ implies $0 \ge p\cdot h(x,y)$ for all $x \in A$, violating the constraint qualification. Therefore, we must have $\mu > 0$. Define $p^0 = p/\mu$. Then the separating inequality implies $g(x^0,y) \ge g(x,y) + p^0\cdot h(x,y)$ for all $x \in A$. Since $p^0 \ge 0$ and $h(x^0,y) \ge 0$, taking $x = x^0$ implies $p^0\cdot h(x^0,y) = 0$. Therefore, $L(x,p,y) = g(x,y) + p\cdot h(x,y)$ satisfies $L(x,p^0,y) \le L(x^0,p^0,y)$. Also, $p \ge 0$ implies $p\cdot h(x^0,y) \ge 0$, so that $L(x^0,p^0,y) \le L(x^0,p,y)$. Therefore, $(x^0,p^0,y)$ is an LCP. ∎

2.16.5. Theorem. Suppose maximizing $g(x)$ subject to the constraint $q(x) \le y$ has LCPs $(x^0,p^0,y)$ and $(x^0+\Delta x,\, p^0+\Delta p,\, y+\Delta y)$. Then
\[
(p^0+\Delta p)\cdot\Delta y \;\le\; g(x^0+\Delta x) - g(x^0) \;\le\; p^0\cdot\Delta y .
\]
Proof: The inequalities $L(x^0+\Delta x, p^0, y) \le L(x^0,p^0,y) \le L(x^0, p^0+\Delta p, y)$ and $L(x^0, p^0+\Delta p, y+\Delta y) \le L(x^0+\Delta x, p^0+\Delta p, y+\Delta y) \le L(x^0+\Delta x, p^0, y+\Delta y)$ imply
\[
L(x^0,p^0{+}\Delta p,y{+}\Delta y) - L(x^0,p^0{+}\Delta p,y) \le L(x^0{+}\Delta x,p^0{+}\Delta p,y{+}\Delta y) - L(x^0,p^0,y) \le L(x^0{+}\Delta x,p^0,y{+}\Delta y) - L(x^0{+}\Delta x,p^0,y).
\]
With $L(x,p,y) = g(x) + p\cdot(y - q(x))$, the outer differences reduce to $(p^0+\Delta p)\cdot\Delta y$ and $p^0\cdot\Delta y$, while complementary slackness makes the Lagrangian equal the optimal value at each LCP, so the middle difference is $g(x^0+\Delta x) - g(x^0)$; cancellation of terms then gives the result. ∎ This result justifies the interpretation of a Lagrangian multiplier as the rate of increase in the constrained optimal objective function that results when a constraint is relaxed, and hence as the shadow or implicit price of the constrained quantity.
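A numerical check of these bounds (added illustration, reusing the toy LP data from above; the perturbation $\Delta y$ is arbitrary):

```python
import numpy as np
from scipy.optimize import linprog

# Check (p0 + dp).dy <= g(x0 + dx) - g(x0) <= p0.dy for max b.x s.t. Sx <= y, x >= 0.
S = np.array([[1.0, 2.0], [3.0, 1.0]])
y = np.array([4.0, 6.0])
b = np.array([3.0, 2.0])
dy = np.array([0.1, 0.0])               # relax the first constraint slightly

def solve(yvec):
    primal = linprog(c=-b, A_ub=S, b_ub=yvec, bounds=[(0, None)] * 2)
    dual = linprog(c=yvec, A_ub=-S.T, b_ub=-b, bounds=[(0, None)] * 2)
    return -primal.fun, dual.x          # optimal value, shadow prices

v0, p0 = solve(y)
v1, p1 = solve(y + dy)
print("lower bound (p0+dp).dy:", p1 @ dy)
print("value change          :", v1 - v0)
print("upper bound p0.dy     :", p0 @ dy)   # all three equal 0.06 here
```

In this example the optimal basis does not change, so $\Delta p = 0$ and the two bounds coincide with the actual change in the optimal value.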

Classical Programming Problem
Consider the problem of maximizing $f:\mathbb{R}^n \to \mathbb{R}$ subject to equality constraints $h_j(x) - c_j = 0$ for $j = 1,\dots,m$ (in vector notation, $h(x) - c = 0$), where $m < n$.

Lagrangian: $L(x,p) = f(x) - p\cdot[h(x) - c]$. The vector $(x^o,p^o)$ is said to be a local (interior) Lagrangian Critical Point if
\[
L_x(x^o,p^o) = 0, \qquad L_p(x^o,p^o) = c - h(x^o) = 0, \qquad z'L_{xx}(x^o,p^o)\,z \le 0 \text{ whenever } h_x(x^o)z = 0,
\]
where $p^o$ is unconstrained in sign.
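A worked instance (added illustration; the objective and constraint are my own choices): maximize $f(x_1,x_2) = x_1 x_2$ subject to $x_1 + x_2 - 2 = 0$.

```latex
\[
L(x,p) = x_1 x_2 - p\,(x_1 + x_2 - 2), \qquad
L_x = (x_2 - p,\; x_1 - p) = 0 \;\Rightarrow\; x_1^o = x_2^o = p^o = 1 .
\]
\[
\text{On } h_x(x^o)z = z_1 + z_2 = 0,\ \text{i.e. } z = t(1,-1): \quad
z' L_{xx}(x^o,p^o)\, z = 2 z_1 z_2 = -2t^2 < 0 \quad (t \neq 0),
\]
```

so $(x^o,p^o)$ is a local LCP and, by the theorem below, $x^o = (1,1)$ is a strict local maximum.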

Theorem. If $(x^o,p^o)$ is a local LCP and $z'L_{xx}(x^o,p^o)\,z < 0$ for all $z \ne 0$ satisfying $h_x(x^o)z = 0$, then $x^o$ is a unique local maximum of $f$ subject to $h(x) - c = 0$. Proof: Note that $0 = L_p(x^o,p^o) = c - h(x^o)$ implies that $x^o$ is feasible, and that $L(x^o,p^o) = f(x^o)$. Taylor's expansions yield
\[
f(x^o+\theta z) = f(x^o) + \theta f_x(x^o)\cdot z + (\theta^2/2)\,z'f_{xx}(x^o)\,z + R(\theta^2),
\]
\[
h(x^o+\theta z) = c + \theta h_x(x^o)z + (\theta^2/2)\,z'h_{xx}(x^o)z + R(\theta^2),
\]
where the $R(\theta^2)$ terms are residuals.

Using $L_x(x^o,p^o) = 0$,
\[
L(x^o+\theta z, p^o) = L(x^o,p^o) + \theta L_x(x^o,p^o)\cdot z + (\theta^2/2)\,z'L_{xx}(x^o,p^o)\,z + R(\theta^2) = f(x^o) + (\theta^2/2)\,z'L_{xx}(x^o,p^o)\,z + R(\theta^2).
\]
A point $x^o + \theta z$ satisfying the constraints, with $\theta$ small, must (to first order) satisfy $h_x(x^o)z = 0$. Then the negative semidefiniteness of $L_{xx}$ subject to these constraints implies
\[
f(x^o+\theta z) \le f(x^o) + (\theta^2/2)\,z'L_{xx}(x^o,p^o)\,z + R(\theta^2) \le f(x^o) + R(\theta^2).
\]
If $L_{xx}$ is negative definite subject to these constraints, then the SOC is sufficient for $x^o$ to be a strict local maximum. ∎

Theorem. If $x^o$ maximizes $f(x)$ subject to $h(x) - c = 0$, and the constraint qualification holds that the $m \times n$ array $B = h_x(x^o)$ is of full rank $m$, then there exist Lagrange multipliers $p^o$ such that $(x^o,p^o)$ is a local LCP. Proof: The hypothesis of the theorem implies that
\[
f(x^o) \ge f(x^o+\theta z) = f(x^o) + \theta f_x(x^o)\cdot z + (\theta^2/2)\,z'f_{xx}(x^o)\,z + R(\theta^2)
\]
for all $z$ such that
\[
c = h(x^o+\theta z) = c + \theta h_x(x^o)z + (\theta^2/2)\,z'h_{xx}(x^o)z + R(\theta^2).
\]

Taking $\theta$ small, these conditions imply $f_x(x^o)\cdot z = 0$ and $z'f_{xx}(x^o)\,z \le 0$ for any $z$ satisfying $h_x(x^o)z = 0$. Recall that $B = h_x(x^o)$ is $m \times n$ of rank $m$, and define the (idempotent) $n \times n$ matrix $M = I - B'(BB')^{-1}B$. Since $BM = 0$, each column of $M$ is a vector $z$ meeting the condition $h_x(x^o)z = 0$, implying that $f_x(x^o)\cdot z = 0$; in other words, $0 = Mf_x(x^o) = f_x(x^o) - h_x(x^o)'p^o$, where $p^o = (BB')^{-1}Bf_x(x^o)$.

Define $L(x,p) = f(x) - p\cdot[h(x) - c]$. Then
\[
L_x(x^o,p^o) = f_x(x^o) - h_x(x^o)'p^o = [I - B'(BB')^{-1}B]\,f_x(x^o) = 0.
\]
The construction guarantees that $L_p(x^o,p^o) = 0$. Finally, Taylor's expansion of the Lagrangian establishes that $z'L_{xx}(x^o,p^o)\,z \le 0$ for all $z$ satisfying $h_x(x^o)z = 0$. Therefore, the constrained maximum corresponds to a local LCP. ∎
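A numerical sketch of the multiplier formula $p^o = (BB')^{-1}Bf_x(x^o)$ (added illustration, using the worked equality-constrained example above; the numbers are assumptions of that example):

```python
import numpy as np

# Example: f(x) = x1*x2 subject to x1 + x2 = 2, maximizer x^o = (1, 1).
xo = np.array([1.0, 1.0])
fx = np.array([xo[1], xo[0]])           # gradient of f at x^o
B = np.array([[1.0, 1.0]])              # h_x(x^o), an m x n array of rank m

M = np.eye(2) - B.T @ np.linalg.inv(B @ B.T) @ B   # idempotent projector, BM = 0
po = np.linalg.inv(B @ B.T) @ B @ fx               # Lagrange multiplier(s)

print("M f_x(x^o) =", M @ fx)           # ~ [0, 0]: FOC on the tangent space
print("p^o =", po)                      # [1.0], matching the worked example
```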