Optimization

Paul Schrimpf
September 12, 2018
University of British Columbia
Economics 526[1]

Today's lecture is about optimization. Useful references are chapters 1-5 of Dixit (1990), chapters 16-19 of Simon and Blume (1994), chapter 5 of Carter (2001), and chapter 7 of De la Fuente (2000).

We typically model economic agents as optimizing some objective function. Consumers maximize utility.

Example 0.1. A consumer with income y chooses a bundle of goods x_1, ..., x_n with prices p_1, ..., p_n to maximize utility u(x):

  max_{x_1,...,x_n} u(x_1, ..., x_n)  s.t.  p_1 x_1 + ... + p_n x_n ≤ y.

Firms maximize profits.

Example 0.2. A firm produces a good y with price p, and uses inputs x ∈ R^k with prices w ∈ R^k. The firm's production function is f : R^k → R. The firm's problem is

  max_{y,x} p y - w^T x  s.t.  f(x) ≥ y.

1. Unconstrained optimization

Although most economic optimization problems involve some constraints, it is useful to begin by studying unconstrained optimization problems. There are two reasons for this. One is that the results are somewhat simpler and easier to understand without constraints. The second is that we will sometimes encounter unconstrained problems. For example, we can substitute the constraint into the firm's problem from example 0.2 to obtain an unconstrained problem:

  max_x p f(x) - w^T x.

1.1. Notation and definitions. An optimization problem refers to finding the maximum or minimum of a function, perhaps subject to some constraints. In economics, the most common optimization problems are utility maximization and profit maximization. Because of this, we will state most of our definitions and results for maximization problems.[2]

[1] This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
[2] See appendix A if any notation is unclear.
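The substituted firm problem above is also easy to check numerically. Here is a minimal sketch in Python (not part of the original notes; the Cobb-Douglas technology f(x) = x_1^0.3 x_2^0.5, the output price, and the input prices are made-up assumptions for illustration):

    import numpy as np
    from scipy.optimize import minimize

    # Hypothetical technology and prices: f(x) = x1^0.3 * x2^0.5, p = 2, w = (1, 1)
    p, w = 2.0, np.array([1.0, 1.0])

    def profit(x):
        return p * x[0]**0.3 * x[1]**0.5 - w @ x

    # scipy minimizes, so minimize the negative of profit; bounds keep x positive
    res = minimize(lambda x: -profit(x), x0=[1.0, 1.0],
                   bounds=[(1e-6, None)] * 2, method='L-BFGS-B')
    print(res.x, profit(res.x))   # optimal inputs ~ (0.28, 0.46) and the profit there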

Of course, we could just as well state each definition and result for a minimization problem by reversing the direction of most inequalities.

We will start by examining generic unconstrained optimization problems of the form

  max_x f(x),

where x ∈ R^n and f : R^n → R. To allow for the possibility that the domain of f is not all of R^n, we will let U ⊆ R^n be the domain and write max_{x∈U} f(x) for the maximum of f on U. If F = max_{x∈U} f(x), we mean that f(x) ≤ F for all x ∈ U and f(x*) = F for some x* ∈ U.

Definition 1.1. F = max_{x∈U} f(x) is the maximum of f on U if f(x) ≤ F for all x ∈ U and f(x*) = F for some x* ∈ U.

There may be more than one such x*. We denote the set of all x* such that f(x*) = F by arg max_{x∈U} f(x) and might write x* ∈ arg max_{x∈U} f(x), or, if we know there is only one such x*, we sometimes write x* = arg max_{x∈U} f(x).

Definition 1.2. x* ∈ U is a maximizer of f on U if f(x*) = max_{x∈U} f(x). The set of all maximizers is denoted arg max_{x∈U} f(x).

Definition 1.3. x* ∈ U is a strict maximizer of f if f(x*) > f(x) for all x ∈ U with x ≠ x*.

Definition 1.4. f has a local maximum at x if ∃ δ > 0 such that f(y) ≤ f(x) for all y in U and within δ distance of x (we will use the notation y ∈ N_δ(x) ∩ U, which can be read "y in a δ neighborhood of x intersected with U"). Each such x is called a local maximizer of f. If f(y) < f(x) for all y ≠ x, y ∈ N_δ(x) ∩ U, then we say f has a strict local maximum at x.

When we want to be explicit about the distinction between a local maximum and the maximum in definition 1.1, we refer to the latter as the global maximum.

Example 1.1. Here are some examples of functions from R to R and their maxima and minima:
(1) f(x) = x^2 is minimized at 0 with minimum 0.
(2) f(x) = c has minimum and maximum c. Any x is a maximizer.
(3) f(x) = cos(x) has maximum 1 and minimum -1. x = 2πn for any integer n is a maximizer.
(4) f(x) = cos(x) + x/2 has no global maximizer or minimizer, but has many local ones.

A related concept to the maximum is the supremum.

Definition 1.5. The supremum of f on U, written S = sup_{x∈U} f(x), means that S ≥ f(x) for all x ∈ U and for any A < S there exists x_A ∈ U such that f(x_A) > A.
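On a finite grid these definitions are easy to see directly. A small numerical sketch (Python with NumPy; purely illustrative, not from the notes):

    import numpy as np

    # Approximate max and arg max of f(x) = cos(x) on a grid over [-10, 10]
    x = np.linspace(-10, 10, 200001)
    f = np.cos(x)
    F = f.max()                    # the maximum: 1
    maximizers = x[np.isclose(f, F)]
    print(F, maximizers)           # several maximizers, near -2*pi, 0, and 2*pi

    # A bounded function need not attain its supremum: the max of -1/(1+x**2)
    # on any grid is below 0, and over all of R the sup, 0, is never attained
    print((-1 / (1 + x**2)).max())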

The supremum is also called the least upper bound. The main difference between the supremum and the maximum is that the supremum can exist when the maximum does not.

Example 1.2. Let x ∈ R and f(x) = -1/(1+x^2). Then max_{x∈R} f(x) does not exist, but sup_{x∈R} f(x) = 0.

[Figure: plot of f(x) = -1/(1+x^2), which approaches 0 from below as |x| grows.]

An important fact about real numbers is that if f : R^n → R is bounded above, i.e. there exists B ∈ R such that f(x) ≤ B for all x ∈ R^n, then sup_{x∈R^n} f(x) exists and is unique. The infimum is to the minimum as the supremum is to the maximum.

1.2. First order conditions. Suppose we want to maximize f(x). If we are at x and travel some small distance in direction v, then we can approximate how much f will change by its directional derivative,[3] i.e.

  f(x + v) ≈ f(x) + df(x; v).

If there is some direction v with df(x; v) ≠ 0, then we could take a step v or -v and reach a higher function value. Therefore, if x is a local maximum, then it must be that df(x; v) = 0 for all v. From theorem B.1 this is the same as saying ∂f/∂x_i(x) = 0 for all i.

Theorem 1.1. Let f : R^n → R, and suppose f has a local maximum at x and f is differentiable at x. Then ∂f/∂x_i(x) = 0 for i = 1, ..., n.

Proof. The paragraph preceding the theorem is an informal proof. The only detail missing is some care to ensure that the approximation is sufficiently accurate. If you are interested, see the past course notes for details: http://faculty.arts.ubc.ca/pschrimpf/526/lec10optimization.pdf

The first order condition is the fact that ∂f/∂x_i(x) = 0 is a necessary condition for x to be a local maximizer or minimizer of f.

Definition 1.6. Any point x such that f is differentiable at x and Df_x = 0 is called a critical point of f.

If f is differentiable, f cannot have local minima or maxima (= local extrema) at non-critical points. f might have a local extremum at its critical points, but it does not have to. Consider f(x) = x^3: f'(0) = 0, but 0 is not a local maximizer or minimizer of f. Similarly, if f : R^2 → R, f(x) = x_1 x_2, then the partial derivatives at 0 are 0, but 0 is not a local minimum or maximum of f.

[3] This section will use some results on partial and directional derivatives. See appendix B and the summer math review for reference.
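A quick finite-difference check of the f(x) = x^3 example (a Python sketch, not from the notes):

    import numpy as np

    def grad(f, x, h=1e-6):
        """Central finite-difference approximation to the gradient of f: R^n -> R."""
        x = np.asarray(x, dtype=float)
        g = np.zeros_like(x)
        for i in range(x.size):
            e = np.zeros_like(x)
            e[i] = h
            g[i] = (f(x + e) - f(x - e)) / (2 * h)
        return g

    f = lambda x: x[0]**3
    print(grad(f, [0.0]))                   # ~ [0]: 0 is a critical point...
    print(f([-0.1]) < f([0.0]) < f([0.1]))  # True: ...but not a local max or min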

2. Second order conditions

To determine whether a given critical point is a local minimum or maximum or neither, we can look at the second derivative of the function. Let f : R^n → R and suppose x* is a critical point, so ∂f/∂x_i(x*) = 0. To see if x* is a local maximum, we need to look at f(x* + h) for small h. We could try taking a first order expansion, f(x* + h) ≈ f(x*) + Df_{x*} h, but we know that Df_{x*} = 0, so this expansion is not useful for comparing f(x*) with f(x* + h). If f is twice continuously differentiable, we can instead take a second order expansion of f around x*:

  f(x* + v) ≈ f(x*) + Df_{x*} v + (1/2) v^T D^2f_{x*} v,

where D^2f_{x*} is the matrix of f's second order partial derivatives. Since x* is a critical point, Df_{x*} = 0, so

  f(x* + v) - f(x*) ≈ (1/2) v^T D^2f_{x*} v.

We can see that x* is a local maximum if (1/2) v^T D^2f_{x*} v < 0 for all v ≠ 0. The above inequality will be true if v^T D^2f_{x*} v < 0 for all v ≠ 0. The Hessian, D^2f_{x*}, is just some symmetric n by n matrix, and v^T D^2f_{x*} v is a quadratic form in v. This motivates the following definition.

Definition 2.1. Let A be a symmetric matrix. Then A is
- negative definite if x^T A x < 0 for all x ≠ 0,
- negative semi-definite if x^T A x ≤ 0 for all x ≠ 0,
- positive definite if x^T A x > 0 for all x ≠ 0,
- positive semi-definite if x^T A x ≥ 0 for all x ≠ 0,
- indefinite if there is an x_1 such that x_1^T A x_1 > 0 and some other x_2 such that x_2^T A x_2 < 0.

Later, we will derive some conditions on A that ensure it is negative (semi-)definite. For now, just observe that if x* is a local maximum, then D^2F_{x*} must be negative semi-definite, and if D^2F_{x*} is negative definite, then x* is a strict local maximum. The following theorem restates the results of this discussion.

Theorem 2.1. Let F : R^n → R be twice continuously differentiable and let x* be a critical point. If
(1) the Hessian, D^2F_{x*}, is negative definite, then x* is a strict local maximizer,
(2) the Hessian, D^2F_{x*}, is positive definite, then x* is a strict local minimizer,
(3) the Hessian, D^2F_{x*}, is indefinite, then x* is neither a local min nor a local max,
(4) the Hessian is positive or negative semi-definite, then x* could be a local maximum, minimum, or neither.

Proof. The main idea of the proof is contained in the discussion preceding the theorem. The only tricky part is carefully showing that our approximation is good enough.
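In practice, definiteness of a Hessian is often checked through its eigenvalues: a symmetric matrix is negative definite exactly when all its eigenvalues are negative, positive definite when all are positive, and indefinite when it has eigenvalues of both signs. A sketch in Python (the labels follow theorem 2.1):

    import numpy as np

    def classify(H, tol=1e-8):
        """Classify a critical point from the eigenvalues of its (symmetric) Hessian."""
        ev = np.linalg.eigvalsh(H)
        if np.all(ev < -tol):
            return "negative definite: strict local maximizer"
        if np.all(ev > tol):
            return "positive definite: strict local minimizer"
        if ev.min() < -tol and ev.max() > tol:
            return "indefinite: neither max nor min"
        return "only semi-definite: inconclusive"

    print(classify(np.array([[-2.0, 0.0], [0.0, -1.0]])))  # strict local maximizer
    print(classify(np.array([[-2.0, 0.0], [0.0, 3.0]])))   # indefinite
    print(classify(np.array([[-2.0, 0.0], [0.0, 0.0]])))   # inconclusive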

When the Hessian is not positive definite, negative definite, or indefinite, the result of this theorem is ambiguous. Let's go over some examples of this case.

Example 2.1. F : R → R, F(x) = x^4. The first order condition is 4x^3 = 0, so x = 0 is the only critical point. The Hessian is F''(x) = 12x^2 = 0 at x = 0. However, x^4 has a strict local minimum at 0.

[Figure: plot of F(x) = x^4 on [-1, 1].]

Example 2.2. F : R^2 → R, F(x_1, x_2) = -x_1^2. The first order condition is DF_x = (-2x_1, 0) = 0, so the points with x_1 = 0, x_2 ∈ R are all critical points. The Hessian is

  D^2F_x = [ -2  0
              0  0 ].

This is negative semi-definite because h^T D^2F_x h = -2h_1^2 ≤ 0. Also, graphing the function would make it clear that the points with x_1 = 0, x_2 ∈ R are all (non-strict) local maxima.

Example 2.3. F : R^2 → R, F(x_1, x_2) = -x_1^2 + x_2^4. The first order condition is DF_x = (-2x_1, 4x_2^3) = 0, so (0, 0) is a critical point. The Hessian is

  D^2F_x = [ -2   0
              0  12x_2^2 ].

This is negative semi-definite at 0 because h^T D^2F_0 h = -2h_1^2 ≤ 0. However, 0 is not a local maximum because F(0, x_2) > F(0, 0) for any x_2 ≠ 0. 0 is also not a local minimum because F(x_1, 0) < F(0, 0) for all x_1 ≠ 0.

In each of these examples, the second order condition is inconclusive because h^T D^2F h = 0 for some h. In these cases we could determine whether x* is a local maximum, local minimum, or neither by either looking at higher derivatives of F at x*, or looking at D^2F_x for all x in a neighborhood of x*. We will not often encounter cases where the second order condition is inconclusive, so we will not study these possibilities in detail.

Example 2.4 (Competitive multi-product firm). Suppose a firm produces k goods using n inputs with production function f : R^n → R^k. The prices of the goods are p, and the prices of the inputs are w, so the firm's profits are

  Π(x) = p^T f(x) - w^T x.

The firm chooses x to maximize profits:

  max_x p^T f(x) - w^T x.

The first order condition is

  p^T Df_{x*} - w^T = 0,

or, without using matrices,

  Σ_{j=1}^k p_j ∂f_j/∂x_i(x*) = w_i  for i = 1, ..., n.

The second order condition is that[a]

  D^2[p^T f]_{x*} = [ Σ_{j=1}^k p_j ∂^2f_j/∂x_1^2(x*)      ...  Σ_{j=1}^k p_j ∂^2f_j/∂x_1∂x_n(x*)
                      ...                                   ...  ...
                      Σ_{j=1}^k p_j ∂^2f_j/∂x_n∂x_1(x*)    ...  Σ_{j=1}^k p_j ∂^2f_j/∂x_n^2(x*) ]

must be negative semi-definite.

[a] There is some awkwardness in the notation for the second derivative of a function from R^n → R^k when k > 1. The first partial derivatives can be arranged in a k × n matrix. There are then k × n × n second partial derivatives. When k = 1 it is convenient to write these in an n × n matrix. When k > 1, one could arrange the second partial derivatives in a 3-d array of dimension k × n × n, or one could argue that a kn × n matrix makes sense. These higher order derivatives are called tensors, and physicists and mathematicians do have some notation for working with them. However, tensors do not come up that often in economics. Moreover, we can usually avoid needing any such notation by rearranging some operations. In the above example, f : R^n → R^k, so D^2f should be avoided (especially if we combine it with other matrix and vector algebra), but p^T f(x) is a function from R^n to R, so D^2[p^T f] is unambiguously an n × n matrix.

3. Constrained optimization

To begin our study of constrained optimization, let's consider a consumer in an economy with two goods.

Example 3.1. Let x_1 and x_2 be goods with prices p_1 and p_2. The consumer has income y. The consumer's problem is

  max_{x_1,x_2} u(x_1, x_2)  s.t.  p_1 x_1 + p_2 x_2 ≤ y.

In economics 101, you probably analyzed this problem graphically by drawing indifference curves and the budget set, as in figure 1.

In this figure, the curved lines are indifference curves; that is, they are regions where utility is constant. Going outward from the axis, they show all (x_1, x_2) such that u(x_1, x_2) = c_1, u(x_1, x_2) = c_2, u(x_1, x_2) = c_3, and u(x_1, x_2) = c_4, with c_1 < c_2 < c_3 < c_4. The diagonal line is the budget constraint; all points below this line satisfy the constraint p_1 x_1 + p_2 x_2 ≤ y. The optimal x_1, x_2 is the point where an indifference curve is tangent to the budget line. The slope of the budget line is -p_1/p_2. The slope of the indifference curve can be found by implicitly differentiating u(x_1, x_2) = c:

  ∂u/∂x_1(x) dx_1 + ∂u/∂x_2(x) dx_2 = 0
  dx_2/dx_1 = -(∂u/∂x_1(x)) / (∂u/∂x_2(x)).

So the optimum satisfies

  (∂u/∂x_1(x*)) / (∂u/∂x_2(x*)) = p_1/p_2.

The ratio of marginal utilities equals the ratio of prices. Rearranging, we can also say that

  (∂u/∂x_1)/p_1 = (∂u/∂x_2)/p_2.

Suppose we have a small amount of additional income. If we spend this on x_1, we would get 1/p_1 more of x_1, which would increase our utility by (∂u/∂x_1)/p_1. At the optimum, the marginal utility from spending additional income on either good should be the same. Let's call this marginal utility of income µ; then we have

  µ = (∂u/∂x_1)/p_1 = (∂u/∂x_2)/p_2,

or

  ∂u/∂x_1 - µp_1 = 0
  ∂u/∂x_2 - µp_2 = 0.

This is the standard first order condition for a constrained optimization problem.

Figure 1: Indifference curves and the budget line. [Figure omitted in this transcription.]

Now, let's look at a generic maximization problem with equality constraints,

  max_x f(x)  s.t.  h(x) = c,

where f : R^n → R and h : R^n → R^m. As in the unconstrained case, consider a small perturbation of x by some small v. The function value is then approximately

  f(x + v) ≈ f(x) + Df_x v,

and the constraint value is

  h(x + v) ≈ h(x) + Dh_x v.

x + v will satisfy the constraint only if Dh_x v = 0. Thus, x is a constrained local maximum if for all v such that Dh_x v = 0 we also have Df_x v = 0. Now, we want to express this condition in terms of Lagrange multipliers. Remember that Dh_x is an m × n matrix of partial derivatives, and Df_x is a 1 × n row vector of partial derivatives.

If n = 2 and m = 1, this condition can be written as:

  (∂h/∂x_1)(x) v_1 + (∂h/∂x_2)(x) v_2 = 0

implies

  (∂f/∂x_1)(x) v_1 + (∂f/∂x_2)(x) v_2 = 0.

Solving the first of these equations for v_2, substituting into the second, and rearranging gives

  (∂f/∂x_1)(x) / (∂h/∂x_1)(x) = (∂f/∂x_2)(x) / (∂h/∂x_2)(x),

which, as in the consumer example above, we can rewrite as

  ∂f/∂x_1 - µ ∂h/∂x_1 = 0
  ∂f/∂x_2 - µ ∂h/∂x_2 = 0

for some constant µ. Similar reasoning would show that for any n and m = 1, if for all v such that Dh_x v = 0 we also have Df_x v = 0, then it is also true that there exists µ such that

  ∂f/∂x_i - µ ∂h/∂x_i = 0  for i = 1, ..., n,

or, using matrix notation, Df_x = µ Dh_x. Note that conversely, if such a µ exists, then trivially for any v with Dh_x v = 0, we have Df_x v = µ Dh_x v = 0.
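Numerically, the tangency condition Df_x = µ Dh_x is easy to verify at a known optimum. A sketch (Python; the utility u(x) = sqrt(x_1 x_2), prices (1, 2), and income 4 are made-up numbers, with analytic solution x* = (2, 1)):

    import numpy as np

    xstar = np.array([2.0, 1.0])   # solves max sqrt(x1*x2) s.t. x1 + 2*x2 = 4
    Du = np.array([0.5 * np.sqrt(xstar[1] / xstar[0]),
                   0.5 * np.sqrt(xstar[0] / xstar[1])])   # gradient of u at x*
    Dh = np.array([1.0, 2.0])                             # gradient of h(x) = x1 + 2*x2
    print(Du / Dh)   # equal components (~0.354): Du is a multiple of Dh, with mu ~ 0.354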

When there are multiple constraints, i.e. m > 1, it would be reasonable to guess that Dh_x v = 0 implies Df_x v = 0 if and only if there exists µ ∈ R^m such that

  ∂f/∂x_i - Σ_{j=1}^m µ_j ∂h_j/∂x_i = 0  for i = 1, ..., n,

or, in matrix notation, Df_x = µ^T Dh_x. This is indeed true. It is tedious to show using algebra, but will be easy to show later using some results from linear algebra about the row, column, and null spaces of matrices.

A heuristic geometric argument supporting the previous claims is as follows. Imagine the level sets or contour lines of h(·). The feasible points that satisfy the constraint lie on the contour line with h(x) = c. Similarly, imagine the contour lines of f(·). If x* is a constrained maximizer, then as we move along the contour line h(x) = c, we must stay on the same contour line (or set, if f is flat) of f. In other words, the contour lines of h and f must be parallel at x*. Gradients give the direction of steepest ascent and are perpendicular to contour lines. Therefore, if the gradients of h and f are parallel, then so are the contour lines. Gradients being parallel just means that they should be multiples of one another, i.e. Df_x = µ^T Dh_x for some µ.

The following theorem formally states the preceding result.

Theorem 3.1 (First order condition for maximization with equality constraints). Let f : R^n → R and h : R^n → R^m be continuously differentiable. Suppose x* is a constrained local maximizer of f subject to h(x) = c. Also assume that Dh_{x*} has rank m. Then there exists µ* ∈ R^m such that (x*, µ*) is a critical point of the Lagrangian,

  L(x, µ) = f(x) - µ^T (h(x) - c),

i.e., for i = 1, ..., n and j = 1, ..., m,

  ∂L/∂x_i(x*, µ*) = ∂f/∂x_i(x*) - µ*^T ∂h/∂x_i(x*) = 0
  ∂L/∂µ_j(x*, µ*) = h_j(x*) - c_j = 0.

The assumption that Dh_{x*} has rank m is needed to make sure that we do not divide by zero when defining the Lagrange multipliers. When there is a single constraint, it simply says that at least one of the partial derivatives of the constraint is non-zero. This assumption is called the non-degenerate constraint qualification. It is rarely an issue for equality constraints, but can sometimes fail, as in the following exercise.

Exercise 3.1.
(1) Solve the problem

  max_{x_1,x_2} α_1 log x_1 + α_2 log x_2  s.t.  (p_1 x_1 + p_2 x_2 - y)^3 = 0.

What goes wrong? How does it differ from the following problem:

  max_{x_1,x_2} α_1 log x_1 + α_2 log x_2  s.t.  p_1 x_1 + p_2 x_2 - y = 0?

(2) Solve the problem

  max_{x_1,x_2} x_1 + x_2  s.t.  x_1^2 + x_2^2 = 0.

3.1. Lagrange multipliers as shadow prices. Recall that in our example of consumer choice, the Lagrange multiplier could be interpreted as the marginal utility of income. A similar interpretation will always apply. Consider a constrained maximization problem,

  max_x f(x)  s.t.  h(x) = c.

From theorem 3.1, the first order conditions are

  Df_{x*} - µ^T Dh_{x*} = 0
  h(x*) - c = 0.

What happens to x* and f(x*) if c changes? Let x*(c) denote the maximizer as a function of c. Differentiating the constraint with respect to c shows that

  Dh_{x*(c)} D_c x* = I.

By the chain rule,

  D_c [f(x*(c))] = Df_{x*(c)} D_c x*.

Using the first order condition to substitute for Df_{x*(c)}, we have

  D_c [f(x*(c))] = µ^T Dh_{x*(c)} D_c x* = µ^T.

Thus, the multiplier, µ, is the derivative of the maximized function with respect to c. In economic terms, the multiplier is the marginal value of increasing c. Because of this, µ_j is called the shadow price of c_j.

Example 3.2 (Cobb-Douglas utility). Consider the consumer's problem in example 3.1 with Cobb-Douglas utility,

  max_{x_1,x_2} x_1^{α_1} x_2^{α_2}  s.t.  p_1 x_1 + p_2 x_2 = y.

The first order conditions are

  α_1 x_1^{α_1 - 1} x_2^{α_2} - p_1 µ = 0
  α_2 x_1^{α_1} x_2^{α_2 - 1} - p_2 µ = 0
  p_1 x_1 + p_2 x_2 - y = 0.

Solving for x_1 and x_2 yields

  x_1* = (α_1/(α_1 + α_2)) (y/p_1)
  x_2* = (α_2/(α_1 + α_2)) (y/p_2).
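These closed-form demands can be checked against a numerical optimizer. A sketch (Python with SciPy; the parameter values are arbitrary assumptions):

    import numpy as np
    from scipy.optimize import minimize

    a1, a2, p1, p2, y = 0.3, 0.7, 1.0, 2.0, 10.0
    closed_form = np.array([a1/(a1 + a2) * y/p1, a2/(a1 + a2) * y/p2])

    res = minimize(lambda x: -(x[0]**a1 * x[1]**a2),   # maximize utility
                   x0=[1.0, 1.0], method='SLSQP',
                   bounds=[(1e-6, None)] * 2,
                   constraints={'type': 'eq',
                                'fun': lambda x: p1*x[0] + p2*x[1] - y})
    print(res.x, closed_form)   # both ~ [3.0, 3.5]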

The expenditure share of each good, p_j x_j*/y, is constant with respect to income. Many economists, going back to Engel (1857), have studied how expenditure shares vary with income. See Lewbel (2008) for a brief review and additional references.

If we solve for µ, we find that

  µ* = (α_1^{α_1} α_2^{α_2} y^{α_1+α_2-1}) / ((α_1+α_2)^{α_1+α_2-1} p_1^{α_1} p_2^{α_2}).

µ* is the marginal utility of income in this model. As an exercise, you may want to verify that if we were to take a monotonic transformation of the utility function (such as log(x_1^{α_1} x_2^{α_2}) or (x_1^{α_1} x_2^{α_2})^β) and re-solve the model, then we would obtain the same demand functions, but a different marginal utility of income.

3.2. Inequality constraints. Now let's consider an inequality instead of an equality constraint:

  max_x f(x)  s.t.  g(x) ≤ b.

When the constraint binds, i.e. g(x*) = b, the situation is exactly the same as with an equality constraint. However, the constraints do not necessarily bind at a local maximum, so we must allow for that possibility.

Suppose x* is a constrained local maximum. To begin with, let's consider a simplified case where there is only one constraint, i.e. g : R^n → R. If the constraint does not bind, then for small changes to x*, the constraint will still not bind. In other words, locally it is as though we have an unconstrained problem. Then, as in our earlier analysis of unconstrained problems, it must be that Df_{x*} = 0. Now, suppose the constraint does bind. Consider a small change in x*, v. Since x* is a local maximum, it must be that if f(x* + v) > f(x*), then x* + v must violate the constraint, g(x* + v) > b. Taking first order expansions, we can say that if Df_{x*} v > 0, then Dg_{x*} v > 0. This will be true if Df_{x*} = λ Dg_{x*} for some λ > 0. Note that the sign of the multiplier λ matters here.

To summarize the results of the previous paragraph: if x* is a constrained local maximum, then there exists λ* ≥ 0 such that Df_{x*} - λ* Dg_{x*} = 0. Furthermore, if λ* > 0, then g(x*) = b, and if g(x*) < b, then λ* = 0 (it is also possible, but rare, for both λ* = 0 and g(x*) = b). This situation, where if one inequality is strict then the other holds with equality and vice versa, is called a complementary slackness condition.

Essentially the same result holds with multiple inequality constraints.

Theorem 3.2 (First order condition for maximization with inequality constraints). Let f : R^n → R and g : R^n → R^m be continuously differentiable. Suppose x* is a local maximizer of f subject to g(x) ≤ b. Suppose that the first k ≤ m constraints bind, g_j(x*) = b_j for j = 1, ..., k, and that the Jacobian of these constraints,

  [ ∂g_1/∂x_1(x*)  ...  ∂g_1/∂x_n(x*)
    ...
    ∂g_k/∂x_1(x*)  ...  ∂g_k/∂x_n(x*) ],

has rank k. Then, there exists λ* ∈ R^m such that for

  L(x, λ) = f(x) - λ^T (g(x) - b),

we have

  ∂L/∂x_i(x*, λ*) = ∂f/∂x_i(x*) - λ*^T ∂g/∂x_i(x*) = 0
  λ_j* (g_j(x*) - b_j) = 0
  λ_j* ≥ 0
  g_j(x*) ≤ b_j

for i = 1, ..., n and j = 1, ..., m. Moreover, for each j, if λ_j* > 0 then g_j(x*) = b_j, and if g_j(x*) < b_j then λ_j* = 0 (complementary slackness).

Some care needs to be taken regarding the sign of the multipliers and the setup of the problem. Given the setup of our problem, max_x f(x) s.t. g(x) ≤ b, if we write the Lagrangian as L(x, λ) = f(x) - λ^T (g(x) - b), then the λ_j ≥ 0. If we instead wrote the Lagrangian as L(x, λ) = f(x) + λ^T (g(x) - b), then we would get negative multipliers. If we have a minimization problem, min_x f(x) s.t. g(x) ≤ b, the situation is reversed: f(x) - λ^T (g(x) - b) leads to negative multipliers, and f(x) + λ^T (g(x) - b) leads to positive multipliers. Reversing the direction of the inequality in the constraint, i.e. g(x) ≥ b instead of g(x) ≤ b, also switches the sign of the multipliers. Generally it is not very important if you end up with multipliers of the wrong sign; you will still end up with the same solution for x*. The sign of the multiplier does matter for whether it is the shadow price of b or -b.

To solve the first order conditions of an inequality constrained maximization problem, you would first determine or guess which constraints do and do not bind. Then you would impose λ_j = 0 for the non-binding constraints and solve for x* and the remaining components of λ. If the resulting x* does leave the constraints that you guessed were non-binding slack, then that x* is a possible local maximum. To find all possible local maxima, you would have to repeat this for all possible combinations of constraints binding or not. There are 2^m such possibilities, so this process can be tedious.
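In practice, one can also hand such a problem to a numerical solver and simply inspect which constraints bind at the solution it reports. A sketch (Python with SciPy; the problem is the one solved by hand in the example that follows):

    import numpy as np
    from scipy.optimize import minimize

    # max x1*x2  s.t.  x1^2 + 2*x2^2 <= 3  and  2*x1^2 + x2^2 <= 3
    cons = [{'type': 'ineq', 'fun': lambda x: 3 - x[0]**2 - 2*x[1]**2},
            {'type': 'ineq', 'fun': lambda x: 3 - 2*x[0]**2 - x[1]**2}]
    res = minimize(lambda x: -x[0]*x[1], x0=[0.5, 0.5],
                   method='SLSQP', constraints=cons)
    x1, x2 = res.x
    print(res.x)                                      # ~ (1, 1)
    print(3 - x1**2 - 2*x2**2, 3 - 2*x1**2 - x2**2)   # both ~ 0: both constraints bind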

Example 3.3. Let's solve

  max_{x_1,x_2} x_1 x_2  s.t.  x_1^2 + 2x_2^2 ≤ 3  and  2x_1^2 + x_2^2 ≤ 3.

The Lagrangian is

  L(x, λ) = x_1 x_2 - λ_1 (x_1^2 + 2x_2^2 - 3) - λ_2 (2x_1^2 + x_2^2 - 3).

The first order conditions are

  0 = x_2 - 2λ_1 x_1 - 4λ_2 x_1
  0 = x_1 - 4λ_1 x_2 - 2λ_2 x_2
  0 = λ_1 (x_1^2 + 2x_2^2 - 3)
  0 = λ_2 (2x_1^2 + x_2^2 - 3).

Now, we must guess which constraints bind. Since the problem is symmetric in x_1 and x_2, we only need to check three possibilities instead of all four. The three possibilities are: neither constraint binds, both bind, or one binds and one does not.

If neither constraint binds, then λ_1 = λ_2 = 0, and the first order conditions imply x_1 = x_2 = 0. This results in a function value of 0. This is feasible, but it is not the maximum since, for example, small positive x_1 and x_2 would also be feasible and give a higher function value.

Instead, suppose both constraints bind. Then the solutions to x_1^2 + 2x_2^2 = 3 and 2x_1^2 + x_2^2 = 3 are x_1 = ±1 and x_2 = ±1. Substituting x_1 = x_2 = 1 into the first order conditions and solving for λ_1 and λ_2 gives

  0 = 1 - 2λ_1 - 4λ_2
  0 = 1 - 4λ_1 - 2λ_2
  λ_1 = λ_2 = 1/6.

Both λ_1 and λ_2 are positive, so complementary slackness is satisfied.

Finally, let's consider the case where the first constraint binds but not the second. In this case λ_2 = 0, and taking the ratio of the first order conditions gives

  x_2/x_1 = x_1/(2x_2), or x_1^2 = 2x_2^2.

Substituting this into the first constraint yields x_1^2 + x_1^2 = 3, so x_1^2 = 3/2 and x_2^2 = 3/4. However, then 2x_1^2 + x_2^2 = 3 + 3/4 > 3. The second constraint is violated, so this cannot be the solution.

Fortunately, the economics of a problem often suggest which constraints will bind, and you rarely actually need to investigate all 2^m possibilities.

Example 3.4 (Quasi-linear preferences). A consumer's choice of some good(s) of interest, x, and spending on all other goods, z, is often modeled using quasi-linear preferences, U(x, z) = u(x) + z. The consumer's problem is

  max_{x,z} u(x) + z  s.t.  p x + z ≤ y,  x ≥ 0,  z ≥ 0.

Let λ, µ_x, and µ_z be the multipliers on the three constraints. The first order conditions are

  u'(x) - λp + µ_x = 0
  1 - λ + µ_z = 0,

along with the complementary slackness conditions

  λ ≥ 0,   p x + z ≤ y
  µ_x ≥ 0,  x ≥ 0
  µ_z ≥ 0,  z ≥ 0.

There are 2^3 = 8 possible combinations of constraints binding or not. However, we can eliminate half of these possibilities by observing that if the budget constraint is slack, p x + z < y, then increasing z is feasible and increases the objective function. Therefore, the budget constraint must bind, p x + z = y, and λ > 0. There are four remaining possibilities. Having 0 = x = z is not possible because then the budget constraint would not bind. This leaves three possibilities:

(1) 0 < x and 0 < z. That implies µ_x = µ_z = 0. Rearranging the first order conditions gives u'(x) = p, and from the budget constraint, z = y - p x. Depending on u, p, and y, this could be the solution.

(2) x = 0 and 0 < z. That implies µ_z = 0 and µ_x ≥ 0. From the budget constraint, z = y. From the first order conditions, we have u'(0) = p - µ_x. Since µ_x must be positive, this equation requires that u'(0) < p. This also could be the solution, depending on u, p, and y.

(3) 0 < x and 0 = z. That implies µ_x = 0. From the budget constraint, x = y/p. Combining the first order conditions to eliminate λ gives

  1 + µ_z = u'(y/p)/p.

Since µ_z > 0, this equation requires that u'(y/p) > p. This also could be the solution, depending on u, p, and y.

Exercise 3.2. Show that replacing the problem max_x f(x) s.t. g(x) ≤ b with an equality constrained problem with slack variables s,

  max_{x,s} f(x)  s.t.  g(x) + s = b,  s ≥ 0,

leads to the same conclusions as theorem 3.2.

3.3. Second order conditions. As with unconstrained optimization, the first order conditions from the previous section only give a necessary condition for x* to be a local maximum of f(x) subject to some constraints. To verify that a given x* that solves the first order conditions is a local maximum, we must look at the second order condition.

As in the unconstrained case, we can take a second order expansion of f around x*:

  f(x* + v) - f(x*) ≈ Df_{x*} v + (1/2) v^T D^2f_{x*} v.

This is a constrained problem, so any x* + v must satisfy the constraints as well. As before, what will really matter are the equality constraints and the binding inequality constraints. To simplify notation, let's just work with equality constraints, say h(x) = c. We can take a first order expansion of h around x* to get

  h(x* + v) ≈ h(x*) + Dh_{x*} v = c.

Then v satisfies the constraints if

  h(x*) + Dh_{x*} v = c,  i.e.  Dh_{x*} v = 0.

For such v, the first order condition implies Df_{x*} v = µ^T Dh_{x*} v = 0, so

  f(x* + v) - f(x*) ≈ (1/2) v^T D^2f_{x*} v.

Thus, ignoring the curvature of the constraints, x* is a local maximizer of f subject to h(x) = c if v^T D^2f_{x*} v < 0 for all v ≠ 0 such that Dh_{x*} v = 0. When h is nonlinear, its curvature matters too; carrying it through the expansion replaces D^2f_{x*} with the Hessian of the Lagrangian with respect to x, D^2_x L, and the two agree when the constraints are linear. The following theorem precisely states the result of this discussion.

Theorem 3.3 (Second order condition for constrained maximization). Let f : U → R be twice continuously differentiable on U, and h : U → R^l and g : U → R^m be continuously differentiable on U ⊆ R^n. Suppose x* ∈ interior(U) and there exist µ* ∈ R^l and λ* ∈ R^m such that for

  L(x, λ, µ) = f(x) - λ^T (g(x) - b) - µ^T (h(x) - c),

we have

  ∂L/∂x_i(x*, λ*, µ*) = ∂f/∂x_i(x*) - λ*^T ∂g/∂x_i(x*) - µ*^T ∂h/∂x_i(x*) = 0
  ∂L/∂µ_l(x*, λ*, µ*) = h_l(x*) - c_l = 0
  λ_j* (g_j(x*) - b_j) = 0
  λ_j* ≥ 0
  g(x*) ≤ b.

Let B be the matrix of the derivatives of the binding constraints evaluated at x*,

  B = [ ∂h_1/∂x_1(x*)  ...  ∂h_1/∂x_n(x*)
        ...
        ∂h_l/∂x_1(x*)  ...  ∂h_l/∂x_n(x*)
        ∂g_1/∂x_1(x*)  ...  ∂g_1/∂x_n(x*)
        ...
        ∂g_k/∂x_1(x*)  ...  ∂g_k/∂x_n(x*) ],

where g_1, ..., g_k are the binding inequality constraints. If

  v^T D^2_x L_{(x*,λ*,µ*)} v < 0

for all v ≠ 0 such that Bv = 0, then x* is a strict local constrained maximizer of f subject to h(x) = c and g(x) ≤ b.

Recall from above that an n by n matrix, A, is negative definite if x^T A x < 0 for all x ≠ 0. Similarly, we say that A is negative definite on the null space[4] of B if x^T A x < 0 for all x ∈ N(B) \ {0}. Thus, the second order condition for constrained optimization could be stated as saying that the Hessian of the Lagrangian must be negative definite on the null space of the Jacobian of the binding constraints. The proof is similar to the proof of the second order condition for unconstrained optimization, so we will omit it.

4. Comparative statics

Often, a maximization problem will involve some parameters. For example, a consumer with Cobb-Douglas utility solves

  max_{x_1,x_2} x_1^{α_1} x_2^{α_2}  s.t.  p_1 x_1 + p_2 x_2 ≤ y.

The solution to this problem depends on α_1, α_2, p_1, p_2, and y. We are often interested in how the solution depends on these parameters. In particular, we might be interested in the maximized value function (which in this example is the indirect utility function),

  v(p_1, p_2, y, α_1, α_2) = max_{x_1,x_2} x_1^{α_1} x_2^{α_2}  s.t.  p_1 x_1 + p_2 x_2 ≤ y.

We might also be interested in the maximizers x_1*(p_1, p_2, y, α_1, α_2) and x_2*(p_1, p_2, y, α_1, α_2) (which in this example are the demand functions).

4.1. Envelope theorem. The envelope theorem tells us about the derivatives of the maximum value function. Suppose we have an objective function that depends on some parameters θ. Define the value function as

  v(θ) = max_x f(x, θ)  s.t.  h(x) = c.

Let x*(θ) denote the maximizer as a function of θ, i.e.

  x*(θ) = arg max_x f(x, θ)  s.t.  h(x) = c.

[4] The null space of an m × n matrix B is the set of all x ∈ R^n such that Bx = 0. We will discuss null spaces in more detail later in the course.

By definition, v(θ) = f(x*(θ), θ). Applying the chain rule, we can calculate the derivative of v. For simplicity, we will treat x and θ as scalars, but the same calculations work when they are vectors:

  dv/dθ = ∂f/∂θ(x*(θ), θ) + ∂f/∂x(x*(θ), θ) dx*/dθ(θ).

From the first order condition, we know that

  ∂f/∂x(x*(θ), θ) dx*/dθ = µ ∂h/∂x(x*(θ)) dx*/dθ.

However, since x*(θ) is a constrained maximum for each θ, it must be that h(x*(θ)) = c for each θ. Therefore,

  0 = d/dθ [h(x*(θ))] = ∂h/∂x(x*(θ)) dx*/dθ.

Thus, we can conclude that

  dv/dθ = ∂f/∂θ(x*(θ), θ).

More generally, you might also have parameters in the constraint,

  v(θ) = max_x f(x, θ)  s.t.  h(x, θ) = c.

Following the same steps as above, you can show that

  dv/dθ = ∂f/∂θ - µ ∂h/∂θ = ∂L/∂θ.

Note that this result also covers the previous case where the constraint did not depend on θ. In that case, ∂h/∂θ = 0 and we are left with just dv/dθ = ∂f/∂θ.

The following theorem summarizes the above discussion.

Theorem 4.1 (Envelope). Let f : R^{n+k} → R and h : R^{n+k} → R^m be continuously differentiable. Define

  v(θ) = max_x f(x, θ)  s.t.  h(x, θ) = c,

where x ∈ R^n and θ ∈ R^k. Then

  D_θ v(θ) = D_θ f_{(x*,θ)} - µ^T D_θ h_{(x*,θ)},

where µ are the Lagrange multipliers from theorem 3.1, D_θ f_{(x*,θ)} denotes the 1 × k matrix of partial derivatives of f with respect to θ evaluated at (x*, θ), and D_θ h_{(x*,θ)} denotes the m × k matrix of partial derivatives of h with respect to θ evaluated at (x*, θ).
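A numerical illustration of the envelope theorem (a Python sketch; the toy objective f(x, θ) = -x^2 + θx is an assumption for illustration): dv/dθ should equal ∂f/∂θ evaluated at the maximizer, here x*(θ) = θ/2, with no contribution from dx*/dθ.

    from scipy.optimize import minimize_scalar

    def v(theta):
        """Value function v(theta) = max_x -x**2 + theta*x."""
        return -minimize_scalar(lambda x: -(-x**2 + theta*x)).fun

    theta, h = 1.3, 1e-5
    dv = (v(theta + h) - v(theta - h)) / (2*h)   # numerical dv/dtheta
    print(dv, theta/2)                           # both ~ 0.65 = x*(theta)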

Example 4.1 (Consumer demand). Some of the core results of consumer theory can be shown as simple consequences of the envelope theorem. Consider a consumer choosing goods x ∈ R^n with prices p and income y. Define

  v(p, y) = max_x u(x)  s.t.  p^T x - y ≤ 0.

In this context, v is called the indirect utility function. From the envelope theorem,

  ∂v/∂p_i(p, y) = -µ x_i*(p, y)
  ∂v/∂y(p, y) = µ.

Taking the ratio, we have a relationship between the (Marshallian) demand function, x*, and the indirect utility function, v:

  x_i*(p, y) = -(∂v/∂p_i(p, y)) / (∂v/∂y(p, y)).

This is known as Roy's identity.

Now consider the mirror problem of minimizing expenditure given a target utility level,

  e(p, ū) = min_x p^T x  s.t.  u(x) ≥ ū.

In general, e is a minimum value function; in this particular context, it is the consumer's expenditure function. Using the envelope theorem again,

  ∂e/∂p_i(p, ū) = x_i^h(p, ū),

where x^h(p, ū) is the constrained minimizer. It is the Hicksian (or compensated) demand function. Finally, using the fact that x_i^h(p, ū) = x_i*(p, e(p, ū)) and differentiating with respect to p_k gives Slutsky's equation:

  ∂x_i^h/∂p_k = ∂x_i*/∂p_k + ∂x_i*/∂y ∂e/∂p_k = ∂x_i*/∂p_k + ∂x_i*/∂y x_k^h.

Slutsky's equation is useful because we can determine the sign of some of these derivatives. From above we know that ∂e/∂p_i(p, ū) = x_i^h(p, ū), so

  ∂x_i^h/∂p_j(p, ū) = ∂^2e/∂p_j∂p_i(p, ū).

If we fix p, ū and let x^0 = x^h(p, ū), we know that u(x^0) ≥ ū, so x^0 satisfies the constraint in the minimum expenditure problem. Therefore, for any p̃,

  p̃^T x^0 ≥ e(p̃, ū) = min_x p̃^T x  s.t.  u(x) ≥ ū.

Taking a second order expansion, this gives

  p̃^T x^0 - p^T x^0 ≥ e(p̃, ū) - e(p, ū) ≈ D_p e_{(p,ū)} (p̃ - p) + (1/2)(p̃ - p)^T D^2_p e_{(p,ū)} (p̃ - p)
  0 ≥ (p̃ - p)^T D^2_p e_{(p,ū)} (p̃ - p),

where the second line uses the fact that D_p e_{(p,ū)} = (x^0)^T, so that p̃^T x^0 - p^T x^0 = D_p e_{(p,ū)} (p̃ - p). Hence, we know that D^2_p e_{(p,ū)} = D_p x^h_{(p,ū)} is negative semi-definite. In particular, the diagonal terms, ∂x_i^h/∂p_i, must be less than or equal to zero: Hicksian demand curves must slope down. Marshallian demand curves usually do too, but might not when the income effect, (∂x_i*/∂y) x_i, is large.

We can also analyze how the maximizer varies with the parameters. We do this by totally differentiating the first order conditions. Consider an equality constrained problem,

  max_x f(x, θ)  s.t.  h(x, θ) = c,

where x ∈ R^n, θ ∈ R^s, and c ∈ R^m. The optimal x* must satisfy the first order conditions and the constraint:

  ∂f/∂x_j - Σ_{k=1}^m µ_k ∂h_k/∂x_j = 0  for j = 1, ..., n
  h_k(x, θ) - c_k = 0  for k = 1, ..., m.

Suppose θ changes by dθ, and let dx and dµ be the amounts that x* and µ must change by for the first order conditions to still hold. These must satisfy

  Σ_{l=1}^n (∂^2f/∂x_j∂x_l - Σ_{k=1}^m µ_k ∂^2h_k/∂x_j∂x_l) dx_l + Σ_{r=1}^s (∂^2f/∂x_j∂θ_r - Σ_{k=1}^m µ_k ∂^2h_k/∂x_j∂θ_r) dθ_r - Σ_{k=1}^m dµ_k ∂h_k/∂x_j = 0

  Σ_{l=1}^n ∂h_k/∂x_l dx_l + Σ_{r=1}^s ∂h_k/∂θ_r dθ_r = 0,

where the first equation is for j = 1, ..., n and the second is for k = 1, ..., m. We have n + m equations to solve for the n + m unknowns dx and dµ. We can express this system of equations more compactly using matrices:

  [ D^2_x f - D^2_x(µ^T h)   -(D_x h)^T ] [ dx ]   [ -(D^2_{xθ} f - D^2_{xθ}(µ^T h)) dθ ]
  [ D_x h                      0        ] [ dµ ] = [ -D_θ h dθ                          ],

where D^2_x f is the n × n matrix of second partial derivatives of f with respect to x, D^2_{xθ}(µ^T h) is the n × s matrix of second partial derivatives of µ^T h with respect to combinations of x and θ, etc. Whether expressed using matrices or not, this system of equations is a bit unwieldy. Fortunately, in most applications there will be some simplification.
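Before looking at special cases, note that once the derivatives are written down, the system is easy to solve numerically. A sketch for the log-utility consumer of exercise 3.1, differentiated with respect to income y (Python; the parameter values are arbitrary assumptions with α_1 + α_2 = 1, so that the demands are x_i = α_i y/p_i and µ = 1/y):

    import numpy as np

    a, p, y = np.array([0.3, 0.7]), np.array([1.0, 2.0]), 10.0
    x = a * y / p          # demands for u = a1*log(x1) + a2*log(x2); mu = 1/y

    # Linearized FOCs: rows are d(FOC_1), d(FOC_2), d(budget); unknowns dx1, dx2, dmu
    A = np.array([[-a[0]/x[0]**2, 0.0,           -p[0]],
                  [0.0,           -a[1]/x[1]**2, -p[1]],
                  [p[0],          p[1],           0.0]])
    b = np.array([0.0, 0.0, 1.0])   # only the budget constraint shifts when y rises
    print(np.linalg.solve(A, b))    # dx/dy = (0.3, 0.35) = a/p, dmu/dy = -1/y**2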

For example, if there are no (binding) constraints, the system collapses to

  dx = -(D^2_x f)^{-1} D^2_{xθ} f dθ,

so that D_θ x* = -(D^2_x f)^{-1} D^2_{xθ} f. In other situations, the constraints might depend on θ, but s, m, and/or n might just be one or two. A similar result holds with inequality constraints, except at values of θ where changing θ changes which constraints bind. Such situations are rare, so we will not worry about them.

Example 4.2 (Production theory). Consider a competitive multi-product firm facing output prices p ∈ R^k and input prices w ∈ R^n. The firm's profits as a function of prices is

  π(p, w) = max_{y,x} p^T y - w^T x  s.t.  y - f(x) ≤ 0.

The first order conditions are

  p^T - λ^T = 0
  -w^T + λ^T Df_x = 0
  y - f(x) = 0.

Totally differentiating with respect to p, holding w constant, gives

  dp^T - dλ^T = 0
  dλ^T Df_x + dx^T D^2_x(λ^T f) = 0
  dy - Df_x dx = 0.

Combining, we can get

  dp^T dy = -dx^T D^2_x(λ^T f) dx.

Notice that if we assume the constraint binds and substitute it into the objective function, then the second order condition for this problem is that v^T D^2_x(p^T f) v < 0 for all v ≠ 0. If the second order condition holds, then, since λ = p, we must have -dx^T D^2_x(λ^T f) dx > 0. Therefore dp^T dy > 0: increasing output prices increases output. As an exercise, you could use similar reasoning to show that dw^T dx < 0: increasing input prices decreases input demand.

Example 4.3 (Contracting with productivity shocks). In class, we were going over the following example, but did not have time to finish, so I am including it here. The firm's problem, when the productivity shock θ is not contractible and is observed by the firm but not the worker, is

  max_{c(·),h(·)} Σ_{i∈{H,L}} q_i [θ_i f(h_i) - c_i]

  s.t.  Σ_{i∈{H,L}} q_i [U(c_i) + V(1 - h_i)] ≥ W_0        (1)
        θ_L f(h_H) - c_H ≤ θ_L f(h_L) - c_L                 (2)
        θ_H f(h_L) - c_L ≤ θ_H f(h_H) - c_H.                (3)

Which constraint binds? How do c_i and h_i compare with the first-best contract? Recall that the first-best contract (when θ is observed and contractible, so that constraints (2) and (3) are not present) had c_H = c_L and h_H > h_L, and in particular

  θ_i f'(h_i) = V'(1 - h_i)/U'(c_i).

With this contract, if θ is only known to the firm, the firm would always want to lie and declare that productivity is high. Therefore, we know that at least one of (2)-(3) must bind. Moreover, we should suspect that constraint (2) will bind. We can verify that this is the case by looking at the first order conditions. Let µ be the multiplier for (1), λ_L for (2), and λ_H for (3). The first order conditions are

  [c_L]: 0 = -q_L + µ q_L U'(c_L) - λ_L + λ_H
  [c_H]: 0 = -q_H + µ q_H U'(c_H) + λ_L - λ_H
  [h_L]: 0 = q_L θ_L f'(h_L) - µ q_L V'(1 - h_L) + λ_L θ_L f'(h_L) - λ_H θ_H f'(h_L)
  [h_H]: 0 = q_H θ_H f'(h_H) - µ q_H V'(1 - h_H) - λ_L θ_L f'(h_H) + λ_H θ_H f'(h_H).

Now, let's deduce which constraints must bind. The possible cases are:

(1) Both (2) and (3) slack. Then we have the first-best problem, but as noted above, the solution to the first-best problem violates (2), so this case is not possible.

(2) Both (2) and (3) bind. Then

  c_H - c_L = θ_L [f(h_H) - f(h_L)]   and   c_H - c_L = θ_H [f(h_H) - f(h_L)].

Assuming θ_H > θ_L, this implies c_H = c_L = c and h_H = h_L = h. The first order conditions for [c_i] then imply that

  λ_H - λ_L = q_L (1 - µU'(c)) = -q_H (1 - µU'(c)).

Since each q_i > 0, the only possibility is that 1 - µU'(c) = 0 and λ_H = λ_L = λ. However, then the first order conditions for [h_i] imply that

  µ V'(1 - h)/f'(h) = (q_L θ_L + λ(θ_L - θ_H))/q_L = (q_H θ_H + λ(θ_H - θ_L))/q_H,

which rearranges to

  (θ_H - θ_L)(1 + λ/q_L + λ/q_H) = 0,

but this is impossible since θ_H > θ_L, λ ≥ 0, q_L > 0 and q_H > 0.

(3) (3) binds, but (2) is slack. This is impossible. If (3) binds, then

  c_H - c_L = θ_H [f(h_H) - f(h_L)],

while (2) being slack means

  c_H - c_L > θ_L [f(h_H) - f(h_L)].

Subtracting one from the other gives (θ_H - θ_L)[f(h_H) - f(h_L)] > 0; since θ_H > θ_L and f is increasing, this implies h_H > h_L, and hence c_H > c_L. However, with λ_L = 0, the first order conditions for [c_L] and [c_H] imply µU'(c_L) = 1 - λ_H/q_L ≤ 1 ≤ 1 + λ_H/q_H = µU'(c_H), so U'(c_H) ≥ U'(c_L), and since U is concave, c_H ≤ c_L, a contradiction.

(4) (2) binds, but (3) is slack. This is the only possibility left. Now we have

  c_H - c_L = θ_L [f(h_H) - f(h_L)]   and   c_H - c_L < θ_H [f(h_H) - f(h_L)].

Since θ_H > θ_L and f is increasing, this implies h_H > h_L and c_H > c_L. Using the fact that λ_H = 0 and combining [c_L] and [h_L] gives

  θ_L f'(h_L) = V'(1 - h_L)/U'(c_L).   (4)

An interpretation of this is that when productivity is low, the marginal product of labor is set equal to the marginal rate of substitution between labor and consumption. The contract features an efficient choice of labor relative to consumption in the low productivity state. To see what happens in the high productivity state, combining [c_H] and [h_H] leads to

  θ_H f'(h_H) = ((q_H - λ_L)/(q_H - λ_L θ_L/θ_H)) V'(1 - h_H)/U'(c_H).   (5)

Also, note that q_H - λ_L = µ q_H U'(c_H) > 0, and θ_L < θ_H, so

  (q_H - λ_L θ_L/θ_H)/(q_H - λ_L) > 1.

Hence,

  θ_H f'(h_H) < V'(1 - h_H)/U'(c_H).   (6)

So, in the high productivity state, there is a distortion in the tradeoff between labor and consumption. This distortion is necessary to prevent the firm from wanting to lie about productivity. We could proceed further and try to eliminate λ_L from these expressions, but I do not think it leads to any more useful conclusions.

Appendix A. Notation

R is the set of real numbers. R^n is the set of vectors of n real numbers.
x ∈ R^k is read "x in R^k" and means that x is a vector of k real numbers.
f : R^n → R means f is a function from R^n to R. That is, f's argument is an n-tuple of real numbers and its output is a single real number.
U ⊆ R^n means U is a subset of R^n.

When convenient, we will treat x ∈ R^k as a k × 1 matrix, so x^T w = Σ_{i=1}^k w_i x_i.
N_δ(x) is a δ neighborhood of x, meaning the set of points within δ distance of x. For x ∈ R^n, we will use Euclidean distance, so that N_δ(x) is the set of y ∈ R^n such that sqrt(Σ_{i=1}^n (x_i - y_i)^2) < δ.

Appendix B. Review of derivatives

Partial and directional derivatives were discussed in the summer math review, so we will just briefly restate their definitions and some key facts here.

Definition B.1. Let f : R^n → R. The ith partial derivative of f is

  ∂f/∂x_i(x_0) = lim_{h→0} [f(x_{01}, ..., x_{0i} + h, ..., x_{0n}) - f(x_0)]/h.

The ith partial derivative tells you how much the function changes as its ith argument changes.

Definition B.2. Let f : R^n → R^k, and let v ∈ R^n. The directional derivative of f in direction v at x is

  df(x; v) = lim_{α→0} [f(x + αv) - f(x)]/α,

where α ∈ R is a scalar.

An important result relating partial to directional derivatives is the following.

Theorem B.1. Let f : R^n → R and suppose its partial derivatives exist and are continuous in a neighborhood of x_0. Then

  df(x_0; v) = Σ_{i=1}^n v_i ∂f/∂x_i(x_0);

in this case we will say that f is differentiable at x_0.

It is convenient to gather the partial derivatives of a function into a matrix. For a function f : R^n → R, we will gather its partial derivatives into a 1 × n matrix,

  Df_x = ( ∂f/∂x_1(x)  ...  ∂f/∂x_n(x) ).

We will simply call this matrix the derivative of f at x. This helps reduce notation because, for example, we can write

  df(x; v) = Σ_{i=1}^n v_i ∂f/∂x_i(x) = Df_x v.

Similarly, we can define second and higher order partial and directional derivatives.

Definition B.3. Let f : R^n → R. The ijth second partial derivative of f is

  ∂^2f/∂x_i∂x_j(x_0) = lim_{h→0} [∂f/∂x_i(x_{01}, ..., x_{0j} + h, ..., x_{0n}) - ∂f/∂x_i(x_0)]/h.
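Theorem B.1 is easy to verify with finite differences. A sketch (Python; the function f(x) = x_1 x_2^2 is an arbitrary example, not from the notes):

    import numpy as np

    f = lambda x: x[0] * x[1]**2
    x0 = np.array([1.5, 2.0])
    v = np.array([0.7, -0.3])
    h = 1e-6

    directional = (f(x0 + h*v) - f(x0 - h*v)) / (2*h)   # df(x0; v), by definition B.2
    partials = np.array([x0[1]**2, 2*x0[0]*x0[1]])      # analytic (df/dx1, df/dx2)
    print(directional, partials @ v)                    # both ~ 1.0, as theorem B.1 says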

Definition B.4. Let f : R^n → R^k, and let v, w ∈ R^n. The second directional derivative in directions v and w at x is

  d^2f(x; v, w) = lim_{α→0} [df(x + αw; v) - df(x; v)]/α,

where α ∈ R is a scalar.

Theorem B.2. Let f : R^n → R and suppose its first and second partial derivatives exist and are continuous in a neighborhood of x_0. Then

  d^2f(x_0; v, w) = Σ_{j=1}^n Σ_{i=1}^n v_i ∂^2f/∂x_i∂x_j(x_0) w_j;

in this case we will say that f is twice differentiable at x_0. Additionally, if f is twice differentiable, then ∂^2f/∂x_i∂x_j = ∂^2f/∂x_j∂x_i.

We can gather the second partials of f into an n × n matrix,

  D^2f_x = [ ∂^2f/∂x_1^2(x)      ...  ∂^2f/∂x_1∂x_n(x)
             ...                  ...  ...
             ∂^2f/∂x_n∂x_1(x)    ...  ∂^2f/∂x_n^2(x) ],

and then write

  d^2f(x; v, w) = v^T D^2f_x w.

D^2f_x is also called the Hessian of f.
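Similarly, d^2f(x; v, w) = v^T D^2f_x w and the symmetry of the Hessian can be checked numerically (a Python sketch, using the same example function as above):

    import numpy as np

    f = lambda x: x[0] * x[1]**2
    x0 = np.array([1.5, 2.0])
    v, w = np.array([1.0, 0.0]), np.array([0.0, 1.0])
    h = 1e-4

    df = lambda x, d: (f(x + h*d) - f(x - h*d)) / (2*h)   # first directional derivative
    d2 = (df(x0 + h*w, v) - df(x0 - h*w, v)) / (2*h)      # d2f(x0; v, w)
    H = np.array([[0.0, 2*x0[1]],
                  [2*x0[1], 2*x0[0]]])                    # analytic Hessian, symmetric
    print(d2, v @ H @ w)                                  # both ~ 4.0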

References

Carter, Michael. 2001. Foundations of Mathematical Economics. Cambridge, MA: MIT Press.

De la Fuente, Angel. 2000. Mathematical Methods and Models for Economists. Cambridge: Cambridge University Press.

Dixit, Avinash K. 1990. Optimization in Economic Theory. Oxford: Oxford University Press.

Engel, Ernst. 1857. "Die Productions- und Consumptionsverhaeltnisse des Koenigsreichs Sachsen." Zeitschrift des Statistischen Bureaus des Koniglich Sachsischen Ministeriums des Inneren (8-9).

Lewbel, Arthur. 2008. "Engel curve." In The New Palgrave Dictionary of Economics, edited by Steven N. Durlauf and Lawrence E. Blume. Basingstoke: Palgrave Macmillan. URL http://www.dictionaryofeconomics.com/article?id=pde2008_E000085.

Simon, Carl P. and Lawrence Blume. 1994. Mathematics for Economists. New York: Norton.