Chapter 9

Lagrange multipliers. Portfolio optimization.

The Lagrange multipliers method for finding constrained extrema of multivariable functions.

9.1 Lagrange multipliers

Optimization problems often require finding extrema of multivariable functions subject to various constraints. One method to solve such problems is by using Lagrange multipliers, as outlined below.

Let $U \subset \mathbb{R}^n$ be an open set, and let $f : U \to \mathbb{R}$ be a smooth function, e.g., infinitely many times differentiable. We want to find the extrema of $f(x)$ subject to $m$ constraints given by $g(x) = 0$, where $g : U \to \mathbb{R}^m$ is a smooth function, i.e.,

Find $x_0 \in U$ such that
\[
\max_{\substack{g(x)=0 \\ x \in U}} f(x) = f(x_0)
\quad \text{or} \quad
\min_{\substack{g(x)=0 \\ x \in U}} f(x) = f(x_0).
\tag{9.1}
\]

Problem (9.1) is called a constrained optimization problem. For this problem to be well posed, a natural assumption is that the number of constraints is smaller than the number of degrees of freedom, i.e., $m < n$.

A point $x_0 \in U$ satisfying (9.1) is called a constrained extremum point of the function $f(x)$ with respect to the constraint function $g(x)$.

To solve the constrained optimization problem (9.1), let $\lambda = (\lambda_i)_{i=1:m}$ be a column vector of the same size, $m$, as the number of constraints; $\lambda$ is called the Lagrange multipliers vector. The Lagrangian associated to problem (9.1) is the function $F : U \times \mathbb{R}^m \to \mathbb{R}$ given by
\[
F(x, \lambda) = f(x) + \lambda^t g(x).
\tag{9.2}
\]
If $g(x) = \left( g_1(x), \ldots, g_m(x) \right)^t$, then $F(x, \lambda)$ can be written explicitly as
\[
F(x, \lambda) = f(x) + \sum_{i=1}^{m} \lambda_i g_i(x).
\]

Example: For a better understanding of the Lagrange multipliers method, in parallel with presenting the general theory, we will solve the following constrained optimization problem:

Find the maximum and minimum values of the function $f(x_1, x_2, x_3) = 4x_2 - 2x_3$ subject to the constraints $2x_1 = x_2 + x_3$ and $x_1^2 + x_2^2 = 13$.

We reformulate the problem by introducing the functions $f : \mathbb{R}^3 \to \mathbb{R}$ and $g : \mathbb{R}^3 \to \mathbb{R}^2$ given by
\[
f(x) = 4x_2 - 2x_3; \qquad
g(x) = \begin{pmatrix} 2x_1 - x_2 - x_3 \\ x_1^2 + x_2^2 - 13 \end{pmatrix},
\tag{9.3}
\]
where $x = (x_1, x_2, x_3)$. We want to find $x_0 \in \mathbb{R}^3$ such that
\[
\max_{\substack{g(x)=0 \\ x \in \mathbb{R}^3}} f(x) = f(x_0)
\quad \text{or} \quad
\min_{\substack{g(x)=0 \\ x \in \mathbb{R}^3}} f(x) = f(x_0).
\]

Let $\lambda = \begin{pmatrix} \lambda_1 \\ \lambda_2 \end{pmatrix}$ be the Lagrange multiplier. The Lagrangian associated to this problem is
\[
F(x, \lambda) = f(x) + \lambda_1 g_1(x) + \lambda_2 g_2(x)
= 4x_2 - 2x_3 + \lambda_1 (2x_1 - x_2 - x_3) + \lambda_2 (x_1^2 + x_2^2 - 13).
\tag{9.4}
\]

In the Lagrange multipliers method, the constrained extremum point $x_0$ is found by identifying the critical points of the Lagrangian $F(x, \lambda)$. For the method to work, the following necessary condition must be satisfied:

The gradient $\nabla g(x)$ has full rank at any point $x$ where the constraint $g(x) = 0$ is satisfied, i.e.,
\[
\mathrm{rank}(\nabla g(x)) = m, \quad \forall\, x \in U \ \text{such that} \ g(x) = 0.
\tag{9.5}
\]

Example (continued): In order to use the Lagrange multipliers method for our example, we first check that $\mathrm{rank}(\nabla g(x)) = 2$ for any $x \in \mathbb{R}^3$ such that $g(x) = 0$.

Recall that the function $g(x)$ is given by (9.3). Then,
\[
\nabla g(x) = \begin{pmatrix} 2 & -1 & -1 \\ 2x_1 & 2x_2 & 0 \end{pmatrix}.
\]
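Note that $\mathrm{rank}(\nabla g(x)) = 1$ if and only if $x_1 = x_2 = 0$. Also, $g(x) = 0$ if and only if $2x_1 - x_2 - x_3 = 0$ and $x_1^2 + x_2^2 = 13$. However, if $x_1^2 + x_2^2 = 13$, then it is not possible to have $x_1 = x_2 = 0$. We conclude that, if $g(x) = 0$, then $\mathrm{rank}(\nabla g(x)) = 2$.

The gradient$^1$ of $F(x, \lambda)$ with respect to both $x$ and $\lambda$ will be denoted by $\nabla_{(x,\lambda)} F(x, \lambda)$ and is the following row vector:
\[
\nabla_{(x,\lambda)} F(x, \lambda) = \left( \nabla_x F(x, \lambda) \ \ \nabla_\lambda F(x, \lambda) \right);
\tag{9.6}
\]
cf. (1.38). It is easy to see that
\[
\frac{\partial F}{\partial x_j}(x, \lambda) = \frac{\partial f}{\partial x_j}(x) + \sum_{i=1}^{m} \lambda_i \frac{\partial g_i}{\partial x_j}(x), \quad \forall\, j = 1:n;
\tag{9.7}
\]
\[
\frac{\partial F}{\partial \lambda_i}(x, \lambda) = g_i(x), \quad \forall\, i = 1:m.
\tag{9.8}
\]

Denote by $\nabla f(x)$ and $\nabla g(x)$ the gradients of $f : U \to \mathbb{R}$ and $g : U \to \mathbb{R}^m$, i.e.,
\[
\nabla f(x) = \left( \frac{\partial f}{\partial x_1}(x) \ \ \ldots \ \ \frac{\partial f}{\partial x_n}(x) \right);
\tag{9.9}
\]
\[
\nabla g(x) = \begin{pmatrix}
\frac{\partial g_1}{\partial x_1}(x) & \frac{\partial g_1}{\partial x_2}(x) & \cdots & \frac{\partial g_1}{\partial x_n}(x) \\
\frac{\partial g_2}{\partial x_1}(x) & \frac{\partial g_2}{\partial x_2}(x) & \cdots & \frac{\partial g_2}{\partial x_n}(x) \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial g_m}{\partial x_1}(x) & \frac{\partial g_m}{\partial x_2}(x) & \cdots & \frac{\partial g_m}{\partial x_n}(x)
\end{pmatrix};
\tag{9.10}
\]
cf. (1.38) and (1.40), respectively. Then, from (9.7)-(9.10), it follows that
\[
\nabla_x F(x, \lambda) = \left( \frac{\partial F}{\partial x_1}(x, \lambda) \ \ \ldots \ \ \frac{\partial F}{\partial x_n}(x, \lambda) \right) = \nabla f(x) + \lambda^t (\nabla g(x));
\tag{9.11}
\]
\[
\nabla_\lambda F(x, \lambda) = \left( \frac{\partial F}{\partial \lambda_1}(x, \lambda) \ \ \ldots \ \ \frac{\partial F}{\partial \lambda_m}(x, \lambda) \right) = (g(x))^t.
\tag{9.12}
\]
From (9.6), (9.11), and (9.12), we conclude that
\[
\nabla_{(x,\lambda)} F(x, \lambda) = \left( \nabla f(x) + \lambda^t (\nabla g(x)) \ \ \ (g(x))^t \right).
\tag{9.13}
\]

The following theorem gives necessary conditions for a point $x_0 \in U$ to be a constrained extremum point for $f(x)$. Its proof involves the Inverse Function Theorem and is beyond the scope of this book.

$^1$ In this section, we use the notation $\nabla F$, instead of $DF$, for the gradient of $F$.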
Theorem 9.1. Assume that the constraint function $g(x)$ satisfies the condition (9.5). If $x_0 \in U$ is a constrained extremum point of $f(x)$ with respect to the constraint $g(x) = 0$, then there exists a Lagrange multiplier $\lambda_0 \in \mathbb{R}^m$ such that the point $(x_0, \lambda_0)$ is a critical point for the Lagrangian function $F(x, \lambda)$, i.e.,
\[
\nabla_{(x,\lambda)} F(x_0, \lambda_0) = 0.
\tag{9.14}
\]

We note that $\nabla_{(x,\lambda)} F(x, \lambda)$ is a function from $\mathbb{R}^{m+n}$ into $\mathbb{R}^{m+n}$. Thus, solving (9.14) to find the critical points of $F(x, \lambda)$ requires, in general, using the N-dimensional Newton's method for solving nonlinear equations; see section 5.2.1 for details. Interestingly enough, for some financial applications such as finding minimum variance portfolios, problem (9.14) is a linear system which can be solved without using Newton's method; see section 9.3 for details.

Example (continued): Since we already checked that the condition (9.5) is satisfied for this example, we proceed to find the critical points of the Lagrangian $F(x, \lambda)$. Using formula (9.4) for $F(x, \lambda)$, it is easy to see that
\[
\nabla_{(x,\lambda)} F(x, \lambda) = \begin{pmatrix}
2\lambda_1 + 2\lambda_2 x_1 \\
4 - \lambda_1 + 2\lambda_2 x_2 \\
-2 - \lambda_1 \\
2x_1 - x_2 - x_3 \\
x_1^2 + x_2^2 - 13
\end{pmatrix}^t.
\]

Let $x_0 = (x_{0,1}, x_{0,2}, x_{0,3})$ and $\lambda_0 = (\lambda_{0,1}, \lambda_{0,2})$. Then, solving $\nabla_{(x,\lambda)} F(x_0, \lambda_0) = 0$ is equivalent to solving the following system:
\[
\begin{cases}
2\lambda_{0,1} + 2\lambda_{0,2} x_{0,1} = 0 \\
4 - \lambda_{0,1} + 2\lambda_{0,2} x_{0,2} = 0 \\
-2 - \lambda_{0,1} = 0 \\
2x_{0,1} - x_{0,2} - x_{0,3} = 0 \\
x_{0,1}^2 + x_{0,2}^2 - 13 = 0
\end{cases}
\tag{9.15}
\]

From the third equation of (9.15), we find that $\lambda_{0,1} = -2$. Then, the system (9.15) can be written as
\[
\begin{cases}
\lambda_{0,2} x_{0,1} = 2 \\
\lambda_{0,2} x_{0,2} = -3 \\
x_{0,3} = 2x_{0,1} - x_{0,2} \\
x_{0,1}^2 + x_{0,2}^2 = 13
\end{cases}
\tag{9.16}
\]

Since $\lambda_{0,2} \neq 0$, we find from (9.16) that
\[
x_{0,1} = \frac{2}{\lambda_{0,2}}; \quad
x_{0,2} = -\frac{3}{\lambda_{0,2}}; \quad
x_{0,3} = \frac{7}{\lambda_{0,2}}; \quad
x_{0,1}^2 + x_{0,2}^2 = \frac{13}{\lambda_{0,2}^2} = 13.
\]

Thus, $\lambda_{0,2}^2 = 1$ and the system (9.15) has two solutions, one corresponding to $\lambda_{0,2} = 1$, and another one corresponding to $\lambda_{0,2} = -1$, as follows:
\[
x_{0,1} = 2; \quad x_{0,2} = -3; \quad x_{0,3} = 7; \quad \lambda_{0,1} = -2; \quad \lambda_{0,2} = 1
\tag{9.17}
\]
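and
\[
x_{0,1} = -2; \quad x_{0,2} = 3; \quad x_{0,3} = -7; \quad \lambda_{0,1} = -2; \quad \lambda_{0,2} = -1.
\tag{9.18}
\]

From (9.17) and (9.18), we conclude that the Lagrangian $F(x, \lambda)$ has the following two critical points:
\[
x_0 = \begin{pmatrix} 2 \\ -3 \\ 7 \end{pmatrix}; \quad
\lambda_0 = \begin{pmatrix} -2 \\ 1 \end{pmatrix}
\tag{9.19}
\]
and
\[
x_0 = \begin{pmatrix} -2 \\ 3 \\ -7 \end{pmatrix}; \quad
\lambda_0 = \begin{pmatrix} -2 \\ -1 \end{pmatrix}.
\tag{9.20}
\]

Finding sufficient conditions for a critical point $(x_0, \lambda_0)$ of the Lagrangian $F(x, \lambda)$ to correspond to a constrained extremum point $x_0$ for $f(x)$ is somewhat more complicated (and rather rarely checked in practice).

Consider the function $F_0 : U \to \mathbb{R}$ given by
\[
F_0(x) = F(x, \lambda_0) = f(x) + \lambda_0^t g(x).
\]
Let $D^2 F_0(x_0)$ be the Hessian of $F_0(x)$ evaluated at the point $x_0$, i.e.,
\[
D^2 F_0(x_0) = \begin{pmatrix}
\frac{\partial^2 F_0}{\partial x_1^2}(x_0) & \frac{\partial^2 F_0}{\partial x_2 \partial x_1}(x_0) & \cdots & \frac{\partial^2 F_0}{\partial x_n \partial x_1}(x_0) \\
\frac{\partial^2 F_0}{\partial x_1 \partial x_2}(x_0) & \frac{\partial^2 F_0}{\partial x_2^2}(x_0) & \cdots & \frac{\partial^2 F_0}{\partial x_n \partial x_2}(x_0) \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial^2 F_0}{\partial x_1 \partial x_n}(x_0) & \frac{\partial^2 F_0}{\partial x_2 \partial x_n}(x_0) & \cdots & \frac{\partial^2 F_0}{\partial x_n^2}(x_0)
\end{pmatrix}.
\tag{9.21}
\]
Note that $D^2 F_0(x_0)$ is an $n \times n$ matrix.

Let $q(v)$ be the quadratic form associated to the matrix $D^2 F_0(x_0)$, i.e.,
\[
q(v) = v^t D^2 F_0(x_0)\, v = \sum_{1 \leq i,j \leq n} \frac{\partial^2 F_0}{\partial x_i \partial x_j}(x_0)\, v_i v_j,
\tag{9.22}
\]
where $v = (v_i)_{i=1:n}$.

We restrict our attention to the vectors $v$ satisfying $\nabla g(x_0)\, v = 0$, i.e., to the vector space
\[
V_0 = \{ v \in \mathbb{R}^n \mid \nabla g(x_0)\, v = 0 \}.
\tag{9.23}
\]

Note that $\nabla g(x_0)$ is a matrix with $m$ rows and $n$ columns, where $m < n$; cf. (9.10). If the condition (9.5) is satisfied, it follows that $\mathrm{rank}(\nabla g(x_0)) = m$, i.e., the matrix $\nabla g(x_0)$ has $m$ linearly independent columns. Assume, without losing any generality, that the first $m$ columns of $\nabla g(x_0)$ are linearly independent.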
By solving the linear system $\nabla g(x_0)\, v = 0$, we obtain that the entries $v_1, v_2, \ldots, v_m$ of the vector $v$ can be written as linear combinations of $v_{m+1}, v_{m+2}, \ldots, v_n$, the other $n - m$ entries of $v$. Let
\[
v_{red} = \begin{pmatrix} v_{m+1} \\ \vdots \\ v_n \end{pmatrix}.
\]
Then, by restricting $q(v)$ to the vector space $V_0$, we can write $q(v)$ as a quadratic form depending only on the entries of the vector $v_{red}$, i.e.,
\[
q(v) = q_{red}(v_{red}) = \sum_{m+1 \leq i,j \leq n} q_{red}(i,j)\, v_i v_j, \quad \forall\, v \in V_0.
\tag{9.24}
\]

Example (continued): We compute the reduced quadratic forms corresponding to the critical points of the Lagrangian function from our example. Recall that there are two such critical points, given by (9.19) and (9.20), respectively.

The first critical point of $F(x, \lambda)$ is $x_0 = (2, -3, 7)$ and $\lambda_0 = (-2, 1)$; cf. (9.19). The function $F_0(x) = f(x) + \lambda_0^t g(x)$ is equal to
\[
F_0(x) = 4x_2 - 2x_3 + (-2) \cdot (2x_1 - x_2 - x_3) + 1 \cdot (x_1^2 + x_2^2 - 13)
= x_1^2 + x_2^2 - 4x_1 + 6x_2 - 13.
\tag{9.25}
\]
Recall that the gradient of the constraint function $g(x)$ is
\[
\nabla g(x) = \begin{pmatrix} 2 & -1 & -1 \\ 2x_1 & 2x_2 & 0 \end{pmatrix}.
\tag{9.26}
\]
From (9.25) and (9.26), we find that
\[
D^2 F_0(x_0) = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 0 \end{pmatrix}
\quad \text{and} \quad
\nabla g(x_0) = \begin{pmatrix} 2 & -1 & -1 \\ 4 & -6 & 0 \end{pmatrix}.
\]
If $v = (v_i)_{i=1:3}$, then
\[
q(v) = v^t D^2 F_0(x_0)\, v = 2v_1^2 + 2v_2^2,
\]
and the condition $\nabla g(x_0)\, v = 0$ is equivalent to
\[
\begin{cases} 2v_1 - v_2 - v_3 = 0 \\ 4v_1 - 6v_2 = 0 \end{cases}
\iff
\begin{cases} v_3 = 2v_1 - v_2 \\ v_1 = \frac{3}{2} v_2 \end{cases}
\iff
\begin{cases} v_3 = 2v_2; \\ v_1 = \frac{3}{2} v_2. \end{cases}
\]
Let $v_{red} = v_2$. The reduced quadratic form $q_{red}(v_{red})$ is
\[
q_{red}(v_{red}) = 2v_1^2 + 2v_2^2 = \frac{13}{2} v_2^2.
\tag{9.27}
\]
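The reduction leading to (9.27) can be reproduced symbolically. The following is a minimal sketch, assuming the SymPy library is available; the variable names (`hessian`, `grad_g`, `q_red`) are illustrative and not part of the text. It solves $\nabla g(x_0)\, v = 0$ for $v_1$ and $v_3$ in terms of $v_2$ and substitutes the result into $q(v)$.

```python
import sympy as sp

v1, v2, v3 = sp.symbols("v1 v2 v3", real=True)
v = sp.Matrix([v1, v2, v3])

# Hessian of F_0 and gradient of g at the first critical point x_0 = (2, -3, 7).
hessian = sp.Matrix([[2, 0, 0], [0, 2, 0], [0, 0, 0]])
grad_g = sp.Matrix([[2, -1, -1], [4, -6, 0]])

# Quadratic form q(v) = v^t D^2F_0(x_0) v.
q = sp.expand((v.T * hessian * v)[0, 0])

# Solve grad_g * v = 0 for v1 and v3 in terms of the free entry v2.
constraint_solution = sp.solve(list(grad_g * v), [v1, v3], dict=True)[0]

# Restrict q to the space V_0 by substituting the solution into q.
q_red = sp.factor(q.subs(constraint_solution))
print(constraint_solution)
print(q_red)
```

Choosing $v_2$ as the free entry mirrors the choice $v_{red} = v_2$ made in the example; a different free entry would give a different but equivalent reduced form.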
The second critical point of $F(x, \lambda)$ is $x_0 = (-2, 3, -7)$ and $\lambda_0 = (-2, -1)$; cf. (9.20). The function $F_0(x) = f(x) + \lambda_0^t g(x)$ is equal to
\[
F_0(x) = -x_1^2 - x_2^2 - 4x_1 + 6x_2 + 13.
\tag{9.28}
\]
From (9.28) and (9.26), we find that
\[
D^2 F_0(x_0) = \begin{pmatrix} -2 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 0 \end{pmatrix}
\quad \text{and} \quad
\nabla g(x_0) = \begin{pmatrix} 2 & -1 & -1 \\ -4 & 6 & 0 \end{pmatrix}.
\]
If $v = (v_i)_{i=1:3}$, then
\[
q(v) = v^t D^2 F_0(x_0)\, v = -2v_1^2 - 2v_2^2,
\]
and the condition $\nabla g(x_0)\, v = 0$ is equivalent to
\[
\begin{cases} 2v_1 - v_2 - v_3 = 0 \\ -4v_1 + 6v_2 = 0 \end{cases}
\iff
\begin{cases} v_3 = 2v_1 - v_2 \\ v_1 = \frac{3}{2} v_2 \end{cases}
\iff
\begin{cases} v_3 = 2v_2; \\ v_1 = \frac{3}{2} v_2. \end{cases}
\]
Let $v_{red} = v_2$. The reduced quadratic form $q_{red}(v_{red})$ is
\[
q_{red}(v_{red}) = -2v_1^2 - 2v_2^2 = -\frac{13}{2} v_2^2.
\tag{9.29}
\]

Whether the point $x_0$ is a constrained extremum for $f(x)$ will depend on the nonzero quadratic form $q_{red}$ being either positive semidefinite, i.e.,
\[
q_{red}(v_{red}) \geq 0, \quad \forall\, v_{red} \in \mathbb{R}^{n-m},
\]
or negative semidefinite, i.e.,
\[
q_{red}(v_{red}) \leq 0, \quad \forall\, v_{red} \in \mathbb{R}^{n-m}.
\]

Theorem 9.2. Assume that the constraint function $g(x)$ satisfies condition (9.5). Let $x_0 \in U \subset \mathbb{R}^n$ and $\lambda_0 \in \mathbb{R}^m$ such that the point $(x_0, \lambda_0)$ is a critical point for the Lagrangian function $F(x, \lambda) = f(x) + \lambda^t g(x)$.

If the reduced quadratic form (9.24) corresponding to the point $(x_0, \lambda_0)$ is positive semidefinite, then $x_0$ is a constrained minimum for $f(x)$ with respect to the constraint $g(x) = 0$.

If the reduced quadratic form (9.24) corresponding to the point $(x_0, \lambda_0)$ is negative semidefinite, then $x_0$ is a constrained maximum for $f(x)$ with respect to the constraint $g(x) = 0$.

If the reduced quadratic form (9.24) corresponding to $(x_0, \lambda_0)$ is nonzero, and it is not positive semidefinite nor negative semidefinite, then $x_0$ is not a constrained extremum point for $f(x)$ with respect to the constraint $g(x) = 0$.
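The test in Theorem 9.2 can also be carried out numerically. The sketch below is one possible implementation, assuming NumPy; the function name `classify_critical_point` and the tolerance `tol` are hypothetical choices, not part of the text. Instead of eliminating the first $m$ entries of $v$ by hand, it builds an orthonormal basis of $V_0$ from the singular value decomposition of $\nabla g(x_0)$; the signs of the eigenvalues of the restricted form do not depend on which basis of $V_0$ is used. It is a conservative variant: it reports a minimum or a maximum only when the reduced form is strictly definite, and returns "inconclusive" in borderline cases.

```python
import numpy as np

def classify_critical_point(hessian, grad_g, tol=1e-10):
    """Classify x_0 from D^2F_0(x_0) (n x n) and grad g(x_0) (m x n)."""
    hessian = np.asarray(hessian, dtype=float)
    grad_g = np.atleast_2d(np.asarray(grad_g, dtype=float))

    # Orthonormal basis of V_0 = {v : grad_g v = 0}: the rows of vt beyond
    # the rank of grad_g span its null space.
    _, singular_values, vt = np.linalg.svd(grad_g)
    rank = int(np.sum(singular_values > tol))
    basis = vt[rank:].T

    # Matrix of the quadratic form q restricted to V_0, in that basis.
    reduced = basis.T @ hessian @ basis
    eigenvalues = np.linalg.eigvalsh(reduced)

    if np.all(eigenvalues > tol):
        return "constrained minimum"
    if np.all(eigenvalues < -tol):
        return "constrained maximum"
    if eigenvalues.min() < -tol and eigenvalues.max() > tol:
        return "not a constrained extremum"
    return "inconclusive"

# Second critical point of the running example, x_0 = (-2, 3, -7).
result = classify_critical_point(
    [[-2, 0, 0], [0, -2, 0], [0, 0, 0]],
    [[2, -1, -1], [-4, 6, 0]],
)
print(result)
```

For this point the restricted form has a single negative eigenvalue, in agreement with the sign of (9.29).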
Example (continued): We use Theorem 9.2 to classify the critical points of the Lagrangian function corresponding to our example.

Recall from (9.27) that the reduced quadratic form corresponding to the critical point $x_0 = (2, -3, 7)$ and $\lambda_0 = (-2, 1)$ of $F(x, \lambda)$ is $q_{red}(v_{red}) = \frac{13}{2} v_2^2$, which is positive semidefinite, since $q_{red}(v_{red}) \geq 0$ for any $v_2 \in \mathbb{R}$. Using Theorem 9.2, we conclude that the point $(2, -3, 7)$ is a minimum point for $f(x)$. By direct computation, we find that $f(2, -3, 7) = -26$.

Recall from (9.29) that the reduced quadratic form corresponding to the critical point $x_0 = (-2, 3, -7)$ and $\lambda_0 = (-2, -1)$ of $F(x, \lambda)$ is $q_{red}(v_{red}) = -\frac{13}{2} v_2^2$, which is negative semidefinite, since $q_{red}(v_{red}) \leq 0$ for any $v_2 \in \mathbb{R}$. Using Theorem 9.2, we conclude that the point $(-2, 3, -7)$ is a maximum point for $f(x)$. By direct computation, we find that $f(-2, 3, -7) = 26$.

It is important to note that the solution to the constrained optimization problem presented in Theorem 9.2 is significantly simpler, and does not involve computing the reduced quadratic form (9.24), if the matrix $D^2 F_0(x_0)$ is either positive definite or negative definite.

Corollary 9.1. Assume that the constraint function $g(x)$ satisfies condition (9.5). Let $x_0 \in U \subset \mathbb{R}^n$ and $\lambda_0 \in \mathbb{R}^m$ such that the point $(x_0, \lambda_0)$ is a critical point for the Lagrangian function $F(x, \lambda) = f(x) + \lambda^t g(x)$. Let $F_0(x) = f(x) + \lambda_0^t g(x)$, and let $D^2 F_0(x_0)$ be the Hessian of $F_0$ evaluated at the point $x_0$.

If the matrix $D^2 F_0(x_0)$ is positive definite, i.e., if all the eigenvalues of the matrix $D^2 F_0(x_0)$ are strictly greater than 0, then $x_0$ is a constrained minimum for $f(x)$ with respect to the constraint $g(x) = 0$.

If the matrix $D^2 F_0(x_0)$ is negative definite, i.e., if all the eigenvalues of the matrix $D^2 F_0(x_0)$ are strictly less than 0, then $x_0$ is a constrained maximum for $f(x)$ with respect to the constraint $g(x) = 0$.

The result of Corollary 9.1 will be used in sections 9.3 and 9.4 to find minimum variance portfolios and maximum return portfolios, respectively.
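The eigenvalue condition of Corollary 9.1 is easy to check numerically. The sketch below assumes NumPy; the function name `corollary_test`, the tolerance `tol`, and the $2 \times 2$ matrix in the second call are hypothetical and chosen only for illustration. The first call uses the Hessian $D^2 F_0(x_0)$ from the first critical point of the running example, which has a zero eigenvalue, so the corollary does not apply there and the reduced quadratic form is needed.

```python
import numpy as np

def corollary_test(hessian, tol=1e-10):
    """Apply the eigenvalue test of Corollary 9.1 to D^2F_0(x_0).

    Assumes (x_0, lambda_0) is already known to be a critical point of the
    Lagrangian and that condition (9.5) holds.
    """
    eigenvalues = np.linalg.eigvalsh(np.asarray(hessian, dtype=float))
    if np.all(eigenvalues > tol):
        return "constrained minimum"
    if np.all(eigenvalues < -tol):
        return "constrained maximum"
    return "corollary does not apply"

# Running example, first critical point: eigenvalues 0, 2, 2.
running_example = corollary_test([[2, 0, 0], [0, 2, 0], [0, 0, 0]])

# Hypothetical positive definite Hessian: eigenvalues 1 and 3.
illustration = corollary_test([[2, 1], [1, 2]])

print(running_example)
print(illustration)
```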
Summarizing, the steps required to solve a constrained optimization problem using Lagrange multipliers are:

Step 1: Check that $\mathrm{rank}(\nabla g(x)) = m$ for all $x$ such that $g(x) = 0$.

Step 2: Find $(x_0, \lambda_0) \in U \times \mathbb{R}^m$ such that $\nabla_{(x,\lambda)} F(x_0, \lambda_0) = 0$.

Step 3.1: Compute $q(v) = v^t D^2 F_0(x_0)\, v$, where $F_0(x) = f(x) + \lambda_0^t g(x)$.

Step 3.2: Compute $q_{red}(v_{red})$ by restricting $q(v)$ to the vectors $v$ satisfying the condition $\nabla g(x_0)\, v = 0$. Decide whether $q_{red}(v_{red})$ is positive semidefinite or negative semidefinite.
Step 4: Use Theorem 9.2 to decide whether $x_0$ is a constrained minimum point or a constrained maximum point.

Note: If the matrix $D^2 F_0(x_0)$ is either positive definite or negative definite, skip Step 3.2 and go from Step 3.1 to the following version of Step 4:

Step 4: Use Corollary 9.1 to decide whether $x_0$ is a constrained minimum point or a constrained maximum point.

Several examples of solving constrained optimization problems using Lagrange multipliers are given in section 9.1.1; the steps above are outlined for each example. Applications of the Lagrange multipliers method to portfolio optimization problems are presented in sections 9.3 and 9.4.

9.1.1 Examples

The first example below illustrates the fact that, if condition (9.5) is not satisfied, then Theorem 9.1 may not hold.

Example: Find the minimum value of $x_1^2 + x_1 + x_2^2$, subject to the constraint $(x_1 - x_2)^2 = 0$.

Answer: This problem can be solved without using Lagrange multipliers as follows: If $(x_1 - x_2)^2 = 0$, then $x_1 = x_2$. Then, the function to minimize becomes
\[
x_1^2 + x_1 + x_2^2 = 2x_1^2 + x_1 = 2 \left( x_1 + \frac{1}{4} \right)^2 - \frac{1}{8},
\]
which achieves its minimum when $x_1 = -\frac{1}{4}$. We conclude that there exists a unique constrained minimum point $x_1 = x_2 = -\frac{1}{4}$, and that the corresponding minimum value is $-\frac{1}{8}$.

Although we solved the problem directly, we attempt to find an alternative solution using the Lagrange multipliers method. According to the framework outlined in section 9.1, we want to find the minimum of the function
\[
f(x_1, x_2) = x_1^2 + x_1 + x_2^2
\]
for $(x_1, x_2) \in \mathbb{R}^2$, such that the constraint $g(x_1, x_2) = 0$ is satisfied, where
\[
g(x_1, x_2) = (x_1 - x_2)^2.
\]
By definition,
\[
\nabla g(x) = \left( \frac{\partial g}{\partial x_1} \ \ \frac{\partial g}{\partial x_2} \right) = \left( 2(x_1 - x_2) \ \ -2(x_1 - x_2) \right).
\]
Note that $g(x) = 0$ if and only if $x_1 = x_2$. Thus, $\nabla g(x) = (0 \ \ 0)$ (and therefore $\mathrm{rank}(\nabla g(x)) = 0$) for all $x$ such that $g(x) = 0$. We conclude that condition (9.5) is not satisfied at any point $x$ such that $g(x) = 0$.
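The failure of condition (9.5) in this example can be confirmed symbolically. The following is a small sketch, assuming SymPy; the variable names are illustrative. It computes $\nabla g(x)$, restricts it to the constraint set $x_1 = x_2$, and evaluates the rank there.

```python
import sympy as sp

x1, x2 = sp.symbols("x1 x2", real=True)

# Constraint function g(x1, x2) = (x1 - x2)^2 and its gradient (a row vector).
g = (x1 - x2) ** 2
grad_g = sp.Matrix([[sp.diff(g, x1), sp.diff(g, x2)]])

# On the constraint set g(x) = 0 we have x2 = x1.
grad_on_constraint = grad_g.subs(x2, x1)
rank_on_constraint = grad_on_constraint.rank()

print(grad_g)
print(grad_on_constraint, rank_on_constraint)
```

The gradient vanishes identically on the constraint set, so its rank there is 0 rather than the required $m = 1$.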
Since the problem has one constraint, we only have one Lagrange multiplier, which we denote by $\lambda \in \mathbb{R}$. From (9.2), it follows that the Lagrangian is
\[
F(x_1, x_2, \lambda) = f(x_1, x_2) + \lambda g(x_1, x_2) = x_1^2 + x_1 + x_2^2 + \lambda (x_1 - x_2)^2.
\]
The gradient $\nabla_{(x,\lambda)} F(x, \lambda)$ is computed as in (9.13) and has the form
\[
\nabla_{(x,\lambda)} F(x, \lambda) = \left( 2x_1 + 1 + 2\lambda (x_1 - x_2) \ \ \ 2x_2 - 2\lambda (x_1 - x_2) \ \ \ (x_1 - x_2)^2 \right).
\]
Finding the critical points for $F(x, \lambda)$ requires solving $\nabla_{(x,\lambda)} F(x, \lambda) = 0$, which is equivalent to the following system of equations:
\[
\begin{cases}
2x_1 + 1 + 2\lambda (x_1 - x_2) = 0; \\
2x_2 - 2\lambda (x_1 - x_2) = 0; \\
(x_1 - x_2)^2 = 0.
\end{cases}
\]
This system does not have a solution: from the third equation, we obtain that $x_1 = x_2$. Then, from the second equation, we find that $x_2 = 0$, which implies that $x_1 = 0$. Substituting $x_1 = x_2 = 0$ into the first equation, we obtain $1 = 0$, which is a contradiction.

In other words, the Lagrangian $F(x, \lambda)$ has no critical points. However, we showed before that the point $(x_1, x_2) = \left( -\frac{1}{4}, -\frac{1}{4} \right)$ is a constrained minimum point for $f(x)$ given the constraint $g(x) = 0$.

The reason Theorem 9.1 does not apply in this case is that condition (9.5), which was required in order for Theorem 9.1 to hold, is not satisfied.

Example: Find the positive numbers $x_1$, $x_2$, $x_3$ such that $x_1 x_2 x_3 = 1$ and $x_1 x_2 + x_2 x_3 + x_3 x_1$ is minimized.

Answer: We first reformulate the problem as a constrained optimization problem. Let $U = \prod_{i=1:3} (0, \infty) \subset \mathbb{R}^3$ and let $x = (x_1, x_2, x_3) \in U$. The functions $f : U \to \mathbb{R}$ and $g : U \to \mathbb{R}$ are defined as
\[
f(x) = x_1 x_2 + x_2 x_3 + x_3 x_1; \qquad g(x) = x_1 x_2 x_3 - 1.
\]
We want to minimize $f(x)$ over the set $U$ subject to the constraint $g(x) = 0$.

Step 1: Check that $\mathrm{rank}(\nabla g(x)) = 1$ for any $x$ such that $g(x) = 0$.

Let $x = (x_1, x_2, x_3) \in U$. It is easy to see that
\[
\nabla g(x) = \left( x_2 x_3 \ \ \ x_1 x_3 \ \ \ x_1 x_2 \right).
\tag{9.30}
\]
Note that $\nabla g(x) \neq 0$, since $x_i > 0$, $i = 1:3$. Therefore, $\mathrm{rank}(\nabla g(x)) = 1$ for all $x \in U$, and condition (9.5) is satisfied.

Step 2: Find $(x_0, \lambda_0)$ such that $\nabla_{(x,\lambda)} F(x_0, \lambda_0) = 0$.

The Lagrangian associated to this problem is
\[
F(x, \lambda) = x_1 x_2 + x_2 x_3 + x_3 x_1 + \lambda (x_1 x_2 x_3 - 1),
\tag{9.31}
\]
where $\lambda \in \mathbb{R}$ is the Lagrange multiplier. Let $x_0 = (x_{0,1}, x_{0,2}, x_{0,3})$. From (9.31), we find that $\nabla_{(x,\lambda)} F(x_0, \lambda_0) = 0$ is equivalent to the following system:
\[
\begin{cases}
x_{0,2} + x_{0,3} + \lambda_0 x_{0,2} x_{0,3} = 0; \\
x_{0,1} + x_{0,3} + \lambda_0 x_{0,1} x_{0,3} = 0; \\
x_{0,1} + x_{0,2} + \lambda_0 x_{0,1} x_{0,2} = 0; \\
x_{0,1} x_{0,2} x_{0,3} = 1.
\end{cases}
\]
By multiplying the first three equations by $x_{0,1}$, $x_{0,2}$, and $x_{0,3}$, respectively, and using the fact that $x_{0,1} x_{0,2} x_{0,3} = 1$, we obtain that
\[
-\lambda_0 = x_{0,1} x_{0,2} + x_{0,1} x_{0,3} = x_{0,1} x_{0,2} + x_{0,2} x_{0,3} = x_{0,1} x_{0,3} + x_{0,2} x_{0,3}.
\]
Since $x_{0,i} \neq 0$, $i = 1:3$, we find that $x_{0,1} = x_{0,2} = x_{0,3}$. Since $x_{0,1} x_{0,2} x_{0,3} = 1$, we conclude that $x_{0,1} = x_{0,2} = x_{0,3} = 1$ and $\lambda_0 = -2$.

Step 3.1: Compute $q(v) = v^t D^2 F_0(x_0)\, v$.

Since $\lambda_0 = -2$, we find that $F_0(x) = f(x) + \lambda_0^t g(x)$ is given by
\[
F_0(x) = x_1 x_2 + x_2 x_3 + x_3 x_1 - 2 x_1 x_2 x_3 + 2.
\]
The Hessian of $F_0(x)$ is
\[
D^2 F_0(x_1, x_2, x_3) = \begin{pmatrix}
0 & 1 - 2x_3 & 1 - 2x_2 \\
1 - 2x_3 & 0 & 1 - 2x_1 \\
1 - 2x_2 & 1 - 2x_1 & 0
\end{pmatrix},
\]
and therefore
\[
D^2 F_0(1, 1, 1) = \begin{pmatrix}
0 & -1 & -1 \\
-1 & 0 & -1 \\
-1 & -1 & 0
\end{pmatrix}.
\]
From (9.22), we find that the quadratic form $q(v)$ is
\[
q(v) = v^t D^2 F_0(1, 1, 1)\, v = -2 v_1 v_2 - 2 v_2 v_3 - 2 v_1 v_3.
\tag{9.32}
\]

Step 3.2: Compute $q_{red}(v_{red})$.

We first solve formally the equation $\nabla g(1, 1, 1)\, v = 0$, where $v = (v_i)_{i=1:3}$ is an arbitrary vector. From (9.30), we find that $\nabla g(1, 1, 1) = (1 \ 1 \ 1)$, and
\[
\nabla g(1, 1, 1)\, v = v_1 + v_2 + v_3 = 0.
\]
By solving for $v_1$ in terms of $v_2$ and $v_3$, we obtain that $v_1 = -v_2 - v_3$. Let $v_{red} = \begin{pmatrix} v_2 \\ v_3 \end{pmatrix}$. We substitute $-v_2 - v_3$ for $v_1$ in (9.32), to obtain the reduced quadratic form $q_{red}(v_{red})$, i.e.,
\[
q_{red}(v_{red}) = 2v_2^2 + 2 v_2 v_3 + 2 v_3^2 = v_2^2 + v_3^2 + (v_2 + v_3)^2.
\]
Therefore, $q_{red}(v_{red}) > 0$ for all $(v_2, v_3) \neq (0, 0)$, which means that $q_{red}$ is a positive definite form.

Step 4: From Theorem 9.2, we conclude that the point $x_0 = (1, 1, 1)$ is a constrained minimum for the function $f(x) = x_1 x_2 + x_2 x_3 + x_3 x_1$, with $x_1, x_2, x_3 > 0$, subject to the constraint $x_1 x_2 x_3 = 1$.
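Steps 2 through 3.2 of this example can be double-checked symbolically. The sketch below assumes SymPy; the variable names (`point`, `q_red_matrix`, and so on) are illustrative. Rather than solving the nonlinear system, it verifies that the candidate $(x_0, \lambda_0) = ((1, 1, 1), -2)$ is a critical point of the Lagrangian (9.31), then rebuilds the Hessian of $F_0$ and the reduced quadratic form.

```python
import sympy as sp

x1, x2, x3, lam = sp.symbols("x1 x2 x3 lam", real=True)
v2, v3 = sp.symbols("v2 v3", real=True)

f = x1 * x2 + x2 * x3 + x3 * x1
g = x1 * x2 * x3 - 1
F = f + lam * g  # Lagrangian (9.31)

# Candidate critical point found in Step 2.
point = {x1: 1, x2: 1, x3: 1, lam: -2}

# Gradient of F with respect to (x, lam), evaluated at the candidate point.
gradient = [sp.diff(F, s).subs(point) for s in (x1, x2, x3, lam)]

# Hessian of F_0(x) = F(x, lam_0) with respect to x, evaluated at x_0.
F0 = F.subs(lam, -2)
hessian = sp.hessian(F0, (x1, x2, x3)).subs(point)

# Restrict q(v) to V_0 using v1 = -v2 - v3.
v = sp.Matrix([-v2 - v3, v2, v3])
q_red = sp.expand((v.T * hessian * v)[0, 0])

# Symmetric matrix of q_red in the variables (v2, v3), and its eigenvalues.
q_red_matrix = sp.hessian(q_red, (v2, v3)) / 2
eigenvalues = q_red_matrix.eigenvals()

print(gradient)
print(q_red)
print(eigenvalues)
```

Both eigenvalues of the reduced form are strictly positive, which is the matrix counterpart of writing $q_{red}$ as a sum of squares.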