Introduction to Optimization Techniques. Nonlinear Programming

Optimal Solutions

Consider the optimization problem min_{x ∈ F} f(x), where F ⊆ R^n.

Definition: x* ∈ F is optimal (a global minimum) for this problem if f(x*) ≤ f(x) for all x ∈ F.

Definition: x* ∈ F is a local minimum if there is an ε > 0 so that f(x*) ≤ f(x) for all x ∈ F ∩ N(x*, ε), where N(x*, ε) = { x ∈ R^n : ||x - x*|| < ε } and ||·|| is a norm on R^n.

Lagrange Multiplier Method

Now consider the optimization problem

    min_{x ∈ X} f(x) subject to g(x) ≤ 0    (GP)

where X ⊆ R^n, f : X → R, g : X → R^m, and we take the feasible region to be F = { x ∈ X : g(x) ≤ 0 }.

Definition: The Lagrangian for (GP) is the function

    L(x, λ) = f(x) + λᵀ g(x),

where λᵀ g(x) is the inner product of the vector λ with the vector g(x); that is, L(x, λ) = f(x) + Σ_{i=1}^{m} λ_i g_i(x).
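To make the definition concrete, here is a minimal Python sketch (the toy f and g below are our own illustrative choices, not from the slides) that evaluates L(x, λ) = f(x) + λᵀ g(x):

```python
import numpy as np

def lagrangian(f, g, x, lam):
    """L(x, lam) = f(x) + lam^T g(x), where g returns the m-vector of constraint values."""
    return f(x) + np.dot(lam, g(x))

# Toy instance: f(x) = x1^2 + x2^2 with the single constraint 1 - x1 - x2 <= 0.
f = lambda x: x[0]**2 + x[1]**2
g = lambda x: np.array([1.0 - x[0] - x[1]])
print(lagrangian(f, g, np.array([0.5, 0.5]), np.array([1.0])))   # 0.5
```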

Theorem 1 (Lagrange): Let λ* ∈ R_+^m (i.e., λ* is a nonnegative m-vector) and let x* ∈ X solve the relatively unconstrained problem

    min_{x ∈ X} L(x, λ*).

If x* ∈ F and λ*ᵀ g(x*) = 0, then x* is optimal for (GP).

Proof: By assumption,

    f(x*) + λ*ᵀ g(x*) ≤ f(x) + λ*ᵀ g(x) for all x ∈ X.

But λ*ᵀ g(x*) = 0 then implies

    f(x*) ≤ f(x) + λ*ᵀ g(x) for all x ∈ X.

But F ⊆ X and λ*ᵀ g(x) ≤ 0 for every x ∈ F (since λ* ≥ 0 and g(x) ≤ 0); therefore, for all x ∈ F,

    f(x*) ≤ f(x) + λ*ᵀ g(x) ≤ f(x),

so x* is optimal for (GP).

Note: The theorem assumes an appropriate λ* exists and that we know its value.
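When an appropriate λ* is known, the hypotheses of Theorem 1 are easy to verify numerically. In the sketch below, the convex problem min x² subject to 1 - x ≤ 0 (over X = R) and the multiplier λ* = 2 are our own illustrative choices, not taken from the slides:

```python
from scipy.optimize import minimize_scalar

f = lambda x: x**2
g = lambda x: 1.0 - x
lam_star = 2.0

x_star = minimize_scalar(lambda x: f(x) + lam_star * g(x)).x   # x* solves min_x L(x, lam*)
print(round(x_star, 6))                    # 1.0
print(g(x_star) <= 1e-6)                   # x* is feasible, i.e. x* in F
print(abs(lam_star * g(x_star)) <= 1e-6)   # complementary slackness: lam* g(x*) = 0
# Theorem 1 then guarantees that x* = 1 is optimal for min x^2 s.t. 1 - x <= 0.
```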

Ex 1 (λ* may not exist): Consider

    min x subject to x² ≤ 0, x ∈ X = R.

Then F = {0} and therefore x* = 0 is optimal. Let λ = 0; then min_{x ∈ X} L(x, 0) = min_{x ∈ R} x = -∞, so x* = 0 is not optimal for min_{x ∈ X} L(x, 0). If λ > 0, then min_{x ∈ X} L(x, λ) = min_{x ∈ R} (x + λx²), whose minimizer is x(λ) = -1/(2λ) ≠ 0; thus x(λ) is optimal for min_{x ∈ X} L(x, λ) and x* = 0 is not.
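A quick numerical confirmation of this example (the solver call is our own device, not part of the slides): for every λ > 0 the unconstrained minimizer of L(·, λ) is -1/(2λ), never the optimal point x* = 0.

```python
from scipy.optimize import minimize_scalar

for lam in [0.5, 1.0, 10.0, 100.0]:
    x_min = minimize_scalar(lambda x, lam=lam: x + lam * x**2).x
    print(lam, x_min, -1.0 / (2.0 * lam))   # the minimizer matches -1/(2*lam), never 0
```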

Saddle Points for Optimality

Ex 2: Show there is no such λ* (for Theorem 1) for the problem

    min x subject to 1 - x² ≤ 0, x ∈ X = R_+.

Definition: For problem GP, (x*, λ*) is said to be a saddle-point of the Lagrangian if x* ∈ X, λ* ≥ 0, and

    L(x*, λ) ≤ L(x*, λ*) ≤ L(x, λ*) for all λ ∈ R_+^m and for all x ∈ X.

Theorem 2: (x*, λ*) ∈ X × R_+^m is a saddle-point of the Lagrangian of GP if and only if

(i) x* solves min_{x ∈ X} L(x, λ*),
(ii) x* ∈ X, g(x*) ≤ 0 (i.e., x* ∈ F),
(iii) λ*ᵀ g(x*) = 0 (complementary slackness).

Proof: Assume (x*, λ*) ∈ X × R_+^m is a saddle-point. Then L(x*, λ*) ≤ L(x, λ*) for all x ∈ X is precisely statement (i): x* solves min_{x ∈ X} L(x, λ*).

Also, since L(x*, λ) ≤ L(x*, λ*) for all λ ∈ R_+^m, we have

    f(x*) + λᵀ g(x*) ≤ f(x*) + λ*ᵀ g(x*) for all λ ∈ R_+^m,

that is,

    (λ - λ*)ᵀ g(x*) ≤ 0 for all λ ∈ R_+^m.    (1)

Assume g(x*) ≤ 0 fails. Without loss of generality (wlog) we may assume g_1(x*) > 0. Let λ_1 = λ*_1 + 1 and λ_i = λ*_i for i ≥ 2; then λ ≥ 0 and

    (λ - λ*)ᵀ g(x*) = (λ_1 - λ*_1) g_1(x*) = g_1(x*) > 0,

and this contradicts (1). Therefore, (ii) holds: x* ∈ X, g(x*) ≤ 0.

Now, (1) implies (by taking λ = 0) that -λ*ᵀ g(x*) ≤ 0, i.e., λ*ᵀ g(x*) ≥ 0. But we've just shown that g(x*) ≤ 0 and we've assumed λ* ≥ 0; therefore λ*ᵀ g(x*) ≤ 0 and, hence, (iii) holds: λ*ᵀ g(x*) = 0.

Conversely, assume (x*, λ*) ∈ X × R_+^m satisfies conditions (i)-(iii). Condition (i) is precisely the statement L(x*, λ*) ≤ L(x, λ*) for all x ∈ X, and we now need to show L(x*, λ) ≤ L(x*, λ*) for all λ ∈ R_+^m.

For λ ∈ R_+^m we have λᵀ g(x*) ≤ 0 = λ*ᵀ g(x*), and therefore

    f(x*) + λᵀ g(x*) ≤ f(x*) = f(x*) + λ*ᵀ g(x*),

or L(x*, λ) ≤ L(x*, λ*) for all λ ∈ R_+^m, which completes the proof.

Corollary: If problem GP has a saddle-point (x*, λ*), then x* is optimal for GP.

Proof: If (x*, λ*) is a saddle-point, then conditions (i)-(iii) hold and therefore Theorem 1 applies.

Note: Examples 1 and 2 show that not all problems have saddle-points (even though the problem may have an optimal solution).
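Conditions (i)-(iii) are easy to test numerically for a candidate pair. The helper below is our own sketch (not from the slides); sampling X only gives evidence for condition (i), not a proof. It is applied to the same convex toy problem used after Theorem 1, for which (x*, λ*) = (1, 2) is a saddle-point.

```python
import numpy as np

def looks_like_saddle_point(f, g, x_star, lam_star, x_samples, tol=1e-8):
    L = lambda x, lam: f(x) + np.dot(lam, g(x))
    feasible = np.all(g(x_star) <= tol)                              # (ii)  g(x*) <= 0
    comp_slack = abs(np.dot(lam_star, g(x_star))) <= tol             # (iii) lam*^T g(x*) = 0
    minimizes = all(L(x_star, lam_star) <= L(x, lam_star) + tol      # (i)   x* minimizes L(., lam*)
                    for x in x_samples)
    return feasible and comp_slack and minimizes

f = lambda x: x**2
g = lambda x: np.array([1.0 - x])
print(looks_like_saddle_point(f, g, 1.0, np.array([2.0]), np.linspace(-5, 5, 1001)))  # True
```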

HW 1: Let (x*, λ*) be a saddle-point for GP and let x' ≠ x* be optimal for GP. Show whether or not (x', λ*) is a saddle-point.

HW 1a: Let (x', λ') and (x'', λ'') be two saddle-points for GP. Show whether or not (x', λ'') is a saddle-point.

HW 2: Using the definition of the Lagrangian for GP, derive the Lagrangians for

(a) max_{x ∈ X} f(x) subject to g(x) ≤ 0,
(b) max_{x ∈ X} f(x) subject to g(x) ≥ 0,
(c) min_{x ∈ X} f(x) subject to g(x) ≥ 0.

HW 3: Using the definition of the Lagrangian for GP, derive the Lagrangian for

    min_{x ∈ X} f(x) subject to h(x) = 0, r(x) ≤ 0,

and show that the multipliers for the vector function h(x) are unrestricted in sign. (Hint: First write h(x) = 0 as h(x) ≤ 0, -h(x) ≤ 0.)

HW 4 (a useful lower bound): For problem (GP) show that, for all λ ∈ R_+^m,

    min_{x ∈ X} L(x, λ) ≤ min_{x ∈ F} f(x).

(Hint: recall that, for x ∈ F, λᵀ g(x) ≤ 0.)

Dual Problem

For problem GP,

    min_{x ∈ X} f(x) subject to g(x) ≤ 0,    (GP)

we define the dual problem, denoted by (D), to be

    max_{λ ≥ 0} L*(λ)    (D)

where L*(λ) = min_{x ∈ X} L(x, λ) (L* is called the dual function).

Note: The above HW shows that for all λ ∈ R_+^m (i.e., for all λ ≥ 0) we have L*(λ) ≤ min_{x ∈ F} f(x), and hence

    max_{λ ≥ 0} L*(λ) ≤ min_{x ∈ F} f(x).

This is called the Weak Duality Theorem.
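The Weak Duality Theorem is easy to see numerically. The rough sketch below (the grid approximation of X and the test problem are our own choices) evaluates the dual function on a grid of multipliers and compares its maximum with the primal optimum for min x² subject to 1 - x ≤ 0:

```python
import numpy as np

xs = np.linspace(-5.0, 5.0, 2001)             # a grid standing in for X = R
fvals = xs**2                                  # f(x) = x^2
gvals = 1.0 - xs                               # g(x) = 1 - x

primal_opt = fvals[gvals <= 0].min()           # min of f over the feasible grid points
dual_vals = [np.min(fvals + lam * gvals) for lam in np.linspace(0.0, 5.0, 51)]
print(max(dual_vals), "<=", primal_opt)        # max L*(lam) <= min_{x in F} f(x); here both are ~1
```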

Ex 3: Consider the linear program

    min cᵀx subject to Ax ≥ b, x ≥ 0.

Let X = { x ∈ R^n : x ≥ 0 } and take the Lagrangian to be

    L(x, λ) = cᵀx - λᵀ(Ax - b) = (cᵀ - λᵀA)x + λᵀb,

so that

    L*(λ) = min_{x ≥ 0} [ (cᵀ - λᵀA)x + λᵀb ].

Notational Digression: Let A be an m × n matrix. We let a_i denote the i-th row of A and we let a^j denote the j-th column of A; that is, A may be viewed either as its rows a_1, a_2, ..., a_m stacked on top of one another or as its columns a^1, a^2, ..., a^n placed side by side. By Ax is meant

    Ax = Σ_{j=1}^{n} a^j x_j

(where a number times a vector is the number times each component of the vector).

We also have Ax = (a_1 x, a_2 x, ..., a_m x)ᵀ; that is, the i-th component of Ax is a_i x. By yᵀA is meant yᵀA = Σ_i y_i a_i, i.e., yᵀA = y_1 a_1 + y_2 a_2 + ... + y_m a_m.

Now, for the linear program we have

    L*(λ) = min_{x ≥ 0} [ (cᵀ - λᵀA)x + λᵀb ]
          = min_{x ≥ 0} [ Σ_j (c_j - λᵀa^j) x_j + Σ_i λ_i b_i ]
          = Σ_j min_{x_j ≥ 0} (c_j - λᵀa^j) x_j + Σ_i λ_i b_i,

since the minimization separates over the components x_j.
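A quick numpy check of the two expansions above (the matrix and vectors are arbitrary test data):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])               # m = 2 rows, n = 3 columns
x = np.array([1.0, -1.0, 2.0])
y = np.array([3.0, -2.0])

print(np.allclose(A @ x, sum(x[j] * A[:, j] for j in range(3))))   # Ax = sum_j x_j a^j
print(np.allclose(y @ A, sum(y[i] * A[i, :] for i in range(2))))   # y^T A = sum_i y_i a_i
```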

Now, if cᵀ - λᵀA ≥ 0 fails, there exists at least one index k so that c_k - λᵀa^k < 0 and, therefore,

    min_{x_k ≥ 0} (c_k - λᵀa^k) x_k = -∞;

hence, if cᵀ - λᵀA ≥ 0 fails, we have L*(λ) = -∞. On the other hand, if cᵀ - λᵀA ≥ 0, then each c_k - λᵀa^k ≥ 0 and, hence,

    min_{x_k ≥ 0} (c_k - λᵀa^k) x_k = 0.

Therefore, if cᵀ - λᵀA ≥ 0, we have L*(λ) = λᵀb.

Hence, max_{λ ≥ 0} L*(λ) may be rewritten as

    max λᵀb subject to cᵀ - λᵀA ≥ 0, λ ≥ 0,

or

    max bᵀλ subject to Aᵀλ ≤ c, λ ≥ 0,

and this, of course, is the usual linear programming dual of

    min cᵀx subject to Ax ≥ b, x ≥ 0.
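As a sanity check on this derivation, the sketch below (with arbitrarily chosen data c, A, b, our own choices) solves the primal LP and the dual just derived and compares optimal values; for LPs the two values coincide (strong LP duality):

```python
import numpy as np
from scipy.optimize import linprog

c = np.array([2.0, 3.0])
A = np.array([[1.0, 1.0],
              [1.0, 2.0]])
b = np.array([3.0, 4.0])

# Primal: min c^T x  s.t.  Ax >= b, x >= 0   (written as -Ax <= -b for linprog)
primal = linprog(c, A_ub=-A, b_ub=-b, bounds=[(0, None)] * 2)
# Dual:   max b^T lam  s.t.  A^T lam <= c, lam >= 0   (solved as min of -b^T lam)
dual = linprog(-b, A_ub=A.T, b_ub=c, bounds=[(0, None)] * 2)
print(primal.fun, -dual.fun)    # both 7.0 for this data: the optimal values coincide
```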

HW 5: For the linear program

    min cᵀx subject to Ax ≥ b, x ≥ 0,

take X = R^n and let L(x, λ, γ) = cᵀx - λᵀ(Ax - b) - γᵀx. Following the line of argument above, develop the dual problem

    max_{λ ≥ 0, γ ≥ 0} L*(λ, γ).

We now characterize saddle-points in terms of duality.

Optimality via Duality

Theorem 3: For problem (GP) the pair (x*, λ*) ∈ X × R_+^m is a saddle-point if and only if (a) x* solves GP, (b) λ* solves D, and (c) f(x*) = L*(λ*).

Proof: Assume (x*, λ*) is a saddle-point. By Theorems 1 and 2 we automatically have (a): x* solves GP. Also,

    L*(λ*) = min_{x ∈ X} L(x, λ*) = min_{x ∈ X} [ f(x) + λ*ᵀ g(x) ] = f(x*) + λ*ᵀ g(x*) = f(x*),

where the last two equalities follow from conditions (i) and (iii) of Theorem 2. Thus L*(λ*) = f(x*) and condition (c) holds. By weak duality, we have L*(λ) ≤ f(x*) for all λ ∈ R_+^m, and hence L*(λ*) = f(x*) implies λ* solves max_{λ ≥ 0} L*(λ), so (b) holds.

Conversely, assume conditions (a), (b), (c) hold (and, of course, λ* ≥ 0). We show (x*, λ*) is a saddle-point. Condition (c) states

    f(x*) = L*(λ*) = min_{x ∈ X} L(x, λ*) ≤ L(x*, λ*) = f(x*) + λ*ᵀ g(x*).    (2)

But condition (a) implies x* ∈ F and, hence, λ*ᵀ g(x*) ≤ 0, so it must also be the case that

    f(x*) + λ*ᵀ g(x*) ≤ f(x*).

Therefore we must have f(x*) + λ*ᵀ g(x*) = f(x*), i.e., λ*ᵀ g(x*) = 0, and condition (iii) of Theorem 2 holds. Also, x* ∈ F implies condition (ii) of Theorem 2 holds. Hence, it only remains to show that condition (i) holds, i.e., that x* solves min_{x ∈ X} L(x, λ*). But this follows immediately from (2), since λ*ᵀ g(x*) = 0 implies (using (2)) that min_{x ∈ X} L(x, λ*) = L(x*, λ*), and condition (i) of Theorem 2 holds.

Ex 1 (revisited): min x subject to x² ≤ 0, x ∈ X = R. Then L(x, λ) = x + λx² and L*(λ) = min_{x ∈ R} (x + λx²). For λ = 0, L*(0) = min_{x ∈ R} x = -∞. For λ > 0, the minimizer is x(λ) = -1/(2λ), so L*(λ) = -1/(2λ) + λ/(4λ²) = -1/(4λ). Hence

    L*(λ) = -∞, if λ = 0;    L*(λ) = -1/(4λ), if λ > 0.

While sup_{λ > 0} L*(λ) = 0, we see that max_{λ ≥ 0} L*(λ) has no optimal solution; that is, there is no λ ≥ 0 so that L*(λ) = 0.

Ex 2 (revisited):

    min x subject to 1 - x² ≤ 0, x ∈ X = R_+.    (P)

For λ = 0 we see that L*(0) = min_{x ≥ 0} x = 0, and for λ > 0 we have L(x, λ) = x + λ(1 - x²) and L*(λ) = min_{x ≥ 0} [ x + λ(1 - x²) ] = -∞. Hence

    L*(λ) = 0, if λ = 0;    L*(λ) = -∞, if λ > 0.

Thus λ* = 0 solves max_{λ ≥ 0} L*(λ) and L*(0) = 0. But x* = 1 solves (P) and f(x*) = 1; therefore

    L*(λ*) = 0 ≠ 1 = f(x*), i.e., max_{λ ≥ 0} L*(λ) = 0 < 1 = min_{x ∈ F} f(x).

[NOTE: when this situation occurs we say that there is a duality gap. That is, if x* solves GP, λ* solves D, and L*(λ*) ≠ f(x*), we have a duality gap.]

Therefore, any problem with a duality gap cannot have a saddle-point.
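The gap can also be seen numerically. The sketch below (a crude grid approximation of X = R_+, our own device) evaluates L*(λ) for a few multipliers and compares with the primal optimum of Ex 2:

```python
import numpy as np

xs = np.linspace(0.0, 100.0, 200001)           # a grid standing in for X = R+
fvals = xs                                      # f(x) = x
gvals = 1.0 - xs**2                             # g(x) = 1 - x^2

print(fvals[gvals <= 0].min())                  # primal optimum: 1.0, attained at x* = 1
for lam in [0.0, 0.1, 1.0]:
    print(lam, np.min(fvals + lam * gvals))     # 0 at lam = 0, large negative for lam > 0
```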

HW 6: Consider the problem, denoted by (P_I),

    min_{x ∈ X_I} cᵀx subject to Ax ≥ b,    (P_I)

where X_I = { x ∈ R^n : x_j ≥ 0 and integer-valued, j = 1, ..., n }. [NOTE: (P_I) is a version of the so-called linear integer programming problem.] Let (P) denote the associated linear program

    min_{x ∈ X} cᵀx subject to Ax ≥ b,    (P)

where X = { x ∈ R^n : x ≥ 0 }.

Let L_I* denote the dual function for (P_I) and let L* denote the dual function for (P). Show whether or not L_I*(λ) ≥ L*(λ) for all λ ≥ 0.

HW 7: Consider the problem

    min_{x ∈ X} x subject to 2 - 2x = 0, where X = [0, 1]

(note: the constraint is an equality constraint).

(a) Derive L*(λ).
(b) Decide whether or not the problem has a saddle-point.

Karush-Kuhn-Tucker Points

Consider the optimization problem

    min_{x ∈ X} f(x) subject to g(x) ≤ 0,    (GDP)

where f is differentiable on the interior of X and each g_i, i = 1, ..., m, is also differentiable on the interior of X. We use the symbol GDP for "general differentiable problem". As before,

    L(x, λ) = f(x) + λᵀ g(x).

Definition: We say (x*, λ*) ∈ (int X) × R_+^m is a Karush-Kuhn-Tucker point (KKT point) if

(i') ∇_x L(x*, λ*) = 0,
(ii) x* ∈ X, g(x*) ≤ 0,
(iii) λ*ᵀ g(x*) = 0,

where ∇_x L(x*, λ*) = ( ∂L(x*, λ*)/∂x_1, ..., ∂L(x*, λ*)/∂x_n )ᵀ is the gradient, with respect to x, of the Lagrangian evaluated at x*.

Ex 2 (revisited): min x subject to 1 - x² ≤ 0, x ∈ R_+; then x* = 1 is optimal for (P). Consider L(x, λ) = x + λ(1 - x²). Then

    ∇_x L(x, λ) = 1 - 2λx.

Therefore, if we take λ* = 1/2 we have ∇_x L(x*, λ*) = ∇_x L(1, 1/2) = 0. Also, x* ∈ int R_+, g(x*) = 1 - (x*)² = 0, and λ*ᵀ g(x*) = (1/2)(1 - 1) = 0. Therefore we see that (x*, λ*) = (1, 1/2) is a Karush-Kuhn-Tucker point for (P) [recall: (P) has no saddle-point].
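A quick finite-difference check of the computation above (the helper and the tolerances are our own, not part of the slides):

```python
def grad_x_L(x, lam, h=1e-6):
    L = lambda t: t + lam * (1.0 - t**2)        # L(x, lam) = f(x) + lam*g(x) for Ex 2
    return (L(x + h) - L(x - h)) / (2.0 * h)    # central-difference derivative in x

x_star, lam_star = 1.0, 0.5
g = lambda x: 1.0 - x**2

print(abs(grad_x_L(x_star, lam_star)) < 1e-6)   # (i')  grad_x L(x*, lam*) = 0
print(x_star > 0 and g(x_star) <= 0)            # (ii)  x* in int R+ and g(x*) <= 0
print(abs(lam_star * g(x_star)) < 1e-12)        # (iii) complementary slackness
```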

Ex 1 (revisited): min x subject to x² ≤ 0, x ∈ R (P). Let x* = 0, the only feasible point. Now L(x, λ) = x + λx², so ∇_x L(x, λ) = 1 + 2λx and ∇_x L(0, λ) = 1 ≠ 0 for all λ. Therefore, (P) has no KKT point (as well as no S.P.).

HW 8: Consider the linear program (A is m × n)

    min_{x ∈ R^n} cᵀx subject to Ax ≥ b, x ≥ 0,    (P)

and let L(x, λ, γ) = cᵀx - λᵀ(Ax - b) - γᵀx. Show that (P) has an optimal solution x* if, and only if, there exist vectors λ* ∈ R_+^m and γ* ∈ R_+^n so that (x*, (λ*, γ*)) is a saddle-point. Is it necessarily true that (x*, (λ*, γ*)) is also a Karush-Kuhn-Tucker point? Why? [HINT: Don't be afraid to use your knowledge of linear programming.]

Economic Motivation of Duality

Assume (GP) describes our optimal production cost:

    min_{x ∈ X} f(x) subject to g(x) ≤ 0.    (GP)

We are faced with the following offer. The Dual Co. will buy us out as follows: the Dual Co. provides us with a vector of prices λ = (λ_1, λ_2, ..., λ_m)ᵀ ≥ 0. We then choose x ∈ X and the Dual Co. will pay us -λᵀ g(x) = -Σ_i λ_i g_i(x) (think of the vector g(x) as being the vector of resources used when we choose x ∈ X). On the other hand, since we will not be producing, we are to pay the Dual Co. the savings in production costs (i.e., f(x)). Therefore, the net payment to the Dual Co. is f(x) + λᵀ g(x) = L(x, λ).

Of course, when faced with λ ≥ 0 we would like to pay the Dual Co. as little as possible; i.e., we would like to pay L*(λ) = min_{x ∈ X} L(x, λ). And, of course, the Dual Co. would like to choose a vector λ so that we pay it as much as possible; that is, the Dual Co. would like to choose a λ* ≥ 0 so that L*(λ*) = max_{λ ≥ 0} L*(λ). Now, we already know that max_{λ ≥ 0} L*(λ) ≤ min_{x ∈ F} f(x) (i.e., our largest possible payment to the Dual Co. is no larger than our optimal production cost).

If max_{λ ≥ 0} L*(λ) < min_{x ∈ F} f(x) (a duality gap), the Dual Co., presumably, will not want to buy us out, since the amount of money it receives from us must be smaller than its optimal production cost after it buys us out. Therefore, a necessary condition for a rational Dual Co. to make us an offer in the first place is that max_{λ ≥ 0} L*(λ) = min_{x ∈ F} f(x). Therefore, assume this condition is met. Now assume that, when faced with λ ≥ 0, we choose an x ∈ X that violates the constraints (wlog, assume g_1(x) > 0). The Dual Co. will then argue that it did not correctly estimate λ_1 and it will offer a new price vector μ ≥ 0 where, say, μ_i = λ_i for i ≥ 2 and μ_1 = λ_1 + θ with θ > 0 (i.e., it will raise, or increase, λ_1).

Indeed,

    L(x, μ) = f(x) + (λ_1 + θ) g_1(x) + λ_2 g_2(x) + ... + λ_m g_m(x) = L(x, λ) + θ g_1(x) > L(x, λ);

i.e., our net payment to the Dual Co. will increase if the Dual Co. is allowed to change its price offer λ and we stick to our previous choice of the vector x. [Of course, since the Dual Co. has changed its λ, we'll insist on being allowed to change our x.] Therefore, a necessary condition for the Dual Co. to be satisfied with its own price offer is that g(x) ≤ 0. On the other hand, suppose g(x) ≤ 0 but λᵀ g(x) ≠ 0. Then λᵀ g(x) < 0 and, therefore, wlog, we may assume λ_1 g_1(x) < 0.

Show that the Dual Co. will then want to decrease the price λ_1. Therefore, a necessary condition for the Dual Company to be satisfied with its offer is that λᵀ g(x) = 0, g(x) ≤ 0, and max_{λ ≥ 0} L*(λ) = min_{x ∈ F} f(x). Hence, a necessary condition for us to actually be bought out is that our optimal-production optimization problem (GP) has a saddle-point.

Crude Idea of a Dual Algorithm for GP

Step 0: Set k = 1 and select λ^1 ≥ 0.

Step 1: Let x^k ∈ X solve min_{x ∈ X} L(x, λ^k). If x^k ∈ F = { x ∈ X : g(x) ≤ 0 } and (λ^k)ᵀ g(x^k) = 0, go to Step 3.

Step 2: Let I_k = { i : g_i(x^k) > 0 } and C_k = { j : λ_j^k g_j(x^k) < 0 }. (NOTE: I_k ∩ C_k = ∅ and I_k ∪ C_k ≠ ∅.) For each i ∈ I_k (if I_k ≠ ∅), let λ_i^{k+1} > λ_i^k. For each j ∈ C_k (if C_k ≠ ∅), let 0 ≤ λ_j^{k+1} < λ_j^k. Leave all other components of λ^k unchanged, set k = k + 1, and return to Step 1.

Step 3: Stop; x^k is optimal for GP ((x^k, λ^k) is a S.P.).

NOTE: The above algorithm is not well-defined, for the following reasons: (a) At Step 2 we have not specified by how much to increase the prices associated with violated constraints, nor have we specified by how much to decrease the prices associated with satisfied constraints for which complementary slackness fails. (b) Even if GP has a saddle-point, we do not yet know whether this algorithm converges to a S.P. (c) Since this algorithm seeks a saddle-point for GP, the procedure is automatically in trouble if (GP) does not have a S.P. This algorithm, of course, was motivated by the economic discussion above; we'll have more to say about these types of procedures later in the course.
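As one purely illustrative instantiation (the slides deliberately leave the price updates unspecified), the sketch below fixes a constant price step and applies the procedure to the convex test problem min x² subject to 1 - x ≤ 0; the step size, stopping tolerance, and inner solver are our own choices, and nothing here addresses issues (a)-(c):

```python
from scipy.optimize import minimize_scalar

f = lambda x: x**2
g = lambda x: 1.0 - x                         # single constraint g(x) <= 0
lam, step, tol = 0.0, 0.5, 1e-6               # Step 0: k = 1, lam^1 = 0

for k in range(1000):
    x = minimize_scalar(lambda t: f(t) + lam * g(t)).x    # Step 1: x^k solves min_x L(x, lam^k)
    if g(x) <= tol and abs(lam * g(x)) <= tol:            # x^k feasible + complementary slackness
        break                                             # Step 3: stop; (x^k, lam^k) ~ saddle point
    if g(x) > tol:
        lam += step                                       # Step 2: raise the price of the violated constraint
    else:
        lam = max(0.0, lam - step)                        # Step 2: lower the price when slackness fails
print(x, lam)                                             # converges to about (1.0, 2.0) here
```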