Duality (Continued)

Recall, the general primal problem is

    min f(x),  x ∈ X,  g(x) ≥ 0,

where X ⊆ Rⁿ, f : X → R, g : X → Rᵐ. The Lagrangian is the function L : X × Rᵐ → R defined by

    L(x, λ) = f(x) − λᵀg(x)

and the dual function L* : Rᵐ → R ∪ {−∞} is defined by

    L*(λ) = min_{x∈X} L(x, λ)

(where it is understood that max means sup and min means inf).

Theorem 13: The dual function L* is concave (regardless of the nature of the general primal problem).

Proof: Let λ¹, λ² ≥ 0 be such that L*(λ¹) > −∞ and L*(λ²) > −∞. Let α ∈ [0, 1] and consider αλ¹ + (1 − α)λ². Then

    L*(αλ¹ + (1 − α)λ²) = min_{x∈X} [ f(x) − (αλ¹ + (1 − α)λ²)ᵀg(x) ]
        = min_{x∈X} [ α( f(x) − λ¹ᵀg(x) ) + (1 − α)( f(x) − λ²ᵀg(x) ) ]
        ≥ α min_{x∈X} [ f(x) − λ¹ᵀg(x) ] + (1 − α) min_{x∈X} [ f(x) − λ²ᵀg(x) ]
        = αL*(λ¹) + (1 − α)L*(λ²).

If, on the other hand, L*(λ¹) = −∞ or L*(λ²) = −∞, then the concavity inequality

    L*(αλ¹ + (1 − α)λ²) ≥ αL*(λ¹) + (1 − α)L*(λ²)

holds automatically.

Alternative proof: For a fixed x ∈ X we see that f(x) − λᵀg(x) is linear in λ ≥ 0 and is therefore concave in λ (for this fixed x). Therefore,

    G_x = { (λ, β) : λ ∈ Rᵐ, β ∈ R, β ≤ f(x) − λᵀg(x) }

is a convex set. But

    G_{L*} = { (λ, β) : λ ∈ Rᵐ, β ∈ R, β ≤ L*(λ) }
           = { (λ, β) : λ ∈ Rᵐ, β ∈ R, β ≤ min_{x∈X} L(x, λ) }
           = { (λ, β) : λ ∈ Rᵐ, β ∈ R, β ≤ f(x) − λᵀg(x) for all x ∈ X }
           = ∩_{x∈X} G_x.

But the intersection of any collection of convex sets is also convex; therefore G_{L*} (the hypograph of L*) is convex, and so L* is concave in λ ≥ 0.
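Theorem 13 lends itself to a quick numerical sanity check. The sketch below uses a small illustrative instance chosen for this purpose (X = [−5, 5] discretized, f(x) = x² − x, g(x) = x − 1; this data is assumed here, not taken from the lecture) and samples the concavity inequality at random multiplier pairs:

```python
import numpy as np

# Illustrative instance (assumed, not from the lecture):
# X = [-5, 5] discretized, f(x) = x**2 - x, g(x) = x - 1.
X = np.linspace(-5.0, 5.0, 10001)
f = X**2 - X
g = X - 1.0

def L_star(lam):
    """Dual function L*(lam) = min over x in X of f(x) - lam*g(x)."""
    return np.min(f - lam * g)

# Theorem 13: L*(a*l1 + (1-a)*l2) >= a*L*(l1) + (1-a)*L*(l2) for l1, l2 >= 0.
rng = np.random.default_rng(0)
for _ in range(1000):
    l1, l2 = rng.uniform(0.0, 10.0, size=2)
    a = rng.uniform()
    assert L_star(a * l1 + (1 - a) * l2) >= a * L_star(l1) + (1 - a) * L_star(l2) - 1e-9
print("concavity inequality holds on 1000 random pairs")
```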

We now introduce the primal function φ, which may also be called the optimal value function or the perturbation function.

Definition: The set of feasible right-hand sides is defined to be

    Y = { y ∈ Rᵐ : there exists x ∈ X with g(x) ≥ y }

(note: g(X) ⊆ Y). The primal function φ : Y → R ∪ {−∞} is defined as

    φ(y) = min f(x),  x ∈ X,  g(x) ≥ y.
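A brute-force evaluation of φ on the same assumed instance makes the definition concrete, and checks that φ is nondecreasing (raising y shrinks the feasible set { x ∈ X : g(x) ≥ y }):

```python
import numpy as np

# Same assumed instance: X = [-5, 5] discretized, f(x) = x**2 - x, g(x) = x - 1.
X = np.linspace(-5.0, 5.0, 10001)
f = X**2 - X
g = X - 1.0

def phi(y):
    """Primal (perturbation) function: min f(x) over x in X with g(x) >= y."""
    feas = g >= y
    return np.min(f[feas]) if feas.any() else np.inf  # y outside Y: no feasible x

# phi is nondecreasing: a larger right-hand side y leaves fewer feasible x.
ys = np.linspace(-6.0, 3.0, 50)
vals = [phi(y) for y in ys]
assert all(a <= b + 1e-12 for a, b in zip(vals, vals[1:]))
```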

Theorem 14: If X is convex and g is a vector of concave functions, then Y is a convex set. If, in addition, f is convex on X, then φ is a convex function on Y.

Proof: Let y¹ ∈ Y, y² ∈ Y. Then there exist x¹ ∈ X, x² ∈ X so that g(x¹) ≥ y¹ and g(x²) ≥ y², and then αg(x¹) ≥ αy¹ and (1 − α)g(x²) ≥ (1 − α)y² for α ∈ [0, 1]. By concavity of the vector g we have

    g(αx¹ + (1 − α)x²) ≥ αg(x¹) + (1 − α)g(x²) ≥ αy¹ + (1 − α)y²,

so αy¹ + (1 − α)y² ∈ Y, since αx¹ + (1 − α)x² ∈ X (X is convex).

Duality (Continued) 2 Now, also assume f is convex on X. Let y Y, y Y be 2 such that ( y ) and ( y ). hen, by the definition of inf, for all 0 there exist x 2, x X so that g( x ) y g( x ) y 2 2 i i and f( x ) ( y ), i, 2. Also, by concavity of g, x ( ) x F xx g( x) y ( ) y 2 2 8

and, therefore,

    φ(αy¹ + (1 − α)y²) ≤ f(αx¹_ε + (1 − α)x²_ε) ≤ αf(x¹_ε) + (1 − α)f(x²_ε) ≤ αφ(y¹) + (1 − α)φ(y²) + ε.

Therefore, by letting ε ↓ 0 we get

    φ(αy¹ + (1 − α)y²) ≤ αφ(y¹) + (1 − α)φ(y²).

HW 46: Complete the proof for the case where φ(y¹) = −∞ or φ(y²) = −∞.

Subdifferentiability

Let f : R → R be defined by f(x) = |x|. Then f is a convex function which is not differentiable at x̄ = 0. However, note that at x̄ = 0 we have

    |x| ≥ |x̄| + ξ(x − x̄),  i.e.,  |x| ≥ ξx,

and this holds for all ξ ∈ [−1, 1]. Such a function is said to be subdifferentiable at x̄.

Definition: Let X ⊆ Rⁿ be convex and let f : X → R. f is said to be subdifferentiable at x̄ ∈ X (in the convex sense) if there is a vector ξ_x̄ ∈ Rⁿ so that

    f(x) ≥ f(x̄) + ξ_x̄ᵀ(x − x̄)  for all x ∈ X.

The vector ξ_x̄ is called a subgradient, or a support, at x̄.
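The defining inequality is easy to test numerically. A minimal sketch for f(x) = |x| at x̄ = 0 (the grid stands in for X = R): every ξ ∈ [−1, 1] passes, while values outside that interval fail.

```python
import numpy as np

x = np.linspace(-3.0, 3.0, 1201)  # sample points standing in for X = R

def is_support(xi):
    """Does f(x) >= f(0) + xi*(x - 0) hold for f(x) = |x| on the grid?"""
    return bool(np.all(np.abs(x) >= xi * x))

assert all(is_support(xi) for xi in np.linspace(-1.0, 1.0, 21))
assert not is_support(1.5) and not is_support(-1.5)
```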

[Note: f is subdifferentiable in the concave sense at x̄ if −f is subdifferentiable in the convex sense at x̄.] We say f is subdifferentiable on X (in either sense) if it is so at each x̄ ∈ X.

Theorem 15: Let X be convex and let f : X → R be subdifferentiable in the convex sense (in the concave sense). Then f is convex on X (concave on X).

Proof: Let x¹, x² ∈ X and let x_α = αx¹ + (1 − α)x² for α ∈ [0, 1]. Then there is ξ ∈ Rⁿ so that, by subdifferentiability,

    f(x¹) ≥ f(x_α) + ξᵀ(x¹ − x_α),
    f(x²) ≥ f(x_α) + ξᵀ(x² − x_α).

Multiplying these by α and (1 − α), respectively, and adding:

    αf(x¹) + (1 − α)f(x²) ≥ f(x_α) + ξᵀ( αx¹ + (1 − α)x² − x_α ) = f(x_α). //

HW 47: Let X be convex and let f : X → R be convex. Does this imply f is subdifferentiable, in the convex sense, on X?

HW 48: Show that if f is differentiable at x̄ and if ξ_x̄ is a support at x̄, then ∇f(x̄) = ξ_x̄. [Hint: Simply use the notion of directional derivative.]

Theorem 16: Consider the general primal problem min f(x), x ∈ X, g(x) ≥ 0. Let λ̄ ≥ 0 and let x̄ ∈ X solve min_{x∈X} L(x, λ̄). Then −g(x̄) is a support (concave sense) for L* at λ̄.

Proof: To show: L*(λ) ≤ L*(λ̄) − g(x̄)ᵀ(λ − λ̄) for all λ ≥ 0. Now,

    L*(λ) = inf_{x∈X} ( f(x) − λᵀg(x) ) ≤ f(x̄) − λᵀg(x̄)
          = f(x̄) − λ̄ᵀg(x̄) + λ̄ᵀg(x̄) − λᵀg(x̄)
          = L*(λ̄) − g(x̄)ᵀ(λ − λ̄).

[Note: Because of HW 48, if L* is differentiable at λ̄ then ∇L*(λ̄) = −g(x̄).]
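Theorem 16's supergradient inequality can be checked on the assumed instance used earlier:

```python
import numpy as np

# Assumed instance again: X = [-5, 5] discretized, f(x) = x**2 - x, g(x) = x - 1.
X = np.linspace(-5.0, 5.0, 10001)
f = X**2 - X
g = X - 1.0

def L_star_argmin(lam):
    """Return (L*(lam), a minimizer xbar of L(x, lam))."""
    vals = f - lam * g
    i = int(np.argmin(vals))
    return vals[i], X[i]

lbar = 2.0
L_bar, xbar = L_star_argmin(lbar)
g_bar = xbar - 1.0  # g(xbar)

# Theorem 16: L*(lam) <= L*(lbar) - g(xbar)*(lam - lbar) for all lam >= 0.
for lam in np.linspace(0.0, 10.0, 101):
    L_lam, _ = L_star_argmin(lam)
    assert L_lam <= L_bar - g_bar * (lam - lbar) + 1e-9
```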

The following is an often useful result.

Theorem 17: Consider the general problem

    GP:  min f(x),  x ∈ X,  g(x) ≥ 0.

Let λ̄ ≥ 0 and let x̄ solve min_{x∈X} L(x, λ̄). Then x̄ solves the perturbed problem

    P_{g(x̄)}:  min f(x),  x ∈ X,  g(x) ≥ g(x̄).

Proof: Let ȳ = g(x̄). The Lagrangian for this perturbed problem is

    L_ȳ(x, λ) = f(x) − λᵀ( g(x) − g(x̄) ) = f(x) − λᵀg(x) + λᵀg(x̄) = L(x, λ) + λᵀg(x̄).

Therefore,

    L_ȳ*(λ̄) = min_{x∈X} L_ȳ(x, λ̄) = min_{x∈X} L(x, λ̄) + λ̄ᵀg(x̄) = L(x̄, λ̄) + λ̄ᵀg(x̄) = L_ȳ(x̄, λ̄).

Therefore,

    (i) x̄ solves min_{x∈X} L_ȳ(x, λ̄);
    (ii) x̄ ∈ X, g(x̄) ≥ g(x̄) = ȳ;
    (iii) λ̄ᵀ( g(x̄) − g(x̄) ) = 0;

so (x̄, λ̄) is a saddle-point for the Lagrangian L_ȳ(x, λ) (which is the Lagrangian for P_{g(x̄)}).

HW 49: Corollary: x̄ is optimal for min f(x), x ∈ X, g(x) ≥ y, where y_i = g_i(x̄) if λ̄_i > 0, and y_i ≤ g_i(x̄) if λ̄_i = 0.
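Theorem 17 can likewise be observed on the assumed instance: the minimizer of L(·, λ̄) also solves the perturbed problem whose right-hand side is its own constraint value g(x̄).

```python
import numpy as np

# Assumed instance: X = [-5, 5] discretized, f(x) = x**2 - x, g(x) = x - 1.
X = np.linspace(-5.0, 5.0, 10001)
f = X**2 - X
g = X - 1.0

lbar = 3.0
xbar = X[np.argmin(f - lbar * g)]   # xbar solves min_x L(x, lbar); here xbar = 2.0
ybar = xbar - 1.0                   # the perturbation y = g(xbar)

# xbar should solve:  min f(x)  s.t.  x in X,  g(x) >= g(xbar).
feasible = g >= ybar
assert np.isclose(np.min(f[feasible]), xbar**2 - xbar)
```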

We now want to demonstrate the result that if x̄ solves min_{x∈X} L(x, λ̄), then λ̄ is a support (convex sense) for φ at g(x̄). That is, we want to show

    φ(y) ≥ φ(g(x̄)) + λ̄ᵀ( y − g(x̄) )  for all y ∈ Y.

Therefore, if φ is differentiable at g(x̄), then ∇φ(g(x̄)) = λ̄.

The most convenient way to lead up to this important result is through the following, which relates the primal function φ with the dual function L*.

Theorem 18: L*(λ) = inf_{y∈Y} ( φ(y) − λᵀy ).

Proof: For any y ∈ Y,

    L*(λ) = inf_{x∈X} ( f(x) − λᵀg(x) ) ≤ inf { f(x) − λᵀg(x) : x ∈ X, g(x) ≥ y }
          ≤ inf { f(x) − λᵀy : x ∈ X, g(x) ≥ y } = φ(y) − λᵀy.

Therefore, L*(λ) ≤ inf_{y∈Y} ( φ(y) − λᵀy ). It remains to show the opposite inequality. Define Ŷ = { y ∈ Rᵐ : y = g(x) for some x ∈ X }. Then Ŷ ⊆ Y. Now, let x̃ ∈ X and let ỹ = g(x̃) (so ỹ ∈ Ŷ).

Then,

    f(x̃) − λᵀg(x̃) = f(x̃) − λᵀỹ ≥ inf { f(x) : x ∈ X, g(x) ≥ ỹ } − λᵀỹ = φ(ỹ) − λᵀỹ ≥ inf_{y∈Y} ( φ(y) − λᵀy ).

Therefore, since x̃ ∈ X is arbitrary, we have

    L*(λ) = inf_{x∈X} ( f(x) − λᵀg(x) ) ≥ inf_{y∈Y} ( φ(y) − λᵀy ).
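Theorem 18's identity between L* and φ can be verified, up to grid resolution, on the assumed instance:

```python
import numpy as np

# Assumed instance: X = [-5, 5] discretized, f(x) = x**2 - x, g(x) = x - 1.
X = np.linspace(-5.0, 5.0, 10001)
f = X**2 - X
g = X - 1.0

def L_star(lam):
    return np.min(f - lam * g)

def phi(y):
    feas = g >= y
    return np.min(f[feas]) if feas.any() else np.inf

# Theorem 18: L*(lam) = inf over y in Y of (phi(y) - lam*y).
ys = np.linspace(-6.0, float(g.max()), 2001)  # grid covering the relevant part of Y
for lam in [0.0, 0.5, 1.0, 3.0]:
    rhs = min(phi(y) - lam * y for y in ys)
    assert abs(L_star(lam) - rhs) < 0.05      # equal up to discretization error
```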

We can now derive the result dual to that of Theorem 16.

Theorem 19: Let λ̄ ≥ 0 and let x̄ ∈ X solve min_{x∈X} L(x, λ̄). Then λ̄ is a support (convex sense) for φ at g(x̄).

Proof: To show: φ(y) ≥ φ(g(x̄)) + λ̄ᵀ( y − g(x̄) ) for all y ∈ Y. By Theorem 17, we know that (x̄, λ̄) is a saddle-point for

    min f(x),  x ∈ X,  g(x) ≥ g(x̄)

and, therefore,

    φ(g(x̄)) = f(x̄) = L(x̄, λ̄) + λ̄ᵀg(x̄) = L*(λ̄) + λ̄ᵀg(x̄)
             = inf_{y∈Y} ( φ(y) − λ̄ᵀy ) + λ̄ᵀg(x̄)          [by Theorem 18]
             ≤ φ(y) − λ̄ᵀy + λ̄ᵀg(x̄)  for all y ∈ Y,

so φ(y) ≥ φ(g(x̄)) + λ̄ᵀ( y − g(x̄) ) for all y ∈ Y.

Note: If φ is differentiable at g(x̄), then ∇φ(g(x̄)) = λ̄.

Let us define the sets of supports for φ and L*, respectively, as

    ∂φ(z) = { λ ∈ Rᵐ : φ(y) ≥ φ(z) + λᵀ(y − z) for all y ∈ Y }

and

    ∂L*(γ) = { y ∈ Rᵐ : L*(λ) ≤ L*(γ) + yᵀ(λ − γ) for all λ ≥ 0 }.

In terms of this notation, Theorems 16 and 19 can be summarized as follows.

Theorem 20: Let λ̄ ≥ 0 and let x̄ ∈ X solve min_{x∈X} L(x, λ̄). Then

    −g(x̄) ∈ ∂L*(λ̄)  and  λ̄ ∈ ∂φ(g(x̄)).

Moreover, if L* is differentiable at λ̄ and φ is differentiable at g(x̄), then

    ∇L*(λ̄) = −g(x̄)  and  ∇φ(g(x̄)) = λ̄.

Example 1: min_{x∈R} x, −x² ≥ 0.

Recall, this problem has no saddle-point. Also, we showed

    L*(λ) = −1/(4λ) if λ > 0,  and  L*(λ) = −∞ if λ = 0,

so that max_{λ≥0} L*(λ) = sup_{λ≥0} L*(λ) = 0 has no optimizing vector. Also, note that

    Y = { y : y ≤ 0 }  and, for y ∈ Y,  φ(y) = −√(−y).

And this function has no convex support at the origin (ȳ = 0).
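A quick numerical look at Example 1: the dual values −1/(4λ) climb toward 0 without attaining it, and the slope that a convex support of φ at the origin would require grows without bound (a minimal sketch; the grid stands in for X = R):

```python
import numpy as np

# Example 1: min x s.t. -x**2 >= 0, so L(x, lam) = x + lam*x**2.
# For lam > 0 the minimum is at x = -1/(2*lam), giving L*(lam) = -1/(4*lam).
x = np.linspace(-50.0, 50.0, 2_000_001)
for lam in [0.1, 1.0, 10.0]:
    assert abs(np.min(x + lam * x**2) - (-1.0 / (4.0 * lam))) < 1e-6
# sup over lam >= 0 of L*(lam) = 0 is approached as lam -> oo, never attained.

# A convex support s for phi(y) = -sqrt(-y) at y = 0 would need
# -sqrt(-y) >= s*y for all y <= 0, i.e. s >= 1/sqrt(-y) -> oo as y -> 0-.
for y in -np.logspace(-8, 0, 9):
    print(y, 1.0 / np.sqrt(-y))  # required slope grows without bound
```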

Example 2: min_{x∈R} x², x³ − 1 ≥ 0.

Recall, this problem has no saddle-point. We also showed

    L*(λ) = 0 if λ = 0,  and  L*(λ) = −∞ if λ > 0.

Therefore, λ̄ = 0 solves max_{λ≥0} L*(λ). Also, note that Y = R and

    φ(y) = (1 + y)^(2/3) if y ≥ −1,  and

    φ(y) = 0 if y < −1.

Note that φ has no convex support at the origin (ȳ = 0). Also, note that L(x, λ̄) = x², and therefore only x̄ = 0 solves min_{x∈R} L(x, λ̄), and x̄, of course, is not even feasible. Also, note that max_{λ≥0} L(x̄, λ) = +∞, since g(x̄) < 0 (so this is no help either).

Example 3: min x, x ≥ 0, where X = R. Then L(x, λ) = (1 − λ)x, x ∈ R, and

    L*(λ) = 0 if λ = 1,  and  L*(λ) = −∞ if λ ≠ 1.

Therefore, λ* = 1 solves max_{λ≥0} L*(λ). Also, note that (x*, λ*) is a saddle-point, where x* = 0 (show this!). However, L(x, λ*) = 0 for all x ∈ R and, therefore, x* is not the only optimizer for the Lagrangian (parameterized by the optimal dual vector). In particular, the operation

    min_{x∈R} L(x, λ*)

does not automatically provide an optimal primal solution. Note further that the dual function L* is not differentiable at λ*.
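A tiny sketch of Example 3's difficulty, which is exactly what the differentiability hypothesis of Theorem 22 below rules out: at λ* = 1 the Lagrangian is identically zero, so minimizing it cannot single out x* = 0.

```python
import numpy as np

# Example 3: min x s.t. x >= 0, X = R, so L(x, lam) = (1 - lam)*x.
lam_star = 1.0
x = np.linspace(-10.0, 10.0, 101)
L = (1.0 - lam_star) * x
assert np.all(L == 0.0)       # the Lagrangian is flat: every x is a minimizer
print(x[np.argmin(L)])        # e.g. -10.0, an infeasible point, not x* = 0
```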

These examples lead to the following.

Theorem 21: Consider the general primal problem

    GP:  min f(x),  x ∈ X,  g(x) ≥ 0,

and assume GP has an optimal vector x̄. Then GP has a saddle-point if, and only if, the primal function φ has a nonnegative support (convex sense) at the origin (ȳ = 0).

Proof: Suppose (x̄, λ̄) is a saddle-point. To show: λ̄ ∈ ∂φ(0), i.e.,

    φ(y) ≥ φ(0) + λ̄ᵀy  for all y ∈ Y.

Now,

    φ(0) = f(x̄) = L*(λ̄) = inf_{y∈Y} ( φ(y) − λ̄ᵀy ) ≤ φ(y) − λ̄ᵀy,  all y ∈ Y,

or

    φ(y) ≥ φ(0) + λ̄ᵀy,  all y ∈ Y.

Conversely, suppose λ̄ ∈ ∂φ(0), λ̄ ≥ 0. Then

    φ(y) ≥ φ(0) + λ̄ᵀy, all y ∈ Y  ⟹  L*(λ̄) = inf_{y∈Y} ( φ(y) − λ̄ᵀy ) ≥ φ(0).

But the Weak Duality Theorem states that L*(λ̄) ≤ φ(0) and, therefore, L*(λ̄) = φ(0) = f(x̄). Therefore, λ̄ solves the dual problem, and f(x̄) = L*(λ̄) implies that (x̄, λ̄) is a saddle-point.

Theorem 22: Let λ̄ ≥ 0 solve max_{λ≥0} L*(λ), and further assume that L* is differentiable at λ̄. Then any x* ∈ X which solves min_{x∈X} L(x, λ̄) is also optimal for the general primal problem GP.

Proof: Since λ̄ is optimal we must have

    ∇L*(λ̄)ᵀ( λ − λ̄ ) ≤ 0,  all λ ≥ 0

(since { λ − λ̄ : λ ≥ 0 } is the set of feasible directions at λ̄). By HW 48 and Theorem 16 we have ∇L*(λ̄) = −g(x*), and therefore

    −g(x*)ᵀ( λ − λ̄ ) ≤ 0,  all λ ≥ 0,

or

    inf_{λ≥0} λᵀg(x*) ≥ λ̄ᵀg(x*).    (*)

Therefore, g(x*) ≥ 0, since if, say, g_i(x*) < 0, then inf_{λ≥0} λᵀg(x*) = −∞. Therefore, λ̄ᵀg(x*) ≥ 0 (as λ̄ ≥ 0 and g(x*) ≥ 0). But, by setting λ = 0 in (*), we have λ̄ᵀg(x*) ≤ 0. Hence λ̄ᵀg(x*) = 0. Therefore,

    (i) x* ∈ X solves min_{x∈X} L(x, λ̄);
    (ii) x* ∈ X, g(x*) ≥ 0;
    (iii) λ̄ᵀg(x*) = 0;

so (x*, λ̄) is a saddle-point for GP, which, in turn, implies x* is optimal for GP.
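When L* is differentiable, Theorem 22 licenses recovering a primal solution from the Lagrangian minimization. A minimal sketch on a smooth assumed instance (min x² − x subject to x − 1 ≥ 0 with X = R, chosen for illustration, not from the lecture): maximize L* by projected gradient ascent using ∇L*(λ) = −g(x(λ)) (Theorem 16 and HW 48), then read off x(λ̄).

```python
# Assumed instance: min x**2 - x  s.t.  x - 1 >= 0, X = R.  Primal optimum x* = 1.
# L(x, lam) = x**2 - x - lam*(x - 1) has the unique minimizer x(lam) = (1 + lam)/2,
# so L* is differentiable with dL*/dlam = -g(x(lam)) = -((1 + lam)/2 - 1).
lam = 0.0
for _ in range(200):
    x_lam = (1.0 + lam) / 2.0         # solves min_x L(x, lam)
    grad = -(x_lam - 1.0)             # gradient of L* (Theorem 16 / HW 48)
    lam = max(0.0, lam + 0.5 * grad)  # projected ascent step, keeps lam >= 0
print(lam, (1.0 + lam) / 2.0)         # -> 1.0 1.0: the recovered x is primal optimal
```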

HW 50: Consider the problem

    min_{x∈R²} x₁,  x₂ − x₁² ≥ 0,  −x₂ ≥ 0.

[Note: This problem is equivalent to Example 1.] Show whether this problem has a saddle-point.

HW 51: Consider GP and assume there is a vector x̃ ∈ X so that g(x̃) > 0 (i.e., g₁(x̃) > 0, g₂(x̃) > 0, …, g_m(x̃) > 0). Show that the set of "λ" components of saddle-points (if any) is bounded. That is, show that the set

    { λ ≥ 0 : there exists x̄ ∈ X such that (x̄, λ) is a saddle-point }

is a bounded set. [Hint: Use the definition of the saddle-point.] Is the set for HW 50 bounded?

[Note: The proof of Theorem 22 is not entirely rigorous, since λ̄ may have some zero components (i.e., λ̄ may be on the boundary of { λ : λ ≥ 0 }) and we have not said what we mean by L* being differentiable at a boundary point.]