Conjugate Transformations

Since $\sup_{x \in C} [yx - g(x)] = -\inf_{x \in C} [g(x) - yx]$, it is geometrically clear that $y \in D$ iff there is a number $t$ such that the graph of the linear function $-t + yx$ "supports" $g : C$, in that it lies "under" and comes "arbitrarily close" to (epi $g$), as shown in the two examples of Figure 1.

[Figure 1: Asymptotic and Tangential supports. Left panel, "Asymptotic Support": a line of slope $y$ that is asymptotic to $g(x)$ at infinity. Right panel, "Tangential Support": a line of slope $y$ that is tangential to $g(x)$ at a finite point. The number $t$ is marked on the vertical axis in both panels.]

It is also clear that such a $y \in D$ gives rise to the conjugate transform value $h(y) = t$. In essence, $h : D$ comes from the "envelope" of $g : C$, namely, all "linear supports" $(y, t)$ of $g : C$. If there are no such supports, $D = \emptyset$ and $g : C$ has no conjugate transform. For example, $g(x) \overset{\text{def}}{=} -e^{x}$ has no supports and hence no conjugate transform when $C = \mathbb{R}$, but has supports and hence a conjugate transform when $C$ is bounded from above, say $C = (-\infty, +1]$.

If a support $(y, t)$ is tangential at a point, say $(\tilde{x}, g(\tilde{x}))$, and if $g$ is differentiable at $\tilde{x}$, then $y = g'(\tilde{x})$. On the other hand, not every derivative $g'(\tilde{x})$ at some $\tilde{x} \in C$ produces a support $(y, t)$ with $y = g'(\tilde{x})$. The following example also shows that tangential supports can occur at a point $(\tilde{x}, g(\tilde{x}))$ where $g$ is not differentiable.
Refer to Figure 2. The function $g : C$ with $C = [-1, +1]$ and $g(x) = \sqrt{|x|}$ has infinitely many supports at $(0, 0)$, namely, all straight lines through $(0, 0)$ with slope $y \in [-1, +1]$. It also has infinitely many supports at $(-1, +1)$, namely, all straight lines through $(-1, +1)$ with slope $y \in (-\infty, -1]$. Finally, it has infinitely many supports at $(+1, +1)$, namely, all straight lines through $(+1, +1)$ with slope $y \in [+1, +\infty)$. None of these supports occur where $g$ is differentiable, and no points at which $g$ is differentiable produce a support. The support slopes $y$ form the graph of a multivalued function $\partial g$, termed the subderivative of $g$.

[Figure 2: Graph depicting the function and its subderivative. Panels: $g$ against $x$, and $\partial g$ against $x$.]
Refer to Figure 3. The range of $\partial g$ in this example, which is just $\mathbb{R}$, is the domain $D$ of the conjugate transform $h : D$, whose graph and epigraph are easily obtained either analytically or geometrically. Note that $g : C$ is not a closed convex function (because it is not convex), but $h : D$ is (as previously established in the general case). The subderivative $\partial h$ of $h : D$ is in fact just $h'(y)$, except at points $y$ where $h$ is not differentiable, at which points it takes a continuum of values (a property that is shared by all convex functions). The range of $\partial h$ in this example, which is just $[-1, +1]$, is of course the domain $\bar{C}$ of the conjugate transform $\bar{g} : \bar{C}$ of $h : D$. The graph and epigraph of $\bar{g} : \bar{C}$ are easily obtained either analytically or geometrically, as is the graph of its subderivative $\partial \bar{g}$. Note that $C \subseteq \bar{C}$ and $\bar{g}(x) \le g(x)$ for each $x \in C$ (previously proved in the general case).

[Figure 3: Graph depicting a convex function and its subderivative. Panels: $h$ against $y$, $\partial h$ against $y$, $g$ against $x$, and $\partial g$ against $x$.]
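The two transforms in this example can be approximated on a grid. The following is a minimal sketch, not part of the original notes: the helper names, the grid sizes, and the truncation of $D = \mathbb{R}$ to $[-3, +3]$ are assumptions made for illustration. One can check by hand that the supports described above give $h(y) = \max(0, |y| - 1)$ and $\bar{g}(x) = |x|$ on $[-1, +1]$, and the sketch compares the grid values against those formulas.

```python
import math

def conjugate_on_grid(g, xs, y):
    """Approximate h(y) = sup_{x in C} [y*x - g(x)] by a maximum over grid points xs of C."""
    return max(y * x - g(x) for x in xs)

def g(x):
    return math.sqrt(abs(x))

def h_closed_form(y):
    # Value suggested by the supports at (0, 0), (-1, +1) and (+1, +1).
    return max(0.0, abs(y) - 1.0)

# Grid on C = [-1, +1]; it contains -1, 0 and +1, where the supports occur.
N = 400
xs = [-1.0 + 2.0 * i / N for i in range(N + 1)]

# Truncated grid on D = R (an assumption: [-3, +3] is enough for x in [-1, +1]).
ys = [-3.0 + 6.0 * j / 120 for j in range(121)]
h_vals = [conjugate_on_grid(g, xs, y) for y in ys]

def g_bar(x):
    """Approximate the second transform sup_y [x*y - h(y)] using the grid values of h."""
    return max(x * y - hv for y, hv in zip(ys, h_vals))

if __name__ == "__main__":
    for x in (-1.0, -0.5, 0.0, 0.25, 1.0):
        print(x, g(x), g_bar(x))
```

The grid maximum is only an approximation in general; it happens to be sharp here because the supremum is attained at the grid points $-1$, $0$ and $+1$.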
In fact it can be shown in general that $\bar{g} : \bar{C}$ is the "closed convex hull" of $g : C$, in that (epi $\bar{g}$) is the "closed convex hull" of (epi $g$), namely the intersection of all closed convex sets containing (epi $g$). Note also that the graph $\Gamma = \{(x, y) \mid y \in \partial g(x)\}$ of $\partial g$ is a subset of the graph $\bar{\Gamma} = \{(x, y) \mid y \in \partial \bar{g}(x)\}$ of $\partial \bar{g}$, and that both are monotone in the sense that $(y_1 - y_2)(x_1 - x_2) \ge 0$ when points $(x_1, y_1)$ and $(x_2, y_2)$ belong to such a graph. In fact $\bar{\Gamma}$ is the completion of $\Gamma$ in the sense that there is "no monotone curve" larger than $\bar{\Gamma}$ (i.e., properly containing $\bar{\Gamma}$ and hence $\Gamma$). These properties of $\Gamma$ and $\bar{\Gamma}$ can be proved in the general case, as can the property that the graph of $\partial \bar{g}$ is simply the "inverse" of the graph of $\partial h$: a fact that will enable us to calculate the conjugate transform $h : D$ of $g : C$ by "subdifferentiation" of $g : C$, followed by "completion" of $\Gamma$, then "inversion" of $\bar{\Gamma}$, and finally "integration" of $\bar{\Gamma}^{-1}$. In equilibrium problems (rather than optimization problems), we shall see that complete monotone curves $\Gamma$ arise from modeling (rather than from an objective function $g : C$).

Since we have just seen that convex functions and their subderivatives arise naturally from conjugate transformation (even when the function being transformed is not convex), we need to take a careful look at convex functions and their differentiability properties. It is not hard to see that a function $f$ defined on a subset $I$ of $\mathbb{R}$ is convex iff $I$ is an interval (intervals being the only convex subsets of $\mathbb{R}$) and also

$$f(s_2) \le \frac{s_3 - s_2}{s_3 - s_1} f(s_1) + \frac{s_2 - s_1}{s_3 - s_1} f(s_3) \qquad (*)$$

when $s_1 < s_2 < s_3$ and all are in $I$. This inequality (*) is equivalent to each of the three inequalities

$$\frac{f(s_2) - f(s_1)}{s_2 - s_1} \le \frac{f(s_3) - f(s_1)}{s_3 - s_1} \le \frac{f(s_3) - f(s_2)}{s_3 - s_2} \qquad (**)$$

In particular, subtracting $f(s_1)$ from both sides of (*) and dividing by $s_2 - s_1$ gives the left-hand inequality (**); while multiplying (*) by $-1$, adding $f(s_3)$ to both sides of the resulting inequality, and then dividing by $s_3 - s_2$ gives the
right-hand inequality (**). Manipulation of each of the three inequalities (**) to give (*) is even easier. Consequently, we infer that a function $f$ defined on a subset $I$ of $\mathbb{R}$ is convex iff $I$ is an interval and inequalities (**) are satisfied when $s_1 < s_2 < s_3$ and all are in $I$. This characterization of convex functions $f : I$ in terms of the "difference quotients" in (**) is the key to establishing the differentiability properties of convex functions.

Theorem: If $f$ is convex on $I \subseteq \mathbb{R}$, then

1. $f'_+(s) \overset{\text{def}}{=} \lim_{\delta \to 0^+} \dfrac{f(s + \delta) - f(s)}{\delta}$ exists and is either finite or $-\infty$ for each $s \in I \setminus \{\text{right-hand endpoint of } I\}$,

2. $f'_-(s) \overset{\text{def}}{=} \lim_{\delta \to 0^-} \dfrac{f(s + \delta) - f(s)}{\delta}$ exists and is either finite or $+\infty$ for each $s \in I \setminus \{\text{left-hand endpoint of } I\}$,

3. on (int $I$) both the "left-hand derivative" $f'_-$ and the "right-hand derivative" $f'_+$ are finite nondecreasing functions, and $f'_-(s) \le f'_+(s)$ for each $s \in$ (int $I$),

4. the subderivative set $\partial f(s) \overset{\text{def}}{=} \{t \in \mathbb{R} \mid f(s) + t\delta \le f(s + \delta),\ \forall \delta \in I - \{s\}\}$ is given by the formula

$$\partial f(s) = \begin{cases} (-\infty, f'_+(s)] & \text{if } s \text{ is the left-hand endpoint of } I \\ [f'_-(s), f'_+(s)] & \text{if } s \in (\text{int } I) \\ [f'_-(s), +\infty) & \text{if } s \text{ is the right-hand endpoint of } I \end{cases}$$

5. $f$ is continuous on (int $I$),

6. $f$ is differentiable at $s \in$ (int $I$) iff $\partial f(s)$ contains a single point $t$, in which case $f'(s) = t = f'_-(s) = f'_+(s)$,

7. $f''(s) \ge 0$ for each $s \in$ (int $I$) for which $f''(s)$ exists,

8. if $e$ is an endpoint of $I$, then $\lim_{s \to e} f(s)$ exists and is either finite or $+\infty$, and $\lim_{s \to e} f(s) \le f(e)$ when $e \in I$.
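Proof: Straightforward exercise by way of inequality (**) and standard arguments from analysis, when done in the stated order.

Theorem: If $f''$ exists on an open interval $I$ and $f''(s) \ge 0\ \forall s \in I$, then

1. $f$ is convex on $I$,

2. $f$ is strictly convex on $I$ iff $f''(s)$ is positive except possibly at isolated points of $I$.

Proof: For $s_1 < s_2 < s_3$ in $I$, the mean value theorem provides points $s_4 \in (s_1, s_2)$, $s_5 \in (s_2, s_3)$ and $s_6 \in (s_4, s_5)$ such that

$$\frac{f(s_3) - f(s_2)}{s_3 - s_2} - \frac{f(s_2) - f(s_1)}{s_2 - s_1} = f'(s_5) - f'(s_4) = f''(s_6)(s_5 - s_4) \ge 0.$$

The remainder follows from inequality (**), differentiation and integration.

If $g : C$ with $C \subseteq \mathbb{R}$ has a conjugate transform $h : D$, then $g : C$ also has a closed convex hull function $\bar{g} : \bar{C}$ to which the preceding theorems can be applied (with $f = \bar{g}$ and $I = \bar{C}$). In particular, the graph $\Gamma$ of $\partial g$ is a subset of the graph $\bar{\Gamma}$ of $\partial \bar{g}$, which is itself a complete monotone curve. In fact $\Gamma = \bar{\Gamma}$ iff $g : C$ is itself convex and closed, in which case $g : C = \bar{g} : \bar{C}$ and hence $g : C$ and $h : D$ are conjugate functions.

Conjugate functions $g : C$ and $h : D$ can be defined (or constructed) directly from a given complete monotone curve $\Gamma$ and a point $(a, b) \in \Gamma$, a construction that is the key to transforming equilibrium problems into useful optimization problems. The construction is illustrated graphically in Figure 4. With $\Gamma$ monotone and complete, define

$$\gamma(s) \overset{\text{def}}{=} \{t \mid (s, t) \in \Gamma\}, \quad s \in \text{domain}(\Gamma)$$

$$\gamma^{-1}(t) \overset{\text{def}}{=} \{s \mid (s, t) \in \Gamma\}, \quad t \in \text{range}(\Gamma)$$

$$g(x) \overset{\text{def}}{=} \int_a^x \gamma(s)\, ds + \frac{ab}{2}, \qquad C \overset{\text{def}}{=} \left\{ x \;\middle|\; \int_a^x \gamma(s)\, ds \text{ converges} \right\}$$

$$h(y) \overset{\text{def}}{=} \int_b^y \gamma^{-1}(t)\, dt + \frac{ab}{2}, \qquad D \overset{\text{def}}{=} \left\{ y \;\middle|\; \int_b^y \gamma^{-1}(t)\, dt \text{ converges} \right\}$$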
[Figure 4: Complete monotone curve depicting inequality. Horizontal axis $s$ with the points $a$ and $x$ marked; vertical axis $t$ with the points $b$ and $y$ marked. Annotation on the shaded region: due to this area, $xy \le g(x) + h(y)$.]

Using the area interpretation of the definite integrals shows that

$$xy \le g(x) + h(y), \quad \text{with equality iff } (x, y) \in \Gamma,$$

which holds iff $y \in \gamma(x) = \partial g(x)$, which in turn holds iff $x \in \gamma^{-1}(y) = \partial h(y)$. Consequently,

$$y \in \partial g(x) \quad \text{iff} \quad x \in \partial h(y).$$

Observe that $g : C$ and $h : D$ are

• convex, by monotonicity of $\Gamma$,
• closed, by completeness of $\Gamma$,
• conjugate functions.
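The area argument can be checked numerically on a concrete curve. The following is a minimal sketch under stated assumptions: the curve $t = s^3$, the point $(a, b) = (1, 1)$, and the midpoint-rule helper `integrate` are hypothetical choices made for illustration, and the constants are taken as $g(x) = \int_a^x \gamma(s)\,ds + ab/2$ and $h(y) = \int_b^y \gamma^{-1}(t)\,dt + ab/2$.

```python
import math

def integrate(f, lo, hi, n=2000):
    """Signed midpoint-rule approximation of the integral of f from lo to hi."""
    step = (hi - lo) / n
    return step * sum(f(lo + (k + 0.5) * step) for k in range(n))

# Hypothetical complete monotone curve t = s**3, with the point (a, b) = (1, 1) on it.
a, b = 1.0, 1.0

def gamma(s):
    return s ** 3

def gamma_inv(t):
    # Real cube root, valid for negative t as well.
    return math.copysign(abs(t) ** (1.0 / 3.0), t)

def g(x):
    return integrate(gamma, a, x) + a * b / 2.0

def h(y):
    return integrate(gamma_inv, b, y) + a * b / 2.0

def gap(x, y):
    """g(x) + h(y) - x*y: nonnegative, and zero when (x, y) lies on the curve."""
    return g(x) + h(y) - x * y

if __name__ == "__main__":
    print(gap(2.0, 8.0))   # (2, 8) is on the curve
    print(gap(2.0, 1.0))   # (2, 1) is off the curve
```

For this curve the integrals have the closed forms $g(x) = x^4/4 + 1/4$ and $h(y) = \tfrac{3}{4}|y|^{4/3} - 1/4$, so the gap vanishes exactly when $y = x^3$; the numerical values agree with that only up to quadrature error.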
Multidimensional Conjugacy

Young-Fenchel Conjugate Inequality

Theorem: Given "conjugate functions" $g : C$ and $h : D$ (i.e., each is convex, closed, and the conjugate transform of the other),

$$xy \le g(x) + h(y), \quad \forall x \in C,\ y \in D,$$

with equality iff

$$y \in \partial g(x) \overset{\text{def}}{=} \{y \in \mathbb{R}^n \mid g(x) + y(x' - x) \le g(x'),\ \forall x' \in C\}.$$

Proof: The inequality follows from the defining equation

$$h(y) = \sup_{x \in C} [yx - g(x)], \quad \forall y \in D.$$

If $g(x) + y(x' - x) \le g(x')\ \forall x' \in C$, then $yx' - g(x') \le yx - g(x)\ \forall x' \in C$, and hence $h(y) = yx - g(x)$; so the conjugate inequality is an equality. On the other hand, if $xy = g(x) + h(y)$, then $yx - g(x) = h(y) \ge yx' - g(x')\ \forall x' \in C$, and hence $g(x') \ge g(x) + y(x' - x)\ \forall x' \in C$. Q.E.D.

Theorem: Given conjugate functions $g : C$ and $h : D$,

1. $y \in \partial g(x)$ iff $x \in \partial h(y)$,

2. if $C = \prod_{k=1}^{p} C_k$ and $g(x) = \sum_{k=1}^{p} g_k(x^k)$, then $\partial g(x) = \prod_{k=1}^{p} \partial g_k(x^k)$.
Proof:

1. Follows from the preceding theorem and the fact that $g : C$ and $h : D$ are conjugate transforms of one another.

2. Follows from the fact that $y \in \partial g(x)$ iff $\sum_{k=1}^{p} g_k(x^k) + \sum_{k=1}^{p} y^k(x'^k - x^k) \le \sum_{k=1}^{p} g_k(x'^k)\ \forall x' \in C$, iff $g_k(x^k) + y^k(x'^k - x^k) \le g_k(x'^k)\ \forall x'^k \in C_k$ for $k = 1, 2, \ldots, p$, iff $y^k \in \partial g_k(x^k)$ for $k = 1, 2, \ldots, p$. Q.E.D.

Note: Since we previously showed that the separability of $g : C$ hypothesized in (2) implies that $D = \prod_{k=1}^{p} D_k$ and $h(y) = \sum_{k=1}^{p} h_k(y^k)$, a corollary to (2) is that $\partial h(y) = \prod_{k=1}^{p} \partial h_k(y^k)$.

There are three elementary facts that help in the computation of conjugate transforms when the conjugate transforms of closely related functions are already known. In particular, given that $g : C$ has a known conjugate transform $h : D$,

1. for a given scalar $s$, the function $g + s : C$ has conjugate transform $h - s : D$ (because $\sup_{x \in C} [yx - \{g(x) + s\}] = \sup_{x \in C} [yx - g(x)] - s$),

2. for a given vector $v \in \mathbb{R}^n$, the function $g(\cdot + v) : C - \{v\}$ has conjugate transform $h(\cdot) - v(\cdot) : D$ (because $\sup_{x \in C - \{v\}} [yx - g(x + v)]$ becomes, via the change of variables $x + v = z$, $\sup_{z \in C} [y(z - v) - g(z)] = \sup_{z \in C} [yz - g(z)] - vy$),
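3. for a given scalar $\lambda > 0$, the function $\lambda g : C$ has conjugate transform $\lambda h(\cdot / \lambda) : \lambda D$ (because $\sup_{x \in C} [yx - \lambda g(x)] = \lambda \sup_{x \in C} [(y/\lambda)x - g(x)]$).

Example: Since inspection shows that $0 : \{0\}$ has conjugate transform $0 : \mathbb{R}$, we infer that

1. $a : \{0\}$ has conjugate transform $-a : \mathbb{R}$,
2. $a : \{b\}$ has conjugate transform $-a + b(\cdot) : \mathbb{R}$,
3. $-a + b(\cdot) : \mathbb{R}$ has conjugate transform $a : \{b\}$.

Example: Since we know that $\log\left(\sum_{i=1}^{n} e^{x_i}\right) : \mathbb{R}^n$ has conjugate transform $\sum_{i=1}^{n} y_i \log y_i : \{y \ge 0 \mid \sum_{i=1}^{n} y_i = 1\}$, we infer that

(1) $\log\left(\sum_{i=1}^{n} c_i e^{x_i}\right) : \mathbb{R}^n$ has conjugate transform $\sum_{i=1}^{n} y_i \log(y_i / c_i) : \{y \ge 0 \mid \sum_{i=1}^{n} y_i = 1\}$ (because $c_i e^{x_i} = e^{x_i + \log c_i}$ and $\log(y_i / c_i) = \log y_i - \log c_i$),

(2) $\lambda \log\left(\sum_{i=1}^{n} c_i e^{x_i}\right) : \mathbb{R}^n$ has conjugate transform $\sum_{i=1}^{n} y_i \log(y_i / c_i) - \lambda \log \lambda : \{y \ge 0 \mid \sum_{i=1}^{n} y_i = \lambda\}$ (because $\lambda \sum_{i=1}^{n} (y_i / \lambda) \log[y_i / (\lambda c_i)] = \sum_{i=1}^{n} y_i \log(y_i / c_i) - \left(\sum_{i=1}^{n} y_i\right) \log \lambda$, and because $y / \lambda \ge 0$ and $\sum_{i=1}^{n} (y_i / \lambda) = 1$ imply that $y \ge 0$ and $\sum_{i=1}^{n} y_i = \lambda$).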
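The scaled log-sum-exp transform in (2) can be spot-checked numerically. The following is a minimal sketch, assuming that for $y > 0$ the supremum of $yx - \lambda \log(\sum_i c_i e^{x_i})$ is attained at $x_i = \log(y_i / c_i)$ (this follows from setting the gradient to zero and is not stated in the notes); the function names and sample data are hypothetical.

```python
import math

def g(x, c, lam=1.0):
    """lam * log(sum_i c_i * exp(x_i))."""
    return lam * math.log(sum(ci * math.exp(xi) for ci, xi in zip(c, x)))

def h(y, c, lam=1.0):
    """sum_i y_i * log(y_i / c_i) - lam * log(lam), for y >= 0 with sum_i y_i = lam."""
    total = sum(yi * math.log(yi / ci) for yi, ci in zip(y, c) if yi > 0.0)
    return total - lam * math.log(lam)

def objective(x, y, c, lam=1.0):
    """The quantity y.x - g(x) whose supremum over x defines the conjugate transform."""
    return sum(yi * xi for yi, xi in zip(y, x)) - g(x, c, lam)

# Hypothetical sample data: positive coefficients c, and y >= 0 with sum(y) = lam.
c = [0.5, 2.0, 3.0]
lam = 2.0
y = [0.4, 1.0, 0.6]

# Candidate maximizer (assumption: stationary point of the concave objective).
x_star = [math.log(yi / ci) for yi, ci in zip(y, c)]

if __name__ == "__main__":
    print(objective(x_star, y, c, lam), h(y, c, lam))
```

Because the objective is concave in $x$, its value at any other point should not exceed the value at `x_star`, which gives a simple way to test the formula on perturbed points.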
Example: $(1/2)x^2 : \mathbb{R}$ has conjugate transform $(1/2)y^2 : \mathbb{R}$ (by inspection and elementary calculus). Consequently,

1. $(1/2)(x - b)^2 : \mathbb{R}$ has conjugate transform $(1/2)y^2 + by : \mathbb{R}$,
2. $(1/2)\sum_{i=1}^{n} (x_i - b_i)^2 : \mathbb{R}^n$ has conjugate transform $\sum_{i=1}^{n} (1/2)y_i^2 + by : \mathbb{R}^n$,
3. $(1/2)\sum_{i=1}^{n} (x_i - b_i)^2 - d : \mathbb{R}^n$ has conjugate transform $\sum_{i=1}^{n} (1/2)y_i^2 + by + d : \mathbb{R}^n$,
4. $\lambda\left[(1/2)\sum_{i=1}^{n} (x_i - b_i)^2 - d\right] : \mathbb{R}^n$ has conjugate transform $(1/2\lambda)\sum_{i=1}^{n} y_i^2 + by + \lambda d : \mathbb{R}^n$.

Example: $|x| : \mathbb{R}$ has conjugate transform $0 : [-1, +1]$, by inspection. Consequently,
1. $|x - b| : \mathbb{R}$ has conjugate transform $by : [-1, +1]$,
2. $\sum_{i=1}^{n} |x_i - b_i| : \mathbb{R}^n$ has conjugate transform $by : \prod_{i=1}^{n} [-1, +1]$,
3. $\sum_{i=1}^{n} |x_i - b_i| - d : \mathbb{R}^n$ has conjugate transform $by + d : \prod_{i=1}^{n} [-1, +1]$,
4. $\lambda\left[\sum_{i=1}^{n} |x_i - b_i| - d\right] : \mathbb{R}^n$ has conjugate transform $by + \lambda d : \prod_{i=1}^{n} [-\lambda, +\lambda]$.

Exercise: Generalize the preceding examples to $(1/p)|x|^p : \mathbb{R}$, where the given constant $p > 1$. (Hint: while using inspection and elementary calculus you will find it convenient to introduce the constant $q$, where $(1/p) + (1/q) = 1$.)
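The scaled quadratic and absolute-value transforms in item 4 of the two preceding examples can be spot-checked numerically. The following is a minimal sketch: the function names and the sample values of $b$, $d$, $\lambda$ and $y$ are hypothetical, and the maximizers used ($x = b + y/\lambda$ for the quadratic, $x = b$ for the absolute value when every $|y_i| \le \lambda$) are assumptions obtained by elementary calculus and inspection.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def quad(x, b, d, lam):
    """lam * [ (1/2) * sum_i (x_i - b_i)**2 - d ]"""
    return lam * (0.5 * sum((xi - bi) ** 2 for xi, bi in zip(x, b)) - d)

def quad_conj(y, b, d, lam):
    """(1/(2*lam)) * sum_i y_i**2 + b.y + lam*d, defined for every y."""
    return dot(y, y) / (2.0 * lam) + dot(b, y) + lam * d

def absval(x, b, d, lam):
    """lam * [ sum_i |x_i - b_i| - d ]"""
    return lam * (sum(abs(xi - bi) for xi, bi in zip(x, b)) - d)

def absval_conj(y, b, d, lam):
    """b.y + lam*d, defined only when every |y_i| <= lam."""
    if any(abs(yi) > lam for yi in y):
        raise ValueError("y lies outside the domain of the conjugate transform")
    return dot(b, y) + lam * d

# Hypothetical sample data.
b, d, lam = [1.0, -2.0], 0.5, 3.0
y = [1.5, -3.0]
x_quad = [bi + yi / lam for bi, yi in zip(b, y)]  # maximizer of y.x - quad(x)
x_abs = list(b)                                   # a maximizer of y.x - absval(x)

if __name__ == "__main__":
    print(dot(y, x_quad) - quad(x_quad, b, d, lam), quad_conj(y, b, d, lam))
    print(dot(y, x_abs) - absval(x_abs, b, d, lam), absval_conj(y, b, d, lam))
```

Raising an error outside $\prod_i [-\lambda, +\lambda]$ is one way to represent the restricted domain; returning `float("inf")` would be an equally reasonable convention.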
Linear sets, Affine sets and Polyhedral sets

Convex sets and cones

Definition: A nonempty set $S \subseteq E$ ($n$-dimensional Euclidean space) is:

• linear (i.e., a subspace) if $\delta_1 z^1 + \delta_2 z^2 \in S$ when $z^1, z^2 \in S$ and $\delta_1, \delta_2 \in \mathbb{R}$,
• affine (i.e., a linear manifold) if $\delta_1 z^1 + \delta_2 z^2 \in S$ when $z^1, z^2 \in S$ and $\delta_1 + \delta_2 = 1$,
• convex if $\delta_1 z^1 + \delta_2 z^2 \in S$ when $z^1, z^2 \in S$ and $\delta_1 + \delta_2 = 1$ and $\delta_1, \delta_2 \ge 0$.

Observations: Each linear set is affine (but not conversely) and each affine set is convex (but not conversely).

Some important facts (whose proofs are nontrivial):

• Each linear set is the solution set of a finite system of linear homogeneous equations and also the set of all linear combinations of a finite set of basis vectors (and conversely).
• Each affine set is the solution set of a finite system of linear (not necessarily homogeneous) equations and also a particular solution plus linear combinations of basis solutions to the corresponding linear homogeneous system (and conversely).
• The solution set of a finite system of linear equations and/or inequalities (i.e., a polyhedral set) is convex (but not conversely).
• Each convex set that is topologically closed and not all of $E$ is the solution set of a (possibly infinite) system of linear equations and/or inequalities.

Definition: A nonempty set $S \subseteq E$ is conical (i.e., a cone) if $\delta z \in S$ when $z \in S$ and $\delta \ge 0$.

Observations: Each linear set is conical (but not conversely).

Some important facts (whose proofs are nontrivial):

• A nonempty set $S$ is a convex cone iff $\delta_1 z^1 + \delta_2 z^2 \in S$ when $z^1, z^2 \in S$ and $\delta_1, \delta_2 \ge 0$.
• The solution set of a finite system of linear homogeneous equations and/or inequalities (i.e., a special type of polyhedral set) is a convex cone (but not conversely). Such a cone is "finitely generated" in that it consists of all nonnegative linear combinations of a finite set of "generating vectors".
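The two descriptions of such a cone (as a solution set and as a set of nonnegative combinations) can be illustrated on a small planar example. The following is a minimal sketch: the cone $\{z \mid z_1 - z_2 \ge 0,\ z_2 \ge 0\}$, its assumed generating vectors $(1, 0)$ and $(1, 1)$, and all helper names are hypothetical choices made for illustration.

```python
import random

# Hypothetical cone in the plane: all z with z1 - z2 >= 0 and z2 >= 0.
A = [[1.0, -1.0],
     [0.0, 1.0]]

# Vectors assumed to generate it (one along each bounding ray).
GENERATORS = [[1.0, 0.0], [1.0, 1.0]]

def in_cone(z, tol=1e-12):
    """True when every homogeneous inequality (row of A) . z >= 0 holds."""
    return all(sum(a * zi for a, zi in zip(row, z)) >= -tol for row in A)

def combine(coeffs, vectors):
    """Linear combination sum_k coeffs[k] * vectors[k]."""
    dim = len(vectors[0])
    return [sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(dim)]

def coefficients(z):
    """Coefficients expressing z in terms of GENERATORS (nonnegative iff z is in the cone)."""
    return [z[0] - z[1], z[1]]

# Random nonnegative combinations of the generators; each should satisfy A z >= 0.
rng = random.Random(0)
samples = [combine([rng.uniform(0.0, 5.0), rng.uniform(0.0, 5.0)], GENERATORS)
           for _ in range(100)]

if __name__ == "__main__":
    print(all(in_cone(z) for z in samples))
    print(in_cone([0.0, 1.0]))
```

In this example the coefficients can be read off directly; for a general polyhedral cone, finding a finite generating set is a nontrivial computation that this sketch does not attempt.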