Division of the Humanities and Social Sciences
Ec 181, KC Border
Convex Analysis and Economic Theory, Winter 2018

Topic 16: Fenchel conjugates

16.1 Conjugate functions

Recall from Proposition 14.1.1 that $p$ is a subgradient of a proper convex function $f$ at the point $x$ if and only if there is some (finite) real number $\beta$ such that the affine function $g \colon y \mapsto p \cdot y - \beta$ satisfies $p \cdot y - \beta \le f(y)$ for all $y$ and $g(x) = f(x)$. In other words, we have
$$\beta = \max_y \; p \cdot y - f(y).$$
This suggests the following definition.

16.1.1 Definition Let $f \colon \mathbf{R}^m \to \mathbf{R}^\sharp$ be a convex (extended real-valued) function. The Fenchel conjugate of $f$ is the function $f^* \colon \mathbf{R}^m \to \mathbf{R}^\sharp$ defined by
$$f^*(p) = \sup_{y \in \mathbf{R}^m} \; p \cdot y - f(y). \tag{1}$$
If $g$ is concave, its (concave) conjugate $g^*$ is defined by
$$g^*(p) = \inf_{x \in \mathbf{R}^n} \; p \cdot x - g(x).$$
Note that if $f$ is concave (respectively convex), then $-f$ is convex (respectively concave) and
$$(-f)^*(p) = -\bigl( f^*(-p) \bigr).$$

16.1.2 Proposition Let $f \colon \mathbf{R}^m \to \mathbf{R}^\sharp$ be a convex function. Its Fenchel conjugate $f^*$ is a closed convex function. Moreover $f^*$ is proper if and only if $f$ is proper.

Proof: The conjugate of a convex function is closed and convex since it is the pointwise supremum of the affine functions $g_y \colon p \mapsto p \cdot y - f(y)$.
If $f^*$ is proper, then $f^*(p)$ is finite for some $p$, so (1) implies that $f(y) > -\infty$ for every $y$ and $f(y)$ must be finite for some $y$. That is, $f$ is proper.
Now assume that $f$ is proper, and let $x \in \operatorname{ri}\operatorname{dom} f$. Then the affine function $g \colon p \mapsto p \cdot x - f(x)$ on the dual space satisfies $f^*(p) \ge g(p) = p \cdot x - f(x)$ for every $p$, so $f^*$ does not take on the value $-\infty$. Moreover, by Corollary 14.1.5, there is some $q \in \partial f(x)$, so $f^*(q) = q \cdot x - f(x)$, which is finite. Therefore $\operatorname{dom} f^*$ is nonempty. Thus $f^*$ is proper.

KC Border: for Ec 181, Winter 2018 src: Conjugates v. 2018.02.20::15.56
This implies that $f^*$ also has a conjugate $(f^*)^*$, usually written as just $f^{**}$.

16.1.3 Proposition (Fenchel's inequality) If $f$ is a proper convex function on $\mathbf{R}^m$, then for all $x, p \in \mathbf{R}^m$, we have
$$f(x) + f^*(p) \ge p \cdot x. \tag{F}$$

Proof: Note that if $f$ is proper then $f^*$ is proper, so $f(x) + f^*(p)$ never takes the form $\infty - \infty$ or $-\infty + \infty$, so the left-hand side is always a well defined extended real number. By definition $f^*(p) = \sup_{x \in \mathbf{R}^m} p \cdot x - f(x)$, so for all $x$ and $p$ we have $f^*(p) \ge p \cdot x - f(x)$, and rearranging gives the desired inequality.

16.1.4 Proposition If $f$ is convex, then $\operatorname{cl} f = f^{**}$. Consequently if $f$ is a closed convex function, then $f = f^{**}$.

Proof: Let $g \colon x \mapsto p \cdot x - \beta$ be an affine function. Then $g \le f$ if and only if $\beta \ge f^*(p)$. (Why?) By definition,
$$\operatorname{cl} f(x) = \sup \{ g(x) : g \text{ is affine and } g \le f \}
= \sup \{ p \cdot x - \beta : p \in \mathbf{R}^m \ \&\ \beta \ge f^*(p) \}
= \sup_p \{ p \cdot x - f^*(p) \}.$$
But this last term is by definition $(f^*)^*(x)$.

Now when $p$ is a subgradient of $f$ at $x$, then in fact Fenchel's inequality holds with equality. The converse is true. That is, if Fenchel's Inequality binds for some $x$ and $p$, setting $\beta = f^*(p)$ implies $p \cdot x - \beta = f(x)$ and, for all $y$, $p \cdot y - \beta \le f(y)$.

16.1.5 Theorem If $f$ is a proper closed convex function, then the following are equivalent.
1. $f(x) + f^*(p) = p \cdot x$.
2. $p \in \partial f(x)$.
3. $x \in \partial f^*(p)$.
4. $f^*(p) = p \cdot x - f(x) = \max_y \; p \cdot y - f(y)$.
5. $f(x) = p \cdot x - f^*(p) = \max_q \; q \cdot x - f^*(q)$.
If $g$ is a proper closed concave function with concave conjugate $g^*$, then the following are equivalent.
1. $g(x) + g^*(p) = p \cdot x$.
2. $p \in \partial g(x)$.
3. $x \in \partial g^*(p)$.
4. $g^*(p) = p \cdot x - g(x) = \min_y \; p \cdot y - g(y)$.
5. $g(x) = p \cdot x - g^*(p) = \min_q \; q \cdot x - g^*(q)$.

Proof: The equivalence of (1), (2), and (4) is just a restatement of Proposition 14.1.1 in terms of $f^*(p)$. Since $f$ is closed, by Proposition 16.1.4 we have $f = f^{**}$, and the equivalence of (1), (3), and (5) follows by interchanging the roles of $f$ and $f^*$.

Economic interpretations of the conjugate

Consider a multiproduct firm that can produce $n$ different outputs. Fix the factor wages and let $f \colon \mathbf{R}^n_+ \to \mathbf{R}$ be its cost as a function of the output vector, that is, $f(x)$ is the cost of producing the output vector $x \in \mathbf{R}^n_+$. Convexity of the cost function captures the property of decreasing returns to scale in production. Now let $p$ be a vector of output prices. Then $p \cdot x - f(x)$ is the firm's profit from choosing the output vector $x$. The convex conjugate $f^*$ is just the firm's optimal profit function, that is, $f^*(p)$ is the maximum profit the firm can make at prices $p$.

Or consider a firm that produces one good from $n$ inputs, where the price of the output good has been normalized to unity. Let the concave function $g$ be its production function, so that $g(x)$ is the quantity (and value) of output from the input vector $x \in \mathbf{R}^n_+$. Concavity of the production function again captures the property of decreasing returns to scale in production. Let $p$ be a vector of input prices. Then $p \cdot x - g(x)$ is the firm's loss from choosing the input vector $x$. The concave conjugate $g^*(p) = \inf_x \; p \cdot x - g(x)$ is the optimal value function for the loss minimization problem, that is, $g^*(p)$ is the minimum loss (so $-g^*(p)$ is the maximum profit) the firm can make at input prices $p$.

Examples

When the conjugate $f^*(p) = \sup_y \; p \cdot y - f(y)$ is achieved as a maximum at $x$ and $f$ is differentiable at $x$, we may be able to use the first order condition $p = f'(x)$ to solve for $f^*(p)$ explicitly,
$$f^*(p) = p \cdot x - f(x) \quad\text{where } x \text{ satisfies } f'(x) = p. \tag{2}$$
In order for this to be sufficient, we need that $f'$ be invertible.

16.1.6 Example (The conjugate of $e^x$) For $f(x) = e^x$, we have $f'(x) = e^x$. So $p = f'(x)$ implies $p = e^x$ or $x = \ln p$. Thus
$$f^*(p) = \bigl. px - e^x \bigr|_{x = \ln p} = p \ln p - p.$$
As a check, we compute $f^{**}$. The derivative of $p \ln p - p$ is $\ln p + 1 - 1 = \ln p$, so the first-order condition $x = (f^*)'(p)$ implies $x = \ln p$ or $p = e^x$. Then
$$(f^*)^*(x) = \bigl. px - (p \ln p - p) \bigr|_{p = e^x} = x e^x - (x e^x) + e^x = e^x.$$

16.1.7 Example (The conjugate of $x^2/2$) For $f(x) = x^2/2$, we have $f'(x) = x$. So $p = f'(x)$ implies $x = p$. Thus
$$f^*(p) = \bigl. \bigl( px - x^2/2 \bigr) \bigr|_{x = p} = p^2 - \frac{p^2}{2} = \frac{p^2}{2}.$$
Similarly
$$(f^*)^*(x) = \bigl. px - p^2/2 \bigr|_{p = x} = x^2/2.$$

16.1.8 Example (The concave conjugate of $\ln x$) For
$$g(x) = \begin{cases} \ln x & x > 0 \\ -\infty & x \le 0 \end{cases}$$
we have $g'(x) = 1/x$ and $g''(x) = -1/x^2 < 0$, so $g$ is concave. Also the first-order condition $p = g'(x)$ for a minimum implies $p = 1/x$ or $x = 1/p$. Thus
$$g^*(p) = \bigl. px - \ln x \bigr|_{x = 1/p} = 1 - \ln(1/p) = 1 - (\ln 1 - \ln p) = 1 + \ln p,$$
as $\ln 1 = 0$, or more properly
$$g^*(p) = \begin{cases} 1 + \ln p & p > 0 \\ -\infty & p \le 0, \end{cases}$$
so the effective domain of $g^*$ is also $(0, \infty)$. Now the derivative of $1 + \ln p$ is $1/p$, so $x = (g^*)'(p)$ implies $x = 1/p$ or $p = 1/x$. Then
$$(g^*)^*(x) = \bigl. px - (1 + \ln p) \bigr|_{p = 1/x} = 1 - \bigl( 1 + \ln(1/x) \bigr) = \ln x, \qquad x > 0.$$
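As a numerical sanity check on Examples 16.1.6 through 16.1.8, the following Python sketch approximates each conjugate by brute-force optimization over a finite grid and compares it with the closed form derived in the example. The helper names, the grids, and the test price $p = 2$ are assumptions of the sketch, not part of the notes, and the grid approximation is only trustworthy when the optimizer lies inside the grid.

```python
import math

def convex_conjugate(f, p, grid):
    """Approximate f*(p) = sup_y p*y - f(y) by maximizing over a finite grid."""
    return max(p * y - f(y) for y in grid)

def concave_conjugate(g, p, grid):
    """Approximate g*(p) = inf_x p*x - g(x) by minimizing over a finite grid."""
    return min(p * x - g(x) for x in grid)

# Hypothetical grids, chosen wide enough to contain the optimizers for p = 2.
real_grid = [-10 + 0.001 * k for k in range(20001)]      # covers [-10, 10]
positive_grid = [0.001 * k for k in range(1, 20001)]     # covers (0, 20]

p = 2.0
exp_conj = convex_conjugate(math.exp, p, real_grid)                 # Example 16.1.6
square_conj = convex_conjugate(lambda x: x * x / 2, p, real_grid)   # Example 16.1.7
log_conj = concave_conjugate(math.log, p, positive_grid)            # Example 16.1.8

print(exp_conj, p * math.log(p) - p)     # compare with p ln p - p
print(square_conj, p * p / 2)            # compare with p^2 / 2
print(log_conj, 1 + math.log(p))         # compare with 1 + ln p
```

Each printed pair should agree up to the grid resolution; a disagreement usually means the grid does not contain the maximizer or minimizer for the chosen $p$.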
16.1.9 Example (The convex conjugate of $x \ln x$) For
$$f(x) = \begin{cases} x \ln x & x > 0 \\ \infty & x \le 0 \end{cases}$$
we have $f'(x) = 1 + \ln x$ and $f''(x) = 1/x > 0$, so $f$ is convex. See Figure 16.1.1. Then the first-order condition $p = f'(x)$ implies $p = 1 + \ln x$ or $x = e^{p-1}$. Thus
$$f^*(p) = px - f(x) = \bigl. px - x \ln x \bigr|_{x = e^{p-1}} = p e^{p-1} - e^{p-1} \ln e^{p-1} = e^{p-1},$$
so the effective domain of $f^*$ is all of $\mathbf{R}$.

[Figure 16.1.1. The graph of $x \ln x$.]

To verify that $f^{**} = f$, the derivative of $e^{p-1}$ is $e^{p-1}$, so $x = (f^*)'(p)$ implies $x = e^{p-1}$ or $p = 1 + \ln x$. Then
$$(f^*)^*(x) = px - f^*(p) = \bigl. px - e^{p-1} \bigr|_{p = 1 + \ln x} = (1 + \ln x)\,x - e^{(1 + \ln x) - 1} = x \ln x, \qquad x > 0.$$

16.1.10 Exercise What is the concave conjugate of $f(x_1, x_2) = x_1^{1/2} x_2^{1/2}$ with effective domain $\mathbf{R}^2_+$?

16.1.11 Exercise Let $\alpha > 1$ and define
$$f(x) = \frac{x^\alpha}{\alpha}$$
for $x \ge 0$. Then $f$ is convex. Prove that
$$f^*(p) = \frac{p^\beta}{\beta}$$
for $p \ge 0$, where $\frac{1}{\alpha} + \frac{1}{\beta} = 1$.
So by Fenchel's Inequality,
$$\frac{x^\alpha}{\alpha} + \frac{p^\beta}{\beta} \ge px \qquad\text{for } x, p \ge 0.$$
Conclude that if $g$ is a function that satisfies $\int |g|^\alpha \, d\mu < \infty$ and $h$ satisfies $\int |h|^\beta \, d\mu < \infty$, then $\int |gh| \, d\mu < \infty$. (Perhaps you have heard of conjugate exponents?)

16.1.12 Exercise (Cf. Example 5.6, Borwein and Lewis [2]) Find the convex conjugate of $f$ for the following cases:

ln, > 0 1. f() = ln, > 0 2. f() = 0, = 0 e 1, 0 3. f() = α α 4. f() = 1 α, 0 where 0 < α < 1., otherwise, α α 5. f() = 2 2, 0 α ln ( + 1) ln( + 1), > 0 6. f() = 1 7. f() =, > 0

Sample answer: 1. f() = ln, > 0 f () = ln( ), < 0
ln, > 0 2. f() = 0, = 0 f () = e. 3. f() = e 1, 0 f () = ln + 1, 1 f () =, < 1. α α 4. f() = 1 α, 0 ( ) f 1 β () = β, β, otherwise,, otherwise, where 0 < α < 1 and β = α/(1 α). 1 ) α α 5. f() = 2 2, 0 α α( + f 2 1, 0 () = ln ( + 1) ln( + 1), > 0 6. f() = ln ( + 1) ln( + 1), > 0 7. f() = ( α 1 + f () = 2 1 ), 0 ln(1 e f ), < 0 () = 8. f() = 1/ > 0, 0. f () = 2, 0

16.1.13 Exercise The entropy of a vector $x \in \mathbf{R}^m_+$, often denoted $H(x)$, is defined by
$$H(x) = -\sum_{i=1}^m x_i \ln x_i,$$
with the convention that $0 \ln 0 = 0$. For probability vectors, the entropy is a measure of the uncertainty associated with $x$; see, for instance, Jaynes [4] or Kullback [5]. Verify that entropy is a proper strictly concave function. Show that its concave conjugate is given by
$$H^*(p) = -\sum_{i=1}^m e^{-p_i - 1} = -\frac{1}{e} \sum_{i=1}^m e^{-p_i}.$$
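A numerical spot check can help catch algebra slips while working Exercise 16.1.13; it is not a substitute for the proof. The sketch below uses the fact that $p \cdot x - H(x) = \sum_i (p_i x_i + x_i \ln x_i)$ separates across coordinates, so the infimum over $\mathbf{R}^m_+$ can be taken one coordinate at a time. The function names, the grid, and the test vector `p` are assumptions of the sketch.

```python
import math

def x_log_x(x):
    """x ln x, with the convention 0 ln 0 = 0."""
    return 0.0 if x == 0 else x * math.log(x)

def entropy_conjugate_on_grid(p, grid):
    """Approximate H*(p) = inf_{x >= 0} p.x - H(x), one coordinate at a time."""
    return sum(min(p_i * x + x_log_x(x) for x in grid) for p_i in p)

def entropy_conjugate_closed_form(p):
    """The formula stated in the exercise: -sum_i exp(-p_i - 1)."""
    return -sum(math.exp(-p_i - 1) for p_i in p)

grid = [0.0005 * k for k in range(20001)]    # hypothetical grid covering [0, 10]
p = [0.5, -1.0, 2.0]                         # hypothetical test prices

approx = entropy_conjugate_on_grid(p, grid)
exact = entropy_conjugate_closed_form(p)
print(approx, exact)
```

The two numbers should agree to several decimal places as long as each coordinatewise minimizer $e^{-p_i - 1}$ lies inside the grid.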
16.2 Support functions are conjugates of indicator functions

Recall that the convex analyst's indicator function is defined by
$$\delta(x \mid C) = \begin{cases} 0 & x \in C \\ +\infty & x \notin C. \end{cases}$$
The indicator of $C$ is a convex function if and only if $C$ is a convex set, and is a proper closed convex function if and only if $C$ is a nonempty closed convex set. Now
$$p \cdot x - \delta(x \mid C) = \begin{cases} p \cdot x & x \in C \\ -\infty & x \notin C, \end{cases}$$
so the conjugate $\delta^*(p \mid C)$ satisfies
$$\delta^*(p \mid C) = \sup_x \; p \cdot x - \delta(x \mid C) = \sup_{x \in C} p \cdot x,$$
so
$$\delta^*(p \mid C) = \pi_C(p),$$
where $\pi_C$ is the profit function. On the other hand, if we look at the concave indicator function $-\delta(x \mid C)$, we get
$$(-\delta)^*(p \mid C) = -(\delta^*)(-p \mid C) = c_C(p),$$
where $c_C$ is the cost function of $C$.

We already noted another relation between conjugates and support functions. For a proper closed convex function $f$,
$$\delta^*\bigl( (p, -1) \mid \operatorname{epi} f \bigr) = f^*(p).$$

16.3 Conjugate of an affine function

16.3.1 Example Since an affine function $f$ is both concave and convex, we have a problem, albeit a minor problem. The concave conjugate and the convex conjugate do not agree, but only differ outside their common effective domain. To see this, consider the affine function
$$f(x) = p \cdot x - \beta.$$
Then
$$f^*(q) = \sup_x \; q \cdot x - f(x) = \sup_x \; (q - p) \cdot x + \beta \quad\text{for } f \text{ convex},$$
$$f^*(q) = \inf_x \; q \cdot x - f(x) = \inf_x \; (q - p) \cdot x + \beta \quad\text{for } f \text{ concave}.$$
So
$$f^*(p) = \beta$$
either way, but if $q \ne p$, then by choosing the appropriate $x$ we can make $(q - p) \cdot x$ be any real number. So the effective domain of $f^*$ is the singleton $\{p\}$ and $f^*(p) = \beta$, and outside the effective domain $f^*$ is $\infty$ or $-\infty$ depending on whether $f$ is treated as convex or concave.
Let's find the (convex) conjugate of $f^*$. Now
$$q \cdot x - f^*(q) = \begin{cases} p \cdot x - \beta & q = p \\ -\infty & q \ne p, \end{cases}$$
so
$$(f^*)^*(x) = \sup_q \; q \cdot x - f^*(q) = p \cdot x - \beta = f(x).$$

16.4 Conjugate functions and maximization: Fenchel's Duality Theorem

Let $f \colon C \to \mathbf{R}$ be a proper closed concave function on a convex subset $C$ of $\mathbf{R}^n$. Assume $\bar x$ maximizes $f$ over $C$. If we extend $f$ to be concave on all of $\mathbf{R}^n$ by setting $f(x) = -\infty$ for $x$ not in $C$, then $\bar x$ still maximizes $f$ over $\mathbf{R}^n$. Now by Lemma 14.1.8 we have $0 \in \partial f(\bar x)$, so by Theorem 16.1.5, it follows that $\bar x \in \partial f^*(0)$ and $f(\bar x) = -f^*(0)$, where $f^*$ is the (concave) conjugate of $f$. This (fortunately) agrees with the definition $f^*(p) = \sup\{\alpha : f(x) \le p \cdot x - \alpha \text{ for all } x \in \mathbf{R}^n\}$, which reduces to $f^*(0) = -\sup_x f(x)$.

But there is a more interesting relationship between conjugates and maximization. The next result is due to Fenchel [3, §47–48, pp. 105–109]. It also appears in Rockafellar [6, Theorem 31.1, pp. 327–329], who also provides a number of variations. It states that every concave maximization problem has a dual convex minimization problem, and the solutions to the two coincide.

16.4.1 Fenchel's Duality Theorem (Convex version) Let $f$ be a proper convex function and $g$ be a proper concave function on $\mathbf{R}^n$. If $\operatorname{ri}\operatorname{dom} f \cap \operatorname{ri}\operatorname{dom} g \ne \emptyset$, then
$$\inf_x \; f(x) - g(x) = \sup_p \; g^*(p) - f^*(p),$$
where $f^*$ is the convex conjugate of $f$ and $g^*$ is the concave conjugate of $g$. Moreover, the supremum is attained for some $\bar p \in \mathbf{R}^n$. (Note that since the functions are extended real-valued, the supremum may be attained yet be infinite.)
If in addition $f$ and $g$ are closed and if $\operatorname{ri}\operatorname{dom} f^* \cap \operatorname{ri}\operatorname{dom} g^* \ne \emptyset$, then the infimum is attained at some $\bar x \in \operatorname{dom} f \cap \operatorname{dom} g$, and is finite.

Interchanging $f$ and $g$ and sup and inf gives the following.

16.4.2 Fenchel's Duality Theorem (Concave version) Let $f$ be a proper convex function and $g$ be a proper concave function on $\mathbf{R}^n$. If $\operatorname{ri}\operatorname{dom} f \cap \operatorname{ri}\operatorname{dom} g \ne \emptyset$, then
$$\sup_x \; g(x) - f(x) = \inf_p \; f^*(p) - g^*(p),$$
where $f^*$ is the convex conjugate of $f$ and $g^*$ is the concave conjugate of $g$. Moreover, the infimum is attained for some $\bar p \in \mathbf{R}^n$.
If in addition $f$ and $g$ are closed and if $\operatorname{ri}\operatorname{dom} f^* \cap \operatorname{ri}\operatorname{dom} g^* \ne \emptyset$, then the supremum is attained at some $\bar x \in \operatorname{dom} f \cap \operatorname{dom} g$, and is finite.

Proof of concave version: From Fenchel's Inequality, for every $x$ and $p$,
$$f(x) + f^*(p) \ge p \cdot x \ge g(x) + g^*(p), \tag{3}$$
so
$$f^*(p) - g^*(p) \ge g(x) - f(x) \qquad\text{for all } x, p.$$
(Since we are subtracting extended real-valued functions, we should make sure that the meaningless expression $\infty - \infty$ does not occur. Now $f$ and $g$ are proper convex and concave functions respectively, so $f(x) > -\infty$ and $g(x) < \infty$ for all $x$, so $g(x) - f(x)$ is a well defined extended real number. By Proposition 16.1.2, both $f^*$ and $g^*$ are proper, so by the same logic, $f^*(p) - g^*(p)$ is a well defined extended real number.)
Therefore taking the infimum over $p$ on the left and the supremum over $x$ on the right,
$$\inf_p \; f^*(p) - g^*(p) \ge \sup_x \; g(x) - f(x). \tag{4}$$
We need now to show that there is no duality gap, that is, the reverse inequality also holds.
So let $\alpha = \sup_x \; g(x) - f(x)$. If $\alpha = \infty$, then (4) holds with equality, so assume $\alpha < \infty$. We also know that $\alpha > -\infty$ since $g(x) - f(x)$ is finite for any $x \in \operatorname{dom} f \cap \operatorname{dom} g$. Also $\alpha$ satisfies
$$f(x) + \alpha \ge g(x) \qquad\text{for all } x.$$
Now consider the epigraph $A$ of $f + \alpha$, and the strict hypograph $B$ of $g$,
$$A = \{ (x, \beta) \in \mathbf{R}^n \times \mathbf{R} : \beta \ge f(x) + \alpha \},$$
$$B = \{ (x, \beta) \in \mathbf{R}^n \times \mathbf{R} : \beta < g(x) \}.$$
Then $A$ and $B$ are disjoint nonempty convex subsets of $\mathbf{R}^n \times \mathbf{R}$, so by Theorem 8.5.1 there exists a nonzero $(\bar p, \lambda) \in \mathbf{R}^n \times \mathbf{R}$ that properly separates $A$ and $B$, say $(\bar p, \lambda) \cdot A \le (\bar p, \lambda) \cdot B$.
It follows then that $\lambda < 0$. (To see this, suppose $\lambda = 0$. Then proper separation of $A$ and $B$ by $(\bar p, 0)$ implies $\inf_{x \in \operatorname{dom} f} \bar p \cdot x < \sup_{y \in \operatorname{dom} g} \bar p \cdot y$, which implies that $\bar p$ properly separates $\operatorname{dom} f$ and $\operatorname{dom} g$, which contradicts $\operatorname{ri}\operatorname{dom} f \cap \operatorname{ri}\operatorname{dom} g \ne \emptyset$
(Theorem 8.5.1). If $\lambda > 0$, then for large enough $\beta > 0$, we have $(\bar p, \lambda) \cdot (x, \beta) > (\bar p, \lambda) \cdot \bigl( x, g(x) \bigr)$, a contradiction of the separation inequality.)
Thus without loss of generality we may take $\lambda = -1$. Then separation implies
$$\bar p \cdot x - f(x) - \alpha \le \bar p \cdot y - g(y) \qquad\text{for all } x, y.$$
Taking the supremum on the left and the infimum on the right gives
$$f^*(\bar p) - \alpha = \sup_x \; \bar p \cdot x - f(x) - \alpha \le \inf_y \; \bar p \cdot y - g(y) = g^*(\bar p).$$
Recalling the definition of $\alpha$ gives
$$\sup_x \; g(x) - f(x) = \alpha \ge f^*(\bar p) - g^*(\bar p) \ge \inf_p \; f^*(p) - g^*(p).$$
This proves the reverse of inequality (4), so these are actually equalities, there is no gap, and $\bar p$ attains the infimum.
Now assume that $f$ and $g$ are closed, and that $\operatorname{ri}\operatorname{dom} f^* \cap \operatorname{ri}\operatorname{dom} g^* \ne \emptyset$. Apply the argument just used to the functions $f^*$ and $g^*$, to get $\sup_p \; g^*(p) - f^*(p) = \inf_x \; f^{**}(x) - g^{**}(x)$ and is finite. Now use the fact that $f^{**} = f$ and $g^{**} = g$, to get that the infimum of $f - g$, and hence the supremum of $g - f$, is attained for some $\bar x$.

16.5 Infimal convolution

16.5.1 Definition The infimal convolution of proper convex functions $f$ and $g$, denoted $f \mathbin{\square} g$, is defined by
$$(f \mathbin{\square} g)(x) = \inf_z \{ f(x - z) + g(z) \} = \inf \{ f(y) + g(z) : y + z = x \}.$$
More generally, the infimal convolution of proper convex functions $f_1, \dots, f_n$ is
$$(f_1 \mathbin{\square} f_2 \mathbin{\square} \cdots \mathbin{\square} f_n)(x) = \inf \{ f_1(z_1) + \cdots + f_n(z_n) : z_1 + \cdots + z_n = x \}.$$
Similarly, the supremal convolution of concave functions $f_1, \dots, f_n$ is
$$(f_1 \mathbin{\square} f_2 \mathbin{\square} \cdots \mathbin{\square} f_n)(x) = \sup \{ f_1(z_1) + \cdots + f_n(z_n) : z_1 + \cdots + z_n = x \}.$$

The proof of the next lemma is immediate from the definitions.

16.5.2 Lemma For proper convex functions $f$ and $g$,
$$(f \mathbin{\square} g)(x) = \inf \{ \alpha : (x, \alpha) \in \operatorname{epi} f + \operatorname{epi} g \}.$$
Similarly, for proper concave functions $f$ and $g$,
$$(f \mathbin{\square} g)(x) = \sup \{ \alpha : (x, \alpha) \in \operatorname{hypo} f + \operatorname{hypo} g \}.$$
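To make Definition 16.5.1 concrete, here is a small Python sketch that approximates an infimal convolution by minimizing over a finite grid of $z$ values. The example pair, $f(x) = x^2/2$ and $g(x) = |x|$, is not taken from the notes; it is chosen because its infimal convolution has a well-known closed form (the Huber function: $x^2/2$ for $|x| \le 1$ and $|x| - 1/2$ otherwise), which gives something to compare against. All names and the grid are assumptions of the sketch.

```python
def inf_convolution(f, g, x, grid):
    """Approximate inf_z f(x - z) + g(z) by minimizing over a finite grid of z."""
    return min(f(x - z) + g(z) for z in grid)

def quadratic(x):
    return x * x / 2

def huber(x):
    """Closed form of the infimal convolution of x^2/2 and |x|."""
    return x * x / 2 if abs(x) <= 1 else abs(x) - 0.5

z_grid = [-5 + 0.001 * k for k in range(10001)]    # hypothetical grid on [-5, 5]

approx = {x: inf_convolution(quadratic, abs, x, z_grid) for x in (-3.0, 0.5, 3.0)}
for x, value in approx.items():
    print(x, value, huber(x))
```

For the three test points the grid minimum should match the Huber value up to the grid resolution; points far outside the grid would need a wider range of $z$.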
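16.5.3 Corollary For proper convex functions $f$ and $g$, the infimal convolution $f \mathbin{\square} g$ is convex. Similarly, the supremal convolution of proper concave functions is concave.

Note that Lemma 16.5.2 does not imply that $\operatorname{epi}(f \mathbin{\square} g) = \operatorname{epi} f + \operatorname{epi} g$. Indeed this is not generally true. Just consider $f(x) = e^x$ and $g(x) = e^{-x}$. Then $\operatorname{epi} f + \operatorname{epi} g = \{ (x, \alpha) \in \mathbf{R} \times \mathbf{R} : \alpha > 0 \}$, whereas $\operatorname{epi}(f \mathbin{\square} g) = \{ (x, \alpha) \in \mathbf{R} \times \mathbf{R} : \alpha \ge 0 \}$.
Also, the reason we defined the infimal convolution for proper functions and not more generally was to avoid the possibility of $\infty - \infty$. Lemma 16.5.2 provides an alternate definition that could be used more generally.

The next result shows that addition and infimal convolution are dual operations under Fenchel conjugation.

16.5.4 Theorem (cf. Aubin [1, Theorem 3.4, p. 37]) Let $f$ and $g$ be proper closed convex (or concave) functions on $\mathbf{R}^n$. If $\operatorname{ri}\operatorname{dom} f \cap \operatorname{ri}\operatorname{dom} g \ne \emptyset$, then for each $p \in \operatorname{dom} f^* \cap \operatorname{dom} g^*$, there is some $q$ satisfying
$$(f + g)^*(p) = f^*(p - q) + g^*(q) = (f^* \mathbin{\square} g^*)(p).$$

Proof: I'll prove the concave case. By definition,
$$(f + g)^*(p) = \inf_x \; p \cdot x - \bigl( f(x) + g(x) \bigr)$$
$$= \inf_x \; \bigl( -g(x) \bigr) - \bigl( f(x) - p \cdot x \bigr)$$
$$= \sup_q \; (f - p)^*(q) - (-g)^*(q),$$
where the last equality is the convex version of Fenchel's Duality Theorem 16.4.1 applied to the convex function $-g$ and the concave function $x \mapsto f(x) - p \cdot x$. Moreover this supremum is attained for some $\bar q$. Now recall that $(-g)^*(\bar q) = -g^*(-\bar q)$, so define $q = -\bar q$. Furthermore, since $f - p$ is concave, its conjugate is an infimum:
$$(f - p)^*(\bar q) = \inf_x \; \bar q \cdot x - \bigl( f(x) - p \cdot x \bigr) = \inf_x \; (p + \bar q) \cdot x - f(x) = f^*(p + \bar q).$$
Substituting above yields
$$(f + g)^*(p) = f^*(p + \bar q) + g^*(-\bar q) = f^*(p - q) + g^*(q).$$

16.6 Fenchel's Duality Theorem redux

Returning to Fenchel's Duality Theorem, the first order necessary condition for a maximum of $g - f$ at $\bar x$ is that $0 \in \partial (g - f)(\bar x)$. When $\operatorname{ri}\operatorname{dom} f \cap \operatorname{ri}\operatorname{dom} g \ne \emptyset$, so that $\partial (g - f)(\bar x) = \partial g(\bar x) - \partial f(\bar x)$, we have $\partial f(\bar x) \cap \partial g(\bar x) \ne \emptyset$. A generalization of this condition that is also sufficient is given in the next theorem.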
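16.6.1 Theorem Let $f$ be a closed proper convex function and $g$ be a closed proper concave function on $\mathbf{R}^n$. Assume $\operatorname{ri}\operatorname{dom} f \cap \operatorname{ri}\operatorname{dom} g \ne \emptyset$ and $\operatorname{ri}\operatorname{dom} f^* \cap \operatorname{ri}\operatorname{dom} g^* \ne \emptyset$. Then the following conditions are equivalent.
1. $\sup_x \; g(x) - f(x) = g(\bar x) - f(\bar x) = f^*(\bar p) - g^*(\bar p) = \inf_p \; f^*(p) - g^*(p)$.
2. $\bar p \in \partial g(\bar x)$ and $\bar x \in \partial f^*(\bar p)$.
3. $\bar p \in \partial f(\bar x)$ and $\bar x \in \partial g^*(\bar p)$.
4. $\bar p \in \partial g(\bar x) \cap \partial f(\bar x)$ and $\bar x \in \partial f^*(\bar p) \cap \partial g^*(\bar p)$.

Proof: (1) $\implies$ (4): If
$$\sup_p \; g^*(p) - f^*(p) = g^*(\bar p) - f^*(\bar p) = f(\bar x) - g(\bar x) = \inf_x \; f(x) - g(x),$$
rearranging and using (3) gives
$$g^*(\bar p) + g(\bar x) = \bar p \cdot \bar x = f(\bar x) + f^*(\bar p).$$
Thus by Theorem 16.1.5,
$$\bar x \in \partial f^*(\bar p), \quad \bar x \in \partial g^*(\bar p), \quad \bar p \in \partial f(\bar x), \quad\text{and}\quad \bar p \in \partial g(\bar x).$$
(2) $\implies$ (1): From Theorem 16.1.5 we have that $\bar p \in \partial g(\bar x)$ implies $g(\bar x) + g^*(\bar p) = \bar p \cdot \bar x$, and $\bar x \in \partial f^*(\bar p)$ implies $f(\bar x) + f^*(\bar p) = \bar p \cdot \bar x$. Therefore
$$g(\bar x) + g^*(\bar p) = f(\bar x) + f^*(\bar p),$$
so
$$g(\bar x) - f(\bar x) = f^*(\bar p) - g^*(\bar p).$$
Moreover by (4) we have
$$\inf_p \; f^*(p) - g^*(p) \ge \sup_x \; g(x) - f(x).$$
Thus
$$g(\bar x) - f(\bar x) = f^*(\bar p) - g^*(\bar p) \ge \inf_p \; f^*(p) - g^*(p) \ge \sup_x \; g(x) - f(x) \ge g(\bar x) - f(\bar x),$$
so $g(\bar x) - f(\bar x) = \sup_x \; g(x) - f(x)$. Similarly, $f^*(\bar p) - g^*(\bar p) = \inf_p \; f^*(p) - g^*(p)$.
The implication (3) $\implies$ (1) is similar, and (4) $\implies$ (3) and (4) $\implies$ (2) are trivial.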
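A small numerical illustration of the concave version of Fenchel's Duality Theorem may help fix ideas. The functions below, $f(x) = x^2/2$ and $g(x) = -(x-2)^2/2$ on $\mathbf{R}$, are hypothetical choices made for this sketch, not examples from the notes. Working by hand, $g - f$ is maximized at $\bar x = 1$ with value $-1$, while $f^*(p) - g^*(p) = p^2 - 2p$ is minimized at $\bar p = 1$ with the same value, and $\bar p = f'(\bar x) = g'(\bar x)$, as condition (4) of Theorem 16.6.1 requires. The code recovers these numbers by brute force on grids (also an assumption of the sketch).

```python
def f(x):
    """Convex function (hypothetical example)."""
    return x * x / 2

def g(x):
    """Concave function (hypothetical example)."""
    return -(x - 2) ** 2 / 2

x_grid = [-10 + 0.01 * k for k in range(2001)]    # covers [-10, 10]
p_grid = [-3 + 0.05 * k for k in range(121)]      # covers [-3, 3]

def f_conj(p):
    """Convex conjugate of f on the grid: sup_x p*x - f(x)."""
    return max(p * x - f(x) for x in x_grid)

def g_conj(p):
    """Concave conjugate of g on the grid: inf_x p*x - g(x)."""
    return min(p * x - g(x) for x in x_grid)

# Primal problem: sup_x g(x) - f(x), together with a maximizer.
primal_value, x_bar = max((g(x) - f(x), x) for x in x_grid)
# Dual problem: inf_p f*(p) - g*(p), together with a minimizer.
dual_value, p_bar = min((f_conj(p) - g_conj(p), p) for p in p_grid)

print(primal_value, x_bar)
print(dual_value, p_bar)
```

Both problems should report a value close to $-1$, with optimizers close to $1$; the agreement of the two values is the absence of a duality gap.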
16.6.2 Example (Fenchel Duality Theorem and Support Functions) The cost function $\mu_A$ of a nonempty closed convex set $A$ is given by
$$\mu_A(p) = \inf_{x \in A} p \cdot x.$$
The indicator function $\delta(x \mid A)$ satisfies $\delta(x \mid A) = 0$ for $x \in A$, and $\delta(x \mid A) = \infty$ for $x \notin A$. Let
$$f(x) = q \cdot x \quad\text{and}\quad g(x) = -\delta(x \mid A).$$
Then $f$ is a proper closed convex function and $g$ is a proper closed concave function, and
$$\mu_A(q) = \inf_x \; f(x) - g(x).$$
The dual problem is to find $\sup_p \; g^*(p) - f^*(p)$. Now $\operatorname{ri}\operatorname{dom} f = \mathbf{R}^n$ and $\operatorname{ri}\operatorname{dom} g = \operatorname{ri} A$. By Theorem 16.4.1, the supremum is attained, and it is easy to see that it is attained at $q$. Thus $0 \in \partial (f^* - g^*)(q)$.
Recall from Example 16.3.1 that $f^*(q) = 0$ and $f^*(p) = \infty$ for $p \ne q$. Thus $\operatorname{ri}\operatorname{dom} f^* = \{q\}$. Also the concave conjugate of the concave function $g$ satisfies
$$g^*(p) = \inf_x \; p \cdot x - g(x) = \inf_{x \in A} p \cdot x = \mu_A(p).$$
So $\operatorname{dom} g^* = \{ p : \mu_A(p) \text{ is finite} \}$.
In order to apply the remainder of Fenchel's Duality Theorem 16.4.1 or Theorem 16.6.1, we must have $q \in \operatorname{ri}\operatorname{dom} \mu_A$. Assume this for a moment. In that case, $\bar x$ achieves the infimum ($q \cdot \bar x = \mu_A(q)$) if and only if there exists $\bar p$ satisfying
$$\bar x \in \partial g^*(\bar p), \qquad \bar p \in \partial f(\bar x).$$
Needs work.
Now $\partial f(x) = \{q\}$ for any $x$, so $\bar p \in \partial f(\bar x)$ if and only if $\bar p = q$. So $\bar x$ minimizes $q \cdot x$ over $A$ if and only if $\bar x \in \partial \mu_A(q)$. This constitutes another proof of Theorem 15.2.1 for the case where $q \in \operatorname{ri}\operatorname{dom} \mu_A$.
Unfortunately the conditions under which $q \in \operatorname{ri}\operatorname{dom} \mu_A$ are not very simple to explain. See Rockafellar [6, Corollary 13.3.4, p. 117, and also p. 66].

16.7 The calculus of sub/superdifferentials

One of the reasons the differential (total derivative) of a function is such a useful tool is that there is a useful differential calculus, that is, a set of rules for manipulating symbolic representations of differentials. The important results are the symbolic representation of the linearity of the derivative and the chain rule. The calculus of subdifferentials is not as nice, but it retains linearity and a weak version of the chain rule that applies to the composition of a convex and a linear function.

The following result is immediate from the subgradient inequality.
16.7.1 Lemma Let $f$ be a convex (or concave) function. Then for any real $\lambda$,
$$\partial (\lambda f)(x) = \lambda \, \partial f(x).$$
(N.B. If $f$ is convex and $\lambda < 0$, then $\partial (\lambda f)(x)$ is the superdifferential of the concave function $\lambda f$.)
16.7.2 Theorem Let $f$ and $g$ be proper closed concave (or convex) functions on $\mathbf{R}^n$. If the point $x$ belongs to $\operatorname{ri}\operatorname{dom} f \cap \operatorname{ri}\operatorname{dom} g$, then
$$\partial (f + g)(x) = \partial f(x) + \partial g(x).$$

Proof: (cf. Aubin [1, Theorem 4.4, p. 52].) Note that $x$ also belongs to the relative interior of $\operatorname{dom}(f + g)$, so each of $f$, $g$, and $f + g$ is superdifferentiable at $x$. Moreover $f + g$ is a proper closed concave function, as it is easy to see that the sum of upper (or lower) semicontinuous functions is upper (or lower) semicontinuous.
It is easy to see that $\partial f(x) + \partial g(x) \subset \partial (f + g)(x)$: just add the supergradient inequalities. That is, if $p \in \partial f(x)$ and $q \in \partial g(x)$, for each $y$ we have
$$f(x) + p \cdot (y - x) \ge f(y) \quad\text{and}\quad g(x) + q \cdot (y - x) \ge g(y),$$
so
$$(f + g)(x) + (p + q) \cdot (y - x) \ge (f + g)(y).$$
That is, $p + q \in \partial (f + g)(x)$. (By the way, the assumption that $\operatorname{ri}\operatorname{dom} f \cap \operatorname{ri}\operatorname{dom} g \ne \emptyset$ is not needed for this part.)
For the reverse inclusion, let $p$ belong to the superdifferential $\partial (f + g)(x)$. Then by Theorem 16.1.5
$$(f + g)(x) + (f + g)^*(p) = p \cdot x,$$
but by Theorem 16.5.4 (this is where the assumption that $\operatorname{ri}\operatorname{dom} f \cap \operatorname{ri}\operatorname{dom} g \ne \emptyset$ is needed), there is a $q$ satisfying $(f + g)^*(p) = f^*(p - q) + g^*(q)$, so we have
$$f(x) + f^*(p - q) + g(x) + g^*(q) = p \cdot x.$$
Subtracting $q \cdot x$ from both sides and rearranging gives
$$f(x) + f^*(p - q) + g(x) + g^*(q) - q \cdot x = (p - q) \cdot x,$$
$$\bigl[ f(x) + f^*(p - q) - (p - q) \cdot x \bigr] + \bigl[ g(x) + g^*(q) - q \cdot x \bigr] = 0.$$
But by Fenchel's Inequality for concave functions, each of the two bracketed terms is nonpositive, so each must be zero. But then by Theorem 16.1.5, we have $p - q \in \partial f(x)$ and $q \in \partial g(x)$. Thus $p = (p - q) + q$ belongs to $\partial f(x) + \partial g(x)$.

16.7.3 Theorem Let $A$ be a linear transformation from $\mathbf{R}^n$ into $\mathbf{R}^m$ (an $m \times n$ matrix if you will), let $g$ be a proper convex function on $\mathbf{R}^m$, and let
$$f(x) = g(Ax).$$
Then
$$\partial f(x) \supset A^* \bigl( \partial g(Ax) \bigr).$$
If in addition the range of $A$ includes a point in $\operatorname{ri}\operatorname{dom} g$, then for all $x$,
$$\partial f(x) = A^* \bigl( \partial g(Ax) \bigr),$$
where $A^*$ is the adjoint (transpose) of $A$.
Proof: (cf. Rockafellar [6, Theorem 23.9, pp. 225–226]) ($\supset$): Let $p \in A^* \bigl( \partial g(Ax) \bigr)$, say $p = A^* q$, with $q \in \partial g(Ax)$. The subgradient inequality implies
$$g(Ax) + q \cdot (Ay - Ax) \le g(Ay) \qquad\text{for all } y \in \mathbf{R}^n.$$
Now recall the definition of the adjoint: If $A \colon \mathbf{R}^n \to \mathbf{R}^m$, then $A^* \colon \mathbf{R}^m \to \mathbf{R}^n$ satisfies $A^* y \cdot x = y \cdot Ax$ for all $x \in \mathbf{R}^n$ and all $y \in \mathbf{R}^m$. Thus we have
$$g(Ax) + p \cdot (y - x) \le g(Ay) \qquad\text{for all } y \in \mathbf{R}^n,$$
which is the subgradient inequality for $f = g \circ A$ at $x$.
($\subset$): Now assume that the range of $A$ includes a point in $\operatorname{ri}\operatorname{dom} g$, and that $p \in \partial f(x)$. Then by Proposition 14.1.1,
$$f^*(p) = p \cdot x - f(x) = \max_y \; p \cdot y - f(y) = \max_y \; p \cdot y - g(Ay).$$
************************************

References

[1] J.-P. Aubin. 1984. L'analyse non linéaire et ses motivations économiques. Paris: Masson.
[2] J. M. Borwein and A. S. Lewis. 1991. Duality relationships for entropy-like minimization problems. SIAM Journal on Control and Optimization 29(2):325–338. DOI: 10.1137/0329017
[3] W. Fenchel. 1953. Convex cones, sets, and functions. Lecture notes, Princeton University, Department of Mathematics. From notes taken by D. W. Blackett, Spring 1951.
[4] E. T. Jaynes. 1957. Information theory and statistical mechanics, I. Physical Review 106(4):620–630. DOI: 10.1103/PhysRev.106.620
[5] S. Kullback. 1968. Information theory and statistics, 2d. ed. New York: Dover Publications.
[6] R. T. Rockafellar. 1970. Convex analysis. Number 28 in Princeton Mathematical Series. Princeton: Princeton University Press.