
VARIATIONAL PROPERTIES OF MATRIX FUNCTIONS VIA THE GENERALIZED MATRIX-FRACTIONAL FUNCTION

JAMES V. BURKE, YUAN GAO, AND TIM HOHEISEL

Abstract. We show that many important convex matrix functions can be represented as the partial infimal projection of the generalized matrix-fractional (GMF) function and a relatively simple convex function. This representation provides conditions under which such functions are closed and proper, as well as formulas for the ready computation of both their conjugates and subdifferentials. Special attention is given to support and indicator functions. Particular instances yield all weighted Ky Fan norms and squared gauges on R^{n×m}, and as an example we show that all variational Gram functions are representable as squares of gauges. Other instances yield weighted sums of the Frobenius and nuclear norms. The scope of applications is large, and the range of variational properties and insight is fascinating and fundamental. An important byproduct of these representations is that they lay the foundation for a smoothing approach to many matrix functions on the interior of the domain of the GMF function, which opens the door to a range of unexplored optimization methods.

Key words. convex analysis, infimal projection, matrix-fractional function, support function, gauge function, subdifferential, Ky Fan norm, variational Gram function

AMS subject classifications. 68Q25, 68R10, 68U05

1. Introduction. The generalized matrix-fractional (GMF) function was introduced in [5], where it is shown to unify a number of tools and concepts for matrix optimization, including optimal value functions in quadratic programming, nuclear norm optimization, multi-task learning, and, of course, the matrix-fractional function. In the present paper we greatly expand the number of applications to include all Ky Fan norms, matrix gauge functionals, and variational Gram functions [4].
Our analysis includes descriptions of the variational properties of these functions, such as formulas for their convex conjugates and their subdifferentials. In what follows, we set E := R^{n×m} × S^n, where R^{n×m} and S^n are the linear spaces of real n×m matrices and (real) symmetric n×n matrices, respectively. Given (A, B) ∈ R^{p×n} × R^{p×m} with rge B ⊆ rge A, recall that the GMF function ϕ is defined as the support function of the graph of the matrix-valued mapping Y ↦ −½ Y Y^T over the manifold { Y ∈ R^{n×m} | AY = B }, i.e., ϕ : E → R̄ := R ∪ {+∞} is

(1.1)  ϕ(X, V) := sup { ⟨(Y, W), (X, V)⟩ | (Y, W) ∈ D(A, B) },

where

(1.2)  D(A, B) := { (Y, −½ Y Y^T) ∈ E | Y ∈ R^{n×m} : AY = B }.

A closed-form expression for ϕ is derived in [5], where it is also shown that ϕ is smooth on the (nonempty) interior of its domain. Our study focuses on functions p : R^{n×m} → R̄ representable as a partial infimal projection of the form

(1.3)  p(X) := inf_{V ∈ S^n} ϕ(X, V) + h(V).

Department of Mathematics, University of Washington, Seattle, WA 98195 (jvburke@uw.edu). Research supported in part by the National Science Foundation under grant number DMS. Department of Applied Mathematics, University of Washington, Seattle, WA 98195 (yuangao@uw.edu). McGill University, 805 Sherbrooke St West, Room 1114, Montréal, Québec, Canada H3A 0B9 (tim.hoheisel@mcgill.ca).

where h : S^n → R̄ is convex. Different functions h illuminate different variational properties of the matrix X. For example, when h := ⟨U, ·⟩ for U ∈ S^n_{++}, and when both A and B are zero, then p is a weighted nuclear norm where the weights depend on any square root of U (see Corollary 4.7). Among the consequences of the representation (1.3) are conditions under which p is closed and proper, as well as formulas for the ready computation of both p* and ∂p (Section 3). As an application of our general results, we give more detailed explorations in the cases where h is a support function (Section 4) or an indicator function (Section 5). We illustrate these results with specific instances. For example, we obtain all weighted squared gauges on R^{n×m}, cf. Corollary 5.9, as well as a complete characterization of variational Gram functions [4] and their conjugates. In addition, we show that all variational Gram functions are representable as squares of gauges, cf. Proposition 5.10. Other choices yield weighted sums of Frobenius and nuclear norms, see [5, Corollary 5.9]. The scope of applications is large, and the range of variational properties is fascinating and fundamental.

Beyond the variational results of this paper, there is a compelling but unexplored computational aspect of this representation. Hsieh and Olsen [3] show that (1.3) with h = ½ tr(·) yields a smoothing approach to optimization problems involving the nuclear norm. More generally, observe that many matrix optimization problems take the form

(P)  min_{X ∈ R^{n×m}}  f(X) + p(X),

where f, p : R^{n×m} → R̄ := R ∪ {+∞}. The function f is thought of as the primary objective and is often smooth or convex, while p is typically a structure-inducing convex function. Using the representation (1.3), the problem (P) can be written as

min_{(X,V) ∈ E}  f(X) + ϕ(X, V) + h(V).
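As a quick numerical illustration (our own sketch, not part of the paper's development): for A = B = 0, the GMF function reduces to the matrix-fractional function ϕ(X, V) = ½ tr(X^T V^{-1} X) on V ≻ 0, and the choice h = ½ tr(·) in (1.3) recovers the nuclear norm, with minimizer V̄ = (X X^T)^{1/2}. The random data below are our own assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 5
X = rng.standard_normal((n, m))      # full row rank a.s., so X X^T is invertible

# Candidate minimizer of V -> 1/2 tr(X^T V^{-1} X) + 1/2 tr(V) over V > 0:
# V = (X X^T)^{1/2}, computed via an eigendecomposition of X X^T.
w, Q = np.linalg.eigh(X @ X.T)
V = Q @ np.diag(np.sqrt(w)) @ Q.T

value = 0.5 * np.trace(X.T @ np.linalg.inv(V) @ X) + 0.5 * np.trace(V)
nuclear = np.linalg.svd(X, compute_uv=False).sum()   # sum of singular values
assert abs(value - nuclear) < 1e-8                    # p(X) equals the nuclear norm

# Any other feasible V > 0 can only give a larger objective value:
V2 = V + 0.1 * np.eye(n)
value2 = 0.5 * np.trace(X.T @ np.linalg.inv(V2) @ X) + 0.5 * np.trace(V2)
assert value2 >= nuclear - 1e-10
```

At the minimizer, tr(X^T V^{-1} X) = tr(V^{-1} X X^T) = tr(V), so both terms contribute half of the nuclear norm each.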
This reformulation allows one to exploit the smoothness of ϕ on the interior of its domain. For example, if both f and h are smooth, one can employ a damped Newton or path-following approach to solving (P). We emphasize that this is not the goal or intent of this paper; however, our results provide the basis for future investigations along a variety of such numerical and theoretical avenues.

The paper is organized as follows: In Section 2 we provide the tools from convex analysis and some basic properties of the GMF function. Section 3 contains the general theory for partial infimal projections of the form (1.3). In Section 4 we specify h in (1.3) to be the support function of some closed, convex set V ⊆ S^n. In Section 5 we choose h to be the indicator of such a set. In particular, this yields powerful results on variational Gram functions and Ky Fan norms, see Section 5. We close out with some final remarks in Section 6 and supplementary material in Section 7.

Notation: For a linear transformation L, we write rge L and ker L for its range and kernel, respectively. For A ∈ R^{p×n}, we abuse notation somewhat and write rge A and ker A for its range and kernel, respectively, when A is considered as a linear transformation between R^n and R^p. Again, for A ∈ R^{p×n}, we set

Ker_r A := { X ∈ R^{n×r} | AX = 0 } = { X ∈ R^{n×r} | rge X ⊆ ker A },
Rge_r A := { Y ∈ R^{p×r} | ∃ X ∈ R^{n×r} : Y = AX } = { Y ∈ R^{p×r} | rge Y ⊆ rge A },

and write Ker A or Rge A when the choice of r is clear from the context. Observe that Ker_1 A = ker A, Rge_1 A = rge A, and (Ker_r A)^⊥ = Rge_r A^T, where we equip any

matrix space with the (Frobenius) inner product ⟨X, Y⟩ := tr(X^T Y). The Moore–Penrose pseudoinverse of A, see e.g. [1], is denoted by A†. The set of all symmetric matrices of dimension n is given by S^n. The positive and negative semidefinite cones are denoted by S^n_+ and S^n_−, respectively. For two sets S, T in the same real linear space, their Minkowski sum is S + T := { s + t | s ∈ S, t ∈ T }. For I ⊆ R we also put I · S := { λ s | λ ∈ I, s ∈ S }.

2. Preliminaries.

2.1. Tools from convex analysis. Let (E, ⟨·,·⟩) be a finite-dimensional Euclidean space with induced norm ‖·‖ := √⟨·,·⟩; e.g., on matrix spaces we use the Frobenius norm induced by the trace operator. The closed ε-ball about a point x ∈ E is denoted by B_ε(x). Let S ⊆ E be nonempty. The (topological) closure and interior of S are denoted by cl S and int S, respectively. The (linear) span of S will be denoted by span S. The convex hull of S is the set of all convex combinations of elements of S and is denoted by conv S. Its closure (the closed convex hull) is c̄onv S := cl (conv S). The conical hull (also positive hull) of S is the set pos S := R_+ · S = { λ x | x ∈ S, λ ≥ 0 }. The convex conical hull of S is

cone S := { Σ_{i=1}^r λ_i x_i | r ∈ N, x_i ∈ S, λ_i ≥ 0 }.

It is easily seen that cone S = pos (conv S) = conv (pos S). The closure of the latter is c̄one S := cl (cone S). The affine hull of S is denoted by aff S. The relative interior of a convex set C ⊆ E is given by ri C = { x ∈ C | ∃ ε > 0 : B_ε(x) ∩ aff C ⊆ C }. The points x ∈ ri C are characterized through (see, e.g., [2, Section 6.2])

(2.1)  pos (C − x) = span (C − x).

The polar set of S is defined by S° := { v ∈ E | ⟨v, x⟩ ≤ 1 (x ∈ S) }. Moreover, we define the bipolar set of S by S°° := (S°)°, so that S°° = c̄onv (S ∪ {0}). If K ⊆ E is a cone (i.e. pos K ⊆ K), we have K° = { v ∈ E | ⟨v, x⟩ ≤ 0 (x ∈ K) }. If U ⊆ E is a subspace, U° is the orthogonal complement U^⊥. The horizon cone of S ⊆ E is the closed cone given by

S^∞ := { v ∈ E | ∃ {λ_k} ↓ 0, {x_k} ⊆ S : λ_k x_k → v }.

For a cone K ⊆ E we have K^∞ = cl K.
Moreover, for a convex set C ⊆ E, C^∞ coincides with the recession cone of the closure of C, i.e.

(2.2)  C^∞ = { v | x + tv ∈ cl C (t ≥ 0, x ∈ C) } = { y | C + y ⊆ cl C }.

For f : E → R̄, its domain and epigraph are given by dom f := { x ∈ E | f(x) < +∞ } and epi f := { (x, α) ∈ E × R | f(x) ≤ α }, respectively. We call f convex if its epigraph epi f is a convex set, and we call it closed (or lower semicontinuous) if epi f is closed. If f is proper, we call it positively homogeneous if epi f is a cone, and sublinear if epi f is a convex cone. In what follows we use the following abbreviations:

Γ(E) := { f : E → R ∪ {+∞} | f proper, convex }  and  Γ_0(E) := { f ∈ Γ(E) | f closed }.

The lower semicontinuous hull cl f and the horizon function f^∞ of f are defined through the relations cl (epi f) = epi (cl f) and epi f^∞ = (epi f)^∞, respectively. For f ∈ Γ_0(E) the horizon function f^∞ coincides with the recession function, see e.g. [5, p. 66], and all of the respective results apply. Note that the moniker asymptotic function is also used for the horizon function, see e.g. [1, 10]. The horizon cone of a function f is defined as hzn f := { x | f^∞(x) ≤ 0 }. By [5, Theorem 8.7], for f ∈ Γ_0(E), we have

hzn f = { x | f(x) ≤ µ }^∞  for every µ ∈ R such that { x | f(x) ≤ µ } ≠ ∅.

For a convex function f : E → R ∪ {+∞}, its subdifferential at a point x̄ ∈ dom f is given by

∂f(x̄) := { v ∈ E | f(x) ≥ f(x̄) + ⟨v, x − x̄⟩ (x ∈ E) }.

Recall that, for a convex function f, we always have ri (dom f) ⊆ dom ∂f ⊆ dom f, see e.g. [5, p. 227], where dom ∂f := { x ∈ E | ∂f(x) ≠ ∅ } is the domain of the subdifferential. For a function f : E → R̄, its (Fenchel) conjugate f* : E → R̄ is given by

f*(y) := sup_{x ∈ E} { ⟨x, y⟩ − f(x) }.

Note that f ∈ Γ_0(E) if and only if f = f** := (f*)*. The definition of the conjugate function yields the Fenchel–Young inequality

(2.3)  f(x) + f*(y) ≥ ⟨x, y⟩  (x, y ∈ E).

Given a nonempty set S ⊆ E, its indicator function δ_S : E → R ∪ {+∞} is given by

δ_S(x) := 0 if x ∈ S,  +∞ if x ∉ S.

The indicator of S is convex if and only if S is a convex set, in which case the normal cone of S at x̄ ∈ S is given by

N_S(x̄) := ∂δ_S(x̄) = { v ∈ E | ⟨v, x − x̄⟩ ≤ 0 (x ∈ S) }.
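The Fenchel–Young inequality (2.3) is easy to probe numerically. As a minimal sanity check (our own illustration; the test function is an assumption, not from the paper), take f(x) = ½‖x‖², which is its own conjugate, so equality in (2.3) holds exactly when y = x, i.e. y ∈ ∂f(x):

```python
import numpy as np

rng = np.random.default_rng(1)

# f(x) = 1/2 ||x||^2 satisfies f*(y) = sup_x <x,y> - 1/2||x||^2 = 1/2 ||y||^2 = f(y)
f = lambda x: 0.5 * np.dot(x, x)
f_star = f

# Fenchel-Young (2.3): f(x) + f*(y) >= <x, y>, with slack 1/2 ||x - y||^2 here
for _ in range(100):
    x, y = rng.standard_normal(4), rng.standard_normal(4)
    assert f(x) + f_star(y) >= np.dot(x, y) - 1e-12

# Equality at y = x, the unique subgradient of f at x
x = rng.standard_normal(4)
assert abs(f(x) + f_star(x) - np.dot(x, x)) < 1e-12
```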

The support function σ_S : E → R ∪ {+∞} and the gauge function γ_S : E → R ∪ {+∞} of a nonempty set S ⊆ E are given respectively by

σ_S(x) := sup_{v ∈ S} ⟨v, x⟩  and  γ_S(x) := inf { t ≥ 0 | x ∈ tS }.

Here we use the standard convention that inf ∅ = +∞. It is easy to see that

(2.4)  σ_S = σ_{c̄onv S}  and  γ_{conv S} = γ_{c̄onv S}.

Moreover, given two (nonempty) sets S, T ⊆ E and x ∈ E, it is easily seen that

(2.5)  σ_S + σ_T = σ_{S+T}.

Suppose C ⊆ E is closed and convex. Then its barrier cone is defined by bar C := dom σ_C. The closure of the barrier cone of C and the horizon cone are paired in polarity, i.e.

(2.6)  (bar C)° = C^∞  and  cl (bar C) = (C^∞)°.

For two functions f_1, f_2 : E → R̄, their infimal convolution f_1 □ f_2 is defined by

(f_1 □ f_2)(x) := inf_{y ∈ E} { f_1(x − y) + f_2(y) }  (x ∈ E).

2.2. The generalized matrix-fractional function. As noted in the introduction, the GMF function is the support function of D(A, B) given in (1.2). Hence, we write

(2.7)  σ_D(A,B)(X, V) = ϕ(X, V)

and also refer to σ_D(A,B) as the GMF function. From [5, 6], we obtain the formula

(2.8)  ϕ(X, V) = ½ tr( (X; B)^T M(V)† (X; B) )  if rge (X; B) ⊆ rge M(V) and V ∈ K_A,  and ϕ(X, V) = +∞ otherwise,

where (X; B) ∈ R^{(n+p)×m} denotes X stacked on top of B, (A, B) ∈ R^{p×n} × R^{p×m} with rge B ⊆ rge A, K_A is the cone of all symmetric matrices that are positive semidefinite with respect to the subspace ker A, i.e.

(2.9)  K_A := { V ∈ S^n | u^T V u ≥ 0 (u ∈ ker A) },

and M(V)† is the Moore–Penrose pseudoinverse of the bordered matrix

(2.10)  M(V) := ( V  A^T ; A  0 ).

The matrix-fractional function [4, 9] is obtained by setting the matrices A and B to zero. A detailed analysis of the GMF function appears in the papers [5, 6]. In particular, it is shown that

(2.11)  dom σ_D(A,B) = dom ∂σ_D(A,B) = { (X, V) ∈ R^{n×m} × S^n | rge (X; B) ⊆ rge M(V), V ∈ K_A }.
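The closed form (2.8) can be checked numerically. In the easy regime where rank A = p and V ≻ 0, the bordered matrix M(V) is invertible, ½ tr((X; B)^T M(V)^{-1} (X; B)) dominates ⟨X, Y⟩ − ½⟨V, Y Y^T⟩ over all feasible Y (i.e., AY = B), and the supremum is attained at the solution of the KKT system M(V)(Y; Λ) = (X; B). The following numpy sketch is our own construction on random data, not code from the paper:

```python
import numpy as np

rng = np.random.default_rng(2)
p, n, m = 2, 4, 3
A = rng.standard_normal((p, n))            # full row rank a.s.
B = A @ rng.standard_normal((n, m))        # guarantees rge B ⊆ rge A
X = rng.standard_normal((n, m))
G = rng.standard_normal((n, n))
V = G @ G.T + np.eye(n)                    # V > 0, hence V ∈ K_A

# Bordered matrix M(V) from (2.10); invertible since rank A = p and V > 0
M = np.block([[V, A.T], [A, np.zeros((p, p))]])
XB = np.vstack([X, B])
phi = 0.5 * np.trace(XB.T @ np.linalg.pinv(M) @ XB)   # formula (2.8)

# ϕ is the support function of D(A,B): for every Y with AY = B,
# <X, Y> - 1/2 <V, Y Y^T> <= ϕ(X, V)
Apinv = np.linalg.pinv(A)
for _ in range(200):
    Z = rng.standard_normal((n, m))
    Y = Apinv @ B + (np.eye(n) - Apinv @ A) @ Z       # feasible: AY = B
    val = np.trace(X.T @ Y) - 0.5 * np.trace(V @ (Y @ Y.T))
    assert val <= phi + 1e-8

# Equality at the KKT solution M(V) (Y; Lam) = (X; B)
sol = np.linalg.solve(M, XB)
Ystar = sol[:n]
val_star = np.trace(X.T @ Ystar) - 0.5 * np.trace(V @ (Ystar @ Ystar.T))
assert abs(val_star - phi) < 1e-8
```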

For the study of the convex-analytic properties of the support function σ_D(A,B), the computation of the closed convex hull of the (nonconvex) set D(A, B) has been critical. A representation of c̄onv D(A, B) relying mainly on Carathéodory's theorem was obtained in [5, Proposition 4.3]. A refined and more versatile expression was proven in [6], see below. The key object for this expression is the (closed, convex) cone K_A defined in (2.9), which reduces to S^n_+ for A = 0. We briefly summarize the geometric and topological properties of K_A useful to our study, which follow from [6, Proposition 1] (by setting S = ker A).

Proposition 2.1. For A ∈ R^{p×n}, let P ∈ R^{n×n} be the orthogonal projection onto ker A and let K_A be given by (2.9). Then the following hold:
a) K_A = { V ∈ S^n | PVP ⪰ 0 }.
b) K_A° = −cone { vv^T | v ∈ ker A } = { W ∈ S^n | W = PWP ⪯ 0 }.
c) int K_A = { V ∈ S^n | u^T V u > 0 (u ∈ ker A \ {0}) }.

The central result from [6] is the following characterization of c̄onv D(A, B).

Theorem 2.2 ([6, Theorem 2]). Let D(A, B) be given by (1.2). Then

c̄onv D(A, B) = Ω(A, B) := { (Y, W) ∈ E | AY = B and ½ Y Y^T + W ∈ K_A° }.

In particular, Theorem 2.2 in combination with (2.4) implies that σ_D(A,B) = σ_Ω(A,B), an identity which we will employ throughout.

3. Infimal projections of the generalized matrix-fractional function. We will now focus on infimal projections involving the GMF function. For these purposes, consider the function ψ : E → R̄ given by

(3.1)  ψ(X, V) := σ_Ω(A,B)(X, V) + h(V),

where h ∈ Γ(S^n) and Ω(A, B) is given by Theorem 2.2. Our primary objective is the infimal projection of the sum ψ from (3.1) in the variable V, i.e., we analyze the marginal function p : R^{n×m} → R̄ defined by

(3.2)  p(X) := inf_{V ∈ S^n} ψ(X, V).

We lead with an elementary observation.

Lemma 3.1 (Domain of p). Let p be defined by (3.2). Then the following hold:
a) p is convex.
b) dom p = { X ∈ R^{n×m} | ∃ V ∈ K_A ∩ dom h : rge (X; B) ⊆ rge M(V) }.
In particular, p is proper if and only if dom h ∩ K_A is nonempty. Moreover, if dom p ≠ ∅, then the following hold:
c) If B = 0 (e.g. if A = 0), then dom p is a subspace, hence relatively open.
d) If rank A = p (full row rank), then dom p = R^{n×m}, hence open.

Proof. a) The convexity follows from, e.g., [6, Proposition 2.22].
b) The formula for dom p follows from the definition of p and the representation of dom σ_Ω(A,B) in (2.11), which also gives the properness exactly when dom h ∩ K_A ≠ ∅.
c) If B = 0, note that X ∈ dom p if and only if span {X} ⊆ dom p. Since dom p is also convex, it is a subspace, see, e.g., [6, Proposition 3.8].
d) The bordered matrix M(V) from (2.10) is invertible if (and only if) rank A = p. In this case the condition rge (X; B) ⊆ rge M(V) is trivially satisfied for any X ∈ R^{n×m} and B ∈ R^{p×m}. Therefore the statement follows from b).
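Returning for a moment to the cone K_A: Proposition 2.1 a) is easy to probe numerically. With P = I − A†A the orthogonal projector onto ker A, membership V ∈ K_A per definition (2.9) (tested on an orthonormal basis Z of ker A) coincides with positive semidefiniteness of PVP. The sketch below, on our own random data, is an illustration only:

```python
import numpy as np

rng = np.random.default_rng(3)
p, n = 2, 5
A = rng.standard_normal((p, n))
P = np.eye(n) - np.linalg.pinv(A) @ A      # orthogonal projector onto ker A
Z = np.linalg.svd(A)[2][p:].T              # orthonormal basis of ker A (n x (n-p))

def in_KA_def(V, tol=1e-10):
    # definition (2.9): u^T V u >= 0 for all u in ker A, i.e. Z^T V Z psd
    return np.linalg.eigvalsh(Z.T @ V @ Z).min() >= -tol

def in_KA_proj(V, tol=1e-10):
    # Proposition 2.1 a): V in K_A  iff  P V P psd
    return np.linalg.eigvalsh(P @ V @ P).min() >= -tol

for _ in range(100):
    G = rng.standard_normal((n, n))
    V = G + G.T                            # random symmetric test matrix
    assert in_KA_def(V) == in_KA_proj(V)

# An indefinite matrix can still lie in K_A: negative curvature only on (ker A)^⊥
c = Z @ rng.standard_normal(n - p)         # c in ker A
a = A.T @ rng.standard_normal(p)           # a in rge A^T = (ker A)^⊥
V = np.outer(c, c) - np.outer(a, a)
assert in_KA_def(V) and in_KA_proj(V)
```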

The following example shows that the domain of p may fail to be relatively open (hence fail to be a subspace) if B ≠ 0, which proves that this assumption in Lemma 3.1 c) is not redundant.

Example 3.2 (dom p). Let A = (1 1) and b = 1. Then ker A = span { (1, −1)^T } and K_A = { ( v w ; w u ) ∈ S² | v + u ≥ 2w }. Moreover, put V̄ := ( 2 1 ; 1 0 ) and define

𝒱 := [0,1] · V̄ = { ( 2w w ; w 0 ) | w ∈ [0,1] } ⊆ S².

Then 𝒱 is clearly convex and compact. Now let h ∈ Γ_0(S²) be any function with dom h ∩ K_A = 𝒱 (e.g. h := δ_𝒱). We hence infer that

x ∈ dom p ⇔ ∃ w ∈ [0,1] : (x; b) ∈ rge ( wV̄  A^T ; A  0 )
⇔ ∃ w ∈ [0,1], r ∈ R², s ∈ R : x = wV̄ r + A^T s, b = Ar.

Therefore, we find that

x ∈ dom p ⇔ ∃ w ∈ [0,1], λ, µ ∈ R : x = w ( 2 1 ; 1 0 ) [ (1; 0) + λ (1; −1) ] + µ (1; 1)
⇔ ∃ w ∈ [0,1], γ ∈ R : x = w (1; 0) + γ (1; 1).

Hence

dom p = [0,1] · (1; 0) + span { (1; 1) },

and consequently

ri (dom p) = (0,1) · (1; 0) + span { (1; 1) },

so that dom p is clearly not relatively open.

As mentioned above, the former example shows that dom p may fail to be a subspace if B ≠ 0. Lemma 3.1 d) and Example 3.17 a), on the other hand, illustrate that dom p might still be a subspace even if B ≠ 0; hence the condition B = 0 is only sufficient for dom p to be a subspace (if nonempty).

3.1. ψ, ψ*, and their subdifferentials. Our study of the infimal projection p given in (3.2) requires a thorough understanding of the properties of the functions ψ, ψ*, and their subdifferentials. For this we make extensive use of the condition

ri (dom h) ∩ int K_A ≠ ∅,

which we refer to as the conjugate constraint qualification (CCQ).

Lemma 3.3 (Conjugate of ψ). Let ψ be given as in (3.1) and define

η : (Y, W) ∈ E ↦ inf_{(Y,T) ∈ Ω(A,B)} h*(W − T).

Then

(3.3)  dom η = { (Y, W) | AY = B, ( −½ Y Y^T + K_A° ) ∩ ( W − dom h* ) ≠ ∅ }

and the following hold:

a) ψ* is closed and convex.
b) If dom h ∩ K_A ≠ ∅, then ψ, ψ* ∈ Γ_0(E) with ψ* = cl η.
c) Under CCQ, we have ψ* = η. Moreover, in this case, the infimum in the definition of η is attained on the whole domain, i.e.

(3.4)  T(Ȳ, W̄) := argmin_{(Y,W)} { h*(W̄ − W) | (Y, W) ∈ Ω(A, B), Y = Ȳ }
              = { W | (Ȳ, W) ∈ Ω(A, B), ψ*(Ȳ, W̄) = h*(W̄ − W) }

is nonempty for all (Ȳ, W̄) ∈ dom ψ*.
d) Under CCQ, dom ∂ψ* = { (Y, W) | T(Y, W) ≠ ∅ } and, for every (Y, W) ∈ dom ∂ψ*, we have

∂ψ*(Y, W) = { (X, V) | ∃ T ∈ S^n : V ∈ ∂h*(W − T) ∩ K_A, ⟨V, ½ Y Y^T + T⟩ = 0, rge (X − V Y) ⊆ (Ker A)^⊥ }.

Proof. Note that η(Y, W) < +∞ if and only if there is a T ∈ S^n such that (Y, T) ∈ Ω(A, B) and W − T ∈ dom h*, or equivalently, AY = B, T ∈ −½ Y Y^T + K_A° and T ∈ W − dom h*, which proves (3.3).

Define ĥ : E → R̄ by ĥ(X, V) := h(V). Then dom ĥ = R^{n×m} × dom h and ψ = σ_Ω(A,B) + ĥ.
a) The sum of two closed and convex functions is always closed and convex.
b) The sum of two proper functions is proper if and only if the domains of both functions intersect. Here, note that dom ĥ ∩ dom σ_D(A,B) ≠ ∅ ⇔ dom h ∩ K_A ≠ ∅. Therefore, ψ is proper if (and only if) the latter condition holds. Combined with a), this shows ψ is closed, proper, and convex, and hence so is its conjugate ψ*. Moreover, from Theorem 7.1 a) we infer

ψ*(Y, W) = cl ( δ_Ω(A,B) □ ĥ* )(Y, W).

Since ĥ*(Y, W) = δ_{{0}}(Y) + h*(W), we have

( δ_Ω(A,B) □ ĥ* )(Y, W) = inf_{(Y,T) ∈ Ω(A,B)} h*(W − T),

which proves ψ* = cl η.

c) We have ri (dom ĥ) = R^{n×m} × ri (dom h). Also, by [5, Theorem 4.1], we have int (dom σ_Ω(A,B)) = { (X, V) | V ∈ int K_A }. Hence

(3.5)  ri (dom ĥ) ∩ ri (dom σ_D(A,B)) ≠ ∅ ⇔ ri (dom h) ∩ int K_A ≠ ∅.

Theorem 7.1 a) (applied to σ_Ω(A,B) and ĥ), CCQ, and (3.5) then imply ψ* = η with

T(Ȳ, W̄) := argmin_{(Y,W)} { h*(W̄ − W) | (Y, W) ∈ Ω(A, B), Y = Ȳ }.

d) Observe that ∂(σ_D(A,B))* = N_Ω(A,B) and ∂ĥ*(0, W) = R^{n×m} × ∂h*(W). Then part c) and Theorem 7.1 d) (applied to σ_Ω(A,B) and ĥ) yield

∂ψ*(Y, W) = { (X, V) | (X, V) ∈ ∂(σ_D(A,B))*(Y_1, W_1) ∩ ∂ĥ*(Y_2, W_2), (Y, W) = (Y_1, W_1) + (Y_2, W_2) }
          = { (X, V) | ∃ T ∈ S^n : (X, V) ∈ N_Ω(A,B)(Y, T), V ∈ ∂h*(W − T) }.

The claim follows from the representation of N_Ω(A,B)(Y, T) in [6, Proposition 3].

We now turn our attention to the subdifferential of ψ, which will be used for computing the subdifferential of its infimal projection p.

Corollary 3.4 (Subdifferential of ψ). Let ψ be given by (3.1) and T(·,·) by (3.4). Then the following hold:
a) If (Ȳ, W̄) ∈ ∂σ_Ω(A,B)(X̄, V̄) + {0} × ∂h(V̄), then T(Ȳ, W̄) ≠ ∅ and

(3.6)  T(Ȳ, W̄) = { T ∈ S^n | W̄ − T ∈ ∂h(V̄), (Ȳ, T) ∈ ∂σ_Ω(A,B)(X̄, V̄) }.

b) Under CCQ we have

dom ∂ψ = { (X, V) | V ∈ dom ∂h ∩ K_A, rge (X; B) ⊆ rge M(V) }.

Moreover, for all (X̄, V̄) ∈ dom ∂ψ and all (Ȳ, W̄) ∈ ∂ψ(X̄, V̄), we have T(Ȳ, W̄) ≠ ∅ and

(3.7)  ∂ψ(X̄, V̄) = ∂σ_Ω(A,B)(X̄, V̄) + {0} × ∂h(V̄) = { (Ȳ, W̄) ∈ E | T(Ȳ, W̄) ≠ ∅ }.

Proof. Set f_1(X, V) := σ_Ω(A,B)(X, V) and f_2(X, V) := h(V). Then part a) follows from Theorem 7.1 b), and part b) follows from Theorem 7.1 c).

3.2. Infimal projection I. We are now in position to prove our first main result about the infimal projection p defined in (3.2).

Theorem 3.5 (Conjugate of p and properties under CCQ). Let p be given by (3.2). Moreover, let q : R^{n×m} → R̄ be given by

q : Y ↦ inf_{(Y,W) ∈ Ω(A,B)} h*(−W).

Then the following hold:
a) dom q = { Y ∈ R^{n×m} | AY = B, ( ½ Y Y^T − K_A° ) ∩ dom h* ≠ ∅ }.
b) p* = cl q, hence dom q ⊆ dom p*.
c) If CCQ holds for p, then we have:
I) p* = q, i.e.

(3.8)  p*(Y) = inf_{(Y,W) ∈ Ω(A,B)} h*(−W).

Moreover, for all Y ∈ dom p*, the infimum is a minimum, i.e., there exists W with (Y, W) ∈ Ω(A, B) such that p*(Y) = h*(−W). In particular, p* is closed, proper, and convex with dom p* = dom q.
II) p ∈ Γ_0(R^{n×m}) is finite-valued (hence locally Lipschitz).

Proof. a) Obvious.
b) The expression for p* (without CCQ) follows from [6, Theorem 11.23 c)] and Lemma 3.3 b). The domain containment is clear as p* = cl q ≤ q.
c.I) From [6, Theorem 11.23 c)] we have p* = ψ*(·, 0); hence Lemma 3.3 c) gives the claimed statements.
c.II) p is convex by Lemma 3.1 a), and it does not take the value −∞ as p* is proper by I). To prove the desired statement it therefore suffices to show that dom p = R^{n×m}. To this end, observe, see Lemma 3.1, that dom p = L(dom σ_Ω(A,B) ∩ (R^{n×m} × dom h)), where L : (X, V) ↦ X. By CCQ we have ri (dom h) ∩ int K_A ≠ ∅, hence

ri (dom σ_Ω(A,B) ∩ (R^{n×m} × dom h)) = int (dom σ_Ω(A,B)) ∩ (R^{n×m} × ri (dom h)) = R^{n×m} × (int K_A ∩ ri (dom h)),

where we use [5, Theorem 4.1] to represent int (dom σ_Ω(A,B)). This now gives

ri (dom p) = L [ ri (dom σ_Ω(A,B) ∩ (R^{n×m} × dom h)) ] = R^{n×m}.

We now take a broader perspective on infimal projection by embedding it into a perturbation duality framework in the sense of [6, Theorem 11.39] or [1, Chapter 5]. Given X̄ ∈ R^{n×m}, we define ψ_X̄ by

ψ_X̄(X, V) := ψ(X̄ + X, V)  ((X, V) ∈ E).

Moreover, define p_X̄ by

(3.9)  p_X̄(X) := inf_{V ∈ S^n} ψ_X̄(X, V)  (X ∈ R^{n×m}).

Then

ψ_X̄*(Y, W) = ψ*(Y, W) − ⟨X̄, Y⟩  ((Y, W) ∈ E),

see [6, Equation (3)]. Defining

(3.10)  q_X̄(W) := sup_Y { ⟨X̄, Y⟩ − ψ*(Y, W) }  (W ∈ S^n),

then q_X̄ is a proper (see Lemma 3.7 for its domain) and convex function, and we have a natural duality pairing of p_X̄ and q_X̄ with weak duality reading

p_X̄(0) ≥ q_X̄(0)  (X̄ ∈ R^{n×m}).

Applying the general perturbation duality to our scenario yields the following result.

Proposition 3.6 (Shifted duality for p). Let p be defined by (3.2), let X̄ ∈ dom p and q_X̄ be defined by (3.10). Then the following hold:
a) If 0 ∈ ri (dom q_X̄), then p(X̄) = q_X̄(0) ∈ R, argmin ψ(X̄, ·) ≠ ∅, and ∂q_X̄(0) ≠ ∅.
b) If X̄ ∈ ri (dom p), then p(X̄) = q_X̄(0) ∈ R, argmax_Y { ⟨X̄, Y⟩ − ψ*(Y, 0) } ≠ ∅, and ∂p(X̄) ≠ ∅.

c) Under either condition 0 ∈ ri (dom q_X̄) or X̄ ∈ ri (dom p), p is lsc at X̄ and q_X̄ is lsc at 0.
d) We have

p(X̄) = ψ(X̄, V̄) = ⟨X̄, Ȳ⟩ − ψ*(Ȳ, 0) = q_X̄(0)  ⇔  (Ȳ, 0) ∈ ∂ψ(X̄, V̄)  ⇔  (X̄, V̄) ∈ ∂ψ*(Ȳ, 0).

Proof. Let X̄ ∈ dom p and observe that p(X + X̄) = p_X̄(X) (X ∈ R^{n×m}); hence, in particular, p(X̄) = p_X̄(0) ∈ R. Applying [1, Theorem 5.1.1, Corollary 5.1.2] to the duality pair p_X̄ and q_X̄, and translating from p_X̄ at 0 to p at X̄, gives all the desired statements.

The domain of q_X̄ is given below. Here, the set

(3.11)  C(A, B) := { W ∈ S^n | ∃ Y : (Y, W) ∈ Ω(A, B) },

which will play a crucial role in what follows, occurs naturally.

Lemma 3.7 (Domain of q_X̄). Let X̄ ∈ R^{n×m} and q_X̄ be defined by (3.10). Then dom q_X̄ = C(A, B) + dom h*.

Proof. Using Lemma 3.3, observe that

−q_X̄(W) = inf_Y { ψ*(Y, W) − ⟨X̄, Y⟩ } = inf_Y { η(Y, W) − ⟨X̄, Y⟩ } = inf_{(Y,T) ∈ Ω(A,B)} { h*(W − T) − ⟨X̄, Y⟩ }.

Therefore, we have

dom q_X̄ = { W | ∃ (Y, T) ∈ Ω(A, B) : W − T ∈ dom h* } = C(A, B) + dom h*.

Before we proceed with our analysis, we discuss various constraint qualifications for the optimization problem defining p in the next section.

3.3. Constraint qualifications. We start our analysis with a result about the set C(A, B) from (3.11), which was used in Lemma 3.7 to represent the domain of q_X̄.

Lemma 3.8 (Properties of C(A, B)). Let C(A, B) be as in (3.11). Then we have:
a) C(A, B) is closed and convex with C(A, B)^∞ = K_A°.
b) C(A, B) = dom (σ_Ω(A,B)(X̄, ·))* for all X̄ such that σ_Ω(A,B)(X̄, ·) is proper.
c) We have

ri C(A, B) = { W | ∃ Y : AY = B, ½ Y Y^T + W ∈ ri (K_A°) } = ri ( dom (σ_Ω(A,B)(X̄, ·))* )

for all X̄ such that σ_Ω(A,B)(X̄, ·) is proper.

Proof. a) With the linear map T : (Y, W) ↦ W, we have C(A, B) = T(Ω(A, B)). Therefore C(A, B) is convex. By [6, Proposition 10] we have Ω(A, B)^∞ = {0} × K_A°. Therefore, ker T ∩ Ω(A, B)^∞ = {0}. Hence [6, Theorem 3.10] gives the rest of a).
b) Apply Corollary 7.2 to ḡ := σ_Ω(A,B)(X̄, ·) to infer that

ḡ*(W) = inf_{Y : (Y,W) ∈ Ω(A,B)} ⟨X̄, Y⟩  (W ∈ S^n).

This proves the claim.
c) Observe that ri C(A, B) = ri T(Ω(A, B)) = T(ri Ω(A, B)) and use [6, Proposition 8] to get the first representation. The second one follows from b).

We now define the constraint qualifications central to our study. Note that CCQ was already defined earlier.

Definition 3.9 (Constraint qualifications). Let p be given by (3.2). We say that p satisfies
i) PCQ if 0 ∈ ri (dom h* + C(A, B));
ii) strong PCQ (SPCQ) if 0 ∈ int (dom h* + C(A, B));
iii) boundedness PCQ (BPCQ) if dom h ∩ K_A ≠ ∅ and (dom h)^∞ ∩ K_A = {0};
iv) CCQ if ri (dom h) ∩ int K_A ≠ ∅.

Note that PCQ stands for primal constraint qualification and CCQ for conjugate constraint qualification. The next results clarify the relations between the various constraint qualifications. We lead with characterizations of PCQ and BPCQ.

Lemma 3.10 (Characterizations of (B)PCQ). Let p be given by (3.2) and let

(3.12)  f_X̄ := ψ(X̄, ·)  (X̄ ∈ R^{n×m}).

Let X̄ ∈ dom p. Then the following hold:
a) The following are equivalent:
i) 0 ∈ ri (dom f_X̄*);
ii) PCQ holds for p;
iii) ∃ Y ∈ R^{n×m} : AY = B, ½ Y Y^T ∈ ri (K_A° + dom h*).
In addition, analogous characterizations of SPCQ hold by substituting the interior for the relative interior.
b) BPCQ holds for p if and only if dom h ∩ K_A is nonempty and bounded.

Proof. a) Defining g_X̄ := σ_Ω(A,B)(X̄, ·), we find that f_X̄* = cl (g_X̄* □ h*) and therefore

ri (dom f_X̄*) = ri (dom g_X̄* + dom h*) = ri (C(A, B) + dom h*),

see Lemma 3.8 c). This proves the first two equivalences. The third follows readily from the representation of ri Ω(A, B) from [6, Proposition 8].
b) Follows readily from [6, Theorem 3.5, Proposition 3.9].
We point out that, under PCQ, Lemma 3.10 shows that the objective functions ψ(X̄, ·) (X̄ ∈ dom p) occurring in the definition of p in (3.2) are weakly coercive when proper, see [1, Theorem 3.2.1]. The latter reference tells us that the infimum in (3.2) is attained under PCQ if finite, a fact that will be stated again (and derived alternatively) in Theorem 3.14. Under SPCQ, the objective functions ψ(X̄, ·) (X̄ ∈ dom p) are level-bounded (or coercive), in which case argmin ψ(X̄, ·) is nonempty and compact (and clearly convex). The next result shows the relations between the different notions of PCQ.

Lemma 3.11. Let p be given by (3.2). Then the following hold:
a) BPCQ ⇒ SPCQ ⇒ PCQ.
b) If int (dom h*) ≠ ∅ or int C(A, B) ≠ ∅, then PCQ and SPCQ are equivalent.

Proof. a) The first implication can be seen as follows: If BPCQ holds, then dom f_X̄ ⊆ dom h ∩ K_A is bounded (and nonempty exactly if X̄ ∈ dom p). Therefore f_X̄ is level-bounded for all X̄ ∈ dom p, i.e., 0 ∈ int (dom f_X̄*) (X̄ ∈ dom p), see e.g. [6, Theorem 11.8]. In view of Lemma 3.10 a), this implies that SPCQ holds. The second implication is trivial.
b) Obvious from the definitions.

We now provide characterizations for CCQ.

Lemma 3.12 (Characterizations of CCQ). Let p be given by (3.2). Then

i) dom h ∩ int K_A ≠ ∅  ⇔  ii) CCQ holds for p  ⇔  iii) (−K_A°) ∩ hzn h* = {0}.

Proof. The first equivalence is a direct consequence of the line segment principle (cf. [5, Theorem 6.1]): That ii) implies i) is obvious. For the converse direction, let y ∈ dom h ∩ int K_A and pick x ∈ ri (dom h). Then z_λ := λx + (1−λ)y ∈ ri (dom h) for all λ ∈ (0, 1]. Letting λ ↓ 0, we find that z_λ ∈ ri (dom h) ∩ int K_A for all λ ∈ (0, 1] sufficiently small, which proves that ri (dom h) ∩ int K_A ≠ ∅.

The second equivalence can be seen as follows: We apply [5, Corollary 16.2.2] (to f_1 := h and f_2 := δ_{K_A}). This result tells us that ri (dom h) ∩ int K_A ≠ ∅ if and only if there does not exist a matrix W ∈ S^n such that

(3.13)  (h*)^∞(W) + σ_{K_A}(−W) ≤ 0  and  (h*)^∞(−W) + σ_{K_A}(W) > 0.

Since σ_{K_A}(−W) = δ_{K_A°}(−W), the first of these conditions is equivalent to the condition W ∈ (−K_A°) ∩ hzn h*. In particular, we can infer that (−K_A°) ∩ hzn h* = {0} gives the inconsistency of (3.13) and thus establishes iii) ⇒ ii).

The second condition in (3.13) implies W ≠ 0. Moreover, in view of Proposition 2.1 b), every 0 ≠ W ∈ −K_A° lies in S^n_+ \ {0}, and hence W ∉ K_A°, so that σ_{K_A}(W) = +∞. Thus, every nonzero element of the set (−K_A°) ∩ hzn h* satisfies (3.13). Consequently, the nonexistence of a W satisfying (3.13) implies that (−K_A°) ∩ hzn h* = {0}, which altogether proves the result.
We note that for any proper, convex function f we always have hzn f ⊆ (dom f*)°, which, in view of Lemma 3.12, implies that the condition

(3.14)  (−K_A°) ∩ (dom h)° = {0}

is stronger than CCQ. However, we do not use it in our subsequent study. Moreover, since K_A = S^n if (and only if) A has full column rank, we have rank A = n ⇒ CCQ.

3.4. Infimal projection II. We return to our analysis of the infimal projection defining p in (3.2). The following result reveals that the two critical conditions 0 ∈ ri (dom q_X̄) and X̄ ∈ ri (dom p), respectively, that occurred in Proposition 3.6, embed nicely into the constraint qualifications studied in Section 3.3.

Corollary 3.13. Let p be defined by (3.2), let X̄ ∈ dom p and q_X̄ be defined by (3.10). Then the following hold:
a) PCQ holds for p if and only if 0 ∈ ri (dom q_X̄);
b) If CCQ holds, then X̄ ∈ ri (dom p).

Proof. a) Follows immediately from Lemma 3.7 and the definition of PCQ.
b) Under CCQ we have dom p = R^{n×m}, see Theorem 3.5; hence b) follows.

As a consequence of Corollary 3.13 and Proposition 3.6, we can add to the properties of p proven in Theorem 3.5.

Theorem 3.14 (Properties of p under PCQ). Let p be defined by (3.2) such that PCQ is satisfied, and let q_X̄ be given by (3.10). Then the following hold:
a) p ∈ Γ_0(R^{n×m});
b) argmin_V ψ(X̄, V) ≠ ∅ (X̄ ∈ dom p) (primal attainment);
c) p(X̄) = q_X̄(0) (X̄ ∈ dom p) (strong duality).

Proof. a) Under PCQ, by Corollary 3.13, we have 0 ∈ ri (dom q_X̄) for all X̄ ∈ dom p. Hence, by Proposition 3.6 c), p is lsc at every X̄ ∈ dom p. Since p is proper and convex, see Lemma 3.1, this shows that p ∈ Γ_0.
b), c) Follow readily from Corollary 3.13 and Proposition 3.6 a).

We note that Theorem 3.14 could have been proven entirely without using the shifted duality framework from Proposition 3.6, by the following approach: With the linear projection L : (X, V) ↦ X, which has been used implicitly throughout our study, it can be seen that p = Lψ is a linear image in the sense of [5, p. 38]. Then [5, Theorem 9.2] gives all statements of Theorem 3.14. This can be seen after realizing that the constraint qualification from the latter reference, which for p = Lψ reads

ψ^∞(0, V) > 0  or  ψ^∞(0, −V) ≤ 0  (V ∈ S^n),

as ker L = {0} × S^n, is exactly PCQ — which, however, also takes some effort. For the sake of uniformity, we have chosen to derive Theorem 3.14 from the shifted duality scheme, which will also be serviceable for our subsequent subdifferential analysis.

The next result follows readily from the foregoing analysis.

Corollary 3.15. Let p be given by (3.2). If PCQ and CCQ are satisfied for p, then the following hold:
a) p ∈ Γ_0(R^{n×m}) is finite-valued, and for all X̄ ∈ R^{n×m} there exists V̄ such that p(X̄) = ψ(X̄, V̄).
b) p* = q, and for all Ȳ ∈ dom p* there exists W̄ such that (Ȳ, W̄) ∈ Ω(A, B) and p*(Ȳ) = h*(−W̄).

Proof.
Follows from Theorem 3.5. The table below summarizes most of our findings so far. Here X̄ ∈ dom p and Ȳ ∈ dom p*.

Consequence \ Hypothesis                 |  —  | PCQ | SPCQ | BPCQ | CCQ | PCQ+CCQ
p ∈ Γ                                    |  ✓  |  ✓  |  ✓   |  ✓   |  ✓  |  ✓
p ∈ Γ_0                                  |     |  ✓  |  ✓   |  ✓   |  ✓  |  ✓
p(X̄) = q_X̄(0)                           |     |  ✓  |  ✓   |  ✓   |  ✓  |  ✓
argmin ψ(X̄, ·) ≠ ∅                      |     |  ✓  |  ✓   |  ✓   |     |  ✓
argmin ψ(X̄, ·) compact                  |     |     |  ✓   |  ✓   |     |
dom p = R^{n×m}                          |     |     |      |      |  ✓  |  ✓
p* = q                                   |     |     |      |      |  ✓  |  ✓
argmin_{(Ȳ,T) ∈ Ω(A,B)} h*(−T) ≠ ∅     |     |     |      |      |  ✓  |  ✓

In view of Proposition 3.6 b) and Corollary 3.13, one might be inclined to think that using CCQ instead of the pointwise condition X̄ ∈ ri (dom p) is excessively strong. However, computing the relative interior of dom p without CCQ is problematic, cf. the derivations in the proof of Theorem 3.5 c.II) under CCQ. Moreover, CCQ is exactly what is needed to establish desirable properties of p*, see Theorem 3.5 c.I). Hence, we do not consider constraint qualifications weaker than CCQ. We now turn our attention to subdifferentiation of p.

Proposition 3.16 (Subdifferential of p). Let p be given by (3.2). Then the following hold:
a) Under CCQ we have

(3.15)  ∂p(X̄) = argmax_Y { ⟨X̄, Y⟩ − inf_{(Y,T) ∈ Ω(A,B)} h*(−T) },

which is nonempty and compact.
b) Under PCQ, equation (3.15) holds and, for X̄ ∈ dom ∂p, we have

∂p(X̄) = { Ȳ | ∃ V̄ : (Ȳ, 0) ∈ ∂ψ(X̄, V̄) }
       = { Ȳ | ∃ V̄ : (X̄, V̄) ∈ ∂ψ*(Ȳ, 0) }
       = { Ȳ | ∃ V̄ : p(X̄) = ψ(X̄, V̄) = ⟨X̄, Ȳ⟩ − p*(Ȳ) }.

c) Under PCQ and CCQ, we have

∂p(X̄) = { Y | ∃ V̄, T : −T ∈ ∂h(V̄), (Y, T) ∈ ∂σ_Ω(A,B)(X̄, V̄) },

which is compact and nonempty.

Proof. a) Under CCQ, p is convex and finite-valued (hence closed and proper); therefore (3.15) follows from [5, Theorem 23.5] and the fact that the closure for p* can be dropped in the argmax problem. Moreover, we have dom p = R^{n×m}, which gives the remaining statements in a).
b) Under PCQ we also have p ∈ Γ_0; hence the same reasoning as in a) gives (3.15). We now prove the remainder: For the first identity, notice that (see e.g. [10, Chapter D, Corollary 4.5.3])

∂p(X̄) = { Y | (Y, 0) ∈ ∂ψ(X̄, V̄) }  (V̄ ∈ argmin_V ψ(X̄, V)),

the latter argmin set being nonempty due to what was argued above. The inclusion ⊆ is hence clear. For the reverse inclusion, invoke also [6, Example 10.2] to see that if (Y, 0) ∈ ∂ψ(X̄, V̄), then V̄ ∈ argmin_V ψ(X̄, V).

The second identity in b) is clear from [5, Theorem 23.5], as ψ ∈ Γ_0(E).

The third follows from Proposition 3.6 in combination with Corollary 3.13, recalling that ψ*(Ȳ, 0) = p*(Ȳ).
c) Apply Corollary 3.4 to the first representation in b). ∎

For X̄ ∈ rbd(dom p) the subdifferential ∂p(X̄) can be empty. Moreover, it is unbounded if X̄ ∉ int(dom p). The latter may even occur under BPCQ, i.e., when dom ψ(X̄, ·) is bounded, as the following example shows.

J. V. BURKE, Y. GAO, AND T. HOHEISEL

Example 3.7. Let A = (1 0) ∈ R^{1×2} and b = 0, so that

K_A = { [v w; w u] ∈ S² | u ≥ 0 }.

Defining h := δ_V for

V := { [v 0; 0 u] | u ≤ 0, v ∈ [0, 1] },

we hence find that

dom h ∩ K_A = { [v 0; 0 0] | v ∈ [0, 1] }  and  dom h ∩ int K_A = ∅,

so that CCQ is violated but BPCQ (hence (S)PCQ) holds. We find that

x ∈ dom p ⟺ ∃ V ∈ V ∩ K_A : (x; b) ∈ rge [V Aᵀ; A 0]
          ⟺ ∃ v ∈ [0, 1], u ∈ R², w ∈ R : x = [v 0; 0 0] u + (1; 0) w and 0 = u₁
          ⟺ x ∈ span{(1; 0)}.

Therefore we have dom p = span{(1; 0)}. In particular, dom p is a proper subspace of R², hence relatively open with empty interior. Therefore ∂p(x) is nonempty and unbounded for any x ∈ dom p.

4. h is a support function. We now study the case where h is a support function. Concretely, given a closed, convex set V ⊆ S^n, we consider the function p : R^{n×m} → R ∪ {+∞} given by

(4.1)  p(X) := inf_{V ∈ S^n} { σ_{Ω(A,B)}(X, V) + σ_V(V) }.

Recall that, by Hörmander's theorem, see e.g. [5, Corollary 3.2.1], this covers exactly the cases where h is positively homogeneous (and closed, proper, convex). We commence by analyzing the constraint qualifications from Section 3.3 in the case that h is a support function. Here, and for the remainder of this section, observe that the choice h = σ_V implies that dom h = bar V and dom h* = V.

Lemma 4.1 (Constraint qualifications for (4.1)). Let p be given by (4.1). Then the following hold:
a) (CCQ) The conditions

(4.2)  bar V ∩ int K_A ≠ ∅,
(4.3)  V^∞ ∩ (−K_A°) = {0},
(4.4)  cl(bar V) − K_A = S^n

are each equivalent to CCQ for p.
b) (PCQ) PCQ holds for p if and only if

(4.5)  pos(C(A, B) + V) = span(C(A, B) + V).

c) (BPCQ) The conditions

(4.6)  bar V ∩ K_A ≠ ∅ and cl(bar V) ∩ K_A = {0},
(4.7)  bar V ∩ K_A is bounded,
(4.8)  bar V ∩ K_A ≠ ∅ and V^∞ + K_A° = S^n

are each equivalent to BPCQ for p, hence imply (4.5).
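Example 3.7 above can be checked numerically. The sketch below is not from the paper; it takes the data A = (1 0), b = 0, and V = {diag(v, u) | v ∈ [0, 1], u ≤ 0} as reconstructed assumptions and scans V ∩ K_A for the range condition that characterizes dom p:

```python
import numpy as np

# Numerical sanity check for Example 3.7 (assumed data A = [1 0], b = 0,
# V = {diag(v, u) : v in [0,1], u <= 0}): a point x lies in dom p iff
# (x; b) is in rge M(V) for some V in the intersection of V and K_A,
# i.e. for some V = diag(v, 0) with v in [0, 1].

A = np.array([[1.0, 0.0]])
b = np.array([0.0])

def in_range(M, z, tol=1e-9):
    """Check z in rge(M) via the least-squares residual."""
    sol, *_ = np.linalg.lstsq(M, z, rcond=None)
    return np.linalg.norm(M @ sol - z) <= tol

def in_dom_p(x, num_v=51):
    """Scan V = diag(v, 0), v in [0,1], and test (x; b) in rge M(V)."""
    z = np.concatenate([x, b])
    for v in np.linspace(0.0, 1.0, num_v):
        V = np.diag([v, 0.0])
        M = np.block([[V, A.T], [A, np.zeros((1, 1))]])
        if in_range(M, z):
            return True
    return False

on_axis = in_dom_p(np.array([3.0, 0.0]))   # x in span{(1; 0)}
off_axis = in_dom_p(np.array([3.0, 1.0]))  # second coordinate nonzero
```

Consistent with the example, points on the first coordinate axis pass the range test and points off it fail, reflecting dom p = span{(1; 0)}.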

Proof. Observe that with h = σ_V we have dom h = bar V and hzn h* = V^∞.
a) (4.2) is condition i) in Lemma 3.2 for h = σ_V, while (4.3) is condition iii). Employing [3, Section 3.3, Exercise 6] we have

(4.3) ⟺ cl(bar V − K_A) = S^n.

This completes the proof of a).
b) This is just an application of (2.1).
c) Using (2.6), we see that (4.6) is exactly BPCQ (for h = σ_V), while the equivalence to (4.7) follows from Lemma 3.10 b). The equivalence of (4.8) to the former follows from the fact that

(4.6) ⟺ cl(V^∞ + K_A°) = S^n,

see [3, Section 3.3, Exercise 6], where the closure can be dropped by interpreting [5, Theorem 6.3] accordingly. ∎

By the additivity of support functions, see (2.5), we find that

(4.9)  p(X) = inf_{V ∈ S^n} σ_Σ(X, V)  (X ∈ R^{n×m}),

where

(4.10)  Σ := Σ(A, B, V) := Ω(A, B) + {0} × V ⊆ E.

This facilitates some of the analysis.

Proposition 4.2. Let p be given by (4.1). Then the following hold:
a) p ∈ Γ_0 (i.e. p = p**) under any of the conditions (4.2)-(4.4) or (4.5). In particular, this holds under any of the conditions (4.6)-(4.8). Under any of the conditions (4.2)-(4.4), p is also finite-valued.
b) p* = δ_{cl Σ}(·, 0), where the closure is superfluous (i.e. Σ is closed), in particular, under any of the conditions (4.2)-(4.4).

Proof. a) Follows respectively from Lemma 4.1, Theorem 3.5 c) and Theorem 3.4.
b) By [6, Exercise 3.2], Σ is closed if (−K_A°) ∩ V^∞ = {0}, i.e. under any of the conditions (4.2)-(4.4), see Lemma 4.1 a). The rest follows from [6, Proposition 1.23 (c)]. ∎

We are now interested in computing refined representations of the conjugate of p given by (4.1).

Corollary 4.3. Consider the function p from (4.1) with V ⊆ S^n nonempty, closed and convex. Under any of the conditions (4.2)-(4.4) we have

p* = δ_{Ξ(A,B)},

where

Ξ(A, B) := { Y | ∃ W ∈ V : (Y, W) ∈ Ω(A, B) } = { Y | AY = B, (1/2) Y Yᵀ ∈ K_A° + V }.

In particular, we have p** = σ_{Ξ(A,B)}, which is finite-valued.

Proof. By Theorem 3.5 c) and Lemma 4.1 we find that

p*(Y) = inf_{(Y,W) ∈ Ω(A,B)} δ_V(W) = { 0 if ∃ W ∈ V : (Y, W) ∈ Ω(A, B); +∞ else },

which shows that p* = δ_{Ξ(A,B)}. The fact about p** follows from Proposition 4.2 a). ∎

4.1. The case B = 0. We now consider the case when B = 0. Recall from [6, Theorem ] that this implies that σ_{Ω(A,0)} is a gauge function. Similarly, if 0 ∈ V, then σ_V is also a gauge; in fact, σ_V = γ_{V°}, cf. [6, Example .9].

This combination of assumptions has interesting consequences when the geometries of the sets V and K_A are compatible in the following sense.

Definition 4.4 (Cone-compatible gauges). Given a closed, convex cone K ⊆ E, we define an ordering ⪯_K on E by x ⪯_K y if and only if y − x ∈ K. A gauge γ on E is said to be compatible with this ordering if and only if γ(x) ≤ γ(y) whenever 0 ⪯_K x ⪯_K y.

The following lemma provides a characterization of cone-compatible gauges.

Lemma 4.5 (Cones and compatible gauges). Let 0 ∈ C ⊆ E be a closed, convex set, and let K ⊆ E be a closed, convex cone. Then γ_C is compatible with the ordering ⪯_K if and only if

(4.11)  K ∩ (y − K) ⊆ C  (∀ y ∈ K ∩ C).

Proof. Note that, for y ∈ K, we have K ∩ (y − K) = { x | 0 ⪯_K x ⪯_K y }. Suppose that γ_C is compatible with ⪯_K, and let y ∈ C ∩ K. If x ∈ K ∩ (y − K), then γ_C(x) ≤ γ_C(y) ≤ 1, and, consequently, K ∩ (y − K) ⊆ C. Next suppose (4.11) holds, and let x, y ∈ E be such that 0 ⪯_K x ⪯_K y. Then y ∈ K and x ∈ K ∩ (y − K). We need to show that γ_C(x) ≤ γ_C(y). If γ_C(y) = +∞, this is trivially the case, so we may as well assume that γ_C(y) =: t < +∞. If t > 0, then t⁻¹ y ∈ C ∩ K and t⁻¹ x ∈ K ∩ (t⁻¹ y − K) ⊆ C. Hence, γ_C(t⁻¹ x) ≤ 1 = γ_C(t⁻¹ y), and so, γ_C(x) ≤ γ_C(y) as desired. In turn, if t = 0, then t y ∈ K ∩ C for all t > 0, so that t x ∈ K ∩ (t y − K) ⊆ C for all t > 0, i.e., γ_C(x) = 0. ∎

Corollary 4.6 (Infimal projection with a gauge function). Let p be given by (4.1) where V is a nonempty, closed, convex subset of S^n. Suppose that B = 0.
Then the following hold:
a) Under any of the conditions (4.2)-(4.4) we have

(4.12)  p* = δ_{ { Y | AY = 0, ∃ W ∈ V : AW = 0, (1/2) Y Yᵀ ⪯ W } }.

b) If 0 ∈ V and γ_V is compatible with the ordering induced by −K_A°, then

(4.13)  p*(Y) = δ_{ { Y | AY = 0, γ_V((1/2) Y Yᵀ) ≤ 1 } }(Y) = δ_{(−K_A°) ∩ V}((1/2) Y Yᵀ).
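Definition 4.4 can be illustrated concretely. The following sketch (an assumed example, not taken from the paper) uses the cone K = S^n_+ (the case A = 0, so that the order ⪯_K is the Löwner order) and the gauge of C = {V | λ_max(V) ≤ 1}, namely γ_C(V) = max(λ_max(V), 0), which is compatible with that order since 0 ⪯ X ⪯ Y implies λ_max(X) ≤ λ_max(Y):

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed illustration of cone-compatible gauges (Definition 4.4):
# K = PSD cone, C = {V : lambda_max(V) <= 1}, so the gauge is
# gamma_C(V) = max(lambda_max(V), 0). Monotonicity in the Loewner order
# follows from u^T X u <= u^T Y u for all u whenever Y - X is PSD.

def gamma_C(V):
    return max(np.linalg.eigvalsh(V)[-1], 0.0)

def rand_psd(n):
    G = rng.standard_normal((n, n))
    return G @ G.T

n = 4
ok = True
for _ in range(200):
    X = rand_psd(n)
    Y = X + rand_psd(n)  # 0 <= X <= Y in the Loewner order
    ok = ok and (gamma_C(X) <= gamma_C(Y) + 1e-10)
```

Every sampled pair satisfies the monotonicity required in Definition 4.4, in line with Lemma 4.5, since here K ∩ (y − K) ⊆ C holds for every y ∈ K ∩ C.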

Proof. a) Follows readily from Corollary 4.3 by setting B = 0 and using the representation of K_A° in Proposition 2.1.
b) First observe that −K_A° = { W ∈ S^n_+ | rge W ⊆ ker A }, see Proposition 2.1 b); recall that rge Y = rge Y Yᵀ (Y ∈ R^{n×m}) and that V ∈ V if and only if γ_V(V) ≤ 1. Exploiting these facts, we see that

AY = 0, ∃ W ∈ V : AW = 0, (1/2) Y Yᵀ ⪯ W
⟺ AY = 0, ∃ W ∈ V : γ_V((1/2) Y Yᵀ) ≤ γ_V(W) ≤ 1
⟺ AY = 0, γ_V((1/2) Y Yᵀ) ≤ 1
⟺ AY = 0, (1/2) Y Yᵀ ∈ V
⟺ rge Y Yᵀ ⊆ ker A, (1/2) Y Yᵀ ∈ (−K_A°) ∩ V.

Therefore b) follows from a). ∎

Linear functionals are special instances of support functions. We hence obtain the following remarkable result as a consequence of our more general analysis above. Here ‖·‖_* denotes the nuclear norm².

Corollary 4.7 (h linear). Let p : R^{n×m} → R ∪ {+∞} be defined by

p(X) = inf_{V ∈ S^n} σ_{Ω(A,0)}(X, V) + ⟨Ū, V⟩

for some Ū ∈ S^n_+ ∩ Ker_n A, and set C(Ū) := { Y | (1/2) Y Yᵀ ⪯ Ū }. Then we have:
a) p* = δ_{C(Ū) ∩ Ker_n A} is closed, proper, convex.
b) p** = σ_{C(Ū) ∩ Ker_n A} = γ_{C(Ū)° + Rge_n Aᵀ} is sublinear, finite-valued, nonnegative and symmetric (i.e. a seminorm).
c) If Ū ≻ 0 with 2Ū = L Lᵀ (L ∈ R^{n×n}) and A = 0, then p** = σ_{C(Ū)} = ‖Lᵀ(·)‖_*, i.e. p** is a norm with C(Ū)° as its unit ball and γ_{C(Ū)} as its dual norm.
d) If Ū is positive definite, then C(Ū) and C(Ū)° are compact, convex, symmetric³ with 0 in their interior; thus pos C(Ū) = pos C(Ū)° = R^{n×m}.

Proof. a) Observe that h := ⟨Ū, ·⟩ = σ_{{Ū}}. Hence the machinery from above applies with V = {Ū}. As V is bounded, CCQ is trivially satisfied (cf. (4.2)-(4.4)), and the representation of p* follows from Corollary 4.6 a).
b) We have

p = p** = σ_{C(Ū) ∩ Ker_n A} = γ_{(C(Ū) ∩ Ker_n A)°} = γ_{cl(C(Ū)° + Rge_n Aᵀ)} = γ_{C(Ū)° + Rge_n Aᵀ}.

² For a matrix T, the nuclear norm ‖T‖_* is the sum of its singular values.
³ We say the set S ⊆ E is symmetric if S = −S.

As CCQ holds, the first identity is due to Proposition 4.2. The second uses a), and the third follows from [5, Theorem 4.5]. The sublinearity of p** is clear. The finite-valuedness follows from Proposition 4.2. Since 0 ∈ C(Ū), the nonnegativity follows as well, and the symmetry is due to the symmetry of C(Ū).
c) Consider first the case Ū = (1/2) I: by part a), we have p* = δ_{{Y | Y Yᵀ ⪯ I}}. Observe that

{ Y | Y Yᵀ ⪯ I } = { Y | ‖Y‖₂ ≤ 1 } =: B_Λ

is the closed unit ball of the spectral norm. Therefore, p** = σ_{B_Λ} = ‖·‖_*.
To prove the general case suppose that 2Ū = L Lᵀ. Then it is clear that C(Ū) = { Y | L⁻¹ Y ∈ C((1/2) I) }, and therefore

p**(X) = σ_{C(Ū)}(X) = sup_{Y : L⁻¹Y ∈ C((1/2) I)} ⟨Y, X⟩ = sup_{L⁻¹Y ∈ C((1/2) I)} ⟨L⁻¹ Y, Lᵀ X⟩ = σ_{C((1/2) I)}(Lᵀ X) = ‖Lᵀ X‖_*.

Here the first identity is due to part b) (with A = 0) and the last one follows from the special case considered above.
d) Follows from c) using [5, Theorem 5.2]. ∎

We point out that Corollary 4.7 generalizes the nuclear-norm smoothing result by Hsieh and Olsen [3, Lemma ] and complements [5, Theorem 5.7].

5. h is an indicator function. We now suppose that the function h in (3.1) is given by h := δ_V for some nonempty, closed, and convex set V ⊆ S^n, i.e., in this section, the infimal projection p : R^{n×m} → R ∪ {+∞} is given by

(5.1)  p(X) = inf_{V ∈ S^n} σ_{D(A,B)}(X, V) + δ_V(V).

We first want to discuss the constraint qualifications from Section 3.3 in this particular case. Here, and for the remainder of this section, observe that the choice h = δ_V implies that dom h = V and dom h* = bar V.

Lemma 5.1 (Constraint qualifications for (5.1)). Let p be given by (5.1). Then the following hold:
a) (CCQ) The conditions

(5.2)  V ∩ int K_A ≠ ∅,
(5.3)  cone V − K_A = S^n

are each equivalent to CCQ for p.
b) (PCQ) PCQ holds for p if and only if

(5.4)  pos C(A, B) + bar V = span(C(A, B) + bar V).
c) (BPCQ) The conditions

(5.5)  V ∩ K_A ≠ ∅ and V^∞ ∩ K_A = {0},
(5.6)  V ∩ K_A is nonempty and bounded,
(5.7)  V ∩ K_A ≠ ∅ and bar V + K_A° = S^n

are each equivalent to BPCQ for p, hence imply (5.4).

Proof. a) First observe that, with h = δ_V, condition i) in Lemma 3.2 is exactly (5.2). By the same lemma this is equivalent to hzn σ_V ∩ (−K_A°) = {0}. Moreover, as σ_V = σ_{cl conv V}, we have hzn σ_V = { W | σ_V(W) ≤ 0 } = (cone V)°. Invoking [3, Section 3.3, Exercise 6 (a)] implies that

hzn σ_V ∩ (−K_A°) = {0} ⟺ cl(cone V − K_A) = S^n,

where the closure in the latter statement can clearly be dropped, e.g. by interpreting [5, Theorem 6.3] accordingly.
b) Use (2.1) to infer that PCQ holds for p if and only if

pos C(A, B) + bar V = span(C(A, B) + bar V).

c) The equivalences of BPCQ, (5.5), and (5.6) are clear. Since V^∞ and cl(bar V) are paired in polarity, see (2.6), [3, Section 3.3, Exercise 6 (a)] implies that

V^∞ ∩ K_A = {0} ⟺ cl(bar V + K_A°) = S^n,

where the closure in the latter statement can be dropped as in a). This establishes all equivalences. ∎

The following result provides sufficient conditions for the occurrence of p = p** when p is given as in (5.1), i.e. in the case that h is an indicator function.

Corollary 5.2. Let p be given by (5.1). Then p ∈ Γ_0(R^{n×m}) (i.e. p = p**) under any of the conditions (5.2)-(5.7). Under the conditions (5.2)-(5.3) it is also finite-valued.

Proof. Follows from Lemma 5.1 and Theorem 3.5 c) and Theorem 3.4, respectively. ∎

We treat the case A = 0 and B = 0 separately as we will use it in Section 5.2.

Corollary 5.3. Let p be given as in (5.1) with A = 0 and B = 0 and such that V ∩ S^n_+ is nonempty. Then we have

PCQ ⟺ S^n_+ + bar V = S^n ⟺ BPCQ.

Moreover, p ∈ Γ_0(R^{n×m}), i.e. p = p**, under any of the following conditions:
i) V ∩ S^n_{++} ≠ ∅ (CCQ);
ii) V ∩ S^n_+ is bounded (or equivalently S^n_+ + bar V = S^n) ((B/S)PCQ).
Under condition i), p is also finite-valued.

Proof. For the first statement notice that C(0, 0) = S^n_+ = K_0 and invoke Lemma 5.1. The rest follows from Corollary 5.2 and Lemma 5.1. ∎
To compute the conjugate p, instead of using Theorem 3.5, a direct derivation relying on [5, Theorem 3.2] yields a powerful result.

Theorem 5.4 (Infimal projection with an indicator function). Let p be given by (5.1). Then its conjugate p* : R^{n×m} → R ∪ {+∞} is given by

p*(Y) = (1/2) σ_{V ∩ K_A}(Y Yᵀ) + δ_{ {Y | AY = B} }(Y).

In particular, for A = 0 and B = 0 we obtain

p*(Y) = (1/2) σ_{V ∩ S^n_+}(Y Yᵀ).

Proof. By (2.7), for Y ∈ R^{n×m} we have

p*(Y) = sup_X [ ⟨X, Y⟩ − inf_V { σ_{D(A,B)}(X, V) + δ_V(V) } ]
      = sup_V sup_X [ ⟨X, Y⟩ − σ_{D(A,B)}(X, V) − δ_V(V) ]
      = sup_{V ∈ V ∩ K_A} sup_{X : rge(X; B) ⊆ rge M(V)} tr( −(1/2) (X; B)ᵀ M(V)† (X; B) + Yᵀ X ).

Since rge(X; B) ⊆ rge M(V), we can make the substitution (X; B) = M(V)(U; W) to obtain

p*(Y) = sup_{V ∈ V ∩ K_A} sup_{U, W : AU = B} tr( −(1/2) (U; W)ᵀ M(V) (U; W) + Yᵀ (V U + Aᵀ W) )
      = sup_{V ∈ V ∩ K_A} Σ_{i=1}^m [ sup_{w_i} ⟨w_i, A y_i − b_i⟩ − inf_{u_i : A u_i = b_i} ( (1/2) u_iᵀ V u_i − ⟨V y_i, u_i⟩ ) ]
      = δ_{ {Z | AZ = B} }(Y) + sup_{V ∈ V ∩ K_A} Σ_{i=1}^m [ − inf_{A u_i = b_i} ( (1/2) u_iᵀ V u_i − ⟨V y_i, u_i⟩ ) ],

where the final equality follows since δ_{ {y | b_i = A y} }(y_i) = sup_{w_i} ⟨w_i, A y_i − b_i⟩ (i = 1, …, m). By hypothesis rge B ⊆ rge A, and so, by [5, Theorem 3.2],

−(1/2) (V y_i; b_i)ᵀ M(V)† (V y_i; b_i) = inf_{A u_i = b_i} ( (1/2) u_iᵀ V u_i − ⟨V y_i, u_i⟩ )  (i = 1, …, m).

Therefore, when AY = B, we have

p*(Y) = sup_{V ∈ V ∩ K_A} Σ_{i=1}^m (1/2) (V y_i; b_i)ᵀ M(V)† (V y_i; b_i)   (where A y_i = b_i, so (V y_i; b_i) = M(V)(y_i; 0))
      = sup_{V ∈ V ∩ K_A} (1/2) Σ_{i=1}^m ( M(V)(y_i; 0) )ᵀ M(V)† ( M(V)(y_i; 0) )
      = sup_{V ∈ V ∩ K_A} (1/2) Σ_{i=1}^m (y_i; 0)ᵀ M(V) (y_i; 0)
      = sup_{V ∈ V ∩ K_A} (1/2) Σ_{i=1}^m y_iᵀ V y_i
      = sup_{V ∈ V ∩ K_A} (1/2) tr(Yᵀ V Y),

which proves the general expression for p*. The case A = 0, B = 0 follows readily. ∎

We now study the subdifferential of p given by (5.1).

Corollary 5.5. Let p be given by (5.1). If V ∩ int K_A ≠ ∅ (CCQ), then

∂p(X̄) = argmax_Y { ⟨X̄, Y⟩ − inf_{(Y,T) ∈ Ω(A,B)} σ_V(−T) }

is nonempty and compact for all X̄ ∈ dom p. If, in addition, pos C(A, B) + bar V = span(C(A, B) + bar V) (PCQ), then

∂p(X̄) = { Ȳ | ∃ V̄, T : −T ∈ N_V(V̄), (Ȳ, T) ∈ ∂σ_{Ω(A,B)}(X̄, V̄) }

is nonempty and compact for all X̄ ∈ R^{n×m}.

Proof. Follows readily from Proposition 3.6 in combination with Lemma 5.1. ∎

5.2. B = 0 and 0 ∈ V. We now consider the important special case of p given by (5.1) where 0 ∈ V and B = 0. In this case p* turns out to be a squared gauge function, see Corollary 5.9. We start with a technical lemma.

Lemma 5.6. Let C, K ⊆ E be nonempty and convex with K a cone. Then

(C + K)° = C° ∩ K°.

If C + K is closed with 0 ∈ C, then (C° ∩ K°)° = C + K. In particular, the set C + K is closed if C and K are closed and K ∩ (−C^∞) = {0}.

Proof. Clearly, C° ∩ K° ⊆ (C + K)°. Conversely, if z ∈ (C + K)°, then ⟨z, x + t y⟩ ≤ 1 for all x ∈ C, y ∈ K, and t > 0. Dividing this inequality by t and letting t → ∞, we see that z ∈ K°. By letting t ↓ 0, we see that z ∈ C°. Now assume that C + K is closed with 0 ∈ C. Then C + K is closed and convex with 0 ∈ C + K. Hence, by [5, Theorem 4.5], C + K = (C + K)°° = (C° ∩ K°)°. The final statement of the lemma follows from [5, Corollary 9.1.1]. ∎

The first main result in this section is concerned with a representation of the conjugate p* under the standing assumptions.

Corollary 5.7 (The gauge case I). Let p be given by (5.1) with 0 ∈ V and B = 0, and let P be the orthogonal projection onto ker A.
Moreover, let

S := { W ∈ S^n | rge W ⊆ ker A } = { W ∈ S^n | W = P W P }.⁴

Then the following hold:

⁴ Here we consider S = S^n ∩ Ker_n A as a subset of the space S^n.
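The special case of Theorem 5.4 with A = 0, B = 0 can be illustrated numerically. The sketch below uses an assumed choice of V (the spectahedron V = {V ⪰ 0 | tr V ≤ 1}, not an example from the paper), for which V ∩ S^n_+ = V and p*(Y) = (1/2) σ_V(Y Yᵀ) = (1/2) λ_max(Y Yᵀ), since sup{⟨V, W⟩ | V ⪰ 0, tr V ≤ 1} = max(λ_max(W), 0) and Y Yᵀ is positive semidefinite:

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed example for Theorem 5.4 with A = 0, B = 0 and the spectahedron
# V = {V psd : tr V <= 1}: then p*(Y) = (1/2) sigma_V(Y Y^T)
# = (1/2) lambda_max(Y Y^T). We compare the closed form against random
# sampling of feasible V and against the top eigenvector, which attains
# the supremum.

n, m = 4, 3
Y = rng.standard_normal((n, m))
W = 0.5 * Y @ Y.T                 # (1/2) Y Y^T, psd

lam, Q = np.linalg.eigh(W)
closed_form = lam[-1]             # lambda_max of (1/2) Y Y^T

best = -np.inf
for _ in range(2000):
    G = rng.standard_normal((n, n))
    V = G @ G.T
    V /= np.trace(V)              # psd with trace exactly 1
    best = max(best, float(np.sum(V * W)))

q = Q[:, -1]                      # top eigenvector of W
attained = float(np.sum(np.outer(q, q) * W))
```

The sampled values stay below the closed form, and V = q qᵀ (a rank-one, unit-trace PSD matrix) attains it, matching the support-function formula for p*.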


More information

Math 350 Fall 2011 Notes about inner product spaces. In this notes we state and prove some important properties of inner product spaces.

Math 350 Fall 2011 Notes about inner product spaces. In this notes we state and prove some important properties of inner product spaces. Math 350 Fall 2011 Notes about inner product spaces In this notes we state and prove some important properties of inner product spaces. First, recall the dot product on R n : if x, y R n, say x = (x 1,...,

More information

GEORGIA INSTITUTE OF TECHNOLOGY H. MILTON STEWART SCHOOL OF INDUSTRIAL AND SYSTEMS ENGINEERING LECTURE NOTES OPTIMIZATION III

GEORGIA INSTITUTE OF TECHNOLOGY H. MILTON STEWART SCHOOL OF INDUSTRIAL AND SYSTEMS ENGINEERING LECTURE NOTES OPTIMIZATION III GEORGIA INSTITUTE OF TECHNOLOGY H. MILTON STEWART SCHOOL OF INDUSTRIAL AND SYSTEMS ENGINEERING LECTURE NOTES OPTIMIZATION III CONVEX ANALYSIS NONLINEAR PROGRAMMING THEORY NONLINEAR PROGRAMMING ALGORITHMS

More information

Locally convex spaces, the hyperplane separation theorem, and the Krein-Milman theorem

Locally convex spaces, the hyperplane separation theorem, and the Krein-Milman theorem 56 Chapter 7 Locally convex spaces, the hyperplane separation theorem, and the Krein-Milman theorem Recall that C(X) is not a normed linear space when X is not compact. On the other hand we could use semi

More information

Convex Analysis and Optimization Chapter 2 Solutions

Convex Analysis and Optimization Chapter 2 Solutions Convex Analysis and Optimization Chapter 2 Solutions Dimitri P. Bertsekas with Angelia Nedić and Asuman E. Ozdaglar Massachusetts Institute of Technology Athena Scientific, Belmont, Massachusetts http://www.athenasc.com

More information

Elements of Convex Optimization Theory

Elements of Convex Optimization Theory Elements of Convex Optimization Theory Costis Skiadas August 2015 This is a revised and extended version of Appendix A of Skiadas (2009), providing a self-contained overview of elements of convex optimization

More information

On the convexity of piecewise-defined functions

On the convexity of piecewise-defined functions On the convexity of piecewise-defined functions arxiv:1408.3771v1 [math.ca] 16 Aug 2014 Heinz H. Bauschke, Yves Lucet, and Hung M. Phan August 16, 2014 Abstract Functions that are piecewise defined are

More information

On Gap Functions for Equilibrium Problems via Fenchel Duality

On Gap Functions for Equilibrium Problems via Fenchel Duality On Gap Functions for Equilibrium Problems via Fenchel Duality Lkhamsuren Altangerel 1 Radu Ioan Boţ 2 Gert Wanka 3 Abstract: In this paper we deal with the construction of gap functions for equilibrium

More information

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Instructor: Moritz Hardt Email: hardt+ee227c@berkeley.edu Graduate Instructor: Max Simchowitz Email: msimchow+ee227c@berkeley.edu

More information

Shiqian Ma, MAT-258A: Numerical Optimization 1. Chapter 4. Subgradient

Shiqian Ma, MAT-258A: Numerical Optimization 1. Chapter 4. Subgradient Shiqian Ma, MAT-258A: Numerical Optimization 1 Chapter 4 Subgradient Shiqian Ma, MAT-258A: Numerical Optimization 2 4.1. Subgradients definition subgradient calculus duality and optimality conditions Shiqian

More information

Sequential Pareto Subdifferential Sum Rule And Sequential Effi ciency

Sequential Pareto Subdifferential Sum Rule And Sequential Effi ciency Applied Mathematics E-Notes, 16(2016), 133-143 c ISSN 1607-2510 Available free at mirror sites of http://www.math.nthu.edu.tw/ amen/ Sequential Pareto Subdifferential Sum Rule And Sequential Effi ciency

More information

A Geometric Framework for Nonconvex Optimization Duality using Augmented Lagrangian Functions

A Geometric Framework for Nonconvex Optimization Duality using Augmented Lagrangian Functions A Geometric Framework for Nonconvex Optimization Duality using Augmented Lagrangian Functions Angelia Nedić and Asuman Ozdaglar April 15, 2006 Abstract We provide a unifying geometric framework for the

More information

PARTIAL REGULARITY OF BRENIER SOLUTIONS OF THE MONGE-AMPÈRE EQUATION

PARTIAL REGULARITY OF BRENIER SOLUTIONS OF THE MONGE-AMPÈRE EQUATION PARTIAL REGULARITY OF BRENIER SOLUTIONS OF THE MONGE-AMPÈRE EQUATION ALESSIO FIGALLI AND YOUNG-HEON KIM Abstract. Given Ω, Λ R n two bounded open sets, and f and g two probability densities concentrated

More information

DUALITY, OPTIMALITY CONDITIONS AND PERTURBATION ANALYSIS

DUALITY, OPTIMALITY CONDITIONS AND PERTURBATION ANALYSIS 1 DUALITY, OPTIMALITY CONDITIONS AND PERTURBATION ANALYSIS Alexander Shapiro 1 School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0205, USA, E-mail: ashapiro@isye.gatech.edu

More information

On Sublinear Inequalities for Mixed Integer Conic Programs

On Sublinear Inequalities for Mixed Integer Conic Programs Noname manuscript No. (will be inserted by the editor) On Sublinear Inequalities for Mixed Integer Conic Programs Fatma Kılınç-Karzan Daniel E. Steffy Submitted: December 2014; Revised: July 2015 Abstract

More information

Introduction to Convex and Quasiconvex Analysis

Introduction to Convex and Quasiconvex Analysis Introduction to Convex and Quasiconvex Analysis J.B.G.Frenk Econometric Institute, Erasmus University, Rotterdam G.Kassay Faculty of Mathematics, Babes Bolyai University, Cluj August 27, 2001 Abstract

More information

Introduction to Real Analysis Alternative Chapter 1

Introduction to Real Analysis Alternative Chapter 1 Christopher Heil Introduction to Real Analysis Alternative Chapter 1 A Primer on Norms and Banach Spaces Last Updated: March 10, 2018 c 2018 by Christopher Heil Chapter 1 A Primer on Norms and Banach Spaces

More information

COMMON COMPLEMENTS OF TWO SUBSPACES OF A HILBERT SPACE

COMMON COMPLEMENTS OF TWO SUBSPACES OF A HILBERT SPACE COMMON COMPLEMENTS OF TWO SUBSPACES OF A HILBERT SPACE MICHAEL LAUZON AND SERGEI TREIL Abstract. In this paper we find a necessary and sufficient condition for two closed subspaces, X and Y, of a Hilbert

More information

Optimization for Machine Learning

Optimization for Machine Learning Optimization for Machine Learning (Problems; Algorithms - A) SUVRIT SRA Massachusetts Institute of Technology PKU Summer School on Data Science (July 2017) Course materials http://suvrit.de/teaching.html

More information

Semicontinuous functions and convexity

Semicontinuous functions and convexity Semicontinuous functions and convexity Jordan Bell jordan.bell@gmail.com Department of Mathematics, University of Toronto April 3, 2014 1 Lattices If (A, ) is a partially ordered set and S is a subset

More information

Chapter 2: Preliminaries and elements of convex analysis

Chapter 2: Preliminaries and elements of convex analysis Chapter 2: Preliminaries and elements of convex analysis Edoardo Amaldi DEIB Politecnico di Milano edoardo.amaldi@polimi.it Website: http://home.deib.polimi.it/amaldi/opt-14-15.shtml Academic year 2014-15

More information

In Progress: Summary of Notation and Basic Results Convex Analysis C&O 663, Fall 2009

In Progress: Summary of Notation and Basic Results Convex Analysis C&O 663, Fall 2009 In Progress: Summary of Notation and Basic Results Convex Analysis C&O 663, Fall 2009 Intructor: Henry Wolkowicz November 26, 2009 University of Waterloo Department of Combinatorics & Optimization Abstract

More information

SOME REMARKS ON THE SPACE OF DIFFERENCES OF SUBLINEAR FUNCTIONS

SOME REMARKS ON THE SPACE OF DIFFERENCES OF SUBLINEAR FUNCTIONS APPLICATIONES MATHEMATICAE 22,3 (1994), pp. 419 426 S. G. BARTELS and D. PALLASCHKE (Karlsruhe) SOME REMARKS ON THE SPACE OF DIFFERENCES OF SUBLINEAR FUNCTIONS Abstract. Two properties concerning the space

More information

Math 341: Convex Geometry. Xi Chen

Math 341: Convex Geometry. Xi Chen Math 341: Convex Geometry Xi Chen 479 Central Academic Building, University of Alberta, Edmonton, Alberta T6G 2G1, CANADA E-mail address: xichen@math.ualberta.ca CHAPTER 1 Basics 1. Euclidean Geometry

More information

Convex Optimization and an Introduction to Congestion Control. Lecture Notes. Fabian Wirth

Convex Optimization and an Introduction to Congestion Control. Lecture Notes. Fabian Wirth Convex Optimization and an Introduction to Congestion Control Lecture Notes Fabian Wirth August 29, 2012 ii Contents 1 Convex Sets and Convex Functions 3 1.1 Convex Sets....................................

More information

Strongly convex functions, Moreau envelopes and the generic nature of convex functions with strong minimizers

Strongly convex functions, Moreau envelopes and the generic nature of convex functions with strong minimizers University of Wollongong Research Online Faculty of Engineering and Information Sciences - Papers: Part B Faculty of Engineering and Information Sciences 206 Strongly convex functions, Moreau envelopes

More information

First-order optimality conditions for mathematical programs with second-order cone complementarity constraints

First-order optimality conditions for mathematical programs with second-order cone complementarity constraints First-order optimality conditions for mathematical programs with second-order cone complementarity constraints Jane J. Ye Jinchuan Zhou Abstract In this paper we consider a mathematical program with second-order

More information

Convex Analysis and Optimization Chapter 4 Solutions

Convex Analysis and Optimization Chapter 4 Solutions Convex Analysis and Optimization Chapter 4 Solutions Dimitri P. Bertsekas with Angelia Nedić and Asuman E. Ozdaglar Massachusetts Institute of Technology Athena Scientific, Belmont, Massachusetts http://www.athenasc.com

More information

Appendix B Convex analysis

Appendix B Convex analysis This version: 28/02/2014 Appendix B Convex analysis In this appendix we review a few basic notions of convexity and related notions that will be important for us at various times. B.1 The Hausdorff distance

More information

Extreme Abridgment of Boyd and Vandenberghe s Convex Optimization

Extreme Abridgment of Boyd and Vandenberghe s Convex Optimization Extreme Abridgment of Boyd and Vandenberghe s Convex Optimization Compiled by David Rosenberg Abstract Boyd and Vandenberghe s Convex Optimization book is very well-written and a pleasure to read. The

More information

Chapter 1 Preliminaries

Chapter 1 Preliminaries Chapter 1 Preliminaries 1.1 Conventions and Notations Throughout the book we use the following notations for standard sets of numbers: N the set {1, 2,...} of natural numbers Z the set of integers Q the

More information

Convex Functions and Optimization

Convex Functions and Optimization Chapter 5 Convex Functions and Optimization 5.1 Convex Functions Our next topic is that of convex functions. Again, we will concentrate on the context of a map f : R n R although the situation can be generalized

More information

STABLE AND TOTAL FENCHEL DUALITY FOR CONVEX OPTIMIZATION PROBLEMS IN LOCALLY CONVEX SPACES

STABLE AND TOTAL FENCHEL DUALITY FOR CONVEX OPTIMIZATION PROBLEMS IN LOCALLY CONVEX SPACES STABLE AND TOTAL FENCHEL DUALITY FOR CONVEX OPTIMIZATION PROBLEMS IN LOCALLY CONVEX SPACES CHONG LI, DONGHUI FANG, GENARO LÓPEZ, AND MARCO A. LÓPEZ Abstract. We consider the optimization problem (P A )

More information

DUALIZATION OF SUBGRADIENT CONDITIONS FOR OPTIMALITY

DUALIZATION OF SUBGRADIENT CONDITIONS FOR OPTIMALITY DUALIZATION OF SUBGRADIENT CONDITIONS FOR OPTIMALITY R. T. Rockafellar* Abstract. A basic relationship is derived between generalized subgradients of a given function, possibly nonsmooth and nonconvex,

More information

Contents Real Vector Spaces Linear Equations and Linear Inequalities Polyhedra Linear Programs and the Simplex Method Lagrangian Duality

Contents Real Vector Spaces Linear Equations and Linear Inequalities Polyhedra Linear Programs and the Simplex Method Lagrangian Duality Contents Introduction v Chapter 1. Real Vector Spaces 1 1.1. Linear and Affine Spaces 1 1.2. Maps and Matrices 4 1.3. Inner Products and Norms 7 1.4. Continuous and Differentiable Functions 11 Chapter

More information

Topological vectorspaces

Topological vectorspaces (July 25, 2011) Topological vectorspaces Paul Garrett garrett@math.umn.edu http://www.math.umn.edu/ garrett/ Natural non-fréchet spaces Topological vector spaces Quotients and linear maps More topological

More information

Subgradient. Acknowledgement: this slides is based on Prof. Lieven Vandenberghes lecture notes. definition. subgradient calculus

Subgradient. Acknowledgement: this slides is based on Prof. Lieven Vandenberghes lecture notes. definition. subgradient calculus 1/41 Subgradient Acknowledgement: this slides is based on Prof. Lieven Vandenberghes lecture notes definition subgradient calculus duality and optimality conditions directional derivative Basic inequality

More information

Largest dual ellipsoids inscribed in dual cones

Largest dual ellipsoids inscribed in dual cones Largest dual ellipsoids inscribed in dual cones M. J. Todd June 23, 2005 Abstract Suppose x and s lie in the interiors of a cone K and its dual K respectively. We seek dual ellipsoidal norms such that

More information

SOME STABILITY RESULTS FOR THE SEMI-AFFINE VARIATIONAL INEQUALITY PROBLEM. 1. Introduction

SOME STABILITY RESULTS FOR THE SEMI-AFFINE VARIATIONAL INEQUALITY PROBLEM. 1. Introduction ACTA MATHEMATICA VIETNAMICA 271 Volume 29, Number 3, 2004, pp. 271-280 SOME STABILITY RESULTS FOR THE SEMI-AFFINE VARIATIONAL INEQUALITY PROBLEM NGUYEN NANG TAM Abstract. This paper establishes two theorems

More information

NORMS ON SPACE OF MATRICES

NORMS ON SPACE OF MATRICES NORMS ON SPACE OF MATRICES. Operator Norms on Space of linear maps Let A be an n n real matrix and x 0 be a vector in R n. We would like to use the Picard iteration method to solve for the following system

More information

Some Properties of Convex Hulls of Integer Points Contained in General Convex Sets

Some Properties of Convex Hulls of Integer Points Contained in General Convex Sets Some Properties of Convex Hulls of Integer Points Contained in General Convex Sets Santanu S. Dey and Diego A. Morán R. H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute

More information

Mathematics 530. Practice Problems. n + 1 }

Mathematics 530. Practice Problems. n + 1 } Department of Mathematical Sciences University of Delaware Prof. T. Angell October 19, 2015 Mathematics 530 Practice Problems 1. Recall that an indifference relation on a partially ordered set is defined

More information

Spectral Theory, with an Introduction to Operator Means. William L. Green

Spectral Theory, with an Introduction to Operator Means. William L. Green Spectral Theory, with an Introduction to Operator Means William L. Green January 30, 2008 Contents Introduction............................... 1 Hilbert Space.............................. 4 Linear Maps

More information

Continuity of convex functions in normed spaces

Continuity of convex functions in normed spaces Continuity of convex functions in normed spaces In this chapter, we consider continuity properties of real-valued convex functions defined on open convex sets in normed spaces. Recall that every infinitedimensional

More information

Integral Jensen inequality

Integral Jensen inequality Integral Jensen inequality Let us consider a convex set R d, and a convex function f : (, + ]. For any x,..., x n and λ,..., λ n with n λ i =, we have () f( n λ ix i ) n λ if(x i ). For a R d, let δ a

More information

Set, functions and Euclidean space. Seungjin Han

Set, functions and Euclidean space. Seungjin Han Set, functions and Euclidean space Seungjin Han September, 2018 1 Some Basics LOGIC A is necessary for B : If B holds, then A holds. B A A B is the contraposition of B A. A is sufficient for B: If A holds,

More information

Lecture 9 Monotone VIs/CPs Properties of cones and some existence results. October 6, 2008

Lecture 9 Monotone VIs/CPs Properties of cones and some existence results. October 6, 2008 Lecture 9 Monotone VIs/CPs Properties of cones and some existence results October 6, 2008 Outline Properties of cones Existence results for monotone CPs/VIs Polyhedrality of solution sets Game theory:

More information

The proximal mapping

The proximal mapping The proximal mapping http://bicmr.pku.edu.cn/~wenzw/opt-2016-fall.html Acknowledgement: this slides is based on Prof. Lieven Vandenberghes lecture notes Outline 2/37 1 closed function 2 Conjugate function

More information