Membership testing for Bernoulli and tail-dependence matrices. Daniel Krause. Matthias Scherer. Jonas Schwinn. Ralf Werner

Daniel Krause
Chair of Mathematical Finance, Technische Universität München, Parkring 11, 85748 Garching-Hochbrück, Germany, email: daniel.krause@tum.de, phone: +49(0)89-289-17400

Matthias Scherer
Chair of Mathematical Finance, Technische Universität München, Parkring 11, 85748 Garching-Hochbrück, Germany, email: scherer@tum.de, phone: +49(0)89-289-17402

Jonas Schwinn
Institute of Mathematics Business Mathematics, Universität Augsburg, Universitätsstraße 14, 86159 Augsburg, Germany, email: jonas.schwinn@math.uni-augsburg.de, phone: +49(0)821-598-2160

Ralf Werner
Institute of Mathematics Business Mathematics, Universität Augsburg, Universitätsstraße 14, 86159 Augsburg, Germany, email: ralf.werner@math.uni-augsburg.de, phone: +49(0)821-598-5854

Version of: August 16, 2017

Keywords: Bernoulli-compatible matrix, tail-dependence matrix, column generation, binary quadratic programming

Abstract

Testing a given matrix for membership in the family of Bernoulli matrices is a long-standing problem; the many applications of Bernoulli vectors in computer science, finance, medicine, and operations research emphasize its practical relevance. A novel approach towards this problem was taken by [Fiebig et al., 2017] for low-dimensional settings $d \le 6$. For the first time, they exploit the close relationship between the Bernoulli polytope (also known as correlation polytope) and the well-studied cut polytope, which plays a central role in membership testing of Bernoulli matrices. Inspired by this approach, we use results from [Deza and Laurent, 1997, Embrechts et al., 2016, Fiebig et al., 2017] in a pre-phase of our algorithm to check necessary and sufficient conditions, before actually testing a matrix on Bernoulli compatibility. In our main approach, however, we build upon an early attempt by [Lee, 1993] based on the vertex representation of the correlation polytope and directly solve the corresponding linear program. To appropriately deal with the issue of exponentially many primal variables, we propose a specifically tailored column generation method. A straightforward, yet novel, analysis of the arising subproblem of determining the most violated dual constraint in the column generation process leads to an exact algorithm for membership testing. Although the membership problem is known to be NP-complete, we observe very promising performance up to dimension $d = 40$ on a broad variety of test problems.

1 Introduction and motivation

1.1 Motivation

Characterizing a correlation matrix in terms of its algebraic properties is classical content of an introductory course to multivariate statistics. The closely related question, however, namely testing if a matrix $B \in \mathbb{R}^{d \times d}$ is a Bernoulli matrix or a matrix of pairwise tail-dependence coefficients$^{1}$, is much harder.

Literature related to this problem is spread over different communities ranging from probability and operations research to applications in various disciplines, see for example [Macke et al., 2009] and other references listed in this paper. This results in an inconsistent notation and nomenclature that makes it challenging to keep track of all relevant results. Our original interest in the problem stems from a probabilistic treatment by [Embrechts et al., 2016], research inspired by an actuarial application, which ends with the statement: "Concerning future research, an interesting open question is how one can (theoretically or numerically) determine whether a given arbitrary nonnegative, square matrix is a tail-dependence or Bernoulli-compatible matrix. To the best of our knowledge there are no corresponding algorithms available." From a methodological point of view, however, much closer is the deep mathematical investigation of the geometry of the problem by [Fiebig et al., 2017], who succeed in characterizing low-dimensional cases $d \le 6$ in terms of an analysis of the geometry of the closely related cut polytope.

$^{1}$ It has been found in [Embrechts et al., 2016, Fiebig et al., 2017] that the membership problem for tail-dependence matrices can be reduced to the membership problem for Bernoulli matrices. Hence, we focus without loss of generality on this latter membership problem.

1.2 Review of existing literature

The abovementioned problem appears (explicitly or implicitly) in different communities. From a probabilistic point of view, the problem of working with multivariate Bernoulli vectors has, for instance, been treated in [Teugels, 1990, Qaqish, 2003, Preisser and Qaqish, 2014]. In [Qaqish, 2003] it is mentioned that "However, specifying or computing p becomes impractical for n greater than about 15.", which is linked to the exponentially increasing number of parameters in the problem, as also observed in [Teugels, 1990]. Correlation matrices of Bernoulli variables have also been studied by [Chaganty and Joe, 2006], one of their findings being that positive definiteness is a necessary but not sufficient criterion. Other authors have used specific families of copulas to create valid families of Bernoulli matrices, see for example [Emrich and Piedmonte, 1991] for a very prominent one, but this is of limited use for the membership problem we are dealing with.

Given a certain matrix $T \in \mathbb{R}^{d \times d}$, the question of constructing a valid random vector $X = (X_1, \dots, X_d)$ such that $T_{ij}$ is precisely the tail-dependence coefficient of $X_i$ and $X_j$ was investigated in [Embrechts et al., 2016] and linked to the corresponding Bernoulli problem, a result derived independently in [Fiebig et al., 2017]. The paper by [Fiebig et al., 2017] contains, among other interesting results, a mathematical description of the underlying polytopes in dimensions $d \le 6$. Further details on the corresponding polytopes and their characterization can, for example, be found in [Deza and Laurent, 1997]. Taking into account results by [Pitowsky, 1991] for the correlation polytope, we know that this problem is NP-complete and, hence, a simple solution to the problem cannot be expected.
1.3 Problem formulation

Research questions: For a given symmetric matrix $B \in \mathbb{R}^{d \times d}$, decide (numerically) if there exists a $d$-dimensional random vector $X$ on some probability space $(\Omega, \mathcal{A}, \mathbb{P})$ such that each component $X_i$, $i = 1, \dots, d$, is Bernoulli-distributed and such that

\[ B = \mathbb{E}_{\mathbb{P}}[X X^\top]. \tag{B} \]

If such a random vector exists, provide a method to draw random samples from this multivariate distribution.

In this case, $B$ is called Bernoulli-compatible (or Bernoulli matrix in short), otherwise $B$ is called Bernoulli-incompatible. As on each probability space $(\Omega, \mathcal{A}, \mathbb{P})$ it holds that

\[ \mathbb{E}_{\mathbb{P}}[X X^\top] = \sum_{p \in \{0,1\}^d} \mathbb{P}[X = p] \, p p^\top, \]

we obtain the following result.

Proposition 1.1 (Vertex representation of the Bernoulli polytope) A matrix $B \in \mathbb{R}^{d \times d}$ is Bernoulli-compatible if and only if $B \in \mathcal{B}_d$, where $\mathcal{B}_d$ denotes the Bernoulli polytope

\[ \mathcal{B}_d := \operatorname{conv}\{ p p^\top \mid p \in \{0,1\}^d \}. \]

A formal proof of this proposition can, for example, be found in [Embrechts et al., 2016], Theorem 2.2.

Remark 1.2 (Nomenclature in the OR community) The Bernoulli polytope is a well-known polytope in the operations research community, where it is known under the name correlation polytope, introduced by [Pitowsky, 1991]. Already in [Pitowsky, 1991], it is shown that membership testing for this polytope is NP-complete. In Section 2.2 we provide more details on this important aspect of the problem. As already mentioned, due to this NP-completeness, it cannot be expected that there is any method that solves problem (B) for large $d$ (i.e. $d > 40$) in reasonable time. However, due to the well-researched structure of the correlation polytope and its connections to binary quadratic programming, cf. [Pitowsky, 1991, Deza and Laurent, 1997], we can hope for methods that solve at least medium-size instances (i.e. $20 \le d \le 40$) within a few minutes of computation time.

1.4 Membership testing by linear programming

Following the main idea of [Lee, 1993], from Proposition 1.1 it immediately follows that testing $B \in \mathcal{B}_d$ is equivalent to solving the following optimization problem

\[ v_P(B) := \min_{a \in \Lambda_{2^d}} \Big\| \sum_{i=0}^{2^d-1} a_i B_i - B \Big\|_{\max}, \]

where $\|A\|_{\max}$ denotes the matrix max-norm of the matrix $A$, $\mathbf{1} = (1, \dots, 1)^\top$, $\Lambda_m := \{\lambda \in \mathbb{R}^m_+ \mid \lambda^\top \mathbf{1} = 1\}$, and $B_i = p(i) p(i)^\top$, $i = 0, \dots, 2^d - 1$. Here, $p(i)$ denotes the natural bijection$^{2}$ between all integers between $0$ and $2^d - 1$ and all $\{0,1\}$-vectors of dimension $d$. The interpretation of this problem is straightforward: find the convex combination of the vertices $B_i$ of the Bernoulli polytope which is as close as possible to the given matrix $B$ in matrix max-norm.

$^{2}$ More formally, the bijection $p : \{0, \dots, 2^d - 1\} \to \{0,1\}^d$ is given as the inverse of the bijection $i : p \mapsto \sum_{j=1}^{d} p_j 2^{j-1}$, which maps $\{0,1\}$-vectors bijectively onto the integers $\{0, \dots, 2^d - 1\}$.

This problem can easily be reformulated as a linear program:

\[ v_P(B) = \min_{a \in \Lambda_{2^d},\ \alpha \in \mathbb{R}} \alpha \quad \text{s.t.} \quad B \le \sum_{i=0}^{2^d-1} a_i B_i + \alpha E, \qquad \sum_{i=0}^{2^d-1} a_i B_i - \alpha E \le B, \tag{PB} \]

with $E = \mathbf{1}\mathbf{1}^\top$.

Remark 1.3 (Non-uniqueness of representation) It has to be noted that if such a representation exists, it is not necessarily unique. An example illustrating that different stochastic models for the random vector $(X_1, X_2, X_3)$ might imply the same Bernoulli matrix is the following: independent and Bernoulli(½) distributed $X_i$, $i = 1, 2, 3$, imply a Bernoulli matrix $B$ with ½ on the diagonal and ¼ on the off-diagonal. This can be constructed by: (i) the convex combination taking ¼ times the matrix $E$ and ¼ times each of the matrices $e_i e_i^\top$, $i = 1, 2, 3$, or (ii) taking 1/8 times each of the eight extremal matrices in the tri-variate case. Interestingly, representation (ii) used in a mixture model corresponds to independence.

Remark 1.4 (Answer to the second part of the research question) In case $B \in \mathcal{B}_d$, the well-known Theorem of Carathéodory yields that there is at least one representation of $B$ which needs at most $d(d+1)/2 + 1$ vertices. However, it is not (yet) clear how such a sparse representation can be efficiently determined. Practically more important: due to the fundamental theorem of linear programming (FTLP), if the Simplex method is used to solve (PB), it will always yield a solution with at most$^{3}$ $d(d+1)$ strictly positive $a_i$. From such a similarly sparse representation, a random number generator can be built in a straightforward manner. This already yields an answer to the second part of the research question.

$^{3}$ A close inspection of the structure of the constraints in (PB) actually yields the same number of non-zeros as the Theorem of Carathéodory.

Remark 1.5 (Different formulations for membership testing) As already noted in [Lee, 1993], the membership test (B) boils down to a test of the feasibility of a linear system of inequalities, or a convex hull problem, respectively. If the problem is formulated as an LP, the choice of the objective function is not determined. This allows one to consider a variety of objective functions, which might be more suitable for certain purposes than the distance-like approach followed here. Interestingly, Lee's insight to formulate the test problem as an LP has not attracted any follow-up research on an efficient solution of the arising LP.

For our numerical approach, we also need to consider the corresponding dual problem. So far, the dual problem has not received much attention among probabilists, with [Fiebig et al., 2017] constituting an exception. The dual problem to (PB) is given by:

\[ v_D(B) := \max_{Y, Z \in \mathcal{S}_d,\ Y, Z \ge 0,\ \gamma \in \mathbb{R}} \langle B, Z - Y \rangle + \gamma \quad \text{s.t.} \quad \langle B_i, Z - Y \rangle + \gamma \le 0, \ i = 0, \dots, 2^d - 1, \qquad \langle E, Z + Y \rangle = 1, \tag{DB} \]

where $\mathcal{S}_d$ denotes the Hilbert space of all real symmetric $d \times d$ matrices with scalar product $\langle G, H \rangle := \operatorname{trace}(G^\top H)$ and its corresponding Frobenius norm. The dual program (DB) can be interpreted as finding a separating hyperplane, represented by a matrix $G = Z - Y$, which separates the matrix $B$ from the Bernoulli polytope, if possible. More exactly, the following theorem holds:

Theorem 1.6 (Membership testing by linear programming) The following statements hold for any matrix $B \in \mathcal{S}_d$ with $B \in [0,1]^{d \times d}$:
1. Problems (PB) and (DB) are both feasible and it holds $0 \le v_D(B) \le v_P(B) \le 0.5$.
2. Problems (PB) and (DB) possess a primal optimal solution $(a^*, \alpha^*)$ and a dual optimal solution $(Y^*, Z^*, \gamma^*)$, respectively, and strong duality holds for (PB) and (DB), i.e.
\[ v_P(B) = \Big\| \sum_{i=0}^{2^d-1} a_i^* B_i - B \Big\|_{\max} = \langle B, Z^* - Y^* \rangle + \gamma^* = v_D(B). \]
3. $B \in \mathcal{B}_d \iff v_P(B) = v_D(B) = 0$.
4. $v_D(B) > 0 \implies$ the dual optimal solution $G^* = Z^* - Y^*$ strictly separates $B$ from the Bernoulli polytope.

The two conditions $B \in \mathcal{S}_d$ and $B \in [0,1]^{d \times d}$ are obviously necessary for any matrix $B$ to be Bernoulli-compatible, cf. Proposition 2.1; therefore, we can focus on such matrices only in the above theorem.

Remark 1.7 (Early termination criterion for (DB)) It is important to note that according to (4) any dual feasible point with strictly positive objective value, i.e. a hyperplane separating $B$ from the Bernoulli polytope, already gives a certificate that the matrix $B$ is not Bernoulli-compatible. This especially means that in this case the dual program does not need to be solved to optimality for testing purposes.

Proof (of Theorem 1.6)
1. The statement follows from the observation that $(\hat a, \hat\alpha)$ with $\hat a_i := 0$ for $i = 1, \dots, 2^d - 2$, $\hat a_i := 1/2$ for $i \in \{0, 2^d - 1\}$ and $\hat\alpha := 1/2$ is primally feasible (note that $B_0 = 0$ and $B_{2^d-1} = E$, so that $\sum_i \hat a_i B_i = E/2$), and that $(\hat Y, \hat Z, \hat\gamma)$ with $\hat Y := 1/(2d^2) E$, $\hat Z := \hat Y$ and $\hat\gamma := 0$ is dually feasible. The general inequality is due to the weak duality for linear programs.

2. The second statement is a direct consequence of (1) together with strong LP duality theory.
The remaining statements follow from the first two statements in a straightforward manner.

Remark 1.8 (Extension complexity of the correlation polytope) Let us point out that problem (DB) has $2^d$ many linear inequality constraints. One might wonder whether one is able to derive a formulation with significantly fewer constraints. Unfortunately, this is out of reach, as for instance demonstrated by [Kaibel and Weltge, 2015] in a nicely accessible way. There, it has been shown that there is no formulation as a linear program with fewer than $O(1.5^d)$ constraints.

1.5 Main contribution and outline of the rest of the paper

In the remainder of this exposition, we first collect a variety of properties of Bernoulli matrices (classical necessary and sufficient conditions as well as less known connections to binary quadratic programs and novel connections to $\{0,1\}$-completely positive matrices), before we scrutinize in detail the numerical solution of (PB) and (DB). As a side effect, we thereby obtain a novel necessary condition for a matrix to be a $\{0,1\}$-completely positive matrix, which can be numerically verified. In essence, the easy-to-verify necessary and sufficient conditions can naturally be exploited in a pre-phase to avoid high numerical effort, before an actual membership test is performed.

Taking advantage of the increased computational power since the early paper by [Lee, 1993], we directly tackle the corresponding LPs by a standard LP solver. This is successful for dimensions up to $d = 17$. Due to the expected memory issues for larger $d$, and to circumvent the curse of dimensionality, we then solve the primal-dual pair of LPs with a specifically designed column generation approach (or, taking the dual point of view, a cutting plane method). This already allows us to solve higher-dimensional problems ($d \le 30$); however, the computation times still suffer severely from the expensive search for the most violated dual constraint in the column generation process.

To significantly enhance the performance of standard column generation methods, we therefore analyse the subproblem of finding the most violated dual constraint. We demonstrate that this subproblem can be reduced to a standard binary quadratic problem (BQP). The replacement of the exhaustive search by the BQP then results in a significant improvement in terms of calculation time. Using a battery of test problems, we illustrate the efficiency of our algorithm by solving problems up to dimension $d = 30$ without numerical issues. By exploiting several heuristics (improved primal starting point, early termination criterion, and dual test on incompatibility) we can finally solve problems up to $d = 40$ within reasonable computation time.
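For small $d$, the direct solution of (PB) with a standard LP solver can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation used in this paper: it assumes SciPy's `linprog` with the HiGHS backend, it enumerates all $2^d$ vertices explicitly (so it is limited to small $d$), and the helper names `bernoulli_vertices` and `primal_value` are hypothetical. Since all matrices are symmetric, only the upper triangle is used.

```python
import numpy as np
from scipy.optimize import linprog


def bernoulli_vertices(d):
    """Columns are the vertices B_i = p(i) p(i)^T, restricted to the upper triangle."""
    iu = np.triu_indices(d)
    cols = []
    for i in range(2 ** d):
        # p(i): binary digits of i, least significant bit first (cf. footnote 2)
        p = np.array([(i >> j) & 1 for j in range(d)], dtype=float)
        cols.append(np.outer(p, p)[iu])
    return np.array(cols).T  # shape (d(d+1)/2, 2^d)


def primal_value(B):
    """Solve (PB): minimise alpha s.t. |sum_i a_i B_i - B| <= alpha entrywise."""
    B = np.asarray(B, dtype=float)
    d = B.shape[0]
    V = bernoulli_vertices(d)
    m, n = V.shape
    b = B[np.triu_indices(d)]
    c = np.zeros(n + 1)
    c[-1] = 1.0                                   # objective: alpha (last variable)
    ones = np.ones((m, 1))
    A_ub = np.vstack([np.hstack([V, -ones]),      #  V a - alpha <=  b
                      np.hstack([-V, -ones])])    # -V a - alpha <= -b
    b_ub = np.concatenate([b, -b])
    A_eq = np.hstack([np.ones((1, n)), np.zeros((1, 1))])  # weights sum to one
    bounds = [(0, None)] * n + [(None, None)]     # a >= 0, alpha free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=bounds, method="highs")
    return res.fun, res.x[:n]


# Matrix of three independent Bernoulli(1/2) variables (Remark 1.3)
B_indep = 0.25 * (np.eye(3) + np.ones((3, 3)))
value, weights = primal_value(B_indep)
print(value)  # numerically zero, since B_indep is Bernoulli-compatible
```

The returned weights are one (not necessarily unique) representation of $B$; a strictly positive optimal value indicates that the matrix is not Bernoulli-compatible.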

2 Bernoulli-compatible matrices

Although Theorem 1.6 is our basis for testing a matrix $B$ on Bernoulli compatibility, a lot of computation time can be saved if efficient preliminary tests are run. For this purpose, in this section we recall$^{4}$ known necessary and sufficient conditions for a given matrix $B$ to be Bernoulli-compatible. Special emphasis is put on criteria that can easily be verified numerically, although we also briefly touch upon further known properties for the reader's convenience. For some of these properties, we provide (sometimes novel) easily accessible proofs motivated from the stochastic representation. Further, we recall the less known equivalence of membership testing for the Bernoulli polytope to solving a binary quadratic program (BQP), which provides more insights into the complexity of testing Bernoulli compatibility. Finally, we provide a few novel connections to the class of $\{0,1\}$-completely positive matrices. The section concludes by putting Bernoulli matrices into context concerning the hierarchy of popular convex cones used in combinatorial optimization.

2.1 Necessary and sufficient conditions

In the following proposition, several easy-to-verify properties of Bernoulli-compatible matrices are summarized.

Proposition 2.1 (Necessary conditions) Let $B \in \mathbb{R}^{d \times d}$. Then each of the following conditions is necessary for $B$ to be Bernoulli-compatible.
1. $B \in \mathcal{S}_d$.
2. $B \in [0,1]^{d \times d}$.
3. $B$ is positive semidefinite.
4. $B$ satisfies the Fréchet-Hoeffding bounds, i.e. $\max(0, B_{ii} + B_{jj} - 1) \le B_{ij} \le \min(B_{ii}, B_{jj})$.

Proof See, for example, [Embrechts et al., 2016], Proposition 2.1 or [Fiebig et al., 2017].

Let us give some further necessary conditions for a matrix $B$ to be Bernoulli-compatible based on gap inequalities$^{5}$ for the cut polytope.

$^{4}$ We mainly follow [Embrechts et al., 2016], [Fiebig et al., 2017], and [Deza et al., 1993, Deza and Laurent, 1997] in our presentation.

$^{5}$ For a concise introduction to the topic of gap inequalities for the cut polytope, and especially hypermetric inequalities, let us refer to [Laurent and Poljak, 1996] and [Deza et al., 1993, Deza and Laurent, 1997].
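The four conditions of Proposition 2.1 lend themselves to a cheap pre-phase check. The following is a minimal sketch, assuming NumPy; the function name `necessary_conditions` and the tolerance `tol` are illustrative choices and not part of the paper.

```python
import numpy as np


def necessary_conditions(B, tol=1e-10):
    """Check the four necessary conditions of Proposition 2.1 for a square matrix B."""
    B = np.asarray(B, dtype=float)
    diag = np.diag(B)
    # Frechet-Hoeffding bounds: max(0, B_ii + B_jj - 1) <= B_ij <= min(B_ii, B_jj)
    lower = np.maximum(0.0, diag[:, None] + diag[None, :] - 1.0)
    upper = np.minimum(diag[:, None], diag[None, :])
    symmetric = bool(np.allclose(B, B.T, atol=tol))
    return {
        "symmetric": symmetric,
        "entries_in_unit_interval": bool(np.all(B >= -tol) and np.all(B <= 1 + tol)),
        # eigvalsh is only meaningful for symmetric input, hence the guard
        "positive_semidefinite": symmetric and bool(np.linalg.eigvalsh(B).min() >= -tol),
        "frechet_hoeffding": bool(np.all(B >= lower - tol) and np.all(B <= upper + tol)),
    }


# Passes all four checks although it is not Bernoulli-compatible
# (it violates a triangle inequality), so the conditions are not sufficient.
print(necessary_conditions(0.5 * np.eye(3)))
```

A failed check proves incompatibility; passing all checks proves nothing, so the LP-based test is still required in that case.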

Proposition 2.2 (Necessary gap inequalities) Let $B \in \mathbb{R}^{d \times d}$. Then each of the following conditions is necessary for $B$ to be Bernoulli-compatible.
1. Negative type inequalities: $\forall z \in \mathbb{Z}^d : z^\top B z \ge 0$.
2. (Unrooted) Triangle inequalities: $\forall i, j, k$:
\[ 0 \le B_{ij} - B_{jk} - B_{ik} + B_{kk} \le 1, \qquad 0 \le B_{ii} + B_{jj} + B_{kk} - B_{ij} - B_{ik} - B_{jk} \le 1. \]
3. Hypermetric inequalities: $\forall z \in \mathbb{Z}^d : z^\top B z \ge z^\top \operatorname{diag}(B)$.

Remark 2.3 (Application for membership testing) A few comments are in order:
• The negative type inequalities for a matrix $B$ are actually equivalent to the easy-to-verify positive semidefiniteness of the matrix $B$, and thus do not provide further information.
• The triangle inequalities constitute $O(d^3)$ many easy-to-verify linear constraints on Bernoulli-compatible matrices. However, they can be derived directly from the hypermetric inequalities. Nevertheless, it is advisable that the triangle inequalities are tested for in advance. Sometimes, the Fréchet-Hoeffding bounds are also referred to as rooted triangle inequalities.
• Finally, each single inequality out of the infinitely many hypermetric inequalities$^{6}$ is again easy-to-verify. These inequalities are not implied by the previous necessary conditions. Actually, as for example shown in [Deza and Laurent, 1997], Proposition 14.2.4, the hypermetric inequalities already follow from a finite subset of these; to date, however, the known size of this subset grows faster than exponentially with $d$. For a fixed positive definite matrix $B$, it is straightforward to show that only exponentially many inequalities have to be verified.
• Unfortunately, results in [Avis and Grishukhin, 1993, Deza and Laurent, 1997] indicate that testing all hypermetric inequalities and finding a violated one, if there is any, is a hard problem itself. For more details on the complexity, and on the complexity of related questions, let us refer to [Deza and Laurent, 1997], Section 28.3.
• Besides the most important inequalities which we have stated here, there are further inequalities known in the literature, see for example [Deza and Laurent, 1997] for an overview. Of course, valid inequalities can also be easily derived by choosing appropriate elements from the dual cone $\mathrm{WP}_d^+$ constructed in Section 2.4.

$^{6}$ To the best of our knowledge, [Fiebig et al., 2017] were the first (and so far only) to consider hypermetric inequalities in the context of Bernoulli-compatible matrices.
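Both the $O(d^3)$ triangle inequalities and a single hypermetric inequality for a given integer vector $z$ can be evaluated directly. The sketch below assumes NumPy; the function names and the example matrix are illustrative only, and indices are 0-based in the code.

```python
import itertools
import numpy as np


def violated_triangle_inequalities(B, tol=1e-10):
    """Return all index triples (i, j, k) violating one of the two
    triangle inequalities of Proposition 2.2 (0-based indices)."""
    B = np.asarray(B, dtype=float)
    d = B.shape[0]
    violations = []
    for i, j, k in itertools.product(range(d), repeat=3):
        first = B[i, j] - B[j, k] - B[i, k] + B[k, k]
        second = B[i, i] + B[j, j] + B[k, k] - B[i, j] - B[i, k] - B[j, k]
        ok = (-tol <= first <= 1 + tol) and (-tol <= second <= 1 + tol)
        if not ok:
            violations.append((i, j, k))
    return violations


def hypermetric_gap(B, z):
    """z^T B z - z^T diag(B); a negative value means the hypermetric
    inequality for this integer vector z is violated."""
    B = np.asarray(B, dtype=float)
    z = np.asarray(z, dtype=float)
    return z @ B @ z - z @ np.diag(B)


# Illustrative matrix: X_3 would have to overlap heavily with both X_1 and X_2,
# although X_1 and X_2 hardly overlap.
B = np.array([[0.5, 0.1, 0.4],
              [0.1, 0.5, 0.4],
              [0.4, 0.4, 0.5]])
print(violated_triangle_inequalities(B))
print(hypermetric_gap(B, [1, 1, -1]))
```

For this example the hypermetric gap for $z = (1, 1, -1)$ equals twice the first triangle expression for $(i, j, k) = (1, 2, 3)$, which illustrates how the two families are related.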

Proof (of Proposition 2.2)
1. Without loss of generality, $z$ can be chosen from $\mathbb{Q}^d$ instead of $\mathbb{Z}^d$; the remaining statement for $\mathbb{R}^d$ follows by a continuity argument. Thus, negative type inequalities are equivalent to the positive semidefiniteness of $B$.
2. All triangle inequalities follow immediately from an inspection of the eight possible outcomes for $(X_i, X_j, X_k)$. As shown in [Deza and Laurent, 1997], Sections 27 and 28, they are a direct consequence of the hypermetric inequalities.
3. Let us rewrite the hypermetric inequality as
\[ \forall z \in \mathbb{Z}^d : \langle B, z z^\top - \operatorname{diag}(z) \rangle \ge 0, \]
which is valid, as obviously $z z^\top - \operatorname{diag}(z) \in \mathrm{WP}_d^+$ (see Definition 2.12 and Proposition 2.14).

An alternative, more elementary argumentation for Assertion 3 is as follows: note that for each Bernoulli vector $X$ (or equivalently, each vertex of the Bernoulli polytope) it holds that

\[ z^\top X X^\top z - z^\top X = (z^\top X)^2 - z^\top X, \]

which is obviously non-negative as $z^\top X$ only takes integer values. The assertion then follows by taking the expectation (or equivalently, convex combinations).

For numerical purposes, as not all (of the at least exponentially many) hypermetric inequalities can be tested in advance, a few (small) $z \in \mathbb{Z}^d$ can be randomly selected and tested in advance. However, as we are interested in deterministic tests, we do not pursue this idea any further in this exposition and refrain from testing randomly chosen hypermetric inequalities.

Motivated by the corresponding result in [Berman and Xu, 2007], Theorem 2.1, we can also derive an upper bound on the $i$-th coefficient in the representation of a Bernoulli matrix.

Proposition 2.4 (Upper bounds on the probability of individual events) Let $B \in \mathbb{R}^{d \times d}$ be a non-singular Bernoulli-compatible matrix with representation $B = \sum_{i=0}^{2^d-1} a_i B_i$. Then it holds

\[ a_i \le \big( p(i)^\top B^{-1} p(i) \big)^{-1} \quad \text{and} \quad a_i \le \min_{(k,l) : (B_i)_{kl} = 1} B_{kl} \]

for all $i = 1, \dots, 2^d - 1$.

Proof Let $a_i > 0$, otherwise the two inequalities are trivially satisfied. Then, from the representation of $B$ we obtain that $B - a_i p(i) p(i)^\top$ is positive semidefinite. The first statement now follows from standard reformulations in matrix algebra based on the Schur complement. The second statement is obvious.
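The two bounds of Proposition 2.4 can be evaluated for every vertex of a small example. This is a sketch assuming NumPy and a non-singular input matrix; `weight_upper_bounds` is a hypothetical helper name, and the enumeration over all vertices is only meant for small $d$.

```python
import numpy as np


def vertex(i, d):
    """p(i): binary digits of i, least significant bit first."""
    return np.array([(i >> j) & 1 for j in range(d)], dtype=float)


def weight_upper_bounds(B):
    """For each i = 1, ..., 2^d - 1 return the pair
    (1 / (p(i)^T B^{-1} p(i)),  min of B_kl over the support of B_i)."""
    B = np.asarray(B, dtype=float)
    d = B.shape[0]
    B_inv = np.linalg.inv(B)          # assumes B is non-singular
    bounds = {}
    for i in range(1, 2 ** d):
        p = vertex(i, d)
        schur_bound = 1.0 / (p @ B_inv @ p)
        support = p > 0
        entry_bound = B[np.ix_(support, support)].min()
        bounds[i] = (schur_bound, entry_bound)
    return bounds


# Independent Bernoulli(1/2) example from Remark 1.3
B = 0.25 * (np.eye(3) + np.ones((3, 3)))
for i, (schur_bound, entry_bound) in weight_upper_bounds(B).items():
    print(i, vertex(i, 3), schur_bound, entry_bound)
```

In this example the Schur-complement bound is the sharper one for the vertices $e_i e_i^\top$, whereas the entry bound is sharper for the vertex $E$, so neither bound dominates the other in general.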

Thus, whenever a new variable is added within the column generation approach, we can immediately bound this variable by the bounds given above. Please note that it is a priori not clear which of the two bounds is the better one.

Starting from the dual LP (DB), let us now provide a novel, yet easy-to-verify, necessary condition for a matrix $B$ to be Bernoulli-compatible. This condition relies on the idea that instead of finding a general cut separating the Bernoulli polytope from $B$, we aim at finding a cut with a very simple (low-dimensional) structure, where the separation test can be easily computed. As our numerical experiments show, this simple test is surprisingly effective in finding Bernoulli-incompatible matrices in practical settings.

Proposition 2.5 (Necessary condition via dual approximation) Let $B \in \mathbb{R}^{d \times d}$ satisfy all necessary conditions from Proposition 2.1. If the optimal value of the LP

\[ \max_{\alpha_Y, \beta_Y, \alpha_Z, \beta_Z, \gamma \in \mathbb{R}} \ (\alpha_Z - \alpha_Y) \operatorname{trace}(B) + (\beta_Z - \beta_Y) \langle B, E \rangle + \gamma \]
\[ \text{s.t.} \quad \beta_Y \ge 0, \quad \beta_Z \ge 0, \quad \alpha_Y + \beta_Y \ge 0, \quad \alpha_Z + \beta_Z \ge 0, \]
\[ d(\alpha_Y + \beta_Y + \alpha_Z + \beta_Z) + (d^2 - d)(\beta_Y + \beta_Z) = 1, \]
\[ k(\alpha_Z - \alpha_Y) + k^2(\beta_Z - \beta_Y) + \gamma \le 0, \quad k = 0, \dots, d, \]

is strictly positive, then $B$ is not Bernoulli-compatible.

Proof Set $Y = \alpha_Y I + \beta_Y E$ and $Z = \alpha_Z I + \beta_Z E$. Then the triple $(Y, Z, \gamma)$ is feasible in (DB) if and only if

\[ \beta_Y \ge 0, \quad \beta_Z \ge 0, \quad \alpha_Y + \beta_Y \ge 0, \quad \alpha_Z + \beta_Z \ge 0, \]
\[ d(\alpha_Y + \beta_Y + \alpha_Z + \beta_Z) + (d^2 - d)(\beta_Y + \beta_Z) = 1, \]

and

\[ \langle B_i, Z - Y \rangle + \gamma \le 0, \quad i = 0, \dots, 2^d - 1. \]

Due to the special structure of $Y$ and $Z$, the last $2^d$ inequalities boil down to only $d + 1$ inequalities, since $\langle B_i, I \rangle = k$ and $\langle B_i, E \rangle = k^2$ whenever $p(i)$ has exactly $k$ entries equal to one:

\[ k(\alpha_Z - \alpha_Y) + k^2(\beta_Z - \beta_Y) + \gamma \le 0, \quad k = 0, \dots, d. \]

Thus, each feasible point of this reduced approximation is feasible in (DB) as well and attains the same objective value there, which shows the claim.

In the following, let us provide two obvious necessary and sufficient conditions (Proposition 2.6). The second condition is quite useful for theoretical purposes (but not for numerical purposes), as it shows that a downscaled version of a Bernoulli-compatible matrix remains Bernoulli-compatible.
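The reduced dual LP of Proposition 2.5 has only five variables and $d + 5$ constraints, so it can be solved instantly for any $d$. A minimal sketch, assuming SciPy's `linprog` (HiGHS backend); the function name `reduced_dual_value` is hypothetical, and the input is assumed to be symmetric with entries in $[0,1]$ so that the LP is bounded.

```python
import numpy as np
from scipy.optimize import linprog


def reduced_dual_value(B):
    """Optimal value of the reduced dual LP with Y = aY*I + bY*E, Z = aZ*I + bZ*E.
    Variables are ordered as x = (alpha_Y, beta_Y, alpha_Z, beta_Z, gamma)."""
    B = np.asarray(B, dtype=float)
    d = B.shape[0]
    tr, total = np.trace(B), B.sum()          # trace(B) and <B, E>
    objective = np.array([-tr, -total, tr, total, 1.0])
    A_ub = [[0, -1, 0, 0, 0],                 # beta_Y >= 0
            [0, 0, 0, -1, 0],                 # beta_Z >= 0
            [-1, -1, 0, 0, 0],                # alpha_Y + beta_Y >= 0
            [0, 0, -1, -1, 0]]                # alpha_Z + beta_Z >= 0
    for k in range(d + 1):                    # one constraint per number k of ones in p
        A_ub.append([-k, -k * k, k, k * k, 1.0])
    b_ub = np.zeros(len(A_ub))
    A_eq = [[d, d * d, d, d * d, 0.0]]        # normalisation <E, Z + Y> = 1
    res = linprog(-objective, A_ub=np.array(A_ub, dtype=float), b_ub=b_ub,
                  A_eq=np.array(A_eq, dtype=float), b_eq=[1.0],
                  bounds=[(None, None)] * 5, method="highs")
    return -res.fun                           # linprog minimises, so negate back


# Passes Proposition 2.1 but is rejected by the reduced dual test
print(reduced_dual_value(0.5 * np.eye(3)))
```

A strictly positive return value certifies incompatibility; a value of zero is inconclusive, because only cuts of the special form $\alpha I + \beta E$ have been searched.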
In contrast, the first condition comes in quite handy for the determination of a suitable starting point for the solution of the primal LP. Accordingly, this has been used in our numerical implementation: let us consider low-dimensional principal diagonal blocks of the matrix $B$ (e.g. of dimension 10). For blocks of this kind, the primal LP can be solved very efficiently. If one of these blocks is not Bernoulli-compatible, the same holds for $B$. In the other case, we can easily use these local solutions to construct a feasible starting point for the original LP. This starting point already fits the selected blocks and thus has a reasonably small objective value. Of course, random (or alternatively selected) samples of a few low-dimensional principal sub-matrices can also be efficiently tested via solving low-dimensional versions of (PB). For the same reasons as above, we do not consider these random tests any further, but mention it as an idea for further research.

Proposition 2.6 (Necessary and sufficient conditions) Let $B \in \mathbb{R}^{d \times d}$. Then:
1. $B$ is Bernoulli-compatible $\iff$ each principal sub-matrix of $B$ is Bernoulli-compatible.
2. $B$ is Bernoulli-compatible $\iff$ $\forall \lambda \in [0,1]$ : $\lambda B$ is Bernoulli-compatible.

Proof As the $\Leftarrow$ parts of both statements are obvious, we only consider the $\Rightarrow$ directions. The first statement follows from the fact that any sub-vector of $X$ is again multivariate Bernoulli distributed, if $X$ is multivariate Bernoulli distributed. The second statement follows directly from the convex hull representation of a Bernoulli-compatible matrix.

The first condition of the subsequent proposition is again easy-to-verify. To the best of our knowledge, the second property cannot be exploited for numerical purposes.

Proposition 2.7 (Sufficient conditions) Let $B, G, H \in \mathbb{R}^{d \times d}$ and let $B$ satisfy all necessary conditions from Proposition 2.1. Then it holds:
1. If $B$ is diagonally dominant, then $\frac{1}{\operatorname{trace}(B)} B$ is Bernoulli-compatible.
2. If $G$ and $H$ are Bernoulli-compatible, then $G \circ H$ (i.e. the Hadamard product of $G$ and $H$) is Bernoulli-compatible.

Proof The first statement follows from Lemma 2.15 and [Embrechts et al., 2016], Proposition 2.5.
For a proof of the second statement, let us refer to [Embrechts et al., 2016], Proposition 2.1 or [Fiebig et al., 2017], Corollary 14.

2.2 Connection to BQPs and complexity of (PB) and (DB)

From [Pitowsky, 1991], Sections 3.1 and 3.3, we know that the problem "$B \in \mathcal{B}_d$?" is both in NP and NP-hard, thus NP-complete. However, it is not clear if the complementary question "$B \notin \mathcal{B}_d$?" is also in NP, i.e. it is not known if "$B \in \mathcal{B}_d$?" is in co-NP. For more details on the complexity discussion, let us refer to [Pitowsky, 1991] or [Deza and Laurent, 1997], who link this problem to the question "NP = co-NP?". Nevertheless, to foster a better understanding of the inherent complexity of the membership testing problem, let us consider the following observation, which can be traced back to [Deza and Laurent, 1997], but does not seem to be widely known:

\[ \min_{p \in \{0,1\}^d} p^\top Q p = \min_{\lambda \in \Lambda_{2^d}} \sum_i \lambda_i \, p(i)^\top Q \, p(i) = \min_{\lambda \in \Lambda_{2^d}} \sum_i \lambda_i \langle Q, p(i) p(i)^\top \rangle = \min_{\lambda \in \Lambda_{2^d}} \sum_i \lambda_i \langle Q, B_i \rangle = \min_{\lambda \in \Lambda_{2^d}} \Big\langle Q, \sum_i \lambda_i B_i \Big\rangle = \min_{B \in \mathcal{B}_d} \langle Q, B \rangle. \]

This shows that any$^{7}$ BQP can be formulated as a linear program over the Bernoulli polytope. Hence, if the membership problem (together with provision of a separating hyperplane) was in P, a polynomial time algorithm for the solution of BQPs could be found. In our numerical approach we exploit the above connection between the Bernoulli polytope and BQPs in the reverse direction: we base our ansatz for membership testing for the Bernoulli polytope on the solution of a sequence of corresponding BQPs. To the best of our knowledge, this is the first attempt to use this reverse connection to significantly speed up the testing of Bernoulli compatibility.

2.3 Connection to $\{0,1\}$-completely positive matrices

Berman and Xu introduced in [Berman and Xu, 2005] and [Berman and Xu, 2007] the concept of $\{0,1\}$-completely positive matrices, together with a detailed investigation of the structure of special types of these matrices. In our notation, a matrix $A$ is called a $\{0,1\}$-completely positive matrix if it can be written as $A = \sum_{i=1}^{2^d-1} \nu_i B_i$ with$^{8}$ $\nu_i \in \mathbb{N}_0$. In this case, $\sum_{i=1}^{2^d-1} \nu_i$ is called the $\{0,1\}$-completely positive rank of $A$, in short $\operatorname{rank}_{\{0,1\}}(A)$. The following result establishes an immediate connection between these $\{0,1\}$-completely positive matrices and Bernoulli matrices. Based on this connection, a novel necessary condition for $\{0,1\}$-completely positive matrices is derived in the subsequent corollary, which is based on testing a scaled version of the matrix on Bernoulli compatibility.

$^{7}$ For completeness, note that
\[ \min_{p \in \{0,1\}^d} p^\top Q p + q^\top p = \min_{p \in \{0,1\}^d} p^\top (Q + \operatorname{diag}(q)) \, p, \]
showing that all binary QPs are covered by the above.

$^{8}$ Note that the sum index starts at 1 instead of 0 on purpose.
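The chain of equalities in Section 2.2 and the identity in footnote 7 can be checked by brute force on a small random instance. The sketch below assumes NumPy and enumerates all $2^d$ binary vectors, so it is purely illustrative; it is not the BQP solver used in the numerical part of this paper, and both function names are hypothetical.

```python
import itertools
import numpy as np


def bqp_min(Q, q=None):
    """min over p in {0,1}^d of p^T Q p + q^T p, by full enumeration."""
    Q = np.asarray(Q, dtype=float)
    d = Q.shape[0]
    q = np.zeros(d) if q is None else np.asarray(q, dtype=float)
    return min(np.array(p) @ Q @ np.array(p) + q @ np.array(p)
               for p in itertools.product((0, 1), repeat=d))


def linear_min_over_vertices(Q):
    """min over the vertices B_i = p p^T of <Q, B_i> = trace(Q^T B_i).
    A linear function attains its minimum over a polytope at a vertex,
    so this equals the minimum of <Q, B> over the whole Bernoulli polytope."""
    Q = np.asarray(Q, dtype=float)
    d = Q.shape[0]
    return min(np.trace(Q.T @ np.outer(p, p))
               for p in itertools.product((0, 1), repeat=d))


rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4))
Q = (M + M.T) / 2                      # random symmetric cost matrix
q = rng.normal(size=4)                 # random linear term

print(np.isclose(bqp_min(Q), linear_min_over_vertices(Q)))
print(np.isclose(bqp_min(Q, q), bqp_min(Q + np.diag(q))))
```

The second comparison relies on $p_j^2 = p_j$ for binary $p$, which is exactly why the linear term can be absorbed into the diagonal.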

Theorem 2.8 (Scaling of Bernoulli-compatible matrices) Let $B \in \mathcal{S}_d^+$ satisfy the necessary conditions from Proposition 2.1 and let $\operatorname{trace}(B) > 0$. Then it holds:
1. $B$ Bernoulli-compatible $\implies$ $\frac{1}{\operatorname{trace}(B)} B$ Bernoulli-compatible.
2. $\frac{1}{\operatorname{trace}(B)} B$ Bernoulli-compatible and $\operatorname{trace}(B) \le 1$ $\implies$ $B$ Bernoulli-compatible.

From this theorem, the subsequent corollary, which yields a necessary condition for $\{0,1\}$-completely positive matrices, follows immediately.

Corollary 2.9 (Necessary condition for $\{0,1\}$-completely positive matrices) Let $0 \ne A \in \mathcal{S}_d$ be a $\{0,1\}$-completely positive matrix. Then $B = \frac{1}{\operatorname{trace}(A)} A$ is Bernoulli-compatible.

For the proof of the theorem, we need the following lemma, which characterizes the trace of Bernoulli-compatible matrices.

Lemma 2.10 (Bounding $a_0$) Let $B \in \mathcal{S}_d$ be Bernoulli-compatible. Then, for each $a \in \Lambda_{2^d}$ with $B = \sum_{i=0}^{2^d-1} a_i B_i$ it holds that

\[ \sum_{i=1}^{2^d-1} a_i \le \operatorname{trace}(B) \le d \sum_{i=1}^{2^d-1} a_i, \]

or, equivalently,

\[ 1 - a_0 \le \operatorname{trace}(B) \le d \, (1 - a_0). \]

Proof (of Lemma 2.10) As $B_0 = 0$, we can leave out the first term of the representation of $B$ and obtain $B = \sum_{i=1}^{2^d-1} a_i B_i$. Now, as

\[ 1 \le \operatorname{trace}(B_i) \le d, \quad \text{for } i = 1, \dots, 2^d - 1, \]

we have

\[ \sum_{i=1}^{2^d-1} a_i \le \operatorname{trace}(B) \le d \sum_{i=1}^{2^d-1} a_i, \]

which proves the statement.
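Lemma 2.10 can be checked numerically on a random convex combination of vertices, and it is exactly what makes the rescaled weights behind Theorem 2.8 (1) a valid convex combination. A minimal sketch, assuming NumPy; the names `rescale_representation` and `vertex_matrix`, the dimension and the random seed are illustrative.

```python
import numpy as np


def vertex_matrix(i, d):
    """B_i = p(i) p(i)^T with p(i) the binary digits of i (least significant first)."""
    p = np.array([(i >> j) & 1 for j in range(d)], dtype=float)
    return np.outer(p, p)


def rescale_representation(a, d):
    """Given weights a on the 2^d vertices, return B, trace(B) and weights
    representing B / trace(B); the leftover mass is moved to B_0 = 0."""
    B = sum(a[i] * vertex_matrix(i, d) for i in range(2 ** d))
    t = np.trace(B)
    a_scaled = np.asarray(a, dtype=float) / t
    a_scaled[0] = 1.0 - a_scaled[1:].sum()   # non-negative by Lemma 2.10
    return B, t, a_scaled


d = 4
rng = np.random.default_rng(1)
a = rng.dirichlet(np.ones(2 ** d))           # random point of the simplex
B, t, a_scaled = rescale_representation(a, d)

print(1 - a[0] <= t <= d * (1 - a[0]))       # the two bounds of Lemma 2.10
print(a_scaled[0] >= 0)                      # rescaled weights stay in the simplex
```

The weight placed on $B_0$ does not change the represented matrix, which is why the rescaled weights still reproduce $B / \operatorname{trace}(B)$.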

Proof (of Theorem 2.8)
1. Note that
\[ \frac{1}{\operatorname{trace}(B)} B = \sum_{i=0}^{2^d-1} \frac{a_i}{\operatorname{trace}(B)} B_i = \sum_{i=1}^{2^d-1} \frac{a_i}{\operatorname{trace}(B)} B_i. \]
From the above lemma, we have
\[ \sum_{i=1}^{2^d-1} \frac{a_i}{\operatorname{trace}(B)} \le 1. \]
Hence, after assigning the remaining weight to $B_0 = 0$, the above representation yields a proper representation as a Bernoulli-compatible matrix, establishing the claim.
2. As $B = \operatorname{trace}(B) \big( \frac{1}{\operatorname{trace}(B)} B \big)$ and the matrix in brackets is Bernoulli-compatible according to the assumption, the claim directly follows from the second part of Proposition 2.6.

Proof (of Corollary 2.9) As $A$ is a $\{0,1\}$-completely positive matrix, it has a representation

\[ A = \sum_{i=1}^{2^d-1} \nu_i B_i \quad \text{with} \quad \nu := \sum_{i=1}^{2^d-1} \nu_i > 0. \]

Thus, $\frac{1}{\nu} A$ is obviously a Bernoulli-compatible matrix. The rest of the statement follows from the first statement in Theorem 2.8.

Unfortunately, there does not seem to be any one-to-one correspondence between Bernoulli-compatible matrices and $\{0,1\}$-completely positive matrices in the spirit of the previous corollary. Nevertheless, the following (slightly weaker) result can be stated.

Proposition 2.11 Let $B$ be Bernoulli-compatible and let $B$ have only rational entries. Then there exists a $\kappa \in \mathbb{N}$ such that $\kappa B$ is a $\{0,1\}$-completely positive matrix.

Proof As it is well-known that the Simplex algorithm yields a rational solution if started with rational inputs, there is at least one representation $a \in \Lambda_{2^d}$ with rational entries only. Then, setting $\kappa$ to the least common denominator of all $a_i$ yields the required $\kappa$.

Taking into account the fact that the optimal solution in an LP is determined by means of an optimal basis, the least common denominator $\kappa$ can be bounded by the Hadamard bound for the determinant of $\{0,1\}$-matrices (and the least common denominator of $B$), see for instance [Ziegler, 1999], Lemma 24.
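The construction in the proof of Proposition 2.11 can be made concrete with exact rational arithmetic. The sketch below assumes that a rational representation is already given (here representation (i) from Remark 1.3) and uses only the Python standard library; `math.lcm` with several arguments requires Python 3.9 or later, and the function names are hypothetical.

```python
from fractions import Fraction
from math import lcm


def vertex_matrix(i, d):
    """B_i = p(i) p(i)^T as a nested list of integers."""
    p = [(i >> j) & 1 for j in range(d)]
    return [[p[r] * p[c] for c in range(d)] for r in range(d)]


def integer_scaling(weights):
    """weights: dict {vertex index i: rational weight a_i}.
    Returns kappa (least common denominator) and the integer multiplicities
    nu_i = kappa * a_i for i >= 1; the vertex B_0 = 0 is dropped."""
    kappa = lcm(*(w.denominator for w in weights.values()))
    nu = {i: int(w * kappa) for i, w in weights.items() if i != 0}
    return kappa, nu


# Representation (i) of Remark 1.3: 1/4 on E (index 7) and on each e_i e_i^T
d = 3
weights = {7: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 4), 4: Fraction(1, 4)}
kappa, nu = integer_scaling(weights)

# kappa * B as an integer combination of vertices
kappa_B = [[sum(n * vertex_matrix(i, d)[r][c] for i, n in nu.items())
            for c in range(d)] for r in range(d)]
print(kappa, nu, kappa_B)
```

Here $\kappa B$ has integer entries and is written with non-negative integer multiplicities, which is the defining property of a $\{0,1\}$-completely positive matrix.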

2.4 Relation of the Bernoulli cone to standard cones

Motivated by the previous considerations, let us introduce the following convex cones.

Definition 2.12 (Dual cone to the Bernoulli cone) The cone $\mathrm{BC}_d^+ := \mathrm{cone}(\mathcal{B}_d)$ shall be denoted as Bernoulli cone. Let us call its dual cone $\mathrm{WP}_d^+ := (\mathrm{BC}_d^+)^*$ the weakly positive cone.

For example, all matrices $qq^\top + \mathrm{diag}(q)$ with $q \ge 0$ are weakly positive matrices. Similarly, the matrices $qq^\top - \mathrm{diag}(q)$ are weakly positive, if $q \ge 1$ or if 0 q 1 1. Finally, the matrices $qq^\top \pm \mathrm{diag}(q)$ are weakly positive for all $q \in \mathbb{Z}^d$; they constitute exactly the matrices defining the hypermetric inequalities.

Before we give a complete characterization of the weakly positive cone and discuss its relation to other cones, let us state the following obvious observation. For this purpose, let
\[
\begin{aligned}
\mathrm{Cop}_d^+ &:= \{ A \in \mathbb{R}^{d \times d} : x^\top A x \ge 0 \ \ \forall x \in \mathbb{R}_+^d \}, \\
\mathrm{CP}_d^+ &:= \{ A \in \mathbb{R}^{d \times d} : \exists B \ge 0 : A = BB^\top \}, \\
\mathrm{DNN}_d^+ &:= \{ A \in \mathbb{R}^{d \times d} : A \succeq 0,\ A \ge 0 \}
\end{aligned}
\]
denote the usual copositive cone, the completely positive cone, and the doubly nonnegative cone. The subsequent observation is not only true in our specific setup, but carries over to all kinds of polytopes generated by expectations like $E_P[YY^\top]$ for nonnegative random vectors $Y$ with given marginals.

Proposition 2.13 It holds that $\mathrm{Cop}_d^+ \subseteq \mathrm{WP}_d^+$ and furthermore $\mathrm{BC}_d^+ \subseteq \mathrm{CP}_d^+$.

Proof Note that for any copositive matrix $C$ and $X \ge 0$ it holds
\[ \langle C, E_P[XX^\top] \rangle = E_P[\langle C, XX^\top \rangle] = E_P[X^\top C X] \ge 0. \]
From this identity, both statements are immediate.

In our specific setup, it is possible to give a complete characterization of the dual cone.

Proposition 2.14 (Characterization of weakly positive matrices) It holds
\[ \mathrm{WP}_d^+ = \Big\{ A \in S_d : 0 \le \min_{x \in \{0,1\}^d} x^\top A x \Big\}, \]
i.e. the weakly positive cone consists of all matrices which yield a non-negative optimal value in the standard BQP.

Proof Using the above observations on the relation between linear optimization over the Bernoulli polytope and binary quadratic programming, we get
\[
\begin{aligned}
\mathrm{WP}_d^+ = (\mathrm{BC}_d^+)^* &= \{ A \in S_d : 0 \le \langle A, B \rangle \ \ \forall B \in \mathrm{BC}_d^+ \} \\
&= \Big\{ A \in S_d : 0 \le \min_{B \in \mathrm{BC}_d^+} \langle A, B \rangle \Big\} \\
&= \Big\{ A \in S_d : 0 \le \min_{\lambda \ge 0,\, B \in \mathcal{B}_d} \lambda \langle A, B \rangle \Big\} \\
&= \Big\{ A \in S_d : 0 \le \min_{\lambda \ge 0,\, x \in \{0,1\}^d} \lambda\, x^\top A x \Big\} \\
&= \Big\{ A \in S_d : 0 \le \min_{x \in \{0,1\}^d} x^\top A x \Big\}.
\end{aligned}
\]

The following lemma yields that membership testing for the Bernoulli cone is equivalent to membership testing for the Bernoulli polytope. Please note that this lemma was already used in the proof of Proposition 2.7 to derive a sufficient condition (i.e. diagonal dominance) for Bernoulli matrices.

Lemma 2.15 (Complexity of membership testing for the Bernoulli cone) Let $A \in S_d$ with $\mathrm{trace}(A) > 0$. Then
\[ A \in \mathrm{BC}_d^+ \iff \frac{1}{\mathrm{trace}(A)} A \in \mathcal{B}_d. \]

Proof The direction $\Leftarrow$ is immediate from the definition of the cone $\mathrm{BC}_d^+$. For the reverse direction, let $A \in \mathrm{BC}_d^+$. Then there exist $\lambda > 0$ and $B \in \mathcal{B}_d$ such that $A = \lambda B$. As $B \in \mathcal{B}_d$ we also have $\frac{1}{\mathrm{trace}(B)} B \in \mathcal{B}_d$ due to Theorem 2.8.(1). Since $\frac{1}{\mathrm{trace}(A)} A = \frac{1}{\mathrm{trace}(B)} B$, the statement follows.

Remark 2.16 Due to Lemma 2.15, testing membership in the Bernoulli cone, i.e. $B \in \mathrm{BC}_d^+$?, remains an NP-complete problem, as well as testing $B \in \mathrm{WP}_d^+$?. Using the recent result from [Friedland and Lim, 2016], Theorem 15, we obtain that also membership testing for the weakly positive cone has to be NP-hard. In our specific situation, this directly follows from the representation derived in Proposition 2.14 due to the NP-hardness of (BQP). Please note that the NP-completeness of the corresponding decision problem of (BQP) does not^9 imply that the membership problem of the weakly positive cone also is in NP (otherwise, it would be NP-complete).

^9 The decision problem for minimization problems is formulated in terms of $\le$, which is the wrong direction for our purposes.
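For small dimensions, the characterization of Proposition 2.14 can be checked directly by enumerating all binary vectors. The following is a minimal sketch with exponential running time; the function names `min_bqp`, `is_weakly_positive` and `hypermetric_matrix` are hypothetical, and the test vector `q` is an arbitrary integer example.

```python
import itertools

def min_bqp(A):
    """Brute-force minimum of x^T A x over x in {0,1}^d (only sensible for small d)."""
    d = len(A)
    best = None
    for x in itertools.product((0, 1), repeat=d):
        val = sum(A[i][j] * x[i] * x[j] for i in range(d) for j in range(d))
        if best is None or val < best:
            best = val
    return best

def is_weakly_positive(A):
    """A is weakly positive iff the standard BQP has a non-negative optimal value."""
    return min_bqp(A) >= 0

def hypermetric_matrix(q, sign=+1):
    """Return q q^T + sign * diag(q) as a list of lists."""
    d = len(q)
    return [[q[i] * q[j] + (sign * q[i] if i == j else 0) for j in range(d)]
            for i in range(d)]

q = [2, -1, 3]  # arbitrary integer vector
print(is_weakly_positive(hypermetric_matrix(q, +1)))
print(is_weakly_positive(hypermetric_matrix(q, -1)))
print(is_weakly_positive([[-1, 0], [0, 1]]))
```

Since $x = 0$ is always feasible, the minimum is never positive, so weak positivity is the same as the minimum being exactly zero. For binary $x$ and $s = q^\top x$ the two hypermetric forms evaluate to $s(s \pm 1)$, which is non-negative whenever $s$ is an integer.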

Summarizing the above observations, we obtain the following diagram of standard cones used in optimization theory.
\[
\begin{array}{ccccccccccc}
\mathrm{BC}_d^+ & \subseteq & \mathrm{CP}_d^+ & \subseteq & \mathrm{DNN}_d^+ & \subseteq & S_d^+ & \subseteq & \mathrm{Cop}_d^+ & \subseteq & \mathrm{WP}_d^+ \\
\updownarrow & & \updownarrow & & \updownarrow & & \updownarrow & & \updownarrow & & \updownarrow \\
\mathrm{WP}_d^+ & \supseteq & \mathrm{Cop}_d^+ & \supseteq & S_d^+ + \mathbb{R}_+^{d \times d} & \supseteq & S_d^+ & \supseteq & \mathrm{CP}_d^+ & \supseteq & \mathrm{BC}_d^+
\end{array}
\]
where a cone is marked in red when its membership problem is known to be NP-hard. If the membership problem is known to be in P, then it is marked in green. The arrow between cones denotes the dual relationship to each other.

3 A column generation approach

3.1 Motivation

Since LPs are often considered to be the most easy-to-solve optimization problems, it is tempting to solve problems (PB) and (DB) by standard LP-solvers. Accordingly, we solve the different formulations (PB) and (DB) with IBM's Ilog Cplex. The results are presented in Figure 1, where computation time and typical^10 memory usage are illustrated. For small dimensions, i.e. $d \le 17$, we were able to solve all instances within a second. However, due to memory limitations, we were not able to solve any problem instance for $d > 20$. As already mentioned in the introductory section, to overcome these difficulties, we subsequently propose a column generation approach.

^10 It is very difficult to exactly measure the average memory usage of an algorithm at any given time, thus we present approximate (slightly overestimated for small dimensions) values at this point.

3.2 A generic column generation method

In the following, let us recall the generic column generation method for linear programs; see, for example, [Lübbecke, 2010] for a detailed presentation. For this purpose, let us consider a linear optimization problem $(P_J)$ in the following form, called the master problem:
\[
v(J) := \min_{x \in \mathbb{R}^n} \ \sum_{j \in J} c_j x_j \quad \text{s.t.} \quad \sum_{j \in J} d_j x_j \ge b, \quad x_j \ge 0 \ \ \forall j \in J, \tag{$P_J$}
\]

with $J = \{1, \dots, n\}$, $d_j \in \mathbb{R}^m$ for $j \in J$ and $b \in \mathbb{R}^m$, where $n$ is much larger than $m$. The corresponding dual^11 problem $(D_J)$ is given by
\[
v(J) = \max_{y \in \mathbb{R}^m} \ b^\top y \quad \text{s.t.} \quad d_j^\top y \le c_j \ \ \forall j \in J, \quad y \ge 0. \tag{$D_J$}
\]

^11 In this section, we assume that the primal master problem is feasible and bounded. Hence, by strong duality, the same holds for the dual problem, and both optimal values coincide.

Figure 1 Typical memory usage and computation times of Cplex, averaged over all problem classes. For $d = 20$ the maximum available memory of 16 GB is reached. (Axes: approx. memory usage in GB and average time in s, plotted against the dimension $d$ from 4 to 20; legend: Memory, Cplex.)

As Figure 1 shows, the large number of primal variables makes directly solving the master problem intractable beyond $n \approx 10^6$ because of memory issues. Therefore, one resorts to iteratively solving restricted master problems $(P_{I_k})$ for $k = 1, \dots, K$:
\[
v(I_k) := \min_{x \in \mathbb{R}^n} \ \sum_{j \in I_k} c_j x_j \quad \text{s.t.} \quad \sum_{j \in I_k} d_j x_j \ge b, \quad x_j \ge 0 \ \ \forall j \in I_k. \tag{$P_{I_k}$}
\]
The sets $I_k \subseteq J$, usually called inner sets, represent subsets of indices (i.e. primal variables) which are used for the optimization; the remaining variables are simply set to 0 and thus excluded from the optimization. Starting with some initial inner set $I_0$, subsequently variables (columns) which are assumed to improve the current optimal solution are added (generated), when advancing from $I_k$ to $I_{k+1}$. Thus, in the course of

the algorithm, a finite sequence of (small) subsets $I_0 \subseteq I_1 \subseteq \dots \subseteq I_K \subseteq J$ is considered. Due to the fundamental theorem of linear programming, there always exists an optimal basic solution for $(P_J)$ and, by construction, an optimal solution for the master problem is obtained in the restricted problem as soon as an optimal basis for the master problem is included in a set $I_k$. The number of elements in such a basis is $m$, which is assumed to be much smaller than $n$. Hence, in practice, $I_K$ will hopefully be substantially smaller than $J$ for the majority of problem instances. As Figure 2 shows, this hope is not in vain, as the average number of column generation steps grows only mildly with the dimension and does not exceed 500 in all our examples for $d \le 30$.

Figure 2 Average number of iterations of the pure column generation method, averaged over 10 instances of each problem class, cf. Section 4. It can be observed that the number of column generation steps only increases mildly with the dimension. (Axes: average number of iterations, from 0 to 500, against the dimension $d$, from 0 to 30.)

Denote by $x_{I_k}$ an optimal solution of $(P_{I_k})$ and by $y_{I_k}$ the corresponding optimal dual solution. Since $x_{I_k}$ is feasible for $(P_J)$, $x_{I_k}$ is an optimal solution for the master problem if and only if $y_{I_k}$ is feasible for $(D_J)$. The dual feasibility of $y_{I_k}$ can be determined by means of the subproblem $(SP_k)$:
\[
h^*_{I_k} := \max_{j \in J} h_j(y_{I_k}), \tag{$SP_k$}
\]
where for some (not necessarily feasible) point $y$, the violation of the $j$-th dual constraint is given by
\[ h_j(y) := y^\top d_j - c_j. \]
By construction, $h^*_{I_k} \le 0$ implies $y_{I_k}$ to be dual feasible and thus provides optimality of the current solution $x_{I_k}$. In the case of $h^*_{I_k} > 0$ one sets $I_{k+1} := I_k \cup \{j_{I_k}\}$, that is, one

adds the corresponding maximizing column $j_{I_k}$ in $(SP_k)$ and sets $k := k + 1$. Repeating this process, an optimal solution for the master problem is found after a finite number of steps. Altogether, we obtain

Algorithm 1 (Column Generation)
1. Choose an initial subset $I_0 \subseteq J$ such that the restricted master problem is feasible and bounded and set $k := 0$.
2. Solve the restricted master problem $(P_{I_k})$ to obtain $x_{I_k}$ together with dual multipliers $y_{I_k}$.
3. If $v(I_k) = 0$: $x_{I_k}$ solves $(P_J)$, stop.
4. Solve the subproblem $(SP_k)$ to obtain $h^*_{I_k}$ and the corresponding maximizer $j_{I_k}$.
5. If $h^*_{I_k} \le 0$: $x_{I_k}$ solves $(P_J)$, stop. Else, set $I_{k+1} := I_k \cup \{j_{I_k}\}$, $k := k + 1$ and go to 2.

Remark 3.1 Let us emphasize a few important aspects of Algorithm 1:
(i) For finite $J$, if $(P_{I_0})$ is feasible and bounded, the algorithm inherits the finiteness and correctness properties of linear programming, cf. [Lübbecke, 2010]. This means, for some $K \le n$, the iterate $x_{I_K}$ has to be an optimal solution for $(P_J)$. Note that in the extreme case this may lead to $I_K = J$. In practice, as can be seen from Figure 2, the average number of iterations only increases rather mildly with the dimension.
(ii) By construction, any two restricted optimal solutions $x_{I_k}$ and $x_{I_l}$ with $k < l$ satisfy $c^\top x_{I_k} \ge c^\top x_{I_l}$, i.e. the optimal values of the restricted problems converge from above to the optimal value of the master problem.
(iii) The stopping criterion in Step 3 of Algorithm 1 is not part of a classical column generation method. However, as we know that the optimal value of the primal problem is always non-negative, we can stop the column generation as soon as an objective value of 0 is obtained. In this case the given matrix $B$ is Bernoulli-compatible. In case Algorithm 1 stops in Step 5, the matrix $B$ is not Bernoulli-compatible.
(iv) It is critical to the overall efficiency of the column generation approach that both the restricted LP as well as the subproblem of determining the most violating constraint can be solved efficiently. In most successful applications of column generation, the problem structure of the subproblem can be exploited to avoid solving by full enumeration. For more details on the efficient solution of the subproblem in the present context let us refer to Section 3.3.
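Steps 4 and 5 of Algorithm 1 can be sketched for the generic setting, where the columns are given explicitly and the pricing is done by enumeration. This is a minimal illustration only: the function names `price` and `update_inner_set`, the tolerance `tol` and the toy data are assumptions, the dual multipliers are supplied by hand instead of coming from an LP solver, and the violation is taken as the column times $y$ minus the cost.

```python
def price(columns, costs, y):
    """Return (h_star, j_star): the largest dual violation and its column index.

    The violation of column j is (column_j . y) - cost_j; full enumeration over J.
    """
    best_h, best_j = float("-inf"), None
    for j, (col, c) in enumerate(zip(columns, costs)):
        h = sum(a * v for a, v in zip(col, y)) - c
        if h > best_h:
            best_h, best_j = h, j
    return best_h, best_j

def update_inner_set(inner, columns, costs, y, tol=1e-9):
    """Steps 4-5: return (new inner set, True if the current solution is optimal)."""
    h_star, j_star = price(columns, costs, y)
    if h_star <= tol:            # y is dual feasible for the master problem
        return inner, True
    return inner | {j_star}, False

# Toy master problem (illustrative data): three columns in R^2, b = (1, 1).
columns = [(1, 0), (0, 1), (1, 1)]
costs = [1.0, 1.0, 1.5]

# Dual optimum of the restricted problem with I_0 = {0, 1} is y = (1, 1).
inner, done = update_inner_set({0, 1}, columns, costs, (1.0, 1.0))
print(inner, done)

# After adding column 2, y = (1, 0.5) is a dual optimum and no constraint is violated.
inner2, done2 = update_inner_set(inner, columns, costs, (1.0, 0.5))
print(inner2, done2)
```

In the Bernoulli setting the index set has $2^d$ elements, so the loop in `price` is exactly the full enumeration that Section 3.3 avoids.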

3.3 Efficient solution of $(SP_k)$

As mentioned above, an efficient implementation of the column generation is obtained if subproblem $(SP_k)$ is solved efficiently. To avoid the full enumeration of all constraints, let us now exploit the specific structure of the subproblem. Given the dual variable $y = (Y, Z, \gamma)$, the maximum dual violation can be computed as follows:
\[
\begin{aligned}
\max_{j \in J} h_j(y) &= \max_{j \in J} \ \gamma - \langle B_j, Y - Z \rangle \\
&= \gamma + \max_{j \in J} \ \langle B_j, Z - Y \rangle \\
&= \gamma + \max_{j \in J} \ \langle p(j) p(j)^\top, Z - Y \rangle \\
&= \gamma + \max_{p \in \{0,1\}^d} \ p^\top (Z - Y) p.
\end{aligned}
\]
From the optimal $p$, the corresponding index can be determined immediately. Therefore, finding the most violating constraint and computing the maximum violation boils down to solving the binary quadratic program
\[
\max_{p \in \{0,1\}^d} \ p^\top G p, \tag{SP-BQP}
\]
with $G = Z - Y$. This problem is well-known to be NP-hard as long as no special structure in $G$ can be assumed, see, e.g., [Padberg, 1989]; no such structure is available in the present situation. Until today, exact solution methods seem to be limited to a few hundred variables at most, see, for instance, [Kochenberger et al., 2014]. In our implementation, we have solved (SP-BQP) by Cplex, which has shown to be much more efficient than full enumeration, cf. Figure 3. Furthermore, one observes that the time spent for solving the restricted master problem roughly equals the time needed for finding the most violating constraint. This indicates that Algorithm 1 can only be improved in terms of computation times if both the LPs and the binary quadratic subproblems can be solved much faster.

3.4 Dual bound

As mentioned in Remark 1.7, Algorithm 1 can be terminated early in case a separating hyperplane is found. By linear duality we know that any feasible solution to $(D_J)$ provides a lower bound to the optimal value of $(P_J)$. Unfortunately, the dual solution $y_{I_k}$ provided at step $k$ is, in general and excluding the optimal case, infeasible for $(D_J)$. Therefore, we consider an additional dual bound based on a Slater point of the dual problem in the spirit of [Daum and Werner, 2011].
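As a reference point for the subproblem (SP-BQP) of Section 3.3, the full-enumeration baseline can be sketched as follows. This is not the Cplex-based approach used in the implementation described above, merely the naive alternative it is compared against; the function name `most_violated_vertex` and the small matrices `Y`, `Z` and the value `gamma` are illustrative assumptions.

```python
import itertools

def most_violated_vertex(Y, Z, gamma):
    """Return (violation, p) with violation = gamma + max_p p^T (Z - Y) p, p in {0,1}^d.

    Full enumeration over all 2^d binary vectors; only sensible for small d.
    """
    d = len(Y)
    G = [[Z[i][j] - Y[i][j] for j in range(d)] for i in range(d)]
    best_val, best_p = float("-inf"), None
    for p in itertools.product((0, 1), repeat=d):
        support = [i for i in range(d) if p[i]]
        # p^T G p = <p p^T, G> is the sum of G over the support of p
        val = sum(G[i][j] for i in support for j in support)
        if val > best_val:
            best_val, best_p = val, p
    return gamma + best_val, best_p

# Illustrative dual variable y = (Y, Z, gamma) for d = 2.
Y = [[0, 1], [1, 0]]
Z = [[1, 0], [0, 2]]
gamma = -1.5
violation, p = most_violated_vertex(Y, Z, gamma)
print(violation, p)
```

A positive violation means that the vertex $p p^\top$ belongs to a violated dual constraint, so its column would be added to the restricted master problem.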
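Proposition 3.2 Let $y$ be infeasible for $(D_J)$. Further, let $y_s$ be a Slater point for $(D_J)$, i.e. $h^*(y_s) < 0$, where $h^*(y) := \max_{j \in J} h_j(y)$ denotes the maximum dual violation. Then there exists some $\bar\mu \in\, ]0, 1[$ such that $\bar y = \mu y + (1 - \mu) y_s$ is a Slater point, i.e. $h^*(\bar y) < 0$, for all $0 \le \mu < \bar\mu$. A suitable choice for $\bar\mu$ is given by
\[
\bar\mu = \frac{-h^*(y_s)}{h^*(y) - h^*(y_s)}.
\]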

Figure 3 Comparison of average computation times for one full enumeration, the solution of one binary quadratic subproblem, and the solution of one restricted master problem. The average is taken over all problem instances. It can be observed that full enumeration is much slower than solving the BQP subproblem. Further, the restricted master problem and the BQP subproblem roughly have the same computational workload. (Axes: average time in s, logarithmic scale, against the dimension $d$ from 4 to 20; legend: Full enumeration, SP-BQP, Restricted master problem.)

This implies that we can shift any infeasible iterate $y_{I_k}$ along the line towards the Slater point $y_s$ to a feasible iterate $\bar y_{I_k}$. Whenever $\bar y_{I_k}$ has a dual function value strictly greater than zero, it constitutes a separating hyperplane in the sense of Remark 1.7. As we will see in the following, this additional dual bound allows for a much earlier termination of the column generation and thus decreases computation time for Bernoulli-incompatible matrices significantly.

Figure 4 By means of a Slater point $y_s$, shrink iterate $y_{I_k}$ to a dual feasible iterate $\bar y_{I_k}$ to obtain a lower bound for the optimal objective function value. (Sketch labels: feasible region, $y_s$, $\bar y_{I_k}$, $y_{I_k}$.)
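The shift towards the Slater point can be sketched for explicitly given columns. This is a minimal illustration under stated assumptions: the violation of column $j$ is taken as the column times $y$ minus the cost, the maximum violation is found by enumeration, the bound on the step size is $-h^*(y_s) / (h^*(y) - h^*(y_s))$, which follows from the convexity of the maximum violation, and the function names, the `shrink` parameter and the toy data are hypothetical. The sign constraint $y \ge 0$ is preserved by convex combinations and is not checked here.

```python
def max_violation(columns, costs, y):
    """Largest dual violation (column_j . y) - cost_j over all columns."""
    return max(sum(a * v for a, v in zip(col, y)) - c
               for col, c in zip(columns, costs))

def shift_to_slater(columns, costs, y, y_s, shrink=0.5):
    """Return (mu_bar, y_bar) with y_bar = mu*y + (1-mu)*y_s and mu = shrink*mu_bar.

    Requires an infeasible y (positive violation) and a Slater point y_s
    (strictly negative violation); any 0 <= shrink < 1 gives a strictly feasible y_bar.
    """
    h_y = max_violation(columns, costs, y)
    h_s = max_violation(columns, costs, y_s)
    if not (h_s < 0 < h_y):
        raise ValueError("need h(y_s) < 0 < h(y)")
    mu_bar = -h_s / (h_y - h_s)
    mu = shrink * mu_bar
    y_bar = [mu * a + (1 - mu) * b for a, b in zip(y, y_s)]
    return mu_bar, y_bar

# Toy data (illustrative): three dual constraints in R^2.
columns = [(1, 0), (0, 1), (1, 1)]
costs = [1.0, 1.0, 1.5]
y = (1.0, 1.0)        # violates the third constraint by 0.5
y_s = (0.25, 0.25)    # strictly feasible
mu_bar, y_bar = shift_to_slater(columns, costs, y, y_s)
print(mu_bar, y_bar, max_violation(columns, costs, y_bar))
```

If the dual objective at the shifted point is strictly positive, that point would serve as the separating hyperplane mentioned above; evaluating the objective needs the right-hand side $b$, which is left out of this sketch.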

4 Numerical analysis

In this section, we report in detail the setup of our case study based on selected test instances, before we discuss the main numerical findings. In summary, our numerical analysis shows that the pure column generation method is quite efficient up to $d = 30$. Further, making use of several heuristics, we can efficiently test for Bernoulli compatibility up to dimension $d = 40$.

4.1 Test problems

Unfortunately, for testing Bernoulli-compatibility, there is no common test library available. Therefore, for the numerical tests, we have come up with five different families of test problems. The first two represent specifically selected parametrized problem classes, whereas the last three are based on random combinations of vertices of the Bernoulli polytope $\mathcal{B}_d$. All test cases satisfy the necessary conditions from Proposition 2.1, besides a few exceptions for $d < 6$, as well as a significant number of instances in class 1 which violate the Fréchet-Hoeffding bounds, cf. Figures 5 and 9.

Problem class 1: The matrices $B$ of the first class are given by
\[ B = (\eta - \eta^2 \kappa) I + \kappa \eta^2 E, \]
for some $0 \le \eta \le 1$ and $0 \le \kappa \le 1$, where $I$ denotes the identity matrix. Instances of this problem class can be either Bernoulli-compatible or not, depending on the parameters, cf. Figure 5.

Problem class 2: Matrices $B$ of the second class are given by
\[
B_{ii} = \frac{p}{p+q}, \quad i = 1, \dots, d, \qquad
B_{ij} = \frac{p}{p+q} \cdot \frac{p+1}{p+q+1}, \quad 1 \le i \ne j \le d,
\]
for some $0 < p < 1$ and $0 < q < 1$. Instances of this problem class are always Bernoulli-compatible: Draw $(U_1, \dots, U_d)$ from a copula that is defined as the convex combination of $1/(p+q+1)$ times the comonotonicity copula and $(p+q)/(p+q+1)$ times the independence copula. Further, let $X_i := \mathbb{1}_{\{U_i \le p/(p+q)\}}$ for $i = 1, \dots, d$. It is then easily verified (by conditioning) that $E[X_i X_j] = \frac{p(1+p)}{(1+p+q)(p+q)}$ for $i \ne j$. Moreover, $E[X_i^2] = \frac{p}{p+q}$ by the uniform margins property of a copula.
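The two parametrized classes are easy to generate, and the conditioning argument for class 2 can be verified with exact rational arithmetic. The following is a minimal sketch, assuming that $E$ in class 1 denotes the all-ones matrix; the function names and parameter values are hypothetical.

```python
from fractions import Fraction

def class1(d, eta, kappa):
    """B = (eta - eta^2 kappa) I + kappa eta^2 E, with E assumed to be the all-ones matrix.

    The diagonal is (eta - eta^2 kappa) + kappa eta^2 = eta.
    """
    off = kappa * eta ** 2
    return [[eta if i == j else off for j in range(d)] for i in range(d)]

def class2(d, p, q):
    """B_ii = p/(p+q), B_ij = p/(p+q) * (p+1)/(p+q+1) for i != j."""
    diag = p / (p + q)
    off = diag * (p + 1) / (p + q + 1)
    return [[diag if i == j else off for j in range(d)] for i in range(d)]

def class2_offdiag_from_mixture(p, q):
    """E[X_i X_j] obtained by conditioning on the mixture component of the copula."""
    t = p / (p + q)            # threshold, equals P(X_i = 1)
    w = 1 / (p + q + 1)        # weight of the comonotonicity copula
    # comonotone part: both indicators coincide; independent part: product of margins
    return w * t + (1 - w) * t * t

p, q = Fraction(1, 3), Fraction(1, 2)
B2 = class2(3, p, q)
print(B2[0][0], B2[0][1], class2_offdiag_from_mixture(p, q))

B1 = class1(3, Fraction(1, 2), Fraction(1, 2))
print(B1[0][0], B1[0][1])
```

With `Fraction` inputs the closed-form off-diagonal entry and the mixture computation agree exactly, which mirrors the "by conditioning" step in the text.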

Figure 5 Bernoulli-compatibility of matrices in problem class 1, depending on $\eta$ and $\kappa$ for $d = 3, 6, 9, 12$ (panels (a) $d = 3$, (b) $d = 6$, (c) $d = 9$, (d) $d = 12$). Black areas indicate Bernoulli-compatible matrices; gray areas indicate Bernoulli-incompatible instances that violate the Fréchet-Hoeffding bounds.

Problem class 3: The third class constitutes randomly generated matrices
\[ B = \sum_{k=1}^{n} \lambda_{i_k} B_{i_k}. \]
The number of terms $n$ is uniformly distributed in the interval [ 2, 4 ] and the vertices $B_{i_k}$ are uniformly distributed over all vertices of $\mathcal{B}_d$. Finally, the non-zero coefficients $\lambda_{i_1}, \dots, \lambda_{i_n}$ of the convex combination are uniformly distributed on the standard $(n)$-simplex, i.e. sampled from a Dirichlet distribution. As a convex combination of extremal points of $\mathcal{B}_d$, $B$ is always Bernoulli-compatible.

Problem class 4: Based on class 3, the matrices $B$ of problem class 4 are given by
\[ B := A + \frac{1}{10} B_j, \]
where the matrix $A$ is generated as in problem class 3. One specific index $j$ with $\lambda_j > 0$ is randomly chosen and increased by 0.1. In practice, this usually leads to Bernoulli-incompatible matrices for $d > 14$.

Problem class 5: Finally, we also consider a problem class which is supposed to produce hard problem instances, by setting
\[ B := A + \frac{1}{d} B_j. \]
Now, the matrix $B$ is derived in a similar fashion as in class 4; however, the shift decreases with increasing dimension. This is supposed to produce both Bernoulli-compatible and Bernoulli-incompatible matrices which are close to the Bernoulli polytope's boundary^12. In Figure 6, we have illustrated the distance of randomly generated instances from problem classes 4 and 5.

^12 This assumption is supported by our numerical findings in this section.
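The random classes 3 to 5 can be generated along the following lines. This is a minimal sketch under several assumptions: vertices are the matrices $p p^\top$ with $p$ drawn uniformly from $\{0,1\}^d$ (with replacement), the convex weights are drawn from a flat Dirichlet distribution via normalised exponential variates, the number of terms `n` is passed in by the caller, and the shift is `0.1` for class 4 and `1/d` for class 5. All function names are hypothetical.

```python
import random

def random_vertex(d, rng):
    """Vertex p p^T of the Bernoulli polytope for a uniformly drawn p in {0,1}^d."""
    p = [rng.randint(0, 1) for _ in range(d)]
    return [[p[i] * p[j] for j in range(d)] for i in range(d)]

def dirichlet_weights(n, rng):
    """Uniform sample from the unit simplex (flat Dirichlet) via normalised exponentials."""
    e = [rng.expovariate(1.0) for _ in range(n)]
    s = sum(e)
    return [v / s for v in e]

def class3(d, n, rng):
    """Random convex combination of n vertices; returns (B, weights, vertices)."""
    lam = dirichlet_weights(n, rng)
    verts = [random_vertex(d, rng) for _ in range(n)]
    B = [[sum(l * V[i][j] for l, V in zip(lam, verts)) for j in range(d)]
         for i in range(d)]
    return B, lam, verts

def shifted(d, n, shift, rng):
    """Classes 4 and 5 as read here: A + shift * B_j for one vertex B_j used in A."""
    A, lam, verts = class3(d, n, rng)
    Bj = rng.choice(verts)
    return [[A[i][j] + shift * Bj[i][j] for j in range(d)] for i in range(d)]

rng = random.Random(1)
d, n = 4, 8
B3, lam, verts = class3(d, n, rng)
B4 = shifted(d, n, 0.1, rng)       # class 4
B5 = shifted(d, n, 1.0 / d, rng)   # class 5
print(round(sum(lam), 12), len(B3), len(B4), len(B5))
```

Passing an explicit `random.Random` instance keeps the generated test instances reproducible, which is convenient when comparing running times across methods.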