A BRANCH AND CUT APPROACH TO LINEAR PROGRAMS WITH LINEAR COMPLEMENTARITY CONSTRAINTS


By Bin Yu

A Thesis Submitted to the Graduate Faculty of Rensselaer Polytechnic Institute in Partial Fulfillment of the Requirements for the Degree of DOCTOR OF PHILOSOPHY

Major Subject: DECISION SCIENCES AND ENGINEERING SYSTEMS

Approved by the Examining Committee: John E. Mitchell, Thesis Adviser; Kristin P. Bennett, Member; Thomas C. Sharkey, Member; Xuegang Ban, Member

Rensselaer Polytechnic Institute, Troy, New York, August 2011 (For Graduation August 2011)

© Copyright 2011 by Bin Yu. All Rights Reserved.

CONTENTS

LIST OF TABLES
LIST OF FIGURES
ACKNOWLEDGMENT
ABSTRACT
1. Introduction
   1.1 Statement of the Problem
   1.2 Application of LPCC
       1.2.1 Hierarchical Optimization
       1.2.2 Inverse Convex Quadratic Programming
       1.2.3 Indefinite Quadratic Programming
       1.2.4 Piecewise Linear Optimization
       1.2.5 Quantile Minimization
   1.3 Previous Work for Solving LPCCs
   1.4 Thesis Outline
2. Tighten the Relaxation of LPCC
   2.1 Preliminaries
       2.1.1 Preliminary Discussion and Motivation
       2.1.2 Cutting Plane and Cutting Surface
   2.2 Valid Linear Constraints for LPCC
       2.2.1 Linear Constraints Based on Bounds on the Variables
       2.2.2 Linear Constraints Based on Disjunctive Programming
             Introduction to Disjunctive Programming
             Disjunctive Cuts Based on Complementarity Constraints
             Disjunctive Cuts Based on Simplex Tableau
       2.2.3 Linear Constraints for Bilevel Programs
   2.3 Valid Second Order Cone Constraints for LPCC
       Convex Relaxation of Non-convex Quadratic Constraint
       Stronger Second Order Cone Constraint
       McCormick Bounds Refinement and Subgradient Approximation Cuts
   2.4 Computational Results
       2.4.1 Computation Environment

       2.4.2 LPCC Test Instances
       2.4.3 Computational Result of Linear Constraint Based on Bounds on the Variables
       2.4.4 Computational Result of Disjunctive Cuts
       2.4.5 Computational Result of Valid Second Order Cone Constraint
       2.4.6 Computational Result of CPLEX with Cuts
   2.5 Conclusion
3. LPCC using Branch-and-Cut
   3.1 Preliminaries
       3.1.1 Preliminary Discussion and Motivation
       3.1.2 Branch-and-Bound & Branch-and-Cut
   3.2 Branch and Cut Algorithm
       3.2.1 Preprocessing Phase
             LPCC Feasibility Recovery
             Cutting Planes Generation and Selection
             Special Partition on Variable x or ỹ when M is a Positive Semi-definite Matrix
             Overall Flow of the Preprocessor
       3.2.2 Branch-and-Bound Phase
             Branching Complementarity Selection
             Node Selection
             Node Pre-solving and Warm Start
             Overall Flow of Branch-and-Bound for LPCC (Bounded Case)
             Overall Flow of Branch-and-Bound for LPCC (Unbounded Case)
       3.2.3 General Scheme of the Branch-and-Cut Algorithm for Solving LPCC
   3.3 Computational Results
       3.3.1 Computational Result of the Feasibility Recovery Process
       3.3.2 Computational Result of Branch and Cut Algorithm
4. Case Study: Cross-validated Support Vector Regression
   4.1 Preliminaries on Cross-validated Support Vector Regression
   4.2 Feasibility Recovery for Cross-Validated SVR
   4.3 Cutting Planes for Cross-Validated SVR
       Valid Cuts from Geometric Meaning of SVR Problem
       Valid Cuts from Bi-level Formulation
   4.4 Special Partition on Space of C and ε

   4.5 Computational Results
       4.5.1 Synthetic Dataset I
       4.5.2 Synthetic Dataset II
       4.5.3 Real-World QSAR Dataset
   4.6 Discussion
5. Concluding Remarks
   5.1 Conclusion
   5.2 Future Work
LITERATURE CITED

LIST OF TABLES

2.1 LPCC test instances
2.2 Computational result of bound cuts refinements with p = 0
2.3 Computational result of bound cuts refinements with p = 1
2.4 Computational result of bound cuts refinements with p = 2
2.5 Computational result of bound cuts refinements with p = 3
2.6 Comparison of average gap closed and processing time of bound cuts refinements under different p settings
2.7 Computational result of disjunctive cuts and simple cuts
2.8 Comparison of average gap closed and processing time of disjunctive cuts and simple cuts
2.9 Computational result of McCormick bound cuts refinement
2.10 Comparison of average gap closed and processing time of McCormick bound refinements
2.11 Computational result of CPLEX with bound cuts (p = 0)
2.12 Computational result of CPLEX with bound cuts (p = 1)
2.13 Computational result of CPLEX with bound cuts (p = 2)
2.14 Computational result of CPLEX with bound cuts (p = 3)
2.15 Computational result of CPLEX with disjunctive cuts and cuts from second order cone constraints
2.16 Comparison of average solving time and explored nodes of CPLEX with different types of cuts
3.1 Computational result of feasibility recovery
3.2 Computational result of branch-and-cut for solving 60 test instances
3.3 Comparison of geometric mean of solving time
3.4 Comparison of geometric mean of explored nodes
4.1 Computational result of CPLEX for solving synthetic data I under 2-fold cross-validation
4.2 Computational result of branch and cut with special partition for solving synthetic data I under 2-fold cross-validation

4.3 Computational result for 10-d synthetic data II under 3-fold cross-validation; best MAD and MSE results are tagged
4.4 Computational result for 15-d synthetic data II under 3-fold cross-validation; best MAD and MSE results are tagged
4.5 Computational result for 25-d synthetic data II under 3-fold cross-validation; best MAD and MSE results are tagged
4.6 Computational result for QSAR data under 5-fold cross-validation; best MAD and MSE results are tagged

LIST OF FIGURES

2.1 Comparison of average gap closed of bound cuts refinements under different p settings
2.2 Comparison of average refinement time of bound cuts refinements for LPCC instances with m = 100 under different p settings
2.3 Comparison of average refinement time of bound cuts refinements for LPCC instances with m = 150 under different p settings
2.4 Comparison of average refinement time of bound cuts refinements for LPCC instances with m = 200 under different p settings
2.5 Comparison of average gap closed of disjunctive cut and simple cut
2.6 Comparison of average refinement time of disjunctive cut and simple cut for LPCC instances with m = 100
2.7 Comparison of average refinement time of disjunctive cut and simple cut for LPCC instances with m = 150
2.8 Comparison of average refinement time of disjunctive cut and simple cut for LPCC instances with m = 200
2.9 Comparison of average gap closed of bound cuts refinement and McCormick bound refinement
2.10 Comparison of average refinement time of bound cuts refinement and McCormick bound refinement for LPCC instances with m = 100
2.11 Comparison of average refinement time of bound cuts refinement and McCormick bound refinement for LPCC instances with m = 150
2.12 Comparison of average refinement time of bound cuts refinement and McCormick bound refinement for LPCC instances with m = 200
3.1 Branch-and-bound search scheme
3.2 Flow chart of pre-processing
3.3 Flow chart of branch-and-bound for the bounded case
3.4 Flow chart of branch-and-bound for the unbounded case
3.5 Flow chart of the branch-and-cut algorithm
3.6 Average feasibility recovery time
3.7 m = 100 performance profile for CPLEX and B&C
3.8 m = 150 performance profile for CPLEX and B&C

3.9 m = 200 performance profile for CPLEX and B&C
4.1 Plot of wx_j + b versus y_j with the 2ε-insensitive tube
4.2 Comparison of average preprocessing time under different numbers of partitions for solving synthetic data I under 2-fold cross-validation
4.3 Comparison of average branch-and-bound time under different numbers of partitions for solving synthetic data I under 2-fold cross-validation
4.4 Comparison of average total solving time under different numbers of partitions for solving synthetic data I under 2-fold cross-validation
4.5 Plot of percentage of partitions left vs. branching time for synthetic data I with Ω =
4.6 Plot of percentage of partitions left vs. branching time for synthetic data I with Ω =
4.7 Plot of percentage of partitions left vs. branching time for synthetic data I with Ω =
4.8 Plot of percentage of partitions left vs. branching time for synthetic data I with Ω =
4.9 Plot of percentage of partitions left vs. branching time for synthetic data I with Ω =
4.10 Test errors (Mean Average Deviation) of 10-d synthetic dataset II
4.11 Test errors (Mean Squared Error) of 10-d synthetic dataset II
4.12 Test errors (Mean Average Deviation) of 15-d synthetic dataset II
4.13 Test errors (Mean Squared Error) of 15-d synthetic dataset II
4.14 Test errors (Mean Average Deviation) of 25-d synthetic dataset II
4.15 Test errors (Mean Squared Error) of 25-d synthetic dataset II
4.16 Test errors (MAD) of QSAR dataset
4.17 Test errors (MSE) of QSAR dataset

ACKNOWLEDGMENT

My deepest gratitude goes to my advisor, Professor John Mitchell, for all his support and encouragement during my Ph.D. research. He is the one who introduced me to the wonderful world of optimization. This thesis wouldn't exist in this form without his constant guidance and invaluable insights. I am also indebted to Professor Jong-Shi Pang for valuable discussions on the research topic. I would like to thank the other members of my thesis committee: Professor Kristin Bennett, Professor Thomas Sharkey and Professor Xuegang Ban. Finally, I am deeply grateful to my parents, my wife and my lovely daughter Judy for their love and constant support. I dedicate this thesis to them. The road ahead will be long and our climb will be steep.

ABSTRACT

In this thesis, the primary problem of interest is the Linear Program with Linear Complementarity Constraints (LPCC), which minimizes a linear objective function over a set of linear constraints together with a set of linear complementarity constraints. The LPCC provides a unified framework for various optimization models, such as hierarchical optimization, inverse convex quadratic programs, indefinite quadratic programs, piecewise linear optimization and quantile minimization, as well as optimization problems with equilibrium constraints. However, because complementarity constraints are non-convex in nature, the LPCC is NP-hard, and finding its global resolution is a major challenge. Unlike traditional approaches, which accommodate the complementarity constraints by introducing additional binary variables together with a conceptually very large scalar θ and then solving the resulting mixed-integer program, we exploit the disjunctive structure of the LPCC directly. We discuss various methods to tighten the linear relaxation of the LPCC. Both linear constraints and second order cone constraints that are valid for the LPCC are studied, and computational results are included to compare the benefit of the various constraints on the value of the relaxation. We then propose a branch and cut algorithm to globally solve the LPCC, in which branching is imposed directly on the complementarity constraints. The algorithm is able to characterize infeasible and unbounded LPCC problems as well as solve problems with finite optimal value. We test our algorithm on randomly generated problems, and compare the results obtained with different cutting planes and branching strategies. Finally, a data mining application of the LPCC, the cross-validated support vector regression problem, is fully explored.
In this application, we present a bilevel programming formulation for a cross-validated support vector regression problem with (C, ɛ) as the design variables, and then convert it into an instance of an LPCC by writing out the KKT conditions of the inner quadratic program, which is strictly convex. By taking advantage of the properties of this LPCC formulation and its original bilevel formulation, we use our proposed algorithm with a customized preprocessing routine to attack this challenging problem. The computational results show that our approach has the capability to solve this type of problem with

small datasets.

CHAPTER 1
Introduction

A linear program with linear complementarity constraints (LPCC), which minimizes a linear objective function over a set of linear constraints with additional linear complementarity constraints, is a non-convex, disjunctive optimization problem. With its linear structure, the LPCC plays the same role in the domain of non-convex programming as a linear program does in the domain of convex programming. In section 1.1, we present the mathematical formulation of the general LPCC used throughout this thesis. In addition, we briefly discuss the relationship between linear complementarity constraints and the affine variational constraints that are used to define equilibrium constraints. In section 1.2, we present various applications of the LPCC in which complementarity occurs naturally in the algebraic and/or logical description of the model objectives and/or constraints. Such applications include hierarchical optimization, inverse convex quadratic programming, indefinite quadratic programming, piecewise linear optimization, and quantile minimization. We omit from that section another important application, cross-validated support vector regression, which is discussed in detail in Chapter 4. In the last section, 1.3, various existing algorithms designed for solving LPCCs are reviewed. Most of these existing methods are only able to obtain a stationary solution and are incapable of ascertaining the quality of that solution; this is their major drawback. In this thesis, we focus on finding the global resolution of the LPCC, and we achieve this goal in two steps:

Step 1. Study various valid constraints by exploiting the complementarity constraints directly, and evaluate the benefit of these constraints on the value of the linear relaxation of the LPCC.
Step 2. Propose a branch and cut algorithm to globally solve the LPCC, where the cuts are derived from the various valid constraints studied in Step 1 and branching is imposed on the complementarity constraints. A general LPCC solver has been developed based on this branch and cut approach, and it is able to compete with existing MIP-based solvers such as CPLEX.

1.1 Statement of the Problem

In this thesis, we consider a general formulation of the LPCC in the form suggested by Pang and Fukushima [52]. Given vectors and matrices c ∈ R^n, d ∈ R^m, b ∈ R^k, q ∈ R^m, A ∈ R^{k×n}, B ∈ R^{k×m}, N ∈ R^{m×n} and M ∈ R^{m×m}, the LPCC is to find (x, y) ∈ R^n × R^m in order to globally

    minimize_{(x,y)}  c^T x + d^T y
    subject to        Ax + By ≥ b
    and               0 ≤ y ⊥ w := q + Nx + My ≥ 0,        (1.1)

where a ⊥ b denotes perpendicularity between the vectors a and b, i.e., a^T b = 0. Thus, without the orthogonality condition y ⊥ q + Nx + My, the LPCC is a linear program (LP). The global resolution of the LPCC means the generation of a certificate showing that the problem is in one of its three possible states: (a) it is infeasible, (b) it is feasible but unbounded below, or (c) it attains a finite optimal solution. It is well known that if y and w are bounded, then there exists a diagonal matrix Θ with diagonal entries θ_i such that problem (1.1) can be formulated as a mixed integer program:

    minimize_{(x,y,z)}  c^T x + d^T y
    subject to          Ax + By ≥ b
                        0 ≤ y ≤ Θz
                        0 ≤ q + Nx + My ≤ Θ(1 − z)
    and                 z ∈ {0, 1}^m.        (1.2)

The obvious drawback of this formulation is that in order to find θ_i we need to compute upper bounds on y_i and w_i, and such upper bounds may not even exist. To avoid this drawback, we present in this thesis a branch and cut algorithm which branches on the complementarity constraints directly; the details are discussed in Chapter 3. It is also noted that problem (1.1) generalizes the standard linear complementarity problem (LCP): 0 ≤ y ⊥ q + My ≥ 0. Moreover, affine variational

constraints also lead to problem (1.1). In particular, consider the problem:

    minimize_{(x,y)}  c^T x + d^T y
    subject to        Ax + By ≥ b
    and               y ∈ K and (y′ − y)^T (q + Nx + My) ≥ 0 for all y′ ∈ K,        (1.3)

where K is a given polyhedron. An important case is when K is equal to R^m_+. In that case, the affine variational constraints are equivalent to the linear complementarity constraints 0 ≤ y ⊥ q + Nx + My ≥ 0. To illustrate the equivalence, we can rewrite the affine variational constraints as

    y^T (q + Nx + My) ≤ (y′)^T (q + Nx + My)   for all y′ ≥ 0.

Therefore we have

    y ∈ argmin_{y′} { (y′)^T (q + Nx + My) : y′ ≥ 0 }.

Letting λ ∈ R^m be the multipliers of the inequalities y′ ≥ 0 and applying the Karush-Kuhn-Tucker (KKT) conditions, y must satisfy:

    y ≥ 0,   λ ≥ 0,   q + Nx + My − λ = 0,   y^T λ = 0.

Eliminating λ from the above equations, problem (1.3) has the equivalent formulation (1.1). In turn, the LPCC provides a unified formulation for equilibrium problems with affine structures. We will discuss other applications of the LPCC in the following section.

1.2 Application of LPCC

In this section, we review various applications in which complementarity arises naturally in the algebraic and/or logical description of the model objective and/or constraints. The reader is referred to our recent paper [31] for more details. Among these applications, complementarity constraints mainly play two kinds of role in the modelling process:

1. Modelling KKT optimality conditions that must be satisfied by some of the variables. Such applications include hierarchical optimization, inverse convex quadratic programs and indefinite quadratic programs.

2. Modelling certain logical conditions required by some practical optimization problems. Such applications include non-convex piecewise linear optimization and quantile minimization.

1.2.1 Hierarchical Optimization

In a hierarchical optimization problem, a subset of the variables is constrained to be optimal solutions of several lower level problems parametrized by the remaining variables, and these lower level problems may have subproblems of their own. If the upper level problem is linear, and if the lower level problems are linear programs or convex quadratic programs, then, by replacing the lower level problems with their KKT optimality conditions, the hierarchical optimization problem can be reformulated as an LPCC. For simplicity, we restrict our attention to hierarchical problems with a single layer of subproblems. In particular, we consider the following problem, which arises in Stackelberg games [56]:

    minimize_{(x,y)}  c^T x + Σ_{i=1}^r h_i^T y_i
    subject to        Ax + Σ_{i=1}^r B_i y_i ≥ b
    and   y_i ∈ argmin_{y_i′}  (d_i + D_i x + Σ_{j≠i} E_{ij} y_j)^T y_i′ + ½ (y_i′)^T Q_i y_i′
                  subject to   C_i y_i′ ≥ g_i − F x − Σ_{j≠i} G_j y_j,        (1.4)

where x ∈ R^n, y_i, y_i′ ∈ R^{p_i}, b ∈ R^m, g_i ∈ R^{q_i}, each Q_i is symmetric and positive semidefinite, and c, d_i, h_i, A, B_i, C_i, D_i, E_{ij}, F, G_i and Q_i are all dimensioned appropriately. In this problem, the variables x can be viewed as the leader's decision and the variables y_i as the ith follower's decision; each follower optimizes its own subproblem parametrized by the leader's decision x. Notice that we make the assumption that if a subproblem has multiple optimal solutions, then the model is required to choose one that is best for the leader's problem.
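The KKT replacement idea behind this reformulation can be sanity-checked on a toy instance. The sketch below is an illustration, not from the thesis: it assumes a single follower with a one-dimensional strictly convex subproblem min_{y ≥ 0} (d + Dx)·y + ½·q·y², whose optimality conditions collapse to the single complementarity condition 0 ≤ y ⊥ d + Dx + q·y ≥ 0. The scalar data and function names are invented for this example.

```python
def follower_best_response(d, D, x, qcoef):
    """Lower level problem  min_{y >= 0} (d + D*x)*y + 0.5*qcoef*y**2  (qcoef > 0):
    closed form, by projecting the unconstrained minimizer onto y >= 0."""
    return max(0.0, -(d + D * x) / qcoef)

def satisfies_kkt(d, D, x, qcoef, y, tol=1e-9):
    """Check the KKT system that replaces the lower level problem in the LPCC
    reformulation:  0 <= y  _|_  d + D*x + qcoef*y >= 0."""
    w = d + D * x + qcoef * y        # gradient of the follower's objective at y
    return y >= -tol and w >= -tol and abs(y * w) <= tol
```

For any leader decision x, the closed-form best response and the complementarity system agree, which is exactly what licenses swapping a convex subproblem for its KKT conditions.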
Since each Q_i is a positive semidefinite matrix, each subproblem is convex, and we can replace the subproblems with their KKT optimality conditions.

Hence, (1.4) can be reformulated as the following LPCC:

    minimize_{(x,y)}  c^T x + Σ_{i=1}^r h_i^T y_i
    subject to        Ax + Σ_{i=1}^r B_i y_i ≥ b
                      C_i y_i − g_i + F x + Σ_{j≠i} G_j y_j − w_i = 0
                      d_i + D_i x + Σ_{j≠i} E_{ij} y_j + Q_i y_i − C_i^T λ_i = 0
    and               0 ≤ w_i ⊥ λ_i ≥ 0,   for i = 1, ..., r,        (1.5)

where w_i, λ_i ∈ R^{q_i}. If the lower level problems do not have any subproblems of their own, then this type of hierarchical optimization problem is called a bilevel optimization problem. The next subsection gives an example of the LPCC formulation of a bilevel optimization problem.

1.2.2 Inverse Convex Quadratic Programming

Before discussing inverse convex quadratic programming, we first introduce the concepts of forward optimization and inverse optimization: a forward optimization problem determines an optimal solution given that the input parameters of the model are already known, while an inverse optimization problem identifies the parameters of the model from historical data and/or empirical observations. Inverse problems play an important role in the process of model selection. Inverse convex quadratic programming pertains to the inversion of the input parameters of a convex quadratic program so that the measure of deviation between the constructed parameters and the target parameters (historical data/empirical observations) is minimized; when the measure of deviation is a linear function, we obtain an LPCC. To illustrate, consider a standard convex quadratic program:

    minimize_{x ∈ R^n}  c^T x + ½ x^T Qx
    subject to          Ax ≤ b,        (1.6)

where Q is a symmetric positive semidefinite matrix, and c, Q, A, b are all dimensioned appropriately. The forward problem is to find the optimal solution of the

above QP given (Q, A, b, c). The inverse problem is as follows. Suppose we have historical data or empirical observations denoted by (x̄, b̄, c̄) and a pair of matrices (Q, A) that identifies the forward optimization model. We want to construct a pair (b, c) from a feasible polyhedron F and an optimal solution x of the forward QP parametrized by (b, c) so that the measure of the deviation between (x, b, c) and (x̄, b̄, c̄) is minimized. Using the ℓ_p norm to measure the deviation, the mathematical formulation of the inverse convex quadratic program is given by:

    minimize_{(x,b,c)}  ‖(x, b, c) − (x̄, b̄, c̄)‖_p
    subject to          (b, c) ∈ F
    and                 x ∈ argmin_{x′}  c^T x′ + ½ (x′)^T Qx′
                            subject to   Ax′ ≤ b,        (1.7)

where F is a polyhedron. Writing out the KKT conditions of the inner-level QP in (1.7), problem (1.7) is equivalent to:

    minimize_{(x,b,c)}  ‖(x, b, c) − (x̄, b̄, c̄)‖_p
    subject to          (b, c) ∈ F
                        c + Qx + A^T λ = 0
    and                 0 ≤ b − Ax ⊥ λ ≥ 0.        (1.8)

If we use the ℓ_1 or ℓ_∞ norm to measure the deviation, the above problem is an LPCC.

1.2.3 Indefinite Quadratic Programming

In this subsection, we present a very important application of the LPCC to solving indefinite quadratic programs, which is studied in the recent paper [29]. Consider the QP (1.6) where the matrix Q is symmetric and indefinite. It is well known that, under the assumption that the QP is feasible and has a finite optimal value, the QP (1.6) is equivalent to the LPCC of minimizing a certain linear objective function (involving the constraint multipliers) over the set of KKT conditions of the QP [24]. However, this equivalence does not hold for an unbounded QP. In [29], the author proposed an LPCC approach to certify the boundedness of an indefinite QP assuming only its feasibility. In what follows, we briefly describe this boundedness certification LPCC formulation for the QP (1.6). Without loss of generality, we assume that the recession

cone D := {d ∈ R^n : Ad ≤ 0} of the feasible set is contained in the nonnegative orthant R^n_+. It is then shown in [29] that the QP (1.6) is unbounded below if and only if the LPCC below has a feasible solution with a negative objective value:

    minimize_{(x,d,ξ,λ,μ,t,s) ∈ R^{2n+3m+2}}  t
    subject to  0 = c + Qx + A^T ξ + t 1_n
                0 = Qd + A^T λ − A^T μ + s 1_n
                0 ≤ ξ ⊥ b − Ax ≥ 0
                0 ≤ μ ⊥ b − Ax ≥ 0
                0 ≤ λ ⊥ −Ad ≥ 0
                0 ≤ ξ ⊥ −Ad ≥ 0
                0 ≤ μ ⊥ −Ad ≥ 0
    and         0 ≤ s,   1_n^T d ≥ 1.        (1.9)

If Q is copositive on D, then the QP (1.6) is unbounded below if and only if the following somewhat simplified LPCC:

    minimize_{(x,d,ξ,λ,t) ∈ R^{2(n+m)+1}}  t
    subject to  0 = c + Qx + A^T ξ + t 1_n
                0 = Qd + A^T λ
                0 ≤ ξ ⊥ b − Ax ≥ 0
                0 ≤ λ ⊥ −Ad ≥ 0
                0 ≤ ξ ⊥ −Ad ≥ 0
    and         1_n^T d ≥ 1        (1.10)

has a feasible solution with a negative objective value. If it is confirmed that the QP attains a finite optimal value, then another LPCC, an equivalent reformulation of the QP based on its KKT conditions, can be solved to obtain the optimal solution.

1.2.4 Piecewise Linear Optimization

Piecewise linear functions are widely used to approximate non-linear functions, and in many practical problems a piecewise linear function appears as a cost function in the model's objective. The common way of modelling piecewise linear functions is to formulate them as mixed integer

programs. Such an approach has been studied in the Ph.D. thesis [38] and the subsequent references [39, 40, 60, 61]. In this subsection, we alternatively use linear complementarity constraints to describe the logical relations of a piecewise linear function. Consider a piecewise linear function f(x) given as:

    f(x) = α_1 + β_1 x              if −∞ < x ≤ γ_1
           α_2 + β_2 x              if γ_1 ≤ x ≤ γ_2
           ...
           α_p + β_p x              if γ_{p−1} ≤ x ≤ γ_p
           α_{p+1} + β_{p+1} x      if γ_p ≤ x < ∞,

for some constants α_j, β_j, and γ_j with γ_1 < γ_2 < ... < γ_p and α_j + β_j γ_j = α_{j+1} + β_{j+1} γ_j for all j = 1, ..., p (so that f is continuous). If we introduce auxiliary variables y_j denoting the portion of x in the interval [γ_{j−1}, γ_j], where γ_0 = −∞ and γ_{p+1} = ∞, then we can write f(x) as

    f(x) = α_1 + Σ_{j=1}^{p+1} β_j y_j   and   x = Σ_{j=1}^{p+1} y_j,        (1.11)

    0 ≤ γ̄_j − y_j ⊥ y_{j+1} ≥ 0,   j = 1, ..., p,        (1.12)

where

    γ̄_j = γ_1               if j = 1
          γ_j − γ_{j−1}      if j = 2, ..., p.

The complementarity constraints (1.12) ensure the logical relation among the intervals of the piecewise linear function, and they are necessary if f(x) is not convex. By using the complementarity representation (1.11) and (1.12) of a piecewise linear function, a piecewise linear objective function or piecewise linear constraints can be modelled with linear complementarity constraints.

1.2.5 Quantile Minimization

Quantiles are quantities taken at intervals from the cumulative distribution function (CDF) of a random variable. In chance constrained programming [54], quantile minimization can be formulated as

    minimize_{(α,x)}  { α : P_ξ[α ≥ f(x, ξ)] ≥ 1 − γ, x ∈ P },        (1.13)

where 0 < γ < 1, ξ is a random parameter and P is a polyhedron. Assume the uncertainty can be represented by m scenarios, and that f(x, ξ) is a linear function of x in each of these scenarios. Let p_i be the probability of the ith scenario, and for scenario ξ_i let f(x, ξ_i) = b_i − a_i^T x. Problem (1.13) can be formulated as the following LPCC:

    minimize_{(α,β,x,s)}  α
    subject to  α + β_i ≥ b_i − a_i^T x,   i = 1, ..., m
                0 ≤ β ⊥ s ≥ 0
                p^T s ≥ 1 − γ
                0 ≤ s ≤ 1,   x ∈ P.        (1.14)

To illustrate the equivalence, we first show that any feasible solution to the above LPCC is feasible in (1.13). If s_i > 0 then, by the complementarity constraint β ⊥ s, we have β_i = 0, which in turn implies that α is at least the function value at x for the ith scenario. Therefore any feasible solution to the above formulation satisfies

    P_ξ[α ≥ f(x, ξ)] = Σ_{i : α ≥ b_i − a_i^T x} p_i ≥ Σ_{i : β_i = 0} p_i ≥ Σ_{i : s_i > 0} p_i ≥ Σ_i p_i s_i = p^T s ≥ 1 − γ,

so it is feasible in (1.13). It can be shown similarly that we are able to construct a feasible solution to (1.14) given a feasible solution to (1.13). Quantile minimization has several important applications in risk analysis and portfolio management. One important application, studied in the recent paper [53], is value-at-risk (VaR) minimization. In that paper, the author formulated the global minimization of VaR as an LPCC and solved it by mixed integer programming.

1.3 Previous Work for Solving LPCCs

In this section, we give a brief review of previous work on solving LPCCs. First, it is worth mentioning the origin of the LPCC: the linear complementarity

problem (LCP), which is to find vectors w and z such that

    w − Mz = q,   w ≥ 0,   z ≥ 0,   w^T z = 0,        (1.15)

where q ∈ R^m and M ∈ R^{m×m}. Such a problem arises when we express the KKT optimality conditions of a linear or quadratic program. In 1968, Lemke [44] first proposed a complementary pivoting algorithm for solving the linear complementarity problem; under the assumption that M is copositive, the algorithm converges to a complementary basic feasible solution in a finite number of steps. After that, Todd [57] presented a general pivoting system which provides a natural setting for the study of complementary pivoting algorithms. Later, Al-Khayyal [4] developed a branch and bound algorithm for solving the general linear complementarity problem. It is noted that methods able to solve the general LCP can also be extended to solve the LPCC by using the so-called sequential LCP method; the procedure is described in detail in reference [36]. Since the general LCP has been shown to be NP-complete (Garey and Johnson [23]), the LPCC, as a generalization, is NP-hard. Because of this intractability, research on algorithms for solving LPCCs has followed two main directions: one concerns the development of globally convergent algorithms guaranteed to find a suitable stationary point; the other concerns the development of exact algorithms for obtaining the global resolution of the LPCC. Among the various methods for obtaining a stationary point of the LPCC, complementary pivoting algorithms for the LPCC extend the pivoting algorithms for the LCP, handling the linear complementarity constraints much as the classic simplex algorithm handles linear programs. Such algorithms typically start from a feasible solution, maintain feasibility at every iteration and try to improve the objective function; under certain constraint qualifications, they guarantee convergence to a certain stationary solution.
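For intuition about the combinatorial structure these methods exploit, a tiny LCP (1.15) can be solved by brute force: enumerate which components of z are allowed to be positive (forcing the complementary components of w to zero) and solve the resulting linear system. This is an illustration only, an exponential-time stand-in for Lemke-style pivoting that is practical just for very small m; the function name and instance are invented for this sketch.

```python
import itertools
import numpy as np

def solve_lcp_enum(M, q):
    """Solve the LCP  w = q + Mz, w >= 0, z >= 0, w^T z = 0  by enumerating
    complementary supports: for each subset S, force w_S = 0 and z_i = 0 (i not in S)."""
    m = len(q)
    for mask in itertools.product([0, 1], repeat=m):
        basic = [i for i in range(m) if mask[i]]
        z = np.zeros(m)
        if basic:
            try:
                # w_S = q_S + M_SS z_S = 0  determines the positive part of z
                z_b = np.linalg.solve(M[np.ix_(basic, basic)], -q[basic])
            except np.linalg.LinAlgError:
                continue                      # singular block: no basic solution here
            if np.any(z_b < -1e-9):
                continue                      # violates z >= 0
            z[basic] = z_b
        w = q + M @ z
        if np.all(w >= -1e-9):
            return z, w                       # complementary feasible point found
    return None                               # LCP has no solution
```

By construction w^T z = 0 holds in every candidate, so the first candidate with w ≥ 0 solves (1.15).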
The references [15], [19], [22] study and implement this type of method to solve the general LPCC. Another way to obtain a stationary point is through a so-called regularization framework [55]: construct a sequence of relaxed problems controlled by a parameter, and then obtain

a sequence of solutions which converges to a stationary point as the parameter goes to its limit. Each regularized relaxed problem is solved by an NLP-based algorithm such as an interior point method. One form of regularization introduces a positive parameter φ and relaxes the complementarity constraints in problem (1.1) to {y, w ≥ 0, y^T w ≤ φ}. An alternative is to put a penalty on the violation of the complementarity constraints into the objective, and gradually drive the penalty to infinity [45]. The obvious drawback of these methods is that they are incapable of ascertaining the quality of the computed solution. The methods for obtaining the global resolution of the LPCC are mainly based on enumerative schemes. Several methods based on the branch-and-bound scheme have been proposed for solving LPCCs derived from bilevel linear programs. J.F. Bard and J.T. Moore [9] proposed a pure branch and bound method for solving bilevel linear programs. P. Hansen et al. [27] enhanced this branch-and-bound scheme by exploiting the necessary optimality conditions of the inner problem. As opposed to branch-and-bound, the references [32] and [34] study an alternative way to solve the LPCC using cutting plane methods. However, a cutting plane method as a stand-alone method for solving the LPCC is usually not useful in practice; branch-and-cut is a more acceptable scheme. Recently, C. Audet et al. [5] proposed a branch-and-cut algorithm for solving bilevel linear programs, where the so-called simple cut used in that algorithm is actually a special form of disjunctive cut [6]. We will discuss the relationship between them in section 2.2.2. It is noted that most existing methods for obtaining the global resolution of the LPCC presume that the LPCC has a finite optimal value; this limitation was not resolved until the recent paper [30]. In that paper, the author proposed a minimax integer program formulation of the LPCC, and solved this system using a Benders decomposition method.
However, the success of this method heavily depends on the so-called sparsification process. If the sparsification process is not successful, the algorithm essentially reduces to an enumerative process that solves a satisfiability IP to choose the next enumerative candidate, and may end up checking every piece of the LPCC. In this thesis, we instead use a branch-and-cut scheme, a more systematic enumerative process, to obtain the global resolution of the LPCC; our algorithm is also able to characterize infeasible and unbounded LPCC problems as well as solve problems with finite optimal value. Moreover, we also focus on the study of various valid constraints for the LPCC obtained by exploiting the complementarity structure, a topic that

has not been fully exploited in the literature on the LPCC.

1.4 Thesis Outline

We conclude this chapter by presenting the outline of this thesis. In Chapter 2, various valid constraints are studied to tighten the linear relaxation of the LPCC; we group these constraints into two classes: linear constraints (i.e., cutting planes) and second order cone constraints (i.e., cutting surfaces). In section 2.2.1, we discuss using bounds on the variables y and w to generate valid linear constraints for the LPCC. Such linear constraints are very useful when the bounds are available. By imposing one complementarity constraint, we are able to generate a disjunctive cut, obtained by solving a cut generation LP (CGLP), that cuts off a point violating that complementarity constraint. The details of this disjunctive programming approach are discussed in section 2.2.2. Special cuts exploiting the inner objective function of a bilevel linear program are addressed in section 2.2.3. In section 2.3, second order cone constraints are introduced as convex relaxations of the valid non-convex quadratic constraint y^T w ≤ 0 for the LPCC. Computational results showing the effect of these valid constraints on the value of the relaxation of the LPCC are presented in section 2.4. In Chapter 3, we propose a branch and cut algorithm to globally solve the LPCC, where the cutting planes are based on the results of Chapter 2 and branching is imposed directly on the complementarity constraints. The general scheme of the algorithm is separated into two cases: the bounded case and the unbounded case. The algorithm is able to characterize infeasible and unbounded LPCC problems as well as solve problems with finite optimal value. Cut management and branching strategies are also discussed in that chapter, and a new feasible solution recovery procedure is presented to provide a good start for the algorithm. Computational results on randomly generated problems are presented in section 3.3.
In Chapter 4, we study another important bilevel optimization application of the LPCC, cross-validated support vector regression, which was omitted from Chapter 1. Based on the structure of the SVR LPCC formulation, special cuts from the inner convex quadratic program and second order cone constraints are studied. We use the algorithm proposed in Chapter 3, with a customized preprocessing routine, to solve cross-validated support vector regression instances. Although the

results are not as good as we expected, the computational experience has shown the potential of our approach on small problems. Moreover, based on the small problems that we can solve, we show that some other methods, such as FILTER and SLAMS, are often close to optimality on these small problems and are probably good choices for larger problems. Finally, in Chapter 5, we conclude the thesis and discuss directions for future research.

CHAPTER 2
Tighten the Relaxation of LPCC

In this chapter, we present various valid constraints for the LPCC which can be used to tighten its weak linear programming relaxation. Two classes of valid constraints are addressed: linear constraints (i.e., cutting planes) and second order cone constraints (i.e., cutting surfaces). For the class of linear constraints, three types are presented. Linear constraints based on bounds on the complementary variables y and w are shown in Section 2.2.1. Disjunctive cuts derived by imposing one pair of violated complementarity constraints are discussed in Section 2.2.2. A third type of linear constraint, specially designed for bilevel linear programs and constructed by exploiting the inner objective function, is presented in Section 2.2.3. Second order cone constraints for the general LPCC are addressed in Section 2.3. Computational results evaluating the benefit of the various constraints are presented in Section 2.4. The reader is referred to our recent paper [48] for more details on this chapter.

2.1 Preliminaries

In this preliminary section, we first discuss the motivation of this chapter and then give a brief introduction to cutting plane and cutting surface methods. Familiarity with polyhedral theory and mixed integer programming is assumed in the following discussion; the uninitiated reader is referred to Nemhauser and Wolsey [50] (also see Wolsey [49]).

2.1.1 Preliminary Discussion and Motivation

Before we start, we first recall the general LPCC formulation mentioned in Chapter 1:

    minimize_{(x,y)}  c^T x + d^T y
    subject to        Ax + By ≥ b
    and               0 ≤ y ⊥ w := q + Nx + My ≥ 0

*Portions of this chapter previously appeared as: J. E. Mitchell, J. S. Pang, and B. Yu. Obtaining tighter relaxations of mathematical programs with complementarity constraints. Technical report, Department of Mathematical Sciences, Rensselaer Polytechnic Institute, February

where c ∈ R^n, d ∈ R^m, b ∈ R^k, q ∈ R^m, A ∈ R^{k×n}, B ∈ R^{k×m}, N ∈ R^{m×n} and M ∈ R^{m×m}. Let F be the feasible region of this LPCC, i.e.,

    F := { (x; y) ∈ R^{n+m} : Ax + By ≥ b, 0 ≤ y ⊥ q + Nx + My ≥ 0 }

and let Q denote the closed convex hull of F. The LPCC could then be solved by minimizing the objective over this convex hull Q as a linear program. However, an exact inequality description of Q is generally unavailable, so the best we can do is to obtain a convex outer approximation of Q. The most direct way to get a convex outer approximation is to drop the complementarity condition from problem (1.1), giving

    minimize_{(x,y)}  c^T x + d^T y
    subject to        Ax + By ≥ b
                      y ≥ 0
                      q + Nx + My ≥ 0        (2.1)

This is the so-called linear programming relaxation of the LPCC. We let ˆQ denote the feasible region of this relaxation:

    ˆQ := { (x; y) ∈ R^{n+m} : Ax + By ≥ b, q + Nx + My ≥ 0, y ≥ 0 }

In practice, however, ˆQ is usually a very poor approximation to Q. This motivates us to seek a tighter approximation to the convex hull Q than ˆQ. The goal of this chapter is to use cutting plane and cutting surface methods to sharpen ˆQ and obtain a better approximation to the convex hull of the feasible region of the LPCC. The following subsection gives a brief review of cutting plane and cutting surface methods.

2.1.2 Cutting Plane and Cutting Surface

Cutting plane methods are widely used to solve combinatorial optimization problems, especially when combined with a branch-and-bound scheme. The basic
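The relaxation (2.1) is an ordinary linear program, so any LP solver can produce the lower bound it provides. As a minimal sketch, using `scipy.optimize.linprog` on a tiny made-up instance (not one of the thesis test problems):

```python
import numpy as np
from scipy.optimize import linprog

# LP relaxation (2.1) of a tiny hypothetical LPCC instance:
#   min  c^T x + d^T y
#   s.t. Ax + By >= b,  y >= 0,  q + Nx + My >= 0
# (the complementarity between y and w is simply dropped)
c = np.array([0.0]); d = np.array([-1.0, -1.0])
A = np.array([[0.0]]); B = np.array([[-1.0, -1.0]]); b = np.array([-2.0])
q = np.array([1.0, 0.0])
N = np.array([[-1.0], [1.0]]); M = np.eye(2)

# stack the ">=" rows and negate them for linprog's "<=" convention
G = np.vstack([np.hstack([A, B]), np.hstack([N, M])])
h = np.concatenate([b, -q])
res = linprog(np.concatenate([c, d]), A_ub=-G, b_ub=-h,
              bounds=[(0.0, 1.0), (0, None), (0, None)])
print(res.fun)   # lower bound on the LPCC optimal value: -2.0
```

On this instance the relaxation value is -2.0; since the complementarity condition is ignored, this is only a (possibly weak) lower bound on the LPCC optimal value.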

scheme of a cutting plane algorithm works by solving a sequence of linear programming relaxations. If the relaxation solution is not feasible for the original problem, a cutting plane can be generated to separate it from the convex hull of the feasible solutions of the original problem. As more cutting planes are added to the linear programming relaxation, the relaxations gradually approach the optimal solution of the original problem. The procedure for generating a cutting plane is usually called the separation problem. Gomory cuts [25], which are derived from a row of the optimal tableau of the linear programming relaxation, are the most famous general cutting planes used to solve integer programming problems. This type of cut was generalized by Chvátal [17] as Chvátal-Gomory cuts, which can be derived from any non-negative combination of the linear inequalities of the relaxed linear program. The theory of disjunctive programming [6] gives an alternative way to generate general cutting planes. The key step of the disjunctive approach is to construct the convex hull of two or more disjunctive sets defined by several disjunctive constraints, project this convex hull back to the original space, and try to cut off the relaxation solution that violates these disjunctive constraints. Generating a disjunctive cut may require solving a larger linear program, so in practice we need to weigh the cost of generating the cuts against their effect. In Section 2.2.2, we discuss in detail how to derive disjunctive cuts from complementarity constraints. Cutting planes can also be generated by investigating the structure of the problem, and such cuts are usually stronger than general cutting planes. Jünger et al. [37] contains a survey of problem-specific cutting planes. In Section 2.2.3, we study such problem-specific cutting planes for bilevel optimization problems.
In practice it is often better to use several different families of cutting planes together, since the families interact with each other and the overall result is better than using each family individually. The cutting surface method is very similar to the cutting plane method, except that the added constraints are no longer linear. In this thesis, we mainly use second order cone constraints as cutting surfaces. A second order cone constraint is a convex constraint of the form

    ||Ax + b||_2 ≤ c^T x + d        (2.2)

where ||·||_2 is the standard Euclidean norm, and A, b, c, d are all dimensioned appropriately. Any convex quadratic constraint is equivalent to a second order cone

constraint. In Section 2.3, we consider convex quadratic relaxations of the non-convex quadratic constraint y^T w ≤ 0 that comes from the complementarity condition 0 ≤ y ⊥ w := q + Nx + My ≥ 0.

2.2 Valid Linear Constraints for LPCC

Three types of linear constraints that are valid for the LPCC are discussed in this section. The first two, cuts from bounds on the complementary variables and disjunctive cuts, are general cuts for the LPCC, while the last is a cut specially designed for bilevel optimization problems.

2.2.1 Linear Constraints Based on Bounds on the Variables

As mentioned in Chapter 1, if bounds on the complementary variables y_i and w_i are available, then we can formulate the LPCC as a mixed integer program. In addition, bounds on y_i and w_i also imply valid linear constraints for the LPCC. Assume upper bounds on y_i and w_i exist, and let y_i^u and w_i^u denote these upper bounds. Then obviously we have

    (y_i − y_i^u)(w_i − w_i^u) ≥ 0

Expanding this inequality gives

    y_i w_i − y_i^u w_i − w_i^u y_i + y_i^u w_i^u ≥ 0

Because of the complementarity restriction on y_i and w_i, namely y_i w_i = 0, we can cancel the y_i w_i term, which yields a valid linear constraint for the LPCC:

    y_i^u w_i + w_i^u y_i ≤ y_i^u w_i^u        (2.3)

Note that the tighter the bounds on y_i and w_i, the stronger the cut (2.3). In what follows, we discuss ways to obtain tighter bounds on y_i and w_i. A loose upper bound on y_i can be obtained by solving a
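The algebra above is easy to sanity-check numerically. A minimal sketch with hypothetical bounds y_i^u = 4, w_i^u = 5 and a few complementary points:

```python
# Numeric check of the bound cut (2.3): for complementary points
# (y*w = 0) within hypothetical bounds yu, wu, the linear cut
# yu*w + wu*y <= yu*wu must hold.
yu, wu = 4.0, 5.0
for y, w in [(0.0, 5.0), (4.0, 0.0), (0.0, 0.0), (1.5, 0.0)]:
    assert y * w == 0.0 and 0.0 <= y <= yu and 0.0 <= w <= wu
    assert yu * w + wu * y <= yu * wu + 1e-12
print("cut (2.3) holds at all sampled complementary points")
```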

linear program

    maximize_{(x,y)}  y_i
    subject to        Ax + By ≥ b
                      y ≥ 0
                      q + Nx + My ≥ 0        (2.4)

and an upper bound on w_i can be computed in the same way. To tighten these upper bounds, it is useful to find a good feasible solution of the LPCC, which provides a valid upper bound on the optimal value of the LPCC. Such a feasible solution can be found either by a heuristic method or by a nonlinear programming solver such as KNITRO [12]. In Chapter 3, we also propose an optimization-based approach to find an initial feasible solution of an LPCC. Assuming such an upper bound is available and denoting it by v_UB, the constraint

    c^T x + d^T y ≤ v_UB        (2.5)

is valid for any optimal solution of (1.1). Therefore we can add constraint (2.5) to problem (2.4) to tighten these loose upper bounds. We can tighten the upper bounds further by exploiting the complementarity constraints. First, because y_i ⊥ w_i, we can add the constraint w_i = 0 to problem (2.4). Together with the objective upper bound constraint, the linear program for computing an upper bound on y_i becomes

    maximize_{(x,y)}  y_i
    subject to        Ax + By ≥ b
                      y ≥ 0
                      w := q + Nx + My ≥ 0
                      c^T x + d^T y ≤ v_UB
                      w_i = 0        (2.6)

Further, we can select an index set J with p components and construct 2^p linear programs of the form (2.6) with the additional constraints that for each j ∈ J either y_j = 0 or w_j = 0. Then y_i^u is the largest of the optimal values of these 2^p linear programs. A similar procedure can be applied to tighten the upper bound on w_i. We can also refine the bounds by applying Procedure 1 iteratively, since the tighter the bounds on y and w, the stronger the constraint (2.3).

input : parameter p and the optimal solution of linear program (2.6), denoted (ˆx, ŷ, ŵ)
output: tightened upper bound on y_i, denoted y_i^u
Initialization: set y_i^u = −Inf; let s_j = ŷ_j ŵ_j for j = 1, ..., m; let J denote the indices of the largest p components of s; for a subset I of J, define
    C_I := { (x; y) ∈ R^{n+m} : y_i = 0 for i ∈ I; q_i + N_i x + M_i y = 0 for i ∈ J \ I }
foreach subset I of J do
    solve linear program (2.6) with the additional constraints C_I, and let ub denote its optimal objective value (ub = −Inf if infeasible);
    y_i^u = max(y_i^u, ub);
end
Procedure 1: Bound Tightening on y_i

input : initial bounds on y and w, total number of refinements N
output: refined bounds on y and w
Initialization: use the initial bounds on y and w to construct the bound cuts (2.3) and add them to linear program (2.6);
for i = 1 to N do
    for j = 1 to m do
        apply Procedure 1 to tighten the upper bounds on y_j and w_j;
        if either upper bound on y_j or w_j improved then
            update the coefficients of the constraints (2.3) in linear program (2.6) with the new upper bounds on y_j or w_j;
        end
    end
    if no bound on y_k or w_k improved for any k = 1, ..., m then
        break;
    end
end
Procedure 2: Bound Refinement

Note that it is better to set p ≤ 3, since the number of LPs grows exponentially with p. Another computational consideration is to limit the number of bounds calculated and the number of bound refinements applied. One suggestion is to calculate bounds only for variables whose complementarity constraint is violated, and to tighten bounds only for constraints (2.3) that are tight at the solution of the LP relaxation.
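The branch enumeration inside Procedure 1 can be sketched as follows. The instance below (q, M, a single linear constraint, and no x variables) is made up for illustration, the objective-bound constraint (2.5) is omitted for brevity, and `scipy.optimize.linprog` stands in for the LP solver:

```python
from itertools import chain, combinations
import numpy as np
from scipy.optimize import linprog

# Hypothetical tiny instance (no x variables): w = q + M y, with
# feasible set  y1 + y2 <= 10,  y >= 0,  w >= 0.
q = np.array([-2.0, 3.0])
M = np.eye(2)
A_ub = np.vstack([np.array([[1.0, 1.0]]),   # y1 + y2 <= 10
                  -M])                      # -M y <= q  (i.e. w >= 0)
b_ub = np.concatenate([[10.0], q])

def tighten_bound(i, J):
    """Procedure 1 sketch: max of y_i over the 2^p branches of LP (2.6)."""
    best = -np.inf
    subsets = chain.from_iterable(combinations(J, r) for r in range(len(J) + 1))
    for I in subsets:
        A_eq, b_eq = [], []
        for j in J:
            if j in I:                      # branch y_j = 0
                row = np.zeros(2); row[j] = 1.0
                A_eq.append(row); b_eq.append(0.0)
            else:                           # branch w_j = 0, i.e. M_j y = -q_j
                A_eq.append(M[j]); b_eq.append(-q[j])
        c = np.zeros(2); c[i] = -1.0        # maximize y_i
        res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                      A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                      bounds=[(0, None)] * 2)
        if res.status == 0:                 # infeasible branches contribute -inf
            best = max(best, -res.fun)
    return best

print(tighten_bound(0, [0, 1]))   # refined bound on y_1
```

On this instance the plain relaxation bounds y_1 only by 10, while enumerating the four branches shows that every complementary solution has y_1 = 2, illustrating how much the branch enumeration can tighten a bound.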

2.2.2 Linear Constraints Based on Disjunctive Programming

In the previous section, we discussed constructing cuts from bounds on the complementary variables. However, that approach is applicable only when such bounds exist. In this section, we present a disjunctive programming approach that derives cuts from the complementarity constraints even when bounds on the complementary variables do not exist.

2.2.2.1 Introduction to Disjunctive Programming

In this subsection, we give an introduction to disjunctive programming. From the perspective of disjunctive programming, we consider the problem

    minimize_x  c^T x
    subject to  Ax ≥ b
                ∨_{i∈I} (D^i x ≥ d^i)        (2.7)

where I is an index set, and c, A, b, D^i, d^i are all dimensioned appropriately. The disjunctive constraint requires that x satisfy at least one of the systems D^i x ≥ d^i, i ∈ I. Let P denote the feasible set of problem (2.7), and let P_Q denote the closed convex hull of P. The reverse polar cone P_Q^* of P_Q, i.e., the cone of all valid inequalities for P_Q, can be described by the following theorem.

Theorem 1 ([6]).

    P_Q^* := { (α; β) ∈ R^{n+1} : α^T x ≥ β for all x ∈ P_Q }
           = { (α; β) ∈ R^{n+1} : for every i ∈ I there exist u^i, v^i ≥ 0 with
               α = A^T u^i + (D^i)^T v^i  and  β ≤ b^T u^i + (d^i)^T v^i }

Theorem 1 gives a way to obtain valid inequalities for P_Q. It leads immediately to a separation problem: decide whether a point ˆx ∈ P_Q, or find a valid inequality α^T x ≥ β for P_Q that is violated by ˆx.

Theorem 2 ([6]). ˆx ∈ P_Q if and only if the optimal value of the following Cut-

Generation Linear Program (CGLP) is non-negative:

    minimize_{(α,β,u,v)}  α^T ˆx − β
    subject to  α = A^T u^i + (D^i)^T v^i,     i ∈ I
                β ≤ b^T u^i + (d^i)^T v^i,     i ∈ I
                u^i, v^i ≥ 0,                  i ∈ I
                Σ_{i∈I} 1^T (u^i + v^i) = 1

The last constraint of the CGLP is a normalization constraint, which truncates the cone into a polyhedron. Note that the choice of normalization affects the strength and numerical stability of the resulting cuts [8]. In this thesis, for simplicity, we use this equal-weight normalization. Another observation from Theorem 2 is that it is not necessary to compute an optimal solution of the CGLP: any feasible solution with negative objective value suffices to construct a cut. In other words, if (α, β, u, v) is a feasible solution of the CGLP with negative objective value, then α^T x ≥ β is a valid inequality for P_Q that cuts off ˆx.

2.2.2.2 Disjunctive Cuts Based on Complementarity Constraints

In order to use the machinery of disjunctive programming to generate cuts for the LPCC, we need to convert the LPCC into disjunctive programming format. We can replace the complementarity condition 0 ≤ y ⊥ q + Nx + My ≥ 0 with the disjunctive constraints y_i ≤ 0 ∨ q_i + N_i x + M_i y ≤ 0, i = 1, ..., m, together with the non-negativity constraints. Thus the LPCC (1.1) is equivalent to

    minimize_{(x,y)}  c^T x + d^T y
    subject to        Ax + By ≥ b
                      y ≥ 0
                      q + Nx + My ≥ 0
    and               y_i ≤ 0 ∨ q_i + N_i x + M_i y ≤ 0,   i = 1, ..., m        (2.8)

Here N_i and M_i denote the ith rows of the matrices N and M respectively. Problem (2.8) is in so-called conjunctive normal form (a conjunction whose terms do not contain further conjunctions). Next we present the so-called

disjunctive normal form (a disjunction whose terms do not contain further disjunctions). From this form, we are able to construct disjunctive cuts. To present the disjunctive normal form, we first need some notation. Let S denote a subset of {1, ..., m}, and define

    D_S := { (x; y) ∈ R^{n+m} : y_i ≤ 0 for i ∈ S; q_i + N_i x + M_i y ≤ 0 for i ∉ S }

There are in total 2^m different subsets of {1, ..., m}; let {S_1, ..., S_{2^m}} denote them. The disjunctive normal form of problem (1.1) can then be written as

    minimize_{(x,y)}  c^T x + d^T y
    subject to        Ax + By ≥ b
                      y ≥ 0
                      q + Nx + My ≥ 0
    and               ∨_{i∈I} D_{S_i}        (2.9)

where I = {1, ..., 2^m}. Once we have the disjunctive normal form of the LPCC, we can apply Theorem 1 to obtain a representation of the valid inequalities for the convex hull Q. However, the set I is exponential in the number of complementarity constraints, so this representation becomes unmanageably large. If we impose only one or two complementarity constraints at a time, the representation remains useful for constructing valid inequalities. We first consider imposing one complementarity constraint. For i = 1, ..., m, denote

    F_i := { (x; y) ∈ R^{n+m} : Ax + By ≥ b, y ≥ 0, q + Nx + My ≥ 0,
             y_i ≤ 0 ∨ q_i + N_i x + M_i y ≤ 0 }

and let Q_i denote the closure of the convex hull of F_i. Obviously we have

    Q = cl conv{ ∩_{i∈N} F_i } ⊆ cl conv{F_i} = Q_i

where N = {1, ..., m}. This relationship indicates that any valid inequality for Q_i is also valid for Q. F_i is the disjunctive normal form of problem (2.1) with the ith complementarity constraint imposed. Applying Theorem 1 directly, we have the following corollary describing the cone of valid inequalities for Q_i.

Corollary 1.

    Q_i^* := { (α_x; α_y; β) ∈ R^{n+m+1} : α_x^T x + α_y^T y ≥ β for all (x; y) ∈ Q_i }
           = { (α_x; α_y; β) ∈ R^{n+m+1} :
               α_x = A^T u'_1 + N^T u'''_1
               α_x = A^T u'_2 + N^T u'''_2 − N_i^T v_2
               α_y = B^T u'_1 + I u''_1 + M^T u'''_1 − e_i v_1
               α_y = B^T u'_2 + I u''_2 + M^T u'''_2 − M_i^T v_2
               β ≤ b^T u'_1 − q^T u'''_1
               β ≤ b^T u'_2 − q^T u'''_2 + q_i v_2
               u'_1, u''_1, u'''_1, u'_2, u''_2, u'''_2, v_1, v_2 ≥ 0 }

Here u'_k, u''_k and u'''_k are the multipliers on Ax + By ≥ b, y ≥ 0 and q + Nx + My ≥ 0 in the kth disjunctive term, and v_1, v_2 are the multipliers on y_i ≤ 0 and q_i + N_i x + M_i y ≤ 0 respectively. Another corollary follows immediately from Corollary 1.

Corollary 2. (ˆx; ŷ) ∈ Q_i if and only if the optimal value of the following Cut-Generation Linear Program (CGLP) is non-negative:

    minimize_{(α_x,α_y,β,u,v)}  α_x^T ˆx + α_y^T ŷ − β
    subject to  α_x = A^T u'_1 + N^T u'''_1
                α_x = A^T u'_2 + N^T u'''_2 − N_i^T v_2
                α_y = B^T u'_1 + I u''_1 + M^T u'''_1 − e_i v_1
                α_y = B^T u'_2 + I u''_2 + M^T u'''_2 − M_i^T v_2
                β ≤ b^T u'_1 − q^T u'''_1
                β ≤ b^T u'_2 − q^T u'''_2 + q_i v_2
                u'_1, u''_1, u'''_1, u'_2, u''_2, u'''_2, v_1, v_2 ≥ 0
                1^T (u'_1 + u''_1 + u'''_1 + u'_2 + u''_2 + u'''_2) + v_1 + v_2 = 1

Corollary 2 gives a way to construct a cut that cuts off a point violating the ith complementarity constraint.
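To make Theorem 2 concrete, here is a minimal sketch of a CGLP for a two-term disjunction in the plane (a toy instance invented for illustration; `scipy.optimize.linprog` plays the role of the LP solver). The disjunction's convex hull is the segment between (0, 0) and (1, 0), so the CGLP finds a cut separating the relaxation point ˆx = (0.5, 1):

```python
import numpy as np
from scipy.optimize import linprog

# Shared rows Ax >= b of a toy disjunctive program in R^2:
A = np.array([[ 2.0, -1.0],     # x2 <= 2*x1
              [-2.0, -1.0],     # x2 <= 2 - 2*x1
              [ 0.0,  1.0]])    # x2 >= 0
b = np.array([0.0, -2.0, 0.0])
# disjunction:  -x1 >= 0   OR   x1 >= 1
D = [np.array([[-1.0, 0.0]]), np.array([[1.0, 0.0]])]
d = [np.array([0.0]), np.array([1.0])]
xhat = np.array([0.5, 1.0])     # LP-relaxation point to cut off

# variables: alpha (2, free), beta (free), u1 (3), v1 (1), u2 (3), v2 (1)
nv = 11
cost = np.zeros(nv); cost[:2] = xhat; cost[2] = -1.0   # min alpha^T xhat - beta
A_eq, b_eq, A_ub, b_ub = [], [], [], []
for i, (u0, v0) in enumerate([(3, 6), (7, 10)]):       # column offsets of u^i, v^i
    for r in range(2):                                 # alpha = A^T u^i + (D^i)^T v^i
        row = np.zeros(nv); row[r] = 1.0
        row[u0:u0 + 3] = -A[:, r]; row[v0] = -D[i][0, r]
        A_eq.append(row); b_eq.append(0.0)
    row = np.zeros(nv); row[2] = 1.0                   # beta <= b^T u^i + (d^i)^T v^i
    row[u0:u0 + 3] = -b; row[v0] = -d[i][0]
    A_ub.append(row); b_ub.append(0.0)
norm = np.zeros(nv); norm[3:] = 1.0                    # equal-weight normalization
A_eq.append(norm); b_eq.append(1.0)
res = linprog(cost, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              A_eq=np.array(A_eq), b_eq=np.array(b_eq),
              bounds=[(None, None)] * 3 + [(0, None)] * 8)
alpha, beta = res.x[:2], res.x[2]
print(res.fun)    # negative => alpha^T x >= beta cuts off xhat
```

By construction the resulting inequality is valid for both disjunctive pieces (and hence for their convex hull), while the negative objective value certifies that it is violated at ˆx.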

The procedure is very similar when we impose two complementarity constraints. Suppose we impose the ith and kth complementarity constraints, and denote

    F_ik := { (x; y) ∈ R^{n+m} : Ax + By ≥ b, y ≥ 0, q + Nx + My ≥ 0,
              y_i ≤ 0 ∨ q_i + N_i x + M_i y ≤ 0,
              y_k ≤ 0 ∨ q_k + N_k x + M_k y ≤ 0 }

and let Q_ik denote the closed convex hull of F_ik. The disjunctive normal form of F_ik becomes

    F_ik = { (x; y) ∈ R^{n+m} : Ax + By ≥ b, y ≥ 0, q + Nx + My ≥ 0,
             D_1 ∨ D_2 ∨ D_3 ∨ D_4 }

where

    D_1 = { y_i ≤ 0, y_k ≤ 0 }
    D_2 = { q_i + N_i x + M_i y ≤ 0, q_k + N_k x + M_k y ≤ 0 }
    D_3 = { y_i ≤ 0, q_k + N_k x + M_k y ≤ 0 }
    D_4 = { y_k ≤ 0, q_i + N_i x + M_i y ≤ 0 }

We can therefore apply the same procedure to construct disjunctive cuts based on the disjunctive normal form of F_ik. The difference from imposing one complementarity constraint is that the index set now has size 4 and the CGLP for generating disjunctive cuts is twice as large. But since we impose two complementarity constraints simultaneously, the generated cuts should be stronger than those generated from a single complementarity constraint. In practice, we must weigh the cost of generating the cuts against their effect.

2.2.2.3 Disjunctive Cuts Based on Simplex Tableau

As noted in the previous subsection, generating a disjunctive cut requires solving a cut-generation LP whose size is at least twice that of the original LP relaxation. This is a significant barrier in practice, since for many real problems even the initial LP relaxation is very large. To reduce the computational cost, we can instead derive disjunctive cuts from the optimal simplex tableau. Let ξ ∈ R^{n+m} and let ˆξ = (ˆx; ŷ) be the optimal solution of the LP relaxation. If (ˆx; ŷ) violates the ith complementarity constraint, i.e., ŷ_i ŵ_i > 0, then we can express y_i and w_i in terms of the non-basic variables using the two rows of the simplex tableau corresponding to y_i and w_i:

    y_i = ŷ_i − Σ_{j∈NB} a^{y_i}_j ξ_j
    w_i = ŵ_i − Σ_{j∈NB} a^{w_i}_j ξ_j        (2.10)

Here NB denotes the set of non-basic variables. From the complementarity restriction on y_i and w_i, we can construct a two-term disjunction:

    ( ŷ_i ≤ Σ_{j∈NB} a^{y_i}_j ξ_j )  ∨  ( ŵ_i ≤ Σ_{j∈NB} a^{w_i}_j ξ_j )        (2.11)

Because ŷ_i and ŵ_i are both positive, we can divide both sides of the inequalities by ŷ_i or ŵ_i without changing their direction, so the disjunction (2.11) becomes

    ( 1 ≤ Σ_{j∈NB} (a^{y_i}_j / ŷ_i) ξ_j )  ∨  ( 1 ≤ Σ_{j∈NB} (a^{w_i}_j / ŵ_i) ξ_j )        (2.12)

Together with the non-negativity restriction on ξ, we obtain the disjunctive normal form

    { ξ ∈ R^{n+m} : ξ ≥ 0,
      ( 1 ≤ Σ_{j∈NB} (a^{y_i}_j / ŷ_i) ξ_j ) ∨ ( 1 ≤ Σ_{j∈NB} (a^{w_i}_j / ŵ_i) ξ_j ) }        (2.13)

Applying Theorem 1 and Theorem 2 to this disjunctive normal form, we can set up the following cut-generation LP, which generates a constraint Σ_{j∈NB} α_j ξ_j ≥ β

that is valid for Q_i. Note that ˆξ_j = 0 for j ∈ NB.

    minimize_{(α,β,v)}  −β
    subject to  α_j ≥ (a^{y_i}_j / ŷ_i) v_1,   j ∈ NB
                α_j ≥ (a^{w_i}_j / ŵ_i) v_2,   j ∈ NB        (2.14)
                β ≤ v_1 and β ≤ v_2
                v_1, v_2 ≥ 0
                normalization constraint

Any feasible solution of problem (2.14) with negative objective value, i.e., with β > 0, gives a valid constraint that cuts off ˆξ. Moreover, since β stays positive in the generated cut, we can always rescale the coefficients of the cut so that it reads Σ_{j∈NB} α_j ξ_j ≥ 1. Therefore we can rewrite CGLP (2.14) as

    minimize_{(α,v)}  Σ_{j∈NB} α_j
    subject to  α_j ≥ (a^{y_i}_j / ŷ_i) v_1,   j ∈ NB
                α_j ≥ (a^{w_i}_j / ŵ_i) v_2,   j ∈ NB        (2.15)
                1 ≤ v_1 and 1 ≤ v_2
                v_1, v_2 ≥ 0

The objective of problem (2.15) keeps the coefficients of the cut relatively small, which increases the numerical stability of the cut, and any feasible solution of problem (2.15) gives a valid constraint for Q_i that cuts off ˆξ. Note that the standard simple cut presented by Audet et al. [5],

    Σ_{j∈NB} max( a^{y_i}_j / ŷ_i , a^{w_i}_j / ŵ_i ) ξ_j ≥ 1
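A small sketch of CGLP (2.15) on made-up tableau data (three non-basic variables), again using `scipy.optimize.linprog` as the LP solver. With all scaled coefficients positive, the optimal solution sits at v_1 = v_2 = 1 and reproduces the simple cut of Audet et al.:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical tableau rows: non-basic coefficients a_y, a_w and
# basic values yhat_i, what_i > 0 for the violated complementarity pair.
a_y = np.array([0.5, 2.0, 1.5]);  yhat = 1.0
a_w = np.array([1.0, 0.25, 1.5]); what = 1.0
cy, cw = a_y / yhat, a_w / what          # scaled disjunction coefficients

n = len(a_y)                              # variables: alpha (n), v1, v2
cost = np.concatenate([np.ones(n), [0.0, 0.0]])
A_ub, b_ub = [], []
for j in range(n):                        # alpha_j >= cy_j*v1, alpha_j >= cw_j*v2
    r1 = np.zeros(n + 2); r1[j] = -1.0; r1[n] = cy[j]
    r2 = np.zeros(n + 2); r2[j] = -1.0; r2[n + 1] = cw[j]
    A_ub += [r1, r2]; b_ub += [0.0, 0.0]
bounds = [(None, None)] * n + [(1, None), (1, None)]   # v1, v2 >= 1
res = linprog(cost, A_ub=np.array(A_ub), b_ub=np.array(b_ub), bounds=bounds)

simple_cut = np.maximum(cy, cw)           # Audet et al. simple cut: v1 = v2 = 1
print(res.x[:n], simple_cut)
```

Since ˆξ_j = 0 for all non-basic j, the left-hand side of the generated cut evaluates to 0 < 1 at ˆξ, so the cut indeed separates it.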

is in fact a special disjunctive cut, derived from problem (2.15) by setting v_1 = 1, v_2 = 1 under the equal-weight normalization. If ˆξ violates two complementarity constraints, we can likewise derive disjunctive cuts from the 4 rows of the simplex tableau corresponding to those constraints. Suppose ˆξ violates the ith and kth complementarity constraints. In a very similar way, we can construct a four-term disjunction

    ( 1 ≤ Σ_{j∈NB} (a^{y_i}_j/ŷ_i) ξ_j  and  1 ≤ Σ_{j∈NB} (a^{y_k}_j/ŷ_k) ξ_j )
    ∨ ( 1 ≤ Σ_{j∈NB} (a^{w_i}_j/ŵ_i) ξ_j  and  1 ≤ Σ_{j∈NB} (a^{w_k}_j/ŵ_k) ξ_j )
    ∨ ( 1 ≤ Σ_{j∈NB} (a^{y_i}_j/ŷ_i) ξ_j  and  1 ≤ Σ_{j∈NB} (a^{w_k}_j/ŵ_k) ξ_j )        (2.16)
    ∨ ( 1 ≤ Σ_{j∈NB} (a^{w_i}_j/ŵ_i) ξ_j  and  1 ≤ Σ_{j∈NB} (a^{y_k}_j/ŷ_k) ξ_j )

A cut-generation LP can then be set up to cut off ˆξ:

    minimize_{(α,v)}  Σ_{j∈NB} α_j
    subject to  α_j ≥ (a^{y_i}_j/ŷ_i) v_{1i} + (a^{y_k}_j/ŷ_k) v_{1k},   j ∈ NB
                α_j ≥ (a^{w_i}_j/ŵ_i) v_{2i} + (a^{w_k}_j/ŵ_k) v_{2k},   j ∈ NB
                α_j ≥ (a^{y_i}_j/ŷ_i) v_{3i} + (a^{w_k}_j/ŵ_k) v_{3k},   j ∈ NB        (2.17)
                α_j ≥ (a^{w_i}_j/ŵ_i) v_{4i} + (a^{y_k}_j/ŷ_k) v_{4k},   j ∈ NB
                1 ≤ v_{1i} + v_{1k},  1 ≤ v_{2i} + v_{2k},  1 ≤ v_{3i} + v_{3k},  1 ≤ v_{4i} + v_{4k}
                v_{1i}, v_{1k}, v_{2i}, v_{2k}, v_{3i}, v_{3k}, v_{4i}, v_{4k} ≥ 0

Note that the cut-generation LP derived from the simplex tableau is far smaller than the standard cut-generation LP discussed in Section 2.2.2.2, which makes it possible to generate this type of disjunctive cut during the branching procedure. The cuts are valid, but not all disjunctive cuts can be derived in this way. In [7], Balas and Perregaard discuss an approach for solving the standard cut-generation LP without explicitly constructing it; the reader is referred

to that paper for more details.

2.2.3 Linear Constraints for Bilevel Programs

Recall from the discussion of LPCC applications in Chapter 1 that certain hierarchical optimization problems can be formulated as LPCCs. In this subsection, we present a framework for deriving special cuts for bilevel problems, i.e., two-level hierarchical optimization problems. We use problem (1.4) to illustrate the framework. For simplicity, we restrict r = 1 in problem (1.4), so the simplified problem can be written as

    minimize_{(x,y)}  c^T x + h^T y
    subject to        Ax + By ≥ b
    and               y ∈ argmin_{y'} { (d + Dx)^T y' + ½ (y')^T Q y' : Cy' ≥ g − Fx }        (2.18)

where x ∈ R^n, y ∈ R^l, b ∈ R^m, g ∈ R^k, Q is symmetric and positive semi-definite, and c, d, h, A, B, C, D, F and Q are all dimensioned appropriately. Since the inner problem is convex, we can replace it with its KKT optimality conditions, and problem (2.18) can be reformulated as the following LPCC:

    minimize_{(x,y)}  c^T x + h^T y
    subject to        Ax + By ≥ b
                      Cy − g + Fx − w = 0
                      d + Dx + Qy − C^T λ = 0        (2.19)
    and               0 ≤ w ⊥ λ ≥ 0

where w, λ ∈ R^k. The complementarity constraints in problem (2.19) force the solution of the LPCC to be optimal for the inner problem of problem (2.18). Hence, if we consider problem (2.19) alone, relaxing the complementarity constraints may result in a very poor relaxation. To remedy this, we can take advantage of the structure of problem (2.18) and use the inner objective value to restrict the inner variables to be close to optimal for the inner problem. Let y(x) denote the feasible region of the inner problem of problem (2.18) for a given x. The basic idea of deriving cuts from the inner problem can be described

as the following lemma:

Lemma 1. Given ŷ, any feasible solution (x; y) of problem (2.19) such that ŷ ∈ y(x) satisfies the following inequality:

    (d + Dx)^T y + ½ y^T Q y ≤ (d + Dx)^T ŷ + ½ ŷ^T Q ŷ        (2.20)

Proof. Since (x; y) is a feasible solution of problem (2.19), y is an optimal solution of the inner problem of problem (2.18) for this x. The condition ŷ ∈ y(x) means that ŷ is feasible for the inner problem for this x, and the optimal objective value is at least as good as the objective value of any feasible solution. The result follows.

Note that inequality (2.20) is a non-convex constraint, and it is valid only for certain x. To obtain a valid convex constraint for (2.19), we may have to make some simplifying assumptions and weaken the constraint. We first consider how to convexify inequality (2.20). The only non-convex term in (2.20) is (d + Dx)^T y. We define

    x̃ = d + Dx        (2.21)

Inequality (2.20) is then equivalent to

    Σ_{j=1}^l x̃_j y_j + ½ y^T Q y ≤ (d + Dx)^T ŷ + ½ ŷ^T Q ŷ

or equivalently the second order cone constraint

    Σ_{j=1}^l σ_j + ½ y^T Q y ≤ (d + Dx)^T ŷ + ½ ŷ^T Q ŷ        (2.22)

with the additional non-convex constraints

    x̃_j y_j ≤ σ_j,   j = 1, ..., l        (2.23)

Assume further that x̃ and y are bounded:

    x̃^l ≤ x̃ ≤ x̃^u,   y^l ≤ y ≤ y^u

where x̃^l, x̃^u, y^l, y^u denote the bounds on x̃ and y. It was shown by Al-Khayyal and Falk [3] that the convex envelope for σ_j is given by

    x̃^l_j y_j + y^l_j x̃_j − x̃^l_j y^l_j ≤ σ_j
    x̃^u_j y_j + y^u_j x̃_j − x̃^u_j y^u_j ≤ σ_j
    x̃^l_j y_j + y^u_j x̃_j − x̃^l_j y^u_j ≥ σ_j        j = 1, ..., l        (2.24)
    x̃^u_j y_j + y^l_j x̃_j − x̃^u_j y^l_j ≥ σ_j

So we can relax (2.23) by using the first two inequalities in (2.24). Next, we consider how to make inequality (2.20) valid for problem (2.19); in fact, we just need to find a ŷ such that ŷ ∈ y(x) for every feasible solution (x; y) of problem (2.19). Assume that g − Fx is bounded:

    g^l ≤ g − Fx ≤ g^u

where g^l and g^u are bounds on g − Fx. Then as long as Cŷ ≥ g^u, we have Cŷ ≥ g − Fx for every feasible (x; y) of problem (2.19). We summarize the above discussion in the following lemma:

Lemma 2. Given ŷ such that Cŷ ≥ g^u, the following inequalities are valid for problem (2.19):

    Σ_{j=1}^l σ_j + ½ y^T Q y ≤ (d + Dx)^T ŷ + ½ ŷ^T Q ŷ
    x̃ = d + Dx        (2.25)
    x̃^l_j y_j + y^l_j x̃_j − x̃^l_j y^l_j ≤ σ_j
    x̃^u_j y_j + y^u_j x̃_j − x̃^u_j y^u_j ≤ σ_j        j = 1, ..., l

Now we consider how to find the ŷ for (2.25). In fact, any ŷ satisfying Cŷ ≥ g^u can be used, so we can use the constraint Cy ≥ g^u to construct optimization problems that produce ŷ. For example, suppose (x̄; ȳ) is the LP relaxation solution of problem (2.19).
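The envelope (2.24) can be checked numerically. A minimal sketch with hypothetical bounds, verifying the sandwich property of the four inequalities on a grid:

```python
import numpy as np

# McCormick envelope (2.24) for s = x*y on a box, with illustrative
# bounds xl, xu, yl, yu (hypothetical values).
xl, xu, yl, yu = -1.0, 2.0, 0.0, 3.0

def under_estimators(x, y):
    # convex under-estimators: both must lie below x*y on the box
    return (xl * y + yl * x - xl * yl,
            xu * y + yu * x - xu * yu)

def over_estimators(x, y):
    # concave over-estimators: both must lie above x*y on the box
    return (xl * y + yu * x - xl * yu,
            xu * y + yl * x - xu * yl)

# check the sandwich property on a grid of points in the box
for x in np.linspace(xl, xu, 7):
    for y in np.linspace(yl, yu, 7):
        assert max(under_estimators(x, y)) <= x * y + 1e-9
        assert min(over_estimators(x, y)) >= x * y - 1e-9
print("McCormick envelope verified on the box")
```

The first under-estimator, for instance, rearranges to (x − xl)(y − yl) ≥ 0, which holds everywhere on the box; the other three inequalities rearrange similarly.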

Then we can construct the problem

    minimize_y  (d + Dx̄)^T y + ½ y^T Q y
    subject to  Cy ≥ g^u
                By ≥ b − Ax̄        (2.26)

and let ŷ be the optimal solution of this problem. Alternatively, we can construct an optimization problem that restricts the right-hand side of inequality (2.20) as much as possible:

    minimize_{(x;y)}  Σ_{j=1}^l t_j + ½ y^T Q y
    subject to        Cy ≥ g^u
                      Ax + By ≥ b
                      x̃ = d + Dx        (2.27)
                      t_j ≥ x̃^u_j y_j + y^l_j x̃_j − x̃^u_j y^l_j,   j = 1, ..., l

The last constraints come from (2.24), and they give an upper bound on (d + Dx)^T y. They can also be replaced with t_j ≥ x̃^l_j y_j + y^u_j x̃_j − x̃^l_j y^u_j, j = 1, ..., l. Once we have selected ŷ, we can apply (2.25) to generate strong linear cuts. A lower bound on a linear function r_x^T x + r_y^T y can be obtained by solving the second order cone program

    γ_r := minimize_{(x;y;σ;x̃)}  r_x^T x + r_y^T y
           subject to  Ax + By ≥ b
                       Fx + Cy ≥ g
                       Σ_{j=1}^l σ_j + ½ y^T Q y ≤ (d + Dx)^T ŷ + ½ ŷ^T Q ŷ
                       x̃ = d + Dx
                       x̃^l_j y_j + y^l_j x̃_j − x̃^l_j y^l_j ≤ σ_j
                       x̃^u_j y_j + y^u_j x̃_j − x̃^u_j y^u_j ≤ σ_j,   j = 1, ..., l

The resulting cut is

    r_x^T x + r_y^T y ≥ γ_r        (2.28)

By choosing r_x and r_y, we are able to generate various cuts as needed and

restrict certain inner variables to be close to optimal for the inner problem. In practice, this type of cut greatly benefits the branch-and-bound procedure. In Chapter 4, we discuss further how this approach is used to restrict certain variables in the cross-validated support vector regression problem.

2.3 Valid Second Order Cone Constraints for LPCC

In subsection 2.2.3, we derived specific second order cone constraints from the inner problem of a bilevel program and used them to construct linear cuts. In this section, we present an approach to derive valid second order cone constraints for the general LPCC.

2.3.1 Convex Relaxation of Non-convex Quadratic Constraint

Recall that in the general LPCC (1.1), the complementarity condition 0 ≤ y ⊥ q + Nx + My ≥ 0 is equivalent to the non-negativity constraints

    y ≥ 0,   q + Nx + My ≥ 0

together with the non-convex quadratic constraint

    q^T y + y^T N x + ½ y^T M̄ y ≤ 0        (2.29)

where M̄ = M + M^T. Now we consider convex relaxations of (2.29). Since M̄ is a symmetric matrix, it has an eigen-decomposition

    M̄ = V Λ V^T

where Λ is a diagonal matrix with diagonal entries λ_i, and V is an orthogonal matrix with columns denoted v_i. Without loss of generality, we assume that only the first p diagonal entries of Λ are non-negative. Moreover, if k denotes the rank

of N, then we can factorize N as

    N = Γ^T Ψ

where Γ is a k × m matrix and Ψ is a k × n matrix. Further, we define the k-dimensional variables x̃ and ỹ:

    x̃ = Ψx,   ỹ = Γy

Then constraint (2.29) is equivalent to the constraint

    q^T y + Σ_{j=1}^k x̃_j ỹ_j + ½ Σ_{i=1}^p λ_i (v_i^T y)^2 ≤ −½ Σ_{i=p+1}^m λ_i (v_i^T y)^2

or equivalently the second order cone constraint

    q^T y + Σ_{j=1}^k σ_j + ½ Σ_{i=1}^p λ_i (v_i^T y)^2 ≤ −½ Σ_{i=p+1}^m λ_i π_i        (2.30)

with the additional non-convex constraints

    (v_i^T y)^2 ≥ π_i,   i = p+1, ..., m        (2.31)
    x̃_j ỹ_j ≤ σ_j,      j = 1, ..., k          (2.32)

When x̃_j, ỹ_j and v_i^T y are all bounded, we can apply the same idea as in the previous section and construct the convex envelopes for σ_j and π_i. We have

    x̃^l_j ỹ_j + ỹ^l_j x̃_j − x̃^l_j ỹ^l_j ≤ σ_j
    x̃^u_j ỹ_j + ỹ^u_j x̃_j − x̃^u_j ỹ^u_j ≤ σ_j
    x̃^l_j ỹ_j + ỹ^u_j x̃_j − x̃^l_j ỹ^u_j ≥ σ_j        j = 1, ..., k        (2.33)
    x̃^u_j ỹ_j + ỹ^l_j x̃_j − x̃^u_j ỹ^l_j ≥ σ_j

    2 v^l_i (v_i^T y) − (v^l_i)^2 ≤ π_i
    2 v^u_i (v_i^T y) − (v^u_i)^2 ≤ π_i        i = p+1, ..., m        (2.34)
    (v^l_i + v^u_i)(v_i^T y) − v^l_i v^u_i ≥ π_i

where x̃^l_j, x̃^u_j, ỹ^l_j, ỹ^u_j, v^l_i and v^u_i denote the bounds on x̃_j, ỹ_j and v_i^T y. Together, (2.30), (2.33) and (2.34) give a convex relaxation
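The eigen-decomposition step can be sketched as follows (random symmetric data for illustration): the identity y^T M y = ½ Σ_i λ_i (v_i^T y)^2 with M̄ = M + M^T is verified numerically, and the quadratic is split into its convex and concave parts:

```python
import numpy as np

# Hypothetical 3x3 matrix M (not one of the thesis instances).
rng = np.random.default_rng(0)
M = rng.standard_normal((3, 3))
Mbar = M + M.T                     # symmetrized matrix, as in (2.29)

lam, V = np.linalg.eigh(Mbar)      # Mbar = V diag(lam) V^T
y = rng.standard_normal(3)

# y^T M y = (1/2) y^T Mbar y = (1/2) sum_i lam_i (v_i^T y)^2
quad = sum(0.5 * lam[i] * (V[:, i] @ y) ** 2 for i in range(3))
assert abs(quad - y @ M @ y) < 1e-9

# split into convex (lam_i >= 0) and concave (lam_i < 0) parts
convex_part = sum(0.5 * lam[i] * (V[:, i] @ y) ** 2
                  for i in range(3) if lam[i] >= 0)
concave_part = quad - convex_part
print(convex_part, concave_part)
```

The convex part is kept as a quadratic (second order cone representable) term, while the concave part is the one replaced by the π variables and the secant constraints (2.34).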

of the non-convex constraint (2.29).

2.3.2 Stronger Second Order Cone Constraint

In the previous section, we discussed the convex relaxation of the non-convex quadratic constraint arising from the complementarities. Note that (2.33) describes the convex hull for σ_j when the feasible region for x̃_j and ỹ_j is a rectangle [x̃^l_j, x̃^u_j] × [ỹ^l_j, ỹ^u_j]. If the feasible region is not a rectangle, we may obtain a stronger convex relaxation by adding certain second order cone constraints. In the following, we discuss how to find such stronger second order cone constraints when the feasible region of x̃_j and ỹ_j is not a rectangle. Notice that for any scalar α > 0, we have

    x̃_j ỹ_j = (1/(4α)) (x̃_j + α ỹ_j)^2 − (1/(4α)) (x̃_j − α ỹ_j)^2

Therefore constraint (2.32) is equivalent to

    (1/(4α)) (x̃_j + α ỹ_j)^2 − (1/(4α)) (x̃_j − α ỹ_j)^2 ≤ σ_j

Let α^l and α^u denote lower and upper bounds on x̃_j − α ỹ_j respectively. We are then able to obtain the convex relaxation of (2.32):

    (1/(4α)) (x̃_j + α ỹ_j)^2 − (1/(4α)) ((α^l + α^u)(x̃_j − α ỹ_j) − α^l α^u) ≤ σ_j        (2.35)

From (2.35), we see that every positive value of α gives a convex lower bound on σ_j. Next, we consider the choice of α. Let Q denote the restricted feasible region of the LP relaxation of the general LPCC:

    Q := { (x; y) : Ax + By ≥ b, y ≥ 0, q + Nx + My ≥ 0,
           c^T x + d^T y ≤ v_UB, valid linear cuts }

Let Q_j denote the projection of Q onto the (x̃_j, ỹ_j) space. If Q_j is bounded, we can use a parametric approach to find every facet and extreme point of Q_j. The finite set of facets can be used to define the initial set of constraints (2.35).
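The identity underlying (2.35), and the secant step that makes it convex, can be verified numerically (the sample values below are arbitrary):

```python
# x*y = (1/(4a))(x + a*y)^2 - (1/(4a))(x - a*y)^2  for any a > 0,
# and on [al, au] the secant (al+au)*t - al*au overestimates t^2,
# so replacing (x - a*y)^2 by the secant only enlarges the region.
def bilinear_split(x, y, a):
    return (x + a * y) ** 2 / (4 * a) - (x - a * y) ** 2 / (4 * a)

for a in (0.5, 1.0, 3.0):
    for x, y in ((1.2, -0.7), (0.0, 2.0), (-3.0, 1.5)):
        assert abs(bilinear_split(x, y, a) - x * y) < 1e-9

al, au = -2.0, 5.0
for t in (-2.0, 0.0, 1.3, 5.0):          # t stands for x - a*y in [al, au]
    assert (al + au) * t - al * au >= t * t - 1e-9
print("identity and secant over-estimation verified")
```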

It is noted that if x̃_j − ᾱ ỹ_j = β is a facet of Q_j for some choice of β, then the corresponding constraint (2.35) with α = ᾱ is tight on this facet, and no other value of α gives a constraint that is tight at any interior point of this facet. Therefore, if Q_j is not a rectangle, the facet-defined constraints (2.35) should be stronger than the constraints (2.33). However, it might be unnecessary to use all facet-defined constraints (2.35). One reason is that Q is just the feasible region of the LP relaxation of the LPCC, so the projection of Q onto the (x̃_j, ỹ_j) space may not represent the actual feasible region of the LPCC projected onto that space; another reason is that current quadratically constrained program solvers such as CPLEX run into numerical issues when too many quadratic constraints are involved. In the following section, we describe an apparently more useful procedure, McCormick bounds refinement, which is able to shrink the feasible region in the (x̃_j, ỹ_j) space and provide a tighter convex relaxation of the feasible region of the LPCC.

McCormick Bounds Refinement and Subgradient Approximation Cuts

In this section, we describe two preprocessing procedures that take advantage of the second order cone constraints discussed in the previous section: McCormick bounds refinement and subgradient approximation cuts. The motivation for McCormick bounds refinement is to tighten the gap between the non-convex constraints (2.32) and their linear approximations (2.33). This procedure is able to shrink the bounds on x̃ and ỹ and improve the initial lower bound of the LPCC, especially when M is a positive semi-definite matrix and the dimension of x̃ or ỹ is small.
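The rectangle case can be checked numerically: the four inequalities in (2.33) bracket the bilinear term, and the bracket width never exceeds (1/2)(x̃_j^u − x̃_j^l)(ỹ_j^u − ỹ_j^l), the per-term quantity that reappears below as the violation level. A small sketch (illustrative Python, not the thesis code; names are ours):

```python
def mccormick(x, y, xl, xu, yl, yu):
    """Tightest under/over-estimates of x*y implied by the four
    McCormick inequalities over the box [xl, xu] x [yl, yu]."""
    lo = max(xl * y + yl * x - xl * yl,   # the two under-estimators
             xu * y + yu * x - xu * yu)
    hi = min(xl * y + yu * x - xl * yu,   # the two over-estimators
             xu * y + yl * x - xu * yl)
    return lo, hi

xl, xu, yl, yu = -1.0, 2.0, 0.0, 3.0
half_box = 0.5 * (xu - xl) * (yu - yl)
for i in range(13):
    for j in range(13):
        x = xl + (xu - xl) * i / 12
        y = yl + (yu - yl) * j / 12
        lo, hi = mccormick(x, y, xl, xu, yl, yu)
        assert lo - 1e-9 <= x * y <= hi + 1e-9   # envelope brackets the product
        assert hi - lo <= half_box + 1e-9        # gap bounded by (1/2)*dx*dy
```

Shrinking the box shrinks `half_box`, which is exactly why the bounds refinement below tightens the relaxation.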
Subgradient approximation cuts are derived from the subgradient approximation to the convex quadratic term in the valid second order cone constraints; they provide linear approximations to these second order cone constraints and let our linear-program-based branch and cut code take advantage of the valid second order cone constraints. For ease of discussion, we assume M is a positive semi-definite matrix. Then, following the discussion in the previous section, the convex relaxation of the non-

convex constraint (2.29) can be rewritten as

q^T y + Σ_{j=1}^k σ_j + y^T M y ≤ 0
x̃_j^l ỹ_j + ỹ_j^l x̃_j − x̃_j^l ỹ_j^l ≤ σ_j
x̃_j^u ỹ_j + ỹ_j^u x̃_j − x̃_j^u ỹ_j^u ≤ σ_j    j = 1, ..., k    (2.36)
x̃_j^l ỹ_j + ỹ_j^u x̃_j − x̃_j^l ỹ_j^u ≥ σ_j
x̃_j^u ỹ_j + ỹ_j^l x̃_j − x̃_j^u ỹ_j^l ≥ σ_j

We first discuss the effect of the bounds on x̃ and ỹ on the above convex relaxation. Notice that we have

(1/2)(x̃_j^l ỹ_j + ỹ_j^l x̃_j − x̃_j^l ỹ_j^l + x̃_j^u ỹ_j + ỹ_j^u x̃_j − x̃_j^u ỹ_j^u) ≤ σ_j
(1/2)(x̃_j^l ỹ_j + ỹ_j^u x̃_j − x̃_j^l ỹ_j^u + x̃_j^u ỹ_j + ỹ_j^l x̃_j − x̃_j^u ỹ_j^l) ≥ σ_j
    j = 1, ..., k

and, based on the McCormick inequalities, we also have

(1/2)(x̃_j^l ỹ_j + ỹ_j^l x̃_j − x̃_j^l ỹ_j^l + x̃_j^u ỹ_j + ỹ_j^u x̃_j − x̃_j^u ỹ_j^u) ≤ x̃_j ỹ_j
(1/2)(x̃_j^l ỹ_j + ỹ_j^u x̃_j − x̃_j^l ỹ_j^u + x̃_j^u ỹ_j + ỹ_j^l x̃_j − x̃_j^u ỹ_j^l) ≥ x̃_j ỹ_j
    j = 1, ..., k

Aggregating the above inequalities and simplifying, we get

x̃_j ỹ_j ≤ σ_j + (1/2)(x̃_j^u − x̃_j^l)(ỹ_j^u − ỹ_j^l),  j = 1, ..., k.

Therefore, if (2.36) holds, then we have

q^T y + x^T N^T y + y^T M y = q^T y + Σ_{j=1}^k x̃_j ỹ_j + y^T M y
    ≤ q^T y + Σ_{j=1}^k (σ_j + (1/2)(x̃_j^u − x̃_j^l)(ỹ_j^u − ỹ_j^l)) + y^T M y
    ≤ (1/2) Σ_{j=1}^k (x̃_j^u − x̃_j^l)(ỹ_j^u − ỹ_j^l).    (2.37)

Thus, the tighter the bounds on x̃ and ỹ are, the tighter the convex relaxation (2.36) is. We can use the value of (1/2) Σ_{j=1}^k (x̃_j^u − x̃_j^l)(ỹ_j^u − ỹ_j^l) to evaluate the strength of the convex relaxation (2.36), and we name this value the Violation Level of this convex relaxation.

The McCormick bounds refinement procedure is to shrink the feasible region

of x̃ and ỹ in order to get a tighter convex relaxation of the LPCC. To get the initial bounds on x̃ and ỹ, we can simply set up a linear program

maximize/minimize_{(x, y, x̃, ỹ)}  x̃_i or ỹ_i
subject to  Ax + By ≥ b
            y ≥ 0
            q + Nx + My ≥ 0
            c^T x + d^T y ≤ v_UB
            x̃ = Ψx
            ỹ = Γy    (2.38)

The McCormick bounds refinement procedure is very similar to the bounds refinement procedure discussed earlier, except that now we solve convex quadratically constrained programs instead of linear programs. With the refined bounds on x̃ and ỹ, we can set up a convex quadratically constrained program to get an improved lower bound on the LPCC:

minimize_{(x, y, x̃, ỹ, σ)}  c^T x + d^T y
subject to  Ax + By ≥ b
            y ≥ 0
            q + Nx + My ≥ 0
            x̃ = Ψx
            ỹ = Γy
            q^T y + Σ_{j=1}^k σ_j + y^T M y ≤ 0
            x̃_j^l ỹ_j + ỹ_j^l x̃_j − x̃_j^l ỹ_j^l ≤ σ_j
            x̃_j^u ỹ_j + ỹ_j^u x̃_j − x̃_j^u ỹ_j^u ≤ σ_j    j = 1, ..., k    (2.39)
            x̃_j^l ỹ_j + ỹ_j^u x̃_j − x̃_j^l ỹ_j^u ≥ σ_j
            x̃_j^u ỹ_j + ỹ_j^l x̃_j − x̃_j^u ỹ_j^l ≥ σ_j

Notice that there is only one convex quadratic constraint in (2.39). That motivates us to use a subgradient approximation to this convex quadratic constraint to benefit our LP-based branch and cut algorithm. In order to get good linear approximation constraints that are able to improve the lower bound of the LPCC, we form the subgradient approximation at points that are the optimal solution of

program (2.39). Procedure 3 describes the preprocessing procedure that combines McCormick bounds refinement and subgradient approximation cut generation.

input: initial bounds on x̃ and ỹ; total number of refinements N
output: refined bounds on x̃ and ỹ; subgradient approximation cut set SC
Initialization: use the initial bounds on x̃ and ỹ to construct the McCormick bound cuts and the convex quadratic constraint (2.36), and add them to linear program (2.38); denote the resulting convex quadratically constrained program as BoundQCP;
for i = 1 to N do
    for j = 1 to k do
        solve BoundQCP with a modified objective to get refined bounds on x̃_j and ỹ_j;
    end
    if no bound on any x̃_j or ỹ_j was improved for j = 1, ..., k then
        break;
    else
        modify the coefficients of constraint (2.36) in BoundQCP with the improved bounds on x̃_j or ỹ_j;
        solve (2.39) with the improved bounds and denote the optimal solution as (x*, y*); add the subgradient approximation cut
            q^T y + Σ_{j=1}^k σ_j + (y*)^T M y* + (y*)^T (M + M^T)(y − y*) ≤ 0
        to SC;
    end
end
Procedure 3: McCormick Bounds Refinement & Subgradient Approximation Cuts

2.4 Computational Results

In this section, we present the computational results for the valid constraints discussed in this chapter. We mainly test three types of constraints: bound cuts, disjunctive cuts, and valid second order cone constraints. The goal of this computational study is to show how well these valid constraints tighten the LP relaxation of the LPCC and improve the initial lower bound of the LPCC. These valid

constraints can all be improved by having a good upper bound coming from a feasible solution. Therefore, for all test instances used in this section, we use a heuristic method, which will be discussed in the next chapter, to recover a feasible solution.

Computation Environment

All procedures and algorithms are developed in the C language with the CPLEX callable library, and all LPs and convex quadratically constrained programs are solved using CPLEX 11. All test instances are run on a single core of an AMD Phenom II X GHZ machine with 4GB of memory. The relative gap for optimality is 10^-6, where the relative gap is defined as (upper bound − lower bound) / max(1, |lower bound|). The tolerance for complementarity is 10^-6, i.e., either y_i or w_i, for i = 1, ..., m, should be less than 10^-6 for any feasible LPCC solution.

LPCC Test Instances

In order to test the effectiveness of the different types of valid constraints, a series of LPCC instances was randomly generated, and Procedure 4 gives a detailed

description of the generator.

input: n, m, k, rankM, dense
output: vectors c, d, b, q; matrices A, B, N, M
1: generate an n-dimensional vector x̄ with integer values between 0 and 10;
2: generate an m-dimensional vector ȳ with integer values between 0 and 10 if index < m/3, and 0 otherwise;
3: generate an n-dimensional vector c with integer values between 0 and 10;
4: generate an m-dimensional vector d with integer values between 0 and 10;
5: generate a k × n matrix A with integer values between -5 and 6 and matrix density dense;
6: generate a k × m matrix B with integer values between -5 and 6 and matrix density dense;
7: generate an m × n matrix N with integer values between -5 and 6 and matrix density dense;
8: generate an m × rankM matrix L with integer values between -5 and 6 and matrix density dense; generate an m × m skew-symmetric matrix M̃ with integer values between -2 and 3; let the m × m matrix M = LL^T + M̃;
9: generate a k-dimensional vector b̄ with integer values between 1 and 11; let the k-dimensional vector b = A x̄ + B ȳ − b̄;
10: generate an m-dimensional vector q̄ with integer values between 1 and 11; let the m-dimensional vector q = −N x̄ − M ȳ + q̄;
Procedure 4: LPCC instance generator

Remark 1. n is the dimension of the x variable; m is the dimension of the y variable; k is the dimension of b; rankM is the rank of the matrix M; dense is the density of the generated matrices. We assume all instances have the non-negativity constraint x ≥ 0, which is not included in the constraints Ax + By ≥ b. Steps 1 and 2 are used to generate a feasible LPCC solution; step 8 generates M as a non-symmetric positive semi-definite matrix with rank rankM.

We generated a total of 60 LPCC instances with 100, 150 and 200 complementarities, 20 instances of each size; for each parameter setting, we randomly generated 5 instances. Table 2.1 reports the LP relaxation, optimal value, initial upper bound and CPLEX solving results for these 60 instances.
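Procedure 4 translates directly into code. The sketch below is illustrative Python using the stdlib random module, not the thesis' C generator, and one detail is an assumption on our part: we set q̄_i = 0 on the indices where ȳ_i > 0, so that the constructed point (x̄, ȳ) actually satisfies the complementarity constraints, which is how we read the remark that steps 1 and 2 produce a feasible LPCC solution.

```python
import random

def rand_mat(rows, cols, lo, hi, dense, rng):
    return [[rng.randint(lo, hi) if rng.random() < dense else 0
             for _ in range(cols)] for _ in range(rows)]

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def generate_lpcc(n, m, k, rank_m, dense, seed=0):
    rng = random.Random(seed)
    xbar = [rng.randint(0, 10) for _ in range(n)]
    ybar = [rng.randint(0, 10) if i < m // 3 else 0 for i in range(m)]
    c = [rng.randint(0, 10) for _ in range(n)]
    d = [rng.randint(0, 10) for _ in range(m)]
    A = rand_mat(k, n, -5, 6, dense, rng)
    B = rand_mat(k, m, -5, 6, dense, rng)
    N = rand_mat(m, n, -5, 6, dense, rng)
    L = rand_mat(m, rank_m, -5, 6, dense, rng)
    S = [[0] * m for _ in range(m)]            # skew-symmetric part
    for i in range(m):
        for j in range(i + 1, m):
            S[i][j] = rng.randint(-2, 3)
            S[j][i] = -S[i][j]
    M = [[sum(L[i][t] * L[j][t] for t in range(rank_m)) + S[i][j]
          for j in range(m)] for i in range(m)]
    bbar = [rng.randint(1, 11) for _ in range(k)]
    b = [ax + by - bb for ax, by, bb in
         zip(matvec(A, xbar), matvec(B, ybar), bbar)]
    # assumed detail: zero slack wherever ybar_i > 0, so (xbar, ybar) is complementary
    qbar = [rng.randint(1, 11) if ybar[i] == 0 else 0 for i in range(m)]
    q = [qb - nx - my for qb, nx, my in
         zip(qbar, matvec(N, xbar), matvec(M, ybar))]
    return dict(c=c, d=d, b=b, q=q, A=A, B=B, N=N, M=M, xbar=xbar, ybar=ybar)

inst = generate_lpcc(n=4, m=6, k=3, rank_m=2, dense=0.8)
w = [qi + nx + my for qi, nx, my in zip(inst["q"],
     matvec(inst["N"], inst["xbar"]), matvec(inst["M"], inst["ybar"]))]
assert all(wi >= 0 for wi in w)                                # w(xbar, ybar) >= 0
assert all(yi * wi == 0 for yi, wi in zip(inst["ybar"], w))    # complementarity
assert all(ax + by >= bi for ax, by, bi in
           zip(matvec(inst["A"], inst["xbar"]),
               matvec(inst["B"], inst["ybar"]), inst["b"]))    # Ax + By >= b
```

By construction, w at (x̄, ȳ) equals q̄, so the linear constraints hold with slack b̄ ≥ 1 and the complementarity pairs are satisfied exactly.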
For solving the LPCC instances with CPLEX, we used indicator constraints in the CPLEX C callable library [33]

to formulate the complementarity constraints, and the CPLEX settings are default. The time limit for CPLEX is 7200 seconds. Notice that CPLEX is unable to solve most of our LPCC instances with m = 200 within 7200 seconds. In the following sections, we show the computational results of the various valid constraints on improving the initial LP relaxation. We use Gap Closed, defined as (lower bound − LPCC_rx) / (LPCC_opt − LPCC_rx), to evaluate the lower bound improvement in the next three sections. The computational results are presented in the figures and tables attached at the end of this section.

Computational Result of Linear Constraints Based on Bounds on the Variables

In this section, we show the computational results for the bound cuts discussed earlier. In order to show the effectiveness of bound cut refinements, we did 8 bound refinements for each run, and we record the lower bound improvement and processing time after each refinement. Tables 2.2, 2.3, 2.4 and 2.5 report the computational results with bound cuts for p = 0, p = 1, p = 2 and p = 3 respectively, and Table 2.6 shows the average gap closed and refinement time over these 60 LPCC instances. Figures 2.1, 2.2, 2.3 and 2.4 visualize how the gap closed and the processing time grow with the number of refinements. Based on the above computational results for bound cuts, we make the following observations: 1. Bound cuts are able to improve the lower bound of the LPCC for most of our test instances, and for some of the test instances bound cuts alone are sufficient to solve the instance. However, there is one instance, instance 41, whose initial lower bound bound cuts cannot improve at all. 2. As p increases, we get tighter bounds on y_j and w_j by applying Procedure 1, which leads to stronger bound cuts (2.3). However, as shown in figures 2.2, 2.3 and 2.4, the computation time also increases dramatically as p increases. That is because we need to solve 2^p linear programs in Procedure 1.
Therefore, in practice it is better to take p ≤ 3. It is also noted that as m increases, the computation time also increases significantly. That is because we need refined bounds on y_j and w_j for j = 1, ..., m in Procedure 2, which means we need to apply Procedure 1 2m times. One suggestion, as

we mentioned before, is to calculate bounds only for the variables whose complementarity constraint is violated. 3. The bound cut refinement procedure is able to improve the lower bound of the LPCC further, and the computation time increases linearly with the number of refinements. It is noted, as shown in figure 2.1, that the rate of improvement gradually becomes smaller as the number of refinements increases, which suggests that 4 or 5 bound cut refinements might be enough in the preprocessing.

Computational Result of Disjunctive Cuts

In this section, we report the computational results for disjunctive cuts derived from one complementarity. We compare two types of disjunctive cuts: the cut obtained by solving a Cut Generation LP (CGLP) and the simple cut read off from the simplex tableau. To show the effectiveness of these cuts, we generate the cuts over multiple rounds, and in each round we generate a simple cut and a disjunctive cut for each violated complementarity constraint. We report the results after m/8, m/4, m/3 and m/2 iterations respectively. Table 2.7 records the gap closed % and cut generation time for these 60 test instances, and table 2.8 compares the average gap closed % and computation time between cuts from the CGLP and simple cuts. Figures 2.5, 2.6, 2.7 and 2.8 provide a visual comparison of the gap closed and the generation time between these two types of cuts. Based on the computational results for these two types of disjunctive cuts, we make the following observations: 1. The lower bounds of all test instances can be improved by using disjunctive cuts, even for instance 41, which could not be improved by bound cuts. 2. As shown in figure 2.5, disjunctive cuts from solving the CGLP are far stronger than simple cuts, while, as shown in figures 2.6, 2.7 and 2.8, simple cuts are far cheaper than disjunctive cuts from solving the CGLP in terms of generation cost. 3. For both cases, it seems that the gap closed % does not increase very much as the number of generation iterations increases.
This suggests that it might be enough to generate the disjunctive cuts in the first few iterations of the preprocessing procedure.
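The measures used throughout these result sections, the Gap Closed fraction defined above together with the relative gap and complementarity tolerance from the computation environment, are simple to compute. A sketch (illustrative Python; the function names are ours, and we read the relative-gap denominator as max(1, |lower bound|)):

```python
def gap_closed(lower_bound, lpcc_rx, lpcc_opt):
    """Fraction of the initial gap closed: (lb - LPCC_rx) / (LPCC_opt - LPCC_rx)."""
    return (lower_bound - lpcc_rx) / (lpcc_opt - lpcc_rx)

def relative_gap(upper_bound, lower_bound):
    """(ub - lb) / max(1, |lb|), the optimality test with tolerance 1e-6."""
    return (upper_bound - lower_bound) / max(1.0, abs(lower_bound))

def is_complementary(y, w, tol=1e-6):
    """Each pair (y_i, w_i) must have at least one member below tol."""
    return all(min(abs(yi), abs(wi)) < tol for yi, wi in zip(y, w))

assert gap_closed(75.0, 50.0, 100.0) == 0.5      # half of the initial gap closed
assert relative_gap(100.5, 100.0) == 0.005
assert is_complementary([0.0, 2.3], [1.7, 1e-8])
assert not is_complementary([0.5, 2.3], [1.7, 0.2])
```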

Computational Result of Valid Second Order Cone Constraints

In this section, we show the computational results for the valid second order cone constraint. It is noted that CPLEX has many numerical issues when a QCP involves too many quadratic constraints. Therefore, here we only test the second order cone constraint with McCormick bound cuts, i.e., (2.39) and its refinement procedure, since this involves only one quadratic constraint. For comparison of the lower bound improvement with bound cuts, we also do 8 McCormick bound refinements for each run. Table 2.9 records the details of the gap closed % and computation time for these 60 test instances. Notice that we are unable to process 5 instances due to numerical difficulties. Table 2.10 shows the average performance of the second order cone constraint among these test instances, and figures 2.9, 2.10, 2.11 and 2.12 compare the gap closed % and computation time of McCormick bounds refinement with the results for bound cuts shown in the previous section. Based on the above computational results for McCormick bounds refinement, we make the following observations: 1. Second order cone constraints are able to improve the lower bound of our test instances significantly at very low computational cost. Compared with bound cut refinement directly, McCormick bound refinement achieves a similar gap closed % with far less computation time. 2. The computation time of McCormick bounds refinement is less influenced by m than that of bound cut refinement. That is because we only need to calculate the refined bounds on x̃ and ỹ in the McCormick bound refinement procedure, and the computation time depends on the dimension of x̃ or ỹ, not on m.
Thus, McCormick bounds refinement will be very useful when m is very large and the dimension of x̃ or ỹ is relatively small.

Computational Result of CPLEX with Cuts

In the previous sections, we have shown the effectiveness of various valid linear constraints and second order cone constraints at tightening the initial LP relaxation. In fact, some of the instances can be solved just with these cutting planes and cutting surfaces. However, for most of our instances, we

still need to apply a branch and bound routine to solve the problem to optimality. In this section, we present the computational results of CPLEX with these developed cutting planes. Notice that for the second order cone constraints we use the subgradient approximation to the second order cone constraint instead, and all cutting planes are added at the root node. Recall that table 2.1 showed the computational results of CPLEX solving our 60 LPCC instances, and that CPLEX is unable to solve most of our LPCC instances with m = 200. Tables 2.11, 2.12, 2.13 and 2.14 show the computational results of CPLEX with bound cuts for # splits p = 0, 1, 2, 3 respectively, with 0, 4 and 8 refinements for each setting of p. Table 2.15 shows the computational results of CPLEX with disjunctive cuts, simple cuts and cuts derived from the second order cone constraint. Table 2.16 compares the average solving time and explored nodes of CPLEX with these different types of cutting planes. Based on the comparison in table 2.16, we make the following observations: 1. According to table 2.16, bound cuts improve the performance of CPLEX for solving our LPCC instances much more than the other types of cuts do. From the table, without any cuts CPLEX is unable to solve 15 out of 20 LPCC instances with m = 200, while with bound cuts the number of unsolved instances decreases significantly. For example, with bound cuts (# splits p = 0; # refine = 4), the number of unsolved problems is reduced to 2. Also, as the number of splits p or the number of refinements increases, the CPLEX solving time and the number of explored nodes decrease; that is, the stronger the bound cuts, the bigger the performance improvement CPLEX gains from them. Meanwhile, weighing the cost of generating bound cuts against the performance improvement they bring to CPLEX, it might be a good idea to choose # splits p = 0 or 1 and 4 refinements. 2.
Surprisingly, disjunctive cuts and simple cuts severely degrade the performance of CPLEX on our 60 instances, even though, based on the computational results shown earlier, they do increase the initial lower bound of our test instances. We believe the reason is that we add too many disjunctive cuts or simple cuts at the root node (since in each refinement we generate cuts for every complementarity violation), which leads to a very large LP. This motivates us to manage and select which disjunctive cuts to add at the root node

instead of adding all of them. In the next chapter, we will discuss this issue in more detail. 3. Cuts derived from the second order cone constraint can also help CPLEX solve our 60 instances, but their effectiveness is not as good as that of bound cuts. However, considering the low cost of generating these cuts and their effectiveness at closing the initial gap, we should definitely consider adding them at the root node.

Conclusion

At the end of this chapter, we can draw some conclusions based on the computational results we have shown. 1. The bound cuts, disjunctive cuts and valid second order cone constraints discussed in this chapter can be used to tighten the initial LP relaxation for most of our test LPCC instances. Bound cuts and second order cone constraints have similar effectiveness at closing the initial gap, while the cost of generating the second order cone constraint is much lower than the cost of generating bound cuts. 2. The general disjunctive cut and the simple cut seem somewhat weaker than the bound cut and the second order cone constraint at tightening the initial LP relaxation for our test instances. The simple cut is much cheaper to generate than the general disjunctive cut, since it is derived from the simplex tableau at almost no cost; therefore simple cuts might be generated at nodes during the branch and bound routine. We need to select and manage disjunctive cuts and simple cuts when adding them at the root node: based on our computational experience, too many cuts of these types severely degrade the CPLEX branch and bound routine. 3. Based on the computational results shown in section 2.4.6, bound cuts can improve the performance of the CPLEX branch and bound routine significantly. This indicates that the bounds on the complementary variables y and w might be very important for the branch and bound routine.
Although the cuts derived from the second order cone constraint are less helpful to CPLEX than bound cuts, they can be used to close the initial gap at very small computational cost. Therefore it might be a good idea to combine bound

cuts and the cuts derived from the second order cone constraint together. In the next chapter, we will discuss these issues in more detail.
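The cuts derived from the second order cone constraint that this conclusion recommends are linearizations of the single convex quadratic term: since M is positive semi-definite, y^T M y ≥ (y*)^T M y* + (y*)^T (M + M^T)(y − y*) for every y, so the quadratic may be replaced by its linearization at y* to obtain a valid linear cut. A hedged sketch (illustrative Python; dense lists stand in for the solver's matrix data):

```python
def quad(M, y):
    n = len(y)
    return sum(y[i] * M[i][j] * y[j] for i in range(n) for j in range(n))

def subgradient_cut(M, ystar):
    """Linearization of the convex term y'My at ystar: returns (g, c)
    such that y'My >= g'y + c for all y when M is positive semi-definite."""
    n = len(ystar)
    g = [sum((M[i][j] + M[j][i]) * ystar[j] for j in range(n)) for i in range(n)]
    c = -quad(M, ystar)
    return g, c

# non-symmetric M whose symmetric part [[2, 0], [0, 3]] is PSD
M = [[2.0, 1.0], [-1.0, 3.0]]
ystar = [1.0, -2.0]
g, c = subgradient_cut(M, ystar)
for y in ([0.0, 0.0], [1.0, 1.0], [-3.0, 2.0], ystar):
    assert quad(M, y) >= g[0] * y[0] + g[1] * y[1] + c - 1e-9  # valid under-estimate
# the linearization is tight at ystar
assert abs(quad(M, ystar) - (g[0] * ystar[0] + g[1] * ystar[1] + c)) < 1e-9
```

Dropping y^T M y from (2.39) in favor of g^T y + c yields a plain linear cut, which is what lets an LP-based branch and cut code use the second order cone information.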

[Table 2.1: LPCC test instances. Columns: Problem (#, n, k, m, rankM, dense), Values (LPCC_rx, LPCC_opt, LPCC_ub), CPLEX (Time(sec), Node, Gap).]

Remark. The sub-column # in column Problem is the problem counter; the remaining sub-columns in column Problem record the input parameters used to generate the corresponding instance. The sub-column LPCC_rx in column Values contains the value of the LP relaxation of the LPCC instance, i.e., the optimal value of linear program (2.1); the sub-column LPCC_opt contains the optimal objective value of the LPCC instance; the sub-column LPCC_ub contains the initial upper bound of the LPCC from a recovered feasible solution (we discuss this heuristic feasibility recovery algorithm in the next chapter). The sub-column Time(sec) in column CPLEX contains the CPLEX solving time; the sub-column Node contains the total number of nodes explored by CPLEX when it terminated, either by finding the optimal solution or by hitting the time limit; the sub-column Gap contains the relative gap if CPLEX terminated by hitting the time limit.

[Table 2.2: Computational result of bound cut refinements with p = 0. Columns: Gap Closed (%) and Time (sec), with sub-columns R0 through R8. Remark: sub-column R_i denotes the i-th refinement.]

[Table 2.3: Computational result of bound cut refinements with p = 1. Columns: Gap Closed (%) and Time (sec), with sub-columns R0 through R8. Remark: sub-column R_i denotes the i-th refinement.]

[Table 2.4: Computational result of bound cut refinements with p = 2. Columns: Gap Closed (%) and Time (sec), with sub-columns R0 through R8. Remark: sub-column R_i denotes the i-th refinement.]

[Table 2.5: Computational result of bound cut refinements with p = 3. Columns: Gap Closed (%) and Time (sec), with sub-columns R0 through R8. Remark: sub-column R_i denotes the i-th refinement.]

[Table 2.6: Comparison of average gap closed and processing time of bound cut refinements under different p settings. Columns: p, # Refine, Gap Closed (%), Time (sec) for m = 100, 150, 200.]

Remark. The column Gap Closed (%) contains the average gap closed over the 60 test instances; the sub-column m = 100 in column Time (sec) contains the average refinement time for the test instances with 100 complementarities; the sub-columns m = 150 and m = 200 are analogous.

[Figure 2.1: Comparison of average gap closed of bound cut refinements under different p settings (p = 0, 1, 2, 3; x-axis: number of refinements; y-axis: Gap Closed (%)).]

[Figure 2.2: Comparison of average refinement time of bound cut refinements for LPCC instances with m = 100 under different p settings (x-axis: number of refinements; y-axis: Time (sec)).]

[Figure 2.3: Comparison of average refinement time of bound cut refinements for LPCC instances with m = 150 under different p settings.]

[Figure 2.4: Comparison of average refinement time of bound cut refinements for LPCC instances with m = 200 under different p settings.]

[Table 2.7: Computational result of disjunctive cuts and simple cuts. Columns: Gap Closed (%) and Time (sec) for each cut type, with sub-columns m/8, m/4, m/3, m/2. Remark: sub-column m/8 denotes m/8 cut generation iterations; sub-columns m/4, m/3 and m/2 are analogous.]

[Table 2.8: Comparison of average gap closed and processing time of disjunctive cuts and simple cuts. Columns: Cut Type, # Refine, Gap Closed (%), Time (sec) for m = 100, 150, 200.]

[Figure 2.5: Comparison of average gap closed of the disjunctive cut and the simple cut (x-axis: cut generation iteration number m/8, m/4, m/3, m/2).]

[Figure 2.6: Comparison of average refinement time of the disjunctive cut and the simple cut for LPCC instances with m = 100.]

[Figure 2.7: Comparison of average refinement time of the disjunctive cut and the simple cut for LPCC instances with m = 150.]

[Figure 2.8: Comparison of average refinement time of the disjunctive cut and the simple cut for LPCC instances with m = 200.]

[Table 2.9: Computational result of McCormick bound cut refinements. Columns: Gap Closed (%) and Time (sec), with sub-columns R0 through R8. Remark: instances 34, 47, 52, 57, and 60 could not be processed because of numerical difficulties.]

[Table 2.10: Comparison of average gap closed and processing time of McCormick bound refinements. Columns: # Refine, Gap Closed (%), Time (sec) for m = 100, 150, 200.]

[Figure 2.9: Comparison of average gap closed of bound cut refinement and McCormick bound refinement (x-axis: number of refinements).]

[Figure 2.10: Comparison of average refinement time of bound cut refinement and McCormick bound refinement for LPCC instances with m = 100.]

[Figure 2.11: Comparison of average refinement time of bound cut refinement and McCormick bound refinement for LPCC instances with m = 150.]

[Figure 2.12: Comparison of average refinement time of bound cut refinement and McCormick bound refinement for LPCC instances with m = 200.]

[Table 2.11: Computational result of CPLEX with bound cuts (# splits p = 0). Columns: Time(sec), Nodes, Gap for # Refine = 0, 4, 8.]

[Table 2.12: Computational result of CPLEX with bound cuts (# splits p = 1). Columns: Time(sec), Nodes, Gap for # Refine = 0, 4, 8.]

[Table 2.13: Computational result of CPLEX with bound cuts (# splits p = 2). Columns: Time(sec), Nodes, Gap for # Refine = 0, 4, 8.]

[Table 2.14: Computational result of CPLEX with bound cuts (# splits p = 3). Columns: Time(sec), Nodes, Gap for # Refine = 0, 4, 8.]

[Table 2.15: Computational result of CPLEX with disjunctive cuts, simple cuts and cuts from second order cone constraints. Columns: Time(sec), Nodes, Gap for disjunctive cuts (# Refine = m/8 and m/2), simple cuts (# Refine = m/8 and m/2), and cuts from the SOC constraint (# Refine = 8).]

Remark. CPLEX encounters numerical difficulties on instance 50 with simple cuts (# refinements = m/2) and on instances 34, 47, 52, 57, and 60 with McCormick refinements.

[Table 2.16: Comparison of average solving time and explored nodes of CPLEX with different types of cuts. Rows: No Cuts; Bound Cuts (# splits p = 0, 1, 2, 3); Disjunctive Cuts (m/8, m/2); Simple Cuts (m/8, m/2); Cuts from Second Order Cone Constraints. Columns: Time(sec), Nodes, and # unsolved instances for m = 100, 150, 200.]

Remark. The column # unsolved instances indicates the number of instances that cannot be solved by CPLEX within 7200 seconds.

CHAPTER 3
LPCC using Branch-and-Cut

In this chapter, we propose a branch-and-cut algorithm for solving the general LPCC problem (1.1). Notice that if the initial LP relaxation of the LPCC is feasible, there are two cases: the initial LP relaxation is bounded below, or it is unbounded. We refer to these as the bounded case and the unbounded case of the LPCC. For the bounded case, our algorithm consists of two phases, a preprocessing phase and a branch-and-bound phase; for the unbounded case, only the branch-and-bound phase is applied. In section 3.2.1, we first describe the preprocessing phase of our algorithm: a heuristic feasibility recovery procedure is developed to recover a feasible solution of the LPCC, which provides a valid upper bound; and strategies for generating and selecting cutting planes from the various types of cutting plane studied in the previous chapter are discussed, which can sharpen the LP relaxation and improve the initial lower bound of the LPCC. In section 3.2.2, we present the second phase of our algorithm, the branch-and-bound phase, including various node selection strategies and branching complementarity selection strategies. Our proposed algorithm is able to characterize infeasible and unbounded LPCC problems as well as solve problems with finite optimal value. In section 3.3, we show the computational results of our branch-and-cut algorithm on randomly generated LPCC instances, including the 60 test LPCC instances used in the previous chapter.

3.1 Preliminaries

In this preliminary section, we explain the motivation for using a branch-and-cut approach to solve the LPCC and give a brief introduction to the branch-and-bound and branch-and-cut methods.

Preliminary Discussion and Motivation

Recalling the discussion in chapter 1, if y and w are bounded, then we have the equivalent MIP formulation of problem (1.1):

minimize_{(x,y,z)}  c^T x + d^T y
subject to  Ax + By ≥ b
            0 ≤ y ≤ Θz
            0 ≤ q + Nx + My ≤ Θ(1 − z)
            z ∈ {0, 1}^m    (3.1)

where Θ is a diagonal matrix with diagonal entries θ_i. In the above MIP formulation, the binary vector z is only used to model the complementarity relationships of the LPCC; except through the complementarity constraints, it does not interact with x and y at all. This observation motivates us to enforce the complementarities through a specialized branching scheme, i.e., to branch on the complementarities directly without introducing the binary vector z. This kind of specialized branching approach has been used to solve several problems, such as generalized assignment problems [20], nonconvex quadratic programs [58] and nonconvex piecewise linear optimization problems [40]. The obvious advantage of a specialized branching approach for solving the LPCC is that we no longer need θ in the formulation, so the approach is also applicable when y or w is unbounded. In fact, even when we know such a θ exists, the cost of computing a valid θ can be very high, especially when m is very large. Moreover, introducing the binary vector z increases both the number of variables and the number of constraints, and these big-M type constraints are usually not tight, which increases the degeneracy of the solutions of the relaxation. In this chapter, we mainly study this specialized branching approach, combined with the cutting planes developed in the previous chapter, for solving the general LPCC (1.1).

Branch-and-Bound & Branch-and-Cut

Before we propose our algorithm, we first give a brief overview of the branch-and-bound and branch-and-cut methods. The branch-and-bound method has been widely applied to solve various optimization problems. The method is based on the principle of divide-and-conquer,

and is essentially an implicit enumeration method. Branching iteratively divides the given problem into two or more subproblems; during this procedure a branching tree is created, with each node representing one of the subproblems. Bounding obtains a lower bound on a subproblem and compares it with a known upper bound on the original problem in order to determine whether the subproblem can be pruned. The bounding process essentially avoids a complete enumeration of all potential subproblems. Figure 3.1 illustrates the branch-and-bound scheme.

[Figure 3.1: Branch-and-bound search scheme. The legend distinguishes the root node (R), pruned nodes, solved nodes, the node currently being solved, feasible solutions (F), newly formed nodes, and unsolved nodes.]

Branch-and-bound was one of the first methods for solving MIPs. The first branch-and-bound algorithm for integer programming was introduced by Land and Doig [43] in 1960, and branch-and-bound remains the core algorithm used by commercial solvers for MIPs today. In mixed integer programming, branching occurs whenever an integer variable takes a fractional value, and the bounding process uses the LP relaxation at each node as the lower bound. The key steps in branch-and-bound are node selection (which subproblem to explore next) and variable selection (how to divide the current node into subproblems). In this chapter, we present a specialized branching scheme for general LPCC instances, in which branching is performed to fix a complementarity constraint.
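As an illustration of the branching and bounding steps, the following self-contained sketch (not from the thesis; the 0-1 knapsack instance and all names are hypothetical) runs branch-and-bound with a fractional relaxation as the bound, playing the same role that the node LP relaxation plays in MIP branch-and-bound:

```python
def knapsack_branch_and_bound(values, weights, capacity):
    """0-1 knapsack via branch-and-bound with a fractional (LP) upper bound.
    Assumes positive weights."""
    # Consider items in decreasing value/weight density.
    order = sorted(range(len(values)), key=lambda i: -values[i] / weights[i])
    best = 0

    def upper_bound(k, cap_left, val):
        # Bounding: the relaxation fills remaining capacity greedily,
        # taking a fraction of the first item that does not fit.
        for i in order[k:]:
            if weights[i] <= cap_left:
                cap_left -= weights[i]
                val += values[i]
            else:
                return val + values[i] * cap_left / weights[i]
        return val

    def branch(k, cap_left, val):
        nonlocal best
        if val > best:
            best = val            # new incumbent (items past k left out)
        if k == len(order):
            return
        if upper_bound(k, cap_left, val) <= best:
            return                # prune: bound cannot beat the incumbent
        i = order[k]
        if weights[i] <= cap_left:                   # child 1: take item i
            branch(k + 1, cap_left - weights[i], val + values[i])
        branch(k + 1, cap_left, val)                 # child 2: skip item i

    branch(0, capacity, 0)
    return best

print(knapsack_branch_and_bound([60, 100, 120], [10, 20, 30], 50))  # -> 220
```

The pruning test is exactly the bounding process described above: a subtree is discarded as soon as its relaxation bound is no better than the incumbent, avoiding full enumeration of the 2^n leaves.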
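The specialized branching scheme itself can be sketched on a toy LPCC with a single complementarity pair 0 <= y, w = q + Nx + My >= 0, y·w = 0: one child fixes y = 0, the other fixes w = 0, and each resulting piece is an ordinary LP. The sketch below is a minimal illustration with hypothetical data; for compactness the two-variable LPs are solved by brute-force vertex enumeration rather than by a simplex code:

```python
from itertools import combinations

def solve_lp_2d(c, A, b):
    """Minimize c.v over {v in R^2 : A v <= b} by enumerating vertices
    (intersections of constraint pairs). Assumes a bounded feasible region.
    Returns (value, vertex), or (None, None) if infeasible."""
    best_val, best_v = None, None
    for i, j in combinations(range(len(A)), 2):
        a1, a2 = A[i], A[j]
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < 1e-12:
            continue  # parallel constraints, no vertex
        x = (b[i] * a2[1] - a1[1] * b[j]) / det
        y = (a1[0] * b[j] - b[i] * a2[0]) / det
        if all(A[k][0] * x + A[k][1] * y <= b[k] + 1e-9 for k in range(len(A))):
            val = c[0] * x + c[1] * y
            if best_val is None or val < best_val:
                best_val, best_v = val, (x, y)
    return best_val, best_v

# Toy LPCC in v = (x, y):  minimize -x - y
#   s.t.  x + y <= 2,  x >= 0,  and  0 <= y complementary to w = 1 - x >= 0.
c = (-1.0, -1.0)
base = [((1.0, 1.0), 2.0),   # x + y <= 2
        ((-1.0, 0.0), 0.0),  # -x <= 0
        ((0.0, -1.0), 0.0),  # y >= 0   (one side of the complementarity)
        ((1.0, 0.0), 1.0)]   # x <= 1   (w = 1 - x >= 0, the other side)

# Branch directly on the complementarity y * w = 0: one child fixes y = 0,
# the other fixes w = 0 (i.e. x = 1).  No binary vector z, no big-M theta.
branches = {
    "y = 0": base + [((0.0, 1.0), 0.0)],    # y <= 0 together with y >= 0
    "w = 0": base + [((-1.0, 0.0), -1.0)],  # x >= 1 together with x <= 1
}

best = None
for name, cons in branches.items():
    A = [a for a, _ in cons]
    b = [r for _, r in cons]
    val, v = solve_lp_2d(c, A, b)
    if val is not None and (best is None or val < best[0]):
        best = (val, v, name)

print(best)  # -> (-2.0, (1.0, 1.0), 'w = 0')
```

With m complementarity pairs the same scheme enumerates at most 2^m leaves; in the actual algorithm, the LP relaxation bound at each node prunes most of these subtrees instead of solving them all.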


More information

MIXED INTEGER PROGRAMMING APPROACHES FOR NONLINEAR AND STOCHASTIC PROGRAMMING

MIXED INTEGER PROGRAMMING APPROACHES FOR NONLINEAR AND STOCHASTIC PROGRAMMING MIXED INTEGER PROGRAMMING APPROACHES FOR NONLINEAR AND STOCHASTIC PROGRAMMING A Thesis Presented to The Academic Faculty by Juan Pablo Vielma Centeno In Partial Fulfillment of the Requirements for the

More information

Lecture 8: Column Generation

Lecture 8: Column Generation Lecture 8: Column Generation (3 units) Outline Cutting stock problem Classical IP formulation Set covering formulation Column generation A dual perspective Vehicle routing problem 1 / 33 Cutting stock

More information

Introduction to optimization and operations research

Introduction to optimization and operations research Introduction to optimization and operations research David Pisinger, Fall 2002 1 Smoked ham (Chvatal 1.6, adapted from Greene et al. (1957)) A meat packing plant produces 480 hams, 400 pork bellies, and

More information

Lecture 1 Introduction

Lecture 1 Introduction L. Vandenberghe EE236A (Fall 2013-14) Lecture 1 Introduction course overview linear optimization examples history approximate syllabus basic definitions linear optimization in vector and matrix notation

More information

Convex Optimization. Dani Yogatama. School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA. February 12, 2014

Convex Optimization. Dani Yogatama. School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA. February 12, 2014 Convex Optimization Dani Yogatama School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA February 12, 2014 Dani Yogatama (Carnegie Mellon University) Convex Optimization February 12,

More information

Lecture 8: Column Generation

Lecture 8: Column Generation Lecture 8: Column Generation (3 units) Outline Cutting stock problem Classical IP formulation Set covering formulation Column generation A dual perspective 1 / 24 Cutting stock problem 2 / 24 Problem description

More information