Bilevel Integer Programming


Bilevel Integer Programming. Ted Ralphs, joint work with Scott DeNegre and Menal Guzelsoy. COR@L Lab, Department of Industrial and Systems Engineering, Lehigh University, and Norfolk Southern. University of Edinburgh, 11 May 2011.


Outline: 1 Introduction (Motivation, Setup); 2 Applications; 3 Special Cases (Recourse Problems, Continuous Second Stage, Other Cases); 4 Algorithms; 5 Implementation.


Motivation. A standard mathematical program models a set of decisions to be made simultaneously by a single decision-maker with a single objective. Many decision problems arising both in real-world applications and in the theory of integer programming involve multiple, independent decision-makers (DMs), multiple, possibly conflicting objectives, and/or hierarchical/multi-stage decisions. Modeling frameworks: Multiobjective Programming (multiple objectives, single DM); Mathematical Programming with Recourse (multiple stages, single DM); Multilevel Programming (multiple stages, multiple objectives, multiple DMs). Multilevel programming generalizes standard mathematical programming by modeling hierarchical decision problems, such as Stackelberg games. Such models arise in a remarkably wide array of applications.

Nash and Stackelberg Games. Many game-theoretic models can be formulated as optimization problems involving multiple decision makers. In a Nash game, the players are treated as equals and take simultaneous action. Computationally, one often wishes to find a Nash equilibrium, in which the action of each player is optimal given the actions of all other players. In a Stackelberg game, there is a dominant player, called the leader, who acts first, after which the other players react. In this case, one is concerned with determining the leader's decision, given the assumption that the followers will react optimally. This can often be modeled as a bilevel program.

Bilevel (Integer) Linear Programming. Formally, a bilevel linear program is described as follows: x ∈ X ⊆ R^{n_1} are the upper-level variables and y ∈ Y ⊆ R^{n_2} are the lower-level variables. Bilevel (Integer) Linear Program: max { c^1 x + d^1 y : x ∈ P_U ∩ X, y ∈ argmin{ d^2 y : y ∈ P_L(x) ∩ Y } } (MIBLP). The upper- and lower-level feasible regions are P_U = { x ∈ R^{n_1}_+ : A^1 x ≤ b^1 } and P_L(x) = { y ∈ R^{n_2}_+ : G^2 y ≥ b^2 − A^2 x }. We consider the general case in which X = Z^{p_1} × R^{n_1 − p_1} and Y = Z^{p_2} × R^{n_2 − p_2}.

Notation. Ω_I = {(x, y) ∈ (X × Y) : x ∈ P_U, y ∈ P_L(x)}; Ω = {(x, y) ∈ (R^{n_1} × R^{n_2}) : x ∈ P_U, y ∈ P_L(x)}; M_I(x) = argmin{ d^2 y : y ∈ P_L(x) ∩ Y }; F_I = { (x, y) : x ∈ P_U ∩ X, y ∈ M_I(x) }; F = { (x, y) : x ∈ P_U, y ∈ argmin{ d^2 y : y ∈ P_L(x) } }. Underlying bilevel linear program (BLP): max_{(x,y) ∈ F} c^1 x + d^1 y. Underlying mixed integer linear program (MILP): max_{(x,y) ∈ Ω_I} c^1 x + d^1 y. Underlying linear program (LP): max_{(x,y) ∈ Ω} c^1 x + d^1 y.

(Standard) Mixed Integer Linear Programs. In parts of the talk, we will need to consider a (standard) mixed integer linear program (MILP). To simplify matters, when we discuss a standard MILP, it will be of the form min{ c x : x ∈ P ∩ (Z^p × R^{n−p}) } (MILP), where P = { x ∈ R^n_+ : Ax = b }, A ∈ Q^{m×n}, b ∈ Q^m, c ∈ Q^n.

Basic Assumptions. For the remainder of the talk, we consider decision problems in which there are two DMs, a leader or upper-level DM and a follower or lower-level DM. We assume individual rationality of the two DMs, i.e., the leader can predict the follower's reaction to a given course of action. For simplicity, we also assume that for every action by the leader, the follower has a feasible reaction. The follower may in fact have more than one equally favorable reaction to a given action by the leader, and these alternatives may not be equally favorable to the leader. We assume that the leader may choose among the follower's alternatives; this assumption is reasonable if the players have a semi-cooperative relationship. We assume the feasible set F_I is nonempty and compact to ensure solutions exist.


Overview of Practical Applications. Hierarchical decision systems: government agencies; large corporations with multiple subsidiaries; markets with a single market-maker. Decision problems with recourse. Parties in direct conflict: zero-sum games; interdiction problems. Modeling robustness: the leader represents external phenomena that cannot be controlled, such as weather or external market conditions. Controlling optimized systems: the follower represents a system that is optimized by its nature, such as electrical networks or biological systems.

Example: Electricity Networks (Bienstock and Verma (2008)). As we know, electricity networks operate according to principles of optimization: given a network, determining the power flows is an optimization problem. Suppose we wish to know the minimum number of links that need to be removed from the network in order to cause a failure. This can be viewed as a Stackelberg game. Note that in this case neither the leader nor the follower is a cognizant decision-maker.

Overview of Technical Applications. Bilevel structure is inherent in many decision problems that occur within branch and cut and other algorithms based on disjunctive methods. In many cases, we would like to choose the most effective disjunction according to some criterion. The choice of disjunction is thus an optimization problem that itself has bilevel structure. The bilevel nature arises from the fact that the effectiveness of the disjunction is usually evaluated by solving another optimization problem. Examples: choosing the valid inequality with the largest violation; choosing a branching disjunction that achieves maximal bound improvement.

Example: Capacity Constraints for CVRP. In the Capacitated Vehicle Routing Problem (CVRP), the capacity constraints are of the form Σ_{e={i,j} ∈ E, i ∈ S, j ∉ S} x_e ≥ 2b(S) for all S ⊆ N, |S| > 1, (1) where b(S) is any lower bound on the number of vehicles required to serve the customers in set S. By defining binary variables y_i = 1 if customer i belongs to S̄, and z_e = 1 if edge e belongs to δ(S̄), we obtain the following bilevel formulation of the separation problem: min Σ_{e ∈ E} x̂_e z_e − 2b(S̄) (2) subject to z_e ≥ y_i − y_j for all e = {i, j} ∈ E, (3) z_e ≥ y_j − y_i for all e = {i, j} ∈ E, (4) b(S̄) = max{ b(S̄) : b(S̄) is a valid lower bound }. (5)

Example: Capacity Constraints for CVRP (cont'd). If the bin packing problem is used at the lower level, the formulation becomes: min Σ_{e ∈ E} x̂_e z_e − 2b(S̄) (6) subject to z_e ≥ y_i − y_j for all e = {i, j}, (7) z_e ≥ y_j − y_i for all e = {i, j}, (8) b(S̄) = min Σ_{l=1}^{n} h_l (9) subject to Σ_{l=1}^{n} w_i^l = y_i for all i ∈ N, (10) Σ_{i ∈ N} d_i w_i^l ≤ K h_l, l = 1, ..., n, (11) where we introduce the additional binary variables w_i^l = 1 if customer i is served by vehicle l, and h_l = 1 if vehicle l is used.


Recourse Problems. If d^1 = −d^2 (so that the leader's and follower's objectives for y coincide), we can view this as a mathematical program with recourse. We can reformulate the bilevel program as min{ −c^1 x + Q(x) : x ∈ P_U ∩ X }, (12) where Q(x) = min{ d^2 y : y ∈ P_L(x) ∩ Y }. (13) The function Q is known as the value function of the recourse problem.

Recourse Problems with Continuous Second Stage. If Y = R^{n_2}, it is well known by LP duality that Q(x) = max{ u(b^2 − A^2 x) : uG^2 ≤ d^2, u ∈ R^{m_2}_+ }, (14) so we can further reformulate (12) as min{ −c^1 x + z : x ∈ P_U ∩ X, z ≥ u^i(b^2 − A^2 x) for all i ∈ D }, (15) where D is a set indexing the extreme points of the dual polyhedron D = { u ∈ R^{m_2}_+ : uG^2 ≤ d^2 }. (16)

Benders Decomposition. We can then solve (12) by Benders decomposition. This amounts to solving (15) by cut generation. The value function of the lower-level problem is convex in the upper-level variables, which is what makes the problem tractable. (Figure: the LP value function z_LP(·) and a supporting Benders cut z_LP(b) + u(v − b) at the point b.)
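To make the cut-generation loop concrete, here is a minimal Benders sketch for the recourse reformulation (15) using SciPy's linprog. Everything instance-specific here is an assumption made for illustration: the tiny data, the treatment of the upper-level variable as continuous, the crude initial lower bound on z, and the assumption that the lower-level dual is always feasible and bounded.

```python
# Minimal Benders loop for min c1 x + Q(x), Q(x) = min{d2 y : G2 y >= b2 - A2 x, y >= 0}.
# Illustrative sketch only; the data below are made up.
import numpy as np
from scipy.optimize import linprog

c1 = np.array([1.0]); A1 = np.array([[1.0]]); b1 = np.array([10.0])            # upper level
d2 = np.array([2.0]); G2 = np.array([[1.0]]); A2 = np.array([[1.0]]); b2 = np.array([8.0])

def solve_subproblem_dual(x_hat):
    """max u(b2 - A2 x_hat) s.t. u G2 <= d2, u >= 0, solved as a minimization."""
    rhs = b2 - A2 @ x_hat
    res = linprog(-rhs, A_ub=G2.T, b_ub=d2, bounds=[(0, None)] * len(b2), method="highs")
    return res.x, rhs @ res.x                     # optimal dual u and Q(x_hat)

cuts_A, cuts_b = [], []                           # rows of the form -(u A2) x - z <= -(u b2)
for it in range(20):
    # Master: min c1 x + z  s.t.  A1 x <= b1 and the Benders cuts collected so far.
    n1 = len(c1)
    A_ub = [np.append(row, 0.0) for row in A1] + cuts_A
    b_ub = list(b1) + cuts_b
    bounds = [(0, None)] * n1 + [(-1e6, None)]    # crude initial lower bound on z
    res = linprog(np.append(c1, 1.0), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=bounds, method="highs")
    x_hat, z_hat = res.x[:n1], res.x[n1]
    u, Q = solve_subproblem_dual(x_hat)
    print(f"iter {it}: x = {x_hat}, z = {z_hat:.2f}, Q(x) = {Q:.2f}")
    if Q <= z_hat + 1e-6:                         # z already covers the recourse cost
        break
    cuts_A.append(np.append(-(u @ A2), -1.0))     # adds the cut z >= u(b2 - A2 x)
    cuts_b.append(-(u @ b2))
```

On this toy instance the loop converges in three iterations to x = 8 with total cost 8; the point is only to show the master/subproblem interaction behind (15).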

Two-Stage Stochastic Integer Programs. Consider the two-stage stochastic mixed integer program min{ c^1 x + E_ξ[Q_ξ(x)] : x ∈ P_U ∩ X }, (17) where Q_ξ(x) = min{ d^2 y : y ∈ Y, G^2 y ≥ ω(ξ) − A^2 x }, (18) ξ is a random variable over a probability space (Ξ, F, P), and ω(ξ) ∈ R^{m_2} for each ξ ∈ Ξ. If the distribution of ξ is discrete and has finite support, then (17) is a bilevel program.

Continuous Second Stage. In general, if Y = R^{n_2}, then the lower-level problem can be replaced with its optimality conditions. The optimality conditions for the lower-level optimization problem are G^2 y ≥ b^2 − A^2 x, uG^2 ≤ d^2, u(b^2 − A^2 x − G^2 y) = 0, (d^2 − uG^2) y = 0, u, y ≥ 0. When X = R^{n_1}, this is a special case of a class of nonlinear mathematical programs known as mathematical programs with equilibrium constraints (MPECs). An MPEC can be solved in a number of ways, including converting it to a standard integer program. Note that in this case, the value function of the lower-level problem is piecewise linear, but not necessarily convex.
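One standard way to carry out that conversion to a mixed integer program is to linearize the complementarity conditions with binary variables and a big-M constant. The sketch below assumes a valid bound M on the dual values and slacks; M and the binaries w, v are notation introduced here, not taken from the talk.

```latex
% Linearization of the two complementarity conditions above, assuming a valid
% bound M on slacks and duals (M, w, v are assumptions of this sketch).
\begin{align*}
  u_i \le M w_i, &\qquad (G^2 y - b^2 + A^2 x)_i \le M (1 - w_i), & w_i \in \{0,1\},\\
  y_j \le M v_j, &\qquad (d^2 - u G^2)_j \le M (1 - v_j),         & v_j \in \{0,1\}.
\end{align*}
```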

Other Cases. Pure integer. Positive constraint matrix at the lower level. Binary variables at the upper and/or lower level. Interdiction problems. Mixed Integer Interdiction: max_{x ∈ P^I_U} min_{y ∈ P^I_L(x)} d^2 y (MIPINT), where P^I_U = { x ∈ B^{n_2} : A^1 x ≤ b^1 } and P^I_L(x) = { y ∈ Z^{p_2} × R^{n_2 − p_2} : G^2 y ≥ b^2, y ≤ u(e − x) }. The case where the follower's problem has network structure is called the network interdiction problem and has been well studied. The model above allows for lower-level systems described by general MILPs.


Mixed Integer Bilevel Programs. We have seen that there are some cases of bilevel programming that are tractable. When some or all of the variables are discrete, the problem becomes more complex, both practically and theoretically. Many of the ideas from these special cases can be extended in principle. However, it's very difficult to obtain anything tractable in practice. An obvious question is whether we can generalize any of the very well-developed methodology we know for solving standard MILPs. We can do so to a certain extent, but the proper generalizations are not obvious.

Complexity. It is perhaps to be expected that general BMILPs are in a different complexity class than standard MILPs. This is because checking feasibility is itself an NP-complete problem for BMILPs. Even the case in which X = R^{n_1} and Y = R^{n_2} is NP-complete. Bilevel programming is (apparently) Σ^p_2-complete, though we do not have a formal proof for the general case. The class Σ^p_2 is one step higher in the so-called polynomial-time hierarchy than the class NP (= Σ^p_1). Roughly speaking, this class consists of problems that could be solved in nondeterministic polynomial time, given an oracle for problems in NP.

Example. Consider the following instance of (MIBLP) from Moore and Bard (1990): max_{x ∈ Z} x + 10y subject to y ∈ argmin{ y : −25x + 20y ≤ 30, x + 2y ≤ 10, 2x − y ≤ 15, 2x + 10y ≥ 15, y ∈ Z }. (Figure: the regions F, F_I, conv(Ω_I), and conv(F_I) for this instance, with x on the horizontal axis and y on the vertical axis.) From the figure, we can make several observations: 1. F ⊆ Ω, F_I ⊆ Ω_I, and Ω_I ⊆ Ω. 2. F_I ⊄ F. 3. Solutions to (MIBLP) do not occur at extreme points of conv(Ω_I).

Properties of MIBLPs. In this example, optimizing over F yields the integer solution (8, 1), with upper-level objective value 18. Imposing integrality yields the solution (2, 2), with upper-level objective value 22. From this we can make two important observations: 1. The objective value obtained by relaxing integrality is not a valid bound on the optimal solution value of the original problem. 2. Even when solutions to max_{(x,y) ∈ F} c^1 x + d^1 y are in F_I, they are not necessarily optimal. Thus, some familiar properties from the MILP case do not hold here.
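A quick brute-force check of this small instance reproduces the two solutions and values quoted above. This is a sketch only: it assumes the constraint set as reconstructed above, and the enumeration ranges and the grid used for the continuous case are arbitrary choices large enough to cover the plotted region.

```python
# Brute-force check of the Moore-Bard example (illustrative sketch only).

def lower_feasible(x, y):
    """Lower-level constraints for a given (x, y)."""
    return (-25 * x + 20 * y <= 30 and
            x + 2 * y <= 10 and
            2 * x - y <= 15 and
            2 * x + 10 * y >= 15)

def follower_best_integer(x):
    """Follower minimizes y over the integer lower-level feasible set."""
    ys = [y for y in range(-20, 21) if lower_feasible(x, y)]
    return min(ys) if ys else None

# Bilevel optimum with integrality (over F_I): leader maximizes x + 10y.
best = None
for x in range(0, 11):
    y = follower_best_integer(x)
    if y is not None and (best is None or x + 10 * y > best[0]):
        best = (x + 10 * y, x, y)
print("integer bilevel optimum:", best)          # expected (22, 2, 2)

# Relaxing lower-level integrality (optimize over F, via a grid of leader choices).
def follower_best_continuous(x):
    lo = max((15 - 2 * x) / 10.0, 2 * x - 15)       # lower bounds on y
    hi = min((30 + 25 * x) / 20.0, (10 - x) / 2.0)  # upper bounds on y
    return lo if lo <= hi else None

best_rel = None
for x in [i / 100.0 for i in range(0, 801)]:
    y = follower_best_continuous(x)
    if y is not None and (best_rel is None or x + 10 * y > best_rel[0]):
        best_rel = (x + 10 * y, x, y)
print("optimum over F (grid search):", best_rel)  # expected roughly (18, 8, 1)
```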

Understanding the Structure. The key to understanding the structure of an MIBLP is to consider the value function associated with the lower-level problem. The value function of the instance (MILP) is a function z : R^m → R ∪ {±∞} defined as follows: z(d) = min_{x ∈ S(d)} c x, (19) where, for a given right-hand side vector d ∈ R^m, S(d) = { x ∈ Z^p_+ × R^{n−p}_+ : Ax = d }. In what follows, we set Θ = { d ∈ R^m : S(d) ≠ ∅ }.

Example: Value Function. z(d) = min (1/2)x_1 + 2x_3 + x_4 subject to x_1 − (3/2)x_2 + x_3 − x_4 = d and x_1, x_2 ∈ Z_+, x_3, x_4 ∈ R_+. (Figure: plot of the piecewise linear, lower semi-continuous value function z(d) for d ∈ [−4, 4].)
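The piecewise structure of this value function is easy to explore numerically. The sketch below evaluates z(d) by enumerating the integer variables and filling the remaining gap with the cheaper continuous variable; the enumeration bound is an arbitrary choice that is large enough for the plotted range.

```python
# Evaluate the example value function z(d) by enumeration (illustrative sketch).

def z(d, max_int=20):
    best = float("inf")
    for x1 in range(max_int + 1):
        for x2 in range(max_int + 1):
            r = d - x1 + 1.5 * x2           # residual to be covered by x3 - x4
            cont = 2 * r if r >= 0 else -r  # x3 costs 2 per unit, x4 costs 1
            best = min(best, 0.5 * x1 + cont)
    return best

for d in [-1.5, -1.0, -0.5, 0.0, 0.25, 0.5, 1.0, 1.5, 2.0]:
    print(f"z({d:5.2f}) = {z(d):.2f}")
```

For instance, this prints z(0) = 0, z(1) = 0.5, and z(−1.5) = 0, showing the repeated pattern of minima visible in the figure.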

Value Function Reformulation. If we knew the value function explicitly, we could reformulate (MIBLP) as max c^1 x + d^1 y subject to A^1 x ≤ b^1, G^2 y ≥ b^2 − A^2 x, d^2 y = z_LL(b^2 − A^2 x), x ∈ X, y ∈ Y, where z_LL is the value function of the lower-level problem. This is, in principle, a standard mathematical program. It is easy to see why relaxing integrality does not yield a valid bound: in that case, we are effectively replacing z_LL with the value function of the LP relaxation (more on this soon).

Properties of the Value Function. It is subadditive over Θ. It is piecewise polyhedral and lower semi-continuous. For an ILP, it can be computed by a finite number of limited operations on elements of the RHS: (i) rational multiplication, (ii) nonnegative combination, (iii) rounding, and (iv) taking the maximum. Functions built from operations (i)-(iii) are Chvátal functions; adding (iv) yields Gomory functions.

Properties of the Value Function (cont.). There is a one-to-one correspondence between ILP instances and Gomory functions. The Jeroslow Formula shows that the value function of an MILP can be constructed from the value function of an associated ILP. In particular, the value function of the earlier example can be written explicitly as the minimum of a small number of Gomory functions, each built from rounding and maximum operations applied to d.

Approximating the Value Function. In general, it is difficult to construct the value function explicitly. We therefore propose to approximate the value function by either upper or lower bounding functions. Lower bounds are derived by considering the value function of relaxations of the original problem (relax constraints) or by constructing dual functions. Upper bounds are derived by considering the value function of restrictions of the original problem (fix variables).

The LP Relaxation. The value function of the LP relaxation of the original problem provides an easy lower bound: F_LP(d) = max{ vd : vA ≤ c, v ∈ R^m }. By linear programming duality theory, we have F_LP(d) ≤ z(d) for all d ∈ R^m. Of course, F_LP is not necessarily strong.

Example: LP Dual Function. For the example, F_LP(d) = max{ vd : 0 ≤ v ≤ 1/2, v ∈ R }, which can be written explicitly as F_LP(d) = 0 for d ≤ 0 and F_LP(d) = (1/2)d for d > 0. (Figure: the value function z(d) together with the LP dual function F_LP(d).)

Relaxing Linear Constraints. Consider the value functions of the single-row relaxations: z_i(q) = min{ cx : a_i x = q, x ∈ Z^r_+ × R^{n−r}_+ }, q ∈ R, i ∈ M ≡ {1, ..., m}, where a_i is the i-th row of A. Theorem 1. Let F(d) = max_{i ∈ M} z_i(d_i) for d = (d_1, ..., d_m) ∈ R^m. Then F is subadditive and F(d) ≤ z(d) for all d ∈ R^m. We know a lot about the structure of the value function of single-row relaxations.

Dual Functions. A dual function F : R^m → R is one that satisfies F(d) ≤ z(d) for all d ∈ R^m. This is a generalization of the concept of a dual solution from the LP case. How do we select such a function? We generally choose one for which F(b) is as close as possible to z(b) for some particular b ∈ R^m of interest. This results in the following dual problem: z_D = max{ F(b) : F(d) ≤ z(d) for all d ∈ R^m, F ∈ Υ^m }, where Υ^m ⊆ { f : R^m → R }. We call F* strong for this instance if F* is a feasible dual function and F*(b) = z(b). This dual problem always has a solution F* that is strong if the value function is bounded and Υ^m ≡ { f : R^m → R }. Why?

Example: Feasible Dual Functions. (Figure: the value function z(d) together with several feasible dual functions F(d).) Notice how different dual solutions are optimal for some right-hand sides and not for others. Only the value function is optimal for all right-hand sides.

The Subadditive Dual. Let a function F be defined over a domain V. Then F is subadditive if F(v_1) + F(v_2) ≥ F(v_1 + v_2) for all v_1, v_2, v_1 + v_2 ∈ V. Recall that the value function z is subadditive over Θ. If Υ^m ≡ Γ^m ≡ { F : R^m → R : F is subadditive, F(0) = 0 }, we can rewrite the dual problem above as the subadditive dual: z_D = max F(b) subject to F(a_j) ≤ c_j, j = 1, ..., r, F̄(a_j) ≤ c_j, j = r + 1, ..., n, F ∈ Γ^m, where the function F̄ is defined by F̄(d) = lim sup_{δ → 0^+} F(δd)/δ for d ∈ R^m. Here, F̄ is the upper d-directional derivative of F at zero.

Example: Upper D-directional Derivative. The upper d-directional derivative can be interpreted as the slope of the value function in direction d at 0. (Figure: the value function z(d) and its upper d-directional derivative at zero for the example.)

Example: Subadditive Dual. For our IP instance, the subadditive dual problem is max F(b) subject to F(1) ≤ 1/2, F(−3/2) ≤ 0, F̄(1) ≤ 2, F̄(−1) ≤ 1, F ∈ Γ^1, and we have the following feasible dual functions: 1. F_1(d) = (1/2)d is an optimal dual function for b ∈ {0, 1, 2, ...}. 2. F_2(d) = 0 is an optimal function for b ∈ {..., −3, −3/2, 0}. 3. F_3(d), a Gomory function built from rounding operations on d, is an optimal function for b ∈ [0, 1/4] ∪ [1, 5/4] ∪ [2, 9/4] ∪ .... 4. F_4(d), a similar Gomory function, is an optimal function for b ∈ ... ∪ [−7/2, −3] ∪ [−2, −3/2] ∪ [−1/2, 0].

Properties of the Subadditive Dual. Using the subadditive dual, we can generalize many of the properties of the LP dual: weak/strong duality, the Farkas Lemma, and optimality conditions (complementary slackness).

Reformulation with Optimality Conditions. In principle, we can use subadditive duality to obtain optimality conditions for the lower-level problem (the reformulation shown here is for the pure integer case): max_{x,y,F} c^1 x + d^1 y subject to A^1 x ≤ b^1, A^2 x + G^2 y ≥ b^2, F(g_j) ≤ d^2_j, j = 1, ..., n_2, (F(g_j) − d^2_j) y_j = 0, j = 1, ..., n_2, Σ_{j=1}^{n_2} F(g_j) y_j = F(b^2 − A^2 x), x ∈ Z^{n_1}_+, y ∈ Z^{n_2}_+, F ∈ Γ^{m_2}. This is analogous to the reformulation in the continuous case, but is intractable in general.

Constructing Dual Functions. Explicit construction: the value function; generating functions. Relaxations: Lagrangian relaxation; quadratic Lagrangian relaxation; corrected linear dual functions. Primal solution algorithms: cutting plane method; branch-and-bound method; branch-and-cut method. For the remainder of this part of the talk, we consider the instance (MILP).

Branch-and-Bound Method. Assume that the primal problem is solved to optimality by branch and bound and let T be the set of leaf nodes of the search tree. Thus, we have solved the LP relaxation of the following problem at each node t ∈ T: z^t(b) = min cx subject to x ∈ S^t(b), where S^t(b) = { x ∈ Z^r × R^{n−r} : Ax = b, x ≥ l^t, x ≤ u^t } and u^t, l^t ∈ Z^r are the branching bounds applied to the integer variables. Let (v^t, v̲^t, v̄^t) be the dual feasible solution used to prune node t, if t is feasibly pruned, or a dual feasible solution (which can be obtained from its parent) for node t, if t is infeasibly pruned. Then F_BB(d) = min_{t ∈ T} { v^t d + v̲^t l^t − v̄^t u^t } is an optimal solution to the generalized dual problem.
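Assembling F_BB from stored leaf-node information is mechanical. The sketch below assumes a hypothetical data structure that records, for each leaf, the dual vector for Ax = b and the duals on the branching bounds; nothing here is tied to a particular solver.

```python
# Sketch of F_BB(d) = min over leaves t of v^t d + vlow^t l^t - vup^t u^t.
import numpy as np

def F_BB(d, leaves):
    """leaves: iterable of (v, vlow, vup, l, u) NumPy arrays, one tuple per leaf node t."""
    return min(v @ d + vlow @ l - vup @ u for (v, vlow, vup, l, u) in leaves)
```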

Aggregated Lower Bounding Approximations. Note that we can combine dual functions to better approximate the value function. This is similar to what is done to approximate the LP value function in Benders decomposition. Let F_1 be the set of dual functions obtained from single-row relaxations and F_2 be the set of dual functions obtained from primal solution procedures for each b ∈ U ⊆ R^m, where U is some collection of right-hand sides. Then we call the dual function F defined by F(d) = max_{f ∈ F_1 ∪ F_2} f(d) for d ∈ R^m a global approximation of the value function.

Upper Bounding Approximations. Just as with the lower bounding approximations, we are interested in a function that is a valid upper bound on the value function. Since upper bounds are closely related to feasibility, it is harder to obtain such a function than to obtain dual/lower bounding functions. One possible approach is to use the maximal subadditive extension result: if f(d) ≥ z(d) for all d ∈ [0, q], then the maximal subadditive extension of f from [0, q] to R^m_+ is an upper bounding function.

Upper Bounding Approximations (cont.). Another way is to consider the continuous restriction of the primal instance. Assume that { u ∈ R^m : uA_C ≤ c_C } is nonempty and bounded. Let z_C(d) = max{ vd : v ∈ V }, where V is the set of extreme points of this dual polytope. Then z_C(d) ≥ z(d) for all d ∈ R^m. Furthermore, we can translate z_C to a given right-hand side b to obtain strong upper bounding functions. Theorem. Let x* be an optimal solution to the primal problem with right-hand side b. Define the function f by f(d) = c_I x*_I + z_C(d − A_I x*_I) for d ∈ R^m. Then f(d) ≥ z(d) for all d ∈ R^m with f(b) = z(b), and hence f is a strong upper bounding function at b.

Example. min 3x_1 + (7/2)x_2 + 3x_3 + 6x_4 + 7x_5 + 5x_6 subject to 6x_1 + 5x_2 − 4x_3 + 2x_4 − 7x_5 + x_6 = b and x_1, x_2, x_3 ∈ Z_+, x_4, x_5, x_6 ∈ R_+. (SP) (Figure: plot of the value function of (SP) over a range of right-hand sides d.)

(Figure: Upper bounding functions f_1, ..., f_5 obtained at right-hand sides b_i, i = 1, ..., 5, plotted against the value function z.)

Aggregated Upper Bounding Approximations. Let F_U be the set of upper bounding functions obtained for each b ∈ U ⊆ R^m, where U is some collection of right-hand sides. Then F(d) = min_{f ∈ F_U} f(d) for d ∈ R^m is a global upper approximation of the value function.

A Branch-and-Cut Algorithm. Putting all of this together, we propose a branch-and-bound approach. Components: bounding methods (this talk); branching methods (this talk); search strategies; preprocessing methods; primal heuristics. In the remainder of the talk, we address the development of these components.

Lower Bounds. Lower bounds can be obtained by relaxing the value function constraint: max c^1 x + d^1 y subject to A^1 x ≤ b^1, G^2 y ≥ b^2 − A^2 x, d^2 y ≤ z̄_LL(b^2 − A^2 x), x ∈ X, y ∈ Y, where z̄_LL is an upper approximation of the value function of the lower-level problem. The upper approximation z̄_LL is assumed to be piecewise linear and generated dynamically. If the number of pieces is small, we can reformulate the above in the usual way using integer variables.

Bilevel Feasibility Check. Let (x̂, ŷ) be a solution to the lower bounding problem. We fix x = x̂ and solve the lower-level problem with the fixed upper-level solution: min{ d^2 y : y ∈ P^I_L(x̂) }. (20) Let y* be the solution to (20). Then (x̂, y*) is bilevel feasible, and c^1 x̂ + d^1 y* gives a valid bound on the optimal value of the original MIBLP. Either 1. d^2 ŷ = d^2 y*, in which case (x̂, ŷ) is bilevel feasible, or 2. d^2 ŷ > d^2 y*, in which case (x̂, ŷ) is bilevel infeasible. What do we do in the case of bilevel infeasibility? Generate a valid inequality violated by (x̂, ŷ); improve our approximation of the value function so that (x̂, ŷ) is no longer feasible; or branch on a disjunction violated by (x̂, ŷ).
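The check itself is simple to state in code. In the sketch below, solve_lower_level is a placeholder for whatever MILP solver is used in practice; here it is a brute-force enumeration suitable only for tiny pure-integer lower-level problems, and the data layout is an assumption of this sketch rather than anything from MibS.

```python
# Sketch of the bilevel feasibility check for a tiny pure-integer lower level.
import itertools
import numpy as np

def solve_lower_level(d2, G2, b2, A2, x_hat, y_range):
    """min d2.y  s.t.  G2 y >= b2 - A2 x_hat,  y integer with entries in y_range."""
    rhs = b2 - A2 @ x_hat
    best_y, best_val = None, np.inf
    for y in itertools.product(y_range, repeat=len(d2)):
        y = np.array(y)
        if np.all(G2 @ y >= rhs) and d2 @ y < best_val:
            best_y, best_val = y, d2 @ y
    return best_y, best_val

def bilevel_feasible(x_hat, y_hat, d2, G2, b2, A2, y_range):
    """True if (x_hat, y_hat) passes the check, plus the follower's best response y*."""
    y_star, val_star = solve_lower_level(d2, G2, b2, A2, x_hat, y_range)
    return bool(np.isclose(d2 @ y_hat, val_star)), y_star
```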

Bilevel Feasibility Cut (Pure Integer Case). Let Ã := [A^1; A^2], G̃ := [0; G^2], and b̃ := [b^1; b^2], where we assume the rows have been scaled so that every constraint of the lower bounding problem reads ã_i x + g̃_i y ≤ b̃_i. A basic feasible solution (x̂, ŷ) ∈ Ω_I to the lower bounding problem is the unique solution to ã_i x + g̃_i y = b̃_i, i ∈ I, where I is the set of active constraints at (x̂, ŷ). This implies that { (x, y) ∈ Ω_I : Σ_{i ∈ I} (ã_i x + g̃_i y) = Σ_{i ∈ I} b̃_i } = { (x̂, ŷ) }, and Σ_{i ∈ I} (ã_i x + g̃_i y) ≤ Σ_{i ∈ I} b̃_i is valid for Ω.

Bilevel Feasibility Cut (cont.). A Valid Inequality: Σ_{i ∈ I} (ã_i x + g̃_i y) ≤ Σ_{i ∈ I} b̃_i − 1 for all (x, y) ∈ Ω_I \ { (x̂, ŷ) }. (Figure: a small example of the form max_x min_y { y : ..., x, y ∈ Z_+ }, illustrating the cut geometrically.) This yields a finite algorithm in the pure integer case.
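Given a constraint system in the ≤ form assumed above, assembling the cut from the active rows takes a few lines of linear algebra. The sketch below is illustrative only; identifying the active set by a simple tolerance test is an assumption, not how a production code would necessarily do it.

```python
# Sketch of building the bilevel feasibility cut from the rows tight at (x_hat, y_hat).
import numpy as np

def feasibility_cut(A_tilde, G_tilde, b_tilde, x_hat, y_hat, tol=1e-6):
    lhs = A_tilde @ x_hat + G_tilde @ y_hat
    active = np.abs(lhs - b_tilde) <= tol      # index set I of tight constraints
    alpha = A_tilde[active].sum(axis=0)        # coefficients on x
    gamma = G_tilde[active].sum(axis=0)        # coefficients on y
    beta = b_tilde[active].sum() - 1           # right-hand side
    return alpha, gamma, beta                  # cut: alpha x + gamma y <= beta
```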

Value Function Disjunction (Single Constraint Case). (Figure: two dual functions f(·, ζ^C) and f(·, η^C) of the lower-level value function, crossing at a right-hand side d̄.) For any d ≤ d̄, z(d) ≥ max{ f(d, ζ^C), f(d, η^C) } = f(d, ζ^C). Similarly, for any d ≥ d̄, z(d) ≥ max{ f(d, ζ^C), f(d, η^C) } = f(d, η^C).

Value Function Disjunction (cont.). Thus, we have the following disjunction. Bilevel Feasibility Disjunction: either b^2 − A^2 x ≤ b^2 − A^2 x̂ and d^2 y ≥ f(b^2 − A^2 x̂, ζ^C), or b^2 − A^2 x ≥ b^2 − A^2 x̂ and d^2 y ≥ f(b^2 − A^2 x̂, η^C). Such a disjunction can be used to either branch or cut when solutions (x̂, ŷ) ∈ Ω_I such that ŷ ∉ M_I(x̂) are found.

Algorithms Based on Lower Approximation. It is more difficult to develop algorithms based on lower approximations in the general case. The main challenge is that we need to know the lower-level solution at each step. This is different from the case of recourse problems. We think it may be possible to overcome these challenges, but these ideas are preliminary.


Implementation. The Mixed Integer Bilevel Solver (MibS) implements the branch-and-bound framework described here using software available from the Computational Infrastructure for Operations Research (COIN-OR) repository. COIN-OR components used: the COIN High Performance Parallel Search (CHiPPS) framework to perform the branch and bound; the COIN Branch and Cut (CBC) framework for solving the MILPs; the COIN LP Solver (CLP) framework for solving the LPs arising in the branch and cut; the Cut Generation Library (CGL) for generating cutting planes within CBC; and the Open Solver Interface (OSI) for interfacing with CBC and CLP.

What Is Implemented. MibS is still in its infancy and is not fully general. Currently, we have: bilevel feasibility cuts (pure integer case); specialized methods (primarily cuts) for pure binary variables at the upper level; specialized methods for interdiction problems; disjunctive cuts based on the value function for lower-level problems with a single constraint; several primal heuristics; and simple preprocessing.

Preliminary Results from Knapsack Interdiction.

       Maximum infeasibility                 Strong branching
n      nodes      depth    CPU (s)           nodes       depth    CPU (s)
20     359.30     8.65     9.3               358.30      8.65     11.07
22     658.40     9.85     18.50             658.0       9.85     18.9
24     1414.80    10.85    46.03             1410.80     10.75    46.46
26     75.00      1.05     97.55             73.50       1.05     100.17
28     536.40     1.90     14.97             538.60      1.95     0.6
30     1065.00    14.05    48.70             10638.00    14.10    538.3

Interdiction problems in which the lower-level problems are binary knapsack problems. Data were taken from the Multiple Criteria Decision Making library and modified to suit our setting. Results for each problem size reflect the average of 20 instances. These instances were run using the interdiction customization.

Preliminary Results from Assignment Interdiction.

Instance    Nodes    Depth    CPU (s)
AP05-1      603      33       90.5
AP05-2      3881     3        384.97
AP05-3      3909     3        05.93
AP05-4      441      36       10.66
AP05-5      3505     33       119.18
AP05-6      031      35       80.31
AP05-7      957      9        153.0
AP05-8      3549     3        4.77
AP05-9      71       33       111.13
AP05-10     399      31       11.07
AP05-11     707      33       35.13
AP05-12     407      18       9.51
AP05-13     391      18       3.80
AP05-14     3173     8        61.08
AP05-15     509      3        17.05
AP05-16     1699     9        44.61
AP05-17     5417     9        01.34
AP05-18     5785     3        176.67
AP05-19     59       3        79.70
AP05-20     585      31       77.35
AP05-21     6039     33       161.44
AP05-22     479      9        48.06
AP05-23     1519     5        49.40
AP05-24     15       5        1.3
AP05-25     3857     31       115.97

Here, the lower-level problems are binary assignment problems. Data were also taken from the Multiple Criteria Decision Making library. Problems have 50 variables and 45 constraints.

Preliminary Results Solving Bilevel ILPs.

Problem Class    Num Rows    Num Cols    Num Lower
1                20          20          5
2                20          20          10
3                20          20          15
4                30          20          10
5                40          20          10

Table: IBLP instance class description. 10 random ILP instances with two objectives and a randomly chosen set of lower-level columns. Coefficients chosen in the range [−50, 50]. No upper-level constraints.

Preliminary Results Solving Bilevel ILPs.

Class    No. Optimal    Avg. Gap (%)    Avg. No. Nodes    Avg. No. Cuts    Avg. CPU (s)
1        9              5.10            116755.40         50168.0          18.31
2        5              31.88           46590.60          9046.50          1596.71
3        -              46.61           47945.90          8578.40          166.73
4        5              61.3            43997.80          3515.60          771.89
5        6              4.34            347703.70         139005.0         108.47

Table: Results by instance class. Computational experiments were performed on a cluster with dual 8-core Opteron processors and 3G memory. A time limit of 8 hours was used. These results are for MibS with only feasibility cuts (no heuristics or generic MILP cuts).

Conclusions and Future Work. Preliminary testing to date has revealed that these problems can be extremely difficult to solve in practice. What we have implemented so far has only scratched the surface. Currently, we are focusing on special cases where we can get traction: interdiction problems and stochastic integer programs. Much work remains to be done. Please join us!

References. Bienstock, D. and A. Verma (2008). The N-k problem in power grids: New models, formulations and computation. Available at http://www.columbia.edu/~dano/papers/nmk.pdf. Moore, J. and J. Bard (1990). The mixed integer linear bilevel programming problem. Operations Research 38(5), 911-921.