A Benders Algorithm for Two-Stage Stochastic Optimization Problems With Mixed Integer Recourse


A Benders Algorithm for Two-Stage Stochastic Optimization Problems With Mixed Integer Recourse Ted Ralphs 1 Joint work with Menal Güzelsoy 2 and Anahita Hassanzadeh 1 1 COR@L Lab, Department of Industrial and Systems Engineering, Lehigh University 2 SAS Institute, Advanced Analytics, Operations Research R & D INFORMS Computing Society Conference, Richmond, VA, January 11, 2015

Outline 1 Introduction 2 Value Function 3 Algorithms 4 Conclusions and Future Work

Outline 1 Introduction 2 Value Function 3 Algorithms 4 Conclusions and Future Work

Two-Stage Mixed Integer Optimization

We have the following general formulation:

    z_SMILP = min_{x ∈ P_1} Ψ(x) = min_{x ∈ P_1} { c⊤x + Ξ(x) },   (1)

where

    P_1 = { x ∈ X : Ax = b, x ≥ 0 }   (2)

is the first-stage feasible region, with X = Z^{r_1}_+ × R^{n_1 − r_1}_+. Ξ represents the impact of future uncertainty. The canonical form employed in stochastic programming with recourse is

    Ξ(x) = E_{ω ∈ Ω} [ φ(h_ω − T_ω x) ],   (3)

where φ is the second-stage value function, to be defined shortly, and T_ω ∈ Q^{m_2 × n_1} and h_ω ∈ Q^{m_2} represent the input to the second-stage problem for scenario ω ∈ Ω.

The Second-Stage Value Function

The structure of the objective function Ψ depends primarily on the structure of the value function

    φ(β) = min_{y ∈ P_2(β)} q⊤y,   (RP)

where

    P_2(β) = { y ∈ Y : Wy = β }   (4)

is the second-stage feasible region with respect to a given right-hand side β, and Y = Z^{r_2}_+ × R^{n_2 − r_2}_+. The second-stage problem is parameterized on the unknown value β of the right-hand side. This value is determined jointly by the realized value of ω and the values of the first-stage decision variables. We assume that ω follows a uniform distribution with finite support, that P_1 is compact, and that E_{ω ∈ Ω}[φ(h_ω − T_ω x)] is finite for all x ∈ X.
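Since Ω has finite support, the expectation in (3) reduces to a probability-weighted sum. A minimal sketch of evaluating Ξ(x), assuming a single-row T_ω and a caller-supplied value function φ (all names here are hypothetical, not from the deck):

```python
def expected_recourse(x, scenarios, phi):
    """Xi(x) = sum over scenarios of p * phi(h - T x), for a single-row T."""
    total = 0.0
    for p, h, T in scenarios:  # (probability, rhs h_w, row of T_w)
        beta = h - sum(t * xj for t, xj in zip(T, x))
        total += p * phi(beta)
    return total

# Toy usage: phi(beta) = |beta| as a stand-in value function.
scenarios = [(0.5, 6.0, [2.0, 0.5]), (0.5, 12.0, [2.0, 0.5])]
print(expected_recourse([1.0, 2.0], scenarios, abs))
```

Any representation of φ, exact or approximate, can be plugged in through the `phi` callable.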

Outline 1 Introduction 2 Value Function 3 Algorithms 4 Conclusions and Future Work

MILP Value Function (Pure)

The MILP value function is non-convex, discontinuous, and piecewise polyhedral in general.

Example 1

    φ(b) = min 3x_1 + (7/2)x_2 + 3x_3 + 6x_4 + 7x_5 + 5x_6
           s.t. 6x_1 + 5x_2 − 4x_3 + 2x_4 − 7x_5 + x_6 = b
                x_1, x_2, x_3, x_4, x_5, x_6 ∈ Z_+
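For a single-row pure-integer problem like Example 1, the value function can be tabulated by brute-force enumeration over a small box; a sketch only, since large |b| would need a larger box:

```python
from itertools import product

c = [3, 3.5, 3, 6, 7, 5]
a = [6, 5, -4, 2, -7, 1]

def phi(b, box=6):
    """Value function of Example 1 by brute force over x in {0,...,box-1}^6.
    A sketch: inf means no solution exists inside the box."""
    best = float("inf")
    for x in product(range(box), repeat=6):
        if sum(ai * xi for ai, xi in zip(a, x)) == b:
            best = min(best, sum(ci * xi for ci, xi in zip(c, x)))
    return best

print([phi(b) for b in range(-4, 5)])
```

Plotting these values over a range of b makes the non-convex, piecewise structure visible.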

MILP Value Function (Mixed)

Example 2

    φ(b) = min 3x_1 + (7/2)x_2 + 3x_3 + 6x_4 + 7x_5 + 5x_6
           s.t. 6x_1 + 5x_2 − 4x_3 + 2x_4 − 7x_5 + x_6 = b
                x_1, x_2, x_3 ∈ Z_+, x_4, x_5, x_6 ∈ R_+

Mixed Integer Two-Stage Problem

Example 3

    min f(x_1, x_2) = min −3x_1 − 4x_2 + E[Φ_MILP(ω − 2x_1 − 0.5x_2)]
    s.t. x_1 ≤ 5, x_2 ≤ 5
         x_1, x_2 ∈ R_+,   (Ex.SMP)

where ω ∈ {6, 12} with a uniform probability distribution.

Continuous and Integer Restriction of an MILP

Consider the general form of the second-stage value function

    φ(β) = min q_I⊤ y_I + q_C⊤ y_C
           s.t. W_I y_I + W_C y_C = β,  y ∈ Z^{r_2}_+ × R^{n_2 − r_2}_+.   (MILP)

Its structure is inherited from that of the continuous restriction

    φ_C(β) = min q_C⊤ y_C
             s.t. W_C y_C = β,  y_C ∈ R^{n_2 − r_2}_+   (CR)

and the similarly defined integer restriction

    φ_I(β) = min q_I⊤ y_I
             s.t. W_I y_I = β,  y_I ∈ Z^{r_2}_+.   (IR)

Value Function of the Continuous Restriction

Example 4

    φ_C(β) = min 6y_1 + 7y_2 + 5y_3
             s.t. 2y_1 − 7y_2 + y_3 = β
                  y_1, y_2, y_3 ∈ R_+
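For a single constraint, the continuous value function has a simple closed form: an optimal basic solution uses one variable, so φ_C(β) is β times the best cost-to-coefficient ratio among variables pointing in the needed direction. A sketch:

```python
def phi_c(beta, q, w):
    """Value function of min q.y s.t. w.y = beta, y >= 0 (single row).
    Best ratio q_j/w_j over variables whose coefficient matches beta's sign."""
    if beta == 0:
        return 0.0
    if beta > 0:
        ratios = [qj / wj for qj, wj in zip(q, w) if wj > 0]
        return beta * min(ratios) if ratios else float("inf")
    ratios = [qj / wj for qj, wj in zip(q, w) if wj < 0]
    return beta * max(ratios) if ratios else float("inf")

# Example 4: q = (6, 7, 5), w = (2, -7, 1)
print(phi_c(4, [6, 7, 5], [2, -7, 1]), phi_c(-7, [6, 7, 5], [2, -7, 1]))
```

The result is the expected piecewise linear convex function with a single breakpoint at the origin.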

Points of Strict Local Convexity

Example 5

Theorem 1 (?)
Under the assumption that {β ∈ R^{m_2} : φ_I(β) < ∞} is finite, there exists a finite set S of integer parts such that

    φ(β) = min_{y_I ∈ S} { q_I⊤ y_I + φ_C(β − W_I y_I) }.   (5)
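Formula (5) can be evaluated directly once φ_C is known. A sketch for the single-row Example 2, where a small box on the integer variables stands in for the finite set S of the theorem (an assumption; the true S may differ):

```python
from itertools import product

QI, WI = [3, 3.5, 3], [6, 5, -4]   # integer part of Example 2
QC, WC = [6, 7, 5], [2, -7, 1]     # continuous part of Example 2

def phi_c(beta):
    """Single-row continuous value function: best ratio in the needed direction."""
    if beta == 0:
        return 0.0
    if beta > 0:
        return beta * min(q / w for q, w in zip(QC, WC) if w > 0)
    return beta * max(q / w for q, w in zip(QC, WC) if w < 0)

def phi(beta, box=5):
    """min over integer points y_I in a small box of q_I.y_I + phi_c(beta - W_I.y_I)."""
    return min(
        sum(q * y for q, y in zip(QI, yI))
        + phi_c(beta - sum(w * y for w, y in zip(WI, yI)))
        for yI in product(range(box), repeat=3)
    )

print(phi(1.5), phi(-1))
```

Each candidate y_I shifts a translated copy of φ_C; the value function is the lower envelope of those copies, with the points of strict local convexity at the shifts.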

Outline 1 Introduction 2 Value Function 3 Algorithms 4 Conclusions and Future Work

Representing the Value Function

In practice, the value function can be represented in two primary ways.

1 On the one hand, the value function is uniquely determined by its points of strict local convexity. Embedding the value function using this representation involves explicitly listing these points and choosing one (binary variables). The corresponding continuous part of the solution can be generated dynamically or can also be represented explicitly by dual extreme points.

2 On the other hand, the value function can be represented explicitly in terms of its polyhedral pieces. In this case, the points of strict local convexity are implicit, and the selection is of the relevant piece or pieces. This yields a much larger representation.

Embedding the Value Function

Algorithmically, the value function can be embedded in a number of ways.

1 If the objective functions of the inner and outer optimizations agree, we can embed the inner constraints and evaluate the value function implicitly. This amounts to solving the extensive form in the case of stochastic integer programming. For stochastic programs with many scenarios, this form can be too large to solve explicitly.

2 We can generate the value function a priori and then embed the full representation. The full representation might be very large, and it is likely that most of it is irrelevant to the computation.

3 We can dynamically generate relevant parts of the representation, à la cut generation.

It is also possible to take a hybrid approach in which we, e.g., generate the full representation a priori but add parts of it dynamically.

Conceptual Algorithm for Generating the Value Function

Algorithm
Initialize: Let z̄(b) = ∞ for all b ∈ B, Γ^0 = ∞, x_I^0 = 0, S^0 = {x_I^0}, and k = 0.
while Γ^k > 0 do:
    Let z̄(b) = min{ z̄(b), z(b; x_I^k) } for all b ∈ B.
    k ← k + 1.
    Solve
        Γ^k = max z̄(b) − c_I⊤ x_I
              s.t. A_I x_I = b, x_I ∈ Z^r_+   (SP)
    to obtain x_I^k.
    Set S^k ← S^{k−1} ∪ {x_I^k}.
end while
return z(b) = z̄(b) for all b ∈ B.
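The loop can be sketched on a pure-integer, single-row toy (the integer data of Example 2), with B discretized to a finite grid and (SP) solved by enumeration over a small box. In this simplified setting each x_I contributes a point upper bound c_I⊤x_I at b = A_I x_I, so this is only a sketch of the idea, not the actual MILP algorithm:

```python
from itertools import product

c = [3, 3.5, 3]
a = [6, 5, -4]
B = range(-8, 9)                       # finite grid of right-hand sides
X = list(product(range(4), repeat=3))  # small box standing in for Z^3_+

def cost(x):
    return sum(ci * xi for ci, xi in zip(c, x))

def rhs(x):
    return sum(ai * xi for ai, xi in zip(a, x))

def exact(b):  # reference value function by full enumeration over the box
    return min((cost(x) for x in X if rhs(x) == b), default=float("inf"))

zbar = {b: float("inf") for b in B}
S = []
while True:
    # (SP) by enumeration: the x whose point (rhs(x), cost(x)) lies furthest below zbar
    gap, xk = max(((zbar[rhs(x)] - cost(x), x) for x in X if rhs(x) in zbar),
                  key=lambda t: t[0])
    if gap <= 0:
        break
    S.append(xk)
    zbar[rhs(xk)] = cost(xk)

print(all(zbar[b] == exact(b) for b in B))
```

Each iteration lowers the upper approximation z̄ at the right-hand side with the largest remaining gap; the loop stops when no gap remains.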

Conceptual Algorithm for Generating the Value Function

Figure: Upper bounding functions f_1, …, f_5 obtained at right-hand sides b_1, …, b_5.

Formulating (SP)

Surprisingly, the cut generation problem (SP) can be formulated easily as an MINLP:

    Γ^k = max θ
          s.t. θ + c_I⊤ x_I ≤ c_I⊤ x_I^i + (A_I x_I − A_I x_I^i)⊤ ν^i,  i = 1, …, k − 1
               A_C⊤ ν^i ≤ c_C,  i = 1, …, k − 1
               ν^i ∈ R^m,  i = 1, …, k − 1
               x_I ∈ Z^r_+.   (6)

Computational Results

Figure: Normalized approximation gap vs. iteration number.

Generating the Value Function by Branch and Bound Our first algorithm was based on refining a single upper approximation (essentially a restriction of the full value function). It is also possible to construct the value function by lower approximation using branch and bound. Any branch-and-bound tree yields a lower approximation of the value function.

Dual Functions from Branch-and-Bound (?)

Let T be the set of terminating nodes of the tree. At a terminating node t ∈ T, we solve

    φ^t(b) = min c⊤x
             s.t. Ax = b, l^t ≤ x ≤ u^t, x ≥ 0.   (7)

The dual at node t is

    φ^t(b) = max π⊤b + π_l⊤ l^t + π_u⊤ u^t
             s.t. π⊤A + π_l + π_u ≤ c,  π_l ≥ 0, π_u ≤ 0.   (8)

We obtain the following strong dual function:

    min_{t ∈ T} { π̂^t⊤ b + π̂_l^t⊤ l^t + π̂_u^t⊤ u^t },   (9)

where (π̂^t, π̂_l^t, π̂_u^t) is an optimal solution to the dual (8).
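The strong dual function (9) is just a pointwise minimum of affine functions of b, so it is trivial to package as a callable once the leaf duals are known. A sketch, assuming single-row data and leaf dual solutions supplied externally by an LP solver (the data layout and numbers below are made up for illustration):

```python
def strong_dual_function(leaves):
    """Build the lower-bounding function (9): min over terminating nodes of
    pi*b + pi_l*l_t + pi_u*u_t.  Each leaf is (pi, pi_l, l_t, pi_u, u_t)."""
    def lb(b):
        return min(pi * b + pil * lt + piu * ut
                   for pi, pil, lt, piu, ut in leaves)
    return lb

# Toy usage: two leaves with hypothetical duals and bounds.
f = strong_dual_function([(2.0, 0.0, 0.0, 0.0, 0.0),
                          (1.0, 0.5, 1.0, 0.0, 0.0)])
print(f(3.0))
```

By construction the result underestimates φ everywhere and is tight at the right-hand side for which the tree was built.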

Iterative Refinement

The tree obtained from evaluating φ(b) yields a dual function strong at b. By solving for other right-hand sides, we obtain additional dual functions that can be aggregated. These additional solves can be done within the same tree, eventually yielding a single tree representing the entire function.

Figure: A branch-and-bound tree before and after refinement; the original tree branches on x_2 = 0 vs. x_2 ≥ 1 (nodes 1 and 2), and the refined tree further partitions node 2 into x_2 = 1 and x_2 ≥ 2 (nodes 3 and 4).

Tree Representation of the Value Function

Continuing the process, we eventually generate the entire value function. Consider the strengthened dual

    φ*(β) = min_{t ∈ T} { q_{I_t}⊤ y^t_{I_t} + φ^t_{N∖I_t}(β − W_{I_t} y^t_{I_t}) },   (10)

where I_t is the set of indices of the fixed variables, y^t_{I_t} are the values of the corresponding variables in node t, and φ^t_{N∖I_t} is the value function of the linear program at node t including only the unfixed variables.

Theorem 2
Under the assumption that {β ∈ R^{m_2} : φ_I(β) < ∞} is finite, there exists a branch-and-bound tree with respect to which φ* = φ.

Example of Value Function Tree

Figure: A branch-and-bound tree (nodes 0–18, branching on y_2 and y_3) in which each leaf carries the maximum of affine functions of β from its LP dual; together these leaf functions represent the value function.

Correspondence of Nodes and Local Stability Regions

Back to Stochastic Programming: Lit Review

Table: Prior work classified by the type of the first-stage and second-stage variables (continuous R, general integer Z, binary B) and by which of the second-stage data (W, T, h, q) are stochastic; the current work allows general mixed integer variables in both stages.

Benders Master Formulation

    Γ_k = min c⊤x + Σ_{ω ∈ Ω} p_ω z_ω
          s.t. z_ω ≤ q_{t,ω},                        t ∈ T, ω ∈ Ω
               z_ω ≥ q_{t,ω} − M_ω(1 − u_{t,ω}),     t ∈ T, ω ∈ Ω
               q_{t,ω} ≥ (h_ω − T_ω x)⊤ η^i + α^i,   i ∈ I(t), t ∈ T, ω ∈ Ω
               Σ_{t ∈ T} u_{t,ω} = 1,                ω ∈ Ω
               u_{t,ω} ∈ B,                          t ∈ T, ω ∈ Ω
               x ∈ X.   (MP)

Benders Master Variables

Notation:
T indexes the leaf nodes of the branch-and-bound tree.
I(t) are the indices of the affine functions describing the LP value function associated with node t.
(η^i, α^i) represents the affine function indexed by i ∈ I(t) for node t ∈ T.
q_{t,ω} = max_{i ∈ I(t)} { (h_ω − T_ω x)⊤ η^i + α^i } represents the evaluation of the value function of node t in scenario ω.
u_{t,ω} represents which node of the branch-and-bound tree is selected in scenario ω.
z_ω = min_{t ∈ T} q_{t,ω} represents the evaluation of the current value function approximation for scenario ω.
x are the first-stage variables.
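The approximation the master encodes is a min over tree nodes of a max over affine pieces, which can be evaluated directly for a fixed x. A sketch for a single scenario, with a hypothetical data layout and a single-row T_ω:

```python
def eval_approx(x, h, T, nodes):
    """z_w for one scenario: min over nodes t of max over affine pieces
    (eta, alpha) of eta*(h - T.x) + alpha.
    nodes = [[(eta, alpha), ...], ...] is a hypothetical layout."""
    beta = h - sum(tj * xj for tj, xj in zip(T, x))
    return min(max(eta * beta + alpha for eta, alpha in pieces)
               for pieces in nodes)

# Toy usage: two nodes, each described by two affine pieces (made-up numbers).
nodes = [[(-2.0, 0.0), (1.0, 0.0)], [(1.0, 5.0), (2.0, -1.0)]]
print(eval_approx([1.0, 0.0], 6.0, [2.0, 0.5], nodes))
```

The binaries u_{t,ω} in (MP) exist precisely to let the MILP pick out the node attaining this min, with q_{t,ω} handling the inner max.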

Overall Approach

As in the classical algorithm, we alternate between solving the master problem and solving subproblems that update the current approximation (refine T). We maintain a single branch-and-bound tree and warm-start the solution of the subproblems. This requires a solver capable of warm-starting the branch-and-bound process, as well as exporting the dual function. For this purpose, we use the SYMPHONY MILP solver, which has this capability. There are many possible variations, such as separate approximations for each scenario.

Example

Consider

    min f(x) = min −3x_1 − 4x_2 + Σ_{s=1}^{2} 0.5 Q(x, s)
    s.t. x_1 + x_2 ≤ 5
         x ∈ Z^2_+,   (11)

where

    Q(x, s) = min 3y_1 + (7/2)y_2 + 3y_3 + 6y_4 + 7y_5
              s.t. 6y_1 + 5y_2 − 4y_3 + 2y_4 − 7y_5 = h(s) − 2x_1 − (1/2)x_2   (12)
                   y_1, y_2, y_3 ∈ Z_+, y_4, y_5 ∈ R_+

with h(s) ∈ {−4, 10}.
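This small example can be evaluated by brute force: enumerate the integer part of (12) over a small box and handle the continuous tail (y_4, y_5) in closed form. A sketch, assuming h(s) ∈ {−4, 10} (the sign of the first value is an assumption) and a box large enough for the tested points:

```python
from itertools import product

def cont(r):
    """Continuous tail min 6*y4 + 7*y5 s.t. 2*y4 - 7*y5 = r, y >= 0 (closed form)."""
    return 3.0 * r if r >= 0 else -1.0 * r

def Q(x, h, box=4):
    """Second stage (12) by enumerating the integer part in a small box."""
    rhs = h - 2 * x[0] - 0.5 * x[1]
    return min(3*y1 + 3.5*y2 + 3*y3 + cont(rhs - 6*y1 - 5*y2 + 4*y3)
               for y1, y2, y3 in product(range(box), repeat=3))

def f(x, hs=(-4, 10)):
    """First-stage objective of (11) at a given feasible x."""
    return -3*x[0] - 4*x[1] + sum(0.5 * Q(x, h) for h in hs)

print(Q([0, 0], 10), f([2, 3]))
```

Evaluating f over the (small) first-stage feasible set in the same way would solve the instance outright; the point of the algorithm is to avoid such enumeration at scale.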

Example

Solving the Master Problem

A fundamental question is how to solve the master problem efficiently. It is a large integer program with many constraints, most of which will be redundant. It seems clear that we need to dynamically manage the LP relaxations. There are several options:

1 Put constraints into a cut pool and add them only as they are violated.
2 Add all constraints in the beginning, but aggressively remove slack ones.

Our attempts have so far failed: not having the full formulation makes branching less effective. We are currently investigating.
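Option 1 amounts to a standard pool filter: keep generated constraints aside and, at each master iteration, add only those the current solution violates. A minimal sketch, assuming single-row cuts of the form a⊤x ≥ rhs (a hypothetical layout, not the actual implementation):

```python
def violated(pool, point, tol=1e-6):
    """Return the cuts a.x >= rhs in the pool that the given point violates."""
    out = []
    for a, rhs in pool:
        lhs = sum(ai * xi for ai, xi in zip(a, point))
        if lhs < rhs - tol:
            out.append((a, rhs))
    return out

# Toy usage: one satisfied and one violated cut at the point (1, 1).
pool = [([1.0, 1.0], 2.0), ([1.0, -1.0], 0.5)]
print(violated(pool, [1.0, 1.0]))
```

The tolerance guards against re-adding cuts that are satisfied up to floating-point noise.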

Example from the Literature

Consider

    f(b) = inf −1.5x_1 − 4x_2 + E[z(x, ω)]
           s.t. x_1, x_2 ∈ B,   (13)

where

    z(x, ω) = inf −16y_1 − 19y_2 − 23y_3 − 28y_4 + 100R
              s.t. 2y_1 + 3y_2 + 4y_3 + 5y_4 − R ≤ h_1(ω) − x_1
                   6y_1 + y_2 + 3y_3 + 2y_4 − R ≤ h_2(ω) − x_2   (14)
                   y_i ∈ B, i = 1, …, 4, R ∈ Z_+.

Here, the scenarios belong to the set {ω_1, ω_2}, each having probability 0.5. The distribution of the right-hand side is (h_1(ω_1), h_2(ω_1)) = (5, 2) and (h_1(ω_2), h_2(ω_2)) = (10, 3).

Example cont.

To construct larger instances, we allow the right-hand sides of the two constraints to take values in the set {5, 5 + Δ, 5 + 2Δ, …, 15} × {5, 5 + Δ, 5 + 2Δ, …, 15}, generating 4, 9, 36, and 121 scenarios with Δ ∈ {1, 2, 5, 10}. We further extend the set by changing 15 to 20 to generate 225 scenarios with Δ = 1. The sizes of the deterministic equivalents of the resulting problems are as follows:

    No. Scen.     4     9    36    121    225
    No. Vars     24    47   182    607   1127
    No. Const     8    18    72    242    450

Table: Size of the deterministic equivalent of the test instances

Example cont. Static and Dynamic Cut Generation

    Instance    Obj      No. Iter.   Stat. Master      Stat. Time   Dyn. Master       Dyn. Time
    4-Scen      -63.50   3           1 (27, 47)        0.16         1 (27, 47)        0.16
    9-Scen      -65.67   3           1 (71, 127)       0.32         1 (71, 127)       0.30
    36-Scen     -66.83   3           17 (301, 552)     3.21         3 (277, 505)      2.43
    121-Scen    -67.17   4           55 (1022, 1891)   17.25        3 (986, 1807)     12.81
    225-Scen    -79.66   4           5 (1883, 3465)    28.29        7 (1812, 3314)    33.00

Table: Performance of the algorithm applied to the test problems

The third and fifth columns show the size of the master problem in the final iteration, in the form number of tree nodes (number of variables, number of constraints), respectively with static and dynamic cut generation. The fourth and sixth columns show the solution time in seconds, respectively for static and dynamic cut generation.

SSLP Instances

We apply the algorithm to the SSLP test instances from SIPLIB. The instances have the following properties; the last column shows the deterministic equivalent solution time in seconds.

                          DEP                      2nd Stage
    Instance          cons    bin     int     cons   bin   int   Time (s)   % Gap
    sslp-5-25(25)      751    3130    125       30   130     5       3.17     0.0
    sslp-5-25(50)     1501    6255    250       30   130     5       4.22     0.0
    sslp-5-25(100)    3001   12505    500       30   130     5      14.34     0.0

Table: The deterministic equivalents of the SSLP instances, where DEP and 2nd Stage correspond to the deterministic equivalent and second-stage problems, and cons, bin, and int respectively represent the number of constraints, binary variables, and general integer variables in the corresponding problem.

SSLP Instances

                       Static Cut                                 Dynamic Cut
    Instance          Iter.   Size               Time (s)  % Gap  Iter.   Size               Time (s)  % Gap
    sslp-5-25(25)       18    431 (1040, 2016)     89.92    0.0     47    31 (2493, 3639)      39.57    0.0
    sslp-5-25(50)       18    26 (1484, 2948)     123.22    0.0     22    9 (2443, 3489)      242.46    0.0
    sslp-5-25(100)      18    899 (3703, 7192)    835.76    0.0     12    1533 (2608, 3604)   398.59    0.0

Table: Generalized Benders algorithm applied to the SSLP instances

The results are generated using the MILP solver SYMPHONY version 5.5. A time limit of one hour was enforced on the solution time. The % Gap column refers to the relative gap between the upper and lower bounds of the generalized Benders algorithm. The tests were performed on a Linux (Debian 6.0.9) system with 16 AMD processors at 800 MHz and 31 GB RAM, compiled with g++.

Outline 1 Introduction 2 Value Function 3 Algorithms 4 Conclusions and Future Work

Conclusions and Future Work

We have developed an algorithm for the two-stage problem with general mixed integer variables in both stages. The algorithm uses the Benders framework with branch-and-bound dual functions as the optimality cuts. We need to keep the size of the approximations small. Currently, we have implemented static cut generation, as well as dynamic cut generation in which constraints are added to the master problem as needed. There are still many questions to be answered about how to make all of this as efficient as possible. We are working on making the dynamic cut generation scheme more efficient by incorporating cuts generated during strong branching. This is work in progress.