Integer programming (part 2) Lecturer: Javier Peña Convex Optimization 10-725/36-725


Last time: integer programming

Consider the problem
$$\min_x \; f(x) \quad \text{subject to} \quad x \in C, \;\; x_j \in \mathbb{Z}, \; j \in J$$
where $f : \mathbb{R}^n \to \mathbb{R}$ and $C \subseteq \mathbb{R}^n$ are convex, and $J \subseteq \{1,\dots,n\}$.

Branch and bound: an algorithm to solve integer programs. A divide-and-conquer scheme combined with upper and lower bounds for efficiency.

Branch and bound algorithm

Consider the problem
$$\min_x \; f(x) \quad \text{subject to} \quad x \in C, \;\; x_j \in \mathbb{Z}, \; j \in J \tag{IP}$$
where $f : \mathbb{R}^n \to \mathbb{R}$ and $C \subseteq \mathbb{R}^n$ are convex and $J \subseteq \{1,\dots,n\}$.

1. Solve the convex relaxation
$$\min_x \; f(x) \quad \text{subject to} \quad x \in C \tag{CR}$$
2. (CR) infeasible $\Rightarrow$ (IP) is infeasible. Stop.
3. Solution $x^*$ to (CR) is (IP) feasible $\Rightarrow$ $x^*$ solves (IP). Stop.
4. Solution $x^*$ to (CR) not (IP) feasible $\Rightarrow$ lower bound for (IP). Branch and recursively solve subproblems.
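As a concrete illustration of steps 1-4, here is a minimal branch and bound sketch for a 0/1 knapsack problem. This is a maximization, so the relaxation provides upper rather than lower bounds, and the greedy fractional-knapsack solution stands in for the convex relaxation (CR). The instance and function names are illustrative, not from the lecture.

```python
def lp_bound(values, weights, cap, fixed):
    """Fractional-knapsack bound (the LP relaxation) with some
    variables already fixed to 0 or 1; None means still free."""
    total, remaining, free = 0.0, cap, []
    for j, x in enumerate(fixed):
        if x == 1:
            total += values[j]
            remaining -= weights[j]
        elif x is None:
            free.append(j)
    if remaining < 0:
        return float("-inf")                 # infeasible node: prune
    free.sort(key=lambda j: values[j] / weights[j], reverse=True)
    for j in free:                           # fill greedily, allow a fraction
        take = min(1.0, remaining / weights[j])
        total += take * values[j]
        remaining -= take * weights[j]
        if remaining <= 0:
            break
    return total

def branch_and_bound(values, weights, cap):
    n = len(values)
    best = 0.0                               # incumbent value (x = 0 is feasible)
    stack = [[None] * n]                     # each node is a partial 0/1 fixing
    while stack:
        fixed = stack.pop()
        bound = lp_bound(values, weights, cap, fixed)
        if bound <= best:
            continue                         # relaxation no better: prune
        try:
            j = fixed.index(None)            # first unfixed variable
        except ValueError:
            best = max(best, bound)          # all integer: update incumbent
            continue
        for x in (1, 0):                     # branch on x_j = 1 and x_j = 0
            child = fixed[:]
            child[j] = x
            stack.append(child)
    return best
```

On the classic instance below, the optimum takes items 2 and 3 for a value of 220.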

Key component of B&B

After branching, compute a lower bound for each subproblem. If a subproblem's lower bound is larger than the current upper bound, the subproblem need not be considered further. The most straightforward way to compute lower bounds is via a convex relaxation, but other methods (e.g., Lagrangian relaxations) can also be used.

Tightness of relaxations

Suppose we have two equivalent formulations (e.g., facility location)
$$\min_x \; f(x) \quad \text{subject to} \quad x \in C, \;\; x_j \in \mathbb{Z}, \; j \in J$$
$$\min_x \; f(x) \quad \text{subject to} \quad x \in C', \;\; x_j \in \mathbb{Z}, \; j \in J$$
with $C \subseteq C'$. Which one should we prefer?

Outline

Today: more about solution techniques

- Cutting planes
- Branch and cut
- Two extended examples: best subset selection and least median of squares

Convexification

Consider the special case of an integer program with linear objective
$$\min_x \; c^T x \quad \text{subject to} \quad x \in C, \;\; x_j \in \mathbb{Z}, \; j \in J.$$
This problem is equivalent to
$$\min_x \; c^T x \quad \text{subject to} \quad x \in S$$
where $S := \mathrm{conv}\{x \in C : x_j \in \mathbb{Z}, \; j \in J\}$.

Special case: integer linear programs

Consider the problem
$$\min_x \; c^T x \quad \text{subject to} \quad Ax \le b, \;\; x_j \in \mathbb{Z}, \; j \in J.$$

Theorem. If $A, b$ are rational, then the set
$$S := \mathrm{conv}\{x : Ax \le b, \; x_j \in \mathbb{Z}, \; j \in J\}$$
is a polyhedron.

Thus the above integer linear program is equivalent to a linear program. How hard could that be?

Cutting plane algorithm

We say that the inequality $\pi^T x \le \pi_0$ is valid for a set $S$ if
$$\pi^T x \le \pi_0 \text{ for all } x \in S.$$

Consider the problem
$$\min_x \; c^T x \quad \text{subject to} \quad x \in C, \;\; x_j \in \mathbb{Z}, \; j \in J$$
and let $S := \mathrm{conv}\{x \in C : x_j \in \mathbb{Z}, \; j \in J\}$.

Cutting plane algorithm

Recall: $S = \mathrm{conv}\{x \in C : x_j \in \mathbb{Z}, \; j \in J\}$ and we want to solve
$$\min_x \; c^T x \quad \text{subject to} \quad x \in C, \;\; x_j \in \mathbb{Z}, \; j \in J. \tag{IP}$$

1. Let $C_0 := C$ and compute $x^{(0)} := \mathrm{argmin}_x \{c^T x : x \in C_0\}$.
2. For $k = 0, 1, \dots$:
   - If $x^{(k)}$ is (IP) feasible, then $x^{(k)}$ is an optimal solution. Stop.
   - Else find a valid inequality $(\pi, \pi_0)$ for $S$ that cuts off $x^{(k)}$, let $C_{k+1} := C_k \cap \{x : \pi^T x \le \pi_0\}$, and compute $x^{(k+1)} := \mathrm{argmin}_x \{c^T x : x \in C_{k+1}\}$.

A valid inequality is also called a cutting plane or a cut.
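The loop above can be written generically. Below is a minimal sketch together with a toy one-dimensional instance: minimize $x$ over $C = \{x \ge 0.3\}$ with $x$ integer, where each cut is a lower bound $x \ge \lceil \hat{x} \rceil$ (valid for every feasible integer, violated by the fractional $\hat{x}$). The relaxation solver, feasibility test, and cut oracle here are illustrative stand-ins, not part of the lecture.

```python
import math

def cutting_planes(solve_relaxation, is_ip_feasible, find_cut, max_iters=100):
    """Generic cutting plane loop: `cuts` accumulates valid inequalities,
    solve_relaxation(cuts) minimizes over C intersected with all cuts so far,
    and find_cut(x) returns a cut violated by the current relaxation solution."""
    cuts = []
    for _ in range(max_iters):
        x = solve_relaxation(cuts)
        if is_ip_feasible(x):
            return x              # optimal: it minimizes over a relaxation of S
        cuts.append(find_cut(x))
    return None                   # iteration limit reached

# Toy instance: each cut is a number r encoding the inequality x >= r.
solve = lambda cuts: max([0.3] + cuts)        # min x over C and all cuts
feasible = lambda x: abs(x - round(x)) < 1e-9 # integrality check
cut = lambda x: math.ceil(x)                  # valid inequality cutting off x
```

Here `cutting_planes(solve, feasible, cut)` first obtains the relaxation solution 0.3, adds the cut $x \ge 1$, and terminates at the integer optimum 1.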

A bit of history on cutting planes

In 1954 Dantzig, Fulkerson, and Johnson pioneered the cutting plane approach for the traveling salesman problem. In 1958 Gomory proposed a general-purpose cutting plane method to solve any integer linear program. For more than three decades Gomory cuts were deemed impractical for solving actual problems. In the early 1990s Sebastian Ceria (at CMU) successfully incorporated Gomory cuts in a branch and cut framework. By the late 1990s cutting planes had become a key component of commercial optimization solvers.

This subject has a long tradition at Carnegie Mellon and is associated with some of our biggest names: Balas, Cornuejols, Jeroslow, Kilinc-Karzan, and many of their collaborators.

Gomory cuts (1958)

This class of cuts is based on the following observation: if $a \le b$ and $a$ is an integer, then $a \le \lfloor b \rfloor$.

Suppose
$$S \subseteq \left\{x \in \mathbb{Z}^n_+ : \sum_{j=1}^n a_j x_j = a_0\right\}$$
where $a_0 \notin \mathbb{Z}$. The Gomory fractional cut is
$$\sum_{j=1}^n (a_j - \lfloor a_j \rfloor)\, x_j \ge a_0 - \lfloor a_0 \rfloor.$$

There is a rich variety of extensions of the above idea: Chvátal cuts, split cuts, lift-and-project cuts, etc.
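The cut is a direct transcription of the formula above: take fractional parts of every coefficient and of the right-hand side. A small sketch (the row used in the example is made up for illustration):

```python
import math

def gomory_fractional_cut(a, a0):
    """Given that every x in S is nonnegative integer and satisfies
    sum_j a[j] * x_j = a0, return (coefficients, rhs) of the Gomory
    fractional cut: sum_j frac(a[j]) * x_j >= frac(a0)."""
    frac = lambda t: t - math.floor(t)   # fractional part, in [0, 1)
    return [frac(aj) for aj in a], frac(a0)

# Example row: x_1 + 2.5 x_2 = 3.7 yields the cut 0.5 x_2 >= 0.7
coeffs, rhs = gomory_fractional_cut([1.0, 2.5], 3.7)
```

Note that integer coefficients drop out of the cut (their fractional part is zero), which is exactly why the cut is trivially satisfied when $a_0 \in \mathbb{Z}$ and useful only when $a_0$ is fractional.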

Lift-and-project cuts

Theorem (Balas 1974). Assume $C = \{x : Ax \le b\} \subseteq \{x : 0 \le x_j \le 1\}$. Then
$$C_j := \mathrm{conv}\{x \in C : x_j \in \{0,1\}\}$$
is the projection onto the $x$-space of the polyhedron $L_j(C)$ defined by the following inequalities:
$$Ay \le \lambda b, \quad y_j = 0, \quad Az \le (1-\lambda) b, \quad z_j = 1 - \lambda, \quad y + z = x, \quad 0 \le \lambda \le 1.$$

Suppose $\hat{x} \in C$ but $\hat{x} \notin C_j$. To find a cut separating $\hat{x}$ from $C_j$, solve
$$\min_{x,y,z,\lambda} \; \|\hat{x} - x\| \quad \text{subject to} \quad (x, y, z, \lambda) \in L_j(C).$$

Branch and cut algorithm

Combine the strengths of both branch and bound and cutting planes. Consider the problem
$$\min_x \; f(x) \quad \text{subject to} \quad x \in C, \;\; x_j \in \mathbb{Z}, \; j \in J \tag{IP}$$
where $f : \mathbb{R}^n \to \mathbb{R}$ and $C \subseteq \mathbb{R}^n$ are convex and $J \subseteq \{1,\dots,n\}$.

Branch and cut algorithm

1. Solve the convex relaxation
$$\min_x \; f(x) \quad \text{subject to} \quad x \in C \tag{CR}$$
2. (CR) infeasible $\Rightarrow$ (IP) is infeasible.
3. Solution $x^*$ to (CR) is (IP) feasible $\Rightarrow$ $x^*$ is a solution to (IP).
4. Solution $x^*$ to (CR) is not (IP) feasible. Choose between two alternatives:
   4.1 Add cuts and go back to step 1.
   4.2 Branch and recursively solve subproblems.

Integer programming technology

State-of-the-art solvers (Gurobi, CPLEX, FICO Xpress) rely on extremely efficient implementations of numerical linear algebra, simplex, interior-point, and other algorithms for convex optimization. For mixed integer optimization, most solvers use a clever type of branch and cut algorithm. They rely extensively on convex relaxations, and they take advantage of warm starts as much as possible.

Some interesting numbers:
- Speedup in algorithms, 1990-2016: over 500,000
- Speedup in hardware, 1990-2016: over 500,000
- Total speedup: over 250 billion = 2.5 × 10^11

Best subset selection

Assume $X = [x_1 \cdots x_p] \in \mathbb{R}^{n \times p}$ and $y \in \mathbb{R}^n$. Best subset selection problem:
$$\min_\beta \; \frac{1}{2}\|y - X\beta\|_2^2 \quad \text{subject to} \quad \|\beta\|_0 \le k.$$
Here $\|\beta\|_0 :=$ number of nonzero entries of $\beta$.

Best subset selection

Integer programming formulation:
$$\begin{aligned} \min_{\beta, z} \quad & \frac{1}{2}\|y - X\beta\|_2^2 \\ \text{subject to} \quad & |\beta_i| \le M_i z_i, \; i = 1,\dots,p \\ & \sum_{i=1}^p z_i \le k \\ & z_i \in \{0,1\}, \; i = 1,\dots,p. \end{aligned}$$
Here $M_i$ is some a priori known bound on $|\beta_i|$ for $i = 1,\dots,p$. These bounds can be computed via suitable preprocessing of $X, y$.

A clever way to get good feasible solutions (Bertsimas, King, and Mazumder)

Consider the problem
$$\min_\beta \; g(\beta) \quad \text{subject to} \quad \|\beta\|_0 \le k$$
where $g : \mathbb{R}^p \to \mathbb{R}$ is smooth convex and $\nabla g$ is $L$-Lipschitz. Best subset selection corresponds to $g(\beta) = \frac{1}{2}\|X\beta - y\|_2^2$.

Observation: for $u \in \mathbb{R}^p$ the vector
$$H_k(u) = \mathrm{argmin}_{\beta : \|\beta\|_0 \le k} \|\beta - u\|_2^2$$
is obtained by retaining the $k$ largest entries of $u$ (in absolute value).

Discrete first-order algorithm (Bertsimas et al.)

Consider the problem
$$\min_\beta \; g(\beta) \quad \text{subject to} \quad \|\beta\|_0 \le k$$
where $g : \mathbb{R}^p \to \mathbb{R}$ is smooth convex and $\nabla g$ is $L$-Lipschitz.

1. Start with some $\beta^{(0)}$.
2. For $i = 0, 1, \dots$:
$$\beta^{(i+1)} = H_k\!\left(\beta^{(i)} - \tfrac{1}{L}\nabla g(\beta^{(i)})\right)$$

The above iterates satisfy $\beta^{(i)} \to \bar{\beta}$ where
$$\bar{\beta} = H_k\!\left(\bar{\beta} - \tfrac{1}{L}\nabla g(\bar{\beta})\right).$$
This is a kind of local solution to the above minimization problem.
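For the least squares case $g(\beta) = \frac{1}{2}\|y - X\beta\|_2^2$, the gradient is $X^T(X\beta - y)$ and $L = \|X\|_2^2$. A minimal numpy sketch of the iteration (the iteration count and function names are mine, not from the paper):

```python
import numpy as np

def hard_threshold(u, k):
    """H_k(u): keep the k largest-magnitude entries of u, zero the rest."""
    out = np.zeros_like(u)
    idx = np.argsort(np.abs(u))[-k:]   # indices of the k largest |u_i|
    out[idx] = u[idx]
    return out

def discrete_first_order(X, y, k, iters=200):
    """Projected-gradient sketch for min 0.5*||y - X b||^2 s.t. ||b||_0 <= k:
    repeat b <- H_k(b - (1/L) * grad), with L = ||X||_2^2 the Lipschitz
    constant of the gradient (largest eigenvalue of X^T X)."""
    L = np.linalg.norm(X, 2) ** 2
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        grad = X.T @ (X @ beta - y)
        beta = hard_threshold(beta - grad / L, k)
    return beta
```

A quick sanity check: with $X = I$ the problem decouples and the fixed point simply keeps the $k$ largest-magnitude entries of $y$.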

Computational results (Bertsimas et al.)

[Figure: best subset regression on the Diabetes dataset, n = 350, p = 64, k = 6. Typical behavior of the overall algorithm: upper and lower bounds on the objective converge to the global minimum, and the MIO gap shrinks toward zero.]

Computational results (Bertsimas et al.)

[Figure: the evolution of the MIO optimality gap (in log scale), cold starts vs warm starts. Runs warm-started with the discrete first-order solution and strengthened with additional bounds close the gap faster than their cold-start counterparts.]

Computational results (Bertsimas et al.)

[Figure: sparsity detection on synthetic datasets with n = 500, p = 100. Number of nonzero coefficients selected by MIO, Lasso, Step, and Sparsenet at signal-to-noise ratios 1.742, 3.484, and 6.967.]

Least median of squares regression

Assume $X = [x_1 \cdots x_p] \in \mathbb{R}^{n \times p}$ and $y \in \mathbb{R}^n$. Given $\beta \in \mathbb{R}^p$, let $r := y - X\beta$. Observe:

- Least squares (LS): $\beta_{LS} := \mathrm{argmin}_\beta \sum_i r_i^2$
- Least absolute deviation (LAD): $\beta_{LAD} := \mathrm{argmin}_\beta \sum_i |r_i|$
- Least median of squares (LMS): $\beta_{LMS} := \mathrm{argmin}_\beta \; (\mathrm{median}_i\, |r_i|)$

Least quantile regression

Least quantile of squares (LQS):
$$\beta_{LQS} := \mathrm{argmin}_\beta \; |r|_{(q)}$$
where $|r|_{(q)}$ is the $q$th ordered absolute residual: $|r|_{(1)} \le |r|_{(2)} \le \cdots \le |r|_{(n)}$.

Key step in the formulation: use binary and auxiliary variables to encode the condition
$$|r_i| \le |r|_{(q)} \quad \text{or} \quad |r_i| \ge |r|_{(q)}$$
for each entry $i$ of $r$.

Least quantile regression

Integer programming formulation:
$$\begin{aligned} \min_{\beta, \mu, \bar{\mu}, z, \gamma} \quad & \gamma \\ \text{subject to} \quad & \gamma \le |r_i| + \mu_i, \; i = 1,\dots,n \\ & \gamma \ge |r_i| - \bar{\mu}_i, \; i = 1,\dots,n \\ & \mu_i \le M z_i, \; i = 1,\dots,n \\ & \bar{\mu}_i \le M(1 - z_i), \; i = 1,\dots,n \\ & \sum_{i=1}^n z_i = q \\ & \mu_i, \bar{\mu}_i \ge 0, \; i = 1,\dots,n \\ & z_i \in \{0,1\}, \; i = 1,\dots,n. \end{aligned}$$
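To see why minimizing $\gamma$ recovers $|r|_{(q)}$, note that $z_i = 1$ forces $\bar{\mu}_i = 0$ and hence $\gamma \ge |r_i|$, while $z_i = 0$ forces $\mu_i = 0$ and hence $\gamma \le |r_i|$. A brute-force check of this logic on a tiny instance, enumerating all binary choices of $z$ with the big-M slacks eliminated (a verification sketch I wrote for illustration, not part of the lecture):

```python
from itertools import combinations

def lqs_gamma_bruteforce(abs_residuals, q):
    """Enumerate all z with sum z_i = q. For each choice, the smallest
    feasible gamma is max |r_i| over the chosen indices (gamma >= |r_i|),
    and feasibility requires |r_i| >= gamma for unchosen indices
    (gamma <= |r_i|). Minimizing over feasible choices yields |r|_(q)."""
    n = len(abs_residuals)
    best = None
    for chosen in combinations(range(n), q):
        gamma = max(abs_residuals[i] for i in chosen)
        if all(abs_residuals[i] >= gamma
               for i in set(range(n)) - set(chosen)):
            best = gamma if best is None else min(best, gamma)
    return best
```

The only feasible choices put $z_i = 1$ on the $q$ smallest absolute residuals, so the minimal $\gamma$ is the $q$th smallest, i.e. $|r|_{(q)}$.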

First-order algorithm (Bertsimas & Mazumder)

Observe
$$|r|_{(q)} = |y_{(q)} - x_{(q)}^T \beta| = H_q(\beta) - H_{q+1}(\beta)$$
where
$$H_q(\beta) = \sum_{i=q}^n |y_{(i)} - x_{(i)}^T \beta| = \max_w \; \sum_{i=1}^n w_i\, |y_i - x_i^T \beta| \quad \text{subject to} \quad \sum_{i=1}^n w_i = n - q + 1, \;\; 0 \le w_i \le 1.$$

The function $H_q(\beta)$ is convex. A subgradient algorithm yields a local minimum of the difference of convex functions $H_q(\beta) - H_{q+1}(\beta)$.
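The identity $|r|_{(q)} = H_q(\beta) - H_{q+1}(\beta)$ is easy to check numerically. A small sketch using the $\sum_{i=q}^n$ convention for $H_q$, with $H_{n+1} := 0$ (function names are mine):

```python
def H(abs_residuals, q):
    """H_q: sum of the ordered values |r|_(q), ..., |r|_(n),
    i.e. the n - q + 1 largest absolute residuals."""
    s = sorted(abs_residuals)          # |r|_(1) <= ... <= |r|_(n)
    return sum(s[q - 1:])              # empty sum (= 0) when q = n + 1

def qth_ordered_abs_residual(residuals, q):
    """|r|_(q) computed as the difference H_q - H_{q+1}."""
    a = [abs(t) for t in residuals]
    return H(a, q) - H(a, q + 1)
```

Since each $H_q$ is a maximum of linear functions of $\beta$ (the inner LP over $w$), it is convex, and $|r|_{(q)}$ is the difference of two convex functions, which is what the subgradient scheme exploits.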

Computational results (Bertsimas & Mazumder)

[Figure: typical evolution of the MIO on the Alcohol data for (n, p, q) = (44, 5, 31) and (44, 7, 31): upper and lower bounds on the objective value converge to the optimal solution over time.]

Computational results (Bertsimas & Mazumder)

[Figure: evolution of the MIO with cold starts (top) versus warm starts (bottom): with warm starts the upper and lower bounds on the objective value are much tighter from the outset.]

References and further reading

- Belotti, Kirches, Leyffer, Linderoth, Luedtke, and Mahajan (2012), "Mixed-integer nonlinear optimization"
- Bertsimas, King, and Mazumder (2016), "Best subset selection via a modern optimization lens"
- Bertsimas and Mazumder (2014), "Least quantile regression via modern optimization"
- Conforti, Cornuejols, and Zambelli (2014), "Integer programming"
- Wolsey (1998), "Integer programming"