Integrating CP and Mathematical Programming

1 Integrating CP and Mathematical Programming John Hooker Carnegie Mellon University June 2011 Slide 1

2 Why Integrate CP and MP? Complementary strengths Computational advantages Outline of the Tutorial June 2011 Slide 2

3 Complementary Strengths. CP: inference methods, modeling; exploits local structure. MP: relaxation methods, tools for filtering, duality theory; exploits global structure. Let's bring them together! June 2011 Slide 3

4 Computational Advantage of Integrating CP and MP: Using CP + relaxation from MP

Problem | Relaxation | Speedup
Lesson timetabling | Assignment + reduced-cost variable fixing | 2 to 50 times faster than CP
Production planning with piecewise linear costs | Convex hull | 20 to 120 times faster than MILP (CPLEX 12); much smaller search tree
Automatic digital recording | Lagrangean | 1 to 10 times faster than MILP, which is faster than CP

June 2011 Slide 4

5 Computational Advantage of Integrating CP and MP: Using CP + relaxation from MP

Problem | Relaxation | Speedup
Radiation therapy | Lagrangean | 10 times faster than CP, MILP
Stable set | Semidefinite programming | Better than CP in less time
Structural design (nonlinear & discrete) | Linear quasirelaxation + logic cuts | Up to 600 times faster than MILP, GO software; 2 problems: <6 min vs >20 hrs for MILP

June 2011 Slide 5

6 Computational Advantage of Integrating CP and MP: Using CP-based Branch and Price

Problem | Speedup
Urban transit crew scheduling | Optimal schedule for twice as many trips as traditional branch and price
Traveling tournament problem | First to solve 8-team instance

June 2011 Slide 6

7 Computational Advantage of Integrating CP and MP: Using Benders methods

Problem | Method | Speedup
Min-cost machine assignment & scheduling | MILP/CP Benders | 20 to 1000 times faster than CP, MILP
Same | SIMPL implementation | Solved some problems in <1 sec that are intractable for CP, MILP
Polypropylene batch scheduling at BASF | MILP/CP Benders | Solved previously insoluble problem in 10 min

June 2011 Slide 7

8 Computational Advantage of Integrating CP and MP: Using Benders methods

Problem | Method | Speedup
Single-machine scheduling | MILP/CP Benders | Solved much longer time horizons than MILP, CP
Facility assignment and resource-constrained scheduling (min cost, min makespan) | MILP/CP Benders + subproblem relaxations | Many times faster than CP, MILP
Sports scheduling | MILP/CP Benders | Several orders of magnitude relative to state of the art

June 2011 Slide 8

9 Software for Integrating CP and MP
ECLiPSe: exchanges information between the ECLiPSe solver and Xpress-MP
OPL Studio (IBM): combines CPLEX and ILOG CP Optimizer with a script language
Xpress-Mosel (FICO): combines Xpress-MP, Xpress-Kalis with low-level modeling
G12 (NICTA): maps problem into script for cooperating solvers
SIMPL (CMU): full integration with high-level modeling (prototype)
SCIP (ZIB): combines MILP and CP-based propagation
June 2011 Slide 9

10 Outline of the Tutorial Why Integrate OR and CP? A Glimpse at CP Initial Example: Integrated Methods CP Concepts CP Filtering Algorithms Linear Relaxation and CP Mixed Integer/Linear Modeling Network Flows and Filtering Integral Polyhedra Cutting Planes Lagrangean Relaxation and CP Dynamic Programming in CP CP-based Branch and Price CP-based Benders Decomposition June 2011 Slide 10

11 Detailed Outline Why Integrate OR and CP? Complementary strengths Computational advantages Outline of the tutorial A Glimpse at CP Early successes Advantages and disadvantages Initial Example: Integrated Methods Freight Transfer Bounds Propagation Cutting Planes Branch-infer-and-relax Tree June 2011 Slide 11

12 Detailed Outline CP Concepts Consistency Hyperarc Consistency Modeling Examples CP Filtering Algorithms Element Alldiff Disjunctive Scheduling Cumulative Scheduling Linear Relaxation and CP Why relax? Algebraic Analysis of LP Linear Programming Duality LP-Based Domain Filtering Example: Single-Vehicle Routing Disjunctions of Linear Systems June 2011 Slide 12

13 Detailed Outline Mixed Integer/Linear Modeling MILP Representability Disjunctive Modeling Knapsack Modeling Network Flows and Filtering Min Cost Network Flow Max Flow Filtering: Cardinality Filtering: Sequence Integral Polyhedra Total Unimodularity Network Flow Matrices Interval Matrices June 2011 Slide 13

14 Detailed Outline Cutting Planes 0-1 Knapsack Cuts Gomory Cuts Mixed Integer Rounding Cuts Example: Product Configuration Lagrangean Relaxation and CP Lagrangean Duality Properties of the Lagrangean Dual Example: Fast Linear Programming Domain Filtering Example: Continuous Global Optimization June 2011 Slide 14

15 Detailed Outline Dynamic Programming in CP Example: Capital Budgeting Domain Filtering Recursive Optimization Filtering: Stretch Filtering: Regular CP-based Branch and Price Basic Idea Example: Airline Crew Scheduling CP-based Benders Decomposition Benders Decomposition in the Abstract Classical Benders Decomposition Example: Machine Scheduling June 2011 Slide 15

16 Background Reading This tutorial is based on: J. N. Hooker, Integrated Methods for Optimization, 2nd ed., Springer (to appear 2011). Contains exercises. June 2011 Slide 16

17 A Glimpse at Constraint Programming Early Successes Advantages and Disadvantages June 2011 Slide 17

18 What is constraint programming? It is a relatively new technology developed in the computer science and artificial intelligence communities. It has found an important role in scheduling, logistics and supply chain management. June 2011 Slide 18

19 Early commercial successes Circuit design (Siemens) Container port scheduling (Hong Kong and Singapore) Real-time control (Siemens, Xerox) June 2011 Slide 19

20 Applications Job shop scheduling Assembly line smoothing and balancing Cellular frequency assignment Nurse scheduling Shift planning Maintenance planning Airline crew rostering and scheduling Airport gate allocation and stand planning June 2011 Slide 20

21 Applications Production scheduling chemicals aviation oil refining steel lumber photographic plates tires Transport scheduling (food, nuclear fuel) Warehouse management Course timetabling June 2011 Slide 21

22 Advantages and Disadvantages: CP vs. Mathematical Programming. MP: numerical calculation, relaxation, atomistic modeling (linear inequalities), branching, independence of model and algorithm. CP: logic processing, inference (filtering, constraint propagation), high-level modeling (global constraints), branching, constraint-based processing. June 2011 Slide 22

23 "Programming" ≠ "programming". In constraint programming: programming = a form of computer programming (constraint-based processing). In mathematical programming: programming = logistics planning (historically). June 2011 Slide 23

24 CP vs. MP In mathematical programming, equations (constraints) describe the problem but don't tell how to solve it. In constraint programming, each constraint invokes a procedure that screens out unacceptable solutions, much as each line of a computer program invokes an operation. June 2011 Slide 24

25 Advantages of CP Better at sequencing and scheduling, where MP methods have weak relaxations. Adding messy constraints makes the problem easier: the more constraints, the better. More powerful modeling language: global constraints lead to succinct models, and constraints convey problem structure to the solver. Better at highly constrained problems. (Misleading: really better when constraints propagate well, or when constraints have few variables.) June 2011 Slide 25

26 Disadvantages of CP Weaker for continuous variables, due to lack of numerical techniques. May fail when constraints contain many variables: these constraints don't propagate well. Often not good for finding optimal solutions, due to lack of relaxation technology. May not scale up (discrete combinatorial methods). Software is not robust (younger field). June 2011 Slide 26

27 Obvious solution Integrate CP and MP. June 2011 Slide 27

28 Trends CP is better known in continental Europe, Asia. Less known in North America, seen as threat to OR. CP/MP integration is growing Eclipse, Mozart, OPL Studio, SIMPL, SCIP, BARON Heuristic methods increasingly important in CP Discrete combinatorial methods MP/CP/heuristics may become a single technology. June 2011 Slide 28

29 Initial Example: Integrated Methods Freight Transfer Bounds Propagation Cutting Planes Branch-infer-and-relax Tree June 2011 Slide 29

30 Example: Freight Transfer. Transport 42 tons of freight overnight in trucks that come in 4 sizes.

Truck type | Number available | Capacity (tons) | Cost per truck
1 | 3 | 7 | 90
2 | 3 | 5 | 60
3 | 3 | 4 | 50
4 | 3 | 3 | 40

8 loading docks available. Allocate 3 loading docks to the largest trucks even if only 1 or 2 of these trucks are used. June 2011 Slide 30

31 min 90x1 + 60x2 + 50x3 + 40x4  (xi = number of trucks of type i)
s.t. 7x1 + 5x2 + 4x3 + 3x4 ≥ 42  (knapsack covering constraint)
x1 + x2 + x3 + x4 ≤ 8  (knapsack packing constraint)
(1 ≤ x1 ≤ 2) ⇒ (x2 + x3 + x4 ≤ 5)  (conditional constraint)
xi ∈ {0,1,2,3}

Truck type | Number available | Capacity (tons) | Cost per truck
1 | 3 | 7 | 90
2 | 3 | 5 | 60
3 | 3 | 4 | 50
4 | 3 | 3 | 40

June 2011 Slide 31

32 Bounds propagation
min 90x1 + 60x2 + 50x3 + 40x4
s.t. 7x1 + 5x2 + 4x3 + 3x4 ≥ 42
x1 + x2 + x3 + x4 ≤ 8
(1 ≤ x1 ≤ 2) ⇒ (x2 + x3 + x4 ≤ 5)
xi ∈ {0,1,2,3}
From the covering constraint: x1 ≥ (42 − 5·3 − 4·3 − 3·3)/7 = 6/7, so x1 ≥ 1. June 2011 Slide 32

33 Bounds propagation. Reduced domain of x1:
min 90x1 + 60x2 + 50x3 + 40x4
s.t. 7x1 + 5x2 + 4x3 + 3x4 ≥ 42
x1 + x2 + x3 + x4 ≤ 8
(1 ≤ x1 ≤ 2) ⇒ (x2 + x3 + x4 ≤ 5)
x1 ∈ {1,2,3}, x2, x3, x4 ∈ {0,1,2,3}
June 2011 Slide 33

34 Bounds consistency. Let {Lj, …, Uj} be the domain of xj. A constraint set is bounds consistent if for each j: xj = Lj in some feasible solution and xj = Uj in some feasible solution. Bounds consistency means we will not set xj to any infeasible values during branching. Bounds propagation achieves bounds consistency for a single inequality: 7x1 + 5x2 + 4x3 + 3x4 ≥ 42 is bounds consistent when the domains are x1 ∈ {1,2,3} and x2, x3, x4 ∈ {0,1,2,3}. But not necessarily for a set of inequalities. June 2011 Slide 34

35 Bounds consistency. Bounds propagation may not achieve bounds consistency for a set of constraints. Consider the inequalities x1 + x2 ≥ 1 and x1 − x2 ≥ 0 with domains x1, x2 ∈ {0,1}; the solutions are (x1,x2) = (1,0), (1,1). Bounds propagation has no effect on the domains. But the constraint set is not bounds consistent, because x1 = 0 in no feasible solution. June 2011 Slide 35

36 Cutting Planes. Begin with a continuous relaxation:
min 90x1 + 60x2 + 50x3 + 40x4
s.t. 7x1 + 5x2 + 4x3 + 3x4 ≥ 42
x1 + x2 + x3 + x4 ≤ 8
0 ≤ xi ≤ 3, 1 ≤ x1  (replace domains with bounds)
This is a linear programming problem, which is easy to solve. Its optimal value provides a lower bound on the optimal value of the original problem. June 2011 Slide 36

37 Cutting planes (valid inequalities)
min 90x1 + 60x2 + 50x3 + 40x4, s.t. 7x1 + 5x2 + 4x3 + 3x4 ≥ 42, x1 + x2 + x3 + x4 ≤ 8, 0 ≤ xi ≤ 3, 1 ≤ x1
We can create a tighter relaxation (larger minimum value) with the addition of cutting planes. June 2011 Slide 37

38 Cutting planes (valid inequalities)
min 90x1 + 60x2 + 50x3 + 40x4, s.t. 7x1 + 5x2 + 4x3 + 3x4 ≥ 42, x1 + x2 + x3 + x4 ≤ 8, 0 ≤ xi ≤ 3, 1 ≤ x1
All feasible solutions of the original problem satisfy a cutting plane (i.e., it is valid). But a cutting plane may exclude ("cut off") solutions of the continuous relaxation. (Figure: a cutting plane separating part of the continuous relaxation from the feasible solutions.) June 2011 Slide 38

39 Cutting planes (valid inequalities)
min 90x1 + 60x2 + 50x3 + 40x4, s.t. 7x1 + 5x2 + 4x3 + 3x4 ≥ 42, x1 + x2 + x3 + x4 ≤ 8, 0 ≤ xi ≤ 3, 1 ≤ x1
{1,2} is a packing, because 7x1 + 5x2 alone cannot satisfy the inequality, even with x1 = x2 = 3. June 2011 Slide 39

40 Cutting planes (valid inequalities)
min 90x1 + 60x2 + 50x3 + 40x4, s.t. 7x1 + 5x2 + 4x3 + 3x4 ≥ 42, x1 + x2 + x3 + x4 ≤ 8, 0 ≤ xi ≤ 3, 1 ≤ x1
{1,2} is a packing. So 4x3 + 3x4 ≥ 42 − (7·3 + 5·3), which implies the knapsack cut
x3 + x4 ≥ ⌈(42 − (7·3 + 5·3)) / max{4,3}⌉ = 2
June 2011 Slide 40

41 Cutting planes (valid inequalities). Let xi have domain [Li, Ui] and let a ≥ 0. In general, a packing P for ax ≥ a0 satisfies
Σ_{i∉P} ai xi ≥ a0 − Σ_{i∈P} ai Ui
and generates a general integer knapsack cut
Σ_{i∉P} xi ≥ ⌈(a0 − Σ_{i∈P} ai Ui) / max_{i∉P} {ai}⌉
June 2011 Slide 41

42 Cutting planes (valid inequalities)
min 90x1 + 60x2 + 50x3 + 40x4, s.t. 7x1 + 5x2 + 4x3 + 3x4 ≥ 42, x1 + x2 + x3 + x4 ≤ 8, 0 ≤ xi ≤ 3, 1 ≤ x1
Maximal packings and their knapsack cuts:
{1,2}: x3 + x4 ≥ 2
{1,3}: x2 + x4 ≥ 2
{1,4}: x2 + x3 ≥ 3
{2,3,4}: x1 ≥ 1  (propagated bound)
Knapsack cuts corresponding to nonmaximal packings can be nonredundant. June 2011 Slide 42
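
The general cut formula can be applied mechanically. The sketch below generates the knapsack cut for a given packing P of a covering constraint ax ≥ a0 and reproduces the table above (an illustration, not solver code):

```python
import math

def knapsack_cut(a, a0, U, P):
    """Integer knapsack cut from packing P for sum_i a[i]*x[i] >= a0,
    a >= 0, with upper bounds U. Returns (vars, rhs), meaning
    sum_{i in vars} x[i] >= rhs."""
    rest = [i for i in range(len(a)) if i not in P]
    rhs_num = a0 - sum(a[i] * U[i] for i in P)
    rhs = math.ceil(rhs_num / max(a[i] for i in rest))
    return rest, rhs

a, a0, U = [7, 5, 4, 3], 42, [3, 3, 3, 3]
# 0-based packings corresponding to {1,2}, {1,3}, {1,4}, {2,3,4}
for P in [{0, 1}, {0, 2}, {0, 3}, {1, 2, 3}]:
    print(P, knapsack_cut(a, a0, U, P))
```

Running this reproduces x3 + x4 ≥ 2, x2 + x4 ≥ 2, x2 + x3 ≥ 3, and the propagated bound x1 ≥ 1.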

43 Continuous relaxation with cuts
min 90x1 + 60x2 + 50x3 + 40x4
s.t. 7x1 + 5x2 + 4x3 + 3x4 ≥ 42
x1 + x2 + x3 + x4 ≤ 8
0 ≤ xi ≤ 3, 1 ≤ x1
x3 + x4 ≥ 2, x2 + x4 ≥ 2, x2 + x3 ≥ 3  (knapsack cuts)
The optimal value of this relaxation is a lower bound on the optimal value of the original problem. June 2011 Slide 43
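
As a sanity check, the relaxation solution reported on the next slides, x = (2⅓, 3, 2⅔, 0) with value 523⅓, can be verified to be feasible for the relaxation with cuts (so the cuts do not cut it off). A quick sketch in exact rational arithmetic:

```python
from fractions import Fraction as F

x = [F(7, 3), F(3), F(8, 3), F(0)]   # (2 1/3, 3, 2 2/3, 0)

# Original constraints of the continuous relaxation
assert 7*x[0] + 5*x[1] + 4*x[2] + 3*x[3] >= 42      # covering
assert sum(x) <= 8                                  # loading docks
assert all(0 <= xi <= 3 for xi in x) and x[0] >= 1  # bounds

# Knapsack cuts
assert x[2] + x[3] >= 2
assert x[1] + x[3] >= 2
assert x[1] + x[2] >= 3

cost = 90*x[0] + 60*x[1] + 50*x[2] + 40*x[3]
print(cost)  # 1570/3 = 523 1/3
```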

44 Branch-infer-and-relax tree. Propagate bounds and solve relaxation of original problem.
Root: x1 ∈ {1,2,3}, x2, x3, x4 ∈ {0,1,2,3}; x = (2⅓, 3, 2⅔, 0), value = 523⅓. June 2011 Slide 44

45 Branch-infer-and-relax tree. Branch on a variable with nonintegral value in the relaxation: x1 ∈ {1,2} or x1 = 3.
Root: x1 ∈ {1,2,3}, x2, x3, x4 ∈ {0,1,2,3}; x = (2⅓, 3, 2⅔, 0), value = 523⅓. June 2011 Slide 45

46 Branch-infer-and-relax tree. Propagate bounds and solve relaxation. In the branch x1 ∈ {1,2}, propagation empties the domains; since the relaxation is infeasible, backtrack. June 2011 Slide 46

47 Branch-infer-and-relax tree. Propagate bounds and solve relaxation; branch on a nonintegral variable. In the branch x1 = 3: x1 ∈ {3}, x2, x3, x4 ∈ {0,1,2,3}; x = (3, 2.6, 2, 0), value = 526. Branch on x2: x2 ∈ {0,1,2} or x2 = 3. June 2011 Slide 47

48 Branch-infer-and-relax tree. Branch again. In the branch x2 ∈ {0,1,2}: x1 ∈ {3}, x2 ∈ {0,1,2}, x3 ∈ {1,2,3}, x4 ∈ {0,1,2,3}; x = (3, 2, 2¾, 0), value = 527½. Branch on x3: x3 ∈ {1,2} or x3 = 3. June 2011 Slide 48

49 Branch-infer-and-relax tree. In the branch x3 ∈ {1,2}: x1 ∈ {3}, x2 ∈ {1,2}, x3 ∈ {1,2}, x4 ∈ {1,2,3}; x = (3, 2, 2, 1), value = 530. The solution of the relaxation is integral and therefore feasible in the original problem. This becomes the incumbent solution. June 2011 Slide 49

50 Branch-infer-and-relax tree. In the branch x3 = 3: x1 ∈ {3}, x2 ∈ {0,1,2}, x3 ∈ {3}, x4 ∈ {0,1,2}; x = (3, 1½, 3, ½), value = 530. The solution is nonintegral, but we can backtrack because the value of the relaxation is no better than the incumbent solution. June 2011 Slide 50

51 Branch-infer-and-relax tree. In the branch x2 = 3: x1 ∈ {3}, x2 ∈ {3}, x3 ∈ {0,1,2}, x4 ∈ {0,1,2}; x = (3, 3, 0, 2), value = 530. Another feasible solution found. No better than the incumbent solution, which is optimal because the search has finished. June 2011 Slide 51

52 Branch-infer-and-relax tree. Two optimal solutions found: x = (3, 2, 2, 1) and x = (3, 3, 0, 2), value = 530. June 2011 Slide 52
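
Because the domains are tiny ({0,…,3}⁴), the outcome of the search tree can be checked by brute-force enumeration, including the conditional dock constraint. This is an illustration only, not how one would solve larger instances:

```python
from itertools import product

def feasible(x):
    x1, x2, x3, x4 = x
    if 7*x1 + 5*x2 + 4*x3 + 3*x4 < 42:     # 42 tons of freight
        return False
    if x1 + x2 + x3 + x4 > 8:              # 8 loading docks
        return False
    if 1 <= x1 <= 2 and x2 + x3 + x4 > 5:  # docks reserved for type 1
        return False
    return True

best = min((90*x[0] + 60*x[1] + 50*x[2] + 40*x[3], x)
           for x in product(range(4), repeat=4) if feasible(x))
opts = [x for x in product(range(4), repeat=4)
        if feasible(x) and 90*x[0] + 60*x[1] + 50*x[2] + 40*x[3] == best[0]]
print(best[0], opts)  # 530 [(3, 2, 2, 1), (3, 3, 0, 2)]
```

This confirms the two optimal solutions and the value 530 found by the branch-infer-and-relax search.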

53 Constraint Programming Concepts Consistency Generalized Arc Consistency Modeling Examples June 2011 Slide 53

54 Consistency A constraint set is consistent if every partial assignment to the variables that violates no constraint is feasible, i.e., can be extended to a feasible solution. Consistency ≠ feasibility. Consistency means that any infeasible partial assignment is explicitly ruled out by a constraint. Fully consistent constraint sets can be solved without backtracking. June 2011 Slide 54

55 Consistency Consider the constraint set x1 + x2 ≥ 1, x1 − x2 ≥ 0, xj ∈ {0,1}. It is not consistent, because x1 = 0 violates no constraint and yet is infeasible (no solution has x1 = 0). Adding the constraint x1 = 1 makes the set consistent. June 2011 Slide 55

56 With the constraint set x1 + x2 ≥ 1, x1 − x2 ≥ 0, xj ∈ {0,1}, plus other constraints, branching on x1 = 0 leads to a subtree with 2^99 nodes but no feasible solution. By adding the constraint x1 = 1, the left subtree is eliminated. June 2011 Slide 56

57 Generalized Arc Consistency (GAC) Also known as hyperarc consistency. A constraint set is GAC if every value in every variable domain is part of some feasible solution. That is, the domains are reduced as much as possible. If all constraints are binary (contain 2 variables), GAC = arc consistency. Domain reduction is CP's biggest engine. June 2011 Slide 57

58 Graph coloring problem that can be solved by arc consistency maintenance alone. Color nodes with red, green, blue with no two adjacent nodes having the same color. June 2011 Slide 58

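The propagation in this kind of coloring example is easy to sketch in code. The block below runs arc-consistency-style propagation for pairwise ≠ constraints; the 3-node triangle with one node pre-colored red is a hypothetical instance of my own, not the graph from the slides:

```python
def propagate_neq(domains, edges):
    """AC propagation for binary != constraints (graph coloring):
    prune value v from D(i) when some neighbor's domain is exactly {v}."""
    changed = True
    while changed:
        changed = False
        for i, j in edges:
            for a, b in ((i, j), (j, i)):
                if len(domains[b]) == 1:
                    v = next(iter(domains[b]))
                    if v in domains[a] and len(domains[a]) > 1:
                        domains[a].discard(v)
                        changed = True
    return domains

# Hypothetical triangle; node 1 is pre-colored red
doms = {1: {"R"}, 2: {"R", "G"}, 3: {"R", "G", "B"}}
propagate_neq(doms, [(1, 2), (2, 3), (1, 3)])
print(doms[2], doms[3])  # node 2 must be green, node 3 blue
```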

65 Modeling Examples with Global Constraints Traveling Salesman Traveling salesman problem: Let c ij = distance from city i to city j. Find the shortest route that visits each of n cities exactly once. June 2011 Slide 65

66 Popular 0-1 model. Let x_ij = 1 if city i immediately precedes city j, 0 otherwise.
min Σ_ij c_ij x_ij
s.t. Σ_i x_ij = 1, all j
Σ_j x_ij = 1, all i
Σ_{i∈V, j∈W} x_ij ≥ 1, all disjoint V, W ⊆ {1, …, n}  (subtour elimination constraints)
x_ij ∈ {0,1}
June 2011 Slide 66

67 A CP model. Let y_k = the kth city visited. The model would be written in a specific constraint programming language but would essentially say:
min Σ_k c_{y_k y_{k+1}}  (variable indices)
s.t. alldiff(y_1, …, y_n)  (global constraint)
y_k ∈ {1, …, n}
June 2011 Slide 67

68 An alternate CP model. Let y_k = the city visited after city k.
min Σ_k c_{k y_k}
s.t. circuit(y_1, …, y_n)  (Hamiltonian circuit constraint)
y_k ∈ {1, …, n}
June 2011 Slide 68
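
For intuition, the first CP model can be evaluated by brute force on a tiny instance: enumerate the assignments satisfying alldiff(y_1, …, y_n), i.e. the permutations, and take the cheapest tour. The 4-city distance matrix below is made up for illustration:

```python
from itertools import permutations

# Hypothetical symmetric distance matrix for 4 cities (0-indexed)
c = [[0, 2, 9, 10],
     [2, 0, 6, 4],
     [9, 6, 0, 3],
     [10, 4, 3, 0]]
n = len(c)

# min sum_k c[y_k][y_{k+1}] subject to alldiff(y_1, ..., y_n)
best = min(permutations(range(n)),
           key=lambda y: sum(c[y[k]][y[(k + 1) % n]] for k in range(n)))
cost = sum(c[best[k]][best[(k + 1) % n]] for k in range(n))
print(cost)  # 18, for the tour 0-1-3-2
```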

69 Element constraint. The constraint c_y ≥ 5 can be implemented:
element(y, (c_1, …, c_n), z), z ≥ 5  (assign z the yth value in the list)
The constraint x_y ≥ 5 can be implemented:
element(y, (x_1, …, x_n), z), z ≥ 5  (add the constraint z = x_y; this is a slightly different constraint)
June 2011 Slide 69

70 Modeling example: Lot sizing and scheduling. (Figure: assignment of products A, B to days.) At most one product manufactured on each day. Demands for each product on each day. Minimize setup + holding cost. June 2011 Slide 70

71 Integer programming model (Wolsey)
min Σ_{it} h_i s_it + Σ_{ijt} q_ij δ_ijt
s.t. s_{i,t−1} + x_it = d_it + s_it, all i, t
z_it ≥ y_it − y_{i,t−1}, all i, t
z_it ≤ y_it, all i, t
z_it ≤ 1 − y_{i,t−1}, all i, t
δ_ijt ≥ y_{i,t−1} + y_jt − 1, all i, j, t
δ_ijt ≤ y_{i,t−1}, all i, j, t
δ_ijt ≤ y_jt, all i, j, t
x_it ≤ C y_it, all i, t
Σ_i y_it = 1, all t
y_it, z_it, δ_ijt ∈ {0,1}; x_it, s_it ≥ 0
Many variables. June 2011 Slide 71

72 CP model. Minimize holding and setup costs:
min Σ_t q_{y_{t−1} y_t} + Σ_{it} h_i s_it
s.t. s_{i,t−1} + x_it = d_it + s_it, all i, t  (inventory balance)
0 ≤ x_it ≤ C, s_it ≥ 0, all i, t  (production capacity)
(y_t ≠ i) ⇒ (x_it = 0), all i, t
June 2011 Slide 72

73 CP model (with variable indices). Minimize holding and setup costs:
min Σ_t q_{y_{t−1} y_t} + Σ_{it} h_i s_it
s.t. s_{i,t−1} + x_it = d_it + s_it, all i, t  (inventory balance)
0 ≤ x_it ≤ C, s_it ≥ 0, all i, t  (production capacity)
(y_t ≠ i) ⇒ (x_it = 0), all i, t
where x_it = production level of product i in period t, and y_t = product manufactured in period t. June 2011 Slide 73

74 Cumulative scheduling constraint. Used for resource-constrained scheduling. Total resources consumed by jobs at any one time must not exceed L:
cumulative((t_1, …, t_n), (p_1, …, p_n), (c_1, …, c_n), L)
with job start times t_j (variables), job processing times p_j, and job resource requirements c_j. June 2011 Slide 74

75 Cumulative scheduling constraint. Minimize makespan (no deadlines, all release times = 0), with resource limit L = 7:
min z
s.t. cumulative((t_1, …, t_5), (3,3,3,5,5), (3,3,3,2,2), 7)
z ≥ t_j + p_j, all j
(Figure: resources used over time in a minimum-makespan schedule.) June 2011 Slide 75
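
A cumulative constraint is easy to check for a candidate schedule by building the resource profile. The sketch below verifies one feasible schedule, with makespan 8, for the instance above; the particular start times are my illustration:

```python
def cumulative_ok(t, p, c, L):
    """Check cumulative((t1..tn),(p1..pn),(c1..cn),L): at every time,
    total resource use of the jobs in progress is at most L."""
    events = sorted(set(t) | {ti + pi for ti, pi in zip(t, p)})
    for time in events:
        use = sum(ci for ti, pi, ci in zip(t, p, c) if ti <= time < ti + pi)
        if use > L:
            return False
    return True

p = (3, 3, 3, 5, 5)          # processing times
c = (3, 3, 3, 2, 2)          # resource requirements
t = (0, 3, 5, 0, 0)          # one feasible schedule
assert cumulative_ok(t, p, c, 7)
makespan = max(ti + pi for ti, pi in zip(t, p))
print(makespan)  # 8
```

Checking only at event times suffices because resource usage is piecewise constant between job starts and ends.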

76 Modeling example: Ship loading. Will use ILOG's OPL Studio modeling language. Example is from the OPL manual. The problem: load 34 items on the ship in minimum time (min makespan). Each item requires a certain time and a certain number of workers. Total of 8 workers available. June 2011 Slide 76

77 Problem data: duration and labor requirement for each of the 34 items (table not reproduced). June 2011 Slide 77

78 Precedence constraints among the items, e.g., item 1 before item 2 (full list not reproduced). June 2011 Slide 78

79 Use the cumulative scheduling constraint.
min z
s.t. z ≥ t_1 + 3, z ≥ t_2 + 4, etc.
cumulative((t_1, …, t_34), (3, 4, …, 2), (4, 4, …, 3), 8)
t_2 ≥ t_1 + 3, t_4 ≥ t_1 + 3, etc.
June 2011 Slide 79

80 OPL model
int capacity = 8;
int nbTasks = 34;
range Tasks 1..nbTasks;
int duration[Tasks] = [3,4,4,6,...,2];
int totalDuration = sum(t in Tasks) duration[t];
int demand[Tasks] = [4,4,3,4,...,3];
struct Precedences { int before; int after; };
{Precedences} setOfPrecedences = { <1,2>, <1,4>, ..., <33,34> };
June 2011 Slide 80

81 scheduleHorizon = totalDuration;
Activity a[t in Tasks](duration[t]);
DiscreteResource res(8);
Activity makespan(0);
minimize makespan.end
subject to {
  forall(t in Tasks) a[t] precedes makespan;
  forall(p in setOfPrecedences) a[p.before] precedes a[p.after];
  forall(t in Tasks) a[t] requires(demand[t]) res;
};
June 2011 Slide 81

82 Modeling example: Production scheduling with intermediate storage. (Figure: a manufacturing unit feeds storage tanks of capacities C1, C2, C3, which feed the packing units.) June 2011 Slide 82

83 Filling of storage tank. (Figure: tank level over time.) Filling starts at time t and ends at t + b/r, where b is the batch size and r the manufacturing rate; packing starts at time u and ends at u + b/s, where s is the packing rate. The capacity constraint need be enforced only at the peak level, when filling ends. June 2011 Slide 83

84 min T  (makespan)
s.t. T ≥ u_j + b_j/s_j, all j
t_j ≥ R_j, all j  (job release time)
cumulative((t_1, …, t_n), (v_1, …, v_n), e, m)  (m storage tanks)
v_i = u_i + b_i/s_i − t_i, all i  (duration of tank use)
b_i(1 − s_i/r_i) + s_i(u_i − t_i) ≤ C_i, all i  (tank capacity)
cumulative((u_1, …, u_n), (b_1/s_1, …, b_n/s_n), e, p)  (p packing units)
u_j ≥ t_j ≥ 0, all j
where e = (1, …, 1). June 2011 Slide 84

85 Modeling example: Employee scheduling Schedule four nurses in 8-hour shifts. A nurse works at most one shift a day, at least 5 days a week. Same schedule every week. No shift staffed by more than two different nurses in a week. A nurse cannot work different shifts on two consecutive days. A nurse who works shift 2 or 3 must do so at least two days in a row. June 2011 Slide 85

86 Two ways to view the problem.

Assign nurses to shifts:
          Sun Mon Tue Wed Thu Fri Sat
Shift 1:   A   B   A   A   A   A   A
Shift 2:   C   C   C   B   B   B   B
Shift 3:   D   D   D   D   C   C   D

Assign shifts to nurses (0 = day off):
          Sun Mon Tue Wed Thu Fri Sat
Nurse A:   1   0   1   1   1   1   1
Nurse B:   0   1   0   2   2   2   2
Nurse C:   2   2   2   0   3   3   0
Nurse D:   3   3   3   3   0   0   3

June 2011 Slide 86

87 Use both formulations in the same model! First, assign nurses to shifts. Let w_sd = nurse assigned to shift s on day d.
alldiff(w_1d, w_2d, w_3d), all d
The variables w_1d, w_2d, w_3d take different values; that is, schedule 3 different nurses on each day. June 2011 Slide 87

88 Use both formulations in the same model! First, assign nurses to shifts. Let w_sd = nurse assigned to shift s on day d.
alldiff(w_1d, w_2d, w_3d), all d
cardinality(w, (A,B,C,D), (5,5,5,5), (6,6,6,6))
A occurs at least 5 and at most 6 times in the array w, and similarly for B, C, D. That is, each nurse works at least 5 and at most 6 days a week. June 2011 Slide 88

89 Use both formulations in the same model! First, assign nurses to shifts. Let w_sd = nurse assigned to shift s on day d.
alldiff(w_1d, w_2d, w_3d), all d
cardinality(w, (A,B,C,D), (5,5,5,5), (6,6,6,6))
nvalues((w_s,Sun, …, w_s,Sat), 1, 2), all s
The variables w_s,Sun, …, w_s,Sat take at least 1 and at most 2 different values. That is, at least 1 and at most 2 nurses work any given shift. June 2011 Slide 89

90 Remaining constraints are not easily expressed in this notation. So, assign shifts to nurses. Let y_id = shift assigned to nurse i on day d.
alldiff(y_1d, …, y_4d), all d
Assign a different nurse to each shift on each day. This constraint is redundant of previous constraints, but redundant constraints speed solution. June 2011 Slide 90

91 Remaining constraints are not easily expressed in this notation. So, assign shifts to nurses. Let y_id = shift assigned to nurse i on day d.
alldiff(y_1d, …, y_4d), all d
stretch((y_i,Sun, …, y_i,Sat), (2,3), (2,2), (6,6), P), all i
Every stretch of 2's has length between 2 and 6; every stretch of 3's has length between 2 and 6. So a nurse who works shift 2 or 3 must do so at least two days in a row. June 2011 Slide 91

92 Remaining constraints are not easily expressed in this notation. So, assign shifts to nurses. Let y_id = shift assigned to nurse i on day d.
alldiff(y_1d, …, y_4d), all d
stretch((y_i,Sun, …, y_i,Sat), (2,3), (2,2), (6,6), P), all i
Here P = {(s,0), (0,s) | s = 1,2,3}. Whenever a stretch of a's immediately precedes a stretch of b's, (a,b) must be one of the pairs in P. So a nurse cannot switch shifts without taking at least one day off. June 2011 Slide 92

93 Now we must connect the w_sd variables to the y_id variables. Use channeling constraints:
w_{y_id, d} = i, all i, d
y_{w_sd, d} = s, all s, d
Channeling constraints increase propagation and make the problem easier to solve. June 2011 Slide 93

94 The complete model is:
alldiff(w_1d, w_2d, w_3d), all d
cardinality(w, (A,B,C,D), (5,5,5,5), (6,6,6,6))
nvalues((w_s,Sun, …, w_s,Sat), 1, 2), all s
alldiff(y_1d, …, y_4d), all d
stretch((y_i,Sun, …, y_i,Sat), (2,3), (2,2), (6,6), P), all i
w_{y_id, d} = i, all i, d
y_{w_sd, d} = s, all s, d
June 2011 Slide 94
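
The first three global constraints can be checked directly against the sample schedule from the two-views slide. A small sketch (the stretch and channeling constraints are omitted for brevity):

```python
from collections import Counter

# w[s][d] = nurse assigned to shift s on day d (Sun..Sat), per the example
w = [list("ABAAAAA"),   # shift 1
     list("CCCBBBB"),   # shift 2
     list("DDDDCCD")]   # shift 3

# alldiff(w_1d, w_2d, w_3d): 3 different nurses each day
assert all(len({w[s][d] for s in range(3)}) == 3 for d in range(7))

# cardinality: each nurse works between 5 and 6 days a week
counts = Counter(x for row in w for x in row)
assert all(5 <= counts[nurse] <= 6 for nurse in "ABCD")

# nvalues: each shift is staffed by at most 2 different nurses
assert all(1 <= len(set(row)) <= 2 for row in w)
print("sample schedule satisfies alldiff, cardinality, nvalues")
```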

95 CP Filtering Algorithms Element Alldiff Disjunctive Scheduling Cumulative Scheduling June 2011 Slide 95

96 Filtering for element(y, (x_1, …, x_n), z). Variable domains can be easily filtered to maintain GAC:
D_z ← D_z ∩ (∪_{j∈D_y} D_{x_j})
D_y ← {j ∈ D_y : D_{x_j} ∩ D_z ≠ ∅}
D_{x_j} ← D_{x_j} ∩ D_z if D_y = {j}; D_{x_j} otherwise
June 2011 Slide 96

97 Filtering for element. Example: element(y, (x_1, x_2, x_3, x_4), z). The initial domains are:
D_z = {20,30,60,80,90}, D_y = {1,3,4}, D_x1 = {10,50}, D_x2 = {10,20}, D_x3 = {40,50,80,90}, D_x4 = {40,50,70}
The reduced domains are:
D_z = {80,90}, D_y = {3}, D_x1 = {10,50}, D_x2 = {10,20}, D_x3 = {80,90}, D_x4 = {40,50,70}
June 2011 Slide 97
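
The filtering rules can be written out directly. The sketch below applies one pass of them to this example and reproduces the reduced domains:

```python
def filter_element(Dy, Dx, Dz):
    """One pass of the GAC filtering rules for element(y, (x_1..x_n), z).

    Dy: set of 1-based indices; Dx: dict j -> set; Dz: set of values."""
    Dz = Dz & set().union(*(Dx[j] for j in Dy))   # z must equal some x_j, j in Dy
    Dy = {j for j in Dy if Dx[j] & Dz}            # index j needs Dx_j to meet Dz
    if len(Dy) == 1:                              # y is fixed: filter that x_j
        j = next(iter(Dy))
        Dx = {**Dx, j: Dx[j] & Dz}
    return Dy, Dx, Dz

Dy = {1, 3, 4}
Dx = {1: {10, 50}, 2: {10, 20}, 3: {40, 50, 80, 90}, 4: {40, 50, 70}}
Dz = {20, 30, 60, 80, 90}
Dy, Dx, Dz = filter_element(Dy, Dx, Dz)
print(sorted(Dy), sorted(Dz), sorted(Dx[3]))  # [3] [80, 90] [80, 90]
```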

98 Filtering for alldiff(y_1, …, y_n). Domains can be filtered with an algorithm based on maximum cardinality bipartite matching and a theorem of Berge. It is a special case of optimality conditions for max flow. June 2011 Slide 98

99 Filtering for alldiff. Consider the domains:
y_1 ∈ {1}
y_2 ∈ {2,3,5}
y_3 ∈ {1,2,3,5}
y_4 ∈ {1,5}
y_5 ∈ {1,2,3,4,5,6}
June 2011 Slide 99

100 Indicate domains with edges in a bipartite graph between the variables y_1, …, y_5 and the values 1, …, 6. June 2011 Slide 100

101 Find a maximum cardinality bipartite matching. June 2011 Slide 101

103 Mark edges in alternating paths that start at an uncovered vertex. June 2011 Slide 103

105 Mark edges in alternating cycles. June 2011 Slide 105

106 Remove unmarked edges not in matching. June 2011 Slide 106

108 Filtering for alldiff. Domains have been filtered from
y_1 ∈ {1}, y_2 ∈ {2,3,5}, y_3 ∈ {1,2,3,5}, y_4 ∈ {1,5}, y_5 ∈ {1,2,3,4,5,6}
to
y_1 ∈ {1}, y_2 ∈ {2,3}, y_3 ∈ {2,3}, y_4 ∈ {5}, y_5 ∈ {4,6}
GAC achieved. June 2011 Slide 108
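
A simple, unoptimized way to obtain the same filtering: keep a value v in D(y_i) only if fixing y_i = v still allows a matching that covers every variable. This tests each edge with an augmenting-path matching routine rather than marking alternating paths and cycles as the slides do, but it computes the same GAC domains (an illustrative sketch, not Régin's efficient algorithm):

```python
def has_full_matching(doms):
    """True if every variable can be matched to a distinct value
    (Kuhn's augmenting-path algorithm for bipartite matching)."""
    match = {}                       # value -> variable index

    def augment(i, seen):
        for v in doms[i]:
            if v in seen:
                continue
            seen.add(v)
            if v not in match or augment(match[v], seen):
                match[v] = i
                return True
        return False

    return all(augment(i, set()) for i in range(len(doms)))

def filter_alldiff(doms):
    """GAC filtering for alldiff: keep v in D(y_i) only if some
    full matching assigns y_i = v."""
    out = []
    for i, d in enumerate(doms):
        keep = {v for v in d
                if has_full_matching(doms[:i] + [{v}] + doms[i+1:])}
        out.append(keep)
    return out

doms = [{1}, {2, 3, 5}, {1, 2, 3, 5}, {1, 5}, {1, 2, 3, 4, 5, 6}]
print(filter_alldiff(doms))
```

On the example above this reproduces the filtered domains {1}, {2,3}, {2,3}, {5}, {4,6}.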

109 Disjunctive scheduling. Consider a disjunctive scheduling constraint:
nooverlap((s_1, s_2, s_3, s_5), (p_1, p_2, p_3, p_5))
with start time variables s_j. June 2011 Slide 109

110 Edge finding for disjunctive scheduling. Consider a disjunctive scheduling constraint:
nooverlap((s_1, s_2, s_3, s_5), (p_1, p_2, p_3, p_5))
with processing times p_j. June 2011 Slide 110

111 Edge finding for disjunctive scheduling.
nooverlap((s_1, s_2, s_3, s_5), (p_1, p_2, p_3, p_5))
Variable domains are defined by time windows and processing times:
s_1 ∈ [0, 10 − 1]
s_2 ∈ [0, 10 − 3]
s_3 ∈ [2, 7 − 3]
s_5 ∈ [4, 7 − 2]
June 2011 Slide 111

112 Edge finding for disjunctive scheduling.
nooverlap((s_1, s_2, s_3, s_5), (p_1, p_2, p_3, p_5))
(Figure: a feasible, minimum-makespan solution within the time windows.) June 2011 Slide 112

113 Edge finding for disjunctive scheduling. But let's reduce 2 of the deadlines to 9. June 2011 Slide 113

114 Edge finding for disjunctive scheduling. But let's reduce 2 of the deadlines to 9. We will use edge finding to prove that there is no feasible schedule. June 2011 Slide 114

115 Edge finding for disjunctive scheduling. We can deduce that job 2 must precede jobs 3 and 5: 2 ≺ {3,5}. Because if job 2 is not first, there is not enough time for all 3 jobs within the time windows:
L{2,3,5} − E{3,5} < p{2,3,5}, i.e., 9 − 2 = 7 < 3 + 3 + 2
June 2011 Slide 115

116 Edge finding for disjunctive scheduling. We can deduce that job 2 must precede jobs 3 and 5: 2 ≺ {3,5}. Because if job 2 is not first, there is not enough time for all 3 jobs within the time windows:
L{2,3,5} − E{3,5} < p{2,3,5}, i.e., 9 − 2 = 7 < 3 + 3 + 2
(L{2,3,5} is the latest deadline.) June 2011 Slide 116

117 Edge finding for disjunctive scheduling. We can deduce that job 2 must precede jobs 3 and 5: 2 ≺ {3,5}. Because if job 2 is not first, there is not enough time for all 3 jobs within the time windows:
L{2,3,5} − E{3,5} < p{2,3,5}, i.e., 9 − 2 = 7 < 3 + 3 + 2
(E{3,5} is the earliest release time.) June 2011 Slide 117

118 Edge finding for disjunctive scheduling We can deduce that job 2 must precede jobs 3 and 5: 2 { 3,5} Because if job 2 is not first, there is not enough time for all 3 jobs within the time windows: L{2,3,5} E{3,5} < p{2,3,5} Total processing time June 2011 Slide 118 E {3,5} 7<3+3+2 L {2,3,5}

119 Edge finding for disjunctive scheduling: Having deduced 2 << {3,5}, we can tighten the deadline of job 2 to the minimum of L_3 − p_3 = 4, L_5 − p_5 = 5, and L_{3,5} − p_{3,5} = 2, namely 2. Since the time window of job 2 is now too narrow, there is no feasible schedule.
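The deduction on slides 115-119 can be checked mechanically. The sketch below enumerates subsets, which is exponential; slides 123-128 explain how real filters avoid this. The job data are read off the slides (jobs 2, 3, 5 with E = (0, 2, 4), L = (9, 7, 7), p = (3, 3, 2)):

```python
from itertools import combinations

def edge_find_precedes(E, L, p, k, J):
    """Edge-finding test k << J: if L_{J u {k}} - E_J < p_{J u {k}}, job k
    must precede every job in J, and its deadline tightens to
    min over nonempty J' subset of J of (L_{J'} - p_{J'})."""
    Jk = list(J) + [k]
    if max(L[j] for j in Jk) - min(E[j] for j in J) < sum(p[j] for j in Jk):
        return min(L[k],
                   min(max(L[j] for j in Jp) - sum(p[j] for j in Jp)
                       for r in range(1, len(J) + 1)
                       for Jp in combinations(J, r)))
    return L[k]

E = {2: 0, 3: 2, 5: 4}; L = {2: 9, 3: 7, 5: 7}; p = {2: 3, 3: 3, 5: 2}
Lk = edge_find_precedes(E, L, p, 2, [3, 5])
print(Lk, Lk - E[2] < p[2])   # deadline tightened to 2; window too narrow
```

The second printed value confirms infeasibility: the tightened window of job 2 can no longer hold its processing time.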

120 Edge finding for disjunctive scheduling: In general, we can deduce that job k must precede all the jobs in set J (k << J) if there is not enough time for all the jobs after the earliest release time of the jobs in J: L_{J∪{k}} − E_J < p_{J∪{k}} (in the example, L_{2,3,5} − E_{3,5} < p_{2,3,5}).

121 Edge finding for disjunctive scheduling: When the test succeeds, we can tighten the deadline for job k to the min over nonempty J' ⊆ J of {L_{J'} − p_{J'}} (in the example, L_{3,5} − p_{3,5} = 2).

122 Edge finding for disjunctive scheduling: There is a symmetric rule: J << k if there is not enough time for all the jobs before the latest deadline of the jobs in J, i.e., L_J − E_{J∪{k}} < p_{J∪{k}}. Then we can tighten the release time of job k to the max over nonempty J' ⊆ J of {E_{J'} + p_{J'}}.

123 Edge finding for disjunctive scheduling: Problem: how can we avoid enumerating all subsets J of jobs to find edges (L_{J∪{k}} − E_J < p_{J∪{k}}) and all subsets J' of J to tighten the bounds (min_{J'⊆J} {L_{J'} − p_{J'}})?

124 Edge finding for disjunctive scheduling: Key result: we only have to consider sets J whose time windows lie within some interval (e.g., J = {3,5}).

125 Edge finding for disjunctive scheduling: Removing a job from those within an interval only weakens the test L_{J∪{k}} − E_J < p_{J∪{k}}, and there is only a polynomial number of intervals defined by release times and deadlines.

126 Edge finding for disjunctive scheduling: Note: edge finding does not achieve bounds consistency, which is an NP-hard problem for the disjunctive constraint.

127 Edge finding for disjunctive scheduling: One O(n^2) algorithm is based on the Jackson preemptive schedule (JPS): at every moment, run the released, unfinished job with the earliest deadline. Using a different example, the JPS is shown (figure not reproduced).

128 Edge finding for disjunctive scheduling: For each job i, scan the jobs k ∈ J_i in decreasing order of L_k and select the first k for which L_k − E_i < p_i + p_{J_ik}; conclude that i must follow the jobs in J_ik and update E_i to JPS(i,k), the latest completion time in the JPS of the jobs in J_ik. Here J_i is the set of jobs unfinished at time E_i in the JPS, J_ik the set of jobs j ≠ i in J_i with L_j ≤ L_k, and p_{J_ik} their total remaining processing time in the JPS.
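A sketch of JPS construction under the definition above (preemptive earliest-deadline-first among released jobs, ties broken arbitrarily). The three-job instance is hypothetical, not the slide's example:

```python
import heapq

def jackson_preemptive(jobs):
    """Jackson preemptive schedule: at every moment run the released,
    unfinished job with the earliest deadline, preempting on new releases.
    jobs: name -> (release E, deadline L, processing time p).
    Returns the JPS completion time of each job."""
    releases = sorted(jobs, key=lambda j: jobs[j][0])
    remaining = {j: jobs[j][2] for j in jobs}
    done, ready, t, i = {}, [], 0, 0
    while len(done) < len(jobs):
        while i < len(releases) and jobs[releases[i]][0] <= t:
            j = releases[i]; heapq.heappush(ready, (jobs[j][1], j)); i += 1
        if not ready:                        # idle until the next release
            t = jobs[releases[i]][0]; continue
        L, j = heapq.heappop(ready)
        # run j until it finishes or the next release, whichever comes first
        horizon = jobs[releases[i]][0] if i < len(releases) else float("inf")
        run = min(remaining[j], horizon - t)
        remaining[j] -= run; t += run
        if remaining[j] == 0:
            done[j] = t
        else:
            heapq.heappush(ready, (L, j))    # preempted; keep its deadline key
    return done

print(jackson_preemptive({"A": (0, 10, 3), "B": (1, 5, 2), "C": (2, 4, 1)}))
```

On this instance C finishes at 3, B at 4, and A at 6: tighter-deadline jobs repeatedly preempt A.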

129 Not-first/not-last rules: We can deduce that job 4 cannot precede jobs 1 and 2, written ¬(4 << {1,2}), because if job 4 is first, there is too little time to complete the jobs before the later deadline of jobs 1 and 2: L_{1,2} − E_4 < p_1 + p_2 + p_4, that is, 6 < 1 + 3 + 3.

130 Not-first/not-last rules: Having deduced ¬(4 << {1,2}), we can tighten the release time of job 4 to the minimum of E_1 + p_1 = 3 and E_2 + p_2 = 4, namely 3.

131 Not-first/not-last rules: In general, we can deduce that job k cannot precede all the jobs in J, ¬(k << J), if there is too little time after the release time of job k to complete all the jobs before the latest deadline in J: L_J − E_k < p_{J∪{k}}. We can then update E_k to the min over j ∈ J of {E_j + p_j}.

132 Not-first/not-last rules: There is a symmetric not-last rule. The rules can be applied in polynomial time, although an efficient algorithm is quite complicated.
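A sketch of the not-first test on the slides' numbers. The individual values E1, E2, L1, L2 are not legible on the slides, so values consistent with E1 + p1 = 3, E2 + p2 = 4, E4 = 0, and L_{1,2} = 6 are assumed:

```python
def not_first(E, L, p, k, J):
    """Not-first test: if L_J - E_k < p_{J u {k}}, job k cannot precede all
    of J, so its release time rises to min over j in J of (E_j + p_j)."""
    if max(L[j] for j in J) - E[k] < sum(p[j] for j in J) + p[k]:
        return max(E[k], min(E[j] + p[j] for j in J))
    return E[k]

# Assumed data matching the slides: p = (1, 3, _, 3), E4 = 0, L_{1,2} = 6.
E = {1: 2, 2: 1, 4: 0}; L = {1: 6, 2: 6}; p = {1: 1, 2: 3, 4: 3}
print(not_first(E, L, p, 4, [1, 2]))   # release time of job 4 raised to 3
```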

133 Cumulative scheduling: Consider a cumulative scheduling constraint cumulative((s1,s2,s3),(p1,p2,p3),(c1,c2,c3),C), where the c_j are resource consumption rates and C is the capacity. A feasible solution is shown (figure not reproduced).

134 Edge finding for cumulative scheduling: We can deduce that job 3 must finish after the others finish (3 >> {1,2}), because the total energy required exceeds the area between the earliest release time and the later deadline of jobs 1, 2: e_3 + e_{1,2} > C (L_{1,2} − E_{1,2,3}).

135-136 Edge finding for cumulative scheduling (builds): the total energy required is e_3 + e_{1,2} = 22, which exceeds the area available, C (L_{1,2} − E_{1,2,3}).

137 Edge finding for cumulative scheduling: We can update the release time of job 3 to E_{1,2} + [e_{1,2} − (C − c_3)(L_{1,2} − E_{1,2})] / c_3. The energy available for jobs 1, 2 if space is left for job 3 to start anytime is (C − c_3)(L_{1,2} − E_{1,2}) = 10.

138-139 Edge finding for cumulative scheduling (builds): the excess energy required by jobs 1, 2 is e_{1,2} − 10 = 4, so we move up the release time of job 3 by 4/c_3 = 4/2 = 2 units beyond E_{1,2}.

140 Edge finding for cumulative scheduling: In general, if e_{J∪{k}} > C (L_J − E_{J∪{k}}), then k >> J, and we update E_k to the max over J' ⊆ J with e_{J'} − (C − c_k)(L_{J'} − E_{J'}) > 0 of E_{J'} + [e_{J'} − (C − c_k)(L_{J'} − E_{J'})] / c_k. Symmetrically, if e_{J∪{k}} > C (L_{J∪{k}} − E_J), then k << J, and we update L_k to the min over such J' of L_{J'} − [e_{J'} − (C − c_k)(L_{J'} − E_{J'})] / c_k.
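The cumulative edge-finding update can be sketched with a subset enumeration (real implementations are polynomial, as the next slide notes). The instance data here are assumptions, chosen only to reproduce the slides' numbers (total energy 22, available energy 10, excess 4, shift 4/2 = 2):

```python
from itertools import combinations

def cumulative_ef_update(E, L, e, c, C, k, J):
    """Cumulative edge finding: if e_{J u {k}} > C (L_J - E_{J u {k}}), job k
    must finish after every job in J; raise E_k using the energy left over
    after reserving capacity c_k for job k."""
    Jk = list(J) + [k]
    if sum(e[j] for j in Jk) <= C * (max(L[j] for j in J) - min(E[j] for j in Jk)):
        return E[k]                                  # test fails, no deduction
    best = E[k]
    for r in range(1, len(J) + 1):
        for Jp in combinations(J, r):
            EJ, LJ = min(E[j] for j in Jp), max(L[j] for j in Jp)
            excess = sum(e[j] for j in Jp) - (C - c[k]) * (LJ - EJ)
            if excess > 0:                           # not enough spare energy
                best = max(best, EJ + excess / c[k])
    return best

# Assumed data: C = 4, c3 = 2, E_{1,2} = 0, L_{1,2} = 5, e_{1,2} = 14, e3 = 8.
E = {1: 0, 2: 0, 3: 0}; L = {1: 5, 2: 5}
e = {1: 8, 2: 6, 3: 8}
print(cumulative_ef_update(E, L, e, {3: 2}, 4, 3, [1, 2]))   # job 3 release -> 2.0
```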

141 Edge finding for cumulative scheduling: There is an O(n^2) algorithm that finds all applications of the edge finding rules.

142 Other propagation rules for cumulative scheduling: extended edge finding, timetabling, not-first/not-last rules, energetic reasoning.

143 Linear Relaxation: Why Relax? Algebraic Analysis of LP. Linear Programming Duality. LP-Based Domain Filtering. Example: Single-Vehicle Routing. Disjunctions of Linear Systems.

144 Why Relax? Solving a relaxation of a problem can: tighten variable bounds; possibly solve the original problem; guide the search in a promising direction; filter domains using reduced costs or Lagrange multipliers; prune the search tree using a bound on the optimal value; and provide a more global view, because a single OR relaxation can pool relaxations of several constraints.

145 Some MP models that can provide relaxations: linear programming (LP); mixed integer linear programming (MILP), which can itself be relaxed as an LP whose relaxation can be strengthened with cutting planes; Lagrangean relaxation; and specialized relaxations for particular problem classes and for global constraints.

146 Motivation: Linear programming is remarkably versatile for representing real-world problems. LP is by far the most widely used tool for relaxation. LP relaxations can be strengthened by cutting planes, based on polyhedral analysis. LP has an elegant and powerful duality theory, useful for domain filtering and much else. The LP problem is extremely well solved.

147 Algebraic Analysis of LP: An example: min 4x1 + 7x2 subject to 2x1 + 3x2 >= 6, 2x1 + x2 >= 4, x1, x2 >= 0. The optimal solution is x = (3,0), where the objective contour 4x1 + 7x2 = 12 touches the feasible region bounded by 2x1 + 3x2 >= 6 and 2x1 + x2 >= 4 (figure not reproduced).

148 Algebraic Analysis of LP: Rewrite min 4x1 + 7x2 subject to 2x1 + 3x2 >= 6, 2x1 + x2 >= 4, x1, x2 >= 0 with surplus variables as min 4x1 + 7x2 subject to 2x1 + 3x2 − x3 = 6, 2x1 + x2 − x4 = 4, x1,...,x4 >= 0. In general, an LP has the form min cx subject to Ax = b, x >= 0.

149 Algebraic analysis of LP: Write min cx, Ax = b, x >= 0 as min c_B x_B + c_N x_N subject to B x_B + N x_N = b, x_B, x_N >= 0, where A = [B N] is an m x n matrix, x_B are the basic variables, and x_N the nonbasic variables. B is any set of m linearly independent columns of A; these form a basis for the space spanned by the columns.

150 Algebraic analysis of LP: Solve the constraint equation for x_B: x_B = B^{-1} b − B^{-1} N x_N. All solutions can be obtained by setting x_N to some value. The solution is basic if x_N = 0. It is a basic feasible solution if x_N = 0 and x_B >= 0.

151 Example: The vertices formed by the constraint lines correspond to basic solutions, one for each choice of basic pair: (x2,x3), (x2,x4), (x1,x2), (x3,x4), (x1,x3), (x1,x4); those lying on the boundary of the feasible region are basic feasible solutions (figure not reproduced).

152 Algebraic analysis of LP: Express the cost in terms of the nonbasic variables: c_B B^{-1} b + (c_N − c_B B^{-1} N) x_N. The vector c_N − c_B B^{-1} N is the vector of reduced costs. Since x_N >= 0, the basic solution (x_B, 0) is optimal if the reduced costs are nonnegative.
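The quantities in this analysis can be computed concretely for the running example. The sketch below uses a hand-rolled 2x2 Cramer's-rule solve so it stays self-contained:

```python
def solve2(M, b):
    """Solve the 2x2 system M u = b by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(b[0] * M[1][1] - M[0][1] * b[1]) / det,
            (M[0][0] * b[1] - b[0] * M[1][0]) / det]

# Equality-form example: min 4x1 + 7x2
# s.t. 2x1 + 3x2 - x3 = 6, 2x1 + x2 - x4 = 4, x >= 0, with basis (x1, x4).
B, N = [[2, 0], [2, -1]], [[3, -1], [1, 0]]    # basic / nonbasic columns
b, cB, cN = [6, 4], [4, 0], [7, 0]

xB = solve2(B, b)                               # x_B = B^-1 b = (3, 2)
Bt = [[B[0][0], B[1][0]], [B[0][1], B[1][1]]]   # transpose of B
lam = solve2(Bt, cB)                            # lambda = c_B B^-1 = (2, 0)
r = [cN[j] - (lam[0] * N[0][j] + lam[1] * N[1][j]) for j in range(2)]
print(xB, lam, r)     # reduced costs (1, 2) >= 0, so (x_B, 0) is optimal
```

The computed values match the slides that follow: basic solution (x1, x4) = (3, 2), dual solution λ = (2, 0), and reduced costs (1, 2).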

153 Example: Consider the basic feasible solution with x1, x4 basic (figure not reproduced).

154-155 Example: With x1, x4 basic, write the problem as min c_B x_B + c_N x_N subject to B x_B + N x_N = b, where B = [[2, 0], [2, −1]] (columns of x1, x4), N = [[3, −1], [1, 0]] (columns of x2, x3), b = (6, 4), c_B = (4, 0), c_N = (7, 0).

156 Example: The basic solution is x_B = B^{-1} b = [[1/2, 0], [1, −1]] (6, 4) = (3, 2), i.e., x1 = 3, x4 = 2.

157 Example: The reduced costs are c_N − c_B B^{-1} N = (7, 0) − (2, 0) N = (1, 2) >= (0, 0), so the solution is optimal.

158 Linear Programming Duality: An LP can be viewed as an inference problem: min cx subject to Ax >= b, x >= 0 equals max v such that Ax >= b, x >= 0 implies cx >= v. Dual problem: find the tightest lower bound on the objective function that is implied by the constraints.

159 Linear Programming Duality: That is, some surrogate (nonnegative linear combination) of Ax >= b dominates cx >= v. From the Farkas Lemma: if Ax >= b, x >= 0 is feasible, then Ax >= b implies cx >= v (for x >= 0) iff λAx >= λb dominates cx >= v for some λ >= 0, i.e., λA <= c and λb >= v.

160 Linear Programming Duality: So the bounding problem becomes max λb subject to λA <= c, λ >= 0. This is the classical LP dual.

161 Linear Programming Duality: The equality min {cx : Ax >= b, x >= 0} = max {λb : λA <= c, λ >= 0}, which holds whenever Ax >= b, x >= 0 is feasible, is called strong duality. Note that the dual of the dual is the primal (i.e., the original LP).

162 Example: Primal: min 4x1 + 7x2 subject to 2x1 + 3x2 >= 6 (λ1), 2x1 + x2 >= 4 (λ2), x1, x2 >= 0. Dual: max 6λ1 + 4λ2 subject to 2λ1 + 2λ2 <= 4 (x1), 3λ1 + λ2 <= 7 (x2), λ1, λ2 >= 0. A dual solution is (λ1, λ2) = (2, 0): with these dual multipliers, the surrogate 2·(2x1 + 3x2 >= 6), i.e., 4x1 + 6x2 >= 12, dominates 4x1 + 7x2 >= 12, the tightest bound on cost.
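Weak and strong duality on this example amount to a few inequalities that can be checked directly:

```python
# The slides' primal/dual pair, checked by direct arithmetic.
c, A, b = [4, 7], [[2, 3], [2, 1]], [6, 4]
x, lam = [3, 0], [2, 0]        # primal and dual solutions from the slides

primal_feasible = all(sum(A[i][j] * x[j] for j in range(2)) >= b[i]
                      for i in range(2))
dual_feasible = all(sum(lam[i] * A[i][j] for i in range(2)) <= c[j]
                    for j in range(2))
primal = sum(c[j] * x[j] for j in range(2))    # cx = 12
dual = sum(lam[i] * b[i] for i in range(2))    # lambda b = 12
print(primal_feasible, dual_feasible, primal, dual)
```

Both solutions are feasible and their objective values coincide, so by weak duality each certifies the other's optimality.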

163 Weak Duality: If x* is feasible in the primal problem min {cx : Ax >= b, x >= 0} and λ* is feasible in the dual problem max {λb : λA <= c, λ >= 0}, then cx* >= λ*b. This is because cx* >= λ*Ax* >= λ*b: the first inequality holds since λ* is dual feasible and x* >= 0, the second since x* is primal feasible and λ* >= 0.

164 Dual multipliers as marginal costs: Suppose we perturb the RHS of an LP (i.e., change the requirement levels): min cx subject to Ax >= b + Δb, x >= 0. The dual of the perturbed LP has the same constraints as the original LP: max λ(b + Δb) subject to λA <= c, λ >= 0. So an optimal solution λ* of the original dual is feasible in the perturbed dual.

165 Dual multipliers as marginal costs: By weak duality, the optimal value of the perturbed LP is at least λ*(b + Δb) = λ*b + λ*Δb, where λ*b is the optimal value of the original LP by strong duality. So λ_i* is a lower bound on the marginal cost of increasing the i-th requirement by one unit (Δb_i = 1). If λ_i* > 0, the i-th constraint must be tight (complementary slackness).

166 Dual of an LP in equality form: Primal: min c_B x_B + c_N x_N subject to B x_B + N x_N = b (λ), x_B, x_N >= 0. Dual: max λb subject to λB <= c_B (x_B), λN <= c_N (x_N), λ unrestricted.

167 Dual of an LP in equality form: Recall that the reduced cost vector is c_N − c_B B^{-1} N = c_N − λN, where λ = c_B B^{-1}; this λ solves the dual if (x_B, 0) solves the primal.

168 Dual of an LP in equality form: Check: λB = c_B B^{-1} B = c_B, and λN = c_B B^{-1} N <= c_N because the reduced cost c_N − c_B B^{-1} N is nonnegative at the optimal solution (x_B, 0).

169 Dual of an LP in equality form: In the example, λ = c_B B^{-1} = (4, 0) [[1/2, 0], [1, −1]] = (2, 0).

170 Dual of an LP in equality form: Note that the reduced cost of an individual variable x_j is r_j = c_j − λA_j, where A_j is column j of A.

171 LP-based Domain Filtering: Let min cx subject to Ax >= b, x >= 0 be an LP relaxation of a CP problem. One way to filter the domain of x_j is to minimize and maximize x_j subject to Ax >= b, x >= 0, but this is time consuming. A faster method is to use dual multipliers to derive valid inequalities. A special case of this method uses reduced costs to bound or fix variables; reduced-cost variable fixing is a widely used technique in OR. CPAIOR tutorial May 2009 Slide 171

172 Suppose: min cx subject to Ax >= b, x >= 0 has optimal solution x*, optimal value v*, and optimal dual solution λ*; that λ_i* > 0, which means the i-th constraint is tight (complementary slackness); that the LP is a relaxation of a CP problem; and that we have a feasible solution of the CP problem with value U, so that U is an upper bound on the optimal value.

173 Under these suppositions: If x were to change to a value other than x*, the LHS of the i-th constraint A_i x >= b_i would change by some amount Δb_i. Since the constraint is tight, this would increase the optimal value as much as changing the constraint to A_i x >= b_i + Δb_i, so it would increase the optimal value by at least λ_i* Δb_i.

174 Under these suppositions: We have found a change in x that changes A_i x by Δb_i and increases the optimal value of the LP by at least λ_i* Δb_i. Since the optimal value of the LP <= optimal value of the CP <= U, we have λ_i* Δb_i <= U − v*, or Δb_i <= (U − v*) / λ_i*.

175 Under these suppositions: Since Δb_i = A_i x − A_i x* = A_i x − b_i, this implies the inequality A_i x <= b_i + (U − v*) / λ_i*, which can be propagated.

176 Example: min 4x1 + 7x2 subject to 2x1 + 3x2 >= 6 (λ1 = 2), 2x1 + x2 >= 4 (λ2 = 0), x1, x2 >= 0. Suppose we have a feasible solution of the original CP with value U = 13. Since the first constraint is tight, we can propagate the inequality A_1 x <= b_1 + (U − v*)/λ1*, or 2x1 + 3x2 <= 6 + (13 − 12)/2 = 6.5.

177 Reduced-cost domain filtering: Suppose x_j* = 0, which means the constraint x_j >= 0 is tight. The inequality A_i x <= b_i + (U − v*)/λ_i* becomes x_j <= (U − v*)/r_j. The dual multiplier for x_j >= 0 is the reduced cost r_j of x_j, because increasing x_j (currently 0) by 1 increases the optimal cost by r_j. Similar reasoning can bound a variable below when it is at its upper bound.

178 Example: For the LP above with U = 13, since x2* = 0 we have x2 <= (U − v*)/r_2 = (13 − 12)/1 = 1. If x2 is required to be integer and we seek a solution strictly better than U, we can fix it to zero. This is reduced-cost variable fixing.
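Reduced-cost filtering is a single division per nonbasic variable. A sketch on the slides' numbers (v* = 12, U = 13, reduced costs (1, 2) for x2, x3; the bound on the surplus variable x3 is an extra illustration, not on the slide):

```python
def reduced_cost_bounds(v_star, U, reduced_costs):
    """For each nonbasic variable x_j at 0 with reduced cost r_j > 0, any
    solution with value <= U must satisfy x_j <= (U - v*) / r_j."""
    return {j: (U - v_star) / r for j, r in reduced_costs.items() if r > 0}

print(reduced_cost_bounds(12, 13, {"x2": 1, "x3": 2}))   # x2 <= 1.0, x3 <= 0.5
```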

179 Example: Single-Vehicle Routing: A vehicle must make several stops and return home, perhaps subject to time windows. The objective is to find the order of stops that minimizes travel time, where c_ij is the travel time from stop i to stop j. This is also known as the traveling salesman problem (with time windows).

180 Assignment Relaxation: min sum_ij c_ij x_ij subject to sum_j x_ij = sum_j x_ji = 1 for all i, x_ij ∈ {0,1}, where x_ij = 1 if stop i immediately precedes stop j. Each stop i is preceded and followed by exactly one stop.

181 Assignment Relaxation: Relax the integrality constraint to 0 <= x_ij <= 1. Because this problem is totally unimodular, it can be solved as an LP. The relaxation provides a very weak lower bound on the optimal value, but reduced-cost variable fixing can be very useful in a CP context.

182 Disjunctions of linear systems: Disjunctions of linear systems often occur naturally in problems and can be given a convex hull relaxation. A disjunction of linear systems, min cx subject to ∨_k (A^k x >= b^k), represents a union of polyhedra.

183 Relaxing a disjunction of linear systems: We want a convex hull relaxation (the tightest linear relaxation) of min cx subject to ∨_k (A^k x >= b^k).

184 Relaxing a disjunction of linear systems: The closure of the convex hull of the disjunction is described by min cx subject to A^k x^k >= b^k y_k for all k, x = sum_k x^k, sum_k y_k = 1, 0 <= y_k <= 1.

185 Why? To derive the convex hull relaxation of a disjunction, write each solution x as a convex combination of points in the polyhedra: A^k x̄^k >= b^k for all k, x = sum_k y_k x̄^k, sum_k y_k = 1, 0 <= y_k <= 1.

186 Why? With the change of variable x^k = y_k x̄^k, this becomes the convex hull relaxation (the tightest linear relaxation): min cx subject to A^k x^k >= b^k y_k for all k, x = sum_k x^k, sum_k y_k = 1, 0 <= y_k <= 1.

187 Mixed Integer/Linear Modeling: MILP Representability. Disjunctive Modeling. Knapsack Modeling.

188 Motivation: A mixed integer/linear programming (MILP) problem has the form min cx + dy subject to Ax + By >= b, x, y >= 0, y integer. We can relax a CP problem by modeling some constraints with an MILP. If desired, we can then relax the MILP by dropping the integrality constraint, to obtain an LP. The LP relaxation can be strengthened with cutting planes. The first step is to learn how to write MILP models.

189 MILP Representability: A subset S of R^n is MILP representable if it is the projection onto x of some MILP constraint set of the form Ax + Bu + Dy >= b, with x ∈ R^{n−p} x Z^p, u ∈ R^m, y ∈ {0,1}^k.

190 MILP Representability: Theorem. S is MILP representable if and only if S is the union of finitely many mixed integer polyhedra having the same recession cone (the set of directions in which the polyhedron is unbounded).

191 Example: Fixed charge function: Minimize a fixed charge function: min x2 subject to x2 >= 0 if x1 = 0, x2 >= f + c x1 if x1 > 0, with x1 >= 0.

192 Example: The feasible set is the region on or above the graph of the fixed charge function (figure not reproduced).

193-194 Example: The feasible set is the union of two polyhedra P1 (the ray x1 = 0, x2 >= 0) and P2 (x2 >= f + c x1, x1 >= 0).

195 Example: The polyhedra have different recession cones: that of P1 is the vertical ray, while that of P2 also contains the unbounded horizontal directions along the cost line.

196 Example: Add an upper bound x1 <= M, so the disjuncts become (x1 = 0, x2 >= 0) and (x2 >= f + c x1, 0 <= x1 <= M). The two polyhedra now have the same recession cone (the vertical ray).

197 Modeling a union of polyhedra: Start with a disjunction of linear systems to represent the union of polyhedra, min cx subject to ∨_k (A^k x >= b^k), where the k-th polyhedron is {x : A^k x >= b^k}. Introduce a 0-1 variable y_k that is 1 when x is in polyhedron k, and disaggregate x to create an x^k for each k: min cx subject to A^k x^k >= b^k y_k for all k, x = sum_k x^k, sum_k y_k = 1, y_k ∈ {0,1}.

198 Example: Start with a disjunction of linear systems to represent the union of polyhedra: min x2 subject to (x1 = 0, x2 >= 0) ∨ (0 <= x1 <= M, x2 >= f + c x1).

199 Example: Introduce a 0-1 variable y_k that is 1 when x is in polyhedron k, and disaggregate x into x^1 and x^2: min x2 subject to x1^1 = 0, x2^1 >= 0; 0 <= x1^2 <= M y2, x2^2 >= f y2 + c x1^2; y1 + y2 = 1, y_k ∈ {0,1}; x = x^1 + x^2.

200 Example: To simplify, replace x1^2 with x1 and x2^2 with x2 (since x1^1 = 0 and x2^1 can be set to 0), and replace y2 with y. This yields min x2 subject to x2 >= f y + c x1, 0 <= x1 <= M y, y ∈ {0,1}, or equivalently min f y + c x1 subject to 0 <= x1 <= M y, y ∈ {0,1}. Here M is the "big M".
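The correctness of the big-M model can be checked by brute force over y: for a given x1 (with x1 <= M) it reproduces the fixed charge function. A sketch with assumed data f = 10, c = 3, M = 100:

```python
def fixed_charge_milp_value(x1, f, cost, M):
    """Smallest x2 allowed by the big-M model
       x2 >= f*y + cost*x1,  0 <= x1 <= M*y,  y in {0,1}."""
    feasible = [f * y + cost * x1 for y in (0, 1) if 0 <= x1 <= M * y]
    return min(feasible) if feasible else None

f, cost, M = 10, 3, 100                          # assumed example data
print(fixed_charge_milp_value(0, f, cost, M))    # 0: no fixed charge at x1 = 0
print(fixed_charge_milp_value(7, f, cost, M))    # 31 = f + 3*7: y forced to 1
```

Note that if the integrality of y is dropped, y can shrink to x1/M, which is exactly why a tight M matters for the LP relaxation.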

201 Disjunctive Modeling: Disjunctions often occur naturally in problems and can be given an MILP model. Recall that a disjunction of linear systems min cx subject to ∨_k (A^k x >= b^k), representing polyhedra with the same recession cone, has the MILP model min cx subject to A^k x^k >= b^k y_k for all k, x = sum_k x^k, sum_k y_k = 1, y_k ∈ {0,1}.

202 Example: Uncapacitated facility location

203 Uncapacitated facility location: With m possible factory locations and n markets, let x_ij be the fraction of market j's demand satisfied from location i, f_i the fixed cost of a factory at location i, and c_ij the transport cost. Disjunctive model: min sum_i z_i + sum_ij c_ij x_ij subject to sum_i x_ij = 1 for all j, and for each i the disjunction (no factory at location i: x_ij = 0 for all j, z_i = 0) ∨ (factory at location i: 0 <= x_ij <= 1 for all j, z_i = f_i).

204 Uncapacitated facility location: MILP formulation: min sum_i f_i y_i + sum_ij c_ij x_ij subject to sum_i x_ij = 1 for all j, 0 <= x_ij <= y_i for all i, j, y_i ∈ {0,1}.

205 Uncapacitated facility location: Beginner's model: min sum_i f_i y_i + sum_ij c_ij x_ij subject to sum_i x_ij = 1 for all j, sum_j x_ij <= n y_i for all i (n y_i being the maximum output from location i), y_i ∈ {0,1}. This model, based on the capacitated location model, has a weaker continuous relaxation (obtained by replacing y_i ∈ {0,1} with 0 <= y_i <= 1). The beginner's mistake can be avoided by starting with the disjunctive formulation.
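The difference in relaxation tightness is easy to see numerically: for a fixed fractional x-row, each model forces a minimum value of y_i in its LP relaxation, and the beginner's model forces much less, so its LP bound on the fixed cost is weaker. A sketch (data hypothetical):

```python
def min_y_tight(x_row):
    """Smallest y_i allowed by the LP relaxation of  x_ij <= y_i (all j)."""
    return max(x_row)

def min_y_beginner(x_row):
    """Smallest y_i allowed by the LP relaxation of  sum_j x_ij <= n y_i."""
    return sum(x_row) / len(x_row)

x = [1.0, 0.0, 0.0]    # location i fully serves market 1, n = 3 markets
print(min_y_tight(x), min_y_beginner(x))   # 1.0 vs 1/3: weaker fixed-cost bound
```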

206 Knapsack Modeling: Knapsack models consist of knapsack covering and knapsack packing constraints. The freight transfer model presented earlier is an example. We will consider a similar example that combines disjunctive and knapsack modeling. Most OR professionals are unlikely to write a model as good as the one presented here.

207 Note on tightness of knapsack models: The continuous relaxation of a knapsack model is not in general a convex hull relaxation. A disjunctive formulation would provide a convex hull relaxation, but there are exponentially many disjuncts. Knapsack cuts can significantly tighten the relaxation.

208 Example: Package transport: Each package j has size a_j; each truck i has capacity Q_i and costs c_i to operate. Let y_i = 1 if truck i is used and x_ij = 1 if truck i carries package j. Disjunctive model: min sum_i z_i subject to the knapsack constraint sum_i Q_i y_i >= sum_j a_j; sum_i x_ij = 1 for all j; and for each i the disjunction (truck i used: y_i = 1, z_i = c_i, sum_j a_j x_ij <= Q_i) ∨ (truck i not used: y_i = 0, z_i = 0, x_ij = 0 for all j); x_ij ∈ {0,1}.

209 Example: Package transport: MILP model: min sum_i c_i y_i subject to sum_i Q_i y_i >= sum_j a_j; sum_i x_ij = 1 for all j; sum_j a_j x_ij <= Q_i y_i for all i; x_ij <= y_i for all i, j; x_ij, y_i ∈ {0,1}.

210 Example: Package transport: The constraint sum_i Q_i y_i >= sum_j a_j is a modeling trick that is unobvious without the disjunctive approach. Most OR professionals would omit it, since it is implied by summing the capacity constraints sum_j a_j x_ij <= Q_i y_i over i. But it generates very effective knapsack cuts.

211 Network Flows and Filtering: Min Cost Network Flow. Max Flow. Filtering: Cardinality. Filtering: Sequence.

212 Min Cost Network Flow: A min cost network flow problem: each node has a net supply b_i, and each arc (i,j) a unit cost c_ij of flow (network diagram not reproduced).

213 Min Cost Network Flow: With x_ij the flow on arc (i,j), the general problem is min sum_ij c_ij x_ij subject to flow balance sum_j x_ij − sum_j x_ji = b_i at every node i, x >= 0. This is an LP.

214 Min Cost Network Flow: Matrix form: the constraint matrix is the node-arc incidence matrix, with a +1 and a −1 in each column (matrix not reproduced).

215 Min Cost Network Flow: The rows of the incidence matrix sum to zero, so its rank is less than m (the number of nodes). We will show that the rank is m − 1.

216 Min Cost Network Flow: Basis tree theorem: every basis corresponds to a spanning tree; the basic columns are those of the arcs of a spanning tree (figure not reproduced).

217-218 Min Cost Network Flow: The columns of a spanning tree can be triangularized (except for the last row) by permuting rows and columns, so they have rank m − 1 and form a basis.

219 Min Cost Network Flow: Conversely, any basis corresponds to a spanning tree. Why? Columns corresponding to a cycle are linearly dependent (take multiplier +1 for each forward arc and −1 for each backward arc of the cycle) and therefore cannot all be part of a basis.

220 Optimality conditions: Recall that a basic solution is optimal if the reduced cost vector is nonnegative. For network flow, the reduced cost of arc (i,j) is r_ij = c_ij − u_i + u_j, where the node potentials u (the dual variables) are computed by solving a triangular system.

221 Optimality conditions: On the basis tree, the equations u_i − u_j = c_ij for the basic arcs (i,j) are solved after fixing one u_i to, say, zero.

222 Optimality conditions: The potentials can be solved directly on the network: fix one potential to zero and propagate the equations along the tree arcs.

223 Optimality conditions: We can improve the solution by adding an arc with negative reduced cost to the basis; in the example, the reduced cost of arc (1,3) is r_13 = c_13 − u_1 + u_3 = −1 < 0.

224 Improvement step: We can improve the solution by adding an arc with negative reduced cost to the basis. This creates a cycle.

225 Improvement step: Move flow around the cycle to obtain the next basic solution.

226 Improvement step: Adding the entering arc and moving flow around the resulting cycle is one step of the network simplex method.

227 Max Flow Problem: A special case of the max cost flow problem, useful for filtering alldiff, cardinality, etc.: maximize the flow from a source s to a sink t subject to arc capacities.

228 Max Flow Problem: In general: maximize the total s-t flow subject to flow balance at each node and the capacity bounds on each arc.

229 Max Flow Problem: Formulation as a max cost flow problem: add a return arc from t to s with cost 1, and cost zero on the other arcs, so that maximizing cost maximizes the flow on the return arc. A basic solution is shown (figure not reproduced).

230 Max Flow Problem: The potentials (dual variables) are easy to compute: the nodes split into an S-T cut, with potentials 0 in S and 1 in T, where the return arc (cost 1) crosses the cut.

231 Max Flow Problem: The reduced costs r_ij = c_ij − u_i + u_j are then immediate from the cut (figure not reproduced).

232 Max Flow Problem: So a basic solution is optimal if the flows on arcs from S to T are at capacity and the flows on arcs from T to S are zero.

233-234 Improvement step: This basic solution is suboptimal. Add a nonzero T-S arc to the basis; this creates a cycle.

235 Improvement step: To increase the total s-t flow, increase the flow on the forward arcs of the dashed path and decrease it on the backward arcs.

236 Improvement step: Equivalently, increase the flow along an augmenting path of the residual graph.

237 Filtering: Cardinality Constraint: Network flow model of a cardinality constraint with the given domains: a source s, a node for each value with capacity bounds on how many variables may take it, a node for each variable, and a sink t (network not reproduced).

238 Filtering: Cardinality Constraint (build): a feasible flow is shown on the network, with flows and capacity bounds on the arcs (figure not reproduced).

239 Filtering: Cardinality Constraint: The constraint is feasible because the max s-t flow is 4, one unit per variable.

240 Filtering: Cardinality Constraint: Network-based filtering achieves domain consistency. We can remove value c from the domain of x2 if the max flow from c to x2 is zero.

241-242 Filtering: Cardinality Constraint: But here zero is not the max flow from c to x2: there is an augmenting path from x2 to c in the residual graph (residual graph not reproduced).

243 Filtering: Cardinality Constraint: However, we can remove value a from the domain of x2, since there is no augmenting path for it in the residual graph.

244 Filtering: Alldiff. Alldiff is the special case in which these capacities are [0,1] (a max cardinality bipartite matching problem).
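As a sketch of matching-based alldiff filtering, the support test can be brute-forced: keep a value only if forcing it still leaves a matching that covers every variable. Régin's algorithm does this test once via SCCs of the residual graph; the brute force below is only for illustration.

```python
def feasible(doms):
    """Max bipartite matching by augmenting paths; True iff every
    variable can be matched to a distinct value."""
    match = {}                      # value -> variable
    def augment(i, seen):
        for v in doms[i]:
            if v not in seen:
                seen.add(v)
                if v not in match or augment(match[v], seen):
                    match[v] = i
                    return True
        return False
    return all(augment(i, set()) for i in range(len(doms)))

def all_diff_filter(domains):
    """Keep value v in D(x_i) iff x_i = v extends to a full matching."""
    n = len(domains)
    return [[v for v in domains[i]
             if feasible([[v] if j == i else list(domains[j])
                          for j in range(n)])]
            for i in range(n)]
```

For example, with x1, x2 ranging over {a,b} and x3 over {a,b,c}, filtering fixes x3 = c, since a and b must be used by x1 and x2.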

245 Filtering: Sequence. The sequence constraint has several polytime filters that achieve domain consistency: cumulative sums (which also filters gensequence), a network flow model, and decomposition with propagation (based on Berge acyclicity of the constraint hypergraph). We will develop the network flow model.

246–248 Filtering: Sequence. Consider a sequence constraint over 0-1 variables y_i in which every stretch of 3 consecutive variables must contain a bounded number of 1s. The IP formulation imposes one such inequality for each window of 3 consecutive variables.

249–252 The transpose of the constraint matrix has the consecutive ones property. We will see later that it is therefore totally unimodular, so the problem can be solved as an LP (all vertex solutions of the LP are integral). In fact, it can be solved and filtered as a network flow problem: append a row of zeros and subtract each row from the next.

253–255 Filtering: Sequence. The result is a network flow problem, whose network has one arc for each variable y1, …, y7. The network can be analyzed for filtering in the same way as the cardinality network.

256–258 Filtering: gensequence. The gensequence constraint allows arbitrary stretches of variables: each X_i is a subset of consecutive variables in x = (x1, …, xn). It may be possible to permute the rows so that the matrix has the consecutive ones property, which allows the network flow model to be used; this can be checked in O(m + n + r) time, where r is the number of nonzeros in the matrix. Even without consecutive ones, there may be an equivalent network flow matrix; this can be checked in O(mr) time.

259 Integral Polyhedra: Total Unimodularity, Network Flow Matrices, Interval Matrices

260–262 Integral polyhedron. An integral polyhedron is one whose vertices all have integral coordinates. If the continuous relaxation of an MILP model describes an integral polyhedron, the model can be solved as an LP (all vertices are integral).

Classic result: total unimodularity. A matrix is totally unimodular if every square submatrix has determinant 0, 1, or −1.

Theorem. A matrix A with integral components is totally unimodular if and only if Ax ≤ b, x ≥ 0 describes an integral polyhedron for every integral b.

263–264 Total unimodularity. Lemma. The following operations preserve total unimodularity: transposition, swapping rows or columns, negating a column, and adding a unit column.

Key Theorem. Matrix A is totally unimodular if and only if every subset J of columns has a partition J = J1 ∪ J2 such that for each row i of A,
| Σ_{j∈J1} A_ij − Σ_{j∈J2} A_ij | ≤ 1

265 Total unimodularity. Corollary. A network flow matrix is totally unimodular.

266 Total unimodularity. Corollary. A matrix with the consecutive ones property (an interval matrix) is totally unimodular.
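The definition of total unimodularity can be checked directly on small matrices by enumerating all square submatrices (a brute-force sketch, practical only for tiny examples):

```python
from itertools import combinations

def is_totally_unimodular(A):
    """Brute-force TU test: every square submatrix must have
    determinant 0, +1, or -1."""
    m, n = len(A), len(A[0])

    def det(M):  # Laplace expansion, exact integer arithmetic
        if len(M) == 1:
            return M[0][0]
        return sum((-1) ** j * M[0][j] *
                   det([row[:j] + row[j + 1:] for row in M[1:]])
                   for j in range(len(M)))

    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                if det([[A[r][c] for c in cols] for r in rows]) not in (-1, 0, 1):
                    return False
    return True
```

An interval (consecutive ones) matrix passes the test, while the 3×3 circulant with an odd cycle fails (it has a submatrix of determinant 2).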

267 Cutting Planes: 0-1 Knapsack Cuts, Gomory Cuts, Mixed Integer Rounding Cuts, Example: Product Configuration

268 To review. A cutting plane (cut, valid inequality) for an MILP model is valid: it is satisfied by all feasible solutions of the model. It cuts off solutions of the continuous relaxation, which makes the relaxation tighter.

269 Motivation. Cutting planes (cuts) tighten the continuous relaxation of an MILP model. Knapsack cuts are generated for individual knapsack constraints; we saw general integer knapsack cuts earlier. 0-1 knapsack cuts and lifting techniques are well studied and widely used. Rounding cuts are generated for the entire MILP and are widely used: Gomory cuts for problems with integer variables only, and mixed integer rounding cuts for any MILP.

270–271 0-1 Knapsack Cuts. 0-1 knapsack cuts are designed for knapsack constraints with 0-1 variables. The analysis differs from that of general knapsack constraints, to exploit the special structure of 0-1 inequalities. Consider a 0-1 knapsack packing constraint ax ≤ a0 (knapsack covering constraints are analyzed similarly). An index set J is a cover if Σ_{j∈J} a_j > a0. The cover inequality
Σ_{j∈J} x_j ≤ |J| − 1
is a 0-1 knapsack cut for ax ≤ a0. Only minimal covers need be considered.

272 Example. J = {1,2,3,4} is a cover for
6x1 + 5x2 + 5x3 + 5x4 + 8x5 + 3x6 ≤ 17
This gives rise to the cover inequality x1 + x2 + x3 + x4 ≤ 3.
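The cover condition and the validity of the cover inequality can be verified by enumeration (an illustrative sketch using the example's data):

```python
from itertools import product

def is_cover(a, a0, J):
    """J is a cover for ax <= a0 iff its coefficients sum past a0."""
    return sum(a[j] for j in J) > a0

def cover_inequality_valid(a, a0, J):
    """Check sum_{j in J} x_j <= |J| - 1 on every feasible 0-1 point."""
    return all(sum(x[j] for j in J) <= len(J) - 1
               for x in product([0, 1], repeat=len(a))
               if sum(ai * xi for ai, xi in zip(a, x)) <= a0)
```

With a = (6,5,5,5,8,3) and a0 = 17, the set {1,2,3,4} (0-based {0,1,2,3}) is a cover, and x1 + x2 + x3 + x4 ≤ 3 holds at every feasible 0-1 point.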

273–274 Sequential lifting. A cover inequality can often be strengthened by lifting it into a higher-dimensional space, that is, by adding variables. Sequential lifting adds one variable at a time; sequence-independent lifting adds several variables at once. To lift a cover inequality Σ_{j∈J} x_j ≤ |J| − 1, add a term to the left-hand side:
Σ_{j∈J} x_j + π_k x_k ≤ |J| − 1
where π_k is the largest coefficient for which the inequality is still valid:
π_k = |J| − 1 − max { Σ_{j∈J} x_j : Σ_{j∈J} a_j x_j ≤ a0 − a_k, x_j ∈ {0,1} for j ∈ J }
This can be done repeatedly (by dynamic programming).

275 Example. Given 6x1 + 5x2 + 5x3 + 5x4 + 8x5 + 3x6 ≤ 17, lift x1 + x2 + x3 + x4 ≤ 3 by adding the term π5 x5, where
π5 = 3 − max { x1 + x2 + x3 + x4 : 6x1 + 5x2 + 5x3 + 5x4 ≤ 17 − 8, x_j ∈ {0,1} } = 3 − 1 = 2
This yields x1 + x2 + x3 + x4 + 2x5 ≤ 3, and further lifting (with x6) leaves the cut unchanged. But if the variables are added in the order x6, x5, the result is different:
x1 + x2 + x3 + x4 + x5 + x6 ≤ 3
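Sequential lifting can be sketched by brute force, computing each π_k from the maximization above (illustrative only; real implementations use dynamic programming):

```python
from itertools import product

def sequential_lift(a, a0, cover, order):
    """Lift the cover inequality sum_{j in cover} x_j <= |cover| - 1,
    one variable at a time in the given order.  Each new coefficient is
        pi_k = rhs - max{ sum_j coef_j x_j : sum_j a_j x_j <= a0 - a_k }
    maximized over 0-1 points on the variables already in the cut."""
    coef = {j: 1 for j in cover}
    rhs = len(cover) - 1
    for k in order:
        idx = list(coef)
        best = max(sum(coef[j] * x[i] for i, j in enumerate(idx))
                   for x in product([0, 1], repeat=len(idx))
                   if sum(a[j] * x[i] for i, j in enumerate(idx)) <= a0 - a[k])
        coef[k] = rhs - best
    return coef, rhs
```

On the example (0-based indices), lifting in the order x5, x6 gives coefficients (1,1,1,1,2,0), while the order x6, x5 gives (1,1,1,1,1,1), matching the two cuts above.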

276 Sequence-independent lifting. Sequence-independent lifting usually yields a weaker cut than sequential lifting, but it adds all the variables at once and is much faster. It is commonly used in commercial MILP solvers.

277 Sequence-independent lifting. To lift a cover inequality Σ_{j∈J} x_j ≤ |J| − 1, add terms to the left-hand side:
Σ_{j∈J} x_j + Σ_{k∉J} ρ(a_k) x_k ≤ |J| − 1
where, with λ = Σ_{j∈J} a_j − a0, p = |J|, A_0 = 0, and A_j = the sum of the j largest coefficients a_k for k ∈ J:
ρ(u) = j                              if A_j ≤ u ≤ A_{j+1} − λ, j = 0, …, p − 1
ρ(u) = j + (u − (A_{j+1} − λ)) / λ    if A_{j+1} − λ ≤ u ≤ A_{j+1}, j = 0, …, p − 1

278 Example. Given 6x1 + 5x2 + 5x3 + 5x4 + 8x5 + 3x6 ≤ 17, lift x1 + x2 + x3 + x4 ≤ 3 by adding the terms ρ(8) x5 + ρ(3) x6, where ρ(u) is the lifting function of slide 277. This yields the lifted cut
x1 + x2 + x3 + x4 + (5/4) x5 + (1/4) x6 ≤ 3
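A sketch of the lifting function ρ as reconstructed above (λ = Σ_{j∈J} a_j − a0, A_j = sum of the j largest cover coefficients); the coefficients 5/4 and 1/4 of the example fall on the rising segments of ρ:

```python
def rho(u, a, a0, cover):
    """Lifting function for sequence-independent lifting of a cover
    inequality: lam = sum of cover coefficients - a0, A[j] = sum of the
    j largest cover coefficients.  rho is flat on [A_j, A_{j+1} - lam]
    and rises linearly by 1 over [A_{j+1} - lam, A_{j+1}]."""
    lam = sum(a[j] for j in cover) - a0
    s = sorted((a[j] for j in cover), reverse=True)
    A = [0]
    for c in s:
        A.append(A[-1] + c)
    for j in range(len(s)):
        if A[j] <= u <= A[j + 1] - lam:
            return j
        if A[j + 1] - lam <= u <= A[j + 1]:
            return j + (u - (A[j + 1] - lam)) / lam
    return len(s)
```

For the example, λ = 21 − 17 = 4 and (A_1, …, A_4) = (6, 11, 16, 21), so ρ(8) = 1 + 1/4 = 5/4 and ρ(3) = 1/4.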

279 Gomory Cuts. When an integer programming problem has a nonintegral LP solution, we can generate at least one Gomory cut to cut off that solution. This is a special case of a separating cut, because it separates the current solution of the relaxation from the feasible set. Gomory cuts are widely used and very effective in MILP solvers.

280 Gomory cuts. Given an integer programming problem
min cx
Ax = b, x ≥ 0 and integral
let (x_B, 0) be an optimal solution of the continuous relaxation, where
x_B = b̂ − N̂ x_N,  b̂ = B⁻¹ b,  N̂ = B⁻¹ N
Then if x_i is nonintegral in this solution, the following Gomory cut is violated by (x_B, 0):
x_i + ⌊N̂_i⌋ x_N ≤ ⌊b̂_i⌋

281–283 Example.
min 2x1 + 3x2
x1 + 3x2 ≥ 3
4x1 + 3x2 ≥ 6
x1, x2 ≥ 0 and integral
or, with surplus variables,
min 2x1 + 3x2
x1 + 3x2 − x3 = 3
4x1 + 3x2 − x4 = 6
x_j ≥ 0 and integral
The optimal solution of the continuous relaxation has
x_B = (x1, x2) = (1, 2/3),  N̂ = [ 1/3  −1/3 ; −4/9  1/9 ],  b̂ = (1, 2/3)
The Gomory cut from the x2 row is
x2 + ⌊−4/9⌋ x3 + ⌊1/9⌋ x4 ≤ ⌊2/3⌋,  or  x2 − x3 ≤ 0
In (x1, x2)-space this is x1 + 2x2 ≥ 3. A further Gomory cut can be generated after re-solving the LP with the previous cut added.
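The tableau computation for this example can be reproduced with exact rational arithmetic (an illustrative sketch for this instance, not general-purpose code):

```python
from fractions import Fraction as F
import math

# min 2x1 + 3x2  s.t.  x1 + 3x2 - x3 = 3,  4x1 + 3x2 - x4 = 6,  x >= 0.
B = [[F(1), F(3)], [F(4), F(3)]]       # basis columns (x1, x2)
N = [[F(-1), F(0)], [F(0), F(-1)]]     # surplus columns (x3, x4)
b = [F(3), F(6)]

det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
Binv = [[B[1][1] / det, -B[0][1] / det],
        [-B[1][0] / det, B[0][0] / det]]
bhat = [sum(Binv[i][k] * b[k] for k in range(2)) for i in range(2)]
Nhat = [[sum(Binv[i][k] * N[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)]

i = 1                                  # x2 = 2/3 is the fractional variable
# Gomory cut: x_i + floor(Nhat_i) x_N <= floor(bhat_i)
cut_coefs = [math.floor(Nhat[i][j]) for j in range(2)]
cut_rhs = math.floor(bhat[i])          # x2 - x3 <= 0, i.e. x1 + 2x2 >= 3
```

Substituting x3 = x1 + 3x2 − 3 into x2 − x3 ≤ 0 recovers the cut x1 + 2x2 ≥ 3 in the original variables.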

284 Mixed Integer Rounding Cuts. Mixed integer rounding (MIR) cuts can be generated for solutions of any relaxed MILP in which one or more integer variables has a fractional value. Like Gomory cuts, they are separating cuts. MIR cuts are widely used in commercial solvers.

285 MIR cuts. Given an MILP problem
min cx + dy
Ax + Dy = b, x, y ≥ 0 and y integral
in an optimal solution of the continuous relaxation, let
J = { j : y_j is nonbasic },  K = { j : x_j is nonbasic },  N = nonbasic columns of [A D]
Then if y_i is nonintegral in this solution, the following MIR cut is violated by the solution of the relaxation:
y_i + Σ_{j∈J1} ( ⌊N̂_ij⌋ + frac(N̂_ij)/frac(b̂_i) ) y_j + Σ_{j∈J2} ⌈N̂_ij⌉ y_j + (1/frac(b̂_i)) Σ_{j∈K} N̂⁺_ij x_j ≥ ⌈b̂_i⌉
where J1 = { j ∈ J : frac(N̂_ij) ≤ frac(b̂_i) } and J2 = J \ J1.

286 Example.
3x1 + 4x2 − 6y1 − 4y2 = …
x1 + 2x2 − y1 − y2 = …
x_j, y_j ≥ 0, y_j integer
Take a basic solution with y1 = 8/3 and x1 basic, y2 and x2 nonbasic, so that
N̂ = [ 1/3  2/3 ; 2/3  8/3 ],  b̂ = (8/3, 17/3)
and J = {2}, K = {2}. Using the y1 row, with b̂_i = 8/3: frac(N̂_i2) = 1/3 ≤ 2/3 = frac(b̂_i), so y2 ∈ J1 and the MIR cut is
y1 + ( ⌊1/3⌋ + (1/3)/(2/3) ) y2 + (1/(2/3)) (2/3) x2 ≥ ⌈8/3⌉
or y1 + (1/2) y2 + x2 ≥ 3.

287 Example: Product Configuration. This example illustrates: combination of propagation and relaxation, processing of variable indices, and the continuous relaxation of the element constraint.

288 The problem. Choose what type of each component (memory, disk drive, power supply) to install in a personal computer, and how many of each.

289 Model of the problem.
min Σ_j c_j v_j
v_j = Σ_i A_{i j t_i} q_i, all j
L_j ≤ v_j ≤ U_j, all j
where c_j is the unit cost of producing attribute j; v_j is the amount of attribute j produced (< 0 if consumed): memory, heat, power, weight, etc.; A_{i j t_i} is the amount of attribute j produced by type t_i of component i (t_i is a variable index); and q_i is the quantity of component i installed.

290 To solve it: Branch on the domains of t_i and q_i. Propagate the element constraints and the bounds on v_j; each variable index is converted to a specially structured element constraint, and valid knapsack cuts are derived and propagated. Use linear continuous relaxations, with a special-purpose MILP relaxation for element.

291–292 Propagation. The bounds L_j ≤ v_j ≤ U_j are propagated in the usual way. The constraint v_j = Σ_i A_{i j t_i} q_i is rewritten as
v_j = Σ_i z_ij, all j
element( t_i, (q_i A_{ij1}, …, q_i A_{ijn}), z_ij ), all i, j

293–294 The element constraints are propagated by (a) using specialized filters for element constraints of this form, (b) adding knapsack cuts for the valid inequalities
Σ_i max_{k ∈ D_{t_i}} {A_{ijk}} q_i ≥ v̲_j, all j
Σ_i min_{k ∈ D_{t_i}} {A_{ijk}} q_i ≤ v̄_j, all j
where [v̲_j, v̄_j] is the current domain of v_j, and (c) propagating the knapsack cuts.

295–296 Relaxation. The bounds on v_j are relaxed as v̲_j ≤ v_j ≤ v̄_j. The constraint v_j = Σ_i A_{i j t_i} q_i is relaxed by relaxing the element constraints and adding the knapsack cuts.

297 Each element constraint is replaced by a disjunctive convex hull relaxation:
z_ij = Σ_{k ∈ D_{t_i}} A_{ijk} q_ik,  q_i = Σ_{k ∈ D_{t_i}} q_ik

298 So the following LP relaxation is solved at each node of the search tree to obtain a lower bound:
min Σ_j c_j v_j
v_j = Σ_i Σ_{k ∈ D_{t_i}} A_{ijk} q_ik, all j
q_i = Σ_{k ∈ D_{t_i}} q_ik, all i
v̲_j ≤ v_j ≤ v̄_j, all j
q̲_i ≤ q_i ≤ q̄_i, all i
knapsack cuts for Σ_i max_{k ∈ D_{t_i}} {A_{ijk}} q_i ≥ v̲_j, all j
knapsack cuts for Σ_i min_{k ∈ D_{t_i}} {A_{ijk}} q_i ≤ v̄_j, all j
q_ik ≥ 0, all i, k

299 Lagrangean Relaxation: Lagrangean Duality, Properties of the Lagrangean Dual, Example: Fast Linear Programming, Domain Filtering, Example: Continuous Global Optimization

300 Motivation. Lagrangean relaxation can provide better bounds than LP relaxation. The Lagrangean dual generalizes LP duality, and it provides domain filtering analogous to that based on LP duality; this is a key technique in continuous global optimization. Lagrangean relaxation gets rid of troublesome constraints by dualizing them, that is, by moving them into the objective function. The Lagrangean relaxation may also decouple.

301–302 Lagrangean Duality. Consider an inequality-constrained problem
min f(x)
g(x) ≤ 0   (hard constraints)
x ∈ S      (easy constraints)
The object is to get rid of (dualize) the hard constraints by moving them into the objective function. The problem is related to an inference problem, the Lagrangean dual: find the tightest lower bound on the objective function that is implied by the constraints,
max { v : g(x) ≤ 0 implies f(x) ≥ v, for x ∈ S }

303–305 The surrogate −λg(x) ≥ 0 (for some λ ≥ 0) dominates f(x) − v ≥ 0 on S iff
f(x) − v ≥ −λg(x) for all x ∈ S
that is, v ≤ f(x) + λg(x) for all x ∈ S, or
v ≤ min_{x∈S} { f(x) + λg(x) }

306 So the dual becomes
max_{λ ≥ 0} θ(λ),  where  θ(λ) = min_{x∈S} { f(x) + λg(x) }
is the Lagrangean relaxation and λ is the vector of Lagrange multipliers. The Lagrangean dual can be viewed as the problem of finding the Lagrangean relaxation that gives the tightest bound.

307 Example.
min 3x1 + 4x2
x1 − 3x2 ≤ 0
5 − 2x1 − x2 ≤ 0
x1, x2 ∈ {0,1,2,3}
The Lagrangean relaxation is
θ(λ1, λ2) = min_{x_j ∈ {0,…,3}} { 3x1 + 4x2 + λ1 (x1 − 3x2) + λ2 (5 − 2x1 − x2) }
          = min_{x_j ∈ {0,…,3}} { (3 + λ1 − 2λ2) x1 + (4 − 3λ1 − λ2) x2 + 5λ2 }
The Lagrangean relaxation is easy to solve for any given λ1, λ2:
x1 = 0 if 3 + λ1 − 2λ2 ≥ 0, and x1 = 3 otherwise
x2 = 0 if 4 − 3λ1 − λ2 ≥ 0, and x2 = 3 otherwise
The optimal solution of the original problem is (x1, x2) = (2, 1).

308 Example. θ(λ1, λ2) is piecewise linear and concave. Solution of the Lagrangean dual: (λ1, λ2) = (5/7, 13/7), with θ(λ) = 9 2/7. The optimal solution of the primal is (2, 1), with value 10. Note the duality gap between 10 and 9 2/7 (there is no strong duality).
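The dual solution and the duality gap can be checked by brute force over the easy set (an illustrative sketch of the example above):

```python
from fractions import Fraction as F
from itertools import product

def f(x):  return 3 * x[0] + 4 * x[1]
def g(x):  return (x[0] - 3 * x[1], 5 - 2 * x[0] - x[1])   # g(x) <= 0 form

def theta(l1, l2):
    """Lagrangean relaxation: minimize f + lambda.g over x in {0,..,3}^2."""
    return min(f(x) + l1 * g(x)[0] + l2 * g(x)[1]
               for x in product(range(4), repeat=2))

# Primal optimum by enumeration of the feasible points.
primal = min(f(x) for x in product(range(4), repeat=2)
             if all(gi <= 0 for gi in g(x)))
```

At (λ1, λ2) = (5/7, 13/7) both variable coefficients vanish, so θ = 5λ2 = 65/7 = 9 2/7, strictly below the primal value 10.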

309 Example. Note: in this example the Lagrangean dual provides the same bound (9 2/7) as the continuous relaxation of the IP. This is because the Lagrangean relaxation can be solved as an LP:
θ(λ1, λ2) = min_{0 ≤ x_j ≤ 3} { (3 + λ1 − 2λ2) x1 + (4 − 3λ1 − λ2) x2 + 5λ2 }
Lagrangean duality is useful when the Lagrangean relaxation is tighter than an LP but nonetheless easy to solve.

310 Properties of the Lagrangean dual. Weak duality: for any feasible x* and any λ* ≥ 0, f(x*) ≥ θ(λ*). In particular,
min { f(x) : g(x) ≤ 0, x ∈ S } ≥ max_{λ ≥ 0} θ(λ)
Concavity: θ(λ) is concave, so it can be maximized by local search methods. Complementary slackness: if x* and λ* are optimal and there is no duality gap, then λ* g(x*) = 0.

311 Solving the Lagrangean dual. Let λ^k be the kth iterate, and let
λ^{k+1} = λ^k + α_k ξ^k
where ξ^k is a subgradient of θ(λ) at λ = λ^k. If x^k solves the Lagrangean relaxation for λ = λ^k, then ξ^k = g(x^k); this is because θ(λ) ≤ f(x^k) + λ g(x^k), with equality at λ = λ^k. The stepsize α_k must be adjusted so that the sequence converges, but not before reaching a maximum.
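A minimal projected-subgradient sketch on the preceding example; the step size 1/(k+1) is one simple diminishing choice, not the tutorial's rule:

```python
from itertools import product

def argmin_relaxation(l1, l2):
    """A minimizer of f(x) + lambda.g(x) over x in {0,...,3}^2."""
    return min(product(range(4), repeat=2),
               key=lambda x: 3 * x[0] + 4 * x[1]
                             + l1 * (x[0] - 3 * x[1])
                             + l2 * (5 - 2 * x[0] - x[1]))

l1 = l2 = 0.0
best = float("-inf")
for k in range(2000):
    x = argmin_relaxation(l1, l2)
    sg = (x[0] - 3 * x[1], 5 - 2 * x[0] - x[1])   # xi^k = g(x^k)
    best = max(best, 3 * x[0] + 4 * x[1] + l1 * sg[0] + l2 * sg[1])
    step = 1.0 / (k + 1)                          # diminishing step size
    l1 = max(0.0, l1 + step * sg[0])              # project onto lambda >= 0
    l2 = max(0.0, l2 + step * sg[1])
```

By weak duality every θ value stays below the maximum 9 2/7, and the best bound found approaches it.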

312 Example: Fast Linear Programming. In CP contexts, it is best to process each node of the search tree very rapidly. Lagrangean relaxation may allow very fast calculation of a lower bound on the optimal value of the LP relaxation at each node. The idea is to solve the Lagrangean dual at the root node (which is an LP) and use the same Lagrange multipliers to get an LP bound at other nodes.

313–314 At the root node, solve
min cx
Ax ≥ b   (λ)  (dualized)
Dx ≥ d   (special structure, e.g. variable bounds)
x ≥ 0
The (partial) LP dual solution λ* solves the Lagrangean dual in which
θ(λ) = min { cx − λ(Ax − b) : Dx ≥ d, x ≥ 0 }
At another node, the LP is
min cx
Ax ≥ b   (λ)
Dx ≥ d
Hx ≥ h   (branching constraints, etc.)
x ≥ 0
Here θ(λ*) is still a lower bound on the optimal value of the LP and can be quickly calculated by solving a specially structured LP.

315 Domain Filtering. Suppose
min f(x), g(x) ≤ 0, x ∈ S
has optimal solution x*, optimal value v*, and optimal Lagrangean dual solution λ*; that λ_i* > 0, so the i-th constraint is tight (complementary slackness); that this problem is a relaxation of a CP problem; and that we have a feasible solution of the CP problem with value U, so that U is an upper bound on the optimal value.

316 If x were to change to a value other than x*, the left-hand side of the i-th constraint g_i(x) ≤ 0 would change by some amount Δ_i. Since the constraint is tight, this would increase the optimal value as much as changing the constraint to g_i(x) − Δ_i ≤ 0, so it would increase the optimal value by at least λ_i* Δ_i. (It is easily shown that Lagrange multipliers are marginal costs; dual multipliers for LP are a special case of Lagrange multipliers.)

317–318 We have found a change in x that changes g_i(x) by Δ_i and increases the optimal value by at least λ_i* Δ_i. Since the optimal value of this problem ≤ the optimal value of the CP ≤ U, we have λ_i* Δ_i ≤ U − v*, or
Δ_i ≤ (U − v*) / λ_i*
Since Δ_i = g_i(x) − g_i(x*) = g_i(x), this implies the inequality
g_i(x) ≤ (U − v*) / λ_i*
which can be propagated.

319 Example: Continuous Global Optimization. Some of the best continuous global solvers (e.g., BARON) combine OR-style relaxation with CP-style interval arithmetic and domain filtering. These methods can be combined with domain filtering based on Lagrange multipliers.

320 Continuous Global Optimization.
max x1 + x2
4 x1 x2 = 1
2 x1 + x2 ≤ 2
x1 ∈ [0,1], x2 ∈ [0,2]
The feasible set is the part of the curve 4 x1 x2 = 1 satisfying 2 x1 + x2 ≤ 2; it contains a global optimum and a local optimum.

321 To solve it: Search: split the interval domains of x1, x2; each node of the search tree is a problem restriction. Propagation: interval propagation and domain filtering; use Lagrange multipliers to infer a valid inequality for propagation (reduced-cost variable fixing is a special case). Relaxation: use function factorization to obtain a linear continuous relaxation.

322 Interval propagation. Propagate the intervals [0,1], [0,2] through the constraints to obtain x1 ∈ [1/8, 7/8], x2 ∈ [1/4, 7/4].

323 Relaxation (function factorization). Factor complex functions into elementary functions that have known linear relaxations. Write 4 x1 x2 = 1 as 4y = 1 where y = x1 x2. This factors 4 x1 x2 into the linear function 4y and the bilinear function x1 x2. The linear function 4y is its own linear relaxation.

324 Relaxation (function factorization). The bilinear function y = x1 x2 has the relaxation
x̲2 x1 + x̲1 x2 − x̲1 x̲2 ≤ y ≤ x̲2 x1 + x̄1 x2 − x̄1 x̲2
x̄2 x1 + x̄1 x2 − x̄1 x̄2 ≤ y ≤ x̄2 x1 + x̲1 x2 − x̲1 x̄2
where the domain of x_j is [x̲_j, x̄_j].
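These four inequalities (the McCormick envelope) can be sanity-checked by sampling the box (illustrative sketch):

```python
import random

def mccormick(x1, x2, lo1, hi1, lo2, hi2):
    """McCormick envelope bounds for y = x1 * x2 on a box:
    two underestimators and two overestimators of the bilinear term."""
    lower = max(lo2 * x1 + lo1 * x2 - lo1 * lo2,
                hi2 * x1 + hi1 * x2 - hi1 * hi2)
    upper = min(hi2 * x1 + lo1 * x2 - lo1 * hi2,
                lo2 * x1 + hi1 * x2 - hi1 * lo2)
    return lower, upper

random.seed(0)
envelope_holds = all(
    lo <= x1 * x2 + 1e-9 and x1 * x2 <= hi + 1e-9
    for x1, x2 in ((random.uniform(0, 1), random.uniform(0, 2))
                   for _ in range(1000))
    for lo, hi in [mccormick(x1, x2, 0, 1, 0, 2)])
```

Each inequality comes from expanding a product of nonnegative terms, e.g. (x1 − x̲1)(x2 − x̲2) ≥ 0 gives the first underestimator.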

325 Relaxation (function factorization). The linear relaxation becomes:
max x1 + x2
4y = 1
2 x1 + x2 ≤ 2
x̲2 x1 + x̲1 x2 − x̲1 x̲2 ≤ y ≤ x̲2 x1 + x̄1 x2 − x̄1 x̲2
x̄2 x1 + x̄1 x2 − x̄1 x̄2 ≤ y ≤ x̄2 x1 + x̲1 x2 − x̲1 x̄2
x̲_j ≤ x_j ≤ x̄_j, j = 1, 2

326–327 Solve the linear relaxation. Since its solution is infeasible for the original problem, split an interval and branch: x2 ∈ [1, 1.75] or x2 ∈ [0.25, 1].

328–330 In the branch with x2 ∈ [0.25, 1], the solution of the relaxation is feasible, with value 1.25; this becomes the incumbent solution. In the branch with x2 ∈ [1, 1.75], the solution of the relaxation is not quite feasible. Here we also use Lagrange multipliers for domain filtering.

331–332 Relaxation (function factorization). In the solution of the relaxation, the Lagrange multiplier associated with the constraint 2 x1 + x2 ≤ 2 is λ2 = 1.1. This yields a valid inequality for propagation:
2 x1 + x2 ≥ 2 − (value of relaxation − value of incumbent solution) / 1.1

333 Dynamic Programming in CP: Example: Capital Budgeting, Domain Filtering, Recursive Optimization, Filtering for Stretch, Filtering for Regular

334 Motivation. Dynamic programming (DP) is a highly versatile technique that can exploit recursive structure in a problem. Domain filtering is straightforward for problems modeled as a DP. DP is also important in designing filters for some global constraints, such as stretch and regular. Nonserial DP is related to bucket elimination in CP and exploits the structure of the primal graph. DP modeling is the art of keeping the state space small while maintaining a Markovian property. We will examine only one simple example of serial DP.

335 Example: Capital Budgeting. We wish to build power plants with a total cost of at most 12 million euros. There are three types of plants, costing 4, 2, or 3 million euros each, and we must build one or two of each type. With x_j the number of plants of type j, the problem has a simple knapsack packing model:
4x1 + 2x2 + 3x3 ≤ 12,  x_j ∈ {1, 2}

336–337 For a constraint ax ≤ b, let the state s_k at stage k be the sum of the first k − 1 terms of ax. The recursion is
f_k(s_k) = Σ_{x_k ∈ D_{x_k}} f_{k+1}(s_k + a_k x_k)
with boundary condition f_{n+1}(s_{n+1}) = 1 if s_{n+1} ≤ b and 0 otherwise, so that f_k(s_k) counts the paths from state s_k to feasible solutions. For example, f_3(8) = f_4(8 + 3·1) + f_4(8 + 3·2) = 1 + 0 = 1, since f_4(11) = 1 and f_4(14) = 0. The problem is feasible if f_1(0) > 0.

338 The problem is feasible: each path to a state with f = 1 is a feasible solution. Path 1: x = (1,2,1); Path 2: x = (1,1,2); Path 3: x = (1,1,1). The possible costs are 9, 11, 12.

339 Key property: the DP model is Markovian. The possible transitions depend only on the current state, not on how the state was reached.

340 Domain Filtering. To filter the domains, observe which values of x_k occur on feasible paths:
D_{x1} = {1},  D_{x2} = {1,2},  D_{x3} = {1,2}
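The counting recursion and the path-based domain filtering can be sketched as follows (illustrative):

```python
from functools import lru_cache

a, b, dom = [4, 2, 3], 12, (1, 2)      # 4x1 + 2x2 + 3x3 <= 12, x_j in {1,2}
n = len(a)

@lru_cache(maxsize=None)
def f(k, s):
    """Number of feasible completions from stage k with partial sum s."""
    if k == n:
        return 1 if s <= b else 0
    return sum(f(k + 1, s + a[k] * x) for x in dom)

def filtered_domains():
    """Keep x_k = v iff it lies on some feasible path through the DP."""
    doms, states = [], {0}
    for k in range(n):
        keep, nxt = set(), set()
        for s in states:
            for v in dom:
                if f(k + 1, s + a[k] * v) > 0:
                    keep.add(v)
                    nxt.add(s + a[k] * v)
        doms.append(sorted(keep))
        states = nxt
    return doms
```

f(0, 0) counts the three feasible solutions, and the filter removes x1 = 2, matching the slide's filtered domains.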

341–342 Recursive Optimization. Maximize revenue:
max 15x1 + 10x2 + 12x3
4x1 + 2x2 + 3x3 ≤ 12,  x_j ∈ {1, 2}
The recursion now includes arc values:
f_k(s_k) = max_{x_k ∈ D_{x_k}} { c_k x_k + f_{k+1}(s_k + a_k x_k) }
with boundary condition f_{n+1}(s_{n+1}) = 0 if s_{n+1} ≤ b and −∞ otherwise, so that f_k(s_k) is the value of a max-value path from s_k to the final stage (the value to go). For example,
f_3(8) = max { 12·1 + f_4(11), 12·2 + f_4(14) } = max { 12, −∞ } = 12
The optimal value is f_1(0).

343 The maximum revenue is 49, and the optimal path is easy to retrace: (x1, x2, x3) = (1, 1, 2).
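The value-to-go recursion and the path retracing can be sketched as follows (illustrative):

```python
from functools import lru_cache

a, c, b = [4, 2, 3], [15, 10, 12], 12   # max 15x1 + 10x2 + 12x3 s.t. ax <= 12
NEG = float("-inf")

@lru_cache(maxsize=None)
def f(k, s):
    """Max value-to-go from stage k with partial sum s of ax."""
    if k == 3:
        return 0 if s <= b else NEG
    return max(c[k] * x + f(k + 1, s + a[k] * x) for x in (1, 2))

def best_path():
    """Retrace an optimal path greedily using the value-to-go."""
    x, s = [], 0
    for k in range(3):
        xk = max((1, 2), key=lambda v: c[k] * v + f(k + 1, s + a[k] * v))
        x.append(xk)
        s += a[k] * xk
    return x
```

f(0, 0) returns the maximum revenue 49, attained by the path (1, 1, 2).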

344–347 Filtering: Stretch. Example: a stretch constraint on x = (x1, …, xn) with pattern P = {(a,b), (b,a), (b,c), (c,b)}. Shifts must occur in stretches of length 2 or 3, and workers cannot change directly between shifts a and c. Given the variable domains, there are exactly two feasible solutions.

348–349 State transition graph (transitions defined by the choice of stretch), with states indexed by (day, shift). The model is Markovian because the pattern constraint involves only 2 consecutive states.

350–351 Remove the states that are not backward reachable from a feasible end state. The domains can now be filtered: a value survives only if it occurs in some stretch on a surviving path.

352–353 Filtering: Regular. Encode the stretch example as a deterministic finite automaton A, with an initial state, circled nodes as accepting (terminal) states, and transitions defined by the choice of shift. Now impose the constraint regular(x, A).

354–356 Filtering can be done on a DP state transition graph: remove the states that are not backward reachable from an accepting state in the final stage, then filter the domains.
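Layered-graph filtering for regular can be sketched as follows; the tiny DFA used in the test (which forbids two consecutive a's) is a made-up example, not the automaton from the slides:

```python
def regular_filter(domains, delta, q0, accepting):
    """Filter domains of regular(x, A) on the layered DP graph.
    delta: dict (state, value) -> state.  Value v stays in D(x_i) iff
    an arc labeled v joins a forward-reachable state at layer i to a
    backward-reachable state at layer i + 1."""
    n = len(domains)
    fwd = [set() for _ in range(n + 1)]
    fwd[0] = {q0}
    for i in range(n):                       # forward reachability
        fwd[i + 1] = {delta[q, v] for q in fwd[i] for v in domains[i]
                      if (q, v) in delta}
    bwd = [set() for _ in range(n + 1)]
    bwd[n] = set(accepting) & fwd[n]         # accepting final states
    for i in range(n - 1, -1, -1):           # backward reachability
        bwd[i] = {q for q in fwd[i] for v in domains[i]
                  if delta.get((q, v)) in bwd[i + 1]}
    return [sorted(v for v in domains[i]
                   if any(delta.get((q, v)) in bwd[i + 1] for q in bwd[i]))
            for i in range(n)]
```

With x2 fixed to a and no "aa" allowed, filtering forces x1 = b and x3 = b.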

357 CP-based Branch and Price: Basic Idea, Example: Airline Crew Scheduling

358 Motivation. Branch and price allows solution of integer programming problems with a huge number of variables. The problem is solved by a branch-and-relax method; the difference lies in how the LP relaxation is solved. Variables are added to the LP relaxation only as needed, and variables are priced to find which ones should be added. CP is useful for solving the pricing problem, particularly when the constraints are complex. CP-based branch and price has been successfully applied to airline crew scheduling, transit scheduling, and other transportation-related problems.

359 Basic Idea Suppose the LP relaxation of an integer programming problem has a huge number of variables:

    min  cx
    s.t. Ax = b,  x ≥ 0        (A_j denotes column j of A)

We solve a restricted master problem, which contains only a small subset J of the variables:

    min  Σ_{j∈J} c_j x_j
    s.t. Σ_{j∈J} A_j x_j = b    (dual solution λ)
         x_j ≥ 0,  j ∈ J

Adding a variable x_k to the problem would improve the solution if x_k has a negative reduced cost: r_k = c_k − λA_k < 0.

360 Basic Idea Adding x_k to the problem would improve the solution if x_k has a negative reduced cost:

    r_k = c_k − λA_k < 0

Computing the reduced cost of x_k is known as pricing x_k. So we solve the pricing problem

    min { c_y − λy : y is a column of A }

where c_y is the cost of column y. If the solution y* satisfies c_{y*} − λy* < 0, then we can add column y* to the restricted master problem.
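A minimal pricing sketch over an explicit list of candidate columns (the columns, costs, and dual vector below are illustrative, not the slide's data):

```python
# Pricing by reduced cost: r_k = c_k - λ·A_k. Return the most negative
# column, i.e. the most promising variable to add to the restricted master.

def price(columns, costs, lam):
    """columns: list of column vectors A_k; costs: list of c_k; lam: duals λ.
    Returns (index, reduced cost) of the best improving column, or None."""
    best = None
    for k, (c_k, a_k) in enumerate(zip(costs, columns)):
        r_k = c_k - sum(l * a for l, a in zip(lam, a_k))
        if r_k < 0 and (best is None or r_k < best[1]):
            best = (k, r_k)
    return best

# With duals λ = (1.5, 1.0) the reduced costs are 0.5, 2.0, -0.5,
# so column 2 would enter the restricted master problem.
print(price([[1, 0], [0, 1], [1, 1]], [2, 3, 2], [1.5, 1.0]))  # (2, -0.5)
```

In practice the columns are not listed explicitly — that is exactly why the pricing problem is solved as an optimization (here, by CP) over the constraints the columns must satisfy.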

361 Basic Idea The pricing problem min { c_y − λy : y is a column of A } need not be solved to optimality, so long as we find some column with negative reduced cost. However, once we can no longer find an improving column, we must solve the pricing problem to optimality to verify that we have the optimal solution of the LP. If we can state constraints that the columns of A must satisfy, CP may be a good way to solve the pricing problem.

362 Example: Airline Crew Scheduling We want to assign crew members to flights so as to minimize cost while covering the flights and observing complex work rules. A roster is the sequence of flights assigned to a single crew member. The gap between two consecutive flights in a roster must be from 2 to 3 hours, and the total flight time for a roster must be between 6 and 10 hours. Given the flight start and finish times, for example, flight 1 cannot immediately precede flight 6, and flight 4 cannot immediately precede flight 5. The possible rosters are (1,3,5), (1,4,6), (2,3,5), (2,4,6).
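Feasible rosters can be enumerated by depth-first extension. The flight times below are hypothetical — the slide's flight data table did not survive transcription — and are chosen so the stated work rules yield exactly the four rosters above.

```python
# Hypothetical (start, finish) times in hours for flights 1-6.
flights = {1: (0.0, 2.5), 2: (0.5, 3.0), 3: (5.0, 7.5),
           4: (5.5, 8.0), 5: (9.5, 12.0), 6: (11.0, 13.5)}

MIN_GAP, MAX_GAP = 2.0, 3.0    # allowed gap between consecutive flights
MIN_FLY, MAX_FLY = 6.0, 10.0   # allowed total flight time per roster

def rosters():
    out = []
    def extend(path, fly):
        # Record the partial roster if its total flight time is in range.
        if MIN_FLY <= fly <= MAX_FLY:
            out.append(tuple(path))
        last_finish = flights[path[-1]][1]
        for j, (s, f) in flights.items():
            if j not in path and MIN_GAP <= s - last_finish <= MAX_GAP:
                extend(path + [j], fly + (f - s))
    for i, (s, f) in flights.items():
        extend([i], f - s)
    return sorted(out)

print(rosters())  # [(1, 3, 5), (1, 4, 6), (2, 3, 5), (2, 4, 6)]
```

With these times, 3→6 (gap 3.5 h) and 4→5 (gap 1.5 h) violate the gap rule, and any roster of fewer than three flights falls short of the 6-hour minimum, reproducing the slide's four rosters.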

363 Airline Crew Scheduling There are 2 crew members, and the possible rosters are (1,3,5), (1,4,6), (2,3,5), (2,4,6). In the LP relaxation of the problem, c_ik is the cost of assigning crew member i to roster k, and x_ik = 1 if we assign crew member i to roster k, = 0 otherwise. Each crew member is assigned to exactly 1 roster, and each flight is assigned at least 1 crew member; the covering constraint for each of flights 1–6 sums over the rosters that cover that flight.

370 Airline Crew Scheduling In a real problem, there can be millions of rosters.

371 Airline Crew Scheduling We start by solving the problem with a subset of the columns. The optimal dual solution gives dual variables u_1, u_2 for the crew member constraints and v_1, …, v_6 for the flight covering constraints.

373 Airline Crew Scheduling The reduced cost of an excluded roster k for crew member i is

    c_ik − u_i − Σ_{j in roster k} v_j

We will formulate the pricing problem as a shortest path problem.
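The roster reduced-cost formula is direct to evaluate; the dual values below are made up for illustration.

```python
# Reduced cost of an excluded roster k for crew member i:
# r_ik = c_ik - u_i - sum of v_j over the flights j in the roster.

def roster_reduced_cost(c_ik, u_i, v, roster):
    return c_ik - u_i - sum(v[j] for j in roster)

u = {1: 3.0, 2: 2.0}                                    # crew duals (made up)
v = {1: 1.0, 2: 0.0, 3: 2.0, 4: 1.0, 5: 1.0, 6: 0.5}   # flight duals (made up)
print(roster_reduced_cost(10.0, u[1], v, (1, 3, 5)))    # 10 - 3 - 4 = 3.0
```

A roster with a negative value here would be added as a new column; the shortest path formulation that follows computes this same quantity as a path length.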

374 Pricing problem [Shortest-path networks, one for each crew member.]

375 Pricing problem Each s–t path corresponds to a roster, provided the total flight time is within bounds.

376 Pricing problem The length of an arc into flight 3 is the cost of flight 3 when it immediately follows flight 1, offset by the dual multiplier for flight 1.

377 Pricing problem The length of an arc out of the source is the cost of transferring from home to flight 1, offset by the dual multiplier for crew member 1; the dual multiplier is omitted for crew member 2 to break symmetry.

378 Pricing problem The length of a path is the reduced cost of the corresponding roster.

379 Pricing problem [Arc lengths computed from the dual solution of the LP relaxation.]

380 Pricing problem Solving the shortest path problems: the shortest path for crew member 1 has reduced cost −1, so we add x_12 to the problem; the shortest path for crew member 2 has reduced cost −2, so we add x_23. After x_12 and x_23 are added, no remaining variable has negative reduced cost.

381 Pricing problem The shortest path problem cannot be solved by traditional shortest path algorithms, due to the bounds on the total duration of flights. It can be solved by CP, using a Path global constraint and a set-sum constraint:

    min  z_i
    Path(X_i, z_i, G),  z_i < 0,  for each crew member i
    T_min ≤ Σ_{j∈X_i} (f_j − s_j) ≤ T_max,  X_i ⊆ flights

where X_i is the set of flights assigned to crew member i, z_i is the resulting path length (reduced cost), G is the graph, and f_j − s_j is the duration of flight j (finish time minus start time).
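The slide solves the pricing problem with CP; a labeling (dynamic programming) sketch is another common option for such resource-constrained shortest paths. The idea is to keep one label per (node, accumulated flight time) pair so the duration bounds can be checked at the sink. All data below is illustrative.

```python
# Resource-constrained shortest path by labeling: labels are keyed on
# (node, accumulated flight time), so paths with different flight times
# are kept separate and the time window is enforced at the sink.

def shortest_roster(arcs, durations, source, sink, t_min, t_max):
    """arcs: dict (i, j) -> arc length (reduced-cost contribution);
    durations: flight -> hours. Returns (length, path) of the shortest
    source-sink path with total flight time in [t_min, t_max], or None."""
    labels = {(source, 0.0): (0.0, [source])}
    frontier = [(source, 0.0)]
    best = None
    while frontier:
        node, t = frontier.pop()
        cost, path = labels[(node, t)]
        if node == sink and t_min <= t <= t_max:
            if best is None or cost < best[0]:
                best = (cost, path)
        for (i, j), w in arcs.items():
            if i != node:
                continue
            key = (j, t + durations.get(j, 0.0))  # sink consumes no time
            if key not in labels or labels[key][0] > cost + w:
                labels[key] = (cost + w, path + [j])
                frontier.append(key)
    return best

arcs = {('s', 1): -2.0, ('s', 2): -1.0, (1, 3): 1.0, (2, 3): 0.5,
        (3, 't'): 0.0}
durations = {1: 2.5, 2: 2.5, 3: 2.5}
print(shortest_roster(arcs, durations, 's', 't', 4.0, 6.0))
# (-1.0, ['s', 1, 3, 't'])
```

Since the path length is negative, the corresponding roster would enter the restricted master problem as a new column.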

382 CP-based Benders Decomposition Benders Decomposition in the Abstract Classical Benders Decomposition Example: Machine Scheduling


More information

CO759: Algorithmic Game Theory Spring 2015

CO759: Algorithmic Game Theory Spring 2015 CO759: Algorithmic Game Theory Spring 2015 Instructor: Chaitanya Swamy Assignment 1 Due: By Jun 25, 2015 You may use anything proved in class directly. I will maintain a FAQ about the assignment on the

More information

- Well-characterized problems, min-max relations, approximate certificates. - LP problems in the standard form, primal and dual linear programs

- Well-characterized problems, min-max relations, approximate certificates. - LP problems in the standard form, primal and dual linear programs LP-Duality ( Approximation Algorithms by V. Vazirani, Chapter 12) - Well-characterized problems, min-max relations, approximate certificates - LP problems in the standard form, primal and dual linear programs

More information

3.4 Relaxations and bounds

3.4 Relaxations and bounds 3.4 Relaxations and bounds Consider a generic Discrete Optimization problem z = min{c(x) : x X} with an optimal solution x X. In general, the algorithms generate not only a decreasing sequence of upper

More information

Decomposition-based Methods for Large-scale Discrete Optimization p.1

Decomposition-based Methods for Large-scale Discrete Optimization p.1 Decomposition-based Methods for Large-scale Discrete Optimization Matthew V Galati Ted K Ralphs Department of Industrial and Systems Engineering Lehigh University, Bethlehem, PA, USA Départment de Mathématiques

More information

Integer programming: an introduction. Alessandro Astolfi

Integer programming: an introduction. Alessandro Astolfi Integer programming: an introduction Alessandro Astolfi Outline Introduction Examples Methods for solving ILP Optimization on graphs LP problems with integer solutions Summary Introduction Integer programming

More information

Section Notes 9. Midterm 2 Review. Applied Math / Engineering Sciences 121. Week of December 3, 2018

Section Notes 9. Midterm 2 Review. Applied Math / Engineering Sciences 121. Week of December 3, 2018 Section Notes 9 Midterm 2 Review Applied Math / Engineering Sciences 121 Week of December 3, 2018 The following list of topics is an overview of the material that was covered in the lectures and sections

More information

Optimization Methods in Logic

Optimization Methods in Logic Optimization Methods in Logic John Hooker Carnegie Mellon University February 2003, Revised December 2008 1 Numerical Semantics for Logic Optimization can make at least two contributions to boolean logic.

More information

Introduction to Integer Linear Programming

Introduction to Integer Linear Programming Lecture 7/12/2006 p. 1/30 Introduction to Integer Linear Programming Leo Liberti, Ruslan Sadykov LIX, École Polytechnique liberti@lix.polytechnique.fr sadykov@lix.polytechnique.fr Lecture 7/12/2006 p.

More information

IP Cut Homework from J and B Chapter 9: 14, 15, 16, 23, 24, You wish to solve the IP below with a cutting plane technique.

IP Cut Homework from J and B Chapter 9: 14, 15, 16, 23, 24, You wish to solve the IP below with a cutting plane technique. IP Cut Homework from J and B Chapter 9: 14, 15, 16, 23, 24, 31 14. You wish to solve the IP below with a cutting plane technique. Maximize 4x 1 + 2x 2 + x 3 subject to 14x 1 + 10x 2 + 11x 3 32 10x 1 +

More information

Integer Linear Programming Modeling

Integer Linear Programming Modeling DM554/DM545 Linear and Lecture 9 Integer Linear Programming Marco Chiarandini Department of Mathematics & Computer Science University of Southern Denmark Outline 1. 2. Assignment Problem Knapsack Problem

More information

An Integrated Solver for Optimization Problems

An Integrated Solver for Optimization Problems WORKING PAPER WPS-MAS-08-01 Department of Management Science School of Business Administration, University of Miami Coral Gables, FL 33124-8237 Created: December 2005 Last update: July 2008 An Integrated

More information

Totally unimodular matrices. Introduction to integer programming III: Network Flow, Interval Scheduling, and Vehicle Routing Problems

Totally unimodular matrices. Introduction to integer programming III: Network Flow, Interval Scheduling, and Vehicle Routing Problems Totally unimodular matrices Introduction to integer programming III: Network Flow, Interval Scheduling, and Vehicle Routing Problems Martin Branda Charles University in Prague Faculty of Mathematics and

More information

Introduction to integer programming III:

Introduction to integer programming III: Introduction to integer programming III: Network Flow, Interval Scheduling, and Vehicle Routing Problems Martin Branda Charles University in Prague Faculty of Mathematics and Physics Department of Probability

More information

INTEGER PROGRAMMING. In many problems the decision variables must have integer values.

INTEGER PROGRAMMING. In many problems the decision variables must have integer values. INTEGER PROGRAMMING Integer Programming In many problems the decision variables must have integer values. Example:assign people, machines, and vehicles to activities in integer quantities. If this is the

More information

18 hours nodes, first feasible 3.7% gap Time: 92 days!! LP relaxation at root node: Branch and bound

18 hours nodes, first feasible 3.7% gap Time: 92 days!! LP relaxation at root node: Branch and bound The MIP Landscape 1 Example 1: LP still can be HARD SGM: Schedule Generation Model Example 157323 1: LP rows, still can 182812 be HARD columns, 6348437 nzs LP relaxation at root node: 18 hours Branch and

More information

Lecture 9: Dantzig-Wolfe Decomposition

Lecture 9: Dantzig-Wolfe Decomposition Lecture 9: Dantzig-Wolfe Decomposition (3 units) Outline Dantzig-Wolfe decomposition Column generation algorithm Relation to Lagrangian dual Branch-and-price method Generated assignment problem and multi-commodity

More information

COT 6936: Topics in Algorithms! Giri Narasimhan. ECS 254A / EC 2443; Phone: x3748

COT 6936: Topics in Algorithms! Giri Narasimhan. ECS 254A / EC 2443; Phone: x3748 COT 6936: Topics in Algorithms! Giri Narasimhan ECS 254A / EC 2443; Phone: x3748 giri@cs.fiu.edu https://moodle.cis.fiu.edu/v2.1/course/view.php?id=612 Gaussian Elimination! Solving a system of simultaneous

More information

Logic-based Benders Decomposition

Logic-based Benders Decomposition Logic-based Benders Decomposition A short Introduction Martin Riedler AC Retreat Contents 1 Introduction 2 Motivation 3 Further Notes MR Logic-based Benders Decomposition June 29 July 1 2 / 15 Basic idea

More information

1 Column Generation and the Cutting Stock Problem

1 Column Generation and the Cutting Stock Problem 1 Column Generation and the Cutting Stock Problem In the linear programming approach to the traveling salesman problem we used the cutting plane approach. The cutting plane approach is appropriate when

More information

Robust Scheduling with Logic-Based Benders Decomposition

Robust Scheduling with Logic-Based Benders Decomposition Robust Scheduling with Logic-Based Benders Decomposition Elvin Çoban and Aliza Heching and J N Hooker and Alan Scheller-Wolf Abstract We study project scheduling at a large IT services delivery center

More information

Section #2: Linear and Integer Programming

Section #2: Linear and Integer Programming Section #2: Linear and Integer Programming Prof. Dr. Sven Seuken 8.3.2012 (with most slides borrowed from David Parkes) Housekeeping Game Theory homework submitted? HW-00 and HW-01 returned Feedback on

More information

Introduction to optimization

Introduction to optimization Introduction to optimization Geir Dahl CMA, Dept. of Mathematics and Dept. of Informatics University of Oslo 1 / 24 The plan 1. The basic concepts 2. Some useful tools (linear programming = linear optimization)

More information

Modeling with Integer Programming

Modeling with Integer Programming Modeling with Integer Programg Laura Galli December 18, 2014 We can use 0-1 (binary) variables for a variety of purposes, such as: Modeling yes/no decisions Enforcing disjunctions Enforcing logical conditions

More information

The Dual Simplex Algorithm

The Dual Simplex Algorithm p. 1 The Dual Simplex Algorithm Primal optimal (dual feasible) and primal feasible (dual optimal) bases The dual simplex tableau, dual optimality and the dual pivot rules Classical applications of linear

More information