Lecture 8: Column Generation (3 units)

Outline
- Cutting stock problem
- Classical IP formulation
- Set covering formulation
- Column generation
- A dual perspective
- Vehicle routing problem

Cutting stock problem

Problem description

A paper mill has a number of rolls of paper of fixed width. Different customers want different numbers of rolls of various widths. How should the rolls be cut so as to minimize the waste (the amount of left-over paper)?

An example

The width of the large rolls is 5600 mm. The widths and demands of the customers are:

Width (mm)   Demand
1380         22
1520         25
1560         12
1710         14
1820         18
1880         18
1930         20
2000         10
2050         12
2100         14
2140         16
2150         18
2200         20

An optimal solution: (figure not reproduced)

Classical IP formulation

Suppose the fixed width of the large rolls is $W$. The $m$ customers want $n_i$ rolls of width $w_i$ ($i = 1,\dots,m$), with $w_i \le W$.

Notation:
- $K$: index set of available rolls
- $y_k = 1$ if roll $k$ is cut, $0$ otherwise
- $x_i^k$: number of times item $i$ is cut on roll $k$

The IP formulation of Kantorovich:

$$
\begin{aligned}
(P_1)\quad \min\ & \sum_{k \in K} y_k \\
\text{s.t.}\ & \sum_{k \in K} x_i^k \ge n_i, && i = 1,\dots,m, && \text{(demand)} \\
& \sum_{i=1}^{m} w_i x_i^k \le W y_k, && k \in K, && \text{(width limitation)} \\
& x_i^k \in \mathbb{Z}_+,\quad y_k \in \{0,1\}.
\end{aligned}
$$

However, the IP formulation $(P_1)$ is inefficient from both the computational and the theoretical points of view. For example, when the number of rolls is 100 and the number of items is 20, the problem could not be solved to optimality in days (CPLEX). The main reason is that the linear programming (LP) relaxation of $(P_1)$ is poor. In fact, since the width and demand constraints hold with equality at an LP optimum, the LP bound of $(P_1)$ is

$$
Z_{LP} = \sum_{k \in K} y_k = \sum_{k \in K} \sum_{i=1}^{m} \frac{w_i x_i^k}{W} = \sum_{i=1}^{m} \frac{w_i}{W} \sum_{k \in K} x_i^k = \sum_{i=1}^{m} \frac{w_i n_i}{W}.
$$

Question: Is there an alternative IP formulation?
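As a quick numerical check of this bound, the sketch below evaluates $\sum_i w_i n_i / W$ for the 13-width example with $W = 5600$ given earlier. The variable names are illustrative and not part of the lecture.

```python
from fractions import Fraction
from math import ceil

# Data of the paper-roll example (widths in mm, demands in rolls).
W = 5600
widths = [1380, 1520, 1560, 1710, 1820, 1880, 1930,
          2000, 2050, 2100, 2140, 2150, 2200]
demands = [22, 25, 12, 14, 18, 18, 20, 10, 12, 14, 16, 18, 20]

# Total demanded width; dividing by W gives the LP bound of (P1).
total_width = sum(w * n for w, n in zip(widths, demands))
lp_bound = Fraction(total_width, W)

print(total_width)      # 407160
print(float(lp_bound))  # about 72.71
print(ceil(lp_bound))   # 73, so no integer solution can use fewer rolls
```

The bound only measures total material, which is why it says nothing about how the widths actually fit together on a roll.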

Set covering formulation of Gilmore and Gomory

Let
- $x_j$ = number of times pattern $j$ is used,
- $a_{ij}$ = number of times item $i$ is cut in pattern $j$.

For example, suppose the fixed width of the large rolls is $W = 100$, the demands are $n_i = 100, 200, 300$ and the widths are $w_i = 25, 35, 45$ ($i = 1, 2, 3$). The large roll can be cut into, for instance:
- Pattern 1: 4 rolls each of width $w_1 = 25$, so $a_{11} = 4$;
- Pattern 2: 1 roll of width $w_1 = 25$ and 2 rolls each of width $w_2 = 35$, so $a_{12} = 1$, $a_{22} = 2$;
- Pattern 3: 2 rolls each of width $w_3 = 45$, so $a_{33} = 2$.
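For an instance this small, every feasible pattern can be listed explicitly. The following sketch enumerates them for $W = 100$ and $w = (25, 35, 45)$; the helper name `feasible_patterns` is hypothetical, and "maximal" here means that no further item fits into the left-over width.

```python
from itertools import product

W = 100
widths = [25, 35, 45]

def feasible_patterns(widths, W):
    """Return all non-empty patterns a with sum_i widths[i] * a[i] <= W."""
    ranges = [range(W // w + 1) for w in widths]
    return [a for a in product(*ranges)
            if any(a) and sum(w * k for w, k in zip(widths, a)) <= W]

def used_width(a):
    return sum(w * k for w, k in zip(widths, a))

patterns = feasible_patterns(widths, W)
# Maximal patterns: the left-over is smaller than the narrowest item.
maximal = [a for a in patterns if W - used_width(a) < min(widths)]

print(len(patterns))  # 14
print(maximal)
```

The three patterns named in the example, (4, 0, 0), (1, 2, 0) and (0, 0, 2), all appear in the list of maximal patterns.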

Set covering formulation:

$$
\begin{aligned}
(P_2)\quad \min\ & \sum_{j=1}^{n} x_j \\
\text{s.t.}\ & \sum_{j=1}^{n} a_{ij} x_j \ge n_i, && i = 1,\dots,m, && \text{(demand)} \\
& x_j \in \mathbb{Z}_+, && j = 1,\dots,n,
\end{aligned}
$$

where $n$ is the total number of cutting patterns satisfying $\sum_{i=1}^{m} w_i a_{ij} \le W$.

Each column in $(P_2)$ represents a cutting pattern. How many columns (cutting patterns) are there? There could be as many as $\frac{m!}{k!\,(m-k)!}$, where $k$ is the average number of items in each cutting pattern. Exponentially large!

Let us first consider the LP relaxation of $(P_2)$:

$$
\begin{aligned}
\text{(LPM)}\quad \min\ & \sum_{j=1}^{n} x_j \\
\text{s.t.}\ & \sum_{j=1}^{n} a_{ij} x_j \ge n_i, && i = 1,\dots,m, && \text{(demand)} \\
& x_j \in \mathbb{R}_+, && j = 1,\dots,n,
\end{aligned}
$$

which is called the Linear Programming Master Problem. The solution to (LPM) may be fractional, but it can be rounded up to obtain a feasible solution to $(P_2)$.

The simplex algorithm for the master LP:
- Select an initial solution.
- Find the variable with the most negative reduced cost (the new basic variable).
- Find the variable to replace (the new non-basic variable).
- Repeat until no variable with negative reduced cost exists.

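Consider the standard LP and its dual:

$$
\min\ c^T x \quad \text{s.t.}\ Ax = b,\ x \ge 0
\qquad\qquad
\max\ b^T \pi \quad \text{s.t.}\ A^T \pi \le c
$$

Simplex tableau, before and after pivoting to the basis $B$:

$$
\begin{array}{cc|c}
x_B & x_N & \text{rhs} \\ \hline
B & N & b \\
c_B^T & c_N^T & 0
\end{array}
\quad\Longrightarrow\quad
\begin{array}{cc|c}
x_B & x_N & \text{rhs} \\ \hline
I & B^{-1} N & B^{-1} b \\
0 & c_N^T - c_B^T B^{-1} N & -c_B^T B^{-1} b
\end{array}
$$

Reduced cost of a non-basic variable: $c_j - c_B^T B^{-1} a_j$, where $a_j$ is a column of $A$.

If $c_j - c_B^T B^{-1} a_j < 0$ for some $j$, the current basis can be improved. Otherwise, if $c_j - c_B^T B^{-1} a_j \ge 0$ for all $j \in N$, the current solution is optimal and $\pi^T = c_B^T B^{-1}$ is dual feasible (optimal).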

What if we have extremely many variables?

Even if we had a way of generating all the columns (cutting patterns), the standard simplex algorithm would need to calculate the reduced cost of every non-basic variable, which is impossible if $n$ is huge (out of memory).

A crucial insight: in a basic solution, the number of non-zero variables (the basic variables) is at most the number of constraints. Hence, even if the number of possible variables (columns) is large, only a small subset of these columns is needed in the optimal solution. In the cutting stock problem, the number of items is usually much smaller than the number of cutting patterns.

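The main idea of column generation is to start with a small subset $P \subseteq \{1,\dots,n\}$ such that the following restricted problem is feasible:

$$
\begin{aligned}
\text{(RLPM)}\quad \min\ & \sum_{j \in P} x_j \\
\text{s.t.}\ & \sum_{j \in P} a_{ij} x_j \ge n_i, && i = 1,\dots,m, \\
& x_j \ge 0, && j \in P.
\end{aligned}
$$

The dual problem of (RLPM) is

$$
\begin{aligned}
\max\ & \sum_{i=1}^{m} n_i \pi_i \\
\text{s.t.}\ & \sum_{i=1}^{m} a_{ij} \pi_i \le 1, && j \in P, \\
& \pi_i \ge 0, && i = 1,\dots,m.
\end{aligned}
$$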

Generating columns

Our next task is to find a column (cutting pattern) in $\{1,\dots,n\} \setminus P$ that could improve the current optimal solution of the restricted master problem (RLPM). Given the optimal dual solution $\pi$ of (RLPM), the reduced cost of column $j \in \{1,\dots,n\} \setminus P$ is

$$
1 - \sum_{i=1}^{m} a_{ij} \pi_i.
$$

A naive way of finding the new column is

$$
\min\Big\{ 1 - \sum_{i=1}^{m} a_{ij} \pi_i \;\Big|\; j \in \{1,\dots,n\} \setminus P \Big\},
$$

which is impractical because we are not able to list all cutting patterns in real applications.

Knapsack subproblem

We can instead look for a column (cutting pattern) by solving

$$
\begin{aligned}
\kappa := \min\ 1 - \sum_{i=1}^{m} \pi_i y_i \;=\; 1 - \max\ & \sum_{i=1}^{m} \pi_i y_i \\
\text{s.t.}\ & \sum_{i=1}^{m} w_i y_i \le W, \\
& y_i \in \mathbb{Z}_+, \quad i = 1,\dots,m,
\end{aligned}
$$

where $y = (y_1,\dots,y_m)$ represents a column $(a_{1j},\dots,a_{mj})^T$ (a cutting pattern), and the constraints $\sum_{i=1}^{m} w_i y_i \le W$, $y \in \mathbb{Z}_+^m$ enforce that $y$ is a feasible cutting pattern.

This is a knapsack problem (an "easy" NP-hard problem) and can be solved in pseudo-polynomial $O(mW)$ time by dynamic programming.
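A minimal sketch of the $O(mW)$ dynamic program follows, assuming integer widths. The function name `price_pattern` is hypothetical. The duals $\pi = (1/4, 1/2, 1/2)$ are an assumption: they are the values one would get for the earlier $W = 100$, $w = (25, 35, 45)$ example from a master containing only the single-item patterns $(4,0,0)$, $(0,2,0)$, $(0,0,2)$.

```python
from fractions import Fraction

def price_pattern(widths, duals, W):
    """Unbounded knapsack by dynamic programming over capacities 0..W.

    Returns (z, pattern) with z = max sum_i duals[i] * y_i subject to
    sum_i widths[i] * y_i <= W and y_i non-negative integers.
    """
    best = [0] * (W + 1)     # best[c]: optimal value with capacity c
    choice = [-1] * (W + 1)  # item added last at capacity c, -1 = none
    for c in range(1, W + 1):
        best[c] = best[c - 1]  # leave one unit of width unused
        for i, (w, p) in enumerate(zip(widths, duals)):
            if w <= c and best[c - w] + p > best[c]:
                best[c], choice[c] = best[c - w] + p, i
    # Walk back through the recorded choices to recover the pattern.
    pattern, c = [0] * len(widths), W
    while c > 0:
        if choice[c] == -1:
            c -= 1
        else:
            pattern[choice[c]] += 1
            c -= widths[choice[c]]
    return best[W], pattern

duals = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 2)]  # assumed duals
z, pattern = price_pattern([25, 35, 45], duals, 100)
kappa = 1 - z
print(z, pattern, kappa)  # 5/4 [1, 2, 0] -1/4
```

Since $\kappa < 0$, the pattern $(1, 2, 0)$ would enter the master problem. Exact fractions are used only to keep the comparison with zero free of rounding noise.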

Column generation algorithm

Start with initial columns $A$ of (LPM). For instance, use the simple patterns that cut a roll into $\lfloor W / w_i \rfloor$ rolls of width $w_i$, so that $A$ is a diagonal matrix.

repeat
  1. Solve the restricted LP master problem (RLPM). Let $\pi$ be the optimal multipliers ($\pi^T = c_B^T B^{-1}$).
  2. Identify a new column by solving the knapsack subproblem, with optimal value $\kappa$.
  3. Add the new column to the master problem (RLPM).
until $\kappa \ge 0$

An example

A steel company wants to cut steel rods of width 218 cm. The customers want 44 pieces of width 81 cm, 3 pieces of width 70 cm, and 48 pieces of width 68 cm.

Initial master problem (one item in each large rod):

$$
\min\ x_1 + x_2 + x_3 \quad \text{s.t.}\ x_1 \ge 44,\ x_2 \ge 3,\ x_3 \ge 48,\ x \ge 0.
$$

Optimal multipliers: $\pi = (1, 1, 1)^T$.

Initial knapsack subproblem:

$$
\max\ y_1 + y_2 + y_3 \quad \text{s.t.}\ 81 y_1 + 70 y_2 + 68 y_3 \le 218,\ y \in \mathbb{Z}_+^3.
$$

Optimal solution to the initial knapsack subproblem: $y = (0, 0, 3)^T$ with $\kappa = 1 - 3 = -2 < 0$.

Second master problem:

$$
\min\ x_1 + x_2 + x_3 + x_4 \quad \text{s.t.}\ x_1 \ge 44,\ x_2 \ge 3,\ x_3 + 3 x_4 \ge 48,\ x \ge 0.
$$

Optimal multipliers: $\pi = (1.0, 1.0, 0.33)^T$.

Second knapsack subproblem:

$$
\max\ y_1 + y_2 + 0.33 y_3 \quad \text{s.t.}\ 81 y_1 + 70 y_2 + 68 y_3 \le 218,\ y \in \mathbb{Z}_+^3.
$$

Optimal solution: $y = (0, 3, 0)^T$ with $\kappa = 1 - 3 = -2 < 0$.

Third master problem:

$$
\min\ x_1 + x_2 + x_3 + x_4 + x_5 \quad \text{s.t.}\ x_1 \ge 44,\ x_2 + 3 x_5 \ge 3,\ x_3 + 3 x_4 \ge 48,\ x \ge 0.
$$

Optimal multipliers: $\pi = (1.0, 0.33, 0.33)^T$.

Third knapsack subproblem:

$$
\max\ y_1 + 0.33 y_2 + 0.33 y_3 \quad \text{s.t.}\ 81 y_1 + 70 y_2 + 68 y_3 \le 218,\ y \in \mathbb{Z}_+^3.
$$

Optimal solution: $y = (2, 0, 0)^T$ with $\kappa = 1 - 2 = -1 < 0$.

Continue...
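The three pricing steps above can be replayed with a brute-force search, which is adequate for three items although it is not the dynamic program one would use in practice. In this sketch the multipliers are entered by hand as exact fractions (1/3 instead of 0.33). The fourth entry, $\pi = (1/2, 1/3, 1/3)$, is not from the lecture: it is the dual one obtains by hand after adding the pattern $(2, 0, 0)$, while the master still contains single-item patterns only.

```python
from fractions import Fraction
from itertools import product

W, widths = 218, [81, 70, 68]

def best_pattern(duals):
    """Brute-force pricing: maximize sum_i duals[i] * y_i over feasible y."""
    ranges = [range(W // w + 1) for w in widths]
    feasible = (y for y in product(*ranges)
                if sum(w * k for w, k in zip(widths, y)) <= W)
    return max(feasible, key=lambda y: sum(p * k for p, k in zip(duals, y)))

third = Fraction(1, 3)
dual_sequence = [
    [1, 1, 1],                       # initial master
    [1, 1, third],                   # after adding pattern (0, 0, 3)
    [1, third, third],               # after adding pattern (0, 3, 0)
    [Fraction(1, 2), third, third],  # after adding (2, 0, 0); not from the lecture
]

results = []
for duals in dual_sequence:
    y = best_pattern(duals)
    kappa = 1 - sum(p * k for p, k in zip(duals, y))
    results.append((y, kappa))
    print(y, kappa)
```

In the first step several patterns tie at value 3; the search returns the first one it meets, which happens to be the lecture's $(0, 0, 3)$. The fourth step yields the mixed pattern $(1, 0, 2)$ with $\kappa = -1/6$, after which the master no longer separates by item and a general LP solver is needed.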

Getting integer solutions

After solving the master linear program, we still have to find an integer solution to the original problem $(P_2)$. Let $x^*$ be the optimal solution of the master LP and let $\bar{x}_j = \lceil x_j^* \rceil$. Then $\bar{x}_j$ is integral and

$$
\sum_{j=1}^{n} a_{ij} \bar{x}_j \ge \sum_{j=1}^{n} a_{ij} x_j^* \ge n_i, \quad i = 1,\dots,m.
$$

So the rounded-up solution is feasible for $(P_2)$.

We can also use a branch-and-bound framework to obtain an optimal integral solution: the Branch-and-Price algorithm.
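The rounding argument can be illustrated on a toy master. In the sketch below the patterns are the single-item patterns of the steel-rod example, but the demands (45, 4, 50) are hypothetical, chosen only so that the LP solution is fractional.

```python
from fractions import Fraction
from math import ceil

# Single-item patterns (2,0,0), (0,3,0), (0,0,3) of the steel-rod example.
patterns = [(2, 0, 0), (0, 3, 0), (0, 0, 3)]
demands = [45, 4, 50]  # hypothetical demands, chosen to give fractions

# With one pattern per item, the LP optimum of the master restricted to
# these three columns is simply x_j = n_j / a_jj.
x_lp = [Fraction(n, patterns[j][j]) for j, n in enumerate(demands)]
x_int = [ceil(x) for x in x_lp]

# Pieces of each item produced by the rounded-up solution.
covered = [sum(p[i] * x for p, x in zip(patterns, x_int)) for i in range(3)]
print(sum(x_lp), sum(x_int), covered)  # 81/2 42 [46, 6, 51]
```

Each rounded variable increases by less than one, so in this example the rounded solution uses fewer than $m = 3$ rolls more than the LP value.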

A dual perspective

Consider the classical formulation of the cutting stock problem:

$$
\begin{aligned}
(P_1)\quad \min\ & \sum_{k \in K} y_k \\
\text{s.t.}\ & \sum_{k \in K} x_i^k \ge n_i, && i = 1,\dots,m, && \text{(demand)} \\
& \sum_{i=1}^{m} w_i x_i^k \le W y_k, && k \in K, && \text{(width limitation)} \\
& x_i^k \in \mathbb{Z}_+,\quad y_k \in \{0,1\}.
\end{aligned}
$$

How do we get a Lagrangian relaxation? Observe that if the demand constraints are removed, the problem decomposes into $|K|$ knapsack problems!

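For $u_i \ge 0$, $i = 1,\dots,m$, the Lagrangian function is

$$
\begin{aligned}
L(u) = \min\ & \sum_{k \in K} y_k + \sum_{i=1}^{m} u_i \Big( n_i - \sum_{k \in K} x_i^k \Big) \\
\text{s.t.}\ & \sum_{i=1}^{m} w_i x_i^k \le W y_k, \quad k \in K, \\
& x_i^k \in \mathbb{Z}_+,\quad y_k \in \{0,1\}.
\end{aligned}
$$

So

$$
L(u) = \sum_{k \in K} L_k(u) + \sum_{i=1}^{m} n_i u_i,
$$

where

$$
\begin{aligned}
L_k(u) = \min\ & y_k - \sum_{i=1}^{m} u_i x_i^k \\
\text{s.t.}\ & \sum_{i=1}^{m} w_i x_i^k \le W y_k, \\
& x_i^k \in \mathbb{Z}_+,\quad y_k \in \{0,1\}.
\end{aligned}
$$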

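Each $L_k(u)$ has the same optimal value $L_s(u) = \min(0, 1 - z^*)$ (choose $y_k = 0$ or $y_k = 1$, whichever gives the smaller value), where

$$
z^* = \max\ \sum_{i=1}^{m} u_i x_i^k \quad \text{s.t.}\ \sum_{i=1}^{m} w_i x_i^k \le W,\ x_i^k \in \mathbb{Z}_+.
$$

This is a knapsack problem. Notice that $L_s(u)$ is independent of $k$. Therefore

$$
L(u) = K L_s(u) + \sum_{i=1}^{m} n_i u_i,
$$

where, abusing notation, $K$ also denotes the number $|K|$ of available rolls.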
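As a worked instance (computed here, not taken from the lecture), take the steel-rod data $W = 218$, $w = (81, 70, 68)$, $n = (44, 3, 48)$ from the earlier example and the multipliers $u = (1/2, 1/3, 1/3)$:

```latex
\begin{aligned}
z^* &= \max\Big\{ \tfrac12 x_1 + \tfrac13 x_2 + \tfrac13 x_3 \;:\;
        81 x_1 + 70 x_2 + 68 x_3 \le 218,\ x \in \mathbb{Z}_+^3 \Big\}
     = \tfrac76 \quad \text{at } x = (1, 0, 2), \\
L_s(u) &= \min\big(0,\ 1 - \tfrac76\big) = -\tfrac16, \\
L(u) &= K \cdot \big(-\tfrac16\big) + \tfrac{44}{2} + \tfrac{3}{3} + \tfrac{48}{3}
      = 39 - \tfrac{K}{6}.
\end{aligned}
```

Because $z^* > 1$ here, the bound deteriorates as $K$ grows. For multipliers with $z^* \le 1$ the first term vanishes and $L(u) = \sum_i n_i u_i$ regardless of $K$.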

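Theorem: The optimal value of the dual problem satisfies $\max_{u \ge 0} L(u) = v(\text{LPM})$.

Proof: Let $z_j = (a_{1j},\dots,a_{mj})^T$, $j = 1,\dots,n$, be the extreme points of the convex hull of the integer set $X = \{ x \in \mathbb{Z}_+^m \mid \sum_{i=1}^{m} w_i x_i \le W \}$. Then

$$
L_s(u) = \min\Big( 0,\ 1 - \max_{j=1,\dots,n} \sum_{i=1}^{m} a_{ij} u_i \Big) = \min_{j=1,\dots,n} \min\Big( 0,\ 1 - \sum_{i=1}^{m} a_{ij} u_i \Big),
$$

so that

$$
\max_{u \ge 0} L(u) = \max_{u \ge 0} \Big( K \min_{j=1,\dots,n} \min\Big( 0,\ 1 - \sum_{i=1}^{m} a_{ij} u_i \Big) + \sum_{i=1}^{m} n_i u_i \Big).
$$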

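This can be reduced to

$$
\begin{aligned}
\max\ & z \\
\text{s.t.}\ & z \le \sum_{i=1}^{m} n_i u_i, \\
& z \le K \Big( 1 - \sum_{i=1}^{m} a_{ij} u_i \Big) + \sum_{i=1}^{m} n_i u_i, && j = 1,\dots,n, \\
& u_i \ge 0, && i = 1,\dots,m.
\end{aligned}
$$

The dual of the above problem is

$$
\begin{aligned}
\min\ & \sum_{j=1}^{n} K \lambda_j \\
\text{s.t.}\ & \lambda_0 + \sum_{j=1}^{n} \lambda_j = 1, \\
& -n_i \Big( \lambda_0 + \sum_{j=1}^{n} \lambda_j \Big) + \sum_{j=1}^{n} a_{ij} K \lambda_j \ge 0, && i = 1,\dots,m, \\
& \lambda_j \ge 0, && j = 0, 1,\dots,n.
\end{aligned}
$$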

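Notice that $\lambda_0$ can be eliminated from the second constraint by substituting the first constraint into it. The constraint $\lambda_0 + \sum_{j=1}^{n} \lambda_j = 1$ is then redundant, provided that there is $(\lambda_1,\dots,\lambda_n)$ satisfying

$$
\sum_{j=1}^{n} a_{ij} K \lambda_j \ge n_i, \quad i = 1,\dots,m, \qquad \text{and} \qquad \sum_{j=1}^{n} \lambda_j \le 1.
$$

This is always possible when $K$ is large (enough rolls). Now, letting $x_j = K \lambda_j$, we obtain

$$
\begin{aligned}
\min\ & \sum_{j=1}^{n} x_j \\
\text{s.t.}\ & \sum_{j=1}^{n} a_{ij} x_j \ge n_i, && i = 1,\dots,m, \\
& x_j \ge 0, && j = 1,\dots,n.
\end{aligned}
$$

This is exactly the LP relaxation (LPM).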

Vehicle routing problem

We consider a basic model of the vehicle routing problem (VRP). A number of vehicles located at a central depot have to serve a set of customers that geographically surround the depot. Each vehicle has a given capacity and each customer has a given demand. The goal is to minimize the total travel distance.

Figure: Vehicle routing problem

Notation

- $\mathcal{N}$: set of nodes, $N = |\mathcal{N}|$;
- node $1 \in \mathcal{N}$: depot;
- $d_i$: demand of customer $i$;
- $c_{ij}$: transportation cost (distance) from $i$ to $j$;
- $W$: vehicle capacity;
- $x_{ij} \in \{0,1\}$: $x_{ij} = 1$ if a vehicle goes directly from $i$ to $j$, $x_{ij} = 0$ otherwise;
- $y_i \ge 0$: load of the vehicle when leaving node $i$.

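An arc flow model for the basic VRP

$$
\begin{aligned}
\text{(VRP-af)}\quad \min\ & \sum_{i \in \mathcal{N}} \sum_{j \in \mathcal{N}} c_{ij} x_{ij} \\
\text{s.t.}\ & \sum_{j \in \mathcal{N}} x_{ij} = 1, && i \in \mathcal{N} \setminus \{1\}, \\
& \sum_{i \in \mathcal{N}} x_{ij} = \sum_{k \in \mathcal{N}} x_{jk}, && j \in \mathcal{N}, \\
& W (1 - x_{ij}) + y_i \ge y_j + d_j, && i \in \mathcal{N},\ j \in \mathcal{N} \setminus \{1\}, \\
& y_i \le W, && i \in \mathcal{N}, \\
& x_{ij} \in \{0,1\}, && i, j \in \mathcal{N}.
\end{aligned}
$$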

A set covering model for the basic VRP

Suppose that we have access to all possible feasible routes, denoted by $R$: the set of all tours that start from the depot, visit some customers exactly once and come back to the depot. A VRP solution can be viewed as a collection of feasible routes. The VRP is then to choose the routes in the solution so as to minimize their total transportation cost.

Notation:
- $R$: set of feasible routes;
- $c_r$: transportation cost of route $r \in R$;
- $a_{ir}$: $a_{ir} = 1$ if customer $i$ is served on route $r$, $a_{ir} = 0$ otherwise;
- $y_r \in \{0,1\}$: $y_r = 1$ if route $r$ is in the solution, $y_r = 0$ otherwise.

A set covering model of the basic VRP:

$$
\begin{aligned}
\text{(VRP-sc)}\quad \min\ & \sum_{r \in R} c_r y_r \\
\text{s.t.}\ & \sum_{r \in R} a_{ir} y_r \ge 1, && i \in \mathcal{N} \setminus \{1\}, \\
& y_r \in \{0,1\}, && r \in R.
\end{aligned}
$$

The number of routes (tours) in $R$ is huge (exponentially large), and we are not able to list all the routes (the coefficients $a_{ir}$ for all $i \in \mathcal{N} \setminus \{1\}$ and the costs $c_r$). In order to use the column generation algorithm, we need to know how to generate feasible routes (tours) efficiently.

Column generation for VRP

Consider the LP relaxation (LPM) of the set covering model (VRP-sc). Its dual is:

$$
\begin{aligned}
\text{(D)}\quad \max\ & \sum_{i \in \mathcal{N} \setminus \{1\}} \pi_i \\
\text{s.t.}\ & \sum_{i \in \mathcal{N} \setminus \{1\}} a_{ir} \pi_i \le c_r, && r \in R, \\
& \pi_i \ge 0, && i \in \mathcal{N} \setminus \{1\}.
\end{aligned}
$$

Let $\pi$ be an optimal solution to the dual problem (D). The reduced cost of a route $r$ is

$$
\bar{c}_r = c_r - \sum_{i \in \mathcal{N} \setminus \{1\}} a_{ir} \pi_i.
$$

A column (route) with negative reduced cost $\bar{c}_r < 0$ can be added to the Restricted Master Problem to give a better solution.

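Subproblem

How do we find a route $r$ with negative reduced cost $\bar{c}_r < 0$? The transportation cost $c_r$ of a route is the sum of the costs of the arcs in the route:

$$
c_r = \sum_{i \in \mathcal{N}} \sum_{j \in \mathcal{N}} c_{ij} x_{ij},
$$

where $x_{ij} \in \{0,1\}$ represents whether or not arc $(i, j)$ is in the route. Since $a_{ir} = \sum_{j \in \mathcal{N}} x_{ij}$ for every customer $i$, we can express the reduced cost $\bar{c}_r$ of a route $r$ as

$$
\bar{c}_r = c_r - \sum_{i \in \mathcal{N} \setminus \{1\}} a_{ir} \pi_i = \sum_{i \in \mathcal{N}} \sum_{j \in \mathcal{N}} \bar{c}_{ij} x_{ij}, \quad \text{where } \bar{c}_{ij} = c_{ij} - \pi_i
$$

(with the convention $\pi_1 = 0$ for the depot).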
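The identity between the route-based and the arc-based reduced cost can be checked on a small example. The instance below (arc costs and duals) is entirely hypothetical, and the depot dual is set to zero as an assumption so that the arc formula applies to arcs leaving the depot as well.

```python
# Hypothetical instance: node 1 is the depot, nodes 2-4 are customers.
cost = {(1, 2): 4, (2, 3): 3, (3, 1): 5, (1, 4): 6, (4, 1): 6,
        (2, 1): 4, (3, 4): 2, (4, 2): 7}  # assumed arc costs c_ij
pi = {1: 0, 2: 5, 3: 4, 4: 3}             # assumed duals, pi_1 = 0 (depot)

def route_arcs(route):
    """Arcs of a closed route given as a node list, e.g. [1, 2, 3, 1]."""
    return list(zip(route, route[1:]))

def reduced_cost_by_route(route):
    """c_r minus the duals of the customers served on the route."""
    c_r = sum(cost[a] for a in route_arcs(route))
    return c_r - sum(pi[i] for i in set(route) - {1})

def reduced_cost_by_arcs(route):
    """Sum of the arc reduced costs c_ij - pi_i along the route."""
    return sum(cost[(i, j)] - pi[i] for i, j in route_arcs(route))

route = [1, 2, 3, 1]
print(reduced_cost_by_route(route), reduced_cost_by_arcs(route))  # 3 3
```

Charging each dual to the arc that leaves its node is what turns the pricing problem into a path problem over modified arc costs.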

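The subproblem for column generation can then be expressed as

$$
\begin{aligned}
\text{(SP)}\quad \min\ & \sum_{i \in \mathcal{N}} \sum_{j \in \mathcal{N}} (c_{ij} - \pi_i) x_{ij} \\
\text{s.t.}\ & \sum_{j \in \mathcal{N} \setminus \{1\}} x_{1j} = 1, \\
& \sum_{i \in \mathcal{N}} x_{ij} = \sum_{k \in \mathcal{N}} x_{jk}, && j \in \mathcal{N}, \\
& W (1 - x_{ij}) + y_i \ge y_j + d_j, && i \in \mathcal{N},\ j \in \mathcal{N} \setminus \{1\}, \\
& y_i \le W, && i \in \mathcal{N}, \\
& x_{ij} \in \{0,1\}, && i, j \in \mathcal{N}.
\end{aligned}
$$

The subproblem (SP) is a shortest-path problem in which arc costs may be negative, starting from the depot and ending at the depot. This is a Resource Constrained Shortest Path problem.

Integer solution: heuristics and the branch-and-bound method.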
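For a handful of customers, (SP) can be solved by plain enumeration, which is a stand-in here for a proper resource-constrained shortest-path algorithm. Everything in the sketch is hypothetical: the distances, demands and capacity are made up, and the duals are those of a restricted master that contains only the out-and-back routes $1 \to i \to 1$, so that $\pi_i = c_{1i} + c_{i1}$.

```python
from itertools import permutations

# Hypothetical instance: node 1 is the depot, nodes 2-4 are customers.
dist = {1: {2: 4, 3: 5, 4: 6},
        2: {1: 4, 3: 3, 4: 7},
        3: {1: 5, 2: 3, 4: 2},
        4: {1: 6, 2: 7, 3: 2}}
demand = {2: 3, 3: 4, 4: 5}
capacity = 8
pi = {1: 0, 2: 8, 3: 10, 4: 12}  # assumed duals; pi_1 = 0 for the depot

def price_route():
    """Return (reduced cost, route) of the best capacity-feasible route.

    Routes visit each chosen customer exactly once. If no route has a
    negative reduced cost, (0, None) is returned.
    """
    best = (0, None)
    customers = list(demand)
    for size in range(1, len(customers) + 1):
        for order in permutations(customers, size):
            if sum(demand[i] for i in order) > capacity:
                continue  # violates the vehicle capacity
            route = (1, *order, 1)
            rc = sum(dist[i][j] - pi[i] for i, j in zip(route, route[1:]))
            if rc < best[0]:
                best = (rc, route)
    return best

print(price_route())  # (-6, (1, 2, 3, 1))
```

The route serving customers 2 and 3 together has cost 12 against duals worth 18, so it would enter the restricted master. Enumeration grows factorially with the number of customers, which is why labelling algorithms are used for this subproblem in practice.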