Acceleration and Stabilization Techniques for Column Generation

Acceleration and Stabilization Techniques for Column Generation Zhouchun Huang Qipeng Phil Zheng Department of Industrial Engineering & Management Systems University of Central Florida Sep 26, 2014

Outline 1 Introduction to Column Generation 2 Motivation 3 Acceleration and Stabilization Techniques Boxstep method Polyhedral penalty du Merle stabilization Bundle method Interior point stabilization 4 Conclusions

Introduction to Column Generation Consider: min cx Ax b x X Dantzig-Wolfe s Decomposition: min Q (cx i )λ i Q (Ax i )λ i b Q λ i = 1 x i, i = 1, 2,..., Q are extreme points of X, which is nonempty and bounded.

Introduction to Column Generation With only a subset of columns (EP) included, the D-W reformulation is restricted to RMP: min (cx i )λ i (Ax i )λ i b λ i = 1 The column that is excluded but will enter the basis of master problem is determined by SP or oracle: min (c u k A)x v k x X

Column Generation Algorithm Initialize: x 0, UB = +, LB = k 0 1. (λ k ; u k, v k ) optimizer(rmp) 2. ˆx oracle(u k, v k ) 3. if z(sp) 0 λ λ k else x k+1 ˆx; k k + 1 go to 1

Successful Applications Cutting stock problems Air crew scheduling Aircraft fleeting and routing Crew rostering Vehicle routing Global shipping Multi-item lot-sizing Optical telecommunications network design Cancer radiation treatment using IMRT

Motivation Why the CG is efficient. D-W decomposition can reformulate a large-scale linear program into a master problem and a set of subproblems to make the problem solvable.. A compact formulation of a MIP may have a weak LP relaxation, while a D-W reformulation with a huge number of variables could give much better LB.. Most variables will be non-basic and only a subset of variables need to be considered. Why the CG is not efficient Multiple dual solutions are associated to each primal solution due to high degeneracy. Dual instability in the behaviror of dual variables are frequent when problems get larger. As a result, oscillations are observed and very slow convergence is often obtained while implementing the CG algorithm.

CG seen in the Dual Space Note: The crucial part of CG is choosing dual solutions that drives the algorithm toward convergence. Column Generation Lagrangian Method [RMP] min (cx i )λ i l(u) = ub + min (c ua)x x X (Ax i )λ i b [LD] max ub + v λ i = 1 v (c ua)x i, i = 1, 2,..., Q [SP] min (c u k A)x v k x X [RLD] max ub + v v (c ua)x i, i = 1, 2,..., k UB = z (RMP) LB = z (RMP) + z (SP) UB = z (RLD)= l k (u k ) LB = l(u k )

CG seen in the Dual Space l k (u k ) l k+1 (u k+1 ) l(u k ) u k u u k+1

δ u c δ + Acceleration and Stabilization Techniques Boxstep δ u c δ + δ u c δ + Polyhedral penalty u c δ u c δ + u c u c du Merle Stabilization δ Bundle u c δ + u c δ u c δ + δ u c δ + u c

Boxstep method Recall: [RLD] max ub + v v (c ua)x i, i = 1, 2,..., k The Boxstep method modeled a local problem which is to maximize the Lagrangian function over a box of [δ, δ + ], so that the distance traveled by the dual variables in the dual space is limited to the box. max ub + v v (c ua)x i, i = 1, 2,..., k δ u δ + min (cx i )λ i δ y + δ +y + (Ax i )λ i y + y + b λ i = 1 Marsten R.E.,Hogan W.W., Blankenship J.W.:The BOXSTEP Method for Large-Scale Optimization. Operations Research, 23,389-405(1975).

Boxstep method Trust Regions / Box Step b, 4, 3, 1, 2 small box size: too many steps toward the best dual large box size: weak stabilization, i.e.,hard to get a better dual solution(with better LB) Desrosiers J.,Ben amore H.,Soumis F.,Villeneuve D.:IMA workshop:computational Methods for Large Scale Integer Programs, University of Minnesota, Minneapolis,14-19(2002). Marsten R.E.,Hogan W.W., Blankenship J.W.:The BOXSTEP Method for Large-Scale Optimization. Operations Research, 23,389-405(1975).

Polyhedral penalty Polyhedral penalty is applied to linearly penalize the distance between the dual variables and a stability center. Dual max ub + v ε u u c v (c ua)x i, i = 1, 2,..., k max ub + v ε w ε +w + v (c ua)x i, i = 1, 2,..., k u + w u c u w + u c u, w, w + 0 Primal min (cx i )λ i u cy + u cy + (Ax i )λ i y + y + b y ε y + ε + λ i = 1 λ, y, y + 0 Kim S., Chang K.N.,Lee J.Y.:A descent method with linear programming subproblems for nondifferentiable convex optimization. Math. Program.,71(1),17-28(1995).

du Merle stabilization du Merle stabilization is a combination of the Boxstep method and polyhedral penalty. Dual Primal max ub + v ε w ε + w + v (c ua)x i, i = 1, 2,..., k (u + w u c ) u + w δ (u w + u c ) u w + δ + u, w, w + 0 min (cx i )λ i δ y + δ + y + (Ax i )λ i y + y + b y ε y + ε + λ i = 1 λ, y, y + 0 du Merle O.,Villeneuve D.,Desrosiers J., Hansen P.: Stabilized column generation. Discrete Math.,194,229-237(1999).

du Merle stabilization algorithm Initialize: x 0, δ 0, ε 0 k 0 1. (λ k, y k ; u k, v k ) optimizer(rmp) 2. ˆx oracle(u k, v k ) 3. if z(sp) 0 and y = y + = 0 λ λ k else x k+1 ˆx δ k+1 update δ(k) ε k+1 update ε(k) k k + 1 go to 1 min (cx i )λ i δ y + δ + y + (Ax i )λ i y + y + b y ε y + ε + λ i = 1 λ, y, y + 0

Parameters in du Merle stabilization 234 O. du Merle et al./discrete Mathematics 194 (1999) 229-237 Table 1 Instance of an airline crew pairing problem -- variations on 6 Parameters Main cpu time ratio cpu time (s) (8 _, 6 ) iterations opt ±mizer/oraele Speedup factor (-~, oo) 433 1.037 1491.0 1.00 [-50, oe) 440 1.444 1985.3 0.75 [0, 100] 157 0.521 267.6 5.57 [50, 100] 130 0.301 201.2 7.41 z (SP) 0 and y = 0 function value. Typically, using stabilized column generation reduces solution time by a factor ranging from 2 to 10. optimal As an example, we used the stabilized algorithm to solve the linear relaxation of a 986-1eg instance for a regional carrier [7]: the number of column generation iterations is reduced by a factor of 3.33 and the ceu time by a factor of 7.41. On the one hand, decrease ε the strategy for the update-6 procedure is to update the vector parameters (6_,6+) update δ when the column k+1 δ generation k if l(u algorithm k, v stalls, k ) > l(u using the k 1, v following k 1 ) predefined sequence of embedded intervals: [50, 100], [0, 100], [-50, ec) and (-e~,oe). The last interval imposes no restrictions on the dual variables and corresponds to the set partitioning add columns formulation; the first one gives very rough estimates of the dual variables, the objective function value being around 100 000. On the other hand, the vector parameters e_ and e+ are selected at random in the ranges [9, 11] and [0, 10-4], respectively, and are kept add columns fixed throughout the solution process. The first interval, for selecting e_, allows for overcovering update δthe k+1 flight legs δ k and if simulates l(u k, v k a ) set > covering l(u k 1 formulation,, v k 1 ) while the second interval, for choosing e+, corresponds to a small perturbation. For different starting values of (~ _,6 ) in the above sequence of embedded intervals, Table 1 shows the results obtained on a SUN Ultra SPARC 1300 workstation: the number of iterations, z (SP) 0 and y > 0 z (SP) < 0 and y = 0 (rare to happen) z (SP) < 0 and y > 0

Quality of dual solutions The quality of dual is estimated via the computation of a lower bound: l(u k, v k ) = u k b + (c u k A)x k since (u k, (c u k A)x k ) is a feasible solution of the LD: max ub + v v (c ua)x i, i = 1, 2,..., Q

Bundle method Bundle method differs from polyhedral penalty methods in that it takes infinitely small penalty ranges and infinitely many break points. max ub + v 1 u uc 2 2t v (c ua)x i, i = 1, 2,..., k u 0 Its Fenchel bidual is formulated as, min (cx i )λ i + u cg + t 2 g 2 (Ax i )λ i + g b λ i = 1 λ, g 0 Briant O. et al.:comparison of bundle and classical column generation. Math.Program.,Ser.A 113,299-344(2008)

Interior point stabilization The idea is to generate a dual solution that is in the interior of the convex hull of optimal dual solutions to the master instead of using one of its extreme points. max u(µb) + v v (c ua)x i, i = 1, 2,..., k u 0 µ U(0, 1), i.e. each element of µ is uniformly distributed a set of problems with different µ are solved an interior point is obtained by taking convex combination of optimal solutions the dual of the problem above is, min (cx i )λ i (Ax i )λ i µb λ i = 1 λ 0 Rousseau L.M. et al.:interior point stabilization for column generation. Operations Research Letters 35,2660-668(2007)

Conclusions introduction to column generation slow convergence of column generation stabilization techniques

The End