Week 11

1 Integer Linear Programming

This week we will discuss solution methods for integer linear programming problems. I will skip the part on complexity theory, Section 11.8, although it is essential to the theory of combinatorial optimization; I treated it briefly in the very first lecture. The message to take from it is that some problems are intrinsically harder than others, and that for a wide class of problems, among which Integer Linear Programming, there is strong evidence that we will never be able to design an efficient (read: polynomial-time) algorithm. Yet many problems, and indeed almost every practical problem, belong to this class of so-called NP-hard problems. In research, three approaches have been taken:

1. Accept long running times, or even accept that some bigger instances cannot be solved in reasonable time, but insist on optimality. Methods of this kind are Branch & Bound, Cutting Plane algorithms (or the combination Branch & Cut), and Dynamic Programming.

2. Insist on polynomial running times, accepting that in many instances only approximations of the optimal solution will be found, and supplement the algorithms with a rigorous mathematical analysis of the (relative) error made, in the form of a guarantee.

3. Design smart methods that usually appear to give good approximate solutions, but with hardly any analysis of either the running time or the error made, apart from subjecting them to benchmark or randomly generated problem instances. To this class of algorithms belong Genetic Algorithms, Simulated Annealing, Tabu Search, Neural Networks, and Petri Nets. Theoreticians sometimes call them homeopathic algorithms, which sounds denigrating and often is meant so, but it also expresses the frustration that these methods very frequently outperform the mathematically well-analysed methods, and that we have so far failed to give any explanation for this phenomenon. It is one of the great challenges of our research field!
1.1 Cuts, Branch & Bound, and Lagrangean Relaxation

Most truly hard optimisation problems are attacked by a combination of three ideas. The idea behind Branch & Bound (B&B) is to partition the problem into subproblems and to solve relaxations of these subproblems. E.g., given a 0-1 ILP, one can create two subproblems by selecting one of the binary variables, x_j say, and requiring x_j = 0 in one subproblem and x_j = 1 in the other. They clearly have disjoint sets of feasible solutions, which together contain all feasible solutions of the master problem. Typically, relaxations of the subproblems can be solved efficiently, yielding lower bounds (in case of minimization) on the optimal solution of the
subproblem. If a relaxation happens to give a feasible solution to the master problem (an integer solution in case of ILP), then we have solved the subproblem, which we do not need to investigate any further. Neither do we need to investigate any subproblem whose optimal relaxed solution is worse than the best feasible solution found so far, or any subproblem that turns out to be infeasible. This is called pruning the B&B search tree. Cuts have been used successfully to improve the lower bounds in the relaxed problems. They are meant to approximate the convex hull of the integer points of an ILP as closely as possible, hoping to obtain an optimal LP solution that is very close in value to the optimal ILP solution. Another powerful tool for improving lower bounds is Lagrangean Relaxation. Instead of relaxing integrality constraints it relaxes complicating constraints, adding their violation, multiplied by a Lagrangean multiplier, to the objective function. The violation of the relaxed constraints is thus punished by a linear cost: the value of the multiplier times the amount of violation (right-hand side minus left-hand side in case of a >=-constraint). Choosing the best values for the Lagrangean multipliers is called the Lagrangean Dual problem. Lagrangean relaxation has two advantages: the combination with B&B often works well in practice, and there is some nice theory connected to it, relating the LP relaxation, the Lagrangean relaxation and the optimal solution. Good approximation algorithms are used for finding good feasible solutions, to enhance the pruning of the B&B search tree. In practice, though, good lower bounds appear to speed up B&B much more than good upper bounds do. There are examples in which even verifying the claimed optimality of a given solution does not lead to significant pruning of the search tree. Applying this methodology to solve practical problems is really an art.
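The pruning scheme just described can be made concrete on a small 0-1 knapsack instance. The following is a minimal sketch of my own (not from the book): the upper bound at each node is the LP relaxation of the remaining subproblem, computed by the standard fractional greedy rule on items sorted by profit density; all function names are illustrative, and weights are assumed positive.

```python
# Minimal Branch & Bound sketch for 0-1 knapsack (maximisation).
# Each node fixes the next variable to 1 or 0; nodes are pruned when the
# LP-relaxation bound cannot beat the best feasible solution found so far.

def branch_and_bound_knapsack(profits, weights, capacity):
    n = len(profits)
    # sort items by profit density; the greedy fractional fill in this order
    # is exactly the optimum of the LP relaxation (assumes weights > 0)
    order = sorted(range(n), key=lambda j: profits[j] / weights[j], reverse=True)

    def lp_bound(i, cap):
        """Upper bound for the subproblem on items order[i:] with capacity cap."""
        bound = 0.0
        for j in order[i:]:
            if weights[j] <= cap:
                cap -= weights[j]
                bound += profits[j]
            else:
                bound += profits[j] * cap / weights[j]  # fractional last item
                break
        return bound

    best = 0

    def recurse(i, cap, value):
        nonlocal best
        best = max(best, value)          # update incumbent feasible solution
        if i == n:
            return
        if value + lp_bound(i, cap) <= best:
            return                       # prune: relaxation cannot improve on incumbent
        j = order[i]
        if weights[j] <= cap:
            recurse(i + 1, cap - weights[j], value + profits[j])  # branch x_j = 1
        recurse(i + 1, cap, value)       # branch x_j = 0

    recurse(0, capacity, 0)
    return best
```

For instance, with profits (6, 10, 12), weights (1, 2, 3) and capacity 5, the incumbent and the bounds together prune most of the tree and the optimum 22 (items 2 and 3) is found.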
I refer to the book by Applegate, Cook and others on the Travelling Salesman Problem (TSP), in which they recount their quest to solve very large instances of the TSP. Although Lagrangean Relaxation and the corresponding Lagrangean Duality give nice theory, and every now and then a set of powerful cuts for some specific problem appears, the area currently attracts somewhat less attention from the theoretical side. I leave Sections 11.1-4 for you to read yourselves. I do assume for the exam that you understand the basic B&B method and the basic ideas behind cuts and Lagrangean relaxation, without knowing the detailed proofs of the theorems in the case of Lagrangean relaxation. Solving the Lagrangean Dual is continuous optimisation and I do not expect you to know it. Therefore I have chosen to concentrate on Approximation Algorithms. By coincidence this means that I will also show you an example of Dynamic Programming. Although, from a mathematical point of view, one can hardly call research on homeopathic algorithms theoretical, I wish to discuss the simulated annealing algorithm with you, partly because I have the feeling that it is the
one, among the homeopathic algorithms, that seems closest to our abilities to analyse mathematically w.r.t. performance quality.

1.2 A Fully Polynomial Approximation Scheme for the 0-1 Knapsack Problem

0-1 Knapsack defined as an ILP:

    max   sum_{j=1}^n c_j x_j
    s.t.  sum_{j=1}^n a_j x_j <= b
          x_j in {0, 1},  j = 1, ..., n.

Let us first make a DP-formulation. A DP formulates the problem as a series of decisions, solving subproblems for certain states of the problem. It is a general concept, and DPs for different problems are different. The classical example is the shortest path problem in a layered network, which can be thought of as a k-day journey from some starting place to some destination, where the cities in which to pass the night are partitioned into those that can be reached after one day, after two days, etc. The DP first determines, for each of the cities that can be reached after k-1 days, how to get from there to the destination. Then, for each of the cities reachable after k-2 days, it determines through which city reachable after k-1 days to travel, which is a simple matter of comparing, for each candidate, the direct distance plus the distance to be travelled on the last day. Then we continue to level k-3, now knowing for each of the cities at level k-2 how to reach the destination most cheaply. Etc. Here too, practising is the only way to learn; smart DPs can solve problems very quickly. So here I will present just one example of such a DP. I define W_j(C) as the minimum total weight needed to attain total profit C, if only items from among the set 1, 2, ..., j are allowed to be used. And I define W_0(0) = 0 and W_0(C) = infinity for all C != 0. Then I propose the following recursion:

    W_j(C) = min{ W_{j-1}(C), a_j + W_{j-1}(C - c_j) }.

We either reject item j, in which case we have to get our profit C entirely from the first j-1 items, or we select item j, which adds a_j to the weight, in which case we need to cover only a profit of C - c_j with the first j-1 items.
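The recursion above translates directly into code. This is a minimal sketch of my own (the function name is illustrative); the table is indexed by profit C and updated downwards so that each item is used at most once.

```python
# DP over profit states: W[C] = minimum total weight needed to attain
# total profit exactly C with the items processed so far; INF marks
# profits that cannot be reached.

def knapsack_min_weight_dp(profits, weights, capacity):
    """Return the maximum profit attainable with total weight <= capacity."""
    INF = float("inf")
    C_max = sum(profits)                 # profits range from 0 to sum c_j
    W = [INF] * (C_max + 1)
    W[0] = 0                             # W_0(0) = 0, W_0(C) = infinity for C != 0
    for c_j, a_j in zip(profits, weights):
        # iterate C downward so W[C - c_j] still refers to phase j-1
        for C in range(C_max, c_j - 1, -1):
            # W_j(C) = min{ W_{j-1}(C), a_j + W_{j-1}(C - c_j) }
            W[C] = min(W[C], a_j + W[C - c_j])
    # the optimum is the largest profit whose minimum weight fits the knapsack
    return max(C for C in range(C_max + 1) if W[C] <= capacity)
```

On the instance with profits (6, 10, 12), weights (1, 2, 3) and capacity 5, the table yields the optimal profit 22 (taking the second and third items).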
If we call j a phase and C a state, then we see that for each phase j we need to compute W_j(C) for all values of C that could possibly be reached, i.e., C should range from 0 to sum_{j=1}^n c_j. Computing it for one state is obviously a matter of one addition, one subtraction and one comparison, i.e., constant time. Thus the total running time of the algorithm is the number of phases (n) times the
number of states: O(n * sum_{j=1}^n c_j) = O(n^2 c_max), with c_max = max_j c_j. This is exponential time, since c_max requires only log c_max bits for its encoding on a binary computer. It does say that the class of instances of 0-1 Knapsack with objective coefficients bounded by a polynomial function of the number of items is solvable in polynomial time. It will be easy for you to find a dynamic programming formulation in which the state is the total weight and C_j(W) is defined as the maximum profit reachable with the first j items given that the total weight does not exceed W. In fact this is a more natural formulation, but we will see directly that the first one allows us to design an algorithm that can achieve any desired accuracy compared to optimal, in a relative sense.

Definition 1.1 r-approximation Algorithm. We call an algorithm H an r-approximation algorithm for a problem if it has polynomial running time and if it always returns a solution with value Z_H such that, in case of minimisation,

    Z_H <= (1 + r) Z_OPT,

and in case of maximisation,

    Z_H >= (1 - r) Z_OPT.

For optimization problems we are interested in algorithms that run in polynomial time and for which we can derive a guarantee on the relative error that they make: a proof that even for the worst-case instance the relative error remains within the given bound. We will show that the situation for 0-1 Knapsack is very favourable, in the sense that for every epsilon > 0 we can design an algorithm that runs in time polynomial in the number of items and in 1/epsilon and gives a worst-case relative error of at most epsilon. Such a family of algorithms, one for each epsilon, is called a Fully Polynomial Time Approximation Scheme (FPTAS). Consider the following approximation algorithm. Given any instance of the knapsack problem with objective coefficients c_1, ..., c_n, delete the last t digits of these coefficients, yielding the coefficients ĉ_1, ..., ĉ_n, i.e., round the coefficients down to the nearest multiple of 10^t and divide by 10^t.
Solve the instance with these smaller objective coefficients using the DP developed before. Let us analyse the error that we make. First notice that

    c_j - 10^t <= 10^t ĉ_j <= c_j.

Let S and Ŝ be optimal selections of items for the original problem and the new problem, respectively. Then

    sum_{j in S} c_j >= sum_{j in Ŝ} c_j >= sum_{j in Ŝ} 10^t ĉ_j >= sum_{j in S} 10^t ĉ_j >= sum_{j in S} (c_j - 10^t) >= sum_{j in S} c_j - n 10^t,
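The rounding step can be sketched as follows. This is my own minimal illustration (names are mine, not from the book): profits are rounded down to multiples of 10^t, the rounded instance is solved exactly by the profit-state DP, and the selected item set is evaluated with the original profits.

```python
# FPTAS rounding sketch for 0-1 knapsack: solve the instance with the last
# t digits of the profits deleted, then report the chosen items' value
# under the ORIGINAL profits.

def knapsack_fptas(profits, weights, capacity, t):
    scale = 10 ** t
    rounded = [c // scale for c in profits]   # ĉ_j = floor(c_j / 10^t)
    INF = float("inf")
    C_max = sum(rounded)
    # W[C] = minimum weight to reach rounded profit C; choice[C] records items
    W = [0] + [INF] * C_max
    choice = [set() for _ in range(C_max + 1)]
    for j, (c, a) in enumerate(zip(rounded, weights)):
        # downward loop so each item is used at most once
        for C in range(C_max, c - 1, -1):
            if W[C - c] + a < W[C]:
                W[C] = W[C - c] + a
                choice[C] = choice[C - c] | {j}
    best_C = max(C for C in range(C_max + 1) if W[C] <= capacity)
    S = choice[best_C]
    return S, sum(profits[j] for j in S)      # value under original profits
```

Taking t = 0 recovers the exact DP; larger t shrinks the DP table (and hence the running time) at the price of losing at most n * 10^t in profit, exactly as in the inequality chain above.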
where we used the optimality of the two solutions and the bounds on ĉ_j. For the remaining part of the analysis I follow the book exactly.

1.3 Approximability and non-approximability in the TSP

I show that the TSP cannot be approximated within any relative approximation ratio (or, if one wishes to avoid 0-length edges, within any polynomially bounded approximation ratio), unless P=NP. For the TSP with distances satisfying the triangle inequality (the so-called Δ-TSP) I will give the 1-approximation algorithm based on doubling the Minimum Spanning Tree, and indicate how to make the improvement to the 1/2-approximation algorithm of Christofides. This requires the graph-theoretical concept of an Euler graph.

Definition 1.2 Euler Graph. A graph G = (V, E) is an Euler graph if it contains a tour, a so-called Euler Walk, which starts and ends in the same vertex and traverses every edge exactly once: i.e., a graph that can be drawn on a piece of paper, starting and ending in the same point, without lifting the pen from the paper and without drawing any edge twice.

Theorem 1.1 A graph G = (V, E) is an Euler graph if and only if it is connected and each vertex is incident to an even number of edges (every vertex has even degree).

One direction of this theorem is trivially true. The other one is more tricky, but outside the scope of this course. Given an Euler graph it is easy (in polynomial time) to find an Euler walk. A doubled MST is an Euler graph. An MST together with a minimum total length perfect matching PM on its odd-degree vertices is also an Euler graph. Shortcutting a doubled MST gives a tour of length at most 2 Z_MST <= 2 Z_TSP. Shortcutting the union of the MST and the PM gives a tour of length at most Z_MST + Z_PM. Notice that a minimum perfect matching on an (even) subset of the nodes has length at most 1/2 the length of the TSP-tour on the same subset of nodes, hence at most 1/2 the length of the TSP-tour on all the nodes.
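Finding an Euler walk in polynomial time, as claimed above, can be done with Hierholzer's algorithm. The following is my own minimal sketch (not from the book); the graph is given as an adjacency list in which every undirected edge is listed at both endpoints, and it is assumed to be an Euler graph.

```python
# Hierholzer's algorithm: repeatedly extend and splice closed walks until
# every edge is used. The stack-based version below emits the walk in
# reverse order, which for a closed walk is again an Euler walk.

def euler_walk(adj, start=0):
    """adj: dict vertex -> list of neighbours (undirected multi-edges)."""
    rem = {u: list(vs) for u, vs in adj.items()}  # mutable copy of edges
    stack, walk = [start], []
    while stack:
        u = stack[-1]
        if rem[u]:
            v = rem[u].pop()        # traverse one unused edge u-v
            rem[v].remove(u)        # delete its reverse copy
            stack.append(v)
        else:
            walk.append(stack.pop())  # u exhausted: emit it
    return walk
```

On the 4-cycle 0-1-2-3-0, for example, the walk starts and ends in vertex 0 and uses each of the four edges exactly once.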
Therefore,

    Z_MST + Z_PM <= Z_TSP + (1/2) Z_TSP = (3/2) Z_TSP.

Thus, for the Δ-TSP we have a polynomial-time 1/2-approximation algorithm.

Research Question. It is a major open question in Combinatorial Optimization whether there exists an r-approximation algorithm with r < 1/2, assuming P != NP.
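The tree-doubling algorithm can be sketched in code: visiting the vertices of an MST in DFS preorder is exactly one shortcut of the Euler walk on the doubled tree, so the resulting tour has length at most 2 Z_MST <= 2 Z_TSP. This is my own minimal illustration (Prim's algorithm on a full distance matrix; all names are mine), assuming the distances are symmetric and satisfy the triangle inequality.

```python
# Tree-doubling heuristic for the metric TSP: build an MST, conceptually
# double its edges to get an Euler graph, and shortcut the Euler walk by
# listing the vertices in DFS preorder of the tree.

def double_tree_tour(dist):
    """dist: symmetric matrix of metric distances; returns a tour as a vertex list."""
    n = len(dist)
    # Prim's algorithm for a minimum spanning tree rooted at vertex 0
    in_tree = [False] * n
    parent = [-1] * n
    key = [float("inf")] * n
    key[0] = 0.0
    children = [[] for _ in range(n)]
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: key[v])
        in_tree[u] = True
        if parent[u] >= 0:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v] and dist[u][v] < key[v]:
                key[v] = dist[u][v]
                parent[v] = u
    # DFS preorder of the tree = shortcut of the Euler walk on the doubled MST
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour

def tour_length(dist, tour):
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))
```

On four points at the corners of a unit square, for instance, the MST has length 3, so the shortcut tour is guaranteed to have length at most 6; by the triangle inequality the shortcuts can only help, and here the heuristic in fact finds the optimal tour of length 4.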
Material of Week 11: from [B&T] Chapter 11.

Exercises of Week 11: 11.6, 11.14, and if you like 11.16.

Next time, in two weeks (April 27): Chapter 11, Local Search and Simulated Annealing, and Chapter 12.