Discrete (and Continuous) Optimization WI4 131 Kees Roos Technische Universiteit Delft Faculteit Electrotechniek, Wiskunde en Informatica Afdeling Informatie, Systemen en Algoritmiek e-mail: C.Roos@ewi.tudelft.nl URL: http://www.isa.ewi.tudelft.nl/ roos November-December, A.D. 2004
Course Schedule 1. Formulations (18 pages) 2. Optimality, Relaxation, and Bounds (10 pages) 3. Well-solved Problems (13 pages) 4. Matching and Assignments (10 pages) 5. Dynamic Programming (11 pages) 6. Complexity and Problem Reduction (8 pages) 7. Branch and Bound (17 pages) 8. Cutting Plane Algorithms (21 pages) 9. Strong Valid Inequalities (22 pages) 10. Lagrangian Duality (14 pages) 11. Column Generation Algorithms (16 pages) 12. Heuristic Algorithms (15 pages) 13. From Theory to Solutions (20 pages) Optimization Group 1
Chapter 2 Optimality, Relaxation, and Bounds
Optimality and Relaxation

(IP)  z = max {c(x) : x ∈ X ⊆ Z^n}

Basic idea underlying methods for solving (IP): find a lower bound z̲ ≤ z and an upper bound z̄ ≥ z such that z̲ = z̄ = z. Practically, this means that any algorithm will find a decreasing sequence z̄_1 > z̄_2 > ... > z̄_s ≥ z of upper bounds, and an increasing sequence z̲_1 < z̲_2 < ... < z̲_t ≤ z of lower bounds. The stopping criterion in general takes the form z̄_s − z̲_t ≤ ε, where ε is some suitably chosen small nonnegative number.
How to obtain Bounds?

Every feasible solution x ∈ X provides a lower bound z̲ = c(x) ≤ z. This is essentially the only way to obtain lower bounds. For some IPs it is easy to find a feasible solution (e.g. Assignment, TSP, Knapsack), but for other IPs finding a feasible solution may be very difficult.

The most important approach for finding upper bounds is relaxation: the given IP is replaced by a simpler problem whose optimal value is at least as large as z. There are two obvious ways to get a relaxation:
(i) Enlarge the feasible set.
(ii) Replace the objective function by a function that has the same or a larger value everywhere.

Definition 1  The problem (RP) z^R = max {f(x) : x ∈ T ⊆ Z^n} is a relaxation of (IP) if X ⊆ T and f(x) ≥ c(x) for all x ∈ X.

Proposition 1  If (RP) is a relaxation of (IP), then z^R ≥ z.

Proof: If x* is optimal for (IP), then x* ∈ X ⊆ T and z = c(x*) ≤ f(x*). As x* ∈ T, f(x*) is a lower bound for z^R, so z ≤ f(x*) ≤ z^R.
Linear Relaxations

Definition 2  For the IP max {c^T x : x ∈ X = P ∩ Z^n} with formulation P, the linear relaxation is the LO problem z^LP = max {c^T x : x ∈ P}. Recall that P = {x ∈ R^n_+ : Ax ≤ b}.

As P ∩ Z^n ⊆ P and the objective is unchanged, this is clearly a relaxation. Not surprisingly, better formulations give tighter (upper) bounds.

Proposition 2  If P_1 and P_2 are two formulations for the feasible set X in an IP, and P_1 ⊆ P_2, then the respective upper bounds z_i^LP (i = 1, 2) satisfy z_1^LP ≤ z_2^LP.

Sometimes the relaxation (RP) immediately solves (IP).

Proposition 3  If the relaxation (RP) is infeasible, then so is (IP). On the other hand, if (RP) has an optimal solution x* with x* ∈ X and c(x*) = f(x*), then x* is an optimal solution of (IP).

Proof: If (RP) is infeasible, then T = ∅; since X ⊆ T, also X = ∅. For the second part of the proposition: as x* ∈ X, z ≥ c(x*) = f(x*) = z^R, and since z ≤ z^R we get c(x*) = z.
Example

z = max {4x − y : 7x − 2y ≤ 14, y ≤ 3, 2x − 2y ≤ 3, x, y ≥ 0, x, y integer}

(Figure: the integer feasible points and the LP feasible region, with objective direction c.)

The figure makes clear that x = 2, y = 1 is the optimal solution, with 7 as objective value. The optimal solution of the LP relaxation is x = 20/7, y = 3, with 59/7 as objective value. This is an upper bound. Rounding this bound down to an integer gives 8 as the linear relaxation bound.
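Both claims can be verified with a short brute-force computation: enumerate the integer points for the IP, and evaluate the LP vertex (20/7, 3) exactly with rational arithmetic.

```python
from fractions import Fraction

# Brute-force check of the example IP:
#   max 4x - y  s.t.  7x - 2y <= 14,  y <= 3,  2x - 2y <= 3,  x, y >= 0 integer
def feasible(x, y):
    return 7*x - 2*y <= 14 and y <= 3 and 2*x - 2*y <= 3 and x >= 0 and y >= 0

# Enumerate a box that certainly contains all feasible points (y <= 3 gives 7x <= 20).
best = max((4*x - y, x, y) for x in range(8) for y in range(8) if feasible(x, y))
z, x_opt, y_opt = best        # integer optimum

# LP relaxation optimum: the vertex where 7x - 2y = 14 meets y = 3.
x_lp = Fraction(20, 7)
z_lp = 4*x_lp - 3             # = 59/7, an upper bound on z
print((z, x_opt, y_opt), z_lp)
```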
Combinatorial Relaxations

Whenever the relaxation is a combinatorial optimization problem, we speak of a combinatorial relaxation. Some examples follow.

TSP: The TSP, on a digraph D = (V, A), amounts to finding a (salesman, or Hamiltonian) tour T of minimal length in terms of given arc weights c_ij, (i, j) ∈ A. We have seen that a tour is an assignment without subtours. So we have

z_TSP = min {Σ_{(i,j)∈T} c_ij : T is a tour} ≥ min {Σ_{(i,j)∈T} c_ij : T is an assignment}.

Symmetric TSP: The STSP, on a graph G = (V, E), amounts to finding a tour T of minimal length in terms of given edge weights c_e, e ∈ E.

Definition 3  A 1-tree is a subgraph consisting of two edges adjacent to node 1, plus the edges of a tree on the remaining nodes {2, ..., n}.

Observe that a tour consists of two edges adjacent to node 1, plus the edges of a path through the remaining nodes. Since a path is a special case of a tree, we have

z_STSP = min {Σ_{e∈T} c_e : T is a tour} ≥ min {Σ_{e∈T} c_e : T is a 1-tree}.
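The 1-tree bound is cheap to compute: a minimum spanning tree on nodes {2, ..., n} plus the two cheapest edges at node 1. A minimal sketch on a small 5-node instance (the distances are illustrative, not from the course):

```python
import itertools

# Hypothetical symmetric distances on nodes 1..5.
c = {frozenset(e): d for e, d in {
    (1, 2): 3, (1, 3): 5, (1, 4): 4, (1, 5): 6,
    (2, 3): 4, (2, 4): 7, (2, 5): 5,
    (3, 4): 2, (3, 5): 8, (4, 5): 3}.items()}

def dist(i, j):
    return c[frozenset((i, j))]

def tour_length(perm):
    nodes = (1,) + perm + (1,)
    return sum(dist(nodes[k], nodes[k + 1]) for k in range(len(perm) + 1))

# Exact STSP value by brute force over all tours through node 1.
z_tsp = min(tour_length(p) for p in itertools.permutations((2, 3, 4, 5)))

# 1-tree bound: MST on {2,...,5} (Prim's method) plus two cheapest edges at node 1.
def mst_cost(nodes):
    nodes = set(nodes)
    tree, cost = {nodes.pop()}, 0
    while nodes:
        d, v = min((dist(i, j), j) for i in tree for j in nodes)
        tree.add(v); nodes.discard(v); cost += d
    return cost

one_tree = mst_cost({2, 3, 4, 5}) + sum(sorted(dist(1, j) for j in (2, 3, 4, 5))[:2])
print(one_tree, z_tsp)   # the 1-tree value is a lower bound on the tour length
```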
Combinatorial Relaxations (cont.)

Quadratic 0-1 Problem: This is the (in general hard!) problem of maximizing a quadratic function over the unit cube:

z = max {Σ_{1≤i<j≤n} q_ij x_i x_j + Σ_{1≤i≤n} p_i x_i : x ∈ {0,1}^n}.

Replacing all terms q_ij x_i x_j with q_ij < 0 by 0, the objective function does not decrease. So we have the relaxation

z^R = max {Σ_{1≤i<j≤n} max{q_ij, 0} x_i x_j + Σ_{1≤i≤n} p_i x_i : x ∈ {0,1}^n}.

This problem is also a quadratic 0-1 problem, but now the quadratic terms have nonnegative coefficients. Such a problem can be solved by solving a series of (easy!) maximum flow problems (see Chapter 9).

Knapsack Problem: The set underlying this problem is

X = {x ∈ Z^n_+ : Σ_{j=1}^n a_j x_j ≤ b}.

This set can be extended to

X^R = {x ∈ Z^n_+ : Σ_{j=1}^n ⌊a_j⌋ x_j ≤ ⌊b⌋},

where ⌊a⌋ denotes the largest integer less than or equal to a.
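The inequality z^R ≥ z for the quadratic 0-1 relaxation can be checked by enumeration on a toy instance (the coefficients q and p below are hypothetical):

```python
import itertools

# Toy quadratic 0-1 instance: maximize
#   sum_{i<j} q[i,j] x_i x_j + sum_i p[i] x_i   over x in {0,1}^3.
q = {(0, 1): 4, (0, 2): -3, (1, 2): 2}
p = [1, -2, 3]

def value(x, coeffs):
    return (sum(cc * x[i] * x[j] for (i, j), cc in coeffs.items())
            + sum(pi * xi for pi, xi in zip(p, x)))

q_relaxed = {ij: max(cc, 0) for ij, cc in q.items()}  # drop negative products

z = max(value(x, q) for x in itertools.product((0, 1), repeat=3))
z_r = max(value(x, q_relaxed) for x in itertools.product((0, 1), repeat=3))
print(z, z_r)   # the relaxation can only increase the optimal value
```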
Lagrangian Relaxations

Consider the IP

(IP)  z = max {c^T x : Ax ≤ b, x ∈ X ⊆ Z^n}.

If this problem is too difficult to solve directly, one possible way to proceed is to drop the constraint Ax ≤ b. This enlarges the feasible region and so yields a relaxation of (IP). An important extension of this idea is not just to drop complicating constraints, but to add them to the objective function with Lagrange multipliers.

Proposition 4  Suppose that (IP) has an optimal solution, and let u ∈ R^m. Define z(u) = max {c^T x + u^T(b − Ax) : x ∈ X}. Then z(u) ≥ z for all u ≥ 0.

Proof: Let x* be optimal for (IP). Then c^T x* = z, Ax* ≤ b and x* ∈ X. Since u ≥ 0, it follows that z(u) ≥ c^T x* + u^T(b − Ax*) ≥ c^T x* = z.

The main challenge is, of course, to find Lagrange multipliers that minimize z(u). The best Lagrange multipliers are found by solving (if we can!)

min_{u≥0} z(u) = min_{u≥0} max_{x∈X} {c^T x + u^T(b − Ax)}.
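A minimal illustration, assuming a tiny 0-1 knapsack with hypothetical data: dualizing the single capacity constraint a·x ≤ b with multiplier u ≥ 0 gives z(u) = u·b + Σ_j max(c_j − u·a_j, 0), since X = {0,1}^n lets each coordinate be chosen independently. Every u ≥ 0 then yields an upper bound on z:

```python
import itertools

# Hypothetical instance:  z = max { c.x : a.x <= b, x in {0,1}^3 }.
c = [10, 7, 4]
a = [6, 5, 3]
b = 10

def z_u(u):
    # Lagrangian value for multiplier u >= 0 (capacity constraint dualized).
    return u * b + sum(max(cj - u * aj, 0) for cj, aj in zip(c, a))

# Exact IP value by enumeration.
z = max(sum(cj * xj for cj, xj in zip(c, x))
        for x in itertools.product((0, 1), repeat=3)
        if sum(aj * xj for aj, xj in zip(a, x)) <= b)

best_u = min(z_u(u / 10) for u in range(31))   # crude grid search over [0, 3]
print(z, best_u)   # best_u is the best upper bound found on the grid
```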
LO Duality (recapitulation)

Consider the LO problem (P) z = max {c^T x : Ax ≤ b, x ≥ 0}. Its dual problem is (D) w = min {b^T y : A^T y ≥ c, y ≥ 0}.

Proposition 5 (Weak duality)  If x is feasible for (P) and y is feasible for (D), then c^T x ≤ b^T y.

Proof: c^T x ≤ (A^T y)^T x = y^T Ax ≤ y^T b = b^T y.

Proposition 6 (Strong duality)  Let x be feasible for (P) and y feasible for (D). Then x and y are optimal if and only if c^T x = b^T y.

Scheme for dualizing:

  Primal problem (P): min c^T x  |  Dual problem (D): max b^T y
  equality constraint            |  free variable
  inequality constraint (≥)      |  variable ≥ 0
  inequality constraint (≤)      |  variable ≤ 0
  free variable                  |  equality constraint
  variable ≥ 0                   |  inequality constraint (≤)
  variable ≤ 0                   |  inequality constraint (≥)
Duality

In the case of Linear Optimization, every feasible solution of the dual problem gives a bound on the optimal value of the primal problem. It is natural to ask whether it is possible to find a dual problem for an IP.

Definition 4  The two problems (IP) z = max {c(x) : x ∈ X} and (D) w = min {ω(u) : u ∈ U} form a (weak) dual pair if c(x) ≤ ω(u) for all x ∈ X and all u ∈ U. When moreover w = z, they form a strong dual pair.

N.B. Any feasible solution of a dual problem yields an upper bound for (IP). By contrast, for a relaxation only its optimal solution yields an upper bound.

Proposition 7  The IP z = max {c^T x : Ax ≤ b, x ∈ Z^n_+} and the IP w = min {u^T b : A^T u ≥ c, u ∈ Z^m_+} form a dual pair.

Proposition 8  Suppose that (IP) and (D) are a dual pair. If (D) is unbounded, then (IP) is infeasible. On the other hand, if x* ∈ X and u* ∈ U satisfy c(x*) = ω(u*), then x* is an optimal solution of (IP) and u* is an optimal solution of (D).
A Dual for the Matching Problem

Given a graph G = (V, E), a matching M ⊆ E is a set of (vertex-)disjoint edges. A covering by nodes is a set R ⊆ V of nodes such that every edge has at least one end point in R. (Figure: a graph in which the red edges form a matching and the green nodes form a covering by nodes.)

Proposition 9  The problem of finding a maximum cardinality matching,

max {|M| : M ⊆ E, M is a matching},

and the problem of finding a minimum cardinality covering by nodes,

min {|R| : R ⊆ V, R is a covering by nodes},

form a weak dual pair.

Proof: If M is a matching, then the end nodes of its edges are distinct, so their number is 2k, where k = |M|. Any covering by nodes R must contain at least one of the end nodes of each edge in M. Hence |R| ≥ k, and therefore |R| ≥ |M|.

Unfortunately, this duality is not strong! Since the given matching is maximum and the covering by nodes is minimum, as can easily be verified, the graph above proves this.
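Weak duality between matchings and node coverings is easy to verify exhaustively on a small graph. The 5-cycle with one chord below (a hypothetical example, not the slide's figure) also exhibits a duality gap: the maximum matching has 2 edges while the minimum covering needs 3 nodes.

```python
import itertools

# A 5-cycle 0-1-2-3-4-0 plus the chord (1,3).
V = range(5)
E = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (1, 3)]

def is_matching(M):
    ends = [v for e in M for v in e]
    return len(ends) == len(set(ends))          # all end nodes distinct

def is_cover(R):
    return all(u in R or v in R for u, v in E)  # every edge touched

max_matching = max(len(M) for k in range(len(E) + 1)
                   for M in itertools.combinations(E, k) if is_matching(M))
min_cover = min(len(R) for k in range(len(V) + 1)
                for R in itertools.combinations(V, k) if is_cover(R))
print(max_matching, min_cover)
```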
A Dual for the Matching Problem (cont.)

The former result can also be obtained from LO duality.

Definition 5  The node-edge matrix of a graph G = (V, E) is an n = |V| by m = |E| matrix A with A_{i,e} = 1 when node i is incident to edge e, and A_{i,e} = 0 otherwise.

With the help of the node-edge matrix A of G, the matching problem can be formulated as the IP

z = max {1^T x : Ax ≤ 1, x ∈ Z^m_+},

and the covering by nodes problem as

w = min {1^T y : A^T y ≥ 1, y ∈ Z^n_+}.

Using LO duality we may write

z = max {1^T x : Ax ≤ 1, x ∈ Z^m_+}
  ≤ max {1^T x : Ax ≤ 1, x ≥ 0}
  = min {1^T y : A^T y ≥ 1, y ≥ 0}
  ≤ min {1^T y : A^T y ≥ 1, y ∈ Z^n_+} = w.
Primal Bounds: Greedy Search

The idea of a greedy heuristic is to construct a solution from scratch (the empty set), choosing at each step the item bringing the best immediate reward. We give some examples.

0-1 Knapsack problem:

z = max {12x_1 + 8x_2 + 17x_3 + 11x_4 + 6x_5 + 2x_6 + 2x_7 : 4x_1 + 3x_2 + 7x_3 + 5x_4 + 3x_5 + 2x_6 + 3x_7 ≤ 9, x ∈ {0,1}^7}

Greedy Solution: Order the variables so that their profit per unit of resource is nonincreasing. This is already done here. Variables with a low index are now more attractive than variables with a higher index. So we proceed as shown in the table.

  var.  c_i/a_i  value  resource used  resource remaining
  x_1   3        1      4              5
  x_2   8/3      1      3              2
  x_3   17/7     0      -              2
  x_4   11/5     0      -              2
  x_5   2        0      -              2
  x_6   1        1      2              0
  x_7   2/3      0      -              0

The resulting solution is x^G = (1,1,0,0,0,1,0) with objective value z^G = 22. All we know is that 22 is a lower bound for the optimal value. Observe that x = (1,0,0,1,0,0,0) is feasible with the higher value 23.
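The greedy rule above amounts to a few lines of code; this sketch reproduces the table for the knapsack instance:

```python
# Greedy heuristic for the 0-1 knapsack instance above: sort by profit/weight
# ratio (here already nonincreasing) and take each item that still fits.
c = [12, 8, 17, 11, 6, 2, 2]
a = [4, 3, 7, 5, 3, 2, 3]
b = 9

order = sorted(range(len(c)), key=lambda j: c[j] / a[j], reverse=True)
x, room = [0] * len(c), b
for j in order:
    if a[j] <= room:            # item fits: take it
        x[j], room = 1, room - a[j]

z_greedy = sum(cj * xj for cj, xj in zip(c, x))
print(x, z_greedy)
```

Note that the greedy solution is only a primal (lower) bound; the better solution x = (1,0,0,1,0,0,0) with value 23 shows it need not be optimal.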
Primal Bounds: Greedy Search (cont.)

Symmetric TSP: Consider an instance on 6 cities with the distance matrix (upper triangle):

        2    3    4    5    6
  1     9    2    8   12   11
  2          7   19   10   32
  3              29   18    6
  4                   24    3
  5                        19

Greedy Solution: Order the edges according to nondecreasing cost, and seek to use them in this order. So we proceed as follows to construct the heuristic tour.

  step  edge    length
  1     (1,3)   2    accept
  2     (4,6)   3    accept
  3     (3,6)   6    accept
  4     (2,3)   7    conflict in node 3
  5     (1,4)   8    creates subtour
  6     (1,2)   9    accept
  7     (2,5)   10   accept
  8     (4,5)   24   forced to accept

(Figures: the heuristic tour 1-3-6-4-5-2-1 and a better tour 1-4-6-5-2-3-1.)

The length of the created tour is z^G = 54. The tour 1-4-6-5-2-3-1 is shorter: it has length 49.
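The two tour lengths can be checked directly against the distance matrix:

```python
# Upper-triangle distances of the 6-city instance, keyed by (min node, max node).
d = {(1, 2): 9, (1, 3): 2, (1, 4): 8, (1, 5): 12, (1, 6): 11,
     (2, 3): 7, (2, 4): 19, (2, 5): 10, (2, 6): 32,
     (3, 4): 29, (3, 5): 18, (3, 6): 6,
     (4, 5): 24, (4, 6): 3, (5, 6): 19}

def length(tour):
    # Sum edge lengths around the closed tour.
    edges = zip(tour, tour[1:] + tour[:1])
    return sum(d[(min(i, j), max(i, j))] for i, j in edges)

print(length([1, 3, 6, 4, 5, 2]))   # greedy tour
print(length([1, 4, 6, 5, 2, 3]))   # better tour
```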
Primal Bounds: Local Search Local search methods assume that a feasible solution is known. It is called the incumbent. The idea of a local search heuristic is to define a neighborhood of solutions close to the incumbent. Then the best solution in this neighborhood is found. If it is better than the incumbent, it replaces it, and the procedure is repeated. Otherwise, the incumbent is locally optimal with respect to the neighborhood, and the heuristic terminates. Below we give two examples.
Primal Bounds: Local Search (cont.)

Uncapacitated Facility Location: Consider an instance with m = 6 clients and n = 4 depots, with delivery costs

             depot 1  depot 2  depot 3  depot 4
  client 1      6        2        3        4
  client 2      1        9        4       11
  client 3     15        2        6        3
  client 4      9       11        4        8
  client 5      7       23        2        9
  client 6      4        3        1        5

and fixed costs (f_j) = (21, 16, 11, 24).

Let N = {1,2,3,4} denote the set of depots, and S the set of open depots. Let the incumbent be the solution with depots 1 and 2 open, so S = {1,2}. Each client is served by the open depot that is cheapest for the client. So the cost of the incumbent is (2 + 1 + 2 + 9 + 7 + 3) + (21 + 16) = 61.

A possible neighborhood Q(S) of S is the set of all solutions obtained from S by adding or removing a single depot:

Q(S) = {T ⊆ N : T = S ∪ {j} for j ∉ S, or T = S \ {i} for i ∈ S}.

In the current example Q(S) = {{1}, {2}, {1,2,3}, {1,2,4}}. A simple computation makes clear that the costs of these 4 solutions are 63, 66, 60 and 84, respectively. So S = {1,2,3} is the next incumbent. The new neighborhood becomes Q(S) = {{1,2}, {1,3}, {2,3}, {1,2,3,4}}, with minimal cost 42 for S = {2,3}, which is the new incumbent. The new neighborhood becomes Q(S) = {{2}, {3}, {1,2,3}, {2,3,4}}, with minimal cost 31 for S = {3}, the new incumbent. The new neighborhood becomes Q(S) = {{1,3}, {2,3}, {3,4}} (the empty set is not feasible), with all costs > 31. So S = {3} is a locally optimal solution.
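The whole search above can be replayed in a few lines (depots are 0-indexed here, so the slide's incumbent S = {1,2} is {0,1} in the code):

```python
# Local search for the uncapacitated facility location instance above.
# Neighborhood: add or remove a single depot.
c = [[6, 2, 3, 4], [1, 9, 4, 11], [15, 2, 6, 3],
     [9, 11, 4, 8], [7, 23, 2, 9], [4, 3, 1, 5]]
f = [21, 16, 11, 24]

def cost(S):
    # Each client uses its cheapest open depot; add the fixed opening costs.
    return sum(min(row[j] for j in S) for row in c) + sum(f[j] for j in S)

S, trace = {0, 1}, []
trace.append(cost(S))
while True:
    nbrs = [S | {j} for j in range(4) if j not in S] + \
           [S - {j} for j in S if len(S) > 1]     # keep at least one depot open
    best = min(nbrs, key=cost)
    if cost(best) >= cost(S):
        break                                     # locally optimal
    S = best
    trace.append(cost(S))
print(trace, sorted(S))
```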
Primal Bounds: Local Search (cont.)

Graph Equipartition Problem: Given a graph G = (V, E) with n = |V|, the problem is to find a subset S ⊆ V with |S| = n/2 for which the number c(S) of edges in the cut set δ(S, V \ S) is minimized, where δ(S, V \ S) = {(i, j) ∈ E : i ∈ S, j ∉ S}.

In this problem all feasible sets have the same size, n/2. A natural neighborhood of a feasible set S ⊆ V therefore consists of all subsets of nodes obtained by replacing one element of S by one element not in S:

Q(S) = {T ⊆ V : |T \ S| = |S \ T| = 1}.

Example: In the graph shown (on 6 nodes) we start with S = {1,2,3}, for which c(S) = 6. Then

Q(S) = {{1,2,4}, {1,2,5}, {1,2,6}, {1,3,4}, {1,3,5}, {1,3,6}, {2,3,4}, {2,3,5}, {2,3,6}},

for which c(T) = 6, 5, 4, 4, 5, 6, 5, 2, 5, respectively. The new incumbent is S = {2,3,5} with c(S) = 2. Q(S) does not contain a better solution, as may easily be verified, so S = {2,3,5} is locally optimal.
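The same swap-neighborhood search is easy to code. Since the slide's exact edge set is not reproduced here, the sketch below uses a hypothetical 6-node graph purely to illustrate the method:

```python
# Swap-neighborhood local search for graph equipartition on a hypothetical
# 6-node graph (not the slide's figure).
E = [(1, 2), (1, 4), (2, 3), (2, 5), (3, 6), (4, 5), (5, 6), (1, 6)]
V = set(range(1, 7))

def cut(S):
    # Number of edges with exactly one end point in S.
    return sum(1 for i, j in E if (i in S) != (j in S))

S = {1, 3, 5}
while True:
    nbrs = [(S - {i}) | {j} for i in S for j in V - S]   # swap one node
    best = min(nbrs, key=cut)
    if cut(best) >= cut(S):
        break                                            # locally optimal
    S = best
print(sorted(S), cut(S))
```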