MTAT.03.86: Advanced Methods in Algorithms
Homework Assignment 4 Solutions
University of Tartu

1 Probabilistic algorithm

Let S = {x_1, x_2, ..., x_n} be a set of binary variables of size n ≥ 1, x_i ∈ {0, 1}. Consider a binary equation φ with m ≥ 1 unknowns of the form

    x_{t_1} + x_{t_2} + ... + x_{t_m} = 0,

where all computations are done modulo 2, and t_i ∈ {1, 2, ..., n} for all i = 1, 2, ..., m.

(a) Consider a simple probabilistic algorithm that assigns values 0 and 1 to all x_i, i = 1, 2, ..., n, as follows: Prob(x_i = 0) = Prob(x_i = 1) = 1/2 (the value of each variable is assigned independently of the other variables). Show that the probability that the proposed algorithm does not find a solution of φ is exactly 1/2. Hint: for each assignment of values that does not solve φ there is an assignment that solves it.

(b) Consider now a system of equations φ_1, φ_2, ..., φ_r as above, r ≥ 1, where any two equations φ_i and φ_j use disjoint sets of unknowns. What is the probability that the algorithm in part (a) solves the system in (b) (i.e., all r equations are solved at the same time)?

(c) How many times should the algorithm in (a) be executed in order to solve the system given in (b) with probability at least 1/2? At least 0.99?

1.1 Failure probability

First, consider the case m = 1. Then the equation is x_{t_1} = 0 and by definition Prob(x_{t_1} = 0) = 1/2, as required. Now assume that there are m ≥ 2 variables. Using the law of total probability, the independence of the assignments and some simplifications we have

    Prob(x_{t_1} + x_{t_2} + ... + x_{t_m} = 0)
      = Prob(x_{t_m} = 0) · Prob(x_{t_1} + ... + x_{t_{m-1}} = 0) + Prob(x_{t_m} = 1) · Prob(x_{t_1} + ... + x_{t_{m-1}} = 1)
      = 1/2 · (Prob(x_{t_1} + ... + x_{t_{m-1}} = 0) + Prob(x_{t_1} + ... + x_{t_{m-1}} = 1))
      = 1/2 · 1 = 1/2.

Hence the probability that the algorithm does not find a solution is 1 − 1/2 = 1/2.
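The claim can also be checked by exhaustive enumeration over all assignments; a short sketch (the function name is ours):

```python
from itertools import product

def success_probability(m):
    """Exact probability that a uniformly random 0/1 assignment of m variables
    satisfies x_1 + x_2 + ... + x_m = 0 (mod 2), found by enumerating
    all 2^m equally likely assignments."""
    satisfying = sum(1 for bits in product((0, 1), repeat=m)
                     if sum(bits) % 2 == 0)
    return satisfying / 2 ** m
```

For every m ≥ 1 the value is exactly 0.5, matching the hint: flipping the last variable pairs each non-solution with a solution.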
1.2 System of equations

As each equation has distinct variables, the probability of solving any one equation is 1/2. All r equations are independent, therefore the probability of the joint event that all equations are solved is

    Prob(solved) = (1/2)^r = 1/2^r.

1.3 Improving the probability with repetitions

Repetitions can help to increase the success probability. Let the number of repetitions be k. The probability that a single run of the algorithm fails to solve the system is 1 − 1/2^r, and the probability that all k independent repetitions fail is therefore (1 − 1/2^r)^k. We have the following:

    Prob(k repetitions solve the system)
      = Prob(at least one repetition solves the system)
      = 1 − Prob(no single repetition solves the system)
      = 1 − (1 − 1/2^r)^k.

To ensure success probability at least 1/2 we need to find k from the following inequality, simplifying and taking the logarithm of both sides:

    1 − (1 − 1/2^r)^k ≥ 1/2
    (1 − 1/2^r)^k ≤ 1/2
    k · log(1 − 1/2^r) ≤ log(1/2)
    k ≥ log(1/2) / log(1 − 1/2^r).

Note that log(1 − 1/2^r) is negative, so when dividing the inequality by that value we need to change the direction of the inequality. The other case is analogous, giving

    k ≥ log(0.01) / log(1 − 1/2^r).

2 Maximum flow LP

Consider the following flow network N.
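The derived bound can be evaluated numerically; a small sketch (the helper name is ours; the logarithm base is irrelevant, since only the ratio of two logarithms is used):

```python
import math

def repetitions_needed(r, target):
    """Smallest k with 1 - (1 - 2**-r)**k >= target, from the bound
    k >= log(1 - target) / log(1 - 2**-r) derived above."""
    p = 2.0 ** -r  # success probability of a single run
    return math.ceil(math.log(1 - target) / math.log(1 - p))
```

For r = 1 a single run already succeeds with probability 1/2, so one repetition suffices for target 1/2, while reaching 0.99 takes 7 runs (1 − 2⁻⁷ ≈ 0.992).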
[Figure: flow network N with vertices s, a, b, t and directed edges s→a (capacity 3), a→t (capacity 5), s→t (capacity 7), s→b (capacity 4), b→a (capacity 2).]

1. Write the problem of finding maximum flow from s to t in N as a linear program.

2. Write down the dual of this linear program. There should be a dual variable for each edge of the network and for each vertex other than s and t.

Now, consider a general flow network. Recall the linear program formulation for a general maximum flow problem, which was shown in the class.

3. Write down the dual of this general flow linear-programming problem, using a variable y_e for each edge and x_u for each vertex u ≠ s, t.

4. Show that any solution to the general dual problem must satisfy the following property: for any directed path from s to t in the network, the sum of the y_e values along the path must be at least 1.

2.1 Maximum flow

Let us use a variable f_e to denote the flow in each edge e. The objective is to maximize the incoming flow of vertex t; in this example t has no outgoing edges, therefore the objective is max f_{s,t} + f_{a,t}.
For each edge, we have to satisfy the edge rule (capacity constraint), giving us the following constraints:

    f_{s,a} ≤ 3,  f_{s,t} ≤ 7,  f_{s,b} ≤ 4,  f_{b,a} ≤ 2,  f_{a,t} ≤ 5,
    f_{s,a} ≥ 0,  f_{s,t} ≥ 0,  f_{s,b} ≥ 0,  f_{b,a} ≥ 0,  f_{a,t} ≥ 0.

In addition, the flow must satisfy the vertex rule (flow conservation) for a and b:

    f_{s,a} + f_{b,a} − f_{a,t} = 0
    f_{s,b} − f_{b,a} = 0.

In total the problem becomes

    max  f_{s,t} + f_{a,t}
    s.t. f_{s,a} + f_{b,a} − f_{a,t} = 0
         f_{s,b} − f_{b,a} = 0
         f_{s,a} ≤ 3,  f_{s,t} ≤ 7,  f_{s,b} ≤ 4,  f_{b,a} ≤ 2,  f_{a,t} ≤ 5
         f_{s,a}, f_{s,t}, f_{s,b}, f_{b,a}, f_{a,t} ≥ 0.
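As a sanity check on this LP (not part of the original solution), the optimal value can be computed combinatorially with a standard Edmonds–Karp max-flow routine; the helper below is a sketch:

```python
from collections import deque

def max_flow(capacities, s, t):
    """Edmonds-Karp (BFS-based Ford-Fulkerson) on a capacity dict {(u, v): c}."""
    res = {}   # residual capacities
    adj = {}
    for (u, v), c in capacities.items():
        res[(u, v)] = res.get((u, v), 0) + c
        res.setdefault((v, u), 0)          # reverse edge for undoing flow
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    total = 0
    while True:
        parent = {s: None}                 # BFS for a shortest augmenting path
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and res[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total
        path, v = [], t                    # recover the path and its bottleneck
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(res[e] for e in path)
        for (u, v) in path:
            res[(u, v)] -= push
            res[(v, u)] += push
        total += push

network = {("s", "a"): 3, ("s", "t"): 7, ("s", "b"): 4,
           ("b", "a"): 2, ("a", "t"): 5}
```

Here `max_flow(network, "s", "t")` returns 12 (edge s→t saturated with 7, path s→a→t carries 3 and s→b→a→t carries 2), so the LP optimum is 12.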
2.2 Dual problem

To find the dual problem, let us first reformat the primal problem to make the constraint matrix A more explicit. In addition, let us use a variable x_v for the vertex rule of each vertex v and y_e for the edge rule of each edge e.

    max  f_{s,t} + f_{a,t}
    s.t. f_{s,a} + f_{b,a} − f_{a,t} = 0    (x_a)
         f_{s,b} − f_{b,a} = 0              (x_b)
         f_{s,a} ≤ 3                        (y_{s,a})
         f_{s,t} ≤ 7                        (y_{s,t})
         f_{s,b} ≤ 4                        (y_{s,b})
         f_{b,a} ≤ 2                        (y_{b,a})
         f_{a,t} ≤ 5                        (y_{a,t})
         f_{s,t}, f_{s,a}, f_{s,b}, f_{b,a}, f_{a,t} ≥ 0

It is straightforward to write out the dual problem using the columns of the matrix A that describes the constraints: each primal variable f_e yields one dual constraint collecting the dual variables of the constraints in which f_e appears, with the objective coefficient of f_e as the right-hand side. Since all our initial variables have non-negativity constraints, all resulting constraints in the dual problem are inequalities. However, in the dual problem we only have non-negativity constraints for the y_e variables, as these correspond to the inequalities of the primal problem.

    min  3y_{s,a} + 7y_{s,t} + 4y_{s,b} + 2y_{b,a} + 5y_{a,t}
    s.t. y_{s,t} ≥ 1
         x_a + y_{s,a} ≥ 0
         x_b + y_{s,b} ≥ 0
         x_a − x_b + y_{b,a} ≥ 0
         −x_a + y_{a,t} ≥ 1
         y_{s,a}, y_{s,t}, y_{s,b}, y_{b,a}, y_{a,t} ≥ 0
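By weak duality, any feasible dual solution upper-bounds every feasible flow. The s–t cut {s, b} | {a, t} suggests the candidate below (our own check, not part of the original solution); it is feasible with objective 12, which matches the maximum flow value, so both are optimal.

```python
# candidate dual solution read off the s-t cut {s, b} | {a, t}:
# y_e = 1 on the cut edges (s,a), (s,t), (b,a); x_a = -1 (sink side), x_b = 0
y = {("s", "a"): 1, ("s", "t"): 1, ("s", "b"): 0, ("b", "a"): 1, ("a", "t"): 0}
x = {"a": -1, "b": 0}

feasible = all([
    y[("s", "t")] >= 1,
    x["a"] + y[("s", "a")] >= 0,
    x["b"] + y[("s", "b")] >= 0,
    x["a"] - x["b"] + y[("b", "a")] >= 0,
    -x["a"] + y[("a", "t")] >= 1,
]) and all(v >= 0 for v in y.values())

objective = (3 * y[("s", "a")] + 7 * y[("s", "t")] + 4 * y[("s", "b")]
             + 2 * y[("b", "a")] + 5 * y[("a", "t")])
# feasible is True and objective == 12
```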
2.3 Dual of the general problem

The general linear program for a flow network N = (G(V, E), s, t, c) can be written out using a variable f_e for each e ∈ E that denotes the flow in the corresponding edge:

    max  Σ_{e ∈ in(t)} f_e − Σ_{e ∈ out(t)} f_e
    s.t. ∀e ∈ E: f_e ≤ c(e)
         ∀v ∈ V \ {s, t}: Σ_{e ∈ in(v)} f_e − Σ_{e ∈ out(v)} f_e = 0
         ∀e ∈ E: f_e ≥ 0

The initial problem has |V \ {s, t}| equalities and |E| inequalities if we do not count the basic restrictions f_e ≥ 0. Hence, the dual problem will have |V \ {s, t}| + |E| variables. As we have the constraint f_e ≥ 0 for all initial variables, we know that all the constraints in the dual problem will be inequalities. We use a variable y_e for each edge e and x_u for each vertex u that is neither the source nor the sink. Here x_u corresponds to the equality for the vertex rule of u and y_e corresponds to the inequality for the edge rule of e. We assume that there are no self-loops, because these are not interesting for maximum flow problems.

The objective function becomes minimizing

    Σ_{e ∈ E} c(e) · y_e.

The constraints are divided based on which edge the derived inequality corresponds to. We get one constraint for each edge in the graph. The main difference between them is whether or not the edge is connected to the source or the sink. Each edge with an endpoint in the sink t is also used in the initial objective function, therefore the corresponding inequality has the form ... ≥ 1 or ... ≥ −1. The remaining inequalities have the form ... ≥ 0.

The most special edges are those between s and t, because they are not part of any vertex rule but participate in the objective function. Therefore, for these edges we only get the rules y_{(s,t)} ≥ 1 and y_{(t,s)} ≥ −1.

For each e = (u, t) ∈ E, u ≠ s, we have −x_u + y_e ≥ 1, because f_e is used in two constraints: the vertex rule of u (with a negative sign, as the edge leaves u) and the edge rule of e. In addition, it is used in the objective function with a positive sign.
Analogously, for e = (t, u) ∈ E, u ≠ s, we have x_u + y_e ≥ −1, because this edge is used in the vertex rule of u with a positive sign and in the corresponding edge rule; however, it appears in the objective function with coefficient −1. Similarly, we get the special cases e = (s, u) for u ∈ V \ {t}, which gives the inequality x_u + y_e ≥ 0 because f_e is used in the edge rule and in the vertex rule of u (entering u, positive sign), and e = (u, s) ∈ E for u ∈ V \ {t}, which gives −x_u + y_e ≥ 0.
Finally, we have the general edges e = (u, v) with u, v ∈ V \ {s, t}. Each such edge is reflected by three dual variables and gives −x_u + x_v + y_e ≥ 0, because f_e is used in the vertex rules of u (leaving, negative sign) and v (entering, positive sign) and in the edge rule of e. In addition, we have y_e ≥ 0, because the edge variables correspond to the inequalities of the original problem. Therefore, in total we obtain the following dual problem:

    min  Σ_{e ∈ E} c(e) · y_e
    s.t. e = (t, s) ∈ E: y_e ≥ −1
         e = (s, t) ∈ E: y_e ≥ 1
         e = (t, u) ∈ E, u ∈ V \ {s}: x_u + y_e ≥ −1
         e = (u, t) ∈ E, u ∈ V \ {s}: −x_u + y_e ≥ 1
         e = (s, u) ∈ E, u ∈ V \ {t}: x_u + y_e ≥ 0
         e = (u, s) ∈ E, u ∈ V \ {t}: −x_u + y_e ≥ 0
         e = (u, v) ∈ E, u, v ∈ V \ {s, t}: −x_u + x_v + y_e ≥ 0
         ∀e ∈ E: y_e ≥ 0

2.4 Directed path

We know that none of the values y_e is negative; therefore, if the value of a single edge on the path is at least one, this is already sufficient for the proof. The condition clearly holds for the only path with just one edge, (s, t), because we have the constraint y_{(s,t)} ≥ 1.

Assume that we have a path through numbered vertices s → 1 → 2 → ... → n → t such that the source and sink are passed through only once. Consider the sum y_{(s,1)} + y_{(1,2)} + ... + y_{(n,t)}. From the constraint −x_n + y_{(n,t)} ≥ 1 we get y_{(n,t)} ≥ 1 + x_n. Therefore

    y_{(s,1)} + y_{(1,2)} + ... + y_{(n,t)} ≥ y_{(s,1)} + y_{(1,2)} + ... + y_{(n−1,n)} + x_n + 1.

Next, we can use the constraint −x_{n−1} + x_n + y_{(n−1,n)} ≥ 0, i.e. y_{(n−1,n)} + x_n ≥ x_{n−1}, which gives

    y_{(s,1)} + ... + y_{(n−1,n)} + x_n + 1 ≥ y_{(s,1)} + ... + y_{(n−2,n−1)} + x_{n−1} + 1.

We can keep using this rule on the intermediate edges (u, v) until we reach the first edge of the path and are left with y_{(s,1)} + x_1 + 1, where x_1 + y_{(s,1)} ≥ 0 gives y_{(s,1)} + x_1 + 1 ≥ 1. Following the chain of inequalities back, we get y_{(s,1)} + y_{(1,2)} + ... + y_{(n,t)} ≥ 1.

We also have to consider the more general paths that pass through the source or the sink many times. Let s → 1 → ... → n → t now be such a path, where some of the intermediate vertices may be t or s.
We can consider this path in fragments that start and end in s or t; each fragment is itself a path. Namely, we start from s, and the first fragment ends when we first encounter t or s. We start the next fragment from this s or t and continue fragmenting until we reach the end of the path. The possible fragments are s → s, s → t, t → s and t → t. From the previous argument we know that the sum is at least one for each fragment s → t. In addition, each path from s to t trivially has to contain at least one fragment s → t, because otherwise we start from s and can only have fragments s → s, which means that we never reach the t at the end of the path, although we have to reach it by definition. The sum of the y_e values in each fragment is non-negative, because each y_e ≥ 0. Therefore, we can bound the whole sum by looking only at the edges that belong to one s → t fragment:

    Σ_{e ∈ path} y_e ≥ Σ_{e ∈ s→t fragment} y_e ≥ 1,

which holds because the previous result for paths from s to t clearly applies to this simple fragment s → t.

3 Set cover and set packing

You are given a set of elements U = {x_1, x_2, ..., x_n} and m of its subsets S_i ⊆ U, i = 1, 2, ..., m, such that U = S_1 ∪ S_2 ∪ ... ∪ S_m.

Definition. A set packing of U is a collection P of subsets S_i such that any two sets in P are disjoint. The maximum set packing problem is the problem of finding a set packing with the maximum number of sets.

Definition. A set cover of U is a collection F of subsets S_i such that the sets in F cover U, i.e. ∪_{S_i ∈ F} S_i = U. The minimum set cover problem is the problem of finding a set cover with the minimum number of sets.

Example. Let n = 5 and m = 4. Take U = {1, 2, 3, 4, 5}, S_1 = {1, 2, 3}, S_2 = {2, 3, 4}, S_3 = {3, 4, 5}, S_4 = {4}. Then P = {S_1, S_4} is a maximum set packing, and F = {S_1, S_3} is a minimum set cover.

1. Formulate the maximum set packing problem as an integer linear-programming problem.

2. Formulate the minimum set cover problem as an integer linear-programming problem.

3.
In the two formulations, relax the constraints such that the resulting problems become linear-programming problems and the resulting two problems are dual. Explain why the problems are indeed dual. Assume that there are no empty sets S_i.
3.1 Maximum set packing integer LP

Let s_i be the indicator variable that is 1 if S_i is included in the packing solution. For each element x ∈ U we need to ensure that at most one set containing x is included in the packing, hence we get

    ∀x ∈ U: Σ_{i=1}^{m} s_i · I(x, S_i) ≤ 1,

where I(x, S_i) = 1 if x ∈ S_i and 0 otherwise. Hence, we obtain

    max  Σ_{i=1}^{m} s_i
    s.t. ∀x ∈ U: Σ_{i=1}^{m} s_i · I(x, S_i) ≤ 1
         ∀i ∈ {1, ..., m}: s_i ∈ {0, 1}

3.2 Minimum set cover integer LP

For each set S_i let s_i be an indicator variable such that s_i = 1 if S_i is in the set cover and s_i = 0 otherwise. Then the problem is to minimize Σ_{i=1}^{m} s_i. However, we need constraints to make sure that for each element x_j at least one set S_i with x_j ∈ S_i is in the cover:

    ∀x ∈ U: Σ_{i=1}^{m} s_i · I(x, S_i) ≥ 1,

where I(x, S_i) = 1 if x ∈ S_i and 0 otherwise. Hence, we obtain

    min  Σ_{i=1}^{m} s_i
    s.t. ∀x ∈ U: Σ_{i=1}^{m} s_i · I(x, S_i) ≥ 1
         ∀i ∈ {1, ..., m}: s_i ∈ {0, 1}

3.3 Duality

3.3.1 Relaxing

The set packing problem can be relaxed by setting 0 ≤ s_i ≤ 1. However, the bound s_i ≤ 1 is already assured by the constraints, as they state that a sum of some of the s_i has to be less than or equal to 1 (recall that no set S_i is empty). Therefore we can omit this bound and we get the following relaxed linear program.
    max  Σ_{i=1}^{m} s_i
    s.t. ∀x ∈ U: Σ_{i=1}^{m} s_i · I(x, S_i) ≤ 1
         ∀i ∈ {1, ..., m}: s_i ≥ 0

For the set cover problem, the minimization ensures that there is no reason for any variable s_i to grow larger than 1 in order to satisfy the constraints. Hence we can omit the constraint s_i ≤ 1 as redundant:

    min  Σ_{i=1}^{m} s_i
    s.t. ∀x ∈ U: Σ_{i=1}^{m} s_i · I(x, S_i) ≥ 1
         ∀i ∈ {1, ..., m}: s_i ≥ 0

3.3.2 Explanation of duality

Set packing dual. Consider finding the dual of the relaxed set packing problem. The dual problem has one constraint for each set S_i, and the objective function sums a variable X_j for each element x_j ∈ U:

    min  Σ_{j=1}^{n} X_j
    s.t. ∀i ∈ {1, ..., m}: Σ_{j=1}^{n} X_j · I(x_j, S_i) ≥ 1
         ∀j ∈ {1, ..., n}: X_j ≥ 0

We can see this as a relaxation of the following problem.
    min  Σ_{j=1}^{n} X_j
    s.t. ∀i ∈ {1, ..., m}: Σ_{j=1}^{n} X_j · I(x_j, S_i) ≥ 1
         ∀j ∈ {1, ..., n}: X_j ∈ {0, 1}

where the constraints X_j ≤ 1 are removed: this is ensured by the fact that the objective is to minimize and none of the constraints require variables larger than 1 as long as the variables are non-negative. In fact, this is a set covering problem, but for different sets and elements than in the packing problem. Namely, the set of elements is U' = {S_1, ..., S_m} and its subsets are X_j = {S_i : x_j ∈ S_i}.

Set covering dual. Consider the dual problem of the relaxed set covering problem:

    max  Σ_{j=1}^{n} X_j
    s.t. ∀i ∈ {1, ..., m}: Σ_{j=1}^{n} X_j · I(x_j, S_i) ≤ 1
         ∀j ∈ {1, ..., n}: X_j ≥ 0

This can be seen as a relaxation of the following integer linear-programming problem:

    max  Σ_{j=1}^{n} X_j
    s.t. ∀i ∈ {1, ..., m}: Σ_{j=1}^{n} X_j · I(x_j, S_i) ≤ 1
         ∀j ∈ {1, ..., n}: X_j ∈ {0, 1}

This is a set packing problem for U' = {S_1, ..., S_m} and its subsets X_j = {S_i : x_j ∈ S_i}.
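On the small example from the problem statement both integer programs can be solved by brute force; the sketch below (function names are ours) recovers the packing {S_1, S_4} and the cover {S_1, S_3}.

```python
from itertools import combinations

U = {1, 2, 3, 4, 5}
subsets = {"S1": {1, 2, 3}, "S2": {2, 3, 4}, "S3": {3, 4, 5}, "S4": {4}}

def max_packing(universe, sets):
    """A largest collection of pairwise disjoint sets, trying all
    collections from the biggest size downwards."""
    for r in range(len(sets), 0, -1):
        for combo in combinations(sets, r):
            chosen = [sets[name] for name in combo]
            # pairwise disjoint iff the sizes add up to the size of the union
            if sum(len(s) for s in chosen) == len(set().union(*chosen)):
                return set(combo)
    return set()

def min_cover(universe, sets):
    """A smallest collection whose union is the universe, trying all
    collections from the smallest size upwards."""
    for r in range(1, len(sets) + 1):
        for combo in combinations(sets, r):
            if set().union(*(sets[name] for name in combo)) == universe:
                return set(combo)
    return None
```

Here `max_packing(U, subsets)` returns {"S1", "S4"} and `min_cover(U, subsets)` returns {"S1", "S3"}, matching the example.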
4 Simplex algorithm

Solve the following linear-programming problem using the simplex algorithm:

    max  x_1 + 7x_2 + x_3
    s.t. x_1 + x_2 ≤ 3
         2x_2 + x_3 ≤ 4
         x_1 + 2x_3 ≤ 2
         x_1 ≥ 0, x_2 ≥ 0, x_3 ≥ 0

4.1 Tableau format

We write down the initial problem in tableau format and then perform the simplex algorithm on this format. The first part is the matrix A representing the left-hand sides of the constraints. The middle part stands for the slack variables z_1, z_2, z_3, and the last column corresponds to the right-hand sides b of the constraints. The last row holds the objective-function information; in particular it begins with c.

    x1  x2  x3  z1  z2  z3 |   b
     1   1   0   1   0   0 |   3
     0   2   1   0   1   0 |   4
     1   0   2   0   0   1 |   2
     1   7   1   0   0   0 |   0

4.2 Steps of the simplex algorithm

The simplex algorithm requires us to follow two rules and do Gaussian elimination.

1. Choose a column (variable) with a positive coefficient in the last row. If no such column exists, then the current solution is optimal.

2. For each row i where this column c has a positive entry c[i], compute the ratio b[i]/c[i] between the right-hand side of the corresponding constraint and this entry. Choose the row with the smallest ratio.

The chosen element is used as a pivot and then Gaussian elimination is performed.

First we can choose column 1, 2 or 3. Let us choose column 2; the respective ratios are 3 and 2, meaning that we have to select row 2. We obtain the following matrix after applying Gaussian elimination (the first row becomes (1) − 1/2·(2), the second row becomes 1/2·(2), the third remains the same and the fourth row becomes (4) − 7/2·(2)).

    x1  x2    x3  z1    z2  z3 |    b
     1   0  −1/2   1  −1/2   0 |    1
     0   1   1/2   0   1/2   0 |    2
     1   0   2     0   0     1 |    2
     1   0  −5/2   0  −7/2   0 |  −14
Now the only column we can choose is the first column, and based on the respective ratios (1 for the first row and 2 for the third) we have to take the first row. The second row remains the same, but we need to subtract the first row from the third and fourth rows.

    x1  x2    x3  z1    z2  z3 |    b
     1   0  −1/2   1  −1/2   0 |    1
     0   1   1/2   0   1/2   0 |    2
     0   0   5/2  −1   1/2   1 |    1
     0   0  −2    −1  −3     0 |  −15

Here we cannot do any more steps, as no variable has a positive entry in the last row.

4.3 Result

The result of the simplex algorithm can be read from the final tableau based on the basic variables. The basic variables are now x_1, x_2 and z_3, as these are the variables whose columns contain a single 1. We find the values of these variables in the last column, in the rows where the respective columns have the 1. Hence, the optimum is obtained at x_1 = 1 and x_2 = 2. We also know that z_3 = 1, but the value of a slack variable does not affect the objective function. In addition, the values of all non-basic variables are 0, hence x_3 = 0, and the value of the objective function is 15 (the negation of the bottom-right entry −15 of the tableau).
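The two pivot rules above can be turned into a short tableau implementation; the sketch below uses exact fractions and always picks the first positive column, so its intermediate tableaus may differ from the hand computation, while the final optimum agrees.

```python
from fractions import Fraction

def simplex(rows):
    """Maximize via the tableau method: the last row is the objective,
    the last column the right-hand side.  Unbounded problems are not handled."""
    T = [[Fraction(v) for v in row] for row in rows]
    while True:
        obj = T[-1]
        # rule 1: a column with a positive coefficient in the last row
        col = next((j for j in range(len(obj) - 1) if obj[j] > 0), None)
        if col is None:
            return T  # optimal tableau reached
        # rule 2: among rows with a positive entry, the smallest ratio b[i]/c[i]
        row = min((i for i in range(len(T) - 1) if T[i][col] > 0),
                  key=lambda i: T[i][-1] / T[i][col])
        # Gaussian elimination around the pivot element
        pivot = T[row][col]
        T[row] = [v / pivot for v in T[row]]
        for i in range(len(T)):
            if i != row and T[i][col] != 0:
                f = T[i][col]
                T[i] = [a - f * b for a, b in zip(T[i], T[row])]

# columns: x1 x2 x3 | z1 z2 z3 | b; the last row is the objective
tableau = [
    [1, 1, 0, 1, 0, 0, 3],
    [0, 2, 1, 0, 1, 0, 4],
    [1, 0, 2, 0, 0, 1, 2],
    [1, 7, 1, 0, 0, 0, 0],
]
final = simplex(tableau)
# final[-1][-1] == -15, i.e. the optimum is 15
```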