Network Design and Game Theory, Spring 2008
Lecture 2
Instructor: Mohammad T. Hajiaghayi
Scribe: Imdadullah Khan
February 04, 2008

1 Maximum Coverage

In this lecture we review the Maximum Coverage and Unique Coverage problems. We design approximation algorithms for these problems and establish approximation guarantees and hardness of approximation results for them.

In lecture, we defined Maximum Coverage as follows.

Maximum Coverage
Input: A universe $U = \{e_1, \dots, e_n\}$ and a collection $\mathcal{S} = \{S_1, \dots, S_k\}$ of subsets of $U$ such that $U = \bigcup_{i=1}^{k} S_i$. Also, cost and weight functions $c : \mathcal{S} \to \mathbb{Q}^+$ and $w : U \to \mathbb{Q}^+$, and a bound $L \in \mathbb{Q}^+$ called the budget.
Goal: Find $\mathcal{S}' \subseteq \mathcal{S}$ such that the total cost of its sets is at most $L$, while maximizing the total weight of covered elements.

The problem is NP-hard, since the unit-cost Set Cover problem reduces to it trivially: the smallest budget $L$ for which the Maximum Coverage solution covers all of $U$ gives a solution to unit-cost Set Cover.

1.1 Unit-Cost Case

We give a $(1 - 1/e)$-approximation algorithm for the special case of Maximum Coverage in which the costs of all sets are 1. We then prove that this is the best we can achieve.

Let $W_i$, $i = 1, \dots, k$, be the total weight of elements in $S_i$. For $G \subseteq \mathcal{S}$, let $w(G)$ be the total weight of elements covered by $G$. Let $W'_i$ denote the total weight of elements in $S_i$ that are not covered by $G$. We now show that the simple
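greedy algorithm, which at each step adds to the collection a set that maximizes the weight of newly covered elements, achieves this approximation guarantee. The algorithm is as follows.

Greedy $(1 - 1/e)$-approximation algorithm:
1. Set $G \leftarrow \emptyset$, $C \leftarrow 0$, $X \leftarrow \mathcal{S}$.
2. While $X \neq \emptyset$ do:
3. Find a set $S_i \in X$ such that $W'_i$ is maximum, where $W'_i$ is the total weight of elements of $S_i$ not covered by $G$.
4. If $C + c(S_i) \le L$ then
   (a) $G \leftarrow G \cup \{S_i\}$
   (b) $C \leftarrow C + c(S_i)$
5. $X \leftarrow X \setminus \{S_i\}$
Output: $G$.

Analysis

Let OPT be an optimal solution to the problem instance. Let $S_1, \dots, S_m$ be the sets added to our solution, in the order in which they are added. Let $G_i = \{S_1, \dots, S_i\}$ be the collection after $i$ iterations, with $G_0 = \emptyset$, and let $W'_i$ be the weight of the elements of $S_i$ not covered by $G_{i-1}$, so that $W'_i = w(G_i) - w(G_{i-1})$.

Lemma 1. $w(G_i) - w(G_{i-1}) \ge \frac{1}{L}\bigl(w(\mathrm{OPT}) - w(G_{i-1})\bigr)$.

Proof: For each set in $\mathrm{OPT} \setminus G_{i-1}$, the total weight of its elements that are not covered by $G_{i-1}$ is at most $W'_i$. This is because $S_i$ maximizes the weight of uncovered elements, and that weight is $W'_i$. Since there are at most $L$ sets in $\mathrm{OPT} \setminus G_{i-1}$, the total weight of elements covered by sets in $\mathrm{OPT} \setminus G_{i-1}$ but not covered by $G_{i-1}$ is at most $L W'_i$. Hence
\[ w(\mathrm{OPT}) - w(G_{i-1}) \le L W'_i = L\bigl(w(G_i) - w(G_{i-1})\bigr), \]
and the lemma follows.

Lemma 2. $w(G_i) \ge \Bigl(1 - \bigl(1 - \frac{1}{L}\bigr)^i\Bigr) w(\mathrm{OPT})$.

Proof: We prove this by induction on the number of iterations. The base case, $w(G_1) \ge \frac{1}{L} w(\mathrm{OPT})$, follows from Lemma 1 with $G_0 = \emptyset$. For the induction step, assume that the claim holds for iterations $1, \dots, i-1$. Then
\[
\begin{aligned}
w(G_i) &= w(G_{i-1}) + \bigl[w(G_i) - w(G_{i-1})\bigr] \\
&\ge w(G_{i-1}) + \frac{1}{L}\bigl(w(\mathrm{OPT}) - w(G_{i-1})\bigr) \\
&= \Bigl(1 - \frac{1}{L}\Bigr) w(G_{i-1}) + \frac{1}{L}\, w(\mathrm{OPT}) \\
&\ge \Bigl(1 - \frac{1}{L}\Bigr)\Bigl(1 - \Bigl(1 - \frac{1}{L}\Bigr)^{i-1}\Bigr) w(\mathrm{OPT}) + \frac{1}{L}\, w(\mathrm{OPT}) \\
&= \Bigl(1 - \Bigl(1 - \frac{1}{L}\Bigr)^{i}\Bigr) w(\mathrm{OPT}),
\end{aligned}
\]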
where the first inequality follows from Lemma 1 and the second inequality follows from the induction hypothesis.

The above lemmas imply that the greedy algorithm is a $(1 - 1/e)$-approximation algorithm. Indeed, with unit costs and an integer budget the algorithm adds $L$ sets (if fewer than $L$ sets exist it takes them all and is optimal), and Lemma 2 with $i = L$ gives $w(G_L) \ge \bigl(1 - (1 - \frac{1}{L})^L\bigr) w(\mathrm{OPT}) \ge (1 - \frac{1}{e})\, w(\mathrm{OPT})$.

1.2 Arbitrary Cost

The natural extension of the above algorithm to arbitrary costs is to use a modified greedy heuristic: at each iteration, select a set that maximizes the weight-to-cost ratio $W'_i / c(S_i)$, where $W'_i$ is the weight of the elements of $S_i$ not yet covered.

We note that this modified greedy algorithm has an unbounded approximation factor. For example, consider $U = \{x_1, x_2\}$ with weights $w(x_1) = 1$ and $w(x_2) = p$. Let $S_1 = \{x_1\}$ and $S_2 = \{x_2\}$ with costs $1$ and $p + 1$ respectively. If the budget is $L = p + 1$, the greedy algorithm first selects $S_1$, as it maximizes the ratio ($1$ versus $p/(p+1)$), and its output is $\{S_1\}$ because adding $S_2$ would exceed the budget. The optimal solution, however, is $\{S_2\}$, so the greedy solution has weight $1$ while the optimum has weight $p$.

In what follows we modify the above solution to get a $\frac{1}{2}(1 - \frac{1}{e})$-approximation algorithm for Maximum Coverage with arbitrary costs. Let $S_{\max} \in \mathcal{S}$ be the set that covers the maximum total weight among the sets of cost at most $L$. The new algorithm outputs the result of the greedy algorithm (with the modified heuristic) if the total weight it covers is more than that of $S_{\max}$; otherwise it outputs $S_{\max}$.

We prove its approximation factor using lemmas analogous to Lemmas 1 and 2. The notation is the same as above, and we define $c_i = c(S_i)$ and $c(G_i)$ to be the total cost of all sets in $G_i$.

Lemma 3. $w(G_i) - w(G_{i-1}) \ge \frac{c_i}{L}\bigl(w(\mathrm{OPT}) - w(G_{i-1})\bigr)$.

Proof: The proof is very similar to that of Lemma 1.

Lemma 4. $w(G_i) \ge \Bigl(1 - \prod_{k=1}^{i}\bigl(1 - \frac{c_k}{L}\bigr)\Bigr) w(\mathrm{OPT})$.

Proof: Again, the proof follows the same reasoning as in Lemma 2.

Theorem 1. The above algorithm achieves an approximation factor of $\frac{1}{2}(1 - \frac{1}{e})$ for the budgeted Maximum Coverage problem.

Proof: Suppose that the number of iterations for the algorithm is $t$. This means that adding $S_{t+1}$ to $G_t$ would violate the budget constraint.
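So $c(G_{t+1}) = c(S_{t+1}) + c(G_t) > L$, where $G_{t+1} = G_t \cup \{S_{t+1}\}$. By Lemma 4, we get
\[
\begin{aligned}
w(G_{t+1}) &\ge \Bigl[1 - \prod_{k=1}^{t+1}\Bigl(1 - \frac{c_k}{L}\Bigr)\Bigr] w(\mathrm{OPT}) \\
&\ge \Bigl[1 - \prod_{k=1}^{t+1}\Bigl(1 - \frac{c_k}{c(G_{t+1})}\Bigr)\Bigr] w(\mathrm{OPT}) \\
&\ge \Bigl[1 - \Bigl(1 - \frac{1}{t+1}\Bigr)^{t+1}\Bigr] w(\mathrm{OPT}) \\
&\ge \Bigl(1 - \frac{1}{e}\Bigr) w(\mathrm{OPT}).
\end{aligned}
\]
The second inequality uses $c(G_{t+1}) > L$. For the third, we used the fact that for real numbers $a_1, \dots, a_n$ with $\sum_{i=1}^{n} a_i = A$, the quantity $1 - \prod_{k=1}^{n}\bigl(1 - \frac{a_k}{A}\bigr)$ is minimized when $a_1 = a_2 = \dots = a_n = \frac{A}{n}$.

So we have
\[ w(G_{t+1}) = w(G_t) + W'_{t+1} \ge \Bigl(1 - \frac{1}{e}\Bigr) w(\mathrm{OPT}), \]
where $W'_{t+1}$ is the weight of the elements of $S_{t+1}$ not covered by $G_t$. Also note that $w(S_{\max}) \ge W'_{t+1}$ (sets of cost more than $L$ can be discarded up front, so $S_{t+1}$ is a candidate for $S_{\max}$), so
\[ w(G_t) + w(S_{\max}) \ge w(G_t) + W'_{t+1} \ge \Bigl(1 - \frac{1}{e}\Bigr) w(\mathrm{OPT}). \]
From the above inequality, at least one of the values $w(G_t)$ and $w(S_{\max})$ is at least $\frac{1}{2}(1 - \frac{1}{e})\, w(\mathrm{OPT})$, and the theorem follows.

Remark 1. This approximation factor can be improved to get rid of the $\frac{1}{2}$ factor.

1.3 Hardness Proof

In this section we prove that the above approximation guarantees are best possible. We prove that even in the unit-cost, unit-weight case it is impossible to improve this approximation factor unless $NP \subseteq DTIME(n^{O(\log \log n)})$. Recall that we discussed the following hardness result about Set Cover.

Theorem 2 (Feige). There is no $(1 - \epsilon) \ln n$ approximation algorithm for Set Cover, for any $\epsilon > 0$, unless $NP \subseteq DTIME(n^{O(\log \log n)})$.

In this section we prove the following result.

Theorem 3. There is no $(1 - \frac{1}{e} + \epsilon)$-approximation algorithm for the Maximum Coverage problem, for any $\epsilon > 0$, unless $NP \subseteq DTIME(n^{O(\log \log n)})$.

Proof: Suppose there is an approximation algorithm $A$ for unit-cost Maximum Coverage achieving an approximation factor $\alpha > 1 - \frac{1}{e}$. We show that in this case Set Cover can be approximated within a factor better than $\ln n$, contradicting Theorem 2. Consider a unit-cost Set Cover instance. Assume that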
the number of sets in an optimal set cover is $k$ (the value of $k$ is not known in advance, but we can try every value from $1$ to the number of sets). Apply algorithm $A$ to the above instance with unit weight for each element and budget $k$. Since every element is covered by a set cover with $k$ sets, the optimal maximum coverage must also cover every element (the set cover is itself a feasible solution within the budget). Since algorithm $A$ is an $\alpha$-approximation, its solution covers at least $\alpha n$ elements. Discard all the sets used and all the elements that were covered, and apply algorithm $A$ again on the reduced instance, repeating until all elements are covered. The remaining elements can still be covered by at most $k$ sets, so the same argument applies in every round.

Suppose the algorithm runs for $t$ iterations. Let the number of uncovered elements at the start of iteration $i$ be $n_i$. In each iteration the algorithm picks $k$ sets and covers at least $\alpha n_i$ elements, i.e. $n_{i+1} \le n_i (1 - \alpha)$ with $n_1 = n$. The total number of sets picked is $tk$, hence this algorithm is a $t$-factor approximation for the Set Cover problem. Setting
\[ n_{t+1} = 1 = n (1 - \alpha)^t \]
gives
\[ t = \frac{\ln n}{\ln\bigl(\frac{1}{1 - \alpha}\bigr)}. \]
As we assumed that $\alpha > 1 - \frac{1}{e}$, we have $\ln\bigl(\frac{1}{1 - \alpha}\bigr) > 1$. Since $t$ is the approximation factor of this algorithm for Set Cover and we obtained $t = c \ln n$ with $c < 1$, this contradicts Theorem 2.

2 Unique Coverage

In lecture, we defined Unique Coverage as follows.

Unique Coverage
Input: A universe $U = \{e_1, \dots, e_n\}$ and a collection $\mathcal{S} = \{S_1, \dots, S_k\}$ of subsets of $U$ such that $U = \bigcup_{i=1}^{k} S_i$.
Goal: Find $\mathcal{S}' \subseteq \mathcal{S}$ that maximizes the number of uniquely covered elements, i.e. elements contained in exactly one set of $\mathcal{S}'$.

We define a more general version of the problem.

Budgeted Unique Coverage
Input: A universe $U = \{e_1, \dots, e_n\}$ and a collection $\mathcal{S} = \{S_1, \dots, S_k\}$ of subsets of $U$ such that $U = \bigcup_{i=1}^{k} S_i$. Also, cost and weight functions $c : \mathcal{S} \to \mathbb{Q}^+$ and $w : U \to \mathbb{Q}^+$, and a bound $L \in \mathbb{Q}^+$ called the budget.
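Goal: Find $\mathcal{S}' \subseteq \mathcal{S}$ whose total cost is at most $L$ and that maximizes the total weight of uniquely covered elements.

Below, we present a randomized $\frac{1}{e^2 \lg n}$-approximation algorithm. As stated and analyzed here, it counts uniquely covered elements and does not use the costs or the budget.

Let $d(u_i)$ be the degree of element $u_i$, i.e. the number of sets in $\mathcal{S}$ containing $u_i$.

1. Partition the elements of $U$ into $\lg n$ classes based on their degrees, i.e. class $j = \{u_i \mid 2^j \le d(u_i) < 2^{j+1}\}$.
2. Let $i$ be the class with maximum cardinality. Clearly its size is at least $\frac{n}{\lg n}$.
3. Choose each set $S \in \mathcal{S}$ to be in $\mathcal{S}'$ independently with probability $\frac{1}{2^i}$.

Analysis

For a fixed element $x$ in class $i$, let the degree of $x$ be $d$. By the definition of the classes we know that $2^i \le d \le 2^{i+1} - 1$. The probability that $x$ is covered exactly once by $\mathcal{S}'$ is
\[ \binom{d}{1} \Bigl(\frac{1}{2^i}\Bigr) \Bigl(1 - \frac{1}{2^i}\Bigr)^{d-1}. \]
Incorporating the bounds on $d$, we get that this probability is at least
\[ \frac{2^i}{2^i} \Bigl(1 - \frac{1}{2^i}\Bigr)^{2^{i+1} - 2} = \Bigl[\Bigl(1 - \frac{1}{2^i}\Bigr)^{2^i - 1}\Bigr]^2 \ge \frac{1}{e^2}, \]
using $(1 - \frac{1}{m})^{m-1} \ge \frac{1}{e}$ for every integer $m \ge 2$. (For $i = 0$ every set is chosen and an element of degree 1 is covered exactly once with probability 1.) Hence the expected number of elements of class $i$ covered exactly once is at least
\[ \frac{1}{e^2} \cdot \frac{n}{\lg n} \ge \frac{1}{e^2 \lg n}\, \mathrm{OPT}, \]
since $\mathrm{OPT} \le n$. Therefore, the algorithm is a $\frac{1}{e^2 \lg n}$-approximation algorithm.

Remark 2. This algorithm can be derandomized by a standard application of the method of conditional expectations.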
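The three steps of the randomized algorithm above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not code from the lecture: the function names `unique_coverage_sample` and `uniquely_covered` are hypothetical, the collection is given as a list of Python sets, costs and weights are ignored, ties between equally large classes are broken arbitrarily, and a single random run is made, so the $\frac{1}{e^2 \lg n}$ guarantee holds only in expectation.

```python
import random
from collections import Counter


def unique_coverage_sample(sets, rng=None):
    """Return indices of sets chosen by one run of the degree-class sampling.

    `sets` is a list of Python sets (an assumed input format).
    """
    rng = rng or random.Random()
    # degree[u] = number of sets that contain element u
    degree = Counter(u for s in sets for u in s)
    if not degree:
        return []
    # Class j holds the elements with 2^j <= degree < 2^(j+1);
    # d.bit_length() - 1 is exactly floor(log2(d)) for d >= 1.
    class_size = Counter(d.bit_length() - 1 for d in degree.values())
    # Step 2: pick the class containing the most elements.
    i = max(class_size, key=class_size.get)
    # Step 3: keep each set independently with probability 1 / 2^i.
    p = 1.0 / (2 ** i)
    return [idx for idx in range(len(sets)) if rng.random() < p]


def uniquely_covered(sets, chosen):
    """Elements contained in exactly one of the chosen sets."""
    hits = Counter(u for idx in chosen for u in sets[idx])
    return {u for u, count in hits.items() if count == 1}
```

When every element has degree 1 the largest class is class 0, so every set is kept with probability 1 and all elements are uniquely covered. In general one would repeat the sampling several times and keep the best outcome, or derandomize as the remark suggests.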