Lecture 4: An FPTAS for Knapsack, and K-Center


Comp 260: Advanced Algorithms
Tufts University, Spring 2016
Prof. Lenore Cowen
Scribe: Eric Bailey

1 Introduction

Definition 1.0.1 (The Knapsack problem, restated). Given n objects {a_1, ..., a_n} with sizes {s_1, ..., s_n} and profits {p_1, ..., p_n}, and a knapsack with capacity B, where s_i, p_i, B ∈ N and s_i ≤ B for all i, find a subset of objects whose total size is bounded by B and whose total profit is maximized.

2 Hardness of Approximation

Theorem 2.0.2. If P ≠ NP, then for any fixed constant k, no polynomial-time algorithm can solve the Knapsack Problem to within an additive error of k, i.e., always return a solution of profit at least p* − k, where p* is the optimal profit.

Proof. Assume there exists a polynomial-time algorithm A with additive performance guarantee k > 0 on all instances of the Knapsack Problem. We show that A can be used to construct an optimal solution, of value p*, in polynomial time.

Suppose we are given an instance I = {⟨a_i, s_i, p_i⟩} of Knapsack of size n and capacity B. Let I' = {⟨a_i', s_i', p_i'⟩}, where a_i' = a_i, s_i' = s_i, p_i' = (k + 1) p_i, and B' = B.

Definition 2.0.3. A solution is feasible if it can fit in the knapsack.

Remark 2.0.4. A feasible solution for I, i.e., a set of objects that fit in the knapsack, is exactly a feasible solution for I', since the sizes and the capacity are unchanged.

Run algorithm A on I', which yields a solution M with

    Profit_{I'}(M) ≥ p*_{I'} − k,

where p*_{I'} = (k + 1) p*_I. Considering the same solution M on I yields

    (k + 1) Profit_I(M) ≥ (k + 1) p*_I − k.

Dividing by k + 1,

    p*_I − Profit_I(M) ≤ k / (k + 1) < 1.

But since all profits are integral, p*_I − Profit_I(M) = 0. Therefore M is an optimal solution to the Knapsack Problem, found in polynomial time, which is impossible unless P = NP.

Definition 2.0.5. Let π be an optimization problem with objective function f_π and optimal value OPT. A is an approximation scheme for π if on input (I, ε), where I is an instance of π and ε > 0 is an error parameter, it outputs a solution S such that:

    f_π(S) ≤ (1 + ε) · OPT   if π is a minimization problem;
    f_π(S) ≥ (1 − ε) · OPT   if π is a maximization problem.

Definition 2.0.6. A is said to be a PTAS (Polynomial Time Approximation Scheme) if for each fixed ε > 0, its running time is polynomial in the size of instance I.

Definition 2.0.7. A is said to be an FPTAS (Fully Polynomial Time Approximation Scheme) if the running time of A is bounded by a polynomial in both the size of I and 1/ε.

Claim 2.0.8. For ε < 1, there exists an algorithm giving a (1 − ε) p* solution in O(n³/ε) time for the Knapsack Problem.

3 An exact dynamic programming algorithm for Knapsack

Let p_max be the profit of the most profitable object, p_max = max_{i ≤ n} p_i, and let p* denote the optimal value, the most profit we can take home in the knapsack. Then it follows that

    n · p_max ≥ p*.

This is clear, since p* is at most the sum of all n of the p_i's, and this sum is in turn at most n · p_max.

For each i ∈ {1, ..., n} and p ∈ {1, ..., n · p_max}, let S_{i,p} denote a subset of {a_1, ..., a_i} whose total profit is exactly p and whose total size is minimized. Let A(i, p) denote the size of S_{i,p}, where A(i, p) = ∞ if no such set S_{i,p} exists. Thus p* can be expressed as:

    p* = max{ p : A(n, p) ≤ B }.

We can use a dynamic programming algorithm which runs in O(n² · p_max) time to compute all the A(i, p)'s, and then read off the largest profit achievable within capacity B, thus solving the Knapsack problem.

Wait! I thought this was an NP-hard problem? Didn't you contradict yourself by stating a polynomial running time? Actually, no. It would be polynomial if p_max were polynomial in n, but we are not guaranteed this. (If we were, then yes, this algorithm would run in polynomial time.) Instead, this algorithm is called pseudo-polynomial: p_max is written in binary in the input, so it contributes only O(log p_max) bits to the input length, and the value of p_max can therefore be exponential, as large as roughly 2^n, with respect to the size of the input.

3.1 Dynamic Programming Algorithm for Knapsack

Goal: compute A(i, p) for i ∈ {1, ..., n} and p ∈ {1, ..., n · p_max} in time O(n² · p_max) using dynamic programming.

First, compute A(1, p) for each p in {1, ..., n · p_max}. That's simply:

    A(1, p) = s_1   if p = p_1,
    A(1, p) = ∞     if p ≠ p_1.

To demonstrate, here is an example knapsack with objects, sizes and profits as specified:

    Object   A  B  C  D  E
    Size     7  2  9  3  1
    Profit   3  2  3  1  2

We can construct a table in which to store the results of our dynamic programming algorithm. The number of columns in the table is n · p_max (in this case, 5 · 3 = 15). The first row is as follows (blank entries denote ∞):

    p        1  2  3  4  5  6  7  8  ...  15
    A(1, p)        7

To calculate A(2, p) and so on, we use the following recurrence (with the convention A(i, 0) = 0):

    A(i + 1, p) = min( A(i, p), s_{i+1} + A(i, p − p_{i+1}) )   if p_{i+1} ≤ p,
    A(i + 1, p) = A(i, p)                                       otherwise.

Using this recurrence, we can fill in the next few rows of the table like so (blank entries denote ∞):

    p        1  2  3  4  5  6   7  8   ...  15
    A(1, p)        7
    A(2, p)     2  7     9
    A(3, p)     2  7     9  16     18

Given the position in question, the recurrence gives a choice between the value in the column directly above (calculated without taking the (i+1)-st element into consideration), or the value obtained by using the (i+1)-st element plus whatever is in the table using the first i elements to generate profit p − p_{i+1}. For example, to calculate A(3, 5) we notice that p_3 ≤ 5 (indeed 3 < 5), so we have a choice between the value in A(2, 5), which is 9, and s_3 + A(2, 2), which equals 9 + 2 = 11. Clearly 9 is the minimum of the two and gets assigned as the value of A(3, 5).

Once the table is completely filled, we scan the last row from the highest profit column right to left, looking for the first entry of size at most B. That profit is our p*. END

Also note that we must store backpointers recording where each entry came from, in order to recover the actual set of items responsible for the values in the matrix. An entry is filled either from the entry directly above it, A(i, p), or from an entry above and to the left, A(i, p − p_{i+1}). Pointers to the upper left indicate that we chose item i + 1; pointers that point straight up indicate that we did not.

The problem with this method is that there could be very many columns for a large enough p_max. So the next question is: how do we turn this into an approximation algorithm that runs in polynomial time regardless of p_max?
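Before turning to the FPTAS, here is a minimal Python sketch of the exact dynamic program just described, including the backpointer-style recovery of the chosen set. It is ours, not the notes' (the notes give no code); the function name and the use of math.inf for ∞ are our own choices.

    import math

    def knapsack_exact(sizes, profits, B):
        # A[i][p] = minimum total size of a subset of the first i items with
        # total profit exactly p (math.inf if no such subset exists).
        # Column p = 0 plays the role of the convention A(i, 0) = 0.
        n = len(sizes)
        P = n * max(profits)
        A = [[math.inf] * (P + 1) for _ in range(n + 1)]
        for i in range(n + 1):
            A[i][0] = 0
        for i in range(1, n + 1):
            s, p = sizes[i - 1], profits[i - 1]
            for q in range(1, P + 1):
                A[i][q] = A[i - 1][q]                    # skip item i
                if p <= q and A[i - 1][q - p] + s < A[i][q]:
                    A[i][q] = A[i - 1][q - p] + s        # take item i
        # p* = largest profit achievable within capacity B (right-to-left scan).
        p_star = max((q for q in range(P + 1) if A[n][q] <= B), default=0)
        # Recover the chosen items: an entry that differs from the one directly
        # above it must have come from the upper left, i.e., item i was taken.
        chosen, q = [], p_star
        for i in range(n, 0, -1):
            if A[i][q] != A[i - 1][q]:
                chosen.append(i - 1)
                q -= profits[i - 1]
        return p_star, chosen

On the example instance from the table above, with a hypothetical capacity B = 10 (the notes never fix a capacity), knapsack_exact([7, 2, 9, 3, 1], [3, 2, 3, 1, 2], 10) returns (7, [4, 1, 0]): items E, B and A, of total size 10 and profit 7.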

4 An FPTAS for Knapsack

In this section we construct an FPTAS for Knapsack; we'll refer to it as KNAPSACK-FPTAS. To make the algorithm run in polynomial time, we simply ignore a certain number of the least significant digits of the profits. We get a pretty good approximation (by only looking at the most important digits), but not a perfect one, as we are losing information.

Steps:

1. Given ε > 0, let k = ε · p_max / n.

2. For each a_i, define p_i' = ⌊p_i / k⌋.

3. Let I' = {⟨a_i', s_i', p_i'⟩}, where a_i' = a_i, s_i' = s_i, and p_i' is as shown above.

The dynamic programming algorithm for solving Knapsack is then applied to the new instance I', yielding a set S'. The algorithm outputs the more profitable of S' and S_max, where S_max consists of a single minimum-size object of profit p_max (which fits in the knapsack, since every object does). A code sketch of the full scheme appears at the end of this section.

Lemma 4.0.1. Let A denote the set output by KNAPSACK-FPTAS. Then

    profit(A) ≥ p* / (1 + ε).

Proof. Let O denote a set achieving profit p* on an instance I of Knapsack. We now reason about I and the associated rounded-profit instance I' defined above. Note that any feasible solution of I (meaning the items fit in the knapsack) corresponds to a feasible solution of I' and vice versa, since the objects and sizes haven't changed; only the profits have. For a set of objects N, we denote by Profit(N) its profit under the original instance, and by Profit'(N) its profit under the new rounded profits.

For every object a, k · p_a' can be smaller than p_a (because of the floor function), but not by more than k. This follows from the definition of p_i'. Restated:

    p_a − k · p_a' ≤ k.

Summing over the at most n objects of O:

    Profit(O) − k · Profit'(O) ≤ n k.    (1)

Now, S' is optimal for the rounded instance, which means

    Profit'(S') ≥ Profit'(Y) for any Y that fits in the knapsack.    (2)

In particular:

    Profit'(S') ≥ Profit'(O).    (3)

Multiplying both sides by k, we get:

    k · Profit'(S') ≥ k · Profit'(O).    (4)

Therefore:

    Profit(S') ≥ k · Profit'(S')     (by the definition of Profit')
               ≥ k · Profit'(O)      (by (4))
               ≥ Profit(O) − n k     (by (1))
               = p* − n k            (by the definition of O)
               = p* − ε · p_max      (by the definition of k).

We also know that

    Profit(A) ≥ p_max    (5)

and that

    Profit(A) ≥ Profit(S'),    (6)

since A is the more profitable of S_max and S'. Therefore:

    Profit(A) ≥ Profit(S')            (by (6))
              ≥ p* − ε · p_max        (shown above)
              ≥ p* − ε · Profit(A)    (by (5)).

By simple algebra, we get:

    Profit(A) ≥ p* / (1 + ε),

which completes our proof of the lemma.

4.1 Proof of Polynomial Time Execution

Theorem 4.1.1. KNAPSACK-FPTAS is an FPTAS.

Proof. By the lemma, the solution is within a factor of 1/(1 + ε) of p*. By the definition of k, the running time is:

    O(n² · p_max / k) = O(n² · n / ε) = O(n³ / ε).

QED.

Note: the smaller the ε, i.e., the closer you want to get to p*, the more the running time inflates.

Definition 4.1.2. A problem Π is strongly NP-hard if every problem in NP can be polynomially reduced to Π in such a way that all numbers in the reduced instance can be written in unary.

Note: if a problem has an FPTAS, it can't be strongly NP-hard (assuming P ≠ NP).
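As promised, here is a minimal Python sketch of KNAPSACK-FPTAS. It assumes the knapsack_exact routine sketched at the end of Section 3; as before, the names are our own, not the notes'.

    import math

    def knapsack_fptas(sizes, profits, B, eps):
        n = len(profits)
        p_max = max(profits)
        k = eps * p_max / n                              # Step 1
        rounded = [math.floor(p / k) for p in profits]   # Step 2: drop low-order digits
        # Step 3: solve the rounded instance I' exactly. Its table has only
        # n * floor(p_max / k) = n * floor(n / eps) columns, so this takes
        # O(n^3 / eps) time regardless of the original p_max.
        _, s_prime = knapsack_exact(sizes, rounded, B)
        # S_max: a minimum-size object of profit p_max (it fits, since s_i <= B).
        i_max = min((i for i in range(n) if profits[i] == p_max),
                    key=lambda i: sizes[i])
        # Output the more profitable of the two candidates, as in Lemma 4.0.1.
        return max([s_prime, [i_max]],
                   key=lambda S: sum(profits[i] for i in S))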

5 k-center Problem

Imagine we have a complete, undirected graph, where each node is a city and edge weights represent the shortest distances between these cities. We have funds to build exactly k emergency centers. The k-center problem with triangle inequality is to place our k emergency centers such that no one has to go too far to get to their closest center.

k-center Problem:

Input: A complete undirected graph G = (V, E) whose edge weights are the shortest-path distances between each pair of nodes. Let D_ij denote the distance between nodes i and j. (Remember, if we start with an incomplete graph, we can make it complete by adding edges, where D_ij is the length of the shortest existing path between i and j.)

Output: A subset of nodes S ⊆ V with |S| = k, such that the longest distance of a node to its closest node in S is minimized. Specifically, we want to minimize

    cost(S) = max_{j ∈ V} min_{i ∈ S} D_ij.

Our Approximation Algorithm:

We assume that G satisfies the triangle inequality (i.e., D_ik ≤ D_ij + D_jk for all i, j, k ∈ V). First, we sort the edges e_1, e_2, ..., e_m in order of cost, so that cost(e_1) ≤ cost(e_2) ≤ ... ≤ cost(e_m). Then we add the lightest edge e_1 and look at that graph, then add the edges e_1, e_2 and look at that graph, then edges e_1, e_2, e_3, and so on. Let G_i = (V, E_i), where E_i = {e_1, e_2, ..., e_i}. Note that our original graph G is G_m.

Definition 5.0.3. A dominating set of G is a subset S ⊆ V such that every node in V − S is adjacent to a vertex in S. (That is, for each node, either you're in the dominating set, or you have a neighbor who is.)

Claim 5.0.4. The optimal solution to a k-center problem is a dominating set in G. (Note that this is a bit of a trick, because G is complete: any single vertex, or any set of vertices, of a complete graph is a dominating set!)

Claim 5.0.5. The optimal solution to a k-center problem in G is a dominating set in G_i for some index i ≤ m.

This claim is trivially true (take i = m). Now look at the graphs G_1, G_2, G_3, ..., G_{m−2}, G_{m−1}, G_m. As we move backward from G_m, at which point do we no longer have a dominating set of size k?

Let C* be the cost of an optimal solution to k-center in G, and let e_c be the LAST edge of cost C* in the sorted order. Remember, this edge is not necessarily unique! Multiple edges could have the same cost, so e_c is the last edge of cost C*, i.e., cost(e_j) > C* for every j > c. If we consider G_c, we have a graph that includes only edges of cost up to C*. For example, if we want to get everyone to an emergency center in 20 minutes or less, we ignore all edges that take more than 20 minutes and consider the resulting graph.

Claim 5.0.6. There is a dominating set in G_c of size k or less, and if we can find it, we have our solution to k-center.

Claim 5.0.7. There is no dominating set in G_{c−1} of size k or less.

(For ease of argument, let's assume all edges have distinct costs.) This claim follows by contradiction. Suppose otherwise, that is, suppose there is a dominating set in G_{c−1} of size k or less.

This feasible dominating set is a solution to k-center of cost less than C*, which is a contradiction, since C* is optimal.

According to the two claims above, the k-center problem with triangle inequality is equivalent to finding the smallest index i such that G_i has a dominating set of size at most k. But finding a minimum dominating set is, like k-center, NP-hard, so instead we approximate k-center by lower-bounding the size of the minimum dominating set in G_i.

Definition 5.0.8. The square of a graph G = (V, E), denoted G² = (V, E²), has an edge between i and j if and only if there is a path of length 1 or 2 between i and j in G.

Notes: We can compute the square of a graph by multiplying its adjacency matrix with itself. The cube of a graph G, denoted G³, has an edge between i and j if there exists a path of length 1, 2, or 3 between i and j. This can be extended to create G⁴, G⁵, and so on. Squaring makes no difference for G itself (which is complete), but we are also going to be looking at the graphs G_i for various i, and those are certainly not complete.

Definition 5.0.9. An independent set in a graph G = (V, E) is a set S ⊆ V such that for all i ∈ S, if (i, j) ∈ E, then j ∉ S.

Definition 5.0.10. A maximal independent set (MIS) in a graph G = (V, E) is an independent set S such that for every v ∈ V, either v ∈ S, or there exists u ∈ S with (u, v) ∈ E. (In other words, it is an independent set that cannot be extended by adding any vertex.)

Finding a maximum independent set of a graph is NP-hard, but a maximal independent set can be found in polynomial time by a simple greedy algorithm.

Lemma 5.0.11. Let H = (V, E), and let I be an independent set in H². Then |I| ≤ dom(H), where dom(H) denotes the size of a minimum cardinality dominating set in H. (Note: dom(H) is NP-hard to compute, but the size of any independent set in H² is a lower bound on it.)

Proof. Let D be a minimum cardinality dominating set in H. For each vertex d ∈ D, its closed neighborhood in H forms a clique in H². So H² contains |D| cliques that together span all the vertices, which implies that any independent set in H² can pick at most one vertex per clique. So |I| ≤ |D|. (If we start with a vertex and its neighbors and square the graph, we have a clique! Therefore, we can only take one vertex from each of these cliques / neighborhoods.)

6 2-Approximation Algorithm for k-center

Algorithm A:

1. Construct G_1², G_2², ..., G_m².

2. Compute a maximal independent set (MIS) L_i in each graph G_i².

3. Find the smallest index i such that |L_i| ≤ k, and call that index j.

4. Return L_j.

Lemma 6.0.12. For j as defined in the algorithm above, cost(e_j) ≤ C*. (Note that e_j is the most expensive edge in G_j.)

Proof. For every i < j, we have |L_i| > k, and since L_i is an independent set in G_i², Lemma 5.0.11 gives dom(G_i) ≥ |L_i| > k. By Claims 5.0.6 and 5.0.7, C* = cost(e_c), where c is the smallest index such that G_c has a dominating set of size at most k. Since dom(G_i) > k for every i < j, we must have c ≥ j, and therefore cost(e_j) ≤ cost(e_c) = C*.

Theorem 6.0.13. Algorithm A returns a solution of cost at most 2 · OPT.

Proof. Observe that a maximal independent set in H² is also a dominating set in H² (any maximal independent set is a dominating set, but not vice versa). Thus L_j is a dominating set in G_j², so in the original graph G_j every vertex is connected to some vertex of L_j by a path of at most 2 edges. Since cost(e_j) ≤ C* by Lemma 6.0.12, every edge of G_j has cost at most C*, so such a path has total cost at most 2C*. Thus, by the triangle inequality, every vertex is within distance 2C* of its closest center in L_j.
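Here is a minimal Python sketch of Algorithm A. The helper names and the adjacency-set representation are our own, greedy_mis is the simple greedy algorithm alluded to in Section 5, and the edge-by-edge rebuilding of the squared graph is written for clarity, not speed.

    from itertools import combinations

    def greedy_mis(nodes, adj):
        # Greedy maximal independent set: take each vertex unless an already
        # chosen vertex excludes it, then exclude all of its neighbors.
        mis, excluded = [], set()
        for v in nodes:
            if v not in excluded:
                mis.append(v)
                excluded.add(v)
                excluded.update(adj[v])
        return mis

    def k_center_2approx(nodes, D, k):
        # D maps frozenset({u, v}) to the distance between u and v.
        edges = sorted(D, key=D.get)          # cost(e_1) <= ... <= cost(e_m)
        adj = {v: set() for v in nodes}
        for e in edges:                       # grow G_1, G_2, ..., G_m
            u, v = tuple(e)
            adj[u].add(v)
            adj[v].add(u)
            # Square the current graph G_i: join vertices at hop-distance 1 or 2.
            adj2 = {w: set(adj[w]) for w in nodes}
            for w in nodes:
                for x in adj[w]:
                    adj2[w].update(adj[x] - {w})
            # Return the MIS at the first index where it has size at most k;
            # by Theorem 6.0.13 its cost is at most 2 * C*.
            L = greedy_mis(nodes, adj2)
            if len(L) <= k:
                return L
        return list(nodes)[:k]                # never reached for k >= 1: G_m is complete

    # Hypothetical demo: four points on a line, metric = absolute difference.
    pts = [0, 1, 2, 10]
    D = {frozenset({a, b}): abs(a - b) for a, b in combinations(pts, 2)}
    print(k_center_2approx(pts, D, 2))        # prints [0, 10]: cost 2 <= 2 * C* = 2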

We have shown a 2-approximation to the k-center problem, but can we do any better? The answer is NO, as we show in the following section.

7 Hardness of Approximation

Theorem 7.0.14. Approximating the k-center problem with triangle inequality within a factor of 2 − ε is NP-hard for any ε > 0.

Proof (by reduction from Dominating Set). Given a graph G = (V, E) and a bound k, we construct an instance of k-center satisfying the triangle inequality such that if G has a dominating set of size k, then the optimal cost of the k-center instance is 1, and otherwise the optimal cost is 2. We put the following weights on the edges of the complete graph on V (note that they satisfy the triangle inequality):

    w(e) = 1   if e ∈ E;
    w(e) = 2   if e ∉ E.

A (2 − ε)-approximation algorithm must return a solution of cost at most (2 − ε) · 1 < 2, hence of cost exactly 1, whenever G has a dominating set of size k, and its solution has cost at least 2 otherwise. So we could use such an algorithm to decide in polynomial time whether there is a dominating set of size k in G, an NP-hard problem. Therefore, approximating k-center within a factor of 2 − ε is NP-hard as well.
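As a final sketch, the reduction itself is a one-liner over the complete graph. The helper name is ours; its output is in the same distance-dictionary format consumed by the k_center_2approx sketch above.

    from itertools import combinations

    def dominating_set_to_k_center(V, E):
        # Weight 1 on original edges, weight 2 on non-edges: a valid metric.
        E = {frozenset(e) for e in E}
        return {frozenset({u, v}): (1 if frozenset({u, v}) in E else 2)
                for u, v in combinations(V, 2)}

    # Hypothetical example: the 4-cycle has a dominating set of size 2 ({1, 3}),
    # so the reduced instance has optimal 2-center cost exactly 1.
    D = dominating_set_to_k_center([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4), (4, 1)])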