
CISC 320 Introduction to Algorithms, Spring 2014
Lecture 21: NP-Complete Problems

We discuss some hard problems: how hard are they (computational complexity), what makes them hard, and are there any solutions?
Definitions
Decision problems: A decision problem takes an input and yields one of two possible answers, yes or no. E.g., given a graph G = (V, E) and a positive integer k, is there a coloring of G using at most k colors such that no two adjacent vertices are assigned the same color?
Optimization problems: An optimization problem takes an input and finds an optimal solution. E.g., given G, determine the minimal number of colors (and a coloring) such that no two adjacent vertices are assigned the same color.
For classification, problems are defined in terms of their yes-or-no (decision) version. We can do so because the two versions are equally hard (up to a polynomial factor), even though the optimization version seems harder. In homework 5, we will prove that the optimization version is not harder than the yes-or-no version for a few well-known hard problems.
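
The relation between the two versions of graph coloring can be sketched as follows. This is a minimal illustration, not the homework solution: the function names and the vertex numbering 0..n-1 are assumptions made here, and the decision procedure is a brute-force search that takes exponential time. The point is only that the optimization answer is obtained with at most n calls to the decision version.

```python
from itertools import product

def is_k_colorable(n, edges, k):
    """Decision version: can vertices 0..n-1 be colored with at most k colors?"""
    # Brute force over all k^n colorings (exponential, for illustration only).
    for coloring in product(range(k), repeat=n):
        if all(coloring[u] != coloring[v] for u, v in edges):
            return True
    return False

def chromatic_number(n, edges):
    """Optimization version, answered by asking the decision question for k = 1..n."""
    for k in range(1, n + 1):
        if is_k_colorable(n, edges, k):
            return k
    return 0  # only reached for the empty graph (n == 0)

# A cycle on 5 vertices cannot be colored with 2 colors but can with 3.
cycle5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(chromatic_number(5, cycle5))
```

If the decision version could be answered in polynomial time, the loop in chromatic_number would add only a factor of n.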

Computational complexity
Polynomially bounded: An algorithm is polynomially bounded if its worst-case complexity is bounded by a polynomial function of the input size. A problem is polynomially bounded if there is a polynomially bounded algorithm for it.
The class P is the class of decision problems that are polynomially bounded.
While the class P may seem too broad, such broadness is necessary for the class to be independent of any specific model of computation. It is still a useful classification because problems not in P are intractable.
The class P is closed under composition, because polynomials are closed under
  p_1(n) + p_2(n) (sequential blocks)
  p_1(p_2(n)) (nested subroutines)

The class NP is the class of decision problems for which there is a polynomially bounded nondeterministic algorithm. NP stands for Nondeterministic Polynomial, and should not be mistaken for non-polynomial.
Nondeterministic algorithms are a model of computation that is inherently parallel, and hence apparently more powerful than deterministic algorithms, but it has no physical realization yet.
A nondeterministic algorithm has two phases:
  1) Guessing the best one (a certificate) among multiple options, in as much time as if there were only one option.
  2) Verifying the guessed solution (in a deterministic way).
In practice, a problem is in NP if a proposed solution (certificate) can be verified by a polynomial-time algorithm.
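
The verifying phase can be sketched for the graph coloring problem. This is an illustrative sketch: the function name and the representation of the certificate as a dictionary from vertex to color are assumptions made here. The check touches every vertex and every edge once, so it runs in time polynomial in the size of the graph; the guessing phase is not modeled.

```python
def verify_coloring(vertices, edges, k, certificate):
    """Deterministically check a guessed coloring (dict: vertex -> color)."""
    # Every vertex must receive a color.
    if any(v not in certificate for v in vertices):
        return False
    # At most k distinct colors may be used.
    if len({certificate[v] for v in vertices}) > k:
        return False
    # No edge may join two vertices of the same color.
    return all(certificate[u] != certificate[v] for u, v in edges)

triangle_vertices = ["a", "b", "c"]
triangle_edges = [("a", "b"), ("b", "c"), ("a", "c")]
print(verify_coloring(triangle_vertices, triangle_edges, 3,
                      {"a": "red", "b": "green", "c": "blue"}))
```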

[Figure: a search tree n levels deep with three routes at each junction. The question is "Where is the gas station?"; at each junction the question "Which route to take?" is answered with "Take the leftmost", "Take the rightmost", or "Take the middle".]
A nondeterministic algorithm always makes the right guess among multiple options. For the problem above, a nondeterministic algorithm takes n guesses and n checks to find the gas station. In contrast, a deterministic algorithm using breadth-first search would have to take O(3^n) steps.

Classes beyond NP
PSPACE: problems that can be solved using a reasonable amount of memory (bounded by a polynomial in the input size), without regard to the time the solution takes.
EXPTIME: problems that can be solved in exponential time.

Theorem 13.2: P ⊆ NP.
Proof: If a problem is in P, there is a polynomial algorithm A for it. With minor modification, we obtain a nondeterministic polynomial algorithm A': the guessing phase is trivial, i.e., do nothing, and the verifying phase is just A. Therefore, any problem in P is also in NP.
A million-dollar question: does P = NP, or is P a proper subset of NP? (http://www.claymath.org/millennium_prize_problems/)

The size of the input
As complexity is measured against the size of the input, we have to be careful about what we mean by input size. For example, look at the problem of integer factorization.
  factor = 0;
  for (j = 2; j < n; j++) {
      if ((n mod j) == 0) {    // j divides n
          factor = j;
          break;
      }
  }
  return factor;               // 0 means no factor was found
In this pseudocode, the for loop runs at most n steps, and (n mod j) can be evaluated in O(log^2(n)). Therefore, the algorithm is certainly O(n^2). However, integer factorization is not known to be in P. (Because it is believed to be a hard problem, it is used in cryptography to provide secure encryption algorithms.) What is wrong here?
The size of an input is the number of characters it takes to write the input. If n = 150, we write three digits, not 150 digits (we would if unary notation were used). In decimal notation, the integer n has size s ≈ log_10(n). So a running time of O(n) is actually O(10^s), which is exponential in the input size.
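
The gap between the value n and its size s can be observed by counting loop iterations. The sketch below is a Python transcription of the trial-division loop; the function name and the step counter are additions made here for illustration. Prime inputs are used because they force the loop to run to completion.

```python
def smallest_factor_steps(n):
    """Trial division. Returns (factor, iterations); factor is 0 if none is found."""
    steps = 0
    for j in range(2, n):
        steps += 1
        if n % j == 0:
            return j, steps
    return 0, steps

# Each extra decimal digit multiplies the worst-case work by about 10.
for n in (97, 997, 9973):
    factor, steps = smallest_factor_steps(n)
    print("digits:", len(str(n)), "iterations:", steps)
```

The iteration count grows roughly tenfold per added digit, which is the O(10^s) behavior described above.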

NP-Complete (NPC) problems are the hardest ones in the class NP. If one NPC problem could be solved in polynomial time, so could all problems in NP.
How do we know a problem is NPC? Two steps:
  1) Prove it is in NP.
  2) Prove it is NP-hard.
A problem is NP-hard if it is at least as hard as every problem in NP.

Garey & Johnson 79



Polynomial reductions: a formal way to say "as hard as".
A reduction is a transformation from one problem to another. Formally, let T be a function from the input set of a decision problem U into the input set of a decision problem V, such that
  1) For every string x, if x is a yes input for U, then T(x) is a yes input for V.
  2) For every string x, if x is a no input for U, then T(x) is a no input for V. (Or equivalently, if T(x) is a yes input for V, then x is a yes input for U.)
T is a polynomial reduction when it can be computed in polynomially bounded time.
Problem U is polynomially reducible to V, denoted U ≤_p V, if there exists a polynomial reduction from U to V.

[Figure: the algorithm for U takes x (an input for U), applies T to obtain T(x) (an input for V), runs the algorithm for V on T(x), and returns its yes-or-no answer.]

The directed Hamiltonian cycle problem is reducible to the undirected Hamiltonian cycle problem.
[Figure: a directed graph G = (V, E) on vertices u, v, w, x, y, z, and the undirected graph G' = (V', E') in which each vertex, e.g. u, is replaced by three vertices u_1, u_2, u_3.]
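
A reduction of this kind can be sketched in code. The construction below is the standard vertex-splitting one and is an assumption about what the figure shows: each vertex v becomes a path (v,1)-(v,2)-(v,3), and each arc u -> v becomes the undirected edge (u,3)-(v,1). All function names are illustrative, and the brute-force cycle checkers exist only to compare answers on tiny graphs; they are not part of the reduction.

```python
from itertools import permutations

def directed_to_undirected(vertices, arcs):
    """Reduction T, computed in time linear in |V| + |E|."""
    new_vertices = [(v, i) for v in vertices for i in (1, 2, 3)]
    edges = set()
    for v in vertices:
        edges.add(frozenset([(v, 1), (v, 2)]))
        edges.add(frozenset([(v, 2), (v, 3)]))
    for u, v in arcs:
        edges.add(frozenset([(u, 3), (v, 1)]))
    return new_vertices, edges

def has_hamiltonian_cycle(vertices, linked):
    """Brute force: try every cyclic ordering; linked(a, b) tests one step."""
    first, rest = vertices[0], vertices[1:]
    n = len(vertices)
    for perm in permutations(rest):
        cycle = (first,) + perm
        if all(linked(cycle[i], cycle[(i + 1) % n]) for i in range(n)):
            return True
    return False

def directed_hc(vertices, arcs):
    arc_set = set(arcs)
    return has_hamiltonian_cycle(vertices, lambda a, b: (a, b) in arc_set)

def undirected_hc(vertices, edges):
    return has_hamiltonian_cycle(vertices, lambda a, b: frozenset([a, b]) in edges)

# Yes inputs map to yes inputs, and no inputs map to no inputs.
triangle = (["u", "v", "w"], [("u", "v"), ("v", "w"), ("w", "u")])
no_cycle = (["u", "v", "w"], [("u", "v"), ("v", "w"), ("u", "w")])
for vertices, arcs in (triangle, no_cycle):
    print(directed_hc(vertices, arcs),
          undirected_hc(*directed_to_undirected(vertices, arcs)))
```

The middle vertex (v,2) has degree 2, which forces any undirected Hamiltonian cycle to pass through (v,1), (v,2), (v,3) consecutively, so the cycle can be read as a directed one.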

Theorem 13.3: If U ≤_p V and V is in P, then U is in P.
Proof: Let p be a polynomial bound on the computation of T, and q a polynomial bound on an algorithm A for V. Let x be an input for U of size n. The size of T(x) is at most p(n), and algorithm A on T(x) takes at most q(p(n)) steps. The total amount of work to transform x to T(x) and then use V's algorithm to get the correct answer for U on x is p(n) + q(p(n)), a polynomial in n.
Definition: NP-hard
A problem U is NP-hard if every problem in NP is polynomially reducible to U.
An NP-hard problem need not be in NP, but if it is, it is NP-Complete, i.e., one of the hardest problems in NP.
NP-Complete problems form an equivalence class under polynomial reductions: if U, V ∈ NPC, then U ≤_p V and V ≤_p U.
How can we prove a problem U is NP-hard without having to reduce every problem in NP to U? Polynomial reductions are transitive, so if we know that one of the hardest problems in NP can be reduced to U, then all problems in NP are reducible to U.
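
As a worked instance of the bound in the proof of Theorem 13.3, suppose (purely for illustration; these bounds are assumptions, not from the lecture) that the reduction T runs in time p(n) = n^2 and the algorithm A for V runs in time q(m) = m^3 on inputs of size m.

```latex
% Illustrative bounds only (assumed): p(n) = n^2, q(m) = m^3
\begin{align*}
|T(x)| &\le p(n) = n^2, \\
\text{work of } A \text{ on } T(x) &\le q(p(n)) = (n^2)^3 = n^6, \\
\text{total work} &\le p(n) + q(p(n)) = n^2 + n^6 = O(n^6).
\end{align*}
```

The composed bound has a higher degree than either p or q, but it is still a polynomial in n.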

Cook's Theorem: The satisfiability problem is NP-Complete.
The conjunctive normal form (CNF) of a propositional formula consists of a sequence of clauses separated by the boolean AND operator (∧), where a clause is a sequence of literals separated by boolean OR operators (∨). For example,
  (p ∨ q ∨ s) ∧ (¬q ∨ r) ∧ (¬p ∨ r) ∧ (¬r ∨ s) ∧ (¬p ∨ ¬s ∨ ¬q)
where p, q, r, and s are propositional variables.
Truth assignment: an assignment of true or false to each variable. A truth assignment is said to satisfy a formula if it makes the value of the entire formula true. E.g., (r = true, s = true, p = false, q = false) is a truth assignment that satisfies the CNF above.
Decision problem: Given a CNF formula, is there a truth assignment that satisfies it?
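
Checking a truth assignment against a CNF formula can be sketched as follows. The encoding of a literal as a (variable, wanted value) pair and the function names are assumptions made here. The negation marks of the example formula did not survive in the slide text, so the sketch assumes it reads (p ∨ q ∨ s) ∧ (¬q ∨ r) ∧ (¬p ∨ r) ∧ (¬r ∨ s) ∧ (¬p ∨ ¬s ∨ ¬q); that placement of negations is itself an assumption. Verifying one assignment is polynomial, while the search below tries up to 2^n assignments.

```python
from itertools import product

# Each clause is a list of literals; ("q", False) stands for "not q".
# The placement of negations is an assumption (see the text above).
formula = [
    [("p", True), ("q", True), ("s", True)],
    [("q", False), ("r", True)],
    [("p", False), ("r", True)],
    [("r", False), ("s", True)],
    [("p", False), ("s", False), ("q", False)],
]

def satisfies(formula, assignment):
    """True if every clause contains at least one literal made true."""
    return all(any(assignment[var] == want for var, want in clause)
               for clause in formula)

def find_satisfying_assignment(formula):
    """Exhaustive search over all truth assignments; None if unsatisfiable."""
    variables = sorted({var for clause in formula for var, _ in clause})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if satisfies(formula, assignment):
            return assignment
    return None

print(satisfies(formula, {"r": True, "s": True, "p": False, "q": False}))
```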

Bounded Halting: Given program X and integer K, find data which, when given as input to X, causes X to stop in at most K steps.
We prove Bounded Halting (BH) is NP-Complete.
1) BH is in NP. A solution to BH can be verified within time polynomial in K: simply simulate program X on the data for K steps. (For technical reasons, K is given in unary notation.)
2) BH is NP-hard. Let A be any problem in NP. There exists a program P_A that can test solutions to A and halts within polynomial time p(n). Modify P_A to P_A' such that P_A' goes into an infinite loop whenever P_A would halt with a no answer. Present P_A' and p(n) as input to the bounded halting problem. If BH is solved, A is solved. That is, BH is as hard as, and could be harder than, any problem in NP.
Therefore, the Bounded Halting problem is NP-Complete.

More NP-Complete problems
Traveling Salesman Problem (TSP)
  Optimization problem: Given a complete, weighted graph, find a minimum-weight Hamiltonian cycle.
  Decision problem: Given a complete, weighted graph and an integer k, is there a Hamiltonian cycle with total weight at most k?
Clique problem
  A clique in graph G is a subset of vertices W such that each pair of vertices in W is connected by an edge in G.
  Optimization problem: Given a graph G, find the largest clique in G.
  Decision problem: Given a graph G and integer k, does G have a clique of size at least k?
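
The clique decision problem can be sketched with the same guess-and-verify structure. The function names and the edge-list representation are assumptions made here. Checking one candidate set W takes polynomial time, while the exhaustive search over all size-k subsets is exponential in general. Testing subsets of size exactly k suffices, because any larger clique contains a clique of size k.

```python
from itertools import combinations

def is_clique(edges, W):
    """Polynomial-time check: is every pair of vertices in W joined by an edge?"""
    edge_set = {frozenset(e) for e in edges}
    return all(frozenset(pair) in edge_set for pair in combinations(W, 2))

def has_clique(vertices, edges, k):
    """Decision version by exhaustive search over all k-subsets."""
    return any(is_clique(edges, W) for W in combinations(vertices, k))

# A triangle 0-1-2 with an extra vertex 3 attached to vertex 2.
graph_vertices = [0, 1, 2, 3]
graph_edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
print(has_clique(graph_vertices, graph_edges, 3),
      has_clique(graph_vertices, graph_edges, 4))
```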

What to do in case of NP-Complete problems?
  Use a heuristic.
  Find an approximation algorithm.
  Use an exponential-time algorithm anyway.
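
The trade-off between the first and last options can be sketched on a tiny TSP instance. The nearest-neighbor rule is chosen here only as a familiar example of a heuristic; it is not taken from the lecture, and the weight matrix and function names are made up for illustration. The heuristic runs in polynomial time but may return a tour that is not optimal, while the exhaustive search is exact but takes factorial time.

```python
from itertools import permutations

def tour_weight(w, tour):
    """Total weight of the cycle that visits tour in order and returns to the start."""
    return sum(w[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def nearest_neighbor_tour(w, start=0):
    """Heuristic: always move to the closest unvisited city (ties go to the lower index)."""
    tour, unvisited = [start], set(range(len(w))) - {start}
    while unvisited:
        nxt = min(unvisited, key=lambda v: (w[tour[-1]][v], v))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def optimal_tour(w):
    """Exact answer by trying every tour that starts at city 0."""
    candidates = ((0,) + p for p in permutations(range(1, len(w))))
    return list(min(candidates, key=lambda t: tour_weight(w, t)))

# Symmetric weights for a complete graph on 4 cities (illustrative values).
w = [[0, 1, 2, 10],
     [1, 0, 1, 2],
     [2, 1, 0, 1],
     [10, 2, 1, 0]]
print(tour_weight(w, nearest_neighbor_tour(w)), tour_weight(w, optimal_tour(w)))
```

On this instance the heuristic is lured along the cheap edges 0-1-2-3 and then pays for the expensive edge back to city 0.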