NP-Completeness (ch34, Hewett)

Problem
  Tractable: computationally feasible; poly-time alg. sol., e.g., O(n^k)
  Intractable: computationally infeasible; super poly-time alg. sol., e.g., O(2^n)
  Non-computable: no alg. sol., e.g., halting problem

But an n^10000000 (high-degree) poly-time alg is computationally infeasible.
So why do we still associate poly-time with tractable?

Reasons:
  - high-degree poly-time is very rare in practice
  - poly-time algs have useful properties:
      - a poly-time sol is transferable between computational models (e.g., poly-time on RAM is also poly-time on TM)
      - poly-time algs are closed under +, ·, and composition --> can combine poly-time sols to get another poly-time sol
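The closure property can be made concrete with a short worked bound. This is only a sketch: the algorithms A and B and the exponents a, b >= 1 are illustrative assumptions, not taken from the slides.

```latex
\begin{align*}
T_A(n) &= O(n^{a}), \qquad T_B(m) = O(m^{b}) \\
\text{sum:}\quad & T_A(n) + T_B(n) = O\!\left(n^{\max(a,b)}\right) \\
\text{product:}\quad & T_A(n) \cdot T_B(n) = O\!\left(n^{a+b}\right) \\
\text{composition:}\quad & |A(x)| \le T_A(n) = O(n^{a})
  \;\Rightarrow\;
  T_{B \circ A}(n) = O\!\left(n^{a} + (n^{a})^{b}\right) = O\!\left(n^{ab}\right)
\end{align*}
```

The composition line uses the fact that an algorithm running in O(n^a) time cannot write an output longer than O(n^a), so B receives a polynomially bounded input.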
Problem Types

Two problem types:
  Decision Problems (DP): sol is yes/no
  Optimization Problems (OP): sol is min/max value

Imposing a bound on an OP gives a related DP, e.g.,
  1) Shortest-path problem <G, u, v>: find a shortest path between u & v in an unweighted undirected graph (OP)
  2) Path problem <G, u, v, k>: is there a path between u & v whose length is at most k? (DP)
Here, imposing a bound k on the OP 1) gives the DP 2).

DP ~ Verification Problem
E.g., to solve the Path problem:
  For each path u ~> v, verify if its length is at most k
  If true then return yes

Solving the DP is no harder than solving the OP: given a sol to the OP, just compare its value with k.
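The OP/DP relationship for the path example can be sketched in Python. This is a minimal illustration assuming an adjacency-list dict for G; the function names and the sample graph are hypothetical, not from the slides.

```python
from collections import deque


def shortest_path_length(adj, u, v):
    """OP: fewest edges on a u-v path (BFS), or None if v is unreachable."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            return dist[x]
        for y in adj[x]:
            if y not in dist:  # first visit gives the shortest distance
                dist[y] = dist[x] + 1
                queue.append(y)
    return None


def path_decision(adj, u, v, k):
    """DP: is there a u-v path with at most k edges?"""
    d = shortest_path_length(adj, u, v)
    return d is not None and d <= k


# Hypothetical example graph: triangle a-b-c with a pendant vertex d on c.
adj = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
print(shortest_path_length(adj, "a", "d"))
print(path_decision(adj, "a", "d", 2), path_decision(adj, "a", "d", 1))
```

Here the decision version is answered by one call to the optimization version plus a comparison with k, which is the sense in which the DP is no harder than the OP.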
Complexity Classes

P = class of probs solvable by a determ poly-time alg
NP = class of probs solvable by a non-determ poly-time alg

f: Problems → Formal Languages
   Problem instance ↦ word (binary string) of L, a language (set of words)
Time to encode problem instances as binary strings should not affect the efficiency of the problem solution ~ assume poly-time

Another definition:
  NP = class of probs solvable by a non-determ alg in poly-time
     = class of Ls accepted by a non-determ TM in poly-time
  Similarly for P.

Intuitively,
  P ~ problems that can be solved quickly
  NP ~ problems that can be verified quickly
Which is easier, P or NP?

Example: HAM-CYCLE (Hamiltonian cycle problem)
  Does an undirected graph contain a hamiltonian cycle (i.e., a cycle that contains every vertex of the graph exactly once)?
  [Figure: two example graphs, one labelled "yes" and one labelled "no"]
  Goal: find a hamiltonian cycle of a given graph if it exists
HAM-CYCLE

Find a hamiltonian cycle (i.e., a cycle that contains every vertex of a graph exactly once) of a given graph if it exists

Claim: ∃ a non-det poly-time alg sol for HAM-CYCLE, i.e., HAM-CYCLE ∈ NP

Idea:
  1) guess the cycle as a sequence p[1..n] (vertices in order, start vertex repeated at the end)
  2) verify that p[1..n] forms a hamiltonian cycle:
     a) path starts and ends at the same vertex, i.e., p[1] = p[n]                                      O(1)
     b) each edge in the path exists in the graph, i.e., (p[i], p[i+1]) ∈ E                             O(n)
     c) no duplicate vertex on the path (other than the repeated endpoint) and every vertex appears     O(n^2)
  3) finite # of guesses (since E is finite), each explored on its own non-deterministic branch, and each guess takes poly-time to verify
     --> total non-det time is poly-time --> HAM-CYCLE ∈ NP

Note:
  HAM-CYCLE can be verified quickly ~ NP
  Step 1) --> non-determinism
  Steps 2) & 3) --> verification in poly-time
  HAM-CYCLE is a DP (implicit bound = # of all vertices)

Is HAM-CYCLE ∈ P?
  No det poly-time alg sol has been found yet.
  Thus, the answer is: don't know.
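The guess-and-verify idea above can be sketched in Python. This is an illustration under stated assumptions: the graph is given as a vertex list plus an undirected edge list, the guess p repeats its start vertex at the end, and the function names are hypothetical. The second function replaces the non-deterministic guess with exhaustive search, which is why it is not a poly-time algorithm.

```python
from itertools import permutations


def verify_ham_cycle(vertices, edges, p):
    """Poly-time check that p (start vertex repeated at the end) is a hamiltonian cycle."""
    edge_set = {frozenset(e) for e in edges}
    # A cycle in a simple graph needs at least 3 vertices.
    if len(vertices) < 3:
        return False
    # a) path starts and ends at the same vertex
    if len(p) < 2 or p[0] != p[-1]:
        return False
    # b) every consecutive pair is an edge of the graph
    for a, b in zip(p, p[1:]):
        if frozenset((a, b)) not in edge_set:
            return False
    # c) no duplicates (ignoring the repeated endpoint) and every vertex appears
    interior = p[:-1]
    return len(interior) == len(set(interior)) and set(interior) == set(vertices)


def has_ham_cycle(vertices, edges):
    """Deterministic stand-in for the guess: try up to (n-1)! candidate cycles."""
    if not vertices:
        return False
    first, rest = vertices[0], vertices[1:]
    return any(
        verify_ham_cycle(vertices, edges, [first, *perm, first])
        for perm in permutations(rest)
    )


# Hypothetical example: a 4-cycle A-B-C-D-A.
V = ["A", "B", "C", "D"]
E = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
print(verify_ham_cycle(V, E, ["A", "B", "C", "D", "A"]))
print(has_ham_cycle(V, E))
```

Each call to verify_ham_cycle is cheap; the cost of has_ham_cycle comes entirely from the number of candidates it has to enumerate.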
P vs. NP

To show L ∈ NP is easier than to show L ∈ P (i.e., verified quickly vs. solved quickly)
Clearly, P ⊆ NP (a problem that can be solved quickly can be verified quickly)
But is NP ⊆ P? -- Open problem: P = NP?
There exist many problems in NP for which we don't know if they are in P (e.g., HAM-CYCLE)

Fact: P is closed under complement (i.e., L ∈ P ⇒ complement of L ∈ P)
But we don't know if NP is closed under complement
Open problem: L ∈ NP ⇒ complement of L ∈ NP? i.e., NP = co-NP?

NP-Completeness (NPC)

Intuitively: NPC problems are the problems in NP that are as hard as any problem in NP

Implications:
  If any one NPC problem can be solved in poly-time (*)
  then 1) all NP problems can be solved in poly-time (by def)
  and so, 2) NP ⊆ P --> NP = P (see more details later in Result 2)

But (*) is unlikely to be true: there are many NPC problems with no poly-time sol found so far.
At the moment, we can't prove or disprove (*).
NPC - formal definition

To show L ∈ NPC - we want to show how hard L is
NPC vs. NP-Hard (NPH)

Reduction: L' is polynomially reducible to L, denoted by L' ≤p L,
  if ∃ a poly-time computable function f mapping instances of L' to instances of L
  s.t. x ∈ L' iff f(x) ∈ L
  [Figure: f maps each x in L' to f(x) in L, and each x outside L' to f(x) outside L]

NPC vs. NPH
  Def: L ∈ NPC if 1) L ∈ NP, and 2) L' ≤p L for all L' ∈ NP
  Def: L ∈ NPH if L' ≤p L for all L' ∈ NP
  L ∈ NPC means L ∈ NP and L ∈ NPH
  L ∈ NPH means every NP prob is polynomially reducible to L

Result 1: L and L' are languages representing DPs.
  If L' ≤p L then L ∈ P ⇒ L' ∈ P (lemma 34.3)

  Proof sketch:
    x ∈ L' iff f(x) ∈ L                                  (by function f)
    A decides if a given input ∈ L or not, in poly-time  (by Alg A, since L ∈ P)
    B decides L'                                         (by Alg B, constructed as below)

    B:  x --> [ f ] --> f(x) --> [ A ] --> yes: f(x) ∈ L, so x ∈ L'
                                       --> no:  f(x) ∉ L, so x ∉ L'

    B correctly decides if x ∈ L' or not in poly-time. Therefore, L' ∈ P.
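The construction of Alg B from f and Alg A in Result 1 is just function composition. The sketch below uses a deliberately trivial pair of languages (L = even integers, L' = odd integers, f(x) = x + 1) chosen only for illustration; none of these names come from the slides.

```python
def make_decider(f, decide_L):
    """Build Alg B for L' from a reduction f (L' <=p L) and Alg A deciding L."""
    def decide_L_prime(x):
        # x is in L' iff f(x) is in L, so B simply returns A's answer on f(x).
        return decide_L(f(x))
    return decide_L_prime


# Hypothetical toy instance of the lemma:
#   L  = even integers, decided by is_even (Alg A)
#   L' = odd integers
#   f(x) = x + 1, since x is odd iff x + 1 is even
def is_even(n):
    return n % 2 == 0


def f(x):
    return x + 1


is_odd = make_decider(f, is_even)  # Alg B
print(is_odd(3), is_odd(4))
```

If f and A both run in poly-time, so does B, because f's output is only polynomially longer than its input.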
NPC vs. NPH

Def: L ∈ NPC if 1) L ∈ NP, and 2) L' ≤p L for all L' ∈ NP
Def: L ∈ NPH if L' ≤p L for all L' ∈ NP
Result 1: If L' ≤p L then L ∈ P ⇒ L' ∈ P

Result 2: If we can solve any NPC problem in poly-time then P = NP (this is why NPC is important!)
  I.e., if L ∈ NPC and L ∈ P then P = NP (lemma 34.4)

  Proof: Since P ⊆ NP, only need to show NP ⊆ P. Suppose S ∈ NP. To show S ∈ P.
    L ∈ NPC ⇒ L' ≤p L for all L' ∈ NP   (definition of NPC, part 2)
            ⇒ S ≤p L                    (since S ∈ NP)
            ⇒ S ∈ P                     (since S ≤p L and L ∈ P, and by Result 1)

To show L ∈ NPC

To show L ∈ NPH directly is difficult (Why? It needs a reduction from every problem in NP.)
Can we prove this indirectly? YES

How?
  1. Show L ∈ NP
  2. Select one known L' ∈ NPC
  3. Show L' ≤p L

Why?
  By 2., S ≤p L' for all S ∈ NP            (by def of NPC)
  ⇒ S ≤p L' ≤p L for all S ∈ NP            (by the above, 3., and transitivity of ≤p)
  ⇒ S ≤p L for all S ∈ NP
  ⇒ L ∈ NPH
  ⇒ L ∈ NPC                                (since 1.)
Traveling Salesman Problem (TSP)

TSP: find the min cost of a tsp-tour (a route that visits each of n cities once and returns to the starting city)

Notes:
  TSP is an OP
  Putting a bound on the OP gives a related DP,
  i.e., a TSP instance <G, cost, k>, where
    G is a complete graph (∃ an edge between any two vertices),
    cost is a function assigning a cost to each edge, and
    k is a bound on the travel cost

To show TSP ∈ NPC

How to show L ∈ NPC:
  1. Show L ∈ NP
  2. Select one known L' ∈ NPC
  3. Show L' ≤p L

1. Show TSP ∈ NP (~ TSP can be verified quickly)
  Given an instance of TSP <G, cost, k> and a sequence of n vertices (cities), S,
  can we verify that S is a tsp-tour with cost ≤ k in poly-time?
  Answer: Yes. Check:
    tsp-tour?
      1. |V| = n (same size as S), and no city repeats in S
      2. S[i] ∈ V and S[i+1] ∈ Adj[S[i]] for all i = 1, 2, ..., n-1
      3. S[n] ∈ V and S[1] ∈ Adj[S[n]]
    cost ≤ k?
      4. Sum up the edge costs and check that the total is bounded by k
  The above takes poly-time (O(n)).
  There are finitely many candidate sequences S, and a non-det alg checks each on its own branch in poly-time
  --> TSP ∈ NP
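The four checks can be sketched as a Python verifier. The representation is an assumption made for illustration: cost is a dict keyed by frozenset pairs of cities, and the function name and sample costs are hypothetical.

```python
def verify_tsp_tour(vertices, cost, k, S):
    """Check that S is a tsp-tour of the instance <G, cost, k> with total cost <= k."""
    n = len(vertices)
    # 1. S has the same size as V and visits every city exactly once.
    if len(S) != n or set(S) != set(vertices):
        return False
    total = 0
    for i in range(n):
        # 2-3. consecutive cities (wrapping from S[n] back to S[1]) must be adjacent
        edge = frozenset((S[i], S[(i + 1) % n]))
        if edge not in cost:
            return False
        # 4. accumulate the edge costs along the tour
        total += cost[edge]
    return total <= k


# Hypothetical complete graph on 4 cities.
cities = ["A", "B", "C", "D"]
cost = {
    frozenset(("A", "B")): 1, frozenset(("B", "C")): 2,
    frozenset(("C", "D")): 1, frozenset(("D", "A")): 2,
    frozenset(("A", "C")): 5, frozenset(("B", "D")): 5,
}
print(verify_tsp_tour(cities, cost, 6, ["A", "B", "C", "D"]))
print(verify_tsp_tour(cities, cost, 5, ["A", "B", "C", "D"]))
```

With set-based membership tests the whole check runs in time linear in n, matching the O(n) estimate.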
To show TSP ∈ NPC

How to show L ∈ NPC:
  1. Show L ∈ NP
  2. Select one known L' ∈ NPC
  3. Show L' ≤p L, by finding a poly-time alg that computes f s.t. x ∈ L' iff f(x) ∈ L

2. Select HAM-CYCLE, which is known to be NPC

3. To show HAM-CYCLE ≤p TSP
  Idea: Given an instance of HAM-CYCLE, G, construct G', a corresponding instance of TSP, s.t. the construction takes poly-time.
  E.g., G = (V, E) --f--> G' = (V, E'): a complete graph, where
    c(i, j) = 0 if (i, j) ∈ E
              1 otherwise
  [Figure: example G on vertices A, B, C, D and the corresponding complete graph G' with edge costs 0 or 1]

  To show that the construction f takes poly-time:
    copy V in G to V in G'                                              O(V)
    create complete graph G' (for each v ∈ V, Adj[v] = V - {v})         O(V^2)
    assign cost to each edge (can you write pseudocode?)                O(E') = O(V^2)

  To show that G ∈ HAM-CYCLE iff f(G) = G' ∈ TSP (with bound k = 0):
    (⇒) G contains a hamiltonian cycle h
        ⇒ each edge in h is in E and has cost 0 in G'
        ⇒ G' contains a tsp-tour, namely h, with cost 0. This is the min cost possible.
        Thus, G' ∈ TSP.
    (⇐) G' has a tsp-tour h' of cost at most 0 (the min possible route cost)
        ⇒ since edge costs in E' can only be 0 or 1 (no -ve cost), each edge on h' must have cost 0
        ⇒ each edge on h' must be in E
        ⇒ the tsp-tour h' is a hamiltonian cycle in G

  Hence HAM-CYCLE ≤p TSP
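One possible answer to the pseudocode question, as a Python sketch of the construction f. The assumptions are illustrative: G is a vertex list plus an undirected edge list with at least 3 vertices, the TSP instance is returned as (vertices, cost, k) with k = 0, and the brute-force TSP decider is included only to check the "iff" on tiny examples, so it is exponential by design.

```python
from itertools import combinations, permutations


def ham_cycle_to_tsp(vertices, edges):
    """Reduction f: map a HAM-CYCLE instance G to a TSP instance <G', c, 0>."""
    edge_set = {frozenset(e) for e in edges}
    cost = {}
    # G' is complete: one entry per unordered pair of vertices, O(V^2) pairs.
    for a, b in combinations(vertices, 2):
        pair = frozenset((a, b))
        cost[pair] = 0 if pair in edge_set else 1  # c(i, j) = 0 iff (i, j) in E
    return vertices, cost, 0


def tsp_decision_bruteforce(vertices, cost, k):
    """Exponential-time check: is there a tour of cost <= k? (assumes >= 3 vertices)"""
    first, rest = vertices[0], vertices[1:]
    n = len(vertices)
    for perm in permutations(rest):
        tour = [first, *perm]
        total = sum(cost[frozenset((tour[i], tour[(i + 1) % n]))] for i in range(n))
        if total <= k:
            return True
    return False


# Hypothetical instances: a 4-cycle (has a hamiltonian cycle) and a path (has none).
V = ["A", "B", "C", "D"]
cycle_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
path_edges = [("A", "B"), ("B", "C"), ("C", "D")]
print(tsp_decision_bruteforce(*ham_cycle_to_tsp(V, cycle_edges)))
print(tsp_decision_bruteforce(*ham_cycle_to_tsp(V, path_edges)))
```

The reduction itself touches each vertex pair once, so it runs in O(V^2) time; only the decider used for checking is expensive.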
Other NPCs
  - Hamiltonian path
  - circuit satisfiability
  - satisfiability
  - 3-CNF satisfiability
  - vertex-cover
  - TSP
  - graph-coloring problem

Why learn this?
  Why do CS students need to learn about problem classes in complexity theory?
  Why must good algorithm designers understand NP-completeness?

Reasons:
  Different complexity classes of problems affect how we design algs to solve them:
    P ~ can find an alg to solve the problem quickly
    NP ~ can find an alg to verify the problem quickly
    NPC ~ little hope of finding a fast alg (none is known, and one would imply P = NP). Must either
      - find an approx sol, or
      - find a tractable sol in a special case