
P, NP, NP-Complete, and NP-Hard Problems Zhenjiang Li 21/09/2011

Outline: Algorithm time complexity; P and NP problems; NP-Complete and NP-Hard problems

Algorithm time complexity

Outline What is this course about? What are algorithms? What does it mean to analyze an algorithm? Comparing time complexity

Computational Problem. Definition: A computational problem is a specification of the desired input-output relationship. Example (Sorting): Input: a sequence of n numbers a₁, ..., aₙ. Output: a permutation (reordering) a₁′, a₂′, ..., aₙ′ of the input such that a₁′ ≤ a₂′ ≤ ... ≤ aₙ′.

Instance. Definition: A problem instance is any valid input to the problem. Example (instance of the sorting problem): 8, 3, 6, 7, 1, 2, 9.

Algorithm. Definition: An algorithm is a well-defined computational procedure that transforms inputs into outputs, achieving the desired input-output relationship. Definition: A correct algorithm halts with the correct output for every input instance; we then say that the algorithm solves the problem.

Example: Insertion Sort. Pseudocode (input: A[1..n] is an array of numbers):

    for j ← 2 to n do
        key ← A[j]; i ← j − 1;
        while i ≥ 1 and A[i] > key do
            A[i + 1] ← A[i]; i ← i − 1;
        end
        A[i + 1] ← key;
    end

(Slide figure: the array is split into a sorted part and an unsorted part; the question is where in the sorted part to put key.)
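
A minimal, runnable Python version of the pseudocode above (an illustrative sketch, not part of the original slides, using 0-indexed lists instead of the slide's 1-indexed A[1..n]):

    def insertion_sort(a):
        """Sort the list a in place using insertion sort."""
        for j in range(1, len(a)):           # a[0..j-1] is already sorted
            key = a[j]
            i = j - 1
            while i >= 0 and a[i] > key:     # shift larger elements one slot right
                a[i + 1] = a[i]
                i -= 1
            a[i + 1] = key                   # drop key into its sorted position
        return a

    print(insertion_sort([6, 3, 2, 4, 5]))   # [2, 3, 4, 5, 6]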

How Does It Work? An incremental approach: to sort a given array of length n, at the i-th step it sorts the first i items by making use of the sorted array of the first i − 1 items from the (i − 1)-th step. Example: sort A = 6, 3, 2, 4, 5 with insertion sort. Step 1: 6, 3, 2, 4, 5. Step 2: 3, 6, 2, 4, 5. Step 3: 2, 3, 6, 4, 5. Step 4: 2, 3, 4, 6, 5. Step 5: 2, 3, 4, 5, 6.

Outline What is this course about? What are algorithms? What does it mean to analyze an algorithm? Comparing time complexity

Analyzing Algorithms. Predict resource utilization: 1) memory (space complexity); 2) running time (time complexity), the focus of this course. Running time depends on the speed of the computer, on implementation details, and on the input, especially the size of the input. In light of these factors, how can we compare different algorithms in terms of their running times? We want a way of measuring running times that is mathematically elegant and machine-independent.

Machine-independent running time. 1) We measure the running time as the number of primitive operations (e.g., additions, multiplications, comparisons) used by the algorithm. 2) We measure the running time as a function of the input size: let n denote the input size and let T(n) denote the running time for an input of size n. Input size n (rigorous definition given later): for sorting, the number of items to be sorted; for graphs, the number of vertices and edges.

Three Kinds of Analysis: I. Best Case: an instance for a given size n that results in the fastest possible running time. Example (insertion sort): when the input A[1] ≤ A[2] ≤ ... ≤ A[n] is already sorted, each key is compared only to the element right before it, so the number of comparisons is 1 + 1 + ... + 1 (n − 1 terms) = n − 1 = Θ(n).

Three Kinds of Analysis: II. Worst Case: an instance for a given size n that results in the slowest possible running time. Example (insertion sort): when the input A[1] > A[2] > ... > A[n] is in reverse order, each key is compared to every element before it, so the number of comparisons is 1 + 2 + ... + (n − 1) = n(n − 1)/2 = Θ(n²).
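
The best- and worst-case counts above can be checked with a small Python sketch (my addition, not from the slides) that counts key comparisons:

    def insertion_sort_comparisons(a):
        """Return the number of key comparisons insertion sort makes on list a."""
        a = list(a)
        comparisons = 0
        for j in range(1, len(a)):
            key = a[j]
            i = j - 1
            while i >= 0:
                comparisons += 1              # one comparison of a[i] with key
                if a[i] > key:
                    a[i + 1] = a[i]
                    i -= 1
                else:
                    break
            a[i + 1] = key
        return comparisons

    n = 10
    print(insertion_sort_comparisons(range(n)))          # sorted input: n - 1 = 9
    print(insertion_sort_comparisons(range(n, 0, -1)))   # reverse-sorted: n(n-1)/2 = 45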

Three Kinds of Analysis: III. Average Case: the running time averaged over all possible instances of the given size, assuming some probability distribution on the instances. Example (insertion sort): Θ(n²), assuming that each of the n! orderings is equally likely (uniform distribution); on average, each key is compared to about half of the elements before it.

Three Kinds of Analysis. Best case: clearly useless. Worst case: commonly used, and the one used in this course; it gives a running-time guarantee no matter what the input is and allows a fair comparison among different algorithms. Average case: used sometimes; it requires assuming some distribution (real-world inputs are seldom uniformly random!), the analysis is complicated, and it will not be used in this course.

Outline What is this course about? What are algorithms? What does it mean to analyze an algorithm? Comparing time complexity

Comparing Time Complexity. (Slide figure: T(n) plotted against n for the two algorithms.) Which algorithm is superior for large n? T(n) for Algorithm 1 is 3n³ + 6n² − 4n + 17; T(n) for Algorithm 2 is 7n² − 8n + 20. Clearly, Algorithm 2 is superior.

Asymptotic Analysis. T(n) for Algorithm 1 is 3n³ + 6n² − 4n + 17 = Θ(n³); T(n) for Algorithm 2 is 7n² − 8n + 20 = Θ(n²). Θ-notation: drop low-order terms, ignore leading constants, and look at the growth of T(n) as n → ∞. When n is large enough, a Θ(n²) algorithm always beats a Θ(n³) algorithm.
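
A quick sketch (mine, not from the slides) that evaluates both running-time functions, showing that the Θ(n²) algorithm wins and that its advantage grows with n:

    T1 = lambda n: 3*n**3 + 6*n**2 - 4*n + 17   # Algorithm 1: Theta(n^3)
    T2 = lambda n: 7*n**2 - 8*n + 20            # Algorithm 2: Theta(n^2)

    for n in (1, 2, 5, 10, 100, 1000):
        print(n, T1(n), T2(n), round(T1(n) / T2(n), 1))
    # The ratio T1(n)/T2(n) grows roughly linearly in n, as the Theta-notation predicts.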

Big-Oh: asymptotic upper bound. Definition (big-oh): f(n) = O(g(n)) means there exist constants c > 0 and n₀ such that f(n) ≤ c·g(n) for all n ≥ n₀. When estimating the growth rate of T(n) using big-oh: ignore the low-order terms and ignore the constant coefficient of the most significant term. (Slide figure: f(n) lies below c·g(n) for all n ≥ n₀.)

Big-Oh: Example. (Definition restated: f(n) = O(g(n)) means there exist constants c > 0 and n₀ such that f(n) ≤ c·g(n) for all n ≥ n₀.) Example: let T(n) = 3n² + 4n + 5; prove that T(n) = O(n²). Proof: for n ≥ 1, T(n) = 3n² + 4n + 5 ≤ 3n² + 4n² + 5n² = 12n². Thus T(n) ≤ 12n² for all n ≥ 1, so setting n₀ = 1 and c = 12 in the definition gives T(n) = O(n²).

Big-Oh: More Examples.
n²/2 − 3n = O(n²)
1 + 4n = O(n)
log₁₀ n = log₂ n / log₂ 10 = O(log₂ n) = O(log n)
sin n = O(1), 10 = O(1), 10¹⁰ = O(1)
Σᵢ₌₁ⁿ i² ≤ n · n² = O(n³)
Σᵢ₌₁ⁿ i ≤ n · n = O(n²)
2^(10n) is not O(2ⁿ)
log(n!) = log n + log(n − 1) + ... + log 1 ≤ n log n = O(n log n)
Σᵢ₌₁ⁿ 1/i = O(log n) (shown on the board)
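
Two of these bounds can be sanity-checked numerically; the following sketch (mine, not from the slides) compares log(n!) against n·log n and the harmonic sum against ln n:

    import math

    for n in (10, 100, 1000):
        log_factorial = sum(math.log(i) for i in range(1, n + 1))   # log(n!)
        harmonic = sum(1.0 / i for i in range(1, n + 1))            # sum of 1/i for i = 1..n
        print(n, round(log_factorial, 1), round(n * math.log(n), 1),
              round(harmonic, 2), round(math.log(n), 2))
    # log(n!) stays below n*log n, and the harmonic sum exceeds ln n only by a
    # constant (about 0.58), consistent with O(n log n) and O(log n).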

Big-Omega: asymptotic lower bound. Definition (big-omega): f(n) = Ω(g(n)) means there exist constants c > 0 and n₀ such that f(n) ≥ c·g(n) for all n ≥ n₀. It is easy to show that n²/2 − 3n ≥ n²/4 for all n ≥ 12; thus n²/2 − 3n = Ω(n²). (Slide figure: f(n) lies above c·g(n) for all n ≥ n₀.) Example: log(n!) = log n + log(n − 1) + ... + log 1 ≥ log n + log(n − 1) + ... + log(n/2) ≥ (n/2)·log(n/2) = (n/2)(log n − 1) = Ω(n log n).

Big-Theta: asymptotic tight bound. Definition (big-theta): f(n) = Θ(g(n)) means f(n) = O(g(n)) and f(n) = Ω(g(n)). We have shown that n²/2 − 3n = O(n²) and n²/2 − 3n = Ω(n²); therefore n²/2 − 3n = Θ(n²). Usually (and in this course) it is sufficient to show only upper bounds (big-oh), though we should try to make these as tight as we can.

Some Thoughts on Algorithm Design. Algorithm design, as taught in this class, is mainly about designing algorithms that have small big-oh running times. As n gets larger and larger, O(n log n) algorithms will run faster than O(n²) ones, and O(n) algorithms will beat O(n log n) ones. Good algorithm design and analysis allows you to identify the hard parts of your problem and deal with them effectively. Too often, programmers try to solve problems using brute-force techniques and end up with slow, complicated code! A few hours of abstract thought devoted to algorithm design often results in faster, simpler, and more general solutions.

Algorithm Tuning. After algorithm design one can continue on to algorithm tuning: concentrating on improving algorithms by cutting down on the constants hidden in the big-O bounds. This needs a good understanding of both algorithm design principles and the efficient use of data structures. In this course we will not go further into algorithm tuning; for a good introduction, see Chapter 9 of Programming Pearls, 2nd ed., by Jon Bentley.

P and NP problems

Decision Problems. Definition: A decision problem is a question that has two possible answers, yes and no. Note: if L is the problem and x is the input, we will often write L(x) = yes to denote a yes answer and L(x) = no to denote a no answer. This notation comes from thinking of L as a language and asking whether x is in the language (yes) or not (no); see CLRS, pp. 975-977 for more details. Definition: An optimization problem requires an answer that is an optimal configuration. Remark: an optimization problem usually has a corresponding decision problem. Examples that we will see: MST vs. Decision Spanning Tree (DST), Knapsack vs. Decision Knapsack (DKnapsack), Subset Sum vs. Decision Subset Sum (DSubsetSum).

Decision Problem: MST. Optimization problem (Minimum Spanning Tree): given a weighted graph G, find a minimum spanning tree (MST) of G. Decision problem (Decision Spanning Tree, DST): given a weighted graph G and an integer k, does G have a spanning tree of weight at most k? The inputs are of the form (G, k), so we will write DST(G, k) = yes or DST(G, k) = no to denote, respectively, yes and no answers.

Optimization and Decision Problems. For almost all optimization problems there exists a corresponding, simpler decision problem, and given a subroutine for solving the optimization problem, solving the corresponding decision problem is usually trivial. Example: if we know how to solve MST, we can solve DST, which asks whether there is a spanning tree of weight at most k. How? First solve the MST problem, then check whether the MST has weight at most k: if it does, answer yes; if it does not, answer no. Thus, if we prove that a given decision problem is hard to solve efficiently, then the optimization problem must be at least as hard. Note: the reason for introducing decision problems is that it is more convenient to compare the hardness of decision problems than of optimization problems, since all decision problems share the same form of output, either yes or no.
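
A sketch (mine, assuming a connected graph given as a list of weighted edges) of how a solver for the optimization problem MST immediately answers the decision problem DST:

    def mst_weight(n, edges):
        """Weight of a minimum spanning tree via Kruskal's algorithm.
        n: number of vertices 0..n-1; edges: list of (weight, u, v); graph assumed connected."""
        parent = list(range(n))
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]   # path halving
                x = parent[x]
            return x
        total = 0
        for w, u, v in sorted(edges):
            ru, rv = find(u), find(v)
            if ru != rv:                        # edge joins two different components
                parent[ru] = rv
                total += w
        return total

    def dst(n, edges, k):
        """Decision Spanning Tree: does the graph have a spanning tree of weight at most k?"""
        return mst_weight(n, edges) <= k

    edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]
    print(dst(4, edges, 7))   # True: the MST has weight 1 + 2 + 4 = 7
    print(dst(4, edges, 6))   # False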

Decision Problems: Yes-Inputs and No-Inputs. An instance of a decision problem is called a yes-input (resp. no-input) if the answer to the instance is yes (resp. no). CYC problem: does an undirected graph G have a cycle? (Slide figure: a small graph containing a cycle as a yes-input, and an acyclic graph as a no-input.)

Decision Problems: Yes-Inputs and No-Inputs. Decision problem (TRIPLE): does a given triple of nonnegative integers satisfy a specified arithmetic relation? Some triples are yes-inputs and others are no-inputs.

Complexity Classes. The theory of complexity deals with the classification of certain decision problems into several classes: the class of easy problems, the class of hard problems, and the class of hardest problems; with the relations among the three classes; and with the properties of problems in the three classes. Question: how do we classify decision problems? Answer: using polynomial-time algorithms.

Polynomial-Time Algorithms. Definition: An algorithm is polynomial-time if its running time is O(nᵏ), where k is a constant independent of n, and n is the input size of the problem that the algorithm solves. Remark: whether you use the exact input size or any polynomially related measure of it does not affect the conclusion of whether an algorithm is polynomial time. This explains why we introduced the concept of two functions being of the same type earlier on: using the definition of polynomial time, it is not necessary to fixate on the input size as being the exact minimum number of bits needed to encode the input!

Nonpolynomial-Time Algorithms. Definition: An algorithm is nonpolynomial-time if its running time is not O(nᵏ) for any fixed k. Example: return to the brute-force algorithm for determining whether a positive integer n is prime: it checks, for each i with 2 ≤ i ≤ n − 1, whether i divides n, and therefore uses Θ(n) divisions. Conclusion: the algorithm is nonpolynomial! Why? The input size is the number of bits s = ⌈log₂ n⌉ needed to write n, so the running time Θ(n) = Θ(2ˢ) is exponential in the input size.
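
A sketch (mine, not from the slides) of this brute-force primality test; the number of divisions grows with the value n, i.e., exponentially in the bit length of the input:

    def is_prime_brute_force(n):
        """Trial division by every i with 2 <= i <= n - 1: Theta(n) divisions when n is prime."""
        if n < 2:
            return False
        for i in range(2, n):
            if n % i == 0:
                return False
        return True

    # The input size is the bit length of n, so each extra bit roughly doubles
    # the work on prime inputs: nonpolynomial in the input size.
    for bits in (13, 17, 19):
        n = 2**bits - 1                      # these happen to be (Mersenne) primes
        print(bits, n, is_prime_brute_force(n))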

Is Knapsack Polynomial? Recall the problem: we have a knapsack of capacity W (a positive integer) and n objects with weights w₁, ..., wₙ and values v₁, ..., vₙ, where the wᵢ and vᵢ are positive integers. The optimization problem is to find the largest total value of any subset of the objects whose total weight fits in the knapsack. The decision problem is, given k, to decide whether there is a subset of the objects that fits in the knapsack and has total value at least k. In class we saw a dynamic programming algorithm, running in O(nW) time, for solving the optimization version of Knapsack. Is this a polynomial algorithm? No! The size of the input is roughly n log W bits, and O(nW) is not polynomial in that; depending upon the values of n and W, nW could even be exponential in the input size. It is unknown whether there exists a polynomial-time algorithm for Knapsack. In fact, Decision Knapsack is an NP-Complete problem, so anyone who could determine whether there is a polynomial-time algorithm for solving it would be proving that P = NP or P ≠ NP, and would win the prize from the Clay Institute!
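
A sketch (mine) of the standard O(nW) dynamic program and the corresponding decision wrapper; note how the running time depends on the numeric value of the capacity W, not on its bit length:

    def knapsack_max_value(capacity, weights, values):
        """Classic DP for 0/1 knapsack: O(n * W) time, where W = capacity.
        Pseudo-polynomial: W can be exponential in the number of bits used to
        encode it, so this is not a polynomial-time algorithm."""
        best = [0] * (capacity + 1)            # best[c] = max value achievable with capacity c
        for i in range(len(weights)):
            for c in range(capacity, weights[i] - 1, -1):
                best[c] = max(best[c], best[c - weights[i]] + values[i])
        return best[capacity]

    def decision_knapsack(capacity, weights, values, k):
        """DKnapsack: is there a subset fitting in the knapsack with total value >= k?"""
        return knapsack_max_value(capacity, weights, values) >= k

    print(decision_knapsack(10, [6, 5, 5], [9, 7, 8], 15))  # True: items 2 and 3
    print(decision_knapsack(10, [6, 5, 5], [9, 7, 8], 16))  # False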

Polynomial- vs. Nonpolynomial-Time. Nonpolynomial-time algorithms are impractical. For example, running an algorithm of exponential time complexity such as 2ⁿ on even a moderately large input, on a computer that performs one tera-operation (10¹² operations) per second, would take an astronomical number of years. For the sake of our discussion of complexity classes, polynomial-time algorithms are considered practical. Note: in reality, a polynomial-time algorithm with a very large exponent is not really practical.
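
As a concrete illustration (my numbers, assuming a 2ⁿ-time algorithm and an input of size n = 100):

    ops_per_second = 10**12                  # one tera-operation per second
    seconds = 2**100 / ops_per_second        # work for a 2^n-time algorithm at n = 100
    years = seconds / (3600 * 24 * 365)
    print(f"{seconds:.3g} seconds ~ {years:.3g} years")
    # About 1.3e18 seconds, i.e., roughly 4e10 years -- longer than the age of the universe.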

Polynomial-Time Solvable Problems. Definition: A problem is solvable in polynomial time (or more simply, the problem is in polynomial time) if there exists an algorithm which solves the problem in polynomial time. Examples: the integer multiplication problem and the cycle detection problem for undirected graphs. Remark: polynomial-time solvable problems are also called tractable problems.

The Class P. Definition: The class P consists of all decision problems that are solvable in polynomial time; that is, there exists an algorithm that will decide in polynomial time whether any given input is a yes-input or a no-input. How do you prove that a decision problem is in P? Find a polynomial-time algorithm for the problem. How do you prove that a decision problem is not in P? Prove that there is no polynomial-time algorithm for the problem (much harder).

Certificates and Verifying Certificates. We have already seen the class P; we are now almost ready to introduce the class NP, but before doing so we must first introduce the concept of certificates. Observation: a decision problem is usually formulated as: is there an object satisfying some conditions? A certificate is a specific object corresponding to a yes-input that can be used to show the validity of that yes-input. By definition, only a yes-input needs a certificate (a no-input does not need a certificate to show that it is a no-input). Verifying a certificate: given a presumed yes-input and its corresponding certificate, we use the certificate to verify that the input is actually a yes-input.

The Class NP. Definition: The class NP consists of all decision problems such that, for each yes-input, there exists a certificate that can be verified in polynomial time. Remark: NP stands for nondeterministic polynomial time. The class was originally studied in the context of nondeterminism; here we use the equivalent notion of verification.

Satisfiability I. We will now introduce Satisfiability (SAT), which, as we will see later, is one of the most important problems. Definition: A Boolean formula is a logical formula which consists of Boolean variables (0 = false, 1 = true) and the logical operations ¬ (NOT), ∨ (OR), and ∧ (AND), defined by the usual truth tables.

Satisfiability II. A given Boolean formula is satisfiable if there is a way to assign truth values (0 or 1) to the variables such that the final result is 1. Example: for a formula such as φ = (x₁ ∨ ¬x₂) ∧ x₂, the assignment x₁ = 1, x₂ = 1 makes φ true, and hence φ is satisfiable.

SAT. The SAT problem: determine whether an input Boolean formula is satisfiable. If a Boolean formula is satisfiable, it is a yes-input; otherwise, it is a no-input. Claim: SAT ∈ NP. Proof: the certificate consists of a particular 0/1 assignment to the variables. Given this assignment, we can evaluate a formula of length n (counting variables, operations, and parentheses) using at most n evaluations, each taking constant time. Hence, checking a certificate takes O(n) time, so SAT ∈ NP.
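
A sketch (mine) of the verification step for formulas in CNF; the encoding of clauses as lists of signed integers (+i for xᵢ, −i for ¬xᵢ) is my own convention, not the slides':

    def verify_cnf_certificate(clauses, assignment):
        """Check in linear time that `assignment` satisfies a CNF formula.
        clauses: list of clauses, each a list of literals; literal +i means x_i,
        literal -i means NOT x_i.  assignment: dict mapping i -> 0 or 1."""
        for clause in clauses:
            if not any(assignment[abs(lit)] == (1 if lit > 0 else 0) for lit in clause):
                return False            # this clause is not satisfied
        return True

    # phi = (x1 OR NOT x2) AND (x2 OR x3)
    phi = [[1, -2], [2, 3]]
    print(verify_cnf_certificate(phi, {1: 1, 2: 1, 3: 0}))  # True: a valid certificate
    print(verify_cnf_certificate(phi, {1: 0, 2: 1, 3: 0}))  # False: not a certificate for phi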

NP-Complete and NP-hard problems

Polynomial-Time Reductions. Definition: Let L₁ and L₂ be two decision problems. A polynomial-time reduction from L₁ to L₂ is a transformation f with the following two properties: (1) f transforms an input x for L₁ into an input f(x) for L₂ such that a yes-input of L₁ maps to a yes-input of L₂ and a no-input of L₁ maps to a no-input of L₂; (2) f(x) is computable in polynomial time (in size(x)). If such an f exists, we say that L₁ is polynomial-time reducible to L₂ and write L₁ ≤p L₂. (Slide figure: f maps the yes-inputs of L₁ into the yes-inputs of L₂ and the no-inputs of L₁ into the no-inputs of L₂.)

Polynomial-Time Reductions. Intuitively, L₁ ≤p L₂ means L₁ is no harder than L₂. Given an algorithm A₂ for the decision problem L₂, we can develop an algorithm A₁ to solve L₁: on an input x for L₁, apply the transformation f to obtain the input f(x) for L₂, run A₂ on f(x), and return its yes/no answer as the answer for L₁ on x. If A₂ is a polynomial-time algorithm, so is A₁. Theorem: if L₁ ≤p L₂ and L₂ ∈ P, then L₁ ∈ P.
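
A schematic sketch (mine) of this composition; the toy problems L₁ and L₂ below are placeholders chosen only to make the code runnable:

    def solver_via_reduction(f, solve_L2):
        """Given a reduction f from L1 to L2 and a decision procedure for L2,
        return a decision procedure for L1.  If both f and solve_L2 run in
        polynomial time, so does the returned procedure."""
        def solve_L1(x):
            return solve_L2(f(x))       # answer for L1 on x = answer for L2 on f(x)
        return solve_L1

    # Toy illustration: L1 = "is this integer even?", L2 = "does this bit string end in 0?",
    # with the reduction f mapping an integer to its binary representation.
    f = lambda m: format(m, "b")
    solve_L2 = lambda s: s.endswith("0")
    is_even = solver_via_reduction(f, solve_L2)
    print(is_even(10), is_even(7))      # True False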

Reduction between Decision Problems. Lemma (transitivity of the relation ≤p): if L₁ ≤p L₂ and L₂ ≤p L₃, then L₁ ≤p L₃. Proof: since L₁ ≤p L₂, there is a polynomial-time reduction f₁ from L₁ to L₂; similarly, since L₂ ≤p L₃, there is a polynomial-time reduction f₂ from L₂ to L₃. Note that f₁(x) can be calculated in time polynomial in size(x); in particular, this implies that size(f₁(x)) is polynomial in size(x). Hence f(x) = f₂(f₁(x)) can be calculated in time polynomial in size(x). Furthermore, x is a yes-input for L₁ if and only if f(x) is a yes-input for L₃ (why?). Thus the combined transformation f(x) = f₂(f₁(x)) is a polynomial-time reduction from L₁ to L₃, and hence L₁ ≤p L₃.

The Class NP-Complete (NPC). We have finally reached our goal of introducing the class NPC. Definition: the class NPC of NP-complete problems consists of all decision problems L such that (1) L ∈ NP, and (2) for every L′ ∈ NP, L′ ≤p L. Intuitively, NPC consists of the hardest problems in NP.

NP-Completeness and Its Properties. Let L be any problem in NPC. Theorem: (1) if there is a polynomial-time algorithm for L, then there is a polynomial-time algorithm for every L′ ∈ NP; (2) if there is no polynomial-time algorithm for L, then there is no polynomial-time algorithm for any L′ ∈ NPC. Proof: (1) by the definition of NPC, for every L′ ∈ NP we have L′ ≤p L; since L ∈ P, the earlier theorem on reductions gives L′ ∈ P. (2) follows immediately from (1).

NP-Completeness and Its Properties. According to the above theorem, either (1) all NP-Complete problems are polynomial-time solvable, or (2) no NP-Complete problem is polynomial-time solvable. This is the major reason we are interested in NP-Completeness.

The Classes P, NP, and NPC. Recall that P ⊆ NP. Question 1: is NPC ⊆ NP? Yes, by definition! Question 2: is P = NP? This is an open problem, and probably a very hard one; it is generally believed that P ≠ NP.

The Classes P, NP, and NPC. (Slide figure: a diagram of how the classes relate, with both P and NPC contained in NP.)

The Class NP-Complete (NPC). From the definition of NP-completeness, it appears impossible to prove that even one problem L is in NPC! By definition, it requires us to show that every L′ ∈ NP satisfies L′ ≤p L; but there are infinitely many problems in NP, so how can we argue that there exists a reduction from every L′ to L? To prove the first NP-complete problem, we have to use the definition of NP directly, and the simplicity of the Turing machine model helps again. Once we have proved the first NP-complete problem, the transitivity of the relation ≤p gives an easier way to show that a problem L is in NPC: prove (a) L ∈ NP, and (b) L′ ≤p L for some L′ ∈ NPC. Indeed, let L′′ be any problem in NP. Since L′ is NP-complete, L′′ ≤p L′; since L′ ≤p L, transitivity gives L′′ ≤p L.

Cook's Theorem (Cook-Levin Theorem). Theorem (Cook's Theorem): SAT ∈ NPC. Proof: see pp. 310-312.

Problem: CLIQUE. Definition (clique): a clique in an undirected graph G = (V, E) is a subset V′ ⊆ V of vertices such that each pair u, v ∈ V′ is connected by an edge (u, v) ∈ E. In other words, a clique is a complete subgraph of G. Example: a vertex is a clique of size 1, and an edge is a clique of size 2. (Slide figure: a 5-vertex graph containing a clique on 4 vertices.) CLIQUE: find a clique of maximum size in a graph.

NPC Problem: DCLIQUE. The decision clique problem DCLIQUE: given an undirected graph G and an integer k, determine whether G has a clique with k vertices. Theorem: DCLIQUE ∈ NPC. Proof: we need to show two things: (a) DCLIQUE ∈ NP, and (b) there is some L ∈ NPC such that L ≤p DCLIQUE.

Proof that DCLIQUE ∈ NPC. Claim (a): DCLIQUE ∈ NP. Proof: proving (a) is easy. A certificate is a set of vertices V′ ⊆ V with |V′| = k that is a claimed clique. To check that V′ is a clique, all that is needed is to check that every edge (u, v) with u ≠ v and u, v ∈ V′ is in E. This can be done in O(|V′|²) time if the edges are kept in an adjacency matrix (and it can also be done if they are kept in an adjacency list; how?).
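
A sketch (mine) of this certificate check, with the graph stored as an adjacency matrix:

    from itertools import combinations

    def is_clique_certificate(adj, certificate, k):
        """Verify that `certificate` is a set of k vertices forming a clique.
        adj: adjacency matrix (list of lists of 0/1); certificate: iterable of vertices."""
        vertices = set(certificate)
        if len(vertices) != k:
            return False
        # O(|V'|^2) pairwise checks, each a constant-time matrix lookup.
        return all(adj[u][v] for u, v in combinations(vertices, 2))

    # Graph on 4 vertices: {0, 1, 2} form a triangle, vertex 3 connects only to 2.
    adj = [[0, 1, 1, 0],
           [1, 0, 1, 0],
           [1, 1, 0, 1],
           [0, 0, 1, 0]]
    print(is_clique_certificate(adj, [0, 1, 2], 3))  # True: a valid certificate
    print(is_clique_certificate(adj, [1, 2, 3], 3))  # False: 1 and 3 are not adjacent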

Proof that DCLIQUE ∈ NPC (cont.). Claim (b): there is some L ∈ NPC such that L ≤p DCLIQUE. To prove (b) we will show that 3-SAT ≤p DCLIQUE: an algorithm transforms a 3-SAT instance C₁, C₂, ..., Cₙ into a DCLIQUE instance consisting of a graph G = (V, E) and an integer k. This is the hard part. We will do it by building a gadget that allows a reduction from the 3-SAT problem (on logical formulas) to the DCLIQUE problem (on graphs and integers).

Proof that DCLIQUE ∈ NPC (cont.). Recall that the input to 3-SAT is a logical formula φ of the form φ = C₁ ∧ C₂ ∧ ... ∧ Cₙ, where each clause Cᵢ is a triple of the form Cᵢ = yᵢ,₁ ∨ yᵢ,₂ ∨ yᵢ,₃ and each literal yᵢ,ⱼ is a variable or the negation of a variable. Example: C₁ = (x₁ ∨ ¬x₂ ∨ ¬x₃), C₂ = (¬x₁ ∨ x₂ ∨ x₃), C₃ = (x₁ ∨ x₂ ∨ x₃). We will define a polynomial transformation f from 3-SAT to DCLIQUE, f: φ ↦ (G, k), that builds a graph G and an integer k such that φ is a yes-input to 3-SAT if and only if (G, k) is a yes-input to DCLIQUE.

Proof that DCLIQUE ∈ NPC (cont.). Suppose that φ is a 3-SAT formula with n clauses, i.e., φ = C₁ ∧ C₂ ∧ ... ∧ Cₙ. We start by setting k = n. We now construct the graph G = (V, E). (1) For each clause Cᵢ = xᵢ,₁ ∨ xᵢ,₂ ∨ xᵢ,₃ we create three vertices v_1^i, v_2^i, v_3^i in V, so G has 3n vertices. We label these vertices with the corresponding literal (a variable or its negation); note that many vertices might share the same label. (2) We create an edge between vertices v_j^i and v_{j'}^{i'} if and only if the following two conditions hold: (a) v_j^i and v_{j'}^{i'} are in different triples, i.e., i ≠ i', and (b) v_j^i is not the negation of v_{j'}^{i'}. Note that the transformation maps every 3-SAT input to some DCLIQUE input; it does not require that every DCLIQUE input have a pre-image from a 3-SAT input.

Proof that DCLIQUE ∈ NPC (cont.). Example: φ = C₁ ∧ C₂ ∧ C₃ with C₁ = (x₁ ∨ ¬x₂ ∨ ¬x₃), C₂ = (¬x₁ ∨ x₂ ∨ x₃), C₃ = (x₁ ∨ x₂ ∨ x₃). (Slide figure: the 9-vertex graph built from these three clauses, with one vertex per literal and edges between consistent literals in different clauses.) Observe that the assignment x₁ = false, x₂ = false, x₃ = true satisfies φ (a yes-input for 3-SAT). This corresponds to the clique of size 3 comprising the ¬x₂ node in C₁, the x₃ node in C₂, and the x₃ node in C₃ (a yes-input for DCLIQUE).
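
A sketch (mine) of the transformation f on this example; literals are encoded as +i for xᵢ and −i for ¬xᵢ (my convention), and the brute-force clique search is there only to check the small example:

    from itertools import combinations

    def three_sat_to_dclique(clauses):
        """Reduction f: a 3-CNF formula (list of 3-literal clauses) -> (vertices, edges, k).
        Vertices are (clause index, literal); edges join consistent literals
        from different clauses."""
        vertices = [(i, lit) for i, clause in enumerate(clauses) for lit in clause]
        edges = {frozenset((u, v))
                 for u, v in combinations(vertices, 2)
                 if u[0] != v[0] and u[1] != -v[1]}
        return vertices, edges, len(clauses)

    def has_clique(vertices, edges, k):
        """Brute-force check (exponential, for illustration only)."""
        return any(all(frozenset((u, v)) in edges for u, v in combinations(subset, 2))
                   for subset in combinations(vertices, k))

    # phi = (x1 v -x2 v -x3) ^ (-x1 v x2 v x3) ^ (x1 v x2 v x3)
    phi = [[1, -2, -3], [-1, 2, 3], [1, 2, 3]]
    V, E, k = three_sat_to_dclique(phi)
    print(has_clique(V, E, k))   # True, matching the satisfying assignment above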

Proof that DCLIQUE ∈ NPC (cont.). Correctness: we claim that a 3-CNF formula φ with k clauses is satisfiable if and only if f(φ) = (G, k) has a clique of size k. (⇒) Suppose φ is satisfiable, and consider a satisfying truth assignment. Each of the k clauses has at least one true literal; select one such true literal from each clause. These true literals must be logically consistent with one another (i.e., for no i do both xᵢ and ¬xᵢ appear). Recall that in our construction of G we connect a pair of vertices exactly when they are in different clauses and are logically consistent. Thus, for every pair of the selected literals there is an edge in G connecting the corresponding vertices, and so these k vertices form a clique.

Proof that DCLIQUE ∈ NPC (cont.). (⇐) Suppose G has a clique of size k. There is no edge between vertices in the same clause, so each clause contributes exactly one vertex to the clique. Moreover, since the construction of G connects only logically consistent vertices by an edge, the literals labelling the clique's vertices are pairwise consistent. Hence we can set all of these literals to true, and this truth assignment (extended arbitrarily to any remaining variables) satisfies φ, so φ is satisfiable.

Proof that DCLIQUE ∈ NPC (cont.). Note that the graph G has 3k vertices and at most 3k(3k − 1)/2 edges, and can be built in O(k²) time, so f is a polynomial-time reduction. We have therefore proven that 3-SAT ≤p DCLIQUE. Since we already know that 3-SAT ∈ NPC and have seen that DCLIQUE ∈ NP, we have proven that DCLIQUE ∈ NPC.

NP-Hard Problems. Definition: a problem L is NP-hard if some problem in NPC can be polynomially reduced to it (but L itself does not need to be in NP). In general, the optimization versions of NP-Complete problems are NP-Hard. Example: VC: given an undirected graph G, find a minimum-size vertex cover. DVC: given an undirected graph G and k, is there a vertex cover of size k? If we can solve the optimization problem VC, we can easily solve the decision problem DVC: simply run VC on the graph G to find a minimum vertex cover S; then, given (G, k), solve DVC(G, k) by checking whether k ≥ |S|. If k ≥ |S|, answer yes; if not, answer no.

Epilogue: How to Deal with Hard Problems. Heuristics: all the hardness results (undecidability, NP-hardness) hold for any algorithm that solves the problem in general (worst-case analysis), but there are many efficient algorithms that solve these problems in typical cases. Some run fast on typical inputs and find optimal solutions (though they may be slow on some contrived inputs); others run fast on all inputs and typically find near-optimal solutions (though they may return bad solutions on some contrived inputs). Approximation algorithms: the hardness results show that finding optimal solutions is difficult, but there are efficient algorithms for finding solutions that are at most c times worse than the optimal ones. Average-case analysis: by assuming the input follows some distribution, it is possible to design algorithms whose running time is good on average.