Computational Complexity

- Problems, instances and algorithms
- Running time vs. computational complexity
- General description of the theory of NP-completeness
- Problem samples

Computational Complexity

What is computational complexity all about? Tractability vs. intractability of problems.

Particular topics:
- Turing machines, deterministic and non-deterministic
- Complexity classes P, NP, co-NP, #P, and PSPACE
- NP-completeness, NP-hardness, #P-completeness, PSPACE-completeness
- Special cases and sub-problems
- Approximation algorithms (e.g., heuristics) and performance bounds

Computational Complexity

What does it mean to say that a problem is intractable? There are a couple of different notions of intractability:
- Undecidable: no algorithm exists for the problem.
- Decidable: there is an algorithm, but the only ones we know require exponential time to compute a solution.

Here, we will focus on the latter category.

The Traveling Salesman Optimization Problem

Recall that a problem is a general question to be answered. A problem consists of:
- Some number of parameters (a generic instance)
- A statement of what properties a solution possesses

An example of a problem:

TRAVELING SALESMAN OPTIMIZATION
INSTANCE: Set $C$ of $m$ cities, distance $d(c_i, c_j) \in Z^+$ for each pair of cities $c_i, c_j \in C$.
GOAL: Find a tour of $C$ (i.e., a permutation $\langle c_{\pi(1)}, c_{\pi(2)}, \ldots, c_{\pi(m)} \rangle$ of $C$) having minimum total length.

Note the format!

TSP Optimization Instance

A problem instance is a collection of specific values for all of a problem's parameters.

A TSP instance:
$C = \{c_1, c_2, c_3, c_4\}$
$d(c_1, c_2) = 10$, $d(c_1, c_3) = 5$, $d(c_1, c_4) = 9$
$d(c_2, c_3) = 6$, $d(c_2, c_4) = 9$, $d(c_3, c_4) = 3$

(Figure: the four cities drawn as a complete graph with the edge lengths listed above.)

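The four-city instance is small enough to solve by exhaustive search. The following is a minimal sketch, not an efficient method: the names `DIST`, `d`, `tour_length`, and `shortest_tour` are illustrative choices, and the distances are assumed to be symmetric as in the instance above.

```python
from itertools import permutations

# Pairwise distances of the four-city instance (assumed symmetric).
DIST = {
    ("c1", "c2"): 10, ("c1", "c3"): 5, ("c1", "c4"): 9,
    ("c2", "c3"): 6, ("c2", "c4"): 9, ("c3", "c4"): 3,
}

def d(a, b):
    """Look up the symmetric distance between two cities."""
    return DIST[(a, b)] if (a, b) in DIST else DIST[(b, a)]

def tour_length(tour):
    """Sum of consecutive legs plus the closing leg back to the start."""
    return sum(d(tour[i], tour[(i + 1) % len(tour)]) for i in range(len(tour)))

def shortest_tour(cities):
    """Exhaustive search: fix the first city and try every ordering of the rest."""
    first, rest = cities[0], cities[1:]
    return min(((first,) + p for p in permutations(rest)), key=tour_length)

best = shortest_tour(("c1", "c2", "c3", "c4"))
print(best, tour_length(best))  # a minimum-length tour has total length 27
```

The search examines (m-1)! orderings, which is why this approach stops being usable long before m gets large.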
Problems, Instances and Algorithms

Let $\Pi$ denote a problem. The parameters for $\Pi$ define a multi-dimensional data space (or collection) of instances referred to as $D_\Pi$. Each point in this space represents one specific instance.

Problems, Instances and Algorithms

Our definition of a problem is very general, and contains many useless problems:

SILLY INTEGER COMPUTATION
INSTANCE: Positive integer $B$.
GOAL: Compute the largest prime number less than 1000.

The question to be asked is usually in terms of the instance parameters.

Optimization vs. Decision Problems

Many (natural) problems of interest are optimization problems (minimization, maximization).

Although not as natural on the surface, the theory will focus on decision problems, which are problems that have yes or no answers.

A decision problem $\Pi$ consists of two parts:
- A list of parameters (i.e., a generic instance); this defines a set $D_\Pi$ of instances.
- A yes/no question asked in terms of the parameters; this specifies a subset of yes-instances $Y_\Pi \subseteq D_\Pi$.

The Traveling Salesman Decision Problem (TSP)

TRAVELING SALESMAN
INSTANCE: Set $C$ of $m$ cities, distance $d(c_i, c_j) \in Z^+$ for each pair of cities $c_i, c_j \in C$, and a positive integer $B$.
QUESTION: Is there a tour of $C$ having length $B$ or less, i.e., a permutation $\langle c_{\pi(1)}, c_{\pi(2)}, \ldots, c_{\pi(m)} \rangle$ of $C$ such that

$$\sum_{i=1}^{m-1} d(c_{\pi(i)}, c_{\pi(i+1)}) + d(c_{\pi(m)}, c_{\pi(1)}) \le B\,?$$

TSP Instance

A TSP instance (decision version):
$C = \{c_1, c_2, c_3, c_4\}$
$d(c_1, c_2) = 10$, $d(c_1, c_3) = 5$, $d(c_1, c_4) = 9$
$d(c_2, c_3) = 6$, $d(c_2, c_4) = 9$, $d(c_3, c_4) = 3$
$B = 27$

(Figure: the four cities drawn as a complete graph with the edge lengths listed above.)

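For the decision version, the answer is just yes or no. The sketch below checks the instance above by brute force; `LEG` and `has_tour_within` are hypothetical names, and symmetric distances are assumed.

```python
from itertools import permutations

# Distances of the four-city instance, keyed by unordered city pairs.
LEG = {
    frozenset(("c1", "c2")): 10, frozenset(("c1", "c3")): 5,
    frozenset(("c1", "c4")): 9, frozenset(("c2", "c3")): 6,
    frozenset(("c2", "c4")): 9, frozenset(("c3", "c4")): 3,
}

def has_tour_within(cities, leg, bound):
    """Return True if some tour of all cities has total length <= bound."""
    first, rest = cities[0], cities[1:]
    for perm in permutations(rest):
        tour = (first,) + perm
        # Consecutive legs plus the closing leg back to the first city.
        length = sum(leg[frozenset((tour[i], tour[(i + 1) % len(tour)]))]
                     for i in range(len(tour)))
        if length <= bound:
            return True
    return False

CITIES = ("c1", "c2", "c3", "c4")
print(has_tour_within(CITIES, LEG, 27))  # a yes-instance: the tour c1, c2, c4, c3 has length 27
print(has_tour_within(CITIES, LEG, 26))  # a no-instance: every tour has length at least 27
```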
Optimization vs. Decision Problems

Why decision problems?

Convenience:
- defining the classes of problems P and NP is easier
- the proofs of NP-completeness are easier
- unreasonably large output does not affect running time or complexity

No loss of generality; results extend to optimization problems. More specifically, most optimization problems can be converted to a decision problem by adding an additional parameter $B$. The complexity of an optimization problem is typically equivalent to that of a corresponding decision problem.

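One direction of that equivalence can be sketched with binary search on the bound B: if the decision version can be answered, the optimal value follows after a number of yes/no queries that is logarithmic in the range of possible values. In this sketch the oracle `tsp_decide` is a brute-force stand-in, and all names are hypothetical; the data is the four-city TSP instance used earlier.

```python
from itertools import permutations

def tsp_decide(dist, cities, bound):
    """Decision oracle (brute-force stand-in): is there a tour of length <= bound?"""
    first, rest = cities[0], cities[1:]
    for perm in permutations(rest):
        tour = (first,) + perm
        length = sum(dist[frozenset((tour[i], tour[(i + 1) % len(tour)]))]
                     for i in range(len(tour)))
        if length <= bound:
            return True
    return False

def optimum_via_decision(decide, lo, hi):
    """Smallest B in [lo, hi] with decide(B) true, assuming decide is monotone in B."""
    while lo < hi:
        mid = (lo + hi) // 2
        if decide(mid):
            hi = mid        # a tour of length <= mid exists; try a smaller bound
        else:
            lo = mid + 1    # no such tour; the optimum is larger than mid
    return lo

DIST = {
    frozenset(("c1", "c2")): 10, frozenset(("c1", "c3")): 5,
    frozenset(("c1", "c4")): 9, frozenset(("c2", "c3")): 6,
    frozenset(("c2", "c4")): 9, frozenset(("c3", "c4")): 3,
}
CITIES = ("c1", "c2", "c3", "c4")

# No tour can be longer than m times the largest distance, so that is a safe upper bound.
upper = len(CITIES) * max(DIST.values())
optimum = optimum_via_decision(lambda b: tsp_decide(DIST, CITIES, b), 0, upper)
print(optimum)
```

Because the distances are written in binary, the number of oracle calls is polynomial in the size of the instance, so a polynomial time decision algorithm would give a polynomial time way to compute the optimal tour length.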
Running Time vs. Complexity

Recall the distinction between the running time of a specific algorithm vs. the computational complexity of a particular problem.

Example:

MATRIX MULTIPLICATION
INSTANCE: Two $n \times n$ matrices $A$ and $B$.
SOLUTION: One $n \times n$ matrix $C = A \times B$.

Running times of specific algorithms:
- Simple row/column algorithm: $O(n^3)$
- Strassen's algorithm: $O(n^{2.807})$
- Coppersmith-Winograd algorithm (as refined by Le Gall): $O(n^{2.3728639})$

Running Time vs. Complexity, Cont.

The (inherent) computational complexity of matrix multiplication:

Any algorithm for matrix multiplication requires $\Omega(n^2)$ time, i.e., $O(n^2)$ is the best any algorithm could possibly do (this is an information theoretic argument).

Running Time vs. Complexity

Example:

INTEGER SORTING
INSTANCE: List of $n$ integers.
SOLUTION: The list of integers in non-decreasing order.

Running times of specific algorithms:
- Real dumb algorithm: $O(n^3)$
- Bubble sort: $O(n^2)$
- Merge sort: $O(n \log n)$

The (inherent) computational complexity of sorting:

Any comparison-based sorting algorithm requires $\Omega(n \log n)$ operations in the worst case, i.e., $O(n \log n)$ is the best any comparison-based algorithm could possibly do.

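The lower bound for comparison-based sorting comes from the standard decision-tree argument, sketched here for distinct keys. The symbol $h$ (the height of the decision tree, i.e., the worst-case number of comparisons) is introduced only for this sketch.

```latex
% A comparison sort on n distinct keys must distinguish all n! input orderings,
% so its binary decision tree has at least n! leaves, and its height h satisfies 2^h >= n!.
\begin{align*}
h \;\ge\; \log_2(n!)
  \;=\; \sum_{k=1}^{n} \log_2 k
  \;\ge\; \sum_{k=\lceil n/2 \rceil}^{n} \log_2 k
  \;\ge\; \frac{n}{2}\,\log_2\frac{n}{2}
  \;=\; \Omega(n \log n).
\end{align*}
```

The last inequality uses the fact that the truncated sum has at least $n/2$ terms, each at least $\log_2(n/2)$.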
The Satisfiability Problem (SAT)

A very important problem in the theory of NP-completeness is the satisfiability problem.

SATISFIABILITY
INSTANCE: Set $U$ of variables and a collection $C$ of clauses over $U$.
QUESTION: Is there a satisfying truth assignment for $C$?

Example #1:
$U = \{u_1, u_2\}$
$C = \{\{u_1, \bar{u}_2\}, \{\bar{u}_1, u_2\}\}$
Answer is yes: satisfiable by setting both variables to T.

Satisfiability, Cont.

Example #2:
$U = \{u_1, u_2\}$
$C = \{\{u_1, u_2\}, \{u_1, \bar{u}_2\}, \{\bar{u}_1\}\}$
Answer is no: the clause $\{\bar{u}_1\}$ forces $u_1$ = F, after which the first clause requires $u_2$ = T and the second requires $u_2$ = F.

Satisfiability, Cont.

What would be a simple algorithm for SAT?
- Build a truth table.
- Running time would be (at least) $O(n 2^m)$, where $m$ is the number of variables and $n$ is the length of the expression.

Is a more efficient algorithm possible? Probably.

How about one with polynomial running time? Come see me if you find one! A live white turkey and a Stanford job await.

SAT was the first problem proven to be NP-complete.

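The truth-table algorithm can be sketched in a few lines. The clause encoding (a literal is a pair of variable name and required truth value) and the function name `satisfiable` are assumptions made for this sketch. The two test instances are the standard textbook pair, with negated literals assumed as written in the comments.

```python
from itertools import product

def satisfiable(variables, clauses):
    """Try all 2^m assignments; return a satisfying one as a dict, or None."""
    for values in product([True, False], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        # A clause is satisfied when at least one of its literals is true.
        if all(any(assignment[var] == want for var, want in clause)
               for clause in clauses):
            return assignment
    return None

U = ["u1", "u2"]

# Assumed instance 1: {u1, not u2}, {not u1, u2}
example1 = [[("u1", True), ("u2", False)], [("u1", False), ("u2", True)]]

# Assumed instance 2: {u1, u2}, {u1, not u2}, {not u1}
example2 = [[("u1", True), ("u2", True)], [("u1", True), ("u2", False)],
            [("u1", False)]]

print(satisfiable(U, example1))  # an assignment with both variables True
print(satisfiable(U, example2))  # None: no assignment satisfies all three clauses
```

Each of the $2^m$ rows costs time proportional to the length of the expression, which matches the $O(n 2^m)$ estimate.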
More Sample Problems

CLIQUE
INSTANCE: A graph $G = (V, E)$ and a positive integer $J \le |V|$.
QUESTION: Does $G$ contain a clique of size $J$ or more?

GRAPH K-COLORABILITY
INSTANCE: A graph $G = (V, E)$ and a positive integer $K \le |V|$.
QUESTION: Is the graph $G$ $K$-colorable?

These can similarly be solved in exponential time, but no one has ever found a polynomial time algorithm for either of them. These problems are also NP-complete.

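As an illustration of the exponential-time approach for CLIQUE, the sketch below tries every set of J vertices. The function name `has_clique` and the small example graph are hypothetical. Since any clique larger than J contains a clique of size exactly J, checking size-J subsets is enough.

```python
from itertools import combinations

def has_clique(vertices, edges, j):
    """Return True if the graph contains a clique of size j (hence of size j or more)."""
    edge_set = {frozenset(e) for e in edges}
    for subset in combinations(vertices, j):
        # A clique needs an edge between every pair of chosen vertices.
        if all(frozenset(pair) in edge_set for pair in combinations(subset, 2)):
            return True
    return False

# Hypothetical example graph: a triangle a-b-c with a pendant vertex d attached to c.
V = ["a", "b", "c", "d"]
E = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]

print(has_clique(V, E, 3))  # True: {a, b, c} is a clique
print(has_clique(V, E, 4))  # False: d is adjacent only to c
```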
General Points

We are interested in the border between exponential and polynomial: given a problem, is there a polynomial time algorithm for it, or are all algorithms for it exponential in running time?

We are not interested in what the specific polynomial or exponential is, per se, although the theory can be modified/refined to consider these.

=> Simplistically and inaccurately speaking, saying that a problem is NP-complete or NP-hard is essentially saying that there is no (deterministic) polynomial time algorithm for that problem.

General Points, Cont.

Polynomial time does not necessarily imply practical:
- $O(n^{1000})$
- $O(n^2)$ could be $10{,}000{,}000\,n^2$

NP-complete/NP-hard does not necessarily imply that there aren't useful, practical algorithms:
- An algorithm could have worst-case running time $O(2^n)$ because of some small number of cases, but $O(n^2)$ on average.
- Simplex algorithm for linear programming
- Branch-and-bound algorithm for the knapsack problem
- Even an exponential running time isn't all that bad if the exponent grows slowly enough.

General Points, Cont.

Proving a problem is NP-complete or NP-hard is just the beginning:
- Heuristic development and analysis (the problem doesn't go away)
- Special cases of the problem may be solvable in polynomial time
- Sub-exponential time algorithms may exist

General Description of the Theory

NP consists of those decision problems that can be solved in Nondeterministic Polynomial time.

Holy cow! What is that, and how could it possibly be important?

Put simply, NP is a big set of many very common and useful problems.

An important fact is that all of these problems can be solved in (deterministic) exponential time, i.e., there is an exponential time algorithm to solve them.

General Description of the Theory

P consists of those problems from NP that can (also) be solved in deterministic polynomial time, so $P \subseteq NP$.

P is also a very big set of common and useful problems, but these we know can be solved in deterministic polynomial time.

General Description of the Theory

So now we know two things:
- All problems in NP can be solved in exponential time.
- All problems in P can be solved in polynomial time (as well as exponential time).

Question: what would it mean for a problem to be in NP - P?

This would be a problem that could be solved in exponential time, but not in polynomial time, i.e., it would be a hard problem.

General Description of the Theory

Do any such problems exist, in NP - P? Nobody knows.

This is actually the big question: is $P \ne NP$, or is $P = NP$?

The answer to this question appears to be $P \ne NP$: there are problems in NP for which no (deterministic) polynomial time algorithm is known, and it is widely believed that none exists.

More Sample Problems

DIVISIBILITY BY 2
INSTANCE: Integer $k$.
QUESTION: Is $k$ even?

CLIQUE
INSTANCE: A graph $G = (V, E)$ and a positive integer $J \le |V|$.
QUESTION: Does $G$ contain a clique of size $J$ or more?

KNAPSACK
INSTANCE: A finite set $U$, a size $s(u) \in Z^+$ and a value $v(u) \in Z^+$ for each $u \in U$, a size constraint $B \in Z^+$, and a value goal $K \in Z^+$.
QUESTION: Is there a subset $U' \subseteq U$ such that

$$\sum_{u \in U'} s(u) \le B \quad \text{and} \quad \sum_{u \in U'} v(u) \ge K\,?$$

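The KNAPSACK question can be answered by exhaustive search over all subsets, as in the sketch below. The function name `knapsack_decide` and the three-item instance are hypothetical and serve only to illustrate the yes/no question.

```python
from itertools import combinations

def knapsack_decide(items, size_limit, value_goal):
    """items maps each element to (size, value). Return True if some subset
    has total size <= size_limit and total value >= value_goal."""
    names = list(items)
    for r in range(len(names) + 1):
        for subset in combinations(names, r):
            total_size = sum(items[u][0] for u in subset)
            total_value = sum(items[u][1] for u in subset)
            if total_size <= size_limit and total_value >= value_goal:
                return True
    return False

# Hypothetical instance: element -> (size, value)
ITEMS = {"a": (3, 4), "b": (4, 5), "c": (2, 3)}

print(knapsack_decide(ITEMS, 5, 7))  # True: {a, c} has size 5 and value 7
print(knapsack_decide(ITEMS, 5, 8))  # False: no subset of size <= 5 reaches value 8
```

The loop visits all $2^{|U|}$ subsets, so this is an exponential time algorithm in the sense used throughout these slides.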
General Description of the Theory

There is another subset of problems in NP called NP-complete.

(Diagram: NP drawn as a large region containing two disjoint regions, P and NP-complete.)

The above diagram implies several relationships:
- P and NP-complete are subsets of NP (fact)
- P and NP-complete are proper subsets of NP (unproven, widely believed)
- P and NP-complete do not intersect (unproven, widely believed)

Why is this set NP-complete important?

Facts about NP-complete Problems

Some basic facts about NP-complete problems. Suppose $\Pi$ is an NP-complete problem.

Fact #1: $\Pi$ can be solved in exponential time.

Fact #2: There are no known polynomial time algorithms for $\Pi$; all known algorithms require exponential time, e.g., exhaustive search.

Fact #3: It is not known for certain whether $\Pi$ requires exponential time or not. All NP-complete problems appear to require exponential time, but only because no polynomial time algorithm has been found for any of them.

Fact #4: If $\Pi \in P$ then $P = NP$. No such NP-complete problem has ever been identified.

Facts about NP-complete Problems

If a problem is NP-complete it is a big deal because:
- it is unlikely there is a polynomial-time algorithm for it
- if there were, everything in NP could be solved in polynomial time!

Because of this, it is frequently said that NP-complete problems are the hardest problems in NP.

Since the set of NP-complete problems contains many very practical problems that people have tried (and failed) to come up with polynomial time algorithms for, it is highly unlikely that any NP-complete problem can be solved in polynomial time.

Facts about NP-complete Problems

Given a problem $\Pi$, we would like to know if $\Pi \in P$ or $\Pi$ is NP-complete.

How do we show a problem is in NP?
- Develop a non-deterministic polynomial time algorithm for it (BTSOTC)

If we know a problem is in NP, how do we show it is also in P?
- Come up with a (deterministic) polynomial time algorithm for it (BTDT)

If we can't find a polynomial time algorithm for it, how do we show a problem is NP-complete?
- Use a proof technique called a deterministic polynomial time transformation to the problem from a known NP-complete problem (BTSOTC)

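One common way to think about a non-deterministic polynomial time algorithm is "guess and check": guess a candidate solution (a certificate), then check it deterministically in polynomial time. The sketch below shows only the checking half for the TSP decision problem, using the four-city instance with B = 27 from earlier. The name `verify_tour` and the data layout are assumptions of this sketch.

```python
def verify_tour(cities, dist, bound, tour):
    """Polynomial time check of a certificate: is `tour` a permutation of
    `cities` whose total length (including the closing leg) is <= bound?"""
    if sorted(tour) != sorted(cities):
        return False  # not a permutation of the city set
    length = sum(dist[frozenset((tour[i], tour[(i + 1) % len(tour)]))]
                 for i in range(len(tour)))
    return length <= bound

DIST = {
    frozenset(("c1", "c2")): 10, frozenset(("c1", "c3")): 5,
    frozenset(("c1", "c4")): 9, frozenset(("c2", "c3")): 6,
    frozenset(("c2", "c4")): 9, frozenset(("c3", "c4")): 3,
}
CITIES = ["c1", "c2", "c3", "c4"]

print(verify_tour(CITIES, DIST, 27, ["c1", "c2", "c4", "c3"]))  # True: length 27
print(verify_tour(CITIES, DIST, 27, ["c1", "c2", "c3", "c4"]))  # False: length 28
print(verify_tour(CITIES, DIST, 27, ["c1", "c2", "c2", "c3"]))  # False: not a permutation
```

Checking a single certificate is fast; the difficulty lies in the number of possible certificates, which is what the non-deterministic "guess" abstracts away.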
Problems, Instances and Algorithms

And, by the way:

An algorithm is a general, step-by-step procedure for solving a specific problem, e.g., a computer program.

An algorithm is said to solve a problem if that algorithm can be applied to any instance of the problem and is guaranteed to always produce a solution for that instance.