2.2. CLASSES OF COMPUTATIONAL COMPLEXITY

An optimization problem is defined as a class of similar problems with different input parameters. Each individual case with fixed parameter values is called an instance of the problem. For example, the input for an instance of the traveling salesman problem (TSP) is given by the number of vertices and the distances between the vertices.

The computation time required to obtain a solution depends on
- the problem
- the size of the instance
- the parameter values of the instance
- the algorithm
- the computer and the implementation of the algorithm

In complexity theory, algorithms and problems are classified by the order of growth of the computation time as a function of the instance size.

Definition 2.1. The worst case time complexity, or simply the complexity, of an algorithm A is the time $T_A(n)$ that is needed at most for solving a problem instance of size n.

An exact number is often impossible to determine; instead, an asymptotic upper bound is used.

Definition 2.2. The order of growth of a function $f(n)$ is $O(g(n))$ if there are positive constants a and N such that $f(n) \le a\,g(n)$ for every $n \ge N$.

Examples:
A k-th order polynomial $f_1(n) = a_0 + a_1 n + a_2 n^2 + \dots + a_k n^k$ ($k \in \mathbb{Z}_+$) is of order $f_1(n) = O(n^k)$.
The function $f_2(n) = 2n^3 + 10^6\, n \log n$ is of order $f_2(n) = O(n^3)$.

The symbol O is used for an upper bound on the complexity $T_A(n)$.
Notation: $T_A(n) = O(g(n))$ means that the complexity of algorithm A is of order $O(g(n))$.

Definition 2.3. Algorithm A is polynomial if its complexity is $T_A(n) = O(n^k)$ for some constant k. Algorithm A is exponential if it is not polynomial.

Examples of orders of exponential algorithms: $O(n^{\log n})$, $O(2^n)$, $O(k^n)$ where $k > 1$, $O(n!)$ and $O(n^n)$. Some sources use the term superpolynomial instead of exponential.

Algorithms with a logarithmic order of complexity $T_A(n) = O(\log_k n)$ have a lower than polynomial order of growth, but the most important difference is between the polynomial and exponential classes.
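Definition 2.2 can be made concrete by exhibiting constants a and N for the example $f_2(n) = 2n^3 + 10^6\, n \log n$. The following Python sketch is only an illustration: the helper names `f2`, `g` and `bound_holds` are hypothetical, the logarithm is assumed to be base 2, and a check over a finite range is a sanity test, not a proof of the bound.

```python
import math


def f2(n: int) -> float:
    """The example function f2(n) = 2 n^3 + 10^6 n log2(n)."""
    return 2 * n**3 + 10**6 * n * math.log2(n)


def g(n: int) -> int:
    """The comparison function g(n) = n^3."""
    return n**3


def bound_holds(a: float, N: int, n_max: int) -> bool:
    """Check f2(n) <= a * g(n) for every integer n with N <= n <= n_max."""
    return all(f2(n) <= a * g(n) for n in range(N, n_max + 1))


# A generous constant works from N = 1 on, because log2(n) <= n^2 for n >= 1.
print(bound_holds(a=10**6 + 2, N=1, n_max=5000))
# A small constant also works, but only from a larger N on.
print(bound_holds(a=3, N=4000, n_max=20000))
# The same small constant fails if N is chosen too small.
print(bound_holds(a=3, N=1000, n_max=2000))
```

The first two calls print True and the third prints False: the constants a and N can be traded against each other, and the definition only requires that some pair exists.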
Polynomial algorithms can be combined, concatenated or nested, and the resulting algorithm is still polynomial. For example, when a polynomial algorithm A uses a polynomial algorithm B as a subroutine so that the number of times B is invoked is bounded by a polynomial, then A is polynomial.

Complexity was defined as a function of the size of the problem. The size of the problem is the number of symbols needed to present the problem input, i.e. the length of the input string. This size is not unambiguous: it depends on the encoding scheme of the problem (e.g. there are different ways to represent a graph). The encoding should be as concise (short) as possible. As with computation time, only the order of the size is relevant here. Usually the sizes of all "reasonable", nonredundant encodings are polynomially related. Often the size is a polynomial function of a few problem
parameters, e.g. the number of vertices and/or edges in a graph, the number of variables, or the matrix dimensions. Then the complexity can be given as a function of these parameters only.

COMPARISON OF SOME COMPLEXITY CLASSES

Consider the order of growth of the following functions ($\log = \log_2$):

f(n)            n=10        n=100                   n=1000
$n$             10          100                     1000
$n \log n$      33          664                     9966
$n^3$           $10^3$      $10^6$                  $10^9$
$10^6 n^8$      $10^{14}$   $10^{22}$               $10^{30}$
$n^{\log n}$    2099        $1.93 \cdot 10^{13}$    $7.89 \cdot 10^{29}$
$2^n$           1024        $1.27 \cdot 10^{30}$    $1.07 \cdot 10^{301}$
$n!$            3628800     $10^{158}$              $4 \cdot 10^{2567}$

Assume that the computer can execute $10^6$ operations per second. The following table shows the maximum size n of problems solvable in 1 second and in $10^8$ seconds (about 3 years) by algorithms having the computation time shown on the left:

f(n)            1 s               $10^8$ s (about 3 years)
$n$             $10^6$            $10^{14}$
$n \log n$      $6 \cdot 10^4$    $2 \cdot 10^{12}$
$n^3$           100               46415
$10^6 n^8$      1                 10
$n^{\log n}$    22                112
$2^n$           19                46
$n!$            9                 16

Example 2.1. Compare algorithms of logarithmic, polynomial and exponential complexity: a problem can be solved by three different algorithms $A_1$, $A_2$ and $A_3$. Suppose an instance of size n=10 can be solved by the $O(\log n)$-algorithm $A_1$ in one hour, by the $O(n^5)$-algorithm $A_2$ in one minute and by the $O(2^n)$-algorithm $A_3$ in one second. If the size of an instance is n=100, what upper bounds can be given for the execution time?

Answer: Assuming the running times scale exactly as the stated orders, the times are multiplied by $\log 100 / \log 10 = 2$, $(100/10)^5 = 10^5$ and $2^{100}/2^{10} = 2^{90}$, respectively. The logarithmic algorithm $A_1$ will solve the problem in 2 hours and the polynomial algorithm $A_2$ in $10^5$ minutes, i.e. about 69.4 days, but the exponential algorithm $A_3$ may need $2^{90}$ seconds, which is on the order of $10^{19}$ years.

Is it always advisable to choose a polynomial algorithm instead of an exponential one, when possible? Is an algorithm with complexity $T(n) = O(10^6 n^8)$ preferred over one with $T(n) = O(1.2^n)$? Not necessarily, because
- the upper bounds may not be tight
- worst cases may be rare
- the implementation of the method may have a significant influence on the computation time
- the problem size may be moderate enough

For instance, a linear programming problem (LP) can be solved either with the simplex algorithm, which is exponential in the worst case, or with Karmarkar's polynomial interior point method. The computer codes for simplex have been developed for decades and they work quite well in most cases, even for large problems. It seems that the hard instances are rare and the average computation time does not show exponential growth.

Optimization problems can now be classified according to their complexity in a similar way. Because a problem can be solved with several algorithms, we seek the most efficient algorithm in this definition:

Definition 2.4. The complexity of an optimization problem X is polynomial if there is a polynomial algorithm that solves problem X.

To show that a given problem is polynomial, we need to define the algorithm, verify that it gives the right solution to every instance of the problem and show that the computation time for instances of size n is always $O(n^k)$ for some constant k.

Sometimes an average case complexity would be a better measure of computational effort than the worst case complexity, but then a distribution of problems or solution times would be needed.

TYPES OF PROBLEMS

Complexity theory was originally developed for decision or recognition problems. For these, the problem is to decide whether or not an instance (encoded as a string of symbols) satisfies the given statement or condition. In other words, the task is to decide / recognize whether the instance belongs to the set of yes instances of the problem.

Examples of decision or recognition problems:
1) Input: Integer M.
   Question: Is M a prime number?
2) Input: n integers $x_1, x_2, \dots, x_n$.
   Question: Can the numbers be partitioned into two sets such that the sums of the integers in both sets are the same? (Partition problem)
3) Input: Graph $G=(V,E)$.
   Question: Does the graph G have a Hamiltonian path (i.e. a path that passes through every vertex exactly once)?
4) Input: Number of vertices n, distances $c_{ij}$ between all vertices, integer L.
   Question: Does the graph contain a cycle through all the vertices with total length $\le L$?
5) Input: n objects with weights $w_i$, $i=1,\dots,n$, weight capacity W of a container, number of containers k.
   Question: Can the objects be placed in the containers within the capacity limit?
6) Input: Graph $G=(V,E)$, integer k.
   Question: Does the graph contain a clique of k vertices?
7) Input: $m \times n$ matrix of integers A, $m \times 1$ vector of integers b.
   Question: Is there an integer (or binary) vector $x = (x_1, \dots, x_n)^T$ such that $Ax = b$?

Many of these problems have an optimization counterpart: Problem 4 is a decision problem related to the traveling salesman problem, problem 5 to the bin packing problem, problem 6 to the maximum clique problem and problem 7 to the integer programming problem.

For every combinatorial optimization problem
   min $f(x)$ subject to $x \in S$,
three different versions of the problem can be stated:
(1) Optimization problem: Find a feasible solution $x^*$ that minimizes the objective function f.
(2) Evaluation problem: Find the minimum value $f^*$ of the objective function.
(3) Decision problem, recognition problem: Is there a feasible solution $x \in S$ such that $f(x) \le L$, where L is a given threshold value?

The first is the hardest of the three and the third is the easiest in the following sense: if the optimization problem is solved, then the solution to the evaluation problem can be calculated and the answer to the decision/recognition problem is obvious. But is there any real difference in the difficulty of these problems?

COMPLEXITY CLASSES P AND NP

Definition 2.5. The class P consists of those decision problems that can be solved with a polynomial algorithm. In short, P = the set of all polynomially solvable decision problems.

The solution to a recognition problem is either yes or no. If the solution is yes, the algorithm is said to accept the given input. The problem belongs to the class P if and only if the following condition holds: there is a deterministic Turing machine and a polynomial p such that, when an instance of size n is given as its input, the machine halts within $p(n)$ steps, and it halts in an accepting state (yes state) if and only if the answer for the instance is yes.
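As an illustration of how the optimization, evaluation and decision versions relate, and of a decision problem in P, consider the shortest path problem. The sketch below is not taken from the text: it assumes nonnegative edge lengths and uses Dijkstra's algorithm as the polynomial algorithm, and the names `shortest_path`, `shortest_length` and `has_path_within` as well as the small graph are hypothetical.

```python
import heapq
from typing import Dict, List, Optional, Tuple

# Adjacency lists: vertex -> list of (neighbor, nonnegative edge length).
Graph = Dict[str, List[Tuple[str, float]]]


def shortest_path(graph: Graph, source: str, target: str) -> Optional[Tuple[float, List[str]]]:
    """Optimization version: return (length, path) of a shortest path, or None."""
    dist = {source: 0.0}
    prev: Dict[str, str] = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        if u == target:
            break  # the distance of the target is final
        for v, w in graph.get(u, []):
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path


def shortest_length(graph: Graph, source: str, target: str) -> Optional[float]:
    """Evaluation version: only the optimal objective value f*."""
    result = shortest_path(graph, source, target)
    return None if result is None else result[0]


def has_path_within(graph: Graph, source: str, target: str, L: float) -> bool:
    """Decision version: is there a path of length <= L?"""
    best = shortest_length(graph, source, target)
    return best is not None and best <= L


example: Graph = {
    "a": [("b", 2), ("c", 5)],
    "b": [("c", 1), ("d", 7)],
    "c": [("d", 2)],
}
print(shortest_path(example, "a", "d"))       # (5.0, ['a', 'b', 'c', 'd'])
print(has_path_within(example, "a", "d", 5))  # True
print(has_path_within(example, "a", "d", 4))  # False
```

Once the optimization version is solved, the evaluation and decision versions follow with a constant amount of extra work, which is the sense in which the first version is the hardest of the three.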
Examples of problems whose decision versions belong to the class P:
- LP (Linear Programming problem)
- Assignment Problem
- Minimum Spanning Tree Problem
- Shortest Path Problem
- several one machine scheduling problems
- Perfect Matching
- Edge Cover

If the answer to the posed yes/no question is yes, a verification of the result is a proof based on a feasible arrangement x. For instance, if a traveling salesman solution x, i.e. a cycle through all vertices with length $\le L$, is given or guessed, the verification is easy: just check that every vertex is included in the cycle, calculate the length of the cycle and compare it with the threshold value L. The given solution x acts as a certificate of the answer. The number of steps needed is a polynomial function of the number of vertices.

Informally, the class NP (Nondeterministic Polynomial problems) consists of decision problems for which a yes instance can be verified in polynomial time.
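The verification of a traveling salesman certificate described above can be sketched in a few lines of Python. The function name `verify_tour`, the representation of the cycle as a list of vertex indices and the distance matrix `D` are assumptions made for this illustration only.

```python
from typing import List, Sequence


def verify_tour(dist: Sequence[Sequence[float]], tour: List[int], L: float) -> bool:
    """Check that `tour` is a cycle through all vertices with total length <= L.

    dist[i][j] is the distance between vertices i and j; the tour is closed
    by returning from its last vertex to its first.
    """
    n = len(dist)
    # Every vertex must be included in the cycle exactly once.
    if sorted(tour) != list(range(n)):
        return False
    # Calculate the length of the cycle.
    length = sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
    # Compare it with the threshold value.
    return length <= L


# Hypothetical symmetric instance with 4 vertices.
D = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
print(verify_tour(D, [0, 1, 3, 2], 18))  # True: 2 + 4 + 3 + 9 = 18
print(verify_tour(D, [0, 1, 3, 2], 17))  # False: the cycle is too long
print(verify_tour(D, [0, 1, 1, 2], 99))  # False: not every vertex is visited
```

The check takes a number of steps polynomial in the number of vertices, but it only confirms a given certificate; it says nothing about how hard it is to find one.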
Definition 2.6.a. The class NP consists of decision problems for which every yes instance y has a polynomial size certificate $c(y)$ that can be checked in polynomial time to give the answer yes.

More formally:

Definition 2.6.b. The class NP consists of decision problems for which there exist a polynomial $p(n)$ and an algorithm A such that an instance, encoded as a string y, is a yes instance if and only if there exists a string $c(y)$, the certificate, such that
- the size of the certificate satisfies $|c(y)| \le p(|y|)$
- given the input $(y, c(y))$, the algorithm A stops with the answer yes after at most $p(|y|)$ steps.

The solution of an NP problem is formalized by a so-called nondeterministic algorithm (or nondeterministic Turing machine): for a given instance string y, the algorithm generates a random clue or guess $c(y)$ and uses the concatenated string $(y, c(y))$ as its input. When the right certificate $c(y)$ is generated for a yes instance, the algorithm stops with the answer yes. The right certificate must always have a positive probability. (There are other descriptions of the nondeterministic Turing machine.)

Another definition by means of nondeterminism:

Definition 2.6.c. The problem belongs to NP if there is a nondeterministic algorithm that
1) stops in an accepting state if and only if the instance satisfies the given statement and the right polynomial size certificate is given as a clue
2) takes a number of steps that is bounded by a fixed polynomial function of the size of the instance.

Note that $P \subseteq NP$ because the polynomially solvable problems can be solved by a nondeterministic algorithm: just use the polynomial algorithm as the checking algorithm and ignore the clue.

The class NP contains the decision versions of virtually all combinatorial optimization problems. That is because the solution to a combinatorial optimization problem is usually an arrangement, design or sequence of objects that can be coded as a concise certificate to a given question.
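The guess-and-check structure of Definition 2.6.b can be illustrated with the Partition problem (decision problem 2 above). In the following sketch the certificate is assumed to be the set of indices placed in the first part; the names `check_certificate` and `find_certificate` are hypothetical. The checker runs in polynomial time, whereas the deterministic search simply tries every possible clue and may need exponentially many checks.

```python
from itertools import combinations
from typing import Optional, Sequence, Tuple


def check_certificate(xs: Sequence[int], subset: Sequence[int]) -> bool:
    """Polynomial-time check: do the chosen indices split xs into equal sums?"""
    idx = set(subset)
    # The certificate must consist of distinct, valid positions of xs.
    if len(idx) != len(subset) or any(i < 0 or i >= len(xs) for i in idx):
        return False
    chosen = sum(xs[i] for i in idx)
    return 2 * chosen == sum(xs)


def find_certificate(xs: Sequence[int]) -> Optional[Tuple[int, ...]]:
    """Deterministic stand-in for the nondeterministic guess: try all subsets.

    Up to 2^n candidate certificates are examined, so this search is
    exponential even though each individual check is polynomial.
    """
    n = len(xs)
    for r in range(n + 1):
        for subset in combinations(range(n), r):
            if check_certificate(xs, subset):
                return subset
    return None


print(find_certificate([3, 1, 1, 2, 2, 1]))  # (0, 3): 3 + 2 = 5 = 1 + 1 + 2 + 1
print(find_certificate([1, 2, 4]))           # None: a no instance has no certificate
```

A yes instance is accepted as soon as one right certificate is supplied, while a no instance is rejected only because every certificate fails the check. This asymmetry is why the definition of NP refers to yes instances only.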