Computational Complexity

Pascal Hitzler
http://www.pascal-hitzler.de
CS 740 Lecture, Spring Quarter 2010
Wright State University, Dayton, OH, U.S.A.
Final version.

Contents

1  Big-Oh-Notation
2  Turing Machines and Time Complexity
3  Complexity under Turing Machine Variations
4  Linear Speedup
5  Properties of Time Complexity
6  Simulation of Computer Computations
7  PTime
8  Nondeterministic Turing Machines and Time Complexity
9  SAT is NP-Complete
10 Excursus: Is P = NP?
11 If P ≠ NP...
   11.1 Problems between P and NP
   11.2 The Polynomial Hierarchy
12 Beyond NP
13 NP-complete Problems

[Slideset 1: Motivation]

03/29/10

1 Big-Oh-Notation [4, Chapter 14.2]

N = {0, 1, 2, ...}
Z = {..., -2, -1, 0, 1, 2, ...}
R: the real numbers

1.1 Definition Let f, g : N → N. f is of order g if there is a constant c > 0 and an n0 ∈ N s.t. f(n) ≤ c · g(n) for all n ≥ n0.

O(g) := {f | f is of order g} ("big oh of g").

If f ∈ O(g), we can say that g provides an asymptotic upper bound on f. If f ∈ O(g) and g ∈ O(f), then they have the same rate of growth, and g is an asymptotically tight bound on f (and vice versa).

1.2 Remark Common abuse of notation: f = O(g) instead of f ∈ O(g); f(n) = n² + O(n) instead of f(n) = n² + g(n) for some g ∈ O(n).

1.3 Example Let f(n) = n² and g(n) = n³.

f ∈ O(g). [For c = 1 and n > 1, n² ≤ c · n³.]

g ∉ O(f). 03/31/10 [Assume n³ ∈ O(n²). Then there exist c, n0 s.t. n³ ≤ c · n² for all n ≥ n0. Choose n1 = 1 + max{c, n0}. Then n1³ = n1 · n1² > c · n1² and n1 > n0. Contradiction.]

Exercise 1 (Due 04/05/10) Show that 2ⁿ ∈ O(n!).

Exercise 2 Show that n! ∉ O(2ⁿ).

Exercise 3 Show: If f ∈ O(g) and g ∈ O(h), then f ∈ O(h).

1.4 Example Let f(n) = n² + 2n + 5 and g(n) = n².

g ∈ O(f). [For c = 1 and n > 0, n² ≤ c · (n² + 2n + 5).]

f ∈ O(g). [For n > 1 we have f(n) = n² + 2n + 5 ≤ n² + 2n² + 5n² = 8n² = 8 · g(n). Hence, for c = 8 and n > 1, f(n) ≤ c · g(n).]

1.5 Definition Θ(g) := {f | f ∈ O(g) and g ∈ O(f)}

1.6 Example For f, g from Example 1.4, f ∈ Θ(g).

1.7 Remark A polynomial (with integer coefficients) of degree r ∈ N is a function of the form

    f : N → Z : n ↦ c_r nʳ + c_{r-1} n^{r-1} + ... + c_1 n + c_0,

with coefficients c_i ∈ Z (i = 0, ..., r) and c_r ≠ 0.

For f, g : N → Z, we say f ∈ O(g) if |f| ∈ O(|g|), where |f| : N → N : n ↦ |f(n)|.

1.8 Example Let f(n) = n² + 2n + 5 and g(n) = -n².

g ∈ O(f). [We have |g| : n ↦ n² and |f| = f. Thus, from Example 1.4 we know that |g| ∈ O(|f|).]

f ∈ O(g). [As before, from Example 1.4.]

1.9 Remark For f, g : N → R, we say f ∈ O(g) if |f| ∈ O(|g|).

1.10 Remark log_a(n) ∈ O(log_b(n)) for all 1 < a, b ∈ N. [log_a(n) = log_a(b) · log_b(n) for all n ∈ N.]

1.11 Theorem The following hold.

1. If lim_{n→∞} f(n)/g(n) = 0, then f ∈ O(g) and g ∉ O(f).
2. If lim_{n→∞} f(n)/g(n) = c with 0 < c < ∞, then f ∈ Θ(g) and g ∈ Θ(f).
3. If lim_{n→∞} f(n)/g(n) = ∞, then f ∉ O(g) and g ∈ O(f).

Proof: We show part 1. Assume lim_{n→∞} f(n)/g(n) = 0, i.e., for each ε > 0 there exists n_ε ∈ N such that for all n ≥ n_ε we have f(n)/g(n) < ε, and hence f(n) < ε · g(n). Now select c = ε = 1 and n0 = n_ε. Then f(n) ≤ c · g(n) for all n ≥ n0, which shows f ∈ O(g).

Now if we also assume g ∈ O(f), then there must exist d > 0 and m0 ∈ N s.t. g(n) ≤ d · f(n) for all n ≥ m0, i.e., 1/d ≤ f(n)/g(n) for all n ≥ m0. But then lim_{n→∞} f(n)/g(n) ≥ 1/d > 0, a contradiction.

Exercise 4 Show Theorem 1.11 part 2.

Exercise 5 Show Theorem 1.11 part 3. [Hint: Use part 1.]

1.12 Remark L'Hospital's rule often comes in handy:

    lim_{n→∞} f(n)/g(n) = lim_{n→∞} f'(n)/g'(n)

1.13 Example n · log_a(n) ∈ O(n²) and n² ∉ O(n · log_a(n)).

[lim_{n→∞} (n · log_a(n))/n² = lim_{n→∞} (log_a(n) + n · (log_a(e)/n))/(2n) = lim_{n→∞} log_a(n)/(2n) + lim_{n→∞} log_a(e)/(2n) = 0 + 0 = 0]

1.14 Theorem Let f be a polynomial of degree r. Then
(1) f ∈ Θ(nʳ)
(2) f ∈ O(n^k) for all k > r
(3) f ∉ O(n^k) for all k < r

Exercise 6 Prove Theorem 1.14 (1). [Hint: Use l'Hospital's rule.]

1.15 Theorem The following hold.
(1) log_a(n) ∈ O(n) and n ∉ O(log_a(n))
(2) nʳ ∈ O(bⁿ) and bⁿ ∉ O(nʳ)
(3) bⁿ ∈ O(n!) and n! ∉ O(bⁿ)

Exercise 7 Prove Theorem 1.15 (1).

1.16 Remark The Big Oh hierarchy. 04/05/10

    O(1)             constant             sublinear   subpolynomial
    O(log_a(n))      logarithmic          sublinear   subpolynomial
    O(n)             linear                           subpolynomial
    O(n · log_a(n))  "n log n"                        subpolynomial
    O(n²)            quadratic                        polynomial
    O(n³)            cubic                            polynomial
    O(nʳ)            polynomial (r ≥ 0)               polynomial
    O(bⁿ)            exponential (b > 1)              exponential
    O(n!)            factorial                        exponential
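Theorem 1.11 suggests a quick numeric sanity check: sample the ratio f(n)/g(n) for growing n and see where it is headed. The following Python sketch (not part of the original notes; sampling is only a heuristic, not a proof) does this for Example 1.13 with a = 2.

```python
import math

def ratio_samples(f, g, ns=(10, 100, 1_000, 10_000, 100_000)):
    """Sample f(n)/g(n) for growing n, a numeric stand-in for the
    limit in Theorem 1.11 (heuristic only, not a proof)."""
    return [f(n) / g(n) for n in ns]

# Example 1.13: n*log2(n) versus n^2. The ratio log2(n)/n tends to 0,
# suggesting n*log2(n) is in O(n^2) while n^2 is not in O(n*log2(n)).
samples = ratio_samples(lambda n: n * math.log2(n), lambda n: n * n)
print(samples)  # strictly decreasing toward 0
```

Of course, no finite sample can establish a limit; the actual proofs use the definitions or l'Hospital's rule as above.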

2 Turing Machines and Time Complexity [4, Chapter 14.3 and recap Chapters 8.1, 8.2]

2.1 Definition A standard (single-tape, deterministic) Turing machine (TM) is a quintuple M = (Q, Σ, Γ, δ, q0) with

- Q a finite set of states,
- Γ a finite set called the tape alphabet, containing a blank B,
- Σ ⊆ Γ \ {B} the input alphabet,
- δ : Q × Γ → Q × Γ × {L, R} a partial function, the transition function,
- q0 ∈ Q the start state.

Recall: The tape has a left boundary and is infinite to the right. Tape positions are numbered starting with 0. Each tape position contains an element from Γ. The machine starts in state q0 and at position 0. The input is written on the tape beginning at position 1; the rest of the tape is blank. A transition (1) changes the state, (2) writes a new symbol at the tape position, and (3) moves the head left or right. The computation halts if no action is defined. The computation terminates abnormally if it moves left of position 0. TMs can be represented by state diagrams.

2.2 Example Swap all a's to b's and all b's to a's in a string of a's and b's. Computation example:

    q0 BababB ⊢ Bq1 ababB ⊢ Bbq1 babB ⊢ Bbaq1 abB ⊢ Bbabq1 bB ⊢ Bbabaq1 B
              ⊢ Bbabq2 aB ⊢ Bbaq2 baB ⊢ Bbq2 abaB ⊢ Bq2 babaB ⊢ q2 BbabaB

Number of steps required with input string size n: (n + 2) + (n + 1) = 2n + 3
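The swap machine of Example 2.2 can be replayed mechanically. The Python sketch below is my own illustration, not from the notes; the state names q0, q1, q2 and the transition table are read off the computation trace above. It simulates a standard TM as in Definition 2.1 and counts transitions.

```python
def run_tm(delta, tape, state="q0", pos=0, max_steps=10_000):
    """Simulate a standard (deterministic, single-tape) TM.
    delta maps (state, symbol) -> (new_state, written_symbol, 'L' or 'R').
    The tape is stored as a dict position -> symbol; 'B' is the blank.
    The machine halts when no transition applies; moving left of
    position 0 would be an abnormal termination."""
    tape = dict(enumerate(tape))
    steps = 0
    while (state, tape.get(pos, "B")) in delta and steps < max_steps:
        state, sym, move = delta[(state, tape.get(pos, "B"))]
        tape[pos] = sym
        pos += 1 if move == "R" else -1
        steps += 1
        if pos < 0:  # abnormal termination
            break
    end = max(tape) + 1
    return "".join(tape.get(i, "B") for i in range(end)), state, steps

# The swap machine of Example 2.2: exchange a's and b's, then return home.
swap = {
    ("q0", "B"): ("q1", "B", "R"),
    ("q1", "a"): ("q1", "b", "R"),
    ("q1", "b"): ("q1", "a", "R"),
    ("q1", "B"): ("q2", "B", "L"),
    ("q2", "a"): ("q2", "a", "L"),
    ("q2", "b"): ("q2", "b", "L"),
}
print(run_tm(swap, "Babab"))  # ('BbabaB', 'q2', 10)
```

Running it on inputs of different lengths confirms the linear step count of the example.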

2.3 Example Copying a string: BuB becomes BuBuB (u a string of a's and b's). Number of steps required with input string size n: O(n²).

Exercise 8 Make a standard TM which moves an input string consisting of a's and b's one position to the right. What is the complexity of your TM?

Language-accepting TMs additionally have a set F ⊆ Q of final states. (They are sextuples.) A string is accepted if the computation halts in a final state (and does not terminate abnormally).

2.4 Example A TM for (a ∪ b)*aa(a ∪ b)*. Number of steps required with input string size n: n (worst case).

2.5 Example A TM for {aⁱbⁱcⁱ | i ≥ 0} (Figure 1). Number of steps required with input string size n = 3i: O(i · 4i) = O(i²) = O(n²) (worst case).

Figure 1: A TM for {aⁱbⁱcⁱ | i ≥ 0}

Exercise 9 Make a standard TM which accepts the language {a^{2i}bⁱ | i ≥ 0}. What is the complexity of your TM?

2.6 Example A TM for palindromes over a and b (Figure 2). Number of steps required with input string size n (worst case):

    Σ_{i=1}^{n+1} i = (n + 2)(n + 1)/2 ∈ O(n²)

2.7 Definition For any TM M, the time complexity of M is the function tc_M : N → N s.t. tc_M(n) is the maximum number of transitions processed by a computation of M on input of length n.

2.8 Definition A language L is accepted in deterministic time f(n) if there is a single-tape deterministic TM M for it with tc_M(n) ∈ O(f(n)).

04/07/10

Figure 2: A TM for palindromes over a and b.

Exercise 10 (Due 04/12/10) Design a single-tape TM M for {aⁱbⁱ | i ≥ 0} with tc_M ∈ O(n · log₂(n)). [Hint: On each pass, mark half of the a's and b's that have not been previously marked.]

2.9 Remark Note that worst-case behavior can happen when a string is not accepted. [E.g., consider the straightforward TM to accept strings containing an a.]

3 Complexity under Turing Machine Variations [4, Chapter 14.3 cont., 14.4 and recap Chapters 8.5, 8.6]

A k-track TM has one tape with k tracks. A single read-write head simultaneously reads the k symbols at the head position from all k tracks. We write the transitions as δ : Q × Γᵏ → Q × Γᵏ × {L, R}. The input string is on track 1.

3.1 Theorem If L is accepted by a k-track TM M, then there is a standard TM M' s.t. tc_{M'}(n) = tc_M(n).

Proof: For M = (Q, Σ, Γ, δ, q0, F), let M' = (Q, Σ × {B}^{k-1}, Γᵏ, δ', q0, F) with δ'(q_i, (x_1, ..., x_k)) = δ(q_i, [x_1, ..., x_k]). The number of transitions needed for a computation is unchanged.

A k-tape TM has k tapes and k independent tape heads, which read simultaneously. A transition (i) changes the state, (ii) writes a symbol on each tape, and (iii) independently moves all tape heads. Transitions are written δ : (q_i, x_1, ..., x_k) → [q_j; y_1, d_1; ...; y_k, d_k], where x_l, y_l ∈ Γ and d_l ∈ {L, R, S} (S means the head stays). Any head moving off the tape causes an abnormal termination. The input string is on tape 1.

3.2 Example A 2-tape TM for {aⁱbaⁱ | i ≥ 0}. Number of steps required with input string size n: n + 2.

3.3 Example A 2-tape TM for {uu | u ∈ {a, b}*}. Number of steps required with input string size n: (5/2)n + 4.

3.4 Example A 2-tape TM accepting palindromes.

Number of steps required with input string size n: 3(n + 1) + 1.

3.5 Theorem If L is accepted by a k-tape TM M, then there is a standard TM N s.t. tc_N(n) = O(tc_M(n)²).

04/12/10

Proof: (sketch) By Theorem 3.1 it suffices to construct an equivalent (2k + 1)-track TM M' s.t. tc_{M'} ∈ O(tc_M(n)²).

Simulation of the TM: We show how to do this for k = 2 (but the argument generalizes). Idea: tracks 1 and 3 maintain the contents of tapes 1 and 2 of M; tracks 2 and 4 each have a single nonblank position X indicating the position of the respective tape head of M.

Initially: write # in track 5, position 1, and X in tracks 2 and 4, position 1.

States: 8-tuples of the form [s, q_i, x_1, x_2, y_1, y_2, d_1, d_2], where q_i ∈ Q, x_i, y_i ∈ Σ ∪ {U}, d_i ∈ {L, R, S, U}. Here s represents the status of the simulation, and U indicates an unknown item. Let δ : (q_i, x_1, x_2) → [q_j; y_1, d_1; y_2, d_2] be the applicable transition of M. The start state of M' is [f1, q_i, U, U, U, U, U, U]. The following actions simulate a transition of M:

1. f1 (find first symbol): M' moves to the right until it finds the X on track 2. It enters state [f1, q_i, x_1, U, U, U, U, U], where x_1 is the symbol on track 1 under the X. M' returns to the position with # in track 5.
2. f2 (find second symbol): Same as above for recording the symbol x_2 on track 3 under the X in track 4. Enter state [f2, q_i, x_1, x_2, U, U, U, U]. The tape head returns to #.
3. Enter state [p1, q_j, x_1, x_2, y_1, y_2, d_1, d_2], with q_j, y_1, y_2, d_1, d_2 obtained from δ(q_i, x_1, x_2).
4. p1 (print first symbol): Move to the X in track 2. Write symbol y_1 on track 1. Move the X on track 2 in the direction indicated by d_1. The tape head returns to #.
5. p2 (print second symbol): Move to the X in track 4. Write symbol y_2 on track 3. Move the X on track 4 in the direction indicated by d_2. The tape head returns to #.

If δ(q_i, x_1, x_2) is undefined, then the simulation halts after step 2. [f2, q_i, x_1, x_2, U, U, U, U] is accepting whenever q_i is accepting.

For each additional tape, add two tracks; states obtain 3 more parameters. The simulation has 2 more actions (a find and a print for the new tape).

Complexity analysis: Assume we simulate the t-th transition of M. The heads of M are at positions ≤ t. The finds require a maximum of k · 2t steps. The prints require a maximum of k · 2(t + 1) steps. Simulation of the t-th transition requires a maximum of 4kt + 2k + 1 steps.

Thus, with f(n) = tc_M(n),

    tc_N(n) ≤ 1 + Σ_{t=1}^{f(n)} (4kt + 2k + 1) ∈ O(f(n)²).

Exercise 11 Let M be the TM from Example 3.2 and let N (standard, 1-track!) be constructed as in the proof of Theorem 3.5. Determine the number of states of N.

Exercise 12 Let a Random Access TM (RATM) be a one-tape TM where transitions are of the form δ(q_i, x) = (q_j, y, d), where d ∈ N. Such a transition is performed as usual, but the tape head is then moved to position d on the tape. Sketch how a RATM can be simulated by a standard TM.

Exercise 13 Do you think that the following is true? For any RATM M there is a standard TM N such that tc_N(n) = O(tc_M(n)). Justify your answer (no full formal proof needed).

4 Linear Speedup [4, Chapter 14.5]

4.1 Theorem Let M be a k-tape TM, k > 1, that accepts L with tc_M(n) = f(n). Then, for any c > 0, there is a k-tape TM N that accepts L with tc_N(n) ≤ c · f(n) + 2n + 3.

4.2 Corollary Let M be a 1-tape TM that accepts L with tc_M(n) = f(n). Then, for any c > 0, there is a 2-tape TM N that accepts L with tc_N(n) ≤ c · f(n) + 2n + 3.

Proof: The 1-tape TM can be understood as a 2-tape TM where tape 2 is not used. Then apply Theorem 4.1.

Exercise 14 Assume a language L is accepted by a 2-tape TM M with tc_M(n) = n. Do you think it is possible to design a standard TM N for L with tc_N(n) ∈ O(n)? Justify your answer.

Proof sketch/idea for Theorem 4.1 (exemplified using Example 3.4): Let M = (Q, Σ, Γ, δ, q0, F).

Simulation of the TM: N has input alphabet Σ and tape alphabet Γ ∪ {#} ∪ Γᵐ.

Initialization of N by example.

A state of N consists of:

- the state of M,
- for i = 1, ..., k, the m-tuple currently scanned on tape i of N and the m-tuples to its immediate right and left,
- a k-tuple [i_1, ..., i_k], where i_j is the position of the symbol on tape j being scanned by M within the m-tuple being scanned by N.

After initialization, enter state (q0; ?, [BBB], ?; ?, [Bab], ?; [1, 1]) (? is a placeholder, filled in by subsequent movements).

Idea: Six transitions of N simulate m transitions of M (m depends on c). Simulated, compressed configurations:

State: (q3; ?, [BBb], ?; ?, [bbb], ?; [3, 1])

Six transitions are performed:

1. move left, record tuples in the state
2. move two to the right, record tuples in the state
3. move one to the left

State: (q3, #, [BBb], [bab]; [bab], [bbb], [BBB]; [3, 1])

4. and 5. rewrite the tapes to match the configuration of M after three transitions.

State: (q3, ?, [BBb], ?; ?, [bbb], ?; [3, 1])

Complexity analysis: Initialization requires 2n + 3 transitions. In the simulation, 6 moves of N simulate m moves of M. With m ≥ 6/c we obtain

    tc_N(n) = (6/m) · f(n) + 2n + 3 ≤ c · f(n) + 2n + 3

as desired.

4.3 Remark In Theorem 4.1, the speedup is obtained by using a larger tape alphabet and by vastly increasing the number of states.

04/14/10

5 Properties of Time Complexity [4, Chapter 14.6]

5.1 Theorem Let f be a total computable function. Then there is a language L such that tc_M is not bounded by f for any TM M that accepts L.

Proof sketch: Let u be an injective function which assigns to every TM M with Σ = {0, 1} a string u(M) over {0, 1}. [Each TM can be described by a finite string of characters; then consider a bit encoding of the characters.] Let u_1, u_2, u_3, ... be an enumeration of all strings over {0, 1}. If u(M) = u_i for some M, then set M_i = M; otherwise set M_i to be the one-state TM with no transitions. This needs to be done such that there is a TM N which, on input any u_i, can simulate M_i.

Let L = {u_i | M_i does not accept u_i in f(length(u_i)) or fewer moves}.

L is recursive. [Input some u_i. Determine length(u_i). Compute f(length(u_i)). Simulate M_i on u_i until M_i either halts or completes f(length(u_i)) transitions, whichever comes first. u_i is accepted if either M_i halted without accepting u_i or M_i did not halt in the first f(length(u_i)) transitions. Otherwise, u_i is rejected.]

Let M be a TM accepting L. Then M = M_j for some j. M = M_j accepts u_j iff M_j halts on input u_j without accepting u_j in f(length(u_j)) or fewer steps, or M_j does not halt in the first f(length(u_j)) steps. Hence, if M accepts u_j, then it needs more than f(length(u_j)) steps. If M does not accept u_j, then it also cannot stop in f(length(u_j)) or fewer steps, since then it would in fact accept u_j.

5.2 Theorem There is a language L such that, for any TM M that accepts L, there is a TM N that accepts L with tc_N(n) ∈ O(log₂(tc_M(n))).

Proof: skipped.

04/19/10

Exercise 15 (Due 04/28/10) Is the following true or false? Prove your claim. For the language L from Theorem 5.2, the following holds: If there is a TM M that accepts L with tc_M(n) ∈ O(2ⁿ), then there is a TM N that accepts L with tc_N(n) ∈ O(n).

Exercise 16 (Due 04/28/10) Is the following true or false? Prove your claim. Let M be a 5-track TM which accepts a language L. Then there is a 5-tape TM N that accepts L with tc_N(n) ≤ 7 + 7n + tc_M(n)/2.

6 Simulation of Computer Computations [4, Chapter 14.7]

Is the TM complexity model adequate? Assume a computer with the following parameters:

- finite memory, divided into word-size chunks
- fixed word length, m_w bits each
- each word has a fixed numeric address
- finite set of instructions; each instruction
  - fits in a single word,
  - has a maximum of t operands (addresses used in the operation),
  - can do one of: move data, perform arithmetic or Boolean calculations, adjust the program flow, allocate additional memory (a maximum of m_a words each time),
  - can change at most t words in the memory.

Now simulate this on a (5 + t)-tape TM. The tapes are:

1. Input tape (divided into word chunks)
2. Memory counter (address of next free word on tape 1)
3. Program counter (location of next instruction to be executed)
4. Input counter (location of beginning of input and location of next word to be read)
5. Work tape
6. t register tapes

Simulation of the k-th instruction:

1. load operand data onto the register tapes (max t words)
2. perform the indicated operation (one)
3. store the results as required (stores max t words or allocates m_a words of memory)

Operation: finite number of instructions with at most t operands. The maximum number of transitions needed shall be t_0 (the maximum exists and is < ∞).

Load and store: Let m_p be the number of bits used to store the program, m_i the number of bits used to store the input, and m_k the total memory allocated by the TM before instruction k, so that

    m_k = m_p + m_i + k · m_a

Any address can be located in ≤ m_k transitions. Upper bounds:

    find instruction             m_k
    load operands                t · m_k
    return register tape heads   t · m_k
    perform operation            t_0
    store information            t · m_{k+1}
    return register tape heads   t · m_{k+1}

    upper bound for the k-th instruction:
    (2t + 1) · m_k + 2t · m_{k+1} + t_0 = (4t + 1) · m_p + (4t + 1) · m_i + 2t · m_a + t_0 + (4t + 1) · k · m_a

If the computer requires f(n) steps on input of length m_i = n, then the simulation requires

    Σ_{k=1}^{f(n)} ((4t + 1) · m_p + (4t + 1) · n + 2t · m_a + t_0 + (4t + 1) · k · m_a)
      = f(n) · ((4t + 1) · m_p + (4t + 1) · n + 2t · m_a + t_0) + (4t + 1) · m_a · Σ_{k=1}^{f(n)} k
      ∈ O(max{n · f(n), f(n)²})

transitions. In particular: If an algorithm runs in polynomial time on a computer, then it can be simulated on a TM in polynomial time. [For f(n) ∈ O(nʳ), we have n · f(n) ∈ O(n^{r+1}) and f(n)² ∈ O(n^{2r}).]

Exercise 17 (Due 04/28/10) Can we conclude the following from the observations in this section?

- If an algorithm runs in exponential time on a computer, then it can be simulated on a TM in exponential time.
- If an algorithm runs in linear time on a computer, then it can be simulated on a TM in quadratic time.

Justify your answer.

7 PTime [[4, Chapter 15.6] and some bits and pieces]

7.1 Definition A language L is decidable in polynomial time if there is a standard TM M that accepts L with tc_M ∈ O(nʳ), where r ∈ N is independent of n. P (PTime) is the complexity class of all such languages. [P is the set of all such languages.]

Exercise 18 (Due 04/28/10) Show that P is closed under language complement.

7.2 The Polynomial-Time Church-Turing Thesis A decision problem can be solved in polynomial time by using a reasonable sequential model of computation if and only if it can be solved in polynomial time by a Turing machine.

We have seen that P is independent of the computation paradigm used: standard TMs, k-track TMs, k-tape TMs, realistic computers.

7.3 Definition Let L and Q be languages over alphabets Σ₁ and Σ₂, respectively. L is reducible (in polynomial time) to Q if there is a polynomial-time computable function r : Σ₁* → Σ₂* such that w ∈ L if and only if r(w) ∈ Q.

7.4 Example The TM reduces L = {xⁱyⁱzᵏ | i, k ≥ 0} to Q = {aⁱbⁱ | i ≥ 0}. Time complexity: O(n).

04/21/10

Exercise 19 (Due 05/10/10) Construct a TM which reduces {aⁱbⁱaⁱ | i ≥ 0} to {cⁱdⁱ | i ≥ 0}.

7.5 Theorem Let L be reducible to Q in polynomial time and let Q ∈ P. Then L ∈ P.

Proof: Let R be the TM computing the reduction and M the TM deciding Q. On input w, R generates r(w) as input to M. We have length(r(w)) ≤ max{length(w), tc_R(length(w))}. If tc_R ∈ O(nˢ) and tc_M ∈ O(nᵗ), then

    tc_R(n) + tc_M(max{n, tc_R(n)}) ∈ O(nˢ) + O(max{O(nᵗ), O((nˢ)ᵗ)}) = O(n^{st}).

7.6 Definition A language (problem) L is

- P-hard, if every language in P is reducible to L in polynomial time;
- P-complete, if L is P-hard and L ∈ P.

7.7 Remark Definition 7.6 (hardness and completeness) is used likewise for other complexity classes. Thereby, reducibility is always considered to be polynomial-time.

7.8 Remark In principle, you could use any decision problem for defining a complexity class. E.g., if the POPI problem is to find out if n potachls fit into a pistochl of size n, then a problem/language L is

- in POPI if L is reducible to the POPI problem (in polynomial time),
- POPI-hard if the POPI problem is reducible to L (in polynomial time),
- POPI-complete if it is both in POPI and POPI-hard.

Obvious questions: Which complexity classes are interesting or useful? How do they relate to each other? In this class, we mainly talk about two complexity classes: P, and NP (soon).

Exercise 20 (Due 05/10/10) By Theorem 5.1, there are problems which are not in P. Go to the Complexity Zoo at http://qwiki.stanford.edu/wiki/Complexity_Zoo and do the following.

1. Find the names of 4 complexity classes which contain P.
2. Find a P-hard problem and describe it briefly in general, but understandable, terms. (You may have to use other sources for background understanding.)
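The reduction of Example 7.4 can be made concrete. The sketch below is an illustration in Python rather than the TM from the notes; the choice to send ill-formed strings to the fixed non-member "a" is mine. It computes a map r with w ∈ L iff r(w) ∈ Q in time linear in the input, matching the O(n) bound of the example.

```python
import re

def in_Q(s):
    """Membership test for Q = { a^i b^i : i >= 0 }."""
    m = re.fullmatch(r"(a*)(b*)", s)
    return bool(m) and len(m.group(1)) == len(m.group(2))

def r(w):
    """Reduction of L = { x^i y^i z^k : i, k >= 0 } to Q (Example 7.4):
    map x -> a, y -> b and drop the trailing z-block; strings not of
    the shape x*y*z* are sent to 'a' = a^1 b^0, which is never in Q."""
    m = re.fullmatch(r"(x*)(y*)(z*)", w)
    if not m:
        return "a"
    return "a" * len(m.group(1)) + "b" * len(m.group(2))

print(in_Q(r("xxyyzzz")))  # True:  xxyyzzz is in L
print(in_Q(r("xxy")))      # False: xxy is not in L
print(in_Q(r("zxy")))      # False: zxy is not in L
```

Note that a reduction must be total: every string over the source alphabet, well-formed or not, must be mapped, and exactly the members of L may land in Q.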

8 Nondeterministic Turing Machines and Time Complexity [4, Chapters 15.1, 15.2 cont., some of Chapter 7.1, and recap Chapter 8.7]

A nondeterministic (ND) TM may specify any finite number of transitions for a given configuration. Transitions are defined by a function from Q × Γ to subsets of Q × Γ × {L, R}. A computation arbitrarily chooses one of the possible transitions. The input is accepted if there is at least one computation terminating in an accepting state.

8.1 Definition The time complexity tc_M : N → N of an ND TM M is defined s.t. tc_M(n) is the maximum number of transitions processed on input of length n.

8.2 Example Accept strings with a c preceded or followed by ab. Complexity: O(n).

8.3 Example A 2-tape ND palindrome finder. 04/28/10 Complexity: n + 2 if n is odd, n + 3 if n is even.

Exercise 21 Give a nondeterministic two-tape TM for {uu | u ∈ {a, b}*} which is quicker than the TM from Example 3.3.

8.4 Definition A language L is accepted in nondeterministic polynomial time if there is an ND TM M that accepts L with tc_M ∈ O(nʳ), where r ∈ N is independent of n. NP is the complexity class of all such languages.

8.5 Remark P ⊆ NP. It is currently not known whether P = NP.

8.6 Theorem Let L be accepted by a one-tape ND TM M. Then L is accepted by a deterministic TM M' with tc_{M'}(n) ∈ O(tc_M(n) · c^{tc_M(n)}), where c is the maximum number of transitions for any (state, symbol) pair of M.

Proof sketch: Simulation idea: Use a 3-tape TM M'. Tape 1 holds the input. Tape 2 is used for simulating the tape of M. Tape 3 holds sequences (m_1, ..., m_n) (1 ≤ m_i ≤ c), which encode computations of M: m_i indicates that, from the (at most) c choices M has in performing the i-th transition, the m_i-th choice is selected. M is simulated as follows:

1. generate a sequence (m_1, ..., m_n)
2. simulate M according to (m_1, ..., m_n)
3. if the input is not accepted, continue with step 1.

Worst case: c^{tc_M(n)} sequences need to be examined. Simulation of a single computation needs at most O(tc_M(n)) transitions. Thus, tc_{M'}(n) ∈ O(tc_M(n) · c^{tc_M(n)}).

Exercise 22 Make a (deterministic) pseudo-code algorithm for an exhaustive search on a tree (i.e., if the sought element is not found, the whole tree should be traversed).

Exercise 23 Make a nondeterministic pseudo-code algorithm for an exhaustive search on a tree.

8.7 Definition co-NP = {L̄ | L ∈ NP} and co-P = {L̄ | L ∈ P}, where L̄ denotes the complement of L. It is currently not known whether NP = co-NP.

8.8 Theorem If NP ≠ co-NP, then P ≠ NP.

Proof: By contraposition: If P = NP, then by Exercise 18 we have NP = P = co-P = co-NP.
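The enumeration of choice sequences in the proof of Theorem 8.6 can be sketched directly. The Python below is my own illustration: the ND machine for Example 8.2 (over the alphabet {a, b, c}) and its state names are assumptions reconstructed from the example's description, and a dict replaces the 3-tape bookkeeping. Trying all c^n choice sequences is exactly the exponential blow-up the theorem describes.

```python
from itertools import product

def nd_accepts(delta, finals, w, max_steps):
    """Deterministic simulation of a one-tape ND TM in the spirit of
    Theorem 8.6: enumerate all choice sequences (m_1, ..., m_n) up to a
    given length and replay the machine along each, accepting if some
    run halts in a final state. delta maps (state, symbol) to a list of
    choices (new_state, written_symbol, 'L' or 'R')."""
    c = max(len(v) for v in delta.values())       # max branching degree
    for n in range(max_steps + 1):
        for seq in product(range(c), repeat=n):   # c^n sequences
            tape, state, pos, ok = dict(enumerate("B" + w)), "q0", 0, True
            for m in seq:
                choices = delta.get((state, tape.get(pos, "B")), [])
                if m >= len(choices):             # no such choice: dead branch
                    ok = False
                    break
                state, sym, move = choices[m]
                tape[pos] = sym
                pos += 1 if move == "R" else -1
                if pos < 0:                       # abnormal termination
                    ok = False
                    break
            halted = (state, tape.get(pos, "B")) not in delta
            if ok and halted and state in finals:
                return True
    return False

# An ND machine in the spirit of Example 8.2 (a c preceded or followed
# by ab): in q1 the machine guesses where "abc" or "cab" begins.
delta = {
    ("q0", "B"): [("q1", "B", "R")],
    ("q1", "a"): [("q1", "a", "R"), ("q2", "a", "R")],  # guess: "abc" starts here
    ("q1", "b"): [("q1", "b", "R")],
    ("q1", "c"): [("q1", "c", "R"), ("q4", "c", "R")],  # guess: "cab" starts here
    ("q2", "b"): [("q3", "b", "R")],
    ("q3", "c"): [("qf", "c", "R")],
    ("q4", "a"): [("q5", "a", "R")],
    ("q5", "b"): [("qf", "b", "R")],
}
print(nd_accepts(delta, {"qf"}, "babc", 6))  # True
print(nd_accepts(delta, {"qf"}, "ba", 6))    # False
```

The nondeterministic machine itself runs in O(n); only the deterministic replay pays the c^{tc_M(n)} factor.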

8.9 Theorem If there is an N P-complete language L with L N P, then N P = co-n P. Proof: Assume L is a language as stated. Let Q N P. Then Q is reducible to L in polynomial time. This reduction is also a reduction of Q to L. Combining the TM which performs the reduction with the TM which accepts L results in an ND TM that accepts Q in polynomial time. Thus Q co-n P. This shows co-n P N P. The inclusion N P co-n P follows by symmetry. 8.10 Theorem Let Q be an N P-complete language. If Q is reducible to L in polynomial time, then L is N P-hard. Proof: If R N P, then R is reducible in polynomial time to Q, which in turn is reducible in polynomial time to L. By composition, R is reducible in polynomial time to L. 05/10/10 8.11 Remark When moving from languages to decision problems, the representation of numbers may make a difference: Conversion from binary to unary representation is in O(2 n ). Thus, if a problem can be solved in polynomial time with unary input representation, it may not be solvable in polynomial time with binary input representation. Most reasonable representations of a problem differ only polynomially in length, but not so unary number encoding. Thus, in complexity analysis, numbers are always assumed to be represented in binary. 8.12 Definition A decision problem with a polynomial solution using unary number representation, but no polynomial solution using binary representation, is called pseudo-polynomial. Exercise 24 Let L be the language of all strings over {a, b} that can be divided into two strings (not necessarily the same length) such that (1) both strings have the same number of bs and (2) both strings start and end with a. E.g., abbaababaa is in L because it can be divided into abba and ababaa, both of which have 2 bs and both of which start and end in a. The string bbabba is not in L. Give a 2-tape, single track ND TM that accepts L. 9 SAT is N P-Complete [4, Chapter 15.8] Let V be a set of Boolean variables. 
9.1 Definition An atomic formula is a Boolean variable. (Well-formed) formulas are defined as follows.

1. All atomic formulas are formulas.
2. For every formula F, ¬F is a formula, called the negation of F.
3. For all formulas F and G, also (F ∨ G) and (F ∧ G) are formulas, called the disjunction and the conjunction of F and G, respectively.
4. Nothing else is a formula.

9.2 Definition T = {0, 1} is the set of truth values: false and true, respectively. An assignment is a function A : D → T, where D is a set of atomic formulas. Assignments extend to formulas via the following truth tables.

A(F) | A(¬F)
 0   |   1
 1   |   0

A(F) A(G) | A(F ∧ G)
 0    0   |    0
 0    1   |    0
 1    0   |    0
 1    1   |    1

A(F) A(G) | A(F ∨ G)
 0    0   |    0
 0    1   |    1
 1    0   |    1
 1    1   |    1

A formula F is called satisfiable if there exists an assignment A with A(F) = 1.

9.3 Example Determining the truth values of formulas using truth tables:

A(B) A(F) A(I) | A(B ∧ F) | A(¬(B ∧ F)) | A(¬I) | A(¬(B ∧ F) ∨ ¬I)
 0    0    0   |    0     |      1      |   1   |        1
 0    0    1   |    0     |      1      |   0   |        1
 0    1    0   |    0     |      1      |   1   |        1
 0    1    1   |    0     |      1      |   0   |        1
 1    0    0   |    0     |      1      |   1   |        1
 1    0    1   |    0     |      1      |   0   |        1
 1    1    0   |    1     |      0      |   1   |        1
 1    1    1   |    1     |      0      |   0   |        0

Exercise 25 Make the truth table for the formula (I ∨ B) ∧ F.

Exercise 26 Give a formula F, containing only the Boolean variables A, B, and C, such that F has the following truth table:

A(A) A(B) A(C) | A(F)
 0    0    0   |  1
 0    0    1   |  0
 0    1    0   |  0
 0    1    1   |  0
 1    0    0   |  1
 1    0    1   |  1
 1    1    0   |  0
 1    1    1   |  0

9.4 Definition Formulas F and G are (semantically) equivalent (written F ≡ G) if for every assignment A, A(F) = A(G).
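The extension of assignments to formulas (Definition 9.2) and the equivalence test of Definition 9.4 can be sketched in Python. The nested-tuple representation of formulas is an assumption made here for illustration; it is not part of the notes.

```python
from itertools import product

def evaluate(formula, assignment):
    """Extend an assignment A : variables -> {0, 1} to formulas,
    following the truth tables of Definition 9.2.  Formulas are
    tuples: ("var", name), ("not", F), ("and", F, G), ("or", F, G)."""
    op = formula[0]
    if op == "var":
        return assignment[formula[1]]
    if op == "not":
        return 1 - evaluate(formula[1], assignment)
    f = evaluate(formula[1], assignment)
    g = evaluate(formula[2], assignment)
    return f & g if op == "and" else f | g

def variables(formula):
    """Collect the set of variables occurring in a formula."""
    if formula[0] == "var":
        return {formula[1]}
    return set().union(*(variables(sub) for sub in formula[1:]))

def equivalent(f, g):
    """Semantic equivalence (Definition 9.4): same truth value under
    every assignment to the variables of both formulas."""
    vs = sorted(variables(f) | variables(g))
    return all(
        evaluate(f, dict(zip(vs, bits))) == evaluate(g, dict(zip(vs, bits)))
        for bits in product((0, 1), repeat=len(vs))
    )
```

Checking all 2^n assignments is exactly the truth-table method of Example 9.3; `equivalent` can be used, for instance, to verify the laws of Theorem 9.5.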

9.5 Theorem The following hold for all formulas F, G, and H.

F ∨ G ≡ G ∨ F and F ∧ G ≡ G ∧ F (Commutativity)
(F ∨ G) ∨ H ≡ F ∨ (G ∨ H) and (F ∧ G) ∧ H ≡ F ∧ (G ∧ H) (Associativity)
F ∨ (G ∧ H) ≡ (F ∨ G) ∧ (F ∨ H) and F ∧ (G ∨ H) ≡ (F ∧ G) ∨ (F ∧ H) (Distributivity)
¬¬F ≡ F (Double Negation)
¬(F ∨ G) ≡ ¬F ∧ ¬G and ¬(F ∧ G) ≡ ¬F ∨ ¬G (de Morgan's Laws)

Proof: Straightforward using truth tables.

9.6 Definition A literal is an atomic formula (a positive literal) or the negation of an atomic formula (a negative literal). A clause is a disjunction of literals. A formula F is in conjunctive normal form (CNF) if it is a conjunction of clauses, i.e., if

F = ⋀_{i=1}^{n} ( ⋁_{j=1}^{m_i} L_{i,j} ),

where the L_{i,j} are literals.

05/12/10

9.7 Theorem For every formula F there is a formula F_1 ≡ F in CNF.

Proof: skipped

Exercise 27 Transform (A ∧ B) ∨ (C ∧ D) ∨ (E ∧ F) into CNF.

Exercise 28 Give an informal, but plausible, argument why a naive algorithm for converting formulas into CNF is not in P. [Don't use TMs.]

9.8 Definition The Satisfiability Problem (SAT) is the problem of deciding whether a formula in CNF is satisfiable.

9.9 Theorem (Cook's Theorem) SAT is NP-complete.

Proof: later

9.10 Remark SAT is sometimes stated without the requirement that the formula be in CNF; this is equivalent.
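A brute-force decision procedure for SAT is easy to state. The following sketch (not from the notes; the list-of-clauses representation with string literals such as "x" and "-x" is an assumption for illustration) tries all 2^n assignments. The exponential enumeration is the point: membership in NP means a guessed assignment can be verified in polynomial time, but deterministically replacing the guess by exhaustive search is exponential.

```python
from itertools import product

def cnf_satisfiable(clauses):
    """Exhaustive satisfiability test for a CNF given as a list of
    clauses, each clause a list of literals: "x" for a positive and
    "-x" for a negative literal over variable x.  Tries all 2^n
    assignments; an assignment satisfies a clause iff it makes at
    least one of its literals true."""
    vs = sorted({lit.lstrip("-") for clause in clauses for lit in clause})
    for bits in product((0, 1), repeat=len(vs)):
        a = dict(zip(vs, bits))   # candidate assignment A : V -> {0, 1}
        if all(any(a[l.lstrip("-")] == (0 if l.startswith("-") else 1)
                   for l in clause)
               for clause in clauses):
            return True
    return False
```

For example, the CNF (x ∨ y) ∧ (¬x ∨ y) is encoded as `[["x", "y"], ["-x", "y"]]`.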

9.11 Proposition For any formula F, there is an equivalent formula which contains only ∧, ∨, and literals (called a negation normal form (NNF) of F).

Proof: Apply de Morgan's laws and double negation exhaustively.

Exercise 29 Give an informal, but plausible, argument that conversion of a formula into NNF is in P.

9.12 Definition Two formulas F and G are equisatisfiable if the following holds: F has a model if and only if G has a model.

9.13 Proposition For all formulas F_i (i = 1, 2, 3), F_1 ∨ (F_2 ∧ F_3) and (F_1 ∨ E) ∧ (¬E ∨ (F_2 ∧ F_3)) are equisatisfiable (where E is a propositional variable not occurring in F_1, F_2, F_3).

Proof: skipped

Exercise 30 Use the idea from Proposition 9.13 to sketch a polynomial-time algorithm which converts any formula F into an equisatisfiable formula in CNF. [Hint: First convert into NNF.]

Exercise 31 Give an informal, but plausible, argument that the problem "Decide if a formula is satisfiable" is NP-complete. [Use Cook's Theorem and Exercise 30.]

[Slideset 2: Proof of Cook's Theorem]

05/17/10

10 Excursus: Is P = NP? [mainly from memory] [blackboard]

Exercise 32 (optional) Show that R and {(1, x) : x ∈ R} ∪ {(2, x) : x ∈ R} have the same cardinality.

Exercise 33 (optional) Show, using a diagonalization argument, that the power set of N is of higher cardinality than N. [Hint: Consider only infinite subsets of N.]

05/19/10

11 If P ≠ NP ... [A mix, mainly from [1, Chapter 7], with some from [4, Chapter 17] and other sources.]

11.1 Problems between P and NP

NPC denotes the class of all NP-complete languages. If P ≠ NP, is NPI = NP \ (P ∪ NPC) empty?

11.1 Theorem Let B be a recursive language such that B ∉ P. Then there exists D ∈ P such that A = D ∩ B ∉ P, A is (polytime) reducible to B, but B is not (polytime) reducible to A.

Exercise 34 (skipped) Assuming P ≠ NP, why does Theorem 11.1 show that NPI ≠ ∅?

Iteratively reapplying the argument from Exercise 34 yields an infinite collection of distinct complexity classes between P and NP.

Are there any natural candidates for problems in NPI? Perhaps the following.

Graph isomorphism: Given two graphs G = (V, E) and G' = (V', E'), is there a bijection f : V → V' such that (u, v) ∈ E iff (f(u), f(v)) ∈ E'?

Composite numbers: Given k ∈ N, are there n, m ∈ N with n, m > 1 such that k = m·n?

11.2 The Polynomial Hierarchy

11.2 Definition An oracle TM (OTM) is a standard TM with an additional oracle tape with a read-write oracle head. It has two additional distinguished states: the oracle consultation state and the resume-computation state. Also, an oracle function g : Σ* → Σ* is given. Computation is as for a 2-tape TM, except in the oracle state: if y is on the oracle tape (right of the first blank), then it is rewritten to g(y) (with the rest blank) in one step, and the state is changed to the resume state.

Let C and D be two complexity classes (sets of languages). Denote by C^D the class of all languages which are accepted by an OTM of complexity C, where computation of the oracle function has complexity D.

11.3 Remark P^P = P. [The one-step oracle consultation can be performed by a TM which runs in polynomial time. Note that there can be at most polynomially many such consultations.]

Exercise 35 Show: P^NP contains all languages which are (polynomial-time) reducible to a language in NP.

05/24/10

11.4 Definition The polynomial hierarchy:

Σ^p_0 = Π^p_0 = ∆^p_0 = P

and for all k ≥ 0:

∆^p_{k+1} = P^{Σ^p_k}
Σ^p_{k+1} = NP^{Σ^p_k}
Π^p_{k+1} = co-Σ^p_{k+1}

PH is the union of all classes in the polynomial hierarchy.

Exercise 36 Show Σ^p_1 = NP.

11.5 Remark Π^p_1 = co-NP^P = co-NP; ∆^p_2 = P^{Σ^p_1} = P^NP.

11.6 Remark

Σ^p_i ⊆ ∆^p_{i+1} ⊆ Σ^p_{i+1}
Π^p_i ⊆ ∆^p_{i+1} ⊆ Π^p_{i+1}
Σ^p_i = co-Π^p_i

It is not known whether the inclusions are proper. If any Σ^p_k equals Σ^p_{k+1} or Π^p_k, then the hierarchy collapses above k. In particular, if P = NP, then P = PH.

The following problem may be in Σ^p_2 = NP^NP:

Maximum equivalent expression: Given a formula F and k ∈ N, is there F_1 ≡ F with k or fewer occurrences of literals?

[NP-hardness: because SAT reduces to it. In Σ^p_2: use SAT (non-CNF version) as oracle. The OTM first guesses F_1, then consults the oracle.]

Exercise 37 Spell out in more detail how SAT reduces to maximum equivalent expression.

12 Beyond NP [Mainly from [4, Chapter 17], plus some from [1, Chapter 7] and other sources.]

Space complexity: use a modified k-tape TM (off-line TM) with an additional read-only input tape and an additional write-only output tape. The latter is not needed for language recognition tasks.

12.1 Definition The space complexity of a TM M is the function sc_M : N → N such that sc_M(n) is the maximum number of tape squares read on any work tape by a computation of M when initiated with an input string of length n. (For an ND TM, take the maximum over every possible computation, as usual.)

12.2 Example A 3-tape palindrome recognizer M with sc_M(n) = O(log_2(n)). Idea: use the work tapes to hold numbers (in binary encoding). They serve as counters for identifying and comparing the i-th element from the left with the i-th element from the right, until a mismatch is found (or the palindrome is accepted).

12.3 Remark Palindrome recognition is in the complexity class LOGSPACE.
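The counter idea of Example 12.2 translates directly into code. In the sketch below (an illustration, not the TM of the notes), the string plays the role of the read-only input tape, and the two integer indices play the role of the O(log n)-bit binary counters on the work tapes; no other storage is used.

```python
def is_palindrome_logspace(w):
    """Palindrome recognizer in the spirit of Example 12.2: the only
    work storage is two counters of O(log n) bits each; the input
    itself is never copied or modified, matching the read-only input
    tape of an off-line TM."""
    i, j = 0, len(w) - 1          # two binary counters
    while i < j:
        if w[i] != w[j]:          # compare the symbol at position i from
            return False          # the left with position i from the right
        i, j = i + 1, j - 1
    return True
```

Since the counters range over positions 0..n-1, each fits in O(log_2(n)) bits, which is why the recognizer witnesses membership in LOGSPACE (Remark 12.3).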

12.4 Theorem For any TM M, sc_M(n) ≤ tc_M(n) + 1.

12.5 Definition An off-line TM is said to be s(n) space-bounded if the maximum number of tape squares used on a work tape with input of length n is at most max{1, s(n)}. (This can also be used with non-terminating TMs.)

12.6 Theorem (Savitch's Theorem) Let M be a 2-tape ND TM with space bound s(n) ≥ log_2(n) which accepts L. Then L is accepted by a deterministic TM with space bound O(s(n)^2).

12.7 Corollary P-Space = NP-Space.

Proof: Obviously, P-Space ⊆ NP-Space. If L ∈ NP-Space, then it is accepted by an ND TM with polynomial space bound p(n). Then by Savitch's Theorem, L is accepted by a deterministic TM with space bound O(p(n)^2).

12.8 Definition EXPTIME is the complexity class of problems solvable by a (deterministic) TM in O(2^p(n)) time, for some polynomial p. NEXPTIME is the corresponding ND class. 2-EXPTIME/N2-EXPTIME are defined similarly with time bound O(2^2^p(n)). (Similarly n-EXPTIME — the exponential hierarchy.) (EXPSPACE is the corresponding space complexity class.)

Exercise 38 Show that EXPSPACE = NEXPSPACE.

12.9 Remark It is known that

LOGSPACE ⊆ NLOGSPACE ⊆ P ⊆ NP ⊆ PH ⊆ P-Space = NP-Space ⊆ EXPTIME ⊆ NEXPTIME ⊆ EXPSPACE ⊆ 2-EXPTIME.

Also known:

LOGSPACE ⊊ P-Space
P ⊊ EXPTIME
NP ⊊ NEXPTIME
P-Space ⊊ EXPSPACE
If P = NP, then EXPTIME = NEXPTIME.
If P = NP, then P = PH.

12.10 Remark The Web Ontology Language OWL-DL (see [2]) is N2-EXPTIME-complete. The description logic ALC (see [3]) is EXPTIME-complete.

Exercise 39 Show: If NPI = ∅, then NP ≠ EXPTIME.

12.11 Example P-Space-complete problems:

Regular expression non-universality: Given a regular expression α over a finite alphabet Σ, is the set represented by α different from Σ*?

Linear space acceptance: Given a linear space-bounded TM M and a finite string x over its input alphabet, does M accept x?

13 NP-complete Problems [Part of [4, Chapter 16]] [class presentations]

References

[1] M. R. Garey and D. S. Johnson. Computers and Intractability. Freeman, 1979.

[2] P. Hitzler, M. Krötzsch, B. Parsia, P. F. Patel-Schneider, and S. Rudolph, editors. OWL 2 Web Ontology Language: Primer. W3C Recommendation 27 October 2009. Available from http://www.w3.org/TR/owl2-primer/.

[3] P. Hitzler, M. Krötzsch, and S. Rudolph. Foundations of Semantic Web Technologies. Chapman & Hall/CRC, 2009.

[4] T. A. Sudkamp. Languages and Machines. Addison Wesley, 3rd edition, 2006.