CS6902 Theory of Computation and Algorithms


Any mechanical (automatic), discrete computation for problem solving contains at least three components: - problem description - computational tool - procedure/analysis

Problem descriptions: formalizing a problem.

Sort the names of this class into alphabetical (lexicographic) order. - Abstract version of the problem: Instance: A set of names (last name, then middle name, then given name). Question: Find a list of these names in lexicographic order. - Decision version of the problem (Yes/No): Instance: A set of names and a list of them. Question: Is the output list in lexicographic order? (Yes/No)

- Concrete version: by a reasonable encoding method, convert the decision version of the problem into binary strings, say 01-strings. - Problem and algorithm -> program in a high-level language (C++, Java, etc.) (written by the programmer) -> assembly language (produced by the compiler) -> machine code: 01-strings (produced by the assembler). This is machine acceptable.

A language L over an alphabet Σ (a finite set of symbols, say {0,1}) is any set of strings made up of symbols from Σ. - L = {10, 11, 101, 111, 1011, 1101, 10001, ...} is the language of binary representations of prime numbers. - L = {x1x2 | x1 in L1 and x2 in L2} is the concatenation of the two languages L1 and L2.

Problems can be formalized as languages. - The decision problem PATH: Instance: A graph G=(V,E), vertices u and v, and a nonnegative integer k. Question: Does a path exist in G between u and v whose length is at most k? - The formal-language version of PATH: PATH = {<G, u, v, k> | G=(V,E) is a graph, u, v in V, k is a nonnegative integer, and there exists a path from u to v in G whose length is at most k}.
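The PATH language can be decided with breadth-first search, since BFS reaches every vertex by a shortest path. A minimal sketch, assuming a graph encoded as an adjacency dictionary (the encoding of <G, u, v, k> is an illustrative choice, not fixed by the notes):

```python
from collections import deque

def path(graph, u, v, k):
    """Decide PATH: is there a path from u to v of length at most k?

    graph: adjacency dict {vertex: [neighbors]}.  BFS visits vertices
    in order of distance from u, so the first time v is dequeued,
    dist[v] is the length of a shortest u-v path.
    """
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            return dist[x] <= k
        for y in graph.get(x, []):
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return False  # v is unreachable from u

# A 4-cycle: 1-2, 2-3, 3-4, 4-1
G = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [3, 1]}
print(path(G, 1, 3, 2))  # True: 1 -> 2 -> 3
print(path(G, 1, 3, 1))  # False: there is no edge between 1 and 3
```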

Tools: - human hand with pen and paper, - calculator, - computer. The mathematical abstractions of computational tools are computational models. The power of a computational tool is reflected by its corresponding model.

What is a computer? What is the computational power of a computer? - Turing machine; resources: storage (space) and time.

Models of machines we will study: - Finite state automata (FSA): a finite amount of storage, finitely many states. - Linear bounded automata (LBA): storage bounded by the size of the input, finitely many states. - Turing machine (TM): an array - the tape - with an unlimited number of entries, finitely many states.

- Regarding the analysis of the computational power of machines (algorithms), two questions arise: practical limitations (complexity) and logical limitations (computability). We approach both issues formally, i.e., mathematically.

Complexity: algorithms / problems. A hierarchy of problems according to the complexity of the algorithms that solve them: Undecidable (unsolvable) problems. Decidable (solvable) problems. NP-hard and NP-complete problems. Polynomial-time solvable problems.

Figure 1: A simple illustration of the complexity of problems

Undecidable (unsolvable) problems: (no algorithm exists) The halting problem: Does there exist a program (algorithm/Turing machine) Q that determines, for an arbitrary given program P and input data D, whether or not P will halt on D?

Post's correspondence problem. A correspondence system is a finite set P of ordered pairs of nonempty strings. A match of P is any string w such that for some n > 0 and some pairs (u1, v1), (u2, v2), ..., (un, vn) in P, w = u1 u2 ... un = v1 v2 ... vn.

For example, if P = {(a, ab), (b, ca), (ca, a), (abc, c)}, then w = abcaaabc is a match of P: choosing the sequence of five pairs (a, ab), (b, ca), (ca, a), (a, ab), (abc, c), the concatenation of the first components and the concatenation of the second components are both equal to w = abcaaabc. Post's correspondence problem is to determine, given a correspondence system, whether that system has a match.
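Although deciding whether any match exists is undecidable (shown later), verifying a proposed match is straightforward. A minimal checker for the example above:

```python
def is_match(pairs, indices):
    """Check whether a chosen sequence of pair indices is a match of P:
    the concatenation of the first components must equal the
    concatenation of the second components."""
    top = "".join(pairs[i][0] for i in indices)
    bottom = "".join(pairs[i][1] for i in indices)
    return top == bottom

P = [("a", "ab"), ("b", "ca"), ("ca", "a"), ("abc", "c")]
# The sequence from the example: (a,ab), (b,ca), (ca,a), (a,ab), (abc,c)
seq = [0, 1, 2, 0, 3]
print(is_match(P, seq))  # True: both components spell "abcaaabc"
print(is_match(P, [0]))  # False: "a" != "ab"
```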

Hilbert's tenth problem: to devise an `algorithm' that tests whether or not a polynomial has an integral root. A polynomial is a sum of terms, for example, 6x^3 y z^2 + 3x y^2 - x^3 - 10. An integral root is an assignment of integer values to the variables that makes the polynomial zero.

For example, the above polynomial has an integral root: x=5, y=3, z=0 (0 + 135 - 125 - 10 = 0). Let D denote the set of polynomials D = {p | p is a polynomial with an integral root}. Hilbert's tenth problem becomes: Is D decidable? The answer is that it is not decidable.

A brief idea of the proof. Let D1 be the set of such polynomials with a single variable, i.e., D1 = {p | p is a polynomial over x with an integral root}; for example, 4x^3 - 2x^2 + x - 7. Let M be a Turing machine with input: a polynomial p over the variable x. Program: evaluate p with x set successively to the values 0, 1, -1, 2, -2, 3, -3, ... If at any point the polynomial evaluates to zero, accept.

M recognizes D1. For general polynomials, we can devise a Turing machine M' that recognizes D similarly: set the variables successively to all combinations of values, and if the polynomial ever evaluates to zero, accept. For example, with two variables the pairs can be enumerated as
x: 0 0 1  0 -1 0 2  0 ...
y: 0 1 0 -1  0 2 0 -2 ...
Can we instead set the value pattern as
x: 0 0  0 0  0 1 1  1 1  1 ...
y: 0 1 -1 2 -2 0 1 -1 2 -2 ...?
(No: y ranges over infinitely many values, so this enumeration would never move past x = 0.)

M can be modified into a decider for D1: the absolute value of an integral root of a single-variable polynomial can be bounded. For example, 4x^3 - 2x^2 + x - 7 has the bound |x| <= 7. Since the bound is finite, if every value within the bound has been tried and the polynomial never evaluates to zero, stop and reject. Thus, the modified M decides D1. The general bound for D1 is k c_max / c_1, where k is the number of terms, c_max the largest absolute value of a coefficient, and c_1 the absolute value of the leading coefficient. Matijasevic proved that no such bound exists for D, the multivariable case.
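The single-variable decider can be sketched directly: search every integer inside the bound k·c_max/c_1. A minimal sketch, assuming a polynomial encoded as a coefficient list (lowest degree first) with a nonzero leading coefficient:

```python
def has_integer_root(coeffs):
    """Decide D1 = {p | p is a one-variable polynomial with an integral root}.

    coeffs[i] is the coefficient of x**i, so [-7, 1, -2, 4] encodes
    4x^3 - 2x^2 + x - 7.  Integral roots are bounded in absolute value
    by k * c_max / c_1 (k terms, c_max largest |coefficient|, c_1
    |leading coefficient|), so a finite search decides membership.
    """
    k = len(coeffs)
    c_max = max(abs(c) for c in coeffs)
    c1 = abs(coeffs[-1])
    bound = k * c_max // c1
    for x in range(-bound, bound + 1):
        if sum(c * x**i for i, c in enumerate(coeffs)) == 0:
            return True
    return False

print(has_integer_root([-4, 0, 1]))      # x^2 - 4: root x = 2 -> True
print(has_integer_root([-7, 1, -2, 4]))  # 4x^3 - 2x^2 + x - 7 -> False
```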

An intuitive proof for the halting problem. Let us assume there exists an algorithm Q with the required property, i.e., for an arbitrary algorithm P with input data D, Q(P(D))
- runs forever if P(D) runs forever;
- halts if P(D) halts.

New algorithm B. Note that algorithm P is a string, and data D is a string too. Thus Q(P(P)) is also a legal input to Q, regarding P itself as data. Design a new algorithm B(X), for any algorithm X, such that B(X)
- halts if Q(X(X)) runs forever;
- runs forever if Q(X(X)) halts.

The construction of B. Note that B can be constructed because Q can be constructed. For example, we may build B from Q as follows: where Q detects that P(D) stops and would itself stop, the modified Q (called B) runs forever instead; where Q detects that P(D) runs forever and would itself run forever, the modified Q (called B) stops instead.

Contradiction. Let B run with input data B. Then B(B) will either halt or run forever, and this can be detected by Q(B(B)). If B(B) stops, then Q(B(B)) stops, which forces B(B) to run forever by the construction of B. --- B(B) both stops and runs forever.

Continued. If B(B) runs forever, then Q(B(B)) runs forever, which forces B(B) to stop. --- B(B) both stops and runs forever. --- All statements follow logically from the assumption --- the assumption is wrong --- there cannot exist such a program Q.

The diagonalization method. This method is due to Georg Cantor (1873). Definitions: a function f: A -> B is one-to-one if it never maps two different elements to the same place; f is onto if it hits every element of B; f is a correspondence if it is both one-to-one and onto. A correspondence pairs the elements of A and B.

A correspondence can be used to compare the sizes of two sets, and Cantor extended this idea to infinite sets. Definition: a set A is countable if either it is finite or it has the same size as the set of natural numbers N. For example, N = {1,2,3,...}, E = {2,4,6,...}, and O = {1,3,5,...} all have the same size and hence are countable. Let Q be the set of rational numbers: Q = {m/n | m, n in N}.

1/1 1/2 1/3 1/4 1/5 ...
2/1 2/2 2/3 2/4 2/5 ...
3/1 3/2 3/3 3/4 3/5 ...
4/1 4/2 4/3 4/4 4/5 ...
5/1 5/2 5/3 5/4 5/5 ...
Walking through this grid along the anti-diagonals lists every fraction, so Q is countable.
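The anti-diagonal walk through the grid can be sketched as a small enumerator; it lists every pair m/d exactly once (duplicates such as 2/2 appear, but dropping them still leaves a countable listing):

```python
def rationals(n):
    """First n fractions m/d in Cantor's zigzag order over the grid:
    walk the anti-diagonals m + d = s for s = 2, 3, 4, ..."""
    out = []
    s = 2  # m + d, starting at the corner 1/1
    while len(out) < n:
        for m in range(1, s):
            d = s - m
            out.append(f"{m}/{d}")
            if len(out) == n:
                break
        s += 1
    return out

print(rationals(10))
# ['1/1', '1/2', '2/1', '1/3', '2/2', '3/1', '1/4', '2/3', '3/2', '4/1']
```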

The set of real numbers R is uncountable. Suppose f were a correspondence between N and R:
n  f(n)
1  3.14159265...
2  55.55555555...
3  1.41427689...
4  0.50000000...
Construct a real number x by giving its decimal representation, such that x does not equal any f(n).

To do that, let the first digit of x after the decimal point differ from the first digit of the first real, say x = .2...; then let the second digit of x differ from the second digit of the second real, say x = .24...; and so on. The new real number x differs from every real in the table in at least one digit. Therefore, x does not correspond to any natural number, and R is uncountable. (Can we choose 0 and 9 as digits of x? No: some reals have two decimal expansions, e.g. 0.4999... = 0.5000..., and such a number could then slip through.)
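The diagonal construction can be sketched over any finite prefix of the table, representing each real by its string of digits after the decimal point (an assumed encoding for illustration). Digits 0 and 9 are avoided for the reason noted above:

```python
def diagonal(reals):
    """Given a list of decimal expansions (digit strings after the point),
    build x whose i-th digit differs from the i-th digit of the i-th real.
    Only digits 4 and 5 are used, avoiding 0 and 9 so that x cannot
    coincide with a listed real via a second expansion."""
    digits = []
    for i, r in enumerate(reals):
        d = int(r[i])
        digits.append("4" if d != 4 else "5")  # any digit != d works
    return "0." + "".join(digits)

table = ["14159265", "55555555", "41427689", "50000000"]
print(diagonal(table))  # 0.4454 -- differs from entry i in digit i
```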

More examples. Is Σ* for Σ = {a,b} countable? Yes. List the strings by length:
ε
a b
aa bb ab ba
aaa abb aba aab bbb baa bab bba
aaaa aabb aaba aaab abbb abaa abab abba baaa babb baba baab bbbb bbaa bbab bbba
...
There are 2^L strings of length L.

A correspondence between the natural numbers N and the strings can assign N = 2^L to the first string of length L, and arbitrarily assign an integer between 2^L and 2^(L+1) - 1 to the rest of the strings of length L. For example, the strings of length 2 can be numbered
4 --- aa
5 --- ab
6 --- bb
7 --- ba.
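One concrete choice of this correspondence can be sketched as follows (it orders the strings inside each length-L block in binary with a = 0 and b = 1, a slightly different assignment from the arbitrary one in the example above):

```python
def string_of(n):
    """Map the natural number n >= 1 to a string over {a, b}:
    the length is L = floor(log2 n), i.e. 2**L <= n < 2**(L+1),
    and the offset n - 2**L, written as L binary digits with
    a = 0 and b = 1, picks the string inside the length-L block."""
    L = n.bit_length() - 1
    offset = n - (1 << L)
    return "".join("ab"[(offset >> (L - 1 - i)) & 1] for i in range(L))

print([string_of(n) for n in range(1, 8)])
# ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
```

Every natural number n >= 1 yields exactly one string and vice versa, so Σ* is countable.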

Diagonalization for the halting problem. Let M1, M2, M3, ... be all Turing machines, listed in the rows of an infinite table. They include, in particular, the machines P, Q, and B above (an algorithm regarded as a machine). Let (M1), (M2), (M3), ... be their descriptions (as strings), listed in the columns. Let entry (i,j) represent the result of running the i-th machine on the j-th description as input.

     (M1)       (M2)       (M3)       ...
M1   accept     rej/nstop  accept
M2   accept     accept     rej/nstop
M3   rej/nstop  accept     rej/nstop
...
When a machine M runs on a description as input, it either accepts, rejects, or does not stop.

When machine Q runs on a description (machine M running on input D), it either accepts or rejects:
     (M1)    (M2)    (M3)    ...
M1   accept  reject  accept
M2   accept  accept  reject
M3   reject  accept  reject
...

When machine B runs on the description of B itself, it must both accept and reject:
     (M1)    (M2)    (M3)    ...  (B)
M1   accept  reject  accept
M2   accept  accept  reject
M3   reject  accept  reject
...
B    reject  reject  accept  ...  ?
There are more problems than machines to solve them.

Polynomial-time decidable problems: (algorithms exist and are relatively efficient) Sorting a set of elements. Finding the maximum, minimum, and median of a set of elements. Matrix multiplication. Matrix-chain multiplication. Single-source shortest paths. The convex hull of a set of points. Voronoi diagrams. Delaunay triangulations.

NP-hard, NP-complete problems: (algorithms exist, but no efficient ones are known) The Boolean satisfiability problem. The vertex cover problem. The Hamiltonian-cycle problem: a Hamiltonian cycle of an undirected graph G=(V,E) is a simple cycle that contains each vertex in V; does a graph G have a Hamiltonian cycle? The traveling salesperson problem (brute force takes Ω(m!) time, where m is the number of vertices in V).

The measurement of the efficiency of algorithms: (1) The worst-case time (and space). Insertion sort takes O(n^2) worst-case time. (2) The average-case time. Quicksort takes O(n log n) average-case time and O(n^2) worst-case time. Other analysis methods:

The amortized analysis. The randomized analysis. In an amortized analysis, the time complexity is obtained by taking the average over all the operations performed. Even though a single operation in the sequence may be very expensive, the average cost of all operations in the sequence may be low. Example: incrementing a binary counter. We count the number of bit flips in the counter as we keep adding one at its lowest bit.

0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1
0 0 0 0 0 0 1 0
0 0 0 0 0 0 1 1
0 0 0 0 0 1 0 0
0 0 0 0 0 1 0 1
0 0 0 0 0 1 1 0
0 0 0 0 0 1 1 1
0 0 0 0 1 0 0 0
...
1 1 1 1 1 1 1 1

Increment(A)
  i <- 0
  while i < length[A] and A[i] = 1
    do A[i] <- 0
       i <- i + 1
  if i < length[A]
    then A[i] <- 1

In the conventional worst-case analysis, consider all k bits in the counter being 1's: the next increment causes k flips, so n increments cause O(kn) flips. Note, however, that A[0] flips every time, A[1] every other time, A[2] every fourth time, ..., A[i] every 2^i-th time. Thus the total number of flips is
sum_{i=0}^{log n} n/2^i < n sum_{i=0}^{infinity} 1/2^i = 2n.
The average cost of each increment is O(1), not O(k).
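The amortized bound can be checked empirically. A minimal sketch of the Increment procedure above, instrumented to count flips; the total stays below 2n even though a single increment can cost up to k flips:

```python
def count_flips(n, k=16):
    """Increment a k-bit binary counter n times, counting bit flips.
    A[0] flips on every increment, A[1] on every 2nd, A[2] on every
    4th, ..., so the total is sum n/2^i < 2n."""
    A = [0] * k
    flips = 0
    for _ in range(n):
        i = 0
        while i < k and A[i] == 1:   # carry: turn the trailing 1s into 0s
            A[i] = 0
            flips += 1
            i += 1
        if i < k:                    # set the first 0 bit to 1
            A[i] = 1
            flips += 1
    return flips

for n in [10, 100, 1000]:
    print(n, count_flips(n))   # total flips stays under 2n
```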

Optimal algorithms. Upper bound of a problem: (1) the number of basic operations sufficient to solve the problem; (2) the minimum time complexity among all known algorithms for solving the problem; (3) an upper bound can be established by exhibiting an algorithm.

Lower bound of a problem: (1) the number of basic operations necessary to solve the problem; (2) the time complexity necessary for any algorithm solving the problem; (3) a lower bound is much more difficult to establish.

An algorithm is optimal if its time complexity (i.e., its upper bound) matches the lower bound of the problem. For example, consider the problem of sorting n elements by comparisons. Lower bound = log2(n!), as there are n! different outcomes (permutations) and any decision tree with n! leaves must have height >= log2(n!) = Ω(n log n). Clearly, merge sort is optimal and insertion sort is not.
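The information-theoretic bound is easy to tabulate for small n and compare against merge sort's comparison count, which matches it up to a constant factor:

```python
import math

def comparison_lower_bound(n):
    """Lower bound for comparison sorting: a decision tree with n!
    leaves has height at least ceil(log2 n!)."""
    return math.ceil(math.log2(math.factorial(n)))

# Merge sort performs at most n*ceil(log2 n) comparisons, so its
# upper bound matches the lower bound asymptotically.
for n in [4, 8, 16]:
    print(n, comparison_lower_bound(n), n * math.ceil(math.log2(n)))
```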

While you may already have learned some methods of establishing lower bounds, such as decision trees and adversary (oracle) arguments, we shall also introduce a very useful method: establishing upper and lower bounds via transformable problems. Decision tree. Adversary. Transformation.

Figure 2: Transfer of upper and lower bounds between transformable problems.

Suppose we have two problems, problem α and problem β, which are related so that problem α can be solved as follows: 1. The input to problem α is converted into a suitable input to problem β. 2. Problem β is solved. 3. The output of problem β is transformed into a correct solution to problem α. We then say that problem α has been transformed to problem β. If the transformations of step 1 and step 3 together can be done in O(τ(N)) time, where N is the size of problem α, then we say that α is τ(N)-transformable to β.

Proposition 1 (lower bound via transformability). If problem α is known to require at least T(N) time and α is τ(N)-transformable to problem β, then β requires at least T(N) - O(τ(N)) time. Proposition 2 (upper bound via transformability). If problem β can be solved in T(N) time and problem α is τ(N)-transformable to β, then α can be solved in at most T(N) + O(τ(N)) time.

For example, Element Uniqueness: given N real numbers, decide whether all of them are distinct. (Denote this problem as α.) This problem has a known lower bound: in the algebraic decision tree model, any algorithm that determines whether the members of a set of N real numbers are distinct requires Ω(N log N) tests. Now consider another problem, Closest Pair: given N points in the Euclidean plane, find the closest pair of points (the shortest Euclidean distance). Denote this problem as β.

We want to find the lower bound of this problem. (Can we apply the decision tree method or the adversary method to this problem directly?)

We transform the Element Uniqueness problem to the Closest Pair problem. Given a set of real numbers x1, x2, ..., xN (the input to α), treat them as the points (xi, 0) on the line y = 0 in the xy-coordinate system (a suitable input to β). Apply any algorithm solving β; the solution is the closest pair. If the distance between this pair is nonzero, then the numbers are distinct; otherwise they are not. (This converts the solution of β to a solution of α.) Here τ(N) = O(N). By Proposition 1, β takes at least Ω(N log N) - O(N) = Ω(N log N) time, which is the desired lower bound.
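The transformation can be sketched directly. The brute-force closest-pair routine below stands in for a real O(N log N) algorithm; only the O(N) conversion in `all_distinct` is the point of the example:

```python
import math

def closest_pair_distance(points):
    """Brute-force closest pair (a stand-in for an O(N log N) algorithm)."""
    return min(math.dist(p, q)
               for i, p in enumerate(points)
               for q in points[i + 1:])

def all_distinct(numbers):
    """Element Uniqueness via Closest Pair: place each number x_i at the
    point (x_i, 0) on the line y = 0; two numbers are equal iff the
    closest pair of points has distance zero.  The conversion is O(N)."""
    points = [(x, 0.0) for x in numbers]
    return closest_pair_distance(points) > 0

print(all_distinct([3.1, 2.7, 5.0]))  # True: all numbers differ
print(all_distinct([3.1, 2.7, 3.1]))  # False: 3.1 appears twice
```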

Using the same method, we can prove that the lower bound of sorting by comparisons is Ω(n log n), by transforming Element Uniqueness to Sorting. The lower bounds of a whole chain of problems can be proved in this manner.

Reduction for intractability. The above transformation method can be used to prove that a problem is intractable or tractable when the cost of the transformation is bounded by a polynomial. For example, CLIQUE: Instance: A graph G=(V,E) and a positive integer J <= |V|.

Question: Does G contain a clique of size J or more? That is, a subset V' of V such that |V'| >= J and every two vertices in V' are joined by an edge in E. VERTEX COVER (VC): Instance: A graph G=(V,E) and a positive integer k <= |V|. Question: Is there a vertex cover of size k or less for G? That is, a subset V' of V such that |V'| <= k and, for each edge in E, at least one of its endpoints is in V'.

Let A be VC and B be CLIQUE. Every instance of A can be converted into an instance of B in polynomial time: let G=(V,E) and k <= |V| be an instance of VC; the corresponding instance of CLIQUE is the complement graph G^c with the integer J = |V| - k. Converting the output of B into an output of A takes polynomial (in fact constant) time, since both are yes/no answers. => If A is intractable, then B is intractable.

Every instance of B can be converted into an instance of A in polynomial time: let G=(V,E) and J <= |V| be an instance of CLIQUE; the corresponding instance of VC is G^c with the integer k = |V| - J. => If B is tractable, then A is tractable.

Figure 3: (a) a graph G on vertices 1..5 with a clique V' = {1,4,5}, J = 3; (b) the complement graph G^c of G; (c) a vertex cover V'' = {2,3} of G^c, k = n - J = 2.
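The instance conversion can be sketched as follows. The 5-vertex graph is an assumed example chosen to mirror the figure ({2,3} covers G, so G^c contains the 3-clique {1,4,5}); the brute-force clique check is only there to confirm the converted instance:

```python
from itertools import combinations

def vc_to_clique(V, E, k):
    """Convert a VERTEX COVER instance (G, k) to a CLIQUE instance:
    the complement graph G^c with target J = |V| - k.  G has a vertex
    cover of size k iff G^c has a clique of size |V| - k."""
    Ec = ({frozenset(e) for e in combinations(V, 2)}
          - {frozenset(e) for e in E})
    return V, Ec, len(V) - k

def has_clique(V, E, j):
    """Brute-force CLIQUE check (exponential -- for illustration only)."""
    E = {frozenset(e) for e in E}
    return any(all(frozenset(p) in E for p in combinations(S, 2))
               for S in combinations(V, j))

# {2, 3} is a vertex cover of G below, so G^c has a clique of size 3.
V = [1, 2, 3, 4, 5]
E = [(1, 2), (2, 3), (3, 4), (2, 5), (3, 5)]
Vc, Ec, J = vc_to_clique(V, E, 2)
print(J, has_clique(Vc, Ec, J))  # 3 True: {1, 4, 5} is a clique in G^c
```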

Reduction for decidability. Mapping reducibility: A is mapping reducible to B (written A <=_m B) if there is a computable function f such that w is in A if and only if f(w) is in B; f is called a reduction of A to B. If A <=_m B and B is decidable, then A is decidable. If A <=_m B and A is undecidable, then B is undecidable.

Post correspondence problem. Some instances obviously have no match; for example, (abc, ab), (ca, a), (acc, ba), since the first element of each ordered pair is longer than the second. Let us define PCP more precisely: PCP = {[P] | P is an instance of the Post correspondence problem with a match},

where P = {[t1/b1], [t2/b2], ..., [tk/bk]}, and a match is a sequence i1, i2, ..., is such that t_i1 t_i2 ... t_is = b_i1 b_i2 ... b_is. Proof idea: show that for any TM M and input w, we can construct an instance P such that a match is an accepting computation history for M on w. Thus, if we could determine whether the instance P has a match, we could determine whether M accepts w (the halting problem).

Let us call [ti/bi] a domino. In the construction of P, we choose the dominos so that a match forces a simulation of M accepting w. Let us first consider a simpler case, in which M on w never moves its head off the left-hand end of the tape and the PCP requires that a match always start with [t1/b1]. Call this problem MPCP: MPCP = {[P] | P is an instance of the Post correspondence problem with a match starting at [t1/b1]}.

Proof. Suppose TM R decides the PCP; we construct a TM S that decides A_TM. Let M = (Q, Σ, Γ, δ, q0, q_accept, q_reject), where Q is the set of states, Σ the input alphabet, Γ the tape alphabet, and δ the transition function of M. S constructs an instance P of the PCP such that P has a match if and only if M accepts w. The construction of the MPCP instance P consists of 7 parts.

1. Let [# / # q0 w1 w2 ... wn #] be the first domino [t1/b1] in P, where C1 = q0 w = q0 w1 w2 ... wn is the first configuration of M and # is the separator. The current P forces the top string to be extended in order to form a match.

To do so, we shall provide additional dominos that allow this extension, but at the same time these dominos force a single step of the simulation of M, as shown in the bottom part of each domino. Parts 2, 3, and 4 are as follows: 2. For every a, b in Γ and every q, r in Q with q != q_reject, if δ(q, a) = (r, b, R), put [qa/br] into P. (The head moves right.)

3. For every a, b, c in Γ and every q, r in Q with q != q_reject, if δ(q, a) = (r, b, L), put [cqa/rcb] into P. (The head moves left.) 4. For every a in Γ, put [a/a] into P. (The head is not on symbol a.) What do these construction parts mean? Consider the following example. Let Γ = {0, 1, 2, e}, where e is the empty (blank) symbol, let w = 0100, and let the start state of M be q0.

Part 1 puts in the first domino, and matching starts:
top:    #
bottom: # q0 0 1 0 0 #
Suppose M in q0 reads 0, enters q7, writes a 2 on the tape, and moves the head right; that is, δ(q0, 0) = (q7, 2, R). Part 2 puts in the domino [q0 0 / 2 q7]:
top:    # q0 0
bottom: # q0 0 1 0 0 # 2 q7
Part 3 puts in nothing, and Part 4 puts in [0/0], [1/1], [2/2], and [e/e]:
top:    # q0 0 1 0 0 #
bottom: # q0 0 1 0 0 # 2 q7 1 0 0 #

Part 5 copies the # symbol to separate the successive configurations of M: put [#/#] and [#/e#] into P. The second domino allows us to append the empty symbol e, representing the infinitely many blanks to the right. Thus the current match contains two configurations separated by #:
top:    # q0 0 1 0 0 #
bottom: # q0 0 1 0 0 # 2 q7 1 0 0 #

Now suppose M in q7 reads 1, enters q5, writes a 0 on the tape, and moves the head right; that is, δ(q7, 1) = (q5, 0, R). With the domino [q7 1 / 0 q5] and the dominos [2/2], [0/0], [0/0], we have
top:    # q0 0 1 0 0 # 2 q7 1 0 0 #
bottom: # q0 0 1 0 0 # 2 q7 1 0 0 # 2 0 q5 0 0 #

Then suppose M in q5 reads 0, enters q9, writes a 2 on the tape, and moves the head left; that is, δ(q5, 0) = (q9, 2, L). Part 3 gives the dominos [0 q5 0 / q9 0 2], [1 q5 0 / q9 1 2], [2 q5 0 / q9 2 2], and [e q5 0 / q9 e 2]; only the first one fits:
top:    # q0 0 1 0 0 # 2 q7 1 0 0 # 2 0 q5 0 0 #
bottom: # q0 0 1 0 0 # 2 q7 1 0 0 # 2 0 q5 0 0 # 2 q9 0 2 0 #
This process of matching and simulating M on w continues until q_accept is reached.

We still need to let the top string of the current P catch up with the bottom. To do so, we have Part 6: 6. For every a in Γ, put [a q_accept / q_accept] and [q_accept a / q_accept] into P. This adds pseudo-steps to M after halting, in which the head "eats" the adjacent symbols until no symbol is left. Suppose that M in q9 reads 0 and enters q_accept.

top:    # q0 0 1 0 0 # 2 q7 1 0 0 # 2 0 q5 0 0 # 2 q9 0 2 0 #
bottom: # q0 0 1 0 0 # 2 q7 1 0 0 # 2 0 q5 0 0 # 2 q9 0 2 0 # 2 q_accept 2 0 #

top:    ... # 2 q9 0 2 0 # 2 q_accept 2 0 #
bottom: ... # 2 q9 0 2 0 # 2 q_accept 2 0 # 2 q_accept 0 #

top:    ... # 2 q_accept 2 0 # 2 q_accept 0 #
bottom: ... # 2 q_accept 2 0 # 2 q_accept 0 # 2 q_accept #

7. Finally, we add the domino [q_accept # # / #] to complete the match:
top:    ... # 2 q_accept 0 # 2 q_accept #
bottom: ... # 2 q_accept 0 # 2 q_accept # q_accept #

top:    ... # 2 q_accept # q_accept # #
bottom: ... # 2 q_accept # q_accept # #
The top and bottom strings are now equal, so the match is complete.

To remove the restriction that a match of P must start at the first domino, we add the symbol * to every element of P. If P = {[t1/b1], [t2/b2], ..., [tk/bk]} has a match, then we let
P' = {[*t1 / *b1*], [*t2 / b2*], ..., [*tk / bk*], [*o/o]},
where *t denotes t with a * inserted before every symbol and b* denotes b with a * inserted after every symbol. Clearly, a match of P' must start at the first domino, since only *b1* begins with the symbol * on both the top and the bottom. The final domino [*o/o] allows the top string to catch up with the trailing symbol at the end of the match.