Chapter 2: The Basics
ECS 220: Theory of Computation (slides, 2017, David Doty)
Based on The Nature of Computation by Moore and Mertens
Problem instances vs. decision problems vs. search problems

Decision problem instance: is this graph Eulerian? (i.e., does it have an Eulerian path?)

Decision problem:
  Given: a graph G
  Question: is G Eulerian?

Search problem:
  Given: a graph G
  Find: an Eulerian path in G, if one exists

Search problem: a function f: {0,1}* → {0,1}*
Decision problem: a function φ: {0,1}* → {0,1} (a.k.a. a predicate)
Decision problem: the set of all Eulerian graphs
Decision problem: the subset L_EG ⊆ {0,1}* of binary strings encoding Eulerian graphs (a.k.a. a language)
[Figure: binary strings sorted into L_EG and its complement.]

2.1: Problems and Solutions
The definition of algorithm

A problem is solved by an algorithm that computes the function/predicate associated to the problem.

Formal definition of algorithm (for this class): a program written in your favorite programming language (e.g., Python, C++, Java, Haskell, …)

Equivalent formal definition: Turing machine
  The textbook defers discussion of Turing machines until we have developed most of the basics of computational complexity theory.
  Turing machines are historically important, and still relevant, but not required to understand computational complexity or computability.
Euclid's algorithm for greatest common divisor (~300 BC)

Given: integers a, b
Find: gcd(a, b)

d is a common divisor of a and b if and only if it is a common divisor of b and a mod b.
Thus gcd(a, b) = gcd(b, a mod b).

Python code:

    def gcd(a, b):
        if b == 0:
            return a
        else:
            return gcd(b, a % b)

Example: 66 mod 24 = 18, 24 mod 18 = 6, 18 mod 6 = 0, so gcd(66, 24) = 6.
[Figure: a 66×24 rectangle tiled by successively smaller squares down to 6×6, illustrating the recursion.]

How long does this algorithm take in the worst case?
Time and scaling

We measure running time as a function of input size, e.g., n² steps on inputs of size n.

What is the size n of an instance G=(V,E) of the Eulerian path problem?
  n = |V|?
  n = |V| + |E|?
  n = |s_G| = |V|², where s_G ∈ {0,1}* is the adjacency matrix of G?

What is the size n of an instance (a,b) of the GCD problem?
  n = a + b?
  n = max(a, b)?
  n = log₂(a) + log₂(b) is the number of bits we need to represent a and b.

2.2: Time, Space and Scaling
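The bit-size measure for a GCD instance can be computed directly; a quick sketch, using the values 66 and 24 from the Euclid example:

```python
# Input size in bits for a GCD instance (a, b):
# n = log2(a) + log2(b), i.e., the number of bits needed to write a and b.
a, b = 66, 24
n = a.bit_length() + b.bit_length()
print(a.bit_length(), b.bit_length(), n)  # 7 5 12
```

Note how different this is from n = a + b = 90: the bit size grows only logarithmically in the numeric values.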
Definition of running time

What is a "step"? A reasonable definition depends on the programming language:
  an atomic or basic line of code in an (imperative) program, e.g.,
      a = 4
      a = b + c   (is this reasonable?)
      string = "abcde"
      while a < b:
  but not a line that takes several steps or calls a subroutine, e.g.,
      multiples_of_3 = [3*k for k in range(n)]
      if a < largest_prime_in_range(2, n):
      string = other_string.upper_case()
  a single transition of a Turing machine
  a single application of the update rule in a cellular automaton (many cell updates in parallel)
  a step in a functional or logical programming language? (definition depends on how the interpreter works)

We will not worry too much about the precise definition, since much of computational complexity theory ignores polynomial differences.
Time complexity of Euclid's algorithm

Input: (a, b); size = n = # bits of a and b (assume both are n bits).

We will count how many divisions we need; if it is t(n), and each division takes time d(n), then the total time complexity is t(n)·d(n).

Exercise: If a ≥ b, then a mod b < a/2.
Exercise: Euclid's algorithm performs at most 2 log₂ a = 2n divisions.

Takes O(n²) time if we use the grade-school algorithm for division (Problem 2.11).
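The bound in the second exercise can be sanity-checked empirically; a sketch, where `division_count` is an illustrative helper (not from the slides) that instruments Euclid's algorithm:

```python
import math

def division_count(a, b):
    """Count the mod (division) operations Euclid's algorithm performs."""
    count = 0
    while b != 0:
        a, b = b, a % b
        count += 1
    return count

# Spot-check the exercise's bound: at most 2*log2(a) divisions
# (for a >= b >= 1, a >= 2).  The intuition: every two divisions,
# the first argument at least halves, since a mod b < a/2 when a >= b.
for a in range(2, 300):
    for b in range(1, a + 1):
        assert division_count(a, b) <= 2 * math.log2(a)

print(division_count(66, 24))  # 3 divisions, matching the slide's trace
```

The worst case is consecutive Fibonacci numbers, which still stay under the 2 log₂ a bound.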
Time complexity of factoring

Input: m; size = n = log₂ m, so m ≈ 2ⁿ.
Find: a factor of m in the range [2, m−1], if one exists.
  Why is this enough to help us find the whole prime factorization?

Obvious algorithm (trial division), time complexity (# of divisions) ≈ m^(1/2) = 2^(n/2):

    def find_factor(m):
        for d in range(2, int(m**0.5) + 1):
            if m % d == 0:
                return d

Best known algorithm: number field sieve, time complexity roughly 2^(n^(1/3)) (ignoring log factors).

It is believed by many (not everyone) that there is no polynomial-time algorithm for factoring (there is an ~n³-time quantum algorithm).
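To answer the slide's question of why one factor suffices: repeatedly splitting off the smallest factor yields the full prime factorization. A runnable sketch of trial division plus this reduction (`prime_factorization` is an illustrative helper, not from the slides):

```python
def find_factor(m):
    """Trial division: return the smallest factor of m in [2, sqrt(m)],
    or None if m is prime.  Cost: about sqrt(m) = 2^(n/2) divisions."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return None

def prime_factorization(m):
    """Reduce full factorization to repeated single-factor finding
    (assumes m >= 2).  The smallest factor found is always prime."""
    factors = []
    while True:
        d = find_factor(m)
        if d is None:          # m itself is prime
            factors.append(m)
            return factors
        factors.append(d)
        m //= d

print(prime_factorization(60))  # [2, 2, 3, 5]
```

This reduction calls `find_factor` at most log₂ m = n times, so a polynomial-time factor-finder would give polynomial-time factorization.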
Asymptotic notation

t(n) = O(f(n)): what does it mean? Intuitively, t ≲ f.
  more properly written t = O(f); even more properly, t ∈ O(f)
  t = O(f) if and only if there is a constant C so that, for all n, t(n) ≤ C·f(n)

t = o(f): intuitively, t < f. Formally: lim_{n→∞} t(n)/f(n) = 0.

t = Θ(f) if and only if t = O(f) and f = O(t); intuitively t ≈ f.
t = Ω(f) if and only if f = O(t); intuitively t ≳ f.
t = ω(f) if and only if f = o(t); intuitively t > f.

Examples: 3n² = O(n²); n² = O(3n²); 100n² = o(n^2.1); n^2.1 = o(n³); n^100 = o(2ⁿ); 2ⁿ = o(2^{3n}).

If we say f(n) = O(log n), why don't we bother writing the base of the logarithm?
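The closing question has a one-line answer: by change of base, log_b(n) = log₂(n)/log₂(b), so any two log bases differ by a constant factor, which O-notation absorbs. A numeric sketch:

```python
import math

# Why the base of the logarithm doesn't matter inside O(log n):
# log_2(n) / log_10(n) is the constant log_2(10), independent of n,
# so O(log_2 n) = O(log_10 n) = O(ln n).
for n in [10, 1000, 10**6]:
    ratio = math.log2(n) / math.log10(n)
    print(round(ratio, 4))  # always 3.3219, i.e., log_2(10)
```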
Intrinsic complexity: algorithms vs. problems

Grade-school algorithm to multiply n-digit numbers x and y: O(n²) time.

Better algorithm: break x and y into two pieces, e.g.,
  x = 1234567890    a = 12345   b = 67890
  y = 9876543210    c = 98765   d = 43210
  x = 10^{n/2} a + b
  y = 10^{n/2} c + d
  xy = (10^{n/2} a + b)(10^{n/2} c + d) = 10ⁿ ac + 10^{n/2} (ad + bc) + bd
  Note (a+b)(c+d) − ac − bd = ad + bc, so we can do three multiplications (ac, bd, and (a+b)(c+d)) on n/2-digit integers, plus 6 additions/subtractions and 2 shifts (which take O(n) time each).
  Time complexity: t(n) = 3·t(n/2) + O(n) = O(n^{log₂ 3}) = O(n^1.585)

A Fast Fourier Transform-based algorithm takes time O(n log n).
Open question: is there an O(n)-time algorithm?

Lesson: although the grade-school algorithm seems most natural, the problem itself has an intrinsic time complexity, and the grade-school algorithm doesn't meet it.
[Figure: cartoon plotting "belief that maybe there's a better algorithm than the first one you thought of" against "skill at programming" for biologists, mathematicians, engineers, physicists, and computer scientists; the "uncanny valley" is understanding that complexity is intrinsic to problems, not algorithms.]

2.3: Intrinsic Complexity
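The divide-and-conquer scheme above (Karatsuba's algorithm) can be sketched directly; a minimal base-10 version, assuming the recursion bottoms out at single digits:

```python
def karatsuba(x, y):
    """Multiply x and y with 3 recursive multiplications on half-size
    numbers instead of 4, for O(n^log2(3)) ~ O(n^1.585) time.
    Base 10, matching the slide's example; a sketch, not optimized."""
    if x < 10 or y < 10:
        return x * y                      # single-digit base case
    half = max(len(str(x)), len(str(y))) // 2
    p = 10 ** half
    a, b = divmod(x, p)                   # x = p*a + b
    c, d = divmod(y, p)                   # y = p*c + d
    ac = karatsuba(a, c)
    bd = karatsuba(b, d)
    # (a+b)(c+d) - ac - bd = ad + bc: one multiplication replaces two
    ad_plus_bc = karatsuba(a + b, c + d) - ac - bd
    return p * p * ac + p * ad_plus_bc + bd

print(karatsuba(12, 34))  # 408
```

The three recursive calls and the identity (a+b)(c+d) − ac − bd = ad + bc are exactly the trick on the slide; the shifts are the multiplications by powers of 10.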
Why focus on polynomial running times?

Running time of this algorithm on G=(V,E)?

    def euler(g):
        y = 0
        for node in g.nodes:
            if deg(node) is odd:
                y = y + 1
            if y > 2:
                return False
        return True

  The loop executes n = |V| times. How long does each iteration take? It depends on the time to evaluate deg(node):
    adjacency matrix: Ω(n) time, so ≈ n² for the whole algorithm (could be more if matrix lookups are not O(1) time)
    adjacency list: O(1) time, so ≈ n for the whole algorithm

These choices affect running time measurements by no more than a polynomial factor (most choices affect it only by a linear factor).

Conclusion: to understand fundamental properties of computation, and not details of individual modeling choices, consider equivalences up to a polynomial factor.

"Computer science is no more about [technical details of] computers than astronomy is about telescopes." — folklore, often misattributed to Edsger Dijkstra
"Rather than studying the art of grinding lenses and mirrors, let us turn our attention to the stars." — actual quote from the textbook (Cris Moore and Stephan Mertens)

2.4: The Importance of Being Polynomial
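The degree-counting pseudocode can be fleshed out as a runnable sketch; the dict-of-lists adjacency representation here is my assumption (the slides only assume deg(node) is O(1) with adjacency lists), and the check is correct for connected graphs:

```python
def euler(g):
    """Return True iff g (assumed connected) has an Eulerian path,
    i.e., at most 2 vertices of odd degree.
    g is an adjacency list: {node: [neighbors]}."""
    odd = 0
    for node, neighbors in g.items():
        if len(neighbors) % 2 == 1:   # deg(node) in O(1) from the list
            odd += 1
        if odd > 2:                   # early exit, as in the slide
            return False
    return True

# Triangle with a pendant edge: exactly 2 odd-degree vertices (3 and 4).
g = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
print(euler(g))  # True
```

With this representation the whole loop runs in O(n) time, matching the adjacency-list case on the slide.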
Our first complexity classes: P, TIME(t(n)), EXP

P is the class of problems solvable in polynomial time.

Formal version for decision problems: given A ⊆ {0,1}*, A ∈ P if there is a program Q and a constant c such that, for all x ∈ {0,1}*, Q on input x halts in at most |x|^c steps, and Q(x) = 1 if and only if x ∈ A.

For t: ℕ → ℕ, TIME(t(n)) is the class of problems solvable in time O(t(n)).

P = ∪_{c∈ℕ} TIME(n^c)
  poly(n) is shorthand for O(n^c) for some constant c, so P = TIME(poly(n)).
EXP = TIME(2^{poly(n)}) = ∪_{c∈ℕ} TIME(2^{n^c})

Since for every c, n^c = O(2ⁿ), this means P ⊆ EXP.
Tractability versus mathematical insight

Definition: P = class of decision problems solvable in polynomial time
  = set of predicates φ: {0,1}* → {0,1} such that there is a program Q and a constant c such that, for all strings x ∈ {0,1}* where n = |x|, Q(x) = φ(x) and Q halts on input x in at most n^c steps.

This is often paraphrased as "tractable decision problems".
  Not really: n^100 time is intractable.
  However, not being in P generally does imply intractable.

Real goal of studying P: mathematical insight.
  Problems in P have some special structure that allows algorithms to avoid an exponential-time brute-force search.
  Once this insight is made, others tend to follow, and so do efficient algorithms.
  We think some problems in NP don't have this structure (i.e., P ≠ NP).

"The difference between polynomial and exponential time is one of kind, not of degree." — textbook

2.5: Tractability and Mathematical Insight
Chapter 3: Insights and Algorithms

We won't go through this in lecture; it reviews the big ideas from ECS 122a. I will assume everyone is familiar with the techniques and terminology from Chapter 3, especially the following:
  recursion (3.1)
  divide-and-conquer (3.2)
  dynamic programming (3.3)
  graph reachability; breadth-first and depth-first search (3.4)
  graph algorithms: shortest path (3.4), minimum spanning tree (3.5), min-cut (3.7)
  reductions (3.8)