Big O, Ω, and Θ

Big-O gives us only a one-way comparison: if f is O(g), then from some point on g is at least as big as f (up to a constant), but in fact f could be very small in comparison. Example: 3n is O(2^{2n}). We want to sandwich a function in terms of another, more familiar one. The first step is to provide lower bounds.

Definition. Let f, g : N → R^+. We say that f is Ω(g) if there are constants K and N such that for all n ≥ N, f(n) ≥ Kg(n).

Example: n^2 is Ω(2n + 6). This is because for all n ≥ 4, n^2 > 2n + 6, so we can take N = 4 and K = 1.

Example: We know that n log n ≤ 2 log(n!). This says log(n!) ≥ (n log n)/2, so log(n!) is Ω(n log n) (take K = 1/2).

In the last example, log(n!) is Ω(n log n), and log(n!) is also O(n log n). This is the situation where we have the function log(n!) sandwiched in terms of the (slightly) more familiar function n log n.

Definition. Let f, g : N → R^+. We say that f is Θ(g) if f is both O(g) and Ω(g).
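As a sanity check on these definitions, a short script (a sketch of mine, not part of the slides) can verify the claimed witnesses numerically over a finite range; of course a finite check is evidence, not a proof:

```python
import math

# Witness (K = 1, N = 4) for the claim that n^2 is Omega(2n + 6):
assert all(n * n >= 1 * (2 * n + 6) for n in range(4, 1000))

# The inequality n log n <= 2 log(n!), which witnesses
# log(n!) being Omega(n log n) with K = 1/2:
for n in range(2, 500):
    assert n * math.log(n) <= 2 * math.log(math.factorial(n))
```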
Relating O and Ω

Theorem. f is O(g) if and only if g is Ω(f).

Proof: Suppose f is O(g). Then there are constants K and N such that for all n ≥ N, f(n) ≤ Kg(n). Thus, for all n ≥ N, Kg(n) ≥ f(n); that is, g(n) ≥ (1/K)f(n). So, using 1/K as the multiplicative constant, g is Ω(f). Conversely, if g is Ω(f), there are constants K and N such that for all n ≥ N, g(n) ≥ Kf(n). As above, this says f(n) ≤ (1/K)g(n), and so f is O(g).

This theorem says something about symmetric relations. What?

We can make O, Ω, and Θ into relations on the very large space of all functions from N to R^+. Define f O g if f is O(g). Similarly define f Ω g and f Θ g.
Properties of the relations O, Ω, and Θ

Theorem. The relation O is reflexive and transitive.

Proof: Because f(n) ≤ f(n) for any f, we have f O f (use N = 0 and K = 1). For transitivity, suppose f O g and g O h. We have to show f O h. The hypothesis gives us K and N such that for all n ≥ N, f(n) ≤ Kg(n); further, it gives us L and M such that for all n ≥ M, g(n) ≤ Lh(n). Suppose n ≥ max{N, M}. Then g(n) ≤ Lh(n), and so Kg(n) ≤ KLh(n). Since also f(n) ≤ Kg(n), we have f(n) ≤ KLh(n). Therefore f O h, using the constants max{N, M} and KL.

In the same way you can show that Ω and Θ are reflexive and transitive. O and Ω are not symmetric: n is O(n^2), but n^2 is not O(n). But Θ is symmetric! Therefore Θ is an equivalence relation on the huge function space. The order equivalence class of f is the set of all functions g such that f Θ g. It's nice if we can classify a complicated function into the order equivalence class of a simple one; for example, log(n!) is order-equivalent to n log n.
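The order-equivalence of log(n!) and n log n can be watched numerically: the ratio log(n!)/(n log n) stays pinned between the Ω-constant 1/2 and the O-constant 1. A quick illustration (not a proof); `math.lgamma(n + 1)` serves as log(n!):

```python
import math

# log(n!) / (n log n) stays bounded between 1/2 and 1,
# consistent with log(n!) being Theta(n log n).
for n in (10, 100, 1000, 10000):
    ratio = math.lgamma(n + 1) / (n * math.log(n))
    assert 0.5 <= ratio <= 1.0
```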
Analyzing algorithms

We are going to use all of this notation to help us classify algorithms by the amount of time they take to do certain jobs. Consider, for example, sorting. We agree on a measure of the size of the input, and then we look at a particular algorithm to see, in terms of that measure, how many steps it takes to sort the input. For sorting, we measure size as the number of items in the array to be sorted (i.e., the length of the array). To facilitate comparing different algorithms, we assume that they work by comparing items in different array positions, and we count the number of these comparisons as a function of n. More precisely, given an algorithm, we define a function f(n) to be the maximum number of comparisons the algorithm makes over all runs on arrays of length n. This function is called the worst-case complexity of the algorithm. The task is to classify this function using O, or, if we can, Θ, with a simple reference function. We'll look at the bubble sort to get started, but we'll also look (briefly) at another, faster sorting algorithm.
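For tiny n, the definition of f(n) as a maximum over all inputs can be measured directly by brute force over all permutations. The harness below is hypothetical (it instruments Python's built-in sort, not any algorithm from these notes), but it shows the "max over all length-n inputs" idea exactly:

```python
from functools import cmp_to_key
from itertools import permutations

def comparisons_used(arr):
    """Run a comparison sort on arr and count the comparisons it makes."""
    count = 0
    def cmp(x, y):
        nonlocal count
        count += 1
        return (x > y) - (x < y)
    sorted(arr, key=cmp_to_key(cmp))
    return count

def worst_case(n):
    """f(n): the maximum number of comparisons over all arrays of length n."""
    return max(comparisons_used(list(p)) for p in permutations(range(n)))
```

Since any correct comparison sort must make at least ceil(log2(n!)) comparisons on some input, worst_case(4) is guaranteed to be at least 5.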
Bubble sorting again

The number n is the size of the array a[ ]. The algorithm sets i to n and bubbles the largest element down to position i, using n − 1 comparisons. It then sets i to n − 1 and bubbles the next largest element down, using n − 2 comparisons. This is all repeated until i becomes 1. The total number of comparisons is

(n − 1) + (n − 2) + ... + 1 = Σ_{i=1}^{n−1} i = (n − 1)n/2 = n^2/2 − n/2.

Notice that the bubble sort algorithm always takes this number of comparisons, on any array of size n. Therefore its complexity is Θ(n^2).
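Here is that count verified in code: a sketch of the bubble sort above, instrumented with a comparison counter (the names are mine). The count comes out to (n − 1)n/2 on every input, sorted or not:

```python
import random

def bubble_sort_count(a):
    """Bubble sort a copy of a; return (sorted list, comparison count)."""
    a = list(a)
    count = 0
    for i in range(len(a) - 1, 0, -1):   # largest of a[0..i] settles at a[i]
        for j in range(i):
            count += 1
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return a, count

for n in (1, 2, 5, 10):
    arr = random.sample(range(100), n)
    s, c = bubble_sort_count(arr)
    assert s == sorted(arr)
    assert c == n * (n - 1) // 2         # always (n-1)n/2, on any input
```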
Merge sort

This algorithm sorts an array by dividing it (approximately) into 2 halves, then recursively sorting each half. Once the two halves are sorted, they are merged into one array. The time taken by this satisfies, roughly, f(n) = 2f(n/2) + n. That is, the time for n items is twice the time for n/2 items, plus n steps to merge the arrays back together. This equation can be used to show that f(n) is O(n log n).
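A sketch of merge sort with a comparison counter makes the recurrence visible: two recursive calls on the halves, then a merge that makes at most n − 1 comparisons (the instrumentation is mine, not from the notes):

```python
def merge_sort(a):
    """Return (sorted copy of a, number of comparisons made)."""
    if len(a) <= 1:
        return list(a), 0
    mid = len(a) // 2
    left, cl = merge_sort(a[:mid])            # recursively sort each half...
    right, cr = merge_sort(a[mid:])
    merged, count, i, j = [], cl + cr, 0, 0
    while i < len(left) and j < len(right):   # ...then merge the two halves
        count += 1
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])                   # one side is exhausted; copy the rest
    merged.extend(right[j:])
    return merged, count
```

For instance, merge_sort([5, 3, 8, 1, 9, 2, 7, 4]) sorts 8 items with at most n log2 n = 24 comparisons, already under the 28 that bubble sort always uses on 8 items.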
Euclid's GCD algorithm

function gcd(m : N+; n : N) : N;
% (gcd(m, 0) = m)
{ a := m; b := n;
  while b != 0 do          % invariant: gcd(a, b) = gcd(m, n)
    { r := a mod b; a := b; b := r; }
  gcd := a }

Example: gcd(91, 287). The trace of the variables is:

a:  91  287   91   14    7
b: 287   91   14    7    0
r:   ?   91   14    7    0

The gcd is a = 7.
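The same algorithm in Python, recording (a, b) at the top of each trip through the loop, reproduces the table above (the function name is mine):

```python
def gcd_trace(m, n):
    """Euclid's algorithm; return the gcd and the successive (a, b) pairs."""
    a, b = m, n
    trace = [(a, b)]
    while b != 0:
        a, b = b, a % b        # r := a mod b; a := b; b := r
        trace.append((a, b))
    return a, trace
```

gcd_trace(91, 287) returns 7 together with the pairs (91, 287), (287, 91), (91, 14), (14, 7), (7, 0), matching the columns of the table.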
An O-estimate for Euclid

Theorem (Lamé). For any k ≥ 1, if Euclid's algorithm takes k or more trips through the loop to compute gcd(m, n), where m ≥ n, then n ≥ f_{k+1} (the (k+1)st Fibonacci number).

Remark. This shows that the Fibonacci numbers are the worst case for Euclid's algorithm. To see why, look first at the contrapositive version of Lamé.

Theorem. If n < f_{k+1}, then Euclid's algorithm takes at most k − 1 trips through the loop.

(Picture: the Fibonacci numbers 2, 3, 5, 8 marked on a number line, with the trip counts 1, 2, 3, 4 marked above them.)

You can show from this that it takes at most log_α n + 1 trips to compute gcd(m, n), where α = (1 + √5)/2. Since log_α n = log_10 n · log_α 10, this gives us an O(log_10 n) algorithm for the gcd; that is proportional to the number of decimal digits in n. Compare this with the time it takes to factor two 800-digit numbers. Naively, anyway, this could take on the order of 10^800 steps on the way to finding the gcd of the two numbers by the sixth-grade method. Euclid can do it in microseconds.
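The trip bound can be checked empirically; this is a finite sanity check of the log_α n + 1 estimate, not a proof (ALPHA and the helper names are mine):

```python
import math
import random

ALPHA = (1 + math.sqrt(5)) / 2   # the golden ratio

def trips(m, n):
    """Count the trips Euclid's loop makes computing gcd(m, n)."""
    count = 0
    while n != 0:
        m, n = n, m % n
        count += 1
    return count

# Check trips(m, n) <= log_alpha(n) + 1 for random m >= n >= 2,
# and that consecutive Fibonacci numbers come close to attaining it.
random.seed(0)
for _ in range(1000):
    n = random.randint(2, 10**6)
    m = random.randint(n, 10**9)
    assert trips(m, n) <= math.log(n, ALPHA) + 1
assert trips(13, 8) == 5         # f_7 = 13, f_6 = 8: the worst case for n = 8
```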
What's a really hard problem?

To multiply n × n matrices, any algorithm must take Ω(n^2) steps, because it has to read in the matrices. No better lower bound for this is known. There is, however, an O(n^{log_2 7}) ≈ O(n^{2.81}) recursive algorithm.

Can we find a problem that is so hard that any algorithm will require Ω(2^n) steps? The answer is yes, but the problems tend to be in specific areas, so it would take me a while to tell you about one. Instead, I'll tell you about some problems that everybody believes have some kind of exponential lower bound, but nobody can prove it. The first of these is factoring an n-digit number. The second is to tell whether a propositional expression in n variables is satisfiable (that is, whether some setting of the variables makes it true).
Understanding exponential versus polynomial complexity

First of all, we want to compare 2^n with all functions n^p, where p is a fixed constant. For any p, 2^n is not O(n^p). In fact, you can show, using calculus, that lim_{n→∞} n^p / 2^n = 0. That means that you can make n^p / 2^n as small as you want just by taking n big enough. Exponential functions like 2^n are therefore not O of any polynomial function.

If you have a problem that requires any algorithm to take Ω(2^n) steps, then there is in fact no feasible algorithm to solve the problem. In fact, if you have a problem for which every algorithm is Ω(n^p) for every fixed p, the problem is just as infeasible. People believe that the latter is the case for the satisfiability problem, but nobody has been able to prove it, though they have been trying for 30 years. It's one of the major outstanding unsolved problems of mathematics, and it goes under the title P ≠ NP.
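The limit can be watched numerically; even a large fixed exponent like p = 10 is eventually crushed by 2^n (a quick illustration, not the calculus proof):

```python
# n^10 / 2^n for growing n: huge at first, then vanishingly small.
ratios = [n**10 / 2**n for n in (10, 50, 100, 200)]
assert all(a > b for a, b in zip(ratios, ratios[1:]))  # decreasing on this range
assert ratios[0] > 10**6                               # about 9.8 million at n = 10
assert ratios[-1] < 1e-30                              # essentially zero by n = 200
```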
The Satisfiability Problem

An input for this problem is a propositional formula P, and the problem is to tell whether or not the formula is satisfiable. We measure the size of the formula by the number of variables in it. People have experimented with formulas involving 1000 and more variables.

Truth tables are not the way to go for this problem: you can't construct a truth table with 2^1000 rows. Yet nobody has found an O(n^p) algorithm for the problem. All algorithms so far either have some sort of exponential lower bound or have an unknown, not polynomial, complexity. Quite a few algorithms work well on a whole lot of instances of the problem, though.

The satisfiability problem is known to be an NP-complete problem. What this means is that if we had an O(n^k) algorithm for this problem, then a whole lot of other difficult problems, in the class NP, would also have such algorithms to solve them. Nobody believes this to actually be true. However, mathematicians and computer scientists are still stumped. You yourself might die without knowing the answer. Or somebody may announce the answer next week!
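The truth-table method dismissed above is easy to write down, which makes its 2^n behavior plain (a sketch; representing formulas as Python functions is my own choice, not anything from the notes):

```python
from itertools import product

def satisfiable(formula, n):
    """Brute force: try all 2^n truth assignments to n variables."""
    return any(formula(v) for v in product((False, True), repeat=n))

# (x1 or x2) and (not x1 or x3) and (not x2 or not x3): satisfiable.
f = lambda v: (v[0] or v[1]) and (not v[0] or v[2]) and (not v[1] or not v[2])
```

satisfiable(f, 3) finds a satisfying assignment after at most 2^3 = 8 tries; for 1000 variables the same loop would face up to 2^1000 assignments, which is hopeless.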
Are there problems that are super-exponential?

Yes; there are problems for which all algorithms are Ω(2^{2^n}), for example. But the situation is even more dismal: there are problems for which no algorithm exists whatsoever! These are called undecidable problems. An example of such a problem is the satisfiability problem in predicate calculus. Given a sentence (no free variables) of predicate calculus, is that sentence true in some universe of discourse? Even in the universe of natural numbers, there is no algorithm that will tell whether a sentence about them is true. No algorithm at all will correctly answer all questions of this type.

This means that there are still jobs for mathematicians. They cannot be replaced by computers.
How do you prove that there isn't any algorithm to solve a problem?

The basic method is proof by contradiction. The method also assumes that algorithms can all be coded as programs in a single programming language. It then uses the fact that all programs are just strings over an alphabet. It is assumed that (arbitrary-length) strings are a datatype in the programming language. We then consider a specific property P of programs. The set of strings representing programs with that property, as it turns out, cannot be recognized by any one program in the language. Since each algorithm is representable by one program in the language, this means that no algorithm can solve the question: does a given program have property P?

What's the magic property P? A program x has the property if x goes into an infinite loop when fed the string x (its own encoding) as input. The analogy is: you go into an infinite tailspin when you start examining your own DNA, which is, after all, supposed to predict everything about you, including whether you would go into such a tailspin in the first place. You can get a contradiction if you assume there is a program that will recognize property P.

Other properties can be shown to have no algorithm by proving that if they did, then there would also be an algorithm to decide property P of programs.
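The contradiction can be sketched in code. Assume, hypothetically, a function halts(prog, inp) that correctly decides the looping behavior of a program on an input; no such function can exist, and all the names below are illustrative only:

```python
def make_paradox(halts):
    """Given a claimed halting-decider halts(prog, inp), build the
    self-defeating program: it does the opposite of whatever the
    decider predicts it will do on its own text."""
    def paradox(src):
        if halts(src, src):
            while True:        # decider said "halts" -> loop forever
                pass
        return "halted"        # decider said "loops" -> halt at once
    return paradox

# Any concrete decider is immediately refuted on paradox's own text:
always_says_loops = lambda prog, inp: False
p = make_paradox(always_says_loops)
assert p("paradox's own text") == "halted"   # it halted, so the decider was wrong
```

The decider that answers "halts" fares no better: the program then loops forever, again contradicting the prediction. That symmetric failure is the whole proof.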