MATH 22 FUNCTIONS: ORDER OF GROWTH. Lecture O: 10/21/2003. The old order changeth, yielding place to new. Tennyson, Idylls of the King


MATH 22, Lecture O: 10/21/2003
FUNCTIONS: ORDER OF GROWTH
The old order changeth, yielding place to new. Tennyson, Idylls of the King
Men are but children of a larger growth. Dryden, All for Love, Act 4, scene 1
Copyright 2003 Larry Denenberg

Administrivia
http://denenberg.com/lectureo.pdf
Exam: Monday 10/27, Robinson 253. Anyone know when?
Suggestions for review questions still accepted.
Comment on the function sin^2 θ.
Project 4: Grader pounds me, I pound you: this is your last warning! You can't manipulate non-equations. "If p, then q" is always true when p is false.
Today: Growth rates of functions, preceded by awesome motivational discussion about problems and their algorithms.

Problems & Algorithms Today we study growth rates of functions. Why? A problem is something that we solve with an algorithm. A problem consists of instances; an algorithm for the problem accepts an instance as input. Examples: SORTING is the problem of putting numbers in order. An instance of SORTING is a sequence of numbers. An algorithm that solves SORTING takes such a sequence as input and outputs the sorted sequence. FACTORING is the problem of finding prime factors. An instance of FACTORING is a single number. An algorithm that solves FACTORING takes a number N as input and outputs the factors of N. VALIDITY is the problem of determining whether a propositional formula, e.g. pq ( r) Æ q(r s), is valid, that is, true for any interpretation of its variables. An instance of VALIDITY is a propositional formula. An algorithm that solves VALIDITY takes such a formula as input, and outputs either yes or no. (The Twin Prime Conjecture is not a problem since it doesn t have instances. No individual question is a problem in this computer science sense.)

Problem Complexity
Some problems seem to be harder than others: SORTING seems easier than FACTORING, but harder than MAXIMUM (i.e., finding the largest number in a sequence). But how do FACTORING and VALIDITY compare? How do we make these ideas precise?
Answer: We define the difficulty of a problem H, which we call the computational complexity of H, as the difficulty of the best algorithm that solves H.
Pretty lousy definition, eh? It just raises (not begs!!) the question: What is the difficulty of an algorithm? And what is "best", that is, how do we compare algorithm difficulties? Today we take up the tools we need to answer these questions.
[Note: It's usually easy to measure the difficulty of an algorithm. But how do we know when we've got the best algorithm for a problem? We must prove that no better algorithm exists, and this is often very hard. There are many problems for which we're unsure of the best algorithm, including FACTORING and VALIDITY!]

Algorithm Complexity
The complexity of an algorithm is defined by the amount of resources it uses. Usually the resource we care about is time: difficult problems take longer to solve. How can we measure the time used by an algorithm? The time depends on the particular machine and the efficiency of the coding. Most critically, it depends on the particular instance of the problem! (It usually takes more time to sort 1000000 numbers than to sort 10000. But it might take less time if the 1000000 numbers happen to be in order already, as they could be.) Here's how we cope with these issues:
0. We don't measure time in seconds, but rather in "basic operations" or "program steps" that are approximately equivalent and generally computer-independent. We'll see more on this point next lecture.
1. Every instance of a problem gets assigned a size. For example, the size of an instance of SORTING might be the number of numbers to be sorted; for an instance of FACTORING it might be the number of digits in the number to be factored. Having assigned sizes...
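
For concreteness, here is a minimal Python sketch (mine, with invented helper names) of the size assignments described in point 1: the size of a SORTING instance is its length, and the size of a FACTORING instance is its digit count.

    def sorting_size(instance):
        # size of a SORTING instance = how many numbers are to be sorted
        return len(instance)

    def factoring_size(instance):
        # size of a FACTORING instance = number of digits in the number
        return len(str(instance))

    print(sorting_size([5, 3, 8, 1]))   # 4
    print(factoring_size(1234567))      # 7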

Complexity, continued
2. We measure the time an algorithm takes as a function of the size of the input. That is, an algorithm's complexity is not a number, but a function. If A is an algorithm, its complexity is a function f_A(n) that tells how much time A requires on instances (inputs) of size n.
3. But algorithms usually take varying amounts of time on different inputs, even of the same size. (Some 1000-digit numbers are easier to factor than others!) So we define the complexity function to measure worst-case performance. That is, the complexity of algorithm A is a function f_A such that f_A(n) is the greatest possible amount of time required for A to solve a problem instance of size n. (Sometimes we use average case instead.)
4. We don't worry about time used on small inputs, which take little time anyway; we can always handle any finite number of instances as special cases. So we compare complexities only "at infinity", that is, for large inputs, though we can concoct cases where this is absurd.
5. Constant factors don't matter: two algorithms whose times differ by a factor of 2 (say) are the same. Special-case hardware or new technology can always increase speeds by constant factors, so ignoring them makes us technology-independent. (This can also be absurd.)
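
As an informal illustration of points 0-3 (my own sketch, not part of the lecture), here is Python code that counts "basic operations" (comparisons) for a linear-search algorithm and takes the worst case over inputs of a given size:

    def linear_search_ops(seq, target):
        """Return the number of comparisons a linear search performs."""
        ops = 0
        for x in seq:
            ops += 1          # one comparison = one "basic operation"
            if x == target:
                break
        return ops

    # Worst case for size n: the target is absent, so all n comparisons happen.
    # This value plays the role of f_A(n) for this algorithm.
    def worst_case_ops(n):
        return linear_search_ops(list(range(n)), -1)

    print([worst_case_ops(n) for n in (10, 100, 1000)])   # [10, 100, 1000]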

The Story of O
With these ideas in mind we make the following definition: Let g be a function from positive numbers (e.g. Z+ or R+) to positive numbers. We define O(g) to be the set of all functions f for which the following condition holds: there exists some integer N and some positive constant c such that f(n) ≤ cg(n) for all n > N. It's our job today to understand this completely.
First: For f to be in O(g) it's not required that f(n) ≤ g(n), just that f(n) ≤ cg(n), where c is a constant that we can make as large as we like. For example, if g(n) = 2n and f(n) = 800n, then f is in O(g), since we can pick c = 500 to boost g up. (We'll write this as 800n ∈ O(2n).)
Second: It's also not necessary that f(n) ≤ cg(n) for all n. It's enough for cg to exceed f for those n larger than some specific N, which we can pick as large as we please. For example, n^20 is in O(2^n) even though n^20 is bigger for n ≤ 143. Once n ≥ 144 (and forever after), 2^n wins.
Loosely, you can think of O(g) as the set of all functions "less than or equal to" g in the sense of functional comparison that we said we'd use for algorithm complexity: consider only behavior at infinity and ignore constant factors.
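
The definition is just "exhibit a c and an N". As an informal sanity check (not a proof, since it only samples finitely many n), one might test a proposed witness pair numerically; this sketch and its function name are my own:

    def check_big_oh_witness(f, g, c, N, n_max=10_000):
        """Spot-check that f(n) <= c*g(n) for every sampled n with N < n <= n_max."""
        return all(f(n) <= c * g(n) for n in range(N + 1, n_max + 1))

    # 800n is in O(2n): the witness c = 500, N = 0 from the lecture works.
    print(check_big_oh_witness(lambda n: 800 * n, lambda n: 2 * n, c=500, N=0))   # True

    # n^20 is in O(2^n) with c = 1, N = 143.
    print(check_big_oh_witness(lambda n: n**20, lambda n: 2**n, c=1, N=143))      # True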

Examples
Example: As you showed in Project 4, 4^n ∈ O(n!) because for all n > 10 we have 4^n < n!. Here we've used the definition with N = 10 and c = 1.
Example: As we said, not only is 2n obviously in O(800n), but also 800n ∈ O(2n). Similarly, we have 5n^3 ∈ O(2n^3) and vice versa. In general, we can ignore multiplicative constants when deciding whether f ∈ O(g), because we can always boost g by picking a bigger c. That is, if f ∈ O(g) then k1·f ∈ O(k2·g) for any positive constants k1 and k2, no matter how big or small. (This shouldn't be surprising; we defined it that way!)
Example: n^2 ∈ O(n^3), and indeed n^a ∈ O(n^b) for any a ≤ b. Other examples: n ∈ O(n), n^4 ∈ O(n^4.1).
Example: n^k ∈ O(b^n) for any k and any b > 1. This is what we mean by saying that any exponential function eventually exceeds any polynomial. But it only happens "at infinity"; you must choose a very large N to have n^100000000000000 ≤ 1.000000000000001^n for all n > N.
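
To see how late "at infinity" can kick in, here is a tiny sketch (mine, not the handout's) that locates the crossover point for n^20 versus 2^n mentioned earlier:

    # Find the last n at which n^20 still beats 2^n, and the crossover point.
    last_win = max(n for n in range(1, 1000) if n**20 > 2**n)
    print(last_win, last_win + 1)   # 143 144  (from n = 144 on, 2^n wins forever)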

Negative Results As it happens, n 3 œ O(n 2 ). How can we prove this? Back to basic quantifier logic! f Œ O(g) means ($N)($c)("n) n N Æ f(n) cg(n) Negating this statement, as you surely know, yields ("N)("c)($n) n N Ÿ f(n) > cg(n) That is, f œ O(g) means that no matter how big N and c, there s some still-bigger n such that f(n) exceeds g(n) even when g(n) is boosted up by a factor of c. Let s prove that n 2 œ O(n). We must prove that for any N and c there s some n > N such that n 2 > cn. So given N and c, can we find such an n? Pick n > c and it s certainly true! Of course n must also exceed N. So the proof starts like this: given N and c, let n = max(n,c)+1. We can similarly prove that n a œ O(n b ) for any a > b. Remember: To prove f Œ O(g), you get to find N and c to make f(n) g(n) for all n > N. But to prove f œ O(g) you have to show how to find, for any N and c picked by someone else, at least one n > N that makes f(n) > cg(n). Knowing how to use these definitions will enhance your enjoyment of the upcoming exam!
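
Written out in full, the proof sketched above takes only two lines (my wording, in LaTeX):

    Claim: $n^2 \notin O(n)$.
    Proof: Given any $N$ and $c$, let $n = \max(N, c) + 1$. Then $n > N$, and since $n > c$ we have
    $$ n^2 = n \cdot n > c \cdot n, $$
    so no choice of $N$ and $c$ satisfies the definition. $\qed$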

General Results
For any function f, f ∈ O(f). [Homework problem.]
If f1 and f2 are both in O(g), then the function f1 + f2 is in O(g). Sketch of proof: Since f1 ∈ O(g) there exists an N1 and a c1... Then f2 ∈ O(g) gives us an N2 and a c2... Now pick N = max(N1, N2) and c = c1 + c2 to finish.
If f ∈ O(g) and g ∈ O(h), then f ∈ O(h). Said another way, if g ∈ O(h), then O(g) ⊆ O(h).
O(f) = O(g) if and only if both f ∈ O(g) and g ∈ O(f). [Another homework problem. Do we understand it?]
As we said, f ∈ O(g) loosely means "g is at least as big as f". But beware of being too simplistic; functions are more complicated than numbers and can wiggle around in tricky ways. Another homework problem gives you specific functions f and g and asks you to prove that f ∉ O(g) and g ∉ O(f)! A blackboard picture will give you the intuition behind this. Problem: Find two such functions f and g that are monotonically increasing.
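
Filling in the dots of the sum-rule sketch (my write-up, using the lecture's "for all n > N" convention):

    Claim: if $f_1, f_2 \in O(g)$ then $f_1 + f_2 \in O(g)$.
    Proof: Choose $N_1, c_1$ with $f_1(n) \le c_1 g(n)$ for all $n > N_1$, and $N_2, c_2$ with
    $f_2(n) \le c_2 g(n)$ for all $n > N_2$. Put $N = \max(N_1, N_2)$ and $c = c_1 + c_2$.
    For every $n > N$ both bounds hold, so
    $$ f_1(n) + f_2(n) \le c_1 g(n) + c_2 g(n) = c\,g(n). \qquad\qed $$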

A Useful Result
Suppose we have several functions f1, f2, f3, ..., fn, and one of them, say f1, is the biggest in the sense that f2, f3, ..., fn are all in O(f1). Then we know by a result on the preceding page that f1 + f2 + f3 + ... + fn is in O(f1). And certainly f1 ∈ O(f1 + f2 + f3 + ... + fn). Therefore, by another result on the preceding page,
O(f1) = O(f1 + f2 + f3 + ... + fn)   if each fi ∈ O(f1).
The bottom line here is that we can ignore all terms of a sum except the biggest. So, for example,
O(10n^4 + 50n^3 + 2n^2 + 8n + 10000) = O(n^4)
where we've also ignored the multiplicative constant 10. That is, O(any polynomial) = O(its largest power). Another example:
O(10n^4 + 500n^3.99999 + 2n^2.5 + 8n + 100n^(-1)) = O(n^4)
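
A quick numerical illustration (my own, not from the handout) of why the lower-order terms stop mattering: the ratio of the first example polynomial to n^4 settles down near the leading coefficient, so a single constant c works in the definition.

    def p(n):
        # the example polynomial from above
        return 10*n**4 + 50*n**3 + 2*n**2 + 8*n + 10000

    for n in (10, 100, 1000, 10000):
        print(n, p(n) / n**4)
    # ratios: 16.02..., 10.50..., 10.05..., 10.005... -> bounded, so p is in O(n^4)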

Some Mathematics
For any bases a and b (both greater than 1), O(log_a n) = O(log_b n). In particular, O(log n) = O(lg n) = O(ln n). This is true because changing the base of a logarithm is just multiplication by a constant factor. Learn this.
Reminder: n^k ∈ O(b^n) for any k and any b > 1; any exponential eventually grows faster than any polynomial. Taking logs, we find that log n ∈ O(n). Indeed, we have log^k n ∈ O(n^b) for any k no matter how big and any b > 0. This means, e.g., that log^100 n ∈ O(√n).
Although you can ignore overall multiplicative constants, you can't ignore them everywhere! For example, 2^(2n) is not in O(2^n); constant factors in the exponent are not ignorable. Similarly, O(2^n) ≠ O(3^n).
What does O(1) mean? By definition, a function f is in O(1) if there are N and c such that f(n) < c for all n > N. That is, the functions in O(1) are those bounded by a constant: the functions that never "get to infinity" at all! What would O(70) mean?
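
The change-of-base fact is worth writing out once (standard identity, my notation):

    $$ \log_a n = \frac{\log_b n}{\log_b a} = \frac{1}{\log_b a}\cdot \log_b n, $$
    and $1/\log_b a$ is a constant (it does not depend on $n$), so $\log_a n \in O(\log_b n)$ and, symmetrically, $\log_b n \in O(\log_a n)$.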

Other Sets
O(g) is the set of functions "no bigger than" g. The set of functions "at least as big as" g is written Ω(g); formally, f is in Ω(g) if there exist N and c > 0 such that f(n) ≥ cg(n) for all n > N. As homework, you're going to prove that f ∈ Ω(g) if and only if g ∈ O(f).
If we take the functions that are both no bigger than f and at least as big as f, we get the set of functions that have the same order of growth as f. This set is written Θ(f) and is defined as O(f) ∩ Ω(f). Knowing the exact order of growth of a problem is the typical goal.
We also have notation for functions that are "definitely smaller than" g: we say that f ∈ o(g) if for all c > 0 there exists N such that f(n) < cg(n) for all n > N. That is, no matter how small you pick c, eventually the ratio f(n)/g(n) is less than c; no constant factor can build f up to g. Analogously, ω(g) is the set of functions that are "definitely bigger than" g; you can guess the definition.
It's very instructive to study the distinction between f ∉ O(g) and f ∈ ω(g); these are not the same! The latter is (∀c)(∃N)(∀n); the former is (∀c)(∀N)(∃n). Again, this is because functions are complicated.
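
For positive functions, a convenient way to keep these five sets straight is through the ratio f(n)/g(n); this reformulation is mine, not the handout's, but it is equivalent to the definitions above:

    $$ f \in O(g) \iff \limsup_{n\to\infty}\frac{f(n)}{g(n)} < \infty, \qquad
       f \in \Omega(g) \iff \liminf_{n\to\infty}\frac{f(n)}{g(n)} > 0, $$
    $$ f \in o(g) \iff \lim_{n\to\infty}\frac{f(n)}{g(n)} = 0, \qquad
       f \in \omega(g) \iff \lim_{n\to\infty}\frac{f(n)}{g(n)} = \infty, \qquad
       f \in \Theta(g) \iff f \in O(g) \cap \Omega(g). $$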

Two G Warnings
Warning #1: Grimaldi uses the phrase "g dominates f" to mean f ∈ O(g). So he would say, for example, that any function f dominates itself, or that p(n) = 0.00001n dominates q(n) = 10000000n. Not everyone uses this terminology, myself in particular. By "g dominates f" I would mean f ∈ o(g), that is, f is definitely smaller than g in the limit. Functions p and q above have the same order of growth (namely linear, i.e., both are in Θ(n)), but neither dominates the other.
Warning #2: All functions in this lecture have been assumed to be from positive numbers to positive numbers. Grimaldi's presentation permits negative numbers as well, so his definition is full of absolute values and disclaimers about dividing by zero: he defines O(g) as the set of functions f for which there exist N and c such that |f(n)| ≤ c|g(n)| for all n > N where g(n) ≠ 0. This means that f can't exceed g in absolute value, i.e., it can't go "hugely negative" below g.

Final Fun Facts
There are all sorts of ways to slip orders of growth between each other. For example, if f(n) = n^4 log n, then f is in ω(n^4) but is in o(n^4.000000000000001). As another example, consider f(n) = n^(log n). This function is in Ω(n^k) for any k no matter how big, but is in o(b^n) for any b > 1. That is, it's strictly between the polynomials and the exponentials. Can you find a function that lies strictly between log^k n and n^b for huge k and tiny b?
Sometimes the notation O(f) is used to mean "some function in O(f)". For example, we might write g(n) = n^4 + O(n^3) to mean that the difference between g(n) and n^4 is some function that never gets bigger than n^3 (up to a constant factor, for large n). (Is it the same thing to write g(n) = n^4 + o(n^4)? Which is stronger?) Also, older books write f = O(g) to mean f ∈ O(g).
The facts concerning the problems we used as examples: MAXIMUM is Θ(n), SORTING is Θ(n log n). Nobody knows the complexity of FACTORING, but it was proven just a year ago that merely testing a number to see whether it is prime can be done in polynomial time, that is, in O(n^k) for some k. Whether VALIDITY is polynomial is the celebrated "P = NP?" question, still the most important unsolved problem in computer science.
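
Why n^(log n) sits strictly between the polynomials and the exponentials: a quick check (my own, using base-2 logarithms) comes from rewriting everything as a power of 2,

    $$ n^{\log n} = 2^{(\log n)^2}, \qquad n^k = 2^{k\log n}, \qquad b^n = 2^{n\log b} \quad (b > 1). $$

Since (log n)^2 eventually exceeds k·log n for every fixed k, and is itself eventually exceeded by n·log b for every fixed b > 1, the function n^(log n) is in Ω(n^k) for all k but in o(b^n) for all b > 1, exactly as claimed.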