Asymptotic Running Time of Algorithms


Asymptotic Complexity: Leading-Term Analysis

Comparing the searching and sorting algorithms so far: we counted the worst-case number of comparisons as a function of array size, then dropped lower-order terms, floors/ceilings, and constants to arrive at the asymptotic running time of the algorithm.

We will now generalize this approach to other programs:
Count the worst-case number of operations executed by the program as a function of input size.
Formalize the definition of big-O complexity to derive the asymptotic running time of the algorithm.

Formal Definition of Big-O Notation

Let f(n) and g(n) be functions. We say f(n) is of order g(n), written O(g(n)), if there is a constant c > 0 such that for all but a finite number of positive values of n:

    f(n) <= c * g(n)

In other words, sooner or later g(n) overtakes f(n) as n gets large.

Example: f(n) = n + 5, g(n) = n. Show that f(n) = O(g(n)). Choose c = 6: f(n) = n + 5 <= 6*n for all n > 0.

Example: f(n) = 17n, g(n) = 3n^2. Show that f(n) = O(g(n)). Choose c = 6: f(n) = 17n <= 6 * 3n^2 for all n > 0.

To prove that f(n) = O(g(n)), find an n0 and c such that f(n) <= c * g(n) for all n > n0. We will call the pair (c, n0) a witness pair for proving that f(n) = O(g(n)).
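To make the witness-pair idea concrete, here is a minimal Java sketch (my own illustration, not part of the original notes; the class and method names are made up) that spot-checks the witness pair (c = 6, n0 = 0) for f(n) = 17n and g(n) = 3n^2 over a finite range of n. A finite check is not a proof, but it is a useful sanity test before writing one.

    // Hypothetical helper (not from the notes): spot-check f(n) <= c*g(n)
    // for n0 < n <= limit, with f(n) = 17n and g(n) = 3n^2.
    public class WitnessPairCheck {
        static boolean holds(long c, long n0, long limit) {
            for (long n = n0 + 1; n <= limit; n++) {
                long f = 17 * n;
                long g = 3 * n * n;
                if (f > c * g) return false;   // witness pair fails at this n
            }
            return true;
        }

        public static void main(String[] args) {
            System.out.println(holds(6, 0, 1_000_000));   // expected: true
        }
    }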

For asymptotic complexity, the base of logarithms does not matter. Let us show that log_2(n) = O(log_b(n)) for any b > 1. We need to find a witness pair (c, n0) such that log_2(n) <= c * log_b(n) for all n > n0. Choose (c = log_2(b), n0 = 0). This works because

    c * log_b(n) = log_2(b) * log_b(n) = log_2(b^(log_b(n))) = log_2(n)

for all positive n.

Why Is Asymptotic Complexity So Important?

Asymptotic complexity gives an idea of how rapidly the space/time requirements grow as the problem size increases. Suppose we have a computing device that can execute 1000 complex operations per second. Here is the size of problem that can be solved in a second, a minute, and an hour by algorithms of different asymptotic complexity:

    Complexity    1 second    1 minute    1 hour
    n             1000        60,000      3,600,000
    n log n       140         4,893       200,000
    n^2           31          244         1,897
    3n^2          18          144         1,096
    n^3           10          39          153
    2^n           9           15          21

What Are the Basic Operations?

For searching and sorting algorithms, you can usually determine big-O complexity by counting comparisons. Why? You usually end up doing some fixed number of arithmetic and/or logical operations per comparison.

Why don't we count swaps for sorting? Think about selection sort: it does a maximum of n swaps, but it must do O(n^2) comparisons, and swaps and comparisons have similar cost.

Is counting comparisons always right? No. Consider, for example, selection sort on linked lists instead of arrays.

What Operations Should We Count?

We must do a detailed count of all executed operations: an estimate of the number of primitive operations that the RAM model (a basic microprocessor) must perform to run the program.

Basic operation: arithmetic/logical operations count as 1 operation, even though adds, multiplies, and divides don't (usually) cost the same.
Assignment: counts as 1 operation; the operation count of the right-hand side of the expression is determined separately.
Loop: number of operations per iteration * number of loop iterations.
Method invocation: number of operations executed in the invoked method; ignore the cost of building stack frames, etc.
Recursion: same as method invocation, but harder to reason about.
We also ignore garbage collection, the memory hierarchy (caches), etc.
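The table above can be reproduced mechanically. The sketch below (mine, not from the notes; it assumes the stated rate of 1000 operations per second) searches for the largest n whose operation count fits in a one-minute budget for a few of the listed growth rates.

    import java.util.function.LongUnaryOperator;

    // My illustration: largest n whose operation count fits a time budget
    // at 1000 operations per second (here, one minute = 60,000 operations).
    public class ProblemSizeTable {
        static long largestN(LongUnaryOperator ops, long budget) {
            long n = 1;
            while (ops.applyAsLong(n + 1) <= budget) n++;
            return n;
        }

        public static void main(String[] args) {
            long budget = 1000L * 60;
            System.out.println("n       : " + largestN(n -> n, budget));                  // 60,000
            System.out.println("n log n : " + largestN(n -> (long) (n * Math.log(n) / Math.log(2)), budget));
                                                           // ~4,895 here; the table above rounds to 4,893
            System.out.println("n^2     : " + largestN(n -> n * n, budget));              // 244
            System.out.println("n^3     : " + largestN(n -> n * n * n, budget));          // 39
        }
    }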

Example: Selection Sort

    public static void selectionSort(Comparable[] a) {      // array of size n
        for (int i = 0; i < a.length; i++) {                 // cost = c1, n times
            int minPos = i;                                   // cost = c2, n times
            for (int j = i+1; j < a.length; j++) {            // cost = c3, n*(n-1)/2 times
                if (a[j].compareTo(a[minPos]) < 0)            // cost = c4, n*(n-1)/2 times
                    minPos = j;                               // cost = c5, n*(n-1)/2 times
            }
            Comparable temp = a[i];                           // cost = c6, n times
            a[i] = a[minPos];                                 // cost = c7, n times
            a[minPos] = temp;                                 // cost = c8, n times
        }
    }

Total number of operations:
    (c1+c2+c6+c7+c8)*n + (c3+c4+c5)*n*(n-1)/2
  = (c1+c2+c6+c7+c8 - (c3+c4+c5)/2)*n + ((c3+c4+c5)/2)*n^2
  = O(n^2)

Example: Matrix Multiplication

    // A, B, C are n-by-n matrices; sum is an int declared elsewhere
    int n = A.length;                                         // cost = c0, 1 time
    for (int i = 0; i < n; i++) {                             // cost = c1, n times
        for (int j = 0; j < n; j++) {                         // cost = c2, n*n times
            sum = 0;                                          // cost = c3, n*n times
            for (int k = 0; k < n; k++)                       // cost = c4, n*n*n times
                sum = sum + A[i][k] * B[k][j];                // cost = c5, n*n*n times
            C[i][j] = sum;                                    // cost = c6, n*n times
        }
    }

Total number of operations:
    c0 + c1*n + (c2+c3+c6)*n*n + (c4+c5)*n*n*n = O(n^3)

Remarks

For asymptotic running time, we do not need to count the precise number of operations executed by each statement, provided that the number of operations is independent of the input size. Just use symbolic constants like c1, c2, ... instead.

Our estimate used a precise count for the number of times the j loop was executed in selection sort (n*(n-1)/2). We could have said it was executed n^2 times and still obtained the same big-O complexity.

Once you get the hang of this, you can quickly zero in on what is relevant for determining asymptotic complexity. For example, you can usually ignore everything that is not in the innermost loop. Why?

Main difficulty: estimating the running time of recursive programs.

Analysis of Merge-Sort

    public static Comparable[] mergeSort(Comparable[] A, int low, int high) {
        if (low < high - 1) {                                 // at least three elements: cost = c0, 1 time
            int mid = (low + high) / 2;                       // cost = c1, 1 time
            Comparable[] A1 = mergeSort(A, low, mid);         // cost = ??, 1 time
            Comparable[] A2 = mergeSort(A, mid+1, high);      // cost = ??, 1 time
            return merge(A1, A2);                             // cost = c2*n + c3
        }
        ...
    }

Recurrence equation:
    T(n) = (c0+c1) + 2T(n/2) + (c2*n + c3)    <-- recurrence
    T(1) = c4                                 <-- base case

How do we solve this recurrence equation?
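As a sanity check on the n*(n-1)/2 count, here is a small instrumented version of selection sort (my own sketch, using int arrays rather than Comparable for brevity) that counts comparisons; for an array of length n it reports exactly n*(n-1)/2.

    // My sketch (not course code): count the comparisons performed by
    // selection sort on an array of length n.
    public class CountComparisons {
        static long comparisons = 0;

        static void selectionSort(int[] a) {
            for (int i = 0; i < a.length; i++) {
                int minPos = i;
                for (int j = i + 1; j < a.length; j++) {
                    comparisons++;                      // one comparison per inner iteration
                    if (a[j] < a[minPos]) minPos = j;
                }
                int temp = a[i]; a[i] = a[minPos]; a[minPos] = temp;
            }
        }

        public static void main(String[] args) {
            int n = 1000;
            int[] a = new int[n];
            for (int i = 0; i < n; i++) a[i] = n - i;   // reverse-sorted input
            selectionSort(a);
            System.out.println(comparisons + " comparisons, n*(n-1)/2 = " + (long) n * (n - 1) / 2);  // both 499500
        }
    }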

Analysis of Merge-Sort, continued

Recurrence equation:
    T(n) = (c0+c1) + 2T(n/2) + (c2*n + c3)
    T(1) = c4

First, simplify by dropping lower-order terms. Simplified recurrence equation:
    T(n) = 2T(n/2) + n
    T(1) = 1

It can be shown that T(n) = O(n log(n)) is a solution to this recurrence.

What Do We Mean by "Solution"?

Recurrence: T(n) = 2T(n/2) + n
Solution:   T(n) = n log_2(n)

To prove it, substitute n log_2(n) for T(n) in the recurrence:

    T(n)       = 2T(n/2) + n
    n log_2(n) = 2(n/2) log_2(n/2) + n
    n log_2(n) = n log_2(n/2) + n
    n log_2(n) = n (log_2(n) - log_2(2)) + n
    n log_2(n) = n (log_2(n) - 1) + n
    n log_2(n) = n log_2(n) - n + n
    n log_2(n) = n log_2(n)

Solving Recurrences

Solving recurrences is like integration: no general techniques are known for solving them. For CS 211, we just expect you to remember a few common patterns. In CS 280, you learn a bag of tricks for solving recurrences that arise in practice.

Cheat Sheet for Common Recurrences

    Recurrence Relation         Closed Form          Example
    c(n) = b + c(n-1)           c(n) = O(n)          Linear search
    c(n) = b*n + c(n-1)         c(n) = O(n^2)        Quicksort
    c(n) = b + c(n/2)           c(n) = O(log(n))     Binary search
    c(n) = b*n + c(n/2)         c(n) = O(n)
    c(n) = b + kc(n/k)          c(n) = O(n)
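If you want to convince yourself of a cheat-sheet entry, you can simply evaluate the recurrence. The sketch below (my own, with b = 3 chosen arbitrarily and n restricted to powers of two) unrolls c(n) = b + c(n/2) and c(n) = b*n + c(n/2); the first grows like b*log_2(n) and the second approaches 2*b*n, matching the O(log(n)) and O(n) rows.

    // My sketch: evaluate two cheat-sheet recurrences for powers of two.
    public class RecurrenceCheck {
        static long halving(long n, long b) {             // c(n) = b + c(n/2), c(1) = b
            return (n <= 1) ? b : b + halving(n / 2, b);
        }

        static long linearPlusHalving(long n, long b) {   // c(n) = b*n + c(n/2), c(1) = b
            return (n <= 1) ? b : b * n + linearPlusHalving(n / 2, b);
        }

        public static void main(String[] args) {
            long b = 3;
            for (long n = 1; n <= (1 << 20); n *= 2) {
                System.out.println("n=" + n
                    + "  b+c(n/2): " + halving(n, b)                // grows like b*log_2(n), i.e. O(log n)
                    + "  b*n+c(n/2): " + linearPlusHalving(n, b));  // approaches 2*b*n, i.e. O(n)
            }
        }
    }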

Cheat Sheet for Common Recurrences, continued

    Recurrence Relation                      Closed Form           Example
    c(n) = b*n + 2c(n/2)                     c(n) = O(n log(n))    Mergesort
    c(n) = b*n + kc(n/k)                     c(n) = O(n log(n))
    c(2) = b, c(n) = c(n-1) + c(n-2) + d     c(n) = O(2^n)         Fibonacci

Don't just memorize these; try to understand each one. When in doubt, guess a solution and see if it works (just like with integration).

Analysis of Quicksort: Tricky!

    public static void quicksort(Comparable[] A, int l, int h) {
        if (l < h) {
            int p = partition(A, l+1, h, A[l]);
            // move pivot into its final resting place
            Comparable temp = A[p-1];
            A[p-1] = A[l];
            A[l] = temp;
            // make recursive calls
            quicksort(A, l, p-1);
            quicksort(A, p, h);
        }
    }

Incorrect attempt:
    c(1) = 1
    c(n) = n + 2c(n/2)

where the n term is for partitioning and the 2c(n/2) term is for sorting the two partitioned subarrays.
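Before reading on, it is worth seeing what this attempted recurrence actually predicts. The sketch below (mine, for powers of two) unrolls c(n) = n + 2c(n/2); it comes out to n*log_2(n) + n, which should look suspiciously optimistic for a worst-case bound.

    // My sketch: unroll the attempted recurrence c(n) = n + 2*c(n/2), c(1) = 1,
    // for powers of two, and compare it with n*log_2(n).
    public class BalancedQuicksortRecurrence {
        static long c(long n) {
            return (n <= 1) ? 1 : n + 2 * c(n / 2);
        }

        public static void main(String[] args) {
            for (long n = 2; n <= (1 << 16); n *= 2) {
                long nLogN = n * Long.numberOfTrailingZeros(n);   // exact n*log_2(n) for powers of two
                System.out.println("n=" + n + "  c(n)=" + c(n) + "  n*log2(n)=" + nLogN);  // c(n) = n*log_2(n) + n
            }
        }
    }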

What is wrong with this analysis?

Remember: big-O is worst-case complexity. What is the worst case for Quicksort? One of the partitioned subarrays is empty, and the other subarray has n-1 elements (the n-th element is the pivot)! So the actual worst-case recurrence relation is:

    c(1) = 1
    c(n) = n + 1 + c(n-1)

where the n + 1 term is for partitioning and the c(n-1) term is for sorting the partitioned subarrays. From the cheat sheet, c(n) = O(n^2).

On average (not worst case), quicksort runs in n log(n) time.

One approach to avoiding worst-case behavior: pick the pivot carefully so that it always partitions the array in half. There are many heuristics for doing this, but none of them guarantee that the worst case will not occur. If we pick the pivot as the first element, a sorted array is the worst case. One heuristic is to randomize the array order before sorting!

Not All Algorithms Are Created Equal

Programs for the same problem can vary enormously in asymptotic efficiency.

    fib(n) = fib(n-1) + fib(n-2);  fib(1) = 1;  fib(2) = 1

Here is a recursive program for fib(n):

    static int fib(int n) {
        if (n <= 2) return 1;
        else return fib(n-1) + fib(n-2);
    }
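A quick way to see the blow-up coming is to count the calls. Here is a small sketch (my own; the call counter is not part of the course code) that counts how many times the recursive fib is invoked; the count itself satisfies a Fibonacci-like recurrence and works out to about 2*fib(n).

    // My sketch: count how many times the recursive fib is invoked.
    public class FibCallCount {
        static long calls = 0;

        static int fib(int n) {
            calls++;
            if (n <= 2) return 1;
            else return fib(n - 1) + fib(n - 2);
        }

        public static void main(String[] args) {
            int n = 30;
            int result = fib(n);
            System.out.println("fib(" + n + ") = " + result + " took " + calls + " calls");
            // prints: fib(30) = 832040 took 1664079 calls, i.e. roughly 2*fib(n) calls
        }
    }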

The recursion tree for fib(5) shows the duplicated work: fib(5) calls fib(4) and fib(3); fib(4) calls fib(3) and fib(2); fib(3) calls fib(2) and fib(1); and so on, so the same subproblems are recomputed over and over. The number of operations satisfies:

    c(n) = c(n-1) + c(n-2) + 2
    c(2) = 1;  c(1) = 1

For this problem, the problem size is n. It can be shown that c(n) = O(2^n): the cost of computing fib(n) recursively is exponential in n.

Iterative Code for Fibonacci

    fib(n) = fib(n-1) + fib(n-2);  fib(1) = 1;  fib(2) = 1

    dad = 1; grandad = 1; current = 1;
    for (i = 3; i <= n; i++) {
        grandad = dad;
        dad = current;
        current = dad + grandad;
    }
    return current;

Number of times the loop is executed? Less than n. Number of computations per loop iteration? A fixed amount of work. ==> Complexity of the iterative algorithm = O(n). Much, much, much, much, much better than O(2^n)!

Summary

Asymptotic complexity is a measure of the space/time required by an algorithm. It is a measure of the algorithm, not the problem.

Searching an array:
    Linear search: O(n)
    Binary search: O(log(n))
Sorting an array:
    SelectionSort: O(n^2)
    MergeSort: O(n log(n))
    Quicksort: O(n^2); sorts in place; behaves more like O(n log(n)) in practice
Matrix operations:
    Matrix-vector product: O(n^2)
    Matrix-matrix multiplication: O(n^3)

Closing Remarks

You might think that as computers get faster, asymptotic complexity and the design of efficient algorithms become less important. NOT TRUE! As computers get bigger/faster/cheaper, the size of the data (n) gets larger.

Moore's Law: computers roughly double in speed every 3 years, so speed is O(2^(years/3)). If the problem size grows at least as fast as O(2^(years/3)), then it's a wash for O(n) algorithms. For things worse than O(n), such as O(n^2 log(n)), we are rapidly losing computational ground. We need more efficient algorithms now more than ever.
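One way to quantify "losing computational ground": ask how much larger a problem a machine twice as fast can handle under each complexity. The sketch below (my own illustration; the budget of one million operations is arbitrary) searches for the largest affordable n before and after doubling the budget: O(n) doubles, O(n^2) grows by only about a factor of 1.4, and O(2^n) gains a single unit of problem size.

    import java.util.function.LongUnaryOperator;

    // My illustration: largest affordable n before and after doubling the
    // operation budget (i.e., a machine twice as fast).
    public class FasterMachine {
        static long maxN(LongUnaryOperator cost, long budget) {
            long n = 1;
            while (cost.applyAsLong(n + 1) <= budget) n++;
            return n;
        }

        public static void main(String[] args) {
            long budget = 1_000_000;
            LongUnaryOperator linear = n -> n;
            LongUnaryOperator quadratic = n -> n * n;
            LongUnaryOperator exponential = n -> (n >= 62) ? Long.MAX_VALUE : (1L << n);

            System.out.println("O(n)  : " + maxN(linear, budget) + " -> " + maxN(linear, 2 * budget));           // 1,000,000 -> 2,000,000
            System.out.println("O(n^2): " + maxN(quadratic, budget) + " -> " + maxN(quadratic, 2 * budget));     // 1,000 -> 1,414
            System.out.println("O(2^n): " + maxN(exponential, budget) + " -> " + maxN(exponential, 2 * budget)); // 19 -> 20
        }
    }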

How Bad Is Exponential Growth?

Values of various functions as n grows (source: D. Harel, Algorithmics):

    n       5n       n log n    n^2          n^3          2^n                  n!                    n^n
    10      50       33         100          1,000        1,024                3.6 million           10 billion
    50      250      282        2,500        125,000      a 16-digit number    a 65-digit number     an 85-digit number
    100     500      665        10,000       1,000,000    a 31-digit number    a 161-digit number    a 201-digit number
    300     1,500    2,469      90,000       27 million   a 91-digit number    a 623-digit number    a 744-digit number
    1000    5,000    9,966      1,000,000    1 billion    a 302-digit number   unimaginably large    unimaginably large

How long would it take at 1 instruction per microsecond? (source: D. Harel, Algorithmics)

    n       n^2             n^5          2^n                               n^n
    10      1/10,000 sec    1/10 sec     1/1000 sec                        2.8 hr
    20      1/2,500 sec     3.2 sec      1 sec                             3.3 trillion years
    50      1/400 sec       5.2 min      35.7 yr                           a 70-digit number of centuries
    100     1/100 sec       2.8 hr       400 trillion centuries            a 185-digit number of centuries
    300     9/100 sec       28.1 days    a 75-digit number of centuries    a 728-digit number of centuries

For comparison: the number of protons in the known universe has about 126 digits, and the number of microseconds since the big bang has about 24 digits (the big bang was about 15 billion years ago, roughly 5*10^17 seconds).

Asymptotic complexity and efficient algorithms become more important as technology improves, because we can handle larger problems. The human genome has about 3.5 billion nucleotides (~1 Gb). At 1 base-pair instruction per microsecond:

    n^2      ->  388,445 years
    n log n  ->  30.824 hours
    n        ->  1 hour
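The genome figures follow directly from the assumed rate. Here is a tiny sketch (mine, assuming n = 3.5 billion and 1 base-pair operation per microsecond) that recomputes them.

    // My sketch: recompute the genome timings at 1 operation per microsecond.
    public class GenomeTiming {
        public static void main(String[] args) {
            double n = 3.5e9;                 // nucleotides in the human genome
            double opsPerSecond = 1e6;        // 1 base-pair operation per microsecond

            double linearSec = n / opsPerSecond;
            double nLogNSec  = n * (Math.log(n) / Math.log(2)) / opsPerSecond;
            double quadSec   = n * n / opsPerSecond;

            System.out.printf("n       : %.2f hours%n", linearSec / 3600);              // about 0.97 hours
            System.out.printf("n log n : %.2f hours%n", nLogNSec / 3600);               // about 30.8 hours
            System.out.printf("n^2     : %.0f years%n", quadSec / (3600.0 * 24 * 365)); // about 388,000 years
        }
    }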