Analysis of Algorithms [Reading: CLRS 2.2, 3] Laura Toma, csci2200, Bowdoin College

Why analysis? We want to predict how an algorithm will behave (e.g. its running time) on arbitrary inputs, and how it will compare to other algorithms. We want to understand its bottlenecks and the patterns behind its behaviour.

Theoretical analysis: we break the algorithm down into instructions and count the number of instructions. Note that the result depends crucially on what exactly we consider to be an instruction.

What is an instruction? An instruction should be primitive (in the sense that it corresponds to a machine instruction on a real computer; e.g. we cannot count sorting as one operation) and at the same time high-level enough (so that it is independent of the specific processor and platform). More formally, to define an instruction, we need to define a model of a generic computer. This is called the RAM model of computation.

Random-access machine (RAM) model of computation:
- Contains all the standard instructions available on any computer: load/store, arithmetic (e.g. +, -, *, /), logic (e.g. >), jumps.
- All instructions take the same amount of time.
- Instructions are executed sequentially, one at a time.

To analyse an algorithm, we implicitly assume the RAM model of computation (all instructions take the same amount of time) and we count the number of instructions. This is our estimate of running time.

The running time of an algorithm is the number of instructions it executes in the RAM model of computation.

The RAM model is not completely realistic, e.g.:
- main memory is not infinite (even though we often imagine it is when we program)
- not all memory accesses take the same time (cache, main memory, disk)
- not all arithmetic operations take the same time (e.g. multiplications are expensive)
- instruction pipelining, etc.

But the RAM model gives realistic results in most cases. An example of counting instructions in this model is sketched below.
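
A minimal sketch (an illustration, not part of the original notes; the function name array_sum and the exact tally are assumptions): counting RAM-model instructions for a loop that sums an array of n numbers. This is a numerical sanity check of the idea, not a formal analysis.

```python
def array_sum(A):
    total = 0          # 1 instruction (assignment)
    i = 0              # 1 instruction (assignment)
    while i < len(A):  # n + 1 loop tests
        total += A[i]  # n additions + n assignments
        i += 1         # n increments
    return total       # 1 instruction

# Roughly 4n + 4 instructions in total: the running time grows linearly in n,
# and the exact constants depend on what we decide to count as one instruction.
```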

The running time is expressed as a function of the input size. E.g., the running time of a sorting algorithm is expressed as a function of n, the number of elements in the input array.

It is not always possible to come up with a precise expression of the running time for a specific input. E.g., the running time of insertion sort on a specific input depends on that input; we can't nail it down without knowing what the specific input is. We are therefore interested in the smallest and the largest running time for an input of size n:

- Best-case running time: the shortest possible running time for any input of size n.
- Worst-case running time: the longest possible running time for any input of size n.

Sometimes we might also be interested in:

- Average-case running time: the average running time over a set of inputs of size n. Must be careful: we need to assume an input distribution.

Unless otherwise specified, we are interested in the worst-case running time. The worst-case running time gives us a guarantee. For some algorithms, the worst case occurs fairly often. The average case is often as bad as the worst case (but not always!).

Case study: Express the best-case and worst-case running times of bubble sort, selection sort and insertion sort, and give examples of inputs that trigger best-case and worst-case behaviour, respectively (see the insertion-sort sketch below).

Asymptotic analysis

Assume we've done a careful analysis, counting all instructions, and found that algorithm X's best case is 3n + 5 (linear) and its worst case is 2n^2 + 5n + 7 (quadratic). The effort to compute all terms and the constants in front of the terms is not worth it, for two reasons:

1. these constants are not accurate, because the RAM model is not completely realistic (not all operations cost the same);
2. for large input the running time is dominated by the higher-order term.

For these reasons, when analysing algorithms, we are not interested in all the constants and all the lower-order terms, but in the overall rate of growth of the running time. This is called asymptotic analysis (because it is done as n → ∞). The rate of growth is expressed using O-notation, Ω-notation and Θ-notation. You have probably seen these intuitively defined, but now we will define them more carefully.
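
For the case study above, here is a sketch of insertion sort (the standard formulation, assumed here since the notes do not include code) together with inputs that trigger its best-case and worst-case behaviour.

```python
def insertion_sort(A):
    for j in range(1, len(A)):
        key = A[j]
        i = j - 1
        # shift elements larger than key one position to the right
        while i >= 0 and A[i] > key:
            A[i + 1] = A[i]
            i -= 1
        A[i + 1] = key
    return A

best = list(range(1, 11))       # already sorted: the inner while loop never iterates,
                                # so the total work is roughly c*n   (linear best case)
worst = list(range(10, 0, -1))  # reverse sorted: the inner while loop runs j times,
                                # so the total work is roughly c*n^2 (quadratic worst case)
print(insertion_sort(best), insertion_sort(worst))
```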

Notation: Let f(n), g(n) be non-negative functions (think of them as representing the worst-case running times of two algorithms).

O-notation (pronounced: Big-O)

We say that f(n) is O(g(n)) if, for sufficiently large n, the function f(n) is bounded above by a constant multiple of g(n). Put differently, f is upper bounded by g in the asymptotic sense, i.e. as n goes to ∞. More precisely,

    f(n) is O(g(n)) if there exist constants c > 0, n0 > 0 such that f(n) ≤ c·g(n) for all n ≥ n0.

We think of O() as corresponding to ≤.

[Figure: f(n) lies below c·g(n) for all n ≥ n0.]

Examples:
- (1/3)n^2 + 3n is O(n^2), because (1/3)n^2 + 3n ≤ (1/3)n^2 + 3n^2 = (1/3 + 3)n^2; this holds for c = 1/3 + 3 ≈ 3.3 and n ≥ 1.
- f(n) = 100n^2, g(n) = n^2: f(n) is O(g(n)) and g(n) is O(f(n)).
- f(n) = n, g(n) = n^2: f(n) is O(g(n)).
- 20n^2 + 10n lg n + 5 is O(n^3).
- 2^100 is O(1) (O(1) denotes constant time).

Note that O(·) expresses an upper bound of f, but not necessarily a tight one: n is O(n), n is O(n^2), n is O(n^3), n is O(n^100). We can prove that an algorithm has running time O(n^3), and then analyse more carefully and manage to show that the running time is in fact O(n^2). The first bound is correct, but the second one is better. We want the tightest possible bound for the running time.
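
An illustrative sketch (not a proof, just a numerical sanity check added here) of the first example: the witness pair c = 1/3 + 3, n0 = 1 for the claim that f(n) = (1/3)n^2 + 3n is O(n^2).

```python
def f(n): return n * n / 3 + 3 * n   # (1/3)n^2 + 3n
def g(n): return n * n               # n^2

c, n0 = 1 / 3 + 3, 1
# sample many values of n >= n0; a finite sample cannot prove the asymptotic
# claim, but it is a quick check that the chosen witnesses behave as expected
assert all(f(n) <= c * g(n) for n in range(n0, 10**6, 997))
print("f(n) <= c*g(n) holds on all sampled n >= n0, as the definition requires")
```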

Ω-notation (Big-Omega)

We say that f(n) is Ω(g(n)) if, for sufficiently large n, the function f(n) is bounded below by a constant multiple of g(n). Put differently, f is lower bounded by g in the asymptotic sense, i.e. as n goes to ∞. More precisely,

    f(n) is Ω(g(n)) if there exist constants c > 0, n0 > 0 such that f(n) ≥ c·g(n) for all n ≥ n0.

We think of Ω() as corresponding to ≥.

[Figure: f(n) lies above c·g(n) for all n ≥ n0.]

Examples:
- (1/3)n^2 + 3n is Ω(n^2), because (1/3)n^2 + 3n ≥ c·n^2; true for c = 1/3 and n > 0.
- f(n) = an^2 + bn + c (with a > 0), g(n) = n^2: f(n) is Ω(g(n)) and g(n) is Ω(f(n)).
- f(n) = 100n^2, g(n) = n^2: f(n) is Ω(g(n)) and g(n) is Ω(f(n)).
- f(n) = n, g(n) = n^2: g(n) is Ω(f(n)).

Note that Ω(·) gives a lower bound of f, but not necessarily a tight one: n is Ω(n), n^2 is Ω(n), n^3 is Ω(n), n^100 is Ω(n).

Θ-notation (Big-Theta)

Θ() is used to express asymptotically tight bounds. We say that g(n) is a tight bound for f(n), or that f(n) is Θ(g(n)), if f(n) is both O(g(n)) and Ω(g(n)). That is, f(n) grows like g(n) to within a constant factor.

    f(n) is Θ(g(n)) if f(n) is O(g(n)) and f(n) is Ω(g(n)).

We think of Θ() as corresponding to =.
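
A matching sketch (again just a numerical sanity check, not part of the original notes) of the Ω witness c = 1/3, n0 = 1 for f(n) = (1/3)n^2 + 3n and g(n) = n^2. Together with the O(n^2) witness checked earlier, this is exactly what the definition of Θ asks for, so f(n) is Θ(n^2).

```python
def f(n): return n * n / 3 + 3 * n   # (1/3)n^2 + 3n
def g(n): return n * n               # n^2

c, n0 = 1 / 3, 1
# lower-bound check on a sample of n >= n0 (a sample, not a proof)
assert all(f(n) >= c * g(n) for n in range(n0, 10**6, 997))
print("f(n) >= (1/3)*g(n) on all sampled n: f is Ω(n^2), hence Θ(n^2)")
```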

[Figure: c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0.]

Examples:
- 6n log n + √n log^2 n is Θ(n log n)
- n is not Θ(n^2)
- 6n lg n + √n lg n = Θ(n lg n)

When analysing the worst-case or best-case running time of an algorithm we aim for asymptotically tight bounds, because they characterize the running time of an algorithm precisely, up to constant factors.

Rate of growth and limits

Another way to think of rate of growth is to calculate the limit of the ratio of the functions:

    lim_{n→∞} f(n)/g(n)

Using the definitions of O, Ω, Θ it can be shown that:
- if lim_{n→∞} f(n)/g(n) = 0: then intuitively f < g, so f = O(g) and f ≠ Θ(g).
- if lim_{n→∞} f(n)/g(n) = ∞: then intuitively f > g, so f = Ω(g) and f ≠ Θ(g).
- if lim_{n→∞} f(n)/g(n) = c, with c > 0: then intuitively f ≈ c·g, so f = Θ(g).

Thus comparing f and g in terms of their growth can be summarized as:
- f is below g:                 f is O(g)   ("f ≤ g")
- f is above g:                 f is Ω(g)   ("f ≥ g")
- f is both above and below g:  f is Θ(g)   ("f = g")
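
A small sketch applying the limit rule above to a few pairs of functions. It assumes Python with the sympy library available (sympy is not part of the notes, just a convenient way to compute limits symbolically).

```python
import sympy as sp

n = sp.symbols('n', positive=True)

# lim (2n^2 + 3n) / n^2 = 2 > 0   =>  2n^2 + 3n is Θ(n^2)
print(sp.limit((2*n**2 + 3*n) / n**2, n, sp.oo))   # prints 2

# lim n / n^2 = 0                 =>  n is O(n^2) but not Θ(n^2)
print(sp.limit(n / n**2, n, sp.oo))                # prints 0

# lim (n log n) / n = ∞           =>  n log n is Ω(n) but not Θ(n)
print(sp.limit(n*sp.log(n) / n, n, sp.oo))         # prints oo
```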

Growth rate of standard functions

A polynomial of degree d is defined as

    p(n) = Σ_{i=0}^{d} a_i n^i = Θ(n^d)

where a_0, a_1, ..., a_d are constants and a_d > 0.

Any polylog grows slower than any polynomial: log^a n = O(n^b) for any constants a > 0, b > 0.

Any polynomial grows slower than any exponential with base c > 1: n^b = O(c^n) for any constants b > 0, c > 1.

(A small numeric illustration of these orderings appears at the end of this section.)

Asymptotic analysis, continued

Upper and lower bounds are symmetric: if f is upper-bounded by g, then g is lower-bounded by f. Example: n is O(n^2) and n^2 is Ω(n).

The correct way to say it is that f(n) is in O(g(n)), or that f(n) is O(g(n)). Abusing notation, people often write f(n) = O(g(n)):

    3n^2 + 2n + 10 = O(n^2), n = O(n^2), n^2 = Ω(n), n log n = Ω(n), 2n^2 + 3n = Θ(n^2)

Example: insertion sort. Best case: Ω(n). Worst case: O(n^2). Therefore the running time is Ω(n) and O(n^2). We can actually say that the worst case is Θ(n^2), because there exists an input for which insertion sort takes Ω(n^2). The same holds for the best case. But we cannot say that the running time of insertion sort is Θ(n^2)!

Use of O-notation makes it much easier to analyze algorithms; we can easily prove the O(n^2) insertion-sort time bound by saying that both loops run in O(n) time.

We often use O-notation in equations and recurrences: e.g. 2n^2 + 3n + 1 = 2n^2 + O(n), meaning that 2n^2 + 3n + 1 = 2n^2 + f(n) where f(n) is some function in O(n). We use O(1) to denote constant time.

Not all functions are asymptotically comparable! There exist functions f, g such that f is not O(g) and f is not Ω(g) (and therefore f is not Θ(g)).
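
As promised above, a small numeric illustration (added here; the chosen sample values and the base 1.1 are arbitrary) of the growth-rate orderings for standard functions: polylog << polynomial, and polynomial << exponential even for a base barely above 1.

```python
import math

print(f"{'n':>8} {'lg^2 n':>8} {'n lg n':>12} {'n^2':>12} {'1.1^n':>12}")
for n in [10, 100, 1000]:
    lg = math.log2(n)
    # the polylog column stays tiny compared to the polynomial columns;
    # the exponential column eventually dwarfs n^2 despite its small base
    print(f"{n:>8} {lg**2:>8.0f} {n*lg:>12.0f} {n*n:>12} {1.1**n:>12.3g}")
```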

Algorithms matter!

Sort n = 10 million integers:

- On a 1 GHz computer (10^9 instructions per second) using a 2n^2 algorithm:

      2·(10^7)^2 instructions / 10^9 instructions per second = 200,000 seconds ≈ 55 hours.

- On a 100 MHz computer (10^8 instructions per second) using a 50 n lg n algorithm:

      50·10^7·lg(10^7) instructions / 10^8 instructions per second ≈ 5·7·lg 10 ≈ 5·7·3 ≈ 105 seconds (taking lg 10 ≈ 3).

Review of Log and Exp

The base-2 logarithm comes up all the time (from now on we will always mean log_2 n when we write log n or lg n). It is:
- the number of times we can divide n by 2 to get to 1 or less;
- the number of bits in the binary representation of n.

Note: log n << √n << n.

Properties:
- lg^k n = (lg n)^k
- lg lg n = lg(lg n)
- a^(log_b c) = c^(log_b a)
- a^(log_a b) = b
- log_a n = log_b n / log_b a
- lg(b^n) = n·lg b
- lg(xy) = lg x + lg y
- log_a b = 1 / log_b a
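
A quick back-of-the-envelope check (illustrative, not part of the original notes) of the two estimates above, under the same assumptions: n = 10^7 items, machine A does 10^9 instructions per second with a 2n^2 algorithm, machine B does 10^8 instructions per second with a 50 n lg n algorithm. Using the exact value lg(10^7) ≈ 23.3 gives about 116 seconds, consistent with the rougher ≈ 105 seconds above.

```python
import math

n = 10**7
time_A = 2 * n**2 / 10**9                  # slow quadratic algorithm on the fast machine
time_B = 50 * n * math.log2(n) / 10**8     # n lg n algorithm on the 10x slower machine
print(f"machine A (2n^2):      {time_A:,.0f} s  (~{time_A/3600:.0f} hours)")
print(f"machine B (50 n lg n): {time_B:,.0f} s")
```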