csci 210: Data Structures Program Analysis


Summary

Topics:
- commonly used functions
- analysis of algorithms: experimental
- asymptotic notation
- asymptotic analysis: big-O, big-Omega, big-Theta

READING: GT textbook, chapter 4

Analysis of Algorithms

Goal: analyze the cost of a computer program.
- cost = time: how much time does this algorithm require?
- cost = space: how much space (i.e., main memory) does this algorithm require?
- cost = space + time, etc.

Analysis of algorithms and data structures is the major force that drives the design of solutions: there are many solutions to a problem; pick the one that is the most efficient. How do we compare various algorithms? Analyze them.

The primary efficiency concern for an algorithm is time; all the concepts we discuss for time analysis apply to space analysis as well. The running time of an algorithm:
- increases with input size
- on inputs of the same size, can vary from input to input (e.g., linear search in an unordered array)
- depends on hardware (CPU speed, hard disk, caches, bus, etc.)
- depends on OS, language, compiler, etc.

Analysis of Algorithms

Everything else being equal, we'd like to compare algorithms, and we'd like to study the relationship between running time and input size.

How to measure the running time of an algorithm?
1. experimental studies
2. theoretical analysis

Experimental studies: implement the algorithm; choose various input sizes; for each input size, choose various inputs; run the algorithm and time it; compute the average; plot running time vs. input size.

Experimental Analysis

Limitations:
- need to implement the algorithm (in fact, all the algorithms we want to compare)
- need many experiments
- need the same platform

Advantages:
- finds the best algorithm in practice

We would like to analyze algorithms without having to implement them. Basically, we would like to be able to look at two algorithms' flowcharts and decide which one is better.

Theoretical Analysis

Model: assume all operations cost the same. The running time (efficiency) of an algorithm is the number of operations executed by the algorithm.

Does this reflect actual running time? Multiply the number of instructions by the processor speed: a 1 GHz processor executes 10^9 instructions/second. Is this accurate? Not quite: not all instructions take the same time, and there are various other effects. But it is a very good predictor of running time in most cases.

Notation: n = size of the input to the problem.

Running time: the number of operations/instructions on an input of size n, expressed as a function of n: f(n).

For an input of size n, the running time may be smaller on some inputs than on others:
- best-case running time: the smallest number of operations on an input of size n
- worst-case running time: the largest number of operations on an input of size n

For any n: best-case running time(n) <= running time(n) <= worst-case running time(n).

Ideally, we would want to compute the average-case running time, but it is hard to model.

Running times

Expressed as functions of n: f(n). The most common functions for running times are the following:
- constant time: f(n) = c
- logarithmic time: f(n) = lg n
- linear time: f(n) = n
- n-lg-n time: f(n) = n lg n
- quadratic time: f(n) = n^2
- cubic time: f(n) = n^3
- exponential time: f(n) = a^n

Constant time: f(n) = c

Meaning: for any n, f(n) is a constant c. Examples of elementary operations:
- arithmetic operations
- boolean operations
- assignment statements
- function calls
- access to an array element a[i]
- etc.

Logarithmic time

x = log_b n if and only if b^x = n. By definition, log_b 1 = 0.

In algorithm analysis, we compute ceil(log_b n), the ceiling (the smallest integer >= log_b n). ceil(log_b n) is the number of times you can divide n by b until you get a number <= 1. E.g., ceil(log_2 8) = 3 and ceil(log_2 10) = 4.

Notation: lg n = log_2 n.

Note: to compute lg n on your calculator, use the change-of-base rule: lg n = log_10 n / log_10 2.
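
As a quick illustration (a minimal sketch, not from the slides; the helper name ceilLog is mine), the "divide until you reach 1" view of the logarithm can be coded directly:

    // Counts how many times n can be divided by b (rounding up) until the
    // result reaches 1; this count equals ceil(log_b n). Assumes n >= 1, b >= 2.
    static int ceilLog(int n, int b) {
        int count = 0;
        while (n > 1) {
            n = (n + b - 1) / b; // integer division by b, rounded up
            count++;
        }
        return count;
    }

    // ceilLog(8, 2) returns 3; ceilLog(10, 2) returns 4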

Exercises: simplify or evaluate each of the following.
- lg 2n = ?
- lg (n/2) = ?
- lg n^3 = ?
- lg 2^n = ?
- log_4 n = ?
- 2^(lg n) = ?

Binary search: searching a sorted array.

    public static int binarySearch(int[] a, int key) {
        int left = 0;
        int right = a.length - 1;
        while (left <= right) {
            int mid = left + (right - left) / 2;
            if (key < a[mid]) right = mid - 1;
            else if (key > a[mid]) left = mid + 1;
            else return mid;
        }
        return -1; // key not found
    }

Running time: best case constant, worst case lg n. Why? The input size halves at every iteration of the loop.
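
A quick usage sketch for the method above (to be placed in a class and called from main; the -1 return for a missing key is an assumption, following the standard Java convention):

    int[] a = {2, 5, 7, 11, 13, 17};
    binarySearch(a, 11);  // returns 3, the index of 11
    binarySearch(a, 4);   // returns -1: not found after ~lg n halvings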

Linear running time: f(n) = n

Example: doing one pass through an array of n elements, e.g. finding the min/max/average in an array (a find-min sketch follows below), computing the sum of an array, or searching an unordered array (worst case).

    int sum = 0;
    for (int i = 0; i < a.length; i++)
        sum += a[i];
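
Another one-pass example in the same spirit (a sketch, not on the slide): finding the minimum of an unordered array.

    // One pass, n-1 comparisons: f(n) = n. Assumes a non-empty array.
    static int findMin(int[] a) {
        int min = a[0];
        for (int i = 1; i < a.length; i++)
            if (a[i] < min)
                min = a[i];
        return min;
    }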

n-lg-n running time: f(n) = n lg n

Grows faster than n (i.e., an n lg n algorithm is slower than a linear one) and grows slower than n^2.

Examples: performing n binary searches in an ordered array (see the sketch below); sorting.
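
A sketch of the first example (reusing the binarySearch method from the previous slide; the helper name countCommon is mine): n searches of lg n steps each give n lg n steps in total.

    // n binary searches in a sorted array: n x O(lg n) = O(n lg n).
    static int countCommon(int[] queries, int[] sorted) {
        int count = 0;
        for (int q : queries)
            if (binarySearch(sorted, q) != -1)
                count++;
        return count;
    }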

Quadratic time: f(n) = n^2

Appears in nested loops, e.g. enumerating all pairs of n elements:

    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            // do something

Selection sort (pseudocode; a runnable version follows below):

    for (i = 0; i < n; i++)
        minIndex = index of smallest element in a[i..n-1]
        swap a[i] with a[minIndex]

Running time: finding the index of the smallest element in a[i..j] takes j - i + 1 operations, so the total is n + (n-1) + (n-2) + (n-3) + ... + 3 + 2 + 1, which is order n^2.
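
The selection-sort pseudocode written out as runnable Java (a sketch; the variable names are mine):

    // Selection sort: for each i, scan a[i..n-1] for the smallest element
    // (n-i operations) and swap it into position i. Total work:
    // n + (n-1) + ... + 1, i.e. Theta(n^2).
    static void selectionSort(int[] a) {
        for (int i = 0; i < a.length - 1; i++) {
            int minIndex = i;
            for (int j = i + 1; j < a.length; j++)
                if (a[j] < a[minIndex])
                    minIndex = j;
            int tmp = a[i];
            a[i] = a[minIndex];
            a[minIndex] = tmp;
        }
    }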

Lemma (arithmetic sum): 1 + 2 + 3 + 4 + ... + (n-2) + (n-1) + n = n(n+1)/2.

Proof: write the sum once forwards and once backwards and add them term by term; each of the n column sums equals n+1, so twice the sum is n(n+1).

Cubic running time: f(n) = n^3

In general, a polynomial running time is f(n) = n^d, d > 0. Example: nested loops. Enumerate all triples of elements: imagine cities on a map; are there 3 cities, no two of which are joined by a road? Solution: enumerate all subsets of 3 cities. There are n-choose-3 different subsets, which is order n^3. (A sketch follows below.)

Exponential running time: f(n) = a^n, a > 1

Example: the running time of Tower of Hanoi. Moving n disks from A to B requires at least 2^n moves, which means it requires at least this much time.

Exponent rules: a^m * a^n = a^(m+n), (a^m)^n = a^(mn), a^m / a^n = a^(m-n).
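
A sketch of the triple-enumeration idea (the adjacency matrix road[i][j], meaning cities i and j are joined by a road, is a hypothetical input; the slide does not fix a representation):

    // Enumerates all n-choose-3 triples of cities with three nested loops,
    // which is order n^3, and checks whether some triple is pairwise unconnected.
    static boolean hasUnconnectedTriple(boolean[][] road) {
        int n = road.length;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)
                for (int k = j + 1; k < n; k++)
                    if (!road[i][j] && !road[i][k] && !road[j][k])
                        return true;
        return false;
    }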

Comparing Growth-Rates: 1 < lg n < n < n lg n < n^2 < n^3 < a^n

Asymptotic analysis

Focus on the growth rate of the running time as a function of n. That is, ignore the constant factors and the lower-order terms and focus on the big picture. Example: we'll say that 2n, 3n, 5n, 100n, 3n + 10, n + lg n are all linear. Why?
- constants are not accurate anyway (operations are not all equal)
- we capture the dominant part of the running time

Notations:
- Big-Oh: expresses upper bounds
- Big-Omega: expresses lower bounds
- Big-Theta: expresses tight bounds (upper and lower bounds)

Big-Oh

Definition: f(n) is O(g(n)) if there exist constants c > 0 and n0 >= 1 such that f(n) <= c g(n) for all n >= n0.

Intuition: Big-Oh represents an upper bound. When we say f is O(g), this means that f <= g asymptotically: g is an upper bound for f, and f stays below g as n goes to infinity.

Examples:
- 2n is O(n)
- 100n is O(n)
- 10n + 50 is O(n)
- 3n + lg n is O(n)
- lg n is O(log_10 n), and log_10 n is O(lg n)
- 5n^4 + 3n^3 + 2n^2 + 7n + 100 is O(n^4)
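
To see how the constants in the definition work (one valid choice, not the only one): 10n + 50 is O(n) because for n >= 50 we have 50 <= n, so 10n + 50 <= 10n + n = 11n; the definition holds with c = 11 and n0 = 50.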

Big-Oh

2n^2 + n lg n + n + 10 is O(n^2 + n lg n), is O(n^3), is O(n^4), and is O(n^2). 3n + 5 is O(n^10), is O(n^2), and is O(n + lg n).

Let's say you are 2 minutes away from the top of a mountain and you don't know that. You ask: how much further to the top? Answer 1: at most 3 hours (true, but not that helpful). Answer 2: just a few minutes. When finding an upper bound, find the best one possible.

Exercises: write Big-Oh upper bounds for each of the following.
- 10n - 2
- 5n^3 + 2n^2 + 10n + 100
- 5n^2 + 3n lg n + 2n + 5
- 20n^3 + 10n lg n + 5
- 3n lg n + 2
- 2^(n+2)
- 2n + 100 lg n

Big-Omega

Definition: f(n) is Omega(g(n)) if there exist constants c > 0 and n0 >= 1 such that f(n) >= c g(n) for all n >= n0.

Intuition: Big-Omega represents a lower bound. When we say f is Omega(g), this means that f >= g asymptotically: g is a lower bound for f, and f stays above g as n goes to infinity.

Examples:
- 3n lg n + 2n is Omega(n lg n)
- 2n + 3 is Omega(n)
- 4n^2 + 3n + 5 is Omega(n)
- 4n^2 + 3n + 5 is Omega(n^2)

Big-Theta

Definition: f(n) is Theta(g(n)) if f(n) is O(g(n)) and f(n) is Omega(g(n)), i.e. there are constants c1, c2 > 0 such that c1 g(n) <= f(n) <= c2 g(n) for all n >= n0.

Intuition: f and g grow at the same rate, up to constant factors; Theta captures the order of growth.

Examples:
- 3n + lg n + 10 is O(n) and Omega(n) ==> it is Theta(n)
- 2n^2 + n lg n + 5 is Theta(n^2)
- 3 lg n + 2 is Theta(lg n)
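
A worked example with explicit constants (one valid choice): for f(n) = 2n^2 + n lg n + 5, we have 2n^2 <= f(n) for all n >= 1, and since n lg n <= n^2 and 5 <= 5n^2 for n >= 1, also f(n) <= 2n^2 + n^2 + 5n^2 = 8n^2. So c1 = 2, c2 = 8, n0 = 1 witness that f(n) is Theta(n^2).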

Asymptotic Analysis

Find tight bounds for the best-case and worst-case running times:
- the running time is Omega(best-case running time)
- the running time is O(worst-case running time)

Example: binary search is Theta(1) in the best case and Theta(lg n) in the worst case; so binary search is Omega(1) and O(lg n).

Usually we are interested in the worst-case running time, specifically a Theta-bound for the worst-case running time. Examples:
- worst-case binary search is Theta(lg n)
- worst-case linear search is Theta(n)
- worst-case insertion sort is Theta(n^2)
- worst-case bubble sort is O(n^2)
- worst-case find-min in an array is Theta(n)

Note: it is correct to say worst-case binary search is O(lg n), but a Theta-bound is better.

Asymptotic Analysis

Suppose we have two algorithms for a problem:
- Algorithm A has a running time of O(n)
- Algorithm B has a running time of O(n^2)
Which is better?

Asymptotic Analysis

Suppose we have two algorithms for a problem:
- Algorithm A has a running time of O(n); assume its worst-case running time is Theta(n)
- Algorithm B has a running time of O(n^2); assume its worst-case running time is Theta(n^2)
Which is better?

Order the classes of functions: Theta(1) < Theta(lg n) < Theta(n) < Theta(n lg n) < Theta(n^2) < Theta(n^3) < Theta(2^n). So Theta(n) is better than Theta(n^2), etc.

We cannot distinguish between algorithms in the same class: two algorithms that are Theta(n) worst-case are equivalent theoretically; optimization of constants can be done at implementation time.

Growth-rate matters

Example: say n = 10^9 (1 billion elements) on a 10 MHz computer, so 1 instruction takes 10^-7 seconds.
- Binary search, Theta(lg n): lg(10^9) x 10^-7 sec = 30 x 10^-7 sec = 3 microseconds
- Sequential search, Theta(n): 10^9 x 10^-7 sec = 100 seconds
- Finding all pairs of elements, Theta(n^2): (10^9)^2 x 10^-7 sec = 10^11 seconds = 3170 years
- Imagine Theta(n^3). Imagine Theta(2^n).

Efficiency matters

    n       lg n   n      n lg n   n^2         n^3              2^n
    8       3      8      24       64          512              256
    16      4      16     64       256         4,096            65,536
    32      5      32     160      1,024       32,768           4,294,967,296
    64      6      64     384      4,096       262,144          1.8 x 10^19
    128     7      128    896      16,384      2,097,152        3.40 x 10^38
    256     8      256    2,048    65,536      16,777,216       1.15 x 10^77
    512     9      512    4,608    262,144     134,217,728      1.34 x 10^154
    1024    10     1024   10,240   1,048,576   1,073,741,824    1.80 x 10^308

Assume we have a 1 GHz computer. This means an instruction takes 1 nanosecond (10^-9 seconds). We have 3 algorithms:
A: 400n
B: 2n^2
C: 2^n

What is the maximum input size that can be solved with each algorithm in 1 second, 1 minute, 1 hour?

Running time (in nanoseconds):

            1 sec    1 min    1 hour
    400n
    2n^2
    2^n
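
As a worked example for one cell (my arithmetic; the slide leaves the table blank): in 1 second the computer executes 10^9 instructions, so for algorithm A the largest solvable n satisfies 400n <= 10^9, i.e. n = 2,500,000. Similarly, for B: 2n^2 <= 10^9 gives n ≈ 22,360, and for C: 2^n <= 10^9 gives n = 29 (since 2^30 ≈ 1.07 x 10^9).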

Problem

We have an array X containing a sequence of numbers. We want to compute another array A such that A[i] represents the average (X[0] + X[1] + ... + X[i]) / (i+1):
A[0] = X[0]
A[1] = (X[0] + X[1]) / 2
A[2] = (X[0] + X[1] + X[2]) / 3
...

The first i values of X are referred to as the i-prefix of X. X[0] + ... + X[i] is called a prefix sum, and A[i] a prefix average.

Application in economics: imagine that X[i] represents the return of a mutual fund in year i; then A[i] represents the average return over i years.

Write a function double[] computeprefixaverage(double[] X) that creates, computes, and returns the prefix averages. Analyze your algorithm (worst-case running time).
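
One possible solution sketch (the signature is from the problem statement; using a running sum is my design choice, since recomputing each prefix sum from scratch would cost 1 + 2 + ... + n, i.e. Theta(n^2)):

    // Prefix averages with a running sum: one pass over X, so the
    // worst-case running time is Theta(n).
    public static double[] computeprefixaverage(double[] X) {
        double[] A = new double[X.length];
        double sum = 0;
        for (int i = 0; i < X.length; i++) {
            sum += X[i];           // sum = X[0] + ... + X[i]
            A[i] = sum / (i + 1);  // average of the first i+1 values
        }
        return A;
    }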

Asymptotic Analysis: Overview
- running time = number of instructions (assume all instructions are equal)
- usually interested in the worst-case running time as a function of input size: the largest number of instructions on an input of size n
- find the order of growth of the worst-case running time: a Theta-bound
- classification of growth rates: Theta(1) < Theta(lg n) < Theta(n) < Theta(n lg n) < Theta(n^2) < Theta(n^3) < Theta(2^n)
- at the algorithm design level, we want to find the most efficient algorithm in terms of growth rate; we can optimize constants at the implementation step