Lecture 2. More Algorithm Analysis, Math and MCSS By: Sarah Buchanan


Announcements: Assignment #1 is posted online. It is directly related to MCSS, which we will be talking about today or Monday. There are 3 solutions of different complexity to MCSS, just as there will be 3 solutions of different complexity for the assignment, so make sure you keep that in mind! Consider the overflow issue, since you will be multiplying many large integers. Given the range of numbers, the input size, and the fact that I expect experimental analysis, I would recommend reading about and using the Java BigInteger class. DUE 9/8, by midnight, on Webcourses; 10% off 1 day late, 20% off 2 days late, 0 after that.

Summary: We are going to go over more mathematical techniques for analyzing algorithms, and then we are going to put them into practice! We will go over Big-Oh, Theta, and Omega, and if we have time we will discuss MCSS and experimental analysis.

Math Review: Last time we went over infinite geometric series and how to derive that formula. Infinite geometric series, where |r| < 1:
S = a_1 + a_1 r + a_1 r^2 + a_1 r^3 + ... = a_1 / (1 - r)
Another standard summation we will use is the finite geometric series (the first n terms):
S = a_1 + a_1 r + a_1 r^2 + ... + a_1 r^(n-1) = a_1 (1 - r^n) / (1 - r)
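As a reminder of the subtraction technique from last class, here is a sketch of the standard derivation of the finite formula (my own write-up, not copied from the slides):

S = a_1 + a_1 r + a_1 r^2 + \cdots + a_1 r^{n-1}
rS = a_1 r + a_1 r^2 + \cdots + a_1 r^{n-1} + a_1 r^n
S - rS = a_1 - a_1 r^n
S = \frac{a_1 (1 - r^n)}{1 - r} \qquad (r \neq 1)

Letting n go to infinity with |r| < 1 makes r^n vanish, which recovers the infinite-series formula S = a_1 / (1 - r).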

More Summations we will Use:
1 + 2 + 3 + ... + n = n(n+1)/2
First n terms of an arithmetic series: S = n(a_1 + a_n) / 2
An arithmetic series is a sequence of numbers such that the difference between two successive members is a constant. Example to solve: 7 + 10 + 13 + 16 + ... Use the subtraction technique shown last class to solve it.
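As a quick worked instance (my own addition), take the example series above with a_1 = 7 and common difference 3, so a_n = 7 + 3(n - 1):

S_n = \frac{n (a_1 + a_n)}{2} = \frac{n (7 + 7 + 3(n-1))}{2} = \frac{n (3n + 11)}{2}

For n = 4 this gives 4 \cdot 23 / 2 = 46, matching 7 + 10 + 13 + 16 directly.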

Math Review - Logs: The log function is the inverse of an exponent: if b^a = c, then by definition log_b c = a. You can never take the log of 0 or of any negative value. Why? Because a positive number raised to any exponent can never be 0 or negative, and only logs with positive bases are used. Rules:
log_b a + log_b c = log_b (ac)
log_b (a^c) = c log_b a
log_b a - log_b c = log_b (a/c)
log_b a = log_c a / log_c b
b^(log_c a) = a^(log_c b)
b^a / b^c = b^(a-c)
b^a * b^c = b^(a+c)
(b^a)^c = b^(ac)

Math Review - Logs: Key observation: logarithms grow slowly.
2^10 = 1024 (log_2 1024 = 10)
2^20 = 1048576 ≈ 1x10^6 (log_2 1x10^6 ≈ 20)
2^30 = 1073741824 ≈ 1x10^9 (log_2 1x10^9 ≈ 30)
This means that the performance of an O(N log N) algorithm is much closer to that of an O(N) algorithm than to that of an O(N^2) algorithm, even for large amounts of input. In general, any time we repeatedly halve or double a quantity in some way, a logarithm is involved in the analysis.
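A quick illustration of that last point (my own sketch, not from the lecture): counting how many times we can halve N before reaching 1 recovers floor(log_2 N), which is exactly the pattern behind binary search's O(log N) running time.

public class LogDemo {
    public static void main(String[] args) {
        for (int n : new int[] {1024, 1000000, 1000000000}) {
            // Repeatedly halve n until it reaches 1; the count is floor(log2 n).
            int halvings = 0;
            for (int x = n; x > 1; x /= 2)
                halvings++;
            System.out.println(n + " halves to 1 in " + halvings + " steps");
        }
    }
}

Running it prints 10, 19, and 29 steps, matching the table above (19 and 29 rather than 20 and 30 because 10^6 and 10^9 fall just short of 2^20 and 2^30).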

Examples on the board: We're going to solve all of the examples given before on the board.

Sorted List Matching Problem - Approach #1: Let's compare 3 different solutions to this problem and their runtimes. Problem: Given 2 sorted lists of names, output the names common to both lists. The obvious way to do this: For each name on list #1: a) Search for the current name in list #2. b) If the name is found, output it. This isn't leveraging the fact that we know the lists are sorted: it takes O(n) to do (a) and (b), multiplied by the n names on list #1, for a total of O(n^2). (A code sketch of this approach follows below.)
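Here is a minimal Java sketch of Approach #1 (my own illustration; the class and method names are hypothetical, not from the lecture):

import java.util.ArrayList;
import java.util.List;

public class SortedListMatch {
    // Approach #1: linearly search list2 for every name on list1 -- O(n^2).
    public static List<String> commonNames(String[] list1, String[] list2) {
        List<String> common = new ArrayList<>();
        for (String name : list1) {          // n names on list #1
            for (String other : list2) {     // O(n) scan of list #2 per name
                if (name.equals(other)) {
                    common.add(name);
                    break;                   // found it; output and move on
                }
            }
        }
        return common;
    }
}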

Sorted List Matching Problem - Approach #2: Let's use the fact that the lists are sorted! For each name on list #1: a) Search for the current name in list #2. b) If the name is found, output it. For step (a), use a binary search. From CS1, we know that this takes O(log n) time. (What a binary search does is compare the target to the middle item in the list. If the target is the same, you're done. If it comes before the middle, repeat this procedure on the items before the middle; if it comes after the middle, repeat on the items after the middle. THUS, we are dividing our search space in half each time.)

Sorted List Matching Problem - Approach #2 Continued: For the moment, let's assume that both lists are of equal size; then the size of list #2 is 1/2 the total input size, N. Technically, each search would then take O(log(N/2)) time. HOWEVER, using our log rules, we find that log_2 N = log_2 (N/2) + 1. Thus, each search is simply O(log_2 N). BUT this is just for 1 loop iteration, finding one name from list #1. How many loop iterations are there? N/2. THUS, the total running time of Approach #2 is O(N log_2 N). (See the sketch below.)
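A sketch of Approach #2 in Java (again my own illustration), using the library binary search from java.util.Arrays instead of a hand-rolled one:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SortedListMatchBinary {
    // Approach #2: binary-search list2 for every name on list1 -- O(N log N).
    public static List<String> commonNames(String[] list1, String[] list2) {
        List<String> common = new ArrayList<>();
        for (String name : list1) {
            // Arrays.binarySearch requires list2 to be sorted and returns
            // a non-negative index exactly when the name is present.
            if (Arrays.binarySearch(list2, name) >= 0)
                common.add(name);
        }
        return common;
    }
}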

Sorted List Matching Problem - Approach #3: Can we do better? We still haven't used the fact that list #1 is sorted! Can we exploit this fact so that we don't have to do a full binary search for each name?
List #1: Albert, Brandon, Carolyn, Dan, Elton
List #2: Cari, Carolyn, Chris, Fred, Graham

Sorted List Matching Problem - Approach #3: Formal version of the algorithm: 1) Start 2 markers, one for each list, at the beginning of both lists. 2) Repeat the following until one marker has reached the end of its list: a) Compare the two names that the markers are pointing at. b) If they are equal, output the name and advance BOTH markers one spot. If they are NOT equal, simply advance the marker pointing to the name that comes earlier alphabetically one spot. (A code sketch follows below.)
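A Java rendering of the two-marker algorithm above (my own sketch of the steps, not code from the lecture):

import java.util.ArrayList;
import java.util.List;

public class SortedListMatchMerge {
    // Approach #3: advance two markers through the sorted lists -- O(N).
    public static List<String> commonNames(String[] list1, String[] list2) {
        List<String> common = new ArrayList<>();
        int i = 0, j = 0;   // one marker per list
        while (i < list1.length && j < list2.length) {
            int cmp = list1[i].compareTo(list2[j]);
            if (cmp == 0) {        // equal: output and advance BOTH markers
                common.add(list1[i]);
                i++;
                j++;
            } else if (cmp < 0) {  // list1's name comes earlier: advance it
                i++;
            } else {               // list2's name comes earlier: advance it
                j++;
            }
        }
        return common;
    }
}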

Sorted List Matching Problem - Approach #3 Algorithm Run-Time Analysis: On each loop iteration, we advance at least one marker, so the maximum number of iterations is the total number of names on both lists, N. Each iteration does a constant amount of work: essentially a comparison and/or outputting a name. Thus, our algorithm runs in O(N) time, an improvement. Can we do better? No, because we need to at least read each name in both lists; if we skip names on BOTH lists, we cannot deduce whether we might have missed matches. Even if we could skip the names in one list, we would still read all the names in the other list, giving O(N/2) = O(N).

Order Notation: Big-Oh, Theta, and Omega: Since we want to count the number of simple statements an algorithm executes in terms of Big-Oh notation, we need to learn the formal definitions of Big-Oh, Big-Omega, and Big-Theta so that we apply these technical terms properly.

Big-Oh: Definition of O(g(n)): f(n) = O(g(n)) iff f(n) ≤ c·g(n) for all n ≥ n_0, for some constant c and constant threshold n_0. Example: Let f(n) = 2n + 1 and g(n) = n. Does f(n) = O(g(n))? Proof: Let c = 3, n_0 = 2. Then f(n) = 2n + 1 and c·g(n) = 3n. Thus we need to show (2n + 1) ≤ 3n for all n ≥ 2. We know 2n + 1 ≤ 2n + n = 3n, since n > 1.

Big-Oh: Simple rule for polynomial functions: if f(n) is a polynomial of degree k, then f(n) = O(n^k). Example: an^4 + bn^3 + cn^2 = O(n^4). Is 2n + 1 = O(n^10)? Yes; try c = 1, n_0 = 2. Big-Oh is an upper bound: it simply guarantees that a function f(n) is no larger than a constant times the function g(n), for O(g(n)). We will usually use Big-Oh notation when we are describing a worst-case running time.

Big Omega: Big Omega is just about the opposite of Big-Oh. Definition: f(n) = Ω(g(n)) iff f(n) ≥ c·g(n) for all n ≥ n_0 (n_0 is a constant), for some constant c. The only difference is the inequality sign!

Big Omega: Example: Let f(n) = n^2 - 3 and g(n) = 10n. Does f(n) = Ω(g(n))? Proof: Let c = 0.1, n_0 = 3. Then f(n) = n^2 - 3 and c·g(n) = n. Thus, we need to show n^2 - 3 ≥ n for all n ≥ 3:
n^2 - 3 ≥ n^2 - n, since n ≥ 3
        = n(n - 1)
        ≥ n·2, since n - 1 ≥ 2
        ≥ n

Big Omega: Ω establishes a lower bound for a function: f(n) has to grow at least as fast as g(n), to within a constant factor. For example, if I said that an algorithm runs in Ω(n) time, this means that whenever you run the algorithm on an input of size n, the number of simple instructions executed is AT LEAST c·n, where c is some positive constant.

Big Theta: Definition: f(n) = Θ(g(n)) iff f(n) = O(g(n)) and f(n) = Ω(g(n)). This means that g(n) is both an upper AND a lower bound of f(n) within a constant factor: as n grows large, f(n) and g(n) stay within a constant factor of each other. Thus, if we can show an algorithm runs in O(f(n)) time AND in Ω(f(n)) time, we can conclude that both the WORST-case running time and the BEST-case running time are proportional to f(n). If this is the case, we CONCLUDE the algorithm runs in Θ(f(n)) time.

Order Notation Summary: O is like ≤, Ω is like ≥, and Θ is like = (since f(n) = O(g(n)) iff f(n) ≤ c·g(n)). Finally, another way to think about each of these is that they describe classes of functions. Saying f(n) = O(n) is like saying f(n) ϵ O(n), meaning f(n) can be any of a number of functions that are proportional to n or smaller.

Experimental Analysis: We just covered methods used to determine the theoretical run-time of an algorithm. BUT sometimes an algorithm is too difficult to analyze theoretically, OR sometimes we would like to verify that an algorithm is actually running as fast as we expect. In these cases, we can experimentally gauge the run-time of an algorithm.

Experimental Analysis: Let T(N) be the empirical (observed) running time of the code, and suppose the claim is made that T(N) ϵ O(F(N)). The technique is to compute a series of values T(N)/F(N) for a range of N (commonly spaced out by factors of 2). Depending on the values of T(N)/F(N), we can determine how accurate our estimate F(N) is. F(N) is: a close answer (a tight bound) if the values converge to a positive constant; an overestimate if the values converge to zero; an underestimate if the values diverge.
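A sketch of this technique in Java (my own illustration; the work method is a hypothetical stand-in for whatever code is being measured):

public class RatioCheck {
    // A stand-in workload to time; this one is deliberately quadratic.
    static long work(int n) {
        long count = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                count++;
        return count;
    }

    public static void main(String[] args) {
        for (int n = 1000; n <= 16000; n *= 2) {  // N spaced by factors of 2
            long start = System.nanoTime();
            work(n);
            double tN = (System.nanoTime() - start) / 1e6;  // T(N) in ms
            double fN = (double) n * n;                     // the guess F(N) = N^2
            // If T(N)/F(N) converges to a positive constant, F(N) is tight;
            // if it shrinks toward zero, F(N) is an overestimate;
            // if it keeps growing, F(N) is an underestimate.
            System.out.printf("N=%6d  T(N)=%10.3f ms  T(N)/F(N)=%.3e%n",
                              n, tN, tN / fN);
        }
    }
}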

Example 1: The table contains results from running an instance of an algorithm assumed to be cubic. Decide whether the Big-Theta estimate Θ(N^3) is accurate.

Run  N      T(N)            F(N) = N^3   T(N)/F(N)
1    100    0.017058 ms     10^6         1.7058x10^-8
2    1000   17.058 ms       10^9         1.7058x10^-8
3    5000   2132.2464 ms    1.25x10^11   1.7058x10^-8
4    10000  17057.971 ms    10^12        1.7058x10^-8
5    50000  2132246.375 ms  1.25x10^14   1.7058x10^-8

The calculated values converge to a positive constant (about 1.7058x10^-8), so the estimate of Θ(N^3) is accurate.

Example 2: The table contains results from running an instance of an algorithm assumed to be quadratic. Decide whether the Big-Theta estimate Θ(N^2) is accurate.

Run  N        T(N)            F(N) = N^2   T(N)/F(N)
1    100      0.00012 ms      10^4         1.2x10^-8
2    1000     0.03389 ms      10^6         3.389x10^-8
3    10000    10.6478 ms      10^8         1.064x10^-7
4    100000   2970.0177 ms    10^10        2.970x10^-7
5    1000000  938521.971 ms   10^12        9.385x10^-7

The values diverge, so F(N) = N^2 is an underestimate: the code runs in Ω(N^2), but its Theta bound is larger than N^2.

MCSS (Maximum Contiguous Subsequence Sum):

public static int MCSS(int[] a) {
    int max = 0, sum = 0, start = 0, end = 0;
    // Cycle through all possible values of start and end indexes for the sum.
    for (int i = 0; i < a.length; i++) {
        for (int j = i; j < a.length; j++) {
            sum = 0;
            // Find sum a[i] to a[j].
            for (int k = i; k <= j; k++)
                sum += a[k];
            if (sum > max) {
                max = sum;
                start = i;  // Although the method doesn't return these,
                end = j;    // they can be computed.
            }
        }
    }
    return max;
}

Trace on {-2, 11, -4, 13, -5, 2}, where i is the starting point and j is the ending point:
i = 0, j = 0: k = 0, sum += -2
i = 0, j = 1: k = 0, sum += -2; k = 1, sum += 11
i = 0, j = 2: k = 0, sum += -2; k = 1, sum += 11; k = 2, sum += -4
As we can see, we are repeatedly starting the sum over and doing much more work than we need to!

MCSS: Looking at the same code, there are 3 loops: the i loop executes N times; the j loop executes up to N times (N - i times for a given i); and the k loop executes up to N times, in the worst case when i = 0. Thus, this gives a rough estimate that the algorithm is O(N^3).
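For a sharper count than the rough estimate (my own addition, not from the slides): the innermost statement sum += a[k] runs once for every triple i ≤ k ≤ j, and summing over all start/end pairs confirms the cubic growth:

\sum_{i=0}^{N-1} \sum_{j=i}^{N-1} (j - i + 1)
  = \sum_{i=0}^{N-1} \frac{(N-i)(N-i+1)}{2}
  = \sum_{m=1}^{N} \frac{m(m+1)}{2}
  = \frac{N(N+1)(N+2)}{6} = \Theta(N^3)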

MCSS Continued next class