ECE50: Algorithms and Data Structures Analyzing and Designing Algorithms Ladan Tahvildari, PEng, SMIEEE Associate Professor Software Technologies Applied Research (STAR) Group Dept. of Elect. & Comp. Eng. University of Waterloo Materials from CLRS: Chapter
Acknowledgements v The following resources have been used to prepare materials for this course: Ø MIT OpenCourseWare Ø Introduction To Algorithms (CLRS Book) Ø Data Structures and Algorithm Analysis in C++ (M. Wiess) Ø Data Structures and Algorithms in C++ (M. Goodrich) v Thanks to many people for pointing out mistakes, providing suggestions, or helping to improve the quality of this course over the last ten years: Ø http://www.stargroup.uwaterloo.ca/~ece50/acknowledgment/ Lecture ECE50
Analysis of Algorithms v The theoretical study of computer-program performance and resource usage. v What is more important than performance? Ø Modularity Ø Correctness Ø Maintainability Ø Functionality Ø Robustness Ø User-Friendliness Ø Simplicity Ø Extensibility Ø Reliability Ø Programmer Time Lecture ECE50 3
Why Study Algorithms and Performance? v Performance often draws the line between what is feasible and what is impossible. v Algorithms help us to understand scalability. v Algorithmic mathematics provides a language for talking about program behavior. Lecture ECE50 4
An Example v A sorting problem: Input: A sequence of numbers a 1, a,, a n Output: A re-ordering a' 1, a',, a' n such that a' 1 a' a' n v Consider two basic algorithms: Ø Insertion Sort Ø Merge Sort Lecture ECE50 5
Example v Suppose we want to sort a sequence of v Insertion Sort: c n [later ] 1 v Merge Sort: c nlg( ) [later ] n 6 10 numbers v Bad Programmer ( c = 50), compiler, inst/sec, Computer A v Good Programmer ( c 1 = ), machine language, inst/sec, Computer B 10 10 v Computer A: Ø Merge Sort (50)(10 6 )lg(10 6 ) /10 100 seconds v Computer B: Ø Insertion Sort ()(10 6 ) /10 00 seconds 10 v For numbers.3 days vs. minutes v Conclusions??? Lecture ECE50 6
Asymptotic Performance v In this course, we are concerned with asymptotic performance of the algorithms: Ø What about the behavior of the algorithm as the size of the problem grows? The running time The memory/storage requirements v The first task will be formally define this asymptotic notation. Lecture ECE50
Asymptotic Performance v Types of Analysis: Ø Worst Case Provides an upper bound in the running time Like an Absolute Guarantee Ø Average Case Estimate of the expected running time Can be useful, but what is average case? ü Random (equally likely) inputs ü Real-Life inputs, application driven Lecture ECE50 8
The Problem of Sorting Input: sequence a 1, a,, a n of numbers. Output: permutation a' 1, a',, a' n such that a' 1 a' a' n. Example: Input: 8 4 3 6 Output: 3 4 6 8 Consider two sorting algorithms: Ø Insertion Sort Ø Merge Sort n nlg(n) Lecture ECE50
Insertion Sort pseudocode INSERTION-SORT (A, n) for j to n do key A[ j] i j 1 while i > 0 and A[i] > key do A[i+1] A[i] i i 1 A[i+1] = key A: 1 i j n sorted key Lecture ECE50 10
Insertion Sort - Example 8 4 3 6 Lecture ECE50
Insertion Sort - Example 8 4 3 6 Lecture ECE50
Insertion Sort - Example 8 4 3 6 8 4 3 6 Lecture ECE50
Insertion Sort - Example 8 4 3 6 8 4 3 6 Lecture ECE50 14
Insertion Sort - Example 8 4 3 6 8 4 3 6 4 8 3 6 Lecture ECE50 15
Insertion Sort - Example 8 4 3 6 8 4 3 6 4 8 3 6 Lecture ECE50 16
Insertion Sort - Example 8 4 3 6 8 4 3 6 4 8 3 6 4 8 3 6 Lecture ECE50 1
Insertion Sort - Example 8 4 3 6 8 4 3 6 4 8 3 6 4 8 3 6 Lecture ECE50 18
Insertion Sort - Example 8 4 3 6 8 4 3 6 4 8 3 6 4 8 3 6 3 4 8 6 Lecture ECE50 1
Insertion Sort - Example 8 4 3 6 8 4 3 6 4 8 3 6 4 8 3 6 3 4 8 6 Lecture ECE50
Insertion Sort - Example 8 4 3 6 8 4 3 6 4 8 3 6 4 8 3 6 3 4 8 6 3 4 6 8 Lecture ECE50 1
Running Time v The running time depends on the input. Ø An already sorted sequence is easier to sort. v Parameterize the running time by the size of the input. Ø Short sequences are easier to sort than long ones. v We seek upper bounds on the running time, to have a guarantee of performance. Lecture ECE50
Kinds of Analyses v Worst-case: (usually) Ø T(n): maximum time of algorithm on any input of size n. v Average-case: (sometimes) Ø T(n): expected time of algorithm over all inputs of size n. Need assumption of statistical distribution of inputs. v Best-case: (NEVER) Ø Cheat with a slow algorithm that works fast on some input. Lecture ECE50 3
Machine-Independent Time What is insertion sort s worst-case time? v Depends on the speed of our computer: Ø Relative speed (on the same machine) Ø Absolute speed (on different machines) BIG IDEA: v Ignore machine-dependent constants Ø otherwise impossible to verify and to compare algorithms v Look at growth of T(n) as n. Asymptotic Analysis Lecture ECE50 4
Asymptotic Analysis Engineering: v Drop low-order terms; ignore leading constants. v Example: 3n 3 + 0n 5n + 6046 = Θ(n 3 ) Math: Θ(g(n)) = { f (n) : there exist positive constants c 1, c, and n 0 such that 0 c 1 g(n) f (n) c g(n) for all n n 0 } Lecture ECE50 5
Asymptotic Performance T(n) When n gets large enough, a Θ(n ) algorithm always beats a Θ(n 3 ) algorithm. n. Asymptotic analysis is a useful tool to help to structure our thinking toward better algorithm. We shouldn t ignore asymptotically slower algorithms, however. n 0 Lecture ECE50 6
Insertion Sort Analysis pseudocode INSERTION-SORT (A, n) for j to n do key A[ j] i j 1 while i > 0 and A[i] > key do A[i+1] A[i] i i 1 A[i+1] = key A: 1 i j n sorted key Lecture ECE50
Insertion Sort Analysis Worst case: Input reverse sorted. n T ( n) = Θ( j) = Θ n [arithmetic series] j= ( ) Lecture ECE50 8
Math Review v Floor and Ceiling v Arithmetic Series v 3 / = 1, 3/ =, 3/ = n k 1 k = 1+ v Geometric Series +... + n = log log ; log104= 10 i= 0 1 i 1 n( n + 1) v Appendix A, B, C of CLRS as needed during the course = Lecture ECE50
Insertion Sort Analysis Worst case: Input reverse sorted. n T ( n) = Θ( j) = Θ n [arithmetic series] j= ( ) Average case: All permutations equally likely. T ( n) = n j= Θ( j / ) = Θ ( n ) Is insertion sort a fast sorting algorithm? v Moderately so, for small n. v Not at all, for large n. Lecture ECE50 30
The Problem of Sorting Input: sequence a 1, a,, a n of numbers. Output: permutation a' 1, a',, a' n such that a' 1 a' a' n. Example: Input: 8 4 3 6 Output: 3 4 6 8 Consider two sorting algorithms: Ø Insertion Sort Ø Merge Sort n nlg(n) Lecture ECE50 31
Merge Sort MERGE-SORT A[1.. n] 1. If n = 1, done.. Recursively sort A[ 1.. n/ ] and A[ n/ +1.. n ]. 3. Merge the sorted lists. Key subroutine: MERGE Lecture ECE50 3
Merging Two Sorted Arrays 1 Lecture ECE50 33
Merging Two Sorted Arrays 1 1 Lecture ECE50 34
Merging Two Sorted Arrays 1 1 Lecture ECE50 35
Merging Two Sorted Arrays 1 1 Lecture ECE50 36
Merging Two Sorted Arrays 1 1 Lecture ECE50 3
Merging Two Sorted Arrays 1 1 Lecture ECE50 38
Merging Two Sorted Arrays 1 1 Lecture ECE50 3
Merging Two Sorted Arrays 1 1 Lecture ECE50 40
Merging Two Sorted Arrays 1 1 Lecture ECE50 41
Merging Two Sorted Arrays 1 1 Lecture ECE50 4
Merging Two Sorted Arrays 1 1 Lecture ECE50 43
Merging Two Sorted Arrays 1 1 Lecture ECE50 44
Merging Two Sorted Arrays 1 1 Time = Θ(n) to merge a total of n elements (linear time). Lecture ECE50 45
Analyzing Merge Sort T(n) Θ(1) T(n/) Θ(n) MERGE-SORT A[1.. n] 1. If n = 1, done.. Recursively sort A[ 1.. n/ ] and A[ n/ +1.. n ]. 3. Merge the sorted lists Sloppiness: Should be T( n/ ) + T( n/ ), but it turns out not to matter asymptotically. Lecture ECE50 46
Recurrence for Merge Sort T(n) = Θ(1) if n = 1; T(n/) + Θ(n) if n > 1. v We shall usually omit stating the base case when T(n) = Θ(1) for sufficiently small n, but only when it has no effect on the asymptotic solution to the recurrence. v Next lecture provides several ways to find a good upper bound on T(n). Lecture ECE50 4
Recursion Tree Solve T(n) = T(n/) + cn, where c > 0 is constant. Lecture ECE50 48
Recursion Tree Solve T(n) = T(n/) + cn, where c > 0 is constant. T(n) Lecture ECE50 4
Recursion Tree Solve T(n) = T(n/) + cn, where c > 0 is constant. cn T(n/) T(n/) Lecture ECE50 50
Recursion Tree Solve T(n) = T(n/) + cn, where c > 0 is constant. cn cn/ cn/ T(n/4) T(n/4) T(n/4) T(n/4) Lecture ECE50 51
Recursion Tree Solve T(n) = T(n/) + cn, where c > 0 is constant. cn cn/ cn/ cn/4 cn/4 cn/4 cn/4 Θ(1) Lecture ECE50 5
Recursion Tree Solve T(n) = T(n/) + cn, where c > 0 is constant. cn cn/ cn/ h = lg n cn/4 cn/4 cn/4 cn/4 Θ(1) Lecture ECE50 53
Recursion Tree Solve T(n) = T(n/) + cn, where c > 0 is constant. cn cn cn/ cn/ h = lg n cn/4 cn/4 cn/4 cn/4 Θ(1) Lecture ECE50 54
Recursion Tree Solve T(n) = T(n/) + cn, where c > 0 is constant. cn cn cn/ cn/ cn h = lg n cn/4 cn/4 cn/4 cn/4 Θ(1) Lecture ECE50 55
Recursion Tree Solve T(n) = T(n/) + cn, where c > 0 is constant. cn cn cn/ cn/ cn h = lg n cn/4 cn/4 cn/4 cn/4 cn Θ(1) Lecture ECE50 56
Recursion Tree Solve T(n) = T(n/) + cn, where c > 0 is constant. cn cn cn/ cn/ cn h = lg n cn/4 cn/4 cn/4 cn/4 cn Θ(1) #leaves = n Θ(n) Lecture ECE50 5
Recursion Tree Solve T(n) = T(n/) + cn, where c > 0 is constant. cn cn cn/ cn/ cn h = lg n cn/4 cn/4 cn/4 cn/4 cn Θ(1) #leaves = n Θ(n) Total = Θ(n lg n) Lecture ECE50 58
Conclusions v Θ(n lg n) grows more slowly than Θ(n ). v Therefore, merge sort asymptotically beats insertion sort in the worst case. v In practice, merge sort beats insertion sort for n > 30 or so. Lecture ECE50 5