Lecture 20: Analysis of Algorithms

Similar documents
10: Analysis of Algorithms

4.1, 4.2: Analysis of Algorithms

Running Time. Overview. Case Study: Sorting. Sorting problem: Analysis of algorithms: framework for comparing algorithms and predicting performance.

Lecture 14: Nov. 11 & 13

Algorithms. Algorithms 2.2 MERGESORT. mergesort bottom-up mergesort sorting complexity divide-and-conquer ROBERT SEDGEWICK KEVIN WAYNE

COMP 250: Quicksort. Carlos G. Oliver. February 13, Carlos G. Oliver COMP 250: Quicksort February 13, / 21

Lecture 4. Quicksort

Algorithms And Programming I. Lecture 5 Quicksort

Analysis of Algorithms

data structures and algorithms lecture 2

Sorting Algorithms. !rules of the game!shellsort!mergesort!quicksort!animations. Classic sorting algorithms

Analysis of Algorithms. Randomizing Quicksort

Chapter 5. Divide and Conquer CLRS 4.3. Slides by Kevin Wayne. Copyright 2005 Pearson-Addison Wesley. All rights reserved.

Algorithms, CSE, OSU Quicksort. Instructor: Anastasios Sidiropoulos

Analysis of Algorithms CMPSC 565

Formal Languages. We ll use the English language as a running example.

1 Divide and Conquer (September 3)

Chapter 5. Divide and Conquer CLRS 4.3. Slides by Kevin Wayne. Copyright 2005 Pearson-Addison Wesley. All rights reserved.

Sorting. Chapter 11. CSE 2011 Prof. J. Elder Last Updated: :11 AM

Week 5: Quicksort, Lower bound, Greedy

CS 4407 Algorithms Lecture 2: Iterative and Divide and Conquer Algorithms

Data Structures and Algorithms CSE 465

CSCE 222 Discrete Structures for Computing

Asymptotic Running Time of Algorithms

1 Quick Sort LECTURE 7. OHSU/OGI (Winter 2009) ANALYSIS AND DESIGN OF ALGORITHMS

Introduction. An Introduction to Algorithms and Data Structures

Big-O Notation and Complexity Analysis

Formal Languages. We ll use the English language as a running example.

Lecture 1: Asymptotics, Recurrences, Elementary Sorting

csci 210: Data Structures Program Analysis

Outline. 1 Introduction. 3 Quicksort. 4 Analysis. 5 References. Idea. 1 Choose an element x and reorder the array as follows:

P, NP, NP-Complete, and NPhard

SORTING ALGORITHMS BBM ALGORITHMS DEPT. OF COMPUTER ENGINEERING ERKUT ERDEM. Mar. 21, 2013

Algorithms. Quicksort. Slide credit: David Luebke (Virginia)

Quicksort. Where Average and Worst Case Differ. S.V. N. (vishy) Vishwanathan. University of California, Santa Cruz

csci 210: Data Structures Program Analysis

Central Algorithmic Techniques. Iterative Algorithms

Analysis of Algorithm Efficiency. Dr. Yingwu Zhu

Topic 17. Analysis of Algorithms

Data Structures and Algorithms Chapter 2

Great Theoretical Ideas in Computer Science. Lecture 9: Introduction to Computational Complexity

Algorithms. Algorithms 2.3 QUICKSORT. quicksort selection duplicate keys system sorts ROBERT SEDGEWICK KEVIN WAYNE.

b + O(n d ) where a 1, b > 1, then O(n d log n) if a = b d d ) if a < b d O(n log b a ) if a > b d

Sorting Algorithms. We have already seen: Selection-sort Insertion-sort Heap-sort. We will see: Bubble-sort Merge-sort Quick-sort

Sorting algorithms. Sorting algorithms

Computational Complexity

Quicksort. Recurrence analysis Quicksort introduction. November 17, 2017 Hassan Khosravi / Geoffrey Tien 1

Outline. 1 Merging. 2 Merge Sort. 3 Complexity of Sorting. 4 Merge Sort and Other Sorts 2 / 10

Divide and Conquer Algorithms

Algorithms. Algorithms 2.3 QUICKSORT. quicksort selection duplicate keys system sorts ROBERT SEDGEWICK KEVIN WAYNE.

Review Of Topics. Review: Induction

CSE 421, Spring 2017, W.L.Ruzzo. 8. Average-Case Analysis of Algorithms + Randomized Algorithms

Computer Algorithms CISC4080 CIS, Fordham Univ. Outline. Last class. Instructor: X. Zhang Lecture 2

Computer Algorithms CISC4080 CIS, Fordham Univ. Instructor: X. Zhang Lecture 2

CSE332: Data Structures & Parallelism Lecture 2: Algorithm Analysis. Ruth Anderson Winter 2019

Computer Science. Questions for discussion Part II. Computer Science COMPUTER SCIENCE. Section 4.2.

Complexity Theory Part I

Fundamental Algorithms

Fundamental Algorithms

Data Structures. Outline. Introduction. Andres Mendez-Vazquez. December 3, Data Manipulation Examples

CSE332: Data Structures & Parallelism Lecture 2: Algorithm Analysis. Ruth Anderson Winter 2018

Chapter 2: The Basics. slides 2017, David Doty ECS 220: Theory of Computation based on The Nature of Computation by Moore and Mertens

Reminder of Asymptotic Notation. Inf 2B: Asymptotic notation and Algorithms. Asymptotic notation for Running-time

Survey says... Announcements. Survey says... Survey says...

On Partial Sorting. Conrado Martínez. Univ. Politècnica de Catalunya, Spain

CSE332: Data Structures & Parallelism Lecture 2: Algorithm Analysis. Ruth Anderson Winter 2018

LECTURE NOTES ON DESIGN AND ANALYSIS OF ALGORITHMS

NEW RESEARCH on THEORY and PRACTICE of SORTING and SEARCHING. R. Sedgewick Princeton University. J. Bentley Bell Laboratories

Module 1: Analyzing the Efficiency of Algorithms

Turing Machines and Time Complexity

Algorithm Design and Analysis

Asymptotic Algorithm Analysis & Sorting

Limitations of Algorithm Power

CSE 312, Winter 2011, W.L.Ruzzo. 8. Average-Case Analysis of Algorithms + Randomized Algorithms

Data Structures in Java

ITI Introduction to Computing II

Data Structures and Algorithms

10.3: Intractability. Overview. Exponential Growth. Properties of Algorithms. What is an algorithm? Turing machine.

Algorithms, Design and Analysis. Order of growth. Table 2.1. Big-oh. Asymptotic growth rate. Types of formulas for basic operation count

Measuring Goodness of an Algorithm. Asymptotic Analysis of Algorithms. Measuring Efficiency of an Algorithm. Algorithm and Data Structure

Problem. Problem Given a dictionary and a word. Which page (if any) contains the given word? 3 / 26

Quicksort (CLRS 7) We previously saw how the divide-and-conquer technique can be used to design sorting algorithm Merge-sort

CS Analysis of Recursive Algorithms and Brute Force

Computer Science 4.1 PERFORMANCE. empirical analysis mathematical analysis ROBERT SEDGEWICK KEVIN WAYNE.

Motivation. Dictionaries. Direct Addressing. CSE 680 Prof. Roger Crawfis

CISC 235: Topic 1. Complexity of Iterative Algorithms

Inf 2B: Sorting, MergeSort and Divide-and-Conquer

Elementary Sorts 1 / 18

Outline. 1 Introduction. Merging and MergeSort. 3 Analysis. 4 Reference

O Notation (Big Oh) We want to give an upper bound on the amount of time it takes to solve a problem.

Linear Time Selection

Sorting and Searching. Tony Wong

1. Basic Algorithms: Bubble Sort

Algorithms and Their Complexity

Divide-and-Conquer. Consequence. Brute force: n 2. Divide-and-conquer: n log n. Divide et impera. Veni, vidi, vici.

Quicksort algorithm Average case analysis

ECE250: Algorithms and Data Structures Analyzing and Designing Algorithms

Divide and Conquer Algorithms. CSE 101: Design and Analysis of Algorithms Lecture 14

1. Basic Algorithms: Bubble Sort

Chapter 5. Divide and Conquer. Slides by Kevin Wayne. Copyright 2005 Pearson-Addison Wesley. All rights reserved.

Transcription:

Overview Lecture 20: Analysis of Algorithms Which algorithm should I choose for a particular application?! Ex: symbol table. (linked list, hash table, red-black tree,... )! Ex: sorting. (insertion sort, quicksort, mergesort,... )! Ex: halting problem. (none) Analysis of algorithms: framework for comparing algorithms and predicting performance. Computational complexity: framework for studying intrinsic difficulty of problems. COS 126: General Computer Science http://www.princeton.edu/~cos126 2 Ex 1: N-body Simulation. Overview! Simulate gravitational interactions among N bodies.! Brute force: N 2 steps.! Barnes-Hut: N log N steps, enables new research. Ex 2: Discrete Fourier transform.! Break down waveform of N samples into periodic components. Applications: DVD players, JPEG, analysis of astronomical data, medical imaging, nonlinear Schroedinger equation,...! Brute force: N 2 steps.! FFT algorithm: N log N steps, enables new technology. Ex 3: Sorting.! Rearrange N items in ascending order. physicists want N = # atoms in universe! Fundamental information processing abstraction. Andrew Appel PU '81 Freidrich Gauss 1805 Attribute Cost Applicability Improvement Better Machines vs. Better Algorithms Better Machine $$$ or more. Makes "everything" run faster. Incremental quantitative improvements. $ or less. May not apply to some problems. Ex: sorting a billion items with quicksort on home PC is 100x faster than insertion sorting it on supercomputer. Computer home PC Comparisons Per Second 10 7 Better Algorithm Dramatic quantitative improvements possible. Insertion 3 centuries 3 hours Jon von Neumann IAS 1945 super 10 12 2 weeks 3 4

Case Study: Sorting Insertion Sort Sorting problem:! Given N items, rearrange them in ascending order.! Applications: databases, data compression, computational biology, computer graphics,... Insertion sort! Brute-force sorting solution.! Move left-to-right through array.! Exchange next element with larger elements to its left, one-by-one. name Hauser Hong Hsu Hayes Haskell Hanley Hornet name Hanley Haskell Hauser Hayes Hong Hornet Hsu static void insertionsort(string[] a) { for (int i = 0; i < a.length; i++) { for (int j = i; j > 0; j--) { if (less(a[j], a[j-1])) exch(a, j, j-1); else break; 5 6 Helper Functions Estimating the Running Time Is string s lexicographically less than string t? static boolean less(string s, String t) { int n = Math.min(s.length(), t.length()); for (int i = 0; i < n; i++) { if (s.charat(i) < t.charat(i)) return true; else if (s.charat(i) > t.charat(i)) return false; return (s.length() < t.length()); Swap string references a[i] and a[j]. static void exch(string[] a, int i, int j) { String swap = a[i]; a[i] = a[j]; a[j] = swap; Total run time. Sum over all instructions: frequency! cost. Frequency. Determined by algorithm and input. Cost. Determined by compiler and machine. Easier alternative. (i) Analyze asymptotic growth. (ii) For medium N, run and measure time. (iii) For large N, use (i) and (ii) to predict time. Donald Knuth Asymptotic growth rates.! Estimate time as a function of input size N. N, N log N, N 2, N 3, 2 N, N!! Typically ignore lower order terms and leading coefficients. Ex. 6N 3 + 17N 2 + 56 is asymptotically proportional to N 3 7 8

Insertion Sort: Analysis Insertion Sort: Analysis Worst case.! Elements in reverse sorted order. i th iteration requires i - 1 compare and exchange operations total = 0 + 1 + 2 +... + N-2 + N-1 = N (N-1) / 2 Best case.! Elements in sorted order already. i th iteration requires only 1 compare operation total = 0 + 1 + 1 +... + 1 = N - 1 E F G H I J D C B A A B C D E F G H I J unsorted sorted active unsorted sorted active 10 11 Insertion Sort: Analysis Insertion Sort: Analysis Average case.! Elements are randomly ordered. i th iteration requires i / 2 comparison on average total = 0 + 1/2 + 2/2 +... + (N-1)/2 = N (N-1) / 4 Analysis Worst Comparisons N (N - 1) / 2 Asymptotic 0.5! N 2 Best N 1 N A D E F H J C B I G Average N (N 1) / 4 0.25! N 2 unsorted sorted active 12 13

Estimating the Running Time: Insertion Sort Insertion Sort: Number of Comparisons (ii) Collect empirical data for medium N.! Use System.currentTimeMillis to get number of milliseconds since January 1, 1970. long start = System.currentTimeMillis(); insertionsort(a); long stop = System.currentTimeMillis(); double elapsed = (stop - start) / 1000.0; N 5,000 10,000 20,000 comparisons 6.1 million 23 million 98 million running time 0.7 seconds 4 seconds 25 seconds (iii) Predict for large N.! Quadratic: 2x factor in input size " 4x factor in comparisons. Comparsions (millions) 100000 10000 Actual 1000 Predicted 100 10 1 1000 10000 100000 1000000 Input Size Lesson. Supercomputer can't rescue bad algorithm. 40,000 400 million 130 seconds 80,000 1.6 billion 570 seconds Data source: first N words of Charles Dicken's life work. Machine: Dell 2.2GHz PC with 1GB memory. computer home super comparisons per second 10 7 10 12 thousand Input Size million 1 day 1 second billion 3 centuries 2 weeks 14 15 Profiling What is my program doing? % java Xprof InsertionSort 50000 < dickens.txt Flat profile of 238.75 secs (11909 total ticks): main Compiled + native Method 90.9% 10824 + 0 InsertionSort.less 6.2% 733 + 0 java.lang.math.min 2.8% 337 + 0 InsertionSort.insertionsort 0.0% 0 + 3 InsertionSort.main 0.0% 2 + 0 java.io.bufferedinputstream.read 0.0% 1 + 0 java.lang.stringbuffer.length 0.0% 1 + 0 java.lang.stringbuffer.append 0.0% 1 + 0 java.lang.string.<init> 0.0% 1 + 0 StdIn.readString 99.9% 11900 + 3 Total compiled. C. A. R. Hoare, 1960 Q U I C K S O R T I S C O O L How to speed up the program?! Make less run substantially faster.! Call less substantially fewer times. 16 17

..! Sort each "half" recursively. partitioning element partitioning element Q U I C K S O R T I S C O O L Q U I C K S O R T I S C O O L I C K I C L Q U S O R T S O O CI C KI I CK L QO OU OS QO R ST S OT OU # L $ L partitioned array sort each piece 18 19 : Java Implementation : Implementing Partition.! Sort each "half" recursively. How do we partition in-place efficiently? static int partition(string[] a, int left, int right) { int i = left - 1; int j = right; while(true) { public void quicksort(string[] a, int left, int right) { if (right <= left) return; int i = partition(a, left, right); quicksort(a, left, i-1); quicksort(a, i+1, right); while (less(a[++i], a[right])) ; while (less(a[right], a[--j])) if (j == left) break; if (i >= j) break; exch(a, i, j); check if pointers cross swap find item on left to swap find item on right to swap exch(a, i, right); return i; swap with partitioning element return index where crossing occurs 20 21

: Analysis Sorting Summary Average case running time.! Roughly 2 N log e N comparisons. see Sedgewick book or COS 226 for proof! Assumption: partition on random element (instead of rightmost one). Worst case running proportional to N 2.! More likely that you are struck by lightning and meteor at same time. Numerical experiments roughly match predictions.! A bit better because of many equal keys. N comparisons running time Comparsions (thousands) 1000000 Insertion sort 100000 10000 1000 100 1000 10000 100000 1000000 Input Size 200,000 400,000 1 million 4.0 million 8.3 million 23 million 0.9 seconds 2.2 seconds 5.8 seconds 2 million 47 million 13 seconds 4 million 97 million 31 seconds Lesson. Good algorithm can be much more powerful than supercomputer. Computer home PC super Comparisons Per Second 10 7 10 12 Insertion 3 centuries 2 weeks 3 hours N = 1 billion 22 23 Design, Analysis, and Implementation of Algorithms Computational Complexity of Problems Algorithm.! Step-by-step recipe used to solve a problem.! Generally independent of programming language or machine on which it is to be executed. Design. Find a method to solve the problem. Analysis. Evaluate effectiveness and predict theoretical performance. Implementation. Write Java code to test your theory. Analysis Design Computational complexity. Framework to study efficiency of algorithms for solving a particular problem X. Upper bound. Cost guarantee provided by some algorithm for X. Lower bound. Proven limit on cost guarantee of any algorithm for X. Optimal algorithm. Algorithm with best cost guarantee for X. Example 1: X = sorting.! Measure costs in terms of comparisons.! Upper bound = N log 2 N with mergesort.! Lower bound = N log 2 N - N log 2 e.! Optimal algorithm = mergesort. lower bound ~ upper bound quicksort usually faster, but mergesort never slow applies to any comparison-based algorithm (see COS 226) Implement Theory simplifies and provides guidelines. 24 25

Computational Complexity of Problems Summary Computational complexity. Framework to study efficiency of algorithms for solving a particular problem X. Upper bound. Cost guarantee provided by some algorithm for X. Lower bound. Proven limit on cost guarantee of any algorithm for X. Optimal algorithm. Algorithm with best cost guarantee for X. How can I evaluate the performance of my algorithm?! Computational experiments.! Analysis of algorithms. What if it's not fast enough?! Understand why. analysis of algorithms! Buy a faster computer. performance improves incrementally Example 2: X = Euclidean TSP.! Measure cost in terms of arithmetic operations.! Upper bound = 2 N by dynamic programming. N! by brute force! Lower bound = N.! Optimal algorithm = ask again in 50 years.! Find a better algorithm in a textbook. performance can improve dramatically! Discover a new algorithm. not always easy or possible Sobering philosophical thoughts. see NP-completeness! In theory, most problems are undecidable.! In practice, most remaining problems are intractable.! Analysis of algorithms helps us improve the ones we use. Essence of computational complexity: closing the gap. 26 27