Week 5: Quicksort, Lower bound, Greedy


Agenda:
- Quicksort: average case
- Lower bound for sorting
- Greedy method

Week 5: Quicksort

Recall Quicksort. The ideas:
- Pick one key
- Compare it to the others: partition into sublists of smaller and greater keys
- Recursively sort the two sublists

Pseudocode:

procedure Quicksort(A, p, r)
    if p < r then
        q ← Partition(A, p, r)
        Quicksort(A, p, q − 1)
        Quicksort(A, q + 1, r)

procedure Partition(A, p, r)
    ** A[r] is the key picked to do the partition
    x ← A[r]
    i ← p − 1
    for j ← p to r − 1 do
        if A[j] ≤ x then
            i ← i + 1
            exchange A[i] ↔ A[j]
    exchange A[i + 1] ↔ A[r]
    return i + 1
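For concreteness, here is a minimal runnable sketch of the same algorithm in Python (my addition, not part of the original notes); it uses 0-based indexing, so p and r are inclusive 0-based bounds.

import sys
sys.setrecursionlimit(10000)   # quicksort recursion can be deep in the worst case

def partition(A, p, r):
    """Lomuto partition: A[r] is the pivot; returns its final index."""
    x = A[r]
    i = p - 1
    for j in range(p, r):              # j = p .. r-1
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1

def quicksort(A, p, r):
    """Sorts A[p..r] in place (0-based, inclusive bounds)."""
    if p < r:
        q = partition(A, p, r)
        quicksort(A, p, q - 1)
        quicksort(A, q + 1, r)

A = [3, 1, 7, 6, 4, 8, 2, 5]
quicksort(A, 0, len(A) - 1)
print(A)   # [1, 2, 3, 4, 5, 6, 7, 8]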

Week 5: Quicksort

Partition(A, p, r): The invariant:
- A[p..i] contains keys ≤ A[r]
- A[(i + 1)..(j − 1)] contains keys > A[r]

Ideas:
- A[j] is the current key under examination (j > i)
- If A[j] ≤ A[r], exchange A[j] ↔ A[i + 1] and increment i to maintain the invariant
- At the end, exchange A[r] ↔ A[i + 1] so that:
  A[p..i] contains keys ≤ A[i + 1], and
  A[(i + 2)..r] contains keys > A[i + 1]
- After A[p..i] and A[(i + 2)..r] have been sorted, A[p..r] is sorted.

An example: A[1..8] = {3, 1, 7, 6, 4, 8, 2, 5}, p = 1, r = 8, pivot x = A[8] = 5.
State at the start of each loop iteration, then after the final exchange:

3 1 7 6 4 8 2 5    i = 0, j = 1
3 1 7 6 4 8 2 5    i = 1, j = 2
3 1 7 6 4 8 2 5    i = 2, j = 3
3 1 7 6 4 8 2 5    i = 2, j = 4
3 1 7 6 4 8 2 5    i = 2, j = 5
3 1 4 6 7 8 2 5    i = 3, j = 6    (4 ≤ 5: exchanged A[3] ↔ A[5])
3 1 4 6 7 8 2 5    i = 3, j = 7
3 1 4 2 7 8 6 5    i = 4, loop ends    (2 ≤ 5: exchanged A[4] ↔ A[7])
3 1 4 2 5 8 6 7    final: exchanged A[5] ↔ A[8]; return 5
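The trace above can be reproduced by instrumenting the Python partition from the previous slide (again my sketch; i and j are printed 1-based to match the notes):

def partition_traced(A, p, r):
    """Lomuto partition (0-based indices) printing the state at the start
    of each loop iteration, with i and j shown 1-based."""
    x = A[r]
    i = p - 1
    for j in range(p, r):
        print(*A, f"   i = {i + 1}, j = {j + 1}")
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    print(*A, f"   i = {i + 1}, loop ends")
    A[i + 1], A[r] = A[r], A[i + 1]
    print(*A, f"   final; return {i + 2}")
    return i + 1

partition_traced([3, 1, 7, 6, 4, 8, 2, 5], 0, 7)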

Week 5: Quicksort

Quicksort correctness: It follows from the correctness of Partition.

Partition correctness: Loop invariant (LI): at the start of each for-loop iteration:
1. A[p..i] ≤ A[r], i.e., A[s] ≤ A[r] for all p ≤ s ≤ i
2. A[(i + 1)..(j − 1)] > A[r]
3. x = A[r]

Proof of LI: (pages 147-148)
1. Initialization
2. Maintenance
3. Termination

LI correctness implies Partition correctness.

Why we study Quicksort and its analysis:
- very efficient, in use
- divide-and-conquer, randomization
- huge literature
- a model for analysis of algorithms
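The three invariant clauses can be checked mechanically. A sketch (my addition) that restates clauses 1-3 as assertions inside the 0-based Python partition and stress-tests them on random inputs:

import random

def partition_checked(A, p, r):
    """Lomuto partition with the loop invariant asserted each iteration."""
    x = A[r]
    i = p - 1
    for j in range(p, r):
        # LI at the start of this iteration:
        assert all(A[s] <= x for s in range(p, i + 1))    # clause 1
        assert all(A[s] > x for s in range(i + 1, j))     # clause 2
        assert x == A[r]                                  # clause 3
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1

for _ in range(1000):
    A = [random.randrange(20) for _ in range(random.randrange(1, 12))]
    partition_checked(A, 0, len(A) - 1)   # raises AssertionError if the LI fails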

Week 5: Quicksort analysis

Quicksort recursion tree:
Observations:
- (Again) key comparison is the dominant operation
- To count KC we only need to know (at each call) the rank of the split key
An example: a recursion tree on the keys 01-15 with root 06
(tree figure omitted in this transcription; its nodes, in the order spilled into the text, were 06 04 09 02 05 07 12 01 03 08 11 13 10 14 15).
More observations:
- In the resulting recursion tree, at each node:
  (all keys in left subtree) ≤ (key in this node) < (all keys in right subtree)
- 1-1 correspondence: quicksort recursion tree ↔ binary search tree

Week 5: Quicksort analysis

Quicksort WC running time:
The split key is compared with every other key: (n − 1) KC.
Recurrence:
    T(n) = T(n₁) + T(n − 1 − n₁) + (n − 1),   where 0 ≤ n₁ ≤ n − 1
Base case: T(0) = 0, T(1) = 0.
Notice that when both subarrays are non-empty, we will be having
    (n₁ − 1) + (n − 1 − n₁ − 1) = n − 3 KC at the next level...
Worst case: one of the subarrays is empty!!! That needs n − 2 KC at the next level.
WC recurrence:
    T(n) = T(0) + T(n − 1) + (n − 1) = T(n − 1) + (n − 1)
Solving the recurrence (the Master Theorem does NOT apply):
    T(n) = T(n − 1) + (n − 1)
         = T(n − 2) + (n − 2) + (n − 1)
         = ...
         = T(1) + 1 + 2 + ... + (n − 1)
         = n(n − 1)/2
So T(n) ∈ Θ(n²).
Therefore, quicksort is bad in terms of WC running time!
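A quick empirical check (my sketch): counting key comparisons when last-element-pivot quicksort runs on an already-sorted array reproduces n(n − 1)/2 exactly, since every split leaves one side empty.

import sys
sys.setrecursionlimit(10000)

def quicksort_count(A, p, r):
    """Returns the number of key comparisons made while sorting A[p..r]."""
    if p >= r:
        return 0
    x, i = A[r], p - 1
    for j in range(p, r):              # r - p comparisons in this call
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    q = i + 1
    return (r - p) + quicksort_count(A, p, q - 1) + quicksort_count(A, q + 1, r)

for n in (10, 100, 1000):
    A = list(range(n))                 # sorted input: the worst case here
    print(n, quicksort_count(A, 0, n - 1), n * (n - 1) // 2)
# each row prints the same count twice: n(n-1)/2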

Week 5: Quicksort analysis

Quicksort BC running time:
Notice that when both subarrays are non-empty, we will be saving 1 KC...
Best case: each partition is a bipartition!!!
Saving as many KC as possible at every level...
The recursion tree is as short as possible...
Recurrence:
    T(n) = 2 T((n − 1)/2) + (n − 1)
Solving the recurrence: apply the Master Theorem? Not exactly, but
    T(n) ∈ Θ(n log n)
Question: What is the best-case array for n = 7? (See the brute-force search below.)
Conclusion: In order to save time, the pivot A[n] had better BI-partition the array...
Usually it will not bipartition exactly... we will push it toward that by a technique called randomization (future lectures).
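To answer the question above, a brute-force search (my sketch, reusing the comparison counter from the worst-case check) over all 7! = 5040 permutations of {1, ..., 7}; the minimum matches the recurrence value T(7) = 6 + 2(2 + 2·0) = 10.

from itertools import permutations

def kc(A, p, r):
    """Key comparisons used by last-element-pivot quicksort on A[p..r]."""
    if p >= r:
        return 0
    x, i = A[r], p - 1
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    q = i + 1
    return (r - p) + kc(A, p, q - 1) + kc(A, q + 1, r)

best = min(permutations(range(1, 8)), key=lambda P: kc(list(P), 0, 6))
print(best, kc(list(best), 0, 6))   # one best-case array and its KC count, 10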

Week 5: Quicksort

Quicksort BC running time (cont'd):
In the recursion tree, what is the number of KC at each level?
Answer:
- n − 1 at the top level
- at most 2 nodes at the 2nd level, at least (n₁ − 1) + (n − 1 − n₁ − 1) = n − 3 KC
- at most 4 nodes at the 3rd level, at least n − 7 KC
- ...
- at the kth level, at most 2^(k−1) nodes, at least n − 2^k + 1 KC
How many levels are there?
Answer: at least lg n levels (it is a binary tree).
So we need at least
    Σ_{i=1}^{lg n − 1} (n − 2^i + 1) KC,
and
    Σ_{i=1}^{lg n − 1} (n − 2^i + 1) = (n + 1)(lg n − 1) − (n − 2) ∈ Θ(n log n).
Try n = 2^k − 1 to get the closed form for the following recurrence:
    T(n) = 0,                                     if n = 1
    T(n) = (n − 1) + T((n − 1)/2) + T((n − 1)/2), if n ≥ 2
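For n = 2^k − 1, one can check (a derivation not in the notes) that the recurrence solves to T(n) = (n + 1) lg(n + 1) − 2n, which is easy to confirm numerically:

def T(n):
    """Best-case recurrence: T(n) = (n - 1) + 2 T((n - 1) / 2), T(1) = 0."""
    return 0 if n <= 1 else (n - 1) + 2 * T((n - 1) // 2)

for k in range(1, 11):
    n = 2**k - 1
    closed = (n + 1) * k - 2 * n   # (n + 1) lg(n + 1) - 2n, since lg(n + 1) = k
    print(n, T(n), closed)         # the two counts agree for every k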

Week 5: Quicksort

Quicksort AC running time:
Recall the Quicksort algorithm. Pseudocode: **p. 146

procedure Quicksort(A, p, r)
    if p < r then
        q ← Partition(A, p, r)
        Quicksort(A, p, q − 1)
        Quicksort(A, q + 1, r)

The recurrence for the running time is:
    T(n) = 0,                                 when n = 0 or 1
    T(n) = T(n₁) + T(n − 1 − n₁) + (n − 1),   when n ≥ 2

Average case: what is the probability for the left subarray to have size n₁?
Average case: always ask, average over what input distribution?
Unless stated otherwise, assume each possible input is equiprobable: the uniform distribution.
Here, each of the possible inputs is equiprobable.
Key observation: equiprobable inputs imply that for each key, its rank among the keys seen so far is equiprobable.
So n₁ can be 0, 1, 2, ..., n − 2, n − 1, each with the same probability 1/n.

Week 5: Quicksort

Solving T(n):

    T(n) = (1/n)(T(0) + T(n − 1)) + (1/n)(T(1) + T(n − 2)) + ...
           + (1/n)(T(n − 2) + T(1)) + (1/n)(T(n − 1) + T(0)) + (n − 1)
         = (2/n) Σ_{i=0}^{n−1} T(i) + (n − 1)

Therefore,

    n T(n) = 2 Σ_{i=0}^{n−1} T(i) + n(n − 1)
    (n − 1) T(n − 1) = 2 Σ_{i=0}^{n−2} T(i) + (n − 1)(n − 2)

Subtract the two equations:

    n T(n) − (n − 1) T(n − 1) = 2 T(n − 1) + 2(n − 1)

Rearrange it:

    n T(n) = (n + 1) T(n − 1) + 2(n − 1)

Week 5: Quicksort

Solving T(n) (cont'd): Or we can say, dividing both sides by n(n + 1):

    T(n)/(n + 1) = T(n − 1)/n + 2(n − 1)/(n(n + 1))
                 = T(n − 1)/n + 2/(n + 1) − 2(1/n − 1/(n + 1))
                 = T(n − 1)/n + 4/(n + 1) − 2/n

which gives you (iterated substitution)

    T(n)/(n + 1) = Σ_{i=1}^{n} (4/(i + 1) − 2/i)

Recall that Σ_{i=1}^{n} 1/i = H_n ≈ ln n + γ, the Harmonic number, where γ ≈ 0.577.
So, from the above we have

    T(n) = 2(n + 1) H_{n+1} − (4n + 2)
         ≈ 2(n + 1)(ln(n + 1) + γ) − (4n + 2)
         ∈ Θ(n log n)

Conclusion: Quicksort AC running time is in Θ(n log n).
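A numeric sanity check of the closed form (my sketch): evaluate the average-case recurrence directly with exact rational arithmetic and compare with 2(n + 1)H_{n+1} − (4n + 2).

from fractions import Fraction

def T_avg(n, memo={0: Fraction(0), 1: Fraction(0)}):
    """Average-case recurrence: T(n) = (2/n) * sum(T(i) for i < n) + (n - 1)."""
    if n not in memo:
        memo[n] = Fraction(2, n) * sum(T_avg(i) for i in range(n)) + (n - 1)
    return memo[n]

def closed(n):
    H = sum(Fraction(1, i) for i in range(1, n + 2))   # H_{n+1}
    return 2 * (n + 1) * H - (4 * n + 2)

for n in (2, 5, 10, 25):
    assert T_avg(n) == closed(n)       # exact agreement, not just approximate
    print(n, float(T_avg(n)))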

Week 5: Quicksort

Quicksort improvement and space requirement:
Quicksort is considered an in-place sorting algorithm: the extra space required at each recursive call is only constant,
whereas in Mergesort, up to Θ(n) extra space is required at each recursive call.
To improve the algorithm, it would be best to pick the median as the pivot (but this is difficult).
Other solution: use a random element as the pivot at every iteration!
Pick i, 1 ≤ i ≤ n, uniformly at random and exchange A[i] ↔ A[n] before calling the Partition procedure (see the sketch below).
This way, no single input is always bad.

Sorting algorithms so far: running time comparison

Alg.            BC            WC            AC
InsertionSort   Θ(n)          Θ(n²)         Θ(n²)
MergeSort       Θ(n log n)    Θ(n log n)    ?
HeapSort        Θ(n log n)    Θ(n log n)    ?
QuickSort       Θ(n log n)    Θ(n²)         Θ(n log n)
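A minimal Python sketch of the randomized variant (my addition; 0-based indices, so the random pivot is swapped into slot r rather than n):

import random

def randomized_partition(A, p, r):
    """Swap a uniformly random element into the pivot slot, then partition."""
    i = random.randint(p, r)
    A[i], A[r] = A[r], A[i]
    x, i = A[r], p - 1
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1

def randomized_quicksort(A, p, r):
    """Expected Θ(n log n) comparisons on every input, not just on average inputs."""
    if p < r:
        q = randomized_partition(A, p, r)
        randomized_quicksort(A, p, q - 1)
        randomized_quicksort(A, q + 1, r)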

Week 5: Lower Bounds for Comparison-Based Sorting

Two useful trees in algorithm analysis:

Recursion tree:
- node = recursive call
- describes the algorithm's execution for one particular input, by showing all calls made
- one algorithm execution = all nodes (a tree)
- useful in analysis: sum the numbers of operations over all nodes

Week 5: Lower Bounds for Comparison-Based Sorting

Recursion tree example: Mergesort pseudocode

Merge(A; lo, mid, hi)
    **pre-condition: lo ≤ mid ≤ hi
    **pre-condition: A[lo..mid] and A[(mid + 1)..hi] sorted
    **post-condition: A[lo..hi] sorted

MergeSort(A; lo, hi)
    if lo < hi then
        mid ← ⌊(lo + hi)/2⌋
        MergeSort(A; lo, mid)
        MergeSort(A; mid + 1, hi)
        Merge(A; lo, mid, hi)

(The recursion tree figure is omitted in this transcription; its nodes were labeled [1] through [6].)
For different input instances, the number of operations at each node could be different.
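A runnable Python sketch of the pseudocode above (my translation; each call to merge_sort is one node of the recursion tree):

def merge(A, lo, mid, hi):
    """Merges the sorted runs A[lo..mid] and A[mid+1..hi]."""
    left, right = A[lo:mid + 1], A[mid + 1:hi + 1]
    i = j = 0
    for k in range(lo, hi + 1):
        if j >= len(right) or (i < len(left) and left[i] <= right[j]):
            A[k] = left[i]; i += 1
        else:
            A[k] = right[j]; j += 1

def merge_sort(A, lo, hi):
    """Sorts A[lo..hi] in place (0-based, inclusive bounds)."""
    if lo < hi:
        mid = (lo + hi) // 2
        merge_sort(A, lo, mid)
        merge_sort(A, mid + 1, hi)
        merge(A, lo, mid, hi)

A = [5, 2, 4, 7, 1, 3, 2, 6]
merge_sort(A, 0, len(A) - 1)
print(A)   # [1, 2, 2, 3, 4, 5, 6, 7]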

Week 5: Lower Bounds for Comparison-Based Sorting

Two useful trees in algorithm analysis:

Recursion tree:
- node = recursive call
- describes the algorithm's execution for one particular input, by showing all calls made
- one algorithm execution = all nodes (a tree)
- useful in analysis: sum the numbers of operations over all nodes

Decision tree:
- node = algorithm decision
- describes the algorithm's execution for all possible inputs, by showing all possible algorithm decisions
- one algorithm execution = one root-to-leaf path
- useful in analysis: sum the numbers of operations over the nodes on one path

Week 5: Lower Bounds for Comparison-Based Sorting

Selectionsort decision tree:
Assume the input keys are in array A[1..3] = {a, b, c}.
Tree node: a 2-way key comparison "A[k] > A[j]?"

SelectionSort(A; n)
    if n > 1 then
        for j ← n downto 2 do
            psn ← j
            for k ← j − 1 downto 1 do
                if A[k] > A[psn] then
                    psn ← k
            exchange A[j] ↔ A[psn]
    return

(The decision tree figure is omitted in this transcription: its root is the comparison "A[2] > A[3]?" with No/Yes branches, and it has six leaves, one per input order: abc, bac, bca, acb, cab, cba.)
In every case, whatever the input instance is: 3 KC!!!
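A quick Python check (my sketch) that selection sort on 3 keys makes exactly 3 key comparisons on every one of the 6 input orders, i.e., every root-to-leaf path in its decision tree has the same length:

from itertools import permutations

def selection_sort_kc(A):
    """Selection sort (as in the pseudocode above); returns the KC count."""
    kc = 0
    for j in range(len(A) - 1, 0, -1):   # j = n-1 .. 1 (0-based)
        psn = j
        for k in range(j - 1, -1, -1):
            kc += 1
            if A[k] > A[psn]:
                psn = k
        A[j], A[psn] = A[psn], A[j]
    return kc

print({selection_sort_kc(list(P)) for P in permutations("abc")})   # {3}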

Week 5: Lower Bounds for Comparison-Based Sorting

Sorting lower bound:
Comparison-based sort: keys can be (2-way) compared only!
This lower bound argument considers only comparison-based sorting algorithms,
for example Insertionsort, Mergesort, Heapsort, Quicksort.

Binary tree facts: suppose there are t leaves and k levels. Then
    t ≤ 2^(k−1)
So lg t ≤ k − 1, or equivalently k ≥ 1 + lg t:
a binary tree with t leaves has at least 1 + lg t levels.

Comparison-based sorting algorithm facts: look at its decision tree. We have:
- It is a binary tree.
- It must contain a leaf for every possible permutation of the positions {1, 2, ..., n}.
- So it contains at least n! leaves...
- Equivalently, it has at least 1 + lg(n!) levels,
  i.e., a longest root-to-leaf path of length at least lg(n!).
- Hence the worst-case number of KC is at least lg(n!), and
    lg(n!) ∈ Θ(n log n).

Therefore, Mergesort and Heapsort are asymptotically optimal (comparison-based) sorting algorithms.
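To see lg(n!) ∈ Θ(n log n) numerically (a sketch; by Stirling's approximation, lg(n!) ≈ n lg n − n lg e):

import math

for n in (10, 100, 1000, 10**6):
    lg_fact = math.lgamma(n + 1) / math.log(2)   # lg(n!) without overflow
    print(n, round(lg_fact), round(n * math.log2(n)))
# lg(n!) grows like n lg n: the ratio of the two columns tends to 1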

Week 5: Algorithm Design Techniques

Three major design techniques are:
1. Greedy method
2. Divide and Conquer
3. Dynamic Programming

1. Greedy Method:
This is usually good for optimization problems. In an optimization problem we are looking for a solution that maximizes or minimizes an objective function, for example maximizing the profit or minimizing the cost.
Usually, coming up with a greedy solution is easy; but it is often more difficult to prove that the algorithm is correct.
General scheme: always make the choice that looks best at the current step; these choices are final and are not changed later.
The hope is that with every step taken, we get closer and closer to an optimal solution.

Week 5: Greedy method

Example 1: Fractional Knapsack
Suppose we have a set S of n items, each with a profit/value b_i and weight w_i. We also have a knapsack of capacity W.
Assume that each item can be picked at any fraction, that is, we can pick any amount 0 ≤ x_i ≤ w_i of item i.
Our goal is to fill the knapsack (without exceeding its capacity) with a combination of the items of maximum profit. Formally:

    find 0 ≤ x_i ≤ w_i for 1 ≤ i ≤ n such that Σ_{i=1}^{n} x_i ≤ W
    and Σ_{i=1}^{n} (x_i / w_i) · b_i is maximized.

Greedy idea: start picking the items with more "value" per unit of weight, b_i / w_i. So let v_i = b_i / w_i. The algorithm will be as follows:

Week 5: Greedy method

The pseudocode is:

procedure Frac-Knapsack(S, W)
    for i ← 1 to n do
        x_i ← 0
        v_i ← b_i / w_i
    CurrentW ← 0
    while CurrentW < W do
        let i be the next item in S with the highest value v_i
        x_i ← min{w_i, W − CurrentW}
        add x_i amount of item i to the knapsack
        CurrentW ← CurrentW + x_i

How do we find the next highest value at each step? One way is to sort S at the beginning based on the v_i's in nonincreasing order. Another way is to keep a PQ (max-heap) based on the values. Since we check at most n items, the total time is O(n log n). (A runnable sketch follows below.)

Correctness of the algorithm: We prove, for all i ≥ 0, that if we have picked x_1, ..., x_i from items 1, ..., i in the first i iterations (respectively), then this partial solution can be extended to an optimal solution. In other words, there is some optimal solution, call it OPT, which has x_j amount of item j for 1 ≤ j ≤ i.
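Before the proof, here is a runnable Python sketch of the pseudocode above (my translation; it uses the sort-once variant rather than a max-heap):

def frac_knapsack(items, W):
    """items: list of (profit b_i, weight w_i) pairs.
    Returns (amounts x_i, total profit).
    Greedy: take as much as possible of the highest b_i/w_i item first."""
    order = sorted(range(len(items)),
                   key=lambda i: items[i][0] / items[i][1], reverse=True)
    x = [0.0] * len(items)
    current_w = profit = 0.0
    for i in order:
        if current_w >= W:
            break
        b, w = items[i]
        x[i] = min(w, W - current_w)       # take whatever still fits
        current_w += x[i]
        profit += (x[i] / w) * b
    return x, profit

print(frac_knapsack([(60, 10), (100, 20), (120, 30)], 50))
# all of items 1 and 2, two thirds of item 3: profit 240.0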

Week 5: Greedy method

Fractional Knapsack (cont'd)
We prove by induction on i.
Base case: i = 0. We have an empty solution, which clearly can be extended to an optimal one.
Induction step: Assume the statement is true for all values < i, with i ≥ 1; we prove it for iteration i. That is, we have picked x_1, ..., x_{i−1} of items 1, ..., (i − 1), and so does OPT. Say OPT picks x'_i of item i. If x'_i = x_i we are done. So assume OPT picks x'_i ≠ x_i.
Note that the algorithm picks the maximum amount we can from item i (either x_i = w_i or the knapsack becomes full). Thus x'_i cannot be more than x_i, i.e., x'_i < x_i.
Since W ≥ Σ_{k=1}^{i} x_k > Σ_{k=1}^{i−1} x_k + x'_i, OPT must contain amounts of other items to make up the deficiency of x_i − x'_i, say amounts x_{j_1}, x_{j_2}, ..., x_{j_l} of items j_1, ..., j_l with x_{j_1} + ... + x_{j_l} ≥ x_i − x'_i.
Since the items are ranked based on value, v_{j_k} ≤ v_i for all 1 ≤ k ≤ l. So if we replace a total of x_i − x'_i amount from items j_1, ..., j_l with x_i − x'_i amount of item i in OPT, the total value does not decrease, and we still have an optimal solution.
Now OPT has x_i amount of item i and so extends our greedy solution. This completes the induction.