Lecture 4. Quicksort


Lecture 4. Quicksort
T. H. Cormen, C. E. Leiserson and R. L. Rivest, Introduction to Algorithms, 3rd Edition, MIT Press, 2009
Sungkyunkwan University
Hyunseung Choo, choo@skku.edu
Copyright 2000-2018 Networking Laboratory

Introduction for Quicksort
Worst-case running time: Θ(n²)
Expected running time: Θ(n lg n)
Constants hidden in Θ(n lg n) are small
Another divide-and-conquer algorithm:
  The array A[p..r] is partitioned into two non-empty subarrays A[p..q] and A[q+1..r]
  Invariant: every element in A[p..q] is less than or equal to every element in A[q+1..r]
  The subarrays are recursively sorted by calls to QUICKSORT
  Unlike merge sort, there is no combining step: the two sorted subarrays already form a sorted array

Quicksort
To sort the subarray A[p..r]:
Divide: Partition A[p..r] into two (possibly empty) subarrays A[p..q-1] and A[q+1..r], such that each element in the first subarray A[p..q-1] is ≤ A[q] and A[q] is ≤ each element in the second subarray A[q+1..r]
Conquer: Sort the two subarrays by recursive calls to QUICKSORT
Combine: No work is needed to combine the subarrays, because they are sorted in place
Perform the divide step by a procedure PARTITION, which returns the index q that marks the position separating the subarrays

Quicksort Code
(Pseudocode figure not captured in the transcription.)

Partition
Clearly, all the action takes place in the partition() function
  Rearranges the subarray in place
  End result: two subarrays, with all values in the first subarray ≤ all values in the second one
  Returns the index of the pivot element separating the two subarrays
How do you suppose we implement this?

Partition
PARTITION always selects the last element A[r] in the subarray A[p..r] as the pivot, the element around which to partition
As the procedure executes, the array is partitioned into four regions, some of which may be empty

Partition
Loop invariant:
  1. All entries in A[p..i] are ≤ pivot
  2. All entries in A[i+1..j-1] are > pivot
  3. A[r] = pivot
It's not needed as part of the loop invariant, but the fourth region is A[j..r-1], whose entries have not yet been examined, and so we don't know how they compare to the pivot.


Partition
If A[j] > pivot: only increment j
If A[j] ≤ pivot: i is incremented, A[j] and A[i] are swapped, and then j is incremented
(Figure: the array A[p..r] before and after each case, showing the region ≤ x, the region > x, the unexamined entries, and the pivot x, with the indices p, i, j, r.)

Correctness of Partition
Initialization: Before the loop starts, all the conditions of the loop invariant are satisfied, because A[r] is the pivot and the subarrays A[p..i] and A[i+1..j-1] are empty
Maintenance: While the loop is running, if A[j] ≤ pivot, then A[j] and A[i+1] are swapped and both i and j are incremented
  If A[j] > pivot, then only j is incremented

Correctness of Partition
Termination: When the loop terminates, j = r, so every element of A[p..r] falls into one of three cases: A[p..i] ≤ pivot, A[i+1..r-1] > pivot, and A[r] = pivot
The last two lines of PARTITION move the pivot element from the end of the array to between the two subarrays, by swapping the pivot (A[r]) with the first element of the second subarray (A[i+1])
Time for partitioning: Θ(n) to partition an n-element subarray
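The loop invariant and the final swap described above can be sketched in Python. This is a minimal illustration, not the slide's own listing: it assumes 0-based inclusive indices, and the names partition and quicksort are chosen here for the example.

```python
def partition(a, p, r):
    """Partition a[p..r] (inclusive) around the pivot a[r]; return the pivot's final index."""
    pivot = a[r]
    i = p - 1                          # a[p..i] holds the entries <= pivot
    for j in range(p, r):              # a[i+1..j-1] holds the entries > pivot
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
        # otherwise only j advances
    a[i + 1], a[r] = a[r], a[i + 1]    # move the pivot between the two regions
    return i + 1


def quicksort(a, p=0, r=None):
    """Sort a[p..r] in place; the defaults sort the whole list (an assumption of this sketch)."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)
        quicksort(a, p, q - 1)
        quicksort(a, q + 1, r)


data = [2, 8, 7, 1, 3, 5, 6, 4]
quicksort(data)
print(data)
```

The pivot never takes part in a recursive call, which is why the two calls use q - 1 and q + 1.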

Practice Problems
The operation of PARTITION is performed on an array A[1..12] = <13, 19, 9, 5, 12, 8, 7, 4, 21, 2, 6, 11>. The array is then divided into A[1..q] and A[q+1..12] such that A[i] ≤ A[j] for all 1 ≤ i ≤ q and q+1 ≤ j ≤ 12. What are q and A[q]?

Quicksort Algorithm
Video content: an illustration of quicksort.

Performance of Quicksort
The running time of Quicksort depends on the partitioning of the subarrays:
  If the subarrays are balanced, then quicksort can run as fast as mergesort
  If they are unbalanced, then quicksort can run as slowly as insertion sort
Worst case
  Occurs when the subarrays are completely unbalanced
  One subarray has 0 elements and the other has n-1 elements

Performance of Quicksort
Worst case
  Get the recurrence: T(n) = T(n-1) + T(0) + Θ(n) = T(n-1) + Θ(n) = Θ(n²)
  Same running time as insertion sort
  In fact, the worst-case running time occurs when quicksort takes a sorted array as input, but insertion sort runs in O(n) time in this case
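The worst-case recurrence can be checked empirically. The sketch below assumes the last-element-pivot partition described earlier and counts comparisons against the pivot; the helper name count_comparisons and the explicit stack (used only to avoid Python's recursion limit on unbalanced inputs) are choices made for this illustration.

```python
def count_comparisons(values):
    """Quicksort a copy of `values` with the last element as pivot; return the number of pivot comparisons."""
    a = list(values)
    comparisons = 0
    stack = [(0, len(a) - 1)]          # pending subarrays a[p..r], inclusive
    while stack:
        p, r = stack.pop()
        if p >= r:
            continue
        pivot, i = a[r], p - 1
        for j in range(p, r):
            comparisons += 1           # one comparison of a[j] with the pivot
            if a[j] <= pivot:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[i + 1], a[r] = a[r], a[i + 1]
        stack.append((p, i))           # left part, pivot excluded
        stack.append((i + 2, r))       # right part, pivot excluded
    return comparisons


n = 200
print(count_comparisons(range(n)), n * (n - 1) // 2)
```

On already-sorted input every split leaves n-1 elements on one side and none on the other, so the count is (n-1) + (n-2) + ... + 1 = n(n-1)/2, matching the Θ(n²) solution of the recurrence.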

Performance of Quicksort
Best case
  Occurs when the subarrays are completely balanced every time
  Each subarray has n/2 elements
  Get the recurrence: T(n) = 2T(n/2) + Θ(n) = Θ(n lg n)

Performance of Quicksort
Balanced partitioning
  Quicksort's average running time is much closer to the best case than to the worst case
  Imagine that PARTITION always produces a 9-to-1 split
  Get the recurrence: T(n) ≤ T(9n/10) + T(n/10) + Θ(n) = O(n lg n)
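A short recursion-tree sketch of why the 9-to-1 recurrence still gives O(n lg n). The constant c is an assumption introduced here to stand for the hidden constant in the Θ(n) partitioning cost.

```latex
% Recursion-tree sketch for T(n) \le T(9n/10) + T(n/10) + cn,
% where c is an assumed constant bounding the \Theta(n) partitioning cost.
\begin{align*}
\text{cost of each level} &\le cn, \\
\text{depth of the deepest leaf} &= \log_{10/9} n = \frac{\lg n}{\lg(10/9)} = \Theta(\lg n), \\
T(n) &\le cn \cdot \bigl(\log_{10/9} n + 1\bigr) = O(n \lg n).
\end{align*}
```

The same argument works for any split of constant proportionality; only the base of the logarithm, and therefore the hidden constant, changes.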


Performance of Quicksort
Intuition for the average case
  Splits in the recursion tree will not always be constant
  There will usually be a mix of good and bad splits throughout the recursion tree
  To see that this doesn't affect the asymptotic running time of Quicksort, assume that levels alternate between best-case and worst-case splits

Performance of Quicksort
Intuition for the average case
  The extra level in the left-hand figure only adds to the constant hidden in the Θ-notation
  There are still the same number of subarrays to sort, and only twice as much work was done to get to that point
  Both figures (Fig. 7.5 a and b) result in O(n lg n) time, though the constant for the figure on the left is higher than that of the figure on the right


Practice Problems
What is the running time of QUICKSORT when all elements of array A have the same value?

Quicksort
Sort an array A[p..r]
Divide
  Partition the array A into 2 subarrays A[p..q] and A[q+1..r], such that each element of A[p..q] is smaller than or equal to each element in A[q+1..r]
  The index (pivot) q is computed
Conquer
  Recursively sort A[p..q] and A[q+1..r] using Quicksort
Combine
  Trivial: the arrays are sorted in place, so no work is needed to combine them; the entire array is now sorted

Quicksort
QUICKSORT(A, p, r)
  if p < r
    then q ← PARTITION(A, p, r)
         QUICKSORT(A, p, q)
         QUICKSORT(A, q+1, r)


Partitioning the Array
Idea
  Select a pivot element x around which to partition
  Grow two regions, one from each end: A[p..i] ≤ x and x ≤ A[j..r]

Example (pivot x = A[p] = 5)
  A[p..r] = 5 3 2 6 4 1 3 7   i stops at 5, j stops at the right-hand 3: exchange
            3 3 2 6 4 1 5 7   i stops at 6, j stops at 1: exchange
            3 3 2 1 4 6 5 7   i stops at 6, j stops at 4: i > j, so return j = q
  A[p..q] = 3 3 2 1 4 and A[q+1..r] = 6 5 7

Partitioning the Array
PARTITION(A, p, r)
  x ← A[p]
  i ← p - 1
  j ← r + 1
  while TRUE
    do repeat j ← j - 1
         until A[j] ≤ x
       repeat i ← i + 1
         until A[i] ≥ x
       if i < j
         then exchange A[i] ↔ A[j]
         else return j
Running time: Θ(n), where n = r - p + 1
(Figure: A = 5 3 2 6 4 1 3 7 with i starting at p and j starting at r; on return, j = q separates A[p..q] from A[q+1..r].)

Partitioning the Array
(Figure: A = 5 3 2 6 4 1 3 7 between indices p and r, with i scanning from the left and j from the right; when the scans cross, j = q and the array is split into A[p..q] and A[q+1..r].)
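The two-pointer PARTITION above can be sketched in Python as follows. This is an illustrative translation that assumes 0-based inclusive indices; the names hoare_partition and quicksort_hoare are chosen for this example and do not appear in the slides.

```python
def hoare_partition(a, p, r):
    """Partition a[p..r] (inclusive) around x = a[p]; return j such that a[p..j] <= x <= a[j+1..r]."""
    x = a[p]
    i, j = p - 1, r + 1
    while True:
        j -= 1                     # repeat j <- j - 1 until a[j] <= x
        while a[j] > x:
            j -= 1
        i += 1                     # repeat i <- i + 1 until a[i] >= x
        while a[i] < x:
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j


def quicksort_hoare(a, p=0, r=None):
    """Sort a[p..r] in place; note the recursive calls use (p, q) and (q + 1, r)."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = hoare_partition(a, p, r)
        quicksort_hoare(a, p, q)
        quicksort_hoare(a, q + 1, r)


a = [5, 3, 2, 6, 4, 1, 3, 7]
q = hoare_partition(a, 0, len(a) - 1)
print(q, a)   # 4 [3, 3, 2, 1, 4, 6, 5, 7]
```

With 0-based indices the returned q = 4 corresponds to q = 5 in the 1-based slide notation. Unlike the last-element-pivot version, the pivot is not necessarily at index q afterwards, which is why the left recursive call includes q.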

Performance of Quicksort
Average case
  All permutations of the input numbers are equally likely
  On a random input array, we will have a mix of well-balanced and unbalanced splits
  Good and bad splits are randomly distributed throughout the tree
Alternation of a good and a bad split: n splits into 1 and n - 1, and then n - 1 splits into (n - 1)/2 and (n - 1)/2; combined cost: 2n - 1 = Θ(n)
Nearly well-balanced split: n splits into (n - 1)/2 + 1 and (n - 1)/2; combined cost: n = Θ(n)
The running time of Quicksort when levels alternate between good and bad splits is O(n lg n)

Randomizing Quicksort
Randomly permute the elements of the input array before sorting
Modify the PARTITION procedure
  At each step of the algorithm we exchange element A[p] with an element chosen at random from A[p..r]
  The pivot element x = A[p] is equally likely to be any one of the r - p + 1 elements of the subarray

Randomized Algorithms
The behavior is determined in part by values produced by a random-number generator
  RANDOM(a, b) returns an integer r, where a ≤ r ≤ b and each of the b - a + 1 possible values of r is equally likely
The algorithm generates its own randomness
  No input can elicit worst-case behavior
  The worst case occurs only if we get unlucky numbers from the random-number generator

Randomized PARTITION
RANDOMIZED-PARTITION(A, p, r)
  i ← RANDOM(p, r)
  exchange A[p] ↔ A[i]
  return PARTITION(A, p, r)

Randomized Quicksort
RANDOMIZED-QUICKSORT(A, p, r)
  if p < r
    then q ← RANDOMIZED-PARTITION(A, p, r)
         RANDOMIZED-QUICKSORT(A, p, q)
         RANDOMIZED-QUICKSORT(A, q + 1, r)
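A runnable sketch of the randomized procedures above. It assumes 0-based inclusive indices and uses Python's random.randint, which returns an integer in the closed range [p, r], in the role of RANDOM(p, r). The rng parameter is an addition made here so that runs can be seeded; it is not part of the slides' pseudocode.

```python
import random


def partition(a, p, r):
    """Two-pointer partition of a[p..r] around x = a[p]; returns the split index j."""
    x = a[p]
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while a[j] > x:
            j -= 1
        i += 1
        while a[i] < x:
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j


def randomized_partition(a, p, r, rng=random):
    i = rng.randint(p, r)          # uniform over p..r, both ends included
    a[p], a[i] = a[i], a[p]        # the random element becomes the pivot a[p]
    return partition(a, p, r)


def randomized_quicksort(a, p=0, r=None, rng=random):
    """Sort a[p..r] in place; the defaults sort the whole list."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = randomized_partition(a, p, r, rng)
        randomized_quicksort(a, p, q, rng)
        randomized_quicksort(a, q + 1, r, rng)


values = list(range(20, 0, -1))    # reverse-sorted input
randomized_quicksort(values, rng=random.Random(0))
print(values)
```

Because the pivot is drawn at random, a sorted or reverse-sorted input is no more likely to trigger the quadratic case than any other input.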

Worst-Case Analysis of Quicksort
T(n) = worst-case running time
$$T(n) = \max_{1 \le q \le n-1} \bigl(T(q) + T(n-q)\bigr) + \Theta(n)$$
Use the substitution method to show that the running time of Quicksort is $O(n^2)$
  Guess: $T(n) = O(n^2)$
  Induction goal: $T(n) \le cn^2$
  Induction hypothesis: $T(k) \le ck^2$ for any $k < n$

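Worst-Case Analysis of Quicksort
Proof of induction goal:
$$T(n) \le \max_{1 \le q \le n-1} \bigl(cq^2 + c(n-q)^2\bigr) + \Theta(n) = c \cdot \max_{1 \le q \le n-1} \bigl(q^2 + (n-q)^2\bigr) + \Theta(n)$$
The expression $q^2 + (n-q)^2$ achieves its maximum over the range $1 \le q \le n-1$ at one of the endpoints:
$$\max_{1 \le q \le n-1} \bigl(q^2 + (n-q)^2\bigr) \le 1^2 + (n-1)^2 = n^2 - 2(n-1)$$
$$T(n) \le cn^2 - 2c(n-1) + \Theta(n) \le cn^2$$
The last inequality holds once c is chosen large enough that $2c(n-1)$ dominates the $\Theta(n)$ term.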

Random Variables and Expectation
Consider the running time T(n) as a random variable
  This variable associates a real number with each possible outcome (split) of partitioning
The expected value (expectation, mean) of a discrete random variable X is:
$$E[X] = \sum_{x} x \cdot \Pr\{X = x\}$$
  This is the average over all possible values of the random variable X

Indicator Random Variables
Given a sample space S and an event A, we define the indicator random variable I{A} associated with A:
$$I\{A\} = \begin{cases} 1 & \text{if } A \text{ occurs} \\ 0 & \text{if } A \text{ does not occur} \end{cases}$$
The expected value of an indicator random variable $X_A$ is $E[X_A] = \Pr\{A\}$
Proof:
$$E[X_A] = E[I\{A\}] = 1 \cdot \Pr\{A\} + 0 \cdot \Pr\{\bar{A}\} = \Pr\{A\}$$

Number of Comparisons in PARTITION
We need to compute the total number of comparisons performed in all calls to PARTITION
$$X_{ij} = I\{z_i \text{ is compared to } z_j\}$$
  This covers any comparison during the entire execution of the algorithm, not just during one call to PARTITION

Number of Comparisons in PARTITION
Each pair of elements can be compared at most once
$$X_{ij} = I\{z_i \text{ is compared to } z_j\}$$
$$X = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}$$
X represents the total number of comparisons performed by the algorithm

Number of Comparisons in PARTITION
Each $X_{ij}$ is an indicator random variable, and X is their sum
Compute the expected value E[X]:
$$E[X] = E\left[\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}\right] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E[X_{ij}] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \Pr\{z_i \text{ is compared to } z_j\}$$
The second equality holds by linearity of expectation; the third holds because the expectation of $X_{ij}$ is equal to the probability of the event "$z_i$ is compared to $z_j$"

When Do We Compare Two Elements?
Rename the elements of A as $z_1, z_2, \ldots, z_n$, with $z_i$ being the i-th smallest element
Define the set $Z_{ij} = \{z_i, z_{i+1}, \ldots, z_j\}$, the set of elements between $z_i$ and $z_j$, inclusive
Example: A = 2 9 8 3 5 4 1 6 10 7, that is, $z_2\ z_9\ z_8\ z_3\ z_5\ z_4\ z_1\ z_6\ z_{10}\ z_7$, so $Z_{1,6} = \{1, 2, 3, 4, 5, 6\}$

When Do We Compare Two Elements?
Example: A = 2 9 8 3 5 4 1 6 10 7, with $Z_{1,6} = \{1, 2, 3, 4, 5, 6\}$
Pivot x chosen such that $z_i < x < z_j$: $z_i$ and $z_j$ will never be compared
$z_i$ or $z_j$ is the pivot: $z_i$ and $z_j$ will be compared
  This happens only if one of them is chosen as pivot before any other element in the range $z_i$ to $z_j$
Only the pivot is compared with elements in both sets

Number of Comparisons in PARTITION
$$\Pr\{z_i \text{ is compared to } z_j\} = \Pr\{z_i \text{ is the first pivot chosen from } Z_{ij}\} + \Pr\{z_j \text{ is the first pivot chosen from } Z_{ij}\}$$
$$= \frac{1}{j-i+1} + \frac{1}{j-i+1} = \frac{2}{j-i+1}$$
The two events are mutually exclusive, so their probabilities add
There are j - i + 1 elements in $Z_{ij}$
The pivot is chosen randomly and independently
The probability that any particular element of $Z_{ij}$ is the first one chosen as pivot is 1/(j - i + 1)

Number of Comparisons in PARTITION
Expected number of comparisons in PARTITION:
$$E[X] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \Pr\{z_i \text{ is compared to } z_j\} = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{2}{j-i+1} = O(n \lg n)$$
The expected running time of Quicksort using RANDOMIZED-PARTITION is O(n lg n)
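The double sum can be evaluated exactly for small n as a sanity check. The sketch below is an illustration only: the function names are chosen here, and the closed form 2(n+1)H_n - 4n (with H_n the n-th harmonic number) is not stated on the slides; it follows from regrouping the sum by the gap d = j - i.

```python
from fractions import Fraction


def expected_comparisons(n):
    """Exact value of the double sum of 2/(j - i + 1) over 1 <= i < j <= n."""
    # There are n - d pairs (i, j) with gap d = j - i, each contributing 2/(d + 1).
    return sum(Fraction(2 * (n - d), d + 1) for d in range(1, n))


def harmonic(n):
    """n-th harmonic number H_n = 1 + 1/2 + ... + 1/n, as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


for n in (10, 100, 1000):
    exact = expected_comparisons(n)
    closed = 2 * (n + 1) * harmonic(n) - 4 * n
    print(n, float(exact), exact == closed)
```

Since H_n grows like ln n, the sum grows like 2n ln n, which is consistent with the O(n lg n) bound.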