CSE 613: Parallel Programming. Lecture 9 ( Divide-and-Conquer: Partitioning for Selection and Sorting )

CSE 613: Parallel Programming
Lecture 9 ( Divide-and-Conquer: Partitioning for Selection and Sorting )
Rezaul A. Chowdhury
Department of Computer Science, SUNY Stony Brook
Spring 2012

Parallel Partition

Input: An array A[ q : r ] of distinct elements, and an element x from A[ q : r ].
Output: Rearrange the elements of A[ q : r ], and return an index k ∈ [ q, r ], such that all elements in A[ q : k - 1 ] are smaller than x, all elements in A[ k + 1 : r ] are larger than x, and A[ k ] = x.

Par-Partition ( A[ q : r ], x )
 1. n ← r - q + 1
 2. if n = 1 then return q
 3. array B[ 0 : n - 1 ], lt[ 0 : n - 1 ], gt[ 0 : n - 1 ]
 4. parallel for i ← 0 to n - 1 do
 5.     B[ i ] ← A[ q + i ]
 6.     if B[ i ] < x then lt[ i ] ← 1 else lt[ i ] ← 0
 7.     if B[ i ] > x then gt[ i ] ← 1 else gt[ i ] ← 0
 8. lt′[ 0 : n - 1 ] ← Par-Prefix-Sum ( lt[ 0 : n - 1 ], + )
 9. gt′[ 0 : n - 1 ] ← Par-Prefix-Sum ( gt[ 0 : n - 1 ], + )
10. k ← q + lt′[ n - 1 ],  A[ k ] ← x
11. parallel for i ← 0 to n - 1 do
12.     if B[ i ] < x then A[ q + lt′[ i ] - 1 ] ← B[ i ]
13.     else if B[ i ] > x then A[ k + gt′[ i ] ] ← B[ i ]
14. return k
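
Below is a minimal C++ sketch of this procedure, assuming OpenMP for the parallel loops (the lecture's pseudocode is Cilk-style, and the names par_partition and prefix_sum are mine); the serial prefix_sum is only a stand-in for the Par-Prefix-Sum routine of Lecture 8.

// Hedged C++/OpenMP sketch of Par-Partition, not the course's own code.
#include <vector>

// Inclusive prefix sum (serial stand-in for Par-Prefix-Sum of Lecture 8).
static void prefix_sum(std::vector<int>& v) {
    for (std::size_t i = 1; i < v.size(); ++i) v[i] += v[i - 1];
}

// Rearranges a[q..r] around the pivot value x (assumed to occur in a[q..r])
// and returns the final index of x, mirroring lines 1-14 of Par-Partition.
int par_partition(std::vector<int>& a, int q, int r, int x) {
    int n = r - q + 1;
    if (n == 1) return q;
    std::vector<int> b(n), lt(n), gt(n);
    #pragma omp parallel for                 // lines 4-7: copy and flag
    for (int i = 0; i < n; ++i) {
        b[i]  = a[q + i];
        lt[i] = (b[i] < x) ? 1 : 0;
        gt[i] = (b[i] > x) ? 1 : 0;
    }
    prefix_sum(lt);                          // line 8
    prefix_sum(gt);                          // line 9
    int k = q + lt[n - 1];                   // line 10: final position of x
    a[k] = x;
    #pragma omp parallel for                 // lines 11-13: scatter by ranks
    for (int i = 0; i < n; ++i) {
        if      (b[i] < x) a[q + lt[i] - 1] = b[i];
        else if (b[i] > x) a[k + gt[i]]     = b[i];
    }
    return k;
}

On the example worked on the next slide ( A = 9 5 7 11 1 3 8 14 4 21 with q = 0 and x = 8 ), the call par_partition( A, 0, 9, 8 ) returns k = 5 and leaves A = 5 7 1 3 4 8 9 11 14 21.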

Parallel Partition: Example

A:  9  5  7  11  1  3  8  14  4  21        x = 8

B:  9  5  7  11  1  3  8  14  4  21
lt: 0  1  1  0   1  1  0  0   1  0
gt: 1  0  0  1   0  0  0  1   0  1

After the prefix sums ( lines 8-9 ):
lt′: 0  1  2  2  3  4  4  4  5  5
gt′: 1  1  1  2  2  2  2  3  3  4

k = q + lt′[ n - 1 ] = 5 ( here q = 0 ), so A[ k ] ← 8, the elements smaller than x are scattered into A[ q : k - 1 ], and the larger ones into A[ k + 1 : r ]:

A:  5  7  1  3  4  8  9  11  14  21

Parallel Partition: Analysis

( Pseudocode of Par-Partition as above. )

Work:
  Θ( n )   [ lines 1-7 ]
  Θ( n )   [ lines 8-9 ]
  Θ( n )   [ lines 10-14 ]
  Total: Θ( n )

Span ( assuming Θ( log n ) depth for the parallel for loops ):
  Θ( log n )    [ lines 1-7 ]
  Θ( log² n )   [ lines 8-9 ]
  Θ( log n )    [ lines 10-14 ]
  Total: Θ( log² n )

Parallelism: Θ( n / log² n )
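
The Θ( log² n ) span term comes from the two Par-Prefix-Sum calls. Lecture 8's routine is not reproduced in these slides, so the following is a minimal C++/OpenMP sketch of one standard work-efficient recursive scheme (the name par_prefix_sum is mine, and the OpenMP loops stand in for the Θ( log n )-depth parallel for assumed in the analysis): each level halves the input and uses one parallel loop, giving Θ( n ) work and Θ( log² n ) span under that assumption.

// Hedged sketch of a recursive, work-efficient inclusive prefix sum.
#include <vector>

std::vector<int> par_prefix_sum(const std::vector<int>& x) {
    std::size_t n = x.size();
    std::vector<int> s(n);
    if (n <= 1) { if (n == 1) s[0] = x[0]; return s; }
    // Pairwise-combine neighbours: y[i] = x[2i] + x[2i+1].
    std::vector<int> y(n / 2);
    #pragma omp parallel for
    for (long long i = 0; i < (long long)(n / 2); ++i)
        y[i] = x[2 * i] + x[2 * i + 1];
    // Recursively scan the half-length array; z[j] = x[0] + ... + x[2j+1].
    std::vector<int> z = par_prefix_sum(y);
    // Expand back to full length.
    #pragma omp parallel for
    for (long long i = 0; i < (long long)n; ++i) {
        if (i == 0)          s[i] = x[0];
        else if (i % 2 == 1) s[i] = z[i / 2];           // odd i: pair sums cover x[0..i]
        else                 s[i] = z[i / 2 - 1] + x[i]; // even i: previous pairs + x[i]
    }
    return s;
}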

Randomized Parallel QuickSort

Input: An array A[ q : r ] of distinct elements.
Output: Elements of A[ q : r ] sorted in increasing order of value.

Par-Randomized-QuickSort ( A[ q : r ] )
1. n ← r - q + 1
2. if n ≤ 30 then
3.     sort A[ q : r ] using any sorting algorithm
4. else
5.     select a random element x from A[ q : r ]
6.     k ← Par-Partition ( A[ q : r ], x )
7.     spawn Par-Randomized-QuickSort ( A[ q : k - 1 ] )
8.     Par-Randomized-QuickSort ( A[ k + 1 : r ] )
9.     sync
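
A hedged OpenMP-task rendering of the same pseudocode: par_partition refers to the sketch given after the Par-Partition slide, the cutoff of 30 follows line 2, and the wrapper sort_parallel (my name) only supplies the enclosing parallel region that OpenMP tasks need.

// Hedged sketch; spawn/sync become OpenMP task/taskwait.
#include <algorithm>
#include <random>
#include <vector>

int par_partition(std::vector<int>& a, int q, int r, int x);  // sketch above

void par_randomized_quicksort(std::vector<int>& a, int q, int r) {
    int n = r - q + 1;
    if (n <= 30) {                                  // lines 2-3: small base case
        std::sort(a.begin() + q, a.begin() + r + 1);
        return;
    }
    static thread_local std::mt19937 rng(std::random_device{}());
    std::uniform_int_distribution<int> pick(q, r);
    int x = a[pick(rng)];                           // line 5: random pivot
    int k = par_partition(a, q, r, x);              // line 6
    #pragma omp task shared(a)                      // line 7: spawn left half
    par_randomized_quicksort(a, q, k - 1);
    par_randomized_quicksort(a, k + 1, r);          // line 8: right half
    #pragma omp taskwait                            // line 9: sync
}

void sort_parallel(std::vector<int>& a) {           // driver: parallel region for tasks
    if (a.empty()) return;
    #pragma omp parallel
    #pragma omp single
    par_randomized_quicksort(a, 0, (int)a.size() - 1);
}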

Randomized Parallel QuickSort: Analysis

( Pseudocode of Par-Randomized-QuickSort as above. )

Lines 1-6 take Θ( log² n ) parallel time and perform Θ( n ) work. Also, the recursive calls spawned in lines 7-8 work on disjoint parts of A[ q : r ]. So the upper bounds on the parallel time and the total work in each level of recursion are Θ( log² n ) and Θ( n ), respectively.

Hence, if D( n ) is the recursion depth of the algorithm, then
    T₁( n ) = O( n · D( n ) )   and   T∞( n ) = O( D( n ) · log² n ).

Randomized Parallel QuickSort: Analysis

( Pseudocode of Par-Randomized-QuickSort as above. )

We will show that w.h.p. the recursion depth D( n ) = O( log n ). Hence, with high probability,
    T₁( n ) = O( n log n )   and   T∞( n ) = O( log³ n ).

Randomized Parallel QuickSort: Analysis

Approach: We will show the following.
1. For any specific element x, the sizes of the partitions containing x in any two consecutive levels of recursion decrease by a constant factor with a certain ( constant ) probability.
2. With probability 1 - O( 1/n² ), the partition containing x will be of size 30 or less after O( log n ) levels of recursion.
3. With probability 1 - O( 1/n ), the partition containing every element will be of size 30 or less after O( log n ) levels of recursion.

Randomized Parallel QuickSort: Analysis

Lemma 1: Let x be an arbitrary element of the original input array A of size n, and let n_i be the size of the partition containing x after partitioning at recursion depth i ≥ 1 ( with n_0 = n ). Then for any i > 0, Pr{ n_i > (7/8) n_{i-1} } ≤ 1/4.

Proof: Suppose at recursion depth i ≥ 1 element y was chosen as the pivot element. The partition containing x can have more than (7/8) n_{i-1} elements only if y is among the smallest n_{i-1}/8 or largest n_{i-1}/8 elements in the old partition. The probability that y is among the smallest n_{i-1}/8 or largest n_{i-1}/8 elements in the old partition is clearly at most 1/4.
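
Spelling out the probability in the last step of the proof ( treating n_{i-1}/8 as exact, as the slide does, since y is chosen uniformly at random from the n_{i-1} elements of the old partition ):

\[
\Pr\{\, y \text{ among the smallest or largest } \tfrac{n_{i-1}}{8} \text{ elements} \,\}
= \frac{\tfrac{n_{i-1}}{8} + \tfrac{n_{i-1}}{8}}{n_{i-1}}
= \frac{1}{4}.
\]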

Randomized Parallel QuickSort: Analysis

Lemma 2: In 20 log n levels of recursion, the probability that an element goes through at least 20 log n - log n + log 30 = 19 log n + log 30 unsuccessful partitioning steps ( i.e., partitioning steps with n_i > (7/8) n_{i-1} ) is O( 1/n² ). [ All logarithms are to the base 8/7. ]

Proof: The partitioning steps applied to the partition containing the element can be modeled as Bernoulli trials, each of which is unsuccessful with probability at most 1/4 regardless of the outcomes of the earlier steps ( Lemma 1 ). Let X be a random variable denoting the number of unsuccessful partitioning steps among the 20 log n steps. Then

    Pr{ X ≥ 20 log n - log n + log 30 } ≤ Pr{ X ≥ 19 log n }
        ≤ ( 20 log n choose 19 log n ) · ( 1/4 )^{ 19 log n }
        ≤ 2^{ 20 log n } · 2^{ -38 log n }
        = 2^{ -18 log n }
        ≤ 2^{ -90 log_2 n }        [ since log n = log_{8/7} n ≥ 5 log_2 n ]
        = 1 / n^{90}
        = O( 1/n² ).

Randomized Parallel QuickSort: Analysis

Theorem 1: The recursion depth of Par-Randomized-QuickSort is at most 20 log n with probability 1 - O( 1/n ). [ Logarithms to the base 8/7. ]

Proof: By Lemma 2 and the union bound over the n elements of A, the probability that one or more elements of A go through at least 20 log n - log n + log 30 unsuccessful partitioning steps is O( n · 1/n² ) = O( 1/n ). Hence, the probability that at least log n - log 30 of the 20 log n partitioning steps are successful for all elements is 1 - O( 1/n ). After log n - log 30 = log( n/30 ) successful partitioning steps involving an element, the element belongs to a partition of size at most 30, at which point the recursion involving it terminates ( line 2 ). Hence, Par-Randomized-QuickSort terminates within 20 log n levels of recursion with probability 1 - O( 1/n ).

Parallel Selection

Input: A subarray A[ q : r ] of an array A[ 1 : n ] of distinct elements, and a positive integer k ∈ [ 1, r - q + 1 ].
Output: An element x of A[ q : r ] such that rank( x, A[ q : r ] ) = k.

Par-Selection ( A[ q : r ], n, k )
 1. m ← r - q + 1
 2. if m ≤ n / log n then
 3.     sort A[ q : r ] using a parallel sorting algorithm and return A[ q + k - 1 ]
 4. else
 5.     partition A[ q : r ] into blocks B_i, each containing log n consecutive elements
 6.     parallel for i ← 1 to m / log n do
 7.         M[ i ] ← median of B_i using a sequential selection algorithm
 8.     find the median x of M[ 1 : m / log n ] using a parallel sorting algorithm
 9.     t ← Par-Partition ( A[ q : r ], x )
10.     if k = t - q + 1 then return A[ t ]
11.     else if k < t - q + 1 then return Par-Selection ( A[ q : t - 1 ], n, k )
12.     else return Par-Selection ( A[ t + 1 : r ], n, k - ( t - q + 1 ) )
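
A hedged C++/OpenMP sketch of this procedure under the same assumptions as the earlier sketches: par_partition is the function sketched after the Par-Partition slide, and std::nth_element stands in for the sequential selection of line 7 and for the median of M ( it is typically introselect, not the worst-case linear-time algorithm of CLRS Section 9.3, and the sorts here are serial ), so the asymptotic bounds of the following slides are not claimed for this code; it only mirrors the control flow.

#include <algorithm>
#include <cmath>
#include <vector>

int par_partition(std::vector<int>& a, int q, int r, int x);  // sketch above

// Returns the k-th smallest (1-based) element of a[q..r]; n0 is the size of
// the original array and stays fixed across recursive calls.
int par_selection(std::vector<int>& a, int q, int r, int n0, int k) {
    int m = r - q + 1;
    int blk = std::max(2, (int)std::log2(n0));          // block size ~ log n
    if (m <= std::max(1, n0 / blk)) {                   // lines 2-3: small enough,
        std::sort(a.begin() + q, a.begin() + r + 1);    // sort and read off rank k
        return a[q + k - 1];
    }
    int nb = (m + blk - 1) / blk;                       // number of blocks
    std::vector<int> med(nb);
    #pragma omp parallel for                            // lines 5-7: block medians
    for (int i = 0; i < nb; ++i) {
        int lo = q + i * blk, hi = std::min(r, lo + blk - 1);
        std::vector<int> block(a.begin() + lo, a.begin() + hi + 1);
        std::nth_element(block.begin(),
                         block.begin() + (block.size() - 1) / 2, block.end());
        med[i] = block[(block.size() - 1) / 2];
    }
    std::nth_element(med.begin(), med.begin() + (nb - 1) / 2, med.end());
    int x = med[(nb - 1) / 2];                          // line 8: median of the medians
    int t = par_partition(a, q, r, x);                  // line 9
    if (k == t - q + 1) return a[t];                    // line 10
    if (k <  t - q + 1) return par_selection(a, q, t - 1, n0, k);         // line 11
    return par_selection(a, t + 1, r, n0, k - (t - q + 1));               // line 12
}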

Parallel Selection

Lemma 3: In Par-Selection ( lines 11-12 ): | A[ q : t - 1 ] | ≤ (3/4) | A[ q : r ] | and | A[ t + 1 : r ] | ≤ (3/4) | A[ q : r ] |.

Proof: It suffices to show that (1/4) | A[ q : r ] | ≤ rank( x, A[ q : r ] ) ≤ (3/4) | A[ q : r ] |. Since x is the median of the M[ i ]'s, it is larger than at least one half of the M[ i ]'s. But each M[ i ] is larger than at least one half of the elements in its block B_i. Hence, x is larger than at least (1/4) | A[ q : r ] | elements of A[ q : r ], i.e., rank( x, A[ q : r ] ) ≥ (1/4) | A[ q : r ] |. Similarly, one can show that rank( x, A[ q : r ] ) ≤ (3/4) | A[ q : r ] |.
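
Making the counting in the proof explicit ( block size log n, m = | A[ q : r ] |, floors and ceilings ignored as on the slide ):

\[
\operatorname{rank}( x, A[q:r] )
\;\ge\;
\underbrace{\frac{1}{2}\cdot\frac{m}{\log n}}_{\text{medians } M[i] \,\le\, x}
\cdot
\underbrace{\frac{\log n}{2}}_{\text{elements of } B_i \text{ at most } M[i]}
\;=\;
\frac{m}{4},
\]
and symmetrically rank( x, A[ q : r ] ) ≤ 3m/4, so each recursive call in lines 11-12 is on at most 3m/4 elements.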

Parallel Selection

Lemma 4: In Par-Selection, the size of the subarray passed to a recursive call drops to at most n / log n after at most log_{4/3} log n = O( log log n ) levels of recursion.

Proof: Let m_i be the size of the subarray at recursion depth i ( m_0 ≤ n ). It follows from Lemma 3 that m_i ≤ (3/4)^i · n after i levels of recursion. Hence, for reaching m_i ≤ n / log n, it suffices that (3/4)^i ≤ 1 / log n, i.e., we need at most i = log_{4/3} log n = O( log log n ) levels of recursion.

Deterministic Parallel Selection

( Pseudocode of Par-Selection as above. )

Step 7: Use a linear-time ( worst-case ) sequential selection algorithm ( see Section 9.3 of Introduction to Algorithms, 3rd Edition, by Cormen et al. ).

Steps 3 and 8: Use the parallel merge sort with parallel merge ( see Lecture 8 ), which runs in O( log³ n ) parallel time and performs O( n log n ) work in the worst case.

Deterministic Parallel Selection

( Pseudocode of Par-Selection as above. ) Let m denote the size of the subarray A[ q : r ] handled at a given level of recursion.

Last level of recursion ( m ≤ n / log n ):
  Work:  O( m log m ) = O( ( n / log n ) · log n ) = O( n )
  Span:  O( log³ m ) = O( log³ n )

Any other level of recursion ( except the last ):
  Work:  O( ( m / log n ) · log n ) = O( m )  [ lines 5-7 ]
         O( ( m / log n ) · log m ) = O( m )  [ line 8 ]
         O( m )                                [ line 9 ]
         Total: O( m )
  Span:  O( log n )    [ lines 5-7 ]
         O( log³ n )   [ line 8 ]
         O( log² n )   [ line 9 ]
         Total: O( log³ n )

Deterministic Parallel Selection

( Pseudocode of Par-Selection as above. )

Overall:
  Work:  O( n )
  Span:  O( log log n ) · O( log³ n ) = O( log³ n · log log n )
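
Spelling out how the per-level bounds combine, with m_i ≤ (3/4)^i · n by Lemma 3, O( log log n ) levels before the base case by Lemma 4, and the final level sorted directly:

\begin{align*}
T_1(n)      &= \sum_{i=0}^{O(\log\log n)} O\!\left(\left(\tfrac{3}{4}\right)^{i} n\right) + O(n) \;=\; O(n), \\
T_\infty(n) &= O(\log\log n)\cdot O(\log^3 n) + O(\log^3 n) \;=\; O(\log^3 n \,\log\log n).
\end{align*}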