Recap: Prefix Sums. Given A: set of n integers Find B: prefix sums 1 / 86

Similar documents
Algorithms, Design and Analysis. Order of growth. Table 2.1. Big-oh. Asymptotic growth rate. Types of formulas for basic operation count

CS161: Algorithm Design and Analysis Recitation Section 3 Stanford University Week of 29 January, Problem 3-1.

Searching. Sorting. Lambdas

Quicksort. Where Average and Worst Case Differ. S.V. N. (vishy) Vishwanathan. University of California, Santa Cruz

Quicksort (CLRS 7) We previously saw how the divide-and-conquer technique can be used to design sorting algorithm Merge-sort

Outline. 1 Introduction. 3 Quicksort. 4 Analysis. 5 References. Idea. 1 Choose an element x and reorder the array as follows:

Algorithms Test 1. Question 1. (10 points) for (i = 1; i <= n; i++) { j = 1; while (j < n) {

Randomized Sorting Algorithms Quick sort can be converted to a randomized algorithm by picking the pivot element randomly. In this case we can show th

COMP 250: Quicksort. Carlos G. Oliver. February 13, Carlos G. Oliver COMP 250: Quicksort February 13, / 21

CSCI 3110 Assignment 6 Solutions

Divide and Conquer Algorithms. CSE 101: Design and Analysis of Algorithms Lecture 14

Data Structures and Algorithm Analysis (CSC317) Randomized Algorithms (part 3)

1 Quick Sort LECTURE 7. OHSU/OGI (Winter 2009) ANALYSIS AND DESIGN OF ALGORITHMS

CSE 613: Parallel Programming. Lecture 9 ( Divide-and-Conquer: Partitioning for Selection and Sorting )

ITEC2620 Introduction to Data Structures

Quick Sort Notes , Spring 2010

Data Structures and Algorithms CSE 465

Problem Set 1. CSE 373 Spring Out: February 9, 2016

Algorithms And Programming I. Lecture 5 Quicksort

Algorithms, CSE, OSU Quicksort. Instructor: Anastasios Sidiropoulos

Week 5: Quicksort, Lower bound, Greedy

Longest Common Prefixes

Central Algorithmic Techniques. Iterative Algorithms

Problem. Problem Given a dictionary and a word. Which page (if any) contains the given word? 3 / 26

Kartsuba s Algorithm and Linear Time Selection

R ij = 2. Using all of these facts together, you can solve problem number 9.

b + O(n d ) where a 1, b > 1, then O(n d log n) if a = b d d ) if a < b d O(n log b a ) if a > b d

data structures and algorithms lecture 2

CS 4407 Algorithms Lecture 3: Iterative and Divide and Conquer Algorithms

CSED233: Data Structures (2017F) Lecture4: Analysis of Algorithms

Quiz 1 Solutions. (a) f 1 (n) = 8 n, f 2 (n) = , f 3 (n) = ( 3) lg n. f 2 (n), f 1 (n), f 3 (n) Solution: (b)

Sorting. Chapter 11. CSE 2011 Prof. J. Elder Last Updated: :11 AM

Lecture 4. Quicksort

CSE 421, Spring 2017, W.L.Ruzzo. 8. Average-Case Analysis of Algorithms + Randomized Algorithms

Weighted Activity Selection

Data Structures and Algorithm Analysis (CSC317) Randomized algorithms

Quicksort algorithm Average case analysis

Ch 01. Analysis of Algorithms

Bucket-Sort. Have seen lower bound of Ω(nlog n) for comparisonbased. Some cheating algorithms achieve O(n), given certain assumptions re input

Data selection. Lower complexity bound for sorting

CS 1110: Introduction to Computing Using Python Sequence Algorithms

Algorithms and Data Structures 2016 Week 5 solutions (Tues 9th - Fri 12th February)

Divide & Conquer. Jordi Cortadella and Jordi Petit Department of Computer Science

Advanced Analysis of Algorithms - Midterm (Solutions)

Learning Objectives

Linear Time Selection

Ch01. Analysis of Algorithms

Randomized algorithms. Inge Li Gørtz

Module 1: Analyzing the Efficiency of Algorithms

MA008/MIIZ01 Design and Analysis of Algorithms Lecture Notes 3

MA/CSSE 473 Day 05. Factors and Primes Recursive division algorithm

Data Structures in Java

Divide & Conquer. Jordi Cortadella and Jordi Petit Department of Computer Science

Bin Sort. Sorting integers in Range [1,...,n] Add all elements to table and then

Fast Sorting and Selection. A Lower Bound for Worst Case

CS 2110: INDUCTION DISCUSSION TOPICS

CS361 Homework #3 Solutions

Lecture 6 September 21, 2016

What is Performance Analysis?

CS 310 Advanced Data Structures and Algorithms

Fundamental Algorithms

Fundamental Algorithms

Asymptotic Algorithm Analysis & Sorting

CS 344 Design and Analysis of Algorithms. Tarek El-Gaaly Course website:

Analysis of Algorithms. Randomizing Quicksort

1 Divide and Conquer (September 3)

Multi-Pivot Quicksort: Theory and Experiments

Algorithms and Data Structures

Hardware/Software Co-Design

COL 730: Parallel Programming

IS 709/809: Computational Methods in IS Research Fall Exam Review

Data Structures and Algorithms

CS 161 Summer 2009 Homework #2 Sample Solutions

Design and Analysis of Algorithms

Recursion: Introduction and Correctness

Analysis of Algorithms CMPSC 565

Randomization in Algorithms and Data Structures

CSE 312, Winter 2011, W.L.Ruzzo. 8. Average-Case Analysis of Algorithms + Randomized Algorithms

Divide and conquer. Philip II of Macedon

CS483 Design and Analysis of Algorithms

Class Note #14. In this class, we studied an algorithm for integer multiplication, which. 2 ) to θ(n

Design and Analysis of Algorithms PART II-- Median & Runtime Analysis Recorded by Chandrasekar Vijayarenu and Shridharan Muthu.

Average Case Analysis. October 11, 2011

Recommended readings: Description of Quicksort in my notes, Ch7 of your CLRS text.

Sorting DS 2017/2018

Analysis of Algorithm Efficiency. Dr. Yingwu Zhu

Algorithm Design and Analysis

CPS 616 DIVIDE-AND-CONQUER 6-1

Lecture 1: Asymptotics, Recurrences, Elementary Sorting

ASYMPTOTIC COMPLEXITY SEARCHING/SORTING

Problem Set 2 Solutions

Selection and Adversary Arguments. COMP 215 Lecture 19

Design and Analysis of Algorithms

Algorithms. Quicksort. Slide credit: David Luebke (Virginia)

CMPT 307 : Divide-and-Conqer (Study Guide) Should be read in conjunction with the text June 2, 2015

Divide and Conquer Strategy

Divide and Conquer. Maximum/minimum. Median finding. CS125 Lecture 4 Fall 2016

CS 4407 Algorithms Lecture 2: Iterative and Divide and Conquer Algorithms

Divide and Conquer. CSE21 Winter 2017, Day 9 (B00), Day 6 (A00) January 30,

Analysis of Algorithms

Transcription:

Recap: Prefix Sums Given : set of n integers Find B: prefix sums : 3 1 1 7 2 5 9 2 4 3 3 B: 3 4 5 12 14 19 28 30 34 37 40 1 / 86

Recap: Parallel Prefix Sums Recursive algorithm Recursively computes sums Use partial sums to get prefix sums T(n) = O(log n) W(n) = O(n) Hard to get intuition Iterative algorithm easier to grasp? 2 / 86

Iterative prefix sum 2 phases: up-sweep, down-sweep Up-sweep pseudocode: 3 / 86

Up-sweep phase B[0] 3 1 1 7 2 5 9 2 3 1 1 7 2 5 9 2 4 / 86

Up-sweep phase B[1] B[0] 3 1 1 7 2 5 9 2 5 / 86

Up-sweep phase B[2] B[1] B[0] 3 1 1 7 2 5 9 2 6 / 86

Up-sweep phase B[3] B[2] B[1] B[0] 3 1 1 7 2 5 9 2 7 / 86

Up-sweep phase B[3] B[2] B[1] 4 8 7 11 B[0] 3 1 1 7 2 5 9 2 8 / 86

Up-sweep phase B[3] B[2] 12 18 B[1] 4 8 7 11 B[0] 3 1 1 7 2 5 9 2 9 / 86

Up-sweep phase B[3] 30 B[2] 12 18 B[1] 4 8 7 11 B[0] 3 1 1 7 2 5 9 2 10 / 86

Down-sweep phase 11 / 86

Down-sweep phase C[3] 0 B[2] 12 18 B[1] 4 8 7 11 B[0] 3 1 1 7 2 5 9 2 12 / 86

Down-sweep phase C[3] 0 B[2] 12 18 B[1] 4 8 7 11 B[0] 3 1 1 7 2 5 9 2 13 / 86

Down-sweep phase C[3] 0 B[2] 12 18 B[1] 4 8 7 11 B[0] 3 1 1 7 2 5 9 2 14 / 86

Down-sweep phase C[3] 0 C[2] 0 12 B[1] 4 8 7 11 B[0] 3 1 1 7 2 5 9 2 15 / 86

Down-sweep phase C[3] 0 C[2] 0 12 B[1] 4 8 7 11 B[0] 3 1 1 7 2 5 9 2 16 / 86

Down-sweep phase C[3] 0 C[2] 0 12 C[1] 0 4 12 19 B[0] 3 1 1 7 2 5 9 2 17 / 86

Down-sweep phase C[3] 0 C[2] 0 12 C[1] 0 4 12 19 B[0] 3 1 1 7 2 5 9 2 18 / 86

Down-sweep phase C[3] 0 C[2] 0 12 C[1] 0 4 12 19 C[0] 0 3 4 5 12 17 19 28 19 / 86

Down-sweep phase C[0] 0 3 4 5 12 17 19 28 3 1 1 7 2 5 9 2 20 / 86

Down-sweep phase C[0] 0 3 4 5 12 17 19 28 3 4 5 12 14 22 28 30 21 / 86

pplications of prefix sums More useful than it seems: Create an array of 1s and 0s Prefix sums gives # of 1s up to each point Used to separate an array into 2 Using almost any criteria! Examples: separate array into upper-case and lower-case letters separate array into numbers >x and <x 22 / 86

Example: string separation Separate array into lower-case and upper-case: a P r e R E c F I o X o S U l M S 23 / 86

Example: string separation Create bitstring B: 1 if upper-case, 0 otherwise a P r e R E c F I o X o S U l M S 24 / 86

Example: string separation Create bitstring B: 1 if upper-case, 0 otherwise a P r e R E c F I o X o S U l M S B 0 1 0 0 1 1 0 1 1 0 1 0 1 1 0 1 1 Time/work to do this in parallel? 25 / 86

Example: string separation Create bitstring B: 1 if upper-case, 0 otherwise a P r e R E c F I o X o S U l M S B 0 1 0 0 1 1 0 1 1 0 1 0 1 1 0 1 1 Time/work to do this in parallel? W(n) = O(n) T(n) = O(1) 26 / 86

Example: string separation Perform prefix sums on B a P r e R E c F I o X o S U l M S B 0 1 0 0 1 1 0 1 1 0 1 0 1 1 0 1 1 27 / 86

Example: string separation Perform prefix sums on B a P r e R E c F I o X o S U l M S B 0 1 1 1 2 3 3 4 5 5 6 6 7 8 8 9 10 What is B[i]? 28 / 86

Example: string separation Perform prefix sums on B a P r e R E c F I o X o S U l M S B 0 1 1 1 2 3 3 4 5 5 6 6 7 8 8 9 10 What is B[i]? The number of capital letters with index i 29 / 86

Example: string separation Copy capital letters into C a P r e R E c F I o X o S U l M S B 0 1 1 1 2 3 3 4 5 5 6 6 7 8 8 9 10 C How can we use B to write only capitals into C? 30 / 86

Example: string separation Copy capital letters into C a P r e R E c F I o X o S U l M S B 0 1 1 1 2 3 3 4 5 5 6 6 7 8 8 9 10 C How can we use B to write only capitals into C? B[i] is the index of each capital in C! 31 / 86

Example: string separation Copy capital letters into C a P r e R E c F I o X o S U l M S B 0 1 1 1 2 3 3 4 5 5 6 6 7 8 8 9 10 C P R E F I X S U M S 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 How can we use B to write only capitals into C? B[i] is the index of each capital in C! 32 / 86

Example: string separation Create B 1 for lower-case, 0 otherwise a P r e R E c F I o X o S U l M S B 1 0 1 1 0 0 1 0 0 1 0 1 0 0 1 0 0 C P R E F I X S U M S 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 33 / 86

Example: string separation Prefix sums on B a P r e R E c F I o X o S U l M S B 1 1 2 3 3 3 4 4 4 5 5 6 6 6 7 7 7 C P R E F I X S U M S 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 34 / 86

Example: string separation Copy lower-case into the rest of C a P r e R E c F I o X o S U l M S B 1 1 2 3 3 3 4 4 4 5 5 6 6 6 7 7 7 C P R E F I X S U M S 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 35 / 86

Example: string separation Copy lower-case into the rest of C a P r e R E c F I o X o S U l M S B 1 1 2 3 3 3 4 4 4 5 5 6 6 6 7 7 7 C P R E F I X S U M S a r e c o o l 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 1 2 3 4 5 6 7 [i] = C[j] where j = B[n] + B [i] = 10 + B [i] 36 / 86

Example: string separation Create B and B Prefix sums Copy into C Total algorithm 37 / 86

Example: string separation Create B and B Prefix sums Copy into C Total algorithm 38 / 86

Example: string separation Create B and B Prefix sums Copy into C Total algorithm 39 / 86

Quicksort Review Quicksort is a popular sorting algorithm Works in-place O(n 2 ) worst-case BUT O(n log n) expected Each recursive call: Find pivot Partition around pivot 40 / 86

Sequential Quicksort 41 / 86

Select pivot 3 9 1 7 4 5 8 2 pivot 42 / 86

Select pivot 4 9 1 7 3 5 8 2 43 / 86

Partition elements 4 9 1 7 3 5 8 2 44 / 86

Partition elements 4 9 1 7 3 5 8 2 part 45 / 86

Partition elements i 4 9 1 7 3 5 8 2 part 46 / 86

Partition elements FLSE i 4 9 1 7 3 5 8 2 part 47 / 86

Partition elements TRUE i 4 9 1 7 3 5 8 2 part 48 / 86

Partition elements TRUE i 4 1 9 7 3 5 8 2 part 49 / 86

Partition elements FLSE i 4 1 9 7 3 5 8 2 part 50 / 86

Partition elements TRUE i 4 1 9 7 3 5 8 2 part 51 / 86

Partition elements TRUE i 4 1 3 7 9 5 8 2 part 52 / 86

Partition elements FLSE i 4 1 3 7 9 5 8 2 part 53 / 86

Partition elements FLSE i 4 1 3 7 9 5 8 2 part 54 / 86

Partition elements TRUE i 4 1 3 7 9 5 8 2 part 55 / 86

Partition elements TRUE i 4 1 3 2 9 5 8 7 part 56 / 86

Recurse 4 1 3 2 9 5 8 7 part 57 / 86

Recursion sorts sublists 1 2 3 4 5 7 8 9 part 58 / 86

How can we parallelize? O(1)??? Parallel calls 59 / 86

Parallel partition Separate all elements pivot 4 9 1 7 3 5 8 2 pivot How can we do this in parallel? 60 / 86

Parallel partition Separate all elements pivot 4 9 1 7 3 5 8 2 pivot How can we do this in parallel? Prefix sums! 61 / 86

Parallel partition Create B[i] by comparing [i] to pivot 1 if [i] [0] 0 otherwise 4 9 1 7 3 5 8 2 B 1 0 1 0 1 0 0 1 62 / 86

Parallel partition Prefix sums on B 4 9 1 7 3 5 8 2 B 1 1 2 2 3 3 3 4 63 / 86

Parallel partition Write each [i] [0] to array C C[B[i]] = [i] 4 9 1 7 3 5 8 2 B 1 1 2 2 3 3 3 4 C 4 1 3 2 64 / 86

Parallel partition Create B as opposite of B B [i] = 1 if [i] > [0] B [i] = 0 otherwise 4 9 1 7 3 5 8 2 B 0 1 0 1 0 1 1 0 C 4 1 3 2 65 / 86

Parallel partition Prefix sums on B 4 9 1 7 3 5 8 2 B 0 1 1 2 2 3 4 4 C 4 1 3 2 66 / 86

Parallel partition Write remaining elements to C C[B[n-1] + B [i]] = [i] 4 9 1 7 3 5 8 2 B 0 1 1 2 2 3 4 4 C 4 1 3 2 9 7 5 8 67 / 86

Parallel quicksort analysis Each recursive call performs prefix sum Worst-case, pivot is always min or max: If we assume good pivot is chosen: 68 / 86

Parallel quicksort analysis ssuming a good pivot choice: 69 / 86

Issues with parallel quicksort Have to copy to C => not in-place O(n) extra space needed O(log2 n) average parallel runtime Recursive definition Difficult to make iterative Perform many small prefix-sums Performance overhead 70 / 86

Iterative solution What if we can combine recursive calls One iteration for each level Separate recursive calls on partitions: P 0 P 1 P 2 P 3 1 3 5 2 7 6 9 12 16 19 17 14 22 25 20 23 71 / 86

Iterative solution Know size of partition i = P i Find a pivot for each partition P 0 P 1 P 2 P 3 1 3 5 2 7 6 9 12 16 19 17 14 22 25 20 23 72 / 86

Iterative solution Know size of partition i = P i Find a pivot for each partition Move pivots to front P 0 P 1 P 2 P 3 3 1 5 2 9 6 7 16 12 19 17 14 22 25 20 23 73 / 86

Iterative solution Know size of partition i = P i Find a pivot for each partition Move pivots to front Compute B Compare each to the pivot in its partition P 0 P 1 P 2 P 3 3 1 5 2 9 6 7 16 12 19 17 14 22 25 20 23 B 1 1 0 1 1 1 1 1 1 0 0 1 1 0 1 0 74 / 86

Iterative solution Want prefix sum within each partition: Segmented prefix sums Each partition is a separate segment P 0 P 1 P 2 P 3 3 1 5 2 9 6 7 16 12 19 17 14 22 25 20 23 B 1 1 0 1 1 1 1 1 1 0 0 1 1 0 1 0 75 / 86

Iterative solution Want prefix sum within each partition: Segmented prefix sums Each partition is a separate segment Can combine into 1 operation... P 0 P 1 P 2 P 3 3 1 5 2 9 6 7 16 12 19 17 14 22 25 20 23 B 1 2 2 3 1 2 3 1 2 2 2 3 1 1 2 2 76 / 86

Segmented prefix sums Input array and flag bits F 1 if start of new segment 0 otherwise Prefix sums, except sum resets when F[i]=1 3 1 4 1 5 2 1 3 4 0 2 6 1 0 3 4 F 0 0 0 1 0 0 0 0 0 1 0 1 1 0 0 0 77 / 86

Segmented prefix sums Input array and flag bits F 1 if start of new segment 0 otherwise Prefix sums, except sum resets when F[i]=1 3 1 4 1 5 2 1 3 4 0 2 6 1 0 3 4 F 0 0 0 1 0 0 0 0 0 1 0 1 1 0 0 0 C 3 4 8 1 6 8 9 12 16 0 2 6 1 1 4 8 78 / 86

Partition with segments Create F with partition boundaries P 0 P 1 P 2 P 3 3 1 5 2 9 6 7 16 12 19 17 14 22 25 20 23 B 1 1 0 1 1 1 1 1 1 0 0 1 1 0 1 0 F 0 0 0 0 1 0 0 1 0 0 0 0 1 0 0 0 79 / 86

Partition with segments Create F with partition boundaries Perform segmented prefix sums on B and F P 0 P 1 P 2 P 3 3 1 5 2 9 6 7 16 12 19 17 14 22 25 20 23 B 1 2 2 3 1 2 3 1 2 2 2 3 1 1 2 2 F 0 0 0 0 1 0 0 1 0 0 0 0 1 0 0 0 80 / 86

Partition with segments Create F with partition boundaries Perform segmented prefix sums on B and F Copy [i] into C[B[i]] (plus partition offsets) P 0 P 1 P 2 P 3 3 1 5 2 9 6 7 16 12 19 17 14 22 25 20 23 B 1 2 2 3 1 2 3 1 2 2 2 3 1 1 2 2 C 3 1 2 9 6 7 16 12 14 22 20 81 / 86

Partition with segments Repeat for > pivots: Build B P 0 P 1 P 2 P 3 3 1 5 2 9 6 7 16 12 19 17 14 22 25 20 23 B 0 0 1 0 0 0 0 0 0 1 1 0 0 1 0 1 C 3 1 2 9 6 7 16 12 14 22 20 82 / 86

Partition with segments Repeat for > pivots: Segmented prefix sums on B P 0 P 1 P 2 P 3 3 1 5 2 9 6 7 16 12 19 17 14 22 25 20 23 B 0 0 1 1 0 0 0 0 0 1 2 2 0 1 1 2 C 3 1 2 9 6 7 16 12 14 22 20 83 / 86

Partition with segments Repeat for > pivots: Copy remaining values into C P 0 P 1 P 2 P 3 3 1 5 2 9 6 7 16 12 19 17 14 22 25 20 23 B 0 0 1 1 0 0 0 0 0 1 2 2 0 1 1 2 C 3 1 2 5 9 6 7 16 12 14 19 17 22 20 25 23 84 / 86

Partition with segments Ready for next iteration... P 0 P 1 P 2 P 3 3 1 5 2 9 6 7 16 12 19 17 14 22 25 20 23 B 0 0 1 1 0 0 0 0 0 1 2 2 0 1 1 2 C 3 1 2 5 9 6 7 16 12 14 19 17 22 20 25 23 85 / 86

Notes about Iterative quicksort Need to keep track of partition offsets, etc. Still need to pick good pivots Same runtime as recursive Easier to optimize Unroll loops, etc. Less overhead (on most architectures) 86 / 86