CSE 613: Parallel Programming. Lectures ( Analyzing Divide-and-Conquer Algorithms )

Similar documents
CSE 613: Parallel Programming. Lecture 8 ( Analyzing Divide-and-Conquer Algorithms )

CSE 548: Analysis of Algorithms. Lecture 4 ( Divide-and-Conquer Algorithms: Polynomial Multiplication )

CSE 613: Parallel Programming. Lecture 9 ( Divide-and-Conquer: Partitioning for Selection and Sorting )

CSE 548: Analysis of Algorithms. Lecture 12 ( Analyzing Parallel Algorithms )

Algorithm Analysis Recurrence Relation. Chung-Ang University, Jaesung Lee

Divide-and-conquer: Order Statistics. Curs: Fall 2017

CSE548, AMS542: Analysis of Algorithms, Spring 2014 Date: May 12. Final In-Class Exam. ( 2:35 PM 3:50 PM : 75 Minutes )

Lecture 22: Multithreaded Algorithms CSCI Algorithms I. Andrew Rosenberg

Data Structures and Algorithms CSE 465

CSE548, AMS542: Analysis of Algorithms, Fall 2017 Date: October 11. In-Class Midterm. ( 7:05 PM 8:20 PM : 75 Minutes )

CS 577 Introduction to Algorithms: Strassen s Algorithm and the Master Theorem

The Divide-and-Conquer Design Paradigm

Analysis of Multithreaded Algorithms

Divide and Conquer. Andreas Klappenecker

A design paradigm. Divide and conquer: (When) does decomposing a problem into smaller parts help? 09/09/ EECS 3101

Chapter 1 Divide and Conquer Algorithm Theory WS 2016/17 Fabian Kuhn

Chapter 4 Divide-and-Conquer

Chapter 2. Recurrence Relations. Divide and Conquer. Divide and Conquer Strategy. Another Example: Merge Sort. Merge Sort Example. Merge Sort Example

CSE 591 Homework 3 Sample Solutions. Problem 1

Design and Analysis of Algorithms

Divide and Conquer. Recurrence Relations

Divide and Conquer. Andreas Klappenecker. [based on slides by Prof. Welch]

CS 470/570 Divide-and-Conquer. Format of Divide-and-Conquer algorithms: Master Recurrence Theorem (simpler version)

CSE 421 Algorithms. T(n) = at(n/b) + n c. Closest Pair Problem. Divide and Conquer Algorithms. What you really need to know about recurrences

V. Adamchik 1. Recurrences. Victor Adamchik Fall of 2005

Introduction to Algorithms 6.046J/18.401J/SMA5503

Analysis of Algorithms I: Asymptotic Notation, Induction, and MergeSort

CS161: Algorithm Design and Analysis Recitation Section 3 Stanford University Week of 29 January, Problem 3-1.

COMP Analysis of Algorithms & Data Structures

Class Note #14. In this class, we studied an algorithm for integer multiplication, which. 2 ) to θ(n

Fast Convolution; Strassen s Method

COMP Analysis of Algorithms & Data Structures

Data Structures and Algorithms CSE 465

The maximum-subarray problem. Given an array of integers, find a contiguous subarray with the maximum sum. Very naïve algorithm:

Inf 2B: Sorting, MergeSort and Divide-and-Conquer

Chapter 1 Divide and Conquer Polynomial Multiplication Algorithm Theory WS 2015/16 Fabian Kuhn

Mergesort and Recurrences (CLRS 2.3, 4.4)

Design and Analysis of Algorithms

Divide and Conquer. CSE21 Winter 2017, Day 9 (B00), Day 6 (A00) January 30,

Partition and Select

CPS 616 DIVIDE-AND-CONQUER 6-1

Algorithms and Data Structures Strassen s Algorithm. ADS (2017/18) Lecture 4 slide 1

Algorithms And Programming I. Lecture 5 Quicksort

Algorithm Design and Analysis

Outline. 1 Introduction. Merging and MergeSort. 3 Analysis. 4 Reference

CMPSCI611: Three Divide-and-Conquer Examples Lecture 2

Tutorials. Algorithms and Data Structures Strassen s Algorithm. The Master Theorem for solving recurrences. The Master Theorem (cont d)

Analysis of Algorithms - Using Asymptotic Bounds -

Divide and Conquer algorithms

Divide and Conquer. Arash Rafiey. 27 October, 2016

Divide-Conquer-Glue. Divide-Conquer-Glue Algorithm Strategy. Skyline Problem as an Example of Divide-Conquer-Glue

Divide and Conquer Algorithms. CSE 101: Design and Analysis of Algorithms Lecture 14

Lecture 4. Quicksort

Divide-and-conquer algorithm

CMPS 2200 Fall Divide-and-Conquer. Carola Wenk. Slides courtesy of Charles Leiserson with changes and additions by Carola Wenk

1 Divide and Conquer (September 3)

The Master Theorem for solving recurrences. Algorithms and Data Structures Strassen s Algorithm. Tutorials. The Master Theorem (cont d)

Asymptotic Algorithm Analysis & Sorting

Chapter 5. Divide and Conquer CLRS 4.3. Slides by Kevin Wayne. Copyright 2005 Pearson-Addison Wesley. All rights reserved.

Divide-and-conquer. Curs 2015

Outline. 1 Introduction. 3 Quicksort. 4 Analysis. 5 References. Idea. 1 Choose an element x and reorder the array as follows:

Divide and Conquer Strategy

COL106: Data Structures and Algorithms (IIT Delhi, Semester-II )

Divide & Conquer. CS 320, Fall Dr. Geri Georg, Instructor CS320 Div&Conq 1

CSCI Honor seminar in algorithms Homework 2 Solution

CSE 548: Analysis of Algorithms. Lecture 8 ( Amortized Analysis )

Grade 11/12 Math Circles Fall Nov. 5 Recurrences, Part 2

MA008/MIIZ01 Design and Analysis of Algorithms Lecture Notes 3

Randomized Sorting Algorithms Quick sort can be converted to a randomized algorithm by picking the pivot element randomly. In this case we can show th

Fall 2017 November 10, Written Homework 5

Algorithms Test 1. Question 1. (10 points) for (i = 1; i <= n; i++) { j = 1; while (j < n) {

Outline. 1 Merging. 2 Merge Sort. 3 Complexity of Sorting. 4 Merge Sort and Other Sorts 2 / 10

ITEC2620 Introduction to Data Structures

Analysis of Algorithms CMPSC 565

Lecture 14: Nov. 11 & 13

Data Structures and Algorithms

Divide-and-Conquer. Reading: CLRS Sections 2.3, 4.1, 4.2, 4.3, 28.2, CSE 6331 Algorithms Steve Lai

Divide-Conquer-Glue Algorithms

Introduction to Divide and Conquer

CSC236 Intro. to the Theory of Computation Lecture 7: Master Theorem; more D&C; correctness

Linear Time Selection

CSE 548: Analysis of Algorithms. Lectures 18, 19, 20 & 21 ( Randomized Algorithms & High Probability Bounds )

Review Of Topics. Review: Induction

In-Class Soln 1. CS 361, Lecture 4. Today s Outline. In-Class Soln 2

CS/COE 1501 cs.pitt.edu/~bill/1501/ Integer Multiplication

Algorithms, Design and Analysis. Order of growth. Table 2.1. Big-oh. Asymptotic growth rate. Types of formulas for basic operation count

Weighted Activity Selection

Data structures Exercise 1 solution. Question 1. Let s start by writing all the functions in big O notation:

Divide and conquer. Philip II of Macedon

Recurrences COMP 215

Asymptotic Analysis and Recurrences

CSC 8301 Design & Analysis of Algorithms: Lower Bounds

CS483 Design and Analysis of Algorithms

Divide&Conquer: MergeSort. Algorithmic Thinking Luay Nakhleh Department of Computer Science Rice University Spring 2014

Chapter 5 Divide and Conquer

Divide and Conquer Algorithms

Advanced Analysis of Algorithms - Midterm (Solutions)

Problem Set 1. CSE 373 Spring Out: February 9, 2016

Divide and Conquer: Polynomial Multiplication Version of October 1 / 7, 24201

Data Structures and Algorithms Chapter 3

Transcription:

CSE 613: Parallel Programming Lectures 13 14 ( Analyzing Divide-and-Conquer Algorithms ) Rezaul A. Chowdhury Department of Computer Science SUNY Stony Brook Spring 2015

A Useful Recurrence Consider the following recurrence: Θ 1, 1,, ; where, 1and 1. Arises frequently in the analyses of divide-and-conquer algorithms. Recall the following from the analyses of QSort(quicksort) in lecture 1. Serial: 2 Θ Parallel (with serial partition): Θ Parallel (with parallel partition): Θ log

How the Recurrence Unfolds Θ 1, 1,,.

How the Recurrence Unfolds Θ 1, 1,,.

How the Recurrence Unfolds Θ 1, 1,,.

How the Recurrence Unfolds Θ 1, 1,,.

How the Recurrence Unfolds Θ 1, 1,,.

How the Recurrence Unfolds Θ 1, 1,,.

How the Recurrence Unfolds Θ 1, 1,,.

How the Recurrence Unfolds Θ 1, 1,,.

How the Recurrence Unfolds Θ 1, 1,,. Θ

How the Recurrence Unfolds: Case 1 Θ 1, 1,,. Ο log _ for some constant 0. Sums Geometrically Increase Level by Level. Last Level Dominates. Θ Θ

How the Recurrence Unfolds: Case 2 Θ 1, 1,,. Θ log lg Sums Arithmetically Increase for some constant 0. Level by Level. No Level Dominates. Θ + Θ

How the Recurrence Unfolds: Case 3 Θ 1, 1,,. Ω log + & for constants 0& 1. Sums Geometrically decrease Level by Level. First Level Dominates. Θ Θ

The Master Theorem Θ 1, 1,, 1,1. Case 1: Ο log _ for some constant0 Θ log Case 2: Θ log lg for some constant0. Θ log lg +1 Case 3: Ω log + and for constants0and 1. Θ

Back to QSort Complexities Now let s try the QSort(quicksort) recurrences from lecture 1. Serial: 2 Θ Master Theorem Case 2: Θ log Parallel (with serial partition): Θ Master Theorem Case 3: Θ Parallel (with parallel partition): Θ log Master Theorem Case 2: Θ log 2

More Example Applications of Master Theorem Karatsuba salgorithm: 3 Θ Master Theorem Case 1: Θ Strassen smatrix Multiplication: 7 Θ Master Theorem Case 1: Θ Fast Fourier Transform: 2 Θ Master Theorem Case 2: Θ log

Recurrences not Solvable using the Master Theorem Example 1: is not a constant Example 2: 2 logis not a constant Example 3: is not 1 Example 4: 2 is not 1.

Recurrences not Solvable using the Master Theorem Example 5: 3 is not positive Example 6: 2 sin violates regularity condition of case 3 Example 7: 2 Ο, but Ο for any constant 0 Example 8: 2 and are not fixed

Multithreaded Matrix Multiplication

Parallel Iterative MM Iter-MM ( Z, X, Y ) { X, Y, Z are n n matrices, where n is a positive integer } 1. for i 1 to n do 2. for j 1 to n do 3. Z[ i ][ j ] 0 4. for k 1 to n do 5. Z[ i ][ j ] Z[ i ][ j ] + X[ i ][ k ] Y[ k ][ j ] Par-Iter-MM ( Z, X, Y ) { X, Y, Z are n n matrices, where n is a positive integer } 1. parallel for i 1 to n do 2. parallel for j 1 to n do 3. Z[ i ][ j ] 0 4. for k 1 to n do 5. Z[ i ][ j ] Z[ i ][ j ] + X[ i ][ k ] Y[ k ][ j ]

Parallel Iterative MM Par-Iter-MM ( Z, X, Y ) { X, Y, Z are n n matrices, where n is a positive integer } 1. parallel for i 1 to n do 2. parallel for j 1 to n do 3. Z[ i ][ j ] 0 4. for k 1 to n do 5. Z[ i ][ j ] Z[ i ][ j ] + X[ i ][ k ] Y[ k ][ j ] Work: Θ Span: Θ loglog Θ Parallelism: Θ

Parallel Recursive MM Z X Y n/2 n/2 n/2 n/2 Z 11 Z 12 n/2 X 11 X 12 n/2 Y 11 Y 12 n = n n Z 21 Z 22 X 21 X 22 Y 21 Y 22 n n n n/2 n/2 X 11 Y 11 + X 12 Y 21 X 11 Y 12 + X 12 Y 22 = n X 21 Y 11 + X 22 Y 21 X 21 Y 12 + X 22 Y 22 n

Parallel Recursive MM Par-Rec-MM ( Z, X, Y ) { X, Y, Z are n n matrices, where n = 2 k for integer k 0 } 1. if n = 1 then 2. Z Z + X Y 3. else 4. spawn Par-Rec-MM ( Z 11, X 11, Y 11 ) 5. spawn Par-Rec-MM ( Z 12, X 11, Y 12 ) 6. spawn Par-Rec-MM ( Z 21, X 21, Y 11 ) 7. Par-Rec-MM ( Z 21, X 21, Y 12 ) 8. sync 9. spawn Par-Rec-MM ( Z 11, X 12, Y 21 ) 10. spawn Par-Rec-MM ( Z 12, X 12, Y 22 ) 11. spawn Par-Rec-MM ( Z 21, X 22, Y 21 ) 12. Par-Rec-MM ( Z 22, X 22, Y 22 ) 13. sync 14. endif

Parallel Recursive MM Par-Rec-MM ( Z, X, Y ) { X, Y, Z are n n matrices, where n = 2 k for integer k 0 } 1. if n = 1 then 2. Z Z + X Y 3. else 4. spawn Par-Rec-MM ( Z 11, X 11, Y 11 ) 5. spawn Par-Rec-MM ( Z 12, X 11, Y 12 ) 6. spawn Par-Rec-MM ( Z 21, X 21, Y 11 ) 7. Par-Rec-MM ( Z 21, X 21, Y 12 ) 8. sync 9. spawn Par-Rec-MM ( Z 11, X 12, Y 21 ) 10. spawn Par-Rec-MM ( Z 12, X 12, Y 22 ) 11. spawn Par-Rec-MM ( Z 21, X 22, Y 21 ) 12. Par-Rec-MM ( Z 22, X 22, Y 22 ) 13. sync 14. endif Work: Θ 1, 1, 8 2 Θ 1,. Span: Θ [ MT Case 1 ] Θ 1, 1, 2 2 Θ 1,. Θ [ MT Case 1 ] Parallelism: Additional Space: Θ Θ 1

n/2 Recursive MM with More Parallelism Z n/2 n/2 Z 11 Z 12 n/2 X 11 Y 11 + X 12 Y 21 X 11 Y 12 + X 12 Y 22 n = n Z 21 Z 22 X 21 Y 11 + X 22 Y 21 X 21 Y 12 + X 22 Y 22 n n n/2 n/2 n/2 X 11 Y 11 = + X 11 Y 12 n/2 X 12 Y 21 X 12 Y 22 n n X 21 Y 11 X 21 Y 12 X 22 Y 21 X 22 Y 22 n n

Recursive MM with More Parallelism Par-Rec-MM2 ( Z, X, Y ) { X, Y, Z are n n matrices, where n = 2 k for integer k 0 } 1. if n = 1 then 2. Z Z + X Y 3. else { T is a temporary n n matrix } 4. spawn Par-Rec-MM2 ( Z 11, X 11, Y 11 ) 5. spawn Par-Rec-MM2 ( Z 12, X 11, Y 12 ) 6. spawn Par-Rec-MM2 ( Z 21, X 21, Y 11 ) 7. spawn Par-Rec-MM2 ( Z 21, X 21, Y 12 ) 8. spawn Par-Rec-MM2 ( T 11, X 12, Y 21 ) 9. spawn Par-Rec-MM2 ( T 12, X 12, Y 22 ) 10. spawn Par-Rec-MM2 ( T 21, X 22, Y 21 ) 11. Par-Rec-MM2 ( T 22, X 22, Y 22 ) 12. sync 13. parallel for i 1 to n do 14. parallel for j 1 to n do 15. Z[ i ][ j ] Z[ i ][ j ] + T[ i ][ j ] 16. endif

Recursive MM with More Parallelism Par-Rec-MM2 ( Z, X, Y ) { X, Y, Z are n n matrices, where n = 2 k for integer k 0 } 1. if n = 1 then 2. Z Z + X Y 3. else { T is a temporary n n matrix } 4. spawn Par-Rec-MM2 ( Z 11, X 11, Y 11 ) 5. spawn Par-Rec-MM2 ( Z 12, X 11, Y 12 ) 6. spawn Par-Rec-MM2 ( Z 21, X 21, Y 11 ) 7. spawn Par-Rec-MM2 ( Z 21, X 21, Y 12 ) 8. spawn Par-Rec-MM2 ( T 11, X 12, Y 21 ) 9. spawn Par-Rec-MM2 ( T 12, X 12, Y 22 ) 10. spawn Par-Rec-MM2 ( T 21, X 22, Y 21 ) 11. Par-Rec-MM2 ( T 22, X 22, Y 22 ) 12. sync 13. parallel for i 1 to n do 14. parallel for j 1 to n do 15. Z[ i ][ j ] Z[ i ][ j ] + T[ i ][ j ] 16. endif Work: Θ 1, 1, 8 2 Θ,. Span: Θ [ MT Case 1 ] Θ 1, 1, 2 Θ log,. Θ log 2 [ MT Case 2 ] Parallelism: Additional Space: Θ Θ 1, 1, 8 2 Θ,. Θ [ MT Case 1 ]

Multithreaded Merge Sort

Parallel Merge Sort Merge-Sort ( A, p, r ) { sort the elements in A[ p r ] } 1. if p < r then 2. q ( p + r ) / 2 3. Merge-Sort ( A, p, q ) 4. Merge-Sort ( A, q + 1, r ) 5. Merge ( A, p, q, r ) Par-Merge-Sort ( A, p, r ) { sort the elements in A[ p r ] } 1. if p < r then 2. q ( p + r ) / 2 3. spawn Merge-Sort ( A, p, q ) 4. Merge-Sort ( A, q + 1, r ) 5. sync 6. Merge ( A, p, q, r )

Parallel Merge Sort Par-Merge-Sort ( A, p, r ) { sort the elements in A[ p r ] } 1. if p < r then 2. q ( p + r ) / 2 3. spawn Merge-Sort ( A, p, q ) 4. Merge-Sort ( A, q + 1, r ) 5. sync 6. Merge ( A, p, q, r ) Work: Span: Θ 1, 1, 2 2 Θ,. Θ log [ MT Case 2 ] Θ 1, 1, 2 Θ,. Parallelism: Θ [ MT Case 3 ] Θ log

Source:Cormenet al., Introduction to Algorithms, 3 rd Edition subarrays to merge: suppose: merged output: Parallel Merge 1 1...... 1

subarrays to merge: Parallel Merge 1 1.... suppose: Source:Cormenet al., Introduction to Algorithms, 3 rd Edition merged output:.. 1 Step 1: Find, where is the midpoint of..

subarrays to merge: Parallel Merge 1 1.... suppose: Source:Cormenet al., Introduction to Algorithms, 3 rd Edition merged output:.. 1 Step 2: Use binary search to find the index in subarray.. so that the subarraywould still be sorted if we insert between 1 and

subarrays to merge: Parallel Merge 1 1.... suppose: Source:Cormenet al., Introduction to Algorithms, 3 rd Edition merged output:.. 1 Step 3:Copy to, where

subarrays to merge: Parallel Merge 1 1.... suppose: Source:Cormenet al., Introduction to Algorithms, 3 rd Edition merged output:.. 1 Perform the following two steps in parallel. Step 4(a):Recursively merge.. 1 with.. 1, and place the result into.. 1

subarrays to merge: Parallel Merge 1 1.... suppose: Source:Cormenet al., Introduction to Algorithms, 3 rd Edition merged output:.. 1 Perform the following two steps in parallel. Step 4(a):Recursively merge.. 1 with.. 1, and place the result into.. 1 Step 4(b):Recursively merge 1.. with 1.., and place the result into 1..

Parallel Merge Par-Merge ( T, p 1, r 1, p 2, r 2, A, p 3 ) 1. n 1 r 1 p 1 + 1, n 2 r 2 p 2 + 1 2. if n 1 < n 2 then 3. p 1 p 2, r 1 r 2, n 1 n 2 4. if n 1 = 0 then return 5. else 6. q 1 ( p 1 + r 1 ) / 2 7. q 2 Binary-Search ( T[ q 1 ], T, p 2, r 2 ) 8. q 3 p 3 + ( q 1 p 1 ) + ( q 2 p 2 ) 9. A[ q 3 ] T[ q 1 ] 10. spawn Par-Merge ( T, p 1, q 1-1, p 2, q 2-1, A, p 3 ) 11. Par-Merge ( T, q 1 +1, r 1, q 2 +1, r 2, A, q 3 +1 ) 12. sync

Parallel Merge Par-Merge ( T, p 1, r 1, p 2, r 2, A, p 3 ) 1. n 1 r 1 p 1 + 1, n 2 r 2 p 2 + 1 2. if n 1 < n 2 then 3. p 1 p 2, r 1 r 2, n 1 n 2 4. if n 1 = 0 then return 5. else 6. q 1 ( p 1 + r 1 ) / 2 7. q 2 Binary-Search ( T[ q 1 ], T, p 2, r 2 ) 8. q 3 p 3 + ( q 1 p 1 ) + ( q 2 p 2 ) 9. A[ q 3 ] T[ q 1 ] We have, 2 In the worst case, a recursive call in lines 9-10 merges half the elements of.. with all elements of... 10. spawn Par-Merge ( T, p 1, q 1-1, p 2, q 2-1, A, p 3 ) 11. Par-Merge ( T, q 1 +1, r 1, q 2 +1, r 2, A, q 3 +1 ) 12. sync Hence, #elements involved in such a call: 2 2 2 2 2 2 4 2 4 3 4

Parallel Merge Par-Merge ( T, p 1, r 1, p 2, r 2, A, p 3 ) 1. n 1 r 1 p 1 + 1, n 2 r 2 p 2 + 1 2. if n 1 < n 2 then 3. p 1 p 2, r 1 r 2, n 1 n 2 4. if n 1 = 0 then return 5. else 6. q 1 ( p 1 + r 1 ) / 2 7. q 2 Binary-Search ( T[ q 1 ], T, p 2, r 2 ) 8. q 3 p 3 + ( q 1 p 1 ) + ( q 2 p 2 ) 9. A[ q 3 ] T[ q 1 ] 10. spawn Par-Merge ( T, p 1, q 1-1, p 2, q 2-1, A, p 3 ) 11. Par-Merge ( T, q 1 +1, r 1, q 2 +1, r 2, A, q 3 +1 ) 12. sync Span: Θ 1, 1, 3 4 Θ log,. Θ log 2 [ MT Case 2 ] Work: Clearly, Ω We show below that, Ο For some,, we have the following recurrence, 1 Ο log Assuming logfor positive constants and, and substituting on the right hand side of the above recurrence gives us: log Ο. Hence, Θ.

Parallel Merge Sort with Parallel Merge Par-Merge-Sort ( A, p, r ) { sort the elements in A[ p r ] } 1. if p < r then 2. q ( p + r ) / 2 3. spawn Merge-Sort ( A, p, q ) 4. Merge-Sort ( A, q + 1, r ) 5. sync 6. Par-Merge ( A, p, q, r ) Work: Span: Θ 1, 1, 2 2 Θ,. Θ log [ MT Case 2 ] Θ 1, 1, 2 Θ log2,. Parallelism: Θ log 3 [ MT Case 2 ] Θ