CSE 613: Parallel Programming. Lecture 8 ( Analyzing Divide-and-Conquer Algorithms )

Similar documents
CSE 613: Parallel Programming. Lectures ( Analyzing Divide-and-Conquer Algorithms )

CSE 613: Parallel Programming. Lecture 9 ( Divide-and-Conquer: Partitioning for Selection and Sorting )

CSE 548: Analysis of Algorithms. Lecture 12 ( Analyzing Parallel Algorithms )

CSE 548: Analysis of Algorithms. Lecture 4 ( Divide-and-Conquer Algorithms: Polynomial Multiplication )

Algorithm Analysis Recurrence Relation. Chung-Ang University, Jaesung Lee

CSE548, AMS542: Analysis of Algorithms, Spring 2014 Date: May 12. Final In-Class Exam. ( 2:35 PM 3:50 PM : 75 Minutes )

Lecture 22: Multithreaded Algorithms CSCI Algorithms I. Andrew Rosenberg

Divide-and-conquer: Order Statistics. Curs: Fall 2017

Data Structures and Algorithms CSE 465

Analysis of Multithreaded Algorithms

Divide and Conquer. Recurrence Relations

V. Adamchik 1. Recurrences. Victor Adamchik Fall of 2005

CS161: Algorithm Design and Analysis Recitation Section 3 Stanford University Week of 29 January, Problem 3-1.

COMP Analysis of Algorithms & Data Structures

CS 577 Introduction to Algorithms: Strassen s Algorithm and the Master Theorem

Data Structures and Algorithms CSE 465

COMP Analysis of Algorithms & Data Structures

The maximum-subarray problem. Given an array of integers, find a contiguous subarray with the maximum sum. Very naïve algorithm:

Inf 2B: Sorting, MergeSort and Divide-and-Conquer

Chapter 4 Divide-and-Conquer

The Divide-and-Conquer Design Paradigm

Mergesort and Recurrences (CLRS 2.3, 4.4)

Divide and Conquer. Andreas Klappenecker

Chapter 2. Recurrence Relations. Divide and Conquer. Divide and Conquer Strategy. Another Example: Merge Sort. Merge Sort Example. Merge Sort Example

CSE 591 Homework 3 Sample Solutions. Problem 1

Partition and Select

Algorithms And Programming I. Lecture 5 Quicksort

Algorithm Design and Analysis

Introduction to Algorithms 6.046J/18.401J/SMA5503

Outline. 1 Introduction. Merging and MergeSort. 3 Analysis. 4 Reference

A design paradigm. Divide and conquer: (When) does decomposing a problem into smaller parts help? 09/09/ EECS 3101

Class Note #14. In this class, we studied an algorithm for integer multiplication, which. 2 ) to θ(n

Analysis of Algorithms I: Asymptotic Notation, Induction, and MergeSort

Divide-Conquer-Glue. Divide-Conquer-Glue Algorithm Strategy. Skyline Problem as an Example of Divide-Conquer-Glue

Divide and Conquer Algorithms. CSE 101: Design and Analysis of Algorithms Lecture 14

Lecture 4. Quicksort

Design and Analysis of Algorithms

Design and Analysis of Algorithms

Outline. 1 Introduction. 3 Quicksort. 4 Analysis. 5 References. Idea. 1 Choose an element x and reorder the array as follows:

CSE548, AMS542: Analysis of Algorithms, Fall 2017 Date: October 11. In-Class Midterm. ( 7:05 PM 8:20 PM : 75 Minutes )

COL106: Data Structures and Algorithms (IIT Delhi, Semester-II )

CSE 548: Analysis of Algorithms. Lecture 8 ( Amortized Analysis )

Grade 11/12 Math Circles Fall Nov. 5 Recurrences, Part 2

MA008/MIIZ01 Design and Analysis of Algorithms Lecture Notes 3

Randomized Sorting Algorithms Quick sort can be converted to a randomized algorithm by picking the pivot element randomly. In this case we can show th

Algorithms Test 1. Question 1. (10 points) for (i = 1; i <= n; i++) { j = 1; while (j < n) {

Outline. 1 Merging. 2 Merge Sort. 3 Complexity of Sorting. 4 Merge Sort and Other Sorts 2 / 10

Asymptotic Algorithm Analysis & Sorting

Analysis of Algorithms - Using Asymptotic Bounds -

ITEC2620 Introduction to Data Structures

Analysis of Algorithms CMPSC 565

Divide and Conquer. CSE21 Winter 2017, Day 9 (B00), Day 6 (A00) January 30,

Chapter 1 Divide and Conquer Algorithm Theory WS 2016/17 Fabian Kuhn

Lecture 14: Nov. 11 & 13

Data Structures and Algorithms

CS 470/570 Divide-and-Conquer. Format of Divide-and-Conquer algorithms: Master Recurrence Theorem (simpler version)

Divide-Conquer-Glue Algorithms

Introduction to Divide and Conquer

CPS 616 DIVIDE-AND-CONQUER 6-1

Divide and Conquer. Arash Rafiey. 27 October, 2016

CSC236 Intro. to the Theory of Computation Lecture 7: Master Theorem; more D&C; correctness

Linear Time Selection

Algorithms and Data Structures Strassen s Algorithm. ADS (2017/18) Lecture 4 slide 1

Fast Convolution; Strassen s Method

CSE 548: Analysis of Algorithms. Lectures 18, 19, 20 & 21 ( Randomized Algorithms & High Probability Bounds )

Review Of Topics. Review: Induction

In-Class Soln 1. CS 361, Lecture 4. Today s Outline. In-Class Soln 2

Weighted Activity Selection

Data structures Exercise 1 solution. Question 1. Let s start by writing all the functions in big O notation:

Recurrences COMP 215

Divide and Conquer. Andreas Klappenecker. [based on slides by Prof. Welch]

Divide-and-conquer. Curs 2015

Divide and Conquer Strategy

CMPS 2200 Fall Divide-and-Conquer. Carola Wenk. Slides courtesy of Charles Leiserson with changes and additions by Carola Wenk

Fall 2017 November 10, Written Homework 5

Tutorials. Algorithms and Data Structures Strassen s Algorithm. The Master Theorem for solving recurrences. The Master Theorem (cont d)

Divide&Conquer: MergeSort. Algorithmic Thinking Luay Nakhleh Department of Computer Science Rice University Spring 2014

Problem Set 1. CSE 373 Spring Out: February 9, 2016

CSCI Honor seminar in algorithms Homework 2 Solution

IS 709/809: Computational Methods in IS Research Fall Exam Review

Data Structures and Algorithms Chapter 3

Advanced Analysis of Algorithms - Homework I (Solutions)

Introduction to Randomized Algorithms: Quick Sort and Quick Selection

The Master Theorem for solving recurrences. Algorithms and Data Structures Strassen s Algorithm. Tutorials. The Master Theorem (cont d)

Multi-threading model

Algorithms. Jordi Planes. Escola Politècnica Superior Universitat de Lleida

CSE 421 Algorithms. T(n) = at(n/b) + n c. Closest Pair Problem. Divide and Conquer Algorithms. What you really need to know about recurrences

Divide & Conquer. CS 320, Fall Dr. Geri Georg, Instructor CS320 Div&Conq 1

Week 5: Quicksort, Lower bound, Greedy

Lecture 5: Loop Invariants and Insertion-sort

Quiz 1 Solutions. Problem 2. Asymptotics & Recurrences [20 points] (3 parts)

Divide and Conquer Algorithms

Algorithm efficiency analysis

CS 240A : Examples with Cilk++

Data Structures in Java

Notes for Recitation 14

Fundamental Algorithms

What we have learned What is algorithm Why study algorithm The time and space efficiency of algorithm The analysis framework of time efficiency Asympt

Fundamental Algorithms

CS 4407 Algorithms Lecture 3: Iterative and Divide and Conquer Algorithms

Transcription:

CSE 613: Parallel Programming Lecture 8 ( Analyzing Divide-and-Conquer Algorithms ) Rezaul A. Chowdhury Department of Computer Science SUNY Stony Brook Spring 2012

A Useful Recurrence Consider the following recurrence: Θ 1, 1,, ; where, 1and 1. Arises frequently in the analyses of divide-and-conquer algorithms. Recall the following from the analyses of QSort(quicksort) in lecture 1. Serial: 2 Θ Parallel (with serial partition): Θ Parallel (with parallel partition): Θ log

How the Recurrence Unfolds Θ 1, 1,,.

How the Recurrence Unfolds Θ 1, 1,,.

How the Recurrence Unfolds Θ 1, 1,,.

How the Recurrence Unfolds Θ 1, 1,,.

How the Recurrence Unfolds Θ 1, 1,,.

How the Recurrence Unfolds Θ 1, 1,,.

How the Recurrence Unfolds Θ 1, 1,,.

How the Recurrence Unfolds Θ 1, 1,,.

How the Recurrence Unfolds Θ 1, 1,,. Θ

How the Recurrence Unfolds: Case 1 Θ 1, 1,,. Ο log _ for some constant 0. Sums Geometrically Increase Level by Level. Last Level Dominates. Θ Θ

How the Recurrence Unfolds: Case 2 Θ 1, 1,,. Θ log lg Sums Arithmetically Increase for some constant 0. Level by Level. No Level Dominates. Θ + Θ

How the Recurrence Unfolds: Case 3 Θ 1, 1,,. Ω log + & for constants 0& 1. Sums Geometrically decrease Level by Level. First Level Dominates. Θ Θ

The Master Theorem Θ 1, 1,, 1,1. Case 1: Ο log _ for some constant0 Θ log Case 2: Θ log lg for some constant0. Θ log lg +1 Case 3: Ω log + and for constants0and 1. Θ

Back to QSort Complexities Now let s try the QSort(quicksort) recurrences from lecture 1. Serial: 2 Θ Master Theorem Case 2: Θ log Parallel (with serial partition): Θ Master Theorem Case 3: Θ Parallel (with parallel partition): Θ log Master Theorem Case 2: Θ log 2

Multithreaded Matrix Multiplication

Parallel Iterative MM Iter-MM ( Z, X, Y ) { X, Y, Z are n n matrices, where n is a positive integer } 1. for i 1 to n do 2. for j 1 to n do 3. Z[ i ][ j ] 0 4. for k 1 to n do 5. Z[ i ][ j ] Z[ i ][ j ] + X[ i ][ k ] Y[ k ][ j ] Par-Iter-MM ( Z, X, Y ) { X, Y, Z are n n matrices, where n is a positive integer } 1. parallel for i 1 to n do 2. parallel for j 1 to n do 3. Z[ i ][ j ] 0 4. for k 1 to n do 5. Z[ i ][ j ] Z[ i ][ j ] + X[ i ][ k ] Y[ k ][ j ]

Parallel Iterative MM Par-Iter-MM ( Z, X, Y ) { X, Y, Z are n n matrices, where n is a positive integer } 1. parallel for i 1 to n do 2. parallel for j 1 to n do 3. Z[ i ][ j ] 0 4. for k 1 to n do 5. Z[ i ][ j ] Z[ i ][ j ] + X[ i ][ k ] Y[ k ][ j ] Work: Θ Span: Θ loglog Θ Parallelism: Θ

Parallel Recursive MM Z X Y n/2 n/2 n/2 n/2 Z 11 Z 12 n/2 X 11 X 12 n/2 Y 11 Y 12 n = n n Z 21 Z 22 X 21 X 22 Y 21 Y 22 n n n n/2 n/2 X 11 Y 11 + X 12 Y 21 X 11 Y 12 + X 12 Y 22 = n X 21 Y 11 + X 22 Y 21 X 21 Y 12 + X 22 Y 22 n

Parallel Recursive MM Par-Rec-MM ( Z, X, Y ) { X, Y, Z are n n matrices, where n = 2 k for integer k 0 } 1. if n = 1 then 2. Z Z + X Y 3. else 4. spawn Par-Rec-MM ( Z 11, X 11, Y 11 ) 5. spawn Par-Rec-MM ( Z 12, X 11, Y 12 ) 6. spawn Par-Rec-MM ( Z 21, X 21, Y 11 ) 7. Par-Rec-MM ( Z 21, X 21, Y 12 ) 8. sync 9. spawn Par-Rec-MM ( Z 11, X 12, Y 21 ) 10. spawn Par-Rec-MM ( Z 12, X 12, Y 22 ) 11. spawn Par-Rec-MM ( Z 21, X 22, Y 21 ) 12. Par-Rec-MM ( Z 22, X 22, Y 22 ) 13. sync 14. endif

Parallel Recursive MM Par-Rec-MM ( Z, X, Y ) { X, Y, Z are n n matrices, where n = 2 k for integer k 0 } 1. if n = 1 then 2. Z Z + X Y 3. else 4. spawn Par-Rec-MM ( Z 11, X 11, Y 11 ) 5. spawn Par-Rec-MM ( Z 12, X 11, Y 12 ) 6. spawn Par-Rec-MM ( Z 21, X 21, Y 11 ) 7. Par-Rec-MM ( Z 21, X 21, Y 12 ) 8. sync 9. spawn Par-Rec-MM ( Z 11, X 12, Y 21 ) 10. spawn Par-Rec-MM ( Z 12, X 12, Y 22 ) 11. spawn Par-Rec-MM ( Z 21, X 22, Y 21 ) 12. Par-Rec-MM ( Z 22, X 22, Y 22 ) 13. sync 14. endif Work: Θ 1, 1, 8 2 Θ 1,. Span: Θ [ MT Case 1 ] Θ 1, 1, 2 2 Θ 1,. Θ [ MT Case 1 ] Parallelism: Additional Space: Θ 1 Θ

n/2 Recursive MM with More Parallelism Z n/2 n/2 Z 11 Z 12 n/2 X 11 Y 11 + X 12 Y 21 X 11 Y 12 + X 12 Y 22 n = n Z 21 Z 22 X 21 Y 11 + X 22 Y 21 X 21 Y 12 + X 22 Y 22 n n n/2 n/2 n/2 X 11 Y 11 = + X 11 Y 12 n/2 X 12 Y 21 X 12 Y 22 n n X 21 Y 11 X 21 Y 12 X 22 Y 21 X 22 Y 22 n n

Recursive MM with More Parallelism Par-Rec-MM2 ( Z, X, Y ) { X, Y, Z are n n matrices, where n = 2 k for integer k 0 } 1. if n = 1 then 2. Z Z + X Y 3. else { T is a temporary n n matrix } 4. spawn Par-Rec-MM2 ( Z 11, X 11, Y 11 ) 5. spawn Par-Rec-MM2 ( Z 12, X 11, Y 12 ) 6. spawn Par-Rec-MM2 ( Z 21, X 21, Y 11 ) 7. spawn Par-Rec-MM2 ( Z 21, X 21, Y 12 ) 8. spawn Par-Rec-MM2 ( T 11, X 12, Y 21 ) 9. spawn Par-Rec-MM2 ( T 12, X 12, Y 22 ) 10. spawn Par-Rec-MM2 ( T 21, X 22, Y 21 ) 11. Par-Rec-MM2 ( T 22, X 22, Y 22 ) 12. sync 13. parallel for i 1 to n do 14. parallel for j 1 to n do 15. Z[ i ][ j ] Z[ i ][ j ] + T[ i ][ j ] 16. endif

Recursive MM with More Parallelism Par-Rec-MM2 ( Z, X, Y ) { X, Y, Z are n n matrices, where n = 2 k for integer k 0 } 1. if n = 1 then 2. Z Z + X Y 3. else { T is a temporary n n matrix } 4. spawn Par-Rec-MM2 ( Z 11, X 11, Y 11 ) 5. spawn Par-Rec-MM2 ( Z 12, X 11, Y 12 ) 6. spawn Par-Rec-MM2 ( Z 21, X 21, Y 11 ) 7. spawn Par-Rec-MM2 ( Z 21, X 21, Y 12 ) 8. spawn Par-Rec-MM2 ( T 11, X 12, Y 21 ) 9. spawn Par-Rec-MM2 ( T 12, X 12, Y 22 ) 10. spawn Par-Rec-MM2 ( T 21, X 22, Y 21 ) 11. Par-Rec-MM2 ( T 22, X 22, Y 22 ) 12. sync 13. parallel for i 1 to n do 14. parallel for j 1 to n do 15. Z[ i ][ j ] Z[ i ][ j ] + T[ i ][ j ] 16. endif Work: Θ 1, 1, 8 2 Θ,. Span: Θ [ MT Case 1 ] Θ 1, 1, 2 Θ log,. Θ log 2 [ MT Case 2 ] Parallelism: Θ Additional Space: Θ 1, 1, 8 2 Θ,. Θ [ MT Case 1 ]

Multithreaded Merge Sort

Parallel Merge Sort Merge-Sort ( A, p, r ) { sort the elements in A[ p r ] } 1. if p < r then 2. q ( p + r ) / 2 3. Merge-Sort ( A, p, q ) 4. Merge-Sort ( A, q + 1, r ) 5. Merge ( A, p, q, r ) Par-Merge-Sort ( A, p, r ) { sort the elements in A[ p r ] } 1. if p < r then 2. q ( p + r ) / 2 3. spawn Merge-Sort ( A, p, q ) 4. Merge-Sort ( A, q + 1, r ) 5. sync 6. Merge ( A, p, q, r )

Parallel Merge Sort Par-Merge-Sort ( A, p, r ) { sort the elements in A[ p r ] } 1. if p < r then 2. q ( p + r ) / 2 3. spawn Merge-Sort ( A, p, q ) 4. Merge-Sort ( A, q + 1, r ) 5. sync 6. Merge ( A, p, q, r ) Work: Θ 1, 1, 2 2 Θ,. Θ log [ MT Case 2 ] Span: Θ 1, 1, 2 Θ,. Θ [ MT Case 3 ] Parallelism: Θ log

Source:Cormenet al., Introduction to Algorithms, 3 rd Edition subarrays to merge: suppose: merged output: Parallel Merge 1 1...... 1

subarrays to merge: Parallel Merge 1 1.... suppose: Source:Cormenet al., Introduction to Algorithms, 3 rd Edition merged output:.. 1 Step 1: Find, where is the midpoint of..

subarrays to merge: Parallel Merge 1 1.... suppose: Source:Cormenet al., Introduction to Algorithms, 3 rd Edition merged output:.. 1 Step 2: Use binary search to find the index in subarray.. so that the subarraywould still be sorted if we insert between 1 and

subarrays to merge: Parallel Merge 1 1.... suppose: Source:Cormenet al., Introduction to Algorithms, 3 rd Edition merged output:.. 1 Step 3:Copy to, where

subarrays to merge: Parallel Merge 1 1.... suppose: Source:Cormenet al., Introduction to Algorithms, 3 rd Edition merged output:.. 1 Perform the following two steps in parallel. Step 4(a):Recursively merge.. 1 with.. 1, and place the result into.. 1

subarrays to merge: Parallel Merge 1 1.... suppose: Source:Cormenet al., Introduction to Algorithms, 3 rd Edition merged output:.. 1 Perform the following two steps in parallel. Step 4(a):Recursively merge.. 1 with.. 1, and place the result into.. 1 Step 4(b):Recursively merge 1.. with 1.., and place the result into 1..

Parallel Merge Par-Merge ( T, p 1, r 1, p 2, r 2, A, p 3 ) 1. n 1 r 1 p 1 + 1, n 2 r 2 p 2 + 1 2. if n 1 < n 2 then 3. p 1 p 2, r 1 r 2, n 1 n 2 4. if n 1 = 0 then return 5. else 6. q 1 ( p 1 + r 1 ) / 2 7. q 2 Binary-Search ( T[ q 1 ], T, p 2, r 2 ) 8. q 3 p 3 + ( q 1 p 1 ) + ( q 2 p 2 ) 9. A[ q 3 ] T[ q 1 ] 10. spawn Par-Merge ( T, p 1, q 1-1, p 2, q 2-1, A, p 3 ) 11. Par-Merge ( T, q 1 +1, r 1, q 2 +1, r 2, A, q 3 +1 ) 12. sync

Parallel Merge Par-Merge ( T, p 1, r 1, p 2, r 2, A, p 3 ) 1. n 1 r 1 p 1 + 1, n 2 r 2 p 2 + 1 2. if n 1 < n 2 then 3. p 1 p 2, r 1 r 2, n 1 n 2 4. if n 1 = 0 then return 5. else 6. q 1 ( p 1 + r 1 ) / 2 7. q 2 Binary-Search ( T[ q 1 ], T, p 2, r 2 ) 8. q 3 p 3 + ( q 1 p 1 ) + ( q 2 p 2 ) 9. A[ q 3 ] T[ q 1 ] We have, 2 In the worst case, a recursive call in lines 9-10 merges half the elements of.. with all elements of... 10. spawn Par-Merge ( T, p 1, q 1-1, p 2, q 2-1, A, p 3 ) 11. Par-Merge ( T, q 1 +1, r 1, q 2 +1, r 2, A, q 3 +1 ) 12. sync Hence, #elements involved in such a call: 2 2 2 2 2 2 4 2 4 3 4

Parallel Merge Par-Merge ( T, p 1, r 1, p 2, r 2, A, p 3 ) 1. n 1 r 1 p 1 + 1, n 2 r 2 p 2 + 1 2. if n 1 < n 2 then 3. p 1 p 2, r 1 r 2, n 1 n 2 4. if n 1 = 0 then return 5. else 6. q 1 ( p 1 + r 1 ) / 2 7. q 2 Binary-Search ( T[ q 1 ], T, p 2, r 2 ) 8. q 3 p 3 + ( q 1 p 1 ) + ( q 2 p 2 ) 9. A[ q 3 ] T[ q 1 ] 10. spawn Par-Merge ( T, p 1, q 1-1, p 2, q 2-1, A, p 3 ) 11. Par-Merge ( T, q 1 +1, r 1, q 2 +1, r 2, A, q 3 +1 ) 12. sync Span: Θ 1, 1, 3 4 Θ log,. Θ log 2 [ MT Case 2 ] Work: Clearly, Ω We show below that, Ο For some,, we have the following recurrence, 1 Ο log Assuming log for positive constants and, and substituting on the right hand side of the above recurrence gives us: log Ο. Hence, Θ.

Parallel Merge Sort with Parallel Merge Par-Merge-Sort ( A, p, r ) { sort the elements in A[ p r ] } 1. if p < r then 2. q ( p + r ) / 2 3. spawn Merge-Sort ( A, p, q ) 4. Merge-Sort ( A, q + 1, r ) 5. sync 6. Par-Merge ( A, p, q, r ) Work: Θ 1, 1, 2 2 Θ,. Θ log [ MT Case 2 ] Span: Θ 1, 1, 2 Θ log2,. Θ log 3 [ MT Case 2 ] Parallelism: Θ