Algorithms PART II: Partitioning and Divide & Conquer. HPC Fall 2007 Prof. Robert van Engelen


1 Algorithms PART II: Partitioning and Divide & Conquer HPC Fall 2007 Prof. Robert van Engelen

2 Overview
- Partitioning strategies
- Divide and conquer strategies
- Further reading

3 Partitioning Strategies
Data partitioning
- Perform domain decomposition to run parallel tasks on subdomains
- Scatter-compute-gather, where local computation may require communication and scatter/gather may involve computations
[Figure: block partitioning of a 2D domain]
Task partitioning
- Decompose functions into independent subfunctions and execute the subfunctions in parallel
    function f(x,y)
      u = g(x)
      v = h(y)
      return u+v
    end
[Figure: Thread 1 computes u = g(x) while Thread 2 computes v = h(y); then u+v is returned]
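The task-partitioning example above can be sketched in Python as follows. This is only an illustration under stated assumptions: the bodies of g and h are hypothetical placeholders (the slide does not define them), and a thread pool from the standard library stands in for the two threads.

```python
from concurrent.futures import ThreadPoolExecutor


def g(x):
    # Hypothetical subfunction; the slide leaves g unspecified.
    return x * x


def h(y):
    # Hypothetical subfunction; the slide leaves h unspecified.
    return y + 1


def f(x, y):
    # g(x) and h(y) are independent, so they can be submitted as two tasks.
    with ThreadPoolExecutor(max_workers=2) as pool:
        u = pool.submit(g, x)
        v = pool.submit(h, y)
        # Combining the results waits for both tasks to finish.
        return u.result() + v.result()
```

In standard CPython, threads show the structure of the decomposition rather than a real speedup for CPU-bound work; a process pool would be the usual substitute in that case.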

4 Partitioning Strategies
Partitioning strategy (data partitioning):
1. Break up a given problem into P independent subproblems
2. Solve the P subproblems concurrently
3. Collect and combine the P solutions
Embarrassingly parallel: a simple form of data partitioning with no initial work and no interaction between workers

5 Partitioning Example 1: Summation
Summation of n values $X = [x_1, \ldots, x_n]$
1. Divide X into P equally-sized sublists $X_p$, $p = 0, \ldots, P-1$, and distribute the $X_p$ sublists to the P processors
2. The processors sum the local parts $s_p = \sum X_p$
3. Combine the local sums $s = \sum_p s_p$
Algorithms:
1. Scatter list X using a scatter-tree
2. Serial summation of parts
3. Reduce local sums
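The three steps above can be sketched sequentially in Python. This is a minimal simulation, assuming the P "processors" are just list slices handled in turn; the function name partitioned_sum and the returned round count are illustrative additions, not part of the slides.

```python
def partitioned_sum(values, num_parts):
    """Return (total, reduction_rounds) for a partitioned summation."""
    n = len(values)
    # Step 1: split X into num_parts nearly equal contiguous sublists X_p.
    bounds = [n * p // num_parts for p in range(num_parts + 1)]
    parts = [values[bounds[p]:bounds[p + 1]] for p in range(num_parts)]
    # Step 2: local sums s_p (each would run on its own processor).
    partial = [sum(part) for part in parts]
    # Step 3: pairwise tree reduction of the local sums.
    rounds = 0
    while len(partial) > 1:
        partial = [sum(partial[i:i + 2]) for i in range(0, len(partial), 2)]
        rounds += 1
    return partial[0], rounds
```

With 8 parts the reduction takes 3 rounds, matching the log_2(P) reduce steps in the analysis.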

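6 Partitioning Example 1: Summation
[Figure: scatter tree (divide) with messages of n/2, n/4, n/8 values, followed by local summations and a reduce tree (combine); time runs downward]
- Scatter (divide): $\log_2(P)$ steps; total amount of data transferred: $\frac{n}{2} \log_2(P)$
- Local summations: $n/P$ steps
- Reduce (combine): $\log_2(P)$ steps; total amount of data transferred: $P-1$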

7 Partitioning Example 1: Summation
Communication time
- Scatter: $t_{comm1} = \sum_{k=1}^{\log_2 P} (t_{startup} + 2^{-k} n\, t_{data}) = \log_2(P)\, t_{startup} + \frac{n(P-1)}{P}\, t_{data}$
- Reduce: $t_{comm2} = \log_2(P)\, (t_{startup} + t_{data})$
- Total: $t_{comm} = 2 \log_2(P)\, t_{startup} + \left( \frac{n(P-1)}{P} + \log_2(P) \right) t_{data}$
Computation time
- Local sum: $t_{comp1} = n/P$
- Global sum: $t_{comp2} = \log_2(P)$
- Total: $t_{comp} = n/P + \log_2(P)$
Speedup, assuming $t_{startup} = 0$
- Sequential time: $t_s = n - 1$
- Parallel time: $t_P = \left( \frac{n(P-1)}{P} + \log_2(P) \right) t_{data} + n/P + \log_2(P)$
- Speedup: $S_P = t_s / t_P = O(n / (n + \log P))$
- Best speedup w/o communication: $S_P = O(P / \log P)$
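The cost model above is easy to evaluate numerically. The sketch below simply transcribes the formulas; the function name summation_times and the default parameter values are assumptions made for illustration, and times are in units of one addition.

```python
import math


def summation_times(n, P, t_startup=0.0, t_data=1.0):
    """Return (t_s, t_P, S_P) for the partitioned summation model."""
    log_p = math.log2(P)
    # Scatter plus reduce communication.
    t_comm = 2 * log_p * t_startup + (n * (P - 1) / P + log_p) * t_data
    # Local sums plus the global tree sum.
    t_comp = n / P + log_p
    t_seq = n - 1
    t_par = t_comm + t_comp
    return t_seq, t_par, t_seq / t_par
```

For n = 1024 and P = 8 with t_data = 1, the scatter term dominates and the model predicts a speedup below 1, which is what the O(n / (n + log P)) bound suggests. Setting t_data = 0 leaves only the computation terms.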

8 General M-Ary Partitioning
Example: partitioning an image, e.g. to compute a histogram in parallel
[Figure: 3-level 4-ary partitioning for $4^3 = 64$ processors: first, second, and third division (divide), then compute, then combine; time runs downward]
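A small Python sketch of the image-histogram example, assuming the image is a list of rows of integer pixel values and that each level splits a region into four quadrants. The names quadrants and histogram are hypothetical, and the recursion runs sequentially here where the slide assigns each leaf region to its own processor.

```python
from collections import Counter


def quadrants(r0, r1, c0, c1):
    # Split the region [r0, r1) x [c0, c1) into four sub-regions.
    rm, cm = (r0 + r1) // 2, (c0 + c1) // 2
    return [(r0, rm, c0, cm), (r0, rm, cm, c1),
            (rm, r1, c0, cm), (rm, r1, cm, c1)]


def histogram(image, r0, r1, c0, c1, levels):
    if levels == 0:
        # Compute: each leaf region would be handled by one processor.
        return Counter(image[r][c]
                       for r in range(r0, r1) for c in range(c0, c1))
    # Divide into 4 parts, then combine the partial histograms.
    total = Counter()
    for q in quadrants(r0, r1, c0, c1):
        total += histogram(image, *q, levels - 1)
    return total
```

Calling histogram(image, 0, 8, 0, 8, 3) on an 8 by 8 image yields 4^3 = 64 leaf regions, one per processor in the slide's configuration.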

9 Partitioning Example 2: Parallel Bucket Sort
Bucket sort of a list of values bounded within a range [lo, hi]
1. Partition the n values into P segments of n/P values each
2a. Sort each segment into P small buckets (local computation)
2b. Send the contents of the small buckets to P large buckets
3. Sort the P large buckets and merge the lists
[Figure: unsorted values, distributed over P processors, placed into small buckets; small buckets emptied into large buckets; bucket contents sorted and merged into the sorted values]

10 Partitioning Example 2: Parallel Bucket Sort
Input: list X of length n with minimum value L and maximum U
Output: sorted list X
def function bucket(x) = min(P-1, floor(P*(x-L)/(U-L)));
scatter list X to local X_p lists each of size n/P
forall processors p = 0,...,P-1
  for i = 0,...,n/P-1
    x = X_p[i]
    put x into small bucket b_p[bucket(x)]
  all-to-all of small buckets b_p into large buckets B_p
  sort values in B_p[0,...,P-1] using a sequential sort algorithm
gather X from B_p into a merged sorted list
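The pseudocode above can be simulated on one machine as follows. This is a sketch, assuming numeric input and modelling the P processors as list slices; the all-to-all exchange becomes a plain list comprehension, and the bucket index is clamped so that the maximum value lands in the last bucket.

```python
def parallel_bucket_sort(values, num_procs):
    if not values:
        return []
    lo, hi = min(values), max(values)
    if lo == hi:
        return list(values)

    def bucket(x):
        # Clamp so that x == hi maps to bucket P-1 rather than P.
        return min(num_procs - 1, int(num_procs * (x - lo) / (hi - lo)))

    n = len(values)
    # Scatter: one contiguous segment X_p per processor.
    bounds = [n * p // num_procs for p in range(num_procs + 1)]
    segments = [values[bounds[p]:bounds[p + 1]] for p in range(num_procs)]
    # Step 2a: each processor p fills its P small buckets b_p.
    small = [[[] for _ in range(num_procs)] for _ in range(num_procs)]
    for p, segment in enumerate(segments):
        for x in segment:
            small[p][bucket(x)].append(x)
    # Step 2b: all-to-all, large bucket q collects small bucket q of every p.
    large = [[x for p in range(num_procs) for x in small[p][q]]
             for q in range(num_procs)]
    # Step 3: sort each large bucket and concatenate (gather).
    result = []
    for q in range(num_procs):
        result.extend(sorted(large[q]))
    return result
```

Because the bucket index is non-decreasing in x, concatenating the sorted large buckets in order gives a fully sorted list.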

11 Partitioning Example 2: Parallel Bucket Sort
Communication time (assuming uniform distribution in X)
- Scatter: $t_{comm1} = \log_2(P)\, t_{startup} + \frac{n(P-1)}{P}\, t_{data}$
- All-to-all: $t_{comm2} = (P-1)\left(t_{startup} + \frac{n}{P^2}\, t_{data}\right)$
- Gather: $t_{comm3} = \log_2(P)\, t_{startup} + \frac{n(P-1)}{P}\, t_{data}$
Computation time (assuming uniform distribution in X)
- Small bucket sort: $t_{comp1} = n/P$
- Large bucket sort: $t_{comp2} = \frac{n}{P} \log_2(n/P)$
Speedup
- Sequential time: $t_s = n \log_2(n/P)$ (with P buckets)
- Parallel time: $t_P = 2 \log_2(P)\, t_{startup} + 2\, \frac{n(P-1)}{P}\, t_{data} + (P-1)\left(t_{startup} + \frac{n}{P^2}\, t_{data}\right) + \frac{n}{P}\left(1 + \log_2(n/P)\right)$
- Speedup w/o communication: $S_P = O(P)$

12 Partitioning Example 3: Barnes Hut Algorithm
[Figure: direction of the force between two bodies at points p and q]

13 Partitioning Example 3: Barnes Hut Algorithm
[Figure: particles in 2D space and the corresponding quadtree]
- Each particle is at a position (x,y) and has mass m
- A square without a particle is deleted
- Each parent computes its total mass M and center of mass C
- The mass of a parent is the sum of the masses of its children

14 Partitioning Example 3: Barnes Hut Algorithm
for (t = 0; t < tmax; t++) {
  Build_tree();
  Compute_Total_Mass_Center();
  Compute_Force();
  Update_Positions();
}

Compute_Force() {
  for (i = 0; i < n; i++)
    Compute_Tree_Force(i, root);
}

Compute_Tree_Force(i, node) {
  F = 0;
  if (box at node contains one particle)
    F = force using eq (**);
  else {
    r = distance from i to C (*) of box;
    D = size of box;
    if (D/r < theta)
      F = force using eq (**) with total M;
    else
      for (all children c of box)
        F = F + Compute_Tree_Force(i, c);
  }
  return F;
}

[Equations: (*) center of mass C of a box; (**) force between two bodies]
Sequential time is O(n log n). Assuming P = n, then $t_P = O(\log P)$.
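The tree build, mass summary, and Compute_Tree_Force steps can be sketched in Python as below. Several things here are assumptions rather than content from the slides: the force law (Newtonian gravity with G = 1, standing in for equation (**)), the Node class and all function names, and the requirement that particle positions are distinct and lie inside the root square.

```python
import math


class Node:
    def __init__(self, cx, cy, half):
        self.cx, self.cy, self.half = cx, cy, half  # square center, half-width
        self.body = None       # particle index if this node is a leaf
        self.children = None   # four slots once subdivided; empty ones stay None
        self.mass = 0.0        # total mass M
        self.com = (0.0, 0.0)  # center of mass C


def insert(node, i, pos):
    if node.children is None and node.body is None:
        node.body = i          # empty leaf: store the particle
        return
    if node.children is None:  # occupied leaf: subdivide and push down
        node.children = [None] * 4
        old, node.body = node.body, None
        _insert_child(node, old, pos)
    _insert_child(node, i, pos)


def _insert_child(node, i, pos):
    x, y = pos[i]
    q = (x >= node.cx) + 2 * (y >= node.cy)
    if node.children[q] is None:  # squares are created only when needed
        h = node.half / 2
        node.children[q] = Node(node.cx + (h if x >= node.cx else -h),
                                node.cy + (h if y >= node.cy else -h), h)
    insert(node.children[q], i, pos)


def summarize(node, pos, mass):
    if node.children is None:
        node.mass, node.com = mass[node.body], pos[node.body]
        return
    m = sx = sy = 0.0
    for c in node.children:
        if c is not None:
            summarize(c, pos, mass)
            m, sx, sy = m + c.mass, sx + c.mass * c.com[0], sy + c.mass * c.com[1]
    node.mass, node.com = m, (sx / m, sy / m)


def pair_force(p, mp, q, mq):
    # Assumed force law: magnitude mp*mq/r^2, directed from p toward q.
    dx, dy = q[0] - p[0], q[1] - p[1]
    r = math.hypot(dx, dy)
    s = mp * mq / r ** 3
    return (s * dx, s * dy)


def tree_force(i, node, pos, mass, theta):
    if node.children is None:
        if node.body == i:
            return (0.0, 0.0)  # no self-interaction
        return pair_force(pos[i], mass[i], node.com, node.mass)
    r = math.hypot(node.com[0] - pos[i][0], node.com[1] - pos[i][1])
    if r > 0 and 2 * node.half / r < theta:  # D/r < theta: use total M at C
        return pair_force(pos[i], mass[i], node.com, node.mass)
    fx = fy = 0.0
    for c in node.children:
        if c is not None:
            gx, gy = tree_force(i, c, pos, mass, theta)
            fx, fy = fx + gx, fy + gy
    return (fx, fy)
```

With theta = 0 no box is ever approximated, so the tree traversal reproduces the direct pairwise sum; larger theta values trade accuracy for fewer force evaluations.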

15 Divide and Conquer
Divide and conquer strategy (definition by JáJá 1992):
1. Break up a given problem into independent subproblems
2. Solve the subproblems recursively and concurrently
3. Collect and combine the solutions into the overall solution
In contrast to the partitioning strategy, divide and conquer uses recursive partitioning with concurrent execution to break the problem down into independent subproblems.
In deeper levels of recursion, the number of active processors may increase or decrease.

16 Divide & Conquer Example 1: Parallel Recursive Matmul
Block matrix multiplication by recursion: decompose each matrix into 2×2 submatrices and compute the submatrix products recursively
Mat matmul(Mat A, Mat B, int s) {
  if (s == 1)
    C = A * B;
  else {
    s = s/2;
    P0 = matmul(A[p,p], B[p,p], s);
    P1 = matmul(A[p,q], B[q,p], s);
    P2 = matmul(A[p,p], B[p,q], s);
    P3 = matmul(A[p,q], B[q,q], s);
    P4 = matmul(A[q,p], B[p,p], s);
    P5 = matmul(A[q,q], B[q,p], s);
    P6 = matmul(A[q,p], B[p,q], s);
    P7 = matmul(A[q,q], B[q,q], s);
    C[p,p] = P0 + P1;
    C[p,q] = P2 + P3;
    C[q,p] = P4 + P5;
    C[q,q] = P6 + P7;
  }
  return C;
}
- P0, ..., P7 are computed in parallel
- Level of parallelism increases with deepening recursion
- Suitable for shared memory systems
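A runnable Python version of the recursive block multiplication, given as a sketch. It assumes square matrices stored as lists of rows whose size is a power of two, and it evaluates the eight products sequentially where the slide computes them in parallel. The helper names add and block are illustrative.

```python
def add(X, Y):
    # Element-wise sum of two equally sized matrices.
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def block(M, r, c, h):
    # The h-by-h submatrix of M whose top-left corner is (r, c).
    return [row[c:c + h] for row in M[r:r + h]]


def matmul(A, B):
    s = len(A)
    if s == 1:
        return [[A[0][0] * B[0][0]]]
    h = s // 2
    Ab = {(i, j): block(A, i * h, j * h, h) for i in (0, 1) for j in (0, 1)}
    Bb = {(i, j): block(B, i * h, j * h, h) for i in (0, 1) for j in (0, 1)}
    C = {}
    for i in (0, 1):
        for j in (0, 1):
            # Two of the eight independent products feed each block of C.
            C[i, j] = add(matmul(Ab[i, 0], Bb[0, j]),
                          matmul(Ab[i, 1], Bb[1, j]))
    top = [left + right for left, right in zip(C[0, 0], C[0, 1])]
    bottom = [left + right for left, right in zip(C[1, 0], C[1, 1])]
    return top + bottom
```

Each recursion level multiplies the number of independent products by eight, which is where the growing level of parallelism comes from.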

17 Divide and Conquer Example 2: Parallel Convex Hull Algorithm
The planar convex hull of a set of points $S = \{p_1, p_2, \ldots, p_n\}$ with coordinates $p_i = (x, y)$ is the smallest convex polygon that encompasses all points of S on the x-y plane
[Figure: convex hull of a point set in the x-y plane]

18 Divide and Conquer Example 2: Parallel Convex Hull Algorithm
The upper convex hull spans points $\{q_1, \ldots, q_s\} \subseteq S$ from point $q_1$ with minimum x to $q_s$ with maximum x
The convex hull = upper convex hull + lower convex hull
Problem: Given points $S = \{p_1, \ldots, p_n\}$ such that $x(p_1) < x(p_2) < \cdots < x(p_n)$, construct the upper convex hull in parallel
[Figure: upper convex hull from q_1 to q_s in the x-y plane]

19 Divide and Conquer Example 2: Parallel Convex Hull Algorithm
Parallel convex hull:
1. Divide the x-sorted points S into sets $S_1$ and $S_2$ of equal size
2. Compute the upper convex hull recursively on $S_1$ and $S_2$
3. Combine $UCH(S_1)$ and $UCH(S_2)$ by computing the upper common tangent to form $UCH(S)$
[Figure: upper common tangent joining UCH(S_1) and UCH(S_2)]
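The divide and combine steps can be sketched in Python as follows. Note the substitution: instead of finding the upper common tangent by binary search, the combine step here rescans the concatenated hulls with a stack, which takes linear time but produces the same UCH(S). The input is assumed to be sorted by strictly increasing x; all function names are illustrative.

```python
def cross(o, a, b):
    # Positive if o -> a -> b turns counterclockwise, negative if clockwise.
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def combine(left, right):
    # Linear-time stand-in for the upper common tangent computation.
    hull = []
    for p in left + right:
        # On an upper hull every turn is clockwise; drop points that break this.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def upper_hull(points):
    # Base case: at most two points form their own upper hull.
    if len(points) <= 2:
        return list(points)
    mid = len(points) // 2
    # The two recursive calls are independent and could run in parallel.
    return combine(upper_hull(points[:mid]), upper_hull(points[mid:]))
```

For the points (0,0), (1,2), (2,1), (3,3), (4,0) the point (2,1) lies below the segment from (1,2) to (3,3) and is dropped during the final combine.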

20 Divide and Conquer Example 2: Parallel Convex Hull Algorithm
- Base case of recursion: two points, which are returned as UCH(S)
- The upper common tangent line segment (a,b) can be computed sequentially in $O(\log n)$ time with $n = |UCH(S_1)| + |UCH(S_2)|$ using a binary search method
- Line segments can be implemented as a linked list of points, thus $UCH(S_1)$ and $UCH(S_2)$ can be connected in O(1) time with one pointer change of a to point to b
- Parallel convex hull time complexity recurrence relation: $T(n) \le T(n/2) + a \log n$, with solution $T(n) = O(\log^2 n)$
- Parallel convex hull operations recurrence relation: $W(n) \le 2W(n/2) + b\,n$, with solution $W(n) = O(n \log n)$, which is cost optimal since the sequential algorithm is $O(n \log n)$
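Both solutions follow by unrolling the recurrences. The derivation below is a sketch that assumes n is a power of two, writes L = log_2 n, and stops at the two-point base case.

```latex
\begin{align*}
T(n) &\le T(n/2) + a \log_2 n \\
     &\le T(2) + a \sum_{k=0}^{L-2} \log_2\!\left(\frac{n}{2^k}\right)
      = T(2) + a \sum_{k=0}^{L-2} (L - k) \\
     &= T(2) + a \left( \frac{L(L+1)}{2} - 1 \right) = O(\log^2 n), \\[1ex]
W(n) &\le 2\,W(n/2) + b\,n
      \le 2^{k}\, W\!\left(\frac{n}{2^{k}}\right) + k\, b\, n \\
     &= \frac{n}{2}\, W(2) + (L-1)\, b\, n = O(n \log n)
      \qquad \text{for } k = L - 1 .
\end{align*}
```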

21 Divide and Conquer Example 3: First-Order Linear Recurrences
First-order linear recurrence
$$y_1 = b_1, \qquad y_i = a_i\, y_{i-1} + b_i, \quad 2 \le i \le n$$
Example applications:
- Prefix sum $y_i = \sum_{j=1}^{i} b_j$ is a special case of a first-order linear recurrence with $a_i = 1$ (the multiplicative unit element)
- Polynomial evaluation of order n-1 using Horner's rule, $p(x) = (((b_1 x + b_2) x + b_3) x + \cdots + b_{n-1}) x + b_n$, is a special case of a first-order linear recurrence with $a_i = x$
- Solving a bi-diagonal system $By = c$: let $a_i = -l_i / d_i$ and $b_i = c_i / d_i$, then solve the linear recurrence to obtain the solution y
$$\begin{pmatrix} d_1 \\ l_2 & d_2 \\ & l_3 & d_3 \\ & & \ddots & \ddots \\ & & & l_n & d_n \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ \vdots \\ c_n \end{pmatrix}$$
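A direct sequential evaluation makes the three special cases concrete. The sketch below uses 0-based Python lists, so a[0] is unused; the function name solve_recurrence is an assumption for illustration.

```python
def solve_recurrence(a, b):
    """Sequentially evaluate y[0] = b[0], y[i] = a[i]*y[i-1] + b[i]."""
    y = [b[0]]
    for i in range(1, len(b)):
        y.append(a[i] * y[-1] + b[i])
    return y


# Prefix sum: all a[i] equal to 1.
prefix = solve_recurrence([1, 1, 1, 1], [1, 2, 3, 4])

# Horner's rule for p(x) = 2x^2 + 0x + 1 at x = 3: all a[i] equal to x.
horner = solve_recurrence([3, 3, 3], [2, 0, 1])[-1]

# Bi-diagonal system with diagonal d, sub-diagonal l, right-hand side c.
d, l, c = [2.0, 4.0], [0.0, 1.0], [2.0, 5.0]
solution = solve_recurrence([-l[i] / d[i] for i in range(2)],
                            [c[i] / d[i] for i in range(2)])
```

The bi-diagonal case shows why the coefficient carries a minus sign: row i reads l_i y_{i-1} + d_i y_i = c_i, so y_i = -(l_i/d_i) y_{i-1} + c_i/d_i.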

22 Divide and Conquer Example 3: First-Order Linear Recurrences
Rewrite $y_i = a_i\, y_{i-1} + b_i$ into
$$y_i = a_i (a_{i-1}\, y_{i-2} + b_{i-1}) + b_i$$
This equation defines a linear recurrence of size n/2 for even index i:
$$z_1 = b'_1, \qquad z_i = a'_i\, z_{i-1} + b'_i, \quad 2 \le i \le n/2$$
1. Let $a'_i = a_{2i}\, a_{2i-1}$ and $b'_i = a_{2i}\, b_{2i-1} + b_{2i}$
2. Solve $z_i$ recursively
3. For $1 \le i \le n$ set
   - $y_i = z_{i/2}$ if i is even
   - $y_i = a_i\, z_{(i-1)/2} + b_i$ if i is odd and i > 1
   - $y_i = b_1$ if i = 1

23 Divide and Conquer Example 3: First-Order Linear Recurrences
Parallel algorithm:
linrecsolve(a[], b[], y[], n) {
  if (n == 1) {
    y[1] = b[1];
    return;
  }
  forall (i = 1 to n/2) {
    a_new[i] = a[2*i]*a[2*i-1];
    b_new[i] = a[2*i]*b[2*i-1] + b[2*i];
  }
  linrecsolve(a_new, b_new, z, n/2);
  forall (i = 1 to n) {
    if (i == 1)
      y[1] = b[1];
    else if (even(i))
      y[i] = z[i/2];
    else
      y[i] = a[i]*z[(i-1)/2] + b[i];
  }
}
[Figure: recursion levels, $\log_2 n$ recursive steps. For n = 8 the first level forms $a_2 b_1 + b_2$, the second $((a_2 b_1 + b_2) a_3 + b_3) a_4 + b_4$, and the third $((((((a_2 b_1 + b_2) a_3 + b_3) a_4 + b_4) a_5 + b_5) a_6 + b_6) a_7 + b_7) a_8 + b_8$]
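The recursive algorithm above translates almost line for line into Python. This sketch keeps the slide's 1-based indexing by leaving list index 0 unused, assumes n is a power of two, and runs the forall loops sequentially. The value a[1] never influences the result but must be a number because it is multiplied into a_new[1].

```python
def linrecsolve(a, b):
    """Return 1-based y with y[1] = b[1], y[i] = a[i]*y[i-1] + b[i]."""
    n = len(b) - 1
    if n == 1:
        return [None, b[1]]
    half = n // 2
    # forall: build the half-size recurrence over the even indices.
    a_new = [None] + [a[2 * i] * a[2 * i - 1] for i in range(1, half + 1)]
    b_new = [None] + [a[2 * i] * b[2 * i - 1] + b[2 * i]
                      for i in range(1, half + 1)]
    z = linrecsolve(a_new, b_new)
    # forall: even entries come from z, odd entries need one more step.
    y = [None] * (n + 1)
    for i in range(1, n + 1):
        if i == 1:
            y[1] = b[1]
        elif i % 2 == 0:
            y[i] = z[i // 2]
        else:
            y[i] = a[i] * z[(i - 1) // 2] + b[i]
    return y
```

Every loop iteration is independent of the others, so each level costs constant parallel time and the recursion depth is log_2 n.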

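24 Divide and Conquer Example 4: Triangular Matrix Inversion
Consider Ax = b with triangular matrix A
$$A = \begin{pmatrix} a_{11} \\ a_{21} & a_{22} \\ a_{31} & a_{32} & a_{33} \\ \vdots & & & \ddots \\ a_{n1} & a_{n2} & \cdots & & a_{nn} \end{pmatrix}$$
Partition A into (n/2)×(n/2) blocks
$$A = \begin{pmatrix} A_1 & 0 \\ A_2 & A_3 \end{pmatrix}$$
Then $A^{-1}$ is given by
$$A^{-1} = \begin{pmatrix} A_1^{-1} & 0 \\ -A_3^{-1} A_2 A_1^{-1} & A_3^{-1} \end{pmatrix}$$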

25 Divide and Conquer Example 4: Triangular Matrix Inversion
Parallel algorithm:
1. Divide A into $A_1$, $A_2$, $A_3$
2. Recursively compute the inverses of $A_1$ and $A_3$ in parallel
3. Multiply $-A_3^{-1} A_2 A_1^{-1}$ and combine with $A_1^{-1}$ and $A_3^{-1}$ to get $A^{-1}$
Time complexity is given by the recurrence relation $T(n) = T(n/2) + c\,n$, with $P = n^2$ processors to compute $-A_3^{-1} A_2 A_1^{-1}$ in O(n) operations in parallel, thus $T(n) = O(n)$ time
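The block formula can be checked with a short recursive routine. This is a sequential sketch using plain lists of floats, assuming a lower triangular input with a nonzero diagonal; the helper names are illustrative, and the two recursive inversions are the part that would run in parallel.

```python
def matmul(X, Y):
    # Plain matrix product of two lists of rows.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def invert_lower(A):
    n = len(A)
    if n == 1:
        return [[1.0 / A[0][0]]]
    h = n // 2
    # Step 1: divide A into blocks A1 (top-left), A2 (bottom-left), A3.
    A1 = [row[:h] for row in A[:h]]
    A2 = [row[:h] for row in A[h:]]
    A3 = [row[h:] for row in A[h:]]
    # Step 2: the two inversions are independent of each other.
    A1i, A3i = invert_lower(A1), invert_lower(A3)
    # Step 3: off-diagonal block -A3^{-1} A2 A1^{-1}, then assemble.
    low = [[-v for v in row] for row in matmul(matmul(A3i, A2), A1i)]
    top = [row + [0.0] * (n - h) for row in A1i]
    bottom = [left + right for left, right in zip(low, A3i)]
    return top + bottom
```

Multiplying A by the result should give the identity up to rounding error.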

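26 Divide and Conquer Example 5: Banded Triangular Systems
Consider Ax = b with banded matrix A with m = 3
$$A = \begin{pmatrix} a_{11} \\ a_{21} & a_{22} \\ a_{31} & a_{32} & a_{33} \\ & a_{42} & a_{43} & a_{44} \\ & & a_{53} & a_{54} & a_{55} \\ & & & a_{64} & a_{65} & a_{66} \\ & & & & a_{75} & a_{76} & a_{77} \\ & & & & & a_{86} & a_{87} & a_{88} \\ & & & & & & a_{97} & a_{98} & a_{99} \end{pmatrix}$$
[Figure: the same matrix partitioned into m×m blocks $A_{ij}$]
Define block diagonal D and inverse $D^{-1}$
$$D = \begin{pmatrix} A_{11} \\ & A_{22} \\ & & \ddots \\ & & & A_{n/m,n/m} \end{pmatrix}, \qquad D^{-1} = \begin{pmatrix} A_{11}^{-1} \\ & A_{22}^{-1} \\ & & \ddots \\ & & & A_{n/m,n/m}^{-1} \end{pmatrix}$$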

27 Divide and Conquer Example 5: Banded Triangular Systems
Compute $d = D^{-1} b$ and $B = D^{-1} A$, where $B_{i,i-1} = A_{ii}^{-1} A_{i,i-1}$
$$d = D^{-1} b = \begin{pmatrix} d_1 \\ d_2 \\ \vdots \\ d_{n/m} \end{pmatrix}, \qquad B = D^{-1} A = \begin{pmatrix} I_m \\ B_{21} & I_m \\ & B_{32} & I_m \\ & & \ddots & \ddots \\ & & & B_{n/m,n/m-1} & I_m \end{pmatrix}$$
Solve the first-order linear recurrence on the m×m matrices $B_{i,i-1}$
$$x_1 = d_1, \qquad x_i = -B_{i,i-1}\, x_{i-1} + d_i, \quad 2 \le i \le n/m$$
Parallel time $O(m + m \log(n/m))$ with P = nm processors
- Compute all $A_{ii}^{-1}$ (each requiring O(m) operations) in parallel with the parallel matrix inversion algorithm
- Compute all $B_{i,i-1} = A_{ii}^{-1} A_{i,i-1}$ in O(m) operations in parallel
- Recurrence depth is $\log_2(n/m)$, and each step has O(m) operations
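The recurrence comes from reading off one block row of the scaled system. The short derivation below is a sketch that uses only the definitions of D, d and B given on these slides.

```latex
\begin{align*}
A x = b
  \;&\Longrightarrow\; D^{-1} A\, x = D^{-1} b
  \;\Longrightarrow\; B x = d, \\
\text{block row } 1:&\quad I_m\, x_1 = d_1
  \;\Longrightarrow\; x_1 = d_1, \\
\text{block row } i \ge 2:&\quad B_{i,i-1}\, x_{i-1} + I_m\, x_i = d_i
  \;\Longrightarrow\; x_i = -B_{i,i-1}\, x_{i-1} + d_i .
\end{align*}
```

This has the same shape as the scalar recurrence $y_i = a_i y_{i-1} + b_i$, with the matrix $-B_{i,i-1}$ in the role of $a_i$ and the vector $d_i$ in the role of $b_i$.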

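28 Divide and Conquer Example 6: LU of Tridiagonal Matrix
Consider tridiagonal matrix LU decomposition
$$\begin{pmatrix} a_1 & c_1 \\ b_2 & a_2 & c_2 \\ & b_3 & a_3 & c_3 \\ & & \ddots & \ddots & \ddots \\ & & & b_n & a_n \end{pmatrix} = \begin{pmatrix} 1 \\ l_2 & 1 \\ & l_3 & 1 \\ & & \ddots & \ddots \\ & & & l_n & 1 \end{pmatrix} \begin{pmatrix} d_1 & u_1 \\ & d_2 & u_2 \\ & & d_3 & u_3 \\ & & & \ddots & \ddots \\ & & & & d_n \end{pmatrix}$$
The LU decomposition A = L U satisfies
$$a_1 = d_1, \qquad c_i = u_i, \qquad a_i = d_i + l_i\, u_{i-1}, \qquad b_i = l_i\, d_{i-1}$$
thus
$$d_1 = a_1, \qquad d_i = a_i - l_i\, u_{i-1} = a_i - \frac{u_{i-1}\, b_i}{d_{i-1}} = \frac{a_i\, d_{i-1} - b_i\, c_{i-1}}{d_{i-1}}$$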

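29 Divide and Conquer Example 6: LU of Tridiagonal Matrix
Let
$$R_1 = \begin{pmatrix} a_1 & 0 \\ 1 & 0 \end{pmatrix}, \qquad R_i = \begin{pmatrix} a_i & -b_i\, c_{i-1} \\ 1 & 0 \end{pmatrix}, \qquad T_i = R_i R_{i-1} \cdots R_1$$
From the Möbius transformation we have
$$d_i = \frac{(1\ \ 0)\; T_i\; (1\ \ 1)^T}{(0\ \ 1)\; T_i\; (1\ \ 1)^T}$$
Algorithm:
1. Set up the matrices $R_i$
2. Solve the first-order linear recurrence (prefix computation) for $T_i$
3. Compute $d_i$
4. From the solution for $d_i$ compute $l_i = b_i / d_{i-1}$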
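The equivalence between the pivot recurrence and the 2×2 matrix products can be checked with exact rational arithmetic. The sketch below uses 0-based lists (b[0] and the last entry of c are unused), takes R_1 with a zero second column as an assumption, and accumulates the products T_i in a sequential loop where a parallel prefix computation would be used in practice. Function names are illustrative.

```python
from fractions import Fraction


def mat2_mul(X, Y):
    # Product of two 2x2 matrices stored as nested tuples.
    return ((X[0][0] * Y[0][0] + X[0][1] * Y[1][0],
             X[0][0] * Y[0][1] + X[0][1] * Y[1][1]),
            (X[1][0] * Y[0][0] + X[1][1] * Y[1][0],
             X[1][0] * Y[0][1] + X[1][1] * Y[1][1]))


def pivots_direct(a, b, c):
    # d_i = a_i - b_i * c_{i-1} / d_{i-1}, evaluated one step at a time.
    d = [Fraction(a[0])]
    for i in range(1, len(a)):
        d.append(a[i] - Fraction(b[i] * c[i - 1]) / d[i - 1])
    return d


def pivots_moebius(a, b, c):
    # T_i = R_i ... R_1; d_i is a ratio of the entries of T_i (1 1)^T.
    T = ((a[0], 0), (1, 0))
    d = [Fraction(a[0])]
    for i in range(1, len(a)):
        R = ((a[i], -b[i] * c[i - 1]), (1, 0))
        T = mat2_mul(R, T)
        d.append(Fraction(T[0][0] + T[0][1], T[1][0] + T[1][1]))
    return d
```

Because matrix multiplication is associative, the products T_i can be formed in log_2 n parallel steps, which is what removes the sequential dependence of d_i on d_{i-1}.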

30 Further Reading
- [PP2] pages
- [PSC] pages

COL 730: Parallel Programming

COL 730: Parallel Programming COL 730: Parallel Programming PARALLEL SORTING Bitonic Merge and Sort Bitonic sequence: {a 0, a 1,, a n-1 }: A sequence with a monotonically increasing part and a monotonically decreasing part For some

More information

Algorithms, Design and Analysis. Order of growth. Table 2.1. Big-oh. Asymptotic growth rate. Types of formulas for basic operation count

Algorithms, Design and Analysis. Order of growth. Table 2.1. Big-oh. Asymptotic growth rate. Types of formulas for basic operation count Types of formulas for basic operation count Exact formula e.g., C(n) = n(n-1)/2 Algorithms, Design and Analysis Big-Oh analysis, Brute Force, Divide and conquer intro Formula indicating order of growth

More information

Parallel programming using MPI. Analysis and optimization. Bhupender Thakur, Jim Lupo, Le Yan, Alex Pacheco

Parallel programming using MPI. Analysis and optimization. Bhupender Thakur, Jim Lupo, Le Yan, Alex Pacheco Parallel programming using MPI Analysis and optimization Bhupender Thakur, Jim Lupo, Le Yan, Alex Pacheco Outline l Parallel programming: Basic definitions l Choosing right algorithms: Optimal serial and

More information

CMPS 2200 Fall Divide-and-Conquer. Carola Wenk. Slides courtesy of Charles Leiserson with changes and additions by Carola Wenk

CMPS 2200 Fall Divide-and-Conquer. Carola Wenk. Slides courtesy of Charles Leiserson with changes and additions by Carola Wenk CMPS 2200 Fall 2017 Divide-and-Conquer Carola Wenk Slides courtesy of Charles Leiserson with changes and additions by Carola Wenk 1 The divide-and-conquer design paradigm 1. Divide the problem (instance)

More information

Introduction to Algorithms 6.046J/18.401J/SMA5503

Introduction to Algorithms 6.046J/18.401J/SMA5503 Introduction to Algorithms 6.046J/8.40J/SMA5503 Lecture 3 Prof. Piotr Indyk The divide-and-conquer design paradigm. Divide the problem (instance) into subproblems. 2. Conquer the subproblems by solving

More information

Design and Analysis of Algorithms

Design and Analysis of Algorithms CSE 101, Winter 2018 Design and Analysis of Algorithms Lecture 4: Divide and Conquer (I) Class URL: http://vlsicad.ucsd.edu/courses/cse101-w18/ Divide and Conquer ( DQ ) First paradigm or framework DQ(S)

More information

Block-tridiagonal matrices

Block-tridiagonal matrices Block-tridiagonal matrices. p.1/31 Block-tridiagonal matrices - where do these arise? - as a result of a particular mesh-point ordering - as a part of a factorization procedure, for example when we compute

More information

COMP 633: Parallel Computing Fall 2018 Written Assignment 1: Sample Solutions

COMP 633: Parallel Computing Fall 2018 Written Assignment 1: Sample Solutions COMP 633: Parallel Computing Fall 2018 Written Assignment 1: Sample Solutions September 12, 2018 I. The Work-Time W-T presentation of EREW sequence reduction Algorithm 2 in the PRAM handout has work complexity

More information

data structures and algorithms lecture 2

data structures and algorithms lecture 2 data structures and algorithms 2018 09 06 lecture 2 recall: insertion sort Algorithm insertionsort(a, n): for j := 2 to n do key := A[j] i := j 1 while i 1 and A[i] > key do A[i + 1] := A[i] i := i 1 A[i

More information

EE/CSCI 451: Parallel and Distributed Computation

EE/CSCI 451: Parallel and Distributed Computation EE/CSCI 451: Parallel and Distributed Computation Lecture #19 3/28/2017 Xuehai Qian Xuehai.qian@usc.edu http://alchem.usc.edu/portal/xuehaiq.html University of Southern California 1 From last class PRAM

More information

The Divide-and-Conquer Design Paradigm

The Divide-and-Conquer Design Paradigm CS473- Algorithms I Lecture 4 The Divide-and-Conquer Design Paradigm CS473 Lecture 4 1 The Divide-and-Conquer Design Paradigm 1. Divide the problem (instance) into subproblems. 2. Conquer the subproblems

More information

Lecture 22: Multithreaded Algorithms CSCI Algorithms I. Andrew Rosenberg

Lecture 22: Multithreaded Algorithms CSCI Algorithms I. Andrew Rosenberg Lecture 22: Multithreaded Algorithms CSCI 700 - Algorithms I Andrew Rosenberg Last Time Open Addressing Hashing Today Multithreading Two Styles of Threading Shared Memory Every thread can access the same

More information

Overview: Parallelisation via Pipelining

Overview: Parallelisation via Pipelining Overview: Parallelisation via Pipelining three type of pipelines adding numbers (type ) performance analysis of pipelines insertion sort (type ) linear system back substitution (type ) Ref: chapter : Wilkinson

More information

Divide-and-Conquer. Reading: CLRS Sections 2.3, 4.1, 4.2, 4.3, 28.2, CSE 6331 Algorithms Steve Lai

Divide-and-Conquer. Reading: CLRS Sections 2.3, 4.1, 4.2, 4.3, 28.2, CSE 6331 Algorithms Steve Lai Divide-and-Conquer Reading: CLRS Sections 2.3, 4.1, 4.2, 4.3, 28.2, 33.4. CSE 6331 Algorithms Steve Lai Divide and Conquer Given an instance x of a prolem, the method works as follows: divide-and-conquer

More information

Topic 17. Analysis of Algorithms

Topic 17. Analysis of Algorithms Topic 17 Analysis of Algorithms Analysis of Algorithms- Review Efficiency of an algorithm can be measured in terms of : Time complexity: a measure of the amount of time required to execute an algorithm

More information

Algorithmic Approach to Counting of Certain Types m-ary Partitions

Algorithmic Approach to Counting of Certain Types m-ary Partitions Algorithmic Approach to Counting of Certain Types m-ary Partitions Valentin P. Bakoev Abstract Partitions of integers of the type m n as a sum of powers of m (the so called m-ary partitions) and their

More information

Divide and Conquer. CSE21 Winter 2017, Day 9 (B00), Day 6 (A00) January 30,

Divide and Conquer. CSE21 Winter 2017, Day 9 (B00), Day 6 (A00) January 30, Divide and Conquer CSE21 Winter 2017, Day 9 (B00), Day 6 (A00) January 30, 2017 http://vlsicad.ucsd.edu/courses/cse21-w17 Merging sorted lists: WHAT Given two sorted lists a 1 a 2 a 3 a k b 1 b 2 b 3 b

More information

Design and Analysis of Algorithms

Design and Analysis of Algorithms CSE 101, Winter 2018 Design and Analysis of Algorithms Lecture 5: Divide and Conquer (Part 2) Class URL: http://vlsicad.ucsd.edu/courses/cse101-w18/ A Lower Bound on Convex Hull Lecture 4 Task: sort the

More information

MA008/MIIZ01 Design and Analysis of Algorithms Lecture Notes 3

MA008/MIIZ01 Design and Analysis of Algorithms Lecture Notes 3 MA008 p.1/37 MA008/MIIZ01 Design and Analysis of Algorithms Lecture Notes 3 Dr. Markus Hagenbuchner markus@uow.edu.au. MA008 p.2/37 Exercise 1 (from LN 2) Asymptotic Notation When constants appear in exponents

More information

Divide and Conquer Algorithms

Divide and Conquer Algorithms Divide and Conquer Algorithms T. M. Murali February 19, 2013 Divide and Conquer Break up a problem into several parts. Solve each part recursively. Solve base cases by brute force. Efficiently combine

More information

Search Algorithms. Analysis of Algorithms. John Reif, Ph.D. Prepared by

Search Algorithms. Analysis of Algorithms. John Reif, Ph.D. Prepared by Search Algorithms Analysis of Algorithms Prepared by John Reif, Ph.D. Search Algorithms a) Binary Search: average case b) Interpolation Search c) Unbounded Search (Advanced material) Readings Reading Selection:

More information

Parallel Prefix Algorithms 1. A Secret to turning serial into parallel

Parallel Prefix Algorithms 1. A Secret to turning serial into parallel Parallel Prefix Algorithms. A Secret to turning serial into parallel 2. Suppose you bump into a parallel algorithm that surprises you there is no way to parallelize this algorithm you say 3. Probably a

More information

Algorithm Design and Analysis

Algorithm Design and Analysis Algorithm Design and Analysis LECTURE 9 Divide and Conquer Merge sort Counting Inversions Binary Search Exponentiation Solving Recurrences Recursion Tree Method Master Theorem Sofya Raskhodnikova S. Raskhodnikova;

More information

Algorithm Analysis Divide and Conquer. Chung-Ang University, Jaesung Lee

Algorithm Analysis Divide and Conquer. Chung-Ang University, Jaesung Lee Algorithm Analysis Divide and Conquer Chung-Ang University, Jaesung Lee Introduction 2 Divide and Conquer Paradigm 3 Advantages of Divide and Conquer Solving Difficult Problems Algorithm Efficiency Parallelism

More information

CS 4407 Algorithms Lecture 3: Iterative and Divide and Conquer Algorithms

CS 4407 Algorithms Lecture 3: Iterative and Divide and Conquer Algorithms CS 4407 Algorithms Lecture 3: Iterative and Divide and Conquer Algorithms Prof. Gregory Provan Department of Computer Science University College Cork 1 Lecture Outline CS 4407, Algorithms Growth Functions

More information

Divide-and-conquer: Order Statistics. Curs: Fall 2017

Divide-and-conquer: Order Statistics. Curs: Fall 2017 Divide-and-conquer: Order Statistics Curs: Fall 2017 The divide-and-conquer strategy. 1. Break the problem into smaller subproblems, 2. recursively solve each problem, 3. appropriately combine their answers.

More information

Linear Systems of Equations by Gaussian Elimination

Linear Systems of Equations by Gaussian Elimination Chapter 6, p. 1/32 Linear of Equations by School of Engineering Sciences Parallel Computations for Large-Scale Problems I Chapter 6, p. 2/32 Outline 1 2 3 4 The Problem Consider the system a 1,1 x 1 +

More information

9. Numerical linear algebra background

9. Numerical linear algebra background Convex Optimization Boyd & Vandenberghe 9. Numerical linear algebra background matrix structure and algorithm complexity solving linear equations with factored matrices LU, Cholesky, LDL T factorization

More information

Divide and Conquer Algorithms

Divide and Conquer Algorithms Divide and Conquer Algorithms T. M. Murali March 17, 2014 Divide and Conquer Break up a problem into several parts. Solve each part recursively. Solve base cases by brute force. Efficiently combine solutions

More information

Data Structures in Java

Data Structures in Java Data Structures in Java Lecture 20: Algorithm Design Techniques 12/2/2015 Daniel Bauer 1 Algorithms and Problem Solving Purpose of algorithms: find solutions to problems. Data Structures provide ways of

More information

CS 4407 Algorithms Lecture 2: Iterative and Divide and Conquer Algorithms

CS 4407 Algorithms Lecture 2: Iterative and Divide and Conquer Algorithms CS 4407 Algorithms Lecture 2: Iterative and Divide and Conquer Algorithms Prof. Gregory Provan Department of Computer Science University College Cork 1 Lecture Outline CS 4407, Algorithms Growth Functions

More information

Overview: Synchronous Computations

Overview: Synchronous Computations Overview: Synchronous Computations barriers: linear, tree-based and butterfly degrees of synchronization synchronous example 1: Jacobi Iterations serial and parallel code, performance analysis synchronous

More information

Review Of Topics. Review: Induction

Review Of Topics. Review: Induction Review Of Topics Asymptotic notation Solving recurrences Sorting algorithms Insertion sort Merge sort Heap sort Quick sort Counting sort Radix sort Medians/order statistics Randomized algorithm Worst-case

More information

Divide and Conquer Algorithms

Divide and Conquer Algorithms Divide and Conquer Algorithms Introduction There exist many problems that can be solved using a divide-and-conquer algorithm. A divide-andconquer algorithm A follows these general guidelines. Divide Algorithm

More information

Notation. Bounds on Speedup. Parallel Processing. CS575 Parallel Processing

Notation. Bounds on Speedup. Parallel Processing. CS575 Parallel Processing Parallel Processing CS575 Parallel Processing Lecture five: Efficiency Wim Bohm, Colorado State University Some material from Speedup vs Efficiency in Parallel Systems - Eager, Zahorjan and Lazowska IEEE

More information

Data Structures and Algorithms Chapter 3

Data Structures and Algorithms Chapter 3 1 Data Structures and Algorithms Chapter 3 Werner Nutt 2 Acknowledgments The course follows the book Introduction to Algorithms, by Cormen, Leiserson, Rivest and Stein, MIT Press [CLRST]. Many examples

More information

CMPT 307 : Divide-and-Conqer (Study Guide) Should be read in conjunction with the text June 2, 2015

CMPT 307 : Divide-and-Conqer (Study Guide) Should be read in conjunction with the text June 2, 2015 CMPT 307 : Divide-and-Conqer (Study Guide) Should be read in conjunction with the text June 2, 2015 1 Introduction The divide-and-conquer strategy is a general paradigm for algorithm design. This strategy

More information

CSE613: Parallel Programming, Spring 2012 Date: May 11. Final Exam. ( 11:15 AM 1:45 PM : 150 Minutes )

CSE613: Parallel Programming, Spring 2012 Date: May 11. Final Exam. ( 11:15 AM 1:45 PM : 150 Minutes ) CSE613: Parallel Programming, Spring 2012 Date: May 11 Final Exam ( 11:15 AM 1:45 PM : 150 Minutes ) This exam will account for either 10% or 20% of your overall grade depending on your relative performance

More information

Lecture 6 September 21, 2016

Lecture 6 September 21, 2016 ICS 643: Advanced Parallel Algorithms Fall 2016 Lecture 6 September 21, 2016 Prof. Nodari Sitchinava Scribe: Tiffany Eulalio 1 Overview In the last lecture, we wrote a non-recursive summation program and

More information

The parallelization of the Keller box method on heterogeneous cluster of workstations

The parallelization of the Keller box method on heterogeneous cluster of workstations Available online at http://wwwibnusinautmmy/jfs Journal of Fundamental Sciences Article The parallelization of the Keller box method on heterogeneous cluster of workstations Norhafiza Hamzah*, Norma Alias,

More information

1 Sequences and Summation

1 Sequences and Summation 1 Sequences and Summation A sequence is a function whose domain is either all the integers between two given integers or all the integers greater than or equal to a given integer. For example, a m, a m+1,...,

More information

CS325: Analysis of Algorithms, Fall Midterm

CS325: Analysis of Algorithms, Fall Midterm CS325: Analysis of Algorithms, Fall 2017 Midterm I don t know policy: you may write I don t know and nothing else to answer a question and receive 25 percent of the total points for that problem whereas

More information

Matrix Multiplication

Matrix Multiplication Matrix Multiplication Matrix Multiplication Matrix multiplication. Given two n-by-n matrices A and B, compute C = AB. n c ij = a ik b kj k=1 c 11 c 12 c 1n c 21 c 22 c 2n c n1 c n2 c nn = a 11 a 12 a 1n

More information

Divide-and-Conquer Algorithms Part Two

Divide-and-Conquer Algorithms Part Two Divide-and-Conquer Algorithms Part Two Recap from Last Time Divide-and-Conquer Algorithms A divide-and-conquer algorithm is one that works as follows: (Divide) Split the input apart into multiple smaller

More information

Fundamental Algorithms

Fundamental Algorithms Fundamental Algorithms Chapter 5: Searching Michael Bader Winter 2014/15 Chapter 5: Searching, Winter 2014/15 1 Searching Definition (Search Problem) Input: a sequence or set A of n elements (objects)

More information

Randomization in Algorithms and Data Structures

Randomization in Algorithms and Data Structures Randomization in Algorithms and Data Structures 3 lectures (Tuesdays 14:15-16:00) May 1: Gerth Stølting Brodal May 8: Kasper Green Larsen May 15: Peyman Afshani For each lecture one handin exercise deadline

More information

Dynamic Programming. Reading: CLRS Chapter 15 & Section CSE 6331: Algorithms Steve Lai

Dynamic Programming. Reading: CLRS Chapter 15 & Section CSE 6331: Algorithms Steve Lai Dynamic Programming Reading: CLRS Chapter 5 & Section 25.2 CSE 633: Algorithms Steve Lai Optimization Problems Problems that can be solved by dynamic programming are typically optimization problems. Optimization

More information

Divide and Conquer. Andreas Klappenecker. [based on slides by Prof. Welch]

Divide and Conquer. Andreas Klappenecker. [based on slides by Prof. Welch] Divide and Conquer Andreas Klappenecker [based on slides by Prof. Welch] Divide and Conquer Paradigm An important general technique for designing algorithms: divide problem into subproblems recursively

More information

Divide and Conquer CPE 349. Theresa Migler-VonDollen

Divide and Conquer CPE 349. Theresa Migler-VonDollen Divide and Conquer CPE 349 Theresa Migler-VonDollen Divide and Conquer Divide and Conquer is a strategy that solves a problem by: 1 Breaking the problem into subproblems that are themselves smaller instances

More information

Analytical Modeling of Parallel Systems

Analytical Modeling of Parallel Systems Analytical Modeling of Parallel Systems Chieh-Sen (Jason) Huang Department of Applied Mathematics National Sun Yat-sen University Thank Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar for providing

More information

CISC 235: Topic 1. Complexity of Iterative Algorithms

CISC 235: Topic 1. Complexity of Iterative Algorithms CISC 235: Topic 1 Complexity of Iterative Algorithms Outline Complexity Basics Big-Oh Notation Big-Ω and Big-θ Notation Summations Limitations of Big-Oh Analysis 2 Complexity Complexity is the study of

More information

Parallel Numerical Algorithms

Parallel Numerical Algorithms Parallel Numerical Algorithms Chapter 6 Matrix Models Section 6.2 Low Rank Approximation Edgar Solomonik Department of Computer Science University of Illinois at Urbana-Champaign CS 554 / CSE 512 Edgar

More information

Notes for Recitation 14

Notes for Recitation 14 6.04/18.06J Mathematics for Computer Science October 4, 006 Tom Leighton and Marten van Dijk Notes for Recitation 14 1 The Akra-Bazzi Theorem Theorem 1 (Akra-Bazzi, strong form). Suppose that: is defined

More information
