Algorithms PART II: Partitioning and Divide & Conquer. HPC Fall 2007 Prof. Robert van Engelen

Overview
- Partitioning strategies
- Divide and conquer strategies
- Further reading

Partitioning Strategies

Data partitioning
- Perform domain decomposition to run parallel tasks on subdomains
- Scatter-compute-gather, where local computation may require communication and scatter/gather may involve computations
- [Figure: block partitioning of a 2D domain]

Task partitioning
- Decompose functions into independent subfunctions and execute the subfunctions in parallel
- Example:

    function f(x,y)
      u = g(x)
      v = h(y)
      return u+v
    end

  Thread 1 computes u = g(x), thread 2 computes v = h(y), then u+v is returned
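
The task-partitioning example above can be sketched in Python. This is only an illustration: the bodies of `g` and `h` are hypothetical placeholders (the slides do not define them), and a thread pool stands in for the two threads.

```python
from concurrent.futures import ThreadPoolExecutor

def g(x):
    # Hypothetical subfunction; the slides do not define g.
    return x * x

def h(y):
    # Hypothetical subfunction; the slides do not define h.
    return y + 1

def f(x, y):
    # Run the two independent subfunctions concurrently, then combine.
    with ThreadPoolExecutor(max_workers=2) as pool:
        u = pool.submit(g, x)   # "Thread 1"
        v = pool.submit(h, y)   # "Thread 2"
        return u.result() + v.result()
```

Because `g` and `h` share no data, no synchronization is needed beyond waiting for both results.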

Partitioning Strategies

Partitioning strategy (data partitioning):
1. Break up a given problem into P independent subproblems
2. Solve the P subproblems concurrently
3. Collect and combine the P solutions

Embarrassingly parallel: a simple form of data partitioning with no initial work and no interaction between workers

Partitioning Example 1: Summation

Summation of n values $X = [x_1, \ldots, x_n]$
1. Divide X into P equally-sized sublists $X_p$, $p = 0, \ldots, P-1$, and distribute the $X_p$ sublists to the P processors
2. The processors sum the local parts $s_p = \sum X_p$
3. Combine the local sums $s = \sum_p s_p$

Algorithms:
1. Scatter list X using a scatter-tree
2. Serial summation of parts
3. Reduce local sums
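
A minimal Python sketch of the three steps, assuming a shared-memory thread pool in place of the P processors (so the scatter is just list slicing). The helper names `partition` and `parallel_sum` are illustrative and not from the slides.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(X, P):
    """Split X into P contiguous sublists whose sizes differ by at most one."""
    q, r = divmod(len(X), P)
    parts, start = [], 0
    for p in range(P):
        end = start + q + (1 if p < r else 0)
        parts.append(X[start:end])
        start = end
    return parts

def parallel_sum(X, P):
    parts = partition(X, P)                      # step 1: divide
    with ThreadPoolExecutor(max_workers=P) as pool:
        local = list(pool.map(sum, parts))       # step 2: local sums s_p
    while len(local) > 1:                        # step 3: pairwise (tree) reduce
        local = [sum(local[i:i + 2]) for i in range(0, len(local), 2)]
    return local[0]
```

The pairwise loop mirrors the reduction tree: each pass halves the number of partial sums, so it runs about log2(P) times.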

Partitioning Example 1: Summation
- Scatter (divide): $\log_2 P$ steps; total amount of data transferred: $\frac{n}{2} \log_2 P$
- Local summations: $n/P$ steps
- Reduce (combine): $\log_2 P$ steps; total amount of data transferred: $P-1$
[Figure: scatter tree along the time axis; sublists of size n/2, n/4, n/8 are passed down the levels]

Partitioning Example 1: Summation

Communication time
- Scatter: $t_{comm1} = \sum_{k=1}^{\log_2 P} \left( t_{startup} + 2^{-k} n \, t_{data} \right) = \log_2(P) \, t_{startup} + \frac{n(P-1)}{P} \, t_{data}$
- Reduce: $t_{comm2} = \log_2(P) \, (t_{startup} + t_{data})$
- Total: $t_{comm} = 2 \log_2(P) \, t_{startup} + \left( \frac{n(P-1)}{P} + \log_2 P \right) t_{data}$

Computation time
- Local sum: $t_{comp1} = n/P$
- Global sum: $t_{comp2} = \log_2 P$
- Total: $t_{comp} = n/P + \log_2 P$

Speedup, assuming $t_{startup} = 0$
- Sequential time: $t_s = n-1$
- Parallel time: $t_P = \left( \frac{n(P-1)}{P} + \log_2 P \right) t_{data} + n/P + \log_2 P$
- Speedup: $S_P = t_s / t_P = O(n / (n + \log P))$
- Best speedup w/o communication: $S_P = O(P / \log P)$
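
To see what these formulas imply numerically, the cost model can be evaluated directly. The sketch below simply transcribes the expressions above; the function name and the default parameter values are assumptions for illustration, and time is measured in units of one addition.

```python
import math

def summation_times(n, P, t_startup=0.0, t_data=1.0):
    """Return (t_s, t_P, speedup) for the partitioned summation cost model."""
    log_p = math.log2(P)
    t_comm = 2 * log_p * t_startup + (n * (P - 1) / P + log_p) * t_data
    t_comp = n / P + log_p
    t_s = n - 1
    t_p = t_comm + t_comp
    return t_s, t_p, t_s / t_p
```

For example, with n = 1024, P = 8 and t_data = 1 the parallel time (1030) exceeds the sequential time (1023), because scattering the data costs about as much as summing it; with t_data = 0 the speedup is 1023/131, roughly 7.8.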

General M-Ary Partitioning
- Example: partitioning an image, e.g. to compute a histogram in parallel
- [Figure: 3-level 4-ary partitioning for $4^3 = 64$ processors; first, second, and third division, followed by compute and combine, along the time axis]

Partitioning Example 2: Parallel Bucket Sort

Bucket sort of a list of values bounded within a range [lo, hi]
1. Partition the n values into P segments of n/P values each
2a. Sort each segment into P small buckets (local computation)
2b. Send the content of the small buckets to P large buckets
3. Sort the P large buckets and merge the lists

[Figure: unsorted values on P processors; small buckets; empty small buckets into large buckets; sort content of buckets and merge lists; sorted values]

Partitioning Example 2: Parallel Bucket Sort

Input: list X of length n with minimum value L and maximum U
Output: sorted list X

    def function bucket(x) = floor(P*(x-L)/(U-L)), clamped to P-1 so that x = U falls in the last bucket
    scatter list X to local X_p lists each of size n/P
    forall processors p = 0,...,P-1
      for i = 0,...,n/P-1
        x = X_p[i]
        put x into small bucket b_p[bucket(x)]
      all-to-all of small buckets b_p into large buckets B_p
      sort values in B_p[0,...,P-1] using a sequential sort algorithm
    gather X from B_p into a merged sorted list
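
The pseudocode above can be simulated sequentially in Python to check the data movement. This is a sketch under stated assumptions: the outer loops over `p` and `q` stand in for work that would run on separate processors, the all-to-all is modeled as list concatenation, and the function names are illustrative.

```python
def bucket_index(x, L, U, P):
    # Map x in [L, U] to a bucket 0..P-1; x == U is clamped into the last bucket.
    return min(P - 1, int(P * (x - L) / (U - L)))

def parallel_bucket_sort(X, P):
    if not X:
        return []
    L, U = min(X), max(X)
    if L == U:                      # all values equal: nothing to sort
        return list(X)
    size = -(-len(X) // P)          # ceil(n / P) values per segment
    segments = [X[p * size:(p + 1) * size] for p in range(P)]   # scatter
    # Step 2a: each "processor" p fills its own P small buckets.
    small = [[[] for _ in range(P)] for _ in range(P)]
    for p, seg in enumerate(segments):
        for x in seg:
            small[p][bucket_index(x, L, U, P)].append(x)
    # Step 2b: all-to-all, large bucket q collects small bucket q of every p.
    large = [[x for p in range(P) for x in small[p][q]] for q in range(P)]
    # Step 3: sort each large bucket locally and gather in bucket order.
    return [x for q in range(P) for x in sorted(large[q])]
```

Since the bucket index never decreases as x grows, concatenating the sorted large buckets in order yields a fully sorted list.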

Partitioning Example 2: Parallel Bucket Sort

Communication time (assuming uniform distribution in X)
- Scatter: $t_{comm1} = \log_2(P) \, t_{startup} + \frac{n(P-1)}{P} \, t_{data}$
- All-to-all: $t_{comm2} = (P-1) \left( t_{startup} + \frac{n}{P^2} \, t_{data} \right)$
- Gather: $t_{comm3} = \log_2(P) \, t_{startup} + \frac{n(P-1)}{P} \, t_{data}$

Computation time (assuming uniform distribution in X)
- Small bucket sort: $t_{comp1} = n/P$
- Large bucket sort: $t_{comp2} = \frac{n}{P} \log_2(n/P)$

Speedup
- Sequential time: $t_s = n \log_2(n/P)$ (with P buckets)
- Parallel time: $t_P = 2 \log_2(P) \, t_{startup} + 2 \, \frac{n(P-1)}{P} \, t_{data} + (P-1) \left( t_{startup} + \frac{n}{P^2} \, t_{data} \right) + \frac{n}{P} \left( 1 + \log_2(n/P) \right)$
- Speedup w/o communication: $S_P = O(P)$

Partitioning Example 3: Barnes Hut Algorithm
[Figure: direction of the force between two bodies at points p and q]
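
Partitioning Example 3: Barnes Hut Algorithm
- Particles in 2D space are stored in a quadtree; each particle is at a position (x,y) and has mass m
- A square w/o particle is deleted
- Parent computes M and C: the mass M of a parent is the sum of the masses of its children, and C is the center of mass
[Figure: particles in 2D space and the corresponding quadtree]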

Partitioning Example 3: Barnes Hut Algorithm

    for (t = 0; t < tmax; t++) {
      Build_tree();
      Compute_Total_Mass_Center();
      Compute_Force();
      Update_Positions();
    }

    Compute_Force() {
      for (i = 0; i < n; i++)
        Compute_Tree_Force(i, root)
    }

    Compute_Tree_Force(i, node) {
      F = 0;
      if (box at node contains one particle)
        F = force using eq (**)
      else {
        r = distance from i to C (*) of box
        D = size of box
        if (D/r < theta)
          F = force using eq (**) with total M
        else
          for (all children c of box)
            F = F + Compute_Tree_Force(i, c);
      }
      return F;
    }

Here (*) refers to the center of mass C and (**) to the force equation.
- Sequential time is O(n log n)
- Assuming P = n, then $t_P = O(\log P)$
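
A compact Python sketch of the tree build, mass computation, and force traversal follows. Several things here are assumptions and not taken from the slides: the force law is taken to be Newtonian gravity with constant `G = 1` (the slide's equation (**) did not survive), particles are `(x, y, m)` tuples with distinct positions, and all names are illustrative.

```python
import math

G = 1.0  # gravitational constant in arbitrary units (assumption)

class Node:
    def __init__(self, x0, y0, size, bodies):
        self.x0, self.y0, self.size = x0, y0, size  # lower-left corner, side length D
        self.bodies = bodies      # particles inside this square
        self.children = []        # non-empty sub-squares (empty ones are deleted)
        self.M = 0.0              # total mass
        self.C = (0.0, 0.0)       # center of mass

def build_tree(bodies, x0, y0, size):
    node = Node(x0, y0, size, bodies)
    if len(bodies) > 1:           # split until each leaf holds one particle
        h = size / 2
        quads = {}
        for b in bodies:
            key = (b[0] >= x0 + h, b[1] >= y0 + h)
            quads.setdefault(key, []).append(b)
        node.children = [build_tree(bs, x0 + h * kx, y0 + h * ky, h)
                         for (kx, ky), bs in quads.items()]
    return node

def compute_mass_center(node):
    if not node.children:
        x, y, m = node.bodies[0]
        node.M, node.C = m, (x, y)
        return
    for c in node.children:
        compute_mass_center(c)
    node.M = sum(c.M for c in node.children)
    node.C = (sum(c.M * c.C[0] for c in node.children) / node.M,
              sum(c.M * c.C[1] for c in node.children) / node.M)

def tree_force(body, node, theta):
    x, y, m = body
    dx, dy = node.C[0] - x, node.C[1] - y
    r = math.hypot(dx, dy)
    is_leaf = not node.children
    if is_leaf and node.bodies[0] is body:
        return (0.0, 0.0)                     # no self-interaction
    if is_leaf or (r > 0 and node.size / r < theta):
        f = G * m * node.M / (r * r)          # treat the box as one point mass
        return (f * dx / r, f * dy / r)
    fx = fy = 0.0
    for c in node.children:                   # box too close: descend
        cfx, cfy = tree_force(body, c, theta)
        fx, fy = fx + cfx, fy + cfy
    return (fx, fy)
```

With `theta = 0` the traversal always descends to the leaves and reproduces the direct pairwise sum; larger values trade accuracy for fewer force evaluations.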

Divide and Conquer

Divide and conquer strategy (definition by JáJá 1992)
1. Break up a given problem into independent subproblems
2. Solve the subproblems recursively and concurrently
3. Collect and combine the solutions into the overall solution

In contrast to the partitioning strategy, divide and conquer uses recursive partitioning with concurrent execution to divide the problem down into independent subproblems

In deeper levels of recursion the number of active processors may increase or decrease

Divide & Conquer Example 1: Parallel Recursive Matmul

Block matrix multiplication in recursion by decomposing the matrix into 2×2 submatrices and computing the submatrices recursively

    Mat matmul(Mat A, Mat B, int s) {
      if (s == 1)
        C = A * B;
      else {
        s = s/2;
        P0 = matmul(A_pp, B_pp, s);
        P1 = matmul(A_pq, B_qp, s);
        P2 = matmul(A_pp, B_pq, s);
        P3 = matmul(A_pq, B_qq, s);
        P4 = matmul(A_qp, B_pp, s);
        P5 = matmul(A_qq, B_qp, s);
        P6 = matmul(A_qp, B_pq, s);
        P7 = matmul(A_qq, B_qq, s);
        C_pp = P0 + P1;
        C_pq = P2 + P3;
        C_qp = P4 + P5;
        C_qq = P6 + P7;
      }
      return C;
    }

- P0, ..., P7 are computed in parallel
- Level of parallelism increases with deepening recursion
- Suitable for shared memory systems
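
A minimal Python sketch of the recursive block multiplication, assuming square matrices stored as nested lists with a power-of-two dimension. As a simplification, only the eight top-level products are submitted to a thread pool and deeper levels recurse sequentially, which avoids nested-pool deadlocks; the helper names are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor

def split(M):
    """Return the four quadrants M_pp, M_pq, M_qp, M_qq of a square matrix."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def matmul(A, B, pool=None):
    if len(A) == 1:                                  # base case: scalars
        return [[A[0][0] * B[0][0]]]
    App, Apq, Aqp, Aqq = split(A)
    Bpp, Bpq, Bqp, Bqq = split(B)
    pairs = [(App, Bpp), (Apq, Bqp), (App, Bpq), (Apq, Bqq),
             (Aqp, Bpp), (Aqq, Bqp), (Aqp, Bpq), (Aqq, Bqq)]
    if pool is not None:                             # P0..P7 in parallel
        P = list(pool.map(lambda ab: matmul(ab[0], ab[1]), pairs))
    else:                                            # deeper levels: sequential
        P = [matmul(a, b) for a, b in pairs]
    Cpp, Cpq = add(P[0], P[1]), add(P[2], P[3])
    Cqp, Cqq = add(P[4], P[5]), add(P[6], P[7])
    return ([rp + rq for rp, rq in zip(Cpp, Cpq)] +
            [rp + rq for rp, rq in zip(Cqp, Cqq)])

def parallel_matmul(A, B):
    with ThreadPoolExecutor(max_workers=8) as pool:
        return matmul(A, B, pool)
```

The index pattern of `pairs` follows the slide exactly, so `P[0] + P[1]` is the upper-left block of the product, and so on.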

Divide and Conquer Example 2: Parallel Convex Hull Algorithm

The planar convex hull of a set of points $S = \{p_1, p_2, \ldots, p_n\}$ with coordinates $p_i = (x, y)$ is the smallest convex polygon that encompasses all points of S on the x-y plane
[Figure: convex hull of a point set in the x-y plane]

Divide and Conquer Example 2: Parallel Convex Hull Algorithm

The upper convex hull spans points $\{q_1, \ldots, q_s\} \subseteq S$ from point $q_1$ with minimum x to $q_s$ with maximum x

The convex hull = upper convex hull + lower convex hull

Problem: Given points $S = \{p_1, \ldots, p_n\}$ such that $x(p_1) < x(p_2) < \cdots < x(p_n)$, construct the upper convex hull in parallel
[Figure: upper convex hull from $q_1$ to $q_s$ in the x-y plane]

Divide and Conquer Example 2: Parallel Convex Hull Algorithm

Parallel convex hull:
1. Divide the x-sorted points S into sets $S_1$ and $S_2$ of equal size
2. Compute the upper convex hull recursively on $S_1$ and $S_2$
3. Combine $UCH(S_1)$ and $UCH(S_2)$ by computing the upper common tangent to form $UCH(S)$
[Figure: upper common tangent joining the hulls of $S_1$ and $S_2$]

Divide and Conquer Example 2: Parallel Convex Hull Algorithm
- Base case of recursion: two points, which are returned as $UCH(S)$
- The line segment (a,b) of the upper common tangent can be computed sequentially in $O(\log n)$ time with $n = |UCH(S_1)| + |UCH(S_2)|$ using a binary search method
- Line segments can be implemented as a linked list of points, thus $UCH(S_1)$ and $UCH(S_2)$ can be connected using one pointer change of a to point to b in $O(1)$ time
- Parallel convex hull time complexity recurrence relation: $T(n) \le T(n/2) + a \log n$ with solution $T(n) = O(\log^2 n)$
- Parallel convex hull operations recurrence relation: $W(n) \le 2W(n/2) + b\,n$ with solution $W(n) = O(n \log n)$, which is cost optimal, since the sequential algorithm is $O(n \log n)$
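
The divide, recurse, and combine steps can be sketched sequentially in Python. Note the swapped-in technique: for brevity this sketch finds the upper common tangent with a linear walk along the two hulls, not with the O(log n) binary search mentioned in the slides, and it stores hulls as Python lists, not linked lists. Points are `(x, y)` tuples sorted by strictly increasing x; all names are illustrative.

```python
def cross(o, a, b):
    """> 0 if b lies to the left of the directed line o -> a, < 0 if to the right."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def upper_hull(S):
    """Upper convex hull of x-sorted points S, listed from left to right."""
    if len(S) <= 2:                       # base case of the recursion
        return list(S)
    mid = len(S) // 2
    left = upper_hull(S[:mid])            # the two recursive calls are independent
    right = upper_hull(S[mid:])           # and could run concurrently
    # Walk towards the upper common tangent (a, b) = (left[i], right[j]).
    i, j = len(left) - 1, 0
    moved = True
    while moved:
        moved = False
        # Move a leftwards while its predecessor is on or above the line a -> b.
        while i > 0 and cross(left[i], right[j], left[i - 1]) >= 0:
            i -= 1
            moved = True
        # Move b rightwards while its successor is on or above the line a -> b.
        while j < len(right) - 1 and cross(left[i], right[j], right[j + 1]) >= 0:
            j += 1
            moved = True
    return left[:i + 1] + right[j:]       # connect a to b
```

For example, `upper_hull([(0, 0), (1, 2), (2, 1), (3, 3)])` keeps `(0, 0)`, `(1, 2)` and `(3, 3)` and drops `(2, 1)`, which lies below the tangent.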

Divide and Conquer Example 3: First-Order Linear Recurrences

First-order linear recurrence
$$y_1 = b_1, \qquad y_i = a_i \, y_{i-1} + b_i, \quad 2 \le i \le n$$

Example applications:
- Prefix sum $y_i = \sum_{j=1}^{i} b_j$ is a special case of a first-order linear recurrence with $a_i = 1$ (the multiplicative unit element)
- Evaluation of a polynomial of order $n-1$ using Horner's rule, $p(x) = (((b_1 x + b_2) x + b_3) x + \cdots + b_{n-1}) x + b_n$, is a special case of a first-order linear recurrence with $a_i = x$
- Solving a bi-diagonal system $By = c$: row i reads $l_i \, y_{i-1} + d_i \, y_i = c_i$, so let $a_i = -l_i/d_i$ and $b_i = c_i/d_i$, then solve the linear recurrence to obtain the solution y
$$\begin{bmatrix} d_1 \\ l_2 & d_2 \\ & l_3 & d_3 \\ & & \ddots & \ddots \\ & & & l_n & d_n \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} c_1 \\ c_2 \\ c_3 \\ \vdots \\ c_n \end{bmatrix}$$
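
The three special cases can be checked with a plain sequential solver. The sketch below uses 0-based lists, so `a[0]` is unused; the function name `linrec` and the sample numbers are illustrative assumptions.

```python
from fractions import Fraction

def linrec(a, b):
    """Sequentially solve y[0] = b[0], y[i] = a[i]*y[i-1] + b[i]."""
    y = [b[0]]
    for i in range(1, len(b)):
        y.append(a[i] * y[-1] + b[i])
    return y

# Prefix sum: a_i = 1.
prefix = linrec([1] * 5, [1, 2, 3, 4, 5])          # [1, 3, 6, 10, 15]

# Horner's rule: p(x) = 2x^2 + 3x + 4 at x = 5, with a_i = x.
horner = linrec([5] * 3, [2, 3, 4])[-1]            # 69

# Bi-diagonal system with diagonal d, subdiagonal l, right-hand side c:
# a_i = -l_i/d_i and b_i = c_i/d_i (exact arithmetic via Fraction).
d, l, c = [2, 4, 5], [0, 1, 3], [2, 9, 21]
a = [Fraction(-l[i], d[i]) for i in range(3)]
b = [Fraction(c[i], d[i]) for i in range(3)]
y = linrec(a, b)                                   # solution [1, 2, 3]
```

Substituting `y` back confirms the sign of `a_i`: row 3 gives 3*2 + 5*3 = 21, matching `c[2]`.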

Divide and Conquer Example 3: First-Order Linear Recurrences

Rewrite $y_i = a_i \, y_{i-1} + b_i$ into
$$y_i = a_i \, (a_{i-1} \, y_{i-2} + b_{i-1}) + b_i$$

For even index i this equation defines a linear recurrence of size n/2, with $z_i = y_{2i}$:
$$z_1 = b'_1, \qquad z_i = a'_i \, z_{i-1} + b'_i, \quad 2 \le i \le n/2$$

1. Let $a'_i = a_{2i} \, a_{2i-1}$ and $b'_i = a_{2i} \, b_{2i-1} + b_{2i}$
2. Solve $z_i$ recursively
3. For $1 \le i \le n$ set
   - $y_i = z_{i/2}$ if i is even
   - $y_i = a_i \, z_{(i-1)/2} + b_i$ if i is odd and i > 1
   - $y_i = b_1$ if i = 1

Divide and Conquer Example 3: First-Order Linear Recurrences

Parallel algorithm ($\log_2 n$ recursive steps):

    linrecsolve(a[], b[], y[], n) {
      if (n == 1) {
        y[1] = b[1];
        return;
      }
      forall (i = 1 to n/2) {
        a_new[i] = a[2*i]*a[2*i-1];
        b_new[i] = a[2*i]*b[2*i-1] + b[2*i];
      }
      linrecsolve(a_new, b_new, z, n/2);
      forall (i = 1 to n) {
        if (i == 1)
          y[1] = b[1];
        else if (even(i))
          y[i] = z[i/2];
        else
          y[i] = a[i]*z[(i-1)/2] + b[i];
      }
    }

[Figure: the coefficients by recursion level for n = 8: $b_1$ at the top level, $b'_1 = a_2 b_1 + b_2$ at the next, then $((a_2 b_1 + b_2) a_3 + b_3) a_4 + b_4$, and finally $((((((a_2 b_1 + b_2) a_3 + b_3) a_4 + b_4) a_5 + b_5) a_6 + b_6) a_7 + b_7) a_8 + b_8$]
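
A Python rendering of `linrecsolve` with 0-based indexing, as a sketch: the two `forall` loops are written as ordinary loops (each iteration is independent, so they could run in parallel), and the result is returned instead of being written into `y[]`. With 0-based lists, `z[k]` holds `y[2k+1]`.

```python
def linrecsolve(a, b):
    """Solve y[0] = b[0], y[i] = a[i]*y[i-1] + b[i] by recursive halving."""
    n = len(b)
    if n == 1:
        return [b[0]]
    half = n // 2
    # Combine pairs of steps: each iteration is independent (forall).
    a_new = [a[2 * k + 1] * a[2 * k] for k in range(half)]
    b_new = [a[2 * k + 1] * b[2 * k] + b[2 * k + 1] for k in range(half)]
    z = linrecsolve(a_new, b_new)          # z[k] == y[2k+1]
    y = [None] * n
    for j in range(n):                     # independent iterations (forall)
        if j == 0:
            y[j] = b[0]
        elif j % 2 == 1:                   # 1-based even index
            y[j] = z[(j - 1) // 2]
        else:                              # 1-based odd index > 1
            y[j] = a[j] * z[j // 2 - 1] + b[j]
    return y
```

For instance, `linrecsolve([1] * 8, [1, 2, 3, 4, 5, 6, 7, 8])` returns the prefix sums `[1, 3, 6, 10, 15, 21, 28, 36]`. The recursion depth is about log2(n), matching the slide.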
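
Divide and Conquer Example 4: Triangular Matrix Inversion

Consider $Ax = b$ with triangular matrix A
$$A = \begin{bmatrix} a_{11} \\ a_{21} & a_{22} \\ a_{31} & a_{32} & a_{33} \\ \vdots & & & \ddots \\ a_{n1} & a_{n2} & \cdots & & a_{nn} \end{bmatrix}$$

Partition A into $(n/2) \times (n/2)$ blocks
$$A = \begin{bmatrix} A_1 & 0 \\ A_2 & A_3 \end{bmatrix}$$

Then $A^{-1}$ is given by
$$A^{-1} = \begin{bmatrix} A_1^{-1} & 0 \\ -A_3^{-1} A_2 A_1^{-1} & A_3^{-1} \end{bmatrix}$$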

Divide and Conquer Example 4: Triangular Matrix Inversion

Parallel algorithm:
1. Divide A into $A_1$, $A_2$, $A_3$
2. Recursively compute the inverses of $A_1$ and $A_3$ in parallel
3. Multiply $-A_3^{-1} A_2 A_1^{-1}$ and combine with $A_1^{-1}$ and $A_3^{-1}$ to get $A^{-1}$

Time complexity is given by the recurrence relation $T(n) = T(n/2) + c\,n$, with $P = n^2$ processors to compute $-A_3^{-1} A_2 A_1^{-1}$ in $O(n)$ operations in parallel, thus $T(n) = O(n)$ time
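
The block formula translates into a short recursive routine. This Python sketch runs sequentially (the two recursive calls are independent and could run in parallel), uses nested lists with exact `Fraction` arithmetic, and assumes a lower triangular matrix with nonzero diagonal; the helper names are illustrative.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def tri_inv(A):
    """Invert a lower triangular matrix via the 2x2 block formula."""
    n = len(A)
    if n == 1:
        return [[Fraction(1) / A[0][0]]]
    h = n // 2
    A1 = [row[:h] for row in A[:h]]        # upper-left block
    A2 = [row[:h] for row in A[h:]]        # lower-left block
    A3 = [row[h:] for row in A[h:]]        # lower-right block
    inv1, inv3 = tri_inv(A1), tri_inv(A3)  # independent subproblems
    low = [[-v for v in row] for row in matmul(matmul(inv3, A2), inv1)]
    top = [row + [0] * (n - h) for row in inv1]
    bottom = [lrow + irow for lrow, irow in zip(low, inv3)]
    return top + bottom
```

Multiplying `A` by `tri_inv(A)` should give the identity exactly, which is a convenient correctness check when rational arithmetic is used.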
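
Divide and Conquer Example 5: Banded Triangular Systems

Consider $Ax = b$ with banded matrix A with $m = 3$
$$A = \begin{bmatrix} a_{11} \\ a_{21} & a_{22} \\ a_{31} & a_{32} & a_{33} \\ & a_{42} & a_{43} & a_{44} \\ & & a_{53} & a_{54} & a_{55} \\ & & & a_{64} & a_{65} & a_{66} \\ & & & & a_{75} & a_{76} & a_{77} \\ & & & & & a_{86} & a_{87} & a_{88} \\ & & & & & & a_{97} & a_{98} & a_{99} \end{bmatrix}$$

Partition A into $m \times m$ blocks $A_{i,j}$ and define the block diagonal D and its inverse $D^{-1}$
$$D = \begin{bmatrix} A_{11} \\ & A_{22} \\ & & \ddots \\ & & & A_{n/m,n/m} \end{bmatrix}, \qquad D^{-1} = \begin{bmatrix} A_{11}^{-1} \\ & A_{22}^{-1} \\ & & \ddots \\ & & & A_{n/m,n/m}^{-1} \end{bmatrix}$$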

Divide and Conquer Example 5: Banded Triangular Systems

Compute $d = D^{-1} b$ and $B = D^{-1} A$, where $B_{i,i-1} = A_{ii}^{-1} A_{i,i-1}$
$$d = D^{-1} b = \begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_{n/m} \end{bmatrix}, \qquad B = D^{-1} A = \begin{bmatrix} I_m \\ B_{21} & I_m \\ & B_{32} & I_m \\ & & \ddots & \ddots \\ & & & B_{n/m,n/m-1} & I_m \end{bmatrix}$$

Solve the first-order linear recurrence on the $m \times m$ matrices $B_{i,i-1}$
$$x_1 = d_1, \qquad x_i = -B_{i,i-1} \, x_{i-1} + d_i, \quad 2 \le i \le n/m$$

Parallel time $O(m + m \log(n/m))$ with $P = nm$ processors
- Compute all $A_{ii}^{-1}$ (each requiring $O(m)$ operations) in parallel with the parallel matrix inversion algorithm
- Compute all $B_{i,i-1} = A_{ii}^{-1} A_{i,i-1}$ in $O(m)$ operations in parallel
- Recurrence depth is $\log_2(n/m)$, each step has $O(m)$ operations
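
Divide and Conquer Example 6: LU of Tridiagonal Matrix

Consider the tridiagonal matrix LU decomposition
$$\begin{bmatrix} a_1 & c_1 \\ b_2 & a_2 & c_2 \\ & b_3 & a_3 & c_3 \\ & & \ddots & \ddots & \ddots \\ & & & b_n & a_n \end{bmatrix} = \begin{bmatrix} 1 \\ l_2 & 1 \\ & l_3 & 1 \\ & & \ddots & \ddots \\ & & & l_n & 1 \end{bmatrix} \begin{bmatrix} d_1 & u_1 \\ & d_2 & u_2 \\ & & d_3 & u_3 \\ & & & \ddots & \ddots \\ & & & & d_n \end{bmatrix}$$

The LU decomposition $A = LU$ satisfies
$$a_1 = d_1, \qquad c_i = u_i, \qquad a_i = d_i + l_i \, u_{i-1}, \qquad b_i = l_i \, d_{i-1}$$
thus
$$d_1 = a_1, \qquad d_i = a_i - l_i \, u_{i-1} = a_i - \frac{u_{i-1} \, b_i}{d_{i-1}} = \frac{a_i \, d_{i-1} - b_i \, c_{i-1}}{d_{i-1}}$$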
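
Divide and Conquer Example 6: LU of Tridiagonal Matrix

Let
$$R_i = \begin{bmatrix} a_i & -b_i \, c_{i-1} \\ 1 & 0 \end{bmatrix}, \qquad T_i = R_i \, R_{i-1} \cdots R_1$$
[Matrix $R_1$: built from $a_1$; its remaining entries are not recoverable here]

From the Möbius transformation we have $d_i$ in terms of the entries of $T_i$ [formula not recoverable here]

Algorithm:
1. Set up the matrices $R_i$
2. Solve the first-order linear recurrence (a prefix computation with matrix product as the operator) for $T_i$
3. Compute $d_i$ from $T_i$
4. From the solution $d_i$ compute $l_i = b_i / d_{i-1}$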
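
A small Python sketch of the algorithm, with two loudly stated assumptions, since the slide's entries for $R_1$ and its closed form for $d_i$ are not available: it takes $R_1 = \begin{bmatrix} a_1 & 0 \\ 1 & 0 \end{bmatrix}$ and reads off $d_i = (T_i)_{11} / (T_i)_{21}$, which is one choice consistent with $d_1 = a_1$ and the recurrence $d_i = (a_i d_{i-1} - b_i c_{i-1}) / d_{i-1}$. The prefix products are computed with a sequential scan here; because 2×2 matrix multiplication is associative, a parallel prefix could replace the loop. Lists are 0-based, with `b[0]` and `c[-1]` unused.

```python
from fractions import Fraction

def mat2_mul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[X[0][0] * Y[0][0] + X[0][1] * Y[1][0], X[0][0] * Y[0][1] + X[0][1] * Y[1][1]],
            [X[1][0] * Y[0][0] + X[1][1] * Y[1][0], X[1][0] * Y[0][1] + X[1][1] * Y[1][1]]]

def lu_tridiag(a, b, c):
    """Return (l, d) for a tridiagonal matrix with diagonal a, sub b, super c."""
    n = len(a)
    # Assumed form of R_1 (not preserved in the slides): [[a_1, 0], [1, 0]].
    R = [[[a[0], 0], [1, 0]]]
    R += [[[a[i], -b[i] * c[i - 1]], [1, 0]] for i in range(1, n)]
    # Prefix products T_i = R_i ... R_1 (sequential scan in this sketch).
    T = [R[0]]
    for i in range(1, n):
        T.append(mat2_mul(R[i], T[-1]))
    # Assumed read-off: d_i is the ratio of the first-column entries of T_i.
    d = [Fraction(t[0][0]) / t[1][0] for t in T]
    l = [None] + [Fraction(b[i]) / d[i - 1] for i in range(1, n)]
    return l, d
```

For the matrix with `a = [2, 2, 2]`, `b = [0, 1, 1]`, `c = [1, 1, 0]` this gives `d = [2, 3/2, 4/3]` and `l = [None, 1/2, 2/3]`, the same values as the direct recurrence.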

Further Reading
- [PP2] pages
- [PSC] pages
More informationLecture 2: Divide and conquer and Dynamic programming
Chapter 2 Lecture 2: Divide and conquer and Dynamic programming 2.1 Divide and Conquer Idea: - divide the problem into subproblems in linear time - solve subproblems recursively - combine the results in
More informationJ.I. Aliaga 1 M. Bollhöfer 2 A.F. Martín 1 E.S. Quintana-Ortí 1. March, 2009
Parallel Preconditioning of Linear Systems based on ILUPACK for Multithreaded Architectures J.I. Aliaga M. Bollhöfer 2 A.F. Martín E.S. Quintana-Ortí Deparment of Computer Science and Engineering, Univ.
More informationSearching. Sorting. Lambdas
.. s Babes-Bolyai University arthur@cs.ubbcluj.ro Overview 1 2 3 Feedback for the course You can write feedback at academicinfo.ubbcluj.ro It is both important as well as anonymous Write both what you
More informationDivide and Conquer Problem Solving Method
Divide and Conquer Problem Solving Method 1. Problem Instances Let P be a problem that is amenable to the divide and conquer algorithm design method and let P 0, P 1, P 2, be distinct instances of the
More informationDivide-and-conquer algorithm
Divide-and-conquer algorithm IDEA: n n matrix = 2 2 matrix of (n/2) (n/2) submatrices: r=ae+bg s=af+bh t =ce+dh u=cf+dg r t s u = a c e g September 15, 2004 Introduction to Algorithms L3.31 b d C = A B
More informationKartsuba s Algorithm and Linear Time Selection
CS 374: Algorithms & Models of Computation, Fall 2015 Kartsuba s Algorithm and Linear Time Selection Lecture 09 September 22, 2015 Chandra & Manoj (UIUC) CS374 1 Fall 2015 1 / 32 Part I Fast Multiplication
More informationRecap: Prefix Sums. Given A: set of n integers Find B: prefix sums 1 / 86
Recap: Prefix Sums Given : set of n integers Find B: prefix sums : 3 1 1 7 2 5 9 2 4 3 3 B: 3 4 5 12 14 19 28 30 34 37 40 1 / 86 Recap: Parallel Prefix Sums Recursive algorithm Recursively computes sums
More informationA design paradigm. Divide and conquer: (When) does decomposing a problem into smaller parts help? 09/09/ EECS 3101
A design paradigm Divide and conquer: (When) does decomposing a problem into smaller parts help? 09/09/17 112 Multiplying complex numbers (from Jeff Edmonds slides) INPUT: Two pairs of integers, (a,b),
More informationDivide and conquer. Philip II of Macedon
Divide and conquer Philip II of Macedon Divide and conquer 1) Divide your problem into subproblems 2) Solve the subproblems recursively, that is, run the same algorithm on the subproblems (when the subproblems
More informationDivide and Conquer. Recurrence Relations
Divide and Conquer Recurrence Relations Divide-and-Conquer Strategy: Break up problem into parts. Solve each part recursively. Combine solutions to sub-problems into overall solution. 2 MergeSort Mergesort.
More informationSorting. Chapter 11. CSE 2011 Prof. J. Elder Last Updated: :11 AM
Sorting Chapter 11-1 - Sorting Ø We have seen the advantage of sorted data representations for a number of applications q Sparse vectors q Maps q Dictionaries Ø Here we consider the problem of how to efficiently
More informationAnalysis of Multithreaded Algorithms
Analysis of Multithreaded Algorithms Marc Moreno Maza University of Western Ontario, London, Ontario (Canada) CS 4435 - CS 9624 (Moreno Maza) Analysis of Multithreaded Algorithms CS 4435 - CS 9624 1 /
More informationDivide and Conquer Algorithms. CSE 101: Design and Analysis of Algorithms Lecture 14
Divide and Conquer Algorithms CSE 101: Design and Analysis of Algorithms Lecture 14 CSE 101: Design and analysis of algorithms Divide and conquer algorithms Reading: Sections 2.3 and 2.4 Homework 6 will
More information5 Spatial Access Methods
5 Spatial Access Methods 5.1 Quadtree 5.2 R-tree 5.3 K-d tree 5.4 BSP tree 3 27 A 17 5 G E 4 7 13 28 B 26 11 J 29 18 K 5.5 Grid file 9 31 8 17 24 28 5.6 Summary D C 21 22 F 15 23 H Spatial Databases and
More informationData Structures and Algorithms
Data Structures and Algorithms Spring 2017-2018 Outline 1 Sorting Algorithms (contd.) Outline Sorting Algorithms (contd.) 1 Sorting Algorithms (contd.) Analysis of Quicksort Time to sort array of length
More informationDivide & Conquer. Jordi Cortadella and Jordi Petit Department of Computer Science
Divide & Conquer Jordi Cortadella and Jordi Petit Department of Computer Science Divide-and-conquer algorithms Strategy: Divide the problem into smaller subproblems of the same type of problem Solve the
More informationITEC2620 Introduction to Data Structures
ITEC2620 Introduction to Data Structures Lecture 6a Complexity Analysis Recursive Algorithms Complexity Analysis Determine how the processing time of an algorithm grows with input size What if the algorithm
More informationHow to Multiply. 5.5 Integer Multiplication. Complex Multiplication. Integer Arithmetic. Complex multiplication. (a + bi) (c + di) = x + yi.
How to ultiply Slides by Kevin Wayne. Copyright 5 Pearson-Addison Wesley. All rights reserved. integers, matrices, and polynomials Complex ultiplication Complex multiplication. a + bi) c + di) = x + yi.
More informationO Notation (Big Oh) We want to give an upper bound on the amount of time it takes to solve a problem.
O Notation (Big Oh) We want to give an upper bound on the amount of time it takes to solve a problem. defn: v(n) = O(f(n)) constants c and n 0 such that v(n) c f(n) whenever n > n 0 Termed complexity:
More information9. Numerical linear algebra background
Convex Optimization Boyd & Vandenberghe 9. Numerical linear algebra background matrix structure and algorithm complexity solving linear equations with factored matrices LU, Cholesky, LDL T factorization
More information