Cache-Oblivious Algorithms

Size: px
Start display at page:

Download "Cache-Oblivious Algorithms"

Transcription

1 Cache-Oblivious Algorithms 1

2 Cache-Oblivious Model 2

3 The Unknown Machine Algorithm C program gcc Object code linux Execution Can be executed on machines with a specific class of CPUs Algorithm Java program javac Java bytecode java Interpretation Can be executed on any machine with a Java interpreter 3

4 The Unknown Machine Algorithm C program gcc Object code linux Execution Can be executed on machines with a specific class of CPUs Algorithm Java program javac Java bytecode java Interpretation Can be executed on any machine with a Java interpreter Goal Develop algorithms that are optimized w.r.t. memory hierarchies without knowing the parameters 3

5 Cache-Oblivious Model CPU Memory I/O Disk I/O model Algorithms do not know the parameters B and M Optimal off-line cache replacement strategy Frigo et al

6 Justification of the ideal-cache model Optimal replacement LRU + 2 cache size at most 2 cache misses Sleator an Tarjan, 1985 Corollary T M,B (N) = O(T 2M,B (N)) #cache misses using LRU is O(T M,B (N)) Two memory levels Optimal cache-oblivious algorithm satisfying T M,B (N) = O(T 2M,B (N)) optimal #cache misses on each level of a multilevel cache using LRU Fully associativity cache Simulation of LRU Direct mapped cache Explicit memory management Dictionary (2-universal hash functions) of cache lines in memory Expected O(1) access time to a cache line in memory 5

7 Matrix Multiplication 6

8 Matrix Multiplication Problem C = A B, c ij = k=1..n a ik b kj Layout of matrices Row major Column major 4 4-blocked Bit interleaved 7

9 Matrix Multiplication Algorithm 1: Nested loops Row major Reading a column of B uses N I/Os Total O(N 3 ) I/Os for i = 1 to N for j = 1 to N c ij = 0 for k = 1 to N c ij = c ij + a ik b kj 8

10 Matrix Multiplication Algorithm 1: Nested loops Row major Reading a column of B uses N I/Os Total O(N 3 ) I/Os for i = 1 to N for j = 1 to N c ij = 0 for k = 1 to N c ij = c ij + a ik b kj Algorithm 2: Blocked algorithm (cache-aware) s Partition A and B into blocks of size s s where s s = Θ( M) Apply Algorithm 1 to the N N matrices where s s elements are s s matrices

11 Matrix Multiplication Algorithm 1: Nested loops Row major Reading a column of B uses N I/Os Total O(N 3 ) I/Os for i = 1 to N for j = 1 to N c ij = 0 for k = 1 to N c ij = c ij + a ik b kj Algorithm 2: Blocked algorithm (cache-aware) 0 Partition A and B into blocks of size s s where s 8 s = Θ( M) Apply Algorithm 1 to the N N matrices where 32 s s 40 elements are s s matrices s s-blocked or ( row major and M = Ω(B 2 ) ) ( ( ) ) 3 O N s s 2 = O ( ) ( ) N 3 B s B = O N 3 I/Os B M s

12 Matrix Multiplication Algorithm 1: Nested loops Row major Reading a column of B uses N I/Os Total O(N 3 ) I/Os for i = 1 to N for j = 1 to N c ij = 0 for k = 1 to N c ij = c ij + a ik b kj Algorithm 2: Blocked algorithm (cache-aware) 0 Partition A and B into blocks of size s s where s 8 s = Θ( M) Apply Algorithm 1 to the N N matrices where 32 s s 40 elements are s s matrices s s-blocked or ( row major and M = Ω(B 2 ) ) ( ( ) ) 3 O N s s 2 = O ( ) ( ) N 3 B s B = O N 3 I/Os B M s Optimal Hong & Kung,

13 Matrix Multiplication Algorithm 3: Recursive algorithm (cache-oblivious) A 11 A 12 A 21 A 22 B 11 B 12 B 21 B 22 = A 11B 11 + A 12 B 21 A 11 B 12 + A 12 B 22 A 21 B 11 + A 22 B 21 A 21 B 12 + A 22 B 22 8 recursive N 2 N 2 matrix multiplications + 4 N 2 N 2 matrix sums 9

14 Matrix Multiplication Algorithm 3: Recursive algorithm (cache-oblivious) A 11 A 12 A 21 A 22 B 11 B 12 B 21 B 22 = A 11B 11 + A 12 B 21 A 11 B 12 + A 12 B 22 A 21 B 11 + A 22 B 21 A 21 B 12 + A 22 B 22 8 recursive N 2 N 2 matrix multiplications + 4 N 2 N 2 # I/Os if bit interleaved or ( row major and M = Ω(B 2 ) ) T (N) T (N) O O( N 2 ) if N ε M B 8 T ( ) ( ) N + O N 2 otherwise ( N 3 2 B M ) B matrix sums 9

15 Matrix Multiplication Algorithm 3: Recursive algorithm (cache-oblivious) A 11 A 12 A 21 A 22 B 11 B 12 B 21 B 22 = A 11B 11 + A 12 B 21 A 11 B 12 + A 12 B 22 A 21 B 11 + A 22 B 21 A 21 B 12 + A 22 B 22 8 recursive N 2 N 2 matrix multiplications + 4 N 2 N 2 # I/Os if bit interleaved or ( row major and M = Ω(B 2 ) ) T (N) T (N) O O( N 2 ) if N ε M B 8 T ( ) ( ) N + O N 2 otherwise ( N 3 2 B M ) B matrix sums Optimal Hong & Kung, 1981 Non-square matrices Frigo et al.,

16 Matrix Multiplication Algorithm 4: Strassen s algorithm (cache-oblivious) 7 recursive N 2 N 2 C 11 C 12 C 21 C 22 matrix multiplications + O(1) matrix sums = A 11 A 12 A 21 A 22 B 11 B 12 B 21 B 22 m 1 := (a 21 + a 22 a 11 )(b 22 b 12 + b 11 ) c 11 := m 2 + m 3 m 2 := a 11 b 11 c 12 := m 1 + m 2 + m 5 + m 6 m 3 := a 12 b 21 c 21 := m 1 + m 2 + m 4 m 7 m 4 := (a 11 a 21 )(b 22 b 12 ) c 22 := m 1 + m 2 + m 4 + m 5 m 5 := (a 21 + a 22 )(b 12 b 11 ) m 6 := (a 12 a 21 + a 11 a 22 )b 22 m 7 := a 22 (b 11 + b 22 b 12 b 21 ) 10

17 Matrix Multiplication Algorithm 4: Strassen s algorithm (cache-oblivious) 7 recursive N 2 N 2 C 11 C 12 C 21 C 22 matrix multiplications + O(1) matrix sums = A 11 A 12 A 21 A 22 B 11 B 12 B 21 B 22 m 1 := (a 21 + a 22 a 11 )(b 22 b 12 + b 11 ) c 11 := m 2 + m 3 m 2 := a 11 b 11 c 12 := m 1 + m 2 + m 5 + m 6 m 3 := a 12 b 21 c 21 := m 1 + m 2 + m 4 m 7 m 4 := (a 11 a 21 )(b 22 b 12 ) c 22 := m 1 + m 2 + m 4 + m 5 m 5 := (a 21 + a 22 )(b 12 b 11 ) m 6 := (a 12 a 21 + a 11 a 22 )b 22 m 7 := a 22 (b 11 + b 22 b 12 b 21 ) # I/Os if bit interleaved or ( row major and M = Ω(B 2 ) ) O( N 2 T (N) ) if N ε M B 7 T ( ) ( ) N + O N 2 otherwise T (N) O ( N log 2 7 B M 2 ) B log

18 Cache-Oblivious Search Trees 11

19 Static Cache-Oblivious Trees Recursive memory layout van Emde Boas layout h/2 A h B 1 B k h/2 A B 1 B k Degree O(1) Searches use O(log B N) I/Os Prokop

20 Static Cache-Oblivious Trees Recursive memory layout van Emde Boas layout h/2 A h B 1 B k h/2 A B 1 B k Degree O(1) Searches use O(log B N) I/Os Range reportings use O ( log B N + k B ) I/Os Prokop

21 Static Cache-Oblivious Trees Recursive memory layout van Emde Boas layout h/2 A h B 1 B k h/2 A B 1 B k Degree O(1) Searches use O(log B N) I/Os Range reportings use O ( log B N + k B ) I/Os Best possible (log 2 e + o(1)) log B N Prokop 1999 Bender, Brodal, Fagerberg, Ge, He, Hu Iacono, López-Ortiz

22 Dynamic Cache-Oblivious Trees Embed a dynamic tree of small height into a complete tree Static van Emde Boas layout Rebuild data structure whenever N doubles of halves Search O(log B N) Range Reporting Updates ) O ( log B N + k B O ( log B N + log2 N B ) Brodal, Fagerberg, Jacob

23 Example

24 Binary Trees of Small Height New If an insertion causes non-small height then rebuild subtree at nearest ancestor with sufficient few descendents Insertions require amortized time O(log 2 N) Andersson and Lai

25 Binary Trees of Small Height For each level i there is a threshold τ i = τ L + i, such that 0 < τ L = τ 0 < τ 1 < < τ H = τ U < 1 For a node v i on level i definethe density Insertion ρ(v i ) = # nodes below v i m i where m i = # possible nodes below v i with depth at most H Insert new element If depth > H then locate neirest ancestor v i with ρ(v i ) τ i and rebuild subtree at v i to have minimum height and elements evenly distributed between left and right subtrees Andersson and Lai

26 Binary Trees of Small Height Theorem Insertions require amortized time O(log 2 N) Proof Consider two redistributions of v i After the first redistribution ρ(v i ) τ i Before second redistribution a child v i+1 of v i has ρ(v i+1 ) > τ i+1 Insertions below v i : m(v i+1 ) (τ i+1 τ i ) = m(v i+1 ) Redistribution of v i costs m(v i ), i.e. per insertion below v i Total insertion cost per element m(v i ) m(v i+1 ) 2 H i=0 2 = O(log2 N) Andersson and Lai

27 Memory Layouts of Trees DFS inorder BFS van Emde Boas (in theory best) 18

28 Searches in Pointer Based Layouts pointer bfs pointer dfs pointer veb pointer random insert pointer random layout e+06 van Emde Boas layout wins, followed by the BFS layout 19

29 Searches with Implicit Layouts implicit bfs implicit dfs implicit veb implicit in-order implicit 9-ary bfs e+06 BFS layout wins due to simplicity and caching of topmost levels van Emde Boas layout requires quite complex index computations 20

30 Implicit vs Pointer Based Layouts pointer bfs implicit bfs pointer veb implicit veb e+06 BFS layout e+06 van Emde Boas layout Implicit layouts become competitive as n grows 21

31 Insertions in Implicit Layouts 0.1 implicit bfs random inserts implicit in-order random inserts implicit veb random inserts e+06 Insertions are rather slow (factor over searches) 22

32 Summary Dynamic cache-oblivious search trees Search O(log B N) Range Reporting Updates ) O ( log B N + k B O ( log B N + log2 N B Update time O(log B N) by one level of indirection (implies sub-optimal range reporting) Importance of memory layouts van Emde Boas layout gives good cache performance Computation time is important when considering caches ) 23

33 Cache-Oblivious Sorting 24

34 Sorting Problem Input : array containing x 1,..., x N Output : array with x 1,..., x N in sorted order Elements can be compared and copied

35 Binary Merge-Sort Ouput Merging Merging Merging Merging Input

36 Binary Merge-Sort Ouput Merging Merging Merging Merging Input Recursive; two arrays; size O(M) internally in cache O(N log N) comparisons O ( N log ) B 2 N M I/Os 26

37 Merge-Sort Degree I/O 2 O ( N log ) B 2 N M d (d M 1) B Θ ( M B ) O ( N B log d N M O ( N log ) B M/B N M = O(SortM,B (N)) Aggarwal and Vitter 1988 Funnel-Sort 2 O( 1 ε Sort M,B(N)) (M B 1+ε ) Frigo, Leiserson, Prokop and Ramachandran 1999 ) Brodal and Fagerberg

38 Lower Bound Brodal and Fagerberg 2003 Block Size Memory I/Os Machine 1 B 1 M t 1 Machine 2 B 2 M t 2 One algorithm, two machines, B 1 B 2 Trade-off 8t 1 B 1 + 3t 1 B 1 log 8Mt 2 t 1 B 1 N log N M 1.45N 28

39 Lower Bound Assumption I/Os Lazy Funnel-sort Binary Merge-sort B M 1 ε (a) B 2 = M 1 ε : Sort B2,M(N) (b) B 1 = 1 : Sort B1,M(N) 1 ε B M/2 (a) B 2 = M/2 : Sort B2,M(N) (b) B 1 = 1 : Sort B1,M(N) log M Corollary (a) (b) 29

40 Funnel-Sort 30

41 k-merger Frigo et al., FOCS 99 Sorted output stream M k sorted input streams 31

42 k-merger Frigo et al., FOCS 99 Sorted output stream M 0 k 1/2 -mergers M Recursive def. = B 1 B k buffers of size k 3/2 M 1 M k k sorted input streams 31

43 k-merger Frigo et al., FOCS 99 Sorted output stream M 0 k 1/2 -mergers M Recursive def. = B 1 B k buffers of size k 3/2 M 1 M k k sorted input streams M 0 B 1 M 1 B 2 M 2 B k M k Recursive Layout 31

44 Lazy k-merger Brodal and Fagerberg 2002 M 0 B 1 B k M 1 M k 32

45 Lazy k-merger Brodal and Fagerberg 2002 M 0 B 1 B k M 1 M k Procedure Fill(v) while out-buffer not full if left in-buffer empty Fill(left child) if right in-buffer empty Fill(right child) perform one merge step 32

46 Lazy k-merger Brodal and Fagerberg 2002 M 0 B 1 B k M 1 M k Procedure Fill(v) while out-buffer not full if left in-buffer empty Fill(left child) if right in-buffer empty Fill(right child) perform one merge step Lemma If M B 2 and output buffer has size k 3 then O( k3 B log M(k 3 ) + k) I/Os are done during an invocation of Fill(root) 32

47 Funnel-Sort Brodal and Fagerberg 2002 Frigo, Leiserson, Prokop and Ramachandran 1999 Divide input in N 1/3 segments of size N 2/3 Recursively Funnel-Sort each segment Merge sorted segments by an N 1/3 -merger k N 1/3 N 2/9 N 4/

48 Funnel-Sort Brodal and Fagerberg 2002 Frigo, Leiserson, Prokop and Ramachandran 1999 Divide input in N 1/3 segments of size N 2/3 Recursively Funnel-Sort each segment Merge sorted segments by an N 1/3 -merger k N 1/3 N 2/9 N 4/27. 2 Theorem Funnel-Sort performs O(Sort M,B (N)) I/Os for M B 2 33

49 Hardware Processor type Pentium 4 Pentium 3 MIPS Workstation Dell PC Delta PC SGI Octane Operating system GNU/Linux Kernel GNU/Linux Kernel IRIX version 6.5 version version Clock rate 2400 MHz 800 MHz 175 MHz Address space 32 bit 32 bit 64 bit Integer pipeline stages L1 data cache size 8 KB 16 KB 32 KB L1 line size 128 Bytes 32 Bytes 32 Bytes L1 associativity 4 way 4 way 2 way L2 cache size 512 KB 256 KB 1024 KB L2 line size 128 Bytes 32 Bytes 32 Bytes L2 associativity 8 way 4 way 2 way TLB entries TLB associativity Full 4 way 64 way TLB miss handler Hardware Hardware Software Main memory 512 MB 256 MB 128 MB 34

50 Wall Clock 100.0µs Pentium 4, 512/512 ffunnelsort funnelsort lowscosa stdsort ami_sort msort-c msort-m Wall clock time per element 10.0µs 1.0µs 0.1µs 1,000,000 10,000, ,000,000 1,000,000,000 Elements Kristoffer Vinther

51 Page Faults Pentium 4, 512/512 ffunnelsort funnelsort lowscosa stdsort msort-c msort-m Page faults per block of elements ,000,000 10,000, ,000,000 1,000,000,000 Elements Kristoffer Vinther

52 Cache Misses MIPS 10000, 1024/128 ffunnelsort funnelsort lowscosa stdsort msort-c msort-m L2 cache misses per lines of elements ,000 1,000,000 10,000, ,000,000 1,000,000,000 Elements Kristoffer Vinther

53 TLB Misses 10.0 MIPS 10000, 1024/128 ffunnelsort funnelsort lowscosa stdsort msort-c msort-m TLB misses per block of elements ,000 1,000,000 10,000, ,000,000 1,000,000,000 Elements Kristoffer Vinther

54 Conclusions Cache oblivious sorting is possible requires a tall cache assumption M B 1+ε comparable performance with cache aware algorithms Future work more experimental justification for the cache oblivious model limitations of the model time space trade-offs? tool-box for cache oblivious algorithms 39

Cache-Oblivious Algorithms

Cache-Oblivious Algorithms Cache-Oblivious Algorithms 1 Cache-Oblivious Model 2 The Unknown Machine Algorithm C program gcc Object code linux Execution Can be executed on machines with a specific class of CPUs Algorithm Java program

More information

Data Structures 1 NTIN066

Data Structures 1 NTIN066 Data Structures 1 NTIN066 Jirka Fink Department of Theoretical Computer Science and Mathematical Logic Faculty of Mathematics and Physics Charles University in Prague Winter semester 2017/18 Last change

More information

On the Limits of Cache-Obliviousness

On the Limits of Cache-Obliviousness On the Limits of Cache-Obliviousness Gerth Stølting Brodal, BRICS, Department of Computer Science University of Aarhus Ny Munegade, DK-8000 Aarhus C, Denmar gerth@bricsd Rolf Fagerberg BRICS, Department

More information

Data Structures 1 NTIN066

Data Structures 1 NTIN066 Data Structures 1 NTIN066 Jiří Fink https://kam.mff.cuni.cz/ fink/ Department of Theoretical Computer Science and Mathematical Logic Faculty of Mathematics and Physics Charles University in Prague Winter

More information

Cache-Oblivious Computations: Algorithms and Experimental Evaluation

Cache-Oblivious Computations: Algorithms and Experimental Evaluation Cache-Oblivious Computations: Algorithms and Experimental Evaluation Vijaya Ramachandran Department of Computer Sciences University of Texas at Austin Dissertation work of former PhD student Dr. Rezaul

More information

Scanning and Traversing: Maintaining Data for Traversals in a Memory Hierarchy

Scanning and Traversing: Maintaining Data for Traversals in a Memory Hierarchy Scanning and Traversing: Maintaining Data for Traversals in a Memory Hierarchy Michael A. ender 1, Richard Cole 2, Erik D. Demaine 3, and Martin Farach-Colton 4 1 Department of Computer Science, State

More information

A Tight Lower Bound for Dynamic Membership in the External Memory Model

A Tight Lower Bound for Dynamic Membership in the External Memory Model A Tight Lower Bound for Dynamic Membership in the External Memory Model Elad Verbin ITCS, Tsinghua University Qin Zhang Hong Kong University of Science & Technology April 2010 1-1 The computational model

More information

Optimal Color Range Reporting in One Dimension

Optimal Color Range Reporting in One Dimension Optimal Color Range Reporting in One Dimension Yakov Nekrich 1 and Jeffrey Scott Vitter 1 The University of Kansas. yakov.nekrich@googlemail.com, jsv@ku.edu Abstract. Color (or categorical) range reporting

More information

Dynamic Ordered Sets with Exponential Search Trees

Dynamic Ordered Sets with Exponential Search Trees Dynamic Ordered Sets with Exponential Search Trees Arne Andersson Computing Science Department Information Technology, Uppsala University Box 311, SE - 751 05 Uppsala, Sweden arnea@csd.uu.se http://www.csd.uu.se/

More information

On the limits of cache-oblivious rational permutations

On the limits of cache-oblivious rational permutations Theoretical Computer Science 402 (2008) 221 233 Contents lists available at ScienceDirect Theoretical Computer Science journal homepage: www.elsevier.com/locate/tcs On the limits of cache-oblivious rational

More information

CO Algorithms: Brief History

CO Algorithms: Brief History Piyush Kumar Departmet of Computer Sciece Advisor: Joseph S.B. Mitchell CO Algorithms: Brief History Frigo, Leiserso, Prokop, Ramachadra (FOCS 99) Cache Oblivious Algorithms Thesis Harold Prokop s Thesis

More information

CSE548, AMS542: Analysis of Algorithms, Fall 2017 Date: October 11. In-Class Midterm. ( 7:05 PM 8:20 PM : 75 Minutes )

CSE548, AMS542: Analysis of Algorithms, Fall 2017 Date: October 11. In-Class Midterm. ( 7:05 PM 8:20 PM : 75 Minutes ) CSE548, AMS542: Analysis of Algorithms, Fall 2017 Date: October 11 In-Class Midterm ( 7:05 PM 8:20 PM : 75 Minutes ) This exam will account for either 15% or 30% of your overall grade depending on your

More information

Chapter 5 Data Structures Algorithm Theory WS 2017/18 Fabian Kuhn

Chapter 5 Data Structures Algorithm Theory WS 2017/18 Fabian Kuhn Chapter 5 Data Structures Algorithm Theory WS 2017/18 Fabian Kuhn Priority Queue / Heap Stores (key,data) pairs (like dictionary) But, different set of operations: Initialize-Heap: creates new empty heap

More information

Cache-Oblivious Range Reporting With Optimal Queries Requires Superlinear Space

Cache-Oblivious Range Reporting With Optimal Queries Requires Superlinear Space Cache-Oblivious Range Reporting With Optimal Queries Requires Superlinear Space Peyman Afshani MADALGO Dept. of Computer Science University of Aarhus IT-Parken, Aabogade 34 DK-8200 Aarhus N Denmark peyman@madalgo.au.dk

More information

The Complexity of Constructing Evolutionary Trees Using Experiments

The Complexity of Constructing Evolutionary Trees Using Experiments The Complexity of Constructing Evolutionary Trees Using Experiments Gerth Stlting Brodal 1,, Rolf Fagerberg 1,, Christian N. S. Pedersen 1,, and Anna Östlin2, 1 BRICS, Department of Computer Science, University

More information

lsb(x) = Least Significant Bit?

lsb(x) = Least Significant Bit? lsb(x) = Least Significant Bit? w-1 lsb i 1 0 x 0 0 1 0 1 0 1 1 0 0 1 0 0 0 0 0 1 msb(x) in O(1) steps using 5 multiplications [M.L. Fredman, D.E. Willard, Surpassing the information-theoretic bound with

More information

Advanced Data Structures

Advanced Data Structures Simon Gog gog@kit.edu - Simon Gog: KIT The Research University in the Helmholtz Association www.kit.edu Predecessor data structures We want to support the following operations on a set of integers from

More information

Analysis of Recursive Cache-Adaptive Algorithms. Andrea Lincoln

Analysis of Recursive Cache-Adaptive Algorithms. Andrea Lincoln Analysis of Recursive Cache-Adaptive Algorithms by Andrea Lincoln Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of

More information

Advanced Data Structures

Advanced Data Structures Simon Gog gog@kit.edu - Simon Gog: KIT University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association www.kit.edu Predecessor data structures We want to support

More information

Assignment 5: Solutions

Assignment 5: Solutions Comp 21: Algorithms and Data Structures Assignment : Solutions 1. Heaps. (a) First we remove the minimum key 1 (which we know is located at the root of the heap). We then replace it by the key in the position

More information

The Cost of Cache-Oblivious Searching

The Cost of Cache-Oblivious Searching The Cost of Cache-Oblivious Searching Michael A Bender Gerth Stølting Brodal Rolf Fagerberg Dongdong Ge Simai He Haodong Hu John Iacono Alejandro López-Ortiz Abstract This paper gives tight bounds on the

More information

Exam 1. March 12th, CS525 - Midterm Exam Solutions

Exam 1. March 12th, CS525 - Midterm Exam Solutions Name CWID Exam 1 March 12th, 2014 CS525 - Midterm Exam s Please leave this empty! 1 2 3 4 5 Sum Things that you are not allowed to use Personal notes Textbook Printed lecture notes Phone The exam is 90

More information

Notes on Logarithmic Lower Bounds in the Cell Probe Model

Notes on Logarithmic Lower Bounds in the Cell Probe Model Notes on Logarithmic Lower Bounds in the Cell Probe Model Kevin Zatloukal November 10, 2010 1 Overview Paper is by Mihai Pâtraşcu and Erik Demaine. Both were at MIT at the time. (Mihai is now at AT&T Labs.)

More information

Hardware Acceleration Technologies in Computer Algebra: Challenges and Impact

Hardware Acceleration Technologies in Computer Algebra: Challenges and Impact Hardware Acceleration Technologies in Computer Algebra: Challenges and Impact Ph.D Thesis Presentation of Sardar Anisul Haque Supervisor: Dr. Marc Moreno Maza Ontario Research Centre for Computer Algebra

More information

Lecture 4: Divide and Conquer: van Emde Boas Trees

Lecture 4: Divide and Conquer: van Emde Boas Trees Lecture 4: Divide and Conquer: van Emde Boas Trees Series of Improved Data Structures Insert, Successor Delete Space This lecture is based on personal communication with Michael Bender, 001. Goal We want

More information

Finger Search in the Implicit Model

Finger Search in the Implicit Model Finger Search in the Implicit Model Gerth Stølting Brodal, Jesper Sindahl Nielsen, Jakob Truelsen MADALGO, Department o Computer Science, Aarhus University, Denmark. {gerth,jasn,jakobt}@madalgo.au.dk Abstract.

More information

Randomized Sorting Algorithms Quick sort can be converted to a randomized algorithm by picking the pivot element randomly. In this case we can show th

Randomized Sorting Algorithms Quick sort can be converted to a randomized algorithm by picking the pivot element randomly. In this case we can show th CSE 3500 Algorithms and Complexity Fall 2016 Lecture 10: September 29, 2016 Quick sort: Average Run Time In the last lecture we started analyzing the expected run time of quick sort. Let X = k 1, k 2,...,

More information

Disconnecting Networks via Node Deletions

Disconnecting Networks via Node Deletions 1 / 27 Disconnecting Networks via Node Deletions Exact Interdiction Models and Algorithms Siqian Shen 1 J. Cole Smith 2 R. Goli 2 1 IOE, University of Michigan 2 ISE, University of Florida 2012 INFORMS

More information

Worst-Case Execution Time Analysis. LS 12, TU Dortmund

Worst-Case Execution Time Analysis. LS 12, TU Dortmund Worst-Case Execution Time Analysis Prof. Dr. Jian-Jia Chen LS 12, TU Dortmund 02, 03 May 2016 Prof. Dr. Jian-Jia Chen (LS 12, TU Dortmund) 1 / 53 Most Essential Assumptions for Real-Time Systems Upper

More information

Using Hashing to Solve the Dictionary Problem (In External Memory)

Using Hashing to Solve the Dictionary Problem (In External Memory) Using Hashing to Solve the Dictionary Problem (In External Memory) John Iacono NYU Poly Mihai Pǎtraşcu AT&T Labs February 9, 2011 Abstract We consider the dictionary problem in external memory and improve

More information

Randomization in Algorithms and Data Structures

Randomization in Algorithms and Data Structures Randomization in Algorithms and Data Structures 3 lectures (Tuesdays 14:15-16:00) May 1: Gerth Stølting Brodal May 8: Kasper Green Larsen May 15: Peyman Afshani For each lecture one handin exercise deadline

More information

Cache Complexity and Multicore Implementation for Univariate Real Root Isolation

Cache Complexity and Multicore Implementation for Univariate Real Root Isolation Cache Complexity and Multicore Implementation for Univariate Real Root Isolation Changbo Chen, Marc Moreno Maza and Yuzhen Xie University of Western Ontario, London, Ontario, Canada E-mail: cchen5,moreno,yxie@csd.uwo.ca

More information

Introduction to Algorithms 6.046J/18.401J/SMA5503

Introduction to Algorithms 6.046J/18.401J/SMA5503 Introduction to Algorithms 6.046J/8.40J/SMA5503 Lecture 3 Prof. Piotr Indyk The divide-and-conquer design paradigm. Divide the problem (instance) into subproblems. 2. Conquer the subproblems by solving

More information

ICS 233 Computer Architecture & Assembly Language

ICS 233 Computer Architecture & Assembly Language ICS 233 Computer Architecture & Assembly Language Assignment 6 Solution 1. Identify all of the RAW data dependencies in the following code. Which dependencies are data hazards that will be resolved by

More information

INF2220: algorithms and data structures Series 1

INF2220: algorithms and data structures Series 1 Universitetet i Oslo Institutt for Informatikk I. Yu, D. Karabeg INF2220: algorithms and data structures Series 1 Topic Function growth & estimation of running time, trees (Exercises with hints for solution)

More information

past balancing schemes require maintenance of balance info at all times, are aggresive use work of searches to pay for work of rebalancing

past balancing schemes require maintenance of balance info at all times, are aggresive use work of searches to pay for work of rebalancing 1 Splay Trees Sleator and Tarjan, Self Adjusting Binary Search Trees JACM 32(3) 1985 The claim planning ahead. But matches previous idea of being lazy, letting potential build up, using it to pay for expensive

More information

Splay Trees. CMSC 420: Lecture 8

Splay Trees. CMSC 420: Lecture 8 Splay Trees CMSC 420: Lecture 8 AVL Trees Nice Features: - Worst case O(log n) performance guarantee - Fairly simple to implement Problem though: - Have to maintain etra balance factor storage at each

More information

Algorithms and Data Structures

Algorithms and Data Structures Algorithms and Data Structures, Divide and Conquer Albert-Ludwigs-Universität Freiburg Prof. Dr. Rolf Backofen Bioinformatics Group / Department of Computer Science Algorithms and Data Structures, December

More information

The Divide-and-Conquer Design Paradigm

The Divide-and-Conquer Design Paradigm CS473- Algorithms I Lecture 4 The Divide-and-Conquer Design Paradigm CS473 Lecture 4 1 The Divide-and-Conquer Design Paradigm 1. Divide the problem (instance) into subproblems. 2. Conquer the subproblems

More information

6.854 Advanced Algorithms

6.854 Advanced Algorithms 6.854 Advanced Algorithms Homework Solutions Hashing Bashing. Solution:. O(log U ) for the first level and for each of the O(n) second level functions, giving a total of O(n log U ) 2. Suppose we are using

More information

Cache Oblivious Stencil Computations

Cache Oblivious Stencil Computations Cache Oblivious Stencil Computations S. HUNOLD J. L. TRÄFF F. VERSACI Lectures on High Performance Computing 13 April 2015 F. Versaci (TU Wien) Cache Oblivious Stencil Computations 13 April 2015 1 / 19

More information

Quiz 2. Due November 26th, CS525 - Advanced Database Organization Solutions

Quiz 2. Due November 26th, CS525 - Advanced Database Organization Solutions Name CWID Quiz 2 Due November 26th, 2015 CS525 - Advanced Database Organization s Please leave this empty! 1 2 3 4 5 6 7 Sum Instructions Multiple choice questions are graded in the following way: You

More information

Theory of Computing Systems 2006 Springer Science+Business Media, Inc.

Theory of Computing Systems 2006 Springer Science+Business Media, Inc. Theory Comput. Systems 39, 391 397 2006 DOI: 10.1007/s00224-005-1237-z Theory of Computing Systems 2006 Springer Science+Business Media, Inc. INSERTION SORT Is On log n Michael A. Bender, 1 Martín Farach-Colton,

More information

Worst-Case Execution Time Analysis. LS 12, TU Dortmund

Worst-Case Execution Time Analysis. LS 12, TU Dortmund Worst-Case Execution Time Analysis Prof. Dr. Jian-Jia Chen LS 12, TU Dortmund 09/10, Jan., 2018 Prof. Dr. Jian-Jia Chen (LS 12, TU Dortmund) 1 / 43 Most Essential Assumptions for Real-Time Systems Upper

More information

Caches in WCET Analysis

Caches in WCET Analysis Caches in WCET Analysis Jan Reineke Department of Computer Science Saarland University Saarbrücken, Germany ARTIST Summer School in Europe 2009 Autrans, France September 7-11, 2009 Jan Reineke Caches in

More information

Advanced Implementations of Tables: Balanced Search Trees and Hashing

Advanced Implementations of Tables: Balanced Search Trees and Hashing Advanced Implementations of Tables: Balanced Search Trees and Hashing Balanced Search Trees Binary search tree operations such as insert, delete, retrieve, etc. depend on the length of the path to the

More information

Communication avoiding parallel algorithms for dense matrix factorizations

Communication avoiding parallel algorithms for dense matrix factorizations Communication avoiding parallel dense matrix factorizations 1/ 44 Communication avoiding parallel algorithms for dense matrix factorizations Edgar Solomonik Department of EECS, UC Berkeley October 2013

More information

Measurement & Performance

Measurement & Performance Measurement & Performance Timers Performance measures Time-based metrics Rate-based metrics Benchmarking Amdahl s law Topics 2 Page The Nature of Time real (i.e. wall clock) time = User Time: time spent

More information

On the Study of Tree Pattern Matching Algorithms and Applications

On the Study of Tree Pattern Matching Algorithms and Applications On the Study of Tree Pattern Matching Algorithms and Applications by Fei Ma B.Sc., Simon Fraser University, 2004 A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of

More information

Measurement & Performance

Measurement & Performance Measurement & Performance Topics Timers Performance measures Time-based metrics Rate-based metrics Benchmarking Amdahl s law 2 The Nature of Time real (i.e. wall clock) time = User Time: time spent executing

More information

Parallelization of the QC-lib Quantum Computer Simulator Library

Parallelization of the QC-lib Quantum Computer Simulator Library Parallelization of the QC-lib Quantum Computer Simulator Library Ian Glendinning and Bernhard Ömer September 9, 23 PPAM 23 1 Ian Glendinning / September 9, 23 Outline Introduction Quantum Bits, Registers

More information

Bloom Filters, general theory and variants

Bloom Filters, general theory and variants Bloom Filters: general theory and variants G. Caravagna caravagn@cli.di.unipi.it Information Retrieval Wherever a list or set is used, and space is a consideration, a Bloom Filter should be considered.

More information

Online Sorted Range Reporting and Approximating the Mode

Online Sorted Range Reporting and Approximating the Mode Online Sorted Range Reporting and Approximating the Mode Mark Greve Progress Report Department of Computer Science Aarhus University Denmark January 4, 2010 Supervisor: Gerth Stølting Brodal Online Sorted

More information

MATHEMATICAL OBJECTS in

MATHEMATICAL OBJECTS in MATHEMATICAL OBJECTS in Computational Tools in a Unified Object-Oriented Approach Yair Shapira @ CRC Press Taylor & Francis Group Boca Raton London New York CRC Press is an imprint of the Taylor & Francis

More information

Fibonacci Heaps These lecture slides are adapted from CLRS, Chapter 20.

Fibonacci Heaps These lecture slides are adapted from CLRS, Chapter 20. Fibonacci Heaps These lecture slides are adapted from CLRS, Chapter 20. Princeton University COS 4 Theory of Algorithms Spring 2002 Kevin Wayne Priority Queues Operation mae-heap insert find- delete- union

More information

Revisiting the Cache Miss Analysis of Multithreaded Algorithms

Revisiting the Cache Miss Analysis of Multithreaded Algorithms Revisiting the Cache Miss Analysis of Multithreaded Algorithms Richard Cole Vijaya Ramachandran September 19, 2012 Abstract This paper concerns the cache miss analysis of algorithms when scheduled in work-stealing

More information

Administrivia. Course Objectives. Overview. Lecture Notes Week markem/cs333/ 2. Staff. 3. Prerequisites. 4. Grading. 1. Theory and application

Administrivia. Course Objectives. Overview. Lecture Notes Week markem/cs333/ 2. Staff. 3. Prerequisites. 4. Grading. 1. Theory and application Administrivia 1. markem/cs333/ 2. Staff 3. Prerequisites 4. Grading Course Objectives 1. Theory and application 2. Benefits 3. Labs TAs Overview 1. What is a computer system? CPU PC ALU System bus Memory

More information

Lower Bounds for Dynamic Connectivity (2004; Pǎtraşcu, Demaine)

Lower Bounds for Dynamic Connectivity (2004; Pǎtraşcu, Demaine) Lower Bounds for Dynamic Connectivity (2004; Pǎtraşcu, Demaine) Mihai Pǎtraşcu, MIT, web.mit.edu/ mip/www/ Index terms: partial-sums problem, prefix sums, dynamic lower bounds Synonyms: dynamic trees 1

More information

Communication Complexity

Communication Complexity Communication Complexity Jaikumar Radhakrishnan School of Technology and Computer Science Tata Institute of Fundamental Research Mumbai 31 May 2012 J 31 May 2012 1 / 13 Plan 1 Examples, the model, the

More information

Lecture 19. Architectural Directions

Lecture 19. Architectural Directions Lecture 19 Architectural Directions Today s lecture Advanced Architectures NUMA Blue Gene 2010 Scott B. Baden / CSE 160 / Winter 2010 2 Final examination Announcements Thursday, March 17, in this room:

More information

Timing analysis and timing predictability

Timing analysis and timing predictability Timing analysis and timing predictability Caches in WCET Analysis Reinhard Wilhelm 1 Jan Reineke 2 1 Saarland University, Saarbrücken, Germany 2 University of California, Berkeley, USA ArtistDesign Summer

More information

Data Structures and Algorithms " Search Trees!!

Data Structures and Algorithms  Search Trees!! Data Structures and Algorithms " Search Trees!! Outline" Binary Search Trees! AVL Trees! (2,4) Trees! 2 Binary Search Trees! "! < 6 2 > 1 4 = 8 9 Ordered Dictionaries" Keys are assumed to come from a total

More information

Performance Metrics for Computer Systems. CASS 2018 Lavanya Ramapantulu

Performance Metrics for Computer Systems. CASS 2018 Lavanya Ramapantulu Performance Metrics for Computer Systems CASS 2018 Lavanya Ramapantulu Eight Great Ideas in Computer Architecture Design for Moore s Law Use abstraction to simplify design Make the common case fast Performance

More information

On Adaptive Integer Sorting

On Adaptive Integer Sorting On Adaptive Integer Sorting Anna Pagh 1, Rasmus Pagh 1, and Mikkel Thorup 2 1 IT University of Copenhagen, Rued Langgaardsvej 7, 2300 København S, Denmark {annao, pagh}@itu.dk 2 AT&T Labs Research, Shannon

More information

Atomic snapshots in O(log 3 n) steps using randomized helping

Atomic snapshots in O(log 3 n) steps using randomized helping Atomic snapshots in O(log 3 n) steps using randomized helping James Aspnes Keren Censor-Hillel September 7, 2018 Abstract A randomized construction of single-writer snapshot objects from atomic registers

More information

BRICS. Hashing, Randomness and Dictionaries. Basic Research in Computer Science BRICS DS-02-5 R. Pagh: Hashing, Randomness and Dictionaries

BRICS. Hashing, Randomness and Dictionaries. Basic Research in Computer Science BRICS DS-02-5 R. Pagh: Hashing, Randomness and Dictionaries BRICS Basic Research in Computer Science BRICS DS-02-5 R. Pagh: Hashing, Randomness and Dictionaries Hashing, Randomness and Dictionaries Rasmus Pagh BRICS Dissertation Series DS-02-5 ISSN 1396-7002 October

More information

Analysis of Algorithms. Outline 1 Introduction Basic Definitions Ordered Trees. Fibonacci Heaps. Andres Mendez-Vazquez. October 29, Notes.

Analysis of Algorithms. Outline 1 Introduction Basic Definitions Ordered Trees. Fibonacci Heaps. Andres Mendez-Vazquez. October 29, Notes. Analysis of Algorithms Fibonacci Heaps Andres Mendez-Vazquez October 29, 2015 1 / 119 Outline 1 Introduction Basic Definitions Ordered Trees 2 Binomial Trees Example 3 Fibonacci Heap Operations Fibonacci

More information

Cache-Efficient Algorithms II

Cache-Efficient Algorithms II 6.172 Performace Egieerig of Software Systems LECTURE 9 Cache-Efficiet Algorithms II Charles E. Leiserso October 7, 2010 2010 Charles E. Leiserso 1 Ideal-Cache Model Features Two-level hierarchy. Cache

More information

Problem. Problem Given a dictionary and a word. Which page (if any) contains the given word? 3 / 26

Problem. Problem Given a dictionary and a word. Which page (if any) contains the given word? 3 / 26 Binary Search Introduction Problem Problem Given a dictionary and a word. Which page (if any) contains the given word? 3 / 26 Strategy 1: Random Search Randomly select a page until the page containing

More information

Cache-Oblivious Hashing

Cache-Oblivious Hashing Cache-Oblivious Hashing Zhewei Wei Hong Kong University of Science & Technology Joint work with Rasmus Pagh, Ke Yi and Qin Zhang Dictionary Problem Store a subset S of the Universe U. Lookup: Does x belong

More information

Quiz 1 Solutions. (a) f 1 (n) = 8 n, f 2 (n) = , f 3 (n) = ( 3) lg n. f 2 (n), f 1 (n), f 3 (n) Solution: (b)

Quiz 1 Solutions. (a) f 1 (n) = 8 n, f 2 (n) = , f 3 (n) = ( 3) lg n. f 2 (n), f 1 (n), f 3 (n) Solution: (b) Introduction to Algorithms October 14, 2009 Massachusetts Institute of Technology 6.006 Spring 2009 Professors Srini Devadas and Constantinos (Costis) Daskalakis Quiz 1 Solutions Quiz 1 Solutions Problem

More information

Lower Bounds for External Memory Integer Sorting via Network Coding

Lower Bounds for External Memory Integer Sorting via Network Coding Lower Bounds for External Memory Integer Sorting via Network Coding Alireza Farhadi University of Maryland College Park, MD MohammadTaghi Hajiaghayi University of Maryland College Park, MD Elaine Shi Cornell

More information

Quiz 1 Solutions. Problem 2. Asymptotics & Recurrences [20 points] (3 parts)

Quiz 1 Solutions. Problem 2. Asymptotics & Recurrences [20 points] (3 parts) Introduction to Algorithms October 13, 2010 Massachusetts Institute of Technology 6.006 Fall 2010 Professors Konstantinos Daskalakis and Patrick Jaillet Quiz 1 Solutions Quiz 1 Solutions Problem 1. We

More information

FIBONACCI HEAPS. preliminaries insert extract the minimum decrease key bounding the rank meld and delete. Copyright 2013 Kevin Wayne

FIBONACCI HEAPS. preliminaries insert extract the minimum decrease key bounding the rank meld and delete. Copyright 2013 Kevin Wayne FIBONACCI HEAPS preliminaries insert extract the minimum decrease key bounding the rank meld and delete Copyright 2013 Kevin Wayne http://www.cs.princeton.edu/~wayne/kleinberg-tardos Last updated on Sep

More information

Succinct Data Structures for the Range Minimum Query Problems

Succinct Data Structures for the Range Minimum Query Problems Succinct Data Structures for the Range Minimum Query Problems S. Srinivasa Rao Seoul National University Joint work with Gerth Brodal, Pooya Davoodi, Mordecai Golin, Roberto Grossi John Iacono, Danny Kryzanc,

More information

On the Factor Refinement Principle and its Implementation on Multicore Architectures

On the Factor Refinement Principle and its Implementation on Multicore Architectures Home Search Collections Journals About Contact us My IOPscience On the Factor Refinement Principle and its Implementation on Multicore Architectures This article has been downloaded from IOPscience Please

More information

StreamSVM Linear SVMs and Logistic Regression When Data Does Not Fit In Memory

StreamSVM Linear SVMs and Logistic Regression When Data Does Not Fit In Memory StreamSVM Linear SVMs and Logistic Regression When Data Does Not Fit In Memory S.V. N. (vishy) Vishwanathan Purdue University and Microsoft vishy@purdue.edu October 9, 2012 S.V. N. Vishwanathan (Purdue,

More information

CSE 4502/5717: Big Data Analytics

CSE 4502/5717: Big Data Analytics CSE 4502/5717: Big Data Analytics otes by Anthony Hershberger Lecture 4 - January, 31st, 2018 1 Problem of Sorting A known lower bound for sorting is Ω is the input size; M is the core memory size; and

More information

Compressed Index for Dynamic Text

Compressed Index for Dynamic Text Compressed Index for Dynamic Text Wing-Kai Hon Tak-Wah Lam Kunihiko Sadakane Wing-Kin Sung Siu-Ming Yiu Abstract This paper investigates how to index a text which is subject to updates. The best solution

More information

Search Trees. Chapter 10. CSE 2011 Prof. J. Elder Last Updated: :52 AM

Search Trees. Chapter 10. CSE 2011 Prof. J. Elder Last Updated: :52 AM Search Trees Chapter 1 < 6 2 > 1 4 = 8 9-1 - Outline Ø Binary Search Trees Ø AVL Trees Ø Splay Trees - 2 - Outline Ø Binary Search Trees Ø AVL Trees Ø Splay Trees - 3 - Binary Search Trees Ø A binary search

More information

Priority Queues. Sanders: Algorithm Engineering 12. Juni Maintain set M of Elements with Keys. Insert: DeleteMin:

Priority Queues. Sanders: Algorithm Engineering 12. Juni Maintain set M of Elements with Keys. Insert: DeleteMin: Sanders: Algorithm Engineering 12. Juni 2008 1 Priority Queues Maintain set M of Elements with Keys Insert: DeleteMin: Applications (no additional operations): Multi-way Merging Greedy Algorithms (e.g.,

More information

Optimal Bounds for the Predecessor Problem and Related Problems

Optimal Bounds for the Predecessor Problem and Related Problems Optimal Bounds for the Predecessor Problem and Related Problems Paul Beame Computer Science and Engineering University of Washington Seattle, WA, USA 98195-2350 beame@cs.washington.edu Faith E. Fich Department

More information

Insertion Sort. We take the first element: 34 is sorted

Insertion Sort. We take the first element: 34 is sorted Insertion Sort Idea: by Ex Given the following sequence to be sorted 34 8 64 51 32 21 When the elements 1, p are sorted, then the next element, p+1 is inserted within these to the right place after some

More information

A Divide-and-Conquer Algorithm for Functions of Triangular Matrices

A Divide-and-Conquer Algorithm for Functions of Triangular Matrices A Divide-and-Conquer Algorithm for Functions of Triangular Matrices Ç. K. Koç Electrical & Computer Engineering Oregon State University Corvallis, Oregon 97331 Technical Report, June 1996 Abstract We propose

More information

CS425: Algorithms for Web Scale Data

CS425: Algorithms for Web Scale Data CS425: Algorithms for Web Scale Data Most of the slides are from the Mining of Massive Datasets book. These slides have been modified for CS425. The original slides can be accessed at: www.mmds.org Challenges

More information

Transposition Mechanism for Sparse Matrices on Vector Processors

Transposition Mechanism for Sparse Matrices on Vector Processors Transposition Mechanism for Sparse Matrices on Vector Processors Pyrrhos Stathis Stamatis Vassiliadis Sorin Cotofana Electrical Engineering Department, Delft University of Technology, Delft, The Netherlands

More information

R ij = 2. Using all of these facts together, you can solve problem number 9.

R ij = 2. Using all of these facts together, you can solve problem number 9. Help for Homework Problem #9 Let G(V,E) be any undirected graph We want to calculate the travel time across the graph. Think of each edge as one resistor of 1 Ohm. Say we have two nodes: i and j Let the

More information

Parallelization of the QC-lib Quantum Computer Simulator Library

Parallelization of the QC-lib Quantum Computer Simulator Library Parallelization of the QC-lib Quantum Computer Simulator Library Ian Glendinning and Bernhard Ömer VCPC European Centre for Parallel Computing at Vienna Liechtensteinstraße 22, A-19 Vienna, Austria http://www.vcpc.univie.ac.at/qc/

More information

Binary Search Trees. Motivation

Binary Search Trees. Motivation Binary Search Trees Motivation Searching for a particular record in an unordered list takes O(n), too slow for large lists (databases) If the list is ordered, can use an array implementation and use binary

More information

Lecture 4. 1 Circuit Complexity. Notes on Complexity Theory: Fall 2005 Last updated: September, Jonathan Katz

Lecture 4. 1 Circuit Complexity. Notes on Complexity Theory: Fall 2005 Last updated: September, Jonathan Katz Notes on Complexity Theory: Fall 2005 Last updated: September, 2005 Jonathan Katz Lecture 4 1 Circuit Complexity Circuits are directed, acyclic graphs where nodes are called gates and edges are called

More information

Circuits. Lecture 11 Uniform Circuit Complexity

Circuits. Lecture 11 Uniform Circuit Complexity Circuits Lecture 11 Uniform Circuit Complexity 1 Recall 2 Recall Non-uniform complexity 2 Recall Non-uniform complexity P/1 Decidable 2 Recall Non-uniform complexity P/1 Decidable NP P/log NP = P 2 Recall

More information

Searching. Sorting. Lambdas

Searching. Sorting. Lambdas .. s Babes-Bolyai University arthur@cs.ubbcluj.ro Overview 1 2 3 Feedback for the course You can write feedback at academicinfo.ubbcluj.ro It is both important as well as anonymous Write both what you

More information

CSCI Honor seminar in algorithms Homework 2 Solution

CSCI Honor seminar in algorithms Homework 2 Solution CSCI 493.55 Honor seminar in algorithms Homework 2 Solution Saad Mneimneh Visiting Professor Hunter College of CUNY Problem 1: Rabin-Karp string matching Consider a binary string s of length n and another

More information

Divide and Conquer Algorithms

Divide and Conquer Algorithms Divide and Conquer Algorithms T. M. Murali February 19, 2013 Divide and Conquer Break up a problem into several parts. Solve each part recursively. Solve base cases by brute force. Efficiently combine

More information

Computing Techniques for Parallel and Distributed Systems with an Application to Data Compression. Sergio De Agostino Sapienza University di Rome

Computing Techniques for Parallel and Distributed Systems with an Application to Data Compression. Sergio De Agostino Sapienza University di Rome Computing Techniques for Parallel and Distributed Systems with an Application to Data Compression Sergio De Agostino Sapienza University di Rome Parallel Systems A parallel random access machine (PRAM)

More information

A Network-Oblivious Algorithms

A Network-Oblivious Algorithms A Network-Oblivious Algorithms GIANFRANCO BILARDI, ANDREA PIETRACAPRINA, GEPPINO PUCCI, MICHELE SCQUIZZATO, and FRANCESCO SILVESTRI, University of Padova A framework is proposed for the design and analysis

More information

COMP 633: Parallel Computing Fall 2018 Written Assignment 1: Sample Solutions

COMP 633: Parallel Computing Fall 2018 Written Assignment 1: Sample Solutions COMP 633: Parallel Computing Fall 2018 Written Assignment 1: Sample Solutions September 12, 2018 I. The Work-Time W-T presentation of EREW sequence reduction Algorithm 2 in the PRAM handout has work complexity

More information

Fast Output-Sensitive Matrix Multiplication

Fast Output-Sensitive Matrix Multiplication Fast Output-Sensitive Matrix Multiplication Riko Jacob 1 and Morten Stöckel 2 1 IT University of Copenhagen rikj@itu.dk 2 University of Copenhagen most@di.ku.dk Abstract. 3 We consider the problem of multiplying

More information

Height, Size Performance of Complete and Nearly Complete Binary Search Trees in Dictionary Applications

Height, Size Performance of Complete and Nearly Complete Binary Search Trees in Dictionary Applications Height, Size Performance of Complete and Nearly Complete Binary Search Trees in Dictionary Applications AHMED TAREK Department of Math and Computer Science California University of Pennsylvania Eberly

More information

CMPS 2200 Fall Divide-and-Conquer. Carola Wenk. Slides courtesy of Charles Leiserson with changes and additions by Carola Wenk

CMPS 2200 Fall Divide-and-Conquer. Carola Wenk. Slides courtesy of Charles Leiserson with changes and additions by Carola Wenk CMPS 2200 Fall 2017 Divide-and-Conquer Carola Wenk Slides courtesy of Charles Leiserson with changes and additions by Carola Wenk 1 The divide-and-conquer design paradigm 1. Divide the problem (instance)

More information