Advanced Data Structures

Similar documents
Advanced Data Structures

Fixed-Universe Successor : Van Emde Boas. Lecture 18

Lecture 4: Divide and Conquer: van Emde Boas Trees

10 van Emde Boas Trees

Content. 1 Static dictionaries. 2 Integer data structures. 3 Graph data structures. 4 Dynamization and persistence. 5 Cache-oblivious structures

Lecture 2 September 4, 2014

Space and Time-Efficient Data Structures for Massive Datasets

Dynamic Ordered Sets with Exponential Search Trees

Optimal Color Range Reporting in One Dimension

Advanced Data Structures

Inge Li Gørtz. Thank you to Kevin Wayne for inspiration to slides

Quiz 1 Solutions. (a) f 1 (n) = 8 n, f 2 (n) = , f 3 (n) = ( 3) lg n. f 2 (n), f 1 (n), f 3 (n) Solution: (b)

Integer Sorting on the word-ram

Dictionary: an abstract data type

1 Probability Review. CS 124 Section #8 Hashing, Skip Lists 3/20/17. Expectation (weighted average): the expectation of a random quantity X is:

Advanced Implementations of Tables: Balanced Search Trees and Hashing

Dictionary: an abstract data type

lsb(x) = Least Significant Bit?

Lecture 5: Hashing. David Woodruff Carnegie Mellon University

5 Spatial Access Methods

Assignment 5: Solutions

Chapter 5 Arrays and Strings 5.1 Arrays as abstract data types 5.2 Contiguous representations of arrays 5.3 Sparse arrays 5.4 Representations of

Lecture 5: Fusion Trees (ctd.) Johannes Fischer

5 Spatial Access Methods

Hashing. Dictionaries Chained Hashing Universal Hashing Static Dictionaries and Perfect Hashing. Philip Bille

Hashing. Hashing. Dictionaries. Dictionaries. Dictionaries Chained Hashing Universal Hashing Static Dictionaries and Perfect Hashing

Notes on Logarithmic Lower Bounds in the Cell Probe Model

Randomized Sorting Algorithms Quick sort can be converted to a randomized algorithm by picking the pivot element randomly. In this case we can show th

Lecture 15 April 9, 2007

Fundamental Algorithms

Quiz 1 Solutions. Problem 2. Asymptotics & Recurrences [20 points] (3 parts)

Data Structures and Algorithms " Search Trees!!

Communication Complexity

1 Hashing. 1.1 Perfect Hashing

Optimal Bounds for the Predecessor Problem and Related Problems

String Indexing for Patterns with Wildcards

arxiv: v1 [cs.ds] 15 Feb 2012

Outline. Computer Science 331. Cost of Binary Search Tree Operations. Bounds on Height: Worst- and Average-Case

Text Indexing: Lecture 6

Data structures Exercise 1 solution. Question 1. Let s start by writing all the functions in big O notation:

Hashing. Dictionaries Hashing with chaining Hash functions Linear Probing

Data Structures in Java

arxiv:cs/ v1 [cs.ds] 5 Feb 2005

Hash Functions for Priority Queues

Theoretical Computer Science. Dynamic rank/select structures with applications to run-length encoded texts

CS Data Structures and Algorithm Analysis

Optimal Bounds for the Predecessor Problem and Related Problems

Hash tables. Hash tables

Lecture 6 September 21, 2016

Cache-Oblivious Algorithms

Hash tables. Hash tables

Cache-Oblivious Algorithms

Lecture 4 Thursday Sep 11, 2014

arxiv: v2 [cs.ds] 8 Apr 2016

Multiple Choice Tries and Distributed Hash Tables

Online Sorted Range Reporting and Approximating the Mode

Longest Gapped Repeats and Palindromes

INF2220: algorithms and data structures Series 1

Lower Bounds for Dynamic Connectivity (2004; Pǎtraşcu, Demaine)

Algorithm Analysis Recurrence Relation. Chung-Ang University, Jaesung Lee

Succinct Data Structures for Approximating Convex Functions with Applications

Exam 1. March 12th, CS525 - Midterm Exam Solutions

Exam in Discrete Mathematics

Minimal Indices for Successor Search

Premaster Course Algorithms 1 Chapter 3: Elementary Data Structures

Amortized analysis. Amortized analysis

Algorithm efficiency analysis

Data Structures 1 NTIN066

Searching. Sorting. Lambdas

Lecture 18 April 26, 2012

BRICS. Hashing, Randomness and Dictionaries. Basic Research in Computer Science BRICS DS-02-5 R. Pagh: Hashing, Randomness and Dictionaries

16. Binary Search Trees. [Ottman/Widmayer, Kap. 5.1, Cormen et al, Kap ]

Optimal Tree-decomposition Balancing and Reachability on Low Treewidth Graphs

Space-Efficient Construction Algorithm for Circular Suffix Tree

16. Binary Search Trees. [Ottman/Widmayer, Kap. 5.1, Cormen et al, Kap ]

compare to comparison and pointer based sorting, binary trees

Binary Search Trees. Motivation

Binary Search Trees. Dictionary Operations: Additional operations: find(key) insert(key, value) erase(key)

7 Dictionary. EADS c Ernst Mayr, Harald Räcke 109

CSCI 3110 Assignment 6 Solutions

Data Structures and Algorithms

CS1802 Week 11: Algorithms, Sums, Series, Induction

8 Priority Queues. 8 Priority Queues. Prim s Minimum Spanning Tree Algorithm. Dijkstra s Shortest Path Algorithm

CS60007 Algorithm Design and Analysis 2018 Assignment 1

Problem: Data base too big to fit memory Disk reads are slow. Example: 1,000,000 records on disk Binary search might take 20 disk reads

7 Dictionary. Ernst Mayr, Harald Räcke 126

12 Hash Tables Introduction Chaining. Lecture 12: Hash Tables [Fa 10]

Data Structure. Mohsen Arab. January 13, Yazd University. Mohsen Arab (Yazd University ) Data Structure January 13, / 86

Data Structures and and Algorithm Xiaoqing Zheng

CS 240 Data Structures and Data Management. Module 4: Dictionaries

SIGNAL COMPRESSION Lecture 7. Variable to Fix Encoding

4.8 Huffman Codes. These lecture slides are supplied by Mathijs de Weerd

Problem. Maintain the above set so as to support certain

Cell Probe Lower Bounds and Approximations for Range Mode

In-Class Soln 1. CS 361, Lecture 4. Today s Outline. In-Class Soln 2

Module 1: Analyzing the Efficiency of Algorithms

Forbidden Patterns. {vmakinen leena.salmela

1 Maintaining a Dictionary

Algorithms lecture notes 1. Hashing, and Universal Hash functions

7 Dictionary. Harald Räcke 125

Transcription:

Simon Gog gog@kit.edu - Simon Gog: KIT The Research University in the Helmholtz Association www.kit.edu

Predecessor data structures We want to support the following operations on a set of integers from the domain U = [u]. insert(x) Add x to S. I.e. S = S {x}. delete(x) Delete x from S. I.e. S = S \ {x}. member(x) = {x x S} predecessor(x) = max{y y x y S} successor(x) = min{y y x y S} where x is an integer in U and S the set of integers of size n stored in the data structure. min{s} = successor() max{s} = predecessor(u 1) Solution know from Algo I : Balanced search trees. E.g. red-black trees. In all comparison based approaches at least one operation takes Ω(log n) time. Why? 1 Simon Gog:

Predecessor data structures Ω(log n) bound can be beaten in the word RAM model: Memory is organized in words of b O(log u) bits A word can be accessed in constant time We can address all data using one word Standard arithmetic operations take constant time on words (i.e. addition, subtraction, division, shifts...) We first concentrate on the static case: The set S is fixed. I.e. no insert and delete operations. 2 Simon Gog:

x-fast trie (Willard, 1982) Conceptional: Complete binary tree of height w = log u 1 2 3 4 5 6 7 8 9 1 11 12 13 14 15 Operations member/successor/prodecessor can be answered in O(w) time by traversing the tree 3 Simon Gog:

x-fast trie (Willard, 1982) Conceptional: Complete binary tree of height w = log u 1 1 1 1 1 1 2 3 4 5 6 7 8 9 1 11 12 13 14 15 Operations member/successor/prodecessor can be answered in O(w) time by traversing the tree 3 Simon Gog:

x-fast trie (Willard, 1982) Conceptional: Complete binary tree of height w = log u 1 1 1 1 1 1 1 1 1 1 2 3 4 5 6 7 8 9 1 11 12 13 14 15 Operations member/successor/prodecessor can be answered in O(w) time by traversing the tree 3 Simon Gog:

x-fast trie (Willard, 1982) Conceptional: Complete binary tree of height w = log u 1 1 1 1 1 1 1 1 1 1 1 1 1 2 3 4 5 6 7 8 9 1 11 12 13 14 15 Operations member/successor/prodecessor can be answered in O(w) time by traversing the tree 3 Simon Gog:

x-fast trie From O(log u) to O(log log u)... 1 1 2 3 1 2 3 4 5 6 7 1 2 3 4 5 6 7 8 9 1 11 12 13 14 15 4 Simon Gog:

x-fast trie From O(log u) to O(log log u)... 1 2 3 1 5 6 3 1 12 Generate a perfect hash table h i for each level i (keys are the present nodes) Note: Node numbers (represented in binary) are prefixes of keys in S 5 Simon Gog:

x-fast trie From O(log u) to O(log log u)... h h 1 1 h 2 2 3 h 3 1 5 6 h 4 3 1 12 Generate a perfect hash table h i for each level i (keys are the present nodes) Note: Node numbers (represented in binary) are prefixes of keys in S 5 Simon Gog:

x-fast trie Query time from O(log u) to O(log log u)... Member queries can be answered in constant time by lookup in the leaf level using prefixes of the searched key. There are w prefixes, i.e. binary search takes O(log w) or O(log log u) time Space: w O(n) words, i.e. O(n log u) bits Predecessor/Successor queries For each node store a pointer to the maximal/minimal leaf in its subtree Use a double linked list to represent leaf nodes. Solving predecessor: Search for a node v which represents the longest prefix of x with any key in S. Two cases: Minimum in subtree of v is larger x: Return element to the left of the leaf. Maximum in the subtree of v is smaller than x. Return maximum. 6 Simon Gog:

y-fast trie (Willard, 1982) Space from O(n log u) to O(n) words... Split S into n w blocks of O(log u) elements B, B 1,..., B n w 1. max{b i } < min{b i+1 } for i < n w 1 Let r i = max{b i } be a representative of block B i. Build x-fast trie over representatives. B B 1... Total space: w n O(w) + O(n) = O(n) words 7 Simon Gog:

y-fast trie Space from O(n log u) to O(n) words... Use sorted array to represent B i A member query is answered as follows Search for successor of x in x-trie of r i s Let B k be the block of the successor of x Search in O(log w) = O(log log u) time for x in B k How does predecessor/successor work? 8 Simon Gog:

y-fast trie Changes to make structure dynamic use cuckoo hashing for x-fast trie use balanced search trees of size between 1 2 w and 2w for B is representative is not the maximum, but any element separating two consecutive groups Summary: Operation static y-fast trie dynamic y-fast trie pred(x)/succ(x) O(log log u) w.c. O(log log u) insert(x)/delete(x) O(log log u) exp. & am. construction O(n) exp. 9 Simon Gog:

Van Emde Boas Trees Conceptual bitvector B of length u with B[i] = 1 for all i S Split B into u/ u blocks (blue blocks) B, B 1,... Set bit in R[i] if there is at least one bit set in B i Also store the minimum/maximum of S Summary R 1 1 1 min = 1 max = 14 B= 1 1 1 1 1 1 1 Here: u = 16 1 2 3 4 5 6 7 8 9 1 11 12 13 14 15 1 Simon Gog:

Van Emde Boas Trees Define van Emde Boas tree (veb tree) recursively I.e. use veb to represent B i s and R veb(u) denotes the veb tree on a universe of size u Base case: u = 2. Only one node and variables min, max. Technicalities u = 2 (log u)/2 u = 2 (log u)/2 high(x) = x (block that contains x) u low(x) = x mod u (relative position of x in B high(x) ) 11 Simon Gog:

Van Emde Boas Trees Predecessor(x, B) (first attempt) Let y = height(x) and z = Predecessor(low(x), B y ) If z = return z + y u Let b = Predecessor(high(x), R) If b = return max(b b ) + b u Return Summary R 1 1 1 B= 1 B 1 B 1 1 B 2 1 1 1 B 3 1 1 2 3 4 5 6 7 8 9 1 11 12 13 14 15 12 Simon Gog:

Van Emde Boas Trees Predecessor(x, B) (first attempt) Let y = height(x) and z = Predecessor(low(x), B y ) If z = return z + y u Let b = Predecessor(high(x), R) If b = return max(b b ) + b u Return Problem Recurrence for time complexity: T (u) = 2T ( u) + Θ(1) Substitute u by 2 k and define S(k) = T (2 k ) S(k) = 2S( k 2 ) + Θ(1) (solved by Master Theorem) We get T (u) = O(log u log log u). Next: improve time to O(log log u) 12 Simon Gog:

Van Emde Boas Trees Predecessor(x, B) (first attempt) Let y = height(x) and z = Predecessor(low(x), B y ) If z = return z + y u Let b = Predecessor(high(x), R) If b = return max(b b ) + b u Return Problem Recurrence for time complexity: T (u) = 2T ( u) + Θ(1) Substitute u by 2 k and define S(k) = T (2 k ) S(k) = 2S( k 2 ) + Θ(1) (solved by Master Theorem) We get T (u) = O(log u log log u). Next: improve time to O(log log u) 12 Simon Gog:

Van Emde Boas Trees Predecessor(x, B) (first attempt) Let y = height(x) and z = Predecessor(low(x), B y ) If z = return z + y u Let b = Predecessor(high(x), R) If b = return max(b b ) + b u Return Problem Recurrence for time complexity: T (u) = 2T ( u) + Θ(1) Substitute u by 2 k and define S(k) = T (2 k ) S(k) = 2S( k 2 ) + Θ(1) (solved by Master Theorem) We get T (u) = O(log u log log u). Next: improve time to O(log log u) 12 Simon Gog:

Van Emde Boas Trees Predecessor(x, B) (second attempt) If x > max return max Let y = high(x), if min(b y ) < x return Predecessor(low(x), B y ) Let b = Predecessor(high(x), R) If b = return max(b b ) + b u Return Recurrence for time complexity: T (u) = T ( u) + O(1) Solution (Master Theorem or drawing recursion tree): T (u) = Θ(log log u) 13 Simon Gog:

Van Emde Boas Tree Space complexity Note Recurrence: S(u) = ( u + 1)S( u) + Θ(1) Solution: S(u) O(u) Space complexity of x-fast and y-fast tries are better for small sets Van Emde Boas Tree does not rely on hashing 14 Simon Gog: