Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash
|
|
- Evangeline Watkins
- 5 years ago
- Views:
Transcription
1 Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash Martin Aumu ller, Martin Dietzfelbinger, Philipp Woelfel? Technische Universita t Ilmenau,? University of Calgary ESA Ljubljana, September 12, 2012 M. Aumu ller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 1/18
2 Cuckoo Hashing (Pagh/Rodler, 2001/2004) A hashing-based implementation of the dictionary data type. Setting: set S U of n keys two tables T 1 [0..m 1] and T 2 [0..m 1] two (hash) functions h 1, h 2 with h i : U [m] Rules: each table cell can hold exactly one key a key x must be stored either in T 1 [h 1 (x)] or T 2 [h 2 (x)] (fast lookup and deletions!) Definition If S can be stored according to these rules, we call (h 1, h 2 ) suitable for S. M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 2/18
3 Cuckoo Hashing (Pagh/Rodler, 2001/2004) A hashing-based implementation of the dictionary data type. Setting: set S U of n keys two tables T 1 [0..m 1] and T 2 [0..m 1] two (hash) functions h 1, h 2 with h i : U [m] Rules: each table cell can hold exactly one key a key x must be stored either in T 1 [h 1 (x)] or T 2 [h 2 (x)] (fast lookup and deletions!) Definition If S can be stored according to these rules, we call (h 1, h 2 ) suitable for S. M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 2/18
4 Cuckoo Hashing: Failure Probability Theorem (Pagh/Rodler, 2004) Let S U with S = n. If (h 1, h 2 ) are fully random (or Ω(log n)-wise independent), then In fact: Θ(1/n). Pr((h 1, h 2 ) unsuitable for S) = O(1/n). M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 3/18
5 Improving Standard Cuckoo Hashing: Stash (Kirsch/Mitzenmacher/Wieder, ESA 2008): Θ(1/n) is too large. Proposal: Can put up to s = O(1) keys into additional storage, called stash Theorem (K/M/W, 2008) Let S U with S = n. If (h 1, h 2 ) are fully random, then Pr((h 1, h 2 ) unsuitable for S with stash size s) = O(1/n s+1 ). Full Randomness Assumption (FRA) central in their analysis constructions for FRA known, but undesirable Open Question (Kirsch/Mitzenmacher/Wieder, SICOMP 09) A major open question is proving the above bounds for explicit hash families that can be represented, sampled, and evaluated efficiently. M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 4/18
6 Our contributions efficient hash functions that can realize cuckoo hashing with a stash variation of a known construction of a uniform hash function (not covered in this talk) M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 5/18
7 Method extension of (Dietzfelbinger/Woelfel, 2003) they proposed efficient hash functions for standard cuckoo hashing we exhibit a new class of (even simpler) hash functions that allow running c.h. with a stash M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 6/18
8 Key ingredient: linear polynomials where p U is a prime, and h(x) = ((a x + b) mod p) mod m, a and b are chosen uniformly at random from {0,..., p 1}. very simple structure! (Remark: This function is 2-wise independent, i.e., for any pair x, y U, x y, h(x) and h(y) are fully random.) M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 7/18
9 Our hash class (version for this talk) For given n 1, we combine linear polynomials with lookups in tables of size n filled with random values. h i (x) = f i (x) c j=1 z (i) j [ g j (x) ], i = 1, 2 Class of all these pairs (h 1, h 2 ) of hash functions: Z. M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 8/18
10 Main Objective For given S and stash size s, calculate Pr ((h 1, h 2 ) unsuitable for S with stash size s). (h 1,h 2 ) Z What is a criteria for (h 1, h 2 ) being unsuitable? Tool: Cuckoo graph M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 9/18
11 Suitability Criteria Fact (Kirsch/Mitzenmacher/Wieder, 2008) (h 1, h 2 ) suitable after removing s edges from G(S, h 1, h 2 ), all connected components are either trees or unicyclic. Consequence: unsuitable there exists a subgraph s.t. one cannot remove s edges to obtain only tree or unicyclic components. Minimal such bad subgraph : a MOS s. (Example: s = 2.) M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 10/18
12 Suitability Criteria Fact (Kirsch/Mitzenmacher/Wieder, 2008) (h 1, h 2 ) suitable after removing s edges from G(S, h 1, h 2 ), all connected components are either trees or unicyclic. Consequence: unsuitable there exists a subgraph s.t. one cannot remove s edges to obtain only tree or unicyclic components. Minimal such bad subgraph : a MOS s. (Example: s = 2.) M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 10/18
13 Suitability Criteria Fact (Kirsch/Mitzenmacher/Wieder, 2008) (h 1, h 2 ) suitable after removing s edges from G(S, h 1, h 2 ), all connected components are either trees or unicyclic. Consequence: unsuitable there exists a subgraph s.t. one cannot remove s edges to obtain only tree or unicyclic components. Minimal such bad subgraph : a MOS s. (Example: s = 2.) M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 10/18
14 Suitability Criteria Fact (Kirsch/Mitzenmacher/Wieder, 2008) (h 1, h 2 ) suitable after removing s edges from G(S, h 1, h 2 ), all connected components are either trees or unicyclic. Consequence: unsuitable there exists a subgraph s.t. one cannot remove s edges to obtain only tree or unicyclic components. Minimal such bad subgraph : a MOS s. (Example: s = 2.) M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 10/18
15 Suitability Criteria Fact (Kirsch/Mitzenmacher/Wieder, 2008) (h 1, h 2 ) suitable after removing s edges from G(S, h 1, h 2 ), all connected components are either trees or unicyclic. Consequence: unsuitable there exists a subgraph s.t. one cannot remove s edges to obtain only tree or unicyclic components. Minimal such bad subgraph : a MOS s. (Example: s = 2.) M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 10/18
16 Suitability Criteria Fact (Kirsch/Mitzenmacher/Wieder, 2008) (h 1, h 2 ) suitable after removing s edges from G(S, h 1, h 2 ), all connected components are either trees or unicyclic. Consequence: unsuitable there exists a subgraph s.t. one cannot remove s edges to obtain only tree or unicyclic components. Minimal such bad subgraph : a MOS s. (Example: s = 2.) M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 10/18
17 Thus, we have Pr ((h 1, h 2 ) unsuitable for S with stash size s) (h 1,h 2 ) Z = Pr (h 1,h 2 ) Z ( T S : G(T, h 1, h 2 ) forms a MOS s ) T S Pr (G(T, h 1, h 2 ) forms a MOS s ) (h 1,h 2 ) Z if (h 1, h 2 ) are fully random, we provide a direct counting argument that this is O(1/n s+1 ) giving an alternative proof to the orginal analysis by Kirsch, Mitzenmacher and Wieder (who used machinery like Markov chain coupling) M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 11/18
18 Behavior of our hash class on fixed T Recall: h i (x) = f i (x) c j=1 z (i) j [g j (x)] Central Observation Let T U. If there is a g j such that at most one pair of keys in T collides under g j (i.e., g j (x) = g j (y)), then h 1, h 2 are fully random on T. if this is the case: (h 1, h 2 ) T -good. otherwise (each g j has more than one colliding pair of keys): (h 1, h 2 ) is T -bad. M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 12/18
19 Collecting bad hash functions We split our set of hash functions into harmful and harmless ones. (h 1, h 2 ) are harmful, if there exists T S s.t. G(T, h 1, h 2 ) forms a MOS s, and (h 1, h 2 ) is T -bad. B MOSs := the set of all the harmful pairs (h 1, h 2 ). (An event in our probability space!) M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 13/18
20 Splitting the calculation Let A be the event T S : G(T, h 1, h 2 ) forms a MOS s. We calculate: Pr (h1,h 2 ) Z(A) Pr (h1,h 2 ) Z(A B MOSs ) + Pr(B MOSs ) M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 14/18
21 Splitting the calculation Let A be the event T S : G(T, h 1, h 2 ) forms a MOS s. We calculate: Pr (h1,h 2 ) Z(A) Pr (h1,h 2 ) Z(A B MOSs ) + Pr(B MOSs ) for this summand, we can use full randomness of (h 1, h 2 ) as before: this summand is O(1/n s+1 ) M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 14/18
22 Splitting the calculation Let A be the event T S : G(T, h 1, h 2 ) forms a MOS s. We calculate: Pr (h1,h 2 ) Z(A) O(1/n s+1 ) + Pr(B MOSs ) for this summand, we can use full randomness of (h 1, h 2 ) as before: this summand is O(1/n s+1 ) M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 14/18
23 Splitting the calculation Let A be the event T S : G(T, h 1, h 2 ) forms a MOS s. We calculate: Pr (h1,h 2 ) Z(A) O(1/n s+1 ) + Pr(B MOSs ) for this summand, we can use full randomness of (h 1, h 2 ) as before: this summand is O(1/n s+1 ) The hard part: Calculating/bounding Pr(B MOSs ) = Pr( T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad ) Wish: Use full randomness nonetheless Idea: Find a suitable event that contains B MOSs M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 14/18
24 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad. #collisions g 1 5 g 2 7 g 3 4 M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18
25 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad. Recall: h i (x) = f i (x) c j=1 z(i) j [g j (x)] T -bad each g j has more than one pair of colliding keys #collisions g 1 5 g 2 7 g 3 4 M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18
26 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad #collisions g 1 5 g 2 7 g M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18
27 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad #collisions g 1 4 g 2 5 g M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18
28 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad #collisions g 1 4 g 2 5 g M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18
29 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad #collisions g 1 4 g 2 5 g M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18
30 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad #collisions g 1 4 g 2 4 g M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18
31 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad #collisions g 1 3 g 2 4 g M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18
32 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad #collisions g 1 3 g 2 4 g M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18
33 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad #collisions g 1 2 g 2 3 g M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18
34 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad #collisions g 1 2 g 2 3 g M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18
35 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad #collisions g 1 2 g 2 3 g M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18
36 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad #collisions g 1 2 g 2 3 g M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18
37 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad #collisions g 1 1 g 2 3 g 3 3 M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18
38 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad 4 8 T -good exists g j with at most one pair 5 of colliding keys #collisions g 1 1 g 2 3 g 3 3 M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18
39 Peeling of bad graphs (simplified) If T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad #collisions g 1 1 g 2 3 g 3 3 then T S : G(T, h 1, h 2 ) forms peeled graph (h 1, h 2 ) are T -good M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18
40 Using a direct counting argument, we show Pr(B MOSs ) = O(n/n c/2 ). We thus have Pr((h 1, h 2 ) unsuitable) = O(1/n s+1 ) + O(n/n c/2 ) Setting c = 2(s + 2) gives our result. M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 16/18
41 Conclusion we proposed efficient hash functions that can run c. h. with a stash approach has more potential: applicable to balanced allocation (forthcoming paper) Open question: Explicit hash functions for generalized cuckoo hashing with more than 2 hash functions. M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 17/18
42 Thanks! Any questions? M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 18/18
Cuckoo Hashing with a Stash: Alternative Analysis, Simple Hash Functions
1 / 29 Cuckoo Hashing with a Stash: Alternative Analysis, Simple Hash Functions Martin Aumüller, Martin Dietzfelbinger Technische Universität Ilmenau 2 / 29 Cuckoo Hashing Maintain a dynamic dictionary
More informationOn Randomness in Hash Functions. Martin Dietzfelbinger Technische Universität Ilmenau
On Randomness in Hash Functions Martin Dietzfelbinger Technische Universität Ilmenau Martin Dietzfelbinger STACS Paris, March 2, 2012 Co-Workers Andrea Montanari Andreas Goerdt Christoph Weidling Martin
More informationLecture 5: Hashing. David Woodruff Carnegie Mellon University
Lecture 5: Hashing David Woodruff Carnegie Mellon University Hashing Universal hashing Perfect hashing Maintaining a Dictionary Let U be a universe of keys U could be all strings of ASCII characters of
More informationAnalysis of Algorithms I: Perfect Hashing
Analysis of Algorithms I: Perfect Hashing Xi Chen Columbia University Goal: Let U = {0, 1,..., p 1} be a huge universe set. Given a static subset V U of n keys (here static means we will never change the
More informationDe-amortized Cuckoo Hashing: Provable Worst-Case Performance and Experimental Results
De-amortized Cuckoo Hashing: Provable Worst-Case Performance and Experimental Results Yuriy Arbitman Moni Naor Gil Segev Abstract Cuckoo hashing is a highly practical dynamic dictionary: it provides amortized
More informationCS 473: Algorithms. Ruta Mehta. Spring University of Illinois, Urbana-Champaign. Ruta (UIUC) CS473 1 Spring / 32
CS 473: Algorithms Ruta Mehta University of Illinois, Urbana-Champaign Spring 2018 Ruta (UIUC) CS473 1 Spring 2018 1 / 32 CS 473: Algorithms, Spring 2018 Universal Hashing Lecture 10 Feb 15, 2018 Most
More informationHash tables. Hash tables
Dictionary Definition A dictionary is a data-structure that stores a set of elements where each element has a unique key, and supports the following operations: Search(S, k) Return the element whose key
More informationHash tables. Hash tables
Dictionary Definition A dictionary is a data-structure that stores a set of elements where each element has a unique key, and supports the following operations: Search(S, k) Return the element whose key
More information1 Maintaining a Dictionary
15-451/651: Design & Analysis of Algorithms February 1, 2016 Lecture #7: Hashing last changed: January 29, 2016 Hashing is a great practical tool, with an interesting and subtle theory too. In addition
More informationLecture: Analysis of Algorithms (CS )
Lecture: Analysis of Algorithms (CS483-001) Amarda Shehu Spring 2017 1 Outline of Today s Class 2 Choosing Hash Functions Universal Universality Theorem Constructing a Set of Universal Hash Functions Perfect
More informationHash tables. Hash tables
Basic Probability Theory Two events A, B are independent if Conditional probability: Pr[A B] = Pr[A] Pr[B] Pr[A B] = Pr[A B] Pr[B] The expectation of a (discrete) random variable X is E[X ] = k k Pr[X
More informationLecture 4: Hashing and Streaming Algorithms
CSE 521: Design and Analysis of Algorithms I Winter 2017 Lecture 4: Hashing and Streaming Algorithms Lecturer: Shayan Oveis Gharan 01/18/2017 Scribe: Yuqing Ai Disclaimer: These notes have not been subjected
More informationHashing. Dictionaries Chained Hashing Universal Hashing Static Dictionaries and Perfect Hashing. Philip Bille
Hashing Dictionaries Chained Hashing Universal Hashing Static Dictionaries and Perfect Hashing Philip Bille Hashing Dictionaries Chained Hashing Universal Hashing Static Dictionaries and Perfect Hashing
More informationHashing. Hashing. Dictionaries. Dictionaries. Dictionaries Chained Hashing Universal Hashing Static Dictionaries and Perfect Hashing
Philip Bille Dictionaries Dictionary problem. Maintain a set S U = {,..., u-} supporting lookup(x): return true if x S and false otherwise. insert(x): set S = S {x} delete(x): set S = S - {x} Dictionaries
More informationCS 591, Lecture 6 Data Analytics: Theory and Applications Boston University
CS 591, Lecture 6 Data Analytics: Theory and Applications Boston University Babis Tsourakakis February 8th, 2017 Universal hash family Notation: Universe U = {0,..., u 1}, index space M = {0,..., m 1},
More informationAlgorithms lecture notes 1. Hashing, and Universal Hash functions
Algorithms lecture notes 1 Hashing, and Universal Hash functions Algorithms lecture notes 2 Can we maintain a dictionary with O(1) per operation? Not in the deterministic sense. But in expectation, yes.
More informationHashing. Martin Babka. January 12, 2011
Hashing Martin Babka January 12, 2011 Hashing Hashing, Universal hashing, Perfect hashing Input data is uniformly distributed. A dynamic set is stored. Universal hashing Randomised algorithm uniform choice
More informationarxiv: v1 [cs.ds] 2 Mar 2009
De-amortized Cuckoo Hashing: Provable Worst-Case Performance and Experimental Results Yuriy Arbitman Moni Naor Gil Segev arxiv:0903.0391v1 [cs.ds] 2 Mar 2009 March 2, 2009 Abstract Cuckoo hashing is a
More informationHashing. Dictionaries Hashing with chaining Hash functions Linear Probing
Hashing Dictionaries Hashing with chaining Hash functions Linear Probing Hashing Dictionaries Hashing with chaining Hash functions Linear Probing Dictionaries Dictionary: Maintain a dynamic set S. Every
More informationSo far we have implemented the search for a key by carefully choosing split-elements.
7.7 Hashing Dictionary: S. insert(x): Insert an element x. S. delete(x): Delete the element pointed to by x. S. search(k): Return a pointer to an element e with key[e] = k in S if it exists; otherwise
More informationIntroduction to Hashtables
Introduction to HashTables Boise State University March 5th 2015 Hash Tables: What Problem Do They Solve What Problem Do They Solve? Why not use arrays for everything? 1 Arrays can be very wasteful: Example
More informationAlgorithms for Data Science
Algorithms for Data Science CSOR W4246 Eleni Drinea Computer Science Department Columbia University Tuesday, December 1, 2015 Outline 1 Recap Balls and bins 2 On randomized algorithms 3 Saving space: hashing-based
More informationCS483 Design and Analysis of Algorithms
CS483 Design and Analysis of Algorithms Lectures 2-3 Algorithms with Numbers Instructor: Fei Li lifei@cs.gmu.edu with subject: CS483 Office hours: STII, Room 443, Friday 4:00pm - 6:00pm or by appointments
More informationHash Tables. Given a set of possible keys U, such that U = u and a table of m entries, a Hash function h is a
Hash Tables Given a set of possible keys U, such that U = u and a table of m entries, a Hash function h is a mapping from U to M = {1,..., m}. A collision occurs when two hashed elements have h(x) =h(y).
More information1 Probability Review. CS 124 Section #8 Hashing, Skip Lists 3/20/17. Expectation (weighted average): the expectation of a random quantity X is:
CS 24 Section #8 Hashing, Skip Lists 3/20/7 Probability Review Expectation (weighted average): the expectation of a random quantity X is: x= x P (X = x) For each value x that X can take on, we look at
More informationBalls and Bins: Smaller Hash Families and Faster Evaluation
Balls and Bins: Smaller Hash Families and Faster Evaluation L. Elisa Celis Omer Reingold Gil Segev Udi Wieder April 22, 2011 Abstract A fundamental fact in the analysis of randomized algorithm is that
More informationCuckoo Hashing and Cuckoo Filters
Cuckoo Hashing and Cuckoo Filters Noah Fleming May 7, 208 Preliminaries A dictionary is an abstract data type that stores a collection of elements, located by their key. It supports operations: insert,
More informationLecture 6. Today we shall use graph entropy to improve the obvious lower bound on good hash functions.
CSE533: Information Theory in Computer Science September 8, 010 Lecturer: Anup Rao Lecture 6 Scribe: Lukas Svec 1 A lower bound for perfect hash functions Today we shall use graph entropy to improve the
More informationBalls and Bins: Smaller Hash Families and Faster Evaluation
Balls and Bins: Smaller Hash Families and Faster Evaluation L. Elisa Celis Omer Reingold Gil Segev Udi Wieder May 1, 2013 Abstract A fundamental fact in the analysis of randomized algorithms is that when
More information6.1 Occupancy Problem
15-859(M): Randomized Algorithms Lecturer: Anupam Gupta Topic: Occupancy Problems and Hashing Date: Sep 9 Scribe: Runting Shi 6.1 Occupancy Problem Bins and Balls Throw n balls into n bins at random. 1.
More informationBounds on the OBDD-Size of Integer Multiplication via Universal Hashing
Bounds on the OBDD-Size of Integer Multiplication via Universal Hashing Philipp Woelfel Dept. of Computer Science University Dortmund D-44221 Dortmund Germany phone: +49 231 755-2120 fax: +49 231 755-2047
More informationCSE 190, Great ideas in algorithms: Pairwise independent hash functions
CSE 190, Great ideas in algorithms: Pairwise independent hash functions 1 Hash functions The goal of hash functions is to map elements from a large domain to a small one. Typically, to obtain the required
More informationLecture 4 Thursday Sep 11, 2014
CS 224: Advanced Algorithms Fall 2014 Lecture 4 Thursday Sep 11, 2014 Prof. Jelani Nelson Scribe: Marco Gentili 1 Overview Today we re going to talk about: 1. linear probing (show with 5-wise independence)
More informationLecture 24: Bloom Filters. Wednesday, June 2, 2010
Lecture 24: Bloom Filters Wednesday, June 2, 2010 1 Topics for the Final SQL Conceptual Design (BCNF) Transactions Indexes Query execution and optimization Cardinality Estimation Parallel Databases 2 Lecture
More informationPower of d Choices with Simple Tabulation
Power of d Choices with Simple Tabulation Anders Aamand 1 BARC, University of Copenhagen, Universitetsparken 1, Copenhagen, Denmark. aa@di.ku.dk https://orcid.org/0000-0002-0402-0514 Mathias Bæk Tejs Knudsen
More informationTabulation-Based Hashing
Mihai Pǎtraşcu Mikkel Thorup April 23, 2010 Applications of Hashing Hash tables: chaining x a t v f s r Applications of Hashing Hash tables: chaining linear probing m c a f t x y t Applications of Hashing
More informations 1 if xπy and f(x) = f(y)
Algorithms Proessor John Rei Hash Function : A B ALG 4.2 Universal Hash Functions: CLR - Chapter 34 Auxillary Reading Selections: AHU-Data Section 4.7 BB Section 8.4.4 Handout: Carter & Wegman, "Universal
More informationBalanced Allocation Through Random Walk
Balanced Allocation Through Random Walk Alan Frieze Samantha Petti November 25, 2017 Abstract We consider the allocation problem in which m (1 ε)dn items are to be allocated to n bins with capacity d.
More informationcompare to comparison and pointer based sorting, binary trees
Admin Hashing Dictionaries Model Operations. makeset, insert, delete, find keys are integers in M = {1,..., m} (so assume machine word size, or unit time, is log m) can store in array of size M using power:
More informationCPSC 467: Cryptography and Computer Security
CPSC 467: Cryptography and Computer Security Michael J. Fischer Lecture 14 October 16, 2013 CPSC 467, Lecture 14 1/45 Message Digest / Cryptographic Hash Functions Hash Function Constructions Extending
More informationCache-Oblivious Hashing
Cache-Oblivious Hashing Zhewei Wei Hong Kong University of Science & Technology Joint work with Rasmus Pagh, Ke Yi and Qin Zhang Dictionary Problem Store a subset S of the Universe U. Lookup: Does x belong
More information1 Hashing. 1.1 Perfect Hashing
1 Hashing Hashing is covered by undergraduate courses like Algo I. However, there is much more to say on this topic. Here, we focus on two selected topics: perfect hashing and cockoo hashing. In general,
More informationDistributed Oblivious RAM for Secure Two-Party Computation
Seminar in Distributed Computing Distributed Oblivious RAM for Secure Two-Party Computation Steve Lu & Rafail Ostrovsky Philipp Gamper Philipp Gamper 2017-04-25 1 Yao s millionaires problem Two millionaires
More informationRainbow Tables ENEE 457/CMSC 498E
Rainbow Tables ENEE 457/CMSC 498E How are Passwords Stored? Option 1: Store all passwords in a table in the clear. Problem: If Server is compromised then all passwords are leaked. Option 2: Store only
More informationA Lecture on Hashing. Aram-Alexandre Pooladian, Alexander Iannantuono March 22, Hashing. Direct Addressing. Operations - Simple
A Lecture on Hashing Aram-Alexandre Pooladian, Alexander Iannantuono March 22, 217 This is the scribing of a lecture given by Luc Devroye on the 17th of March 217 for Honours Algorithms and Data Structures
More informationNotes for Lecture A can repeat step 3 as many times as it wishes. We will charge A one unit of time for every time it repeats step 3.
COS 533: Advanced Cryptography Lecture 2 (September 18, 2017) Lecturer: Mark Zhandry Princeton University Scribe: Mark Zhandry Notes for Lecture 2 1 Last Time Last time, we defined formally what an encryption
More informationAdvanced Implementations of Tables: Balanced Search Trees and Hashing
Advanced Implementations of Tables: Balanced Search Trees and Hashing Balanced Search Trees Binary search tree operations such as insert, delete, retrieve, etc. depend on the length of the path to the
More information1 Difference between grad and undergrad algorithms
princeton univ. F 4 cos 52: Advanced Algorithm Design Lecture : Course Intro and Hashing Lecturer: Sanjeev Arora Scribe:Sanjeev Algorithms are integral to computer science and every computer scientist
More information12 Hash Tables Introduction Chaining. Lecture 12: Hash Tables [Fa 10]
Calvin: There! I finished our secret code! Hobbes: Let s see. Calvin: I assigned each letter a totally random number, so the code will be hard to crack. For letter A, you write 3,004,572,688. B is 28,731,569½.
More informationLecture 04: Balls and Bins: Birthday Paradox. Birthday Paradox
Lecture 04: Balls and Bins: Overview In today s lecture we will start our study of balls-and-bins problems We shall consider a fundamental problem known as the Recall: Inequalities I Lemma Before we begin,
More informationExpectation of geometric distribution
Expectation of geometric distribution What is the probability that X is finite? Can now compute E(X): Σ k=1f X (k) = Σ k=1(1 p) k 1 p = pσ j=0(1 p) j = p 1 1 (1 p) = 1 E(X) = Σ k=1k (1 p) k 1 p = p [ Σ
More informationUNIFORM HASHING IN CONSTANT TIME AND OPTIMAL SPACE
UNIFORM HASHING IN CONSTANT TIME AND OPTIMAL SPACE ANNA PAGH AND RASMUS PAGH Abstract. Many algorithms and data structures employing hashing have been analyzed under the uniform hashing assumption, i.e.,
More informationPower of d Choices with Simple Tabulation
Power of d Choices with Simple Tabulation Anders Aamand, Mathias Bæk Tejs Knudsen, and Mikkel Thorup Abstract arxiv:1804.09684v1 [cs.ds] 25 Apr 2018 Suppose that we are to place m balls into n bins sequentially
More information? 11.5 Perfect hashing. Exercises
11.5 Perfect hashing 77 Exercises 11.4-1 Consider inserting the keys 10; ; 31; 4; 15; 8; 17; 88; 59 into a hash table of length m 11 using open addressing with the auxiliary hash function h 0.k/ k. Illustrate
More informationarxiv: v3 [cs.ds] 11 May 2017
Lecture Notes on Linear Probing with 5-Independent Hashing arxiv:1509.04549v3 [cs.ds] 11 May 2017 Mikkel Thorup May 12, 2017 Abstract These lecture notes show that linear probing takes expected constant
More informationRandomized Algorithms. Lecture 4. Lecturer: Moni Naor Scribe by: Tamar Zondiner & Omer Tamuz Updated: November 25, 2010
Randomized Algorithms Lecture 4 Lecturer: Moni Naor Scribe by: Tamar Zondiner & Omer Tamuz Updated: November 25, 2010 1 Pairwise independent hash functions In the previous lecture we encountered two families
More informationSearching. Constant time access. Hash function. Use an array? Better hash function? Hash function 4/18/2013. Chapter 9
Constant time access Searching Chapter 9 Linear search Θ(n) OK Binary search Θ(log n) Better Can we achieve Θ(1) search time? CPTR 318 1 2 Use an array? Use random access on a key such as a string? Hash
More informationData Structure. Mohsen Arab. January 13, Yazd University. Mohsen Arab (Yazd University ) Data Structure January 13, / 86
Data Structure Mohsen Arab Yazd University January 13, 2015 Mohsen Arab (Yazd University ) Data Structure January 13, 2015 1 / 86 Table of Content Binary Search Tree Treaps Skip Lists Hash Tables Mohsen
More informationWhy Simple Hash Functions Work: Exploiting the Entropy in a Data Stream
Why Simple Hash Functions Work: Exploiting the Entropy in a Data Stream ichael itzenmacher Salil Vadhan School of Engineering & Applied Sciences Harvard University Cambridge, A 02138 {michaelm,salil}@eecs.harvard.edu
More informationHOMEWORK #2 - MATH 3260
HOMEWORK # - MATH 36 ASSIGNED: JANUARAY 3, 3 DUE: FEBRUARY 1, AT :3PM 1) a) Give by listing the sequence of vertices 4 Hamiltonian cycles in K 9 no two of which have an edge in common. Solution: Here is
More informationAd Placement Strategies
Case Study 1: Estimating Click Probabilities Tackling an Unknown Number of Features with Sketching Machine Learning for Big Data CSE547/STAT548, University of Washington Emily Fox 2014 Emily Fox January
More informationMotivation. Dictionaries. Direct Addressing. CSE 680 Prof. Roger Crawfis
Motivation Introduction to Algorithms Hash Tables CSE 680 Prof. Roger Crawfis Arrays provide an indirect way to access a set. Many times we need an association between two sets, or a set of keys and associated
More informationData Structures and Algorithm. Xiaoqing Zheng
Data Structures and Algorithm Xiaoqing Zheng zhengxq@fudan.edu.cn Dictionary problem Dictionary T holding n records: x records key[x] Other fields containing satellite data Operations on T: INSERT(T, x)
More informationExpectation of geometric distribution. Variance and Standard Deviation. Variance: Examples
Expectation of geometric distribution Variance and Standard Deviation What is the probability that X is finite? Can now compute E(X): Σ k=f X (k) = Σ k=( p) k p = pσ j=0( p) j = p ( p) = E(X) = Σ k=k (
More informationLecture and notes by: Alessio Guerrieri and Wei Jin Bloom filters and Hashing
Bloom filters and Hashing 1 Introduction The Bloom filter, conceived by Burton H. Bloom in 1970, is a space-efficient probabilistic data structure that is used to test whether an element is a member of
More informationSome notes on streaming algorithms continued
U.C. Berkeley CS170: Algorithms Handout LN-11-9 Christos Papadimitriou & Luca Trevisan November 9, 016 Some notes on streaming algorithms continued Today we complete our quick review of streaming algorithms.
More informationCSE 502 Class 11 Part 2
CSE 502 Class 11 Part 2 Jeremy Buhler Steve Cole February 17 2015 Today: analysis of hashing 1 Constraints of Double Hashing How does using OA w/double hashing constrain our hash function design? Need
More information2. This exam consists of 15 questions. The rst nine questions are multiple choice Q10 requires two
CS{74 Combinatorics & Discrete Probability, Fall 96 Final Examination 2:30{3:30pm, 7 December Read these instructions carefully. This is a closed book exam. Calculators are permitted. 2. This exam consists
More informationCSCB63 Winter Week10 - Lecture 2 - Hashing. Anna Bretscher. March 21, / 30
CSCB63 Winter 2019 Week10 - Lecture 2 - Hashing Anna Bretscher March 21, 2019 1 / 30 Today Hashing Open Addressing Hash functions Universal Hashing 2 / 30 Open Addressing Open Addressing. Each entry in
More informationOn Risks of Using a High Performance Hashing Scheme With Common Universal Classes
On Risks of Using a High Performance Hashing Scheme With Common Universal Classes Dissertation zur Erlangung des akademischen Grades Doctor rerum naturalium (Dr. rer. nat.) vorgelegt der Fakultät für Informatik
More information6.842 Randomness and Computation Lecture 5
6.842 Randomness and Computation 2012-02-22 Lecture 5 Lecturer: Ronitt Rubinfeld Scribe: Michael Forbes 1 Overview Today we will define the notion of a pairwise independent hash function, and discuss its
More informationLecture 6 September 13, 2016
CS 395T: Sublinear Algorithms Fall 206 Prof. Eric Price Lecture 6 September 3, 206 Scribe: Shanshan Wu, Yitao Chen Overview Recap of last lecture. We talked about Johnson-Lindenstrauss (JL) lemma [JL84]
More informationProblem 1: (Chernoff Bounds via Negative Dependence - from MU Ex 5.15)
Problem 1: Chernoff Bounds via Negative Dependence - from MU Ex 5.15) While deriving lower bounds on the load of the maximum loaded bin when n balls are thrown in n bins, we saw the use of negative dependence.
More informationMa/CS 6b Class 15: The Probabilistic Method 2
Ma/CS 6b Class 15: The Probabilistic Method 2 By Adam Sheffer Reminder: Random Variables A random variable is a function from the set of possible events to R. Example. Say that we flip five coins. We can
More informationDictionary: an abstract data type
2-3 Trees 1 Dictionary: an abstract data type A container that maps keys to values Dictionary operations Insert Search Delete Several possible implementations Balanced search trees Hash tables 2 2-3 trees
More informationCS261: A Second Course in Algorithms Lecture #18: Five Essential Tools for the Analysis of Randomized Algorithms
CS261: A Second Course in Algorithms Lecture #18: Five Essential Tools for the Analysis of Randomized Algorithms Tim Roughgarden March 3, 2016 1 Preamble In CS109 and CS161, you learned some tricks of
More informationInsert Sorted List Insert as the Last element (the First element?) Delete Chaining. 2 Slide courtesy of Dr. Sang-Eon Park
1617 Preview Data Structure Review COSC COSC Data Structure Review Linked Lists Stacks Queues Linked Lists Singly Linked List Doubly Linked List Typical Functions s Hash Functions Collision Resolution
More informationCharles University in Prague Faculty of Mathematics and Physics MASTER THESIS. Martin Babka. Properties of Universal Hashing
Charles University in Prague Faculty of Mathematics and Physics MASTER THESIS Martin Babka Properties of Universal Hashing Department of Theoretical Computer Science and Mathematical Logic Supervisor:
More informationLecture 14: Cryptographic Hash Functions
CSE 599b: Cryptography (Winter 2006) Lecture 14: Cryptographic Hash Functions 17 February 2006 Lecturer: Paul Beame Scribe: Paul Beame 1 Hash Function Properties A hash function family H = {H K } K K is
More informationCS369N: Beyond Worst-Case Analysis Lecture #6: Pseudorandom Data and Universal Hashing
CS369N: Beyond Worst-Case Analysis Lecture #6: Pseudorandom Data and Universal Hashing Tim Roughgarden April 4, 204 otivation: Linear Probing and Universal Hashing This lecture discusses a very neat paper
More informationLecture 23: Alternation vs. Counting
CS 710: Complexity Theory 4/13/010 Lecture 3: Alternation vs. Counting Instructor: Dieter van Melkebeek Scribe: Jeff Kinne & Mushfeq Khan We introduced counting complexity classes in the previous lecture
More informationCS 125 Section #12 (More) Probability and Randomized Algorithms 11/24/14. For random numbers X which only take on nonnegative integer values, E(X) =
CS 125 Section #12 (More) Probability and Randomized Algorithms 11/24/14 1 Probability First, recall a couple useful facts from last time about probability: Linearity of expectation: E(aX + by ) = ae(x)
More informationCPSC 467: Cryptography and Computer Security
CPSC 467: Cryptography and Computer Security Michael J. Fischer Lecture 15 October 20, 2014 CPSC 467, Lecture 15 1/37 Common Hash Functions SHA-2 MD5 Birthday Attack on Hash Functions Constructing New
More informationGROUPS AS GRAPHS. W. B. Vasantha Kandasamy Florentin Smarandache
GROUPS AS GRAPHS W. B. Vasantha Kandasamy Florentin Smarandache 009 GROUPS AS GRAPHS W. B. Vasantha Kandasamy e-mail: vasanthakandasamy@gmail.com web: http://mat.iitm.ac.in/~wbv www.vasantha.in Florentin
More informationFIRST ORDER SENTENCES ON G(n, p), ZERO-ONE LAWS, ALMOST SURE AND COMPLETE THEORIES ON SPARSE RANDOM GRAPHS
FIRST ORDER SENTENCES ON G(n, p), ZERO-ONE LAWS, ALMOST SURE AND COMPLETE THEORIES ON SPARSE RANDOM GRAPHS MOUMANTI PODDER 1. First order theory on G(n, p) We start with a very simple property of G(n,
More informationIEOR E4703: Monte-Carlo Simulation
IEOR E4703: Monte-Carlo Simulation Output Analysis for Monte-Carlo Martin Haugh Department of Industrial Engineering and Operations Research Columbia University Email: martin.b.haugh@gmail.com Output Analysis
More informationBinary Decision Diagrams
Binary Decision Diagrams Sungho Kang Yonsei University Outline Representing Logic Function Design Considerations for a BDD package Algorithms 2 Why BDDs BDDs are Canonical (each Boolean function has its
More information1 Estimating Frequency Moments in Streams
CS 598CSC: Algorithms for Big Data Lecture date: August 28, 2014 Instructor: Chandra Chekuri Scribe: Chandra Chekuri 1 Estimating Frequency Moments in Streams A significant fraction of streaming literature
More informationFunctions of Several Variables: Chain Rules
Functions of Several Variables: Chain Rules Calculus III Josh Engwer TTU 29 September 2014 Josh Engwer (TTU) Functions of Several Variables: Chain Rules 29 September 2014 1 / 30 PART I PART I: MULTIVARIABLE
More informationCPSC 467: Cryptography and Computer Security
CPSC 467: Cryptography and Computer Security Michael J. Fischer Lecture 16 October 30, 2017 CPSC 467, Lecture 16 1/52 Properties of Hash Functions Hash functions do not always look random Relations among
More informationAdvanced Data Structures
Simon Gog gog@kit.edu - 0 Simon Gog: KIT University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association www.kit.edu Dynamic Perfect Hashing What we want: O(1) lookup
More informationBasics of hashing: k-independence and applications
Basics of hashing: k-independence and applications Rasmus Pagh Supported by: 1 Agenda Load balancing using hashing! - Analysis using bounded independence! Implementation of small independence! Case studies:!
More informationAbstract Data Type (ADT) maintains a set of items, each with a key, subject to
Lecture Overview Dictionaries and Python Motivation Hash functions Chaining Simple uniform hashing Good hash functions Readings CLRS Chapter,, 3 Dictionary Problem Abstract Data Type (ADT) maintains a
More informationINTRODUCTION TO HASHING Dr. Thomas Hicks Trinity University. Data Set - SSN's from UTSA Class
Dr. Thomas E. Hicks Data Abstractions Homework - Hashing -1 - INTRODUCTION TO HASHING Dr. Thomas Hicks Trinity University Data Set - SSN's from UTSA Class 467 13 3881 498 66 2055 450 27 3804 456 49 5261
More informationAdvanced Data Structures
Simon Gog gog@kit.edu - Simon Gog: KIT The Research University in the Helmholtz Association www.kit.edu Predecessor data structures We want to support the following operations on a set of integers from
More informationarxiv: v1 [cs.cc] 14 Sep 2013
arxiv:1309.3690v1 [cs.cc] 14 Sep 2013 Element Distinctness, Frequency Moments, and Sliding Windows Paul Beame Computer Science and Engineering University of Washington Seattle, WA 98195-2350 beame@cs.washington.edu
More informationHashing Data Structures. Ananda Gunawardena
Hashing 15-121 Data Structures Ananda Gunawardena Hashing Why do we need hashing? Many applications deal with lots of data Search engines and web pages There are myriad look ups. The look ups are time
More informationHash Tables. Direct-Address Tables Hash Functions Universal Hashing Chaining Open Addressing. CS 5633 Analysis of Algorithms Chapter 11: Slide 1
Hash Tables Direct-Address Tables Hash Functions Universal Hashing Chaining Open Addressing CS 5633 Analysis of Algorithms Chapter 11: Slide 1 Direct-Address Tables 2 2 Let U = {0,...,m 1}, the set of
More informationProblem. Maintain the above set so as to support certain
Randomized Data Structures Roussos Ioannis Fundamental Data-structuring Problem Collection {S 1,S 2, } of sets of items. Each item i is an arbitrary record, indexed by a key k(i). Each k(i) is drawn from
More informationCS 598CSC: Algorithms for Big Data Lecture date: Sept 11, 2014
CS 598CSC: Algorithms for Big Data Lecture date: Sept 11, 2014 Instructor: Chandra Cheuri Scribe: Chandra Cheuri The Misra-Greis deterministic counting guarantees that all items with frequency > F 1 /
More information