Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash

Size: px
Start display at page:

Download "Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash"

Transcription

1 Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash Martin Aumu ller, Martin Dietzfelbinger, Philipp Woelfel? Technische Universita t Ilmenau,? University of Calgary ESA Ljubljana, September 12, 2012 M. Aumu ller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 1/18

2 Cuckoo Hashing (Pagh/Rodler, 2001/2004) A hashing-based implementation of the dictionary data type. Setting: set S U of n keys two tables T 1 [0..m 1] and T 2 [0..m 1] two (hash) functions h 1, h 2 with h i : U [m] Rules: each table cell can hold exactly one key a key x must be stored either in T 1 [h 1 (x)] or T 2 [h 2 (x)] (fast lookup and deletions!) Definition If S can be stored according to these rules, we call (h 1, h 2 ) suitable for S. M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 2/18

3 Cuckoo Hashing (Pagh/Rodler, 2001/2004) A hashing-based implementation of the dictionary data type. Setting: set S U of n keys two tables T 1 [0..m 1] and T 2 [0..m 1] two (hash) functions h 1, h 2 with h i : U [m] Rules: each table cell can hold exactly one key a key x must be stored either in T 1 [h 1 (x)] or T 2 [h 2 (x)] (fast lookup and deletions!) Definition If S can be stored according to these rules, we call (h 1, h 2 ) suitable for S. M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 2/18

4 Cuckoo Hashing: Failure Probability Theorem (Pagh/Rodler, 2004) Let S U with S = n. If (h 1, h 2 ) are fully random (or Ω(log n)-wise independent), then In fact: Θ(1/n). Pr((h 1, h 2 ) unsuitable for S) = O(1/n). M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 3/18

5 Improving Standard Cuckoo Hashing: Stash (Kirsch/Mitzenmacher/Wieder, ESA 2008): Θ(1/n) is too large. Proposal: Can put up to s = O(1) keys into additional storage, called stash Theorem (K/M/W, 2008) Let S U with S = n. If (h 1, h 2 ) are fully random, then Pr((h 1, h 2 ) unsuitable for S with stash size s) = O(1/n s+1 ). Full Randomness Assumption (FRA) central in their analysis constructions for FRA known, but undesirable Open Question (Kirsch/Mitzenmacher/Wieder, SICOMP 09) A major open question is proving the above bounds for explicit hash families that can be represented, sampled, and evaluated efficiently. M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 4/18

6 Our contributions efficient hash functions that can realize cuckoo hashing with a stash variation of a known construction of a uniform hash function (not covered in this talk) M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 5/18

7 Method extension of (Dietzfelbinger/Woelfel, 2003) they proposed efficient hash functions for standard cuckoo hashing we exhibit a new class of (even simpler) hash functions that allow running c.h. with a stash M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 6/18

8 Key ingredient: linear polynomials where p U is a prime, and h(x) = ((a x + b) mod p) mod m, a and b are chosen uniformly at random from {0,..., p 1}. very simple structure! (Remark: This function is 2-wise independent, i.e., for any pair x, y U, x y, h(x) and h(y) are fully random.) M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 7/18

9 Our hash class (version for this talk) For given n 1, we combine linear polynomials with lookups in tables of size n filled with random values. h i (x) = f i (x) c j=1 z (i) j [ g j (x) ], i = 1, 2 Class of all these pairs (h 1, h 2 ) of hash functions: Z. M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 8/18

10 Main Objective For given S and stash size s, calculate Pr ((h 1, h 2 ) unsuitable for S with stash size s). (h 1,h 2 ) Z What is a criteria for (h 1, h 2 ) being unsuitable? Tool: Cuckoo graph M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 9/18

11 Suitability Criteria Fact (Kirsch/Mitzenmacher/Wieder, 2008) (h 1, h 2 ) suitable after removing s edges from G(S, h 1, h 2 ), all connected components are either trees or unicyclic. Consequence: unsuitable there exists a subgraph s.t. one cannot remove s edges to obtain only tree or unicyclic components. Minimal such bad subgraph : a MOS s. (Example: s = 2.) M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 10/18

12 Suitability Criteria Fact (Kirsch/Mitzenmacher/Wieder, 2008) (h 1, h 2 ) suitable after removing s edges from G(S, h 1, h 2 ), all connected components are either trees or unicyclic. Consequence: unsuitable there exists a subgraph s.t. one cannot remove s edges to obtain only tree or unicyclic components. Minimal such bad subgraph : a MOS s. (Example: s = 2.) M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 10/18

13 Suitability Criteria Fact (Kirsch/Mitzenmacher/Wieder, 2008) (h 1, h 2 ) suitable after removing s edges from G(S, h 1, h 2 ), all connected components are either trees or unicyclic. Consequence: unsuitable there exists a subgraph s.t. one cannot remove s edges to obtain only tree or unicyclic components. Minimal such bad subgraph : a MOS s. (Example: s = 2.) M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 10/18

14 Suitability Criteria Fact (Kirsch/Mitzenmacher/Wieder, 2008) (h 1, h 2 ) suitable after removing s edges from G(S, h 1, h 2 ), all connected components are either trees or unicyclic. Consequence: unsuitable there exists a subgraph s.t. one cannot remove s edges to obtain only tree or unicyclic components. Minimal such bad subgraph : a MOS s. (Example: s = 2.) M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 10/18

15 Suitability Criteria Fact (Kirsch/Mitzenmacher/Wieder, 2008) (h 1, h 2 ) suitable after removing s edges from G(S, h 1, h 2 ), all connected components are either trees or unicyclic. Consequence: unsuitable there exists a subgraph s.t. one cannot remove s edges to obtain only tree or unicyclic components. Minimal such bad subgraph : a MOS s. (Example: s = 2.) M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 10/18

16 Suitability Criteria Fact (Kirsch/Mitzenmacher/Wieder, 2008) (h 1, h 2 ) suitable after removing s edges from G(S, h 1, h 2 ), all connected components are either trees or unicyclic. Consequence: unsuitable there exists a subgraph s.t. one cannot remove s edges to obtain only tree or unicyclic components. Minimal such bad subgraph : a MOS s. (Example: s = 2.) M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 10/18

17 Thus, we have Pr ((h 1, h 2 ) unsuitable for S with stash size s) (h 1,h 2 ) Z = Pr (h 1,h 2 ) Z ( T S : G(T, h 1, h 2 ) forms a MOS s ) T S Pr (G(T, h 1, h 2 ) forms a MOS s ) (h 1,h 2 ) Z if (h 1, h 2 ) are fully random, we provide a direct counting argument that this is O(1/n s+1 ) giving an alternative proof to the orginal analysis by Kirsch, Mitzenmacher and Wieder (who used machinery like Markov chain coupling) M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 11/18

18 Behavior of our hash class on fixed T Recall: h i (x) = f i (x) c j=1 z (i) j [g j (x)] Central Observation Let T U. If there is a g j such that at most one pair of keys in T collides under g j (i.e., g j (x) = g j (y)), then h 1, h 2 are fully random on T. if this is the case: (h 1, h 2 ) T -good. otherwise (each g j has more than one colliding pair of keys): (h 1, h 2 ) is T -bad. M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 12/18

19 Collecting bad hash functions We split our set of hash functions into harmful and harmless ones. (h 1, h 2 ) are harmful, if there exists T S s.t. G(T, h 1, h 2 ) forms a MOS s, and (h 1, h 2 ) is T -bad. B MOSs := the set of all the harmful pairs (h 1, h 2 ). (An event in our probability space!) M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 13/18

20 Splitting the calculation Let A be the event T S : G(T, h 1, h 2 ) forms a MOS s. We calculate: Pr (h1,h 2 ) Z(A) Pr (h1,h 2 ) Z(A B MOSs ) + Pr(B MOSs ) M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 14/18

21 Splitting the calculation Let A be the event T S : G(T, h 1, h 2 ) forms a MOS s. We calculate: Pr (h1,h 2 ) Z(A) Pr (h1,h 2 ) Z(A B MOSs ) + Pr(B MOSs ) for this summand, we can use full randomness of (h 1, h 2 ) as before: this summand is O(1/n s+1 ) M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 14/18

22 Splitting the calculation Let A be the event T S : G(T, h 1, h 2 ) forms a MOS s. We calculate: Pr (h1,h 2 ) Z(A) O(1/n s+1 ) + Pr(B MOSs ) for this summand, we can use full randomness of (h 1, h 2 ) as before: this summand is O(1/n s+1 ) M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 14/18

23 Splitting the calculation Let A be the event T S : G(T, h 1, h 2 ) forms a MOS s. We calculate: Pr (h1,h 2 ) Z(A) O(1/n s+1 ) + Pr(B MOSs ) for this summand, we can use full randomness of (h 1, h 2 ) as before: this summand is O(1/n s+1 ) The hard part: Calculating/bounding Pr(B MOSs ) = Pr( T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad ) Wish: Use full randomness nonetheless Idea: Find a suitable event that contains B MOSs M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 14/18

24 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad. #collisions g 1 5 g 2 7 g 3 4 M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18

25 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad. Recall: h i (x) = f i (x) c j=1 z(i) j [g j (x)] T -bad each g j has more than one pair of colliding keys #collisions g 1 5 g 2 7 g 3 4 M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18

26 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad #collisions g 1 5 g 2 7 g M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18

27 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad #collisions g 1 4 g 2 5 g M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18

28 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad #collisions g 1 4 g 2 5 g M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18

29 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad #collisions g 1 4 g 2 5 g M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18

30 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad #collisions g 1 4 g 2 4 g M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18

31 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad #collisions g 1 3 g 2 4 g M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18

32 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad #collisions g 1 3 g 2 4 g M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18

33 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad #collisions g 1 2 g 2 3 g M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18

34 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad #collisions g 1 2 g 2 3 g M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18

35 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad #collisions g 1 2 g 2 3 g M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18

36 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad #collisions g 1 2 g 2 3 g M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18

37 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad #collisions g 1 1 g 2 3 g 3 3 M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18

38 Peeling of bad graphs (simplified) Assume T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad 4 8 T -good exists g j with at most one pair 5 of colliding keys #collisions g 1 1 g 2 3 g 3 3 M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18

39 Peeling of bad graphs (simplified) If T S : G(T, h 1, h 2 ) forms a MOS s (h 1, h 2 ) are T -bad #collisions g 1 1 g 2 3 g 3 3 then T S : G(T, h 1, h 2 ) forms peeled graph (h 1, h 2 ) are T -good M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 15/18

40 Using a direct counting argument, we show Pr(B MOSs ) = O(n/n c/2 ). We thus have Pr((h 1, h 2 ) unsuitable) = O(1/n s+1 ) + O(n/n c/2 ) Setting c = 2(s + 2) gives our result. M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 16/18

41 Conclusion we proposed efficient hash functions that can run c. h. with a stash approach has more potential: applicable to balanced allocation (forthcoming paper) Open question: Explicit hash functions for generalized cuckoo hashing with more than 2 hash functions. M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 17/18

42 Thanks! Any questions? M. Aumüller Explicit and Efficient Hash Functions Suffice for Cuckoo Hashing with a Stash 18/18

Cuckoo Hashing with a Stash: Alternative Analysis, Simple Hash Functions

Cuckoo Hashing with a Stash: Alternative Analysis, Simple Hash Functions 1 / 29 Cuckoo Hashing with a Stash: Alternative Analysis, Simple Hash Functions Martin Aumüller, Martin Dietzfelbinger Technische Universität Ilmenau 2 / 29 Cuckoo Hashing Maintain a dynamic dictionary

More information

On Randomness in Hash Functions. Martin Dietzfelbinger Technische Universität Ilmenau

On Randomness in Hash Functions. Martin Dietzfelbinger Technische Universität Ilmenau On Randomness in Hash Functions Martin Dietzfelbinger Technische Universität Ilmenau Martin Dietzfelbinger STACS Paris, March 2, 2012 Co-Workers Andrea Montanari Andreas Goerdt Christoph Weidling Martin

More information

Lecture 5: Hashing. David Woodruff Carnegie Mellon University

Lecture 5: Hashing. David Woodruff Carnegie Mellon University Lecture 5: Hashing David Woodruff Carnegie Mellon University Hashing Universal hashing Perfect hashing Maintaining a Dictionary Let U be a universe of keys U could be all strings of ASCII characters of

More information

Analysis of Algorithms I: Perfect Hashing

Analysis of Algorithms I: Perfect Hashing Analysis of Algorithms I: Perfect Hashing Xi Chen Columbia University Goal: Let U = {0, 1,..., p 1} be a huge universe set. Given a static subset V U of n keys (here static means we will never change the

More information

De-amortized Cuckoo Hashing: Provable Worst-Case Performance and Experimental Results

De-amortized Cuckoo Hashing: Provable Worst-Case Performance and Experimental Results De-amortized Cuckoo Hashing: Provable Worst-Case Performance and Experimental Results Yuriy Arbitman Moni Naor Gil Segev Abstract Cuckoo hashing is a highly practical dynamic dictionary: it provides amortized

More information

CS 473: Algorithms. Ruta Mehta. Spring University of Illinois, Urbana-Champaign. Ruta (UIUC) CS473 1 Spring / 32

CS 473: Algorithms. Ruta Mehta. Spring University of Illinois, Urbana-Champaign. Ruta (UIUC) CS473 1 Spring / 32 CS 473: Algorithms Ruta Mehta University of Illinois, Urbana-Champaign Spring 2018 Ruta (UIUC) CS473 1 Spring 2018 1 / 32 CS 473: Algorithms, Spring 2018 Universal Hashing Lecture 10 Feb 15, 2018 Most

More information

Hash tables. Hash tables

Hash tables. Hash tables Dictionary Definition A dictionary is a data-structure that stores a set of elements where each element has a unique key, and supports the following operations: Search(S, k) Return the element whose key

More information

Hash tables. Hash tables

Hash tables. Hash tables Dictionary Definition A dictionary is a data-structure that stores a set of elements where each element has a unique key, and supports the following operations: Search(S, k) Return the element whose key

More information

1 Maintaining a Dictionary

1 Maintaining a Dictionary 15-451/651: Design & Analysis of Algorithms February 1, 2016 Lecture #7: Hashing last changed: January 29, 2016 Hashing is a great practical tool, with an interesting and subtle theory too. In addition

More information

Lecture: Analysis of Algorithms (CS )

Lecture: Analysis of Algorithms (CS ) Lecture: Analysis of Algorithms (CS483-001) Amarda Shehu Spring 2017 1 Outline of Today s Class 2 Choosing Hash Functions Universal Universality Theorem Constructing a Set of Universal Hash Functions Perfect

More information

Hash tables. Hash tables

Hash tables. Hash tables Basic Probability Theory Two events A, B are independent if Conditional probability: Pr[A B] = Pr[A] Pr[B] Pr[A B] = Pr[A B] Pr[B] The expectation of a (discrete) random variable X is E[X ] = k k Pr[X

More information

Lecture 4: Hashing and Streaming Algorithms

Lecture 4: Hashing and Streaming Algorithms CSE 521: Design and Analysis of Algorithms I Winter 2017 Lecture 4: Hashing and Streaming Algorithms Lecturer: Shayan Oveis Gharan 01/18/2017 Scribe: Yuqing Ai Disclaimer: These notes have not been subjected

More information

Hashing. Dictionaries Chained Hashing Universal Hashing Static Dictionaries and Perfect Hashing. Philip Bille

Hashing. Dictionaries Chained Hashing Universal Hashing Static Dictionaries and Perfect Hashing. Philip Bille Hashing Dictionaries Chained Hashing Universal Hashing Static Dictionaries and Perfect Hashing Philip Bille Hashing Dictionaries Chained Hashing Universal Hashing Static Dictionaries and Perfect Hashing

More information

Hashing. Hashing. Dictionaries. Dictionaries. Dictionaries Chained Hashing Universal Hashing Static Dictionaries and Perfect Hashing

Hashing. Hashing. Dictionaries. Dictionaries. Dictionaries Chained Hashing Universal Hashing Static Dictionaries and Perfect Hashing Philip Bille Dictionaries Dictionary problem. Maintain a set S U = {,..., u-} supporting lookup(x): return true if x S and false otherwise. insert(x): set S = S {x} delete(x): set S = S - {x} Dictionaries

More information

CS 591, Lecture 6 Data Analytics: Theory and Applications Boston University

CS 591, Lecture 6 Data Analytics: Theory and Applications Boston University CS 591, Lecture 6 Data Analytics: Theory and Applications Boston University Babis Tsourakakis February 8th, 2017 Universal hash family Notation: Universe U = {0,..., u 1}, index space M = {0,..., m 1},

More information

Algorithms lecture notes 1. Hashing, and Universal Hash functions

Algorithms lecture notes 1. Hashing, and Universal Hash functions Algorithms lecture notes 1 Hashing, and Universal Hash functions Algorithms lecture notes 2 Can we maintain a dictionary with O(1) per operation? Not in the deterministic sense. But in expectation, yes.

More information

Hashing. Martin Babka. January 12, 2011

Hashing. Martin Babka. January 12, 2011 Hashing Martin Babka January 12, 2011 Hashing Hashing, Universal hashing, Perfect hashing Input data is uniformly distributed. A dynamic set is stored. Universal hashing Randomised algorithm uniform choice

More information

arxiv: v1 [cs.ds] 2 Mar 2009

arxiv: v1 [cs.ds] 2 Mar 2009 De-amortized Cuckoo Hashing: Provable Worst-Case Performance and Experimental Results Yuriy Arbitman Moni Naor Gil Segev arxiv:0903.0391v1 [cs.ds] 2 Mar 2009 March 2, 2009 Abstract Cuckoo hashing is a

More information

Hashing. Dictionaries Hashing with chaining Hash functions Linear Probing

Hashing. Dictionaries Hashing with chaining Hash functions Linear Probing Hashing Dictionaries Hashing with chaining Hash functions Linear Probing Hashing Dictionaries Hashing with chaining Hash functions Linear Probing Dictionaries Dictionary: Maintain a dynamic set S. Every

More information

So far we have implemented the search for a key by carefully choosing split-elements.

So far we have implemented the search for a key by carefully choosing split-elements. 7.7 Hashing Dictionary: S. insert(x): Insert an element x. S. delete(x): Delete the element pointed to by x. S. search(k): Return a pointer to an element e with key[e] = k in S if it exists; otherwise

More information

Introduction to Hashtables

Introduction to Hashtables Introduction to HashTables Boise State University March 5th 2015 Hash Tables: What Problem Do They Solve What Problem Do They Solve? Why not use arrays for everything? 1 Arrays can be very wasteful: Example

More information

Algorithms for Data Science

Algorithms for Data Science Algorithms for Data Science CSOR W4246 Eleni Drinea Computer Science Department Columbia University Tuesday, December 1, 2015 Outline 1 Recap Balls and bins 2 On randomized algorithms 3 Saving space: hashing-based

More information

CS483 Design and Analysis of Algorithms

CS483 Design and Analysis of Algorithms CS483 Design and Analysis of Algorithms Lectures 2-3 Algorithms with Numbers Instructor: Fei Li lifei@cs.gmu.edu with subject: CS483 Office hours: STII, Room 443, Friday 4:00pm - 6:00pm or by appointments

More information

Hash Tables. Given a set of possible keys U, such that U = u and a table of m entries, a Hash function h is a

Hash Tables. Given a set of possible keys U, such that U = u and a table of m entries, a Hash function h is a Hash Tables Given a set of possible keys U, such that U = u and a table of m entries, a Hash function h is a mapping from U to M = {1,..., m}. A collision occurs when two hashed elements have h(x) =h(y).

More information

1 Probability Review. CS 124 Section #8 Hashing, Skip Lists 3/20/17. Expectation (weighted average): the expectation of a random quantity X is:

1 Probability Review. CS 124 Section #8 Hashing, Skip Lists 3/20/17. Expectation (weighted average): the expectation of a random quantity X is: CS 24 Section #8 Hashing, Skip Lists 3/20/7 Probability Review Expectation (weighted average): the expectation of a random quantity X is: x= x P (X = x) For each value x that X can take on, we look at

More information

Balls and Bins: Smaller Hash Families and Faster Evaluation

Balls and Bins: Smaller Hash Families and Faster Evaluation Balls and Bins: Smaller Hash Families and Faster Evaluation L. Elisa Celis Omer Reingold Gil Segev Udi Wieder April 22, 2011 Abstract A fundamental fact in the analysis of randomized algorithm is that

More information

Cuckoo Hashing and Cuckoo Filters

Cuckoo Hashing and Cuckoo Filters Cuckoo Hashing and Cuckoo Filters Noah Fleming May 7, 208 Preliminaries A dictionary is an abstract data type that stores a collection of elements, located by their key. It supports operations: insert,

More information

Lecture 6. Today we shall use graph entropy to improve the obvious lower bound on good hash functions.

Lecture 6. Today we shall use graph entropy to improve the obvious lower bound on good hash functions. CSE533: Information Theory in Computer Science September 8, 010 Lecturer: Anup Rao Lecture 6 Scribe: Lukas Svec 1 A lower bound for perfect hash functions Today we shall use graph entropy to improve the

More information

Balls and Bins: Smaller Hash Families and Faster Evaluation

Balls and Bins: Smaller Hash Families and Faster Evaluation Balls and Bins: Smaller Hash Families and Faster Evaluation L. Elisa Celis Omer Reingold Gil Segev Udi Wieder May 1, 2013 Abstract A fundamental fact in the analysis of randomized algorithms is that when

More information

6.1 Occupancy Problem

6.1 Occupancy Problem 15-859(M): Randomized Algorithms Lecturer: Anupam Gupta Topic: Occupancy Problems and Hashing Date: Sep 9 Scribe: Runting Shi 6.1 Occupancy Problem Bins and Balls Throw n balls into n bins at random. 1.

More information

Bounds on the OBDD-Size of Integer Multiplication via Universal Hashing

Bounds on the OBDD-Size of Integer Multiplication via Universal Hashing Bounds on the OBDD-Size of Integer Multiplication via Universal Hashing Philipp Woelfel Dept. of Computer Science University Dortmund D-44221 Dortmund Germany phone: +49 231 755-2120 fax: +49 231 755-2047

More information

CSE 190, Great ideas in algorithms: Pairwise independent hash functions

CSE 190, Great ideas in algorithms: Pairwise independent hash functions CSE 190, Great ideas in algorithms: Pairwise independent hash functions 1 Hash functions The goal of hash functions is to map elements from a large domain to a small one. Typically, to obtain the required

More information

Lecture 4 Thursday Sep 11, 2014

Lecture 4 Thursday Sep 11, 2014 CS 224: Advanced Algorithms Fall 2014 Lecture 4 Thursday Sep 11, 2014 Prof. Jelani Nelson Scribe: Marco Gentili 1 Overview Today we re going to talk about: 1. linear probing (show with 5-wise independence)

More information

Lecture 24: Bloom Filters. Wednesday, June 2, 2010

Lecture 24: Bloom Filters. Wednesday, June 2, 2010 Lecture 24: Bloom Filters Wednesday, June 2, 2010 1 Topics for the Final SQL Conceptual Design (BCNF) Transactions Indexes Query execution and optimization Cardinality Estimation Parallel Databases 2 Lecture

More information

Power of d Choices with Simple Tabulation

Power of d Choices with Simple Tabulation Power of d Choices with Simple Tabulation Anders Aamand 1 BARC, University of Copenhagen, Universitetsparken 1, Copenhagen, Denmark. aa@di.ku.dk https://orcid.org/0000-0002-0402-0514 Mathias Bæk Tejs Knudsen

More information

Tabulation-Based Hashing

Tabulation-Based Hashing Mihai Pǎtraşcu Mikkel Thorup April 23, 2010 Applications of Hashing Hash tables: chaining x a t v f s r Applications of Hashing Hash tables: chaining linear probing m c a f t x y t Applications of Hashing

More information

s 1 if xπy and f(x) = f(y)

s 1 if xπy and f(x) = f(y) Algorithms Proessor John Rei Hash Function : A B ALG 4.2 Universal Hash Functions: CLR - Chapter 34 Auxillary Reading Selections: AHU-Data Section 4.7 BB Section 8.4.4 Handout: Carter & Wegman, "Universal

More information

Balanced Allocation Through Random Walk

Balanced Allocation Through Random Walk Balanced Allocation Through Random Walk Alan Frieze Samantha Petti November 25, 2017 Abstract We consider the allocation problem in which m (1 ε)dn items are to be allocated to n bins with capacity d.

More information

compare to comparison and pointer based sorting, binary trees

compare to comparison and pointer based sorting, binary trees Admin Hashing Dictionaries Model Operations. makeset, insert, delete, find keys are integers in M = {1,..., m} (so assume machine word size, or unit time, is log m) can store in array of size M using power:

More information

CPSC 467: Cryptography and Computer Security

CPSC 467: Cryptography and Computer Security CPSC 467: Cryptography and Computer Security Michael J. Fischer Lecture 14 October 16, 2013 CPSC 467, Lecture 14 1/45 Message Digest / Cryptographic Hash Functions Hash Function Constructions Extending

More information

Cache-Oblivious Hashing

Cache-Oblivious Hashing Cache-Oblivious Hashing Zhewei Wei Hong Kong University of Science & Technology Joint work with Rasmus Pagh, Ke Yi and Qin Zhang Dictionary Problem Store a subset S of the Universe U. Lookup: Does x belong

More information

1 Hashing. 1.1 Perfect Hashing

1 Hashing. 1.1 Perfect Hashing 1 Hashing Hashing is covered by undergraduate courses like Algo I. However, there is much more to say on this topic. Here, we focus on two selected topics: perfect hashing and cockoo hashing. In general,

More information

Distributed Oblivious RAM for Secure Two-Party Computation

Distributed Oblivious RAM for Secure Two-Party Computation Seminar in Distributed Computing Distributed Oblivious RAM for Secure Two-Party Computation Steve Lu & Rafail Ostrovsky Philipp Gamper Philipp Gamper 2017-04-25 1 Yao s millionaires problem Two millionaires

More information

Rainbow Tables ENEE 457/CMSC 498E

Rainbow Tables ENEE 457/CMSC 498E Rainbow Tables ENEE 457/CMSC 498E How are Passwords Stored? Option 1: Store all passwords in a table in the clear. Problem: If Server is compromised then all passwords are leaked. Option 2: Store only

More information

A Lecture on Hashing. Aram-Alexandre Pooladian, Alexander Iannantuono March 22, Hashing. Direct Addressing. Operations - Simple

A Lecture on Hashing. Aram-Alexandre Pooladian, Alexander Iannantuono March 22, Hashing. Direct Addressing. Operations - Simple A Lecture on Hashing Aram-Alexandre Pooladian, Alexander Iannantuono March 22, 217 This is the scribing of a lecture given by Luc Devroye on the 17th of March 217 for Honours Algorithms and Data Structures

More information

Notes for Lecture A can repeat step 3 as many times as it wishes. We will charge A one unit of time for every time it repeats step 3.

Notes for Lecture A can repeat step 3 as many times as it wishes. We will charge A one unit of time for every time it repeats step 3. COS 533: Advanced Cryptography Lecture 2 (September 18, 2017) Lecturer: Mark Zhandry Princeton University Scribe: Mark Zhandry Notes for Lecture 2 1 Last Time Last time, we defined formally what an encryption

More information

Advanced Implementations of Tables: Balanced Search Trees and Hashing

Advanced Implementations of Tables: Balanced Search Trees and Hashing Advanced Implementations of Tables: Balanced Search Trees and Hashing Balanced Search Trees Binary search tree operations such as insert, delete, retrieve, etc. depend on the length of the path to the

More information

1 Difference between grad and undergrad algorithms

1 Difference between grad and undergrad algorithms princeton univ. F 4 cos 52: Advanced Algorithm Design Lecture : Course Intro and Hashing Lecturer: Sanjeev Arora Scribe:Sanjeev Algorithms are integral to computer science and every computer scientist

More information

12 Hash Tables Introduction Chaining. Lecture 12: Hash Tables [Fa 10]

12 Hash Tables Introduction Chaining. Lecture 12: Hash Tables [Fa 10] Calvin: There! I finished our secret code! Hobbes: Let s see. Calvin: I assigned each letter a totally random number, so the code will be hard to crack. For letter A, you write 3,004,572,688. B is 28,731,569½.

More information

Lecture 04: Balls and Bins: Birthday Paradox. Birthday Paradox

Lecture 04: Balls and Bins: Birthday Paradox. Birthday Paradox Lecture 04: Balls and Bins: Overview In today s lecture we will start our study of balls-and-bins problems We shall consider a fundamental problem known as the Recall: Inequalities I Lemma Before we begin,

More information

Expectation of geometric distribution

Expectation of geometric distribution Expectation of geometric distribution What is the probability that X is finite? Can now compute E(X): Σ k=1f X (k) = Σ k=1(1 p) k 1 p = pσ j=0(1 p) j = p 1 1 (1 p) = 1 E(X) = Σ k=1k (1 p) k 1 p = p [ Σ

More information

UNIFORM HASHING IN CONSTANT TIME AND OPTIMAL SPACE

UNIFORM HASHING IN CONSTANT TIME AND OPTIMAL SPACE UNIFORM HASHING IN CONSTANT TIME AND OPTIMAL SPACE ANNA PAGH AND RASMUS PAGH Abstract. Many algorithms and data structures employing hashing have been analyzed under the uniform hashing assumption, i.e.,

More information

Power of d Choices with Simple Tabulation

Power of d Choices with Simple Tabulation Power of d Choices with Simple Tabulation Anders Aamand, Mathias Bæk Tejs Knudsen, and Mikkel Thorup Abstract arxiv:1804.09684v1 [cs.ds] 25 Apr 2018 Suppose that we are to place m balls into n bins sequentially

More information

? 11.5 Perfect hashing. Exercises

? 11.5 Perfect hashing. Exercises 11.5 Perfect hashing 77 Exercises 11.4-1 Consider inserting the keys 10; ; 31; 4; 15; 8; 17; 88; 59 into a hash table of length m 11 using open addressing with the auxiliary hash function h 0.k/ k. Illustrate

More information

arxiv: v3 [cs.ds] 11 May 2017

arxiv: v3 [cs.ds] 11 May 2017 Lecture Notes on Linear Probing with 5-Independent Hashing arxiv:1509.04549v3 [cs.ds] 11 May 2017 Mikkel Thorup May 12, 2017 Abstract These lecture notes show that linear probing takes expected constant

More information

Randomized Algorithms. Lecture 4. Lecturer: Moni Naor Scribe by: Tamar Zondiner & Omer Tamuz Updated: November 25, 2010

Randomized Algorithms. Lecture 4. Lecturer: Moni Naor Scribe by: Tamar Zondiner & Omer Tamuz Updated: November 25, 2010 Randomized Algorithms Lecture 4 Lecturer: Moni Naor Scribe by: Tamar Zondiner & Omer Tamuz Updated: November 25, 2010 1 Pairwise independent hash functions In the previous lecture we encountered two families

More information

Searching. Constant time access. Hash function. Use an array? Better hash function? Hash function 4/18/2013. Chapter 9

Searching. Constant time access. Hash function. Use an array? Better hash function? Hash function 4/18/2013. Chapter 9 Constant time access Searching Chapter 9 Linear search Θ(n) OK Binary search Θ(log n) Better Can we achieve Θ(1) search time? CPTR 318 1 2 Use an array? Use random access on a key such as a string? Hash

More information

Data Structure. Mohsen Arab. January 13, Yazd University. Mohsen Arab (Yazd University ) Data Structure January 13, / 86

Data Structure. Mohsen Arab. January 13, Yazd University. Mohsen Arab (Yazd University ) Data Structure January 13, / 86 Data Structure Mohsen Arab Yazd University January 13, 2015 Mohsen Arab (Yazd University ) Data Structure January 13, 2015 1 / 86 Table of Content Binary Search Tree Treaps Skip Lists Hash Tables Mohsen

More information

Why Simple Hash Functions Work: Exploiting the Entropy in a Data Stream

Why Simple Hash Functions Work: Exploiting the Entropy in a Data Stream Why Simple Hash Functions Work: Exploiting the Entropy in a Data Stream ichael itzenmacher Salil Vadhan School of Engineering & Applied Sciences Harvard University Cambridge, A 02138 {michaelm,salil}@eecs.harvard.edu

More information

HOMEWORK #2 - MATH 3260

HOMEWORK #2 - MATH 3260 HOMEWORK # - MATH 36 ASSIGNED: JANUARAY 3, 3 DUE: FEBRUARY 1, AT :3PM 1) a) Give by listing the sequence of vertices 4 Hamiltonian cycles in K 9 no two of which have an edge in common. Solution: Here is

More information

Ad Placement Strategies

Ad Placement Strategies Case Study 1: Estimating Click Probabilities Tackling an Unknown Number of Features with Sketching Machine Learning for Big Data CSE547/STAT548, University of Washington Emily Fox 2014 Emily Fox January

More information

Motivation. Dictionaries. Direct Addressing. CSE 680 Prof. Roger Crawfis

Motivation. Dictionaries. Direct Addressing. CSE 680 Prof. Roger Crawfis Motivation Introduction to Algorithms Hash Tables CSE 680 Prof. Roger Crawfis Arrays provide an indirect way to access a set. Many times we need an association between two sets, or a set of keys and associated

More information

Data Structures and Algorithm. Xiaoqing Zheng

Data Structures and Algorithm. Xiaoqing Zheng Data Structures and Algorithm Xiaoqing Zheng zhengxq@fudan.edu.cn Dictionary problem Dictionary T holding n records: x records key[x] Other fields containing satellite data Operations on T: INSERT(T, x)

More information

Expectation of geometric distribution. Variance and Standard Deviation. Variance: Examples

Expectation of geometric distribution. Variance and Standard Deviation. Variance: Examples Expectation of geometric distribution Variance and Standard Deviation What is the probability that X is finite? Can now compute E(X): Σ k=f X (k) = Σ k=( p) k p = pσ j=0( p) j = p ( p) = E(X) = Σ k=k (

More information

Lecture and notes by: Alessio Guerrieri and Wei Jin Bloom filters and Hashing

Lecture and notes by: Alessio Guerrieri and Wei Jin Bloom filters and Hashing Bloom filters and Hashing 1 Introduction The Bloom filter, conceived by Burton H. Bloom in 1970, is a space-efficient probabilistic data structure that is used to test whether an element is a member of

More information

Some notes on streaming algorithms continued

Some notes on streaming algorithms continued U.C. Berkeley CS170: Algorithms Handout LN-11-9 Christos Papadimitriou & Luca Trevisan November 9, 016 Some notes on streaming algorithms continued Today we complete our quick review of streaming algorithms.

More information

CSE 502 Class 11 Part 2

CSE 502 Class 11 Part 2 CSE 502 Class 11 Part 2 Jeremy Buhler Steve Cole February 17 2015 Today: analysis of hashing 1 Constraints of Double Hashing How does using OA w/double hashing constrain our hash function design? Need

More information

2. This exam consists of 15 questions. The rst nine questions are multiple choice Q10 requires two

2. This exam consists of 15 questions. The rst nine questions are multiple choice Q10 requires two CS{74 Combinatorics & Discrete Probability, Fall 96 Final Examination 2:30{3:30pm, 7 December Read these instructions carefully. This is a closed book exam. Calculators are permitted. 2. This exam consists

More information

CSCB63 Winter Week10 - Lecture 2 - Hashing. Anna Bretscher. March 21, / 30

CSCB63 Winter Week10 - Lecture 2 - Hashing. Anna Bretscher. March 21, / 30 CSCB63 Winter 2019 Week10 - Lecture 2 - Hashing Anna Bretscher March 21, 2019 1 / 30 Today Hashing Open Addressing Hash functions Universal Hashing 2 / 30 Open Addressing Open Addressing. Each entry in

More information

On Risks of Using a High Performance Hashing Scheme With Common Universal Classes

On Risks of Using a High Performance Hashing Scheme With Common Universal Classes On Risks of Using a High Performance Hashing Scheme With Common Universal Classes Dissertation zur Erlangung des akademischen Grades Doctor rerum naturalium (Dr. rer. nat.) vorgelegt der Fakultät für Informatik

More information

6.842 Randomness and Computation Lecture 5

6.842 Randomness and Computation Lecture 5 6.842 Randomness and Computation 2012-02-22 Lecture 5 Lecturer: Ronitt Rubinfeld Scribe: Michael Forbes 1 Overview Today we will define the notion of a pairwise independent hash function, and discuss its

More information

Lecture 6 September 13, 2016

Lecture 6 September 13, 2016 CS 395T: Sublinear Algorithms Fall 206 Prof. Eric Price Lecture 6 September 3, 206 Scribe: Shanshan Wu, Yitao Chen Overview Recap of last lecture. We talked about Johnson-Lindenstrauss (JL) lemma [JL84]

More information

Problem 1: (Chernoff Bounds via Negative Dependence - from MU Ex 5.15)

Problem 1: (Chernoff Bounds via Negative Dependence - from MU Ex 5.15) Problem 1: Chernoff Bounds via Negative Dependence - from MU Ex 5.15) While deriving lower bounds on the load of the maximum loaded bin when n balls are thrown in n bins, we saw the use of negative dependence.

More information

Ma/CS 6b Class 15: The Probabilistic Method 2

Ma/CS 6b Class 15: The Probabilistic Method 2 Ma/CS 6b Class 15: The Probabilistic Method 2 By Adam Sheffer Reminder: Random Variables A random variable is a function from the set of possible events to R. Example. Say that we flip five coins. We can

More information

Dictionary: an abstract data type

Dictionary: an abstract data type 2-3 Trees 1 Dictionary: an abstract data type A container that maps keys to values Dictionary operations Insert Search Delete Several possible implementations Balanced search trees Hash tables 2 2-3 trees

More information

CS261: A Second Course in Algorithms Lecture #18: Five Essential Tools for the Analysis of Randomized Algorithms

CS261: A Second Course in Algorithms Lecture #18: Five Essential Tools for the Analysis of Randomized Algorithms CS261: A Second Course in Algorithms Lecture #18: Five Essential Tools for the Analysis of Randomized Algorithms Tim Roughgarden March 3, 2016 1 Preamble In CS109 and CS161, you learned some tricks of

More information

Insert Sorted List Insert as the Last element (the First element?) Delete Chaining. 2 Slide courtesy of Dr. Sang-Eon Park

Insert Sorted List Insert as the Last element (the First element?) Delete Chaining. 2 Slide courtesy of Dr. Sang-Eon Park 1617 Preview Data Structure Review COSC COSC Data Structure Review Linked Lists Stacks Queues Linked Lists Singly Linked List Doubly Linked List Typical Functions s Hash Functions Collision Resolution

More information

Charles University in Prague Faculty of Mathematics and Physics MASTER THESIS. Martin Babka. Properties of Universal Hashing

Charles University in Prague Faculty of Mathematics and Physics MASTER THESIS. Martin Babka. Properties of Universal Hashing Charles University in Prague Faculty of Mathematics and Physics MASTER THESIS Martin Babka Properties of Universal Hashing Department of Theoretical Computer Science and Mathematical Logic Supervisor:

More information

Lecture 14: Cryptographic Hash Functions

Lecture 14: Cryptographic Hash Functions CSE 599b: Cryptography (Winter 2006) Lecture 14: Cryptographic Hash Functions 17 February 2006 Lecturer: Paul Beame Scribe: Paul Beame 1 Hash Function Properties A hash function family H = {H K } K K is

More information

CS369N: Beyond Worst-Case Analysis Lecture #6: Pseudorandom Data and Universal Hashing

CS369N: Beyond Worst-Case Analysis Lecture #6: Pseudorandom Data and Universal Hashing CS369N: Beyond Worst-Case Analysis Lecture #6: Pseudorandom Data and Universal Hashing Tim Roughgarden April 4, 204 otivation: Linear Probing and Universal Hashing This lecture discusses a very neat paper

More information

Lecture 23: Alternation vs. Counting

Lecture 23: Alternation vs. Counting CS 710: Complexity Theory 4/13/010 Lecture 3: Alternation vs. Counting Instructor: Dieter van Melkebeek Scribe: Jeff Kinne & Mushfeq Khan We introduced counting complexity classes in the previous lecture

More information

CS 125 Section #12 (More) Probability and Randomized Algorithms 11/24/14. For random numbers X which only take on nonnegative integer values, E(X) =

CS 125 Section #12 (More) Probability and Randomized Algorithms 11/24/14. For random numbers X which only take on nonnegative integer values, E(X) = CS 125 Section #12 (More) Probability and Randomized Algorithms 11/24/14 1 Probability First, recall a couple useful facts from last time about probability: Linearity of expectation: E(aX + by ) = ae(x)

More information

CPSC 467: Cryptography and Computer Security

CPSC 467: Cryptography and Computer Security CPSC 467: Cryptography and Computer Security Michael J. Fischer Lecture 15 October 20, 2014 CPSC 467, Lecture 15 1/37 Common Hash Functions SHA-2 MD5 Birthday Attack on Hash Functions Constructing New

More information

GROUPS AS GRAPHS. W. B. Vasantha Kandasamy Florentin Smarandache

GROUPS AS GRAPHS. W. B. Vasantha Kandasamy Florentin Smarandache GROUPS AS GRAPHS W. B. Vasantha Kandasamy Florentin Smarandache 009 GROUPS AS GRAPHS W. B. Vasantha Kandasamy e-mail: vasanthakandasamy@gmail.com web: http://mat.iitm.ac.in/~wbv www.vasantha.in Florentin

More information

FIRST ORDER SENTENCES ON G(n, p), ZERO-ONE LAWS, ALMOST SURE AND COMPLETE THEORIES ON SPARSE RANDOM GRAPHS

FIRST ORDER SENTENCES ON G(n, p), ZERO-ONE LAWS, ALMOST SURE AND COMPLETE THEORIES ON SPARSE RANDOM GRAPHS FIRST ORDER SENTENCES ON G(n, p), ZERO-ONE LAWS, ALMOST SURE AND COMPLETE THEORIES ON SPARSE RANDOM GRAPHS MOUMANTI PODDER 1. First order theory on G(n, p) We start with a very simple property of G(n,

More information

IEOR E4703: Monte-Carlo Simulation

IEOR E4703: Monte-Carlo Simulation IEOR E4703: Monte-Carlo Simulation Output Analysis for Monte-Carlo Martin Haugh Department of Industrial Engineering and Operations Research Columbia University Email: martin.b.haugh@gmail.com Output Analysis

More information

Binary Decision Diagrams

Binary Decision Diagrams Binary Decision Diagrams Sungho Kang Yonsei University Outline Representing Logic Function Design Considerations for a BDD package Algorithms 2 Why BDDs BDDs are Canonical (each Boolean function has its

More information

1 Estimating Frequency Moments in Streams

1 Estimating Frequency Moments in Streams CS 598CSC: Algorithms for Big Data Lecture date: August 28, 2014 Instructor: Chandra Chekuri Scribe: Chandra Chekuri 1 Estimating Frequency Moments in Streams A significant fraction of streaming literature

More information

Functions of Several Variables: Chain Rules

Functions of Several Variables: Chain Rules Functions of Several Variables: Chain Rules Calculus III Josh Engwer TTU 29 September 2014 Josh Engwer (TTU) Functions of Several Variables: Chain Rules 29 September 2014 1 / 30 PART I PART I: MULTIVARIABLE

More information

CPSC 467: Cryptography and Computer Security

CPSC 467: Cryptography and Computer Security CPSC 467: Cryptography and Computer Security Michael J. Fischer Lecture 16 October 30, 2017 CPSC 467, Lecture 16 1/52 Properties of Hash Functions Hash functions do not always look random Relations among

More information

Advanced Data Structures

Advanced Data Structures Simon Gog gog@kit.edu - 0 Simon Gog: KIT University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association www.kit.edu Dynamic Perfect Hashing What we want: O(1) lookup

More information

Basics of hashing: k-independence and applications

Basics of hashing: k-independence and applications Basics of hashing: k-independence and applications Rasmus Pagh Supported by: 1 Agenda Load balancing using hashing! - Analysis using bounded independence! Implementation of small independence! Case studies:!

More information

Abstract Data Type (ADT) maintains a set of items, each with a key, subject to

Abstract Data Type (ADT) maintains a set of items, each with a key, subject to Lecture Overview Dictionaries and Python Motivation Hash functions Chaining Simple uniform hashing Good hash functions Readings CLRS Chapter,, 3 Dictionary Problem Abstract Data Type (ADT) maintains a

More information

INTRODUCTION TO HASHING Dr. Thomas Hicks Trinity University. Data Set - SSN's from UTSA Class

INTRODUCTION TO HASHING Dr. Thomas Hicks Trinity University. Data Set - SSN's from UTSA Class Dr. Thomas E. Hicks Data Abstractions Homework - Hashing -1 - INTRODUCTION TO HASHING Dr. Thomas Hicks Trinity University Data Set - SSN's from UTSA Class 467 13 3881 498 66 2055 450 27 3804 456 49 5261

More information

Advanced Data Structures

Advanced Data Structures Simon Gog gog@kit.edu - Simon Gog: KIT The Research University in the Helmholtz Association www.kit.edu Predecessor data structures We want to support the following operations on a set of integers from

More information

arxiv: v1 [cs.cc] 14 Sep 2013

arxiv: v1 [cs.cc] 14 Sep 2013 arxiv:1309.3690v1 [cs.cc] 14 Sep 2013 Element Distinctness, Frequency Moments, and Sliding Windows Paul Beame Computer Science and Engineering University of Washington Seattle, WA 98195-2350 beame@cs.washington.edu

More information

Hashing Data Structures. Ananda Gunawardena

Hashing Data Structures. Ananda Gunawardena Hashing 15-121 Data Structures Ananda Gunawardena Hashing Why do we need hashing? Many applications deal with lots of data Search engines and web pages There are myriad look ups. The look ups are time

More information

Hash Tables. Direct-Address Tables Hash Functions Universal Hashing Chaining Open Addressing. CS 5633 Analysis of Algorithms Chapter 11: Slide 1

Hash Tables. Direct-Address Tables Hash Functions Universal Hashing Chaining Open Addressing. CS 5633 Analysis of Algorithms Chapter 11: Slide 1 Hash Tables Direct-Address Tables Hash Functions Universal Hashing Chaining Open Addressing CS 5633 Analysis of Algorithms Chapter 11: Slide 1 Direct-Address Tables 2 2 Let U = {0,...,m 1}, the set of

More information

Problem. Maintain the above set so as to support certain

Problem. Maintain the above set so as to support certain Randomized Data Structures Roussos Ioannis Fundamental Data-structuring Problem Collection {S 1,S 2, } of sets of items. Each item i is an arbitrary record, indexed by a key k(i). Each k(i) is drawn from

More information

CS 598CSC: Algorithms for Big Data Lecture date: Sept 11, 2014

CS 598CSC: Algorithms for Big Data Lecture date: Sept 11, 2014 CS 598CSC: Algorithms for Big Data Lecture date: Sept 11, 2014 Instructor: Chandra Cheuri Scribe: Chandra Cheuri The Misra-Greis deterministic counting guarantees that all items with frequency > F 1 /

More information