CS 224: Advanced Algorithms (Fall 2014)
Lecture 3, Tuesday, September 9, 2014
Prof. Jelani Nelson
Scribe: Thibaut Horel

1 Overview

In the previous lecture we finished covering data structures for the predecessor problem by studying Fusion Trees. In this lecture, we cover several notions related to hashing: using hashing for load balancing, k-wise independent families of hash functions, and the dictionary problem, for which we show two solutions: chaining with linked lists and linear probing.

2 Hashing

In hashing, we have a family H of hash functions mapping $[u]$ to $[m]$ (where $[u]$ denotes the set $\{0, \dots, u-1\}$). For a set $S \subseteq [u]$ with $|S| = n$, we want a function $h$ picked at random in H to behave nicely on $S$.

Example (load balancing). We have $n$ jobs that need to be assigned to $m$ machines, and each job has a job ID in $[u]$. We want to spread the jobs evenly over the machines. We pick $h$ at random from a family of hash functions H; when a new job $j \in [u]$ comes in, we assign it to machine $h(j) \in [m]$. This is usually referred to as the balls and bins random process. We would like to say that the probability that some machine receives significantly more than $n/m$ jobs (the average load) is small.

First, let us assume that the random variables $h(i)$ for $i \in [u]$ are independent. We will need the following concentration inequality.

Theorem 1 (Chernoff bound). Let $X_1, \dots, X_n$ be $n$ independent random variables with $X_i \in \{0, 1\}$. Let us write $X = \sum_{i=1}^{n} X_i$; then for any $\delta > 0$:
$$P\big(X \ge (1+\delta)\, E[X]\big) \le \left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^{E[X]}.$$

The Chernoff bound can be proved using Markov's inequality.

Lemma 2 (Markov's inequality). Let $X$ be a nonnegative random variable. Then:
$$\forall \lambda > 0, \quad P(X \ge \lambda) \le \frac{E[X]}{\lambda}.$$

Proof (Chernoff bound). For any $t > 0$, by Markov's inequality we have:
$$P(X \ge \lambda) = P\big(e^{tX} \ge e^{t\lambda}\big) \le \frac{E\big[e^{tX}\big]}{e^{t\lambda}}.$$

By independence (writing $p_i = P(X_i = 1)$):
$$E\big[e^{tX}\big] = \prod_{i=1}^{n} E\big[e^{tX_i}\big] = \prod_{i=1}^{n} \big(p_i e^{t} + (1 - p_i)\big) \le \prod_{i=1}^{n} e^{p_i (e^{t} - 1)} = e^{(e^{t} - 1)\, E[X]},$$
where the inequality uses $1 + x \le e^{x}$. We then conclude the proof by choosing $\lambda = (1+\delta)\, E[X]$ and $t = \log(1+\delta)$.

Using the Chernoff bound, we can prove the following proposition.

Proposition 3. For the balls and bins random process with $m = n$, for any $c > 1$:
$$P\left(\exists \text{ a machine with more than } \frac{c \log n}{\log \log n} \text{ jobs}\right) \le \frac{1}{\mathrm{poly}(n)}.$$

Proof. Focus on some machine $t$ and define:
$$X_i = \begin{cases} 1 & \text{if } h(i) = t \\ 0 & \text{otherwise.} \end{cases}$$
Then $X = \sum_{i=1}^{n} X_i$ is the number of jobs assigned to machine $t$, and $E[X] = 1$. Applying the Chernoff bound:
$$P\left(X \ge \frac{c \log n}{\log \log n}\right) \le \left(\frac{e}{k}\right)^{k}, \quad \text{where } k = \frac{c \log n}{\log \log n}.$$
It is easy to see that for any $\varepsilon > 0$ and for $n$ large enough, $\left(\frac{e}{k}\right)^{k} \le \frac{1}{n^{c - \varepsilon}}$. This implies, by using a union bound:
$$P(\exists \text{ an overloaded machine}) \le \sum_{t=1}^{n} P(t \text{ is overloaded}) \le \frac{n}{n^{c - \varepsilon}}.$$
Choosing $\varepsilon < c - 1$ is enough to conclude the proof.

We propose an alternative proof which does not rely on the Chernoff bound.

Proof. Using union bounds, $P(\exists \text{ an overloaded machine}) \le n \cdot P(\text{machine } 1 \text{ is overloaded})$, and
$$P(\text{machine } 1 \text{ is overloaded}) = P(\exists\, k \text{ items mapping to machine } 1) \le \sum_{\substack{T \subseteq S \\ |T| = k}} P(\text{items in } T \text{ all map to } 1) = \binom{n}{k} \frac{1}{n^{k}} \le \frac{n^{k}}{k!} \cdot \frac{1}{n^{k}} = \frac{1}{k!} \le \frac{1}{(k/2)^{k/2}},$$
where $S \subseteq [u]$ denotes the set of the $n$ job IDs and $k = \frac{c \log n}{\log \log n}$. We conclude as in the previous proof.
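To get a feel for the max-load bound of Proposition 3, here is a small simulation sketch (our own illustration, not part of the original notes; the function name max_load is ours) that throws n balls into n bins fully at random and compares the observed maximum load to the log n / log log n scale:

```python
import math
import random
from collections import Counter

def max_load(n, trials=20):
    """Throw n balls into n bins uniformly at random; return the largest bin load seen."""
    worst = 0
    for _ in range(trials):
        loads = Counter(random.randrange(n) for _ in range(n))
        worst = max(worst, max(loads.values()))
    return worst

n = 100_000
scale = math.log(n) / math.log(math.log(n))   # the Theta(log n / log log n) scale from Proposition 3
print(f"max load over trials: {max_load(n)}, log n / log log n = {scale:.1f}")
```

With fully independent assignments the maximum load concentrates around this scale; the next paragraphs show that much weaker (k-wise) independence already suffices for the union-bound argument.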

Note that the second proof does not require full independence, but only that sets of $k$ elements are mapped to independent random variables. This motivates the following definitions.

Definition 4. The random variables $X_1, \dots, X_n$ are k-wise independent iff for any set of indices $1 \le i_1 < \dots < i_k \le n$, the random variables $X_{i_1}, \dots, X_{i_k}$ are independent.

Definition 5. A set of hash functions H is a k-wise independent family iff the random variables $h(0), h(1), \dots, h(u-1)$ are k-wise independent when $h \in H$ is drawn uniformly at random.

Example. The set H of all functions from $[u]$ to $[m]$ is k-wise independent for all $k$. Note that this family is impractical for many applications, since you need $O(u \log m)$ bits to store a totally random function. For a hash table, this is already larger than the space of the data structure itself, $O(n)$.

Example ($u = m = p$, $p$ prime). We define $H = \big\{ h : h(x) = \sum_{i=0}^{k-1} a_i x^i \bmod p \big\}$ where $a_i \in \{0, \dots, p-1\}$. By using polynomial interpolation, we know that for any $(x_1, \dots, x_k, y_1, \dots, y_k) \in [p]^{2k}$ with the $x_i$ distinct, there exists a unique polynomial $h \in H$ such that $h(x_1) = y_1, \dots, h(x_k) = y_k$. Since there are $p^k$ elements in H, if $h \in H$ is drawn uniformly at random, $P(h(x_1) = y_1, \dots, h(x_k) = y_k) = 1/p^k$, which shows that H is k-wise independent. It is also easy to see that this family is not (k+1)-wise independent: again using polynomial interpolation, once $h(x_1), \dots, h(x_k)$ are fixed, $h(x_{k+1})$ is entirely determined.

Example ($u = p$, $p$ prime). We define $H = \big\{ h : h(x) = \big(\sum_{i=0}^{k-1} a_i x^i \bmod p\big) \bmod m \big\}$. We see that $P(h(x_1) = y_1, \dots, h(x_k) = y_k) \le \left(\frac{2}{m}\right)^{k}$. Note that in this example, $h(x_1), \dots, h(x_k)$ do not behave exactly like $k$ independent uniformly distributed variables, but this bound is sufficient in most applications. (In practice, we do not lose by requiring $u$ to be prime: by Bertrand's postulate, there always exists a prime between $u$ and $2u$.)

3 Dictionary problem

This is a data structural problem with two operations:
- update(k,v): associates key k with value v.
- query(k): returns the value v associated with key k, or null if k is not in the dictionary.
The keys and the values are assumed to live in the same universe $[u]$.

3.1 Chaining with linked lists

The dynamic dictionary problem can be solved by hash tables with chaining. In this solution, the hash table is an array A of size m. Each element of the array is a pointer to a linked list containing (key, value) pairs. We pick $h \in H$ from a 2-wise independent family. The operations on key k are performed in the linked list pointed to by A[h(k)].
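To make the polynomial examples above concrete, here is a minimal Python sketch (the helper name draw_hash and the choice of prime are ours, not from the notes) that draws $h(x) = (\sum_{i<k} a_i x^i \bmod p) \bmod m$ uniformly at random, i.e., a member of the k-wise independent family described above:

```python
import random

P = 2**31 - 1  # a Mersenne prime playing the role of p >= u

def draw_hash(k, m, p=P):
    """Draw h uniformly from the family H = { x -> (sum_{i<k} a_i x^i mod p) mod m }."""
    coeffs = [random.randrange(p) for _ in range(k)]   # a_0, ..., a_{k-1}

    def h(x):
        # Horner evaluation of the degree-(k-1) polynomial mod p, then reduction mod m.
        acc = 0
        for a in reversed(coeffs):
            acc = (acc * x + a) % p
        return acc % m

    return h

# A 2-wise independent hash function into m = 1024 buckets.
h = draw_hash(k=2, m=1024)
print(h(42), h(123456789))
```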

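Continuing the sketch, a chained hash table as described above might look as follows (class and method names are ours; a Python list per slot stands in for the linked list, and draw_hash is the helper from the previous snippet):

```python
class ChainedDict:
    """Dictionary with chaining: an array A of m buckets, each a list of (key, value) pairs.
    The hash function is drawn from a 2-wise independent family, as in the notes."""

    def __init__(self, m):
        self.m = m
        self.h = draw_hash(k=2, m=m)      # 2-wise independent family from the sketch above
        self.A = [[] for _ in range(m)]

    def update(self, k, v):
        bucket = self.A[self.h(k)]
        for i, (key, _) in enumerate(bucket):
            if key == k:
                bucket[i] = (k, v)        # key already present: overwrite its value
                return
        bucket.append((k, v))

    def query(self, k):
        for key, v in self.A[self.h(k)]:
            if key == k:
                return v
        return None                        # k is not in the dictionary

d = ChainedDict(m=8)
d.update(7, "seven")
print(d.query(7), d.query(3))              # -> seven None
```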
The running time of an operation on key k is linear in the length of the linked list at $h(k)$. It is possible to show that $E_h[\text{length of the list at } h(k)] = O(1)$ if $m \ge n$. For details of the analysis, see the lecture notes from the Spring 2014 offering of CS 124, or see [CLRS, Section 11.2].

3.2 Perfect hashing

In this problem there is some set $S \subseteq [u]$ where $|S| = n$, and we would like to find a hash function $h : [u] \to [m]$ (hashing into the m cells of some array) which acts injectively on S. This was mentioned in class, but no details were given; for the interested reader, see the lecture notes on perfect hashing from the Spring 2014 offering of CS 124, or read [CLRS, Section 11.5], or see the original paper [FKS84].

3.3 Linear probing

In real architectures, sequential memory accesses are much faster. A solution to the dictionary problem with better spatial locality is linear probing. In this solution, the hash table is an array A, but unlike the solution with chaining and linked lists, the values are stored directly in A. More specifically, the value associated with key k is stored at A[h(k)]. If this cell is already taken, we walk to the right, one cell at a time (wrapping back to the beginning of the array when we reach its right edge), until we find an empty cell and insert the value there.

History. Linear probing dates back at least to 1954, when it was used in an assembly program written by Samuel, Amdahl and Boehme. It was first analyzed in a note published by Knuth [Knu63] in the case where h is totally random (H is the set of all functions). In this note, he showed that when $m = (1+\varepsilon) n$, then $E[\text{time per operation}] = O(1/\varepsilon^2)$. There was a breakthrough in a paper by Pagh, Pagh and Ruzic [PPR09], where it was proved that for any 5-wise independent family, $E[\text{time per operation}] = O(1)$. Finally, Patrascu and Thorup [PT10] showed that 4-wise independent families do not guarantee a constant running time per operation (that is, they provided an example of a 4-wise independent family for which the expected running time is not constant).
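Before turning to the analysis, here is a minimal sketch of linear probing insert and query (names are ours; table resizing and deletions are omitted, and any hash function can be plugged in, although the analysis below assumes one drawn from a 7-wise independent family):

```python
class LinearProbingDict:
    """Open addressing with linear probing: (key, value) pairs stored directly in the array A."""

    def __init__(self, m, h):
        self.m = m
        self.h = h                          # e.g. draw_hash(k=7, m=m) from the earlier sketch
        self.A = [None] * m                 # each cell holds (key, value) or None (empty)

    def update(self, k, v):
        i = self.h(k)
        while self.A[i] is not None and self.A[i][0] != k:
            i = (i + 1) % self.m            # walk right, wrapping at the end of the array
        self.A[i] = (k, v)

    def query(self, k):
        i = self.h(k)
        while self.A[i] is not None:
            if self.A[i][0] == k:
                return self.A[i][1]
            i = (i + 1) % self.m
        return None

d = LinearProbingDict(m=16, h=lambda x: x % 16)   # toy hash function, just for the demo
d.update(3, "a"); d.update(19, "b")               # 19 collides with 3 and probes one cell right
print(d.query(19))                                # -> b
```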

Here, we will prove the following proposition.

Proposition 6. If $m \ge cn$ and H is 7-wise independent, then $E[\text{query time}] = O(1)$.

To prove this proposition, we need to be able to compute the expected number of cells we have to probe before finding an empty cell. We are going to show that this expected number is constant.

Definition 7. Given an interval I of cells in A (we allow intervals to wrap around at the end of the table), we define $L(I) = \{\, i : h(i) \in I \,\}$.

Definition 8. We say that I is full iff $|L(I)| \ge |I|$.

Lemma 9. If query(x) for some x makes k probes, then there exist at least k full intervals containing $h(x)$.

Proof. Let us define $z_1$ as the position of the first empty cell to the left of $h(x)$ and $z_2$ as the position of the first empty cell to the right of $h(x)$. It is easy to see that each of the intervals $[z_1, h(x)], [z_1, h(x)+1], \dots, [z_1, z_2 - 1]$ is full. There are $z_2 - h(x)$ such intervals. But by the assumption of the lemma, $z_2 = h(x) + k$, which concludes the proof of the lemma.

Proof of Proposition 6. By applying the previous lemma:
$$E[\text{time of query}(x)] \le E[\text{number of full intervals containing } h(x)] \le \sum_{k \ge 1} \; \sum_{\substack{I,\ |I| = k \\ h(x) \in I}} P(I \text{ full}) \le \sum_{k \ge 1} k \cdot \max_{\substack{I,\ |I| = k \\ h(x) \in I}} P(I \text{ full}).$$
We need to show that the max in the previous sum is $O(1/k^3)$ so that the sum converges. We assume that $m = 2n$, where $n$ is the number of different keys active. Then:
$$P(I \text{ full}) = P\big(|L(I)| \ge |I|\big) \le P\big(|L(I)| - E[L(I)] \ge E[L(I)]\big),$$
where the inequality follows from $E[L(I)] = |I|/2$. Now we would like to apply Chernoff, but we do not have independence. Instead, we can raise both sides of the inequality to some power and apply Markov's inequality. This is called the method of moments. We will do this in the next lecture.

References

[CLRS] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein. Introduction to Algorithms, 3rd edition, MIT Press, 2009.

[FKS84] Michael L. Fredman, János Komlós, Endre Szemerédi. Storing a Sparse Table with O(1) Worst Case Access Time. J. ACM, 31(3), pages 538-544, 1984.

[Knu63] Don Knuth. Notes on Open Addressing, 1963.

[PPR09] Anna Pagh, Rasmus Pagh, and Milan Ruzic. Linear probing with constant independence. SIAM J. Comput., 39(3):1107-1120, 2009.

[PT10] Mihai Patrascu and Mikkel Thorup. On the k-independence required by linear probing and minwise independence. ICALP, pages 715-726, 2010.