
1 Data Structure
Mohsen Arab, Yazd University
January 13, 2015
Mohsen Arab (Yazd University), Data Structure, January 13, / 86

2 Table of Contents
Binary Search Trees
Treaps
Skip Lists
Hash Tables

3 Fundamental Data-structuring Problem
The fundamental data-structuring problem: maintain a collection {S_1, S_2, ...} of sets of items so as to efficiently support certain types of queries and operations:
MAKESET(S): create a new (empty) set S.
INSERT(i, S): insert item i into the set S.
DELETE(k, S): delete the item indexed by the key value k from the set S.
FIND(k, S): return the item indexed by the key value k in the set S.
JOIN(S_1, i, S_2): replace the sets S_1 and S_2 by the new set S = S_1 ∪ {i} ∪ S_2, where
  1. for all items j ∈ S_1, k(j) < k(i), and
  2. for all items j ∈ S_2, k(j) > k(i).

4 Fundamental Data-structuring Problem (cont.)
PASTE(S_1, S_2): replace the sets S_1 and S_2 by the new set S = S_1 ∪ S_2, where for all items i ∈ S_1 and j ∈ S_2, k(i) < k(j).
SPLIT(k, S): replace the set S by the new sets
  S_1 = {j ∈ S | k(j) < k}
  S_2 = {j ∈ S | k(j) > k}

5 Binary Search Tree
A binary search tree is a binary tree in which the keys satisfy the search tree property.
Definition (search tree property): for every node with key value k, the left sub-tree contains only key values smaller than k and the right sub-tree contains only key values larger than k.
The key values in a binary tree are said to be in symmetric order if they satisfy the search tree property.
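As a minimal sketch (a hypothetical `Node` class, not from the slides), the search tree property makes FIND a single root-to-leaf walk:

```python
class Node:
    """Endogenous BST node: keys live at internal nodes; None plays the empty leaf."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def find(root, k):
    """Follow the search tree property: go left for smaller keys, right for larger."""
    while root is not None:
        if k == root.key:
            return root
        root = root.left if k < root.key else root.right
    return None  # search fails at an empty leaf

# A tree satisfying the search tree property: 5 with children 2 and 8
t = Node(5, Node(2), Node(8))
```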

6 We will assume BSTs are endogenous.
Definition (endogenous): all key values are stored at internal nodes, and all leaf nodes are empty.
This ensures that the trees are full, meaning that every non-leaf (internal) node has exactly two children.

7 Standard implementations of operations
MAKESET(S): initialize an empty tree for the set S.
JOIN(S_1, k, S_2): create a node containing key k as the root, and make S_1 and S_2 its left and right sub-trees, respectively.

8 Search
Example: FIND(4, S)

9 Insert
Perform FIND(k, S), and insert k where the search fails (into the empty leaf node).


11 Implementation of operations: Delete
DELETE(k, S):
1) If the node v containing k has a leaf as one of its two children: for example, if the right child of v is a leaf, then replace v by its left child L(v) as the child of its parent P(v).


13 Implementation of operations: Delete (cont.)
2) If neither of the children is a leaf, let k' be the key value that is the predecessor of k in the set S. We can delete the node containing k', since its right child is a leaf, and replace the key value k by k' in the node v.

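The two delete cases above can be sketched as follows (a hypothetical recursive helper; None again stands in for an empty leaf):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def delete(root, k):
    """Delete key k from the subtree rooted at root; returns the new subtree root."""
    if root is None:
        return None
    if k < root.key:
        root.left = delete(root.left, k)
    elif k > root.key:
        root.right = delete(root.right, k)
    elif root.right is None:       # case 1: a child is a leaf, so splice v out
        return root.left
    elif root.left is None:
        return root.right
    else:                          # case 2: copy the predecessor k' into v, delete k'
        pred = root.left
        while pred.right is not None:
            pred = pred.right
        root.key = pred.key
        root.left = delete(root.left, pred.key)
    return root
```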

15 Implementation of operations (cont.)
PASTE(S_1, S_2):
  1. delete the largest key value, say k, from S_1;
  2. apply JOIN(S_1, k, S_2).
The key k can be found by doing a FIND(∞, S_1).
SPLIT(k, S): if k is at the root of S, do the reverse of the steps employed in JOIN(S_1, k, S_2); otherwise, use rotations to move k to the root.

16 Problem: each operation can be performed in time proportional to the height of the tree, and there are sequences of INSERT operations that result in trees of height linear in n.
Solution: perform rotations during update operations to keep all leaves at distance O(log n) from the root.

17 Rotations
Each type of rotation moves a node together with one of its sub-trees closer to the root (and moves some others away from the root), while preserving the search tree property.
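A rotation takes only a few pointer updates. This sketch (hypothetical `Node` with `left`/`right` fields) also shows why the symmetric order is preserved: the in-order sequence of keys is unchanged by either rotation.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def rotate_right(v):
    """Promote v's left child u; v adopts u's old right subtree. Returns the new root."""
    u = v.left
    v.left = u.right
    u.right = v
    return u

def rotate_left(v):
    """Mirror image of rotate_right."""
    u = v.right
    v.right = u.left
    u.left = v
    return u

def inorder(n):
    return [] if n is None else inorder(n.left) + [n.key] + inorder(n.right)
```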

18 A different strategy: splaying in self-adjusting search trees
Splaying: the splay operation moves a specified node to the root via a sequence of rotations.
Amortization: partitioning the total cost of a sequence of operations among the individual operations in that sequence. An amortized time bound can thus be viewed as the average cost of the operations in a sequence.

19 The idea behind self-adjusting trees: use a particular implementation of the splay operation to move to the root any node accessed by a FIND operation.
How this benefits us: nodes that are accessed often enough remain close to the root, so the total running time stays small; for an infrequently accessed node, the total running time does not increase very much in any case.
Note: these self-adjusting trees guarantee only amortized logarithmic time per operation.

20 Advantages and drawbacks of self-adjusting trees
Advantages:
They are relatively simple to implement.
They do not require explicit balance information to be stored at the nodes.
Splay trees can be shown to be optimal with respect to arbitrary access frequencies for the items being stored.
Drawbacks:
They restructure the entire tree during updates and even during simple search operations.
During any given operation, splay trees may perform a logarithmic number of rotations.
There is no guarantee that every individual operation will run quickly.

21 Treaps
Treaps are an efficient randomized alternative to balanced trees and self-adjusting trees. They achieve essentially the same time bounds in the expected sense, but with the following advantages:
1. they do not require any explicit balance information;
2. the expected number of rotations performed per operation is small;
3. they are extremely simple to implement.

22 Binary search tree: a (full, endogenous) binary tree whose nodes have key values associated with them is a binary search tree if the key values are in symmetric order.
Heap: if the key values decrease monotonically along every root-leaf path, we call the structure a heap and say that the keys are stored in heap order.
Treap: consider a binary tree in which each node v contains a pair of values: a key k(v) as well as a priority p(v). We call this structure a treap if it is a binary search tree with respect to the key values and, simultaneously, a heap with respect to the priorities.

23 Example of a treap
S = {(k_1, p_1), ..., (k_n, p_n)}
S = {(2, 13), (4, 26), (6, 19), (7, 30), (9, 14), (11, 27), (12, 22)}

24 Theorem 8.1: Let S = {(k_1, p_1), ..., (k_n, p_n)} be any set of key-priority pairs such that the keys and the priorities are distinct. Then there exists a unique treap T(S) for it.
Proof: The theorem is clearly true for n = 0 and n = 1. Suppose now that n ≥ 2, and assume that (k_1, p_1) has the highest priority in S. Then a treap for S can be constructed by putting item 1 at the root of T(S). A treap for the items of S with key value smaller (larger) than k_1 can be constructed recursively, and is stored as the left (right) sub-tree of item 1.

25 Implementation of operations using a treap
MAKESET(S) and FIND(k, S): exactly as before.
INSERT(k, S): do FIND(k, S) and insert k at the empty leaf node where the search terminates with failure. If the heap order is violated, i.e., p(parent(k)) < p(k):
  repeat: decrease k's depth by performing a rotation at the node w = parent(k), so that k becomes the parent of w,
  until k becomes the root or p(parent(k)) > p(k).
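The INSERT procedure above can be sketched as follows (a hypothetical recursive implementation, not from the slides; each priority is drawn once, at insertion, and kept for the element's lifetime):

```python
import random

class TreapNode:
    def __init__(self, key):
        self.key = key
        self.prio = random.random()   # random priority, fixed at insertion
        self.left = self.right = None

def rotate_right(v):
    u = v.left; v.left, u.right = u.right, v; return u

def rotate_left(v):
    u = v.right; v.right, u.left = u.left, v; return u

def insert(root, node):
    """BST insert at a leaf, then rotate upward while the heap order is violated."""
    if root is None:
        return node
    if node.key < root.key:
        root.left = insert(root.left, node)
        if root.left.prio > root.prio:     # heap order violated: child outranks parent
            root = rotate_right(root)
    else:
        root.right = insert(root.right, node)
        if root.right.prio > root.prio:
            root = rotate_left(root)
    return root
```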

26 Implementation of operations using a treap: Add(), example



29 Implementation of operations using a treap: Delete(), example
DELETE(k, S): this operation is exactly the reverse of an insertion: rotate the node containing k downward until both its children are leaves, and then simply discard the node.
Note: the choice of rotation (left or right) at each stage depends on the relative order of the priorities of the children of the node being deleted.

30 Delete(), example




34 JOIN(S_1, k, S_2): as before; the resulting structure is a treap provided the priority of k is higher than that of any item in S_1 or S_2. If the new root (containing k) violates the heap order, simply rotate that node downward until each of its two children has a smaller priority or is a leaf.
PASTE(S_1, S_2): as in a BST.
SPLIT(k, S):
  1. delete k from S;
  2. re-insert it into S with a priority of ∞.

35 The left spine of a tree: the path obtained by starting at the root and repeatedly moving to the left child until a leaf is reached; the right spine is defined similarly.


37 Mulmuley Games
Mulmuley games are useful abstractions of the processes underlying the behavior of certain geometric algorithms. The cast of characters in these games:
P = {P_1, ..., P_p} (the players)
S = {S_1, ..., S_s} (the stoppers)
T = {T_1, ..., T_t} (the triggers)
B = {B_1, ..., B_b} (the bystanders)
The set P ∪ S is drawn from a totally ordered universe, and all players are smaller than all stoppers: for all i and j, P_i < S_j.

38 Exercise 8.5: Let H_k = Σ_{i=1}^k 1/i denote the k-th Harmonic number. Show that
  Σ_{k=1}^n H_k = (n + 1) H_{n+1} − (n + 1).
Recall that H_k = ln k + O(1) (Proposition B.4).

39 Depending upon the set of active characters, we formulate four different games, each more general than the previous one.
Game A: the initial set of characters is X = P ∪ B. The game proceeds by repeatedly sampling from X without replacement, until the set X becomes empty. Let the random variable V be the number of samples in which a player P_i is chosen such that P_i is larger than all previously chosen players. The value of the game is A_p = E[V].

40 Lemma 8.2: For all p ≥ 0, A_p = H_p.
Proof: Assume that the players are ordered P_1 > P_2 > ... > P_p. In Game A the bystanders do not matter, so we can set b = 0. If the first chosen player is P_i, the expected value of the game is 1 + A_{i−1}. Hence
  A_p = Σ_{i=1}^p (1 + A_{i−1})/p = 1 + (1/p) Σ_{i=1}^p A_{i−1}.
Upon rearrangement, using the fact that A_0 = 0,
  Σ_{i=1}^{p−1} A_i = p A_p − p.
By Exercise 8.5, the Harmonic numbers are the solution to this recurrence.
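The recurrence in the proof can be checked against the Harmonic numbers with exact rational arithmetic (a small sketch, not part of the slides):

```python
from fractions import Fraction

def harmonic(n):
    """H_n = sum_{i=1}^n 1/i, computed exactly."""
    return sum(Fraction(1, i) for i in range(1, n + 1))

def game_a_value(p):
    """A_p from the recurrence A_p = (1/p) * sum_{i=1}^p (1 + A_{i-1}), with A_0 = 0."""
    A = [Fraction(0)]
    for n in range(1, p + 1):
        A.append(1 + sum(A[:n]) / n)   # 1 + (1/n) * sum_i A_{i-1}
    return A[p]

# Lemma 8.2: A_p = H_p, exactly
for p in range(1, 20):
    assert game_a_value(p) == harmonic(p)
```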

41 Game C: the initial set of characters is X = P ∪ B ∪ S. The stoppers are treated as players, but the game stops when a stopper is chosen for the first time. The value of the game is C_p^s = E[V + 1] = E[V] + 1.
Note: since all players are smaller than all stoppers, we always get a contribution of 1 to the game value from the first stopper.

42 Lemma 8.3: For all p, s ≥ 0, C_p^s = 1 + H_{s+p} − H_s.
Proof: Assume that the players are ordered P_1 > P_2 > ... > P_p. As in Game A, the bystanders do not matter, so we can set b = 0.
If the first sample is P_i, the probability of this event is 1/(s + p), and the expected game value is 1 + C_{i−1}^s.
If the first sample is a stopper, the probability of this event is s/(s + p), and the game value is 1. ...

43 Proof of Lemma 8.3 (cont.)
  C_p^s = s/(s + p) + (1/(s + p)) Σ_{i=1}^p (1 + C_{i−1}^s).
Upon rearrangement, using the fact that C_0^s = 1, we obtain
  C_p^s = (s + p + 1)/(s + p) + (1/(s + p)) Σ_{i=1}^{p−1} C_i^s,
which is equivalent to
  Σ_{i=1}^{p−1} C_i^s = (s + p) C_p^s − (s + p + 1).
Once again, using Exercise 8.5 it can be verified that the solution to the recurrence is given by C_p^s = 1 + H_{s+p} − H_s.
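The same kind of exact check confirms the closed form of Lemma 8.3 (again a sketch, not from the slides):

```python
from fractions import Fraction

def H(n):
    return sum(Fraction(1, i) for i in range(1, n + 1))

def game_c_value(p, s):
    """C_p^s from C_p^s = s/(s+p) + (1/(s+p)) * sum_{i=1}^p (1 + C_{i-1}^s), C_0^s = 1."""
    C = [Fraction(1)]
    for n in range(1, p + 1):
        C.append(Fraction(s, s + n) + Fraction(1, s + n) * sum(1 + C[i] for i in range(n)))
    return C[p]

# Lemma 8.3: C_p^s = 1 + H_{s+p} - H_s, exactly
for p in range(0, 10):
    for s in range(1, 10):
        assert game_c_value(p, s) == 1 + H(s + p) - H(s)
```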

44 Games D and E: similar to Games A and C, but in Game D, X = P ∪ B ∪ T, and in Game E, X = P ∪ B ∪ S ∪ T. The role of the triggers is that the counting process begins only after the first trigger has been chosen: a player or a stopper contributes to V only if it is sampled after a trigger and before any stopper (and, of course, only if it is larger than all previously chosen players).

45 Lemma 8.4: For all p, t ≥ 0, D_p^t = H_p + H_t − H_{p+t}.
Lemma 8.5: For all p, s, t ≥ 0, E_p^{s,t} = t/(s + t) + (H_{s+p} − H_s) − (H_{s+p+t} − H_{s+t}).

46 Analysis of Treaps
Memoryless property: since the random priorities for the elements of S are chosen independently, we can assume that the priorities are chosen before the insertion process is initiated. Once the priorities have been fixed, Theorem 8.1 implies that the treap T is uniquely determined, so the order in which the elements are inserted does not affect the structure of the tree.
Without loss of generality, we can therefore assume that the elements of S are inserted into T in order of decreasing priority. An advantage of this view is that all insertions take place at the leaves, and no rotations are required to ensure the heap order on the priorities.

47 Lemma 8.6: Let T be a random treap for a set S of size n. For an element x ∈ S having rank k,
  E[depth(x)] = H_k + H_{n−k+1} − 1.
Idea of proof: Let S^− = {y ∈ S | y ≤ x} and S^+ = {y ∈ S | y ≥ x}. Since x has rank k, |S^−| = k and |S^+| = n − k + 1.
Let Q_x ⊆ S denote the set of ancestors of x, and let Q_x^− = S^− ∩ Q_x and Q_x^+ = S^+ ∩ Q_x.
We will establish that E[|Q_x^−|] = H_k; by symmetry, it follows that E[|Q_x^+|] = H_{n−k+1}. Since x belongs to both sets, E[depth(x)] = H_k + H_{n−k+1} − 1.


49 Consider any ancestor y ∈ Q_x^− of the node x. By the memoryless assumption, y must have been inserted prior to x, i.e., p_y > p_x. Since y < x, x lies in the right sub-tree of y. The search for any element z whose value lies between y and x (y < z < x) must follow the path from the root to y, and in fact go into the right sub-tree of y. We conclude that y is an ancestor of every node containing an element of value between y and x. By our assumption, each such z must have been inserted after y, and hence has lower priority than y.

50 Continuation of proof: the preceding argument establishes that an element y ∈ S^− is an ancestor of x (a member of Q_x^−) if and only if it was the largest element of S^− in the treap at the time of its insertion.
The order of insertion is determined by the order of the priorities, and the latter is uniformly distributed; thus, the order of insertion can be viewed as being determined by uniform sampling without replacement from the pool S. We can now claim that the distribution of |Q_x^−| is the same as that of the value of Game A with P = S^− and B = S \ S^−. Since |S^−| = k, the expected size of Q_x^− is H_k.

51 For any element x in a treap, define
L_x: the length of the left spine of the right sub-tree of x;
R_x: the length of the right spine of the left sub-tree of x.
Lemma 8.7: Let T be a random treap for a set S of size n. For an element x ∈ S of rank k,
  E[R_x] = 1 − 1/k,  E[L_x] = 1 − 1/(n − k + 1).

52 Proof: (1) an element z < x lies on the right spine of the left sub-tree of x if and only if (2) z was inserted after x, and all elements y whose values lie between z and x (z < y < x) were inserted after z.

53 Proof ((2) implies (1)): suppose z was inserted after x, and all elements y with z < y < x were inserted after z; then z lies on the right spine of the left sub-tree of x.
a. If x is an ancestor of z: if z did not lie on the right spine of the left sub-tree of x, there would be some ancestor u of z with z < u < x, and such a u would have been inserted before z (contradiction).
b. If x is not an ancestor of z: let w be the lowest common ancestor of z and x. Then z < w < x, and since w is an ancestor of z, it would have been inserted before z (contradiction).

54 Proof ((1) implies (2)): if an element z < x lies on the right spine of the left sub-tree of x, then z was inserted after x, and all elements y with z < y < x were inserted after z.
Since x is an ancestor of z, x was inserted before z. Also, since every element y with z < y < x must be inserted into the right sub-tree of z, each such y was inserted after z.


56 Search in a Skip List
We search for a key x in a skip list as follows:
We start at the first position of the top list.
At the current position p, we compare x with y = key(next(p)):
  x = y: we return element(next(p));
  x > y: we scan forward;
  x < y: we drop down.
Example: search for 78.
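The compare/scan-forward/drop-down loop can be sketched over a hypothetical list-of-lists representation (each level is a sorted list beginning with a −∞ sentinel, and every key in a level also appears in all levels below it):

```python
NEG_INF = float('-inf')

def skip_search(levels, x):
    """Search for x; levels[0] is the top (sparsest) list, levels[-1] the bottom."""
    start = NEG_INF
    for level in levels:
        p = level.index(start)                    # resume where we dropped down
        while p + 1 < len(level) and level[p + 1] <= x:
            p += 1                                # x >= next key: scan forward
        if level[p] == x:
            return True                           # x = y: found
        start = level[p]                          # x < next key: drop down
    return False

# A small skip list (keys 12, 23, 31, 44, 64, 78; 31 reaches the top level)
levels = [
    [NEG_INF, 31],
    [NEG_INF, 23, 31, 64],
    [NEG_INF, 12, 23, 31, 44, 64, 78],
]
```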

57 Tree representation of a skip list


59 Analyzing Random Skip Lists
A random leveling of the set S is defined as follows: given the choice of level L_i, the level L_{i+1} is defined by independently choosing to retain each element x ∈ L_i with probability 1/2. The process starts with L_1 = S and terminates when a newly constructed level is empty.
Alternate view: let the levels l(x) for x ∈ S be independent random variables, each with the geometric distribution with parameter p = 1/2. Let r = max_{x∈S} l(x) + 1, and place x in each of the levels L_1, ..., L_{l(x)}.
As with random treaps, a random level is chosen for every element of S upon its insertion and remains fixed until the element is deleted.
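The alternate view translates directly into code (a sketch; `random_level` flips fair coins to realize the geometric distribution):

```python
import random

def random_level():
    """l(x): geometrically distributed with parameter 1/2."""
    lvl = 1
    while random.random() < 0.5:
        lvl += 1
    return lvl

def random_leveling(S):
    """Return levels L_1, ..., L_r as sorted lists; x appears in L_1 .. L_{l(x)},
    and the top level L_r is the first empty one."""
    l = {x: random_level() for x in S}
    r = max(l.values()) + 1
    return [sorted(x for x in S if l[x] >= j) for j in range(1, r + 1)]
```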

60 Lemma 8.9: The number of levels r in a random leveling of a set S of size n has expected value E[r] = O(log n). Moreover, r = O(log n) with high probability.
Proof: r = max_{x∈S} l(x) + 1, and the levels l(x) are i.i.d. random variables distributed geometrically with parameter p = 1/2. Since
  Pr[max_i X_i > t] ≤ n(1 − p)^t = n 2^{−t},
choosing t = α log n gives, for any α > 1,
  Pr[r > α log n] ≤ 1/n^{α−1}.

61 Lemma 8.10
Define I_j(y) as the interval at level j that contains y. For an interval I at level i + 1, c(I) denotes the number of children it has at level i.
Lemma 8.9 (restated): the number of levels r in a random leveling of a set S of size n has expected value E[r] = O(log n); moreover, r = O(log n) with high probability.

62 Hash Tables
1. Static dictionary: we are given a set of keys S and must organize it into a data structure that supports efficient processing of FIND queries.
2. Dynamic dictionary: the set S is not provided in advance; instead, it is constructed by a series of INSERT and DELETE operations intermingled with the FIND queries.
The data-structuring problem: all data structures discussed earlier require Ω(log n) time to process any search or update operation. These time bounds are optimal for data structures based on pointers and search trees, so we are faced with a logarithmic lower bound. The bounds rest on the fact that the only computation we can perform on keys is to compare them and thereby determine their relationship in the underlying total order.

63 Hash Tables
Suppose the keys in S are chosen from a totally ordered universe M of size m; w.l.o.g. M = {0, ..., m − 1}, and the keys are distinct.
The idea: create an array T[0..m − 1] of size m in which
  T[k] = 1 if k ∈ S,
  T[k] = NULL otherwise.
This is called a direct-address table. Operations take O(1) time. So what's the problem?

64 Direct addressing works well when the range m of keys is relatively small. But what if the keys are 32-bit integers?
Problem 1: the direct-address table will have 2^32 entries, more than 4 billion.
Problem 2: even if memory is not an issue, the time to initialize the elements to NULL may be.
We want to reduce the size of the table to a value close to |S|, while maintaining the property that a search or update can be performed in O(1) time.

65 A table T consisting of n cells indexed by N = {0, ..., n − 1}, and a hash function h, which is a mapping from M into N. We require n < m; otherwise a direct-address table could be used.
A collision occurs when two distinct keys x and y map to the same location, i.e., h(x) = h(y).
Goal: maintain a small table, and use the hash function h to map keys into this table. If h behaves randomly, we shouldn't get too many collisions.

66 Hash Tables: Chaining
Chaining puts elements that collide in a linked list:
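A minimal chaining table as a sketch (the hash function here is a stand-in `x mod n`, not one of the universal families introduced next):

```python
class ChainedHashTable:
    """Hash table with chaining: each cell holds the list of keys mapped to it."""
    def __init__(self, n):
        self.n = n
        self.table = [[] for _ in range(n)]

    def _h(self, x):
        return x % self.n          # placeholder hash function

    def insert(self, x):
        chain = self.table[self._h(x)]
        if x not in chain:
            chain.append(x)

    def delete(self, x):
        chain = self.table[self._h(x)]
        if x in chain:
            chain.remove(x)

    def find(self, x):
        return x in self.table[self._h(x)]
```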

67 Universal Hash Families
Definition (2-universal): let M = {0, ..., m − 1} and N = {0, ..., n − 1}, with m ≥ n. A family H of functions from M into N is said to be 2-universal if, for all x, y ∈ M such that x ≠ y, and for h chosen uniformly at random from H,
  Pr[h(x) = h(y)] ≤ 1/n.

68 Define the following indicator function for a collision between the keys x and y under the hash function h:
  δ(x, y, h) = 1 if h(x) = h(y) and x ≠ y; 0 otherwise.
For all X, Y ⊆ M, define the following extensions of the indicator function δ:
  δ(x, y, H) = Σ_{h∈H} δ(x, y, h)
  δ(x, Y, h) = Σ_{y∈Y} δ(x, y, h)
  δ(X, Y, h) = Σ_{x∈X} δ(x, Y, h)
  δ(x, Y, H) = Σ_{y∈Y} δ(x, y, H)
  δ(X, Y, H) = Σ_{h∈H} δ(X, Y, h)

69 Note: for a 2-universal family H and any x ≠ y, we have δ(x, y, H) ≤ |H|/n.
Theorem 8.12: for any family H of functions from M to N, there exist x, y ∈ M such that
  δ(x, y, H) > |H|/n − |H|/m.

70 Proof of Theorem 8.12
Fix some function h ∈ H, and for each z ∈ N define the set of elements of M mapped to z as A_z = {x ∈ M | h(x) = z}. The sets A_z, for z ∈ N, form a partition of M. It is easy to verify that
  δ(A_w, A_z, h) = 0 for w ≠ z, and δ(A_z, A_z, h) = |A_z|(|A_z| − 1).
The total number of collisions between all possible pairs of elements is minimized when the sets A_z are all of the same size. We obtain
  δ(M, M, h) = Σ_{z∈N} |A_z|(|A_z| − 1) ≥ n (m/n)((m/n) − 1) = m²(1/n − 1/m).

71 Proof (cont.)
  δ(M, M, H) = Σ_{h∈H} δ(M, M, h) ≥ |H| m²(1/n − 1/m).
By the pigeonhole principle, there exist x, y ∈ M such that
  δ(x, y, H) ≥ δ(M, M, H)/m² ≥ |H| m²(1/n − 1/m)/m² = |H|(1/n − 1/m).

72 Lemma 8.13: for all x ∈ M, S ⊆ M, and random h ∈ H,
  E[δ(x, S, h)] ≤ |S|/n.
Proof:
  E[δ(x, S, h)] = Σ_{h∈H} δ(x, S, h)/|H|
               = (1/|H|) Σ_{h∈H} Σ_{y∈S} δ(x, y, h)
               = (1/|H|) Σ_{y∈S} Σ_{h∈H} δ(x, y, h)
               = (1/|H|) Σ_{y∈S} δ(x, y, H)
               ≤ (1/|H|) Σ_{y∈S} |H|/n
               = |S|/n.

73 In our dynamic dictionary scheme:
A hash function h ∈ H is chosen uniformly at random and remains fixed during the entire sequence of updates and queries. An inserted key x is stored at the location h(x); due to collisions, other keys may also be stored at that location. The keys colliding at a given location are organized into a linked list. Assuming that the set of keys currently stored in the table is S ⊆ M, the length of the linked list examined for a key x is δ(x, S, h), which has expectation at most |S|/n.

74 Theorem 8.14: Consider a request sequence R = R_1, R_2, ..., R_r of update and search operations starting with an empty hash table, and suppose that this sequence contains s INSERT operations. Let ρ(h, R) denote the total cost of processing these requests using the hash function h ∈ H. Then, for any sequence R of length r with s INSERTs, and h chosen uniformly at random from a 2-universal family H,
  E[ρ(h, R)] ≤ r(1 + s/n).

75 Constructing Universal Hash Families
Fix m and n, and choose a prime p ≥ m. We work over the field Z_p = {0, 1, ..., p − 1}. Let g : Z_p → N be the function g(x) = x mod n. For all a, b ∈ Z_p, define the linear function f_{a,b} : Z_p → Z_p and the hash function h_{a,b} : Z_p → N as follows:
  f_{a,b}(x) = (ax + b) mod p
  h_{a,b}(x) = g(f_{a,b}(x)) = ((ax + b) mod p) mod n
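A sketch of the construction, with an exhaustive check of the 2-universal property on a toy universe (p = 11 and n = 3 are choices made here for illustration):

```python
def make_hash(a, b, p, n):
    """h_{a,b}(x) = ((a*x + b) mod p) mod n."""
    return lambda x: ((a * x + b) % p) % n

p, n = 11, 3
H = [make_hash(a, b, p, n) for a in range(1, p) for b in range(p)]  # a != 0

# delta(x, y, H) <= |H|/n for every pair x != y (the 2-universal property)
for x in range(p):
    for y in range(x + 1, p):
        collisions = sum(1 for h in H if h(x) == h(y))
        assert collisions <= len(H) / n
```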

76 We define the family of hash functions
  H = {h_{a,b} | a, b ∈ Z_p with a ≠ 0}.
Lemma 8.15: for all x, y ∈ Z_p such that x ≠ y, δ(x, y, H) = δ(Z_p, Z_p, g).

77 Proof: suppose that x and y collide under a specific function h_{a,b}. Let f_{a,b}(x) = r and f_{a,b}(y) = s, and observe that r ≠ s since a ≠ 0 and x ≠ y. A collision takes place if and only if g(r) = g(s), or equivalently, r ≡ s (mod n).

78 Now, having fixed x and y, for each such choice of r ≠ s, the values of a and b are uniquely determined as the solution of:
  ax + b ≡ r (mod p)
  ay + b ≡ s (mod p)

79 Theorem 8.16: the family H = {h_{a,b} | a, b ∈ Z_p with a ≠ 0} is 2-universal.
Proof: for each z ∈ N, let A_z = {x ∈ Z_p | g(x) = z}; it is clear that |A_z| ≤ ⌈p/n⌉. In other words, for every r ∈ Z_p there are at most ⌈p/n⌉ different choices of s ∈ Z_p such that g(r) = g(s). Since there are p different choices of r ∈ Z_p to start with,
  δ(Z_p, Z_p, g) ≤ p(⌈p/n⌉ − 1) ≤ p(p − 1)/n.
By Lemma 8.15, δ(x, y, H) = δ(Z_p, Z_p, g) ≤ p(p − 1)/n. Since |H| = p(p − 1), we conclude δ(x, y, H) ≤ |H|/n.

80 Definition 8.6: let M = {0, 1, ..., m − 1} and N = {0, 1, ..., n − 1}, with m ≥ n. A family H of functions from M into N is said to be strongly 2-universal if, for all x_1 ≠ x_2 ∈ M, any y_1, y_2 ∈ N, and h chosen uniformly at random from H,
  Pr[h(x_1) = y_1 and h(x_2) = y_2] = 1/n².

81 Definition 8.7: a family of hash functions H = {h : M → N} is said to be a perfect hash family if for each set S ⊆ M of size s < n there exists a hash function h ∈ H that is perfect for S.
Note: it is clear that perfect hash families exist; for example, the family of all possible functions from M to N is a perfect hash family.
Given a perfect hash family H, we solve the static dictionary problem by:
  1. finding h ∈ H perfect for S;
  2. storing each key x ∈ S at the location T[h(x)];
  3. responding to a search query for a key q by examining the contents of T[h(q)].

82 The preprocessing cost depends on the cost of identifying a perfect hash function for the specific choice of S. The search cost depends on the time required to evaluate the hash function.

83 Since the choice of the hash function depends on the set S, its description must also be stored in the table. Suppose that the size of the perfect hash family H is r. Storing the description of a hash function from H requires Ω(log r) bits. It is essential that the description of the hash function fit into O(1) locations in the table T, and a cell in the table can be used to encode at most log m bits of information.
Note: therefore, we will only be interested in constructing hash families whose size r is bounded by a polynomial in m.

84 Exercise 8.13: assume for simplicity that n = s. Show that for m = 2^Ω(s) there exist perfect hash families of size polynomial in m. Thus, the existence of a perfect hash family is guaranteed only for values of m that are extremely large relative to n.
Exercise 8.14: assuming that n = s, show that any perfect hash family must have size 2^Ω(s). Thus, we need m = 2^Ω(s), or s = O(log m), to guarantee even the existence of a perfect hash family of size polynomial in m. Unfortunately, in practice the case s = O(log m) is not very interesting for typical values of m, e.g., m = 2^32.
Solution: use double hashing.

85 Definition 8.8: let S ⊆ M and h : M → N. For each table location 0 ≤ i ≤ n − 1, define the bin
  B_i(h, S) = {x ∈ S | h(x) = i}.
The size of a bin is denoted by b_i(h, S) = |B_i(h, S)|.
Definition 8.9: a hash function h is b-perfect for S if b_i(h, S) ≤ b for each i. A family of hash functions H = {h : M → N} is said to be a b-perfect hash family if for each S ⊆ M of size s there exists a hash function h ∈ H that is b-perfect for S.

86 Exercise 8.15: show that there exists a b-perfect hash family H with b = O(log n) and |H| ≤ m, for any m ≥ n.
Double hashing: at the first level we use a (log m)-perfect hash function h to map S into the primary table T. Consider the bin B_i consisting of all keys from S mapped into a particular cell T[i]. The elements of bin B_i are mapped into a secondary table T_i associated with that location using a secondary hash function h_i. Since the size of B_i is bounded by b, we can find a hash function h_i that is perfect for B_i provided 2^b is polynomially bounded in m; for b = O(log m) this condition holds.

87 The double hashing scheme can be implemented with O(1) query time, for any m ≥ n. The goal of the primary hash function should be to create bins small enough that perfect hash functions can be used as the secondary hash functions. Exercise 8.16: Consider a table of size r indexed by R = {0, ..., r − 1}. Show that there exists a perfect hash family H = {h : M → R} with |H| ≤ m, provided r = Ω(s²), for all m ≥ s.
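As a small illustration of Exercise 8.16, the sketch below searches a hash family for a function that is perfect on a sample set once the table size is quadratic, r = s². The concrete family h_k(x) = (kx mod p) mod r anticipates Definition 8.10 (the exercise itself only asserts existence); the set S and the prime p are illustrative.

```python
# Finding a perfect hash function into a quadratic-size table (r = s^2),
# in the spirit of Exercise 8.16. The family h_k(x) = (k*x mod p) mod r
# is the one introduced later in Definition 8.10; parameters illustrative.

def find_perfect_k(S, r, p):
    """Return the first k in {1, ..., p-1} whose h_k maps S without collisions."""
    for k in range(1, p):
        if len({(k * x % p) % r for x in S}) == len(S):
            return k
    return None

S = [12, 47, 5, 31, 88, 61]
r = len(S) ** 2          # quadratic table size
p = 101                  # a prime larger than every key
k = find_perfect_k(S, r, p)
print(k is not None)     # True: a perfect h_k exists at this table size
```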

88 Towards our final solution: we will use a primary table of size n = s, choosing a primary hash function that ensures that the bin sizes are small. The perfect hash functions from Exercise 8.16 are then used to resolve the collisions, via secondary hash tables of size quadratic in the bin sizes. Total space required by the double hashing scheme: s + O(∑_{i=0}^{s−1} b_i²).

89 Achieving Bounded Query Time. Our goal now is: (1) to find primary hash functions which ensure that the sum of the squares of the bin sizes is linear; (2) to find perfect hash functions for the secondary tables, which use at most quadratic space.

90 Definition 8.10: Consider any V ⊆ M with |V| = v, and let R = {0, ..., r − 1} with r ≥ v. For 1 ≤ k ≤ p − 1, define the function h_k : M → R as follows: h_k(x) = (kx mod p) mod r. For each i ∈ R, the bin of keys colliding at i is denoted B_i(k, r, V) = {x ∈ V : h_k(x) = i}, and its size is denoted b_i(k, r, V) = |B_i(k, r, V)|.
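Definition 8.10 is short enough to state in code. A minimal sketch, with an illustrative prime p = 17, table size r = 5, and key set V:

```python
# Definition 8.10 in code: the family h_k(x) = (k*x mod p) mod r and
# the bin sizes b_i(k, r, V). The prime p and the set V are illustrative.

def h(k, x, p, r):
    """The hash function h_k from Definition 8.10."""
    return (k * x % p) % r

def bin_sizes(k, p, r, V):
    """Return [b_0(k, r, V), ..., b_{r-1}(k, r, V)]."""
    b = [0] * r
    for x in V:
        b[h(k, x, p, r)] += 1
    return b

p, r = 17, 5
V = [2, 3, 8, 13, 15]
print(bin_sizes(3, p, r, V))       # the bin sizes always sum to |V| = 5
```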

91 Lemma 8.17: For all V ⊆ M of size v, and all r ≥ v, ∑_{k=1}^{p−1} ∑_{i=0}^{r−1} C(b_i(k, r, V), 2) < (p − 1)v²/r ≤ mv²/r, where C(a, 2) = a(a − 1)/2 denotes the binomial coefficient. Proof: The left-hand side counts the number of tuples (k, {x, y}) such that h_k causes x and y to collide, i.e., (1) x, y ∈ V with x ≠ y, and (2) (kx mod p) mod r = (ky mod p) mod r. The relation between k and x, y is as follows: k(x − y) mod p ∈ {±r, ±2r, ±3r, ..., ±⌊(p − 1)/r⌋ · r}.

92 Proof (cont.): Since p is a prime and Z_p is a field, for any fixed value of x − y there is a unique solution for k satisfying the equation k(x − y) mod p = jr, for any given value of j. This immediately implies that the number of values of k that cause a collision between x and y is at most 2(p − 1)/r. Finally, noting that the number of choices of the pair {x, y} is C(v, 2), we obtain ∑_{k=1}^{p−1} ∑_{i=0}^{r−1} C(b_i(k, r, V), 2) ≤ C(v, 2) · 2(p − 1)/r < (p − 1)v²/r.
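The counting bound of Lemma 8.17 can be checked by brute force on a tiny instance, enumerating every multiplier k and every bin. The parameters below are illustrative:

```python
# Brute-force check of Lemma 8.17: sum the collision pairs C(b_i, 2)
# over every multiplier k and every bin, then compare with the bound
# (p-1) * v^2 / r. Small, illustrative parameters only.

from math import comb

p, r = 13, 4                    # p prime, table of size r >= v
V = [1, 5, 7, 11]               # a set of v keys drawn from Z_p
v = len(V)

total = 0
for k in range(1, p):           # every function h_k in the family
    b = [0] * r
    for x in V:
        b[(k * x % p) % r] += 1
    total += sum(comb(b_i, 2) for b_i in b)

print(total < (p - 1) * v * v / r)   # True, as the lemma guarantees
```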

93 Corollary 8.18: For all V ⊆ M of size v, and all r ≥ v, there exists k ∈ {1, ..., m} such that ∑_{i=0}^{r−1} C(b_i(k, r, V), 2) < v²/r.

94 Theorem 8.19: For any S ⊆ M with |S| = s and m ≥ s, there exists a hash table representation of S that uses space O(s) and permits the processing of a FIND operation in O(1) time. Proof: The double hashing scheme is as described above, and all that remains to be shown is that there are choices of the primary hash function h_k and the secondary hash functions h_{k_1}, ..., h_{k_s} that ensure the promised performance bounds.

95 Proof (cont.): Consider first the primary hash function h_k. The only property desired of this function is that the sum of the squares of the bin sizes be linear in s, to ensure that the space used by the secondary hash tables is O(s). Applying Corollary 8.18 to the case where V = S and R = T, implying that v = r = s, we obtain that there exists a k ∈ {1, ..., m} such that ∑_{i=0}^{s−1} C(b_i(k, s, S), 2) < s, or equivalently ∑_{i=0}^{s−1} b_i(k, s, S)[b_i(k, s, S) − 1] < 2s. Since ∪_{i=0}^{s−1} B_i(k, s, S) = S and ∑_{i=0}^{s−1} b_i(k, s, S) = s,

96 we conclude ∑_{i=0}^{s−1} b_i(k, s, S)² < 2s + ∑_{i=0}^{s−1} b_i(k, s, S) = 3s. Consider now the secondary hash function h_{k_i} for the set S_i = B_i(k, s, S) of size s_i. Applying Corollary 8.18 to the case where V = S_i (so v = s_i) and using a secondary hash table of size r = s_i², it follows that there exists a k_i ∈ {1, ..., m} such that ∑_{j=0}^{s_i²−1} C(b_j(k_i, s_i², S_i), 2) < 1, where b_j(k_i, s_i², S_i) is the number of keys at the jth location of the secondary hash table for T[i]. This can be the case only when each term of the summation is zero, implying that b_j(k_i, s_i², S_i) ≤ 1 for all j. Thus, it follows that there exists a perfect secondary hash function h_{k_i}.
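Putting the whole proof together, the sketch below implements the double hashing scheme of Theorem 8.19 (the FKS construction): a primary multiplier k chosen so that ∑ C(b_i, 2) < s, and a perfect secondary multiplier k_i into a table of size s_i² for each bin. All names and parameters are illustrative; a fixed prime p larger than every key plays the role of the universe size.

```python
# A compact sketch of the static double hashing (FKS) scheme from
# Theorem 8.19. Illustrative names and parameters throughout.

from math import comb

def h(k, x, p, r):
    return (k * x % p) % r

def build_fks(S, p):
    s = len(S)
    # 1. Primary function: find k with sum of C(b_i, 2) < s (Corollary 8.18),
    #    which forces sum of b_i^2 < 3s, i.e. O(s) total secondary space.
    for k in range(1, p):
        bins = [[] for _ in range(s)]
        for x in S:
            bins[h(k, x, p, s)].append(x)
        if sum(comb(len(B), 2) for B in bins) < s:
            break
    # 2. Secondary tables: for each bin of size s_i, a perfect multiplier
    #    k_i into a table of size s_i^2 (size 1 for empty bins).
    secondary = []
    for B in bins:
        r = max(1, len(B) ** 2)
        for ki in range(1, p):
            table = [None] * r
            ok = True
            for x in B:
                j = h(ki, x, p, r)
                if table[j] is not None:
                    ok = False
                    break
                table[j] = x
            if ok:
                secondary.append((ki, table))
                break
    return k, secondary

def find(x, k, secondary, p):
    """O(1)-time FIND: one primary and one secondary hash evaluation."""
    ki, table = secondary[h(k, x, p, len(secondary))]
    return table[h(ki, x, p, len(table))] == x

S = [10, 22, 37, 40, 52, 60, 70, 72, 99]
p = 101
k, secondary = build_fks(S, p)
print(all(find(x, k, secondary, p) for x in S))   # every key is found
```

A FIND then costs exactly two hash evaluations and one comparison, independent of s.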


4.8 Huffman Codes. These lecture slides are supplied by Mathijs de Weerd 4.8 Huffman Codes These lecture slides are supplied by Mathijs de Weerd Data Compression Q. Given a text that uses 32 symbols (26 different letters, space, and some punctuation characters), how can we

More information

Fast algorithms for collision and proximity problems involving moving geometric objects

Fast algorithms for collision and proximity problems involving moving geometric objects Fast algorithms for collision and proximity problems involving moving geometric objects Prosenjit Gupta Ravi Janardan Michiel Smid August 22, 1995 Abstract Consider a set of geometric objects, such as

More information

arxiv: v1 [cs.ds] 15 Feb 2012

arxiv: v1 [cs.ds] 15 Feb 2012 Linear-Space Substring Range Counting over Polylogarithmic Alphabets Travis Gagie 1 and Pawe l Gawrychowski 2 1 Aalto University, Finland travis.gagie@aalto.fi 2 Max Planck Institute, Germany gawry@cs.uni.wroc.pl

More information

1 Introduction to information theory

1 Introduction to information theory 1 Introduction to information theory 1.1 Introduction In this chapter we present some of the basic concepts of information theory. The situations we have in mind involve the exchange of information through

More information

Optimal Color Range Reporting in One Dimension

Optimal Color Range Reporting in One Dimension Optimal Color Range Reporting in One Dimension Yakov Nekrich 1 and Jeffrey Scott Vitter 1 The University of Kansas. yakov.nekrich@googlemail.com, jsv@ku.edu Abstract. Color (or categorical) range reporting

More information

Authentication. Chapter Message Authentication

Authentication. Chapter Message Authentication Chapter 5 Authentication 5.1 Message Authentication Suppose Bob receives a message addressed from Alice. How does Bob ensure that the message received is the same as the message sent by Alice? For example,

More information

Realization Plans for Extensive Form Games without Perfect Recall

Realization Plans for Extensive Form Games without Perfect Recall Realization Plans for Extensive Form Games without Perfect Recall Richard E. Stearns Department of Computer Science University at Albany - SUNY Albany, NY 12222 April 13, 2015 Abstract Given a game in

More information

Biased Quantiles. Flip Korn Graham Cormode S. Muthukrishnan

Biased Quantiles. Flip Korn Graham Cormode S. Muthukrishnan Biased Quantiles Graham Cormode cormode@bell-labs.com S. Muthukrishnan muthu@cs.rutgers.edu Flip Korn flip@research.att.com Divesh Srivastava divesh@research.att.com Quantiles Quantiles summarize data

More information

Domain Extender for Collision Resistant Hash Functions: Improving Upon Merkle-Damgård Iteration

Domain Extender for Collision Resistant Hash Functions: Improving Upon Merkle-Damgård Iteration Domain Extender for Collision Resistant Hash Functions: Improving Upon Merkle-Damgård Iteration Palash Sarkar Cryptology Research Group Applied Statistics Unit Indian Statistical Institute 203, B.T. Road,

More information

PRGs for space-bounded computation: INW, Nisan

PRGs for space-bounded computation: INW, Nisan 0368-4283: Space-Bounded Computation 15/5/2018 Lecture 9 PRGs for space-bounded computation: INW, Nisan Amnon Ta-Shma and Dean Doron 1 PRGs Definition 1. Let C be a collection of functions C : Σ n {0,

More information

RANDOMIZED ALGORITHMS

RANDOMIZED ALGORITHMS CH 9.4 : SKIP LISTS ACKNOWLEDGEMENT: THESE SLIDES ARE ADAPTED FROM SLIDES PROVIDED WITH DATA STRUCTURES AND ALGORITHMS IN C++, GOODRICH, TAMASSIA AND MOUNT (WILEY 2004) AND SLIDES FROM NANCY M. AMATO AND

More information

1 Hashing. 1.1 Perfect Hashing

1 Hashing. 1.1 Perfect Hashing 1 Hashing Hashing is covered by undergraduate courses like Algo I. However, there is much more to say on this topic. Here, we focus on two selected topics: perfect hashing and cockoo hashing. In general,

More information

Lecture 18 April 26, 2012

Lecture 18 April 26, 2012 6.851: Advanced Data Structures Spring 2012 Prof. Erik Demaine Lecture 18 April 26, 2012 1 Overview In the last lecture we introduced the concept of implicit, succinct, and compact data structures, and

More information

arxiv: v1 [math.co] 22 Jan 2013

arxiv: v1 [math.co] 22 Jan 2013 NESTED RECURSIONS, SIMULTANEOUS PARAMETERS AND TREE SUPERPOSITIONS ABRAHAM ISGUR, VITALY KUZNETSOV, MUSTAZEE RAHMAN, AND STEPHEN TANNY arxiv:1301.5055v1 [math.co] 22 Jan 2013 Abstract. We apply a tree-based

More information