Data Structure Review and Hash Tables (slides courtesy of Dr. Sang-Eon Park)


Preview

Data Structure Review (COSC)
- Linked Lists: singly linked list, doubly linked list. Typical functions: insert into a sorted list, insert as the last element (or as the first element?), delete, search.
- Stacks
- Queues

Hash Tables
- Hash functions
- Collision resolution: chaining and open addressing (probe sequences)

Data Structure Review (COSC)

Stacks: LIFO. Typical functions for managing a stack: Push, Pop, IsEmpty, IsFull, Peek (or Top). Stack implementations: using a linked list, or using an array.

Queues: FIFO. Typical functions for managing a queue: EnQueue, DeQueue, IsEmpty, IsFull, Peek (or Front). Queue implementations: using a linked list, or using an array.

Hash Tables (Direct-Address Table)

Many applications require a dynamic data structure supporting the so-called dictionary operations: insert, search, and delete. A hash table is an effective data structure for implementing these operations. Even though the worst-case running time of a search is linear, O(n), under reasonable assumptions the average case takes constant time, O(1).

Direct addressing is a simple technique that works well when the universe U of keys is reasonably small. For U = {0, 1, ..., m-1} we can use an array T[0..m-1] with one slot per possible key.

[Figure: a direct-address table over the universe U = {0, 1, ..., 9}. Only the keys in K, the set of actual keys used, occupy their slots; each occupied slot stores the key together with its satellite data, and every other slot is empty.]
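To make direct addressing concrete, here is a minimal C sketch (not from the slides): a table indexed directly by the key, so insert, search, and delete each touch exactly one slot and run in O(1). The element type, the universe size U_SIZE, and the function names are illustrative assumptions.

    #include <stdio.h>

    #define U_SIZE 10                 /* |U|: one slot per possible key (assumed small) */

    typedef struct Element {          /* key plus satellite data (illustrative) */
        int key;
        const char *data;
    } Element;

    static Element *T[U_SIZE];        /* T[k] points to the element with key k, or NULL */

    void direct_address_insert(Element *x) { T[x->key] = x; }      /* O(1) */
    Element *direct_address_search(int k)  { return T[k]; }        /* O(1) */
    void direct_address_delete(Element *x) { T[x->key] = NULL; }   /* O(1) */

    int main(void) {
        Element e = { 6, "satellite data for key 6" };
        direct_address_insert(&e);
        Element *found = direct_address_search(6);
        printf("key %d -> %s\n", found->key, found->data);
        direct_address_delete(&e);
        return 0;
    }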

Hash Tables

Problems with direct addressing:
- If the universe U is large, storing a table of size |U| may be impractical or impossible (e.g., when the range of keys is all positive integers).
- Often the set K of keys that are actually used/stored is small compared to U, so most of the space allocated for T is wasted (e.g., the range of keys is enormous while only a relatively small number of keys ever occur as inputs).

Hash tables are a good alternative:
- When K is much smaller than U, a hash table requires significantly less space than a direct-address table; storage requirements can be reduced to Θ(|K|).
- The average-case search time can be O(1). However, this is not the worst case.

A hash table is a data structure that uses a hash function to map identifying values, known as keys (e.g., a person's name or ID number), to their associated values.

The big idea: instead of storing an element with key k in slot k, use a function h and store the element in slot h(k). We call h a hash function,

    h: U -> {0, 1, ..., m-1},

such that h(k) is a legal slot number in the hash table T. We say that k hashes to slot h(k).

[Figure: a hash table T with slots 0 through m-1 and a hash function h; the keys in K (the actual keys used, a subset of the universe U) are stored in slots h(k1), h(k2), ....]

Collisions: a collision occurs when more than one key hashes to the same slot. This can happen because there are more possible keys than slots (|U| > m, where m is the number of slots in the hash table). For a given set K of keys with |K| <= m, a collision may or may not happen; when |K| > m, collisions will occur. Therefore, a hash table implementation must be able to handle collisions in all cases.

Two common approaches for resolving collisions: chaining and open addressing.
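As a small illustration of a collision (my example, not the slides'), the sketch below uses a toy hash function h(k) = k mod m with m = 10 slots and shows two distinct keys mapping to the same slot; the keys 12 and 22 are arbitrary choices.

    #include <stdio.h>

    /* Toy hash function: h(k) = k mod m, with m = 10 slots (an assumption). */
    static int h(int k) { int m = 10; return ((k % m) + m) % m; }

    int main(void) {
        int k1 = 12, k2 = 22;                     /* two distinct keys */
        printf("h(%d) = %d, h(%d) = %d\n", k1, h(k1), k2, h(k2));
        if (h(k1) == h(k2))
            printf("collision: both keys hash to slot %d\n", h(k1));
        return 0;
    }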

Hash Tables (Collision Resolution by Chaining)

[Figure: the hash table from before, now with a collision: two distinct keys hash to the same slot, h(k_i) = h(k_j).]

Collision resolution by chaining: instead of containing an actual element, slot j contains a pointer to the head of a list storing all elements that hash to j (or to the sentinel, if a circular, doubly linked list with a sentinel is used). If there are no such elements, slot j contains NULL.

[Figure: a worked example of chaining. Keys (names with ID numbers) are inserted in the order Sam Smith, Sue Marker, Sandra Lee, Ted Doe, Sam Parker, Mike Backer; keys that hash to the same slot are chained together in that slot's linked list.]

Operations in Hashing with Chaining

Insertion: CHAINED-HASH-INSERT(T, x)
- Insert x at the head of list T[h(x->key)].
- Worst-case running time is O(1).
- Assumes that the element being inserted isn't already in the list; it would take an additional search to check whether it was.

Search: CHAINED-HASH-SEARCH(T, k)
- Search for an element with key k in list T[h(k)].
- Running time is proportional to the length of the list of elements in slot h(k).

Deletion: CHAINED-HASH-DELETE(T, x)
- Delete x from the list T[h(x->key)].
- We are given a pointer x to the element to delete, so no search is needed to find it.
- Worst-case running time is O(1) if the lists are doubly linked.
- If the lists are singly linked, deletion takes as long as searching, because we must find x's predecessor in its list in order to correctly update the next pointers.
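A minimal C sketch of the three operations above, assuming integer keys, a hypothetical toy hash function h(k) = k mod M with M = 8 slots, and chains kept as doubly linked lists so that deletion by pointer is O(1). The type and function names are illustrative, not the slides' code.

    #include <stdio.h>
    #include <stdlib.h>

    #define M 8                      /* number of slots (assumed) */

    typedef struct Node {
        int key;
        struct Node *prev, *next;    /* doubly linked for O(1) delete */
    } Node;

    static Node *table[M];           /* slot j holds the head of its chain, or NULL */

    static int h(int k) { return ((k % M) + M) % M; }   /* toy hash function */

    /* Insert at the head of list T[h(key)]; O(1), assumes key not already present. */
    Node *chained_hash_insert(int key) {
        Node *x = malloc(sizeof *x);
        int j = h(key);
        x->key = key;
        x->prev = NULL;
        x->next = table[j];
        if (table[j]) table[j]->prev = x;
        table[j] = x;
        return x;
    }

    /* Search list T[h(k)]; time proportional to the chain length. */
    Node *chained_hash_search(int k) {
        Node *x = table[h(k)];
        while (x && x->key != k) x = x->next;
        return x;                    /* NULL if not found */
    }

    /* Delete given a pointer to the node; O(1) because the list is doubly linked. */
    void chained_hash_delete(Node *x) {
        if (x->prev) x->prev->next = x->next;
        else         table[h(x->key)] = x->next;
        if (x->next) x->next->prev = x->prev;
        free(x);
    }

    int main(void) {
        chained_hash_insert(5);
        chained_hash_insert(13);     /* 13 mod 8 == 5: collides with key 5 and chains in slot 5 */
        Node *x = chained_hash_search(13);
        printf("found %d in slot %d\n", x->key, h(x->key));
        chained_hash_delete(x);
        return 0;
    }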

Analysis of Hashing with Chaining

We can analyze hashing with chaining in terms of the load factor, the average number of elements stored per slot:
- Let n be the number of elements in the hash table and m the number of slots.
- The load factor is α = n/m; it can be greater than, equal to, or less than 1.

The worst case occurs when all n keys hash to the same slot: we get a single list of length n, and the worst-case search time is Θ(n) plus the time to compute the hash function. The average case depends on how well the hash function distributes the keys among the slots.

Average-case performance of hashing with chaining:
- We assume simple uniform hashing: any given element is equally likely to hash into any of the m slots.
- For j = 0, 1, ..., m-1, denote the length of list T[j] by n_j. Then n = n_0 + n_1 + ... + n_{m-1}, where n is the total number of elements stored.
- The average value of n_j is E[n_j] = α = n/m.
- Assume the hash function is computed in O(1) time, so the time required to search for an element with key k depends on the length n_{h(k)} of the list T[h(k)].

To analyze the average case we consider two cases:
1. If the hash table contains no element with key k, the search is unsuccessful.
2. If the hash table does contain an element with key k, the search is successful.

Theorem 1) An unsuccessful search takes expected time Θ(1 + α).
Theorem 2) A successful search takes expected time Θ(1 + α).

Hash Tables (Collision Resolution by Open Addressing)

Hash collisions can also be resolved using open addressing. In open addressing, all elements occupy the hash table itself: every key is stored in one of the table's own slots, and each slot contains either a key or NIL.

[Figure: an open-addressing example. Keys (names with ID numbers) are inserted in the order Sam Smith, Sue Marker, Sandra Lee, Ted Doe, Sam Parker, Mike Backer; each key is stored directly in a slot of the table.]

To search for key k:
- Compute h(k) and examine slot h(k). Examining a slot is known as a probe.
- If slot h(k) contains key k, the search is successful. If the slot contains NIL, the search is unsuccessful.
- If slot h(k) contains a key that is not k, we compute the index of some other slot, based on k and on which probe we are on (counting from 0: 0th, 1st, 2nd, etc.).
- Keep probing until we either find key k (successful search) or find a slot holding NIL (unsuccessful search).

We need the sequence of slots probed to be a permutation of the slot numbers <0, 1, ..., m-1>, so that we examine all slots if we have to, and so that we never examine any slot more than once. Thus the hash function takes the probe number as a second argument:

    h: U x P -> P

(where U is the set of keys and P = {0, 1, ..., m-1} is the set of slot numbers). The requirement that the sequence of slots be a permutation of 0, 1, ..., m-1 is equivalent to requiring that the probe sequence h(k, 0), h(k, 1), ..., h(k, m-1) be a permutation of <0, 1, ..., m-1>.

To insert, act as though we're searching, and insert at the first NIL slot we find.

HASH-SEARCH returns the index of a slot containing key k, or NIL if the search is unsuccessful.

    HASH_SEARCH(T, k)
    {
        i = 0;
        rval = NIL;
        do {
            j = h(k, i);
            if (T[j] == k)
                rval = j;
            i = i + 1;
        } while (T[j] != NIL and T[j] != k and i < m);
        return rval;
    }
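The pseudocode above translates into C roughly as follows. This sketch assumes a small table of M = 8 integer slots, NIL represented by -1, and, purely as a placeholder, a linear probe sequence h(k, i) = (k + i) mod M (probe sequences are defined later in these notes).

    #include <stdio.h>

    #define M 8                        /* number of slots (assumed) */
    #define NIL (-1)                   /* sentinel marking an empty slot */

    static int T[M];                   /* open-addressing table: keys live in the slots themselves */

    /* Probe sequence h(k, i); linear probing is used here only as a placeholder. */
    static int h(int k, int i) { return (k + i) % M; }

    /* Returns the index of the slot containing key k, or NIL if k is not present. */
    int hash_search(int k) {
        int i = 0, rval = NIL, j;
        do {
            j = h(k, i);
            if (T[j] == k)
                rval = j;
            i = i + 1;
        } while (T[j] != NIL && T[j] != k && i < M);
        return rval;
    }

    int main(void) {
        for (int j = 0; j < M; j++) T[j] = NIL;   /* start with every slot empty */
        T[3] = 11;                                /* 11 hashes to slot 3 */
        T[4] = 19;                                /* 19 also hashes to slot 3; it was placed one probe later */
        printf("11 found in slot %d\n", hash_search(11));
        printf("19 found in slot %d\n", hash_search(19));
        printf("7 present? slot = %d (NIL means not found)\n", hash_search(7));
        return 0;
    }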

Hash Tables (Collision Resolution by Open Addressing)

HASH-INSERT probes along the same sequence, places the key in the first empty slot it finds, and reports an overflow if no empty slot exists.

    HASH_INSERT(T, k)
    {
        i = 0;
        stop = false;
        do {
            j = h(k, i);
            if (T[j] == NIL) {
                T[j] = k;
                stop = true;
            } else {
                i = i + 1;
            }
        } while (!stop and i < m);
        if (i == m)
            error "hash table overflow";
    }

Deletion: we cannot just put NIL into the slot containing the key we want to delete.

[Figure: the open-addressing example table after deleting Ted Doe by writing NIL into his slot. A subsequent search for Mike Backer, whose probe sequence passes through that slot, now stops at the NIL slot and fails even though Mike Backer is still in the table.]

Solution: use a special value DELETED instead of NIL when marking a slot as empty during deletion.
- Search should treat DELETED as though the slot holds a key that does not match the one being searched for.
- Insertion should treat DELETED as though the slot were empty, so that it can be reused.

The disadvantage of using DELETED is that search time is no longer dependent on the load factor.

    HASH_INSERT(T, k)
    {
        i = 0;
        stop = false;
        do {
            j = h(k, i);
            if (T[j] == NIL or T[j] == DELETED) {
                T[j] = k;
                stop = true;
            } else
                i = i + 1;
        } while (!stop and i < m);
        if (i == m)
            error "hash table overflow";
    }
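A sketch in C of open addressing with the DELETED marker, under the same assumptions as the previous sketch (M = 8 integer slots, -1 for NIL, -2 for DELETED, and a placeholder linear probe). It restructures the pseudocode's do/while loops as bounded for loops, which is an implementation choice, not the slides' code.

    #include <stdio.h>

    #define M 8                        /* number of slots (assumed) */
    #define NIL (-1)                   /* empty slot */
    #define DELETED (-2)               /* tombstone: a key was deleted here */

    static int T[M];

    static int h(int k, int i) { return (k + i) % M; }   /* placeholder linear probe */

    /* Insert treats DELETED like an empty slot, so tombstones get reused. */
    int hash_insert(int k) {
        for (int i = 0; i < M; i++) {
            int j = h(k, i);
            if (T[j] == NIL || T[j] == DELETED) { T[j] = k; return j; }
        }
        return NIL;                    /* hash table overflow */
    }

    /* Search treats DELETED like a non-matching key and keeps probing. */
    int hash_search(int k) {
        for (int i = 0; i < M; i++) {
            int j = h(k, i);
            if (T[j] == k)   return j;
            if (T[j] == NIL) return NIL;    /* a truly empty slot ends the search */
        }
        return NIL;
    }

    /* Delete marks the slot DELETED rather than NIL, so later searches keep going. */
    void hash_delete(int k) {
        int j = hash_search(k);
        if (j != NIL) T[j] = DELETED;
    }

    int main(void) {
        for (int j = 0; j < M; j++) T[j] = NIL;
        hash_insert(11); hash_insert(19); hash_insert(27);   /* all hash to slot 3, then probe onward */
        hash_delete(19);                                     /* leaves a tombstone in slot 4 */
        printf("27 still found in slot %d\n", hash_search(27));
        printf("reinserting 35 reuses slot %d\n", hash_insert(35));
        return 0;
    }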

Probe Sequences

A probe sequence is the list of slot locations that an open-addressing method produces as alternatives in the event of a collision. Three common methods:
- Linear probing
- Quadratic probing
- Double hashing

Linear Probing

Given an ordinary (auxiliary) hash function h': U -> {0, 1, ..., m-1}, the method of linear probing uses the hash function

    h(k, i) = (h'(k) + i) mod m,   for i = 0, 1, ..., m-1.

The initial probe determines the entire sequence, so there are only m possible probe sequences. Linear probing is easy to implement, but it suffers from primary clustering: the probe sequences for two different keys may overlap and build up runs of occupied slots. As a result, the average search and insertion times increase.

Example) With linear probing, two different keys k1 and k2 may overlap and create clustering:
- the probe sequence for key k1 is {h'(k1), h'(k1) + 1, h'(k1) + 2, ...} = {10, 11, 12, 13, ...}
- the probe sequence for key k2 is {h'(k2), h'(k2) + 1, h'(k2) + 2, ...} = {7, 8, 9, 10, 11, 12, 13, ...}

Quadratic Probing

The probe sequence starts at h'(k) and jumps around in the table according to a quadratic function of the probe number:

    h(k, i) = (h'(k) + c1*i + c2*i^2) mod m,

where c1 and c2 are constants. We must constrain c1, c2, and m in order to ensure that we get a full permutation of {0, 1, ..., m-1}. If two distinct keys have the same h'(k) value, then they have the same probe sequence; this results in secondary clustering.

Double Hashing

Use two auxiliary hash functions, h1 and h2. h1 gives the initial probe, and h2 determines the remaining probes:

    h(k, i) = (h1(k) + i*h2(k)) mod m.

h2(k) must be relatively prime to m (no factors in common other than 1) in order to guarantee that the probe sequence is a full permutation of <0, 1, ..., m-1>. Two ways to arrange this:
- Choose m to be a power of 2 and design h2 so that it always produces an odd number.
- Let m be prime and design h2 so that 1 < h2(k) < m.

Collision Resolution by Open Addressing: Analysis

Assumptions:
- The analysis is in terms of the load factor α. We assume that the table never completely fills, so we always have n < m, i.e., α < 1.
- Uniform hashing.
- No deletion.
- In a successful search, each key is equally likely to be searched for.

Theorems:
- The expected number of probes in an unsuccessful search is at most 1/(1 - α).
- The expected number of probes in a successful search is at most (1/α) ln(1/(1 - α)).
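The three probe-sequence formulas above can be written down directly. In the sketch below the auxiliary functions h', h1, h2 and the constants m = 11, c1 = c2 = 1 are illustrative choices (and, as the text notes, c1, c2, and m would have to be constrained for quadratic probing to visit every slot).

    #include <stdio.h>

    #define M 11                         /* number of slots; a prime, chosen for illustration */

    static int h_aux(int k)  { return k % M; }             /* ordinary hash function h'(k) */
    static int h1(int k)     { return k % M; }             /* double hashing: initial probe */
    static int h2(int k)     { return 1 + (k % (M - 1)); } /* double hashing: step, 1 <= h2(k) < M */

    /* Linear probing:    h(k, i) = (h'(k) + i) mod m */
    static int linear(int k, int i)    { return (h_aux(k) + i) % M; }

    /* Quadratic probing: h(k, i) = (h'(k) + c1*i + c2*i^2) mod m, with c1 = c2 = 1 (assumed) */
    static int quadratic(int k, int i) { return (h_aux(k) + i + i * i) % M; }

    /* Double hashing:    h(k, i) = (h1(k) + i*h2(k)) mod m */
    static int dbl(int k, int i)       { return (h1(k) + i * h2(k)) % M; }

    int main(void) {
        int k = 25;
        printf("probe sequences for key %d (first 5 probes):\n", k);
        for (int i = 0; i < 5; i++)
            printf("i=%d  linear=%d  quadratic=%d  double=%d\n",
                   i, linear(k, i), quadratic(k, i), dbl(k, i));
        return 0;
    }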

Hash Functions

What makes a good hash function? Ideally, the hash function satisfies the simple uniform hashing assumption (SUHA). SUHA is a basic assumption that facilitates the mathematical analysis of hash tables: it states that a hypothetical hashing function will evenly distribute items into the slots of a hash table, and moreover that each item to be hashed has an equal probability of being placed into any slot, regardless of the elements already placed.

In practice, it's usually not possible to satisfy this assumption: we usually do not know in advance the probability distribution that keys are drawn from, and the keys may not be drawn independently. Instead, heuristics based on the domain of the keys are often used to create a hash function that performs reasonably well.

Hash Functions: Keys as Natural Numbers

Hash functions typically assume that the keys are natural numbers. When they are not, we need a way to convert them to natural numbers. For example, if the keys are character strings, we could convert them using their character codes. With key = "Father", the ASCII values are F = 70, a = 97, t = 116, h = 104, e = 101, r = 114. Since there are 128 basic ASCII code values, we can convert the string "Father" to the natural number

    70*128^5 + 97*128^4 + 116*128^3 + 104*128^2 + 101*128 + 114.

Hash Functions: The Division Method

A key k is mapped into one of m slots by taking the remainder of k divided by m:

    h(k) = k mod m

Advantage: fast, since it requires just one division operation.
Disadvantage: we have to avoid certain values of m. Powers of 2 are bad: if m = 2^p for an integer p, then h(k) is just the least significant p bits of k.
Good choice for m: a prime number not too close to an exact power of 2.

Hash Functions: The Multiplication Method

The multiplication method creates a slot number as follows:
1. Select a constant value A in the range 0 < A < 1.
2. Multiply key k by A.
3. Extract the fractional part of kA.
4. Multiply the fractional part by m.
5. Take the floor of the result.

    h(k) = floor(m * (kA mod 1)),

where "kA mod 1" is the fractional part of kA, which is between 0 and 1. An advantage of this approach is that the value of m is not critical.
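A hedged C sketch of the ideas above: interpreting a string as a radix-128 natural number, then hashing it with the division method and with a naive floating-point version of the multiplication method. The table sizes 701 and 1024 and the constant A = 0.6180339887 are illustrative choices; for long strings one would reduce modulo m while accumulating to avoid overflow.

    #include <stdio.h>
    #include <stdint.h>
    #include <math.h>

    /* Interpret a character string as a radix-128 natural number (the slides'
     * "Father" example). 64-bit arithmetic is enough for short strings. */
    static uint64_t string_to_key(const char *s) {
        uint64_t k = 0;
        for (; *s; s++)
            k = k * 128 + (unsigned char)*s;     /* Horner's rule, base 128 */
        return k;
    }

    /* Division method: h(k) = k mod m. */
    static uint64_t h_division(uint64_t k, uint64_t m) { return k % m; }

    /* Multiplication method, written naively with floating point:
     * h(k) = floor(m * (kA mod 1)).  The word-level trick on the next page
     * computes the same thing without floating point. */
    static uint64_t h_multiplication(uint64_t k, uint64_t m, double A) {
        double frac = fmod((double)k * A, 1.0); /* fractional part of kA */
        return (uint64_t)((double)m * frac);
    }

    int main(void) {
        uint64_t k = string_to_key("Father");   /* 70*128^5 + 97*128^4 + ... + 114 */
        uint64_t m_div = 701;                   /* prime, not near a power of 2 */
        uint64_t m_mul = 1024;                  /* a power of 2 is fine for the multiplication method */
        printf("key(\"Father\")      = %llu\n", (unsigned long long)k);
        printf("division h(k)       = %llu (m = %llu)\n",
               (unsigned long long)h_division(k, m_div), (unsigned long long)m_div);
        printf("multiplication h(k) = %llu (m = %llu)\n",
               (unsigned long long)h_multiplication(k, m_mul, 0.6180339887),
               (unsigned long long)m_mul);
        return 0;
    }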

Hash Functions: The Multiplication Method (Implementation)

A power of 2 is usually chosen for m in order to facilitate easy implementation of the function on most computers:
- Select m = 2^p for some integer p.
- Suppose the word size of the machine is w bits, and assume that k fits into a single word (k takes w bits).
- Let s be an integer in the range 0 < s < 2^w (s takes w bits), and restrict A to be a fraction of the form s/2^w, so that s = A * 2^w.
- Multiply k by s. Since we are multiplying two w-bit words, the result is 2w bits: k*s = r1 * 2^w + r0, where r1 is the high-order word of the product and r0 is the low-order word.

r1 holds the integer part of kA (that is, floor(kA)) and r0 holds the fractional part of kA (kA mod 1 = kA - floor(kA)). Think of the binary point (the analog of the decimal point, but for binary representation) as lying between r1 and r0. Since we don't care about the integer part of kA, we can forget about r1 and just use part of r0.

Since we want floor(m * (kA mod 1)), we could get that value by shifting r0 to the left by p = log2(m) bits (m = 2^p) and then taking the p bits that end up to the left of the binary point. But we don't actually need to shift: those p bits are simply the p most significant bits of r0. So we can just take these bits after having formed r0 by multiplying k by s.

Example: let m = 8 (= 2^3, which implies p = 3), w = 5, and k = 21. We must select s with 0 < s < 2^w, i.e., 0 < s < 32; let's select s = 13. Then

    A = 13/32
    kA = 21 * (13/32) = 273/32 = 8 + 17/32.

We only use the fraction, kA mod 1 = 17/32, to hash, so

    h(k) = floor(m * (kA mod 1)) = floor(8 * 17/32) = floor(17/4) = 4.

Using the word-level implementation: k*s = 21 * 13 = 273 = 8 * 32 + 17, so r1 = 8 and r0 = 17. Written in w = 5 bits, r0 = 10001. Taking the p = 3 most significant bits of r0 gives 100 in binary, or 4 in decimal, so that h(k) = 4.

    w = 5 bits:
        k  = 21  = 10101
        s  = 13  = 01101
        ks = 273 = 01000 10001
        r1 = 01000 (= 8),  r0 = 10001 (= 17)
        h(k) = top p = 3 bits of r0 = 100 (binary) = 4

How to choose A: the multiplication method works with any legal value of A, but it works better with some values than with others, depending on the keys being hashed. Knuth suggests using A ≈ (√5 - 1)/2 ≈ 0.618.
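The bit-level recipe above can be written as a short C function. This is a sketch, not the slides' code: it assumes a word size of w = 32 bits and uses s = 2654435769, which is floor(A * 2^32) for Knuth's suggested A ≈ (√5 - 1)/2; the choice p = 10 (so m = 1024) is arbitrary. A second helper redoes the slides' small w = 5, m = 8, k = 21, s = 13 example the same way and should print 4.

    #include <stdio.h>
    #include <stdint.h>

    /* Multiplication-method hash via the word-level trick: with m = 2^p and
     * A = s / 2^w, compute k*s, keep the low-order w-bit word r0, and take
     * its p most significant bits. */
    #define W 32
    static const uint32_t S = 2654435769u;     /* floor(((sqrt(5)-1)/2) * 2^32) */

    static uint32_t h_mult(uint32_t k, unsigned p) {
        uint64_t product = (uint64_t)k * S;    /* 2w-bit product: r1*2^w + r0 */
        uint32_t r0 = (uint32_t)product;       /* low-order word: fractional part of kA */
        return r0 >> (W - p);                  /* p most significant bits of r0 */
    }

    /* The slides' small example, done the same way with w = 5, m = 8, s = 13. */
    static unsigned h_mult_small(unsigned k) {
        unsigned product = k * 13;             /* 21 * 13 = 273 */
        unsigned r0 = product & 0x1F;          /* low 5 bits: 17 = 10001 in binary */
        return r0 >> (5 - 3);                  /* top 3 bits of r0: 100 in binary = 4 */
    }

    int main(void) {
        printf("small example: h(21) = %u (expected 4)\n", h_mult_small(21));
        for (uint32_t k = 0; k < 5; k++)       /* 32-bit version, table of m = 2^10 slots */
            printf("h_mult(%u) = %u\n", k, h_mult(k, 10));
        return 0;
    }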
