Hashing. Data organization in main memory or disk
|
|
- Bertha Todd
- 5 years ago
- Views:
Transcription
1 Hashing Data organization in main memory or disk sequential, binary trees, The location of a key depends on other keys => unnecessary key comparisons to find a key Question: find key with a single comparison Hashing: the location of a record is computed using its key only Fast for random accesses - slow for range queries EGM Petrakis Hashing
2 Hash Table Hash Function: transforms keys to array indices n h(key) h(key): Hash Function index data EGM Petrakis Hashing 2 m
3 position key record EGM Petrakis Hashing 3 h(key) = key mod 000
4 Good Hash Functions Uniform: distribute keys evenly in space 2 Perfect: two records cannot occupy the same location or ki k j i j : h( ki ) h( 3 Order preserving:, k j ), ) h( k j ki k j i j : h( ki Difficult to find such hash functions Property 2 is the most essential Most functions are no better than h(key) = key mod m Hash collision: k k : h(k ) h(k ) i j i = j ) EGM Petrakis Hashing 4
5 Collision Resolution Open Addressing (rehashing): compute new position to store the key in the table (no extra space) i linear probing ii double hashing 2 Separate Chaining: lists of keys mapped to the same position (uses extra space) EGM Petrakis Hashing 5
6 Open Addressing Computes a new address to store the key if it is occupied (rehashing) if occupied too, compute a new address, until an empty position is found primary hash function: i=h(key) rehash function: rh(i)=rh(h(key)) hash sequence: (h 0,h,h 2 ) = (h(key), rh(h(key)), rh(rh(h(key))) ) To find a key follow the same hash sequence EGM Petrakis Hashing 6
7 Example i=h(key)=key mod 00 rh(i) = (i+) mod 00 key: 93 i=h(93)=93 rh(i)=(93+)=94 Key 93 will occupy position EGM Petrakis Hashing 7
8 Problem : Locate Empty Positions No empty position can be found i the table is full ii check on number of empty positions the hash function fails to find an empty position although the table is not full!! i=h(key) = key mod 000 rh(i) = (i + 200) mod 000 => checks only 5 positions on a table of 000 positions rh(i) = (i+) mod 000 successive positions rh(i) = (i+c) mod 000 where GCD(c,m) = EGM Petrakis Hashing 8
9 Problem 2: Primary Clustering Different keys that hash into different addresses compete with each other in successive rehashes i=h(key) = key mod 00 rh(i) = (i+) mod 00 keys: 990, 99, 992, 993, 994 => EGM Petrakis Hashing
10 Problem 3: Secondary Clustering Different keys which hash to the same hash value have the same rehash sequence i=h(key) = key mod 0 rh(i,j) = (i + j) mod 0 i key 23 : h(23) = 3 rh = 4, 6, 9, 3, ii key 3 : h(3) = 3 rh = 4, 6, 9, 3, EGM Petrakis Hashing 0
11 Linear Probing Store the key into the next free position h 0 = h(key) usually h 0 = key mod m h i = (h i- + ) mod m, i >= S = {22, 35, 30, 99, 02, 452} 8 EGM Petrakis 9 99 Hashing
12 Observation number Different insertion sequences => different hash sequences S ={,3,27,99,8,50,77,2 2,2,3,33,40,53}=>28 probes S 2 ={53,40,33,3,2,22,7 7,50,8,99,27,3,}=> 30 probes of probes H(key) = key mod 3 EGM Petrakis Hashing 2
13 Observation 2 Deletions are not easy: i=h(key) = key mod 0 rh(i) = (i+) mod 0 Action: delete(65) and search(5) Problem: search will stop at the empty position and will not find 5 Solution: 0 70 mark position as deleted rather than empty the marked position can be reused EGM Petrakis Hashing 3
14 Observation 3 Linear probing tends to create long sequences of occupied positions the longer a sequence is, the longer it tends to become P: probability to use a position in the cluster m Β P = B + m EGM Petrakis Hashing 4
15 Observation 4 Linear probing suffers from both primary and secondary clustering Solution: double hashing uses two hash functions h, h 2 and a rehashing function rh EGM Petrakis Hashing 5
16 Double Hashing Two hash functions and a rehashing function primary hash function i=h (key)= key mod m secondary hash function h 2 (key) rehashing function: rh(key) = (i + h 2 (key)) mod m h 2 (m,key) is some function of m, key helps rh in computing random positions in the hash table h 2 is computed once for each key! EGM Petrakis Hashing 6
17 Example of Double Hashing i hash function: h (key) = key mod m h 2 (key) = m div 2 q q = 0 q 0 q = (key div m) mod m ii rehash function: rh(i, key) = (i + h 2 (key)) mod m EGM Petrakis Hashing 7
18 Example (continued) A m = 0, key = 23 h (23) = 3, h 2 (23) = 2 rh(3,2)=(3+2) mod 0 = 5 rehash sequence: 5, 7, 9,, m = 0, key = 3 h (key)=3, h 2 (3)=, rh(3,)=(3+)mod0=4 rehash sequence: 4, 5, 6, EGM Petrakis Hashing 8
19 Performance of Open Addressing Distinguish between successful and unsuccessful search Assume a series of probes to random positions independent events load factor: λ =n/m λ: probability to probe an occupied position each position has the same probability P=/m EGM Petrakis Hashing 9
20 Unsuccessful Search The hash sequence is exhausted let u be the expected number of probes u equals the expected length of the hash sequence P(k): probability to search k positions in the hash sequence EGM Petrakis Hashing 20
21 u = kp(k) = k P() + P(2) + P(2) + P(3) + P(3) + P(3) + L P(k) + P(k) + L + P(k) + L P( probes) + P( 2probes) + L EGM Petrakis Hashing 2
22 u = k u = k P(first k positions ocupied ) = λ κ λ P( k k k probes) = λ k independent events uincreases with λ => performance drops as λ increases EGM Petrakis Hashing 22
23 Successful Search The hash sequence is not exhausted the number of probes to find a key equals the number of probes s at the time the key was inserted plus λ was less at that time consider all values of λ λ s = (u + )dx = + ln( ) λ λ λ 0 approximation u: equivalent to unsuccessful search increases with λ EGM Petrakis Hashing 23
24 Performance The performance drops as λ increases the higher the value of λ is, the higher the probability of collisions Unsuccessful search is more expensive than successful search unsuccessful search exhausts the hash sequence EGM Petrakis Hashing 24
25 Experimental Results LOAD FACTOR SUCCESSFUL UNSUCCESSFUL LINEAR i + bkey DOUBLE LINEAR i + bkey DOUBLE 25% % % % % EGM Petrakis Hashing 25
26 Performance on Full Table SUCCESSFUL TABLE UNSUCCESSFUL LOG 2 m SIZE (m) LINEAR i + bkey DOUBLE EGM Petrakis Hashing 26
27 Separate Chaining Keys hashing to the same hash value are stored in separate lists one list per hash position can store more than m records easy to implement the keys in each list can be ordered EGM Petrakis Hashing 27
28 nil nil h(key) = key mod m nil 3 nil 4 nil 5 75 nil nil nil 8 nil 9 49 nil EGM Petrakis Hashing 28
29 Performance of Separate Chaining Depends on the average chain size insertions are independent events let P(c,n,m): probability that a position has been selected c times after n insertions on a table of size m P(c,n,m): probability that the chain has length c => binomial distribution P(c,n, m) = n p c c c q n p=/m: success case q=-p: failure case EGM Petrakis Hashing 29
30 EGM Petrakis Hashing 30 n c c n c m m m c n m n c! m m c n P(c,n, m) + = = λ n c e m m λ m c n + => P(c,n,m)=(/c!)λ c e -λ Poison m n, =>
31 Unsuccessful Search The entire chain is searched the average number of comparisons equals its average length u u = c 0 cp(c, λ) = c 0 λ c e c! λ = λ EGM Petrakis Hashing 3
32 Successful Search Not the whole chain is searched the average number of comparisons equals the length s of the chain at time the key was inserted plus the performance at the time a key was inserted equals that of unsuccessful search! λ s = (u λ + )dx = (x + )dx = + 0 EGM Petrakis Hashing 32 λ λ 0 λ 2
33 Performance The performance drops with the length of the chains worst case: all keys are stored in a single chain worst case performance: O(N) unsuccessful search performs better than successful search!! WHY? no problem with deletions!! EGM Petrakis Hashing 33
34 Coalesced Hashing The hash sequence is implemented as a linked list within the hash table no rehash function the next hash position is the next available position in linked list extra space for the list h(key) = key mod 0 keys: 9, 29, 49, EGM Petrakis Hashing 34
35 avail initialization 0 nilkey - nilkey nilkey 8 List of empty positions initially: avail = 9 h(key) = key mod 0 keys: 4,29,34,28,42,39,84,38 0 nilkey - nilkey Holds lists of rehashing positions and list of empty positions EGM Petrakis Hashing 35
36 Performance of Coalesced Hashing Unsuccessful search 2λ e 4 λ Successful search probes/search e 2λ λ λ 4 probes/search EGM Petrakis Hashing 36
Hash Tables. Direct-Address Tables Hash Functions Universal Hashing Chaining Open Addressing. CS 5633 Analysis of Algorithms Chapter 11: Slide 1
Hash Tables Direct-Address Tables Hash Functions Universal Hashing Chaining Open Addressing CS 5633 Analysis of Algorithms Chapter 11: Slide 1 Direct-Address Tables 2 2 Let U = {0,...,m 1}, the set of
More informationAdvanced Implementations of Tables: Balanced Search Trees and Hashing
Advanced Implementations of Tables: Balanced Search Trees and Hashing Balanced Search Trees Binary search tree operations such as insert, delete, retrieve, etc. depend on the length of the path to the
More informationIntroduction to Hashtables
Introduction to HashTables Boise State University March 5th 2015 Hash Tables: What Problem Do They Solve What Problem Do They Solve? Why not use arrays for everything? 1 Arrays can be very wasteful: Example
More informationInsert Sorted List Insert as the Last element (the First element?) Delete Chaining. 2 Slide courtesy of Dr. Sang-Eon Park
1617 Preview Data Structure Review COSC COSC Data Structure Review Linked Lists Stacks Queues Linked Lists Singly Linked List Doubly Linked List Typical Functions s Hash Functions Collision Resolution
More informationFundamental Algorithms
Chapter 5: Hash Tables, Winter 2018/19 1 Fundamental Algorithms Chapter 5: Hash Tables Jan Křetínský Winter 2018/19 Chapter 5: Hash Tables, Winter 2018/19 2 Generalised Search Problem Definition (Search
More informationSearching. Constant time access. Hash function. Use an array? Better hash function? Hash function 4/18/2013. Chapter 9
Constant time access Searching Chapter 9 Linear search Θ(n) OK Binary search Θ(log n) Better Can we achieve Θ(1) search time? CPTR 318 1 2 Use an array? Use random access on a key such as a string? Hash
More informationData Structures and Algorithm. Xiaoqing Zheng
Data Structures and Algorithm Xiaoqing Zheng zhengxq@fudan.edu.cn Dictionary problem Dictionary T holding n records: x records key[x] Other fields containing satellite data Operations on T: INSERT(T, x)
More informationHash Tables. Given a set of possible keys U, such that U = u and a table of m entries, a Hash function h is a
Hash Tables Given a set of possible keys U, such that U = u and a table of m entries, a Hash function h is a mapping from U to M = {1,..., m}. A collision occurs when two hashed elements have h(x) =h(y).
More informationHashing, Hash Functions. Lecture 7
Hashing, Hash Functions Lecture 7 Symbol-table problem Symbol table T holding n records: x record key[x] Other fields containing satellite data Operations on T: INSERT(T, x) DELETE(T, x) SEARCH(T, k) How
More informationSymbol-table problem. Hashing. Direct-access table. Hash functions. CS Spring Symbol table T holding n records: record.
CS 5633 -- Spring 25 Symbol-table problem Hashing Carola Wenk Slides courtesy of Charles Leiserson with small changes by Carola Wenk CS 5633 Analysis of Algorithms 1 Symbol table holding n records: record
More informationHash tables. Hash tables
Dictionary Definition A dictionary is a data-structure that stores a set of elements where each element has a unique key, and supports the following operations: Search(S, k) Return the element whose key
More informationHash tables. Hash tables
Dictionary Definition A dictionary is a data-structure that stores a set of elements where each element has a unique key, and supports the following operations: Search(S, k) Return the element whose key
More informationCollision. Kuan-Yu Chen ( 陳冠宇 ) TR-212, NTUST
Collision Kuan-Yu Chen ( 陳冠宇 ) 2018/12/17 @ TR-212, NTUST Review Hash table is a data structure in which keys are mapped to array positions by a hash function When two or more keys map to the same memory
More informationHashing Data Structures. Ananda Gunawardena
Hashing 15-121 Data Structures Ananda Gunawardena Hashing Why do we need hashing? Many applications deal with lots of data Search engines and web pages There are myriad look ups. The look ups are time
More informationINTRODUCTION TO HASHING Dr. Thomas Hicks Trinity University. Data Set - SSN's from UTSA Class
Dr. Thomas E. Hicks Data Abstractions Homework - Hashing -1 - INTRODUCTION TO HASHING Dr. Thomas Hicks Trinity University Data Set - SSN's from UTSA Class 467 13 3881 498 66 2055 450 27 3804 456 49 5261
More informationLecture: Analysis of Algorithms (CS )
Lecture: Analysis of Algorithms (CS483-001) Amarda Shehu Spring 2017 1 Outline of Today s Class 2 Choosing Hash Functions Universal Universality Theorem Constructing a Set of Universal Hash Functions Perfect
More informationHash tables. Hash tables
Basic Probability Theory Two events A, B are independent if Conditional probability: Pr[A B] = Pr[A] Pr[B] Pr[A B] = Pr[A B] Pr[B] The expectation of a (discrete) random variable X is E[X ] = k k Pr[X
More informationCS 591, Lecture 6 Data Analytics: Theory and Applications Boston University
CS 591, Lecture 6 Data Analytics: Theory and Applications Boston University Babis Tsourakakis February 8th, 2017 Universal hash family Notation: Universe U = {0,..., u 1}, index space M = {0,..., m 1},
More informationHashing. Dictionaries Hashing with chaining Hash functions Linear Probing
Hashing Dictionaries Hashing with chaining Hash functions Linear Probing Hashing Dictionaries Hashing with chaining Hash functions Linear Probing Dictionaries Dictionary: Maintain a dynamic set S. Every
More informationSo far we have implemented the search for a key by carefully choosing split-elements.
7.7 Hashing Dictionary: S. insert(x): Insert an element x. S. delete(x): Delete the element pointed to by x. S. search(k): Return a pointer to an element e with key[e] = k in S if it exists; otherwise
More informationA Lecture on Hashing. Aram-Alexandre Pooladian, Alexander Iannantuono March 22, Hashing. Direct Addressing. Operations - Simple
A Lecture on Hashing Aram-Alexandre Pooladian, Alexander Iannantuono March 22, 217 This is the scribing of a lecture given by Luc Devroye on the 17th of March 217 for Honours Algorithms and Data Structures
More informationCOMP251: Hashing. Jérôme Waldispühl School of Computer Science McGill University. Based on (Cormen et al., 2002)
COMP251: Hashing Jérôme Waldispühl School of Computer Science McGill University Based on (Cormen et al., 2002) Table S with n records x: Problem DefiniNon X Key[x] InformaNon or data associated with x
More informationHashing. Martin Babka. January 12, 2011
Hashing Martin Babka January 12, 2011 Hashing Hashing, Universal hashing, Perfect hashing Input data is uniformly distributed. A dynamic set is stored. Universal hashing Randomised algorithm uniform choice
More informationMotivation. Dictionaries. Direct Addressing. CSE 680 Prof. Roger Crawfis
Motivation Introduction to Algorithms Hash Tables CSE 680 Prof. Roger Crawfis Arrays provide an indirect way to access a set. Many times we need an association between two sets, or a set of keys and associated
More informationSearching, mainly via Hash tables
Data structures and algorithms Part 11 Searching, mainly via Hash tables Petr Felkel 26.1.2007 Topics Searching Hashing Hash function Resolving collisions Hashing with chaining Open addressing Linear Probing
More informationCSCB63 Winter Week10 - Lecture 2 - Hashing. Anna Bretscher. March 21, / 30
CSCB63 Winter 2019 Week10 - Lecture 2 - Hashing Anna Bretscher March 21, 2019 1 / 30 Today Hashing Open Addressing Hash functions Universal Hashing 2 / 30 Open Addressing Open Addressing. Each entry in
More information? 11.5 Perfect hashing. Exercises
11.5 Perfect hashing 77 Exercises 11.4-1 Consider inserting the keys 10; ; 31; 4; 15; 8; 17; 88; 59 into a hash table of length m 11 using open addressing with the auxiliary hash function h 0.k/ k. Illustrate
More informationCSE 502 Class 11 Part 2
CSE 502 Class 11 Part 2 Jeremy Buhler Steve Cole February 17 2015 Today: analysis of hashing 1 Constraints of Double Hashing How does using OA w/double hashing constrain our hash function design? Need
More information12 Hash Tables Introduction Chaining. Lecture 12: Hash Tables [Fa 10]
Calvin: There! I finished our secret code! Hobbes: Let s see. Calvin: I assigned each letter a totally random number, so the code will be hard to crack. For letter A, you write 3,004,572,688. B is 28,731,569½.
More informationIntroduction to Hash Tables
Introduction to Hash Tables Hash Functions A hash table represents a simple but efficient way of storing, finding, and removing elements. In general, a hash table is represented by an array of cells. In
More informationCache-Oblivious Hashing
Cache-Oblivious Hashing Zhewei Wei Hong Kong University of Science & Technology Joint work with Rasmus Pagh, Ke Yi and Qin Zhang Dictionary Problem Store a subset S of the Universe U. Lookup: Does x belong
More informations 1 if xπy and f(x) = f(y)
Algorithms Proessor John Rei Hash Function : A B ALG 4.2 Universal Hash Functions: CLR - Chapter 34 Auxillary Reading Selections: AHU-Data Section 4.7 BB Section 8.4.4 Handout: Carter & Wegman, "Universal
More informationData Structures and Algorithm. Xiaoqing Zheng
Data Structures and Algorithm Xiaoqing Zheng zhengxq@fudan.edu.cn MULTIPOP top[s] = 6 top[s] = 2 3 2 8 5 6 5 S MULTIPOP(S, x). while not STACK-EMPTY(S) and k 0 2. do POP(S) 3. k k MULTIPOP(S, 4) Analysis
More information6.854 Advanced Algorithms
6.854 Advanced Algorithms Homework Solutions Hashing Bashing. Solution:. O(log U ) for the first level and for each of the O(n) second level functions, giving a total of O(n log U ) 2. Suppose we are using
More informationAmortized analysis. Amortized analysis
In amortized analysis the goal is to bound the worst case time of a sequence of operations on a data-structure. If n operations take T (n) time (worst case), the amortized cost of an operation is T (n)/n.
More informationArray-based Hashtables
Array-based Hashtables For simplicity, we will assume that we only insert numeric keys into the hashtable hash(x) = x % B; where B is the number of 5 Implementation class Hashtable { int [B]; bool occupied[b];
More informationHashing. Hashing DESIGN & ANALYSIS OF ALGORITHM
Hashing Hashing Start with an array that holds the hash table. Use a hash function to take a key and map it to some index in the array. If the desired record is in the location given by the index, then
More informationHashing. Why Hashing? Applications of Hashing
12 Hashing Why Hashing? Hashing A Search algorithm is fast enough if its time performance is O(log 2 n) For 1 5 elements, it requires approx 17 operations But, such speed may not be applicable in real-world
More informationCS 473: Algorithms. Ruta Mehta. Spring University of Illinois, Urbana-Champaign. Ruta (UIUC) CS473 1 Spring / 32
CS 473: Algorithms Ruta Mehta University of Illinois, Urbana-Champaign Spring 2018 Ruta (UIUC) CS473 1 Spring 2018 1 / 32 CS 473: Algorithms, Spring 2018 Universal Hashing Lecture 10 Feb 15, 2018 Most
More information1 Hashing. 1.1 Perfect Hashing
1 Hashing Hashing is covered by undergraduate courses like Algo I. However, there is much more to say on this topic. Here, we focus on two selected topics: perfect hashing and cockoo hashing. In general,
More informationAlgorithm Analysis Divide and Conquer. Chung-Ang University, Jaesung Lee
Algorithm Analysis Divide and Conquer Chung-Ang University, Jaesung Lee Introduction 2 Divide and Conquer Paradigm 3 Advantages of Divide and Conquer Solving Difficult Problems Algorithm Efficiency Parallelism
More informationRound 5: Hashing. Tommi Junttila. Aalto University School of Science Department of Computer Science
Round 5: Hashing Tommi Junttila Aalto University School of Science Department of Computer Science CS-A1140 Data Structures and Algorithms Autumn 017 Tommi Junttila (Aalto University) Round 5 CS-A1140 /
More information1 Maintaining a Dictionary
15-451/651: Design & Analysis of Algorithms February 1, 2016 Lecture #7: Hashing last changed: January 29, 2016 Hashing is a great practical tool, with an interesting and subtle theory too. In addition
More informationProblem 1: (Chernoff Bounds via Negative Dependence - from MU Ex 5.15)
Problem 1: Chernoff Bounds via Negative Dependence - from MU Ex 5.15) While deriving lower bounds on the load of the maximum loaded bin when n balls are thrown in n bins, we saw the use of negative dependence.
More informationLecture 5: Hashing. David Woodruff Carnegie Mellon University
Lecture 5: Hashing David Woodruff Carnegie Mellon University Hashing Universal hashing Perfect hashing Maintaining a Dictionary Let U be a universe of keys U could be all strings of ASCII characters of
More informationCuckoo Hashing with a Stash: Alternative Analysis, Simple Hash Functions
1 / 29 Cuckoo Hashing with a Stash: Alternative Analysis, Simple Hash Functions Martin Aumüller, Martin Dietzfelbinger Technische Universität Ilmenau 2 / 29 Cuckoo Hashing Maintain a dynamic dictionary
More informationModule 1: Analyzing the Efficiency of Algorithms
Module 1: Analyzing the Efficiency of Algorithms Dr. Natarajan Meghanathan Professor of Computer Science Jackson State University Jackson, MS 39217 E-mail: natarajan.meghanathan@jsums.edu What is an Algorithm?
More informationTutorial 4. Dynamic Set: Amortized Analysis
Tutorial 4 Dynamic Set: Amortized Analysis Review Binary tree Complete binary tree Full binary tree 2-tree P115 Unlike common binary tree, the base case is not an empty tree, but a external node Heap Binary
More informationAnalysis of Algorithms I: Perfect Hashing
Analysis of Algorithms I: Perfect Hashing Xi Chen Columbia University Goal: Let U = {0, 1,..., p 1} be a huge universe set. Given a static subset V U of n keys (here static means we will never change the
More informationLecture Lecture 3 Tuesday Sep 09, 2014
CS 4: Advanced Algorithms Fall 04 Lecture Lecture 3 Tuesday Sep 09, 04 Prof. Jelani Nelson Scribe: Thibaut Horel Overview In the previous lecture we finished covering data structures for the predecessor
More informationcompare to comparison and pointer based sorting, binary trees
Admin Hashing Dictionaries Model Operations. makeset, insert, delete, find keys are integers in M = {1,..., m} (so assume machine word size, or unit time, is log m) can store in array of size M using power:
More informationAlgorithms lecture notes 1. Hashing, and Universal Hash functions
Algorithms lecture notes 1 Hashing, and Universal Hash functions Algorithms lecture notes 2 Can we maintain a dictionary with O(1) per operation? Not in the deterministic sense. But in expectation, yes.
More information1 Probability Review. CS 124 Section #8 Hashing, Skip Lists 3/20/17. Expectation (weighted average): the expectation of a random quantity X is:
CS 24 Section #8 Hashing, Skip Lists 3/20/7 Probability Review Expectation (weighted average): the expectation of a random quantity X is: x= x P (X = x) For each value x that X can take on, we look at
More informationExam 1. March 12th, CS525 - Midterm Exam Solutions
Name CWID Exam 1 March 12th, 2014 CS525 - Midterm Exam s Please leave this empty! 1 2 3 4 5 Sum Things that you are not allowed to use Personal notes Textbook Printed lecture notes Phone The exam is 90
More informationSolution suggestions for examination of Logic, Algorithms and Data Structures,
Department of VT12 Software Engineering and Managment DIT725 (TIG023) Göteborg University, Chalmers 24/5-12 Solution suggestions for examination of Logic, Algorithms and Data Structures, Date : April 26,
More informationB669 Sublinear Algorithms for Big Data
B669 Sublinear Algorithms for Big Data Qin Zhang 1-1 2-1 Part 1: Sublinear in Space The model and challenge The data stream model (Alon, Matias and Szegedy 1996) a n a 2 a 1 RAM CPU Why hard? Cannot store
More informationFundamental Algorithms
Fundamental Algorithms Chapter 5: Searching Michael Bader Winter 2014/15 Chapter 5: Searching, Winter 2014/15 1 Searching Definition (Search Problem) Input: a sequence or set A of n elements (objects)
More informationDatabases. DBMS Architecture: Hashing Techniques (RDBMS) and Inverted Indexes (IR)
Databases DBMS Architecture: Hashing Techniques (RDBMS) and Inverted Indexes (IR) References Hashing Techniques: Elmasri, 7th Ed. Chapter 16, section 8. Cormen, 3rd Ed. Chapter 11. Inverted indexing: Elmasri,
More informationAdvanced Algorithm Design: Hashing and Applications to Compact Data Representation
Advanced Algorithm Design: Hashing and Applications to Compact Data Representation Lectured by Prof. Moses Chariar Transcribed by John McSpedon Feb th, 20 Cucoo Hashing Recall from last lecture the dictionary
More information6.1 Occupancy Problem
15-859(M): Randomized Algorithms Lecturer: Anupam Gupta Topic: Occupancy Problems and Hashing Date: Sep 9 Scribe: Runting Shi 6.1 Occupancy Problem Bins and Balls Throw n balls into n bins at random. 1.
More informationAnother way of saying this is that amortized analysis guarantees the average case performance of each operation in the worst case.
Amortized Analysis: CLRS Chapter 17 Last revised: August 30, 2006 1 In amortized analysis we try to analyze the time required by a sequence of operations. There are many situations in which, even though
More informationSearching. Sorting. Lambdas
.. s Babes-Bolyai University arthur@cs.ubbcluj.ro Overview 1 2 3 Feedback for the course You can write feedback at academicinfo.ubbcluj.ro It is both important as well as anonymous Write both what you
More informationHashing. Dictionaries Chained Hashing Universal Hashing Static Dictionaries and Perfect Hashing. Philip Bille
Hashing Dictionaries Chained Hashing Universal Hashing Static Dictionaries and Perfect Hashing Philip Bille Hashing Dictionaries Chained Hashing Universal Hashing Static Dictionaries and Perfect Hashing
More informationHashing. Algorithm : Design & Analysis [09]
Hashig Algorithm : Desig & Aalysis [09] I the last class Implemetig Dictioary ADT Defiitio of red-black tree Black height Isertio ito a red-black tree Deletio from a red-black tree Hashig Hashig Collisio
More informationPremaster Course Algorithms 1 Chapter 3: Elementary Data Structures
Premaster Course Algorithms 1 Chapter 3: Elementary Data Structures Christian Scheideler SS 2018 23.04.2018 Chapter 3 1 Overview Basic data structures Search structures (successor searching) Dictionaries
More informationHashing. Hashing. Dictionaries. Dictionaries. Dictionaries Chained Hashing Universal Hashing Static Dictionaries and Perfect Hashing
Philip Bille Dictionaries Dictionary problem. Maintain a set S U = {,..., u-} supporting lookup(x): return true if x S and false otherwise. insert(x): set S = S {x} delete(x): set S = S - {x} Dictionaries
More informationN/4 + N/2 + N = 2N 2.
CS61B Summer 2006 Instructor: Erin Korber Lecture 24, 7 Aug. 1 Amortized Analysis For some of the data structures we ve discussed (namely hash tables and splay trees), it was claimed that the average time
More informationData Structures and Algorithms Winter Semester
Page 0 German University in Cairo December 26, 2015 Media Engineering and Technology Faculty Prof. Dr. Slim Abdennadher Dr. Wael Abouelsadaat Data Structures and Algorithms Winter Semester 2015-2016 Final
More informationCSE548, AMS542: Analysis of Algorithms, Fall 2017 Date: Oct 26. Homework #2. ( Due: Nov 8 )
CSE548, AMS542: Analysis of Algorithms, Fall 2017 Date: Oct 26 Homework #2 ( Due: Nov 8 ) Task 1. [ 80 Points ] Average Case Analysis of Median-of-3 Quicksort Consider the median-of-3 quicksort algorithm
More informationAdvanced Data Structures
Simon Gog gog@kit.edu - Simon Gog: KIT University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association www.kit.edu Predecessor data structures We want to support
More informationCSCB63 Winter Week 11 Bloom Filters. Anna Bretscher. March 30, / 13
CSCB63 Winter 2019 Week 11 Bloom Filters Anna Bretscher March 30, 2019 1 / 13 Today Bloom Filters Definition Expected Complexity Applications 2 / 13 Bloom Filters (Specification) A bloom filter is a probabilistic
More informationProblem: Data base too big to fit memory Disk reads are slow. Example: 1,000,000 records on disk Binary search might take 20 disk reads
B Trees Problem: Data base too big to fit memory Disk reads are slow Example: 1,000,000 records on disk Binary search might take 20 disk reads Disk reads are done in blocks Example: One block read can
More informationAdvanced Data Structures
Simon Gog gog@kit.edu - Simon Gog: KIT The Research University in the Helmholtz Association www.kit.edu Predecessor data structures We want to support the following operations on a set of integers from
More informationAlgorithms for Data Science
Algorithms for Data Science CSOR W4246 Eleni Drinea Computer Science Department Columbia University Tuesday, December 1, 2015 Outline 1 Recap Balls and bins 2 On randomized algorithms 3 Saving space: hashing-based
More information2 How many distinct elements are in a stream?
Dealing with Massive Data January 31, 2011 Lecture 2: Distinct Element Counting Lecturer: Sergei Vassilvitskii Scribe:Ido Rosen & Yoonji Shin 1 Introduction We begin by defining the stream formally. Definition
More informationCS483 Design and Analysis of Algorithms
CS483 Design and Analysis of Algorithms Lectures 2-3 Algorithms with Numbers Instructor: Fei Li lifei@cs.gmu.edu with subject: CS483 Office hours: STII, Room 443, Friday 4:00pm - 6:00pm or by appointments
More informationLecture and notes by: Alessio Guerrieri and Wei Jin Bloom filters and Hashing
Bloom filters and Hashing 1 Introduction The Bloom filter, conceived by Burton H. Bloom in 1970, is a space-efficient probabilistic data structure that is used to test whether an element is a member of
More informationCS 125 Section #12 (More) Probability and Randomized Algorithms 11/24/14. For random numbers X which only take on nonnegative integer values, E(X) =
CS 125 Section #12 (More) Probability and Randomized Algorithms 11/24/14 1 Probability First, recall a couple useful facts from last time about probability: Linearity of expectation: E(aX + by ) = ae(x)
More informationProblem. Maintain the above set so as to support certain
Randomized Data Structures Roussos Ioannis Fundamental Data-structuring Problem Collection {S 1,S 2, } of sets of items. Each item i is an arbitrary record, indexed by a key k(i). Each k(i) is drawn from
More informationCS1802 Week 11: Algorithms, Sums, Series, Induction
CS180 Discrete Structures Recitation Fall 017 Nov 11 - November 17, 017 CS180 Week 11: Algorithms, Sums, Series, Induction 1 Markov chain i. Boston has days which are either sunny or rainy and can be modeled
More informationLecture 4: Divide and Conquer: van Emde Boas Trees
Lecture 4: Divide and Conquer: van Emde Boas Trees Series of Improved Data Structures Insert, Successor Delete Space This lecture is based on personal communication with Michael Bender, 001. Goal We want
More informationHash Tables (Cont'd) Carlos Moreno uwaterloo.ca EIT https://ece.uwaterloo.ca/~cmoreno/ece250
(Cont'd) Carlos Moreno cmoreno @ uwaterloo.ca EIT-4103 https://ece.uwaterloo.ca/~cmoreno/ece250 Standard reminder to set phones to silent/vibrate mode, please! Today's class: Investigate the other important
More informationCuckoo Hashing and Cuckoo Filters
Cuckoo Hashing and Cuckoo Filters Noah Fleming May 7, 208 Preliminaries A dictionary is an abstract data type that stores a collection of elements, located by their key. It supports operations: insert,
More informationLecture 4 February 2nd, 2017
CS 224: Advanced Algorithms Spring 2017 Prof. Jelani Nelson Lecture 4 February 2nd, 2017 Scribe: Rohil Prasad 1 Overview In the last lecture we covered topics in hashing, including load balancing, k-wise
More informationModule 9: Tries and String Matching
Module 9: Tries and String Matching CS 240 - Data Structures and Data Management Sajed Haque Veronika Irvine Taylor Smith Based on lecture notes by many previous cs240 instructors David R. Cheriton School
More informationData Structure. Mohsen Arab. January 13, Yazd University. Mohsen Arab (Yazd University ) Data Structure January 13, / 86
Data Structure Mohsen Arab Yazd University January 13, 2015 Mohsen Arab (Yazd University ) Data Structure January 13, 2015 1 / 86 Table of Content Binary Search Tree Treaps Skip Lists Hash Tables Mohsen
More informationCharles University in Prague Faculty of Mathematics and Physics MASTER THESIS. Martin Babka. Properties of Universal Hashing
Charles University in Prague Faculty of Mathematics and Physics MASTER THESIS Martin Babka Properties of Universal Hashing Department of Theoretical Computer Science and Mathematical Logic Supervisor:
More informationCS 5321: Advanced Algorithms Amortized Analysis of Data Structures. Motivations. Motivation cont d
CS 5321: Advanced Algorithms Amortized Analysis of Data Structures Ali Ebnenasir Department of Computer Science Michigan Technological University Motivations Why amortized analysis and when? Suppose you
More informationApplication: Bucket Sort
5.2.2. Application: Bucket Sort Bucket sort breaks the log) lower bound for standard comparison-based sorting, under certain assumptions on the input We want to sort a set of =2 integers chosen I+U@R from
More informationSPATIAL INDEXING. Vaibhav Bajpai
SPATIAL INDEXING Vaibhav Bajpai Contents Overview Problem with B+ Trees in Spatial Domain Requirements from a Spatial Indexing Structure Approaches SQL/MM Standard Current Issues Overview What is a Spatial
More information7 Dictionary. 7.1 Binary Search Trees. Binary Search Trees: Searching. 7.1 Binary Search Trees
7 Dictionary Dictionary: S.insert(): Insert an element. S.delete(): Delete the element pointed to by. 7.1 Binary Search Trees An (internal) binary search tree stores the elements in a binary tree. Each
More informationA General-Purpose Counting Filter: Making Every Bit Count. Prashant Pandey, Michael A. Bender, Rob Johnson, Rob Patro Stony Brook University, NY
A General-Purpose Counting Filter: Making Every Bit Count Prashant Pandey, Michael A. Bender, Rob Johnson, Rob Patro Stony Brook University, NY Approximate Membership Query (AMQ) insert(x) ismember(x)
More informationThe set of integers will be denoted by Z = {, -3, -2, -1, 0, 1, 2, 3, 4, }
Integers and Division 1 The Integers and Division This area of discrete mathematics belongs to the area of Number Theory. Some applications of the concepts in this section include generating pseudorandom
More informationAMORTIZED ANALYSIS. binary counter multipop stack dynamic table. Lecture slides by Kevin Wayne. Last updated on 1/24/17 11:31 AM
AMORTIZED ANALYSIS binary counter multipop stack dynamic table Lecture slides by Kevin Wayne http://www.cs.princeton.edu/~wayne/kleinberg-tardos Last updated on 1/24/17 11:31 AM Amortized analysis Worst-case
More information7 Dictionary. EADS c Ernst Mayr, Harald Räcke 109
7 Dictionary Dictionary: S.insert(x): Insert an element x. S.delete(x): Delete the element pointed to by x. S.search(k): Return a pointer to an element e with key[e] = k in S if it exists; otherwise return
More informationCS361 Homework #3 Solutions
CS6 Homework # Solutions. Suppose I have a hash table with 5 locations. I would like to know how many items I can store in it before it becomes fairly likely that I have a collision, i.e., that two items
More information6.830 Lecture 11. Recap 10/15/2018
6.830 Lecture 11 Recap 10/15/2018 Celebration of Knowledge 1.5h No phones, No laptops Bring your Student-ID The 5 things allowed on your desk Calculator allowed 4 pages (2 pages double sided) of your liking
More informationNotes on Logarithmic Lower Bounds in the Cell Probe Model
Notes on Logarithmic Lower Bounds in the Cell Probe Model Kevin Zatloukal November 10, 2010 1 Overview Paper is by Mihai Pâtraşcu and Erik Demaine. Both were at MIT at the time. (Mihai is now at AT&T Labs.)
More informationLecture 7: Amortized analysis
Lecture 7: Amortized analysis In many applications we want to minimize the time for a sequence of operations. To sum worst-case times for single operations can be overly pessimistic, since it ignores correlation
More informationENS Lyon Camp. Day 2. Basic group. Cartesian Tree. 26 October
ENS Lyon Camp. Day 2. Basic group. Cartesian Tree. 26 October Contents 1 Cartesian Tree. Definition. 1 2 Cartesian Tree. Construction 1 3 Cartesian Tree. Operations. 2 3.1 Split............................................
More information