Array-based Hashtables

Similar documents
Searching. Constant time access. Hash function. Use an array? Better hash function? Hash function 4/18/2013. Chapter 9

Hashing. Dictionaries Hashing with chaining Hash functions Linear Probing

Introduction to Hashtables

Hashing Data Structures. Ananda Gunawardena

So far we have implemented the search for a key by carefully choosing split-elements.

Advanced Implementations of Tables: Balanced Search Trees and Hashing

A Lecture on Hashing. Aram-Alexandre Pooladian, Alexander Iannantuono March 22, Hashing. Direct Addressing. Operations - Simple

Lecture: Analysis of Algorithms (CS )

Problem 1: (Chernoff Bounds via Negative Dependence - from MU Ex 5.15)

1 Probability Review. CS 124 Section #8 Hashing, Skip Lists 3/20/17. Expectation (weighted average): the expectation of a random quantity X is:

Lecture 5: Hashing. David Woodruff Carnegie Mellon University

Hash Tables. Direct-Address Tables Hash Functions Universal Hashing Chaining Open Addressing. CS 5633 Analysis of Algorithms Chapter 11: Slide 1

Motivation. Dictionaries. Direct Addressing. CSE 680 Prof. Roger Crawfis

1 Hashing. 1.1 Perfect Hashing

Hashing. Hashing DESIGN & ANALYSIS OF ALGORITHM

Fundamental Algorithms

A General-Purpose Counting Filter: Making Every Bit Count. Prashant Pandey, Michael A. Bender, Rob Johnson, Rob Patro Stony Brook University, NY

Hashing. Dictionaries Chained Hashing Universal Hashing Static Dictionaries and Perfect Hashing. Philip Bille

Hashing. Hashing. Dictionaries. Dictionaries. Dictionaries Chained Hashing Universal Hashing Static Dictionaries and Perfect Hashing

Hashing. Why Hashing? Applications of Hashing

Symbol-table problem. Hashing. Direct-access table. Hash functions. CS Spring Symbol table T holding n records: record.

Collision. Kuan-Yu Chen ( 陳冠宇 ) TR-212, NTUST

Algorithms for Data Science

Insert Sorted List Insert as the Last element (the First element?) Delete Chaining. 2 Slide courtesy of Dr. Sang-Eon Park

Data Structures and Algorithms Winter Semester

Hashing, Hash Functions. Lecture 7

CS 591, Lecture 6 Data Analytics: Theory and Applications Boston University

Hashing. Data organization in main memory or disk

Data Structures and Algorithm. Xiaoqing Zheng

Hash Tables (Cont'd) Carlos Moreno uwaterloo.ca EIT

INTRODUCTION TO HASHING Dr. Thomas Hicks Trinity University. Data Set - SSN's from UTSA Class

10 van Emde Boas Trees

Searching, mainly via Hash tables

Analysis of Algorithms I: Perfect Hashing

CSCB63 Winter Week10 - Lecture 2 - Hashing. Anna Bretscher. March 21, / 30

Solution suggestions for examination of Logic, Algorithms and Data Structures,

A linear equation in two variables is generally written as follows equation in three variables can be written as

Design and Analysis of Algorithms

Introduction to Hash Tables

12 Hash Tables Introduction Chaining. Lecture 12: Hash Tables [Fa 10]

Lecture 4 February 2nd, 2017

Hash tables. Hash tables

Advanced Data Structures

Hash tables. Hash tables

Hashing. Martin Babka. January 12, 2011

Algorithms lecture notes 1. Hashing, and Universal Hash functions

1 Maintaining a Dictionary

Fixed-Universe Successor : Van Emde Boas. Lecture 18

Lecture 24: Bloom Filters. Wednesday, June 2, 2010

Hash tables. Hash tables

Cuckoo Hashing and Cuckoo Filters

CS 125 Section #12 (More) Probability and Randomized Algorithms 11/24/14. For random numbers X which only take on nonnegative integer values, E(X) =

Round 5: Hashing. Tommi Junttila. Aalto University School of Science Department of Computer Science

COMP251: Hashing. Jérôme Waldispühl School of Computer Science McGill University. Based on (Cormen et al., 2002)

1 Difference between grad and undergrad algorithms

Advanced Data Structures

6.854 Advanced Algorithms

Hash Tables. Given a set of possible keys U, such that U = u and a table of m entries, a Hash function h is a

Advanced Data Structures

7 Dictionary. Ernst Mayr, Harald Räcke 126

Lecture 4: Divide and Conquer: van Emde Boas Trees

Cache-Oblivious Hashing

Lecture 4 Thursday Sep 11, 2014

N/4 + N/2 + N = 2N 2.

CSE 190, Great ideas in algorithms: Pairwise independent hash functions

CSCB63 Winter Week 11 Bloom Filters. Anna Bretscher. March 30, / 13

Outline Resource Introduction Usage Testing Exercises. Linked Lists COMP SCI / SFWR ENG 2S03. Department of Computing and Software McMaster University

Content. 1 Static dictionaries. 2 Integer data structures. 3 Graph data structures. 4 Dynamization and persistence. 5 Cache-oblivious structures

INF2220: algorithms and data structures Series 1

compare to comparison and pointer based sorting, binary trees

CSE 312, 2017 Winter, W.L.Ruzzo. 5. independence [ ]

8 Priority Queues. 8 Priority Queues. Prim s Minimum Spanning Tree Algorithm. Dijkstra s Shortest Path Algorithm

Databases. DBMS Architecture: Hashing Techniques (RDBMS) and Inverted Indexes (IR)

On the Impossibility of Batch Update for Cryptographic Accumulators. Philippe Camacho and Alejandro Hevia University of Chile

Design and Analysis of Algorithms March 12, 2015 Massachusetts Institute of Technology Profs. Erik Demaine, Srini Devadas, and Nancy Lynch Quiz 1

Hash Functions for Priority Queues

? 11.5 Perfect hashing. Exercises

Binary Search Trees. Motivation

7 Dictionary. Harald Räcke 125

Charles University in Prague Faculty of Mathematics and Physics MASTER THESIS. Martin Babka. Properties of Universal Hashing

Abstract Data Type (ADT) maintains a set of items, each with a key, subject to

Chapter 5: Quadratic Functions

Characteristics of Polynomials and their Graphs

Sections 8.1 & 8.2 Systems of Linear Equations in Two Variables

DATA MINING LECTURE 6. Similarity and Distance Sketching, Locality Sensitive Hashing

Bloom Filters, Adaptivity, and the Dictionary Problem

On Adaptive Integer Sorting

Constructors - Cont. must be distinguished by the number or type of their arguments.

CS-141 Exam 2 Review October 19, 2016 Presented by the RIT Computer Science Community

A. Incorrect! This inequality is a disjunction and has a solution set shaded outside the boundary points.

MEASUREMENT OF TEMPORAL RESOLUTION AND DETECTION EFFICIENCY OF X-RAY STREAK CAMERA BY SINGLE PHOTON IMAGES

Data Structures and Algorithms

CS361 Homework #3 Solutions

Succinct Data Structures for Approximating Convex Functions with Applications

University of New Mexico Department of Computer Science. Final Examination. CS 561 Data Structures and Algorithms Fall, 2013

Review 5 Symbolic Graphical Interplay Name 5.1 Key Features on Graphs Per Date

6.1 Occupancy Problem

Exam 1. March 12th, CS525 - Midterm Exam Solutions

Depending on equations

Application: Bucket Sort

Transcription:

Array-based Hashtables For simplicity, we will assume that we only insert numeric keys into the hashtable hash(x) = x % B; where B is the number of 5

Implementation class Hashtable { int [B]; bool occupied[b]; Insert(5) h=hash(5) = 5 % 9 = 8 0 2 4 5 6 8 6

Insert class Hashtable { int [B]; bool occupied[b]; Insert(5) h=hash(5) = 5 % 9 = 8 0 2 4 5 Insert() h=hash() = % 9 = 4 6 8 5

Insert class Hashtable { int [B]; bool occupied[b]; Insert(5) 0 2 h=hash(5) = 5 % 9 = 8 Insert() h=hash() = % 9 = 4 4 5 6 8 5 8

Insert class Hashtable { int [B]; bool occupied[b]; 0 2 insert(x) { h = hash(x); [h] = x; occupied[h] = true; 4 5 6 8 5 9

Insert class Hashtable { int [B]; bool occupied[b]; 0 2 Insert() h=hash() = % 9 = 4 4 5 6 Collision 8 5 20

Collision Resolution If h(x) is occupied, we try other We try h 0 x, h x, h 2 x, until an empty bucket is found h i (x) = hash(x) + f(i) where f is called the collision resolution function 2

Linear Probing f i = i Insert() h=hash() = % 9 = 4 h 0 =4 + 0 = 4 (Collision) h =4 + = 5 0 2 4 5 6 8 5 22

Linear Probing f i = i Insert() h=hash() = % 9 = 4 h 0 =4 + 0 = 4 (Collision) h =4 + = 5 0 2 4 5 6 8 5 2

Linear Probing f i = i Insert(20) h=hash(20) = 20 % 9 = 2 h 0 =2 + 0 = 2 0 2 4 5 6 8 5 24

Linear Probing f i = i Insert(20) h=hash(20) = 20 % 9 = 2 h 0 =2 + 0 = 2 0 2 20 4 5 6 8 5 25

Linear Probing f i = i Insert(26) h=hash(26) = 26 % 9 = 8 h 0 =8 + 0 = 8 (Collision) h =8 + = 0 0 2 20 4 5 6 8 5 26

Linear Probing f i = i Insert(26) h=hash(26) = 26 % 9 = 8 h 0 =8 + 0 = 8 (Collision) h =8 + = 0 0 26 2 20 4 5 6 8 5 2

Linear Probing f i = i Insert(40) h=hash(40) = 40 % 9 = 4 h 0 =4 + 0 = 4 (Collision) h =4 + = 5 (Collision) h 2 =4 + 2 = 6 0 26 2 20 4 5 6 8 5 28

Linear Probing f i = i Insert(40) h=hash(40) = 40 % 9 = 4 h 0 =4 + 0 = 4 (Collision) h =4 + = 5 (Collision) h 2 =4 + 2 = 6 0 26 2 20 4 5 6 40 8 5 29

Find Find(40) h=hash(40) = 40 % 9 = 4 h 0 =4 + 0 = 4 0 26 2 20? 4 5 6 40 8 5 0

Find Find(40) h=hash(40) = 40 % 9 = 4 h 0 =4 + 0 = 4 (No match) h =4 + = 5 (No match) 0 26 2 20 4? 5 6 40 8 5

Find Find(40) h=hash(40) = 40 % 9 = 4 h 0 =4 + 0 = 4 (No match) h =4 + = 5 (No match) h 2 =4 + 2 = 6 (Match) 0 26 2 20 4 5 6 40 8 5 2

Find Find(22) h=hash(22) = 22 % 9 = 4 h 0 =4 + 0 = 4 0 26 2 20? 4 5 6 40 8 5

Find Find(22) h=hash(22) = 22 % 9 = 4 h 0 =4 + 0 = 4 (No match) h =4 + = 5 (No match) 0 26 2 20 4? 5 6 40 8 5 4

Find Find(22) h=hash(22) = 22 % 9 = 4 h 0 =4 + 0 = 4 (No match) h =4 + = 5 (No match) h 2 =4 + 2 = 6 (No match) 0 26 2 20 4 5? 6 40 8 5 5

Find Find(22) h=hash(22) = 22 % 9 = 4 h 0 =4 + 0 = 4 (No match) h =4 + = 5 (No match) h 2 =4 + 2 = 6 (No match) h =4 + = (Empty) 0 26 2 20 4 5 6 40? 8 5 6

Find Find(22) h=hash(22) = 22 % 9 = 4 h 0 =4 + 0 = 4 (No match) h =4 + = 5 (No match) h 2 =4 + 2 = 6 (No match) h =4 + = (Empty) Return false (not found) 0 26 2 20 4 5 6 40? 8 5

Insert & Find void insert(x) { h = hash(x); while (occupied[h]) { h++; [h] = x; occupied[h] = true; Do you spot an error? bool find(x) { h = hash(x); while (occupied[h]) { if ([h] == x) return true; h++; return false; 8

Insert & Find void insert(x) { h = hash(x); while (occupied[h]) { h = (h+) % B; [h] = x; occupied[h] = true; bool find(x) { h = hash(x); while (occupied[h]) { if ([h] == x) return true; h = (h+) % B; return false; 9

Deletion bool delete(x) { h = hash(x); while (occupied[h]) { if ([h] == x) { occupied[h] = false; return true; h = (h+) % B; return false; 40

Delete Delete() h=hash() = % 9 = 4 h 0 =4 + 0 = 4 (Match) 0 26 Is this correct?? 2 20 4 5 6 40 8 5 4

Delete Find(40) h=hash(40) = 40 % 9 = 4 h 0 =4 + 0 = 4 (No match) return false!!! 0 26 2 20? 4 5 6 40 8 5 42

Revisit the Design class Hashtable { int [B]; enum {E, O, D status[b]; The status of each bucket is either Empty, Occupied, or Deleted Initially all are empty 4

Status Delete Delete() h=hash() = % 9 = 4 h 0 =4 + 0 = 4 (Match) 0 O 26 E 2 O 20 E? 4 O 5 O 6 O 40 E 8 O 5 44

Status Delete Delete() h=hash() = % 9 = 4 h 0 =4 + 0 = 4 (Match) Find(40) h=hash(40) = 40 % 9 = 4 h 0 =4 + 0 = 4 (No match) h =4 + = 5 (No match) h 2 =4 + 2 = 6 (Match) return true? 0 O 26 E 2 O 20 E 4 D 5 O 6 O 40 E 8 O 5 45

Insert, Find & Delete void insert(x) { h = hash(x); while (status[h]== O ) { h = (h+) % B; [h] = x; status[h] = O ; bool delete(x) { h = hash(x); while (status[h]!= E ) { if ([h] == x) { status[h] = D ; return true; h = (h+) % B; return false; bool find(x) { h = hash(x); while (status[h]!= E ) { if (status[h] == O && [h] == x) return true; h = (h+) % B; return false; 46

Primary Clustering A cluster (a block) of are all occupied Any key that hits the cluster will have to linearly probe until the end of the cluster 0 O 26 E 2 O 20 E 4 O Primary clustering 5 O 6 O 40 E 8 O 5 4

Status Quadratic Probing f i = i 2 Insert(5) 0 E E 2 E E 4 E 5 E 6 E E 8 E 48

Status Quadratic Probing f i = i 2 Insert(5) Insert(2) 0 E E 2 E E 4 E 5 E 6 O 5 E 8 E 49

Status Quadratic Probing f i = i 2 Insert(5) Insert(2) Insert(4) 0 E E 2 E E 4 E 5 O 2 6 O 5 E 8 E 50

Status Quadratic Probing f i = i 2 Insert(5) Insert(2) Insert(4) h 0 =5+0 (Occupied) h =5+ (Occupied) h 2 =5+4=0 (Empty) 0 E E 2 E E 4 E 5 O 2 6 O 5 E 8 E 5

Status Quadratic Probing f i = i 2 Insert(5) Insert(2) Insert(4) h 0 =5+0 (Occupied) h =5+ (Occupied) h 2 =5+4=0 (Empty) Insert() 0 O 4 E 2 E E 4 E 5 O 2 6 O 5 E 8 E 52

Status Quadratic Probing f i = i 2 Insert(5) Insert(2) Insert(4) h 0 =5+0 (Occupied) h =5+ (Occupied) h 2 =5+4=0 (Empty) Insert() h 0 =6+0 (Occupied) h =6+ (Empty) 0 O 4 E 2 E E 4 E 5 O 2 6 O 5 E 8 E 5

Status Quadratic Probing f i = i 2 Insert(4) h 0 =5+0 (Occupied) h =5+ (Occupied) h 2 =5+4=0 (Occupied) h =5+9=5 (Occupied) h 4 =5+6= (Empty) 0 O 4 E 2 E O 4 4 E 5 O 2 6 O 5 O 8 E 54

Status Secondary Clustering For two distinct keys x and x 2, if h(x ) = h(x 2 ), then h i x = h i x 2, i In other words, if two keys start at the same position, they will probe the same sequence of 0 O 4 E 2 E O 4 4 E 5 O 2 6 O 5 O 8 E 55

Double Hashing f i = i hash 2 x Where hash 2 is an alternative hash function Double hashing can eliminate both primary and secondary clustering 56