
Algorithms, Professor John Reif
ALG 4.2 Universal Hash Functions: CLR, Chapter 34

Auxiliary Reading Selections: AHU-Data, Section 4.7; BB, Section 8.4.4
Handout: Carter & Wegman, "Universal Classes of Hash Functions", JCSS, Vol. 18, pp. 143-154, 1979.

Hash function $f : A \to B$, mapping keys to indices.
$f$ has a conflict at $x, y \in A$ if $x \ne y$ but $f(x) = f(y)$.

$$\sigma_f(x,y) = \begin{cases} 1 & \text{if } x \ne y \text{ and } f(x) = f(y) \\ 0 & \text{else} \end{cases}$$
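The conflict indicator defined above can be written directly as a small function. This is only an illustrative sketch: the name `delta` and the example hash function `f` (reduction mod 4) are assumptions made for the example, not part of the lecture.

```python
def delta(f, x, y):
    """Conflict indicator: 1 if x and y are distinct keys with f(x) == f(y), else 0."""
    return 1 if x != y and f(x) == f(y) else 0


# Hypothetical hash function mapping integer keys onto indices 0..3.
f = lambda x: x % 4

print(delta(f, 1, 5))  # 1 and 5 both map to index 1
print(delta(f, 1, 2))  # different indices
print(delta(f, 3, 3))  # a key never conflicts with itself
```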

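If $H$ is a set of hash functions, $\sigma_H(x,y) = \sum_{f \in H} \sigma_f(x,y)$.
For a set of keys $S$, $\sigma_H(x,S) = \sum_{f \in H} \sum_{y \in S} \sigma_f(x,y)$.

Counting conflicts for $f \in H$, with $a$ keys spread evenly over $b$ indices:
keys per index: $a/b$
conflicts per index: $\frac{a}{b}\left(\frac{a}{b} - 1\right)$
total conflicts: $b\left[\frac{a}{b}\left(\frac{a}{b} - 1\right)\right] = a^2\left(\frac{1}{b} - \frac{1}{a}\right)$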
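The counting argument says that a function mapping $a$ keys to $b$ indices has at least $a^2(1/b - 1/a)$ conflicts, counted as ordered pairs. A brute-force check over every function for small, arbitrarily chosen sizes (here $a = 5$, $b = 3$, an assumption for illustration) might look like this:

```python
from fractions import Fraction
from itertools import product

a, b = 5, 3  # illustrative sizes: 5 keys, 3 indices


def total_conflicts(t):
    """Ordered pairs (x, y), x != y, that collide under f, where f(x) = t[x]."""
    return sum(1 for x in range(a) for y in range(a)
               if x != y and t[x] == t[y])


# Enumerate all b**a functions from keys to indices as tuples of length a.
least = min(total_conflicts(t) for t in product(range(b), repeat=a))
bound = Fraction(a * a, b) - a  # a^2 (1/b - 1/a) = a^2/b - a

print(least, bound)
```

The most even split of 5 keys over 3 indices is 2, 2, 1, which gives 2 + 2 + 0 = 4 ordered conflicting pairs, just above the bound of 10/3.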

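$H$ is a universal$_2$ set of hash functions if $\sigma_H(x,y) \le \frac{|H|}{|B|}$ for all $x, y \in A$, i.e. no pair of keys $x, y$ is mapped into the same index by more than $\frac{1}{|B|}$ of all functions in $H$.

Proposition 1. Given any set $H$ of hash functions, $\exists\, x, y \in A$ s.t.
$$\sigma_H(x,y) > \frac{|H|}{|B|} - \frac{|H|}{|A|}.$$

Proof. Let $a = |A|$, $b = |B|$. By counting, we can show that for each $f \in H$
$$\sigma_f(A,A) \ge b\left[\frac{a}{b}\left(\frac{a}{b} - 1\right)\right] = a^2\left(\frac{1}{b} - \frac{1}{a}\right).$$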

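Thus $\sigma_H(A,A) \ge a^2 |H| \left(\frac{1}{b} - \frac{1}{a}\right)$.
By the pigeonhole principle, $\exists\, x, y \in A$ s.t. $\sigma_H(x,y) > |H| \left(\frac{1}{b} - \frac{1}{a}\right)$.

Note: in most applications $|A| \gg |B|$, and then any universal$_2$ class has asymptotically a minimum number of conflicts.

Proposition 2. Let $x \in A$, $S \subset A$. For $f$ chosen randomly from a universal$_2$ class $H$ of hash functions, the expected number of collisions is
$$E(\sigma_f(x,S)) \le \frac{|S|}{|B|}.$$

Proof.
$$E(\sigma_f(x,S)) = \frac{1}{|H|} \sum_{f \in H} \sigma_f(x,S) = \frac{1}{|H|} \sum_{y \in S} \sigma_H(x,y) \quad \text{(by definition)}$$
$$\le \frac{1}{|H|} \sum_{y \in S} \frac{|H|}{|B|} = \frac{|S|}{|B|} \quad \text{(by definition of universal}_2\text{)}$$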
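Proposition 2 can be checked exactly on a tiny instance. The sketch below assumes, for illustration only, that $H$ is the class of all functions from 4 keys to 3 indices (a class in which each pair of distinct keys collides under exactly $1/|B|$ of the functions, so it is universal$_2$); the sizes and the names `delta_set`, `expected`, and `bound` are not from the lecture.

```python
from fractions import Fraction
from itertools import product

A = range(4)  # keys 0..3 (illustrative sizes)
B = range(3)  # indices 0..2

# Every function A -> B, stored as a tuple t with f(x) = t[x].
H = list(product(B, repeat=len(A)))


def delta_set(t, x, S):
    """Number of keys y in S, y != x, that collide with x under f = t."""
    return sum(1 for y in S if y != x and t[x] == t[y])


x, S = 0, [1, 2, 3]
# Average over a uniformly random choice of f from H, computed exactly.
expected = Fraction(sum(delta_set(t, x, S) for t in H), len(H))
bound = Fraction(len(S), len(B))

print(expected, bound)
```

For this class the expectation meets the bound $|S|/|B|$ with equality.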

Application: associative memory

Storage of $|S|$ keys onto $|B|$ linked lists. Given key $x \in A$, store $x$ in list $f(x)$.
Proposition 2 implies each list has expected length $\le \frac{|S|}{|B|} = O(1)$ if $|B| \ge |S|$.
This gives $O(1)$ expected time for STORE, RETRIEVE, and DELETE operations.

Proposition 3. Let $R$ be a sequence of requests with $k$ insertion operations into an associative memory. If $f$ is chosen at random from a universal$_2$ class $H$, the expected total cost of all searches is $\le |R| \left(1 + \frac{k}{|B|}\right)$.

Proof. There are $|R|$ total search operations, and each takes, by Proposition 2, expected time $\le 1 + \frac{k}{|B|}$.

Note: if $|B| \ge k$, then the expected total time is $O(|R|)$.
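A minimal sketch of the associative memory described above, with one list per index and the three operations STORE, RETRIEVE, and DELETE. The class name `ChainedTable` and the example hash function (reduction mod 4) are assumptions for illustration; in the setting of Proposition 3 the function would be drawn at random from a universal$_2$ class.

```python
class ChainedTable:
    """Associative memory: key x is kept in the list at index f(x)."""

    def __init__(self, b, f):
        self.f = f                            # hash function into 0..b-1
        self.lists = [[] for _ in range(b)]   # one chain per index

    def store(self, x):
        chain = self.lists[self.f(x)]
        if x not in chain:                    # search the chain, then insert
            chain.append(x)

    def retrieve(self, x):
        return x in self.lists[self.f(x)]

    def delete(self, x):
        chain = self.lists[self.f(x)]
        if x in chain:
            chain.remove(x)


# Hypothetical usage with a fixed hash function onto 4 indices.
table = ChainedTable(4, lambda x: x % 4)
for key in (1, 5, 2):
    table.store(key)
print(table.retrieve(5), table.retrieve(9))
```

Each operation scans a single chain, so its cost is one plus the number of keys that collide with $x$, which is the quantity Proposition 2 bounds.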

Bounds on the distribution of $\sigma_f(x,S)$

Proposition 4. Let $x \in A$, $S \subseteq A$. Let $\mu$ = expected value of $\sigma_f(x,S)$. For $f$ chosen randomly from a universal$_2$ set of functions $H$,
$$\mathrm{Prob}\left(\sigma_f(x,S) > t\mu\right) < \frac{1}{t}.$$

Proof. Immediate from the Markov bound.

Improved bounds on this probability, with decay on the order of $1/t^4$, hold for the universal hash function classes $H_2$, $H_3$ (using the 2nd and 4th moments of the probability distribution).

Let $H$ = universal$_2$ set of hash functions.
$E_1$ = expected cost of a random set of $k$ requests using a worst-case function in $H$ (random input).
$E_2$ = expected cost of a worst-case set of $k$ requests using a random function in $H$ (randomized algorithm).

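Proposition 5. $E_1 \ge (1 - \varepsilon) E_2$, where $\varepsilon = \frac{|B|}{|A|}$.

Proof. Let $a = |A|$, $b = |B|$. Proposition 2 implies $E_2 \le 1 + \frac{|S|}{b}$.
Suppose $S$ is chosen randomly. For $x, y \in S$,
$$E(\sigma_f(x,y)) = \frac{1}{a^2} \sigma_f(A,A) \ge \frac{1}{a^2} \cdot a^2 \left(\frac{1}{b} - \frac{1}{a}\right) = \frac{1}{b} - \frac{1}{a} \quad \text{(by Prop. 1)}$$
So
$$E_1 \ge 1 + E(\sigma_f(x,S)) \ge 1 + |S| \left(\frac{1}{b} - \frac{1}{a}\right).$$
Since $\frac{1}{b} - \frac{1}{a} = \frac{1 - \varepsilon}{b}$, this gives $E_1 \ge 1 + (1 - \varepsilon)\frac{|S|}{b} \ge (1 - \varepsilon)\left(1 + \frac{|S|}{b}\right) \ge (1 - \varepsilon) E_2$.

Example of a Universal$_2$ Class

Set of keys: $A = \{0, 1, \ldots, a-1\}$.
Table: $B = \{0, 1, \ldots, b-1\}$.
Let $p$ be a prime $\ge a$, and $Z_p = \{0, 1, \ldots, p-1\}$ = number field mod $p$.
Define $g : Z_p \to B$ s.t. $g(x) = x \bmod b$.
Define, for $m, n \in Z_p$ with $m \ne 0$, $h_{m,n} : A \to Z_p$ with $h_{m,n}(x) = (mx + n) \bmod p$.
Define $f_{m,n} : A \to B$ s.t. $f_{m,n}(x) = g(h_{m,n}(x))$.
$H_1 = \{ f_{m,n} \mid m, n \in Z_p,\ m \ne 0 \}$.

Claim: $H_1$ is universal$_2$.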
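The class $H_1$ translates almost line by line into code. The following is a sketch under stated assumptions: the helper names `make_f` and `random_f` are hypothetical, and the caller is assumed to supply a prime `p` at least as large as the number of keys (primality is not checked here).

```python
import random


def make_f(m, n, p, b):
    """Return f_{m,n}(x) = ((m*x + n) mod p) mod b, requiring m != 0."""
    if not (0 < m < p and 0 <= n < p):
        raise ValueError("need 0 < m < p and 0 <= n < p")
    return lambda x: ((m * x + n) % p) % b


def random_f(p, b, rng=random):
    """Pick one member of the class uniformly at random (p assumed prime)."""
    m = rng.randrange(1, p)  # m ranges over 1..p-1
    n = rng.randrange(p)     # n ranges over 0..p-1
    return make_f(m, n, p, b)


# Hypothetical parameters: prime p = 17, table of b = 5 indices.
f = make_f(3, 4, p=17, b=5)
print([f(x) for x in range(17)])
```

Choosing `m` and `n` once, at table-creation time, is what makes the scheme a randomized algorithm: the input keys are arbitrary, and only the choice of function is random.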

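Proof of the claim.

Lemma. For distinct $x, y \in A$, $\sigma_{H_1}(x,y) = \sigma_g(Z_p, Z_p)$, where
$$\sigma_g(Z_p, Z_p) = \left| \{ (r,s) \mid r, s \in Z_p,\ r \ne s,\ g(r) = g(s) \} \right|.$$

Proof. Observe that the linear equations
$$xm + n \equiv r \pmod p$$
$$ym + n \equiv s \pmod p$$
have unique solutions $(m, n)$ in $Z_p$. So $(r,s) = (h_{m,n}(x), h_{m,n}(y))$, and then $f_{m,n}(x) = f_{m,n}(y)$ if and only if $g(r) = g(s)$. Hence $\sigma_{H_1}(x,y)$ is the number of such pairs $(r,s)$ counted in $\sigma_g(Z_p, Z_p)$.

Theorem. $H_1$ is universal$_2$.

Proof. Let $n_i = |\{ t \in Z_p \mid g(t) = i \}|$. By the definition $g(x) = x \bmod b$,
$$n_i \le \frac{p-1}{b} + 1.$$
For any given $r$, the number of $s$ where $s \ne r$ and $g(r) = g(s)$ is
$$\sigma_g(r, Z_p) \le \frac{p-1}{b}.$$
But there are $p$ choices of $r$, so
$$\frac{p(p-1)}{b} \ge \sigma_g(Z_p, Z_p) = \sigma_{H_1}(x,y) \quad \text{(by the Lemma)}.$$
(Also note $\sigma_{H_1}(x,x) = 0$.)
Hence $\sigma_{H_1}(x,y) \le \frac{|H_1|}{b}$, since $|H_1| = p(p-1)$, so $H_1$ is universal$_2$.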
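The lemma and theorem can be confirmed by exhaustive counting for a small prime. The sketch below assumes $p = 7$ and $b = 3$ (arbitrary illustrative values) and takes the key set to be all of $Z_p$; it counts, for every pair of distinct keys, how many of the $p(p-1)$ functions $((mx + n) \bmod p) \bmod b$ with $m \ne 0$ make them collide.

```python
from itertools import combinations

p, b = 7, 3  # illustrative: prime p = 7, table of 3 indices

# All (m, n) pairs with m != 0: the p*(p-1) members of the class.
H1 = [(m, n) for m in range(1, p) for n in range(p)]


def f(m, n, x):
    return ((m * x + n) % p) % b


# For each pair of distinct keys, count the functions under which they collide.
counts = {
    (x, y): sum(1 for m, n in H1 if f(m, n, x) == f(m, n, y))
    for x, y in combinations(range(p), 2)
}
worst = max(counts.values())

print(worst, len(H1) / b)
```

Every pair collides under the same number of functions, as the lemma predicts: with residue classes of sizes 3, 2, 2 mod 3, there are 3·2 + 2·1 + 2·1 = 10 ordered pairs $(r,s)$, $r \ne s$, with $r \bmod 3 = s \bmod 3$, which is below $|H_1|/b = 14$.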

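Universal Hash Functions on Long Keys

Given a class of hash functions $H$, define hash functions $J = \{ h_{f,g} \mid f, g \in H \}$, where
$$h_{f,g}(x_1, x_2) = f(x_1) \oplus g(x_2) \qquad (\oplus = \text{exclusive or}).$$

Theorem. Suppose $B = \{0, 1, \ldots, b-1\}$ where $b$ is a power of 2. Suppose that for this class $H$ of functions $A \to B$, $\exists$ real $r$ such that $\forall i \in B$, $\forall x, y \in A$ with $x \ne y$,
$$\left| \{ f \in H \mid f(x) \oplus f(y) = i \} \right| \le r |H|.$$
Then $\forall x, y \in A \times A$ with $x \ne y$,
$$\left| \{ h \in J \mid h(x) \oplus h(y) = i \} \right| \le r |J|.$$

Proof. For $x = (x_1, x_2)$, $y = (y_1, y_2)$ in $A \times A$ and $i \in B$, take $x_1 \ne y_1$ (if $x_1 = y_1$ then $x_2 \ne y_2$, and the roles of $f$ and $g$ are swapped). Then
$$\left| \{ h \in J \mid h(x) \oplus h(y) = i \} \right| = \left| \{ f, g \in H \mid f(x_1) \oplus g(x_2) \oplus f(y_1) \oplus g(y_2) = i \} \right|$$
$$= \sum_{g \in H} \left| \{ f \in H \mid f(x_1) \oplus f(y_1) = i \oplus g(x_2) \oplus g(y_2) \} \right| \le \sum_{g \in H} r |H| = r |H|^2 = r |J|.$$

Example: $H$ with $m = 0$ gives $J$ with $r = \frac{1}{|B|}$, which is universal.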
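To see the long-key construction at work, the sketch below builds $J$ from a small base class and checks the theorem's bound by enumeration. It assumes, purely for illustration, that $H$ is the class of all functions from 2 keys to 4 indices, for which $r = 1/4$; the names `H`, `J`, `h`, and `worst` are not from the lecture.

```python
from itertools import product

A = range(2)  # short keys (illustrative)
B = range(4)  # indices 0..3, so b = 4 is a power of 2

# Base class: every function A -> B as a tuple t with f(x) = t[x].
H = list(product(B, repeat=len(A)))
# Combined class: one function per ordered pair (f, g).
J = list(product(H, H))
long_keys = list(product(A, A))  # keys of the form (x1, x2)


def h(fg, x):
    """h_{f,g}(x1, x2) = f(x1) XOR g(x2)."""
    f, g = fg
    return f[x[0]] ^ g[x[1]]


# Largest number of h in J with h(x) XOR h(y) == i, over all x != y and all i.
worst = max(
    sum(1 for fg in J if h(fg, x) ^ h(fg, y) == i)
    for x in long_keys for y in long_keys if x != y
    for i in B
)

print(worst, len(J) // len(B))
```

With $r = 1/4$ the bound $r|J|$ is 64, and the enumeration attains it exactly, so $J$ inherits the same $r$ as $H$ while hashing keys twice as long.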

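Universal$_2$ Hashing without Multiplication

$A$ = set of $d$-digit numbers base $\alpha$, so $|A| = \alpha^d$.
$B$ = set of binary numbers of length $j$.
$M$ = arrays of length $d \cdot \alpha$, with elements in $B$.
$\forall m \in M$, let $m(k)$ = $k$th element of array $m$.
$\forall x \in A$, let $x_k$ = $k$th digit of $x$ base $\alpha$.

Definition.
$$f_m(x) = m(x_1 + 1) \oplus m(x_1 + x_2 + 2) \oplus \cdots \oplus m\left(\sum_{k=1}^{d} x_k + d\right)$$

Theorem. $H_2 = \{ f_m \mid m \in M \}$ is universal$_2$.

Proof. For $x, y \in A$, let
$$f_m(x) = r_1 \oplus r_2 \oplus \cdots \oplus r_s \qquad (r_i = \text{rows of } m)$$
$$f_m(y) = r_{s+1} \oplus \cdots \oplus r_t$$
Then $f_m(x) = f_m(y)$ iff $r_1 \oplus \cdots \oplus r_t = 0$.
But if $x \ne y$, then $\exists k$ s.t. $r_k$ appears in only one of $f_m(x)$, $f_m(y)$, so $f_m(x) = f_m(y)$ iff $r_k = \bigoplus_{i \ne k} r_i$.
But there are only $|B|$ possibilities for row $r_k$, so $x, y$ will collide for $\frac{1}{|B|}$ of the functions $f_m \in H_2$.
Hence $H_2$ is universal$_2$.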
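The table-lookup hash needs only additions, array reads, and XORs. The sketch below is one possible reading of the definition: the key is passed as a sequence of its $d$ digits (which end of the number counts as the first digit is an assumption), the array is stored 0-based, and the names `make_table` and `f_m` are hypothetical.

```python
import random


def make_table(d, alpha, j, rng):
    """Random array m of length d*alpha whose entries are j-bit numbers."""
    return [rng.getrandbits(j) for _ in range(d * alpha)]


def f_m(m, digits):
    """XOR of m(x_1 + 1), m(x_1 + x_2 + 2), ..., using 1-based positions."""
    acc = 0
    prefix = 0
    for i, x_k in enumerate(digits, start=1):
        prefix += x_k               # running sum x_1 + ... + x_i
        acc ^= m[prefix + i - 1]    # position prefix + i, shifted to 0-based
    return acc


# Hypothetical parameters: 3-digit keys in base 4, 8-bit hash values.
rng = random.Random(0)
m = make_table(d=3, alpha=4, j=8, rng=rng)
print(f_m(m, (1, 0, 2)), f_m(m, (3, 3, 3)))
```

The largest position used is $d(\alpha - 1) + d = d\alpha$, which is why an array of length $d \cdot \alpha$ suffices.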

Analysis of Hashing for a Uniform Random Hash Function

Hashing with Chaining: keep a list of conflicts at each index.

$$\text{load factor } \alpha = \frac{\#\text{ of keys hashed}}{\#\text{ of indices in hash table}}$$

The length of each conflict list is a binomial variable, with expected length $= \alpha$.
Expected time cost per hash $= O(1 + \alpha)$.
By Chernoff bounds, with high likelihood the time cost per hash is $\le O(\alpha \log(\#\text{ keys}))$.
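A quick simulation makes the chaining figures concrete. This sketch assumes an idealized uniform random hash (each key lands in an independently and uniformly chosen index), with arbitrary illustrative sizes of 1000 keys and 250 indices and a fixed seed for repeatability.

```python
import random

rng = random.Random(0)          # fixed seed so the run is repeatable
n_keys, n_slots = 1000, 250     # illustrative sizes

# Chain length at each index after hashing every key uniformly at random.
lengths = [0] * n_slots
for _ in range(n_keys):
    lengths[rng.randrange(n_slots)] += 1

alpha = n_keys / n_slots                # load factor
mean_length = sum(lengths) / n_slots    # average chain length
longest = max(lengths)                  # worst chain in this run

print(alpha, mean_length, longest)
```

The average chain length always equals the load factor; the longest chain is the quantity the Chernoff bound controls, and it is typically a modest multiple of $\alpha$.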

Open Address Hashing (with a Uniform Random Hash Function)

Resolve conflicts by applying another hash function.

$\alpha$ = load factor = probability of an occupied hash address.
The number of rehashes is a geometric variable.

$$\text{expected hash time} = \frac{1}{1 - \alpha} = 1 + \alpha + \alpha^2 + \cdots$$
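The closed form and the series for the expected number of probes can be compared numerically. The function names `expected_probes` and `series` below are hypothetical, and the load factors tried are arbitrary examples.

```python
def expected_probes(alpha):
    """Expected hash time 1 / (1 - alpha) for load factor 0 <= alpha < 1."""
    if not 0 <= alpha < 1:
        raise ValueError("load factor must satisfy 0 <= alpha < 1")
    return 1.0 / (1.0 - alpha)


def series(alpha, terms):
    """Partial sum 1 + alpha + alpha^2 + ... with the given number of terms."""
    return sum(alpha ** k for k in range(terms))


for alpha in (0.5, 0.9):
    print(alpha, expected_probes(alpha), series(alpha, 500))
```

The cost grows sharply as the table fills: a half-full table needs about 2 probes on average, while a table that is 90% full needs about 10.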