INTRODUCTION TO HASHING Dr. Thomas Hicks Trinity University. Data Set - SSN's from UTSA Class
Dr. Thomas E. Hicks
Data Abstractions Homework - Hashing

Need Faster Search Techniques

We need faster searching of arrays for key words. We need faster searching of files for social security numbers. Wouldn't it be nice to find a keyword or social security number in just one look? O(1)?

Black-Box Address Calculator

Hashing is a two-phase process. In the first phase, one must select an Address Calculator Algorithm that he/she thinks will uniformly distribute the primary keys.

[Figure: a key x passes through the Address Calculator h(x), which produces an address in the Direct Access File / Array; the file/array has n elements, one logical record each.]

Hash Techniques may be used to search internal memory arrays and direct access files. Searching an internal memory array (either static or dynamic) is a nanosecond process. Searching direct access files is a millisecond process.

A HashValue is produced when a Key is Hashed through the AddressCalculator. The Hash Function is used to map a key into the file/array. There are many potentially good hash functions for a large collection of data. Finding a good hash function may require a lot of searching and ingenuity.

Two Decisions

One must make two decisions in hashing: what Hash Function is to be used, and what technique is to be used to resolve hash clashes/collisions.
Perfect Hash Function

A Perfect Hash Function is one in which each and every element is located in just one look. Do you think that it is possible to create a perfect hash function to map all 20 of the social security numbers into the file? Yes/No?

Let us assume that we have 5,000 byte records

/498,662,055 =
/192,039,969 =
/20 =
/25 = 80% Loading Factor

Collisions - Hash Clashes

A Hash Clash, or Collision, occurs when two or more keys generate the same HashValue. The second phase of hashing is to select a Hash Collision & Storage Algorithm that minimizes the damage caused by collisions. We shall examine several hash storage algorithms.

What Constitutes a Good Hash Function?

I. The hash function should be easy to compute.
II. The hash function should uniformly distribute the data throughout the set of hash keys.
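As a rough illustration of a Hash Clash, the sketch below counts how many keys land on a HashValue that another key has already claimed. The key values, the names `count_collisions` and `is_perfect`, and the sample address calculator (key mod 25, plus 1) are assumptions made for the example; they are not the homework data set.

```python
from collections import Counter

def sample_hash(key, num_slots=25):
    """Hypothetical address calculator: maps a key to a slot in 1..num_slots."""
    return key % num_slots + 1

def count_collisions(keys, hash_function):
    """Count keys whose HashValue was already produced by an earlier key."""
    counts = Counter(hash_function(key) for key in keys)
    return sum(count - 1 for count in counts.values() if count > 1)

def is_perfect(keys, hash_function):
    """A hash function is perfect for these keys if no two keys clash."""
    return count_collisions(keys, hash_function) == 0

# Made-up keys: 26 and 51 both hash to slot 2, so there is one clash.
example_keys = [26, 51, 3]
print(count_collisions(example_keys, sample_hash))
print(is_perfect([1, 2, 3], sample_hash))
```

A function that is perfect for one key set may clash badly on another, which is why the check takes the keys as an argument.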
   a) how well does the hash function scatter random data
   b) how well does the hash function scatter non-random data
III. The hash function should help to conserve disk space.

Minimal Acceptable Hash Function Standards

I. The Average Number of Hard Disk Accesses to Find the Sought Key <= 1.2

The average number of hard disk accesses is calculated by dividing the total number of searches required to find each item once by the number of items.

AccessQuotient = \frac{1}{N} \sum_{I=1}^{N} (\text{\# Accesses To Find Item } I)

II. The Loading Factor >= 80% (1,000,000 records can be stored in 1,250,000 elements or fewer)

LoadingFactor = NoRecords / Elements

Hash Alternative To Compiler Key Word Search

One student developed a black-box address calculator function and an array management system that produced the following table. The AccessQuotient is admirable.

Records 1-6: do record in end
Records 7-12: const var function
Records: of case array packed
Records: mod program until
Records: else if label then
Records: file with to div and
Records: type begin not
Records: downto set while or
Records: nil for repeat
Records: goto procedure

The Loading Factor for the compiler hash above was not admirable.

Loading Factor = 35 / 60 = 58.3% [Does Not Satisfy Minimal Standard]
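The two minimal standards reduce to simple arithmetic. The following sketch computes both; the function names and the sample access counts are assumptions for illustration, while the 35/60 and 1,000,000/1,250,000 figures come from the text above.

```python
def access_quotient(accesses_per_item):
    """Average number of accesses needed to find each stored item once."""
    return sum(accesses_per_item) / len(accesses_per_item)

def loading_factor(num_records, num_elements):
    """Fraction of the file/array elements that actually hold a record."""
    return num_records / num_elements

# Compiler key word table from the text: 35 key words in 60 elements.
print(f"{loading_factor(35, 60):.1%}")

# 1,000,000 records stored in 1,250,000 elements.
print(f"{loading_factor(1_000_000, 1_250_000):.1%}")

# Hypothetical access counts for five items (made up for the example).
print(access_quotient([1, 1, 2, 1, 3]))
```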
Statement of Problem

A. Population of 20 prospective teachers.
B. Desire 80% Loading Factor
C. Desire an AccessQuotient of 1.2 or Better
D. Data

Answer the following Questions

1] A Hash Function is also called an _?_ [Hint AC]
2] The linked list finds an element in Order _?_ {N, N^2, N^3, Log10 N, Log2 N}
3] The AVL Tree finds an element in Order _?_ {N, N^2, N^3, Log10 N, Log2 N}
4] The Objective of Hashing is to find an element in Order _?_ {N, N^2, N^3, Log10 N, Log2 N}
5] If the 20 SSNs above were loaded into a linked list, the Access Quotient/Average Search would be _?_. [Compute to 1 decimal place]
6] If the 20 SSNs above were loaded into an AVL tree, the Access Quotient/Average Search would be _?_. [Compute to 1 decimal place]
7] If the 20 SSNs above were loaded into the best of your Linear Probing Models below, the Access Quotient/Average Search would be _?_. [Compute to 1 decimal place]
8] If the 20 SSNs above were loaded into the best of your Quadratic Probing Models below, the Access Quotient/Average Search would be _?_. [Compute to 1 decimal place]
9] Hashing is most successful for {small, medium, large, any} quantities of data.
10] {T/F} Satisfactory Hashing is always possible.
11] Satisfactory Loading Factors are _?_% or greater.
12] Satisfactory Access Quotients are _?_ or less.
13] List the three things that constitute a good hash function.
14] Suppose we are going to try to use Linear Probing to resolve the collisions for 1,000 records. How many elements must there be in the data set?
15] Suppose we are going to try to use Quadratic Probing to resolve the collisions for 1,500 records. How many elements must there be in the data set?
16] Suppose we are going to try to use Linear Probing to resolve the collisions for 1,000 records. We have also decided to use Modularization to distribute our data. Write the Hash Function.
17] Suppose we are going to try to use Quadratic Probing to resolve the collisions for 1,500 records. We have also decided to use Modularization to distribute our data. Write the Hash Function.
18] What is Primary Clustering?
19] Why might it be better to select a prime number when using modularization in the hash function?
20] Why might Quadratic Probing be better than Linear Probing?
21] In Linear Probing, the first attempt to place the collision moves 1 cell up; if that slot is full, the second attempt moves _?_ cells up; if that slot is full, the third attempt moves _?_ cells up; if that slot is full, the fourth attempt moves _?_ cells up.
22] In Quadratic Probing, the first attempt to place the collision moves 1 cell up; if that slot is full, the second attempt moves _?_ cells up; if that slot is full, the third attempt moves _?_ cells up; if that slot is full, the fourth attempt moves _?_ cells up.
23] Almost all applications involve Insertion, Deletion, and Searching. In only the space below, discuss Deletion from a Linear Probing Model.
24] There are two major decisions that must be made when hashing. List them.
25] Be able to solve any of the following:
#1 Linear Probing

Linear Probing Is Always Pre-Designed For 80% Loading Factor [20 Students Requires 25 Slots]
Hash Function is Modularization [Thus Must Mod By ? + 1 To Fill Slots 1-?]
Collision Resolution: Put In Next Available Cell Upward; Wrap
Place Last Four Digits Of SSN In Slot
[Slot table: slot numbers count down from 29]

First Two Done For You!
[ 1] h( ) = 7
[ 2] h( ) = 6
[ 3] through [20] h( ) =

Total # Searches =
Loading Factor = %
Average Search =
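A minimal sketch of the worksheet's procedure follows, assuming that "upward" means toward higher slot numbers and that the last slot wraps around to slot 1. The nine-digit keys are invented for the example (they are not the UTSA data set), and `h_mod` and `insert_linear` are hypothetical names. The sketch stores the whole key rather than only its last four digits so that a later search could compare full keys.

```python
def h_mod(key, num_slots=25):
    """Modularization: map a key to a slot in 1..num_slots."""
    return key % num_slots + 1

def insert_linear(table, key):
    """Place key in table (index 0 unused); return how many cells were examined."""
    num_slots = len(table) - 1
    slot = h_mod(key, num_slots)
    for probes in range(1, num_slots + 1):
        if table[slot] is None:
            table[slot] = key
            return probes
        slot = slot % num_slots + 1  # next cell upward; the last slot wraps to slot 1
    raise RuntimeError("table is full")

# Hypothetical keys, NOT the homework data set.
keys = [123456789, 987654314, 555001200, 111223340, 100000024, 200000049]
table = [None] * 26  # slots 1..25
probes = [insert_linear(table, key) for key in keys]

print(probes)                                    # cells examined per key
print("Total # Searches =", sum(probes))
print("Loading Factor =", len(keys) / 25)
print("Average Search =", sum(probes) / len(keys))
```

As long as nothing is deleted, the number of cells examined while inserting a key equals the number examined when later searching for it, so the sum of these counts gives the Total # Searches line of the worksheet.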
#2 Linear Probing

Linear Probing Is Always Pre-Designed For 80% Loading Factor [20 Students Requires 25 Slots]
Hash Function is Folding & Mod [(First Three Digits + Second Three Digits + Third Three Digits) Mod By ? + 1]
Collision Resolution: Put In Next Available Cell Upward; Wrap
Place Last Four Digits Of SSN In Slot
[Slot table: slot numbers count down from 29]

First Two Done For You!
[ 1] h( ) = 7
[ 2] h( ) = 16
[ 3] through [20] h( ) =

Total # Searches =
Loading Factor = %
Average Search =
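The Folding & Mod address calculator can be sketched as below. The function name `h_fold`, the default of 25 slots, and the sample key are assumptions for illustration only.

```python
def h_fold(ssn, num_slots=25):
    """Fold a 9-digit key into three 3-digit groups, add them, then mod and add 1."""
    digits = f"{ssn:09d}"           # keep leading zeros so there are always 9 digits
    total = int(digits[0:3]) + int(digits[3:6]) + int(digits[6:9])
    return total % num_slots + 1    # slot in 1..num_slots

# Hypothetical key: 123 + 456 + 789 = 1368, and 1368 mod 25 = 18, so slot 19.
print(h_fold(123456789))
```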
#3 Linear Probing

Linear Probing Is Always Pre-Designed For 80% Loading Factor [20 Students Requires 25 Slots]
Hash Function is Truncate & Mod [Truncate last six digits - Mod By ? + 1 To Fill Slots 1-?]
Collision Resolution: Put In Next Available Cell Upward; Wrap
Place Last Four Digits Of SSN In Slot
[Slot table: slot numbers count down from 29]

First Two Done For You!
[ 1] h( ) = 18
[ 2] h( ) = 25
[ 3] through [20] h( ) =

Total # Searches =
Loading Factor = %
Average Search =
#4 Linear Probing

Linear Probing Is Always Pre-Designed For 80% Loading Factor [20 Students Requires 25 Slots]
Hash Function is Truncate & Mod [Truncate first two digits; truncate last two digits - Mod By ? + 1 To Fill Slots 1-?]
Collision Resolution: Put In Next Available Cell Upward; Wrap
Place Last Four Digits Of SSN In Slot
[Slot table: slot numbers count down from 29]

First Two Done For You!
[ 1] h( ) = 14
[ 2] h( ) = 21
[ 3] through [20] h( ) =

Total # Searches =
Loading Factor = %
Average Search =
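The two Truncate & Mod address calculators could look something like the sketch below. It assumes that "truncate" means discarding the named digits and hashing what remains (the first three digits of the SSN in one case, the middle five in the other). The function names, the default of 25 slots, and the sample key are hypothetical.

```python
def h_truncate_last_six(ssn, num_slots=25):
    """Discard the last six digits, keep the leading three, then mod and add 1."""
    leading = ssn // 1_000_000
    return leading % num_slots + 1

def h_truncate_ends(ssn, num_slots=25):
    """Discard the first two and last two digits, keep the middle five, then mod and add 1."""
    middle = (ssn // 100) % 100_000
    return middle % num_slots + 1

# Hypothetical key 123456789:
#   leading three digits 123   -> 123 mod 25 = 23   -> slot 24
#   middle five digits 34567   -> 34567 mod 25 = 17 -> slot 18
print(h_truncate_last_six(123456789))
print(h_truncate_ends(123456789))
```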
#5 Quadratic Probing

Quadratic Probing Is Always Pre-Designed For 80% Loading Factor [20 Students Requires 25 Slots]
Hash Function is Modularization [Thus Must Mod By ? + 1 To Fill Slots 1-?]
Collision Resolution: Quadratic Probing
Place Last Four Digits Of SSN In Slot
[Slot table: slot numbers count down from 29]

First Two Done For You!
[ 1] h( ) = 7
[ 2] h( ) = 6
[ 3] through [20] h( ) =

Total # Searches =
Loading Factor = %
Average Search =
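One common form of quadratic probing is sketched below, on the assumption that the i-th retry looks i squared cells above the home slot and wraps around the table. The handout may intend a different probe sequence. The name `insert_quadratic` and the keys are made up for the example, and 29 slots are used so the table size is prime.

```python
def insert_quadratic(table, key):
    """Place key in table (index 0 unused); return how many cells were examined."""
    num_slots = len(table) - 1
    home = key % num_slots                    # 0-based home position
    for attempt in range(num_slots):
        # attempt 0 is the home slot; attempt i looks i*i cells above it, wrapping
        slot = (home + attempt * attempt) % num_slots + 1
        if table[slot] is None:
            table[slot] = key
            return attempt + 1
    raise RuntimeError("no free slot found along the probe sequence")

# Hypothetical keys that all share home slot 6 in a 29-slot table.
table = [None] * 30  # slots 1..29
probes = [insert_quadratic(table, key) for key in [5, 34, 63, 92]]
print(probes)
print([slot for slot in range(1, 30) if table[slot] is not None])
```

Unlike linear probing, the probe sequence may skip free cells, so the sketch gives up after as many attempts as there are slots instead of assuming a free cell will always be reached.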
#6 Quadratic Probing

Quadratic Probing Is Always Pre-Designed For 80% Loading Factor [20 Students Requires 25 Slots]
Hash Function is Modularization To Nearest Prime > 80% Loading [Thus Must Mod By ? To Fill Slots 1-29]
Collision Resolution: Quadratic Probing
Place Last Four Digits Of SSN In Slot
[Slot table: slot numbers count down from 29]

First Two Done For You!
[ 1] h( ) = 15
[ 2] h( ) = 9
[ 3] through [20] h( ) =

Total # Searches =
Loading Factor = %
Average Search =
#7 Linked List Overflow

Linked List Overflow Is Always Pre-Designed For ?% Loading Factor [20 Students Requires 20+ Slots]
Hash Function is Modularization [Thus Must Mod By ? + 1 To Fill Slots 1-?]
Collision Resolution: Linked List Overflow
Place Last Four Digits Of SSN In Slot
[Slot table: slot numbers count down from 29]

First Two Done For You!
[ 1] h( ) = 2
[ 2] h( ) = 16
[ 3] through [20] h( ) =

Total # Searches =
Loading Factor = %
Average Search =
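Linked list overflow can be sketched as follows, assuming each slot owns a chain of the keys that hashed to it and that finding a key costs one access per chain node visited. A Python list stands in for the linked list here; the class name `ChainedTable`, the slot count of 20, and the keys are assumptions for illustration.

```python
class ChainedTable:
    """Hash table in which colliding keys overflow into a per-slot chain."""

    def __init__(self, num_slots):
        self.num_slots = num_slots
        # index 0 is unused so that slots are numbered 1..num_slots
        self.chains = [[] for _ in range(num_slots + 1)]

    def _slot(self, key):
        return key % self.num_slots + 1

    def insert(self, key):
        # a Python list stands in for the linked list; new keys go at the end
        self.chains[self._slot(key)].append(key)

    def accesses_to_find(self, key):
        """Number of chain nodes visited to find key, or None if it is absent."""
        for position, stored in enumerate(self.chains[self._slot(key)], start=1):
            if stored == key:
                return position
        return None

# Hypothetical keys: 41, 61, and 81 all hash to slot 2; 7 hashes to slot 8.
table = ChainedTable(20)
for key in [41, 61, 81, 7]:
    table.insert(key)
print([table.accesses_to_find(key) for key in [41, 61, 81, 7]])
```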
#8 Binary Tree Overflow

Binary Tree Overflow Is Always Pre-Designed For ?% Loading Factor [20 Students Requires 20+ Slots]
Hash Function is Modularization [Thus Must Mod By ? + 1 To Fill Slots 1-?]
Collision Resolution: Binary Tree Overflow
Place Last Four Digits Of SSN In Slot
[Slot table: slot numbers count down from 29]

First Two Done For You!
[ 1] h( ) = 2
[ 2] h( ) = 16
[ 3] through [20] h( ) =

Total # Searches =
Loading Factor = %
Average Search =
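For binary tree overflow, a rough sketch is to let each slot hold the root of an unbalanced binary search tree and to count the depth of a key as its number of accesses. This is one plausible reading of the worksheet, not a confirmed one. The names `Node`, `bst_insert`, and `insert_tree_overflow`, the slot count of 20, and the keys are hypothetical.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def bst_insert(root, key):
    """Insert key into the tree at root; return (root, depth of the new node)."""
    if root is None:
        return Node(key), 1
    node, depth = root, 1
    while True:
        depth += 1
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root, depth
            node = node.left
        else:
            if node.right is None:
                node.right = Node(key)
                return root, depth
            node = node.right

def insert_tree_overflow(table, key):
    """Hash key to a slot (index 0 unused) and add it to that slot's tree."""
    num_slots = len(table) - 1
    slot = key % num_slots + 1
    table[slot], accesses = bst_insert(table[slot], key)
    return accesses

# Hypothetical keys that all hash to slot 2 of a 20-slot table.
table = [None] * 21
accesses = [insert_tree_overflow(table, key) for key in [61, 41, 81, 21]]
print(accesses, sum(accesses) / len(accesses))
```

With these four keys the tree costs 8 accesses in total, where a single chain holding the same keys would cost 1 + 2 + 3 + 4 = 10; an AVL tree would additionally rebalance after each insertion.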
#9 AVL Tree Overflow

AVL Tree Overflow Is Always Pre-Designed For ?% Loading Factor [20 Students Requires 20+ Slots]
Hash Function is Modularization [Thus Must Mod By ? + 1 To Fill Slots 1-?]
Collision Resolution: AVL Tree Overflow
Place Last Four Digits Of SSN In Slot
[Slot table: slot numbers count down from 29]

First Two Done For You!
[ 1] h( ) = 2
[ 2] h( ) = 16
[ 3] through [20] h( ) =

Total # Searches =
Loading Factor = %
Average Search =
More informationAlgorithms for Data Science
Algorithms for Data Science CSOR W4246 Eleni Drinea Computer Science Department Columbia University Tuesday, December 1, 2015 Outline 1 Recap Balls and bins 2 On randomized algorithms 3 Saving space: hashing-based
More information4 Number Theory and Cryptography
4 Number Theory and Cryptography 4.1 Divisibility and Modular Arithmetic This section introduces the basics of number theory number theory is the part of mathematics involving integers and their properties.
More informationSiteswap state diagrams
Siteswap state diagrams Hans Lundmark (halun@mai.liu.se) October, 200 This document contains diagrams that can be used for quickly finding valid siteswap juggling patterns and transitions between different
More informationCOMP Intro to Logic for Computer Scientists. Lecture 15
COMP 1002 Intro to Logic for Computer Scientists Lecture 15 B 5 2 J Puzzle: better than nothing Nothing is better than eternal bliss A burger is better than nothing ------------------------------------------------
More informationHit the brakes, Charles!
Hit the brakes, Charles! Student Activity 7 8 9 10 11 12 TI-Nspire Investigation Student 60 min Introduction Hit the brakes, Charles! bivariate data including x 2 data transformation As soon as Charles
More informationRound 5: Hashing. Tommi Junttila. Aalto University School of Science Department of Computer Science
Round 5: Hashing Tommi Junttila Aalto University School of Science Department of Computer Science CS-A1140 Data Structures and Algorithms Autumn 017 Tommi Junttila (Aalto University) Round 5 CS-A1140 /
More informationCryptographic Hash Functions
Cryptographic Hash Functions Çetin Kaya Koç koc@ece.orst.edu Electrical & Computer Engineering Oregon State University Corvallis, Oregon 97331 Technical Report December 9, 2002 Version 1.5 1 1 Introduction
More informationLecture 6: Introducing Complexity
COMP26120: Algorithms and Imperative Programming Lecture 6: Introducing Complexity Ian Pratt-Hartmann Room KB2.38: email: ipratt@cs.man.ac.uk 2015 16 You need this book: Make sure you use the up-to-date
More information2. This exam consists of 15 questions. The rst nine questions are multiple choice Q10 requires two
CS{74 Combinatorics & Discrete Probability, Fall 96 Final Examination 2:30{3:30pm, 7 December Read these instructions carefully. This is a closed book exam. Calculators are permitted. 2. This exam consists
More informationLecture Notes for Chapter 17: Amortized Analysis
Lecture Notes for Chapter 17: Amortized Analysis Chapter 17 overview Amortized analysis Analyze a sequence of operations on a data structure. Goal: Show that although some individual operations may be
More informationChapter 7. Sequential Circuits Registers, Counters, RAM
Chapter 7. Sequential Circuits Registers, Counters, RAM Register - a group of binary storage elements suitable for holding binary info A group of FFs constitutes a register Commonly used as temporary storage
More informationContinuing discussion of CRC s, especially looking at two-bit errors
Continuing discussion of CRC s, especially looking at two-bit errors The definition of primitive binary polynomials Brute force checking for primitivity A theorem giving a better test for primitivity Fast
More informationExperience in Factoring Large Integers Using Quadratic Sieve
Experience in Factoring Large Integers Using Quadratic Sieve D. J. Guan Department of Computer Science, National Sun Yat-Sen University, Kaohsiung, Taiwan 80424 guan@cse.nsysu.edu.tw April 19, 2005 Abstract
More informationExpectation of geometric distribution
Expectation of geometric distribution What is the probability that X is finite? Can now compute E(X): Σ k=1f X (k) = Σ k=1(1 p) k 1 p = pσ j=0(1 p) j = p 1 1 (1 p) = 1 E(X) = Σ k=1k (1 p) k 1 p = p [ Σ
More information3 The fundamentals: Algorithms, the integers, and matrices
3 The fundamentals: Algorithms, the integers, and matrices 3.4 The integers and division This section introduces the basics of number theory number theory is the part of mathematics involving integers
More information1 Hashing. 1.1 Perfect Hashing
1 Hashing Hashing is covered by undergraduate courses like Algo I. However, there is much more to say on this topic. Here, we focus on two selected topics: perfect hashing and cockoo hashing. In general,
More informationCOMPUTING SIMILARITY BETWEEN DOCUMENTS (OR ITEMS) This part is to a large extent based on slides obtained from
COMPUTING SIMILARITY BETWEEN DOCUMENTS (OR ITEMS) This part is to a large extent based on slides obtained from http://www.mmds.org Distance Measures For finding similar documents, we consider the Jaccard
More information6.830 Lecture 11. Recap 10/15/2018
6.830 Lecture 11 Recap 10/15/2018 Celebration of Knowledge 1.5h No phones, No laptops Bring your Student-ID The 5 things allowed on your desk Calculator allowed 4 pages (2 pages double sided) of your liking
More informationCH 73 THE QUADRATIC FORMULA, PART II
1 CH THE QUADRATIC FORMULA, PART II INTRODUCTION W ay back in Chapter 55 we used the Quadratic Formula to solve quadratic equations like 6x + 1x + 0 0, whose solutions are 5 and 8. In fact, all of the
More informationNumber Representation and Waveform Quantization
1 Number Representation and Waveform Quantization 1 Introduction This lab presents two important concepts for working with digital signals. The first section discusses how numbers are stored in memory.
More informationLecture 12: Hash Tables [Sp 15]
Insanity is repeating the same mistaes and expecting different results. Narcotics Anonymous (1981) Calvin: There! I finished our secret code! Hobbes: Let s see. Calvin: I assigned each letter a totally
More informationCryptographic Hashing
Innovation and Cryptoventures Cryptographic Hashing Campbell R. Harvey Duke University, NBER and Investment Strategy Advisor, Man Group, plc January 30, 2017 Campbell R. Harvey 2017 2 Overview Cryptographic
More informationLecture 4: Divide and Conquer: van Emde Boas Trees
Lecture 4: Divide and Conquer: van Emde Boas Trees Series of Improved Data Structures Insert, Successor Delete Space This lecture is based on personal communication with Michael Bender, 001. Goal We want
More informationOutline. 1 Merging. 2 Merge Sort. 3 Complexity of Sorting. 4 Merge Sort and Other Sorts 2 / 10
Merge Sort 1 / 10 Outline 1 Merging 2 Merge Sort 3 Complexity of Sorting 4 Merge Sort and Other Sorts 2 / 10 Merging Merge sort is based on a simple operation known as merging: combining two ordered arrays
More informationSignificant Figures. Significant Figures 18/02/2015. A significant figure is a measured or meaningful digit.
Significant Figures When counting objects, it is easy to determine the EXACT number of objects. Significant Figures Unit B1 But when a property such as mass, time, volume, or length is MEASURED, you can
More information