CS 332: Algorithms: Hash Tables
Hash Tables
Homework 4
Programming assignment:
Implement Skip Lists
Evaluate performance
Extra credit: compare performance to randomly
built BST
Due Mar 9 (Fri before Spring Break)
Will post later today
the records
In this discussion we consider all keys to be (possibly
large) natural numbers
The idea:
Set up an array T[0..m-1] in which
T[i] = x      if x ∈ T and key[x] = i
T[i] = NULL   otherwise
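A minimal sketch of the direct-address idea above, assuming keys are integers in 0..m-1 and using Python's None for NULL. The class name DirectAddressTable and its method names are hypothetical, chosen only for illustration.

```python
class DirectAddressTable:
    """Direct-address table: slot i holds the record whose key is i, or None."""

    def __init__(self, m):
        self.slots = [None] * m  # T[0..m-1], every slot starts out empty

    def insert(self, key, record):
        self.slots[key] = record  # O(1): the key itself is the array index

    def search(self, key):
        return self.slots[key]  # None plays the role of NULL

    def delete(self, key):
        self.slots[key] = None


table = DirectAddressTable(m=8)
table.insert(3, "record for key 3")
print(table.search(3))  # record for key 3
print(table.search(5))  # None
```

Every operation is a single array access, but the array needs one slot per possible key, which is impractical when the keys are large natural numbers.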
Hash Functions
Next problem: collision
[Figure: a hash function h maps keys k1..k5 from K (actual keys), a subset of U (universe of keys), into slots 0..m-1 of table T; k2 and k5 collide because h(k2) = h(k5)]
Resolving Collisions
How can we solve the problem of collisions?
Solution 1: chaining
Solution 2: open addressing
Open Addressing
Basic idea (details in Section 12.4):
To insert: if the slot is full, try another slot, and then another, until an open slot
is found (probing)
To search, follow the same sequence of probes as would be used
when inserting the element
If we reach an element with the correct key, return it
If we reach a NULL pointer, the element is not in the table
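The insert and search rules above can be sketched as follows. The slides do not fix a probe sequence, so this sketch assumes linear probing (start at key mod m, then step by one slot); the class and method names are hypothetical, and deletion is left out on purpose.

```python
class OpenAddressTable:
    def __init__(self, m):
        self.m = m
        self.slots = [None] * m  # each slot holds a (key, value) pair or None

    def _probe(self, key):
        # Linear probing (an assumption): start at key mod m, then step by 1.
        start = key % self.m
        for i in range(self.m):
            yield (start + i) % self.m

    def insert(self, key, value):
        for slot in self._probe(key):
            entry = self.slots[slot]
            if entry is None or entry[0] == key:
                self.slots[slot] = (key, value)
                return slot  # index of the slot that was used
        raise OverflowError("hash table is full")

    def search(self, key):
        for slot in self._probe(key):
            entry = self.slots[slot]
            if entry is None:
                return None  # reached an empty slot: key is not in the table
            if entry[0] == key:
                return entry[1]
        return None  # probed every slot without finding the key


t = OpenAddressTable(7)
print(t.insert(10, "a"))  # 3, since 10 mod 7 = 3
print(t.insert(17, "b"))  # 4, since slot 3 is taken and the next probe is 4
print(t.search(17))       # b
print(t.search(24))       # None
```

Search stops at the first empty slot because an insert of the same key would have stopped there too.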
Chaining
Chaining puts elements that hash to the same
slot in a linked list
[Figure: chained hash table T; each slot holds a linked list of the keys k1..k8 from K (actual keys) that hash to it]
Chaining
How do we insert an element?
Chaining
How do we delete an element?
Do we need a doubly-linked list for efficient delete?
Chaining
How do we search for an element with a
given key?
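One possible answer to the three chaining questions, as a sketch rather than the course's reference solution. It assumes integer keys, a division-method hash (key mod m), and singly linked chains; Node and ChainedHashTable are hypothetical names. Deleting by key has to walk the chain anyway, so a singly linked list is enough here; a doubly linked list pays off when delete is handed a pointer to the node instead of a key.

```python
class Node:
    def __init__(self, key, value, next_node=None):
        self.key = key
        self.value = value
        self.next = next_node


class ChainedHashTable:
    def __init__(self, m):
        self.m = m
        self.heads = [None] * m  # heads[i] is the first node of chain i

    def _slot(self, key):
        return key % self.m  # division-method hash (an assumption)

    def insert(self, key, value):
        # O(1): push at the head of the chain. Assumes key is not present.
        i = self._slot(key)
        self.heads[i] = Node(key, value, self.heads[i])

    def search(self, key):
        # Walk the chain for this key's slot; cost is the chain length.
        node = self.heads[self._slot(key)]
        while node is not None and node.key != key:
            node = node.next
        return node  # None if the key is absent

    def delete(self, key):
        # Walk the chain while remembering the predecessor, then unlink.
        i = self._slot(key)
        prev, node = None, self.heads[i]
        while node is not None and node.key != key:
            prev, node = node, node.next
        if node is None:
            return False
        if prev is None:
            self.heads[i] = node.next
        else:
            prev.next = node.next
        return True


t = ChainedHashTable(4)
for k in (1, 5, 9):          # all three keys hash to slot 1
    t.insert(k, f"record {k}")
print(t.search(5).value)     # record 5
print(t.delete(5))           # True
print(t.search(5))           # None
print(t.search(9).value)     # record 9
```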
Analysis of Chaining
Assume simple uniform hashing: each key in
the table is equally likely to be hashed to any slot
Choosing the hash function well is crucial
What will a worst-case hash function do?
What will be the time to search in this case?
What are desirable features of a hash function?
Should distribute keys uniformly into slots
Should not depend on patterns in the data
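To make the worst-case question concrete, here is a small sketch that counts chain lengths under two hash functions: a constant function that sends every key to slot 0, and key mod m. The helper chain_lengths and the choice of 1000 consecutive integer keys are assumptions made for illustration.

```python
def chain_lengths(keys, m, h):
    """Return how many of the given keys land in each of the m slots under h."""
    lengths = [0] * m
    for k in keys:
        lengths[h(k)] += 1
    return lengths


keys = range(1000)
m = 10

worst = chain_lengths(keys, m, lambda k: 0)       # every key goes to slot 0
spread = chain_lengths(keys, m, lambda k: k % m)  # keys spread over all slots

print(max(worst))   # 1000
print(max(spread))  # 100
```

With the constant function the table degenerates into a single list of all n keys, so a search with chaining costs time proportional to n.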
Hash Functions:
The Division Method
h(k) = k mod m
In words: hash k into a table with m slots using the slot
given by the remainder of k divided by m
What happens to elements with adjacent values of k?
What happens if m is a power of 2 (say 2^P)?
What if m is a power of 10?
Upshot: pick table size m = prime number not too
close to a power of 2 (or 10)
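A short sketch of why the table size matters for the division method. The function name h_div and the sample keys are made up for illustration; 701 is used only as an example of a prime that is not close to a power of 2 or 10.

```python
def h_div(k, m):
    """Division method: the slot is the remainder of k divided by m."""
    return k % m


# m = 2**4: the hash is just the low 4 bits of k, so the high bits are ignored.
hex_keys = [0x10F, 0x20F, 0xABF]
print([h_div(k, 16) for k in hex_keys])   # [15, 15, 15]

# m = 10**2: the hash is just the last two decimal digits of k.
dec_keys = [1234, 5634, 99934]
print([h_div(k, 100) for k in dec_keys])  # [34, 34, 34]

# A prime table size uses all of the key, so these same keys no longer collide.
print([h_div(k, 701) for k in hex_keys])
print([h_div(k, 701) for k in dec_keys])
```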
Hash Functions:
The Multiplication Method
For a constant A, 0 < A < 1:
h(k) = ⌊m (kA - ⌊kA⌋)⌋
where kA - ⌊kA⌋ is the fractional part of kA
Choose m = 2^P
Choose A not too close to 0 or 1
Knuth: Good choice for A = (√5 - 1)/2
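A floating-point sketch of the multiplication method with Knuth's constant. The function name h_mul is hypothetical, and a production version would more likely use fixed-width integer arithmetic than floats; this version only mirrors the formula.

```python
import math

A = (math.sqrt(5) - 1) / 2  # Knuth's suggested constant, about 0.618


def h_mul(k, m, a=A):
    """Multiplication method: floor(m * fractional part of k*a)."""
    frac = (k * a) % 1.0  # fractional part of kA, for k >= 0
    return math.floor(m * frac)


print(h_mul(123456, 2**14))  # 67
```

Here kA is about 76300.0041, so the fractional part is about 0.0041 and m times that is about 67.4, which floors to 67.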
Hash Functions:
Worst Case Scenario
Scenario:
You are given an assignment to implement hashing
You will self-grade in pairs, testing and grading your
partner's implementation
In a blatant violation of the honor code, your partner:
Analyzes your hash function
Picks a sequence of worst-case keys, causing your
implementation to take O(n) time to search
Hash Functions:
Universal Hashing
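As before, when attempting to foil an adversary: randomize the algorithm
Pick the hash function at random, independently of the keys to be stored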
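A sketch of one standard universal family, h(k) = ((a*k + b) mod p) mod m, with a and b drawn at random. The slides do not name a specific family, so this choice is an assumption; it also assumes p is a prime larger than every key. The function names are hypothetical.

```python
import random


def universal_hash(k, a, b, p, m):
    """One member of the family: h(k) = ((a*k + b) mod p) mod m."""
    return ((a * k + b) % p) % m


def make_universal_hash(m, p, rng=random):
    """Pick a member of the family at random.

    Assumes p is a prime larger than every key; a is drawn from 1..p-1
    and b from 0..p-1, independently of the keys that will be stored.
    """
    a = rng.randrange(1, p)
    b = rng.randrange(0, p)
    return lambda k: universal_hash(k, a, b, p, m)


# Keys chosen by an adversary who knows the table uses k mod 10.
m = 10
bad_keys = [10 * i for i in range(100)]
print({k % m for k in bad_keys})  # {0}: every key lands in slot 0

# The adversary cannot predict which function a random draw produces.
h = make_universal_hash(m, p=1009)
print(sorted({h(k) for k in bad_keys}))  # slots used under this random draw

print(universal_hash(8, a=3, b=4, p=17, m=6))  # 5
```

Because a and b are chosen after the adversary has committed to the keys, no fixed key sequence is bad for every draw.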
The End