Lecture 7 - Hash Tables: Direct-Address Tables, Hash Tables Intro, Separate Chaining
2023 - 2024
ADT List
ADT Stack
ADT Queue
Today
Hash Tables
Direct-address table
Direct-address table
INSERT, DELETE and SEARCH
Direct-address table
Solution:
Operations for a direct-address table - search
Operations for a direct-address table - insert
Operations for a direct-address table - delete
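The three operations (search, insert, delete) can be sketched in Python as follows. This is a minimal sketch, assuming that keys are integers in the range 0..m-1 and that each element is stored at the position equal to its key. The class name DirectAddressTable and its method names are illustrative and do not come from the lecture.

```python
class DirectAddressTable:
    """Direct-address table for integer keys in the range 0..m-1 (assumed)."""

    def __init__(self, m):
        self.m = m
        self.slots = [None] * m  # slots[k] holds the value stored under key k

    def insert(self, key, value):
        self.slots[key] = value  # constant time: the key is the index

    def search(self, key):
        return self.slots[key]   # None means "no element with this key"

    def delete(self, key):
        self.slots[key] = None   # constant time


table = DirectAddressTable(10)
table.insert(3, "three")
table.insert(7, "seven")
table.delete(7)
```

Every operation touches exactly one position, which is why all three run in constant time. The price is an array as large as the whole key range.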
Direct-address table - Advantages and disadvantages
Hash tables
Hash tables - main idea
h : U → {0, 1, ..., m − 1}
Remarks:
Two keys may hash to the same index => a collision =>
we need techniques for resolving the conflict created by
collisions.
A good hash function
is deterministic
Examples of bad hash functions
h(k) = k % m, when m = 16
favors collisions, because it considers only the last four bits of the key
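A short Python check makes this visible. The keys below are hypothetical: they were chosen to differ only in their upper bits.

```python
m = 16
keys = [0x13, 0x23, 0xA3, 0xF3]  # hypothetical keys sharing the low four bits 0011

# k % 16 equals k & 0b1111, so only the last four bits influence the result.
positions = [k % m for k in keys]
same_as_mask = all(k % m == k & 0b1111 for k in keys)
```

All four keys land on position 3, although they differ in every other bit.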
Hash function
The division method
For example, with h(k) = k mod m:
m = 13
k = 24 => h(k) = 11
k = 26 => h(k) = 0
k = 131 => h(k) = 1
Experiments show that good values for m are primes not too
close to exact powers of 2
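The worked values for m = 13 can be reproduced with a few lines of Python. The function name h_division is illustrative.

```python
def h_division(k, m):
    """Division method: the remainder of k divided by the table size m."""
    return k % m


m = 13
results = {k: h_division(k, m) for k in (24, 26, 131)}
```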
The multiplication method
Some values for A work better than others. Knuth suggests
A = (√5 − 1) / 2 ≈ 0.6180339887 (the reciprocal of the golden ratio)
For example:
m = 13, A = 0.6180339887
k = 63 => h(k) = floor(13 * frac(63 * A)) = floor(12.16984) = 12
k = 52 => h(k) = floor(13 * frac(52 * A)) = floor(1.790976) = 1
k = 129 => h(k) = floor(13 * frac(129 * A)) = floor(9.442999) = 9
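The same computation can be sketched in Python. The function name h_multiplication is illustrative, and the fractional part is taken with a floating-point modulo, so this is an approximation suitable for small keys rather than a production implementation.

```python
import math

A = (math.sqrt(5) - 1) / 2  # approximately 0.6180339887


def h_multiplication(k, m, a=A):
    """Multiplication method: floor(m * frac(k * a))."""
    frac = (k * a) % 1.0  # fractional part of k * a
    return math.floor(m * frac)


results = [h_multiplication(k, 13) for k in (63, 52, 129)]
```

Unlike the division method, the value of m is not critical here, since the spreading comes from the multiplication by A.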
Example:
m = 13
h(k) = k mod m
k = 1, 14, 27, 40, 53, 66, etc. all hash to position 1
Universal hashing
Example 1:
Fix a prime number p greater than the maximum possible value for a key from U.
For every a ∈ {1, ..., p − 1} and b ∈ {0, ..., p − 1} we can define a
hash function h_{a,b}(k) = ((a ∗ k + b) mod p) mod m.
For example:
h_{3,7}(k) = ((3 ∗ k + 7) mod p) mod m
h_{4,1}(k) = ((4 ∗ k + 1) mod p) mod m
h_{8,0}(k) = ((8 ∗ k) mod p) mod m
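A minimal Python sketch of this family follows. The values p = 101 and m = 13 are assumptions made for the illustration (p must be a prime larger than every key), and the helper names h_ab and make_universal_hash are hypothetical.

```python
import random


def h_ab(a, b, p, m, k):
    """One member of the family: ((a*k + b) mod p) mod m."""
    return ((a * k + b) % p) % m


def make_universal_hash(p, m, rng=random):
    """Pick a and b at random and return the corresponding hash function."""
    a = rng.randint(1, p - 1)  # a in {1, ..., p-1}
    b = rng.randint(0, p - 1)  # b in {0, ..., p-1}
    return lambda k: h_ab(a, b, p, m, k)


p, m = 101, 13  # assumed: p is prime and larger than every key used here
h = make_universal_hash(p, m)
example = h_ab(3, 7, p, m, 10)  # ((3*10 + 7) mod 101) mod 13
```

The function is chosen once, at random, when the table is created, and is then used unchanged for all operations on that table.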
Example 2:
If the key k is an array <k_1, k_2, ..., k_r> such that k_i < m (or it can
be transformed into such an array, by writing k in base m), let
<x_1, x_2, ..., x_r> be a fixed sequence of random numbers, such
that x_i ∈ {0, ..., m − 1} (another number in base m with the same
length).
h(k) = (Σ_{i=1}^{r} k_i ∗ x_i) mod m
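The construction can be sketched in Python as follows. The table size m = 13, the length r = 3, the key 500 and the helper names are all assumptions made for the illustration.

```python
import random


def to_base_m_digits(k, m, r):
    """Write the natural number k as r digits in base m (least significant first)."""
    digits = []
    for _ in range(r):
        digits.append(k % m)
        k //= m
    return digits


def make_vector_hash(m, r, rng=random):
    """Fix a random sequence x_1..x_r and return h(k) = (sum of k_i * x_i) mod m."""
    x = [rng.randint(0, m - 1) for _ in range(r)]

    def h(key_digits):
        return sum(ki * xi for ki, xi in zip(key_digits, x)) % m

    return h, x


m, r = 13, 3                          # assumed table size and key length
h, x = make_vector_hash(m, r)
digits = to_base_m_digits(500, m, r)  # 500 = 6 + 12*13 + 2*169
position = h(digits)
```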
Using keys that are not natural numbers
Define special hash functions that work with the given keys.
For example, for real numbers from the interval [0, 1), h(k) = floor(k ∗ m)
can be used.
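As a small illustration in Python, with an assumed table size m = 10 and a hypothetical function name h_real:

```python
import math


def h_real(k, m):
    """Hash a real key from [0, 1) to a position in 0..m-1 as floor(k * m)."""
    return math.floor(k * m)


positions = [h_real(k, 10) for k in (0.0, 0.37, 0.5, 0.99)]
```

Because k is strictly less than 1, the product k ∗ m is strictly less than m, so the result is always a valid position.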
Collisions
When two keys, x and y, have the same value for the hash
function, h(x) = h(y), we have a collision.
Separate chaining
Coalesced chaining
Open addressing
Separate chaining
Separate chaining - Example
m = 10
h(k) = k % m
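The placement of keys into chains for m = 10 and h(k) = k % m can be sketched in Python. The keys are hypothetical, not the ones from the slide, and Python lists stand in for the linked lists.

```python
m = 10
keys = [12, 25, 32, 7, 45, 2]  # hypothetical keys, not the ones from the slide

table = [[] for _ in range(m)]  # one (initially empty) chain per position
for k in keys:
    table[k % m].append(k)      # colliding keys end up in the same chain
```

Keys 12, 32 and 2 collide on position 2 and share one chain, while 25 and 45 share the chain at position 5.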
Separate chaining - Operations
insert(T, x)
search(T, k)
delete(T, x)
Hash table with separate chaining - representation
Representation of a node:
Node:
key: TKey
next: ↑ Node
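A minimal Python sketch of a hash table built on this node representation is given below. It assumes integer keys, the division method as hash function, insertion at the head of the chain, and a delete that receives a key. The class HashTable and its field names are illustrative, not the lecture's own representation.

```python
class Node:
    def __init__(self, key, next=None):
        self.key = key
        self.next = next  # next node in the same chain, or None


class HashTable:
    def __init__(self, m):
        self.m = m
        self.slots = [None] * m  # slots[i] is the head of chain i

    def h(self, key):
        return key % self.m  # division method, assumed for this sketch

    def insert(self, key):
        pos = self.h(key)
        self.slots[pos] = Node(key, self.slots[pos])  # insert at the head

    def search(self, key):
        node = self.slots[self.h(key)]
        while node is not None and node.key != key:
            node = node.next
        return node is not None

    def delete(self, key):
        pos = self.h(key)
        prev, node = None, self.slots[pos]
        while node is not None and node.key != key:
            prev, node = node, node.next
        if node is None:
            return False  # key not present
        if prev is None:
            self.slots[pos] = node.next  # removed node was the head
        else:
            prev.next = node.next
        return True


t = HashTable(10)
for k in (5, 15, 25, 8):
    t.insert(k)
removed = t.delete(15)
```

Inserting at the head keeps insert at constant time, while search and delete have to walk the chain of the computed position.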
Hash table with separate chaining - search
Analysis of hashing with chaining
Analysis of hashing with chaining - Search
unsuccessful search
successful search
We assume that:
If n = O(m):
when all the elements collide ⇒ they are in the same list and
we are searching this list
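The difference between well-spread keys and the worst case can be observed with a short Python experiment. The sizes m = 10 and n = 50, the hash function k % m and the name chain_lengths are assumptions of this sketch. The ratio n / m is the standard load factor, which is the average chain length.

```python
def chain_lengths(keys, m):
    """Return the length of each chain when keys are hashed with k % m."""
    lengths = [0] * m
    for k in keys:
        lengths[k % m] += 1
    return lengths


m, n = 10, 50
spread = chain_lengths(range(n), m)                     # keys 0..49
collided = chain_lengths([m * i for i in range(n)], m)  # all multiples of m

load_factor = n / m  # average chain length
```

With the evenly spread keys every chain has n / m = 5 elements. With the multiples of m all 50 elements sit in one chain, and a search degenerates into a linear scan of that list.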
Analysis of hashing with chaining - Insert
Analysis of hashing with chaining - Delete
Conclusions
Example
The hash value depends on the size of the hash table. If the
size of the hash table changes, the value of the hash function
changes as well.
If the elements are not rehashed after the table is resized, the search and
remove operations might not find them.
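A minimal Python sketch of rehashing on resize follows. It assumes h(k) = k % m, uses Python lists for the chains, and works on hypothetical keys. The function name rehash is illustrative.

```python
def rehash(chains, new_m):
    """Build a table of size new_m and re-insert every key using k % new_m."""
    new_chains = [[] for _ in range(new_m)]
    for chain in chains:
        for k in chain:
            new_chains[k % new_m].append(k)
    return new_chains


old_m, new_m = 5, 10
chains = [[] for _ in range(old_m)]
for k in (7, 12, 23):            # hypothetical keys
    chains[k % old_m].append(k)  # 7 and 12 go to position 2, 23 to position 3

resized = rehash(chains, new_m)
```

After the resize, key 7 is at position 7 instead of position 2. Simply copying the old chains into a larger array would leave it where the new hash function no longer looks.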
Iterator
For the exemplified hash table, the easiest order in which the
elements can be iterated is: 2, 32, 5, 72, 55, 8, 11.
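One straightforward iteration order visits the positions in increasing order and walks each chain from front to back. The Python sketch below assumes this order and uses a small hypothetical table with m = 5. It is not the table from the slide.

```python
def iterate(chains):
    """Yield every key: positions in increasing order, each chain front to back."""
    for chain in chains:
        for key in chain:
            yield key


chains = [[], [21], [12, 2], [], [4]]  # hypothetical table, h(k) = k % 5
order = list(iterate(chains))
```

The resulting order depends only on the layout of the table, not on the values of the keys or on the order of insertion.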
Iterator - init
Sorted containers
We can store the individual lists in a sorted order and for the
iterator we can merge them.
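The merging idea can be sketched in Python with heapq.merge from the standard library, which lazily merges already sorted inputs. The table below is hypothetical (m = 4, h(k) = k % 4), and each chain is assumed to be kept sorted on insertion.

```python
import heapq


def iterate_sorted(chains):
    """Merge chains that are each kept in sorted order into one sorted sequence."""
    return heapq.merge(*chains)


chains = [[4, 8], [1, 9, 13], [], [3, 7]]  # hypothetical table, h(k) = k % 4
merged = list(iterate_sorted(chains))
```

Keeping the chains sorted makes insert more expensive, since the right place in the chain has to be found, but it lets the iterator produce the elements in sorted order without sorting the whole table.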
Containers represented using hash tables
ADT Set
HashSet in Java Collections API, Python’s sets (e.g. {1, 2, 3}; an empty set is set(), since {} is an empty dict)
ADT Bag
HashMultiset in Guava (for Java)
Hash table - Applications
Programming languages
Implementation of built-in data types (dict in Python, HashMap in
Java)
Compilers
For storing the programming language’s keywords and for mapping
variable names to memory locations
File system
For mapping file names to the file path and to the physical location
of that file on the disk
Password verification
For storing hashed passwords
Thank you