Lecture 10
2024 - 2025
Binary heap
Hash tables
h : U → {0, 1, ..., m − 1}
Remarks:
In the case of direct-address tables, an element with key k is
stored in T[k].
In the case of hash tables, an element with key k is stored in
T[h(k)].
Consequence:
two keys may hash to the same slot => a collision
we need techniques for resolving the conflict created by
collisions
is deterministic
P(h(k) = j) = 1/m, ∀j = 0, ..., m − 1, ∀k ∈ U
etc.
For example:
m = 13
k = 63 => h(k) = 11
k = 52 => h(k) = 0
k = 131 => h(k) = 1
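The three values above are consistent with the division method, h(k) = k mod m. A minimal sketch, assuming that this is the hash function being illustrated (the function name hash_division is made up for this example):

```python
def hash_division(k: int, m: int) -> int:
    """Division method (assumed): h(k) = k mod m."""
    return k % m


# The keys from the example, with m = 13.
for k in (63, 52, 131):
    print(k, "=>", hash_division(k, 13))
```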
For example
m = 13 A = 0.6180339887
k=63 => h(k) = floor(13 * frac(63 * A)) = floor(12.16984) = 12
k=52 => h(k) = floor(13 * frac(52 * A)) = floor(1.790976) = 1
k=129=> h(k)= floor(13 * frac(129 * A)) = floor(9.442999) = 9
Some values for A work better than others. Knuth suggests
A = (√5 − 1) / 2 = 0.6180339887
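The multiplication example can be checked with a few lines of code. This is only a sketch: the function name hash_multiplication is hypothetical, and floating-point arithmetic is assumed to be precise enough for keys of this size.

```python
import math

A = 0.6180339887  # the constant used in the example above


def hash_multiplication(k: int, m: int, a: float = A) -> int:
    """Multiplication method: h(k) = floor(m * frac(k * a))."""
    frac = (k * a) % 1.0  # fractional part of k * a
    return math.floor(m * frac)


# The keys from the example, with m = 13.
for k in (63, 52, 129):
    print(k, "=>", hash_multiplication(k, 13))
```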
For example:
m = 13
h(k) = k mod m
k = 11, 24, 37, 50, 63, 76, etc.
Example 1
Fix a prime number p > the maximum possible value for a key from
U.
For every a ∈ {1, ..., p − 1} and b ∈ {0, ..., p − 1} we can define
a hash function h_{a,b}(k) = ((a ∗ k + b) mod p) mod m.
For example:
h_{3,7}(k) = ((3 ∗ k + 7) mod p) mod m
h_{4,1}(k) = ((4 ∗ k + 1) mod p) mod m
h_{8,0}(k) = ((8 ∗ k) mod p) mod m
There are p ∗ (p − 1) possible hash functions that can be
chosen.
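A possible way to pick one function from this family at random is sketched below. The helper names make_hash and random_hash, as well as the values p = 101 and m = 13, are assumptions chosen only for illustration.

```python
import random


def make_hash(a: int, b: int, p: int, m: int):
    """Return h_{a,b}(k) = ((a * k + b) mod p) mod m."""
    def h(k: int) -> int:
        return ((a * k + b) % p) % m
    return h


def random_hash(p: int, m: int, rng=random):
    """Choose a in {1..p-1} and b in {0..p-1} uniformly at random."""
    a = rng.randint(1, p - 1)  # randint is inclusive on both ends
    b = rng.randint(0, p - 1)
    return make_hash(a, b, p, m)


p, m = 101, 13  # assumed values: p is a prime larger than any key we use
h_3_7 = make_hash(3, 7, p, m)
print(h_3_7(10))  # ((3 * 10 + 7) mod 101) mod 13

h = random_hash(p, m)
print(h(10))  # depends on the random choice of a and b
```

Choosing the function at run time means that no fixed set of keys is bad for every run of the program.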
Example 2
Suppose the key k is an array <k_1, k_2, ..., k_r> such that
k_i < m (or it can be transformed into such an array, by writing
k as a number in base m).
Let <x_1, x_2, ..., x_r> be a fixed sequence of random numbers,
such that x_i ∈ {0, ..., m − 1} (another number in base m with the
same length).
h(k) = (Σ_{i=1}^{r} k_i ∗ x_i) mod m
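The construction from Example 2 can be sketched as follows. The names to_base_m and vector_hash are hypothetical, and the concrete numbers are chosen only to make the arithmetic easy to follow.

```python
import random


def to_base_m(k: int, m: int, r: int) -> list:
    """Write k as r digits in base m, most significant digit first."""
    digits = []
    for _ in range(r):
        digits.append(k % m)
        k //= m
    return digits[::-1]


def vector_hash(key_digits, x, m: int) -> int:
    """h(k) = (sum of k_i * x_i) mod m."""
    return sum(ki * xi for ki, xi in zip(key_digits, x)) % m


m, r = 13, 3
# The fixed random sequence x_1..x_r, each value in {0, ..., m - 1}.
x = [random.randrange(m) for _ in range(r)]
print(vector_hash(to_base_m(1000, m, r), x, m))  # depends on x

# With a hand-picked x the result can be verified by hand:
# 1*4 + 2*5 + 3*6 = 32, and 32 mod 13 = 6.
print(vector_hash([1, 2, 3], [4, 5, 6], 13))
```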
Example 3
Suppose the keys are u bits long and m = 2^b.
Pick a random b-by-u matrix (called h) with 0 and 1 values
only.
Pick h(k) = h ∗ k where in the multiplication we do addition mod
2.
[ 1 0 0 0 ]   [ 1 ]   [ 1 ]
[ 0 1 1 1 ] ∗ [ 0 ] = [ 1 ]
[ 1 1 1 0 ]   [ 1 ]   [ 0 ]
              [ 0 ]
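The matrix product can be verified with a short sketch, using the 3-by-4 matrix and the 4-bit key from the example (so u = 4 and b = 3). The function names matrix_hash and bits_to_slot are assumptions made for this illustration.

```python
import random


def matrix_hash(h, k_bits):
    """Multiply the 0/1 matrix h by the bit vector k_bits, adding mod 2."""
    return [sum(hij * kj for hij, kj in zip(row, k_bits)) % 2 for row in h]


def bits_to_slot(bits) -> int:
    """Read the b result bits as a number in {0, ..., 2^b - 1}."""
    slot = 0
    for bit in bits:
        slot = slot * 2 + bit
    return slot


H = [[1, 0, 0, 0],
     [0, 1, 1, 1],
     [1, 1, 1, 0]]
k = [1, 0, 1, 0]

result = matrix_hash(H, k)
print(result, bits_to_slot(result))

# A random b-by-u matrix would be drawn like this (b = 3, u = 4):
random_H = [[random.randint(0, 1) for _ in range(4)] for _ in range(3)]
```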
Define special hash functions that work with your keys (for
example, for real numbers from the [0,1) interval,
h(k) = floor(k ∗ m) can be used)
When two different keys, x and y, have the same value for the
hash function, h(x) = h(y), we have a collision.
Node:
key: TKey
next: ↑ Node
HashTable:
T: ↑Node[] //an array of pointers to nodes
m: Integer
h: TFunction //the hash function
We assume that
the hash value can be computed in constant time (Θ(1))
the time required to search an element with key k depends
linearly on the length of the list T [h(k)]
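A minimal sketch of the Node and HashTable records in Python is given below. It assumes insertion at the head of the list and a default hash function k mod m; neither choice is stated in the text above, and the method names are hypothetical.

```python
class Node:
    def __init__(self, key, next=None):
        self.key = key
        self.next = next  # next node in the same list, or None


class HashTable:
    def __init__(self, m, h=None):
        self.m = m
        # Assumed default hash function: the division method.
        self.h = h if h is not None else (lambda k: k % m)
        self.T = [None] * m  # array of list heads

    def insert(self, key):
        # Assumption: insert at the head of the list, which is Θ(1).
        pos = self.h(key)
        self.T[pos] = Node(key, self.T[pos])

    def search(self, key):
        # Time is linear in the length of the list T[h(key)].
        node = self.T[self.h(key)]
        while node is not None:
            if node.key == key:
                return True
            node = node.next
        return False


ht = HashTable(5)
for key in (5, 10, 7):
    ht.insert(key)
print(ht.search(10), ht.search(3))
```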
For the hash table from the previous example, the easiest
order in which the elements can be iterated is: 2, 32, 5, 72,
55, 8, 11
IteratorHT:
ht: HashTable
currentPos: Integer
currentNode: ↑ Node
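The iterator fields suggest the following traversal: walk the current list to its end, then move to the next non-empty slot. The sketch below assumes this order; the operation names valid, get_current and next, and the helper iterate, are hypothetical.

```python
class Node:
    def __init__(self, key, next=None):
        self.key = key
        self.next = next


class IteratorHT:
    def __init__(self, T):
        self.T = T  # the array of list heads of the hash table
        self.current_pos = 0
        self.current_node = None
        self._skip_to_nonempty()

    def _skip_to_nonempty(self):
        # Advance current_pos to the first non-empty slot, if any.
        while self.current_pos < len(self.T) and self.T[self.current_pos] is None:
            self.current_pos += 1
        if self.current_pos < len(self.T):
            self.current_node = self.T[self.current_pos]
        else:
            self.current_node = None

    def valid(self):
        return self.current_node is not None

    def get_current(self):
        return self.current_node.key

    def next(self):
        if self.current_node.next is not None:
            self.current_node = self.current_node.next
        else:
            self.current_pos += 1
            self._skip_to_nonempty()


def iterate(T):
    """Collect all keys in iteration order."""
    it, keys = IteratorHT(T), []
    while it.valid():
        keys.append(it.get_current())
        it.next()
    return keys


print(iterate([None, Node(6, Node(1)), None, Node(3), None]))
```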
Insert into the table the following elements: 5, 18, 16, 15, 13,
31, 26.
Let’s compute the value of the hash function for every key:
Key 5 18 16 15 13 31 26
Hash 5 5 3 2 0 5 0
Initially the hash table is empty. All next values are -1 and the
first empty position is position 0.
5 will be added to position 5. But 18 should also be added
there. Since that position is already occupied, we add 18 to
position firstEmpty and set the next of 5 to point to position
0. Then we reset firstEmpty to the next empty position.
pos    0   1   2   3   4   5   6   7   8   9  10  11  12
T     18  13  15  16  31   5  26
next   1   4  -1  -1   6   0  -1  -1  -1  -1  -1  -1  -1
firstEmpty = 7
HashTable:
T: TKey[]
next: Integer[]
m: Integer
firstEmpty: Integer
h: TFunction
ht.T[ht.firstEmpty] ← k
ht.next[ht.firstEmpty] ← - 1
ht.next[current] ← ht.firstEmpty
changeFirstEmpty(ht)
end-if
end-subalgorithm
Complexity: O(m)
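The insertion steps described in the example can be sketched in Python as follows. This is an illustration under several assumptions: the hash function is k mod m, empty slots are marked with None, and a full table raises an error (how the lecture handles a full table is not shown here). The class and method names are hypothetical.

```python
class CoalescedHashTable:
    def __init__(self, m, h=None):
        self.m = m
        self.h = h if h is not None else (lambda k: k % m)  # assumed
        self.T = [None] * m   # None marks an empty slot (assumption)
        self.next = [-1] * m
        self.first_empty = 0

    def _change_first_empty(self):
        # Move first_empty to the next empty slot: O(m) in the worst case.
        self.first_empty += 1
        while self.first_empty < self.m and self.T[self.first_empty] is not None:
            self.first_empty += 1

    def insert(self, k):
        pos = self.h(k)
        if self.T[pos] is None:
            self.T[pos] = k
            self.next[pos] = -1
            if pos == self.first_empty:
                self._change_first_empty()
        else:
            if self.first_empty == self.m:
                raise OverflowError("table is full")  # assumed behaviour
            current = pos
            while self.next[current] != -1:  # walk to the end of the chain
                current = self.next[current]
            self.T[self.first_empty] = k
            self.next[self.first_empty] = -1
            self.next[current] = self.first_empty
            self._change_first_empty()

    def search(self, k):
        current = self.h(k)
        while current != -1 and self.T[current] != k:
            current = self.next[current]
        return current != -1


ht = CoalescedHashTable(13)
for key in (5, 18, 16, 15, 13, 31, 26):
    ht.insert(key)
print(ht.T, ht.next, ht.first_empty)
```

Running the sketch on the keys from the example gives the same T, next and firstEmpty values as the table shown earlier.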
Think about it: Should we keep the free spaces linked in a list,
as in the case of a linked list on an array?
Remove 11
elems 3 8
next -1 -1 -1 0 -1
Remove 11
elems 56 12 8
next -1 -1 -1 -1 -1
Remove 11
elems 20 56
next -1 -1 -1 -1 -1
Remove 11
elems 56 12 1
next -1 3 -1 -1 -1
Remove 11
elems 20 56 13
next -1 0 -1 -1 -1
Let’s see a more complicated example (based on the one used
previously).
pos    0   1   2   3   4   5   6   7   8   9  10  11  12
T         13  15  16  31   5  26
next  -1   4  -1  -1   6   1  -1  -1  -1  -1  -1  -1  -1
firstEmpty = 0
pos    0   1   2   3   4   5   6   7   8   9  10  11  12
T     13  31  15  16  26   5
next   1   4  -1  -1  -1   0  -1  -1  -1  -1  -1  -1  -1
firstEmpty = 6
For this example, it would work. This hash table is now
correct and every element can be found in it. But what if now
we remove 5? Is the hash table below correct?
pos    0   1   2   3   4   5   6   7   8   9  10  11  12
T     31  26  15  16      13
next   1  -1  -1  -1  -1   0  -1  -1  -1  -1  -1  -1  -1
firstEmpty = 4
Lect. PhD. Oneţ-Marian Zsuzsanna DATA STRUCTURES
Now element 13 is not going to be found, because a search
for 13 starts from position 0, but 13 is currently at a position
that comes before 0 in the linked list.
Obs 2: Not every element can be moved to any position in the
linked list (specifically, no element is allowed to be at a
position which comes before the position to which it hashes)
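The observation can be checked mechanically: an element is findable only if it is reachable by following next links from the position it hashes to. The sketch below is not the removal algorithm itself, only a checker. It assumes h(k) = k mod 13 and None for empty slots; the two tables are the states after removing 18 and after the subsequent removal of 5 from the example above, and the function names are hypothetical.

```python
def can_find(T, nxt, h, k):
    """Search for k starting from the slot it hashes to."""
    current = h(k)
    while current != -1 and T[current] != k:
        current = nxt[current]
    return current != -1


def unreachable_keys(T, nxt, h):
    """Return the stored keys that a search would fail to find."""
    return [k for k in T if k is not None and not can_find(T, nxt, h, k)]


h = lambda k: k % 13  # assumed hash function

# State after removing 18 (every element can still be found).
T1 = [13, 31, 15, 16, 26, 5] + [None] * 7
next1 = [1, 4, -1, -1, -1, 0] + [-1] * 7

# State after also removing 5 by shifting elements up the list.
T2 = [31, 26, 15, 16, None, 13] + [None] * 7
next2 = [1, -1, -1, -1, -1, 0] + [-1] * 7

print(unreachable_keys(T1, next1, h))
print(unreachable_keys(T2, next2, h))
```

In the second table 13 sits at position 5, which comes before position 0 in its list, so the search that starts at position 0 never reaches it.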
Complexity:
pos    0   1   2   3   4   5   6   7   8   9  10  11  12
T     13  31  15  16  26   5
next   1   4  -1  -1  -1   0  -1  -1  -1  -1  -1  -1  -1
firstEmpty = 6
Separate chaining
Coalesced chaining