Task 4 - Hashing - Separate Chaining and Rehashing
To handle collisions, a hash table can use a technique known as separate chaining.
In separate chaining, each location in the hash table holds a linked list of the
values that hash to that location.
Each cell of the table therefore points to a linked list, and every key in a given
list has the same hash value. So, in place of the collision error which occurred in
Figure 1, the cell now contains a linked list containing the strings 'Janet' and
'Martha', as seen in Figure 2. The subsequent strings are loaded in the same way
using the separate chaining technique.
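Now let's work through one more problem for better understanding.
Example
The keys are hashed with key mod 7 into a table with slots 0 to 6.
85 mod 7 = 1, so 85 is chained in slot 1 together with 50 and 92.
Next we have 73: 73 mod 7 = 3, so 73 goes into slot 3.
The last one is 101: 101 mod 7 = 3.
So again we have a collision, and we add a node with key 101 to the linked list
at slot 3.

Slot   Chain
0      700
1      50 -> 85 -> 92
2      (empty)
3      73 -> 101
4      (empty)
5      (empty)
6      76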
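The example can be reproduced with a minimal Python sketch. It assumes the hash function key mod 7 used above, uses Python lists in place of linked lists, and assumes an insertion order for the first keys, since only the order of 85, 73 and 101 is stated in the text. The names build_chained_table and search are illustrative only.

```python
def build_chained_table(keys, size=7):
    """Insert keys into a table of `size` slots, chaining on collision."""
    table = [[] for _ in range(size)]   # one (initially empty) chain per slot
    for key in keys:
        slot = key % size               # hash function: key mod table size
        table[slot].append(key)         # a collision simply extends the chain
    return table


def search(table, key):
    """Return True if `key` is stored in the chain of its slot."""
    return key in table[key % len(table)]


# Insertion order of the first keys is an assumption.
table = build_chained_table([50, 700, 76, 85, 92, 73, 101])
for slot, chain in enumerate(table):
    print(slot, " -> ".join(map(str, chain)) or "(empty)")
```

Only the chain in the key's own slot is scanned during a search, which is why short chains keep lookups fast.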
So this is how our hash table would look. Now let's look at some of
the advantages and disadvantages of separate chaining.
Advantages:
1) It is a very simple method of handling collisions and is therefore easy to
implement.
2) The hash table never fills up: we can always add new elements to the chains.
3) It is less sensitive to the hash function and the load factor than open
addressing.
4) It is mostly used when it is unknown how many keys will be inserted or
deleted, and how frequently.
Disadvantages:
1) The cache performance of chaining is poor because the keys are stored in
linked lists.
2) Space is wasted: some slots of the table may never be used.
3) If a chain becomes long, search time can become O(n) in the worst case.
4) Extra memory is needed to store the link pointers.
Complexity
Finally, let's look at the performance of separate chaining.
For this we define the load factor α = n/m, where n is the number of keys stored
and m is the number of slots in the table.
The load factor is the average number of keys per slot. Assuming that each key is
equally likely to be hashed to any slot of the table, independent of where other
keys are hashed, the expected time to insert, search, or delete is O(1+α).
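As a small illustration, the load factor of the seven-key example table can be computed directly. This is a sketch only: the function names are hypothetical and the table is written out as a list of Python lists, one per slot.

```python
def load_factor(table):
    """alpha = n / m: number of stored keys divided by number of slots."""
    n = sum(len(chain) for chain in table)
    m = len(table)
    return n / m


def longest_chain(table):
    """Length of the longest chain, i.e. the worst-case scan for one lookup."""
    return max(len(chain) for chain in table)


# Final state of the mod 7 example: slots 0 to 6.
table = [[700], [50, 85, 92], [], [73, 101], [], [], [76]]
print(load_factor(table))    # 1.0
print(longest_chain(table))  # 3
```

The load factor is an average, so an individual chain (here the one in slot 1) can still be longer than α.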
Rehashing
Problem
Use open addressing (linear probing) on a table of integers with hash(k)=k (assume the table does an
internal % hSize):
But if we try to add another element (24), then more than half the slots are occupied…
So we expand the table, and use the hash function to relocate the elements within the larger table…
In this case, I've shown the hash table size doubling, because that's easy to do, despite the fact that it
doesn't lead to prime-number sized tables. If we were going to use quadratic probing, we would
probably keep a table of prime numbers on hand for expansion sizes, and we would probably choose
a set of primes such that each successive prime number was about twice the prior one.
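The expansion step can be sketched in Python as follows. This is a minimal illustration, not the original implementation: the class name, the starting size of 8 and the keys 3, 11, 19 and 27 are assumptions, and only the key 24 comes from the text. It uses hash(k) = k with an internal modulo, linear probing, and doubles the table whenever an insertion would leave more than half the slots occupied.

```python
class LinearProbingTable:
    def __init__(self, size=8):
        self.slots = [None] * size   # None marks an empty slot
        self.count = 0

    def _probe(self, slots, key):
        """Return the index holding `key`, or the first empty slot for it."""
        i = key % len(slots)                 # hash(k) = k, reduced mod size
        while slots[i] is not None and slots[i] != key:
            i = (i + 1) % len(slots)         # linear probing: try next slot
        return i

    def contains(self, key):
        return self.slots[self._probe(self.slots, key)] == key

    def insert(self, key):
        if self.contains(key):
            return
        # Expand first if this key would occupy more than half the slots.
        if 2 * (self.count + 1) > len(self.slots):
            self._rehash(2 * len(self.slots))
        self.slots[self._probe(self.slots, key)] = key
        self.count += 1

    def _rehash(self, new_size):
        """Relocate every key into a larger table using the same hash."""
        new_slots = [None] * new_size
        for key in self.slots:
            if key is not None:
                new_slots[self._probe(new_slots, key)] = key
        self.slots = new_slots


t = LinearProbingTable(8)
for k in [3, 11, 19, 27]:      # hypothetical keys; all collide at slot 3
    t.insert(k)
print(len(t.slots))            # 8: exactly half full, no expansion yet
t.insert(24)                   # a fifth key would exceed half, so expand
print(len(t.slots))            # 16
```

Because the table never becomes more than half full, the probe loop always reaches an empty slot and terminates.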
The rehashing operation can be quite lengthy. Luckily, it doesn't need to be done very often.
We can speed things up somewhat by storing the hash values in the table elements along with the
data so that we don't need to recompute the hash values. Also, if we structure the table as a vector
of pointers to the hash elements, then during the rehashing we will only be copying pointers, not the
entire (potentially large) data elements.
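Both ideas can be sketched together in Python. Everything named here (Entry, slow_hash, place, rehash) is hypothetical, the toy hash function is an assumption, and chained buckets are used only to keep the sketch short. Each entry keeps the hash value computed at creation, and rehashing moves references to the existing entries rather than copying their data.

```python
hash_calls = 0


def slow_hash(key):
    """Toy string hash (an assumption) that counts how often it is called."""
    global hash_calls
    hash_calls += 1
    return sum(ord(c) for c in key)


class Entry:
    def __init__(self, key, value):
        self.hash = slow_hash(key)   # computed once and stored with the data
        self.key = key
        self.value = value


def place(buckets, entry):
    # Only the stored hash is used, so the key is never hashed again.
    buckets[entry.hash % len(buckets)].append(entry)


def rehash(buckets, new_size):
    """Build a larger table by moving references to the existing entries."""
    new_buckets = [[] for _ in range(new_size)]
    for chain in buckets:
        for entry in chain:
            place(new_buckets, entry)
    return new_buckets


entries = [Entry(name, len(name)) for name in ["Janet", "Martha", "Ted"]]
small = [[] for _ in range(4)]
for e in entries:
    place(small, e)
big = rehash(small, 8)
print(hash_calls)   # 3: one call per entry, none during rehashing
```

In Python a list already stores references to objects, which plays the role of the vector of pointers described above.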
Exercise
Question 1)
Write a program that demonstrates operations on hash tables using chaining with
singly linked lists. It must contain the following functions
Question 2)