Hashing With Chaining
Hashing with chaining is an application of linked lists and gives an approach to collision
resolution. In hashing with chaining, the hash table contains linked lists of elements or
pointers to elements (Figure 12.1). The lists are referred to as chains, and the technique is
called chaining. This is a common technique where a straightforward implementation is
desired and maximum efficiency isn't required. Each linked list contains all the elements
whose keys hash to the same index. Using chains reduces search time by dividing the set to
be searched into many smaller pieces. There's nothing inherently wrong with linear
search with small sequences; it's just that it gets slower as the sequences to be searched
get longer. In this approach to resolving collisions, each sequence of elements whose
keys hash to the same value will stay relatively short, so linear search is adequate.
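The description above can be sketched as a small table class. This is a minimal illustration, not a reference implementation: the class name ChainedHashTable, the method names, and the choice h[x] = x mod m for integer keys are assumptions made for the example, and Python lists stand in for linked lists.

```python
class ChainedHashTable:
    """Hash table that resolves collisions by chaining (one chain per slot)."""

    def __init__(self, m=8):
        self.m = m
        # One chain per table index; Python lists stand in for linked lists.
        self.slots = [[] for _ in range(m)]

    def _index(self, key):
        # Assumed hash function for integer keys: h[x] = x mod m.
        return key % self.m

    def insert(self, key, value):
        chain = self.slots[self._index(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:              # key already present: overwrite its value
                chain[i] = (key, value)
                return
        chain.append((key, value))    # new key: add it to the end of the chain

    def search(self, key):
        # Linear search, but only through the one chain the key hashes to.
        for k, v in self.slots[self._index(key)]:
            if k == key:
                return v
        return None

    def delete(self, key):
        chain = self.slots[self._index(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                return True
        return False


table = ChainedHashTable(m=5)
for key, value in [(12, "a"), (7, "b"), (22, "c")]:
    table.insert(key, value)          # 12, 7 and 22 all hash to index 2
```

All three keys collide at index 2, so a search for any of them scans one chain of length three and never touches the other four slots.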
Choice of h: h[x]
must be simple to compute
must distribute the data evenly
Choice of m: m approximates n (about 1 item per linked list), where n = input size.
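To see what the choice of m does to chain length, the sketch below counts how many keys land in each chain for two table sizes. The helper name chain_lengths, the key set, and the use of x mod m as the hash function are illustrative assumptions.

```python
def chain_lengths(keys, m):
    """Return the length of each of the m chains when keys are hashed with x mod m."""
    lengths = [0] * m
    for x in keys:
        lengths[x % m] += 1
    return lengths


keys = list(range(100, 200))              # n = 100 made-up integer keys

same_size = chain_lengths(keys, m=100)    # m = n
too_small = chain_lengths(keys, m=10)     # m much smaller than n

print(max(same_size), sum(same_size) / 100)   # prints: 1 1.0
print(max(too_small), sum(too_small) / 10)    # prints: 10 10.0
```

With m = n every chain here holds one item, while with m = n/10 each search has to scan a chain of ten. The average chain length is always n/m; an even hash function keeps the longest chain close to that average.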
e.g.
1. MOD FUNCTION.
x is an integer value.
h[x] = x mod m.
Choosing the last three digits (738) is more appropriate than the first three digits (398),
as they distribute the data more evenly.
To do this, take the mod function:
x mod m:
h[x] = x mod 10^k: gives the last k digits
h[x] = x mod 2^k: gives the last k bits
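The two mod forms, and the point about leading versus trailing digits, can be checked with a short sketch. The function names and the sample keys are made up for illustration; in particular the block of keys starting at 398000 is an assumed data set chosen so that every key shares the same first three digits.

```python
def last_k_digits(x, k):
    """h[x] = x mod 10^k keeps the last k decimal digits of x."""
    return x % 10**k


def last_k_bits(x, k):
    """h[x] = x mod 2^k keeps the last k bits of x (for non-negative x)."""
    return x % 2**k


def first_three_digits(x):
    """Leading three decimal digits, shown only for comparison."""
    return int(str(x)[:3])


print(last_k_digits(5551234, 3))   # prints: 234
print(last_k_bits(0b101101, 3))    # prints: 5  (binary 101)

# Made-up keys that all begin with 398, such as consecutive ID numbers.
keys = range(398000, 398100)
print(len({first_three_digits(x) for x in keys}))   # prints: 1
print(len({last_k_digits(x, 3) for x in keys}))     # prints: 100
```

On this data the leading digits send all 100 keys to a single index, while the trailing digits give 100 different indices.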
2. MIDDLE DIGITS OF AN INTEGER
This often yields unpredictable (and thus good) distributions of the data.
Assume that you wish to take the two digits three positions from the right of x.
This is obtained by
h[x] = (x/1000) mod 100
where the integer division x/1000 drops the last three digits and mod 100 keeps the next two digits.
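As a quick check of the middle-digits formula, here is a minimal sketch using integer division. The function name middle_digits and its drop and keep parameters are assumptions introduced for the example.

```python
def middle_digits(x, drop=3, keep=2):
    """Drop the last `drop` decimal digits of x, then keep the next `keep` digits."""
    return (x // 10**drop) % 10**keep


print(middle_digits(9876543))   # 9876543 // 1000 = 9876, and 9876 mod 100 = 76
print(middle_digits(1234567))   # 1234567 // 1000 = 1234, and 1234 mod 100 = 34
```

Note that the division must be integer division (`//` in Python); ordinary floating-point division would keep the fractional part and give a non-integer result.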
(Application: Sorting)
4. PERFECT HASH FUNCTION
o A perfect hashing function is one that causes no collisions.
o Perfect hashing functions can be found only under certain conditions.
o One application of the perfect hash function is a static dictionary.
o h[x] is designed after having peeked at the data.
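One simple way to illustrate "peeking at the data" is to search for a table size m that makes x mod m collision-free on a fixed key set. This brute-force search is only a sketch of the idea, not a standard perfect-hashing construction; the function name, the search limit, and the sample keys are all assumptions.

```python
def find_perfect_modulus(keys, max_m=10_000):
    """Return the smallest m >= n for which x mod m has no collisions on keys.

    Assumes the keys are distinct integers. Returns None if no such m is
    found up to max_m.
    """
    n = len(keys)
    for m in range(max(n, 1), max_m + 1):
        if len({x % m for x in keys}) == n:   # every key gets its own index
            return m
    return None


static_keys = [10, 23, 37, 41, 58]            # fixed, known in advance
m = find_perfect_modulus(static_keys)
print(m)                                      # prints: 10

# Build the static dictionary: one slot per key, no chains needed.
table = [None] * m
for x in static_keys:
    table[x % m] = x


def contains(x):
    """Membership test that needs exactly one probe."""
    return table[x % m] == x
```

Because the key set never changes, the search for m is done once, and every later lookup inspects a single table entry.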
Analysis
Here we analyse the number of probes in a table for doing "search". A probe is a check of
a table entry. All other components of the algorithm are ignored, but as they are
collectively bounded by the number of probes, it suffices to just consider probes.
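The probe count can be made concrete with a search routine that reports how many entries it checked. In this sketch a probe is taken to mean one comparison against an element stored in a chain; that reading, the function names, and the sample keys are assumptions made for illustration.

```python
def build_table(keys, m):
    """Chained table for integer keys using the assumed hash x mod m."""
    slots = [[] for _ in range(m)]
    for x in keys:
        slots[x % m].append(x)
    return slots


def search_with_probes(slots, key):
    """Return (found, probes), where probes counts the chain entries checked."""
    probes = 0
    for stored in slots[key % len(slots)]:
        probes += 1
        if stored == key:
            return True, probes
    return False, probes


slots = build_table([12, 7, 22, 3], m=5)   # chain at index 2 is [12, 7, 22]

print(search_with_probes(slots, 22))   # prints: (True, 3)
print(search_with_probes(slots, 3))    # prints: (True, 1)
print(search_with_probes(slots, 17))   # prints: (False, 3)
print(search_with_probes(slots, 4))    # prints: (False, 0)
```

An unsuccessful search always scans its whole chain, whereas a successful one stops as soon as the key is found.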
2. SPACE
Where (i-1)/m comes from the time needed to insert the i-th element into a
table with i-1 elements. By the unsuccessful search part, the expected time is
(i-1)/m.
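The (i-1)/m term can be verified with exact arithmetic. The sketch assumes simple uniform hashing, meaning the new key is equally likely to land in each of the m slots, and it measures cost as the number of elements already in the chain the new key falls into. The function name and the sample keys are illustrative.

```python
from fractions import Fraction


def expected_scan_for_next_insert(keys_so_far, m):
    """Expected chain length seen by the next key, averaged over all m slots."""
    lengths = [0] * m
    for x in keys_so_far:
        lengths[x % m] += 1
    # Each slot is hit with probability 1/m, so average the chain lengths.
    return sum(Fraction(length, m) for length in lengths)


keys_so_far = [3, 8, 13, 21, 34, 55]   # i - 1 = 6 elements already stored
m = 4
print(expected_scan_for_next_insert(keys_so_far, m))   # prints: 3/2
print(Fraction(len(keys_so_far), m))                   # prints: 3/2
```

The two values agree however the existing keys are spread over the chains, because the chain lengths always sum to i-1 and the average is taken over m slots.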