Hashing: An Ideal Hash Table
Hashing is a technique for performing insertions, deletions and finds in constant
average time.
It takes a potentially huge range of values and maps it to a much smaller range of values.
HASH TABLE
A hash table is a data structure for storing a set of items so that we can quickly
perform searches, insertions and deletions.
A hash table is normally created with a certain number of buckets or storage locations.
A hash table data structure is merely an array of fixed size, containing keys.
Hash tables are one of the most efficient means of searching for data.
LOAD FACTOR
Load factor is the ratio of the number of elements in the hash table to the table size.
It is denoted by lambda (λ):
λ = N / M
Where;
N = Number of Elements/Keys.
M = Size of Table.
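As a small illustration of the ratio above, the following sketch computes the load factor for a table. The function name load_factor and the sample numbers (4 keys in 7 slots) are chosen here for illustration only.

```python
def load_factor(num_keys: int, table_size: int) -> float:
    """Return lambda = N / M for a table of M slots holding N keys."""
    if table_size <= 0:
        raise ValueError("table size must be positive")
    return num_keys / table_size


# Four keys stored in a table of seven slots.
print(round(load_factor(4, 7), 3))  # 0.571
```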
HASH FUNCTION
The hashing algorithm is called the hash function.
The values returned by a hash function are called hash values, hash codes, hash sums, or
simply hashes.
There are seven different methods of hashing, or ways to insert values into a key-accessed
table, but the two most popular methods are:
1. Division Method
2. Multiplication Method
1. DIVISION METHOD
Formula: H (k) = k mod m (if the table starts at index 0)
H (k) = (k mod m) + 1 (if the table starts at index 1)
Where;
m = table size
k = key
EXAMPLE:
Store the given key values {18, 12, 71, 28} into the hash table of m=7 using the division
method.
H (18) = 18 mod 7 = 4
H (12) = 12 mod 7 = 5
H (71) = 71 mod 7 = 1
H (28) = 28 mod 7 = 0
- Placing the given key values at their respective hash values, we get the hash table:
Hash value   Key
0            28
1            71
2
3
4            18
5            12
6
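The division method example can be checked with a short sketch. It assumes a 0-indexed table and keys that do not collide; the helper names division_hash and build_table are illustrative and not part of the original notes.

```python
def division_hash(key: int, table_size: int) -> int:
    """Division method: H(k) = k mod m for a table indexed from 0."""
    return key % table_size


def build_table(keys, table_size):
    """Place each key at its hash value; this sketch does not resolve collisions."""
    table = [None] * table_size
    for key in keys:
        index = division_hash(key, table_size)
        if table[index] is not None:
            raise ValueError(f"collision at index {index}")
        table[index] = key
    return table


table = build_table([18, 12, 71, 28], 7)
for index, key in enumerate(table):
    print(index, key)  # index 0 holds 28, 1 holds 71, 4 holds 18, 5 holds 12
```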
2. MULTIPLICATION METHOD
Formula: H (k) = floor (m (kA mod 1))
Where;
m = table size
k = key
A = 0.01 (constant, 0 < A < 1)
(kA mod 1) = fractional part of kA
EXAMPLE:
Store the given key values {18, 25, 71} into the hash table of m=5 using multiplication
method.
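One way to work this example is sketched below. It assumes the value m (kA mod 1) is rounded down (floor) to obtain an integer index, and it represents A = 0.01 as an exact fraction to avoid floating-point rounding. The name multiplication_hash is illustrative.

```python
from fractions import Fraction
from math import floor

A = Fraction(1, 100)  # the constant A = 0.01, kept exact


def multiplication_hash(key: int, table_size: int, a: Fraction = A) -> int:
    """Multiplication method: floor(m * (k*A mod 1))."""
    fractional_part = (key * a) % 1  # fractional part of k*A
    return floor(table_size * fractional_part)


for key in (18, 25, 71):
    print(key, multiplication_hash(key, 5))
# 18 0
# 25 1
# 71 3
```

Under these assumptions, 18 goes to index 0, 25 to index 1 and 71 to index 3 of the 5-slot table.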
SEPARATE CHAINING
A Chained hash table is one where the data is stored in linked lists which contain all of
the data entries whose keys map to the same hash value. The linked list (or bucket) can
grow to contain as many entries as required but as the list grows it becomes gradually less
efficient in the speed at which the entries can be accessed.
The main disadvantage of chaining is that one needs 3m memory cells for the data.
A further disadvantage is that it requires pointers, which tends to slow the algorithm down a bit.
Bucket   Chain
0        A
1
2
3
4
5        E -> C
6
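A minimal sketch of separate chaining follows. It assumes integer keys, the division method as the hash function, and uses Python lists in place of linked lists for the buckets. The class name ChainedHashTable and the sample keys are illustrative.

```python
class ChainedHashTable:
    """Separate chaining: each slot holds a list of all keys hashing to it."""

    def __init__(self, table_size: int = 7):
        self.buckets = [[] for _ in range(table_size)]

    def _index(self, key: int) -> int:
        return key % len(self.buckets)  # division method

    def insert(self, key: int) -> None:
        bucket = self.buckets[self._index(key)]
        if key not in bucket:  # ignore duplicate keys
            bucket.append(key)

    def find(self, key: int) -> bool:
        return key in self.buckets[self._index(key)]

    def delete(self, key: int) -> bool:
        bucket = self.buckets[self._index(key)]
        if key in bucket:
            bucket.remove(key)
            return True
        return False


table = ChainedHashTable(7)
for key in (18, 12, 71, 28, 25):
    table.insert(key)
print(table.buckets[4])  # [18, 25] because 18 and 25 both hash to 4
```

Searching a bucket is a linear scan, which is why access slows down as a chain grows.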
OPEN ADDRESSING
In the open addressing hash table scheme, the elements are stored in the hash table
itself.
In open addressing, if a collision occurs, alternative cells are tried until an empty space is
found.
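Standard deletion cannot be performed in open addressing hash tables because the cell
being deleted might have caused a collision: a later key may have probed past it, and
emptying the cell would make that key unreachable.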
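One common workaround, not described in these notes, is lazy deletion: a deleted cell is marked with a special marker instead of being emptied, so later searches keep probing past it. The sketch below assumes integer keys, no duplicate keys, and the simplest probe order (try the next cell, wrapping around). The names OpenAddressTable, EMPTY and DELETED are illustrative.

```python
EMPTY = object()    # cell that has never been used
DELETED = object()  # marker left behind by a lazy deletion


class OpenAddressTable:
    def __init__(self, table_size: int = 10):
        self.slots = [EMPTY] * table_size

    def _probe(self, key: int):
        """Yield the cells to try: home cell first, then the following cells."""
        size = len(self.slots)
        for i in range(size):
            yield (key + i) % size

    def insert(self, key: int) -> int:
        # Assumes the key is not already stored.
        for index in self._probe(key):
            if self.slots[index] is EMPTY or self.slots[index] is DELETED:
                self.slots[index] = key
                return index
        raise RuntimeError("table is full")

    def find(self, key: int):
        for index in self._probe(key):
            slot = self.slots[index]
            if slot is EMPTY:  # a never-used cell ends the search
                return None
            if slot is not DELETED and slot == key:
                return index
        return None

    def delete(self, key: int) -> bool:
        index = self.find(key)
        if index is None:
            return False
        self.slots[index] = DELETED  # mark the cell, do not empty it
        return True


table = OpenAddressTable(10)
table.insert(19)       # stored at index 9
table.insert(39)       # collides at 9, stored at index 0
table.delete(19)
print(table.find(39))  # 0: still reachable because index 9 is marked, not emptied
```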
The three common collision resolution strategies for open addressing are:
1. Linear Probing
2. Quadratic Probing
3. Double Hashing
1. LINEAR PROBING
Simplest but least efficient method: occupied cells tend to form long runs (primary clustering).
2. QUADRATIC PROBING
Eliminates primary clustering problem of linear probing.
There is no guarantee of finding an empty cell once the table gets half full.
Faster than linear probing in practice.
If quadratic probing is used, and the table size is prime, then a new element can always
be inserted if the table is at least half empty.
FORMULA FOR QUADRATIC PROBING:
H (k, i) = (k + i^2) mod m
Where;
i = probe number
m = table size
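A minimal sketch of quadratic probing follows. It assumes the conventional formulation in which the probe number i restarts at 0 for every key and the cell tried is (k + i*i) mod m. The function name quadratic_insert and the sample keys are illustrative.

```python
def quadratic_insert(table: list, key: int) -> int:
    """Insert key using quadratic probing; return the index used."""
    size = len(table)
    for i in range(size):  # i = 0 is the home cell
        index = (key + i * i) % size
        if table[index] is None:
            table[index] = key
            return index
    # Quadratic probing may fail even when free cells remain.
    raise RuntimeError(f"no free cell found for key {key}")


table = [None] * 10
for key in (199, 11, 354, 99, 343, 9):
    quadratic_insert(table, key)
print(table)
# [99, 11, None, 343, 354, None, None, None, 9, 199]
```

Here 99 collides with 199 at index 9 and lands at index 0 on the first probe, while 9 needs three probes (indexes 0, 3, then 8).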
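3. DOUBLE HASHING
Uses a second hash function to determine the probe step, which makes it the most
effective of the three probing approaches at avoiding clustering.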
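The notes do not give a formula for double hashing, so the sketch below rests on assumptions: the first hash is the division method, the second hash is H2 (k) = R - (k mod R) with R = 7, and the table size is the prime 11 so that every step size eventually visits every cell. The function name double_hash_insert and the sample keys are illustrative.

```python
def double_hash_insert(table: list, key: int, r: int = 7) -> int:
    """Insert key using double hashing; return the index used."""
    size = len(table)
    start = key % size     # first hash: division method
    step = r - (key % r)   # second hash (assumed form): always in 1..r, never 0
    for i in range(size):
        index = (start + i * step) % size
        if table[index] is None:
            table[index] = key
            return index
    raise RuntimeError(f"no free cell found for key {key}")


table = [None] * 11
for key in (19, 30, 41):   # all three keys hash to index 8
    print(key, double_hash_insert(table, key))
# 19 8
# 30 2
# 41 9
```

Although all three keys share the same home cell, each follows a different probe step, so they do not pile up next to each other.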
Data:
M=10, index 0-9
First finding the hash values for the given keys using the division method formula
H (k) = k (mod 10)
H (19) = 19 (mod 10) = 9
H (211) = 211 (mod 10) = 1
H (54) = 54 (mod 10) = 4
H (39) = 39 (mod 10) = 9
H (43) = 43 (mod 10) = 3
H (119) = 119 (mod 10) = 9
H (1101) = 1101 (mod 10) = 1
H (7) = 7 (mod 10) = 7
H (111) = 111 (mod 10) = 1
H (207) = 207 (mod 10) = 7
Collisions occur for the keys 39, 119, 1101, 111 and 207, whose hash values are already
occupied by earlier keys.
USING LINEAR PROBING:
Hash value   Key
9            19
FOR 111:
H (111, 5) = (111+5) mod 10
= 116 mod 10
= 6 (empty space found)
FOR 207:
Now the value of "i" will be 6.
H (207, 6) = (207+6) mod 10
= 213 mod 10
= 3 (already occupied, put i=7 now)
H (207, 7) = (207+7) mod 10
= 214 mod 10
= 4 (already occupied, put i=8 now)
H (207, 8) = (207+8) mod 10
= 215 mod 10
= 5 (already occupied, put i=9 now)
H (207, 9) = (207+9) mod 10
= 216 mod 10
= 6 (already occupied, put i=10 now)
H (207, 10) = (207+10) mod 10
= 217 mod 10
= 7 (already occupied, put i=11 now)
H (207, 11) = (207+11) mod 10
= 218 mod 10
= 8 (empty space found)
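The whole linear probing example can be reproduced with a short sketch. It uses the conventional formulation, in which i restarts at 0 for each key and the cell tried is (k + i) mod 10, rather than the running value of i used in the steps above. For this data it still places 111 at index 6 and 207 at index 8. The function name linear_insert is illustrative.

```python
def linear_insert(table: list, key: int) -> int:
    """Insert key using linear probing; return the index used."""
    size = len(table)
    for i in range(size):  # i = 0 is the home cell
        index = (key + i) % size
        if table[index] is None:
            table[index] = key
            return index
    raise RuntimeError("table is full")


table = [None] * 10
for key in (19, 211, 54, 39, 43, 119, 1101, 7, 111, 207):
    linear_insert(table, key)
print(table)
# [39, 211, 119, 43, 54, 1101, 111, 7, 207, 19]
```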
Using quadratic probing, store the following elements in the hash table:
199, 11, 354, 99, 343, 9, 101, 77, 1011, 107
H[0] H[1] H[2] H[3] H[4] H[5] H[6] H[7] H[8] H[9]
Data:
M=10, index 0-9
First finding the hash values for the given keys using division method formula
H (k) = k (mod 10)
H (199) = 199 (mod 10) =9
H[0]  H[1]  H[2]  H[3]  H[4]  H[5]  H[6]  H[7]  H[8]  H[9]
-     -     -     -     -     -     -     -     -     199
H[0]  H[1]  H[2]  H[3]  H[4]  H[5]  H[6]  H[7]  H[8]  H[9]
99    11    -     -     354   -     -     -     -     199
H[0]  H[1]  H[2]  H[3]  H[4]  H[5]  H[6]  H[7]  H[8]  H[9]
99    11    -     343   354   -     -     -     9     199
H[0]  H[1]  H[2]  H[3]  H[4]  H[5]  H[6]  H[7]  H[8]  H[9]
99    11    77    343   354   1011  -     101   9     199
APPLICATIONS OF HASH TABLES / HASHING
1. Hashing can be used to reduce the time and space requirements of programs.
2. Hashing provides powerful techniques for pattern matching.
3. Compilers use hash tables to keep track of declared variables in source code.
4. Hash tables are useful for any graph theory problem where the nodes have real names
instead of numbers.