vision_cs_2023_algorithm_chapter_2_hashing_85
byjusexamprep.com
ALGORITHM
2 HASHING
Hash Table is a data structure that stores data in an associative manner. In a hash table, data
is stored in an array format, where each data value has its own unique index value. Access to
data becomes very fast if we know the index of the desired data.
In data structures,
• Hashing is a well-known technique to search for any particular element among several
elements.
• It minimizes the number of comparisons while performing the search.
Advantage-
Unlike other searching techniques,
• Hashing is extremely efficient.
• On average, the time taken to perform the search does not depend upon the total number of
elements.
• It completes the search with an average-case time complexity of O(1).
Hashing Mechanism-
In hashing,
• An array data structure called a Hash table is used to store the data items.
• Based on the hash key value, data items are inserted into the hash table.
HASH FUNCTION
The hash function is a function that maps any big number or string to a small integer value.
• Hash function takes the data item as an input and returns a small integer value as an output.
• The small integer value is called a hash value.
• Hash value of the data item is then used as an index for storing it into the hash table.
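A minimal sketch of such a hash function is given below. The name `hash_value`, the base-31 folding of strings and the table size 7 are assumptions made for illustration only; the notes themselves use `key mod m` for integer keys.

```python
def hash_value(key, table_size):
    """Map an integer or string key to a slot index in range(table_size)."""
    if isinstance(key, str):
        # Assumption for illustration: fold the characters into one big
        # integer with a simple polynomial scheme (base 31).
        number = 0
        for ch in key:
            number = number * 31 + ord(ch)
    else:
        number = key
    # The small integer returned here is the hash value (the table index).
    return number % table_size


print(hash_value(50, 7))     # 50 mod 7 = 1
print(hash_value(700, 7))    # 700 mod 7 = 0
print(hash_value("cat", 7))  # some index between 0 and 6
```

Whatever the input, the result always lies between 0 and `table_size - 1`, so it can be used directly as an array index.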
Clustering
Primary clustering:
The tendency for long sequences of occupied positions to become even longer, primarily in one
place.
Secondary clustering:
The tendency for long sequences of occupied positions to become even longer, primarily in
different places.
Collision in Hashing-
When the hash value of a key maps to an already occupied bucket of the hash table, it is
called a collision.
In hashing,
• Hash function is used to compute the hash value for a key.
• Hash value is then used as an index to store the key in the hash table.
• Hash function may return the same hash value for two or more keys.
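The points above can be checked with a short sketch. The helper name `find_collisions` is hypothetical, and the hash function `key mod 10` with the sample keys is only an illustrative choice.

```python
def find_collisions(keys, table_size):
    """Group keys by hash value and keep only the groups with 2 or more keys."""
    buckets = {}
    for key in keys:
        buckets.setdefault(key % table_size, []).append(key)
    return {slot: group for slot, group in buckets.items() if len(group) > 1}


collisions = find_collisions([25, 98, 57, 75, 97, 18], 10)
print(collisions)  # {5: [25, 75], 8: [98, 18], 7: [57, 97]}
```

Every group in the result is a set of keys that the hash function sends to the same slot, so some collision handling technique is needed.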
1. Separate Chaining-
To handle the collision,
• This technique creates a linked list at the slot where the collision occurs.
• The new key is then inserted into the linked list.
• The linked lists attached to the slots look like chains.
• That is why this technique is called separate chaining.
Collision resolution by chaining combines a linked representation with a hash table. When
two or more records hash to the same location, these records are placed in a singly linked
list called a chain.
Time Complexity-
For Searching-
• In the worst case, all the keys might map to the same bucket of the hash table.
• In such a case, all the keys will be present in a single linked list.
• Sequential search will have to be performed on the linked list to perform the search.
• So, time taken for searching in the worst case is O(n).
For Deletion-
• In the worst case, the key might have to be searched first and then deleted.
• In the worst case, time taken for searching is O(n).
• So, time taken for deletion in the worst case is O(n).
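A rough sketch of separate chaining follows. It is not a definitive implementation: the class name `ChainedHashTable` is hypothetical, Python lists stand in for the linked lists, and `key mod size` is assumed as the hash function.

```python
class ChainedHashTable:
    """Hash table with separate chaining (Python lists stand in for linked lists)."""

    def __init__(self, size):
        self.size = size
        self.slots = [[] for _ in range(size)]  # one chain per slot

    def _index(self, key):
        return key % self.size

    def insert(self, key):
        chain = self.slots[self._index(key)]
        if key not in chain:
            chain.append(key)

    def search(self, key):
        # Sequential search within one chain only.
        return key in self.slots[self._index(key)]

    def delete(self, key):
        chain = self.slots[self._index(key)]
        if key in chain:
            chain.remove(key)
            return True
        return False


# Worst case: every key maps to slot 0, so one chain holds all n keys.
worst = ChainedHashTable(7)
for key in (7, 14, 21, 28):
    worst.insert(key)
print(worst.slots[0])  # [7, 14, 21, 28]
```

In the worst case shown, searching or deleting 28 walks the entire chain, which is the O(n) behaviour described above.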
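Load Factor (α)-
Load factor (α) is defined as-
$\alpha = \frac{n}{m}$, where n is the number of keys stored in the hash table and m is the number of slots.
If Load factor (α) = constant, then the time complexity of Insert, Search, Delete = Θ(1)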
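As a small illustration, the load factor of a chained table can be computed as below. The function name `load_factor` and the sample chains are assumptions used only for this sketch.

```python
def load_factor(chains):
    """Return n / m for a chained hash table given as a list of chains."""
    n = sum(len(chain) for chain in chains)  # number of keys stored
    m = len(chains)                          # number of slots
    return n / m


sample = [[700], [50, 85, 92], [], [73, 101], [], [], [76]]
print(load_factor(sample))  # 7 keys / 7 slots = 1.0
```

With chaining, the load factor is also the average chain length, which is why a constant α keeps the average cost of each operation constant.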
Example-
Using the hash function ‘key mod 7’, insert the following sequence of keys in the hash
table:
50, 700, 76, 85, 92, 73 and 101
Use a separate chaining technique for collision resolution.
Solution-
The given sequence of keys will be inserted in the hash table as-
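The resulting table can be reproduced with the short sketch below. It assumes that each new key is appended at the end of its chain; Python lists stand in for the linked lists.

```python
keys = [50, 700, 76, 85, 92, 73, 101]
table = [[] for _ in range(7)]  # 7 slots, indices 0 to 6

for key in keys:
    table[key % 7].append(key)  # hash function: key mod 7

for slot, chain in enumerate(table):
    print(slot, chain)
# 0 [700]
# 1 [50, 85, 92]
# 2 []
# 3 [73, 101]
# 4 []
# 5 []
# 6 [76]
```

Keys 50, 85 and 92 all hash to slot 1, and keys 73 and 101 both hash to slot 3, so those two slots hold chains with more than one key.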
Advantages of chaining:
1. Colliding records require fewer probes.
2. Deletion is possible.
3. No overflow problem.
Double hashing:-
m = 10 (slots 0 … 9)
H.F1(key) = key mod m
H.F2(key) = 1 + (key mod (m − 2))
D.H(key, i) = (H.F1(key) + i × H.F2(key)) mod m
i = 0, 1, …, 9 (m − 1)
Hash Table
Index  Key
1      97
2      78
4      18
5      25
7      57
8      98
9      75
Key 25: H.F1(25) = 25 mod 10 = 5
H.F2(25) = 1 + (25 mod 8) = 2
D.H(25, 0) = (5 + 0 × 2) mod 10 = 5
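Key 98: H.F1(98) = 98 mod 10 = 8
H.F2(98) = 1 + (98 mod 8) = 3
D.H(98, 0) = 8
Key 57: H.F1(57) = 57 mod 10 = 7
H.F2(57) = 1 + (57 mod 8) = 1 + 1 = 2
D.H(57, 0) = 7
Key 75: H.F1(75) = 75 mod 10 = 5
H.F2(75) = 1 + (75 mod 8) = 4
D.H(75, 0) = 5 + 0 = 5 → collision
D.H(75, 1) = 5 + 4 = 9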
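The calculations for keys 25, 98, 57 and 75 can be sketched in code as follows. The function name `double_hash_insert` is hypothetical, and the probe formula (H.F1 + i × H.F2) mod m is the one implied by the worked steps above.

```python
def double_hash_insert(table, key):
    """Insert key with double hashing; return the list of slots probed."""
    m = len(table)
    h1 = key % m              # H.F1(key) = key mod m
    h2 = 1 + (key % (m - 2))  # H.F2(key) = 1 + (key mod (m - 2))
    probed = []
    for i in range(m):
        slot = (h1 + i * h2) % m
        probed.append(slot)
        if table[slot] is None:
            table[slot] = key
            return probed
    raise RuntimeError("no free slot found for key %d" % key)


table = [None] * 10
for key in (25, 98, 57, 75):
    print(key, double_hash_insert(table, key))
# 25 [5]
# 98 [8]
# 57 [7]
# 75 [5, 9]
```

Key 75 collides with 25 at slot 5 and then jumps by its own step H.F2(75) = 4 to slot 9. Because the step depends on the key, two keys with the same starting slot usually follow different probe paths.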
QP(40, 2) = 40 + 2 + 4 = 46 → 6
QP(50, 0) = 50 + 0 + 0 = 50 → 0 → collision
QP(50, 1) = 50 + 1 + 1 = 52 → 2 → collision
QP(50, 2) = 50 + 2 + 4 = 56 → 6 → collision
QP(50, 3) = 50 + 3 + 9 = 62 → 2 → collision
QP(50, 4) = 50 + 4 + 16 = 70 → 0 → collision
QP(50, 5) = 50 + 5 + 25 = 80 → 0 → collision
Secondary clustering:-
If two keys map to the same starting location in the hash table, they both follow the
same quadratic probe path unnecessarily. Because of this, the search time complexity
increases.
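A small sketch makes this visible. The helper `quadratic_probe_sequence` is a hypothetical name, and it assumes the probe formula (key mod m + C1 × i + C2 × i²) mod m with C1 = C2 = 1 and m = 10, as used elsewhere in these notes.

```python
def quadratic_probe_sequence(key, m, c1=1, c2=1):
    """Return the slots visited for i = 0, 1, ..., m - 1."""
    home = key % m
    return [(home + c1 * i + c2 * i * i) % m for i in range(m)]


seq_25 = quadratic_probe_sequence(25, 10)
seq_75 = quadratic_probe_sequence(75, 10)
print(seq_25[:3])        # [5, 7, 1]
print(seq_25 == seq_75)  # True: same starting slot, same path

# Key 50 only ever visits slots 0, 2 and 6 with these constants.
print(sorted(set(quadratic_probe_sequence(50, 10))))  # [0, 2, 6]
```

Keys 25 and 75 start at slot 5 and then visit exactly the same slots, which is secondary clustering. The last line also shows that, with C1 = C2 = 1 and m = 10, a key may never reach some slots even when they are free.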
Time Complexity
Conclusion:- If n ≤ m, then by using perfect hashing we can achieve a worst-case search
time complexity of O(1). (99%)
NOTE:
1) The expected number of probes in an unsuccessful search with the open addressing
technique is $\frac{1}{1-\alpha}$, where $\alpha = \frac{n}{m}$ is the load factor.
2) The expected number of probes in a successful search with the open addressing technique is
$\frac{1}{\alpha} \log \frac{1}{1-\alpha}$, where $\alpha = \frac{n}{m}$ is the load factor.
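The two formulas can be evaluated numerically as sketched below. The function names are hypothetical, the logarithm is assumed to be the natural logarithm, and both functions require 0 < n < m.

```python
import math


def expected_probes_unsuccessful(n, m):
    """1 / (1 - alpha), with alpha = n / m."""
    alpha = n / m
    return 1 / (1 - alpha)


def expected_probes_successful(n, m):
    """(1 / alpha) * ln(1 / (1 - alpha)), with alpha = n / m."""
    alpha = n / m
    return (1 / alpha) * math.log(1 / (1 - alpha))


# Half-full table: n = 5 keys in m = 10 slots, so alpha = 0.5.
print(expected_probes_unsuccessful(5, 10))          # 2.0
print(round(expected_probes_successful(5, 10), 3))  # 1.386
```

Both values grow quickly as α approaches 1, which is why open addressing tables are usually kept well below full.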
LP(58, 1) = 8 + 1 = 9 → collision
LP(58, 2) = 8 + 2 = 10 → 0 → collision
LP(58, 3) = 8 + 3 = 11 → 1
8) LP(65, 0) = 5 + 0 = 5 → collision
LP(65, 1) = 5 + 1 = 6 → collision
LP(65, 2) = 5 + 2 = 7
9) LP(20, 0) = 0 + 0 = 0 → collision
LP(20, 1) = 0 + 1 = 1 → collision
LP(20, 2) = 0 + 2 = 2
Load factor (α):-
The number of keys stored per slot is called the load factor.
In m slots we store n keys.
In 1 slot we store $\frac{1}{m} \times n = \frac{n}{m}$ keys.
$\alpha = \frac{n}{m}$
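Eg:-
Keys (n) = 20, 31, 42, 53, 60, 70
LP(60, 0) = (60 + 0) mod 10 = 0 → collision
LP(60, 1) = (60 + 1) mod 10 = 1 → collision
LP(60, 2) = (60 + 2) mod 10 = 2 → collision
LP(60, 3) = (60 + 3) mod 10 = 3 → collision
LP(60, 4) = (60 + 4) mod 10 = 4
LP(70, 0) = 0 → collision
LP(70, 1) = 1 → collision
LP(70, 2) = 2 → collision
LP(70, 3) = 3 → collision
LP(70, 4) = 4 → collision
LP(70, 5) = 5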
Primary clustering:
1) If two keys map to the same starting location in the hash table, they both follow the
same linear probe path unnecessarily. Because of this, the search time complexity
increases.
2) To avoid this problem, quadratic probing is used.
             Insertion   Searching   Deletion
Best case    O(1)        O(1)        O(1)
Worst case   O(m)        O(m)        O(m)
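The growth of the cluster for keys 20, 31, 42, 53, 60, 70 can be measured with the sketch below. The function name `linear_probe_counts` is hypothetical, and a table of m = 10 slots with the probe formula (key mod m + i) mod m is assumed.

```python
def linear_probe_counts(keys, m):
    """Insert keys with linear probing; return (table, probes needed per key)."""
    table = [None] * m
    counts = {}
    for key in keys:
        for i in range(m):
            slot = (key % m + i) % m
            if table[slot] is None:
                table[slot] = key
                counts[key] = i + 1  # slots examined, including the free one
                break
    return table, counts


table, counts = linear_probe_counts([20, 31, 42, 53, 60, 70], 10)
print(counts)  # {20: 1, 31: 1, 42: 1, 53: 1, 60: 5, 70: 6}
```

Keys 60 and 70 both start at slot 0 and must walk the whole run of occupied slots, and each insertion makes that run one slot longer. This is the primary clustering effect.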
Eg:- Keys (n) = 25, 98, 57, 75, 97, 18
Hash Table
Index  Key
0      18
1      75
5      25
7      57
8      98
9      97
1) QP(25, 0) = 5 + 0 + 0 = 5
2) QP(98, 0) = 8 + 0 + 0 = 8
3) QP(57, 0) = 7 + 0 + 0 = 7
4) QP(75, 0) = 5 + 0 + 0 = 5 → collision
QP(75, 1) = 5 + 1 + 1 = 7 → collision
QP(75, 2) = 5 + 2 + 4 = 11 mod 10 = 1
5) QP(97, 0) = 97 + 0 + 0 = 97 → 7 → collision
QP(97, 1) = 97 + 1 + 1 = 99 → 9
6) QP(18, 0) = 18 + 0 + 0 = 18 → 8 → collision
QP(18, 1) = 18 + 1 + 1 = 20 mod 10 = 0
Total collisions = 4
Note: Deletion can break the search for other keys, but we can manage this by storing a
special symbol such as $ or # in the deleted slot.
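One way to read this note is sketched below: a deleted slot is marked with `$` instead of being emptied, so a search does not stop too early. The names `EMPTY`, `DELETED`, `probe`, `insert`, `search` and `delete` are illustrative, quadratic probing with C1 = C2 = 1 is assumed, and the sketch does not guard against inserting duplicate keys.

```python
EMPTY = None
DELETED = "$"  # special marker left behind by a deletion


def probe(key, i, m):
    return (key % m + i + i * i) % m


def insert(table, key):
    m = len(table)
    for i in range(m):
        slot = probe(key, i, m)
        if table[slot] is EMPTY or table[slot] == DELETED:
            table[slot] = key  # deleted slots may be reused
            return slot
    raise RuntimeError("no free slot found")


def search(table, key):
    m = len(table)
    for i in range(m):
        slot = probe(key, i, m)
        if table[slot] is EMPTY:
            return None  # a truly empty slot ends the search
        if table[slot] == key:
            return slot
    return None


def delete(table, key):
    slot = search(table, key)
    if slot is not None:
        table[slot] = DELETED
    return slot


table = [EMPTY] * 10
for key in (25, 98, 57, 75):
    insert(table, key)
delete(table, 57)         # slot 7 now holds "$"
print(search(table, 75))  # 1
```

Key 75 was placed by probing slots 5, 7 and 1. If slot 7 had simply been emptied when 57 was deleted, the search for 75 would stop at slot 7 and wrongly report that 75 is absent.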
Quadratic probing:-
m = 10 (slots 0 … 9)
H.F(key) = key mod m
Q.P(key, i) = (H.F(key) + C1 × i + C2 × i²) mod m
C1 = 1, C2 = 1, i = 0, 1, …, 9 (m − 1)
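The definition above can be sketched in code as follows, using the keys 25, 98, 57, 75, 97, 18 from the example. The function name `quadratic_probe_insert` is hypothetical.

```python
def quadratic_probe_insert(table, key, c1=1, c2=1):
    """Insert key with quadratic probing; return the number of collisions."""
    m = len(table)
    for i in range(m):
        slot = (key % m + c1 * i + c2 * i * i) % m
        if table[slot] is None:
            table[slot] = key
            return i  # i occupied slots were seen before this free one
    raise RuntimeError("no free slot found for key %d" % key)


table = [None] * 10
total = sum(quadratic_probe_insert(table, k) for k in (25, 98, 57, 75, 97, 18))
print(table)  # [18, 75, None, None, None, 25, None, 57, 98, 97]
print(total)  # 4
```

The function gives up after m probes, because quadratic probing is not guaranteed to visit every slot of the table.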
Eg:- Keys (n) = 25, 38, 43, 68, 79, 46, 58, 65, 20
Hash Table
Index  Key
0      79
1      58
2      20
3      43
5      25
6      46
7      65
8      38
9      68
1) LP(25, 0) = (H.F(25) + 0) mod 10 = 5 + 0 = 5
2) LP(38, 0) = (8 + 0) = 8
3) LP(43, 0) = (3 + 0) = 3
4) LP(68, 0) = (8 + 0) = 8 → collision
LP(68, 1) = (8 + 1) = 9
5) LP(79, 0) = (9 + 0) = 9 → collision
LP(79, 1) = (9 + 1) = 10 → 0
6) LP(46, 0) = (6 + 0) = 6
7) LP(58, 0) = (8 + 0) = 8 → collision
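The whole linear probing example can be replayed with the sketch below. The names `lp_insert` and `lp_search` are hypothetical, and the probe formula LP(key, i) = (key mod m + i) mod m with m = 10 is assumed from the worked steps.

```python
def lp_insert(table, key):
    """Insert key with linear probing; return the slot used."""
    m = len(table)
    for i in range(m):
        slot = (key % m + i) % m
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")


def lp_search(table, key):
    """Return the slot holding key, or None if the key is absent."""
    m = len(table)
    for i in range(m):
        slot = (key % m + i) % m
        if table[slot] is None:
            return None  # reached an empty slot: key was never stored
        if table[slot] == key:
            return slot
    return None


table = [None] * 10
for key in (25, 38, 43, 68, 79, 46, 58, 65, 20):
    lp_insert(table, key)
print(table)                 # [79, 58, 20, 43, None, 25, 46, 65, 38, 68]
print(lp_search(table, 58))  # 1
```

Searching follows the same probe path as insertion, so key 58 is found at slot 1 only after slots 8, 9 and 0 have been examined.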
Chaining:
1) Chaining is implemented with the help of a linked list.
2) Keys are stored outside the hash table.
H.F(key) = key mod m
m = 10 (slots 0 … 9)