Hash Data Structure
Hash Data Structure
CS1-401
PRESENTED BY: NEHA ANWAR (BSCS – 3) (2020-GCUF-072287)
Hashing
The process of storing and retrieving data in O(1) time is called hashing.
Hash Table
Terms used in Hashing
Search keys= ( 24 , 52 , 91 , 67 , 48 , 83 )
Hash Table (To make hash table: 0 – n-1) 91
52
Hash Functions ( K Mod 10 , K Mod n , 83
Mid Square , Folding Method) 24
n = number of indexes
• K Mod 10: 24 Mod 10 = 4 (Hash Value)
• K Mod n: 24 Mod 10 = 4 (Hash Value) 67
48
Other Hash Functions
To find a key use the same hash function you used to insert the key
in the hash table.
By using the hash functions we can perform the following functions
in the hash table:
1. Traverse
2. Insertion
3. Deletion
4. Searching a key
Collision in Hashing
Techniques
External chains
Like a linked list, we insert keys in a chain at the same index.
Keys: (42 , 19 , 10 , 12)
Hash Function : K Mod 5
Chaining (Open Hashing)
node
º 10 º 10
º 42 12 º 12 42
º 19 º 19
To insert text in a chain we store the address of the key in the index and make a singly linked list in
which each node has the address of the next node also. It means that the index of the hash table will
act like a head of the first node in chaining.
Continued…
Advantages:
Deletion is easy
Load Factor:
O(1) time for insertion
𝑡𝑜𝑡𝑎𝑙 𝑛𝑜.𝑜𝑓 𝑒𝑙𝑒𝑚𝑒𝑛𝑡𝑠/𝑘𝑒𝑦𝑠
Disadvantages: =
𝑠𝑙𝑜𝑡𝑠
Searching O(n)
Extra space
Linear Probing
R(K) = K Mod 10 19
R’(K , i) = (h(K) + i)(mod 10)
Keys (43 , 135 , 72 , 23 , 99 , 19 , 82) 72
{for 23} 43
23
R’(K , i) = 3 + 1 mod 10 = 4 mod 10 = 4
135
99
Continued…
Advantages:
No extra space
Disadvantages:
Search time O(n)
Deletion difficult
Primary clustering (If the primary hash index is x, subsequent probes go to x+1 , x+2, x+3 and so on, this
results in Primary Clustering. Once the primary cluster forms, the bigger the cluster gets, the faster it grows.)
Secondary clustering (If the primary hash index is x, probes go to x+1, x+4, x+9 , x+16, x+25 and so on,
this results in Secondary Clustering.)
Quadratic Probing
R(K) = K Mod 10 36
R’(K , i) = (h(K) + i²)(mod 10) 91
Keys (42 , 16 , 91 , 33 , 18 , 27 , 36 , 62) 42
{for 36} 33
Advantages:
No extra space
Primary clustering resolved
Disadvantages:
Searching in O(n) time
Secondary clustering
No guarantee of finding slot
Double Hashing
R1(K) = K mod 11
R2(K) = 8 – (K mod 8) 34
R’(K , i) = (R1(K) + iR2(K)) mod 11
Keys (20 , 34 , 45 , 70 , 56) 56
45
{for 70}
R’(K , i) = (4 + (1*3)) mod 11 = 4
70
Similarly , for other collisions.
20
Continued…
Advantages:
No extra space
No primary clustering
No secondary clustering
Disadvantages:
Search time O(n)
References