
Collision Avoidance Techniques in DBMS
A collision occurs when two different keys hash to the same location in a hash table.
There are two techniques that are commonly used to resolve collisions. They are as follows −
- Linear probing.
- Chaining.
Let us discuss each technique in detail.
Linear probing
Linear probing is a strategy for resolving collisions in which the new key is placed in the closest following empty cell.
Each element is stored at the location the hash function maps it to. If that cell is already filled, the next consecutive locations are searched until an empty one is found, and the value is stored there. The table is generally implemented as an array.
Step 1 − Let us take a table T that stores all the records in memory.
Step 2 − If the memory location T(h) is already filled, then we store the record in the next empty location.
Step 3 − To find that empty location, we apply a linear search in table T over T(h), T(h+1), T(h+2), ..., wrapping around to the start of the table when the end is reached.
Record: A, B, C, D, E, X, Y, Z
H(k) : 4, 8, 2, 11, 4, 11, 5, 1
The table for linear probing is given below −
Location | Record |
1 | X |
2 | C |
3 | Z |
4 | A |
5 | E |
6 | Y |
7 | |
8 | B |
9 | |
10 | |
11 | D |
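The steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a definitive implementation: the function name `linear_probe_insert`, the unused index 0 (so that locations are numbered 1 to 11 as in the table) and the wrap-around from location 11 back to location 1 are assumptions made to reproduce the table.

```python
def linear_probe_insert(table, home, record):
    """Store record at its home location, or at the next empty one.

    table[0] is unused so that locations are numbered 1..m (an assumption
    made to match the table in the text).
    """
    m = len(table) - 1                      # number of usable locations
    for step in range(m):
        slot = (home - 1 + step) % m + 1    # T(h), T(h+1), ... with wrap-around
        if table[slot] is None:
            table[slot] = record
            return slot
    raise RuntimeError("hash table is full")


records = ["A", "B", "C", "D", "E", "X", "Y", "Z"]
homes = [4, 8, 2, 11, 4, 11, 5, 1]          # H(k) values from the example

table = [None] * 12                          # locations 1..11
for record, home in zip(records, homes):
    linear_probe_insert(table, home, record)

for slot in range(1, 12):
    print(slot, "|", table[slot] or "")
```

Here E collides with A at location 4 and moves to 5, X collides with D at location 11 and wraps around to 1, Y moves from 5 to 6, and Z moves from 1 past 2 to 3.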
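The advantage of linear probing is that it is very fast in practice, because consecutive probes touch adjacent memory locations and so benefit from locality of reference.
The disadvantage is that filled cells tend to form long runs (clustering), which lengthens later searches. In addition, guaranteeing constant expected time per operation requires a hash function with five-way independence.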
Methods to minimize Clustering
There are two methods which are used to minimize clustering. These methods are as follows −
- Quadratic probing
Suppose the hash address h of a record is already filled. We then search the memory locations with addresses h, h+1, h+4, h+9, h+16, ..., h+i², ... so that colliding records are spread out instead of being packed into adjacent locations.
- Double hashing
The collision is resolved by hashing the hash address again. If the second hash function gives Hash(h) = h', we search the memory locations with addresses h, h+h', h+2h', h+3h', ...
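To make the two probe sequences concrete, the following sketch lists the first few addresses each method would try. The function names, the table size of 11, the 0-based addresses, the reduction modulo the table size and the sample step h' = 3 are all assumptions chosen for illustration.

```python
def quadratic_probes(h, m, count):
    """First `count` addresses h, h+1, h+4, h+9, ... reduced modulo m."""
    return [(h + i * i) % m for i in range(count)]


def double_hash_probes(h, h2, m, count):
    """First `count` addresses h, h+h', h+2h', ... reduced modulo m."""
    if h2 % m == 0:
        # A zero step would probe the same location forever.
        raise ValueError("step h' must be non-zero modulo m")
    return [(h + i * h2) % m for i in range(count)]


print(quadratic_probes(4, 11, 5))        # [4, 5, 8, 2, 9]
print(double_hash_probes(4, 3, 11, 5))   # [4, 7, 10, 2, 5]
```

In both cases the addresses jump around the table rather than walking through neighbouring cells, which is what breaks up the runs that linear probing creates.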
Advantages of double hashing
Double hashing drastically reduces clustering.
Double hashing requires fewer comparisons.
Smaller hash tables can be used.
Double hashing minimizes repeated collisions, so it is largely free from the problems caused by clustering.
Disadvantages of double hashing
With double hashing the hash table tends to fill up quickly, and performance degrades as it does.
As the table fills, each operation needs more probes, which slows processing and degrades the system.
Chaining
Chaining is also known as the chained hash table mechanism. As the name suggests, each index of the table holds a pointer to the head of a linked list.
Here a linked list is used for every hash address. Each record has two parts, which are as follows −
A data part, which stores the data.
A next part, which links the records having the same hash address.
Example
The keys 26, 44, 38, 29, 16 are stored in the hash table using the chaining method.
Here,
H(k) = k % 5
H(26) = 26 % 5 = 1
H(44) = 44 % 5 = 4
H(38) = 38 % 5 = 3
H(29) = 29 % 5 = 4
H(16) = 16 % 5 = 1
The table for chaining will be as shown below −
Index | Chain |
0 | NULL |
1 | 26 | 16 | NULL |
2 | NULL |
3 | 38 | NULL |
4 | 44 | 29 | NULL |
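The chaining example can be reproduced with a short Python sketch. The names `Node`, `chain_insert` and `chain_keys` are hypothetical, and appending each new record at the tail of its list is an assumption made so that the output matches the order shown in the table; many implementations insert at the head instead.

```python
class Node:
    """One record: a data part and a next part linking same-address records."""

    def __init__(self, data):
        self.data = data
        self.next = None


def chain_insert(table, key):
    """Append key to the end of the chain at index key % len(table)."""
    slot = key % len(table)
    node = Node(key)
    if table[slot] is None:
        table[slot] = node
        return
    tail = table[slot]
    while tail.next is not None:
        tail = tail.next
    tail.next = node


def chain_keys(head):
    """Collect the keys of one chain, in order, as a list."""
    keys = []
    while head is not None:
        keys.append(head.data)
        head = head.next
    return keys


table = [None] * 5                       # H(k) = k % 5
for key in (26, 44, 38, 29, 16):
    chain_insert(table, key)

for slot, head in enumerate(table):
    print(slot, "|", " | ".join(str(k) for k in chain_keys(head) + ["NULL"]))
```

Keys 26 and 16 end up in the list at index 1, and keys 44 and 29 in the list at index 4, without disturbing any other location.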
Advantages of Chaining
The advantages of chaining are as follows −
Chained hash tables remain effective even when several keys share the same location.
Collisions are handled within a single location, without probing other cells.
Performance degrades gradually, rather than sharply, as more keys are stored.
Disadvantages of Chaining
The disadvantages of chaining are as follows −
More storage is needed per record, since the chained hash table stores a key and a link with every piece of data.
Space overhead.
All the disadvantages of linked lists also apply to chained hash tables, since they use linked-list logic.