hashing algorithms c++
3 2024510066
Objectives:
1. To implement Modulo Division Hashing Technique
2. To implement Mid-Square Hashing Technique
3. To implement Fold Shift Hashing Technique
4. To implement Digit Extraction Hashing Technique
5. To implement a collision resolution technique
Algorithm:
1. Modulo Division:
● Start.
● Input Memory Capacity and Data:
○ If the number of data items exceeds the available memory capacity,
stop and return an error.
○ Initialize Hash Table: Create an empty hash table with all slots
marked as unoccupied.
● Hash Each Data Item:
○ Hash Function: Compute the hash value by taking the data item
and finding its remainder when divided by the size of the hash
table.
● Check for Collisions:
○ If the computed hash location in the table is unoccupied, store the
data item in that location.
○ If the location is already occupied (a collision occurs):
■ Use linear probing to resolve the collision.
■ Check subsequent slots in the table (moving one step at a
time) until an unoccupied slot is found.
■ Store the data item in the first available unoccupied slot.
● End.
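The steps above can be sketched in C++ as follows. This is a minimal illustration, not the listing from the Solution section: the names `ProbeTable`, `hash`, and `insert` are hypothetical, keys are assumed to be non-negative, and the capacity is assumed to be greater than zero.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch: an open-addressing table for non-negative integer keys.
struct ProbeTable {
    std::vector<int> slots;   // stored keys
    std::vector<bool> used;   // true once a slot is occupied
    int collisions = 0;       // occupied slots visited while probing

    // Assumes capacity > 0.
    explicit ProbeTable(std::size_t capacity)
        : slots(capacity, 0), used(capacity, false) {}

    // Modulo-division hash: remainder of the key divided by the table size.
    std::size_t hash(int key) const {
        return static_cast<std::size_t>(key) % slots.size();
    }

    // Stores key using linear probing; returns the slot used, or -1 if full.
    int insert(int key) {
        const std::size_t n = slots.size();
        const std::size_t home = hash(key);
        for (std::size_t step = 0; step < n; ++step) {
            const std::size_t idx = (home + step) % n;  // wrap around the table
            if (!used[idx]) {
                used[idx] = true;
                slots[idx] = key;
                return static_cast<int>(idx);
            }
            ++collisions;  // slot taken, move one step forward
        }
        return -1;  // every slot is occupied
    }
};
```

With a table of 10 slots, inserting 27, 37, 47, and 57 in that order places them in slots 7, 8, 9, and 0, because each later key finds its home slot 7 already taken and steps forward.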
2. Mid-Square:
● Start.
● Input Memory and Data Size:
○ If the number of data elements exceeds available memory or invalid
input is detected (memory locations or data size is 0), stop the
process and return.
Prabhat Anand Tiwari Practical no. 3 2024510066
3. Fold Shift:
● Start.
● Input Memory and Data Size:
○ If the number of data elements exceeds available memory, stop the
process and return an error.
○ Collect the data items to be stored.
● Determine Modulus:
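○ Compute the modulus (mod) from the number of memory locations (n): mod
is the smallest power of ten greater than n - 1 (for example, n = 100
gives mod = 100). It determines the size of the parts the data is split
into for the fold shift process.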
● Fold Shift Calculation:
○ Split Data: Break the data item into parts of size determined by the
modulus.
4. Digit Extraction:
Problem Statement:
Implement modulo division, mid-square, fold shift, and digit extraction hashing
algorithms in C++ with collision resolution.
Solution:
Modulo Division:
#include <iostream>
using namespace std;
class ModuloDivision
{
public:
int size, n = 0;
int *getInput()
{
cout << "Enter memory locations: ";
cin >> n;
cout << "Enter number of data to store: ";
cin >> size;
if(size > n){
break;
}
else
{
continue;
}
}
}
}
int main()
{
ModuloDivision md;
int *arr = md.getInput();
cout<<endl;
md.moduloDivision(arr);
return 0;
}
Output:
Mid Square:
#include <iostream>
#include <cstdlib> // exit
using namespace std;
class MidSquare
{
public:
int size, n = 0;
int *getInput()
{
cout << "Enter memory locations: ";
cin >> n;
cout << "Enter number of data to store: ";
cin >> size;
// Reject the request if any one of the invalid conditions holds.
if (size > n || n == 0 || size == 0)
{
cout << "Requested data size exceeds memory capacity or the request is invalid." << endl;
exit(0);
}
int *arr = new int[size];
for (int i = 0; i < size; i++)
{
cout << "Element at position " << i << ": ";
cin >> arr[i];
}
return arr;
}
if (locationSet[location] == -1)
{
locationSet[location] = 0;
hashed[location] = data[i];
}
else
{
for (int j = 1; j < n; j++)
{
collision++;
int locationTry = (location + j) % n;
if (locationSet[locationTry] == -1)
{
locationSet[locationTry] = 0;
hashed[locationTry] = data[i];
break;
}
else
{
continue;
}
}
}
}
int main()
{
MidSquare ms;
int *arr = ms.getInput();
ms.midSquare(arr);
return 0;
}
Output:
Fold Shifting:
#include <iostream>
#include <cstdlib> // exit
using namespace std;
class FoldShift
{
public:
int size, n = 0;
int *getInput()
{
cout << "Enter memory locations: ";
cin >> n;
cout << "Enter number of data to store: ";
cin >> size;
if (size > n)
{
cout << "Requested data size exceeds memory capacity." << endl;
exit(0);
}
int *arr = new int[size];
for (int i = 0; i < size; i++)
{
cout << "Element at position " << i << ": ";
cin >> arr[i];
}
return arr;
}
int getReqMod()
{
int mod = 1;
int tMemory = n - 1;
while (tMemory > 0)
{
tMemory /= 10;
mod *= 10;
}
return mod;
}
int collision = 0;
int locationSet[n];
int hashed[n];
for (int i = 0; i < n; i++)
{
locationSet[i] = -1;
}
for (int i = 0; i < size; i++)
{
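// Reduce the folded sum modulo n so the index stays inside the n-slot
// table; mod can be larger than n when n is not a power of ten.
int location = getShiftsAddition(data[i], mod) % n;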
if (locationSet[location] == -1)
{
locationSet[location] = 0;
hashed[location] = data[i];
}
else
{
for (int j = 1; j < n; j++)
{
collision++;
int locationTry = (location + j) % n;
if (locationSet[locationTry] == -1)
{
locationSet[locationTry] = 0;
hashed[locationTry] = data[i];
break;
}
else
{
continue;
}
}
}
}
int main()
{
FoldShift fs;
int *inputArr = fs.getInput();
fs.foldShift(inputArr);
return 0;
}
Output:
Digit Extraction:
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
class DigitExtraction
{
public:
int size, n = 0;
vector<int> indices;
int *getInput()
{
cout << "Enter memory locations: ";
cin >> n;
cout << "Enter number of data to store: ";
cin >> size;
// Reject the request if any one of the invalid conditions holds.
if (size > n || n == 0 || size == 0)
{
cout << "Requested data size exceeds memory capacity or the request is invalid." << endl;
exit(0);
}
int *arr = new int[size];
for (int i = 0; i < size; i++)
{
cout << "Element at position " << i << ": ";
cin >> arr[i];
}
int noIndex;
cout << "Enter no. of digits you want to extract: " << endl;
cin >> noIndex;
cout << "Enter the indices you want to extract: " << endl;
for (int i = 0; i < noIndex; i++)
{
int temp;
cin >> temp;
indices.push_back(temp);
}
sort(indices.begin(), indices.end(), greater<int>());
return arr;
}
else
{
for (int j = 1; j < n; j++)
{
collision++;
int locationTry = (location + j) % n;
if (locationSet[locationTry] == -1)
{
locationSet[locationTry] = 0;
hashed[locationTry] = data[i];
break;
}
else
{
continue;
}
}
}
}
int main()
{
DigitExtraction de;
int *arr = de.getInput();
de.digitExtraction(arr);
return 0;
}
Output:
Observation:
The Modulo Division method is one of the simplest hashing techniques. It divides the
key by the number of memory locations and uses the remainder as the hash location.
It works well when the keys are spread evenly across the possible key values. Its main
drawback is that any two keys with the same remainder collide. If the chosen divisor is
poorly suited to the data, patterns in the key values produce an uneven distribution:
with 10 memory locations, for example, every key that is a multiple of 10 maps to
location 0, leading to frequent collisions.
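The sensitivity to the divisor can be checked with a small counting routine. The sketch below is illustrative only: `countCollisions` is a hypothetical helper, keys are assumed to be non-negative, and the divisor is assumed to be positive. It counts keys whose remainder was already produced by an earlier key, without doing any probing.

```cpp
#include <cstddef>
#include <vector>

// Counts how many keys land on a remainder already taken by an earlier key.
// Assumes non-negative keys and divisor > 0.
int countCollisions(const std::vector<int>& keys, int divisor) {
    std::vector<int> hits(static_cast<std::size_t>(divisor), 0);
    int collisions = 0;
    for (int key : keys) {
        const std::size_t slot = static_cast<std::size_t>(key % divisor);
        if (hits[slot] > 0) {
            ++collisions;  // an earlier key already hashed here
        }
        ++hits[slot];
    }
    return collisions;
}
```

For the keys 10, 20, 30, 40, 50, a divisor of 10 sends every key to remainder 0 (four collisions), while a divisor of 7 gives the remainders 3, 6, 2, 5, 1 and no collisions.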
The Mid-Square method aims to mitigate some of the issues seen in modulo division.
By squaring the key and extracting the middle digits, it produces hash values that
depend on every digit of the key, which makes collisions less likely. Taking the middle
digits helps avoid biases that arise from patterns in the original keys. The squaring
step can cost more than a single division, and for large keys the square can overflow
a fixed-width integer type. For small keys, the square may have too few digits for a
meaningful middle-digit extraction, which reduces the method's effectiveness.
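A minimal sketch of the mid-square idea follows. It is not the listing from the Solution section; the function names, the `width` parameter (how many middle digits to keep, assumed to be at least 1), and the final reduction modulo `tableSize` are assumptions made for illustration. The square is computed in a 64-bit type so that 32-bit keys cannot overflow.

```cpp
#include <cstddef>
#include <string>

// Returns the middle `width` digits of key * key.
// If the square has no more than `width` digits, the whole square is returned.
// Assumes width >= 1.
unsigned long long middleDigitsOfSquare(unsigned int key, std::size_t width) {
    const unsigned long long square =
        static_cast<unsigned long long>(key) * key;
    const std::string digits = std::to_string(square);
    if (digits.size() <= width) {
        return square;  // too short to have a distinct middle
    }
    const std::size_t start = (digits.size() - width) / 2;
    return std::stoull(digits.substr(start, width));
}

// Hash location: middle digits reduced to the table size (tableSize > 0).
std::size_t midSquareHash(unsigned int key, std::size_t width,
                          std::size_t tableSize) {
    return static_cast<std::size_t>(middleDigitsOfSquare(key, width) % tableSize);
}
```

For the key 1234 the square is 1522756, and the middle three digits are 227. For the key 5 the square is 25, which illustrates the small-key weakness: there is no middle to extract, so the whole square is used.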
The Fold Shift method breaks the key into parts, adds the parts together, and then
applies a modulo operation to compute the hash location. It works well for large keys
because every part of the key contributes to the hash. Fold Shift works best when the
data consists of long numeric keys that are evenly distributed. Its effectiveness
diminishes for short keys or keys that contain patterns, as those patterns can still
lead to clustering. Like the other methods, Fold Shift is not immune to collisions, so
an appropriate probing method is still needed.
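The folding step can be sketched as below. The function names are hypothetical, and the sketch assumes the parts are taken from the least-significant end of the key using a power-of-ten `chunk` greater than 1 (for example 100 for two-digit parts); the report's own `getShiftsAddition` may split the key differently.

```cpp
#include <cstddef>

// Adds up the parts of `key`, where each part holds the digits selected by
// `chunk` (a power of ten such as 100 or 1000). Assumes chunk > 1.
unsigned long long foldShiftSum(unsigned long long key,
                                unsigned long long chunk) {
    unsigned long long sum = 0;
    while (key > 0) {
        sum += key % chunk;  // take the lowest part
        key /= chunk;        // shift the remaining digits down
    }
    return sum;
}

// Hash location: folded sum reduced to the table size (tableSize > 0).
std::size_t foldShiftHash(unsigned long long key, unsigned long long chunk,
                          std::size_t tableSize) {
    return static_cast<std::size_t>(foldShiftSum(key, chunk) % tableSize);
}
```

For the key 123456789 and three-digit parts, the parts are 789, 456, and 123, whose sum is 1368; with 1000 memory locations the hash location is 368.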
The Digit Extraction method takes a unique approach by extracting specific digits from
the key based on predetermined indices and using these extracted digits to compute the
hash location. This method is useful in cases where certain parts of the key are more
significant or variable than others. However, one major drawback of digit extraction is
that its effectiveness heavily depends on the choice of indices. If the selected digits do
not vary much across keys, or if the key structure follows a predictable pattern, this can
lead to clustering of hash values and increased collisions. Additionally, this method
requires prior knowledge of which digits to extract, making it less flexible for
general-purpose hashing where the key format may not be known in advance.
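A small sketch of digit extraction is given below. It is an illustration under stated assumptions rather than the report's implementation: the function names are hypothetical, positions are taken to be 0-based and counted from the leftmost digit, and positions beyond the end of the key are skipped.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Builds a number from the digits of `key` at the given positions
// (0-based from the left). Out-of-range positions are ignored.
unsigned long long extractDigits(unsigned long long key,
                                 const std::vector<std::size_t>& positions) {
    const std::string digits = std::to_string(key);
    unsigned long long value = 0;
    for (std::size_t pos : positions) {
        if (pos < digits.size()) {
            value = value * 10 +
                    static_cast<unsigned long long>(digits[pos] - '0');
        }
    }
    return value;
}

// Hash location: extracted digits reduced to the table size (tableSize > 0).
std::size_t digitExtractionHash(unsigned long long key,
                                const std::vector<std::size_t>& positions,
                                std::size_t tableSize) {
    return static_cast<std::size_t>(extractDigits(key, positions) % tableSize);
}
```

For the key 379452 and positions 0, 2, 4, the extracted digits are 3, 9, and 5, giving 395. Keys that share those three digits collide regardless of their other digits, which is the index-choice weakness noted above.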
All four methods resolve collisions through linear probing. While this is
straightforward to implement, every collision adds memory lookups, which degrades
the performance of the hash table as collisions accumulate. In heavily loaded tables,
or when key values are poorly distributed, occupied slots form long runs (clustering),
and probing slows down both insertion and retrieval. More advanced techniques such
as quadratic probing and double hashing can be used to reduce this clustering.
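The difference between the probing schemes lies in which slot is tried on the i-th attempt. The sketch below compares the three probe sequences; the function names are hypothetical, the table size `n` is assumed to be at least 2 (a prime `n` is the usual choice for the last two schemes), and the second hash `1 + key % (n - 1)` is one common textbook choice rather than something taken from this report.

```cpp
#include <cstddef>

// Slot tried on the i-th attempt (i = 0 is the home slot).

// Linear probing: step forward one slot at a time.
std::size_t linearProbe(std::size_t home, std::size_t i, std::size_t n) {
    return (home + i) % n;
}

// Quadratic probing: the offset grows as i squared, which spreads out
// keys that share a home slot's neighbourhood.
std::size_t quadraticProbe(std::size_t home, std::size_t i, std::size_t n) {
    return (home + i * i) % n;
}

// Double hashing: the step size comes from a second hash of the key,
// so keys with the same home slot usually follow different paths.
std::size_t doubleHashProbe(std::size_t key, std::size_t i, std::size_t n) {
    const std::size_t home = key % n;
    const std::size_t step = 1 + key % (n - 1);  // never zero
    return (home + i * step) % n;
}
```

For a table of 11 slots and the key 29 (home slot 7), the third linear probe lands on slot 10, the third quadratic probe on slot 5, and double hashing uses a step of 10, so its first two retries land on slots 6 and 5.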