Hashing Techniques - U3

Uploaded by

Sunil Raj

Hashing Data Structure

Introduction

There are several searching techniques, such as linear search, binary search and search trees. In
these techniques, the time taken to search for an element depends on the total number of elements.
Linear search takes O(n) time on an unsorted array of n elements.
Binary search takes O(log n) time on a sorted array of n elements.
Searching a balanced binary search tree of n elements also takes O(log n) time. The
main drawbacks of these techniques are:
• As the number of elements increases, the time taken to perform the search also increases.
• This becomes problematic when the total number of elements becomes too large.

Hashing
Hashing is a technique to search for an element ‘x’ among several elements. It minimizes the
number of comparisons performed during the search. On average, the time taken to perform the search
does not depend on the total number of elements: with a good hash function and few collisions, the
search completes in constant time, O(1). In the worst case, when many keys collide, the search can
still take O(n) time.

An array data structure called a hash table is used to store the data items. Data items are
inserted into the hash table based on their hash key value. The hash key value is a special value that
serves as an index for a data item. It indicates where the data item should be stored in the hash
table. The hash key value is generated using a hash function.

A hash function is a function that maps any big number or string to a small integer value. It
takes the data item as input and returns a small integer value as output. This small
integer value is called the hash value. The hash value of the data item is then used as the index for
storing it in the hash table.

Types of Hash Functions

There are various types of hash functions, such as the Mid Square Hash Function, the Division
Hash Function and the Folding Hash Function.
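
The three hash functions named above can be sketched roughly as follows. This is only an illustration using common textbook variants: the function names, the number of middle digits taken and the part length used for folding are assumptions, not definitions from these notes, and keys are assumed to be non-negative integers.

```python
def division_hash(key: int, table_size: int) -> int:
    """Division method: remainder of the key divided by the table size."""
    return key % table_size


def mid_square_hash(key: int, table_size: int, digits: int = 2) -> int:
    """Mid-square method: square the key and keep `digits` middle digits."""
    squared = str(key * key)
    start = max((len(squared) - digits) // 2, 0)
    middle = int(squared[start:start + digits])
    return middle % table_size  # keep the result inside the table


def folding_hash(key: int, table_size: int, part_len: int = 2) -> int:
    """Folding method: split the key's digits into parts and add them up."""
    text = str(key)
    parts = [int(text[i:i + part_len]) for i in range(0, len(text), part_len)]
    return sum(parts) % table_size


print(division_hash(50, 7))       # 50 mod 7 = 1
print(mid_square_hash(60, 100))   # 60 * 60 = 3600, middle digits "60"
print(folding_hash(123456, 100))  # 12 + 34 + 56 = 102, 102 mod 100 = 2
```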

Properties of Hash Function


The properties of a good hash function are efficient computation, a minimal number of
collisions and a uniform distribution of the keys over the table. A collision occurs when the hash value of
a new key maps to an already occupied bucket of the hash table. Collision resolution techniques are
classified as separate chaining and open addressing.

Dr. Sunil Raj Y 1



PROBLEM ON SEPARATE CHAINING

1. Using the hash function ‘key mod 7’, insert the following sequence of keys in the hash table-
50, 700, 76, 85, 92, 73 and 101. Use separate chaining technique for collision resolution.

Solution

The given sequence of keys will be inserted in the hash table as-

Step-01:

• Draw an empty hash table.
• For the given hash function, the possible range of hash values is [0, 6].
• So, draw an empty hash table consisting of 7 buckets (bucket-0 to bucket-6).

Step-02:

• Insert the given keys in the hash table one by one.
• The first key to be inserted in the hash table = 50.
• Bucket of the hash table to which key 50 maps = 50 mod 7 = 1.
• So, key 50 is inserted in bucket-1 of the hash table.

Step-03:

• The next key to be inserted in the hash table = 700.
• Bucket of the hash table to which key 700 maps = 700 mod 7 = 0.
• So, key 700 is inserted in bucket-0 of the hash table.

Step-04:

• The next key to be inserted in the hash table = 76.
• Bucket of the hash table to which key 76 maps = 76 mod 7 = 6.
• So, key 76 is inserted in bucket-6 of the hash table.


Step-05:

• The next key to be inserted in the hash table = 85.
• Bucket of the hash table to which key 85 maps = 85 mod 7 = 1.
• Since bucket-1 is already occupied, a collision occurs.
• Separate chaining handles the collision by adding the key to the linked list of bucket-1.
• So, key 85 is inserted in bucket-1, whose chain becomes 50 → 85.

Step-06:

• The next key to be inserted in the hash table = 92.
• Bucket of the hash table to which key 92 maps = 92 mod 7 = 1.
• Since bucket-1 is already occupied, a collision occurs.
• Separate chaining handles the collision by adding the key to the linked list of bucket-1.
• So, key 92 is inserted in bucket-1, whose chain becomes 50 → 85 → 92.


Step-07:

• The next key to be inserted in the hash table = 73.
• Bucket of the hash table to which key 73 maps = 73 mod 7 = 3.
• So, key 73 is inserted in bucket-3 of the hash table.

Step-08:

• The next key to be inserted in the hash table = 101.
• Bucket of the hash table to which key 101 maps = 101 mod 7 = 3.
• Since bucket-3 is already occupied, a collision occurs.
• Separate chaining handles the collision by adding the key to the linked list of bucket-3.
• So, key 101 is inserted in bucket-3, whose chain becomes 73 → 101.
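
The steps above can be reproduced with a short sketch. It is a minimal illustration rather than a full implementation: the function name is made up for this example, and Python lists stand in for the linked lists that hold each bucket's chain.

```python
def build_chained_table(keys, table_size=7):
    """Insert keys into a table of chains using h(key) = key mod table_size."""
    # One (initially empty) chain per bucket; a list stands in for a linked list.
    table = [[] for _ in range(table_size)]
    for key in keys:
        table[key % table_size].append(key)  # colliding keys join the same chain
    return table


table = build_chained_table([50, 700, 76, 85, 92, 73, 101])
for index, chain in enumerate(table):
    print(index, chain)
```

Bucket-1 ends up holding 50, 85 and 92, bucket-3 holds 73 and 101, buckets 0 and 6 hold 700 and 76, and buckets 2, 4 and 5 stay empty.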


Open Addressing:
In open addressing, all the keys are stored inside the hash table. No key is stored outside the hash table.
Techniques used for open addressing are-
• Linear Probing
• Quadratic Probing
• Double Hashing

Operations in Open Addressing


Let us discuss how operations are performed in open addressing:

Insert Operation
• Hash function is used to compute the hash value for a key to be inserted.
• Hash value is then used as an index to store the key in the hash table.
In case of collision,
• Probing is performed until an empty bucket is found.
• Once an empty bucket is found, the key is inserted.
• Probing is performed in accordance with the technique used for open addressing.

Search Operation
To search for a particular key,
• Its hash value is obtained using the hash function.
• Using the hash value, the corresponding bucket of the hash table is checked.
• If the required key is found there, the search is successful.
• Otherwise, the subsequent buckets in the probe sequence are checked until the required key or an
empty bucket is found. Buckets marked as “deleted” do not stop the search.
• An empty bucket indicates that the key is not present in the hash table.

Delete Operation
• The key is first searched and then deleted.
• After deleting the key, that particular bucket is marked as “deleted”.
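
The three operations can be sketched as below. This is a minimal illustration under stated assumptions: it uses linear probing as the probing technique and 'key mod size' as the hash function, and the class name and the EMPTY and DELETED markers are hypothetical names chosen for this example.

```python
EMPTY = object()    # bucket that has never held a key
DELETED = object()  # marker left behind when a key is deleted


class LinearProbingTable:
    def __init__(self, size=7):
        self.size = size
        self.buckets = [EMPTY] * size

    def _probe(self, key):
        """Yield bucket indices from the home bucket onwards, wrapping around."""
        start = key % self.size
        for step in range(self.size):
            yield (start + step) % self.size

    def insert(self, key):
        free = None  # first reusable bucket seen along the probe sequence
        for index in self._probe(key):
            slot = self.buckets[index]
            if slot is EMPTY:
                if free is None:
                    free = index
                break  # the key cannot be stored beyond an empty bucket
            if slot is DELETED:
                if free is None:
                    free = index
            elif slot == key:
                return index  # key already present
        if free is None:
            raise OverflowError("hash table is full")
        self.buckets[free] = key
        return free

    def search(self, key):
        for index in self._probe(key):
            slot = self.buckets[index]
            if slot is EMPTY:
                return None  # an empty bucket ends the search
            if slot is not DELETED and slot == key:
                return index
        return None

    def delete(self, key):
        index = self.search(key)
        if index is None:
            return False
        self.buckets[index] = DELETED  # mark the bucket instead of emptying it
        return True


t = LinearProbingTable()
for k in (50, 85, 92):
    t.insert(k)
t.delete(85)
print(t.search(92))  # 92 is still found past the "deleted" bucket
```

Marking the bucket as “deleted” rather than emptying it is what keeps key 92 reachable: an empty bucket-2 would end the search before bucket-3 is checked.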

Open Addressing Techniques

Techniques used for open addressing are-

1. Linear Probing-
In linear probing, when a collision occurs, the next bucket is probed linearly: in the ith iteration,
bucket (hash(x) + i) mod table size is checked. Probing continues until an empty bucket is found. It is
easy to compute. The main problem with linear probing is clustering: many consecutive elements form
groups, so it takes longer to search for an element or to find an empty bucket.
The worst-case time to search for an element in linear probing is O(table size). This can happen even if
only one element is present and all other elements have been deleted, because the “deleted” markers
left in the hash table can force the search to scan the entire table.

2. Quadratic Probing
In quadratic probing, when a collision occurs, bucket (hash(x) + i²) mod table size is probed in the ith
iteration. Probing continues until an empty bucket is found.

3. Double Hashing
In double hashing, a second hash function hash2(x) is used, and bucket (hash(x) + i * hash2(x)) mod
table size is probed in the ith iteration. It requires more computation time as two hash functions need
to be computed.
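
The three probe sequences can be compared with a small sketch. The function names are made up for this example, and hash2 in particular is a hypothetical second hash function (5 - key mod 5), chosen only because it never returns 0; the notes do not prescribe a specific one.

```python
def linear_probe(home, i, table_size):
    """Bucket checked in the ith iteration of linear probing."""
    return (home + i) % table_size


def quadratic_probe(home, i, table_size):
    """Bucket checked in the ith iteration of quadratic probing."""
    return (home + i * i) % table_size


def hash2(key, prime=5):
    """Hypothetical second hash function; the result is always in 1..prime."""
    return prime - (key % prime)


def double_hash_probe(home, i, table_size, step):
    """Bucket checked in the ith iteration of double hashing."""
    return (home + i * step) % table_size


key, size = 85, 7
home = key % size  # 85 mod 7 = 1
print([linear_probe(home, i, size) for i in range(4)])                   # [1, 2, 3, 4]
print([quadratic_probe(home, i, size) for i in range(4)])                # [1, 2, 5, 3]
print([double_hash_probe(home, i, size, hash2(key)) for i in range(4)])  # [1, 6, 4, 2]
```

Linear probing visits neighbouring buckets, which is why clusters form, while the other two techniques jump further away from the home bucket.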

Comparison of Open Addressing Techniques-


• Linear probing has the best cache performance but suffers from clustering.
• Quadratic probing lies between the other two in terms of cache performance and clustering.
• Double hashing has poor cache performance but no clustering.

Load Factor (α)

Load factor (α) is defined as-

α = (Number of keys stored in the hash table) / (Total number of buckets in the hash table)

In open addressing, the value of the load factor always lies between 0 and 1.

This is because-
• In open addressing, all the keys are stored inside the hash table.
• So, the size of the table is always greater than or equal to the number of keys stored in the
table.
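
As a quick check of this range, the load factor can be computed for a table of 7 buckets at two fill levels. The helper name below is an assumption made for this example.

```python
def load_factor(num_keys, num_buckets):
    """Load factor = keys stored / total buckets."""
    return num_keys / num_buckets


print(load_factor(3, 7))  # table partly filled: value between 0 and 1
print(load_factor(7, 7))  # table completely full: 1.0
```

With open addressing the table cannot hold an eighth key, so the value cannot exceed 1.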

PROBLEM ON OPEN ADDRESSING

Using the hash function ‘key mod 7’, insert the following sequence of keys in the hash table:
50, 700, 76, 85, 92, 73 and 101. Use linear probing technique for collision resolution.

Solution-

The given sequence of keys will be inserted in the hash table as-


Step-01:

• Draw an empty hash table.
• For the given hash function, the possible range of hash values is [0, 6].
• So, draw an empty hash table consisting of 7 buckets (bucket-0 to bucket-6).

Step-02:

• Insert the given keys in the hash table one by one.
• The first key to be inserted in the hash table = 50.
• Bucket of the hash table to which key 50 maps = 50 mod 7 = 1.
• So, key 50 is inserted in bucket-1 of the hash table.

Step-03:

• The next key to be inserted in the hash table = 700.
• Bucket of the hash table to which key 700 maps = 700 mod 7 = 0.
• So, key 700 is inserted in bucket-0 of the hash table.

Step-04:

• The next key to be inserted in the hash table = 76.
• Bucket of the hash table to which key 76 maps = 76 mod 7 = 6.
• So, key 76 is inserted in bucket-6 of the hash table.

Step-05:

• The next key to be inserted in the hash table = 85.
• Bucket of the hash table to which key 85 maps = 85 mod 7 = 1.
• Since bucket-1 is already occupied, a collision occurs.
• To handle the collision, linear probing technique keeps probing
linearly until an empty bucket is found.
• The first empty bucket is bucket-2.
• So, key 85 is inserted in bucket-2 of the hash table.


Step-06:

• The next key to be inserted in the hash table = 92.
• Bucket of the hash table to which key 92 maps = 92 mod 7 = 1.
• Since bucket-1 is already occupied, a collision occurs.
• To handle the collision, linear probing technique keeps probing
linearly until an empty bucket is found.
• Bucket-2 is also occupied, so the first empty bucket is bucket-3.
• So, key 92 is inserted in bucket-3 of the hash table.

Step-07:

• The next key to be inserted in the hash table = 73.
• Bucket of the hash table to which key 73 maps = 73 mod 7 = 3.
• Since bucket-3 is already occupied, a collision occurs.
• To handle the collision, linear probing technique keeps probing
linearly until an empty bucket is found.
• The first empty bucket is bucket-4.
• So, key 73 is inserted in bucket-4 of the hash table.

Step-08:

• The next key to be inserted in the hash table = 101.
• Bucket of the hash table to which key 101 maps = 101 mod 7 = 3.
• Since bucket-3 is already occupied, a collision occurs.
• To handle the collision, linear probing technique keeps probing
linearly until an empty bucket is found.
• Bucket-4 is also occupied, so the first empty bucket is bucket-5.
• So, key 101 is inserted in bucket-5 of the hash table.
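
The steps above can be checked with a short sketch. It is a minimal illustration: the function name is made up for this example, and None is used to represent an empty bucket.

```python
def insert_linear_probing(keys, table_size=7):
    """Insert keys using h(key) = key mod table_size with linear probing."""
    table = [None] * table_size  # None marks an empty bucket
    for key in keys:
        index = key % table_size
        for _ in range(table_size):
            if table[index] is None:
                table[index] = key
                break
            index = (index + 1) % table_size  # probe the next bucket, wrapping around
        else:
            raise OverflowError("hash table is full")
    return table


table = insert_linear_probing([50, 700, 76, 85, 92, 73, 101])
print(table)  # [700, 50, 85, 92, 73, 101, 76]
```

All 7 buckets are now occupied, so the load factor of the final table is 1.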

Separate Chaining Vs Open Addressing-

Which is the Preferred Technique?


| Separate Chaining | Open Addressing |
|---|---|
| Keys are stored inside the hash table as well as outside the hash table. | All the keys are stored only inside the hash table. No key is present outside the hash table. |
| The number of keys to be stored in the hash table can even exceed the size of the hash table. | The number of keys to be stored in the hash table can never exceed the size of the hash table. |
| Deletion is easier. | Deletion is difficult. |
| Extra space is required for the pointers to store the keys outside the hash table. | No extra space is required. |
| Cache performance is poor. This is because of linked lists which store the keys outside the hash table. | Cache performance is better. This is because here no linked lists are used. |
| Some buckets of the hash table are never used, which leads to wastage of space. | Buckets may be used even if no key maps to those buckets. |

Separate Chaining
Separate chaining is advantageous when all of the following operations must be performed
on the keys stored in the hash table-
• Insertion Operation
• Deletion Operation
• Searching Operation

Deletion is easier in separate chaining. This is because deleting a key from the hash table does not
affect the other keys stored in the hash table.

Open Addressing

Open addressing is advantageous when only the following operations must be performed
on the keys stored in the hash table-
• Insertion Operation
• Searching Operation

Deletion is difficult in open addressing. This is because deleting a key from the hash table requires
some extra effort: the bucket must be marked as “deleted”, or certain keys must be rearranged, so that
later searches still find the remaining keys.

_________________________
