Unit-5 Hashing

Hashing is a technique in computer science for mapping large data sets to fixed-length values, facilitating efficient data retrieval. It involves using a hash function to convert variable-sized data into a fixed-size hash value, which serves as an index for data storage and retrieval. Various hash functions, such as Division, Mid Square, Digit Folding, and Multiplication methods, are used, and collision resolution techniques like Separate Chaining and Open Addressing are employed to handle instances where multiple keys generate the same hash value.

Uploaded by

pavanivuggina12

UNIT-5

INTRODUCTION TO HASHING

Hashing is a popular technique in computer science that involves
mapping large data sets to fixed-length values. It is a process of
converting a data set of variable size into a data set of a fixed size.
The ability to perform efficient lookup operations makes hashing an
essential concept in data structures.
What is Hashing?
Hashing is defined as follows:

Hashing is the process of indexing and retrieving elements (data) in a data structure
using a hash key, so that an element can be found faster.

Hashing is commonly used to create a compact identifier for a piece of
data, which can be used to quickly look up that data in a large
dataset. For example, a website may use hashing to store user passwords
securely. When a user enters their password, the site converts it into
a hash value and compares it to the stored hash value to authenticate
the user.

How Does Hashing Work?


The process of hashing can be broken down into three steps:

o Input: The data to be hashed is passed to the hashing algorithm.
o Hash Function: The hashing algorithm takes the input data and
applies a mathematical function to generate a fixed-size hash value.
Ideally, the hash function is designed so that different input values
produce different hash values wherever possible, and small changes in
the input produce large changes in the output.
o Output: The hash value is returned and used as an index to
store or retrieve data in a data structure.
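
The three steps above can be sketched in a few lines of Python. This is only an illustration: `toy_hash` and `TABLE_SIZE` are hypothetical names that do not come from the text, and the function simply sums character codes and takes the remainder modulo the table size.

```python
def toy_hash(text: str, table_size: int) -> int:
    """Hypothetical hash function: sum the character codes, then take
    the remainder modulo the table size."""
    return sum(ord(ch) for ch in text) % table_size


TABLE_SIZE = 10                      # assumed table size for this sketch
table = [None] * TABLE_SIZE          # the data structure indexed by hash value

key = "apple"                        # step 1: input
index = toy_hash(key, TABLE_SIZE)    # step 2: apply the hash function
table[index] = key                   # step 3: use the output as an index

# Retrieval repeats the same computation and reads the same slot.
found = table[toy_hash("apple", TABLE_SIZE)]
print(index, found)
```

Note that this toy function is deterministic but does not spread similar inputs far apart (for instance, any rearrangement of the same letters gives the same value), so it should not be read as a recommended design.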
Hash Functions:
Hash Function: A hash function is a mathematical operation that takes
an input (or key) and outputs a fixed-size result known as a hash code
or hash value. A hash function must be deterministic: it must always
yield the same hash code for the same input.

Additionally, a good hash function should spread keys as evenly as
possible over the available hash values. It cannot guarantee a unique
hash code for every input, because there are usually more possible keys
than hash values; two keys that share a hash value cause a collision.

Types of Hash functions


There are many hash functions that work on numeric or alphanumeric keys. This unit
discusses the following four:
1. Division Method.
2. Mid Square Method.
3. Folding Method.
4. Multiplication Method.
Let’s begin discussing these methods in detail.

1. Division Method:
This is the simplest method to generate a hash value. The hash function
divides the key k by M and uses the remainder as the hash value.
Formula:
h(k) = k mod M
Here,
k is the key value, and
M is the size of the hash table.

M is best chosen to be a prime number, since that tends to distribute the keys more
uniformly across the table.
Example:
k = 12345
M = 95
h(12345) = 12345 mod 95
= 90

k = 1276
M = 11
h(1276) = 1276 mod 11
= 0
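
The division method maps directly onto the remainder operator. A minimal sketch, where the function name `division_hash` is an assumption made for illustration:

```python
def division_hash(k: int, m: int) -> int:
    """Division method: h(k) = k mod M."""
    return k % m


print(division_hash(12345, 95))  # 90
print(division_hash(1276, 11))   # 0
```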

2. Mid Square Method:


The mid-square method tends to work well because the middle digits of
the square depend on most of the digits of the key. It involves two
steps to compute the hash value:
1. Square the value of the key k, i.e. compute k^2.
2. Extract the middle r digits as the hash value.
Formula:
h(k) = middle r digits of (k x k)
Here,
k is the key value.
The value of r is chosen based on the size of the table.
Example:
Suppose the hash table has 100 memory locations. So r = 2 because two digits are
required to map the key to the memory location.
k = 60
k x k = 60 x 60
= 3600
h(60) = 60 (the middle two digits of 3600)
The hash value obtained is 60.
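
A possible sketch of the mid-square method follows. The text does not say which digits count as the "middle" when the square cannot be split evenly, so this sketch assumes the window starts at position (length - r) // 2 from the left, and it pads short squares with leading zeros. The name `mid_square_hash` is hypothetical.

```python
def mid_square_hash(k: int, r: int) -> int:
    """Mid-square method: square the key and keep the middle r digits."""
    squared = str(k * k).zfill(r)      # pad so at least r digits exist
    # Assumed convention: if the leftover digits cannot be split evenly,
    # the extra digit is left on the right-hand side.
    start = (len(squared) - r) // 2
    return int(squared[start:start + r])


print(mid_square_hash(60, 2))  # 60, the middle two digits of 3600
```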

3. Digit Folding Method:

This method involves two steps:


1. Divide the key value k into a number of parts k1, k2, k3, ..., kn, where
each part has the same number of digits except for the last part, which can have
fewer digits than the other parts.
2. Add the individual parts. The hash value is obtained by ignoring the last carry,
if any.
Formula:
k is split into k1, k2, k3, k4, ..., kn
s = k1 + k2 + k3 + k4 + ... + kn
h(k) = s
Here,
s is obtained by adding the parts of the key k.

Example:
k = 12345
k1 = 12, k2 = 34, k3 = 5
s = k1 + k2 + k3
= 12 + 34 + 5
= 51
h(12345) = 51

Note:
The number of digits in each part depends on the size of the hash table.
For example, if the size of the hash table is 100, then each part must have two
digits, except for the last part, which can have fewer digits.
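
The folding steps can be sketched as follows. Here "ignoring the last carry" is interpreted as keeping only the lowest digits of the sum, as many as there are digits in one part; that reading, and the name `folding_hash`, are assumptions for this illustration.

```python
def folding_hash(k: int, part_digits: int) -> int:
    """Digit folding: split the key into parts and add them up."""
    digits = str(k)
    # Cut the key into chunks of part_digits; the last chunk may be shorter.
    parts = [int(digits[i:i + part_digits])
             for i in range(0, len(digits), part_digits)]
    total = sum(parts)
    # Assumed reading of "ignore the last carry": drop any digit that
    # overflows beyond part_digits digits.
    return total % (10 ** part_digits)


print(folding_hash(12345, 2))  # 12 + 34 + 5 = 51
print(folding_hash(9876, 2))   # 98 + 76 = 174, carry dropped, giving 74
```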

4. Multiplication Method:
This method involves the following steps:
1. Choose a constant value A such that 0 < A < 1.
2. Multiply the key value with A.
3. Extract the fractional part of kA.
4. Multiply the result of the above step by the size of the hash table i.e. M.
5. The resulting hash value is obtained by taking the floor of the result obtained
in step 4.
Formula:
h(k) = floor (M (kA mod 1))
Here,
M is the size of the hash table.
k is the key value.
A is a constant value.

Example:
k = 12345
A = 0.357840
M = 100
h(12345) = floor[ 100 (12345*0.357840 mod 1)]
= floor[ 100 (4417.5348 mod 1) ]
= floor[ 100 (0.5348) ]
= floor[ 53.48 ]
= 53
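
The five steps of the multiplication method fit in a short function. This sketch uses ordinary floating-point arithmetic, which is adequate for the example above but could round differently for very large keys; the name `multiplication_hash` is an assumption.

```python
import math


def multiplication_hash(k: int, a: float, m: int) -> int:
    """Multiplication method: h(k) = floor(M * (k*A mod 1))."""
    fractional = (k * a) % 1.0         # fractional part of k*A
    return math.floor(m * fractional)  # scale by the table size, then floor


print(multiplication_hash(12345, 0.357840, 100))  # 53
```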

Collision Resolution Techniques:

In hashing, hash functions are used to generate hash values. The hash
value is used as an index for the key in the hash table. The hash
function may return the same hash value for two or more keys. When two
or more keys have the same hash value, a collision happens. To handle
collisions, we use Collision Resolution Techniques.
1) Separate Chaining
In separate chaining, each cell of the hash table points to a linked list of records whose
keys have the same hash value. Chaining is simple but requires additional memory outside the
table.

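Example: Given a hash function, insert the following elements into the hash table
using separate chaining as the collision resolution technique.

Hash function = key % 5,
Elements = 12, 15, 22, 25 and 37.

Step by step:
12 % 5 = 2, so 12 is placed in the list at index 2.
15 % 5 = 0, so 15 is placed in the list at index 0.
22 % 5 = 2, which collides with 12, so 22 is appended to the list at index 2.
25 % 5 = 0, which collides with 15, so 25 is appended to the list at index 0.
37 % 5 = 2, which collides again, so 37 is appended to the list at index 2.

Final table:
Index 0: 15 -> 25
Index 1: empty
Index 2: 12 -> 22 -> 37
Index 3: empty
Index 4: empty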
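
The example with hash function key % 5 and elements 12, 15, 22, 25 and 37 can be reproduced with a short sketch. Python lists stand in for the linked lists here, which is a simplification, and `build_chained_table` is a hypothetical name.

```python
def build_chained_table(keys, table_size):
    """Separate chaining: every slot holds a chain of colliding keys."""
    # Python lists stand in for the linked lists described in the text.
    table = [[] for _ in range(table_size)]
    for key in keys:
        table[key % table_size].append(key)  # hash function: key % table_size
    return table


table = build_chained_table([12, 15, 22, 25, 37], 5)
for slot, chain in enumerate(table):
    print(slot, chain)
# 0 [15, 25]
# 1 []
# 2 [12, 22, 37]
# 3 []
# 4 []
```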
2) Open Addressing
In open addressing, all elements are stored in the hash table
itself. Each table entry contains either a record or NIL. When searching for an
element, we examine the table slots one by one until the desired element is found
or it is clear that the element is not in the table.
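
One way to examine slots "one by one" is linear probing, where the search starts at the hashed slot and steps forward by one position at a time. The text does not name a probing scheme, so linear probing, the table size of 7, and the function names below are all assumptions made for this sketch. Deletion is left out, since it needs extra bookkeeping.

```python
NIL = None  # marks an empty table entry


def insert(table, key):
    """Insert key using linear probing; return the slot that was used."""
    m = len(table)
    for i in range(m):
        slot = (key + i) % m          # start at key % m, then step forward
        if table[slot] is NIL:
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")


def search(table, key):
    """Return the slot holding key, or None if key is not in the table."""
    m = len(table)
    for i in range(m):
        slot = (key + i) % m
        if table[slot] is NIL:
            return None               # an empty slot ends the probe sequence
        if table[slot] == key:
            return slot
    return None


table = [NIL] * 7                     # assumed table size for this sketch
for key in [12, 15, 22, 25, 37]:
    insert(table, key)

print(table)             # [None, 15, 22, 37, 25, 12, None]
print(search(table, 37)) # 3
print(search(table, 8))  # None
```

Here 22 hashes to slot 1, which already holds 15, so it moves on to slot 2; 37 hashes to slot 2 and ends up in slot 3 for the same reason.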
