Hashing in Data Structures

Hashing is a technique that maps data of variable lengths to fixed-length values to allow for faster searches of large datasets. It works by applying a hash function to a key that generates a hash value, which serves as an index for where the key/value pair is stored in a hash table. Collisions occur when different keys generate the same hash value, and techniques like separate chaining and open addressing are used to resolve collisions.

Uploaded by

Kashif Riaz

Hashing in Data Structures: An Overview

Have you heard of hashing but aren't sure how it works or why it matters? Hashing is a widely used technique for locating a required item within a large collection of data. It maps input of variable length to an output of fixed size, which makes both storing and retrieving data faster.

What is Hashing in Data Structures?

In this technique, an input called a key is passed to a hash function. The function computes a hash value (also called a hash code) from the key, and that value is used as the index of a slot in the hash table. The value associated with the key is stored at, and later retrieved from, that slot. Different keys can occasionally produce the same index; this is called a collision.
Hashing reduces a key, such as a number or a string of characters, to a shorter, fixed-length value that can be looked up quickly. This is how key-value pairs are stored in hash tables. The hash function can be written as:
hash = hashfunc(key)

How does Hashing in Data Structures Work?

• Hash key: The data to be hashed into the hash table. The hashing algorithm translates the key into the hash value. The key can be a string or an integer. The word "key" is also used in cryptography, where it means something different from a hash table key:
1. Public key - The openly shared half of an asymmetric key pair, used to encrypt data or to verify signatures.
2. Private key - The secret half of an asymmetric key pair, used to decrypt data or to create signatures. It is not the same as a symmetric key, which is a single shared secret used for both encryption and decryption.
3. SSH key pair - SSH authentication uses a pair consisting of a public key and a private key.
• Hash Function: It performs the mathematical operation of accepting the key as input and producing the hash code or hash value as the output. Some of the characteristics of an ideal hash function are as follows:
o It must be deterministic: the same key always produces the same hash value.
o It should spread keys uniformly over the output range. Because that range is finite, two different inputs can share a hash code, so uniqueness cannot be guaranteed.
o It should minimize collisions.
o A small change in the input should lead to a drastic change in the output (the avalanche effect).
o The calculation must be quick.
• Hash Table: A data structure that stores data in an array. The table maps keys to values using a hash function.
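
The three pieces described above (key, hash function, hash table) can be seen together in a small sketch. This is an illustration only, not a production hash table: the names `toy_hash`, `put`, `get`, and `TABLE_SIZE` are hypothetical, the hash function is a deliberately simple one chosen for readability, and collisions are ignored for brevity.

```python
TABLE_SIZE = 8  # assumed table size, chosen only for illustration


def toy_hash(key: str) -> int:
    """Hypothetical hash function: sum of character codes, reduced to a table index."""
    return sum(ord(ch) for ch in key) % TABLE_SIZE


# The hash table is a plain array (list) with one slot per index.
table = [None] * TABLE_SIZE


def put(key: str, value) -> None:
    # Store the pair in the slot chosen by the hash function.
    # A later key hashing to the same slot would overwrite it (no collision handling).
    table[toy_hash(key)] = (key, value)


def get(key: str):
    # Recompute the index from the key and check that the stored key matches.
    entry = table[toy_hash(key)]
    if entry is not None and entry[0] == key:
        return entry[1]
    return None


put("apple", 3)
put("kiwi", 7)
print(toy_hash("apple"), get("apple"))  # index 2, value 3
print(toy_hash("kiwi"), get("kiwi"))    # index 4, value 7
```

The lookup never scans the table: it recomputes the index from the key and goes straight to that slot.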

Use cases of Hashing
• Password Storage: Hash functions are commonly used to store passwords securely. Instead of storing the actual passwords, the system stores their hash values. When a user enters a password, it is hashed and compared with the stored hash value for authentication.
• Data Integrity: Hashing is used to check data integrity by generating hash values for files or messages. Comparing the hash values before and after transmission or storage reveals whether any change or tampering occurred.
• Data Retrieval: Hashing is used in data structures like hash tables, which provide efficient retrieval of key-value pairs. The hash value serves as an index for storing and retrieving data quickly.
• Digital Signatures: Hash functions are an integral part of digital signatures. A hash value is generated for the message and then signed with the signer's private key. The signature can be verified with the signer's public key, which confirms the authenticity and integrity of the message.
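
The data integrity use case can be sketched in a few lines. This is a minimal illustration under assumptions: the message contents and the helper names `sha256_hex` and `is_unchanged` are made up for the example, and only Python's standard `hashlib` and `hmac` modules are used.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


# Made-up message; the digest is recorded before transmission or storage.
original = b"transfer 100 to account 42"
expected_digest = sha256_hex(original)


def is_unchanged(received: bytes, expected_hex: str) -> bool:
    # Hash the received data again and compare it with the recorded digest.
    # hmac.compare_digest performs the comparison in constant time.
    return hmac.compare_digest(sha256_hex(received), expected_hex)


print(is_unchanged(original, expected_digest))                       # True
print(is_unchanged(b"transfer 900 to account 42", expected_digest))  # False
```

Changing a single character of the message produces a different digest, so the second check fails.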

Example of Hashing

import hashlib

def sha256(text):
    # Encode the string as UTF-8 bytes, hash it, and return the digest as hex.
    hash_object = hashlib.sha256(text.encode('utf-8'))
    return hash_object.hexdigest()

def main():
    message = "Hello, world!"
    hash_value = sha256(message)
    print("Message: " + message)
    print("SHA-256 Hash: " + hash_value)

if __name__ == "__main__":
    main()

Output
Message: Hello, world!
SHA-256 Hash: 315f5bdb76d078c43b8ac0064e4a0164612b1fce77c869345bfc94c75894edd3

Types of Hash Functions

The primary types of hash functions are:
1. Division Method.
2. Mid Square Method.
3. Folding Method.
4. Multiplication Method.

1. Division Method
The division method is the simplest and fastest way to compute a hash value. The key k is divided by M, and the remainder is used as the hash value.

Formula:
h(k) = k mod M
(where k = key value and M = the size of the hash table)

Advantages:
• This method works for any table size M, although some choices of M distribute keys better than others.
• It requires only a single division operation, so it is very fast.

Disadvantages:
• Consecutive keys map to consecutive hash values, which can cluster entries and lead to poor performance.
• The value of M must be chosen with care; a prime number that is not close to a power of two is a common recommendation.

Example of Division Method

k = 1987
M = 13
h(1987) = 1987 mod 13
h(1987) = 11
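
The division method is a one-line computation. The sketch below uses the key 1987 and table size 13 from the example; the function name `division_hash` is an assumption made for illustration.

```python
def division_hash(k: int, m: int) -> int:
    """Division method: the remainder of k divided by the table size m."""
    return k % m


# 13 * 152 = 1976, and 1987 - 1976 = 11
print(division_hash(1987, 13))  # 11

# Consecutive keys land in consecutive slots, wrapping around at m.
print([division_hash(k, 13) for k in range(1987, 1991)])  # [11, 12, 0, 1]
```

The second line of output shows the weakness noted under the disadvantages: neighbouring keys always occupy neighbouring slots.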

2. Mid Square Method
This hash value is calculated in two steps:
• Square the key: compute k x k.
• Take the middle r digits of the square as the hash value.

Formula:
h(k) = middle r digits of (k x k)
(where k = key value and r = the number of digits kept)

Advantages:
• This technique works well because most or all of the digits of the key contribute to the middle digits of the squared result.
• The result is not dominated by the top or bottom digits of the original key.

Disadvantages:
• Key size is a limitation: the square has about twice as many digits as the key, so large keys produce very large intermediate values.
• Collisions can still occur, since different keys can share the same middle digits.

Example of Mid Square Method

k = 60
Therefore,
k x k = 60 x 60
k x k = 3600
Thus, taking the middle r = 2 digits,
h(60) = 60
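
A minimal sketch of the mid square method follows. The function name `mid_square_hash` is hypothetical, and one detail is an assumption: when the middle r digits cannot be centred exactly, this sketch leans toward the left. Other conventions are possible.

```python
def mid_square_hash(k: int, r: int) -> int:
    """Mid square method: square the key and keep the middle r digits."""
    squared = str(k * k)
    # Start position of the middle r digits (assumption: lean left on uneven splits).
    start = max((len(squared) - r) // 2, 0)
    return int(squared[start:start + r])


print(mid_square_hash(60, 2))    # 60 * 60 = 3600, middle two digits "60"
print(mid_square_hash(1234, 3))  # 1234 * 1234 = 1522756, middle three digits "227"
```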

3. Folding Method
The process involves two steps:
• Divide the key k into parts k1, k2, k3, ..., kn, each with the same number of digits, except for the last part, which may have fewer digits than the others.
• Add the parts together. The hash value is the sum, ignoring the final carry, if any.

Formula:
k = k1, k2, k3, k4, ..., kn
s = k1 + k2 + k3 + k4 + ... + kn
h(k) = s
(where s = the sum of the parts of key k)

Advantages:
• Produces a simple hash value by splitting the key into equal-sized segments.
• Does not depend on how the keys are distributed in the hash table.

Disadvantages:
• Efficiency can suffer when there are too many collisions.

Example of Folding Method

k = 12345
k1 = 12; k2 = 34; k3 = 5
Therefore,
s = k1 + k2 + k3
s = 12 + 34 + 5
s = 51
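
The folding steps can be sketched as below for a key of 12345 split into two-digit parts. The function name `folding_hash` and the `part_size` parameter are assumptions for illustration. This sketch returns the plain sum; dropping a final carry, if wanted, could be done by reducing the sum modulo 10 ** part_size, which is one possible convention rather than the only one.

```python
def folding_hash(k: int, part_size: int) -> int:
    """Folding method: split the key's digits into parts and add the parts."""
    digits = str(k)
    # Slice the digit string left to right; the last part may be shorter.
    parts = [int(digits[i:i + part_size]) for i in range(0, len(digits), part_size)]
    return sum(parts)


print(folding_hash(12345, 2))  # parts 12, 34, 5 give 51
```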

4. Multiplication Method
• Choose a constant A, where 0 < A < 1.
• Multiply the key k by A.
• Take the fractional part of kA.
• Multiply that fractional part by M, the hash table's size, and take the floor of the result.

Formula:
h(k) = floor(M (kA mod 1))
(where M = size of the hash table, k = key value and A = constant value)

Advantages:
• It works with any value of A between 0 and 1, although some values yield better results than others.

Disadvantages:
• The method is most appropriate when the table size is a power of two, where the index can be computed quickly from the key; with other table sizes that speed advantage is lost.

Example of Multiplication Method

k = 5678
A = 0.6829
M = 200
Now, calculating h(5678):
h(5678) = floor[200(5678 x 0.6829 mod 1)]
h(5678) = floor[200(3877.5062 mod 1)]
h(5678) = floor[200(0.5062)]
h(5678) = floor[101.24]
h(5678) = 101
So h(5678) is 101.
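
The same calculation can be checked with a short sketch. The function name `multiplication_hash` is an assumption; the values k = 5678, A = 0.6829, and M = 200 come from the example. Floating-point arithmetic is used here for simplicity, which is adequate for illustration but not how performance-sensitive implementations usually do it.

```python
import math


def multiplication_hash(k: int, a: float, m: int) -> int:
    """Multiplication method: floor(m * fractional part of k * a), with 0 < a < 1."""
    fractional = (k * a) % 1.0  # keep only the fractional part of k * a
    return math.floor(m * fractional)


# 5678 * 0.6829 = 3877.5062, fractional part 0.5062, times 200 = 101.24
print(multiplication_hash(5678, 0.6829, 200))  # 101
```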

What is a Hash Collision?
A collision occurs when a hash function produces the same hash value for two or more different keys. Collisions are not a defect of a particular algorithm; they are an unavoidable consequence of hashing. A hash function converts every input, regardless of its length, into a fixed-length code, so there are infinitely many possible inputs but only a finite number of outputs. Some inputs must therefore share a hash value, and a hash table needs a strategy for handling that.

Collision Resolution Techniques in Data Structures

1. Open hashing/separate chaining/closed addressing
2. Open addressing/closed hashing

1. Open hashing/separate chaining/closed addressing
Separate chaining is a common collision handling technique that uses linked lists to hold the entries that hash to the same index. It is also known as closed addressing. The table is an array of linked lists, so colliding keys are stored together in one chain instead of overwriting each other.

Advantages:
• Implementation is simple.
• The table never fills up: more keys can always be added by extending the chains.
• Less sensitive to the hash function and to changing load factors.
• Typically used when the number and frequency of keys to be stored is not known in advance.

Disadvantages:
• Space is wasted, because some slots are never used and each chain entry needs a link.
• Long chains increase search time.
• Worse cache performance than closed hashing, since chain entries are scattered in memory.
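
Separate chaining can be sketched as follows. Several choices here are assumptions made for brevity: the class name `ChainedHashTable` is hypothetical, each chain is a Python list rather than a hand-written linked list, and Python's built-in `hash` supplies the hash value (for small non-negative integers it returns the integer itself).

```python
class ChainedHashTable:
    """Hash table that resolves collisions by keeping a chain per slot."""

    def __init__(self, size: int = 7):
        # One empty chain per slot (a Python list stands in for a linked list).
        self.buckets = [[] for _ in range(size)]

    def _index(self, key) -> int:
        return hash(key) % len(self.buckets)

    def put(self, key, value) -> None:
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # key already present: update in place
                return
        bucket.append((key, value))       # new key: extend the chain

    def get(self, key, default=None):
        # Only the chain at the hashed index has to be searched.
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        return default


table = ChainedHashTable(size=7)
for key in (10, 17, 24):  # all three keys hash to index 3 when the size is 7
    table.put(key, f"value-{key}")

print(table.buckets[3])  # all three entries share one chain
print(table.get(17))     # value-17
```

All three colliding keys remain retrievable, at the cost of a short linear search within the chain.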

2. Closed hashing (Open addressing)
Instead of using linked lists, open addressing stores every entry in the array itself. The hash value gives only the starting position, not necessarily the final one. To insert, the algorithm checks the slot at the hashed index and, if it is occupied, searches for an empty slot by following a probe sequence. The probe sequence defines the order in which slots are examined; the gap between successive probes may be fixed or may change. There are three methods for dealing with collisions in closed hashing:

1. Linear Probing
Linear probing inspects the hash table sequentially, starting from the hashed index. If the requested slot is already occupied, the next slot is tried. The distance between probes is fixed (usually 1).

Formula
index = (hash(key) + i) % hashTableSize, for i = 0, 1, 2, ...

Sequence
index = hash(n) % T
(hash(n) + 1) % T
(hash(n) + 2) % T
(hash(n) + 3) % T ... and so on.
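
A minimal sketch of insertion with linear probing follows. The function name `linear_probe_insert` is an assumption, keys are plain integers, and `key % size` stands in for the hash function; deletion and lookup are left out.

```python
def linear_probe_insert(table: list, key: int) -> int:
    """Insert key using linear probing and return the slot it was placed in."""
    size = len(table)
    start = key % size                # hashed index
    for i in range(size):
        index = (start + i) % size    # step forward one slot at a time, wrapping around
        if table[index] is None:
            table[index] = key
            return index
    raise RuntimeError("hash table is full")


slots = [None] * 7
positions = [linear_probe_insert(slots, k) for k in (10, 17, 24, 6)]
print(positions)  # [3, 4, 5, 6]
print(slots)      # [None, None, None, 10, 17, 24, 6]
```

Keys 10, 17, and 24 all hash to slot 3, so they end up in adjacent slots. This run of occupied slots is the clustering that the other probing methods try to reduce.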

2. Quadratic Probing
The only difference between linear and quadratic probing is the distance between successive probes. If the hashed slot is already taken, the search continues until a free slot is found, but the offset from the original hashed index grows as successive values of a polynomial, usually the square of the probe number.

Formula
index = (hash(key) + i x i) % hashTableSize, for i = 0, 1, 2, ...

Sequence
index = hash(n) % T
(hash(n) + 1 x 1) % T
(hash(n) + 2 x 2) % T
(hash(n) + 3 x 3) % T ... and so on
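
Quadratic probing differs from linear probing only in the offset. The sketch below makes the same assumptions as before: the function name `quadratic_probe_insert` is hypothetical, keys are integers, and `key % size` stands in for the hash function.

```python
def quadratic_probe_insert(table: list, key: int) -> int:
    """Insert key using quadratic probing and return the slot it was placed in."""
    size = len(table)
    start = key % size                  # hashed index
    for i in range(size):
        index = (start + i * i) % size  # offsets 0, 1, 4, 9, ... from the hashed index
        if table[index] is None:
            table[index] = key
            return index
    # Quadratic probing does not visit every slot, so this can happen
    # even when the table still has free slots.
    raise RuntimeError("no free slot found on the probe sequence")


slots = [None] * 7
positions = [quadratic_probe_insert(slots, k) for k in (10, 17, 24)]
print(positions)  # [3, 4, 0]
```

The third colliding key jumps from slot 3 to slot 0 (offset 4) instead of filling slot 5, so colliding keys spread out rather than forming one contiguous run.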

3. Double-Hashing
In double hashing, the distance between probes is determined by a second hash function. Because keys that collide at the same index usually get different step sizes, double hashing reduces clustering.

Formula
(firstHash(key) + i * secondHash(key)) % size of the table

Sequence
index = hash(x) % S
(hash(x) + 1*hash2(x)) % S
(hash(x) + 2*hash2(x)) % S
(hash(x) + 3*hash2(x)) % S ... and so on
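
Double hashing can be sketched as follows. The particular second hash function, 1 + (key % (size - 1)), is an assumption: it is a common textbook choice because it never returns 0, and it pairs well with a prime table size. The names `hash1`, `hash2`, and `double_hash_insert` are hypothetical.

```python
def hash1(key: int, size: int) -> int:
    """First hash function: the starting slot."""
    return key % size


def hash2(key: int, size: int) -> int:
    """Second hash function: the step size. It must never be 0."""
    return 1 + (key % (size - 1))


def double_hash_insert(table: list, key: int) -> int:
    """Insert key using double hashing and return the slot it was placed in."""
    size = len(table)
    for i in range(size):
        index = (hash1(key, size) + i * hash2(key, size)) % size
        if table[index] is None:
            table[index] = key
            return index
    raise RuntimeError("no free slot found on the probe sequence")


slots = [None] * 7  # a prime size lets every step size reach all slots
positions = [double_hash_insert(slots, k) for k in (10, 17, 24)]
print(positions)  # [3, 2, 4]
```

Keys 17 and 24 both collide with 10 at slot 3, but their step sizes differ (6 and 1), so they follow different probe sequences.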

Importance of Hashing
1. Required information can be retrieved efficiently from large data sets.
2. The hash code produced by the hash function acts as a compact fingerprint of the data, which helps in checking data integrity.
3. Every record has an index in the hash table, so data is stored in a structured manner that supports efficient storage and retrieval.

Limitations of Hashing
1. Collisions occur whenever two or more inputs have the same hash value.
2. The performance of hashing depends on the quality of the hash function. A poorly chosen hash function causes many collisions and reduces efficiency.

Summary
Hashing converts data into fixed-length values so that it can be organized and retrieved efficiently. Separate chaining and open addressing are the two primary collision resolution techniques. Cryptographic hash functions such as SHA-256 turn data into fixed-length codes. Hashing is useful for password storage, data integrity, and digital signatures, and it speeds up data retrieval. Division, mid square, folding, and multiplication are among the common kinds of hash functions used with hash tables.
