Hashing

Data Structure and Algorithm Lesson

Uploaded by

danielatparoni

Hashing
• Hashing is the process of transforming a given key into another value.
• A hash function maps the key to a specific index in a hash table, which enables fast retrieval of information based on its key.
• The value obtained from the hash function is called the hash code.
Need for Hash data structure
● The array is a very common data structure.
● For a large data set an array becomes inefficient, because finding an element by value in an unsorted array takes O(n) time.
● We therefore want a data structure that can store and search data in constant time, i.e. O(1) time on average. This is the motivation for hashing.
● With a hash table, data can be stored and retrieved in constant time on average.
Components of Hashing
1. Key: A key can be any value, such as a string or an integer, that is fed as input to the hash function.
2. Hash Function: The hash function receives the input key and returns the index of an element in an array called a hash table. The index is known as the hash index.
3. Hash Table: A hash table is a data structure that maps keys to values using a hash function. It stores the data in an associative manner in an array, where each value is placed at the index computed from its key.
How Does Hashing Work?
The process of hashing can be broken down into three steps:

1. Input: The data to be hashed is input into the hashing algorithm.
2. Hash Function: The hashing algorithm takes the input data and applies a mathematical function to generate a fixed-size hash value.
3. Output: The hash value is returned, which is used as an index to store or retrieve data in a data structure.
Direct Address Table
● A direct address table is a data structure that maps keys to their corresponding records using an array.
● In direct address tables, records are placed using their key values directly as indexes.
● Searching, insertion and deletion are fast: each takes O(1) time.
Example:
We create an array of size equal to the maximum key value plus one (assuming 0-based indexing) and then use the keys as indexes. For example, key 21 is stored directly at index 21.
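The idea can be sketched in Python as follows. This is a minimal illustration rather than part of the original slides: the class name `DirectAddressTable`, its method names and the maximum key of 25 are all assumptions, and keys are assumed to be integers in the range 0 to `max_key`.

```python
class DirectAddressTable:
    """Stores each record at the array index equal to its key."""

    def __init__(self, max_key):
        # One slot per possible key, 0..max_key inclusive.
        self.slots = [None] * (max_key + 1)

    def insert(self, key, record):
        self.slots[key] = record

    def search(self, key):
        return self.slots[key]

    def delete(self, key):
        self.slots[key] = None


table = DirectAddressTable(max_key=25)   # 25 is an assumed maximum key
table.insert(21, "record for key 21")
print(table.search(21))   # record for key 21
print(len(table.slots))   # 26
```

Note that 26 slots are allocated to hold a single record, which is the memory cost that grows with the maximum key value.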
Limitations:
● The maximum key value must be known in advance.
● Practically useful only if the maximum key value is small.
● Memory is wasted if the maximum key value is much larger than the total number of records.

Hashing can overcome these limitations of direct address tables.
Hash function
● A hash function is a mathematical formula that maps a key to a value.
● The result of the hash function is referred to as a hash value or hash. The hash value is a representation of the original key, and is usually smaller than the original.
Types of Hash functions
Division method
This method involves dividing the key by the table size m and taking the remainder as the hash value: h(key) = key % m. For example, if the table size is 10 and the key is 23, the hash value would be 3 (23 % 10 = 3).

Advantages:
● Simple to implement.
● Works well when m is a prime number.

Disadvantages:
● Poor distribution if m is not chosen wisely.

A prime modulus is often chosen because a prime number has no factors in common with the keys, other than for keys that are multiples of the prime itself. This allows keys to be distributed more evenly and reduces clustering, which improves the performance of the hash table. For example, the keys 10, 20, 30 and 40 all hash to 0 when m = 10, but hash to 3, 6, 2 and 5 when m = 7.
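The effect of the table size can be checked with a short Python sketch. The function name `division_hash` and the list of keys are assumptions made for illustration; only the 23 % 10 case comes from the slides.

```python
def division_hash(key, m):
    """Division method: the remainder of key divided by the table size m."""
    return key % m


print(division_hash(23, 10))  # 3

keys = [4, 8, 12, 16, 20, 24]   # hypothetical keys, all multiples of 4
print([division_hash(k, 8) for k in keys])  # [4, 0, 4, 0, 4, 0]
print([division_hash(k, 7) for k in keys])  # [4, 1, 5, 2, 6, 3]
```

With m = 8 the six keys land in only two slots, while the prime m = 7 spreads them over six different slots.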

Multiplication Method
● In the multiplication method, the key is multiplied by a constant A, where 0 < A < 1. A commonly recommended choice is A ≈ (√5 - 1)/2 ≈ 0.618033, as suggested by Knuth.
● Only the fractional part of the product is kept. It always lies between 0 and 1, which keeps the final result within the hash table size.
● The fractional part is then multiplied by m, the desired size of the hash table.
● The final result is floored to obtain an integer index.
Example:
Given:
Key to hash = 42
Constant A = 0.618033
Table size m = 10

H(key) = ⌊m × ((key × A) mod 1)⌋

H(42) = ⌊10 × ((42 × 0.618033) mod 1)⌋
H(42) = ⌊10 × (25.957386 mod 1)⌋
H(42) = ⌊10 × 0.957386⌋
H(42) = ⌊9.57386⌋
H(42) = 9
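The same calculation can be written as a small Python function. This is a sketch only: the name `multiplication_hash` is an assumption, and floating-point arithmetic is used for the fractional part, which is adequate for small keys like the one in the example.

```python
import math

A = 0.618033  # constant from the example, approximately (sqrt(5) - 1) / 2


def multiplication_hash(key, m, a=A):
    """Multiplication method: floor(m * fractional part of key * a)."""
    fractional = (key * a) % 1.0   # keep only the fractional part
    return math.floor(m * fractional)


print(multiplication_hash(42, 10))  # 9
```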
Universal Hashing
● Universal hashing picks a hash function at random from a family of hash functions, which minimizes the chance of collision for any given set of inputs.

h(k) = ((a⋅k + b) mod p) mod m

where k is the key, m is the table size, p is a prime number greater than m and larger than every possible key, and a and b are constants chosen at random with 1 ≤ a ≤ p - 1 and 0 ≤ b ≤ p - 1.

Advantages:
● Reduces the probability of collisions.

Disadvantages:
● Requires more computation and storage.
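A minimal Python sketch of drawing one function from this family is shown below. The helper name `make_universal_hash`, the `rng` parameter and the default prime p = 2³¹ - 1 are assumptions for illustration; with that prime, keys are assumed to be integers smaller than p.

```python
import random


def make_universal_hash(m, p=2_147_483_647, rng=random):
    """Draw one function h(k) = ((a*k + b) mod p) mod m from the family.

    p defaults to the prime 2**31 - 1 (an assumption), so keys are
    expected to be integers in the range 0..p-1.
    """
    a = rng.randrange(1, p)   # a is drawn from 1..p-1
    b = rng.randrange(0, p)   # b is drawn from 0..p-1

    def h(k):
        return ((a * k + b) % p) % m

    return h


h = make_universal_hash(m=10)
print([h(k) for k in (27, 43, 692, 72)])  # values vary from run to run
```

Because a and b are drawn when the function is created, the extra storage mentioned above amounts to remembering a, b and p for the lifetime of the table.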
What is Collision?
● A collision occurs in hashing when two different keys map to the same hash value.
● The probability of a collision depends on the size of the hash table, the distribution of the hash values and the quality of the hash function.
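To make this concrete, the following Python sketch groups a handful of keys by slot and reports the slots that receive more than one key. The function name `find_collisions`, the keys and the table size of 10 are all assumptions chosen for illustration.

```python
from collections import defaultdict


def find_collisions(keys, m):
    """Group keys by slot (key % m) and keep slots holding more than one key."""
    slots = defaultdict(list)
    for key in keys:
        slots[key % m].append(key)
    return {slot: group for slot, group in slots.items() if len(group) > 1}


print(find_collisions([23, 33, 47, 58], 10))  # {3: [23, 33]}
```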
How to handle Collisions
Separate Chaining
● The idea is to make each cell of the hash table point to a linked list of records that have the same hash function value.
● This method is implemented using the linked list data structure.
● When several elements are hashed into the same slot index, those elements are added to a chain, which is a singly linked list.
Example:
Hash function: key mod 7
Values: 50, 700, 76, 85, 92, 73, 101
Resulting chains, with each new value appended to the end of its chain:
Slot 0: 700
Slot 1: 50 → 85 → 92
Slot 3: 73 → 101
Slot 6: 76
Slots 2, 4 and 5 stay empty.
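A minimal Python sketch of separate chaining, using the example values and the hash function key mod 7, might look like this. The class and method names are assumptions, duplicates are not checked, and new keys are appended to the end of a chain (inserting at the head is an equally valid choice).

```python
class Node:
    """One entry in a chain (singly linked list)."""

    def __init__(self, key):
        self.key = key
        self.next = None


class ChainedHashTable:
    def __init__(self, size):
        self.size = size
        self.slots = [None] * size   # each slot holds the head of a chain

    def insert(self, key):
        index = key % self.size
        node = Node(key)
        if self.slots[index] is None:
            self.slots[index] = node
            return
        current = self.slots[index]
        while current.next is not None:   # walk to the end of the chain
            current = current.next
        current.next = node

    def search(self, key):
        current = self.slots[key % self.size]
        while current is not None:
            if current.key == key:
                return True
            current = current.next
        return False

    def chain(self, index):
        """Return the keys stored in one slot, in chain order."""
        keys, current = [], self.slots[index]
        while current is not None:
            keys.append(current.key)
            current = current.next
        return keys


table = ChainedHashTable(7)
for key in (50, 700, 76, 85, 92, 73, 101):
    table.insert(key)

print(table.chain(1))   # [50, 85, 92]
print(table.chain(3))   # [73, 101]
```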
Advantages:
● Easy to implement.
● We can always add more elements to a chain, so the hash table never runs out of space.
● Less sensitive to the load factor and to the choice of hash function.
● Typically used when it is unclear how many keys will be added or removed, or how frequently.

Disadvantages:
● Cache performance is poor, since keys are kept in linked lists.
● Space is wasted, because some slots of the hash table may never be used.
● In the worst case, search time becomes O(n) as a chain lengthens.
● Additional space is used for the links.
Open Addressing
Unlike chaining, open addressing doesn't store multiple elements in the same slot. Here, each slot is either filled with a single key or left NIL.

Linear Probing
In linear probing, the hash table is searched sequentially, starting from the original hash location. If that location is already occupied, we check the next location, wrapping around to the start of the table when the end is reached.
Algorithm
1. Calculate the hash index, i.e. key = data % size.
2. If hashTable[key] is empty, store the value directly: hashTable[key] = data.
3. If hashTable[key] already holds a value, move to the next index using key = (key + 1) % size.
4. If the new hashTable[key] is empty, store the value there. Otherwise try the next index.
5. Repeat until an empty slot is found. If every slot has been checked, the table is full.
Example: Consider the hash function “key mod 5” and the keys 50, 70, 76, 85, 93, inserted in that order.
50 mod 5 = 0, stored at index 0
70 mod 5 = 0, index 0 is occupied, stored at index 1
76 mod 5 = 1, index 1 is occupied, stored at index 2
85 mod 5 = 0, indexes 0, 1 and 2 are occupied, stored at index 3
93 mod 5 = 3, index 3 is occupied, stored at index 4
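The algorithm above can be sketched in Python as follows. The function name `linear_probe_insert` is an assumption, `None` is used to mark an empty slot, and the loop is bounded so that a full table raises an error instead of probing forever.

```python
def linear_probe_insert(hash_table, data):
    """Insert data using linear probing and return the index used."""
    size = len(hash_table)
    key = data % size                 # step 1: home index
    for _ in range(size):             # visit each slot at most once
        if hash_table[key] is None:   # steps 2 and 4: empty slot found
            hash_table[key] = data
            return key
        key = (key + 1) % size        # step 3: move to the next index
    raise OverflowError("hash table is full")


hash_table = [None] * 5
for data in (50, 70, 76, 85, 93):
    linear_probe_insert(hash_table, data)

print(hash_table)  # [50, 70, 76, 85, 93]
```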
Quadratic Probing
Quadratic probing operates by taking the original hash index and adding successive values of a quadratic polynomial (here i²) until an open slot is found.

Algorithm
1. If the slot hash(x) % n is full, then we try (hash(x) + 1²) % n.
2. If (hash(x) + 1²) % n is also full, then we try (hash(x) + 2²) % n.
3. If (hash(x) + 2²) % n is also full, then we try (hash(x) + 3²) % n.
4. This process is repeated for increasing values of i, trying (hash(x) + i²) % n, until an empty slot is found.
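Example 1:
Table size = 7, hash function Hash(x) = x % 7. Insert 22, 30 and 50.
1. Hash(22) = 22 % 7 = 1, stored at index 1
2. Hash(30) = 30 % 7 = 2, stored at index 2
3. Hash(50) = 50 % 7 = 1, index 1 is occupied
(1 + 1²) % 7 = 2, index 2 is occupied
(1 + 2²) % 7 = 5, stored at index 5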
Example 2:
Table size: 10; hash function: h(x) = x % 10
Keys: 2, 12, 22, 32
Quadratic probing formula: (hash(x) + i²) % n

Resulting table:
Index:  0   1   2   3   4   5   6   7   8   9
Key:    -   32  2   12  -   -   22  -   -   -

1. h(2) = 2 % 10 = 2
2. h(12) = 12 % 10 = 2, index 2 is occupied
(2 + 1²) % 10 = 3
3. h(22) = 22 % 10 = 2, index 2 is occupied
(2 + 1²) % 10 = 3, index 3 is occupied
(2 + 2²) % 10 = 6
4. h(32) = 32 % 10 = 2, index 2 is occupied
(2 + 1²) % 10 = 3, index 3 is occupied
(2 + 2²) % 10 = 6, index 6 is occupied
(2 + 3²) % 10 = 1
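Quadratic probing with the formula (hash(x) + i²) % n can be sketched in Python as below. The function name `quadratic_probe_insert` is an assumption and `None` marks an empty slot. The loop stops after n probes because the probe sequence repeats after that; quadratic probing can fail to find a free slot even when one exists, so an error is raised in that case.

```python
def quadratic_probe_insert(table, x):
    """Insert x using quadratic probing and return the index used."""
    n = len(table)
    home = x % n
    for i in range(n):                 # i = 0 is the home slot itself
        index = (home + i * i) % n
        if table[index] is None:
            table[index] = x
            return index
    raise OverflowError("no empty slot found in the probe sequence")


table = [None] * 10
indexes = [quadratic_probe_insert(table, x) for x in (2, 12, 22, 32)]
print(indexes)  # [2, 3, 6, 1]
```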
Double Hashing
● Double hashing works by using two hash functions to compute two different hash values for a given key.
● The first hash function, h1(k), takes the key and gives a location in the hash table.
● If that location is occupied (a collision), we use a secondary hash function h2(k) in combination with h1(k) to find a new location in the hash table.

This combination of hash functions is of the form

h(k, i) = (h1(k) + i * h2(k)) % n
where:
● i is the probe number: 0 for the first attempt, increased by 1 after each collision
● k is the element/key being hashed
● n is the hash table size
Example:
Keys to insert: 27, 43, 692, 72
Hash table size: 7
First hash function: h1(k) = k mod 7
Second hash function: h2(k) = 1 + (k mod 5)
1. h1(27) = 6, stored at index 6
2. h1(43) = 1, stored at index 1
3. h1(692) = 6, index 6 is occupied; h2(692) = 1 + 2 = 3, so (6 + 1 * 3) % 7 = 2, stored at index 2
4. h1(72) = 2, index 2 is occupied; h2(72) = 1 + 2 = 3, so (2 + 1 * 3) % 7 = 5, stored at index 5
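The example can be reproduced with a short Python sketch. The function name `double_hash_insert` is an assumption and `None` marks an empty slot; h1 and h2 are the two functions given in the example.

```python
def h1(k):
    return k % 7


def h2(k):
    return 1 + (k % 5)


def double_hash_insert(table, k):
    """Insert k using h(k, i) = (h1(k) + i * h2(k)) % n and return the index."""
    n = len(table)
    for i in range(n):                      # i = 0 is the first attempt
        index = (h1(k) + i * h2(k)) % n
        if table[index] is None:
            table[index] = k
            return index
    raise OverflowError("no empty slot found in the probe sequence")


table = [None] * 7
for k in (27, 43, 692, 72):
    double_hash_insert(table, k)

print(table)  # [None, 43, 692, None, None, 72, 27]
```

Because the table size 7 is prime and h2(k) is always between 1 and 5, the step size never shares a factor with the table size, so the probe sequence can reach every slot.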
Lab Activity 1
Create a C program that allows the user to insert, search, display, and delete integer elements in a hash table. The program must use the multiplication method as its hash function. For collision resolution, use separate chaining.
Sources
● https://www.geeksforgeeks.org/introduction-to-hashing-2/#how-to-handle-collisions
● https://www.youtube.com/watch?v=KyUTuwz_b7Q
● https://medium.com/@alejandro.itoaramendia/the-hash-table-data-structure-a-complete-guide-27fb7ebed2ff
