Hashing New

This document provides an overview of hashing, including concepts such as hash tables, hash functions, and collision resolution strategies like linear probing and chaining. It discusses the efficiency of hashing for data retrieval and the issues related to collisions and rehashing. Additionally, it outlines various methods for implementing hash functions and terminologies related to hashing, such as buckets, synonyms, and load factors.

Uploaded by

Aarya Patil


Disclaimer

This presentation is intended for teaching purposes only. Topics are not
covered in detail, so it is not recommended as reading
material for examinations.

Students should read the textbooks and reference books listed in the
syllabus.
Hashing

Unit 5: Hashing

General idea of Hashing, Hash Table, Hash function, Rehashing,
Issues in Hashing (T2/T4),
Collision Resolution Strategies: Linear Probing, Quadratic Probing,
Double Hashing, Open addressing and Chaining.
The Search Problem
• Unsorted list
– O(N)
• Sorted list
– O(log N) using arrays (i.e., binary search)
– O(N) using linked lists
• Binary search tree
– O(log N) if the tree is balanced
– O(N) in the worst case, if the tree is unbalanced
• Can we do better than this?
– Hashing
Hash Tables
• Hashing is a method of directly computing the address of a
record from its key by using a suitable
mathematical function called a hash function.
• A hash table is an array-based structure used to store
<key, information> pairs. The implementation of hash
tables is called hashing.
• Hashing is a technique used for performing insertions,
deletions and finds in constant average time (i.e. O(1)).
• This data structure, however, is not efficient in
operations that require any ordering information among
the elements, such as findMin, findMax and printing the
entire table in sorted order.
General Idea
• The ideal hash table structure is merely an array of some fixed
size, containing the items.
• A stored item needs to have a data member, called key, that will
be used in computing the index value for the item.
– Key could be an integer, a string, etc
– e.g. a name or Id that is a part of a large employee structure
• The size of the array is TableSize.
• The items that are stored in the hash table are indexed by values
from 0 to TableSize – 1.
• Each key is mapped into some number in the range 0 to TableSize
– 1.
• The mapping is called a hash function.

Example Hash Table
Each item's key (the name) is passed to the hash function, which gives the
index of the slot where the item is stored.

Items (key, value): john 25000, phil 31250, dave 27500, mary 28200

Index   Item
0
1
2
3       john 25000
4       phil 31250
5
6       dave 27500
7       mary 28200
8
9
Hash Function
The hash function is a mathematical function that maps each key to an
address in the hash table. Ideally, different keys map to different addresses.
Hash function
Issues:
• Keys may not be numeric.
• Number of possible keys is much larger than the
space available in table.
• Different keys may map into same location
– Hash function is not one-to-one => collision.
– If there are too many collisions, the performance of
the hash table will suffer dramatically.

Hash Functions
• If the input keys are integers then simply Key mod
TableSize is a general strategy.
• If the keys are strings, hash function needs more care.
– First convert it into a numeric value.

Perfect Hash Function
A hash function that maps different keys to different addresses,
and so avoids collisions entirely, is called a perfect hash function.
A good hash function:
• must be simple to compute.
• must distribute the keys evenly among the cells.
If all the keys are known in advance, a perfect hash function can be
written, but in practice the keys are usually not known beforehand.
Hashing methods / Functions
1. Direct method: the key is the address, without any
arithmetic manipulation
Ex. Total monthly sales by day of the month (the day is the address)

2. Modulo Division or Key mod N:

– N is the size of the table, better if it is prime
– Ex. 300 employees, table size 307: 121267 % 307 = 2
– Hash(121267) = 2
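
The modulo-division method can be sketched in a few lines of C++. The function name `moduloHash` and the handling of negative keys are assumptions made for this illustration; the slides only define Key mod N.

```cpp
#include <cstdint>

// Modulo-division hashing: address = key mod tableSize.
// Assumes tableSize > 0; a prime size such as 307 tends to spread keys better.
int moduloHash(std::int64_t key, int tableSize)
{
    // In C++ the % operator can yield a negative remainder for negative keys,
    // so shift the result back into the range 0 .. tableSize - 1.
    std::int64_t r = key % tableSize;
    if (r < 0)
        r += tableSize;
    return static_cast<int>(r);
}
```

With the values from the example, `moduloHash(121267, 307)` gives 2.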

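3. Digit Extraction Method
– Extract selected digits from the key and use them to form the address
4. Folding:
a. Fold shift b. Fold boundary
– e.g. 123|456|789: split the key into parts and add them
– Fold shift: 123 + 456 + 789 = 1368
– Fold boundary: the two outer parts are reversed first: 321 + 456 + 987 = 1764
5. Mid Squaring:
– Square the key and keep the middle digits as the address
– 9452 * 9452 = 89340304, middle digits = 3403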
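
Folding and mid-squaring can be sketched as follows. This is a minimal illustration, not code from the slides: the function names, the choice to pass the key as a string of decimal digits, and the `partSize`/`keep` parameters are all assumptions.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Fold shift: cut the key's digits into groups of partSize and add the groups.
long foldShift(const std::string &digits, std::size_t partSize)
{
    long sum = 0;
    for (std::size_t i = 0; i < digits.size(); i += partSize)
        sum += std::stol(digits.substr(i, partSize));
    return sum;
}

// Fold boundary: as fold shift, but the first and last groups are reversed
// before adding.
long foldBoundary(const std::string &digits, std::size_t partSize)
{
    long sum = 0;
    for (std::size_t i = 0; i < digits.size(); i += partSize) {
        std::string part = digits.substr(i, partSize);
        bool outer = (i == 0) || (i + partSize >= digits.size());
        if (outer)
            std::reverse(part.begin(), part.end());
        sum += std::stol(part);
    }
    return sum;
}

// Mid-square: square the key and keep the middle `keep` digits of the square.
long midSquare(long key, std::size_t keep)
{
    std::string sq = std::to_string(static_cast<long long>(key) * key);
    if (sq.size() <= keep)
        return std::stol(sq);
    std::size_t start = (sq.size() - keep) / 2;
    return std::stol(sq.substr(start, keep));
}
```

In practice the result is usually reduced further (for example with mod TableSize) so that it fits the table.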

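6. Subtraction Method
Ex. 100 employees with employee numbers from 1001 to 1100:
subtract 1000 from the key to get an address from 1 to 100.
7. Pseudorandom method
y = (a*x + c) modulo TableSize
x = key; a and c are prime numbers
Table size = 307, a = 17, c = 7
y = ((17 * 121267) + 7) modulo 307
y = 41
8. Rotation: useful when keys are mostly the same and differ only in the last digit.
Move the last digit to the front, then use any hash function.
Ex: 120605 ---> 512060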
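
The subtraction, pseudorandom and rotation methods are each a line or two of arithmetic. The sketch below assumes non-negative integer keys; the function names and parameter lists are invented for illustration.

```cpp
#include <string>

// Subtraction method: e.g. employee numbers 1001..1100 with offset 1000
// map to addresses 1..100.
int subtractionHash(int key, int offset)
{
    return key - offset;
}

// Pseudorandom method: y = (a * x + c) mod tableSize.
int pseudorandomHash(long long key, long long a, long long c, int tableSize)
{
    return static_cast<int>((a * key + c) % tableSize);
}

// Rotation: move the last decimal digit of the key to the front.
// The rotated key is then passed to any other hash function.
long long rotateLastDigit(long long key)
{
    std::string s = std::to_string(key);
    if (s.size() < 2)
        return key;
    std::string rotated = s.back() + s.substr(0, s.size() - 1);
    return std::stoll(rotated);
}
```

With the values used above, `pseudorandomHash(121267, 17, 7, 307)` gives 41 and `rotateLastDigit(120605)` gives 512060.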
Terminologies
1. Bucket
A hash table uses a hash function to compute an index into an array of
buckets or slots, from which the desired value can be obtained.
The M slots of the hash table can be divided into B buckets, with each
bucket consisting of M/B slots.
2. Collision
A collision occurs when we try to place an element in a bucket that
already holds an element.
• In that case the element should be rehashed to an alternate
empty location.
3. Probe
If the hashed address is already occupied by an element, the
locations following the hashed location are examined to locate the
first empty location. Each such examination is a probe.
• Two methods are popular
a. Linear Probing b. Quadratic Probing
4. Synonym
The mapping defined by a hash function is many-to-one:
it can map several keys to the same location in the
hash table. Keys mapped to the same location in the hash table are
known as synonyms.
5. Overflow
Overflow occurs when there are more colliding records for a given bucket
than the bucket can hold.
6. Open Hashing or External Hashing
No limitation on the size of the table (storage on hard disk)
7. Closed Hashing or Internal Hashing
Fixed space for storage
8. Load Density
Maximum storage capacity
9. Load Factor
The load factor of a hash table is the ratio n/T, where
n = number of keys in the table
T = size of the hash table
10. Full Table
Every slot of the table is occupied.
Rehashing
With closed hashing, when we try to store a record with key1
at bucket position Hash(key1) and find the position occupied, we have a collision.
To handle the collision, we choose a sequence of alternative
locations Hash1(key1), Hash2(key1) and so on within the bucket table
until the record with key1 can be placed.
If the table is close to full, the search time grows and the number of
probes may become equal to the table size.
When the load factor exceeds a certain value (e.g. greater than 0.5) we
rebuild the table:
• Build a second table twice as large as the original
and rehash all the keys of the original table into it.
• Rehashing is an expensive operation, with running time O(N).
• However, once done, the new hash table will have good performance.
This is called rehashing.
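
A minimal sketch of rehashing on top of a linear-probing table of non-negative integers. The struct `ProbingTable`, the use of -1 as the empty marker and the 0.5 threshold are assumptions for this illustration; the table is assumed to start with at least one slot.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

struct ProbingTable {
    std::vector<int> slots;     // -1 marks an empty slot
    std::size_t count = 0;      // number of stored keys

    explicit ProbingTable(std::size_t size) : slots(size, -1) {}

    // Load factor n / T.
    double loadFactor() const
    {
        return static_cast<double>(count) / slots.size();
    }

    // Linear-probing placement; assumes at least one empty slot.
    void place(int key)
    {
        std::size_t pos = static_cast<std::size_t>(key) % slots.size();
        while (slots[pos] != -1)
            pos = (pos + 1) % slots.size();
        slots[pos] = key;
        ++count;
    }

    // Build a table twice as large and re-insert every key: O(N).
    void rehash()
    {
        std::vector<int> old = std::move(slots);
        slots.assign(old.size() * 2, -1);
        count = 0;
        for (int key : old)
            if (key != -1)
                place(key);
    }

    // Insert, then rehash once the load factor exceeds 0.5.
    void insert(int key)
    {
        place(key);
        if (loadFactor() > 0.5)
            rehash();
    }
};
```

Doubling keeps the sketch short; textbook versions often pick the next prime above twice the old size so that key mod TableSize keeps spreading keys well.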
Hash Function 1: Strings
• Add up the ASCII values of all characters of the key.
int hash(const string &key, int tableSize)
{
    int hashVal = 0;   // running sum of the character codes

    for (size_t i = 0; i < key.length(); i++)
        hashVal += key[i];
    return hashVal % tableSize;
}

• Simple to implement and fast.

• However, if the table size is large, the function does not
distribute the keys well.
• e.g. Table size = 10000, key length <= 8: since an ASCII value is at
most 127, the hash function can assume values only between 0 and
1016 (8 * 127).
Collision Resolution
• If, when an element is inserted, it hashes to the
same value as an already inserted element, then we
have a collision and need to resolve it.
• There are several methods for dealing with this:
1. Open addressing
a. Linear Probing
b. Quadratic Probing
c. Double Hashing
2. Chaining
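1. Linear Probing
Place the new record at the first empty location found by moving
linearly down the table from the hashed position.
Ex: 131, 21, 31, 4, 5, 61, 7, 8
Index = key mod 10
21, 31 and 61 all hash to index 1, which is already occupied, so each is
placed in the next empty slot.

Index   Data
0
1       131
2       21
3       31
4       4
5       5
6       61
7       7
8       8
9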
Drawback of Linear Probing
Some collision resolution schemes tend to create long runs of
filled slots near the home (hashed) position of keys. This
tendency is called primary clustering.

Primary clustering increases the average search time.
Classes (Refer T2)
class dataitem
{
    private int data;
}

class hashtable
{
    dataitem[] hasharray;
    int arraysize;

    hashtable(int size)
    {
        arraysize = size;
        hasharray = new dataitem[arraysize];
    }

    hashfun(datatype key)
    {
        hashval = key % arraysize;   // arraysize is 10 in the examples
        return hashval;
    }
}
Insert
1. Check if the table is full.
   If so, display "table full" and stop.
2. Accept the item to be inserted in the hashtable.
3. Calculate the hash value of the key.
4. while (hasharray[hashval] != null && hasharray[hashval].data != -1)
   4.1 increment hashval
   4.2 hashval %= arraysize
5. hasharray[hashval] = item
6. stop
Find
1. Accept the key to be searched in the hashtable.
2. Calculate the hash value of the key.
3. while (hasharray[hashval] != null)
   3.1 if hasharray[hashval].data == key
       return hasharray[hashval]
   3.2 increment hashval
   3.3 hashval %= arraysize
4. return null
5. stop
Delete
1. Accept the key to be deleted from the hashtable.
2. Calculate the hash value of the key.
3. while (hasharray[hashval] != null)
   3.1 if hasharray[hashval].data == key
       dataitem temp = hasharray[hashval]
       hasharray[hashval] = deleted marker (an item whose data is -1)
       return temp
   3.2 increment hashval
   3.3 hashval %= arraysize
4. return null
5. stop
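
The insert, find and delete steps above can be put together as a small C++ class. This is a sketch under stated assumptions, not the textbook's code: keys are non-negative integers, the class and member names are invented, and a slot state (Empty, Occupied, Deleted) replaces the null and -1 markers of the pseudocode.

```cpp
#include <cstddef>
#include <vector>

class LinearProbingTable {
    enum class State { Empty, Occupied, Deleted };
    struct Slot {
        int key = 0;
        State state = State::Empty;
    };

    std::vector<Slot> slots;
    std::size_t used = 0;   // number of occupied slots

    std::size_t home(int key) const
    {
        return static_cast<std::size_t>(key) % slots.size();
    }

public:
    explicit LinearProbingTable(std::size_t size) : slots(size) {}

    // Returns false when the table is full.
    bool insert(int key)
    {
        if (used == slots.size())
            return false;
        std::size_t pos = home(key);
        while (slots[pos].state == State::Occupied)   // empty or deleted slots can be reused
            pos = (pos + 1) % slots.size();
        slots[pos].key = key;
        slots[pos].state = State::Occupied;
        ++used;
        return true;
    }

    // Returns the index holding key, or -1 if the key is absent.
    int find(int key) const
    {
        std::size_t pos = home(key);
        for (std::size_t probes = 0; probes < slots.size(); ++probes) {
            if (slots[pos].state == State::Empty)
                return -1;                            // a never-used slot ends the search
            if (slots[pos].state == State::Occupied && slots[pos].key == key)
                return static_cast<int>(pos);
            pos = (pos + 1) % slots.size();
        }
        return -1;
    }

    // Marks the slot as deleted so that later searches still probe past it.
    bool remove(int key)
    {
        int pos = find(key);
        if (pos < 0)
            return false;
        slots[pos].state = State::Deleted;
        --used;
        return true;
    }
};
```

The probe counter in `find` bounds the loop at one pass over the table, so a search in a table with no empty slot still terminates.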
Linear Probing with replacement
1. Accept the item to be inserted in the hashtable.
2. Calculate j as the hash value of the key and set hashval = j.
3. while (hasharray[hashval] != null && hasharray[hashval].data != -1)
   3.1 if hashval == j and hasharray[hashval].data % 10 != j   // home slot holds a non-home record
       { dataout = hasharray[hashval].data
         hasharray[hashval].data = item
         item = dataout }   // the displaced record is now the one to place
   3.2 increment hashval
   3.3 hashval %= arraysize
4. hasharray[hashval] = item
5. stop
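
A compact sketch of the replacement idea, assuming a table of non-negative integer keys in which -1 marks an empty slot and at least one slot is empty. The function name and the table representation are assumptions for illustration. Replacement is applied only at the new key's home slot; the displaced record is then placed by ordinary linear probing.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

void insertWithReplacement(std::vector<int> &table, int key)
{
    const std::size_t m = table.size();
    const std::size_t home = static_cast<std::size_t>(key) % m;

    // If the home slot is held by a record that does not belong there,
    // the new key takes the slot and the old record becomes the one to place.
    if (table[home] != -1 && static_cast<std::size_t>(table[home]) % m != home)
        std::swap(table[home], key);

    // Ordinary linear probing for whichever record is left to place.
    std::size_t pos = home;
    while (table[pos] != -1)
        pos = (pos + 1) % m;
    table[pos] = key;
}
```

For example, inserting 131, 21, 31, 4, 5 and then 2 into a table of size 10 leaves 2 in its home slot 2, with 21 moved on to slot 6.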
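Quadratic Probing
Hi(key) = (Hash(key) + i^2) % m
i = 0, 1, 2, ..., (m - 1)/2
Ex: 37, 90, 55, 22, 11, 17, 49, 87 with m = 10

17 -> ? Collision
(17 + 0^2) % 10 = 7, occupied by 37
(17 + 1^2) % 10 = 8, place 17

87 -> ? Collision
(87 + 0^2) % 10 = 7, occupied by 37
(87 + 1^2) % 10 = 8, occupied by 17
(87 + 2^2) % 10 = 1, occupied by 11
(87 + 3^2) % 10 = 6, place 87

Final table:
Index   Data
0       90
1       11
2       22
3
4
5       55
6       87
7       37
8       17
9       49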
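
The probe sequence can be sketched as a single insert function. It assumes a non-empty table of non-negative integer keys with -1 as the empty marker; the function name is invented for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Quadratic probing: try (home + i^2) % m for i = 0, 1, ..., (m - 1) / 2.
// Returns the index where the key was stored, or -1 if none of those
// probes found an empty slot.
int quadraticInsert(std::vector<int> &table, int key)
{
    const std::size_t m = table.size();
    const std::size_t home = static_cast<std::size_t>(key) % m;
    for (std::size_t i = 0; i <= (m - 1) / 2; ++i) {
        std::size_t pos = (home + i * i) % m;
        if (table[pos] == -1) {
            table[pos] = key;
            return static_cast<int>(pos);
        }
    }
    return -1;
}
```

Inserting 37, 90, 55, 22, 11, 17, 49, 87 into a table of size 10 places 17 at index 8 and 87 at index 6, as in the example. An insert can fail even when empty slots remain, because the sequence visits only some of the slots.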
Double Hashing
Double hashing is a technique in which a second hash
function is applied to the key when a collision occurs.

H1(key) = key mod tablesize

H2(key) = M - (key mod M)
where M is a prime number smaller than the table size
Hash(key) = (H1(key) + i * H2(key)) % tablesize
i = 1, 2, 3, 4, 5, ... up to tablesize
Ex: 37, 90, 45, 22, 17, 49, 55
Double Hashing
If a collision occurs when inserting, apply a second
auxiliary hash function, h2(k), and probe at a distance
h2(k), 2 * h2(k), 3 * h2(k), etc. until an empty position is found.
So f(i) = i * h2(k), and with the two auxiliary functions h1 and h2:
h(k, i) = (h1(k) + i * h2(k)) mod m
With H = h1(k), we try the following cells in sequence with
wraparound:
H
H + h2(k)
H + 2 * h2(k)
H + 3 * h2(k)

Ex: 12, 1, 18, 56, 79, 49 (table size 10, M = 7)
Insert 49
H1(49) = 49 % 10 = 9, occupied by 79
H2(key) = M - (key % M), where M is a prime number smaller than the size of the table
H2(49) = 7 - (49 % 7) = 7
Hash(49) = [H1(49) + i * H2(49)] % 10
i = 1: [9 + 1 * 7] % 10 = 6, full
i = 2: [9 + 2 * 7] % 10 = 3, place 49

Index   Data
0
1       1
2       12
3       49
4
5
6       56
7
8       18
9       79
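
A minimal sketch of insertion with double hashing, using H1(key) = key mod tablesize and H2(key) = M - (key mod M) as defined above. The function name, the -1 empty marker and the restriction to non-negative keys are assumptions for this illustration.

```cpp
#include <vector>

// Probes (H1 + i * H2) % m for i = 0, 1, ..., m - 1.
// Returns the index where the key was stored, or -1 if no probe found
// an empty slot.
int doubleHashInsert(std::vector<int> &table, int key, int M)
{
    const int m = static_cast<int>(table.size());
    const int h1 = key % m;
    const int h2 = M - (key % M);   // always in 1 .. M, so the step is never zero
    for (int i = 0; i < m; ++i) {
        int pos = (h1 + i * h2) % m;
        if (table[pos] == -1) {
            table[pos] = key;
            return pos;
        }
    }
    return -1;
}
```

With a table of size 10 and M = 7, inserting 12, 1, 18, 56, 79 and then 49 stores 49 at index 3, matching the worked example. If the step H2 shares a factor with the table size, the probe sequence does not reach every slot, which is one reason a prime table size is usually preferred.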
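Chaining / Bucket Hashing
1. Chaining without Replacement
Ex. 131, 3, 4, 21, 61, 6, 71, 8, 9
Each slot stores the data and a chain field holding the index of the
next synonym (-1 if there is none).

Index   Data   Chain
0       -1     -1
1       131    2
2       21     5
3       3      -1
4       4      -1
5       61     7
6       6      -1
7       71     -1
8       8      -1
9       9      -1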
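
One way to reproduce the table above in code is sketched below. It is an assumption-laden illustration, not the slides' algorithm: keys are non-negative integers, nothing is ever deleted, a new key goes into the first empty slot found by linear probing from its home position, and the chain field links synonyms only. The names `ChainSlot` and `chainedInsert` are invented.

```cpp
#include <vector>

struct ChainSlot {
    int data = -1;    // -1 marks an empty slot
    int chain = -1;   // index of the next synonym, -1 if none
};

// Returns the index where the key was stored, or -1 if the table is full.
int chainedInsert(std::vector<ChainSlot> &table, int key)
{
    const int m = static_cast<int>(table.size());
    const int home = key % m;

    // First empty slot at or after the home position (with wraparound).
    int pos = -1;
    for (int step = 0; step < m; ++step) {
        int p = (home + step) % m;
        if (table[p].data == -1) {
            pos = p;
            break;
        }
    }
    if (pos == -1)
        return -1;
    table[pos].data = key;

    // Find the first synonym between home and pos, walk to the end of its
    // chain and link the new record there.
    for (int step = 0; ; ++step) {
        int p = (home + step) % m;
        if (p == pos)
            break;                       // no earlier synonym: nothing to link
        if (table[p].data % m == home) {
            while (table[p].chain != -1)
                p = table[p].chain;
            table[p].chain = pos;
            break;
        }
    }
    return pos;
}
```

Inserting 131, 3, 4, 21, 61, 6, 71, 8, 9 into a table of size 10 gives the chain 1 -> 2 -> 5 -> 7 for the synonyms 131, 21, 61, 71.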
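2. Chaining with Replacement
Ex. 131, 21, 31, 4, 5, 2

If 2 is simply placed in the next empty slot (6) because its home slot 2
is occupied by 21:

Index   Data   Chain
0       -1     -1
1       131    2
2       21     3
3       31     -1
4       4      -1
5       5      -1
6       2      -1
7       -1     -1

With replacement, 21 is moved out of slot 2 to the next empty slot (6),
2 is stored in its home slot, and the chain from slot 1 is updated:

Index   Data   Chain
0       -1     -1
1       131    6
2       2      -1
3       31     -1
4       4      -1
5       5      -1
6       21     3
7       -1     -1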
Linear probing with chaining with replacement
1. Initialize every slot of the hash table with value and chain set to -1.
2. If the table is full, display a message and stop.
3. If the table slot is empty, i.e. table[key][0] == -1
   Store the new value in the empty table slot
4. Otherwise //table slot is not empty
   4.1 Read the chain at the key position //ch = table[key][1]
   4.2 If the existing value and the new value are synonyms
       4.2.1 If the slot has no chain (ch == -1)
             Find the next empty slot
             Place the record and update the chain, set flag & break
       4.2.2 Otherwise follow the chain while chain != -1
             ch = table[ch][1]
             Place the record in the next empty slot and update the chain
             of the last element, set flag & break
   4.3 Else the keys are not synonyms
       4.3.1 If the existing value has no chain
             Store the existing value: temp = table[key][0]
             Search for an empty table slot //i = key+1; i < max; i++
             Store table[key][0] = new key
             Store table[i][0] = temp
             Update the chain, set flag & break
       4.3.2 If the existing value has a chain
             Read the chain and the existing element
             While chain != -1
                 ch = table[ch][1]
             Store the element
             Update the chain, set flag & break
5. Stop
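
The replacement step can be sketched in C++ as follows. This is an illustration under assumptions rather than a transcription of the steps above: keys are non-negative integers, nothing is ever deleted, and the names `Cell`, `nextEmpty` and `chainInsertWithReplacement` are invented. Because a key arriving at its home slot always claims it, the home slot of any stored key holds the head of that key's chain, which is what the predecessor search relies on.

```cpp
#include <vector>

struct Cell {
    int data = -1;    // -1 marks an empty slot
    int chain = -1;   // index of the next synonym, -1 if none
};

// Index of the first empty slot at or after `from` (with wraparound), or -1.
int nextEmpty(const std::vector<Cell> &t, int from)
{
    const int m = static_cast<int>(t.size());
    for (int step = 0; step < m; ++step) {
        int p = (from + step) % m;
        if (t[p].data == -1)
            return p;
    }
    return -1;
}

// Returns false when the table is full.
bool chainInsertWithReplacement(std::vector<Cell> &t, int key)
{
    const int m = static_cast<int>(t.size());
    const int home = key % m;
    if (t[home].data == -1) {            // home slot free: store directly
        t[home].data = key;
        return true;
    }
    const int empty = nextEmpty(t, home);
    if (empty == -1)
        return false;

    if (t[home].data % m == home) {
        // Home slot holds a synonym: append the new key to its chain.
        int last = home;
        while (t[last].chain != -1)
            last = t[last].chain;
        t[empty].data = key;
        t[last].chain = empty;
    } else {
        // Home slot holds a record from another chain: move that record to
        // the empty slot, repair its chain, and give the home slot to the key.
        int prev = t[home].data % m;     // head of the occupant's chain
        while (t[prev].chain != home)
            prev = t[prev].chain;
        t[empty] = t[home];              // the occupant keeps its outgoing link
        t[prev].chain = empty;
        t[home].data = key;
        t[home].chain = -1;
    }
    return true;
}
```

Inserting 131, 21, 31, 4, 5 and then 2 into a table of size 10 moves 21 to slot 6, stores 2 in slot 2 and leaves the chain 1 -> 6 -> 3, as in the second table of the example.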
Separate Chaining
• The idea is to keep a list of all elements that hash
to the same value.
– The array elements are pointers to the first nodes of the
lists.
– A new item is inserted to the front of the list.
• Advantages:
– Better space utilization for large items.
– Simple collision handling: searching linked list.
– Overflow: we can store more items than the hash table
size.
– Deletion is quick and easy: deletion from the linked list.

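Example
Keys: 0, 81, 64, 49, 36, 25, 16, 9, 4, 1
hash(key) = key % 10.
The lists below correspond to inserting the keys in increasing order
(0, 1, 4, ..., 81), with each new key placed at the front of its list.

Index   List
0       0
1       81 -> 1
2
3
4       64 -> 4
5       25
6       36 -> 16
7
8
9       49 -> 9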
Operations
• Initialization: all entries are set to NULL
• Find:
– locate the cell using hash function.
– sequential search on the linked list in that cell.
• Insertion:
– Locate the cell using hash function.
– (If the item does not exist) insert it as the first item in
the list.
• Deletion:
– Locate the cell using hash function.
– Delete the item from the linked list.

Hashing using Separate Chaining (Linked List)
class node
{
    int key;
    node next;
}

class hashing
{
public:
    node hashtable[max];
    hashing()
    {
        for (i = 0; i < max; i++)
        {
            hashtable[i] = null;
        }
    }
    void insert();
    void search();
    void delete();
}
Insert(int k)
1.Create a new node say curr.
2. Assign data for new node
3. Calculate pos=hash(curr.key)
4.If hashtable[pos]==null
hashtable[pos]=curr;
5.Else
5.1 temp=hashtable[pos];
5.2 while(temp.next!=null)
temp=temp.next;
5.3 temp.next=curr
6.stop
Display()
1. Declare curr
2. For (i=0;i<10 ;i++)
2.1 curr=hashtable[i]
2.2 while (curr !=null)
Display curr.data
curr=curr.next
2.3 end while loop
3.stop
Search (int x)
1. Declare curr for traversal.
2. Find pos = hash(x);
3. curr = hashtable[pos];
4. while (curr != null && curr.key != x)
   4.1 Display curr.data
   4.2 curr = curr.next
5. If curr == null
   Display record not found
6. else
   Display record found
7. Stop
Delete(key)
1. Get the value to be deleted.
2. Compute the address pos using the hash function.
3. Using the linked list deletion algorithm, delete the element from the
list at hashtable[pos].
4. If the element is not in the list, print "Value Not Found".
5. Stop
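
The separate chaining operations can be sketched with the standard library's singly linked list in place of a hand-written node class. This is an illustration under assumptions: keys are non-negative integers, the class and member names are invented, and new keys go to the front of the list (as on the Operations slide) rather than the end.

```cpp
#include <algorithm>
#include <cstddef>
#include <forward_list>
#include <vector>

class ChainedHashTable {
    std::vector<std::forward_list<int>> buckets;   // one list per table cell

    std::size_t index(int key) const
    {
        return static_cast<std::size_t>(key) % buckets.size();
    }

public:
    explicit ChainedHashTable(std::size_t size) : buckets(size) {}

    // Find: locate the cell, then search its list sequentially.
    bool contains(int key) const
    {
        const auto &list = buckets[index(key)];
        return std::find(list.begin(), list.end(), key) != list.end();
    }

    // Insertion: add at the front of the list unless the key already exists.
    bool insert(int key)
    {
        if (contains(key))
            return false;
        buckets[index(key)].push_front(key);
        return true;
    }

    // Deletion: remove the key from the list in its cell.
    bool remove(int key)
    {
        if (!contains(key))
            return false;
        buckets[index(key)].remove(key);
        return true;
    }

    // Read-only access to one list, e.g. for displaying the table.
    const std::forward_list<int> &bucket(std::size_t i) const
    {
        return buckets[i];
    }
};
```

Inserting 0, 1, 4, 9, 16, 25, 36, 49, 64, 81 into a table of size 10 gives the list 81 -> 1 in cell 1 and 36 -> 16 in cell 6.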
Hashing Applications
• Compilers use hash tables to implement the
symbol table (a data structure to keep track of
declared variables).
• Game programs use hash tables to keep track of
positions they have already encountered (transposition table).
• Online spelling checkers.
