Unit 2: Hashing
1. Hash Functions
A hash function is a function that takes an input (or "key") and returns an integer value, known as a
hash value or hash code, which typically maps the key to a location (or index) in a hash table. The
purpose of hash functions is to efficiently distribute keys across the table to minimize collisions and
achieve fast access to data. A good hash function should have the following properties:
• Deterministic: The same input always produces the same hash value.
• Uniform Distribution: The hash values should be uniformly distributed across the range of
possible hash table indices to minimize collisions.
• Minimize Collisions: The function should minimize the likelihood that two different keys
produce the same hash value (a collision).
1. Division Method: The most common hash function is the division method, where the key k is divided by the table size m (typically chosen to be a prime number) and the remainder is taken as the hash value:
h(k) = k mod m
2. Multiplication Method: In this method, the key k is multiplied by a constant A, the fractional part of the product is taken, and the result is then scaled by the table size m:
h(k) = ⌊m ⋅ (kA mod 1)⌋
o A is a constant with 0 < A < 1; Knuth suggests A ≈ (√5 − 1)/2 ≈ 0.618, a choice that tends to give a good distribution.
3. Universal Hashing: In universal hashing, a family of hash functions is chosen randomly, and
one function from the family is used to hash a key. This reduces the likelihood of collisions,
especially when adversarial inputs are possible.
• It is useful in scenarios where the inputs are not known in advance or when there is a high risk
of collisions due to predictable input patterns.
4. String Hashing: For strings, one common approach is the polynomial hash function. The idea
is to treat the string as a sequence of coefficients for a polynomial, where each character in
the string contributes to the hash based on its position in the string.
h(s) = (s₁⋅p⁰ + s₂⋅p¹ + s₃⋅p² + … + sₙ⋅pⁿ⁻¹) mod m
Here, p is a constant (often chosen as a small prime number, such as 31), and m is the table size.
• This method helps spread out string keys effectively across the hash table; a short Python sketch of the four methods above follows this list.
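The sketch below illustrates the four methods in Python. The specific constants (table sizes 11 and 16, Knuth's A, the prime 2³¹ − 1 for universal hashing, and p = 31 with m = 10⁹ + 9 for strings) are illustrative assumptions, not values fixed by these notes.

import math
import random

def division_hash(k, m=11):
    # Division method: h(k) = k mod m (m = 11 is an arbitrary prime table size).
    return k % m

def multiplication_hash(k, m=16):
    # Multiplication method: h(k) = floor(m * (k*A mod 1)), using Knuth's A.
    A = (math.sqrt(5) - 1) / 2
    return math.floor(m * ((k * A) % 1))

def make_universal_hash(m, p=2_147_483_647):
    # Universal hashing: randomly pick one function from the family
    # h(k) = ((a*k + b) mod p) mod m, where p is a prime larger than any key.
    a = random.randint(1, p - 1)
    b = random.randint(0, p - 1)
    return lambda k: ((a * k + b) % p) % m

def poly_hash(s, p=31, m=10**9 + 9):
    # Polynomial string hash: (s1*p^0 + s2*p^1 + ... + sn*p^(n-1)) mod m.
    h, p_pow = 0, 1
    for ch in s:
        h = (h + ord(ch) * p_pow) % m
        p_pow = (p_pow * p) % m
    return h

print(division_hash(100))          # 100 mod 11 = 1
print(multiplication_hash(123456))
h = make_universal_hash(11)
print(h(100), h(101))              # results depend on the randomly chosen a and b
print(poly_hash("hash"))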
2. Collision Resolution
Collisions happen when two different keys hash to the same index. With a finite table size this is inevitable once there are more possible keys than slots, so collisions must be handled in a way that preserves the efficiency of the hash table. There are several techniques to resolve them:
a) Open Addressing
In open addressing, all elements are stored in the hash table itself. When a collision occurs, the algorithm examines ("probes") other slots of the table, following a fixed pattern, until an available slot is found. The order in which slots are examined is called the probe sequence.
1. Linear Probing:
o In linear probing, when a collision occurs at index i, the algorithm checks the next index, i+1. If that index is occupied, it continues checking i+2, i+3, and so on, wrapping around modulo m, until an empty slot is found.
o Problem: Linear probing can lead to primary clustering, where runs of adjacent occupied slots build up, increasing the search time for subsequent insertions and lookups.
2. Quadratic Probing:
• In quadratic probing, the probe sequence is adjusted so that the indices examined are i+1², i+2², i+3², … (taken modulo m).
• This method reduces primary clustering, but it still suffers from secondary clustering: keys that hash to the same index follow exactly the same probe sequence.
3. Double Hashing:
• In double hashing, two hash functions are used. When a collision occurs, a second hash function supplies the offset (step size) for the probe sequence, so the i-th probe examines index
h(k, i) = (h₁(k) + i⋅h₂(k)) mod m
where h₁(k) is the first hash function and h₂(k) is the second hash function.
• This method tends to reduce clustering better than linear or quadratic probing, but it requires computing two hash functions; a sketch of all three probe sequences follows this list.
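The sketch below contrasts the three probe sequences. The table size m = 11, the key k = 7, and the hash functions h₁(k) = k mod m and h₂(k) = 1 + (k mod (m − 1)) are assumptions chosen for illustration.

def linear_probe(h1, i, m):
    # i-th probe of linear probing: h1, h1+1, h1+2, ...
    return (h1 + i) % m

def quadratic_probe(h1, i, m):
    # i-th probe of quadratic probing: h1, h1+1^2, h1+2^2, ...
    return (h1 + i * i) % m

def double_hash_probe(h1, h2, i, m):
    # i-th probe of double hashing: the second hash value h2 is the step size.
    return (h1 + i * h2) % m

m, k = 11, 7
h1 = k % m                          # assumed first hash function
h2 = 1 + (k % (m - 1))              # assumed second hash function (never zero)
print([linear_probe(h1, i, m) for i in range(4)])          # [7, 8, 9, 10]
print([quadratic_probe(h1, i, m) for i in range(4)])       # [7, 8, 0, 5]
print([double_hash_probe(h1, h2, i, m) for i in range(4)]) # [7, 4, 1, 9]

Because h₂(k) varies from key to key, two keys that collide at the same initial index generally follow different probe sequences, which is why double hashing suffers less from secondary clustering.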
b) Chaining
In chaining, each table index stores a linked list (or another dynamic data structure) of all elements that hash to that index. Instead of looking for an empty slot, we simply add the new element to the list at the corresponding index, so any number of colliding keys can share the same slot (a minimal sketch follows the lists below).
• Advantages:
o Chaining allows the hash table to grow dynamically, as the list at each index can hold
any number of elements.
o The performance of insertions, deletions, and lookups depends on the length of the
list at each index, which can be minimized by resizing the table when the load factor
exceeds a threshold.
• Disadvantages:
o Additional memory is needed to store the lists, and performance can degrade if many
keys hash to the same index (i.e., the lists become long).
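The sketch below is a minimal chained hash table in Python, using a plain list per slot in place of a linked list and the division method on Python's built-in hash; the class and method names are my own.

class ChainedHashTable:
    def __init__(self, m=8):
        # One (initially empty) chain per slot.
        self.m = m
        self.slots = [[] for _ in range(m)]

    def _index(self, key):
        # Division method applied to Python's built-in hash of the key.
        return hash(key) % self.m

    def put(self, key, value):
        chain = self.slots[self._index(key)]
        for pos, (k, _) in enumerate(chain):
            if k == key:               # key already present: overwrite its value
                chain[pos] = (key, value)
                return
        chain.append((key, value))     # colliding keys simply share the chain

    def get(self, key):
        for k, v in self.slots[self._index(key)]:
            if k == key:
                return v
        raise KeyError(key)

table = ChainedHashTable()
table.put("apple", 1)
table.put("banana", 2)
print(table.get("banana"))             # 2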
3. Load Factor and Rehashing
The load factor λ is defined as the ratio of the number of elements stored in the hash table, n, to the number of slots in the table, m:
λ = n/m
• A higher load factor increases the likelihood of collisions, which may degrade performance.
• If the load factor exceeds a certain threshold (commonly 0.7 to 0.8), the hash table is rehashed (resized). Rehashing involves creating a new table with a larger size, typically double the size of the current table, and re-inserting all elements into it; a sketch of this check and the rehashing step follows below.
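The sketch below shows the load-factor check and the rehashing step, assuming a chaining-style table whose slots hold (key, value) pairs and an assumed threshold of 0.75.

def load_factor(n, m):
    # λ = n / m: number of stored elements divided by number of slots.
    return n / m

def needs_rehash(n, m, threshold=0.75):
    # Trigger rehashing once the load factor exceeds the threshold.
    return load_factor(n, m) > threshold

def rehash(old_slots):
    # Create a table with double the number of slots and re-insert every element,
    # recomputing each index with the new table size.
    new_m = 2 * len(old_slots)
    new_slots = [[] for _ in range(new_m)]
    for chain in old_slots:
        for key, value in chain:
            new_slots[hash(key) % new_m].append((key, value))
    return new_slots

print(load_factor(6, 8), needs_rehash(6, 8))   # 0.75 False (0.75 is not above 0.75)
print(needs_rehash(7, 8))                      # True, so the table would be doubled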