Hashing

DSA CS2002

Hashing
Ref: Sec. 8.2, Data Structures in C++
Sec. 10.5 in the textbook
Purpose:
Makes data retrieval much faster. Assume we have keys
ranging from 1…N and a table of size M smaller than N. Define
a function h: {1..N} → {0..M-1} and let h(x) be the location
where a record with key x should be stored. This function is the
hash function.

In static hashing the keys are stored in a fixed-size table
called the hash table, ht. The hash table is partitioned into b
buckets, ht[0], …, ht[b-1]. Each bucket is capable of storing s
records; thus, a bucket is said to consist of s slots. A hash
function h(x) performs an identifier transformation on x: it maps the
set of possible identifiers (keys) onto the integers 0 through b-1.
Hashing
Identifier density = n/T, where n is the number of
identifiers in the table and T is the total number
of possible identifiers.
Loading factor (loading density): α = n/(sb), where b is the
number of buckets (written M elsewhere in these notes).
Overflow: occurs when a new identifier is hashed by
h into a full bucket.
Collision: occurs when two nonidentical identifiers
are hashed into the same bucket. When bucket
size s is 1, collision and overflow occur
simultaneously.
Example
Consider a hash table ht with b = 26 buckets and s = 2.
Assume that there are n = 10 unique identifiers and that
each identifier begins with a letter. The loading factor
for this table is 10/52 ≈ 0.19. The hash function h must
map each of the possible identifiers into one of the
numbers 0 to 25. Suppose the internal binary representation for
the letters A to Z corresponds to the numbers 0 to 25,
and the hash function is:
h(x) = first character of x
The identifiers GA, D, A, G, L, A2, A1, A3, A4 and E
will then be hashed into buckets 6, 3, 0, 6, 11, 0, 0, 0, 0, 4
respectively by this function.
For bucket 0, an overflow occurs when identifier A1 is
hashed into ht[0], which already holds A and A2.
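The example above can be traced with a short sketch. It is only an illustration, assuming every identifier starts with an uppercase letter A to Z; the names h and fill_buckets are made up for this sketch, and identifiers that overflow are simply collected rather than relocated.

```python
def h(identifier: str) -> int:
    """Hash on the first character: 'A' -> 0, ..., 'Z' -> 25."""
    return ord(identifier[0]) - ord("A")


def fill_buckets(identifiers, b=26, s=2):
    """Place each identifier in its home bucket; collect the ones that overflow."""
    ht = [[] for _ in range(b)]      # b buckets, each with room for s slots
    overflowed = []
    for x in identifiers:
        bucket = ht[h(x)]
        if len(bucket) < s:
            bucket.append(x)         # a free slot exists in the home bucket
        else:
            overflowed.append(x)     # home bucket is full: overflow
    return ht, overflowed


ids = ["GA", "D", "A", "G", "L", "A2", "A1", "A3", "A4", "E"]
ht, overflowed = fill_buckets(ids)
print([h(x) for x in ids])   # bucket numbers, in input order
print(ht[0], overflowed)     # contents of bucket 0 and the overflowing identifiers
```

Note that G collides with GA in bucket 6 without overflowing, because s = 2 leaves one slot free.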

When no overflows occur, the time required
to enter or search for identifiers using
hashing depends only on the time required
to compute the hash function h and the time to
search one bucket. Since the bucket size, s, is
usually small, the search for an identifier
within a bucket is carried out using
sequential search. This time is independent
of n.
Hash Function
• Desired properties of a hash function
o It should be easy to compute.
o It should minimize the number of collisions.
o It should depend on all characters in the
identifier.
o For random inputs, it should not result in biased
use of the hash table, i.e., for a random identifier x the
probability that h(x) = i should be 1/b for all
buckets i. Such a hash function is called a
uniform hash function.
Hash Function
A simple type of hash function uses the modulo
operator (%). The hash function is
h(x) = x % M
This function gives bucket addresses in the
range 0 to (M-1). The choice of M is critical.
If M is a power of 2, then h(x) depends only
on the least significant bits of x.
E.g., if we use 5 bits to represent each character, the key
AKEY
is stored as
00001 01011 00101 11001 = 1*32^3 + 11*32^2 + 5*32 + 25 = 44217
Now with M = 2^i, i <= 5, all identifiers ending with the character
'Y' will have the same bucket address, i.e., the bucket address will
depend on one part of the key only. The use of the hash table is thus
biased.
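A small sketch makes the bias visible. It assumes the 5-bit code used above (A = 1, B = 2, …, Z = 26); the function name encode and the extra keys BABY and Y are chosen here only for illustration.

```python
def encode(key: str, bits: int = 5) -> int:
    """Pack each uppercase letter into `bits` bits, with A=1, B=2, ..., Z=26."""
    value = 0
    for ch in key:
        value = (value << bits) | (ord(ch) - ord("A") + 1)
    return value


print(encode("AKEY"))  # 1*32^3 + 11*32^2 + 5*32 + 25

# For M = 2^i with i <= 5, only the low bits of the last character matter,
# so every key ending in 'Y' lands in the same bucket.
for M in (2, 4, 8, 16, 32):
    print(M, [encode(k) % M for k in ("AKEY", "BABY", "Y")])
```

Each row printed by the loop contains three equal bucket numbers, whatever the leading characters are.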
Hash Function
Similarly, if M is divisible by two, odd keys are
mapped to odd buckets and even keys are mapped to
even buckets.
These difficulties can be avoided by making M a
prime number, so that the only factors of M are M and 1.
In practice it has been observed that it is sufficient to
choose M such that it has no prime divisors less than
20.
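Both points can be checked with a few lines. This is a sketch only: the helper names are hypothetical, the divisor test is meant for table sizes larger than 20, and the sample sizes (12, 23, 667, 11112, 11113) are picked for illustration.

```python
SMALL_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19)


def no_small_prime_divisors(M: int) -> bool:
    """True if no prime below 20 divides M (intended for M > 20)."""
    return all(M % p != 0 for p in SMALL_PRIMES)


def bucket_parities(M: int, keys) -> set:
    """Set of (key parity, bucket parity) pairs seen under h(x) = x % M."""
    return {(x % 2, (x % M) % 2) for x in keys}


print(bucket_parities(12, range(100)))  # even M: parity of key and bucket always agree
print(bucket_parities(23, range(100)))  # odd prime M: all four combinations occur
print(no_small_prime_divisors(11113), no_small_prime_divisors(667),
      no_small_prime_divisors(11112))
```

Here 667 = 23 * 29 is not prime but still passes the practical rule, while 11112 is rejected because it is even.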
Overflow Handling
Two ways to handle overflows:
• Open addressing
• Chaining
Overflow Handling: Open Addressing
In open addressing, we assume the hash table is an array and
that s = 1. The hash table, ht, is initialized so that each slot contains
the null identifier. When a new identifier is hashed into a full bucket,
we need to find another bucket for this identifier. The simplest
solution is to find the closest unfilled bucket, scanning forward from
the home bucket and wrapping around at the end of the table.
This is called linear probing or linear open addressing.
Example 8.3: Assume we have M = 26 and s = 1 and the
following identifiers: GA, D, A, G, L, A2, A1, A3, A4,
Z, ZA, E
h(x) = first character of x
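Insertion for Example 8.3 can be sketched as follows. This is a minimal illustration, not the textbook's code: None plays the role of the null identifier, the names h and insert are chosen here, and duplicates are not checked.

```python
def h(identifier: str) -> int:
    """Hash on the first character: 'A' -> 0, ..., 'Z' -> 25."""
    return ord(identifier[0]) - ord("A")


def insert(ht, x):
    """Insert x using linear probing and return the slot it ends up in."""
    b = len(ht)
    home = h(x)
    for j in range(b):
        i = (home + j) % b        # wrap around at the end of the table
        if ht[i] is None:         # first unfilled bucket after the home bucket
            ht[i] = x
            return i
    raise OverflowError("hash table is full")


ht = [None] * 26
ids = ["GA", "D", "A", "G", "L", "A2", "A1", "A3", "A4", "Z", "ZA", "E"]
slots = {x: insert(ht, x) for x in ids}
print(slots)
print(ht)
```

ZA hashes to bucket 25, which is taken by Z, so the probe wraps around to bucket 0 and continues until it finds bucket 8 free.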
Overflow Handling: Open Addressing
When linear open addressing is used to handle overflows, a
hash table search for identifier x proceeds as follows:
(1) Compute h(x)
(2) Examine identifiers at positions ht[h(x)], ht[h(x)+1],
…, ht[h(x)+j], in this order (indices are taken modulo the table
size, so the scan wraps around), until one of the following
happens:
(a) ht[h(x)+j] = x; in this case x is found.
(b) ht[h(x)+j] is null; x is not in the table.
(c) We return to the starting position ht[h(x)]; the table is
full and x is not in the table.
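The search steps can be sketched as below, with the three outcomes (a), (b) and (c) marked in comments. As before this is an illustration under assumptions: None stands for the null identifier, the helper names are made up, and search also returns the number of buckets it examined so the cost can be inspected.

```python
def h(identifier: str) -> int:
    return ord(identifier[0]) - ord("A")


def insert(ht, x):
    """Linear-probing insert (no duplicate check), used only to build the table."""
    b = len(ht)
    for j in range(b):
        i = (h(x) + j) % b
        if ht[i] is None:
            ht[i] = x
            return
    raise OverflowError("hash table is full")


def search(ht, x):
    """Return (slot, buckets examined); slot is None when x is absent."""
    b = len(ht)
    home = h(x)                       # step (1)
    for j in range(b):                # step (2)
        i = (home + j) % b
        if ht[i] == x:
            return i, j + 1           # (a) found
        if ht[i] is None:
            return None, j + 1        # (b) empty bucket: x is not in the table
    return None, b                    # (c) back at the start: table is full


ht = [None] * 26
ids = ["GA", "D", "A", "G", "L", "A2", "A1", "A3", "A4", "Z", "ZA", "E"]
for x in ids:
    insert(ht, x)

total = sum(search(ht, x)[1] for x in ids)
print(search(ht, "ZA"), search(ht, "B"), total, total / len(ids))
```

For the identifiers of Example 8.3 the total comes to 39 buckets examined, an average of 3.25 per identifier.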
Overflow Handling: Open Addressing - Analysis
For Example 8.3:
The number of buckets examined is 1 for A, 2 for A2, 3 for
A1, 1 for D, 5 for A3, …, 10 for ZA, a total of 39
buckets examined for 12 identifiers, or an average of 3.25
buckets per identifier.
The expected average number of identifier probes, P, to
look for an identifier is approximately (2 - α)/(2 - 2α), where α is the
loading density.
For the above example α = 12/26 ≈ 0.46 and P ≈ 1.43.
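The formula is easy to evaluate directly. The sketch below does so for Example 8.3 and for the M = 11113, n = 10000 case quoted in these notes. The binary-search and linear-search lines are assumptions about where such comparison figures usually come from (log2(n) - 1 and n/2); the notes do not state how they were obtained.

```python
import math


def expected_probes_linear(alpha: float) -> float:
    """Approximate average probes per search under linear probing."""
    return (2 - alpha) / (2 - 2 * alpha)


print(round(expected_probes_linear(12 / 26), 2))   # Example 8.3

n = 10000
print(round(expected_probes_linear(0.9), 2))       # loading density 0.9
print(round(math.log2(n) - 1, 2))                  # assumed binary-search estimate
print(n / 2)                                       # assumed linear-search estimate
```

The measured average of 3.25 for Example 8.3 is well above the estimate because the identifiers are far from uniformly distributed: five of the twelve start with A.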
Overflow Handling: Open Addressing - Analysis
• The problem with linear open addressing is that it tends to create
clusters of identifiers. Moreover, these clusters tend to
merge as more identifiers are entered, leading to bigger
clusters.
• Worst-case performance is poor: a search may have to
examine every bucket in the table.
• Example of average behaviour (average number of comparisons per search):
M = 11113, n = 10000, α = 0.9
Hashing using linear probing: 5.5
Binary search: 12.29
Linear search: 5000
• Some improvement in the growth of clusters, and hence
in the average number of probes needed for a search, can
be obtained by quadratic probing or by using rehashing
techniques.
Homework
Q1. Using the simple linear probing method show how the
following sequence of numbers will be inserted into a
hash table using the modulo function as our hash
function and table size of 13:
13 17 21 12 9 8 6 26 0 4 34 47
Overflow Handling: Chaining
In linear probing the search for an identifier
involves comparisons with identifiers that have
different hash values.
Using separate chaining we reduce the
number of comparisons by
maintaining lists of identifiers:
• One list is kept per bucket, each list containing all the
synonyms for that bucket.
• A search involves computing the hash function h(x) and
examining only those identifiers in the list for h(x).
• As the sizes of these lists are not known in advance, the
best way to maintain them is as linked chains.
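A minimal sketch of separate chaining with linked chains is given below. It assumes integer keys and h(x) = x % b; the class and method names (Node, ChainedHashTable, chain) are invented for this illustration, and new keys are placed at the front of their chain.

```python
class Node:
    """One link in a chain."""
    def __init__(self, key, next=None):
        self.key = key
        self.next = next


class ChainedHashTable:
    def __init__(self, b):
        self.b = b
        self.heads = [None] * b          # one chain head per bucket

    def _h(self, key):
        return key % self.b

    def search(self, key):
        node = self.heads[self._h(key)]  # only the chain for h(key) is examined
        while node is not None:
            if node.key == key:
                return True
            node = node.next
        return False

    def insert(self, key):
        if not self.search(key):
            i = self._h(key)
            self.heads[i] = Node(key, self.heads[i])   # insert at the front

    def delete(self, key):
        i = self._h(key)
        prev, node = None, self.heads[i]
        while node is not None:
            if node.key == key:
                if prev is None:
                    self.heads[i] = node.next          # unlink the first node
                else:
                    prev.next = node.next              # unlink an inner node
                return True
            prev, node = node, node.next
        return False

    def chain(self, i):
        """Keys in bucket i, front to back (for inspection only)."""
        keys, node = [], self.heads[i]
        while node is not None:
            keys.append(node.key)
            node = node.next
        return keys


table = ChainedHashTable(7)
for k in (10, 3, 17, 24, 5):
    table.insert(k)
print(table.chain(3), table.chain(5))
table.delete(17)
print(table.chain(3), table.search(17))
```

Deleting a key only unlinks one node from its chain, so no other identifier has to move.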
Overflow Handling: Chaining
• Each chain has a head node.
• The head node is small, as it only
holds a link.
• Since the lists are to be accessed at
random, the head nodes should be stored
sequentially (in an array). We assume they are numbered
0 to b-1 if the hash function h has range 0
to b-1.
Overflow Handling: Chaining
Using chaining for Example 8.3, which had M = 26 and s = 1 and the following
identifiers: GA, D, A, G, L, A2, A1, A3, A4, Z, ZA, E
h(x) = first character of x
In this example new identifiers
are inserted at the front of the
chains. The number of probes
needed for A4, D, E, G, L
and ZA is 1; for A3, GA and
Z it is 2; for A1, 3; for A2, 4;
and for A, 5. This is a total of 24, which
gives an average of 2, which is
less than the average for
linear probing.
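The probe counts for this example can be reproduced with a short sketch. To keep it brief, Python lists stand in for the linked chains (an assumption of this sketch, not how the chains would be stored in practice); the number of probes for an identifier is its position in its chain, counted from the front.

```python
def h(identifier: str) -> int:
    """Hash on the first character: 'A' -> 0, ..., 'Z' -> 25."""
    return ord(identifier[0]) - ord("A")


ids = ["GA", "D", "A", "G", "L", "A2", "A1", "A3", "A4", "Z", "ZA", "E"]

chains = [[] for _ in range(26)]
for x in ids:
    chains[h(x)].insert(0, x)            # new identifiers go at the front

# Probes needed to find x = its 1-based position in the chain for h(x).
probes = {x: chains[h(x)].index(x) + 1 for x in ids}
total = sum(probes.values())

print(chains[0], chains[6], chains[25])
print(probes)
print(total, total / len(ids))
```

Only the chain for bucket 0 grows long here, so a search for D, E or L is unaffected by the five identifiers that start with A.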
Overflow Handling: Chaining - Search
Overflow Handling: Chaining - Analysis
• The expected number of identifier comparisons is approximately
1 + α/2, where α is the loading density n/b.
• Deletion is possible and easy.
As long as a uniform hash function is used, the
performance of a hash table depends only on
the method used to handle overflows.
Homework
Q1. Given M=11 and the following sequence of numbers,
show the final hash table using separate chaining and
the simple modulo function as our hash function:
112 41 34 2 98 16 35 74 77 0 32
