Hash Functions

Hash functions are used to map data of arbitrary size to data of a fixed size. This allows data to be stored and retrieved more efficiently from databases. Common hash function algorithms include truncation, mid-square, folding, and division methods. Hash collisions occur when two keys map to the same hash value. Collision resolution techniques include chaining, which links colliding keys together, and open addressing techniques like linear probing and quadratic probing that find alternate locations to store colliding keys. Coalesced hashing is a hybrid approach that chains keys within the hash table itself to reduce wasted space.

Uploaded by

GANESH G 111905006

UNIT V

HASH FUNCTIONS
Hash Function
• Hashing is the transformation of a string of characters into a
usually shorter, fixed-length value or key that represents the
original string.
• Hashing is used to index and retrieve items in a database because
it is faster to find an item using the shorter hashed key than
using the original value.
• The hashing algorithm is called the hash function. A hash function
is a mathematical function that converts an input value into a
compressed numerical value, called a hash or hash value.
• In other words, it takes in data of arbitrary length and produces
an output of fixed length: the hash value.
Perfect hash function
• An ideal (perfect) hash function maps all distinct keys to
distinct subscripts of a table, so no two keys collide.
• When a file has a million records, it is difficult
to find such a function.
Hash collision (clash)
• When two different keys hash to the same value,
it is called a hash collision or a hash clash.
• E.g., given a hash function h(key) = key % n
• For n = 1000, h(1322) = 1322 % 1000 = 322 and
h(2322) = 2322 % 1000 = 322
• That means keys 1322 and 2322 will both
attempt to insert their records into the same
position.
Techniques used in hash functions

• Truncation Method
• Mid square Method
• Folding Method
• Division Method
Truncation Method

• This is the simplest method for computing an address from a
key. In this method we take only a part of the key as the
address.
• Example: Let us take some 8-digit keys and find addresses
for them. Let the table size be 100, so we take the 2
rightmost digits as the hash table address.
• Suppose the keys are 62394572, 87135565, 93457271 and
45393225.
• The addresses of these keys are 72, 65, 71 and 25
respectively.
• This method is easy to compute, but the chance of collision is
high because the last two digits can be the same in more than
one key.
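The truncation example can be reproduced with a short sketch. The function name `truncation_hash` and its `digits` parameter are illustrative choices, not part of the original slides; keeping the rightmost digits is done here with a modulo by a power of ten.

```python
def truncation_hash(key: int, digits: int = 2) -> int:
    """Keep only the `digits` rightmost decimal digits of the key.

    Hypothetical helper: assumes a table size of 10**digits
    (100 slots for digits=2, as in the example).
    """
    return key % (10 ** digits)


keys = [62394572, 87135565, 93457271, 45393225]
addresses = [truncation_hash(k) for k in keys]
print(addresses)  # [72, 65, 71, 25]
```

Any two keys that share their last two digits, such as 62394572 and 11111172, land on the same address, which is the weakness noted above.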
Midsquare Method

• In this method the key is squared and some
digits from the middle of the square are taken
as the address.
• Example (taking the three middle digits):

Key    Square of key   Address
1123   1261129         611
2273   5166529         665
3139   9853321         533
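A minimal sketch of the mid-square method follows. It assumes the address is the three digits centred in the decimal form of the square (positions 3 to 5 of a 7-digit square); the name `mid_square_hash` is hypothetical. Under that rule the middle digits of 1261129 are 611.

```python
def mid_square_hash(key: int, digits: int = 3) -> int:
    """Square the key and return `digits` digits taken from the middle.

    Assumption: "middle" means the window centred in the decimal
    string of the square (rounded toward the left when uneven).
    """
    squared = str(key * key)
    start = (len(squared) - digits) // 2
    return int(squared[start:start + digits])


for key in (1123, 2273, 3139):
    print(key, key * key, mid_square_hash(key))
# 1123 1261129 611
# 2273 5166529 665
# 3139 9853321 533
```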

Folding Method

• In this technique the key is divided into parts,
where the length of each part is the same as that of the
required address, except possibly the last part.
• Example:
• Let the key be 123945234 and the table size be 1000. We
break the key up as follows:
• 123945234 ----> 123 945 234
• Now we add these parts:
123 + 945 + 234 = 1302. We ignore the final carry 1,
so the address for the key 123945234 is 302.
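The folding steps can be sketched as below. The helper name `folding_hash` is hypothetical, and the sketch assumes the table size is a power of ten so that "ignoring the final carry" is the same as reducing the sum modulo the table size.

```python
def folding_hash(key: int, table_size: int = 1000) -> int:
    """Split the key into address-sized parts, add them, drop the carry.

    Assumption: table_size is a power of ten (e.g. 1000), so each
    part has len(str(table_size - 1)) digits.
    """
    part_len = len(str(table_size - 1))      # 3 digits for size 1000
    digits = str(key)
    parts = [int(digits[i:i + part_len])
             for i in range(0, len(digits), part_len)]
    return sum(parts) % (10 ** part_len)     # discard the carry


print(folding_hash(123945234))  # 302
```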
Division Method (Modulo-Division)
• In the modulo-division method the key is divided
by the table size and the remainder is taken as
the address in the hash table.
• Let the table size be n; then
• H(k) = k mod n
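As a small illustration, the division method is a single modulo operation. The function name `division_hash` is an illustrative choice; the two keys are the ones used in the collision example, and they show how distinct keys can share a remainder.

```python
def division_hash(key: int, n: int) -> int:
    """H(k) = k mod n, where n is the table size."""
    return key % n


a = division_hash(1322, 1000)
b = division_hash(2322, 1000)
print(a, b)  # 322 322
```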
Resolving hash clashes
1. Chaining (Open Hashing): Keys with the same
hash value are linked together, and a
search sequentially traverses the
items in that linked list.
2. Open Addressing (Closed Hashing): Whenever
there is a clash, the key is rehashed to find another
slot in the table.
– Many techniques exist, e.g., linear probing and quadratic
probing.
Chaining
• Chaining resolves collisions. The idea is to make each cell of the
hash table point to a linked list of records that have the same
hash function value.
• Let the hash table have N buckets. To insert a node into the
hash table, we first find the hash index for the given key.
• Example: hashIndex = key % tableSize
• Insert: Move to the table location that corresponds to the
calculated hash index and insert the new node at the
end of the list.
• Delete: To delete a node from the hash table, calculate the
hash index for the key, move to the bucket that corresponds to
that index, and search the list in that bucket to find and
remove the node with the given key (if found).
Chaining
• Example: h(key) = key % 10
Input: 2813, 1615, 2822, 8232, 3553, 2125, 4288

Index  Chain
0
1
2      2822 -> 8232
3      2813 -> 3553
4
5      1615 -> 2125
6
7
8      4288
9
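The chaining example can be reproduced with a minimal sketch. As a simplifying assumption, each bucket is a Python list standing in for the linked list described in the slides; the class name `ChainedHashTable` and its method names are hypothetical.

```python
class ChainedHashTable:
    """Separate chaining; a Python list per bucket stands in for a linked list."""

    def __init__(self, size: int = 10):
        self.size = size
        self.buckets = [[] for _ in range(size)]

    def _index(self, key: int) -> int:
        return key % self.size

    def insert(self, key: int) -> None:
        # New keys go at the end of the chain for their bucket.
        self.buckets[self._index(key)].append(key)

    def search(self, key: int) -> bool:
        # Sequentially traverse the chain in the key's bucket.
        return key in self.buckets[self._index(key)]

    def delete(self, key: int) -> bool:
        bucket = self.buckets[self._index(key)]
        if key in bucket:
            bucket.remove(key)
            return True
        return False


table = ChainedHashTable(10)
for k in [2813, 1615, 2822, 8232, 3553, 2125, 4288]:
    table.insert(k)
print(table.buckets[2], table.buckets[3], table.buckets[5], table.buckets[8])
# [2822, 8232] [2813, 3553] [1615, 2125] [4288]
```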
• Open addressing stores all elements directly in the
hash table and resolves collisions by probing for
another slot, using one of several methods:
– Linear Probing: place the data in the next open slot
in the table.
– Quadratic Probing: the slot at offset i^2 is examined in the i-th probe.
– Double Hashing: use a second hash function hash2(x)
and examine the slot at offset i*hash2(x) in the i-th probe.
Open Addressing
• Like separate chaining, open addressing is a method for
handling collisions. In open addressing, all elements are
stored in the hash table itself, so at any point the size of the
table must be greater than or equal to the total number of
keys. (The table size can be increased by copying the old
data into a larger table if needed.)
• Insert(k): Keep probing until an empty slot is found, then
insert k there.
• Search(k): Keep probing until a slot whose key equals k is
found or an empty slot is reached.
• Delete(k): If we simply empty the slot of a deleted key, a later
search may fail, because it stops at the empty slot before
reaching keys stored further along the probe sequence.
So slots of deleted keys are marked specially as "deleted".
Insert can place an item in a deleted slot, but search
does not stop at a deleted slot.
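The insert, search and delete rules above can be sketched as follows. This is one possible arrangement, assuming linear probing and integer keys; the sentinel objects `EMPTY` and `DELETED` and the class name `OpenAddressingTable` are hypothetical names introduced for illustration.

```python
EMPTY = object()    # slot never used
DELETED = object()  # slot whose key was removed (tombstone)


class OpenAddressingTable:
    """Open addressing with linear probing and 'deleted' markers."""

    def __init__(self, size: int = 10):
        self.size = size
        self.slots = [EMPTY] * size

    def _probe(self, key: int):
        start = key % self.size
        for i in range(self.size):          # visit each slot at most once
            yield (start + i) % self.size

    def insert(self, key: int) -> int:
        for idx in self._probe(key):
            if self.slots[idx] is EMPTY or self.slots[idx] is DELETED:
                self.slots[idx] = key       # deleted slots may be reused
                return idx
        raise OverflowError("hash table is full")

    def search(self, key: int):
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is EMPTY:               # a truly empty slot ends the search
                return None
            if slot is not DELETED and slot == key:
                return idx                  # deleted slots are skipped, not stops
        return None

    def delete(self, key: int) -> bool:
        idx = self.search(key)
        if idx is None:
            return False
        self.slots[idx] = DELETED
        return True


t = OpenAddressingTable(10)
for k in (13, 23, 33):
    t.insert(k)          # slots 3, 4, 5
t.delete(23)             # slot 4 becomes a tombstone
print(t.search(33))      # 5: the search passes over the tombstone
print(t.insert(43))      # 4: the tombstone is reused
```

If slot 4 were simply emptied instead of marked, the search for 33 would stop there and wrongly report the key as absent.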
Linear Probing
• Let hash(x) be the slot index computed using the
hash function, and let S be the table size.
• If slot hash(x) % S is full, then we try (hash(x) +
1) % S
• If (hash(x) + 1) % S is also full, then we try
(hash(x) + 2) % S
• If (hash(x) + 2) % S is also full, then we try
(hash(x) + 3) % S, and so on
Quadratic Probing
• We look at the slot at offset i^2 in the i-th iteration.
• Let hash(x) be the slot index computed using the hash
function, and let S be the table size.
• If slot hash(x) % S is full, then we try (hash(x) +
1*1) % S
• If (hash(x) + 1*1) % S is also full, then we try
(hash(x) + 2*2) % S
• If (hash(x) + 2*2) % S is also full, then we try
(hash(x) + 3*3) % S, and so on
Double Hashing
• We use a second hash function hash2(x) and look at the
slot at offset i*hash2(x) in the i-th iteration.
• Let hash(x) be the slot index computed using the hash
function, and let S be the table size.
• If slot hash(x) % S is full, then we try (hash(x) +
1*hash2(x)) % S
• If (hash(x) + 1*hash2(x)) % S is also full, then we
try (hash(x) + 2*hash2(x)) % S
• If (hash(x) + 2*hash2(x)) % S is also full, then we
try (hash(x) + 3*hash2(x)) % S, and so on
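The three probe formulas differ only in the offset added on the i-th attempt, which the sketch below makes explicit. The function names are hypothetical, and the second hash value of 7 is an arbitrary illustrative number, since the slides do not define hash2.

```python
def linear_probe(h: int, i: int, size: int) -> int:
    return (h + i) % size            # offsets 0, 1, 2, 3, ...


def quadratic_probe(h: int, i: int, size: int) -> int:
    return (h + i * i) % size        # offsets 0, 1, 4, 9, ...


def double_hash_probe(h: int, h2: int, i: int, size: int) -> int:
    return (h + i * h2) % size       # offsets 0, h2, 2*h2, 3*h2, ...


S, h = 10, 2                         # table size and hash(x), chosen for illustration
print([linear_probe(h, i, S) for i in range(4)])          # [2, 3, 4, 5]
print([quadratic_probe(h, i, S) for i in range(4)])       # [2, 3, 6, 1]
print([double_hash_probe(h, 7, i, S) for i in range(4)])  # [2, 9, 6, 3]
```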
Open Addressing: Linear Probing
• Place the record in the next available position in the array, i.e., rh(i) = (i + 1) % array_size.
• E.g., with h(key) = key % 10 (input: 2822, 2813, 1615, 3553, 2125, 4288, 8232)

Slot  Key   Probes
0
1
2     2822
3     2813
4     3553  h(3553)=3, rh(3)=4
5     1615
6     2125  h(2125)=5, rh(5)=6
7     8232  h(8232)=2, rh(2)=3, rh(3)=4, rh(4)=5, rh(5)=6, rh(6)=7
8     4288
9
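To check the linear probing table, the following sketch inserts the same keys in the same order and records which slots each key visits. The function name `linear_probing_insert` and the use of `None` for an empty slot are assumptions made for illustration.

```python
def linear_probing_insert(table: list, key: int) -> list:
    """Insert key with linear probing; return the list of slots visited."""
    size = len(table)
    idx = key % size
    visited = [idx]
    for _ in range(size - 1):
        if table[idx] is None:
            break
        idx = (idx + 1) % size       # rh(i) = (i + 1) % array_size
        visited.append(idx)
    if table[idx] is not None:
        raise OverflowError("hash table is full")
    table[idx] = key
    return visited


table = [None] * 10
trace = {}
for k in [2822, 2813, 1615, 3553, 2125, 4288, 8232]:
    trace[k] = linear_probing_insert(table, k)

print(table)
# [None, None, 2822, 2813, 3553, 1615, 2125, 8232, 4288, None]
print(trace[8232])  # [2, 3, 4, 5, 6, 7]
```

The long trace for 8232 illustrates how runs of occupied slots build up under linear probing.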
Open Addressing: Quadratic Probing
• The j-th rehash is hj(key) = (h(key) + j^2) % array_size
• E.g., with h(key) = key % 10 (input: 2822, 1615, 2813, 3553, 2125, 8232, 4288)

Slot  Key   Probes
0
1     8232  h(8232)=2, h1=(2+1*1)%10=3, h2=(2+2*2)%10=6, h3=(2+3*3)%10=1
2     2822
3     2813
4     3553  h(3553)=3, h1=(3+1*1)%10=4
5     1615
6     2125  h(2125)=5, h1=(5+1*1)%10=6
7
8     4288
9
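The quadratic probing table can be rebuilt with a short sketch. The function name `quadratic_probing_insert` is hypothetical, and empty slots are represented by `None`. Note that a quadratic probe sequence does not necessarily visit every slot, so the sketch gives up after `size` attempts even if the table is not full.

```python
def quadratic_probing_insert(table: list, key: int) -> int:
    """Insert key using hj(key) = (h(key) + j*j) % size; return the slot used."""
    size = len(table)
    home = key % size
    for j in range(size):
        idx = (home + j * j) % size
        if table[idx] is None:
            table[idx] = key
            return idx
    raise OverflowError("no free slot found on this probe sequence")


table = [None] * 10
for k in [2822, 1615, 2813, 3553, 2125, 8232, 4288]:
    quadratic_probing_insert(table, k)

print(table)
# [None, 8232, 2822, 2813, 3553, 1615, 2125, None, 4288, None]
```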
Coalesced Hashing
• Coalesced hashing is a hybrid of chaining and open
addressing. It links together chains
of nodes within the table itself.
• Like open addressing, it achieves good space usage, since
all keys are stored in the table itself.
Unlike chaining, it cannot hold more elements
than there are table slots.
• Coalesced hashing is a collision resolution technique for
data of fixed size. It is a combination of
both separate chaining and open addressing.
• It uses the concept of open addressing (linear probing) to
find the first empty place for a colliding element, searching
from the bottom of the hash table, and the concept of separate
chaining to link the colliding elements to each other through pointers.
• The hash function used is h(key) = key % n, where n is the
total number of keys (and the number of table slots).
Inside the hash table, each node has three fields:
• h(key): The value of the hash function for a key.
• Data: The key itself.
• Next: The link to the next colliding element.
Example
• n = 10
• Input: {20, 35, 16, 40, 45, 25, 32, 37, 22, 55}
• h(key) = key % 10
• Initially, after inserting 20, 35 and 16 (no collisions):

Hash Value  Data  Next
0           20    Null
1
2
3
4
5           35    Null
6           16    Null
7
8
9
• Now we have to insert 40. h(40) = 0, which is already occupied, so we search
for the first empty slot from the bottom and insert it there. The
address of this newly inserted node (i.e., 9) is stored in the Next field of
the node at index 0.

Hash Value  Data  Next
0           20    9
1
2
3
4
5           35    Null
6           16    Null
7
8
9           40    Null
• Finally, after all keys have been inserted, the hash table looks like this:
• Input: {20, 35, 16, 40, 45, 25, 32, 37, 22, 55}
• h(key) = key % 10

Hash Value  Data  Next
0           20    9
1           55    Null
2           32    3
3           22    Null
4           37    1
5           35    8
6           16    Null
7           25    4
8           45    7
9           40    Null
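The coalesced hashing example can be traced with the sketch below. It assumes two parallel lists, `data` and `nxt`, for the Data and Next fields, with `None` playing the role of Null; the function name `coalesced_insert` is hypothetical. On a collision, the key goes into the first empty slot found from the bottom of the table and is linked to the tail of the chain that starts at its home slot.

```python
def coalesced_insert(data: list, nxt: list, key: int) -> int:
    """Insert key into a coalesced hash table; return the slot used."""
    size = len(data)
    home = key % size
    if data[home] is None:           # no collision: use the home slot
        data[home] = key
        return home

    free = size - 1                  # search for an empty slot from the bottom
    while free >= 0 and data[free] is not None:
        free -= 1
    if free < 0:
        raise OverflowError("hash table is full")
    data[free] = key

    tail = home                      # follow Next links to the end of the chain
    while nxt[tail] is not None:
        tail = nxt[tail]
    nxt[tail] = free
    return free


data = [None] * 10
nxt = [None] * 10
for k in [20, 35, 16, 40, 45, 25, 32, 37, 22, 55]:
    coalesced_insert(data, nxt, k)

print(data)  # [20, 55, 32, 22, 37, 35, 16, 25, 45, 40]
print(nxt)   # [9, None, 3, None, 1, 8, None, 4, 7, None]
```

Key 37 shows why the chains "coalesce": its home slot 7 is already held by 25, which belongs to the chain for hash value 5, so 37 and later 55 end up on the same chain (5, 8, 7, 4, 1).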
Chaining vs Open addressing
• Chaining is simpler to implement. Open
addressing requires more computation.
• In chaining, the hash table never fills up; we can
always add more elements to a chain. In open
addressing, the table may become full.
• Chaining uses extra space for links. Open
addressing needs no links.
