Collision

Uploaded by

Richa Singh

Collisions

Definition: A collision occurs when two or more values hashed by a particular hash function map to the same slot in the hash table being built by that function.

Example Hash Table With Collisions:

Let’s take the exact same hash function from before: take the value to be hashed mod 10, and place it in that slot in the hash table.
Numbers to hash: 22, 9, 14, 17, 42
As before, the hash table is shown to the right.
As before, we hash each value in the order it appears in the sequence of values to hash, starting with the first value.
The first four values can be entered into the hash table without any issues: 22 goes to slot 2, 9 to slot 9, 14 to slot 4 and 17 to slot 7. It is the last value, 42, that causes a problem. 42 mod 10 = 2, but there is already a value in slot 2 of the hash table, namely 22. This is a collision.
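The example above can be reproduced with a short Python sketch. The function name `find_collisions` and the tuple layout are illustrative choices, not part of the slides; the hash function is the same "value mod 10".

```python
def find_collisions(values, table_size=10):
    """Place each value at value % table_size and record any collisions."""
    table = [None] * table_size
    collisions = []
    for value in values:
        slot = value % table_size
        if table[slot] is None:
            table[slot] = value
        else:
            # The slot already holds another value: this is a collision.
            collisions.append((value, slot, table[slot]))
    return table, collisions


table, collisions = find_collisions([22, 9, 14, 17, 42])
print(table)       # [None, None, 22, None, 14, None, None, 17, None, 9]
print(collisions)  # [(42, 2, 22)]
```

Each entry in `collisions` reads as (new value, slot, value already stored there).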

Collision Resolution Techniques
• Chaining (Open Hashing)
• Open Addressing (Closed Hashing), which is further divided into:
1. Linear Probing
2. Quadratic Probing
3. Double Hashing
Chaining
When we use chaining to resolve collisions, we simply allow each slot in the
hash table to accept more than one value. Therefore, in the example above, 42
would simply go in slot 2, as the hash function told us, in a list after 22.
The chaining technique
In the chaining approach, the hash table is an array of linked lists, i.e., each index has its own linked list.
All key-value pairs that map to the same index are stored in the linked list of that index.
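A minimal sketch of chaining in Python, assuming the same "value mod 10" hash function as the earlier example. The class name `ChainedHashTable` is hypothetical, and plain Python lists stand in for the linked lists described above.

```python
class ChainedHashTable:
    """Hash table where every slot holds a chain of values."""

    def __init__(self, size=10):
        self.size = size
        # One independent chain per slot (a Python list stands in for a linked list).
        self.buckets = [[] for _ in range(size)]

    def insert(self, value):
        slot = value % self.size
        self.buckets[slot].append(value)  # colliding values share the chain
        return slot


table = ChainedHashTable()
for value in [22, 9, 14, 17, 42]:
    table.insert(value)

print(table.buckets[2])  # [22, 42]
```

Here 42 lands in the chain of slot 2, after 22, instead of being rejected.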

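The benefits of chaining

• With chaining, inserting at the head of a slot's linked list takes O(1) time, since linked lists allow insertion in constant time (provided the list is not first searched for a duplicate key).
• Theoretically, a chained hash table can keep growing as long as there is enough memory.
• A hash table that uses chaining never becomes full, so it is never forced to resize. In practice it is often resized anyway, because the chains get longer and lookups slower as more keys are added.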
Question: If a phone book is implemented using a hash table with separate chaining, and A and B resolve to the same index, how will the algorithm deduce whose phone number to return?
Answer: It does a linear search on the linked list at that index, comparing each stored key with the requested one. Therefore, in the worst case, a lookup with chaining has complexity O(N). That happens if all N values in the hash table are mapped to the same index (which a reasonable hash function tries to avoid, of course).
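The phone book question can be sketched as follows. Everything here is an assumption made for illustration: the class name `PhoneBook`, the toy hash function (sum of character codes mod table size), the names "AB" and "BA" (chosen because they are guaranteed to collide under that hash) and the phone numbers.

```python
class PhoneBook:
    """Phone book backed by a hash table with separate chaining."""

    def __init__(self, size=8):
        self.size = size
        self.buckets = [[] for _ in range(size)]

    def _index(self, name):
        # Toy hash function: sum of character codes, mod table size.
        return sum(ord(ch) for ch in name) % self.size

    def add(self, name, number):
        bucket = self.buckets[self._index(name)]
        for i, (existing, _) in enumerate(bucket):
            if existing == name:
                bucket[i] = (name, number)  # update an existing entry
                return
        bucket.append((name, number))

    def lookup(self, name):
        # Linear search through the chain, comparing the stored keys.
        for existing, number in self.buckets[self._index(name)]:
            if existing == name:
                return number
        return None


book = PhoneBook()
book.add("AB", "555-0100")
book.add("BA", "555-0101")  # same characters, so the same index as "AB"

print(book.lookup("AB"))  # 555-0100
print(book.lookup("BA"))  # 555-0101
```

Because the key is stored next to the value, the search can tell the two entries apart even though they share a chain.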
Open Addressing:
Like separate chaining, open addressing is a method for handling collisions. In open addressing, all elements are stored in the hash table itself, so at any point the size of the table must be greater than or equal to the total number of keys (the table can be enlarged by copying the old data into a bigger table if needed). This approach is also known as closed hashing. The entire procedure is based on probing, and the basic operations work as follows:
• Insert(k): Keep probing until an empty slot is found. Once an empty slot is found, insert k.
• Search(k): Keep probing until the slot’s key is equal to k or an empty slot is reached.
• Delete(k): The delete operation is interesting. If we simply empty the slot of a deleted key, a later search may stop early and fail to find keys that were placed further along the probe sequence. So slots of deleted keys are marked specially as “deleted”. An insert can place an item in a deleted slot, but a search doesn’t stop at a deleted slot.
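The three operations above can be sketched in Python as follows. This is only one possible arrangement: the class name `OpenAddressingTable`, the sentinel objects `EMPTY` and `DELETED`, the "key mod size" hash function and the step of 1 between probes are all assumptions made for the sketch.

```python
EMPTY = object()    # marks a slot that has never been used
DELETED = object()  # marks a slot whose key was deleted


class OpenAddressingTable:
    def __init__(self, size=7):
        self.size = size
        self.slots = [EMPTY] * size

    def _probe(self, key):
        # Visit every slot once, starting at the key's home slot.
        start = key % self.size
        for i in range(self.size):
            yield (start + i) % self.size

    def insert(self, key):
        # Assumes the key is not already in the table.
        for idx in self._probe(key):
            if self.slots[idx] is EMPTY or self.slots[idx] is DELETED:
                self.slots[idx] = key  # deleted slots may be reused
                return idx
        raise RuntimeError("hash table is full")

    def search(self, key):
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is EMPTY:
                return None  # a never-used slot ends the search
            if slot is not DELETED and slot == key:
                return idx
            # A DELETED slot does not stop the search.
        return None

    def delete(self, key):
        idx = self.search(key)
        if idx is not None:
            self.slots[idx] = DELETED
        return idx


t = OpenAddressingTable(7)
for key in (1, 8, 15):  # all three hash to slot 1
    t.insert(key)
t.delete(8)             # slot 2 is now marked as deleted
print(t.search(15))     # 3, the search passes over the deleted slot
print(t.insert(22))     # 2, the deleted slot is reused
```

If `delete` reset the slot to `EMPTY` instead, the search for 15 would stop at slot 2 and wrongly report that the key is missing.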
Different ways of Open Addressing:
1. Linear Probing:
In linear probing, the hash table is searched sequentially, starting from the original hash location. If that location is already occupied, we check the next location, wrapping around to the start of the table when the end is reached.
The function used for rehashing is: rehash(n) = (n + 1) % table_size, where n is the slot that was just probed.
The gap between two successive probes is therefore 1, as seen in the example below.
Challenges in Linear Probing:
• Primary Clustering: Occupied slots tend to form long consecutive runs. As these runs grow, it takes longer to find a free slot or to search for an element.
• Secondary Clustering: This is less severe. Two records share the same collision chain (probe sequence) only if their initial position is the same.
Example: Let us consider a simple hash function “key mod 5” and the sequence of keys 50, 70, 76, 93 to be inserted.
• Step 1: First draw the empty hash table, which has slots 0 to 4, the possible range of hash values for the given hash function.
• Step 2: Now insert the keys into the hash table one by one. The first key is 50. It maps to slot 0 because 50 % 5 = 0, so insert it into slot 0.
• Step 3: The next key is 70. It maps to slot 0 because 70 % 5 = 0, but 50 is already at slot 0, so search for the next empty slot and insert it there: 70 goes into slot 1.
• Step 4: The next key is 76. It maps to slot 1 because 76 % 5 = 1, but 70 is already at slot 1, so search for the next empty slot and insert it there: 76 goes into slot 2.
• Step 5: The next key is 93. It maps to slot 3 because 93 % 5 = 3. Slot 3 is empty, so insert it into slot 3.
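The linear probing example can be traced with a small Python sketch. The helper name `linear_probe_insert` is hypothetical; the hash function is "key mod 5" and the step between probes is 1, as in the example.

```python
def linear_probe_insert(table, key):
    """Insert key using linear probing; return the slot that was used."""
    size = len(table)
    slot = key % size
    for _ in range(size):
        if table[slot] is None:
            table[slot] = key
            return slot
        slot = (slot + 1) % size  # move to the next slot, wrapping around
    raise RuntimeError("hash table is full")


table = [None] * 5
placements = {key: linear_probe_insert(table, key) for key in [50, 70, 76, 93]}

print(placements)  # {50: 0, 70: 1, 76: 2, 93: 3}
print(table)       # [50, 70, 76, 93, None]
```

Keys 70 and 76 each end up one slot past their home slot because that slot was already taken.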
2. Quadratic Probing
Quadratic probing is a method that reduces the primary clustering discussed above. Instead of a fixed gap of 1, the interval between probes grows quadratically with the probe number: in the i-th iteration we look at the slot i^2 positions past the original hash location. We always start from the original hash location, and only if that location is occupied do we check the other slots.
Let hash(x) be the slot index computed using the hash function, and let S be the table size.
If slot hash(x) % S is full, then we try (hash(x) + 1*1) % S
If (hash(x) + 1*1) % S is also full, then we try (hash(x) + 2*2) % S
If (hash(x) + 2*2) % S is also full, then we try (hash(x) + 3*3) % S
…………………………………………..
Example: Let us consider a table of size 7, the hash function Hash(x) = x % 7 and the collision resolution strategy f(i) = i^2. Insert 22, 30 and 50.
• Step 1: Create a table of size 7.
• Step 2: Insert 22 and 30
• Hash(22) = 22 % 7 = 1. Since the cell at index 1 is empty, we can easily insert 22 at slot 1.
• Hash(30) = 30 % 7 = 2. Since the cell at index 2 is empty, we can easily insert 30 at slot 2.
• Step 3: Insert 50
• Hash(50) = 50 % 7 = 1
• In our hash table slot 1 is already occupied. So, we will search for slot 1 + 1^2, i.e. 1 + 1 = 2.
• Again slot 2 is found occupied, so we will search for slot 1 + 2^2, i.e. 1 + 4 = 5.
• Now, slot 5 is not occupied, so we will place 50 in slot 5.
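The quadratic probing example can be checked with a short Python sketch. The helper name `quadratic_probe_insert` is hypothetical; the hash function is x % 7 and the i-th probe looks at slot (hash + i*i) % size.

```python
def quadratic_probe_insert(table, key):
    """Insert key using quadratic probing; return the slot that was used."""
    size = len(table)
    home = key % size
    for i in range(size):
        slot = (home + i * i) % size  # i = 0 is the home slot itself
        if table[slot] is None:
            table[slot] = key
            return slot
    # Quadratic probing does not necessarily visit every slot, so this can
    # happen even when the table still has free slots.
    raise RuntimeError("no free slot found on the probe sequence")


table = [None] * 7
placements = {key: quadratic_probe_insert(table, key) for key in [22, 30, 50]}

print(placements)  # {22: 1, 30: 2, 50: 5}
print(table)       # [None, 22, 30, None, None, 50, None]
```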
3. Double Hashing
Double hashing computes the interval between probes with a second hash function, hash2(x), so that keys with the same original location usually follow different probe sequences. This reduces clustering more effectively than linear or quadratic probing. In the i-th iteration we look at slot (hash(x) + i*hash2(x)) % S.

Let hash(x) be the slot index computed using the hash function.
If slot hash(x) % S is full, then we try (hash(x) + 1*hash2(x)) % S
If (hash(x) + 1*hash2(x)) % S is also full, then we try (hash(x) + 2*hash2(x)) % S
If (hash(x) + 2*hash2(x)) % S is also full, then we try (hash(x) + 3*hash2(x)) % S
…………………………………………..
Example: Insert the keys 27, 43, 692, 72 into a hash table of size 7, where the first hash function is h1(k) = k mod 7 and the second hash function is h2(k) = 1 + (k mod 5).
• Step 1: Insert 27
• 27 % 7 = 6. Location 6 is empty, so insert 27 into slot 6.
• Step 2: Insert 43
• 43 % 7 = 1. Location 1 is empty, so insert 43 into slot 1.
• Step 3: Insert 692
• 692 % 7 = 6, but location 6 is already occupied, so this is a collision.
• So we need to resolve this collision using double hashing, with i = 1:
hnew = [h1(692) + i * h2(692)] % 7
= [6 + 1 * (1 + 692 % 5)] % 7
= 9 % 7
= 2
• Slot 2 is empty, so we can insert 692 into slot 2.
• Step 4: Insert 72
• 72 % 7 = 2, but location 2 is already occupied, so this is a collision.
• So we need to resolve this collision using double hashing, with i = 1:
hnew = [h1(72) + i * h2(72)] % 7
= [2 + 1 * (1 + 72 % 5)] % 7
= 5 % 7
= 5
• Slot 5 is empty, so we can insert 72 into slot 5.
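A minimal Python sketch of double hashing with h1(k) = k mod 7 and h2(k) = 1 + (k mod 5). The helper name `double_hash_insert` is hypothetical. The third key is taken to be 692, since 692 % 7 = 6 matches the slot computed in the walkthrough.

```python
def h1(key, size):
    return key % size


def h2(key):
    # Never returns 0, so every probe moves to a different slot.
    return 1 + key % 5


def double_hash_insert(table, key):
    """Insert key using double hashing; return the slot that was used."""
    size = len(table)
    for i in range(size):
        slot = (h1(key, size) + i * h2(key)) % size
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("no free slot found on the probe sequence")


table = [None] * 7
placements = {key: double_hash_insert(table, key) for key in [27, 43, 692, 72]}

print(placements)  # {27: 6, 43: 1, 692: 2, 72: 5}
print(table)       # [None, 43, 692, None, None, 72, 27]
```

The table size 7 is prime, so any step from 1 to 5 eventually visits every slot; with a non-prime size some steps would cycle through only part of the table.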
Comparison of the above three:
• Linear probing has the best cache performance but suffers from clustering. Another advantage of linear probing is that it is easy to compute.
• Quadratic probing lies between the two in terms of cache performance and clustering.
• Double hashing has poor cache performance but no clustering. Double hashing requires more computation time, as two hash functions need to be computed.
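To make the clustering difference concrete, the sketch below lists the first four slots probed by each method for two keys that share the same home slot. The function names, the table size 7 and the second hash function 1 + (key mod 5) are assumptions chosen for illustration.

```python
def linear_probes(key, size, count):
    return [(key % size + i) % size for i in range(count)]


def quadratic_probes(key, size, count):
    return [(key % size + i * i) % size for i in range(count)]


def double_hash_probes(key, size, count):
    step = 1 + key % 5  # assumed second hash function
    return [(key % size + i * step) % size for i in range(count)]


# 22 and 50 both have home slot 1 in a table of size 7.
for key in (22, 50):
    print(key,
          linear_probes(key, 7, 4),
          quadratic_probes(key, 7, 4),
          double_hash_probes(key, 7, 4))
# 22 [1, 2, 3, 4] [1, 2, 5, 3] [1, 4, 0, 3]
# 50 [1, 2, 3, 4] [1, 2, 5, 3] [1, 2, 3, 4]
```

With linear and quadratic probing both keys follow exactly the same sequence, while double hashing gives each key its own step size (3 for 22, 1 for 50).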
Separate Chaining vs. Open Addressing:
1. Separate Chaining: Chaining is simpler to implement.
   Open Addressing: Open addressing requires more computation.
2. Separate Chaining: The hash table never fills up; we can always add more elements to a chain.
   Open Addressing: The table may become full.
3. Separate Chaining: Chaining is less sensitive to the hash function or load factors.
   Open Addressing: Open addressing requires extra care to avoid clustering and a high load factor.
4. Separate Chaining: Chaining is mostly used when it is unknown how many keys may be inserted or deleted, and how frequently.
   Open Addressing: Open addressing is used when the frequency and number of keys is known.
5. Separate Chaining: Cache performance of chaining is not good, as keys are stored using linked lists.
   Open Addressing: Open addressing provides better cache performance, as everything is stored in the same table.
6. Separate Chaining: Wastage of space (some parts of the hash table in chaining are never used).
   Open Addressing: A slot can be used even if no input maps to it.
7. Separate Chaining: Chaining uses extra space for links.
   Open Addressing: No links are needed in open addressing.

Note: Cache performance of chaining is poor because the nodes of a linked list can be scattered all across the computer’s memory, so traversing a chain jumps from one location to another and the CPU cache is of little help for nodes that have not been visited yet. With open addressing, the data sits in one contiguous table, so neighbouring slots are likely to be in the cache already when a probe reaches them.
