Data+Structures+and+Algorithms+Bootcamp+in+Python+slides+Remaster (1) - Part-3

What is Hashing?

Hashing is a method of sorting and indexing data. The idea behind hashing is to allow large
amounts of data to be indexed using keys, which are commonly created by formulas.

Figure: a "magic function" maps each key to an index of a 24-cell array (indices 0 to 23):
Apple → 18, Application → 20, Appmillers → 22. Each key is stored in the cell at its index.

AppMillers
www.appmillers.com
Why Hashing?
It is time efficient for the SEARCH operation.

Time complexity for SEARCH

Data Structure         SEARCH
Array / Python List    O(logN) if sorted (binary search), O(N) otherwise
Linked List            O(N)
Tree                   O(logN)
Hashing                O(1) on average / O(N) in the worst case
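
The difference in the table can be sketched with a plain list and Python's built-in dict, which is a hash table. This is only an illustration; the names `names`, `linear_search` and `index_by_name` are made up for the example and do not come from the slides.

```python
names = ["Apple", "Application", "Appmillers"]

def linear_search(items, target):
    # Compare with the elements one by one: O(N) in the worst case.
    for index, item in enumerate(items):
        if item == target:
            return index
    return -1

# A dict hashes the key to find its cell directly: O(1) on average.
index_by_name = {name: index for index, name in enumerate(names)}

position_in_list = linear_search(names, "Appmillers")
position_in_dict = index_by_name["Appmillers"]
```

Both lookups return the same position; they differ only in how many comparisons are needed to get there.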

Hashing Terminology
Hash function : A function that can be used to map data of arbitrary size to data of fixed size.
Key : Input data given by a user.
Hash value : The value that is returned by the hash function.
Hash Table : A data structure that implements an associative array abstract data type, a
structure that can map keys to values.
Collision : A collision occurs when two different keys passed to a hash function produce the same
output.
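
As a rough sketch, the terms above can be lined up in a few lines of Python. The function `toy_hash` is a hypothetical stand-in (sum of character codes modulo the table size); it is not claimed to be the "magic function" of the slides.

```python
def toy_hash(key, table_size):
    # Hash function: maps a string of any length to an index below table_size.
    return sum(ord(ch) for ch in key) % table_size

TABLE_SIZE = 24
hash_table = [None] * TABLE_SIZE          # hash table: a fixed number of cells

key = "Apple"                             # key: the input data
hash_value = toy_hash(key, TABLE_SIZE)    # hash value: what the hash function returns
hash_table[hash_value] = key              # the key is stored in the cell at that index

# Collision: two different keys produce the same hash value.
# With toy_hash, any two strings made of the same letters collide.
collides = toy_hash("ABC", TABLE_SIZE) == toy_hash("CBA", TABLE_SIZE)
```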

Figure: Key → Hash Function (the "magic function") → Hash Value → Hash Table.
Apple → 18, Application → 20, Appmillers → 22; each key is stored in the cell of the
hash table (indices 0 to 23) given by its hash value.



Figure (collision): the hash function returns 20 for both ABCD and ABCDEF. Cell 20 of the
table (indices 0 to 23) already holds ABCD, so inserting ABCDEF causes a collision.

Hash Functions

Mod function

def mod(number, cellNumber):
    # Map an integer key to a cell index in the range 0 .. cellNumber - 1
    return number % cellNumber

mod(400, 24)  # 16
mod(700, 24)  # 4

Figure: in a 24-cell table (indices 0 to 23), 700 is stored in cell 4 and 400 in cell 16.
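
A minimal sketch of how the mod function could be used to fill the table from the figure. The list `cells` is an assumed representation of the 24-cell table.

```python
def mod(number, cellNumber):
    return number % cellNumber

cells = [None] * 24                 # assumed: the table is a list of 24 empty cells

for number in (400, 700):
    # Store each number in the cell chosen by the hash function.
    cells[mod(number, 24)] = number
```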


ASCII function

def modASCII(string, cellNumber):
    # Add up the character codes, then reduce the sum to a cell index
    total = 0
    for i in string:
        total += ord(i)
    return total % cellNumber

modASCII("ABC", 24)  # 6

A = 65, B = 66, C = 67, so 65 + 66 + 67 = 198.
198 = 24 × 8 + 6 (24 × 8 = 192), so the remainder is 6.

Figure: in a 24-cell table (indices 0 to 23), ABC is stored in cell 6.


Properties of good Hash function


- It distributes hash values uniformly across the hash table

Figure: ABCD and ABCDEF both hash to 20, so they collide in cell 20 of the table
(indices 0 to 23).

- It has to use all the input data
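Figure: the hash function uses only ABC from both ABCD and ABCDEF, so both keys hash to 18.
Cell 18 of the table (indices 0 to 23) already holds ABCD, so ABCDEF collides with it.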
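
The second property can be illustrated with a small sketch. `prefix_hash` is a hypothetical, deliberately poor hash function that looks only at the first three characters. The resulting indices differ from the 18 in the figure because the figure's hash function is not given.

```python
def modASCII(string, cellNumber):
    total = 0
    for ch in string:
        total += ord(ch)
    return total % cellNumber

def prefix_hash(string, cellNumber):
    # Poor hash function (hypothetical): ignores everything after the third character.
    return modASCII(string[:3], cellNumber)

# Using only part of the input makes ABCD and ABCDEF collide.
poor = (prefix_hash("ABCD", 24), prefix_hash("ABCDEF", 24))

# Using all of the input separates the two keys.
better = (modASCII("ABCD", 24), modASCII("ABCDEF", 24))
```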

Collision Resolution Techniques
Figure: in a 16-cell hash table (indices 0 to 15), ABCD and EFGH both hash to 2 and collide
in cell 2.


Resolution Techniques

- Direct Chaining
- Open Addressing
  - Linear Probing
  - Quadratic Probing
  - Double Hashing


Direct Chaining : Implements the buckets as linked lists. Colliding elements are stored in these lists.

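Figure (direct chaining, 16 cells, indices 0 to 15): ABCD, EFGH and IJKLM all hash to 2;
Miller hashes to 7. Each occupied cell holds the address of the first node of its list.

  cell 2 -> 111: [ABCD] -> 222: [EFGH] -> 333: [IJKLM] -> Null
  cell 7 -> 444: [Miller] -> Null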
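
A minimal sketch of direct chaining, reproducing the figure. The class and method names are made up for the example, and `figure_hash` simply hard-codes the hash values shown in the figure (with an assumed character-sum fallback for other keys), since the slides do not say which hash function produced them.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.next = None

FIGURE_HASHES = {"ABCD": 2, "EFGH": 2, "IJKLM": 2, "Miller": 7}

def figure_hash(key):
    # Hash values copied from the figure; the fallback is an assumption.
    return FIGURE_HASHES.get(key, sum(ord(ch) for ch in key))

class ChainedHashTable:
    def __init__(self, size):
        self.cells = [None] * size        # each cell is the head of a linked list

    def insert(self, key):
        index = figure_hash(key) % len(self.cells)
        node = Node(key)
        if self.cells[index] is None:
            self.cells[index] = node
            return
        current = self.cells[index]
        while current.next is not None:   # walk to the end of the chain
            current = current.next
        current.next = node

    def chain(self, index):
        keys, current = [], self.cells[index]
        while current is not None:
            keys.append(current.key)
            current = current.next
        return keys

    def search(self, key):
        return key in self.chain(figure_hash(key) % len(self.cells))

table = ChainedHashTable(16)
for key in ("ABCD", "EFGH", "IJKLM", "Miller"):
    table.insert(key)
```

Searching walks the chain of a single cell only, which is why a very long chain pushes the search time toward O(N).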

Open Addressing : Colliding elements are stored in other vacant buckets. During storage and
lookup these buckets are found through so-called probing.
Linear probing : Places the new key into the closest following empty cell.

Figure (linear probing, 16 cells, indices 0 to 15): ABCD, EFGH and IJKLM all hash to 2.
ABCD is stored in cell 2, EFGH in cell 3 and IJKLM in cell 4.
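
Linear probing could be sketched as follows. The hash values are copied from the figure; wrapping around to cell 0 at the end of the table is an assumption, and the search assumes that no key has ever been deleted.

```python
FIGURE_HASHES = {"ABCD": 2, "EFGH": 2, "IJKLM": 2}

def linear_probe_insert(cells, key, start):
    size = len(cells)
    for step in range(size):
        index = (start + step) % size     # next cell, wrapping around (assumption)
        if cells[index] is None:
            cells[index] = key
            return index
    raise RuntimeError("hash table is full")

def linear_probe_search(cells, key, start):
    size = len(cells)
    for step in range(size):
        index = (start + step) % size
        if cells[index] is None:          # an empty cell ends the probe sequence
            return -1
        if cells[index] == key:
            return index
    return -1

cells = [None] * 16
placed = {key: linear_probe_insert(cells, key, start)
          for key, start in FIGURE_HASHES.items()}
```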
Quadratic probing : Adds successive values of an arbitrary quadratic polynomial to the index until an empty cell is found.

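Figure (quadratic probing, 16 cells, indices 0 to 15): the probe offsets are 1², 2², 3², 4², ...
ABCD, EFGH and IJKLM all hash to 2.

  ABCD  -> cell 2
  EFGH  -> cell 2 is occupied; 2 + 1² = 3, stored in cell 3
  IJKLM -> cells 2 and 3 (2 + 1² = 3) are occupied; 2 + 2² = 6, stored in cell 6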
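
A sketch of quadratic probing with the offsets 0, 1², 2², 3², ... used in the figure. The wrap-around with `%` and the function name are assumptions.

```python
FIGURE_HASHES = {"ABCD": 2, "EFGH": 2, "IJKLM": 2}

def quadratic_probe_insert(cells, key, start):
    size = len(cells)
    for i in range(size):
        index = (start + i * i) % size    # start, start + 1, start + 4, start + 9, ...
        if cells[index] is None:
            cells[index] = key
            return index
    # Quadratic probing does not visit every cell, so this can happen
    # even when the table still has empty cells.
    raise RuntimeError("no empty cell found on the probe sequence")

cells = [None] * 16
placed = {key: quadratic_probe_insert(cells, key, start)
          for key, start in FIGURE_HASHES.items()}
```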
Double Hashing : The interval between probes is computed by another hash function.

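Figure (double hashing, 16 cells, indices 0 to 15): ABCD, EFGH and IJKLM all hash to 2.
A second hash function (Hash 2) returns 4 for EFGH and for IJKLM.

  ABCD  -> cell 2
  EFGH  -> cell 2 is occupied; 2 + 4 = 6, stored in cell 6
  IJKLM -> cells 2 and 6 (2 + 4 = 6) are occupied; 2 + (2*4) = 10, stored in cell 10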
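
Double hashing could be sketched like this. The first and second hash values for EFGH and IJKLM are the ones shown in the figure; the second hash value of ABCD is not shown, so the 1 used here is a placeholder assumption. With an interval of 4, the probe sequence starting at 2 is 2, 6, 10.

```python
FIRST_HASH = {"ABCD": 2, "EFGH": 2, "IJKLM": 2}
SECOND_HASH = {"ABCD": 1, "EFGH": 4, "IJKLM": 4}   # ABCD's value is a placeholder

def double_hash_insert(cells, key):
    size = len(cells)
    start, interval = FIRST_HASH[key], SECOND_HASH[key]
    for i in range(size):
        index = (start + i * interval) % size      # step by the second hash value
        if cells[index] is None:
            cells[index] = key
            return index
    raise RuntimeError("no empty cell found on the probe sequence")

cells = [None] * 16
placed = {key: double_hash_insert(cells, key) for key in FIRST_HASH}
```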
Hash Table is Full

Direct Chaining
This situation will never arise, because the linked list in each cell can keep growing.

Figure (direct chaining, 4 cells, indices 0 to 3): ABCD → 2, EFGH → 1, IJKLM → 3, NOPQ → 0,
RSTU → 1.

  cell 0 -> 111: [NOPQ] -> Null
  cell 1 -> 222: [EFGH] -> 555: [RSTU] -> Null
  cell 2 -> 333: [ABCD] -> Null
  cell 3 -> 444: [IJKLM] -> Null


Open addressing
Create a new hash table with twice the size of the current one and rehash the current keys into it.

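Figure (open addressing): ABCD → 2, EFGH → 1, IJKLM → 3, NOPQ → 0, RSTU → 1. The 4-cell
table (0 NOPQ, 1 EFGH, 2 ABCD, 3 IJKLM) is full when RSTU arrives. In the new, larger table
the first four keys occupy cells 0 to 3 again and RSTU is stored in cell 4.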
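
A sketch of the doubling step with linear probing. As in the figure, the hash value of each key is assumed to stay the same in the larger table; an implementation that takes the hash modulo the table size would generally move keys to new cells when the size changes.

```python
FIGURE_HASHES = {"ABCD": 2, "EFGH": 1, "IJKLM": 3, "NOPQ": 0, "RSTU": 1}

def probe_insert(cells, key):
    size = len(cells)
    start = FIGURE_HASHES[key]
    for step in range(size):
        index = (start + step) % size
        if cells[index] is None:
            cells[index] = key
            return index
    raise RuntimeError("hash table is full")

def insert(cells, key):
    if None not in cells:                        # table is full
        bigger = [None] * (2 * len(cells))       # create a table of twice the size
        for old_key in cells:                    # rehash every current key
            probe_insert(bigger, old_key)
        cells = bigger
    probe_insert(cells, key)
    return cells

cells = [None] * 4
for key in FIGURE_HASHES:
    cells = insert(cells, key)
```

Every key is visited once during the rehash, which is the cost that the doubling step adds to an otherwise constant-time insertion.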

Pros and Cons of Collision resolution techniques

Direct chaining
- The hash table never gets full.
- A huge linked list degrades performance (the time complexity of the search operation becomes O(n)).

Open addressing
- Easy to implement.
- When the hash table is full, creating a new hash table affects performance (the time complexity of the
search operation becomes O(n)).

‣ If the input size is known, we always use “Open addressing”.

‣ If we perform deletion operations frequently, we use “Direct Chaining”.


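Figure (linear probing): ABCD → 2, EFGH → 1, IJKLM → 3, NOPQ → 0, RSTU → 1. The table
holds 0 NOPQ, 1 EFGH, 2 ABCD, 3 IJKLM, 4 RSTU. RSTU hashes to 1 but is stored in cell 4,
the first empty cell after the occupied cells 1 to 3.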

Practical Use of Hashing

Password verification

Figure: Personal Computer (Login : [email protected], Password: 123456) → Google Servers
(Hash value: *&71283*a12).
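
The idea in the figure, comparing hash values instead of plain passwords, can be sketched with Python's standard `hashlib` and `hmac` modules. SHA-256 is chosen here only for brevity and is an assumption, not something the slides specify; production systems normally use a salted, deliberately slow password hash.

```python
import hashlib
import hmac

def hash_password(password):
    # Assumption for this sketch: plain SHA-256 of the UTF-8 encoded password.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# The server keeps only the hash value, not the password itself.
stored_hash = hash_password("123456")

def verify(password_attempt, expected_hash):
    # Hash the attempt and compare the two hash values in constant time.
    return hmac.compare_digest(hash_password(password_attempt), expected_hash)

login_ok = verify("123456", stored_hash)
login_rejected = verify("654321", stored_hash)
```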


File system : A file path is mapped to a physical location on disk.


Path: /Documents/Files/hashing.txt

Figure: the path /Documents/Files/hashing.txt is mapped to Physical location: sector 4.
Pros and Cons of Hashing

✓ On average, Insertion/Deletion/Search operations take O(1) time.

x When the hash function is not good enough, Insertion/Deletion/Search operations take O(n) time.

Operations   Array / Python List   Linked List   Tree      Hashing
Insertion    O(N)                  O(N)          O(LogN)   O(1) / O(N)
Deletion     O(N)                  O(N)          O(LogN)   O(1) / O(N)
Search       O(N)                  O(N)          O(LogN)   O(1) / O(N)
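
The O(N) case can be provoked in Python's own dict with a key type whose hash function is poor. `BadKey` is a hypothetical class written for this sketch: every instance returns the same hash value, so all keys collide and each operation has to compare against the colliding keys one by one. The results stay correct; only the speed suffers.

```python
class BadKey:
    """Hypothetical key type whose hash ignores its data."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 1                      # every key gets the same hash value

    def __eq__(self, other):
        return isinstance(other, BadKey) and self.name == other.name

table = {}
table[BadKey("Apple")] = 18           # insertion
table[BadKey("Application")] = 20
found = table[BadKey("Application")]  # search
del table[BadKey("Apple")]            # deletion
```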
