Chapter - 5 (Graphs and Hashing)

The document discusses graphs as non-linear data structures, detailing their representation through adjacency matrices and lists, and the characteristics of bi-directional and directional edges. It explains graph traversal methods, specifically breadth-first search (BFS) and depth-first search (DFS), including their algorithms, time and space complexities, and practical applications. The document also provides examples and visual representations to illustrate these concepts.


graph and hashing

topic: graphs

a graph is a non-linear data structure.

G(V, E)
V: set of vertices
E: set of edges

example: V = {A, B, C, D}, E = {e1, e2, e3, e4, e5}

[figure: a graph with vertices A, B, C, D joined by edges e1 to e5]

a graph with many edges is called dense; a graph with few edges is called sparse.

topic: graph representation


(i) adjacency matrix
(ii) adjacency list

concept: adjacency matrix


a 2d array (matrix) of size n x n is used, where n is the number of vertices in the graph. each cell (i, j) in
the matrix represents whether an edge exists between vertex i and vertex j.

characteristics:
(i) space complexity:
- O(n²) irrespective of the number of edges.
- takes a lot of space for sparse graphs (graphs with fewer edges).

(ii) time complexity:

- check if an edge exists: O(1) (direct access).
- add/remove edge: O(1).
- traverse neighbors of a node: O(n) (need to check all entries in the row).

(iii) suitable for:


- dense graphs: because all the possible edges are stored explicitly.
- graphs where edge queries (whether an edge exists) are frequent.
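a minimal python sketch (mine, not from the notes) of building such a matrix; the edge list is a made-up example that matches the undirected graph used later in this chapter:

# build an n x n adjacency matrix for an undirected graph (0-indexed vertices)
n = 4
matrix = [[0] * n for _ in range(n)]

# hypothetical edge list, chosen for illustration
edges = [(0, 1), (0, 2), (0, 3), (1, 3), (2, 3)]
for u, v in edges:
    matrix[u][v] = 1  # undirected edge: mark both (u, v) and (v, u)
    matrix[v][u] = 1

# O(1) edge query: is there an edge between vertices 0 and 3?
print(matrix[0][3])  # 1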



there are two types of edges:

(i) bi-directional/undirected edge

a bidirectional edge represents a connection between two vertices in both directions. in an undirected graph,
the edge is symmetric.

characteristics:
(i) no direction: an edge between u and v implies both u → v and v → u.

(ii) real-world examples:


- two-way streets in a city map.
- relationships in a social network (e.g., friends on facebook).

(iii) adjacency matrix:


if an edge between u and v exists, then both (u, v) = 1 and (v, u) = 1.

(iv) adjacency list:


both u lists v and v lists u.

example:
do not count self loops (is there any edge from 1 to 1? no, so the diagonal entries stay 0)

[figure: undirected graph with vertices 1, 2, 3, 4 and edges 1-2, 1-3, 1-4, 2-4, 3-4]

        1  2  3  4
    1 [ 0  1  1  1 ]
A = 2 [ 1  0  0  1 ]
    3 [ 1  0  0  1 ]
    4 [ 1  1  1  0 ]

(ii) directional/directed edge

a directional edge (or directed edge) represents a connection between two vertices in a specific direction. in a
directed graph, edges have an orientation, meaning they go from one vertex to another.

characteristics:
(i) direction matters: edge u → v does not imply v → u.

(ii) real-world examples:


- one-way streets in a city map.
- tasks in a project schedule (task a must precede task b).

(iii) adjacency matrix:


if u → v exists, then matrix entry (u, v) = 1, while (v, u) remains 0 unless the edge v → u also exists.

(iv) adjacency list:


u lists v as its neighbor, but v may not list u.

example:
do not count self loops (is there any edge from 1 to 2? no: entry (1, 2) = 0)

[figure: directed graph with vertices 1, 2, 3, 4]

        1  2  3  4
    1 [ 0  0  1  1 ]
A = 2 [ 1  0  0  1 ]
    3 [ 0  0  0  0 ]   row 3 is all 0 because there is no edge going out from 3
    4 [ 0  1  1  0 ]

concept: adjacency list


a list of lists (or similar data structure) is used, where each vertex has a list containing its neighboring
vertices. it stores only the edges that exist.

characteristics:
(i) space complexity:
- O(V + E) where V is the number of vertices and E is the number of edges.
- much more space-efficient for sparse graphs.

(ii) time complexity:

- check if an edge exists: O(degree of the vertex) (need to search the vertex's adjacency list).
- add/remove edge: O(1) (amortized, depending on the implementation).
- traverse neighbors of a node: O(degree of the vertex).

(iii) suitable for:


- sparse graphs: because it only stores edges that exist, saving space.
- graphs where traversal of neighbors is more frequent than edge existence queries.
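a minimal python sketch (not from the notes) of the same idea; the edge list is illustrative and matches the example that follows:

from collections import defaultdict

# adjacency list for an undirected graph
adj = defaultdict(list)
for u, v in [(1, 2), (1, 4), (1, 3), (2, 4), (3, 4)]:
    adj[u].append(v)  # each endpoint lists the other,
    adj[v].append(u)  # since the edge is undirected

# traversing the neighbours of vertex 1 costs O(degree of vertex 1)
print(adj[1])  # [2, 4, 3]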

example:

[figure: the same undirected graph on vertices 1, 2, 3, 4]

adjacency list:
1 → 2, 4, 3
2 → 1, 4
3 → 4, 1
4 → 3, 1, 2

the denser the graph, the longer each vertex's adjacency list becomes (and the more 1's appear in the corresponding adjacency matrix).

topic: graph traversal


graph traversal refers to the process of systematically visiting all the vertices and edges in a graph. the
concepts of visit and explore are key aspects of graph traversal algorithms like depth-first search (dfs) and
breadth-first search (bfs).

(i) visit
visiting a node means that you have reached the node during traversal and marked it as discovered.

- purpose: visiting ensures that the node is acknowledged in the traversal sequence, and any associated
action (like printing, marking, or processing) is performed.

characteristics:
- occurs when you first encounter a vertex.
- a node is often marked as "visited" to prevent revisiting.
example: in bfs, when you dequeue a node from the queue, it’s visited.

use case: ensures no vertex is skipped during traversal.

(ii) explore
exploring a node means examining its neighbors or edges to determine further traversal paths.

- purpose: exploring ensures that all the connected components or adjacent vertices of a node are considered for
traversal.

characteristics:
- involves traversing edges from the current vertex to its neighbors.
- typically, when a vertex is explored, its neighbors may be marked for a visit.
example: in dfs, when you recursively call dfs for a neighbor, you are exploring the vertex.

use case: ensures all paths and connections are fully traversed.

concept: breadth first search (graph traversal)


breadth-first search (bfs) is a graph traversal algorithm that visits all nodes at the current depth level before
moving to the next depth level. bfs uses a queue data structure to keep track of nodes to visit, ensuring that
nodes are explored in the order they are discovered.

how bfs works


(i) start from the source node:
- mark the source node as visited.
- enqueue it into the queue.

(ii) process nodes level by level:


- dequeue the front node from the queue.
- visit all its unvisited neighbors.
- mark those neighbors as visited and enqueue them.

(iii) repeat until the queue is empty:


- continue the process until all reachable nodes are visited.

time and space complexity


(i) time complexity:
- each vertex is enqueued and dequeued once → O(V)
- all edges are explored once → O(E)
- total time complexity → O(V + E)

(ii) space complexity:

- for the queue → O(V)
- for the visited set → O(V)

applications of bfs
- shortest path: finding the shortest path in an unweighted graph.
- connected components: identifying all connected components in a graph.
- network flow: bfs is used in algorithms like edmonds-karp.
- ai and games: solving puzzles like mazes and shortest path problems.
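a runnable python sketch of bfs (my own illustration, not from the notes); the adjacency list mirrors the 8-vertex example worked through below:

from collections import deque

def bfs(adj, source):
    """visit nodes level by level; returns the visit order."""
    visited = {source}          # mark the source as visited
    queue = deque([source])     # enqueue the source
    order = []
    while queue:                    # repeat until the queue is empty
        node = queue.popleft()      # dequeue the front node
        order.append(node)
        for neighbor in adj[node]:       # explore its neighbours
            if neighbor not in visited:  # visit only unvisited ones
                visited.add(neighbor)
                queue.append(neighbor)
    return order

# the 8-vertex graph used in the worked example below
adj = {1: [2, 3, 4], 2: [1], 3: [1], 4: [1, 5],
       5: [4, 6, 7, 8], 6: [5], 7: [5], 8: [5]}
print(bfs(adj, 1))  # [1, 2, 3, 4, 5, 6, 7, 8]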

example: tree
in a tree, hierarchy is maintained, allowing us to reach any node by traversing through the left subtree or
the right subtree.

(i) start from the source node

[figure: a binary tree with root 1 (the source), children 2 and 3, and leaves 4, 5, 6, 7]

(ii) process nodes level by level

(a) from source at distance of 1: 1 → {2, 3}

(b) from source at distance of 2: 1 → {2, 3} → {4, 5, 6, 7}

example: queue

[figure: graph with source 1 adjacent to 2, 3, 4; vertex 4 adjacent to 5; vertex 5 adjacent to 6, 7, 8; an empty queue and a visited array of eight 0's are drawn beside it]

in graphs, there may be cycles or multiple edges leading to the same node. revisiting nodes can lead to infinite loops or redundant work; the visited set keeps track of nodes that have already been processed, ensuring each node is visited only once.

(i) start from the source

queue:   1
visited: 1 0 0 0 0 0 0 0   (indices 1 to 8)

(ii) visit the unvisited neighbours of 1: (2, 3, 4).

queue:   1 2 3 4
visited: 1 1 1 1 0 0 0 0

(iii) visit the unvisited neighbours: (5).
- 4 is also adjacent to 1, but 1 will not be visited again because it is already marked 1 in the visited array.

queue:   1 2 3 4 5
visited: 1 1 1 1 1 0 0 0

(iv) visit the unvisited neighbours: (6, 7, 8).
- 5 is also adjacent to 4, but 4 will not be visited again because it is already marked 1 in the visited array.

queue:   1 2 3 4 5 6 7 8
visited: 1 1 1 1 1 1 1 1

1 → {2, 3, 4} → {5} → {6, 7, 8}

concept: depth-first search (graph traversal)


depth-first search (dfs) is a graph traversal algorithm that explores as far as possible along each branch before
backtracking. it's useful for searching trees or graphs and is implemented either using recursion or an explicit
stack.

key points:
(i) explores deeper into the graph:
dfs starts at the root (or any arbitrary node) and explores as deep as possible along a branch before
backtracking.

(ii) stack-based exploration:
dfs uses a stack to remember which nodes to visit next. this stack can be implicit with recursion or explicit
using an actual stack data structure.

two main traversal types:


- pre-order dfs: visit the current node before its children.
- post-order dfs: visit the current node after its children.

dfs algorithm steps:


(i) visit the starting node: push it onto the stack.
(ii) explore the current node's neighbors: pop the top node from the stack and explore its unvisited neighbors.
(iii) repeat the process until all reachable nodes are visited.

dfs time and space complexity:


- time complexity:
O(V + E), where V is the number of vertices and E is the number of edges. every node and edge is visited
once.

- space complexity:
O(V), for storing visited nodes and the stack.

key points:
- dfs explores deeply into the graph, making it suitable for tasks like finding connected components,
topological sorting, and cycle detection.
- recursion is a natural fit for dfs because the call stack automatically manages the traversal process.
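a runnable python sketch of recursive (pre-order) dfs; the graph and the neighbour ordering are my assumptions, chosen so the output reproduces the walkthrough below:

def dfs(adj, node, visited=None, order=None):
    """pre-order dfs: visit the node, then explore each neighbour."""
    if visited is None:
        visited, order = set(), []
    visited.add(node)     # visit: mark the node as discovered
    order.append(node)
    for neighbor in adj[node]:       # explore: follow each edge
        if neighbor not in visited:  # go deeper before backtracking
            dfs(adj, neighbor, visited, order)
    return order

# same 8-vertex graph; neighbour order of 5 chosen to match the walkthrough
adj = {1: [2, 3, 4], 2: [1], 3: [1], 4: [1, 5],
       5: [4, 8, 7, 6], 6: [5], 7: [5], 8: [5]}
print(dfs(adj, 1))  # [1, 2, 3, 4, 5, 8, 7, 6]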

(i) start from the source node 1.

[figure: the same 8-vertex graph; each step below ticks off the explored edges and shows the visited sequence so far]

visited: 1

(ii) during the exploration of 1 we found 2, so stop exploring 1 and visit the unvisited node.

visited: 1 2

(iii) during the exploration of 2 we found nothing, so stop exploring 2 and return to 1.

(iv) during the exploration of 1 we found 3, so stop exploring 1 and visit the unvisited node.

visited: 1 2 3

(v) during the exploration of 3 we found nothing, so stop exploring 3 and return to 1.

(vi) during the exploration of 1 we found 4, so stop exploring 1 and visit the unvisited node.

visited: 1 2 3 4

(vii) during the exploration of 4 we found 5, so stop exploring 4 and visit the unvisited node.

visited: 1 2 3 4 5

(viii) during the exploration of 5 we found 8, so stop exploring 5 and visit the unvisited node.

visited: 1 2 3 4 5 8

(ix) during the exploration of 8 we found nothing, so stop exploring 8 and return to 5.

(x) during the exploration of 5 we found 7, so stop exploring 5 and visit the unvisited node.

visited: 1 2 3 4 5 8 7

(xi) during the exploration of 7 we found nothing, so stop exploring 7 and return to 5.

(xii) during the exploration of 5 we found 6, so stop exploring 5 and visit the unvisited node.

visited: 1 2 3 4 5 8 7 6

(xiii) during the exploration of 6 we found nothing, so stop exploring 6 and return to 5.

(xiv) during the exploration of 5 we found nothing, so stop exploring 5 and return to 4.

(xv) during the exploration of 4 we found nothing, so stop exploring 4 and return to 1.

(xvi) no more unvisited nodes. final visited sequence: 1 2 3 4 5 8 7 6

topic: hashing
hashing is a technique used to map data (keys) to fixed-size values, typically integers, through a hash
function. it is commonly used for data retrieval and allows for efficient lookups, insertions, and deletions.

key concepts of hashing:


(i) hash function:
- a function that converts input (or key) into a fixed-size value called a hash code or hash value.
- the hash function should ideally distribute values uniformly to minimize collisions.

(ii) hash table:
- a data structure that uses a hash function to store data in an array-like structure.
- the index at which a key-value pair is stored is determined by the hash value.

(iii) collision:
- occurs when two different keys hash to the same index in the hash table.
- collision resolution techniques are used to handle this, such as chaining or open addressing.

types of hashing:
(i) direct hashing:
simple hash function that directly maps keys to an index (e.g., the key itself is used as the index).

(ii) modular hashing:
- the hash value is computed using a modulo operation:
  hash_value = key % table_size
- this is one of the most common hash functions.

(iii) multiplicative hashing:

uses a constant multiplier A to compute the hash value:
hash_value = floor(table_size * ((key * A) mod 1))
where A is a constant between 0 and 1; the fractional part of key * A spreads the keys across the table.

(iv) division-based hashing:


- calculates the hash value by dividing the key by a prime number and taking the remainder.
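a small python sketch (mine, not from the notes) of the modular and multiplicative hash functions described above; the constant A ≈ (sqrt(5) - 1) / 2 is a commonly suggested choice, not something from the original:

import math

def modular_hash(key, table_size):
    # hash_value = key % table_size
    return key % table_size

def multiplicative_hash(key, table_size, A=0.6180339887):
    # keep the fractional part of key * A, then scale it to the table size
    return math.floor(table_size * ((key * A) % 1))

print(modular_hash(57, 10))         # 7
print(multiplicative_hash(57, 10))  # 2 for this choice of A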

applications of hashing:
(i) searching and retrieval:
provides constant time complexity O(1) for average-case search, insertion, and deletion.

(ii) data encryption:


used in cryptography to map large messages to fixed-size hash values (e.g., SHA-256).

(iii) hash-based data structures:


used in hash maps, hash sets, and hash tables to store key-value pairs efficiently.

(iv) load balancing:


helps in distributing data across servers based on the hash of the data, ensuring an even load.

(v) caching:
used to quickly retrieve frequently accessed data from a cache.

time complexity:
(i) search, insert, and delete:
- average case: O(1), assuming the hash function distributes keys uniformly and minimizes collisions.
- worst case: O(n), if all keys hash to the same index, resulting in a chain or a full probe sequence.

(ii) space complexity:


O(n) where n is the number of elements stored in the hash table.

key → hash function → location/memory address

example:
keys: 12, 18, 15, 14, 13, 29, 31, 57
m = 10

key → hash function → location 0 to 9

i want to store these keys in such a way that retrieval is O(1).

using a simple hash function:

h(k) = k mod m

h(12) = 12 mod 10 = 2
h(18) = 18 mod 10 = 8
h(15) = 15 mod 10 = 5
h(14) = 14 mod 10 = 4
h(13) = 13 mod 10 = 3
h(29) = 29 mod 10 = 9
h(31) = 31 mod 10 = 1
h(57) = 57 mod 10 = 7

resulting table:
0: (empty)
1: 31
2: 12
3: 13
4: 14
5: 15
6: (empty)
7: 57
8: 18
9: 29

problem with this approach:

example:
keys: 12, 23, 42, 83, 54, 31, 82
m = 10
- using the simple hash function:
h(k) = k mod 10

h(12) = 12 mod 10 = 2
h(23) = 23 mod 10 = 3
h(42) = 42 mod 10 = 2 → 12 and 42 map to the same location, which causes a "collision"

[figure: hash table with slots 0 to 9; both 12 and 42 point at slot 2]

- no 1-to-1 mapping
- no uniform distribution

topic: collision resolution techniques in hashing


when two keys hash to the same index in a hash table, a collision occurs. various techniques are used to resolve
collisions and ensure that keys are stored efficiently. the four main techniques for collision resolution are:
(i) linear probing
(ii) quadratic probing
(iii) double hashing
(iv) chaining

concept: linear probing


linear probing is an open addressing technique where, if a collision occurs at an index, we check the next
available index in a linear sequence (i.e., increment the index by 1).

steps:
- when a key hashes to an index i and that index is already occupied, the algorithm checks the next index (i + 1) % table_size.
- this process continues until an empty slot is found or the table is full.

pros:
- simple and easy to implement.
- good for small-sized hash tables or low load factors.

cons:
primary clustering: consecutive keys may cause primary clustering, where groups of consecutive occupied
slots form, making future lookups slower.

function for collision:

h(k, i) = (h(k) + i) mod m

h(k1) = l1
but assume an element already exists at l1, hence a collision:

h(k1, 1) = (h(k1) + 1) mod m = l1 + 1

but assume an element already exists at l1 + 1 as well, hence collision number 2, and the probe continues: l1 + 2, l1 + 3, ...

[figure: hash table slots l1, l1+1, l1+2, l1+3, ... being probed in order]

why mod m?
m = 10, slots 0 to 9

let h(k1) = 4 → collision
h(k1, 1) = (4 + 1) = 5 (already filled)
h(k1, 2) = (4 + 2) = 6 (already filled)
h(k1, 3) = (4 + 3) = 7 (already filled)
h(k1, 4) = (4 + 4) = 8 (already filled)
h(k1, 5) = (4 + 5) = 9 (already filled)
h(k1, 6) = (4 + 6) = 10 (doesn't exist)

slots 0 to 3 were empty, but we never checked them; to reach them the probe must wrap around circularly, and that is why we take mod m: (4 + 6) mod 10 = 0.
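a runnable python sketch of insertion with linear probing (my own illustration; the key list is the one from the example that follows):

def insert_linear(table, key):
    """insert key using h(k, i) = (h(k) + i) mod m; returns the slot used."""
    m = len(table)
    for i in range(m):                 # at most m probes
        slot = (key % m + i) % m       # h(k, i) = (h(k) + i) mod m
        if table[slot] is None:        # empty slot found
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")

m = 12
table = [None] * m
for key in [31, 26, 43, 27, 34, 46, 14, 58, 13]:
    insert_linear(table, key)
print(table)  # [58, 13, 26, 27, 14, None, None, 31, 43, None, 34, 46]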

example:
keys: 31, 26, 43, 27, 34, 46, 14, 58, 13
m = 12

(i) insert 31
simple hash function
h(k) = k mod m
h(31) = 31 mod 12 = 7 → slot 7

(ii) insert 26
h(26) = 26 mod 12 = 2 → slot 2

(iii) insert 43
h(43) = 43 mod 12 = 7
collision occurred
collision function
h(k, i) = (h(k) + i) mod m
h(43, 1) = (7 + 1) mod 12 = 8 → slot 8

(iv) insert 27
h(27) = 27 mod 12 = 3 → slot 3

(v) insert 34
h(34) = 34 mod 12 = 10 → slot 10

(vi) insert 46
h(46) = 46 mod 12 = 10
collision occurred
h(46, 1) = (10 + 1) mod 12 = 11 → slot 11

(vii) insert 14
h(14) = 14 mod 12 = 2
collision occurred
h(14, 1) = (2 + 1) mod 12 = 3
collision occurred again
h(14, 2) = (2 + 2) mod 12 = 4 → slot 4

(viii) insert 58
h(58) = 58 mod 12 = 10
collision occurred
h(58, 1) = (10 + 1) mod 12 = 11
collision occurred again
h(58, 2) = (10 + 2) mod 12 = 12 mod 12 = 0 → slot 0

(ix) insert 13
h(13) = 13 mod 12 = 1 → slot 1

final table:
0: 58
1: 13
2: 26
3: 27
4: 14
5: (empty)
6: (empty)
7: 31
8: 43
9: (empty)
10: 34
11: 46

direct method:
use the simple hash function; if a collision occurs, keep applying the collision function:
keys:       31  26  43  27  34  46  14  58  13
h(k):        7   2   7   3  10  10   2  10   1
1st probe:           8          11   3  11
2nd probe:                           4   0

problem:
primary clustering: consecutive keys may cause primary clustering, where groups of consecutive occupied slots form, making future lookups slower.

in the final table above, slots 10, 11, 0, 1, 2, 3, 4 form one occupied run (wrapping around the table). any new key that hashes into this run or to slot 5 probes along it and ends up in slot 5, so the probability that a new key gets slot 5 is 8/12.

concept: quadratic probing


quadratic probing is an open addressing technique where, after a collision, the next index to check is determined
by a quadratic function. this helps reduce primary clustering.
steps:
- if a collision occurs at index i, the algorithm checks the index (i + 1²) % table_size, then (i + 2²) % table_size, then (i + 3²) % table_size, and so on, until an empty slot is found.

pros:
- reduces primary clustering compared to linear probing.
- more evenly distributes keys in the table.

cons:
- still may suffer from secondary clustering (where similar keys still collide).
- requires careful handling of table size to avoid infinite loops (if the table is nearly full).
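a runnable python sketch of insertion with quadratic probing (mine, not from the notes; the keys match the corrected example worked through below):

def insert_quadratic(table, key):
    """insert key using h(k, i) = (h(k) + i**2) mod m; returns the slot used."""
    m = len(table)
    for i in range(m):                  # give up after m probes
        slot = (key % m + i * i) % m    # h(k, i) = (h(k) + i²) mod m
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("no empty slot reached; quadratic probing may cycle")

m = 11
table = [None] * m
for key in [24, 17, 32, 2, 13]:
    insert_quadratic(table, key)
print(table)  # [13, None, 24, 2, None, None, 17, None, None, None, 32]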



simple hash function
h(k) = k mod m

collision occurred
collision function (quadratic probing)
h(k, i) = (h(k) + i²) mod m

- h(k, 1) = (h(k) + 1²) mod m = l1 + 1
- h(k, 2) = (h(k) + 2²) mod m = l1 + 4

the probe positions l1 + 1, l1 + 4, l1 + 9, ... do not form a cluster of consecutive slots.

[figure: hash table slots l1 through l1 + 9 with the quadratic probe positions marked]

example:
keys: 24, 17, 32, 2, 13
m = 11

(i) insert 24
simple hash function
h(k) = k mod m
h(24) = 24 mod 11 = 2 → slot 2

(ii) insert 17
h(17) = 17 mod 11 = 6 → slot 6

(iii) insert 32
h(32) = 32 mod 11 = 10 → slot 10

(iv) insert 2
h(2) = 2 mod 11 = 2
collision occurred
collision function
h(k, i) = (h(k) + i²) mod m
h(2, 1) = (2 + 1²) mod 11 = 3 mod 11 = 3 → slot 3

(v) insert 13
h(13) = 13 mod 11 = 2
collision occurred
h(13, 1) = (2 + 1²) mod 11 = 3 → still occupied (by 2)
h(13, 2) = (2 + 2²) mod 11 = 6 → still occupied (by 17)
h(13, 3) = (2 + 3²) mod 11 = 11 mod 11 = 0 → slot 0

final table:
0: 13
1: (empty)
2: 24
3: 2
4: (empty)
5: (empty)
6: 17
7: (empty)
8: (empty)
9: (empty)
10: 32

problem: secondary clustering (where keys that hash to the same slot still follow the same probe sequence).

keys: 24, 2, 13
m = 11

(i) insert 24: h(24) = 24 mod 11 = 2
(ii) insert 2: h(2) = 2 mod 11 = 2
(iii) insert 13: h(13) = 13 mod 11 = 2

all three keys hash to slot 2, and since h(k, i) = (h(k) + i²) mod m depends only on h(k), every one of them generates the identical probe sequence:

collision(i), i = 1: (2 + 1²) mod 11 = 3
collision(ii), i = 2: (2 + 2²) mod 11 = 6
collision(iii), i = 3: (2 + 3²) mod 11 = 0

keys that are hashed to the same location follow the same resolution path, because of which we are not able to utilize the table efficiently.

in spite of almost 50% of the space being available, we may not be able to insert a new element.

concept: double hashing


double hashing is an advanced open-addressing technique used to handle collisions in a hash table. it uses
two hash functions to compute probe sequences, reducing clustering and improving performance.

why use double hashing?


- avoids clustering: unlike linear probing and quadratic probing, double hashing provides a better spread of
values.
- efficient collision resolution: it ensures that probe sequences do not follow the same pattern, reducing
collisions.
- good for large hash tables: works well when the table size m is a prime number, ensuring better
distribution.

simple hash function

h(k) = k mod m

collision occurred
collision function (double hashing)
h(k, i) = (h(k) + i · h'(k)) mod m

h'(k) = 7 - (k mod 7)
- h'(k) never generates 0.
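a runnable python sketch of insertion with double hashing, using the h'(k) = 7 - (k mod 7) secondary function from above (the key list matches the example that follows):

def h2(key):
    # secondary hash: h'(k) = 7 - (k mod 7); never 0, so the probe always advances
    return 7 - (key % 7)

def insert_double(table, key):
    """insert key using h(k, i) = (h(k) + i * h'(k)) mod m; returns the slot used."""
    m = len(table)
    for i in range(m):
        slot = (key % m + i * h2(key)) % m
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("no empty slot found")

m = 11
table = [None] * m
for key in [13, 17, 21, 2]:   # 57 is listed in the example but not worked through
    insert_double(table, key)
print(table)  # [None, None, 13, None, None, None, 17, 2, None, None, 21]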



example:
keys: 13, 17, 21, 2, 57
m = 11

(i) insert 13
simple hash function
h(k) = k mod m
h(13) = 13 mod 11 = 2 → slot 2

(ii) insert 17
h(17) = 17 mod 11 = 6 → slot 6

(iii) insert 21
h(21) = 21 mod 11 = 10 → slot 10

(iv) insert 2
h(2) = 2 mod 11 = 2
collision occurred
collision function
h(k, i) = (h(k) + i · h'(k)) mod m
h'(2) = 7 - (2 mod 7) = 7 - 2 = 5
h(2, 1) = (h(2) + 1 · h'(2)) mod 11 = (2 + 5) mod 11 = 7 → slot 7

problem:
computational overhead, because two hash functions must be evaluated.

concept: load factor


load factor = (n/m)
n: number of keys
m: table size
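for example, after the linear probing example above inserts its 9 keys into a table of size m = 12, the load factor is 9/12 = 0.75. as a rule of thumb (my note, not from the original), open-addressing tables are usually resized once the load factor crosses roughly 0.7, because probe sequences lengthen quickly beyond that point.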
