
UNIT-V

GRAPHS AND HASHING


GRAPH DEFINITION
• A graph G= (V, E) consists of a set of vertices V and set of edges E.
• Vertices are often also called nodes.
• Elements of E are called edges, or directed edges, or arcs.
• Each edge is a pair (v, w) where v, w ∈ V

Here v1,v2,v3,v4 are the vertices and (v1,v2), (v2,v3), (v3,v4), (v4,v1),
(v2,v4),(v1,v3) are edges.
Basic Terminologies

Directed Graph
• A directed graph, or digraph, is a graph which consists of directed edges
where each edge in E is unidirectional.
• If (v, w) is a directed edge then (v, w) ≠ (w, v)
• For directed edge (v, w) in E, v is its tail and w its head;
• (v, w) is represented in the diagrams as the arrow, v -> w.
Undirected Graph

• An undirected graph is a graph which consists of undirected edges.
• If (v, w) is an undirected edge then (v, w) = (w, v)
Weighted Graph

• A graph is said to be a weighted graph if every edge in the graph is assigned a weight or value. It can be a directed or an undirected graph.
Complete Graph

• A complete graph is a graph in which there is an edge between every pair of vertices. A complete graph with n vertices will have n(n-1)/2 edges.
Strongly connected Graph
• If there is a path from every vertex to every other vertex in a directed graph then it is said to be a strongly connected graph; otherwise it is said to be a weakly connected graph.
Path
• A path in a graph is a sequence of vertices w1, w2, w3, ..., wN such that (wi, wi+1) is an edge for 1 ≤ i < N. Referring to the weakly connected graph, the path from v1 to v3 is v1, v2, v3.
Length
• The length of the path is the number of edges on the path, which is equal to N-1
where N represents the number of vertices. The length of the above path v1 to v3
is 2. (i.e) (v1,v2), (v2,v3).
• If there is a path from a vertex to itself with no edges then the path length is 0.
Loop
• If the graph contains an edge (v, v) from a vertex to itself, then that edge is referred to as a loop.
Simple path
• A simple path is a path in which all vertices, except possibly the first and the last, are distinct.
• A simple cycle is a simple path of length at least one that begins and ends at the same vertex.
Cycle
• A cycle in a graph is a path in which first and last vertex are the same
Degree
• The number of edges incident on a vertex determines its degree. The degree of a vertex v is written as degree(v). The indegree of a vertex v is the number of edges entering vertex v. Similarly, the outdegree of a vertex v is the number of edges leaving vertex v.
Acyclic Graph
• A directed graph which has no cycles is referred to as
acyclic graph. It is abbreviated as DAG (Directed Acyclic
Graph)
Representation of Graph
• Graph can be represented by Adjacency Matrix and Adjacency List.
• Adjacency Matrix Representation
• The adjacency matrix A for a graph G = (V, E) with n vertices is an n x n matrix, such that
• Aij = 1, if there is an edge from Vi to Vj
• Aij = 0, if there is no edge
Pros:
• Simple to implement
• Easy and fast to tell if a pair (i,j) is an edge:
simply check if A[i][j] is 1 or 0
Cons:
• No matter how few edges the graph has, the matrix takes O(n²) memory.
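As a rough illustration, the C sketch below stores the 4-vertex example graph from the start of this unit (v1..v4 mapped to indices 0..3) in an adjacency matrix; the names addEdgeUndirected and isEdge are assumptions for illustration, not part of these notes.

#include <stdio.h>

#define MAX_VERTICES 4

/* adj[i][j] = 1 if there is an edge between vertex i and vertex j, else 0 */
int adj[MAX_VERTICES][MAX_VERTICES];

void addEdgeUndirected(int u, int v)
{
    adj[u][v] = 1;
    adj[v][u] = 1;   /* undirected: store the edge both ways */
}

int isEdge(int u, int v)
{
    return adj[u][v];   /* O(1) lookup */
}

int main(void)
{
    /* edges of the earlier example: (v1,v2), (v2,v3), (v3,v4), (v4,v1), (v2,v4), (v1,v3) */
    addEdgeUndirected(0, 1);
    addEdgeUndirected(1, 2);
    addEdgeUndirected(2, 3);
    addEdgeUndirected(3, 0);
    addEdgeUndirected(1, 3);
    addEdgeUndirected(0, 2);

    printf("Edge (v1,v3)? %d\n", isEdge(0, 2));   /* prints 1 */
    printf("Edge (v2,v2)? %d\n", isEdge(1, 1));   /* prints 0 */
    return 0;
}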
Adjacency List Representation

• In this representation we store the graph as a linked structure. We store all the vertices in a list, and for each vertex we keep a linked list of its adjacent vertices.

[Figure: adjacency lists for a 4-vertex graph - each vertex 0-3 points to a linked list of the vertices adjacent to it.]
Pros:
• Saves on space (memory): the representation takes as many memory words as there are nodes and edges.
Cons:
• It can take up to O(n) time to determine if a pair of nodes (i,j) is
an edge: one would have to search the linked list L[i], which
takes time proportional to the length of L[i].
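A comparable C sketch of the adjacency list representation, again with assumed names (AdjNode, addNeighbour, isEdge) and a small assumed graph; note that the membership test now walks a list instead of indexing a matrix.

#include <stdio.h>
#include <stdlib.h>

#define NUM_VERTICES 4

/* one node in a vertex's linked list of neighbours */
typedef struct AdjNode {
    int vertex;
    struct AdjNode *next;
} AdjNode;

AdjNode *adjList[NUM_VERTICES];   /* adjList[v] is the head of v's neighbour list */

/* prepend w to v's adjacency list */
void addNeighbour(int v, int w)
{
    AdjNode *node = malloc(sizeof(AdjNode));
    node->vertex = w;
    node->next = adjList[v];
    adjList[v] = node;
}

void addEdgeUndirected(int v, int w)
{
    addNeighbour(v, w);
    addNeighbour(w, v);
}

/* membership test: proportional to the length of the list L[v] */
int isEdge(int v, int w)
{
    for (AdjNode *p = adjList[v]; p != NULL; p = p->next)
        if (p->vertex == w)
            return 1;
    return 0;
}

int main(void)
{
    addEdgeUndirected(0, 1);
    addEdgeUndirected(1, 2);
    addEdgeUndirected(2, 3);
    addEdgeUndirected(3, 0);

    printf("Edge (0,3)? %d\n", isEdge(0, 3));   /* 1 */
    printf("Edge (0,2)? %d\n", isEdge(0, 2));   /* 0 */
    return 0;
}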
Topological sort
Introduction
• Linear ordering of vertices in a directed acyclic graph such that if there is a path from Vi to Vj, then Vi appears before Vj in the ordering.
• Not possible if there is a cycle.
Steps
• Find the indegree of every vertex.
• Place the vertices whose indegree is 0 on an initially empty queue.
• Dequeue a vertex v and decrement the indegrees of all its adjacent vertices.
• Enqueue a vertex if its indegree falls to zero.
• Repeat from step 3 until the queue becomes empty.
• The topological ordering is the order in which the vertices are dequeued.
Routine to perform topological sort

void Topsort(Graph G)
{
    Queue Q;
    int counter = 0;
    Vertex v, w;

    Q = CreateQueue(NumVertex);
    MakeEmpty(Q);
    for each vertex v
        if (Indegree[v] == 0)
            Enqueue(v, Q);
    while (!IsEmpty(Q))
    {
        v = Dequeue(Q);
        TopNum[v] = ++counter;
        for each w adjacent to v
            if (--Indegree[w] == 0)
                Enqueue(w, Q);
    }
    if (counter != NumVertex)
        Error("graph has a cycle");
    DisposeQueue(Q);
}
Example
Indegree of each vertex at steps 1-4:

Vertex  Step 1  Step 2  Step 3  Step 4
a       0       0       0       0
b       1       0       0       0
c       2       1       0       0
d       2       2       1       0

Enqueue order: a, b, c, d
Dequeue order (topological order): a, b, c, d
GRAPH TRAVERSAL
• A graph traversal is a systematic way of visiting the nodes in a specific order.
There are two types of graph traversal namely
• Breadth First Search
• Depth First Search
Breadth First Search(BFS)
• BFS of a graph G starts from an unvisited vertex u.
• BFS uses a queue data structure to keep track of the order of nodes whose
adjacent nodes are to be visited.
Steps
• Choose any node in the graph, designate it as source node and mark it as
visited.
• Using the adjacency matrix of the graph find all the unvisited adjacent nodes to
the search node and enqueue them into the queue Q
• Then the node is dequeued from the queue. Mark that node as visited and
designate it as the new search node.
• Repeat step 2 and 3 using the new search node.
• This process continues until the queue Q which keeps track of the adjacent
nodes is empty.
• hackerearth.com/practice/algorithms/graphs/breadth-first-search/visualize/
Routine
BFS(node)
{
    enqueue(node)
    visited[node] = true
    while queue not empty
        v = dequeue()
        print v
        for each child c of v
            if not visited[c]
                enqueue(c)
                visited[c] = true
}
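A minimal runnable C version of the routine above; it assumes the adjacency matrix of the four-vertex example discussed below (A, B, C, D with edges A-B, A-C, B-C, B-D and C-D) and a simple array-based queue.

#include <stdio.h>

#define N 4   /* vertices A=0, B=1, C=2, D=3 */

/* assumed adjacency matrix for the example graph */
int adj[N][N] = {
    {0, 1, 1, 0},
    {1, 0, 1, 1},
    {1, 1, 0, 1},
    {0, 1, 1, 0}
};

void bfs(int source)
{
    int visited[N] = {0};
    int queue[N], front = 0, rear = 0;

    queue[rear++] = source;       /* enqueue the source and mark it visited */
    visited[source] = 1;

    while (front < rear) {
        int v = queue[front++];   /* dequeue */
        printf("%c ", 'A' + v);

        for (int w = 0; w < N; w++)          /* scan row v of the matrix */
            if (adj[v][w] && !visited[w]) {
                queue[rear++] = w;           /* enqueue unvisited neighbour */
                visited[w] = 1;
            }
    }
    printf("\n");
}

int main(void)
{
    bfs(0);   /* expected order for this graph: A B C D */
    return 0;
}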
Example

• Let A be the source vertex. Mark it as visited.


• Find the adjacent unvisited vertices of A and enqueue them into the queue. Here B and C are adjacent nodes of A, so B and C are enqueued.
• Then the vertex B is dequeued and its adjacent vertices C and D are taken from the adjacency matrix for enqueuing. Since vertex C is already in the queue, vertex D alone is enqueued.

Here B is dequeued and D is enqueued


• Then the vertex C is dequeued and its adjacent vertices A, B and D are found out. Since
the vertices A and B are already visited and vertex D is also in the queue, no enqueue
operation takes place.
• The Vertex D is dequeued.
Application of BFS
• GPS Navigation systems
• Computer Networks
• Facebook
Depth First Search (DFS)
• DFS works by selecting one vertex V of G as a start vertex; V is marked visited. Then each unvisited
vertex adjacent to V is searched in turn using DFS recursively.
Steps
• Choose any node in the graph. Designate it as the search node and mark it as
visited.
• Using the adjacency matrix of the graph, find a node adjacent to the search node that has not been visited yet. Designate this as the new search node and mark it as visited.
• Repeat step 2 using the new search node. If no nodes satisfying (2) can be found
return to the previous search node and continue from there.
• When a return to the previous search node in (3) is impossible, the search from
the originally chosen search node is complete.
• If the graph still contains unvisited nodes, choose any node that has not been visited and repeat step
1 through 4.
Routine
DFS(node)
{
    push(node)
    visited[node] = true
    while stack not empty
        v = pop()
        print v
        for each child c of v
            if not visited[c]
                push(c)
                visited[c] = true
}
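A minimal runnable C version of DFS, written recursively (the call stack plays the role of the explicit stack above); the four-vertex graph is the same assumed example used in the BFS sketch.

#include <stdio.h>

#define N 4   /* vertices A=0, B=1, C=2, D=3 */

int adj[N][N] = {
    {0, 1, 1, 0},
    {1, 0, 1, 1},
    {1, 1, 0, 1},
    {0, 1, 1, 0}
};

int visited[N];

/* visit v, then recursively visit each unvisited neighbour in turn */
void dfs(int v)
{
    visited[v] = 1;
    printf("%c ", 'A' + v);

    for (int w = 0; w < N; w++)
        if (adj[v][w] && !visited[w])
            dfs(w);
}

int main(void)
{
    dfs(0);       /* one possible order for this graph: A B C D */
    printf("\n");
    return 0;
}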
Example
Applications of DFS
• Detecting cycle in a graph
• Path Finding
• Solving puzzles with only one solution, such as mazes
Exercise
Find BFS and DFS Traversal ordering of nodes for the following
graph.
Biconnectivity

• A connected undirected graph is biconnected if there are no vertices whose removal disconnects the
rest of the graph.
• Articulation Points

• The vertices whose removal would disconnect the graph are known as articulation points
• Hence the removal of vertex C will disconnect G from the graph. Similarly, the removal of vertex D will disconnect E and F from the graph. Therefore C and D are articulation points.
MINIMUM SPANNING TREES
• A spanning tree of a graph is just a subgraph that contains all the vertices and is a tree.
• A graph may have many spanning trees
• On a connected graph G=(V, E), a spanning tree:
• is a connected subgraph
• acyclic
• is a tree (|E| = |V| - 1)
• contains all vertices
• A Minimum Spanning Tree (MST) is a subgraph of an undirected graph such that the
subgraph spans (includes) all nodes, is connected, is acyclic, and has minimum total edge
weight
• For an edge-weighted, connected, undirected graph G, the total cost of G is the sum of the weights on all its edges.
• A minimum-cost spanning tree for G is a minimum spanning tree of G that has the least total
cost.
• Has 16 spanning trees. Some are:
There are two algorithms to find the minimum spanning tree
• Prim’s Algorithm
• Kruskal’s Algorithm
Minimum Spanning Trees-Prim’s Algorithm

• This is one of the ways to compute a minimum spanning tree, using a greedy technique.
• This algorithm begins with a set U initialized to {1}.
• It then grows a spanning tree, one edge at a time.
• At each step it finds a shortest edge (u, v) such that the cost of (u, v) is the smallest among all edges with u in the spanning tree built so far and v outside it.
Routine
void Prims(Table T)
{
    Vertex v, w;
    int i;

    for (i = 0; i < NumVertex; i++)
    {
        T[i].known = false;
        T[i].Dist = INFINITY;
        T[i].Path = 0;
    }
    T[Start].Dist = 0;   /* the start vertex has distance 0 */
    for (; ;)
    {
        v = unknown vertex with the smallest Dist;
        if (v == NOT_A_VERTEX)
            break;
        T[v].known = true;
        for each w adjacent to v
            if (!T[w].known)
            {
                T[w].Dist = Min(T[w].Dist, C(v, w));
                T[w].Path = v;
            }
    }
}
Example
Step 3:
• Next, vertex d with the minimum distance is marked as visited and the distances of its unknown adjacent vertices are updated.
T[b].Dist = Min(T[b].Dist, C(d, b)) = Min(2, 2) = 2
T[c].Dist = Min(T[c].Dist, C(d, c)) = Min(3, 1) = 1
Minimum Spanning Tree-Kruskal’s Algorithm
• This uses a greedy technique to compute a minimum spanning tree.
• The algorithm selects edges in order of smallest weight and accepts an edge if it does not cause a cycle.
• The algorithm terminates when enough edges have been accepted.
• Initially there are |V| single-node trees.
• Adding an edge merges two trees into one.
• When the algorithm terminates, there is only one tree, which is the minimum spanning tree.
• The algorithm uses two disjoint-set operations, namely Find and Union.
• Find(u) returns the root of the tree that contains the vertex u.
• Union(u, v) merges two trees by making the root pointer of one point to the root of the other tree.
Routine
MST-KRUSKAL(G, w)
A ← Ø
for each vertex v ∈ V[G]
    do MAKE-SET(v)
sort the edges of E into nondecreasing order by weight w
for each edge (u, v) ∈ E, taken in nondecreasing order by weight
    do if FIND-SET(u) ≠ FIND-SET(v)
        then A ← A ∪ {(u, v)}
            UNION(u, v)
return A
// Routine for Find
SetType Find(Vertex u, DisjointSet S)
{
    if (S[u] <= 0)
        return u;
    else
        return Find(S[u], S);
}
// Routine for Union
void SetUnion(DisjointSet S, SetType Uset, SetType Vset)
{
    S[Vset] = Uset;
}
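The sketch below ties the Find and Union ideas above into a small runnable C version of Kruskal's algorithm. The five-edge example graph, the array-based disjoint set (parent[i] < 0 marks a root, matching the S[u] <= 0 test above) and the helper names are assumptions for illustration, not part of these notes.

#include <stdio.h>
#include <stdlib.h>

#define NUM_VERTICES 4

typedef struct { int u, v, weight; } Edge;

int parent[NUM_VERTICES];   /* parent[i] < 0 means vertex i is a root */

int find(int u)
{
    return parent[u] < 0 ? u : find(parent[u]);
}

void setUnion(int uRoot, int vRoot)
{
    parent[vRoot] = uRoot;   /* point one root at the other */
}

int byWeight(const void *a, const void *b)
{
    return ((const Edge *)a)->weight - ((const Edge *)b)->weight;
}

int main(void)
{
    /* assumed example graph */
    Edge edges[] = {
        {0, 1, 4}, {0, 2, 3}, {1, 2, 1}, {1, 3, 2}, {2, 3, 5}
    };
    int numEdges = sizeof edges / sizeof edges[0];
    int accepted = 0;

    for (int i = 0; i < NUM_VERTICES; i++)
        parent[i] = -1;                                /* |V| single-node trees */

    qsort(edges, numEdges, sizeof(Edge), byWeight);    /* nondecreasing weight */

    for (int i = 0; i < numEdges && accepted < NUM_VERTICES - 1; i++) {
        int ru = find(edges[i].u), rv = find(edges[i].v);
        if (ru != rv) {                                /* different trees: no cycle */
            setUnion(ru, rv);
            accepted++;
            printf("accept (%d,%d) weight %d\n",
                   edges[i].u, edges[i].v, edges[i].weight);
        }
    }
    return 0;
}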
Example
Exercise
Find minimum spanning tree for the following graph using Prim’s
and Kruskal’s Algorithms
Shortest Path Algorithm
Dijkstra’s Algorithm(Single Source Shortest
Path)
• The general method to solve the single source shortest path problem is
known as Dijkstra’s algorithm. This is applied to the weighted graph G.
• This is a prime example of the greedy technique, which generally solves the problem in stages, doing what appears to be the best thing at each stage.
• This algorithm proceeds in stages, just like the unweighted shortest
path algorithm
• At each stage it selects a vertex v which has the smallest d_v among all the unknown vertices, declares that to be the shortest path from s to v, and marks v as known.
• For each vertex w adjacent to v, we should set d_w = d_v + C(v, w) whenever this improves the current d_w.
Routine
void dijkstra( Graph G, Table T)
{
int i;
Vertex v,w;
ReadGraph (G,T);
for (i=0; i< NumVertex; i++)
{
T[i].known = false;
T[i].Dist = INFINITY;
T[i].Path=0;
}
T[Start].Dist = 0;
for (; ;)
{
    v = smallest unknown distance vertex;
    if (v == NOT_A_VERTEX)
        break;
    T[v].known = true;
    for each w adjacent to v
        if (!T[w].known)
        {
            T[w].Dist = Min(T[w].Dist, T[v].Dist + C(v, w));
            T[w].Path = v;
        }
}
}
Example
Hashing
A procedure/technique to insert and retrieve elements in a table (hash table) in almost constant time.
• A hash table is a data structure which stores data in an associative manner. In a hash table, data is stored in an array format where each data value has its own unique index value. Access of data becomes very fast if we know the index of the desired data.
• Thus, it becomes a data structure in which insertion and search operations are very fast irrespective of the size of the data. A hash table uses an array as its storage medium and uses a hashing technique to generate the index where an element is to be inserted or located.
Implementation of hash tables
 Key, Hash Function and Hash Table
[Figure: hashing - a key is passed through the hash function to obtain an index into a hash table with slots 0-9.]
Hash Function
A hash function h transforms a key into an index in a hash table
T[0…m-1]:
h : U → {0, 1, . . . , m - 1}
A hash function transforms a key into a table address
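A minimal C sketch of such a division-method hash function; the table size of 10 is an assumption chosen to match the examples later in this unit.

#include <stdio.h>

#define TABLE_SIZE 10

/* division-method hash: maps any integer key into 0..TABLE_SIZE-1 */
int hash(int key)
{
    return key % TABLE_SIZE;
}

int main(void)
{
    printf("h(83) = %d\n", hash(83));   /* 3 */
    printf("h(14) = %d\n", hash(14));   /* 4 */
    printf("h(74) = %d\n", hash(74));   /* 4 -> collides with 14 */
    return 0;
}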
Collision
 Different keys may map to the same location.
• The hash function is not one-to-one => collision.
• When the hash function maps two keys to the same position in the hash table, a collision occurs.
Collision Avoidance Techniques
• Separate Chaining(Open hashing)
• Open Addressing(Closed hashing)
1. Linear Probing
2. Quadratic Probing
3. Double Hashing
• Rehashing
• Extendible hashing
Separate Chaining (Open Hashing)
 Each table entry stores a list of items.
 A pointer field is added to each record; when overflow occurs this pointer is set to point to overflow blocks, making a linked list.
Adv:
• More elements can be inserted, since each chain can grow as needed.
Disadv:
• Requires pointers, which occupy more memory space.
• Takes more effort to perform a search.
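A minimal C sketch of separate chaining under these assumptions (table size 10, integer keys, insertion at the head of each chain); it uses the same keys as the example that follows.

#include <stdio.h>
#include <stdlib.h>

#define TABLE_SIZE 10

typedef struct Node {
    int key;
    struct Node *next;
} Node;

Node *table[TABLE_SIZE];          /* each slot holds the head of a linked list */

int hash(int key) { return key % TABLE_SIZE; }

/* insert by prepending to the chain at the hashed slot */
void insert(int key)
{
    int idx = hash(key);
    Node *node = malloc(sizeof(Node));
    node->key = key;
    node->next = table[idx];
    table[idx] = node;
}

/* search walks the chain at the hashed slot */
int search(int key)
{
    for (Node *p = table[hash(key)]; p != NULL; p = p->next)
        if (p->key == key)
            return 1;
    return 0;
}

int main(void)
{
    int keys[] = {83, 14, 29, 10, 74, 36, 96, 67};
    for (int i = 0; i < 8; i++)
        insert(keys[i]);

    printf("74 found? %d\n", search(74));   /* 1: chained in slot 4 with 14 */
    printf("55 found? %d\n", search(55));   /* 0 */
    return 0;
}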
Example
• Table Size: 10
• Hash Function: H(k) = k mod Table Size
• Insert the keys 83, 14, 29, 10, 74, 36, 96, 67

H(83) = 83 % 10 = 3
H(14) = 14 % 10 = 4
H(29) = 29 % 10 = 9
H(10) = 10 % 10 = 0
H(74) = 74 % 10 = 4 (collides with 14, chained in slot 4)
H(36) = 36 % 10 = 6
H(96) = 96 % 10 = 6 (collides with 36, chained in slot 6)
H(67) = 67 % 10 = 7

Resulting hash table (index : chain):
0 : 10
3 : 83
4 : 14 -> 74
6 : 36 -> 96
7 : 67
9 : 29
Open Addressing(Closed hashing)
• Alternate to resolve collision with linked lists.
• Three techniques are
Linear Probing
Quadratic Probing
Double Hashing
Linear Probing
• Hash Function: h_i(k) = (h(k) + i) mod TableSize
Probe 0: h(k) mod TableSize
Probe 1: (h(k) + 1) mod TableSize
Probe 2: (h(k) + 2) mod TableSize
Probe 3: (h(k) + 3) mod TableSize
Probe i: (h(k) + i) mod TableSize
Adv:
• Time is not required for allocating new cells
• Does not require pointers
Disadv:
• Forms clusters which degrades the performance of the hash table for storing and
retrieving data.
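A minimal C sketch of insertion with linear probing, assuming a table of size 10 and integer keys; it reproduces the probe sequence of the example that follows.

#include <stdio.h>

#define TABLE_SIZE 10
#define EMPTY -1

int table[TABLE_SIZE];

int hash(int key) { return key % TABLE_SIZE; }

/* probe slots h(k), h(k)+1, h(k)+2, ... until an empty slot is found */
int insert(int key)
{
    for (int i = 0; i < TABLE_SIZE; i++) {
        int idx = (hash(key) + i) % TABLE_SIZE;
        if (table[idx] == EMPTY) {
            table[idx] = key;
            return idx;
        }
    }
    return -1;   /* table full */
}

int main(void)
{
    int keys[] = {14, 83, 36, 10, 29, 74, 96, 66};
    for (int i = 0; i < TABLE_SIZE; i++)
        table[i] = EMPTY;

    for (int i = 0; i < 8; i++)
        printf("%d -> slot %d\n", keys[i], insert(keys[i]));
    /* 74 lands in slot 5, 96 in slot 7, 66 in slot 8, matching the example below */
    return 0;
}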
• Insert the keys 14, 83, 36, 10, 29, 74, 96, 66

H(14) = 14 % 10 = 4
H(83) = 83 % 10 = 3
H(36) = 36 % 10 = 6
H(10) = 10 % 10 = 0
H(29) = 29 % 10 = 9
H(74) = 74 % 10 = 4 (already occupied); probe 1: (4 + 1) % 10 = 5
H(96) = 96 % 10 = 6 (already occupied); probe 1: (6 + 1) % 10 = 7
H(66) = 66 % 10 = 6 (already occupied); probe 1: 7 (already occupied); probe 2: (6 + 2) % 10 = 8

Resulting hash table (index : key):
0 : 10
3 : 83
4 : 14
5 : 74
6 : 36
7 : 96
8 : 66
9 : 29
Quadratic Probing
• Hash Function: h_i(k) = (h(k) + i²) mod TableSize
Probe Sequence:
Probe 0: h(k) mod TableSize
Probe 1: (h(k) + 1) mod TableSize
Probe 2: (h(k) + 4) mod TableSize
Probe 3: (h(k) + 9) mod TableSize
Probe i: (h(k) + i²) mod TableSize
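A small C sketch of the quadratic probe computation under these assumptions (table size 10, integer keys); it reproduces the probes for key 66 used in the example that follows.

#include <stdio.h>

#define TABLE_SIZE 10

/* slot examined on the i-th quadratic probe for a key */
int quadraticProbe(int key, int i)
{
    return (key % TABLE_SIZE + i * i) % TABLE_SIZE;
}

int main(void)
{
    /* key 66 collides at slots 6, 7 and 0, then lands in slot 5 on probe 3 */
    for (int i = 0; i <= 3; i++)
        printf("probe %d: slot %d\n", i, quadraticProbe(66, i));
    return 0;
}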
• Insert the keys 14, 83, 36, 10, 29, 96, 66

H(14) = 14 % 10 = 4
H(83) = 83 % 10 = 3
H(36) = 36 % 10 = 6
H(10) = 10 % 10 = 0
H(29) = 29 % 10 = 9
H(96) = 96 % 10 = 6 (already occupied); probe 1: (6 + 1) % 10 = 7
H(66) = 66 % 10 = 6 (already occupied); probe 1: (6 + 1) % 10 = 7 (already occupied);
        probe 2: (6 + 4) % 10 = 0 (already occupied); probe 3: (6 + 9) % 10 = 5

Resulting hash table (index : key):
0 : 10
3 : 83
4 : 14
5 : 66
6 : 36
7 : 96
9 : 29
Double Hashing
• Hash Function: h_i(k) = (h(k) + i*hash2(k)) mod TableSize
hash2(k) = R - (k mod R), where hash2 is a second hash function
R is a prime smaller than the table size
Probe Sequence:
Probe 0: h(k) mod TableSize
Probe 1: (h(k) + 1*hash2(k)) mod TableSize
Probe 2: (h(k) + 2*hash2(k)) mod TableSize
Probe 3: (h(k) + 3*hash2(k)) mod TableSize
Probe i: (h(k) + i*hash2(k)) mod TableSize
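A small C sketch of the double-hashing probe computation, assuming table size 7 and R = 5 as in the example that follows.

#include <stdio.h>

#define TABLE_SIZE 7
#define R 5          /* prime smaller than the table size */

int hash1(int key) { return key % TABLE_SIZE; }
int hash2(int key) { return R - (key % R); }   /* step size; never zero */

/* slot examined on the i-th probe */
int doubleHashProbe(int key, int i)
{
    return (hash1(key) + i * hash2(key)) % TABLE_SIZE;
}

int main(void)
{
    /* key 55: slot 6 is occupied in the example, so probe 1 gives slot 4 */
    printf("probe 0: slot %d\n", doubleHashProbe(55, 0));   /* 6 */
    printf("probe 1: slot %d\n", doubleHashProbe(55, 1));   /* 4 */
    return 0;
}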
• Table Size: 7, R = 5
• Insert the keys 76, 93, 40, 47, 10, 55

H(76) = 76 % 7 = 6
H(93) = 93 % 7 = 2
H(40) = 40 % 7 = 5
H(47) = 47 % 7 = 5 (already occupied)
        hash2(47) = 5 - (47 mod 5) = 5 - 2 = 3
        probe 1: (5 + 1*3) mod 7 = 1
H(10) = 10 % 7 = 3
H(55) = 55 % 7 = 6 (already occupied)
        hash2(55) = 5 - (55 mod 5) = 5 - 0 = 5
        probe 1: (6 + 1*5) mod 7 = 4

Resulting hash table (index : key):
1 : 47
2 : 93
3 : 10
4 : 55
5 : 40
6 : 76
Rehashing
• Rehash as soon as the table is half full, or
• Rehash only when an insertion fails.
• Rehashing builds a new table that is about twice as big and scans down the entire original hash table, reinserting every element into the new table.
• Uses a linear probing hash function.
Adv:
• The programmer does not have to worry about the table size.
• Simple to implement.
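A rough C sketch of rehashing under the "half full" policy with linear probing; the nextPrime helper and the exact load-factor check are assumptions for illustration. It reproduces the example that follows (a table of size 7 grows to 17 when 13 is inserted).

#include <stdio.h>
#include <stdlib.h>

#define EMPTY -1

int *table;
int tableSize = 7;
int numElements = 0;

void insert(int key);   /* forward declaration: rehash reinserts via insert */

/* next prime at or above n (simple trial-division check) */
int nextPrime(int n)
{
    for (;; n++) {
        int prime = 1;
        for (int d = 2; d * d <= n; d++)
            if (n % d == 0) { prime = 0; break; }
        if (prime) return n;
    }
}

/* build a table about twice as big and reinsert every old element */
void rehash(void)
{
    int oldSize = tableSize;
    int *oldTable = table;

    tableSize = nextPrime(2 * oldSize);        /* 2*7 = 14 -> next prime 17 */
    table = malloc(tableSize * sizeof(int));
    for (int i = 0; i < tableSize; i++) table[i] = EMPTY;
    numElements = 0;

    for (int i = 0; i < oldSize; i++)          /* scan the whole old table */
        if (oldTable[i] != EMPTY)
            insert(oldTable[i]);
    free(oldTable);
}

void insert(int key)
{
    if (2 * numElements > tableSize)           /* table is more than half full */
        rehash();
    for (int i = 0; i < tableSize; i++) {      /* linear probing */
        int idx = (key % tableSize + i) % tableSize;
        if (table[idx] == EMPTY) { table[idx] = key; numElements++; return; }
    }
}

int main(void)
{
    int keys[] = {6, 15, 23, 24, 13};
    table = malloc(tableSize * sizeof(int));
    for (int i = 0; i < tableSize; i++) table[i] = EMPTY;

    for (int i = 0; i < 5; i++)
        insert(keys[i]);
    printf("table size after inserts: %d\n", tableSize);   /* 17 */
    return 0;
}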
• Table Size: 7
• Insert the keys 6, 15, 23, 24, 13

H(6) = 6 % 7 = 6
H(15) = 15 % 7 = 1
H(23) = 23 % 7 = 2
H(24) = 24 % 7 = 3

Hash table before rehashing (index : key):
1 : 15
2 : 23
3 : 24
6 : 6

When 13 is to be inserted the table will be too full, hence rehash to a larger table size, i.e.
New table size = 2 * old table size
2 * 7 = 14; choose the next prime, i.e. 17
