Unit 4 (Graph)
What is a Graph?
A graph is an ordered pair G = (V, E) comprising a set V of vertices or nodes and a collection of
pairs of vertices from V, known as edges of a graph. For example, for the graph below.
V = { 1, 2, 3, 4, 5, 6 }
E = { (1, 4), (1, 6), (2, 6), (4, 5), (5, 6) }
Graphs are non-linear data structures that are made up of a set of nodes (or vertices),
connected by edges (or arcs). Nodes are entities where the data is stored and their
relationships are expressed using edges. Edges may be directed or undirected. Graphs
demonstrate complicated relationships with ease and are used to solve many real-life
problems.
For example, Facebook uses a graph data structure that consists of a group of entities and
their relationships. On Facebook, every user, photo, post, page, place, etc. that has data is
represented with a node. Every edge from one node to another represents their relationships,
friendships, ownerships, tags, etc. Whenever a user posts a photo, comments on a post, etc., a
new edge is created for that relationship. Both nodes and edges have meta-data associated
with them.
Term      Description
Vertex    Every individual data element is called a vertex or a node. In the above
          image, A, B, C, D & E are the vertices.
1. Finite Graph
The graph G=(V, E) is called a finite graph if the number of vertices and edges in the graph is
finite.
2. Infinite Graph
The graph G=(V, E) is called an infinite graph if the number of vertices or edges in the graph is
infinite.
4. Simple Graph
If each pair of nodes or vertices in a graph G=(V, E) is joined by at most one edge and no vertex
has a self-loop, it is a simple graph. As a result, there is at most one edge linking any two
vertices, depicting one-to-one interactions between two elements.
5. Multi Graph
If there are numerous edges between a pair of vertices in a graph G= (V, E), the graph is
referred to as a multigraph. There are no self-loops in a Multigraph.
6. Null Graph
It generalizes the trivial graph (a single vertex with no edges). If a graph G= (V, E) has
several vertices but no edges connecting them, it is a null graph.
A simple graph G= (V, E) with n vertices is complete if every pair of distinct vertices is
connected by an edge. It's also known as a full graph, and each vertex's degree must
be n-1.
8. Pseudo Graph
9. Regular Graph
If a graph G= (V, E) is a simple graph with the same degree at each vertex, it is a regular
graph. As a result, every complete graph is a regular graph.
A graph G= (V, E) is called a labeled or weighted graph if each edge has a value or
weight representing the cost of traversing that edge.
A directed graph, also referred to as a digraph, is a set of nodes connected by edges, each with
a direction.
An undirected graph comprises a set of nodes and links connecting them. The links have no
direction, so the order of the two connected vertices is irrelevant. You can form an undirected
graph with a finite number of vertices and edges.
If there is a path between every pair of vertices of a graph data structure, the graph
is connected.
When at least one pair of vertices has no path between them, the graph is disconnected. A null
graph with two or more vertices is the extreme case, since no edge links any of its vertices.
A directed acyclic graph (DAG) is a graph with directed edges but no cycle. Because the edges
have a direction, each edge is represented by an ordered pair of vertices, and the vertices
store some data.
18. Subgraph
A graph whose vertices and edges are subsets of the vertices and edges of another graph is
known as a subgraph of that graph.
The most frequent graph representations are the two that follow:
Adjacency matrix
Adjacency list
You’ll look at these two representations of graphs in data structures in more detail:
Adjacency Matrix
For a weighted graph, the weight or cost indicated on each edge is the value stored in the
corresponding cell of the matrix.
Adjacency List
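Both representations can be sketched in Python for the example graph V = { 1, ..., 6 } given at the start of this unit. This is a minimal illustration, not a reference implementation: it assumes the graph is undirected, and the helper names build_adjacency_matrix and build_adjacency_list are made up for this example.

```python
# Example graph from the start of this unit; treated as undirected (assumption).
vertices = [1, 2, 3, 4, 5, 6]
edges = [(1, 4), (1, 6), (2, 6), (4, 5), (5, 6)]


def build_adjacency_matrix(vertices, edges):
    """Return an n x n 0/1 matrix; row and column i stand for vertices[i]."""
    index = {v: i for i, v in enumerate(vertices)}
    n = len(vertices)
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        matrix[index[v]][index[u]] = 1  # undirected: mirror the entry
    return matrix


def build_adjacency_list(vertices, edges):
    """Return a dict mapping each vertex to the list of its neighbours."""
    adjacency = {v: [] for v in vertices}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)  # undirected: store both directions
    return adjacency


adjacency_matrix = build_adjacency_matrix(vertices, edges)
adjacency_list = build_adjacency_list(vertices, edges)

for row in adjacency_matrix:
    print(row)
print(adjacency_list)
```

The matrix always needs n*n cells, while the list stores one entry per edge endpoint, which is why the list is usually preferred for sparse graphs such as this one.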
Breadth-first search
Depth-first search
BFS is a search technique for finding a node in a graph data structure that meets a set of
criteria.
It begins at the root of the graph and investigates all nodes at the current depth level
before moving on to nodes at the next depth level.
To keep track of the child nodes that have been encountered but not yet inspected,
extra memory is required, generally in the form of a queue.
Algorithm
The steps involved in the BFS algorithm to explore a graph are given as follows -
Step 1: SET STATUS = 1 (ready state) for each node in G
Step 2: Enqueue the starting node A and set its STATUS = 2 (waiting state)
Step 3: Repeat Steps 4 and 5 until QUEUE is empty
Step 4: Dequeue a node N. Process it and set its STATUS = 3 (processed state).
Step 5: Enqueue all the neighbours of N that are in the ready state (whose STATUS = 1) and
set their STATUS = 2 (waiting state)
[END OF LOOP]
Step 6: EXIT
Now, let's understand the working of BFS algorithm by using an example. In the example
given below, there is a directed graph having 7 vertices.
Step 1 - First, add the starting node A to queue1. Queue2 is initially empty.
1. QUEUE1 = {A}
2. QUEUE2 = {NULL}
Step 2 - Now, delete node A from queue1 and add it into queue2. Insert all neighbors of node
A to queue1.
1. QUEUE1 = {B, D}
2. QUEUE2 = {A}
Step 3 - Now, delete node B from queue1 and add it into queue2. Insert all neighbors of node
B to queue1.
1. QUEUE1 = {D, C, F}
2. QUEUE2 = {A, B}
Step 4 - Now, delete node D from queue1 and add it into queue2. Insert all neighbors of node
D to queue1. The only neighbor of node D is F, and since it is already inserted, it will not be
inserted again.
1. QUEUE1 = {C, F}
2. QUEUE2 = {A, B, D}
Step 5 - Delete node C from queue1 and add it into queue2. Insert all neighbors of node C to
queue1.
1. QUEUE1 = {F, E, G}
2. QUEUE2 = {A, B, D, C}
Step 6 - Delete node F from queue1 and add it into queue2. Insert all neighbors of node F to
queue1. Since all the neighbors of node F are already present, we will not insert them again.
1. QUEUE1 = {E, G}
2. QUEUE2 = {A, B, D, C, F}
Step 7 - Delete node E from queue1 and add it into queue2. Since all of its neighbors have
already been added, we will not insert them again. The target node E has now been placed in
queue2, so the search stops here.
1. QUEUE1 = {G}
2. QUEUE2 = {A, B, D, C, F, E}
The time complexity of BFS depends upon the data structure used to represent the graph. With
an adjacency list it is O(V+E), since in the worst case the BFS algorithm explores every vertex
and every edge once. With an adjacency matrix it is O(V^2), because finding the neighbors of a
vertex means scanning a full row. Here V is the number of vertices and E is the number of edges.
The space complexity of BFS can be expressed as O(V), where V is the number of vertices.
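The STATUS-based steps above can be sketched in Python as follows. This is an illustrative sketch rather than a definitive implementation: the function name bfs and the constants READY, WAITING and PROCESSED are chosen here, and example_graph only encodes the edges that the walkthrough states explicitly (A to B and D, B to C and F, D to F, C to E and G). The outgoing edges of E, F and G are not recoverable from the text, so they are left empty as an assumption.

```python
from collections import deque

READY, WAITING, PROCESSED = 1, 2, 3


def bfs(graph, start):
    """Return the nodes of `graph` in the order BFS processes them."""
    status = {node: READY for node in graph}   # Step 1
    queue = deque([start])                      # Step 2
    status[start] = WAITING
    order = []
    while queue:                                # Step 3
        node = queue.popleft()                  # Step 4
        order.append(node)
        status[node] = PROCESSED
        for neighbour in graph[node]:           # Step 5
            if status[neighbour] == READY:
                status[neighbour] = WAITING
                queue.append(neighbour)
    return order


# Directed graph as an adjacency list. Edges out of E, F and G are assumed empty.
example_graph = {
    "A": ["B", "D"],
    "B": ["C", "F"],
    "C": ["E", "G"],
    "D": ["F"],
    "E": [],
    "F": [],
    "G": [],
}

print(bfs(example_graph, "A"))
```

Marking a node as waiting at the moment it is enqueued is what prevents F from being inserted twice, exactly as in Step 4 of the walkthrough.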
DFS is a search technique for finding a node in a graph data structure that meets a set of
criteria.
The depth-first search (DFS) algorithm traverses or explores data structures such as trees and
graphs. The DFS algorithm begins at the root node and examines each branch as far as
feasible before backtracking.
To keep track of the child nodes that have been encountered but not yet inspected, extra
memory, generally a stack, is required.
Algorithm
Step 1: SET STATUS = 1 (ready state) for each node in G
Step 2: Push the starting node A on the stack and set its STATUS = 2 (waiting state)
Step 3: Repeat Steps 4 and 5 until STACK is empty
Step 4: Pop the top node N. Process it and set its STATUS = 3 (processed state)
Step 5: Push on the stack all the neighbors of N that are in the ready state (whose STATUS =
1) and set their STATUS = 2 (waiting state)
[END OF LOOP]
Step 6: EXIT
Now, let's understand the working of the DFS algorithm by using an example. In the example
given below, there is a directed graph having 8 vertices (A to H).
Step 1 - First, push the starting node H onto the stack.
1. STACK: H
Step 2 - POP the top element from the stack, i.e., H, and print it. Now, PUSH all the
neighbors of H onto the stack that are in ready state.
1. Print: H
2. STACK: A
Step 3 - POP the top element from the stack, i.e., A, and print it. Now, PUSH all the
neighbors of A onto the stack that are in ready state.
1. Print: A
2. STACK: B, D
Step 4 - POP the top element from the stack, i.e., D, and print it. Now, PUSH all the
neighbors of D onto the stack that are in ready state.
1. Print: D
2. STACK: B, F
Step 5 - POP the top element from the stack, i.e., F, and print it. Now, PUSH all the
neighbors of F onto the stack that are in ready state.
1. Print: F
2. STACK: B
Step 6 - POP the top element from the stack, i.e., B, and print it. Now, PUSH all the
neighbors of B onto the stack that are in ready state.
1. Print: B
2. STACK: C
Step 7 - POP the top element from the stack, i.e., C, and print it. Now, PUSH all the
neighbors of C onto the stack that are in ready state.
1. Print: C
2. STACK: E, G
Step 8 - POP the top element from the stack, i.e., G, and print it. Now, PUSH all the
neighbors of G onto the stack that are in ready state.
1. Print: G
2. STACK: E
Step 9 - POP the top element from the stack, i.e., E, and print it. Now, PUSH all the
neighbors of E onto the stack that are in ready state.
1. Print: E
2. STACK:
Now, all the graph nodes have been traversed, and the stack is empty.
The time complexity of the DFS algorithm is O(V+E), where V is the number of vertices and
E is the number of edges in the graph.
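A stack-based version of the DFS steps can be sketched in Python as below. It is a minimal sketch under stated assumptions: the function name dfs and the status constants are chosen for this example, and dfs_graph contains only the edges implied by the trace above (H to A, A to B and D, D to F, B to C, C to E and G). Any other edges in the original figure are unknown and omitted.

```python
READY, WAITING, PROCESSED = 1, 2, 3


def dfs(graph, start):
    """Return the nodes of `graph` in the order the stack-based DFS prints them."""
    status = {node: READY for node in graph}   # Step 1
    stack = [start]                             # Step 2
    status[start] = WAITING
    order = []
    while stack:                                # Step 3
        node = stack.pop()                      # Step 4: pop the top node
        order.append(node)
        status[node] = PROCESSED
        for neighbour in graph[node]:           # Step 5
            if status[neighbour] == READY:
                status[neighbour] = WAITING
                stack.append(neighbour)
    return order


# Directed graph reconstructed from the trace; edges not shown there are assumed absent.
dfs_graph = {
    "H": ["A"],
    "A": ["B", "D"],
    "B": ["C"],
    "C": ["E", "G"],
    "D": ["F"],
    "E": [],
    "F": [],
    "G": [],
}

print(dfs(dfs_graph, "H"))
```

Because neighbours are pushed in list order and the last one pushed is popped first, D is explored before B and G before E, matching the printed sequence in the example.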
A spanning tree is a subgraph of an undirected connected graph that includes all the vertices
using the least possible number of edges. If any vertex is missed, it is not a spanning tree.
A spanning tree does not have cycles, and it cannot be disconnected.
A spanning tree consists of (n-1) edges, where 'n' is the number of vertices (or nodes). Edges
of the spanning tree may or may not have weights assigned to them. Every spanning tree
created from a given graph G has the same vertices as G and exactly n-1 edges.
A complete undirected graph has n^(n-2) spanning trees, where n is the number of vertices in
the graph. For example, if n = 5, the number of possible spanning trees is 5^(5-2) = 125.
Basically, a spanning tree is used to find a minimum set of edges that connects all nodes of the graph.
Some of the common applications of the spanning tree are listed as follows -
Now, let's understand the spanning tree with the help of an example.
As discussed above, a spanning tree contains the same number of vertices as the graph. The
number of vertices in the above graph is 5; therefore, the spanning tree will contain 5
vertices. The number of edges in the spanning tree equals the number of vertices in the graph
minus 1, so there will be 4 edges in the spanning tree.
Some of the possible spanning trees that will be created from the above graph are given as
follows -
Properties of spanning-tree
Some of the properties of the spanning tree are given as follows -
So, a spanning tree is a subset of connected graph G, and there is no spanning tree of a
disconnected graph.
Let's understand the minimum spanning tree with the help of an example.
The sum of the edge weights of the above graph is 16. Now, some of the possible spanning trees
created from the above graph are -
So, the minimum spanning tree that is selected from the above spanning trees for the given
weighted graph is –
A minimum spanning tree can be found from a weighted graph by using the algorithms given
below -
Prim's Algorithm
Kruskal's Algorithm
Prim's Algorithm is a greedy algorithm that is used to find the minimum spanning tree from
a graph. Prim's algorithm finds the subset of edges that includes every vertex of the graph
such that the sum of the weights of the edges can be minimized.
Prim's algorithm starts with a single node and explores all the adjacent nodes with all the
connecting edges at every step. At each step, the edge with the minimal weight that causes no
cycle in the graph is selected.
Step 2 - Now, we have to choose and add the shortest edge from vertex B. There are two
edges from vertex B that are B to C with weight 10 and edge B to D with weight 4. Among
the edges, the edge BD has the minimum weight. So, add it to the MST.
Step 3 - Now, again, choose the edge with the minimum weight among all the other edges. In
this case, the edges DE and CD are the candidates. Both will be added to the MST, and adding C
later brings its adjacent vertices, i.e., E and A, into consideration. First, select the edge
DE and add it to the MST.
Step 4 - Now, select the edge CD, and add it to the MST.
So, the graph produced in step 5 is the minimum spanning tree of the given graph. The cost of
the MST is given below -
Algorithm
Step 1: Select a starting vertex
Step 2: Repeat Steps 3 and 4 until there are fringe vertices
Step 3: Select an edge 'e' connecting the tree vertex and fringe vertex that has minimum
weight
Step 4: Add the selected edge and the vertex to the minimum spanning tree T
[END OF LOOP]
Step 5: EXIT
Time Complexity
Data structure used for the minimum edge weight    Time Complexity
Adjacency matrix, linear searching                 O(|V|^2)
Adjacency list and binary heap                     O(|E| log |V|)
Adjacency list and Fibonacci heap                  O(|E| + |V| log |V|)
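The adjacency list and binary heap variant can be sketched in Python with the standard heapq module. This is one possible sketch, not the only way to write Prim's algorithm: the function name prim_mst and the graph prim_edges are assumptions made for illustration, and the edge weights are not taken from the figure used in the steps above.

```python
import heapq


def prim_mst(graph, start):
    """Return (mst_edges, total_weight) for a connected undirected weighted graph.

    `graph` maps each vertex to a list of (neighbour, weight) pairs.
    """
    visited = {start}                                   # tree vertices
    heap = [(w, start, v) for v, w in graph[start]]    # edges to fringe vertices
    heapq.heapify(heap)
    mst_edges = []
    total_weight = 0
    while heap and len(visited) < len(graph):
        weight, u, v = heapq.heappop(heap)              # minimum-weight candidate edge
        if v in visited:
            continue                                    # would create a cycle, skip it
        visited.add(v)
        mst_edges.append((u, v, weight))
        total_weight += weight
        for nxt, w in graph[v]:
            if nxt not in visited:
                heapq.heappush(heap, (w, v, nxt))
    return mst_edges, total_weight


# Hypothetical undirected weighted graph, given as (u, v, weight) triples.
prim_edges = [("A", "B", 1), ("A", "C", 7), ("A", "D", 10), ("A", "E", 5),
              ("B", "C", 3), ("C", "D", 4), ("D", "E", 2)]
prim_graph = {}
for u, v, w in prim_edges:
    prim_graph.setdefault(u, []).append((v, w))
    prim_graph.setdefault(v, []).append((u, w))

mst_edges, total_weight = prim_mst(prim_graph, "A")
print(mst_edges, total_weight)
```

Each edge is pushed to the heap at most twice and each heap operation costs O(log |E|), which is where the O(|E| log |V|) bound in the table comes from.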
Kruskal's Algorithm is used to find the minimum spanning tree for a connected weighted
graph. The goal of the algorithm is to find the subset of edges that reaches every vertex of
the graph with the minimum total weight. It follows the greedy approach, which makes the
locally optimal choice at every stage instead of searching for a global optimum directly.
The weight of the edges of the above graph is given in the below table -
Edge      AB   AC   AD   AE   BC   CD   DE
Weight    1    7    10   5    3    4    2
Now, sort the edges given above in the ascending order of their weights.
Edge      AB   DE   BC   CD   AE   AC   AD
Weight    1    2    3    4    5    7    10
Step 1 - First, add the edge AB with weight 1 to the MST.
Step 2 - Add the edge DE with weight 2 to the MST, as it is not creating any cycle.
Step 3 - Add the edge BC with weight 3 to the MST, as it is not creating any cycle or loop.
Step 4 - Now, add the edge CD with weight 4 to the MST, as it is not forming a cycle.
Step 5 - After that, pick the edge AE with weight 5. Including this edge will create the cycle,
so discard it.
Step 6 - Pick the edge AC with weight 7. Including this edge will create the cycle, so discard
it.
Step 7 - Pick the edge AD with weight 10. Including this edge will also create the cycle, so
discard it.
So, the final minimum spanning tree obtained from the given weighted graph by using
Kruskal's algorithm is -
Now, the number of edges in the above tree equals the number of vertices minus 1. So, the
algorithm stops here.
Algorithm
Step 1: Create a forest F in which every vertex of the graph is a separate tree.
Step 2: Create a set E that contains all the edges of the graph.
Step 3: Repeat Steps 4 and 5 while E is NOT EMPTY and F is not spanning
Step 4: Remove an edge from E with minimum weight
Step 5: IF the edge obtained in Step 4 connects two different trees, then add it to the
forest F (for combining two trees into one tree).
ELSE
Discard the edge
Step 6: END
Time Complexity
The time complexity of Kruskal's algorithm is O(E log E), or equivalently O(E log V), where E
is the no. of edges, and V is the no. of vertices. The two bounds agree because E is at most
V^2, so log E is at most 2 log V.
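Kruskal's steps can be sketched in Python using the edge weights from the table above. This is a minimal sketch: the forest F is tracked with a simple union-find (disjoint set) structure, which is a common choice but an assumption here, and the names kruskal_mst and find are made up for this example.

```python
def kruskal_mst(vertices, edges):
    """Return the MST edges of a connected weighted graph as (u, v, weight) triples."""
    parent = {v: v for v in vertices}     # forest F: every vertex is its own tree

    def find(v):
        """Return the representative of the tree that contains v."""
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving keeps the trees shallow
            v = parent[v]
        return v

    mst = []
    for u, v, weight in sorted(edges, key=lambda edge: edge[2]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:               # edge connects two different trees
            parent[root_u] = root_v        # combine them into one tree
            mst.append((u, v, weight))
            if len(mst) == len(vertices) - 1:
                break                      # F is now spanning
        # otherwise the edge would create a cycle and is discarded
    return mst


# Edge weights as listed in the table of the worked example.
kruskal_vertices = ["A", "B", "C", "D", "E"]
kruskal_edges = [("A", "B", 1), ("A", "C", 7), ("A", "D", 10), ("A", "E", 5),
                 ("B", "C", 3), ("C", "D", 4), ("D", "E", 2)]

kruskal_result = kruskal_mst(kruskal_vertices, kruskal_edges)
print(kruskal_result, sum(w for _, _, w in kruskal_result))
```

Sorting the edges dominates the running time, which gives the O(E log E) bound stated above.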
Output:
The matrix of transitive closure
1 1 1 1
0 1 1 1
0 0 1 1
0 0 0 1
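The program that produced this output is not included in these notes. As a rough sketch, a transitive closure matrix like this can be computed with Warshall's algorithm. The input matrix closure_input below is an assumption: it is one 4-vertex directed graph (with every vertex reaching itself) whose closure matches the matrix printed above, and the function name transitive_closure is hypothetical.

```python
def transitive_closure(adjacency):
    """Return the reachability matrix of a directed graph using Warshall's algorithm."""
    n = len(adjacency)
    reach = [row[:] for row in adjacency]   # copy so the input is not modified
    for k in range(n):                      # allow vertex k as an intermediate stop
        for i in range(n):
            for j in range(n):
                # j is reachable from i directly, or by going through k
                reach[i][j] = reach[i][j] or (reach[i][k] and reach[k][j])
    return reach


# Assumed input: 1 means there is an edge from row vertex to column vertex.
closure_input = [
    [1, 1, 0, 1],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]

closure = transitive_closure(closure_input)
print("The matrix of transitive closure")
for row in closure:
    print(" ".join(str(cell) for cell in row))
```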
Dijkstra’s Algorithm:
Dijkstra's algorithm is an iterative algorithm that gives us the shortest path
from one specific starting node to all other nodes of a graph. It is different from the minimum
spanning tree, as the shortest path between two vertices might not involve all the vertices
of the graph.
It is important to note that Dijkstra's algorithm is only applicable when all weights are
non-negative because, during the execution, the weights of the edges are added to find the
shortest path.
To see how Dijkstra's algorithm works, let's take a graph and find the shortest path from the
source to all nodes. Consider the graph below with src = 0.
Step 1:
The set sptSet is initially empty and the distances assigned to the vertices are {0, INF, INF,
INF, INF, INF, INF, INF, INF} where INF indicates infinite.
Now pick the vertex with the minimum distance value. The vertex 0 is picked and included
in sptSet, so sptSet becomes {0}. After including 0 in sptSet, update the distance values of
its adjacent vertices.
Adjacent vertices of 0 are 1 and 7. The distance values of 1 and 7 are updated as 4 and
8.
The following subgraph shows vertices and their distance values, only the vertices with finite
distance values are shown. The vertices included in SPT are shown in green colour.
Step 2:
Pick the vertex with the minimum distance value that is not already included in the SPT (not in
sptSet). The vertex 1 is picked and added to sptSet.
So sptSet now becomes {0, 1}. Update the distance values of adjacent vertices of 1.
The distance value of vertex 2 becomes 12.
Step 3:
Pick the vertex with the minimum distance value that is not already included in the SPT (not in
sptSet). Vertex 7 is picked. So sptSet now becomes {0, 1, 7}.
Update the distance values of adjacent vertices of 7. The distance values of vertices 6
and 8 become finite (9 and 15 respectively).
Step 4:
Pick the vertex with the minimum distance value that is not already included in the SPT (not in
sptSet). Vertex 6 is picked. So sptSet now becomes {0, 1, 7, 6}.
Update the distance values of adjacent vertices of 6. The distance values of vertices 5
and 8 are updated.
We repeat the above steps until sptSet includes all vertices of the given graph. Finally, we get
the following Shortest Path Tree (SPT).
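The procedure can be sketched in Python with a binary heap standing in for the "pick the vertex with minimum distance" step. This is a sketch under stated assumptions: the figure is not reproduced here, so the edge list dijkstra_edges is an assumed 9-vertex undirected graph chosen to agree with the distance values quoted in the walkthrough (4 and 8 for vertices 1 and 7, 12 for vertex 2). The function name dijkstra is also chosen for this example.

```python
import heapq


def dijkstra(graph, src):
    """Return a dict of shortest distances from `src`; weights must be non-negative."""
    dist = {v: float("inf") for v in graph}
    dist[src] = 0
    spt_set = set()                          # vertices whose distance is final
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)           # vertex with the minimum distance value
        if u in spt_set:
            continue                         # stale heap entry, already finalized
        spt_set.add(u)
        for v, weight in graph[u]:
            if v not in spt_set and d + weight < dist[v]:
                dist[v] = d + weight         # update the distance of an adjacent vertex
                heapq.heappush(heap, (dist[v], v))
    return dist


# Assumed undirected weighted graph with vertices 0..8, as (u, v, weight) triples.
dijkstra_edges = [(0, 1, 4), (0, 7, 8), (1, 2, 8), (1, 7, 11), (2, 3, 7),
                  (2, 8, 2), (2, 5, 4), (3, 4, 9), (3, 5, 14), (4, 5, 10),
                  (5, 6, 2), (6, 7, 1), (6, 8, 6), (7, 8, 7)]
dijkstra_graph = {v: [] for v in range(9)}
for u, v, w in dijkstra_edges:
    dijkstra_graph[u].append((v, w))
    dijkstra_graph[v].append((u, w))

distances = dijkstra(dijkstra_graph, 0)
print([distances[v] for v in range(9)])
```

With this graph the vertices are finalized in the order 0, 1, 7, 6, and so on, which matches the order in which sptSet grows in the steps above.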
A weighted graph is a graph in which each edge has a numerical value associated with it.
The Floyd-Warshall algorithm follows the dynamic programming approach to find the shortest
paths between all pairs of vertices in a weighted graph.
Initial graph
Follow the steps below to find the shortest path between all the pairs of vertices.
1. Create a matrix A0 of dimension n*n where n is the number of vertices. The row and
the column are indexed as i and j respectively. i and j are the vertices of the graph.
Each cell A[i][j] is filled with the distance from the ith vertex to the jth vertex. If there
is no path from the ith vertex to the jth vertex, the cell is left as infinity.
2. Fill each cell with the distance between the ith and jth vertex.
3. Now, create a matrix A1 using matrix A0. The elements in the first column and the first
row are left as they are. The remaining cells are filled in the following way.
Let k be the intermediate vertex in the shortest path from source to destination. In this
step, k is the first vertex. A[i][j] is filled with (A[i][k] + A[k][j]) if (A[i][j] > A[i][k] + A[k][j]).
That is, if the direct distance from the source to the destination is greater than the path
through the vertex k, then the cell is filled with A[i][k] + A[k][j].
In this step, k is vertex 1. We calculate the distance from the source vertex to the destination
vertex through this vertex k.
For example: For A1[2, 4], the direct distance from vertex 2 to 4 is 4 and the sum of the
distance from vertex 2 to 4 through vertex 1 (i.e. from vertex 2 to 1 and from vertex 1 to
4) is 7. Since 4 < 7, A1[2, 4] is filled with 4.
4. Similarly, A2 is created using A1. The elements in the second column and the second
row are left as they are.
In this step, k is the second vertex (i.e. vertex 2). The remaining steps are the same as
in step 3.
5. Similarly, A3 and A4 are created by calculating the distance from the source vertex to the
destination vertex through vertex 3 and then through vertex 4.
6. A4 gives the shortest path between each pair of vertices.
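The matrix updates A0 to A4 can be sketched in Python as a triple loop over k, i and j. This is an illustrative sketch: the original figure is missing, so the matrix fw_weights is an assumed 4-vertex directed graph chosen to agree with the A1[2, 4] example (vertex 2 to 4 costs 4, vertex 2 to 1 costs 2, vertex 1 to 4 costs 5). Indices in the code are 0-based, so vertex 1 is row 0. The function name floyd_warshall is also an assumption.

```python
INF = float("inf")


def floyd_warshall(weights):
    """Return the matrix of shortest distances between every pair of vertices."""
    n = len(weights)
    dist = [row[:] for row in weights]        # A0: direct distances
    for k in range(n):                        # build A(k+1) from A(k)
        for i in range(n):
            for j in range(n):
                # Use vertex k as an intermediate stop if that is shorter.
                if dist[i][j] > dist[i][k] + dist[k][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


# Assumed A0 for a 4-vertex directed graph; INF means there is no direct edge.
fw_weights = [
    [0,   3,   INF, 5],
    [2,   0,   INF, 4],
    [INF, 1,   0,   INF],
    [INF, INF, 2,   0],
]

shortest = floyd_warshall(fw_weights)
for row in shortest:
    print(row)
```

The three nested loops over n vertices give the O(n^3) running time of this approach, independent of how many edges the graph has.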
DIFFERENCES:
Main Purposes:
Time Complexities :
Other Points:
We can use Dijkstra's shortest path algorithm for finding all pair shortest paths by
running it for every vertex. But the time complexity of this would be O(VE log V), which
can go up to O(V^3 log V) in the worst case.
Another important differentiating factor between the algorithms is how well they suit
distributed systems. Unlike Dijkstra's algorithm, Floyd Warshall can be
implemented in a distributed system, making it suitable for data structures such as
Graph of Graphs (used in Maps).
Lastly, Floyd Warshall works with negative edge weights as long as there is no negative
cycle, whereas Dijkstra's algorithm does not work with negative edges.