UNIT-4

Graph Data Structure is a non-linear data structure consisting of vertices and edges. It is
useful in fields such as social network analysis, recommendation systems, and computer
networks. In the field of sports data science, graph data structure can be used to analyze and
understand the dynamics of team performance and player interactions on the field.

Basic Graph Terminology:


1. Graph
A Graph G consists of a non-empty set of vertices (or nodes) V and a set of edges E, where each
edge connects a pair of vertices. Formally, a graph can be represented as G = (V, E). Graphs can be
classified based on various properties, such as directedness of edges and connectivity.
2. Vertex (Node)
A Vertex, often referred to as a Node, is a fundamental unit of a graph. It represents
an entity within the graph. In applications like social networks, vertices can represent
individuals, while in road networks, they can represent intersections or locations.
3. Edge
An Edge is a connection between two vertices in a graph. It can be either directed or
undirected. In a directed graph, edges have a specific direction, indicating a one-way
connection between vertices. In contrast, undirected graphs have edges that do not have a
direction and represent bidirectional connections.
4. Degree of a Vertex
The Degree of a Vertex in a graph is the number of edges incident to that vertex. In a
directed graph, the degree is further categorized into the in-degree (number of incoming
edges) and out-degree (number of outgoing edges) of the vertex.
5. Path
A Path in a graph is a sequence of vertices where each adjacent pair is connected by an edge.
Paths can be of varying lengths and may or may not visit the same vertex more than once.
The shortest path between two vertices is of particular interest in algorithms such as
Dijkstra's algorithm for finding the shortest path in weighted graphs.
6. Cycle
A Cycle in a graph is a path that starts and ends at the same vertex, with no repetitions of
vertices (except the starting and ending vertex, which are the same). Cycles are essential in
understanding the connectivity and structure of a graph and play a significant role in cycle
detection algorithms.
Advanced Graph Terminology:
1. Directed Graph (Digraph):
A Directed Graph consists of nodes (vertices) connected by directed edges (arcs). Each edge
has a specific direction, meaning it goes from one node to another. Directed Graph is a
network where information flows in a specific order. Examples include social media follower
relationships, web page links, and transportation routes with one-way streets.
2. Undirected Graph:
In an Undirected Graph, edges have no direction. They simply connect nodes without any
inherent order. For example, a social network where friendships exist between people, or a
map of cities connected by roads (where traffic can flow in both directions).
3. Weighted Graph:
Weighted graphs assign numerical values (weights) to edges. These weights represent some
property associated with the connection between nodes. For example, road networks with
varying distances between cities, or airline routes with different flight durations, are examples
of weighted graphs.
4. Unweighted Graph:
An unweighted graph has no edge weights. It focuses solely on connectivity between nodes.
For example: a simple social network where friendships exist without any additional
information, or a family tree connecting relatives.
5. Connected Graph:
A graph is connected if there is a path between every pair of nodes; in other words, you can
reach any node from any other node. Even a single-node graph is considered connected. In a
directed graph, the related notion of strong connectivity requires a directed path between every
pair of vertices.
6. Acyclic Graph:
An acyclic graph contains no cycles (closed loops). In other words, you cannot start at a node
and follow edges to return to the same node. Examples include family trees (without
marriages between relatives) or dependency graphs in software development.
7. Cyclic Graph:
A cyclic graph has at least one cycle. You can traverse edges and eventually return to the
same node. For example: circular road system or a sequence of events that repeats
indefinitely.
8. Disconnected Graph:
A disconnected graph has isolated components that are not connected to each other. These
components are separate subgraphs.
9. Tree
A Tree is a connected graph with no cycles. It is a fundamental data structure in computer
science, commonly used in algorithms like binary search trees and heap data structures. Trees
have properties such as a single root node, parent-child relationships between nodes, and a
unique path between any pair of nodes.
Applications of Graph:
● Transportation Systems: Google Maps employs graphs to map roads, where
intersections are vertices and roads are edges. It calculates shortest paths for efficient
navigation.
● Social Networks: Platforms like Facebook model users as vertices and friendships as
edges, using graph theory for friend suggestions.
● World Wide Web: Web pages are vertices, and links between them are directed
edges, inspiring Google's Page Ranking Algorithm.
● Resource Allocation and Deadlock Prevention: Operating systems use resource
allocation graphs to prevent deadlocks by detecting cycles.
● Mapping Systems and GPS Navigation: Graphs help in locating places and
optimizing routes in mapping systems and GPS navigation.
● Graph Algorithms and Measures: Graphs are analyzed for structural properties and
measurable quantities, including dynamic properties in networks.
Representations of Graph
Sequential representation
In sequential representation, there is a use of an adjacency matrix to represent the mapping
between vertices and edges of the graph. We can use an adjacency matrix to represent the
undirected graph, directed graph, weighted directed graph, and weighted undirected graph.
If adj[i][j] = w, it means that an edge exists from vertex i to vertex j with weight w.
An entry Aij in the adjacency matrix representation of an undirected graph G will be 1 if an
edge exists between Vi and Vj. If an undirected graph G consists of n vertices, then the
adjacency matrix for that graph is n x n, and the matrix A = [aij] can be defined as -
aij = 1 {if there is an edge from Vi to Vj}
aij = 0 {Otherwise}
It means that, in an adjacency matrix, 0 represents the absence of an edge between two nodes,
whereas 1 represents the existence of an edge between the corresponding vertices.
If there is no self-loop present in the graph, it means that the diagonal entries of the adjacency
matrix will be 0.
Now, let's see the adjacency matrix representation of an undirected graph.

In the above figure, an image shows the mapping among the vertices (A, B, C, D, E), and this
mapping is represented by using the adjacency matrix.
There exist different adjacency matrices for the directed and undirected graph. In a directed
graph, an entry Aij will be 1 only when there is an edge directed from Vi to Vj.
Adjacency matrix for a directed graph
In a directed graph, edges represent a specific path from one vertex to another vertex.
Suppose a path exists from vertex A to another vertex B; it means that node A is the initial
node, while node B is the terminal node.
Consider the below-directed graph and try to construct the adjacency matrix of it.

In the above graph, we can see there is no self-loop, so the diagonal entries of the adjacency
matrix are 0.
Adjacency matrix for a weighted directed graph
It is similar to an adjacency matrix representation of a directed graph except that instead of
using the '1' for the existence of a path, here we have to use the weight associated with the
edge. The weights on the graph edges will be represented as the entries of the adjacency
matrix. We can understand it with the help of an example. Consider the below graph and its
adjacency matrix representation. In the representation, we can see that the weight associated
with the edges is represented as the entries in the adjacency matrix.
In the above image, we can see that the adjacency matrix representation of the weighted
directed graph is different from other representations. It is because, in this representation, the
non-zero values are replaced by the actual weight assigned to the edges.
An adjacency matrix is easy to implement and follow, and it is a good choice when the graph
is dense and the number of edges is large.
However, an adjacency matrix consumes more space: it always needs n x n entries, so even if
the graph is sparse, the matrix still consumes the same space.
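As a concrete illustration (a minimal C++ sketch with made-up edges, not a figure from these notes), a weighted directed graph can be stored in a 2D array where adj[i][j] holds the edge weight w and 0 means "no edge" (assuming all real weights are non-zero):

#include <iostream>
#include <vector>
using namespace std;

int main() {
    int n = 4;                                      // number of vertices
    vector<vector<int>> adj(n, vector<int>(n, 0));  // 0 = no edge (weights assumed non-zero)

    // hypothetical weighted directed edges (u -> v with weight w)
    adj[0][1] = 5;
    adj[0][2] = 3;
    adj[1][3] = 2;
    adj[2][3] = 7;

    // print the adjacency matrix row by row
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++)
            cout << adj[i][j] << " ";
        cout << "\n";
    }
    return 0;
}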
Linked list representation
An adjacency list is used in the linked representation to store the Graph in the computer's
memory. It is efficient in terms of storage as we only have to store the values for edges.
Let's see the adjacency list representation of an undirected graph.

In the above figure, we can see that there is a linked list or adjacency list for every node of
the graph. From vertex A, there are paths to vertex B and vertex D. These nodes are linked to
node A in the given adjacency list.
An adjacency list is maintained for each node present in the graph, which stores the node
value and a pointer to the next adjacent node to the respective node. If all the adjacent nodes
are traversed, then store the NULL in the pointer field of the last node of the list.
The sum of the lengths of adjacency lists is equal to twice the number of edges present in an
undirected graph.
Now, consider the directed graph, and let's see the adjacency list representation of that graph.

For a directed graph, the sum of the lengths of adjacency lists is equal to the number of edges
present in the graph.
Now, consider the weighted directed graph, and let's see the adjacency list representation of
that graph.

In the case of a weighted directed graph, each list node contains an extra field that stores the
weight of the edge.
In an adjacency list, it is easy to add a vertex. Because of using the linked list, it also saves
space.
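A minimal C++ sketch of a weighted adjacency list (the edges are made up for illustration); each list entry stores the adjacent vertex together with the weight field mentioned above:

#include <iostream>
#include <vector>
using namespace std;

int main() {
    int n = 4;
    // adj[u] holds (neighbour, weight) pairs, i.e. the extra weight field per list node
    vector<vector<pair<int,int>>> adj(n);

    // hypothetical weighted directed edges
    adj[0].push_back({1, 5});
    adj[0].push_back({2, 3});
    adj[1].push_back({3, 2});
    adj[2].push_back({3, 7});

    for (int u = 0; u < n; u++) {
        cout << u << ":";
        for (auto [v, w] : adj[u])
            cout << " (" << v << ", w=" << w << ")";
        cout << "\n";
    }
    return 0;
}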

Here are the two most common ways to represent a graph; for simplicity, we are going to
consider only unweighted graphs in this section.
1. Adjacency Matrix
2. Adjacency List
Adjacency Matrix Representation
An adjacency matrix is a way of representing a graph as a matrix of booleans (0's and 1's).
Let's assume there are n vertices in the graph; we then create a 2D matrix adjMat[n][n] of
dimension n x n.
● If there is an edge from vertex i to j, mark adjMat[i][j] as 1.
● If there is no edge from vertex i to j, mark adjMat[i][j] as 0.
Representation of Undirected Graph as Adjacency Matrix:
The below figure shows an undirected graph. Initially, the entire matrix is initialized to 0. If
there is an edge from source to destination, we insert 1 in both cells
(adjMat[source][destination] and adjMat[destination][source]) because we can go either way.

Representation of Directed Graph as Adjacency Matrix:


The below figure shows a directed graph. Initially, the entire matrix is initialized to 0. If there
is an edge from source to destination, we insert 1 only for that particular cell, adjMat[source][destination].

Directed Graph to Adjacency Matrix


Adjacency List Representation
An array of Lists is used to store edges between two vertices. The size of array is equal to the
number of vertices (i.e, n). Each index in this array represents a specific vertex in the graph.
The entry at the index i of the array contains a linked list containing the vertices that are
adjacent to vertex i.
Let's assume there are n vertices in the graph; we then create an array of lists of size n,
called adjList[n].
● adjList[0] will have all the nodes which are connected (neighbour) to vertex 0.
● adjList[1] will have all the nodes which are connected (neighbour) to vertex 1 and so
on.
Representation of Undirected Graph as Adjacency list:
The below undirected graph has 3 vertices, so an array of lists of size 3 is created, where
each index represents a vertex. Vertex 0 has two neighbours (1 and 2), so insert
vertices 1 and 2 at index 0 of the array. Similarly, vertex 1 has two neighbours (2 and 0),
so insert vertices 2 and 0 at index 1 of the array. Likewise, for vertex 2, insert its neighbours
into the array of lists.

Representation of Directed Graph as Adjacency list:


The below directed graph has 3 vertices, so an array of lists of size 3 is created, where
each index represents a vertex. Vertex 0 has no neighbours. Vertex 1 has two
neighbours (0 and 2), so insert vertices 0 and 2 at index 1 of the array. Similarly, for vertex
2, insert its neighbours into the array of lists.
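A matching C++ sketch for the adjacency list, built for the undirected 3-vertex example above (edges 0-1, 0-2 and 1-2 are assumed from that description):

#include <iostream>
#include <vector>
using namespace std;

int main() {
    int n = 3;
    vector<vector<int>> adjList(n);   // adjList[i] = neighbours of vertex i

    // undirected edges of the 3-vertex example: 0-1, 0-2, 1-2
    auto addEdge = [&](int u, int v) {
        adjList[u].push_back(v);
        adjList[v].push_back(u);      // insert both ways for an undirected graph
    };
    addEdge(0, 1);
    addEdge(0, 2);
    addEdge(1, 2);

    for (int i = 0; i < n; i++) {
        cout << "adjList[" << i << "]:";
        for (int v : adjList[i]) cout << " " << v;
        cout << "\n";
    }
    return 0;
}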

Breadth First Search or BFS for a Graph


Breadth First Search (BFS) is a fundamental graph traversal algorithm. It begins with a
node, then first traverses all of that node's adjacent vertices. Once all of them are visited, their
adjacent vertices are traversed in turn. This differs from DFS in that the closest vertices are
visited before the farther ones; we traverse vertices level by level. Many popular graph
algorithms like Dijkstra's shortest path, Kahn's Algorithm, and Prim's algorithm are based on
BFS. BFS itself can be used to detect cycles in a directed and undirected graph, find the
shortest path in an unweighted graph and solve many more problems.
BFS from a Given Source:
The algorithm starts from a given source and explores all reachable vertices from the given
source. It is similar to the breadth-first traversal of a tree. As with a tree, we begin with the given
source (in a tree, we begin with the root) and traverse vertices level by level using a queue data
structure. The only catch here is that, unlike trees, graphs may contain cycles, so we may
come to the same node again. To avoid processing a node more than once, we use
a boolean visited array.
1. Initialization: Enqueue the given source vertex into a queue and mark it as visited.
2. Exploration: While the queue is not empty:
● Dequeue a node from the queue and visit it (e.g., print its value).
● For each unvisited neighbor of the dequeued node:
o Enqueue the neighbor into the queue.
o Mark the neighbor as visited.
3. Termination: Repeat step 2 until the queue is empty.
This algorithm ensures that all nodes in the graph are visited in a breadth-first manner,
starting from the starting node.
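A minimal C++ sketch of BFS from a given source following these steps (the adjacency list in main() is a made-up example, not one of the figures above):

#include <iostream>
#include <queue>
#include <vector>
using namespace std;

// BFS from source s; prints vertices in breadth-first order
void bfs(const vector<vector<int>>& adj, int s, vector<bool>& visited) {
    queue<int> q;
    visited[s] = true;          // initialization: mark the source and enqueue it
    q.push(s);
    while (!q.empty()) {        // exploration
        int u = q.front(); q.pop();
        cout << u << " ";       // visit the dequeued node
        for (int v : adj[u])    // enqueue all unvisited neighbours
            if (!visited[v]) {
                visited[v] = true;
                q.push(v);
            }
    }
}

int main() {
    // hypothetical undirected graph with edges 0-1, 0-2, 1-3
    vector<vector<int>> adj = {{1, 2}, {0, 3}, {0}, {1}};
    vector<bool> visited(adj.size(), false);
    bfs(adj, 0, visited);       // expected order: 0 1 2 3
    return 0;
}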
Time Complexity: O(V+E), where V is the number of nodes and E is the number of edges.
Auxiliary Space: O(V)
BFS of the Whole Graph, Which May Be Disconnected
The above implementation takes a source as input and prints only those vertices that are
reachable from the source, so it would not print all vertices in the case of a disconnected graph.
Let us now talk about the algorithm that prints all vertices without any source, where the graph
may be disconnected.
The idea is simple: instead of calling BFS for a single vertex, we call the above implemented
BFS for all non-visited vertices one by one, as shown below.
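A short sketch of that driver (bfsWholeGraph is a made-up name, and it reuses the bfs() function from the previous sketch):

// bfs() is the function from the previous sketch
void bfsWholeGraph(const vector<vector<int>>& adj) {
    vector<bool> visited(adj.size(), false);
    for (int i = 0; i < (int)adj.size(); i++)   // start BFS from every still-unvisited vertex
        if (!visited[i])
            bfs(adj, i, visited);               // covers every component of a disconnected graph
}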
Complexity Analysis of Breadth-First Search (BFS) Algorithm:
Time Complexity of BFS Algorithm: O(V + E)
● BFS explores all the vertices and edges in the graph. In the worst case, it visits every
vertex and edge once. Therefore, the time complexity of BFS is O(V + E), where V
and E are the number of vertices and edges in the given graph.
Auxiliary Space of BFS Algorithm: O(V)
● BFS uses a queue to keep track of the vertices that need to be visited. In the worst
case, the queue can contain all the vertices in the graph. Therefore, the space
complexity of BFS is O(V).
Applications of BFS in Graphs:
BFS has various applications in graph theory and computer science, including:
● Shortest Path Finding: BFS can be used to find the shortest path between two nodes
in an unweighted graph. By keeping track of the parent of each node during the
traversal, the shortest path can be reconstructed.
● Cycle Detection: BFS can be used to detect cycles in a graph. In an undirected graph, a
cycle exists if BFS reaches an already visited vertex that is not the parent of the current
vertex; in a directed graph, a BFS-based topological sort (Kahn's algorithm) reveals a
cycle when not all vertices can be processed.
● Connected Components: BFS can be used to identify connected components in a
graph. Each connected component is a set of nodes that can be reached from each
other.
● Topological Sorting: BFS can be used to perform topological sorting on a directed
acyclic graph (DAG). Topological sorting arranges the nodes in a linear order such
that for any edge (u, v), u appears before v in the order.
● Level Order Traversal of Binary Trees: BFS can be used to perform a level order
traversal of a binary tree. This traversal visits all nodes at the same level before
moving to the next level.
● Network Routing: BFS can be used to find the shortest path between two nodes in a
network, making it useful for routing data packets in network protocols.
Depth First Search or DFS for a Graph
Depth First Traversal (or DFS) for a graph is similar to Depth First Traversal of a tree.
Like trees, we traverse all adjacent vertices one by one. When we traverse an adjacent
vertex, we completely finish the traversal of all vertices reachable through that
adjacent vertex. After we finish traversing one adjacent vertex and its reachable
vertices, we move to the next adjacent vertex and repeat the process. This is similar to
a tree, where we first completely traverse the left subtree and then move to the right
subtree. The key difference is that, unlike trees, graphs may contain cycles (a node
may be visited more than once). To avoid processing a node multiple times, we use a
boolean visited array.
Example:
Input: adj = [[1, 2], [0, 2], [0, 1, 3, 4], [2], [2]]
Output: 1 2 0 3 4
Explanation: The source vertex s is 1. We visit it first, then we visit an adjacent vertex. There
are two vertices adjacent to 1, namely 0 and 2; we can pick either of them, and here 2 is picked first:
● Start at 1: Mark as visited. Output: 1
● Move to 2: Mark as visited. Output: 2
● Move to 0: Mark as visited. Output: 0 (backtrack to 2)
● Move to 3: Mark as visited. Output: 3 (backtrack to 2)
● Move to 4: Mark as visited. Output: 4 (backtrack to 1)
Input: [[2,3,1], [0], [0,4], [0], [2]]

Output: 0 2 4 3 1
Explanation: DFS Steps:
● Start at 0: Mark as visited. Output: 0
● Move to 2: Mark as visited. Output: 2
● Move to 4: Mark as visited. Output: 4 (backtrack to 2, then backtrack to 0)
● Move to 3: Mark as visited. Output: 3 (backtrack to 0)
● Move to 1: Mark as visited. Output: 1
Note that there can be multiple DFS traversals of a graph according to the order in
which we pick adjacent vertices. Here we pick vertices as per the insertion order.
DFS from a Given Source of Undirected Graph:
The algorithm starts from a given source and explores all reachable vertices from the
given source. It is similar to Preorder Tree Traversal where we visit the root, then
recur for its children. In a graph, there may be cycles. So we use an extra visited array
to make sure that we do not process a vertex again.
Let us understand the working of Depth First Search with the help of the following
illustration, taking 0 as the source.

Time complexity: O(V + E), where V is the number of vertices and E is the number
of edges in the graph.
Auxiliary Space: O(V + E), since an extra visited array of size V is required, along with the
stack space for recursive calls to the DFSRec function.
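A hedged C++ sketch of the DFSRec function referred to above (main() uses the adjacency list of the first example, but with 0 as the source, so the printed order differs from the source-1 walkthrough):

#include <iostream>
#include <vector>
using namespace std;

// recursive DFS from vertex s
void DFSRec(const vector<vector<int>>& adj, vector<bool>& visited, int s) {
    visited[s] = true;
    cout << s << " ";            // process the vertex
    for (int i : adj[s])         // recur for every unvisited adjacent vertex
        if (!visited[i])
            DFSRec(adj, visited, i);
}

int main() {
    // adjacency list of the first example: adj = [[1, 2], [0, 2], [0, 1, 3, 4], [2], [2]]
    vector<vector<int>> adj = {{1, 2}, {0, 2}, {0, 1, 3, 4}, {2}, {2}};
    vector<bool> visited(adj.size(), false);
    DFSRec(adj, visited, 0);     // starting from source 0 prints: 0 1 2 3 4
    return 0;
}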
DFS for Complete Traversal of Disconnected Undirected Graph
The above implementation takes a source as input and prints only those vertices
that are reachable from the source, so it would not print all vertices in the case of a
disconnected graph. Let us now talk about the algorithm that prints all vertices
without any source, where the graph may be disconnected.
The idea is simple: instead of calling DFS for a single vertex, we call the above
implemented DFSRec for all non-visited vertices one by one:
for (int i = 0; i < (int)adj.size(); i++)
    if (visited[i] == false)
        DFSRec(adj, visited, i);
Time complexity: O(V + E). Note that the time complexity is the same here because we visit
every vertex at most once and every edge is traversed at most once (in a directed graph) and
twice (in an undirected graph).
Auxiliary Space: O(V + E), since an extra visited array of size V is required, along with the
stack space for recursive calls to the DFSRec function.
Spanning Tree
A spanning tree is a subgraph of a graph G that connects all the vertices using the
minimum possible number of edges. Hence, a spanning tree does not have cycles, and a graph
may have more than one spanning tree.
Properties of a Spanning Tree:
● A Spanning tree does not exist for a disconnected graph.
● For a connected graph with N vertices, the number of edges in a spanning tree of that
graph is N-1.
● A Spanning tree does not have any cycle.
● We can construct a spanning tree for a complete graph by removing E-N+1 edges,
where E is the number of Edges and N is the number of vertices.
● Cayley’s Formula: It states that the number of spanning trees in a complete graph
with N vertices is N^(N-2).
o For example: if N = 4, then the maximum number of spanning trees possible is
4^(4-2) = 16.
Real World Applications of A Spanning Tree:
● Several path finding algorithms, such as Dijkstra’s algorithm and A* search
algorithm, internally build a spanning tree as an intermediate step.
● Building Telecommunication Network.
● Image Segmentation to break an image into distinguishable components.
● Computer Network Routing Protocol
Minimum Spanning Tree(MST):
The weight of a spanning tree is the sum of the weights of all the edges involved in
it.
A minimum spanning tree (MST) is defined as a spanning tree that has the minimum weight
among all the possible spanning trees.

Properties of Minimum Spanning Tree:


● A minimum spanning tree connects all the vertices in the graph, ensuring that there is
a path between any pair of nodes.
● An MST is acyclic, meaning it contains no cycles. This property ensures that it
remains a tree and not a graph with loops.
● An MST of a graph with V vertices (where V is the number of vertices in the original
graph) will have exactly V – 1 edges.
● An MST is optimal for minimizing the total edge weight, but it may not necessarily
be unique.
● The cut property states that if you take any cut (a partition of the vertices into two
sets) in the original graph and consider the minimum-weight edge that crosses the cut,
that edge is part of the MST.
Minimum Spanning Tree of a Graph may not be Unique:
Like a spanning tree, there can also be many possible MSTs for a graph as shown in the
below image:
Algorithms to Find Minimum Spanning Tree of a Graph:
There are several algorithms to find the minimum spanning tree from a given graph, some of
them are listed below:
1. Kruskal’s MST Algorithm
2. Prim’s MST Algorithm
3. Boruvka’s Algorithm
4. Reverse-Delete Algorithm
Let us discuss these algorithm one by one.
1. Kruskal’s Minimum Spanning Tree:
The goal of Kruskal’s Minimum Spanning Tree (MST) algorithm is to connect all the vertices of a graph
with the minimum total edge weight while avoiding cycles. This algorithm employs a greedy
approach, meaning it makes locally optimal choices at each step to achieve a globally
optimal solution.
Algorithm:
● First, it sorts all the edges of the graph by their weights.
● Then it starts the iterations of building the spanning tree.
● At each iteration, the algorithm adds the next lowest-weight edge, provided that the
edges picked so far do not form a cycle.
This algorithm can be implemented efficiently using a DSU ( Disjoint-Set ) data structure to
keep track of the connected components of the graph. This is used in a variety of practical
applications such as network design, clustering, and data analysis.
2. Prim’s Minimum Spanning Tree:
Prim’s Minimum Spanning Tree algorithm builds the MST incrementally by starting with a single
vertex and gradually extending the tree, adding the closest neighbouring vertex at each
step.
Algorithm:
● It starts by selecting an arbitrary vertex and then adding it to the MST.
● Then, it repeatedly checks for the minimum edge weight that connects one vertex of
MST to another vertex that is not yet in the MST.
● This process is continued until all the vertices are included in the MST.
To efficiently select the minimum weight edge in each iteration, this algorithm
uses a priority_queue that stores the vertices keyed by their current minimum edge weight to
the tree. It also simultaneously keeps track of the MST using an array or another suitable
data structure.
This algorithm can be used in various scenarios, such as image segmentation based on color,
texture, or other features, and routing, as in finding the shortest path between two points for
a delivery truck to follow.
3. Boruvka’s Algorithm:
This is another algorithm used to find the minimum spanning tree of a
connected, undirected graph, and it is one of the oldest MST algorithms. The algorithm works by
iteratively building the minimum spanning tree, starting with each vertex in the graph as its
own tree. In each iteration, the algorithm finds the cheapest edge that connects a tree to
another tree, and adds that edge to the minimum spanning tree. This is almost similar to the
Prim’s algorithm for finding the minimum spanning tree.
Algorithm:
● Initialize a forest of trees, with each vertex in the graph as its own tree.
● For each tree in the forest:
o Find the cheapest edge that connects it to another tree. Add these edges to the
minimum spanning tree.
o Update the forest by merging the trees connected by the added edges.
● Repeat the above steps until the forest contains only one tree, which is the minimum
spanning tree.
The algorithm can be implemented using a data structure such as a priority queue to
efficiently find the cheapest edge between trees. Boruvka’s algorithm is a simple and
easy-to-implement algorithm for finding minimum spanning trees, but it may not be as
efficient as other algorithms for large graphs with many edges.
4. Reverse-Delete Algorithm:
Reverse Delete algorithm is closely related to Kruskal’s algorithm. In Kruskal’s algorithm,
we sort the edges in increasing order of their weights, then pick edges one by one in that
order, including the currently picked edge if it does not form a cycle in the spanning tree,
until there are V-1 edges in the spanning tree, where V = number of vertices.
In the Reverse Delete algorithm, we sort all edges in decreasing order of their weights. After
sorting, we pick edges one by one in decreasing order. We keep the currently picked edge if
deleting it would cause disconnection in the current graph. The main idea is: delete an edge if
its deletion does not lead to disconnection of the graph.
Algorithm:
1. Sort all edges of graph in non-increasing order of edge weights.
2. Initialize MST as original graph and remove extra edges using step 3.
3. Pick the highest-weight edge from the remaining edges and check whether deleting it
disconnects the graph or not.
4. If it disconnects the graph, do not delete the edge.
5. Otherwise, delete the edge and continue.
Real World Application of A Minimum Spanning Tree:
● Network design: Spanning trees can be used in network design to find the minimum
number of connections required to connect all nodes. Minimum spanning trees, in
particular, can help minimize the cost of the connections by selecting the cheapest
edges.
● Image processing: Spanning trees can be used in image processing to identify regions
of similar intensity or color, which can be useful for segmentation and classification
tasks.
● Social network analysis: Spanning trees and minimum spanning trees can be used in
social network analysis to identify important connections and relationships among
individuals or groups.

Kruskal’s Algorithm:
Here we will discuss Kruskal’s algorithm to find the MST of a given weighted graph.
In Kruskal’s algorithm, all edges of the given graph are first sorted in increasing order of weight.
Then the algorithm keeps adding new edges and nodes to the MST if the newly added edge does
not form a cycle. It picks the minimum weighted edge first and the maximum weighted edge last.
Thus we can say that it makes a locally optimal choice in each step in order to find the optimal
solution. Hence this is a greedy algorithm.
How to find MST using Kruskal’s algorithm?
Below are the steps for finding MST using Kruskal’s algorithm:
1. Sort all the edges in non-decreasing order of their weight.
2. Pick the smallest edge. Check if it forms a cycle with the spanning tree formed so far.
If the cycle is not formed, include this edge. Else, discard it.
3. Repeat step#2 until there are (V-1) edges in the spanning tree.
Step 2 uses the Union-Find algorithm to detect cycles.
So we recommend reading the following topics as prerequisites.
● Union-Find Algorithm | Set 1 (Detect Cycle in a Graph)
● Union-Find Algorithm | Set 2 (Union By Rank and Path Compression)
Kruskal’s algorithm to find the minimum cost spanning tree uses the greedy approach. The
Greedy Choice is to pick the smallest weight edge that does not cause a cycle in the MST
constructed so far. Let us understand it with an example:
Illustration:
Below is the illustration of the above approach:
Input Graph:

The graph contains 9 vertices and 14 edges, so the minimum spanning tree formed will
have (9 – 1) = 8 edges.
After sorting:

Weight   Source   Destination
1        7        6
2        8        2
2        6        5
4        0        1
4        2        5
6        8        6
7        2        3
7        7        8
8        0        7
8        1        2
9        3        4
10       5        4
11       1        7
14       3        5

Now pick all edges one by one from the sorted list of edges
Step 1: Pick edge 7-6. No cycle is formed, include it.

Add edge 7-6 in the MST


Step 2: Pick edge 8-2. No cycle is formed, include it.

Add edge 8-2 in the MST


Step 3: Pick edge 6-5. No cycle is formed, include it.

Add edge 6-5 in the MST


Step 4: Pick edge 0-1. No cycle is formed, include it.

Add edge 0-1 in the MST


Step 5: Pick edge 2-5. No cycle is formed, include it.
Add edge 2-5 in the MST
Step 6: Pick edge 8-6. Since including this edge results in the cycle, discard it. Pick edge 2-3:
No cycle is formed, include it.

Add edge 2-3 in the MST


Step 7: Pick edge 7-8. Since including this edge results in the cycle, discard it. Pick edge 0-7.
No cycle is formed, include it.

Add edge 0-7 in MST


Step 8: Pick edge 1-2. Since including this edge results in the cycle, discard it. Pick edge 3-4.
No cycle is formed, include it.

Add edge 3-4 in the MST


Note: Since the number of edges included in the MST equals (V – 1), the algorithm
stops here.
Time Complexity: O(E * logE) or O(E * logV)
● Sorting of edges takes O(E * logE) time.
● After sorting, we iterate through all edges and apply the find-union algorithm. The
find and union operations can take at most O(logV) time.
● So overall complexity is O(E * logE + E * logV) time.
● The value of E can be at most O(V^2), so O(log V) and O(log E) are the same.
Therefore, the overall time complexity is O(E * log E) or O(E * log V).
Auxiliary Space: O(V + E), where V is the number of vertices and E is the number of edges
in the graph.
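Putting the steps together, here is a hedged C++ sketch of Kruskal’s algorithm using a Disjoint Set Union (union by size with path compression); the edge list below is the weight/source/destination table of the 9-vertex example above, so the sketch should report an MST weight of 37:

#include <algorithm>
#include <array>
#include <iostream>
#include <numeric>
#include <vector>
using namespace std;

struct DSU {                       // disjoint-set union: path compression + union by size
    vector<int> parent, sz;
    DSU(int n) : parent(n), sz(n, 1) { iota(parent.begin(), parent.end(), 0); }
    int find(int x) { return parent[x] == x ? x : parent[x] = find(parent[x]); }
    bool unite(int a, int b) {
        a = find(a); b = find(b);
        if (a == b) return false;  // same component already -> edge would form a cycle
        if (sz[a] < sz[b]) swap(a, b);
        parent[b] = a; sz[a] += sz[b];
        return true;
    }
};

int main() {
    int V = 9;
    // (weight, source, destination) edges of the 9-vertex example graph above
    vector<array<int,3>> edges = {
        {1,7,6},{2,8,2},{2,6,5},{4,0,1},{4,2,5},{6,8,6},{7,2,3},
        {7,7,8},{8,0,7},{8,1,2},{9,3,4},{10,5,4},{11,1,7},{14,3,5}};

    sort(edges.begin(), edges.end());          // step 1: sort edges by weight
    DSU dsu(V);
    int mstWeight = 0, used = 0;
    for (auto [w, u, v] : edges) {             // step 2: include edges that do not form a cycle
        if (dsu.unite(u, v)) {
            cout << "include edge " << u << "-" << v << " (weight " << w << ")\n";
            mstWeight += w;
            if (++used == V - 1) break;        // step 3: stop once V-1 edges are included
        }
    }
    cout << "MST weight = " << mstWeight << "\n";  // 37 for this example graph
    return 0;
}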
Prim’s algorithm:
We have discussed Kruskal’s algorithm for Minimum Spanning Tree. Like Kruskal’s
algorithm, Prim’s algorithm is also a Greedy algorithm. This algorithm always starts with a
single node and moves through several adjacent nodes, in order to explore all of the
connected edges along the way.
The algorithm starts with an empty spanning tree. The idea is to maintain two sets of vertices.
The first set contains the vertices already included in the MST, and the other set contains the
vertices not yet included. At every step, it considers all the edges that connect the two sets
and picks the minimum weight edge from these edges. After picking the edge, it moves the
other endpoint of the edge to the set containing MST.
A group of edges that connects two sets of vertices in a graph is called a cut in graph
theory. So, at every step of Prim’s algorithm, find a cut, pick the minimum weight edge from
the cut, and include the new endpoint of that edge in the MST set (the set that contains the
already included vertices).
How does Prim’s Algorithm Work?
The working of Prim’s algorithm can be described by using the following steps:
Step 1: Determine an arbitrary vertex as the starting vertex of the MST.
Step 2: Follow steps 3 to 5 till there are vertices that are not included in the MST (known as
fringe vertices).
Step 3: Find edges connecting any tree vertex with the fringe vertices.
Step 4: Find the minimum among these edges.
Step 5: Add the chosen edge to the MST if it does not form any cycle.
Step 6: Return the MST and exit
Note: For determining a cycle, we can divide the vertices into two sets [one set contains the
vertices included in MST and the other contains the fringe vertices.]
Illustration of Prim’s Algorithm:
Consider the following graph as an example for which we need to find the Minimum
Spanning Tree (MST).

Example of a graph
Step 1: Firstly, we select an arbitrary vertex that acts as the starting vertex of the Minimum
Spanning Tree. Here we have selected vertex 0 as the starting vertex.
0 is selected as starting vertex
Step 2: All the edges connecting the incomplete MST and other vertices are the edges {0, 1}
and {0, 7}. Between these two the edge with minimum weight is {0, 1}. So include the edge
and vertex 1 in the MST.

1 is added to the MST


Step 3: The edges connecting the incomplete MST to other vertices are {0, 7}, {1, 7} and {1,
2}. Among these edges the minimum weight is 8 which is of the edges {0, 7} and {1, 2}. Let us
here include the edge {0, 7} and the vertex 7 in the MST. [We could have also included edge
{1, 2} and vertex 2 in the MST].
7 is added in the MST
Step 4: The edges that connect the incomplete MST with the fringe vertices are {1, 2}, {7, 6}
and {7, 8}. Add the edge {7, 6} and the vertex 6 in the MST as it has the least weight (i.e., 1).

6 is added in the MST


Step 5: The connecting edges now are {7, 8}, {1, 2}, {6, 8} and {6, 5}. Include edge {6, 5}
and vertex 5 in the MST as the edge has the minimum weight (i.e., 2) among them.
Include vertex 5 in the MST
Step 6: Among the current connecting edges, the edge {5, 2} has the minimum weight. So
include that edge and the vertex 2 in the MST.

Include vertex 2 in the MST


Step 7: The connecting edges between the incomplete MST and the other edges are {2, 8}, {2,
3}, {5, 3} and {5, 4}. The edge with minimum weight is edge {2, 8} which has weight 2. So
include this edge and the vertex 8 in the MST.

Add vertex 8 in the MST


Step 8: See here that the edges {7, 8} and {2, 3} both have the same minimum weight. But
including {7, 8} would create a cycle, since both 7 and 8 are already part of the MST. So we
consider the edge {2, 3} and include that edge and vertex 3 in the MST.
Include vertex 3 in MST
Step 9: Only the vertex 4 remains to be included. The minimum weighted edge from the
incomplete MST to 4 is {3, 4}.

Include vertex 4 in the MST


The final structure of the MST is as follows and the weight of the edges of the MST is (4 + 8
+ 1 + 2 + 4 + 2 + 7 + 9) = 37.
The structure of the MST formed using the above method
Note: If we had selected the edge {1, 2} in the third step then the MST would look like the
following.

Structure of the alternate MST if we had selected edge {1, 2} in the MST
How to implement Prim’s Algorithm?
Follow the given steps to utilize the Prim’s Algorithm mentioned above for finding MST of
a graph:
● Create a set mstSet that keeps track of vertices already included in MST.
● Assign a key value to all vertices in the input graph. Initialize all key values as
INFINITE. Assign the key value as 0 for the first vertex so that it is picked first.
● While mstSet doesn’t include all vertices
o Pick a vertex u that is not there in mstSet and has a minimum key value.
o Include u in the mstSet.
o Update the key value of all adjacent vertices of u. To update the key values,
iterate through all adjacent vertices.
o For every adjacent vertex v, if the weight of edge u-v is less than the
previous key value of v, update the key value as the weight of u-v.
The idea of using key values is to pick the minimum weight edge from the cut. The key
values are used only for vertices that are not yet included in MST, the key value for these
vertices indicates the minimum weight edges connecting them to the set of vertices included
in MST.
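A hedged C++ sketch of this key-value idea using a priority_queue (min-heap) with lazy deletion instead of an explicit decrease-key; primMST is a made-up name and the small weighted graph in main() is assumed purely for illustration:

#include <functional>
#include <iostream>
#include <queue>
#include <vector>
using namespace std;

// Prim's MST: returns the total MST weight of a connected undirected graph
// adj[u] holds (neighbour, weight) pairs
int primMST(const vector<vector<pair<int,int>>>& adj) {
    int V = adj.size(), total = 0;
    vector<bool> inMST(V, false);                    // the mstSet described above
    // min-heap of (key, vertex); duplicate entries are allowed and skipped lazily
    priority_queue<pair<int,int>, vector<pair<int,int>>, greater<>> pq;
    pq.push({0, 0});                                 // start from vertex 0 with key 0
    while (!pq.empty()) {
        auto [key, u] = pq.top(); pq.pop();
        if (inMST[u]) continue;                      // stale entry, ignore it
        inMST[u] = true;
        total += key;                                // key = weight of the edge that brought u in
        for (auto [v, w] : adj[u])
            if (!inMST[v])
                pq.push({w, v});                     // candidate edge to a fringe vertex
    }
    return total;
}

int main() {
    // hypothetical undirected weighted graph: 0-1(2), 0-2(3), 1-2(1), 1-3(4), 2-3(5)
    vector<vector<pair<int,int>>> adj(4);
    auto addEdge = [&](int u, int v, int w) {
        adj[u].push_back({v, w});
        adj[v].push_back({u, w});
    };
    addEdge(0, 1, 2); addEdge(0, 2, 3); addEdge(1, 2, 1);
    addEdge(1, 3, 4); addEdge(2, 3, 5);
    cout << "MST weight = " << primMST(adj) << "\n"; // expected: 2 + 1 + 4 = 7
    return 0;
}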
Time Complexity: O(V^2) for the adjacency matrix based implementation. If the input graph is
represented using an adjacency list, then the time complexity of Prim’s algorithm can be reduced
to O(E * log V) (equivalently O(E * log E)) with the help of a binary heap. In this implementation,
we always consider the spanning tree to start from the root of the graph.
Auxiliary Space: O(V) (O(V^2) if the graph itself is stored as an adjacency matrix).
Prim’s algorithm for finding the minimum spanning tree (MST):
Advantages:
1. Prim’s algorithm is guaranteed to find the MST in a connected, weighted graph.
2. It has a time complexity of O(E log V) using a binary heap (or O(E + V log V) using a
Fibonacci heap), where E is the number of edges and V is the number of vertices.
3. It is a relatively simple algorithm to understand and implement compared to some
other MST algorithms.
Disadvantages:
1. Like Kruskal’s algorithm, Prim’s algorithm can be slow on dense graphs with many
edges, as it requires iterating over all edges at least once.
2. Prim’s algorithm relies on a priority queue, which can take up extra memory and slow
down the algorithm on very large graphs.
3. The choice of starting node can affect the MST output, which may not be desirable in
some applications.
Other Implementations of Prim’s Algorithm:
Given below are some other implementations of Prim’s Algorithm
● Prim’s Algorithm for Adjacency Matrix Representation – In this article we have
discussed the method of implementing Prim’s Algorithm if the graph is represented by
an adjacency matrix.
● Prim’s Algorithm for Adjacency List Representation – In this article Prim’s Algorithm
implementation is described for graphs represented by an adjacency list.
● Prim’s Algorithm using Priority Queue: In this article, we have discussed a
time-efficient approach to implement Prim’s algorithm.
Shortest Paths using Dijkstra’s Algorithm
Given a weighted graph and a source vertex in the graph, find the shortest paths from the
source to all the other vertices in the given graph.
Note: The given graph does not contain any negative edge.
Examples:
Input: src = 0, the graph is shown below.
Output: 0 4 12 19 21 11 9 8 14
Explanation: The distance from 0 to 1 = 4.
The minimum distance from 0 to 2 = 12. 0->1->2
The minimum distance from 0 to 3 = 19. 0->1->2->3
The minimum distance from 0 to 4 = 21. 0->7->6->5->4
The minimum distance from 0 to 5 = 11. 0->7->6->5
The minimum distance from 0 to 6 = 9. 0->7->6
The minimum distance from 0 to 7 = 8. 0->7
The minimum distance from 0 to 8 = 14. 0->1->2->8

Dijkstra’s Algorithm using Adjacency Matrix :


The idea is to generate an SPT (shortest path tree) with the given source as its root. The graph
is stored as an adjacency matrix, and we maintain two sets:
● one set contains vertices included in the shortest-path tree,
● the other set includes vertices not yet included in the shortest-path tree.
At every step of the algorithm, find a vertex that is in the other set (not yet included) and
has a minimum distance from the source.
Algorithm :
● Create a set sptSet (shortest path tree set) that keeps track of vertices included in the
shortest path tree, i.e., whose minimum distance from the source is calculated and
finalized. Initially, this set is empty.
● Assign a distance value to all vertices in the input graph. Initialize all distance values
as INFINITE. Assign the distance value as 0 for the source vertex so that it is picked
first.
● While sptSet doesn’t include all vertices
o Pick a vertex u that is not there in sptSet and has a minimum distance value.
o Include u in sptSet.
o Then update the distance value of all adjacent vertices of u.
o To update the distance values, iterate through all adjacent vertices.
o For every adjacent vertex v, if the sum of the distance value of u (from the
source) and the weight of edge u-v is less than the distance value of v,
then update the distance value of v.
Note: We use a boolean array sptSet[] to represent the set of vertices included in the SPT. If a
value sptSet[v] is true, then vertex v is included in the SPT, otherwise not. The array dist[] is
used to store the shortest distance values of all vertices.
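A hedged C++ sketch of this adjacency-matrix version using the dist[] and sptSet[] arrays described above (the 4-vertex graph in main() is made up for illustration, and graph[u][v] = 0 is taken to mean "no edge"):

#include <climits>
#include <iostream>
#include <vector>
using namespace std;

// Dijkstra on an adjacency matrix; returns dist[] from src (0 means "no edge")
vector<int> dijkstra(const vector<vector<int>>& graph, int src) {
    int V = graph.size();
    vector<int> dist(V, INT_MAX);       // distance values, initially INFINITE
    vector<bool> sptSet(V, false);      // true once a vertex is finalized
    dist[src] = 0;
    for (int count = 0; count < V - 1; count++) {
        int u = -1;
        for (int v = 0; v < V; v++)     // pick the unfinalized vertex with minimum dist
            if (!sptSet[v] && (u == -1 || dist[v] < dist[u]))
                u = v;
        sptSet[u] = true;
        for (int v = 0; v < V; v++)     // relax all edges going out of u
            if (!sptSet[v] && graph[u][v] && dist[u] != INT_MAX &&
                dist[u] + graph[u][v] < dist[v])
                dist[v] = dist[u] + graph[u][v];
    }
    return dist;
}

int main() {
    // hypothetical 4-vertex weighted undirected graph
    vector<vector<int>> graph = {
        {0, 4, 0, 8},
        {4, 0, 3, 0},
        {0, 3, 0, 1},
        {8, 0, 1, 0}};
    vector<int> dist = dijkstra(graph, 0);
    for (int v = 0; v < (int)dist.size(); v++)
        cout << "0 -> " << v << " : " << dist[v] << "\n";  // expected: 0 4 7 8
    return 0;
}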
Illustration of Dijkstra Algorithm :
To understand Dijkstra’s algorithm, let's take a graph and find the shortest path from the
source to all nodes.
Consider below graph and src = 0

Step 1:
● The set sptSet is initially empty and the distances assigned to the vertices are {0, INF,
INF, INF, INF, INF, INF, INF, INF}, where INF indicates infinity.
● Now pick the vertex with a minimum distance value. The vertex 0 is picked, include it
in sptSet . So sptSet becomes {0}. After including 0 to sptSet , update distance values
of its adjacent vertices.
● Adjacent vertices of 0 are 1 and 7. The distance values of 1 and 7 are updated as 4
and 8.
The following subgraph shows vertices and their distance values, only the vertices with finite
distance values are shown. The vertices included in SPT are shown in green colour.
Step 2:
● Pick the vertex with minimum distance value and not already included in SPT (not in
sptSET ). The vertex 1 is picked and added to sptSet .
● So sptSet now becomes {0, 1}. Update the distance values of adjacent vertices of 1.
● The distance value of vertex 2 becomes 12 .

Step 3:
● Pick the vertex with minimum distance value and not already included in SPT (not in
sptSET ). Vertex 7 is picked. So sptSet now becomes {0, 1, 7}.
● Update the distance values of adjacent vertices of 7. The distance values of vertices 6
and 8 become finite (9 and 15 respectively).

Step 4:
● Pick the vertex with minimum distance value and not already included in SPT (not in
sptSET ). Vertex 6 is picked. So sptSet now becomes {0, 1, 7, 6} .
● Update the distance values of adjacent vertices of 6. The distance value of vertex 5
and 8 are updated.

We repeat the above steps until sptSet includes all vertices of the given graph. Finally, we get
the following Shortest Path Tree (SPT).

Dijkstra’s algorithm is one of the most important graph algorithms in DSA, but many students
find it difficult at first; working through the illustration above step by step helps in getting a
strong grip on it.
Time Complexity: O(V^2)
Auxiliary Space: O(V)
Notes:
● The code calculates the shortest distance but doesn’t calculate the path information.
Create a parent array, update the parent array when distance is updated and use it to
show the shortest path from source to different vertices.
● The time complexity of the implementation is O(V^2). If the input graph is
represented using an adjacency list, it can be reduced to O(E * log V) with the help of a
binary heap. Please see Dijkstra’s Algorithm for Adjacency List Representation for
more details.
● Dijkstra’s algorithm doesn’t work for graphs with negative weight cycles.
Why does Dijkstra’s Algorithm fail for Graphs having Negative Edges?
The problem with negative weights arises from the fact that Dijkstra’s algorithm assumes that
once a node is added to the set of visited nodes, its distance is finalized and will not change.
However, in the presence of negative weights, this assumption can lead to incorrect results.
Consider the following graph for the example:

In the above graph, A is the source node; among the edges A to B and A to C, A to B has the
smaller weight, and Dijkstra assigns the shortest distance of B as 2. But because of the existence
of a negative edge from C to B, the actual shortest distance reduces to 1, which Dijkstra fails
to detect.
Note: We use Bellman Ford’s Shortest path algorithm in case we have negative edges in the
graph.
Dijkstra’s Algorithm using Adjacency List in O(E logV):
For Dijkstra’s algorithm, it is always recommended to use a heap (or priority queue), as the
required operations (extract-minimum and decrease-key) match the strengths of the heap
(or priority queue). However, the problem is that priority_queue doesn’t support the decrease-key
operation. To resolve this, we do not update a key but insert one more copy of it, so we allow
multiple instances of the same vertex in the priority queue. This approach doesn’t require
decrease-key operations and has the following important properties.
● Whenever the distance of a vertex is reduced, we add one more instance of that vertex to
the priority_queue. Even if there are multiple instances, we only consider the instance
with minimum distance and ignore the other instances.
● The time complexity remains O(E * log V), as there will be at most O(E) entries in
the priority queue and O(log E) is the same as O(log V).

Time Complexity: O(E * logV), Where E is the number of edges and V is the number of
vertices.
Auxiliary Space: O(V)
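A hedged C++ sketch of this priority-queue approach with the "insert another copy instead of decrease-key" idea described above (dijkstraPQ is a made-up name, and the adjacency list of (neighbour, weight) pairs in main() is assumed for illustration):

#include <functional>
#include <iostream>
#include <queue>
#include <vector>
using namespace std;

const int INF = 1e9;

vector<int> dijkstraPQ(const vector<vector<pair<int,int>>>& adj, int src) {
    int V = adj.size();
    vector<int> dist(V, INF);
    // min-heap of (distance, vertex); stale copies are simply skipped
    priority_queue<pair<int,int>, vector<pair<int,int>>, greater<>> pq;
    dist[src] = 0;
    pq.push({0, src});
    while (!pq.empty()) {
        auto [d, u] = pq.top(); pq.pop();
        if (d > dist[u]) continue;               // outdated instance, ignore it
        for (auto [v, w] : adj[u])
            if (dist[u] + w < dist[v]) {         // relaxation of edge u-v
                dist[v] = dist[u] + w;
                pq.push({dist[v], v});           // insert one more copy instead of decrease-key
            }
    }
    return dist;
}

int main() {
    // hypothetical undirected graph: 0-1(4), 0-3(8), 1-2(3), 2-3(1)
    vector<vector<pair<int,int>>> adj(4);
    auto addEdge = [&](int u, int v, int w) {
        adj[u].push_back({v, w}); adj[v].push_back({u, w});
    };
    addEdge(0, 1, 4); addEdge(0, 3, 8); addEdge(1, 2, 3); addEdge(2, 3, 1);
    for (int d : dijkstraPQ(adj, 0)) cout << d << " ";   // expected: 0 4 7 8
    cout << "\n";
    return 0;
}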
Applications of Dijkstra’s Algorithm:
● Google Maps uses Dijkstra’s algorithm to compute the shortest distance between a source
and a destination.
● In computer networking , Dijkstra’s algorithm forms the basis for various routing
protocols, such as OSPF (Open Shortest Path First) and IS-IS (Intermediate System to
Intermediate System).
● Transportation and traffic management systems use Dijkstra’s algorithm to optimize
traffic flow, minimize congestion, and plan the most efficient routes for vehicles.
● Airlines use Dijkstra’s algorithm to plan flight paths that minimize fuel consumption,
reduce travel time.
● Dijkstra’s algorithm is applied in electronic design automation for routing connections
on integrated circuits and very-large-scale integration (VLSI) chips.
