What Is A Graph in Data Structure?
The idea for this blog is to discuss another way of storing data, i.e. the Graph. This is
one of the important topics asked about in the interviews of companies like Google,
Facebook, Amazon, etc. So, in this blog, we will be covering the below topics:
What is a Graph?
Properties of a Graph
Types of Graph
Graph Representation: Adjacency Matrix
Graph Representation: Adjacency List
So, let's get started with the basics of Graph.
G = (V, E)
Here G is the Graph, V is the set of vertices or nodes and E is the set of edges
in the Graph G.
The following is the pictorial representation of a Graph having 5 nodes or vertices:
The above is an example of a Graph having the set of vertices V = {1, 2, 3, 4, 5} and the
set of edges E = {1-2, 1-3, 1-4, 2-1, 2-5, 3-1, 3-4, 3-5, 4-1, 4-3, 5-2, 5-3}. Each
undirected edge is listed once from either end, so the graph has 6 distinct edges.
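As a minimal sketch, the graph above can be written down in Python as a set of vertices and a set of edges. The names `V`, `E`, `listed` and `neighbours` are our own choices for illustration. Each undirected edge is stored once as a `frozenset`, so 1-2 and 2-1 count as the same edge.

```python
# Vertices of the example graph.
V = {1, 2, 3, 4, 5}

# The edge set from the text, which names every undirected edge from both ends.
listed = [(1, 2), (1, 3), (1, 4), (2, 1), (2, 5), (3, 1),
          (3, 4), (3, 5), (4, 1), (4, 3), (5, 2), (5, 3)]

# Store each undirected edge once: frozenset({1, 2}) == frozenset({2, 1}).
E = {frozenset(pair) for pair in listed}

def neighbours(v):
    """Return the set of vertices that share an edge with v."""
    return {u for edge in E if v in edge for u in edge if u != v}

print(len(E))                 # number of distinct undirected edges
print(sorted(neighbours(3)))  # vertices adjacent to node 3
```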
Some of the practical scenarios where we can use the Graph to represent data are:
Properties of a Graph
Here in this section of the blog, we will learn some of the properties of a Graph that will
be helpful in solving the graph problems:
Distance between vertices: It is the number of edges on the shortest path
between two nodes. If there is more than one possible path from one node to
another, then the distance between those two vertices is the length of the shortest
of those paths. It is represented by:
d(A, B); Here, A and B are the nodes and d is the distance between these two
nodes
In the above example, the distance between node A and B can be represented as d(A,
B) and it is equal to 1 because:
In the above example, the eccentricity of node A, i.e. the maximum distance from A to
any other vertex, is 2.
Central Point: If the eccentricity of a vertex is equal to the radius of the graph
(the minimum eccentricity over all of its vertices), then that vertex is called a
central point of the graph.
e(V) = r(G); Here e(V) is the eccentricity of vertex V and r(G) is the radius of graph G
Circumference: It is the total number of edges that are present in the longest
cycle of a graph.
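The distance-based properties above can be computed with a breadth-first search. The following is only a sketch: it assumes a connected, unweighted graph stored as a dictionary of neighbour lists, and the function names are hypothetical. The radius is taken to be the smallest eccentricity over all nodes. The graph used is the 5-node example from the start of this blog.

```python
from collections import deque

# Adjacency list of the 5-node example graph.
graph = {1: [2, 3, 4], 2: [1, 5], 3: [1, 4, 5], 4: [1, 3], 5: [2, 3]}

def distances_from(graph, source):
    """Breadth-first search: fewest edges from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def eccentricity(graph, v):
    """Maximum distance from v to any other node (graph assumed connected)."""
    return max(distances_from(graph, v).values())

def radius(graph):
    """Smallest eccentricity over all nodes."""
    return min(eccentricity(graph, v) for v in graph)

def central_points(graph):
    """Nodes whose eccentricity equals the radius of the graph."""
    r = radius(graph)
    return [v for v in graph if eccentricity(graph, v) == r]

print(distances_from(graph, 1)[5])  # d(1, 5)
print(eccentricity(graph, 1))
print(central_points(graph))
```

In this particular graph every node has eccentricity 2, so every node is a central point. On a simple path 1-2-3, only the middle node would qualify.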
These are some of the properties of a Graph. Let's look at some of the types of Graph.
Types of a Graph
In this part of the blog, we will learn various types of graphs and this will help us in
transforming a real-life problem into some graph problem. You will get to know which
graph should be used when.
Directed and Undirected Graph: It is a very basic type of graph. A graph can
be directed or undirected i.e. the edges of a graph can be undirected or directed. In
an undirected graph, you can traverse every edge in both directions i.e. if
there is an edge between node A and node B, then you can traverse from node
A to node B or from node B to node A. But in a directed graph, each edge has a
direction and you can traverse it in the given direction only.
For an undirected graph with n nodes, the total number of possible edges will be:
nC2 i.e. n(n-1)/2
While for a directed graph with n nodes, the total number of possible edges will be:
2*nC2 i.e. 2(n(n-1))/2 = n(n-1)
Weighted and Unweighted Graph: You can assign some weights or costs over
an edge of a graph. If there is some weight or cost over the edges of the graph, then
it is known as a weighted graph otherwise it is called an unweighted graph.
For example, suppose we are representing the cities of a state as nodes and these
nodes are connected with the help of edges. So, here we can put the distance between
these cities over the edges i.e. we are putting some weight over the edges of the graph.
So, this type of graph is called a weighted graph. If there is no weight on the edges, then
it is called an unweighted graph.
Cyclic and Acyclic Graph: If there is a cycle in a graph, then that graph is called
a Cyclic Graph. A graph has a cycle if you can start from a node/vertex,
traverse some other nodes without reusing an edge, and come back to the
same node. If there is no cycle present in the graph, then that graph is called
an Acyclic Graph. For a Cyclic Graph, at least one cycle is necessary.
Connected and Disconnected Graph: In a Connected Graph, from each node,
there is a path to all the other nodes i.e. from each node, you can access all the
other nodes of the graph. But in a Disconnected Graph, there is at least one pair
of nodes with no path between them.
Sparse and Dense Graph: If the number of edges of a graph is close to the total
number of possible edges of that graph, then the graph is said to be a Dense Graph,
otherwise, it is said to be a Sparse Graph.
For example, if a graph is an undirected graph and there are 5 nodes, then the total
number of possible edges will be n(n-1)/2 i.e. 5(5-1)/2 = 10. Now, if the graph contains 4
edges, then the graph is said to be a Sparse Graph because 4 is much less than 10, and if
the graph contains 8 edges, then the graph is said to be a Dense Graph because 8 is close
to 10 i.e. the total number of possible edges.
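The edge-count formulas and the sparse/dense distinction can be sketched in a few lines. The function names and the 0.5 cut-off below are assumptions made for this example only, since the blog does not fix an exact boundary between sparse and dense. The sketch also assumes n is at least 2 and that the graph has no self-loops or repeated edges.

```python
def max_edges(n, directed=False):
    """Maximum number of edges in a graph with n nodes (no self-loops)."""
    return n * (n - 1) if directed else n * (n - 1) // 2

def density(n, m, directed=False):
    """Fraction of the possible edges that are actually present."""
    return m / max_edges(n, directed)

def is_dense(n, m, directed=False, threshold=0.5):
    # The 0.5 cut-off is an arbitrary choice for this sketch.
    return density(n, m, directed) >= threshold

print(max_edges(5))                 # undirected graph with 5 nodes
print(max_edges(5, directed=True))  # directed graph with 5 nodes
print(is_dense(5, 4), is_dense(5, 8))
```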
These are some of the types of graphs that we use in data structures.
NOTE: A graph can be a combination of two or more of the above graph types.
For example:
The above image is a combination of a weighted, undirected, cyclic, and dense graph.
Graph Representation
Till now, we have seen the pictorial representation of a graph. But in a programming
language, we can't use this pictorial representation to perform various operations on the
graph. So, to represent a graph, we use the below two methods:
Adjacency Matrix
Adjacency List
Let's learn one by one.
Adjacency Matrix
Let us assume that the graph is G(n, m) . Here G is the graph, n is the total number of
nodes and m is the total number of edges present in the graph G .
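We know that the total number of possible edges of an undirected graph is n(n-1)/2 and
that of a directed graph is n(n-1). So, the value of " m " should lie between 0 and
n(n-1)/2 for an undirected graph, and between 0 and n(n-1) for a directed graph.
In the adjacency matrix, matrix[i][j] is 1 if there is an edge from node i to node j and 0
otherwise.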
NOTE: For a weighted graph, instead of putting " 1 " in the matrix, you can put the
weight of the edge in the matrix i.e. matrix[i][j] holds the weight of the edge from node
i to node j.
Now, if you want to find out whether there is an edge between two nodes of a graph, then
you can do this in O(1) time. For example, if matrix[i][j] == 1, then there is an edge
between node i and node j.
Inserting edge
If you want to insert an edge between node i and node j, then all you need to do is
make matrix[i][j] = 1 (and also matrix[j][i] = 1 if the graph is undirected). This
operation can be done in O(1).
Deleting edge
If you want to delete an edge between node i and node j, then all you need to do is
make matrix[i][j] = 0 (and also matrix[j][i] = 0 if the graph is undirected). This
operation can be done in O(1).
★ In an undirected graph, every edge is bidirectional i.e. you can go from node i to node
j and vice-versa. So, the adjacency matrix is symmetric about the main diagonal i.e.
matrix[i][j] is always equal to matrix[j][i], and it is enough to store either the part
above the diagonal or the part below it. In this way, you can save space while storing the
values in the matrix.
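The operations described above can be put together in a small class. This is a minimal sketch for an undirected, unweighted graph whose nodes are numbered 0 to n-1. The class name `AdjacencyMatrix` and its method names are our own, and for simplicity it stores the full n*n matrix rather than only one half.

```python
class AdjacencyMatrix:
    """Undirected, unweighted graph on nodes 0 .. n-1 stored as an n*n matrix."""

    def __init__(self, n):
        self.n = n
        self.matrix = [[0] * n for _ in range(n)]

    def add_edge(self, i, j):
        # O(1): set both cells because the edge is bidirectional.
        self.matrix[i][j] = 1
        self.matrix[j][i] = 1

    def remove_edge(self, i, j):
        # O(1): clear both cells.
        self.matrix[i][j] = 0
        self.matrix[j][i] = 0

    def has_edge(self, i, j):
        # O(1): a single lookup.
        return self.matrix[i][j] == 1

g = AdjacencyMatrix(5)
g.add_edge(0, 1)
g.add_edge(0, 2)
print(g.has_edge(1, 0))  # the edge is visible from both ends
g.remove_edge(0, 1)
print(g.has_edge(0, 1))
```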
In an Adjacency matrix, we create a matrix of size n*n, where n is the number of nodes
present in the graph. Now, suppose the graph is a Sparse graph i.e. the total number of
edges present in the graph is very small compared to the total number of possible edges.
For example, suppose the total number of nodes is 5 and the number of edges is 4. Then we
will make a matrix of size 5*5 = 25, but only 4 cells hold an edge if the graph is directed
(8 cells if it is undirected, because each edge is stored at both matrix[i][j] and
matrix[j][i]). The rest of the cells are vacant. So, here there is a waste of memory.
In order to solve this problem of wastage of memory in case of a Sparse graph, we use
the Adjacency List.
Adjacency List
Let us assume that the graph is G(n, m) . Here G is the graph, n is the number of nodes
and m is the total number of edges present in the graph G .
Now, to represent the graph in the form of an Adjacency List, we will create an array of
" n " pointers. Each entry in the array corresponds to one node of the graph and points to
a linked list of the nodes that can be reached directly from that node. All the nodes
connected to a particular node are added to the list corresponding to that node. For
example, if node B and node C are connected to node A, then the linked list for A
will be A -> B -> C.
In case of an undirected graph, the total number of nodes across all the linked lists will
be 2m (because each edge appears in the lists of both of its end nodes), where " m " is the
number of edges of the graph.
Also, in the case of an undirected graph, the total space used is O(n + 2m).
In case of a directed graph, the total number of nodes across all the linked lists will be
m, where " m " is the number of edges of the graph.
Also, in the case of a directed graph, the total space used is O(n + m).
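The entry counts above can be checked with a short sketch. It assumes nodes are numbered 0 to n-1 and uses Python lists in place of linked lists. The function name `build_adjacency_list` and the sample edges are made up for illustration.

```python
def build_adjacency_list(n, edges, directed=False):
    """Return a list of n neighbour lists for nodes 0 .. n-1."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        if not directed:
            adj[v].append(u)  # an undirected edge is stored at both ends
    return adj

edges = [(0, 1), (0, 2), (1, 3), (2, 3)]  # m = 4

undirected = build_adjacency_list(4, edges)
directed = build_adjacency_list(4, edges, directed=True)

print(undirected)
print(sum(len(lst) for lst in undirected))  # total entries: 2m
print(sum(len(lst) for lst in directed))    # total entries: m
```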
If you want to add an edge between node i and node j, then you have to first go
to the list of node i, traverse the whole list, and add the new edge at the end.
Similarly, for deletion of an edge between node i and node j, you have to go to the list
of node i, find the edge in that linked list, and then perform the delete operation. Both
operations take time proportional to the number of neighbours of node i. So, insertion and
deletion are costlier in case of an adjacency list than in an adjacency matrix.
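The following sketch shows insertion and deletion on an adjacency list for a directed graph. The helper names are hypothetical. Note one difference from the linked-list description: a Python list can append at the end without a traversal, so the scan here comes from checking that the edge is not already present.

```python
def add_edge(adj, i, j):
    """Add the directed edge i -> j unless it is already present."""
    if j not in adj[i]:   # scans the list of node i
        adj[i].append(j)

def remove_edge(adj, i, j):
    """Remove the directed edge i -> j if it is present."""
    if j in adj[i]:       # scans the list of node i to find the edge
        adj[i].remove(j)

adj = [[] for _ in range(3)]
add_edge(adj, 0, 1)
add_edge(adj, 0, 2)
add_edge(adj, 0, 1)  # duplicate, ignored
remove_edge(adj, 0, 2)
print(adj)
```

For an undirected graph, each helper would be called twice, once for i -> j and once for j -> i.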
For graph traversal, we normally use Breadth-First Search or Depth-First Search. In the
last blog, we learned about Breadth-First Search. Now, in this blog, we will be learning
about the other graph traversal algorithm i.e. Depth-First Search or DFS. So, let's get
started.
In DFS, we keep on moving to an unvisited neighbour of the current node as long as there
is one. If there is no unvisited neighbour, then we backtrack to the previous node and
visit the remaining unvisited nodes ahead of that node. In this sense, DFS is a form of
exhaustive search.
All you need to do is select one path and traverse to the end of this path. After
that, come back to the previous node and traverse the untraversed nodes that are
connected to it. This process continues again and again until no untraversed node
is left.
If we reach the end of one path, then we come back to the most recently visited node that
still has unvisited neighbours. So, while writing the code for the DFS algorithm, we are
going to use the Stack data structure because a Stack follows the Last In First Out order
i.e. LIFO order.
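Here is a minimal sketch of DFS with an explicit stack. It assumes the graph is stored as a dictionary of neighbour lists, and the function name `dfs` is our own. The graph used is the 5-node example from the start of this blog.

```python
def dfs(graph, source):
    """Return nodes in the order an iterative depth-first search visits them."""
    visited = set()
    order = []
    stack = [source]
    while stack:
        node = stack.pop()  # LIFO: take the most recently added node
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        # Push neighbours in reverse so the first listed neighbour is explored first.
        for nxt in reversed(graph[node]):
            if nxt not in visited:
                stack.append(nxt)
    return order

graph = {1: [2, 3, 4], 2: [1, 5], 3: [1, 4, 5], 4: [1, 3], 5: [2, 3]}
print(dfs(graph, 1))  # [1, 2, 5, 3, 4]
```

Starting from node 1, the search follows the path 1 -> 2 -> 5 -> 3 -> 4 to its end before backtracking. The order depends on how the neighbours are listed, so other valid DFS orders exist for the same graph.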
Application of DFS
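Minimum Spanning Tree: DFS of a connected, unweighted graph produces a spanning
tree. Since every spanning tree of such a graph has the same number of edges (n-1),
it is also a Minimum Spanning Tree.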
Bipartite Graph: DFS can be used to find if a graph is bipartite or not.
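Cycle detection: Since we are maintaining the list of visited nodes, DFS can be
used to detect if a cycle is present in the graph or not: reaching an already visited
node again, other than the one we just came from, means there is a cycle.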
Topological Sorting: Topological Sorting can be done with the help of DFS.
Strongly Connected Graph: Using DFS, we can find if a directed graph is strongly
connected or not i.e. whether there is a path from every node to every other node
of the graph.
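As one illustration of these applications, the sketch below uses DFS to detect a cycle in an undirected graph. It assumes a simple graph (no repeated edges) stored as a dictionary of neighbour lists. The function name `has_cycle` is our own, and the recursive form is only suitable for small graphs.

```python
def has_cycle(graph):
    """Return True if the undirected graph contains a cycle."""
    visited = set()

    def visit(node, parent):
        visited.add(node)
        for nxt in graph[node]:
            if nxt not in visited:
                if visit(nxt, node):
                    return True
            elif nxt != parent:
                # Reached a visited node by an edge other than the one we came in on.
                return True
        return False

    # Start a search from every unvisited node so disconnected graphs are covered.
    return any(node not in visited and visit(node, None) for node in graph)

tree = {1: [2, 3], 2: [1], 3: [1]}
triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
print(has_cycle(tree))      # False
print(has_cycle(triangle))  # True
```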
Graph traversal is the process of visiting every node reachable from a source node exactly
once, in some defined order. The order of traversal of the nodes of a graph is very
important while solving some graph problems. Also, you must track the nodes that are
already visited because, in traversal, you need to traverse a node only once. So, a proper
list of the traversed nodes of the graph must be maintained. There are two ways of Graph
traversal:
Breadth-First Search
Breadth-First Search or BFS is a graph traversal algorithm that is used to traverse the
graph level wise i.e. it is similar to the level-order traversal of a tree.
Here, you will start traversing the graph from a source node and from that node you will
first traverse the nodes that are the neighbours of the source node. After traversing all
the neighbour nodes of the source node, you need to traverse the neighbours of those
neighbours and so on.
Based on the source node, the whole graph can be divided into various levels i.e. the
nodes that are at distance 1 from the source node are said to be at level 1. Similarly, the
nodes that are at distance 2 from the source node are said to be at level 2 and so on.
Based on the layers of the graph, the BFS can be performed by the following steps:
BFS implementation
In order to implement BFS, we need to take care of the following things:
Traversal should be level wise i.e. first level 1 will be traversed, followed by level 2,
level 3, and so on.
The nodes should be visited once. None of the nodes should be visited twice.
So, to satisfy the above conditions, we make use of the Queue data structure, which follows
First In First Out (FIFO) order. We insert the source node in the queue and mark it as
visited. Then we repeatedly remove the node at the front of the queue and insert all of its
unvisited neighbour nodes at the back, marking each of them as visited. Since the queue
follows FIFO order, the node that entered first will be visited first and its
neighbours will be added to the queue first. During insertion of nodes in the queue, we
check whether each node is already visited. If it is visited, then we do not add it to the
queue again. Otherwise, we add the node to the queue.
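The queue-based procedure can be sketched as follows. It assumes the graph is stored as a dictionary of neighbour lists. The function name `bfs` is our own, and the `level` dictionary doubles as the record of visited nodes. The graph used is the 5-node example from the start of this blog.

```python
from collections import deque

def bfs(graph, source):
    """Return the nodes in breadth-first order together with their levels."""
    level = {source: 0}  # also serves as the visited set
    order = []
    queue = deque([source])
    while queue:
        node = queue.popleft()  # FIFO: take the oldest node in the queue
        order.append(node)
        for nxt in graph[node]:
            if nxt not in level:  # skip nodes that were already queued or visited
                level[nxt] = level[node] + 1
                queue.append(nxt)
    return order, level

graph = {1: [2, 3, 4], 2: [1, 5], 3: [1, 4, 5], 4: [1, 3], 5: [2, 3]}
order, level = bfs(graph, 1)
print(order)  # [1, 2, 3, 4, 5]
print(level)  # {1: 0, 2: 1, 3: 1, 4: 1, 5: 2}
```

Nodes 2, 3 and 4 are at level 1 and node 5 is at level 2, so the traversal finishes level 1 before it reaches level 2.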
Application of BFS
Shortest Path: When you are dealing with the unweighted graph i.e. there is no
weight on the edges of the graph, then the shortest path from one node to other
can be found by using BFS.
Neighbour places: While finding some nearest hotel or something else, the GPS
of your phone can use BFS for finding the neighbouring nodes.
Cycle detection: BFS can be used to detect cycles in a graph. This is possible
because in BFS we are maintaining a list of visited nodes.
Crawlers: The crawlers in search engines use BFS to move from one page to
another. For example, if you are at some page p1 and the page p1 has links to
page p2 and page p3, then page p2 and p3 are the neighbours of page p1. So,
here BFS can be used to crawl the neighbouring pages.
Finding nodes connected to a particular node: You can find all the nodes
connected to a particular node by using the Breadth-First Search.
Route finding: BFS can be used to find if there is a route between two cities or
not. Here, the cities are the nodes of the graph and the roads between these cities
are the edges of the graph.
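The shortest path and route finding applications can be sketched with one function. It assumes an unweighted graph stored as a dictionary of neighbour lists. The function name `shortest_path` and the isolated node 6 are made up for illustration. The function records the parent of each node during BFS and walks back from the goal to rebuild the path.

```python
from collections import deque

def shortest_path(graph, start, goal):
    """Return one shortest path from start to goal, or None if there is no route."""
    parent = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk back to the start via parents
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in graph[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None

graph = {1: [2, 3, 4], 2: [1, 5], 3: [1, 4, 5], 4: [1, 3], 5: [2, 3], 6: []}
print(shortest_path(graph, 4, 5))  # [4, 3, 5]
print(shortest_path(graph, 1, 6))  # None, node 6 cannot be reached
```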
These are some of the applications of Breadth-First Search.