DS Unit 5
INTRODUCTION
A graph is a non-linear data structure consisting of nodes and edges. More precisely, a graph is a finite set of vertices
(or nodes) together with a set of edges, each of which connects a pair of nodes.
Or
A graph is a mathematical structure consisting of a set of vertices (also called nodes) {v1, v2, …, vn} and a set
of edges {e1, e2, …, em}. An edge is a pair of vertices {vi, vj}, i, j ∈ {1, …, n}. The two vertices are called the
endpoints of the edge.
Formally: G = (V, E), where V is a set and E ⊆ V × V.
A graph may be either undirected or directed. Intuitively, an undirected edge models a "two-way" or "duplex"
connection between its endpoints, while a directed edge is a one-way connection, and is typically drawn as an
arrow. A directed edge is often called an arc. Mathematically, an undirected edge is an unordered pair of
vertices, and an arc is an ordered pair. The maximum number of edges in a simple undirected graph (no self-loops
or parallel edges) on n vertices is n(n - 1)/2, while a directed graph can have at most n² edges if self-loops are
allowed and n(n - 1) otherwise.
G = (V, E) is undirected if for all v, w ∈ V: (v, w) ∈ E ⟺ (w, v) ∈ E. Otherwise it is directed.
Graphs can be classified by whether or not their edges have weights. In a weighted graph, each edge has a weight,
which typically represents the cost of traversing that edge.
Example: weights are distances between cities.
In an unweighted graph, edges have no weight; they simply show connections.
BASIC TERMINOLOGY
➢ Edges, also called arcs, are represented by (u, v) and are either:
▪ Directed, if the pair (u, v) is ordered, with u the origin and v the destination.
▪ Undirected, if the pair (u, v) is unordered.
➢ End-vertices of an edge are the endpoints of the edge.
➢ Two vertices are adjacent if they are endpoints of the same edge.
➢ An edge is incident on a vertex if the vertex is an endpoint of the edge.
➢ Outgoing edges of a vertex are directed edges for which the vertex is the origin.
➢ Incoming edges of a vertex are directed edges for which the vertex is the destination.
➢ Degree of a vertex is the number of edges incident to that vertex. In an undirected graph, this is the number
of edges connected to the node.
➢ Out-degree, outdeg(v), is the number of outgoing edges.
➢ In-degree, indeg(v), is the number of incoming edges.
➢ Parallel edges or multiple edges are edges of the same type with the same end-vertices.
➢ Self-loop is an edge whose two end-vertices are the same vertex.
➢ Simple graphs have no parallel edges or self-loops.
➢ Path is a sequence of alternating vertices and edges such that each edge joins the vertex before it to the
vertex after it. Frequently only the vertices are listed, especially if there are no parallel edges.
➢ Cycle is a path that starts and ends at the same vertex.
➢ Simple path is a path with distinct vertices.
➢ Directed path is a path of only directed edges
➢ Directed cycle is a cycle of only directed edges.
➢ Sub-graph is a subset of the vertices and edges of a graph that itself forms a graph.
➢ Spanning sub-graph is a sub-graph that contains all the vertices of the graph.
➢ Connected graph has all pairs of vertices connected by at least one path.
➢ Connected component is a maximal connected sub-graph of a graph that is not connected.
➢ Forest is a graph without cycles.
➢ Tree is a connected forest (the trees studied previously are rooted trees; these are free trees).
➢ Spanning tree is a spanning sub graph that is also a tree.
➢ Simple graph : A graph or directed graph which does not have any self-loop or parallel edges is called a
simple graph.
➢ Multi-graph : A graph which has either a self-loop or parallel edges or both is called a multi-graph.
➢ Complete graph : A graph is complete if each vertex is adjacent to every other vertex, i.e., there is an edge
between every pair of distinct nodes. An undirected complete graph on n vertices contains
n(n – 1)/2 edges.
➢ Regular graph : A graph is regular if every node is adjacent to the same number of nodes. For example, in a
3-regular graph every node is adjacent to exactly 3 nodes.
➢ Planar graph : A graph is planar if it can be drawn in a plane without any two edges crossing.
➢ Connected graph : In a graph G, two vertices v1 and v2 are said to be connected if there is a path in G from
v1 to v2 or from v2 to v1. A connected directed graph can be of two types :
▪ Strongly connected graph
▪ Weakly connected graph
➢ Acyclic graph : If a graph (digraph) does not have any cycle then it is called an acyclic graph.
➢ Cyclic graph : A graph that has cycles is called a cyclic graph.
➢ Biconnected graph : A connected graph with no articulation points (vertices whose removal disconnects the
graph) is called a biconnected graph.
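Several of the terms above (degree, in-degree, out-degree, simple graph) can be computed directly from a list of edges. The following is a minimal sketch in Python; the function names and the edge-list input format are assumptions made for illustration, not part of the notes.

```python
from collections import Counter


def degrees(edges):
    """Degree of each vertex of an undirected graph given as (u, v) pairs.

    A self-loop (u, u) contributes 2 to the degree of u.
    """
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


def in_out_degrees(arcs):
    """In-degree and out-degree of each vertex of a directed graph."""
    indeg, outdeg = Counter(), Counter()
    for u, v in arcs:
        outdeg[u] += 1  # arc leaves its origin u
        indeg[v] += 1   # arc enters its destination v
    return dict(indeg), dict(outdeg)


def is_simple(edges):
    """True if an undirected edge list has no self-loops and no parallel edges."""
    seen = set()
    for u, v in edges:
        if u == v:
            return False            # self-loop
        key = frozenset((u, v))     # unordered pair
        if key in seen:
            return False            # parallel edge
        seen.add(key)
    return True


edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"), ("C", "D")]
print(degrees(edges))
print(is_simple(edges))
```

In an undirected graph the degrees always sum to twice the number of edges, which is a quick sanity check on the output.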
Applications of graph :
➢ A graph is a non-linear data structure used to represent relationships between objects, and many operations
and algorithms are defined on it.
➢ Graphs are used for topological sorting.
➢ Graphs are used to find shortest paths.
➢ Graphs are used in problems that minimize some aspect of the graph, such as the total distance needed to
connect all the vertices in the graph.
GRAPH REPRESENTATIONS
There are two standard ways of maintaining a graph G in the memory of a computer.
1. The sequential representation (adjacency matrix)
2. The linked representation (adjacency list)
An adjacency matrix is one of the two common ways to represent a graph. The adjacency matrix shows which
nodes are adjacent to one another. Two nodes are adjacent if there is an edge connecting them. In the case of a
directed graph, if node j is adjacent to node i, there is an edge from i to j . In other words, if j is adjacent to i,
you can get from i to j by traversing one edge. For a given graph with n nodes, the adjacency matrix will have
dimensions n × n. For an unweighted graph, the adjacency matrix will be populated with Boolean values.
For any given node i, you can determine its adjacent nodes by looking at row i, that is, the entries (i, 1…n) of
the adjacency matrix. A value of true at (i, j) indicates that there is an edge from node i to node j, and false
indicates no edge. In an undirected graph, the values of (i, j) and (j, i) will be equal. In a weighted graph, the
Boolean values are replaced by the weight of the edge connecting the two nodes, with a special value that
indicates the absence of an edge.
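Adjacency matrix of an undirected graph (symmetric):
     A  B  C  D
A    0  1  1  1
B    1  0  0  1
C    1  0  0  1
D    1  1  1  0

Adjacency matrix of a directed graph:
     A  B  C  D
A    0  1  1  1
B    0  0  0  1
C    0  0  0  0
D    0  0  1  0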
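The two matrices above can be rebuilt from edge lists. The sketch below is illustrative only: the function name `adjacency_matrix` and the edge lists are assumptions, chosen so that the output matches the four-node matrices shown.

```python
def adjacency_matrix(vertices, edges, directed=False):
    """Build an n x n 0/1 adjacency matrix from a list of (u, v) edges."""
    index = {v: i for i, v in enumerate(vertices)}  # vertex label -> row/column
    n = len(vertices)
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        if not directed:
            # An undirected edge is stored in both directions,
            # so the matrix is symmetric.
            matrix[index[v]][index[u]] = 1
    return matrix


vertices = ["A", "B", "C", "D"]
undirected_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"), ("C", "D")]
directed_arcs = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"), ("D", "C")]

undirected = adjacency_matrix(vertices, undirected_edges)
directed = adjacency_matrix(vertices, directed_arcs, directed=True)

for row in undirected:
    print(row)
```

Looking up whether an edge exists is a single index operation, but the matrix always uses n × n cells regardless of how many edges the graph has.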
The adjacency list is another common representation of a graph. There are many ways to implement this
representation. One way is to have the graph maintain a list of lists, in which the first list is a list of
indices corresponding to each node in the graph. Each of these refers to another list that stores the index of
each node adjacent to this one. It might also be useful to associate the weight of each link with the adjacent
node in this list.
Example: An undirected graph contains four nodes 1, 2, 3 and 4. 1 is linked to 2 and 3. 2 is linked to 3. 3 is linked
to 4.
1 - [2, 3]
2 - [1, 3]
3 - [1, 2, 4]
4 - [3]
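Adjacency list of the undirected graph on nodes A, B, C and D whose adjacency matrix is given above:
o A: B, C, D
o B: A, D
o C: A, D
o D: A, B, C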
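The list-of-lists idea can be sketched with a Python dictionary that maps each node to the list of its neighbors. The function name `adjacency_list` is an assumption for illustration; the edges are those of the four-node example above (1 linked to 2 and 3, 2 linked to 3, 3 linked to 4).

```python
def adjacency_list(edges, directed=False):
    """Build a {node: [neighbors]} mapping from a list of (u, v) edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, [])           # make sure v appears even with no out-edges
        if not directed:
            adj[v].append(u)            # undirected: record the edge both ways
    # Sort for a stable, readable result.
    return {node: sorted(neighbors) for node, neighbors in sorted(adj.items())}


graph = adjacency_list([(1, 2), (1, 3), (2, 3), (3, 4)])
for node, neighbors in graph.items():
    print(node, "-", neighbors)
```

Unlike the matrix, this structure uses space proportional to the number of nodes plus the number of edges, which is why it is usually preferred for sparse graphs.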
GRAPH TRAVERSAL
Many graph algorithms require one to systematically examine the nodes and edges of a graph G. There are two
standard ways that this is done. One way is called a breadth-first search, and the other is called a depth-first
search. The breadth-first search will use a queue as an auxiliary structure to hold nodes for future processing, and
analogously, the depth-first search will use a stack.
During the execution of our algorithms, each node N of G will be in one of three states, called the status of N, as
follows:
STATUS = 1: (Ready state.) The initial state of the node N.
STATUS = 2: (Waiting state.) The node N is on the queue or stack, waiting to be processed.
STATUS = 3: (Processed state.) The node N has been processed.
We now discuss the two searches.
1. Breadth-First Search: The general idea behind a breadth-first search beginning at a starting node A is
as follows. First we examine the starting node A. Then we examine all the neighbors of A. Then we examine all
the neighbors of the neighbors of A. And so on. Naturally, we need to keep track of the neighbors of a node, and
we need to guarantee that no node is processed more than once. This is accomplished by using a queue to hold
nodes that are waiting to be processed, and by using a field STATUS which tells us the current status of any node.
The algorithm follows.
Algorithm
This algorithm executes a breadth-first search on a graph G beginning at a starting node A.
1. Initialize all nodes to the ready state (STATUS = 1).
2. Put the starting node A in QUEUE and change its status to the waiting state (STATUS = 2).
3. Repeat Steps 4 and 5 until QUEUE is empty:
4. Remove the front node N of QUEUE. Process N and change the status of N to the processed state (STATUS = 3).
5. Add to the rear of QUEUE all the neighbors of N that are in the ready state (STATUS = 1), and change their
status to the waiting state (STATUS = 2).
[End of Step 3 loop.]
6. Exit.
The above algorithm will process only those nodes which are reachable from the starting node A.
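The six steps above translate almost line for line into code. This is a minimal sketch, assuming the graph is a dictionary that has every node as a key and maps it to a list of neighbors; the function name `bfs` and the small sample graph are hypothetical.

```python
from collections import deque

READY, WAITING, PROCESSED = 1, 2, 3


def bfs(graph, start):
    """Return the nodes reachable from start in breadth-first order."""
    status = {node: READY for node in graph}    # Step 1
    order = []
    queue = deque([start])                      # Step 2
    status[start] = WAITING
    while queue:                                # Step 3
        node = queue.popleft()                  # Step 4: remove the front node
        order.append(node)                      # "process" the node
        status[node] = PROCESSED
        for neighbor in graph[node]:            # Step 5: enqueue ready neighbors
            if status[neighbor] == READY:
                queue.append(neighbor)
                status[neighbor] = WAITING
    return order                                # Step 6


sample = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": [], "E": ["A"]}
print(bfs(sample, "A"))
```

Node E never appears in the result because it is not reachable from A, which illustrates the remark that only reachable nodes are processed.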
EXAMPLE : Implement BFS algorithm to find the shortest path from node A to J.
Solution: Adjacency list of the graph is :
A :F,C,B
B :G,C
C :F
D :C
E :D,C,J
F :D
G :C,E
J :D,K
K :E,G
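J is our final destination. Using the node from which each node was first reached, we now backtrack from J to
find the path from J to A : J ← E ← G ← B ← A. The shortest path is therefore A → B → G → E → J.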
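The backtracking step can be sketched by recording, for each node, the node from which it was first reached. The function name `bfs_shortest_path` and the `parent` dictionary are assumptions for illustration; the graph is the adjacency list given in this example.

```python
from collections import deque


def bfs_shortest_path(graph, source, target):
    """Return a path with the fewest edges from source to target, or None."""
    parent = {source: None}          # node -> node from which it was first reached
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for neighbor in graph.get(node, []):
            if neighbor not in parent:       # still in the ready state
                parent[neighbor] = node
                queue.append(neighbor)
    if target not in parent:
        return None                          # target is not reachable
    path = []
    node = target
    while node is not None:                  # backtrack from target to source
        path.append(node)
        node = parent[node]
    return path[::-1]


graph = {
    "A": ["F", "C", "B"], "B": ["G", "C"], "C": ["F"], "D": ["C"],
    "E": ["D", "C", "J"], "F": ["D"], "G": ["C", "E"],
    "J": ["D", "K"], "K": ["E", "G"],
}
print(bfs_shortest_path(graph, "A", "J"))   # ['A', 'B', 'G', 'E', 'J']
```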
2. Depth-First Search: The general idea behind a depth-first search beginning at a starting node A is as follows.
First we examine the starting node A. Then we examine each node N along a path P which begins at A; that is, we
process a neighbor of A, then a neighbor of a neighbor of A, and so on. After coming to a "dead end," that is, to
the end of the path P, we backtrack on P until we can continue along another path P'. And so on. (This algorithm is
similar to the inorder traversal of a binary tree, and it is also similar to the way one might travel
through a maze.) The algorithm is very similar to the breadth-first search except that now we use a stack instead
of the queue. Again, a field STATUS is used to tell us the current status of a node. The algorithm follows.
Algorithm: This algorithm executes a depth-first search on a graph G beginning at its starting node A.
1. Initialize all nodes to the ready state (STATUS = 1).
2. Push the starting node A onto STACK and change its status to the waiting state (STATUS = 2).
3. Repeat Steps 4 and 5 until STACK is empty.
4. Pop the top node N of STACK. Process N and change its status to the processed state (STATUS = 3).
5. Push onto STACK all the neighbors of N that are still in the ready state (STATUS = 1), and change their status
to the waiting state (STATUS = 2).
[End of Step 3 loop.]
6. Exit.
Again, the above algorithm will process only those nodes which are reachable from the starting node A. Suppose one
wants to examine all the nodes in G. Then the algorithm must be modified so that it begins again with another
node, which we will call B, that is still in the ready state. This node B can be obtained by traversing the list of
nodes.
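The stack-based algorithm, together with the restart from a node that is still in the ready state, can be sketched as follows. The names `dfs` and `dfs_all` and the sample graph are assumptions for illustration; the graph is a dictionary with every node as a key.

```python
READY, WAITING, PROCESSED = 1, 2, 3


def dfs(graph, start, status=None):
    """Return the nodes reachable from start, in the order the stack pops them."""
    if status is None:
        status = {node: READY for node in graph}    # Step 1
    order = []
    stack = [start]                                 # Step 2
    status[start] = WAITING
    while stack:                                    # Step 3
        node = stack.pop()                          # Step 4: pop the top node
        order.append(node)
        status[node] = PROCESSED
        for neighbor in graph[node]:                # Step 5: push ready neighbors
            if status[neighbor] == READY:
                stack.append(neighbor)
                status[neighbor] = WAITING
    return order


def dfs_all(graph):
    """Examine every node by restarting from each node still in the ready state."""
    status = {node: READY for node in graph}
    order = []
    for node in graph:
        if status[node] == READY:
            order.extend(dfs(graph, node, status))
    return order


sample = {1: [2, 7], 2: [3], 3: [], 6: [], 7: [3, 6], 9: [1]}
print(dfs(sample, 1))      # [1, 7, 6, 3, 2]
print(dfs_all(sample))     # [1, 7, 6, 3, 2, 9]
```

Because neighbors are marked as waiting when they are pushed, the visiting order can differ from that of a recursive depth-first search; the sketch follows the STATUS-based steps given here.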
EXAMPLE : Implement DFS algorithm in the graph.
1. Initially set STATUS = 1 for all vertices. Push 1 onto the stack and set its STATUS = 2. Stack (top first): 1
2. Pop 1 from the stack, change its STATUS to 3, push 2 and 7 onto the stack and change their STATUS to 2.
DFS = 1; Stack (top first): 7, 2
3. Pop 7 from the stack and push 3 and 6. DFS = 1, 7; Stack (top first): 6, 3, 2
Importance of BFS :
➢ It is a single-source shortest-path algorithm for unweighted graphs, so it is used to compute the path with
the fewest edges.
➢ It is also used to solve puzzles such as the Rubik’s Cube.
➢ Because BFS explores states in order of their distance from the start, the first solution it finds for such a
puzzle uses the fewest moves, although the state space can be too large to search exhaustively.
Application of BFS : Breadth first search can be used to solve many problems in graph theory, for example :
➢ Copying garbage collection.
➢ Finding the shortest path between two nodes u and v, with path length measured by number of edges (an
advantage over depth first search).
➢ Ford-Fulkerson method for computing the maximum flow in a flow network.
➢ Serialization/deserialization of a binary tree in level order rather than in sorted order, which allows the
tree to be reconstructed in an efficient manner.
➢ Testing bipartiteness of a graph.
Importance of DFS : DFS is a very important algorithm because it yields O(V + E)-time algorithms
for the following problems :
➢ Testing whether graph is connected.
➢ Computing a spanning forest of G.
➢ Computing the connected components of G.
➢ Computing a path between two vertices of G or reporting that no such path exists.
➢ Computing a cycle in G or reporting that no such cycle exists
Application of DFS : Algorithms that use depth first search as a building block include :
➢ Finding connected components.
➢ Topological sorting.
➢ Finding 2-(edge or vertex)-connected components.
➢ Finding 3-(edge or vertex)-connected components.
➢ Finding the bridges of a graph.
➢ Generating words in order to plot the limit set of a group.
➢ Finding strongly connected components
CONNECTED COMPONENT
A connected component (or just component) of an undirected graph is a subgraph in which any two vertices are
connected to each other by paths, and which is connected to no additional vertices in the super graph.
For example, the graph shown in the illustration above has four connected components {a,b,c,d}, {e,f,g}, {h,i},
and {j}. A graph that is itself connected has exactly one connected component, consisting of the whole graph.
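Connected components can be found by repeatedly starting a traversal from a vertex that has not been seen yet. The sketch below is illustrative: the function name is an assumption, and because the illustration is not available here the edges are hypothetical, chosen only so that the components are {a,b,c,d}, {e,f,g}, {h,i} and {j}.

```python
def connected_components(graph):
    """Return the connected components of an undirected graph.

    graph maps every vertex to a list of its neighbors (edges stored both ways).
    """
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        component = []
        stack = [start]
        seen.add(start)
        while stack:                         # depth-first traversal from start
            node = stack.pop()
            component.append(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(sorted(component))
    return components


graph = {
    "a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"],
    "e": ["f"], "f": ["e", "g"], "g": ["f"],
    "h": ["i"], "i": ["h"],
    "j": [],
}
print(connected_components(graph))
```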
Strongly connected component : A directed graph is strongly connected if there is a directed path from every
vertex to every other vertex. A strongly connected component (strong component) of a directed graph is a maximal
sub-graph that is strongly connected.
SPANNING TREE
A spanning tree of an undirected graph is a sub-graph that is a tree and contains all the vertices of the graph. A
spanning tree of a connected graph G with n vertices contains all the vertices and just enough edges to connect
them without forming a cycle. So, the number of edges is n – 1, one less than the number of nodes.
If the graph is not connected, no spanning tree is possible; in particular, a graph with n vertices and fewer than
n – 1 edges cannot be connected.
A connected graph may have more than one spanning tree.
Kruskal's algorithm is a greedy algorithm in graph theory that finds a minimum spanning tree for a connected
weighted graph. This means it finds a subset of the edges that forms a tree that includes every vertex, where the
total weight of all the edges in the tree is minimized.
Algorithm
1. create a forest F (a set of trees), where each vertex in the graph is a separate tree
2. create a set S containing all the edges in the graph
3. while S is nonempty and F is not yet spanning
4. remove an edge with minimum weight from S
5. if that edge connects two different trees, then add it to the forest, combining two trees into a single tree
At the termination of the algorithm, the forest forms a minimum spanning forest of the graph. If the graph is
connected, the forest has a single component and forms a minimum spanning tree.
MST-KRUSKAL(G, w)
    A = ∅
    for each vertex v ∈ G.V
        MAKE-SET(v)
    sort the edges of G.E into nondecreasing order by weight w
    for each edge (u, v) ∈ G.E, taken in nondecreasing order by weight
        if FIND-SET(u) ≠ FIND-SET(v)
            A = A ∪ {(u, v)}
            UNION(u, v)
    return A
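The pseudocode above can be sketched in Python with a simple disjoint-set (union-find) structure standing in for MAKE-SET, FIND-SET and UNION. The function name `kruskal`, the `(weight, u, v)` edge format and the small sample graph are assumptions for illustration; union by rank is omitted to keep the sketch short.

```python
def kruskal(vertices, edges):
    """Return the edges of a minimum spanning forest as (u, v, weight) tuples.

    edges is a list of (weight, u, v) tuples for an undirected graph.
    """
    parent = {v: v for v in vertices}        # MAKE-SET for every vertex

    def find(v):
        # FIND-SET with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for weight, u, v in sorted(edges, key=lambda e: e[0]):  # nondecreasing weight
        root_u, root_v = find(u), find(v)
        if root_u != root_v:                 # edge joins two different trees
            tree.append((u, v, weight))
            parent[root_u] = root_v          # UNION
    return tree


vertices = ["A", "B", "C", "D"]
edges = [(1, "A", "B"), (4, "A", "C"), (3, "B", "C"), (2, "C", "D"), (5, "B", "D")]
mst = kruskal(vertices, edges)
print(mst, sum(w for _, _, w in mst))
```

On this sample graph the edges A-C and B-D are rejected because their endpoints are already in the same tree when they are examined.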
Example: Trace Kruskal's algorithm in finding a minimum-cost spanning tree for the undirected, weighted
graph given below:
Therefore, the minimum cost is 24.
Prim’s Algorithm
Prim’s algorithm makes a natural choice of the cut in each iteration: it grows a single tree and, in each
iteration, adds a light edge, that is, a minimum-weight edge connecting the tree to a vertex not yet in the tree.
Algorithm
1. Initialize a tree with a single vertex, chosen arbitrarily from the graph.
2. Grow the tree by one edge: of the edges that connect the tree to vertices not yet in the tree, find the
minimum-weight edge, and transfer it to the tree.
3. Repeat step 2 (until all vertices are in the tree).
MST-PRIM(G, w, r)
    for each u ∈ G.V
        u.key = ∞
        u.π = NIL
    r.key = 0
    Q = G.V
    while Q ≠ ∅
        u = EXTRACT-MIN(Q)
        for each v ∈ G.Adj[u]
            if v ∈ Q and w(u, v) < v.key
                v.π = u
                v.key = w(u, v)
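A Python sketch of the same idea follows. Note that it swaps the decrease-key operation implied by the pseudocode for a simpler technique: candidate edges are pushed onto a `heapq` binary heap and stale entries are skipped when popped. The function name `prim`, the nested-dictionary graph format and the sample graph are assumptions for illustration.

```python
import heapq


def prim(graph, root):
    """Return the edges of a minimum spanning tree as (u, v, weight) tuples.

    graph maps each vertex to a {neighbor: weight} dictionary (undirected,
    so every edge appears in both directions). Assumes the graph is connected.
    """
    in_tree = {root}
    tree = []
    # Candidate edges leaving the tree, ordered by weight.
    heap = [(w, root, v) for v, w in graph[root].items()]
    heapq.heapify(heap)
    while heap and len(in_tree) < len(graph):
        weight, u, v = heapq.heappop(heap)   # lightest candidate edge
        if v in in_tree:
            continue                         # stale entry: v was added already
        in_tree.add(v)
        tree.append((u, v, weight))
        for nxt, w in graph[v].items():
            if nxt not in in_tree:
                heapq.heappush(heap, (w, v, nxt))
    return tree


graph = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 3, "D": 5},
    "C": {"A": 4, "B": 3, "D": 2},
    "D": {"B": 5, "C": 2},
}
mst = prim(graph, "A")
print(mst, sum(w for _, _, w in mst))
```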
Example: Consider the following graph as an example for which we need to find the Minimum Spanning Tree
(MST).
Step 1: Firstly, we select an arbitrary vertex that acts as the starting vertex of the Minimum Spanning Tree.
Here we have selected vertex 0 as the starting vertex.
Transitive closure of a graph: Given a directed graph, determine for every pair of vertices (i, j) whether
vertex j is reachable from vertex i. Here reachable means that there is a path from vertex i to vertex j. The
resulting reachability matrix is called the transitive closure of the graph. The graph is given in the form of an
adjacency matrix, say ‘graph[V][V]’, where graph[i][j] is 1 if there is an edge from vertex i to vertex j or i is
equal to j, and 0 otherwise.
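One common way to compute the reachability matrix is Warshall's algorithm, which allows one more intermediate vertex in each outer iteration. The sketch below assumes the adjacency-matrix input described above; the function name and the four-vertex sample matrix are hypothetical.

```python
def transitive_closure(graph):
    """Return the reachability matrix of a directed graph.

    graph is an n x n 0/1 adjacency matrix; a vertex is treated as
    reachable from itself.
    """
    n = len(graph)
    reach = [[1 if i == j or graph[i][j] else 0 for j in range(n)]
             for i in range(n)]
    for k in range(n):                 # allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                if reach[i][k] and reach[k][j]:
                    reach[i][j] = 1
    return reach


graph = [
    [1, 1, 0, 1],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
for row in transitive_closure(graph):
    print(row)
```

The three nested loops give a running time of O(V³) for a graph with V vertices.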
Dijkstra’s Algorithm
Dijkstra’s algorithm solves the single-source shortest-path problem on a weighted graph whose edge weights are
non-negative.
DIJKSTRA(G, w, s)
    INITIALIZE-SINGLE-SOURCE(G, s)
    S = ∅
    Q = G.V
    while Q ≠ ∅
        u = EXTRACT-MIN(Q)
        S = S ∪ {u}
        for each vertex v ∈ G.Adj[u]
            RELAX(u, v, w)
INITIALIZE-SINGLE-SOURCE(G, s)
    for each vertex v ∈ G.V
        v.d = ∞
        v.π = NIL
    s.d = 0
RELAX(u, v, w)
    if v.d > u.d + w(u, v)
        v.d = u.d + w(u, v)
        v.π = u
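The DIJKSTRA, INITIALIZE-SINGLE-SOURCE and RELAX procedures can be sketched together in Python. As a simplification, the priority queue Q is a `heapq` heap with stale entries skipped on extraction instead of a decrease-key operation. The function name, the nested-dictionary graph format and the sample graph are assumptions for illustration, and edge weights are assumed to be non-negative.

```python
import heapq
import math


def dijkstra(graph, source):
    """Return (dist, pred) for shortest paths from source.

    graph maps each vertex to a {neighbor: weight} dictionary.
    """
    dist = {v: math.inf for v in graph}      # INITIALIZE-SINGLE-SOURCE
    pred = {v: None for v in graph}
    dist[source] = 0
    done = set()                             # the set S of finished vertices
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)           # EXTRACT-MIN
        if u in done:
            continue                         # stale queue entry
        done.add(u)
        for v, w in graph[u].items():        # RELAX(u, v, w)
            if dist[v] > d + w:
                dist[v] = d + w
                pred[v] = u
                heapq.heappush(heap, (dist[v], v))
    return dist, pred


graph = {
    "s": {"t": 10, "y": 5},
    "t": {"x": 1, "y": 2},
    "x": {"z": 4},
    "y": {"t": 3, "x": 9, "z": 2},
    "z": {"s": 7, "x": 6},
}
dist, pred = dijkstra(graph, "s")
print(dist)
```

Following the `pred` entries backwards from any vertex to the source recovers the shortest path itself, in the same way as the BFS backtracking earlier.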
Warshall’s Algorithm:
The Floyd-Warshall algorithm is a graph analysis algorithm for finding shortest paths between all pairs of
vertices in a weighted graph with positive or negative edge weights (but with no negative cycles). Warshall’s
algorithm, its Boolean counterpart, finds the transitive closure of a relation R.
The Floyd-Warshall algorithm uses a matrix of lengths D0 as its input. If there is an edge between nodes i and
j, then the matrix D0 contains its length at the corresponding coordinates. The diagonal of the matrix
contains only zeros. If there is no edge between nodes i and j, then the position (i, j) contains positive
infinity. In other words, the matrix holds the lengths of all paths between nodes that do not contain any
intermediate node.
In each iteration of the Floyd-Warshall algorithm this matrix is recalculated, so that it contains the lengths
of the shortest paths among all pairs of nodes using a gradually enlarging set of permitted intermediate nodes.
The matrix D1, which is created by the first iteration of the procedure, contains path lengths among all nodes
using at most one (predefined) intermediate node. D2 contains lengths using at most two predefined intermediate
nodes. Finally, the matrix Dn allows all n nodes as intermediates. This transformation can be described using
the following formula:
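$$
d_{ij}^{(k)} =
\begin{cases}
w_{ij} & \text{if } k = 0 \\
\min\left(d_{ij}^{(k-1)},\; d_{ik}^{(k-1)} + d_{kj}^{(k-1)}\right) & \text{if } k \ge 1
\end{cases}
$$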
Let $d_{ij}^{(k)}$ be the weight of a shortest path from vertex i to vertex j for which all intermediate vertices are in
the set {1, 2, …, k}. When k = 0, a path from vertex i to vertex j with no intermediate vertex numbered
higher than 0 has no intermediate vertices at all. Such a path has at most one edge, and hence $d_{ij}^{(0)} = w_{ij}$.
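We can give a recursive formulation of $\pi_{ij}^{(k)}$, the predecessor of vertex j on a shortest path from vertex i
with all intermediate vertices in the set {1, 2, …, k}. When k = 0, a shortest path from i to j has no
intermediate vertices at all. Thus,
$$
\pi_{ij}^{(0)} =
\begin{cases}
\text{NIL} & \text{if } i = j \text{ or } w_{ij} = \infty \\
i & \text{if } i \ne j \text{ and } w_{ij} < \infty
\end{cases}
$$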
FLOYD-WARSHALL(W)
    n = rows(W)
    D^(0) = W
    for k = 1 to n
        for i = 1 to n
            for j = 1 to n
                d_ij^(k) = min(d_ij^(k-1), d_ik^(k-1) + d_kj^(k-1))
    return D^(n)
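A direct Python sketch of the procedure follows. It updates a single matrix in place rather than keeping every D^(k), which is a standard space-saving variant; the function name and the four-vertex weight matrix are assumptions for illustration, and `math.inf` plays the role of the positive infinity entries.

```python
import math


def floyd_warshall(weights):
    """Return the matrix of shortest-path distances between all pairs.

    weights is an n x n matrix with 0 on the diagonal and math.inf where
    there is no edge. Assumes the graph has no negative cycles.
    """
    n = len(weights)
    dist = [row[:] for row in weights]       # D^(0) = W (copy, input untouched)
    for k in range(n):                       # allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                through_k = dist[i][k] + dist[k][j]
                if through_k < dist[i][j]:
                    dist[i][j] = through_k
    return dist


INF = math.inf
W = [
    [0,   3,   INF, 7],
    [8,   0,   2,   INF],
    [5,   INF, 0,   1],
    [2,   INF, INF, 0],
]
for row in floyd_warshall(W):
    print(row)
```

For this matrix the distance from vertex 1 to vertex 0 drops from 8 to 5, because the path through vertices 2 and 3 (2 + 1 + 2) is shorter than the direct edge.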
Using Floyd Warshall Algorithm, find the shortest path distance between every pair of vertices.
Adjacency List: A list (array) in which each entry stores a list (linked list) of all the vertices adjacent to one vertex.
Adjacency Multilist: In the adjacency list representation, an edge in an undirected graph is represented by two nodes.
Adjacency multilists are lists in which nodes may be shared among several lists.