9 - Graphs
(CS3401)
DATA STRUCTURES
Dr Somaraju Suvvari
NITP -- CS3401
TREES
&
GRAPHS
UNIT V: Trees & Graphs
Binary tree, Binary search tree, Threaded binary tree, AVL Tree, B Tree, Tries, Heaps, Hash tables.
Graph and its implementation, Graph traversals: Breadth First Search, Depth First Search, Union-find data
structure and applications, Spanning Tree – Prim’s algorithm and Kruskal’s algorithm, Shortest path – Dijkstra's
algorithm and Bellman-Ford algorithm, Topological sorting for Directed Acyclic Graphs.
GRAPHS
(Acknowledgement - Most of the explanations for graphs are taken from the following
textbook: “Data Structures and Algorithm Analysis in C”, by Mark Allen Weiss)
Graphs
A graph G = (V, E) consists of a set of vertices, V, and a set of edges, E. Each edge is a pair (v, w), where v, w ∈ V.
• Edges are sometimes referred to as arcs.
• If the pairs are ordered, the graph is a directed graph; otherwise it is an undirected graph.
• Vertex w is adjacent to v if and only if (v, w) ∈ E.
• In an undirected graph with edge (v, w), w is adjacent to v and v is adjacent to w.
• Applications of Graphs
1. To represent the Networks (Electronic circuits, Transportation networks, Highway network, Flight
network, Computer networks, etc)
2. For representing the dependency of tables in a database
Example (undirected graph): G = ({1, 2, 3, 4, 5, 6}, {(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1), (6, 3)})
Example (directed graph): G = ({1, 2, 3, 4}, {(1, 2), (2, 4), (4, 3), (3, 2), (4, 1)})
Graphs
• A connected graph with no cycles is called a tree.
• A self loop is an edge that connects a vertex to itself.
• Two edges are parallel if they connect the same pair of vertices.
• The degree of a vertex is the number of edges incident to it.
• A subgraph is a subset of a graph's edges (together with their endpoints) that forms a graph.
• A path is a sequence of vertices in which each vertex is adjacent to the previous one.
• A simple path is a path with no repeated vertices.
• A cycle is a path whose first and last vertices are the same.
• A graph is connected if there is a path from every vertex to every other vertex.
• If a graph is not connected, it consists of a set of connected components.
• A directed acyclic graph (DAG) is a directed graph with no cycles.
• Sometimes an edge has a third component, known as either its weight or its cost.
Representation of Graphs
Graphs are represented using the following approaches:
2. Adjacency List - In this representation all the vertices adjacent to a vertex v are kept on
an adjacency list for that vertex v. (Used for sparse graphs, i.e. graphs with relatively few edges.)
3. Adjacency Set – Similar to the adjacency list, but instead of linked lists, disjoint sets
are used.
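The adjacency list representation described above can be sketched in C as follows. This is only a minimal illustration, not code from the slides: the names Graph, AdjNode, graph_create, graph_add_edge, graph_has_edge and graph_destroy are hypothetical, and vertices are assumed to be numbered 0 to n-1.

```c
#include <stdlib.h>

/* One node of a vertex's adjacency list (illustrative names). */
typedef struct AdjNode {
    int vertex;                 /* a vertex adjacent to the list's owner */
    struct AdjNode *next;
} AdjNode;

typedef struct {
    int n;                      /* number of vertices, numbered 0 .. n-1 */
    AdjNode **head;             /* head[v] is the adjacency list of v */
} Graph;

/* Create a graph with n vertices and no edges; returns NULL on failure. */
Graph *graph_create(int n)
{
    Graph *g = malloc(sizeof *g);
    if (g == NULL) return NULL;
    g->n = n;
    g->head = malloc((size_t)n * sizeof *g->head);
    if (g->head == NULL) { free(g); return NULL; }
    for (int v = 0; v < n; v++)
        g->head[v] = NULL;
    return g;
}

/* Add the directed edge (v, w); returns 0 on success, -1 on failure. */
int graph_add_edge(Graph *g, int v, int w)
{
    AdjNode *node = malloc(sizeof *node);
    if (node == NULL) return -1;
    node->vertex = w;
    node->next = g->head[v];    /* insert at the front of v's list */
    g->head[v] = node;
    return 0;
}

/* Returns 1 if w is adjacent to v, i.e. (v, w) is an edge, else 0. */
int graph_has_edge(const Graph *g, int v, int w)
{
    for (const AdjNode *p = g->head[v]; p != NULL; p = p->next)
        if (p->vertex == w) return 1;
    return 0;
}

/* Free all list nodes and the graph itself. */
void graph_destroy(Graph *g)
{
    if (g == NULL) return;
    for (int v = 0; v < g->n; v++) {
        AdjNode *p = g->head[v];
        while (p != NULL) {
            AdjNode *next = p->next;
            free(p);
            p = next;
        }
    }
    free(g->head);
    free(g);
}
```

For an undirected graph, one would call graph_add_edge twice, once for (v, w) and once for (w, v). The space used is proportional to |V| + |E|, which is why this form suits sparse graphs.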
Representation of Graphs
Example – Consider the following graph
[Figure: example graph on vertices 1, 2 and 3]
Graph Traversals
To work with graphs we need a mechanism to visit the nodes in the graph. There
are two popular techniques for traversing a graph: Depth First Search (DFS) and Breadth First Search (BFS).
DFS Traversal
DFS works similarly to the pre-order traversal of a tree. It works in the
following manner:
“Starting at some vertex, v, we process v and then recursively traverse all the vertices
adjacent to v. To avoid cycles we need to remember which nodes have been visited.”
• Starting at vertex v, it considers the edges from v to other vertices.
• If an edge leads to an already visited vertex, it backtracks to the current vertex v.
• If an edge leads to an unvisited vertex, it goes to that vertex and starts processing it.
• It follows this procedure until it reaches a dead end. At this point it backtracks.
DFS Traversal
void DFS(Vertex v)
{   visited[v] = true;
    for each w adjacent to v
    {   if (!visited[w])
        {   DFS(w); }
    }
}   // Here visited is an array of size |V| whose entries are all initialized to false
[Figure: example graph on vertices 1 to 5]
• Time Complexity: O(|E| + |V|)
• If we apply DFS to the graph shown here starting at vertex 1, it visits the nodes in the
following order: <1> <2> <4> <3> <5>
• Another possible DFS order is: <1> <2> <3> <5> <4>
• DFS uses a stack data structure (in the recursive version, the call stack)
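A runnable version of the DFS pseudocode might look like the sketch below. It assumes an adjacency-matrix graph with vertices numbered from 0; the type MatrixGraph and the helper add_edge are hypothetical names introduced only for this illustration. Because it scans a whole matrix row per vertex, this particular sketch runs in O(|V|^2); the O(|E| + |V|) bound needs adjacency lists.

```c
#include <stdbool.h>

#define MAX_V 8   /* illustrative upper bound on the number of vertices */

typedef struct {
    int n;                      /* number of vertices, numbered 0 .. n-1 */
    bool adj[MAX_V][MAX_V];     /* adj[v][w] is true when (v, w) is an edge */
} MatrixGraph;

/* Add the undirected edge between v and w. */
void add_edge(MatrixGraph *g, int v, int w)
{
    g->adj[v][w] = true;
    g->adj[w][v] = true;
}

/* Depth-first search from v. visited[] must be all false before the first
   call; the visiting order is appended to order[] and *count is advanced. */
void dfs(const MatrixGraph *g, int v, bool visited[], int order[], int *count)
{
    visited[v] = true;
    order[(*count)++] = v;                  /* process v */
    for (int w = 0; w < g->n; w++)
        if (g->adj[v][w] && !visited[w])    /* unvisited neighbour */
            dfs(g, w, visited, order, count);
}
```

Neighbours are tried in increasing vertex number, so the order produced depends on the numbering; a different neighbour order gives a different but equally valid DFS order.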
BFS Traversal
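BFS works similarly to the level-order traversal of a tree: starting at some vertex v, it first visits
all the vertices adjacent to v, then the vertices adjacent to those, and so on, using a queue to hold the vertices waiting to be processed.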
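As a rough illustration of BFS with a queue, the sketch below uses an adjacency matrix and an array-based queue. The names MatrixGraph, add_edge and bfs are assumptions made for this example, and vertices are numbered from 0.

```c
#include <stdbool.h>

#define MAX_V 8   /* illustrative upper bound on the number of vertices */

typedef struct {
    int n;                      /* number of vertices, numbered 0 .. n-1 */
    bool adj[MAX_V][MAX_V];     /* adj[v][w] is true when (v, w) is an edge */
} MatrixGraph;

/* Add the undirected edge between v and w. */
void add_edge(MatrixGraph *g, int v, int w)
{
    g->adj[v][w] = true;
    g->adj[w][v] = true;
}

/* Breadth-first search from start. Stores the visiting order in order[]
   and returns the number of vertices reached. */
int bfs(const MatrixGraph *g, int start, int order[])
{
    bool visited[MAX_V] = { false };
    int queue[MAX_V];           /* each vertex is enqueued at most once */
    int head = 0, tail = 0;
    int count = 0;

    visited[start] = true;
    queue[tail++] = start;
    while (head < tail) {
        int v = queue[head++];              /* dequeue */
        order[count++] = v;                 /* process v */
        for (int w = 0; w < g->n; w++) {
            if (g->adj[v][w] && !visited[w]) {
                visited[w] = true;          /* mark when enqueued */
                queue[tail++] = w;
            }
        }
    }
    return count;
}
```

Marking a vertex as visited at the moment it is enqueued (rather than when it is dequeued) keeps every vertex in the queue at most once, so a queue of |V| slots is enough.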
Applications of BFS
6. Finding all connected components in a graph
Topological Sorting
2. Find the vertices whose indegree is zero and enqueue them.
3. If the queue is empty, display an error and stop the algorithm.
5. Output v and decrease the indegree of every vertex u that has an incoming edge from v (u is
adjacent to v). If the indegree of u becomes zero, enqueue u.
6. Increment the counter. Repeat steps 3 to 5 while the counter is less than the number of vertices.
Topological Sorting
Example: Consider the following DAG (from the Mark Allen Weiss textbook)
[Figure: DAG on vertices v1 to v7]
          Indegree before dequeue #
Vertex    1     2     3     4     5       6     7
v1        0     0     0     0     0       0     0
v2        1     0     0     0     0       0     0
v3        2     1     1     1     0       0     0
v4        3     2     1     0     0       0     0
v5        1     1     0     0     0       0     0
v6        3     3     3     3     2       1     0
v7        2     2     2     1     0       0     0
Enqueue   v1    v2    v5    v4    v3, v7  -     v6
Dequeue   v1    v2    v5    v4    v3      v7    v6
Topological Sorting
“A topological sort is an ordering of vertices in a directed acyclic graph, such that if there is a path from u to v, then v appears
after u in the ordering.”
void Topsort(Graph G)
{   int counter; Vertex v, w;
    for (counter = 0; counter < NumberOfVertices; counter++)   // Process |V| vertices
    {   v = FindNewVertexOfIndegreeZero();
        if (v is not a vertex) { Display("ERROR"); exit(); }   // no such vertex: the graph has a cycle
        TopNum[v] = counter;                                   // The position of v is counter
        for each w adjacent to v
        {   Indegree[w]--; }
    }
}
Time complexity: O(|V|^2) (when each vertex of indegree zero is found by scanning an array that holds the
indegrees of all vertices)
: O(|E| + |V|) (when the vertices whose indegree is zero are kept separately in another data structure,
a queue or a stack)
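The queue-based variant mentioned above can be sketched as follows. The function name topsort and the adjacency-matrix input are assumptions for this illustration; with a matrix the neighbour scan costs O(|V|^2) overall, so reaching O(|E| + |V|) would additionally require adjacency lists.

```c
#include <stdbool.h>

#define MAX_V 8   /* illustrative upper bound on the number of vertices */

/* Queue-based topological sort of a directed graph with vertices 0 .. n-1,
   where adj[v][w] is true when there is an edge from v to w.
   Writes the ordering to top_order[] and returns how many vertices were
   ordered; a return value smaller than n means the graph has a cycle. */
int topsort(int n, bool adj[MAX_V][MAX_V], int top_order[])
{
    int indegree[MAX_V] = { 0 };
    int queue[MAX_V];           /* each vertex is enqueued at most once */
    int head = 0, tail = 0;
    int counter = 0;

    /* Compute the indegree of every vertex. */
    for (int v = 0; v < n; v++)
        for (int w = 0; w < n; w++)
            if (adj[v][w]) indegree[w]++;

    /* Enqueue the vertices whose indegree is zero. */
    for (int v = 0; v < n; v++)
        if (indegree[v] == 0) queue[tail++] = v;

    while (head < tail) {
        int v = queue[head++];              /* dequeue */
        top_order[counter++] = v;           /* output v */
        for (int w = 0; w < n; w++)
            if (adj[v][w] && --indegree[w] == 0)
                queue[tail++] = w;          /* w has no remaining prerequisites */
    }
    return counter;
}
```

Returning the count instead of printing an error leaves the cycle check to the caller: if fewer than n vertices were ordered, some vertex never reached indegree zero.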
Topological Sorting
Applications of Topological Sorting
1. Representing course prerequisites
2. Detecting deadlocks
Shortest-Path Algorithms
Single Source Shortest Path Algorithm - Given an input weighted graph, G = (V, E), and a
distinguished vertex, s, find the shortest weighted path from s to every other vertex in G.
[Figure: weighted example graph on vertices v1 to v7]
The shortest path from vertex v1 to v6 has a cost of 6, and goes from v1 to v4 to v7 to v6.
In real life there are many applications in which we want to solve shortest-path problems. For example, the vertices
may be computers, the edges the links between computers, and the costs the communication costs, delay
[Figure: the example graph on vertices v1 to v7 without edge weights]
Taking v1 as the source vertex and applying BFS, we get the following sequence
Yes (an edge of cost 3 can be replaced by three unit-cost edges), but the resulting graph has too many vertices.
Dijkstra’s Algorithm
It is based on the greedy method of algorithm design.
It works in phases.
In each phase, a decision is made that appears good, without regard for future
consequences. Generally this means that some local optimum is chosen. This “take
what you can get now” strategy is the source of the name for this class of algorithms.
When the algorithm terminates, we hope that the local optimum is equal to the global
optimum. If this is the case, the algorithm is correct; otherwise it produces a suboptimal
solution.
Dijkstra’s Algorithm
Given a source vertex s, we need to find the shortest distance from s to every other vertex.
1. Initially all the vertices are unknown and their distance from s is ∞.
2. Mark the vertex s as known and set its distance to zero.
3. Select the vertex v with the shortest distance (at this point the distance of every vertex except s
is ∞, so v = s).
4. For every vertex u adjacent to v:
   1. If u is unknown, let D be the sum of the distance from s to v and the cost of the edge (v, u).
   2. If D is smaller than the present distance of u, update the distance of u to D and set its parent to v.
5. If there are no unknown vertices, stop the algorithm. Otherwise, among the unknown
vertices select the vertex w whose distance is minimum.
6. Mark w as known and repeat steps 4 and 5 with v = w.
[Figure: weighted example graph on vertices v1 to v7]
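The steps above can be sketched in C with plain arrays, which corresponds to the O(|V|^2) variant. The function name dijkstra, the adjacency-matrix input and the convention that a cost of 0 means "no edge" are assumptions made for this illustration; all edge costs are assumed to be positive.

```c
#include <limits.h>
#include <stdbool.h>

#define MAX_V 8          /* illustrative upper bound on the number of vertices */
#define INF INT_MAX      /* stands for an infinite distance */
#define NO_EDGE 0        /* cost[v][u] == 0 means there is no edge (v, u) */

/* Array-based Dijkstra from source s on a graph with vertices 0 .. n-1.
   On return dist[v] is the shortest distance from s to v (INF when v is
   unreachable) and parent[v] is the previous vertex on that path (-1 when
   there is none). Assumes all edge costs are positive. */
void dijkstra(int n, int cost[MAX_V][MAX_V], int s, int dist[], int parent[])
{
    bool known[MAX_V] = { false };

    for (int v = 0; v < n; v++) {
        dist[v] = INF;
        parent[v] = -1;
    }
    dist[s] = 0;

    for (int iter = 0; iter < n; iter++) {
        /* Select the unknown vertex with the smallest finite distance. */
        int v = -1;
        for (int u = 0; u < n; u++)
            if (!known[u] && dist[u] != INF && (v == -1 || dist[u] < dist[v]))
                v = u;
        if (v == -1) break;                 /* the rest are unreachable */
        known[v] = true;

        /* Update the unknown vertices adjacent to v. */
        for (int u = 0; u < n; u++) {
            if (cost[v][u] != NO_EDGE && !known[u] &&
                dist[v] + cost[v][u] < dist[u]) {
                dist[u] = dist[v] + cost[v][u];
                parent[u] = v;
            }
        }
    }
}
```

The selection loop is what makes this version quadratic; replacing it with a priority queue keyed on dist gives the O(|E| log |V|) variant.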
Dijkstra’s Algorithm
To apply the algorithm we use the following data structure, a table with the columns V, known, dv and pv.
Its contents for the example graph with source v1:
        Initial          After v1 known   After v4 known   After v2 known
V       known dv pv      known dv pv      known dv pv      known dv pv
v1      0     0  0       1     0  0       1     0  0       1     0  0
v2      0     ∞  0       0     2  v1      0     2  v1      1     2  v1
v3      0     ∞  0       0     ∞  0       0     3  v4      0     3  v4
v4      0     ∞  0       0     1  v1      1     1  v1      1     1  v1
v5      0     ∞  0       0     ∞  0       0     3  v4      0     3  v4
v6      0     ∞  0       0     ∞  0       0     9  v4      0     9  v4
v7      0     ∞  0       0     ∞  0       0     5  v4      0     5  v4
Dijkstra’s Algorithm
        After v2 known   After v3 known   After v5 known   After v7 known
V       known dv pv      known dv pv      known dv pv      known dv pv
v1      1     0  0       1     0  0       1     0  0       1     0  0
v2      1     2  v1      1     2  v1      1     2  v1      1     2  v1
v3      0     3  v4      1     3  v4      1     3  v4      1     3  v4
v4      1     1  v1      1     1  v1      1     1  v1      1     1  v1
v5      0     3  v4      0     3  v4      1     3  v4      1     3  v4
v6      0     9  v4      0     8  v3      0     8  v3      0     6  v7
v7      0     5  v4      0     5  v4      0     5  v4      1     5  v4
Dijkstra’s Algorithm
        After v7 known   After v6 known
V       known dv pv      known dv pv
v1      1     0  0       1     0  0
v2      1     2  v1      1     2  v1
v3      1     3  v4      1     3  v4
v4      1     1  v1      1     1  v1
v5      1     3  v4      1     3  v4
v6      0     6  v7      1     6  v7
v7      1     5  v4      1     5  v4
Time Complexity = O(|V|^2 + |E|) = O(|V|^2) // if we use an array to store the distances of the vertices
Time Complexity = O(|V| log |V| + |E| log |V|) = O(|E| log |V|) // if we use a priority queue
Bellman – Ford Algorithm
Limitation of Dijkstra’s algorithm
It may not work if any edge has a negative cost.
[Figure: graph on vertices 1, 2 and 3 with edge costs 5, 4 and -3]
Path (predecessor) entry of each vertex and the contents of the queue Q at each step:
Vertex   Initially   After dequeuing 1   After dequeuing 2   After dequeuing 3
1        0           0                   0                   0
2        0           1                   1                   1
3        0           1                   2                   2
Q        {1}         {2, 3}              {3}                 {}
Bellman – Ford (pseudo code)
void BellmanFord(Table T)   // queue-based version; assumes the graph has no negative-cost cycle
{   Queue Q; Vertex v, w;
    Enqueue(Q, s);                              // s is the source vertex
    while (!IsEmpty(Q))
    {   v = Dequeue(Q);
        for each w adjacent to v
        {   if (T[v].Dist + Cvw < T[w].Dist)
            {   T[w].Dist = T[v].Dist + Cvw;    // decrease the distance of w
                T[w].Path = v;
                if (w is not already in Q)
                {   Enqueue(Q, w); }
            }
        }
    }
}
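A C sketch of the queue-based pseudocode above is given below. The Edge struct, the function name shortest_paths_negative and the edge-array input are assumptions made for this illustration. For brevity it scans the whole edge array for each dequeued vertex, where an adjacency list would visit only the outgoing edges.

```c
#include <limits.h>
#include <stdbool.h>

#define MAX_V 8          /* illustrative upper bound on the number of vertices */
#define INF INT_MAX      /* stands for an infinite distance */

typedef struct {
    int from, to, cost;  /* directed edge (from, to); cost may be negative */
} Edge;

/* Queue-based shortest paths from s when edge costs may be negative.
   Vertices are 0 .. n-1 and edges[0 .. m-1] lists the directed edges.
   Assumes no negative-cost cycle is reachable from s; with such a cycle
   the loop would never terminate. */
void shortest_paths_negative(int n, const Edge edges[], int m, int s,
                             int dist[], int path[])
{
    int queue[MAX_V];                   /* circular queue */
    bool in_queue[MAX_V] = { false };   /* a vertex is queued at most once */
    int head = 0, size = 0;

    for (int v = 0; v < n; v++) {
        dist[v] = INF;
        path[v] = -1;
    }
    dist[s] = 0;
    queue[0] = s;
    size = 1;
    in_queue[s] = true;

    while (size > 0) {
        int v = queue[head];            /* dequeue */
        head = (head + 1) % MAX_V;
        size--;
        in_queue[v] = false;

        for (int i = 0; i < m; i++) {
            if (edges[i].from != v) continue;
            int w = edges[i].to;
            if (dist[v] + edges[i].cost < dist[w]) {
                dist[w] = dist[v] + edges[i].cost;
                path[w] = v;
                if (!in_queue[w]) {     /* enqueue w unless already queued */
                    queue[(head + size) % MAX_V] = w;
                    size++;
                    in_queue[w] = true;
                }
            }
        }
    }
}
```

Unlike Dijkstra's algorithm, a vertex can be dequeued more than once here, because a later negative edge may still lower its distance.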
Minimum Spanning Tree
A minimum spanning tree of an undirected graph G is a tree formed from graph edges that connects all the vertices of G at lowest total cost.
Minimum Spanning Tree
The following two algorithms find a minimum spanning tree (MST):
1. Prim’s Algorithm
2. Kruskal’s Algorithm
Prim’s Algorithm
Idea of Prim’s Algorithm - “Grow the tree in successive stages. In each stage, one
node is picked as the root, and we add an edge, and thus an associated vertex, to the
tree. At each stage, a new vertex is added to the tree by choosing an edge (u, v) such that
the cost of (u, v) is the minimum among all edges where u is in the tree and v is not.”
Prim’s Algorithm
Example 1
[Figure: Prim’s algorithm applied to the weighted undirected graph on vertices v1 to v7, showing the tree after each stage]
Prim’s Algorithm
Prim’s algorithm is essentially identical to Dijkstra’s algorithm. The only difference is
in the update rule: after a vertex v is selected, for each unknown vertex w adjacent to v, dw = min(dw, cost(v, w)).
The time complexity is O(|V|^2) without heaps and O(|E| log |V|) using binary heaps.
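To make the similarity to Dijkstra's algorithm concrete, here is an array-based sketch of Prim's algorithm (the O(|V|^2) variant). The function name prim, the symmetric cost matrix and the convention that a cost of 0 means "no edge" are assumptions made for this illustration; vertex 0 is arbitrarily used as the starting vertex.

```c
#include <limits.h>
#include <stdbool.h>

#define MAX_V 8          /* illustrative upper bound on the number of vertices */
#define INF INT_MAX      /* stands for "no connecting edge found yet" */
#define NO_EDGE 0        /* cost[v][u] == 0 means there is no edge (v, u) */

/* Array-based Prim on an undirected graph with vertices 0 .. n-1, given as
   a symmetric cost matrix with positive edge costs. Fills parent[] with the
   tree edges (parent[v], v), using -1 for the start vertex, and returns the
   total cost of the minimum spanning tree, or -1 if the graph is not
   connected. */
int prim(int n, int cost[MAX_V][MAX_V], int parent[])
{
    bool known[MAX_V] = { false };
    int dist[MAX_V];     /* cheapest edge connecting each vertex to the tree */
    int total = 0;

    for (int v = 0; v < n; v++) {
        dist[v] = INF;
        parent[v] = -1;
    }
    dist[0] = 0;                            /* start growing the tree at 0 */

    for (int iter = 0; iter < n; iter++) {
        /* Select the unknown vertex that is cheapest to attach. */
        int v = -1;
        for (int u = 0; u < n; u++)
            if (!known[u] && dist[u] != INF && (v == -1 || dist[u] < dist[v]))
                v = u;
        if (v == -1) return -1;             /* graph is not connected */
        known[v] = true;
        total += dist[v];

        /* Update rule: compare the edge cost alone, not a path length. */
        for (int u = 0; u < n; u++) {
            if (cost[v][u] != NO_EDGE && !known[u] && cost[v][u] < dist[u]) {
                dist[u] = cost[v][u];
                parent[u] = v;
            }
        }
    }
    return total;
}
```

The only line that differs in substance from an array-based Dijkstra is the update test, which uses cost[v][u] instead of dist[v] + cost[v][u].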
Disjoint Set ADT
(Most of the explanations here are taken from the following textbook: “Data
Structures and Algorithm Analysis in C”, by Mark Allen Weiss)
Disjoint set ADT
• This is a special data structure for solving the equivalence problem (it represents a collection
of sets, where all the elements within each set are related).
Equivalence Relations
• A relation R is defined on a set S if for every pair of elements (a, b), a, b ∈ S, a R b is either true or
false. If a R b is true, then we say that a is related to b.
• The equivalence class of an element a ∈ S is the subset of S that contains all the elements that are
related to a.
• To decide whether a R b, we need to verify whether a and b are in the same equivalence class or
not.
Disjoint set ADT
The input is initially a collection of N sets, each with one element. In this initial representation
all relations (except reflexive relations) are false. Each set has a different element, so that Si ∩ Sj = ∅; this
makes the sets disjoint. Two operations are supported:
1. Find – Returns the name of the set that contains the given element.
2. Union – Merges the two equivalence classes containing a and b into a new equivalence class.
From the set point of view, the result of Union is to create a new set Sk = Si ∪ Sj, destroying the originals and
preserving the disjointness of all the sets.
For this reason, the algorithm is frequently known as the disjoint set Union/Find algorithm.
Disjoint set ADT
Basic idea:
“Use a tree to represent each set and root can be used as the name of the set”
• Each entry P[i] represents the parent of element i. If i is a root, then P[i] = 0.
• Two sets are merged by making the root of one tree point to the root of the other. Sometimes
we need to merge on two arbitrary elements; in this case we first apply the Find operation to these two
elements to determine their roots.
• The Find() operation on X returns the root of the tree containing X. It takes time proportional to the depth of X,
which is O(N) in the worst case.
Disjoint set ADT
void Initialize(DisjSet S)
{   int i;
    for (i = NumSets; i > 0; i--)    // NumSets is the initial number of sets, i.e., N
    {   S[i] = 0; }
}
void SetUnion(DisjSet S, SetType root1, SetType root2)   // Assume root1 and root2 are the roots of two different trees
{   S[root2] = root1; }
Time complexity: O(1)
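Example:
[Figure: roots a, b, c, e and f; d is a child of c]
Element   0(a)  1(b)  2(c)  3(d)  4(e)  5(f)
Parent     0     0     0     c     0     0
[Figure: roots a, b and c; d and e are children of c; f is a child of e]
Element   0(a)  1(b)  2(c)  3(d)  4(e)  5(f)
Parent     0     0     0     c     c     e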
Disjoint set ADT
After applying union(c, e):
[Figure: roots a, b and c; d and e are children of c; f is a child of e]
Element   0(a)  1(b)  2(c)  3(d)  4(e)  5(f)
Parent     0     0     0     c     c     e
After applying the union of (a, c) // or (a, d), or (a, f):
[Figure: roots a and b; c is a child of a; d and e are children of c; f is a child of e]
Element   0(a)  1(b)  2(c)  3(d)  4(e)  5(f)
Parent     0     0     a     c     c     e
Observation –
Here we blindly made the tree rooted at c a subtree of a. We could instead have made a a child of c. Why would that be better?
Attaching a under c gives a tree with only 3 levels, whereas attaching c under a gives a tree
with 4 levels.
What is the result of Find(d)?
It returns a.
Disjoint set ADT
Union by Size (smart union)
• Make the root of the larger tree the new root; this guarantees that the depth of any tree is at most
log N.
• An alternative implementation is union by height: instead of merging the trees by size, merge them by
height (the shallower tree becomes a subtree of the deeper one). This also guarantees that the depth of any tree is at most log N.
Disjoint set ADT
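Union by Size (each root stores the negative of the size of its tree)
[Figure: roots a, b and c; d and e are children of c; f is a child of e]
Element   0(a)  1(b)  2(c)  3(d)  4(e)  5(f)
Entry      -1    -1    -4     c     c     e
Union by height (each root stores the negative of the height of its tree)
Element   0(a)  1(b)  2(c)  3(d)  4(e)  5(f)
Entry       0     0    -2     c     c     e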
Disjoint set ADT
Union by height
void SetUnion(DisjSet S, SetType root1, SetType root2)   // Assume root1 and root2 are the roots of two different trees
{   if (S[root2] < S[root1])         // root2 is the deeper set (remember these are negative values)
        S[root1] = root2;            // make root2 the new root
    else
    {   if (S[root2] == S[root1])    // same height
            S[root1]--;              // increase the height of root1
        S[root2] = root1;            // make root1 the new root
    }
}
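Putting Initialize, Find and union by height together, a compact C sketch could look like this. It follows the slides' convention that elements are numbered from 1 and that a root holds 0 or a negative value (the negated height); the lower-case function names and the constant NUM_SETS are assumptions for this illustration.

```c
#define NUM_SETS 8   /* illustrative number of elements, numbered 1 .. NUM_SETS */

/* s[i] > 0 is the parent of i; s[i] <= 0 means i is a root whose tree has
   height -s[i]. Index 0 is unused. */
typedef int DisjSet[NUM_SETS + 1];

/* Put every element in its own set (a root of height 0). */
void initialize(DisjSet s)
{
    for (int i = NUM_SETS; i > 0; i--)
        s[i] = 0;
}

/* Return the root (the name) of the set that contains x. */
int find(int x, const int s[])
{
    while (s[x] > 0)
        x = s[x];
    return x;
}

/* Union by height; root1 and root2 must be roots, e.g. results of find. */
void set_union(DisjSet s, int root1, int root2)
{
    if (root1 == root2)
        return;                 /* already in the same set */
    if (s[root2] < s[root1]) {
        s[root1] = root2;       /* root2 is deeper: it becomes the new root */
    } else {
        if (s[root1] == s[root2])
            s[root1]--;         /* same height: the merged tree grows by one */
        s[root2] = root1;       /* root1 becomes the new root */
    }
}
```

To merge the sets of two arbitrary elements a and b, one would call set_union(s, find(a, s), find(b, s)). The extra root1 == root2 check is a small safeguard that the slide's version leaves to the caller.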
Kruskal’s Algorithm
Idea of Kruskal’s Algorithm - “It also works on the greedy principle:
repeatedly select the edges in order of smallest weight and accept an edge if it does
not cause a cycle.”
Kruskal’s Algorithm
Algorithm:
1. Sort all the edges in ascending order of their weight.
2. Pick the next edge e in this order (the smallest edge not yet considered).
3. Check whether the inclusion of e forms a cycle with the spanning tree formed so far.
4. If no cycle is formed by the inclusion of e, include this edge; otherwise
discard it.
5. Repeat steps 2 to 4 until |V|-1 edges have been added to the spanning tree.
Kruskal’s Algorithm
Example 1
[Figure: Kruskal’s algorithm applied to the weighted undirected graph on vertices v1 to v7, showing the forest after each accepted edge]
Edge       Weight   Action
(v1, v4)   1        Accepted
(v6, v7)   1        Accepted
(v1, v2)   2        Accepted
(v3, v4)   2        Accepted
(v2, v4)   3        Rejected
(v1, v3)   4        Rejected
(v4, v7)   4        Accepted
(v3, v6)   5        Rejected
(v5, v7)   6        Accepted
Kruskal’s Algorithm
void Kruskal(Graph G)
{   int EdgesAccepted = 0; DisjSet S; PriorityQueue H; Vertex u, v; SetType Uset, Vset; Edge e;
    Initialize(S);
    ReadGraphIntoHeapArray(G, H);      // The graph is given as an adjacency matrix or adjacency list
    BuildHeap(H);                      // Build the heap with respect to the edge weights
    while (EdgesAccepted < |V| - 1)
    {   e = DeleteMin(H);              // e = (u, v) is the smallest remaining edge: O(log |E|)
        Uset = Find(u, S);             // Find the root of u: O(log |V|)
        Vset = Find(v, S);             // Find the root of v: O(log |V|)
        if (Uset != Vset)
        {   EdgesAccepted++;
            SetUnion(S, Uset, Vset);   // Merge Uset and Vset: O(1)
        }
    }
}
Time Complexity = |E| (log |E| + log |V| + log |V|) = O(|E| log |E|)
Since |E| = O(|V|^2), log |E| = O(log |V|), so the time complexity is also O(|E| log |V|)
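The pseudocode above can be turned into a small runnable sketch. Note one deliberate swap: instead of a binary heap with DeleteMin, this version sorts the edge array once with the standard qsort, which gives the same O(|E| log |E|) bound. The Edge struct and the names cmp_edge, find_root and kruskal are assumptions for this illustration; here a root stores -(height + 1) so that vertices can be numbered from 0.

```c
#include <stdlib.h>

#define MAX_V 16   /* illustrative upper bound on the number of vertices */

typedef struct {
    int u, v, weight;   /* undirected edge (u, v) */
} Edge;

/* qsort comparator: ascending edge weight. */
static int cmp_edge(const void *a, const void *b)
{
    const Edge *x = a;
    const Edge *y = b;
    return (x->weight > y->weight) - (x->weight < y->weight);
}

/* Root of the tree containing x; roots hold a negative value. */
static int find_root(const int parent[], int x)
{
    while (parent[x] >= 0)
        x = parent[x];
    return x;
}

/* Kruskal on a graph with vertices 0 .. n-1 and edges[0 .. m-1].
   Sorts edges[] in place, stores the accepted edges in mst[] and returns
   how many were accepted (n - 1 when the graph is connected), or -1 when
   n exceeds MAX_V. */
int kruskal(int n, Edge edges[], int m, Edge mst[])
{
    int parent[MAX_V];
    int accepted = 0;

    if (n > MAX_V) return -1;
    for (int i = 0; i < n; i++)
        parent[i] = -1;                 /* every vertex is a root of height 0 */

    qsort(edges, (size_t)m, sizeof edges[0], cmp_edge);

    for (int i = 0; i < m && accepted < n - 1; i++) {
        int ru = find_root(parent, edges[i].u);
        int rv = find_root(parent, edges[i].v);
        if (ru == rv)
            continue;                   /* edge would form a cycle: reject */
        mst[accepted++] = edges[i];     /* accept the edge */
        if (parent[rv] < parent[ru]) {  /* union by height */
            parent[ru] = rv;
        } else {
            if (parent[ru] == parent[rv])
                parent[ru]--;
            parent[rv] = ru;
        }
    }
    return accepted;
}
```

Sorting up front is simpler to write, while the heap version can stop early without having ordered the heavier edges; both fit the bound stated above.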
Tries
(Most of the explanations here are taken from the following textbook:
“Algorithm Design (Foundations, Analysis, and Internet Examples)”, by Michael T.
Goodrich and Roberto Tamassia)
Tries
The main application of tries is in information retrieval; indeed, the name comes from the
word “retrieval”.
The primary query operations that tries support are pattern matching and prefix matching
(given a string X, look for all the strings in S that contain X as a prefix).
Standard Tries
Let S be a set of n strings over the alphabet ∑, such that no string in S is a prefix of another
string. A standard trie for S is an ordered tree T with the following properties:
1. Each node of T, except the root, is labelled with a character of ∑.
3. T has n external nodes, each associated with a string of S, such that the concatenation of the labels of
the nodes on the path from the root to an external node v of T yields the string of S associated with v.
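A small C sketch of a trie with insertion, exact lookup and prefix matching is shown below. It makes two simplifying assumptions that are not in the slides: the alphabet is the lowercase letters a to z, and each node carries an is_word flag, so strings that are prefixes of other strings can also be stored. All type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdlib.h>

#define ALPHABET 26   /* assumption: lowercase letters 'a' .. 'z' only */

typedef struct TrieNode {
    struct TrieNode *child[ALPHABET];   /* child[c] is followed on letter 'a' + c */
    bool is_word;                       /* true if a stored string ends here */
} TrieNode;

/* Allocate an empty node; returns NULL on allocation failure. */
TrieNode *trie_new(void)
{
    TrieNode *node = malloc(sizeof *node);
    if (node == NULL) return NULL;
    for (int c = 0; c < ALPHABET; c++)
        node->child[c] = NULL;
    node->is_word = false;
    return node;
}

/* Insert s; returns false on an invalid character or allocation failure. */
bool trie_insert(TrieNode *root, const char *s)
{
    TrieNode *node = root;
    for (; *s != '\0'; s++) {
        int c = *s - 'a';
        if (c < 0 || c >= ALPHABET) return false;
        if (node->child[c] == NULL) {
            node->child[c] = trie_new();
            if (node->child[c] == NULL) return false;
        }
        node = node->child[c];
    }
    node->is_word = true;
    return true;
}

/* Follow s from the root; returns the node reached, or NULL if the path
   does not exist. */
static const TrieNode *trie_walk(const TrieNode *root, const char *s)
{
    const TrieNode *node = root;
    for (; node != NULL && *s != '\0'; s++) {
        int c = *s - 'a';
        if (c < 0 || c >= ALPHABET) return NULL;
        node = node->child[c];
    }
    return node;
}

/* True if s itself was inserted. */
bool trie_contains(const TrieNode *root, const char *s)
{
    const TrieNode *node = trie_walk(root, s);
    return node != NULL && node->is_word;
}

/* True if some stored string has the given prefix. */
bool trie_has_prefix(const TrieNode *root, const char *prefix)
{
    return trie_walk(root, prefix) != NULL;
}

/* Free the whole trie. */
void trie_free(TrieNode *node)
{
    if (node == NULL) return;
    for (int c = 0; c < ALPHABET; c++)
        trie_free(node->child[c]);
    free(node);
}
```

Both lookups take time proportional to the length of the query string and are independent of how many strings are stored, which is what makes tries attractive for prefix matching.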
Thank You