Geeks Graph
Adjacency Matrix:
Adjacency Matrix is a 2D array of size V x V where V is the number of vertices in a graph. Let the 2D array be
adj[][], a slot adj[i][j] = 1 indicates that there is an edge from vertex i to vertex j.
Pros: The representation is easy to implement and follow. Removing an edge takes O(1) time. Queries like
whether there is an edge from vertex ‘u’ to vertex ‘v’ are efficient and can be done in O(1) time.
Cons: Consumes more space, O(V^2). Even if the graph is sparse (contains fewer edges), it
consumes the same space. Adding a vertex takes O(V^2) time.
Adjacency List:
An array of linked lists is used. The size of the array is equal to the number of vertices. Let the array be array[]. An
entry array[i] represents the linked list of vertices adjacent to the ith vertex. This representation can also be
used to represent a weighted graph. The weights of edges can be stored in the nodes of the linked lists.
Pros: Saves space, O(|V|+|E|). In the worst case, there can be C(V, 2) edges in a graph, thus
consuming O(V^2) space. Adding a vertex is easier.
Cons: Queries like whether there is an edge from vertex u to vertex v are not efficient and take O(V) time.
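The adjacency-list idea above can be sketched in a few lines of C++, using std::vector in place of linked lists (ListGraph and addEdge are illustrative names, not from any code in this article):

```cpp
#include <vector>

// Minimal adjacency-list representation: one vector of neighbours
// per vertex.
struct ListGraph {
    int V;
    std::vector<std::vector<int>> adj;
    ListGraph(int V) : V(V), adj(V) {}
    // Add an undirected edge u-v by recording it in both lists.
    void addEdge(int u, int v) {
        adj[u].push_back(v);
        adj[v].push_back(u);
    }
};
```

For a weighted graph, each list entry would store a (vertex, weight) pair instead of a bare vertex, as the text notes.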
Note that the above code traverses only the vertices reachable from a given source vertex. All the vertices
may not be reachable from a given vertex (e.g., in a disconnected graph). To print all the vertices, we can
modify the BFS function to do traversal starting from all nodes one by one (like the modified DFS version).
Time Complexity: O(V+E) where V is number of vertices in the graph and E is number of edges in the graph.
while (queue.size() != 0)
{
    // Dequeue a vertex from queue and print it
    s = queue.poll();
    System.out.print(s + " ");

    // Get all adjacent vertices of the dequeued vertex s.
    // If an adjacent vertex has not been visited, then mark it
    // visited and enqueue it
    Iterator<Integer> i = adj[s].listIterator();
    while (i.hasNext())
    {
        int n = i.next();
        if (!visited[n])
        {
            visited[n] = true;
            queue.add(n);
        }
    }
}
Note that the above code traverses only the vertices reachable from a given source vertex. All the vertices
may not be reachable from a given vertex (e.g., in a disconnected graph). To do a complete DFS traversal of
such graphs, we must call DFSUtil() for every vertex. Also, before calling DFSUtil(), we should check if it has
already been printed by some other call of DFSUtil().
Time Complexity: O(V+E) where V is number of vertices in the graph and E is number of edges in the graph.
1) For an unweighted graph, DFS traversal produces a spanning tree, and since all edges have equal weight,
any spanning tree is a minimum spanning tree. (For shortest paths by edge count, BFS rather than DFS is needed.)
3) Path Finding
We can specialize the DFS algorithm to find a path between two given vertices u and z.
i) Call DFS(G, u) with u as the start vertex.
ii) Use a stack S to keep track of the path between the start vertex and the current vertex.
iii) As soon as destination vertex z is encountered, return the path as the
contents of the stack
4) Topological Sorting
Topological Sorting is mainly used for scheduling jobs from the given dependencies among jobs. In computer
science, applications of this type arise in instruction scheduling, ordering of formula cell evaluation when
recomputing formula values in spreadsheets, logic synthesis, determining the order of compilation tasks to
perform in makefiles, data serialization, and resolving symbol dependencies in linkers [2].
6) Finding Strongly Connected Components of a graph: A directed graph is called strongly connected if
there is a path from each vertex in the graph to every other vertex. (A DFS-based algorithm can be used for
finding Strongly Connected Components.)
7) Solving puzzles with only one solution, such as mazes. (DFS can be adapted to find all solutions to a
maze by only including nodes on the current path in the visited set.)
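The path-finding steps in application 3) above can be sketched as a DFS that keeps the current path on an explicit stack (findPath and dfsPath are illustrative names; a std::vector serves as the stack):

```cpp
#include <vector>

// Step ii: keep the path from the start vertex on a stack.
// Step iii: as soon as z is encountered, the stack holds u..z.
bool dfsPath(const std::vector<std::vector<int>>& adj, int v, int z,
             std::vector<bool>& visited, std::vector<int>& path) {
    visited[v] = true;
    path.push_back(v);          // push current vertex on the path stack
    if (v == z) return true;    // destination reached
    for (int u : adj[v])
        if (!visited[u] && dfsPath(adj, u, z, visited, path))
            return true;
    path.pop_back();            // backtrack: v is not on the u-z path
    return false;
}

// Step i: call DFS with u as the start vertex. Returns the path
// u..z, or an empty vector if z is unreachable from u.
std::vector<int> findPath(const std::vector<std::vector<int>>& adj, int u, int z) {
    std::vector<bool> visited(adj.size(), false);
    std::vector<int> path;
    dfsPath(adj, u, z, visited, path);
    return path;
}
```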
1) Shortest Path and Minimum Spanning Tree for unweighted graph: In an unweighted graph, the shortest
path is the path with the least number of edges. With Breadth First Search, we always reach a vertex from the given
source using the minimum number of edges. Also, in the case of unweighted graphs, any spanning tree is a Minimum
Spanning Tree, and we can use either Depth First or Breadth First Traversal for finding a spanning tree.
2) Peer to Peer Networks. In Peer to Peer Networks like BitTorrent, Breadth First Search is used to find all
neighbor nodes.
3) Crawlers in Search Engines: Crawlers build their index using Breadth First Search. The idea is to start from the source
page, follow all links from the source, and keep doing the same. Depth First Traversal can also be used for
crawlers, but the advantage of Breadth First Traversal is that the depth or levels of the built tree can be limited.
4) Social Networking Websites: In social networks, we can find people within a given distance ‘k’ from a
person using Breadth First Search till ‘k’ levels.
5) GPS Navigation systems: Breadth First Search is used to find all neighboring locations.
6) Broadcasting in Network: In networks, a broadcasted packet follows Breadth First Search to reach all
nodes.
7) In Garbage Collection: Breadth First Search is used in copying garbage collection using Cheney’s
algorithm. Breadth First Search is preferred over Depth First Search here because of
better locality of reference.
8) Cycle detection in undirected graph: In undirected graphs, either Breadth First Search or Depth First
Search can be used to detect a cycle. In directed graphs, DFS is typically used (a BFS-based approach via Kahn’s algorithm also works).
9) Ford–Fulkerson algorithm: In the Ford-Fulkerson algorithm, we can use either Breadth First or Depth First
Traversal to find the maximum flow. Breadth First Traversal is preferred as it reduces the worst case time
complexity to O(VE^2).
10) To test if a graph is Bipartite We can either use Breadth First or Depth First Traversal.
11) Path Finding We can either use Breadth First or Depth First Traversal to find if there is a path between
two vertices.
12) Finding all nodes within one connected component: We can either use Breadth First or Depth First
Traversal to find all nodes reachable from a given node.
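Application 1) above, shortest paths by edge count, can be sketched as a plain BFS that records distances (bfsDistances is an illustrative name):

```cpp
#include <vector>
#include <queue>

// BFS from src: dist[v] = minimum number of edges from src to v,
// or -1 if v is unreachable. The first time a vertex is reached
// is necessarily via a minimum-edge path, since BFS expands level
// by level.
std::vector<int> bfsDistances(const std::vector<std::vector<int>>& adj, int src) {
    std::vector<int> dist(adj.size(), -1);
    std::queue<int> q;
    dist[src] = 0;
    q.push(src);
    while (!q.empty()) {
        int v = q.front(); q.pop();
        for (int u : adj[v])
            if (dist[u] == -1) {        // first visit = shortest distance
                dist[u] = dist[v] + 1;
                q.push(u);
            }
    }
    return dist;
}
```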
Many algorithms like Prim’s Minimum Spanning Tree and Dijkstra’s Single Source Shortest Path use structure
similar to Breadth First Search.
We initialize distances to all vertices as minus infinite and the distance to the source as 0, then we find a topological
sorting of the graph. Topological sorting of a graph represents a linear ordering of the graph (see below;
figure (b) is a linear representation of figure (a)). Once we have the topological order (or linear representation),
we process all vertices one by one in topological order. For every vertex being processed, we update the
distances of its adjacent vertices using the distance of the current vertex.
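The steps in this paragraph (initialize to minus infinity, topologically sort, then relax vertices in topological order) can be sketched as below. Since the minus-infinity initialization implies the longest-path variant of the DAG algorithm, that is what this sketch computes; all names (dagLongest, topoDFS, Edge) are illustrative:

```cpp
#include <vector>
#include <climits>

struct Edge { int to, w; };

// Fill `order` with a DFS post-order; reversing it gives a
// topological order of the DAG.
void topoDFS(int v, const std::vector<std::vector<Edge>>& adj,
             std::vector<bool>& vis, std::vector<int>& order) {
    vis[v] = true;
    for (const Edge& e : adj[v])
        if (!vis[e.to]) topoDFS(e.to, adj, vis, order);
    order.push_back(v);
}

// Longest distances from src in a DAG: initialize to -infinity,
// process vertices in topological order, relax outgoing edges.
std::vector<long long> dagLongest(const std::vector<std::vector<Edge>>& adj, int src) {
    int V = adj.size();
    std::vector<bool> vis(V, false);
    std::vector<int> order;
    for (int v = 0; v < V; v++)
        if (!vis[v]) topoDFS(v, adj, vis, order);
    std::vector<long long> dist(V, LLONG_MIN);  // "minus infinite"
    dist[src] = 0;
    for (int i = V - 1; i >= 0; i--) {          // reverse post-order
        int v = order[i];
        if (dist[v] == LLONG_MIN) continue;     // not reachable yet
        for (const Edge& e : adj[v])
            if (dist[v] + e.w > dist[e.to])
                dist[e.to] = dist[v] + e.w;
    }
    return dist;
}
```

For shortest paths in a DAG, the same structure applies with plus infinity and a minimizing relaxation.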
▪ Case 1: Undirected Connected Graph: In this case, all the vertices are mother vertices, as we can
reach all the other nodes in the graph from any vertex.
▪ Case 2: Undirected/Directed Disconnected Graph: In this case, there is no mother vertex, as we
cannot reach all the other nodes in the graph.
▪ Case 3: Directed Connected Graph: In this case, we have to find a vertex v in the graph such that
we can reach all the other nodes in the graph through a directed path.
A Naive Approach:
A trivial approach is to perform a DFS/BFS from every vertex and check whether we can reach all the
vertices from that vertex. This approach takes O(V(E+V)) time, which is very inefficient for large graphs.
Can we do better?
We can find a mother vertex in O(V+E) time. The idea is based on Kosaraju’s Strongly Connected
Component Algorithm. In a graph of strongly connected components, mother vertices are always vertices of
the source component in the component graph. The idea is based on the fact below.
If there exists a mother vertex (or vertices), then one of the mother vertices is the last finished vertex in DFS.
(Or: a mother vertex has the maximum finish time in DFS traversal.)
A vertex is said to be finished in DFS if a recursive call for its DFS is over, i.e., all descendants of the vertex
have been visited.
1. The recursive DFS call is made for u before v. If an edge u → v exists, then v must have finished before u,
because v is reachable through u and a vertex finishes after all its descendants.
2. The recursive DFS call is made for v before u. In this case too, if an edge u → v exists, then either v must
finish before u (which contradicts our assumption that v finished last) OR u must be reachable
from v (which means u is another mother vertex).
Algorithm :
1. Do a DFS traversal of the given graph. While doing the traversal, keep track of the last finished vertex ‘v’. This
step takes O(V+E) time.
2. If there exists a mother vertex (or vertices), then v must be one (or one of them). Check if v is a mother
vertex by doing a DFS/BFS from v. This step also takes O(V+E) time.
Time Complexity : O(V + E)
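The two-step algorithm above can be sketched as follows (findMotherVertex and motherDFS are illustrative names):

```cpp
#include <vector>

// Step 1 helper: plain DFS marking reachable vertices.
void motherDFS(int v, const std::vector<std::vector<int>>& adj,
               std::vector<bool>& visited) {
    visited[v] = true;
    for (int u : adj[v])
        if (!visited[u]) motherDFS(u, adj, visited);
}

// Returns a mother vertex, or -1 if none exists.
int findMotherVertex(const std::vector<std::vector<int>>& adj) {
    int V = adj.size(), candidate = -1;
    std::vector<bool> visited(V, false);
    // Step 1: the root of the last DFS tree is the last finished
    // vertex, hence the only candidate.
    for (int v = 0; v < V; v++)
        if (!visited[v]) {
            motherDFS(v, adj, visited);
            candidate = v;
        }
    if (candidate == -1) return -1;   // empty graph
    // Step 2: verify the candidate reaches every vertex.
    visited.assign(V, false);
    motherDFS(candidate, adj, visited);
    for (int v = 0; v < V; v++)
        if (!visited[v]) return -1;
    return candidate;
}
```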
We have discussed an O(V^3) solution for this problem earlier, based on the Floyd Warshall Algorithm. In this
post an O(V^2) algorithm for the same is discussed.
1. Create a matrix tc[V][V] that would finally have transitive closure of given graph. Initialize all entries of
tc[][] as 0.
2. Call DFS for every node of graph to mark reachable vertices in tc[][]. In recursive calls to DFS, we don’t
call DFS for an adjacent vertex if it is already marked as reachable in tc[][].
void Graph::DFSUtil(int s, int v)
{
    // Mark reachability from s to v as true.
    tc[s][v] = true;
    for (int u : adj[v])    // recur for adjacents of v not yet marked
        if (!tc[s][u])
            DFSUtil(s, u);
}
The standard algorithm to find a k-core graph is to remove all the vertices that have degree less than ‘K’ from
the input graph. We must be careful that removing a vertex reduces the degree of all the vertices adjacent to
it, hence the degree of adjacent vertices can also drop below ‘K’. And thus, we may have to remove those
vertices also. This process may or may not continue until there are no vertices left in the graph.
To implement above algorithm, we do a modified DFS on the input graph and delete all the vertices having
degree less than ‘K’, then update degrees of all the adjacent vertices, and if their degree falls below ‘K’ we will
delete them too.
Time complexity of the above solution is O(V + E) where V is number of vertices and E is number of edges.
Graph::Graph(int V)
{
this->V = V;
adj = new list<int>[V];
}
// PRINTING K CORES
cout << "K-Cores : \n";
for (int v = 0; v < V; v++)
{
    // Only considering those vertices which have degree
    // >= K after BFS
    if (vDegree[v] >= k)
    {
        cout << "\n[" << v << "]";

        // Traverse adjacency list of v and print only
        // those adjacent vertices which still have degree >= k
        for (int u : adj[v])
            if (vDegree[u] >= k)
                cout << " -> " << u;
    }
}
1. DFS first traverses nodes going through one adjacent of root, then next adjacent. The problem with this
approach is, if there is a node close to root, but not in first few subtrees explored by DFS, then DFS
reaches that node very late. Also, DFS may not find shortest path to a node (in terms of number of
edges).
2. BFS goes level by level, but requires more space. The space required by DFS is O(d) where d is depth
of tree, but space required by BFS is O(n) where n is number of nodes in tree (Why? Note that the last
level of tree can have around n/2 nodes and second last level n/4 nodes and in BFS we need to have
every level one by one in queue).
IDDFS combines depth-first search’s space-efficiency and breadth-first search’s fast search (for nodes closer
to root).
An important thing to note is, we visit top level nodes multiple times. The last (or max depth) level is visited
once, second last level is visited twice, and so on. It may seem expensive, but it turns out to be not so costly,
since in a tree most of the nodes are in the bottom level. So it does not matter much if the upper levels are
visited multiple times.
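A minimal IDDFS sketch for case (a), a graph with no cycles, where no visited flag is needed (dls and iddfs are illustrative names):

```cpp
#include <vector>

// Depth-limited search: returns true if `target` is reachable from
// v using at most `limit` edges.
bool dls(const std::vector<std::vector<int>>& adj, int v, int target, int limit) {
    if (v == target) return true;
    if (limit <= 0) return false;
    for (int u : adj[v])
        if (dls(adj, u, target, limit - 1))
            return true;
    return false;
}

// Iterative deepening: repeat DLS with growing depth limits, so a
// target close to the root is found early, with only O(d) stack space.
bool iddfs(const std::vector<std::vector<int>>& adj, int src, int target, int maxDepth) {
    for (int limit = 0; limit <= maxDepth; limit++)
        if (dls(adj, src, target, limit))
            return true;
    return false;
}
```

For case (b), a graph with cycles, a per-iteration visited set (or a depth check) has to be added to keep the depth-limited search from looping.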
There can be two cases:
a) When the graph has no cycle: This case is simple. We can run DFS multiple times with different height limits.
b) When the graph has cycles: This is interesting, as there is no visited flag in IDDFS.
In an iterative deepening search, the nodes on the bottom level are expanded once, those on the next to
bottom level are expanded twice, and so on, up to the root of the search tree, which is expanded d+1 times.
So the total number of expansions in an iterative deepening search is-
That is,
(d + 1 - i) * b^i, summed from i = 0 to i = d,
which is O(b^d).
After evaluating the above expression, we find that asymptotically IDDFS takes the same time as that of DFS
and BFS, but it is indeed slower than both of them as it has a higher constant factor in its time complexity
expression.
Depth First Traversal can be used to detect a cycle in a graph. DFS for a connected graph produces a tree.
There is a cycle in a graph only if there is a back edge present in the graph. A back edge is an edge
from a node to itself (a self-loop) or to one of its ancestors in the tree produced by DFS.
For a disconnected graph, we get a DFS forest as output. To detect a cycle, we can check for cycles in the individual trees by
checking for back edges.
To detect a back edge, we can keep track of the vertices currently in the recursion stack of the DFS traversal. If we
reach a vertex that is already in the recursion stack, then there is a cycle in the tree. The edge that connects the current vertex
to a vertex in the recursion stack is a back edge. We use a recStack[] array (despite the name, a boolean array, not an
actual stack) to keep track of the vertices in the recursion stack.
Time Complexity of this method is same as time complexity of DFS traversal which is O(V+E).
}
recStack[v] = false; // remove the vertex from recursion stack
return false;
}
// Returns true if the graph contains a cycle, else false.
// This function is a variation of DFS() in http://www.geeksforgeeks.org/archives/18212
bool Graph::isCyclic()
{
    // Mark all the vertices as not visited and not part of
    // recursion stack
    bool *visited = new bool[V];
    bool *recStack = new bool[V];
    for (int i = 0; i < V; i++)
    {
        visited[i] = false;
        recStack[i] = false;
    }

    // Call the recursive helper function on each vertex to
    // detect cycles in different DFS trees
    for (int i = 0; i < V; i++)
        if (isCyclicUtil(i, visited, recStack))
            return true;

    return false;
}
Find: Determine which subset a particular element is in. This can be used for determining if two elements are
in the same subset.
In this post, we will discuss an application of Disjoint Set Data Structure. The application is to check whether a
given graph contains a cycle or not.
Union-Find Algorithm can be used to check whether an undirected graph contains cycle or not. Note that we
have discussed an algorithm to detect cycle. This is another method based on Union-Find. This method
assumes that graph doesn’t contain any self-loops.
We can keep track of the subsets in a 1D array; let’s call it parent[].
// Find the subsets of both endpoints of every edge; equal
// subsets mean the edge closes a cycle.
for (int i = 0; i < graph.E; ++i)
{
    int x = graph.find(parent, graph.edge[i].src);
    int y = graph.find(parent, graph.edge[i].dest);
    if (x == y)
        return 1;
    graph.Union(parent, x, y);
}
return 0;
}
Note that the implementation of union() and find() is naive and takes O(n) time in the worst case. These methods
can be improved to O(Logn) using Union by Rank or Height. We will soon be discussing Union by Rank in a
separate post.
We have discussed cycle detection for directed graph. We have also discussed a union-find algorithm for
cycle detection in undirected graphs. The time complexity of the union-find algorithm is O(ELogV). Like
directed graphs, we can use DFS to detect cycle in an undirected graph in O(V+E) time. We do a DFS
traversal of the given graph. For every visited vertex ‘v’, if there is an adjacent ‘u’ such that u is already visited
and u is not the parent of v, then there is a cycle in the graph. If we don’t find such an adjacent vertex for any vertex, we say
that there is no cycle. The assumption of this approach is that there are no parallel edges between any two
vertices.
return false;
}
Time Complexity: The program does a simple DFS Traversal of graph and graph is represented using
adjacency list. So the time complexity is O(V+E)
Note why this method cannot be used for directed graphs: here we create parent as -1 for every new component
and try to find a cycle within that component, and we are sure that we won’t get an edge to some other component,
so the check (if visited[i] and i != parent) works. In a directed graph, it is possible that there is no path from i
to j but there is a directed edge from j to i, so using this check would not be correct.
In the previous post, we discussed a solution that uses a visited array together with a separate array that stores
the vertices of the current recursion call stack.
In this post a different solution is discussed. The solution is from CLRS book. The idea is to do DFS of given
graph and while doing traversal, assign one of the below three colors to every vertex.
WHITE : Vertex is not processed yet. Initially, all vertices are WHITE.
GRAY : Vertex is being processed (DFS for this vertex has started, but not finished, which means some of its
descendants are not processed yet).
BLACK : Vertex and all its descendants are processed.
// If there is an adjacent vertex v that is GRAY, then the
// edge to v is a back edge, hence there is a cycle
if (color[v] == GRAY)
    return true;
return false;
}
return false;
}
(There is no need to worry that the parent of the vertex we are visiting would be GRAY so that we would always report a
cycle: the graph is directed, so the parent is not adjacent to the vertex unless there is a directed edge from the
vertex back to the parent, in which case there really is a cycle.)
The idea is to use Topological Sorting. Following are two steps used in the algorithm.
1) Consider the subgraph with directed edges only and find topological sorting of the subgraph. In the above
example, topological sorting is {0, 5, 1, 2, 3, 4}. Below diagram shows topological sorting for the above
example graph.
2) Use above topological sorting to assign directions to undirected edges. For every undirected edge (u, v),
assign it direction from u to v if u comes before v in topological sorting, else assign it direction from v to u.
Below diagram shows assigned directions in the example graph.
Topological Sorting
Topological sorting for Directed Acyclic Graph (DAG) is a linear ordering of vertices such that for every
directed edge uv, vertex u comes before v in the ordering. Topological Sorting for a graph is not possible if the
graph is not a DAG.
For example, a topological sorting of the following graph is “5 4 2 3 1 0”. There can be more than one
topological sorting for a graph. For example, another topological sorting of the following graph is “4 5 2 3 1 0”.
The first vertex in topological sorting is always a vertex with in-degree as 0 (a vertex with no in-coming
edges).
Topological Sorting vs Depth First Traversal (DFS):
In DFS, we print a vertex and then recursively call DFS for its adjacent vertices. In topological sorting, we
need to print a vertex before its adjacent vertices. For example, in the given graph, the vertex ‘5’ should be
printed before vertex ‘0’, but unlike DFS, the vertex ‘4’ should also be printed before vertex ‘0’. So Topological
sorting is different from DFS. For example, a DFS of the shown graph is “5 2 3 1 0 4”, but it is not a
topological sorting.
Time Complexity: The above algorithm is simply DFS with an extra stack. So time complexity is same as
DFS which is O(V+E).
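The "DFS with an extra stack" idea can be sketched as follows (topoSort and topoUtil are illustrative names; a vector serves as the stack):

```cpp
#include <vector>

// Visit all descendants of v first, then push v on the stack, so
// that v ends up above everything reachable from it.
void topoUtil(int v, const std::vector<std::vector<int>>& adj,
              std::vector<bool>& visited, std::vector<int>& stack) {
    visited[v] = true;
    for (int u : adj[v])
        if (!visited[u]) topoUtil(u, adj, visited, stack);
    stack.push_back(v);     // all of v's descendants are done
}

// Popping the stack (here: reading it in reverse) yields a
// topological order of the DAG.
std::vector<int> topoSort(const std::vector<std::vector<int>>& adj) {
    int V = adj.size();
    std::vector<bool> visited(V, false);
    std::vector<int> stack;
    for (int v = 0; v < V; v++)
        if (!visited[v]) topoUtil(v, adj, visited, stack);
    return std::vector<int>(stack.rbegin(), stack.rend());
}
```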
Applications:
Topological Sorting is mainly used for scheduling jobs from the given dependencies among jobs. In computer
science, applications of this type arise in instruction scheduling, ordering of formula cell evaluation when
recomputing formula values in spreadsheets, logic synthesis, determining the order of compilation tasks to
perform in makefiles, data serialization, and resolving symbol dependencies in linkers [2].
A DAG G has at least one vertex with in-degree 0 and one vertex with out-degree 0.
Proof: A simple proof of the above fact: a DAG does not contain a cycle, which means that all
paths are of finite length. Let S be the longest path from u (source) to v (destination). Since S is the
longest path, there can be no incoming edge to u and no outgoing edge from v; if there were,
S would not be the longest path.
=> indegree(u) = 0 and outdegree(v) = 0
Algorithm:
Steps involved in finding the topological ordering of a DAG:
Step-1: Compute in-degree (number of incoming edges) for each of the vertex present in the DAG and
initialize the count of visited nodes as 0.
Step-2: Pick all the vertices with in-degree as 0 and add them into a queue (Enqueue operation)
Time Complexity: The outer for loop is executed V times and the inner for loop is executed E times in total.
Thus the overall time complexity is O(V+E).
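Steps 1-2 above, continued to completion (repeatedly dequeue a vertex, append it to the order, and decrement its neighbours' in-degrees), can be sketched as follows (kahnTopoSort is an illustrative name):

```cpp
#include <vector>
#include <queue>

// Kahn's algorithm: returns a topological order of the DAG. If the
// returned order has fewer than V vertices, the graph had a cycle.
std::vector<int> kahnTopoSort(const std::vector<std::vector<int>>& adj) {
    int V = adj.size();
    std::vector<int> indeg(V, 0), order;
    for (int v = 0; v < V; v++)             // Step 1: compute in-degrees
        for (int u : adj[v]) indeg[u]++;
    std::queue<int> q;
    for (int v = 0; v < V; v++)             // Step 2: enqueue in-degree-0 vertices
        if (indeg[v] == 0) q.push(v);
    while (!q.empty()) {
        int v = q.front(); q.pop();
        order.push_back(v);
        for (int u : adj[v])                // removing v lowers neighbours' in-degrees
            if (--indeg[u] == 0) q.push(u);
    }
    return order;
}
```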
How does Prim’s Algorithm Work? The idea behind Prim’s algorithm is simple, a spanning tree means all
vertices must be connected. So the two disjoint subsets (discussed above) of vertices must be connected to
make a Spanning Tree. And they must be connected with the minimum weight edge to make it
a Minimum Spanning Tree.
Algorithm
1) Create a set mstSet that keeps track of vertices already included in MST.
2) Assign a key value to all vertices in the input graph. Initialize all key values as INFINITE. Assign key value
as 0 for the first vertex so that it is picked first.
3) While mstSet doesn’t include all vertices
….a) Pick a vertex u which is not there in mstSet and has minimum key value.
….b) Include u to mstSet.
….c) Update key value of all adjacent vertices of u. To update the key values, iterate through all adjacent
vertices. For every adjacent vertex v, if weight of edge u-v is less than the previous key value of v, update the
key value as weight of u-v
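Steps 1-3 can be sketched on an adjacency matrix (0 meaning no edge) as the O(V^2) variant discussed later; primMST is an illustrative name:

```cpp
#include <vector>
#include <climits>

// Prim's algorithm; returns the total weight of the MST.
int primMST(const std::vector<std::vector<int>>& g) {
    int V = g.size(), total = 0;
    std::vector<int> key(V, INT_MAX);     // step 2: all keys INFINITE
    std::vector<bool> inMST(V, false);    // step 1: mstSet
    key[0] = 0;                           // first vertex is picked first
    for (int count = 0; count < V; count++) {
        int u = -1;                       // step 3a: min-key vertex not in mstSet
        for (int v = 0; v < V; v++)
            if (!inMST[v] && (u == -1 || key[v] < key[u])) u = v;
        inMST[u] = true;                  // step 3b: include u
        total += key[u];
        for (int v = 0; v < V; v++)       // step 3c: update keys of u's adjacents
            if (g[u][v] && !inMST[v] && g[u][v] < key[v])
                key[v] = g[u][v];
    }
    return total;
}
```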
Network design.
– telephone, electrical, hydraulic, TV cable, computer, road
The standard application is to a problem like phone network design. You have a business with several offices;
you want to lease phone lines to connect them up with each other; and the phone company charges different
amounts of money to connect different pairs of cities. You want a set of lines that connects all your offices with
a minimum total cost. It should be a spanning tree, since if a network isn’t a tree you can always remove
some edges and save money.
Approximation algorithms for NP-hard problems.
– traveling salesperson problem, Steiner tree
A less obvious application is that the minimum spanning tree can be used to approximately solve the traveling
salesman problem. A convenient formal way of defining this problem is to find the shortest path that visits
each point at least once.
Note that if you have a path visiting all points exactly once, it’s a special kind of tree. For instance in the
example above, twelve of sixteen spanning trees are actually paths. If you have a path visiting some vertices
more than once, you can always drop some edges to get a tree. So in general the MST weight is less than the
TSP weight, because it’s a minimization over a strictly larger set.
On the other hand, if you draw a path tracing around the minimum spanning tree, you trace each edge twice
and visit all points, so the TSP weight is less than twice the MST weight. Therefore this tour is within a factor
of two of optimal.
Indirect applications.
– max bottleneck paths
– LDPC codes for error correction
– image registration with Renyi entropy
– learning salient features for real-time face verification
– reducing data storage in sequencing amino acids in a protein
– model locality of particle interactions in turbulent fluid flows
– autoconfig protocol for Ethernet bridging to avoid cycles in a network
Cluster analysis
The k-clustering problem can be viewed as finding an MST and deleting the k-1 most
expensive edges.
Time Complexity: The time complexity of the above code/algorithm looks O(V^2) as there are two nested
while loops. If we take a closer look, we can observe that the statements in the inner loop are executed O(V+E)
times (similar to BFS). The inner loop has a decreaseKey() operation which takes O(LogV) time. So the overall time
complexity is O(E+V)*O(LogV) = O((E+V)*LogV) = O(ELogV) (for a connected graph, V = O(E)).
3. Repeat step #2 until there are (V-1) edges in the spanning tree.
Step #2 uses the Union-Find algorithm to detect cycles.
The algorithm is a Greedy Algorithm. The Greedy Choice is to pick the smallest weight edge that does not
cause a cycle in the MST constructed so far. Let us understand it with an example: Consider the below input
graph.
Time Complexity: O(ELogE) or O(ELogV). Sorting of edges takes O(ELogE) time. After sorting, we iterate
through all edges and apply the find-union algorithm. The find and union operations can take at most O(LogV)
time. So the overall complexity is O(ELogE + ELogV) time. The value of E can be at most O(V^2), so O(LogV)
and O(LogE) are the same. Therefore, the overall time complexity is O(ELogE) or O(ELogV).
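Kruskal's loop, sort the edges, then add each edge whose endpoints are in different subsets, can be sketched as follows (kruskalMST, findSet, and KEdge are illustrative names; the find here is the naive O(n) version mentioned earlier):

```cpp
#include <vector>
#include <algorithm>

struct KEdge { int u, v, w; };

// Naive union-find `find`: walk up to the representative.
int findSet(std::vector<int>& parent, int x) {
    while (parent[x] != x) x = parent[x];
    return x;
}

// Kruskal's algorithm; returns the total weight of the MST.
int kruskalMST(int V, std::vector<KEdge> edges) {
    // Step 1: sort all edges by weight.
    std::sort(edges.begin(), edges.end(),
              [](const KEdge& a, const KEdge& b) { return a.w < b.w; });
    std::vector<int> parent(V);
    for (int i = 0; i < V; i++) parent[i] = i;
    int total = 0, used = 0;
    for (const KEdge& e : edges) {
        if (used == V - 1) break;         // step 3: V-1 edges chosen
        int a = findSet(parent, e.u), b = findSet(parent, e.v);
        if (a != b) {                     // step 2: no cycle, include the edge
            parent[a] = b;
            total += e.w;
            used++;
        }
    }
    return total;
}
```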
Boruvka’s algorithm
We have discussed following topics on Minimum Spanning Tree.
Like Prim’s and Kruskal’s, Boruvka’s algorithm is also a Greedy algorithm. Below is complete algorithm.
A spanning tree means all vertices must be connected. So the two disjoint subsets (discussed above) of
vertices must be connected to make a Spanning Tree. And they must be connected with the minimum weight
edge to make it a Minimum Spanning Tree.
2) Boruvka’s algorithm is used as a step in a faster randomized algorithm that works in linear time O(E).
3) Boruvka’s algorithm is the oldest minimum spanning tree algorithm. It was discovered by Boruvka in 1926,
long before computers even existed, and was published as a method of constructing an efficient
electricity network.
if (cheapest[set2] == -1 ||
    edge[cheapest[set2]].weight > edge[i].weight)
    cheapest[set2] = i;
}
}
The given set of vertices is called Terminal Vertices and other vertices that are used to construct Steiner tree
are called Steiner vertices.
The Steiner Tree Problem is to find the minimum cost Steiner Tree.
If the given subset of (terminal) vertices is equal to the set of all vertices in the Steiner Tree problem, then the problem
becomes the Minimum Spanning Tree problem. And if the given subset contains only two vertices, then it becomes
the shortest path problem between two vertices.
Finding out Minimum Spanning Tree is polynomial time solvable, but Minimum Steiner Tree problem is NP
Hard and related decision problem is NP-Complete.
Dijkstra’s algorithm is very similar to Prim’s algorithm for minimum spanning tree. Like Prim’s MST, we
generate a SPT (shortest path tree) with given source as root. We maintain two sets, one set contains
vertices included in shortest path tree, other set includes vertices not yet included in shortest path tree. At
every step of the algorithm, we find a vertex which is in the other set (set of not yet included) and has
minimum distance from source.
Below are the detailed steps used in Dijkstra’s algorithm to find the shortest path from a single source vertex
to all other vertices in the given graph.
Algorithm
1) Create a set sptSet (shortest path tree set) that keeps track of vertices included in shortest path tree, i.e.,
whose minimum distance from source is calculated and finalized. Initially, this set is empty.
2) Assign a distance value to all vertices in the input graph. Initialize all distance values as INFINITE. Assign
distance value as 0 for the source vertex so that it is picked first.
3) While sptSet doesn’t include all vertices
….a) Pick a vertex u which is not there in sptSet and has minimum distance value.
….b) Include u to sptSet.
….c) Update distance value of all adjacent vertices of u. To update the distance values, iterate through all
adjacent vertices. For every adjacent vertex v, if sum of distance value of u (from source) and weight of edge
u-v, is less than the distance value of v, then update the distance value of v.
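Steps 1-3 can be sketched as the O(V^2) adjacency-matrix implementation the notes below refer to (0 means no edge; dijkstra here is an illustrative free function):

```cpp
#include <vector>
#include <climits>

// Dijkstra's algorithm; returns shortest distances from src.
std::vector<int> dijkstra(const std::vector<std::vector<int>>& g, int src) {
    int V = g.size();
    std::vector<int> dist(V, INT_MAX);    // step 2: all distances INFINITE
    std::vector<bool> inSPT(V, false);    // step 1: sptSet
    dist[src] = 0;                        // source is picked first
    for (int count = 0; count < V; count++) {
        int u = -1;                       // step 3a: min-distance vertex not in sptSet
        for (int v = 0; v < V; v++)
            if (!inSPT[v] && (u == -1 || dist[v] < dist[u])) u = v;
        if (dist[u] == INT_MAX) break;    // remaining vertices unreachable
        inSPT[u] = true;                  // step 3b: include u
        for (int v = 0; v < V; v++)       // step 3c: relax edges out of u
            if (g[u][v] && !inSPT[v] && dist[u] + g[u][v] < dist[v])
                dist[v] = dist[u] + g[u][v];
    }
    return dist;
}
```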
Notes:
1) The code calculates the shortest distances, but doesn’t calculate the path information. We can create a parent
array, update the parent array when a distance is updated (as in Prim’s implementation) and use it to show the
shortest path from the source to different vertices.
2) The code is for undirected graphs; the same dijkstra function can be used for directed graphs as well.
3) The code finds shortest distances from source to all vertices. If we are interested only in shortest distance
from source to a single target, we can break the for loop when the picked minimum distance vertex is equal to
target (Step 3.a of algorithm).
4) Time Complexity of the implementation is O(V^2). If the input graph is represented using adjacency list, it
can be reduced to O(E log V) with the help of binary heap. Please see
Dijkstra’s Algorithm for Adjacency List Representation for more details.
5) Dijkstra’s algorithm doesn’t work for graphs with negative weight edges. For graphs with negative weight
edges, Bellman–Ford algorithm can be used, we will soon be discussing it as a separate post.
Bellman–Ford Algorithm
Given a graph and a source vertex src in graph, find shortest paths from src to all vertices in the given graph.
The graph may contain negative weight edges.
We have discussed Dijkstra’s algorithm for this problem. Dijkstra’s algorithm is a Greedy algorithm and its time
complexity is O(E + VLogV) (with the use of a Fibonacci heap). Dijkstra doesn’t work for graphs with negative
weight edges; Bellman-Ford works for such graphs. Bellman-Ford is also simpler than Dijkstra and suits
distributed systems well. But the time complexity of Bellman-Ford is O(VE), which is more than that of Dijkstra.
Input: Graph and a source vertex src
Output: Shortest distance to all vertices from src. If there is a negative weight cycle, then shortest distances
are not calculated, negative weight cycle is reported.
1) This step initializes distances from source to all vertices as infinite and distance to source itself as 0.
Create an array dist[] of size |V| with all values as infinite except dist[src] where src is source vertex.
2) This step calculates shortest distances. Do following |V|-1 times where |V| is the number of vertices in
given graph.
…..a) Do following for each edge u-v
………………If dist[v] > dist[u] + weight of edge uv, then update dist[v]
………………….dist[v] = dist[u] + weight of edge uv
3) This step reports if there is a negative weight cycle in graph. Do following for each edge u-v
……If dist[v] > dist[u] + weight of edge uv, then “Graph contains negative weight cycle”
The idea of step 3 is, step 2 guarantees shortest distances if graph doesn’t contain negative weight cycle. If
we iterate through all edges one more time and get a shorter path for any vertex, then there is a negative
weight cycle
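Steps 1-3 can be sketched as follows (bellmanFord, BFEdge, and INF are illustrative names; INF marks unreachable vertices):

```cpp
#include <vector>
#include <utility>

struct BFEdge { int u, v, w; };

const long long INF = 1e18;

// Returns {dist, hasNegativeCycle}; distances are not meaningful
// if a negative weight cycle is reported.
std::pair<std::vector<long long>, bool>
bellmanFord(int V, const std::vector<BFEdge>& edges, int src) {
    std::vector<long long> dist(V, INF);  // step 1: all INFINITE except src
    dist[src] = 0;
    for (int i = 0; i < V - 1; i++)       // step 2: |V|-1 relaxation rounds
        for (const BFEdge& e : edges)
            if (dist[e.u] != INF && dist[e.u] + e.w < dist[e.v])
                dist[e.v] = dist[e.u] + e.w;
    bool negCycle = false;                // step 3: one more round; any
    for (const BFEdge& e : edges)         // improvement means a negative cycle
        if (dist[e.u] != INF && dist[e.u] + e.w < dist[e.v])
            negCycle = true;
    return {dist, negCycle};
}
```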
We initialize the solution matrix same as the input graph matrix as a first step. Then we update the solution
matrix by considering all vertices as an intermediate vertex. The idea is to one by one pick all vertices and
update all shortest paths which include the picked vertex as an intermediate vertex in the shortest path. When
we pick vertex number k as an intermediate vertex, we already have considered vertices {0, 1, 2, .. k-1} as
intermediate vertices. For every pair (i, j) of source and destination vertices respectively, there are two
possible cases.
1) k is not an intermediate vertex in shortest path from i to j. We keep the value of dist[i][j] as it is.
2) k is an intermediate vertex in shortest path from i to j. We update the value of dist[i][j] as dist[i][k] + dist[k][j].
The above program only prints the shortest distances. We can modify the solution to print the shortest paths
also by storing the predecessor information in a separate 2D matrix.
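The two cases above translate directly into the triple loop of Floyd Warshall (a sketch; floydWarshall and FW_INF are illustrative names, with FW_INF meaning "no edge"):

```cpp
#include <vector>

const int FW_INF = 1e9;

// dist starts as the input matrix; each k is then tried as an
// intermediate vertex for every pair (i, j).
std::vector<std::vector<int>>
floydWarshall(std::vector<std::vector<int>> dist) {
    int V = dist.size();
    for (int k = 0; k < V; k++)
        for (int i = 0; i < V; i++)
            for (int j = 0; j < V; j++)
                // Case 2: going through k is shorter, so update;
                // otherwise case 1 keeps dist[i][j] as it is.
                if (dist[i][k] < FW_INF && dist[k][j] < FW_INF &&
                    dist[i][k] + dist[k][j] < dist[i][j])
                    dist[i][j] = dist[i][k] + dist[k][j];
    return dist;
}
```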
If we apply Dijkstra’s Single Source shortest path algorithm for every vertex, considering every vertex as a
source, we can find all pair shortest paths in O(V*VLogV) time. So using Dijkstra’s single source shortest path
seems to be a better option than Floyd Warshall, but the problem with Dijkstra’s algorithm is that it doesn’t work
for negative weight edges.
The idea of Johnson’s algorithm is to re-weight all edges and make them all positive, then apply Dijkstra’s
algorithm for every vertex.
How to transform a given graph to a graph with all non-negative weight edges?
One may think of a simple approach of finding the minimum weight edge and adding this weight to all edges.
Unfortunately, this doesn’t work as there may be different number of edges in different paths (See this for an
example). If there are multiple paths from a vertex u to v, then all paths must be increased by same amount,
so that the shortest path remains the shortest in the transformed graph.
The idea of Johnson’s algorithm is to assign a weight to every vertex. Let the weight assigned to vertex u be
h[u]. We reweight the edges using the vertex weights: for an edge (u, v) of weight w(u, v), the new
weight becomes w(u, v) + h[u] – h[v]. The great thing about this reweighting is that every path between any
two fixed vertices is increased by the same amount and all negative weights become non-negative. Consider any
path between two vertices s and t: its weight increases by exactly h[s] – h[t], because the h[] values of all
intermediate vertices on the path cancel each other.
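As a tiny illustration of this telescoping, the snippet below computes a path's weight under the reweighting. The h[] values used in the test are arbitrary made-up numbers, not ones produced by Bellman-Ford, and all names here are illustrative.

```java
// Illustrative check of the reweighting identity
// w'(path from s to t) = w(path) + h[s] - h[v].
class Reweighting {
    // New weight of edge (u, v): w(u, v) + h[u] - h[v].
    static int reweight(int w, int hu, int hv) { return w + hu - hv; }

    // Weight of a path (given as a vertex sequence) after reweighting; the
    // h terms of all interior vertices telescope away, leaving h[s] - h[t].
    static int pathWeight(int[] edgeWeights, int[] h, int[] path) {
        int total = 0;
        for (int e = 0; e < edgeWeights.length; e++)
            total += reweight(edgeWeights[e], h[path[e]], h[path[e + 1]]);
        return total;
    }
}
```

For the path 0 → 1 → 2 with edge weights {4, -2} and h = {7, 1, 3}, the original path weight is 2 and the reweighted path weight is 2 + h[0] - h[2] = 6, whatever h[1] is.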
How do we calculate h[] values? Bellman-Ford algorithm is used for this purpose. Following is the complete
algorithm. A new vertex is added to the graph and connected to all existing vertices. The shortest distance
values from new vertex to all existing vertices are h[] values.
Algorithm:
1) Let the given graph be G. Add a new vertex s to the graph, add edges from new vertex to all vertices of G.
Let the modified graph be G’.
2) Run Bellman-Ford algorithm on G’ with s as source. Let the distances calculated by Bellman-Ford be h[0],
h[1], .. h[V-1]. If we find a negative weight cycle, then return, as no solution exists. Note that a negative weight cycle cannot be
created by the new vertex s, as there is no edge into s; all added edges go out of s.
3) Reweight the edges of the original graph. For each edge (u, v), assign the new weight as “original weight + h[u]
– h[v]”.
4) Remove the added vertex s and run Dijkstra’s algorithm from every vertex of the reweighted graph.
Time Complexity: The main steps in the algorithm are the Bellman-Ford algorithm, called once, and Dijkstra’s algorithm, called V
times. The time complexity of Bellman-Ford is O(VE) and that of Dijkstra (with a Fibonacci heap) is O(E + V log V). So the overall time
complexity is O(V² log V + VE).
The time complexity of Johnson's algorithm becomes the same as Floyd-Warshall's when the graph is complete
(for a complete graph, E = O(V²)). But for sparse graphs, the algorithm performs much better than Floyd-Warshall.
For a general weighted graph, we can calculate single source shortest distances in O(VE) time
using Bellman–Ford Algorithm. For a graph with no negative weights, we can do better and calculate single
source shortest distances in O(E + VLogV) time using Dijkstra’s algorithm. Can we do even better for Directed
Acyclic Graph (DAG)? We can calculate single source shortest distances in O(V+E) time for DAGs. The idea
is to use Topological Sorting.
We initialize distances to all vertices as infinite and the distance to the source as 0, then we find a topological sorting
of the graph. Topological sorting of a graph represents a linear ordering of its vertices (See below; figure (b) is
a linear representation of figure (a)). Once we have the topological order (or linear representation), we process
all vertices in topological order, one by one. For every vertex being processed, we update the distances of its
adjacent vertices using the distance of the current vertex.
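The procedure can be sketched as below. This is an illustrative sketch, not the article's listing: the names, the adjacency representation (adj[u] holding {v, weight} pairs), and the INF sentinel are assumptions.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical sketch: single-source shortest distances in a DAG via
// topological sorting, in O(V+E).
class DagShortestPath {
    static final int INF = Integer.MAX_VALUE;

    // Pushes vertices in reverse finish order, so popping yields topological order.
    static void topoSort(int u, int[][][] adj, boolean[] visited, Deque<Integer> order) {
        visited[u] = true;
        for (int[] edge : adj[u])
            if (!visited[edge[0]])
                topoSort(edge[0], adj, visited, order);
        order.push(u);
    }

    // adj[u] is an array of {v, weight} edges leaving u.
    static int[] shortestFrom(int src, int V, int[][][] adj) {
        Deque<Integer> order = new ArrayDeque<>();
        boolean[] visited = new boolean[V];
        for (int u = 0; u < V; u++)
            if (!visited[u]) topoSort(u, adj, visited, order);

        int[] dist = new int[V];
        Arrays.fill(dist, INF);
        dist[src] = 0;
        // Process vertices in topological order, relaxing outgoing edges.
        while (!order.isEmpty()) {
            int u = order.pop();
            if (dist[u] == INF) continue;   // u not reachable from src
            for (int[] edge : adj[u])
                if (dist[u] + edge[1] < dist[edge[0]])
                    dist[edge[0]] = dist[u] + edge[1];
        }
        return dist;
    }
}
```

Note that negative edge weights are fine here; the topological order guarantees every vertex is finalized before its successors are relaxed.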
Question 2: This is similar to above question. Does the shortest path change when weights of all
edges are multiplied by 10?
If we multiply all edge weights by 10, the shortest path doesn’t change. The reason is simple: the weights of all
paths from s to t get multiplied by the same amount. The number of edges on a path doesn’t matter. It is like
changing the unit of the weights.
Question 3: Given a directed graph where every edge has weight as either 1 or 2, find the shortest
path from a given source vertex ‘s’ to a given destination vertex ‘t’. Expected time complexity is
O(V+E).
If we apply Dijkstra’s shortest path algorithm, we can get a shortest path in O(E + V log V) time. How to do it in
O(V+E) time? The idea is to use BFS. One important observation about BFS is that the path found by BFS always
has the least number of edges between any two vertices. So if all edges are of the same weight, we can use BFS to
find the shortest path. For this problem, we can modify the graph and split all edges of weight 2 into two
edges of weight 1 each. In the modified graph, we can use BFS to find the shortest path. How is this
approach O(V+E)? In the worst case, all edges are of weight 2 and we need O(E) operations to split all
edges, so the time complexity becomes O(E) + O(V+E), which is O(V+E).
▪ The maximum distance between any two nodes can be at most w(V – 1), where w is the maximum edge weight and
there can be at most V-1 edges between two vertices.
▪ In Dijkstra’s algorithm, distances are finalized in non-decreasing order, i.e., the distance of vertices closer (to the given
source) is finalized before that of distant vertices.
Algorithm
The idea is to create a separate array parent[]. The value parent[v] for a vertex v stores the parent of v in the
shortest path tree. The parent of the root (the source vertex) is -1. Whenever we find a shorter path to a vertex
through a vertex u, we make u the parent of that vertex.
Once we have parent array constructed, we can print path using below recursive function.
printPath(parent, parent[j]);
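A possible complete form of that recursive routine is sketched below. It follows the parent[] convention above; the return value (number of printed vertices) is an addition of this sketch, used only to make the result checkable.

```java
// Hypothetical complete version of the recursive printPath routine.
class PathPrinter {
    // Prints the path from the source to vertex j, given parent[] from the
    // shortest path tree (parent of the source is -1). Returns the number of
    // vertices on the path.
    static int printPath(int[] parent, int j) {
        if (j == -1)
            return 0;                      // base case: walked past the source
        int count = printPath(parent, parent[j]);
        System.out.print(j + " ");
        return count + 1;
    }
}
```

For parent[] = {-1, 0, 1, 2}, calling printPath(parent, 3) prints the vertices 0 1 2 3 in order from the source.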
As in Question 3 above, the idea is to use BFS after splitting every weight-2 edge into two weight-1 edges
through a new intermediate vertex.
How many new intermediate vertices are needed? We need a separate intermediate vertex for every
source vertex. The reason is simple: if we added the same intermediate vertex x both between u and v and
between y and z, then new paths u to z and y to v would be added to the graph, which might not have been
there in the original graph. Therefore, in a graph with V vertices, we need V extra vertices.
How is this approach O(V+E)? In the worst case, all edges are of weight 2, and we need O(E) operations to
split all edges plus 2V vertices, so the time complexity becomes O(E) + O(V+E), which is O(V+E).
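A possible sketch of this construction is below, where vertex u's intermediate vertex is numbered u + V. The names and the {u, v, w} edge representation (with w in {1, 2}) are assumptions of this sketch, not the article's code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch: split weight-2 edges, then run a plain BFS.
class SplitEdgeBFS {
    // Returns the shortest weighted distance from s to t, or -1 if unreachable.
    static int shortestPath(int V, int[][] edges, int s, int t) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < 2 * V; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            if (e[2] == 1) {
                adj.get(e[0]).add(e[1]);        // weight-1 edge kept as is
            } else {                            // weight-2 edge u -> v split
                adj.get(e[0]).add(e[0] + V);    // u -> intermediate(u)
                adj.get(e[0] + V).add(e[1]);    // intermediate(u) -> v
            }
        }
        // Plain BFS on the modified graph; each weight unit is now one edge.
        int[] dist = new int[2 * V];
        Arrays.fill(dist, -1);
        dist[s] = 0;
        Queue<Integer> q = new ArrayDeque<>();
        q.add(s);
        while (!q.isEmpty()) {
            int u = q.poll();
            for (int v : adj.get(u))
                if (dist[v] == -1) { dist[v] = dist[u] + 1; q.add(v); }
        }
        return dist[t];
    }
}
```

Because the intermediate vertex is tied to the source endpoint u, two weight-2 edges leaving u can safely share it without creating any spurious path.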
//Constructor
Graph(int v)
{
V = v;
adj = new LinkedList[v];
for (int i=0; i<v; ++i)
adj[i] = new LinkedList();
}
while (queue.size() != 0)
{
    // Dequeue a vertex from queue and print it
    s = queue.poll();
    System.out.print(s + " ");
    // Visit and enqueue all unvisited neighbours of s
    Iterator<Integer> i = adj[s].listIterator();
    while (i.hasNext()) {
        int n = i.next();
        if (!visited[n]) { visited[n] = true; queue.add(n); }
    }
}
Biconnected graph
An undirected graph is called Biconnected if there are two vertex-disjoint paths between any two vertices. In a
Biconnected Graph, there is a simple cycle through any two vertices.
By convention, two nodes connected by an edge form a biconnected graph, even though this case does not satisfy the
properties above. For a graph with more than two vertices, the above properties must hold for it to be
Biconnected.
A graph is Biconnected if it is connected and doesn’t have any Articulation Point. We mainly need
to check two things in a graph:
1) The graph is connected.
2) There is no articulation point in the graph.
We start from any vertex and do DFS traversal. In DFS traversal, we check if there is any articulation point. If
we don’t find any articulation point, then the graph is Biconnected. Finally, we need to check whether all
vertices were reachable in DFS or not. If all vertices were not reachable, then the graph is not even
connected.
Time Complexity: The above function is a simple DFS with additional arrays. So time complexity is same as
DFS which is O(V+E) for adjacency list representation of graph.
Bridges in a graph
An edge in an undirected connected graph is a bridge iff removing it disconnects the graph. For a
disconnected undirected graph, the definition is similar: a bridge is an edge whose removal increases the number of
connected components.
Like Articulation Points, bridges represent vulnerabilities in a connected network and are useful for designing
reliable networks. For example, in a wired computer network, an articulation point indicates the critical
computers and a bridge indicates the critical wires or connections.
The problem is same as following question. “Is it possible to draw a given graph without lifting pencil from the
paper and without tracing any of the edges more than once”.
A graph is called Eulerian if it has an Eulerian Cycle and Semi-Eulerian if it has an Eulerian Path. The
problem seems similar to the Hamiltonian Path problem, which is NP-complete for a general graph. Fortunately,
we can find whether a given graph has an Eulerian Path or not in polynomial time. In fact, we can find it in
O(V+E) time.
Following are some interesting properties of undirected graphs with an Eulerian path and cycle. We can use
these properties to find whether a graph is Eulerian or not.
Eulerian Cycle
An undirected graph has an Eulerian cycle if the following two conditions are true.
a) All vertices with non-zero degree are connected. We don’t care about vertices with zero degree because
they don’t belong to an Eulerian Cycle or Path (we only consider the edges).
b) All vertices have even degree.
Eulerian Path
An undirected graph has an Eulerian Path if the following two conditions are true.
a) Same as condition (a) for an Eulerian Cycle.
b) Zero or two vertices have odd degree and all other vertices have even degree. Note that a single
vertex with odd degree is not possible in an undirected graph (the sum of all degrees is always even in an
undirected graph).
Note that a graph with no edges is considered Eulerian because there are no edges to traverse.
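The two conditions can be combined into a single check, sketched below. This is an illustrative sketch, not the article's code: edges are assumed to be {u, v} pairs, and the return convention (2 for Eulerian, 1 for Semi-Eulerian, 0 otherwise) is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the Eulerian check for an undirected graph.
class EulerCheck {
    static void dfs(int u, List<List<Integer>> adj, boolean[] visited) {
        visited[u] = true;
        for (int v : adj.get(u)) if (!visited[v]) dfs(v, adj, visited);
    }

    // Returns 0 (not Eulerian), 1 (Eulerian path only), 2 (Eulerian cycle).
    static int isEulerian(int V, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < V; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) { adj.get(e[0]).add(e[1]); adj.get(e[1]).add(e[0]); }

        // Condition (a): all vertices with non-zero degree are connected.
        int start = -1;
        for (int i = 0; i < V; i++)
            if (!adj.get(i).isEmpty()) { start = i; break; }
        if (start == -1) return 2;            // no edges: trivially Eulerian
        boolean[] visited = new boolean[V];
        dfs(start, adj, visited);
        for (int i = 0; i < V; i++)
            if (!adj.get(i).isEmpty() && !visited[i]) return 0;

        // Condition (b): count the vertices of odd degree.
        int odd = 0;
        for (int i = 0; i < V; i++)
            if (adj.get(i).size() % 2 == 1) odd++;
        return odd == 0 ? 2 : (odd == 2 ? 1 : 0);
    }
}
```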
boolean isConnected()
{
    boolean visited[] = new boolean[V];
    // Find a vertex with non-zero degree
    int i;
    for (i = 0; i < V; i++)
        if (adj[i].size() != 0)
            break;
    if (i == V)
        return true;  // no edges: trivially connected
    DFSUtil(i, visited);  // DFS from a vertex with non-zero degree
    // All non-zero degree vertices must have been visited
    for (i = 0; i < V; i++)
        if (!visited[i] && adj[i].size() > 0)
            return false;
    return true;
}
1. Make sure the graph has either 0 or 2 odd-degree vertices.
2. If there are 0 odd vertices, start anywhere. If there are 2 odd vertices, start at one of them.
3. Follow edges one at a time. If you have a choice between a bridge and a non-bridge, always choose the
non-bridge.
The idea is, “don’t burn bridges“, so that we can come back to a vertex and traverse the remaining edges.
We first find the starting point, which must be an odd-degree vertex (if there are any), and store it in variable
‘u’. If there are zero odd vertices, we start from vertex ‘0’. We call printEulerUtil() to print the Euler tour starting
at u. We traverse all adjacent vertices of u; if there is only one adjacent vertex, we immediately take it.
If there is more than one adjacent vertex, we take an adjacent vertex v only if the edge u-v is not a bridge. How
do we find whether a given edge is a bridge? We count the number of vertices reachable from u, remove the edge u-v, and
count the number of vertices reachable from u again. If the number of reachable vertices is reduced, then the edge u-v
is a bridge. To count reachable vertices, we can use either BFS or DFS; we have used DFS in the above
code. The function DFSCount(u) returns the number of vertices reachable from u.
Once an edge is processed (included in the Euler tour), we remove it from the graph. To remove the edge, we
replace the vertex entry with -1 in the adjacency list. Note that simply deleting the node may not work, as the code
is recursive and a parent call may be in the middle of an adjacency list.
http://www.geeksforgeeks.org/fleurys-algorithm-for-printing-eulerian-path/
Note that the above code modifies given graph, we can create a copy of graph if we don’t want the given
graph to be modified.
Time Complexity: The time complexity of the above implementation is O((V+E)²). The function printEulerUtil() is
like DFS, and it calls isValidNextEdge(), which also does DFS two times. The time complexity of DFS for an adjacency
list representation is O(V+E). Therefore the overall time complexity is O((V+E)*(V+E)), which can be written as
O(E²) for a connected graph.
Strongly Connected Components
A directed graph is strongly connected if there is a path between all pairs of vertices. A strongly connected
component (SCC) of a directed graph is a maximal strongly connected subgraph. For example, there are 3
SCCs in the following graph.
We can find all strongly connected components in O(V+E) time using Kosaraju’s algorithm. Following is
detailed Kosaraju’s algorithm.
1) Create an empty stack ‘S’ and do DFS traversal of a graph. In DFS traversal, after calling recursive DFS for
adjacent vertices of a vertex, push the vertex to stack. In the above graph, if we start DFS from vertex 0, we
get vertices in stack as 1, 2, 4, 3, 0.
2) Reverse directions of all arcs to obtain the transpose graph.
3) One by one pop a vertex from S while S is not empty. Let the popped vertex be ‘v’. Take v as source and
do DFS (call DFSUtil(v)). The DFS starting from v prints strongly connected component of v. In the above
example, we process vertices in order 0, 3, 4, 2, 1 (One by one popped from stack).
In the next step, we reverse the graph. Consider the graph of SCCs. In the reversed graph, the edges that
connect two components are reversed. So the SCC {0, 1, 2} becomes sink and the SCC {4} becomes source.
As discussed above, in stack, we always have 0 before 3 and 4. So if we do a DFS of the reversed graph
using sequence of vertices in stack, we process vertices from sink to source (in reversed graph). That is what
we wanted to achieve and that is all needed to print SCCs one by one
Time Complexity: The above algorithm calls DFS, finds the reverse of the graph, and again calls DFS. DFS takes
O(V+E) for a graph represented using an adjacency list. Reversing a graph also takes O(V+E) time: we simply
traverse all adjacency lists.
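The three steps can be sketched as below. This is a hedged sketch, not the article's listing: it counts SCCs instead of printing them (so the result is easy to check), edges are {u, v} pairs, and all names are illustrative.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch of Kosaraju's algorithm.
class Kosaraju {
    static void fillOrder(int u, List<List<Integer>> adj, boolean[] vis, Deque<Integer> stack) {
        vis[u] = true;
        for (int v : adj.get(u)) if (!vis[v]) fillOrder(v, adj, vis, stack);
        stack.push(u);                          // push only after all descendants
    }

    static void dfs(int u, List<List<Integer>> adj, boolean[] vis) {
        vis[u] = true;
        for (int v : adj.get(u)) if (!vis[v]) dfs(v, adj, vis);
    }

    static int countSCCs(int V, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>(), rev = new ArrayList<>();
        for (int i = 0; i < V; i++) { adj.add(new ArrayList<>()); rev.add(new ArrayList<>()); }
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            rev.get(e[1]).add(e[0]);            // step 2: transpose graph
        }

        // Step 1: order vertices by DFS finish time on the original graph.
        Deque<Integer> stack = new ArrayDeque<>();
        boolean[] vis = new boolean[V];
        for (int u = 0; u < V; u++) if (!vis[u]) fillOrder(u, adj, vis, stack);

        // Step 3: DFS on the transpose in stack order; each DFS tree is one SCC.
        Arrays.fill(vis, false);
        int count = 0;
        while (!stack.isEmpty()) {
            int u = stack.pop();
            if (!vis[u]) { dfs(u, rev, vis); count++; }
        }
        return count;
    }
}
```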
The above algorithm is asymptotically optimal, but there are other algorithms, like Tarjan’s
algorithm and the path-based algorithm, which have the same time complexity but find SCCs using a single DFS. Tarjan’s
algorithm is discussed in the following post.
Applications:
SCC algorithms can be used as a first step in many graph algorithms that work only on strongly connected
graph.
In social networks, a group of people is often strongly connected (for example, students of a class or
members of any other common group). Many people in these groups generally like some common pages or play common
games. The SCC algorithms can be used to find such groups and suggest the commonly liked pages or
games to the people in the group who have not yet liked those pages or played those games.
http://www.geeksforgeeks.org/strongly-connected-components/
The graph is given in the form of adjacency matrix say ‘graph[V][V]’ where graph[i][j] is 1 if there is an edge from
vertex i to vertex j or i is equal to j, otherwise graph[i][j] is 0.
Floyd-Warshall can be used: we calculate the distance matrix dist[V][V] using Floyd-Warshall; if dist[i][j]
is infinite, then j is not reachable from i; otherwise j is reachable and the value of dist[i][j] will be less than V.
Instead of directly using Floyd-Warshall, we can optimize it in terms of space and time for this particular problem.
Following are the optimizations:
1) Instead of an integer result matrix (dist[V][V] in Floyd-Warshall), we can create a boolean reachability matrix
reach[V][V] (we save space). The value reach[i][j] will be 1 if j is reachable from i, 0 otherwise.
2) Instead of arithmetic operations, we can use logical operations. For the arithmetic operation ‘+’, logical AND ‘&&’ is
used, and for min, logical OR ‘||’ is used. (We save time by a constant factor; the asymptotic time complexity is the same.)
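The boolean variant can be sketched as follows. The input matrix is assumed to already be true on the diagonal, as the problem statement requires; names are illustrative.

```java
// Hypothetical sketch of transitive closure via the boolean Floyd-Warshall
// variant described above.
class TransitiveClosure {
    static boolean[][] closure(boolean[][] graph) {
        int V = graph.length;
        boolean[][] reach = new boolean[V][V];
        for (int i = 0; i < V; i++) reach[i] = graph[i].clone();
        for (int k = 0; k < V; k++)
            for (int i = 0; i < V; i++)
                for (int j = 0; j < V; j++)
                    // '||' plays the role of min, '&&' the role of '+'
                    reach[i][j] = reach[i][j] || (reach[i][k] && reach[k][j]);
        return reach;
    }
}
```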
A graph where all vertices are connected with each other has exactly one connected component, consisting
of the whole graph. (For undirected graphs, such a graph is simply called connected; the term “strongly
connected” properly applies to directed graphs.)
The problem can be easily solved by applying DFS() from an unvisited vertex of each component. Each DFS() call visits one
component (sub-graph); we then call DFS on the next unvisited component. The number of calls to DFS() gives
the number of connected components. BFS can also be used.
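The component-counting idea can be sketched as follows, assuming an undirected graph given as {u, v} edge pairs; names are illustrative, not the article's code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: count connected components by repeated DFS.
class ConnectedComponents {
    static void dfs(int u, List<List<Integer>> adj, boolean[] visited) {
        visited[u] = true;
        for (int v : adj.get(u)) if (!visited[v]) dfs(v, adj, visited);
    }

    static int count(int V, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < V; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) { adj.get(e[0]).add(e[1]); adj.get(e[1]).add(e[0]); }
        boolean[] visited = new boolean[V];
        int components = 0;
        // Each DFS call from an unvisited vertex visits exactly one component.
        for (int u = 0; u < V; u++)
            if (!visited[u]) { dfs(u, adj, visited); components++; }
        return components;
    }
}
```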
A cell in a 2D matrix can be connected to up to 8 neighbours. So, unlike standard DFS(), where we recursively call for all
adjacent vertices, here we recursively call for the 8 neighbours only. We keep track of the visited 1s so that they are not
visited again.
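A possible sketch of this 8-neighbour DFS is below (grid values are 0/1; the names are illustrative assumptions):

```java
// Hypothetical sketch: count "islands" of 1s under 8-connectivity.
class Islands {
    // The 8 row/column offsets of a cell's neighbours.
    static final int[] DR = {-1, -1, -1, 0, 0, 1, 1, 1};
    static final int[] DC = {-1, 0, 1, -1, 1, -1, 0, 1};

    static void dfs(int[][] grid, boolean[][] visited, int r, int c) {
        visited[r][c] = true;
        for (int d = 0; d < 8; d++) {
            int nr = r + DR[d], nc = c + DC[d];
            if (nr >= 0 && nr < grid.length && nc >= 0 && nc < grid[0].length
                    && grid[nr][nc] == 1 && !visited[nr][nc])
                dfs(grid, visited, nr, nc);
        }
    }

    static int countIslands(int[][] grid) {
        boolean[][] visited = new boolean[grid.length][grid[0].length];
        int count = 0;
        for (int r = 0; r < grid.length; r++)
            for (int c = 0; c < grid[0].length; c++)
                if (grid[r][c] == 1 && !visited[r][c]) {
                    dfs(grid, visited, r, c);   // one DFS per island
                    count++;
                }
        return count;
    }
}
```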
A graph is said to be Eulerian if it has an Eulerian cycle. We have discussed Eulerian circuits for an undirected
graph; here, the same is discussed for a directed graph.
Time complexity of the above implementation is O(V + E) as Kosaraju’s algorithm takes O(V + E) time. After
running Kosaraju’s algorithm we traverse all vertices and compare in degree with out degree which takes
O(V) time.
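As a quick sketch of only the degree half of that check (the strong-connectivity half is Kosaraju's algorithm, discussed above), assuming edges as {u, v} pairs and illustrative names:

```java
// Hypothetical sketch: in a directed Eulerian-circuit graph, every vertex's
// in-degree must equal its out-degree. This is the necessary degree condition
// only; strong connectivity of non-isolated vertices must be checked separately.
class DirectedEulerDegrees {
    static boolean degreesBalanced(int V, int[][] edges) {
        int[] in = new int[V], out = new int[V];
        for (int[] e : edges) { out[e[0]]++; in[e[1]]++; }
        for (int i = 0; i < V; i++)
            if (in[i] != out[i]) return false;
        return true;
    }
}
```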
For an undirected graph we can either use BFS or DFS to detect above two properties.
For example, consider the following graph; the smallest cut has 2 edges.
A simple solution is to use a max-flow based s-t cut algorithm to find the minimum cut: consider every pair of vertices
as source ‘s’ and sink ‘t’, call the minimum s-t cut algorithm for the pair, and return the minimum over all s-t cuts.
The best possible time complexity of this approach is O(V⁵). [How? There are O(V²) possible pairs,
the s-t cut algorithm for one pair takes O(V*E) time, and E = O(V²).]
Below is the simple Karger’s algorithm for this purpose. It can be implemented in O(E) =
O(V²) time.
Karger’s algorithm is a Monte Carlo algorithm, and the cut produced by it may not be minimum. For
example, the following diagram shows that a different order of picking random edges produces a min-cut of
size 3.
Note that the above program is based on the outcome of a random function and may produce different
output on different runs.
In this post, we have discussed the simple Karger’s algorithm and have seen that it doesn’t always
produce a min-cut. A single run produces the min-cut with probability greater than or equal to 1/n².
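One run of the contraction step can be sketched as below. This is an assumption-laden sketch, not the article's code: it contracts edges through a union-find instead of rewriting adjacency lists, edges are {u, v} pairs, and the seed parameter is added only to make a run repeatable.

```java
import java.util.Random;

// Hypothetical sketch of a single run of Karger's contraction algorithm.
class Karger {
    static int[] parent;
    static int find(int x) { return parent[x] == x ? x : (parent[x] = find(parent[x])); }

    // Returns the size of the cut found by one randomized run.
    static int minCutOneRun(int V, int[][] edges, long seed) {
        Random rnd = new Random(seed);
        parent = new int[V];
        for (int i = 0; i < V; i++) parent[i] = i;
        int vertices = V;
        // Contract random edges until only two super-vertices remain.
        while (vertices > 2) {
            int[] e = edges[rnd.nextInt(edges.length)];
            int ru = find(e[0]), rv = find(e[1]);
            if (ru == rv) continue;          // self-loop inside a super-vertex
            parent[ru] = rv;
            vertices--;
        }
        // The cut is the set of edges crossing the two remaining super-vertices.
        int cut = 0;
        for (int[] e : edges)
            if (find(e[0]) != find(e[1])) cut++;
        return cut;
    }
}
```

For a 4-cycle every run happens to return the true min-cut of 2; on richer graphs different seeds can return different (possibly non-minimum) cuts, which is why the algorithm is repeated many times.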
Find if there is a path of more than k length from a source
Given a graph, a source vertex in the graph, and a number k, find if there is a simple path (without any cycle)
of total length more than k, starting from the given source and ending at any other vertex.
One important thing to note is that simply doing BFS or DFS and picking the longest edge at every step would not
work: a shorter edge can produce a longer path, due to higher-weight edges connected through
it.
The idea is to use backtracking. We start from the given source and explore all paths from the current vertex. We keep
track of the current distance from the source. If the distance becomes more than k, we return true. If a path doesn’t
produce more than k distance, we backtrack.
How do we make sure that the path is simple and we don’t loop in a cycle? The idea is to keep track of
current path vertices in an array. Whenever we add a vertex to path, we check if it already exists or not in
current path. If it exists, we ignore the edge.
Time Complexity: O(n!)
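A possible backtracking sketch along these lines is below. The graph is assumed undirected and given as {u, v, w} triples, "more than k" is taken strictly, and all names are illustrative rather than the article's code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: is there a simple path from src of total weight > k?
class LongPath {
    static boolean pathMoreThanK(int V, int[][] edges, int src, int k) {
        List<List<int[]>> adj = new ArrayList<>();
        for (int i = 0; i < V; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(new int[]{e[1], e[2]});
            adj.get(e[1]).add(new int[]{e[0], e[2]});
        }
        boolean[] onPath = new boolean[V];   // vertices on the current path
        onPath[src] = true;
        return explore(adj, src, k, onPath);
    }

    static boolean explore(List<List<int[]>> adj, int u, int k, boolean[] onPath) {
        if (k < 0) return true;              // collected distance exceeds the target
        for (int[] edge : adj.get(u)) {
            int v = edge[0], w = edge[1];
            if (onPath[v]) continue;         // keep the path simple: no repeats
            onPath[v] = true;
            if (explore(adj, v, k - w, onPath)) return true;
            onPath[v] = false;               // backtrack and try another edge
        }
        return false;
    }
}
```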
To find the head of an SCC, we calculate the disc and low arrays (as done for articulation points, bridges, and biconnected
components). As discussed in the previous posts, low[u] indicates the earliest visited vertex (the vertex with
minimum discovery time) that can be reached from the subtree rooted at u. A node u is a head if disc[u] == low[u].
Disc: the time at which a node is first visited during DFS traversal. For nodes A, B, C, .., J in the DFS tree,
the Disc values are 1, 2, 3, .., 10.
Low: the “Low” value of a node tells the topmost reachable ancestor (the one with minimum possible Disc value)
via the subtree of that node.
Case 1 (Tree Edge): If node v is not visited already, then after the DFS of v is complete, low[u] is updated to the minimum of low[u]
and low[v]:
low[u] = min(low[u], low[v]);
Case 2 (Back Edge): When child v is already visited, low[u] is updated to the minimum of low[u] and Disc[v]:
low[u] = min(low[u], disc[v]);
The same Low and Disc values help solve other graph problems like articulation points, bridges, and biconnected
components.
To track the subtree rooted at a head, we can use a stack (keep pushing nodes while visiting). When a head
node is found, pop all nodes from the stack until the head itself is popped.
To make sure we don’t consider cross edges: when we reach a node that is already visited, we should
process it only if it is present in the stack; otherwise we ignore the node.
Time Complexity: The above algorithm mainly calls DFS, DFS takes O(V+E) for a graph represented using
adjacency list.
http://www.geeksforgeeks.org/tarjan-algorithm-find-strongly-connected-components/
Biconnected Components
A biconnected component is a maximal biconnected subgraph.
Biconnected Graph is already discussed here. In this article, we will see how to find biconnected components in
a graph using the algorithm by John Hopcroft and Robert Tarjan.
Algorithm is based on Disc and Low Values discussed in Strongly Connected Components Article.
The idea is to store visited edges in a stack while running DFS on the graph and keep looking for Articulation
Points (highlighted in the above figure). As soon as an Articulation Point u is found, all edges visited during the DFS
from node u onwards form one biconnected component. When DFS completes for one connected
component, all edges remaining in the stack form a biconnected component.
If there is no Articulation Point in graph, then graph is biconnected and so there will be one biconnected
component which is the graph itself.
// If u is an articulation point,
// pop all edges from stack till u -- v
if ( (disc[u] == 1 && children > 1) ||
(disc[u] > 1 && low[v] >= disc[u]) )
{
while (st.getLast().u != u || st.getLast().v != v)
{
System.out.print(st.getLast().u + "--" +
st.getLast().v + " ");
st.removeLast();
}
System.out.println(st.getLast().u + "--" +
st.getLast().v + " ");
st.removeLast();
count++;
}
}