Ada Notes
In the graph above, the set of vertices is V = {0, 1, 2, 3, 4} and the set of edges is
E = {01, 12, 23, 34, 04, 14, 13}.
Graphs are used to solve many real-life problems.
Graphs are used to represent networks.
The networks may include paths in a city or telephone network or circuit network.
Graphs are also used in social networks like LinkedIn and Facebook.
For example, in Facebook, each person is represented by a vertex (or node). Each node is a
structure and contains information like person id, name, gender, locale etc.
Vertex
A vertex (also called a “node”) is a fundamental part of a graph. It can have a name,
which we will call the “key.” A vertex may also have additional information. We will
call this additional information the “payload.”
Edge
An edge (also called an “arc”) is another fundamental part of a graph. An edge
connects two vertices to show that there is a relationship between them. Edges may be
one-way or two-way. If the edges in a graph are all one-way, we say that the graph is
a directed graph, or a digraph. The class prerequisites graph shown above is clearly
a digraph since you must take some classes before others.
Weight
Edges may be weighted to show that there is a cost to go from one vertex to another.
For example in a graph of roads that connect one city to another, the weight on the
edge might represent the distance between the two cities.
With those definitions in hand we can formally define a graph. A graph can be represented
by G where G = (V, E). For the graph G, V is a set of vertices and E is a set of
edges. Each edge is a tuple (v, w) where w, v ∈ V. We can add a third component to
the edge tuple to represent a weight. A subgraph s is a set of edges e and vertices v such
that e ⊂ E and v ⊂ V.
The figure below shows another example of a simple weighted digraph. Formally we can
represent this graph as the set of six vertices:
V = {V0, V1, V2, V3, V4, V5}
and the set of nine edges:
E = {(v0,v1,5), (v1,v2,4), (v2,v3,9), (v3,v4,7), (v4,v0,1), (v0,v5,2), (v5,v4,8), (v3,v5,3),
(v5,v2,1)}
Figure : A Simple Example of a Directed Graph
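The vertex and edge sets above translate almost directly into code. The following is a minimal sketch, not a definitive representation: the names Edge, exampleEdges and outDegree are hypothetical, and each vertex Vi is assumed to be stored as the integer i.

```cpp
#include <cassert>
#include <vector>

// Hypothetical edge record mirroring the tuple (v, w, weight).
struct Edge {
    int from;
    int to;
    int weight;
};

// The nine edges of the example digraph; vertex Vi is stored as the integer i.
std::vector<Edge> exampleEdges() {
    return {
        {0, 1, 5}, {1, 2, 4}, {2, 3, 9}, {3, 4, 7}, {4, 0, 1},
        {0, 5, 2}, {5, 4, 8}, {3, 5, 3}, {5, 2, 1}
    };
}

// Counts the edges that leave vertex v.
int outDegree(const std::vector<Edge>& edges, int v) {
    assert(v >= 0);
    int count = 0;
    for (const Edge& e : edges)
        if (e.from == v) ++count;
    return count;
}
```

For instance, outDegree(exampleEdges(), 5) counts the two edges leaving V5, namely the ones to V4 and V2.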
The example graph in Figure helps illustrate two other key graph terms:
Path
A path in a graph is a sequence of vertices that are connected by edges. Formally we
would define a path as w_1, w_2, ..., w_n such
that (w_i, w_{i+1}) ∈ E for all 1 ≤ i ≤ n−1. The unweighted path length is
the number of edges in the path, specifically n−1. The weighted path length is the
sum of the weights of all the edges in the path. For example in the figure the path
from V3 to V1 is the sequence of vertices (V3, V4, V0, V1). The
edges are {(v3,v4,7), (v4,v0,1), (v0,v1,5)}.
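The two path lengths can be checked mechanically. The sketch below assumes the example edge weights are stored in a map keyed by (from, to), with Vi stored as the integer i; exampleWeights and weightedPathLength are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Weights of the example digraph, keyed by (from, to); Vi is stored as i.
std::map<std::pair<int, int>, int> exampleWeights() {
    return {
        {{0, 1}, 5}, {{1, 2}, 4}, {{2, 3}, 9}, {{3, 4}, 7}, {{4, 0}, 1},
        {{0, 5}, 2}, {{5, 4}, 8}, {{3, 5}, 3}, {{5, 2}, 1}
    };
}

// Returns the weighted length of `path`, or -1 if some consecutive pair
// (w_i, w_{i+1}) is not an edge, i.e. the sequence is not a path.
int weightedPathLength(const std::map<std::pair<int, int>, int>& weights,
                       const std::vector<int>& path) {
    assert(!path.empty());
    int total = 0;
    for (std::size_t i = 0; i + 1 < path.size(); ++i) {
        auto it = weights.find(std::make_pair(path[i], path[i + 1]));
        if (it == weights.end()) return -1;
        total += it->second;
    }
    return total;
}
```

For the path (V3, V4, V0, V1) the unweighted length is 3 edges and the weighted length is 7 + 1 + 5 = 13.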
Cycle
A cycle in a directed graph is a path that starts and ends at the same vertex. For
example, in the figure the path (V5, V2, V3, V5) is a cycle. A graph with
no cycles is called an acyclic graph. A directed graph with no cycles is called
a directed acyclic graph or a DAG. We will see that we can solve several important
problems if the problem can be represented as a DAG.
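Whether a directed graph is a DAG can be tested with a depth-first search that tracks which vertices are on the current search path. This is one common technique, sketched here under the assumption that the graph is given as an adjacency list over vertices 0..n-1; the names visitFindsCycle and hasCycle are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Depth-first visit with three states: 0 = unvisited, 1 = on the current
// DFS path, 2 = fully explored. Reaching a state-1 vertex closes a cycle.
bool visitFindsCycle(const std::vector<std::vector<int>>& adj,
                     std::vector<int>& state, int u) {
    assert(u >= 0 && u < static_cast<int>(adj.size()));
    state[u] = 1;
    for (int v : adj[u]) {
        if (state[v] == 1) return true;
        if (state[v] == 0 && visitFindsCycle(adj, state, v)) return true;
    }
    state[u] = 2;
    return false;
}

// Returns true if the directed graph contains a cycle, false if it is a DAG.
bool hasCycle(const std::vector<std::vector<int>>& adj) {
    std::vector<int> state(adj.size(), 0);
    for (std::size_t u = 0; u < adj.size(); ++u)
        if (state[u] == 0 && visitFindsCycle(adj, state, static_cast<int>(u)))
            return true;
    return false;
}
```

On the example digraph the function reports a cycle, which is consistent with the cycle (V5, V2, V3, V5) named above.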
Types of nodes
Root node: The root node is the ancestor of all other nodes and does not have any
ancestor of its own. A rooted structure such as a tree has exactly one root node; a
general graph need not have one. When a root exists, traversal generally starts from it.
Leaf nodes: In a graph, leaf nodes represent the nodes that do not have any
successors. These nodes only have ancestor nodes. They can have any number of
incoming edges but they will not have any outgoing edges.
Types of graphs
Undirected: An undirected graph is a graph in which all the edges are bi-directional
i.e. the edges do not point in any specific direction.
Directed: A directed graph is a graph in which all the edges are uni-directional i.e. the
edges point in a single direction.
Weighted: In a weighted graph, each edge is assigned a weight or cost. Consider a
graph of 4 nodes as in the diagram below. As you can see each edge has a weight/cost
assigned to it. If you want to go from vertex 1 to vertex 3, you can take one of the
following 3 paths:
o 1 -> 2 -> 3
o 1 -> 3
o 1 -> 4 -> 3
Therefore the total cost of each path will be as follows:
o The total cost of 1 -> 2 -> 3 will be (1 + 2) i.e. 3 units
o The total cost of 1 -> 3 will be 1 unit
o The total cost of 1
Cyclic: A graph is cyclic if the graph comprises a path that starts from a vertex and
ends at the same vertex. That path is called a cycle. An acyclic graph is a graph that
has no cycle.
A tree is an undirected graph in which any two vertices are connected by only one
path. A tree is an acyclic graph and has N - 1 edges where N is the number of vertices.
Each node in a graph may have one or multiple parent nodes. However, in a tree, each
node (except the root node) has exactly one parent node.
A tree cannot contain any cycles or self loops, however, the same does not apply to
graphs.
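The two tree properties above (N - 1 edges and a single path between any two vertices) suggest a simple test: an undirected graph with N vertices is a tree exactly when it has N - 1 edges and is connected. The following is a minimal sketch assuming vertices numbered 0..n-1 and an edge list as input; isTree is a hypothetical name.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Returns true if the undirected graph with vertices 0..n-1 and the given
// edge list is a tree: exactly n - 1 edges and every vertex reachable from 0.
bool isTree(int n, const std::vector<std::pair<int, int>>& edges) {
    assert(n > 0);
    if (static_cast<int>(edges.size()) != n - 1) return false;
    std::vector<std::vector<int>> adj(n);
    for (const auto& e : edges) {
        adj[e.first].push_back(e.second);
        adj[e.second].push_back(e.first);
    }
    std::vector<bool> seen(n, false);
    std::vector<int> pending = {0};
    seen[0] = true;
    int reached = 1;
    while (!pending.empty()) {
        int u = pending.back();
        pending.pop_back();
        for (int v : adj[u]) {
            if (!seen[v]) {
                seen[v] = true;
                ++reached;
                pending.push_back(v);
            }
        }
    }
    return reached == n;
}
```

A graph with the right number of edges but a cycle necessarily leaves some vertex unreachable, so the connectivity check also rules out cycles and self loops.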
Graph representation
You can represent a graph in many ways. The two most common ways of representing a
graph are as follows:
Adjacency matrix
An adjacency matrix is a V x V binary matrix A. Element A[i][j] is 1 if there is an edge from
vertex i to vertex j; otherwise A[i][j] is 0.
Note: A binary matrix is a matrix in which the cells can have only one of two possible values,
either 0 or 1.
The adjacency matrix can also be modified for a weighted graph: instead of storing
0 or 1 in A[i][j], the weight or cost of the edge is stored.
In an undirected graph, if A[i][j] = 1, then A[j][i] = 1. In a directed graph, if A[i][j] = 1, then
A[j][i] may or may not be 1.
An adjacency matrix provides constant-time access (O(1)) to determine whether there is an edge
between two nodes. The space complexity of the adjacency matrix is O(V^2).
i/j: 1 2 3 4
1:   0 1 0 0
2:   0 0 0 1
3:   1 0 0 1
4:   0 1 0 0
Consider the directed graph given above. Let's create this graph using an adjacency matrix
and then show all the edges that exist in the graph.
Input file
4 // nodes
5 //edges
1 2 //showing edge from node 1 to node 2
2 4 //showing edge from node 2 to node 4
3 1 //showing edge from node 3 to node 1
3 4 //showing edge from node 3 to node 4
4 2 //showing edge from node 4 to node 2
Code
#include <iostream>
using namespace std;

bool A[10][10]; // A[x][y] is true if there is an edge from x to y (nodes are 1-based)

void initialize()
{
    for(int i = 0;i < 10;++i)
        for(int j = 0;j < 10;++j)
            A[i][j] = false;
}

int main()
{
    int x, y, nodes, edges;
    initialize(); //Since there is no edge initially
    cin >> nodes; //Number of nodes
    cin >> edges; //Number of edges
    for(int i = 0;i < edges;++i)
    {
        cin >> x >> y;
        A[x][y] = true; //Mark the edge from vertex x to vertex y
    }
    if(A[3][4] == true)
        cout << "There is an edge between 3 and 4" << endl;
    else
        cout << "There is no edge between 3 and 4" << endl;
    if(A[2][3] == true)
        cout << "There is an edge between 2 and 3" << endl;
    else
        cout << "There is no edge between 2 and 3" << endl;
    return 0;
}
Output
There is an edge between 3 and 4
There is no edge between 2 and 3
Adjacency list
The other way to represent a graph is by using an adjacency list. An adjacency list is an array
A of separate lists. Each element A[i] of the array is a list that contains all the vertices that
are adjacent to vertex i.
For a weighted graph, the weight or cost of the edge is stored along with the vertex in the list
using pairs. In an undirected graph, if vertex j is in list A[i] then vertex i will be in list A[j].
The space complexity of an adjacency list is O(V + E) because information is stored only for
those edges that actually exist in the graph. In many cases the graph is sparse, and an
adjacency matrix is then wasteful: it takes up a lot of space even though most of its
elements are 0. In such cases, using an adjacency list is better.
Note: A sparse matrix is a matrix in which most of the elements are zero, whereas a dense
matrix is a matrix in which most of the elements are non-zero.
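The idea of storing weights "using pairs" can be sketched as follows. This is only an illustration under stated assumptions: vertices are numbered 0..n-1, weights are non-negative so that -1 can signal a missing edge, and the names WeightedAdjList, addUndirectedEdge and edgeWeight are hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// adj[u] holds (neighbour, weight) pairs; vertices are numbered 0..n-1.
using WeightedAdjList = std::vector<std::vector<std::pair<int, int>>>;

// Adds an undirected weighted edge by storing it in both lists.
void addUndirectedEdge(WeightedAdjList& adj, int u, int v, int weight) {
    assert(u >= 0 && u < static_cast<int>(adj.size()));
    assert(v >= 0 && v < static_cast<int>(adj.size()));
    adj[u].push_back({v, weight});
    adj[v].push_back({u, weight});
}

// Returns the weight of edge (u, v), or -1 if the edge is absent.
// Unlike the O(1) matrix lookup, this scans the whole list of u.
int edgeWeight(const WeightedAdjList& adj, int u, int v) {
    for (const auto& entry : adj[u])
        if (entry.first == v) return entry.second;
    return -1;
}
```

The trade-off is visible in edgeWeight: the list saves space on sparse graphs, but an edge query costs time proportional to the number of neighbours of u.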
Consider the same undirected graph from an adjacency matrix. The adjacency list of the
graph is as follows:
A1 → 2 → 4
A2 → 1 → 3
A3 → 2 → 4
A4 → 1 → 3
Consider the same directed graph from an adjacency matrix. The adjacency list of the graph
is as follows:
A1 → 2
A2 → 4
A3 → 1 → 4
A4 → 2
Consider the directed graph given above. The code for this graph is as follows:
Input file
4 // nodes
5 //edges
1 2 //showing edge from node 1 to node 2
2 4 //showing edge from node 2 to node 4
3 1 //showing edge from node 3 to node 1
3 4 //showing edge from node 3 to node 4
4 2 //showing edge from node 4 to node 2
Code
#include <iostream>
#include <vector>
using namespace std;

vector<int> adj[10]; // adj[i] holds the neighbours of node i (nodes are 1-based)

int main()
{
    int x, y, nodes, edges;
    cin >> nodes; //Number of nodes
    cin >> edges; //Number of edges
    for(int i = 0;i < edges;++i)
    {
        cin >> x >> y;
        adj[x].push_back(y); //Insert y in adjacency list of x
    }
    for(int i = 1;i <= nodes;++i)
    {
        cout << "Adjacency list of node " << i << ": ";
        for(size_t j = 0;j < adj[i].size();++j)
        {
            if(j + 1 == adj[i].size())
                cout << adj[i][j] << endl;
            else
                cout << adj[i][j] << " --> ";
        }
    }
    return 0;
}
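Output
Adjacency list of node 1: 2
Adjacency list of node 2: 4
Adjacency list of node 3: 1 --> 4
Adjacency list of node 4: 2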
The DFS algorithm is a recursive algorithm that uses the idea of backtracking. It involves
exhaustive searches of all the nodes by going ahead, if possible, else by backtracking.
Here, the word backtrack means that when you are moving forward and there are no more
nodes along the current path, you move backwards on the same path to find nodes to traverse.
Every node along the current path is visited until no unvisited node remains on it, after
which the next path is selected.
This recursive nature of DFS can be implemented using stacks. The basic idea is as follows:
1. Pick a starting node and push all its adjacent nodes into a stack.
2. Pop a node from the stack to select the next node to visit and push all its adjacent nodes
into the stack.
3. Repeat this process until the stack is empty.
However, ensure that the nodes that are visited are marked. This will prevent you from
visiting the same node more than once. If you do not mark the nodes that are visited and you
visit the same node more than once, you may end up in an infinite loop.
Pseudocode
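The stack-based idea described above can be sketched in C++ as follows. This is one possible rendering, not a definitive one: it assumes an adjacency list over vertices 0..n-1, pushes the start node itself rather than its neighbours, and uses the hypothetical name dfsOrder.

```cpp
#include <cassert>
#include <stack>
#include <vector>

// Iterative DFS over an adjacency list; returns vertices in visiting order.
std::vector<int> dfsOrder(const std::vector<std::vector<int>>& adj, int start) {
    assert(start >= 0 && start < static_cast<int>(adj.size()));
    std::vector<bool> visited(adj.size(), false);
    std::vector<int> order;
    std::stack<int> pending;
    pending.push(start);
    while (!pending.empty()) {
        int u = pending.top();
        pending.pop();
        if (visited[u]) continue;  // a node may have been pushed more than once
        visited[u] = true;         // marking prevents revisits and infinite loops
        order.push_back(u);
        // Push neighbours in reverse so the first-listed one is visited first.
        for (auto it = adj[u].rbegin(); it != adj[u].rend(); ++it)
            if (!visited[*it]) pending.push(*it);
    }
    return order;
}
```

Because nodes are marked when popped, a graph containing cycles still terminates: each node enters the visiting order at most once.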
Applications
A graph is said to be disconnected if it is not connected, i.e. if two nodes exist in the graph
such that there is no path between those nodes. In an undirected graph, a connected
component is a maximal set of vertices that are linked to each other by paths.
Consider the example given in the diagram. Graph G is a disconnected graph and has the
following 3 connected components.
First connected component is 1 -> 2 -> 3 as they are linked to each other
Second connected component 4 -> 5
Third connected component is vertex 6
In DFS, if we start from a start node it will mark all the nodes connected to the start node as
visited. Therefore, if we choose any node in a connected component and run DFS on that
node it will mark the whole connected component as visited.
Input File
6
4
1 2
2 3
1 3
4 5
Code
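#include <iostream>
#include <vector>
using namespace std;

vector<int> adj[10]; // adjacency lists (nodes are 1-based, at most 9 nodes)
bool visited[10];

void dfs(int s) {
    visited[s] = true;
    for(size_t i = 0;i < adj[s].size();++i) {
        if(visited[adj[s][i]] == false)
            dfs(adj[s][i]);
    }
}

void initialize() {
    for(int i = 0;i < 10;++i)
        visited[i] = false;
}

int main() {
    int nodes, edges, x, y, connectedComponents = 0;
    cin >> nodes; //Number of nodes
    cin >> edges; //Number of edges
    for(int i = 0;i < edges;++i) {
        cin >> x >> y;
        //Undirected Graph
        adj[x].push_back(y); //Edge from vertex x to vertex y
        adj[y].push_back(x); //Edge from vertex y to vertex x
    }
    initialize(); //Mark every node as unvisited
    //Each DFS started from an unvisited node covers one whole component
    for(int i = 1;i <= nodes;++i) {
        if(visited[i] == false) {
            dfs(i);
            connectedComponents++;
        }
    }
    cout << "Number of connected components: " << connectedComponents << endl;
    return 0;
}
Output
Number of connected components: 3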
Given an undirected and connected graph G = (V, E), a spanning tree of the graph G is a tree
that spans G (that is, it includes every vertex of G) and is a subgraph of G (every edge in the
tree belongs to G).
The cost of the spanning tree is the sum of the weights of all the edges in the tree. There can
be many spanning trees. A minimum spanning tree is a spanning tree whose cost is
minimum among all the spanning trees. There can also be many minimum spanning trees.
Minimum spanning tree has direct application in the design of networks. It is used in
algorithms approximating the travelling salesman problem, multi-terminal minimum cut
problem and minimum-cost weighted perfect matching. Other practical applications are:
1. Cluster Analysis
2. Handwriting recognition
3. Image segmentation
There are two famous algorithms for finding the Minimum Spanning Tree:
Prim’s Algorithm
Prim's Algorithm also uses a greedy approach to find the minimum spanning tree. In Prim's
Algorithm we grow the spanning tree from a starting position. Unlike Kruskal's algorithm,
which adds an edge at each step, Prim's adds a vertex to the growing spanning tree.
Algorithm Steps:
1. Maintain two disjoint sets of vertices: one containing the vertices that are in the growing
spanning tree and the other containing the vertices that are not.
2. Select the cheapest vertex that is connected to the growing spanning tree but is not yet in
it (that is, the vertex reachable over the cheapest edge), and add it to the growing spanning
tree. This can be done using a priority queue: insert the vertices that are connected to the
growing spanning tree into the priority queue.
3. Check for cycles. To do that, mark the nodes that have already been selected and
insert only unmarked nodes into the priority queue.
Consider the example below:
In Prim’s Algorithm, we will start with an arbitrary node (it doesn’t matter which one)
and mark it. In each iteration we will mark a new vertex that is adjacent to the one
that we have already marked. As a greedy algorithm, Prim’s algorithm will select the
cheapest edge and mark the vertex. So we will simply choose the edge with weight 1.
In the next iteration we have three options, edges with weight 2, 3 and 4. So, we will
select the edge with weight 2 and mark the vertex. Now again we have three options,
edges with weight 3, 4 and 5. But we can’t choose edge with weight 3 as it is creating
a cycle. So we will select the edge with weight 4 and we end up with the minimum
spanning tree of total cost 7 (= 1 + 2 + 4).
Implementation:
#include <iostream>
#include <vector>
#include <queue>
#include <functional>
#include <utility>
using namespace std;
const int MAX = 1e4 + 5;
typedef pair<long long, int> PII;
bool marked[MAX];
vector <PII> adj[MAX];
long long prim(int x)
{
priority_queue<PII, vector<PII>, greater<PII> > Q;
int y;
long long minimumCost = 0;
PII p;
Q.push(make_pair(0, x));
while(!Q.empty())
{
// Select the edge with minimum weight
p = Q.top();
Q.pop();
x = p.second;
// Checking for cycle
if(marked[x] == true)
continue;
minimumCost += p.first;
marked[x] = true;
for(int i = 0;i < adj[x].size();++i)
{
y = adj[x][i].second;
if(marked[y] == false)
Q.push(adj[x][i]);
}
}
return minimumCost;
}
int main()
{
int nodes, edges, x, y;
long long weight, minimumCost;
cin >> nodes >> edges;
for(int i = 0;i < edges;++i)
{
cin >> x >> y >> weight;
adj[x].push_back(make_pair(weight, y));
adj[y].push_back(make_pair(weight, x));
}
// Selecting 1 as the starting node
minimumCost = prim(1);
cout << minimumCost << endl;
return 0;
}
Time Complexity:
The time complexity of Prim's Algorithm is O((V + E) log V) because each edge is
inserted into the priority queue at most once, and every insertion or removal takes
logarithmic time.
Kruskal's Algorithm
Kruskal's Algorithm builds the spanning tree by adding edges one by one into a growing
spanning tree. Kruskal's algorithm follows a greedy approach, as in each iteration it finds
the edge which has the least weight and adds it to the growing spanning tree.
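Algorithm Steps:
1. Sort the graph edges with respect to their weights.
2. Add edges to the growing spanning tree in order, from the smallest weight to the largest.
3. Only add an edge if it does not form a cycle, that is, if it connects two components that
are still disconnected.
To check whether an edge would form a cycle, we need to know whether its two endpoints are
already connected. This could be done using DFS, which starts from the first vertex and then
checks whether the second vertex is visited or not. But DFS will make the time complexity
large, as each check has an order of O(V+E) where V is the number of vertices and E is the
number of edges. So the best solution is "Disjoint Sets":
Disjoint sets are sets whose intersection is the empty set, which means that they don't have
any element in common.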
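A disjoint-set structure answers the question "are these two vertices already connected?" far more cheaply than a fresh DFS per edge. The sketch below is a minimal version under stated assumptions: elements are numbered 0..n-1, DisjointSets and unite are hypothetical names, and the path-halving loop follows the same idea as the root function in the implementation further down.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical disjoint-set structure over elements 0..n-1.
struct DisjointSets {
    std::vector<int> parent;

    explicit DisjointSets(int n) : parent(n) {
        assert(n >= 0);
        std::iota(parent.begin(), parent.end(), 0);  // each element is its own set
    }

    // Finds the representative of x, halving the path along the way.
    int root(int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    }

    // Merges the sets of x and y; returns false if they were already together.
    bool unite(int x, int y) {
        int rx = root(x), ry = root(y);
        if (rx == ry) return false;
        parent[rx] = ry;
        return true;
    }
};
```

In Kruskal's algorithm an edge (x, y) would be accepted exactly when unite(x, y) returns true, since a false result means the edge would close a cycle.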
In Kruskal’s algorithm, at each iteration we will select the edge with the lowest weight. So,
we will start with the lowest weighted edge first i.e., the edges with weight 1. After that we
will select the second lowest weighted edge i.e., edge with weight 2. Notice these two edges
are totally disjoint. Now, the next edge will be the third lowest weighted edge i.e., edge with
weight 3, which connects the two disjoint pieces of the graph. Now, we are not allowed to
pick the edge with weight 4, that will create a cycle and we can’t have any cycles. So we will
select the fifth lowest weighted edge i.e., edge with weight 5. Now the other two edges will
create cycles so we will ignore them. In the end, we end up with a minimum spanning tree
with total cost 11 ( = 1 + 2 + 3 + 5).
Implementation:
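#include <iostream>
#include <vector>
#include <utility>
#include <algorithm>
using namespace std;

const int MAX = 1e4 + 5; // assumed bound on vertices and edges, as in the Prim's example
int id[MAX], nodes, edges;
pair<long long, pair<int, int> > p[MAX]; // (weight, (x, y)) for each edge

void initialize()
{
    for(int i = 0;i < MAX;++i)
        id[i] = i; // every vertex starts in its own set
}

int root(int x)
{
    while(id[x] != x)
    {
        id[x] = id[id[x]]; // path halving keeps the trees shallow
        x = id[x];
    }
    return x;
}

// Merge the sets containing x and y
void union1(int x, int y)
{
    id[root(x)] = id[root(y)];
}

// The edges must already be sorted by weight
long long kruskal(pair<long long, pair<int, int> > p[])
{
    long long minimumCost = 0;
    for(int i = 0;i < edges;++i)
    {
        int x = p[i].second.first;
        int y = p[i].second.second;
        // Take the edge only if it joins two different sets (no cycle)
        if(root(x) != root(y))
        {
            minimumCost += p[i].first;
            union1(x, y);
        }
    }
    return minimumCost;
}

int main()
{
    int x, y;
    long long weight, minimumCost;
    initialize();
    cin >> nodes >> edges;
    for(int i = 0;i < edges;++i)
    {
        cin >> x >> y >> weight;
        p[i] = make_pair(weight, make_pair(x, y));
    }
    // Sort the edges in the ascending order
    sort(p, p + edges);
    minimumCost = kruskal(p);
    cout << minimumCost << endl;
    return 0;
}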
Time Complexity:
In Kruskal's algorithm, the most time-consuming operation is sorting, which takes
O(E log E) = O(E log V). The total complexity of the disjoint-set operations is at most
O(E log V) as well, so O(E log V) is the overall time complexity of the algorithm.
The shortest path problem is about finding a path between two vertices in a graph such that
the total sum of the edge weights is minimum.
This problem could be solved easily using BFS if all edge weights were 1, but here
weights can take any value. Three different algorithms are discussed below depending on the
use-case.
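For reference, the unit-weight case mentioned above can be sketched with a plain BFS. This is an illustration only, assuming an adjacency list over vertices 0..n-1 and using -1 to mark unreachable vertices; bfsDistances is a hypothetical name.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Returns the number of edges on a shortest path from `source` to every
// vertex, or -1 for vertices that cannot be reached.
std::vector<int> bfsDistances(const std::vector<std::vector<int>>& adj, int source) {
    assert(source >= 0 && source < static_cast<int>(adj.size()));
    std::vector<int> dist(adj.size(), -1);
    std::queue<int> frontier;
    dist[source] = 0;
    frontier.push(source);
    while (!frontier.empty()) {
        int u = frontier.front();
        frontier.pop();
        for (int v : adj[u]) {
            if (dist[v] == -1) {        // first visit is along a shortest path
                dist[v] = dist[u] + 1;
                frontier.push(v);
            }
        }
    }
    return dist;
}
```

BFS works here because vertices leave the queue in order of increasing distance; once edge weights differ, that ordering no longer holds and a priority queue is needed instead.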
Dijkstra's Algorithm
Dijkstra's algorithm has many variants but the most common one is to find the shortest paths
from the source vertex to all other vertices in the graph.
Algorithm Steps:
1. Set all vertices distances = infinity except for the source vertex, set the source
distance = 0.
2. Push the source vertex in a min-priority queue in the form (distance, vertex), as the
comparison in the min-priority queue will be according to vertices distances.
3. Pop the vertex with the minimum distance from the priority queue (at first the popped
vertex = source).
4. Update the distances of the connected vertices to the popped vertex in case of "current
vertex distance + edge weight < next vertex distance", then push the vertex
with the new distance to the priority queue.
5. If the popped vertex is visited before, just continue without using it.
6. Apply the same algorithm again until the priority queue is empty.
Implementation:
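#include <climits>
#include <set>
#include <vector>
using namespace std;
const int SIZE = 100000 + 1; // assumed upper bound on the number of vertices
vector < pair < int , int > > v [SIZE]; // v[x] holds (connected vertex, edge weight) pairs (assumed layout)
int dist [SIZE];
bool vis [SIZE];
void dijkstra(){
    for(int i = 0; i < SIZE; i++){ dist[i] = INT_MAX; vis[i] = false; } // distances = infinity, all unvisited
    dist[1] = 0; // vertex 1 is taken as the source
    multiset < pair < int , int > > s; // multiset does the job of a min-priority queue
    s.insert(make_pair(0, 1)); // entries have the form (distance, vertex)
    while(!s.empty()){
        pair <int , int> p = *s.begin(); // pop the vertex with the minimum distance
        s.erase(s.begin());
        int x = p.second;
        if(vis[x]) continue; // popped before, so skip it
        vis[x] = true;
        for(const pair <int , int>& edge : v[x]) // try to improve every vertex connected to x
            if(dist[x] + edge.second < dist[edge.first]){
                dist[edge.first] = dist[x] + edge.second;
                s.insert(make_pair(dist[edge.first], edge.first));
            }
    }
}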
The time complexity of Dijkstra's Algorithm is O(V^2), but with a min-priority queue it drops
down to O(V + E log V).
However, if we have to find the shortest path between all pairs of vertices, both of the above
methods would be expensive in terms of time. Discussed below is another algorithm
designed for this case.