Graphs
7.1 Introduction
A graph is a data structure that is usually specified as an abstract data type (ADT). A graph
consists of a set of nodes called vertices and a set of edges that establish relationships or
connections between the nodes. Any vertex may be connected to any other vertex.
Graph data structures are useful to represent non-hierarchical data sets where the individual
elements are interconnected in complex ways. Graphs have many applications in the areas of
Geography, Chemistry, Engineering Sciences and in Computer Science. For example graphs can
be used to represent the flights connecting different cities in a particular country, the nodes of a
wide area computer network etc. Graphs are extensively used in solving games and puzzles.
Historically, graph theory originated in the Königsberg bridge problem. The city of Königsberg in
Prussia was built along the Pregel River and occupied both banks and two islands in the river.
The problem was to find a route that would enable a person to cross all seven bridges a, b, … g
in the city exactly once and return to the starting point, as shown in figure 7.1. Leonhard Euler, a
mathematician, developed some concepts by treating each land mass as a vertex and each bridge as
an edge, and these concepts formed the basis of graph theory.
The graph in figure 7.2 consists of vertices V(G) = {1, 4, 3, 6, 2} and edges E(G) = {(1,4), (4,3),
(3,6), (6,2), (1,2), (1,6), (1,3)}. This is an undirected graph, where the order in which the two
vertices of an edge are written is not significant. For example, the edge (6,2) can also be written
as (2,6).
Figure 7.3 shows a directed graph, or digraph, in which each edge is an ordered pair of vertices.
That is, the edge (1, 4) must be written in that order, where 1 is the initial vertex and 4 is the
ending vertex. The direction is indicated by an arrow. Therefore, in a directed graph (v, w) and
(w, v) represent two different edges.
An undirected graph is complete if it has as many edges as possible – in other words, if every
vertex is joined to every other vertex. The graph in the figure 7.2 is not complete. For a complete
graph with n vertices, the number of edges is n(n – 1)/2. A graph with 6 vertices needs 15 edges
to be complete.
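The edge count above can be checked with a few lines of C. This is only a minimal sketch; the function name complete_graph_edges is an assumption made for illustration.

```c
/* Number of edges in a complete undirected graph with n vertices.
   (Hypothetical helper name, not part of the lesson.) */
unsigned long complete_graph_edges(unsigned long n)
{
    /* Each of the n vertices is joined to the other n - 1 vertices,
       and every edge is counted twice that way, hence the division by 2. */
    return n * (n - 1) / 2;
}
```

Calling complete_graph_edges(6) gives 15, which agrees with the figure quoted in the text.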
Adjacent Vertices
Vertex v1 is said to be adjacent to vertex v2 if there is an edge (v1, v2) or (v2, v1). In figure
7.2, the vertices adjacent to node 3 are 4, 1 and 6, while the vertices adjacent to node 4 are 1
and 3. Adjacent vertices are also called neighbours.
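As an illustration, adjacency can be tested directly on the edge set E(G) listed earlier for figure 7.2. The sketch below assumes the edges are stored as an array of pairs; the names struct edge, fig72_edges, is_adjacent and neighbours are hypothetical.

```c
#include <stddef.h>

struct edge { int u, v; };   /* one undirected edge (u, v) */

/* Edge set E(G) of the undirected graph in figure 7.2, as listed in the text. */
static const struct edge fig72_edges[] = {
    {1, 4}, {4, 3}, {3, 6}, {6, 2}, {1, 2}, {1, 6}, {1, 3}
};
static const size_t fig72_edge_count = sizeof fig72_edges / sizeof fig72_edges[0];

/* Returns 1 if v and w are adjacent, i.e. (v, w) or (w, v) is an edge. */
int is_adjacent(const struct edge *edges, size_t count, int v, int w)
{
    for (size_t i = 0; i < count; i++) {
        if ((edges[i].u == v && edges[i].v == w) ||
            (edges[i].u == w && edges[i].v == v))
            return 1;
    }
    return 0;
}

/* Stores the neighbours of v in out (at most max_out of them), in the order
   in which their edges appear, and returns how many were stored. */
size_t neighbours(const struct edge *edges, size_t count, int v,
                  int *out, size_t max_out)
{
    size_t found = 0;
    for (size_t i = 0; i < count && found < max_out; i++) {
        if (edges[i].u == v)
            out[found++] = edges[i].v;
        else if (edges[i].v == v)
            out[found++] = edges[i].u;
    }
    return found;
}
```

For vertex 3 the function neighbours reports the three vertices 4, 6 and 1, and for vertex 4 it reports 1 and 3, in agreement with the text.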
Cycle
A cycle is a path in which first and last vertices are the same and there are no repeated edges. In
figure 7.2 a cyclic path is (1, 4, 3, 1).
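The definition of a cycle can be turned into a small check. The following sketch assumes the graph of figure 7.2 is held in a 0/1 matrix indexed by vertex label; NV, load_figure_7_2 and is_cycle are hypothetical names.

```c
#include <stddef.h>

#define NV 7   /* vertex labels 1..6 are used as indices; index 0 is unused */

/* Fills adj with the undirected edges listed in the text for figure 7.2. */
void load_figure_7_2(int adj[NV][NV])
{
    static const int edges[][2] = {
        {1, 4}, {4, 3}, {3, 6}, {6, 2}, {1, 2}, {1, 6}, {1, 3}
    };
    for (int i = 0; i < NV; i++)
        for (int j = 0; j < NV; j++)
            adj[i][j] = 0;
    for (size_t k = 0; k < sizeof edges / sizeof edges[0]; k++) {
        adj[edges[k][0]][edges[k][1]] = 1;
        adj[edges[k][1]][edges[k][0]] = 1;
    }
}

/* Returns 1 if path[0..len-1] is a cycle: the first and last vertices are
   the same, every consecutive pair is joined by an edge, and no edge is
   used twice. Returns 0 otherwise. */
int is_cycle(int adj[NV][NV], const int *path, size_t len)
{
    int used[NV][NV] = {{0}};

    if (len < 2 || path[0] != path[len - 1])
        return 0;
    for (size_t i = 0; i + 1 < len; i++) {
        int a = path[i], b = path[i + 1];
        if (a < 0 || a >= NV || b < 0 || b >= NV)
            return 0;               /* label outside the matrix */
        if (!adj[a][b])
            return 0;               /* no such edge */
        if (used[a][b])
            return 0;               /* edge repeated */
        used[a][b] = used[b][a] = 1;
    }
    return 1;
}
```

With this check the path (1, 4, 3, 1) is accepted as a cycle, while (1, 4, 1) is rejected because it uses the edge (1, 4) twice.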
Connected Graphs
A graph is called connected if there exists a path from any vertex to any other vertex. Figure 7.4
shows an unconnected graph, in which there is no connection between the left-hand and right-hand
graph components.
Figure 7.5: Weakly connected graph
Figure 7.6: Strongly connected graph
Activity 7.1
The following graph is not connected. Can you see why? What single edge could you change to
make the graph connected?
[Figure: a graph with the vertices A, B, C and D]
Degree
The number of edges incident on a vertex determines the degree of the vertex. The degree of
vertex u is written as degree(u). If degree(u) = 0, the vertex does not belong to any edge, and u
is known as an isolated vertex.
In a directed graph we usually state the indegree and the outdegree of each vertex. The indegree
is the number of edges directed towards the vertex and the outdegree is the number of edges
directed out of the vertex. In figure 7.5 the indegree of vertex 8 is 2 while its outdegree is 0.
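Degrees are easy to compute from a list of edges. The sketch below is one possible way to do it; struct arc with the fields from and to, and the three function names, are assumptions. The function degree is meant for an undirected graph whose edges are each listed once.

```c
#include <stddef.h>

struct arc { int from, to; };   /* a directed edge from -> to */

/* Number of edges directed towards v. */
size_t indegree(const struct arc *arcs, size_t count, int v)
{
    size_t d = 0;
    for (size_t i = 0; i < count; i++)
        if (arcs[i].to == v)
            d++;
    return d;
}

/* Number of edges directed out of v. */
size_t outdegree(const struct arc *arcs, size_t count, int v)
{
    size_t d = 0;
    for (size_t i = 0; i < count; i++)
        if (arcs[i].from == v)
            d++;
    return d;
}

/* Degree of v when the list holds each undirected edge once:
   every edge that touches v at either end is counted. */
size_t degree(const struct arc *arcs, size_t count, int v)
{
    return indegree(arcs, count, v) + outdegree(arcs, count, v);
}
```

A vertex for which degree returns 0 appears in no edge and is therefore isolated.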
Complete Graph
A graph is said to be complete or fully connected if there is an edge between every pair of
distinct vertices. A complete graph with n vertices has n(n-1)/2 edges.
Weighted Graph
A graph is said to be a weighted graph if every edge in the graph is assigned some weight or
value. For example if some cities (nodes) are connected by roads (arcs) then the distance
between two cities can be given as the weight of the arcs.
A tree data structure can be described as a connected, acyclic graph with one element
designated as the root element. It is acyclic because there are no paths in a tree which start and
finish at the same element.
The adjacency matrix A for a graph G = (V,E) with n vertices, is an n × n matrix of bits, such
that
aij = 1, if there is an edge from Vi to Vj, and
aij = 0, if there is no such edge.
A two dimensional square array can be used to form an adjacency matrix where the size of the
array depends on the number of vertices on the graph. Consider the following undirected graphs.
The adjacency matrix of a simple graph is a matrix with rows and columns labeled by graph
vertices, with a 1 or 0 in position Vi, Vj according to whether Vi and Vj are adjacent or not. For a
simple graph with no self-loops, the adjacency matrix must have 0s on the diagonal. For an
undirected graph, the adjacency matrix is symmetric.
If the graph is directed, then aij = 1 if and only if there is an edge that starts at Vi and ends
at Vj, so the matrix need not be symmetric.
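The difference between the undirected and the directed case can be seen by building both matrices from the same list of edges. This is a minimal sketch for a hypothetical four-vertex graph; NVERT, build_matrix and is_symmetric are names introduced only for this example.

```c
#include <stddef.h>

#define NVERT 4   /* hypothetical graph size; vertices are numbered 0..NVERT-1 */

/* Builds the adjacency matrix a from count edges. If directed is 0, every
   edge (i, j) is also entered as (j, i). */
void build_matrix(int a[NVERT][NVERT], int edges[][2], size_t count, int directed)
{
    for (int i = 0; i < NVERT; i++)
        for (int j = 0; j < NVERT; j++)
            a[i][j] = 0;
    for (size_t k = 0; k < count; k++) {
        int i = edges[k][0], j = edges[k][1];
        a[i][j] = 1;
        if (!directed)
            a[j][i] = 1;
    }
}

/* Returns 1 if a[i][j] == a[j][i] for all i and j. */
int is_symmetric(int a[NVERT][NVERT])
{
    for (int i = 0; i < NVERT; i++)
        for (int j = i + 1; j < NVERT; j++)
            if (a[i][j] != a[j][i])
                return 0;
    return 1;
}
```

Built as an undirected graph, the edges (0,1), (1,2), (2,3) give a symmetric matrix with 0s on the diagonal; built as a directed graph, the same edges give a matrix that is no longer symmetric.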
Although representing a graph with an adjacency matrix is simple and straightforward, it has
some limitations. A graph with n nodes needs an n × n matrix. If the graph has few edges, most of
the cells in the matrix are 0 (the matrix is sparse), and in such situations space is wasted.
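#define MAXNODES 50   /* maximum number of nodes in the graph */

struct node {
    int info;         /* placeholder field (an assumption): information associated with each node */
};

struct arc {
    int adj;          /* nonzero if the arc exists, 0 otherwise */
    /* information associated with each arc, e.g. a weight */
};

struct graph {
    struct node nodes[MAXNODES];            /* one entry per node */
    struct arc arcs[MAXNODES][MAXNODES];    /* one entry per ordered pair of nodes */
};

struct graph g;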
Each node of the graph is represented by an integer between 0 and MAXNODES-1, and the array
field nodes contains the information assigned to each node. The array field arcs is a two
dimensional array representing every possible ordered pair of nodes. In the case of a weighted
graph, each arc can also be assigned its weight. In the case where no information is assigned to
nodes and no weights are associated with arcs, a graph can be defined simply by
int adj[MAXNODES][MAXNODES];
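As a sketch of how such declarations might be used, the functions below add, remove and test arcs in a weighted graph. The weight field and the function names joinwt, remv and adjacent are assumptions made for this example, and the node array is left out to keep it short.

```c
#define MAXNODES 50

struct arc {
    int adj;      /* 1 if the arc exists, 0 otherwise */
    int weight;   /* hypothetical field: weight of the arc */
};

struct graph {
    struct arc arcs[MAXNODES][MAXNODES];
};

/* Adds an arc of weight wt from node1 to node2. */
void joinwt(struct graph *g, int node1, int node2, int wt)
{
    g->arcs[node1][node2].adj = 1;
    g->arcs[node1][node2].weight = wt;
}

/* Removes the arc from node1 to node2, if there is one. */
void remv(struct graph *g, int node1, int node2)
{
    g->arcs[node1][node2].adj = 0;
    g->arcs[node1][node2].weight = 0;
}

/* Returns 1 if there is an arc from node1 to node2, 0 otherwise. */
int adjacent(const struct graph *g, int node1, int node2)
{
    return g->arcs[node1][node2].adj != 0;
}
```

A graph declared with static storage (for example static struct graph g;) starts with every adj field equal to 0, that is, with no arcs. For an undirected graph each arc would be entered in both directions.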
Activity 7.2
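The undirected graph in figure 7.8a is represented as an adjacency list in figure 7.8b.
The adjacency list representation needs a list (directory) of all the nodes of the graph, i.e.
1, 2, 3, 4, 5 and 6, together with a linked list for each node that holds the nodes adjacent to
it. Each linked list ends with NULL.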
An undirected graph of order N with E edges requires N entries in the directory and 2 × E linked
list entries, except that each loop reduces the number of linked list entries by one. A directed
graph of order N with E edges requires N entries in the directory and E linked list entries.
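The entry counts above can be observed on a small adjacency list. The following is a minimal sketch for an undirected graph of order 6; ORDER, struct listnode, struct adjlist and the function names are assumptions introduced for illustration.

```c
#include <stdlib.h>

#define ORDER 6   /* vertices are numbered 1..ORDER */

struct listnode {
    int vertex;                 /* an adjacent vertex */
    struct listnode *next;      /* next entry in the same list, or NULL */
};

struct adjlist {
    struct listnode *directory[ORDER + 1];   /* entry 0 is unused */
};

/* Inserts vertex at the front of a list; returns 0 if memory runs out. */
static int push_front(struct listnode **head, int vertex)
{
    struct listnode *p = malloc(sizeof *p);
    if (p == NULL)
        return 0;
    p->vertex = vertex;
    p->next = *head;
    *head = p;
    return 1;
}

/* Adds the undirected edge (u, v): one entry in each list, or a single
   entry when the edge is a loop (u == v). */
int add_edge(struct adjlist *g, int u, int v)
{
    if (!push_front(&g->directory[u], v))
        return 0;
    if (u != v && !push_front(&g->directory[v], u))
        return 0;
    return 1;
}

/* Total number of linked list entries in the graph. */
size_t count_entries(const struct adjlist *g)
{
    size_t total = 0;
    for (int v = 1; v <= ORDER; v++)
        for (const struct listnode *p = g->directory[v]; p != NULL; p = p->next)
            total++;
    return total;
}

/* Releases every list entry and leaves all lists empty. */
void free_graph(struct adjlist *g)
{
    for (int v = 1; v <= ORDER; v++) {
        while (g->directory[v] != NULL) {
            struct listnode *dead = g->directory[v];
            g->directory[v] = dead->next;
            free(dead);
        }
    }
}
```

After the edges (1,2) and (2,3) have been added there are 2 × 2 = 4 list entries; adding the loop (3,3) as a third edge gives 2 × 3 - 1 = 5 entries.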
In a linked representation, nodes can be allocated from and freed to an available pool, in a way
similar to the methods used for representing dynamic trees. In a tree structure each child node is
related to only one other node (its parent), so the children of a node can be represented as a
single list. In a graph, however, an arc may exist between any two graph nodes. To represent this
in linked form it is necessary to keep an adjacency list for every node in the graph, as discussed
in the previous section. Hence each node would contain a variable number of pointers, depending on
the number of nodes to which it is adjacent. Therefore this representation becomes impractical.
An alternative is to construct a multi-linked structure in which the graph nodes are represented
by a linked list of header nodes. Header nodes contain three fields: info, nextnode and arcptr, as
shown in figure 7.9. For a header node p, info(p) contains the information associated with the
graph node, nextnode(p) is a pointer to the next header node in the graph, and arcptr(p) is a
pointer to the list of arcs originating from the graph node. Each header node is at the head of a
list of nodes of a second type, called adjacency list nodes (arcs), as shown in figure 7.10. Each
node q in an adjacency list contains two fields: ndptr(q), which points to the header node that
represents the termination of the arc, and nextarc(q), which points to the next arc node in the
list.
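Figure 7.9: Header node with the fields arcptr, info and nextnode
Figure 7.10: Adjacency list (arc) node with the fields ndptr and nextarc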
Based on these node types, we will next consider how the graph given in figure 7.11 can be stored
in the linked representation shown in figure 7.12.
[Figure 7.11: a graph with the nodes A, B, C, D and E]
[Figure 7.12: linked representation of the graph in figure 7.11]
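Strictly, two different structures are required to represent header nodes and list nodes. If the
graph is a weighted graph, the weights of the arcs also have to be incorporated. The declaration
below uses a single structure, with one information field and two pointers, for both kinds of node.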
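struct nodetype {
    int info;                  /* information stored in the node */
    struct nodetype *point;    /* assumed use: arcptr in a header node, ndptr in an arc node */
    struct nodetype *next;     /* next node in the same list (nextnode or nextarc) */
};
struct nodetype *nodeptr;      /* pointer to a node of the graph structure */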
Various operations can be carried out on graph structures, such as joining two nodes, removing
the arc between two nodes, finding the nodes adjacent to a given node, and finding a header node
that holds given information. We will consider some of these implementations under the self
assessment questions, and the rest are left as exercises for the students.
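A few of these operations might look as follows on the linked representation. This is only a sketch under stated assumptions: header nodes and arc nodes share the structure nodetype, the field point plays the role of arcptr in a header node and of ndptr in an arc node, and the function names addnode, findnode, adjacent, joinarc and remvarc are hypothetical.

```c
#include <stdlib.h>

struct nodetype {
    int info;
    struct nodetype *point;   /* header node: first arc; arc node: target header */
    struct nodetype *next;    /* next header node, or next arc in the list */
};

/* Adds a header node holding x at the front of the graph's header list. */
struct nodetype *addnode(struct nodetype **graph, int x)
{
    struct nodetype *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->info = x;
    p->point = NULL;
    p->next = *graph;
    *graph = p;
    return p;
}

/* Finds the header node holding x, or returns NULL. */
struct nodetype *findnode(struct nodetype *graph, int x)
{
    for (struct nodetype *p = graph; p != NULL; p = p->next)
        if (p->info == x)
            return p;
    return NULL;
}

/* Returns 1 if there is an arc from header node p to header node q. */
int adjacent(const struct nodetype *p, const struct nodetype *q)
{
    for (const struct nodetype *r = p->point; r != NULL; r = r->next)
        if (r->point == q)
            return 1;
    return 0;
}

/* Adds an arc from p to q unless it already exists; returns 0 on failure. */
int joinarc(struct nodetype *p, struct nodetype *q)
{
    struct nodetype *r;
    if (adjacent(p, q))
        return 1;
    r = malloc(sizeof *r);
    if (r == NULL)
        return 0;
    r->info = 0;              /* unused in an arc node of an unweighted graph */
    r->point = q;
    r->next = p->point;
    p->point = r;
    return 1;
}

/* Removes the arc from p to q, if there is one. */
void remvarc(struct nodetype *p, struct nodetype *q)
{
    struct nodetype **link = &p->point;
    while (*link != NULL) {
        if ((*link)->point == q) {
            struct nodetype *dead = *link;
            *link = dead->next;
            free(dead);
            return;
        }
        link = &(*link)->next;
    }
}
```

For a weighted graph the info field of an arc node could hold the weight of the arc.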
7.6 Summary
This lesson introduced another non-linear data structure known as the graph. Basically, a graph
connects a set of nodes with arcs. These arcs can be directed or undirected, and weighted or
unweighted. We have defined various features of graphs and have considered two different
implementation types, namely the adjacency matrix and linked lists. Graphs have many potential
applications, especially for map colouring, representing computer networks etc. The C
implementation of graph operations and some problems will be discussed under the self assessment
questions.
Graph Traversals
Objectives
After studying this lesson you should be able to
• Describe different graph traversal methods
• Identify the most suitable traversal method for a given application
8.1 Introduction
In this lesson we will study different traversal methods, which are systematic ways of visiting
each of the vertices of a graph once. It is always possible to traverse a graph efficiently by
visiting the graph nodes in an implementation dependent manner. For example, if a graph with n
nodes is represented by an adjacency matrix, then simply listing the nodes from 0 to n-1 traverses
the graph. Similarly, if the graph is represented as a linked list, a search tree can be used.
Nevertheless, we are more interested in learning traversal methods that correspond to the
structure of the graph rather than to the underlying implementation structure.
Graph traversal is more complicated than tree or list traversal because there is no first or root
node at which to start the traversal, there is no order among the successors of a particular node,
and there can be more than one predecessor for a single graph node. The first complication is
overcome by supplying a starting node for the traversal. Usually the implementation of the graph
determines the order in which the successors of a node are visited. If a node has more than one
predecessor, it is encountered more than once during a traversal, so keeping a list of visited
nodes ensures that each node is visited only once and that the traversal terminates. Next we will
look at two traversal methods, namely breadth first traversal and depth first traversal.
[Figure 8.1: a graph with the nodes N1 to N8]
If the graph in figure 8.1 is traversed in breadth first order, the vertices are visited as
follows. The starting node is N1, and the nodes adjacent to N1 are N2, N8 and N3. Visit these one
by one. Then pick the first of these nodes, N2, whose unvisited adjacent vertices are N4 and N5,
and visit them. N8 has no unvisited adjacent nodes. Finally go to N3, whose unvisited adjacent
vertices are N6 and N7, and visit them. The order of the breadth first traversal is therefore N1,
N2, N8, N3, N4, N5, N6, N7.
Breadth-first traversal makes use of a queue data structure. The queue holds a list of vertices
which have not been visited yet but which should be visited soon. Since a queue is a first-in first-
out structure, vertices are visited in the order in which they are added to the queue. First, the
starting vertex is enqueued. Then, the following steps are repeated until the queue is empty:
1. Remove the vertex at the head of the queue and call it vertex.
2. Visit vertex.
3. Follow each edge emanating from vertex to find the adjacent vertex (vertex1). If vertex1 has
not already been put into the queue, enqueue it.
Notice that a vertex can be put into the queue at most once. Therefore, the algorithm must
somehow keep track of the vertices that have been enqueued. These steps are shown in figure
8.2.
The starting vertex of graph G1 is vertex a. Initially the starting vertex, a, is inserted into
the empty queue. Next, the head of the queue (vertex a) is dequeued and visited, and the vertices
adjacent to it (vertices b and c) are enqueued. When b is dequeued and visited, we find that it
has only one adjacent vertex, c, and that vertex is already in the queue. Next, vertex c is
dequeued and visited. Vertex c is adjacent to a and d. Since a has already been enqueued (and
subsequently dequeued), only vertex d is put into the queue. Finally, vertex d is dequeued and
visited. Therefore, the breadth-first traversal of graph G1 starting from a visits the vertices in
the sequence a, b, c, d.
Depending on the order in which adjacent vertices are chosen, there can be several possible
traversal orders. If the graph contains cycles, it is necessary to make sure that each vertex is
visited only once.
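The queue-based steps described above can be sketched in C as follows. The sketch assumes the graph is stored as a 0/1 adjacency matrix with vertices numbered 0 to n-1, and it uses a plain array as the queue; MAXV and bfs are hypothetical names.

```c
#define MAXV 8   /* hypothetical upper limit on the number of vertices */

/* Breadth-first traversal of the first n vertices of adj, starting at start.
   The vertices are stored in order[] in the sequence in which they are
   visited; the return value is the number of vertices visited. */
int bfs(int adj[MAXV][MAXV], int n, int start, int order[MAXV])
{
    int queue[MAXV];            /* each vertex is enqueued at most once */
    int head = 0, tail = 0;
    int enqueued[MAXV] = {0};   /* 1 once a vertex has been put into the queue */
    int count = 0;

    queue[tail++] = start;
    enqueued[start] = 1;
    while (head < tail) {
        int v = queue[head++];              /* remove the head of the queue */
        order[count++] = v;                 /* visit it */
        for (int w = 0; w < n; w++) {       /* follow each edge out of v */
            if (adj[v][w] && !enqueued[w]) {
                enqueued[w] = 1;
                queue[tail++] = w;
            }
        }
    }
    return count;
}
```

If the vertices a, b, c and d of G1 are numbered 0 to 3 and the matrix holds the arcs a→b, a→c, b→c, c→a and c→d (an assumed reading of the description above), the function visits the vertices in the order a, b, c, d.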
The essential feature of a depth first traversal is that, after a node is visited, all descendants
of the node are visited before its unvisited siblings.
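The actual traversal is often accomplished using a stack data structure. First, the starting node
is visited and pushed onto the stack. Then the following process repeats:
1. Look at the node on top of the stack.
2. If it has an unvisited adjacent node, visit that node and push it onto the stack.
3. Otherwise, pop the node off the stack.
The process terminates when the stack is empty. The same process can be implemented as a
recursive function, which uses the system stack instead of an explicit stack structure.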
Depth-first traversal of a graph is an O(V + E) operation, where V is the number of vertices in
the graph and E is the number of edges.
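A stack-based depth first traversal can be sketched in C as follows. The sketch assumes a 0/1 adjacency matrix with vertices numbered 0 to n-1 and uses a plain array as the stack; MAXV and dfs are hypothetical names.

```c
#define MAXV 8   /* hypothetical upper limit on the number of vertices */

/* Depth-first traversal of the first n vertices of adj, starting at start.
   The vertices are stored in order[] in the sequence in which they are
   visited; the return value is the number of vertices visited. */
int dfs(int adj[MAXV][MAXV], int n, int start, int order[MAXV])
{
    int stack[MAXV];           /* holds the path from start to the current node */
    int top = 0;
    int visited[MAXV] = {0};
    int count = 0;

    visited[start] = 1;
    order[count++] = start;
    stack[top++] = start;
    while (top > 0) {
        int v = stack[top - 1];         /* look at the top of the stack */
        int next = -1;
        for (int w = 0; w < n; w++) {   /* first unvisited neighbour of v */
            if (adj[v][w] && !visited[w]) {
                next = w;
                break;
            }
        }
        if (next >= 0) {                /* go deeper */
            visited[next] = 1;
            order[count++] = next;
            stack[top++] = next;
        } else {                        /* nothing left below v: back up */
            top--;
        }
    }
    return count;
}
```

Because every step scans a whole row of the matrix, this sketch takes time proportional to V × V; the O(V + E) bound is reached when the neighbours of a vertex are read from adjacency lists. For a hypothetical undirected graph in which A is joined to B, C and D, and B is joined to E and F, the traversal from A visits A, B, E, F, C, D.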
//
class Node
{
    char data;
    boolean visited = false;   // set during a traversal (used by bfs() and dfs())
    public Node(char c)
    {
        this.data = c;
    }
}
//
Edges can be represented using an Adjacency matrix or a linked list. We will first see the
implementation using an adjacency matrix.
In the given graph, A is connected with B, C and D nodes, so adjacency matrix will have 1s in
the ‘A’ row for the ‘B’, ‘C’ and ‘D’ column and so on.
If we do the breadth first traversal of the above graph and print each visited node, the output
will be “A B C D E F”. BFS visits the nodes level by level, so it starts with level 0, which is
the root node A, then moves to the next level, which contains B, C and D, and then to the last
level, which contains E and F.
Algorithmic Steps
Based upon the above steps, the following Java code shows the implementation of the BFS
algorithm:
//
public void bfs()
{
    // BFS uses a Queue data structure (java.util.Queue, java.util.LinkedList)
    Queue<Node> q = new LinkedList<>();
    q.add(this.rootNode);
    printNode(this.rootNode);
    rootNode.visited = true;
    while (!q.isEmpty())
    {
        Node n = q.remove();
        Node child = null;
        // Visit and enqueue every unvisited child of n
        while ((child = getUnvisitedChildNode(n)) != null)
        {
            child.visited = true;
            printNode(child);
            q.add(child);
        }
    }
    // Clear visited property of nodes
    clearNodes();
}
//
As stated before, in DFS nodes are visited by going down through the depth of the graph from the
starting node. If we do the depth first traversal of the above graph and print each visited node,
the output will be “A B E F C D”. DFS visits the root node and then keeps visiting child nodes
until it reaches an end node (here E and F), and then moves back up to the parent nodes to
continue with their unvisited children.
Algorithmic Steps
Based upon the above steps, the following Java code shows the implementation of the DFS
algorithm:
//
public void dfs()
{
    // DFS uses a Stack data structure (java.util.Stack)
    Stack<Node> s = new Stack<>();
    s.push(this.rootNode);
    rootNode.visited = true;
    printNode(rootNode);
    while (!s.isEmpty())
    {
        Node n = s.peek();
        Node child = getUnvisitedChildNode(n);
        if (child != null)
        {
            // Go deeper: visit the child and make it the new top of the stack
            child.visited = true;
            printNode(child);
            s.push(child);
        }
        else
        {
            // No unvisited child left: back up to the parent
            s.pop();
        }
    }
    // Clear visited property of nodes
    clearNodes();
}
//
[Figure: a weighted graph with the vertices A to H]
Assume that we want to find the shortest path from vertex A to vertex H. There are several
possible paths, such as ABCH, AFEH, AFDEH, ABCEH etc. If we calculate the weight of each path,
then path ABCH is the shortest, with weight 5. Finding all possible paths and then comparing
their weights is not an efficient approach. Therefore, there are many algorithms that find the
shortest path in an efficient manner. Some of these algorithms are Warshall’s algorithm, Floyd’s
algorithm, Dijkstra’s algorithm etc. Studying these algorithms is left to the interested students.
Let the node at which we start be called the initial node, and let the distance of a node Y be the
distance from the initial node to Y. Dijkstra's algorithm assigns some initial distance values and
tries to improve them step by step.
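The idea of assigning initial distances and then improving them can be sketched in C as follows. This is a simple array-based version of Dijkstra's algorithm, not a tuned implementation. It assumes non-negative weights stored in a matrix in which 0 means that there is no arc; MAXV, NO_ARC, INFINITE and dijkstra are hypothetical names.

```c
#include <limits.h>

#define MAXV 8             /* hypothetical upper limit on the number of vertices */
#define NO_ARC 0           /* weight[i][j] == 0 means there is no arc from i to j */
#define INFINITE INT_MAX   /* distance of a node that has not been reached */

/* Computes in dist[v] the length of the shortest path from source to every
   vertex v among the first n vertices. Weights must not be negative. */
void dijkstra(int weight[MAXV][MAXV], int n, int source, int dist[MAXV])
{
    int done[MAXV] = {0};   /* 1 once the distance of a vertex is final */

    /* Initial values: 0 for the initial node, infinity for all others. */
    for (int i = 0; i < n; i++)
        dist[i] = INFINITE;
    dist[source] = 0;

    for (int step = 0; step < n; step++) {
        /* Pick the unfinished vertex with the smallest known distance. */
        int u = -1;
        for (int i = 0; i < n; i++)
            if (!done[i] && dist[i] != INFINITE && (u < 0 || dist[i] < dist[u]))
                u = i;
        if (u < 0)
            break;          /* the remaining vertices cannot be reached */
        done[u] = 1;

        /* Try to improve the distance of each neighbour by going through u. */
        for (int v = 0; v < n; v++) {
            if (weight[u][v] != NO_ARC && !done[v] &&
                dist[u] + weight[u][v] < dist[v])
                dist[v] = dist[u] + weight[u][v];
        }
    }
}
```

For a hypothetical graph with the undirected edges 0-1 of weight 1, 1-2 of weight 2, 0-2 of weight 5 and 2-3 of weight 1, the distances from vertex 0 come out as 0, 1, 3 and 4: the direct edge 0-2 of weight 5 is improved to 3 by going through vertex 1.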
Given an undirected graph G = (V, E), there is usually more than one tree structure that connects
all of its vertices V. Consider the graph given in figure 13.5; some of the tree structures for
the graph are given in figures 13.5(a) and 13.5(b).
Figure 13.5: Graph
Figure 13.5(b): A minimum spanning tree
Even though these structures differ significantly, each structure possesses the following features.
Therefore, we can now formally define a spanning tree as follows. A tree T is a spanning tree of a
connected graph G = (V, E) if every vertex of G belongs to an edge in T and the edges in T form a
tree.
Next we will see how to construct a spanning tree for a graph. In an undirected graph, take any
vertex v as an initial partial tree and add edges one by one so that each edge joins a new vertex
to the partial tree. In general, if there are n vertices in the graph, we construct a spanning
tree in n-1 steps because n-1 edges need to be added.
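The construction just described can be sketched in C as follows. The sketch assumes an undirected graph stored as a 0/1 adjacency matrix and ignores edge weights, so it produces a spanning tree but not necessarily a minimum one; MAXV, struct tree_edge and spanning_tree are hypothetical names.

```c
#define MAXV 8   /* hypothetical upper limit on the number of vertices */

struct tree_edge { int from, to; };   /* an edge added to the partial tree */

/* Grows a spanning tree of the first n vertices of adj from the vertex start.
   Each step adds one edge that joins a new vertex to the partial tree.
   Returns the number of edges added, which is n - 1 if the graph is connected. */
int spanning_tree(int adj[MAXV][MAXV], int n, int start,
                  struct tree_edge tree[MAXV - 1])
{
    int in_tree[MAXV] = {0};   /* 1 for vertices already in the partial tree */
    int edges = 0;
    int added = 1;

    in_tree[start] = 1;
    while (added) {
        added = 0;
        /* Look for an edge (u, v) with u in the tree and v outside it. */
        for (int u = 0; u < n && !added; u++) {
            if (!in_tree[u])
                continue;
            for (int v = 0; v < n; v++) {
                if (adj[u][v] && !in_tree[v]) {
                    tree[edges].from = u;
                    tree[edges].to = v;
                    edges++;
                    in_tree[v] = 1;
                    added = 1;
                    break;
                }
            }
        }
    }
    return edges;
}
```

If the function returns fewer than n - 1 edges, some vertex could not be reached from the starting vertex, so the graph is not connected and has no spanning tree.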
In a directed graph an arbitrary node is selected as the root, and then the adjacent nodes in the
graph are appended to the tree one at a time until all nodes of the graph are included. The criterion for
There are several algorithms for creating minimum spanning trees from graphs, such as Prim’s
algorithm, Kruskal’s algorithm and the Round-Robin algorithm, which are again left to the
interested student as further work. Minimum spanning trees are very useful in practice in
network environments. For example, a telephone company may be interested in a minimum
spanning tree in order to wire a set of sites together using as little wire as possible. Another
similar example is a cable TV company laying cables to a new neighbourhood.
8.8 Summary
As described in the previous lesson, a graph is a set of vertices connected through edges, and
visiting all these vertices systematically is known as a graph traversal. Depth first traversal
and breadth first traversal are the two traversal methods we studied in this lesson. The shortest
path between any two vertices can be found using the standard algorithms mentioned in the lesson.
Creating a minimum spanning tree out of a graph has many practical uses, such as helping to
identify natural clusters and giving approximate solutions to the travelling salesman problem.